Real-world agents: a production-grade Salesforce agent for CPQ quoting
A four-month journey putting Agentforce to the test
Editor’s note:
I’m excited to feature our first guest author on AI Builders. This post comes from a colleague of mine on our Salesforce team who’s been deep in the weeds with Agentforce—Salesforce’s agentic AI platform.
There’s been a lot of buzz around Agentforce, but I haven’t seen many mature production use cases. That’s what makes this one special.
Over the past few months, the team has been in the trenches solving for automated quote creation. What follows is a detailed breakdown of what worked, what didn’t, and the lessons we wish we’d had at the start.
— Justin
Sales quote automation represents one of the most promising yet challenging applications for AI agents in B2B operations.
Unlike simple chatbot implementations, quote creation requires the agent to navigate complex product catalogs, multi-year pricing structures, and business rules while maintaining the accuracy that directly impacts revenue.
This field report documents our 4-month journey implementing a conversational agent with Agentforce and Salesforce CPQ automation. It covers the technical pivots, unexpected challenges, and practical lessons that could save other practitioners significant development time.
Business Context & Initial Pain Points
Core challenge:
Sales reps spent roughly 30 minutes manually drafting each multi-year quote in Salesforce CPQ, and our team processes 140+ quotes monthly.
Key pain points addressed:
Time inefficiency: Complex multi-year quote creation with product bundles and implementation products
Error-prone process: Manual product selection from large catalog led to inconsistencies and missing components
Scalability issues: Growing product complexity made training new reps increasingly difficult
Administrative burden: Repetitive quote patterns consumed valuable selling time that could be better spent with customers
We implemented Agentforce to automate quote creation in Salesforce CPQ, evolving from a basic proof-of-concept to a production-ready agent over Q1 2025.
Agentforce Concepts
For non-Salesforce users: Think of it as ChatGPT integrated directly into the CRM, capable of reading/writing your business data and executing workflows through conversations.
Core Architecture:
Topics: Specific conversation areas the agent can handle (e.g., "Create a quote," "Manage deals"). Topics organize actions and guide the agent toward the right action for a given request.
A topic is made up of instructions (text guidance, similar to ChatGPT prompts) and actions.
Actions: Actions are how agents get things done. Each agent includes a set of actions, the tools it can use to do its job. For example, if a user asks an agent for help writing an email, it uses the best available action within the appropriate topic to draft and revise the email. An action can be either standard or custom.
Standard Actions: Pre-built by Salesforce (e.g., "Update Record," "Get Record Details")
Custom Actions: Created by your team for specific business requirements using Flows (workflow automation), Apex (custom code), or Prompt Templates (AI instructions that process data and return formatted responses)
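Conceptually, a custom action behaves like a typed function the agent can call: the model fills in the parameters from the conversation, and the action runs deterministic logic. The sketch below is illustrative Python, not Agentforce syntax (real custom actions are built with Flows, Apex, or prompt templates), and every name in it is hypothetical.

```python
# Conceptual sketch only: an "action" is a named tool with typed inputs and
# outputs that the agent invokes during a conversation. All identifiers here
# are hypothetical; Agentforce actions are declared in Flows/Apex, not Python.

def create_quote_action(account_id: str, start_date: str, years: int) -> dict:
    """Hypothetical custom action: the agent extracts the parameters from
    the user's message; the action body is deterministic business logic."""
    return {
        "quote_id": "Q-0001",      # placeholder ID a real action would generate
        "account": account_id,
        "start": start_date,
        "term_years": years,
        "status": "Draft",
    }

# The agent would call this after parsing: "3-year quote starting June 2025"
result = create_quote_action("001XX0000001", "2025-06-01", 3)
```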
Technology Options Evaluated
1. Screen Flow vs Agent Approach
Options considered:
Screen Flow: Traditional Salesforce UI with guided quote creation
Agentforce: Conversational AI interface
Both solutions rely on the same reusable automation (Flows/Apex); the choice between them was purely about UX.
Decision rationale: Preferred Agentforce to test AI capabilities and provide natural language interface, despite higher complexity and token costs.
Outcome: Agentforce proved more flexible for iterative conversations and complex scenarios.
2. Product Matching Strategies
Evolution of approaches:
Phase 1: RAG approach
The agent used the out-of-the-box "Get Record Details" action to retrieve structured product data from Salesforce on demand
Phase 2: Switched to In-Context Learning approach
Automated system fetched product data and embedded it in the prompt template
AI analyzed the text-based product information to make matches
Why the switch worked better: In-context learning gave the AI more comprehensive product information to analyze holistically, rather than trying to match against individual database records through API calls.
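The difference between the two phases can be sketched in a few lines. Under in-context learning, the product data is fetched once and embedded as text in the prompt template, so the model sees the whole catalog at once instead of querying record by record. This is an illustrative Python sketch, not the actual prompt-template mechanism; the catalog rows and template wording are invented for the example.

```python
# Illustrative sketch of the in-context-learning approach: embed the full
# (hypothetical) product catalog in the prompt so the model can match
# products holistically, instead of issuing per-record API lookups.

PRODUCTS = [  # stand-in for rows fetched once from Salesforce
    {"code": "BUS-LIC", "name": "Business License", "active": True},
    {"code": "ULT-PKG", "name": "Ultimate Package", "active": True},
    {"code": "COACH",   "name": "Coaching Solution", "active": True},
]

PROMPT_TEMPLATE = """You are a quoting assistant.
Here is the full product catalog:
{catalog}

Match the user's request to catalog entries:
{request}"""

def build_prompt(request: str) -> str:
    """Render the whole active catalog into the prompt as plain text."""
    catalog = "\n".join(
        f"- {p['code']}: {p['name']}" for p in PRODUCTS if p["active"]
    )
    return PROMPT_TEMPLATE.format(catalog=catalog, request=request)

prompt = build_prompt("Business licenses and the Ultimate package")
```

The trade-off is token cost: this only pays off when the catalog fits comfortably in the context window, which was the case here.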
3. Quote Creation Architecture
Evolution from Flows to Apex.
Initial: Flow-based approach
Used Salesforce Flow with CPQ API calls
Issues encountered:
"Unable to lock rows" errors (>50% failure rate)
Complex flow logic became unmaintainable
Configuration attributes not applying properly
Intermediate: CPQ API approach
Advantage: it can construct a quote virtually (quote line groups, quote lines, bundles, and more) without multiple DML operations, which minimizes the risk of DML errors.
Limitations: several gaps, most notably the inability to set the quantity of quote lines.
The limitations outweighed the advantages, so we discarded this option.
Final: Custom Apex solution
Developed custom Apex classes that replicate the manual steps reps previously performed to build multi-year quotes
Temporarily disabled triggers that interfered with quote creation
Benefits: More reliable, better error handling, maintainable code
Trade-off: Higher development complexity but better long-term sustainability
Return on Experience
Keep Instructions Concise
Cap each topic at 3 or 4 instructions.
The trap is adding more and more instructions when the agent does not behave as expected.
That can occasionally work, but it usually confuses the agent and makes it less likely that the additional instructions are applied at the right moment.
What we learned:
Started with extensive, detailed instructions but found them counterproductive
Consulting with Salesforce experts revealed that simpler instructions often work better
Build Guardrails in Code, Not Prompts
“Topic instructions are nondeterministic, which means they can't guarantee the same outcome 100% of the time. That's just the nature of generative AI. So we make sure to build important or sensitive business rules, requirements, and guardrails into the functionality of the agent's actions, not the topic instructions.”
Examples from our implementation:
Product eligibility: Instead of instructing "only use active products," we implemented filters into the action
Price validation: Automated CPQ rules enforcement rather than relying on instructions
Quote tracking: Added "Created by Automation" field populated with Agentforce_CreateQuoteAction for reliable tracking
Product matching: Moved from instruction-based to database-driven product selection with keywords and rankings, implemented via a prompt-template action
What we learned:
Instructions vs Actions trade-off: Critical business logic should always be in actions, not instructions
Reliability through automation: Database-driven constraints are more reliable than AI interpretation
Testing approach: Actions can be tested independently, instructions cannot be reliably tested
Maintenance: Updating action logic is more predictable than refining instruction wording
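The product-eligibility example above illustrates the principle: the "only active products" rule lives in the action's code, where it holds 100% of the time, rather than in a topic instruction the model may or may not follow. This is a minimal Python sketch of that idea (the real guardrail is a filter inside a Flow/Apex action; the catalog rows are invented).

```python
# Illustrative sketch: a guardrail implemented as a deterministic filter in
# the action's code. The agent cannot bypass it, unlike a prompt instruction
# such as "only use active products", which is nondeterministic.

def eligible_products(catalog: list[dict], requested_codes: list[str]) -> list[dict]:
    """Return only active catalog products, silently dropping retired ones."""
    active = {p["code"]: p for p in catalog if p["active"]}
    return [active[code] for code in requested_codes if code in active]

catalog = [
    {"code": "ULT-PKG", "active": True},
    {"code": "OLD-PKG", "active": False},  # retired product the AI might still name
]
selected = eligible_products(catalog, ["ULT-PKG", "OLD-PKG"])
```

Because the filter is plain code, it can also be unit-tested independently of the agent, which is exactly the testability advantage noted above.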
Real-World Examples
Success case: Complex multi-year quote
A sales rep requested: "Create a 3-year quote starting June 2025. Business licenses, Ultimate package, Coaching solution. Year 1: 100 users, Year 2: 150, Year 3: 200. 10% discount."
The agent successfully:
Identified correct products from natural language descriptions
Created appropriate quote line groups for each year
Applied quantities with ramp-up logic
Added required implementation products automatically
Applied global discount correctly
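The deterministic core of that success case, one quote line group per contract year with ramped quantities and a global discount, is simple enough to sketch. The real implementation is Apex; this Python version is illustrative only, and every function and field name in it is hypothetical.

```python
# Hypothetical sketch of the multi-year quote-building logic: one quote line
# group per contract year, ramped quantities, and a global discount applied
# to every line. (The production version is custom Apex, not Python.)

def build_multi_year_quote(products: list[str],
                           yearly_quantities: list[int],
                           discount_pct: float) -> dict:
    """Build a quote structure with one group per year of the term."""
    groups = []
    for year, qty in enumerate(yearly_quantities, start=1):
        lines = [
            {"product": p, "quantity": qty, "discount": discount_pct}
            for p in products
        ]
        groups.append({"name": f"Year {year}", "lines": lines})
    return {"groups": groups}

# The rep's request from the example: 3 years, ramping 100 -> 150 -> 200
# users, 10% discount.
quote = build_multi_year_quote(
    ["Business License", "Ultimate Package", "Coaching Solution"],
    [100, 150, 200],
    10,
)
```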
Challenging case: Product matching ambiguity
Request: "Add Essential package and licenses"
Initial approach: Failed due to multiple "Essential" products in catalog and inconsistent product naming
Solution implemented:
Semantic search action: Created dedicated product matching action using prompt templates to analyze our entire product catalog contextually, moving beyond exact keyword matching
Usage-based ranking: Products ranked by frequency in existing quotes—commonly used products prioritized in matching results
Clarification workflow: When multiple matches found, agent asks: "I found Essential Training Package and Essential License Bundle. Which did you mean, or both?"
Key breakthrough: The semantic search approach (via prompt templates) was transformational for addressing our product catalog's poor readability—vague product names, uninformative descriptions, and inconsistent naming conventions that made exact matching difficult.
Efficiency insight: This AI-driven approach proved far more time-efficient than manually cleaning up our product database. With 200+ active products, standardizing names and descriptions would have required months of cross-team coordination, whereas the semantic search solution was implemented in days and immediately improved matching accuracy.
Why this mattered: This hybrid approach (semantic understanding + usage data + human clarification) solved the core challenge of translating natural sales language into our technical product catalog structure, while avoiding the massive overhead of database restructuring.
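The ranking and clarification stages of that hybrid matcher can be sketched directly; only the semantic-matching step itself runs inside a prompt template. In the Python sketch below, a plain substring match stands in for the LLM's semantic match, and the catalog and quote history are invented for the example.

```python
# Illustrative sketch of the hybrid matcher: rank candidate products by how
# often they appear in past quotes, and fall back to a clarification question
# when several candidates remain. A substring match stands in here for the
# prompt-template semantic match used in production.

from collections import Counter

CATALOG = ["Essential Training Package", "Essential License Bundle",
           "Ultimate Package"]
QUOTE_HISTORY = [  # product names from past quotes (hypothetical data)
    ["Essential License Bundle"],
    ["Essential License Bundle"],
    ["Essential Training Package"],
]

def match(term: str) -> dict:
    usage = Counter(p for quote in QUOTE_HISTORY for p in quote)
    # Candidates, most frequently quoted first.
    hits = sorted((p for p in CATALOG if term.lower() in p.lower()),
                  key=lambda p: -usage[p])
    if len(hits) > 1:
        names = " and ".join(hits)
        return {"clarify": f"I found {names}. Which did you mean, or both?"}
    return {"match": hits[0]} if hits else {"match": None}
```

An ambiguous term like "Essential" yields a clarification question listing both candidates (most-used first), while an unambiguous term resolves directly.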
What we learned about Agentforce technology
Platform Insights
Rapid evolution: Platform improved significantly during our four-month development period
Maturity level: Still building with occasional bugs but increasingly reliable
Key Capabilities & Limitations
Strengths:
Flexible action framework (Flows/Apex/Prompts) paired with natural-language understanding for translating business requests
Direct Salesforce data integration without requiring Data Cloud or data ingestion (at least for this particular use case)
Limitations:
Query Record action had known bugs during our implementation (Q1 2025)
Struggles with fuzzy product matching
Limited native CPQ integration (requires custom development)
Lessons for Other Implementations
Start with hybrid approach from day one
Use AI for natural language interpretation, deterministic logic for business rules
Don't try to solve complex business logic with instructions alone
Application: Any workflow requiring both flexibility and reliability
Data structure beats prompt engineering
Custom fields and metadata guide AI behavior more reliably than complex instructions
Invest in data modeling before building complex prompts
Application: Any scenario where AI selects from large, structured datasets
In-Context Learning > RAG for stable datasets
Embedding comprehensive data in prompts often outperforms real-time API calls
Works well when your dataset fits in context windows and doesn't change frequently
Application: Product catalogs, policy documents, structured knowledge bases