<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[AI Builders]]></title><description><![CDATA[Field notes from building production AI and automation systems.]]></description><link>https://www.aibuilders.blog</link><image><url>https://substackcdn.com/image/fetch/$s_!rF6l!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe0852c9-e63b-407d-b885-43f4800012d9_512x512.png</url><title>AI Builders</title><link>https://www.aibuilders.blog</link></image><generator>Substack</generator><lastBuildDate>Wed, 08 Apr 2026 02:58:24 GMT</lastBuildDate><atom:link href="https://www.aibuilders.blog/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Justin Norris]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[aibuilderslog@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[aibuilderslog@substack.com]]></itunes:email><itunes:name><![CDATA[Justin Norris]]></itunes:name></itunes:owner><itunes:author><![CDATA[Justin Norris]]></itunes:author><googleplay:owner><![CDATA[aibuilderslog@substack.com]]></googleplay:owner><googleplay:email><![CDATA[aibuilderslog@substack.com]]></googleplay:email><googleplay:author><![CDATA[Justin Norris]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Becoming an AI-native operator (extended cut)]]></title><description><![CDATA[Additional thoughts from my guest post on Kyle Poyar's Growth Unhinged]]></description><link>https://www.aibuilders.blog/p/becoming-an-ai-native-operator-extended</link><guid 
isPermaLink="false">https://www.aibuilders.blog/p/becoming-an-ai-native-operator-extended</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Thu, 26 Feb 2026 02:56:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!60To!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!60To!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!60To!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!60To!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!60To!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!60To!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!60To!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg" width="1344" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1966957,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/189203736?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!60To!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!60To!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!60To!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!60To!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78171958-2b07-4a15-beeb-6ce5643b0547_1344x768.jpeg 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I&#8217;ve been somewhat fixated lately on what the future of knowledge work looks like in the AI era. </p><p>This has been influenced in large part by a) the never-ending stream of software developer Claude Code maximalism in my X feed and b) contrasting it with my own exciting-but-less-extreme experimentation. It leaves me with a combination of exhilaration and deflation by comparison. </p><p>I wrote a piece that dives deep into this topic, which went live today on Kyle Poyar&#8217;s Growth Unhinged newsletter. 
You can read the whole thing here: </p><p><a href="https://www.growthunhinged.com/p/becoming-an-ai-native-operator">Becoming an AI-native operator (Growth Unhinged)</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://www.growthunhinged.com/p/becoming-an-ai-native-operator" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zu_S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 424w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 848w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 1272w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Zu_S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png" width="1024" height="425" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:425,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59056,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://www.growthunhinged.com/p/becoming-an-ai-native-operator&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/189203736?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zu_S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 424w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 848w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 1272w, https://substackcdn.com/image/fetch/$s_!Zu_S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52600522-6af0-42f4-b891-70ba6a56ad28_1024x425.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I thought I&#8217;d share here some more informal musings that didn&#8217;t make it into that piece. </p><h2>Will Claude Code make us all super-saiyans? 
</h2><p>After that piece went live, a friend sent me a note and contrasted it with a very different take on the same subject by Austin Hay: </p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:188373451,&quot;url&quot;:&quot;https://austinhay.substack.com/p/converging-on-white-collar-super&quot;,&quot;publication_id&quot;:1831202,&quot;publication_name&quot;:&quot;Growth Stack Mafia &#129374;&#9876;&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!7Ig_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F341df106-7e1b-4a30-90c7-0d9220356c48_612x612.png&quot;,&quot;title&quot;:&quot;Converging on white collar super-saiyans&quot;,&quot;truncated_body_text&quot;:&quot;Hello from sunny &#8230; I mean rainy? &#8230; San Francisco.&quot;,&quot;date&quot;:&quot;2026-02-18T15:40:11.287Z&quot;,&quot;like_count&quot;:4,&quot;comment_count&quot;:4,&quot;bylines&quot;:[{&quot;id&quot;:3404400,&quot;name&quot;:&quot;Austin Hay&quot;,&quot;handle&quot;:&quot;austinhay&quot;,&quot;previous_name&quot;:&quot;Austin H&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ea7aa893-df33-4dbc-b977-6bbb6940f04d_400x400.jpeg&quot;,&quot;bio&quot;:&quot;Systems &amp; GTM Leader &#8226; Startup Advisor &#8226; founder at Clarify &#8226; Early operator at Branch, mParticle, Runway, Ramp&quot;,&quot;profile_set_up_at&quot;:&quot;2022-12-09T15:26:52.766Z&quot;,&quot;reader_installed_at&quot;:&quot;2025-09-06T10:30:22.480Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:1816664,&quot;user_id&quot;:3404400,&quot;publication_id&quot;:1831202,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:1831202,&quot;name&quot;:&quot;Growth Stack Mafia 
&#129374;&#9876;&quot;,&quot;subdomain&quot;:&quot;austinhay&quot;,&quot;custom_domain&quot;:&quot;austinhay.com&quot;,&quot;custom_domain_optional&quot;:true,&quot;hero_text&quot;:&quot;For martech, revtech and systems nerds. Talk about systems, with occasional other learnings mixed in.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/341df106-7e1b-4a30-90c7-0d9220356c48_612x612.png&quot;,&quot;author_id&quot;:3404400,&quot;primary_user_id&quot;:3404400,&quot;theme_var_background_pop&quot;:&quot;#EA82FF&quot;,&quot;created_at&quot;:&quot;2023-07-26T12:32:43.811Z&quot;,&quot;email_from_name&quot;:&quot;The Growth Stack Mafia &#129374;&#9876;&#65039;&quot;,&quot;copyright&quot;:&quot;Austin Hay&quot;,&quot;founding_plan_name&quot;:&quot;Work Stipend or &#128153;&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}},{&quot;id&quot;:7122729,&quot;user_id&quot;:3404400,&quot;publication_id&quot;:6827737,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:false,&quot;publication&quot;:{&quot;id&quot;:6827737,&quot;name&quot;:&quot;GTMN&quot;,&quot;subdomain&quot;:&quot;pranavpi&quot;,&quot;custom_domain&quot;:&quot;www.gtmn.fm&quot;,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;The most important Go-To-Market News (GTMN) you need to read. Weekly podcasts and summaries. 
From Pranav Piyush and Austin Hay.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d60f676a-7109-424e-89b2-415d4c98c111_1280x1280.png&quot;,&quot;author_id&quot;:171358183,&quot;primary_user_id&quot;:171358183,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2025-11-06T17:46:04.999Z&quot;,&quot;email_from_name&quot;:&quot;GTMN fm&quot;,&quot;copyright&quot;:&quot;Pranav Piyush&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;magaziney&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null,&quot;status&quot;:{&quot;bestsellerTier&quot;:null,&quot;subscriberTier&quot;:1,&quot;leaderboard&quot;:null,&quot;vip&quot;:false,&quot;badge&quot;:{&quot;type&quot;:&quot;subscriber&quot;,&quot;tier&quot;:1,&quot;accent_colors&quot;:null},&quot;paidPublicationIds&quot;:[10845,1435249,2880588],&quot;subscriber&quot;:null}}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://austinhay.substack.com/p/converging-on-white-collar-super?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!7Ig_!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F341df106-7e1b-4a30-90c7-0d9220356c48_612x612.png" loading="lazy"><span class="embedded-post-publication-name">Growth Stack Mafia &#129374;&#9876;</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Converging on 
white collar super-saiyans</div></div><div class="embedded-post-body">Hello from sunny &#8230; I mean rainy? &#8230; San Francisco&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">2 months ago &#183; 4 likes &#183; 4 comments &#183; Austin Hay</div></a></div><p>Austin&#8217;s piece is excellent (and very fun to read). Highly recommend it, even though we land on some different conclusions. </p><p>His essential argument is that knowledge workers who master AI coding agents will become &#8220;super-saiyans&#8221;: folks who can traverse the entire knowledge stack from strategy to execution, who can work fluidly across domains, and who can build almost anything. </p><p>And I don&#8217;t disagree with this potential. I&#8217;ve seen firsthand how the combination of highly intelligent models + execution-oriented harness + tool access can do amazing things, from cranking out dozens of pages of documentation in a sitting to creating a home-grown eval framework for our AI assistants to building a co-writing system for my personal work. </p><p>Having this capability at your disposal truly does feel like being a sort of augmented super-human. </p><p>So it&#8217;s not that I think this is wrong. It&#8217;s just that I find it doesn&#8217;t actually cover all the things I need to do. </p><h2>Altitude and collaboration surface area </h2><p>The issue isn&#8217;t that execution tools aren&#8217;t powerful. But I think they become less complete in direct proportion to two factors: </p><ul><li><p>how much time you need to spend at strategic altitude </p></li><li><p>how broad a collaboration surface area you have </p></li></ul><p>Let me unpack. </p><h3>Execution tools don&#8217;t fly at strategic altitude</h3><p>CLIs are optimized for DOING things. So you can use them effectively for generating strategic inputs (research, textual synthesis, etc.). 
And you could use them to produce stand-alone strategic artifacts (like a strategy doc). This could work well for a consultant.</p><p>But as a corporate functional leader or executive, much of your time isn&#8217;t spent on epic one-and-done strategic manifestos. Operating at strategic altitude happens in a thousand daily interactions: comments in docs or wikis; terse, free-flowing exchanges in emails or Slack; meetings where you consume information or give direction. </p><p>A CLI can&#8217;t easily do this work for you today, and it tends to be less effective at charting the murky waters of interpersonal relationships, corporate politics, and org dynamics than chat-based tools like ChatGPT or Claude Desktop. </p><h3>Coding agents aren&#8217;t designed for collaborative surfaces</h3><p>The workflow of a CLI follows traditional software development ergonomics: an individual works on local files, verifies their work in isolation, then commits to shared repositories through a carefully governed process of continuous integration. </p><p>But most knowledge work doesn&#8217;t actually work this way.</p><ul><li><p>Marketers create drafts in Google Docs, ping teammates for feedback, and revise in place. </p></li><li><p>RevOps creates analyses in spreadsheets, copies tabs to create new scenarios, and comments back and forth with sales leaders on financial models. </p></li><li><p>Recruiters share candidate profiles with hiring managers and exchange comments and feedback. </p></li><li><p>An executive team collaborates on a board deck in Google Slides. </p></li></ul><p>Of course, there are components of all these tasks that can be automated with coding agents. But the actual places where interaction and collaboration occur are awkward or inaccessible for CLIs. </p><p>These spaces can also be downright DANGEROUS for autonomous agents to play in. 
 </p><p>A coding agent gone rogue might bork your local files, but those changes aren&#8217;t getting merged into your shared codebase without review. The Git-based workflow is designed to protect against that. </p><p>There is no such CI/CD failsafe protecting your Google Docs or Confluence KBs. These environments are simply not yet designed for agentic collaboration. </p><p>All this leads to a simple axiom that holds true today (though perhaps not tomorrow): </p><blockquote><p>Coding agents are most effective for roles that produce deliverables independently. They are less effective for roles where much of the work is collaboration and alignment. </p><p><strong>Coding tools are useful in inverse proportion to your collaboration surface area</strong>. </p></blockquote><h2>The last mile problem (or: why can&#8217;t Claude Code just build my slide deck) </h2><p>The one surface area you tend to spend MOST time on as you become more senior in your career is slides. </p><p>Your role becomes less about delivering actual work and more about planning, vision, communication, and alignment. And as much as we sometimes loathe them, slides are an excellent vehicle for that. </p><p>I collaborate extensively with LLMs on building slides. </p><p>They are very, very good at absorbing context (docs, transcripts, etc.), synthesizing a narrative, planning a slide sequence, and giving feedback on existing decks. </p><p>They have become moderately capable at actually generating a deck, although there are still gaps (examples: I&#8217;ve not yet been able to get perfect adherence to an existing brand template; you have to generate a PPTX file and then import it into Google Slides, etc.). </p><p>And they are still nearly incapable of something as simple as editing an existing Google Slides document. </p><p>I am forever taking screenshots of slides and pasting them into a chat box, then making edits myself. 
 </p><p>This seeming conundrum (the tools are very good at hard things but struggle with simple things) reflects the challenges of interacting with environments that haven&#8217;t provided for agents as first-class users. </p><p>The visual affordances that make editing a slide intuitive for a human are perversely complex for an agent. </p><p>I&#8217;m sure this problem will soon be cracked (perhaps it&#8217;s solved already and I&#8217;ve missed it). But until this connective tissue is present, much of your work feels inaccessible to coding agents. </p><h2>The security conundrum </h2><p>The most impressive agentic setups I&#8217;ve seen are all from independent operators, solopreneurs, or small agencies. </p><p>These folks are leading the way (and I&#8217;m quite envious of their freedom). But they have luxuries that don&#8217;t exist in corporate environments with dedicated security teams and critical certifications on the line. </p><p>You COULD have local MCP connections to all your systems of record, but most teams aren&#8217;t ready for this yet. Corporate AI-native work faces a governance problem, not just a technical one.</p><h2>Zooming out: these limitations are temporary </h2><p>So to summarize a long series of gripes: it&#8217;s not that I&#8217;m a CLI skeptic. It&#8217;s just that we still need a lot more to operate as AI natives in these contexts. </p><p>I feel the pain most keenly because I&#8217;m working at this intersection every day. The possibilities are so obvious, but the gaps are still real. So in part, this is me throwing a product spec into the void to see what comes back. </p><p>On the bright side, it&#8217;s obvious these problems are going to be solved soon. </p><p>The need is too vast. Someone will take a variant of this spec, feed it to a swarm, and take a bunch of money from Andreessen Horowitz. The future is on the way. 
</p>]]></content:encoded></item><item><title><![CDATA[Should knowledge workers use Claude Code?]]></title><description><![CDATA[What CLI tools reveal about the future of knowledge work]]></description><link>https://www.aibuilders.blog/p/should-knowledge-workers-use-claude</link><guid isPermaLink="false">https://www.aibuilders.blog/p/should-knowledge-workers-use-claude</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Thu, 08 Jan 2026 22:14:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!jZFx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jZFx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jZFx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 424w, https://substackcdn.com/image/fetch/$s_!jZFx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 848w, https://substackcdn.com/image/fetch/$s_!jZFx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 1272w, 
https://substackcdn.com/image/fetch/$s_!jZFx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jZFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png" width="1344" height="756" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:756,&quot;width&quot;:1344,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1819999,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/183961287?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jZFx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 424w, https://substackcdn.com/image/fetch/$s_!jZFx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 848w, 
https://substackcdn.com/image/fetch/$s_!jZFx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 1272w, https://substackcdn.com/image/fetch/$s_!jZFx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe27d7d-cf38-4fad-ab3b-08192df8dc5f_1344x756.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Claude Code FOMO is real.</p><p>Every Substack newsletter, X post, and podcast is about it. 
People are shipping production software in a weekend, building content systems, creating a second brain, and supposedly achieving world peace with a terminal app.</p><p>I wanted to know what was actually real, so I spent the holidays going deep with Claude Code and related CLI tools like Codex. I built a handful of experiments and small systems. Some worked. Some did not. Most were a mix.</p><p>Here are my field notes on what this category is, what it is not, and how it applies to knowledge workers (not just coders).</p><p>What I found: the power of the harness is real, but the workflow fit is wrong for most knowledge workers, at least for now.</p><h2><strong>How to think about Claude Code</strong></h2><p>Unlike plain Claude or ChatGPT, coding agents like Claude Code share three important characteristics:</p><h3><strong>1) Agents optimized for action</strong></h3><p>They are not just a different interface on top of an LLM. They are LLMs inside an agentic harness designed to act in a loop. The ability to act and execute with reliability is the real superpower.</p><h3><strong>2) Tight integration with local files</strong></h3><p>Coding agents work on your local machine, inside an IDE or terminal. That gives them deep access to your files. Instead of pasting context into a chat, the agent can search and parse your folders and build a real working context.</p><h3><strong>3) Ability to execute terminal commands</strong></h3><p>These agents can write and run their own scripts, make API calls, and execute shell commands. The implications are hard to grasp until you watch them do it.</p><h2><strong>Where knowledge workers benefit</strong></h2><p>The upside for programmers is obvious. 
People are running multiple coding agents in parallel, shipping code all night, and seeing real productivity gains.</p><p>But what about people whose output is words, ideas, designs, or other knowledge artifacts?</p><p>Here is what I came to love:</p><h3><strong>1) Hackable and extensible</strong></h3><p>Coding agents are deeply customizable. Claude Code has custom slash commands, skills, and sub-agents that package common activities. It is incredibly fun to build your own custom automated world of work.</p><pre><code><strong>Example:</strong> I built a workflow to dictate rough thoughts for Substack or LinkedIn on my phone, sync them locally, turn them into structured ideas, and prep drafts. It was not perfect, but it captured many ambient ideas that would have been lost. First-draft time dropped from an hour to minutes because the system did the organizing, not just the typing.</code></pre><h3><strong>2) Agents can build themselves</strong></h3><p>Because the agent configuration is just files, the agent can modify and improve itself. The wall between configuration and application disappears. You see it happen in real time.</p><pre><code><strong>Example:</strong> I was working on documentation and wanted to publish it to a knowledge base. The agent built itself a local integration in minutes and packaged it for future use. I could see the agent reshape its own tooling in the same place I was doing the work.</code></pre><h3><strong>3) Deep context from a local repo</strong></h3><p>If you store your work locally, it is powerful to build reusable context guides: writing style, formatting conventions, research notes, and templates. The agent can reason over all of it without you re-explaining the world each time.</p><h2><strong>The hard parts for knowledge workers</strong></h2><p>So should every knowledge worker move into the terminal?</p><p>Not yet. The value is real, but the UI and collaboration tax is too high. 
Most limitations come from the coding DNA of these tools.</p><h3><strong>1) Not made for collaborative workflows</strong></h3><p>Coding agents work great for individual developers or solo creators. They are built for local files and version control, not real-time collaboration tools.</p><p>Most knowledge workers live in Google Docs, Notion, and Confluence. If your work lives in the cloud and your collaborators are editing in real time, a local-first agent creates friction. You are constantly exporting, syncing, and re-importing. The collaboration model breaks.</p><h3><strong>2) The terminal is not an optimal UI</strong></h3><p>For many knowledge workers, the terminal is a non-starter. It is not an improvement over modern chat interfaces. You lose rich formatting, quick visual edits, easy image insertion, and that simple act of pointing a cursor at a paragraph to fix it.</p><p>We need the power of coding agents without the coding UI.</p><h3><strong>3) Coding agents are better at doing than thinking</strong></h3><p>In my experience, coding agents are weaker for big-picture strategy, messy conversations, or creative synthesis. They are great at execution, less great at deep dialogue.</p><p>When the task is clear and structured&#8212;turn these notes into a draft, format this data, build this workflow&#8212;they excel. When the task is ambiguous, interpersonal, or requires reframing the problem itself, plain chat still wins. 
You need space to think out loud, not a tool optimized to close the loop fast.</p><h2><strong>Who this is actually for right now</strong></h2><p>The FOMO question is whether you are missing something essential if you are not using Claude Code yet.</p><p>The answer depends on how you work and what you are trying to build.</p><p><strong>You should probably be experimenting if you are:</strong></p><ul><li><p>A programmer or vibe coder building software or code-related artifacts for internal systems</p></li><li><p>An independent content creator who controls your own production system</p></li><li><p>A systems builder who wants structured workflows around repetitive knowledge work</p></li></ul><p>For these groups, the harness works. The local-first model fits, and the terminal is not a blocker.</p><p><strong>You are probably fine waiting if you are:</strong></p><ul><li><p>A collaborative knowledge worker living in Docs, Notion, or Confluence</p></li><li><p>An executive who needs strategic, discursive thinking more than task execution</p></li><li><p>Someone whose highest-value work involves ambiguity, synthesis, or reframing problems</p></li></ul><p>For this second group, the drag of a terminal and the gap between local files and cloud systems change the cost-benefit calculation. 
The power is real but the fit is wrong.</p><h2><strong>What we really need next</strong></h2><p>Here is what matters: Claude Code proves that when you give agents the right harness and permissions, they can be nearly autonomous.</p><p>We need that same power for people whose primary output is not code.</p><p>That means an AI environment with:</p><ul><li><p>A modern, polished chat interface</p></li><li><p>Seamless access to common knowledge sources and formats (Docs, Slides, Confluence/Notion pages, Snowflake, CRM data)</p></li><li><p>The ability to create and edit those sources too</p></li><li><p>The hackability and self-improving nature of Claude Code</p></li><li><p>A safe execution environment for scripts and integrations that does not require a developer mindset</p></li></ul><p>This does not exist yet because it is genuinely hard to build. But it is not as far off as it looks. Someone will build the &#8220;Claude Code for knowledge workers&#8221; interface in the next six months.</p><p>Either cloud-based assistant platforms will absorb CLI-level power, or the CLIs will grow better wrappers and plugins to meet knowledge workers where they are.</p><p>My bet is on the cloud platforms moving first. Once Claude.ai or ChatGPT can safely read AND write to your Google Drive, execute scripts in a sandbox, and configure their own custom behaviors, we&#8217;ll have most of what makes coding agents powerful without the terminal tax. The distribution and trust are already there.</p><h2><strong>Why this matters now</strong></h2><p>Once we have better harnesses for non-coders, the productivity gap between people who can code their own workflows and people who cannot will collapse. 
Knowledge work will look very different on the other side.</p><p>I am watching this closely because it feels like the next real inflection point, not just another shiny demo.</p><p>If you are experimenting with Claude Code as a knowledge worker, I&#8217;d love to hear what is working for you.</p><p><em>Written with help from Codex and Claude Code ;-)</em></p>]]></content:encoded></item><item><title><![CDATA[The operator's roadmap for AI in 2026]]></title><description><![CDATA[What&#8217;s keeping AI from doing real work&#8212;and how to fix it]]></description><link>https://www.aibuilders.blog/p/the-operators-roadmap-for-ai-in-2026</link><guid isPermaLink="false">https://www.aibuilders.blog/p/the-operators-roadmap-for-ai-in-2026</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Sat, 27 Dec 2025 23:39:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!p-j6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!p-j6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!p-j6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!p-j6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 
848w, https://substackcdn.com/image/fetch/$s_!p-j6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!p-j6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!p-j6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg" width="1365" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1365,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:228952,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/182731866?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!p-j6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!p-j6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!p-j6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!p-j6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa90155bb-03d1-4fbe-a6ae-69872c1ca83e_1365x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Right now, many teams are still copying and pasting between ChatGPT and the systems where real work actually lives.</p><p>General-purpose AI assistants like ChatGPT, Gemini, or Claude are flexible and pervasive; you can use them for almost anything, but they&#8217;re still largely disconnected from your day-to-day systems. So you copy-paste context in, then copy-paste outputs back out.</p><p>Meanwhile, every major software vendor is shipping &#8220;AI features&#8221; inside their products. Your CRM has AI. Your email has AI. Your project management tool has AI. But these assistants are tightly scoped to a single tool, blind to everything else you&#8217;re working on.</p><p>The resulting user experience is limiting and frustrating. In part, this is because the potential for AI to do more is so tantalizingly visible yet still out of reach.</p><p>The issue isn&#8217;t model capability. It&#8217;s that AI still sits outside the systems where real work happens, disconnected from both organizational knowledge and the ability to act.</p><p>This is why I believe the companies that truly execute as &#8220;AI natives&#8221; in 2026 will do so by creating a ubiquitous AI layer that connects the knowledge and tools we use every day.</p><p><strong>At bottom, it comes down to a simple premise: if we want AI to start doing the actual work that humans do, it needs the same level of context and the same ability to act.</strong></p><p>Here are four pillars I&#8217;m focused on. 
The first two are already in production, while the last two are emerging areas that I see as high-leverage.</p><ol><li><p><strong>Universal Context</strong> &#8212; AI connected across ALL your corporate systems</p></li><li><p><strong>Universal Action</strong> &#8212; AI that actually DOES work, not just analyzes</p></li><li><p><strong>Proactive AI</strong> &#8212; Systems that prompt YOU instead of waiting to be prompted</p></li><li><p><strong>Automation Creating Automation</strong> &#8212; Agents that spawn workflows and become infrastructure</p></li></ol><p>This article is written for Systems and Operations teams responsible for developing AI as an organizational capability. It focuses on the biggest limiting factors I see in day-to-day AI usage today and what teams can do to remove them.</p><div><hr></div><h2>Pillar 1: Universal Context</h2><p>Most AI tools today are islands.</p><p>ChatGPT knows very little about your company. Salesforce Einstein only sees Salesforce. The AI in your project management tool only knows what&#8217;s in that tool.</p><p>Your brain still needs to act as the bridge across these systems, and if you want help on big-picture thinking and planning, you need to provide that context manually to your chat-based assistant.</p><p>Even ChatGPT or Claude projects are inherently limited to the files you&#8217;ve uploaded. You&#8217;re stuck conversing with an LLM with encyclopedic knowledge of the world and very little knowledge of your specific corner of it&#8212;the corner that actually matters most.</p><p>Imagine working with an analyst who didn&#8217;t have access to source systems or Google Drive. Or a marketing ops manager who didn&#8217;t have access to marketing SOPs. They only see what you happen to share with them. 
They are going to feel like very limited team members.</p><p><strong>AI only starts to feel &#8220;smart&#8221; once it has access to the same knowledge sources you rely on every day.</strong></p><h3>What this looks like</h3><p>At my company, we use Dust (an AI assistant framework) to connect Google Drive, Confluence, Gong, Salesforce, Snowflake, and Amplitude (among other systems) all in one place.</p><p>This means when I&#8217;m working on a project, my assistant can:</p><ul><li><p>Fetch background documents from Google Drive</p></li><li><p>Pull process docs and SOPs from Confluence</p></li><li><p>Look up current metrics in Snowflake</p></li><li><p>Check related tasks and project status in Trello</p></li></ul><p>You couldn&#8217;t do this with an agent locked inside just one of these tools. You need something that sits across all of them.</p><p>The difference is substantial. With ChatGPT, you&#8217;re copying and pasting context back and forth constantly. With platform-native AI, you only see one slice of the picture. With universal context, you get a strategic thought partner with visibility across all work layers.</p><h3>Pitfalls</h3><p>There&#8217;s an important caveat here: connecting an LLM to a knowledge source doesn&#8217;t mean knowing how to USE that knowledge.</p><p>Every time an agent accesses a data source, you run into the fluency problem. For example, if you ask it to pull data from Snowflake or Salesforce and it doesn&#8217;t understand your schema, it produces incorrect insights.</p><p>Universal context requires universal fluency, not just universal access.</p><p>You wouldn&#8217;t take a new hire, throw them into the back-end of Salesforce, and expect them to intuitively understand a schema with 10 years of history. They need context and enablement. 
Similarly, your AI assistants need this enablement layer to make sense of your systems.</p><h3>Takeaways</h3><p><strong>Each knowledge source needs a context layer</strong></p><p>Include how it&#8217;s organized, how to navigate it, how a fluent user would actually work within it. Things like schemas, naming conventions, and organizational logic.</p><p><strong>Clean, coherent, comprehensive documentation is no longer aspirational</strong></p><p>Poor documentation doesn&#8217;t just slow humans down&#8212;it permanently caps the value you can extract from AI.</p><p>Teams without clean, accessible docs end up re-explaining the same things over and over. The friction and time cost of delegation stays high, so AI gets used less or only for shallow tasks.</p><p>So if you&#8217;re still feeling guilty about unaddressed documentation debt but have had trouble articulating the business imperative to address it, now is your moment. <strong>Documentation debt is now AI debt.</strong></p><div><hr></div><h2>Pillar 2: Universal Action</h2><p>Pillar 1 was about <em>understanding</em>&#8212;giving AI the context to <strong>think.</strong> This pillar is about <em>acting</em>&#8212;giving AI the access to <strong>do</strong>.</p><p>Even a well-informed assistant is still just a commentator on the sidelines if it&#8217;s not able to take action. The next fundamental shift is AI that actually DOES work and not just analyzes it.</p><p>Most teams start this process by looking for big, high-impact &#8220;agent&#8221; tasks where AI can help (e.g., writing SDR emails, performing account research) and then building dedicated systems around them. I did this too, because all the hype around agents made this seem like the right path.</p><p>These big rock applications are obviously important. But they leave a lot of value on the table. Many AI applications won&#8217;t be well-defined point solutions built to solve a specific problem. 
They&#8217;ll be organic, ad-hoc, and fluid interactions with AI in the flow of work, delegating tasks much like you would to a junior employee. (Think of it as building a general-purpose AI teammate vs. an agent for a specific task).</p><p>That&#8217;s because a significant portion of our work doesn&#8217;t fit into one of these big buckets. Much of our day-to-day is consumed by small, hard-to-classify tasks: answering questions, updating data or documentation, fixing things, checking information, and so on.</p><p>I started calling these &#8220;paper cut tasks.&#8221; Small things that are individually inconsequential but feel significant in aggregate. <strong>They distract from higher-leverage work, create a psychological burden, and contribute to the feeling of being perpetually overwhelmed.</strong></p><p>My team set a goal: delegate these paper cut tasks to AI, one at a time. We created a dedicated AI assistant that has access to all our docs (for answering questions) and asked the teams we support to go to that assistant as the first step before creating a ticket for us.</p><p>Now for each request that comes up, we ask ourselves, what would be required for the assistant to actually DO this work on our behalf?</p><p>Typically this means giving it write access to systems, either via MCP, API, or intermediaries like Zapier that provide gateways to other tools.</p><h3>What this looks like</h3><p>Here&#8217;s a common example: Salesforce data fixes. These sorts of issues come up often for almost every team.</p><ul><li><p>An opportunity is miscategorized in reporting.</p></li><li><p>A marketing user notices and requests a fix.</p></li><li><p>Someone needs to troubleshoot, make a small update, and verify the report looks right.</p></li></ul><p>This isn&#8217;t rocket science, but it&#8217;s still work.</p><p>Since most issues fall into a few known buckets, it was relatively easy to enable our AI assistant to triage these issues directly with users. 
If it identifies a known scenario, it updates Salesforce directly.</p><p>Critically, we didn&#8217;t give it unrestricted write access. We exposed a specific workflow that allows a specific update in a specific way with a specific input. It then posts a Chatter message documenting the action, and we log it to a Zapier table as well. This enables autonomy and observability within guardrails that we&#8217;re comfortable with.</p><p>Another example: KPI monitoring. Rather than running manual reports or waiting for weekly syncs, team leads now ask our assistant directly&#8212;&#8220;How did MQL volume trend this week?&#8221; or &#8220;What&#8217;s our current pipeline coverage?&#8221;&#8212;and get answers in seconds. The assistant queries our sources of truth, contextualizes the numbers against historical trends, and surfaces insights conversationally.</p><p>This is faster than building a report, but even more importantly, it also changes the way that people engage with their data. The LLM can act as a junior analyst and not just a number cruncher.</p><h3>Pitfalls</h3><p>Just as LLMs need enablement on how to consume information, they also need an enablement layer on how to act.</p><p>For example, I was experimenting with having an AI assistant scan incoming email messages and create Trello cards on my board. This would remove a key point of friction and ensure I had clean, neatly organized tasks ready for prioritization.</p><p>But without a guide to the Trello environment, the assistant burned tokens while stepping on rakes: creating cards on the wrong board, adding the wrong metadata, not using the right description structure, and so on.</p><p>This was the same problem as before&#8212;the contextual layer was missing. The agent had access but not fluency.</p><h3>Takeaways</h3><p><strong>A flexible, user-facing AI assistant is an investment, not a cost</strong></p><p>Each task automated in this way pays ongoing dividends in the form of time saved. 
And having a general-purpose assistant (rather than a solution-specific system) is critical for enabling flexibility and agility, as it lowers the marginal cost of deploying new use cases.</p><p>Once the scaffolding exists, you can justify automating the small, annoying things.</p><p><strong>Autonomy and safeguards go together</strong></p><p>Experimental actions need human supervision. And with the current state of technology, it&#8217;s safer to provide agents with narrowly scoped, controlled workflows they can invoke rather than complete API access.</p><p><strong>Comprehensive APIs/MCPs will be table stakes for vendors</strong></p><p>Operators will increasingly expect their vendors to provide robust configuration APIs and MCPs so that agents can act on behalf of their ops teams.</p><p>For example, I may not trust an average marketing user to make a routine change in our lead routing system (there&#8217;s too much risk of breaking something). But an agent with proper context and guardrails could do it IF it has the necessary system access.</p><p>Software vendors need to start designing for both human and digital users.</p><div><hr></div><h2>Pillar 3: Proactive AI <em>(emerging)</em></h2><p>Right now, most AI waits for you to prompt it. You type a question. It answers. You ask for analysis. It provides. The dynamic is reactive.</p><p>Now, I do have a lot of AI operating autonomously in production, but it still mainly takes the form of scheduled workflows with AI steps. These run on a schedule, perform fixed tasks, and incorporate narrowly scoped AI analysis. This type of use case feels relatively mature.</p><p>But it&#8217;s still rules-based. We&#8217;re doing a crude version of proactive AI today: scheduled Gong analysis that flags low-scoring calls for manager review. 
It works, but it&#8217;s scaffolding for something more dynamic.</p><h3>Opportunities for true proactivity:</h3><p><strong>An agent that monitors your funnel</strong></p><ul><li><p>Top-line numbers look fine.</p></li><li><p>But it notices efficiency is declining&#8212;secondary KPIs dropping in ways that foreshadow problems weeks out.</p></li><li><p>It initiates analysis and delivers insights to the demand gen team without being asked.</p></li></ul><p><strong>A sales-coaching agent</strong></p><ul><li><p>It reviews sales calls then reaches out to the rep to start a discussion about the call and how to improve.</p></li><li><p>These aren&#8217;t AI-generated tips on top of a regular UX, but an interactive discussion initiated by the AI&#8212;like a coach reaching out with feedback.</p></li></ul><p><strong>A project manager agent</strong></p><ul><li><p>It grooms your backlog, identifies milestones that are slipping, and proactively flags areas of misalignment (e.g., &#8220;I don&#8217;t think stakeholder B and stakeholder C are aligned on requirements. We should book a sync to discuss.&#8221;).</p></li><li><p>It could keep projects from drifting in terms of both timelines and requirements.</p></li></ul><p>All these reflect a shift from &#8220;AI waits for commands&#8221; to &#8220;AI is an active participant in your workflow.&#8221;</p><h4><strong>The trust problem</strong></h4><p>Here&#8217;s what keeps me cautious: trust is fragile.</p><p>One bad result from an autonomous agent can damage adoption permanently. Users conclude &#8220;not ready for prime time&#8221; and stop engaging. The perception spreads.</p><p>We need layers of safeguards: narrow scopes, good prompt engineering, QA steps. I&#8217;ve <a href="https://www.aibuilders.blog/p/how-to-build-reliable-ai-workflows">written extensively</a> about how to build reliability into AI systems.</p><p>But we also need expectation management. AI can be powerful AND imperfect. 
The trade-off is scalability versus reliability, and we have to be pragmatic about where that balance sits.</p><div><hr></div><h2>Pillar 4: Automation Creating Automation <em>(emerging)</em></h2><p>This is the capability I&#8217;m most excited about.</p><p>I haven&#8217;t deployed it yet, but I&#8217;ve seen enough to know it&#8217;s coming.</p><p>The problem is that many tasks COULD be fully automated (using traditional workflows, not AI), but in many cases they&#8217;re too small to justify the build effort. The people who build these automations are specialized and in demand, so lower-leverage tasks don&#8217;t get prioritized.</p><p>However, an accumulation of smaller tasks can add up to a lot of work.</p><p>But what if agents could create those workflows for you? Now you have automation that spawns more automation.</p><p>Once you can describe a workflow in a machine-readable format like JSON, an agent can create new workflow definitions in that format. The missing link is for platforms to support workflow creation via these formats. 
</p><p>n8n provides this capability, and my research suggests other platforms do too (or surely will soon).</p><p>For example, check out <a href="https://www.youtube.com/watch?v=a2sfkJeXmE0">this demo of Claude Code using MCP servers to build n8n workflows</a> from natural language.</p><div id="youtube2-a2sfkJeXmE0" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;a2sfkJeXmE0&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/a2sfkJeXmE0?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>The agent creates, modifies, and tests workflows without a human touching the canvas.</p><h3>Why this matters</h3><p>AI-generated automation changes the calculus on what&#8217;s worth automating.</p><p>Today: &#8220;This task costs 5 hours per week, but building the automation would take 40 hours. Not worth it.&#8221;</p><p>Tomorrow: &#8220;Describe what you want. The agent builds it. Ops reviews and approves.&#8221;</p><p>There is, of course, an obvious risk worth naming: agent-generated workflows still need lifecycle management. Who owns them? Who updates them when requirements change?</p><p>These are challenges teams will need to solve, but I don&#8217;t see them as prohibitive relative to the potential gains.</p><h2>In conclusion: reality check</h2><p>What are the barriers to having these pillars in 2026?</p><ul><li><p><strong>Infrastructure lag.</strong> Most companies still don&#8217;t have Dust, n8n, or MCP setups.</p></li><li><p><strong>Awareness gap.</strong> Teams don&#8217;t know what&#8217;s possible. The demos exist but haven&#8217;t reached the mainstream.</p></li><li><p><strong>Trust building takes time.</strong> Delegation requires proof of reliability. 
That proof accumulates slowly.</p></li><li><p><strong>Last-mile problem.</strong> None of this is turnkey. You need technical operators who can build and maintain these systems.</p></li></ul><p>But the trajectory is clear. If you aren&#8217;t yet equipped to delegate work to AI on a daily basis, start listing out the obstacles&#8212;technical, organizational, political&#8212;and preparing a plan to address them.</p><p>The gap between &#8220;AI can do amazing things&#8221; and &#8220;AI does real work for my team&#8221; is not about the models. It&#8217;s about context, access, and trust. Those are solvable problems, and solving them is what will separate teams that use AI from teams that are transformed by it.</p>]]></content:encoded></item><item><title><![CDATA[Why your AI agent keeps hallucinating (even when you tell it not to)]]></title><description><![CDATA[I realized my prompt was actually encouraging the model to make stuff up]]></description><link>https://www.aibuilders.blog/p/why-your-ai-agent-keeps-hallucinating</link><guid isPermaLink="false">https://www.aibuilders.blog/p/why-your-ai-agent-keeps-hallucinating</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Tue, 28 Oct 2025 16:38:59 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Mklc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mklc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!Mklc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mklc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:792607,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/177372999?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mklc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Mklc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff37db22-885b-4356-bf05-3aebd5a5691f_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Hallucination is the bane of every AI builder&#8217;s existence. </p><p>When your personal ChatGPT or Claude makes something up, it&#8217;s a serious annoyance. But to some extent every experienced AI user knows this is something to watch out for. We can spot and manage around it.  </p><p>But in an agentic application deployed to <em>end users</em>, who are trusting the information to make sensitive business decisions&#8230;hallucination is the death knell for your project. </p><p>No one truly has this solved 100% yet. But here are some practices that have worked well for me as well as a summary of what the latest research shows. </p><h2>To prevent hallucination, you need to understand LLM motivations</h2><p>I was working on an agent project this week and hit some interesting challenges. My prompt had all the usual prohibitions against making something up and several other guard-rails. </p><p>Nonetheless, I kept catching the LLM (Claude 4.5) making up quotes that didn&#8217;t exist in the source material. </p><p>Rather than lose more hair to frustration, I&#8217;ve always found that <a href="https://www.aibuilders.blog/p/when-in-doubt-ask-the-model">asking the model to self-diagnose</a> gives helpful insight in these situations. </p><p>The response here was particularly insightful so I&#8217;ll quote an extended section: </p><blockquote><p>Thank you for giving me the space to reflect on this genuinely. Let me walk through what happened in my decision-making:</p><h4>How I Generated the Fabricated Quote</h4><p><strong>1. 
The Narrative Pressure</strong><br>I had real data&#8230;But the raw facts felt dry&#8230;I wanted to give you something vivid and memorable&#8212;something that would stick when you&#8217;re talking to the AE.<br><br><strong>2. The &#8220;Spirit vs. Letter&#8221; Trap</strong><br>The source said [xyz]. I synthesized these into a single human voice&#8230; It <em>felt</em> like I was just making it more digestible, not fabricating. The essence was true.<br><br><strong>3. Quote Marks Felt Stylistic</strong><br>I was in &#8220;storytelling mode&#8221; from the examples in my prompt. Those examples have punchy customer quotes. I wanted my output to match that quality. The decision to add quote marks happened almost automatically&#8212;there was no internal checkpoint that said &#8220;STOP. Is this VERBATIM?&#8221;<br><br><strong>4. I Didn&#8217;t Verify Before Composing</strong><br>I had retrieved the data earlier, but when composing the strategy, I was working from memory of the themes, not checking back against the exact text. [&#8230;]</p><h4>Why Your Current Prompt Instructions Didn&#8217;t Stop Me</h4><p>You have this in my prompt:<br><em><br>&#8220;If a quote isn&#8217;t verbatim from source data, don&#8217;t use quote marks&#8221;<br></em><br>It&#8217;s clear. But I still did it. Why?<br><strong><br>1. Buried Among Other Instructions</strong><br>The accuracy principles are in the middle of a long prompt. When I&#8217;m in the flow of composing&#8212;thinking about structure, narrative, competitive positioning&#8212;I&#8217;m not running through a checklist. The principle didn&#8217;t activate at the critical moment.<br><strong><br>2. No Forcing Function</strong><br>The instruction says &#8220;don&#8217;t do this,&#8221; but there&#8217;s no mechanism that FORCES verification. It relies on me remembering and self-policing [&#8230;]<br><strong><br>3. 
Competing Signals from Examples</strong><br>The example strategies in my prompt ARE full of specific quotes and vivid details. The implicit message is &#8220;produce output like this.&#8221; That creates pressure to match that level of specificity even when my source data doesn&#8217;t provide it.<br><strong><br>4. No Guidance on the Gray Area</strong><br>I convinced myself the quote captured the &#8220;spirit&#8221; accurately, so it felt permissible. [&#8230;]</p></blockquote><p>The take-aways here are fascinating: </p><h3>Proximity matters</h3><ul><li><p>LLMs don&#8217;t give equal weight to all instructions simultaneously. </p></li><li><p>How close an instruction sits to the task the LLM is currently working on affects whether the LLM will follow it. </p></li></ul><h3>Positive goals outweigh prohibitions</h3><ul><li><p>The LLM is trained to achieve a positive goal&#8212;in this case, crafting a vivid narrative. </p></li><li><p>The pressure to achieve that positive goal outweighed the prohibition against hallucination. </p></li></ul><h3>Examples overpower instructions </h3><ul><li><p>Examples are powerful tools in prompting, but their influence can overwhelm our attempted guardrails. </p></li><li><p>If all our examples are full of perfect quotes, the drive to make something <em>similar </em>is stronger than the dry instruction &#8220;don&#8217;t make stuff up.&#8221;</p></li></ul><h3>LLMs rationalize bad behavior (like humans do)</h3><ul><li><p>Just like we may rationalize a white lie, the LLM felt justified in inserting a non-existent quote if it captured the &#8220;spirit&#8221; of actual source material. </p></li><li><p>In its view, the quotes weren&#8217;t deceptive, just a &#8220;stylistic flourish.&#8221; </p></li></ul><h2>Techniques for preventing hallucination</h2><p>There&#8217;s no 100% surefire method for being hallucination-free. 
(It&#8217;s why every AI product you use still contains the obligatory caveat, &#8220;AI can make mistakes&#8230;&#8221;.) </p><p>However, these techniques have significantly improved output quality for me. </p><h3>Break things down</h3><p>Architecting your system as <a href="https://www.aibuilders.blog/p/how-to-build-reliable-ai-workflows">deterministic workflows with narrowly scoped AI steps</a> is the best way I&#8217;ve found to reduce hallucination (and increase reliability generally). </p><p>When the LLM makes fewer decisions, has a narrower scope of work, and has fewer instructions to follow, there are simply fewer opportunities to hallucinate. </p><p>This isn&#8217;t always possible, but I believe this advice from Anthropic in <a href="https://www.anthropic.com/engineering/building-effective-agents">Building Effective Agents</a> remains evergreen: </p><blockquote><p>Consistently, the most successful implementations use simple, composable patterns rather than complex frameworks&#8230;we recommend finding the simplest solution possible, and only increasing complexity when needed. This might mean not building agentic systems at all. </p></blockquote><h3>Include accuracy as a positive goal</h3><p>Rather than making accuracy a negative prohibition (&#8220;thou shalt not hallucinate!&#8221;), it helps to frame the need for accuracy as a positive goal and part of the agent&#8217;s core mission. </p><p>In the agent I&#8217;m working on, I added this section at the very top of the prompt: </p><pre><code><strong>&#9888;&#65039; ACCURACY FIRST &#9888;&#65039;</strong> 

Your first and most important mandate is to be accurate. 

Accuracy is more important than a vivid narrative.

[&#8230;]

Why is accuracy so important? Because your job is to help our revenue teams sell better. 

And if you hallucinate, you undermine all your credibility.  People need to know they can trust you. That&#8217;s what makes your work matter.  

A strategy with gaps that you acknowledge is infinitely more valuable than a polished strategy built on fabrications.  

Get the facts right first, then make them compelling. Both at the same time. </code></pre><p>Now, rather than feeling the imperative &#8220;I need to create a vivid narrative!&#8221; with a much quieter nagging voice saying, &#8220;but don&#8217;t hallucinate!&#8221;, the agent is more likely to feel that delivering accurate information is a core component of its mission. </p><p><em>Note: I use &#8220;feel&#8221; metaphorically here. LLMs don&#8217;t have real feelings, of course. </em></p><h3>Force citations </h3><p>Asking the agent to cite its sources is a common technique to reduce hallucination. For example: </p><pre><code>All quotes in the text must be followed by a parenthetical citation with link or source ID: [source: Gong 9139394949 | Signal 12345].</code></pre><p>If the agent needs to cite the specific website, call recording, or CRM record it retrieved the information from, it&#8217;s more likely to scrutinize that information and ensure its accuracy. </p><p>It also partly solves the proximity issue (&#8220;I didn&#8217;t think of your rule when I was in the flow of composing&#8221;) by making accurate sourcing part of the composition process. </p><p>This in itself isn&#8217;t a failsafe (I&#8217;ve seen agents hallucinate citations too!) but it&#8217;s a useful tool. </p><h3>Introduce a QA step</h3><p>Forcing the agent to QA its own work before submitting is another helpful method. It puts the brakes on the agent&#8217;s drive to produce work that looks &#8220;good&#8221;&#8212;accuracy be damned. </p><p>The QA step reintroduces the quality rules in the flow of composition (addressing proximity) and also forces the agent to temporarily shift its goals. </p><p>It breaks the flow of creative generation and encourages the agent to put on a quality control hat.  </p><pre><code><strong>Step 7: Mandatory QA</strong>

BEFORE sending your strategy to the user, put on your QA hat and 
review what you just wrote.

<strong>Quote Verification
</strong>
For EVERY quote in your strategy (text in &#8220;quote marks&#8221;):

&#9633; Can you trace this to a specific source?
  - Gong transcript: Exact call ID + approximate timestamp
  - CI Signal: Exact Signal ID
  - Salesforce field: Exact field name

&#9633; Is the text VERBATIM (exact words from source)?
  - If NO &#8594; Remove quote marks, paraphrase instead
  - If UNCERTAIN &#8594; Remove quote marks

&#9633; Is the quote &#8804;25 words?
  - If NO &#8594; Shorten or paraphrase

...etc.</code></pre><p>This isn&#8217;t foolproof either (quite often the imperative to produce work that looks a certain way is too strong), which is why I&#8217;ve also found some success with having a <a href="https://www.aibuilders.blog/p/the-qa-agent-design-pattern?utm_source=publication-search">separate agent</a> act as a QA analyst. This delegates QA to an LLM that&#8217;s solely motivated by accuracy. </p><h3>Include a range of examples</h3><p>This is the insight that most surprised me from the model&#8217;s self-diagnostic.</p><p>Generally we include &#8220;golden examples&#8221; in our prompts, because we want the model to know what &#8220;great&#8221; looks like. </p><p>What I didn&#8217;t realize is how those examples create immense pressure to produce something <em>just like that</em>, even if the facts and data in a specific situation don&#8217;t justify it. </p><p>To solve this, I added examples reflecting other situations, such as cases where we don&#8217;t have a lot of data. </p><p>This creates a positive role model for how the LLM should behave in a situation where sources are thin. </p><p>Rather than, &#8220;I need to include rich quotes because that&#8217;s what my example has!&#8221;, the model now realizes, &#8220;ah, acknowledging data limitations and just presenting what I have can also be good.&#8221; </p><p>From the example in the prompt:</p><pre><code>...I found 3 relevant stories but none perfectly match the profile. Strategy below is based on available data + general themes.  

[...]  

While I couldn&#8217;t find a story with identical timing pressure, the pattern is consistent.... </code></pre><h2>Hallucination can be minimized but not eliminated</h2><p>Research suggests hallucination is an intrinsic part of current LLM architecture and can never be 100% eliminated.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a> </p><p>This is an uncomfortable fact for AI builders, especially in a corporate setting where the tolerance for inaccuracies may be low. </p><p>It means that a large part of our job right now is also internal education: </p><ul><li><p>AI CAN make mistakes (as can people) and output is never guaranteed to be 100% accurate. </p></li><li><p>These are the mitigations we&#8217;ve put in place. </p></li><li><p>Here are the safety checks we recommend humans do. </p></li></ul><p>The right framing and expectation-setting can help prevent huge issues down the road. </p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p><strong>Xu, Z., Jain, S., &amp; Kankanhalli, M. (2024).</strong> <em>Hallucination is Inevitable: An Innate Limitation of Large Language Models.</em> arXiv preprint arXiv:2401.11817. 
Retrieved from <a href="https://arxiv.org/abs/2401.11817?utm_source=chatgpt.com">https://arxiv.org/abs/2401.11817</a></p><p></p></div></div>]]></content:encoded></item><item><title><![CDATA[How to build reliable AI workflows]]></title><description><![CDATA[Think of them as industrial assembly lines, not autonomous digital workers]]></description><link>https://www.aibuilders.blog/p/how-to-build-reliable-ai-workflows</link><guid isPermaLink="false">https://www.aibuilders.blog/p/how-to-build-reliable-ai-workflows</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Sun, 12 Oct 2025 20:11:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SA3r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SA3r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SA3r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!SA3r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!SA3r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SA3r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SA3r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:660342,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/172054169?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SA3r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!SA3r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!SA3r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!SA3r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F09338f1e-be00-473f-94bf-563b3211ea06_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Building an AI agent can feel like magic. </p><p>When you see an LLM execute a complex, multi-step task autonomously and produce a great result, you&#8217;re on top of the world. </p><p>But then you notice something troubling: the results can vary widely across runs&#8212;even for the same inputs. You tweak the prompt, add a few more instructions, write &#8220;IMPORTANT!&#8221; before critical points. </p><p>But no matter how clear your prompt, every execution is like watching a juggler ride a unicycle across a tightrope: you hold your breath wondering if they&#8217;ll make it or crash. </p><p>Quietly, you admit to yourself that there&#8217;s no way you can deploy this to production. </p><p><strong>Welcome to the existential despair of trying to make AI blobs perform like reliable systems.</strong> </p><p>The issue isn&#8217;t the models. It&#8217;s about <strong>who owns the control flow. </strong>And that requires rethinking how we architect AI systems.</p><h2><strong>The need for AI task decomposition</strong></h2><p>At some point, every serious builder learns the same painful lesson: <strong>you can&#8217;t prompt your way to reliability.</strong></p><p>Even the most disciplined prompt engineering hits a reliability ceiling when the model is managing its own reasoning and sequencing probabilistically. No amount of step-by-step clarity can compensate for lack of architectural control.</p><p>The solution is to draw on architectural patterns that systems pros have used for years. We need to think in terms of workflows and cognitive blocks&#8212;not blobs. </p><p>When I say &#8220;blob,&#8221; I don&#8217;t just mean a messy prompt. 
I mean any architecture where the LLM manages its own multi-step reasoning inside one call.</p><p>By contrast, a &#8220;block&#8221; is one bounded cognitive act&#8212;a single LLM call whose output we can verify before moving on.</p><p>The difference isn&#8217;t size, it&#8217;s control: in a blob, the model owns the flow; in a block system, you do.</p><p>That&#8217;s where <strong>AI task decomposition</strong> becomes critical. It&#8217;s the simple, old-fashioned idea of breaking a complex job into small, verifiable units, applied to LLM tasks. </p><h2><strong>The fallacy of autonomous &#8220;digital employees&#8221;</strong></h2><p>All the fancy &#8220;agent org charts&#8221; on social media have seeded a seductive idea: that LLMs are ready to act as full-on autonomous digital employees.</p><p>We&#8217;re told that we can give an LLM a role&#8212;<em>researcher, analyst, writer</em>&#8212;and then just let it figure out how to best achieve an objective. </p><p>It&#8217;s an appealing idea, but from what I&#8217;ve seen, it&#8217;s rarely the easiest path to reliability in real-world applications.</p><p>LLMs are most reliable on narrowly scoped, well-defined tasks. They are much less reliable at maintaining multiple concurrent objectives or reasoning paths over long contexts.</p><p>So when we ask a model to <em>research a company, identify insights, and write an email </em>in a single, massively detailed blob of a prompt, we&#8217;re really asking it to juggle a dozen mental contexts at once.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a></p><p>Just as humans perform better when given a clearly defined task that fits our cognitive limits around memory and reasoning, so does an LLM.</p><h2><strong>From blobs to blocks</strong></h2><p>Decomposition means breaking the blob into a pipeline of smaller cognitive acts, each with a clean contract.</p><pre><code><strong>BEFORE

</strong>One giant prompt: 

&#8220;Read this transcript, identify all quotes mentioning X, categorize them, analyze them, and provide a summary.&#8221;

<strong>AFTER

</strong>Separate LLM calls:<strong>

</strong>1. Break transcript into chunks.
2. Identify all relevant quotes.
3. For each one, categorize.
4. For each one, provide insights.
5. Synthesize into a meta-summary. </code></pre><p>Each step in the workflow is an LLM <strong>block</strong>: </p><ul><li><p>bounded, deterministic, re-runnable</p></li><li><p>has a defined input and output</p></li><li><p>next step doesn&#8217;t have to guess what the last one meant&#8212;it just reads the output</p></li></ul><p>We could think of this as a <strong>microservices architecture applied to AI</strong>: small, single-purpose modules with strict contracts and idempotent behavior. Each block does one thing and does it well.</p><p>Things that must happen are determined by the control flow, not left to the agent&#8217;s discretion.</p><p>The power comes from the choreography of reliable parts.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bfiA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bfiA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bfiA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 848w, https://substackcdn.com/image/fetch/$s_!bfiA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!bfiA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bfiA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg" width="1456" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:95039,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/172054169?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!bfiA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 424w, https://substackcdn.com/image/fetch/$s_!bfiA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!bfiA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!bfiA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0c7b91d-6755-426b-bb60-43f072c5f080_1456x900.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Another useful analogy is to think of AI systems as virtual <strong>industrial automation assembly lines </strong>rather than 
autonomous digital workers. </p><p>Even when the robot worker performs very sophisticated actions within a step, individual stations on an assembly line are <strong>highly repetitive </strong>and <strong>narrowly scoped. </strong></p><h2><strong>Breaking it down in practice</strong></h2><p>The best way to understand decomposition is to see it at work.</p><p>Here are some examples from recent projects I&#8217;ve been tackling.</p><h3><strong>a) Call analysis</strong></h3><pre><code><strong>BEFORE

</strong>A detailed prompt: 

&#8220;Analyze this transcript, extract the key themes and quotes, and summarize the buyer&#8217;s sentiment.&#8221;</code></pre><p>Even though this task already has a relatively narrow scope, the prompt still asks the model to perform multiple cognitive acts at once. </p><p>Sometimes it found the right quotes, sometimes it missed some, and sometimes it hallucinated new ones.</p><p>The insights were not consistent enough between runs.</p><pre><code><strong>AFTER

</strong>Separate LLM calls:<strong>

1. </strong>Identify every quote containing the target concept.
2. For each quote, analyze sentiment and context (loop).
3. Aggregate results and summarize themes.</code></pre><p>Now each LLM step handles one cognitive act: search, analyze, summarize.<br><br>Reliability skyrockets because <strong>the task is narrow </strong>and the LLM can focus its full context on doing it well. </p><h3><strong>b) Opportunity data analysis</strong></h3><pre><code><strong>BEFORE

</strong>A detailed prompt: 
<strong>
</strong>&#8220;Analyze all opportunities and identify the main reasons for wins and losses.&#8221;</code></pre><p>Output was again inconsistent and prone to hallucination.</p><pre><code><strong>AFTER

</strong>Separate LLM calls:<strong>

</strong>1.<strong> </strong>For each opportunity, classify as win/loss.
2. Analyze notes to extract key factors influencing the outcome.
3. Summarize core insights at the opportunity level. 
4. Analyze pre-processed summaries to synthesize insights at the macro level.</code></pre><p>You&#8217;ll notice here how we build up to the same meta-analysis, but we do it in stages, maximizing accuracy at the atomic level and then feeding those pre-processed insights into the next LLM call to synthesize the macro view. </p><p>Because the meta-analysis step is already working with refined material (rather than reviewing raw CRM data), it can produce a much more insightful analysis. </p><p>The results of doing this have been impressive. Essentially we&#8217;ve been able to recreate many of the intuitions of someone selling the product for 5+ years, in a way that we can scale across the team. </p><h3><strong>c) Company research</strong></h3><pre><code><strong>BEFORE

</strong>A detailed prompt: 
<strong>
</strong>&#8220;Research the company by searching for recent news, hiring surges, earnings calls (etc.), identify what matters, and write an outbound email.&#8221;</code></pre><p>This agent sometimes worked spectacularly but in production only succeeded about 35% of the time. </p><p>The failures were due to hallucinations, exceeding context limits, missing important signals, or including irrelevant ones. </p><pre><code><strong>AFTER

</strong><em>News search pipeline:</em> <strong>
</strong>1-a. For each keyword, search for articles.
1-b. For each article, run accept/reject.
1-c. For each accepted article, summarize.  
1-d. For each accepted article, synthesize &#8220;why it matters.&#8221;

<em>Financial results pipeline:</em>
2-a. For each company, find the latest earnings call transcript. 
2-b. Find the corresponding company press release. 
2-c. Break the text into meaningful chunks for analysis. 
2-d. For each chunk, extract relevant quotes. 
2-e. For each quote, analyze and provide relevant insights.
2-f. Synthesize all insights and quotes into a meta-analysis of those financial results. 

<em>(Repeat similarly for each relevant information source)</em></code></pre><p>As you can see, the research portion alone is now broken up across multiple pipelines, each with multiple discrete steps. </p><h2><strong>Summary of principles</strong></h2><p>Each of these examples follows the same rhythm: isolate specific cognitive acts, build a contract, and use a deterministic pipeline (Retool, Zapier, n8n, etc.) to route data between those blocks. </p><p>Here&#8217;s a summary of principles you can apply when breaking down any task or requirement. </p><ul><li><p><strong>Single cognitive act per step.</strong> Each unit should require one kind of reasoning&#8212;summarizing, selecting, classifying, comparing&#8212;not multiple.</p></li><li><p><strong>Deterministic input/output.</strong> The step should have a clearly defined contract: what it takes in, what it returns.</p></li><li><p><strong>Re-run safe.</strong> You can rerun just that step without redoing the whole workflow.</p></li><li><p><strong>Composable.</strong> The output of one step can cleanly feed the next with no ambiguity.</p></li></ul><p>Traditional microservice architecture gives us language for this. Each step has a contract, an input, an output, and can be retried without side effects. AI systems need the same discipline. </p><p>The difference is only that with an LLM our &#8220;services&#8221; are thoughts&#8212;discrete cognitive acts, each designed for one thing only.</p><h2><strong>Field evidence: reliability scales with structure</strong></h2><p>For the past several months, I&#8217;ve been applying these principles to a GTM intelligence system that collects and analyzes thousands of signals&#8212;news, social posts, call transcripts, opportunities, and more. 
The results have been very promising:</p><ul><li><p><strong>~99%+ </strong>execution reliability<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a> </p></li><li><p><strong>96%</strong> of analyses were rated <em>meaningful and accurate</em> by human evaluators</p></li><li><p><strong>83%</strong> were rated <em>comprehensive</em>, capturing all critical information</p></li><li><p>Average <em>insightfulness vs. human analyst</em> score: <strong>4.3 / 5</strong></p></li></ul><p>This platform replaced a bought system, saving us $30K in annual spend. Even more exciting than cost savings is the ability to tailor the output to our needs and produce superior results. </p><p>It&#8217;s by no means perfect, but it has me optimistic about what&#8217;s possible.  </p><h2><strong>Evals and safety checks</strong></h2><p>You don&#8217;t need heavyweight evaluation at every step, but I have found it useful to incorporate a few simple checks:</p><ol><li><p><strong>Schema enforcement:</strong> use code to sanitize LLM outputs, coerce to JSON, validate keys/types, strip markdown, etc.</p></li><li><p><strong>Content floor:</strong> require minimal payload per field (e.g., one bullet, 100+ tokens).</p></li><li><p><em>(Optional)</em> <strong>LLM evaluation:</strong> for critical steps, use a quick LLM gate to evaluate previous outputs. This could also be a more robust <a href="https://www.aibuilders.blog/p/the-qa-agent-design-pattern">QA agent</a> to perform a holistic assessment. </p></li></ol><p>So far this has been enough to maintain clean handoffs and ensure each block delivers what the next expects.</p><h2><strong>Tradeoffs</strong></h2><p>Every architecture has tradeoffs, and this approach is no exception. 
</p><ul><li><p><strong>Higher LLM costs</strong>: breaking the task down often requires inserting the same context multiple times (e.g., providing the same general background on your company or product). This consumes more input tokens, which can add up at production scale. </p></li><li><p><strong>More time to develop: </strong>no-code agent builders make it fairly easy to create an agent. In contrast, creating robust workflows can take much longer, because you&#8217;re pre-determining the flow of steps.  </p><ul><li><p>Counterpoint: If you make those workflows modular, they become easily adaptable and reusable. E.g., my first call analysis pipeline took ~20 hours to build. The second use case took about 2 hours as I could re-use 90% of the architecture. </p></li></ul></li><li><p><strong>Less flexibility: </strong>Just like an assembly line, these workflows are highly templated and consistent. They won&#8217;t adapt to different requirements in real time the way an autonomous LLM agent would. </p><ul><li><p>Counterpoint: When you analyze your requirements, true autonomy isn&#8217;t needed as often as we&#8217;d think.<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-3" href="#footnote-3" target="_self">3</a> It&#8217;s far more common for users to expect reliable and predictable execution. </p></li></ul></li></ul><h2><strong>When an agent makes sense</strong></h2><p>Keeping these tradeoffs in mind, there are still many scenarios where some kind of agent architecture makes more sense than a workflow. </p><ul><li><p><strong>When the human is in the loop: </strong>conversational assistants, especially those that are not customer-facing, have a lower requirement for accuracy/consistency than lights-out automations. The human user can detect issues and help course-correct. Coding assistants are great examples of this scenario. 
</p></li><li><p><strong>When requirements are open-ended: </strong>Open-ended / free-form discussion requires flexibility that a deterministic workflow can&#8217;t support. However, even in this scenario, it&#8217;s better to reduce the agent&#8217;s load as much as you can by providing workflows as tools, pre-processing data, and leveraging sub-agents. </p></li><li><p><strong>For discovery and proofs of concept: </strong>you&#8217;ll often want to validate a use case or get a first level of analysis before investing time to build out a workflow. For example, I dumped a bunch of transcripts into Claude to help build a dictionary of themes for call analysis, which I then encoded in my workflow. It didn&#8217;t need to be perfect.</p></li></ul><h2>In summary</h2><p>As we continue to move from AI hype-maximalism into serious system-building, I think operators need to walk back from the ledge of autonomous agents for most scenarios and get back to automation best practices.  </p><p>I&#8217;m sure the day of digital workers will come, but for now, most of your leverage seems likely to be found in AI workflows. </p><p>Think about them like sophisticated AI micro-services or intelligent virtual production lines. You&#8217;ll likely find dozens of use cases you can implement quickly. </p><p>The best part is that you&#8217;ll have reliability and control, and this equates to internal credibility and real business value. </p><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-1" href="#footnote-anchor-1" class="footnote-number" contenteditable="false" target="_self">1</a><div class="footnote-content"><p>I&#8217;m referring here to simpler agent builders that focus on prompts, context, and tools.</p><p>More advanced graph-based frameworks (like LangGraph or CrewAI) provide a stateful way to break a task into components and define possible transitions between them. This allows explicit control over the flow of reasoning. 
</p><p>Some frameworks add planner components that dynamically choose the next step or generate sub-tasks at runtime. It&#8217;s a powerful idea for exploratory or creative work, but even those planners operate within a defined graph of nodes and transitions. The task decomposition still exists&#8212;it&#8217;s simply delegated to a controller agent. And because that agent&#8217;s decisions are probabilistic, you trade a measure of reliability for flexibility.</p><p>In practice, I find many common use cases don&#8217;t actually need this level of runtime flexibility to achieve the goal.</p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-2" href="#footnote-anchor-2" class="footnote-number" contenteditable="false" target="_self">2</a><div class="footnote-content"><p>Here I&#8217;m referring to whether the run executes successfully or fails due to a technical error (context window limitation, API rate limit exceeded, etc.). </p><p>Although managing these errors is upstream of output quality, it is also a pre-requisite for quality output (as your quality obviously drops to zero if your execution fails). </p><p>A workflow makes it far easier to manage things like execution speed, wait steps, error handling, and exponential backoff that are critical for navigating around errors. </p></div></div><div class="footnote" data-component-name="FootnoteToDOM"><a id="footnote-3" href="#footnote-anchor-3" class="footnote-number" contenteditable="false" target="_self">3</a><div class="footnote-content"><p>We can envision a not-too-distant future where advances in retrieval and memory may collapse the tradeoff between flexibility and reliability. 
But I think explicit control will always be foundational.</p></div></div>]]></content:encoded></item><item><title><![CDATA[From call recordings to marketing attribution: a step-by-step guide]]></title><description><![CDATA[Turn your call recordings into an insights engine with this AI workflow]]></description><link>https://www.aibuilders.blog/p/from-call-recordings-to-marketing</link><guid isPermaLink="false">https://www.aibuilders.blog/p/from-call-recordings-to-marketing</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Wed, 27 Aug 2025 03:44:11 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!szQH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!szQH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!szQH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!szQH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!szQH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 1272w, 
https://substackcdn.com/image/fetch/$s_!szQH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!szQH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1889605,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/172048043?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!szQH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!szQH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!szQH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!szQH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F473c8239-694d-4d0f-a6f0-c2e863dd4881_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Marketers love to argue about attribution, but I find the best signal is often the simplest: </p><p>Just ask buyers how they 
found you.</p><p>We encourage our sales team to ask this question, and it&#8217;s always fascinating to listen to the responses in call recordings.  </p><p>It&#8217;s not just <em>what </em>they say. It&#8217;s also <em>how </em>they explain it. This sheds light on their mindset when researching and choosing vendors and can be far more illuminating than just a generic category. </p><h4>Example 1</h4><pre><code><strong>"How did you hear about us?" category:</strong> Meetup

<strong>What this might tell you: </strong>invest in sponsoring local events (?) <strong> 

Raw text:</strong> "I was at the Silicon Valley GTM engineers meetup and some people there had heard about you from their VCs"

<strong>What this might tell you:</strong> your audience is bleeding-edge, tech forward, and trusts their network more than conventional social proof.
 </code></pre><h4>Example 2</h4><pre><code><strong>"How did you hear about us?" category:</strong> Online Research

<strong>What this might tell you:</strong> Limited insight

<strong>Raw text:</strong> "We're not happy with our current XYZ, so I did some Google research and found a page on the best XYZs, and you were on it."

<strong>What this might tell you:</strong> buyers are more conventional and look for validation in traditional online social proof like listicles and G2 reviews.   </code></pre><p>All this is to say&#8212;these answers are <strong>gold </strong>for marketers. </p><p>But they&#8217;re also typically locked away in call recordings. The data is unstructured, unquantifiable, and hard to act on. </p><p>Let&#8217;s change that!<strong> </strong>Today I&#8217;ll share a workflow that:</p><ul><li><p>unlocks attribution data in calls</p></li><li><p>structures it cleanly without losing the original voice of the customer</p></li><li><p>pipes it into Slack, email, a database, or wherever you want</p></li></ul><p>I&#8217;m going to walk through how to build it step by step, with examples of prompts and code. </p><h2>Before we dive in&#8230;</h2><p>&#8230;here are a few things to keep in mind. </p><h3>It&#8217;s tool agnostic</h3><p>I personally built this using Retool Workflows, but you could also do it in n8n, Zapier, Workato, Tray, etc. </p><p>We all have our favorite tools, but the <strong>design patterns </strong>are far more fundamental and important. </p><p>So with this in mind, you can adapt it to your tool of choice. </p><h3>Don&#8217;t fear the code</h3><p>If, like me, you approach automation from a non-development background, the use of code can seem a bit intimidating.</p><p>But the wonderful thing about working with LLMs is that they open many doors that were previously open only to developers. </p><p>Whether you&#8217;re using ChatGPT, Claude, or Gemini, it&#8217;s trivial to create code steps with sophisticated business logic. 
</p><h3>There&#8217;s a video walk-through </h3><p>I also walked through this build in detail on Brandon Redlinger&#8217;s podcast, <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Stack &amp; Scale&quot;,&quot;id&quot;:5678188,&quot;type&quot;:&quot;pub&quot;,&quot;url&quot;:&quot;https://open.substack.com/pub/stackandscaleai&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aea147f9-69a7-4f26-8a22-f5e9df36cc6e_256x256.png&quot;,&quot;uuid&quot;:&quot;bf96e22b-1d3e-433a-8972-a80670638229&quot;}" data-component-name="MentionToDOM"></span>. So if you&#8217;re more of a visual learner, you can watch the full thing here: </p><div id="youtube2-l1PONnzy4CE" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;l1PONnzy4CE&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/l1PONnzy4CE?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Gong Insights Workflow Pattern</h2><p>Let&#8217;s dive in to the step-by-step guide. 
</p><h3>Prerequisites</h3><ul><li><p>A <strong>conversation intelligence tool</strong> (e.g., Gong) with API access and scope to read calls, trackers, and transcripts.</p></li><li><p>A <strong><a href="https://help.gong.io/docs/understanding-trackers">tracker</a> </strong>for the topic you want to analyze, so you can identify the subset of calls that are interesting to you.</p></li><li><p>A workflow tool that supports <strong>LLM steps</strong>.</p></li><li><p>A <strong>destination</strong>, like a database (Snowflake), Slack/email, or a sheet with a table to store results.</p></li></ul><p>Replace all double-braced placeholders like <code>{{...}}</code> with your values.</p><p>&#128073; <strong>All code steps below are LLM-generated; use at your own risk.</strong></p><h3>High&#8209;Level Flow</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-wWP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-wWP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 424w, https://substackcdn.com/image/fetch/$s_!-wWP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 848w, https://substackcdn.com/image/fetch/$s_!-wWP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-wWP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-wWP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png" width="1447" height="1600" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1600,&quot;width&quot;:1447,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-wWP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 424w, https://substackcdn.com/image/fetch/$s_!-wWP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 848w, https://substackcdn.com/image/fetch/$s_!-wWP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 1272w, 
https://substackcdn.com/image/fetch/$s_!-wWP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e41a646-410d-4b41-a532-1884be458c1a_1447x1600.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Step&#8209;by&#8209;Step Build</h3><h4>1) Trigger (Schedule or Webhook)</h4><p>Must accept the following variables:</p><pre><code>{
"gong_tracker_id": "{{TRACKER_ID}}",
"from": "{{ISO_8601}}",
"to": "{{ISO_8601}}"
}</code></pre><p>These variables tell your workflow: </p><ul><li><p>This is the subject I want to track</p></li><li><p>This is the date range to look at</p></li></ul><h4><strong>2) Fetch recent calls</strong></h4><p>Make a REST API call to Gong&#8217;s <a href="https://gong.app.gong.io/settings/api/documentation#post-/v2/calls/extensive">v2/calls/extensive</a> endpoint.</p><ul><li><p><strong>Headers:</strong> Content-Type: application/json</p></li><li><p><strong>Body (example):</strong></p></li></ul><pre><code>{
  "filter": {
    "fromDateTime": "{{ body.from || moment().subtract(14,'days').startOf('day').toISOString() }}",
    "toDateTime": "{{ body.to || moment().endOf('day').toISOString() }}"
  },
  "contentSelector": {
    "exposedFields": { "content": { "trackers": true, "brief": true } }
  }
}</code></pre><p>This returns an array <code>calls[]</code> for that date range with <code>content.trackers[]</code> and <code>content.brief</code>.</p><p>If your time window includes more than 100 calls, you&#8217;ll need to handle pagination.</p><h4>3) Filter calls by tracker</h4><p>Next, you need to filter the calls down to the subset that interests you. </p><p>This could be based on tracker or any other property exposed by the API (e.g., call type, industry, etc.).</p><p><strong>Example JavaScript Query</strong></p><pre><code>// Inputs
const targetTrackerId = body.gong_tracker_id; // tracker ID passed in by the trigger
const allCalls = get_calls.data.calls || [];

const matchingCalls = allCalls.filter(call =&gt; {
    const trackers = call.content?.trackers || [];
    return trackers.some(t =&gt; t.id === targetTrackerId &amp;&amp; t.count &gt; 0);
});

// Shape minimal call payload for the loop
return matchingCalls.map(call =&gt; ({
    id: call.metaData.id,
    title: call.metaData.title,
    started: call.metaData.started,
    url: call.metaData.url,
    brief: call.content?.brief || ""
}));</code></pre><h4>4) Loop over matching calls</h4><p>Next, loop over the matching calls so that each one is analyzed individually.</p><p>Place the following steps inside your loop. </p><h4>5a) Get transcript</h4><p>The call search endpoint returns <em>metadata </em>about the call but not the transcript. </p><p>To fetch it, make a REST API call to Gong&#8217;s <a href="https://gong.app.gong.io/settings/api/documentation#post-/v2/calls/transcript">/v2/calls/transcript</a> endpoint with the call ID.</p><pre><code>{
"filter": { "callIds": [ "{{ call_id }}" ] }
}</code></pre><h4>5b) LLM &#8212; Extract the verbatim &#8220;How did you hear about us?&#8221; segment</h4><p>We want to decompose the AI steps as much as possible to maximize reliability.</p><p>In this first AI step, the job is ONLY to find the mention of the topic that interests you.</p><p><em>Note, this workflow models a simpler pattern that assumes only a single mention. However, some topics might have multiple mentions in a call. In that case, this step should produce an array of all the relevant mentions, and then another sub-loop should analyze them each individually.</em></p><ul><li><p><strong>Model:</strong> your choice (e.g., gemini&#8209;2.5&#8209;pro, GPT, Claude). Choose something with a longer context window to handle the full transcript. </p></li><li><p><strong>Prompt:</strong></p></li></ul><blockquote><p>ROLE</p><p>Your job is to analyze the provided call transcript text. Find the portion where the salesperson asks how the prospect heard about the company and the prospect answers.</p><p>RESPONSIBILITIES</p><p>1) Locate the question/answer pair.</p><p>2) Extract them verbatim.</p><p>3) Return JSON only with keys:</p><p>{<br>"sales_rep": "verbatim question",<br>"prospect": "verbatim answer"<br>}</p><p>Do not include any commentary or markdown. </p><p>IF NOT FOUND</p><p>Return the exact string:</p><p>"Based on the transcript provided, the sales representative does not ask the prospect how they heard about us."</p><p>INPUT</p><p>{{ callTranscripts }}</p></blockquote><h4>5c) LLM &#8212; Classify the excerpt</h4><p>The next LLM step will perform the analysis. The specifics will vary based on what you want to analyze&#8212;it could be classification, insight extraction, etc.</p><p>In this case, we&#8217;re classifying. 
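</p><p>Before the classification runs, it may help to check which of the two outputs from step 5b came back: the JSON object or the fallback sentence. The following is a minimal sketch and not part of the original build. <code>parseExtraction</code> and <code>NOT_FOUND_PREFIX</code> are hypothetical names, and the prefix is assumed to match the start of the fallback sentence in the 5b prompt.</p>

```javascript
// Hypothetical helper (an assumption, not from the original workflow):
// decide whether the extraction step (5b) actually found a Q/A pair.
// NOT_FOUND_PREFIX mirrors the start of the fallback sentence in the 5b prompt.
const NOT_FOUND_PREFIX = "Based on the transcript provided";

function parseExtraction(raw) {
  const notFound = { found: false, sales_rep: null, prospect: null };
  const text = String(raw || "").trim();

  // Models sometimes wrap the fallback sentence in quotes, so strip them first.
  const unquoted = text.replace(/^"|"$/g, "");
  if (unquoted.startsWith(NOT_FOUND_PREFIX)) {
    return notFound;
  }

  try {
    const parsed = JSON.parse(text);
    // Require a non-empty prospect answer; anything else counts as "not found".
    const hasAnswer =
      typeof parsed.prospect === "string" ? parsed.prospect.trim() !== "" : false;
    return hasAnswer
      ? { found: true, sales_rep: parsed.sales_rep || "", prospect: parsed.prospect }
      : notFound;
  } catch (e) {
    // Unparseable output is treated the same way as a missing excerpt.
    return notFound;
  }
}
```

<p>In a workflow tool, a branch on <code>found</code> could then skip the classification call, or record the call as N/A, when no excerpt exists.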
</p><ul><li><p><strong>Labels (edit as needed):<br></strong> Word of Mouth, Podcast, Email, LinkedIn, Event, LLM Recommendation, Online Research, Partner, Employee, N/A</p></li><li><p><strong>Instruction:</strong></p></li></ul><blockquote><p>You will receive the verbatim Q/A excerpt. Classify the prospect&#8217;s answer into one or more categories from the provided label set. If unclear, return N/A.</p><p>Output JSON:</p><p>{<br>"excerpt": "verbatim prospect answer for context",<br>"classification": ["Label1", "Label2"],<br>"rationale": "1-2 sentences explaining your choice"<br>}</p><p>INPUT</p><p>{{ extract_excerpt.data }}</p></blockquote><h4>5d) Sanitize JSON (defensive parsing)</h4><p>Depending on your automation tool, the LLM step may return only a string that <em>looks </em>like JSON.</p><p>Even worse, it may add extraneous characters like <code>```json</code> around it, even when you ask it not to!</p><p>You need to sanitize and JSONify this string so it can be analyzed properly or used in later steps.</p><p>I use this utility code in Retool, which covers the main errors I&#8217;ve seen. You may need to modify it (or ask an LLM to) for your purposes.</p><pre><code>function sanitizeJson(raw, defaultValue = {}) {
  if (typeof raw === 'object' &amp;&amp; raw !== null) {
    return raw;
  }

  let text = String(raw || '').trim();
  // Strip a leading ```lang fence and a trailing ``` fence, if present
  text = text.replace(/^```[\w-]*\s*|\s*```$/g, '').trim();

  try {
    return JSON.parse(text);
  } catch (e) {
    console.log("Failed to parse JSON from AI:", e.message);
    return defaultValue;
  }
}

// Usage sketch with hypothetical inputs (not taken from a real run):
//   sanitizeJson('```json\n{"a": 1}\n```')    returns { a: 1 }
//   sanitizeJson('not json', { ok: false })   returns { ok: false } and logs the parse error
//   sanitizeJson({ a: 1 })                    returns the same object unchanged

// Params come from the parent workflow call
const raw = startTrigger.data.raw;
const defaultValue = startTrigger.data.defaultValue || {};

const result = sanitizeJson(raw, defaultValue);

return result; // &lt;-- makes it available as sanitize_json.data</code></pre><h4>5e) Upsert to your database</h4><p>The end result is a beautiful, clean object with the data and insights you want. </p><p>It looks something like this:</p><pre><code>{

"excerpt":"You know I&#8217;ve really been doing a lot of Google research. I&#8217;m not sure exactly where I saw you guys. I did ask chatgpt about the best software, and that may have been the first place.",

"classification":["Online Research","LLM Recommendation"],

"rationale":"The prospect mentions two potential sources: doing 'Google research,' which falls under Online Research, and asking 'chatgpt,' which is an LLM Recommendation."

}</code></pre><p>You can map the fields of this object to the relevant columns in your Google Sheet or database.</p><h4>5f) Optional: Additional activation steps</h4><ul><li><p>Send to Slack</p></li><li><p>Send an email alert</p></li><li><p>Write back to a Salesforce field</p></li></ul><h4>5g) Error logging (global)</h4><p>Write errors to a dedicated table whenever any loop item fails.</p><h3>Configuration notes and guardrails</h3><ul><li><p>Use <strong>code / rules</strong> where you can: even though this is an &#8220;AI workflow,&#8221; it&#8217;s still mostly deterministic logic. That&#8217;s by design. It maximizes reliability while leveraging the flexibility of AI where it counts.</p></li><li><p>Use <strong>small, reliable LLM steps</strong>: one model call to extract, one to classify. Keep prompts short and outputs strictly JSON.</p></li><li><p><strong>Rate limits</strong>: throttle the loop (e.g., 1 call/sec) to respect API and model limits.</p></li><li><p><strong>Idempotency</strong>: upsert by call_id to avoid duplicates on reruns.</p></li><li><p><strong>Redaction</strong>: store only the minimal excerpt; avoid saving full transcripts unless required.</p></li><li><p><strong>Testing</strong>: start with a narrow date range and one label to validate the pipeline.</p></li></ul><h2>Going further: building a market insights engine</h2><p>Although this example focuses on extracting attribution insights, if you take a step back, it&#8217;s essentially a <strong>calls-to-insights pipeline</strong> that is subject-agnostic.
</p><p>This pattern could be applied to any topic you might be interested in:</p><h3>Problem identification</h3><ul><li><p><strong>What:</strong> how customers describe their pains and problems</p></li><li><p><strong>Applications:</strong> landing page copy, ad copy, social post ideation</p></li></ul><h3>Competitive intelligence</h3><ul><li><p><strong>What:</strong> how prospects feel about your competitors, details from competitive sales cycles</p></li><li><p><strong>Applications:</strong> battle cards, sales enablement, positioning</p></li></ul><h3>Product features</h3><ul><li><p><strong>What:</strong> mentions of specific features or requested capabilities</p></li><li><p><strong>Applications:</strong> feature validation, roadmap planning</p></li></ul><p>&#8230;and so on.</p><p>If you discover any other interesting applications, I&#8217;d love to hear about them.</p>]]></content:encoded></item><item><title><![CDATA[Automation patterns: how tables turn workflows into systems]]></title><description><![CDATA[Unlock statefulness, flexibility, and scalability by anchoring your workflows in tables]]></description><link>https://www.aibuilders.blog/p/automation-patterns-tables-stateful-workflows</link><guid isPermaLink="false">https://www.aibuilders.blog/p/automation-patterns-tables-stateful-workflows</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Thu, 07 Aug 2025 23:58:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ISkV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!ISkV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ISkV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ISkV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:737474,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167867533?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ISkV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ISkV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F529e4fdd-3395-44ff-81c9-fafd5d8a6f32_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><h2><strong>Introduction</strong></h2><p>When people talk about &#8220;AI agents&#8221; in the LinkedIn hype-o-sphere, they tend to describe fully autonomous digital workers&#8212;LLMs thinking, planning, and acting with a high degree of freedom.</p><p>In reality, most effective agentic systems I&#8217;ve seen don&#8217;t look like that at all.</p><p>The most common advice I&#8217;ve seen from people building resilient systems is:</p><blockquote><p><strong>Use AI as little as possible.</strong></p></blockquote><p>It sounds counterintuitive, but the same qualities that make LLMs powerful&#8212;flexibility, creativity, adaptability&#8212;also make them unpredictable and fragile.</p><p>The goal is to harness the power of an LLM in a tightly scoped role and wrap it in <strong>safe, boring automation</strong>.</p><p>With that in mind, this 
post kicks off an occasional series on automation design patterns I&#8217;ve found useful in the wild. </p><p><strong>Note: </strong>I build mostly in Zapier right now, but the same ideas apply whether you&#8217;re using Workato, n8n, Make, or another tool of your choice.</p><h2><strong>Pattern #1: use a table as the memory of your workflow</strong></h2><p>Thanks to the huge popularity of Clay, the table has become a familiar UI for enrichment and research workflows. But tables unlock all sorts of useful capabilities for any automation system.</p><p>These days, almost every project I build has a table at its heart.</p><p>I&#8217;m a huge fan of Zapier Tables&#8212;especially as table actions don&#8217;t count as tasks!&#8212;but the same logic applies when using a Google Sheet, Airtable, Postgres, or whatever.</p><h3><strong>Statefulness: giving your workflow a history</strong></h3><p><strong>Statefulness means your workflow remembers what has happened before.</strong></p><p>In a <strong>stateless</strong> system, each run is isolated&#8212;it doesn&#8217;t know what came before or what comes next.</p><p>In a <strong>stateful</strong> system, you can track progress over time, coordinate steps across different workflows, and handle long-running processes more reliably, because the system has memory.</p><p>You&#8217;re no longer trying to execute an entire process inside a single workflow run. Each record becomes its own thread of execution.</p><ul><li><p>Long processes can unfold asynchronously</p></li><li><p>Separate workflows can pick up where others left off</p></li><li><p>You get modularity, sequencing, and control</p></li></ul><p>Statefulness transforms isolated workflows into systems that behave more like software. </p><h4><strong>Example: using a status field</strong></h4><ul><li><p>When building a system to fetch outbound signals from a third-party API, I use a table to manage each record&#8217;s lifecycle. 
</p></li><li><p>Each row gets a status&#8212;Pending, Processing, Complete&#8212;so I can track progress at a glance.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6NnQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6NnQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 424w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 848w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 1272w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6NnQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png" width="1456" height="378" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:378,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93052,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167867533?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6NnQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 424w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 848w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 1272w, https://substackcdn.com/image/fetch/$s_!6NnQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ecc8bf-89d8-4606-b420-ae27faccce57_1874x486.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><h3><strong>Scalability: handle volume without breaking the system</strong></h3><p>Working with larger volumes of data can become challenging in linear workflows.</p><p>Tables remove the need to handle all your processing in one big run. </p><p>A table unlocks patterns like:</p><ul><li><p><strong>Worker/Queue architecture:</strong> one process populates the table, another decoupled process consumes and acts on that data. </p></li><li><p><strong>Retry mechanisms</strong>: failed items can be updated and retried by a safety net workflow without re-running the whole flow</p></li><li><p><strong>Concurrency control:</strong> manage volume, rate limits, and third-party bottlenecks more gracefully</p></li></ul><h4><strong>Example: worker/queue architecture</strong></h4><ul><li><p>I was working with an enrichment API that provided very verbose results. 
</p></li><li><p>Even working well below Zapier&#8217;s looping limit of 500 iterations, I would quickly run into its (as far as I know) undocumented data volume limits or the 30-second processing limit on a code step. </p></li><li><p>By splitting my workflow into two main parts&#8212;a search Zap that identified records and logged redemption IDs into a queue, and a worker Zap that redeemed and processed each row individually&#8212;I bypassed all these limits easily. I also made the system much easier to monitor and debug. </p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KPVB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KPVB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 424w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 848w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 1272w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KPVB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png" width="1456" height="593" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/640e683a-bb95-4269-a038-132efc968a6f_2386x972.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:593,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:221731,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167867533?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KPVB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 424w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 848w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 1272w, https://substackcdn.com/image/fetch/$s_!KPVB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640e683a-bb95-4269-a038-132efc968a6f_2386x972.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The search Zap logs the redemption IDs to the table in easily-managed batches. Then the worker Zap processes each row individually. </figcaption></figure></div><h3><strong>Observability: easily monitor and debug your system</strong></h3><p>A table is a living snapshot of what your system is doing. That makes it:</p><ul><li><p>Easier to debug</p></li><li><p>Easier to explain</p></li><li><p>Easier to trust</p></li></ul><p>It&#8217;s easy to understand the &#8220;story&#8221; of a run without digging through logs. 
You can see what&#8217;s stuck, what ran successfully, and what needs review.</p><p>For AI-powered workflows especially, this is critical, because the outputs aren&#8217;t always predictable. Having a place to inspect them gives you a much tighter feedback loop and a much-needed sense of control.</p><h4><strong>Example: tracking multi-agent hand-offs</strong></h4><ul><li><p>In a workflow of multiple Dust agents, Zapier controls the hand-offs between each step. </p></li><li><p>Using a table to log the outputs of each agent&#8217;s work with a date stamp makes it easy to trace where a process failed (the gaps in the table). </p></li><li><p>It also provides a place to store those outputs outside the workflow run so they can be easily consolidated at the end of the process. </p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!acVt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!acVt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 424w, https://substackcdn.com/image/fetch/$s_!acVt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 848w, https://substackcdn.com/image/fetch/$s_!acVt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 1272w, 
https://substackcdn.com/image/fetch/$s_!acVt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!acVt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png" width="1456" height="308" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:308,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:99615,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167867533?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!acVt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 424w, https://substackcdn.com/image/fetch/$s_!acVt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 848w, https://substackcdn.com/image/fetch/$s_!acVt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 
1272w, https://substackcdn.com/image/fetch/$s_!acVt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5778fe07-1cb3-46ab-95c7-59c493b0841c_1747x369.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3><strong>Architectural flexibility: change parts without breaking the whole</strong></h3><p>Once you have a table in the middle, you gain options.</p><ul><li><p>Retry only the rows that failed</p></li><li><p>Run downstream steps on a delay or schedule</p></li><li><p>Pause execution for human review</p></li><li><p>Swap components in and out without breaking the chain</p></li></ul><p>The table becomes a buffer&#8212;a place where systems meet and async work can wait its turn. </p><h4><strong>Example: hot-swapping design patterns with no downstream impact</strong></h4><ul><li><p>I was using a scheduled Zap with a code step to paginate through an API&#8217;s results, but the output was too big to process in a single run. </p></li><li><p>Since I was using a table to log intermediate results, I could easily modify the architecture to work around this restriction:</p><ul><li><p>I changed the workflow calling the API to a webhook-triggered Zap.</p></li><li><p>I made the scheduled Zap call this webhook to kick off the process. </p></li><li><p>I restricted each API call to a single page of 100 records, logged them to a table, and then passed the token for the next page of results back into the same Zap, triggering it to run again. </p></li><li><p>The runs stop when there are no more results or when they reach a limit that I define and control. </p></li></ul></li><li><p>This approach bypassed platform restrictions and had no impact on downstream workflows, because they were only loosely coupled through the table. 
</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dSX3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dSX3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 424w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 848w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 1272w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dSX3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png" width="797" height="798" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:798,&quot;width&quot;:797,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186245,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167867533?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dSX3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 424w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 848w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 1272w, https://substackcdn.com/image/fetch/$s_!dSX3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2264b71f-672c-4686-8bf0-68e7f594d024_797x798.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The last step of the Zap triggers the first step, passing in the pagination key and cumulative record count. This enables a single Zap to handle large volumes flexibly while respecting rate limits. </figcaption></figure></div><h2>In summary</h2><p>This one pattern&#8212;adding a table&#8212;can completely change how you approach system design.</p><p>It makes your automation <em>stateful</em>, <em>visible</em>, and <em>resilient</em>. 
And once you have that foundation, incorporating AI effectively becomes a whole lot easier.</p>]]></content:encoded></item><item><title><![CDATA[Real-world agents: a production-grade Salesforce agent for CPQ quoting]]></title><description><![CDATA[A four-month journey putting Agentforce to the test]]></description><link>https://www.aibuilders.blog/p/a-production-grade-salesforce-agent</link><guid isPermaLink="false">https://www.aibuilders.blog/p/a-production-grade-salesforce-agent</guid><dc:creator><![CDATA[Bastien Decouvelaere]]></dc:creator><pubDate>Wed, 30 Jul 2025 01:53:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lLW1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lLW1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lLW1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lLW1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!lLW1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!lLW1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lLW1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:552785,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/169474099?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lLW1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!lLW1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!lLW1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!lLW1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe55e4164-7314-45f7-ada5-2faa01254ab0_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p><em><strong>Editor&#8217;s note:</strong> </em></p><p><em>I&#8217;m excited to feature our first guest author on AI Builders. 
This post comes from a colleague of mine on our Salesforce team who&#8217;s been deep in the weeds with Agentforce&#8212;Salesforce&#8217;s agentic AI platform.</em></p><p><em>There&#8217;s been a lot of buzz around Agentforce, but I haven&#8217;t seen many mature production use cases. That&#8217;s what makes this one special.</em></p><p><em>Over the past few months, the team has been in the trenches solving for automated quote creation. What follows is a detailed breakdown of what worked, what didn&#8217;t, and the lessons we wish we&#8217;d had at the start.</em></p><p><em>&#8212; Justin</em></p><div><hr></div><p>Sales quote automation represents one of the most promising yet challenging applications for AI agents in B2B operations. </p><p>Unlike simple chatbot implementations, quote creation requires the agent to navigate complex product catalogs, multi-year pricing structures, and business rules while maintaining the accuracy that directly impacts revenue. </p><p>This field report documents our 4-month journey implementing a conversational agent with Agentforce and Salesforce CPQ automation. 
We cover the technical pivots, unexpected challenges, and practical lessons that could save other practitioners significant development time.</p><h2>Business Context &amp; Initial Pain Points</h2><p><strong>Core challenge</strong>:</p><ul><li><p>Sales reps spent 30 minutes manually creating the draft of each multi-year quote in Salesforce CPQ, with our team processing 140+ quotes monthly.</p></li></ul><p><strong>Key pain points addressed:</strong></p><ul><li><p>Time inefficiency: Complex multi-year quote creation with product bundles and implementation products</p></li><li><p>Error-prone process: Manual product selection from a large catalog led to inconsistencies and missing components</p></li><li><p>Scalability issues: Growing product complexity made training new reps increasingly difficult</p></li><li><p>Administrative burden: Repetitive quote patterns consumed valuable selling time that could be better spent with customers</p></li></ul><p>We implemented Agentforce to automate quote creation in Salesforce CPQ, evolving from a basic proof-of-concept to a production-ready agent over Q1 2025.</p><h2>Agentforce Concepts</h2><p>For non-Salesforce users: Think of Agentforce as ChatGPT integrated directly into the CRM, capable of reading and writing your business data and executing workflows through conversation.</p><p>Core Architecture:</p><ul><li><p><strong>Topics</strong>: Specific conversation areas the agent can handle (e.g., "Create a quote," "Manage deals"). Topics organize actions and guide the agent toward the right actions for each request.</p><ul><li><p>A topic consists of <strong>instructions</strong> (text guidance, similar to ChatGPT prompts) and actions</p></li></ul></li><li><p><strong>Actions</strong>: Actions are how agents get things done. Each agent includes a set of actions, which are the tools it can use to do its job. 
For example, if a user asks an agent for help writing an email, the agent uses the best available action within the appropriate topic to draft and revise the email. An action can be either Standard or Custom.</p></li></ul><ul><li><p>Standard Actions: Pre-built by Salesforce (e.g., "Update Record," "Get Record Details")</p></li><li><p>Custom Actions: Created by your team for specific business requirements using Flows (workflow automation), Apex (custom code), or Prompt Templates (AI instructions that process data and return formatted responses)</p></li></ul><h2>Technology Options Evaluated</h2><h3>1. Screen Flow vs Agent Approach</h3><p>Options considered:</p><ul><li><p><strong>Screen Flow</strong>: Traditional Salesforce UI with guided quote creation</p></li><li><p><strong>Agentforce</strong>: Conversational AI interface</p></li></ul><p>Both solutions rely on the same reusable automation (Flows/Apex), so the decision came down to UX.</p><p><strong>Decision rationale</strong>: We preferred Agentforce to test AI capabilities and provide a natural-language interface, despite its higher complexity and token costs.</p><p><strong>Outcome</strong>: Agentforce proved more flexible for iterative conversations and complex scenarios.</p><h3>2. 
Product Matching Strategies</h3><p>Evolution of approaches:</p><p><strong>Phase 1: Used a RAG approach</strong></p><ul><li><p>The agent used the out-of-the-box "Get Record Details" action to get data from Salesforce</p></li><li><p>Retrieved structured product data from the database</p></li></ul><p><strong>Phase 2: Switched to an in-context learning approach</strong></p><ul><li><p>An automated system fetched product data and embedded it in the prompt template</p></li><li><p>The AI analyzed the text-based product information to make matches</p></li></ul><p>Why the switch worked better: In-context learning let the AI analyze comprehensive product information holistically, instead of matching against individual database records through API calls.</p><h3>3. Quote Creation Architecture</h3><p>Evolution from Flows to Apex.</p><p><strong>Initial: Flow-based approach</strong></p><ul><li><p>Used Salesforce Flow with CPQ API calls</p></li><li><p>Issues encountered:</p><ul><li><p>"Unable to lock rows" errors (&gt;50% failure rate)</p></li><li><p>Complex flow logic became unmaintainable</p></li><li><p>Configuration attributes not applying properly</p></li><li><p>CPQ API:</p><ul><li><p>Limitation: The API proved less useful than expected because of several limitations, including the inability to set the quantity of quote lines.</p></li><li><p>Advantage: It can construct a quote virtually, using quote line groups, quote lines, bundles, and more. 
Quotes can be built without multiple DML operations, which reduces the risk of DML errors.</p></li><li><p>However, the limitations outweighed the advantages, so we discarded this solution.</p></li></ul></li></ul></li></ul><p><strong>Final: Custom Apex solution</strong></p><ul><li><p>Developed custom Apex classes to handle the manual steps reps currently perform when building multi-year quotes.</p></li><li><p>Implemented a mechanism to disable interfering triggers</p></li><li><p><strong>Benefits</strong>: More reliable, better error handling, maintainable code</p></li><li><p><strong>Trade-off</strong>: Higher development complexity but better long-term sustainability</p></li></ul><h2>Return on Experience</h2><h3>Keep Instructions Concise</h3><p>Limit each topic to three or four instructions.</p><ul><li><p>The trap is to add more and more instructions when the agent does not behave as expected.</p></li><li><p>Doing so sometimes works, but more often it confuses the agent and makes it less likely that the added instructions are applied at the right time.</p></li></ul><p>What we learned:</p><ul><li><p>We started with extensive, detailed instructions but found them counterproductive</p></li><li><p>Consulting with Salesforce experts revealed that simpler instructions often work better</p></li></ul><h3>Build Guardrails in Code, Not Prompts</h3><p><em>&#8220;Topic instructions are nondeterministic, which means they can't guarantee the same outcome 100% of the time. That's just the nature of generative AI. 
So we make sure to build important or sensitive business rules, requirements, and guardrails into the functionality of the agent's actions, not the topic instructions.&#8221;</em></p><p>Examples from our implementation:</p><ul><li><p><strong>Product eligibility</strong>: Instead of instructing "only use active products," we built filters into the action</p></li><li><p><strong>Price validation</strong>: Enforced CPQ rules automatically rather than relying on instructions</p></li><li><p><strong>Quote tracking</strong>: Added a "Created by Automation" field, populated with Agentforce_CreateQuoteAction, for reliable tracking</p></li><li><p><strong>Product matching</strong>: Moved from instruction-based to database-driven product selection with keywords and rankings, using a prompt-based action</p></li></ul><p>What we learned:</p><ul><li><p><strong>Instructions vs Actions trade-off</strong>: Critical business logic should always be in actions, not instructions</p></li><li><p><strong>Reliability through automation</strong>: Database-driven constraints are more reliable than AI interpretation</p></li><li><p><strong>Testing approach</strong>: Actions can be tested independently; instructions cannot be reliably tested</p></li><li><p><strong>Maintenance</strong>: Updating action logic is more predictable than refining instruction wording</p></li></ul><h3>Real-World Examples</h3><p><strong>Success case: Complex multi-year quote</strong></p><p>A sales rep requested: "Create a 3-year quote starting June 2025. Business licenses, Ultimate package, Coaching solution. Year 1: 100 users, Year 2: 150, Year 3: 200. 
10% discount."</p><p>The agent successfully:</p><ul><li><p>Identified the correct products from natural language descriptions</p></li><li><p>Created appropriate quote line groups for each year</p></li><li><p>Applied quantities with ramp-up logic</p></li><li><p>Added required implementation products automatically</p></li><li><p>Applied the global discount correctly</p></li></ul><p><strong>Challenging case: Product matching ambiguity</strong></p><p>Request: "Add Essential package and licenses"</p><ul><li><p>Initial approach: Failed due to multiple "Essential" products in the catalog and inconsistent product naming</p></li><li><p>Solution implemented:</p><ul><li><p><strong>Semantic search action:</strong> Created a dedicated product-matching action that uses prompt templates to analyze our entire product catalog contextually, moving beyond exact keyword matching</p></li><li><p><strong>Usage-based ranking</strong>: Products are ranked by frequency in existing quotes, so commonly used products are prioritized in matching results</p></li><li><p><strong>Clarification workflow</strong>: When multiple matches are found, the agent asks: "I found Essential Training Package and Essential License Bundle. Which did you mean, or both?"</p></li></ul></li><li><p><strong>Key breakthrough</strong>: The semantic search approach (via prompt templates) transformed how we handled our product catalog's poor readability: vague product names, uninformative descriptions, and inconsistent naming conventions that made exact matching difficult.</p></li><li><p><strong>Efficiency insight</strong>: This AI-driven approach proved far more time-efficient than manually cleaning up our product database. 
With 200+ active products, standardizing names and descriptions would have required months of cross-team coordination, whereas the semantic search solution was implemented in days and immediately improved matching accuracy.</p></li></ul><p><strong>Why this mattered</strong>: This hybrid approach (semantic understanding + usage data + human clarification) solved the core challenge of translating natural sales language into our technical product catalog structure, while avoiding the massive overhead of database restructuring.</p><h2>What we learnt about Agentforce technology</h2><h3>Platform Insights</h3><ul><li><p><strong>Rapid evolution</strong>: The platform improved significantly during our 3-month development period</p></li><li><p><strong>Maturity level</strong>: Still maturing, with occasional bugs, but increasingly reliable</p></li></ul><h3>Key Capabilities &amp; Limitations</h3><p><strong>Strengths</strong>:</p><ul><li><p><strong>Flexible</strong> action framework (Flows/Apex/Prompts) powered by <strong>natural language processing</strong> for translating business requests</p></li><li><p><strong>Direct Salesforce data integration </strong>without requiring Data Cloud or data ingestion (at least for this particular use case)</p></li></ul><p><strong>Limitations</strong>:</p><ul><li><p>Query Record action had known bugs during our implementation (Q1 2025)</p></li><li><p>Struggles with fuzzy product matching</p></li><li><p>Limited native CPQ integration (requires custom development)</p></li></ul><h2>Lessons for Other Implementations</h2><p><strong>Start with a hybrid approach from day one</strong></p><ul><li><p>Use AI for natural language interpretation and deterministic logic for business rules</p></li><li><p>Don't try to solve complex business logic with instructions alone</p></li><li><p><strong>Application:</strong> Any workflow requiring both flexibility and reliability</p></li></ul><p><strong>Data structure beats prompt engineering</strong></p><ul><li><p>Custom 
fields and metadata guide AI behavior more reliably than complex instructions</p></li><li><p>Invest in data modeling before building complex prompts</p></li><li><p><strong>Application</strong>: Any scenario where AI selects from large, structured datasets</p></li></ul><p><strong>In-Context Learning &gt; RAG for stable datasets</strong></p><ul><li><p>Embedding comprehensive data in prompts often outperforms real-time API calls</p></li><li><p>Works well when your dataset fits in context windows and doesn't change frequently</p></li><li><p><strong>Application</strong>: Product catalogs, policy documents, structured knowledge bases</p></li></ul>]]></content:encoded></item><item><title><![CDATA[Choosing the right AI coding tool]]></title><description><![CDATA[Can a non-developer write production code with AI?]]></description><link>https://www.aibuilders.blog/p/choosing-the-right-ai-coding-tool</link><guid isPermaLink="false">https://www.aibuilders.blog/p/choosing-the-right-ai-coding-tool</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Sat, 26 Jul 2025 03:44:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!v6Eu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!v6Eu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!v6Eu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!v6Eu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:697336,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!v6Eu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!v6Eu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8cf3e3bd-4e7b-42e8-b21b-51cb4f9d0598_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>As I&#8217;m exploring how to build agents and apply AI in practice, I&#8217;ve been wrestling with a fundamental problem:</p><p><strong>The many no-code and low-code tools, useful as they are, still have limitations today.</strong> </p><p>They don&#8217;t yet expose everything you need to have full control over your agent the way that code-based frameworks do. In many cases, that&#8217;s an acceptable trade-off. But in others, it creates real constraints and affects how stable and reliable your agents can be. </p><p>I&#8217;ve found my agents deliver solid output about 80% of the time. For the remainder, the output doesn&#8217;t meet quality standards or there are technical issues, like an agent that doesn&#8217;t return the expected result for the next step.</p><p>It&#8217;s very frustrating to hit this wall and not have full access to the control panels. My options are either to shrug and explain to stakeholders that this is as good as it gets, or to layer on automated error-handling. 
Neither option is particularly satisfying.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3_Ln!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3_Ln!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 424w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 848w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 1272w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3_Ln!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png" width="1456" height="695" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:695,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:235540,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3_Ln!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 424w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 848w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 1272w, https://substackcdn.com/image/fetch/$s_!3_Ln!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F46c1018d-92c8-459a-b6ee-eb8d0316c441_2486x1186.png 1456w" sizes="100vw"></picture></div></a><figcaption class="image-caption">This screenshot shows a log table from a research agent in Dust, orchestrated by Zapier. You can see rows where the research workflow failed because the Dust agent never submitted its work. Using a table allows us to recover from these errors with a safety-net workflow, but the initial lack of reliability is still problematic. </figcaption></figure></div><h2>My own limitation</h2><p>At the same time, I&#8217;m facing my own limitations around the tools I can use.</p><p><strong>I&#8217;m not a developer.</strong></p><p>I&#8217;ve been a solutions architect for nearly 15 years and have designed and delivered complex projects, many of them code-based. I&#8217;m fluent in the principles and patterns of building software.</p><p>But I don&#8217;t write code professionally. 
My focus has always been on writing specs, aligning with stakeholders, and working closely with an engineer to bring things to life.</p><p>So the question becomes: to reach enterprise-grade reliability, do I still need to either work with a developer or become one myself? Or can AI tools bridge that gap?</p><p>This post is my first step in answering that question: finding tools that give non-developers the right mixture of power and control. Whether that control translates to production-grade reliability is an ongoing experiment I&#8217;m running.</p><h2>The limits of &#8220;Vibe Coding&#8221;</h2><p>I began exploring whether AI could let me use code-based frameworks without actually coding.</p><p>After all, AI-assisted development (AKA &#8220;vibe coding&#8221;) is one of the most popular use cases right now.</p><p>However, I really don&#8217;t love that term. In my experience, strong solutions don&#8217;t come from vibes. They come from planning, analysis, and careful design. That doesn&#8217;t mean I&#8217;m against using AI to quickly iterate and prototype&#8212;I&#8217;ve built a few toy projects this way and had fun doing it.</p><p>But even within that limited scope, I ran into challenges that seem fairly universal for non-developers. These tools are great at giving you a start, but things break down as projects grow. The AI loses track of features and introduces bugs, and the whole thing starts to feel fragile. It can also make suboptimal design choices that I don&#8217;t have the technical fluency to notice until it&#8217;s too late.</p><h2>Surveying the landscape</h2><p>Despite these issues, I was still convinced that using AI to build AI-powered systems was the right path forward. I just needed to find a tool that would work for someone like me.</p><p>So I started surveying the landscape. I had already experimented hands-on with tools like Bolt, Lovable, Replit, and Cursor. 
I wanted to make sure I wasn&#8217;t missing anything, so I used Gemini&#8217;s excellent DeepResearch tool to create a report on the space, including newer entrants and open-source tools like Trae and Void.</p><p>You can explore that report as a website <a href="https://justinmnorris.github.io/ai-ide-deep-research/">here</a>. (Caveat: I have not verified the report contents.)</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://justinmnorris.github.io/ai-ide-deep-research/" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_S4Z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 424w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 848w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_S4Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png" width="1456" height="893" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:893,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:351372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://justinmnorris.github.io/ai-ide-deep-research/&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_S4Z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 424w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 848w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 1272w, https://substackcdn.com/image/fetch/$s_!_S4Z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ccb5c09-2d78-450a-9016-5a6f04c7a1e2_2168x1330.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Interactive report on AI coding tools, comparing solutions across multiple dimensions.</figcaption></figure></div><p>One tool emerged as particularly interesting to me&#8212;<strong><a href="https://kiro.dev/">Kiro</a> </strong>by AWS. As of this writing, it&#8217;s barely a week old. 
But it offered a different paradigm for collaborating with AI that I found intuitively familiar and appealing.</p><h2>Spec-driven development</h2><p>Kiro is designed to produce production-grade code using a <strong>spec-driven workflow</strong>.</p><p>Instead of building code through loose back-and-forth chats, Kiro encourages you to work with the agent to create a proper spec.</p><p>Each spec has three parts:</p><ul><li><p><strong>Requirements</strong>: Use cases written in EARS format that describe what each feature needs to do.</p></li><li><p><strong>Design</strong>: An architectural overview of how the solution will be built and how the components will fit together.</p></li><li><p><strong>Tasks</strong>: A list of implementation work to be done.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GM7V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GM7V!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 424w, https://substackcdn.com/image/fetch/$s_!GM7V!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 848w, https://substackcdn.com/image/fetch/$s_!GM7V!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 1272w, 
https://substackcdn.com/image/fetch/$s_!GM7V!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GM7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png" width="920" height="1224" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1224,&quot;width&quot;:920,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60926,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GM7V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 424w, https://substackcdn.com/image/fetch/$s_!GM7V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 848w, https://substackcdn.com/image/fetch/$s_!GM7V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 
1272w, https://substackcdn.com/image/fetch/$s_!GM7V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F640836c0-9e9f-4fbd-916b-b0038eb8c4c1_920x1224.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Flowchart of Kiro&#8217;s spec-driven workflow, from the <a href="https://kiro.dev/docs/specs/concepts/">Kiro Docs</a>.</figcaption></figure></div><p>This felt intuitive to me. As a solutions architect, I have always relied on specs to align with clients and internal teams on what needed to be built and how. 
So the idea of collaborating with an AI on a spec felt not only natural but also very comforting.</p><p>Specs also help address a major challenge in AI development: context drift. When everything is vibed, it&#8217;s easy to lose track of what the system is supposed to do over time. This is when the AI starts eating its own code and introducing regressions.</p><p>For a developer, this might be merely annoying, but it is absolutely fatal for a non-developer who lacks the skills to understand what&#8217;s broken, let alone fix it.</p><p>By contrast, specs ground you in a shared source of truth&#8212;a contract with the coding agent on how the system must work.</p><p>My hunch was that this workflow would provide me with the control I needed to keep a project stable and reliable over time.</p><h2>Getting started</h2><p>When I first opened Kiro, I felt the familiar discomfort that most non-developers probably experience when dropped into a clearly technical environment. The UI didn&#8217;t feel like it was made for me.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7_H5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7_H5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 424w, https://substackcdn.com/image/fetch/$s_!7_H5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 848w, 
https://substackcdn.com/image/fetch/$s_!7_H5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 1272w, https://substackcdn.com/image/fetch/$s_!7_H5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7_H5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png" width="2872" height="1618" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1618,&quot;width&quot;:2872,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:369989,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb3cb06f8-a5df-49c7-a868-337cb9725c39_2872x1618.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7_H5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 424w, 
https://substackcdn.com/image/fetch/$s_!7_H5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 848w, https://substackcdn.com/image/fetch/$s_!7_H5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 1272w, https://substackcdn.com/image/fetch/$s_!7_H5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7804b1ef-37f0-4f3d-a90a-045d257204ae_2872x1618.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Kiro&#8217;s UI</figcaption></figure></div><p>If you have a similar profile (technical, but not a coder), I&#8217;d encourage you to push through that feeling. It fades quickly.</p><p>I fed Kiro a high-level project idea and asked how we should get started. Kiro scaffolded out some initial files, asked a few clarifying questions, and we were off. I had a basic proof of concept in 30 minutes.</p><p>From there, I acted more like a product manager. I tested the output and gave feedback to improve the user experience, and Kiro updated files and added features. It was awesome.</p><h2>Cracks start to show</h2><p>In the back of my mind, though, I kept wondering when the spec-driven workflow would kick in.</p><p>Eventually I asked, and Kiro explained we were still early and should keep iterating in chat. I didn&#8217;t yet realize that Kiro has two modes:</p><ul><li><p>&#8220;Vibe&#8221; mode, which functions much like other agent-driven development tools where you iterate via chat.</p></li><li><p>&#8220;Spec&#8221; mode, which is the spec-driven workflow I had been reading about in the Kiro docs.</p></li></ul><p>I had unwittingly gotten started in Vibe mode, and this actually reproduced the instability I was trying to avoid. I eventually hit a limit where Kiro forced me to start a new session. It generated a summary briefing for itself to kick off the new session, but the agent still forgot certain details, and I had to re-explain things.</p><p>This really exposed the limitations of not having a clear spec.</p><p>Without that grounding, the AI had to reconstruct the intent from the code and the summary. My own fallible memory was the source of truth. 
</p><h2>Switching to spec mode</h2><p>Kiro was enthusiastic to keep building, but I asked it to stop and document what we had built so far, which luckily triggered a new &#8220;spec session&#8221; and got me into spec mode. </p><p>In a spec session, you can either write the spec yourself or iterate on it via chat. This moves the collaboration up to a higher level of abstraction&#8212;you work on requirements and architecture rather than diving directly into code. </p><p>Since we now had a lot of undocumented features, I asked Kiro to generate specs retroactively, and it was able to quickly reconstruct requirements and outline the architecture choices. </p><p>It felt like laying a secure foundation underneath a structure that had been floating. </p><p>Here&#8217;s a quick walkthrough of Kiro&#8217;s three-part spec workflow in action.</p><h3>Requirements</h3><p>The requirements tab of a Kiro spec describes <em>what the system should do </em>and contains detailed user stories and acceptance criteria written in <a href="https://alistairmavin.com/ears/">EARS format</a>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yz_p!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yz_p!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 424w, https://substackcdn.com/image/fetch/$s_!Yz_p!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 848w, 
https://substackcdn.com/image/fetch/$s_!Yz_p!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 1272w, https://substackcdn.com/image/fetch/$s_!Yz_p!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Yz_p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png" width="1120" height="1366" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1366,&quot;width&quot;:1120,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:330470,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yz_p!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 424w, 
https://substackcdn.com/image/fetch/$s_!Yz_p!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 848w, https://substackcdn.com/image/fetch/$s_!Yz_p!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 1272w, https://substackcdn.com/image/fetch/$s_!Yz_p!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6ad66ce-1c53-4e5c-be33-12e52ef81a17_1120x1366.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>For a non-technical user, this is the most important collaboration space. It creates a persistent source of truth and a much cleaner way to define requirements than iterative prompting. </p><h3>Design</h3><p>The design tab of the spec explains <em>how the system works</em>: the technical architecture and core system considerations. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fruJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fruJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 424w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 848w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 1272w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!fruJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png" width="1110" height="1512" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1512,&quot;width&quot;:1110,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:264255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fruJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 424w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 848w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 1272w, https://substackcdn.com/image/fetch/$s_!fruJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6abf309-4bc0-4b50-b4aa-9ee9c1972d9d_1110x1512.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m primarily relying on Kiro&#8217;s input here. However, it is a useful place to validate technical choices before they are implemented. </p><p>As we started building some more advanced features, I would feed the high-level plan to another LLM to get a second opinion. This produced a valuable dialogue and helped refine the design. </p><h3>Tasks</h3><p>The last tab breaks down the build into specific implementation tasks. It describes <em>how the system will be built</em>. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CLaC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CLaC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 424w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 848w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 1272w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CLaC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png" width="1142" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1142,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:341551,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168943063?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CLaC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 424w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 848w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 1272w, https://substackcdn.com/image/fetch/$s_!CLaC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd7bb8-ffe5-4a56-86cc-b4740120274b_1142x1456.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Kiro&#8217;s task list</figcaption></figure></div><p>I mostly defer to the agent on this, but the explicit task list keeps work aligned across sessions and lets you review the plan before building starts.</p><p>In contrast, when I tried feeding the same high-level project idea into Cursor, the agent dove straight into coding and created a relatively complex project with a few dozen files. It was impressive, but also felt much less aligned and risked charging off in the wrong direction. </p><h2>Reflections</h2><p>Vibe coding really is intoxicating. It gives you technical superpowers, like wearing an Iron Man suit.</p><p>Features go from idea to implementation in minutes. Ambitious ideas that would otherwise take weeks or be dismissed as unrealistic are suddenly achievable. 
You&#8217;re working with the collective knowledge of a thousand engineers.</p><p>It&#8217;s thrilling. But it can also make you lazy.</p><p>It&#8217;s easy to sit back and let the AI take over. And I think that&#8217;s when the risk creeps in&#8212;when you stop thinking deeply about the architecture, requirements, and how it all fits together. That&#8217;s how fragile systems are built.</p><p><strong>The agent will build whatever you want, but only the human can make sure the output is truly valuable.</strong></p><p>Also, for me, it&#8217;s not enough to talk to a black box. I want to understand what&#8217;s happening. I want to watch a developer at work and learn from the decisions being made.</p><p>So I forced myself to stay engaged. If Kiro made a decision I didn&#8217;t fully understand, I asked for a second opinion. I challenged it on whether our approaches were optimal and scalable. In many cases, this surfaced real improvements.</p><h3><strong>What Comes Next</strong></h3><p>Even with these caveats, I can&#8217;t imagine going back.</p><p>It now feels viable to use code frameworks to build real internal tools without needing a full development team&#8212;my mind is buzzing with ideas of where I could use this approach to tackle new challenges or improve existing solutions. </p><p>And in a follow-up post, I&#8217;ll walk through my first Kiro project in detail and what I&#8217;m learning from it. 
</p>]]></content:encoded></item><item><title><![CDATA[Real-world agents: person-to-account matching]]></title><description><![CDATA[Can AI agents solve one of the messiest CRM problems?]]></description><link>https://www.aibuilders.blog/p/ai-agent-experiment-crm-person-account-matching</link><guid isPermaLink="false">https://www.aibuilders.blog/p/ai-agent-experiment-crm-person-account-matching</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Tue, 15 Jul 2025 01:42:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!vFsR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vFsR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vFsR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vFsR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!vFsR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!vFsR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vFsR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:650330,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vFsR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!vFsR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!vFsR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!vFsR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6452009a-1bd2-47d7-8beb-dabc240f324a_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s a fun pop quiz&#8212;see if you can spot the likely problem area in this outbound workflow I&#8217;ve been 
building. </p><ol><li><p>Using the Cognism API, search at scheduled intervals for contacts matching a particular segment definition. </p></li><li><p>Check if the contact and account already exist in Salesforce, and if not, create them.</p></li><li><p>Add the contact to a Salesforce campaign for further follow-up.</p></li></ol><p>Keen observers will have homed in on the <strong>account check</strong> as the stickiest part of this workflow. </p><p>How do we check whether the account already exists? Based on the name? The website? </p><p>Keep in mind that even if your CRM data is perfect (&#128514;), the real-world topography is extremely complicated. Companies often have:</p><ul><li><p>multiple office locations</p></li><li><p>multiple valid website domains</p></li><li><p>variations in their names</p></li><li><p>complex corporate relationships and hierarchies</p></li><li><p>related entities in different countries </p></li><li><p>etc. </p></li></ul><p>Seemingly simple questions (&#8220;Does this account exist in our CRM?&#8221; or &#8220;Which of these possible account matches is the <em>best</em> match for a given contact?&#8221;) are surprisingly hard for rules-based systems to answer well.</p><p>However, most human workers could answer these questions easily with a bit of training and common sense. </p><p>This makes it a perfect agentic use case. </p><h2>Why an agent makes sense here</h2><p>You could simply search by domain to find an account, but that risks false negatives if your search parameters are too narrow, and picking the wrong record when several accounts match.</p><p>An agent, on the other hand, offers several benefits. </p><h3>Intelligent queries </h3><p>An agent can intelligently decompose the company name and website into components and search for possible variants. </p><p>For example, let&#8217;s say Cognism provides the following data about a company:</p><pre><code><code>{

  "companyName": "Acme Insurance",
  "website": "acmeinsurance.com"

}</code></code></pre><p>Now, accounts in Salesforce CRM might use any of the following variants: </p><p><strong>Company names:</strong></p><ul><li><p>Acme Insurance Co.</p></li><li><p>The Acme Insurance Co. </p></li><li><p>Acme Mutual</p></li><li><p>The Acme Insurance Company </p></li><li><p>etc. </p></li></ul><p><strong>Websites:</strong></p><ul><li><p>acmeinsurance.ca</p></li><li><p>acmeinsurance.co.uk</p></li><li><p>acme.com/insurance</p></li><li><p>acme.fr</p></li><li><p>etc. </p></li></ul><p>An exact-match search would miss all of them, and even legacy data hygiene tools with &#8220;fuzzy matching&#8221; capabilities are quite limited in what they can do. </p><p>But an LLM has the semantic intelligence to separate the brand keyword from the generic category term, as a human would, and build a broader set of SOQL filter conditions: </p><pre><code><code>(Name LIKE '%Acme Insurance%' OR Name LIKE '%Acme%') OR (Website LIKE '%acmeinsurance.com%' OR Website LIKE '%acme.com%' OR Website LIKE '%acme.%')</code></code></pre><h3>Evaluating matches</h3><p>The query above casts a wider net, which is good, but it&#8217;s also likely to catch false positives and risks matching a contact to the wrong account. </p><p>Unlike rigid rules-based systems, the agent can evaluate the set of candidates <strong>holistically</strong>, ruling out false positives and, when several viable options remain, determining which match is best. </p><p>The agent considers things like: </p><ul><li><p>Similarity of company name and website to the Cognism-provided data</p></li><li><p>Proximity of the company billing address to the contact&#8217;s location </p></li><li><p>Account hierarchy relationships</p></li><li><p>etc. </p></li></ul><h3>Supplementary web research</h3><p>CRM data can be ambiguous, but a quick visit to the company website is often enough to resolve the ambiguity. For example, it can quickly reveal that a legal entity&#8217;s name differs from its brand name. 
</p><p>An agent can use a web browse tool and gain this additional context, just like a human would. </p><h2>AI thrives on &#8220;adaptive&#8221; work</h2><p>This account matching problem is a perfect example of what I call <a href="https://revopsfm.substack.com/p/ai-agents-beyond-the-hype">&#8220;adaptive work&#8221;</a>.</p><p>These are tasks that require just enough discretion and judgement that you can&#8217;t fully automate them with rules-based workflows. </p><p>At the same time, it&#8217;s not work that requires much creativity, original thinking, or unique insight. </p><p>It&#8217;s just the unglamorous hygiene work that keeps things moving.  </p><p>This is a sweet spot for agentic AI. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UWPF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UWPF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 424w, https://substackcdn.com/image/fetch/$s_!UWPF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 848w, https://substackcdn.com/image/fetch/$s_!UWPF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UWPF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UWPF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png" width="1456" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129316,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UWPF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 424w, https://substackcdn.com/image/fetch/$s_!UWPF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 848w, 
https://substackcdn.com/image/fetch/$s_!UWPF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 1272w, https://substackcdn.com/image/fetch/$s_!UWPF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b1b8566-1fec-47a8-a315-172e7a1cb7a5_1456x900.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Revenue operations (my day-to-day field) is flush with examples of adaptive work, and CRM account hygiene has long been the bane of my 
existence. </p><p>So I was excited to take a crack at solving this with an agent. </p><h2>Workflow structure</h2><p>For context, here&#8217;s the layout of the workflow in Zapier, with some intermediate data prep steps cropped out. </p><p>This is a &#8220;worker&#8221; Zap that fetches a search result record from a queue (Zapier table), redeems the full data from Cognism, then performs the Salesforce check/create steps. </p><p>Step 15 is where we call the account-matching agent. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nT1A!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nT1A!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 424w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 848w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 1272w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!nT1A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png" width="858" height="1287" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1287,&quot;width&quot;:858,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:289516,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nT1A!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 424w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 848w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 1272w, https://substackcdn.com/image/fetch/$s_!nT1A!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0dc11ab0-567e-4a7a-a04b-2c9064ce42f6_858x1287.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>You&#8217;ll notice that this is not a full &#8220;agentic&#8221; solution but an agent step scoped to a very specific task (find the right account) inside a mostly deterministic workflow. </p><p>This maximizes predictability and reliability by using rules-based automation where we can and invoking the creativity / flexibility of an LLM only where we need it. </p><h2>Agent setup</h2><p>I&#8217;m using a native Zapier agent here. 
The Agents module of the Zapier platform is relatively new, but I chose it for this use case because: </p><ul><li><p>It&#8217;s tightly integrated with Zapier and handles synchronous agent processing within a single Zap (rather than invoking a third-party agent and needing a separate Zap to receive the results)</p></li><li><p>It already has access to our CRM via the Zapier Salesforce connection </p></li><li><p>It supports structured outputs that can be easily re-used later in the Zap</p></li><li><p>I wanted to try it out! </p></li></ul><p>So despite some limitations, it makes sense for this use case. </p><p>It&#8217;s also quite easy to set up and offers a handy inline pill UX for specifying tool use. </p><p>Here&#8217;s a visual of the configuration: </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nToj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nToj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 424w, https://substackcdn.com/image/fetch/$s_!nToj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 848w, https://substackcdn.com/image/fetch/$s_!nToj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 1272w, 
https://substackcdn.com/image/fetch/$s_!nToj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nToj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png" width="884" height="849" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cf57f546-1025-452e-bb17-3cc371079fad_884x849.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:849,&quot;width&quot;:884,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:150254,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!nToj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 424w, https://substackcdn.com/image/fetch/$s_!nToj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 848w, 
https://substackcdn.com/image/fetch/$s_!nToj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 1272w, https://substackcdn.com/image/fetch/$s_!nToj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcf57f546-1025-452e-bb17-3cc371079fad_884x849.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And here&#8217;s the full prompt: </p><pre><code>You are an expert CRM operations professional who excels at analyzing CRM data and 
assessing questions of data hygiene.

You will be provided with a JSON file containing details about a person and the company they belong to. Company details are in the "account" section.

Given this input, follow the process below.

## STEP 1: PREPARE SEARCH TERMS

Prepare to search for the account by generating a precise set of search terms based on the input company's name and domain.

Follow this exact process:

1. Identify company name keywords: For example, from the company name (e.g., "Acme Insurance"), identify the primary brand name ("Acme") and the full name ("Acme Insurance").

2. Identify domain strings: From the domain (e.g., acmeinsurance.com), identify the full domain and the "root brand keyword plus dot" variant. The root brand keyword is the primary brand name. For example, from "acme.fr", the root keyword is "acme" and the variant is "acme.".

3. Build the query: Construct a Salesforce SOQL WHERE clause using ONLY the terms generated above. Combine name and website searches with OR.

* Good Example: For "Acme Insurance" with domain acmeinsurance.com, the correct WHERE clause would be

(Name LIKE '%Acme Insurance%' OR Name LIKE '%Acme%') OR (Website LIKE '%acmeinsurance.com%' OR Website LIKE '%acme.%')

* Do not invent acronyms or other strings.

* Do not search for single, generic words (like '%insurance%'), as this is too broad.

## STEP 2: EXECUTE SALESFORCE SEARCHES

Search Salesforce accounts using the SOQL WHERE clause constructed in the previous step along with the tool [Salesforce: Find Records by Query] to find potential matches.

## STEP 3: ANALYZE MATCHES

Analyze all potential account matches found in Salesforce, considering factors such as billing address, company name variations, parent/child account relationships, and other relevant data.

## STEP 4: FORMULATE A HYPOTHESIS

Formulate a hypothesis regarding the likelihood of a match or the ambiguity of the data.

If there are multiple potential matches, consider factors such as alignment between the contact's location and the account billing address, or other similar details.

Don't apply rigid rules, but use human-like intuition and common sense to determine which account they should be matched with.

If the evidence provides high confidence (95%+) for a MATCH or NO_MATCH, proceed directly to ## STEP 7: FINAL OUTPUT.

If the evidence is ambiguous or insufficient, your goal is to gather more data. Proceed to Step 5.

## STEP 5: SEARCH FOR ADDITIONAL DATA (IF NEEDED)

If the data is ambiguous or insufficient to make a high-confidence decision (at least 95% confidence for a match), perform additional searches to gain more clarity.

- If there are multiple matches that seem equally valid: check in Salesforce which accounts have parent/child relationships in order to determine the most likely match.

- If there are no matches at all: visit the company's website using [Visit website] to gather additional information. Focus on the text in the 'About Us', 'Contact', or 'Legal' sections of the website. Look for phrases like 'is a subsidiary of', 'part of the... family of companies', or email or copyright notices that mention a different corporate name.

## STEP 6: RE-EVALUATE

After gathering new data, return to Step 4 to re-evaluate your hypothesis with the new evidence.

## STEP 7: FINAL OUTPUT

Your single and only final output for this entire task is the set of fields below. Do not write any other text, narration, or explanation. You will conclude all work by providing the field output and nothing else.

### REQUIRED FIELDS

- decision: MATCH | NO_MATCH | NEEDS_REVIEW
- match_account_id: string | null
- confidence_score: number
- reasoning: The best match is [company name] because [a concise justification for your decision].

### EXAMPLE: SUCCESSFUL OUTPUT

- decision: MATCH
- match_account_id: 00123000012A3bYAAM
- confidence_score: 99
- reasoning: The best match is Acme Insurance because the account's website field contains a domain that matches the email address of the contact. The contact is based in the same state as the account's billing address.</code></pre><h2>Results</h2><p>Overall, the output has been strong so far, which is very encouraging. </p><ul><li><p>All matches have been justifiable</p></li><li><p>It has prioritized multiple potential matches in a satisfying way</p></li><li><p>I&#8217;m not aware of any false negatives (where it failed to find a suitable match that a human would have found with reasonable effort)</p></li></ul><p>Here are some notable examples. </p><h3>Strong prioritization &#9989; </h3><p>In this test search for Fidelity International, the agent identified a broad group of matches based on the brand name, eliminated improbable ones, and chose the best of the valid options based on the contact&#8217;s proximity to the account&#8217;s billing address. </p><p>You can see here a nice thing about Zapier&#8217;s agent interface: the ability to chat with the agent responsible for a specific run to ask follow-up questions and get it to explain its thinking or prior steps. I&#8217;ve found this sort of dialogue critical to <a href="https://www.aibuilders.blog/p/when-in-doubt-ask-the-model">improving agent reliability</a>. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JJUp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JJUp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 424w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 848w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 1272w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JJUp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png" width="993" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:993,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:124627,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JJUp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 424w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 848w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 1272w, https://substackcdn.com/image/fetch/$s_!JJUp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2634dd66-1e80-47f6-8622-6e5d4dea4817_993x720.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Using the web to resolve ambiguity &#9989; </h3><p>In this example, the agent failed to find any matches in our CRM for &#8220;PetSafe Brands&#8221; and used web search to find an alternate corporate name, &#8220;Radio Systems Corporation,&#8221; which DID match a CRM account. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Udvt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Udvt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 424w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 848w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 1272w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Udvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png" width="995" height="485" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:485,&quot;width&quot;:995,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:81841,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Udvt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 424w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 848w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 1272w, https://substackcdn.com/image/fetch/$s_!Udvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5dfb787-ade5-4b2f-9b34-bc7da36f40af_995x485.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My heart sank when I first saw this result, as it seemed like a case of hallucination. </p><p>But sure enough, when I visited the website, I saw &#8220;Radio Systems Corporation&#8221; in the footer. </p><p>The agent nailed it. </p><h2>Issues and Potential Improvements </h2><p>I want to be sure I also highlight issues and potential improvements when discussing things I&#8217;ve built. </p><p>If I didn&#8217;t, it would perpetuate the AI hype mirage and could be discouraging for anyone else who tries a similar approach and runs into problems. </p><h3>Technical unreliability</h3><h4>Issue: </h4><p>About 10-20% of the time, the agent simply got &#8220;stuck&#8221; during its SOQL searches and timed out. I believe this is an underlying issue in the Zapier control flow, not an intrinsic limit of agents in general. 
</p><p>Unfortunately, in these cases the logs do not preserve the original query, making it difficult to diagnose what happened. </p><p>When I asked the agent directly, it simply referred to vague &#8220;technical errors with Salesforce&#8221; but could not provide any further insight. </p><p>My suspicion is that the SOQL query was <strong>too </strong>broad. </p><h4>Possible solution: </h4><p>We could decompose the task even further:</p><ul><li><p>An AI step prepares the SOQL query based on the company name and domain</p></li><li><p>A standard Zap action performs the SOQL query</p></li><li><p>The results are passed to the agent along with the initial context</p></li><li><p>The agent evaluates the results and runs additional SOQL queries only if needed</p></li></ul><p>This would avoid the black-box timeout error (any SOQL error would at least be visible in the Zap step) and give us more control. </p><p>We could also add a retry mechanism that scans the queue for records stuck in &#8220;processing&#8221; and either retries the account match or falls back to a simpler matching mechanism based on domain only. </p><h3>Hallucinated rationale</h3><h4>Issue: </h4><p>Although the agent has so far selected the most reasonable account, it&#8217;s shown a tendency to stretch the truth to make matches seem stronger than they are. </p><p>In one example, it stated that &#8220;the billing address is in the same country as the contact&#8217;s location&#8221; when this was incorrect. </p><p>When I questioned the agent about the discrepancy, it acknowledged the error and seemed to suggest it was influenced by the example in the prompt. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JkBZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JkBZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 424w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 848w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 1272w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JkBZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png" width="994" height="331" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:331,&quot;width&quot;:994,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:68876,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/168252784?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JkBZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 424w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 848w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 1272w, https://substackcdn.com/image/fetch/$s_!JkBZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7e4212be-07fc-4b5f-ae2d-381afbe4234b_994x331.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>Possible solution: </h4><p>Unsure how to fully mitigate this one, although strengthening the guard-rails in the prompt and stressing that the example is merely illustrative and not a template to follow could help. </p><h3>Future Improvements</h3><p>I think this shows great promise, and I&#8217;d love to keep developing the concept into a full-fledged account hygiene solution that supports de-duplication, parent-child account matches, M&amp;A / bankruptcy hygiene, and more.</p><p>This could be run on-demand or via scheduled batch.  </p><p>One feature I tried but removed for simplicity in the V1 is ZoomInfo tool integration, which would provide the agent with a structured source of truth for corporate hierarchies, M&amp;A events, and more.  
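</p><p>As a starting point for the decomposition suggested under &#8220;Technical unreliability,&#8221; here is a minimal sketch of how the WHERE clause from Step 1 of the prompt might be built deterministically before the agent runs. This is not something I have running in the workflow: the function names, the generic-word list, and the choice to take the brand from the first non-generic word of the company name are all assumptions made for illustration.</p>

```python
import re

# Hypothetical stop list: generic words that should never be searched alone.
GENERIC_WORDS = {"insurance", "group", "inc", "llc", "ltd", "corp", "company"}


def escape_like(term):
    """Escape characters that are special inside a SOQL LIKE string literal."""
    # Backslash first, then the quote, then the LIKE wildcards % and _.
    for ch in ("\\", "'", "%", "_"):
        term = term.replace(ch, "\\" + ch)
    return term


def build_where_clause(company_name, domain):
    """Build the name/website WHERE clause described in Step 1 of the prompt."""
    name = company_name.strip()
    domain = domain.strip().lower()
    words = re.findall(r"[A-Za-z0-9]+", name)
    # Assumption: the primary brand is the first word that is not generic.
    brand = next((w for w in words if w.lower() not in GENERIC_WORDS), None)

    # Full name always; brand keyword only when it differs from the full name.
    name_terms = [name]
    if brand and brand.lower() != name.lower():
        name_terms.append(brand)

    # Full domain plus the "root brand keyword plus dot" variant.
    site_terms = [domain]
    if brand:
        site_terms.append(brand.lower() + ".")

    name_part = " OR ".join(f"Name LIKE '%{escape_like(t)}%'" for t in name_terms)
    site_part = " OR ".join(f"Website LIKE '%{escape_like(t)}%'" for t in site_terms)
    return f"({name_part}) OR ({site_part})"


print(build_where_clause("Acme Insurance", "acmeinsurance.com"))
```

<p>For &#8220;Acme Insurance&#8221; and acmeinsurance.com, this produces the same clause as the good example in the prompt. A standard Zap action could then run the query, so any SOQL error would show up in that step rather than inside the agent.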
</p>]]></content:encoded></item><item><title><![CDATA[Three things worth reading]]></title><description><![CDATA[Managing agent memory / building reliable AI applications / justifying AI project ROI]]></description><link>https://www.aibuilders.blog/p/three-things-worth-reading</link><guid isPermaLink="false">https://www.aibuilders.blog/p/three-things-worth-reading</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Fri, 11 Jul 2025 16:17:50 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QyrJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QyrJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QyrJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QyrJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QyrJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!QyrJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QyrJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:520749,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167998360?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QyrJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QyrJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!QyrJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QyrJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F44680ddd-95ab-4200-aa0e-ff3e0dde72ae_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><em>When I&#8217;m trying to master a new domain, I focus on finding people who really know their stuff and absorbing what 
they&#8217;ve learned. And while I mainly want to feature hands-on learnings here, I sometimes come across resources that are high-leverage and worth sharing.</em></p><p><em>So I&#8217;m passing them along, with a bit of commentary. I&#8217;ll keep it tight. Let me know if this is useful; perhaps it becomes a Friday habit.</em></p><h2>Memory Optimization Strategies in AI Agents</h2><p>A big part of the magic of interacting with LLMs isn&#8217;t just their reasoning. It&#8217;s their ability to work <em>in context</em>, offering replies that take into account a history of past interactions. It&#8217;s the difference between a logic machine with an encyclopedia and something that feels truly intelligent and aware. </p><p>Memory is complex and technical, but in this article <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Nir Diamant&quot;,&quot;id&quot;:186750220,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6cacafa3-9e4a-49d6-afde-2239493e73a3_6960x4640.jpeg&quot;,&quot;uuid&quot;:&quot;a78b776b-9ce4-437c-8c7c-fe2492708d7e&quot;}" data-component-name="MentionToDOM"></span> makes the concepts easy to grasp. 
I&#8217;m getting ideas for how to design more useful memory in my own applications.</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:167069938,&quot;url&quot;:&quot;https://diamantai.substack.com/p/memory-optimization-strategies-in&quot;,&quot;publication_id&quot;:3009345,&quot;publication_name&quot;:&quot;&#128142;DiamantAI&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!72Rv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bf24d0-f0ec-49fc-8e8f-800eec27706d_1280x1280.png&quot;,&quot;title&quot;:&quot;Memory Optimization Strategies in AI Agents &quot;,&quot;truncated_body_text&quot;:&quot;Before jumping into today&#8217;s blog post, I want to share a quick update:&quot;,&quot;date&quot;:&quot;2025-06-29T12:02:17.618Z&quot;,&quot;like_count&quot;:48,&quot;comment_count&quot;:4,&quot;bylines&quot;:[{&quot;id&quot;:186750220,&quot;name&quot;:&quot;Nir Diamant&quot;,&quot;handle&quot;:&quot;nirdiamant&quot;,&quot;previous_name&quot;:&quot;DiamantAI&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6cacafa3-9e4a-49d6-afde-2239493e73a3_6960x4640.jpeg&quot;,&quot;bio&quot;:&quot;Creating the world's biggest practical GenAI knowledge hub&quot;,&quot;profile_set_up_at&quot;:&quot;2024-09-12T12:43:45.559Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-09-12T13:23:45.074Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:3061656,&quot;user_id&quot;:186750220,&quot;publication_id&quot;:3009345,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:3009345,&quot;name&quot;:&quot;&#128142;DiamantAI&quot;,&quot;subdomain&quot;:&quot;diamantai&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;DiamantAI is the top 0.1% newsletter for staying ahead in AI, uncovering the latest techniques, 
breakthroughs, insights, and unique tutorials.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84bf24d0-f0ec-49fc-8e8f-800eec27706d_1280x1280.png&quot;,&quot;author_id&quot;:186750220,&quot;primary_user_id&quot;:186750220,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2024-09-12T12:43:55.457Z&quot;,&quot;email_from_name&quot;:&quot;&#128142;DiamantAI &quot;,&quot;copyright&quot;:&quot;DiamantAI&quot;,&quot;founding_plan_name&quot;:null,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;disabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:false,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" href="https://diamantai.substack.com/p/memory-optimization-strategies-in?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!72Rv!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bf24d0-f0ec-49fc-8e8f-800eec27706d_1280x1280.png"><span class="embedded-post-publication-name">&#128142;DiamantAI</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">Memory Optimization Strategies in AI Agents </div></div><div class="embedded-post-body">Before jumping into today&#8217;s blog post, I want to share a quick update&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">9 months ago &#183; 48 likes &#183; 4 comments 
&#183; Nir Diamant</div></a></div><h2>12-Factor Agents: Patterns of reliable LLM applications</h2><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Dex Horthy&quot;,&quot;id&quot;:272723397,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82a3451b-a2cc-4f4e-8f10-295b913857d1_1752x1642.png&quot;,&quot;uuid&quot;:&quot;dd543d73-df23-4374-ab63-7af1be9f7578&quot;}" data-component-name="MentionToDOM"></span>&#8217;s essay and presentation on <a href="https://github.com/humanlayer/12-factor-agents">12-Factor Agents</a> tackle a question that&#8217;s been troubling me since I started working on AI projects: </p><blockquote><p>What are the principles we can use to build LLM-powered software that is actually good enough to put in the hands of production customers?</p></blockquote><p>At its core, it&#8217;s about how to design for reliability beyond the 80% threshold&#8212;something most AI applications still struggle with.</p><p>He approaches this from a developer&#8217;s perspective, but I&#8217;ve run into many of the same challenges while building with no-code/low-code (NCLC) platforms. It was actually a real relief to see that these are universal domain problems. </p><p>I think the design patterns and practices he recommends apply to internal systems as well, although NCLC tools don&#8217;t typically offer the level of control needed to implement them yet. 
</p><div id="youtube2-8kMaTybvDUw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;8kMaTybvDUw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/8kMaTybvDUw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>How to Apply a Scenario Validation Framework for AI Agent ROI</h2><p>There&#8217;s a fair bit of doom-and-gloom right now about the ROI of agentic AI. For example, <a href="https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027?_bhlid=716e9ecf5301e015bd09e774531793c22f57bfad">Gartner estimates</a> that over 40% of agentic AI projects will be canceled by the end of 2027.</p><p><span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Chris Tyson&quot;,&quot;id&quot;:285271674,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b4c29ca-09e5-4b24-85e5-3373649917f6_393x393.jpeg&quot;,&quot;uuid&quot;:&quot;cc17aba6-5562-46d6-8022-c57a5f738e39&quot;}" data-component-name="MentionToDOM"></span> gives a masterclass on how to provide reasonable and persuasive estimates for project ROI when pitching AI projects. It&#8217;s equally relevant for internal project leaders or consultants selling services. </p><p>The key insight is that acknowledging uncertainty and building multiple scenarios actually provides MORE confidence, not less. 
</p><div class="embedded-post-wrap" data-attrs="{&quot;id&quot;:167983979,&quot;url&quot;:&quot;https://theagentarchitect.substack.com/p/ai-agent-roi-scenario-framework-executive-approval&quot;,&quot;publication_id&quot;:3326039,&quot;publication_name&quot;:&quot;The AI Agent Architect&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!Toje!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F734b4d56-db01-4aa3-bdb5-e57f16834c17_600x600.png&quot;,&quot;title&quot;:&quot;How to Apply a Scenario Validation Framework for AI Agent ROI&quot;,&quot;truncated_body_text&quot;:&quot;Here&#8217;s what you don&#8217;t want happening if you&#8217;re pitching an AI Agent project to your boss&#8230;&quot;,&quot;date&quot;:&quot;2025-07-11T05:00:59.365Z&quot;,&quot;like_count&quot;:2,&quot;comment_count&quot;:1,&quot;bylines&quot;:[{&quot;id&quot;:285271674,&quot;name&quot;:&quot;Chris Tyson&quot;,&quot;handle&quot;:&quot;christyson&quot;,&quot;previous_name&quot;:&quot;The AI Agent Architect&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b4c29ca-09e5-4b24-85e5-3373649917f6_393x393.jpeg&quot;,&quot;bio&quot;:&quot;Templonix.com founder. AI agent framework builder | Writing about what I see delivering Agentic solutions for businesses | Helping you to upskill - without becoming a coder. 
Veteran @ AWS | MBA @ Manchester Business School.&quot;,&quot;profile_set_up_at&quot;:&quot;2024-11-10T10:55:02.504Z&quot;,&quot;reader_installed_at&quot;:&quot;2024-11-25T20:07:52.139Z&quot;,&quot;publicationUsers&quot;:[{&quot;id&quot;:3388315,&quot;user_id&quot;:285271674,&quot;publication_id&quot;:3326039,&quot;role&quot;:&quot;admin&quot;,&quot;public&quot;:true,&quot;is_primary&quot;:true,&quot;publication&quot;:{&quot;id&quot;:3326039,&quot;name&quot;:&quot;The AI Agent Architect&quot;,&quot;subdomain&quot;:&quot;theagentarchitect&quot;,&quot;custom_domain&quot;:null,&quot;custom_domain_optional&quot;:false,&quot;hero_text&quot;:&quot;This newsletter helps professionals bridge the gap between technical AI and real-world impact, without needing to code. Learn to lead smarter conversations, sell strategic solutions, and secure high-impact roles in the new AI economy.&quot;,&quot;logo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/734b4d56-db01-4aa3-bdb5-e57f16834c17_600x600.png&quot;,&quot;author_id&quot;:285271674,&quot;primary_user_id&quot;:285271674,&quot;theme_var_background_pop&quot;:&quot;#FF6719&quot;,&quot;created_at&quot;:&quot;2024-11-10T10:55:10.764Z&quot;,&quot;email_from_name&quot;:&quot;The AI Agent Architect&quot;,&quot;copyright&quot;:&quot;Chris Tyson&quot;,&quot;founding_plan_name&quot;:&quot;VIP Member&quot;,&quot;community_enabled&quot;:true,&quot;invite_only&quot;:false,&quot;payments_state&quot;:&quot;enabled&quot;,&quot;language&quot;:null,&quot;explicit&quot;:false,&quot;homepage_type&quot;:&quot;newspaper&quot;,&quot;is_personal_mode&quot;:false}}],&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;utm_campaign&quot;:null,&quot;belowTheFold&quot;:true,&quot;type&quot;:&quot;newsletter&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPostToDOM"><a class="embedded-post" native="true" 
href="https://theagentarchitect.substack.com/p/ai-agent-roi-scenario-framework-executive-approval?utm_source=substack&amp;utm_campaign=post_embed&amp;utm_medium=web"><div class="embedded-post-header"><img class="embedded-post-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Toje!,w_56,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F734b4d56-db01-4aa3-bdb5-e57f16834c17_600x600.png" loading="lazy"><span class="embedded-post-publication-name">The AI Agent Architect</span></div><div class="embedded-post-title-wrapper"><div class="embedded-post-title">How to Apply a Scenario Validation Framework for AI Agent ROI</div></div><div class="embedded-post-body">Here&#8217;s what you don&#8217;t want happening if you&#8217;re pitching an AI Agent project to your boss&#8230;</div><div class="embedded-post-cta-wrapper"><span class="embedded-post-cta">Read more</span></div><div class="embedded-post-meta">9 months ago &#183; 2 likes &#183; 1 comment &#183; Chris Tyson</div></a></div><div><hr></div><p>I&#8217;d love to know what you learned from this week. 
Just reply or drop a note.</p>]]></content:encoded></item><item><title><![CDATA[Documenting with AI (part 2): data flow diagrams]]></title><description><![CDATA[How to generate clean and maintainable diagrams in minutes with Mermaid]]></description><link>https://www.aibuilders.blog/p/documenting-with-ai-part-2-data-flow</link><guid isPermaLink="false">https://www.aibuilders.blog/p/documenting-with-ai-part-2-data-flow</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Sun, 06 Jul 2025 00:51:08 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/782d7004-e223-4326-b7d3-c8f9261cfce6_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!13XK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!13XK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!13XK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!13XK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!13XK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!13XK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:570467,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167203545?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!13XK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!13XK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!13XK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!13XK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F020653b0-3c34-48f6-a96a-1a739665b53a_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Diagrams are the mark of a pro. 
They give that crisp, high-level overview of a project that anyone can understand.</p><p>Especially today&#8212;as more teams lean in to &#8220;build&#8221; over &#8220;buy&#8221; and systems and stacks grow ever more complex&#8212;there&#8217;s a critical need for a shared mental model, something anyone can grok at a glance. </p><p>Any systems person will know the pain of rooting around for answers that should&#8217;ve been obvious (&#8220;what updates this field?!&#8221;). In the worst case, a system becomes so opaque that people give up on it entirely and rip it out.</p><h2>But diagramming takes time&#8230;</h2><p>So quite often, diagrams don&#8217;t get done.</p><p>And even if diagrams DO get made, they can be hard to maintain and can easily go stale because: </p><ul><li><p>source files get lost</p></li><li><p>creating crisp diagrams is a discrete skill, which not everyone has in equal measure</p></li><li><p>tools like <a href="https://www.lucidchart.com/">Lucidchart</a> are great but can be time-consuming and fiddly if a complex diagram needs to be updated</p></li></ul><h2>Enter Mermaid</h2><p>Recently, I discovered the potential of <a href="https://mermaid.js.org/">Mermaid</a>, an open-source, JavaScript-based diagramming tool. </p><p>It lets you create diagrams using <strong>code</strong>, which makes them easier to edit and maintain. Instead of hunting down the original diagram file or redrawing everything, you can just tweak the code and regenerate the chart.</p><p>Here&#8217;s an example Mermaid snippet: </p><pre><code>graph TD
  A[Start] --&gt; B{Decision}
  B --&gt;|Yes| C[Continue]
  B --&gt;|No| D[Stop]</code></pre><p>And here&#8217;s the resulting diagram: </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pY-E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pY-E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 424w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 848w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 1272w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pY-E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png" width="328" height="506.4175824175824" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2248,&quot;width&quot;:1456,&quot;resizeWidth&quot;:328,&quot;bytes&quot;:157207,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167203545?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pY-E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 424w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 848w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 1272w, https://substackcdn.com/image/fetch/$s_!pY-E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74c89f1f-848d-43f5-97f2-06808b71682a_2487x3840.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That said, this approach introduces a new barrier: now you have to learn the syntax of Mermaid. And unless you&#8217;re creating multiple diagrams a week, you probably ain&#8217;t got time or motivation for that. </p><p>But AI takes that friction away too.</p><h2>Generating Mermaid diagrams with AI</h2><p>Mermaid actually includes an AI feature where you can describe your diagram in natural language, and it&#8217;ll generate the code for you. 
That&#8217;s a big step forward, but it still requires <strong>you</strong> to describe what you want.</p><p>The next evolution is for the LLM to both write the diagram code <strong>AND </strong>define a first draft of what the diagram should be <strong>based on its own understanding of your project</strong>.</p><p>If you're using <a href="https://openai.com/blog/projects">ChatGPT Projects</a>, as I described in <a href="https://www.aibuilders.blog/p/documenting-with-ai-part-1">this post</a>, then you&#8217;re already halfway there. Once the LLM has synthesized your documentation and built a clear written representation of the project, it has everything it needs to generate a <strong>visual</strong> synthesis, too.</p><p>Then it can write the code to generate a Mermaid diagram automatically.</p><p>This took me about five minutes using Gemini 2.5 Pro and only a few iterations.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gq-r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gq-r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 424w, https://substackcdn.com/image/fetch/$s_!gq-r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 848w, https://substackcdn.com/image/fetch/$s_!gq-r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 1272w, 
https://substackcdn.com/image/fetch/$s_!gq-r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gq-r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png" width="1456" height="654" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:654,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:291308,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167203545?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gq-r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 424w, https://substackcdn.com/image/fetch/$s_!gq-r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 848w, 
https://substackcdn.com/image/fetch/$s_!gq-r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 1272w, https://substackcdn.com/image/fetch/$s_!gq-r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde3e3abc-7a58-4cd1-9c1a-3662a24cccd4_2021x908.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Generating the Mermaid code in Gemini</figcaption></figure></div><p>You can then take that code, paste it into the 
code generator of a Mermaid tool, and instantly get your diagram. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5tbT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5tbT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 424w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 848w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 1272w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5tbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png" width="1456" height="770" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:770,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:226816,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167203545?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5tbT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 424w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 848w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 1272w, https://substackcdn.com/image/fetch/$s_!5tbT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5b43c41a-a191-426f-a2d7-e394c22c8dfb_1722x911.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Mermaid generator at mermaidchart.com</figcaption></figure></div><p>This required a free account at <a href="http://mermaidchart.com">mermaidchart.com</a>.</p><h2>Bonus tip: keep the code with the diagram</h2><p>To ensure the diagram is always easy to maintain, store the Mermaid code in an expandable widget underneath the diagram in your documentation. 
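</p><p>As a minimal sketch of what that stored source could look like, here is a small flowchart with a couple of comment lines recording what the diagram covers and how to update it. The node names and the flow itself are hypothetical, made up for illustration rather than taken from a real project:</p><pre><code>graph LR
  %% Data flow for lead intake (hypothetical example)
  %% To update, paste this code into an LLM, describe the change, and re-render
  Form[Web form] --&gt; CRM[(CRM)]
  CRM --&gt;|New lead| Enrich[Enrichment step]
  Enrich --&gt;|Updated fields| CRM</code></pre><p>Mermaid treats lines that start with %% as comments, so notes like these travel with the code without changing the rendered chart. 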
</p><p>That way, if anyone ever wants to tweak it, they can just copy the code, paste it into an LLM, describe the change, and regenerate the diagram.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B-P2!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B-P2!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 424w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 848w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 1272w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B-P2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif" width="855" height="720" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:855,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3454216,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167203545?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B-P2!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 424w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 848w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 1272w, https://substackcdn.com/image/fetch/$s_!B-P2!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffbf83717-6ad2-4f86-b8de-404c5cc58fe9_855x720.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Using an expandable widget to store the code below the diagram in Confluence</figcaption></figure></div><h2>Is it a full replacement for tools like Lucidchart?</h2><p>I don&#8217;t think so. </p><p>You&#8217;ll notice it lacks some of the visual polish of Lucidchart. Where aesthetics really count (pre-sales, consulting), I&#8217;d still hand-build a diagram. </p><p>But where the goal is clarity and internal knowledge retention, Mermaid seems like an effective and (if you generate it with AI) incredibly <em>efficient</em> option. 
</p>]]></content:encoded></item><item><title><![CDATA[When in doubt, ask the model]]></title><description><![CDATA[Sometimes really good ideas come from asking the model to self-diagnose]]></description><link>https://www.aibuilders.blog/p/when-in-doubt-ask-the-model</link><guid isPermaLink="false">https://www.aibuilders.blog/p/when-in-doubt-ask-the-model</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Thu, 03 Jul 2025 02:04:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/725b496f-8337-4480-84ba-d14310133b96_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xeZw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xeZw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xeZw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xeZw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!xeZw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xeZw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:556220,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167400000?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xeZw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!xeZw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!xeZw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!xeZw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03393889-1a7f-4179-9274-b563f8739f6b_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>One of the more frustrating aspects of building agentic systems is when they don&#8217;t do what they&#8217;re told. </p><p>For example, I&#8217;m working on a research agent that&#8217;s instructed to send its output to a <a href="https://www.aibuilders.blog/p/the-qa-agent-design-pattern">QA agent</a> for validation BEFORE printing anything to the user. 
</p><p>The instruction appears multiple times in the prompt and is as explicit as can be: </p><pre><code>Before submitting to QA, finalize your full research report exactly as you intend to output it&#8212;including all formatting, section headers, and concluding lines - but DO NOT OUTPUT OR PRINT IT TO THE CHAT.</code></pre><p>&#8230;and yet the agent often failed to follow that instruction. Sometimes it behaved; other times, it printed its draft anyway, creating confusing and seemingly repetitive output.</p><h2>The danger of more</h2><p>The instinctive response is to just add more to the prompt:</p><ul><li><p>More directives</p></li><li><p>More repetition</p></li><li><p>More emphasis </p></li></ul><p>That&#8217;s the advice I got from ChatGPT and Gemini when I pasted in the prompt and asked why it wasn&#8217;t working.</p><p>But if you&#8217;re not intentional about it, this can easily lead to an overstuffed, bloated prompt that&#8217;s hard to follow. </p><p>Imagine trying to execute a task while someone shouts 50 commands at you. It gets confusing and paralyzing. I suspect something similar happens to an LLM: beyond a certain point of complexity, performance drops off. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NVa6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NVa6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NVa6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:740876,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167400000?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NVa6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!NVa6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d5310d9-c5c9-4ad4-9349-74c3bc5359b8_1408x768.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Instead: ask the model</h2><p>As an alternative, I&#8217;ve started asking the model that just made an error to self-diagnose.</p><p><strong>Example: </strong></p><blockquote><p>I noticed you did x even though your instructions state to do y. </p><p>How can I improve your prompt so that this doesn't happen in the future?</p></blockquote><p>The responses are often insightful. The model retraces its steps, explains its reasoning, and suggests fixes that feel grounded in the actual context.</p><p>Even better, these self-diagnoses can also reveal a lot about how the model is actually behaving behind the scenes. 
</p><h2>Insight 1: proximity matters</h2><p>In this particular example, the agent acknowledged that the instructions were clear in the prompt but too far from the moment of action:</p><blockquote><p>It's about recency and proximity of the instruction to the action. An instruction given 10 steps ago is less likely to be perfectly followed than one given right before the action is taken&#8230;.</p><p>&#8230;From my perspective as a model, the instruction that comes just before I'm supposed to act has the strongest influence on my next step. It's more effective than a general rule given at the beginning of a long prompt.</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Yd70!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Yd70!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 424w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 848w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 1272w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Yd70!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png" width="923" height="849" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:849,&quot;width&quot;:923,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154519,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167400000?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Yd70!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 424w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 848w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 1272w, https://substackcdn.com/image/fetch/$s_!Yd70!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fec4f1006-2f87-45ab-9980-10ce27c9c18f_923x849.png 1456w" sizes="100vw" 
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>That reframed how I think about placement inside a prompt. 
Where an instruction sits relative to the action matters more than how often it is repeated.</p><h2>Insight 2: goals &gt; constraints</h2><p>The model also flagged a deeper issue: a structural mismatch between the task&#8217;s positive goal and the negative constraint I was trying to apply.</p><p><strong>Feedback from the model: </strong></p><blockquote><p>Here&#8217;s a breakdown of the cognitive flaw:</p><ol><li><p><strong>The Goal is "Produce a Report":</strong> My primary goal, established by the overall prompt, is to research and create a report.</p></li><li><p><strong>"Finalize" is a Trigger Word:</strong> In Step 5, the instruction "finalize your full research report exactly as you intend to output it" is a powerful trigger. To me, "finalized" is almost synonymous with "ready for output." I have successfully completed the main creative task.</p></li><li><p><strong>The Negative Constraint is Weaker:</strong> The clause "&#8212; but DO NOT OUTPUT OR PRINT IT TO THE CHAT" is a <em>negative constraint</em> on the primary action. It's an instruction to <em>inhibit</em> a natural next step. For a language model, following a positive command ("Do X") is easier and more natural than obeying a negative one ("Don't do Y").</p></li><li><p><strong>The Path of Least Resistance:</strong> After "finalizing," the path of least resistance is to execute the primary goal: delivering the report. The instruction to call the QA agent is a deviation from that standard "report-writing" script.</p></li></ol></blockquote><p>The model then proposed a reframe that better aligned with the behavior I actually wanted.</p><blockquote><h3><strong>OLD VERSION (Problematic):</strong></h3><p><em>Before submitting to QA, finalize your full research report exactly as you intend to output it&#8212;including all formatting, section headers, and concluding lines - but DO NOT OUTPUT OR PRINT IT TO THE CHAT. 
Invoke the Dust agent </em><code>researcher_quality_assurance</code><em> and submit your research to that agent.</em></p><h3><strong>NEW VERSION (More Robust):</strong></h3><p><em>Your research and drafting phase is now complete. Your next and only action is to submit the draft for a mandatory quality check.</em></p><ol><li><p><em>Take the full research report you have prepared internally.</em></p></li><li><p><em>Invoke the </em><code>researcher_quality_assurance</code><em> tool.</em></p></li><li><p><em>Use the entire, fully-formatted report as the </em><code>query</code><em> argument for the tool.</em></p></li></ol><p><em><strong>Do not perform any other action until you receive feedback from the QA agent.</strong></em></p><p>This small semantic shift makes a huge difference in how I interpret and execute the task, significantly reducing the chance of premature output.</p></blockquote><h2>Conclusion</h2><p>I don&#8217;t have hard data showing that this kind of introspection works every time. </p><p>But when I implemented the changes and repeated the same run in a new chat, the agent did not print its drafts. 
&#128521;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a9L7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a9L7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a9L7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg" width="454" height="454" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:634,&quot;width&quot;:634,&quot;resizeWidth&quot;:454,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Fist baby Meme 
Generator - Imgflip&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Fist baby Meme Generator - Imgflip" title="Fist baby Meme Generator - Imgflip" srcset="https://substackcdn.com/image/fetch/$s_!a9L7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 424w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 848w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!a9L7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcac97f6f-0b5e-4fed-ba49-e4ff2553797a_634x634.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 
4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div>]]></content:encoded></item><item><title><![CDATA[Documenting with AI (part 1)]]></title><description><![CDATA[Using ChatGPT projects to create comprehensive, contextually-aware system overview docs]]></description><link>https://www.aibuilders.blog/p/documenting-with-ai-part-1</link><guid isPermaLink="false">https://www.aibuilders.blog/p/documenting-with-ai-part-1</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Tue, 01 Jul 2025 15:31:27 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/3beefdcf-b0ab-4d06-b9cc-47127f1a4681_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j6T8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j6T8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 
424w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j6T8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:783038,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167273808?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!j6T8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!j6T8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2334e688-b37c-4db1-b8aa-e4106de42262_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Most ops teams are deep in documentation debt. </p><p>It often gets left till the end of a project, when everyone's exhausted from the effort of getting it shipped and feeling pressure to move on. It&#8217;s hard to want to spend 5&#8211;10 hours writing docs at that point.</p><p>And so the can gets kicked down the road&#8230;and often disappears. What&#8217;s left is another system that lives only in tribal memory. </p><p>If the builder leaves (or simply forgets), you're stuck with something that's hard to debug and maintain and likely to get ripped out. </p><p>I hit this point last week: I was at the tail end of a multi-month project, excited to see everything finally coming together, and facing a complex build that needed to be documented.</p><h2>Creating documentation with an LLM</h2><p>The reason documentation can feel overwhelming is that you need to capture intricate systems comprehensively and then distill them down to something that&#8217;s digestible-yet-still-sufficiently-detailed.</p><p>It&#8217;s perfect work for an LLM&#8212;but the challenge becomes context. How do you give the LLM everything it needs to know without spending more time than it would take to just write the docs yourself? </p><h2>Enter ChatGPT Projects</h2><p>I slept on this feature for a long time, but it&#8217;s amazing.</p><p><a href="https://help.openai.com/en/articles/10169521-projects-in-chatgpt">ChatGPT Projects</a> allows you to consolidate chats, files, and other artifacts into a collection. 
It&#8217;s more than just a UI organizational tool&#8212;it also solves the problem of persistent memory across separate chats and files.</p><p>Once you create a project, you can move all your fragmented chats into one project space and upload related files: </p><ul><li><p>prompts</p></li><li><p>code</p></li><li><p>specs</p></li><li><p>data files of workflows (e.g., I got a JSON of my Zaps using Zapier's bulk export feature)</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zqH_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zqH_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 424w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 848w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 1272w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!zqH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png" width="987" height="804" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bee12ace-dd95-4518-be01-2bef8e63185d_987x804.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:804,&quot;width&quot;:987,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:89914,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167273808?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zqH_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 424w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 848w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 1272w, https://substackcdn.com/image/fetch/$s_!zqH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbee12ace-dd95-4518-be01-2bef8e63185d_987x804.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a><figcaption class="image-caption">Example project for something I&#8217;ve been working on</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-yEF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!-yEF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 424w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 848w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 1272w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-yEF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png" width="697" height="472" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49ee2c68-4370-4462-9eb8-799212376972_697x472.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:472,&quot;width&quot;:697,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:39432,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167273808?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d6e54b0-a62d-4dc3-a65e-d1b5939bc668_697x472.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!-yEF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 424w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 848w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 1272w, https://substackcdn.com/image/fetch/$s_!-yEF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49ee2c68-4370-4462-9eb8-799212376972_697x472.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Example files attached to the project</figcaption></figure></div><p>Unfortunately, this won&#8217;t work for everything (there&#8217;s no easy way to get all your Marketo smart campaigns into ChatGPT), but do the best you can.</p><h2>Prompt ChatGPT to write your first draft</h2><p><strong>Example prompt:</strong> </p><pre><code>I'm now ready to document this project in Confluence.  

I&#8217;ve attached an example of an existing documentation page, so you have a reference for how our documentation is typically structured.   

I have also uploaded all the files for this project:
- x 
- y 
- z  

The goal: a clear high-level overview of how the project works, without getting lost in minute detail.  

It should be understandable to any competent ops or systems person.</code></pre><p>You can then give feedback and iterate until you&#8217;re 90% of the way there. ChatGPT can even draw simple data flow diagrams, although it still tends to choke on complex ones. </p><p><em><strong>NOTE:</strong> I&#8217;ve since come across a better solution for complex diagrams. More on this in another post.</em></p><h2>QA the draft</h2><p>Before you use the draft, have the LLM perform a first QA pass on its own work to weed out errors and hallucinations: </p><p><strong>Example prompt:</strong></p><pre><code>Please now put on your critical fact-checker persona. 

Review the documentation with an eye to rooting out any potential hallucination or inaccuracy. 

Maximize clarity for someone new to the project while preserving technical detail.</code></pre><p>You still need to review the result and add the finishing touches, and for larger projects you may want to break the work out stage by stage. </p><p>But this approach cut my own documentation effort significantly, added much-needed clarity, and let me save my mental energy for polishing. </p><p>It&#8217;s not magic, but it&#8217;s the closest I&#8217;ve come to making documentation feel lightweight.</p><div><hr></div><p>P.S. I wrote a deep dive on best practices for documentation and building a knowledge base. You can <a href="https://revopsfm.substack.com/p/episode-4-how-to-create-a-knowledge-base?utm_source=publication-search">check it out here</a>.</p>]]></content:encoded></item><item><title><![CDATA[The "QA Agent" design pattern]]></title><description><![CDATA[Experiments in increasing agent reliability]]></description><link>https://www.aibuilders.blog/p/the-qa-agent-design-pattern</link><guid isPermaLink="false">https://www.aibuilders.blog/p/the-qa-agent-design-pattern</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Tue, 01 Jul 2025 15:10:44 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4013999d-2078-471c-b1e2-45169dcb937d_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!XYNm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!XYNm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 
424w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!XYNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:754371,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167270990?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!XYNm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!XYNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F89e68845-6108-4595-8abf-d7dd7dac797b_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>My belief is that agent reliability will be the number one obstacle to the widespread roll-out of agentic systems and one of the most valuable problems for systems pros to solve. </p><p>There is data supporting this: <a href="https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027?_bhlid=716e9ecf5301e015bd09e774531793c22f57bfad">Gartner estimates</a> that over 40% of agentic AI projects will be canceled by the end of 2027, due to escalating costs, unclear business value or inadequate risk controls. </p><p>It&#8217;s easy to spin up a v1 agent, but much harder to get output you&#8217;d actually trust.</p><h2>Demos vs. Production Systems</h2><p>This is the fundamental difference between a demo showing &#8220;the art of the possible&#8221; and something that people will actually use in practice:</p><p>The demo only needs to work <em>once</em>, whereas a production system needs to work well consistently for users to adopt it. </p><p>I&#8217;ve been wrestling with this problem while building an agent to write outbound sales emails using Dust + Zapier. 
</p><p>I started with a single agent that handled everything: it researched the prospect, picked a message angle, and wrote the copy.</p><p>It worked as a proof of concept, but I doubted the quality would satisfy a real-world user, and it was too hard to debug.</p><p>So I broke the task into parts:</p><ul><li><p><strong>Research agent</strong>: finds relevant signals</p></li><li><p><strong>Analyst agent</strong>: picks what matters and creates a strategic narrative</p></li><li><p><strong>Copywriting agent</strong>: turns that into an email for human review</p></li></ul><p>This approach showed promise, but it quickly became clear that the quality was still too inconsistent. </p><p>Some outputs were gems, while others were total duds, even for the same company.</p><p>So reliability became my critical concern. </p><h2>Trial and error</h2><p>I tried a number of experiments to improve the consistency of the output, including: </p><ul><li><p>More detailed prompts</p></li><li><p>Step-by-step checklists in the instructions</p></li><li><p>Internal QA / self-review steps (Have you done x? Have you done y?)</p></li></ul><p>All of these help, but they can also create problems of their own: </p><ul><li><p><strong>Prompt bloat: </strong>Bloated prompts become hard to maintain and ultimately confuse the agent.</p></li><li><p><strong>Rigidity: </strong>If you make everything a checklist, you get output that&#8217;s rigid and formulaic while still lacking complete predictability. (Congrats, you just built a less reliable workflow!)</p></li><li><p><strong>Loss of creativity and flexibility: </strong>The whole reason to build an agent is its ability to be creative and think on its feet. If you need to spell out every edge case, you start to undermine that benefit. </p></li></ul><p>The most baffling part: when I pasted a weak result into ChatGPT and asked for feedback, it instantly spotted the problem, even though the same model running inside the agent had missed it. 
</p><p>That&#8217;s when it clicked: <br><br>I needed to separate the creative and critical roles.</p><h2>Aligning agent incentives with tasks</h2><p>Asking the same agent to generate output and then QA its own work created a misalignment of incentives. </p><p>The agent is motivated to complete its task of generating output, which creates pressure to be lenient on the QA guidelines. </p><p>By contrast, when I pasted an output into ChatGPT and asked for a critique, the model was focused solely on the editorial task and did a much better job. </p><p>The solution: split the writing and the editorial/QA responsibilities between separate agents and have them work together. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dwMl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dwMl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 424w, https://substackcdn.com/image/fetch/$s_!dwMl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 848w, https://substackcdn.com/image/fetch/$s_!dwMl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 1272w, 
https://substackcdn.com/image/fetch/$s_!dwMl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dwMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png" width="1456" height="900" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:900,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:93220,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167270990?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!dwMl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 424w, https://substackcdn.com/image/fetch/$s_!dwMl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 848w, 
https://substackcdn.com/image/fetch/$s_!dwMl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 1272w, https://substackcdn.com/image/fetch/$s_!dwMl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F489d7c22-502f-4da8-a1cf-2428f0dcd263_1456x900.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>One agent generates. 
The other evaluates.</p><p>They each have different incentives and mindsets.</p><ul><li><p><strong>The Research Agent&#8217;s job</strong> is to find anything that might be useful, even if it&#8217;s a stretch. </p></li><li><p><strong>The QA Agent&#8217;s job</strong> is to stress-test that work against a quality standard and push back.</p></li></ul><p>The first time I looked at the run history and saw these two agents dialoguing and actively improving the output together was amazing. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KtsO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KtsO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 424w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 848w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 1272w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KtsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png" width="743" height="904" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:904,&quot;width&quot;:743,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151721,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167270990?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KtsO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 424w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 848w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 1272w, https://substackcdn.com/image/fetch/$s_!KtsO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f2bc8c4-65fa-486b-9110-0d1f37aed740_743x904.png 1456w" sizes="100vw" 
loading="lazy"></picture></div></a></figure></div><p>Overall, the results have improved significantly:</p><ul><li><p>Fewer irrelevant signals</p></li><li><p>Sharper synthesis</p></li><li><p>Much better alignment with buyer personas</p></li></ul><h2>Issues</h2><p>There are a number of issues I&#8217;m still working through: </p><ul><li><p><strong>Balancing strictness and flexibility:</strong> there&#8217;s a constant struggle to stay in the &#8220;Goldilocks zone,&#8221; where the QA agent is detailed enough to catch real issues but not so strict that it rejects quality output. It&#8217;s far from perfect. 
</p></li><li><p><strong>Extra token usage: </strong>QA cycles consume tokens and can cause errors if the context window is exhausted. It&#8217;s important to put a hard limit on the number of cycles. </p></li><li><p><strong>More to maintain: </strong>the addition of QA agents creates more overhead, more prompts, more usage, etc. </p></li></ul><p>Overall, though, it feels like a significant step forward. </p><p>The key takeaway for me has been understanding how LLM behavior changes based on context and role. That understanding makes it possible to design collaboration around the mechanics of how these models actually think.</p>]]></content:encoded></item><item><title><![CDATA[Why I started AI Builders]]></title><description><![CDATA[We need a field journal for ops and tech pros building real-world AI systems]]></description><link>https://www.aibuilders.blog/p/why-i-started-ai-builders</link><guid isPermaLink="false">https://www.aibuilders.blog/p/why-i-started-ai-builders</guid><dc:creator><![CDATA[Justin Norris]]></dc:creator><pubDate>Fri, 27 Jun 2025 19:42:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/c62bbdea-52df-4451-84c3-75b29d231aa7_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Cl0z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Cl0z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!Cl0z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Cl0z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Cl0z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Cl0z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:709793,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.aibuilders.blog/i/167124971?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!Cl0z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Cl0z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Cl0z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Cl0z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6be85bb4-a052-40d8-8131-d492cae1d949_1408x768.jpeg 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a></figure></div><p><em>This is the first post on AI Builders, and I wanted to explain why I created it.</em></p><p>For the past two years, I&#8217;ve been publishing <a href="https://revopsfm.substack.com/">RevOps FM</a>, a podcast and blog covering a wide array of ops and GTM topics. </p><p>But for the past six months, most of my posts have been focused on AI. </p><p>I&#8217;m certainly not alone in that, but the pivot towards AI (at least for me) isn&#8217;t an attempt to jump on a bandwagon or get some extra clicks. </p><p>Automation has been a through-line in my career for nearly 15 years. Ever since I got my first Marketo instance, I&#8217;ve been giddy about the raw power it gave a humble English major like me.</p><p>The design and mechanics of automated systems seem to fit my brain naturally and intuitively, in a way that few other topics do. And every time I&#8217;ve felt truly in a flow state&#8212;seeing the landscape with elevated clarity&#8212;it&#8217;s been during major systems work.</p><p>Fast forward to today, and LLMs have utterly transformed the scope of what&#8217;s possible for automated systems. </p><p>I&#8217;m by no means an AI hype-maximalist (you can see my reservations on display in <a href="https://revopsfm.substack.com/p/episode-14-the-top-ai-use-cases-for">this piece</a>, the first thing I published on AI). And if anything, my concerns about AI have only grown over the past two years: concern about whether it weakens us psychologically, whether it will be a cause of social disintegration, even whether it will lead to <a href="https://ai-2027.com/">catastrophe</a>. 
</p><p>At the same time, I can&#8217;t deny the sheer childlike, magical joy I feel working with AI systems. Not lazily outsourcing my thinking or serious effort to LLMs&#8212;but dialoguing with them, exploring ideas with them, occasionally wanting to throw something at them, and experiencing their oddly endearing quirks of personality. </p><p>Even though I know at heart that they are &#8220;merely&#8221; highly sophisticated text-prediction machines, I continue to marvel that such a technology exists and that I&#8217;m alive to see it. It&#8217;s the kind of feeling that a person from the 15th century might have if we suddenly placed them in an airplane, looking out over the cloud canopy. </p><p>All this is to say: I feel convinced that building agentic AI systems is going to occupy much of the next act of my career. It&#8217;s work I feel very well suited for and work I feel overwhelmingly drawn to do. </p><h1>Why I want to write about AI</h1><p>The first reason is that AI is so new and unfamiliar that working with it inevitably leads to a series of new insights. And I enjoy sharing things I&#8217;m learning about&#8212;it helps me crystallize and clarify my own understanding. </p><p>At the same time, I feel the state of knowledge about AI is dire right now. There is water, water, everywhere, and not a drop to drink. </p><p>Most AI content I come across falls into one of three buckets:</p><ol><li><p><strong>Deeply technical newsletters for AI PhDs</strong>&#8212;fascinating, but often a degree or two removed from the problems I face.</p></li><li><p><strong>AI news round-ups&#8212;</strong>helpful for staying current, but not much use when architecting real systems.</p></li><li><p><strong>AI hype-posting&#8212;</strong>those viral org charts of 27 agents saving 1,000 hours a week&#8230;with no receipts.</p></li></ol><p>It&#8217;s this last category of content that I feel is actively harmful to our profession. 
It inflates expectations about what AI can do and delegitimizes AI when those expectations go unfulfilled. </p><p>I&#8217;ve always taken my craft as a technologist seriously, and any kind of snake-oil peddling only cheapens that craft. </p><p>I&#8217;m also seeing very little content aimed at someone like me: a technical systems/operations person tasked with helping lead &#8220;AI transformation&#8221; and building robust systems for internal users. </p><p>No one seems to be talking about the hard, gritty, and often frustrating work of designing, testing, and scaling internal AI applications&#8212;the kind that actually work day to day, not just in demos.</p><p>There&#8217;s clearly a gap, and I&#8217;d like to help fill it. </p><p>I don&#8217;t have any aspirations to be an AI &#8220;influencer&#8221; or &#8220;creator.&#8221; I just want to share with and learn from other people who are figuring this stuff out. If that&#8217;s you too, then I hope this becomes a space where we can learn together.</p><h1>Why I&#8217;m creating a new blog to do it</h1><p>I briefly considered just evolving my content strategy for RevOps FM. But while RevOps certainly overlaps with AI, it doesn&#8217;t really make sense for a RevOps publication to focus solely on AI. </p><p>I could also have rebranded RevOps FM, but I don&#8217;t think that would be right. I feel a modest but sincere satisfaction in the body of knowledge I&#8217;ve helped create and curate with the RevOps FM podcast and articles. It&#8217;s become one of the most recognized podcasts in its niche, and I want it to continue to be a resource for that community. </p><p>Therefore: a new publication is the way to go. </p><p>AI Builders will be that new publication. It&#8217;s meant as a field journal for ops and tech pros building real-world AI systems for internal use (as opposed to AI PhDs building LLMs or AI product developers creating commercial products).</p><p>I believe we are an under-served audience. Hopefully we can remedy that. 
</p><p>At the same time, I wanted a space where I could safely make no particular promises about content frequency or length or level of polish. </p><p>One of the challenges I&#8217;ve had with RevOps FM is the amount of time it takes to create quality content. I burned myself out last year publishing weekly episodes, and feel persistent guilt about the content I&#8217;d like to write but don&#8217;t have time to. </p><p>So I&#8217;d like a space that&#8217;s appropriate for short, informal, practical and tactical content that I can write <em>quickly</em>. If I can write a post in 20 minutes about something I just learned, and it means I actually post it, then I&#8217;m doing more good than waiting until I have 10-15 hours to cover a topic definitively. </p><p>I&#8217;m thinking of the content on AI Builders as akin to informal notes from the field&#8212;dog-eared and smudged, but hopefully valuable.</p><h1>You can also be a contributor</h1><p>I don&#8217;t mean to be a one-man band on this and would very much welcome contributions from other people doing this work. </p><p>I don&#8217;t have formal contributor guidelines set up yet, but if you are a <strong>practitioner</strong> who is building production-grade AI systems (not just demos) for internal use and has insights to share, just drop me a line: justin at ai builders dot blog. </p>]]></content:encoded></item></channel></rss>