The operator's roadmap for AI in 2026
What’s keeping AI from doing real work—and how to fix it
Right now, many teams are still copying and pasting between ChatGPT and the systems where real work actually lives.
General-purpose AI assistants like ChatGPT, Gemini, or Claude are flexible and pervasive; you can use them for almost anything, but they’re still largely disconnected from your day-to-day systems. So you copy-paste context in, then copy-paste outputs back out.
Meanwhile, every major software vendor is shipping “AI features” inside their products. Your CRM has AI. Your email has AI. Your project management tool has AI. But these assistants are tightly scoped to a single tool, blind to everything else you’re working on.
The resulting user experience is limiting and frustrating. In part, this is because the potential for AI to do more is so tantalizingly visible yet still out of reach.
The issue isn’t model capability. It’s that AI still sits outside the systems where real work happens, disconnected from both organizational knowledge and the ability to act.
This is why I believe the companies that truly execute as “AI natives” in 2026 will do so by creating a ubiquitous AI layer that connects the knowledge and tools we use every day.
At bottom, it comes down to a simple premise: if we want AI to start doing the actual work that humans do, it needs the same level of context and the same ability to act.
Here are four pillars I’m focused on. The first two are already in production; the latter two are emerging areas I see as high leverage.
Universal Context — AI connected across ALL your corporate systems
Universal Action — AI that actually DOES work, not just analyzes
Proactive AI — Systems that prompt YOU instead of waiting to be prompted
Automation Creating Automation — Agents that spawn workflows and become infrastructure
This article is written for Systems and Operations teams responsible for developing AI as an organizational capability. It focuses on the biggest limiting factors I see in day-to-day AI usage today and what teams can do to remove them.
Pillar 1: Universal Context
Most AI tools today are islands.
ChatGPT knows very little about your company. Salesforce Einstein only sees Salesforce. The AI in your project management tool only knows what’s in that tool.
Your brain still needs to act as the bridge across these systems, and if you want help on big-picture thinking and planning, you need to provide that context manually to your chat-based assistant.
Even ChatGPT or Claude projects are inherently limited to the files you’ve uploaded. You’re stuck conversing with an LLM with encyclopedic knowledge of the world and very little knowledge of your specific corner of it—the corner that actually matters most.
Imagine working with an analyst who didn’t have access to source systems or Google Drive. Or a marketing ops manager who didn’t have access to marketing SOPs. They’d only see what you happened to share with them, and they’d feel like very limited team members.
AI only starts to feel “smart” once it has access to the same knowledge sources you rely on every day.
What this looks like
At my company, we use Dust (an AI assistant framework) to connect Google Drive, Confluence, Gong, Salesforce, Snowflake, and Amplitude (among other systems) all in one place.
This means when I’m working on a project, my assistant can:
Fetch background documents from Google Drive
Pull process docs and SOPs from Confluence
Look up current metrics in Snowflake
Check related tasks and project status in Trello
You couldn’t do this with an agent locked inside just one of these tools. You need something that sits across all of them.
The difference is substantial. With ChatGPT, you’re copying and pasting context back and forth constantly. With platform-native AI, you only see one slice of the picture. With universal context, you get a strategic thought partner with visibility across all work layers.
Pitfalls
There’s an important caveat here: connecting an LLM to a knowledge source doesn’t mean it knows how to USE that knowledge.
Every time an agent accesses a data source, you run into the fluency problem. For example, if you ask it to pull data from Snowflake or Salesforce and it doesn’t understand your schema, it produces incorrect insights.
Universal context requires universal fluency, not just universal access.
You wouldn’t take a new hire, throw them into the back-end of Salesforce, and expect them to intuitively understand a schema with 10 years of history. They need context and enablement. Similarly, your AI assistants need this enablement layer to make sense of your systems.
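To make that concrete, here’s a minimal sketch of what a per-source context layer can look like in practice: a short fluency guide that gets prepended to an agent’s instructions whenever it touches that system. The schema names, field conventions, and helper function below are illustrative assumptions, not a real warehouse.

```python
# A minimal sketch of a per-source context layer: a short fluency guide prepended to the
# agent's instructions before it queries a system. All table and field names are illustrative.
SNOWFLAKE_CONTEXT = """
Fluency notes for our Snowflake warehouse:
- Report from the ANALYTICS schema only; RAW_* schemas contain unmodeled ingest data.
- Use FCT_OPPORTUNITIES for pipeline questions; it is deduplicated, OPPORTUNITY_SNAPSHOTS is not.
- An "MQL" means LIFECYCLE_STAGE = 'mql' with MQL_DATE populated; never infer it from lead source.
- All amounts are in USD; all timestamps are UTC.
"""

def build_instructions(task: str, source_contexts: list[str]) -> str:
    """Combine the user's task with the fluency guides for every source the agent will touch."""
    return "\n\n".join([*source_contexts, f"Task: {task}"])
```

The point isn’t the specific format; it’s that fluency is written down once, attached to the source, and reused on every request instead of being re-explained by a human each time.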
Takeaways
Each knowledge source needs a context layer
Include how it’s organized, how to navigate it, how a fluent user would actually work within it. Things like schemas, naming conventions, and organizational logic.
Clean, coherent, comprehensive documentation is no longer aspirational
Poor documentation doesn’t just slow humans down—it permanently caps the value you can extract from AI.
Teams without clean, accessible docs end up re-explaining the same things over and over. The friction and time cost of delegation stays high, so AI gets used less or only for shallow tasks.
So if you’re still feeling guilty about documentation debt but have had trouble articulating the business imperative to address it, now is your moment. Documentation debt is now AI debt.
Pillar 2: Universal Action
Pillar 1 was about understanding—giving AI the context to think. This pillar is about acting—giving AI the access to do.
Even a well-informed assistant is still just a commentator on the sidelines if it can’t take action. The next fundamental shift is AI that actually DOES work rather than just analyzing it.
Most teams start this process by looking for big, high-impact “agent” tasks where AI can help (e.g., writing SDR emails, performing account research) and then building dedicated systems around them. I did this too, because all the hype around agents made this seem like the right path.
These big rock applications are obviously important. But they leave a lot of value on the table. Many AI applications won’t be well-defined point solutions built to solve a specific problem. They’ll be organic, ad-hoc, and fluid interactions with AI in the flow of work, delegating tasks much like you would to a junior employee. (Think of it as building a general-purpose AI teammate vs. an agent for a specific task).
That’s because a significant portion of our work doesn’t fit into one of these big buckets. Much of our day-to-day is consumed by small, hard-to-classify tasks: answering questions, updating data or documentation, fixing things, checking information, and so on.
I started calling these “paper cut tasks.” Small things that are individually inconsequential but feel significant in aggregate. They distract from higher-leverage work, create a psychological burden, and contribute to the feeling of being perpetually overwhelmed.
My team set a goal: delegate these paper cut tasks to AI, one at a time. We created a dedicated AI assistant that has access to all our docs (for answering questions) and asked the teams we support to go to that assistant as the first step before creating a ticket for us.
Now, for each request that comes up, we ask ourselves: what would be required for the assistant to actually DO this work on our behalf?
Typically this means giving it write access to systems, either via MCP, API, or intermediaries like Zapier that provide gateways to other tools.
What this looks like
Here’s a common example: Salesforce data fixes. These sorts of issues come up often for almost every team.
An opportunity is miscategorized in reporting.
A marketing user notices and requests a fix.
Someone needs to troubleshoot, make a small update, and verify the report looks right.
This isn’t rocket science, but it’s still work.
Since most issues fall into a few known buckets, it was relatively easy to enable our AI assistant to triage these issues directly with users. If it identifies a known scenario, it updates Salesforce directly.
Critically, we didn’t give it unrestricted write access. We exposed a specific workflow that allows a specific update in a specific way with a specific input. It then posts a Chatter message documenting the action, and we log it to a Zapier table as well. This enables autonomy and observability within guardrails that we’re comfortable with.
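As a rough illustration of what “a specific update in a specific way with a specific input” can look like when exposed as a tool, here’s a sketch using the simple_salesforce library. The custom field name, the allowed values, and the audit shape are hypothetical, and the Chatter post and Zapier logging from the workflow above are omitted.

```python
# A sketch of a narrowly scoped "fix the category" tool the assistant can invoke.
# The field name and allowed values are hypothetical; only this one field on this
# one object can be changed, and every call returns an audit record.
from simple_salesforce import Salesforce

ALLOWED_CATEGORIES = {"New Business", "Expansion", "Renewal"}

def fix_opportunity_category(sf: Salesforce, opportunity_id: str,
                             new_category: str, reason: str) -> dict:
    """Apply one validated update to one field, and return an audit record."""
    if new_category not in ALLOWED_CATEGORIES:
        raise ValueError(f"{new_category!r} is not an allowed category")

    # The agent's write access is limited to exactly this call.
    sf.Opportunity.update(opportunity_id, {"Opportunity_Category__c": new_category})

    return {
        "object": "Opportunity",
        "id": opportunity_id,
        "field": "Opportunity_Category__c",
        "new_value": new_category,
        "reason": reason,
    }
```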
Another example: KPI monitoring. Rather than running manual reports or waiting for weekly syncs, team leads now ask our assistant directly—”How did MQL volume trend this week?” or “What’s our current pipeline coverage?”—and get answers in seconds. The assistant queries our sources of truth, contextualizes the numbers against historical trends, and surfaces insights conversationally.
This is faster than building a report, but more importantly, it changes the way people engage with their data. The LLM can act as a junior analyst, not just a number cruncher.
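Here’s a hedged sketch of the read-only side: a vetted query template the assistant can run against Snowflake for one specific KPI. The table and column names are illustrative; in practice the templates live alongside the context layer from Pillar 1 so the agent isn’t improvising SQL.

```python
# A sketch of a read-only KPI lookup the assistant can call. Table and column names
# are illustrative; the agent runs a vetted template rather than writing free-form SQL.
import snowflake.connector

WEEKLY_MQL_SQL = """
    SELECT DATE_TRUNC('week', mql_date) AS week, COUNT(*) AS mqls
    FROM analytics.fct_leads
    WHERE mql_date >= DATEADD('week', -8, CURRENT_DATE)
    GROUP BY 1
    ORDER BY 1
"""

def weekly_mql_volume(conn) -> list[tuple]:
    """Return the last eight weeks of MQL counts for the assistant to contextualize."""
    cur = conn.cursor()
    try:
        cur.execute(WEEKLY_MQL_SQL)
        return cur.fetchall()
    finally:
        cur.close()
```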
Pitfalls
Just like LLMs need enablement on how to consume information, they also need an enablement layer on how to act.
For example, I was experimenting with having an AI assistant scan incoming email messages and create Trello cards on my board. This would remove a key point of friction and ensure I had clean, neatly organized tasks ready for prioritization.
But without a guide to the Trello environment, the assistant burned tokens while stepping on rakes: creating cards on the wrong board, adding the wrong metadata, not using the right description structure, and so on.
This was the same problem as before—the contextual layer was missing. The agent had access but not fluency.
Takeaways
A flexible, user-facing AI assistant is an investment, not a cost
Each task automated in this way pays ongoing dividends in the form of time saved. And having a general-purpose assistant (rather than a solution-specific system) is critical for enabling flexibility and agility, as it lowers the marginal cost of deploying new use cases.
Once the scaffolding exists, you can justify automating the small, annoying things.
Autonomy and safeguards go together
Experimental actions need human supervision. And with the current state of technology, it’s safer to provide agents with narrowly-scoped, controlled workflows they can invoke vs. complete API access.
Comprehensive APIs/MCPs will be table-stakes for vendors
Operators will increasingly expect their vendors to provide robust configuration APIs and MCPs so that agents can act on behalf of their ops teams.
For example, I may not trust an average marketing user to make a routine change in our lead routing system (there’s too much risk of breaking something). But an agent with proper context and guardrails could do it IF it has the necessary system access.
Software vendors need to start designing for both human and digital users.
Pillar 3: Proactive AI (emerging)
Right now, most AI waits for you to prompt it. You type a question. It answers. You ask for analysis. It provides. The dynamic is reactive.
Now, I do have a lot of AI operating autonomously in production, but it’s still mainly in the form of scheduled workflows with AI steps: they run on a schedule, perform fixed tasks, and incorporate narrowly-scoped AI analysis. This type of use case feels relatively mature.
But it’s still rules-based. We’re doing a crude version of proactive AI today: scheduled Gong analysis that flags low-scoring calls for manager review. It works, but it’s scaffolding for something more dynamic.
Opportunities for true proactivity:
An agent that monitors your funnel
Top-line numbers look fine.
But it notices efficiency is declining—secondary KPIs dropping in ways that foreshadow problems weeks out.
It initiates analysis and delivers insights to the demand gen team without being asked.
A sales-coaching agent
It reviews sales calls then reaches out to the rep to start a discussion about the call and how to improve.
These aren’t AI-generated tips on top of a regular UX; this is an interactive discussion initiated by the AI, like a coach reaching out with feedback.
A project manager agent
It grooms your backlog, identifies milestones that are slipping, and proactively flags areas of misalignment (e.g., “I don’t think stakeholder B and stakeholder C are aligned on requirements. We should book a sync to discuss.”).
It could keep projects from drifting in terms of both timelines and requirements.
All these reflect a shift from “AI waits for commands” to “AI is an active participant in your workflow.”
The trust problem
Here’s what keeps me cautious: trust is fragile.
One bad result from an autonomous agent can damage adoption permanently. Users conclude “not ready for prime time” and stop engaging. The perception spreads.
We need layers of safeguards: narrow scopes, good prompt engineering, QA steps. I’ve written extensively about how to build reliability into AI systems.
But we also need expectation management. AI can be powerful AND imperfect. The trade-off is scalability versus reliability, and we have to be pragmatic about where that balance sits.
Pillar 4: Automation Creating Automation (emerging)
This is the capability I’m most excited about.
I haven’t deployed it yet, but I’ve seen enough to know it’s coming.
The problem is that many tasks COULD be fully automated with traditional workflows (no AI required), but individually they’re too small to justify the build effort. The people who build these workflows are specialized and in demand, so lower-leverage tasks don’t get prioritized.
However, an accumulation of smaller tasks can add up to a lot of work.
But what if agents could create those workflows for you? Now you have automation that spawns more automation.
Once you can describe a workflow in a machine-readable language like JSON, an agent can create new workflow definitions in that format. The missing link is for platforms to support workflow creation via these formats.
n8n provides this capability, and my research suggests other platforms do too (or surely will soon).
For example, check out this demo of Claude Code using MCP servers to build n8n workflows from natural language.
The agent creates, modifies, and tests workflows without a human touching the canvas.
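To illustrate the idea (not n8n’s actual schema), here’s a sketch of the kind of machine-readable definition an agent might emit, plus a trivial review queue so a human approves it before deployment. Node types, parameters, and the queue file are all hypothetical.

```python
# A sketch of "automation creating automation": the agent emits a workflow definition as
# plain data, and an ops reviewer approves it before it's registered with the platform.
# The structure is loosely inspired by exported n8n workflows but simplified; node types
# and parameters here are hypothetical.
import json

agent_generated_workflow = {
    "name": "Notify #revops when an opportunity category changes",
    "nodes": [
        {"name": "Salesforce trigger", "type": "trigger.salesforce_record_updated",
         "parameters": {"object": "Opportunity", "field": "Opportunity_Category__c"}},
        {"name": "Post to Slack", "type": "action.slack_message",
         "parameters": {"channel": "#revops", "message": "Opportunity {{ id }} changed category"}},
    ],
    "connections": {"Salesforce trigger": ["Post to Slack"]},
}

def submit_for_review(workflow: dict, queue_path: str = "pending_workflows.jsonl") -> None:
    """Append the agent's proposal to a queue that a human reviews before deployment."""
    with open(queue_path, "a") as f:
        f.write(json.dumps(workflow) + "\n")
```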
Why this matters
AI-generated automation changes the calculus on what’s worth automating.
Today: “This task costs 5 hours per week, but building the automation would take 40 hours. Not worth it.”
Tomorrow: “Describe what you want. The agent builds it. Ops reviews and approves.”
There is, of course, an obvious risk worth naming: agent-generated workflows still need lifecycle management. Who owns them? Who updates them when requirements change?
These are challenges teams will need to solve, but I don’t see them as prohibitive relative to the potential gains.
In conclusion: reality check
What are the barriers to having these pillars in 2026?
Infrastructure lag. Most companies don’t have Dust, n8n, or MCP setups yet.
Awareness gap. Teams don’t know what’s possible. The demos exist but haven’t gone mainstream.
Trust building takes time. Delegation requires proof of reliability. That proof accumulates slowly.
Last-mile problem. None of this is turnkey. You need technical operators who can build and maintain these systems.
But the trajectory is clear. If you aren’t yet equipped to delegate work to AI on a daily basis, start listing out the obstacles—technical, organizational, political—and preparing a plan to address them.
The gap between “AI can do amazing things” and “AI does real work for my team” is not about the models. It’s about context, access, and trust. Those are solvable problems, and solving them is what will separate teams that use AI from teams that are transformed by it.



