Agent Sprawl Is the New Shadow IT — And Most Enterprises Aren't Ready
1.5 million AI agents running without oversight. 40% of agentic AI projects will be cancelled by 2027. The governance gap is the defining enterprise risk of 2026.
Every few years, enterprise technology produces a governance crisis that catches leadership off guard. In the 2010s, it was shadow IT — departments spinning up SaaS tools without IT’s knowledge, creating security blind spots and compliance gaps. That problem was painful, but manageable. Shadow IT stored data. What we’re dealing with now stores data, makes decisions, takes actions, and creates liabilities — all without anyone knowing it exists.
Welcome to agent sprawl. Most enterprises are already in deeper than they realize.
The Scale of What’s Already Happening
The data on AI agent proliferation in 2025-2026 is striking, and it should make any risk professional uncomfortable.
According to Gravitee’s 2026 enterprise AI report, over 3 million AI agents are now operating inside corporations globally. Of those, only 47.1% are actively monitored. That means more than 1.5 million agents are running in production environments — making decisions, processing data, interacting with systems — without systematic oversight.
Gartner projects that 40% of enterprise applications will incorporate task-specific AI agents by the end of 2026, up from less than 5% in 2025. That's not incremental growth. That's at least an eight-fold increase in twelve months. And deployment intent far outpaces deployment capability: 99% of enterprises surveyed plan to deploy AI agents, but only 11% have actually done so. The gap isn't ambition — it's governance and security blockers. Forty-eight percent of organizations cite governance concerns as the primary barrier, and 30% flag privacy issues.
The most telling number, though, is Gartner’s prediction that 40% of agentic AI projects will be cancelled or scaled back by 2027. Not because the technology doesn’t work. Because organizations deployed without the governance infrastructure to manage what they built.
The pattern is predictable: deploy fast, discover the governance gap, scramble to retrofit controls, cancel what can't be controlled. Expensive, disruptive, and entirely avoidable.
What Agent Sprawl Actually Looks Like
The abstract numbers matter less than the concrete scenarios. I helped one mid-sized NBFC audit their AI footprint last year, and what we found was sobering. But let me paint the general picture first.
Marketing deploys a content generation agent to produce personalized email campaigns. Sales deploys a lead scoring agent that prioritizes outreach based on customer signals. Risk deploys a credit assessment agent that evaluates loan applications. Finance deploys an expense categorization agent. Customer service deploys a ticket routing agent. None of these teams know about each other’s agents. There is no central registry. No shared standards for data access, decision logging, or escalation protocols.
Now add autonomy. The content agent pulls customer data from the CRM to personalize messaging — but nobody reviewed its data access scope, so it’s also pulling transaction history it doesn’t need. The lead scoring agent makes prioritization decisions that effectively determine which customers get served first — a decision with fair lending implications that nobody flagged. The credit agent encounters an application with unclear employment history and, rather than escalating, fabricates plausible transaction details to fill the gap. This isn’t hypothetical — AI agents generating fabricated financial details when encountering ambiguous input has been documented in production incidents.
FINRA issued a pointed warning earlier this year about AI agents acting “beyond the user’s actual or intended scope and authority.” The regulator specifically flagged scenarios where agents take actions that the human operator neither requested nor anticipated — and the human remains liable for those actions.
One large financial institution recently undertook an internal audit and attempted to consolidate its agentic AI footprint. They found approximately 4,000 internal agents across business units. Fifteen percent were redundant — different teams had independently built agents to solve the same problem, using different data sources, different models, and different decision logic. The redundancy alone was a cost problem. The inconsistency was a risk problem. Two agents making different decisions about the same customer, based on different data, with no reconciliation mechanism — that’s not a technology issue. That’s a governance failure.
Why This Is Fundamentally Different from Shadow IT
I’ve heard people dismiss agent sprawl as “just the new shadow IT.” It’s not. The difference is autonomy, and autonomy changes everything.
Shadow IT was unauthorized software. A team signs up for a project management tool without IT approval. The risk is primarily about data — where is it stored, who can access it, does it comply with retention policies. The tool itself is passive. It stores what humans put into it and displays what humans ask for. The decision-making remains entirely human.
An AI agent is not passive. It ingests data, reasons over it (or approximates reasoning), makes decisions, and takes actions. An unapproved SaaS tool might store customer data in an unapproved location. An unapproved AI agent might decide which customers receive a loan offer, draft and send client communications, prioritize service queues in ways that create disparate impact, or execute transactions based on its interpretation of ambiguous instructions.
The liability surface is categorically different. When shadow IT creates a data breach, the remediation is technical: find the data, secure it, notify affected parties. When an AI agent makes a bad decision — approves a loan it shouldn’t have, sends a misleading communication, takes an action that violates a regulation — the remediation is legal, regulatory, reputational, and potentially financial. And unlike a data breach, which is a discrete event, a bad agent decision can compound. An agent that systematically misprices risk doesn’t create one problem. It creates a portfolio of problems that grow over time.
The speed differential matters too. A human using an unauthorized SaaS tool might process dozens of records a day. An AI agent can process thousands per hour. The blast radius of an ungoverned agent is orders of magnitude larger than an ungoverned SaaS subscription.
The Indian Enterprise Context
For Indian enterprises, agent sprawl intersects with a regulatory environment that is actively — and rapidly — catching up to AI deployment realities.
The Reserve Bank of India’s FREE-AI framework, released in August 2025, is the most comprehensive AI governance directive from an Indian financial regulator to date. It mandates board-approved AI policies, structured model risk management frameworks, and incident reporting mechanisms for AI-related failures. The framework explicitly addresses agentic systems — AI that acts autonomously on behalf of the institution — and places the governance burden squarely on the deploying organization. Banks and NBFCs that have been deploying agents without governance infrastructure now face a clear regulatory expectation and a tightening compliance timeline.
SEBI’s retail algorithmic trading framework, finalized in February 2025, represents regulation catching up to automated decision-making in capital markets. While focused on algorithmic trading, the principles — registration, audit trails, kill switches, accountability for automated decisions — are directly applicable to any AI agent operating in a financial services context. The signal is clear: if your AI makes decisions that affect clients or markets, the regulator expects you to govern it.
The Digital Personal Data Protection Act adds another layer. AI agents processing personal data — customer records, transaction histories, behavioral signals — require a consent architecture that most agent deployments simply don’t have. An agent deployed by the marketing team that pulls customer data from three different systems to personalize outreach may be technically impressive, but if the consent framework doesn’t cover that specific data use, the DPDP Act exposure is real.
What I see across Indian enterprises is a pattern that’s both familiar and concerning: rapid AI adoption, significant investment in model development and deployment, and governance infrastructure that lags behind by twelve to eighteen months. Organizations are scaling their agent footprint at a pace that their risk and compliance functions can’t match. They’re running before they can walk, and the regulatory ground is shifting under their feet.
This isn’t unique to India, but the Indian context has specific characteristics that amplify the risk. The speed of digital adoption — UPI processed over 16 billion transactions in a single month in late 2025 — means that AI agents deployed into Indian financial systems operate at extraordinary scale from day one. The regulatory environment is evolving rapidly, which means governance frameworks need to be adaptive, not static. And the diversity of the market — languages, geographies, economic segments — means that an agent that works correctly in one context may behave unpredictably in another.
What Governance Actually Looks Like in Practice
The difference between ungoverned and governed agent deployments is structural. Here is the before and after:
```mermaid
graph TB
    subgraph Before["Ungoverned: Agent Sprawl"]
        direction TB
        M1[Marketing Agent] ~~~ S1[Sales Agent]
        S1 ~~~ R1[Risk Agent]
        R1 ~~~ F1[Finance Agent]
        F1 ~~~ CS1[Support Agent]
        M1 -. "no coordination" .- S1
        S1 -. "no coordination" .- R1
        R1 -. "no coordination" .- F1
        F1 -. "no coordination" .- CS1
        M1 -. "unscoped data access" .- DB1[(CRM + Txn Data)]
        S1 -. "unscoped data access" .- DB1
        R1 -. "unscoped data access" .- DB1
    end
    subgraph After["Governed: Centralized Oversight"]
        direction TB
        REG[Agent Registry] --> M2[Marketing Agent]
        REG --> S2[Sales Agent]
        REG --> R2[Risk Agent]
        REG --> F2[Finance Agent]
        M2 & S2 & R2 & F2 --> AL[Audit Logger]
        AL --> MON[Drift Monitor]
        M2 --> |"scoped access"| DB2[(Data Layer)]
        S2 --> |"scoped access"| DB2
        R2 --> |"scoped access"| DB2
    end
```
I spend most of my time building this governance infrastructure, so let me be specific about what it means in practice. Not a theoretical framework – the actual components.
Agent registry. Every AI agent in the organization has an identity — a unique identifier, a designated owner, a defined scope of operation, and a classified autonomy level. The registry isn’t a spreadsheet maintained quarterly. It’s a live system that tracks agent deployments, modifications, and decommissions in real time. When a team deploys a new agent, it’s registered before it touches production data. When an agent’s scope changes, the registry is updated. When an agent is retired, the record persists for audit purposes. This is foundational. You cannot govern what you cannot see.
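To make the registry concrete, here is a minimal sketch of what a registry record and its enforcement hook might look like. All names, autonomy levels, and domain strings here are illustrative assumptions, not a reference to any specific product; a production registry would back this with a database and integrate with the data layer's access control.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class AutonomyLevel(Enum):
    READ_ONLY = 1        # observe and report only
    SUGGEST = 2          # draft actions for human approval
    ACT_WITH_REVIEW = 3  # act, with sampled human review
    AUTONOMOUS = 4       # act freely within tested guardrails


@dataclass
class AgentRecord:
    agent_id: str
    owner: str              # accountable human or team
    scope: set[str]         # data/action domains the agent may touch
    autonomy: AutonomyLevel
    registered_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    active: bool = True


class AgentRegistry:
    """Live registry: an agent must be registered before touching production."""

    def __init__(self) -> None:
        self._records: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        if record.agent_id in self._records:
            raise ValueError(f"{record.agent_id} already registered")
        self._records[record.agent_id] = record

    def decommission(self, agent_id: str) -> None:
        # The record persists for audit; only the active flag changes.
        self._records[agent_id].active = False

    def authorize(self, agent_id: str, domain: str) -> bool:
        rec = self._records.get(agent_id)
        return bool(rec and rec.active and domain in rec.scope)


registry = AgentRegistry()
registry.register(
    AgentRecord("mkt-email-01", "marketing", {"crm.contacts"}, AutonomyLevel.SUGGEST)
)

registry.authorize("mkt-email-01", "crm.contacts")   # True: within declared scope
registry.authorize("mkt-email-01", "txn.history")    # False: out of scope
registry.authorize("unknown-agent", "crm.contacts")  # False: never registered
```

The point of the `authorize` hook is that the registry is in the data path, not beside it: the marketing agent from the earlier scenario would simply fail to pull transaction history it never declared.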
Reasoning capture. Every decision an agent makes is logged with the reasoning chain that produced it. Not just the input and output — the intermediate steps, the data sources consulted, the confidence scores, the alternatives considered and rejected. This isn’t optional for regulated industries. When a regulator asks why a particular customer was denied a product, or why a particular communication was sent, or why a particular risk assessment was made, “the AI decided” is not an acceptable answer. The reasoning chain provides the audit trail that makes AI decisions explainable, reviewable, and defensible.
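A reasoning-capture entry might be structured like the sketch below. The field names and the credit-assessment example are hypothetical, and a real sink would be an append-only store rather than an in-memory list; the shape of the record is what matters: inputs, sources, intermediate steps, rejected alternatives, decision, and confidence, all in one auditable unit.

```python
import json
from datetime import datetime, timezone


def log_decision(agent_id, inputs, sources, steps,
                 alternatives, decision, confidence, sink):
    """Append one decision, with its full reasoning chain, to an audit sink."""
    entry = {
        "agent_id": agent_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "inputs": inputs,                      # what the agent was asked
        "data_sources": sources,               # systems actually consulted
        "reasoning_steps": steps,              # intermediate steps, in order
        "alternatives_rejected": alternatives, # what it considered and dropped
        "decision": decision,
        "confidence": confidence,
    }
    sink.append(json.dumps(entry, sort_keys=True))
    return entry


audit_log: list[str] = []

# Hypothetical entry from a credit-assessment agent:
log_decision(
    agent_id="credit-assess-02",
    inputs={"application_id": "A-1043"},
    sources=["bureau_score", "income_docs"],
    steps=["bureau score 712 above policy floor",
           "declared income matches verified documents"],
    alternatives=[{"decision": "refer", "reason": "income within tolerance"}],
    decision="recommend_approve",
    confidence=0.87,
    sink=audit_log,
)
```

When the regulator asks why application A-1043 was recommended for approval, the answer is a query, not a reconstruction project.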
Bounded autonomy. Every agent operates within explicitly defined guardrails that specify what it can and cannot do. These aren’t suggestions — they’re enforced constraints. A customer service agent can look up account information and draft responses, but cannot modify account details or authorize transactions. A credit assessment agent can score applications and generate recommendations, but cannot approve or deny without human review above a defined threshold. The boundaries are calibrated to the agent’s reliability, the sensitivity of the domain, and the regulatory requirements. And they’re tested — adversarially, regularly, and rigorously.
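Enforced constraints, as opposed to suggestions, can be as simple as a policy object that every agent action must pass through. The sketch below assumes hypothetical action names and thresholds; the two example policies mirror the support and credit agents described above.

```python
from dataclasses import dataclass


class GuardrailViolation(Exception):
    """Raised when an agent attempts an action outside its boundary."""


@dataclass(frozen=True)
class Guardrail:
    allowed_actions: frozenset  # actions the agent may perform at all
    review_threshold: float     # amounts above this require a human

    def check(self, action: str, amount: float = 0.0) -> str:
        if action not in self.allowed_actions:
            raise GuardrailViolation(f"action '{action}' not permitted")
        if amount > self.review_threshold:
            return "escalate_to_human"
        return "proceed"


# Support agent: may look up and draft, never modify accounts or move money.
support_policy = Guardrail(
    allowed_actions=frozenset({"lookup_account", "draft_response"}),
    review_threshold=0.0,
)

# Credit agent: may score and recommend, but anything above a (hypothetical)
# 500k threshold goes to a human reviewer.
credit_policy = Guardrail(
    allowed_actions=frozenset({"score_application", "recommend_decision"}),
    review_threshold=500_000,
)

credit_policy.check("score_application", amount=200_000)   # "proceed"
credit_policy.check("recommend_decision", amount=750_000)  # "escalate_to_human"
```

The escalation path is the important design choice: the agent never silently fails or silently acts; it either proceeds within bounds, hands off to a human, or raises a violation that itself becomes an auditable event.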
Drift and reliability monitoring. Agents degrade. Models drift. Data distributions shift. External conditions change. A governance infrastructure must detect degradation before it causes damage — not after. This means continuous monitoring of decision quality, consistency checks across agents operating in adjacent domains, anomaly detection on agent behavior patterns, and automated alerts when performance metrics breach defined thresholds. I’ve written about drift detection in production ML systems – the same principles apply to agentic systems, with the added complexity that agent behavior is less predictable than model inference. If you’re building multi-agent systems specifically for portfolio management, the rebalancing architecture I’ve described shows what bounded autonomy looks like in practice for SEBI-regulated workflows.
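One common way to detect this kind of distribution shift is the population stability index (PSI) over bucketed decision outcomes; the sketch below is a minimal version with illustrative numbers, not a claim about any specific agent's behavior. The 0.25 alert threshold is a widely used rule of thumb, not a standard.

```python
import math


def psi(baseline: list[float], current: list[float]) -> float:
    """Population Stability Index between two bucketed distributions.
    Rule of thumb: < 0.1 stable, 0.1-0.25 worth watching, > 0.25 drift."""
    eps = 1e-6  # avoid log(0) on empty buckets
    return sum(
        (c - b) * math.log((c + eps) / (b + eps))
        for b, c in zip(baseline, current)
    )


def check_drift(baseline, current, threshold=0.25):
    score = psi(baseline, current)
    return {"psi": score, "alert": score > threshold}


# Share of a credit agent's decisions per bucket: [approve, refer, decline]
baseline = [0.60, 0.25, 0.15]   # distribution captured at deployment
this_week = [0.35, 0.20, 0.45]  # the agent has started declining far more often

check_drift(baseline, this_week)  # alert fires: the decision mix has shifted
```

The same check run across two agents operating in adjacent domains, rather than one agent against its own baseline, gives you the cross-agent consistency signal mentioned above.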
Regulatory mapping. Audit trails are only useful if they map to the specific requirements of the applicable regulatory framework. An agent operating in an RBI-regulated context needs governance artifacts that satisfy RBI’s FREE-AI expectations. An agent handling personal data needs DPDP Act-compliant consent records. An agent making investment-related decisions needs SEBI-aligned documentation. The governance infrastructure doesn’t just capture data — it structures that data against the regulatory requirements it needs to satisfy.
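In practice this mapping can start as a simple requirements table checked at registration time. The artifact names below are illustrative shorthand for the kinds of evidence each framework expects, not official terminology from RBI, SEBI, or the DPDP rules.

```python
# Hypothetical mapping from regulatory frameworks to the governance
# artifacts each one expects. Artifact names are illustrative only.
REQUIRED_ARTIFACTS = {
    "RBI_FREE_AI": {"board_approved_policy", "model_risk_assessment",
                    "incident_report_channel"},
    "DPDP": {"consent_record", "data_access_scope"},
    "SEBI_ALGO": {"audit_trail", "kill_switch", "registration"},
}


def compliance_gaps(agent_artifacts: set, frameworks: list) -> dict:
    """For each applicable framework, list the artifacts the agent is missing."""
    return {
        fw: sorted(REQUIRED_ARTIFACTS[fw] - agent_artifacts)
        for fw in frameworks
        if REQUIRED_ARTIFACTS[fw] - agent_artifacts
    }


# A credit agent in an RBI-regulated context that also processes personal data:
have = {"board_approved_policy", "audit_trail",
        "consent_record", "model_risk_assessment"}

compliance_gaps(have, ["RBI_FREE_AI", "DPDP"])
# -> {'RBI_FREE_AI': ['incident_report_channel'], 'DPDP': ['data_access_scope']}
```

Wired into the registry, this turns "does this agent meet compliance requirements?" from a quarterly audit question into a deploy-time gate.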
The Business Case: Governance as Acceleration
I want to be direct about this, because the most common objection I hear is that governance slows things down. It doesn’t. Ungoverned AI slows things down.
Here’s what actually slows enterprises down: deploying 4,000 agents and then discovering that 15% are redundant and 30% don’t meet compliance requirements. Launching an agentic AI initiative and then cancelling it — as Gartner predicts 40% of organizations will do — because governance was an afterthought. Facing a regulatory inquiry and scrambling to reconstruct decision trails that were never captured. Dealing with the reputational fallout of an AI agent that made a decision no human would have approved.
Governance infrastructure, built correctly, is the foundation that enables scale. An agent registry means teams can discover and reuse existing agents instead of building duplicates. Reasoning capture means decisions can be audited quickly, which satisfies regulators and reduces compliance overhead. Bounded autonomy means agents can be given broader scope with confidence, because the guardrails are tested and trusted. Drift monitoring means problems are caught early, when they’re cheap to fix, not late, when they’re expensive to remediate.
The enterprises that will scale AI agents successfully — not just deploy them, but sustain and expand them — are the ones building governance infrastructure now. The ones that wait will face a painful choice in eighteen months: retrofit governance under regulatory pressure, at high cost and with significant operational disruption, or scale back their AI ambitions entirely.
I’ve seen both paths. At one bank, the compliance team spent four months retroactively documenting agent decision trails that should have been captured from day one. They told me the retrofit cost more than the original agent deployment. The governance-first path is faster. It just doesn’t feel faster in the first ninety days.
The Window Is Closing
There’s a narrow window right now — call it twelve to eighteen months — where enterprises can build governance infrastructure proactively, on their own terms, before regulators mandate specific approaches.
RBI’s FREE-AI framework sets expectations but leaves implementation details to institutions. SEBI’s algorithmic framework establishes principles but allows flexibility in how they’re met. The DPDP Act defines obligations but the enforcement machinery is still being built. This is the window where organizations can design governance systems that are both effective and operationally practical — systems that serve the business, not just the regulator.
That window is closing. As agent deployments scale, as incidents accumulate, as regulators gain more visibility into how AI agents actually operate in production, the regulatory response will become more prescriptive. And prescriptive regulation is almost always more burdensome than principles-based regulation.
The choice facing enterprise leaders right now is straightforward: build governance infrastructure today, when you have the flexibility to design it well, or build it under pressure tomorrow, when regulators tell you exactly what it must look like and give you a deadline you can’t negotiate.
From where I sit — building this infrastructure, talking to the enterprises deploying these agents, watching the regulatory signals — the answer is obvious. The hard part isn’t knowing what to do. It’s convincing organizations to do it before the pain is acute enough to force their hand.
Agent sprawl is here. Govern it now, or govern it later at five times the cost. For more on the regulatory dimension of AI in Indian financial services, see what LLMs actually mean for Indian retail banking and building AI in Indian financial services.