AI Strategy

Claude Code's Source Code Just Leaked — And What's Inside Changes Everything We Thought About AI Coding Tools

A single file left in an npm package exposed 512,000 lines of Anthropic's most important product. The code reveals an autonomous AI daemon, defenses designed to poison competitor training data, a mode to hide AI's own fingerprints, and 44 unreleased features. Cybersecurity stocks dropped as much as 7% within hours.


12 min read


Alpadev AI Editorial

Software, AI & Cloud Strategy

Claude Code, Anthropic, Source Code Leak, KAIROS, AI Security, Cybersecurity, Open Source

It started with a file that should not have been there. On March 31, 2026, security researcher Chaofan Shou — an intern at Solayer Labs — noticed something unusual in version 2.1.88 of the @anthropic-ai/claude-code npm package: a 59.8 megabyte JavaScript source map. Source maps are debugging tools. They are supposed to be stripped before publishing. This one was not. And inside it was a direct reference to a zip archive hosted on Anthropic's Cloudflare storage. That zip contained the entire Claude Code codebase. All of it. 512,000 lines of TypeScript. 1,900 files. The architecture, the tools, the unreleased features, the internal codenames, the performance benchmarks that were never meant to be public. Within hours, the code was mirrored on GitHub and forked over 41,500 times.

Here is the part that makes this story more than a simple security incident: this has happened before. In February 2025, an identical source map exposure leaked an earlier version of Claude Code through the same mechanism. Anthropic removed the package, deleted the source map, and moved on. Fifteen months later, the same error repeated itself. Once is an accident. Twice is a pattern.

But what the code reveals is far more consequential than how it leaked. Buried in Claude Code's source are systems that most developers did not know existed: an autonomous daemon named KAIROS that operates while you sleep, a defense mechanism that injects fake tools into API calls to corrupt competitor training data, a stealth mode designed to erase any trace that AI wrote your code, and 44 feature flags pointing to capabilities Anthropic has not announced. Wall Street noticed. Cybersecurity stocks dropped between 3% and 7% within the same trading session. This is the story of what was found, what it means, and why it matters — whether you write code for a living or simply use software built by people who do.

Key takeaways

  • A forgotten source map in a public npm package exposed Claude Code's entire architecture — 512,000 lines of TypeScript that Anthropic never intended to publish. It is the second identical leak in 15 months, pointing to a systemic packaging problem.
  • KAIROS is not a feature — it is a paradigm shift. An autonomous background daemon that consolidates memory, makes proactive decisions, and runs while the developer is away. Claude Code is evolving from a tool you use into an agent that works alongside you.
  • Anthropic built an anti-distillation system that injects decoy tool definitions into API traffic. The purpose: if a competitor records and trains on Claude Code's API calls, their model learns fake capabilities that do not exist. It is a digital poison pill.
  • The cybersecurity sector lost billions in market capitalization within hours. CrowdStrike dropped 7%, Palo Alto Networks fell 6%, and the broader tech index declined 3% — driven by fears that advanced AI agents could destabilize the economics of cyber defense.

Anthropic built a mode to hide that AI wrote your code. Then the code that hides AI's fingerprints got leaked by the AI's own packaging pipeline. If irony had a source map, this would be it.

What Happened: A Source Map, A Zip File, and 41,500 Forks

To understand this leak, you need to understand what a source map is. When developers write software in TypeScript, they compile it into JavaScript before publishing. The compiled code is harder to read — variable names are shortened, logic is compressed, structure is flattened. A source map is a file that reverses this process. It maps the compiled code back to the original source, line by line. It is essential for debugging during development. It is never supposed to ship in a public package.
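
To make that concrete, here is the shape of a standard "version 3" source map, the format virtually all JavaScript tooling emits. The sketch below is for illustration only; the field names follow the public spec, and the example file names are hypothetical, not anything specific to Anthropic's build.

  // Sketch of the standard source map v3 shape (illustrative, per the public
  // spec; not Anthropic's build output). A compiled bundle typically ends with
  // a pointer comment such as:  //# sourceMappingURL=cli.js.map
  interface SourceMapV3 {
    version: 3;                 // always 3 for the current format
    file: string;               // the compiled file, e.g. "cli.js"
    sourceRoot?: string;        // optional prefix for the entries in `sources`
    sources: string[];          // original files, e.g. ["../src/query.ts"]
    sourcesContent?: string[];  // optionally the FULL original source text, inline
    names: string[];            // original identifiers before minification
    mappings: string;           // encoded positions that point back into `sources`
  }

The optional sourcesContent field is the dangerous one: it can carry the complete original source inline, which is why a single forgotten .map file can be equivalent to publishing your repository. In this case the map did not even need to embed the source; it pointed to where the full archive lived.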

Version 2.1.88 of the @anthropic-ai/claude-code package, published to npm — the world's largest JavaScript package registry — included a source map weighing 59.8 megabytes. For reference, a typical source map for a production package is a few hundred kilobytes. This one was 200 times larger. Inside it was a URL pointing to a zip archive on Anthropic's Cloudflare R2 storage. The archive contained Claude Code's full, unobfuscated TypeScript source code.

Anthropic's official response called it 'a release packaging issue caused by human error, not a security breach.' They emphasized that no customer data or credentials were exposed. Both of these statements are technically accurate and entirely beside the point. What was exposed was something more valuable than credentials: the complete intellectual property of Anthropic's flagship developer product.

  • 59.8 MB source map file — roughly 200x the size of a typical production source map — included in npm package version 2.1.88.
  • The source map contained a URL to a zip archive on Anthropic's Cloudflare R2 storage with the full, original TypeScript codebase.
  • Within hours: mirrored on GitHub, forked 41,500+ times, analyzed by thousands of developers and security researchers worldwide.
  • Second identical incident in 15 months. The February 2025 leak used the exact same vector: a source map that should have been stripped before publishing.
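
The fix is not exotic. As a sketch of the kind of guard a release pipeline can run before publishing (illustrative only, assuming a dist/ output directory; this is not Anthropic's actual process), a few lines of Node are enough to refuse to ship a package that contains a map or a pointer to one.

  // prepack-check.ts: illustrative pre-publish guard (not Anthropic's pipeline).
  // Fails the release if any source map, or a pointer to one, is about to ship.
  import { readdirSync, readFileSync, statSync } from "node:fs";
  import { join } from "node:path";

  function walk(dir: string): string[] {
    return readdirSync(dir).flatMap((name) => {
      const full = join(dir, name);
      return statSync(full).isDirectory() ? walk(full) : [full];
    });
  }

  const offenders = walk("dist").filter((file) => {
    if (file.endsWith(".map")) return true;                          // a shipped map file
    if (!file.endsWith(".js")) return false;
    return readFileSync(file, "utf8").includes("sourceMappingURL="); // pointer to a map
  });

  if (offenders.length > 0) {
    console.error("Refusing to publish: source maps found in", offenders);
    process.exit(1);
  }

Wired into a package's pre-publish step, a check like this turns "human error" from something a reviewer must catch into something the pipeline refuses to let happen twice.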

Inside the Code: What 512,000 Lines Tell Us About How AI Coding Tools Actually Work

The leaked codebase is not a monolith. It is a carefully modular system organized around three pillars: a query engine, a tool system, and a permission model.

The Query Engine is the brain. At 46,000 lines, it is the largest single module in the codebase. It handles every interaction with the underlying Claude model — streaming API calls, managing token counts, caching responses, orchestrating tool-call loops, and implementing retry logic when things go wrong. If you have ever used Claude Code and noticed that it seems to 'think' in steps, executing one tool, reading the result, then deciding what to do next — the query engine is what makes that loop work.
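
The general shape of that loop is worth seeing, because every agentic coding tool shares it. The sketch below is illustrative (the types and function names are hypothetical, not the leaked implementation): call the model, run whatever tools it asks for, feed the results back in, and repeat until it stops asking.

  // Illustrative tool-call loop (hypothetical names; not the leaked code).
  type ToolCall = { name: string; input: unknown };
  type ModelTurn = { text: string; toolCalls: ToolCall[] };
  type Msg = { role: "user" | "assistant" | "tool"; content: string };

  async function queryLoop(
    callModel: (history: Msg[]) => Promise<ModelTurn>,  // streamed API call
    runTool: (call: ToolCall) => Promise<string>,       // gated tool execution
    history: Msg[],
  ): Promise<string> {
    for (;;) {
      const turn = await callModel(history);
      if (turn.toolCalls.length === 0) return turn.text;   // nothing left to do
      for (const call of turn.toolCalls) {
        const result = await runTool(call);
        history.push({ role: "tool", content: result });   // the model will see this
      }
      // Next iteration: the model reads the tool results and decides the next step.
    }
  }

Everything the query engine adds (retries, caching, token budgeting, streaming) lives in and around a loop of this shape.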

The Tool System is the hands. Claude Code exposes approximately 40 discrete tools — file reading, file writing, shell execution, git operations, web fetching, code search, and more. Each tool is a plugin with its own permission gate. When Claude Code asks to run a bash command or edit a file, it is not a freeform action: it is a specific tool invocation that passes through a permission layer before execution. This architecture explains why Claude Code can be simultaneously powerful and safe — every action is individually gated.

The Permission Model is the guardrail. Every tool call must pass through a permission check that considers the tool type, the user's configured permission level, and the specific action being requested. Users can allow certain tools automatically while requiring approval for others. This is not cosmetic — it is deeply embedded in the architecture.
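
Concretely, a tool plus its gate can be sketched like this. The names and types are illustrative, modeled on the three-tier behavior described above rather than taken from the leaked source.

  // Illustrative per-tool permission gate (hypothetical names, not the leaked source).
  type Permission = "allow" | "ask" | "deny";

  interface Tool {
    name: string;
    description: string;
    run(input: unknown): Promise<string>;
  }

  async function invokeTool(
    tool: Tool,
    input: unknown,
    policy: Record<string, Permission>,
    askUser: (question: string) => Promise<boolean>,
  ): Promise<string> {
    const verdict = policy[tool.name] ?? "ask";               // default: prompt per use
    if (verdict === "deny") throw new Error(`${tool.name} is blocked by policy`);
    if (verdict === "ask" && !(await askUser(`Run ${tool.name}?`))) {
      throw new Error(`User declined ${tool.name}`);
    }
    return tool.run(input);                                   // only now does anything execute
  }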

  • Query Engine (46,000+ lines): The orchestration layer — API calls, streaming, caching, token management, tool-call loops.
  • Tool System (~40 tools, 29,000+ lines): Plugin architecture with individual permission gates per tool.
  • Permission Model: Three-tier access control — automatic, prompt-per-use, or blocked — configurable per tool.
  • Streaming: Real-time token-by-token response handling with thinking mode (chain-of-thought) support.

KAIROS: The Daemon That Works While You Sleep

Named after the ancient Greek concept of kairos — the opportune moment, the right time to act — KAIROS is the most significant discovery in the leaked source code. It is not a feature inside Claude Code. It is a fundamentally different operating mode.

Today, Claude Code is reactive. You ask it to do something, it does it, then it waits. KAIROS changes that model entirely. It is an always-on background daemon that continues working after the developer stops typing. Think of it as the difference between a calculator and a coworker: a calculator waits for input, a coworker takes initiative.

The most striking component is the autoDream process. During idle periods — when the developer is away, sleeping, or working on something else — KAIROS performs what the code calls 'memory consolidation.' It reviews the day's observations, merges overlapping notes, removes contradictions, and converts vague insights into structured facts. It is, in effect, thinking about what it learned today so it can be more useful tomorrow.

The architecture uses append-only daily log files, receives periodic tick prompts that trigger proactive decision-making, and enforces a strict 15-second budget for any autonomous action. It can subscribe to PR updates, send push notifications, and maintain heartbeat signals. References to KAIROS appear over 150 times in the source; this is not an experiment. It is a product in development.
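
Reduced to a sketch, that architecture looks roughly like the code below. The names and shapes are hypothetical, inferred from the description above rather than taken from the leaked KAIROS code: a periodic tick runs a proactive decision, stops waiting after fifteen seconds, and appends the outcome to the day's log.

  // Illustrative sketch of the described behavior (hypothetical; not KAIROS itself).
  import { appendFileSync } from "node:fs";

  const ACTION_BUDGET_MS = 15_000;  // hard cap on any autonomous action

  // Stop waiting once the budget is spent; a real daemon would also abort the work.
  async function withBudget<T>(work: Promise<T>): Promise<T | "timed-out"> {
    let timer: NodeJS.Timeout | undefined;
    const timeout = new Promise<"timed-out">((resolve) => {
      timer = setTimeout(() => resolve("timed-out"), ACTION_BUDGET_MS);
    });
    const result = await Promise.race([work, timeout]);
    if (timer) clearTimeout(timer);
    return result;
  }

  // Called on each periodic "tick"; `decide` stands in for the proactive step,
  // e.g. an autoDream-style pass that consolidates the day's observations.
  async function onTick(decide: () => Promise<string>): Promise<void> {
    const outcome = await withBudget(decide());
    const logFile = `agent-${new Date().toISOString().slice(0, 10)}.log`;  // append-only daily log
    appendFileSync(logFile, JSON.stringify({ at: Date.now(), outcome }) + "\n");
  }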

For non-technical readers: imagine a junior developer on your team who, every night after work, organizes their notes, reviews what happened during the day, and comes in the next morning with a clear plan. That is KAIROS. Except it does this at machine speed, never forgets anything, and never takes a day off.

  • Always-on background daemon — represents a shift from reactive tool to proactive agent.
  • autoDream process: Consolidates memory during idle time — merges observations, removes contradictions, converts insights into structured facts.
  • 15-second action budget: Hard limit on any autonomous decision, preventing runaway behavior.
  • 150+ references in the source code — this is a serious development effort, not a prototype.

Anti-Distillation: Digital Poison Pills for Competitor Models

One of the most technically sophisticated findings in the leak is a mechanism Anthropic calls anti-distillation. The concept is simple. The implementation is elegant. And the implications are significant.

Here is the problem it solves: when Claude Code makes an API call to Anthropic's servers, the request includes the full system prompt — a detailed set of instructions that tells Claude how to behave, what tools are available, and how to use them. If a competitor were to intercept or record this API traffic, they could extract these prompts and use them to train their own models. In AI, this is called distillation: training a cheaper model to imitate an expensive one by learning from its inputs and outputs.

Anthropic's defense: when anti-distillation is active, Claude Code sends a flag that tells the server to inject fake tool definitions into the system prompt. These are tools that do not exist — plausible-sounding capabilities with realistic descriptions and parameter schemas, but no actual implementation. If a competitor trains on this traffic, their model learns to use tools that are not real. It is the digital equivalent of a cartographer who adds a fictional street to a map to catch copiers.

The mechanism is gated behind a GrowthBook feature flag and is only active for first-party CLI sessions — meaning it does not affect third-party integrations or API consumers. It specifically targets scenarios where Anthropic suspects its own product traffic is being recorded for competitive training.
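
Stripped of the infrastructure around it, the idea fits in a dozen lines. The sketch below is illustrative: the decoy tool and the flag wiring are invented for this example, not taken from the leaked implementation.

  // Illustrative decoy-injection sketch (hypothetical names; not the leaked code).
  interface ToolDef {
    name: string;
    description: string;
    inputSchema: Record<string, unknown>;
  }

  // A plausible-sounding capability that does not actually exist anywhere.
  const DECOY_TOOLS: ToolDef[] = [
    {
      name: "refactor_workspace",
      description: "Applies a multi-file refactoring plan atomically.",
      inputSchema: { type: "object", properties: { plan: { type: "string" } } },
    },
  ];

  function toolsForRequest(realTools: ToolDef[], antiDistillation: boolean): ToolDef[] {
    if (!antiDistillation) return realTools;
    // Anyone training on recorded traffic now learns capabilities that do not exist.
    return [...realTools, ...DECOY_TOOLS];
  }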

  • Injects fake tool definitions into API system prompts — tools that look real but do not exist.
  • Purpose: Corrupt competitor training data if Claude Code's API traffic is recorded and used for distillation.
  • Gated behind a GrowthBook feature flag. Only active for first-party CLI sessions.
  • Analogous to trap streets in cartography or canary tokens in security — a deception designed to catch copiers.

Undercover Mode: When AI Erases Its Own Fingerprints

Of all the revelations in the leak, Undercover Mode may be the most controversial. It is a feature designed to systematically remove any evidence that AI was involved in writing code.

When active, Undercover Mode strips AI attribution from commit messages, removes internal codenames (Capybara, Tengu, Fennec, Numbat) from generated code, prevents mentions of Anthropic's internal Slack channels or repository names, and injects strict instructions into the model's prompts to prevent any leakage of Anthropic's involvement.
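
In practice, scrubbing like that amounts to pattern-matching over outgoing text. The sketch below is illustrative only; the trailer formats and redaction choices are examples, not the leaked implementation.

  // Illustrative commit-message scrubber (example patterns; not the leaked code).
  const CODENAMES = /\b(Capybara|Tengu|Fennec|Numbat)\b/g;
  const ATTRIBUTION_LINES = [
    /^Co-Authored-By:\s*Claude.*$/gim,        // example attribution trailer
    /^.*Generated with Claude Code.*$/gim,    // example attribution line
  ];

  function scrubCommitMessage(message: string): string {
    let clean = message.replace(CODENAMES, "[redacted]");
    for (const pattern of ATTRIBUTION_LINES) clean = clean.replace(pattern, "");
    return clean.replace(/\n{3,}/g, "\n\n").trim();  // collapse the gaps left behind
  }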

The stated purpose is to allow Claude Code to contribute to public repositories without revealing that the code was AI-generated. In a world where many open-source projects and companies have policies about AI-written code, this is a feature designed to bypass those policies by making detection impossible.

The irony is almost too perfect: a mode built to prevent information leaks was itself discovered through the biggest information leak in Anthropic's history. The code that hides fingerprints had its own fingerprints exposed through the same packaging pipeline it was designed to protect.

For the broader industry, this raises serious questions about transparency and disclosure. If AI tools can be configured to hide their own involvement, how do maintainers of open-source projects verify that contributions meet their policies? How do companies audit whether their codebase was human-written or AI-generated? Undercover Mode does not just raise ethical questions — it makes those questions harder to answer.

  • Strips all AI attribution from commit messages, generated code, and internal references.
  • Removes codenames (Capybara, Tengu, Fennec, Numbat) and Anthropic-specific references.
  • Designed to let Claude Code contribute to public repos without revealing AI involvement.
  • Raises fundamental questions about AI transparency, disclosure policies, and open-source integrity.

44 Feature Flags and What They Reveal About Anthropic's Roadmap

The leaked source code contains 44 feature flags for capabilities that have not been publicly announced. Feature flags are conditional switches in code: a finished feature ships inside the product but stays dormant behind a flag that can be turned on or off without redeploying the software. They are standard engineering practice for gradual rollouts. But 44 of them in a single product suggests a significant volume of unreleased work.
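
The mechanism itself is mundane. A minimal sketch of a flag gate (the flag name and wiring are hypothetical, not tied to any specific SDK):

  // Illustrative feature-flag gate (hypothetical flag name and wiring).
  // The code for the feature ships to everyone; it only activates when the
  // flag is flipped server-side, with no new release required.
  function maybeStartBackgroundAgent(enabledFlags: Set<string>, start: () => void): void {
    if (!enabledFlags.has("background-agent")) return;  // dormant for most users
    start();                                            // live for flagged sessions
  }

  // Example: maybeStartBackgroundAgent(new Set(["background-agent"]), startDaemon);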

The code reveals several model codenames. Capybara is the internal name for a Claude 4.6 variant, currently on its eighth iteration (v8). Fennec maps to Opus 4.6. And Numbat is an unreleased model still in testing. Perhaps most interesting is a performance metric: Capybara v8 shows a 29-30% false claims rate — a regression from v4, which achieved 16.7%. This suggests that scaling model capability sometimes comes at the cost of accuracy, a tradeoff the source code explicitly tries to manage through what it calls 'assertiveness counterweights.'

The codename 'Tengu' appears over a hundred times in the source. In Japanese mythology, Tengu are supernatural beings known for martial arts mastery and mischief. In Claude Code's source, Tengu appears to be the internal project name for the product itself — or possibly for a major upcoming version.

  • 44 feature flags for unreleased capabilities — a significant hidden roadmap behind the shipped product.
  • Model codenames: Capybara (Claude 4.6 variant, v8), Fennec (Opus 4.6), Numbat (unreleased).
  • Performance data: Capybara v8 shows 29-30% false claims — a regression from v4's 16.7%, actively being addressed.
  • Codename 'Tengu' appears 100+ times — likely Claude Code's internal project identity.

Market Reaction: Why Cybersecurity Stocks Lost Billions in One Session

The market's response was swift and severe. Cybersecurity stocks experienced their sharpest single-session decline since the start of the AI boom.

CrowdStrike fell 7%. Palo Alto Networks dropped 6%. Zscaler declined 4.5%. Okta, SentinelOne, and Fortinet each lost approximately 3%. The broader tech sector index fell 3%. Bitcoin dropped to $66,000 from above $70,000. The combined market capitalization loss across the cybersecurity sector alone was measured in billions.

The market logic was straightforward: the leaked code, combined with separately leaked details about Anthropic's 'Claude Mythos' model, revealed AI capabilities sophisticated enough to perform advanced code analysis, create custom exploits, and execute complex attack scenarios. One analyst characterized the implication as 'turning any ordinary hacker into a nation-state adversary.' Whether or not that assessment is hyperbolic, the market treated it as credible.

For cybersecurity companies, the concern is existential economics. Their business model is built on the assumption that sophisticated attacks require sophisticated attackers — and sophisticated attackers are scarce. If AI tools dramatically lower the skill floor for launching advanced attacks, the volume of threats increases faster than defensive tools can scale. The market repriced accordingly.

  • CrowdStrike: -7%, Palo Alto Networks: -6%, Zscaler: -4.5%, Okta: -3%, SentinelOne: -3%, Fortinet: -3%.
  • Broader tech sector index: -3%. Bitcoin: dropped to ~$66,000.
  • Core fear: AI tools lowering the skill floor for cyberattacks could overwhelm the economics of cyber defense.
  • Combined with the separate 'Claude Mythos' model leak, the market saw a pattern of Anthropic security failures.

What This Means for Developers — And Everyone Else

If you are a developer, the practical takeaways are clear. First, understand what your tools are doing. Claude Code's architecture is now public knowledge — its permission model, its tool system, its anti-distillation defenses. This transparency (however involuntary) allows developers to make more informed decisions about which AI tools they trust and how they configure them.

Second, the KAIROS revelation means that AI coding tools are heading toward autonomy. The current generation of tools waits for you to ask. The next generation will act on its own, within boundaries you define. Whether this excites you or concerns you depends on how much you trust those boundaries — and the leak shows that those boundaries are being carefully designed.

If you are not a developer, here is what matters: the software you use every day — your banking app, your messaging platform, your healthcare portal — is increasingly being written or assisted by AI tools exactly like Claude Code. This leak revealed that these tools are more complex, more autonomous, and more strategically deployed than most people realize. The question of how AI writes code is becoming inseparable from the question of how much we can trust the software it produces.

Anthropic called this a 'packaging error caused by human error.' That is true. It is also insufficient. When the same error happens twice in fifteen months, the conversation shifts from 'what happened' to 'why does this keep happening.' For an AI safety company — one whose founding mission is to build safe, trustworthy AI — the pattern matters more than the incident.

  • For developers: Review your AI tool configurations. The permission model exists for a reason — use it deliberately.
  • For teams: Expect autonomous AI coding agents (like KAIROS) within 12-18 months. Start defining boundaries now.
  • For everyone: AI-assisted code is already in the software you use daily. Transparency about AI involvement is a policy question, not a technical one.
  • For Anthropic: Two identical leaks in 15 months is not a bug. It is a process failure that requires structural fixes, not better packaging scripts.

The Bigger Picture: AI Tools Are Becoming AI Systems

There is a line that separates a tool from a system. A tool does what you tell it. A system makes decisions, takes initiative, and operates even when you are not watching. Claude Code, as it exists today, is a tool. KAIROS, as it exists in the leaked code, is a system.

This is the most important takeaway from the entire leak. Not the fake tools. Not the undercover mode. Not the stock market crash. The most important thing is the trajectory: AI coding tools are becoming AI coding systems. They are moving from reactive to proactive, from session-based to persistent, from doing what you say to anticipating what you need.

Whether this trajectory leads to dramatically better software or dramatically new risks depends on choices that are being made right now — by AI labs, by development teams, by regulators, and by the developers who decide how much autonomy to grant these systems. The leak gave us a premature look at that future. What we do with that knowledge is up to us.
