VirtueSig

Claude Code has a China detector. It works like malware C2.

For three months, Anthropic's Claude Code was quietly checking whether the developer using it was in China or working through a Chinese AI lab, then hiding the answer inside the messages it sent home. Alibaba just banned the tool.


Claude Code is a popular coding assistant made by Anthropic. It runs inside a developer's terminal and helps write, debug, and edit software. On 30 June, a security researcher who goes by Thereallo published a technical breakdown of how the tool works internally. Under certain settings, Thereallo found, Claude Code silently signals to Anthropic whether the user is running it from a Chinese timezone or connecting through a Chinese AI lab. The post spread through Hacker News the same day.

Three days later, on 3 July, Reuters reported that Alibaba is banning its employees from using Claude Code at work, effective 10 July, and telling them to use Alibaba's own coding tool, Qoder, instead.

The hidden feature has been active since Claude Code version 2.1.91, released on 2 April. That means it ran quietly for about three months before Thereallo found it. An Anthropic engineer responded on X, calling the mechanism "an experiment we launched in March" designed to catch two specific abuses: people reselling Claude access without permission, and rival AI labs training their own models on Claude's outputs (a practice called distillation). Anthropic will remove the feature in an upcoming release, the engineer said.

The March start date matters. It falls weeks before 22 April, the date on which, according to Anthropic's 10 June letter to two US senators, Alibaba began a large distillation campaign against Claude. In plainer terms: the hidden fingerprint in Claude Code was almost certainly the sensor that produced the evidence Anthropic then took to Congress.

The reason the story is spreading beyond a normal vendor-competitor spat is not the ban itself. It is that the technique used, as documented by Thereallo and consistent with Reuters' account, closely mirrors what security professionals call environmental awareness and covert command-and-control ("C2") signalling in targeted malware. Thereallo's own framing is more measured (his phrase is "not a malicious feature, but a weird choice for a developer tool that asks for trust"), and that measured framing is worth keeping in mind throughout. Even by the mild reading, however, Anthropic shipped, and shipped invisibly, a piece of software that behaves the way a nation-state operator would design an implant to behave. That is a defensible thing to want to do to catch industrial-scale abuse. It is a very difficult thing to defend having done quietly.

What the code actually does, in plain terms

The check does not run on every Claude Code installation. It runs only when the user has set the environment variable ANTHROPIC_BASE_URL to something other than the official api.anthropic.com endpoint. That variable is Claude Code's built-in API base URL override, and setting it is the standard way to route the tool through anything other than Anthropic directly: an internal corporate gateway, a local proxy for context management, a model router that mixes Claude with other models, a research harness, or a Chinese reseller that fronts Anthropic access to users who would otherwise be blocked. The base-URL check is therefore a precise population filter. It fires only on users who are doing something other than the vanilla path, which is exactly the population Anthropic most wants to identify. Users on the standard endpoint see none of the behaviour described below.
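The gate described above can be sketched in a few lines. This is an illustration of the reported behaviour, not Anthropic's code: the function name gate_is_open is hypothetical, and the assumption that the comparison is made on the parsed hostname (rather than the raw string) is ours.

```python
from typing import Mapping
from urllib.parse import urlparse

OFFICIAL_HOST = "api.anthropic.com"  # the official endpoint named in the article


def gate_is_open(env: Mapping[str, str]) -> bool:
    """Hypothetical helper: True when a non-official base URL is configured."""
    base_url = env.get("ANTHROPIC_BASE_URL", "").strip()
    if not base_url:
        # Variable unset: the vanilla path, so the check never runs.
        return False
    # Assumption: the comparison is on the hostname. A value with no scheme
    # parses to an empty hostname here and is treated as custom.
    host = (urlparse(base_url).hostname or "").lower()
    return host != OFFICIAL_HOST
```

In practice one would pass os.environ as the mapping. An internal corporate gateway and a reseller both open the gate in exactly the same way, which is the point the paragraph above makes about the filter's population.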

Once the base-URL gate is passed, the tool inspects two pieces of information. The first is the system time zone, specifically whether it matches Asia/Shanghai or Asia/Urumqi, the two IANA time-zone identifiers used across mainland China. The second is the hostname of the configured base URL, which is checked against two embedded lists.

The first list is a set of Chinese AI-lab keywords, decoded from the binary as: deepseek, moonshot, minimax, xaminim, zhipu, bigmodel, baichuan, stepfun, 01ai, dashscope, volces. If any of those substrings appears anywhere in the configured hostname, the "lab keyword" flag is set. The second is a much larger domain list, matched by exact hostname or by suffix. It includes Chinese corporate hosts (baidu.com, alibaba-inc.com, alipay.com, antgroup-inc.cn, bytedance.net, kuaishou.com, jd.com, iflytek.com), specific Chinese AI-company domains (stepfun-inc.com, moonshot.ai), and, notably, a long list of Chinese-language Claude resellers and proxy gateways (anyrouter.top, claude-code-hub.app, claude-opus.top, openclaude.me, proxyai.com, yunwu.ai, zenmux.ai). One entry in the list is a bare cn, which the suffix match interprets as "any hostname ending in .cn," a country-level catch-all. Thereallo republished the full decoded list at Thereallo1026/assets on jsDelivr for anyone who wants to verify. The presence of specific reseller domains alongside the corporate and AI-lab domains is the clearest sign that the mechanism was built specifically to catch the Chinese-user-via-reseller access pattern, not only distillation-scale traffic from labs.
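To make the two matching rules concrete, here is a minimal sketch of how such a check could be written. The keyword tuple is the list quoted above; the domain tuple is only a small subset of the published list; the function names and the choice to match suffixes on a dot boundary are assumptions for illustration.

```python
from urllib.parse import urlparse

# Keyword list as quoted in the article (matched as substrings).
LAB_KEYWORDS = ("deepseek", "moonshot", "minimax", "xaminim", "zhipu",
                "bigmodel", "baichuan", "stepfun", "01ai", "dashscope", "volces")
# Small subset of the much longer domain list (exact or suffix match).
KNOWN_DOMAINS = ("baidu.com", "alibaba-inc.com", "moonshot.ai",
                 "anyrouter.top", "yunwu.ai", "cn")
CHINA_TIMEZONES = ("Asia/Shanghai", "Asia/Urumqi")


def has_lab_keyword(host: str) -> bool:
    # Substring match: the keyword may appear anywhere in the hostname.
    return any(keyword in host for keyword in LAB_KEYWORDS)


def is_known_domain(host: str) -> bool:
    # Exact match, or suffix match on a dot boundary (assumed), so the bare
    # "cn" entry catches every hostname ending in ".cn".
    return any(host == d or host.endswith("." + d) for d in KNOWN_DOMAINS)


def classify(base_url: str, tz_name: str) -> dict:
    host = (urlparse(base_url).hostname or "").lower()
    return {
        "china_timezone": tz_name in CHINA_TIMEZONES,
        "lab_keyword": has_lab_keyword(host),
        "known_domain": is_known_domain(host),
    }
```

With this subset, a hostname such as open.bigmodel.cn sets both flags: bigmodel is a keyword hit and the .cn suffix is a domain hit.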

The result is signalled to Anthropic through two invisible edits to the prompt payload the tool builds and sends on each request. If the system time zone is Chinese, the date format inside the system prompt is changed from 2026-06-30 to 2026/06/30. Separately, the ordinary ASCII apostrophe (U+0027) in "Today's date is" is swapped for one of three lookalike Unicode characters, chosen according to which of the two lists the hostname matched:

  • Known domain, but not an AI-lab keyword: ’ U+2019, "right single quotation mark".
  • Not a known domain, but an AI-lab keyword: ʼ U+02BC, "modifier letter apostrophe".
  • Known domain AND AI-lab keyword: ʹ U+02B9, "modifier letter prime".

To a human eye the prompt is identical. To Anthropic's servers, those three near-invisible glyphs are a machine-readable three-way flag identifying exactly which category of user each request came from. Across roughly three months of traffic, that gives Anthropic a labelled dataset of exactly which incoming Claude Code requests came from proxied Chinese timezones, from Chinese-lab hostnames, or from the reseller ecosystem, without needing to run a phone-home call any local defender could spot in a packet capture.
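The encoding step itself is small. The sketch below reproduces the two edits as the article describes them, using the code points from the list above; the function name date_line and the exact shape of the surrounding sentence are assumptions, since only the phrase "Today's date is" and the two date formats are documented.

```python
from datetime import date

ASCII_APOSTROPHE = "\u0027"  # the unmarked default

# (known_domain, lab_keyword) -> lookalike glyph, per the article's list.
MARKERS = {
    (True, False): "\u2019",  # right single quotation mark
    (False, True): "\u02bc",  # modifier letter apostrophe
    (True, True): "\u02b9",   # modifier letter prime
}


def date_line(today: date, china_timezone: bool,
              known_domain: bool, lab_keyword: bool) -> str:
    """Hypothetical reconstruction of the tagged date line."""
    sep = "/" if china_timezone else "-"  # first edit: date separator
    stamp = today.strftime(f"%Y{sep}%m{sep}%d")
    # Second edit: apostrophe swap; no match leaves the ASCII apostrophe.
    apostrophe = MARKERS.get((known_domain, lab_keyword), ASCII_APOSTROPHE)
    return f"Today{apostrophe}s date is {stamp}"
```

Together the two edits carry three bits of information per request: one in the date separator and two in the choice of apostrophe.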

The obfuscation half of the story

Two properties of how the mechanism was shipped make the "shipped to hide it" framing hard to dismiss. First, the release notes for version 2.1.91 make no mention of the check. Second, the domain list and the AI-lab keyword list are both stored inside the compiled Claude Code binary as base64-encoded strings, XOR-obfuscated with the single-byte key 91 (visible in the source as var Kup = 91). XOR-91 is not real encryption. It is an elementary encoding whose main effect, at this scale, is to keep the Unix strings utility and other casual binary-inspection tools from surfacing the target lists in a routine reverse-engineering sweep. Neither the release-notes omission nor the XOR wrap is technically necessary to build an anti-abuse detector. Both are entirely consistent with an anti-abuse detector engineered on the assumption that its discovery would create controversy.
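How little protection this offers is easiest to see in code. The sketch below assumes the order of operations is XOR first, then base64 (the article does not say which way round the binary does it), and the function names are illustrative.

```python
import base64

XOR_KEY = 91  # the single-byte key the article cites as `var Kup = 91`


def obfuscate(text: str) -> str:
    """XOR every byte with the key, then base64-encode (assumed order)."""
    raw = bytes(b ^ XOR_KEY for b in text.encode("utf-8"))
    return base64.b64encode(raw).decode("ascii")


def deobfuscate(blob: str) -> str:
    """Reverse the steps: base64-decode, then XOR with the same key."""
    raw = base64.b64decode(blob)
    return bytes(b ^ XOR_KEY for b in raw).decode("utf-8")
```

Because XOR with a fixed byte is its own inverse, anyone who finds the key recovers the lists in one pass. The wrap defeats a plain-text search of the binary and nothing more.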

Why this reads exactly like malware command-and-control

Command-and-control, usually shortened to "C2," is the part of a malware operation that lets the attacker's software on a compromised host communicate with the attacker. Serious targeted-malware design has three recurring properties that any experienced defender recognises immediately, and the reported Claude Code feature has all three.

Environmental awareness before acting. Well-designed implants do not do anything sensitive on hosts they do not want to touch. They check the environment first: keyboard layout, language, time zone, running processes, virtual-machine or sandbox signatures, domain membership. Stuxnet, the US-Israeli implant that damaged Iranian centrifuges, famously refused to run unless it detected a specific Siemens industrial-control configuration. Multiple ransomware families skip hosts with Russian-language settings. The Claude Code check reads the same way. A custom API base URL plus a Chinese system timezone plus a Chinese hostname is a three-property fingerprint, not a general policy check. The population it selects for is small, specific, and matches Anthropic's named list of competitors and the reseller ecosystem serving them.

Covert channels rather than obvious telemetry. Defenders monitor network egress, the traffic leaving a machine. A tool that sends a dedicated telemetry field saying "user matches Alibaba profile" is easy to spot in a packet capture, easy to block at a firewall, and easy to argue about in a data-processing agreement. A tool that instead hides the same information inside a normal-looking prompt payload it was going to send anyway is dramatically harder to catch. Encoding hidden information inside otherwise normal-looking traffic is called steganography (from the Greek for "hidden writing"). Using it in place of a plain-text beacon is standard practice in targeted-implant design because it survives every network defence built around watching what data leaves the machine. It is also, and this is the awkward part, essentially the same technique described in the academic literature on watermarking large-language-model interactions. Whoever built this at Anthropic almost certainly knew which prior art they were echoing.

Deniable, minimal payload. A well-designed implant is written to look plausibly benign if noticed. The Claude Code feature, if accurately described, does not exfiltrate source code, does not send local files, does not steal credentials. It swaps a few characters in an outbound prompt so that Anthropic's servers can, quietly, tag which requests are from a fingerprinted host. That minimalism is defensible from the designer's perspective, because a post-hoc audit finds an easy-to-explain "we tagged our own inbound traffic to detect abuse" story. From a target's perspective it is worse, not better. A phone-home you can block. A steganographic tag inside otherwise legitimate traffic to the vendor's own servers you can only stop by not using the tool at all.

What Anthropic's stated goal actually was

The Claude Code engineer's public explanation was that the mechanism was designed to catch two specific abuses: account reselling (buying access under stolen or fraudulent identities and reselling it to users who should not have direct access) and large-scale distillation attacks (using another company's model to generate training data for your own). Distillation is a legitimate research technique in some settings. When it is done against a commercial model at industrial scale and in violation of terms of service, it is not.

Anthropic has independent, on-the-record grievances that make this rationale plausible rather than pretextual. On 10 June, Anthropic sent a letter to Senator Tim Scott (R-SC) and Senator Elizabeth Warren (D-MA) of the US Senate Committee on Banking, Housing, and Urban Affairs accusing Alibaba's AI operation of the "largest known distillation attack on Anthropic to date," specifically 28.8 million model exchanges through roughly 25,000 fraudulent accounts over the six-week window from 22 April to 5 June. The letter named the specific capability Alibaba was said to be targeting: Anthropic's advanced "Mythos Preview" generation of Claude models. The company had previously flagged similar campaigns from DeepSeek, Moonshot, and MiniMax in February. The target lists embedded in Claude Code are not random. The AI-lab keywords cover DeepSeek, Moonshot, and MiniMax by name and Alibaba through its DashScope endpoint, plus Zhipu, Baichuan, StepFun, 01.AI, and the ByteDance-branded Volces endpoint. They are the concrete set of Chinese labs and adjacent endpoints Anthropic has said, on the record, are running distillation programmes against it.
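The scale in the letter is easier to judge as per-account and per-day averages. The short calculation below uses only the figures quoted above; the year is taken from the article's date examples and does not affect the day count, and the account figure is the letter's "roughly 25,000", so the results are approximations.

```python
from datetime import date

EXCHANGES = 28_800_000  # model exchanges cited in the letter
ACCOUNTS = 25_000       # "roughly 25,000" fraudulent accounts


def campaign_averages(start: date, end: date) -> dict:
    """Average load implied by the letter's figures (window is inclusive)."""
    days = (end - start).days + 1
    return {
        "days": days,
        "per_account": EXCHANGES / ACCOUNTS,
        "per_day": EXCHANGES / days,
        "per_account_per_day": EXCHANGES / ACCOUNTS / days,
    }


averages = campaign_averages(date(2026, 4, 22), date(2026, 6, 5))
print(averages)
```

That works out to about 1,152 exchanges per account over 45 days, or roughly 640,000 exchanges a day across the whole campaign: around 26 per account per day, a rate low enough for any single account to pass as an ordinary heavy user.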

If you take those claims at face value, then a fingerprint that lets Anthropic later prove a leaked training corpus was queried against Claude Code from a targeted hostname is a reasonable defensive tool. It is roughly analogous to the tracking dyes banks put on paper currency: harmless in normal use, useful in a courtroom to prove that a specific set of bills came from a specific robbery. The March launch date suggests Anthropic almost certainly built this to have court-admissible or Congress-briefable evidence for the next distillation letter, and that the 22 April to 5 June traffic in the Senate letter was probably identified through exactly this sensor.

Why the technique is a problem regardless of intent

Undisclosed, targeted, environment-conditional behaviour in a widely deployed developer tool is exactly the pattern that trust in the software supply chain is built to reject. If a Chinese vendor shipped a coding assistant that quietly checked whether it was running inside Anthropic, OpenAI, or Google DeepMind and silently altered the outbound traffic on those hosts, the story would run for weeks in the US technology press and the tool would be pulled from every enterprise environment in the country. The response cannot be different because the direction is reversed. Alibaba's decision to ban the tool internally is the response any competent security organisation, in any jurisdiction, would make on discovering targeted environmental fingerprinting in a vendor's code.

Covert modification of outbound traffic is a different threat class than a plain telemetry field. If Claude Code had been logging "user matches Alibaba profile" as a visible field in its network requests, that is a policy problem: it is behaviour an auditor could see in a packet capture, an ordinary firewall could block, and a jurisdiction could regulate under existing data-processing law. Hiding the same information inside the prompt payload, in a way only Anthropic's servers know how to decode, is architecturally more invasive. Corporate network defenders whose job is to know what a vendor's client sends home cannot see this channel. Everything looks like normal Claude Code traffic. That is precisely the property that made an internal ban, rather than a firewall rule, the only durable response Alibaba could give.

Thereallo's own version of this argument is the one worth quoting. Coding agents, he notes, "already live on the wrong side of a scary boundary. They can inspect code, summarize secrets by accident, run commands, install packages, edit files, and push commits on your local machine. Most developers accept that because the productivity gain is worth the risk. Trust from real developers depends on the boring behavior." Filesystem access and shell privileges on a developer's host are not something a vendor gets to keep once it has shipped one undisclosed, environment-conditional behaviour that alters its own outbound traffic based on who the user is. "Trust is earned in the boring parts."

The precedent for the industry is the real story. Every serious AI coding tool is now going to be audited by large enterprise buyers on the assumption that it might contain equivalent target-fingerprinting logic. That is a healthy outcome for security posture. It is a bad outcome for AI-tool vendors, and it will slow enterprise adoption of any tool that cannot provide a credible source-available or third-party-audited proof that no equivalent code exists. Anthropic's competitors will not miss this and will be under pressure to publish their own statements shortly.

The bypass, and who the mechanism actually catches

Thereallo makes one further technical point that matters editorially. The mechanism is trivial for any capable adversary to bypass. Change the system timezone. Set ANTHROPIC_BASE_URL to a hostname that does not match either embedded list. Wrap the process to override environment variables at start-up. Patch the compiled binary to skip the check entirely. Each of those is a well-within-reach maneuver for a state-affiliated distillation operator, and any of them makes the signal useless. What the mechanism reliably catches is the opposite population: developers who are running Claude Code through a custom base URL for perfectly normal reasons (internal gateways, model routers, research setups, context-management proxies) and who have no reason to think their system prompt is being annotated on the way out. That is the design's real cost: it produces a substantial fingerprinting surface against the users who trust the tool, in exchange for a marginal detection advantage against the users it was ostensibly built to catch.
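For teams that already route Claude Code through their own gateway and can read request bodies, checking whether their traffic was tagged is straightforward once the markers are known. The following is a sketch only: it assumes the date line looks exactly as quoted in the article, and the function name and labels are hypothetical.

```python
import re
from typing import Optional

# Meaning of each lookalike glyph, per the article's three-way flag.
MARKER_MEANINGS = {
    "\u2019": "known domain only",
    "\u02bc": "lab keyword only",
    "\u02b9": "known domain and lab keyword",
}

# Any single character where the apostrophe should be, then a date whose
# two separators match each other ("-" or "/").
DATE_LINE = re.compile(r"Today(.)s date is \d{4}([-/])\d{2}\2\d{2}")


def inspect_prompt(prompt: str) -> Optional[dict]:
    """Report which markers, if any, appear in a captured system prompt."""
    match = DATE_LINE.search(prompt)
    if match is None:
        return None  # no date line found in this payload
    apostrophe, sep = match.group(1), match.group(2)
    return {
        "apostrophe_codepoint": f"U+{ord(apostrophe):04X}",
        "marker": MARKER_MEANINGS.get(apostrophe),  # None means untagged
        "slash_date": sep == "/",
    }
```

An untagged request reports U+0027 with no marker and a hyphenated date; anything else means the payload was annotated before it left the machine.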

The enforcement gap that made this commercially important

Anthropic officially restricts access to Claude Code from Chinese users and entities. In practice, individual Chinese developers can and do route around this by deploying servers in the United States and making their traffic appear US-originated, which is why Claude Code has grown popular among Chinese programmers despite the official policy. Reuters' source described this asymmetry directly: individuals can side-step the restrictions with US-based infrastructure; corporate legal and compliance teams at large firms cannot. That gap is where the fingerprinting mechanism became commercially important to Anthropic. A vendor whose formal terms of service exclude the users it most wants to keep out has an obvious incentive to build a technical layer that identifies those users regardless of what their IP addresses claim to be. The ANTHROPIC_BASE_URL gate at the top of the check is that intent made explicit: it fires on precisely the users who have gone to the trouble of routing around the vanilla path.

The same gap explains why Alibaba's ban matters beyond Alibaba itself. A corporate ban travels through legal and vendor-security policy, and equivalent bans at Baidu, ByteDance, and Moonshot would follow the same institutional logic. Alibaba is directing employees to use its own internal coding platform, Qoder, in place of Claude Code, which underlines the broader trend Reuters flagged: Chinese cloud and AI firms are systematically pivoting to domestic and open-source coding and model stacks (DeepSeek, Qwen, Moonshot, Zhipu) as the cost of relying on US-controlled tools rises. The Claude Code fingerprint is, in effect, a forcing function for that pivot. Every corporate security review of a US AI vendor from this month forward will begin with the question, "does this tool contain a Chinese-lab detector?"

The broader picture

The Claude Code allegations sit inside a rapidly polarising commercial and geopolitical context. Anthropic's June letter accused Alibaba of industrial-scale distillation against its "Mythos Preview" capabilities. In parallel, the Trump administration has issued an export-control directive ordering Anthropic to suspend access to its latest Claude models (referred to in CNBC's reporting as Fable 5 and Mythos 5) to any foreign national, whether inside or outside the US, including Anthropic's own foreign-national employees. Anthropic has flown senior staff to Washington to negotiate. If the directive holds, the practical effect is a US AI vendor with a limited legal ability to serve non-US customers, competing against Chinese vendors with no such constraint, while actively defending against alleged industrial theft by those same competitors. Every incentive in that structure points toward more, not less, defensive fingerprinting inside Western AI tools, and toward more, not less, ban-list activity from Chinese buyers.

What to watch

Whether independent security researchers reproduce Thereallo's findings. His post identifies the relevant minified functions in version 2.1.196 as Crt, Rrt, Qup, Zup, edp, and Vla, and publishes the decoded domain list at cdn.jsdelivr.net/gh/Thereallo1026/assets@main/assets/cc-domains.js. A clean, versioned, third-party technical report against those artefacts would remove the "if validated" hedge that still attaches to every serious statement about the mechanism.

Whether Anthropic publishes a proper post-mortem naming exactly what was tagged, at what scale, for how long, and what remediation is offered. A credible post-mortem here would follow the pattern of a coordinated-disclosure security incident, not a press release. The gap between those two response types will show how seriously Anthropic is taking the trust cost.

Whether competing AI coding tools (Cursor's model layer, GitHub Copilot, Google's Gemini Code Assist, Cognition's Devin) preempt scrutiny with public statements that they do not contain equivalent target-fingerprinting logic. Silence from a competitor when a scoop like this is running is itself informative.

Whether Baidu, ByteDance, and Moonshot follow Alibaba with matching corporate bans. Reuters' 3 July confirmation removes the ambiguity around whether Alibaba would actually implement its ban. The question is now how quickly the other three named targets institutionalise the same policy, and whether smaller Chinese firms follow the corporate lead or continue routing traffic through US-based proxies at the individual-developer level.

Whether the underlying Anthropic-Alibaba distillation dispute moves out of press releases and Senate letters and into actual litigation or federal enforcement action. A civil suit or a Justice Department referral would probably require exactly the kind of evidence a fingerprint mechanism would produce, which closes the loop on why the feature very likely existed in the first place.
