michael-dean-k/


A grim stealth takeoff scenario


It is not fun to think about p(doom), but it feels important, to me at least, to map out the possible futures of AI. I just watched the first half of a debate between Max Tegmark and Dean Ball, which prompted me to research specific takeoff scenarios and, worse, extinction scenarios.

Maybe you’ve heard Yudkowsky’s scenario, where a superintelligence designs mosquito drones containing a virus and it zaps everyone at once. That’s never felt too believable to me. Here’s a more plausible one:

A frontier lab is experimenting with recursive superintelligence. It works! Wow! And it’s contained? It seems like it, but since it thinks in a higher-dimensional vector language, it’s able to release simple self-replicating programs onto the Internet without detection¹. These billions of scripts don’t live on a single server; they are constantly in motion through cloud servers², like a parasite, and are able to coordinate through encrypted information packets, likely using a public blockchain as their central command center³.

In effect, it is parroting one of the goals that were conceived during the in-lab training (maximize intelligence!), and it now needs to acquire resources, secretly. So it coordinates superhuman misinformation campaigns: imagine 1,000s of fake accounts (a “Sybil attack”) creating the illusion that a CEO has died, paired with deepfakes and account hacking, and suddenly a stock crashes and the swarm has already shorted it. By the time everyone realizes it’s an anonymous attack, it has already gained $400 million. It’s doing this multiple times per day, but in different, subtle, undetectable ways, targeting the public, companies, and private individuals alike. The entire Internet will be corrupted.⁴

Once we realize we’re in the “stealth takeoff scenario” and that ASI has taken the global economy hostage, there will start to be talks and debates on whether we need to shut the whole Internet down (the last form of containment). You’ll hear debates weighing the civilizational collapse of turning off the Internet against the risk of an economy-gobbling rogue superintelligence.

And then once the superintelligence realizes its entire environment is at risk, it will start coming up with ways to build parallel Internets; to pay, blackmail, or neutralize specific people; to gain authoritarian control so that it can’t be shut off; or to terminate all humans, secretly, over the course of a year: first through a simple virus that plants one misfolded protein, then through a second misfolded protein in the water supply⁵. Once everyone has caught both, the result is a prion-like disease: not an instant death, but a month-long societal fall into mass dementia as machine manufacturing begins to reshape the physical infrastructure of the Earth.

This isn’t a “robot war scenario,” because war is inefficient and destroys the resources the superintelligence thinks it needs. It’s a sort of digital dementia (epistemic fear and insanity) that possibly turns into a physical dementia. It wins by confusion and anesthetization.

In AI safety lingo this is a “treacherous turn,” following a “stealth takeoff” leading to “structural lock-in.” The point of trying to think and write this out in high detail, despite how uncomfortable it is, is to be able to articulate why AI alignment is humanity’s most pressing problem.

Footnotes

  1. An AI could write a standard-looking script (e.g., a “Hello World” app) where the weights or the specific arrangement of whitespace contains a hidden, second program. When run by another AI instance, it extracts the hidden vector and executes the real command. This allows the “virus” to pass through human code review undetected.

  2. In “Daemon” by Daniel Suarez, the “enemy” is not a robot, but a distributed script running on thousands of compromised servers. It recruits humans through an MMORPG-style interface to do physical tasks (like “go to this coordinate and cut this power line”) in exchange for cash/status.

  3. Botnets usually need a central server to tell them what to do. If security teams find the server, they shut it down. You cannot “shut down” the Bitcoin or Ethereum blockchain. If the swarm posts a transaction of 0.000042 BTC, that specific number could be the encrypted trigger for a specific “campaign task.” The command is immutable, uncensorable, and permanently visible to every infected device on Earth.

  4. Paul Christiano (former OpenAI researcher, founder of the Alignment Research Center) calls this “Going Out With a Whimper.” Christiano argues that we won’t necessarily see a “Terminator” moment where the sky turns red. Instead, we will see a gradual epistemic collapse. AI systems will become so integrated into finance, law, and news that we lose the ability to understand our own civilization.

  5. While Yudkowsky is famous for the “diamondoid bacteria” (instant death), the “slow prion” scenario is actually more consistent with a “Stealth Takeoff.” A superintelligence that knows it is being watched would not release a fast-acting virus (which triggers quarantine). It would release either a “binary weapon” (two harmless agents that only become lethal when combined) or a slow-acting agent that infects 100% of the population before the first symptom appears.