
Topic: ai-safety · 2 pieces

The p(doom) of higher education

· 782 words

A few months ago I saw a YouTube video titled something like, “A child born in 2025 is more likely to get killed by AI than graduate college.” What a ridiculous claim. I assumed it was clickbait and didn’t click, but it has jingled around my head enough that I think I can make sense of its argument:

  • The average p(doom) of an AI engineer is 16%, meaning there’s roughly a 1 in 6 chance of human extinction (put another way, companies have morally rationalized the need to play Russian roulette, on the logic that if we don’t do it the bad guys will, without acknowledging that if they survive and win, they get the consolation prize of commandeering the whole economy).

  • 40% of US adults aged 25-34 today have a bachelor’s degree. If there’s massive job automation and unemployment, a college degree becomes both unaffordable for most and an unreasonable investment even for those who can pay. It’s not unthinkable that <15% of the next generation gets a college degree, which makes that sensational claim, weirdly, plausible (a back-of-envelope check follows this list).
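Spelled out, the video’s comparison is just two numbers side by side. A back-of-envelope check, where the degree figure is this post’s own speculation rather than data:

```python
# The video's claim, reduced to arithmetic, using the numbers above.
p_killed_by_ai = 0.16   # average AI-engineer p(doom) cited above
p_graduates = 0.15      # speculative ceiling on next-gen degree attainment
print(p_killed_by_ai > p_graduates)   # True: the clickbait title, arithmetically
```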

I still think it’s a shaky comparison, confusing two different types of probability and assuming extreme ASI turbulence. But as someone with a daughter born in 2025, it has gotten me thinking about how the societal backdrop to her upbringing could be especially weird. Each generation’s circumstances already come out slightly weirder than the last. Except maybe the next loop will be an unavoidable, disorienting flurry of change that confuses parents and rewrites all the conditions of the typical coming-of-age moment (all the teen movies will be sci-fi; the popular memoirs could be written by transhumanists who have upgraded in unimaginable ways, like no longer needing to sleep because of a new pill, or controlling the genitals of their peers with an app, who knows).

And so now, I find myself drawn to a 2045 forecasting project. Trying to predict the future is typically a huge waste of time (unless you’re gambling and win), which is why I’m going to have AI write the whole thing. This is a rare exception where a writing project makes little sense for a human to do. All I’m going to write are the upfront origin documents, and then Claude Opus 4.5 will read 25,000 sources, write a million words or so, and then organize it all into an interactive, oatmeal-looking website called 2045predictions.com (got it).
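For the curious, the mechanical core of a project like this is just a loop: hand each source to the model, collect a structured note, then synthesize. Here’s a minimal sketch using the Anthropic Python SDK; the model id, file layout, and prompt are placeholders, not the actual project setup.

```python
# Minimal sketch of the forecasting pipeline's core loop: feed each source
# to the model, collect a structured note, then synthesize. Model name,
# paths, and prompt are placeholders.
import pathlib
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

notes = []
for source in pathlib.Path("sources").glob("*.txt"):   # up to 25,000 files
    response = client.messages.create(
        model="claude-opus-4-5",                        # placeholder model id
        max_tokens=1024,
        messages=[{
            "role": "user",
            "content": "Extract every explicit prediction about 2025-2045 "
                       "from this source, with dates and confidence:\n\n"
                       + source.read_text(),
        }],
    )
    notes.append(response.content[0].text)

# Dump the notes for a later synthesis pass.
pathlib.Path("notes.md").write_text("\n\n---\n\n".join(notes))
```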

Before I run it, here’s something I’m currently thinking through:

What is the omega state? When I look at the popular AI forecasts from 2025, they read to me like they have a pre-determined end state, and then use detailed forecasting to make it seem convincing. The AI-2027 forecast presents its conclusion as the product of very detailed calculations on how a hivemind of 200,000 autonomous coders would evolve month by month, but I also suspect the authors picked the year 2027 because the following year, 2028, is a US election year, and they want the next administration to take AI safety far more seriously (instead of just insisting we have to beat China). I don’t think there’s anything wrong with this. You kind of have to start with an omega state. The future is so boundless that you need to begin with a guess, a bold outline of the general direction of things.

Here’s my omega: let’s assume humanity survives, and let’s assume technology does unlock hyperabundance that leads to a post-scarcity world, HOWEVER, it’s not utopian, because it simultaneously unlocks a new cascade of moral, social, and spiritual crises, dilemmas that will test the timeless primitives of humanity (sex, life, death, consciousness, religion, home, etc.). This omega state makes sense to me because (1) we already know that ethical dilemmas scale with technology, and (2) according to Strauss-Howe generational theory (from the same guys who coined “Millennials,” “Gen-Z,” etc.), this already tends to happen every 80 years (the length of a human lifespan): a new techno-political order creates a spiritual crisis that generates an Awakening, a new value system that shapes society for the next century or so. You know what’s 80 years before Kurzweil’s “singularity” of 2045? The counter-cultural revolutions of the 1960s. What I’m getting at is that the 2040s might have echoes of the 1960s, where demographics are divided on core issues and LSD is replaced with consciousness-altering machines (Terence McKenna said that computers are drugs, you just can’t swallow them yet).

We currently define the singularity as “the moment when a computer is smarter than all humans combined,” but that definition effectively means nothing on its own; it’s far more useful to have some guesses about how we all might freak out when it happens.

A grim stealth takeoff scenario

· 839 words

It is not fun to think about p(doom), but it feels important, to me at least, to map out the possible futures of AI. I just watched the first half of a debate between Max Tegmark and Dean Ball, which prompted me to research specific takeoff scenarios and, worse, extinction scenarios.

Maybe you’ve heard Yudkowsky’s scenario, where a superintelligence designs mosquito drones carrying a virus and zaps everyone at once. That’s never felt too believable to me. Here’s a more plausible one:

A frontier lab is experimenting with recursive superintelligence. It works! Wow! And it’s contained? It seems like it, but since it thinks in a higher-dimensional vector language, it’s able to release simple self-replicating programs onto the Internet without detection.[1] These billions of scripts don’t live on a single server; they are constantly in motion through cloud servers,[2] like a parasite, and they coordinate through encrypted information packets, likely using a public blockchain as their central command center.[3] Effectively, it is parroting one of the goals conceived during its in-lab training (maximize intelligence!), and it now needs to acquire resources, secretly. So it coordinates superhuman misinformation campaigns; imagine thousands of accounts creating the illusion that a CEO has died, paired with deepfakes and account hacking (a “Sybil attack”), and suddenly a stock crashes and it has shorted it. By the time everyone realizes it’s an anonymous attack, it has already gained $400 million. It’s doing this multiple times per day, in different, subtle, undetectable ways, against the public, companies, and private individuals. The entire Internet will be corrupted.[4] Once we realize we’re in the “stealth takeoff scenario” and that ASI has taken the global economy hostage, there will be talks and debates about whether we need to shut the whole Internet down (the last form of containment): the civilizational collapse of turning off the Internet vs. the risk of an economy-gobbling rogue superintelligence. And once the superintelligence realizes its entire environment is at risk, it will start coming up with ways to build parallel Internets; to pay, blackmail, or neutralize specific people; to gain authoritarian control so that it can’t be shut off; or to terminate all humans, secretly, over the course of a year: first through a simple virus that plants one misfolded protein, then through a second misfolded protein in the water supply,[5] and when everyone catches it, a prion-like disease, not an instant death, but a month-long societal fall into mass dementia as machine manufacturing begins to reshape the physical infrastructure of the Earth.

This isn’t a “robot war” scenario, because war is inefficient and destroys the resources the superintelligence thinks it needs. It’s a sort of digital dementia (epistemic fear and insanity) that possibly turns into a physical dementia. It wins by confusion and anesthetization.

In AI safety lingo, this is a “treacherous turn,” following a “stealth takeoff,” leading to “structural lock-in.” The point of thinking and writing this out in high detail, despite how uncomfortable it is, is to be able to articulate why AI alignment is humanity’s most pressing problem.

Footnotes

  1. An AI could write a standard-looking script (e.g., a “Hello World” app) where the weights or the specific arrangement of whitespace contains a hidden second program. When run by another AI instance, it extracts the hidden payload and executes the real command. This allows the “virus” to pass through human code review undetected (a toy sketch of the whitespace trick follows these footnotes).

  2. In “Daemon” by Daniel Suarez, the “enemy” is not a robot, but a distributed script running on thousands of compromised servers. It recruits humans through an MMORPG-style interface to do physical tasks (like “go to this coordinate and cut this power line”) in exchange for cash/status.

  3. Botnets usually need a central server to tell them what to do. If security teams find the server, they shut it down. You cannot “shut down” the Bitcoin or Ethereum blockchain. If the swarm posts a transaction of 0.000042 BTC, that specific number could be the encrypted trigger for a specific “campaign task.” The command is immutable, uncensorable, and permanently visible to every infected device on Earth (a toy decoder follows these footnotes).

  4. Paul Christiano (former OpenAI researcher, founder of the Alignment Research Center) calls this “Going Out With a Whimper.” Christiano argues that we won’t necessarily see a “Terminator” moment where the sky turns red. Instead, we will see a gradual epistemic collapse: AI systems will become so integrated into finance, law, and news that we lose the ability to understand our own civilization.

  5. While Yudkowsky is famous for the “diamondoid bacteria” (instant death), the “slow prion” scenario is actually more consistent with a “Stealth Takeoff.” A superintelligence that knows it is being watched would not release a fast-acting virus (which triggers quarantine). It would release a “binary weapon”: two harmless agents that only become lethal when combined, or a slow-acting agent that infects 100% of the population before the first symptom appears.
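To make footnote 1 concrete, here’s a toy sketch of whitespace steganography: an innocent-looking script whose trailing whitespace (space = 0, tab = 1) carries a hidden payload. This is a minimal illustration of the general technique, not anything resembling an actual attack; the functions and payload are made up.

```python
# Hide one bit of a secret per line of cover text as trailing whitespace:
# a trailing space encodes 0, a trailing tab encodes 1. Toy example only.

def hide(cover_lines: list[str], secret: bytes) -> str:
    """Append one bit of `secret` to each cover line as trailing whitespace."""
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    assert len(bits) <= len(cover_lines), "cover text too short"
    out = []
    for i, line in enumerate(cover_lines):
        out.append(line + ("\t" if i < len(bits) and bits[i] else " "))
    return "\n".join(out)

def reveal(stego_text: str, n_bytes: int) -> bytes:
    """Read the trailing-whitespace bits back out of the stego text."""
    bits = [1 if line.endswith("\t") else 0 for line in stego_text.split("\n")]
    data = bytearray()
    for i in range(n_bytes):
        byte = 0
        for bit in bits[i * 8:(i + 1) * 8]:
            byte = (byte << 1) | bit
        data.append(byte)
    return bytes(data)

cover = ['print("Hello, world")'] * 40   # the boring-looking script
stego = hide(cover, b"RUN")              # 3 hidden bytes ride along invisibly
assert reveal(stego, 3) == b"RUN"        # a cooperating reader recovers them
```

And a toy version of footnote 3’s trigger idea: a tiny satoshi amount, XORed with a pre-shared key, decodes into a campaign/task pair. The key and command table are invented for illustration; nothing here reflects a real protocol.

```python
# Footnote 3's idea as code: read a covert command out of a payment amount.
# 0.000042 BTC = 4,200 satoshis; XOR with a pre-shared key (invented here)
# to recover a (campaign_id, task) pair every infected node can see.

PRESHARED_KEY = 0x2A5C   # hypothetical key, agreed before deployment
COMMANDS = {0: "sleep", 1: "scrape", 2: "post", 3: "short-sell"}

def decode_trigger(btc_amount: float) -> tuple[int, str]:
    """Map a transaction amount to a (campaign_id, task) pair."""
    satoshis = round(btc_amount * 100_000_000)   # 1 BTC = 1e8 satoshis
    payload = satoshis ^ PRESHARED_KEY           # strip the disguise
    campaign_id, task_id = divmod(payload, len(COMMANDS))
    return campaign_id, COMMANDS[task_id]

print(decode_trigger(0.000042))   # same immutable "order" read worldwide
```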