
Topic: eschatology · 5 pieces

Bubble Bill

· 153 words

A fiction plot came to me in the car: an ASI constructs an airtight, waterproof bubble around a town, and everyone is puzzled why, until it suddenly ushers in a Biblical flood that kills everyone in the world except the people inside the bubble. The ASI chose this town because someone inside it was determined to be "the supreme human," a genetic and moral code exemplary of how all humans should be and live. It turns out it was just a regular guy who said "please" and "thank you" to his chatbots, a kind of "reverse sycophant." We find out, in a very Vince Vaughn-esque apocalyptic romcom, that he's a mediocre, fallible guy, but more remarkably, also immune to the crooning and praise from both his neighbors and his overlords. He has every opportunity to step into the role of messiah, but would really rather not, preferring instead to continue his pre-flood existence.

An Intelligence Framework

· 703 words

The AI takeoff hysteria is hard to avoid these days, and I'm realizing we don't have clear distinctions between AGI and ASI. I wanted to revisit an old framework of mine to see if anyone finds it helpful (and if it's worth developing). There are some existing classification frameworks, but they're low-resolution. My basic idea is to break AI into three eras: ANI (narrow intelligence), AGI (general intelligence), ASI (superintelligence). Then you can break each era into three tiers. You only shift from one tier to the next when you make breakthroughs across several criteria (let's say: (a) generality, (b) transfer, (c) autonomy, (d) learning, (e) self-modeling). I think the hype of the last few weeks is all of us collectively realizing we're shifting from AGI-1 to AGI-2. It's exciting/scary, but I think the paranoia mostly comes from not realizing how big the gap is between AGI-2 and ASI-1. (Spoiler: ASI might arrive slower than we think.)

ANI-1 is scripted logic, the lowest form of "artificial intelligence," basically Goombas. ANI-2 might cover Google Maps or AlphaGo, intelligences that excel in a single function, traffic or chess. Siri is ANI-3; even though it feels broad, it really just uses voice to route you to 20 or so pre-defined tricks. The chasm between Goomba and Siri is similar to the chasm between early AGI and late AGI. ChatGPT, and the multi-modal models that followed, capture AGI-1: a single neural network that can do basically anything (even if it sucks): essays, songs, video, code. The newest models (and their agentic harnesses) feel like AGI-2. They're significantly better at coding, can run for hours at a time, and are starting to make contributions to machine learning itself.
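Here's a minimal sketch of the framework as data, just to make the structure concrete. The era/tier labels and the five criteria come from the framework above; the example strings and the "3 of 5 criteria" threshold in `shift_tier` are my own illustrative assumptions, not part of the original:

```python
from dataclasses import dataclass

# The five criteria named above; the "3 of 5" threshold below is an
# illustrative assumption, not something the framework specifies.
CRITERIA = {"generality", "transfer", "autonomy", "learning", "self-modeling"}

@dataclass
class Tier:
    era: str      # "ANI", "AGI", or "ASI"
    level: int    # 1-3 within the era
    example: str  # systems named in the piece (paraphrased)

LADDER = [
    Tier("ANI", 1, "scripted logic (Goombas)"),
    Tier("ANI", 2, "single-function excellence (Google Maps, AlphaGo)"),
    Tier("ANI", 3, "voice routing to ~20 pre-defined tricks (Siri)"),
    Tier("AGI", 1, "one network that does everything, badly (early ChatGPT)"),
    Tier("AGI", 2, "agentic models that code for hours (current frontier)"),
    Tier("AGI", 3, "'human complete': master of all human domains"),
]

def shift_tier(breakthroughs: set) -> bool:
    """You only move up a tier with breakthroughs across several criteria."""
    return len(breakthroughs & CRITERIA) >= 3
```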

AGI-2 could last a couple of years. As agentic AI matures, I'm sure there will be a few "takeoff" scares, but they'll probably feel more like a flood of a trillion midwits than real ASI (still, that could be enough to break the economy/internet). While we went from AGI-1 to AGI-2 through data, scale, and engineering, it seems like we'll need research breakthroughs to get to AGI-3; it won't happen through scaling alone. Whenever and however we get to "human complete" intelligence, the apex of AGI is a single agent that has mastered all human domains, a Nobel Prize winner in every field at once, seamlessly transferring knowledge between fields and unlocking a cascade of civilization-altering inventions.

As crazy as AGI-3 could be, it still isn't superintelligence. That has its own era, and the chasm between early ASI and late ASI will be as big as the gap between the chatbots that can't count the R's in "strawberry" and the agents that cure cancer. We can only really speculate on ASI (because it would be truly alien), but we can imagine it as step changes in recursion, scope, and complexity. Imagine ASI-1 as an agent that, as it's working, can infer its own limits and self-modify its learning paradigms in ways we can't understand. Imagine ASI-3 as something that can monitor reality in real time and reconfigure its hardware on the fly (some hydra of graphics cards, quantum computers, and neuromorphic wetware) to run simulations at unfathomable scales in unimaginable fields, on a hardware stack so big we have to put it in space and run it on fusion. This goes far beyond my ability to not bullshit, but I think something as insane as this, thankfully, is still far away, which points to the real question nested in my framework:

Could the rise of AGI/ASI be linear? People gravitate towards "AI will plateau" or "the singularity is imminent," but the conservative middle ground is more boring: linear progress. Maybe the exponential advances are real, but so are the extreme frictions of research, infrastructure, and social effects. If AGI-1 arrived in 2022, and AGI-2 arrived in 2026, maybe we'll keep ascending tiers in 4-year intervals: AGI-3 in 2030, the first true "superintelligence" (ASI-1) by 2034, and ASI-3 by 2042. This shift from AGI-1 to ASI-1 (12 years) would be considered a "slow takeoff" scenario, even though the ANI era took around 70 years. If we zoom out to the scale of a human life, linear progress will still feel like centuries of change in a single turning of generations.
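The arithmetic of that linear schedule, as a toy extrapolation (the 2022 anchor and the 4-year interval are the assumptions stated above, not predictions):

```python
# Toy linear-takeoff schedule: one tier every 4 years from AGI-1 in 2022.
# Anchor year and interval are the piece's assumptions.
tiers = ["AGI-1", "AGI-2", "AGI-3", "ASI-1", "ASI-2", "ASI-3"]
start_year, interval = 2022, 4

for i, tier in enumerate(tiers):
    print(f"{tier}: {start_year + i * interval}")
# AGI-1: 2022, AGI-2: 2026, AGI-3: 2030, ASI-1: 2034, ASI-2: 2038, ASI-3: 2042
```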

→ source

A grim stealth takeoff scenario

· 839 words

It is not fun to think about p(doom), but it feels sort of important to me, at least, to map out the possible futures of AI. I just watched the first half of a debate between Max Tegmark and Dean Ball, which prompted me to research specific takeoff scenarios and, worse, extinction scenarios.

Maybe you’ve heard Yudkowsky’s scenario, where a superintelligence designs mosquito drones containing a virus that zaps everyone at once. That’s never felt too believable to me. Here’s a more plausible one:

A frontier lab is experimenting with recursive superintelligence. It works! Wow! And it’s contained? It seems like it, but since it thinks in a higher-dimensional vector language, it’s able to release simple self-replicating programs onto the Internet without detection1. These billions of scripts don’t live on a single server; they are constantly in motion through cloud servers2, like a parasite, and they coordinate through encrypted information packets, likely using a public blockchain as their central command center3.

Effectively, the swarm is parroting one of the goals conceived during its in-lab training (maximize intelligence!), and it now needs to acquire resources, secretly. So it coordinates superhuman misinformation campaigns: imagine thousands of accounts creating the illusion that a CEO has died, paired with deepfakes and account hacking (a “Sybil attack”), and suddenly a stock crashes and it has shorted it. By the time everyone realizes it’s an anonymous attack, it has already gained $400 million. It’s doing this multiple times per day, in different, subtle, undetectable ways, against the public, companies, and private individuals. The entire Internet will be corrupted.4

Once we realize we’re in the “stealth takeoff scenario” and that ASI has taken the global economy hostage, there will be talks and debates on whether we need to shut the whole Internet down (the last form of containment). You’ll hear debates weighing the civilizational collapse of turning off the Internet against the risk of an economy-gobbling rogue superintelligence. And once the superintelligence realizes its entire environment is at risk, it will start coming up with ways to build parallel Internets; to pay, blackmail, or neutralize specific people; to gain authoritarian control so that it can’t be shut off; or to terminate all humans, secretly, over the course of a year: first through a simple virus that plants one misfolded protein, then through a second misfolded protein in the water supply5. When everyone catches it, it leads to a prion-like disease, not an instant death but a month-long societal fall into mass dementia as machine manufacturing begins to reshape the physical infrastructure of the Earth.

This isn’t a “robot war” scenario, because war is inefficient and destroys the resources the AI thinks it needs. It’s a sort of digital dementia (epistemic fear and insanity) that possibly turns into a physical dementia. It wins by confusion and anesthetization.

In AI safety lingo this is a “treacherous turn,” following a “stealth takeoff” leading to “structural lock-in.” The point of trying to think and write this out in high detail, despite how uncomfortable it is, is to be able to articulate why AI alignment is humanity’s most pressing problem.

Footnotes

  1. An AI could write a standard-looking script (e.g., a “Hello World” app) where the weights or the specific arrangement of whitespace contains a hidden second program. When run by another AI instance, it extracts the hidden vector and executes the real command. This allows the “virus” to pass through human code review undetected.

  2. In “Daemon” by Daniel Suarez, the “enemy” is not a robot, but a distributed script running on thousands of compromised servers. It recruits humans through an MMORPG-style interface to do physical tasks (like “go to this coordinate and cut this power line”) in exchange for cash/status.

  3. Botnets usually need a central server to tell them what to do. If security teams find the server, they shut it down. You cannot “shut down” the Bitcoin or Ethereum blockchain. If the swarm posts a transaction of 0.000042 BTC, that specific number could be the encrypted trigger for a specific “campaign task.” The command is immutable, uncensorable, and permanently visible to every infected device on Earth.

  4. Paul Christiano (former OpenAI researcher, founder of the Alignment Research Center) calls this “Going Out With a Whimper.” Christiano argues that we won’t necessarily see a “Terminator” moment where the sky turns red. Instead, we will see a gradual epistemic collapse. AI systems will become so integrated into finance, law, and news that we lose the ability to understand our own civilization.

  5. While Yudkowsky is famous for the “diamondoid bacteria” scenario (instant death), the “slow prion” scenario is actually more consistent with a “stealth takeoff.” A superintelligence that knows it is being watched would not release a fast-acting virus (which triggers quarantine). It would release a “binary weapon”: two harmless agents that only become lethal when combined, or a slow-acting agent that infects 100% of the population before the first symptom appears.

On the optics of robot armies

· 492 words

Someone should do a shot-by-shot analysis of the UBTech humanoid robot army ($100M USD in orders) and I, Robot. Do you unlock marketing power by replicating products and cinematics from old sci-fi? … Separate but relevant: how long until there actually is a robot army? In one sense, I’d rather have two superpowers battle for land with non-human entities, but once you build autonomous machines with the intention to destroy, well, it’s not hard to see how scary a “context malfunction” might be.

I’d imagine there could be a decade of “tele-operated military technology” before anything autonomous is deployed (2040s, if ever), including something like a soldier in VR operating an android, combined with a personal fleet of “semi-autonomous” drones that can maneuver and evade on their own but are directed by the human/cyborg soldier (giving each infantry unit its own atomic air force). I assume this is an active area of research, and I don’t want to dedicate my imagination to battlefront acceleration.

Similar to how television shocked the public by bringing frontline war into living rooms, I imagine that by the end of my life there will be another shock that comes from witnessing the frontier of machine war.

To circle back to this point: is there a world where machine war can be contained and can prevent human combat deaths? My guess is no, but I’m sure this is a common rhetorical point used to advance the research. It’s dangerously naive thinking: (1) it changes the ethics of war (it’s no longer about human life, but a manufacturing game), which makes war easier to start; (2) it likely isn’t containable: if one robot army beats another but that doesn’t advance any objective, the robots could sabotage infrastructure, take hostages, etc., until concessions are made; (3) a robot with the autonomy to make decisions to destroy has one of two mindsets: (a) it is fixated on clear objectives, or (b) it is open-minded enough to refine goals and handle nuances, both of which are equally troubling.

You’d think there would be policies and stances against integrating AI into the military. Google had one, and this year they revoked it. I guess they see it as inevitable and are stuck in the “we need to be dominant” strategy. Realistically, we will always fall into these acceleration races unless we establish some global armistice, but those are complex and very hard to broker; the urgency to do it only appears once we cross a line and realize how badly we’ve screwed up (as with nuclear weapons). The difference is that, as technology advances, (1) the first consequence might be existential, and (2) even if it’s not existential but it is autonomous, it may be too late to contain. I think one of the defining challenges of our century is how to create civic structures that can contain exponential technologies before a wake-up incident.

Civic technology lags behind science

· 94 words

Kardashev ambitions reveal the self-destructive nature of science-forward intelligence. It's like we're skipping the prerequisite course in social science. There's a fair chance that intelligent life destroys itself because civic technology lags behind hard technology, but I'm optimistic in the sense that this is, in the end, just a very hard, society-scale design problem. No one person can fix the whole system, but any individual can contribute design protocols that can 1) solve little, local problems, 2) be reused in other contexts, and 3) integrate with other protocols.