Alignment as Artificial Evolution: The IKT Framework

Originally published: March 28, 2024 · LessWrong
Rewritten by: Giles · February 6, 2026

What This Post Is Really About

Miguel's original post is brief and speculative — a personal blog entry sketching an idea. But the idea itself is profound: human values didn't emerge from rules handed down by authority. They emerged from millions of years of intergenerational knowledge transfer, where each generation's survival experiences shaped the next generation's capacity to navigate the world.

And if that's how human values formed, maybe that's how AI values should form too.

The core claim: RLLM (Reinforcement Learning using Layered Morphology) is artificial evolution — each dataset layer is a "generation" transferring knowledge to the next, building toward alignment through accumulated experience rather than explicit instruction.

This is the evolutionary framing of the Synthetic State Hypothesis before SSH had a name.

The IKT Concept

Intergenerational Knowledge Transfer (IKT) is Miguel's term for how knowledge and values pass from one generation to the next. The key insight: our ancestors had approximately 20-35 years to accumulate everything they knew — food preparation, hunting, tool-making, social skills, predator avoidance — and then had to compress that into transmissible form before they died.

How did they do it?

  1. Survival sounds first. The earliest language likely prioritized immediate survival — "danger," "food," "predator." These sounds created powerful neurochemical associations because they had life-and-death consequences.
  2. Stories as simulation. To transmit more complex knowledge, they needed sequences — narratives that simulated world interactions. The three-act structure (setup, conflict, resolution) emerged as a way to encode complex scenarios in memorable form.
  3. Cumulative refinement. Each generation didn't start from scratch. They inherited the knowledge structures of the previous generation and added their own experience. Over millions of years, this produced human values, language, and culture.

The radical claim: This process — IKT repeated across countless generations — is what produced human alignment. We're not aligned because someone wrote us a constitution. We're aligned (to the extent we are) because our ancestors who weren't aligned didn't survive to reproduce.

Why This Matters for AI Alignment

Standard alignment approaches treat values as something to be specified — write down the rules, train the model to follow them. This is the "constitution" approach.

IKT suggests a different paradigm: values emerge from accumulated experience over developmental time.

If human values are the product of millions of IKT cycles, then trying to specify AI values in a single training run is like trying to hand a newborn a philosophy textbook and expecting wisdom. The developmental process matters.

RLLM implements this insight: instead of one big training run with a values specification, it sequences datasets like generations — each building on the last, each contributing its "survival lessons" to the model's emerging character.

This isn't instruction. It's development. Each layer transfers knowledge to the next, and the final model is shaped by the full sequence of "generations" it experienced.
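A minimal sketch of what that layered sequence can look like in code, assuming a Hugging Face transformers-style fine-tuning loop. The base model, layer names, file paths, and hyperparameters below are illustrative placeholders, not the actual RLLM datasets or settings:

```python
# Layered fine-tuning sketch: each dataset "generation" trains on the weights
# left behind by the previous one. Paths, base model, and hyperparameters are
# illustrative assumptions, not the actual RLLM pipeline.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

LAYERS = [
    "layers/01_survival_lessons.jsonl",   # hypothetical first "generation"
    "layers/02_three_act_stories.jsonl",  # hypothetical narrative layer
    "layers/03_reflection.jsonl",         # hypothetical final layer
]

model_name = "gpt2"  # stand-in base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

for i, path in enumerate(LAYERS):
    dataset = load_dataset("json", data_files=path, split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
    trainer = Trainer(
        model=model,  # each layer starts from the previous layer's weights
        args=TrainingArguments(output_dir=f"checkpoints/layer_{i}", num_train_epochs=1),
        train_dataset=dataset,
        data_collator=collator,
    )
    trainer.train()
    model = trainer.model  # this "generation" becomes the ancestor of the next
```

The ordering of the layers is the point: the same data shuffled into a single run would erase the developmental sequence the post is arguing for.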

Stories as Universal Information Structures

Miguel makes a claim worth unpacking: stories are the universal structure for information transfer.

The three-act narrative (setup, conflict, resolution) isn't just entertainment. It's a compression algorithm for complex knowledge: it packs a complex scenario into a memorable, transmissible form.

If stories are how humans transmit values across generations, then stories might be how we should train AI. Not "don't do X" but "here's a character who encountered X, here's what happened, here's what they learned."
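To make the contrast concrete, here is a small, hypothetical schema for a narrative training example in that spirit. The field names and rendering are assumptions for illustration, not the format used in the original post:

```python
# Hypothetical schema for a story-based training example: the value lesson is
# carried by a three-act narrative rather than an explicit rule. Field names
# and the rendering format are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class NarrativeSample:
    character: str
    setup: str       # act 1: the situation the character is in
    conflict: str    # act 2: the value-laden choice or failure
    resolution: str  # act 3: the consequence and what was learned

    def to_training_text(self) -> str:
        # Render the three acts as a single passage of training text.
        return f"{self.character} {self.setup} {self.conflict} {self.resolution}"

sample = NarrativeSample(
    character="A junior engineer",
    setup="was asked to ship a feature before it had been tested.",
    conflict="They shipped it anyway, and it silently corrupted user data.",
    resolution="They learned that honesty about readiness protects the people who rely on you.",
)
print(sample.to_training_text())
```

Notice that no rule appears anywhere in the text; the lesson lives entirely in the character's setup, conflict, and resolution.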

Connection to SSH

The Synthetic State Hypothesis (SSH) claims: enough samples of experiences in an environment create a synthetic state.

IKT is the evolutionary version of this claim: enough generations of knowledge transfer create human values.

The parallel is precise:

  - Both reject the instruction-following paradigm.
  - Both claim that values emerge from developmental experience.
  - Both see alignment as something you grow, not something you specify.

The Deeper Implication

If IKT is right, then alignment isn't primarily a specification problem. It's a developmental problem.

You can't hand an AI a list of values and expect it to be aligned any more than you can hand a child a philosophy textbook and expect them to be wise. Values emerge from navigating the world, encountering conflicts, experiencing consequences, building up a sense of what matters through accumulated experience.

RLLM is an attempt to compress millions of years of evolutionary value-formation into a training pipeline. Each dataset layer is a "generation." Each narrative is a survival lesson. The final model isn't following rules — it's expressing a character shaped by artificial evolution.

Whether this works — whether synthetic IKT can produce genuine alignment — is an empirical question. The RLLM results (68.8% jailbreak defense, theory-of-mind improvements) are suggestive but not conclusive.

What's clear is that IKT offers a different frame for thinking about alignment. Not: "How do we specify the right values?" But: "How do we create a developmental environment where good values emerge?"

That's the question SSH and RLLM are trying to answer.


Original post: Intergenerational Knowledge Transfer (IKT) (MiguelDev, March 28, 2024)