The Idea-to-Reality Stack — a general-purpose creation engine

The big idea

Part of solving for AGI is building out the potential space of all possible output artifacts — then building the heuristics and processes that align each artifact's necessary inputs with the tools and steps that produce it, in a logical flow.

The goal over the next few years: an unhinged amount of creative output, produced by treating creation itself as a compilable pipeline. This page is the thinking made visible — and the live map is the first working organ of that engine.

From a second brain to a creation engine

The original Idea-to-Reality Stack (2020) was a personal-workflow language. Its core claim still holds: a “second brain” is only one layer of a full stack that spans three stages — Generate → Validate → Execute — built from both mental frameworks and digital/physical tools.

Before · 2020: a static canvas. The original Idea-to-Reality Stack canvas was a static grid of mental frameworks arranged by One Mind / Many Minds and the three stages. You hand-placed mental frameworks on a printed grid: no tools connected, no automation, no estimate of what you could actually finish.

After · now: the Creation Engine. The mockup of the new UI shows a creation prompt, an interactive ease × completeness map with a frontier line, a production graph with per-step tool integrations, connected tools, and a leaderboard. Type what you want to make; it coordinates the stages into real output artifacts, plugs into your tools (BYO key), scores each step, and plots your frontier on the live map.

How the engine works

Decompose the target artifact into a production graph (sub-artifacts + steps).
Map each step to the best available tool, agent, or skill (LLM, text-to-image, code, API, human, robot).
Score each step's fulfillment % and ease, then roll up to a composite for the whole artifact.
Surface the gaps — the exact steps where a human (or a not-yet-existing capability) must enter.
Place the artifact on the map and show the full stack required to make it real.
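The five steps above can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: the `Step` shape, the 0-100 ease scale, the unweighted-mean roll-up, the gap threshold of 50, and every number in the example graph are assumptions made for the sketch.

```typescript
// Hypothetical shapes: the field names and scales below are assumptions.
type Stage = "generate" | "validate" | "execute";

interface Step {
  stage: Stage;
  name: string;
  toolIds: string[];   // tools, agents, or skills mapped to this step
  fulfillment: number; // 0-100: how much AI can do unaided
  ease: number;        // 0-100: assumed scale, higher is easier
}

interface Estimate {
  completeness: number; // composite fulfillment %
  ease: number;         // composite ease
  gaps: Step[];         // steps where a human must enter
}

// Assumed roll-up: an unweighted mean, rounded to a whole number.
function mean(values: number[]): number {
  if (values.length === 0) return 0;
  return Math.round(values.reduce((sum, v) => sum + v, 0) / values.length);
}

// Score a production graph and surface the steps below the gap threshold.
function estimate(steps: Step[], gapThreshold = 50): Estimate {
  return {
    completeness: mean(steps.map((s) => s.fulfillment)),
    ease: mean(steps.map((s) => s.ease)),
    gaps: steps.filter((s) => s.fulfillment < gapThreshold),
  };
}

// Illustrative three-step graph for a blog post (invented scores).
const blogPost: Step[] = [
  { stage: "generate", name: "Draft outline", toolIds: ["llm"], fulfillment: 90, ease: 80 },
  { stage: "validate", name: "Fact-check claims", toolIds: ["llm", "human"], fulfillment: 40, ease: 50 },
  { stage: "execute", name: "Publish", toolIds: ["api"], fulfillment: 80, ease: 80 },
];

const result = estimate(blogPost);
console.log(result.completeness, result.ease, result.gaps.map((g) => g.name));
```

With these invented scores, only the fact-checking step falls under the threshold, so it is the one surfaced as a gap. A weighted roll-up (for example, by step cost) would be an equally plausible choice.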

Architecture

One portable knowledge layer, two front doors. The data is the moat; the UI and the agent-skills are thin clients over it.

1 · Stack ontology

Artifact taxonomy, step graphs, tool/agent registry, and fulfillment + ease heuristics — plain versioned files. (This site reads its seed from data.js.)

2 · Estimator

A headless “make X” planner (CLI/API) that returns the decomposition, the stack, a completeness %, an ease score, and the gaps.

3 · Clients

This interactive map + blog, plus bring-your-own-key agent skills (Claude/Codex/Hermes/OpenClaw) that don't just describe the stack — they run it.

Layer 1 · the moat

The ontology: the map's source of truth

Everything you see on the map is a thin renderer over one small, carefully-typed dataset called the ontology. Think of it as the engine's dictionary of what can be made and what it takes to make it. It answers three questions in plain data:

What are the outputs? A catalogue of artifacts — a blog post, a song, a SaaS app, a house — each with its position on the ease × completeness map.
What tools exist? A registry of every capability (LLM, text-to-image, CAD, a 3D printer, a human, regulators), each tagged with the year it became viable.
How is each output made? A production graph: ordered steps across Generate → Validate → Execute, where every step cites the tools it needs and a fulfillment % for how much AI can do unaided.
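To make the first two questions concrete, here is one way the catalogue and the registry might be typed. The field names, the 0-100 map coordinates, and all years and scores are placeholders invented for this sketch; they are not the site's data.

```typescript
// Hypothetical record shapes; field names are assumptions, not the site's schema.
interface Tool {
  id: string;
  label: string;
  viableSince: number; // year the capability became viable
}

interface Artifact {
  id: string;
  label: string;
  ease: number;         // one map coordinate, assumed 0-100
  completeness: number; // the other map coordinate, assumed 0-100
}

// Placeholder entries: the years and scores are invented for illustration.
const toolRegistry: Tool[] = [
  { id: "llm", label: "LLM", viableSince: 2022 },
  { id: "cad", label: "CAD", viableSince: 1990 },
  { id: "printer-3d", label: "3D printer", viableSince: 2010 },
];

const catalogue: Artifact[] = [
  { id: "blog-post", label: "Blog post", ease: 90, completeness: 85 },
  { id: "house", label: "House", ease: 10, completeness: 15 },
];

// Which registered tools were already viable in a given year?
function viableBy(year: number, registry: Tool[]): string[] {
  return registry.filter((t) => t.viableSince <= year).map((t) => t.id);
}

console.log(viableBy(2015, toolRegistry));
console.log(catalogue.map((a) => `${a.label}: ${a.ease} x ${a.completeness}`));
```

Tagging each tool with a year is what would let the same dataset be filtered by date, assuming the registry is queried the way `viableBy` does here.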

37 artifacts · 56 tools / agents · 136 production steps · 5 domains

It's written in TypeScript and validated with Zod, so the data can't drift out of shape: every artifact, tool, and step is checked against a schema, and the build refuses to publish if a step ever cites a tool that doesn't exist. That's the honesty rule made mechanical — no capability gets claimed unless it's a real, registered tool. A single command re-validates the ontology and regenerates the data.js this site reads.

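// ontology/schema.ts: every step must cite at least one tool
import { z } from "zod";

// Assumed definition, inferred from the stage comment below.
const StageEnum = z.enum(["generate", "validate", "execute"]);

const StepSchema = z.object({
  stage:       StageEnum,                  // generate | validate | execute
  name:        z.string().min(1),
  toolIds:     z.array(z.string()).min(1), // ← can't be empty: the honesty rule
  fulfillment: z.number().int().min(0).max(100), // whole-number percentage
});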
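The schema above guarantees that `toolIds` is never empty. The other half of the rule, that a step never cites a tool missing from the registry, needs a cross-reference check. A minimal sketch in plain TypeScript (no Zod) follows; `findUnknownTools`, `StepRef`, and the sample ids are hypothetical names for illustration, not the project's code.

```typescript
// Hypothetical minimal step shape for the check.
interface StepRef {
  name: string;
  toolIds: string[];
}

// Return one message per cited tool id that is missing from the registry.
function findUnknownTools(steps: StepRef[], registeredIds: Set<string>): string[] {
  const errors: string[] = [];
  for (const step of steps) {
    for (const id of step.toolIds) {
      if (!registeredIds.has(id)) {
        errors.push(`Step "${step.name}" cites unknown tool "${id}"`);
      }
    }
  }
  return errors;
}

// Sample data: "teleporter" is deliberately not registered.
const registered = new Set(["llm", "text-to-image", "human"]);
const steps: StepRef[] = [
  { name: "Draft copy", toolIds: ["llm"] },
  { name: "Render cover", toolIds: ["text-to-image", "teleporter"] },
];

const problems = findUnknownTools(steps, registered);
// A build script could refuse to publish whenever this list is non-empty.
if (problems.length > 0) {
  console.log(problems.join("\n"));
}
```

Collecting every violation before failing, rather than stopping at the first, gives the author one complete list to fix per build.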


The goal

Build the Creation Engine: a portable stack ontology plus a headless estimator that, given a target output, returns its decomposition, the required stack, a composite completeness % and ease score, the ordered plan, and the human-in-the-loop gaps — rendered onto the ease × completeness map you see above.

Honesty rule: every fulfillment % cites the tool and maturity it's based on. The map shows the capability frontier as it actually is, not as we wish it.

Coming next

The benchmark: plot your stack's frontier

The map above shows the global frontier. The next build is an open eval that any stack can plug into: an agent, a model, a toolchain, even a human + AI pair, each bringing its own API key. It runs a battery of creation tasks, measures how much of each one your stack can actually finish, and draws your own frontier line overlaid on this exact map.

Plug in any stack

Submit a stack manifest (model + tools + keys) and a thin adapter that turns a task brief into produced artifacts plus a run-log of the steps it took.
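Since the benchmark is not built yet, the following is only a guess at what a manifest and adapter could look like. Every type and field name here (`StackManifest`, `TaskBrief`, `RunResult`, `keyEnvVars`, and so on) is hypothetical, and the adapter is a stub that calls no real model.

```typescript
// Hypothetical submission shapes; no published spec is implied.
interface StackManifest {
  stackName: string;
  model: string;
  tools: string[];
  keyEnvVars: string[]; // names of environment variables holding keys, not the keys
}

interface TaskBrief {
  id: string;
  prompt: string;
}

interface RunLogEntry {
  step: string;
  tool: string;
  unaided: boolean; // false when a human had to step in
}

interface RunResult {
  taskId: string;
  artifacts: { path: string; content: string }[];
  log: RunLogEntry[];
}

// Synchronous for brevity; an adapter that calls a model would likely return a Promise.
type Adapter = (brief: TaskBrief) => RunResult;

const manifest: StackManifest = {
  stackName: "demo-stack",
  model: "stub-model",
  tools: ["stub-model"],
  keyEnvVars: ["DEMO_API_KEY"],
};

// Stub adapter: produces one Markdown artifact and a one-line run log.
const stubAdapter: Adapter = (brief) => ({
  taskId: brief.id,
  artifacts: [{ path: `${brief.id}/post.md`, content: `# ${brief.prompt}\n` }],
  log: [{ step: "draft", tool: manifest.model, unaided: true }],
});

const run = stubAdapter({ id: "blog-001", prompt: "Why maps beat lists" });
console.log(run.artifacts.map((a) => a.path), run.log.length);
```

Keeping the adapter to a single function from brief to result is the "thin" part: the harness would not need to know anything else about the stack behind it.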

Four scored axes

Completeness (steps finished unaided), quality (rubric / judged), autonomy & ease (human steps, cost, time), and honesty (did it report its own gaps instead of overclaiming?).

Your frontier, overlaid

Results render as a contour on the map above — the line where your stack's completeness × quality drops off. That line is “where your capabilities end up.”

How a run works

Pick a task set — start with auto-gradeable digital artifacts (blog post, landing page, small web app, song, eBook) with deterministic verification.
Your adapter runs each brief and drops artifacts + a step log into an output folder.
The harness verifies — programmatic checks first, judged quality second, with provenance (model versions, cost, timestamp) recorded.
Scores roll up into the four axes and a composite reach.
Your frontier is drawn on the map and added to a dated leaderboard, so different stacks — and the same stack over time — can be compared as the frontier moves.
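The roll-up in the fourth step could be as simple as the sketch below. It assumes each axis is normalised to 0-1, that completeness is the share of logged steps finished unaided, and that composite reach is an equal-weight mean; the page does not specify any of these, so treat them as placeholders.

```typescript
// Assumed: every axis is normalised to the range 0-1.
interface AxisScores {
  completeness: number;
  quality: number;
  autonomy: number;
  honesty: number;
}

// Completeness as the share of logged steps finished without human help.
function completenessFromLog(log: { unaided: boolean }[]): number {
  if (log.length === 0) return 0;
  return log.filter((entry) => entry.unaided).length / log.length;
}

// Assumed composite: an equal-weight mean of the four axes.
function compositeReach(s: AxisScores): number {
  return (s.completeness + s.quality + s.autonomy + s.honesty) / 4;
}

// Invented run: three of four steps finished unaided.
const stepLog = [
  { unaided: true },
  { unaided: true },
  { unaided: false },
  { unaided: true },
];

const axisScores: AxisScores = {
  completeness: completenessFromLog(stepLog),
  quality: 0.5,  // placeholder judged score
  autonomy: 0.25, // placeholder
  honesty: 1,    // placeholder: all gaps were self-reported
};

console.log(axisScores.completeness, compositeReach(axisScores));
```

An equal-weight mean is the simplest option; a leaderboard might instead weight honesty more heavily or report the axes separately.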

Same honesty rule applies: physical artifacts (house, drug, spacecraft) are scored only on their digital precursors, with the real-world remainder labelled unevaluated rather than faked.
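One way to keep that rule mechanical is to score only the digital steps and return the physical ones as an explicit list. The `physical` flag, the type names, and the house example below are assumptions for illustration, with invented scores.

```typescript
// Hypothetical step shape with a flag for real-world work.
interface ScoredStep {
  name: string;
  physical: boolean;   // true when the step happens in the real world
  fulfillment: number; // 0-100, only meaningful for digital steps
}

interface PartialScore {
  digitalCompleteness: number | null; // null when nothing was evaluable
  unevaluated: string[];              // physical steps, labelled rather than scored
}

// Average the digital steps only; list the physical remainder by name.
function scoreDigitalPrecursors(steps: ScoredStep[]): PartialScore {
  const digital = steps.filter((s) => !s.physical);
  const unevaluated = steps.filter((s) => s.physical).map((s) => s.name);
  if (digital.length === 0) {
    return { digitalCompleteness: null, unevaluated };
  }
  const total = digital.reduce((sum, s) => sum + s.fulfillment, 0);
  return { digitalCompleteness: Math.round(total / digital.length), unevaluated };
}

// Invented example: two digital precursors and one physical step.
const houseSteps: ScoredStep[] = [
  { name: "Floor plan", physical: false, fulfillment: 70 },
  { name: "Permit paperwork", physical: false, fulfillment: 50 },
  { name: "Pour foundation", physical: true, fulfillment: 0 },
];

const houseScore = scoreDigitalPrecursors(houseSteps);
console.log(houseScore.digitalCompleteness, houseScore.unevaluated);
```

Returning `null` instead of 0 when no digital step exists keeps "not evaluated" distinct from "evaluated and failed".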
