Craig Stapley

AI brainstorming gives you the average. I wanted the outlier

Ask a model to brainstorm and it hands you the center of its training data, polished and useless. I built a four-phase engine that forces the collision the brain uses to make new ideas, then kills most of them on purpose.

The problem with asking a model to brainstorm

Ask any language model to brainstorm and you get the center. A confident, polished answer that sits close to the statistical average of everything it read. And that's not the model failing. That's the model working. Averaging is the job. Which means novelty isn't something you request from it. It's something you have to force out of it.

Because that's not how a brain makes a new idea anyway. New ideas come from dragging the problem next to something that has no business being near it. A musician and a mycologist notice the same pattern. Not deeper thinking. Different thinking.

So I built a system that forces that collision.

Four phases, four neural analogs. The engine isn't mimicking a 'smart generalist.' It's mimicking the specific neural handoff that produces insight.

What the brain actually does

Arthur Koestler, 1964: bisociation. Insight happens when two unrelated ideas collide and suddenly share structure. A poem isn't a clever sentence. It's two semantic domains that had no business meeting, fused at one point.

fMRI studies of creative thinking point the same way. Novel ideas tend to involve a handoff between two brain networks:

  • The Default Mode Network: fires when you're daydreaming, free-associating, letting thoughts wander. Unlikely pairings surface here.

  • The Executive Control Network: fires when you're focused, evaluating, testing. Ideas get pressure-tested here.

The Salience Network (insula, anterior cingulate) is the referee. It picks which loose association from the first network is worth hauling into the second one.

Creative people aren't better at staying in one state. They're better at switching between them. That's the whole skill.

Most brainstorming fails because it never leaves the Executive Control Network. You evaluate before you've diverged. You get the obvious answer, polished.

Four phases

Most AI tools pretend to be a smart generalist. This one mimics the specific neural handoff that produces a real idea.

Phase 1. Incubate. Pull signals from unrelated domains: museum curation, archaeology, documentary filmmaking, mycorrhizal networks, jazz improvisation. Generate weak associations to the problem on purpose. Zero evaluation. Just collision material.

Phase 2. Diverge. Now produce bisociations. Force pairings between the problem and one domain. Score each on semantic distance. 0 = obvious (already in the data). 10 = nonsense. Target 5-7. Novel enough to be surprising. Close enough to actually work.
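Here is a minimal sketch of the Phase 2 band filter, not the engine's actual code. It assumes each pairing already carries a 0-10 distance score from somewhere (a model, a rubric, a person); the essay doesn't say how the score is produced. The names Bisociation and keep_target_band and the sample ideas are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bisociation:
    """One forced pairing between the problem and a borrowed domain (hypothetical type)."""
    domain: str
    idea: str
    distance: float  # 0 = obvious, 10 = nonsense, per the scale above


def keep_target_band(candidates, low=5.0, high=7.0):
    """Keep only pairings whose distance lands in the target band (inclusive)."""
    return [b for b in candidates if low <= b.distance <= high]


# Illustrative, invented scores. Not output from a real run.
candidates = [
    Bisociation("museum curation", "idea A", 1.5),     # too close: obvious
    Bisociation("jazz improvisation", "idea B", 6.0),  # inside the band
    Bisociation("mycorrhizal networks", "idea C", 9.0),  # too far: nonsense
    Bisociation("archaeology", "idea D", 5.0),         # on the lower edge, kept
]

kept = keep_target_band(candidates)
print([b.idea for b in kept])
```

The band is a parameter rather than a constant so you can widen it when a problem needs stranger material.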

Phase 3. Switch. Kill weak ideas with four filters:

  • Surprise: would this wake up a senior reviewer?

  • Coherence: does it solve the problem or just sound clever?

  • Timing: does it fit this moment, this team, these constraints?

  • Zero-Audience: would this still be good if no one ever knew you did it?

Phase 4. Converge. Survivors get Concept Cards. User story. Build sketch. Unit economics. Risk map. Competitive context. The winner gets a full dive.

The engine is transparent. You see why it killed each idea. That matters. Without it, you drift back toward the first coherent direction your brain found. The system prevents quiet reversion.
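The Phase 3 filters and the kill log can be sketched roughly like this. It is an assumption-heavy sketch: it treats each filter as a pass/fail verdict that something upstream has already produced, and the names Verdict and switch_phase are hypothetical, not taken from the engine. The point it shows is that an idea never disappears silently. Every kill carries the filters it failed.

```python
from dataclasses import dataclass

# The four filters named above, in the order they are listed.
FILTERS = ("surprise", "coherence", "timing", "zero_audience")


@dataclass
class Verdict:
    """Pass/fail results for one idea (hypothetical type; verdicts come from upstream)."""
    idea: str
    passed: dict  # filter name -> bool


def switch_phase(verdicts):
    """Split ideas into survivors and a kill log that records why each idea died."""
    survivors = []
    kill_log = {}
    for v in verdicts:
        # A filter with no recorded verdict counts as a failure, so nothing slips through.
        failed = [name for name in FILTERS if not v.passed.get(name, False)]
        if failed:
            kill_log[v.idea] = failed
        else:
            survivors.append(v.idea)
    return survivors, kill_log


# Invented verdicts for illustration only.
verdicts = [
    Verdict("idea A", {"surprise": True, "coherence": False, "timing": True, "zero_audience": True}),
    Verdict("idea B", {"surprise": True, "coherence": True, "timing": True, "zero_audience": True}),
    Verdict("idea C", {"surprise": False, "coherence": True, "timing": False, "zero_audience": True}),
]

survivors, kill_log = switch_phase(verdicts)
print(survivors)
print(kill_log)
```

Here one failed filter is enough to kill. A scored or weighted rubric would be an equally plausible reading of the essay.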

A real portfolio-redesign run. 'Live A/B Test' scored highest on surprise but lost on coherence. The hybrid of 'Three-Act Narrative + Inverted Resume' won.

The memory problem

In my first sessions with the engine, it kept reaching for the same three metaphors. Jazz. Fungal networks. Fermentation. I recognized them instantly, because they're mine. Those are the wells I go back to when I'm out of ideas.

A metaphor works once. Works twice. The third time it isn't insight, it's a groove you fell into. So the engine logs what it's used and benches a domain after three runs. It's not beating its own laziness. It's beating mine.
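A rotation log like that can be sketched in a few lines. The essay only says a domain is benched after three runs, so everything else here is an assumption: the class name RotationLog, keeping counts in memory instead of on disk, and a bench that never expires.

```python
from collections import Counter


class RotationLog:
    """Counts how often each domain has been used and benches the overused ones."""

    def __init__(self, max_uses=3):
        self.max_uses = max_uses  # three runs, per the essay
        self.uses = Counter()

    def record(self, domain):
        """Note that a run drew on this domain."""
        self.uses[domain] += 1

    def is_benched(self, domain):
        """A domain is benched once it has been used max_uses times."""
        return self.uses[domain] >= self.max_uses

    def available(self, domains):
        """Filter a list of candidate domains down to those still in rotation."""
        return [d for d in domains if not self.is_benched(d)]


log = RotationLog()
for _ in range(3):
    log.record("jazz improvisation")
for _ in range(2):
    log.record("fermentation")

pool = ["jazz improvisation", "fermentation", "archaeology"]
print(log.available(pool))
```

A real version would need to persist the counts between sessions and decide when a benched domain gets to come back.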

What actually matters

  • AI brainstorming without forced divergence is useless. You get the center every time. Period.
  • The filter is the hard part. Ask a model to "be surprising but coherent" and it averages. It needs an explicit rubric. Explicit kill criteria.
  • Memory compounds novelty. Without it, you reset to the obvious every session.
  • This isn't a brainstorming app. It's a prosthesis for a brain that grabs the first sensible answer and quits. Mine still does it, engine or not. The tool just makes quitting visible before I ship it.

The engine is live

Paste a design problem. Click. The output unfolds in four labeled stages so you see the pipeline work.

Run the Creativity Engine →

It's open. System prompts, rubric, rotation log. Adapt it. Make it yours.

The tool is interesting. The architecture is the whole thing.