Real-data seeding
Every synthetic agent is built on fragments of real, anonymised human behaviour, not an LLM's second-hand memory of people.
The easiest way to build a synthetic population is to ask a language model for one. It will happily produce ten thousand plausible-sounding humans in a few minutes. They will also be extraordinarily bland: anchored to the model's training distribution, stripped of the idiosyncrasies that make real humans noisy, contradictory, and informative.
Prism agents are seeded from a fragment layer. Each fragment is a small, consent-cleared, anonymised unit of real human behaviour: a survey response, a product review, a community thread, a panel-consented conversation excerpt. When we build an agent in a given cluster, we retrieve a compact bundle of fragments that matches the cluster's distribution and anchor the agent to it. The LLM's job is to animate the fragments, not invent them.
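A minimal sketch of that seeding step, in Python. The `Fragment` shape, the source weights, and the prompt wording are all illustrative assumptions, not Prism's actual store or API; the point is the mechanism: sample a bundle whose source mix tracks the cluster's distribution, then anchor the agent's prompt to the sampled text.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class Fragment:
    source: str   # e.g. "survey", "review", "thread" (hypothetical labels)
    text: str     # consent-cleared, anonymised excerpt

def seed_bundle(fragments, cluster_weights, k, rng=None):
    """Sample k fragments so the bundle's source mix tracks the cluster's distribution."""
    rng = rng or random.Random(0)
    pool = {}
    for f in fragments:
        pool.setdefault(f.source, []).append(f)
    sources = list(cluster_weights)
    weights = [cluster_weights[s] for s in sources]
    bundle = []
    for _ in range(k):
        src = rng.choices(sources, weights=weights)[0]
        bundle.append(rng.choice(pool[src]))
    return bundle

def agent_prompt(bundle):
    """Anchor the LLM to the fragments: its job is to animate them, not invent."""
    quoted = "\n".join(f"- [{f.source}] {f.text}" for f in bundle)
    return (
        "Stay consistent with these real behaviour fragments. "
        "Do not contradict or embellish them:\n" + quoted
    )
```

In this framing the LLM never sees an open-ended "invent a person" instruction; it only ever elaborates on sampled real text, which is what keeps the agents anchored to data rather than to the model's priors.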
The output is agents who behave more like data and less like prose.
For SaaS buyer clusters, the same principle holds with different sources: a fragment might be a G2 review excerpt, a Hacker News thread comment, a public switching post, or a consent-cleared founder transcript. A simulated CTO reacting to a dev-tool landing page is anchored to actual senior-engineer voice, not to a model's idea of what a CTO sounds like.