How It Works

Seedling follows a four-stage pipeline: Introspect → Plan → Generate → Write.

Pipeline

Stage 1: Introspect

This stage reads information_schema from a live Postgres or MySQL database and produces a structured Schema object containing:

All tables and their columns with types
Foreign key relationships
Unique constraints, NOT NULL, CHECK, DEFAULT values
Column comments (used as generator hints)

The output is written to schema.yaml (or a JSON file) and serves as the input to the later stages.

seedling introspect --db postgres://localhost:5432/mydb --output schema.yaml
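The layout of the Schema object is not spelled out here. As a rough sketch only, it could be modeled in Go as below. Every type name, field name, and JSON key is an assumption chosen to mirror the list above, not Seedling's actual definition. JSON is used because the Go standard library has no YAML encoder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// All names below are hypothetical; they only mirror the items listed in
// the text (columns with types, FKs, constraints, comments).

// Column describes one column as read from information_schema.
type Column struct {
	Name    string `json:"name"`
	Type    string `json:"type"`
	NotNull bool   `json:"not_null"`
	Unique  bool   `json:"unique"`
	Default string `json:"default,omitempty"`
	Check   string `json:"check,omitempty"`
	Comment string `json:"comment,omitempty"` // may carry a generator hint
}

// ForeignKey links a child column to a column of a parent table.
type ForeignKey struct {
	Column    string `json:"column"`
	RefTable  string `json:"ref_table"`
	RefColumn string `json:"ref_column"`
}

// Table groups the columns and outgoing foreign keys of one table.
type Table struct {
	Name        string       `json:"name"`
	Columns     []Column     `json:"columns"`
	ForeignKeys []ForeignKey `json:"foreign_keys,omitempty"`
}

// Schema is the whole introspection result.
type Schema struct {
	Tables []Table `json:"tables"`
}

// exampleSchema builds a two-table schema by hand, standing in for what
// introspection of a live database might return.
func exampleSchema() Schema {
	return Schema{Tables: []Table{
		{
			Name: "users",
			Columns: []Column{
				{Name: "id", Type: "bigserial", NotNull: true, Unique: true},
				{Name: "email", Type: "varchar(255)", NotNull: true, Unique: true},
			},
		},
		{
			Name: "orders",
			Columns: []Column{
				{Name: "id", Type: "bigserial", NotNull: true, Unique: true},
				{Name: "user_id", Type: "bigint", NotNull: true},
				{Name: "created_at", Type: "timestamptz", Default: "now()"},
			},
			ForeignKeys: []ForeignKey{
				{Column: "user_id", RefTable: "users", RefColumn: "id"},
			},
		},
	}}
}

func main() {
	out, err := json.MarshalIndent(exampleSchema(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping foreign keys on the child table makes it straightforward for a later stage to build a dependency graph from the file alone.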

Stage 2: Plan

The PlanBuilder performs a topological sort of tables by FK dependency (Kahn’s algorithm). This ensures parent rows exist before child rows referencing them. Each column is automatically assigned a generator based on its type and name:

Column type / name	Auto-detected generator
`serial` / `bigserial`	Sequence
`varchar(255)` with name “email”	`Email`
`varchar` with name “phone”	`Phone`
`timestamptz`	`Now`
FK column	FK lookup from parent
`varchar`	Random string
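The dependency ordering described at the start of this stage can be illustrated with a generic Kahn's algorithm in Go. This is a minimal sketch, not the PlanBuilder itself: the function name topoSort, the map-based input, and the alphabetical tie-breaking are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// topoSort orders tables so that every parent precedes its children.
// deps maps a table name to the tables it references via foreign keys.
// Hypothetical helper: a plain Kahn's algorithm, not Seedling's code.
func topoSort(deps map[string][]string) ([]string, error) {
	inDegree := make(map[string]int)      // unresolved parents per table
	children := make(map[string][]string) // parent -> dependent tables
	for table, parents := range deps {
		if _, ok := inDegree[table]; !ok {
			inDegree[table] = 0
		}
		for _, p := range parents {
			if _, ok := inDegree[p]; !ok {
				inDegree[p] = 0
			}
			inDegree[table]++
			children[p] = append(children[p], table)
		}
	}

	// Start with tables that reference nothing.
	var ready []string
	for t, d := range inDegree {
		if d == 0 {
			ready = append(ready, t)
		}
	}
	sort.Strings(ready) // fixed order despite random map iteration

	var order []string
	for len(ready) > 0 {
		t := ready[0]
		ready = ready[1:]
		order = append(order, t)

		kids := children[t]
		sort.Strings(kids)
		for _, c := range kids {
			inDegree[c]--
			if inDegree[c] == 0 { // all parents emitted
				ready = append(ready, c)
			}
		}
	}

	// Tables left over are part of (or depend on) a cycle.
	if len(order) != len(inDegree) {
		return nil, fmt.Errorf("FK cycle involving %d tables", len(inDegree)-len(order))
	}
	return order, nil
}

func main() {
	order, err := topoSort(map[string][]string{
		"users":       nil,
		"products":    nil,
		"orders":      {"users"},
		"order_items": {"orders", "products"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(order)
}
```

The sketch reports a cycle as an error. A planner that supports circular foreign keys would instead have to break the cycle, for example by deferring one of the FK columns to a later pass.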

Stage 3: Generate

The StreamGenerator iterates tables in dependency order:

For each table, generate N rows
Each row’s columns are produced by their assigned generators
FK columns look up values from already-generated parent rows via FKPool
Unique constraints are tracked and enforced by UniqueTracker
Circular FK dependencies are split into multi-pass groups
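To make the roles of FKPool and UniqueTracker concrete, here is a small Go sketch of the loop above for a parent and a child table. Only the two type names come from the text. Their fields, methods, the newRNG helper, and the retry-on-collision strategy are assumptions for illustration. Circular dependencies are not covered.

```go
package main

import (
	"fmt"
	"math/rand"
)

// FKPool remembers the key values of already generated parent rows.
// The internals are hypothetical; only the name appears in the docs.
type FKPool struct {
	keys map[string][]int64 // table name -> generated primary keys
}

func NewFKPool() *FKPool { return &FKPool{keys: make(map[string][]int64)} }

// Add records a freshly generated key for a table.
func (p *FKPool) Add(table string, key int64) {
	p.keys[table] = append(p.keys[table], key)
}

// Pick returns a random existing key of the parent table, or false if
// no parent row has been generated yet.
func (p *FKPool) Pick(rng *rand.Rand, table string) (int64, bool) {
	ks := p.keys[table]
	if len(ks) == 0 {
		return 0, false
	}
	return ks[rng.Intn(len(ks))], true
}

// UniqueTracker records values already emitted for one unique column.
type UniqueTracker struct {
	seen map[string]struct{}
}

func NewUniqueTracker() *UniqueTracker {
	return &UniqueTracker{seen: make(map[string]struct{})}
}

// Claim marks value as used and reports whether it was new.
func (u *UniqueTracker) Claim(value string) bool {
	if _, dup := u.seen[value]; dup {
		return false
	}
	u.seen[value] = struct{}{}
	return true
}

// newRNG is a small helper so callers do not need to import math/rand.
func newRNG(seed int64) *rand.Rand { return rand.New(rand.NewSource(seed)) }

func main() {
	rng := newRNG(42)
	pool := NewFKPool()
	emails := NewUniqueTracker()

	// Parent table first, so its keys exist before any child row needs them.
	for id := int64(1); id <= 3; id++ {
		email := fmt.Sprintf("user%d@example.com", rng.Intn(5))
		for !emails.Claim(email) { // regenerate on a unique collision
			email = fmt.Sprintf("user%d@example.com", rng.Intn(5))
		}
		pool.Add("users", id)
		fmt.Printf("users(id=%d, email=%s)\n", id, email)
	}

	// Child table: user_id is drawn from the pool, never invented.
	for id := int64(1); id <= 4; id++ {
		userID, ok := pool.Pick(rng, "users")
		if !ok {
			panic("no parent rows generated yet")
		}
		fmt.Printf("orders(id=%d, user_id=%d)\n", id, userID)
	}
}
```

Retrying on collision is the simplest way to honor a unique constraint, but it slows down as the value space fills up. A real generator would likely need a fallback for small domains.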

Stage 4: Write

Generated rows are streamed to a writer. Supported formats:

Writer	Description
`SqlWriter`	Batched `INSERT INTO ...` statements
`CsvWriter`	One file per table
`JsonLinesWriter`	One JSON object per row
`ParquetWriter`	Columnar Parquet output
`DbWriter`	Direct batched INSERT into database
`CopyWriter`	Postgres COPY protocol (max throughput)
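The writers in the table presumably share a common interface that rows are streamed through. The Go sketch below shows one plausible shape, with a batching SQL writer as the only implementation. The Writer interface, the Row type, the method names, and the renderSQL helper are all hypothetical. Values are passed as pre-rendered SQL literals to keep the example short.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Row holds one generated row as already quoted SQL literals.
type Row []string

// Writer is an assumed common interface for all output formats.
type Writer interface {
	WriteRow(table string, columns []string, row Row) error
	Flush() error
}

// sqlWriter buffers rows and emits one multi-row INSERT per batch.
type sqlWriter struct {
	out       io.Writer
	batchSize int
	table     string
	columns   []string
	pending   []Row
}

func newSQLWriter(out io.Writer, batchSize int) *sqlWriter {
	if batchSize < 1 {
		batchSize = 1
	}
	return &sqlWriter{out: out, batchSize: batchSize}
}

func (w *sqlWriter) WriteRow(table string, columns []string, row Row) error {
	// A change of table forces the current batch out first.
	if table != w.table && len(w.pending) > 0 {
		if err := w.Flush(); err != nil {
			return err
		}
	}
	w.table, w.columns = table, columns
	w.pending = append(w.pending, row)
	if len(w.pending) >= w.batchSize {
		return w.Flush()
	}
	return nil
}

// Flush writes the buffered rows as a single INSERT statement.
func (w *sqlWriter) Flush() error {
	if len(w.pending) == 0 {
		return nil
	}
	tuples := make([]string, len(w.pending))
	for i, r := range w.pending {
		tuples[i] = "(" + strings.Join(r, ", ") + ")"
	}
	_, err := fmt.Fprintf(w.out, "INSERT INTO %s (%s) VALUES\n  %s;\n",
		w.table, strings.Join(w.columns, ", "), strings.Join(tuples, ",\n  "))
	w.pending = w.pending[:0]
	return err
}

// renderSQL streams rows through a sqlWriter and returns the SQL text.
func renderSQL(batchSize int, table string, columns []string, rows []Row) (string, error) {
	var sb strings.Builder
	var w Writer = newSQLWriter(&sb, batchSize)
	for _, r := range rows {
		if err := w.WriteRow(table, columns, r); err != nil {
			return "", err
		}
	}
	if err := w.Flush(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	sql, err := renderSQL(2, "users", []string{"id", "email"}, []Row{
		{"1", "'a@example.com'"},
		{"2", "'b@example.com'"},
		{"3", "'c@example.com'"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(sql)
}
```

With a batch size of 2, the three rows are emitted as two INSERT statements: one with two tuples and one with the remaining row. Because only a batch is held in memory at a time, the same interface would suit the file-based and database-backed writers as well.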

Determinism

When a seed is provided (--seed <int>), all generators use a ChaCha8-based deterministic PRNG. Every column gets a derived sub-seed, ensuring:

Same schema + seed + count = identical output
Parallel generation is deterministic within a run
No crypto/rand usage in deterministic mode
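One way to derive per-column sub-seeds is to hash the run seed together with the table and column names and feed the digest to a ChaCha8 source. The Go sketch below (Go 1.22 or later, for math/rand/v2) assumes that scheme. The derivation via SHA-256 and the names columnRNG and sample are illustrative guesses, not Seedling's documented behavior. Note that crypto/sha256 is a deterministic hash, so no crypto/rand is involved.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"math/rand/v2"
)

// columnRNG returns an independent ChaCha8 stream for one column.
// Hashing (seed, table, column) is an assumed derivation scheme.
func columnRNG(seed uint64, table, column string) *rand.Rand {
	data := binary.LittleEndian.AppendUint64(nil, seed)
	data = append(data, table...)
	data = append(data, 0) // separator: ("ab","c") must differ from ("a","bc")
	data = append(data, column...)
	sub := sha256.Sum256(data) // 32 bytes, exactly what NewChaCha8 expects
	return rand.New(rand.NewChaCha8(sub))
}

// sample draws n values from a column's stream.
func sample(seed uint64, table, column string, n int) []uint64 {
	rng := columnRNG(seed, table, column)
	out := make([]uint64, n)
	for i := range out {
		out[i] = rng.Uint64()
	}
	return out
}

func main() {
	// Same seed and column: the two lines printed are identical.
	fmt.Println(sample(42, "users", "email", 3))
	fmt.Println(sample(42, "users", "email", 3))
	// A different column gets its own, unrelated stream.
	fmt.Println(sample(42, "users", "phone", 3))
}
```

Because each column owns its stream, the values it produces do not depend on the order in which other columns or tables are generated. That is what allows parallel workers to produce the same output as a sequential run.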
