The product

The data pipeline orchestrator built for schema reality

Most orchestrators assume schemas stay stable. Queryvine assumes they don't — and builds every feature around that truth.

Core capabilities

Three systems. One outcome: clean data.

Queryvine is not an ETL tool: it does not extract or load data on its own. It orchestrates the pipelines that already move your data and adds schema-drift awareness to every hop. If a source schema changes, Queryvine detects the change and applies your drift rules before a single bad row reaches your warehouse.

Schema fingerprinting engine

Every schema change, detected before data moves

Every source connection is fingerprinted on each poll cycle. Queryvine tracks field names, types, nullability, and nested structure. Drift is detected by comparing schemas before data moves, not by waiting for rows to fail.

Poll cadence is configurable per source: every 60 seconds for critical streams, every hour for stable batch sources. The fingerprint database keeps a complete schema history for every version ever seen.
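To make the idea of a schema fingerprint concrete, here is a minimal Python sketch. It is not Queryvine's implementation: the field layout (name, type, nullable, fields), the choice of SHA-256, and the decision to ignore field order are all assumptions made for illustration.

```python
import hashlib
import json


def canonical_schema(fields):
    """Return a deterministic, order-independent form of a schema.

    Each field is a dict with 'name', 'type', 'nullable', and an optional
    'fields' list for nested structures (hypothetical key names).
    """
    normalized = []
    for field in fields:
        entry = {
            "name": field["name"],
            "type": field["type"],
            "nullable": bool(field.get("nullable", True)),
        }
        if field.get("fields"):
            # Recurse so nested structure contributes to the fingerprint.
            entry["fields"] = canonical_schema(field["fields"])
        normalized.append(entry)
    return sorted(normalized, key=lambda e: e["name"])


def fingerprint(fields):
    """Hash the canonical schema so two polls can be compared cheaply."""
    payload = json.dumps(
        canonical_schema(fields), sort_keys=True, separators=(",", ":")
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


previous = [
    {"name": "order_id", "type": "int64", "nullable": False},
    {"name": "customer", "type": "struct", "nullable": True,
     "fields": [{"name": "email", "type": "string", "nullable": True}]},
]
# Same fields in a different order, but the nested email is now required.
current = [
    {"name": "customer", "type": "struct", "nullable": True,
     "fields": [{"name": "email", "type": "string", "nullable": False}]},
    {"name": "order_id", "type": "int64", "nullable": False},
]

# Drift shows up as a fingerprint mismatch, before any row is read.
drift_detected = fingerprint(previous) != fingerprint(current)
```

In this sketch a nullability change deep inside a nested field is enough to change the fingerprint, while a pure reordering of fields is not. Whether reordering should count as drift depends on the source, so treat that choice as a design decision rather than a given.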

Drift response rules (YAML)

Define exactly what happens when schemas change

Define how each pipeline responds to schema changes. Remap a renamed field automatically. Pause ingestion and alert if a required field disappears. Widen a type silently when the change is safe. Rules live in your repo, not in a UI you can't audit.

Rules are evaluated in order. The first matching rule wins. Unmatched changes are handled by your pipeline's default policy.
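The first-match-wins evaluation described above can be sketched in a few lines of Python. This is an illustration only: the SchemaChange and Rule types, the change kinds, and the action names are hypothetical and do not reflect Queryvine's actual YAML rule format.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SchemaChange:
    kind: str               # hypothetical kinds: "renamed", "removed", "type_widened"
    field: str
    required: bool = False  # whether the pipeline treats the field as required


@dataclass
class Rule:
    name: str
    matches: Callable[[SchemaChange], bool]
    action: str


def resolve_action(change: SchemaChange, rules: List[Rule], default_policy: str) -> str:
    """Walk the rules in order and return the action of the first match."""
    for rule in rules:
        if rule.matches(change):
            return rule.action
    # Nothing matched, so the pipeline's default policy applies.
    return default_policy


RULES = [
    Rule("remap-renames", lambda c: c.kind == "renamed", "remap"),
    Rule("pause-on-required-removal",
         lambda c: c.kind == "removed" and c.required, "pause_and_alert"),
    Rule("widen-safe-types", lambda c: c.kind == "type_widened", "widen"),
]

action = resolve_action(
    SchemaChange(kind="removed", field="email", required=True),
    RULES,
    default_policy="hold",
)
```

Because order decides the outcome, specific rules belong above general ones. A removed optional field matches none of the three rules here, so it falls through to the default policy.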

Real-time event streaming

Schema drift in event streams, caught at consumer group level

Queryvine handles both batch and streaming pipelines. Kafka, Kinesis, and Pub/Sub sources are first-class citizens — schema changes in event topics are detected at consumer group level, not only at the broker.

When a producer pushes a new Avro or Protobuf schema, Queryvine intercepts at consume time, evaluates drift rules, and either adapts or holds before any record reaches downstream storage.
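As a rough picture of what a consume-time gate does, the Python sketch below simulates a stream of records tagged with a schema id. It uses no Kafka, Kinesis, or Pub/Sub client, and the gate_records function, the decide callback, and the "adapt" and "hold" outcomes are hypothetical names chosen for illustration.

```python
def gate_records(records, known_schema_id, decide):
    """Split consumed records into forwarded and held lists.

    `records` is an iterable of (schema_id, payload) pairs. `decide` is a
    hypothetical callback that returns "adapt" or "hold" for an unseen id.
    """
    forwarded, held = [], []
    # The schema the consumer already knows is always allowed through.
    decisions = {known_schema_id: "forward"}
    for schema_id, payload in records:
        if schema_id not in decisions:
            # First record with a new schema: evaluate the drift rules once.
            decisions[schema_id] = decide(schema_id)
        if decisions[schema_id] == "hold":
            held.append(payload)
        else:
            forwarded.append(payload)
    return forwarded, held


stream = [
    (1, {"id": 1}),
    (2, {"id": 2, "coupon": "SPRING"}),  # new optional field: safe to adapt
    (3, {"id": "3"}),                    # id changed type: hold
    (2, {"id": 4, "coupon": "FALL"}),
]

forwarded, held = gate_records(
    stream,
    known_schema_id=1,
    decide=lambda schema_id: "adapt" if schema_id == 2 else "hold",
)
```

The point of the sketch is ordering: the decision is made when the first record with a new schema id is consumed, so held records never reach downstream storage.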

Pipeline observability

Column-level lineage from source to dashboard

Queryvine tracks lineage at the column level — not just which table went where, but which specific fields moved, what types they carried, and what schema version was in effect at the time of each run. When a field appears as NULL in a dbt model, the lineage graph shows exactly when the upstream column changed and which pipeline run first produced the NULL.
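The NULL-tracing scenario above comes down to a search over per-run column metadata. The Python sketch below assumes a hypothetical record shape (run_id, schema_version, and a columns mapping of column name to type); the real lineage graph is richer than this.

```python
def first_run_with_change(runs, column):
    """Return the first run whose recorded type for `column` differs
    from the run before it, or None if the column never changed.

    A column missing from a run is read as None, so a dropped column
    counts as a change.
    """
    for before, after in zip(runs, runs[1:]):
        if before["columns"].get(column) != after["columns"].get(column):
            return after
    return None


# Hypothetical run history, oldest first.
runs = [
    {"run_id": "run-101", "schema_version": "v1.4",
     "columns": {"orders.discount": "decimal", "orders.total": "decimal"}},
    {"run_id": "run-102", "schema_version": "v1.4",
     "columns": {"orders.discount": "decimal", "orders.total": "decimal"}},
    {"run_id": "run-103", "schema_version": "v1.5",
     "columns": {"orders.total": "decimal"}},  # discount no longer present
]

culprit = first_run_with_change(runs, "orders.discount")
```

Here the search lands on run-103 under schema version v1.5, which is the kind of answer the lineage graph gives when a downstream model starts showing NULLs.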

The schema history database retains every version ever seen for every source. You can diff any two versions via the API or CLI, for example:

qv schema diff orders-pipeline --from v1.4 --to v1.5
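For a sense of what a diff between two schema versions reports, here is a small Python sketch. It is not the qv implementation or its output format: the flat name-to-type mapping and the example contents of v1.4 and v1.5 are invented for illustration.

```python
def diff_schemas(old, new):
    """Compare two {field_name: type} mappings and group the differences."""
    old_names, new_names = set(old), set(new)
    return {
        "added": sorted(new_names - old_names),
        "removed": sorted(old_names - new_names),
        "changed": sorted(
            name for name in old_names & new_names if old[name] != new[name]
        ),
    }


# Hypothetical contents of two versions of the same source schema.
v1_4 = {"order_id": "int64", "discount": "decimal", "status": "string"}
v1_5 = {"order_id": "int64", "discount_pct": "decimal", "status": "int32"}

result = diff_schemas(v1_4, v1_5)
```

Note that a plain set comparison sees a rename as one removal plus one addition (discount and discount_pct here). Recognizing it as a rename needs extra information, which is the job of a drift rule.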

See it in your stack in under 10 minutes

Connect your first source, define a drift rule, and watch schema changes get handled automatically.