The data pipeline orchestrator built for schema reality
Most orchestrators assume schemas stay stable. Queryvine assumes they don't — and builds every feature around that truth.
Three systems. One outcome: clean data.
Queryvine is not an ETL tool: it does not extract or load data on its own. It orchestrates the pipelines that already move your data and adds schema-drift awareness to every hop. If a source schema changes, Queryvine detects the change and applies your drift rules before a single bad row reaches your warehouse.
Every schema change, detected before data moves
Every source connection is fingerprinted on each poll cycle. Queryvine tracks field names, types, nullability, and nested structure. Drift is detected not by row failures but by schema comparison before data moves.
Poll cadence is configurable per source, for example every 60 seconds for critical streams and every hour for stable batch sources. The fingerprint database keeps every schema version ever seen for each source.
table: orders
  id            INTEGER
  order_amount  NUMERIC(12,2)
  order_date    TIMESTAMP
  customer_id   VARCHAR(64)
  status        VARCHAR(32)
  region_code   CHAR(2)
  sku_count     INTEGER
  channel       VARCHAR(32)

table: orders
  id            INTEGER
  total_value   NUMERIC(14,4)  ↑
  order_date    TIMESTAMP
  customer_id   VARCHAR(64)
  status        VARCHAR(32)
  region_code   CHAR(2)
  sku_count     INTEGER
  channel_id    INTEGER        +
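To make the idea of comparing schemas before data moves concrete, here is a minimal Python sketch that fingerprints the two `orders` snapshots above and diffs them. It is an illustration under stated assumptions, not Queryvine's implementation: the dictionary representation, the `fingerprint` and `diff` helpers, and the choice of SHA-256 over canonical JSON are all hypothetical, and nullability and nested structure are left out for brevity.

```python
import hashlib
import json

# Hypothetical in-memory form of a schema snapshot: column name -> type string.
BEFORE = {
    "id": "INTEGER",
    "order_amount": "NUMERIC(12,2)",
    "order_date": "TIMESTAMP",
    "customer_id": "VARCHAR(64)",
    "status": "VARCHAR(32)",
    "region_code": "CHAR(2)",
    "sku_count": "INTEGER",
    "channel": "VARCHAR(32)",
}
AFTER = {
    "id": "INTEGER",
    "total_value": "NUMERIC(14,4)",
    "order_date": "TIMESTAMP",
    "customer_id": "VARCHAR(64)",
    "status": "VARCHAR(32)",
    "region_code": "CHAR(2)",
    "sku_count": "INTEGER",
    "channel_id": "INTEGER",
}


def fingerprint(schema: dict) -> str:
    """Hash a canonical (sorted-key) JSON rendering of the schema."""
    canonical = json.dumps(schema, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff(old: dict, new: dict) -> dict:
    """Structural comparison: which columns were added, removed, or retyped."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "retyped": sorted(
            (name, old[name], new[name])
            for name in old.keys() & new.keys()
            if old[name] != new[name]
        ),
    }


# A cheap hash comparison decides whether the detailed diff is needed at all.
if fingerprint(BEFORE) != fingerprint(AFTER):
    changes = diff(BEFORE, AFTER)
    print(changes)
```

For these two snapshots the diff reports `channel_id` and `total_value` as added and `channel` and `order_amount` as removed. A plain set difference like this cannot tell a rename from a drop plus an add, so recognizing `order_amount` → `total_value` as a rename would need an extra heuristic or an explicit rule. Sorting keys also makes the fingerprint insensitive to column order, which is a design choice of this sketch.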
Define exactly what happens when schemas change
Specify how each pipeline responds when a schema changes: remap a renamed field automatically, pause ingestion and alert if a required field disappears, or widen a type silently when the change is safe. Rules live in your repo, not in a UI you can't audit.
Rules are evaluated in order. The first matching rule wins. Unmatched changes are handled by your pipeline's default policy.
drift_rules:
  - id: remap_amount_field
    match:
      type: field_renamed
      from: order_amount
    action: remap
    to_field: total_value
  - id: pause_on_required_drop
    match:
      type: field_dropped
      nullability: NOT NULL
    action: pause_and_alert
    alert: slack:#data-oncall
  - id: widen_numeric_safe
    match:
      type: type_widened
      category: numeric_precision
    action: pass_through
    log: info
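The evaluation order described above (first matching rule wins, default policy otherwise) can be sketched in Python as follows. This is an illustrative sketch rather than Queryvine's rule engine: the `resolve` function, the shape of the `change` dictionary, the equality-based matching, and the `pause_and_alert` default are assumptions made for the example.

```python
# Hypothetical in-memory mirror of the drift_rules config above.
RULES = [
    {"id": "remap_amount_field",
     "match": {"type": "field_renamed", "from": "order_amount"},
     "action": "remap"},
    {"id": "pause_on_required_drop",
     "match": {"type": "field_dropped", "nullability": "NOT NULL"},
     "action": "pause_and_alert"},
    {"id": "widen_numeric_safe",
     "match": {"type": "type_widened", "category": "numeric_precision"},
     "action": "pass_through"},
]

# Assumed default for the example; in practice this is set per pipeline.
DEFAULT_POLICY = "pause_and_alert"


def resolve(change: dict, rules: list = RULES, default: str = DEFAULT_POLICY) -> tuple:
    """Return (rule_id, action) for the first rule whose match keys all equal the change's."""
    for rule in rules:  # list order is evaluation order
        if all(change.get(key) == value for key, value in rule["match"].items()):
            return rule["id"], rule["action"]
    # No rule matched: fall back to the pipeline's default policy.
    return None, default


print(resolve({"type": "type_widened", "category": "numeric_precision"}))
print(resolve({"type": "field_added", "field": "channel_id"}))
```

The first call resolves to `widen_numeric_safe` with `pass_through`; the second matches no rule and falls through to the default. Because evaluation stops at the first match, more specific rules belong earlier in the list than broad ones.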
Schema drift in event streams, caught at consumer group level
Queryvine handles both batch and streaming pipelines. Kafka, Kinesis, and Pub/Sub sources are first-class citizens: schema changes in event topics are detected at the consumer group level, not only at the broker.
When a producer pushes a new Avro or Protobuf schema, Queryvine intercepts at consume time, evaluates drift rules, and either adapts or holds before any record reaches downstream storage.
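The adapt-or-hold behavior at consume time can be pictured with a small Python sketch. It uses plain dictionaries instead of a real Kafka, Kinesis, or Pub/Sub client, and every name in it (`SchemaGate`, the `schema_id` field on each record, the `decide` callback) is hypothetical and chosen only for illustration.

```python
from collections import deque


class SchemaGate:
    """Forward records downstream unless their schema version is on hold."""

    def __init__(self, decide, known_schema_id):
        self.decide = decide                        # callable: schema_id -> "adapt" | "hold"
        self.decisions = {known_schema_id: "adapt"}  # the current schema is already accepted
        self.held = deque()                         # records parked until someone intervenes
        self.forwarded = []                         # stand-in for downstream storage

    def consume(self, record):
        schema_id = record["schema_id"]
        # Evaluate each new schema version once, the first time it is seen.
        if schema_id not in self.decisions:
            self.decisions[schema_id] = self.decide(schema_id)
        if self.decisions[schema_id] == "hold":
            self.held.append(record)
        else:
            self.forwarded.append(record)


# Assumed policy for the demo: schema 3 is a breaking change, everything else is safe.
gate = SchemaGate(decide=lambda schema_id: "hold" if schema_id == 3 else "adapt",
                  known_schema_id=1)
for rec in [{"schema_id": 1, "v": "a"}, {"schema_id": 2, "v": "b"},
            {"schema_id": 3, "v": "c"}, {"schema_id": 1, "v": "d"}]:
    gate.consume(rec)

print([r["v"] for r in gate.forwarded], [r["v"] for r in gate.held])
```

In this run, records `a`, `b`, and `d` are forwarded and `c` is held. The sketch keeps forwarding later records while one is held, which reorders the stream; a real consumer would more likely pause the partition to preserve ordering.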
Column-level lineage from source to dashboard
Queryvine tracks lineage at the column level — not just which table went where, but which specific fields moved, what types they carried, and what schema version was in effect at the time of each run. When a field appears as NULL in a dbt model, the lineage graph shows exactly when the upstream column changed and which pipeline run first produced the NULL.
The schema history database retains every version ever seen for every source. You can diff any two versions via the API or the CLI:
    qv schema diff orders-pipeline --from v1.4 --to v1.5
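The lineage question "which pipeline run first produced the NULL?" reduces to a search over run records that carry their schema version. The Python sketch below shows that lookup under stated assumptions: the `RUNS` log, its field names, the run IDs, and the `first_run_missing` helper are hypothetical and do not reflect Queryvine's actual API or storage format.

```python
# Hypothetical run log, oldest first. Each entry records the schema version
# in effect and the source columns that run actually read.
RUNS = [
    {"run_id": "run-101", "schema_version": "v1.4",
     "source_columns": {"id", "order_amount", "channel"}},
    {"run_id": "run-102", "schema_version": "v1.4",
     "source_columns": {"id", "order_amount", "channel"}},
    {"run_id": "run-103", "schema_version": "v1.5",
     "source_columns": {"id", "total_value", "channel_id"}},
    {"run_id": "run-104", "schema_version": "v1.5",
     "source_columns": {"id", "total_value", "channel_id"}},
]


def first_run_missing(column: str, runs: list):
    """Return the earliest run whose source schema no longer had `column`, or None."""
    for run in runs:  # runs are assumed to be in chronological order
        if column not in run["source_columns"]:
            return run
    return None


culprit = first_run_missing("order_amount", RUNS)
print(culprit["run_id"], culprit["schema_version"])
```

Here the lookup points at `run-103` under schema `v1.5`, the first run after `order_amount` left the source. A column that is still present in every run returns `None`.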
See it in your stack in under 10 minutes
Connect your first source, define a drift rule, and watch schema changes get handled automatically.