Dift v0.6.0 Release Notes¶

Release Date: May 20, 2026

Dift v0.6.0¶

Dift v0.6.0 is one of the largest architectural releases so far.

This release introduces database connectors, cloud warehouse support, centralized reader architecture, connector registries, progress indicators, internal plugin preparation, and major improvements to developer extensibility.

With this release, Dift evolves from a local dataset comparison tool into a scalable data trust platform that supports modern analytical workflows across files, databases, and warehouses.

Highlights¶

Dift v0.6.0 introduces:

SQL database support
SQLite support
PostgreSQL support
MySQL support
Redshift support
Snowflake support
BigQuery support
DuckDB support
centralized reader registry architecture
modular connector interfaces
plugin preparation architecture
reusable reader abstractions
progress indicators
connector routing improvements
connector validation improvements
improved dependency guidance
improved CLI responsiveness
improved warehouse workflows
expanded connector testing
warehouse mocking support
improved validation UX
cleaner internal architecture

Major New Features¶

SQL Database Support¶

Dift now supports direct SQL-based dataset comparison using SQLAlchemy-compatible connection strings.

Supported workflows include:

database-to-database comparison
table-to-table comparison
warehouse validation
ETL validation
query-driven workflows

Supported SQL Systems¶

Supported SQL connectors include:

SQLite
PostgreSQL
MySQL
Redshift
Snowflake

SQLite Support¶

Example:

dift sqlite:///examples/old.db:customers_old \
     sqlite:///examples/new.db:customers_new \
     --key customer_id
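To try the command above locally, the two databases have to exist first. The following is a minimal sketch that builds them with Python's standard sqlite3 module. Only the file paths, table names, and the customer_id key come from the example; the name column and the rows are made-up sample data, and build_sample_db is a hypothetical helper, not part of Dift.

```python
import sqlite3
from pathlib import Path


def build_sample_db(path: str, table: str, rows: list[tuple[int, str]]) -> None:
    """Create a small SQLite database holding a single customers table.

    The (customer_id, name) schema is illustrative sample data, not a
    format that Dift requires. The table name is interpolated into SQL,
    so it must come from trusted input.
    """
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(path)
    try:
        conn.execute(f"DROP TABLE IF EXISTS {table}")
        conn.execute(
            f"CREATE TABLE {table} (customer_id INTEGER PRIMARY KEY, name TEXT)"
        )
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        conn.commit()
    finally:
        conn.close()


# The old table has two customers; the new one changes a name and adds a row.
build_sample_db("examples/old.db", "customers_old", [(1, "Ada"), (2, "Grace")])
build_sample_db(
    "examples/new.db", "customers_new", [(1, "Ada"), (2, "Grace H."), (3, "Linus")]
)
```

Running the script from the project root creates examples/old.db and examples/new.db, which match the paths in the command.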

PostgreSQL Support¶

Example:

dift postgresql://user:password@localhost:5432/sales_db:customers_old \
     postgresql://user:password@localhost:5432/sales_db:customers_new \
     --key customer_id

The alternative psycopg driver is also supported through the postgresql+psycopg scheme:

dift postgresql+psycopg://user:password@localhost:5432/sales_db:customers_old \
     postgresql+psycopg://user:password@localhost:5432/sales_db:customers_new \
     --key customer_id

MySQL Support¶

Example:

dift mysql+pymysql://user:password@localhost:3306/sales_db:customers_old \
     mysql+pymysql://user:password@localhost:3306/sales_db:customers_new \
     --key customer_id

Redshift Support¶

Example:

dift redshift+redshift_connector://user:password@cluster.region.redshift.amazonaws.com:5439/dev:orders_old \
     redshift+redshift_connector://user:password@cluster.region.redshift.amazonaws.com:5439/dev:orders_new \
     --key order_id

Snowflake Support¶

Example:

dift snowflake://user:password@account/db/schema?warehouse=compute_wh:orders_old \
     snowflake://user:password@account/db/schema?warehouse=compute_wh:orders_new \
     --key order_id
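All of the SQL examples above share one shape: a connection URL, a colon, and a table name. The sketch below shows one way such a source string could be split, assuming the table name is whatever follows the final colon. The names SqlSource and split_source are hypothetical, and Dift's own parser may behave differently.

```python
from typing import NamedTuple


class SqlSource(NamedTuple):
    url: str    # SQLAlchemy-style connection URL
    table: str  # table name that follows the final colon


def split_source(source: str) -> SqlSource:
    """Split '<connection-url>:<table>' at the last colon.

    Hypothetical helper for illustration only.
    """
    url, sep, table = source.rpartition(":")
    # If the text after the last colon contains '/' or '@', that colon
    # belonged to the URL itself (for example the port), so no table was given.
    if not sep or not table or "/" in table or "@" in table or "://" not in url:
        raise ValueError(f"expected '<connection-url>:<table>', got {source!r}")
    return SqlSource(url, table)
```

Splitting from the right keeps the colons inside the URL (scheme, password separator, port) intact, which is why the same rule works for the SQLite, PostgreSQL, and Snowflake forms shown above.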

BigQuery Support¶

Dift now supports direct Google BigQuery dataset comparison.

Example:

dift bigquery://my-project.analytics.customers_old \
     bigquery://my-project.analytics.customers_new \
     --key customer_id
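The BigQuery source in this example appears to follow BigQuery's usual project.dataset.table naming. The sketch below parses it under that assumption; BigQueryTable and parse_bigquery_source are hypothetical names used for illustration, not Dift's API.

```python
from typing import NamedTuple


class BigQueryTable(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_bigquery_source(source: str) -> BigQueryTable:
    """Parse 'bigquery://<project>.<dataset>.<table>' into its three parts.

    Assumes exactly three dot-separated segments, as in the example above.
    """
    prefix = "bigquery://"
    if not source.startswith(prefix):
        raise ValueError(f"not a BigQuery source: {source!r}")
    parts = source[len(prefix):].split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected bigquery://<project>.<dataset>.<table>, got {source!r}"
        )
    return BigQueryTable(*parts)
```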

BigQuery Features¶

BigQuery workflows support:

warehouse comparisons
cloud analytical validation
service account authentication
query-driven workflows

BigQuery Authentication¶

Dift uses standard Google Cloud authentication workflows.

Example:

export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"

DuckDB Support¶

Dift now supports analytical workflows using DuckDB databases.

Example:

dift duckdb:///warehouse.duckdb:customers_old \
     duckdb:///warehouse.duckdb:customers_new \
     --key customer_id

DuckDB Workflow Benefits¶

DuckDB support enables:

local analytical warehouses
Parquet interoperability
SQL-driven comparisons
lightweight warehouse validation

Centralized Reader Registry Architecture¶

v0.6.0 introduces a major internal architectural improvement: the Reader Registry System, which centralizes dataset routing and connector discovery.

Reader Registry Benefits¶

The new architecture enables:

centralized connector routing
modular connector design
reusable validation workflows
future plugin preparation
dynamic reader registration

Shared Reader Interface¶

Dift now uses standardized reader interfaces.

Example:

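class BaseReader:
    """Interface shared by all dataset readers."""

    def can_handle(self, source: str) -> bool:
        """Return True if this reader recognizes the source string."""
        ...

    def read(self, source: str):
        """Load the source and return its data."""
        ...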
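To make the interface concrete, here is a minimal sketch of a reader that implements it. The inline-csv: scheme and the InlineCsvReader class are invented for this example so that it runs without files or third-party packages, and it returns a list of dictionaries as a stand-in for the DataFrame a real reader would produce.

```python
import csv
import io


class BaseReader:
    """Interface from the release notes, repeated so the block is self-contained."""

    def can_handle(self, source: str) -> bool:
        ...

    def read(self, source: str):
        ...


class InlineCsvReader(BaseReader):
    """Toy reader for sources of the form 'inline-csv:<csv text>'.

    Hypothetical: the scheme exists only for this illustration.
    """

    PREFIX = "inline-csv:"

    def can_handle(self, source: str) -> bool:
        # Claim only sources that start with the invented prefix.
        return source.startswith(self.PREFIX)

    def read(self, source: str) -> list[dict[str, str]]:
        # Parse everything after the prefix as CSV with a header row.
        text = source[len(self.PREFIX):]
        return list(csv.DictReader(io.StringIO(text)))
```

Keeping can_handle cheap and free of side effects matters in this design, since a registry may call it on every reader before any data is loaded.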

Modular Reader Architecture¶

The new architecture separates connectors into dedicated readers.

Examples include:

LocalFileReader
SQLReader
DuckDBReader
BigQueryReader

Connector Registry System¶

Dift now includes centralized connector registration.

Example:

registry.register(SQLReader())
registry.register(BigQueryReader())

Centralized Routing¶

Dataset loading now follows a single path:

CLI
  ↓
Registry
  ↓
Reader
  ↓
Polars DataFrame

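Because the CLI hands every source to the registry, connector-specific logic stays inside the readers, which significantly reduces connector coupling.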
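The routing path above can be sketched as a registry that asks each registered reader whether it can handle a source and delegates to the first one that can. ReaderRegistry, its load method, and SchemeReader are hypothetical names for this illustration; only register, can_handle, and read appear in these notes, and the real registry may resolve readers differently.

```python
class SchemeReader:
    """Stub reader that claims sources by URI scheme (illustration only)."""

    def __init__(self, name: str, schemes: tuple[str, ...]):
        self.name = name
        self.schemes = schemes

    def can_handle(self, source: str) -> bool:
        # Compare the scheme without any '+driver' suffix, so that
        # 'postgresql+psycopg://...' is matched by 'postgresql'.
        scheme = source.split("://", 1)[0].split("+", 1)[0]
        return "://" in source and scheme in self.schemes

    def read(self, source: str):
        # A real reader would return a Polars DataFrame; the stub returns a
        # marker tuple so the routing decision is visible.
        return (self.name, source)


class ReaderRegistry:
    """Keeps readers in registration order and routes each source to one."""

    def __init__(self):
        self._readers = []

    def register(self, reader) -> None:
        self._readers.append(reader)

    def load(self, source: str):
        # The first reader that accepts the source wins.
        for reader in self._readers:
            if reader.can_handle(source):
                return reader.read(source)
        raise ValueError(f"No registered reader can handle {source!r}")


registry = ReaderRegistry()
registry.register(SchemeReader("sql", ("sqlite", "postgresql", "mysql")))
registry.register(SchemeReader("bigquery", ("bigquery",)))
```

With this shape, adding a connector means registering one more reader; the caller of load does not change.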

Plugin Preparation Architecture¶

v0.6.0 lays the internal foundation for future plugin ecosystems.

Future goals include:

third-party connectors
enterprise extensions
optional integrations
community-maintained plugins

Future Plugin Possibilities¶

Potential future structure:

dift/plugins/
├── snowflake/
├── databricks/
├── kafka/
├── s3/
└── spark/

Connector Isolation Improvements¶

Connectors are now significantly more isolated from the comparison engine.

Benefits include:

cleaner maintenance
scalable architecture
easier testing
optional dependency preparation

Progress Indicators¶

Dift now includes lightweight progress indicators for long-running operations.

Progress coverage includes:

dataset loading
SQL loading
warehouse queries
comparison execution
report generation
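As an illustration of what a lightweight indicator for these steps could look like, the sketch below wraps a step in a context manager that prints a label before the work and the outcome with elapsed time afterwards. The progress function is a hypothetical example, not Dift's implementation, and writing to stderr by default is an assumption made so that report output on stdout stays clean.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def progress(label: str, stream=None):
    """Print '<label> ... ' before a step and 'done' or 'failed' after it."""
    out = stream if stream is not None else sys.stderr
    start = time.perf_counter()
    out.write(f"{label} ... ")
    out.flush()
    try:
        yield
    except BaseException:
        # Report the failure, then let the original exception propagate.
        out.write("failed\n")
        raise
    else:
        out.write(f"done ({time.perf_counter() - start:.1f}s)\n")
    finally:
        out.flush()
```

A caller would wrap each long-running step, for example with progress("Loading dataset"): followed by the load call.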

Progress Indicator Goals¶

Progress indicators improve:

CLI responsiveness
warehouse UX
large dataset workflows
automation visibility

Improved Validation UX¶

Validation workflows were significantly improved.

Enhancements include:

clearer connector guidance
better dependency installation help
improved unsupported URI handling
actionable validation messages

Example Validation Error¶

PostgreSQL support requires psycopg2.

Install it with:
  pip install psycopg2-binary
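A message like this can be produced by checking whether a driver is importable before a connector is used. The sketch below shows one way to do that with the standard library; INSTALL_HINTS and missing_dependency_message are hypothetical names, and the table entries simply mirror install commands that appear in these notes.

```python
import importlib.util
from typing import Optional

# Import name -> (feature label, pip package). Hypothetical lookup table.
INSTALL_HINTS = {
    "psycopg2": ("PostgreSQL", "psycopg2-binary"),
    "pymysql": ("MySQL", "pymysql"),
    "duckdb": ("DuckDB", "duckdb"),
}


def missing_dependency_message(module: str) -> Optional[str]:
    """Return an install hint if `module` cannot be imported, else None."""
    # find_spec locates a top-level module without importing it.
    if importlib.util.find_spec(module) is not None:
        return None
    feature, package = INSTALL_HINTS[module]
    return (
        f"{feature} support requires {module}.\n\n"
        f"Install it with:\n  pip install {package}"
    )
```

Checking up front lets the CLI fail with guidance before any connection is attempted, instead of surfacing a raw import error.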

Improved Unsupported Format Errors¶

Example:

Unsupported dataset type '.txt'.

Supported local file types:
.csv, .json, .parquet, .xlsx
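The sketch below shows how a check that yields this message might look. SUPPORTED_SUFFIXES and check_local_file_type are hypothetical names; the suffix list is taken from the message above, and the case-insensitive comparison is an assumption.

```python
from pathlib import Path

# Suffixes listed in the example error message.
SUPPORTED_SUFFIXES = (".csv", ".json", ".parquet", ".xlsx")


def check_local_file_type(path: str) -> str:
    """Return the normalized suffix, or raise ValueError if it is unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unsupported dataset type '{suffix}'.\n\n"
            "Supported local file types:\n" + ", ".join(SUPPORTED_SUFFIXES)
        )
    return suffix
```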

Expanded Connector Testing¶

Test coverage expanded substantially in v0.6.0.

New testing areas include:

SQL connector testing
warehouse mocking
connector routing
registry behavior
URI parsing
dependency guidance
progress indicator workflows

Warehouse Mock Testing¶

Warehouse integrations now support mocked testing workflows for:

BigQuery
Snowflake
Redshift

This improves:

offline development
reproducibility
CI stability
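As a sketch of what a mocked warehouse test can look like, the example below replaces a warehouse client with a MagicMock from Python's standard unittest.mock module. The count_rows function and the client.query interface are assumptions made for this illustration; they are not Dift's code or the API of any specific warehouse SDK.

```python
from unittest.mock import MagicMock


def count_rows(client, table: str) -> int:
    """Hypothetical code under test: ask a warehouse client for a row count.

    Assumes client.query(sql) returns an iterable of row tuples.
    """
    rows = list(client.query(f"SELECT COUNT(*) FROM {table}"))
    return rows[0][0]


def test_count_rows_without_a_warehouse():
    # The fake client answers instantly, so no network or credentials are needed.
    fake_client = MagicMock()
    fake_client.query.return_value = [(42,)]

    assert count_rows(fake_client, "analytics.orders_new") == 42
    # The mock also records the call, so the generated SQL can be verified.
    fake_client.query.assert_called_once_with(
        "SELECT COUNT(*) FROM analytics.orders_new"
    )
```

Because the test never opens a connection, it runs the same way offline and in CI.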

Internal Refactoring¶

Major internal refactors include:

reusable connector validation
centralized routing logic
modular dataset readers
cleaner error handling
reusable connector abstractions

Improved Developer Extensibility¶

The new architecture significantly improves future extensibility.

New connectors can now be added with far fewer core modifications.

Optional Dependency Preparation¶

Connector dependencies are now more isolated.

Examples:

pip install sqlalchemy
pip install duckdb
pip install google-cloud-bigquery

This reduces unnecessary installation overhead.

Improved CLI UX¶

CLI workflows now provide:

clearer connector guidance
improved progress visibility
stronger validation messaging
better warehouse workflows

Supported Dataset Sources¶

Dift v0.6.0 supports:

Local Files¶

CSV
Parquet
Excel
JSON

Databases & Warehouses¶

SQLite
PostgreSQL
MySQL
DuckDB
BigQuery
Redshift
Snowflake

Supported Report Formats¶

Supported outputs:

console report
JSON report
CSV report
Excel report
HTML report

Example Workflows¶

Compare PostgreSQL Tables¶

dift postgresql://user:password@localhost:5432/db:customers_old \
     postgresql://user:password@localhost:5432/db:customers_new \
     --key customer_id

Compare DuckDB Tables¶

dift duckdb:///warehouse.duckdb:orders_old \
     duckdb:///warehouse.duckdb:orders_new \
     --key order_id

Compare BigQuery Tables¶

dift bigquery://analytics.sales.orders_old \
     bigquery://analytics.sales.orders_new \
     --key order_id

Generate HTML Report¶

dift old.csv new.csv \
  --report html \
  --template enterprise \
  --output report.html

Installation¶

Install Dift:

pip install dift-cli

Upgrade:

pip install --upgrade dift-cli

Optional Connector Dependencies¶

Install SQL support:

pip install sqlalchemy

Install PostgreSQL support:

pip install psycopg2-binary

Install MySQL support:

pip install pymysql

Install BigQuery support:

pip install google-cloud-bigquery db-dtypes

Install DuckDB support:

pip install duckdb

Architecture Milestone¶

v0.6.0 represents a major architectural milestone for Dift.

The platform now includes:

scalable connector routing
reusable reader abstractions
plugin preparation
warehouse-ready workflows
extensible connector architecture

This release lays the groundwork for future ecosystem expansion.

Known Limitations¶

Current limitations:

no external plugin loading yet
no distributed execution yet
no streaming connectors yet
no async connector execution yet

These are planned for future releases.

Looking Ahead¶

Future releases may focus on:

plugin ecosystems
Databricks support
S3 support
Spark support
Kafka support
distributed execution
streaming validation
enterprise workflows

Vision¶

Dift continues to evolve toward becoming the open-source standard for:

dataset regression testing
warehouse trust validation
data drift monitoring
ML data validation
automated data quality enforcement
enterprise data trust workflows

Thank You¶

Thank you to everyone contributing ideas, feedback, testing, architecture discussions, validation improvements, and connector workflows throughout Dift’s rapid evolution.

Dift v0.6.0 marks the beginning of Dift’s transition into a scalable connector-driven data trust platform.