Dift v0.6.0 Release Notes¶
Release Date: May 20, 2026
Dift v0.6.0¶
Dift v0.6.0 is one of the largest architectural releases so far.
This release introduces database connectors, cloud warehouse support, a centralized reader architecture, a connector registry, progress indicators, internal plugin preparation, and major improvements to developer extensibility.
With this release, Dift evolves from a local dataset comparison tool into a scalable data trust platform that supports modern analytical workflows across files, databases, and warehouses.
Highlights¶
Dift v0.6.0 introduces:
- SQL database support
- SQLite support
- PostgreSQL support
- MySQL support
- Redshift support
- Snowflake support
- BigQuery support
- DuckDB support
- centralized reader registry architecture
- modular connector interfaces
- plugin preparation architecture
- reusable reader abstractions
- progress indicators
- connector routing improvements
- connector validation improvements
- improved dependency guidance
- improved CLI responsiveness
- improved warehouse workflows
- expanded connector testing
- warehouse mocking support
- improved validation UX
- cleaner internal architecture
Major New Features¶
SQL Database Support¶
Dift now supports direct SQL-based dataset comparison using SQLAlchemy-compatible connection strings.
Supported workflows include:
- database-to-database comparison
- table-to-table comparison
- warehouse validation
- ETL validation
- query-driven workflows
Supported SQL Systems¶
Supported SQL connectors include:
- SQLite
- PostgreSQL
- MySQL
- Redshift
- Snowflake
SQLite Support¶
Example:
dift sqlite:///examples/old.db:customers_old \
sqlite:///examples/new.db:customers_new \
--key customer_id
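The SQL sources in these examples share the shape `<connection URL>:<table>`. As a rough illustration of how such a string could be taken apart, here is a minimal sketch. `split_source` is a hypothetical helper, not Dift's actual parser, and the table-name pattern is an assumption.

```python
import re

# Assumed table-name pattern: an identifier, optionally schema-qualified.
_TABLE_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def split_source(source: str) -> tuple[str, str]:
    """Split '<connection URL>:<table>' on the last colon (hypothetical helper)."""
    url, sep, table = source.rpartition(":")
    # Reject inputs whose trailing part is not a table name, for example a URL
    # that ends in ':5432/sales_db' because no table suffix was given.
    if not sep or "://" not in url or not _TABLE_RE.fullmatch(table):
        raise ValueError(f"Expected '<connection URL>:<table>', got {source!r}")
    return url, table


if __name__ == "__main__":
    print(split_source("sqlite:///examples/old.db:customers_old"))
```

Splitting on the last colon keeps passwords, ports, and query parameters inside the connection URL, which is why the check on the trailing part matters.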
PostgreSQL Support¶
Example:
dift postgresql://user:password@localhost:5432/sales_db:customers_old \
postgresql://user:password@localhost:5432/sales_db:customers_new \
--key customer_id
To use the psycopg (version 3) driver instead of the default psycopg2 driver, use the postgresql+psycopg scheme:
dift postgresql+psycopg://user:password@localhost:5432/sales_db:customers_old \
postgresql+psycopg://user:password@localhost:5432/sales_db:customers_new \
--key customer_id
MySQL Support¶
Example:
dift mysql+pymysql://user:password@localhost:3306/sales_db:customers_old \
mysql+pymysql://user:password@localhost:3306/sales_db:customers_new \
--key customer_id
Redshift Support¶
Example:
dift redshift+redshift_connector://user:password@cluster.region.redshift.amazonaws.com:5439/dev:orders_old \
redshift+redshift_connector://user:password@cluster.region.redshift.amazonaws.com:5439/dev:orders_new \
--key order_id
Snowflake Support¶
Example:
dift snowflake://user:password@account/db/schema?warehouse=compute_wh:orders_old \
snowflake://user:password@account/db/schema?warehouse=compute_wh:orders_new \
--key order_id
BigQuery Support¶
Dift now supports direct Google BigQuery dataset comparison.
Example:
dift bigquery://my-project.analytics.customers_old \
bigquery://my-project.analytics.customers_new \
--key customer_id
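The BigQuery source in this example reads as a `bigquery://` prefix followed by a dot-separated table ID. The sketch below assumes the three parts are project, dataset, and table, matching BigQuery's own fully qualified table IDs. `parse_bigquery_source` and `BigQueryTable` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class BigQueryTable(NamedTuple):
    project: str
    dataset: str
    table: str


def parse_bigquery_source(source: str) -> BigQueryTable:
    """Parse 'bigquery://project.dataset.table' (assumed layout)."""
    prefix = "bigquery://"
    if not source.startswith(prefix):
        raise ValueError(f"Not a BigQuery source: {source!r}")
    parts = source[len(prefix):].split(".")
    # Exactly three non-empty parts are required in this sketch.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"Expected bigquery://project.dataset.table, got {source!r}")
    return BigQueryTable(*parts)


if __name__ == "__main__":
    print(parse_bigquery_source("bigquery://my-project.analytics.customers_old"))
```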
BigQuery Features¶
BigQuery workflows support:
- warehouse comparisons
- cloud analytical validation
- service account authentication
- query-driven workflows
BigQuery Authentication¶
Dift uses standard Google Cloud authentication workflows.
For example, point GOOGLE_APPLICATION_CREDENTIALS at a service account key file:
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
DuckDB Support¶
Dift now supports analytical workflows using DuckDB databases.
Example:
dift duckdb:///warehouse.duckdb:customers_old \
duckdb:///warehouse.duckdb:customers_new \
--key customer_id
DuckDB Workflow Benefits¶
DuckDB support enables:
- local analytical warehouses
- Parquet interoperability
- SQL-driven comparisons
- lightweight warehouse validation
Centralized Reader Registry Architecture¶
v0.6.0 introduces a major internal architectural improvement, the Reader Registry System, which centralizes dataset routing and connector discovery.
Reader Registry Benefits¶
The new architecture enables:
- centralized connector routing
- modular connector design
- reusable validation workflows
- future plugin preparation
- dynamic reader registration
Shared Reader Interface¶
Dift now uses standardized reader interfaces.
Example:
class BaseReader:
    def can_handle(self, source: str) -> bool:
        ...

    def read(self, source: str):
        ...
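To make the interface above more concrete, here is a minimal sketch of what a reader implementing it could look like. Everything beyond the two method names is an assumption: the abstract base class is one possible way to enforce the interface, `CsvRowsReader` is a hypothetical reader, and it returns a list of dicts rather than a Polars DataFrame so that the example needs only the standard library.

```python
import csv
from abc import ABC, abstractmethod
from pathlib import Path


class BaseReader(ABC):
    """Assumed abstract form of the interface shown in the release notes."""

    @abstractmethod
    def can_handle(self, source: str) -> bool:
        """Return True if this reader knows how to load `source`."""

    @abstractmethod
    def read(self, source: str):
        """Load `source` and return its rows."""


class CsvRowsReader(BaseReader):
    """Hypothetical reader for local CSV files (illustration only)."""

    def can_handle(self, source: str) -> bool:
        return Path(source).suffix.lower() == ".csv"

    def read(self, source: str) -> list[dict]:
        # Simplification: rows come back as dicts of strings, not a DataFrame.
        with open(source, newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
```

Keeping `can_handle` cheap and free of I/O lets a caller probe several readers before anything is actually loaded.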
Modular Reader Architecture¶
The new architecture separates connectors into dedicated readers.
Examples include:
- LocalFileReader
- SQLReader
- DuckDBReader
- BigQueryReader
Connector Registry System¶
Dift now includes centralized connector registration.
Example:
registry.register(SQLReader())
registry.register(BigQueryReader())
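The registration calls above suggest a registry that holds reader instances and picks one per source. The following is a sketch under stated assumptions, not Dift's implementation: the `resolve` method name, the first-match-wins rule, and the `PrefixReader` stand-in are all hypothetical.

```python
from typing import Protocol


class Reader(Protocol):
    def can_handle(self, source: str) -> bool: ...
    def read(self, source: str): ...


class ReaderRegistry:
    """Hypothetical registry: readers are tried in registration order."""

    def __init__(self) -> None:
        self._readers: list[Reader] = []

    def register(self, reader: Reader) -> None:
        self._readers.append(reader)

    def resolve(self, source: str) -> Reader:
        # Assumption: the first reader that claims the source wins.
        for reader in self._readers:
            if reader.can_handle(source):
                return reader
        raise ValueError(f"No registered reader can handle {source!r}")


class PrefixReader:
    """Stand-in reader used only for this demo."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix

    def can_handle(self, source: str) -> bool:
        return source.startswith(self.prefix)

    def read(self, source: str):
        return f"rows from {source}"


registry = ReaderRegistry()
registry.register(PrefixReader("bigquery://"))
registry.register(PrefixReader("sqlite://"))
```

With this shape, the calling code only asks the registry for a reader and never imports a specific connector.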
Centralized Routing¶
Dataset loading now follows:
CLI
↓
Registry
↓
Reader
↓
Polars DataFrame
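Because the CLI talks only to the registry, it no longer needs to know about individual connectors, which significantly reduces connector coupling.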
Plugin Preparation Architecture¶
v0.6.0 lays the internal foundation for future plugin ecosystems.
Future goals include:
- third-party connectors
- enterprise extensions
- optional integrations
- community-maintained plugins
Future Plugin Possibilities¶
Potential future structure:
dift/plugins/
├── snowflake/
├── databricks/
├── kafka/
├── s3/
└── spark/
Connector Isolation Improvements¶
Connectors are now significantly more isolated from the comparison engine.
Benefits include:
- cleaner maintenance
- scalable architecture
- easier testing
- optional dependency preparation
Progress Indicators¶
Dift now includes lightweight progress indicators for long-running operations.
Progress coverage includes:
- dataset loading
- SQL loading
- warehouse queries
- comparison execution
- report generation
Progress Indicator Goals¶
Progress indicators improve:
- CLI responsiveness
- warehouse UX
- large dataset workflows
- automation visibility
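One lightweight way to report progress for the stages listed above is a context manager that prints a label when a stage starts and a status when it ends. This is only a sketch of the general idea. `progress_step` is a hypothetical name, and the output format is an assumption rather than what Dift prints.

```python
import sys
import time
from contextlib import contextmanager


@contextmanager
def progress_step(label: str, stream=None):
    """Write '<label>...' on entry, then ' done (N.Ns)' or ' failed' on exit."""
    out = stream if stream is not None else sys.stderr
    out.write(f"{label}...")
    out.flush()
    started = time.perf_counter()
    try:
        yield
    except Exception:
        out.write(" failed\n")
        raise
    else:
        out.write(f" done ({time.perf_counter() - started:.1f}s)\n")
    finally:
        out.flush()


if __name__ == "__main__":
    with progress_step("Loading dataset"):
        time.sleep(0.2)
```

Writing to standard error by default keeps progress lines out of any report that is piped from standard output.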
Improved Validation UX¶
Validation workflows were significantly improved.
Enhancements include:
- clearer connector guidance
- better dependency installation help
- improved unsupported URI handling
- actionable validation messages
Example Validation Error¶
PostgreSQL support requires psycopg2.
Install it with:
pip install psycopg2-binary
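A message like the one above can be produced by checking for the driver before a connector is used. The sketch below shows one way to do that with the standard library. `require_module` and `MissingDependencyError` are hypothetical names, and Dift's own check may work differently.

```python
import importlib.util


class MissingDependencyError(RuntimeError):
    """Raised when an optional connector dependency is not installed."""


def require_module(module: str, feature: str, pip_name: str) -> None:
    """Raise an actionable error if top-level `module` cannot be imported."""
    # find_spec returns None for a missing top-level module without importing it.
    if importlib.util.find_spec(module) is None:
        raise MissingDependencyError(
            f"{feature} support requires {module}.\n"
            "Install it with:\n"
            f"pip install {pip_name}"
        )


if __name__ == "__main__":
    try:
        require_module("psycopg2", "PostgreSQL", "psycopg2-binary")
        print("psycopg2 is available")
    except MissingDependencyError as exc:
        print(exc)
```

The import name and the package name are passed separately because they can differ, as with psycopg2 and psycopg2-binary.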
Improved Unsupported Format Errors¶
Example:
Unsupported dataset type '.txt'.
Supported local file types:
.csv, .json, .parquet, .xlsx
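For illustration, an error of this form could be generated from a single list of supported suffixes, so that the message and the check cannot drift apart. The names `SUPPORTED_SUFFIXES`, `UnsupportedFormatError`, and `check_local_file_type` are hypothetical. Only the message text and the four file types come from the example above.

```python
from pathlib import Path

# File types taken from the example error message.
SUPPORTED_SUFFIXES = (".csv", ".json", ".parquet", ".xlsx")


class UnsupportedFormatError(ValueError):
    """Raised for local files whose suffix is not supported."""


def check_local_file_type(path: str) -> str:
    """Return the lowercased suffix of `path`, or raise with guidance."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise UnsupportedFormatError(
            f"Unsupported dataset type '{suffix}'.\n"
            "Supported local file types:\n"
            + ", ".join(SUPPORTED_SUFFIXES)
        )
    return suffix


if __name__ == "__main__":
    try:
        check_local_file_type("notes.txt")
    except UnsupportedFormatError as exc:
        print(exc)
```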
Expanded Connector Testing¶
Test coverage expanded substantially in v0.6.0.
New testing areas include:
- SQL connector testing
- warehouse mocking
- connector routing
- registry behavior
- URI parsing
- dependency guidance
- progress indicator workflows
Warehouse Mock Testing¶
Warehouse integrations now support mocked testing workflows for:
- BigQuery
- Snowflake
- Redshift
This improves:
- offline development
- reproducibility
- CI stability
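As a sketch of what mocked warehouse testing can look like, the example below replaces a BigQuery client with a `unittest.mock.MagicMock` so that no network access or credentials are needed. `load_rows` and `make_fake_client` are hypothetical helpers. The `client.query(...).result()` call shape mirrors the google-cloud-bigquery client, but how Dift structures its own tests is not shown in these notes.

```python
from unittest.mock import MagicMock


def load_rows(client, table_id: str) -> list[dict]:
    """Hypothetical loader: select every row and return plain dicts."""
    job = client.query(f"SELECT * FROM `{table_id}`")
    return [dict(row.items()) for row in job.result()]


def make_fake_client(rows: list[dict]) -> MagicMock:
    """Build a stand-in client whose query(...).result() yields `rows`."""
    client = MagicMock(name="bigquery.Client")
    client.query.return_value.result.return_value = rows
    return client


fake_client = make_fake_client([{"order_id": 1, "total": 9.5}])
rows = load_rows(fake_client, "my-project.analytics.orders_old")

# The mock records the call, so the generated SQL can be checked offline.
fake_client.query.assert_called_once_with(
    "SELECT * FROM `my-project.analytics.orders_old`"
)
```

Passing the client in as an argument is what makes the substitution possible without patching any imports.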
Internal Refactoring¶
Major internal refactors include:
- reusable connector validation
- centralized routing logic
- modular dataset readers
- cleaner error handling
- reusable connector abstractions
Improved Developer Extensibility¶
The new architecture significantly improves future extensibility.
New connectors can now be added by implementing a reader and registering it, with far fewer changes to core code.
Optional Dependency Preparation¶
Connector dependencies are now more isolated, so a connector's driver only needs to be installed when that connector is used.
Examples:
pip install sqlalchemy
pip install duckdb
pip install google-cloud-bigquery
This reduces unnecessary installation overhead.
Improved CLI UX¶
CLI workflows now provide:
- clearer connector guidance
- improved progress visibility
- stronger validation messaging
- better warehouse workflows
Supported Dataset Sources¶
Dift v0.6.0 supports:
Local Files¶
- CSV
- Parquet
- Excel
- JSON
Databases & Warehouses¶
- SQLite
- PostgreSQL
- MySQL
- DuckDB
- BigQuery
- Redshift
- Snowflake
Supported Report Formats¶
Supported outputs:
- console report
- JSON report
- CSV report
- Excel report
- HTML report
Example Workflows¶
Compare PostgreSQL Tables¶
dift postgresql://user:password@localhost:5432/db:customers_old \
postgresql://user:password@localhost:5432/db:customers_new \
--key customer_id
Compare DuckDB Tables¶
dift duckdb:///warehouse.duckdb:orders_old \
duckdb:///warehouse.duckdb:orders_new \
--key order_id
Compare BigQuery Tables¶
dift bigquery://analytics.sales.orders_old \
bigquery://analytics.sales.orders_new \
--key order_id
Generate HTML Report¶
dift old.csv new.csv \
--report html \
--template enterprise \
--output report.html
Installation¶
Install Dift:
pip install dift-cli
Upgrade:
pip install --upgrade dift-cli
Optional Connector Dependencies¶
Install SQL support:
pip install sqlalchemy
Install PostgreSQL support:
pip install psycopg2-binary
Install MySQL support:
pip install pymysql
Install BigQuery support:
pip install google-cloud-bigquery db-dtypes
Install DuckDB support:
pip install duckdb
Architecture Milestone¶
v0.6.0 represents a major architectural milestone for Dift.
The platform now includes:
- scalable connector routing
- reusable reader abstractions
- plugin preparation
- warehouse-ready workflows
- extensible connector architecture
This release lays the groundwork for future ecosystem expansion.
Known Limitations¶
Current limitations:
- no external plugin loading yet
- no distributed execution yet
- no streaming connectors yet
- no async connector execution yet
These may be addressed in future releases.
Looking Ahead¶
Future releases may focus on:
- plugin ecosystems
- Databricks support
- S3 support
- Spark support
- Kafka support
- distributed execution
- streaming validation
- enterprise workflows
Vision¶
Dift continues to evolve toward becoming the open-source standard for:
- dataset regression testing
- warehouse trust validation
- data drift monitoring
- ML data validation
- automated data quality enforcement
- enterprise data trust workflows
Thank You¶
Thank you to everyone contributing ideas, feedback, testing, architecture discussions, validation improvements, and connector workflows throughout Dift’s rapid evolution.
Dift v0.6.0 marks the beginning of Dift’s transition into a scalable connector-driven data trust platform.