Dift v0.3.0 Release Notes¶
Release Date: May 3, 2026
Dift v0.3.0¶
Dift v0.3.0 introduces major improvements to reporting workflows, configuration-driven execution, reusable validation settings, and developer experience.
This release significantly expands Dift beyond simple dataset comparison into a more reusable and automation-friendly validation platform.
Highlights¶
Dift v0.3.0 introduces:
- reusable configuration files
- JSON configuration support
- TOML configuration support
- YAML configuration support
- improved JSON reports
- CSV summary reporting
- Excel report improvements
- HTML report improvements
- reusable threshold configurations
- column-level threshold overrides
- environment-based configurations
- environment variable support
- output directory support
- improved validation errors
- stronger CLI workflows
Major New Features¶
Configuration File Support¶
Dift now supports reusable configuration files.
Supported formats:
- YAML
- TOML
- JSON
This allows teams to standardize and reuse comparison workflows.
Example YAML Config¶
old_dataset: examples/old.csv
new_dataset: examples/new.csv
key: customer_id
threshold: 0.1
report: html
Run using:
dift --config config.yaml
TOML Configuration Support¶
Example:
old_dataset = "examples/old.csv"
new_dataset = "examples/new.csv"
key = "customer_id"
report = "json"
JSON Configuration Support¶
Example:
{
  "old_dataset": "examples/old.csv",
  "new_dataset": "examples/new.csv",
  "key": "customer_id",
  "report": "csv"
}
Dataset Paths Inside Configs¶
Dataset paths can now be fully defined inside config files.
This enables cleaner automation workflows:
dift --config config_with_datasets.yaml
CLI Override Support¶
CLI arguments now override config values.
Priority order:
CLI arguments > Config values > Defaults
This enables flexible workflow customization.
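The priority order can be pictured as a layered merge. This is a sketch under assumptions rather than Dift's implementation: `resolve_settings` and the values in `DEFAULTS` are hypothetical, and a CLI option that was not passed is represented as `None`.

```python
DEFAULTS = {"threshold": 0.1, "report": "console"}  # illustrative defaults, not Dift's


def resolve_settings(defaults: dict, config: dict, cli_args: dict) -> dict:
    """Merge settings so that CLI arguments > config values > defaults."""
    merged = dict(defaults)
    merged.update(config)
    # Only CLI options the user actually passed (not None) override the config.
    merged.update({k: v for k, v in cli_args.items() if v is not None})
    return merged


config = {"key": "customer_id", "threshold": 0.2, "report": "html"}
cli_args = {"report": "json", "threshold": None}

settings = resolve_settings(DEFAULTS, config, cli_args)
print(settings)
```

Here `report` comes from the CLI, `threshold` and `key` from the config, and nothing is left to take from the defaults.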
Reusable Threshold Configurations¶
Dift now supports reusable threshold policies.
Thresholds can be configured globally or per-column.
Global Threshold Example¶
thresholds:
  numeric: 0.1
  categorical: 0.2
  outlier: 0.15
Column-Level Threshold Overrides¶
Example:
columns:
  revenue:
    numeric: 0.05
  segment:
    categorical: 0.3
This enables highly customized validation behavior.
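One way to read the two examples together is that a column-level value wins when present and the global value applies otherwise. The sketch below assumes that `thresholds` and `columns` sit side by side in one config; `threshold_for` is a hypothetical helper, not part of Dift's documented API.

```python
config = {
    "thresholds": {"numeric": 0.1, "categorical": 0.2, "outlier": 0.15},
    "columns": {
        "revenue": {"numeric": 0.05},
        "segment": {"categorical": 0.3},
    },
}


def threshold_for(config: dict, column: str, kind: str) -> float:
    """Return the column-level threshold if set, else the global one."""
    column_overrides = config.get("columns", {}).get(column, {})
    if kind in column_overrides:
        return column_overrides[kind]
    return config["thresholds"][kind]


print(threshold_for(config, "revenue", "numeric"))  # column override
print(threshold_for(config, "age", "numeric"))      # falls back to the global value
```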
Environment-Based Configurations¶
Dift now supports named environments within a single config file, each with its own settings.
Example:
environments:
  development:
    threshold: 0.2
  production:
    threshold: 0.05
Run using:
dift --config config_env.yaml --env production
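A plausible reading of `--env` is that the selected environment's values are overlaid on the base config. The sketch below works on that assumption; `apply_environment` is a hypothetical helper, and the top-level `key` and `threshold` entries are added only for illustration.

```python
from typing import Optional

config = {
    "key": "customer_id",
    "threshold": 0.1,
    "environments": {
        "development": {"threshold": 0.2},
        "production": {"threshold": 0.05},
    },
}


def apply_environment(config: dict, env: Optional[str]) -> dict:
    """Overlay the selected environment's values on the base settings."""
    base = {k: v for k, v in config.items() if k != "environments"}
    if env is None:
        return base
    environments = config.get("environments", {})
    if env not in environments:
        raise KeyError(f"Unknown environment '{env}'")
    return {**base, **environments[env]}


production = apply_environment(config, "production")
print(production["threshold"])
```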
Environment Variable Support¶
Dift now supports environment variable interpolation inside config files.
Example:
old_dataset: ${OLD_DATASET}
new_dataset: ${NEW_DATASET}
This makes the same config easier to reuse in CI/CD pipelines and keeps environment-specific or sensitive values out of the file.
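The `${NAME}` placeholders can be expanded with a small regular-expression substitution. This is a sketch of the general technique, not Dift's own code: `interpolate` is a hypothetical helper, and raising an error for an unset variable is an assumed behavior.

```python
import os
import re
from typing import Mapping

_VAR_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value: str, env: Mapping[str, str] = os.environ) -> str:
    """Replace each ${NAME} in value with env[NAME]; fail if NAME is unset."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"Environment variable '{name}' is not set")
        return env[name]

    return _VAR_PATTERN.sub(replace, value)


# A hand-built mapping stands in for the real environment here.
env = {"OLD_DATASET": "data/old.csv", "NEW_DATASET": "data/new.csv"}
raw_config = {"old_dataset": "${OLD_DATASET}", "new_dataset": "${NEW_DATASET}"}
resolved = {key: interpolate(value, env) for key, value in raw_config.items()}
print(resolved)
```

Failing loudly on a missing variable is a design choice in this sketch: a silently empty dataset path would be harder to diagnose in a CI run.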
Output Directory Support¶
Reports can now be written directly to output directories.
Example:
dift old.csv new.csv \
  --report json \
  --output-dir reports/
Dift automatically generates report filenames.
Auto-Generated Report Names¶
Examples:
dift_report.json
dift_report.csv
dift_report.xlsx
dift_report.html
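The naming scheme above amounts to joining the output directory with a fixed stem and a format-specific extension. In this sketch, `report_path` is a hypothetical helper, and the `"excel"` key is an assumed name for the Excel report option; only `json` and `html` appear as `--report` values in these notes.

```python
from pathlib import Path

# Extensions inferred from the example filenames; the "excel" key is an assumption.
REPORT_EXTENSIONS = {"json": ".json", "csv": ".csv", "excel": ".xlsx", "html": ".html"}


def report_path(output_dir: str, report_format: str) -> Path:
    """Build the report path inside output_dir for the given format."""
    return Path(output_dir) / f"dift_report{REPORT_EXTENSIONS[report_format]}"


print(report_path("reports/", "json").as_posix())
```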
Improved JSON Report Structure¶
JSON reports were redesigned for:
- cleaner automation support
- improved consistency
- future extensibility
New report sections include:
- metadata
- summary
- schema
- rows
- quality
- numeric
- categorical
Metadata Support¶
JSON reports now include metadata such as:
- tool name
- version
- report type
Example:
"metadata": {
  "tool": "dift",
  "version": "0.3.0"
}
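Downstream automation can use the metadata block to check which tool and version produced a report. The sketch below assumes a report whose top level contains the `metadata` object shown above; `report_version` is a hypothetical helper, and no other report fields are assumed.

```python
import json

# Hypothetical report excerpt: only the "metadata" fields shown above are included.
report_text = '{"metadata": {"tool": "dift", "version": "0.3.0"}}'


def report_version(text: str) -> tuple:
    """Return the report's tool version as a tuple of ints."""
    metadata = json.loads(text)["metadata"]
    if metadata["tool"] != "dift":
        raise ValueError(f"Unexpected tool: {metadata['tool']}")
    return tuple(int(part) for part in metadata["version"].split("."))


version = report_version(report_text)
print(version >= (0, 3, 0))
```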
Better CSV Summary Reports¶
CSV summary reports are now more consistent, making them easier to use in automation workflows and lightweight monitoring.
Improved Excel Reports¶
Excel reports received multiple improvements:
- better formatting
- improved worksheet organization
- cleaner readability
- improved summaries
Improved HTML Reports¶
HTML reports now include:
- better layouts
- improved summaries
- cleaner warning visibility
- improved readability
Better Validation Errors¶
Validation workflows were significantly improved.
Examples include:
- clearer missing dataset guidance
- improved unsupported file type errors
- actionable configuration validation
Example Unsupported File Error¶
Unsupported dataset type '.txt'.
Supported local file types:
.csv, .json, .parquet, .xlsx
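A check that produces a message of this shape could look like the following. This is an illustrative sketch only: `validate_dataset_path` is a hypothetical function, and the suffix list is copied from the example error above rather than from Dift's source.

```python
from pathlib import Path

# Suffixes taken from the example error message above.
SUPPORTED_SUFFIXES = (".csv", ".json", ".parquet", ".xlsx")


def validate_dataset_path(path: str) -> None:
    """Raise ValueError with an actionable message for unsupported file types."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        raise ValueError(
            f"Unsupported dataset type '{suffix}'.\n"
            "Supported local file types:\n"
            + ", ".join(SUPPORTED_SUFFIXES)
        )


try:
    validate_dataset_path("notes.txt")
except ValueError as error:
    message = str(error)
    print(message)
```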
Improved CLI UX¶
CLI workflows were improved through:
- clearer help output
- better command guidance
- improved validation behavior
- cleaner execution workflows
Testing Improvements¶
Test coverage expanded significantly.
New focus areas include:
- config loading
- output directory workflows
- JSON schema consistency
- validation stability
- CLI regression protection
Example Usage¶
Run using config:
dift --config config_sample.yaml
Generate HTML report:
dift old.csv new.csv \
  --report html \
  --template clean
Supported Dataset Formats¶
Supported formats remain:
- CSV
- Parquet
- Excel (.xlsx, .xls)
- JSON
Report Formats¶
Supported outputs:
- console report
- JSON report
- CSV report
- Excel report
- HTML report
Internal Improvements¶
Internal improvements include:
- cleaner report architecture
- improved config loading
- reusable threshold handling
- better validation organization
Developer Experience Improvements¶
Developer workflows were improved through:
- expanded testing coverage
- clearer validation behavior
- improved report consistency
Installation¶
Install:
pip install dift-cli
Upgrade:
pip install --upgrade dift-cli
Looking Ahead¶
Future releases will focus on:
- SQL database support
- warehouse integrations
- automation workflows
- saved profiles
- scheduling systems
- historical drift tracking
- batch comparisons
Known Limitations¶
Current limitations:
- no SQL connectors yet
- no warehouse integrations yet
- no scheduling system yet
- no batch comparison workflows yet
These are planned for future releases.
Vision¶
Dift continues evolving toward becoming the open-source standard for:
- dataset regression testing
- data trust validation
- ML dataset drift monitoring
- warehouse comparison workflows
- automated data quality enforcement
Thank You¶
Thank you to everyone contributing feedback, ideas, testing, documentation improvements, and early feature discussions during Dift’s rapid growth.
Dift v0.3.0 represents a major step toward scalable and reusable data trust workflows.