Dift v0.5.0 Release Notes¶
Release Date: May 12, 2026
Dift v0.5.0 is a major release focused on drift detection, automation workflows, reusable validation configuration, reporting improvements, and batch comparison.
With this release, Dift grows from a dataset comparison CLI into a broader data trust and validation tool that can detect drift, run on a schedule, and track results over time.
Highlights¶
Dift v0.5.0 introduces:
- numeric drift detection
- categorical drift analysis
- outlier detection
- outlier severity scoring
- frequency distribution shift analysis
- reusable threshold policies
- saved comparison profiles
- scheduled comparison workflows
- batch dataset comparison
- comparison history tracking
- automation-friendly execution
- strict exit codes
- quiet mode
- no-color mode
- improved HTML reports
- improved Excel reports
- improved CSV summaries
- environment-based configurations
- reusable environment workflows
- improved risk scoring
- stronger CLI UX
- expanded testing coverage
Major New Features¶
Numeric Drift Detection¶
Dift now detects drift in numeric columns between the old and new datasets.
Features include:
- mean shift detection
- standard deviation drift
- range drift analysis
- configurable drift thresholds
- severity classification
Numeric Drift Example¶
Numeric drift:
'revenue'
mean shift 900.00%
(high, threshold 0.1)
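These notes do not give the formula behind the mean shift percentage. The following is a minimal Python sketch, assuming the shift is the relative change between the old and new column means and that the threshold is a fraction (0.1 meaning 10%). The function names are hypothetical and are not part of Dift's API.

```python
from statistics import mean


def relative_mean_shift(old_values, new_values):
    """Relative change of the new mean against the old mean (assumed formula)."""
    old_mean = mean(old_values)
    new_mean = mean(new_values)
    if old_mean == 0:
        # A relative shift is undefined when the baseline mean is zero.
        raise ValueError("old mean is zero; relative shift is undefined")
    return abs(new_mean - old_mean) / abs(old_mean)


def exceeds_threshold(shift, threshold=0.1):
    """True when the shift is larger than the configured fraction."""
    return shift > threshold


old_revenue = [90, 100, 110]      # mean 100
new_revenue = [900, 1000, 1100]   # mean 1000
shift = relative_mean_shift(old_revenue, new_revenue)
print(f"mean shift {shift:.2%}")  # prints "mean shift 900.00%"
```

Under this reading, a tenfold increase in the mean gives the 900.00% figure shown in the example output, far above a 0.1 threshold.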
Categorical Drift Detection¶
Dift now detects categorical distribution shifts.
Features include:
- new categorical value detection
- removed value detection
- frequency shift analysis
- severity classification
Example Categorical Shift¶
Categorical shift:
'segment'
max frequency shift 60.00%
(high)
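One plausible way to arrive at a "max frequency shift" is to compare each category's share of rows in the old and new datasets and take the largest absolute difference. The Python sketch below works under that assumption. The function names and the shape of the returned dictionary are illustrative only, not Dift's actual output.

```python
from collections import Counter


def frequency_table(values):
    """Map each category to its share of all rows."""
    if not values:
        raise ValueError("cannot compute frequencies of an empty column")
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def categorical_shift(old_values, new_values):
    """Report new values, removed values, and the largest frequency change."""
    old_freq = frequency_table(old_values)
    new_freq = frequency_table(new_values)
    categories = set(old_freq) | set(new_freq)
    max_shift = max(
        abs(new_freq.get(c, 0.0) - old_freq.get(c, 0.0)) for c in categories
    )
    return {
        "new_values": sorted(set(new_freq) - set(old_freq)),
        "removed_values": sorted(set(old_freq) - set(new_freq)),
        "max_frequency_shift": max_shift,
    }


old_segment = ["retail"] * 8 + ["enterprise"] * 2
new_segment = ["retail"] * 2 + ["enterprise"] * 8
result = categorical_shift(old_segment, new_segment)
print(f"max frequency shift {result['max_frequency_shift']:.2%}")
```

Here "retail" falls from 80% to 20% of rows, so the largest per-category change is 60 percentage points.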
Outlier Detection¶
Dift now includes outlier analysis using IQR-based detection.
Features include:
- IQR outlier detection
- outlier spike analysis
- outlier percentage tracking
- risk integration
Example Outlier Warning¶
Outlier spike:
'revenue' increased by 100.00%
(high)
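For readers unfamiliar with IQR-based detection, here is a minimal Python sketch of the standard rule (Tukey's fences with a multiplier of 1.5). The multiplier, the quantile method, and the function names are assumptions, and these notes do not say how Dift derives the spike percentage from the outlier counts.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


def outlier_fraction(values, k=1.5):
    """Share of rows flagged as outliers, between 0.0 and 1.0."""
    return len(iqr_outliers(values, k)) / len(values)


revenue = [10, 11, 12, 13, 14, 15, 16, 100]
print(iqr_outliers(revenue))      # the value 100 lies far above the upper fence
print(outlier_fraction(revenue))
```

Comparing the outlier fraction of the old and new datasets is one way a spike could be detected, though that comparison is not shown here.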
Improved Risk Scoring¶
Risk scoring was expanded to include:
- numeric drift severity
- categorical drift severity
- outlier spike severity
- weighted risk calculation
This improves overall trust analysis.
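To illustrate what a weighted risk calculation over these severities could look like, here is a small Python sketch. The numeric scores, the weights, and the key names are all hypothetical, since these notes do not document Dift's actual values.

```python
# Hypothetical scores and weights, chosen only for illustration.
SEVERITY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}
DEFAULT_WEIGHTS = {
    "numeric_drift": 0.4,
    "categorical_drift": 0.3,
    "outlier_spike": 0.3,
}


def weighted_risk(severities, weights=DEFAULT_WEIGHTS):
    """Weighted average of severity scores, in the range 0.0 to 1.0."""
    total_weight = sum(weights[name] for name in severities)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weights[name] * SEVERITY_SCORE[level]
        for name, level in severities.items()
    )
    return weighted_sum / total_weight


score = weighted_risk(
    {"numeric_drift": "high", "categorical_drift": "low", "outlier_spike": "medium"}
)
print(f"risk score {score:.2f}")
```

Normalizing by the total weight keeps the score comparable when only some of the checks apply to a dataset.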
Reusable Threshold Policies¶
Threshold configuration in v0.5.0 supports separate thresholds for each drift type, plus overrides for individual columns.
Supported threshold types:
- numeric thresholds
- categorical thresholds
- outlier thresholds
- column-specific overrides
Column-Level Threshold Overrides¶
Example:
thresholds:
  columns:
    revenue:
      numeric: 0.05
      outlier: 0.1
This lets an individual column, such as revenue, use its own thresholds instead of the dataset-wide values.
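The lookup order implied by column-level overrides can be sketched in Python as follows, assuming a column-specific value takes precedence and a dataset-wide default applies otherwise. The `resolve_threshold` helper and its `default` parameter are hypothetical. The dictionary mirrors the YAML example above.

```python
# Parsed form of the YAML example, written as a plain dictionary.
config = {
    "thresholds": {
        "columns": {
            "revenue": {"numeric": 0.05, "outlier": 0.1},
        }
    }
}


def resolve_threshold(config, column, kind, default):
    """Return the column override for `kind`, or `default` if none is set."""
    columns = config.get("thresholds", {}).get("columns", {})
    return columns.get(column, {}).get(kind, default)


print(resolve_threshold(config, "revenue", "numeric", default=0.1))  # 0.05
print(resolve_threshold(config, "age", "numeric", default=0.1))      # 0.1
```

Falling back at each level means a column only needs to list the thresholds that differ from the defaults.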
Environment-Based Configurations¶
Dift now supports reusable environment configurations.
Example:
environments:
  development:
    threshold: 0.2
  production:
    threshold: 0.05
Environment Variable Support¶
Environment variable interpolation is now supported.
Example:
old_dataset: ${OLD_DATASET}
new_dataset: ${NEW_DATASET}
This improves:
- CI/CD workflows
- automation pipelines
- deployment flexibility
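As a rough illustration of `${VAR}` interpolation, the Python sketch below substitutes environment variables into a config value and fails loudly when a variable is missing. How Dift itself treats unset variables is not stated in these notes, so the error behavior and the `interpolate` helper are assumptions.

```python
import os
import re

# Matches ${NAME}, where NAME is a typical environment variable identifier.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def interpolate(value, env=None):
    """Replace each ${NAME} in `value` with the matching variable."""
    env = os.environ if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PATTERN.sub(replace, value)


example_env = {"OLD_DATASET": "data/old.csv", "NEW_DATASET": "data/new.csv"}
print(interpolate("${OLD_DATASET}", example_env))  # data/old.csv
print(interpolate("${NEW_DATASET}", example_env))  # data/new.csv
```

Passing an explicit mapping, as in the example, makes the behavior easy to test without touching the real process environment.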
Saved Comparison Profiles¶
Dift now supports reusable saved comparison profiles.
Create profile:
dift profile create nightly-check \
--old old.csv \
--new new.csv \
--key customer_id
Run profile:
dift profile run nightly-check
Profile Benefits¶
Profiles help support:
- recurring validations
- nightly checks
- reusable workflows
- standardized comparisons
Scheduled Comparison Workflows¶
Dift now supports automation-ready scheduling workflows.
Generate cron commands:
dift schedule cron nightly-check
Example output:
0 2 * * * dift profile run nightly-check
Saved Schedules¶
Create reusable schedules:
dift schedule create daily-check \
--profile nightly-check \
--cron "0 2 * * *"
Batch Dataset Comparison¶
Dift now supports comparing entire folders of datasets.
Example:
dift batch \
--old-dir data/old \
--new-dir data/new \
--key id
Batch Workflow Features¶
Batch comparisons support:
- folder-based matching
- report generation
- history tracking
- continue-on-error workflows
- stop-on-error workflows
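Folder-based matching and the two error modes can be pictured with the Python sketch below. It assumes datasets are paired by identical file name across the two directories, which these notes do not confirm, and `match_datasets`, `run_batch`, and the `compare` callback are hypothetical names.

```python
from pathlib import Path


def match_datasets(old_dir, new_dir):
    """Pair files that share a name in both directories (assumed rule)."""
    old_files = {p.name: p for p in Path(old_dir).iterdir() if p.is_file()}
    new_files = {p.name: p for p in Path(new_dir).iterdir() if p.is_file()}
    shared = sorted(old_files.keys() & new_files.keys())
    return [(old_files[name], new_files[name]) for name in shared]


def run_batch(pairs, compare, continue_on_error=True):
    """Run `compare` on each pair, collecting or re-raising failures."""
    results, errors = [], []
    for old_path, new_path in pairs:
        try:
            results.append(compare(old_path, new_path))
        except Exception as exc:
            if not continue_on_error:
                raise  # stop-on-error: abort the whole batch
            errors.append((old_path.name, str(exc)))
    return results, errors
```

Calling `run_batch(match_datasets("data/old", "data/new"), compare)` with a comparison function of your own would process every shared file and return the failures alongside the results.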
Batch HTML Reporting¶
Example:
dift batch \
--old-dir data/old \
--new-dir data/new \
--report html \
--output-dir reports/batch
Comparison History Tracking¶
Dift now supports persistent comparison history.
Enable history:
dift old.csv new.csv \
--history
History Features¶
History workflows support:
- historical drift tracking
- recurring risk monitoring
- long-term comparison visibility
View History¶
List saved history:
dift history list
Show detailed record:
dift history show 1
Automation-Friendly Execution¶
Dift now includes automation-focused execution behavior.
Features include:
- strict exit codes
- quiet mode
- no-color mode
Strict Exit Codes¶
Return an exit code that reflects the risk level, so scripts and CI jobs can branch on it:
dift prod.csv staging.csv \
--strict-exit-codes
Exit mapping:
| Exit Code | Meaning |
|---|---|
| 0 | Low risk |
| 1 | Medium risk |
| 2 | High risk |
| 3 | Runtime failure |
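A CI job could act on this mapping with a small Python wrapper like the sketch below. The helper names and the `max_allowed` policy are hypothetical. Only the `--strict-exit-codes` flag and the exit code table come from these notes.

```python
import subprocess

# Exit code meanings, copied from the table above.
EXIT_MEANINGS = {
    0: "low risk",
    1: "medium risk",
    2: "high risk",
    3: "runtime failure",
}


def describe_exit(code):
    """Human-readable meaning of a Dift exit code."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def should_block(code, max_allowed=1):
    """Block the pipeline when the code is above the allowed risk level."""
    return code > max_allowed


def run_dift(old_path, new_path):
    """Run a comparison and return its exit code (requires dift on PATH)."""
    result = subprocess.run(["dift", old_path, new_path, "--strict-exit-codes"])
    return result.returncode
```

Because medium risk maps to exit code 1, a shell script running under `set -e` treats it as a failure. A wrapper like this one lets a pipeline tolerate medium risk while still blocking on high risk and runtime failures.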
Quiet Mode¶
Suppress non-error output:
dift old.csv new.csv --quiet
No-Color Mode¶
Disable ANSI colors:
dift old.csv new.csv --no-color
Useful for:
- CI logs
- automation systems
- plain-text terminals
Improved HTML Reports¶
HTML reporting received major improvements.
Enhancements include:
- improved layouts
- severity badges
- drift highlighting
- responsive design
- cleaner summaries
HTML Templates¶
Dift now supports customizable HTML templates.
Example:
dift old.csv new.csv \
--report html \
--template dark
Available templates:
- default
- clean
- compact
- enterprise
- dark
Improved Excel Reports¶
Excel reporting improvements include:
- conditional formatting
- severity highlighting
- improved worksheet structure
- readability improvements
Improved CSV Reports¶
CSV reporting now includes:
- cleaner summaries
- drift visibility
- improved automation compatibility
Improved JSON Reports¶
JSON reports now provide:
- cleaner structure
- metadata support
- stronger automation consistency
Metadata Expansion¶
Report metadata now includes:
- timestamps
- report type
- runtime information
- tool version
Improved Validation Errors¶
Validation UX was improved significantly.
Examples include:
- clearer missing dataset guidance
- connector installation hints
- improved unsupported format errors
- actionable workflow guidance
Improved CLI UX¶
CLI usability improvements include:
- clearer help output
- improved examples
- better validation messaging
- automation-friendly workflows
Expanded Testing Coverage¶
Test coverage expanded significantly.
New focus areas include:
- automation workflows
- connector validation
- report consistency
- config workflows
- batch comparisons
- history tracking
Internal Improvements¶
Major internal improvements include:
- reusable threshold architecture
- cleaner report rendering
- modular workflow organization
- improved validation systems
Supported Dataset Formats¶
Supported local formats:
- CSV
- Parquet
- Excel (.xlsx, .xls)
- JSON
Report Formats¶
Supported outputs:
- console report
- JSON report
- CSV report
- Excel report
- HTML report
Example Usage¶
Basic comparison:
dift old.csv new.csv --key customer_id
Drift detection:
dift old_drift.csv new_drift.csv \
--key id \
--threshold 0.1
Batch comparison:
dift batch \
--old-dir data/old \
--new-dir data/new
Installation¶
Install:
pip install dift-cli
Upgrade:
pip install --upgrade dift-cli
Looking Ahead¶
Future releases will focus on:
- SQL database support
- warehouse integrations
- DuckDB support
- BigQuery support
- plugin preparation
- connector registry architecture
- enterprise connector workflows
Known Limitations¶
Current limitations:
- no SQL database support yet
- no warehouse connectors yet
- no plugin architecture yet
- no distributed execution yet
These are planned for future releases.
Vision¶
Dift continues evolving toward becoming the open-source standard for:
- dataset regression testing
- data drift monitoring
- warehouse trust validation
- ML dataset validation
- automated data quality enforcement
Thank You¶
Thank you to everyone contributing ideas, testing workflows, reporting issues, improving documentation, and helping shape the direction of Dift.
Dift v0.5.0 represents a major milestone in the platform’s evolution toward scalable enterprise-grade data trust workflows.