Dift v0.5.0 Release Notes¶

Release Date: May 12, 2026

Dift v0.5.0¶

Dift v0.5.0 is a major release focused on advanced drift detection, automation workflows, reusable validation systems, reporting improvements, and scalable comparison orchestration.

This release extends Dift from a dataset comparison CLI into a broader data validation tool that adds drift detection, saved profiles, scheduling, batch runs, and history tracking.

Highlights¶

Dift v0.5.0 introduces:

numeric drift detection
categorical drift analysis
outlier detection
outlier severity scoring
frequency distribution shift analysis
reusable threshold policies
saved comparison profiles
scheduled comparison workflows
batch dataset comparison
comparison history tracking
automation-friendly execution
strict exit codes
quiet mode
no-color mode
improved HTML reports
improved Excel reports
improved CSV summaries
environment-based configurations
reusable environment workflows
improved risk scoring
stronger CLI UX
expanded testing coverage

Major New Features¶

Numeric Drift Detection¶

Dift now supports advanced numeric drift analysis.

Features include:

mean shift detection
standard deviation drift
range drift analysis
configurable drift thresholds
severity classification

Numeric Drift Example¶

Numeric drift:
'revenue'
mean shift 900.00%
(high, threshold 0.1)
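
The notes do not spell out how the mean shift is computed. The following is a minimal sketch, assuming the shift is the relative change in the column mean and that the threshold is a fraction (0.1 meaning 10%). The function names and the severity bands are hypothetical and are not part of Dift's API.

```python
from statistics import mean


def mean_shift(old, new):
    """Relative change in the mean of `new` versus `old` (assumed definition)."""
    old_mean = mean(old)
    if old_mean == 0:
        raise ValueError("relative shift is undefined when the old mean is 0")
    return abs(mean(new) - old_mean) / abs(old_mean)


def classify(shift, threshold=0.1):
    """Hypothetical severity bands: below the threshold is low, below 2x is medium."""
    if shift < threshold:
        return "low"
    if shift < 2 * threshold:
        return "medium"
    return "high"


old_revenue = [100.0, 100.0, 100.0]
new_revenue = [1000.0, 1000.0, 1000.0]
shift = mean_shift(old_revenue, new_revenue)
print(f"mean shift {shift:.2%} ({classify(shift)}, threshold 0.1)")
```

With these sample values the mean moves from 100 to 1000, which is a ninefold relative increase and matches the kind of figure shown in the example output.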

Categorical Drift Detection¶

Dift now detects categorical distribution shifts.

Features include:

new categorical value detection
removed value detection
frequency shift analysis
severity classification

Example Categorical Shift¶

Categorical shift:
'segment'
max frequency shift 60.00%
(high)
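
One plausible reading of "max frequency shift" is the largest absolute change in any category's relative frequency. The sketch below works under that assumption. The helper names and sample data are illustrative, not taken from Dift.

```python
from collections import Counter


def frequencies(values):
    """Relative frequency of each distinct value."""
    if not values:
        raise ValueError("cannot compute frequencies of an empty column")
    counts = Counter(values)
    return {value: count / len(values) for value, count in counts.items()}


def categorical_shift(old, new):
    """Report added values, removed values, and the largest frequency change."""
    old_freq, new_freq = frequencies(old), frequencies(new)
    categories = set(old_freq) | set(new_freq)
    max_shift = max(
        abs(new_freq.get(c, 0.0) - old_freq.get(c, 0.0)) for c in categories
    )
    return {
        "added": sorted(set(new_freq) - set(old_freq)),
        "removed": sorted(set(old_freq) - set(new_freq)),
        "max_shift": max_shift,
    }


old_segment = ["retail"] * 8 + ["enterprise"] * 2
new_segment = ["retail"] * 2 + ["enterprise"] * 7 + ["partner"]
result = categorical_shift(old_segment, new_segment)
print(f"max frequency shift {result['max_shift']:.2%}, added: {result['added']}")
```

In this sample, "retail" drops from 80% to 20% of rows, so the largest shift is 60 percentage points, and "partner" is reported as a new value.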

Outlier Detection¶

Dift now includes outlier analysis using IQR-based detection.

Features include:

IQR outlier detection
outlier spike analysis
outlier percentage tracking
risk integration

Example Outlier Warning¶

Outlier spike:
'revenue' increased by 100.00%
(high)
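
For readers unfamiliar with IQR-based detection, here is a minimal sketch of the standard technique: values outside the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR count as outliers. How Dift defines a "spike" is not stated, so the relative increase in the outlier fraction used below is an assumption, as are the function names and the 1.5 multiplier.

```python
from statistics import quantiles


def outlier_fraction(values, k=1.5):
    """Fraction of values outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles, default "exclusive" method
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return sum(1 for v in values if v < low or v > high) / len(values)


def outlier_spike(old, new):
    """Relative increase in the outlier fraction (assumed definition of a spike)."""
    old_fraction = outlier_fraction(old)
    if old_fraction == 0:
        raise ValueError("relative increase is undefined when the old data has no outliers")
    return (outlier_fraction(new) - old_fraction) / old_fraction


old_revenue = [10, 11, 12, 13, 14, 15, 16, 17, 18, 100]
new_revenue = [10, 11, 12, 13, 14, 15, 16, 17, 100, 120]
print(f"'revenue' increased by {outlier_spike(old_revenue, new_revenue):.2%}")
```

Each dataset is measured against its own fences here. Measuring the new data against the old data's fences is an equally reasonable design and would give different numbers.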

Improved Risk Scoring¶

Risk scoring was expanded to include:

numeric drift severity
categorical drift severity
outlier spike severity
weighted risk calculation

The overall risk score now accounts for drift and outlier findings.
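
The weighting scheme itself is not documented in these notes. As an illustration only, a weighted risk calculation could look like the sketch below. The weights, the severity scores, and every name in it are hypothetical.

```python
# Hypothetical numeric score for each severity label.
SEVERITY_SCORE = {"low": 0.0, "medium": 0.5, "high": 1.0}

# Hypothetical weights for the three inputs named in the release notes.
WEIGHTS = {"numeric_drift": 0.4, "categorical_drift": 0.3, "outlier_spike": 0.3}


def risk_score(severities):
    """Weighted average of severity scores, in the range [0, 1]."""
    total_weight = sum(WEIGHTS[name] for name in severities)
    if total_weight == 0:
        raise ValueError("at least one weighted finding is required")
    weighted = sum(
        WEIGHTS[name] * SEVERITY_SCORE[level] for name, level in severities.items()
    )
    return weighted / total_weight


findings = {
    "numeric_drift": "high",
    "categorical_drift": "medium",
    "outlier_spike": "low",
}
print(f"risk score {risk_score(findings):.2f}")
```

Normalizing by the total weight keeps the score comparable when only some of the checks produce a finding.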

Reusable Threshold Policies¶

Threshold configuration in v0.5.0 covers more check types and can be overridden per column.

Supported threshold types:

numeric thresholds
categorical thresholds
outlier thresholds
column-specific overrides

Column-Level Threshold Overrides¶

Example:

thresholds:
  columns:
    revenue:
      numeric: 0.05
      outlier: 0.1

This overrides the numeric drift and outlier thresholds for the revenue column only.
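
To make the override behavior concrete, the sketch below resolves a threshold by checking the column entry first and falling back to a global value. The global `numeric` and `outlier` keys are an assumption, since the notes only show the `columns` section, and `resolve_threshold` is a hypothetical helper rather than a Dift function.

```python
# Dictionary mirroring the YAML above; the two global defaults are assumed.
config = {
    "thresholds": {
        "numeric": 0.1,  # hypothetical global default
        "outlier": 0.2,  # hypothetical global default
        "columns": {
            "revenue": {"numeric": 0.05, "outlier": 0.1},
        },
    }
}


def resolve_threshold(config, column, kind):
    """Return the column-specific threshold if present, else the global one."""
    thresholds = config["thresholds"]
    override = thresholds.get("columns", {}).get(column, {})
    return override.get(kind, thresholds[kind])


print(resolve_threshold(config, "revenue", "numeric"))  # column override
print(resolve_threshold(config, "quantity", "numeric"))  # global fallback
```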

Environment-Based Configurations¶

Dift now supports reusable environment configurations.

Example:

environments:
  development:
    threshold: 0.2

  production:
    threshold: 0.05

Environment Variable Support¶

Environment variable interpolation is now supported.

Example:

old_dataset: ${OLD_DATASET}
new_dataset: ${NEW_DATASET}

This improves:

CI/CD workflows
automation pipelines
deployment flexibility
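
The `${NAME}` syntax is the same one used by Python's `string.Template`, so the substitution step can be sketched as follows. This illustrates the idea and makes no claim about how Dift implements it. The `interpolate` helper and the sample paths are hypothetical.

```python
import os
from string import Template


def interpolate(value, env=None):
    """Replace ${NAME} placeholders using the given mapping (default: os.environ).

    Raises KeyError when a referenced variable is not set, so that a missing
    variable fails loudly instead of producing an empty path.
    """
    env = os.environ if env is None else env
    return Template(value).substitute(env)


sample_env = {"OLD_DATASET": "data/old.csv", "NEW_DATASET": "data/new.csv"}
print(interpolate("${OLD_DATASET}", sample_env))
print(interpolate("${NEW_DATASET}", sample_env))
```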

Saved Comparison Profiles¶

Dift now supports reusable saved comparison profiles.

Create profile:

dift profile create nightly-check \
  --old old.csv \
  --new new.csv \
  --key customer_id

Run profile:

dift profile run nightly-check

Profile Benefits¶

Profiles help support:

recurring validations
nightly checks
reusable workflows
standardized comparisons

Scheduled Comparison Workflows¶

Dift now supports automation-ready scheduling workflows.

Generate cron commands:

dift schedule cron nightly-check

Example output:

0 2 * * * dift profile run nightly-check

Saved Schedules¶

Create reusable schedules:

dift schedule create daily-check \
  --profile nightly-check \
  --cron "0 2 * * *"

Batch Dataset Comparison¶

Dift now supports comparing entire folders of datasets.

Example:

dift batch \
  --old-dir data/old \
  --new-dir data/new \
  --key id

Batch Workflow Features¶

Batch comparisons support:

folder-based matching
report generation
history tracking
continue-on-error workflows
stop-on-error workflows
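
The matching rule for folder-based comparison is not described here. A minimal sketch, assuming files are paired by identical file name across the two directories, could look like this. `match_datasets` is a hypothetical helper and is not part of Dift.

```python
from pathlib import Path


def match_datasets(old_dir, new_dir):
    """Pair files that share a file name in both directories.

    Returns (pairs, unmatched), where pairs is a list of (old_path, new_path)
    sorted by name and unmatched lists the names present on one side only.
    """
    old_files = {p.name: p for p in Path(old_dir).iterdir() if p.is_file()}
    new_files = {p.name: p for p in Path(new_dir).iterdir() if p.is_file()}
    shared = sorted(old_files.keys() & new_files.keys())
    unmatched = sorted(old_files.keys() ^ new_files.keys())
    pairs = [(old_files[name], new_files[name]) for name in shared]
    return pairs, unmatched
```

Calling `match_datasets("data/old", "data/new")` would return the file pairs to compare. Reporting the unmatched names separately lets a caller choose between continuing and stopping when a dataset is missing on one side.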

Batch HTML Reporting¶

Example:

dift batch \
  --old-dir data/old \
  --new-dir data/new \
  --report html \
  --output-dir reports/batch

Comparison History Tracking¶

Dift now supports persistent comparison history.

Enable history:

dift old.csv new.csv \
  --history

History Features¶

History workflows support:

historical drift tracking
recurring risk monitoring
long-term comparison visibility

View History¶

List saved history:

dift history list

Show detailed record:

dift history show 1

Automation-Friendly Execution¶

Dift now includes automation-focused execution behavior.

Features include:

strict exit codes
quiet mode
no-color mode

Strict Exit Codes¶

Map the overall risk level to the process exit code:

dift prod.csv staging.csv \
  --strict-exit-codes

Exit mapping:

Exit Code	Meaning
0	Low risk
1	Medium risk
2	High risk
3	Runtime failure
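
A calling script can branch on these codes. The sketch below wraps the command shown above with Python's `subprocess` module. It assumes `dift` is on the PATH, and the helper names are illustrative.

```python
import subprocess

# Exit code meanings, copied from the table above.
RISK_BY_EXIT_CODE = {
    0: "low risk",
    1: "medium risk",
    2: "high risk",
    3: "runtime failure",
}


def describe_exit(code):
    """Translate a Dift exit code into a human-readable label."""
    return RISK_BY_EXIT_CODE.get(code, f"unexpected exit code {code}")


def run_comparison(old_path, new_path):
    """Run dift with strict exit codes and return the process exit code."""
    result = subprocess.run(
        ["dift", old_path, new_path, "--strict-exit-codes", "--quiet"]
    )
    return result.returncode
```

A pipeline step might call `run_comparison("prod.csv", "staging.csv")` and fail the build when the returned code is 2 or 3, while only logging a warning for 1.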

Quiet Mode¶

Suppress non-error output:

dift old.csv new.csv --quiet

No-Color Mode¶

Disable ANSI colors:

dift old.csv new.csv --no-color

Useful for:

CI logs
automation systems
plain-text terminals

Improved HTML Reports¶

HTML reporting received major improvements.

Enhancements include:

improved layouts
severity badges
drift highlighting
responsive design
cleaner summaries

HTML Templates¶

Dift now supports customizable HTML templates.

Example:

dift old.csv new.csv \
  --report html \
  --template dark

Available templates:

default
clean
compact
enterprise
dark

Improved Excel Reports¶

Excel reporting improvements include:

conditional formatting
severity highlighting
improved worksheet structure
readability improvements

Improved CSV Reports¶

CSV reporting now includes:

cleaner summaries
drift visibility
improved automation compatibility

Improved JSON Reports¶

JSON reports now provide:

cleaner structure
metadata support
stronger automation consistency

Metadata Expansion¶

Report metadata now includes:

timestamps
report type
runtime information
tool version

Improved Validation Errors¶

Validation error messages are now clearer and more actionable.

Examples include:

clearer missing dataset guidance
connector installation hints
improved unsupported format errors
actionable workflow guidance

Improved CLI UX¶

CLI usability improvements include:

clearer help output
improved examples
better validation messaging
automation-friendly workflows

Expanded Testing Coverage¶

Test coverage was expanded in this release.

New focus areas include:

automation workflows
connector validation
report consistency
config workflows
batch comparisons
history tracking

Internal Improvements¶

Major internal improvements include:

reusable threshold architecture
cleaner report rendering
modular workflow organization
improved validation systems

Supported Dataset Formats¶

Supported local formats:

CSV
Parquet
Excel (.xlsx, .xls)
JSON

Report Formats¶

Supported outputs:

console report
JSON report
CSV report
Excel report
HTML report

Example Usage¶

Basic comparison:

dift old.csv new.csv --key customer_id

Drift detection:

dift old_drift.csv new_drift.csv \
  --key id \
  --threshold 0.1

Batch comparison:

dift batch \
  --old-dir data/old \
  --new-dir data/new

Installation¶

Install:

pip install dift-cli

Upgrade:

pip install --upgrade dift-cli

Looking Ahead¶

Future releases will focus on:

SQL database support
warehouse integrations
DuckDB support
BigQuery support
plugin preparation
connector registry architecture
enterprise connector workflows

Known Limitations¶

Current limitations:

no SQL database support yet
no warehouse connectors yet
no plugin architecture yet
no distributed execution yet

These are planned for future releases.

Vision¶

Dift continues evolving toward becoming the open-source standard for:

dataset regression testing
data drift monitoring
warehouse trust validation
ML dataset validation
automated data quality enforcement

Thank You¶

Thank you to everyone contributing ideas, testing workflows, reporting issues, improving documentation, and helping shape the direction of Dift.

Dift v0.5.0 represents a major milestone in the platform’s evolution toward scalable enterprise-grade data trust workflows.