Dift v0.5.0 Release Notes

Release Date: May 12, 2026



Dift v0.5.0 is a major release focused on advanced drift detection, automation workflows, reusable validation systems, reporting improvements, and scalable comparison orchestration.

This release extends Dift beyond a dataset comparison CLI: it adds drift and outlier detection, saved profiles, scheduling, batch runs, and history tracking for recurring data validation.


Highlights

Dift v0.5.0 introduces:

  • numeric drift detection
  • categorical drift analysis
  • outlier detection
  • outlier severity scoring
  • frequency distribution shift analysis
  • reusable threshold policies
  • saved comparison profiles
  • scheduled comparison workflows
  • batch dataset comparison
  • comparison history tracking
  • automation-friendly execution
  • strict exit codes
  • quiet mode
  • no-color mode
  • improved HTML reports
  • improved Excel reports
  • improved CSV summaries
  • environment-based configurations
  • reusable environment workflows
  • improved risk scoring
  • stronger CLI UX
  • expanded testing coverage

Major New Features


Numeric Drift Detection

Dift now supports advanced numeric drift analysis.

Features include:

  • mean shift detection
  • standard deviation drift
  • range drift analysis
  • configurable drift thresholds
  • severity classification

Numeric Drift Example

Numeric drift:
'revenue'
mean shift 900.00%
(high, threshold 0.1)
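
As a rough illustration of how a mean shift like the one above can be computed, the following Python sketch measures the relative change in the mean and compares it with a threshold. The function names, the sample values, and the reading of the threshold as a fraction (0.1 meaning 10%) are assumptions made for this example. This is not Dift's internal implementation.

```python
from statistics import mean


def relative_mean_shift(old_values, new_values):
    """Return the relative change in the mean, e.g. 9.0 for a 900% shift."""
    old_mean = mean(old_values)
    new_mean = mean(new_values)
    if old_mean == 0:
        raise ValueError("relative shift is undefined when the old mean is 0")
    return abs(new_mean - old_mean) / abs(old_mean)


def exceeds_threshold(shift, threshold=0.1):
    """True when the shift is larger than the configured threshold."""
    return shift > threshold


# Illustrative data only: the mean moves from 100 to 1000.
old_revenue = [90, 100, 110]
new_revenue = [900, 1000, 1100]

shift = relative_mean_shift(old_revenue, new_revenue)
print(f"mean shift {shift:.2%}, exceeds threshold 0.1: {exceeds_threshold(shift)}")
```

With these sample values the mean grows tenfold, which corresponds to the 900.00% shift shown in the example output.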

Categorical Drift Detection

Dift now detects categorical distribution shifts.

Features include:

  • new categorical value detection
  • removed value detection
  • frequency shift analysis
  • severity classification

Example Categorical Shift

Categorical shift:
'segment'
max frequency shift 60.00%
(high)
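
One plausible way to arrive at a "max frequency shift" figure is to compare each category's share of rows in the old and new datasets and take the largest absolute difference. The sketch below does that and also reports new and removed values. The function names and the sample data are hypothetical, and Dift may compute the metric differently.

```python
from collections import Counter


def frequencies(values):
    """Map each category to its share of the rows."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def categorical_shift(old_values, new_values):
    """Return (new categories, removed categories, max absolute frequency shift)."""
    old_freq = frequencies(old_values)
    new_freq = frequencies(new_values)
    added = set(new_freq) - set(old_freq)
    removed = set(old_freq) - set(new_freq)
    max_shift = max(
        abs(new_freq.get(category, 0.0) - old_freq.get(category, 0.0))
        for category in set(old_freq) | set(new_freq)
    )
    return added, removed, max_shift


# Illustrative data only: "retail" drops from 80% to 20% of rows.
old_segment = ["retail"] * 8 + ["enterprise"] * 2
new_segment = ["retail"] * 2 + ["enterprise"] * 7 + ["partner"]

added, removed, max_shift = categorical_shift(old_segment, new_segment)
print(f"new: {added}, removed: {removed}, max frequency shift {max_shift:.2%}")
```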

Outlier Detection

Dift now includes outlier analysis using IQR-based detection.

Features include:

  • IQR outlier detection
  • outlier spike analysis
  • outlier percentage tracking
  • risk integration

Example Outlier Warning

Outlier spike:
'revenue' increased by 100.00%
(high)
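
For readers unfamiliar with IQR-based detection, the sketch below flags values outside the conventional fences Q1 - 1.5 * IQR and Q3 + 1.5 * IQR and reports the outlier fraction for an old and a new column. The 1.5 multiplier, the quartile method, and the function names are assumptions for illustration. These notes do not specify how Dift derives the spike percentage from such fractions.

```python
import statistics


def iqr_outlier_fraction(values, k=1.5):
    """Return the fraction of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < low or v > high]
    return len(outliers) / len(values)


# Illustrative data only: the new column contains one extreme value.
old_revenue = [10, 10, 11, 11, 12, 12, 12, 12, 13, 13]
new_revenue = [10, 10, 11, 11, 12, 12, 12, 13, 13, 100]

old_fraction = iqr_outlier_fraction(old_revenue)
new_fraction = iqr_outlier_fraction(new_revenue)
print(f"outliers: old {old_fraction:.2%}, new {new_fraction:.2%}")
```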

Improved Risk Scoring

Risk scoring was expanded to include:

  • numeric drift severity
  • categorical drift severity
  • outlier spike severity
  • weighted risk calculation

The overall risk score now accounts for drift and outlier findings.
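
To make the idea of a weighted risk calculation concrete, here is a minimal sketch that combines per-check severities into a single score using a weighted average. The severity scale, the weights, and all names are invented for illustration. Dift's actual weights and formula are not documented in these notes.

```python
# Assumed numeric scale for severities (not taken from Dift).
SEVERITY_POINTS = {"low": 1, "medium": 2, "high": 3}


def weighted_risk(findings, weights):
    """Weighted average of severity points across the reported checks."""
    total_weight = sum(weights[name] for name in findings)
    weighted_sum = sum(
        weights[name] * SEVERITY_POINTS[severity]
        for name, severity in findings.items()
    )
    return weighted_sum / total_weight


# Hypothetical findings and weights.
findings = {
    "numeric_drift": "high",
    "categorical_drift": "low",
    "outlier_spike": "medium",
}
weights = {"numeric_drift": 2, "categorical_drift": 1, "outlier_spike": 1}

score = weighted_risk(findings, weights)
print(f"weighted risk score: {score}")
```

A weighted average keeps the score on the same scale as the individual severities, so a single heavily weighted check can raise the result without any one finding dominating it entirely.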


Reusable Threshold Policies

Threshold configuration in v0.5.0 supports separate thresholds per check type, with optional per-column overrides.

Supported threshold types:

  • numeric thresholds
  • categorical thresholds
  • outlier thresholds
  • column-specific overrides

Column-Level Threshold Overrides

Example:

thresholds:
  columns:
    revenue:
      numeric: 0.05
      outlier: 0.1

Column-level overrides let a sensitive column such as revenue use tighter limits than the rest of the dataset.
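
The lookup order implied by column-level overrides can be sketched as "use the column's value if present, otherwise fall back to a default". In the Python sketch below, the config dictionary mirrors the YAML above; the `DEFAULTS` values and the `resolve_threshold` helper are hypothetical and only illustrate the fallback logic.

```python
# Hypothetical fallback values used when a column has no override.
DEFAULTS = {"numeric": 0.1, "outlier": 0.2}

# Mirrors the YAML example: overrides for the "revenue" column only.
config = {
    "thresholds": {
        "columns": {
            "revenue": {"numeric": 0.05, "outlier": 0.1},
        }
    }
}


def resolve_threshold(config, column, kind, defaults=DEFAULTS):
    """Return the column-specific threshold, or the default for that kind."""
    overrides = config.get("thresholds", {}).get("columns", {})
    return overrides.get(column, {}).get(kind, defaults[kind])


print(resolve_threshold(config, "revenue", "numeric"))   # column override
print(resolve_threshold(config, "quantity", "numeric"))  # falls back to default
```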


Environment-Based Configurations

Dift now supports reusable environment configurations.

Example:

environments:
  development:
    threshold: 0.2

  production:
    threshold: 0.05

Environment Variable Support

Environment variable interpolation is now supported.

Example:

old_dataset: ${OLD_DATASET}
new_dataset: ${NEW_DATASET}

This improves:

  • CI/CD workflows
  • automation pipelines
  • deployment flexibility
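
The `${NAME}` placeholders shown above follow the same syntax as Python's `string.Template`, so the effect of interpolation can be sketched as follows. The `interpolate` helper and the sample paths are assumptions for this example; how Dift handles unset variables is not stated in these notes, and this sketch simply raises `KeyError`.

```python
import os
from string import Template


def interpolate(value, env=None):
    """Replace ${NAME} placeholders with values from the environment mapping."""
    env = os.environ if env is None else env
    return Template(value).substitute(env)


# Config values as they appear in the example above.
raw_config = {
    "old_dataset": "${OLD_DATASET}",
    "new_dataset": "${NEW_DATASET}",
}

# Hypothetical environment, passed explicitly so the example is reproducible.
example_env = {
    "OLD_DATASET": "data/old/customers.csv",
    "NEW_DATASET": "data/new/customers.csv",
}

resolved = {key: interpolate(value, example_env) for key, value in raw_config.items()}
print(resolved)
```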

Saved Comparison Profiles

Dift now supports reusable saved comparison profiles.

Create profile:

dift profile create nightly-check \
  --old old.csv \
  --new new.csv \
  --key customer_id

Run profile:

dift profile run nightly-check

Profile Benefits

Profiles help support:

  • recurring validations
  • nightly checks
  • reusable workflows
  • standardized comparisons

Scheduled Comparison Workflows

Dift now supports automation-ready scheduling workflows.

Generate a crontab entry for a saved profile:

dift schedule cron nightly-check

Example output:

0 2 * * * dift profile run nightly-check

Saved Schedules

Create reusable schedules:

dift schedule create daily-check \
  --profile nightly-check \
  --cron "0 2 * * *"

Batch Dataset Comparison

Dift now supports comparing entire folders of datasets.

Example:

dift batch \
  --old-dir data/old \
  --new-dir data/new \
  --key id

Batch Workflow Features

Batch comparisons support:

  • folder-based matching
  • report generation
  • history tracking
  • continue-on-error workflows
  • stop-on-error workflows
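
Folder-based matching can be pictured as pairing files that share a name in the old and new directories. The sketch below shows one way to do this with `pathlib`. Matching by identical filename is an assumption made for illustration, as is the `match_datasets` helper; the notes do not describe Dift's exact matching rule.

```python
import tempfile
from pathlib import Path


def match_datasets(old_dir, new_dir):
    """Pair files with the same name; also report files present on one side only."""
    old_files = {p.name: p for p in Path(old_dir).iterdir() if p.is_file()}
    new_files = {p.name: p for p in Path(new_dir).iterdir() if p.is_file()}
    shared = sorted(old_files.keys() & new_files.keys())
    pairs = [(old_files[name], new_files[name]) for name in shared]
    only_old = sorted(old_files.keys() - new_files.keys())
    only_new = sorted(new_files.keys() - old_files.keys())
    return pairs, only_old, only_new


# Small demonstration using temporary directories and empty files.
with tempfile.TemporaryDirectory() as root:
    old_dir, new_dir = Path(root, "old"), Path(root, "new")
    old_dir.mkdir()
    new_dir.mkdir()
    for name in ("customers.csv", "orders.csv"):
        (old_dir / name).touch()
    for name in ("customers.csv", "refunds.csv"):
        (new_dir / name).touch()

    pairs, only_old, only_new = match_datasets(old_dir, new_dir)
    matched_names = [old.name for old, _ in pairs]
    print(matched_names, only_old, only_new)
```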

Batch HTML Reporting

Example:

dift batch \
  --old-dir data/old \
  --new-dir data/new \
  --report html \
  --output-dir reports/batch

Comparison History Tracking

Dift now supports persistent comparison history.

Enable history:

dift old.csv new.csv \
  --history

History Features

History workflows support:

  • historical drift tracking
  • recurring risk monitoring
  • long-term comparison visibility

View History

List saved history:

dift history list

Show detailed record:

dift history show 1

Automation-Friendly Execution

Dift now includes automation-focused execution behavior.

Features include:

  • strict exit codes
  • quiet mode
  • no-color mode

Strict Exit Codes

Map the comparison's risk level to the process exit code:

dift prod.csv staging.csv \
  --strict-exit-codes

Exit mapping:

Exit Code   Meaning
0           Low risk
1           Medium risk
2           High risk
3           Runtime failure
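
In a CI job, the exit mapping above can drive a simple gate. The following sketch wraps the documented `dift ... --strict-exit-codes` invocation with `subprocess` and interprets the return code. The helper names and the policy of blocking on anything above medium risk are assumptions for illustration, not part of Dift.

```python
import subprocess

# Exit codes as listed in the mapping above.
EXIT_MEANINGS = {
    0: "low risk",
    1: "medium risk",
    2: "high risk",
    3: "runtime failure",
}


def describe_exit(code):
    """Translate an exit code into the meaning documented above."""
    return EXIT_MEANINGS.get(code, f"unknown exit code {code}")


def should_block(code, max_allowed=1):
    """Example policy (assumption): block when the code exceeds max_allowed."""
    return code > max_allowed


def run_comparison(old_path, new_path):
    """Run Dift with strict exit codes and return the process exit code."""
    result = subprocess.run(["dift", old_path, new_path, "--strict-exit-codes"])
    return result.returncode


# Typical use in a pipeline step (requires dift on PATH, so it is not run here):
#   code = run_comparison("prod.csv", "staging.csv")
#   print(describe_exit(code))
#   raise SystemExit(1 if should_block(code) else 0)
```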

Quiet Mode

Suppress non-error output:

dift old.csv new.csv --quiet

No-Color Mode

Disable ANSI colors:

dift old.csv new.csv --no-color

Useful for:

  • CI logs
  • automation systems
  • plain-text terminals

Improved HTML Reports

HTML reporting received major improvements.

Enhancements include:

  • improved layouts
  • severity badges
  • drift highlighting
  • responsive design
  • cleaner summaries

HTML Templates

Dift now supports customizable HTML templates.

Example:

dift old.csv new.csv \
  --report html \
  --template dark

Available templates:

  • default
  • clean
  • compact
  • enterprise
  • dark

Improved Excel Reports

Excel reporting improvements include:

  • conditional formatting
  • severity highlighting
  • improved worksheet structure
  • readability improvements

Improved CSV Reports

CSV reporting now includes:

  • cleaner summaries
  • drift visibility
  • improved automation compatibility

Improved JSON Reports

JSON reports now provide:

  • cleaner structure
  • metadata support
  • stronger automation consistency

Metadata Expansion

Report metadata now includes:

  • timestamps
  • report type
  • runtime information
  • tool version

Improved Validation Errors

Validation UX was improved significantly.

Examples include:

  • clearer missing dataset guidance
  • connector installation hints
  • improved unsupported format errors
  • actionable workflow guidance

Improved CLI UX

CLI usability improvements include:

  • clearer help output
  • improved examples
  • better validation messaging
  • automation-friendly workflows

Expanded Testing Coverage

Testing coverage expanded significantly.

New focus areas include:

  • automation workflows
  • connector validation
  • report consistency
  • config workflows
  • batch comparisons
  • history tracking

Internal Improvements

Major internal improvements include:

  • reusable threshold architecture
  • cleaner report rendering
  • modular workflow organization
  • improved validation systems

Supported Dataset Formats

Supported local formats:

  • CSV
  • Parquet
  • Excel (.xlsx, .xls)
  • JSON

Report Formats

Supported outputs:

  • console report
  • JSON report
  • CSV report
  • Excel report
  • HTML report

Example Usage

Basic comparison:

dift old.csv new.csv --key customer_id

Drift detection:

dift old_drift.csv new_drift.csv \
  --key id \
  --threshold 0.1

Batch comparison:

dift batch \
  --old-dir data/old \
  --new-dir data/new

Installation

Install:

pip install dift-cli

Upgrade:

pip install --upgrade dift-cli

Looking Ahead

Future releases will focus on:

  • SQL database support
  • warehouse integrations
  • DuckDB support
  • BigQuery support
  • plugin preparation
  • connector registry architecture
  • enterprise connector workflows

Known Limitations

Current limitations:

  • no SQL database support yet
  • no warehouse connectors yet
  • no plugin architecture yet
  • no distributed execution yet

These are planned for future releases.


Vision

Dift continues evolving toward becoming the open-source standard for:

  • dataset regression testing
  • data drift monitoring
  • warehouse trust validation
  • ML dataset validation
  • automated data quality enforcement

Thank You

Thank you to everyone contributing ideas, testing workflows, reporting issues, improving documentation, and helping shape the direction of Dift.

Dift v0.5.0 represents a major milestone in the platform’s evolution toward scalable enterprise-grade data trust workflows.