Auto-discover validation rules from data — scan, profile, health-score. No rules to write.

Moved. This repo has moved into the benzsevern/goldenmatch monorepo at packages/python/goldencheck/ (and packages/typescript/goldencheck/). This repo is archived; new development happens in the monorepo.

GoldenCheck

Data validation that discovers rules from your data so you don't have to write them. Built by Ben Severn.

Every competitor makes you write rules first. GoldenCheck flips it: validate first, keep the rules you care about.

Why GoldenCheck?

| | GoldenCheck | Great Expectations | Pandera | Pointblank |
|---|---|---|---|---|
| Rules | Discovered from data | Written by hand | Written by hand | Written by hand |
| Config | Zero to start | Heavy YAML/Python setup | Decorators/schemas | YAML/Python |
| Interface | CLI + interactive TUI | HTML reports | Exceptions | HTML/notebook |
| Learning curve | One command | Hours/days | Moderate | Moderate |
| LLM enhancement | Yes ($0.01/scan) | No | No | No |
| Fix suggestions | Yes, in TUI | No | No | No |
| Confidence scoring | Yes (H/M/L per finding) | No | No | No |
| DQBench Score | 88.40 | 21.68 (best-effort) | 32.51 (best-effort) | 6.94 (auto) |

Install

pip install goldencheck

With LLM boost support:

pip install goldencheck[llm]

With deep profiling & baseline support (scipy, numpy):

pip install goldencheck[baseline]

With semantic type inference for baseline (sentence-transformers):

pip install goldencheck[baseline,semantic]

JavaScript / TypeScript

npm install goldencheck

Edge-safe core (browsers, Cloudflare Workers, Vercel Edge):

import { scanData, TabularData } from "goldencheck/core";

Node.js (file reading, CLI, MCP):

import { readFile, scanData } from "goldencheck/node";

Quick Start

# Scan a file — discovers issues, launches interactive TUI
goldencheck data.csv

# CLI-only output (no TUI)
goldencheck data.csv --no-tui

# With LLM enhancement (requires API key)
goldencheck data.csv --llm-boost --no-tui

# Validate against saved rules (for CI/pipelines)
goldencheck validate data.csv

# JSON output for CI integration
goldencheck data.csv --no-tui --json

# Learn baseline (one-time, deep analysis)
goldencheck baseline data.csv

# Scan with drift detection (fast, uses saved baseline)
goldencheck scan new_data.csv

TypeScript Quick Start

// Scan an array of records (edge-safe — works anywhere)
import { scanData, TabularData, Severity } from "goldencheck";

const data = new TabularData([
  { id: 1, email: "alice@example.com", age: 30, status: "active" },
  { id: 2, email: "bob@test.com", age: -5, status: "inactive" },
  { id: 3, email: "not-an-email", age: 25, status: "active" },
]);

const { findings, profile } = scanData(data);
for (const f of findings) {
  console.log(`[${f.severity === Severity.ERROR ? "ERROR" : "WARNING"}] ${f.column}: ${f.message}`);
}

// Scan a CSV file (Node.js)
import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol: Record<string, { errors: number; warnings: number }> = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

// Validate against pinned rules
import { readFile, scanData, validateConfig, validateData } from "goldencheck/node";
import { readFileSync } from "node:fs";
import YAML from "yaml";

const config = validateConfig(YAML.parse(readFileSync("goldencheck.yml", "utf-8")));
const data = readFile("data.csv");
const findings = validateData(data, config);

// Create baseline and detect drift
import { readFile, createBaseline, serializeBaseline, scanData } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Learn baseline
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

// Later: detect drift
const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

// LLM-enhanced scanning (edge-safe)
import { scanData, TabularData, callLlm, parseLlmResponse, mergeLlmFindings, buildSampleBlocks } from "goldencheck";

const data = new TabularData(records);
const result = scanData(data, { returnSample: true });
const blocks = buildSampleBlocks(result.sample, result.findings);
const { text } = await callLlm("anthropic", JSON.stringify(blocks));
const llmResponse = parseLlmResponse(text);
if (llmResponse) {
  const enhanced = mergeLlmFindings(result.findings, llmResponse);
}

How It Works

1. SCAN     →  goldencheck data.csv
                GoldenCheck profiles your data and discovers what "healthy" looks like

2. REVIEW   →  Interactive TUI shows findings sorted by severity
                Each finding has: description, affected rows, sample values

3. PIN      →  Press Space to promote findings into permanent rules
                Dismiss false positives — they won't come back

4. EXPORT   →  Press F2 to save rules to goldencheck.yml
                Human-readable YAML with your pinned rules

5. VALIDATE →  goldencheck validate data.csv
                Enforce rules in CI with exit codes (0 = pass, 1 = fail)
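
The exit-code contract in step 5 can be sketched in a few lines of Python. This is only an illustration of the idea, not GoldenCheck's implementation: the `exit_code` helper, the string severity names, and their ordering are assumptions; only the 0 = pass / 1 = fail convention and the `error` / `warning` thresholds come from this README.

```python
# Hypothetical sketch: turn a list of finding severities into a CI exit code.
# The severity names and their ranking are assumptions for illustration.
SEVERITY_RANK = {"info": 1, "warning": 2, "error": 3}


def exit_code(severities, fail_on="error"):
    """Return 1 if any severity is at or above `fail_on`, otherwise 0."""
    threshold = SEVERITY_RANK[fail_on]
    return 1 if any(SEVERITY_RANK[s] >= threshold for s in severities) else 0


print(exit_code(["info", "warning"]))                     # 0: warnings pass by default
print(exit_code(["info", "warning"], fail_on="warning"))  # 1: stricter threshold fails
```

A CI job would then fail whenever the command returns a non-zero status, which is what makes the pinned rules enforceable in a pipeline.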

What It Detects

Column-Level Profilers

| Profiler | What It Catches | Example |
|---|---|---|
| Type inference | String columns that are actually numeric | "Column age is string but 98% are integer" |
| Nullability | Required vs. optional columns | "0 nulls across 50k rows — likely required" |
| Uniqueness | Primary key candidates, near-duplicates | "100% unique — likely primary key" |
| Format detection | Emails, phones, URLs, dates | "94% email format, 6% malformed" |
| Range & distribution | Outliers, min/max bounds | "3 rows have values >10,000" |
| Cardinality | Low-cardinality enum suggestions | "4 unique values — possible enum" |
| Pattern consistency | Mixed formats within a column | "3 phone formats detected" |
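
To make the type-inference row concrete, here is a minimal sketch of how such a check could work: count how many string values parse as integers and report the share. The function name and the 75% threshold are assumptions chosen for illustration; GoldenCheck's actual profiler may work differently.

```python
# Hypothetical sketch of a type-inference check on a string column.
def integer_fraction(values):
    """Fraction of non-empty string values that parse as integers."""
    non_empty = [v.strip() for v in values if v is not None and v.strip()]
    if not non_empty:
        return 0.0

    def is_int(text):
        try:
            int(text)
            return True
        except ValueError:
            return False

    return sum(is_int(v) for v in non_empty) / len(non_empty)


ages = ["30", "25", "41", "n/a", "57"]
frac = integer_fraction(ages)
if frac >= 0.75:  # assumed threshold, not taken from GoldenCheck
    print(f"Column age is string but {frac:.0%} are integer")
```

With the sample above, four of five values parse, so the message reports 80%.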

Cross-Column Profilers

| Profiler | What It Catches |
|---|---|
| Temporal ordering | start_date > end_date violations |
| Null correlation | Columns that are null together (e.g., address + city + zip) |
| Numeric cross-column | value > max violations (e.g., claim_amount > policy_max) |
| Age vs DOB | Age column doesn't match calculated age from date_of_birth |
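
A temporal-ordering check like the first row above can be sketched as follows. The helper name and the choice to skip rows with a missing date are assumptions for illustration, not a description of GoldenCheck's internals.

```python
# Hypothetical sketch of a cross-column temporal-ordering check.
from datetime import date


def temporal_order_violations(rows, start_col, end_col):
    """Return indices of rows where the start value is after the end value.

    Rows with a missing value in either column are skipped.
    """
    bad = []
    for i, row in enumerate(rows):
        start, end = row.get(start_col), row.get(end_col)
        if start is not None and end is not None and start > end:
            bad.append(i)
    return bad


rows = [
    {"start_date": date(2024, 1, 1), "end_date": date(2024, 3, 1)},
    {"start_date": date(2024, 5, 1), "end_date": date(2024, 4, 1)},  # violation
    {"start_date": date(2024, 6, 1), "end_date": None},              # skipped
]
print(temporal_order_violations(rows, "start_date", "end_date"))  # [1]
```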

Baseline Deep Profiling & Drift Detection

Run goldencheck baseline once to build a statistical profile of healthy data. On every subsequent scan, GoldenCheck compares the new data against the saved baseline and reports drift across 13 check types, including:

| Check Type | What It Catches |
|---|---|
| distribution_drift | Value distribution has shifted significantly |
| entropy_drift | Entropy of column values has changed |
| bound_violation | Values exceed historical min/max bounds |
| benford_drift | Leading-digit distribution deviates from Benford's Law |
| fd_violation | Functional dependency between columns is broken |
| key_uniqueness_loss | Previously unique column now has duplicates |
| temporal_order_drift | Historical column ordering constraint violated |
| type_drift | Dominant semantic type of column has changed |
| correlation_break | Previously correlated columns are no longer correlated |
| new_correlation | New unexpected correlation appeared |
| pattern_drift | Value format/pattern distribution has shifted |
| new_pattern | New structural patterns appeared in a column |

The baseline is built using 6 techniques: statistical profiler (distributions, Benford's Law, entropy), constraint miner (functional dependencies, temporal orders), semantic type inferrer (embeddings + keywords), correlation analyzer (Pearson, Cramér's V), pattern grammar inducer, and confidence prior builder.
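
As one concrete example of the statistics involved, a Benford's Law comparison can be sketched like this. Benford's Law gives the expected share of leading digit d as log10(1 + 1/d). The distance measure (total variation) and the helper names are assumptions for illustration; the README does not say which statistic benford_drift actually uses.

```python
# Hypothetical sketch: compare observed leading digits against Benford's Law.
import math
from collections import Counter

# Expected share of each leading digit 1..9 under Benford's Law.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """First significant digit of a non-zero number."""
    return int(f"{abs(x):.15e}"[0])  # scientific notation starts with that digit


def benford_distance(values):
    """Total variation distance between observed and expected digit shares."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        return 0.0
    counts = Counter(digits)
    n = len(digits)
    return 0.5 * sum(abs(counts.get(d, 0) / n - p) for d, p in BENFORD.items())


powers_of_two = [2 ** k for k in range(100)]  # known to follow Benford closely
all_nines = [9, 90, 900]                      # very far from Benford
print(benford_distance(powers_of_two) < benford_distance(all_nines))  # True
```

A drift check built on this idea would store the baseline distance and flag a column when the distance for new data moves beyond some tolerance.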

Domain Packs

Improve detection accuracy with domain-specific type definitions:

goldencheck scan data.csv --domain healthcare   # NPI, ICD, insurance, patient types
goldencheck scan data.csv --domain finance      # accounts, routing, CUSIP, transactions
goldencheck scan data.csv --domain ecommerce    # SKUs, orders, tracking, products

Domain packs add semantic types that reduce false positives and improve classification for industry-specific data.

Schema Diff

Compare two versions of a data file:

goldencheck diff data.csv                  # compare against git HEAD
goldencheck diff old.csv new.csv           # compare two files
goldencheck diff data.csv --ref main       # compare against a branch
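
The simplest part of a schema diff, finding added and removed columns, can be sketched with the standard library alone. The `header_diff` helper is hypothetical and only compares header rows; the real command presumably compares more than column names.

```python
# Hypothetical sketch: compare the header rows of two CSV documents.
import csv
import io


def header_diff(old_csv, new_csv):
    """Return columns added to and removed from the new file's header."""
    old_cols = next(csv.reader(io.StringIO(old_csv)))
    new_cols = next(csv.reader(io.StringIO(new_csv)))
    return {
        "added": [c for c in new_cols if c not in old_cols],
        "removed": [c for c in old_cols if c not in new_cols],
    }


old = "id,name,age\n1,alice,30\n"
new = "id,name,email\n1,alice,alice@example.com\n"
print(header_diff(old, new))  # {'added': ['email'], 'removed': ['age']}
```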

Auto-Fix

Apply automated fixes to clean your data:

goldencheck fix data.csv                          # safe: trim, normalize, fix encoding
goldencheck fix data.csv --mode moderate          # + standardize case
goldencheck fix data.csv --mode aggressive --force # + coerce types
goldencheck fix data.csv --dry-run                # preview changes
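
For a sense of what a "safe" fix and a dry run might look like, here is a small sketch that trims whitespace and applies Unicode NFC normalization, then previews only the values that would change. Treating "normalize" as NFC normalization is an assumption, as are the function names.

```python
# Hypothetical sketch of a safe-mode fix with a dry-run preview.
import unicodedata


def safe_fix(value):
    """Normalize to Unicode NFC, then trim surrounding whitespace."""
    return unicodedata.normalize("NFC", value).strip()


def preview(values):
    """Dry-run style: return (before, after) pairs for values that would change."""
    return [(v, safe_fix(v)) for v in values if safe_fix(v) != v]


names = ["Alice", "  Bob ", "Cafe\u0301"]  # last value uses a combining accent
for before, after in preview(names):
    print(repr(before), "->", repr(after))
```

Only the second and third values are reported, since "Alice" is already clean.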

Watch Mode

Continuously monitor a directory for data quality:

goldencheck watch data/ --interval 30        # re-scan every 30s
goldencheck watch data/ --exit-on error      # CI mode: fail on first error

REST API

Run GoldenCheck as a microservice:

goldencheck serve --port 8000

# Scan via file upload
curl -X POST http://localhost:8000/scan --data-binary @data.csv

# Scan via URL
curl -X POST http://localhost:8000/scan/url -d '{"url": "https://example.com/data.csv"}'
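
The URL-scan call above can also be made from Python with the standard library. This sketch only builds the request; the send step is left commented out because it needs a running server. The `Content-Type` header is an assumption (the curl example does not set one), and the response format is not documented here.

```python
# Hypothetical sketch: build a POST request for the /scan/url endpoint.
import json
import urllib.request


def build_url_scan_request(base_url, data_url):
    """Build (but do not send) a POST to /scan/url with a JSON body."""
    body = json.dumps({"url": data_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/scan/url",
        data=body,
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


req = build_url_scan_request("http://localhost:8000", "https://example.com/data.csv")
print(req.get_method(), req.full_url)

# To send it against a running `goldencheck serve` instance:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```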

Database Scanning

Scan tables directly — no CSV export needed:

pip install goldencheck[db]
goldencheck scan-db "postgresql://user:pass@host/db" --table orders
goldencheck scan-db "snowflake://..." --query "SELECT * FROM orders WHERE date > '2024-01-01'"

Scheduled Runs

Cron-like scheduling with webhook notifications:

goldencheck schedule data/*.csv --interval hourly --webhook https://hooks.slack.com/...
goldencheck schedule data/*.csv --interval daily --notify-on grade-drop

LLM Boost

Add --llm-boost to enhance profiler findings with LLM intelligence. The LLM receives a representative sample of your data and:

  1. Finds issues profilers miss — semantic understanding (e.g., "12345" in a name column)
  2. Upgrades severity — knows "emails should be required" even if the profiler only says "INFO"
  3. Discovers relationships — identifies temporal ordering between columns like signup_date and last_login
  4. Downgrades false positives — "mixed phone formats are common, not an error"

# Using OpenAI
export OPENAI_API_KEY=sk-...
goldencheck data.csv --llm-boost --llm-provider openai --no-tui

# Using Anthropic
export ANTHROPIC_API_KEY=sk-ant-...
goldencheck data.csv --llm-boost --no-tui

Cost: ~$0.01 per scan (one API call with representative samples, not per-row).

Budget control:

export GOLDENCHECK_LLM_BUDGET=0.50  # max spend per scan in USD

Configuration (goldencheck.yml)

version: 1

settings:
  sample_size: 100000
  fail_on: error

columns:
  email:
    type: string
    required: true
    format: email
    unique: true

  age:
    type: integer
    range: [0, 120]

  status:
    type: string
    enum: [active, inactive, pending, closed]

relations:
  - type: temporal_order
    columns: [start_date, end_date]

ignore:
  - column: notes
    check: nullability

Only pinned rules appear in this file — not every finding. The ignore list prevents dismissed findings from reappearing.
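
To show how rules like these translate into checks, here is a minimal sketch that applies the `required`, `unique`, `range`, and `enum` rules from the example config to a list of records. The config is written as a plain Python dict mirroring the YAML above; the `validate` function is hypothetical and is not GoldenCheck's validator.

```python
# Hypothetical sketch: enforce a few pinned rules against in-memory records.
COLUMNS = {
    "email": {"required": True, "unique": True},
    "age": {"range": [0, 120]},
    "status": {"enum": ["active", "inactive", "pending", "closed"]},
}


def validate(rows, columns):
    """Return (row_index, column, rule) tuples for every violated rule."""
    violations = []
    seen = {col: set() for col, rules in columns.items() if rules.get("unique")}
    for i, row in enumerate(rows):
        for col, rules in columns.items():
            value = row.get(col)
            if value is None:
                if rules.get("required"):
                    violations.append((i, col, "required"))
                continue  # other rules do not apply to missing values
            if "range" in rules:
                low, high = rules["range"]
                if not low <= value <= high:
                    violations.append((i, col, "range"))
            if "enum" in rules and value not in rules["enum"]:
                violations.append((i, col, "enum"))
            if col in seen:
                if value in seen[col]:
                    violations.append((i, col, "unique"))
                seen[col].add(value)
    return violations


rows = [
    {"email": "a@example.com", "age": 30, "status": "active"},
    {"email": "a@example.com", "age": -5, "status": "archived"},
    {"email": None, "age": 25, "status": "pending"},
]
for violation in validate(rows, COLUMNS):
    print(violation)
```

The second record breaks three rules (duplicate email, negative age, unknown status) and the third is missing a required email.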

CLI Reference

| Command | Description |
|---|---|
| goldencheck <file> | Scan and launch TUI |
| goldencheck scan <file> | Explicit scan (supports --smart, --guided) |
| goldencheck validate <file> | Validate against goldencheck.yml |
| goldencheck review <file> | Scan + validate, launch TUI |
| goldencheck init <file> | Interactive setup wizard (scan → config → CI) |
| goldencheck diff <file> [file2] | Compare two files or against git HEAD |
| goldencheck watch <dir> | Poll directory, re-scan on change |
| goldencheck fix <file> | Auto-fix data quality issues |
| goldencheck baseline <file> | Deep-profile data and save statistical baseline to YAML |
| goldencheck learn <file> | Generate LLM validation rules |
| goldencheck history | Show scan history and trends |
| goldencheck serve | Start REST API server |
| goldencheck scan-db <conn> | Scan a database table directly |
| goldencheck schedule <files> | Run scans on a cron schedule |
| goldencheck mcp-serve | Start MCP server (19 tools) |

Flags

| Flag | Description |
|---|---|
| --no-tui | Print results to console |
| --json | JSON output |
| --fail-on <level> | Exit 1 on severity: error or warning |
| --domain <name> | Domain pack: healthcare, finance, ecommerce |
| --llm-boost | Enable LLM enhancement |
| --llm-provider <name> | LLM provider: anthropic (default) or openai |
| --mode <level> | Fix mode: safe, moderate, aggressive |
| --smart | Auto-triage: pin high-confidence, dismiss low |
| --guided | Walk through findings one-by-one |
| --webhook <url> | POST findings to Slack/PagerDuty/any URL |
| --notify-on <trigger> | Webhook trigger: grade-drop, any-error, any-warning |
| --baseline <path> | Path to baseline YAML for drift detection |
| --no-baseline | Skip auto-discovery of goldencheck_baseline.yaml |
| --skip <technique> | Skip a baseline technique (can repeat) |
| --update | Update existing baseline instead of overwriting |
| -o <path> | Output path for baseline file (default: goldencheck_baseline.yaml) |
| --version | Show version |
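
The --smart behaviour described above (pin high-confidence findings, dismiss low-confidence ones) can be pictured with a small sketch. The H/M/L labels come from this README; the dict-based finding shape, the `triage` helper, and sending medium-confidence findings to manual review are assumptions.

```python
# Hypothetical sketch of confidence-based auto-triage.
ACTION_BY_CONFIDENCE = {"H": "pin", "M": "review", "L": "dismiss"}  # "M" handling is assumed


def triage(findings):
    """Group finding columns by the action their confidence level implies."""
    buckets = {"pin": [], "review": [], "dismiss": []}
    for finding in findings:
        buckets[ACTION_BY_CONFIDENCE[finding["confidence"]]].append(finding["column"])
    return buckets


findings = [
    {"column": "email", "confidence": "H"},
    {"column": "notes", "confidence": "L"},
    {"column": "age", "confidence": "M"},
]
print(triage(findings))
```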

TypeScript CLI

npx goldencheck-js scan data.csv --json
npx goldencheck-js scan data.csv --domain healthcare
npx goldencheck-js health-score data.csv
npx goldencheck-js profile data.csv
npx goldencheck-js validate data.csv --config goldencheck.yml
npx goldencheck-js baseline data.csv --output baseline.json
npx goldencheck-js fix data.csv --mode safe
npx goldencheck-js diff old.csv new.csv
npx goldencheck-js demo

TypeScript Architecture

goldencheck (npm)
├── goldencheck/core    # Edge-safe: browsers, Workers, Edge Runtime
│   ├── types           # Finding, Severity, DatasetProfile, Config types
│   ├── data            # TabularData — zero-dep columnar abstraction
│   ├── profilers       # 10 column profilers + 4 relation profilers
│   ├── semantic        # Type classifier, suppression, 3 domain packs
│   ├── engine          # Scanner, confidence, validator, triage, differ, fixer
│   ├── baseline        # Statistical profiling, constraints, correlation, patterns
│   ├── drift           # 13 drift checks against saved baseline
│   ├── llm             # Anthropic + OpenAI via fetch(), merger, budget
│   ├── agent           # Strategy, handoff, review queue
│   └── reporters       # JSON, CI
└── goldencheck/node    # Node.js >= 20
    ├── reader          # CSV, Parquet (via nodejs-polars)
    ├── mcp             # MCP server (7 tools)
    ├── a2a             # Agent-to-Agent HTTP server
    ├── tui             # ANSI terminal output
    ├── db-scanner      # Postgres, MySQL, SQLite
    └── watcher         # Directory polling

Benchmarks

Speed

| Dataset | Time | Throughput |
|---|---|---|
| 1K rows | 0.05s | 19K rows/sec |
| 10K rows | 0.23s | 43K rows/sec |
| 100K rows | 2.29s | 44K rows/sec |
| 1M rows | 2.07s | 482K rows/sec |

DQBench v1.0 — Head-to-Head

| Tool | Mode | DQBench Score |
|---|---|---|
| GoldenCheck | zero-config | 88.40 |
| Pandera | best-effort rules | 32.51 |
| Soda Core | best-effort rules | 22.36 |
| Great Expectations | best-effort rules | 21.68 |

GoldenCheck's zero-config discovery outperforms every competitor — even when they have hand-written rules.

Run the benchmark yourself:

pip install dqbench goldencheck
dqbench run goldencheck

Detection Accuracy

| Mode | Column Recall | Cost |
|---|---|---|
| Profiler-only (v0.1.0) | 87% | $0 |
| Profiler-only (v0.2.0 with confidence) | 100% | $0 |
| With LLM Boost | 100% | ~$0.003-0.01 |

Tested on a custom benchmark with 341 planted data quality issues across 9 categories.

v0.2.0 improvements: minority wrong-type detection, range profiler chaining, broader temporal heuristics, and confidence scoring pushed profiler-only recall from 87% to 100%.

Raha Benchmark Datasets

| Dataset | Column Recall |
|---|---|
| Flights (2,376 rows) | 100% (4/4 columns) |
| Beers (2,410 rows) | 80% (4/5 columns) |
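
The column-recall figures above follow from simple set arithmetic, assuming recall means the share of columns with known issues that received at least one finding. That definition and the placeholder column names below are assumptions; the README only reports the ratios.

```python
# Hypothetical sketch: column recall as detected / planted columns.
def column_recall(detected, planted):
    """Share of columns with planted issues that were flagged at least once."""
    planted = set(planted)
    return len(planted & set(detected)) / len(planted)


# Placeholder column names; the 4-of-5 ratio matches the Beers row above.
planted = {"col_a", "col_b", "col_c", "col_d", "col_e"}
detected = {"col_a", "col_b", "col_c", "col_d"}
print(f"{column_recall(detected, planted):.0%}")  # 80%
```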

Tech Stack

| Dependency | Purpose |
|---|---|
| Polars | All data operations |
| Typer | CLI framework |
| Textual | Interactive TUI |
| Rich | CLI output formatting |
| Pydantic 2 | Config validation |

Optional: Anthropic SDK / OpenAI SDK for LLM Boost | MCP SDK for MCP server | scipy + numpy for deep baseline profiling ([baseline]) | sentence-transformers for semantic type inference in baseline ([semantic])

TypeScript / Node.js

| Dependency | Purpose |
|---|---|
| Zero runtime deps | Core package has no dependencies (edge-safe) |
| nodejs-polars | Parquet reading (optional, Node.js only) |
| csv-parse | CSV reading (Node.js only) |
| @modelcontextprotocol/sdk | MCP server (Node.js only) |

MCP Server (Claude Desktop)

GoldenCheck includes an MCP server for Claude Desktop integration:

pip install goldencheck[mcp]

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "goldencheck": {
      "command": "goldencheck",
      "args": ["mcp-serve"]
    }
  }
}

Available tools:

| Tool | Description |
|---|---|
| scan | Scan a file for data quality issues (with optional LLM boost) |
| validate | Validate against pinned rules in goldencheck.yml |
| profile | Get column-level statistics and health score |
| health_score | Quick A-F grade for a data file |
| get_column_detail | Deep-dive into a specific column |
| list_checks | List all available profiler checks |

Remote MCP Server

GoldenCheck is available as a hosted MCP server on Smithery — connect from any MCP client without installing anything.

Claude Desktop / Claude Code:

{
  "mcpServers": {
    "goldencheck": {
      "url": "https://goldencheck-mcp-production.up.railway.app/mcp/"
    }
  }
}

Local server:

pip install goldencheck[mcp]
goldencheck mcp-serve

19 tools available: scan files, validate rules, profile columns, health-score datasets, auto-configure validation, explain findings, compare domains, suggest fixes.

Jupyter / Colab

GoldenCheck renders rich HTML in Jupyter notebooks:

from goldencheck.engine.scanner import scan_file
from goldencheck.engine.confidence import apply_confidence_downgrade
from goldencheck.notebook import ScanResult

findings, profile = scan_file("data.csv")
findings = apply_confidence_downgrade(findings, llm_boost=False)

# Rich HTML display in notebooks
ScanResult(findings=findings, profile=profile)

API Quick Reference

Python

import goldencheck

# Scan a CSV for quality issues
findings, profile = goldencheck.scan_file("data.csv")
for f in findings:
    print(f"[{f.severity}] {f.column}: {f.check}: {f.message}")

# Create baseline and detect drift
from goldencheck import create_baseline, scan_file
baseline = create_baseline("data.csv")
baseline.save("goldencheck_baseline.yaml")
findings, profile = scan_file("data.csv", baseline="goldencheck_baseline.yaml")

# Health score
score = goldencheck.health_score("data.csv")
print(score)  # e.g. "B (78/100)"

TypeScript

import { scanData, TabularData, Severity } from "goldencheck";

// Scan records (edge-safe)
const data = new TabularData(records);
const { findings, profile } = scanData(data);
for (const f of findings) {
  console.log(`[${f.severity === Severity.ERROR ? "ERROR" : "WARNING"}] ${f.column}: ${f.message}`);
}

import { readFile, scanData, applyConfidenceDowngrade, healthScore } from "goldencheck/node";

// Scan a CSV file (Node.js)
const data = readFile("data.csv");
const result = scanData(data, { domain: "healthcare" });
const findings = applyConfidenceDowngrade(result.findings, false);

// Health score
const byCol: Record<string, { errors: number; warnings: number }> = {};
for (const f of findings) {
  if (f.severity >= 2) {
    byCol[f.column] ??= { errors: 0, warnings: 0 };
    byCol[f.column][f.severity === 3 ? "errors" : "warnings"]++;
  }
}
const { grade, points } = healthScore(byCol);
console.log(`Health: ${grade} (${points}/100)`);

import { readFile, createBaseline, serializeBaseline } from "goldencheck/node";
import { runDriftChecks, deserializeBaseline } from "goldencheck";
import { writeFileSync, readFileSync } from "node:fs";

// Create baseline and detect drift
const data = readFile("reference.csv");
const baseline = createBaseline(data);
writeFileSync("baseline.json", serializeBaseline(baseline));

const newData = readFile("production.csv");
const saved = deserializeBaseline(readFileSync("baseline.json", "utf-8"));
const driftFindings = runDriftChecks(newData, saved);

Contributing

See CONTRIBUTING.md for development setup and guidelines.

Author

Ben Severn

License

MIT — see LICENSE


Part of the Golden Suite

| Tool | Purpose | Install |
|---|---|---|
| GoldenCheck | Validate & profile data quality | pip install goldencheck / npm install goldencheck |
| GoldenFlow | Transform & standardize data | pip install goldenflow |
| GoldenMatch | Deduplicate & match records | pip install goldenmatch |
| GoldenPipe | Orchestrate the full pipeline | pip install goldenpipe |

Companion projects: