ML Dataset Inspection
npx @tradejs/cli ml-inspect helps validate dataset quality before training.
Source:
@tradejs/cli
Quick Start
npx @tradejs/cli ml-inspect
npx @tradejs/cli ml-inspect --strategy TrendLine --rows 20000 --mode sample
npx @tradejs/cli ml-inspect --file data/ml/export/ml-dataset-trendline-merged-123.jsonl --mode tail
Modes
- head: first N rows
- tail: last N rows
- sample: reservoir random sample
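To make the sample mode concrete, here is a minimal Python sketch of classic reservoir sampling (Algorithm R), which draws a uniform sample of N rows from a stream without knowing its length in advance. It illustrates the general technique only; the function name and the seed parameter are hypothetical, and the sampling code inside @tradejs/cli may differ.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], n: int, seed: Optional[int] = None) -> List[T]:
    """Return a uniform random sample of up to n rows from a stream.

    Hypothetical helper for illustration; not part of @tradejs/cli.
    """
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < n:
            # Fill the reservoir with the first n rows.
            reservoir.append(row)
        else:
            # Row i replaces a random slot with probability n / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                reservoir[j] = row
    return reservoir


if __name__ == "__main__":
    # Sample 5 rows from a stream of 100000 without loading it into memory.
    print(reservoir_sample(range(100_000), 5, seed=1))
```

Because only N rows are held in memory at a time, this approach suits large JSONL exports that would be expensive to load whole.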
Inspect Tools
- quick (default): built-in numeric diagnostics
- ydata: HTML profile report via ydata profiling runtime
Examples:
npx @tradejs/cli ml-inspect --tool quick --rows 15000
npx @tradejs/cli ml-inspect --tool ydata --rows 20000 --mode sample
What Quick Inspect Checks
For numeric fields it computes and flags:
- missing/non-finite rates
- near-constant features
- mostly-zero features
- outlier rate
- scale spread (p99/median and global scale mismatch)
It prints the top problematic fields by score, along with recommended fixes.
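The checks listed above can be approximated with a few lines of Python. This is a rough sketch under stated assumptions, not the CLI's implementation: the function names, the IQR-based outlier rule, the nearest-rank percentile, and every threshold are hypothetical choices made for illustration, and the global scale mismatch check and the scoring are not covered.

```python
import math
from collections import Counter
from typing import Dict, List, Optional


def percentile(sorted_vals: List[float], q: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    idx = min(len(sorted_vals) - 1, max(0, math.ceil(q * len(sorted_vals)) - 1))
    return sorted_vals[idx]


def field_diagnostics(values: List[Optional[float]]) -> Dict[str, float]:
    """Compute per-field rates similar in spirit to the quick checks."""
    total = len(values)
    finite = sorted(
        float(v) for v in values
        if isinstance(v, (int, float)) and math.isfinite(v)
    )
    if total == 0 or not finite:
        return {"missing_rate": 1.0, "constant_rate": 0.0, "zero_rate": 0.0,
                "outlier_rate": 0.0, "scale_spread": 0.0}
    n = len(finite)
    # Outliers: values outside 1.5 * IQR fences (assumed rule).
    q1, q3 = percentile(finite, 0.25), percentile(finite, 0.75)
    lo, hi = q1 - 1.5 * (q3 - q1), q3 + 1.5 * (q3 - q1)
    # Scale spread: p99 of absolute values divided by their median.
    magnitudes = sorted(abs(v) for v in finite)
    median = percentile(magnitudes, 0.5)
    p99 = percentile(magnitudes, 0.99)
    return {
        "missing_rate": (total - n) / total,
        "constant_rate": Counter(finite).most_common(1)[0][1] / n,
        "zero_rate": sum(1 for v in finite if v == 0.0) / n,
        "outlier_rate": sum(1 for v in finite if v < lo or v > hi) / n,
        "scale_spread": p99 / median if median > 0 else math.inf,
    }


def flag_issues(d: Dict[str, float]) -> List[str]:
    """Turn diagnostics into flags; all thresholds are illustrative guesses."""
    flags = []
    if d["missing_rate"] > 0.05:
        flags.append("missing")
    if d["constant_rate"] > 0.99:
        flags.append("near-constant")
    if d["zero_rate"] > 0.95:
        flags.append("mostly-zero")
    if d["outlier_rate"] > 0.01:
        flags.append("outliers")
    if d["scale_spread"] > 1000:
        flags.append("scale-spread")
    return flags


if __name__ == "__main__":
    column = [0.0] * 9 + [1000.0]
    print(flag_issues(field_diagnostics(column)))
```

A field whose median magnitude is zero is reported with an infinite scale spread here, which is one simple way to surface columns dominated by zeros with a few large values.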
YData Output
The ydata tool generates:
- <dataset-name>.profile.html next to the source dataset
Requirements:
- ydata profiling runtime available in your environment
- enough disk space for the generated .profile.html report
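Both requirements can be checked before running the ydata tool. The Python sketch below is a hypothetical preflight helper, not something shipped with @tradejs/cli. It assumes the runtime is the Python package importable as ydata_profiling, that <dataset-name> is the file name without its extension, and that 200 MB of free space is a sensible floor; adjust all three to your setup.

```python
import importlib.util
import shutil
from pathlib import Path
from typing import Dict, Union


def profile_report_path(dataset: Path) -> Path:
    """Expected report location next to the dataset (naming is an assumption)."""
    return dataset.with_name(dataset.stem + ".profile.html")


def preflight(dataset: Path, min_free_bytes: int = 200 * 1024 * 1024) -> Dict[str, Union[bool, int, str]]:
    """Check runtime availability and free disk space for the report."""
    # Assumption: the "ydata profiling runtime" is the ydata_profiling package.
    runtime_available = importlib.util.find_spec("ydata_profiling") is not None

    # Walk up to the nearest existing directory so disk_usage does not fail
    # when the export directory has not been created yet.
    probe = dataset.resolve().parent
    while not probe.exists():
        probe = probe.parent
    free_bytes = shutil.disk_usage(probe).free

    return {
        "runtime_available": runtime_available,
        "free_bytes": free_bytes,
        "enough_space": free_bytes >= min_free_bytes,
        "report_path": str(profile_report_path(dataset)),
    }


if __name__ == "__main__":
    print(preflight(Path("data/ml/export/example.jsonl")))
```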
Useful Flags
- --dir data/ml/export
- --strategy <Strategy>
- --file <explicit path>
- --rows <N>
- --mode head|tail|sample
- --limitIssues <N>
- --minFieldValues <N>
- --tool quick|ydata