A toolkit for improving the quality of your LeRobot datasets (github.com)

🤖 AI Summary
A new lightweight open-source toolkit, score_lerobot_episodes, gives robotics researchers a quantitative way to evaluate and filter LeRobot demonstration datasets by per-episode quality. It combines classic computer-vision and kinematic heuristics (blur/exposure checks, second-derivative joint smoothness, collision/acceleration spikes, path efficiency, final-joint stillness, gripper consistency, and runtime outlier detection) with an optional Gemini-powered vision-language check that grades task success.

Each episode receives 0–1 scores across these dimensions plus an aggregate score; low-scoring episodes can be removed to produce a filtered dataset that preserves the original LeRobot structure. The toolkit integrates with HuggingFace datasets, writes results to results/{repo_id}_scores.json and a terminal table, and can run baseline-vs-filtered training to measure downstream impact.

Key technical details: CLI options include --repo_id (HuggingFace), --threshold (default 0.5), --vision_type (opencv or vlm_gemini), --train-baseline / --train-filtered, and plotting. Default training uses the ACT policy, 10k steps, batch size 4, and WandB logging; checkpoints and filtered outputs are saved under ./checkpoints and ./output.

The project requires Python ≥3.8 and pip-installed dependencies; Gemini VLM use requires GOOGLE_API_KEY and may hit strict free-tier rate limits. It is Apache 2.0 licensed, documented for installation and troubleshooting, and designed to help improve model performance by removing noisy or corrupt robot episodes.
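To make the vision heuristics concrete, here is a minimal sketch of a blur/exposure check in the spirit of the toolkit's opencv mode, using the standard variance-of-Laplacian sharpness measure. The function names, normalization constant, and clipping thresholds are illustrative assumptions, not the toolkit's actual implementation.

```python
import cv2
import numpy as np

def blur_score(frame_bgr: np.ndarray) -> float:
    """Sharpness proxy in [0, 1] via variance of the Laplacian.

    Low Laplacian variance indicates a blurry frame. The 500.0
    normalization scale is an assumed constant, not the toolkit's.
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    variance = cv2.Laplacian(gray, cv2.CV_64F).var()
    return float(min(variance / 500.0, 1.0))

def exposure_score(frame_bgr: np.ndarray) -> float:
    """Penalize under/over-exposed frames by the fraction of
    clipped (near-black or near-white) pixels; cutoffs assumed."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    clipped = np.mean((gray < 10) | (gray > 245))
    return float(1.0 - clipped)
```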
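The kinematic heuristics can be sketched similarly. Below is one way to turn the discrete second derivative of joint positions into a 0–1 smoothness score and to flag acceleration spikes; the (T, D) array layout, sampling rate, squashing function, and spike threshold are all assumptions.

```python
import numpy as np

def smoothness_score(joint_positions: np.ndarray, dt: float = 1.0 / 30) -> float:
    """Score in [0, 1]: higher means smoother motion.

    joint_positions: (T, D) array of joint angles over T timesteps.
    Uses the mean magnitude of the discrete second derivative
    (acceleration) as a roughness measure, squashed into [0, 1].
    """
    accel = np.diff(joint_positions, n=2, axis=0) / (dt ** 2)
    roughness = np.mean(np.abs(accel))
    return float(1.0 / (1.0 + roughness))  # 1.0 = perfectly smooth

def has_accel_spike(joint_positions: np.ndarray,
                    dt: float = 1.0 / 30,
                    spike_threshold: float = 50.0) -> bool:
    """Flag episodes with a sudden acceleration spike, a possible
    collision signature. The threshold value is an assumed constant."""
    accel = np.diff(joint_positions, n=2, axis=0) / (dt ** 2)
    return bool(np.max(np.abs(accel)) > spike_threshold)
```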
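For the optional VLM check, the summary only says that a Gemini model grades task success and that GOOGLE_API_KEY is required. A hedged sketch using the google-generativeai package might look like the following; the model name, prompt wording, and YES/NO parsing are assumptions, and the free-tier rate limits mentioned above may require throttling between calls.

```python
import os
import google.generativeai as genai  # assumes the google-generativeai package

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")  # model choice assumed

def grade_task_success(final_frame_png: bytes, task: str) -> str:
    """Ask the VLM whether the episode's final frame shows the task
    completed. Prompt and output handling are illustrative only."""
    response = model.generate_content([
        {"mime_type": "image/png", "data": final_frame_png},
        f"Task: {task}. Does this final frame show the task "
        "successfully completed? Answer YES or NO.",
    ])
    return response.text.strip()
```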
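Finally, the per-dimension scores are combined into an aggregate that is compared against --threshold (default 0.5). The summary does not state the aggregation rule; the equal-weight mean in this sketch is one plausible choice, and the data shapes are assumed.

```python
from typing import Dict, List

def aggregate_score(dimension_scores: Dict[str, float]) -> float:
    # Equal-weight mean over the 0-1 quality dimensions
    # (assumed weighting; the toolkit may weight differently).
    return sum(dimension_scores.values()) / len(dimension_scores)

def episodes_to_keep(scores: Dict[int, Dict[str, float]],
                     threshold: float = 0.5) -> List[int]:
    # Mirror the CLI's --threshold: keep episodes whose aggregate
    # score is at or above the cutoff.
    return [ep for ep, dims in scores.items()
            if aggregate_score(dims) >= threshold]
```

Episodes below the threshold would then be dropped, with the filtered dataset keeping the original LeRobot directory structure.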