🤖 AI Summary
A new blog post demonstrates an end-to-end computer-vision pipeline that automatically detects marathon runners as they cross a start/split/finish line, crops each runner's bib region, runs OCR to extract the bib number, and records the exact crossing frame/time. The solution uses Roboflow Workflows to chain vision tasks (object detection → tracking → line counting → dynamic crop → vision-language OCR) and Google's Gemini models for OCR, producing structured JSON outputs that tie bib numbers to the moment a runner crosses the line. Video inference requires a local Roboflow inference server (localhost:9001) and a Gemini API key (watch free-tier rate limits).
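The line-counting step hinges on a simple 2D geometric test: a tracked runner's anchor point (e.g., the bounding-box bottom-center) switches sides of the drawn line between two frames. Here is a minimal sketch of that test in plain Python; it is an illustration of the idea, not Roboflow's Line Counter implementation, and the function names and the bottom-center anchor choice are assumptions.

```python
# Minimal 2D line-crossing check over per-frame tracked positions.
# Illustrative sketch only -- not Roboflow's Line Counter code.

def side_of_line(p, a, b):
    """Sign of the cross product: which side of line a->b the point p is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def crossing_frame(track, line_start, line_end):
    """Return the first frame index at which the track crosses the line,
    or None if it never does. `track` is a list of (x, y) points, one per
    frame, in the same pixel coordinates as the drawn line."""
    prev = side_of_line(track[0], line_start, line_end)
    for i in range(1, len(track)):
        cur = side_of_line(track[i], line_start, line_end)
        if prev != 0 and cur != 0 and (prev > 0) != (cur > 0):
            return i  # sign flipped: the point moved across the line
        if cur != 0:
            prev = cur
    return None

# Finish line drawn across a 1920x1080 frame at y = 800:
line = ((0, 800), (1920, 800))
runner = [(960, 700), (960, 760), (960, 820)]  # bottom-center per frame
print(crossing_frame(runner, *line))  # -> 2
```

This also makes the post's practical note concrete: the line endpoints and the track points must be in the same pixel coordinate system, which is why the drawn line's coordinates have to match the video resolution.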
Key technical pieces: RF-DETR (public model) with a class filter set to 'person' for detection, Byte Tracker to maintain identities across frames, a Line Counter plus Line Counter Visualization to define and visualize a 2D finish line, and a Continue If block to branch on crossing events. When a crossing is detected, Dynamic Crop extracts the runner, Google Gemini (vision-language block) performs Structured Output Generation to return {"bib_number":"..."}, which a JSON Parser block validates. Important practical notes: the drawn line's coordinates must match the video resolution, placement depends on camera angle (the crossing test is 2D), and the workflow outputs include json_parser, dynamic_crop, and line_counter_visualization for downstream timing and archival. This approach improves timing accuracy and enables scalable video annotation for races.
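The JSON Parser step reduces to validating the model's structured output before tying the bib number to a crossing frame. A minimal sketch of that validation, assuming the {"bib_number":"..."} schema from the workflow above (the 1–6 digit plausibility rule is my own assumption, not from the post):

```python
import json
import re

# Sketch of the JSON Parser step: validate Gemini's structured output
# {"bib_number": "..."} before recording it against a crossing frame.

def parse_bib(raw: str):
    """Parse the model's JSON reply and return the bib number string,
    or None if the payload is missing or malformed."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return None
    bib = payload.get("bib_number")
    # Accept only plausible bib numbers: 1-6 digits (illustrative rule).
    if isinstance(bib, str) and re.fullmatch(r"\d{1,6}", bib):
        return bib
    return None

print(parse_bib('{"bib_number": "4211"}'))  # -> 4211
print(parse_bib('{"bib_number": "n/a"}'))   # -> None
```

Guarding the parse like this matters in practice: vision-language OCR on a motion-blurred crop can return non-numeric or malformed payloads, and rejecting them here keeps bad reads out of the timing record.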