🤖 AI Summary
FPI-Det is a new, publicly released dataset for detecting face–phone interactions, addressing a gap in benchmarks for fine-grained human–device behavior. The dataset contains 22,879 images with synchronized annotations for faces and phones across diverse real-world settings (workplaces, schools, transportation and public spaces) and deliberately includes extreme scale variation, frequent occlusions, and varied capture conditions. The authors frame phone-use detection as more than object recognition: it requires reasoning about spatial relationships between faces, hands and devices to infer behavioral context, which is critical for applications like safety monitoring, attention management and productivity analysis.
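As a rough illustration of what "reasoning about spatial relationships" between a face and a phone could look like, the sketch below computes a scale-normalized distance between two bounding boxes. It is not the authors' method: the `Box` type, the function names, and the `max_ratio` threshold are all hypothetical, and the box format (pixel corner coordinates) is an assumption rather than the dataset's documented annotation format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixels: (x1, y1) top-left, (x2, y2) bottom-right.

    Hypothetical type for illustration; not the FPI-Det annotation schema.
    """
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    @property
    def diagonal(self):
        return math.hypot(self.x2 - self.x1, self.y2 - self.y1)


def normalized_distance(face, phone):
    """Distance between box centers, divided by the face diagonal.

    Dividing by the face size makes the value roughly independent of how
    far the person is from the camera.
    """
    if face.diagonal == 0:
        raise ValueError("face box has zero size")
    fx, fy = face.center
    px, py = phone.center
    return math.hypot(px - fx, py - fy) / face.diagonal


def is_near_face(face, phone, max_ratio=1.5):
    """Heuristic proximity test; max_ratio is an arbitrary illustrative value."""
    return normalized_distance(face, phone) <= max_ratio


if __name__ == "__main__":
    face = Box(100, 100, 200, 200)
    phone_close = Box(180, 150, 220, 230)
    phone_far = Box(600, 500, 640, 580)
    print(is_near_face(face, phone_close), is_near_face(face, phone_far))
```

A fixed geometric rule like this ignores hands, gaze and occlusion, which is exactly the kind of context the summary says a stronger model would need to account for.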
To establish baselines, the paper evaluates representative object detectors (YOLO family and DETR) on FPI-Det and presents a breakdown of performance by object size, occlusion level and environment. The analysis highlights the limitations of current detectors when phones are small, heavily occluded, or set in cluttered scenes, underscoring the need for models that better capture inter-object relationships and contextual cues. The dataset and source code are available online, offering a new benchmark for developing models that combine detection with interaction reasoning and for advancing robustness in real-world phone-use understanding.