Dangerous Streets: Using ML to Prioritize Cyclist Safety (joshfonseca.com)

🤖 AI Summary
This project built a data-driven system that ranks Austin corridors by cyclist risk, using 2,757 bike crashes from Texas DOT's CRIS (2015–2025). It prioritized explainability and actionability: demographic predictors (age, gender) were deliberately excluded so that the model would learn only from infrastructure and environmental features. A stacking ensemble (XGBoost, LightGBM, Gradient Boosting) with a logistic-regression meta-learner (5-fold CV), combined with Bayesian hyperparameter tuning and class-imbalance handling, produced crash-level risk scores. SHAP values were then used to unpack the ensemble's decisions and surface which infrastructure factors drive risk.

Key findings and implications: speed limit emerged as the single most important actionable predictor of severe crashes. Bike lanes show protective effects but are far less effective on high-speed arterials (e.g., painted lanes on 45 mph roads). Aggregating predictions into a Composite Danger Score, which weights historical crashes, predicted risk, speed, and infrastructure gaps, surfaced IH-35, US-183, and S Congress Ave among the highest-risk corridors; many dangerous segments cluster in East and South Austin. The recommended interventions follow directly: targeted speed reductions, protected facilities (not just paint), improved lighting (e.g., on S Pleasant Valley Rd), and equity-focused investment. The approach demonstrates a practical, interpretable path from ML to prioritized, place-based safety actions for Vision Zero planning.
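To make the aggregation step concrete, here is a minimal sketch of how a score like the Composite Danger Score could be computed per corridor. The summary names the four components (historical crashes, predicted risk, speed, infrastructure gaps) but not their weights or normalization, so the weights, the min-max scaling, the `Corridor` fields, and the example corridors below are all illustrative assumptions, not the author's method or data.

```python
from dataclasses import dataclass

# Hypothetical weights: the summary lists the components but not how they are weighted.
WEIGHTS = {
    "historical_crashes": 0.4,
    "predicted_risk": 0.3,
    "speed": 0.2,
    "infrastructure_gap": 0.1,
}


@dataclass
class Corridor:
    name: str
    crash_count: int            # historical crashes recorded on the corridor
    mean_predicted_risk: float  # mean crash-level model score, assumed in [0, 1]
    speed_limit_mph: float
    infrastructure_gap: float   # assumed scale: 0 = protected facility, 1 = none


def min_max(values):
    """Scale values to [0, 1]; return all zeros if every value is equal."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def composite_danger_scores(corridors, weights=WEIGHTS):
    """Return {corridor name: weighted sum of the four normalized components}."""
    crashes = min_max([c.crash_count for c in corridors])
    speeds = min_max([c.speed_limit_mph for c in corridors])
    scores = {}
    for c, crash_n, speed_n in zip(corridors, crashes, speeds):
        scores[c.name] = (
            weights["historical_crashes"] * crash_n
            + weights["predicted_risk"] * c.mean_predicted_risk
            + weights["speed"] * speed_n
            + weights["infrastructure_gap"] * c.infrastructure_gap
        )
    return scores


def rank_corridors(corridors, weights=WEIGHTS):
    """Return corridor names ordered from highest to lowest composite score."""
    scores = composite_danger_scores(corridors, weights)
    return sorted(scores, key=scores.get, reverse=True)


# Made-up corridors for illustration only (not values from the analysis).
example = [
    Corridor("Corridor A", 30, 0.9, 45, 1.0),
    Corridor("Corridor B", 10, 0.2, 25, 0.0),
    Corridor("Corridor C", 20, 0.5, 35, 0.5),
]
print(rank_corridors(example))
```

A weighted sum keeps each component's contribution easy to read off, which fits the project's stated emphasis on explainability; other aggregation schemes would be equally compatible with the summary.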