🤖 AI Summary
A blogger experimented with “access logging in 2025” and found that traditional server logs are now almost useless for measuring human readership: most traffic is bots, including AI-company crawlers that either self-identify or impersonate browsers. Modern analytics has moved to JavaScript telemetry (e.g., Google Analytics), giving large vendors deep visibility into browsing. Simpler tricks, such as an invisible <img> beacon that relies on the Referer header or a setTimeout-based dwell-time check, fail because bots fetch images and increasingly execute page JS, even waiting long enough to mimic human delays. Feed readers and users without JS also evade such logging, and subscriber counts reported by feed readers don’t reliably translate into actual human reads.
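A minimal sketch of the two client-side tricks the summary mentions (an invisible image beacon and a setTimeout dwell-time check). The endpoint paths, the 10-second threshold, and the parameter names are assumptions for illustration, not details from the original post.

```typescript
// Hypothetical beacon endpoints; the original post does not specify any paths.
const BEACON_URL = "/log/pixel.gif";
const DWELL_URL = "/log/dwell";
const DWELL_MS = 10_000; // assumed threshold: 10s on page counts as a "read"

// 1) Invisible image beacon: the browser sends the Referer header with the
//    image request, so the server log records which page was actually rendered.
function firePixel(): void {
  const img = new Image(1, 1);
  img.style.display = "none";
  img.src = `${BEACON_URL}?t=${Date.now()}`; // cache-buster
  document.body.appendChild(img);
}

// 2) Dwell-time check: only report a "read" if the tab stays open for a while.
//    Bots that execute JS but leave immediately never trigger it, though the
//    post notes smarter crawlers now wait long enough to pass this test too.
function reportDwell(): void {
  window.setTimeout(() => {
    navigator.sendBeacon(
      DWELL_URL,
      JSON.stringify({ page: location.pathname, dwellMs: DWELL_MS })
    );
  }, DWELL_MS);
}

firePixel();
reportDwell();
```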
The piece highlights a broader technical and product lesson for AI/ML engineers: bot detection is an arms race—user-agent and IP heuristics, image beacons, JS execution tests and timing thresholds are all brittle against smarter crawlers. That has implications for analytics accuracy, privacy and centralization of telemetry data, and for how we define a “reader.” The practical takeaway is to clarify objectives (archive for oneself vs. reach real humans) and favor explicit signals—subscriptions, email signups or opt-in metrics—over noisy, adversarial access logs.
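To illustrate why user-agent heuristics are brittle, here is a toy log filter of the kind the post argues against; the crawler patterns are illustrative, and any bot that spoofs a browser User-Agent string passes it unchallenged.

```typescript
// Toy filter: treat a request as bot traffic if its User-Agent matches a
// known-crawler pattern. The list is illustrative, not exhaustive, and a
// crawler impersonating a browser is counted as "human" anyway.
const BOT_PATTERNS: RegExp[] = [
  /Googlebot/i,
  /bingbot/i,
  /GPTBot/i,          // OpenAI's self-identifying crawler
  /ClaudeBot/i,       // Anthropic's self-identifying crawler
  /python-requests/i,
  /curl\//i,
];

function looksLikeBot(userAgent: string): boolean {
  return BOT_PATTERNS.some((pattern) => pattern.test(userAgent));
}

// A self-identifying crawler is caught; an impersonating one is not.
const spoofed =
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 Chrome/120.0 Safari/537.36";
console.log(looksLikeBot("GPTBot/1.0")); // true
console.log(looksLikeBot(spoofed));      // false, yet it may well be a bot
```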