Toward provably private insights into AI use (research.google)

🤖 AI Summary
Google announced "provably private insights" (PPI), a production system that uses confidential federated analytics (CFA) to let developers analyze unstructured, on-device generative AI outputs without exposing individual data. By running open-source LLMs (Google’s Gemma family) inside trusted execution environments (TEEs) and combining their structured-summarization outputs with user-level differential privacy (DP), the system produces aggregate statistics (e.g., topic histograms, frustration detection, auto-ratings) that cannot be traced back to any single user. The Recorder app on Pixel is an initial deployment: selected transcripts are encrypted on-device, decrypted only inside attested TEEs (AMD SEV‑SNP via Project Oak), processed by a Gemma 3 4B model, and released as DP-noised aggregates (for example, ε = 1).

Technically, CFA enforces end-to-end verifiability: devices upload encrypted data, and only pre-approved, open-source processing steps (logged in Rekor) running in TEEs can access the decryption keys. The LLM "data expert" and the DP aggregation both run inside the TEE, and all privacy-relevant code is published in the Google Parfait open-source repositories so third parties can reproduce and attest the stack.

Implications: developers gain realistic, privacy-preserving signals to improve on-device GenAI (including LLM auto-raters, future DP clustering, and synthetic-data workflows) while minimizing raw-data exposure, a step toward auditable, provable privacy for large-scale, unstructured AI telemetry.
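As a rough illustration of the release gate described above, the sketch below models a device that agrees to release a transcript only when the TEE's attested binary measurement matches an entry derived from a transparency log such as Rekor. This is a simplified stand-in under assumed data structures and helper names, not the Project Oak or Parfait APIs.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class AttestationEvidence:
    tee_public_key: bytes      # public key the attested TEE would use for decryption
    binary_measurement: str    # digest of the processing step running in the TEE

def measurement_allowlist(approved_binaries):
    # In the real system the allowlist is derived from pre-approved, open-source
    # processing steps recorded in a transparency log (Rekor); here we simply
    # hash a list of stand-in binary blobs.
    return {hashlib.sha256(b).hexdigest() for b in approved_binaries}

def may_release(evidence: AttestationEvidence, allowlist: set) -> bool:
    # The device encrypts and uploads its transcript only when the attested
    # binary is on the approved list; otherwise the data never leaves the device.
    return evidence.binary_measurement in allowlist

allowlist = measurement_allowlist([b"summarizer-step-v1", b"dp-aggregator-step-v1"])
evidence = AttestationEvidence(
    tee_public_key=b"tee-public-key",
    binary_measurement=hashlib.sha256(b"summarizer-step-v1").hexdigest(),
)
print(may_release(evidence, allowlist))  # True -> device proceeds to encrypt and upload
```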
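And a minimal sketch of the final DP release step, assuming each user contributes at most one topic label (the structured output of the in-TEE summarizer) and using the Laplace mechanism at ε = 1; the mechanism choice, labels, and function names are illustrative, not the production implementation.

```python
import collections
import numpy as np

def dp_topic_histogram(user_topics, topics, epsilon=1.0, rng=None):
    # With one label per user, adding or removing a user changes a single
    # bucket by at most 1 (L1 sensitivity 1), so Laplace noise with scale
    # 1/epsilon yields an epsilon-DP histogram.
    rng = rng or np.random.default_rng()
    counts = collections.Counter(t for t in user_topics if t in topics)
    return {t: counts[t] + rng.laplace(scale=1.0 / epsilon) for t in topics}

# Hypothetical labels emitted by the in-TEE summarizer, one per user.
labels = ["meeting notes", "lecture", "meeting notes", "interview", "lecture"]
print(dp_topic_histogram(labels, ["meeting notes", "lecture", "interview", "other"], epsilon=1.0))
```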