How to Harden AI Instances for Privacy and Security (techshinobi.org)

🤖 AI Summary
Researchers found widespread exposure of local AI servers: a Shodan scan by Cisco located over 1,000 Ollama instances (default port 11434) in ten minutes, and Censys later enumerated 10.6K Ollama hosts, of which ~1.5K responded to prompts. Instances listening on 0.0.0.0 are open to remote code execution, prompt injection/poisoning, and leakage of private chat memory, a major privacy and security risk for anyone running models that handle sensitive data.

Practical mitigations focus on restricting bind addresses and disabling telemetry. For Ollama, set OLLAMA_HOST to 127.0.0.1 or your LAN IP in the systemd override and restart the service; for Docker, bind the container port to a specific host IP (e.g. -p 192.168.x.x:11434:11434). Harbor is isolated by default via Docker networking; avoid overriding its internal URLs unless necessary.

For web UIs, run Gradio with GRADIO_SHARE=False, GRADIO_SERVER_NAME=192.168.x.x, and GRADIO_ANALYTICS_ENABLED=False (plus DISABLE_TELEMETRY=1), and run Streamlit with --browser.gatherUsageStats false --browser.serverAddress 192.168.x.x or --server.headless false. Add system-level environment exports (TRANSFORMERS_OFFLINE, HF_HUB_OFFLINE, DISABLE_TELEMETRY, etc.) and harden the Linux host and network (firewall/VLAN, pfSense/OPNsense) as a last line of defense. Finally, browse local WebUIs with a privacy-minded browser and minimize extensions to reduce client-side data leakage. Example configurations for these steps are sketched below.
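A minimal sketch of the Ollama bind-address fix via a systemd drop-in, assuming the stock ollama.service unit name; whether you pick 127.0.0.1 or a LAN IP depends on which machines need access:

```sh
# Pin Ollama to loopback via a systemd drop-in, then restart.
# Assumes the stock ollama.service unit; substitute your LAN IP
# for 127.0.0.1 if other hosts on the network need the API.
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=127.0.0.1"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
```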
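The same idea under Docker, publishing the port on one host interface instead of 0.0.0.0; 192.168.1.10 is a placeholder for the host's actual LAN IP:

```sh
# Publish Ollama's API only on a specific LAN address.
# 192.168.1.10 is a placeholder; substitute your host's LAN IP,
# or 127.0.0.1 to keep the API local to the host.
docker run -d --name ollama \
  -v ollama:/root/.ollama \
  -p 192.168.1.10:11434:11434 \
  ollama/ollama
```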
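For Gradio, the variables named above can be exported before launching the app; app.py stands in for whatever script starts your UI, and the IP is again a placeholder:

```sh
# Keep Gradio off the public share tunnel, bind it to one LAN
# address, and switch off analytics/telemetry reporting.
export GRADIO_SHARE=False
export GRADIO_SERVER_NAME=192.168.1.10   # placeholder LAN IP
export GRADIO_ANALYTICS_ENABLED=False
export DISABLE_TELEMETRY=1
python app.py
```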
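The equivalent Streamlit invocation, passing the flags from the summary on the command line (app.py again hypothetical); note that --browser.serverAddress sets the address clients are told to use rather than the bind address:

```sh
# Disable usage-stats collection and point clients at the LAN
# address (placeholder IP), per the flags listed above.
streamlit run app.py \
  --browser.gatherUsageStats false \
  --browser.serverAddress 192.168.1.10 \
  --server.headless false
```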
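The system-level exports could live in a login-shell profile script; the path below is an assumption, and only the three variables named in the summary are shown:

```sh
# /etc/profile.d/ai-privacy.sh  (path is an assumption; any
# login-shell profile works). Keeps Hugging Face libraries from
# phoning home or fetching models at runtime.
export TRANSFORMERS_OFFLINE=1
export HF_HUB_OFFLINE=1
export DISABLE_TELEMETRY=1
```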
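As a last line of defense, a host firewall can restrict the Ollama port to a trusted subnet. The summary names firewall/VLAN segmentation and pfSense/OPNsense at the network edge; ufw is used here only as one concrete example, with 192.168.1.0/24 as a placeholder subnet:

```sh
# Deny inbound by default, then allow the Ollama port only from
# the trusted LAN segment (placeholder subnet).
sudo ufw default deny incoming
sudo ufw allow from 192.168.1.0/24 to any port 11434 proto tcp
sudo ufw enable
```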