Watermarking for Generative AI (arxiv.org)

🤖 AI Summary
Researchers introduce InvGNN-WM, a trigger-free watermarking scheme for Graph Neural Networks that ties model ownership to the model's implicit perception of a topological invariant (normalized algebraic connectivity). Instead of embedding backdoor triggers, InvGNN-WM adds a lightweight head that predicts this invariant on an owner-private carrier set; a sign-sensitive decoder converts the predictions into bitstrings, and a calibrated threshold limits false positives. The method supports black-box verification, preserves original task accuracy, and comes with formal guarantees on imperceptibility and robustness; the authors also prove that exact watermark removal is NP-complete.

Technically, InvGNN-WM was evaluated across node- and graph-classification tasks and multiple GNN backbones, outperforming trigger- and compression-based baselines on watermark accuracy while matching clean performance. It stays robust under typical model edits such as unstructured pruning, fine-tuning, and post-training quantization; plain knowledge distillation can weaken the mark, but augmenting distillation with a watermark loss (KD+WM) restores it.

By anchoring ownership to a model's internal response to a graph invariant rather than to explicit triggers, InvGNN-WM offers a stealthy, hard-to-remove IP protection mechanism for GNNs that balances detectability, robustness, and minimal impact on utility.
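To make the verification pipeline more concrete, here is a minimal sketch of the invariant-plus-decoder idea described above, assuming a Python/NetworkX setting. The names `model_predict`, `carrier_graphs`, and `owner_bits`, as well as the particular sign rule and threshold, are hypothetical illustrations and not the authors' reference implementation.

```python
# Hypothetical sketch of invariant-based, black-box watermark verification.
# Not the InvGNN-WM reference code; names and the exact decoding rule are assumptions.
import numpy as np
import networkx as nx


def normalized_algebraic_connectivity(g: nx.Graph) -> float:
    """Second-smallest eigenvalue of the symmetric normalized Laplacian."""
    L = nx.normalized_laplacian_matrix(g).toarray()
    eigvals = np.sort(np.linalg.eigvalsh(L))
    return float(eigvals[1])


def decode_bits(predictions, targets):
    """Sign-sensitive decoding: each carrier graph yields one bit, depending on
    whether the model's predicted invariant lies above or below the true value."""
    return [1 if p >= t else 0 for p, t in zip(predictions, targets)]


def verify(model_predict, carrier_graphs, owner_bits, threshold=0.9):
    """Black-box verification: query the suspect model on the owner-private
    carrier set and check bit agreement against a calibrated threshold."""
    targets = [normalized_algebraic_connectivity(g) for g in carrier_graphs]
    preds = [model_predict(g) for g in carrier_graphs]  # black-box queries only
    bits = decode_bits(preds, targets)
    accuracy = float(np.mean([b == o for b, o in zip(bits, owner_bits)]))
    return accuracy >= threshold, accuracy
```

In the paper, the decoder and the detection threshold are calibrated to bound false positives; the sketch above only shows the query-and-compare skeleton of black-box verification, not that calibration.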