AI training may leak secrets to canny thieves

Nicholas Carlini (photo: Kore Chan/Daily Cal)

A paper released on arXiv last week by a team of researchers including Prof. Dawn Song and Ph.D. student Nicholas Carlini (B.A. CS/Math ’13) reveals just how vulnerable deep learning is to information leakage. The researchers label the problem “unintended memorization” and explain that it can be exploited if miscreants gain access to the model and apply a variety of search algorithms. That’s not an unrealistic scenario, considering that the code for many models is available online, and it means that text messages, location histories, emails or medical data could be leaked.

The team doesn’t “really know why neural networks memorize these secrets right now,” Carlini says. “At least in part, it is a direct response to the fact that we train neural networks by repeatedly showing them the same training inputs over and over and asking them to remember these facts.”

The surest way to avoid the problem is never to feed secrets in as training data. If that is unavoidable, developers will have to apply differentially private learning mechanisms to bolster security, Carlini concluded.
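To make the search idea concrete, here is a minimal sketch (not the authors’ code) of how such an attack might look, assuming an attacker has black-box query access to the trained model through a hypothetical char_log_prob(prefix, ch) function that returns the model’s log-probability of the next character. The attacker scores every candidate fill-in for a known template, such as a short numeric code, and a secret the model memorized during training tends to rank far above random guesses.

```python
import math
from itertools import product

# Hypothetical stand-in for the target model's query API:
# returns log P(next_char | prefix) under the trained model.
def char_log_prob(prefix: str, next_char: str) -> float:
    raise NotImplementedError("replace with access to the model under attack")

def sequence_log_prob(prefix: str, secret: str) -> float:
    """Sum the model's log-probabilities over each character of a candidate secret."""
    total = 0.0
    for ch in secret:
        total += char_log_prob(prefix, ch)
        prefix += ch
    return total

def rank_candidates(prefix: str, digits: int = 4):
    """Score every possible fill-in for a short numeric secret, best-first.

    A secret that was memorized from the training data will typically
    receive a much higher score than the other candidates.
    """
    candidates = ("".join(d) for d in product("0123456789", repeat=digits))
    scored = [(sequence_log_prob(prefix, c), c) for c in candidates]
    scored.sort(reverse=True)
    return scored

# Example usage (illustrative only):
# ranked = rank_candidates("my PIN is ", digits=4)
# print(ranked[:5])  # a memorized code, if any, tends to surface here
```

This brute-force ranking is only the intuition; for longer secrets an attacker would prune the search (for example with a beam search over the model’s predictions) rather than enumerate every possibility.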