First-order Adversarial Vulnerability of Neural Networks and Input Dimension

International Conference on Machine Learning (ICML)


Over the past few years, neural networks were proven vulnerable to adversarial images: targeted but imperceptible image perturbations lead to drastically different predictions. We show that adversarial vulnerability increases with the gradients of the training objective when viewed as a function of the inputs. Surprisingly, vulnerability does not depend on network topology: for many standard network architectures, we prove that at initialization, the l1-norm of these gradients grows as the square root of the input dimension, leaving the networks increasingly vulnerable with growing image size. We empirically show that this dimension dependence persists after either usual or robust training, but gets attenuated with higher regularization.

Related Publications

All Publications

NeurIPS - December 7, 2020

Labelling unlabelled videos from scratch with multi-modal self-supervision

Yuki M. Asano, Mandela Patrick, Christian Rupprecht, Andrea Vedaldi

NeurIPS - December 7, 2020

Adversarial Example Games

Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

NeurIPS - December 7, 2020

Learning Search Space Partition for Black-box Optimization using Monte Carlo Tree Search

Linnan Wang, Rodrigo Fonseca, Yuandong Tian

NeurIPS - December 7, 2020

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

Yuandong Tian, Qucheng Gong, Tina Jiang

To help personalize content, tailor and measure ads, and provide a safer experience, we use cookies. By clicking or navigating the site, you agree to allow our collection of information on and off Facebook through cookies. Learn more, including about available controls: Cookies Policy