July 10, 2018
Gradient Descent Learns One-hidden-layer CNN: Don’t be Afraid of Spurious Local Minima
International Conference on Machine Learning (ICML)
We consider the problem of learning a one-hidden-layer neural network with a non-overlapping convolutional layer and ReLU activation function, i.e., f(Z; w, a) = Σ_j a_j σ(w^T Z_j), in which both the convolutional weights w and the output weights a are parameters to be learned.
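A minimal sketch of this model's forward pass, assuming each non-overlapping patch Z_j is stored as a column of a matrix Z (the dimensions and variable names below are illustrative, not from the paper):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def forward(Z, w, a):
    """One-hidden-layer CNN with non-overlapping patches:
    f(Z; w, a) = sum_j a_j * relu(w^T Z_j),
    where Z_j is the j-th column (patch) of Z.

    Z: (patch_dim, num_patches) -- non-overlapping input patches
    w: (patch_dim,)             -- shared convolutional filter
    a: (num_patches,)           -- output-layer weights
    """
    return float(a @ relu(Z.T @ w))

# Hypothetical usage with random data
rng = np.random.default_rng(0)
patch_dim, num_patches = 8, 4
Z = rng.normal(size=(patch_dim, num_patches))
w = rng.normal(size=patch_dim)
a = rng.normal(size=num_patches)
print(forward(Z, w, a))
```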
By: Simon S. Du, Jason D. Lee, Yuandong Tian, Barnabás Póczos, Aarti Singh
Facebook AI Research