Convergence of Linear Neural Networks to Global Minimizers

This is a Master's thesis from KTH/Matematik (Inst.)

Abstract: It is known that gradient flow in linear neural networks with Euclidean loss almost always avoids critical points that have at least one eigendirection of negative curvature. Using algebraic invariants of the gradient flow, we try to prove that the set of all critical points with no second-order curvature (zero Hessian) for arbitrary networks is associated with a lower-dimensional subset of the invariants. This would mean that these critical points are almost surely avoided. We show that this holds for networks with $3$ or fewer hidden layers and in a few other special cases. We show by way of an explicit counterexample that it does not hold for general deep networks.
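For context, the following is a minimal sketch of the standard gradient-flow setting for deep linear networks and its well-known conserved quantities; the notation ($W_i$, $X$, $Y$, $N$) is illustrative and not taken from the thesis itself.

% A deep linear network with N weight matrices and Euclidean (squared) loss.
\[
  f(x) = W_N W_{N-1} \cdots W_1 x,
  \qquad
  L(W_1,\dots,W_N) = \tfrac{1}{2}\,\lVert W_N \cdots W_1 X - Y \rVert_F^2 .
\]
% Gradient flow evolves each layer along the negative gradient of the loss.
\[
  \dot{W}_i(t) = -\nabla_{W_i} L\bigl(W_1(t),\dots,W_N(t)\bigr),
  \qquad i = 1,\dots,N .
\]
% Along any trajectory of this flow, the "balancedness" differences between
% consecutive layers are conserved; these give algebraic invariants of the flow.
\[
  \frac{d}{dt}\Bigl( W_{i+1}^{\top} W_{i+1} - W_i W_i^{\top} \Bigr) = 0,
  \qquad i = 1,\dots,N-1 .
\]

Invariants of this kind constrain which critical points a trajectory can approach, which is what makes a dimension-counting argument over the invariants plausible.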
