Meetings
Modular Duality in Deep Learning
Approaching Deep Learning through the Spectral Dynamics of Weights
The importance of discretisation drift in deep learning
Information-theoretic generalization bounds for black-box learning algorithms
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
Predicting grokking long before it happens
The geometry of neural nets' parameter spaces under reparametrization
Can Neural Network Memorization Be Localized?
DINO v1 and v2: Self-Supervised Vision Transformers
SGD with Large Step Sizes Learns Sparse Features
Bottleneck structure in large depth networks
Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent
Understanding the edge of stability
Lottery ticket hypothesis and its current state
Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks
A Loss Curvature Perspective on Training Instability in Deep Learning
When Are Solutions Connected in Deep Networks?
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Towards Understanding Sharpness-Aware Minimization
When Do Neural Networks Outperform Kernel Methods?
SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
Does the Data Induce Capacity Control in Deep Learning?
Deep Ensembles: A Loss Landscape Perspective
Taxonomizing local versus global structure in neural network loss landscapes
The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks
The Geometry of Neural Network Landscapes: Symmetry-Induced Saddles & Global Minima Manifold
Exploring Generalization in Deep Learning