## Meetings

### Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks

### Predicting grokking long before it happens

### The geometry of neural nets' parameter spaces under reparametrization

### Can Neural Network Memorization Be Localized?

### DINO v1 and v2: Self-Supervised Vision Transformers

### SGD with Large Step Sizes Learns Sparse Features

### Bottleneck structure in large depth networks

### Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent

### Understanding edge of stability

### Lottery ticket hypothesis and its current state

### Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

### A Loss Curvature Perspective on Training Instability in Deep Learning

### When Are Solutions Connected in Deep Networks?

### From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent

### Towards Understanding Sharpness-Aware Minimization

### When Do Neural Networks Outperform Kernel Methods?

### SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs

### Does the Data Induce Capacity Control in Deep Learning?

### Deep Ensembles: A Loss Landscape Perspective

### Taxonomizing local versus global structure in neural network loss landscapes

### The Effects of Mild Over-parameterization on the Optimization Landscape of Shallow ReLU Neural Networks

### The Geometry of Neural Network Landscapes: Symmetry-Induced Saddles & Global Minima Manifold

### Exploring Generalization in Deep Learning
