Department of Computer Science and Operations Research

Thesis defense - Sara Hooker

Hello everyone,


You are cordially invited to the thesis defense of Sara Hooker on August 30 at 2:30 p.m. (hybrid format).


Title: Beyond Top Line Metrics: Understanding the Trade-off Between Model Size and Generalization Properties

Date: August 30, 2024, from 14:30 to 17:30 EST

Location: Auditorium 1, MILA + Zoom link

 

Jury

Chair and rapporteur: Agrawal, Aishwarya
Research supervisor: Courville, Aaron
Research co-supervisor: Larochelle, Hugo
Regular member: Farnadi, Golnoosh
External examiner: Frankle, Jonathan, Databricks Inc.

 

Abstract

An argument in favor of scaling the size of modern algorithms is that it is a surprisingly simple recipe that has provided persuasive gains in overall performance. Ken Thompson famously said “When in doubt, use brute force.” It is costly to deviate from the predictable gains of adding more parameters, particularly when different regimes of parameter size appear to unlock new and unexpected generalization properties. However, a key limitation of simply throwing more parameters at a task is that the relationship between weights and generalization remains poorly understood.

The works we will discuss ask “What is gained or lost as we vary the number of parameters in a deep neural network?” This question is very relevant in an era of scientific inquiry where the large size of networks incurs prohibitive energy costs and hurts accessibility.

The key finding across the constituent works is that we pay an exorbitant amount in compute to learn rare patterns in the world around us. When we radically vary the number of parameters, we lose performance on a tiny slice of the distribution: the long tail. Most natural datasets follow a long-tail distribution, with many infrequent attributes. Hence, the findings of this thesis have widespread implications for understanding the limitations of our current optimization approaches for modeling the real world.