Understanding and Aligning Large Scale Deep Learning
By
David Krueger
University of Cambridge
mercredi 21 février 2024, 10:30-11:30 EST, Salle 6214
Pavillon André-Aisenstadt, Université de Montréal, 2920 Chemin de la Tour
Abstract: Large scale deep learning systems are poised to have a transformative impact on society, and may pose an existential risk to humanity. The safety of these systems depends on our ability to understand them and harness their capabilities. I will talk about my work on methods for doing this, and the limitations of existing approaches. I will cover my recent and ongoing work on alignment failure modes, and the limitations of fine-tuning (such as reinforcement learning from human feedback) as an alignment approach. A key theme is the need to better understand how deep learning systems learn and generalize, in order to predict and steer their behavior. I will also discuss how my technical work informs and is informed by AI governance.
Bio: David is an Assistant Professor at the University of Cambridge and a member of Cambridge's Computational and Biological Learning Lab (CBL) and Machine Learning Group (MLG). His research group focuses on deep learning, AI alignment, and AI safety. He is broadly interested in work (including in areas outside of machine learning, e.g. AI governance) that could reduce the risk of human extinction ("x-risk") resulting from out-of-control AI systems. Prior to joining the University of Cambridge, David did his PhD at Université de Montréal under the supervision of Aaron Courville.