Understanding and Aligning Large Scale Deep Learning
By
David Krueger
University of Cambridge
mercredi 21 février 2024, 10:30-11:30 EST, Salle 6214
Pavillon André-Aisenstadt, Université de Montréal, 2920 Chemin de la Tour
Abstract: Large scale deep learning systems are poised to have a transformative impact on society, and may pose an existential risk to humanity. The safety of these systems depends on our ability to understand them and harness their capabilities. I will talk about my work on methods for doing this, and the limitations of existing approaches. I will cover my recent and ongoing work on alignment failure modes, and the limitations of fine-tuning (such as reinforcement learning from human feedback) as an alignment approach. A key theme is the need to better understand how deep learning systems learn and generalize, in order to predict and steer their behavior. I will also discuss how my technical work informs and is informed by AI governance.
Bio: David is an Assistant Professor at the University of Cambridge and a member of Cambridge's Computational and Biological Learning Lab (CBL) and Machine Learning Group (MLG). His research group focuses on deep learning, AI alignment, and AI safety. He is broadly interested in work (including in areas outside of machine learning, e.g. AI governance) that could reduce the risk of human extinction ("x-risk") resulting from out-of-control AI systems. Prior to joining the University of Cambridge, David did his PhD at Université de Montréal under the supervision of Aaron Courville.