PhD Defense / Soutenance de thèse - Jing Shan Shawn Tan
Dear all / Bonjour à tous,
We are happy to invite you to Jing Shan Shawn Tan's PhD defense on
Wednesday, June 11th at 2:30 pm (virtual).
Vous êtes cordialement invité.e.s à la soutenance de thèse de Jing Shan Shawn
Tan, le mercredi 11 juin à 14h30 (en ligne).
Title: Neural Architectures for Compositional Generalisation
Date: June 11th at 2:30 pm
Location: Virtual
Link: umontreal.zoom.us/j/2438793436
Jury
President / Présidente: Bang Liu
Director / Directeur de recherche: Aaron Courville
Member / Membre: Alessandro Sordoni
External examiner / Examinateur externe: Yoon Kim (TBD)
Abstract:
In this thesis, we explore neural model architectures that aim to deal with compositional generalisation problems by incorporating notions from formal linguistics into neural architecture design. We present five papers with this goal in mind: (1) Ordered Neurons (Shen et al., 2019), a variant of the LSTM that introduces a syntactic inductive bias. (2) Ordered Memory (Shen et al., 2019), which overcomes some limitations of Ordered Neurons and performs tree-structured encoding with a stack-augmented neural model. (3) Connectionist Trees (Tan et al., 2020), a decoding counterpart to Ordered Memory, with a specifically designed dynamic programming loss for training tree structures without supervision. (4) Truncated Flows (Tan et al., 2022), a dequantisation without full support over the real space; using rejection sampling to sample from this space allows for improvements in constrained generation tasks. (5) Sparse Universal Transformers (Tan et al., 2023), which scale up Universal Transformers using Sparse Mixture-of-Experts (SMoEs). These papers propose different architectures that allow for better performance on compositional generalisation tasks. Finally, we discuss what lies ahead for compositional generalisation in the era of large language models, and where compositional generalisation problems may still arise.