Events - Département d'informatique et de recherche opérationnelle

Predoc III presentation by Amin Bonyadkhalaj

Tue, 30 Jun 2026 09:30:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Amin Bonyadkhalaj on June 30 at 9:30 am.

Title: A Cognitive Multi-Agent Architecture for Virtual Pilots

Date: Tuesday, June 30, at 9:30 am

Location: Pavillon André-Aisenstadt, 2920 Ch. de la Tour, room 3195

Jury

Chair	Nadia El-Mabrouk
Supervisor	Claude Frasson
Co-supervisor	Hamdi Ben Abdessalem
Member	Jian-Yun Nie

Abstract

Modern aviation places pilots in highly complex environments where advanced avionics, automation, and time-critical operational demands converge. Although automation has reduced some aspects of manual workload, it has also introduced cognitive risks such as mode confusion, reduced situational awareness, and overload during critical flight phases. This thesis proposes a Cognitive Multi-Agent Architecture for Virtual Pilot Assistance, developed within the C-Pilot project. The proposed architecture aims to transform processed pilot-state and flight-context information into governed, traceable, and non-intrusive cognitive assistance. It is organized around a central Coordinator and three specialized agents: the Companion Agent, the Recommender Agent, and the Derivation Agent. The Coordinator acts as the sole decision authority, enforcing safety policies, timing constraints, and assistance rules before any output is delivered to the pilot. The research builds on previous simulator-based studies on pilot workload, attention, physiological responses, facial temperature, and experience effects. These studies provide the empirical foundation for the thesis by showing that pilot cognitive state can be measured and used as contextual input. The main contribution of the thesis is therefore not only the estimation of pilot state, but the design and validation of a governed cognitive-agentic architecture capable of using this information safely for virtual pilot assistance.

Thesis defense - Samuel Ducharme

Fri, 26 Jun 2026 10:30:00 -0400

Hello everyone,

You are cordially invited to the thesis defense of Samuel Ducharme, which will take place this Friday, June 26, at 10:30 am.

Title: Vérification interactive efficace de calculs quantiques délégués pour des problèmes à oracle

Date: June 26, 2026, 10:30 am

Room: Pavillon André-Aisenstadt, 2920 Ch. de la Tour, room AA-3195

Jury

Chair	Louis Salvail
Research supervisor	Gilles Brassard
Jury member	Frédéric Dupuis
External examiner	Dave Touchette
Dean's representative	Richard Mackenzie

Les Grandes Retrouvailles - DIRO alumni reunion

Sat, 09 May 2026 16:00:00 -0400

Thesis defense - Sifan Wu

Thu, 26 Mar 2026 08:30:00 -0400

Hello everyone,

You are cordially invited to the thesis defense of Sifan Wu on March 26 at 8:30 am EDT (hybrid mode).

Title: The Spectrum of Structured Reasoning in Multimodal Foundation Models.

Date: March 26 at 8:30 am

Room: Pavillon André-Aisenstadt, 2920 Ch. de la Tour, room AA-3195

Jury

Chair / Dean's representative	Philippe Langlais
Research supervisor	Bang Liu
Jury member	Jian-Yun Nie
External examiner	Lizi Liao

Abstract:

Recent advances in multimodal large language models (MLLMs) have demonstrated remarkable capabilities in general-purpose natural language and vision understanding. However, a significant gap remains when these probabilistic models are deployed in specialized scientific and engineering domains, which demand strict adherence to explicit constraints, logical consistency, and physicochemical principles. To address this challenge, this thesis proposes the "Spectrum of Structured Reasoning", a conceptual framework that characterizes multimodal reasoning tasks along three orthogonal axes: Modality, Structural Intensity, and Agency.

This research systematically studies this spectrum through four methodological contributions. First, addressing implicit structure in text, the thesis introduces KADE, a knowledge-augmented framework that leverages external knowledge graphs and analogical retrieval to identify latent event causality in textual narratives characterized by data scarcity. Second, targeting explicit structure in engineering, the work presents CAD-LLM and CadVLM. These models treat parametric CAD sketches as constraint-governed sequences, using vision-language pre-training to generate geometrically and topologically valid engineering designs from multimodal inputs. Third, for weakly structured scientific domains, the thesis develops MatVQA, a benchmark for visual reasoning in materials science. By introducing the automated MArxivAgent pipeline, this work constructs rigorous reasoning tasks on the Structure-Property-Performance (SPP) relationship, requiring models to integrate fine-grained visual perception with domain-specific scientific logic. Finally, addressing active reasoning under information scarcity, the thesis presents TurtleSoup-Bench and the Mosaic-Agent framework. This research explores imaginative reasoning, evaluating the ability of agents to actively formulate hypotheses, select informative queries, and reconstruct hidden narratives in lateral thinking puzzles. Collectively, these contributions demonstrate that developing reliable AI assistants for scientific discovery and engineering design requires moving beyond surface-level multimodal alignment to equip foundation models with structure-aware perception and active exploration capabilities.

Thesis defense - Mouna Dhaouadi

Fri, 13 Feb 2026 10:00:00 -0500

Hello everyone,

You are cordially invited to the thesis defense of Mouna Dhaouadi on February 13 at 10 am EST (hybrid mode).

Title: Empowering Rationale-aware Commit-based Development

Date: February 13 at 10 am

Room: Pavillon André-Aisenstadt, 2920 Ch. de la Tour, room AA-3195

Jury

Chair / Dean's representative	Houari Sahraoui
Research supervisor	Michalis Famelis
Co-supervisor	Jian-Yun Nie
Jury member	Bang Liu
External examiner	Martin Robillard

Abstract:

Software developers often have to make many decisions. The rationale behind these decisions is useful and valuable information. In the past, researchers have tried to extract and exploit it automatically. However, earlier techniques apply only to specific contexts, and progress toward a complete automated system for extracting and managing development rationale has been insufficient. In this research project, we propose to build an automated system for extracting, structuring, and managing the development rationale found in commit messages. Such a system would help ensure the consistency of the development process.

Thesis defense - Mélisande Teng

Thu, 12 Feb 2026 14:30:00 -0500

Hello everyone,

You are cordially invited to the thesis defense of Mélisande Teng on February 12 at 2:30 pm EST (hybrid mode).

Title: Applications of Machine Learning for Biodiversity Monitoring

Date: February 12 at 2:30 pm EST

Room: Coworking Space (6650 St-Urbain, first floor)

Link: https://umontreal.zoom.us/j/81197854421?pwd=BAmPSvnvw7m8KO7cFJBzWZXp6BUgij.1

Jury

Chair	Dhanya Sridhar
Research supervisor	Yoshua Bengio
Co-supervisors	Hugo Larochelle & David Rolnick
Member	Margaret Kalacska
External examiner	Rebecca Hutchinson (TBD)

Abstract:

The interdependent crises of climate change and biodiversity loss call for urgent action to mitigate their impacts on human society and the natural environment. Meeting these challenges requires a deeper understanding of the current state of biodiversity. Recent advances in machine learning and sensor technologies, such as remote sensing and audio recorders, offer unprecedented opportunities for large-scale observation and monitoring of the Earth and its biodiversity.

In this thesis, we present methods that aim to fill knowledge and data gaps on biodiversity and to enable large-scale biodiversity monitoring to guide decision-making in ecological conservation.

In particular, we focus on applications in species distribution modeling, forest monitoring, and bioacoustics, guided by the following questions: "What opportunities do machine learning and deep learning offer for biodiversity monitoring?" and "How can knowledge from the application domain be integrated into models and guide their design?"

In our first contribution, we present SatBird [1], a dataset and benchmark for modeling bird species distributions from remote sensing and environmental data, using labels derived from citizen science data. We highlight the potential of deep learning and satellite imagery for mapping species distributions at scale.

In our second contribution, we develop CISO [2], a new deep learning approach to species distribution modeling conditioned on incomplete species observations. The model takes as input environmental data and any number of species observations, conditioning its predictions on the known presences or absences of other species, and thereby adapts to the frequent variability and incompleteness of the biotic data available in practice. We show that CISO significantly improves predictions over other methods.

In this thesis, we also address the task of instance segmentation of tree crowns in high-resolution drone imagery for forest carbon estimation [3]. We compare several models and propose a new method for this task that builds on the Segment Anything model and elevation data derived from drone imagery. We apply it to plantation, temperate forest, and tropical forest settings. This work shows the potential of adapting foundation models and integrating elevation data for this task, and highlights the challenges posed by different forest types.

Finally, we present an unsupervised method for automatically annotating syllables in birdsong recordings [4], to relieve ecologists of tedious, costly, and laborious manual work. Studying birdsong at the syllable level is useful for many applications in bioacoustics, such as individual bird identification and the study of animal communication, which positions our method as a promising tool for accelerating bioacoustics research.

[1] https://proceedings.neurips.cc/paper_files/paper/2023/hash/ef7653bbc4655305efb89a32362e332a-Abstract-Datasets_and_Benchmarks.html

[2] https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210x.70238

[3] https://openreview.net/forum?id=1vSLxdJNq8

[4] https://arxiv.org/abs/2509.18412

Career and internship day in computer science and computational finance

Wed, 28 Jan 2026 15:00:00 -0500

A career and internship day in computer science and computational finance will be held on Wednesday, January 28, from 3 pm to 6 pm, in the main entrance of Pavillon André-Aisenstadt.

For more information, visit this website.

Predoc III presentation by Cristian Dragos Manta

Wed, 17 Dec 2025 10:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Cristian Dragos Manta on December 17 at 10 am (hybrid mode).

Title: Towards More Flexible Causal Inference Algorithms for Real-World Datasets

Date: Wednesday, December 17, at 10 am

Location: Auditorium 1, MILA, 2nd floor

Jury

Chair	Simon Lacoste-Julien
Co-supervisor	Dhanya Sridhar
Co-supervisor	Yoshua Bengio
Member	Gauthier Gidel

Abstract

From using the law of gravity to predict the orbits of planets, to understanding how human cells react to new chemicals, predicting the impacts of new policies on the economy, or increasing the fairness of loan offers, the field of causality finds applications in a wide range of settings by providing a rich mathematical framework for performing probabilistic inference over the effects of perturbations on a system. Unfortunately, despite the promises of this field, applying causal discovery algorithms to real-world problems remains challenging. Why is that so? Using scientific discovery as a motivating example, we propose a possible approach to reduce the gap with real-world settings for predicting the effects of unseen interventions. Our method is grounded in using meta-learning for amortized causal discovery over different environments, while using a more flexible architecture for end-to-end causal effect prediction. This approach reflects our broader research vision, which consists of relaxing as many structural assumptions as possible from causal discovery methods and leveraging simulated data to improve sample efficiency, while maintaining the core causal inductive biases. In doing so, our goal is to develop flexible methods that can be extended to challenging settings where standard causal discovery assumptions are violated and where data might be scarce.

Predoc III presentation by Hafez Ghaemi

Tue, 16 Dec 2025 14:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Hafez Ghaemi on December 16 at 2 pm (hybrid mode).

Title: Predictive World Modeling from an Egocentric Perspective

Date: Tuesday, December 16, at 2 pm

Location: Auditorium 1, MILA, 2nd floor

Jury

Chair	Aaron Courville
Supervisor	Eilif Muller
Co-supervisor	Shahab Bakhtiari
Member	Liam Paull

Abstract

Embodied artificial intelligence requires agents capable of perceiving, reasoning, and acting within complex, stochastic environments. While generative world models have demonstrated the ability to simulate environmental dynamics, they are often computationally expensive and predisposed to modeling task-irrelevant noise. In this report, we advocate for latent predictive world models as a scalable alternative paradigm for embodied agents. Adopting an egocentric agent perspective, we structure our research around three fundamental pillars: (1) self-supervised world modeling from sequential interaction, (2) eliminating heuristics for view construction in self-supervised learning (SSL) via active foveation, and (3) endowing world models with a “theory of mind” for multi-agent environments.

First, we address the challenge of learning robust visual representations from sequential interaction. We introduce seq-JEPA, a self-supervised world model that processes sequences of action-observation pairs. Through architectural inductive biases, we demonstrate that seq-JEPA resolves the trade-off between representational invariance and equivariance in joint-embedding SSL. Furthermore, our model excels at tasks that inherently require aggregating sequential observations, such as path integration across action trajectories.

Second, we propose to transition self-supervised visual pretraining from passive observation to active foveation. Current methods rely on hand-crafted augmentations or random masking as static heuristics to construct “views”. We propose Energy-Guided Masking (EGM), a mechanism for actively sampling target views in SSL from input regions with higher uncertainty. This approach allows view construction to be driven directly by the SSL objective rather than external heuristics.

Finally, we extend the scope of predictive world modeling to multi-agent, social contexts. A single-agent world model fails to predict the behavior of other agents whose internal states and actions are inaccessible to the agent. To model the behavior of others, we adopt an approach that captures prediction uncertainty in joint-embedding SSL via a latent variable learned through a variational objective. By leveraging this latent variable to model observed behavior, our framework enables the agent to predict the behavioral trajectories of other agents, facilitating planning in multi-agent environments. Furthermore, we propose a mechanism for scaling this approach as new agents are introduced to the environment.

Predoc III presentation by Yipeng Zhang

Mon, 15 Dec 2025 10:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Yipeng Zhang on December 15 at 10 am (hybrid mode).

Title: Self-Supervised Representation Learning and World Modeling on Naturally-Structured Data

Date: Monday, December 15, at 10 am

Location: Auditorium 1, MILA, 2nd floor

Jury

Chair	Aaron Courville
Supervisor	Laurent Charlin
Member	Pascal Vincent

Abstract

Joint-embedding self-supervised learning (SSL) has become a central paradigm for learning visual representations without labels, typically by learning the relationship between semantically related data pairs. When these pairs are temporally related, the learned predictor naturally functions as a world model that anticipates future states. However, standard SSL objectives implicitly assume a fixed dataset and simple pairwise relationships. These assumptions often break down in practical scenarios involving naturally-structured data, where real-world dynamics introduce non-stationary data distributions and structured dependencies between data pairs. This report studies these challenges.

First, in a continual learning setting where the data distribution changes over time, we show that many existing methods fail to consolidate representations across time and struggle to adequately learn new data. We demonstrate that explicitly optimizing a consolidation objective, together with a separate embedding space to learn new data, improves performance. We also show that continual SSL models can benefit from naturally ordered data sequences.

Second, we study the predictive uncertainty in SSL, where each datum may correspond to multiple valid targets. This arises when data pairs come from naturally occurring generative processes, such as successive video frames. We show that standard SSL objectives cannot learn this conditional uncertainty and propose AdaSSL, which adapts to different pairwise relationships and produces richer representations and more accurate latent world models.

Together, these contributions point toward self-supervised world models that learn directly from naturally evolving data streams encountered in realistic scenarios.

Predoc III presentation by Sophia Gunluk

Fri, 12 Dec 2025 15:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Sophia Gunluk on December 12 at 3 pm (hybrid mode).

Title: Causal Modeling for Real World Distribution Shifts

Date: Friday, December 12, at 3 pm

Location: Auditorium 1, MILA, 2nd floor

Jury

Chair	Gauthier Gidel
Supervisor	Dhanya Sridhar
Member	Matt Kusner

Abstract

Machine learning systems are increasingly used in decision-making settings where predictions influence, and are influenced by, human behavior and observational constraints. Historical data often reflect structural inequities and policy choices, and may contain spurious or unstable associations that distort the relationships a model attempts to learn. Once deployed, these systems can shape the behavior of individuals and institutions, altering the distributions on which future decisions are made. Understanding these dynamics requires tools that connect causal structure, distribution shifts, and the effects of strategic and policy-driven feedback.

First, causal modeling provides a way to distinguish stable mechanisms from spurious regularities, clarifying which aspects of the data-generating process remain invariant across environments. This perspective explains when classifiers degrade under strategic adaptation, how feedback loops arise as individuals respond to deployed classifiers, and why relying on non-causal features leads to arbitrarily bad post-adaptation risk. In strategic classification, this framing highlights when adaptation results in genuine improvements versus gaming and how these responses reshape post-adaptation distributions and long-term performance.

In addition, institutional policies determine which outcomes become observable, introducing selection bias that complicates identification, evaluation, and the design of fair decision rules. By modeling selection as part of the underlying causal process, it becomes possible to analyze how historical decisions limit what can be inferred from data and how alternative policies induce different observable populations. Together, these perspectives form a foundation for understanding the long-term dynamics of classifiers and populations, and for developing decision rules that remain robust to adaptation and policy-dependent shifts while promoting fairness and social welfare.

Predoc III presentation by Maryam Hashemzadeh

Fri, 12 Dec 2025 14:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Maryam Hashemzadeh on December 12 at 2 pm (remote).

Title: Algorithm and Architecture Design Towards Modularity in LLMs

Date: Friday, December 12, at 2 pm

Location: Remote, Zoom link

Jury

Chair	Aaron Courville
Supervisor	Sarath Chandar
Co-supervisor	Marc-Alexandre Côté
Member	Glen Berseth

Abstract

As Large Language Models (LLMs) continue to scale, their deployment in real-world scenarios faces critical bottlenecks regarding computational efficiency, adaptability, and safety. This thesis proposal argues that modularity, both in knowledge representation and model architecture, offers a robust pathway to address these challenges. We investigate this paradigm through two distinct but complementary frameworks.

First, we address Knowledge Modularity by introducing Sub-goal Distillation, a method to transfer the reasoning capabilities of massive LLMs into smaller, resource-constrained agents. By decomposing complex long-horizon tasks into hierarchical sub-goals, we train a 770M-parameter agent that decouples high-level planning from low-level execution. Tested in the ScienceWorld environment, this approach outperforms standard imitation learning by 16.7% while eliminating the need for real-time LLM inference.

Second, we address Architectural Modularity through SafeMoE, a Mixture-of-Experts framework designed to resolve the tension between safety and informativeness. Unlike traditional alignment methods that often lead to blanket refusals, SafeMoE leverages experts explicitly trained on unsafe domain data, controlled by a safety-aware router. This allows the model to navigate sensitive topics with nuance, achieving over 20% relative improvement in safe response rates compared to baselines while simultaneously enhancing informativeness.

Finally, we outline future research directions, including the integration of generative world-model verifiers for robust planning and the use of expert divergence regularization to prevent representation collapse in modular architectures.

Thesis defense - Dorsaf Sallami

Tue, 09 Dec 2025 10:00:00 -0500

Dear all,

We are happy to invite you to Dorsaf Sallami's PhD defense on December 9th at 10 am.

The defense will be held in English.

Title: Toward Socially Responsible Artificial Intelligence Approaches for Fake News Detection

Date: December 9th, at 10 am.

Room: Pavillon André-Aisenstadt, room 3195

Jury

Chair	Michalis Famelis
Research supervisor	Esma Aïmeur
Research co-supervisor	Gilles Brassard
Member	Claude Frasson
External examiner	Reihaneh Rabbany, Université McGill

Abstract:

Once celebrated as a common good that transformed knowledge sharing, the Internet—and social media platforms in particular—has connected communities across borders and exponentially expanded the scale of information exchange. Yet, these infrastructures have also become fertile ground for disinformation. Driven by algorithms that prioritize virality over veracity, fake news often spreads faster than established facts.

Artificial Intelligence (AI) thus appears as a promising tool. Machine learning models enable detection at scales unreachable by human fact-checkers, filtering and identifying problematic content in real time. However, the reality is more complex. Unlike other applications of machine learning designed to optimize efficiency in low-stakes contexts, fake news detection lies at the heart of democratic debate, public trust, and epistemic integrity. This dual observation calls for greater equity, transparency, and robustness. A high-performing model alone is insufficient, especially since fake news can originate early in the information cycle and persist long after initial detection, necessitating coordinated interventions before, during, and after its dissemination.

On the theoretical level, I propose two complementary directions. First, I argue for reframing fake news detection within the framework of Socially Responsible AI (SRAI). Rather than focusing solely on accuracy, I advocate for explicit alignment with broader societal values: equity, transparency, and robustness. Second, I contend that fake news cannot be reduced to a mere detection problem; instead, it unfolds across a chain of events that begins well before any intervention and extends beyond it. While few studies consider the full spectrum of actions that can deter or prevent the creation and dissemination of false content, I address this gap by proposing an interdisciplinary taxonomy of interventions designed to deter, prevent, and mitigate fake news.

On the practical level, my contributions unfold in four parts. First, I conduct, to the best of my knowledge, the first study on gender bias in fake news detection and introduce a classifier-adversary integration scheme to reduce inter-group disparities while maintaining competitive performance. Second, to generalize across heterogeneous domains, I propose CoALFake, a cross-domain approach that combines domain-aware active learning with human–Large Language Model (LLM) co-annotation. LLMs perform large-scale preliminary labelling, while humans in the loop arbitrate ambiguous cases and correct errors. CoALFake achieves significant improvements in cross-domain robustness while reducing annotation costs. Third, I enhance explainability and user trust through the Multi-level, Model-Agnostic Post-hoc Explanations (MAPE) system, which provides a multi-layered structure allowing users to adjust the level of detail to their expertise and decision-making needs. Fourth, I present Aletheia, a browser extension combining Retrieval-Augmented Generation (RAG) and LLMs to detect fake news and deliver evidence-based explanations directly within the browsing environment. In addition to real-time detection, Aletheia integrates two interactive features. The first promotes dialogue and collaborative content evaluation through a discussion space, while the second highlights the most recent fact-checks.

Predoc III presentation by Cedric Martens

Fri, 05 Dec 2025 10:00:00 -0500

Dear all,

We are happy to invite you to the Predoc III evaluation of Cedric Martens on December 5th at 10 am (hybrid mode).

Title: Shape Analysis of Imperfect Geometry

Date: December 5th, 10 am

Location: AA3195 and Zoom (link below)

Link: https://umontreal.zoom.us/j/4750838017?pwd=RkZiVGFCK0pySCsxcVFzcDFjNE9hQT09

Jury

Chair	Noam Aigerman
Supervisor	Mikhail Bessmeltsev
Member	Pierre Poulin

Abstract

Modern geometric data comes from different sources: 3D scans, user-generated meshes, sketch-based interfaces, Computer-Aided-Design (CAD) software, and more recently, data-driven generative models. These sources produce shapes that people want to render, edit, analyze, and use in simulations. Unfortunately, the geometric data produced by these pipelines is rarely clean. These models may have defects such as inconsistent face orientation, self-intersections, or non-manifold features, making them difficult to process. These imperfections threaten the mathematical underpinning of many algorithms in rendering and geometry processing. When clean geometry assumptions are violated, inside-outside tests are inconsistent, shading computations are riddled with artifacts, surface parametrizations become difficult, and simulations are unstable.

To address these challenges, I propose a series of projects that make existing algorithms robust to geometric input or introduce new ones designed to deal with imperfect geometry. First, I introduce a boundary formulation for computing Generalized Winding Numbers (GWN) that extends its theoretical understanding. Second, I present a high-performance GPU-friendly algorithm for computing GWNs. Finally, I aim to recover geometric information from sketches via differentiable occluding contours, in a setting where sketch strokes can be messy and ambiguous.
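As a point of reference for the winding numbers mentioned above, here is a minimal 2D sketch. It is not the boundary formulation or the GPU algorithm from the proposal; it only assumes the standard definition of the winding number of a polyline as the sum of signed angles subtended at the query point, divided by 2π. The function name `winding_number` and the test curve are illustrative choices.

```python
import math

def winding_number(point, curve, closed=True):
    """Winding number of a 2D polyline around `point`.

    Sums the signed angle subtended at `point` by each segment and
    divides by 2*pi. For a clean closed counter-clockwise curve this is
    1 inside and 0 outside; for an open (imperfect) curve it takes
    fractional values, which is what makes it usable on broken input.
    """
    px, py = point
    n = len(curve)
    # An open polyline has no segment joining the last vertex to the first.
    segment_count = n if closed else n - 1
    total = 0.0
    for i in range(segment_count):
        ax, ay = curve[i][0] - px, curve[i][1] - py
        bx, by = curve[(i + 1) % n][0] - px, curve[(i + 1) % n][1] - py
        # atan2(cross, dot) is the signed angle from vector a to vector b.
        total += math.atan2(ax * by - ay * bx, ax * bx + ay * by)
    return total / (2.0 * math.pi)

# Hypothetical test shape: a counter-clockwise unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
```

With the closed square, the centre evaluates to 1 and a far-away point to 0. Dropping the closing edge (`closed=False`) gives 0.75 at the centre: the value degrades gradually instead of flipping, which is the property that makes inside-outside tests tolerant of gaps.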

Predoc III presentation by Tianyue H. Zhang

Mon, 17 Nov 2025 10:00:00 -0500

Dear all,

We are happy to invite you to the Predoc III evaluation of Tianyue H. Zhang on November 17th at 10 am (hybrid mode).

Title: Exploring Overlooked Characteristics in Machine Learning Optimization

Date: November 17th, 10 am

Location: Auditorium 1

Link: https://umontreal.zoom.us/j/84339939626?pwd=VJZdTBRyZ3XotWhTbIwo7Lu6zYYCTv.1

Jury

Chair	Ioannis Mitliagkas
Supervisor	Simon Lacoste-Julien
Member	Aristide Baratin

Abstract

Adam often outperforms stochastic gradient descent in training modern deep networks, yet the reasons are not fully understood. We examine characteristics of optimizers that are frequently overlooked in existing work, focusing on how the objective function and data structure influence training dynamics. This perspective allows us to understand better when and why adaptive optimizers like Adam provide an advantage in practice.

First, we investigate Adam's sensitivity to rotations of the parameter space. We show that Adam is sensitive to the choice of parameter basis: random rotations harm performance, and specific structured rotations preserve or enhance it. This demonstrates that rotation-invariant theoretical assumptions are insufficient to explain Adam's empirical advantages. We then examine the rotation-dependent assumptions in the literature and find that they fall short in describing Adam's behaviour across various rotation types. In contrast, we verify the orthogonality of the update as a promising indicator of Adam's basis sensitivity, suggesting it may be the key quantity for developing rotation-dependent theoretical frameworks.

In addition, previous work suggests that the performance gap between Adam and SGD is related to dataset class imbalance, with Adam showing an advantage when certain tokens appear far more frequently than others. However, this advantage is mainly observed in the large batch setting, and the gap nearly disappears with small batch training. We aim to understand whether the additional stochasticity counteracts this phenomenon. We start by studying a linear bigram next-token prediction model trained on data following a power law, and by deriving the asymptotic behaviour of SGD in this setting.

PhD defense - Edward Hu

Mon, 17 Nov 2025 10:00:00 -0500

Dear all,

We are happy to invite you to Edward Hu's PhD defense on November 17th at 10 am (remote).

Title: Building a Reasoning Machine

Date: November 17th, at 10 am

Location: Remote

Link: https://umontreal.zoom.us/j/81792703669?pwd=Mbobppk3qJdhk7h1CKnEtJOCzN2H8A.1

Jury

President	Dhanya Sridhar
Director	Yoshua Bengio
Member	Aaron Courville
External examiner	Jason Eisner (TBD)

Abstract:

Deep learning with maximum-likelihood estimation excels at directly modeling data at scale. However, it suffers from poor generalization and an inability to reason beyond a few logical steps. Many have hypothesized that the solution lies in replicating the human ability to perform slow and guided reasoning over a low-dimensional, structured concept space.

I built on the success of deep learning, where more compute empirically translates to better downstream performance, and proposed algorithms that convert traditionally intractable symbolic learning problems into deep learning ones that can be solved with more compute and better optimization. This is done in two main directions: (1) training a neural approximator of an intractable posterior over compositional objects using generative flow networks; and (2) losslessly representing compositional objects as differentiable vectors amenable to gradient optimization. The objects considered include sequences, trees, and directed graphs with applications in latent variable modeling, chain-of-thought reasoning, tree manipulation, and causal discovery. I further studied the improved generalization of this approach across experiments on grammar induction, tree manipulation, and natural language reasoning.

Finally, I presented future research directions on incorporating inductive biases into the compositional objects and efficiency improvements.

*Keywords* reasoning, generative flow networks, latent variable models, amortized inference, variational inference, compositionality, generalization, causal discovery.

Predoc III presentation by Johan Samir Obando Ceron

Thu, 13 Nov 2025 09:00:00 -0500

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Johan Samir Obando Ceron on November 13 at 9 am (remote).

Title: Maximizing Learning, Minimizing Waste: The Art of Efficient Deep RL

Date: Thursday, November 13, at 9 am

Location: Zoom

Jury

President	Glen Berseth
Director	Aaron Courville
Co-director	Pablo Samuel Castro
Member	Sarath Chandar

Abstract

Deep reinforcement learning (RL) has achieved remarkable success but remains unstable and inefficient at scale. This thesis will investigate how modular architectures, sparsity, and stable optimization jointly enable scalable and reliable deep RL. First, "Mixtures of Experts Unlock Parameter Scaling for Deep RL" demonstrates that adding Soft MoE modules to value networks yields consistent performance gains as parameter counts increase, revealing effective parameter scaling when capacity is organized modularly.

Next, in "In Value-Based Deep Reinforcement Learning, a Pruned Network is a Good Network", we show that gradual magnitude pruning can uncover sparse, high-performing subnetworks that outperform dense baselines while using only a fraction of the parameters, with benefits that grow with model size.
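As a rough illustration of gradual magnitude pruning, the sketch below ramps a target sparsity up over training and zeroes the smallest-magnitude weights at each checkpoint. The cubic ramp, the function names, and the toy weight vector are assumptions made for illustration; the schedule and hyperparameters used in the cited work are not given in this announcement.

```python
def sparsity_at(step, start, end, final_sparsity):
    """Cubic ramp from 0 to final_sparsity between steps `start` and `end`."""
    if step <= start:
        return 0.0
    if step >= end:
        return final_sparsity
    progress = (step - start) / (end - start)
    return final_sparsity * (1.0 - (1.0 - progress) ** 3)

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned

# Hypothetical flat weight vector standing in for one layer of a value network.
weights = [0.9, -0.05, 0.4, -0.7, 0.01, 0.3, -0.2, 0.6, 0.08, -1.1]

for step in (0, 50, 100):
    s = sparsity_at(step, start=0, end=100, final_sparsity=0.8)
    w = magnitude_prune(weights, s)
    print(step, round(s, 3), sum(1 for v in w if v == 0.0))
```

In actual training the pruning mask would be applied between gradient updates, so that the surviving weights keep adapting as sparsity increases; the loop here only shows how the mask tightens over time.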

Finally, "Stable Gradients for Stable Learning at Scale in Deep Reinforcement Learning" identifies gradient pathologies arising from interactions with RL non-stationarity as a key cause of scaling failures, and introduces simple architectural and optimization interventions that preserve gradient flow, enabling robust training of large models. Together, these works establish a unified view of how architectural scaling, gradient stability, and sparsity interact, offering a path toward reliable, large-scale deep reinforcement learning.

ADDIROUM 2025 Conference - Scalability, Robustness, and Continual Learning in Foundation Models: Recent Advances from the Autonomous AI Lab - Irina Rish

Wed, 05 Nov 2025 19:30:00 -0500

Over the past few years, our lab has focused on understanding and improving large pre-trained (foundation) models along several axes: their scalability (in terms of size, data, and architecture), their robustness (to distribution shifts, adversarial perturbations, or input perturbations), and how they can be adapted or extended continually rather than retrained from scratch.

Key contributions include benchmarks for time-series forecasting using textual/contextual signals; efficient inference and scaling laws for specialized architectures (e.g., ternary language models and mixtures of experts); studies on preventing representation collapse in the intermediate layers of transformers to improve reasoning; research on domain shifts and out-of-distribution generalization; and methods for continual pre-training, compression, and resource-efficient adaptation. In this talk, I will give an overview of our recent work on these topics, with the aim of building AI models that are more efficient, more resilient, and more adaptable over time and across tasks and domains.

Evening schedule:

17:00: Welcome
18:00: General assembly, room S1-139, Pavillon Jean-Coutu (alumni only)
18:30: Buffet, Agora of Pavillon Jean-Coutu (registration required)
Start of the lecture, which will be given in English, room S1-151, Pavillon Jean-Coutu (general admission)

Registration:

Registration is required for the buffet and can be completed directly on the ADDIROUM website (regular: $30, student: $10).
The lecture is free, with general admission.

Location: Université de Montréal - Pavillon Jean-Coutu, room S1-151

PhD defense - Hattie Zhou

Thu, 16 Oct 2025 09:00:00 -0400

Dear all,

We are happy to invite you to Hattie Zhou's PhD defense on October 16th at 9 am (hybrid mode).

Title: Toward Neural Networks that Generalize Systematically

Date: October 16th, at 9 am

Location: A05 (Mila 6666, second floor)

Link: https://umontreal.zoom.us/j/84719965626?pwd=SFNiWVFqL2haNnVtTTFJSE9TUlJkZz09

Jury

President	Aishwarya Agrawal
Director	Hugo Larochelle
Member	Sarath Chandar
External examiner	Boaz Barak (TBD)

Abstract:

A defining characteristic of human intelligence is our ability to generalize systematically, i.e., to generalize to test examples that are structurally different from those seen in training. This often requires the ability to combine known components in novel ways for compositional tasks, or to learn the correct underlying problem-solving strategy or algorithm for reasoning tasks. However, systematic generalization remains a challenge for deep learning systems, which are powerful in capturing statistical regularities in the dataset, but fall short when these patterns do not capture the true data-generating structures.

In this thesis, we aim to understand the factors that affect systematic generalization in neural networks. We begin by introducing the forget-and-relearn paradigm, which unifies a number of iterative training algorithms proposed in the literature. In this process, the forgetting operation selectively removes undesirable information from the model, and the relearning stage reinforces features that are consistently useful under different conditions. We show that this method of training can significantly improve generalization on vision tasks in the low-data setting, and improve the compositionality of emergent languages in the Lewis communication game. Next, we study the ability of Transformer-based language models to learn and execute an algorithm via in-context learning. We introduce "algorithmic prompting", a prompting strategy that unlocks symbolic reasoning capabilities in large language models (LLMs) on arithmetic tasks, and demonstrate the first instance of strong length generalization on tasks like addition and parity using general-purpose Transformer architectures. We then aim to characterize the tasks for which Transformers trained from scratch can exhibit strong length generalization. We hypothesize that solutions which are simple to represent are also more likely to be learned, and that the number of RASP-L operations of a solution can be used as a measure of Transformer complexity. We show empirically that tasks with simple algorithmic solutions per RASP-L are more likely to exhibit strong length generalization. Finally, we discuss remedies for learning tasks or algorithms which are unnatural for a Transformer.

PhD defense - Maude Lizaire

Fri, 19 Sep 2025 13:30:00 -0400

Hello everyone,

You are cordially invited to the PhD defense of Maude Lizaire on September 19 at 1:30 pm (hybrid mode).

Title: Connecting Neural Networks, Automata Theory and Tensor Network Methods for Sequence Data Learning

Date: Friday, September 19, 2025, at 1:30 pm

Location: Agora, MILA, 6650 rue Saint-Urbain

Jury

President	Ioannis Mitliagkas
Director	Guillaume Rabusseau
Jury member	Pierre-Luc Bacon
External examiner	Tai-Danae Bradley

Abstract:

Despite the remarkable progress of deep learning, neural networks remain mostly “black-box” models that are challenging to scale sustainably. To address these limitations, this thesis introduces a cross-disciplinary approach to sequence modeling that connects neural networks with automata theory and tensor network methods. Automata theory contributes formal guarantees and tractability, while tensor networks offer efficient representations to mitigate the curse of dimensionality. The starting point of this thesis is the recently uncovered connection between second-order recurrent neural networks (2RNNs, RNNs that have multiplicative interactions between input and previous hidden states), weighted finite automata (WFAs) and matrix product states (MPS), a tensor network architecture also known as tensor train.
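To make the WFA side of this connection concrete, here is a minimal sketch of how a weighted finite automaton assigns a value to a word: an initial vector is multiplied by one transition matrix per symbol and then contracted with a final vector. This chain of matrix products has the same form as evaluating one entry of a matrix product state. The two-state automaton below, which counts occurrences of the symbol `a`, is an illustrative assumption and does not come from the thesis.

```python
def wfa_value(word, alpha, transitions, omega):
    """Compute f(word) = alpha^T A_{w1} ... A_{wk} omega for a WFA."""
    state = list(alpha)
    for symbol in word:
        A = transitions[symbol]
        # Row vector times matrix: new_state[j] = sum_i state[i] * A[i][j]
        state = [sum(state[i] * A[i][j] for i in range(len(state)))
                 for j in range(len(A[0]))]
    return sum(s * o for s, o in zip(state, omega))

# Hypothetical 2-state WFA over the alphabet {a, b} that counts the letter "a".
alpha = [1.0, 0.0]                  # initial weights
omega = [0.0, 1.0]                  # final weights
transitions = {
    "a": [[1.0, 1.0], [0.0, 1.0]],  # reading "a" adds one to the running count
    "b": [[1.0, 0.0], [0.0, 1.0]],  # reading "b" leaves the state unchanged
}

print(wfa_value("aab", alpha, transitions, omega))
```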

Our first contribution introduces the spectral initialization for 2RNNs, which consists of setting the weights of the model with the solution of the spectral algorithm for WFAs. This initialization leverages data before gradient-based training, leading to faster convergence and improved performance over random initialization methods, even with less data or smaller models.

The second contribution characterizes the effect of second-order interactions on the expressive power of recurrent neural networks. While introducing multilinear dependencies between input and hidden state strictly increases the capacity of RNNs, it comes at the cost of a large third-order tensor. An approach to mitigate this issue is to parameterize the second-order interactions using a CP decomposition. This model, which we refer to as a CPRNN, is characterized by a rank and we use this additional hyperparameter to formally compare the expressivity of recurrent architectures with varying degrees of multiplicative interactions. The rank proves to be an effective gauge of the bias–variance trade-off, and we corroborate these theoretical findings empirically with language modeling experiments.
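The parameter saving from a CP parameterization can be sketched as follows. The code computes the bilinear (second-order) term of a recurrent update once with a full third-order tensor and once directly from rank-limited CP factors, and checks that the two agree. The sizes, the factor names `U`, `V`, `W`, and the omission of biases, first-order terms, and the activation function are simplifying assumptions; the exact CPRNN parameterization is not given in this abstract.

```python
import random

def bilinear_full(T, h, x):
    """out[k] = sum_{i,j} T[i][j][k] * h[i] * x[j] (full third-order tensor)."""
    n_out = len(T[0][0])
    return [sum(T[i][j][k] * h[i] * x[j]
                for i in range(len(h)) for j in range(len(x)))
            for k in range(n_out)]

def bilinear_cp(U, V, W, h, x):
    """Same map when T[i][j][k] = sum_r U[i][r] * V[j][r] * W[k][r]."""
    rank = len(U[0])
    hu = [sum(U[i][r] * h[i] for i in range(len(h))) for r in range(rank)]
    xv = [sum(V[j][r] * x[j] for j in range(len(x))) for r in range(rank)]
    return [sum(W[k][r] * hu[r] * xv[r] for r in range(rank))
            for k in range(len(W))]

def rand_matrix(rows, cols):
    return [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

random.seed(0)
n, d, rank = 4, 3, 2  # hidden size, input size, CP rank (illustrative values)
U, V, W = rand_matrix(n, rank), rand_matrix(d, rank), rand_matrix(n, rank)

# Materialize the full tensor only to check that the two computations agree.
T = [[[sum(U[i][r] * V[j][r] * W[k][r] for r in range(rank))
       for k in range(n)] for j in range(d)] for i in range(n)]

h = [random.gauss(0, 1) for _ in range(n)]
x = [random.gauss(0, 1) for _ in range(d)]
full, cp = bilinear_full(T, h, x), bilinear_cp(U, V, W, h, x)

max_diff = max(abs(a - b) for a, b in zip(full, cp))
full_params, cp_params = n * d * n, rank * (2 * n + d)
print(max_diff, full_params, cp_params)
```

The full tensor needs `n * d * n` parameters while the CP form needs `rank * (2 * n + d)`, so the rank acts as a dial between a compact model and the unrestricted second-order interaction.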

The third contribution studies the role of depth in linear RNNs. Unlike feedforward networks, where depth without nonlinearities collapses to a linear map, for linear RNNs there is a subtle interplay between depth and recurrence, which we formally examine in this work. We show that deeper models are strictly more expressive due to an increased memory capacity. We extend our analysis to 2RNNs and show that, unlike linear RNNs, their computations are polynomial, with degree growing with depth. Empirically, we validate our theory on real and synthetic tasks using RNNs, 2RNNs, and state-space models.

PhD defense - Pierre-André Brousseau

Thu, 18 Sep 2025 14:00:00 -0400

Hello everyone,

You are cordially invited to the PhD defense of Pierre-André Brousseau on Thursday, September 18, at 2 pm.

Title: Stereoscopic Depth Estimation with Permutation

Date: Thursday, September 18, 2025, at 2 pm

Location: Pavillon André-Aisenstadt, room 3195, 2920 Ch. de la Tour

Jury

President	Liam Paull
Director	Sébastien Roy
Jury member	Aaron Courville
External examiner	Marc-Antoine Drouin, National Research Council Canada

Abstract:

This thesis is in the field of depth estimation in computer vision. Its specific interests are stereoscopic depth estimation, deep stereo matching, traditional stereo matching and monocular depth estimation. Its main contribution is a permutation formulation for stereo matching which is well suited for self-supervised stereo matching, where stereo neural networks are trained without ground truth disparity maps. The permutation formulation is further demonstrated to allow the training of a feature encoder that can be introduced in traditional stereo matching algorithms and readily integrated in industry-level vision systems. Finally, this thesis extends the permutation model to single-camera systems by introducing spherical rectification, a novel epipolar rectification method for generic motion. These contributions represent actionable solution paths for problems faced in the industry. Deep stereo matching has not been adopted in the industry because traditional algorithms have kept improving and remain highly generalizable, reliable and explainable regardless of benchmark performance. We expect neural networks to fill the role of depth completion and depth denoising, as they can introduce single-image information while also allowing for better feature representations. Monocular depth estimation, which cannot be solved in a traditional manner, has emerged by relying on large transformer-based models, but these algorithms remain reliable only for relative depth. This will change in the coming years as physical constraints are introduced, such as the monocular stereo presented in this thesis.

Predoc III presentation by Shruti Joshi

Tue, 16 Sep 2025 15:00:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Shruti Joshi on September 16 at 3 pm (hybrid mode).

Title: Real-world Implications of Identifiable Representation Learning

Date: Tuesday, September 16, at 3 pm

Location: Agora, MILA, 6650 rue Saint-Urbain, 1st floor

Jury

President	Simon Lacoste-Julien
Director	Dhanya Sridhar
Member	Guillaume Lajoie

Abstract

A core objective in unsupervised learning is to uncover latent structure that explains observed data. Identifiable representation learning addresses this challenge by requiring that latent representations be uniquely determined up to irrelevant indeterminacies. Despite progress since the 1930s, it remains underutilised in real-world domains. Interpretability is a natural proving ground, as both aim to recover latent variables that correspond to semantic factors or concepts. In this view, interpretability is not merely about producing explanations palatable to human readers; it is about establishing that a model’s internal representations are uniquely tied, up to benign symmetries, to the underlying semantics of its computation. This proposal develops the thesis that interpretability is fundamentally an identifiability problem, and shows how the two fields can advance each other: both seek precise, testable links between latent variables and observable phenomena, aligning with the broader scientific method of uncovering stable and reproducible explanations.

Predoc III presentation by Pascal Junior Tikeng Notsawo

Thu, 04 Sep 2025 14:00:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Pascal Junior Tikeng Notsawo on September 4 at 2 pm (hybrid mode).

Title: Toward Understanding Grokking: The Interplay of Regularization, Data Structure, and Optimization Dynamics

Date: Thursday, September 4, at 2 pm

Location: Auditorium 1, MILA

Jury

President	Gauthier Gidel
Director	Irina Rish
Co-director	Guillaume Rabusseau
Co-director	Guillaume Dumas
Member	Ioannis Mitliagkas

Abstract

The remarkable ability of deep learning models to generalize in over-parameterized regimes remains a central mystery in machine learning. A particularly challenging case of this problem is grokking, a phenomenon of delayed generalization that occurs after overfitting when optimizing artificial neural networks with gradient-based methods. This thesis presents a comprehensive study of grokking, exploring the conditions under which it occurs, its underlying mechanisms, and methods for its early prediction. We first extend the study of grokking beyond simple algorithmic tasks to the general setting of finite-dimensional algebras, revealing how the algebraic structure of a problem influences learning difficulty and the emergence of generalization. We then challenge the conventional understanding of grokking's reliance on $\ell_2$ regularization by demonstrating that the grokking step scales proportionally to $1/(\alpha \beta)$ when minimizing composite objectives of the form $f=g+\beta h$ using gradient descent with learning rate $\alpha$, where $g$ is the training error and $h$ is an arbitrary and appropriately chosen regularizer that enforces an inductive bias toward generalization (e.g., sparsity, low-rankness). We also show that the commonly used $\ell_2$-norm is not a reliable proxy for explaining grokking. Finally, to address the high computational cost of observing grokking, we propose a novel low-cost method for early prediction. We demonstrate that the spectral signature of the training loss in the early phases of optimization contains predictive signals, whose characteristics correlate with a model’s eventual ability to generalize. Together, these findings expand our understanding of grokking from a specific phenomenon to a universal dynamic in deep learning, driven by the interplay of optimization, regularization, and the intrinsic structure of the task. We also outline a concrete roadmap for future research, including plans to apply these insights to the broader context of language modeling and the evaluation of adversarial robustness.

PhD defense - Junyi Li (RALI)

Thu, 04 Sep 2025 10:00:00 -0400

Hello everyone,

You are cordially invited to the PhD defense of Junyi Li (RALI).

Title: Robust, Efficient, and Knowledge-Augmented Text Generation with Pre-trained Language Models.

Date: Thursday, September 4, at 10 am (ET)

The defense will take place online:

https://umontreal.zoom.us/j/82310045459?pwd=xb9frpHu9ZqVJDJ0I6LGaWJWGgIlKU.1

Jury

President	Philippe Langlais
Director	Jian-Yun Nie
Jury member	Bang Liu
External examiner	Eric Gaussier (Univ. Grenoble-Alpes)

Abstract:

Pre-trained Language Models (PLMs) have significantly advanced the field of text generation. However, their practical application is often hindered by challenges related to systematic capability evaluation, high computational costs for training and inference, and limitations imposed by static and outdated internal knowledge. This thesis addresses these critical challenges to make PLM-based text generation more robust, efficient, and reliable. First, we develop ElitePLM, a comprehensive evaluation framework that systematically assesses the general language abilities of various PLMs. Second, we propose PTG (Prompt Transfer for Text Generation), which leverages prompt-based transfer learning to effectively transfer knowledge from source tasks to new generation tasks with minimal parameter updates. Third, to tackle inference inefficiency, we introduce ELMER, a non-autoregressive model that integrates an early exit strategy with a novel Layer Permutation Language Modeling (LPLM) pre-training objective, significantly speeding up generation while maintaining competitive performance. Fourth, to overcome the limitations of PLMs' internal knowledge, we present UniWeb, which augments PLMs with dynamic and comprehensive knowledge retrieved from the online Web. Collectively, the methodologies and frameworks developed in this thesis contribute to a more thorough evaluation of PLMs and offer novel solutions for their efficient training, rapid inference, and enhanced factual grounding.

Predoc III presentation by Pascal Archambault

Wed, 03 Sep 2025 09:30:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Pascal Archambault on September 3 at 9:30 am. The presentation will be in English.

Title: A Data-Driven Digital Twin Framework For Controlled Environment Agriculture

Date: Wednesday, September 3, at 9:30 am

Location: Pavillon André-Aisenstadt, room 3195, 2920 Ch. de la Tour

Jury

President	Philippe Langlais
Director	Eugene Syriani
Co-director	Houari Sahraoui
Member	Benoit Baudry

Abstract

Digital twins are virtual representations of a real-world system (the physical twin). By using real-time data produced by the physical twin, simulation models are continuously adapted to capture the behavior of the twin and provide tools for decision support. In the context of controlled environment agriculture, crop data is hard to collect, which makes the calibration of simulation models difficult. Moreover, because data is scarce, it can seldom be used for data-driven methods that enable system representation or inference. The goal of this thesis is to enable digital twins of controlled environment agriculture to be actionable for prescription. We propose a data-driven digital twin framework to monitor the dynamics and structure of the crop, generate plausible data in a data-scarce scenario, and infer simulator models in lieu of hard-to-calibrate, or even non-existent, mechanistic crop models. We believe our data-driven digital twin framework has the potential to become a general framework for controlled environment agriculture.

Predoc III presentation by Emiliano Penaloza

Tue, 02 Sep 2025 13:00:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Emiliano Penaloza on September 2 at 1 pm.

Title: Enhancing Scrutability in Modern Machine Learning

Date: Tuesday, September 2, at 1 pm

Location: Auditorium 2, Mila, 6650 (2nd floor)

Jury

President	Aaron Courville
Director	Laurent Charlin
Member	Christopher Pal

Abstract

Modern machine learning systems deliver strong predictive performance but often at the cost of interpretability and user autonomy. Post-hoc explanation methods, while common, suffer from inconsistency and offer limited support for user intervention. This motivates scrutable modeling, which embeds interpretability directly into model design. This presentation highlights two recent approaches that enhance scrutability in distinct domains. First, we discuss Concept Bottleneck Models (CBMs), which constrain predictions to pass through human-interpretable concepts (e.g., “red wings,” “fever”) before making final decisions. We focus on how incorporating preference-based optimization helps address concept mislabeling, improves model performance, and strengthens the effect of user edits. Second, we discuss Textual Representations for Recommender Systems (TEARS), a framework that replaces opaque latent embeddings with editable textual profiles to increase transparency and steerability. Together, these approaches illustrate how scrutability can be systematically embedded into machine learning systems to improve transparency, reliability, and user control.
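The bottleneck idea can be illustrated with a minimal sketch in which the label predictor sees only concept probabilities, so a user edit to one concept directly changes the final prediction. The weights, the second concept name, and the two-layer logistic form are hypothetical choices for illustration and do not reflect the models or the preference-based training discussed in the presentation.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_concepts(x, concept_weights):
    """Map raw features to concept probabilities (the bottleneck)."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)))
            for row in concept_weights]

def predict_label(concepts, label_weights):
    """The final prediction sees only the concepts, never the raw features."""
    return sigmoid(sum(w * c for w, c in zip(label_weights, concepts)))

# Hypothetical toy weights: two concepts ("red wings", "long beak"), one binary label.
concept_weights = [[2.0, -1.0, 0.5], [-0.5, 1.5, 1.0]]
label_weights = [3.0, -2.0]

x = [0.2, 0.9, -0.4]
concepts = predict_concepts(x, concept_weights)
before = predict_label(concepts, label_weights)

# A user edit: overwrite the first concept with the value the user knows to be true.
edited = [1.0] + concepts[1:]
after = predict_label(edited, label_weights)
print(concepts, before, after)
```

Because the edit is applied at the bottleneck, its effect on the output is fully determined by the label weights, which is what makes such interventions inspectable.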

Predoc III presentation by Liu Zichu

Fri, 29 Aug 2025 14:00:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Liu Zichu on August 29 at 2 pm (hybrid mode).

Title: Surrogate Methods for Solving Non-Monotone Variational Inequalities in Reinforcement Learning

Date: Friday, August 29, at 2 pm

Location: Coworking space, Mila 6650 (1st floor)

Jury

President	Simon Lacoste-Julien
Director	Gauthier Gidel
Co-director	Ioannis Mitliagkas
Member	Pierre-Luc Bacon

Abstract

World modeling aims to build robust AI systems that can simulate reality, a goal that fundamentally requires generative diversity to prevent narrow, brittle worldviews. My research investigates this critical need for diversity, using text-to-image generation as a practical case study. We propose a benchmarking framework to evaluate the utility of synthetic data generated by text-to-image (T2I) models, comparing the aesthetic quality, diversity and consistency between generated contents and real data, and highlighting prompt complexity as a key factor for enhancing diversity. We also focus on bias analysis in unconditional image generation, showing that feature probability shifts between training and generation are often smaller than assumed, and that classifier-based evaluations can align closely with human perception of bias. Building on this, my future work will extend the study of diversity from generation to reasoning within multimodal models and reinforcement learning, showcasing that diversity is an important factor to build next generation models.

Predoc III presentation by Xiaofeng Zhang

Fri, 29 Aug 2025 09:30:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Xiaofeng Zhang on August 29 at 9:30 am (hybrid mode).

Title: A Diversity-Centric Perspective towards Real-World Modeling

Date: Friday, August 29, at 9:30 am

Location: Auditorium 2, Mila 6650 (2nd floor)

Jury

President	Gauthier Gidel
Director	Aaron Courville
Member	Golnoosh Farnadi

Abstract

World modeling aims to build robust AI systems that can simulate reality, a goal that fundamentally requires generative diversity to prevent narrow, brittle worldviews. My research investigates this critical need for diversity, using text-to-image generation as a practical case study. We propose a benchmarking framework to evaluate the utility of synthetic data generated by text-to-image (T2I) models, comparing the aesthetic quality, diversity and consistency between generated contents and real data, and highlighting prompt complexity as a key factor for enhancing diversity. We also focus on bias analysis in unconditional image generation, showing that feature probability shifts between training and generation are often smaller than assumed, and that classifier-based evaluations can align closely with human perception of bias. Building on this, my future work will extend the study of diversity from generation to reasoning within multimodal models and reinforcement learning, showcasing that diversity is an important factor to build next generation models.

Predoc III presentation by Ben Hudson

Thu, 28 Aug 2025 13:30:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Ben Hudson on August 28 at 1:30 pm. The presentation will be in English.

Title: User behaviour model learning for network design with decision-dependent uncertainty

Date: Thursday, August 28, at 1:30 pm

Location: Pavillon André-Aisenstadt, room 5441, 2920 Ch. de la Tour

Jury

President	Utsav Sadana
Director	Laurent Charlin
Co-director	Emma Frejinger
Member	Pierre-Luc Bacon

Abstract

We aim to optimize the efficiency of a transportation network where the optimal design depends on how travel demand distributes itself across the network. However, the travel demand itself is shaped by the network design. This introduces a complex feedback loop: decisions influence the probability distribution of outcomes, and those outcomes in turn affect the optimality of the decisions. This interaction, referred to as decision-dependent or endogenous uncertainty, significantly complicates solving the network design problem.

Solving this problem requires modelling traveller behaviour and leveraging those models to predict how travel patterns will shift in a redesigned network. The first challenge relates to the topics of route choice and traffic equilibrium modelling; the second corresponds to a stochastic network design problem with endogenous uncertainty. From the gaps in the literature, we identify three primary research questions: (i) How can we learn the full perturbation distribution in perturbed utility models and decision-focused learning, beyond the expected utility? (ii) How can we learn the link cost (disutility) function in Markovian traffic equilibrium models from observational data? (iii) How can we solve stochastic network design problems where user behaviour is both uncertain and influenced by the design itself?

To address the first question, we propose an estimation approach for perturbed utility models based on stochastic smoothing. Exploratory results suggest that our model may capture the complex relationship between traveller behaviour and network design more effectively than existing approaches. For the second question, we outline initial ideas developed in collaboration with researchers at EPFL. Finally, for the third question, we describe how predictions from these models of user behaviour might be adapted to tackle the stochastic network design problem with decision-dependent uncertainty.

The ultimate goal of this research is to equip transportation planners with tools to better model traveller behaviour and implement network improvements, thereby enabling transportation systems that are more efficient, reliable, and equitable for all users.

Predoc III presentation by Lucas Maes

Thu, 28 Aug 2025 10:00:00 -0400

Hello everyone,

You are all cordially invited to attend the Predoc III project presentation by Lucas Maes on August 28 at 10 am (hybrid mode).

Title: Towards Efficient Self-Supervised World Models

Date: Thursday, August 28, at 10 am

Location: Coworking space, Mila, 6650 (1st floor)

Jury

President	Simon Lacoste-Julien
Director	Damien Scieur
Member	Aristide Baratin

Abstract

World models have recently attracted considerable attention. However, progress remains limited by divergent motivations, heterogeneous applications, and inconsistent formalisms. Moreover, the absence of a standardized evaluation protocol further hinders iteration and meaningful comparison across approaches. In this work, we take steps toward unifying the field and addressing its long-standing challenges. First, we propose a standard evaluation protocol for world models that measures the robustness of each model component under environmental perturbations. Using this protocol, we assess the robustness of DINO-WM, a recently proposed approach. Second, we introduce Relational Representation Learning (RRL), a new paradigm for representation learning based on predicting inter-sample relationships. RRL provides a principled explanation for puzzling phenomena in self-supervised learning, like the role of projector networks. We further discuss how RRL could be applied to a major open problem in world models: hierarchical planning. Finally, we outline future directions for improving world models toward the broader goal of Autonomous Machine Intelligence (AMI).