Département d'informatique et de recherche opérationnelle

Predoc III Presentation - Marawan Gamal

Hello everyone,


You are cordially invited to the Predoc III evaluation of Marawan Gamal, on August 30 at 9:30 a.m. (hybrid format).


Title: Low-rank Methods for Efficient Model Storage, Training and Inference

Date: August 30, 2024, from 9:30 to 12:30 EST

Location: Auditorium 2, MILA + Zoom link

 

Jury

Chair and rapporteur
Liu, Bang
Research supervisor
Rabusseau, Guillaume
Regular member
Reddy, Siva

 

Abstract

Efficient deployment and fine-tuning of Large Language Models (LLMs) remain a critical challenge because of their substantial computational and memory requirements. This thesis proposal addresses these challenges with low-rank techniques for model compression, fine-tuning and inference. First, Random Orthogonal Subspace Adaptation (ROSA) is proposed to reduce memory usage during fine-tuning while maintaining model performance. Building on the LoRA framework, ROSA leverages singular value decomposition (SVD) for parameter initialization. For inference, Tensorized Joint Distribution Networks (TJDNet) are introduced as a step towards parallelizable sampling that accelerates inference. Because joint probabilities are represented with uniform matrix product states, marginalization can be carried out efficiently.
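
To illustrate the marginalization claim, here is a minimal NumPy sketch of a uniform matrix product state, in which a single core is shared across all positions. It is not the TJDNet implementation: the nonnegative parameterization, the boundary vectors `alpha` and `beta`, the sizes `V`, `R`, `N`, and the helper names `unnormalized` and `marginal` are all assumptions made for this example. Summing a variable out amounts to replacing its core slice with the sum over tokens, so a marginal costs a chain of small matrix products instead of a sum over every sequence.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
V, R, N = 3, 4, 4  # vocabulary size, bond dimension, sequence length (illustrative)

# One nonnegative core shared by every position (the "uniform" part),
# plus two boundary vectors. Nonnegativity is an assumption of this sketch;
# it keeps every contraction a valid unnormalized probability.
core = rng.random((R, V, R))
alpha = rng.random(R)
beta = rng.random(R)

def unnormalized(seq):
    """Return alpha^T A[x_1] ... A[x_N] beta for a fully specified sequence."""
    v = alpha
    for x in seq:
        v = v @ core[:, x, :]
    return v @ beta

def marginal(partial):
    """Unnormalized marginal; None marks a position that is summed out."""
    summed = core.sum(axis=1)  # sum over the token index, shape (R, R)
    v = alpha
    for x in partial:
        v = v @ (summed if x is None else core[:, x, :])
    return v @ beta

# Normalization constant: every position summed out.
Z = marginal([None] * N)

# Marginal of x_1 = 1, x_3 = 2, computed with N small matrix products...
fast = marginal([1, None, 2, None])

# ...and by brute-force enumeration of the summed-out positions.
brute = sum(
    unnormalized((1, a, 2, b))
    for a, b in itertools.product(range(V), repeat=2)
)

print(np.isclose(fast, brute), fast / Z)
```

The chain of products costs on the order of N·R² operations per marginal, whereas the brute-force sum grows exponentially with the number of summed-out positions.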