Predoc III Presentation - Marawan Gamal
Dear all,
You are cordially invited to Marawan Gamal's Predoc III evaluation on August 30 at 9:30 a.m. (hybrid mode).
Title: Low-rank Methods for Efficient Model Storage, Training and Inference
Date: August 30, 2024, from 9:30 to 12:30 EST
Location: Auditorium 2, MILA + Zoom Link
Jury
Chair and rapporteur | Liu, Bang
Research supervisor | Rabusseau, Guillaume
Regular member | Reddy, Siva
Abstract
Efficient deployment and fine-tuning of Large Language Models (LLMs) are critical challenges due to their substantial computational and memory requirements. This thesis proposal addresses these challenges using low-rank techniques for model compression, fine-tuning, and inference. First, Random Orthogonal Subspace Adaptation (ROSA) is proposed to reduce memory usage during fine-tuning while maintaining model performance. Building on the LoRA framework, ROSA leverages singular value decomposition (SVD) for parameter initialization. For inference, Tensorized Joint Distribution Networks (TJDNet) are introduced as a step towards parallelizable sampling to accelerate inference. By representing joint probabilities with uniform Matrix Product States, marginalization operations can be performed efficiently.
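To make the low-rank fine-tuning idea concrete, here is a minimal sketch of a LoRA-style update whose factors are initialized from an SVD of the frozen weight, in the spirit of the abstract. The exact ROSA procedure is not specified here, so the rank, shapes, and initialization scheme below are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4           # layer dimensions and adapter rank (illustrative)

W = rng.standard_normal((d_out, d_in))   # frozen pretrained weight matrix

# SVD-based initialization of the rank-r adapter factors (assumption: the
# singular-value scaling is split evenly between the two factors).
U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * np.sqrt(S[:r])            # (d_out, r) trainable factor
B = np.sqrt(S[:r])[:, None] * Vt[:r, :]  # (r, d_in)  trainable factor

# Forward pass: frozen weight plus the low-rank trainable correction.
# During fine-tuning only A and B (2*d*r parameters) would receive
# gradients, not the full d_out*d_in matrix W.
x = rng.standard_normal(d_in)
y = W @ x + A @ (B @ x)

print(A.size + B.size, W.size)   # adapter parameters vs. full-matrix parameters
```

The memory saving comes from training only the two thin factors: here the adapter holds 2 * 64 * 4 = 512 parameters against 4096 for the full matrix.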