Predoc III presentation - Marawan Gamal - Département d'informatique et de recherche opérationnelle

Dear all,

You are cordially invited to the Predoc III evaluation of Marawan Gamal on August 30 at 9:30 a.m. (hybrid mode).

Title: Low-rank Methods for Efficient Model Storage, Training and Inference

Date: August 30, 2024, from 9:30 to 12:30 EST

Location: Auditorium 2, MILA + Zoom link

Link: https://umontreal.zoom.us/j/81067937725?pwd=Qx4cSO2jTB2oy5oykAcCfLtMfdoZ7n.1

Jury

Chair and rapporteur	Liu, Bang
Research supervisor	Rabusseau, Guillaume
Regular member	Reddy, Siva

Abstract

Efficient deployment and fine-tuning of Large Language Models (LLMs) are critical challenges because of their substantial computational and memory requirements. This thesis proposal addresses these challenges with low-rank techniques for model compression, fine-tuning and inference. First, Random Orthogonal Subspace Adaptation (ROSA) is proposed to reduce memory usage during fine-tuning while maintaining model performance. Building on the LoRA framework, ROSA leverages singular value decomposition (SVD) for parameter initialization. For inference, Tensorized Joint Distribution Networks (TJDNet) are introduced as a step towards parallelizable sampling that accelerates inference. Because joint probabilities are represented with uniform matrix product states, marginalization can be performed efficiently.
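To illustrate why a uniform matrix product state makes marginalization cheap, here is a minimal NumPy sketch. It is not TJDNet's implementation: the nonnegative shared core `core`, the boundary vectors `alpha` and `beta`, and the helper names are all assumptions chosen for this example. The joint is taken to be proportional to alpha^T A[x_1] ... A[x_T] beta, so summing out a position amounts to replacing A[x_t] with the sum of the core over the vocabulary.

```python
import itertools
import numpy as np

def mps_unnormalized(core, alpha, beta, tokens):
    """Unnormalized score alpha^T A[x_1] ... A[x_T] beta for a full sequence."""
    vec = alpha
    for tok in tokens:
        vec = vec @ core[tok]
    return float(vec @ beta)

def mps_marginal(core, alpha, beta, partial):
    """Probability of a partial assignment; None marks a summed-out position."""
    summed = core.sum(axis=0)  # sum over the vocabulary axis, shape (R, R)
    num = alpha  # numerator: fixed tokens kept, free positions summed out
    den = alpha  # denominator: normalizer with every position summed out
    for tok in partial:
        num = num @ (summed if tok is None else core[tok])
        den = den @ summed
    return float(num @ beta) / float(den @ beta)

# Hypothetical toy sizes: vocabulary V, bond dimension R, sequence length T.
rng = np.random.default_rng(0)
V, R, T = 3, 2, 4
core = rng.random((V, R, R))  # nonnegative entries keep all scores positive
alpha = rng.random(R)
beta = rng.random(R)

# p(x_1 = 1, x_4 = 2) with x_2 and x_3 marginalized, via matrix products.
fast = mps_marginal(core, alpha, beta, (1, None, None, 2))

# The same quantity by brute-force enumeration over all V**T sequences.
Z = sum(mps_unnormalized(core, alpha, beta, seq)
        for seq in itertools.product(range(V), repeat=T))
slow = sum(mps_unnormalized(core, alpha, beta, (1, a, b, 2))
           for a in range(V) for b in range(V)) / Z
```

In this sketch the matrix-product route uses T vector-matrix products of size R, while the brute-force check enumerates V**T sequences; the two agree up to floating-point error.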
