Neural Machine Translation: Achieving Fast Training and Fast Inference with Gated Convolutions

by


David Grangier*

Facebook AI Research

 

Thursday, October 12, 15:30-16:30, Room 1360, Pavillon André-Aisenstadt

Université de Montréal, 2920 Chemin de la Tour

Coffee beforehand, 15:00-15:30

****************************************
PLEASE NOTE THE UNUSUAL ROOM
****************************************

* joint work with Michael Auli, Yann Dauphin, Angela Fan, Jonas Gehring and Sergey Edunov

Abstract:

The design of neural architectures for Machine Translation (MT) and related language modeling tasks is an active research field. The first part of our talk introduces several architectural changes relative to the original work of Bahdanau et al. 2014: we replace non-linearities with our novel gated linear units, replace recurrent units with convolutions, and introduce multi-hop attention to allow more complex attention patterns. These changes improve generalization performance, training efficiency and decoding speed. The second part of our talk analyzes the properties of the distribution predicted by the model, examines how predictions differ from their empirical counterparts, and discusses how this influences beam search.
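
For readers less familiar with the gated linear units mentioned above, the sketch below illustrates the idea from Dauphin et al. 2017: a convolution produces two sets of feature maps, one of which is passed through a sigmoid and used to gate the other, v(X) = (X*W + b) ⊗ σ(X*V + c). This is an illustrative PyTorch-style sketch with assumed layer sizes, not the authors' implementation.

    # Illustrative gated linear unit over a causal 1-D convolution
    # (in the spirit of Dauphin et al. 2017); layer sizes are arbitrary assumptions.
    import torch
    import torch.nn as nn

    class GatedConv1d(nn.Module):
        def __init__(self, in_channels, out_channels, kernel_size):
            super().__init__()
            # A single convolution produces both the candidate activations and the gates.
            self.conv = nn.Conv1d(in_channels, 2 * out_channels, kernel_size,
                                  padding=kernel_size - 1)

        def forward(self, x):
            # x has shape (batch, channels, time).
            y = self.conv(x)
            # Keep only the first len(x) outputs so position t never sees inputs after t.
            y = y[:, :, : x.size(2)]
            a, b = y.chunk(2, dim=1)
            # GLU: linear path modulated by a sigmoid gate.
            return a * torch.sigmoid(b)

    # Example: a batch of 8 sequences, 128 input channels, 50 time steps.
    layer = GatedConv1d(128, 256, kernel_size=3)
    out = layer(torch.randn(8, 128, 50))   # -> shape (8, 256, 50)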

Biography:

David Grangier is a research scientist at Facebook AI Research, Menlo Park, CA. David earned his PhD in Machine Learning from the École Polytechnique Fédérale de Lausanne, advised by Samy Bengio. He has worked at several industrial labs, including NEC Labs America (2008-2011), AT&T Research (2011-2012) and Microsoft Research (2012-2014). David currently works on machine learning and its applications to natural language processing; he is particularly interested in text generation tasks.
http://david.grangier.info/

 

Publications related to the talk:


Convolutional Sequence to Sequence Learning
Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, Yann N. Dauphin - International Conference on Machine Learning (ICML). 2017.

Language Modeling with Gated Convolutional Networks
Yann N. Dauphin, Angela Fan, Michael Auli and David Grangier - International Conference on Machine Learning (ICML). 2017.

Efficient softmax approximation for GPUs
Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier and Hervé Jégou - International Conference on Machine Learning (ICML). 2017.

A Convolutional Encoder Model for Neural Machine Translation
Jonas Gehring, Michael Auli, David Grangier, Yann N. Dauphin - Conference of the Association for Computational Linguistics (ACL). 2017.

Neural Text Generation from Structured Data with Application to the Biography Domain
Remi Lebret, David Grangier and Michael Auli - Conference on Empirical Methods in Natural Language Processing (EMNLP). 2016.

Vocabulary Selection Strategies for Neural Machine Translation
Gurvan L'Hostis, David Grangier, Michael Auli - arXiv:1610.00072. 2016.

Strategies for Training Large Vocabulary Neural Language Models
Wenlin Chen, David Grangier and Michael Auli - Conference of the Association for Computational Linguistics (ACL). 2016.