Francisco (Paco) Guzman

Research Scientist

I work in the Language and Translation Technology (LATTE) group in Facebook’s Applied Machine Learning (AML) division, where my field is machine translation. My research has been published in top-tier NLP venues such as ACL and EMNLP. I have participated in several machine translation competitions, obtaining top rankings for the Arabic-English and Spanish-English language pairs. I have also contributed to machine translation evaluation using discourse information, winning the WMT2014 metrics evaluation campaign. I obtained my PhD from ITESM in Mexico, was a visiting scholar at LTI-CMU from 2008 to 2009, and participated in DARPA’s GALE evaluation program. From 2012 to 2016, I was a post-doc and then a scientist at the Qatar Computing Research Institute.

Interests

Machine translation and natural language processing

Related Links

Google Scholar

Latest Publications

Are we Estimating or Guesstimating Translation Quality?

Shuo Sun, Francisco (Paco) Guzman, Lucia Specia

ACL - July 6, 2020

The FLORES Evaluation Datasets for Low-Resource Machine Translation: Nepali–English and Sinhala–English

Francisco (Paco) Guzman, Peng-Jen Chen, Myle Ott, Juan Pino, Guillaume Lample, Philipp Koehn, Vishrav Chaudhary, Marc'Aurelio Ranzato

EMNLP - October 31, 2019

Low-Resource Corpus Filtering using Multilingual Sentence Embeddings

Vishrav Chaudhary, Yuqing Tang, Francisco (Paco) Guzman, Holger Schwenk, Philipp Koehn

ACL - August 2, 2019

Findings of the WMT 2019 Shared Task on Parallel Corpus Filtering for Low-Resource Conditions

Philipp Koehn, Francisco (Paco) Guzman, Vishrav Chaudhary, Juan Pino

WMT at ACL - August 2, 2019

Design and Evaluation of a Social Media Writing Support Tool for People with Dyslexia

Shaomei Wu, Lindsay Reynolds, Xian Li, Francisco (Paco) Guzman

CHI - May 4, 2019

Downloads & Projects

A public benchmark for low resource machine translation

FLoRes is a dataset that can be used to reproduce the experiments…

WikiMatrix is a corpus of parallel sentences used in the project outlined in WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia. The goal of this project is to mine for parallel sentences in the textual content of Wikipedia for all possible language pairs.