We are aware of registration issues in browsers other than Internet Explorer and are extending early bird registration until February 28th while we fix the issue.

About

AMTA 2018 is the 13th biennial conference organized by the Association for Machine Translation in the Americas. The AMTA conferences are unique in bringing together MT researchers, developers, and users of MT technology from government and industry. This year we will be co-locating with the GALA (Globalization and Localization Association) conference, sharing workshops and tutorials. AMTA 2018 attendees who wish to take advantage of this great opportunity at a discounted registration rate can use the discount code “GALA2018-AMTA” when they register on the GALA site. Please note that the discount is valid for non-GALA members only.

Why attend? AMTA 2018 will interest people from both academia and industry. For scientists it provides a unique opportunity to share research results with colleagues and understand user demands. Business and government participants will benefit from updates on leading-edge R&D in machine translation and have a chance to present and discuss their use cases.

The conference will feature three tracks – Research, Commercial, and Government – each dedicated to its respective area: machine translation research, commercial application, and government use. There will also be invited talks and panels.

Topics covered will include the latest advancements in machine translation, such as deep learning and neural MT; lower-resourced languages; user interfaces for MT; MT evaluation approaches; MT in commercial settings; integrating MT into localization lifecycles; MT in chat, support, and other “live” applications; MT and “noisy” content types; MT as a translator and analyst tool; and more.

Call for Presentations & Papers

Program

DATE
Time
SESSION TITLE
07:30am - 09:00am
Breakfast

09:00am - 10:30am
Tutorial | De-mystifying Neural MT

Neural Machine Translation technology is progressing at a very rapid pace. In the last few years, the research community has proposed several different architectures with various levels of complexity. However, even complex Neural Networks are really built from simple building blocks, and their functioning is governed by relatively simple rules. In this tutorial, we aim to provide an intuitive understanding of the concepts that lie behind this very successful machine learning paradigm.

In the first part of the tutorial we will explain, through visuals and examples, how Neural Networks work. We will introduce the basic building block, the neuron; illustrate how networks are trained; and discuss the advantages and challenges of deep networks.

The second part will focus on Neural Machine Translation. We will present the main Neural Network architectures that power the current NMT engines: Recurrent Neural Networks with attention, Convolutional Networks for MT, and the Transformer model. We will discuss some of the practical aspects involved in training and deploying high-quality translation engines. Using examples we will illustrate some of the current challenges and limitations of the technology. Last but not least, we will try to look to the future and talk about the still not-fully-realized potential of deep learning.
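As a small taste of the attention mechanism mentioned above, here is a minimal NumPy sketch of scaled dot-product attention, the core operation of the Transformer model. This is our own illustration, not part of the tutorial materials:

```python
import numpy as np

def attention(queries, keys, values):
    """Scaled dot-product attention: each query position mixes the value
    vectors, weighted by its softmax-normalized similarity to every key."""
    d_k = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d_k)         # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over source positions
    return weights @ values                          # weighted sum of values

# One target position attending over three source positions:
q = np.array([[1.0, 0.0]])
k = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
v = np.array([[1.0], [2.0], [3.0]])
out = attention(q, k, v)
```

In an NMT decoder this weighted sum is what lets each target word "look at" the most relevant source words while translating.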

Presenter(s):
Dragos Munteanu -
SDL
Ling Tsou -
Research Engineer - SDL
Target Audience:
Localization professionals with limited experience with Neural Networks and Deep Learning
09:00am - 10:30am
Tutorial | A Deep Learning curve for Post-Editing

Does post-editing also require a deep learning curve? How do the neural networks of post-editors work in concert with neural MT engines? Can post-editors and engines be retrained to work more effectively with each other?

In this tutorial, we demystify the process and focus on the latest MT developments and their impact on post-editing practices. We will cover enterprise-scale project integrations, zoom into the nitty-gritty of tool compatibility, address the different use cases of MT and dynamic quality models, and share our insights on business intelligence (BI) and how to measure it all for informed stakeholder decisions.

Outline:

  • Introduction to MT and Post-Editing
  • MT integration for enterprise-scale programs:
    • How it is done
    • What translators see: working online versus working offline
    • The impact of connectors on MT output
    • What is ‘normal’ in raw MT, and what isn’t
    • ‘Pre-editing’ or ‘post-processing’
  • Different types of MT and implications for post-editors:
    • SMT and NMT: key concepts, current state, strengths and weaknesses, typical errors and whether they can be fixed
    • Static MT and adaptive MT: key concepts, current state, strengths and weaknesses, workflow integration
  • Adaptive MT demos & discussion: SDL, Lilt, ModernMT
  • Dynamic Quality Models and how to post-edit for different use cases:
    • Focus on fast, cheap, usable quality: light post-editing
    • Focus on technical accuracy: medium post-editing
    • Focus on maintaining the highest translation quality: full post-editing
  • How to evaluate productivity based on automatic scoring, post-edit distances, and productivity reports while discarding anomalies
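The "post-edit distance" in the last outline item is usually computed as a word-level edit distance between the raw MT output and its post-edited version, normalized by length. A minimal sketch (our illustration; function names are ours, not the presenters' tooling):

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # delete a word
                          d[i][j - 1] + 1,      # insert a word
                          d[i - 1][j - 1] + cost)  # keep or substitute
    return d[m][n]

def post_edit_distance(mt_output, post_edited):
    """Edits per post-edited word: 0.0 means the MT output was accepted as-is."""
    mt, pe = mt_output.split(), post_edited.split()
    return edit_distance(mt, pe) / max(len(pe), 1)

# One inserted word out of six -> roughly 0.17:
ped = post_edit_distance("the cat sat on mat", "the cat sat on the mat")
```

Production tools typically report this per segment and aggregate it across a job, discarding anomalies such as segments the translator rewrote from scratch.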
Presenter(s):
Alex Yanishevsky -
Senior Manager, MT and NLP Deployments - Welocalize
Elaine O'Curran -
MT Program Manager – Welocalize
Target Audience:
Translators, LSPs, and translation buyers seeking guidance on how to navigate the complex landscape of production tools and effectively measure BI and KPIs for MT and post-editing.
09:00am - 10:30am
Tutorial | MQM-DQF: A Good Marriage (Translation Quality for the 21st Century)

In the past three years, the language industry has been converging on the use of the MQM-DQF framework for analytic quality evaluation. It emerged from two separate quality-evaluation approaches: the European Commission-funded Multidimensional Quality Metrics (MQM) and the Dynamic Quality Framework (DQF) from TAUS. Harmonized in 2015, the resulting shared hierarchy of error types allows implementers to classify common translation problems and perform comparative analysis of translation quality.

MQM-DQF is currently undergoing a formal standardization process in ASTM F43 and will remain a free and open framework.

Attendees will learn how to apply MQM-DQF to their particular needs, including use in typical MT research scenarios where it can bring consistency and clarity. They will be better prepared to select a quality assessment methodology that is appropriate to their needs and that can help connect the needs of technology developers, users, linguists, and information consumers.

  1. A typology of translation quality metrics. This discussion will enable participants to understand how MQM-DQF compares to other quality evaluation approaches and their comparative strengths and weaknesses.
  2. Overview of MQM-DQF and key features. This detailed overview will highlight how the framework relates to existing standards, the role of translation specifications in evaluating quality, and the approach the specification takes to developing numerical quality scores.
  3. Market adoption. This section will cover the tools that have already adopted MQM-DQF and how they apply it.
  4. Detailed case studies. The presenters will discuss specific use cases submitted by tutorial participants to explore how they can create a customized MQM-DQF metric.
  5. Validity and reliability. This section discusses the importance of determining validity and measuring reliability within a translation quality evaluation system.
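To make the "numerical quality scores" of item 2 concrete, analytic metrics in the MQM-DQF family typically turn severity-weighted error counts into a per-word penalty. The weights and formula below are a simplified illustration, not the official specification:

```python
# Hypothetical severity weights; real deployments calibrate these per the spec.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "critical": 10}

def quality_score(errors, word_count):
    """Turn annotated errors into a 0-100 score, in the spirit of analytic
    metrics like MQM-DQF (illustrative only, not the official formula)."""
    penalty = sum(SEVERITY_WEIGHTS[severity] for _, severity in errors)
    return max(0.0, 100.0 * (1 - penalty / word_count))

# A 120-word sample with one major and one minor error:
errors = [("mistranslation", "major"), ("spelling", "minor")]
score = quality_score(errors, word_count=120)
```

Because the error typology is shared, scores computed this way can be compared across vendors and engines once the same weights are agreed on.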

Note: The presenters were two of the leads in the harmonization of MQM and DQF and are active in the ongoing standardization effort around the resulting combined approach.

Presenter(s):
Arle Lommel -
Senior analyst, CSA Research
Alan K. Melby -
Chair, LTAC Global
Target Audience:
The target audience includes researchers, developers, and linguists interested in understanding translation quality, ways of assessing it, and the strengths and weaknesses of various approaches.
10:30am - 11:00am
Break

11:00am - 12:30pm
Tutorial | De-mystifying Neural MT (continued from the 09:00am session; see description above)
11:00am - 12:30pm
Tutorial | A Deep Learning curve for Post-Editing (continued from the 09:00am session; see description above)
11:00am - 12:30pm
Tutorial | MQM-DQF: A Good Marriage (Translation Quality for the 21st Century) (continued from the 09:00am session; see description above)
12:30pm - 02:00pm
Lunch

2:00pm - 03:30pm
Tutorial | ModernMT: Open-Source Adaptive Neural MT for Enterprises and Translators

Nowadays, computer-assisted translation (CAT) tools represent the dominant technology in the translation market – and those including machine translation (MT) engines are on the increase. In this new scenario, where MT and post-editing are becoming the standard portfolio for professional translators, it is of the utmost importance that MT systems are specifically tailored to translators.

In this tutorial, we will present ModernMT, a new open-source MT software whose development was funded by the European Union. ModernMT targets two use cases: enterprises that need dedicated MT services; and professional translators working with CAT tools. This tutorial will focus on both use cases.

In the first part, we will present the ModernMT open-source software architecture and guide the audience through its installation on an AWS instance. Then we will demonstrate how to create a new adaptive Neural MT engine from scratch, how to feed its internal memory, and finally how to query it.

In the second part, we will introduce ModernMT’s most distinguishing features when used through a CAT tool: (i) ModernMT does not require any initial training: as soon as translators upload their translation memories in the CAT tool, ModernMT seamlessly and quickly learns from this data; (ii) ModernMT adapts to the content to be translated in real time: the system leverages the training data most similar to the document being translated; (iii) ModernMT learns from user corrections: during the translation workflow, ModernMT constantly learns from the post-edited sentences to improve its translation suggestions. In particular, we will demonstrate ModernMT within MateCat, a popular online professional CAT tool.

In this tutorial, participants will learn about industry trends aiming to develop MT focusing on the specific needs of enterprises and translators. They will see how current state-of-the-art MT technology is being consolidated into a single, easy-to-use product capable of learning from – and evolving through – interaction with users, with the final aim of increasing MT-output utility for the translator in a real professional environment.

Presenter(s):
Marcello Federico -
MMT, FBK
Davide Caroselli -
MMT
Target Audience:
MT users, specialists, integrators, developers, managers, decision makers.
2:00pm - 03:30pm
Tutorial | Corpora Quality Management for MT - Practices and Roles

The quality of the corpora used to train MT systems has not been a prominent topic of discussion compared to MT technology itself. Most research uses the same well-established corpora so that results can be reproduced. However, corpora can strongly determine the quality of the MT output, and neural MT relies more on data quality than previous technologies. Therefore, we think the time has come to take a deeper look at corpora quality. We will look at this from a science perspective and from a linguist's perspective, exploring current and future roles in the evolving MT scenario.

From the eBay experience, participants will learn and discuss:

  • Best practices on Corpora Management from a Science team perspective
    • Automatic ways to locate issues and cleanup – with examples
  • Best practices on Corpora Management from a Localization team perspective
    • Engineering and Linguistic Cleanup – with examples
    • Creating high-quality in-domain content via post-editing
  • Other Industry Best Practices on Corpora management – with examples
  • New metrics for corpora quality
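As an illustration of the "automatic ways to locate issues and cleanup" bullet, parallel corpora are commonly filtered with simple heuristics before training. This sketch is our own example of such filters, not eBay's pipeline:

```python
def keep_pair(src, tgt, max_ratio=3.0, max_len=200):
    """Common heuristic filters for cleaning parallel corpora: drop empty,
    overlong, or badly length-mismatched sentence pairs."""
    s, t = src.split(), tgt.split()
    if not s or not t:                      # empty side
        return False
    if len(s) > max_len or len(t) > max_len:  # suspiciously long segment
        return False
    ratio = max(len(s), len(t)) / min(len(s), len(t))
    return ratio <= max_ratio               # source/target length mismatch

pairs = [
    ("Hello world", "Hallo Welt"),
    ("Hello", ""),                                                 # empty target
    ("a", "this is a much longer sentence that does not match"),   # mismatch
]
clean = [p for p in pairs if keep_pair(*p)]
```

Real pipelines add further checks (language identification, deduplication, encoding repair), but even these basic rules catch many alignment errors that would otherwise teach a neural system bad translations.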

Considering that data curation of corpora may become a task for a Language Professional, learn from the University of Texas what the profile of a Language Professional could be:

  • What a Language Professional's education and skills should look like
  • Experience from bringing Corpora Management to L10N students
Presenter(s):
Nicola Ueffing -
eBay MT Science
Pete Smith -
University of Texas Arlington
Silvio Picinini -
eBay Localization
Target Audience:
Anyone involved in the deployment of MT who is interested in the quality of the corpus data that will be at the foundation of the deployment. Roles include managers, linguists, engineers, and scientists.
2:00pm - 03:30pm
Workshop | The Role of Authoritative Standards in the MT Environment

In this workshop, we will bring together experts from across the standards community, including from the American Society for Testing and Materials (now just “ASTM International”), the American National Standards Institute (ANSI), the International Organization for Standardization (ISO), the Globalization and Localization Association (GALA), and the World Wide Web Consortium (W3C). These experts will discuss authoritative standards that impact the development, implementation, and evaluation of translation systems and of the interoperability of resources.

The workshop will consist of one-half day of technical presentations with invited talks on topics including the structure of the U.S. and international standards community, developing and implementing standards for translation quality assessment and quality assurance, the Translation API Cases and Classes (TAPICC) initiative, and updates to TermBase eXchange (TBX). A panel will discuss gaps in this network of standards and will solicit input from co-panelists and from the audience on how to improve the standards and standards processes, particularly in the fast-changing world of semantic and neural technological development. Feedback will be provided to the relevant standards committees.

Organizer(s): Jennifer DeCamp -
MITRE
Participant(s): Alan K. Melby -
Chair, LTAC Global
Arle Lommel -
Senior analyst, CSA Research
David Filip -
ISO/IEC JTC 1
Bill Rivers -
Executive Director at Joint National Committee for Languages
Sue Ellen Wright -
Kent State University Institute for Applied Linguistics
Agenda:
02:00pm - 02:15pm | Jennifer DeCamp | Introduction
02:15pm - 02:30pm | Jennifer DeCamp | Language Codes
02:30pm - 03:00pm | Sue Ellen Wright | TermBase eXchange (TBX)
03:00pm - 03:30pm | David Filip | XLIFF 2
03:30pm - 04:00pm | Break
04:00pm - 04:30pm | Bill Rivers | Translation Standards
04:30pm - 05:00pm | Arle Lommel | Translation Quality Metrics
05:00pm - 05:30pm | Alan Melby | Translation API Cases and Classes (TAPICC)
05:30pm - 06:00pm | Panel
03:30pm - 04:00pm
Break

04:00pm - 05:30pm
Tutorial | ModernMT: Open-Source Adaptive Neural MT for Enterprises and Translators (continued from the 2:00pm session; see description above)
04:00pm - 05:30pm
Tutorial | Corpora Quality Management for MT - Practices and Roles (continued from the 2:00pm session; see description above)
04:00pm - 06:00pm
Workshop | The Role of Authoritative Standards in the MT Environment (continued from the 2:00pm session; see description and agenda above)
06:00pm - 09:00pm
Welcome reception at Conference Hotel

07:30am - 09:00am
Breakfast

09:00am - 09:45am
Research Keynote | Unveiling the Linguistic Weaknesses of Neural MT

Almost four years after the advent of neural MT, it is time to look beyond the success story and reflect on the intrinsic limitations of this technology.
In particular, what specific language phenomena are learnt, or not, by recurrent neural networks in the typical training scenarios? To what extent is hierarchical language structure captured? Do NMT models learn to extract linguistic features from raw data and exploit them in any explicable way?
In this talk I will present recent answers to these questions and discuss promising directions to further advance our understanding of the linguistic strengths and weaknesses of NMT.

Presenter(s):

Arianna Bisazza -
Leiden University
Arianna Bisazza is an Assistant Professor in computer science at Leiden University, Netherlands. Her research focuses on the statistical modeling of natural language, with the main goal of improving the quality of machine translation for challenging language pairs. She previously worked as a postdoc at the University of Amsterdam and as a research assistant at Fondazione Bruno Kessler. She obtained her PhD from the University of Trento in 2013 and was awarded a VENI (NWO starting grant) in 2016.
09:45am - 10:30am
Commercial Keynote | Machine Translation Beyond the Sentence

Machine translation has made great progress in the last couple of years toward solving the traditional task of producing one good translation for a given sentence. This progress makes it possible, and necessary, to take a harder look at the larger contexts where machine translation is used. We’ll look at some of the technical, linguistic, and social challenges that arise when we place a single-sentence machine translation engine into a wider context, and at some prospects for solutions.

Presenter(s):

Macduff Hughes -
Engineering Director at Google
Macduff Hughes has led the Google Translate team as Engineering Director since 2012. He has worked at Google since 2007, having previously led the Google Voice and Google Accounts teams. He has a bachelor's degree from Stanford University and did graduate studies at the University of Trier and Columbia University.
10:30am - 11:00am
Break

11:00am - 11:30am
Research | Document-Level Information as Side Constraints for Improved Neural Patent Translation

Presenter(s): Laura Jehl, Stefan Riezler
11:00am - 11:30am
Commercial | Augmented Translation: A New Approach to Combining Human and Machine Capabilities

Presenter(s):
Arle Lommel -
Senior analyst, CSA Research
11:00am - 12:30pm
Government Panel | Government Implications for Commercial Innovations in Systems Combining MT and TM

Moderator(s): Patti O'Neill-Brown -
Research and Development Director at U.S. Government
Panelist(s): Olga Beregovaya -
Welocalize
Ray Flournoy -
Director, Localization and Translation at Etsy
Spence Green -
CEO at Lilt
John Paul Barraza -
Systran
Marcin Junczys-Dowmunt -
Microsoft
11:30am - 12:00pm
Research | Fluency Over Adequacy: A Pilot Study in Measuring User Trust in Imperfect MT

Presenter(s): Marianna J. Martindale, Marine Carpuat
11:30am - 12:00pm
Commercial | Training, feedback and productivity measurement with NMT and Adaptive MT

Presenter(s):
Jean-Luc Saillard -
COO at SmartCAT.ai
12:00pm - 12:30pm
Research | Combining Quality Estimation and Automatic Post-editing to Enhance Machine Translation output

Presenter(s): Rajen ChatterjeeMatteo NegriMarco TurchiFrédéric BlainLucia Specia
12:00pm - 12:30pm
Commercial | You talking to me? Bringing MT to the world of chatbots

Presenter(s):
Jose Palomares -
CTO at Venga Global
12:30pm - 02:00pm
Lunch

12:30pm - 03:30pm
Technology Showcase

The exhibitors will include:

  • Asia-Pacific Association for Machine Translation (AAMT) and Nagoya University
  • Amazon
  • eBay
  • MateCat
  • Memsource
  • MMT Srl
  • Plunet
  • Prompsit
  • SDL
  • SmartCat
  • STAR
  • Systran
  • Translations.com

 

We will also host presentations in a classroom-style environment.

12:30 – 12:50    StarMT
1:00 – 1:20      XTM International
1:30 – 1:50      Translations.com
2:00 – 2:20      Plunet
2:30 – 2:50      Asia-Pacific Association for Machine Translation (AAMT) and Nagoya University
3:00 – 3:20      Memsource
3:30 – 3:50      ModernMT
4:00 – 4:20      Systran
4:30 – 4:50      SDL
5:00 – 5:30      SmartCat

03:30pm - 04:00pm
Break

04:00pm - 05:30pm
Technology Showcase

(Same exhibitor lineup and classroom presentation schedule as the 12:30pm Technology Showcase above.)

5:30pm
AMTA General Business

07:30am - 09:00am
Breakfast

09:00am - 09:45am
Government Keynote | Setting up a Machine Translation Program for IARPA

Dr. Carl Rubino will introduce IARPA’s latest multi-modal machine translation program, MATERIAL (MT for English Retrieval of Information in Any Language). He will explain the philosophy behind the program’s structure and goals, and detail the unique end-to-end cross-language information retrieval and summarization evaluation mechanism chosen to advance the science in a novel direction.

Presenter(s):

Dr. Carl Rubino -
IARPA
He is currently an IARPA program manager with a keen interest in natural language processing technologies for low-resource languages, and came to IARPA with over a decade of experience in machine translation development. Prior to IARPA, he served for many years as Director of the Center for Applied Machine Translation, the government entity that develops Cybertrans. He has also held industry positions in other language technologies (at Panasonic Speech Technology Laboratory and AnswerLogic) after an academic career in Linguistics at the University of California, Santa Barbara and the Australian National University.
09:45am - 10:30am
Panel | Deploying Open Source Neural Machine Translation (NMT) toolkits in the Enterprise

Join industry leaders at the forefront of NMT to discuss the open-source frameworks their organizations are developing and supporting. The panel will discuss specific considerations for using open source in the enterprise, motivations from both company and user perspectives, benefits, best practices, and more.

Moderator(s): John Paul Barraza -
Systran
Panelist(s): Anna Goldie -
Google
Marcin Junczys-Dowmunt -
Microsoft
Alon Lavie -
Amazon
Guillaume Klein -
Systran
10:30am - 11:00am
Break

11:00am - 11:30am
Research | Neural Morphological Tagging of Lemma Sequences for Machine Translation

Presenter(s): Costanza Conforti, Matthias Huck, Alexander Fraser
11:00am - 11:30am
Commercial | The Collision of Quality and Technology with Reality

Presenter(s):
Don DePalma -
CSA
11:00am - 12:00pm
Government | Building OpenNMT Models for Production Scale Deployment on SYSTRAN Enterprise Server

Presenter(s): Dr. Tim Anderson, John Paul Barraza
11:30am - 12:00pm
Research | Context Models for OOV Word Translation in Low-Resource Languages

Presenter(s): Angli Liu, Katrin Kirchhoff
11:30am - 12:00pm
Commercial | Multi-modal Machine Translation

Presenter(s):
Jungi Kim -
Systran
12:00pm - 12:30pm
Research | How Robust Are Character-Based Word Embeddings in Tagging and MT Against Wrod Scramlbing or Randdm Nouse?

Presenter(s): Georg Heigold, Guenter Neumann, Josef van Genabith, Stalin Varanasi
12:00pm - 12:30pm
Commercial | Same-language machine translation for local flavours/flavors

Presenter(s):
Gema Ramírez-Sánchez -
Prompsit
Janice Campbell -
Adobe
12:00pm - 12:30pm
Government | Incorporating MT into a Bi-directional Speech Translation System for U.S. Army Units

Presenter(s): Steve LaRocca, John Morgan
12:30pm - 02:00pm
Lunch

02:00pm - 02:30pm
Commercial | Beyond MT: AI and Big Data in the Language Industry

Presenter(s):
Jay Marciano -
Lionbridge
02:00pm - 02:30pm
Government | The Impact of Advances in Neural and Statistical MT on the Translation Workforce

Presenter(s): Jennifer DeCamp
02:00pm - 02:15pm
Short Presentations | Nearest Neighbour Class-Combination Method for Balancing Translation Quality and Sentiment Preservation

Presenter(s): Pintu Lohar, Haithem Afli, Andy Way
02:15pm - 02:30pm
Short Presentations | Register-sensitive Translation: a Case Study of Mandarin and Cantonese

Presenter(s): Tak-sum Wong, John Sie Yuen Lee
02:30pm - 03:30pm
Open Source 1: Neural Monkey, OpenNMT, XNMT

1. Neural Monkey: The Current State and Beyond

Jindřich Helcl, Jindřich Libovický, Tom Kocmi, Tomáš Musil, Ondřej Cífka, Dusan Varis and Ondřej Bojar.

2. OpenNMT: Open-Source Toolkit for Neural Machine Translation

Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart and Alexander Rush.

3. XNMT: The eXtensible Neural Machine Translation Toolkit

Graham Neubig, Matthias Sperber, Xinyi Wang, Matthieu Felix, Austin Matthews, Sarguna Padmanabhan, Ye Qi, Devendra Singh Sachan, Philip Arthur, Pierre Godard, John Hewitt, Rachid Riad and Liming Wang.

02:30pm - 03:00pm
Government | PEMT for the Public Sector - Evolution of Solution

Presenter(s): Konstantine Boukhvalov, Sandy Hogg
03:00pm - 03:30pm
Government | Embedding Register-Aware MT into the CAT Workflow

Presenter(s): Elizabeth Mallard
03:30pm - 04:00pm
Break

04:00pm - 04:30pm
Government | Challenges in Speech Recognition and Translation of High-Value Low-Density Polysynthetic Languages

Presenter(s): Judith L. Klavans
04:00pm - 05:00pm
Open Source 2: Tensor2Tensor, Sockeye, SGNMT

1. Tensor2Tensor for Neural Machine Translation

Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer and Jakob Uszkoreit.

2. The Sockeye Neural Machine Translation Toolkit at AMTA 2018

Felix Hieber, Tobias Domhan, Michael Denkowski, David Vilar, Artem Sokolov, Ann Clifton and Matt Post.

3. Why not be Versatile? Applications of the SGNMT Decoder for Machine Translation

Felix Stahlberg, Danielle Saunders, Gonzalo Iglesias and Bill Byrne.

04:30pm - 05:00pm
Government | Efficient Translation Workflows with the Neural Feedback Loop

Presenter(s): Spence Green
05:00pm - 05:30pm
Research | An Evaluation of Two Vocabulary Reduction Methods for Neural Machine Translation

Presenter(s): Marcello Federico, Duygu Ataman
05:00pm - 05:30pm
Commercial | Thinking of Going Neural? Factors Honda R&D Americas is Considering before Making the Switch

Presenter(s):
Phil Soldini -
Honda
05:00pm - 05:30pm
Government | Evaluating Automatic Speech Recognition in Translation

Presenter(s): Evelyne Tzoukermann, Corey Miller
07:00pm
Banquet

07:30am - 09:00am
Breakfast

09:00am - 09:45am
Commercial Keynote | Use more Machine Translation and Keep Your Customers Happy

Presenter(s):

Glen Poor -
Microsoft
A 26-year veteran of Microsoft, he has spent most of that time working on internationalization and localization problems in Windows and Office. He heads the Language Technology team inside the Office Product Group. He has a BS in Applied Mathematics and a BA in Political Economy from the University of Washington.
09:45am - 10:30am
Research Keynote | Towards Easier Machine Translation

Much has been made about the accuracy improvements from neural systems, but the transition to NMT will potentially make the technology of translation more accessible in transformative ways. I will present three projects from Harvard with this goal, including:

  1. OpenNMT, a collaborative open-source project to provide benchmark NMT components;
  2. Sequence distillation, a research project to build smaller, faster, on-device NMT systems;
  3. LSTMVis, a visualization framework designed to support inspection and debugging of translation output.

These projects aim to make translation systems easier to extend, easier to run, and easier to understand.
Presenter(s):

Alexander (Sasha) Rush -
Harvard University
He is an assistant professor at Harvard University. His research interest is in machine learning methods for NLP, with a focus on deep learning for text generation, including applications in machine translation, data and document summarization, and diagram-to-text generation, as well as the development of the OpenNMT translation system. His past work focused on structured prediction and combinatorial optimization for NLP. Sasha received his PhD from MIT, supervised by Michael Collins, and was a postdoc at Facebook NY under Yann LeCun. His work has received four research awards at major NLP conferences.
10:30am - 11:00am
Break
keyboard_arrow_down
keyboard_arrow_up

11:00am - 11:30am
Research | A Smorgasbord of Features to Combine Phrase-Based and Neural Machine Translation

Presenter(s): Benjamin Marie, Atsushi Fujita
11:00am - 11:30am
Commercial | Developing a Neural Machine Translation Service for the 2017-2018 European Union Presidency

Presenter(s):
Mārcis Pinnis -
Tilde
11:00am - 11:30am
Government | Detecting and Correcting French Determiner Errors Using Byte Pair Encoding Inputs

Presenter(s): Liz Merkhofer
11:30am - 11:50am
Short Presentations | Exploring Word Sense Disambiguation Abilities of Neural Machine Translation Systems

Presenter(s): Rebecca Marvin, Philipp Koehn
11:30am - 12:00pm
Commercial | Neural Won! Now What?

Presenter(s):
Alex Yanishevsky -
Senior Manager, MT and NLP Deployments - Welocalize
11:30am - 12:30pm
Government | Online Adaptation of Machine Translation with a Neural Cache

Presenter(s): Guido Zarella, John Henderson
11:50am - 12:10pm
Short Presentations | Improving Low Resource Machine Translation using Morphological Glosses

Presenter(s): Steven Shearing, Christo Kirov, Huda Khayrallah, David Yarowsky
12:00pm - 12:30pm
Commercial | Building and evaluating MT systems for large-volume enterprise workflows: The eBay experience

Presenter(s):
Jose Luis Bonilla Sánchez -
eBay
12:10pm - 12:30pm
Short Presentations | Register-sensitive Translation: a Case Study of Mandarin and Cantonese

Presenter(s): Tak-sum Wong, John Sie Yuen Lee
12:30pm - 02:00pm
Lunch

02:00pm - 02:30pm
Research | A Dataset and Reranking Method for Multimodal MT of User-Generated Image Captions

Presenter(s): Shigehiko Schamoni, Julian Hitschler, Stefan Riezler
02:00pm - 02:30pm
Commercial | Artificial Intelligence to Improve Machine Translation

Presenter(s):
Diego Bartolomé -
Translations.com
02:00pm - 02:30pm
Government | Portable Speech-to-Speech Translation on an Android Smartphone: The MFLTS System

Presenter(s): Ralf Meermeier
02:30pm - 03:00pm
Research | Simultaneous Translation using Optimized Segmentation

Presenter(s): Maryam Siahbani, Hassan S. Shavarani, Ashkan Alinejad, Anoop Sarkar
02:30pm - 03:00pm
Commercial | Tiered Machine Translation Model in VMware

Presenter(s):
Lynn Ma -
VMware
02:30pm - 03:00pm
Government | Dragonfly: American Sign Language to Voice R&D Project

Presenter(s): Nick Malayska, Patti O'Neill-Brown, Michael Brandstein
03:00pm - 03:30pm
Commercial | TM & MT – How to calculate the potential benefit?

Presenter(s):
Judith Klein -
Star Group
03:00pm - 03:30pm
Commercial | Turning NMT Research into Commercial Products

Presenter(s):
Dragos Munteanu -
SDL
Adrià de Gispert -
SDL
03:00pm - 03:30pm
Government | Terminology in Operations

Presenter(s):
Linda Moreau
03:30pm - 04:00pm
Break

04:00pm - 04:30pm
Commercial | Beyond Quality, Considerations for an MT solution

Presenter(s):
Quinn Lam -
SDL
04:00pm - 04:30pm
Government | Resources and Planning for Live Machine Translation Product Demonstration Using BLEU and Reading Comprehension Benchmarks

Presenter(s): Sherri Condon
04:30pm - 05:00pm
Commercial | Towards Less Post-Editing

Presenter(s):
Bill Lafferty -
Memsource
04:30pm - 05:00pm
Government | 5 Challenges for MT Research in a Government Setting

Presenter(s): Kathy Baker
05:00pm - 05:30pm
Commercial | SMT user search query

Presenter(s):
Steve Sloto -
Amazon
05:00pm - 05:30pm
Government | Keyword Translation

Presenter(s): Keith J. Miller
07:30am - 09:00am
Breakfast

09:00am - 10:30am
Tutorial | Getting Started Customizing MT with Microsoft Translator Hub: From Pilot Project to Production

  • Develop an Effective MT Customization Pilot Project
    Learn strategies to plan and carry out an effective pilot project to train a customized MT engine, and tips to evaluate the pilot project against your goals so you can move it toward production. Participants will learn how to plan a pilot project, select appropriate training and testing data, clean training data, make iterative improvements to an MT engine, carry out automatic and human evaluation of the translations, and use those evaluations to make decisions regarding budgets and timelines.
  • Hands-on MT Software Tutorial
    Custom train an instance of Microsoft Translator Hub and use free Okapi Framework tools to clean your training data for effective use in Microsoft Translator Hub. Example files will be provided.

Bring your own laptop and come prepared for hands-on work with the tools used in the tutorial.

Presenter(s):
Adam Wooten -
Assistant Professor at the Middlebury Institute of International Studies, Columnist for MultiLingual, Consultant & Trainer for be.international
Target Audience:
This tutorial is for those who want to learn to manage a machine translation project using practical knowledge of tools to make it happen, without the need to be a programmer or a computational linguist.
09:00am - 10:30am
Workshop | Translation Quality Estimation and Automatic Post-Editing

The goal of quality estimation is to evaluate a translation system’s quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential uses: informing an end user about the reliability of translated content; deciding if a translation is ready for publishing or if it requires human post-editing; and highlighting the words that need to be changed. Quality estimation systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016, 2017; Kozlova et al., 2016). A related task is automatic post-editing (Simard et al., 2007; Junczys-Dowmunt and Grundkiewicz, 2016), which aims to automatically correct the output of machine translation. Recent work (Martins, 2017; Kim et al., 2017; Hokamp, 2017) has shown that quality estimation and automatic post-editing benefit from being trained or stacked together.
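To make the task concrete, here is a toy sentence-level quality estimation scorer. This is purely an illustrative sketch, not any system presented at the workshop: the features are a few classic black-box QE signals, and the weights are hand-set for illustration where a real system would learn them from human quality labels such as HTER.

```python
def qe_features(source: str, translation: str) -> dict:
    """Extract a few classic black-box QE features from a (source, MT) pair."""
    src = source.split()
    mt = translation.split()
    return {
        # Length ratio: output much shorter or longer than the source is suspect.
        "len_ratio": len(mt) / max(len(src), 1),
        # Repetition: degenerate MT output often repeats the same tokens.
        "type_token_ratio": len(set(mt)) / max(len(mt), 1),
        # Untranslated content: source tokens copied verbatim into the output.
        "copied_frac": len(set(src) & set(mt)) / max(len(src), 1),
    }


def qe_score(source: str, translation: str) -> float:
    """Combine features into a pseudo-quality score in [0, 1].

    The weights below are hypothetical; a real QE system fits them
    against human labels (e.g. HTER) instead of hand-tuning.
    """
    f = qe_features(source, translation)
    raw = (
        0.4 * (1.0 - min(abs(1.0 - f["len_ratio"]), 1.0))
        + 0.4 * f["type_token_ratio"]
        + 0.2 * (1.0 - f["copied_frac"])
    )
    return max(0.0, min(1.0, raw))
```

A trained model (for instance, regression over hundreds of such features, or a neural predictor-estimator) replaces the hand-set weights, but the pipeline shape is the same: featurize a (source, MT) pair, then score it without any reference translation.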

In this workshop, we will bring together researchers and industry practitioners interested in the tasks of quality estimation (word, sentence, or document level) and automatic post-editing, both from a research perspective and with the goal of applying these systems in industrial settings for routing, for improving translation quality, or for making human post-editors more efficient. Special emphasis will be given to the case of neural machine translation and the new open problems that it poses for quality estimation and automatic post-editing.

The workshop will consist of one full day of technical presentations, including a tentative six invited talks and one contributed talk, followed by a 30-minute panel discussion. There will be a poster session featuring the papers accepted for publication in the workshop proceedings.

Topics of the workshop include, but are not limited to:

  • Research, review, and position papers on document-level, sentence-level, or word-level Quality Estimation
  • Research, review, and position papers on Automatic Post-Editing
  • Machine learning techniques for exploiting the interaction among these two tasks (e.g. stacking and multi-task learning)
  • Corpora curation technologies for developing Quality Estimation datasets
  • User studies showing the impact of Quality Estimation tools in translator productivity
  • Automatic metrics for translation fluency and adequacy
  • Quality Estimation tailored to Neural Machine Translation
  • Quality Estimation tailored to Human Translation
Organizer(s): André Martins -
Unbabel and University of Lisbon
Ramón Astudillo -
Unbabel and INESC-ID Lisboa
João Graça -
Founder and CTO at Unbabel
09:00am - 10:30am
Workshop | Technologies for MT of Low Resource Languages (LoResMT 2018)

Statistical and neural machine translation (SMT/NMT) methods have been used successfully to build MT systems for many widely used languages over the last two decades, with significant improvements in the quality of automatic translation. However, these methods still rely on a number of natural language processing (NLP) tools to pre-process human-generated texts into the forms required as input, and/or to post-process the output into proper textual forms in the target languages.

In many MT systems, the performance of these tools has a great impact on the quality of the resulting translation. However, there has been little discussion of these NLP tools: their methods, their roles in MT systems built with diverse approaches, their coverage of the world's many languages, and so on. In this workshop, we would like to bring together researchers who work on these topics and review the most important tasks we will need these tools to perform for MT in the coming years.

These NLP tools include, but are not limited to, various kinds of word tokenizers/de-tokenizers, word segmenters, and morphology analysers. We solicit papers dedicated to these supplementary tools, as used with any language and especially with low-resource languages, and we would like to build an overview of these NLP tools from our community. Evaluations of these tools in research papers should include how they have improved the quality of MT output.
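As a minimal illustration of the kind of pre-/post-processing pair the workshop is concerned with, the sketch below shows a rule-based word tokenizer and its inverse de-tokenizer. The conventions here are assumptions for the example (punctuation split into its own token, then re-attached to the preceding word); production pipelines use far richer tools, such as the Moses tokenizer scripts or subword segmenters like SentencePiece, and for many low-resource languages even this simple whitespace-and-punctuation assumption does not hold.

```python
import re


def tokenize(text: str) -> list:
    """Split text into word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text)


def detokenize(tokens: list) -> str:
    """Invert tokenize(): re-attach punctuation to the preceding token."""
    out = ""
    for tok in tokens:
        if not out or re.fullmatch(r"[^\w\s]", tok):
            out += tok  # first token, or punctuation: no leading space
        else:
            out += " " + tok
    return out
```

For simple sentences the round trip tokenize → detokenize recovers the original surface form, which is exactly the property an MT pipeline relies on when it tokenizes input for the model and de-tokenizes the model's output.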

Topics of the workshop include, but are not limited to:

  • Research and review papers of pre-process and/or post-process NLP tools for MT
  • Position papers on the development of pre-process and/or post-process tools for MT
  • Word tokenizers/de-tokenizers for specific languages
  • Word/morpheme segmenters for specific languages
  • Alignment/Re-ordering tools for specific language-pairs
  • Use of morphology analysers and/or morpheme segmenters for MT
  • Multilingual and/or Cross-lingual NLP tools for MT
  • Reusability of existing NLP tools for low resource languages
  • Corpora curation technologies for low resource languages
  • Review of available parallel corpora for low resource languages
  • Research and review papers of MT methods for low resource languages
  • Fast building of MT systems for low resource languages
  • Reusability of existing MT systems for low resource languages
Organizer(s): Chao-Hong Liu -
ADAPT Centre, Dublin City University
10:30am - 11:00am
Break

11:00am - 12:30pm
Tutorial | Getting Started Customizing MT with Microsoft Translator Hub: From Pilot Project to Production

(Continuation of the 09:00am tutorial; see above for the full description, presenter, and target audience.)
11:00am - 12:30pm
Workshop | Translation Quality Estimation and Automatic Post-Editing

(Continuation of the 09:00am session of this workshop; see above for the full description, topics, and organizers.)
11:00am - 12:30pm
Workshop | Technologies for MT of Low Resource Languages (LoResMT 2018)

(Continuation of the 09:00am session of this workshop; see above for the full description, topics, and organizers.)
12:30pm - 02:00pm
Lunch

02:00pm - 03:30pm
Workshop | Translation Quality Estimation and Automatic Post-Editing
keyboard_arrow_down
keyboard_arrow_up

The goal of quality estimation is to evaluate a translation system’s quality without access to reference translations (Blatz et al., 2004; Specia et al., 2013). This has many potential usages: informing an end user about the reliability of translated content; deciding if a translation is ready for publishing or if it requires human post-editing; highlighting the words that need to be changed. Quality estimation systems are particularly appealing for crowd-sourced and professional translation services, due to their potential to dramatically reduce post-editing times and to save labor costs (Specia, 2011). The increasing interest in this problem from an industrial angle comes as no surprise (Turchi et al., 2014; de Souza et al., 2015; Martins et al., 2016, 2017; Kozlova et al., 2016). A related task is that of automatic post-editing (Simard et al. (2007), Junczys-Dowmunt and Grundkiewicz (2016)), which aims to automatically correct the output of machine translation. Recent work (Martins, 2017, Kim et al., 2017, Hokamp, 2017) has shown that the tasks of quality estimation and automatic post-editing benefit from being trained or stacked together.

In this workshop, we will bring together researchers and industry practitioners interested in the tasks of quality estimation (word, sentence, or document level) and automatic post-editing, both from a research perspective and with the goal of applying these systems in industrial settings for routing, for improving translation quality, or for making human post-editors more efficient. Special emphasis will be given to the case of neural machine translation and the new open problems that it poses for quality estimation and automatic post-editing.

The workshop will consist of one full day of technical presentations, including a tentative number of 6 invited talks and 1 contributed talk, followed by a 30-minutes panel discussion. There will be a poster session featuring the papers accepted for publication in the workshop proceedings.

Topics: Topics of the workshop include but are not limited to:

  • Research, review, and position papers on document-level, sentence-level, or word-level Quality Estimation
  • Research, review, and position papers on Automatic Post-Editing
  • Machine learning techniques for exploiting the interaction between these two tasks (e.g. stacking and multi-task learning)
  • Corpora curation technologies for developing Quality Estimation datasets
  • User studies showing the impact of Quality Estimation tools on translator productivity
  • Automatic metrics for translation fluency and adequacy
  • Quality Estimation tailored to Neural Machine Translation
  • Quality Estimation tailored to Human Translation
Organizer(s): André Martins (Unbabel and University of Lisbon), Ramón Astudillo (Unbabel and INESC-ID Lisboa), João Graça (Founder and CTO at Unbabel)
02:00pm - 03:30pm
Workshop | Technologies for MT of Low Resource Languages (LoResMT 2018)

Statistical and neural machine translation (SMT/NMT) methods have been used successfully over the last two decades to build MT systems for many widely spoken languages, with significant improvements in the quality of automatic translation. However, these methods still rely on a number of natural language processing (NLP) tools to pre-process human-generated texts into the forms required as input, and/or to post-process the output into proper textual forms in the target languages.

In many MT systems, the performance of these tools has a great impact on the quality of the resulting translation. However, there is not much discussion of these NLP tools: their methods, their roles in MT systems built with different approaches, and how many of the world’s languages they support. In this workshop, we would like to bring together researchers who work on these topics and review the most important capabilities we will need from these tools for MT in the coming years.

These NLP tools include, but are not limited to, word tokenizers/de-tokenizers, word segmenters, and morphology analysers. We solicit papers dedicated to these supplementary tools, as used with any language and especially with low-resource languages. We would like to gather an overview of these tools from our community. Evaluations of these tools in research papers should include how they improve the quality of MT output.
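As a concrete illustration of the pre-/post-processing step the workshop targets, here is a minimal, assumption-laden tokenizer/de-tokenizer pair. Production pipelines typically use mature tools (e.g. the Moses tokenizer scripts or subword segmenters such as SentencePiece); the rules below are a deliberately simple sketch, and both function names are hypothetical.

```python
# Minimal illustrative pre-/post-processing for MT. Real tokenizers handle
# abbreviations, clitics, numbers, and language-specific rules; this sketch
# only separates punctuation from words and then reattaches it.
import re

def tokenize(text):
    """Pre-processing: split words and punctuation into separate tokens."""
    # \w+ matches word characters; [^\w\s] matches any single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", text, re.UNICODE)

def detokenize(tokens):
    """Post-processing: rebuild a sentence, attaching punctuation leftward."""
    out = ""
    for tok in tokens:
        if re.fullmatch(r"[^\w\s]+", tok) and out:
            out += tok                      # no space before punctuation
        else:
            out += (" " if out else "") + tok
    return out

tokens = tokenize("Hello, world!")   # -> ['Hello', ',', 'world', '!']
restored = detokenize(tokens)        # -> 'Hello, world!'
```

For low-resource languages, even this basic step may be missing or unreliable, which is precisely the gap the workshop aims to discuss.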

Topics of the workshop include, but are not limited to:

  • Research and review papers on pre-processing and/or post-processing NLP tools for MT
  • Position papers on the development of pre-processing and/or post-processing tools for MT
  • Word tokenizers/de-tokenizers for specific languages
  • Word/morpheme segmenters for specific languages
  • Alignment/re-ordering tools for specific language pairs
  • Use of morphology analysers and/or morpheme segmenters for MT
  • Multilingual and/or cross-lingual NLP tools for MT
  • Reusability of existing NLP tools for low-resource languages
  • Corpora curation technologies for low-resource languages
  • Review of available parallel corpora for low-resource languages
  • Research and review papers on MT methods for low-resource languages
  • Fast building of MT systems for low-resource languages
  • Reusability of existing MT systems for low-resource languages
Organizer(s): Chao-Hong Liu (ADAPT Centre, Dublin City University)
03:30pm - 04:00pm
Break

04:00pm - 05:30pm
Workshop | Translation Quality Estimation and Automatic Post-Editing
04:00pm - 05:30pm
Workshop | Technologies for MT of Low Resource Languages (LoResMT 2018)

Get Tickets

What is included

  • Keynote speeches and panel discussions featuring renowned experts in MT;
  • Tutorials and workshops focusing on many aspects of applied MT;
  • Three parallel tracks with presentations by researchers, commercial users, and government representatives;
  • A technology showcase consisting of exhibits of commercial and research systems.

Conference fees

We offer a special discounted rate to GALA 2018 attendees; if you are attending the GALA 2018 conference, please select “I am also attending GALA” on your registration form and then register at the GALA-specific rates. The GALA discount applies to non-member fees only.

There are several types of registration fees:

Read more here

Venue & Accommodations

The AMTA 2018 Conference Hotel is the same as the conference venue:

The Aloft Boston Seaport

Single or double room: $199

Triple: $239

Quad: $279

Reserve your rooms here!

You can see a map of nearby restaurants here.

You can find more about Boston here.

Sponsors