Time: pm - pm Location: 11th Floor Large Conference Room  Abstract: Thanks to massive training data and powerful machine translation techniques, machine translation quality has reached acceptable levels for a handful of languages.
However, for hundreds of other languages, translation quality drops quickly as the amount of available training data shrinks.
That work can take many different forms, from maps to data visualization to video-based projects.
In this talk, we’ll discuss humanities approaches to large-scale text analysis, with a focus on corpora that may be of interest to computer scientists.
Bio: Manuel is a third-year PhD student at Aarhus University in Denmark.
His PhD focuses on applying data mining and machine learning to large collections of unstructured text documents, with the goal of extracting and representing the knowledge embedded in them.
The first part will be a brief overview of Manuel's recent project in abbreviation disambiguation.
Following that, Manuel will give a brief overview of how various NLP methods are used in an industrial setting at a Danish company that provides text analytics services for publishers such as Springer-Nature.
His work focuses on large-scale analysis of social media in disasters.
Time: pm - pm Location: 11th Floor Large Conference Room  Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
The best performing models also connect the encoder and decoder through an attention mechanism.
Bio: Andrew Wallace is a software developer at the UCLA Digital Library.
He received his PhD in Cognitive Science from Brown University in 2011.