Ungoliant: An Optimized Pipeline for the Generation of a Very Large-Scale Multilingual Web Corpus
I’m a research engineer in the ALMAnaCH research team at Inria.
I’m a postdoctoral researcher at the Data and Web Science Group at the University of Mannheim.
Inria Senior Researcher, DARIAH EU infrastructure, director, ISO/TC 37 chair
Inria Senior Researcher in Natural Language Processing and Computational Linguistics