880414: Text mining

General

Language of instruction: English
Format: 12 two-hour lectures, 2 lectures per week (no information on lecture times available)
Assessment: Practical assignments and a final paper (no information on exam dates available)
Level: Master
Study load: 6 ECTS credits
Registration: Register via COMAP. Registration is open from 1 to 20 October.

Lecturer(s)


prof. dr. E.O. Postma


Course objective (available in English only)

This course aims to give students an understanding, both at the conceptual and the technical level, of the development of natural language processing (NLP) applications in the text mining / information extraction area. At the conceptual level, the course introduces machine learning as a powerful generic toolbox for automatically learning NLP modules from data. At the technical level, the course offers hands-on training and experience in building an actual text mining application in which NLP modules contribute to extracting information from text.


Course content (available in English only)

Text mining, also known as 'information extraction from text' or 'knowledge discovery from text', is an IT research and development field that has attracted growing attention over the last decade, drawing researchers from computational linguistics, machine learning (an AI subfield), and information retrieval. Key applications that have emerged from this melting pot include question answering, information extraction, and summarization. This course gives an overview of the field in a practical, hands-on fashion, by first describing and then building modules that perform subtasks in text mining, such as part-of-speech tagging, phrase chunking, relation finding, and named-entity recognition. Students build these modules from basic ingredients (machine learning algorithms and language data) and subsequently integrate them into the larger framework of a text mining application. Using a mix of software tools (ranging from programming from scratch to tuning existing modules), students test and report on the modules they develop.


Required literature

(18-jul-2017)