880221: Natural Language Processing (HAIT)


Language of instruction: English
Format: 14 x 2 hours, seminar-style lectures (no information on lecture times available)
Assessment: Assignments and written exam
Study load: 6 ECTS credits
Enrolment: Enrol via COMAP. Enrolment is open from August 1 to August 20.
dr. M.M. van Zaanen

Course objectives

At the end of the unit, students will be able to:

  • describe and use natural language processing components at the lexical, syntactic, and semantic levels, from both a functional and a technical point of view;
  • analyze and design complex natural language processing end-user systems built from lower-level language processing components;
  • compare different approaches (such as symbolic or probabilistic approaches) to natural language processing tasks, identifying the pros and cons of each.

Course content

Manually analyzing large collections of text, such as newspaper articles, blogs, or tweets, quickly becomes infeasible because of the sheer amount of data. Content analysis of social media data, for instance, often requires linguistic analysis in order to identify and extract useful information. Techniques that automate this process are called natural language processing techniques.

Natural language processing comprises a vast collection of tasks, algorithms, and theoretical frameworks that, at various levels, aim to make human language understandable to computers. In order to build working computer systems that can automatically process natural language, it is essential to have a thorough understanding of how these ingredients work. During the course, students acquire this knowledge through both theoretical study of techniques and practical experience with basic language processing systems.

Topics covered in the course include, among others:

  • Syntactic annotation of language,
  • Computational semantics,
  • User-oriented applications, such as machine translation or sentiment analysis.

Additional information

The final grade is based on the grade for the written exam (80%) and the grades for two individual assignments (10% each). Assignment deadlines are non-negotiable. Assignments handed in after the deadline will not be accepted and will result in a fail for the course.
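As an illustration of the weighting described above, the calculation can be sketched as follows. This is a minimal sketch, not an official grading tool: the function name `final_grade`, the use of `None` to represent an assignment that was not accepted, and the absence of any rounding rule are assumptions made for the example.

```python
from typing import Optional

# Weights as stated in the course description.
EXAM_WEIGHT = 0.80
ASSIGNMENT_WEIGHT = 0.10  # applies to each of the two assignments


def final_grade(
    exam: float,
    assignment_1: Optional[float],
    assignment_2: Optional[float],
) -> Optional[float]:
    """Return the weighted final grade, or None for a fail.

    Assumption: an assignment that was handed in late (and therefore not
    accepted) is passed in as None, which results in a fail for the course.
    """
    if assignment_1 is None or assignment_2 is None:
        return None
    return EXAM_WEIGHT * exam + ASSIGNMENT_WEIGHT * (assignment_1 + assignment_2)


if __name__ == "__main__":
    # Hypothetical grades: exam 7.0, assignments 8.0 and 6.0.
    print(final_grade(7.0, 8.0, 6.0))
    # A late (not accepted) first assignment leads to a fail.
    print(final_grade(7.0, None, 6.0))
```

With the hypothetical grades above, the exam contributes 5.6 points and the two assignments together contribute 1.4 points.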

Students are expected to present part of the course material during lectures.

Required literature

  1. Daniel Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 2nd edition, Prentice Hall, Upper Saddle River, New Jersey, 2009, ISBN 978-0-13-504196-3.