Department of Computer Science and Technology

 
Principal lecturer: 
Dr Weiwei Sun
Other lecturers: 
Dr Andrew Caines
Prof Paula Buttery
Students: 
MPhil ACS, Part III
Course code: 
L90
Prerequisites: 
No prerequisites beyond those topics covered in an undergraduate CS degree. This course is a prerequisite for L95: Introduction to Natural Language Syntax and Parsing.
Hours: 
18
Class limit: 
10

Aims

This course introduces the fundamental techniques of natural language processing. It aims to explain the potential and the main limitations of these techniques. Some current research issues are introduced and some current and potential applications discussed and evaluated. Students will also be introduced to practical experimentation in natural language processing.

Syllabus

  • Introduction. Brief history of NLP research, some current applications, components of NLP systems.
  • Finite-state techniques. Inflectional and derivational morphology, finite-state automata in NLP, finite-state transducers.
  • Prediction and part-of-speech tagging. Corpora, simple N-grams, word prediction, stochastic tagging, evaluating system performance.
  • Context-free grammars and parsing. Generative grammar, context-free grammars, parsing with context-free grammars, weights and probabilities. Some limitations of context-free grammars.
  • Dependency structures. English as an outlier. Universal dependencies. Introduction to dependency parsing.
  • Compositional semantics. Logical representations. Compositional semantics and lambda calculus. Inference and robust entailment. Negation.
  • Lexical semantics. Semantic relations, WordNet, word senses.
  • Distributional semantics. Representing lexical meaning with distributions. Similarity metrics.
  • Distributional semantics and deep learning. Embeddings. Grounding. Multimodal systems and visual question answering.
  • Discourse processing. Anaphora resolution, summarization.
  • Language generation and regeneration. Components of a generation system. Generation of referring expressions.
  • Recent NLP research. Some recent NLP research.
  • Practical on information extraction.

Objectives

On completion of this module, students should:

  • be able to discuss the current and likely future performance of several NLP applications;
  • be able to describe briefly a fundamental technique for processing language for several subtasks, such as morphological processing, parsing and word sense disambiguation;
  • understand how these techniques draw on and relate to other areas of computer science;
  • understand the basic principles of designing and running an NLP experiment.

Coursework

Undertake 2 ticked exercises as part of practical sessions on information extraction.

Write a 4,000-word report including results from an extended information extraction experiment.

Practical work

Build and evaluate an NLP system.

Assessment

Assessment will be based on the practicals:

  • First practical exercise (10%, ticked)
  • Second practical exercise (10%, ticked)
  • Final report (80%, 4,000 words, excluding references)

Recommended reading

Jurafsky, D. and Martin, J. (2008). Speech and language processing. Prentice Hall (specific chapter references will be provided in the lecture notes).

Although the lectures don't assume any exposure to linguistics, the course will be easier to follow if students have some understanding of basic linguistic concepts. The following may be useful for this: The Internet Grammar of English

Further Information

Due to COVID-19, the method of teaching for this module will be adjusted to cater for physical distancing and students who are working remotely. We will confirm precisely how the module will be taught closer to the start of term.

  • Current Cambridge undergraduate students who are continuing onto Part III or the MPhil in Advanced Computer Science may only take this module if they did NOT take it as a Unit of Assessment in Part II.
  • The class limit is 10 MPhil / Part III students with the practical assessed by the Department of Computer Science and Technology.
  • Students from other departments may attend the lectures for this module if space allows. However, students wanting to take it for credit will need to make arrangements for assessment within their own department.