MTU Courses - COMP9066 - Natural Language Processing

Module Details

Module Code:	COMP9066
Title:	Natural Language Processing
Long Title:	Natural Language Processing
NFQ Level:	Expert
Valid From:	Semester 1 - 2020/21 ( September 2020 )

Duration:	1 Semester

Credits:	5

Field of Study:	4811 - Computer Science

Module Delivered in:	1 programme(s)

Module Description:

Natural language processing (NLP) is a set of statistical and machine learning techniques applied to the analysis and synthesis of natural language and speech. This module will provide learners with a comprehensive introduction to the theory underpinning NLP and will also equip learners with the knowledge to implement and apply NLP algorithms and techniques to real-world problems such as sentiment analysis.

Learning Outcomes
On successful completion of this module the learner will be able to:
#	Learning Outcome Description
LO1	Apply and evaluate a language modelling technique such as n-grams to a natural language processing problem.
LO2	Compare and contrast the use of parsing techniques for context-free grammar problems.
LO3	Develop and evaluate a document classification model using machine learning techniques.
LO4	Implement a machine translation model for real-world data and assess its performance.

Dependencies
Module Recommendations This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named MTU module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).

Incompatible Modules These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
No incompatible modules listed
Co-requisite Modules
No Co-requisite modules listed
Requirements This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol on this module if you have not acquired the learning specified in this section.
No requirements listed

Indicative Content
Language Modelling Introduction to natural language processing and language models. N-gram modelling, the Markov assumption and maximum likelihood estimation. Evaluating language models, perplexity, generalization, smoothing techniques and dealing with unknown words. Hidden Markov models and part-of-speech tagging.
Parsing for NLP Context-free grammars. Syntactic parsing. Structural, attachment and coordination ambiguity. Handling structural ambiguities using the CKY algorithm. Statistical parsing, probabilistic context-free grammars (PCFGs) for disambiguation. Learning PCFG rule probabilities. Dependency parsing. Dependency grammars and typed dependency structure.
Machine Learning Document classification using machine learning techniques such as naive Bayes (multinomial and Bernoulli models), support vector machines, logistic regression and neural networks (embedding dense word vectors).
Machine Translation (MT) Introduction to linguistic knowledge. Rule-based MT (transfer-based MT and interlingual MT). Statistical MT (word and phrase-based translation). Neural MT and vector-based representations. MT evaluation metrics (WER, BLEU and TER).
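
As an illustration of one of the evaluation metrics listed above, the following is a minimal sketch of word error rate (WER), assuming whitespace tokenisation and a single reference translation. The function name word_error_rate is hypothetical and is not taken from the module materials or from any particular library. WER is computed here as the word-level edit distance (substitutions, deletions and insertions) divided by the number of words in the reference.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative sketch only: assumes whitespace tokenisation and
    a single reference sentence.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference; lower values indicate a closer match.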

Module Content & Assessment
Assessment Breakdown	%
Coursework	100.00%

Assessments

Coursework

Assessment Type	Project	% of Total Mark	50
Timing	Week 8	Learning Outcomes	1,2,3
Assessment Description Build a language model and use it in a given natural language processing application such as text generation. Produce a report that critically analyses the performance of the model.

Assessment Type	Project	% of Total Mark	50
Timing	Week 12	Learning Outcomes	3,4
Assessment Description Implement a machine learning model, such as a neural model with vector-based representations, for a task such as machine translation or question answering. Assess the performance of the model using standard metrics such as BLEU or WER.

No End of Module Formal Examination

Reassessment Requirement
Coursework Only This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.

The University reserves the right to alter the nature and timings of assessment

Module Workload

Workload: Full Time
Workload Type	Contact Type	Workload Description	Frequency	Average Weekly Learner Workload	Hours
Lecture	Contact	Delivers the concepts and theories underpinning the learning outcomes.	Every Week	2.00	2
Lab	Contact	Application of learning to case studies and project work.	Every Week	2.00	2
Independent Learning	Non Contact	Student undertakes independent study. The student reads recommended papers and practices implementation.	Every Week	3.00	3
Total Hours					7.00
Total Weekly Learner Workload					7.00
Total Weekly Contact Hours					4.00

Workload: Part Time
Workload Type	Contact Type	Workload Description	Frequency	Average Weekly Learner Workload	Hours
Lecture	Contact	Delivers the concepts and theories underpinning the learning outcomes.	Every Week	2.00	2
Lab	Contact	Application of learning to case studies and project work.	Every Week	2.00	2
Independent Learning	Non Contact	Student undertakes independent study. Student reads recommended papers and practices implementation.	Every Week	3.00	3
Total Hours					7.00
Total Weekly Learner Workload					7.00
Total Weekly Contact Hours					4.00

Module Resources
Recommended Book Resources
N. Hardeniya, J. Perkins, D. Chopra, N. Joshi, I. Mathur. (2016), Natural Language Processing: Python and NLTK, 1st. Packt Publishing, [ISBN: 9781787285101].
H. Lane, C. Howard, H. Hapke. (2017), Natural Language Processing in Action: Understanding, analyzing, and generating text with Python, 1st. Manning Publications, [ISBN: 9781617294631].
Supplementary Book Resources
C. Manning, H. Schütze. (1999), Foundations of Statistical Natural Language Processing, 4th. MIT Press, [ISBN: 9780262133609].
Recommended Article/Paper Resources
E. Cambria, B. White. (2014), Jumping NLP Curves: A Review of Natural Language Processing Research, IEEE Computational Intelligence Magazine, 9, http://ieeexplore.ieee.org/document/6786458/
Supplementary Article/Paper Resources
J. Lafferty, A. McCallum, F. Pereira. (2001), Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, International Conference on Machine Learning, http://repository.upenn.edu/cgi/viewcontent.cgi?article=1162&context=cis_papers
This module does not have any other resources

Module Delivered in
Programme Code	Programme	Semester	Delivery
CR_KARIN_9	Master of Science in Artificial Intelligence	2	Elective
