Module Details

Module Code: DATA8001
Title: Data Science and Analytics
Long Title: Data Science and Analytics
NFQ Level: Advanced
Valid From: Semester 1 - 2018/19 ( September 2018 )
Duration: 1 Semester
Credits: 5
Field of Study: 4816 - Data Format
Module Delivered in: 2 programme(s)
Module Description: This module provides the learner with an overview of the important themes in the growing field of data science and analytics. The learner will study the established methods and technologies and also investigate new and emerging trends. Emphasis will be placed on statistical theory, mathematical algorithmic design and modelling concepts. The context and use of data analytics in real-world settings will be investigated through topics such as data privacy, data security and ethics. Data analytics/mining software, e.g. R, will be used in both the lectures and labs.
 
Learning Outcomes
On successful completion of this module the learner will be able to:
# Learning Outcome Description
LO1 Describe the field of data science and analytics, its concepts, technologies and historical roots. Give a detailed overview of the main approaches to developing a data analytics/mining project lifecycle.
LO2 Perform exploratory data analysis using data science/mining software packages.
LO3 Find patterns and solutions within a data set using data mining and/or statistical modelling techniques.
LO4 Describe a number of data mining and business intelligence concepts and techniques.
LO5 Develop a deep understanding of data protection, data privacy and other ethical issues.
Dependencies
Module Recommendations

This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named MTU module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).

Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
No incompatible modules listed
Co-requisite Modules
No Co-requisite modules listed
Requirements

This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol in this module if you have not acquired the learning specified in this section.

No requirements listed
 
Indicative Content
Introduction
Investigate the data science and analytics landscape, its historical development, terminology and technologies; big data concepts, structured and unstructured data types.
Data analytics project life cycle
Use of the CRISP-DM framework to manage a data analytics project with its variety of actors and challenges. Investigate case studies in the field, looking at a variety of approaches and technologies, including successes, failures, new developments and unusual applications of analytics.
Data quality, pre-processing and EDA
Data cleaning/scrubbing techniques, ETL (Extract, Transform, Load) systems and methods; data pre-processing: zero variance, dummy variables, correlations, linear dependencies. Use of exploratory data analysis, summary statistics, plots and visualisations.
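As an illustration of two of the pre-processing steps named above (removing zero-variance columns and creating dummy variables), a minimal sketch follows. It is not part of the module materials: the module names R as its software, whereas this sketch uses Python's standard library, and the function names and toy data are hypothetical. Dropping the first level when dummy-encoding is one common way to avoid the linear dependency that arises when every level gets its own indicator column.

```python
from statistics import pvariance


def drop_zero_variance(columns):
    """Keep only the numeric columns whose values are not all identical."""
    return {name: vals for name, vals in columns.items() if pvariance(vals) > 0}


def dummy_encode(values):
    """One-hot encode a categorical column, dropping the first (sorted) level.

    Dropping one level avoids a linear dependency among the indicator columns.
    """
    levels = sorted(set(values))
    return {
        f"is_{level}": [1 if v == level else 0 for v in values]
        for level in levels[1:]
    }


# Hypothetical toy data, for illustration only.
numeric = {"age": [23, 35, 41, 29], "constant": [1, 1, 1, 1]}
colour = ["red", "blue", "red", "green"]

kept = drop_zero_variance(numeric)   # the "constant" column is removed
dummies = dummy_encode(colour)       # "blue" becomes the reference level
```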
Data analytical techniques
Overview of data mining, regression and classification, pattern recognition, anomaly detection and visualisation techniques. Investigate how these techniques are used in a real-world setting, e.g. profit-testing scenarios, key performance indicators (KPIs), dashboards, balanced scorecards.
Data science and analytics theory
Statistics, sampling theory, MLE (Maximum Likelihood Estimation), overview of statistical learning theory, algorithmic design; characteristics, strengths and weaknesses of models; decision trees and ensemble techniques, e.g. random forests; discuss testing and validation of models.
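To make the MLE topic above concrete, here is a minimal sketch for one textbook case: a normal distribution, where the maximum likelihood estimates have the closed form of the sample mean and the variance computed with divisor n. It is an illustrative assumption, not taken from the module materials; it uses Python rather than the R named in the module, and the function names and sample values are hypothetical.

```python
import math


def normal_mle(sample):
    """Closed-form MLE for a normal distribution: (mean, variance with divisor n)."""
    n = len(sample)
    mu = sum(sample) / n
    sigma2 = sum((x - mu) ** 2 for x in sample) / n
    return mu, sigma2


def normal_log_likelihood(sample, mu, sigma2):
    """Log-likelihood of an i.i.d. normal sample for the given parameters."""
    n = len(sample)
    squared_error = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * sigma2) - squared_error / (2 * sigma2)


# Hypothetical sample, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma2_hat = normal_mle(sample)
best = normal_log_likelihood(sample, mu_hat, sigma2_hat)
```

Evaluating `normal_log_likelihood` at any other parameter values for the same sample gives a result no larger than `best`, which is the defining property of the estimate.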
Data analytics techniques and software technologies
Introduction to various data analytics techniques, methods and predictive models. Explore how to load data and carry out initial data exploration. Use a variety of data analytics technologies, e.g. R and Excel.
Technical report writing
Investigate how to write a technical report - structure and narrative of documents, referencing, bibliography and awareness of expected audience.
Ethics, data privacy and security
Investigate ethics, data privacy, security and data protection legislation, including GDPR, and related topics in data governance.
Module Content & Assessment
Assessment Breakdown (%)
Coursework: 40.00%
End of Module Formal Examination: 60.00%

Assessments

Coursework
Assessment Type: Project
% of Total Mark: 40
Timing: Week 9
Learning Outcomes: 2, 3, 4
Assessment Description
Solve a data analytics problem using R or similar data analytics software and produce a report.
End of Module Formal Examination
Assessment Type: Formal Exam
% of Total Mark: 60
Timing: End-of-Semester
Learning Outcomes: 1, 4, 5
Assessment Description
End-of-Semester Final Examination
Reassessment Requirement
Repeat examination
Reassessment of this module will consist of a repeat examination. It is possible that there will also be a requirement to be reassessed in a coursework element.

The University reserves the right to alter the nature and timings of assessment.

 

Module Workload

Workload: Full Time
Workload Type | Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours
Lecture | Contact | Theory and Case Studies | Every Week | 2.00 | 2
Lab | Contact | Computer-based lab | Every Week | 2.00 | 2
Independent & Directed Learning (Non-contact) | Non Contact | Independent Study | Every Week | 3.00 | 3
Total Hours: 7.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 4.00
Workload: Part Time
Workload Type | Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours
Lecture | Contact | Theory and Case Studies | Every Week | 2.00 | 2
Lab | Contact | Computer-based lab | Every Second Week | 1.00 | 2
Independent Learning | Non Contact | Independent Study | Every Week | 4.00 | 4
Total Hours: 8.00
Total Weekly Learner Workload: 7.00
Total Weekly Contact Hours: 3.00
 
Module Resources
Recommended Book Resources
  • Peter Bruce, Andrew Bruce. (2017), Practical Statistics for Data Scientists, 1st. O'Reilly Media, California, USA, [ISBN: 9781491952962].
  • Foster Provost, Tom Fawcett. (2013), Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking, O'Reilly Media, Cambridge UK, [ISBN: 1449361323].
  • Matthew North. (2012), Data Mining for the Masses, Global Text Project, [ISBN: 0615684378].
Supplementary Book Resources
  • Kabacoff, Robert. (2015), R in Action, 2nd. Manning, New York, [ISBN: 1617291382].
  • Gareth James, Daniela Witten, Trevor Hastie and Robert Tibshirani. (2013), An Introduction to Statistical Learning, Springer-Verlag, New York, [ISBN: 9781461471370].
  • Norman Matloff. (2011), The Art of R Programming, 1st. No Starch Press, San Francisco, [ISBN: 9781593273842].
  • Andy Field, Jeremy Miles. (2012), Discovering Statistics Using SAS, 1st. [ISBN: 1849200920].
  • Efraim Turban, Ramesh Sharda, Dursun Delen. (2011), Decision Support and Business Intelligence Systems, 9th. Pearson Prentice Hall, New Jersey, [ISBN: 013610729X].
Recommended Article/Paper Resources
  • Watson, Hugh. (2011), Business Analytics Insight: Hype or Here to Stay?, Business Intelligence Journal, vol. 16, No. 1, p.1-8.
  • Vijay Khatri, Carol V. Brown. (2010), Designing data governance, Communications of the ACM, vol. 53, No. 1.
Other Resources
 
Module Delivered in
Programme Code | Programme | Semester | Delivery
CR_SDAAN_8 | Higher Diploma in Science in Data Science & Analytics | 1 | Mandatory
CR_SDAAN_9 | Master of Science in Data Science & Analytics | 1 | Mandatory