Module Details
Module Code: | STAT9004 |
Title: | Statistical Data Analysis |
Long Title: | Statistical Data Analysis |
NFQ Level: | Expert |
Valid From: | Semester 2 - 2024/25 (January 2025) |
Field of Study: | 4620 - Statistics |
Module Description: | In this module, the learner will study statistical techniques, with particular emphasis on linear models. Statistical analytical software such as R will be used in the labs. |
Learning Outcomes |
On successful completion of this module the learner will be able to: |
# | Learning Outcome Description |
LO1 | Explore data sets and establish a data analysis protocol for data science problems. |
LO2 | Explain and apply the statistical concepts relevant to experimental design and data analysis with an emphasis on large data sets. |
LO3 | Build and validate statistical models with continuous response variables and multiple predictors (both categorical and continuous) using ANOVA, multiple regression and ANCOVA. |
LO4 | Distinguish between parametric and non-parametric methods and decide when the most commonly used non-parametric methods should be applied. |
LO5 | Build and validate statistical models with categorical response variables using logistic regression. |
LO6 | Interpret the results of statistical analyses performed by a software package or presented in research papers. |
Dependencies |
Module Recommendations
This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named MTU module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).
|
|
Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
|
No incompatible modules listed |
Co-requisite Modules
|
No Co-requisite modules listed |
Requirements
This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol in this module if you have not acquired the learning specified in this section.
|
No requirements listed |
Indicative Content |
Data Analysis Protocol
Exploratory data analysis: graphical and numerical methods to explore categorical and continuous data sets, outlier detection, missing values, testing of assumptions and transformation of variables. Model fitting and model interpretation. Model diagnostics.
|
Design of Experiments
Observational versus experimental data. The fundamentals of experimental design. Analysis of variance. Factorial design. Statistical power and multiple comparisons. Non-parametric alternatives.
|
Multiple Regression
Assumptions, collinearity, interpreting coefficients, model fitting, model diagnostics, confidence intervals for coefficients. Analysis of covariance (ANCOVA).
|
Generalised Linear Models
Definition of a generalised linear model: link functions. Overview of different types of generalised linear models and their uses, with a focus on logistic regression for binary data.
|
Software analysis
SPSS, R, Excel
|
Module Content & Assessment
|
Assessment Breakdown | % |
Coursework | 100.00% |
Assessments
No End of Module Formal Examination |
Reassessment Requirement |
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.
|
The University reserves the right to alter the nature and timings of assessment.
Module Workload
Workload: Full Time |
Workload Type | Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours |
Lecture | Contact | Formal lectures describing the theory underpinning the statistical techniques covered by the learning outcomes. | Every Week | 2.00 | 2 |
Lab | Contact | A series of laboratory exercises where the student will use a statistical software package to analyse data sets using the statistical techniques covered by the learning outcomes. | Every Week | 2.00 | 2 |
Independent Learning | Non Contact | Independent learning | Every Week | 3.00 | 3 |
Total Hours | 7.00 |
Total Weekly Learner Workload | 7.00 |
Total Weekly Contact Hours | 4.00 |
Workload: Part Time |
Workload Type | Contact Type | Workload Description | Frequency | Average Weekly Learner Workload | Hours |
Lecture | Contact | Formal lectures describing the theory underpinning the statistical techniques covered by the learning outcomes. | Every Week | 1.50 | 1.5 |
Lab | Contact | A series of laboratory exercises where the student will use a statistical software package to analyse data sets using the statistical techniques covered by the learning outcomes. | Every Week | 1.50 | 1.5 |
Lecturer Supervised Learning (Non-contact) | Non Contact | Lecturer Supervised Learning | Every Week | 4.00 | 4 |
Total Hours | 7.00 |
Total Weekly Learner Workload | 7.00 |
Total Weekly Contact Hours | 3.00 |
Module Resources
|
Recommended Book Resources |
---|
- Frank Harrell. (2015), Regression Modeling Strategies, 2. Springer International Publishing, [ISBN: 9783319194257].
- Peter Dalgaard. (2008), Introductory Statistics with R, Springer, New York, [ISBN: 9780387790534].
| Supplementary Book Resources |
---|
- Michael J. Crawley. (2012), The R Book, Wiley-Blackwell, [ISBN: 9780470973929].
- Annette J. Dobson. (2018), An Introduction to Generalized Linear Models, Second Edition, 4. Chapman and Hall, [ISBN: 9781138741515].
| Recommended Article/Paper Resources |
---|
- Christodoulou E. et al. (2019), A systematic review shows no performance benefit of machine learning over logistic regression for clinical prediction models, Journal of Clinical Epidemiology, Vol. 110, p. 12.
- Gary Smith. (2018), Step away from stepwise, Journal of Big Data, Vol. 5.
- Lang T. A. & Altman D. G. (2015), Basic statistical reporting for articles published in biomedical journals: the "Statistical Analyses and Methods in the Published Literature" or the SAMPL Guidelines, Int J Nurs Stud., 52(1).
| Supplementary Article/Paper Resources |
---|
- Collins G. S. et al. (2024), TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods, British Medical Journal, 385.
- Wynants L. et al. (2020), Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal, British Medical Journal, 369.
| This module does not have any other resources |
---|
|