Module Details

Module Code: STAT9006
Title: Data Analysis with R
Long Title: Data Analysis with R
NFQ Level: Expert
Valid From: Semester 2 - 2017/18 ( January 2018 )
Duration: 1 Semester
Credits: 10
Field of Study: 4620 - Statistics
Module Delivered in: 1 programme(s)
Module Description: This module provides the learner with advanced training in statistical methods with R, relevant to data analysis in the design and planning of experiments as part of the research process. It will address statistical methodologies and applications to research.
 
Learning Outcomes
On successful completion of this module the learner will be able to:
# Learning Outcome Description
LO1 Use statistics to reduce complex data situations to manageable formats in order to describe, explain or model them.
LO2 Derive descriptive statistics for various data types using R.
LO3 Perform and critique statistical tests on two-sample data using R.
LO4 Set up and critically analyse data sets in both a parametric and non-parametric way for two and more samples using R.
LO5 Use multiple regression with R and other advanced statistical techniques to allow prediction of a score on one variable on the basis of the scores on several other variables.
LO6 Communicate research findings effectively in a clear, concise manner, using correct terminology based on output from R.
Dependencies
Module Recommendations

This is prior learning (or a practical skill) that is strongly recommended before enrolment in this module. You may enrol in this module if you have not acquired the recommended learning but you will have considerable difficulty in passing (i.e. achieving the learning outcomes of) the module. While the prior learning is expressed as named MTU module(s) it also allows for learning (in another module or modules) which is equivalent to the learning specified in the named module(s).

Incompatible Modules
These are modules which have learning outcomes that are too similar to the learning outcomes of this module. You may not earn additional credit for the same learning and therefore you may not enrol in this module if you have successfully completed any modules in the incompatible list.
No incompatible modules listed
Co-requisite Modules
No Co-requisite modules listed
Requirements

This is prior learning (or a practical skill) that is mandatory before enrolment in this module is allowed. You may not enrol in this module if you have not acquired the learning specified in this section.

No requirements listed
 
Indicative Content
Overview
Statistics fills the crucial gap between information and knowledge. Society cannot be run effectively on the basis of hunches or trial and error. This topic highlights which statistics to use, why to use those statistics, and when to use them.
Introduction to data analysis
By using appropriate descriptive statistics, it is possible to make sense of the data collected and to tell a research story coherently and with justification. This entails deriving the correct measures of centrality and variation, where applicable, and interpreting bar charts, pie charts, histograms, stem-and-leaf plots and boxplots.
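As an illustration of the measures and plots named above, the following is a minimal sketch in base R. It assumes the built-in ToothGrowth data set as a stand-in for research data; the choice of data set and variable names is an assumption made for this example, not part of the module specification.

```r
# Built-in data set: tooth length (len) under two supplement types (supp).
# Used here only as an assumed example data set.
data(ToothGrowth)
len <- ToothGrowth$len

# Measures of centrality and variation for the whole sample
centre <- c(mean = mean(len), median = median(len))
spread <- c(sd = sd(len), IQR = IQR(len), range = diff(range(len)))
print(centre)
print(spread)

# The same kind of summary broken down by group
by_supp <- aggregate(len ~ supp, data = ToothGrowth,
                     FUN = function(x) c(mean = mean(x), sd = sd(x)))
print(by_supp)

# Graphical summaries of the kind listed in the indicative content
hist(len, main = "Tooth length", xlab = "Length")
boxplot(len ~ supp, data = ToothGrowth, ylab = "Length")
stem(len)
barplot(table(ToothGrowth$supp), main = "Observations per supplement")
```

The mean and standard deviation suit roughly symmetric data, while the median and interquartile range are usually the safer choice when the histogram or boxplot shows skew or outliers.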
Statistical inference (two samples)
Understand the idea behind hypothesis testing through worked examples of tests of normality, differences and relationships with various types of data, i.e., independent and related t-tests; Mann-Whitney and Wilcoxon tests; Pearson and Spearman rank correlation.
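The tests listed above can be sketched with functions from R's stats package. This is an illustrative example only: the built-in ToothGrowth, sleep and mtcars data sets are assumed stand-ins, and the Shapiro-Wilk test is one common choice of normality test rather than one the module prescribes.

```r
data(ToothGrowth)
data(sleep)
data(mtcars)

# Normality check within each group (Shapiro-Wilk, an assumed choice)
shapiro_by_group <- tapply(ToothGrowth$len, ToothGrowth$supp,
                           function(x) shapiro.test(x)$p.value)

# Differences, independent samples:
# t-test (Welch version by default) and Mann-Whitney test
t_ind <- t.test(len ~ supp, data = ToothGrowth)
mw <- wilcox.test(len ~ supp, data = ToothGrowth, exact = FALSE)

# Differences, related samples: the same 10 subjects measured under two drugs
drug1 <- sleep$extra[sleep$group == 1]
drug2 <- sleep$extra[sleep$group == 2]
t_rel <- t.test(drug1, drug2, paired = TRUE)
w_rel <- wilcox.test(drug1, drug2, paired = TRUE, exact = FALSE)

# Relationships: Pearson and Spearman rank correlation
r_p <- cor.test(mtcars$mpg, mtcars$wt, method = "pearson")
r_s <- cor.test(mtcars$mpg, mtcars$wt, method = "spearman", exact = FALSE)

print(shapiro_by_group)
print(t_ind)
print(t_rel)
print(r_p)
```

Setting exact = FALSE requests the normal approximation, which avoids warnings when the data contain ties. Each pair above matches a parametric test with its rank-based counterpart, so the normality check can guide which of the two to report.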
Multi-variable analysis
Set up and analyse various data sets in both a parametric and a non-parametric way. Where data do not meet the assumptions of parametric tests, suitable data transformations will be investigated prior to the use of parametric tests. One-way Analysis of Variance (ANOVA) with suitable post hoc testing. Between-subjects and within-subjects factorial experiments. Investigation of the effect size of a result and the power of a test.
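A minimal sketch of these analyses in base R follows. The built-in PlantGrowth, ToothGrowth and sleep data sets are assumptions made for illustration, as are the choices of Tukey's HSD as the post hoc test, the log transformation, and eta squared as the effect size.

```r
data(PlantGrowth)   # plant weight under three conditions, 10 plants each
data(ToothGrowth)
data(sleep)

# One-way ANOVA (parametric) and Kruskal-Wallis (non-parametric)
fit <- aov(weight ~ group, data = PlantGrowth)
tab <- summary(fit)[[1]]
kw <- kruskal.test(weight ~ group, data = PlantGrowth)

# Post hoc pairwise comparisons (Tukey's HSD, an assumed choice)
posthoc <- TukeyHSD(fit)

# A transformation that may be tried before re-running the parametric test
fit_log <- aov(log(weight) ~ group, data = PlantGrowth)

# Effect size: eta squared = SS_between / SS_total
ss <- tab[["Sum Sq"]]
eta_sq <- ss[1] / sum(ss)

# Power of the one-way design, using the observed variances
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
pw <- power.anova.test(groups = 3, n = 10,
                       between.var = var(group_means),
                       within.var = tab[["Mean Sq"]][2],
                       sig.level = 0.05)

# Between-subjects factorial design: supplement type crossed with dose
fit_between <- aov(len ~ supp * factor(dose), data = ToothGrowth)

# Within-subjects design: every subject (ID) is measured under both drugs
fit_within <- aov(extra ~ group + Error(ID / group), data = sleep)

print(tab)
print(posthoc)
print(eta_sq)
print(pw)
```

Power computed from the observed variances is descriptive only; when planning an experiment, power.anova.test would normally be run with n left unspecified and a target power supplied, so that it returns the required group size.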
Multiple regression
Scatterplots and partial regression plots. Test for homoscedasticity. Detect multicollinearity and outliers. Check that the residuals (errors) are approximately normally distributed. Interpret regression equations and use them to make predictions.
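The diagnostic steps above can be sketched in base R as follows. The built-in mtcars data set, the choice of predictors, the 4/n cut-off for Cook's distance and the values in new_car are all assumptions made for illustration. Variance inflation factors are computed by hand here; add-on packages offer ready-made versions and formal tests of homoscedasticity, which this sketch does not rely on.

```r
data(mtcars)
predictors <- c("wt", "hp", "disp")
fit <- lm(mpg ~ wt + hp + disp, data = mtcars)

# Scatterplot matrix of the response and the predictors
pairs(mtcars[, c("mpg", predictors)])

# Partial regression (added-variable) plot for wt:
# residuals of mpg and of wt, each after removing the other predictors
res_y <- resid(lm(mpg ~ hp + disp, data = mtcars))
res_x <- resid(lm(wt ~ hp + disp, data = mtcars))
partial_fit <- lm(res_y ~ res_x)
plot(res_x, res_y, xlab = "wt | others", ylab = "mpg | others")
abline(partial_fit)

# Homoscedasticity: residuals against fitted values should show no funnel
plot(fitted(fit), resid(fit), xlab = "Fitted", ylab = "Residuals")
abline(h = 0, lty = 2)

# Multicollinearity: variance inflation factor, VIF_j = 1 / (1 - R_j^2)
vif <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  r2 <- summary(lm(reformulate(others, response = p), data = mtcars))$r.squared
  1 / (1 - r2)
})

# Outliers and influential points: Cook's distance with an assumed 4/n cut-off
cooks <- cooks.distance(fit)
flagged <- names(cooks)[cooks > 4 / nrow(mtcars)]

# Normality of the residuals
qqnorm(resid(fit))
qqline(resid(fit))
sw <- shapiro.test(resid(fit))

# Interpret the equation and predict for a hypothetical new observation
new_car <- data.frame(wt = 3.0, hp = 120, disp = 200)
pred <- predict(fit, newdata = new_car, interval = "prediction")

print(coef(fit))
print(vif)
print(flagged)
print(pred)
```

The slope of the line in the partial regression plot equals the coefficient of wt in the full model, which is why the plot shows the contribution of one predictor after adjusting for the others.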
R as a statistical programming language
R will be used to turn raw data into insight, knowledge, and understanding. Packages used will include ggplot2 for declaratively creating graphics; dplyr for data manipulation; tidyr for a set of functions that help you get to tidy data; readr for a fast and friendly way to read rectangular data; and purrr for a complete and consistent set of tools for working with functions and vectors.
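To show how the five packages named above fit together, here is a small sketch of one possible workflow. It assumes the packages are installed; the before/after scores are made-up values written to a temporary file purely for illustration, and all object names are hypothetical.

```r
library(readr)
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)

# Made-up example data, written to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,before,after",
             "1,5.1,6.0",
             "2,4.8,5.5",
             "3,6.2,6.1",
             "4,5.5,6.4"), csv_path)

# readr: read rectangular data (col_types = cols() guesses types quietly)
wide <- read_csv(csv_path, col_types = cols())

# tidyr: reshape to tidy form, one row per subject and occasion
long <- pivot_longer(wide, cols = c(before, after),
                     names_to = "occasion", values_to = "score")

# dplyr: grouped summary statistics
summary_tbl <- long %>%
  group_by(occasion) %>%
  summarise(mean_score = mean(score), sd_score = sd(score), .groups = "drop")

# purrr: apply a function to each column and return a numeric vector
col_means <- map_dbl(wide[c("before", "after")], mean)

# ggplot2: declare the mapping from data to graphic, then add a layer
p <- ggplot(long, aes(x = occasion, y = score)) +
  geom_boxplot() +
  labs(x = "Occasion", y = "Score")

print(summary_tbl)
print(col_means)
print(p)
```

Reshaping to the long format first is a design choice: grouped summaries and ggplot2 graphics both expect one observation per row, so the tidy form serves the later steps without further rearrangement.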
Module Content & Assessment
Assessment Breakdown %
Coursework 100.00%

Assessments

Coursework
Assessment Type Practical/Skills Evaluation % of Total Mark 20
Timing Week 5 Learning Outcomes 1,2,6
Assessment Description
Use the appropriate descriptive statistics tools in R to suggest whether a difference exists between the two measurements. Summarise all results in a concise report.
Assessment Type Practical/Skills Evaluation % of Total Mark 30
Timing Week 7 Learning Outcomes 3,6
Assessment Description
Using functions in R, test the significance of hypotheses that differences exist between two measurements. Summarise results in a concise report.
Assessment Type Practical/Skills Evaluation % of Total Mark 30
Timing Week 10 Learning Outcomes 4,6
Assessment Description
Using R, test the significance of a hypothesis that a difference exists between more than two measurements. Apply posthoc tests to determine the presence of statistical differences. Summarise results in a concise report.
Assessment Type Practical/Skills Evaluation % of Total Mark 20
Timing Sem End Learning Outcomes 5,6
Assessment Description
Using the appropriate functions in R, test the relationship between a response variable and multiple explanatory variables. Summarise results in a concise report.
No End of Module Formal Examination
Reassessment Requirement
Coursework Only
This module is reassessed solely on the basis of re-submitted coursework. There is no repeat written examination.

The University reserves the right to alter the nature and timings of assessment

 

Module Workload

Workload: Full Time
Workload Type Contact Type Workload Description Frequency Average Weekly Learner Workload Hours
Lecture Contact Statistical theory and application Every Second Week 2.00 4
Lecturer-Supervised Learning (Contact) Contact Laboratory workshops Every Second Week 2.00 4
Independent & Directed Learning (Non-contact) Non Contact Data analysis Every Week 10.00 10
Total Hours 18.00
Total Weekly Learner Workload 14.00
Total Weekly Contact Hours 4.00
Workload: Part Time
Workload Type Contact Type Workload Description Frequency Average Weekly Learner Workload Hours
Lecture Contact Statistical theory and application Every Second Week 1.50 3
Lecturer-Supervised Learning (Contact) Contact Laboratory workshops Every Second Week 1.50 3
Independent & Directed Learning (Non-contact) Non Contact No Description Every Week 11.00 11
Total Hours 17.00
Total Weekly Learner Workload 14.00
Total Weekly Contact Hours 3.00
 
Module Resources
Recommended Book Resources
  • Tony Fischetti. (2015), Data Analysis with R, Packt Publishing, [ISBN: 978-178528814].
  • Hadley Wickham, Garrett Grolemund. (2016), R for Data Science: Import, Tidy, Transform, Visualize, and Model Data, 1st ed., O'Reilly Media, [ISBN: 1491910399].
  • Peter Dalgaard. (2008), Introductory Statistics with R, Springer New York, [ISBN: 9780387790534].
Supplementary Book Resources
  • Michael J. Crawley. (2012), The R Book, Wiley-Blackwell, [ISBN: 978-047097392].
  • Andrew P. Beckerman, Dylan Childs and Owen Petchey. (2017), Getting Started With R, Oxford University Press, [ISBN: 978-019878784].
  • Tadhg L. O'Shea. (2013), Essential Statistics for Researchers, [ISBN: 978-0-9575059-0-2].
This module does not have any article/paper resources
Other Resources
 
Module Delivered in
Programme Code Programme Semester Delivery
CR_HRDPR_9 Postgraduate Certificate in Research Development & Practice 1 Elective