Data Analysis (with R)
This workshop will be co-hosted by the IRTG and IMPRS-LS.
To sign up, contact Elizabeth (email@example.com).
The online workshop will consist of three components:
- An online, independent eCourse, including videos, an eBook, and exercises.
- Two 2-hour sessions, spread over two days, for group discussions and exercises.
- An individual 30-minute consultation for each student, scheduled across two days.
The dates are:
in February 2022:
- 9 & 11 February, 9:30 - 11:30: two 2-hour live online learning sessions with the whole group
- 14 or 15 February, 9:30 - 13:00: 30-minute 1:1 training sessions with each student
in September:
- 7 & 9 September, 9:30 - 11:30: two 2-hour live online learning sessions with the whole group
- 13 or 14 September, 9:30 - 13:00: 30-minute 1:1 training sessions with each student
Although the presence time for this course is only 4.5 h, commitment is necessary to complete the independent work while the course resources are available! The total time investment for this online course is slightly more than that of a two-day, full-time course, but you can organize most of it at your own convenience. Keep in mind: to get the most out of the course, reserve time during the week for viewing the videos and for independent work. A workplan will be provided with the introductory email.
R is an open-source cross-platform software tool that combines data manipulation, statistical modelling and visualisation.
The Data Analysis workshop enables laboratory-based life scientists to use the R statistical programming environment to analyse their own data. The workshop focuses on data manipulation and biostatistical modelling, using relevant examples from the life sciences.
Using plenty of hands-on exercises, participants will learn about:
- The most common data structures and functions in R,
- How to manage and ask specific questions of their data, and
- How to use the results of statistical tests.
Packages (e.g. the tidyverse) and paradigms (e.g. vectorization) that make R well-suited to data manipulation will be introduced, as well as common beginner pitfalls.
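To give a rough idea of what vectorization means, here is a minimal sketch in base R. The measurement values and variable names are made up for illustration and are not taken from the course material.

```r
# Hypothetical optical density readings and a blank value (invented for illustration)
od_raw <- c(0.12, 0.45, 0.80, 1.10)
blank <- 0.05

# Loop version: subtract the blank from one element at a time
od_loop <- numeric(length(od_raw))
for (i in seq_along(od_raw)) {
  od_loop[i] <- od_raw[i] - blank
}

# Vectorized version: the subtraction is applied to every element in one step
od_vec <- od_raw - blank

# Both approaches give the same corrected values
print(all.equal(od_loop, od_vec))
```

The vectorized form is shorter and is usually the more idiomatic way to write this kind of element-wise calculation in R.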
Basic visualisations will be covered; visualisation is treated in more depth in the separate Data Visualization workshop.
Methods for dealing with missing data will be addressed at various points in the workshop.
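As a small illustration of why missing data needs attention, the following base R sketch shows how a single missing value (NA) affects a summary statistic. The values are invented, and this is only one simple way of handling NA, not necessarily the approach taught in the workshop.

```r
# Hypothetical body weights with one missing measurement (invented for illustration)
weights <- c(21.3, NA, 19.8, 22.1)

# By default, a single NA makes the mean NA as well
mean_with_na <- mean(weights)

# na.rm = TRUE drops missing values before the mean is calculated
mean_without_na <- mean(weights, na.rm = TRUE)

# Count the missing values and keep only the observed ones
n_missing <- sum(is.na(weights))
complete_weights <- weights[!is.na(weights)]

print(mean_with_na)
print(mean_without_na)
print(n_missing)
```

Whether dropping missing values is appropriate depends on why they are missing, so this should be read as a starting point only.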
Approximately one third of class time is dedicated to having the students work on their own datasets under the supervision of the instructor. The goal is to develop data analysis solutions as part of the workshop.
Extra material for specific problems is provided in the reference book, e.g. pattern matching with regular expressions and control structures (such as loops and conditional statements). Participants will be provided with all datasets and with access to the book after the workshop, so that they can continue working on these case studies.