Data Analysis in Python and R
Course plans (divided by module)
Sign up here!
Last day to sign up is August 13th, 2023
Learn how to analyze your data efficiently using the scientific tools available in the R and Python programming languages. You can choose among four modules to learn and apply the tools that are most useful to you. For detailed descriptions of the modules, see the course plans and course schedule.
R is a language and environment for statistical computing and graphics. It contains many useful tools for data analysis in water research, particularly for environmental quality data. In the “Analysis of environmental quality data using R” module, you will learn practical skills for analyzing your data in R: the basics of working with data (opening and writing files, carrying out calculations on matrices and data frames), how to choose appropriate statistical hypothesis tests based on the distribution of your data, how to implement those tests in R, and how to make graphics using the powerful ggplot2 package. Finally, you will learn to analyze data sets that include non-detected values. These values are commonly either replaced with a fixed value or eliminated to facilitate statistical analysis, but both approaches are suboptimal and can bias interpretation. In this course you will learn to analyze such data sets using state-of-the-art methods for left-censored data (where the true value is known only to lie below a given level), as included in the package NADA2.
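To illustrate why substituting or dropping non-detects can bias results, here is a minimal sketch. It is written in Python rather than R only so that all examples on this page share one language. The concentrations, the detection limit and the helper `substitute` are hypothetical and chosen purely for illustration; this is not a method from the NADA2 package.

```python
from statistics import mean

# Hypothetical concentrations (mg/L). In a real data set the three values
# below the detection limit would be unknown: we would only know "< 1.0".
true_values = [0.2, 0.4, 0.6, 3.0, 5.0]
detection_limit = 1.0


def substitute(values, limit, fill):
    """Replace every value below `limit` with the fixed value `fill`."""
    return [v if v >= limit else fill for v in values]


# Keep only the detected values (the "eliminate non-detects" approach).
detected = [v for v in true_values if v >= detection_limit]

means = {
    "true": mean(true_values),
    "half_limit": mean(substitute(true_values, detection_limit, detection_limit / 2)),
    "zero": mean(substitute(true_values, detection_limit, 0.0)),
    "dropped": mean(detected),
}

for name, value in means.items():
    print(f"{name:>10}: {value:.2f}")
```

In this toy case every shortcut gives a mean that differs from the true one, and dropping the non-detects more than doubles it. Methods for left-censored data aim to avoid this kind of distortion by using the information that a value lies below the limit.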
In the water sector we frequently encounter continuous data, e.g. time series of precipitation, water levels or conductivity. In the course “Analysis of continuous data with Python”, you will learn how to analyze such data efficiently using three widely used libraries for scientific programming in Python: numpy for fast calculations on large arrays, pandas for organizing and analyzing data, and matplotlib for visualization. Combining these three will allow you to analyze large amounts of data efficiently, e.g. long-term measurements or data from many different locations, and to present the results in a clear and attractive way. Note that prior knowledge of basic programming in Python is required. If you do not have this yet (or if you have programming experience in a different language such as Matlab), you should also take the course “Basic programming in Python”, which teaches the basic concepts of the Python programming language that are necessary for any further work. It is a self-study course assessed through some short exercises, but the responsible teacher is of course available for questions.
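As a rough idea of how the three libraries fit together, the sketch below builds a synthetic hourly water-level series with numpy, summarizes it per day with pandas, and defines an optional matplotlib plot. The data, the column name `water_level_m` and the function `plot_levels` are assumptions made for this example and are not taken from the course material.

```python
import numpy as np
import pandas as pd

# Synthetic hourly water levels (m) over three days with a daily cycle.
# A real analysis would load measurements from a file instead.
hours = np.arange(72)
levels = 2.0 + 0.5 * np.sin(2 * np.pi * hours / 24)

# pandas attaches timestamps to the numpy array.
index = pd.date_range("2023-01-01", periods=72, freq=pd.Timedelta(hours=1))
series = pd.Series(levels, index=index, name="water_level_m")

# Aggregate the hourly values to one row per day.
daily = series.resample("D").agg(["mean", "min", "max"])
print(daily.round(2))


def plot_levels(series, daily):
    """Plot hourly values and daily means; returns the matplotlib figure."""
    # Imported here so the summary above also runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(series.index, series.to_numpy(), label="hourly")
    ax.plot(daily.index, daily["mean"].to_numpy(), "o-", label="daily mean")
    ax.set_xlabel("time")
    ax.set_ylabel("water level (m)")
    ax.legend()
    return fig
```

Calling `plot_levels(series, daily)` and then `savefig` on the returned figure would write the plot to a file. The same resampling step works unchanged for much longer records, which is where the combination of numpy and pandas becomes useful.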