This course provides an applied introduction to modern and flexible statistical techniques for modeling demographic data. Traditional demographic methods tend to either apply a large number of parameters or impose strong parametric assumptions. In this course you will learn to master flexible models to extract the most from your data with the fewest assumptions.
Smoothing the relationship between two variables (e.g. life expectancy and GDP per capita) is the simplest example where no prior knowledge of their relationship is assumed. However, more complex examples are frequent in demography. Examples include the pattern of mortality at different ages and/or at different time points by sex and cause of death; the fertility pattern across ages, cohorts and parity; spatial patterns of demographic phenomena; and non-linear effects of age or income on specific health outcomes. Moreover, several population patterns are intrinsically continuous. It thus seems natural to model them by smooth functions, which can in practice be treated as continuous curves in, for instance, decomposition and rate-of-change calculations.
The course will start with an overview of generalized linear models (log-linear and logistic models). P-splines will then be presented as the most suitable and clear-cut smoothing approach for demographic data. This class of models can be easily generalized to more complex data structures (multi-dimensional and spatial data) and adapted to meet specific needs (forecasting and specialized smoothing). The course will finish with a demographic perspective on Generalized Additive Models and distributional regression approaches such as GAMLSS.
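To give a flavour of what a P-spline fit involves, here is a minimal sketch in R, not taken from the course material. It smooths simulated (not real) death counts over age using a cubic B-spline basis, a second-order difference penalty and a penalized Poisson regression with log exposure as offset. The data, the number of segments and the fixed smoothing parameter `lambda` are all illustrative assumptions; in practice `lambda` would be chosen by a criterion such as AIC, BIC or cross-validation.

```r
library(splines)  # ships with R; provides splineDesign()

# Simulated (hypothetical) data: death counts by age with a Gompertz-like hazard
set.seed(1)
age      <- 0:90
exposure <- rep(10000, length(age))
log_rate <- -8 + 0.07 * age
deaths   <- rpois(length(age), exposure * exp(log_rate))

# Cubic B-spline basis on equally spaced knots covering the age range
nseg  <- 15
deg   <- 3
dx    <- (max(age) - min(age)) / nseg
knots <- seq(min(age) - deg * dx, max(age) + deg * dx, by = dx)
B     <- splineDesign(knots, age, ord = deg + 1, outer.ok = TRUE)

# Second-order difference penalty on adjacent B-spline coefficients
D      <- diff(diag(ncol(B)), differences = 2)
lambda <- 100  # assumed value, fixed here only for illustration

# Penalized iteratively reweighted least squares for the Poisson model
# deaths ~ Poisson(exposure * exp(B %*% a))
eta <- log((deaths + 1) / exposure)           # starting values for the log-rates
for (it in 1:50) {
  mu      <- exposure * exp(eta)              # expected deaths
  z       <- eta + (deaths - mu) / mu         # working response
  BtW     <- t(B * mu)                        # t(B) %*% diag(mu)
  a       <- solve(BtW %*% B + lambda * crossprod(D), BtW %*% z)
  eta_new <- as.vector(B %*% a)
  if (max(abs(eta_new - eta)) < 1e-8) break   # stop when the fit has stabilized
  eta <- eta_new
}
fitted_rate <- exp(eta_new)                   # smoothed death rates by age
```

A call such as `plot(age, log(deaths / exposure)); lines(age, log(fitted_rate))` compares the raw and smoothed log-rates. Increasing `lambda` pulls the fit towards a straight line on the log scale, while decreasing it lets the curve follow the observed rates more closely.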
While we will focus on the few theoretical concepts that underpin the more detailed literature, this will be a hands-on course. We will emphasize the use of modern software such as R for implementing the approaches presented on relevant demographic datasets. Furthermore, lab sessions will help to reveal what is going on “under the hood”. By the end of the course, smoothing won’t be seen as a mere black box, but as a modern statistical tool for exploring and modeling population data as effectively as possible.
Each of the five course days will consist of:
- a lecture session from 10:00-12:00
- a lab session from 14:00-16:00
The course is aimed at non-statisticians and will introduce all concepts from the basics. However, elementary knowledge of demographic analysis (e.g. construction of a life table) and statistics (e.g. regression) is required. Familiarity with basic concepts in matrix algebra (transposing and inverting a matrix) is helpful but not essential. Participants are expected to have a working knowledge of R because the lab sessions will require its use. Participants are also expected to bring a laptop computer with R and an associated editor (e.g. RStudio) installed.
Students will be evaluated on the basis of computer exercises and class participation.
A reading list will be provided, as well as slides from the lectures and handouts for the lab sessions. Datasets and scripts will be distributed in a self-contained R package.
There is no tuition fee for this course. Students are expected to pay their own transportation and living costs. If you are accepted, MPIDR can provide advice on convenient places to stay in Rostock.
Recruitment of students
- Applicants should either be enrolled in a PhD program or have received their PhD.
- A maximum of 20 students will be admitted.
- The selection will be made by the MPIDR based on the applicants’ scientific qualifications.