Advanced and Specialised Research Methods

Upcoming Courses

Workshop on Finnish Registry data

 19 May 2021 10.15 – 13.45

By Sanna Kailaheimo-Lönnqvist

The aim of this workshop is to provide an introduction to Finnish registry data. The workshop introduces different kinds of Finnish registers, how to obtain them for scientific use, the basic structure of the data, and most importantly, what kinds of research questions can be answered using registry data. The workshop includes both discussion and lectures. After the workshop, students will know some of the benefits and limitations of registry data and what kinds of studies can be conducted with it.

1 credit: Short essays and participation in the discussions.

Sanna Kailaheimo-Lönnqvist is a researcher at the Finnish National Rescue Association and a visiting researcher at various institutions, such as the Institute of Criminology and Legal Policy at the University of Helsinki. She obtained a PhD in sociology from the University of Turku in 2021. She has published articles in international peer-reviewed journals such as Demography, Research in Social Stratification and Mobility, and European Societies.

Her main research interests are in the areas of social inequality and intergenerational relations. In her doctoral thesis she examined how resources and different life events are linked with children’s adulthood outcomes. She has conducted all her research using Finnish and Swedish register data.

Past courses

Causal Inference for nonexperimental data

 17-19 March 2021 (12 hours)

By Bruno Arpino, Department of Statistics, Computer Science, Applications, University of Florence, Italy

What is the effect of smoking on health? Does having an additional child increase the risk of poverty? Are development policies targeted on small firms effective in increasing investments?

Most studies in the social sciences are motivated by questions that are causal in nature.

However, in these areas experiments are not always possible for ethical or practical reasons, and the estimation of causal effects often has to rely on observational studies. The validity of inference then depends strictly on the plausibility of the assumptions underlying the statistical techniques employed.

This course will cover some of the most popular techniques for estimating causal effects with observational data: propensity score matching, instrumental variable regression, regression discontinuity designs and fixed effects models. Special emphasis will be placed during the course on discussing the plausibility of the identifying assumptions, the data requirements and other practical and theoretical challenges for the implementation of each method.
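As a rough illustration of one of the listed techniques, the sketch below computes a fixed effects (within) estimate on a tiny made-up panel and compares it with a pooled slope. It is not taken from the course materials, which use Stata; the function names `within_estimator` and `pooled_slope` and the toy data are assumptions introduced here for illustration only.

```python
from collections import defaultdict


def pooled_slope(x, y):
    """Simple regression slope of y on x, ignoring which unit each row belongs to."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    den = sum((a - mean_x) ** 2 for a in x)
    return num / den


def within_estimator(units, x, y):
    """Slope of y on x after subtracting each unit's own means (within transformation)."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])  # unit -> [sum of x, sum of y, count]
    for u, xi, yi in zip(units, x, y):
        totals[u][0] += xi
        totals[u][1] += yi
        totals[u][2] += 1
    num = 0.0
    den = 0.0
    for u, xi, yi in zip(units, x, y):
        sum_x, sum_y, n = totals[u]
        dx = xi - sum_x / n  # deviation from the unit's mean of x
        dy = yi - sum_y / n  # deviation from the unit's mean of y
        num += dx * dy
        den += dx * dx
    if den == 0:
        raise ValueError("x does not vary within any unit")
    return num / den


# Hypothetical toy panel: y = unit effect + 2 * x, with unit effects 0 (A) and 10 (B).
# The unit with the larger effect also has larger x, which confounds the pooled slope.
units = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 6.0, 18.0, 20.0, 22.0]

print("pooled slope:", pooled_slope(x, y))
print("within (fixed effects) slope:", within_estimator(units, x, y))
```

In this toy panel the within estimate recovers the slope of 2 used to build the data, while the pooled slope is inflated by the time-constant unit effects. The within estimator removes only confounders that are constant within a unit, which is the kind of identifying assumption the course description refers to.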

This short course will offer participants theoretical and applied perspectives on the covered topics. Examples will be drawn from political science, sociology, economics, public health and policy evaluation. Lab sessions will demonstrate the implementation of the covered techniques using the software Stata.

More information and the programme of the course

Bruno Arpino is an associate professor at the Department of Statistics, Computer Science, Applications, University of Florence (Italy). Previously he was an associate professor at the Department of Political and Social Sciences, Universitat Pompeu Fabra (UPF) and co-director of the Research and Expertise Centre on Survey Methodology (RECSM, UPF). He obtained a PhD in Applied Statistics from the University of Florence in 2008.

His main research interests are in the areas of causal inference, applied statistics and social demography. From a substantive point of view, he has been studying intergenerational relationships, ageing and health, fertility and immigrants’ assimilation.

He has published articles in international peer-reviewed journals such as The Annals of Applied Statistics, Demography, European Sociological Review, The Journals of Gerontology: Series B, Journal of Marriage and Family, Journal of the Royal Statistical Society – A and C, Proceedings of the National Academy of Sciences (PNAS), and Statistics in Medicine. Since October 2017 he has been a member of the Editorial Board of Statistical Methods and Applications.


Introduction to R

by Simon N. Chapman, INVEST Flagship, Department of Social Research, University of Turku, Finland

6-7 April 2021

Many researchers now work almost exclusively in the R programming environment, but what is R and how does one even get started? Why should I use R over (preferred statistics program)? What is a function and what is a package? What do those square brackets mean? 

Learning R can seem like a daunting task, like trying to climb Everest with no tools and no mountaineering experience. There is no need to fear though: as the famous idiom goes, the best way to eat an elephant is one bite at a time. In this course, we will break the basics of R into bite-sized chunks.

This course aims to give participants a starting point for working within the R environment, and is suitable for absolute beginners and more experienced programmers alike. Learn to import and export objects, view and summarise datasets, create variables, and much more.

Simon Chapman is a senior researcher in the INVEST Flagship at the University of Turku. He obtained his PhD in evolutionary biology from the University of Turku in 2020. His current interests are intergenerational relations, cooperation and conflict within families, life history evolution, and the impacts of parental leave on the life-course. His work has been published, amongst others, in Current Biology, Nature Communications, Evolution & Human Behavior, and Biology Letters.




Longitudinal social network analysis with RSiena 27.-29.10.2020

by: Tom Snijders

The course will be online. It will consist of an alternation of lectures, Q&A sessions, and practical work in breakout groups of 2-4 participants.

It is assumed that the participants have a good basic understanding of statistical methods, including regression and logistic regression; a good understanding of the basics of social network analysis (e.g., the textbook by Borgatti, Everett, and Johnson); and a good working knowledge of R.

The program is tentative, especially for the later days, and will be adapted to the interests of the participants.

Tom A.B. Snijders is professor of Statistics and Methodology in the Social Sciences at the University of Groningen and emeritus fellow of Nuffield College, University of Oxford. He studied mathematics and obtained a PhD in 1979 from the University of Groningen with a dissertation in mathematical statistics. His research concentrates on social network analysis and multilevel analysis. His work on developing statistical methodology for network dynamics is implemented in the software package RSiena (Simulation Inference for Empirical Network Analysis) in the statistical system R. With Roel J. Bosker he wrote Multilevel Analysis; An Introduction to Basic and Advanced Multilevel Modeling (Sage, 2nd ed., 2012). Combining these two research strands, together with Emmanuel Lazega he edited Multilevel Network Analysis for the Social Sciences; Theory, Methods and Applications (Springer, 2016). Together with Patrick Doreian he was co-editor of Social Networks from 2006 to 2011. He supervised and co-supervised more than 60 PhD dissertations. From 2002 to 2006 he was scientific director of the graduate school ICS (Inter-university Center for Social Science Theory and Methodology). He received two awards from INSNA (International Network for Social Network Analysis): the Georg Simmel Award in 2010 and the Bill Richards software award in 2017; and honorary doctorates from the University of Stockholm (2005) and the Université Paris-Dauphine (2005).

Introduction to Social Science Genetics 10.-12.11.2020

by: Felix Tropf

A growing number of social science data sources are providing molecular genetic data and researchers all over the world are interested in utilizing this information in order to better understand various social phenomena. In this course, we will learn about the history of social science and behaviour genetics as well as about state-of-the-art research and cutting-edge methods. After attending this workshop, participants should have a basic understanding of the fundamental advantages of integrating genetics into social science. They should understand the basic technical terms from the quantitative genetics literature and be able to read and interpret studies concerning social science genetics. They should be able to conduct basic quantitative genetics analyses and interpret their findings. Participants need an interest in and a basic understanding of quantitative social science research, and some experience with the software R and Stata.

We will start with a general introduction to genetics in the social sciences, discussing potential research questions we can answer using genetic data. We subsequently learn about the theory behind twin and family models and how to estimate heritability as the proportion of observed variance in an outcome that is explained by genetic effects. We move on to see how heritability is measured using molecular genetic data and discuss various challenges and applications. We use Plink software to prepare and analyze genetic data and GCTA software to estimate quantitative genetic models.
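To make the idea of heritability as a variance proportion concrete, here is a minimal sketch of Falconer's classical formula, which decomposes variance from the trait correlations of identical (MZ) and fraternal (DZ) twin pairs. It is not the course's own method, which relies on twin and family models and on GCTA; the function name `falconer_decomposition` and the correlation values are hypothetical and chosen only for illustration.

```python
def falconer_decomposition(r_mz, r_dz):
    """Falconer's approximation of the ACE variance shares from twin correlations.

    r_mz: trait correlation between identical (monozygotic) twins
    r_dz: trait correlation between fraternal (dizygotic) twins
    Returns (h2, c2, e2): additive genetic, shared environment and
    non-shared environment shares of the variance.
    """
    h2 = 2 * (r_mz - r_dz)  # heritability: MZ twins share twice the segregating genes of DZ twins
    c2 = 2 * r_dz - r_mz    # shared environment: similarity not explained by genes
    e2 = 1 - r_mz           # non-shared environment (plus measurement error)
    return h2, c2, e2


# Made-up correlations, for illustration only.
h2, c2, e2 = falconer_decomposition(r_mz=0.6, r_dz=0.4)
print(f"h2 = {h2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

The three shares sum to one by construction. The formula assumes, among other things, that MZ and DZ twins experience equally similar environments and that genetic effects are additive.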

We will discuss how genetic variants associated with social science outcomes of interest are discovered, and how we can utilize these results in social science research: controlling for confounding effects, dealing with genetic heterogeneity in social science models, estimating gene-environment interaction models and using genes as instrumental variables. Substantively, we will rely on recently published genetic discovery studies on educational attainment, subjective well-being and fertility.

Felix is a sociologist and his current interests focus on social demography, genetics, and the life course. He is an Assistant Professor in Social Science Genetics at CREST/ENSAE, an Associate member of Nuffield College in Oxford and a Visiting Scientist at the Queensland Institute for Medical Research (QIMR) in Australia. He received the European Demography Award for best PhD Thesis. Felix’s research has been published, amongst others, in Demography, Nature Genetics, Nature Human Behaviour, JAMA Psychiatry, Proceedings of the National Academy of Sciences and Population Studies.

Advanced Causal Inference with Observational Data, 2-4 ECTS

by Moris Triventi

The aim of this course is to provide an introduction to the identification and estimation of causal effects using observational data typical of the social sciences. Each theoretical lesson is complemented by a laboratory/computer session in which the Stata software is used to analyze real-world data. Requirements: students are expected to have basic knowledge of statistics (descriptive, inferential) and linear regression. Basic knowledge of Stata (file management, data preparation) is also required.

Moris Triventi, PhD, is Associate Professor in the Department of Sociology and Social Research at the University of Trento (Italy), where he teaches Quantitative Research Methods and Sociology of Education. From 2013 to 2016 he was Research Fellow at the European University Institute (Fiesole, Italy). His research interests comprise social inequalities, education, crime, migration and policy evaluation. His work has been published, among others, in Annual Review of Sociology, Policy Sciences, International Migration Review, and European Sociological Review.





Experimental Social Science – Lab and Field Experiments, 5 ECTS

by Lauri Sääksvuori

This course is about experimental social science. Students will learn how to gather data using experimental methods and how various experimental designs relate to different statistical methods. After the course, students will know how to design meaningful experiments and how to draft implementation and analysis plans for running them in practice.

Behavioral Genetic Modeling using Twin data

by Tina Baier

The aim of this course is to introduce social scientists to twin studies and the related quantitative methods of behavioral genetic analysis. The first part of the course provides the relevant background and introduces the main concepts used in quantitative genetics. The second, applied part uses the statistical software Stata and the “acelong-package” developed for behavioral genetic modeling. Prerequisites: Participants should have basic knowledge of Stata 14 and regression analysis. A basic understanding of multilevel modeling is an advantage.