Welcome to the “Applied Data Analysis for Public Policy Studies” course at ScPo! On this page we outline the course and present the syllabus. The 2020/2021 academic year is the first time we are teaching this course in this format, so your comments are appreciated!


The objective of this course is to provide students with the statistical tools to understand and analyze public policy critically. The course introduces students to basic concepts of statistical analysis, including multivariate regression, sampling, hypothesis testing, causality, difference-in-differences analysis, and regression discontinuity design. The course places a strong emphasis on understanding conceptually how these tools are applied, rather than on their mathematical foundations. For instance, when discussing linear regressions, we will focus on what the estimates mean rather than on the mathematical proofs underpinning the calculations. To this end, statistical concepts will be (i) presented through multiple examples from academic articles in the sphere of public policy, and (ii) put into practice using the statistical language R.

The ultimate goals of the course are for you to:

  1. Become a Reader: Be familiar with simple statistical tools and be able to interpret results by thinking critically about the conceptual limitations of the analyses presented in research papers and policy briefs.

  2. Become a User: Conduct statistical analysis and data cleaning exercises using R.

Syllabus and Requirements

We will use the free statistical computing language R intensively. Before coming to the first session, please install R and RStudio as explained at the beginning of Chapter 1.

Course Structure

Groups meet once per week for 2 hours. The main purpose of the weekly meetings is to answer questions and to work through tutorials together. The little theory we need is covered in this book, and you are expected to read it on your own time before coming to class.


Slides for most book chapters are available in a dedicated GitHub repository.

This Book and Other Material

What you are looking at is an online textbook. You can therefore read it in your browser (as you are doing right now), on your mobile phone or tablet, or download it as a PDF file or as an EPUB file for your e-book reader. We have no ambition to actually produce and publish a book for now, so you should see this simply as a way for us to share our lecture notes with you. Alongside the book, the second part of the course material is an extensive suite of tutorials and interactive demonstrations, all of which are contained in the R package that builds this book (and which you installed by issuing the above commands).


Instructor: Michele Fioretti - email: michele dot fioretti at sciencespo dot fr
Teaching Assistant: Eléonore Richard - email: eleonore dot richard at sciencespo dot fr

Open Source

The book and all other content for this course are hosted under an open source license on GitHub. You can contribute to the book by clicking on the appropriate edit symbol in the top bar of this page. The material in this book comes from “Introduction to Econometrics with R” by Florian Oswald, Jean-Marc Robin and Vincent Viers (2018), which can be found in its GitHub repository.


The course is 100% assessed by coursework: a problem set (worth 30%), quizzes on Moodle (20%) and a take-home exam (50%).


We will communicate exclusively through our Slack group. Your instructor will send you an email invitation to join in due course.