5. De-Identification

Introduction and recap of the previous lesson

Over the last sessions, you were introduced to frameworks that help you better understand your data practices and identify the risks that could arise in them. You also learned about different approaches that help you minimise those risks. One of those approaches is de-identification, as you learned in the previous chapter. Today, we are going to dig a little deeper into the different ways you can de-identify data sets. You will learn what de-identification is, how to spot potentially risky data by looking at a data set, how to conduct basic de-identification, and what the limits of de-identification are.

Now, de-identification might sound very technical, like something only data scientists or statisticians deal with. Don’t worry, we won’t go into technical details. But understanding the basic principles of de-identification will help you improve your responsible data handling practices.