Systematic Privacy by Design Engineering


Privacy by Design: let's have it!
Information and Privacy Commissioner of Ontario
Article 25, European General Data Protection Regulation: the controller shall [...] implement appropriate technical and organisational measures [...] which are designed to implement data-protection principles [...] in order to meet the requirements of this Regulation and protect the rights of data subjects.

HIGH PRIVACY

Privacy by Design Strategies
Overarching goal: minimizing privacy risks and trust assumptions placed on other entities

Case study: Electronic Toll Pricing
Motivation: European Electronic Toll Service (EETS)
- Toll collection on European roads through On Board Equipment
- Two approaches: satellite technology / DSRC
Starting assumptions
1) Well-defined functionality: charge depending on driving

Case study: Electronic Toll Pricing
Activity 1: Classify entities in domains
- User domain: components under the control of the user, e.g., user devices
- Service domain: components outside the control of the user, e.g., the backend system at the provider

Case study: Electronic Toll Pricing
- Trust the service to keep location data private
- Risk of privacy breach

Case study: Electronic Toll Pricing
Location is not needed, only the amount to bill!
- Service integrity?
- Requires knowledge of PETs: Privacy ENABLING Technologies
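
The point that only the amount to bill needs to leave the user domain can be sketched as follows. This is a minimal illustration, not the EETS design: the tariff table, the `Segment` format and the function names are all assumptions made up for the example, and it deliberately ignores service integrity (how the provider checks the reported amount), which is where the PETs mentioned on the slide come in.

```python
from dataclasses import dataclass

# Hypothetical tariff in EUR per km, by road type (illustrative values only).
TARIFF_PER_KM = {"highway": 0.10, "urban": 0.25}


@dataclass(frozen=True)
class Segment:
    road_type: str  # never leaves the user domain
    km: float       # never leaves the user domain


def compute_fee_locally(segments):
    """Runs on the On Board Equipment: turn private driving data into a fee."""
    total = sum(TARIFF_PER_KM[s.road_type] * s.km for s in segments)
    return round(total, 2)


def message_to_service(user_id, segments):
    """Only the billing amount crosses into the service domain."""
    return {"user": user_id, "amount_due": compute_fee_locally(segments)}


trips = [Segment("highway", 12.0), Segment("urban", 3.0)]
bill = message_to_service("user-42", trips)
print(bill)  # contains no road types, distances or locations
```

The message carries an identifier and a total, so the service can bill without ever holding a location trace.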

Privacy by Design engineering: a change in the way we reason about systems
The usual approach

PART I: Reasoning about privacy when designing systems
PART II: Evaluating privacy in privacy-preserving systems

PRIVACY-PRESERVING SOLUTIONS: CRYPTO-BASED VS ANONYMIZATION/OBFUSCATION

Crypto-based: WELL ESTABLISHED DESIGN AND EVALUATION METHODS, but expensive and require expertise
- Private searches
- Private billing
- Private comparison
- Private sharing
- Private statistics computation
- Private electronic cash
- Private genomic computations
- ...

Anonymization/obfuscation: cheap but... DIFFICULT TO DESIGN / EVALUATE

We need technical objectives: PRIVACY GOALS
- Anonymity: decoupling identity and action
- Pseudonymity: pseudonyms as ID (personal data!)
- Unlinkability: hiding the link between actions
- Unobservability: hiding the very existence of actions
- Plausible deniability: not possible to prove a link between identity and action
- Obfuscation: not possible to recover a real item from a noisy item
Why is it so difficult to achieve them?

Let's take one example: Anonymity
Art. 29 WP's opinion on anonymization techniques: 3 criteria to decide a dataset is non-anonymous (pseudonymous):
1) Is it still possible to single out an individual?
2) Is it still possible to link two records within a dataset (or between two datasets)?
3) Can information be inferred concerning an individual?
http://ec.europa.eu/justice/data-protection/article-29/documentation/opinion-recommendation/files/2014/wp216_en.pdf

Let's take one example: Anonymity
1) Is it still possible to single out an individual?
Location: the median size of the individual's anonymity set in the U.S. working population is 1, 21 and 34,980, for locations known at the granularity of a census block, census tract and county respectively.

Let's take one example: Anonymity
1) Is it still possible to single out an individual?
Location: if the location of an individual is specified hourly, and with a spatial resolution equal to that given by the carrier's antennas, four spatio-temporal points are enough to uniquely identify 95% of the individuals. [15 months, 1.5M people]
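
To make the "four points are enough" claim concrete, the sketch below estimates how often k randomly chosen points from a user's trace match that user and nobody else. The trace format (sets of `(hour, antenna)` pairs), the toy data and the function name `unicity` are assumptions for illustration. This is not the procedure or the data of the study quoted above.

```python
import random


def unicity(traces, k, trials=200, seed=0):
    """Estimate the fraction of trials in which k known points single out a user.

    traces: dict mapping user -> set of (hour, antenna) points (toy format).
    """
    rng = random.Random(seed)
    users = [u for u, pts in traces.items() if len(pts) >= k]
    unique = 0
    for _ in range(trials):
        user = rng.choice(users)
        # The adversary learns k points of the target's trace.
        known = set(rng.sample(sorted(traces[user]), k))
        # Every user whose trace contains all known points is a candidate.
        candidates = [u for u, pts in traces.items() if known <= pts]
        unique += len(candidates) == 1
    return unique / trials


# Toy traces: three users who share some, but not all, points.
traces = {
    "a": {(8, "A1"), (9, "A2"), (18, "A1")},
    "b": {(8, "A1"), (9, "A3"), (18, "A1")},
    "c": {(8, "A1"), (9, "A2"), (18, "A4")},
}
print(unicity(traces, k=1), unicity(traces, k=3))
```

In this toy data one known point rarely isolates a user, while three points always do. That is the same qualitative effect the slide reports at scale.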

Let's take one example: Anonymity
1) Is it still possible to single out an individual?
- location
- web browser

Let's take one example: Anonymity
1) Is it still possible to single out an individual?
- location
- web browser
It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}.
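
The "single out" criterion can be checked on a table by counting how many records share each combination of quasi-identifiers. The sketch below assumes a toy record format with `zip`, `gender` and `dob` fields. The records and the helper names are made up for illustration and are unrelated to the census data behind the 87% figure.

```python
from collections import Counter


def anonymity_set_sizes(records, quasi_identifiers):
    """For each record, count the records sharing its quasi-identifier values."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [counts[key(r)] for r in records]


def fraction_unique(records, quasi_identifiers):
    """Fraction of records whose anonymity set has size 1 (singled out)."""
    sizes = anonymity_set_sizes(records, quasi_identifiers)
    return sum(1 for s in sizes if s == 1) / len(sizes)


# Toy records (fabricated for the example).
records = [
    {"zip": "02139", "gender": "F", "dob": "1970-01-01"},
    {"zip": "02139", "gender": "F", "dob": "1970-01-01"},
    {"zip": "02139", "gender": "M", "dob": "1970-01-01"},
    {"zip": "94305", "gender": "F", "dob": "1985-06-15"},
]
print(fraction_unique(records, ["zip"]))                   # 0.25
print(fraction_unique(records, ["zip", "gender", "dob"]))  # 0.5
```

Adding attributes can only shrink anonymity sets, which is why a combination of individually harmless fields can identify most of a population.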

Let's take one example: Anonymity
2) Link two records within a dataset (or between datasets)
Take two graphs representing social networks and map the nodes to each other based on the graph structure alone: no usernames, no nothing.
Examples: Netflix Prize, Kaggle contest, social graphs

Anti-surveillance PETs, technical goals, privacy properties: Anonymity
3) Infer information about an individual
Based on GPS tracks from, we identify the latitude and longitude of their homes. From these locations, we used a free Web service to do a reverse white pages lookup, which takes a latitude and longitude coordinate as input and gives an address and name. [172 individuals]

Let's take one example: Anonymity
3) Infer information about an individual
We investigate the subtle cues to user identity that may be exploited in attacks on the privacy of users in web search query logs. We study the application of simple classifiers to map a sequence of queries into the gender, age, and location of the user issuing the queries.

Let's take one example: Anonymity
Wishful thinking! This cannot happen in general!
Data anonymization is a weak privacy mechanism
- Impossible to sanitize without severely damaging usefulness
- Removing PII is not enough! Any aspect could lead to re-identification
Art. 29 WP's opinion: risk of de-anonymization? Probabilistic analysis
$\Pr[\text{identity} \to \text{action} \mid \text{observation}]$

Privacy evaluation is a probabilistic analysis: systematic reasoning to evaluate a mechanism
- Anonymity: $\Pr[\text{identity} \to \text{action} \mid \text{observation}]$
- Unlinkability: $\Pr[\text{action A} \leftrightarrow \text{action B} \mid \text{observation}]$
- Obfuscation: $\Pr[\text{real action} \mid \text{observed noisy action}]$

Inversion? What do you mean?
1) Analytical mechanism inversion
Given the description of the system, develop the mathematical expressions that effectively invert the system:
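
One common reading of inverting a mechanism is Bayes' rule: given a prior over real actions and the mechanism's description Pr[observed | real], compute the adversary's posterior Pr[real | observed]. The sketch below assumes a toy two-location obfuscation mechanism. The prior, the probabilities and the function name `invert` are illustrative assumptions, not expressions taken from the talk.

```python
def invert(prior, mechanism, observed):
    """Posterior Pr[real | observed] for a discrete obfuscation mechanism.

    prior:     dict real -> Pr[real]
    mechanism: dict real -> dict observed -> Pr[observed | real]
    """
    # Joint probability of each real value together with the observation.
    joint = {x: prior[x] * mechanism[x].get(observed, 0.0) for x in prior}
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("observation has zero probability under the model")
    return {x: p / evidence for x, p in joint.items()}


# Toy mechanism: report the true location with probability 0.75, else the other.
prior = {"home": 0.5, "work": 0.5}
mechanism = {
    "home": {"home": 0.75, "work": 0.25},
    "work": {"home": 0.25, "work": 0.75},
}
posterior = invert(prior, mechanism, "home")
print(posterior)  # {'home': 0.75, 'work': 0.25}
```

The posterior is the quantity the obfuscation goal refers to: the closer it stays to the prior, the less the noisy observation reveals about the real action.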

Takeaways
Realizing Privacy by Design is non-trivial
- PART I: Reasoning about privacy when designing systems: explicit privacy engineering activities
- PART II: Evaluating privacy in privacy-preserving systems: privacy evaluation

thanks! Any questions? carmela.troncoso@imdea.org https://software.imdea.org/~carmela.troncoso/ (these slides will be there soon)