LOCATION PRIVACY & TRAJECTORY PRIVACY Elham Naghizade COMP20008 Elements of Data Processing 20th May 2016
Part I TRAJECTORY DATA: BENEFITS & CONCERNS
Ubiquity of Trajectory Data: Location data is collected and stored throughout the day via GPS-enabled smartphones, cars, and wearable devices; Wi-Fi access points; cell towers; and geo-tagged tweets, Facebook statuses, and location check-ins.
Trajectory: A function from time to geographical space
[Figure: a trajectory drawn as a sequence of points p_1, p_2, p_3, p_4, ..., p_{n-1}, p_n]
ID      GPS-Latitude  GPS-Longitude  Time
111478  33.692771     -111.993959    11:52
111478  33.692752     -111.993895    11:54
111478  33.692723     -111.993581    11:56
111478  33.692804     -111.993464    11:58
111478  33.69314      -111.993223    12:28
111478  33.69317      -111.993192    12:30
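The "function from time to space" view can be made concrete with a minimal sketch. It uses the six sample records on this slide; the helper names `minutes` and `position_at` are hypothetical, and linear interpolation between fixes is an assumption for illustration, not something the slide prescribes.

```python
from datetime import datetime

# Sample fixes taken from the slide: (time, latitude, longitude).
FIXES = [
    ("11:52", 33.692771, -111.993959),
    ("11:54", 33.692752, -111.993895),
    ("11:56", 33.692723, -111.993581),
    ("11:58", 33.692804, -111.993464),
    ("12:28", 33.69314, -111.993223),
    ("12:30", 33.69317, -111.993192),
]


def minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    t = datetime.strptime(hhmm, "%H:%M")
    return t.hour * 60 + t.minute


def position_at(fixes, hhmm):
    """Estimate (lat, lon) at a given time.

    Assumption: the object moves in a straight line at constant speed
    between two consecutive fixes. Returns None outside the recorded span.
    """
    t = minutes(hhmm)
    points = [(minutes(ts), lat, lon) for ts, lat, lon in fixes]
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(points, points[1:]):
        if t0 <= t <= t1:
            w = (t - t0) / (t1 - t0)  # fraction of the way from t0 to t1
            return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))
    return None
```

For example, `position_at(FIXES, "11:53")` returns a point halfway between the first two fixes, and `position_at(FIXES, "13:00")` returns `None` because no data covers that time.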
Benefits of Location Data: Individuals can benefit from sharing location data through precise, tailored location services; monitoring daily activities for fitness purposes, finding friends, tracking children or the elderly; and traffic monitoring and navigation. Rich location datasets are important to identify the most frequent paths between two points, provide the best POI recommendations for particular groups of people, improve traffic management and urban planning, and enable personal data analytics.
Privacy Concerns of Location Data: Current mobile systems are able to continuously monitor, communicate, and process information about a person's location; have a high degree of spatial and temporal precision and accuracy; and might be linked with other data. Analyzing and sharing location datasets has significant privacy implications: personal safety, e.g., stalking, assault; location-based profiling, e.g., Facebook; intrusive inferences, e.g., an individual's political views, personal preferences, health conditions.
Inference Attacks - Example: A user's Monday to Thursday trips. The home/work location pair may lead to a small set of potential individuals -> only {Bob, Alice} travel from A to B. [Figure: space-time plot (axes x, y, t) of the trip between A and B from 8 am to 2 pm, with walk, car, and stop segments]
Inference Attacks - Example: The same user's Friday trips. Regular visit to a heart hospital -> Alice is Japanese, so most probably the user is Bob. [Figure: space-time plot (axes x, y, t) from 8 am to 2 pm, with walk, car, and stop segments through A, B, and a hospital POI]
Inference Attacks - Example: Bob's Saturday trips. We can learn about his habits, preferences, etc. [Figure: space-time plot (axes x, y, t) from 11 am to 2 pm, with walk, train, and stop segments between A and a book club POI]
Tracking of Individuals: Deutsche Telekom (telecommunication operator) handed over six months of Malte Spitz's phone data: tracked position, phone calls, SMS, Internet access. http://www.zeit.de/datenschutz/malte-spitz-data-retention Rob me please! An attempt to raise awareness about location/trajectory privacy: http://pleaserobme.com
Part II LOCATION & TRAJECTORY PRIVACY
Anonymity: Cloaking. k-anonymity: individuals are k-anonymous if their location information cannot be distinguished from that of k - 1 other individuals. Spatial cloaking: Gruteser & Grunwald use quadtrees to adapt the spatial precision of location information about a person according to the number of other people in the same quadrant. Temporal cloaking: reduce the frequency of temporal information. Location Privacy and Trajectory Privacy Prof Lars Kulik
Spatial Cloaking (k_min = 4)
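Quadtree-based spatial cloaking can be sketched as follows. This is a minimal illustration in the spirit of the approach, not Gruteser & Grunwald's exact algorithm: the function name `cloak`, the unit-square region, and the `max_depth` limit are assumptions. The region is subdivided around the user as long as the user's quadrant still holds at least k people.

```python
def cloak(user, others, k, region=(0.0, 0.0, 1.0, 1.0), max_depth=10):
    """Return the smallest quadrant found that contains `user` and at least
    k people in total (the user plus k - 1 others).

    Points are (x, y) tuples; region is (x_min, y_min, x_max, y_max).
    Assumptions: `user` lies inside `region` with x < x_max and y < y_max.
    If the starting region holds fewer than k people, it is returned as is.
    """
    def inside(p, r):
        return r[0] <= p[0] < r[2] and r[1] <= p[1] < r[3]

    people = [user] + list(others)
    for _ in range(max_depth):
        x0, y0, x1, y1 = region
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quadrants = [
            (x0, y0, xm, ym), (xm, y0, x1, ym),
            (x0, ym, xm, y1), (xm, ym, x1, y1),
        ]
        # The one quadrant that contains the user.
        child = next(q for q in quadrants if inside(user, q))
        if sum(inside(p, child) for p in people) < k:
            break  # subdividing further would leave fewer than k people
        region = child
    return region
```

With `k = 4`, a user in a sparsely populated area is reported with a large quadrant, while a user in a crowd is reported with a small one. The reported precision adapts to the local population density.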
Obfuscation. Idea: mask the precision of an individual's location. Deliberately degrade the quality of information about an individual's location (imperfect information); identity can be revealed. Assumption: spatial imperfection ≈ privacy. The greater the imperfect knowledge about a user's location, the greater the user's privacy. Actual location: (x, y); reported location: region.
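One simple way to report a region instead of the actual point (x, y) is to snap the location to a grid cell. This is a sketch under stated assumptions: the function name `obfuscate` and the fixed square grid are illustrative choices, and other obfuscation schemes report differently shaped regions.

```python
import math


def obfuscate(x, y, cell_size):
    """Report the grid cell containing (x, y) instead of the exact point.

    Returns the cell as (x_min, y_min, x_max, y_max). A larger cell_size
    means more imperfect knowledge about the location.
    """
    col = math.floor(x / cell_size)
    row = math.floor(y / cell_size)
    return (
        col * cell_size,
        row * cell_size,
        (col + 1) * cell_size,
        (row + 1) * cell_size,
    )
```

Every location inside the same cell yields the same report, so an observer learns only the cell, not the position within it.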
Motivation for Obfuscation: Finding the closest sushi restaurant. The visitor asks the location-based service provider, Q: I am in Princess Park. What is the closest Sushi restaurant? The provider answers, A: Sushi Ten. [Figure: map of Princess Park with the restaurants Ichiban, Yo! Sushi, and Sushi Ten]
Overview of Privacy Models: Location privacy vs. trajectory privacy. Exact location points; 3-anonymized location points; obfuscated location points. Clustering k similar trajectories: at each timestamp, a point with the least distance to all trajectories is reported. Discussion: What are the shortcomings of spatio-temporal cloaking & obfuscation?
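The reporting step for a cluster of k similar trajectories might look like the sketch below. It rests on assumptions not stated on the slide: the trajectories are already clustered, sampled at the same timestamps, and given in planar coordinates, and "least distance to all trajectories" is read as the smallest sum of Euclidean distances among the k points at that timestamp. The name `representative` is hypothetical.

```python
import math


def representative(trajectories):
    """For each timestamp, pick the point with the least total distance
    to the points of all k trajectories at that timestamp.

    trajectories: list of k equally long lists of (x, y) points.
    """
    result = []
    for points in zip(*trajectories):  # the k points sharing one timestamp
        best = min(points, key=lambda p: sum(math.dist(p, q) for q in points))
        result.append(best)
    return result
```

The published trajectory is then this single sequence, shared by all k users in the cluster.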
Privacy vs. Data Utility: Data utility is the quality of the delivered service or analyzed data; it is difficult to maintain while preserving privacy. [Figure: utility vs. privacy]
No Privacy for Maximum Utility: Finding the closest sushi restaurant. The visitor asks the location-based service provider, Q: I am in Princess Park. What is the closest Sushi restaurant? The provider answers, A: Sushi Ten. [Figure: map of Princess Park with the restaurants Ichiban, Yo! Sushi, and Sushi Ten]
Maximum Privacy for Low Utility: Finding the closest sushi restaurant. The visitor asks the location-based service provider, Q: I am in Princess Park. What is the closest Sushi restaurant? The provider answers with several candidates, A: Yo! Sushi, B: Sushi Ten, C: Ichiban. [Figure: map of Princess Park with the restaurants Yo! Sushi, Ichiban, and Sushi Ten]
Part III BALANCING PRIVACY VS. UTILITY
Stop/Move Exchange. Key idea: exchanging stop and move episodes of a trajectory; exchanging a sensitive stop with an insensitive POI; preserving the footprint and duration of a trajectory. [Figure: timelines of the actual and the exchanged trajectory, marked t_begin, t_1, t_2, t_3, t_4, t_end, with episodes M_1, S_1, M_2] Legend: M_i -> actual move episodes; M*_i -> synthetic move episodes; S_i -> actual stop episodes; S*_i -> synthetic stop episodes
Pre-processing. Stop extraction: a set of consecutive points with a large temporal gap and within a short distance. Stop sensitivity: determined from user preferences and/or spatiotemporal features, such as the type of the stop point (e.g., a university vs. a bar), time, and duration. POI selection: less sensitive and preferably not repeated types.
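Stop extraction can be sketched as a single pass over the trajectory. The thresholds `max_dist` and `min_duration`, the function name `extract_stops`, and the planar (t, x, y) input are assumptions for illustration; the slide does not fix concrete values or a coordinate system.

```python
import math


def extract_stops(points, max_dist, min_duration):
    """Find stops in a trajectory given as time-ordered (t, x, y) tuples.

    A stop is a run of consecutive points that stay within max_dist of the
    run's first point and span at least min_duration time units.
    Returns (start_time, end_time, x, y) tuples, (x, y) being the centroid.
    """
    stops = []
    i = 0
    while i < len(points):
        j = i
        # Extend the run while the next point stays close to the first one.
        while (j + 1 < len(points)
               and math.dist(points[i][1:], points[j + 1][1:]) <= max_dist):
            j += 1
        if points[j][0] - points[i][0] >= min_duration:
            run = points[i:j + 1]
            cx = sum(p[1] for p in run) / len(run)
            cy = sum(p[2] for p in run) / len(run)
            stops.append((points[i][0], points[j][0], cx, cy))
            i = j + 1  # continue after the stop
        else:
            i += 1  # not a stop; try the next starting point
    return stops
```

A 30-minute gap between two nearby GPS fixes, such as between 11:58 and 12:28 in the sample trajectory, is the kind of pattern this pass would flag as a stop.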
Exchange Process. Stop replacement: finding POIs on the same route; preserving footprint and total duration; the new POI is on the same footprint. Stop displacement: looking for POIs in a close region; minimal detour from the original route; a small detour to get to the new POI.
Exchange Process: 2D overview. Two trajectories with sensitive stops. [Figure: two panels, Displacement and Replacement, each showing points p_1 to p_7 on a road network; legend: road network, actual footprint, POIs with high sensitivity, POIs with moderate sensitivity, POIs with low sensitivity; stop sensitivity is annotated with location type and stop duration] Discussion: Except for the footprint, which features of a trajectory, e.g., duration, average speed, spatial density, are affected by displacement and replacement?
Exchange Process: 2D overview. One trajectory with two sensitive stops: the hospital is displaced with the library; the bar is replaced with a restaurant.
Exchange Process: Efficiency. Exhaustive search: searching for POIs over the complete route. Partial search: local search for POIs, dividing the route into sub-trajectories with respect to stop points. Question: Having k sensitive stop points and n POIs on the footprint, what is the time complexity of the exhaustive search and the partial search?