Traffic Incident Detection Enabled by Large Data Analytics
REaltime AnalytiCs on TranspORtation data
Authors Forrest Hoffman (standing) and Bill Hargrove sit "inside" the computer they constructed from commodity PCs. (Concept of MPI)
Traffic Incident Detection: Data Quality Assurance; Real-Time Decision Support; Probe Data
Benchmark Dataset: Wavetronix
Probe Dataset: INRIX
Analysis Period: 1 month
Segment-Sensor Pairs: 100 (Freeways: 60; Non-Freeway: 40)
Association Rules: Bearing, Proximity
Max Distance, Segment Center to Wavetronix: 50 feet
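The bearing and proximity association rules can be sketched as a simple pairing test. This is a minimal illustration, not the authors' implementation: the 50-foot limit comes from the slide, while the bearing tolerance, the dictionary field names, and the use of great-circle distance are assumptions.

```python
import math

MAX_DISTANCE_FT = 50.0          # from the slide
MAX_BEARING_DIFF_DEG = 30.0     # assumed tolerance; not given in the source
EARTH_RADIUS_FT = 20_902_231.0  # mean Earth radius (6,371 km) expressed in feet


def haversine_ft(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in feet."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(a))


def bearing_diff_deg(b1, b2):
    """Smallest absolute angle between two compass bearings, in degrees."""
    d = abs(b1 - b2) % 360.0
    return min(d, 360.0 - d)


def is_associated(segment, sensor):
    """Pair a probe segment with a sensor only if both rules hold.

    `segment` and `sensor` are hypothetical dicts with keys
    'lat', 'lon' (segment center or sensor location) and 'bearing'.
    """
    close = haversine_ft(segment["lat"], segment["lon"],
                         sensor["lat"], sensor["lon"]) <= MAX_DISTANCE_FT
    aligned = bearing_diff_deg(segment["bearing"], sensor["bearing"]) <= MAX_BEARING_DIFF_DEG
    return close and aligned
```

The bearing check keeps a sensor on the opposite carriageway from being matched to a segment it happens to sit close to.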
Latencies: Short-Term Example
[Figure: speed (mph) vs. time; panel b) Original Dataset, panel c) Congestion Detection.]
Procedure: Get corresponding probe and benchmark data. Extract short-, medium-, and long-term trends. Calculate the similarity between trends. If similarity measure >= threshold, estimate the latency between the datasets.
Latencies
[Figure: probability distribution (likelihood) of latency in minutes, 0 to 20, for Freeways and for Arterials (Non-Freeway).]
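The latency procedure on these slides (get paired data, extract trends, compare them, estimate latency when they are similar enough) could be sketched as follows. The slides do not name the trend extractor or the similarity measure, so this sketch assumes a trailing moving average and Pearson correlation, and uses a single window rather than separate short-, medium-, and long-term trends. The function names and default parameters are hypothetical.

```python
from statistics import mean


def moving_average(series, window):
    """Trailing moving average, used here as a simple trend extractor (assumption)."""
    return [mean(series[max(0, i - window + 1): i + 1]) for i in range(len(series))]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def estimate_latency(benchmark, probe, window=3, max_lag=20, threshold=0.8):
    """Return (lag, similarity), where lag is the number of samples by which
    the probe trend trails the benchmark trend, or None if the best
    similarity stays below the threshold."""
    b = moving_average(benchmark, window)
    p = moving_average(probe, window)
    best_lag, best_sim = None, -1.0
    for lag in range(max_lag + 1):
        n = min(len(b), len(p) - lag)
        if n < 3:  # too few overlapping samples to compare
            break
        sim = pearson(b[:n], p[lag:lag + n])
        if sim > best_sim:
            best_lag, best_sim = lag, sim
    return (best_lag, best_sim) if best_sim >= threshold else None
```

The lag is in samples; multiplying by the sampling interval converts it to minutes, the unit used on the latency distribution plot.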
Small Data vs. Big Data Incident Detection
Small Data: small sample; complicated models; transferability check.
Big Data: no sampling (one model for each 15-min period, for each day, for each segment); simple models (deviation from central tendency); site-specific models.
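One way to picture "one model for each 15 min period, for each day, for each segment" is as a grouping key over the raw speed records. The sketch below is illustrative only: it assumes "day" means day of the week, and the record layout `(segment_id, timestamp, speed)` is hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def model_key(segment_id, timestamp):
    """Key of the site-specific model: (segment, day of week, 15-min period)."""
    period = (timestamp.hour * 60 + timestamp.minute) // 15  # 0..95 within a day
    return (segment_id, timestamp.weekday(), period)         # Monday == 0


def group_speeds(records):
    """Collect historical speeds per model key.

    `records` is an iterable of (segment_id, timestamp, speed_mph) tuples.
    """
    groups = defaultdict(list)
    for segment_id, timestamp, speed in records:
        groups[model_key(segment_id, timestamp)].append(speed)
    return groups
```

Each group then needs only a central value and a spread, which is what keeps the per-site models simple.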
Traffic data
164 miles long, divided into 254 segments (each 0.2-1.5 miles long)
1st April, 2016 to 7th July, 2016
500 GB of traffic data
Video cameras available for incident verification
Fig: Location of segments used
Incident data
Timeline: 04/01 to 06/30 (April, May, June) used for threshold computation; 06/30 to 07/07 (July) used for incident verification.
Incident data provided by the local TMC: start and end time of incident, type of incident reported.
70 lane-blocking incidents causing traffic disruption were reported in the one-week validation period.
Are traffic incidents outliers?
Incident: Any non-recurring event that causes a reduction of roadway capacity or an abnormal increase in demand.
[Figure: date (06/14, 06/21, 06/28, 07/05) vs. time of day, with a traffic incident marked.]
How to find outliers?
Univariate Outlier Analysis: variation of only ONE variable considered (speed).
Reference value: $s_0$. Measure of variation: $\theta$. Threshold: $t$.
A data point $s_k$ is an outlier if $s_k < s_0 - t\theta$.
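The outlier rule translates directly into code. The sketch below is a minimal illustration; the function names and the example numbers are hypothetical and not taken from the study.

```python
def speed_threshold(s_0, theta, t):
    """Lowest speed still treated as normal: s_0 - t * theta."""
    return s_0 - t * theta


def is_outlier(s_k, s_0, theta, t):
    """True when the observed speed s_k falls below the threshold."""
    return s_k < speed_threshold(s_0, theta, t)


# Illustrative numbers only: reference 65 mph, variation 4 mph, t = 3.
print(speed_threshold(65.0, 4.0, 3.0))   # 53.0
print(is_outlier(50.0, 65.0, 4.0, 3.0))  # True
print(is_outlier(60.0, 65.0, 4.0, 3.0))  # False
```

The test is one-sided: only unusually low speeds are flagged, since unusually high speeds do not indicate an incident.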
Selection of $s_0$ and $\theta$ in $s_k < s_0 - t\theta$
1. Standard Normal Deviate (SND): $s_0$ = mean, $\theta$ = standard deviation
2. Inter-Quartile Distance (IQD): $s_0$ = median, $\theta$ = inter-quartile distance
3. Median Absolute Deviation (MAD): $s_0$ = median, $\theta$ = median absolute deviation
Alternatives to SND? ($s_k < s_0 - t\theta$)
IQD (Inter-Quartile Distance): $s_0$ = median; $\theta$ = 75th percentile speed - 25th percentile speed = $s_{0.75} - s_{0.25}$
MAD (Median Absolute Deviation): $s_0$ = median; $\theta = \mathrm{median}(|s_k - s_0|)$
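As a rough sketch, the three choices of reference value and variation measure can be computed with the Python standard library. The percentile interpolation rule and the use of the sample standard deviation are assumptions, since the slides do not specify them. The values of t are the ones listed on the deck's threshold-parameter slide (SND 3, MAD 3, IQD 2).

```python
import statistics

T_VALUES = {"SND": 3, "MAD": 3, "IQD": 2}  # threshold values listed in the deck


def percentile(sorted_vals, q):
    """Percentile by linear interpolation between order statistics (assumed rule)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def snd_params(speeds):
    """(mean, sample standard deviation)."""
    return statistics.mean(speeds), statistics.stdev(speeds)


def iqd_params(speeds):
    """(median, s_0.75 - s_0.25)."""
    s = sorted(speeds)
    return statistics.median(s), percentile(s, 0.75) - percentile(s, 0.25)


def mad_params(speeds):
    """(median, median of absolute deviations from the median)."""
    s_0 = statistics.median(speeds)
    return s_0, statistics.median([abs(x - s_0) for x in speeds])


PARAMS = {"SND": snd_params, "IQD": iqd_params, "MAD": mad_params}


def threshold(speeds, method):
    """Speed threshold s_0 - t * theta for the chosen method."""
    s_0, theta = PARAMS[method](speeds)
    return s_0 - T_VALUES[method] * theta
```

For example, with the made-up history `[60, 62, 64, 66, 68]`, `threshold(history, "IQD")` gives 56.0 and `threshold(history, "MAD")` gives 58.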
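All good with IQD and MAD?
$\mathrm{IQD} = s_{0.75} - s_{0.25}$, $\mathrm{MAD} = \mathrm{median}(|s_k - s_0|)$
Swamping: when 50% or more of the data values are identical, IQD and MAD can equal 0, so the threshold $s_0 - t\theta$ collapses to the median and any slightly lower speed is flagged.
Remedy: an incident alarm is triggered only when congestion occurs (speed < 45 mph).
Traffic Congestion and Reliability: Trends and Advanced Strategies for Congestion Mitigation. Vol. 6. Federal Highway Administration, 2005.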
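A small sketch of the swamping problem and the 45 mph congestion guard, using MAD as the variation measure. The speed history is made up for illustration, and combining the two conditions with a logical AND is one reading of "alarm triggered only when congestion occurs".

```python
import statistics

CONGESTION_SPEED_MPH = 45.0  # congestion limit stated on the slide


def mad_threshold(speeds, t=3.0):
    """Threshold s_0 - t * theta with s_0 = median and theta = MAD."""
    s_0 = statistics.median(speeds)
    theta = statistics.median([abs(x - s_0) for x in speeds])
    return s_0 - t * theta


def incident_alarm(s_k, threshold):
    """Alarm only if the speed is an outlier AND indicates congestion."""
    return s_k < threshold and s_k < CONGESTION_SPEED_MPH


# Hypothetical free-flow history: more than half the values are exactly 65 mph,
# so the MAD is 0 and the threshold equals the median.
history = [65, 65, 65, 65, 65, 65, 63, 66, 64, 65]
thr = mad_threshold(history)

print(thr)                      # 65.0
print(64 < thr)                 # True: outlier by the raw rule alone
print(incident_alarm(64, thr))  # False: 64 mph is not congested
print(incident_alarm(40, thr))  # True
```

Without the guard, a speed of 64 mph on this segment would raise an alarm even though traffic is flowing freely.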
Threshold parameter $t$ in $s_k < s_0 - t\theta$
Method   Threshold value used (t)
SND      3
MAD      3
IQD      2
Pearson, R. K. Mining Imperfect Data: Dealing with Contamination and Incomplete Records. SIAM, 2005.
Speed Threshold
[Figure: speed threshold of a typical segment for Thursday; annotations: AM peak, work zone, PM peak.]
SND typically has lower speed thresholds.
Performance Measures
Detection Rate: $\mathrm{DR} = \frac{\text{Total number of detected incidents}}{\text{Total number of actual incidents}} \times 100\%$
Mean Time to Detect: $\mathrm{MTTD} = \frac{\text{Total time used to detect incidents}}{\text{Total number of incidents detected}}$
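The two measures are simple ratios; a minimal sketch follows. The function names are hypothetical, and the example inputs are illustrative: the deck reports percentages and 70 actual incidents, not the detected counts or individual detection delays.

```python
def detection_rate(n_detected, n_actual):
    """DR in percent: detected incidents over actual incidents."""
    return 100.0 * n_detected / n_actual


def mean_time_to_detect(detection_delays_min):
    """MTTD in minutes: total detection delay over the number of detected incidents."""
    return sum(detection_delays_min) / len(detection_delays_min)


# Illustrative inputs: 68 detections out of 70 incidents, three made-up delays.
print(round(detection_rate(68, 70), 1))         # 97.1
print(mean_time_to_detect([10.0, 12.0, 14.0]))  # 12.0
```

MTTD is an average delay in minutes, so unlike DR it carries no percentage factor.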
Validation Results
Method           DR (%)   FAR (%) [False alarms/day]   MTTD (mins)
IQD (proposed)   97.1     4.84 [4.1]                   12.4
MAD (proposed)   94.3     6.56 [4.0]                   10.1
SND (existing)   82.9     0.62 [1.0]                   13.2
Although FAR is higher for the proposed algorithms, their number of false alarms per day is below the accepted level (10 false alarms/day).
Williams, B. M., and A. Guin. Traffic management center use of incident detection algorithms: Findings of a nationwide survey. IEEE Transactions on Intelligent Transportation Systems, Vol. 8, No. 2, 2007, pp. 351-358.
Anuj Sharma, Associate Professor Iowa State University anujs@iastate.edu