Attributing and Authenticating Evidence

Forensic Framework
- Collection: identify and collect digital evidence
  - Selective acquisition? Cloud storage?
  - Generate a data subset for examination?
- Examination of evidence
- Analysis
  - String search? Pattern matching?
  - Data visualization (timeline analysis)?

Forensic Framework
- Attribution
- Analysis: determine data significance and draw conclusions
  - Data mining? Cluster analysis, discriminant analysis, rule mining
- Forensic source identification: link multimedia content to the acquisition device
- Presentation
- Attribution: who did it? (source)
- Authentication: synthetic data? Forgery?
Authentication / Attribution
- Computer generated images?
  http://www.businessinsider.com/photorealistic-3d-images-2013-2
- The scientists found that 97% of test subjects were fooled into believing that the digital renderings were real photographs and that real photos were CGI.

Authentication
- Fake photo? Tampering detection

Multimedia Forensics
- Application of scientific methods to the investigation and prosecution of a crime
- Outcomes of a forensic analysis may serve as probative facts in court
- Detect the source of multimedia data: cell phone camera, standalone camera, scanner, computer generated
- Detect forgeries
  - Copy-move forgery: copying another region of the same image
  - Hide undesired objects / replicate similar objects
Examples
- A tampered image appeared in the press in July 2008
  - 4 Iranian missiles: 3 are real
  - Red/purple: copy-move forgeries
- Recapturing problem
- 2007: Fars News Agency, Tehran: copy-move forgeries
Applications

Hash function
https://www.youtube.com/watch?v=3bZvtWA7qGQ
- An algorithm
  - Input: files (Word document, PDF, image, ...)
  - Output: a fixed-length string
- Purpose: ensure data integrity
- Properties
  - Hashed result is unique in practice (different inputs should give different hashes)
  - One-way function
- Good for authenticating Word/PDF documents

Hash function
- Example: http://www.fileformat.info/tool/hash.htm
  - d41d8cd98f00b204e9800998ecf8427e
  - 7dbc9f235835a899880f3e9a7ae1f393

Hashing function
- To see if images are modified: compare hash values
- Too strict for multimedia data
  - Images: transmitted through a sharing platform and compressed; the content / meaning doesn't change
  - Video: wireless loss
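The integrity check above can be sketched with Python's standard `hashlib` module. The first value shown on the slide, d41d8cd98f00b204e9800998ecf8427e, is the MD5 digest of empty input, so MD5 is used here; the helper name `md5_hex` and the sample byte strings are illustrative assumptions, not part of the original slides.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a 32-character hex string."""
    return hashlib.md5(data).hexdigest()

# MD5 of empty input: matches the first example value on the slide.
empty_digest = md5_hex(b"")

# Hypothetical document contents: changing a single character changes the digest,
# which is why a plain hash is "too strict" for recompressed images.
original = md5_hex(b"Quarterly report, version 1")
modified = md5_hex(b"Quarterly report, version 2")
is_unchanged = (original == modified)  # False: the file was altered
```

The output length is always 32 hex characters regardless of input size, which is the "fixed-length string" property listed above.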
Example
- Images taken by a smartphone and sent through WhatsApp: compression

Feature-based Image Hashing
- Illustration: bitmap, JPEG
- Feature: invariant under perceptually insignificant distortion (rotation, local scale change, resizing)
  - Corners? (Harris corner detection)
  - For each corner: find the average brightness feature
- Pipeline: image -> feature extraction (color) -> hash generation -> image hash
Color feature extraction (original vs WhatsApp)
- Grayscale
- Resize -> 8x8
- Average color (mean) of the two versions: 93.42, 92.39
- If intensity > average: 1, otherwise: 0

Feature-based Image Hashing
- Features: color features
- For RGB and YCbCr color spaces: 6 color components
- For each color component, calculate the statistical information: mean, variance, moment values
  http://www.naturalspublishing.com/files/published/54515x71g3omq1.pdf
- Concatenate the moment values of the six color components to form a feature vector

Feature-based Image Hashing
- Form a signature based on the extracted features: concatenate all features together
- Different methods to form the hash
  - Represent the features using a certain number of bits
  - Take the Fourier transform, consider the magnitude and the phase as features, and represent them using a certain number of bits
    http://www.brainflux.org/java/classes/fft2dapplet.html
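The color feature extraction steps above (grayscale, resize to 8x8, threshold against the mean) can be sketched as follows. This is a minimal illustration, assuming block averaging as the resize step and a synthetic image in place of a real photo; the function names and test data are hypothetical.

```python
import numpy as np

def average_hash(gray, size=8):
    """64-bit hash: 1 where the shrunken pixel is brighter than the mean, else 0."""
    h, w = gray.shape
    # Crop so the image divides evenly into size x size blocks (assumed resize method).
    gray = gray[: h - h % size, : w - w % size].astype(np.float64)
    bh, bw = gray.shape[0] // size, gray.shape[1] // size
    small = gray.reshape(size, bh, size, bw).mean(axis=(1, 3))  # block-average "resize"
    return (small > small.mean()).astype(np.uint8).ravel()

def hamming(a, b):
    """Number of differing hash bits."""
    return int(np.count_nonzero(a != b))

# Synthetic grayscale image: brightness increases from top to bottom.
rows = np.arange(64).reshape(-1, 1) * 4
image = np.repeat(rows, 64, axis=1).astype(np.float64)

# Mild distortion (stand-in for recompression): noise of at most +/-2 grey levels.
rng = np.random.default_rng(0)
distorted = np.clip(image + rng.integers(-2, 3, size=image.shape), 0, 255)

# Content change: a bright patch pasted over the top-left quarter.
tampered = image.copy()
tampered[:32, :32] = 255

d_distorted = hamming(average_hash(image), average_hash(distorted))
d_tampered = hamming(average_hash(image), average_hash(tampered))
```

Comparing hashes by Hamming distance instead of exact equality is what makes this kind of hash tolerant of compression: a small distance suggests the same content, a large one suggests modification. Where to put the threshold is a design choice that the evaluation criteria on the following slides address.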
Feature-based Image Hashing: Evaluation
- Feature: invariant under perceptually insignificant distortion
- Hash length
- Robustness towards different changes
  - Brightness adjustment
  - Contrast adjustment
  - JPEG compression
  - Addition of noise
  - Lowpass filtering

Evaluation
- Large degree of compression?
- Share through social media
- Missed detection?
Active approach for data authentication
- Digital watermark: 10011010 (copyright)

Active approach for data authentication
- Active approach: digital watermark
- Epson PhotoPC 3000Z, 700/750Z, 800/800Z (discontinued)
  - Watermark is invisible
  - Requires optional software to embed and view the watermark
- Kodak DC-200, 260, 290 (discontinued)
  - Watermark is invisible
  - Watermark capabilities built into the camera
Active vs Passive
- Active approach
  - Addition of extra data
  - More powerful, end-to-end protocol
  - Not popular
- Passive approach
  - Detect intrinsic image regularity or tampering artifacts
  - Wider application, less powerful

Forgery detection techniques
http://www.izitru.com/
- Three levels of assumption
  - Rules and models of the physics of the scene
    - Inconsistency is a basis for forgery detection
    - Size inconsistency, lighting directions, shadow inconsistencies, reflection inconsistencies
  - Inherent characteristics of the acquisition system (camera components, imaging pipeline)
  - Statistics of natural images

Demonstration
- Web platform: https://29a.ch/photo-forensics/
- Python: http://www.sourcecodeonline.com/details/copymove_forgery_detection_in_images.html
- Purchase: http://belkasoft.com/forgery-detection
Example: JPEG Forensics
- Quantization tables: transform to the frequency domain (discrete cosine transform), then divide each F[u,v] by a constant q[u,v] and round
- Eye: more sensitive to low frequencies
- Most software uses the standard quantization tables
- Some software (Photoshop) has its own quantization tables
- Camera manufacturers have their own tables
- Clue for manipulation

Example: JPEG Forensics
- Quantization tables: www.dfrws.org/sites/default/files/session-files/paperusing_jpeg_quantization_tables_to_identify_imagery_processed_by_software.pdf

Examples: Quantization
- Photo1_SamsungA7: standard JPEG table, quality=96
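To make "standard JPEG table, quality=Q" concrete, the sketch below scales the standard luminance table from the JPEG specification (Annex K) using the quality formula of the IJG libjpeg library. It assumes the "quality" values reported on these slides follow that convention; the function names are illustrative.

```python
import numpy as np

# Standard JPEG luminance quantization table (JPEG specification, Annex K).
STD_LUMA = np.array([
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
])

def scaled_table(quality):
    """Scale the standard table for a quality setting in 1..100 (IJG convention)."""
    quality = min(max(int(quality), 1), 100)
    scale = 5000 // quality if quality < 50 else 200 - 2 * quality
    return np.clip((STD_LUMA * scale + 50) // 100, 1, 255)

def match_standard_quality(table):
    """Return the quality whose scaled standard table equals `table`, else None."""
    for quality in range(1, 101):
        if np.array_equal(table, scaled_table(quality)):
            return quality
    return None  # non-standard table: camera- or software-specific

# A table that deviates from every scaled standard table (hypothetical example).
custom = scaled_table(90).copy()
custom[0, 0] += 1
```

A table that matches no quality setting is what the slides call a non-standard table. A table that matches a standard setting different from the one the claimed camera normally uses is the kind of clue for manipulation mentioned above.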
Examples: Quantization
- Photo1_SamsungA7_modified (software): standard JPEG table, quality=90
- Photo1_SamsungNote: non-standard JPEG table, quality=97
- Photo1_Nikkon: standard JPEG table, quality=80

Example: Clone detection
- Samsung A7 photos: combine
Example: Clone detection
- Clone: copied regions in an image
- Similarity: the similarity between the copied regions and the original
- Minimal detail: blocks with less detail are not considered when searching for copied regions
- Cluster size: how many copied regions need to be found for them to show up as results

Example: Clone detection
- Increase minimum similarity

http://www.imageforensic.org/
Forgery detection techniques
- Two general classes of techniques
  - Non-source identification related
    - Lighting direction, shape of the light source
    - Specific tampering anomalies
  - Source identification related
    - Features: sensor noise pattern, dust patterns, demosaicing regularity, statistical regularities, chromatic aberration

Non-source identification methods
- Tampering characteristics: different tampering methods leave different characteristics
  - Copy-move forgery: highly correlated regions
  - Splicing: sharp discontinuity at the boundary
  - Double JPEG compression: periodicity in the DCT coefficient histogram
  - Uneven JPEG blocking artifacts

Copy-move Forgery
- Copying regions of the original image and pasting them into other areas
- The yellow area has been copied and moved to conceal the truck
- 2 types of techniques
  - Block-based
  - Keypoint-based
Detection of Copy-move Forgery
- Block-based: divide into blocks -> feature extraction -> find similar blocks
- For an N x N image and B x B blocks: generate (N-B+1)(N-B+1) blocks

Detection of Copy-move Forgery: Features: DCT
- Block size: 4 x 4
- Original image:

    155 155 155 158 158 156 158 159
    155 155 155 158 158 156 158 159
    155 155 155 158 158 156 158 159
    155 155 155 158 158 156 158 159
    155 155 155 158 158 156 158 159
    151 151 151 154 157 156 156 156
    155 155 155 156 157 158 156 153
    149 149 149 153 155 154 153 154

- Example 4 x 4 blocks taken from the image:

    155 155 155 158      155 155 158 158      158 156 158 159
    155 155 155 158      155 155 158 158      157 156 156 156
    155 155 155 158      155 155 158 158      157 158 156 153
    155 155 155 158      155 155 158 158      155 154 153 154

Detection of Copy-move Forgery: DCT
- Discrete cosine transform: from the spatial domain to the frequency domain
- Original block:

    155 155 155 158
    155 155 155 158
    155 155 155 158
    155 155 155 158

- DCT coefficient block (as shown on the slide):

    420.8  37.7  -3.3   4.2
     -3.0   0.9   2.2  -0.3
     -0.3  -5.4   0.8  -0.7
      2.6   0.7  -0.6   0.6

- Features: coefficients or histogram (figure: one feature vector per block)

Similar condition: two blocks A_i and A_{i+j}, with feature vectors v_i and v_{i+j} and positions (x_i, y_i) and (x_{i+j}, y_{i+j}), match when their features are close and their positions are far enough apart:

$$m\_match(A_i, A_{i+j}) = \sqrt{\sum_{k=1}^{4} \left(v_i^k - v_{i+j}^k\right)^2} < D_{similar}$$

$$d(v_i, v_{i+j}) = \sqrt{(x_i - x_{i+j})^2 + (y_i - y_{i+j})^2} > N_d$$
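The block-based pipeline above can be sketched as follows: slide a B x B window over the image, compute the DCT of each block, and flag pairs of blocks whose features are close while their positions are far apart. This is a brute-force illustration under stated assumptions: the four lowest-frequency DCT coefficients are used as the 4-component feature, the thresholds `d_similar` and `n_d` are arbitrary, and the test image is synthetic. Practical detectors usually sort the feature vectors instead of comparing all pairs.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (n x n)."""
    k = np.arange(n).reshape(-1, 1)
    m = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * m + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def find_copied_blocks(img, bsize=4, d_similar=1e-6, n_d=8.0):
    """Return pairs of block positions (row, col) that satisfy both match conditions."""
    c = dct_matrix(bsize)
    h, w = img.shape
    positions, features = [], []
    for y in range(h - bsize + 1):          # (N-B+1)(N-B+1) overlapping blocks
        for x in range(w - bsize + 1):
            coeff = c @ img[y:y + bsize, x:x + bsize] @ c.T   # 2-D DCT of the block
            positions.append((y, x))
            features.append(coeff[:2, :2].ravel())  # assumed feature: 4 low-frequency coefficients
    pos = np.array(positions, dtype=float)
    feat = np.array(features)
    feat_dist = np.sqrt(((feat[:, None, :] - feat[None, :, :]) ** 2).sum(-1))
    pos_dist = np.sqrt(((pos[:, None, :] - pos[None, :, :]) ** 2).sum(-1))
    match = np.triu((feat_dist < d_similar) & (pos_dist > n_d), k=1)  # each pair once
    i, j = np.nonzero(match)
    return [(positions[a], positions[b]) for a, b in zip(i, j)]

# Synthetic forgery: copy an 8 x 8 region of a random image to another location.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32)).astype(float)
image[20:28, 20:28] = image[2:10, 2:10]

pairs = find_copied_blocks(image)
offsets = {(q[0] - p[0], q[1] - p[1]) for p, q in pairs}  # displacement of each match
```

All matched pairs share the same displacement, which is the usual cue that a whole region was moved rather than a few blocks matching by chance. Because the copy here is exact, a tiny `d_similar` suffices; after JPEG compression the threshold would have to be relaxed.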
Results

Detection of Copy-move Forgery: Block-based
- High computational complexity
  - Lots of blocks: compute features, find matching blocks
- Sensitive to geometric manipulation
  - Scaling, rotation

Detection of Copy-move Forgery: Keypoints
- Keypoint-based
  - Descriptors for each keypoint
  - Associate similar keypoints
Review of SIFT-based approach
Steps:
- Scale-space extrema detection
  - Search over multiple scales
  - DoG: difference of Gaussians
  - Pyramid: Gaussian filtering, then repeated downsampling & Gaussian filtering; take the difference between adjacent levels
- Keypoint localization
  - Local extrema in the DoG pyramid
  - Cleaning: remove low-contrast points
- Orientation assignment
  - Compute the best orientation for each keypoint
  - Achieve rotation invariance
  - Find the orientation of intensity gradients:

    $$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$

    $$\theta(x, y) = \tan^{-1}\frac{L(x, y+1) - L(x, y-1)}{L(x+1, y) - L(x-1, y)}$$

  - 36-bin histogram (10 degrees per bin)
  - Keypoint orientation = histogram peak
- Keypoint descriptors
  - 16x16 image patch descriptors
  - Center: keypoint; origin axis: orientation
  - Form 4x4 sub-patches
  - Sub-patch: histogram (8 bins) of gradient orientation
  - Local image gradients: 4x4x8 = 128 values
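The orientation assignment step can be sketched with the gradient formula above: compute the pixel differences of the smoothed patch L, build a 36-bin histogram of gradient directions, and take the peak. Weighting each vote by the gradient magnitude follows the usual SIFT description and is an assumption here, as are the function name and the synthetic patches.

```python
import numpy as np

def dominant_orientation(L, bins=36):
    """Return the lower edge (degrees) of the peak bin of the gradient-orientation histogram."""
    # Arrays are indexed L[y, x]; differences follow the formula for theta(x, y).
    dx = L[1:-1, 2:] - L[1:-1, :-2]    # L(x+1, y) - L(x-1, y)
    dy = L[2:, 1:-1] - L[:-2, 1:-1]    # L(x, y+1) - L(x, y-1)
    magnitude = np.hypot(dx, dy)       # assumed vote weight
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0
    hist, _ = np.histogram(theta, bins=bins, range=(0.0, 360.0), weights=magnitude)
    return int(np.argmax(hist)) * 360.0 / bins

# Synthetic 16 x 16 patches: brightness ramps in known directions.
ys, xs = np.mgrid[0:16, 0:16].astype(float)
horizontal = dominant_orientation(xs)        # gradient along +x
diagonal = dominant_orientation(xs + ys)     # gradient at 45 degrees
```

Rotating the patch shifts the histogram peak by the same angle. Expressing the descriptor relative to that peak is what gives the rotation invariance listed above.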
Copy-move Forgery Detection
- Similar objects matching
- Keypoint matching: compare the descriptor F of keypoint x_j with the descriptors of the other keypoints in the set X

  $$D(x_j, X) = \{\lVert F_{x_j} - F_{x_1} \rVert, \lVert F_{x_j} - F_{x_2} \rVert, \ldots, \lVert F_{x_j} - F_{x_m} \rVert\} = \{d_{j,1}, d_{j,2}, \ldots, d_{j,n}\}$$

- Small distance: similarity of keypoints

Forgery detection techniques
- Two general classes of techniques
  - Non-source identification related
    - Lighting direction, specific tampering anomalies
  - Source identification related
    - Features:
      - Hardware defects (lens distortion)
      - Sensor defects (sensor noise pattern, dust patterns)
      - Processing regularities (CFA, JPEG)

Forensic work flow
- Legal system: accepts the forensic analysis of digital image evidence if the attribution techniques are unbiased, reliable, non-destructive and widely accepted by experts in the field
Image authentication
- Two-step process
  - Examine the reliability of the evidence (image tampering and forgery detection)
  - Analyze its probative value with regard to the source camera and image metadata

Example
- Prosecuting attorneys claim: a series of images discovered on a suspect's computer is potential evidence of a crime
- It is possible that a third party had access to the suspect's computer, but there is no evidence of such access
- It is desirable for the forensic evidence examiner to provide information about the consistency of these images with a specific digital camera discovered in the suspect's house

Digital Image Generation

Example: Image Acquisition
Example: Image Acquisition
- Lens: focuses the light of the scene on the sensor
- Filters: filter out the invisible part of the light (infra-red, ultraviolet)
- CFA: color filter array (on top of the sensor)
  - Common: only one sensor for detecting all three colors (red, green, blue)

Example: Image Acquisition
- Sensor: CCD/CMOS
  - Photosensitive pixels capture photons and convert them into charge
- CFA interpolation (demosaicing)
  - To generate an image with full resolution for all colors
  - At each sensor pixel, only one color is measured
  - The other two colors have to be estimated from neighboring pixels

Example: Image Acquisition
- Post-processing: apply enhancement techniques to eliminate unwanted artifacts, degradations or noise
  - Color-artifact removal (artifacts introduced during CFA interpolation), edge enhancement
- Storing
  - EXIF
  - JPEG format
Source-based forgery detection or model identification
- Discover traces left by a hardware component or software process during the image generation process
- Image artifacts: 2 types
  - Hardware-related: caused by the lens, sensor imperfections (noise)
  - Software-related: introduced through camera processing

Image artifacts
- Hardware
  - Optical aberrations: lens radial distortion, chromatic aberration
  - Sensor: sensor noise, sensor dust pattern
- Software
  - Processing statistics: model statistics, high order statistics
  - Processing regularities: CFA array, JPEG compression

Hardware: Optical defects
- Optical aberrations
- Radial lens distortion
  - Straight lines appear curved in an image
  - Serious in low-cost wide-angle lenses
  - The degree of distortion changes with focal length

Hardware: Optical defects
- Order-2 model
  - (x_d, y_d): distorted image coordinate
  - (x, y): undistorted coordinate
  - (a, b): optical centre
  - r = sqrt((x-a)^2 + (y-b)^2)

  $$x = (x_d - a)(1 + k_1 r^2 + k_2 r^4) + a$$

  $$y = (y_d - b)(1 + k_1 r^2 + k_2 r^4) + b$$

- Find distorted lines to estimate k_1 and k_2
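The order-2 model can be applied as a short function once k_1 and k_2 are known. This is a minimal sketch: it takes the radius r from the distorted coordinates so that the correction can be computed directly from the observed image, which is an assumption (the slide writes r in terms of (x, y)). The function name and parameter values are illustrative.

```python
def undistort_point(xd, yd, a, b, k1, k2):
    """Map a distorted coordinate (xd, yd) to an undistorted one with the order-2 radial model."""
    # Assumption: squared radius measured from the optical centre (a, b) in the distorted image.
    r2 = (xd - a) ** 2 + (yd - b) ** 2
    factor = 1.0 + k1 * r2 + k2 * r2 ** 2   # 1 + k1*r^2 + k2*r^4
    x = (xd - a) * factor + a
    y = (yd - b) * factor + b
    return x, y

# With k1 = k2 = 0 the model is the identity; the optical centre never moves.
identity = undistort_point(10.0, 20.0, 5.0, 5.0, 0.0, 0.0)
centre = undistort_point(5.0, 5.0, 5.0, 5.0, 0.1, 0.2)
# Hypothetical coefficients: a point one pixel right of the centre is pushed outward.
shifted = undistort_point(6.0, 5.0, 5.0, 5.0, 0.1, 0.0)
```

Estimating k_1 and k_2 then amounts to searching for the values that make lines known to be straight in the scene come out straight after this mapping.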
Hardware: Optical defects
- More likely to be used for forgery detection
- Less likely to be used for source camera attribution
  - Cameras built with the same/similar lenses have similar characteristics
  - Scene content dependency: difficult to estimate distortion in images with flat scene content

Hardware: Optical defects
- Less likely to be used for source camera attribution
  - Camera setting dependency
    - Distortion changes with focal length, focal distance, aperture size, illumination, etc.
    - Images captured with one device but different zoom settings show different distortions

Hardware: sensor defects
- Sensor imperfections: sensor defects, sensor pattern noise, sensor dust
- Sensor defects / pixel defects
  - Dead pixels: do not respond to light, appear as a black spot
  - Rarely exist in newly manufactured cameras, or are removed during post-processing

Hardware: sensor defects
- Sensor pattern noise
  - Most severe type of sensor artifact
  - Photo-response non-uniformity: arises from differences in the sensitivity of pixels
  - Sensitivity: measured by determining the light intensity
  - Effect of the inhomogeneity of the silicon wafer and the imperfection of the sensor manufacturing process
Hardware: sensor defects
- Photo response non-uniformity noise (PRNU)
- Output image = original image + PRNU + other noise
  (noisy output = noise-free input + PRNU noise + other noise)
- This pattern noise survives in every image taken by the same camera
- Device 1 vs Device 2: the pattern is unique for each individual device

Hardware: sensor defects
- PRNU:
  - Can be used to identify the individual device used for taking the image
  - Is able to distinguish cameras of the same model and brand
  - Has been used in court cases, where the query image was tested against the claimed camera device
  - Device linking

Hardware: sensor defects
- Dust pattern on lens
  - Cameras with interchangeable lenses
  - Dust particles remain in front of the imaging sensor
  - Produce a constant pattern in all captured images
- Results: high classification accuracy
- Problem: user cleaned the lens?
  - A positive result is conclusive, but a negative result is inconclusive
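The PRNU model from the slides above (noisy output = noise-free input + PRNU + other noise) suggests a simple test: estimate a camera fingerprint by averaging the noise residuals of several images, then correlate the residual of a query image with that fingerprint. The sketch below uses synthetic data and the additive model as written on the slide; the box-blur denoiser, noise levels and function names are assumptions chosen for illustration, not the method of any particular tool.

```python
import numpy as np

def box_blur(img, k=3):
    """Simple k x k mean filter, used here as a stand-in denoiser."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def residual(img):
    """Noise residual: the image minus its denoised version."""
    return img - box_blur(img)

def fingerprint(images):
    """Average the residuals of many images; random noise averages out, PRNU remains."""
    return np.mean([residual(im) for im in images], axis=0)

def correlation(a, b):
    """Normalized correlation between two arrays."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

rng = np.random.default_rng(1)
shape = (64, 64)
prnu_1 = rng.normal(0.0, 2.0, shape)   # fixed pattern of hypothetical camera 1
prnu_2 = rng.normal(0.0, 2.0, shape)   # fixed pattern of hypothetical camera 2

def shoot(prnu):
    """Synthetic photo of a flat scene: brightness + PRNU + other (random) noise."""
    return rng.uniform(50, 200) + prnu + rng.normal(0.0, 1.0, shape)

reference = fingerprint([shoot(prnu_1) for _ in range(20)])
same_camera = correlation(residual(shoot(prnu_1)), reference)
other_camera = correlation(residual(shoot(prnu_2)), reference)
```

With real photographs the scene content leaks into the residual, so practical systems use stronger denoisers and many reference images. The decision rule is the same: a high correlation links the query image to the device.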
Software: processing statistics
- Identify statistical artifacts left by different cameras
  https://www.dpreview.com/reviews/studiocompare.asp
- Color characteristics: color reproduction of the camera with respect to each color band
- Image quality: measure the quality of the scene reproduction by the optical system

Software: processing statistics
- Example statistical features
  - Average pixel value per RGB band and correlation between RGB pairs
  - Pixel difference
  - Use filters to decompose each RGB band into three sub-bands; determine mean, variance, etc.
  - Discrete cosine transform, wavelet transform, ridgelet, contourlet, ...

Software: processing statistics
- Challenges
  - Large inter-model similarity: devices of the same brand share similar hardware and processing components, so their models are difficult to separate
  - Camera setting dependency: focal length, indoor/outdoor illumination, flash
  - Scene content dependency: images captured by 2 cameras in different environments
Software: processing regularities
- Examine processing artifacts
- CFA configuration: the specific arrangement of color filters across the sensor plane

Software: processing regularities
- Examine processing artifacts
- CFA interpolation algorithms
  - Used to estimate the missing colors from surrounding samples of the raw pixels
  - Use different sizes for interpolation (number of surrounding samples)
  - Adopt different methods to estimate the missing color: simple averaging, weighted averaging, image content dependent averaging
Software: processing regularities
- Bilinear interpolation: the missing green value G_I is the average of the green samples to the left (G_L), right (G_R), below (G_B) and above (G_A):

  $$G_I = \frac{1}{4}\left(G_L + G_R + G_B + G_A\right)$$

(Figure: original vs interpolated image)
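The bilinear rule G_I = (G_L + G_R + G_B + G_A) / 4 can be sketched for the green channel of a raw mosaic. The sketch assumes a Bayer-style layout in which green samples sit where row + column is even, so every non-green pixel has four green neighbours; the layout, function name and test data are assumptions for illustration.

```python
import numpy as np

def interpolate_green(raw):
    """Fill in green at non-green sites as the mean of the four adjacent green samples."""
    h, w = raw.shape
    rows, cols = np.indices((h, w))
    is_green = (rows + cols) % 2 == 0          # assumed CFA layout
    green = np.where(is_green, raw, 0.0)
    # Reflect padding keeps the checkerboard parity, so border neighbours are green too.
    p = np.pad(green, 1, mode="reflect")
    average = (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0
    return np.where(is_green, raw, average)

# Test scene: the true green plane is linear, G(r, c) = 2r + 3c.
rows, cols = np.indices((8, 8))
true_green = (2 * rows + 3 * cols).astype(float)
# Raw mosaic: green is only recorded at green sites; other sites hold another color (here 999).
raw = np.where((rows + cols) % 2 == 0, true_green, 999.0)
estimated = interpolate_green(raw)
```

On smooth regions the estimate is exact, but each interpolated pixel becomes a fixed linear combination of its neighbours. That periodic correlation is the demosaicing regularity that forensic detectors look for, and its absence or disruption can indicate tampering.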
Artifacts
- Appear at edges / regions with high frequency
- Features: study the relationship among neighboring pixels

Image artifacts + machine learning
- Hardware
  - Optical aberrations: lens radial distortion, chromatic aberration
  - Sensor: sensor noise, sensor dust pattern
- Software
  - Processing statistics: model statistics, high order statistics
  - Processing regularities: CFA array, JPEG compression

Machine learning approach
- Used to analyze large amounts of data
- Black box approach:
  - Collect all features from a large number of multimedia data
  - Use the machine learning approach for grouping / classifying these features
Machine learning approach
- 2 types
  - Supervised: make predictions based on a given set of features
  - Unsupervised: the algorithm learns the data and organizes it

Machine learning approach
- Examples:
  - Support vector machine
  - Clustering algorithm
  - Artificial neural networks
  - Nearest neighbors
  - Deep learning algorithm

Example: Tampering detection using demosaicing regularity
Tools for source camera attribution
- Amped Software: Authenticate
  https://ampedsoftware.com/authenticate
- Available to qualified government/law enforcement agencies
- Software package for forensic image authentication and tamper detection on digital photos
  - Error level analysis
  - Multiple JPEG compression
  - PRNU identification: create a PRNU reference
  - PRNU tampering: find inconsistencies in PRNU noise

(Figures: clone blocks; inconsistency of sensor noise; multiple compression)
Applications
- Insurance companies
  - Use forensics to cut fraud and abuse
  - Car crash: minor dents and scratches
  - Customers upload a picture/video to the insurance company to save time
  - Findings: photo editing software is used to create fake photo evidence

(Figure: multiple compression)

Applications: insurance

Forensic Image Analyser
http://www.forensicpathways.com/forensic-image-analyser/
- Identifies whether an image was taken by a suspected device
- Identifies which images in a set were taken by the same device and which were taken by other devices
Other approaches
- Read about the real court case on the web site
- Other tools: Fourandsix Technologies
  http://www.fourandsix.com/

Consistency checking
- Photos mostly come with an EXIF header
- Is the information in the header (ISO speed rating, exposure time, focal length) consistent with the image content?
- Estimate the camera settings from the image content and compare them with the data found in the EXIF header