Statistical Tools for Digital Forensics Information Technologies for IPR Protection
Henry Chang-Yu Lee One of the world's foremost forensic scientists. Chief Emeritus for Scientific Services for the State of Connecticut. Full professor of forensic science at the University of New Haven, where he has helped to set up the Henry C. Lee Forensic Institute.
Forensics Forensic science, the application of a broad spectrum of sciences to answer questions of interest to the legal system. Criminal investigations. Other forensics disciplines: Forensic accounting. Forensic economics. Forensic engineering. Forensic linguistics. Forensic toxicology.
Digital Forensics Application of the scientific method to digital media in order to establish factual information for judicial review. How is digital forensics associated with DRM? Authorized images may have been tampered with. How can it be shown that an image is neither authentic nor authorized?
Image Tampering Tampering with images is neither new, nor recent. Tampering of film photographs: Airbrushing. Re-touching. Dodging and burning. Contrast and color adjustment. Outside the reach of the average user.
Image Tampering Digital Tampering: Compositing. Morphing. Re-touching. Enhancing. Computer graphics. Painted.
Image Tampering Tampering is not a well-defined notion, and is often application dependent. Image manipulations may be legitimate in some cases, e.g., using a composite image for a magazine cover, but illegitimate in others, e.g., evidence in a court of law.
Watermarking-Based Forensics Digital watermarking has been proposed as a means by which content can be authenticated. Exact authentication schemes: Changing even a single bit is unacceptable. Fragile watermarks. Watermarks become undetectable when the content is changed in any way. Embedded signatures. An authentication signature is embedded in the content at the time of recording. Erasable watermarks. Also known as invertible watermarks; employed in applications that do not tolerate even slight content changes.
Watermarking-Based Forensics Selective authentication schemes: Verify whether content has been modified by any illegitimate distortion. Semi-fragile watermarks. Watermarks survive only legitimate distortions. Tell-tale watermarks. Robust watermarks that survive tampering, but are distorted in the process. The major drawback is that a watermark must be inserted at the time of recording, which would limit this approach to specially equipped digital cameras.
Statistical Techniques for Detecting Traces Assumption: Digital forgeries may be visually imperceptible; nevertheless, they may alter the underlying statistics of an image. Techniques: Copy-move forgery. Duplicated image regions. Re-sampled images. Inconsistencies in lighting. Chromatic aberration. Inconsistent sensor pattern noise. Color filter array interpolation.
Detecting Inconsistencies in Lighting L: direction of the light source. A: constant ambient light term.
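A minimal sketch of how L and A might be estimated, assuming a Lambertian surface with unit reflectance so that the intensity at a point with surface normal N is I = N · L + A. Along an occluding contour the z-component of the normal is zero, so the 2-D light direction (Lx, Ly) and the ambient term A can be fitted by linear least squares. The function name and the synthetic data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def estimate_light_direction(normals, intensities):
    """Least-squares estimate of the 2-D light direction and ambient term.

    normals:     (n, 2) array of unit surface normals along a contour.
    intensities: (n,) array of image intensities at the same points.
    Assumed model: I = Nx*Lx + Ny*Ly + A (Lambertian, unit reflectance).
    """
    normals = np.asarray(normals, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    # Each row of M is [Nx, Ny, 1]; the unknown vector is [Lx, Ly, A].
    M = np.column_stack([normals, np.ones(len(normals))])
    v, *_ = np.linalg.lstsq(M, intensities, rcond=None)
    return v[:2], v[2]

# Synthetic contour: normals sweep half a circle, lit from a known direction.
angles = np.linspace(0.0, np.pi, 50)
normals = np.column_stack([np.cos(angles), np.sin(angles)])
true_L = np.array([0.6, 0.8])
true_A = 0.1
intensities = normals @ true_L + true_A

L_est, A_est = estimate_light_direction(normals, intensities)
```

Estimating the direction separately for different objects in an image and comparing the results is one way such an estimate could expose a composite.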
Detecting Inconsistent Sensor Pattern Noise $n^{(k)} = p^{(k)} - F(p^{(k)})$, $P_c = \frac{1}{N_p} \sum_{k} n^{(k)}$. p: series of images. F: denoising filter. n: noise residuals. $N_p$: number of images. $P_c$: camera reference pattern.
Detecting Inconsistent Sensor Pattern Noise Calculate $\rho(n(Q_k), P_c(R))$ for regions $Q_k$ of the same size and shape as R, coming from other cameras or different locations. Decide that R was tampered with if $p > th = 10^{-3}$, and not tampered with otherwise.
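The residual, reference pattern, and correlation steps can be sketched as follows. This is an illustration under stated assumptions: a simple 3 × 3 mean filter stands in for the denoising filter F, the camera and its images are synthetic, and the statistical decision on p is not reproduced. All function names are hypothetical.

```python
import numpy as np

def box_denoise(img, k=3):
    """k x k mean filter, a simple stand-in for the denoising filter F."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def noise_residual(img):
    """n = p - F(p)."""
    img = np.asarray(img, dtype=float)
    return img - box_denoise(img)

def reference_pattern(images):
    """P_c: average of the noise residuals over a series of images."""
    return np.mean([noise_residual(p) for p in images], axis=0)

def correlation(a, b):
    """Normalized correlation between two equally sized regions."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic camera: every shot carries the same fixed sensor pattern.
rng = np.random.default_rng(0)
pattern = rng.normal(0.0, 2.0, (64, 64))

def shoot(level):
    return level + pattern + rng.normal(0.0, 1.0, (64, 64))

Pc = reference_pattern([shoot(v) for v in np.linspace(50, 200, 20)])

test_img = shoot(120.0)
forged = test_img.copy()
# Pasted region: same brightness, but without the camera's pattern.
forged[16:48, 16:48] = 120.0 + rng.normal(0.0, 1.0, (32, 32))

R = (slice(16, 48), slice(16, 48))
rho_clean = correlation(noise_residual(test_img)[R], Pc[R])
rho_forged = correlation(noise_residual(forged)[R], Pc[R])
```

In this toy setting the untouched region correlates strongly with the reference pattern, while the pasted region does not.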
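Detecting Color Filter Array Interpolation Most digital cameras use a color filter array (CFA), so each sensor pixel records only one color and the remaining colors are interpolated from neighboring pixels. Image forgeries can be detected by determining the CFA pattern and measuring the correlations that the interpolation introduces.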
Reference
H. Farid, Exposing Digital Forgeries in Scientific Images, in ACM MMSec, 2006.
J. Fridrich, D. Soukal, J. Lukas, Detection of Copy-Move Forgery in Digital Images, in Proceedings of Digital Forensic Research Workshop, Aug. 2003.
A. C. Popescu, H. Farid, Exposing Digital Forgeries by Detecting Duplicated Image Regions, Technical Report, 2004.
A. C. Popescu, H. Farid, Exposing Digital Forgeries by Detecting Traces of Resampling, in IEEE TSP, vol. 53, no. 2, Feb. 2005.
M. K. Johnson, H. Farid, Exposing Digital Forgeries by Detecting Inconsistencies in Lighting, in ACM MMSec, 2005.
M. K. Johnson, H. Farid, Exposing Digital Forgeries Through Chromatic Aberration, in ACM MMSec, 2006.
J. Lukas, J. Fridrich, M. Goljan, Detecting Digital Image Forgeries Using Sensor Pattern Noise, in SPIE, Feb. 2006.
A. C. Popescu, H. Farid, Exposing Digital Forgeries in Color Filter Array Interpolated Images, in IEEE TSP, vol. 53, no. 10, Oct. 2005.
Discussion The problem of detecting digital forgeries is a complex one with no universally applicable solution. Reliable forgery detection should be approached from multiple directions. Forensics is done in a fashion that adheres to the standards of evidence admissible in a court of law. Thus, digital forensics must be techno-legal in nature rather than purely technical or purely legal.
Exposing Digital Forgeries in Scientific Images Hany Farid, ACM Proceedings of the 8th Workshop on Multimedia and Security, Sep. 2006
Outline Introduction Image Manipulation Image Segmentation Automatic Detection Discussion
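Introduction The Hwang Woo-suk stem cell research fraud in South Korea. 2005/06/17: Hwang Woo-suk announced the successful establishment of 11 stem cell lines derived from patients' somatic cells; the paper was published in the internationally renowned journal Science. 2005/11/11: Co-author Schatten accused Hwang of concealing from him the source of the eggs, and stated that the data in the paper he had published with Hwang were flawed. 2005/11/21: Seoul National University in South Korea, at Hwang's own request, also opened an investigation into his experimental results.
Introduction The Hwang Woo-suk stem cell research fraud in South Korea. 2005/12/23: A preliminary report showed that most of the data in Hwang's 2005 Science paper were fabricated: of the 11 stem cell lines reportedly derived from patients' somatic cells, only two actually existed. The finding also indicated that Hwang's errors were not unintentional but deliberate deception. 2005/12/29: The investigation committee further announced that the DNA of the two patient stem cell lines said to exist did not match the original somatic cells either. 2006/1/13: Science formally announced the retraction of Hwang's two papers from 2005 and 2004.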
Image Manipulation Action of each manipulation scheme: Deletion, (a). A band was erased. Healing, (b). Several bands were removed using Photoshop's healing brush. Duplication, (c). A band was copied and pasted into a new location.
Image Manipulation Effect of each manipulation scheme: Deletion. Removes the small amounts of noise that are present throughout the dark background of the image. Healing. Disturbs the underlying spatial frequency (texture). Duplication. Leaves behind an obvious statistical pattern: two regions in the image are identical. The problem of detecting each of these statistical patterns is formulated as an image segmentation problem.
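Image Segmentation: Graph Cut Consider a weighted graph G = (V, E). A graph can be partitioned into A and B such that $A \cap B = \emptyset$ and $A \cup B = V$. To remove the bias, which is a natural tendency to cut a small number of low-cost edges, the cost of the cut is normalized: $\mathrm{Ncut}(A, B) = \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(A, V)} + \frac{\mathrm{cut}(A, B)}{\mathrm{assoc}(B, V)}$, where $\mathrm{cut}(A, B) = \sum_{u \in A, v \in B} w(u, v)$ and $\mathrm{assoc}(A, V) = \sum_{u \in A, v \in V} w(u, v)$.
Image Segmentation: Graph Cut Define W as an $n \times n$ matrix such that $W_{i,j} = w(i, j)$ is the weight between vertices i and j. Define D as an $n \times n$ diagonal matrix whose i-th element on the diagonal is $\sum_{j} w(i, j)$. Solve the eigenvector problem $D^{-1/2} (D - W) D^{-1/2} e = \lambda e$ for the eigenvector e with the second smallest eigenvalue $\lambda$. Let the sign of each component of e define the membership of the vertex.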
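The eigenvector step can be sketched with NumPy for a small dense weight matrix. This is a minimal illustration, assuming a symmetric W with strictly positive row sums; the function name and the toy graph are hypothetical.

```python
import numpy as np

def normalized_cut_bipartition(W):
    """Split vertices into two groups by the sign of the eigenvector of
    D^{-1/2} (D - W) D^{-1/2} with the second smallest eigenvalue."""
    W = np.asarray(W, dtype=float)
    d = W.sum(axis=1)                      # diagonal of D
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Scale rows and columns of (D - W) by D^{-1/2}.
    L_sym = d_inv_sqrt[:, None] * (np.diag(d) - W) * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    e = eigvecs[:, 1]                          # second smallest eigenvalue
    return e >= 0                              # sign gives the membership

# Toy graph: two tightly connected triples joined by weak edges.
W = np.full((6, 6), 0.01)
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)

labels = normalized_cut_bipartition(W)
```

For an image, each pixel is a vertex, so W has one row per pixel; a dense matrix as used here is only practical for small images or patches.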
Image Segmentation: Intensity For deletion. $I(\cdot)$: gray value at a given pixel. $\Delta_{i,j}$: Euclidean distance between pixels i and j.
Image Segmentation: Intensity First Iteration: Group into regions corresponding to the bands (gray pixels) and the background. Second Iteration: The background is grouped into two regions (black and white pixels.)
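Image Segmentation: Texture For healing. $I_g(\cdot)$: the magnitude of the image gradient at a given pixel.
Image Segmentation: Texture $d(\cdot)$: 1-D derivative filter. [0.0187 0.1253 0.1930 0.0 -0.1930 -0.1253 -0.0187] $p(\cdot)$: low-pass filter. [0.0047 0.0693 0.2454 0.3611 0.2454 0.0693 0.0047] A 2-D derivative filter is separable into a 1-D low-pass filter and a 1-D derivative filter, e.g. $\begin{bmatrix} 1 & 0 & -1 \\ 2 & 0 & -2 \\ 1 & 0 & -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \end{bmatrix}$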
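A sketch of how the gradient magnitude might be computed from the two 1-D filters, assuming the usual separable scheme: the derivative filter is applied along one axis and the low-pass filter along the other, and the two partial derivatives are combined as $\sqrt{I_x^2 + I_y^2}$. The antisymmetric sign pattern of the derivative taps and the zero-padded borders are assumptions of this sketch, and the function names are hypothetical.

```python
import numpy as np

# Filter taps from the slide; the derivative filter is taken as antisymmetric.
D = np.array([0.0187, 0.1253, 0.1930, 0.0, -0.1930, -0.1253, -0.0187])
P = np.array([0.0047, 0.0693, 0.2454, 0.3611, 0.2454, 0.0693, 0.0047])

def filter_rows(img, kernel):
    """Convolve every row with a 1-D kernel (zero-padded borders)."""
    return np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="same"), 1, img)

def filter_cols(img, kernel):
    """Convolve every column with a 1-D kernel (zero-padded borders)."""
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="same"), 0, img)

def gradient_magnitude(img):
    img = np.asarray(img, dtype=float)
    Ix = filter_cols(filter_rows(img, D), P)   # derivative in x, smooth in y
    Iy = filter_rows(filter_cols(img, D), P)   # derivative in y, smooth in x
    return np.sqrt(Ix ** 2 + Iy ** 2)

# A horizontal ramp has unit slope, so its interior gradient magnitude is about 1.
ramp = np.tile(np.arange(32, dtype=float), (32, 1))
g = gradient_magnitude(ramp)
```

Values near the image border are distorted by the zero padding and should be ignored or handled with a different boundary rule.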
Image Segmentation: Texture First Iteration: Using intensity-based segmentation. Group into regions corresponding to the bands (gray pixels) and the background. Second Iteration: Using texture-based segmentation. The background is grouped into two regions (black and white pixels.)
Image Segmentation: Duplication For duplication. One iteration.
Automatic Detection Denote the segmentation map as S(x, y). Consider all pixels (x, y) with value S(x, y) = 0 such that all 8 spatial neighbors also have value 0. The mean of all of the edge weights between such vertices is computed across the entire segmentation map. This process is repeated for all pixels (x, y) with value S(x, y) = 1. Values near 1 are indicative of tampering because of significant similarity in the underlying measures of intensity, texture, or duplication.
Automatic Detection Example segmentation results (figure panels): S0 = 0.19, S0 = 0.99, S0 = 0.30, S0 = 0.98, S0 = 0.50, S0 = 0.97.
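The score described above can be sketched as follows, assuming a binary segmentation map S and a dense n × n weight matrix W indexed in row-major pixel order. Pixels on the image border are treated as not having all 8 neighbors, and self-weights are excluded from the mean; both are choices of this sketch. The intensity-based weight used in the example is a hypothetical stand-in, not the weight function from the slides.

```python
import numpy as np

def interior_mask(S, label):
    """Pixels equal to `label` whose 8 spatial neighbors also equal `label`."""
    same = (S == label)
    padded = np.pad(same, 1, mode="constant", constant_values=False)
    h, w = same.shape
    mask = np.ones_like(same, dtype=bool)
    for dy in range(3):
        for dx in range(3):
            mask &= padded[dy:dy + h, dx:dx + w]
    return mask

def mean_interior_weight(S, W, label):
    """Mean edge weight between all pairs of interior pixels of one segment."""
    idx = np.flatnonzero(interior_mask(S, label).ravel())
    if len(idx) < 2:
        return float("nan")
    sub = W[np.ix_(idx, idx)]
    off_diag = ~np.eye(len(idx), dtype=bool)   # drop self-weights
    return float(sub[off_diag].mean())

# Toy image: left half perfectly uniform (as after a deletion), right half noisy.
rng = np.random.default_rng(0)
I = np.zeros((6, 6))
I[:, 3:] = rng.uniform(0.0, 1.0, (6, 3))
S = np.zeros((6, 6), dtype=int)
S[:, 3:] = 1

# Hypothetical weight: similarity of gray values between every pair of pixels.
flat = I.ravel()
W = np.exp(-((flat[:, None] - flat[None, :]) ** 2) / 0.1 ** 2)

s0 = mean_interior_weight(S, W, 0)
s1 = mean_interior_weight(S, W, 1)
```

Here the uniform segment scores exactly 1, while the noisy segment scores lower, matching the rule that values near 1 point to tampering.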
Discussion These techniques are specifically designed for scientific images, and for common manipulations that may be applied to them. Like most forensic techniques, they are vulnerable to a host of counter-measures that can hide traces of tampering. As new techniques continue to be developed, it will become increasingly difficult to evade all approaches.