Why Visual Quality Assessment?

Why Visual Quality Assessment?
Sample image- and video-based applications
- Entertainment
- Communications
- Medical imaging
- Security
- Monitoring
- Visual sensing and control
- Art

Why Visual Quality Assessment?
What is Quality?
- Fidelity
- Satisfaction
- Performance
- Aesthetic
- Diagnostic
- Other
Some uses of Quality Assessment
- Monitoring & improving the quality of service (QoS) and quality of experience (QoE)
- Performance evaluation
- Improved operation
- Perceptually improved design
- Authentication

Why Visual Quality Assessment?
Quality affected by
- Sensing, capturing devices
- Display, printing, reproduction
- Attacks and protection
- Compression
- Transmission
- Environment
- Human vision
- Viewing position

Basic Imaging System
Imaged Scene -> Imaging Device -> DIGITIZER (Sampling + Quantization) -> STORAGE (Compression) -> PROCESS (Enhancement, Restoration, Compression for transmission)
Quality of captured image depends on:
- Imaging optics, sensors, and electronics
- Color filter characteristics
- Digitization
- Processing
- Compression

Basic Imaging System
Imaged Scene -> Imaging Device -> DIGITIZER (Sampling + Quantization) -> STORAGE (Compression) -> PROCESS (Enhancement, Restoration, Compression for transmission)
Different storage and transmission media depending on application
Multimedia applications over wireless portable devices are gaining popularity: limited bandwidth and storage
- Video over IP
- Portable devices: power issues, in addition to shared bandwidth and an error-prone environment, result in much lower data transfer rates
- Harsh environments and security: operation under very low power and very low bandwidth, below 20 Kbits/sec
Data storage devices: CDs and DVDs
- Data throughput (read and write rates, a few megabits per second) is much lower than storage capacity (a few gigabits)
- 1x Blu-ray DVD: 32 Mbps

Compression Artifacts
Image and video coding standards
- Transform based
  - Block-based DCT coding: JPEG, MPEGx, H.26x
  - Wavelet-based coding: JPEG 2000
- Motion compensation for video
- Quantization

Common Compression Artifacts
- Blocking artifacts in block-based DCT codecs
- Ringing artifacts in wavelet-based codecs
- Blurriness: loss of detail and sharpness due to removal of high-frequency transform coefficients
- Graininess due to quantization of retained transform coefficients
- Contouring
- Color bleeding
- Mosquito noise in video
- Motion jerkiness in video
- Ghosting
- Flickering
EUVIP 2010
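The link between quantization and the artifacts listed above can be illustrated with a small sketch. It assumes a 1-D block of 8 samples, an orthonormal DCT-II, and a single uniform quantizer step (the `step` parameter is hypothetical); real codecs use 2-D blocks and per-frequency quantization tables, so this is only an illustration of the principle.

```python
import math

N = 8  # block length (JPEG uses 8x8 blocks; 1-D here for brevity)

def _scale(k):
    # Orthonormal DCT scaling factor for frequency index k.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct(block):
    """Orthonormal DCT-II of a length-N block."""
    return [_scale(k) * sum(x * math.cos(math.pi * (n + 0.5) * k / N)
                            for n, x in enumerate(block))
            for k in range(N)]

def idct(coeffs):
    """Inverse of dct()."""
    return [sum(_scale(k) * c * math.cos(math.pi * (n + 0.5) * k / N)
                for k, c in enumerate(coeffs))
            for n in range(N)]

def code_block(block, step):
    """Quantize the DCT coefficients with a uniform step, then reconstruct."""
    quantized = [round(c / step) * step for c in dct(block)]
    return idct(quantized)

def max_error(a, b):
    return max(abs(p - q) for p, q in zip(a, b))

edge = [10, 10, 10, 10, 200, 200, 200, 200]  # sharp edge inside one block
fine = code_block(edge, step=1)      # small step: nearly lossless
coarse = code_block(edge, step=100)  # large step: small coefficients vanish

print("max error, fine step:  ", round(max_error(edge, fine), 2))
print("max error, coarse step:", round(max_error(edge, coarse), 2))
```

With the coarse step, the small high-frequency coefficients are rounded to zero, so the reconstructed edge loses sharpness. Because each block is quantized independently, neighboring blocks can end up with mismatched values at their shared boundary, which is the origin of blocking.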

Compression Artifacts Degradations due to block-based DCT transform coding

Compression Artifacts
[Figure: Butterfly image, 757x507, compressed two ways]
- JPEG: 10,696 bytes
- JPEG2000: 10,436 bytes
Source: http://www.elecard.com/products/j2kwavelet.php
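To compare these file sizes with the bits-per-pixel figures quoted on later slides, the sizes can be converted to a bit rate. The helper below is a hypothetical convenience function, and it counts every byte of the file, headers included, so the result slightly overstates the rate spent on image data.

```python
def bits_per_pixel(size_bytes, width, height):
    """Average number of coded bits per pixel for a compressed file."""
    return size_bytes * 8 / (width * height)

jpeg_bpp = bits_per_pixel(10696, 757, 507)
jp2k_bpp = bits_per_pixel(10436, 757, 507)
print(f"JPEG: {jpeg_bpp:.3f} bpp, JPEG2000: {jp2k_bpp:.3f} bpp")
```

Both files land at roughly 0.22 bits per pixel, so the visual comparison is made at nearly the same bit rate.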

Common Compression Artifacts Ringing Mosquito Noise

Compression Artifacts

Human Vision and Perception
Quality affected by the human visual system
- Characteristics and limitations of the human visual system
  - Some distortions are introduced
  - Some distortions are masked
- Saliency (visual attention)
  - Faces in images, eyes, mouth
  - High-contrast objects
  - Motion
Snakes.

Objective Visual Quality Models and Metrics
Goal: estimate the quality of visual media automatically and reliably
- Subjective assessments are expensive and not practical for real-time implementations
- Subjective tests are important for evaluating the performance of objective visual quality metrics
- Subjective tests need to follow strict and repeatable evaluation conditions
- ITU-T recommendations: www.itu.int/itut/Publications/recs.html
- Video Quality Experts Group (VQEG) reports: www.vqeg.org

Visual Quality Assessment
Image/video fidelity criteria
Useful for
- rating the performance of image/video processing techniques
- measuring image/video quality and user satisfaction
Issues:
- Viewing distance
- Subjective versus objective measures in evaluating image/video quality
EEE 508

Image Quality Assessment
Subjective criteria: use rating scales

Goodness scales (rate image quality):

  Overall (global) scale      Group scale
  Excellent (5)               Best (7)
  Good (4)                    Well above average (6)
  Fair (3)                    Slightly above average (5)
  Poor (2)                    Average (4)
  Unsatisfactory (1)          Slightly below average (3)
                              Well below average (2)
                              Worst (1)

Impairment scales (rate an image based on the level of degradation present compared to an ideal image; useful in applications such as image coding and compression):

  Not noticeable (1)
  Just noticeable (2)
  Definitely noticeable but only slight impairment (3)
  Impairment not objectionable (4)
  Somewhat objectionable (5)
  Definitely objectionable (6)
  Extremely objectionable (7)

MOS (Mean Opinion Score): the average rating over all observers
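As a small illustration of the MOS computation, the sketch below averages observer ratings for one image. The ratings are made-up values on the 5-point goodness scale, not data from any study.

```python
from statistics import mean

def mean_opinion_score(ratings):
    """MOS: arithmetic mean of the observers' ratings for one stimulus."""
    return mean(ratings)

# Hypothetical ratings from five observers (5 = Excellent ... 1 = Unsatisfactory).
ratings = [5, 4, 4, 3, 5]
mos = mean_opinion_score(ratings)
print(mos)
```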

Visual Quality Assessment
Traditional quantitative criteria:
The most commonly used traditional quantitative criteria are based on the mean square error (MSE) norm. In most applications, the mean square error is expressed in terms of a signal-to-noise ratio (SNR), which is defined in decibels (dB) as

$$\mathrm{SNR}_{\mathrm{dB}} = 10 \log_{10} \frac{\sigma^2}{\sigma_{mse}^2}$$

where $\sigma^2$ is the variance of the original image and $\sigma_{mse}^2 = E\left[\left(I_o(i,j) - I_p(i,j)\right)^2\right]$ is the mean square error (error variance) between the original image $I_o$ and the processed image $I_p$.

$\sigma_{mse}^2$ is often approximated by the average least squares error:

$$\sigma_{lse}^2 = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(I_o(i,j) - I_p(i,j)\right)^2$$
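A minimal sketch of the SNR computation follows, assuming images stored as lists of rows of grayscale values and using the average least squares error in place of the MSE. The function names are illustrative only.

```python
import math

def lse(original, processed):
    """Average least squares error between two equal-size images."""
    M, N = len(original), len(original[0])
    total = sum((original[i][j] - processed[i][j]) ** 2
                for i in range(M) for j in range(N))
    return total / (M * N)

def variance(image):
    """Variance of all pixel values of an image."""
    pixels = [p for row in image for p in row]
    mu = sum(pixels) / len(pixels)
    return sum((p - mu) ** 2 for p in pixels) / len(pixels)

def snr_db(original, processed):
    """SNR in dB: original image variance over error variance.

    Undefined when the two images are identical (zero error).
    """
    return 10.0 * math.log10(variance(original) / lse(original, processed))

original = [[0, 10], [20, 30]]
processed = [[1, 9], [21, 29]]
print(round(snr_db(original, processed), 2))
```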

Visual Quality Assessment
Traditional quantitative criteria:
Other types of SNR used in image coding applications:
- Peak-to-Peak SNR (dB) = PPSNR

$$\mathrm{PPSNR} = 10 \log_{10} \frac{(\text{peak-to-peak value of reference image})^2}{\sigma_e^2}$$

- Peak SNR (dB) = PSNR (more commonly used)

$$\mathrm{PSNR} = 10 \log_{10} \frac{(\text{peak value of reference image})^2}{\sigma_e^2}$$

where $\sigma_e^2$ is the error variance (MSE).
PSNR generally results in values 12 to 15 dB above the value of SNR.
SNR and PSNR are commonly used as measures of quality. They usually correlate well with perceptual quality in image coding applications at high or very low bit rates, but they may not correlate well at low bit rates.
They are commonly used because of their mathematical tractability (easy to compute and to handle when developing image processing algorithms).
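The PSNR definition can be sketched as follows. The default `peak=255.0` is an assumption that applies to 8-bit images; for other bit depths the peak value of the reference image would be used instead.

```python
import math

def mse(original, processed):
    """Mean square error between two equal-size images (lists of rows)."""
    M, N = len(original), len(original[0])
    total = sum((original[i][j] - processed[i][j]) ** 2
                for i in range(M) for j in range(N))
    return total / (M * N)

def psnr_db(original, processed, peak=255.0):
    """PSNR in dB; returns infinity when the images are identical."""
    err = mse(original, processed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

original = [[0, 10], [20, 30]]
processed = [[1, 9], [21, 29]]
print(round(psnr_db(original, processed), 2))
```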

Image Quality Assessment
[Figure: two example images labeled RMSE = 8.5 and RMSE = 9.0]

Design and Evaluation of Quality Metrics
[Block diagram]
- Visual database / content database [1]: raw content, reference content, (optional) processing, test content
- Subjective testing: raw scores, Z scores, Mean Opinion Score (MOS), DMOS
- Objective visual quality metric: metric M, nonlinear logistic function, predicted MOS (MOS_p)
- Statistical analysis and performance assessment
[1] LIVE Database, http://live.ece.utexas.edu/research/quality/
Copyright 2010 by Lina J. Karam
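Two steps of this pipeline lend themselves to a short sketch: converting one observer's raw scores to Z scores, and mapping a metric value M to a predicted MOS through a logistic function. The three-parameter logistic form and the parameter names `b1`, `b2`, `b3` are assumptions made for illustration; in practice the parameters are fitted to subjective data, and other logistic forms are also in use.

```python
import math
from statistics import mean, pstdev

def z_scores(raw):
    """Normalize one observer's raw scores to zero mean and unit variance.

    Assumes the observer did not give the same score to every stimulus.
    """
    mu, sd = mean(raw), pstdev(raw)
    return [(r - mu) / sd for r in raw]

def logistic(m, b1, b2, b3):
    """Monotonic logistic mapping from a metric value m to a predicted MOS."""
    return b1 / (1.0 + math.exp(-b2 * (m - b3)))

z = z_scores([1, 2, 3, 4, 5])       # hypothetical raw scores
mos_p = logistic(3.0, b1=5.0, b2=1.0, b3=3.0)  # hypothetical parameters
print([round(v, 3) for v in z], mos_p)
```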

Performance Evaluation of Quality Metrics
Popular performance evaluation measures
- Pearson Correlation Coefficient (PCC): measures prediction accuracy, i.e., the ability of the metric to predict subjective MOS with a low error.
- Spearman rank order correlation coefficient (SROCC): measures prediction monotonicity, i.e., whether an increase (decrease) in one variable results in an increase (decrease) in the other variable, independent of the magnitude of the increase (decrease).
- Outlier Ratio (OR): measures consistency, i.e., the degree to which the metric maintains its prediction accuracy; it is defined as the percentage of predictions that fall outside the range of 2 times the standard deviation of the subjective results.
- Other
  - RMSE and MAE of objective scores
  - Hypothesis testing and F statistics
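The three measures can be sketched in plain Python as below. The sample values are hypothetical, ties in the rank computation receive the average of their ranks, and the outlier ratio is returned as a fraction rather than a percentage.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(v):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return r

def spearman(x, y):
    """SROCC: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

def outlier_ratio(predicted, mos, mos_std):
    """Fraction of predictions more than 2 standard deviations from the MOS."""
    outliers = sum(1 for p, m, s in zip(predicted, mos, mos_std)
                   if abs(p - m) > 2.0 * s)
    return outliers / len(mos)

metric = [1, 2, 3, 4]
mos = [1, 4, 9, 16]  # monotonic but nonlinear in the metric
print(pearson(metric, mos), spearman(metric, mos))
```

In this example the SROCC is 1 while the PCC is below 1, which shows why a nonlinear mapping is usually applied to the metric before the PCC is computed.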

Visual Quality Databases
What is a visual quality database?
- Set of images/videos (typically with varying content)
- Subjective assessment scores
Why are visual quality databases needed?
- To assess the performance of objective or automatic methods of quality assessment and compare their performance
- To understand human visual perceptual properties

Visual Quality Databases
Existing image quality databases
- LIVE Image (Release 2)
  - JPEG compressed images (169 images)
  - JPEG2000 compressed images (175 images)
  - Gaussian blur (145 images)
  - White noise (145 images)
  - Bit errors in JPEG2000 bit stream (145 images)
- Tampere Image Database 2008 (TID 2008)
  - 25 reference images x 17 types of distortions x 4 levels of distortions
- IRCCyN/IVC Database
  - 10 original images, 235 distorted images generated from 4 different distortion types (JPEG, JPEG 2000, Rayleigh Fading, Blurring)
- Toyama Database
  - 14 original images, 168 distorted images generated from 2 distortion types (JPEG, JPEG 2000)

Visual Quality Databases
Existing video quality databases
- VQEG
  - H.263 compression
  - MPEG-2 compression
- LIVE Video
  - MPEG-2 compression
  - H.264 compression
  - Simulated transmission of H.264 compressed bitstreams through error-prone IP networks and through error-prone wireless networks

Objective Visual Quality Models and Metrics
- Full Reference (FR): Reference + Test -> FR Objective Metric -> Quality
- Reduced Reference (RR): Reference Features + Test -> RR Objective Metric -> Quality
- No Reference (NR): Test -> NR Objective Metric -> Quality

Objective Visual Quality Models and Metrics
Full Reference (FR): Reference + Test -> FR Objective Metric -> Quality
Camera Calibration/Tuning, Aesthetic, Fidelity, Application

Objective Visual Quality Models and Metrics
Reduced Reference (RR): Reference Features + Test -> RR Objective Metric -> Quality
Sample features from Reference; Test

Objective Visual Quality Models and Metrics
No Reference (NR): Test -> NR Objective Metric -> Quality

Objective Visual Quality Models and Metrics
[Classification diagram]
- Full Reference, Reduced Reference, No Reference
- Perceptual (HVS), Visual Media Characteristics, Hybrid
- Frequency Domain, Pixel Domain, Hybrid
- Natural Scene Statistics, Visual Features, Hybrid

Full Reference Perceptual-based Model
Processing steps:
- Multi-channel decomposition of the reference and of the test
- Compute locally adaptive detection thresholds (JNDs) at each location in each channel
- Compute difference at each location in each channel
- Normalize by local JNDs
- Pool over foveal regions
- Pool all foveal differences over entire image/video to obtain D
- Q = 1/D
Basis of several metrics:
- Watson's Spatial Standard Observer (SSO) metric
- Watson's Video Standard Observer (VSO) metric
- Liu, Karam, & Watson JPEG2000 compression distortion quantification and control
- Watson's DCTune
- Hontsch & Karam DCT-based JPEG compression distortion and control
- Hontsch & Karam perceptually lossless compression
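The steps of this model can be sketched as follows, under several simplifying assumptions that are not stated on the slide: each channel is given as a flat list of coefficients, the JND thresholds are supplied rather than computed, the two pooling stages are collapsed into one, and the pooling rule is a Minkowski sum with a hypothetical exponent `beta`.

```python
import math

def perceptual_distortion(ref_channels, test_channels, jnd_channels, beta=4.0):
    """Pool JND-normalized differences over all locations and channels.

    Each argument is a list of channels; each channel is a flat list of
    values at corresponding locations. Minkowski pooling with exponent
    beta is an assumption, not necessarily the pooling used by the metrics
    named above.
    """
    total = 0.0
    for ref, test, jnd in zip(ref_channels, test_channels, jnd_channels):
        for r, t, j in zip(ref, test, jnd):
            total += (abs(r - t) / j) ** beta  # difference in units of JND
    return total ** (1.0 / beta)

def quality(distortion):
    """Q = 1/D; an undistorted test signal gives infinite quality."""
    return math.inf if distortion == 0 else 1.0 / distortion

ref = [[10.0, 20.0]]   # one channel, two locations (hypothetical values)
test = [[12.0, 20.0]]
jnd = [[2.0, 5.0]]
D = perceptual_distortion(ref, test, jnd)
print(D, quality(D))
```

A pooled value of D = 1 corresponds to a difference of about one JND, so values well below 1 suggest that the distortion is unlikely to be visible.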

Perceptually lossless compression Original image, 8 bits per pixel Processed image, 0.35 bits per pixel

Perceptual Quality-based JPEG2K compression Original image, 8 bits per pixel

Perceptual Quality-based JPEG2K compression Conventional JPEG2K, 0.586 bit per pixel

Perceptual Quality-based JPEG2K compression Perceptual JPEG2K, 0.586 bit per pixel

Other FR Metrics based on contrast detection thresholds
- Visual SNR, or VSNR (Chandler & Hemami, ITIP, 2007)
- Weighted SNR, WSNR (Mitsa & Varkur, 93)
- Noise Quality Measure, NQM (Damera-Venkata et al., ITIP, 2000)

Quality Metrics based on Natural Scene Statistics Basic Assumption: Distortions are not natural in terms of Natural Scene Statistics (NSS).

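Objective Visual Quality Models and Metrics
Structural SIMilarity (SSIM) Index
The SSIM metric is calculated on various patches of an image. The measure between two patches $x$ and $y$ of size $N \times N$ is:

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\,\mathrm{cov}_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

- $\mu_x$: mean of $x$
- $\mu_y$: mean of $y$
- $\mathrm{cov}_{xy}$: covariance of $x$ and $y$
- $\sigma_x^2$: variance of $x$
- $\sigma_y^2$: variance of $y$
- $C_1$, $C_2$: small constants that stabilize the division
Multi-Scale Structural SIMilarity (MS-SSIM) Index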

Quality Metrics based on Natural Scene Statistics
Popular SSIM (Structural SIMilarity) FR metric (Wang et al., ITIP 04)
The SSIM between two subimages $x$ and $y$ is given by

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)(2\,\mathrm{cov}_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

- $\mu_x$ and $\mu_y$ are the means of $x$ and $y$
- $\sigma_x^2$ and $\sigma_y^2$ are the variances of $x$ and $y$
- $\mathrm{cov}_{xy}$ is the covariance of $x$ and $y$
- $C_1$ and $C_2$ are constants used to stabilize the division
The SSIM index for an image is the average of the SSIM indices over all subimages.
Extensions: MS-SSIM, CWSSIM, VSSIM
Other FR NSS metrics:
- Universal Quality Index (Wang & Bovik, ISPL, 02): earlier form of SSIM
- Information Fidelity Criterion (Sheikh et al., ITIP, 05): GSM in wavelet domain
- Visual Information Fidelity (Sheikh et al., ITIP, 06): adds HVS
RR NSS metric: Reduced Reference Image Quality Assessment (Wang & Simoncelli, 05)
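A minimal sketch of the SSIM computation for a single pair of subimages is given below. It assumes each subimage is flattened to a list of pixel values, uses plain (unweighted) statistics rather than the Gaussian-weighted window of the original paper, and sets the stabilizing constants to $(k_1 L)^2$ and $(k_2 L)^2$ with $k_1 = 0.01$, $k_2 = 0.03$, and dynamic range $L = 255$ for 8-bit images.

```python
def ssim_patch(x, y, L=255.0, k1=0.01, k2=0.03):
    """SSIM between two equal-size patches given as flat lists of pixels."""
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2  # stabilize the division
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

def ssim_image(patch_pairs):
    """Image-level SSIM: average of the patch SSIM values."""
    scores = [ssim_patch(x, y) for x, y in patch_pairs]
    return sum(scores) / len(scores)

x = [10, 20, 30, 40]  # hypothetical 2x2 patch, flattened
y = [12, 18, 33, 41]
print(ssim_patch(x, x), ssim_patch(x, y))
```

Identical patches give a score of 1, and the score decreases as the means, variances, or correlation of the two patches diverge.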

Other Sources of Information Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, "Image quality assessment: From error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, Apr. 2004. http://www.ece.uwaterloo.ca/~z70wang/research/ssim/

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection Just-Noticeable Blur (JNB) concept: The minimum amount of perceived blurriness around an edge given a contrast higher than the Just Noticeable Difference (JND).

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection CPBD (Cumulative Probability of Blur Detection) Metric < 0.63
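The idea behind CPBD can be sketched as follows, assuming the exponential psychometric model of blur detection described in the Ferzli and Karam and the Narvekar and Karam papers listed under Other Sources of Information. Under that model an edge whose width equals the just-noticeable-blur width is detected as blurred with probability of about 63%, and CPBD is the share of edges whose detection probability does not exceed that value. Edge detection and edge-width measurement are omitted, and the values of `beta` and `w_jnb` are illustrative assumptions.

```python
import math

# Detection probability of an edge whose width equals the JNB width (~0.63).
P_JNB = 1.0 - math.exp(-1.0)

def blur_probability(width, w_jnb, beta=3.6):
    """Probability of detecting blur at an edge of the given width."""
    return 1.0 - math.exp(-abs(width / w_jnb) ** beta)

def cpbd(edge_widths, w_jnb=5.0, beta=3.6):
    """Fraction of edges whose blur is at or below the just-noticeable level.

    A single w_jnb is used for all edges here; the JNB width actually
    depends on the local contrast around each edge.
    """
    probs = [blur_probability(w, w_jnb, beta) for w in edge_widths]
    return sum(1 for p in probs if p <= P_JNB) / len(probs)

widths = [3, 3, 10, 10]  # hypothetical edge widths in pixels
print(cpbd(widths))
```

A higher CPBD value indicates a sharper image, since more edges fall below the just-noticeable blur level.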

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection CPBD (Cumulative Probability of Blur Detection) Metric = 0.9 = 1.7

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection
Performance evaluation of CPBD using LIVE Database
- Set 1: All 174 Gaussian blurred images in LIVE.
- Set 2: 30 Gaussian blurred images with varying foreground and background blur quantities.
- Set 3: All 227 JPEG2000 compressed images in LIVE.

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection Performance evaluation of CPBD using TID 2008 Database

No Reference Blur Metric: Just-Noticeable Blur and Probability of Detection Performance evaluation of CPBD using IVC Database Performance evaluation of CPBD using Toyama Database

Other Sources of Information
R. Ferzli and L. J. Karam, "A No-Reference Objective Image Sharpness Metric Based on the Notion of Just Noticeable Blur (JNB)," IEEE Transactions on Image Processing, vol. 18, no. 4, pp. 717-728, Apr. 2009.
N. D. Narvekar and L. J. Karam, "A No-Reference Image Blur Metric Based on the Cumulative Probability of Blur Detection (CPBD)," IEEE Transactions on Image Processing, vol. 20, no. 9, pp. 2678-2683, Sept. 2011.
http://ivulab.asu.edu/quality

Competitive FR Video Quality Metrics
- Existing still-image quality assessment metrics can be applied to assess video by pooling over frames.
- PVQM (Swisscom/KPN): leader in VQEG Phase 1 study; uses a linear combination of three distortion indicators (edginess, temporal decorrelation, and color error) to measure the perceptual quality (visual feature based, with weighted combinations of distortion indicators related to these features).
- VQM (NTIA): leader in VQEG Phase 2 study and standardized by ITU-T and ISO; provides several quality models, such as the Television Model, the General Model, and the Video Conferencing Model, with several calibration options prior to feature extraction (visual feature based, with weighted combinations of distortion indicators related to the features); main impairments considered in the General Model include blurring, block distortion, jerky/unnatural motion, noise, and error blocks.

Competitive FR Video Quality Metrics
- PEVQ (Opticom): leader in VQEG Multimedia Phase 1 study; builds upon PVQM; became part of ITU-T Recommendation J.247 (FR MM video, 2008).
- MOVIE index (Seshadrinathan & Bovik, ITIP, 2009): spatio-temporal multi-channels, visual masking, temporal quality assessed along computed motion trajectories; builds on principles from SSIM and VIF.

Competitive FR Video Quality Metrics
Issue with current video quality metrics:
- Existing still-image quality assessment metrics, when applied to video, give results that are very competitive with state-of-the-art video quality metrics (performance on the LIVE Video Database).
- Better video quality models are needed.
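The approach mentioned earlier, applying a still-image metric frame by frame and pooling the scores, can be sketched as follows. Mean pooling is an assumption (other temporal pooling rules exist), and the toy frame metric is only a stand-in for a real still-image metric such as PSNR or SSIM.

```python
def video_quality(ref_frames, test_frames, frame_metric):
    """Apply a still-image FR metric to each frame pair and average the scores."""
    scores = [frame_metric(r, t) for r, t in zip(ref_frames, test_frames)]
    return sum(scores) / len(scores)

def negative_mse(ref, test):
    """Toy frame metric on flat pixel lists: higher (closer to 0) is better."""
    return -sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)

# Two hypothetical frames of four pixels each.
ref_video = [[10, 20, 30, 40], [11, 21, 31, 41]]
test_video = [[10, 20, 30, 40], [13, 23, 33, 43]]
print(video_quality(ref_video, test_video, negative_mse))
```

Frame-wise pooling of this kind ignores temporal effects such as motion jerkiness and flicker, which is one reason dedicated video quality models remain of interest.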