Visual Processing Driven by Perceptual Quality Gauge: A Perspective
Weisi Lin, Zhongkang Lu, Susanto Rahardja, EePing Ong and Susu Yao
Media Processing Department, Institute for Infocomm Research, Singapore
Outline of Presentation
- Review on
  - perceptual visual quality gauges
  - perceptual image/video processing
- Some of our recent research attempts
  - visual quality evaluation
  - perceptual signal manipulations
- Concluding remarks
Facts about Visual Quality Evaluation
- As a standalone metric: image evaluation, algorithm benchmarking
- As an embedded module: shaping algorithms/systems
- The HVS (human visual system): the ultimate appreciator of most images
- PSNR/MSE/MAE: do not match HVS perception
- Perceptual metrics so far:
  - much research interest (VQEG, IEEE G-2.1.6, many others)
  - a difficult odyssey
  - does a general solution exist?
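To make the remark about PSNR/MSE/MAE concrete, here is a minimal sketch of the three signal-fidelity measures on 1-D pixel lists. The function names and the two toy distortions (`spread`, `spot`) are illustrative assumptions, not taken from the talk.

```python
import math

def mse(ref, test):
    """Mean squared error between two equally sized pixel sequences."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)

def mae(ref, test):
    """Mean absolute error between two equally sized pixel sequences."""
    if len(ref) != len(test) or not ref:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(ref, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

# Hypothetical toy data: a flat patch and two different distortions of it.
ref = [100] * 16
spread = [102] * 16            # small error on every pixel
spot = [108] + [100] * 15      # one larger error on a single pixel
```

Both toy distortions have the same MSE, and therefore the same PSNR, although a uniform 2-level shift and a single 8-level spike need not be equally visible to a viewer. Pixel-wise error measures cannot express that kind of difference.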
Different factors for perceptual metric building
- Factors: sensory, perceptual, emotional, domain-specific
- Metric types: PSNR/SNR/MSE/MAE; perceptual metrics; application-specific perceptual metrics
- [Figure axes: performance, difficulty, complexity; application scopes]
A Glimpse on Different Metrics
Criterion | Metric type | Remarks
input signal | image | relatively well explored; color difference to be probed further
input signal | video | simple temporal effect; pooling over frames; further modeling needed
input signal | 3-D views | new area
output | distortion | for single, multiple or overall distortion
output | quality | overall quality
source | natural scene | majority of work on natural pictures
source | computer graphics |
viewer | first-party, second-party, third-party | most interested: third-party ones
methodology | top-down | general, complex
methodology | bottom-up | relatively efficient
reference | full-reference | more info available
reference | reduced-reference | feature selection
reference | no-reference | PSNR not applicable; wider scopes
codec | JPEG, JPEG 2000, MPEG 1,2, MPEG 4, H.261/3, H.264 | knowledge of artifacts to be incorporated
application | SDTV, HDTV, mobile comm, medical | domain knowledge to be used; piece-wise formulation for different quality ranges; PSNR largely irrelevant for mobile comm.
A closer look
- Top-down Metrics
  - single channel approach (CSF filtering): Mannos & Sakrison 74; Faugeras 79; Lukas & Budrikis 82; HeegerLambrecht 99
  - multi-channel decomposition: Daly 93; Lubin 95; Lambrecht 96; Winkler 99; Watson 01
- Bottom-up Metrics
  - luminance/color difference: Miyahara 98; Zhang & Wandell 98
  - sharpness: Caviedes & Gurbuz 02; Winkler 01; Dijk, et al. 03
  - common coding artifacts: Wu & Yuen 97; Yu, et al. 02; Marziliano, et al. 02; Tan & Ghanbari 00; Farias, et al. 03; Caviedes & Oberti 03
  - other features: Suresh & Jayant 05; Lu, et al. 05
- Hybrid (top-down & bottom-up) metrics: Yu, et al. 02; Ong, et al. 04; Tan & Ghanbari 00
- Full-reference Metrics: Daly 93; Lubin 95; Lambrecht 96; Miyahara, et al. 98; Zhang & Wandell 98; Wang, et al. 99, 04; Tan & Ghanbari 00; Winkler 99; Watson 01; Yu, et al. 02; Lin, et al. 05
- Reduced-reference Metrics: Wolf 97; Horita, et al. 03
- No-reference Metrics: Wu & Yuen 97; Caviedes & Gurbuz 02; Marziliano, et al. 02; Caviedes & Oberti 03; Dijk, et al. 03
Perception-driven Visual Processing
- image/video compression
  - quantizer and rate control: Watson 93; Hontsch & Karam 00, 02; Yang, et al. 05
  - foveation-based coding: Wang & Bovik 01; Wang, et al. 03; Itti 04
  - motion search: Malo, et al. 01; Yang, et al. 03
  - inter-frame replenishment: Chiu & Berger 99
  - filtering of residues/coefficients: Safranek 94; Yang, et al. 05
  - scalability: Wang, et al. 03; Lu, et al. 05
  - closed-loop control: Tan, et al. 04
- visual communication
  - watermarking: Wolfgang, et al. 99
  - self-embedment for error correction
  - unequal error protection: Jiang, et al. 99
  - joint source-channel coding
- enhancement/reconstruction
  - demosaicing: Longere, et al. 02
  - synthesis: Ramasubramanian, et al. 99
  - super-resolution formation
  - post-processing: Yao, et al. 05
  - edge-enhancement: Lin, et al. 05
Just-noticeable Difference (JND)
- JND: the visibility threshold below which a change cannot be detected by the HVS (Jayant, et al. 93)
- Differentiation in quality evaluation: near-JND vs. supra-JND
- 2 JNDs, 3 JNDs, ... can also be determined
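The threshold idea on the slide above can be sketched as follows. The sketch assumes, purely for illustration, that a difference can be expressed in JND units by dividing it by the local threshold, and that "near-JND" means within a hypothetical band `near_band` just above the threshold. In practice, levels such as 2 JNDs or 3 JNDs are determined by measurement rather than by simple scaling.

```python
def jnd_units(difference, jnd_threshold):
    """Express a signal difference as a multiple of the local JND threshold."""
    if jnd_threshold <= 0:
        raise ValueError("jnd_threshold must be positive")
    return abs(difference) / jnd_threshold

def classify(difference, jnd_threshold, near_band=1.5):
    """Label a difference as below, near or well above the JND.

    near_band is a hypothetical cut-off (in JND units) chosen for this sketch.
    """
    units = jnd_units(difference, jnd_threshold)
    if units < 1.0:
        return "below JND"      # not detectable under this model
    if units <= near_band:
        return "near-JND"
    return "supra-JND"
```

For example, with a threshold of 4 grey levels, a change of 2 levels falls below the JND, a change of 5 levels is near-JND, and a change of 12 levels is supra-JND under these assumed cut-offs.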
- DCT subbands: Ahumada & Peterson 92; Watson 93; Hontsch & Karam 00, 02; Zhang, et al. 05
- wavelet subbands: Watson, et al. 93
- pyramid subbands: Ramasubramanian, et al. 99
- pixel domain: Chou & Li 95; Chiu & Berger 99; Yang, et al. 03
- contrast masking: Tong & Venetsanopoulos 98; Zhang, et al. 05
- temporal effect
  - eye motion: Daly 98
  - frame difference: Chou & Chen 96
  - temporal CSF: for subbands -- Daly 98; Watson, et al. 01; for pixels -- Zhang 04
- visual-attention modulation: Lu, et al. 05
Recent research attempts: Visual Quality Gauge
New ideas:
- noticeable edge contrast increase -- enhancement
- noticeable edge contrast decrease -- the worst degradation
- noticeable non-edge contrast decrease -- degradation
- noticeable non-edge contrast increase -- degradation
$D = \alpha_1 c_{ne}^{+} + \alpha_2 c_{ne}^{-} + \alpha_3 c_{e}^{-} - \alpha_4 c_{e}^{+}$, where $\alpha_3 > \max(\alpha_1, \alpha_2) > \alpha_4 > 0$
(c: noticeable contrast change; subscripts e/ne: edge/non-edge; superscripts +/-: increase/decrease)
D reduces to the mean absolute error (MAE) measure if JND is not considered and different contrast changes are not differentiated.
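The weighting scheme on the slide above can be sketched as below. This is not the authors' implementation: how each contrast term is measured, the pairing of the first two weights with non-edge increase and decrease, and the default weights are all assumptions. The defaults only respect the slide's ordering, with edge decrease weighted most and edge increase credited as enhancement.

```python
def distortion_gauge(ref, dist, is_edge, jnd, alphas=(1.0, 1.0, 2.0, 0.5)):
    """Weighted sum of noticeable contrast changes, averaged over all samples.

    ref, dist : local contrast values of the reference and processed signal
    is_edge   : True where the sample lies on an edge
    jnd       : per-sample visibility thresholds
    alphas    : (non-edge increase, non-edge decrease, edge decrease,
                 edge increase) weights; hypothetical values
    """
    a1, a2, a3, a4 = alphas
    n = len(ref)
    if n == 0 or not (len(dist) == len(is_edge) == len(jnd) == n):
        raise ValueError("inputs must be non-empty and of equal length")
    ne_inc = ne_dec = e_dec = e_inc = 0.0
    for r, d, edge, t in zip(ref, dist, is_edge, jnd):
        delta = d - r
        if abs(delta) <= t:
            continue                      # change is not noticeable
        if edge:
            if delta > 0:
                e_inc += delta            # edge contrast increase
            else:
                e_dec += -delta           # edge contrast decrease
        else:
            if delta > 0:
                ne_inc += delta           # non-edge contrast increase
            else:
                ne_dec += -delta          # non-edge contrast decrease
    return (a1 * ne_inc + a2 * ne_dec + a3 * e_dec - a4 * e_inc) / n
```

With all thresholds set to zero and every change weighted equally with the same sign (for instance `alphas=(1, 1, 1, -1)`, which cancels the minus sign on the enhancement term), the function returns the mean absolute change. This mirrors the slide's remark about MAE.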
Recent research attempts: to tell a good picture from a good one
- Better quality than the original image
[Figure: image comparison; panels labeled "(Longere, et al. 02)" and "(our method)"]
Recent research attempts: Tests with VQEG-I Data
[Figure: Pearson & Spearman correlations with 95% CI (P0, 1, 3, 5, 8: the five best VQEG-I proponents)]
The new metric:
- outperforms the relevant existing metrics with both databases: VQEG-I (compressed video) and Longere, et al. 02 (demosaiced images)
- has small variation in performance under different test conditions (std for all 9 test groups)
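The two figures of merit named on the slide above can be computed as in this minimal, dependency-free sketch. The function names are illustrative. Evaluations of this kind usually fit a nonlinear mapping between metric output and subjective scores before computing the Pearson correlation; that step is omitted here.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient (prediction accuracy)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman rank-order correlation (prediction monotonicity)."""
    return pearson(average_ranks(x), average_ranks(y))
```

A metric whose output rises monotonically but nonlinearly with the subjective score gets a Spearman correlation of 1 and a Pearson correlation below 1, which is why the two are usually reported together.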
Recent research attempts: Perceptual Signal Modification
1-D illustration: modification of the signal for better compression
[Figure: plots of MC residue (x10^-1) against pixel: the original signal; the simplest but meaningless modification; a reasonable modification (the mean in the neighborhood, B)]
Problem: noticeable distortion introduced
Recent research attempts
[Figure: plots of MC residue (x10^-1) against pixel, marking the JND range and the noticeable distortion]
Making the distortion unnoticeable
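The 1-D illustration can be sketched as follows: each residue sample is pulled toward the mean of its neighborhood, but never by more than its JND, so the change stays inside the unnoticeable range. The window size `radius` and the function name are assumptions made for this sketch, not details from the talk.

```python
def smooth_within_jnd(residue, jnd, radius=1):
    """Move each sample toward its neighborhood mean, limited to +/- JND."""
    if len(residue) != len(jnd):
        raise ValueError("residue and jnd must have the same length")
    n = len(residue)
    out = []
    for i, (x, t) in enumerate(zip(residue, jnd)):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        mean = sum(residue[lo:hi]) / (hi - lo)   # neighborhood mean
        shift = max(-t, min(t, mean - x))        # clamp the change to the JND
        out.append(x + shift)
    return out
```

When the JND is large relative to the signal variation, the result is simply the neighborhood mean. When the JND is small, the output stays close to the original signal and trades compression gain for invisibility of the change.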
Recent research attempts: Perceptual Quality Significance Map (PQSM)
- The HVS is not an ideal sensor: as a result of evolution, it has limited resources (processing power, internal memory) => visual attention
- hierarchical PQSM (full to rough)
- PQSM generation: integration of multiple stimuli
Recent research attempts: Applications of Perceptual Significance Map
- JND models
- quality metrics
- ROI-based compression
- scalable coding
- other visual processing
For resource savings/allocation (bandwidth, computing power, memory space, display/printing resolution) and/or performance enhancement (picture quality)
[Figure: JND and modulated JND for Y, Cb, Cr; in line with eye tracking results]
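One way to picture a "modulated JND" is to raise the visibility threshold where the significance map indicates low attention. The linear rule and the `gain` parameter below are hypothetical choices for illustration only; the talk does not give the modulation function.

```python
def modulate_jnd(jnd, significance, gain=1.0):
    """Raise the JND threshold where perceptual significance is low.

    significance : per-sample values in [0, 1], 1 meaning most attended
    gain         : hypothetical strength of the modulation (>= 0)
    """
    if len(jnd) != len(significance):
        raise ValueError("jnd and significance must have the same length")
    if gain < 0:
        raise ValueError("gain must be non-negative")
    out = []
    for t, s in zip(jnd, significance):
        if not 0.0 <= s <= 1.0:
            raise ValueError("significance values must lie in [0, 1]")
        out.append(t * (1.0 + gain * (1.0 - s)))
    return out
```

Under this rule, fully attended samples keep their original threshold, while unattended ones tolerate larger changes. This is the sense in which such a map could steer bits or computation toward regions of interest.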
Concluding Remarks
- Significant progress
  - perceptual quality gauges: various types of metrics
  - perceptual image/video processing: compression; other related areas
- Interesting areas for further work
  - modeling more temporal effects: motion, jerkiness, mean time between errors, etc.
  - more effective accounting for chrominance effects: esp. for non-coding distortion
  - joint modeling with other media: audio, text, and so on
  - no-reference situations: PSNR not applicable; wider scope of application
  - mobile comm applications: PSNR largely irrelevant
  - codec dependent metrics: e.g. targeting H.264 artifacts
  - ROI-based scalable coding: ROI coding; scalability; SVC standardization
  - adaptive watermarking: authentication; error resilience
References
[1] S. Daly, The visible differences predictor: an algorithm for the assessment of image fidelity, Digital Images and Human Vision (A.B. Watson, ed.), pp. 179-206, The MIT Press, 1993.
[2] J. Lubin, A visual discrimination model for imaging system design and evaluation, Vision Models for Target Detection and Recognition (E. Peli, ed.), pp. 245-283, World Scientific, 1995.
[3] S. Winkler, A perceptual distortion metric for digital color video, Proc. SPIE Human Vision and Electronic Imaging IV, Vol. 3644, B.E. Rogowitz and T.N. Pappas, eds., pp. 175-184, Bellingham, WA, 1999.
[4] VQEG (Video Quality Experts Group), Final Report from the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment, www.vqeg.org, 2000.
[5] VQEG (Video Quality Experts Group), Final Report from the Video Quality Experts Group on the Validation of Objective Models of Video Quality Assessment, Phase II, www.vqeg.org, 2003.
[6] A.M. Eskicioglu and P.S. Fisher, Image quality measures and their performance, IEEE Trans. Communications, Vol. 43(12), pp. 2959-2965, Dec. 1995.
[7] H.R. Wu and M. Yuen, A generalized block-edge impairment metric for video coding, IEEE Sig. Proc. Lett., Vol. 4(11), pp. 317-320, 1997.
[8] P. Marziliano, F. Dufaux, S. Winkler and T. Ebrahimi, A no-reference perceptual blur metric, Proc. IEEE Int'l Conf. Image Proc. (ICIP), 2002.
[9] S. Wolf, Measuring the end-to-end performance of digital video systems, IEEE Trans. Broadcasting, Vol. 43(3), pp. 320-328, 1997.
[10] M. Miyahara, K. Kotani and V.R. Algazi, Objective picture quality scale (PQS) for image coding, IEEE Trans. Communications, Vol. 46(9), pp. 1215-1225, 1998.
[11] X. Zhang and B.A. Wandell, Color image fidelity metrics evaluated using image distortion maps, Signal Processing, Vol. 70(3), pp. 201-214, 1998.
[12] K.T. Tan and M. Ghanbari, A multi-metric objective picture-quality measurement model for MPEG video, IEEE Trans. Circuits Syst. Video Technol., Vol. 10(7), pp. 1208-1213, Oct. 2000.
[13] S. Wolf and M. Pinson, Video quality measurement techniques, NTIA Report 02-392, June 2002.
[14] E. Ong, W. Lin, Z. Lu, S. Yao and M. Etoh, Visual Distortion Assessment with Emphasis on Spatially Transitional Regions, IEEE Trans. Circuits and Systems for Video Technology, Vol. 14(4), pp. 559-566, April 2004.
[15] Z. Yu, H.R. Wu, S. Winkler and T. Chen, Vision-model-based impairment metric to evaluate blocking artifacts in digital video, Proc. IEEE, Vol. 90(1), pp. 154-169, 2002.
[16] Z. Wang, L. Lu and A.C. Bovik, Foveation scalable video coding with automatic fixation selection, IEEE Trans. Image Processing, Vol. 12(2), pp. 243-254, Feb. 2003.
[17] Z. Lu, W. Lin, X. Yang, E. Ong and S. Yao, Modeling Visual Attention's Modulatory Aftereffects on Visual Sensitivity and Quality Evaluation, IEEE Trans. Image Processing, Vol. 14(11), pp. 1928-1942, Nov. 2005.
[18] A. B. Watson, Proposal: Measurement of a JND Scale for Video Quality, prepared for the IEEE G-2.1.6 Subcommittee on Video Compression Measurements meeting, Aug. 7, 2000.
[19] ITU-R Recommendation BT.500-10, Methodology for the Subjective Assessment of the Quality of Television Pictures, ITU, Geneva, Switzerland, 2000.
[20] Sarnoff Corporation, Sarnoff JND vision model, J. Lubin (ed.), Contribution to IEEE G-2.1.6 Compression and Processing Subcommittee, Aug. 1997.
[21] P. Longere, X. Zhang, P. B. Delahunt and D. H. Brainard, Perceptual Assessment of Demosaicing Algorithm Performance, Proc. IEEE, Vol. 90(1), pp. 123-132, Jan. 2002.
[22] D.M. Tan, H. R. Wu and Z. H. Yu, Perceptual coding of digital monochrome images, IEEE Signal Processing Letters, Vol. 11(2), pp. 239-242, Feb. 2004.
[23] A. B. Watson, DCTune: A technique for visual optimization of DCT quantization matrices for individual images, Society for Information Display Digest of Technical Papers XXIV, pp. 946-949, 1993.
[24] I. Hontsch and L. J. Karam, Adaptive image coding with perceptual distortion control, IEEE Trans. Image Processing, Vol. 11(3), pp. 213-222, 2002.
[25] C.-H. Chou and C.-W. Chen, A perceptually optimized 3-D subband codec for video communication over wireless channels, IEEE Trans. Circuits Syst. Video Technol., Vol. 6(2), pp. 143-156, 1996.
[26] J. Malo, J. Gutierrez, I. Epifanio, F.J. Ferri and J. M. Artigas, Perceptual feedback in multigrid motion estimation using an improved DCT quantization, IEEE Trans. Image Processing, Vol. 10(10), pp. 1411-1427, Oct. 2001.
[27] X.K. Yang, W. Lin, Z.K. Lu, E.P. Ong and S.S. Yao, Perceptually-adaptive Hybrid Video Encoding Based On Just-noticeable-distortion Profile, SPIE 2003 Conference on Visual Communications and Image Processing (VCIP), Vol. 5150, pp. 1448-1459, 2003.
[28] X. Yang, W. Lin, Z. Lu, X. Lin, S. Rahardja, E. Ong and S. Yao, Rate Control for videophone using perceptual sensitivity cues, IEEE Trans. Circuits and Systems for Video Technology, Vol. 15(4), pp. 496-507, April 2005.
[29] Y. J. Chiu and T. Berger, A Software-only Videocodec Using Pixelwise Conditional Differential Replenishment and Perceptual Enhancement, IEEE Trans. Circuits Syst. Video Technol., Vol. 9(3), pp. 438-450, April 1999.
[30] R. J. Safranek, A JPEG compliant encoder utilizing perceptually based quantization, Proc. SPIE Human Vision, Visual Proc., and Digital Display V, Vol. 2179, pp. 117-126, Feb. 1994.
[31] R. B. Wolfgang, C. I. Podilchuk and E. J. Delp, Perceptual Watermarks for Digital Images and Video, Proc. IEEE, Vol. 87(7), pp. 1108-1126, July 1999.
[32] A. E. Savakis, S. P. Etz and A. C. Loui, Evaluation of image appeal in consumer photography, Proc. SPIE Human Vision and Electronic Imaging V, Vol. 3959, pp. 111-120, 2000.
[33] H.R. Wu, Z. Yu and B. Qiu, Multiple reference impairment scale subjective assessment method for digital video, International Conference on Digital Signal Processing (DSP2002), pp. 185-189, July 2002.
[34] M. Tapiovaara, Objective measurement of image quality in fluoroscopic X-ray equipment: FluoroQuality, STUK-A196, May 2003, http://www.stuk.fi/julkaisut/stuk-a/stuk-a196.pdf
[35] W. Lin, L. Dong and P. Xue, Visual Distortion Gauge Based on Discrimination of Noticeable Contrast Changes, IEEE Trans. Circuits and Systems for Video Technology, Vol. 15(7), pp. 900-909, July 2005.
[36] J. Caviedes and S. Gurbuz, No-reference sharpness metric based on local edge kurtosis, Proc. IEEE Int'l Conf. Image Proc. (ICIP), Vol. 3, pp. 53-56, 2002.
[37] E. Ong, X. Yang, W. Lin, Z. Lu, S. Yao, X. Lin, S. Rahardja and C. Boon, Perceptual Quality and Objective Quality Measurements of Compressed Videos, Journal of Visual Communication and Image Representation, accepted, 2005.
[38] Z. Lu, W. Lin, Z. Li, K. P. Lim, X. Lin, S. Rahardja, E. Ong and S. Yao, Perceptual Region-of-interest (ROI) based Scalable Video Coding, ISO/IEC JTC1/SC29/WG11 and ITU-T SG16 Q.6, 15th Meeting, JVT-O056, Busan, Korea, April 2005.
[39] M. P. Eckert and A. P. Bradley, Perceptual quality metrics applied to still image compression, Signal Processing, Vol. 70, pp. 177-200, 1998.
[40] L. M. J. Meesters, W. A. Ijsselsteijn and P. J. H. Seuntiens, A survey of perceptual evaluations and requirements of three-dimensional TV, IEEE Trans. Circuits Syst. Video Technol., Vol. 14(3), pp. 381-391, Mar. 2004.
[41] N. Suresh and N. Jayant, Mean time between failures: a functional quality metric for consumer video, First International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Scottsdale, Arizona, USA, 23-25 January 2005.
[42] Z. Lu, W. Lin, X. Yang, E. Ong, S. Yao, C. S. Boon and S. Kato, Measuring the negative impact of frame dropping on perceptual visual quality, SPIE Human Vision and Electronic Imaging X, B. E. Rogowitz, T. N. Pappas and S. J. Daly, eds., Vol. 5666, pp. 554-562, 2005.
[43] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, Image quality assessment: From error measurement to structural similarity, IEEE Trans. Image Processing, Vol. 13(4), pp. 600-612, Apr. 2004.
[44] J. Caviedes and F. Oberti, No-reference quality metric for degraded and enhanced video, Proc. SPIE, Vol. 5150, pp. 621-632, 2003.
[45] H. R. Sheikh, A. C. Bovik and L. Cormack, No-reference quality assessment using natural scene statistics: JPEG2000, IEEE Trans. Image Processing, Vol. 14(11), pp. 1918-1927, Nov. 2005.
[46] M. C. Q. Farias, S. K. Mitra and J. M. Foley, Perceptual contributions of blocky, blurry and noisy artifacts to overall annoyance, IEEE International Conference on Multimedia and Expo (ICME), Vol. I, pp. 529-532, 2003.
[47] M. Yuen and H.R. Wu, A survey of MC/DPCM/DCT video coding distortions, Signal Processing, Vol. 70(3), pp. 247-278, Nov. 1998.
[48] L. Itti, Automatic foveation for video compression using a neurobiological model of visual attention, IEEE Trans. Image Processing, Vol. 13(10), pp. 1304-1318, Oct. 2004.
[49] X. Yang, W. Lin, Z. Lu, E. Ong and S. Yao, Motion-compensated Residue Pre-processing in Video Coding Based on Just-noticeable-distortion Profile, IEEE Trans. Circuits and Systems for Video Technology, Vol. 15(6), pp. 742-750, June 2005.