Multiple Kernels for Object Detection. Andrea Vedaldi, Varun Gulshan, Manik Varma, Andrew Zisserman


1 Multiple Kernels for Object Detection. Andrea Vedaldi, Varun Gulshan, Manik Varma, Andrew Zisserman

2 MK classification. Feature vector: one histogram per feature channel (PHOW Gray, PHOW Color, PHOG, PHOG Sym, Visual Words, SSIM). The MK SVM combines one kernel per histogram [Varma Rai 2007] [Gehler Nowozin 2009].

3 MK detection: challenges. Goal: sliding window MK classifier (image, candidate region, feature vector, MK SVM). Time required: TMK × #windows. The inference space is huge: #windows = 100 million, TMK = seconds. Excruciatingly slow (days/image).

4 Cascade. Viola-Jones style, applied to the feature vector.

5 Cascade (ICCV 09, Vedaldi Gulshan Varma Zisserman). Viola-Jones style. Stages: Fast Linear SVM, Quasi-linear SVM, Non-linear SVM. All stages are full MK SVMs and all look at all features; speed is traded off against power by choosing the kernel structure. See also [Harzallah et al. 09].

6 Cascade. Feature vector; stages: Fast Linear SVM, Quasi-linear SVM, Non-linear SVM.

7 Non-linear sliding SVM. Image, candidate region, feature vector; training data, support vectors (SVs), i-th support vector. Time required: #dimensions × #windows × #SVs.
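The cost on this slide comes from evaluating one kernel per support vector for every window. A minimal sketch of that scoring loop, assuming the χ2-RBF kernel named later in the overview slide; the function names, the 0.5 factor in the χ2 distance, and the `gamma` parameter are illustrative choices, not taken from the slides:

```python
import math

def chi2_distance(x, y):
    """Chi-squared distance between two non-negative histograms.
    The 0.5 factor is one common convention (an assumption here)."""
    total = 0.0
    for a, b in zip(x, y):
        if a + b > 0:
            total += (a - b) ** 2 / (a + b)
    return 0.5 * total

def chi2_rbf_kernel(x, y, gamma=1.0):
    """Exponentiated chi-squared kernel; gamma is a hypothetical bandwidth."""
    return math.exp(-gamma * chi2_distance(x, y))

def nonlinear_svm_score(x, support_vectors, alphas, bias=0.0):
    """Score one window: one kernel evaluation per support vector,
    each costing O(#dimensions), hence #dimensions * #SVs per window."""
    return bias + sum(
        alpha * chi2_rbf_kernel(x, sv)
        for alpha, sv in zip(alphas, support_vectors)
    )
```

Calling `nonlinear_svm_score` once per candidate window gives the #dimensions × #windows × #SVs total stated on the slide.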

8 Cascade. Feature vector; stages: Fast Linear SVM, Quasi-linear SVM, Non-linear SVM.

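9 Quasi-linear SVM. Image, candidate region, feature vector, i-th support vector. A quasi-linear (or additive) kernel decomposes over dimensions as $K(x, y) = \sum_d k(x_d, y_d)$. Thus the SVM score rewrites as $f(x) = \sum_i \alpha_i K(x, x_i) = \sum_d h_d(x_d)$, with $h_d(v) = \sum_i \alpha_i k(v, x_{i,d})$ [Maji Berg Malik 2008]. Time required: #dimensions × #windows × #SVs; with a pre-computed look-up table for each $h_d$, #dimensions × #windows.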
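The look-up table idea can be sketched as follows. This uses the intersection kernel, k(a, b) = min(a, b), as one example of an additive kernel; the uniform grid on [0, 1], the bin count, and all function names are assumptions made for illustration rather than details from the slides:

```python
def build_lookup_tables(support_vectors, alphas, num_bins=100):
    """Tabulate h_d(v) = sum_i alpha_i * min(v, sv_i[d]) for v on a
    uniform grid in [0, 1]. Done once, offline; cost depends on #SVs."""
    num_dims = len(support_vectors[0])
    tables = []
    for d in range(num_dims):
        table = []
        for b in range(num_bins + 1):
            v = b / num_bins
            table.append(sum(a * min(v, sv[d])
                             for a, sv in zip(alphas, support_vectors)))
        tables.append(table)
    return tables

def quasi_linear_score(x, tables, bias=0.0):
    """Score one window with one table look-up per dimension:
    O(#dimensions), independent of the number of support vectors."""
    num_bins = len(tables[0]) - 1
    score = bias
    for d, v in enumerate(x):
        b = min(num_bins, max(0, round(v * num_bins)))  # nearest grid point
        score += tables[d][b]
    return score

def exact_score(x, support_vectors, alphas, bias=0.0):
    """Reference: direct kernel expansion, O(#dimensions * #SVs)."""
    return bias + sum(
        a * sum(min(u, v) for u, v in zip(x, sv))
        for a, sv in zip(alphas, support_vectors)
    )
```

For inputs that fall on the grid the two scores agree; off the grid the table introduces a quantisation error that shrinks as `num_bins` grows.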

10 Cascade. Feature vector; stages: Fast Linear SVM, Quasi-linear SVM, Non-linear SVM.

11 Fast linear SVM. Image, candidate region, feature vector, linear SVM score. Time required: #dimensions × #windows × #SVs; by pre-computing the score contributed by each pixel (score map) and computing each window sum with integral images, this drops to #windows. Additional speedup is possible with branch and bound [Lampert Blaschko Hofmann 2008].
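The integral-image step can be illustrated with a short sketch. It assumes the per-pixel score map has already been computed and that a window's score is the plain sum of the scores inside it; the function names and the half-open window convention are illustrative:

```python
def integral_image(score_map):
    """Return ii with ii[r][c] = sum of score_map[0:r][0:c].
    The extra zero row and column avoid boundary special cases."""
    rows, cols = len(score_map), len(score_map[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += score_map[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def window_score(ii, top, left, bottom, right):
    """Sum of the score map over rows top..bottom-1 and columns
    left..right-1, using four look-ups regardless of window size."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

After one pass to build the integral image, each window costs a constant number of operations, which is where the #windows total comes from.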

12 Histogram normalization
Invariance to #features (region area).
Kernel as similarity: an image region should be most similar to itself.
- l2 norm for linear kernel
- l1 norm for intersection, χ2, Hellinger kernels
Linear SVM works better with l2 normalization.
Fast linear SVM requires no or l1 normalization.
[Figure: scatter plots of linear SVM score vs region area for no, l1 and l2 normalization; detection rate vs false positive rate for none, l1, l2.]
Paper excerpt visible on the slide: "... weak classifier suitable for the first cascade stage only. 4. Features and implementation details. 4.1. Appearance descriptors. To construct descriptors of the appearance of the candidate regions R we use a number of different feature channels. These are the features used in [4, 13, 21, 22, 25], and we use public domain source code. Bag of words (SIFT). We extract visual words at Hessian Laplace [18] points and compute rotation-variant SIFT descriptors [15]. Those are quantized in a vocabulary of 3000 words, trained on features from the bounding boxes of several object instances. For each class, we discriminatively compress the vocabulary down to 64 visual words as in [11] (yielding 20 different vocabularies). Dense words (PhowGray, PhowColor). We compute ro
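The consistency criterion on this slide (a region should be most similar to itself, whatever its area) can be checked numerically. The sketch below is a toy illustration with made-up histograms; the helper names are hypothetical:

```python
import math

def l1_normalize(hist):
    """Scale a histogram so its absolute values sum to 1."""
    total = sum(abs(v) for v in hist)
    return [v / total for v in hist] if total > 0 else list(hist)

def l2_normalize(hist):
    """Scale a histogram to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else list(hist)

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def intersection_kernel(x, y):
    return sum(min(a, b) for a, b in zip(x, y))

# Toy data: the same feature distribution seen in a small region and
# in a region with ten times as many features.
small = [3.0, 1.0, 0.0, 4.0]
large = [30.0, 10.0, 0.0, 40.0]

# Without normalization the linear kernel rates the larger region as
# more similar to `small` than `small` is to itself.
unnormalized_breaks_criterion = linear_kernel(small, large) > linear_kernel(small, small)

# With l2 normalization the linear kernel gives self-similarity 1, which
# is the maximum possible value (Cauchy-Schwarz).
self_sim_linear = linear_kernel(l2_normalize(small), l2_normalize(small))

# With l1 normalization the intersection kernel gives self-similarity 1,
# and the two regions become indistinguishable.
cross_sim_intersection = intersection_kernel(l1_normalize(small), l1_normalize(large))
```

Pairing l2 with the linear kernel and l1 with the intersection kernel makes the self-similarity equal to the kernel's upper bound, which is the pairing listed on the slide.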

13 SVMs overview
First stage: linear SVM (or jumping window). Time: #windows.
Second stage: quasi-linear SVM, χ2 kernel. Time: #windows × #dimensions.
Third stage: non-linear SVM, χ2-RBF kernel. Time: #windows × #dimensions × #SVs.
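The three stages can be chained so that each more expensive classifier only sees the windows that survived the previous one. The sketch below assumes each stage keeps a fixed number of top-scoring windows; the slides do not state the pruning rule, so that rule, the function name, and the stage format are all hypothetical:

```python
def run_cascade(windows, stages):
    """Apply scoring stages in order of increasing cost.

    `stages` is a list of (score_fn, keep) pairs. After each stage only
    the `keep` highest-scoring windows are passed on, so the expensive
    later stages run on far fewer candidates than the first one.
    """
    survivors = list(windows)
    for score_fn, keep in stages:
        survivors.sort(key=score_fn, reverse=True)
        survivors = survivors[:keep]
    return survivors
```

A usage example with stand-in scoring functions: `run_cascade(candidates, [(linear_score, 1000), (quasi_linear_score, 100), (nonlinear_score, 10)])`, where the three scorers and the counts are placeholders.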

15 bus ar

16 motorbike

17 Single kernel vs multiple kernels
Multiple kernels: substantial boost.
Multiple Kernel Learning: marginal boost over averaging; sparse feature selection. Consistent with [Gehler Nowozin 09].
[Figure: precision vs recall. Legend: MKL 50.4%, avg 49.9%, ssim 39.1%, phog %, phog %, phowcolor 42.6%, phowgray 44.4%.]
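The difference between averaging and MKL on this slide is only in the weights of the kernel combination. A minimal sketch, assuming kernels are combined as a weighted sum of per-feature kernel matrices; the function name and the example weights are illustrative, and learning the weights is not shown:

```python
def combine_kernels(kernel_matrices, weights=None):
    """Return the weighted sum of square kernel matrices.

    With weights=None every kernel gets weight 1/#kernels (plain
    averaging). MKL would instead supply learned weights; a sparse
    solution sets some of them to zero, dropping those features.
    """
    if weights is None:
        weights = [1.0 / len(kernel_matrices)] * len(kernel_matrices)
    n = len(kernel_matrices[0])
    combined = [[0.0] * n for _ in range(n)]
    for w, K in zip(weights, kernel_matrices):
        for i in range(n):
            for j in range(n):
                combined[i][j] += w * K[i][j]
    return combined
```

The combined matrix can then be handed to any SVM solver that accepts a precomputed kernel.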

18 Quasi-linear vs non-linear kernels
[Figure: quasi-linear vs non-linear SVM (VOC 2008), per class: aeroplane, bicycle, bird, boat, bottle, bus, cat, chair, cow, dining table, dog, horse, motorbike, potted plant, sheep, sofa, train, tv/monitor. Series: Non-linear, Quasi-linear.]

19 2007 vs 2008 vs
[Figure: results on different editions, per class: aeroplane, bicycle, bird, boat, bottle, bus, cat, chair, cow, dining table, dog, horse, motorbike, potted plant, sheep, sofa, train, tv/monitor. Series: VOC 2007, VOC 2008, VOC]

20 VOC 2009 results
[Figure: results on 2009 edition, per class: aeroplane, bicycle, bird, boat, bottle, bus, cat, chair, cow, dining table, dog, horse, motorbike, potted plant, sheep, sofa, train, tv/monitor. Series: OXFORD_MKL, UoCTTI_LSVM-MDPM, Other Best.]

21 Conclusions
Hierarchy of kernel structures
- trade-off speed and power with the same data/algorithm
Histogram normalization
- affects the results
- should be selected based on the kernel (consistency criterion)
MK
- large boost from feature combination
- sparse feature selection from MK learning
MK classification code available
MK detection code will be available soon

22 Thank You! [Images: aeroplane, bicycle, cow, horse, motorbike]
