CAT Training CNNs for Image Classification with Noisy Labels

Cyclic Annealing Training (CAT) CNNs for Image Classification with Noisy Labels

JiaWei Li, Tao Dai, QingTao Tang, YeLi Xing, Shu-Tao Xia
Tsinghua University
li-jw15@mails.tsinghua.edu.cn

October 8, 2018

Outline:
- Noisy Labels
  - Noisy Labels Problem
  - Noise Modeling with EM
- Speed up the training in the M-cycle
  - Cyclic Annealing Training
- Aggregate M-cycle CNNs at test time
  - Bagging CNNs
  - Algorithm Description: CAT on Noisy Labels
- Experiments
  - Performance on MNIST
  - Robustness on CIFAR

Noisy Labels Problem:
- Labeling an image dataset is cumbersome work and easily induces noise.
- Label noise has a large impact on learning.

Figure 1: The left, middle, and right images might be labeled as dog, seal, and seal. (Copyright: http://www.dianliwenmi.com/postimg_3364775_6.html)

Noise patterns:
- Image x has a noisy label z; its true label y is unknown.

Figure 2: Two different noise patterns (graphical models over x, y, and z).
- Left: the noisy label z only depends on the true label y.
- Right: z depends on both the true label y and the feature x.

Noise Modeling:

Figure 3: A typical label noise modeling procedure. A CNN with parameter W maps the feature x to the predicted true label y = h(x); a noise model with pattern parameter θ then maps y to the noisy label z.

Learning with EM:
- E-step: fix W and update the noise modeling parameter θ.
- M-step: use z, y = h(x; W), and θ to train W.
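
A rough sketch of this alternation (the model's fit/predict_proba interface is our placeholder, not the authors' code, and hard predictions are used in the E-step for simplicity):

    import numpy as np

    def em_noise_modeling(model, x, z, k, n_iters=5):
        """EM loop: alternate between estimating the noise matrix theta
        (E-step) and retraining the network weights W (M-step)."""
        theta = np.eye(k)                        # start from "no noise"
        for _ in range(n_iters):
            # E-step: fix W, re-estimate theta[i, j] ~= p(z = j | y = i)
            # from the current predictions y = h(x; W).
            y_hat = model.predict_proba(x).argmax(axis=1)
            for i in range(k):
                mask = y_hat == i
                if mask.any():
                    theta[i] = np.bincount(z[mask], minlength=k) / mask.sum()
            # M-step: fix theta, train W on the noisy labels z with the
            # noise-adjusted likelihood (see the log likelihood slide).
            model.fit(x, z, theta)
        return theta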

Cyclic Annealing Training (CAT):
- It abruptly raises the learning rate α and then quickly decreases it with a cosine function:

  $\alpha(t) = \frac{\alpha_0}{2}\left(\cos\left(\frac{\pi \cdot \mathrm{mod}(t-1,\, \lceil T/C \rceil)}{\lceil T/C \rceil}\right) + 1\right)$

  where T is the total number of training epochs and C is the number of annealing cycles.
- Align every annealing learning rate cycle to an M-step.
- Then use the obtained local-minimum CNN models to update the following E-step.
- Almost C-times faster than the original EM approaches.
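
A minimal runnable sketch of this schedule (variable names are ours):

    import math

    def cat_lr(t, alpha0, T, C):
        """CAT learning rate at epoch t (1-indexed): restarts to alpha0
        at the start of each of the C cycles, then decays towards 0
        with a cosine over ceil(T / C) epochs."""
        cycle_len = math.ceil(T / C)
        return alpha0 / 2 * (math.cos(math.pi * ((t - 1) % cycle_len) / cycle_len) + 1)

    # e.g. alpha0 = 0.1, T = 200 epochs, C = 4: the rate restarts at
    # epochs 1, 51, 101, and 151, one annealing cycle per M-step.
    schedule = [cat_lr(t, 0.1, 200, 4) for t in range(1, 201)]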

CAT vs. standard training schedule:

Figure 4: Training DenseNet-40 on CIFAR-10 with different schedules. Training accuracy (roughly 0.60 to 0.85) over 250 epochs for a standard learning rate schedule vs. Cyclic Annealing Training (CAT).

Aggregate M-cycle CNNs at test time:

Figure 5: Using CAT for Snapshot Ensemble [1]

- Once the training is finished, collect all C local-minimum CNNs.
- The aggregated output will be: $h_{\mathrm{AVG}}(x) = \frac{1}{C} \sum_{c=1}^{C} h_c(x)$

[1] ICLR 2017. Gao Huang, et al. Snapshot ensembles: Train 1, get M for free.
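
A minimal sketch of the aggregation (assuming each snapshot exposes a predict_proba(x) method returning class probabilities; this interface is our assumption):

    import numpy as np

    def h_avg(snapshots, x):
        """Average the outputs of the C snapshot models collected at the
        end of each annealing cycle: h_AVG(x) = (1/C) * sum_c h_c(x)."""
        probs = np.stack([m.predict_proba(x) for m in snapshots])  # (C, n, k)
        return probs.mean(axis=0)                                  # (n, k)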

The log likelihood of the model parameters is:

$\mathcal{L}(W, \theta) = \sum_{t=1}^{n} \log\left(\sum_{i=1}^{k} p(z_t \mid y_t = i;\, \theta)\; p(y_t = i \mid x_t;\, W)\right)$

Algorithm 1: CAT on Noisy Labels
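
For concreteness, this likelihood can be evaluated from the network posteriors and the noise matrix θ (a sketch, not the authors' code):

    import numpy as np

    def log_likelihood(p_y_given_x, theta, z):
        """p_y_given_x: (n, k) network posteriors p(y = i | x_t; W)
        theta:        (k, k) noise matrix, theta[i, j] = p(z = j | y = i)
        z:            (n,)   observed noisy labels"""
        p_z_given_x = p_y_given_x @ theta    # (n, k): p(z = j | x_t)
        return np.sum(np.log(p_z_given_x[np.arange(len(z)), z]))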

Noise Setting on MNIST:
- We use the label flipping operation on the MNIST dataset.

Figure 6: Label flipping with noise pattern [7,9,0,4,2,1,3,5,6,8]
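
A sketch of this corruption, assuming a fraction rho of the samples has its label sent through the fixed pattern (the fraction-based flipping is our assumption):

    import numpy as np

    def flip_labels(y, pattern, rho, seed=0):
        """Flip a fraction rho of the labels y according to a fixed
        pattern: pattern[d] is the noisy label digit d is mapped to,
        e.g. [7,9,0,4,2,1,3,5,6,8] sends 0 -> 7, 1 -> 9, and so on."""
        rng = np.random.default_rng(seed)
        z = y.copy()
        flip = rng.random(len(y)) < rho          # pick ~rho of the samples
        z[flip] = np.asarray(pattern)[y[flip]]   # apply the flipping map
        return z

    # e.g. the 46% noise used in the MNIST experiment on the next slide:
    # z_train = flip_labels(y_train, [7, 9, 0, 4, 2, 1, 3, 5, 6, 8], rho=0.46)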

Performance on MNIST:

Figure 7: The acquired transfer probability θ of CAT and Simple NAL, shown as heatmaps with true labels on one axis and noisy labels on the other (color scale 0.0 to 0.8).

- 46% noisy labels with noise pattern [7,9,0,4,2,1,3,5,6,8]
- Simple NAL reaches a 99.68% classification accuracy and CAT achieves 99.77%.

Noise Setting on CIFAR-100:
- z depends on both the true label y and the feature x.

Figure 8: Randomly selected images from the noisy-label CIFAR

Robustness on CIFAR-100:

Figure 9: Comparing the robustness of noise modeling methods on CIFAR-100 with random noise labels. Test accuracy (roughly 0.225 to 0.400) vs. noise fraction (0.300 to 0.500) for Baseline CNN, Hard Bootstrap, EM, Simple NAL, Complex NAL, CAT without Bagging, and CAT.

Selected References:
1. TNNLS 2014. Classification in the Presence of Label Noise: a Survey.
2. ICLR 2015. Training convolutional networks with noisy labels.
3. ICLR 2015. Training deep neural networks on noisy labels with bootstrapping.
4. ICASSP 2016. Training deep neural-networks based on unreliable labels.
5. ICLR 2017. Snapshot ensembles: Train 1, get M for free.
6. ICLR 2017. Training DNNs Using a Noise Adaptation Layer.

Some New Progress:
1. JMLR 2018. A theory of learning with corrupted labels.
2. ICML 2018. MentorNet: Learning data-driven curriculum for very deep neural networks on corrupted labels.
3. ICML 2018. Dimensionality-Driven Learning with Noisy Labels.
4. CVPR 2018. Iterative Learning with Open-set Noisy Labels.
5. ICLR 2019 submission. Pumpout: A Meta Approach for Robustly Training Deep Neural Networks with Noisy Labels.

Thanks for listening!