PROJECT REPORT. Using Deep Learning to Classify Malignancy Associated Changes


Using Deep Learning to Classify Malignancy Associated Changes
Hakan Wieslander, Gustav Forslid
Project in Computational Science: Report, January 2017
PROJECT REPORT
Department of Information Technology

Abstract

Screening specimens from the cervix is an important way to discover cervical cancer in time. Today these screenings are done manually by trained medical experts, which makes the procedure time consuming and thus costly. A suggested solution is to use deep learning for detecting malignancy associated changes. The scope of this project was to use convolutional neural networks for classifying the different cell types associated with cervical cancer. Since most malignancy associated changes are present in the nuclei of the cells, one approach is to segment out the nuclei and train the networks on those. For comparison, a case with unsegmented images was also investigated. The results showed that a deeper network was needed for the segmented images and a shallower network for the unsegmented images. With the unsegmented images the network could distinguish between normal cells and more cancerous cells. With the segmented images the networks got high scores for the low-grade cancerous cells but could not distinguish between the other cell types. In conclusion, there is potential in the deep learning approach for detecting malignancy associated changes, but more extensive studies are needed.

Contents

1 Introduction
  1.1 Background
  1.2 Purpose of project
2 Theory
  2.1 Cervical Cancer
  2.2 Deep Learning
    Convolutional Neural Networks
    Convolutional Layer
    Pooling layer
    Fully Connected Layer
    ReLU Layer
    Dropout layer
    Batch Normalization layer
    Backpropagation
    Stochastic Gradient Descent
    Caffe
  2.3 Image Processing
    2.3.1 Watershed segmentation
    2.3.2 Morphological dilation
    2.3.3 Normalization
  2.4 The Dataset
3 Method
  3.1 Data collection
  3.2 Image pre-processing
  3.3 Deep Learning
4 Result
  4.1 Network architecture
  4.2 Testing result
    4.2.1 Deeper architecture
    4.2.2 Smaller architecture
5 Discussion
  5.1 Network Architecture
  5.2 Images
    5.2.1 Segmented Images
    5.2.2 Unsegmented Images
6 Conclusion
  6.1 Using Deep Learning to classify Malignancy Associated Changes
  6.2 Future work

1 Introduction

1.1 Background

Cervical cancer was for a long time one of the deadliest cancer types for women. Nowadays women have a few options to prevent or reduce the suffering cervical cancer can cause. In highly developed countries many women are vaccinated against cervical cancer. Those who are not vaccinated are recommended to take a test every five years. The most common test is the pap-smear screening. A pap-smear is a specimen taken from the cervix, and the screening of these specimens is done by trained medical experts. Because pap-smears are examined manually, the procedure is time consuming and thus expensive. As a result, some countries do not provide these screenings, or the women themselves cannot afford to be screened. Since most cervical cancer cases occur in developing countries, one might conclude that it is the women in these countries that suffer the most.

Deep learning can be traced back to the 1960s, but it is not until the last decade that the field exploded. The reason is the possibility to run all the mathematical operations on a graphics processing unit (GPU). As the field grew, so did the range of problems deep learning could be used for. A Convolutional Neural Network (CNN) is a type of deep learning method that can be used for image classification and feature detection. A CNN is built up of layers containing weights and biases. These weights and biases are updated during training and optimized with steepest descent. The capability of classifying images at a high rate can be applied in many different fields.

1.2 Purpose of project

The purpose of this project is to investigate whether deep learning can be applied as a classification tool in the normal procedure of pap-smear screenings. The benefit would be that medical experts only need to screen the pap-smears containing cancerous cell types instead of screening all the pap-smears.
This classification would make the procedure much more time efficient and hopefully, in the end, help more women to be screened. The following tasks are pursued:

- Determine the best way to process the images.
- Construct the best architecture for classifying the different cell types.

2 Theory

2.1 Cervical Cancer

Cervical cancer is a malignant disease in the cervix. Smoking, reproductive history and number of sexual partners can be risk factors, but the leading cause is infection by the Human Papillomavirus (HPV). HPV is a common sexually transmitted infection whose effect varies from person to person. In some cases the infection regresses without treatment, but in other cases it develops into cervical cancer. This transformation is very slow: for women with a normal immune system it can take many years to develop into cervical cancer. For women with a damaged immune system it might only take 5-10 years. The earlier the cancer is discovered, the more likely it is that the patient can be cured [1].

Malignancy Associated Changes are referred to as MAC. MAC describes subtle changes in the texture and morphology of a cell nucleus. These subtle changes are difficult to distinguish in practice, but have been proven to be a reliable source for detecting cancerous cells [2].

The system for reporting a cervical diagnosis is called The Bethesda System. This system is used for pap-smear screenings, and below follows a list of the most common cell types [3].

  NILM - Negative for Intraepithelial Lesion or Malignancy
  LSIL - Low-grade Squamous Intraepithelial Lesion
  HSIL - High-grade Squamous Intraepithelial Lesion
  SCC - Squamous Cell Carcinomas
  Adenocarcinoma - Adenosquamous Carcinomas
  ASC-H - Atypical Squamous Cells, which cannot exclude a High-grade lesion
  ASC-US - Atypical Squamous Cells of Undetermined Significance

2.2 Deep Learning

Deep learning is a subset of machine learning. The main difference is that deep learning methods extract features from a data set automatically, instead of being told which features to look for. The theory behind deep learning can be dated back to the 1960s [4]. At that time computers could not handle the large number of mathematical operations that deep learning requires. In the beginning of the 21st century computers could finally perform at the required rate. This resulted in more work and research in the deep learning field. In the mid-2000s it became possible to run these mathematical operations on a graphics processing unit (GPU). By 2009 it was almost 20 times faster to distribute a mathematical operation on the GPU than to run it on a CPU [5]. Today deep learning is one of the fastest growing fields in computer science. The ImageNet Large Scale Visual Recognition Challenge (ILSVRC), an annual challenge about classification and detection of hundreds of categories in millions of images, has since 2012 been won exclusively by convolutional neural networks [6].

Convolutional Neural Networks

A Convolutional Neural Network, referred to as CNN, is a type of deep learning method inspired by the animal visual cortex. Each cell in the visual cortex is sensitive to a small region of the visual field, its receptive field. This inspiration resulted in a spatial structure where a specific region in one layer is connected to a specific region in the next layer [7]. CNNs are constructed of neurons which have learnable weights and biases. The spatial structure creates a volume of these neurons with a width, height and depth. The depth represents the number of filter-kernels each layer contains, which is also the number of outputs that layer produces. When training a network the filter-kernels learn to identify different features in different parts of the input image.
This means that the more features the input images contain, the more filter-kernels might be needed [8].

The architecture of a CNN can vary a lot. Depending on the purpose of the CNN, the architect can choose different types of layers, the number of layers and the construction of each layer. These choices are referred to as the hyperparameters of the network. The most common layers are convolutional layers, pooling layers and fully connected layers. Other examples are ReLU layers, batch normalization layers, softmax layers and dropout layers. When training a CNN you have to include some metric for how well or badly the network performs. This is called a loss function and can be constructed in a couple of different ways. The loss function takes the label of the input image, compares it to the network prediction and calculates the loss. Minimizing the loss makes the network better and better at predicting its input images [9].

Figure 1: Example architecture of a simple Convolutional Neural Network

Convolutional Layer

The main task for convolutional layers is to detect local conjunctions of features in the input. Convolutional layers are built up of learnable filter-kernels with a set size. Each convolutional layer has a spatial architecture given by the height and width of its filter-kernels and a depth given by the number of filter-kernels. The filter-kernels are convolved over the input, performing a dot product at each position to create the output, as can be seen in Figure 2. Each convolutional layer has a set stride, which indicates how far the filter-kernel is moved before a new dot product is performed. During training each filter-kernel starts looking for some specific feature. The first convolutional layer usually picks up primitive features, e.g. vertical and diagonal lines. The deeper the convolutional layer is placed, the more advanced the features its filter-kernels pick up.
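
The sliding dot product and the stride described above can be illustrated with a small NumPy sketch. It is not the Caffe implementation used in this project; the function name `conv2d_valid`, the 5x5 test image and the hand-written vertical-edge kernel are assumptions made only for illustration, and no padding is applied. As in most CNN frameworks, the kernel is not flipped, so strictly speaking this is a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding; one dot product per position."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Region of the input currently covered by the filter-kernel.
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Hypothetical input: intensities increase from left to right.
image = np.arange(25, dtype=float).reshape(5, 5)
# Hand-written kernel that responds to horizontal intensity changes (vertical edges).
vertical_edge = np.array([[1.0, 0.0, -1.0]] * 3)

feature_map = conv2d_valid(image, vertical_edge, stride=1)   # shape (3, 3)
strided_map = conv2d_valid(image, vertical_edge, stride=2)   # shape (2, 2)
```

In a trained network the kernel values are learned rather than written by hand, and a layer applies many such kernels in parallel, one per output depth slice.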

Figure 2: Convolving a 3x3 filter-kernel over an input and producing an output with a dot product

Pooling layer

Pooling layers are often inserted after a convolutional layer. Their main task is to reduce the spatial size of the input. This means fewer computations, which speeds up the training process. Two types of pooling layers are more common than others. The first is Max Pooling, described in Figure 3a. Here a kernel of a set size is slid over the input and the output is the maximum value inside the kernel. The second is Average Pooling, described in Figure 3b. Here a kernel of a set size is also slid over the input, and the output is the average of the values inside the kernel. A pooling layer operates on each depth slice of the input independently and resizes it spatially [8].

(a) Max pooling. (b) Average pooling.
Figure 3: Pooling

Fully Connected Layer

Fully connected layers are referred to as FC layers. Their main task is to compute the class scores, which indicate which class the network predicts the input to belong to. This can be described as the classification step in a CNN. Neurons in an FC layer are connected to all neurons in the previous layer, which is why the layer is called fully connected. The depth of the final FC layer equals the number of classes the network has to score. The output from the last FC layer represents the probabilities the network assigns to each class. These probabilities always sum to 1. This is due to an activation function called softmax, which takes a vector of real-valued class scores and squashes it into numbers between 0 and 1 that sum to 1 [8].

ReLU Layer

ReLU stands for rectified linear unit and is a non-linear operation applied element by element.
ReLU layers are often systematically inserted in a CNN architecture after a convolutional layer. The main reason for adding ReLU layers is to introduce non-linearity in the CNN, since almost all real-world data is non-linear. The operation is defined in Equation (1). It takes the input value, compares it with zero and picks the larger of the two as the output [10].

\[ \mathrm{Output} = \max(0, \mathrm{Input}) \qquad (1) \]

Dropout layer

Dropout layers are added to the architecture to increase the accuracy by reducing overfitting. Overfitting is a phenomenon that can occur when a network is trained with too few images. The network becomes good at classifying the images in the training set, but has a lower accuracy when classifying any other images. During backpropagation, when the network adapts its weights and biases, the network becomes fragile and creates a structure that only works for the training data. Dropout layers counteract this by breaking up some relations at random. For each input batch in the training process, some connections between two layers are temporarily removed from the network. Dropout can be seen as training multiple thinner networks which share the same weights and biases. A drawback of dropout layers is that the training time increases: training a network with dropout generally takes 2-3 times longer than training it without [11].

Batch Normalization layer

Batch Normalization layers are included to speed up the training process. Since the input of each layer in a CNN is affected by the parameters of the previous layers, small changes might be amplified through the network. This phenomenon is referred to as internal covariate shift. Batch normalization layers reduce it by first dividing the input data into mini-batches and then normalizing these. The normalization is done so that the features in each mini-batch are in the same range for each layer. With batch normalization in the architecture, the learning rate can be much higher and the network is less sensitive to initialization. In some cases it can remove the need for a dropout layer, as it can act as a regularizer.
In [12], a network with batch normalization reached the same accuracy in 14 times fewer training steps and also improved the final result.

Backpropagation

After the class scores are calculated in the FC layer, the loss is calculated by comparing the network's prediction for the input images with the ground truth. For the network to learn from its errors, the error is backpropagated through the network. In the backpropagation the gradient of the error is calculated with respect to the weights and biases. The weights and biases are then updated so as to reduce the error of the output [10]. One of the most common ways to perform this update is stochastic gradient descent.

Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is an algorithm used for updating the weights and biases during the backpropagation. The update procedure is shown in Equation (2), where θ are the parameters, α is the learning rate, x^(i) is the training example and y^(i) is its label.

\[ \theta = \theta - \alpha \nabla_{\theta} J(\theta; x^{(i)}, y^{(i)}) \qquad (2) \]

The SGD algorithm is constructed so that it can evaluate the gradient after seeing just a small part of the data set. This can be compared with standard gradient descent, which has to see the entire data set before evaluating the gradient [13].

Caffe

Caffe is a framework designed for deep learning. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors. The Caffe framework stores and manipulates the data in the network in blobs, which are used as the memory interface between the layers. Each layer has a top blob and a bottom blob. A layer gets its data from the bottom blob and outputs data to the top blob [14].

2.3 Image Processing

An important part of this project is the image processing. This section introduces some of the important methods used for processing the images before training.
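
The chain described in Section 2.2, from class scores through softmax and loss to the SGD update in Equation (2), can be sketched in a few lines of NumPy. This is a toy linear classifier, not the CNN trained in this project: the synthetic 2D data, the three classes, the learning rate `alpha = 0.1` and the mini-batch size of 16 are all assumptions chosen only to make the example self-contained.

```python
import numpy as np

def softmax(scores):
    """Turn each row of class scores into probabilities that sum to 1."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def loss_and_grad(W, b, x, y):
    """Cross-entropy loss J and its gradient with respect to W and b."""
    n = x.shape[0]
    probs = softmax(x @ W + b)
    loss = -np.log(probs[np.arange(n), y]).mean()
    d_scores = probs.copy()
    d_scores[np.arange(n), y] -= 1.0   # derivative of the loss w.r.t. the scores
    d_scores /= n
    return loss, x.T @ d_scores, d_scores.sum(axis=0)

# Synthetic data (assumption): three well-separated 2D clusters, one per class.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
y_all = rng.integers(0, 3, size=300)
x_all = centers[y_all] + rng.normal(size=(300, 2))

W = np.zeros((2, 3))
b = np.zeros(3)
alpha = 0.1  # learning rate, the alpha of Equation (2)

initial_loss, _, _ = loss_and_grad(W, b, x_all, y_all)
for step in range(200):
    # SGD: estimate the gradient from a small random mini-batch only.
    idx = rng.choice(300, size=16, replace=False)
    _, dW, db = loss_and_grad(W, b, x_all[idx], y_all[idx])
    W -= alpha * dW   # theta = theta - alpha * gradient
    b -= alpha * db
final_loss, _, _ = loss_and_grad(W, b, x_all, y_all)
```

Standard gradient descent would call `loss_and_grad` on `x_all` in every step. The mini-batch version makes each update cheaper at the cost of a noisier gradient estimate.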

2.3.1 Watershed segmentation

A digital image is represented by the intensity value of each pixel. Dark areas have low intensity values and brighter areas have higher intensity values. The image can be visualized as a landscape with valleys and peaks at the different intensity values. Watershed segmentation can be described as imaginary drops of water flooding the landscape. Each local minimum in the image represents a valley in the landscape. When water from two valleys meets, a border, or watershed, is placed which prevents the water from the two valleys from merging. This creates a representation of the image where only the borders of the watersheds are present. To limit the number of regions that are segmented, you can place seeds at the specific objects of interest and at the background. Those seeds are then the valleys that are flooded, and all pixels within each flooded region get the same label [15].

Figure 4: Visualization of watershed segmentation. Water floods the valleys and the landscape is divided by the watersheds

2.3.2 Morphological dilation

Morphological dilation is an image processing method for filling gaps and extending object boundaries in an image. The method is mostly used on binary images. Dilation can be seen as a convolution between a mask (structuring element) and the image and is calculated as

\[ A \oplus B = \{\, z \mid (\hat{B})_z \cap A \neq \emptyset \,\} \qquad (3) \]

where A is the image, B is the mask, \hat{B} is the reflection of B, (\hat{B})_z is its translation by z, and A and B are sets in Z^2. The mask is constructed so that the purpose of the dilation is fulfilled. The mask is moved over each pixel of the image, and the pixel is set in the dilated image if the overlap between the mask and the image at that position is not empty [15].

Figure 5: Effect of dilation on a binary image with a 3x3 square mask

2.3.3 Normalization

Normalization is a process to scale down the range of the features in the data. If different features in the data have different scales, it might become hard for the network to adjust its weights to favor all those features.
The normalization is usually done in two steps, first subtracting the mean of the image from each pixel and then dividing by the standard deviation, as described in Equation (4) [9].

\[ \mathrm{Normalized\ Image} = \frac{\mathrm{Image} - \mathrm{mean}(\mathrm{Image})}{\mathrm{std}(\mathrm{Image})} \qquad (4) \]
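
Equation (4) translates directly into NumPy. The sketch below is a minimal per-image version. The function name `normalize_image` and the guard for a completely flat image (zero standard deviation) are assumptions added for the example and are not described in this report.

```python
import numpy as np

def normalize_image(image):
    """Subtract the image mean, then divide by the image standard deviation."""
    image = image.astype(np.float64)
    centered = image - image.mean()
    std = image.std()
    if std == 0:
        # Assumption: a flat image is only centered, to avoid dividing by zero.
        return centered
    return centered / std

# Hypothetical 2x2 patch with 8-bit intensities.
patch = np.array([[10, 20], [30, 40]], dtype=np.uint8)
normalized = normalize_image(patch)  # zero mean, unit standard deviation
```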

2.4 The Dataset

The data set contained stacked images from pap-smear tests. The images were photographed at different heights so that each cell is in focus in at least one image. All cells were marked with the cell type, the stack image in which the cell has the best focus, and its coordinates [16]. The number of samples for each cell type is presented below in Table 1.

Table 1: Number of samples in the dataset

  Cell Type            Samples
  Nilm                 6064
  Lsil                 402
  Hsil                 471
  Scc                  653
  Adenocarcinoma       53
  Other distorted      17
  Other occluded       144
  Other inflammatory   53
  Other degenerative   16
  ASC-H                2
  ASC-US               8

3 Method

Since most of the malignancy associated changes are present in the nuclei of the cells, one approach was to segment out the nuclei from each image. To get a comparison with this approach, a case with unsegmented images was also investigated.

3.1 Data collection

To handle the lack of samples of some cell types (Table 1) we grouped some of the cell types into one class. This class contained Other-distorted, -occluded, -inflammatory, -degenerative, ASC-H and ASC-US. From this we obtained 6 different classes to classify. All marked cells in the data set were cut out in 200x200 segments. In a later stage the images were cropped down to a size of 100x100 with the cell nucleus centered in the image. Each image was saved with the name of its specific cell type. The images were divided into three folders, training, validation and testing, as seen in Table 2.

Table 2: Number of samples for training, validation and testing

  Cell Type        Training   Validation   Test
  Nilm
  Lsil
  Hsil
  Scc
  Adenocarcinoma
  Other

3.2 Image pre-processing

To segment out the nucleus of each cell, seeded watershed was used. Since all cells have a circular shape, an ellipsoidal area around the cell can be marked as the background. From the watershed segmentation you receive a labeled representation of the image where the object, the border and the background have different labels.
This image can be made binary by setting all pixels labeled as object or border to one and the background pixels to zero. To make the transition between the background and the nuclei smoother, the binary images were dilated three times with a 3x3 square mask. The remaining background was set to the mean intensity value of the dilated area. This was done to further smooth the transition between the nuclei and the background.
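
The steps after the watershed segmentation (binary mask, three dilations with a 3x3 square mask, background fill) can be sketched as follows. This is an illustration under stated assumptions, not the code used in the project: the dilation is written in plain NumPy, the function names and the label value 0 for the background are hypothetical, and "the mean intensity value of the dilated area" is interpreted as the mean of the original pixels inside the dilated mask. The sketch also assumes that at least one pixel is labeled as object or border.

```python
import numpy as np

def dilate_3x3(mask):
    """Binary dilation with a 3x3 square mask: a pixel is set if any neighbor is set."""
    h, w = mask.shape
    padded = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def segment_nucleus(image, labels, background_label=0, n_dilations=3):
    """Keep the dilated nucleus region and flatten everything else to its mean."""
    # Object and border labels become True, the background label becomes False.
    mask = labels != background_label
    for _ in range(n_dilations):
        mask = dilate_3x3(mask)
    result = image.astype(np.float64).copy()
    # Assumption: the fill value is the mean intensity inside the dilated mask.
    result[~mask] = result[mask].mean()
    return result, mask

# Hypothetical 9x9 image with a single labeled nucleus pixel in the center.
image = np.arange(81, dtype=float).reshape(9, 9)
labels = np.zeros((9, 9), dtype=int)
labels[4, 4] = 1
segmented, mask = segment_nucleus(image, labels)  # mask grows to a 7x7 square
```

Each dilation grows the mask by one pixel in every direction, so three passes extend the nucleus region by three pixels before the background is filled.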

Figure 6: Pre-processing from raw image to final segmented image used for training

To handle the unevenly distributed data set (Table 2), the cell types with the fewest samples were augmented. The augmentation was done by first mirroring the image in three directions: up-down, left-right, and diagonally by mirroring up-down and then left-right (Figure 7a). After this, random rotations were made until there was an equal number of images in each class (Figure 7b). To get rid of the rotation border, the center of each image was cropped out in a 100x100 segment. The augmentation was done for the training and validation sets. Lastly the images were labeled from zero to five for the six different classes. Before training the network, the images were normalized to scale down the range of the pixel intensities.

(a) Mirrored images. (b) Rotated and scaled images.
Figure 7: Mirrored and rotated images of a cell

3.3 Deep Learning

For this project Caffe was used as the framework for implementing the CNNs. Two different architectures were investigated: a deeper architecture, which was a scaled-down version of the classic AlexNet architecture, and a smaller architecture where batch normalization was introduced. The hyperparameters of the architectures were decided by trial and error.

4 Result

4.1 Network architecture

The network architectures that were investigated are presented below (Table 3 and Table 4). For each network the layers with corresponding hyperparameters and sub layers are presented.

Table 3: Deeper architecture with used layers and sub layers

  Layer             Num. of features   Filter size   Pooling (kernel/stride)   Non-linearity   Regularizer
  Convolution                                        Max. 3x3, 2               Relu            -
  Convolution                                        Max. 3x3, 2               Relu            -
  Convolution                                                                  Relu            -
  Convolution                                                                  Relu            -
  Convolution                                        Max. 3x3, 2               Relu            -
  Fully connected                                                              Relu            Dropout
  Fully connected                                                              Relu            Dropout
  Fully connected

Table 4: Smaller architecture with used layers and sub layers

  Layer             Num. of features   Filter size   Pooling (kernel/stride)   Non-linearity   Regularizer
  Convolution       32                 5             Max. 3x3, 2               Relu            Batch norm.
  Convolution       32                 5             Avg. 3x3, 2               Relu            Batch norm.
  Convolution       64                 5             Avg. 3x3, 2               Relu            Batch norm.
  Fully connected

4.2 Testing result

The results when testing each network with images not previously seen by the network are presented in four confusion matrices. A confusion matrix presents the ground truth (rows) versus the predicted result of the network (columns).

4.2.1 Deeper architecture

The results when testing the deeper architecture with both segmented and unsegmented images are presented in Table 5 and Table 6.

Table 5: Confusion matrix representing results from segmented images for the deeper network

           Nilm     Lsil     Hsil     Scc      Adeno    Other
  Nilm     0.00%    58.91%   1.07%    19.09%   0.00%    20.93%
  Lsil     0.00%    76.39%   4.17%    15.28%   0.00%    4.17%
  Hsil     0.00%    17.14%   41.43%   38.57%   0.00%    2.89%
  Scc      0.00%    36.44%   23.73%   33.05%   0.00%    6.78%
  Adeno    0.00%    66.67%   0.00%    0.00%    0.00%    33.33%
  Other    0.00%    56.25%   0.00%    25.00%   0.00%    18.75%

Table 6: Confusion matrix representing results from unsegmented images for the deeper network

           Nilm     Lsil     Hsil     Scc      Adeno    Other
  Nilm     53.99%   20.00%   7.74%    5.93%    2.14%    10.21%
  Lsil     47.56%   26.83%   9.76%    7.32%    2.44%    6.10%
  Hsil     7.37%    7.37%    35.79%   35.79%   9.47%    4.21%
  Scc      7.52%    14.29%   24.81%   38.35%   8.27%    6.77%
  Adeno    8.33%    8.33%    25.00%   17.67%   0.00%    41.67%
  Other    30.95%   16.67%   11.90%   4.76%    4.76%    30.95%

4.2.2 Smaller architecture

The results when testing the smaller trained network are presented in Table 7 and Table 8.

Table 7: Confusion matrix representing results for segmented images for the shallower network

           Nilm     Lsil     Hsil     Scc      Adeno    Other
  Nilm     0.00%    20.25%   34.79%   27.42%   1.36%    16.18%
  Lsil     0.00%    16.67%   44.44%   22.22%   2.78%    13.89%
  Hsil     0.00%    21.43%   40.00%   32.86%   4.29%    1.43%
  Scc      0.00%    16.10%   36.44%   28.81%   6.78%    11.86%
  Adeno    0.00%    0.00%    25.00%   33.33%   0.00%    41.67%
  Other    0.00%    15.62%   28.12%   31.25%   3.12%    21.88%

Table 8: Confusion matrix representing results for unsegmented images for the shallower network

           Nilm     Lsil     Hsil     Scc      Adeno    Other
  Nilm     58.93%   7.24%    1.98%    0.74%    2.55%    28.56%
  Lsil     46.34%   15.85%   3.66%    1.22%    0.00%    32.93%
  Hsil     0.00%    30.53%   33.68%   12.63%   5.26%    17.89%
  Scc      1.50%    21.05%   24.81%   21.80%   12.03%   18.80%
  Adeno    0.00%    8.33%    0.00%    8.33%    0.00%    83.33%
  Other    38.10%   4.76%    4.76%    9.52%    0.00%    42.86%

5 Discussion

5.1 Network Architecture

A difficult task in this project was finding a suitable network architecture. Since the choice of architecture is crucial, this might have a large impact on the result. The network architecture was decided by trial and error, where the size and hyperparameters were experimented with. The two architectures presented in the results (Table 3 and Table 4) were the ones that produced the best results when experimenting with different depths and hyperparameters. As can be seen in Table 5 and Table 7, choosing a deeper network for the segmented images generated a slightly better result. The test scores for the Lsil and Hsil cells are higher than for the shallower architecture. For the unsegmented images the shallower network tends to classify images as the Other class more often than the deeper architecture does. An important result is the lower scores in the first column for the Hsil, Scc and Adenocarcinoma cells in the shallower network (Table 8). This indicates that the shallower network is better at distinguishing between the Nilm cells and the cancerous cells.
You might also argue that the deeper network is better at distinguishing between the Lsil cells and the more cancerous Hsil and Scc cells (Table 6), which is also an important result.

5.2 Images

5.2.1 Segmented Images

The best result with the segmented images was obtained using the deeper architecture. When reviewing Table 5, a couple of interesting things can be seen. The network could not predict any of the Nilm or Adenocarcinoma cells. Since the data set contained so few Adenocarcinoma cells, the network might have had too few samples to pick up any specific features for this type. In the case of failing to predict Nilm cells, one explanation might be that information about the surroundings is lost when segmenting the images. Since the nuclei of Nilm and Lsil cells look similar, this might be the reason the network predicts Nilm as Lsil. Another thing that can be observed is that the network predicts Lsil very well. One can also see that the network can distinguish between Lsil and Hsil, but cannot really distinguish between Hsil and Scc. The explanation for this might be that the Hsil and Scc cells are quite alike but differ from the Lsil cells.

5.2.2 Unsegmented Images

As discussed in Section 5.1, the smaller architecture generated the most interesting result for the unsegmented images. Looking at the confusion matrix in Table 8, one can see that the class called Other is predicted a lot for all classes. The Other class contains samples from many types of cells. This class probably has the largest variety of cells, which might be the reason that the network has a bias towards predicting this class. Another thing that can be observed is the failure to predict the Adenocarcinoma cells. This can be caused by the fact that the data set only contained 53 samples of this cell type (Table 1). Since there are so few samples, even with the augmentation, the features for this class might be hard for the network to determine. You can also see that the network is best at predicting Nilm cells, but that it classifies Lsil cells as Nilm almost half of the time. This can be because Lsil cells only have small abnormal changes in comparison to the Nilm cells. Since there are more samples of the Nilm cells than of the Lsil cells, there is more variety in that class. The most interesting result is that the network is very good at distinguishing Hsil, Scc and Adenocarcinoma cells from Nilm cells. Hsil cells are not likely to return to normal, and Scc and Adenocarcinoma cells are already cancerous. This means that the network can actually see a difference between the more dangerous cells and the Nilm cells.

6 Conclusion

6.1 Using Deep Learning to classify Malignancy Associated Changes

Even though the results are not perfect, we see that there is potential for using deep learning for detecting malignancy associated changes. The major reason for this is the result for the unsegmented images, where the network distinguished the more dangerous cell types from the Nilm cells. Due to the shortage of time in this project, we believe that more extensive studies are needed to evaluate the method and generate a better result.

6.2 Future work

The major limitation in this project was the data set. The data set was unevenly distributed, and thus many classes had to be augmented to a large extent. Collecting more samples of the cancerous cells would give a much more evenly distributed data set and would make the augmentation less crucial. This might generate a more accurate distinction between the Hsil, Scc and Adenocarcinoma cells.
It would also be interesting to work with the depth in the image stack. Instead of only training on one image for each cell, a substack with different lens focus could be used. This might give the network some extra information about each cell. On the image processing side, more work could be done on enhancing features in the different cell types. This requires more studies on how the malignancy associated changes appear. An area with a lot of potential for generating a better result is the architecture of the CNN, where the depth, the hyperparameters and the choice of architecture can all be developed further.

Acknowledgments

We would like to thank our supervisors Carolina Wählby, Sajith Kecheril Sadanandan and Petter Ranefall for all the help and inspiration they have given us during this project.

References

[1] Human papillomavirus (HPV) and cervical cancer (fact sheet), 2016, URL accessed
[2] Hallinan, J., Detection of malignancy associated changes in cervical cells using statistical and evolutionary computation techniques, Ph.D. thesis, The University of Queensland,
[3] Pap Test Faq, Wisconsin State Laboratory of Hygiene, URL accessed:
[4] Jürgen Schmidhuber, My First Deep Learning System of 1991 + Deep Learning Timeline, arxiv: [cs.ne], URL
[5] Yann LeCun, Yoshua Bengio, Geoffrey Hinton, Deep learning, Nature 521, 2015
[6] Olga Russakovsky*, Jia Deng*, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg and Li Fei-Fei (* = equal contribution), ImageNet Large Scale Visual Recognition Challenge, IJCV,
[7] Convolutional Neural Networks (LeNet), LISA lab, URL accessed:
[8] Andrej Karpathy, Convolutional Neural Networks: Architectures, Convolution / Pooling Layers, URL accessed:
[9] Andrej Karpathy, Neural Networks Part 2: Setting up the Data and the Loss, URL accessed:
[10] Daniel Graupe, Deep Learning Neural Networks: Design and Case Studies, World Scientific Publishing Co Inc, 2016
[11] Nitish Srivastava, Geoffrey Hinton, Ilya Sutskever, Ruslan Salakhutdinov, Dropout: A Simple Way to Prevent Neural Networks from Overfitting, Department of Computer Science, University of Toronto, 2014
[12] Sergey Ioffe, Christian Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, URL accessed:
[13] Optimization: Stochastic Gradient Descent, Stanford University, URL accessed:
[14] Jia, Yangqing and Shelhamer, Evan and Donahue, Jeff and Karayev, Sergey and Long, Jonathan and Girshick, Ross and Guadarrama, Sergio and Darrell, Trevor, Caffe: Convolutional Architecture for Fast Feature Embedding, 2014, arxiv: [cs.cv], URL
[15] Gonzalez, R. C., Woods, R. E., Digital image processing, 2nd edition, Prentice-Hall, 2002
[16] P. Malm, Multi-resolution Cervical Cell Dataset, Centre for Image Analysis, Swedish University of Agricultural Sciences and Uppsala University, Technical report (Blue series) No. 37. Available online at:


Visual Interpretation of Hand Gestures as a Practical Interface Modality Visual Interpretation of Hand Gestures as a Practical Interface Modality Frederik C. M. Kjeldsen Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate

More information

Lecture 17 Convolutional Neural Networks

Lecture 17 Convolutional Neural Networks Lecture 17 Convolutional Neural Networks 30 March 2016 Taylor B. Arnold Yale Statistics STAT 365/665 1/22 Notes: Problem set 6 is online and due next Friday, April 8th Problem sets 7,8, and 9 will be due

More information

On the Robustness of Deep Neural Networks

On the Robustness of Deep Neural Networks On the Robustness of Deep Neural Networks Manuel Günther, Andras Rozsa, and Terrance E. Boult Vision and Security Technology Lab, University of Colorado Colorado Springs {mgunther,arozsa,tboult}@vast.uccs.edu

More information

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks

Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Improving reverberant speech separation with binaural cues using temporal context and convolutional neural networks Alfredo Zermini, Qiuqiang Kong, Yong Xu, Mark D. Plumbley, Wenwu Wang Centre for Vision,

More information

Automatic understanding of the visual world

Automatic understanding of the visual world Automatic understanding of the visual world 1 Machine visual perception Artificial capacity to see, understand the visual world Object recognition Image or sequence of images Action recognition 2 Machine

More information