1 SPECIFICITY of MACHINE LEARNING PROJECTS Borys Pratsiuk, Head of R&D, Ci
2 Who am I? Borys Pratsiuk, Ph.D. (bopr@ciklum.com, @b_pratsiuk). Roles on the timeline (2004, 2006, 2007, 2009, 2012, 2013, 2015 -...): First project, C, embedded; Assistant professor, Kiev Polytechnic Institute; Ph.D., Solid-state Electronics; Engineer, R&D Lab, Tescom, South Korea; Android Developer; Senior Android Team Lead; Android Architect; Head of R&D Engineering
3 Agenda: 1. Why is science important again? 2. What is scientific research? 3. Data Science ecosystem and main challenges 4. Case studies
4 ML History: 1943, McCulloch & Pitts, "A Logical Calculus of the Ideas Immanent in Nervous Activity"; 1957, Rosenblatt, Perceptron; 1965, Ivakhnenko & Lapa, Group Method of Data Handling (GMDH). "Networks trained by the Group Method of Data Handling (GMDH) (Ivakhnenko and Lapa, 1965; Ivakhnenko et al., 1967; Ivakhnenko, 1968, 1971) were perhaps the first Deep Learning systems" (Juergen Schmidhuber)
5 IT evolution (timeline: 1960s, 1980s, 2000, 2010, now): Scientists, brilliant people; Systems (IBM, Microsoft, etc.); Applications (Web, Mobile, etc.); Ecosystem (outsourcing, freelance, PM / BA / QA / Data Science); Brilliant people
6 Data today
7 Nvidia - main ML hardware manufacturer
8 Modern AI
9 Reasons for adopting Machine Intelligence: https://www.forbes.com/sites/louiscolumbus/2017/09/10/how-artificial-intelligence-is-revolutionizing-business-in-2017/#42607d805463 (lots of nice charts about AI at the link!)
10 What is scientific research? Scientific research is the fundamental groundwork for testing any revolutionary business idea. Zhang, W., Yu, Q., Siddiquie, B., Divakaran, A., & Sawhney, H. (2015). "Snap-n-Eat": food recognition and nutrition estimation on a smartphone. Journal of Diabetes Science and Technology, 9(3), 525-533. doi: 10.1177/1932296815582222 https://www.ncbi.nlm.nih.gov/pmc/articles/pmc3224860/
11 Image recognition better than human
12 Business case - AI Drones and Industrial Equipment https://news.developer.nvidia.com/ai-drones-help-inspect-industrial-equipment/
13 Bad Business case :(
14 Kaggle.com
15 Typical Machine Learning Flow Diagram (iterative process). Stages: Data Acquisition, Data Preparation, Model Training, Model Testing. Steps shown: cleansing, shaping, enrichment, data annotation; train set, validation set, test set; train model; cross-validation; evaluate performance
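The train / validation / test split and the cross-validation step named on this slide can be sketched in plain Python. This is a minimal illustration, not the speaker's pipeline: the function names `split_dataset` and `k_fold_indices`, the split fractions, and the seed are all assumptions made for the example.

```python
import random


def split_dataset(items, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    The fractions and seed are illustrative defaults, not values from the talk.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    n_val = round(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size
```

In this sketch the test set is held out once and never used during training, while `k_fold_indices` would be applied to the training portion to rotate the validation fold.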
16 Artificial Neuron: the base element of any neural network design
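A single artificial neuron is commonly modeled as a weighted sum of its inputs plus a bias, passed through an activation function. The sketch below assumes a sigmoid activation; the slide does not say which activation is shown, and the function names are illustrative.

```python
import math


def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(inputs, weights, bias, activation=sigmoid):
    """Compute activation(sum(x_i * w_i) + bias) for one neuron."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)
```

For example, `neuron([1.0, 2.0], [0.5, -0.25], 0.0)` has a weighted sum of zero, so the sigmoid returns 0.5.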
17 NN/Deep Learning: neural net design is a modern art
18 ANN Example (not good): http://playground.tensorflow.org/
19 ANN Example (OK)
20 Problem #1: NO trust, NO scope, NO estimates. Strong sales, strong team, strong portfolio
21 Problems #n: proper hardware, proper dataset, proper team, proper PM
22 CASE STUDIES
23 Case Study: Voice recognition and natural language processing. Voice recognition and natural language processing are among the most important capabilities for IoT. A voice recognition technique was implemented to prepare your favorite cocktail: you say, "Scoofy, make my favorite drink!" and the device makes your favorite drink based on your preferences and previous history.
24 Fish recognition (Kaggle competition). Problem: In the Western and Central Pacific, where 60% of the world's tuna is caught, illegal, unreported, and unregulated fishing practices are threatening marine ecosystems, global seafood supplies, and local livelihoods. The Nature Conservancy is addressing this by developing a camera-based fish recognition solution to monitor the fishermen's catch. Results: developed a fish detection algorithm (localization precision ~95%, recall 80%); developed a classification algorithm for the different types of fish on the boat (accuracy ~93%); overall performance: log-loss ~0.7
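The metrics quoted on this slide (precision, recall, log-loss) have standard definitions that can be sketched in a few lines. The function names and the clipping constant `eps` below are assumptions for illustration; the competition's own scoring code is not shown in the talk.

```python
import math


def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def log_loss(true_labels, predicted_probs, eps=1e-15):
    """Multiclass log-loss: mean negative log-probability of the true class.

    true_labels: class indices; predicted_probs: one probability list per sample.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for label, probs in zip(true_labels, predicted_probs):
        p = min(max(probs[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(true_labels)
```

Lower log-loss is better: a classifier that always assigns probability 0.5 to the true class scores ln 2, about 0.693.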
25 Case: Detection of parasites on fish. Pipeline: detect the fish in the video frame; filter the images and select the brightest; detect and count parasites
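The "select the brightest images" step of this pipeline can be illustrated with a small sketch. It assumes frames are grayscale images stored as nested lists of pixel intensities and that brightness means the mean intensity; both are assumptions, as are the function names, since the talk does not describe the actual filter.

```python
def mean_brightness(frame):
    """Average pixel intensity of a grayscale frame (list of rows)."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def select_brightest(frames, top_k=1):
    """Return the top_k frames ordered from brightest to darkest."""
    return sorted(frames, key=mean_brightness, reverse=True)[:top_k]
```

In a real pipeline the frames would come from the fish detector's cropped regions, and the selected frames would be passed on to the parasite detector.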
26 Satellite image multi-labeling (Kaggle competition). Problem: Planet and its Brazilian partner SCCON challenged participants to label satellite image chips with atmospheric conditions and various classes of land cover/land use. The resulting algorithms will help the global community better understand where, how, and why deforestation happens all over the world, and ultimately how to respond. Results: developed a multi-labeling algorithm based on convolutional neural networks and an adaptive thresholding algorithm (accuracy 99%); overall performance: F-beta = 93% (first place had F-beta = 93.3%)
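The F-beta score and the idea of per-label thresholds can be sketched as follows. The F-beta formula is the standard one; the value `beta=2.0`, the function names, and the fixed per-label thresholds are assumptions for illustration, and the team's actual adaptive thresholding algorithm is not described on the slide.

```python
def fbeta(true_labels, pred_labels, beta=2.0):
    """F-beta score for one sample, given its true and predicted label sets.

    F_beta = (1 + beta^2) * P * R / (beta^2 * P + R); beta > 1 favors recall.
    """
    true_set, pred_set = set(true_labels), set(pred_labels)
    true_positives = len(true_set & pred_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred_set)
    recall = true_positives / len(true_set)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def apply_thresholds(probs, thresholds):
    """Return the indices of labels whose probability reaches its own threshold."""
    return [i for i, (p, t) in enumerate(zip(probs, thresholds)) if p >= t]
```

Using one threshold per label, rather than a single global cutoff, lets rare labels be predicted at lower confidence; in practice the thresholds would be tuned on a validation set to maximize the score.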
27 Background removal (Kaggle competition). Problem: develop an algorithm that automatically removes the photo studio background, making it possible to superimpose cars on a variety of backgrounds. Current results: Dice coefficient of 99.71% (overlap between the Photoshop mask and the mask generated by the algorithm)
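The Dice coefficient measures the overlap of two binary masks as 2|A ∩ B| / (|A| + |B|). A minimal sketch, assuming masks are nested lists of 0/1 values of equal shape (the function name and the convention of returning 1.0 for two empty masks are illustrative choices):

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks given as lists of rows of 0/1 values."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    total = sum(flat_a) + sum(flat_b)
    if total == 0:
        return 1.0  # two empty masks are treated as a perfect match
    return 2.0 * intersection / total
```

A value of 1.0 means the generated mask matches the reference mask pixel for pixel; 0.0 means no overlap at all.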
28 Pedestrian Tracking Demo Watch: https://vimeo.com/202976905
29 Emotion Recognition Demo Connect Location and Emotions: In-store Service Improvement
30 Real-time in-store augmented reality. Connect location and virtual data: new technologies such as Google Tango make it possible to create augmented reality that connects location and virtual data. This may result in improved logistics solutions (shelf filling, position tracking) and marketing offerings. Recommender systems + video processing in real time
31 Epileptic seizure prediction and prevention. Epilepsy is a group of neurological diseases characterized by spontaneous seizures that can vary from brief and nearly undetectable episodes to long periods of vigorous shaking. 33% of patients have uncontrollable seizures.
32 Alzheimer's disease and mild cognitive impairment detection. Alzheimer's disease (AD) is a neurological disease of the human brain that leads to inevitable neuron death and loss of intellectual abilities, including reasoning and memory, which becomes serious enough to impede social or occupational functioning. Treatment of AD depends directly on the stage at which the disease is diagnosed, and starting treatment earlier can ease patients' daily lives in the long term. Approach: MRI image analysis and classification to detect brain tissue degradation at the very early stages of AD (when there are only minor symptoms), so that treatment can start earlier.