Big Data Best Practice

Sean Patrick Murphy, sean@pingthings.io
JSIS, Salt Lake City, May 23, 2017

The Value of Data, Circa 2006

"Data is just like crude. It's valuable, but if unrefined it cannot really be used. It has to be changed into gas, plastic, chemicals, etc. to create a valuable entity that drives profitable activity; so must data be broken down, analyzed for it to have value."

M. Palmer, "Data is the new oil", http://ana.blogs.com/maestros/2006/11/data_is_the_new.html

"The problem is that just 2 percent of all the terabytes and petabytes of data generated by connected power plants, windfarms, grids, substations and energy management systems is being analyzed and used today."

Ganesh Bell, Chief Digital Officer of GE Power

How Much Data?

2 x 64 bits / sample
x 30 samples / second
x 60 seconds / minute
x 60 minutes / hour
x 24 hours / day
x 30 days / month
x 12 streams / PMU
x 1/8 byte / bit
= 14,929,920,000 bytes / month / PMU, or about 15 GB / month / PMU

Assume 2,500 deployed and active PMUs:
2,500 x 15 GB = 37.5 TB / month

Significant Scale

500,000 PMUs deployed today.

Dr. Edmund O. Schweitzer III, President and Chairman of the Board, Schweitzer Engineering Laboratories

How Much Data?

At 14,929,920,000 bytes / month / PMU, or about 15 GB / month / PMU, if all 500,000 PMUs came online:
500,000 x 15 GB = 7.5 petabytes / month
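The arithmetic on this slide can be checked with a short script. This is a minimal sketch that uses only the figures quoted on the slide (two 64-bit fields per sample, 30 samples per second, 12 streams per PMU, a 30-day month); the function and constant names are illustrative and not part of any PingThings tooling.

```python
# Back-of-the-envelope PMU data volume, using only the figures on the slide.
BITS_PER_SAMPLE = 2 * 64               # two 64-bit fields per sample
SAMPLES_PER_SECOND = 30
SECONDS_PER_MONTH = 60 * 60 * 24 * 30  # 30-day month
STREAMS_PER_PMU = 12
BITS_PER_BYTE = 8


def bytes_per_pmu_month() -> int:
    """Raw (uncompressed) bytes produced by one PMU in a 30-day month."""
    bits = (BITS_PER_SAMPLE * SAMPLES_PER_SECOND
            * SECONDS_PER_MONTH * STREAMS_PER_PMU)
    return bits // BITS_PER_BYTE


def fleet_bytes_per_month(pmu_count: int) -> int:
    """Raw bytes per month for a fleet of identical PMUs."""
    return pmu_count * bytes_per_pmu_month()


if __name__ == "__main__":
    per_pmu = bytes_per_pmu_month()
    print(f"per PMU: {per_pmu:,} bytes/month ({per_pmu / 1e9:.1f} GB)")
    for count in (2_500, 500_000):
        total = fleet_bytes_per_month(count)
        print(f"{count:,} PMUs: {total / 1e12:,.1f} TB/month")
```

With the unrounded per-PMU figure, 2,500 PMUs come to roughly 37 TB per month and 500,000 PMUs to roughly 7.5 PB per month.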

Just in Case You Were Wondering

Byte (8 bits)
Kilobyte (1 000 bytes)
Megabyte (1 000 000 bytes)
Gigabyte (1 000 000 000 bytes)
Terabyte (1 000 000 000 000 bytes)
Petabyte (1 000 000 000 000 000 bytes)
Exabyte (1 000 000 000 000 000 000 bytes)
Zettabyte (1 000 000 000 000 000 000 000 bytes)
Yottabyte (1 000 000 000 000 000 000 000 000 bytes). Named after Yoda!
Xenottabyte (1 000 000 000 000 000 000 000 000 000 bytes)
Shilentnobyte (1 000 000 000 000 000 000 000 000 000 000 bytes)
Domegemegrottebyte (1 000 000 000 000 000 000 000 000 000 000 000 bytes)

Perspective: Gorilla

2 billion unique time series identified by a string key.
700 million data points (time stamp and value) added per minute.
Store data for 26 hours.
More than 40,000 queries per second at peak.
Reads succeed in under one millisecond.
Support time series with 15-second granularity (4 points per minute per time series).
Two in-memory, not co-located replicas (for disaster recovery capacity).
Always serve reads even when a single server crashes.
Ability to quickly scan over all in-memory data.
Support at least 2x growth per year.
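To put the Gorilla figures in the same units as the PMU estimate, the ingest rate and retention window can be turned into a point count and a raw size. This is a rough sketch: the 16 bytes per point is an assumption (a 64-bit time stamp plus a 64-bit value, uncompressed) and says nothing about how the real system encodes data in memory; all names are illustrative.

```python
# Rough sizing of the Gorilla workload from the figures on the slide.
POINTS_PER_MINUTE = 700_000_000
RETENTION_HOURS = 26
BYTES_PER_POINT = 16  # assumption: 64-bit time stamp + 64-bit value, uncompressed


def points_per_second(points_per_minute: int) -> float:
    """Average ingest rate in points per second."""
    return points_per_minute / 60


def points_retained(points_per_minute: int, hours: int) -> int:
    """Number of points held if every point is kept for `hours` hours."""
    return points_per_minute * 60 * hours


def raw_bytes(points: int, bytes_per_point: int = BYTES_PER_POINT) -> int:
    """Uncompressed size of `points` data points."""
    return points * bytes_per_point


if __name__ == "__main__":
    retained = points_retained(POINTS_PER_MINUTE, RETENTION_HOURS)
    print(f"ingest: {points_per_second(POINTS_PER_MINUTE):,.0f} points/s")
    print(f"retained over {RETENTION_HOURS} h: {retained:,} points")
    print(f"raw size at {BYTES_PER_POINT} B/point: {raw_bytes(retained) / 1e12:.1f} TB")
```

Under that assumption, 26 hours of data is about 1.1 trillion points, or roughly 17.5 TB per replica before any compression.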

Cramming More Components onto Integrated Circuits

        CPU                RAM                Storage            Software
        Cost per gigaflop  Cost per gigabyte  Cost per gigabyte  Enterprise data systems
1995    $42,000            $32,000            $60,000            Millions of $
2017    $0.03              $4                 $0.03              Free, open source

The RAM required to hold a month's worth of PMU data for the entire North American continent costs approximately $10K.
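The scale of the price drop in the table is easier to see as ratios. The sketch below uses only the dollar figures from the table; the dictionary layout and function names are illustrative.

```python
# Price changes between 1995 and 2017, using the figures from the table.
COSTS = {
    "CPU ($/gigaflop)": (42_000.0, 0.03),
    "RAM ($/gigabyte)": (32_000.0, 4.0),
    "Storage ($/gigabyte)": (60_000.0, 0.03),
}


def price_drop_factor(cost_1995: float, cost_2017: float) -> float:
    """How many times cheaper the 2017 price is than the 1995 price."""
    return cost_1995 / cost_2017


def gigabytes_for_budget(budget_usd: float, cost_per_gb: float) -> float:
    """Capacity in gigabytes that a budget buys at a given price per gigabyte."""
    return budget_usd / cost_per_gb


if __name__ == "__main__":
    for name, (then, now) in COSTS.items():
        print(f"{name}: {price_drop_factor(then, now):,.0f}x cheaper")
    ram_2017 = COSTS["RAM ($/gigabyte)"][1]
    print(f"$10,000 of RAM in 2017: {gigabytes_for_budget(10_000, ram_2017):,.0f} GB")
```

By these figures, compute became about 1.4 million times cheaper, RAM about 8,000 times cheaper, and storage about 2 million times cheaper.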

Current Analytics Process

1. Gain access to the data
2. Query data from historian and pull down to laptop
3. Write code in Excel/Matlab/R to do one-time analysis
4. Store data and code in local folder
5. Move to production?

Microsoft Excel

Various studies over the past few years report that 88 percent of all spreadsheets have "significant" errors in them.
Even the most carefully crafted spreadsheets contain errors in 1 percent or more of all formula cells.
JPMorgan Chase lost more than $6 billion in its London Whale incident, in part due to Excel spreadsheet errors (including alleged copying and pasting of incorrect information from multiple spreadsheets).

Raymond R. Panko, "What We Know About Spreadsheet Errors", Journal of End User Computing, Special Issue on Scaling Up End User Development, Vol. 10, No. 2, Spring 1998, pp. 15-21

Analyst or Engineer

Best Practices: Free the Data

1. Data should be easily accessible, and visualization should be free and at full resolution, within seconds, not days.

2. The faster analyses execute, the more use cases emerge.

Platform components:
Message bus: distributed, horizontally scalable, designed to handle any and all sensor data input.
BTrDB: distributed, horizontally scalable data store designed for hierarchical time series data.
Analysis platform: distributed analysis and deep learning platform for both real-time and asynchronous data analysis at scale.
Web/mobile apps for specific use cases: data quality, GMD/GIC, asset maintenance, anomaly detection.
PT Work Bench: collaborative data science environment to build and share analytics within the utility.

3. Code, results, and ideas should be easily shared within the organization.

4. Move ad hoc analyses to production rapidly.
5. Never throw away data.

Questions?
sean@pingthings.io

Leadership
Jerry Schuman, Co-Founder, Chief Executive Officer
Sean Murphy, Co-Founder, Chief Data Scientist

"PingThings is a great example of the type of disruptive software that industries need to scale up the Industrial Internet, and we're delighted to make an investment in the company."
William "Bill" Ruh, CEO, GE Digital