The seven pillars of Data Science


Japanese Joint Statistical Meeting 2016, Kanazawa University, 6-9 September 2016
The seven pillars of Data Science
Hideyasu SHIMADZU
Department of Mathematical Sciences and Centre for Data Science, Loughborough University, UK

[Figure: Google Trends for "Big Data"]

[Figure: Google Trends for "Data Science"]

[Figure: Google Trends for "データサイエンス" ("Data Science" in Japanese)]
Oops! Japan already had a Data Science boom about a decade ago!?

"If you torture the data long enough, it will confess." (Ronald Coase)
Let data speak, never torture them.
October 2012

The World's 7 Most Powerful Data Scientists
By Tim O'Reilly (2011)
1. Larry Page, CEO, Google
2. Jeff Hammerbacher, Chief Scientist, Cloudera and DJ Patil, Entrepreneur-in-Residence, Greylock Ventures
3. Sebastian Thrun, Professor, Stanford University and Peter Norvig, Data Scientist, Google
4. Elizabeth Warren, Candidate, U.S. Senate (Massachusetts)
5. Todd Park, CTO, Department of Health and Human Services
6. Alex "Sandy" Pentland, Professor, MIT
7. Hod Lipson and Michael Schmidt, Computer Scientists, Cornell University

Data Science activities in the UK
Alan Turing Institute (HQ: British Library): 2015-, £42m (about 5.6 billion yen) for the initial 5 years
Big Data (University of Cambridge)
Edinburgh Data Science (University of Edinburgh)
Oxford Internet Institute (University of Oxford)
Centre for Data Science (University College London)
Warwick Data Science Institute (University of Warwick)
Data Science Institute (Imperial College London): 2014-
Data Science Institute (Lancaster University): 2014-
Centre for Data Science (Loughborough University): 2014-
Leeds Institute for Data Analytics (University of Leeds): 2014-
Institute for Analytics and Data Science (University of Essex): 2015-
Data Science Institute (University of Manchester): 2015-
More and more!

Data Science activities in the UK (cont.)
154 universities in the UK
35 universities offer Data Science related BSc courses for 2017 (30 universities for 2016) (The Universities and Colleges Admissions Service, UCAS)
50 universities now offer Data Science related MSc courses
More and more!
These courses are jointly offered by groups of departments: computer science, statistics, mathematics and subject-matter disciplines.

BSc (Data Science)
1st year: Strong, general mathematical foundation. Programming, Data Structures, Probability and Exploratory Data Analysis.
2nd year: Statistical topics in considerable depth, Algorithms, Databases, Software Engineering. Optional modules: Artificial Intelligence, Linear Statistical Modelling etc.
3rd year: Data Science Project. Optional modules: Machine Learning, Bayesian Forecasting etc.

MSc (Data Science), 1-year taught course
Data Mining
Data Science Fundamentals
Programming for Data Scientists
Statistical Inference
Statistical Methods and Modelling
Likelihood Inference
Generalised Linear Models
Elements of Distributed Systems
Systems Architecture and Integration
Applied Data Mining

[Diagram relating Computer Science, Mathematics, Statistics and Data Science]

[Figure: Google Trends for "データサイエンス" ("Data Science" in Japanese)]

データサイエンス ("Data Science" in Japanese), Google Trends
1996: 64th Annual Meeting of the Japan Statistical Society (JSS), common theme "Data Science I, II, III"
2013: International Conference (RSS), Contributed Session: Data Science
2013: Joint Statistical Meetings (ASA), "Speaking Clearly About Data Scientists"
Mathematics of Phenomena through Data Science (2003-), Cherry Bud Workshop series:
Data Science and System Reduction (2004)
Quantitative Risk Management (2005)
Building Models from Data (2006)
Interaction through Data (2007)
Discovery through Data Science (2008)
温故知新 (learn from the past to understand the new)
Discipulus est prioris posterior dies: Today is the scholar of yesterday.

Sir Ronald Fisher
Analyses of the vast amount of data accumulated from the "Classical Field Experiments".
1919 Rothamsted Research
1921 Studies in Crop Variation
1925 Statistical Methods for Research Workers

John Tukey
The future of data analysis (Tukey 1962)
"For a long time I have thought I was a statistician... I have had cause to wonder and to doubt.... my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data."
Statistics alone is not enough to cover all of these!!

John Tukey (cont.)
The future of data analysis (Tukey 1962)
"There are diverse views as to what makes a science, but three constituents will be judged essential by most, viz: (a1) intellectual content, (a2) organization in an understandable form, (a3) reliance upon the test of experience as the ultimate standard of validity."
Mathematics cannot be a science, but Data Analysis can be a new science!
-> Data Science ought to be a science!!

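Akaike (1998) 時系列解析の方法 (Methods of Time Series Analysis)
Shibata (2000) データサイエンスのすすめ (An Invitation to Data Science)
Shibata (2001) データリテラシー (Data Literacy)
Hayashi (2001) データの科学 (The Science of Data)
J. Tukey (1962) The future of Data Analysis
T. Speed (1986) Questions, answers and Statistics
C. Wu (1997) Statistics = Data Science?
W. Cleveland (2001) Data science: an action plan for expanding the technical areas of the field of statistics
P. Diggle (2015) Statistics: a data science for the 21st century
ASA (2015) ASA Statement on The Role of Statistics in Data Science
D. Donoho (2015) 50 years of Data Science

Akaike (1998): Creating the information we need by using data. A model is an expression of a hypothesis, and modelling is a fundamental intellectual activity. Through it, data acquire meaning and information is created.

Hayashi (2001): It seeks to look at complex and ambiguous phenomena with data placed at the centre. This is the fundamental philosophy of the science of data.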

Shibata (1984-): from upstream to downstream of data
Data collection
Data description/cleaning
Data browsing/visualisation
Data modelling
Model validation

Donoho (2015)
Data Exploration and Preparation
Data Representation and Transformation
Computing with Data
Data Modelling
Data Visualisation and Presentation
Science about Data Science

Cleveland (2001)
Multidisciplinary investigation (25%)
Models and methods for data (20%)
Computing with data (15%)
Pedagogy (15%)
Tool evaluation (5%)
Theory (20%)

Multidisciplinary interaction: disciplinary questions/techniques/knowledge
Data structurisation: description/cleaning/processing/storing
Data operationalisation: understanding data & questions/meanings
Knowledge extraction: data analysis/summarising/modelling/algorithms
Knowledge presentation: visualisation/models/prediction
Knowledge validation: evaluation/simulation/model diagnostics
Communication: literacy/technology transfer/pedagogy
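As a rough illustration only, and not something shown in the talk, four of the pillars above can be read as stages of a small pipeline. The sketch below uses a made-up dose/response data set and hypothetical function names (`structurise`, `fit_line`, `loo_rmse`, `present`): cleaning stands in for data structurisation, a least-squares line for knowledge extraction, leave-one-out error for knowledge validation, and a one-line summary for knowledge presentation.

```python
from math import sqrt
from statistics import mean

# Hypothetical raw records (dose, response); None and "n/a" mark missing values.
RAW = [(1, "2.1"), (2, "3.9"), (3, None), (4, "8.2"),
       (5, "n/a"), (6, "12.1"), (7, "13.8"), (8, "16.2")]


def structurise(raw):
    """Data structurisation: keep only records readable as (float, float)."""
    rows = []
    for x, y in raw:
        try:
            rows.append((float(x), float(y)))
        except (TypeError, ValueError):
            continue  # drop records that cannot be read as numbers
    return rows


def fit_line(rows):
    """Knowledge extraction: least-squares line y = intercept + slope * x.

    Needs at least two rows with distinct x values.
    """
    xbar = mean([x for x, _ in rows])
    ybar = mean([y for _, y in rows])
    sxx = sum((x - xbar) ** 2 for x, _ in rows)
    sxy = sum((x - xbar) * (y - ybar) for x, y in rows)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def loo_rmse(rows):
    """Knowledge validation: leave-one-out root-mean-square prediction error."""
    squared_errors = []
    for i, (x, y) in enumerate(rows):
        intercept, slope = fit_line(rows[:i] + rows[i + 1:])
        squared_errors.append((y - (intercept + slope * x)) ** 2)
    return sqrt(mean(squared_errors))


def present(intercept, slope, rmse):
    """Knowledge presentation: a one-line summary of the fitted model."""
    return f"response = {intercept:.2f} + {slope:.2f} * dose (LOO RMSE {rmse:.2f})"


if __name__ == "__main__":
    rows = structurise(RAW)
    intercept, slope = fit_line(rows)
    print(present(intercept, slope, loo_rmse(rows)))
```

The remaining pillars (multidisciplinary interaction, data operationalisation and communication) concern questions, meanings and people rather than computation, so they are deliberately left out of the sketch.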

Example: Data structurisation
Each pillar has expanded over time!

The seven pillars of Data Science
1. Multidisciplinary interaction
2. Data structurisation
3. Data operationalisation
4. Knowledge extraction
5. Knowledge presentation
6. Knowledge validation
7. Communication

Thank you very much for your attention! Any questions or comments are welcome.
Hideyasu SHIMADZU