Research Challenges in Forecasting Technical Emergence. Dewey Murdick, IARPA 25 September 2013

IARPA invests in high-risk/high-payoff research programs that have the potential to provide our nation with an overwhelming intelligence advantage over our future adversaries.
http://www.iarpa.gov/

A Few Interesting Research Problems
Scan for technical emergence
- Move beyond search
- Reliably query for indicative patterns of technical emergence without starting with a known, named subject
- Analyze diverse and large data streams across disciplines, cultures, and languages
- Support strategic investment
- Facilitate discovery and innovation
Forecast scientific, technical, application, and market events
- Quantitative event forecasts
- Improve accuracy and early event detection

Foresight and Understanding from Scientific Exposition (FUSE) Program
Reduce technical surprise via reliable and validated early detection of emerging scientific and technical capabilities across disciplines and languages found within the full-text content of scientific, technical, and patent literature.
- Special focus from the outset on multiple languages; Phase 2 focus on English and Chinese
- Novelty: Discover patterns of emergence and connections between technical concepts at a speed, scale, and comprehensiveness that exceeds human capacity
- Usage: Alert analyst of emerging technical areas with sufficient explanatory evidence to support further exploration

What is technical emergence? Hypotheses from Phase 1
- A concept has emerged if it has been accepted by others within and beyond one's community. ~Columbia
- A concept is emerging when its actant network is increasing in robustness. ~BAE
- A concept has emerged when evidence has appeared that the concept is new and unexpected, noticeable and growing. ~Raytheon BBN
- A concept is emerging when it is identifiable by its own practitioners, enables a capability that was not achievable previously, and persists. ~SRI
Many ways to probe technical emergence:
Community of Practice Practical Application Debates Alternative Acceptance Interdisciplinarity Attention (Citation) Prediction Dominant sub-topic within set Commercial Application Infrastructure

Indicator Hypotheses (Ph 1)
[Figure: indicator network diagrams with panels labeled Columbia (Community of Practice), BAE, BBN, and SRI]
- Red edges connect data sources to data fields
- Blue edges connect BAE high-level indicators to BAE low-level indicators
- Line thickness between features and indicators measures significance for the challenge

Evaluation Attempt #1: Case Studies
Drawn from diverse areas of scientific inquiry & application:
- Biological Sciences / Biotechnology
- Computer Science / Information Science; Engineering
- Mathematics / Statistics
- Physical Sciences; Earth Science
- Medical / Clinical / Infectious Disease / Health Services; Social Sciences
Technical emergence measured from a real-world viewpoint, but connected to literature
Multiple case studies to be produced; some are held back for evaluation
Case studies are representative but not comprehensive
Insufficient to train technical emergence classifiers
- Limited examples of emergence & non-emergence (10s planned)
- Reference baseline has limited temporal resolution (~5 year blocks)

Phase 2 Evaluation: Nomination Test
[Diagram: a time axis running from LEADING to LAGGING, divided into a Data Period (D), a Reference Period (R), and a Forecast Period (F), with a gap and a "T now" marker. A test sample of entities e1 ... en is drawn from the FUSE Document Repository. The FUSE Performer System applies performer-defined indicators I1 ... In to produce Prominence Forecasts. T&E Ground Truth Data is produced by GTF*(E,D,R,F). The two are compared to give the NQ Score.]
(E)ntity, (D)ata Period, (R)eference Period, (F)orecast Period
*GTF = Ground Truth Function
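
The comparison step in the nomination test (prominence forecasts checked against ground truth) can be illustrated with a small sketch. The slide does not define how the NQ Score is computed, so the code below substitutes precision-at-k as a stand-in metric; the function name, the entity labels, and all of the numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict


def precision_at_k(forecasts: Dict[str, float],
                   ground_truth: Dict[str, bool],
                   k: int) -> float:
    """Fraction of the k highest-forecast entities that ground truth marks prominent.

    Stand-in metric only: the actual NQ Score is not defined in the slides.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    # Rank entities by forecast (highest first); break ties by name for determinism.
    ranked = sorted(forecasts, key=lambda e: (-forecasts[e], e))
    top = ranked[:k]
    if not top:
        return 0.0
    hits = sum(1 for e in top if ground_truth.get(e, False))
    return hits / len(top)


# Hypothetical prominence forecasts for five test-sample entities.
forecasts = {"e1": 0.9, "e2": 0.2, "e3": 0.7, "e4": 0.4, "e5": 0.1}
# Hypothetical ground truth: was the entity prominent in the forecast period?
ground_truth = {"e1": True, "e2": False, "e3": False, "e4": True, "e5": False}

score = precision_at_k(forecasts, ground_truth, k=2)
```

Here the two nominated entities are e1 and e3, and only e1 is prominent in the hypothetical ground truth, so the score is 0.5.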

Indicator Development and Testing Underway
Regular analysis and evaluation of each team's features (e.g., scientific noun phrases, topic models) and their portfolio of indicators (i.e., quantitatively measured aspects / patterns of technical emergence)
Promising Midterm Indicator Types
- Fundamental Research Citation, Author Networks (All)
- Topic Diversity (SRI)
- Citation Context and Sentiment (SRI)
- Technology and application concept type evolution (SRI)
- Patent classification dynamics (SRI, BAE)
- Emerging cluster / hot patent status (BAE)
- Patent originality (BAE)
- Corporate, Academic patent authorship (BAE)
- Topic modeling across time, thread dynamics (BBN)
- Research levels (BBN)
- Time series analysis, extensive portfolio (COL)
- Temporal pattern classification, time-series clustering (COL)
- Argumentative Zoning (SRI, COL)
- Time-dependent term co-occurrence (SRI)
- Author-topic modeling (SRI)
- Operations on annotated graphs, e.g., scientific concepts, terms (SRI)
- Chinese patent indicators (BAE, BBN)
- Fine-grained topic models (BBN)
- Causality modeling framework (BBN)
- Primary concept mentions (COL)
- Citation sentiment (COL)
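
As a rough illustration of what a quantitatively measured indicator could look like, the sketch below computes the yearly share of documents that mention a term, and the change in that share between two years. It is not any FUSE team's actual indicator; the function names and the toy corpus are assumptions made up for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def yearly_term_share(docs: Iterable[Tuple[int, str]], term: str) -> Dict[int, float]:
    """Share of documents per year whose text mentions `term` (case-insensitive)."""
    totals: Counter = Counter()
    hits: Counter = Counter()
    needle = term.lower()
    for year, text in docs:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


def share_change(shares: Dict[int, float], start: int, end: int) -> float:
    """Difference in term share between two years (positive means growing attention)."""
    return shares[end] - shares[start]


# Hypothetical toy corpus of (publication year, abstract text) pairs.
corpus = [
    (2010, "Graphene transistor fabrication"),
    (2010, "Silicon wafer defects"),
    (2010, "Silicon photonics overview"),
    (2010, "Polymer membranes"),
    (2011, "Graphene sensor arrays"),
    (2011, "Large-area graphene films"),
    (2011, "Graphene oxide membranes"),
    (2011, "Silicon wafer bonding"),
]

shares = yearly_term_share(corpus, "graphene")
growth = share_change(shares, 2010, 2011)
```

In this toy corpus the term appears in one of four documents in 2010 and three of four in 2011, so the share rises from 0.25 to 0.75. A real indicator would need entity normalization, much larger samples, and validation against ground truth.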

Now Developing a Market for Scientific and Technical Forecasting
Goal: Generate precise, testable forecasts for S&T developments
Approach: Build world's largest prediction market for S&T events
- Thousands of subject matter experts in dozens of countries will make nuanced conditional forecasts for around one thousand S&T events
- Data-driven (i.e., scientific and patent literatures) indicators will be used to generate questions and adjust forecasts
Evaluation: Forecasts will be scored against actual events, as they occur
Potential impact: Dramatically improve S&T foresight with actionable information
Schedule: June 2013 to June 2015
[Figure: fictional example question, "By 31 December 2014, how much of the visible spectrum will a metamaterial be able to deflect?" Answer bins: 50nm, 25nm, 100nm, 200nm. Panels: probabilities assigned to event in each period; number of forecasters providing judgments in each period. Horizontal axis: real-world timeline (months), 1 to 8.]
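
The slides say that forecasts will be scored against actual events but do not name a scoring rule. One common choice for probability forecasts is the Brier score (mean squared error between forecast probability and the 0/1 outcome; lower is better). The sketch below assumes binary yes/no questions and uses made-up numbers; it is an illustration, not the program's scoring method.

```python
from typing import Sequence


def brier_score(probabilities: Sequence[float], outcomes: Sequence[bool]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Assumed scoring rule for illustration; 0.0 is a perfect score.
    """
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not probabilities:
        raise ValueError("at least one forecast is required")
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    squared_errors = [(p - float(o)) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical forecasts for two binary questions, and how they resolved.
uninformative = brier_score([0.5, 0.5], [True, False])
perfect = brier_score([1.0, 0.0], [True, False])
```

A forecaster who always answers 0.5 scores 0.25, while a forecaster who is certain and correct scores 0.0. Multi-bin questions like the metamaterial example would need the multi-category form of the score.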

Teams will Generate Questions
- What is the probability of a 10cm carbon nanotube being fabricated before 31 Dec 2014?
- Will the number of accepted articles for the 2015 International Conference on Machine Learning (ICML) that contain the term "deep learning" in the title/abstract exceed those that contain the term "support vector machine(s)" in the title/abstract?
- How many unique assignees will have at least two USPTO patent applications published using the term "Type III Secretion System" in its title/abstract/background/claims between 1 Oct 2013 and 30 Sep 2014?
- By 31 Dec 2017, how many FDA-approved products will be based on RNA interference?
- Will there be reported shortages of technetium-99m in the US in 2015?
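
Questions of this kind are testable because they resolve to a count over documents. As a minimal sketch, the ICML question could be resolved by counting articles whose title or abstract matches each term. The record format, function names, and sample records below are hypothetical; a real resolution would need the actual list of accepted papers and an agreed matching rule (for example, how to treat hyphenated variants).

```python
import re
from typing import Iterable, List, Pattern, Tuple

# Case-insensitive whole-phrase patterns for the two terms in the question.
DEEP_LEARNING = re.compile(r"\bdeep learning\b", re.IGNORECASE)
SVM = re.compile(r"\bsupport vector machines?\b", re.IGNORECASE)


def count_matching(records: Iterable[Tuple[str, str]], pattern: Pattern[str]) -> int:
    """Count (title, abstract) records where the pattern appears in either field."""
    return sum(
        1 for title, abstract in records
        if pattern.search(title) or pattern.search(abstract)
    )


def deep_learning_exceeds_svm(records: Iterable[Tuple[str, str]]) -> bool:
    """Resolve the question: do deep-learning articles outnumber SVM articles?"""
    materialized: List[Tuple[str, str]] = list(records)
    return count_matching(materialized, DEEP_LEARNING) > count_matching(materialized, SVM)


# Hypothetical sample of (title, abstract) pairs, invented for this example.
sample = [
    ("Deep Learning for Speech", "We train a large network."),
    ("Fast Kernel Methods", "We speed up support vector machines."),
    ("Optimization Tricks", "Useful for deep learning practitioners."),
]

answer = deep_learning_exceeds_svm(sample)
```

In this sample two records mention "deep learning" and one mentions "support vector machines", so the question would resolve to yes.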

Discussion & Questions
Dewey Murdick, Ph.D.
Program Manager, IARPA
dewey.murdick@iarpa.gov