Multi-Example Search in Rich Information Graphs

Multi-Example Search in Rich Information Graphs
Matteo Lissandrini, Davide Mottin, Themis Palpanas, Yannis Velegrakis
ml@disi.unitn.eu, ICDE 2018, Paris

What are you looking for?
Image: https://unsplash.com/photos/ypcy9hep6v8

Search by a list of specifications
1. A Movie and an Actor
2. From the Movie return the Director
3. The Director has won an Award
4. The Movie is adapted from a Book
5. From the Book return the Author
Hard to specify: there are too many options. What is the schema? Which specification is important?

Search by Example
[Figure: the example is a small graph in which Lord of the Rings (LotR) is connected to E. Wood by an actor edge and to P. Jackson by a director edge, with Tolkien also part of the example. Other movies shown next to the same actor/director pattern: Harry Potter, James Bond, ...]
More intuitive: the user avoids writing a list of specifications.

ONE EXAMPLE IS NOT ENOUGH
When known examples are only partial specifications

Search by Multiple Examples
[Figure: example graphs with edges labelled actor and spouse]

ONE EXAMPLE IS NOT ENOUGH
When results have different structures

Search by Multiple Examples
[Figure: example and result graphs with edges labelled director, actor and spouse]

Multi-Example Search
Multiple simple examples
Each example describes an aspect
Results are combinations of aspects
Results may have multiple structures

INFORMATION GRAPHS
Nodes and edges (image: Ambiverse GmbH)

Edge-labelled Multigraphs
$G : \langle V, E, L, \ell \rangle$, with nodes $V$, edges $E$, labels $L$, and an edge-labelling function $\ell$
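To make the definition concrete, here is a minimal sketch of an edge-labelled multigraph in Python. It assumes edges are stored as (source, label, target) triples; the class name `LabelledMultigraph`, its attributes, and the toy edges are illustrative and do not come from the talk.

```python
from collections import defaultdict


class LabelledMultigraph:
    """Directed edge-labelled multigraph stored as (src, label, dst) triples."""

    def __init__(self):
        self.nodes = set()             # V
        self.labels = set()            # L
        self.edges = set()             # E, each edge carries its label
        self.out = defaultdict(set)    # node -> {(label, dst)}
        self.inc = defaultdict(set)    # node -> {(label, src)}

    def add_edge(self, src, label, dst):
        self.nodes.update((src, dst))
        self.labels.add(label)
        self.edges.add((src, label, dst))
        self.out[src].add((label, dst))
        self.inc[dst].add((label, src))


g = LabelledMultigraph()
g.add_edge("LotR", "director", "P. Jackson")
g.add_edge("LotR", "actor", "E. Wood")
# Illustrative parallel edge: same endpoints, different label.
g.add_edge("LotR", "producer", "P. Jackson")
```

Two nodes may be connected by several edges as long as the labels differ, which is what makes the structure a multigraph.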

Exemplar Queries: SINGLE EXAMPLE
Input: $Q_e$, an example element of interest
Output: the set of elements in the desired result set
Exemplar Query Evaluation:
1. Match $Q_e$ to a sample $S$ in the graph $G$
2. Find the set of elements $A$ similar to $S$, given a similarity relation $\approx$
3. [OPTIONAL] Return only the top-k subset $A^K \subseteq A$
$A : \{\, a \in D \mid a \approx S \,\}$

Multi-Exemplar Queries (our problem formulation): MULTIPLE EXAMPLES
Input: $Q_e$, a set of example elements of interest
Output: the set of elements in the desired result set
Exemplar Query Evaluation:
1. Match each $q \in Q_e$ to the set of samples $S : \{ s_1, s_2, \dots \}$ in the graph $G$
2. Find the set of elements $A$ similar to each element in $S$, given a similarity relation $\approx$
3. [OPTIONAL] Return only the top-k subset $A^K \subseteq A$
$A : \{\, a \in D \mid \forall s \in S.\ s \approx a \,\}$

(Multi-)Exemplar Queries on Graphs
Single sample: $A : \{\, a \in D \mid a \approx S \,\}$
Similarity ($\approx$): graph isomorphism
$A$ : { $a$ subgraph of $G$ | $s$ isomorphic to $a$ }
Challenge: find ALL isomorphic graphs
Graph isomorphism is transitive and symmetric, so $A$ is an equivalence class.
Multiple samples: $A : \{\, a \in D \mid \forall s \in S.\ s \approx a \,\}$
With graph isomorphism: $\exists s_i, s_j \in S.\ s_i \not\approx s_j \Rightarrow A = \emptyset$
Similarity ($\approx$): subgraph isomorphism?
$A$ : { $a$ subgraph of $G$ | $\forall s \in S$. $s$ subgraph-isomorphic to $a$ }
IS THIS CHARACTERIZATION ENOUGH?
Answers are subgraphs that contain structures similar to each sample.
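As an illustration of what matching one sample against the graph involves, the following is a small backtracking sketch of subgraph-isomorphism search over labelled edges. It is a brute-force toy for tiny graphs, not the algorithm used in the talk; the function name `find_matches` and the node identifiers are hypothetical.

```python
def find_matches(sample_edges, graph_edges):
    """Yield injective node mappings (sample node -> graph node) under which
    every labelled sample edge is also an edge of the graph."""
    sample_edges = list(sample_edges)
    graph_edges = set(graph_edges)

    def extend(i, mapping):
        if i == len(sample_edges):
            yield dict(mapping)
            return
        s, label, t = sample_edges[i]
        for gs, glabel, gt in graph_edges:
            if glabel != label:
                continue
            trial = dict(mapping)
            ok = True
            for q, g in ((s, gs), (t, gt)):
                if q in trial:
                    ok = ok and trial[q] == g             # must agree with earlier choice
                else:
                    ok = ok and g not in trial.values()   # keep the mapping injective
                    trial[q] = g
            if ok:
                yield from extend(i + 1, trial)

    yield from extend(0, {})


sample = [("M", "director", "D"), ("M", "actor", "A")]
graph = [
    ("m1", "director", "d1"),
    ("m1", "actor", "a1"),
    ("m2", "director", "d2"),   # m2 has no actor edge, so it cannot match
]
matches = list(find_matches(sample, graph))
```

Here only `m1` matches, because the sample requires both a director edge and an actor edge leaving the same node.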

Multi-Exemplar Answers on Graphs
Graph similarity ($\approx$): subgraph isomorphism
$\{\, a \subseteq G \mid \forall s \in S.\ s \text{ subgraph-isomorphic to } a \,\}$
What constitutes a good answer? With no restrictions, the entire graph is accepted as an answer.

Multi-Exemplar Answers on Graphs
Each answer should be correct, complete and non-redundant: ensure all aspects are present and limit the size of answer graphs.
Answers: WEAKLY CONNECTED SUBGRAPHS with NO SUPERFLUOUS NODES or EDGES
1. Connectedness (CORRECT): $\forall n_1, n_2 \in V_A$ there is an undirected path that connects $n_1$ to $n_2$
2. Consistency (COMPLETE, NON-REDUNDANT): $\forall n_A \in V_A\ \exists s \in S,\ n_s \in V_s$ such that $n_A$ maps to $n_s$
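The two conditions can be checked mechanically. The sketch below assumes an answer is given as a list of (source, label, target) edges and that each sample match is a dictionary from sample nodes to graph nodes; both representations and the function names are assumptions made for illustration.

```python
def is_weakly_connected(edges):
    """True if the edges form one connected component when direction is ignored."""
    adjacency = {}
    for src, _, dst in edges:
        adjacency.setdefault(src, set()).add(dst)
        adjacency.setdefault(dst, set()).add(src)
    if not adjacency:
        return False
    start = next(iter(adjacency))
    seen, stack = {start}, [start]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adjacency)


def is_consistent(answer_nodes, mappings):
    """True if every answer node is the image of some sample node,
    i.e. the answer contains no superfluous node."""
    covered = set()
    for mapping in mappings:
        covered.update(mapping.values())
    return set(answer_nodes) <= covered


answer = [("m1", "director", "d1"), ("m1", "actor", "a1"), ("a1", "spouse", "p1")]
mappings = [{"M": "m1", "D": "d1", "A": "a1"}, {"X": "a1", "Y": "p1"}]
connected = is_weakly_connected(answer)
consistent = is_consistent({"m1", "d1", "a1", "p1"}, mappings)
```

An answer with an extra node that no sample maps to would fail the second check even if it stays connected.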

CHALLENGE!
To find multi-exemplar answers:
1. Find ALL graphs isomorphic to ALL samples
2. Find which sample matches combine into one connected answer
Each sample requires its own subgraph-isomorphism search.
Candidate space = Cartesian product over all samples
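The cost of step 2 is easiest to see in a naive sketch that enumerates the full Cartesian product. It assumes each match is represented only by its set of graph nodes, that every match is itself connected, and that matches connect when they share a node; these simplifications and the function names are assumptions, not the talk's implementation.

```python
from itertools import product


def overlaps_connect(node_sets):
    """True if the node sets form one component when linked by shared nodes."""
    component = set(node_sets[0])
    pending = list(node_sets[1:])
    changed = True
    while pending and changed:
        changed = False
        for nodes in pending[:]:
            if component & nodes:
                component |= nodes
                pending.remove(nodes)
                changed = True
    return not pending


def naive_combine(matches_per_sample):
    """Try every combination with one match per sample; keep the connected ones."""
    answers = []
    for combo in product(*matches_per_sample):
        if overlaps_connect(combo):
            answers.append(frozenset().union(*combo))
    return answers


s1_matches = [frozenset({"a", "b"}), frozenset({"c", "d"})]
s2_matches = [frozenset({"b", "e"}), frozenset({"x", "y"})]
answers = naive_combine([s1_matches, s2_matches])
```

Four combinations are tested here to find a single answer; with more samples and more matches per sample the number of combinations grows multiplicatively.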

Search Framework
[Figure: samples S1, S2 and S3 located in the graph]
Exploit localized search

Search Framework Optimizations
1. Find CANDIDATE REGIONS
   a. Remove unused edges
   b. Identify SEEDs
   c. Expand around each seed
2. SEARCH within each region
   a. Avoid Cartesian product
   b. Fast merge of partial answers
Naïve algorithm: 1 single region; retrieves ALL isomorphic subgraphs; hash join for fast merge of partial answers
2 advanced algorithms: Fast & Fast+
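One way to read "hash join for fast merge of partial answers" is sketched below: index the matches of one sample by the nodes they contain, then probe that index with the matches of another sample. Representing a partial answer as a frozenset of nodes and joining on any shared node are assumptions for illustration; `hash_join` is a hypothetical name.

```python
from collections import defaultdict


def hash_join(left, right):
    """Merge partial answers from two samples that share at least one node."""
    index = defaultdict(list)
    for r in right:
        for node in r:
            index[node].append(r)       # build side: node -> matches containing it
    joined = set()
    for l in left:
        for node in l:
            for r in index.get(node, ()):   # probe side: only overlapping matches
                joined.add(l | r)
    return joined


left = [frozenset({"a", "b"}), frozenset({"c", "d"})]
right = [frozenset({"b", "e"}), frozenset({"x", "y"})]
merged = hash_join(left, right)
```

Only pairs that actually overlap are ever combined, so non-joining pairs such as `{c, d}` and `{x, y}` are never examined.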

Find Candidate Regions - Fast
Identify SEED: the sample among S1, S2, S3 with the minimum number of matches (S1: 4, S2: 6, S3: 7)
EXPAND around each seed: retrieve candidate regions
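With the match counts from the slide, seed selection and the size of the unoptimized candidate space can be worked out in a few lines. The dictionary layout is an assumption; the numbers are the ones shown above.

```python
from math import prod

match_counts = {"S1": 4, "S2": 6, "S3": 7}

# The seed is the sample with the fewest matches in the graph.
seed = min(match_counts, key=match_counts.get)

# Without localized search, every combination of matches is a candidate.
cartesian_size = prod(match_counts.values())
```

Starting from S1 means only 4 regions need to be expanded, instead of testing all 4 x 6 x 7 = 168 combinations.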

Find Candidate Regions - Fast
Seed identification uses approximate cardinality estimation based on precomputed statistics:
Label frequency
Label-pair frequency
Star cardinality: $I_{star}(l, c) = \{\, G' \subseteq G \mid G' : \langle V', E', \ell \rangle \text{ is a star} \wedge |E'| = c \wedge \exists (v_1, v_2) \in E' \text{ s.t. } \ell(v_1, v_2) = l \,\}$
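Reading the star-cardinality statistic as the set of stars with exactly c edges, at least one of which carries label l, its size can be counted exactly on a toy graph as sketched below. The sketch treats any c edges incident to one centre node as a star and ignores edge direction; for c = 1 each edge would be counted once per endpoint. These choices, and the function name `star_cardinality`, are assumptions for illustration rather than the talk's estimator.

```python
from collections import defaultdict
from math import comb


def star_cardinality(edges, label, c):
    """Count c-edge stars that contain at least one edge with the given label."""
    incident = defaultdict(list)            # node -> labels of incident edges
    for src, lab, dst in edges:
        incident[src].append(lab)
        if dst != src:
            incident[dst].append(lab)
    total = 0
    for labels in incident.values():
        degree = len(labels)
        with_label = sum(1 for lab in labels if lab == label)
        # All c-subsets of incident edges minus those avoiding the label.
        total += comb(degree, c) - comb(degree - with_label, c)
    return total


edges = [("m", "director", "d"), ("m", "actor", "a1"), ("m", "actor", "a2")]
director_stars = star_cardinality(edges, "director", 2)
actor_stars = star_cardinality(edges, "actor", 2)
```

For the centre `m` there are three 2-edge stars in total; two of them include the director edge and all three include an actor edge.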

Find Candidate Regions - Fast
Limitations: seed search requires an isomorphism search, and some regions do not contain all structures.

Find Candidate Regions - Fast+
Identify SEED among S1, S2, S3: with cardinality estimation, select a SINGLE NODE (with bitset mapping)
EXPAND around each seed: retrieve candidate regions
DISCARD incomplete regions: with bitset mapping, and before the isomorphism search
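Expanding around a seed node can be pictured as a bounded breadth-first search. The sketch below assumes an undirected adjacency dictionary and a hop limit called `radius`; both the parameter and the function name `expand_region` are hypothetical.

```python
from collections import deque


def expand_region(adjacency, seed, radius):
    """Return all nodes within `radius` hops of the seed."""
    distance = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        if distance[node] == radius:
            continue                      # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return set(distance)


adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
region = expand_region(adjacency, "a", 2)
```

Each region is then searched on its own, which keeps the isomorphism search local to a small part of the graph.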

Fast Pruning with Bit-Vectors (d=1)
Example: detect JOIN-node
[Figure: samples S1, S2 and S3 with edges labelled actor, spouse and director]
Node                      actor  director  spouse  author  ...
Barack Obama (S2)           -       -        1       -
Quentin Tarantino (S3)      -       1        -       -
Steven Spielberg (A1)       -       1        1       -
Kate Capshaw (A2)           -       -        1       -
BO ∪ QT (union)             -       1        1       -
(BO ∪ QT) vs. KC            -       1        -       -     (≠ 0)
(BO ∪ QT) vs. SS            -       -        -       -     (= 0)
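The bit-vector test can be reproduced with plain integers. The operator symbols were lost in the transcript, so the sketch assumes the check is "labels required by the samples that the candidate lacks", computed as `required & ~candidate`; this reading matches the two results on the slide but is an assumption, as are the names `signature` and `can_join`.

```python
LABELS = ["actor", "director", "spouse", "author"]
BIT = {label: 1 << i for i, label in enumerate(LABELS)}


def signature(labels):
    """Bit vector with one bit per edge label seen around a node."""
    sig = 0
    for label in labels:
        sig |= BIT[label]
    return sig


def can_join(required, candidate):
    """True if the candidate has every required label (no required bit is missing)."""
    return required & ~candidate == 0


barack_obama = signature(["spouse"])                    # sample node of S2
quentin_tarantino = signature(["director"])             # sample node of S3
steven_spielberg = signature(["director", "spouse"])    # candidate A1
kate_capshaw = signature(["spouse"])                    # candidate A2

required = barack_obama | quentin_tarantino             # union of the sample vectors
spielberg_ok = can_join(required, steven_spielberg)
capshaw_ok = can_join(required, kate_capshaw)
```

Under this reading Steven Spielberg remains a candidate join node, while Kate Capshaw is pruned because the director bit is missing, all without running an isomorphism search.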

Top-K
Weight function for nodes; scoring function for answers
1. Sort regions (output of Partial)
2. Estimate an upper-bound score for each region
3. Find top-k and prune: skip regions that will not produce answers with a high enough score
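The pruning idea can be sketched with a min-heap holding the current top-k answers. The callables `upper_bound` and `search`, the region names and the scores below are all hypothetical; the sketch only shows how an upper bound lets whole regions be skipped.

```python
import heapq


def top_k(regions, upper_bound, search, k):
    """Return the k best (score, answer) pairs and the number of regions searched."""
    best = []          # min-heap: best[0] is the weakest answer kept so far
    searched = 0
    for region in sorted(regions, key=upper_bound, reverse=True):
        if len(best) == k and upper_bound(region) <= best[0][0]:
            break      # regions are sorted, so no later region can do better
        searched += 1
        for score, answer in search(region):
            if len(best) < k:
                heapq.heappush(best, (score, answer))
            elif score > best[0][0]:
                heapq.heapreplace(best, (score, answer))
    return sorted(best, reverse=True), searched


bounds = {"r1": 10, "r2": 7, "r3": 5}
found = {"r1": [(9, "a"), (8, "b")], "r2": [(7, "c")], "r3": [(5, "d")]}
results, searched = top_k(bounds, bounds.get, found.get, k=2)
```

After the first region the two kept answers score 9 and 8, so the remaining regions, bounded by 7 and 5, are never searched.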

Experimental Evaluation
3 algorithms:
   a) Naïve
   b) Fast (iso-graphs as seeds)
   c) Fast+ (nodes as seeds)
Tests: 100 queries with 2-5 samples; measured the number of isomorphism computations and the running time
2 large real datasets:
   a) YAGO: 16.7M+ edges
   b) Freebase: 300M+ edges

Evaluation Results: Saving Isomorphic Computations
Datasets: Freebase (300M edges), YAGO (16.7M edges); 100 queries, from 2 to 5 samples
[Figure a: number of isomorphic subgraphs (thousands) vs. number of query samples (2-5) for mq-naïve, mq-fast and mq-fast+]
40-60% reduction in iso-search computations

Evaluation Results: Running Times
Datasets: Freebase (300M edges), YAGO (16.7M edges); 100 queries, from 2 to 5 samples
[Figure b: total time (sec), average and median, vs. number of query samples (2-5) for mq-naïve, mq-fast and mq-fast+]
Faster on 70% of queries
Saves up to 25 seconds on average
In some cases the Fast algorithm wastes computations

Conclusions
Search via multiple examples: find structures similar to a SET of input examples.
Output: composite results containing characteristics from each example.
Useful when: a complete example with all the desired characteristics is not known; characteristics combine in multiple ways.
Optimizations:
1. Exploit localized search
2. Bitset pruning
3. Cardinality estimation
4. Top-k optimizations
Localized search is less effective with a dense graph!
Thank you! Questions? P.S. Now you can hire me!

bit.ly/icde18
Thank you! Questions?

There is more...

Search Framework Optimizations (bis)
1. Find CANDIDATE REGIONS
   a. Remove unused edges
   b. Identify SEEDs
   c. Expand around each seed
2. SEARCH within each region
   a. Avoid Cartesian product
   b. Fast merge of partial answers
Naïve algorithm: 1 single region; retrieves ALL isomorphic subgraphs; hash join for fast merge of partial answers
2 advanced algorithms: Fast & Fast+

Cardinality Estimation
Precomputed statistics: label frequency, label-pair frequency, star cardinality
$I_{star}(l, c) = \{\, G' \subseteq G \mid G' : \langle V', E', \ell \rangle \text{ is a star} \wedge |E'| = c \wedge \exists (v_1, v_2) \in E' \text{ s.t. } \ell(v_1, v_2) = l \,\}$
[Figure: exact number of ISO subgraphs (log2) vs. estimated number of ISO subgraphs (log2); Pearson corr. 0.77, Spearman corr. 0.81]

Example Query/Answers
[Figure: example and answer graphs over the edge labels sibling, child, directed and awardin. Entities shown: John Belushi, Jim Belushi, Steven Tyler, Liv Tyler, Steven Spielberg, Hook, Meryl Streep, The Iron Lady, Nick Clooney, George Clooney, Rosemary Clooney, Good Night & Good Luck, ER, Frank Sinatra, Frank Sinatra Jr., Nancy Sinatra, Fred Zinnemann, From Here to Eternity, Alfred Newman, Thomas Newman, Lionel Newman, Doctor Dolittle, Richard Fleischer]

Example Query/Answers
[Figure: examples and answers over the edge labels hasChild, isMarriedTo, actedIn, bornIn and wonPrize.
Examples: Richard Hilton, Paris Hilton, Harrison Ford, Calista Flockhart, Return of the Jedi, Chicago, Dustin Hoffman, David di Donatello, Rain Man.
Answers: Ronald Reagan, Nancy Reagan, Ron Reagan, Hellcats of the Navy, Cowboy from Brooklyn, Dick Powell, June Allyson, Arkansas, New York City, Federal Cross of Merit, George H. W. Bush, Robert Duvall, Golden Globe, Arnold Schwarzenegger, Maria Shriver, The 6th Day, Thal, Styria, Gray Davis]