Liquid Benchmarks. Sherif Sakr and Fabio Casati


Liquid Benchmarks
Sherif Sakr (1) and Fabio Casati (2)
(1) NICTA and University of New South Wales, Sydney, Australia; (2) University of Trento, Trento, Italy
Second TPC Technology Conference on Performance Evaluation and Benchmarking (TPCTC 10), 17 September 2010
S. Sakr and F. Casati () Liquid Benchmarks 17 September 2010 1 / 16

Problem Overview
The last two decades have seen significant growth in the number of scientific research publications. An important characteristic of Computer Science research is that it produces artifacts other than publications, in particular software implementations (prototypes). Researchers continually claim performance improvements over earlier work. The quality of the reported experimental results is often limited for reasons such as insufficient effort or time, the unavailability of suitable test cases, or other resource constraints. Researchers tend to report the experimental results that show the strengths of their work, which may not reflect the whole picture of real-world scenarios.

Problem Overview. Liquid Benchmarks: Benchmarking-as-a-Service
An open call for online platforms that facilitate independent experimental evaluation and comparison of competing algorithms, approaches, or complete systems, in order to assess the practical impact and benefit of research results. The main aim of the LiquidPub project (http://liquidpub.org/) is to develop concepts, models, metrics, and tools for an efficient, effective and sustainable way of creating, disseminating, evaluating, and consuming scientific knowledge.

Benchmarking Challenges in Computer Science. Not enough standard benchmarks are available or widely used
A benchmark is a standard test, or set of tests, that is used to evaluate and compare alternative approaches that share the aim of solving a specific problem. A benchmark usually consists of a motivating scenario, task samples, and a set of performance measures. The lack of a standard benchmark in a domain makes it hard for researchers to evaluate and compare their work, and leads to many ad hoc experimental results in the literature. For any benchmark to be successful, it must gain wide acceptance in its target community.
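The three parts of a benchmark named above (motivating scenario, task samples, performance measures) can be pictured as a small data structure. The following is only a minimal sketch: the class names, the `evaluate` method, and the toy "compressor" are hypothetical illustrations and are not taken from the Liquid Benchmarks platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Task:
    """One task sample: a name plus the input handed to each solution."""
    name: str
    payload: str


@dataclass
class Benchmark:
    """Hypothetical benchmark: scenario, task samples, performance measures."""
    scenario: str
    tasks: List[Task]
    # Each measure maps (task input, solution output) to a number.
    measures: Dict[str, Callable[[str, str], float]] = field(default_factory=dict)

    def evaluate(self, solution: Callable[[str], str]) -> Dict[str, Dict[str, float]]:
        """Run one solution on every task and apply every measure to its output."""
        report: Dict[str, Dict[str, float]] = {}
        for task in self.tasks:
            output = solution(task.payload)
            report[task.name] = {
                name: measure(task.payload, output)
                for name, measure in self.measures.items()
            }
        return report


def compression_ratio(original: str, compressed: str) -> float:
    """Size of the output relative to the input (lower is better)."""
    return len(compressed) / len(original)


def drop_spaces(text: str) -> str:
    """Toy stand-in for a real compressor: it only removes spaces."""
    return text.replace(" ", "")


toy_benchmark = Benchmark(
    scenario="Compressing small XML documents (toy example)",
    tasks=[Task("tiny", "<a> <b/> </a>")],
    measures={"compression_ratio": compression_ratio},
)
report = toy_benchmark.evaluate(drop_spaces)
```

Keeping the measures separate from the solutions means that every competing solution is scored by the same functions on the same task samples, which is the property a shared benchmark is meant to provide.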

Benchmarking Challenges in Computer Science. Not enough standard benchmarks are available or widely used (continued)
Designing a successful benchmark is a challenging task that is rarely achieved by a single author or research group. In practice, very few benchmarks have achieved wide success in their communities (e.g., TPC, OO7, XMark). In an ideal world, building successful standard benchmarks would be simplified and improved through collaborative efforts between peer researchers in the same field.

Benchmarking Challenges in Computer Science. Limited repeatability of published results
In an ideal world, researchers would make the source code or binaries of their implementations, together with the experimental datasets, available to other researchers so that the results published in their papers can be repeated.
Debates! Unfortunately, the world is not always ideal ;-)
XMLCompBench.
SIGMOD Repeatability Experiment.
VLDB Experiments and Analysis Track.

Benchmarking Challenges in Computer Science. Constraints of Computing Resources
In some domains, conducting experimental evaluations may require huge computing resources. It may also require several different configurations of the computing environment, chosen to resemble different types of real-world environments. Such computing resources may not be available to researchers in their home environments or labs. A fair, apples-to-apples comparison between any two alternative scientific contributions requires running their experiments in exactly the same computing environment. In an ideal world, researchers would have access to shared computing environments where they can evaluate and compare their contributions. The suitable configuration of these testing environments can also be decided collaboratively.
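The apples-to-apples requirement can be made concrete with a small check that refuses to compare two runs unless they were executed in the same environment. This is a hedged sketch only: the `Environment` fields and the `faster_solution` helper are assumptions made for illustration and do not describe how the platform records its environments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Hypothetical description of a shared testing environment."""
    cpu_cores: int
    memory_gb: int
    os_image: str


@dataclass(frozen=True)
class Run:
    """One measured execution of a solution in a given environment."""
    solution: str
    environment: Environment
    runtime_seconds: float


def faster_solution(a: Run, b: Run) -> str:
    """Name the faster solution, but only if both runs share one environment."""
    if a.environment != b.environment:
        raise ValueError("runs were executed in different environments")
    return a.solution if a.runtime_seconds <= b.runtime_seconds else b.solution


shared = Environment(cpu_cores=4, memory_gb=16, os_image="linux-image")
run_a = Run("approach-a", shared, runtime_seconds=1.0)
run_b = Run("approach-b", shared, runtime_seconds=2.0)
winner = faster_solution(run_a, run_b)
```

Because `Environment` is a frozen dataclass, two environments compare equal only when every field matches, so a run on a larger machine is rejected instead of being silently compared.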

Benchmarking Challenges in Computer Science. Continuous evolution of the state of the art
The main problem with experimental evaluation papers is that they are snapshots of the state of the art at the time of their preparation. Research contributions in any field are dynamic and evolving (e.g., new approaches, improvements to existing approaches). Experimental papers can go out of date a relatively short time after they appear. Continuous maintenance of the published results may require too much work from the authors, who may lose interest in redoing the job after some time.

Liquid Benchmarks: Underlying Technologies
Cloud Computing: an efficient way of broadly sharing computer software and hardware resources over the Internet in an elastic way.
Software as a Service (SaaS): end-users do not need to manually download, install, configure, or run the software applications in their own computing environments.
Collaborative and Social Web (e.g., wikis, blogs, forums): offers great flexibility for building online communities of people who share the same interests (peers), where they can interact and work together effectively and productively.

Liquid Benchmarks: Architecture [figure: architecture diagram]

Liquid Benchmarks: Components
Web-based User Interface: used to design experiments, submit requests, and search results.
Experiment Manager: controls the execution of experiments and ensures that runs are free of outside influences; receives, stores, and renders the experimental results.
Repository of Experiment Results: stores the results of all executed experiments with their associated configuration parameters, provenance information (e.g., timestamp, user), and social information (e.g., comments, discussions).

Liquid Benchmarks: Components (continued)
Cloud-Based Computing Environments: host the testing environments that are shared by the liquid benchmark end-users.
Collaborative Design Environment: used to build the specification of the benchmark scenarios; provides the tools for doing so collaboratively (e.g., forums, wikis).
Solution Setup Environment: used to set up and configure the competing solutions in the different testing environments (SaaS).
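As an illustration of what the Repository of Experiment Results might hold, the sketch below stores each result together with its configuration parameters, provenance information, and social information, and supports a simple search. All class, field, and method names are hypothetical, and the in-memory list is only a stand-in for whatever storage a real deployment would use.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class ExperimentResult:
    """One stored result with its configuration, provenance and discussion."""
    solution: str
    environment: str
    parameters: Dict[str, str]   # configuration parameters of the run
    metrics: Dict[str, float]    # measured values
    submitted_by: str            # provenance: user
    timestamp: datetime          # provenance: when the run finished
    comments: List[str] = field(default_factory=list)  # social information


class ResultRepository:
    """In-memory stand-in for the repository (assumption: no persistence)."""

    def __init__(self) -> None:
        self._results: List[ExperimentResult] = []

    def store(self, result: ExperimentResult) -> None:
        self._results.append(result)

    def search(self, solution: Optional[str] = None,
               environment: Optional[str] = None) -> List[ExperimentResult]:
        """Return stored results matching every filter that was given."""
        return [
            r for r in self._results
            if (solution is None or r.solution == solution)
            and (environment is None or r.environment == environment)
        ]


repo = ResultRepository()
repo.store(ExperimentResult(
    solution="compressor-a",
    environment="small-vm",
    parameters={"dataset": "toy.xml"},
    metrics={"compression_ratio": 0.42},  # made-up value for illustration
    submitted_by="alice",
    timestamp=datetime(2010, 9, 17, tzinfo=timezone.utc),
))
# A peer attaches a comment to the stored result.
repo.search(solution="compressor-a")[0].comments.append(
    "Please rerun on a larger dataset."
)
```

Storing the parameters and provenance next to the metrics is what would let a later visitor rerun or question a result instead of having to trust a number in isolation.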

Liquid Benchmarks: Ongoing Case Studies
XML compressors.
SPARQL query processors.
Graph query processors.

Liquid Benchmarks: Benefits
Providing workable environments for collaboratively building standard benchmarks.
Developing centralized and focused repositories for related software implementations and their experimental results, a very positive step towards solving the repeatability problem.
Facilitating collaborative maintenance of experimental studies to keep them fresh.
Facilitating the establishment of shared computing environments that can be used by active contributors in the same domain residing in different parts of the world.
Leveraging the wisdom of the crowd to provide feedback on the experimental results in a way that can give useful insights for solving further problems and improving the state of the art.
Establishing a transparent platform for a scientific crediting process based on collaborative community work.

Conclusions
Liquid Benchmarks: a step towards an online platform for collaborative assessment of scientific research results. We believe that the Computer Science research community should take the lead in significantly improving the ability to assess the impact of scientific research results. This work is at a preliminary stage and may leave out some important details (e.g., privacy, credit attribution). However, we hope that our proposal will serve as the foundation for a fundamental rethinking of the experimental evaluation process in the computer science field.

End
Please follow the updates of our project at http://project.liquidpub.org/research-areas/liquid-benchmarks
Please email questions to: ssakr@cse.unsw.edu.au
THANK YOU