An open discussion on the Data Revolution
Antonio Vetrò
Director of Research - Nexa Center for Internet & Society
Nexa Lunch Seminar nr. 52 - 28 July 2017
phisaz
Acknowledgements
This seminar is based on the studies and professional experiences of the last 10 years of my academic career, enriched by discussions, comments, materials and reflections shared with many colleagues along the way: I am profoundly thankful to all of them. In particular, I am especially grateful to*:
Daniel Méndez Fernández
Enrico Terrone
Juan Carlos De Martin
Marco Torchiano
Marco Viola
Maurizio Morisio
*alphabetical order :-)
Feb. 25, 2010
Feb. 11, 2011
Sep. 4, 2008
Source: https://www.scribblrs.com/heres-happens-every-second-internet/
The amount of data we produce doubles every year. In other words: in 2016 we produced as much data as in the entire history of humankind through 2015.
Source: https://www.scientificamerican.com/article/will-democracy-survive-big-data-and-artificial-intelligence/#
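The "in other words" step above follows from the arithmetic of a geometric series. The short Python sketch below illustrates it under idealized assumptions that are not in the source: production is measured in arbitrary units, starts at 1 unit in a hypothetical first year, and doubles exactly every year. The function name `yearly_production` is invented for the illustration.

```python
# Idealized assumption: data production starts at 1 unit in a hypothetical
# "year 0" and doubles exactly every year afterwards.
def yearly_production(years, first_year=1):
    """Return the amount of data produced in each year, doubling yearly."""
    return [first_year * 2**k for k in range(years)]


production = yearly_production(10)

for k in range(1, len(production)):
    produced_before = sum(production[:k])  # everything produced through year k-1
    # Geometric series: 2^k = (1 + 2 + ... + 2^(k-1)) + 1
    assert production[k] == produced_before + production[0]
    print(f"year {k}: produced {production[k]}, all previous years: {produced_before}")
```

Under these assumptions, each year's output exceeds the cumulative output of all earlier years by exactly the first year's amount, which becomes negligible as the totals grow. The slide's claim is therefore a good approximation only to the extent that yearly doubling has held for long enough.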
100 billion things collecting and sending data over the Internet within 202
Source: Morgan Stanley, The Internet of Things is now, 2014. http://bit.ly/2ga1epb
Oct. 2012
Big data adoption
63% of firms now report having Big Data in production in 2015, up from just 5% in 2012
63% of firms reported that they expect to invest greater than $10 million in Big Data by 2017, up from 24% in 2012
54% of firms say they have appointed a Chief Data Officer, up from 12% in 2012
70% of firms report that Big Data is of critical importance to their firms, up from 21% in 2012
At the top end of the investment scale, 27% of firms say they will invest greater than $50 million in Big Data by 2017, up from 5% of firms that invested this amount in 2015
Source: http://newvantage.com/wp-content/uploads/2016/01/big-data-executive-survey-2016-findings-final.pdf
Better data and statistics will help governments track progress and make sure their decisions are evidence-based; they can also strengthen accountability. This is not just about governments. International agencies, CSOs and the private sector should be involved. A true data revolution would draw on existing and new sources of data to fully integrate statistics into decision making, promote open access to, and use of, data and ensure increased support for statistical systems. (HLP Report, p. 23)
unprecedented opportunities for data-driven discovery and decision making in virtually every area of human endeavour
Source: https://www.nsf.gov/pubs/2016/nsf16512/nsf16512.htm
CONTEXT
Prehistory: no records
History (from the 4th millennium BC, invention of writing, ~Bronze Age): societies that rely on ICTs to record and transmit data of all kinds
Hyperhistory: societies and environments where ICTs and their data processing capabilities are the necessary condition for the maintenance and any further development of societal welfare, personal well-being, and overall flourishing
Source: Luciano Floridi, The 4th Revolution
Mar. 1, 1995
Aetas Ferrea
image source: https://it.wikipedia.org
Development of Empiricism
Sense experience is the ultimate source of all our concepts and knowledge
Source: Stanford Encyclopedia of Philosophy
Empiricism
The most reliable source of human knowledge is experience, especially perception by means of the physical senses.
Positivism
1- All knowledge regarding matters of fact is based on the positive data of experience.
2- Beyond the realm of fact is that of pure logic and pure mathematics.
The only authentic knowledge is scientific knowledge, and such knowledge can only come from the positive affirmation of theories through strict scientific method, rejecting every form of metaphysics.
Galileo Galilei (1564-1642)
Isaac Newton (1643-1727)
image source: https://it.wikipedia.org
Context
a fusion of technologies that is blurring the lines between the physical, digital, and biological spheres
Source: https://www.weforum.org/agenda/2016/01/the-fourth-industrial-revolution-what-it-means-and-how-to-respond
Cyber-physical systems
Cyber-physical systems (CPS) are based on networked embedded software systems, which connect computational entities in a collaborative manner with physical entities of the real world to achieve an overall purpose. Together with available content and services on the World Wide Web, they build networks of systems that integrate with the physical environment.
Source: http://www.bicc-net.de/aktivitaeten/aktivitaet/cyber-physical-systems/
Infosphere
An environment, like a biosphere, that is populated by informational entities called inforgs.
An inforg is an informationally embodied organism, an entity made up of information, that exists in the infosphere.
Source: Luciano Floridi, The 4th Revolution
Cybernetic view
"The Air Defense system is an organism... What then are organisms? They are of three kinds: animate organisms, which comprise animals and groups of animals, including men; partly animate organisms which involve animals together with inanimate devices such as in the Air Defense System; and inanimate organisms such as vending machines. All these organisms possess in common: sensory components, communication facilities, data analyzing devices, centers of judgment, directors of action, and effectors, or executing agencies"
Source: Progress report of the U.S. Air Defence Systems Engineering Committee, 1950
Project Cybersyn (1971-1973)
image source: Wikipedia
The Viable System Model (Chile, 1971-1973)
image source: Wikipedia
The body nervous system (Stafford Beer)
image source: Wikipedia
Analogy to a company system (Stafford Beer)
image source: Wikipedia
Principal functions of the Viable System Model (1975)
image source: Wikipedia
De Finetti board (1962)
Cybernetic heritage
A scenario
Source: Geisberger, E., & Broy, M. (Eds.). (2015). Living in a networked world: Integrated research agenda Cyber-Physical Systems (agendaCPS). Herbert Utz Verlag.
image source: Wikipedia
Source: http://www.wired.com/2008/06/pb-theory/
The new availability of huge amounts of data, along with the statistical tools to crunch these numbers, offers a whole new way of understanding the world. Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all.
In the 21st century, much of the vast volume of scientific data captured by new instruments on a 24/7 basis, along with information generated in the artificial world of computer models, is likely to reside forever in a live, substantially publicly accessible, curated state for the purpose of continued analysis. This analysis will result in the development of many new theories!
(extract from the Foreword)
Feb. 3, 2017
Some controversial cases
PredPol
http://www.predpol.com/
image source: http://news.wabe.org/post/concerns-arise-over-new-predictive-policing-program
more
Lenddo
image source: http://www.livingmarjorney.com
Chinese social credit system
Source: https://www.socialcooling.com/
Mar. 21, 2011
"the orgy of fact extraction in which everybody is currently engaged has, like most consumer economies, accumulated a vast debt. This is a debt of theory and some of us are soon going to have an exciting time paying it back with interest, I hope"
Sydney Brenner, Nobel Prize for Medicine
Useful links and readings

Readings on Big Data and application to Science:
Viktor Mayer-Schönberger and Kenneth Cukier. 2013. Big Data: A Revolution that will Transform how We Live, Work and Think. John Murray Publishers, UK.
Anthony J. G. Hey, Stewart Tansley, Kristin M. Tolle (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery.

Perspective from Philosophy and Ethics:
http://www.recode.net/2016/6/14/11923286/facebook-emotional-contagion-controversy-data-researchreview-policy-ethics
http://www.theverge.com/2014/12/9/7360441/facebook-screwing-with-user-emotions-was-2014s-most-sharedscientific
The Council for Big Data, Ethics, and Society, Perspectives on Big Data, Ethics, and Society, http://bdes.datasociety.net/wp-content/uploads/2016/05/perspectives-on-big-data.pdf
Kent W. Staley, An Introduction to Philosophy of Science, Cambridge University Press.
Floridi, L. and Taddeo, M., What is data ethics?, Phil. Trans. R. Soc. A, Volume 374, Issue 2083, December 2016.
Lepri, B., Staiano, J., Sangokoya, D., Letouzé, E., & Oliver, N. (2016). The Tyranny of Data? The Bright and Dark Sides of Data-Driven Decision-Making for Social Good. arXiv preprint arXiv:1612.00323.
Brent Daniel Mittelstadt, Patrick Allo, Mariarosaria Taddeo, Sandra Wachter, Luciano Floridi, The ethics of algorithms: Mapping the debate, Big Data & Society, Vol 3, Issue 2, first published November 1, 2016, DOI: 10.1177/2053951716679679
Useful links and readings

Perspective from Statistics:
R. Foygel Barber and E. J. Candès. Controlling the false discovery rate via knockoffs.
Sheldon M. Ross, Introduction to Probability and Statistics for Engineers and Scientists, Elsevier.
Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, Keying Ye, Probability & Statistics for Engineers & Scientists.
Bradley Efron, Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction, ISBN: 9781107619678, Jan 2013.

Perspective from Physics:
Cecconi, F., Cencini, M., Falcioni, M. and Vulpiani, A., Predicting the future from the past: An old problem from a modern perspective, American Journal of Physics, 80, 1001-1008 (2012), DOI: http://dx.doi.org/10.1119/1.4746070
Francesco Sylos Labini, Big Data Complexity and Scientific Method.
Chris Anderson, The End of Theory.
L.F. Richardson, Weather Prediction by Numerical Process (Cambridge University Press, 1922).
Data Revolution: from the philosophical roots to the latest technological frontiers