36th ISRSE, 11 May 2015, Berlin, Germany
Big Data Breaking Barriers
First Steps on a Long Trail
Sven Schade (@innovatearth)
Institute for Environment and Sustainability, Digital Earth and Reference Data Unit
Including contributions from: Massimo Craglia, Davide De Marchi, Irene Eleta, Jacopo Grazzini, Jiří Hradec, Alexander Kotsev, Frank Ostermann, Nicole Ostländer, Francesco Pantisano, Elena Roglia, Cristina Sanchez, Sven Schade, Spyridon Spyratos, Chrisa Tsinaraki, Lorenzino Vaccari, as well as the two visiting scientists (Levente Juhász and Sergi Trilles)
www.jrc.ec.europa.eu
Serving society, Stimulating innovation, Supporting legislation
Organisational Background - Digital Earth and Reference Data unit
INSPIRE: Implementation, Maintenance and Evolution
Open Data Policy of the JRC
Complex data handling with geospatial informatics
Content
Brief introduction to Big (Geospatial) Data
Highlights from 10 completed case studies
Discussion of 3 barriers
Conclusions
This talk provides an overview of work that was largely carried out in 2014, with some reflections on a possible way ahead. The content presents my personal view and does not necessarily reflect the position of the European Commission.
[source: jimmiejoe.com]
Computer science & system engineering
[source: NBDRA, NIST] [source: bigdatalandscape.com]
Geospatial information science
Volume: remote sensing (images or point clouds in 2D/3D), or intense modelling (e.g. immediate and medium range weather forecasting and climate modelling).
Velocity: large single volumes or continuous inputs of the same type, but from massive amounts of sources (e.g. in the context of the IoT).
Variety: for any given place on earth (or elsewhere), we already receive spatially-related data sets and streams from multiple sources today. These grow and accumulate over time.
Veracity: reference data and the differentiation between authoritative sources and user-contributed content are discussed in the geospatial and statistics communities.
[source: M. Craglia] [source: wikimedia.org]
3D platform for geospatial data handling
EU3D advanced desktop application [source: D. De Marchi]
3D browser based viewer [source: D. De Marchi]
Both build on the Core003 data set, a Very High Resolution (VHR) optical coverage of the member and cooperating countries of the European Environment Agency (EEA). It was generated from SPOT-5 multispectral data at 2.5 metres resolution, ortho-rectified with a geo-location accuracy of less than 5 metres Root Mean Square Error (RMSE).
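The accuracy figure above is expressed as RMSE. As a minimal sketch of what such a figure means, the following computes a horizontal RMSE from pairs of check points and compares it with the 5 metre threshold. The function name `horizontal_rmse` and all coordinates are hypothetical and chosen for illustration only; they are not taken from Core003 or from the procedure actually used to validate it.

```python
import math

def horizontal_rmse(measured, reference):
    """Root mean square of the horizontal offsets between point pairs (metres)."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("need two non-empty point lists of equal length")
    # Squared horizontal distance between each measured point and its reference.
    squared = [
        (mx - rx) ** 2 + (my - ry) ** 2
        for (mx, my), (rx, ry) in zip(measured, reference)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical check points in a projected coordinate system (metres).
reference = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
measured = [(3.0, 4.0), (100.0, 0.0), (0.0, 100.0), (103.0, 104.0)]

rmse = horizontal_rmse(measured, reference)
meets_threshold = rmse < 5.0  # the "less than 5 metres RMSE" criterion
```

Because the offsets are squared before averaging, a few large displacements raise the RMSE more than many small ones, which is why it is a common summary of geo-location accuracy.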
Investigating usage potential of social media platforms
[source: I. Eleta, J. Grazzini and E. Roglia]
Using social media platforms to complement authoritative vector data
Using social network analysis to sense social behaviour
Using new database technologies to store and query social media data
[source: S. Spyratos et al.] [source: L. Juhász]
Sensing technologies and the Internet of Things
Real-time event detection from sensor networks
Service-enabled sensing platform for the environment
[source: S. Trilles et al.] [source: A. Kotsev, F. Pantisano et al.]
Handling complex data integration
Visualisations of complex metadata
New modes for multi-sensory integration [source: J. Hradec]
Model transparency [source: F. Ostermann and S. Schade] [source: N. Ostländer]
Discussion: Technical barriers
Rich choice of implementations available
Implications from technical choices and re-use
Do not expect (only) one (Digital Earth) platform!
Establish data flows between existing systems and enable the exchange of lessons learned
[source: pixabay.com]
Discussion: Semantic barriers
Potentially useful data does not reside in well-known communities any more
Desire to use data and tools from other sciences, and vice versa
Do not (only) work together!
Enable knowledge transfers across the sciences
[source: pixabay.com]
Discussion: Organisational barriers
Beneficiaries of applications do not only reside in (data) science
Adding value by including foreigners and opening new markets
Do not (only) address the technocratic dimension!
Increase efforts on social and behavioral aspects
[source: pixabay.com]
Conclusion
These are only a few examples and findings
Many case studies have been developed across the globe
Many of us have taken our first steps on the long trail towards successful and useful knowledge extraction from (small and big) data
[source: wikimedia.org]
We are currently investigating the use of RM-ODP to integrate the findings of any number of case studies. RM-ODP is a standard methodology for describing information systems and is already widely used in the geospatial information domain. It seems promising to use it to develop a high-level description of Digital Earth platforms and to provide guidance for case-specific re-use.
[source: pixabay.com]
Thank you! Final slide, really