GSDI Conferences, GSDI 15 World Conference

Taming Big Data with Metadata
Joana Simoes, Paul Van Genuchten, Jeroen Ticheler

Last modified: 2016-06-06

Abstract


In recent years, we have witnessed explosive growth in geospatial data. On the one hand, this could be due to the “piling up” of time series from traditional data sources (e.g., remote sensing); on the other hand, new geospatial datasets are emerging (e.g., data generated by sensors, or by "humans as sensors"). These new sources are linked to relatively recent phenomena such as the Internet of Things (IoT) and Volunteered Geographic Information (VGI).

Big Data has often been defined in terms of its five properties (the five V's): Volume, Velocity, Variety, Veracity and Value [1]. Much emphasis has been put on addressing the first two V's by developing innovative frameworks that can ingest petabytes of data in real time or near real time, but a similar effort is needed to address the Variety, Veracity and Value of Big Data. This is where we think metadata can help.

Metadata is often defined as "data about data", and it is key to discovering datasets, assessing their quality, and using and preserving them in the long term (e.g., survivability of data). Having more and more heterogeneous information does not necessarily bring value to businesses and organizations unless this information is discoverable, interoperable, and meets a certain degree of quality. To enforce these properties, a variety of technologies have been introduced, such as OGC standards (e.g., CSW), metadata profiles (e.g., INSPIRE) and best practices (e.g., Spatial Data on the Web Best Practices).

In this talk we are going to discuss some of these technologies and related challenges in the context of a spatial web catalog: GeoNetwork Opensource. We are also going to discuss strategies for metadata creation inspired by the crowdsourcing paradigm, which can increase confidence in data quality through a process of peer review. We intend to demonstrate how metadata can be used as a privileged asset, not only for discovering and managing Big Geo Data, but also to enforce its quality and, ultimately, to increase its value.

 



Keywords


big data; catalog services; metadata

References


[1] Xu, C., & Yang, C. (2014). Introduction to big geospatial data research. Annals of GIS, 20(4), 227-232.
