Integration and massive storage of hydro-meteorological data combining big data & semantic web technologies
ABSTRACT
Ecuador holds an immense collection of hydro-meteorological data, usually described through standards that indicate how to locate and invoke them. To make such data easier to understand and use, they need to be stored in a common repository and annotated with descriptive metadata. This paper proposes an approach for the massive storage, integration and semantic annotation of hydro-meteorological data using an open-source integration container, NoSQL databases and Semantic Web technologies. The main contributions of this paper are: i) a shared common repository of hydro-meteorological data, ii) automatic semantic annotation to formally describe the data sources, and iii) efficient mechanisms for searching and retrieving hydro-meteorological data.
Keywords: Hydro-meteorological data, data integration, big data, semantic web, NoSQL.
RESUMEN (Spanish abstract, in translation)
Ecuador holds an immense collection of hydro-meteorological data, usually described using standards that tell us how to locate and invoke them. Making these data easier to understand and use requires storing them in a common repository and annotating them formally with descriptive metadata. This article proposes a mechanism for the massive storage, integration and semantic annotation of hydro-meteorological data using an integration framework, NoSQL databases and Semantic Web technologies. The main contributions of this article are: i) a shared repository of hydro-meteorological data, ii) automatic semantic annotation to describe the sources formally, and iii) efficient mechanisms for searching and querying hydro-meteorological data.
Keywords: Hydro-meteorological data, data integration, big data, semantic web, NoSQL.
License
Copyright © Authors. Creative Commons Attribution 4.0 License for any article submitted from 6 June 2017 onwards. For manuscripts submitted before that date, the CC BY 3.0 License was used.
You are free to:
Share: copy and redistribute the material in any medium or format.
Adapt: remix, transform, and build upon the material for any purpose, even commercially.
Under the following conditions:
Attribution: You must give appropriate credit, provide a link to the licence, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the licence permits.