Institut für Informatik, Humboldt-Universität zu Berlin

Link Traversal Based Query Execution - A Novel Approach to Query the Web of Linked Data

The World Wide Web is evolving into a Web of Data in which content providers publish and interlink data much as they have published and linked Web documents over the last 20 years. Executing structured, SQL-like queries over this emerging dataspace opens possibilities not conceivable before, but it also poses novel challenges. Because the Web is open, it is impossible to know in advance all data sources that might contribute to the answer of a query. Traditional query execution paradigms are therefore insufficient for tapping the full potential of the Web, because they assume that a fixed set of potentially relevant data sources is known beforehand.

We work on a novel query execution paradigm that allows the execution engine to discover potentially relevant data during the evaluation of a query. Our approach to answering queries exploits a characteristic of the Web of Data: the existence of data links between data items from different sources. The general idea of the approach, which we call link traversal based query execution, is to intertwine the construction of query results with the traversal of data links that correspond to intermediate solutions in the construction process.
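
The general idea can be illustrated with a minimal sketch. It is not the implementation used in our query system; it only shows, under simplifying assumptions, how matching triple patterns and looking up URIs can be interleaved. The names `TOY_WEB` and `LinkTraversalEngine`, the `ex:` prefix test for URIs, and the in-memory dictionary that stands in for HTTP lookups are all hypothetical and chosen for illustration.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: URI -> triples obtained by looking it up.
TOY_WEB: Dict[str, Set[Triple]] = {
    "ex:alice": {("ex:alice", "knows", "ex:bob")},
    "ex:bob": {("ex:bob", "name", "Bob")},
}


def is_var(term: str) -> bool:
    return term.startswith("?")


def is_uri(term: str) -> bool:
    # Assumption for this sketch: every URI uses the "ex:" prefix.
    return term.startswith("ex:")


class LinkTraversalEngine:
    def __init__(self, web: Dict[str, Set[Triple]]) -> None:
        self.web = web
        self.data: Set[Triple] = set()  # triples discovered so far
        self.fetched: Set[str] = set()  # URIs that were already looked up

    def deref(self, uri: str) -> None:
        """Look up a URI once and add the retrieved triples to the local data."""
        if uri not in self.fetched:
            self.fetched.add(uri)
            self.data |= self.web.get(uri, set())

    def match(self, pattern: Triple, binding: Binding) -> Iterator[Binding]:
        # Apply the intermediate solution to the pattern.
        bound = tuple(binding.get(term, term) for term in pattern)
        # Traverse the data links mentioned in the (partially) bound pattern.
        for term in bound:
            if is_uri(term):
                self.deref(term)
        # Match the pattern against a snapshot of the data known at this point.
        for triple in list(self.data):
            candidate = dict(binding)
            if all(
                candidate.setdefault(p, v) == v if is_var(p) else p == v
                for p, v in zip(bound, triple)
            ):
                yield candidate

    def execute(
        self, patterns: List[Triple], binding: Optional[Binding] = None
    ) -> Iterator[Binding]:
        """Evaluate the patterns one by one, extending intermediate solutions."""
        binding = binding or {}
        if not patterns:
            yield binding
            return
        for extended in self.match(patterns[0], binding):
            yield from self.execute(patterns[1:], extended)


# "Whom does ex:alice know, and what are their names?"
query = [("ex:alice", "knows", "?p"), ("?p", "name", "?n")]
engine = LinkTraversalEngine(TOY_WEB)
results = list(engine.execute(query))
print(results)
```

In this sketch the engine starts with no data at all. The triple about `ex:bob` becomes available only because the intermediate solution for the first pattern binds `?p` to `ex:bob`, which is then looked up before the second pattern is matched. An engine restricted to the data of `ex:alice` would find no result for the same query.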

The integration of link traversal into the query execution process and the ability to discover data from unknown sources present novel challenges for the development of execution engines and for the application of query planning and optimization. In our work we address the following questions:

  • How do we implement the general idea of link traversal based query execution in a query system?
  • What are the trade-offs of different implementation approaches?
  • How do we generate query execution plans without any information about the statistics or distribution of the data that will be discovered?
  • How do we reduce the impact of network access times on query execution times?
  • How do we benefit from reusing data discovered and retrieved during the execution of previous queries as seed data for the current query?
  • How do we integrate an assessment of the quality and trustworthiness of discovered data in order to guarantee certain quality criteria for the query results?

Publications

  • Executing SPARQL Queries over the Web of Linked Data
    Olaf Hartig, Christian Bizer, Johann-Christoph Freytag
    Proceedings of the 8th International Semantic Web Conference (ISWC'09), Washington, DC, USA, 2009/10
  • A Database Perspective on Consuming Linked Data on the Web
    Olaf Hartig, Andreas Langegger
    Datenbankspektrum, Semantic Web Special Issue, 2010/09
  • A Main Memory Index Structure to Query Linked Data
    Olaf Hartig, Frank Huber
    Proceedings of the 4th Linked Data on the Web (LDOW) Workshop at the World Wide Web Conference (WWW), Hyderabad, India, 2011/03
  • How Caching Improves Efficiency and Result Completeness for Querying Linked Data
    Olaf Hartig
    Proceedings of the 4th Linked Data on the Web (LDOW) Workshop at the World Wide Web Conference (WWW), Hyderabad, India, 2011/03
  • Zero-Knowledge Query Planning for an Iterator Implementation of Link Traversal Based Query Execution
    Olaf Hartig
    Proceedings of the 8th Extended Semantic Web Conference (ESWC), Heraklion, Greece, 2011/06
  • SPARQL for a Web of Linked Data: Semantics and Computability
    Olaf Hartig
    Proceedings of the 9th Extended Semantic Web Conference (ESWC), Heraklion, Greece, 2012/05
  • Foundations of Traversal Based Query Execution over Linked Data
    Olaf Hartig, Johann-Christoph Freytag
    Proceedings of the 23rd ACM Conference on Hypertext and Social Media (HT), Semantic Data Track, Milwaukee, WI, USA, 2012/06

Links

Project Website of our link traversal based query system SQUIN


Last update:  Monday, July 15, 2013

Contact persons


Olaf Hartig