Publication

Systems Group Master's Thesis, ETH Zürich, Department of Computer Science, March 2010
Supervised by: Prof. Donald Kossmann
For applications that deal with large amounts of data, the state of the art is to use a specialized query language such as SQL for relational data or SPARQL for RDF. The business logic of an application is implemented in a host programming language such as Java or C++, whereas the query language provides the interface to persistent storage. These languages often use different data models, which causes the well-known impedance mismatch problem.

XQuery is a declarative programming language that can be used on all application tiers and thus enables a unified technology stack based on the XML data model. XQuery is very well suited for querying and manipulating data that is stored in XML collections. Yet many companies still run a large number of legacy applications that produce SQL code to access a relational database or SPARQL code to query RDF documents.

It is not feasible to replace all legacy databases and applications with XML databases and new programs written in XQuery all at once. Therefore, this thesis explores how legacy SQL and SPARQL code can be mapped to XQuery, which preconditions and limitations apply to an automated mapping, and how well the transformation performs. In this way, all information can be moved to XML databases without having to change existing applications. Both old systems and applications written in XQuery can co-exist by accessing the same data. This eliminates the need for replication of different database systems and prevents inconsistencies.
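To make the idea of mapping SQL to XQuery more concrete, the following Python sketch rewrites a very restricted form of SELECT statement into an XQuery FLWOR expression. It is an illustration only and not the mapping developed in the thesis: the function name `sql_to_xquery`, the supported SQL subset, and the XML layout (one collection per table, one `<row>` element per tuple, one child element per column) are all assumptions made for this example.

```python
import re

# Assumed XML layout (hypothetical, not taken from the thesis):
# each table is a collection named after the table, each tuple is a
# <row> element, and each column is a child element of that row.
_SELECT = re.compile(
    r"^\s*SELECT\s+(?P<cols>[\w\s,]+?)\s+FROM\s+(?P<table>\w+)"
    r"(?:\s+WHERE\s+(?P<col>\w+)\s*=\s*'(?P<val>[^']*)')?\s*;?\s*$",
    re.IGNORECASE,
)


def sql_to_xquery(sql: str) -> str:
    """Translate `SELECT a, b FROM t [WHERE c = 'v']` into a FLWOR expression.

    Anything outside this tiny subset (joins, `*`, aggregates, ...) is rejected.
    """
    match = _SELECT.match(sql)
    if match is None:
        raise ValueError("unsupported SQL statement")

    table = match.group("table")
    columns = [c.strip() for c in match.group("cols").split(",")]
    filter_col = match.group("col")

    lines = [f'for $r in collection("{table}")/row']
    if filter_col is not None:
        # Escape characters that are special inside an XQuery string literal.
        value = match.group("val").replace("&", "&amp;").replace('"', '""')
        lines.append(f'where $r/{filter_col} = "{value}"')

    # Project the selected columns into a new <row> element.
    fields = ", ".join(f"$r/{c}" for c in columns)
    lines.append(f"return <row>{{ {fields} }}</row>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(sql_to_xquery("SELECT name, city FROM customers WHERE city = 'Zurich'"))
```

Under these assumptions, the example statement becomes a `for` clause over `collection("customers")/row`, a `where` clause comparing `$r/city`, and a `return` clause that builds a new `<row>` element. A real mapping would also have to handle joins, NULL semantics, data types, and ordering, which this sketch deliberately leaves out.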
@mastersthesis{abc,
	abstract = {For applications that deal with large amounts of data, the state of the art is
to use a specialized query language such as SQL for relational data or SPARQL for RDF.
The business logic of an application is implemented in a host programming language
such as Java or C++, whereas the query language provides the interface to persistent storage.
These languages often use different data models, which causes the well-known impedance
mismatch problem.
XQuery is a declarative programming language that can be used on all application tiers and
thus enables a unified technology stack based on the XML data model. XQuery is very
well suited for querying and manipulating data that is stored in XML collections. Yet many
companies still run a large number of legacy applications that produce SQL code to access
a relational database or SPARQL code to query RDF documents.
It is not feasible to replace all legacy databases and applications with XML databases and new
programs written in XQuery all at once. Therefore, this thesis explores how legacy SQL
and SPARQL code can be mapped to XQuery, which preconditions and limitations apply to an
automated mapping, and how well the transformation performs. In this way, all
information can be moved to XML databases without having to change existing applications.
Both old systems and applications written in XQuery can co-exist by accessing the same data.
This eliminates the need for replication of different database systems and prevents
inconsistencies.},
	author = {Martin Kaufmann},
	school = {ETH Z{\"u}rich},
	title = {Mapping SPARQL and SQL to XQuery},
	year = {2010}
}