Publication
Systems Group Master's Thesis, no. 2; Department of Computer Science, April 2011
Supervised by: Prof. Donald Kossmann
Despite the advances in the areas of databases and information retrieval, there still remain certain
types of queries that are difficult to answer using machines alone. Such queries require human interaction
to either provide data that is not readily available to machines or to gain more information from
existing electronic data.
CrowdDB is a database system that enables difficult queries to be answered by using crowdsourcing
to integrate human knowledge with electronically available data. To a large extent, the concepts
and capabilities of traditional database systems are leveraged in CrowdDB. Despite the commonalities,
since CrowdDB deals with procuring and utilizing human input, several existing capabilities of
traditional database systems require modifications and extensions. Unlike electronically available
data, human input provided by crowdsourcing is unbounded and virtually infinite. Accordingly,
CrowdDB is a system based on an open-world assumption. An extension of SQL, termed CrowdSQL,
is used to model data and manipulate it. CrowdSQL is also used as the language to express
complex queries on the integrated data sources. Furthermore, interaction with the crowd in CrowdDB
requires an additional component that governs automatic user interface generation, based on available
schemas and queries. Also, performance acquires a new meaning in the context of a system such as
CrowdDB. Response time (efficiency), quality (effectiveness) and cost (in $) in CrowdDB depend
on several parameters, including the availability of the crowd, the financial rewards for
tasks, and the state of the crowdsourcing platform. In this thesis, we propose the design, architecture and
functioning of CrowdDB. In addition, we present the details of building such a system on an existing
Java-based database, H2. The design and functionalities of CrowdDB have also been presented in
[13].
@mastersthesis{abc, abstract = {Despite the advances in the areas of databases and information retrieval, there still remain certain types of queries that are difficult to answer using machines alone. Such queries require human interaction to either provide data that is not readily available to machines or to gain more information from existing electronic data. CrowdDB is a database system that enables difficult queries to be answered by using crowdsourcing to integrate human knowledge with electronically available data. To a large extent, the concepts and capabilities of traditional database systems are leveraged in CrowdDB. Despite the commonalities, since CrowdDB deals with procuring and utilizing human input, several existing capabilities of traditional database systems require modifications and extensions. Unlike electronically available data, human input provided by crowdsourcing is unbounded and virtually infinite. Accordingly, CrowdDB is a system based on an open-world assumption. An extension of SQL, termed CrowdSQL, is used to model data and manipulate it. CrowdSQL is also used as the language to express complex queries on the integrated data sources. Furthermore, interaction with the crowd in CrowdDB requires an additional component that governs automatic user interface generation, based on available schemas and queries. Also, performance acquires a new meaning in the context of a system such as CrowdDB. Response time (efficiency), quality (effectiveness) and cost (in $) in CrowdDB depend on several parameters, including the availability of the crowd, the financial rewards for tasks, and the state of the crowdsourcing platform. In this thesis, we propose the design, architecture and functioning of CrowdDB. In addition, we present the details of building such a system on an existing Java-based database, H2. The design and functionalities of CrowdDB have also been presented in [13].}, author = {Sukriti Ramesh}, school = {2}, title = {CrowdDB -- Answering Queries with Crowdsourcing}, year = {2011} }