Publications by Khalid Ashmawy
2011
Systems Group Master's Thesis, no. 30; Department of Computer Science, December 2011
Supervised by: Prof. Donald Kossmann
This master's thesis aims to add partial live migration features to Crescando[17]. Crescando is a scalable, distributed relational table implementation based on parallel, collaborative scans in memory. The features developed in this thesis provide the building blocks for implementing elastic scalability and high availability in Crescando. Elastic scalability refers to adding or removing storage nodes in a cluster without downtime. High availability refers to avoiding unplanned outages by eliminating single points of failure. One method of providing high availability is fault tolerance through replication[10]. Both elastic scalability and high availability require an efficient method to copy or move data across storage nodes, which this thesis provides.
The problem is tackled with a black-box approach: Crescando’s external user interface is used to solve the problem, rather than altering its implementation. Crescando’s simple operations (Select, Insert, Delete) serve as the elementary units for copying and moving data across nodes. One of the challenges of building such a system is migrating the contents of a relational table with minimal impact on the availability and performance of the whole system. Optimizations are incorporated so that the data transfer rate saturates a gigabit Ethernet interface. The duration of the system interruption is limited to the period required for data transfer. Moreover, the solution must provide certain consistency guarantees. Our solution guarantees linearizability[9], a well-known strong consistency guarantee.
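As a rough illustration of the black-box idea, the sketch below builds copy and move out of select, insert, and delete calls only. The `Table` class, its method names, and the `(key, value)` row shape are hypothetical stand-ins, not Crescando's actual interface, and the sketch ignores concurrency, batching, and the linearizability guarantee mentioned above.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[int, str]  # (key, value); a made-up schema for illustration


class Table:
    """Hypothetical in-memory stand-in for the table held by one storage node."""

    def __init__(self) -> None:
        self.rows: Dict[int, str] = {}

    def select(self, pred: Callable[[int], bool]) -> List[Row]:
        # Return all rows whose key satisfies the predicate, ordered by key.
        return [(k, v) for k, v in sorted(self.rows.items()) if pred(k)]

    def insert(self, batch: List[Row]) -> None:
        # Add (or overwrite) the given rows.
        self.rows.update(batch)

    def delete(self, pred: Callable[[int], bool]) -> None:
        # Drop all rows whose key satisfies the predicate.
        self.rows = {k: v for k, v in self.rows.items() if not pred(k)}


def copy_range(src: Table, dst: Table, pred: Callable[[int], bool]) -> int:
    """Copy matching rows from src to dst using only select and insert."""
    batch = src.select(pred)
    dst.insert(batch)
    return len(batch)


def move_range(src: Table, dst: Table, pred: Callable[[int], bool]) -> int:
    """Move matching rows: copy them first, then delete them at the source."""
    moved = copy_range(src, dst, pred)
    src.delete(pred)
    return moved
```

Deleting at the source only after the insert at the destination has completed means a failure midway leaves a duplicate rather than a loss; whether the thesis orders the steps this way is an assumption of the sketch.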
The migration system developed in this thesis is used by a higher-level layer known as Rubberband[16]. Rubberband implements a well-known replication scheme called successor-list replication[14]. It instructs the appropriate nodes in a dynamic set of nodes to shuffle data using the migration system, in order to maintain the successor-list replication scheme as storage nodes join and leave the system.
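Successor-list replication, as it is commonly described for ring-based systems, stores each partition on its primary node and on the nodes that immediately follow it on the ring. The sketch below computes that replica set. The function name, the list-based ring, and the replication factor `r` are illustrative assumptions and are not taken from Rubberband.

```python
from typing import List


def replica_nodes(ring: List[str], primary: int, r: int) -> List[str]:
    """Return the primary node and the r-1 nodes that follow it on the ring.

    `ring` lists node names in ring order; `primary` is an index into it.
    If the ring has fewer than r nodes, every node holds a replica.
    """
    count = min(r, len(ring))
    return [ring[(primary + i) % len(ring)] for i in range(count)]


ring = ["A", "B", "C", "D"]
before = replica_nodes(ring, 3, 3)   # partition owned by D

# A new node E joins directly after D, so D's successor list changes.
ring_after = ["A", "B", "C", "D", "E"]
after = replica_nodes(ring_after, 3, 3)
```

Comparing `before` and `after` shows which nodes gain or lose a replica when membership changes; those differences are the kind of copy and move requests a layer above the migration system would have to issue.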
@mastersthesis{abc,
  abstract = {This master{\textquoteright}s thesis aims to add partial live migration features to Crescando[17]. Crescando is a scalable, distributed relational table implementation based on parallel, collaborative scans in memory. The features developed in this thesis provide the building blocks for implementing elastic scalability and high availability in Crescando. Elastic scalability refers to adding or removing storage nodes in a cluster without downtime. High availability refers to avoiding unplanned outages by eliminating single points of failure. One method of providing high availability is fault tolerance through replication[10]. Both elastic scalability and high availability require an efficient method to copy or move data across storage nodes, which this thesis provides. The problem is tackled with a black-box approach: Crescando{\textquoteright}s external user interface is used to solve the problem, rather than altering its implementation. Crescando{\textquoteright}s simple operations (Select, Insert, Delete) serve as the elementary units for copying and moving data across nodes. One of the challenges of building such a system is migrating the contents of a relational table with minimal impact on the availability and performance of the whole system. Optimizations are incorporated so that the data transfer rate saturates a gigabit Ethernet interface. The duration of the system interruption is limited to the period required for data transfer. Moreover, the solution must provide certain consistency guarantees. Our solution guarantees linearizability[9], a well-known strong consistency guarantee. The migration system developed in this thesis is used by a higher-level layer known as Rubberband[16]. Rubberband implements a well-known replication scheme called successor-list replication[14]. It instructs the appropriate nodes in a dynamic set of nodes to shuffle data using the migration system, in order to maintain the successor-list replication scheme as storage nodes join and leave the system.},
  author = {Khalid Ashmawy},
  school = {30},
  title = {Partial live migration in scan-based database systems},
  year = {2011}
}