| dc.description.abstract |
We increasingly rely on online services to store, access, share, and disseminate critical information from anywhere and at all times. Such services include email, digital storage, photos, video, and health and financial services. With increasing evidence of non-fail-stop failures in practical systems, Byzantine fault-tolerant (BFT) state machine replication is becoming increasingly attractive for building highly reliable services that tolerate such failures. However, existing BFT techniques fall short of providing high availability, high performance, and long-term data durability guarantees at a competitive replication cost. In this dissertation, we present BFT replication techniques that facilitate the design and implementation of such highly reliable services by providing high availability, high performance, and high durability at a competitive replication cost (hardware, software, network, and management). First, we propose CBASE, a BFT state machine replication architecture that leverages application-level parallelism to improve the throughput of the replicated system by identifying and executing independent requests concurrently. Traditional BFT techniques based on state machine replication provide high availability and security but fail to provide high throughput. This limitation stems from the fundamental assumption of generalized state machine replication techniques that all replicas execute requests sequentially in the same total order to ensure consistency across replicas. By relaxing this requirement for independent requests, our architecture provides a general way to exploit application parallelism for high throughput without compromising correctness. Second, we present Zyzzyva, an efficient BFT agreement protocol that uses speculation to significantly reduce the performance overhead and replication cost of BFT state machine replication.
In Zyzzyva, replicas respond to a client’s request without first running an expensive three-phase commit protocol to reach agreement on the order in which the request must be processed. Instead, they optimistically adopt the order proposed by the primary and respond immediately to the client. Replicas can thus become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and rely only on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima. Third, we design and implement SafeStore, a distributed storage system designed to maintain long-term data durability despite conventional hardware and software faults, environmental disruptions, and administrative failures caused by human error or malice. The architecture of SafeStore is based on fault isolation, which SafeStore applies aggressively along administrative, physical, and temporal dimensions by spreading data across autonomous storage service providers (SSPs). SafeStore also performs an efficient end-to-end audit of SSPs to detect data loss quickly and improve data durability by reducing MTTR. SafeStore offers durable storage with cost, performance, and availability competitive with traditional storage systems. We evaluate these techniques by implementing BFT replication libraries, and we further demonstrate the practicality of these approaches by implementing an NFS-based replicated file system (CBASE-FS) and a durable storage system (SafeStore-FS). |
en_US |