Fault Tolerance in Distributed Systems (1994)
Book Details
Author Pankaj Jalote
Genre Distributed Processing; Distributed Systems; Fault Tolerant Systems
Publication Date 1994
Format Paperback (250 x mm)
Publisher Prentice Hall PTR
Language English
Table of Contents
1. Introduction.
Basic Concepts and Definitions. Phases in Fault Tolerance. Overview of Hardware Fault Tolerance. Reliability and Availability. Summary.

2. Distributed Systems.
System Model. Interprocess Communication. Ordering of Events and Logical Clocks. Execution Model and System State. Summary.

3. Basic Building Blocks.
Byzantine Agreement. Synchronized Clocks. Stable Storage. Fail Stop Processors. Failure Detection and Fault Diagnosis. Reliable Message Delivery. Summary.

4. Reliable, Atomic, and Causal Broadcast.
Reliable Broadcast. Atomic Broadcast. Causal Broadcast.

5. Recovering A Consistent State.
Asynchronous Checkpointing and Rollback. Distributed Checkpointing. Summary.

6. Atomic Actions.
Atomic Actions and Serializability. Atomic Actions in a Centralized System. Commit Protocols. Atomic Actions on Decentralized Data. Summary.

7. Data Replication And Resiliency.
Optimistic Approaches. Primary Site Approach. Resiliency with Active Replicas. Voting. Degree of Replication. Summary.

8. Process Resiliency.
Resilient Remote Procedure Call. Resiliency with Asynchronous Communication. Resiliency with Synchronous Message Passing. Total Failure and Last Process to Fail. Summary.

9. Software Design Faults.
Approaches for Uniprocess Software. Backward Recovery in Concurrent Systems. Forward Recovery in Concurrent Systems. Summary.

Bibliography.


From the Back Cover
Fault tolerance is an approach by which the reliability of a computer system can be increased beyond what can be achieved by traditional methods. While hardware-supported fault tolerance has been well documented, the newer, software-supported fault tolerance techniques have remained scattered throughout the literature. Comprehensive and self-contained, this book organizes that body of knowledge with a focus on fault tolerance in distributed systems. (The uniprocess case is treated as a special case of distributed systems.) It treats fault-tolerant distributed systems as consisting of levels of abstraction, each providing different fault-tolerant services. The book is intended for researchers and practitioners working in the area of fault tolerance.


Personal Details
Collection Status Not In Collection
Store Bookpool.com
Location Box 06
Purchase Price $49.95
Purchase Date 1/8/98
Condition Fine
Index 224
Owner Paulo Mendes
Read It No
Collection # 00178G
Order # 4gsh9d
Product Details
LoC Classification QA76.9.F38 J35 1994
Dewey 004/.36 20
ISBN 0133013677
Edition 01
Printing 1
Country USA
Cover Price $65.00
Nr of Pages 432
First Edition Yes
Rare No
Notes
Includes bibliographical references (p. 401-420) and index.