Summary

Location	Call #	Volume	Status
E-BOOK

LINK(S):
Available via SpringerLink; click here for access

Author	Sorin, Daniel J.
Title	Fault tolerant computer architecture / Daniel J. Sorin.
OCLC	200904CAC005
ISBN	9781598299540 (electronic bk.)
	9781598299533 (pbk.)
ISBN/ISSN	10.2200/S00192ED1V01Y200904CAC005 doi
Publisher	San Rafael, Calif. (1537 Fourth Street, San Rafael, CA 94901 USA) : Morgan & Claypool Publishers, [2009]
	©2009
Description	1 electronic text (xii, 103 pages : illustrations) : digital file.
LC Subject heading/s	Fault-tolerant computing.
	Self-stabilization (Computer science)
	Computer architecture.
SUBJECT	Fault tolerance (or fault tolerant)
	Reliability.
	Dependability.
	Computer architecture.
	Error detection.
	Error recovery.
	Fault diagnosis.
	Self-repair.
	Autonomous.
	Dynamic verification.
System details note	Mode of access: World Wide Web.
	System requirements: Adobe Acrobat reader.
Bibliography	Includes bibliographical references.
Contents	Introduction -- Goals of this book -- Faults, errors, and failures -- Masking -- Duration of faults and errors -- Underlying physical phenomena -- Trends leading to increased fault rates -- Smaller devices and hotter chips -- More devices per processor -- More complicated designs -- Error models -- Error type -- Error duration -- Number of simultaneous errors -- Fault tolerance metrics -- Availability -- Reliability -- Mean time to failure -- Mean time between failures -- Failures in time -- Architectural vulnerability factor -- The rest of this book -- References -- Error detection -- General concepts -- Physical redundancy -- Temporal redundancy -- Information redundancy -- The end-to-end argument -- Microprocessor cores -- Functional units -- Register files -- Tightly lockstepped redundant cores -- Redundant multithreading without lockstepping -- Dynamic verification of invariants -- High-level anomaly detection -- Using software to detect hardware errors -- Error detection tailored to specific fault models -- Caches and memory -- Error code implementation -- Beyond EDCs -- Detecting errors in content addressable memories -- Detecting errors in addressing -- Multiprocessor memory systems -- Dynamic verification of cache coherence -- Dynamic verification of memory consistency -- Interconnection networks -- Conclusions -- References -- Error recovery -- General concepts -- Forward error recovery -- Backward error recovery -- Comparing the performance of FER and BER -- Microprocessor cores -- FER for cores -- BER for cores -- Single-core memory systems -- FER for caches and memory -- BER for caches and memory -- Issues unique to multiprocessors -- What state to save for the recovery point -- Which algorithm to use for saving the recovery point -- Where to save the recovery point -- How to restore the recovery point state -- Software-implemented BER -- Conclusions -- References -- Diagnosis -- General concepts -- The benefits of diagnosis -- System model implications -- Built-in self-test -- Microprocessor core -- Using periodic BIST -- Diagnosing during normal execution -- Caches and memory -- Multiprocessors -- Conclusions -- References -- Self-repair -- General concepts -- Microprocessor cores -- Superscalar cores -- Simple cores -- Caches and memory -- Multiprocessors -- Core replacement -- Using the scheduler to hide faulty functional units -- Sharing resources across cores -- Self-repair of noncore components -- Conclusions -- References -- The future -- Adoption by industry -- Future relationships between fault tolerance and other fields -- Power and temperature -- Security -- Static design verification -- Fault vulnerability reduction -- Tolerating software bugs -- References.
Restrictions	Abstract freely available; full-text restricted to subscribers or individual document purchasers.
	Access may be restricted to authorized users only.
	Unlimited user license access
NOTE	Compendex.
	Google scholar.
	Google book search.
Abstract	For many years, most computer architects have pursued one primary goal: performance. Architects have translated the ever-increasing abundance of ever-faster transistors provided by Moore's law into remarkable increases in performance. Recently, however, the bounty provided by Moore's law has been accompanied by several challenges that have arisen as devices have become smaller, including a decrease in dependability due to physical faults. In this book, we focus on the dependability challenge and the fault tolerance solutions that architects are developing to overcome it. The two main purposes of this book are to explore the key ideas in fault-tolerant computer architecture and to present the current state-of-the-art--over approximately the past 10 years--in academia and industry.
NOTE	INSPEC.
Additional physical form available note	Also available in print.
General note	Part of: Synthesis digital library of engineering and computer science.
	Title from PDF t.p. (viewed on June 4, 2009).
	Series from website.
