Probabilistic System-Level Fault Diagnostic Algorithms for Multiprocessors

Title: Probabilistic System-Level Fault Diagnostic Algorithms for Multiprocessors
Publication Type: Journal Article
Year of Publication: 1997
Authors: Bartha, T., and Selényi, E.
Journal: Parallel Computing
Volume: 22
Pagination: 1807-1821
Date Published: 1997
ISSN: 0167-8191
Abstract

Massively parallel computers (MPCs) introduce new requirements for system-level fault diagnosis, such as handling a huge number of processing elements in a heterogeneous system. They also have specific attributes, such as regular topology and low local complexity. Traditional deterministic methods of system-level diagnosis did not consider these issues. This paper presents a new approach, called local information diagnosis, that exploits the characteristics of massively parallel systems. The paper defines the diagnostic model, which is based on generalized test invalidation to handle inhomogeneity in multiprocessors. Five effective probabilistic diagnostic algorithms using the proposed method are also given, and their space and time complexity are estimated.

Notes: UT: A1997WM04500008; L3: citeulike-article-id:3911659; KW: system-level diagnosis; ISSN 0167-8191; Special issue: distributed and parallel systems: environments and tools