Monitoring distributed systems

Date
1985-11-01
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
The monitoring of distributed systems involves the collection, interpretation, and display of information concerning the interactions among concurrently executing processes. This information and its display can support the debugging, testing, performance evaluation, and dynamic documentation of distributed systems. General problems associated with monitoring are outlined in this paper, and the architecture of a general purpose, extensible, distributed monitoring system is presented. Three approaches to the display of process interactions are described: sequential textual traces, animated graphical traces, and a third which combines aspects of textual and graphical views to display a systems's evolution versus inter-process communication events. The roles that each of these types of tracing fulfill in monitoring and debugging distributed systems are identified and compared. Monitoring tools for collecting communication statistics, deadlock detection, controlling the non-deterministic execution of distributed systems, and the use of protocol specifications in monitoring are also described. Our discussion is based on experience in the development and use of a monitoring system within a distributed programming environment called Jade. The Jade environment was developed within the Computer Science Department of the University of Calgary and is now being used to support teaching and research within a number of university and research organisations.
Description
Keywords
Computer Science
Citation