Monitoring distributed systems
Date
1985-11-01
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
The monitoring of distributed systems involves the collection,
interpretation, and display of information concerning the interactions among
concurrently executing processes. This information and its display can support
the debugging, testing, performance evaluation, and dynamic documentation of
distributed systems. General problems associated with monitoring are
outlined in this paper, and the architecture of a general purpose, extensible,
distributed monitoring system is presented. Three approaches to the display
of process interactions are described: sequential textual traces, animated
graphical traces, and a third which combines aspects of textual and
graphical views to display a systems's evolution versus inter-process
communication events. The roles that each of these types of tracing fulfill
in monitoring and debugging distributed systems are identified and
compared. Monitoring tools for collecting communication statistics,
deadlock detection, controlling the non-deterministic execution of
distributed systems, and the use of protocol specifications in monitoring
are also described.
Our discussion is based on experience in the development and use of a
monitoring system within a distributed programming environment called
Jade. The Jade environment was developed within the Computer Science
Department of the University of Calgary and is now being used to support
teaching and research within a number of university and research
organisations.
Description
Keywords
Computer Science