Report on WarpKit performance study and improvement
Abstract
This report covers the earlier development of WarpKit, a parallel simulation
kernel for shared-memory multiprocessor architectures, carried out as part of
the Telesim project. The aim of the development is to exploit the
shared-memory multiprocessor paradigm in a Parallel Discrete Event Simulation
package capable of delivering high performance. Three major problems affect
the performance of Time Warp systems: the excessive cost of rollback
computation, which results from relying solely on rollback as the basic
synchronization mechanism in a distributed or parallel processing system; the
large amount of memory required to run applications; and high system overheads
in inter-process communication and global control (e.g. GVT computation and
memory management). Shared-memory multiprocessor architectures offer the
potential for much higher Time Warp performance than can be achieved in a
distributed environment. New approaches can be devised to address these
problems and realize that potential.
This report presents the results of our effort to improve the performance of
the WarpKit kernel. Incremental State Saving has been implemented on top of
the existing kernel, reducing both the time and the space spent on state
saving, which Time Warp requires. Purely asynchronous schemes have been
developed and implemented for the global control mechanism. As a result, the
system overhead of global control has been reduced significantly. The new
global control mechanism also makes the system overhead insensitive to the
number of processors, in contrast to the distributed case, where system
overhead increases sharply with the number of processors. A global scheduling
and load balancing mechanism is expected to restrict the number of rollbacks
to a low percentage of the net events processed by the kernel. With these new
mechanisms in place, one may expect a close to linear speedup curve for
parallel discrete event simulation on shared-memory multiprocessors.
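
To illustrate the general idea behind Incremental State Saving, the sketch
below logs only the values that an event overwrites, instead of copying the
whole state before each event. It is a minimal illustration in Python and not
the WarpKit implementation: the class IncrementalState, its methods, and the
field names are hypothetical and chosen only for this example.

```python
class IncrementalState:
    """Simulation state whose writes are logged so they can be undone.

    Hypothetical illustration of incremental state saving; not WarpKit code.
    All fields are assumed to be declared when the state is created.
    """

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._log = []  # entries: (event_time, key, old_value), oldest first

    def get(self, key):
        return self._fields[key]

    def set(self, event_time, key, value):
        # Save only the overwritten value, not a copy of the whole state.
        self._log.append((event_time, key, self._fields[key]))
        self._fields[key] = value

    def rollback(self, to_time):
        # Undo, newest first, every write made by an event at or after to_time.
        while self._log and self._log[-1][0] >= to_time:
            _, key, old_value = self._log.pop()
            self._fields[key] = old_value

    def fossil_collect(self, gvt):
        # No rollback can reach before GVT, so older log entries are discarded.
        self._log = [entry for entry in self._log if entry[0] >= gvt]

    def log_size(self):
        return len(self._log)


state = IncrementalState(queue_len=0, served=0)
state.set(10, "queue_len", 1)   # event at time 10
state.set(20, "served", 1)      # event at time 20 writes two fields
state.set(20, "queue_len", 0)
state.rollback(20)              # a straggler with timestamp 20 arrives
```

After the rollback, the state is what it was once the event at time 10 had
been processed, and only the log entry for that event remains. The space cost
grows with the number of fields actually written rather than with the total
size of the state, which is the source of the time and space savings.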