Flow Size Prediction With Short Time Gaps

Date
2024-06-03
Abstract
A priori knowledge of network flow sizes is invaluable in network traffic control. Previous efforts to estimate flow sizes have focused on long flows, where each flow is delimited by a large time gap in the sequence of packets. However, many network control mechanisms, such as load balancing and rate control, perform better when operating over flowlets: short flows separated by small time gaps in the sequence of packets. In this work, using extensive measurements, we investigate the feasibility of predicting the size of short flows, where the flow duration can be on the order of microseconds. Specifically, we deploy several popular workloads in a public cloud testbed and collect both network and host traces for each workload. The network trace contains standard packet metadata, while the host trace contains high-level host statistics (e.g., memory usage and disk I/O) and low-level function call traces (e.g., malloc(), send()) captured during the execution of each workload via host instrumentation using eBPF. These traces are then used to train machine learning models for flow size prediction with time gaps ranging from microseconds to milliseconds. Our results indicate that (1) it is feasible to predict short flow sizes with high accuracy, i.e., with percentage error in the 0-12% range, and (2) the low-level traces lead to a 10-20% improvement in prediction accuracy compared to using the network and high-level traces alone.
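To make the notion of a flowlet concrete, the following is a minimal sketch of how a packet sequence might be split into flowlets by a time-gap threshold, with the size of each flowlet summed in bytes. It is an illustration only, not the thesis's implementation: the function name `flowlet_sizes`, the `(timestamp_us, size_bytes)` packet representation, and the rule that a gap strictly greater than the threshold starts a new flowlet are all assumptions.

```python
from typing import List, Tuple


def flowlet_sizes(packets: List[Tuple[int, int]], gap_us: int) -> List[int]:
    """Split one connection's packets into flowlets and return their sizes.

    packets: (timestamp_us, size_bytes) pairs, sorted by timestamp, all
             belonging to the same connection (hypothetical representation).
    gap_us:  inter-packet gap threshold in microseconds; a gap strictly
             larger than this starts a new flowlet (assumed convention).
    """
    sizes: List[int] = []
    last_ts = None
    for ts, nbytes in packets:
        # Open a new flowlet on the first packet or after a large enough gap.
        if last_ts is None or ts - last_ts > gap_us:
            sizes.append(0)
        sizes[-1] += nbytes
        last_ts = ts
    return sizes


# Hypothetical packet sequence: timestamps in microseconds, sizes in bytes.
example_packets = [(0, 100), (10, 200), (1000, 50), (1010, 50), (500000, 10)]
short_gap = flowlet_sizes(example_packets, gap_us=100)
long_gap = flowlet_sizes(example_packets, gap_us=10000)
```

With the smaller threshold the same packet sequence yields more, shorter flows; with the larger threshold adjacent bursts merge into fewer, longer flows. The sizes produced this way are the kind of quantity a prediction model would be trained to estimate.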
Keywords
Computer Networks, Machine Learning, Flow, Network Optimization
Citation
Hosseini, S. M. (2024). Flow size prediction with short time gaps (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.