Research Article
Modeling and Generating TCP Application Workloads
@INPROCEEDINGS{10.1109/BROADNETS.2007.4550436, author={F\'{e}lix Hern\'{a}ndez-Campos and Kevin Jeffay and F. Donelson Smith}, title={Modeling and Generating TCP Application Workloads}, proceedings={4th International IEEE Conference on Broadband Communications, Networks, Systems}, publisher={IEEE}, proceedings_a={BROADNETS}, year={2010}, month={5}, keywords={}, doi={10.1109/BROADNETS.2007.4550436} }
Félix Hernández-Campos
Kevin Jeffay
F. Donelson Smith
Year: 2010
BROADNETS
IEEE
DOI: 10.1109/BROADNETS.2007.4550436
Abstract
To perform valid experiments, traffic generators used in network simulators and testbeds require contemporary models of traffic as it exists on real network links. Ideally, one would like a model of the workload created by the full range of applications running on the Internet today. Unfortunately, at best, all that is available to the research community is a small number of models for single applications or application classes such as the web or peer-to-peer. We present a method for creating a model of the full TCP application workload that generates the traffic flowing on a network link. From this model, synthetic workload traffic that is statistically similar to the traffic observed on the real link can be generated in a simulation. The model is generated automatically using only a simple packet-header trace and requires no knowledge of the actual identity or mix of TCP applications on the network. We present the modeling method and a traffic generator that will enable researchers to conduct network experiments with realistic, easy-to-update TCP application workloads. An extensive validation study is performed using Abilene and university traces. The method is validated by comparing traces of synthetically generated traffic to the original traces for a set of important measures of realism. We also show how workload models can be re-sampled to generate statistically valid randomized and rescaled variations.
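As an illustration of the re-sampling idea mentioned at the end of the abstract, the following is a minimal Python sketch, not the authors' procedure. It assumes a hypothetical `Connection` record (start time plus byte counts in each direction) already extracted from a packet-header trace. It draws connections with replacement and assigns fresh uniform start times; the function name `resample`, the `scale` parameter, and the uniform arrival assumption are all introduced here for illustration only.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical per-connection summary derived from a packet-header trace."""
    start: float          # start time, seconds from the beginning of the trace
    bytes_sent: int       # application bytes from initiator to acceptor
    bytes_received: int   # application bytes from acceptor to initiator


def resample(connections, duration, scale=1.0, seed=None):
    """Draw a randomized, rescaled workload from observed connections.

    Connections are sampled with replacement, so per-connection byte
    counts keep their empirical distribution. Start times are redrawn
    uniformly over [0, duration]; this is a simplifying assumption of
    this sketch, not a claim about the paper's method.
    """
    if not connections:
        return []
    rng = random.Random(seed)
    target = round(len(connections) * scale)  # scale > 1 raises offered load
    sampled = []
    for _ in range(target):
        c = rng.choice(connections)
        sampled.append(
            Connection(rng.uniform(0.0, duration), c.bytes_sent, c.bytes_received)
        )
    sampled.sort(key=lambda c: c.start)
    return sampled
```

Calling `resample(trace, duration=60.0, scale=2.0, seed=1)` on a list of observed connections would return roughly twice as many connections over the same 60-second window, with a fixed seed making the variation reproducible.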