Characterising a grid site's traffic

Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN; Conference contribution/Paper; peer-review

Published

Standard

Characterising a grid site's traffic. / Ma, Tiejun; El-khatib, Yehia; Mackay, Michael et al.
HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing. New York: ACM, 2010. p. 707-716.

Harvard

Ma, T, El-khatib, Y, Mackay, M & Edwards, C 2010, Characterising a grid site's traffic. in HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing. ACM, New York, pp. 707-716, The Third International Workshop on Data Intensive Distributed Computing (DIDC'10), Chicago, IL, USA, 1/01/00. https://doi.org/10.1145/1851476.1851581

APA

Ma, T., El-khatib, Y., Mackay, M., & Edwards, C. (2010). Characterising a grid site's traffic. In HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing (pp. 707-716). ACM. https://doi.org/10.1145/1851476.1851581

Vancouver

Ma T, El-khatib Y, Mackay M, Edwards C. Characterising a grid site's traffic. In HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing. New York: ACM. 2010. p. 707-716 doi: 10.1145/1851476.1851581

Author

Ma, Tiejun ; El-khatib, Yehia ; Mackay, Michael et al. / Characterising a grid site's traffic. HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing. New York : ACM, 2010. pp. 707-716

Bibtex

@inproceedings{24d252ed42354e92af92a88b0172ea27,
title = "Characterising a grid site's traffic",
abstract = "Grid computing has been widely adopted for intensive high performance computing. Since grid resources are distributed over complex large-scale infrastructures, understanding grid site data traffic behaviour is important for efficient resource utilisation, performance optimisation, and the design of future grid sites as well as traffic-aware grid applications. In this paper, we study and analyse the traffic generated at a grid site in the Large Hadron Collider (LHC) Computing Grid (LCG). We find that most of the generated traffic is TCP-based and that a small set of grid applications generate significant amounts of the data. Upon analysing the different traffic metrics, we also find that the traffic exhibits long-range dependence and self-similarity. We also investigate packet-level metrics such as throughput, packet rate, round trip time (RTT) and packet loss. Our study establishes that these metrics can be well represented by Gaussian mixture models. The findings we present in this paper will enable accurate grid site traffic monitoring and potentially on-the-fly traffic modelling and prediction. It will also lead to a better understanding of grid site{\textquoteright}s traffic behaviour and contribute to more efficient grid site planning, traffic management, data transmission protocol optimisation, and data-aware grid application design.",
author = "Tiejun Ma and Yehia El-khatib and Michael Mackay and Christopher Edwards",
year = "2010",
doi = "10.1145/1851476.1851581",
language = "English",
isbn = "978-1-60558-942-8",
pages = "707--716",
booktitle = "HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing",
publisher = "ACM",
note = "The Third International Workshop on Data Intensive Distributed Computing (DIDC'10) ; Conference date: 01-01-1900",

}

RIS

TY - GEN

T1 - Characterising a grid site's traffic

AU - Ma, Tiejun

AU - El-khatib, Yehia

AU - Mackay, Michael

AU - Edwards, Christopher

PY - 2010

Y1 - 2010

N2 - Grid computing has been widely adopted for intensive high performance computing. Since grid resources are distributed over complex large-scale infrastructures, understanding grid site data traffic behaviour is important for efficient resource utilisation, performance optimisation, and the design of future grid sites as well as traffic-aware grid applications. In this paper, we study and analyse the traffic generated at a grid site in the Large Hadron Collider (LHC) Computing Grid (LCG). We find that most of the generated traffic is TCP-based and that a small set of grid applications generate significant amounts of the data. Upon analysing the different traffic metrics, we also find that the traffic exhibits long-range dependence and self-similarity. We also investigate packet-level metrics such as throughput, packet rate, round trip time (RTT) and packet loss. Our study establishes that these metrics can be well represented by Gaussian mixture models. The findings we present in this paper will enable accurate grid site traffic monitoring and potentially on-the-fly traffic modelling and prediction. It will also lead to a better understanding of grid site’s traffic behaviour and contribute to more efficient grid site planning, traffic management, data transmission protocol optimisation, and data-aware grid application design.

AB - Grid computing has been widely adopted for intensive high performance computing. Since grid resources are distributed over complex large-scale infrastructures, understanding grid site data traffic behaviour is important for efficient resource utilisation, performance optimisation, and the design of future grid sites as well as traffic-aware grid applications. In this paper, we study and analyse the traffic generated at a grid site in the Large Hadron Collider (LHC) Computing Grid (LCG). We find that most of the generated traffic is TCP-based and that a small set of grid applications generate significant amounts of the data. Upon analysing the different traffic metrics, we also find that the traffic exhibits long-range dependence and self-similarity. We also investigate packet-level metrics such as throughput, packet rate, round trip time (RTT) and packet loss. Our study establishes that these metrics can be well represented by Gaussian mixture models. The findings we present in this paper will enable accurate grid site traffic monitoring and potentially on-the-fly traffic modelling and prediction. It will also lead to a better understanding of grid site’s traffic behaviour and contribute to more efficient grid site planning, traffic management, data transmission protocol optimisation, and data-aware grid application design.

U2 - 10.1145/1851476.1851581

DO - 10.1145/1851476.1851581

M3 - Conference contribution/Paper

SN - 978-1-60558-942-8

SP - 707

EP - 716

BT - HPDC '10 Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing

PB - ACM

CY - New York

T2 - The Third International Workshop on Data Intensive Distributed Computing (DIDC'10)

Y2 - 1 January 1900

ER -