IMDEA Networks Institute Publications Repository

Optimal Communication Structures for Big Data Aggregation

Culhane, William and Kogan, Kirill and Jayalath, Chamikara and Eugster, Patrick (2015) Optimal Communication Structures for Big Data Aggregation. In: The 34th IEEE International Conference on Computer Communications (IEEE INFOCOM 2015), 26 April - 1 May 2015, Hong Kong, China.


Aggregation of computed sets of results fundamentally underlies the distillation of information in many of today’s big data applications. Many systems have been introduced that let users obtain aggregate results by aggregating along communication structures such as trees, but they do not focus on improving performance by optimizing the underlying structure used to perform the aggregation. We consider two cases of the problem: aggregation of (1) single blocks of data and (2) streaming input. For each case we determine which metric of “fast” completion is most relevant and mathematically model the resulting systems, based on aggregation trees, to optimize that metric. Our assumptions and model are laid out in depth. From our model we determine how to create a provably ideal aggregation tree (i.e., with optimal fanin) using only limited information about the aggregation function being applied. Experiments in the Amazon Elastic Compute Cloud (EC2) confirm the validity of our models in practice.
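
The trade-off the abstract refers to, between a tree's fanin and its depth, can be illustrated with a small sketch. This is not the model from the paper: the linear per-node cost in `toy_latency`, its parameters `per_node_overhead` and `per_input_cost`, and all function names are hypothetical and chosen only for illustration. A larger fanin gives fewer levels but more work per node, so under such a cost some intermediate fanin tends to minimize the end-to-end latency.

```python
from functools import reduce
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def tree_levels(num_leaves: int, fanin: int) -> int:
    """Levels needed to reduce num_leaves inputs to one with the given fanin."""
    if fanin < 2:
        raise ValueError("fanin must be at least 2")
    levels = 0
    remaining = num_leaves
    while remaining > 1:
        remaining = -(-remaining // fanin)  # ceiling division
        levels += 1
    return levels


def aggregate_tree(values: Sequence[T], fanin: int,
                   combine: Callable[[T, T], T]) -> T:
    """Aggregate values level by level, each node combining up to fanin inputs."""
    if fanin < 2:
        raise ValueError("fanin must be at least 2")
    level = list(values)
    if not level:
        raise ValueError("values must be non-empty")
    while len(level) > 1:
        level = [reduce(combine, level[i:i + fanin])
                 for i in range(0, len(level), fanin)]
    return level[0]


def toy_latency(num_leaves: int, fanin: int,
                per_node_overhead: float = 1.0,
                per_input_cost: float = 0.5) -> float:
    """Hypothetical cost: each level costs overhead + per_input_cost * fanin.

    ASSUMPTION: this linear per-node cost is illustrative only and is not
    taken from the paper.
    """
    return tree_levels(num_leaves, fanin) * (
        per_node_overhead + per_input_cost * fanin)


def best_fanin(num_leaves: int) -> int:
    """Fanin in [2, num_leaves] that minimizes toy_latency (exhaustive search)."""
    candidates = range(2, max(2, num_leaves) + 1)
    return min(candidates, key=lambda f: toy_latency(num_leaves, f))
```

With 64 leaves, for example, a fanin of 2 yields 6 levels while a fanin of 64 yields a single, very wide level; under the toy cost above, neither extreme is the cheapest. The paper derives its optimum from properties of the aggregation function, which this sketch does not attempt to capture.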

Item Type: Conference or Workshop Papers (Paper)
Depositing User: Kirill Kogan
Date Deposited: 22 Jan 2015 10:40
Last Modified: 30 Nov 2016 09:45
