RESEARCH PRODUCT
MCD: Overcoming the Data Download Bottleneck in Data Centers
Viktor Gottfried, André Brinkmann, Dirk Meister, Jürgen Kaiser
subject
Computer science; Download; business.industry; Distributed computing; computer.file_format; Network topology; Bottleneck; File server; Packet loss; Server; Reliable multicast; business; BitTorrent; computer; Computer network
description
The data download problem in data centers describes the increasingly common task of coordinated loading of identical data to a large number of nodes. Data download is seen as a significant problem in exascale HPC applications. Uncoordinated reading from a central file server creates contention at the file server and its network interconnect. We propose and evaluate a reliable multicast-based approach to solve the data download problem. The MCD system builds a logical multi-rooted tree based on the physical network topology and uses the logical view for a two-phase approach. In the first phase, the data is multicast to all nodes. In the second phase, the logical tree is used for efficient error correction. We evaluate the approach against Twitter's Murder, a BitTorrent-based data download solution used to deploy code binaries to thousands of nodes. The evaluation features a simulation of up to 10,000 nodes and shows that MCD finishes the reliable data download significantly faster. The simulation results are finally validated using a real-world deployment of more than 100 nodes.
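The two-phase idea in the abstract can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the MCD implementation: the abstract does not specify the repair protocol, so the sketch assumes that each node fetches its missing chunks from its parent in the logical tree, uses a single-rooted tree with a fixed fanout instead of the multi-rooted tree derived from the physical topology, and models packet loss as independent per chunk. The names `multicast_phase`, `repair_phase`, `loss_rate`, and `fanout` are hypothetical.

```python
import random


def multicast_phase(num_nodes, num_chunks, loss_rate, rng):
    """Phase 1 (toy model): the data is multicast once to all nodes.

    Assumption: each node loses each chunk independently with
    probability loss_rate. Returns one set of received chunk ids per node.
    """
    return [
        {c for c in range(num_chunks) if rng.random() >= loss_rate}
        for _ in range(num_nodes)
    ]


def repair_phase(received, num_chunks, fanout):
    """Phase 2 (toy model): repair losses along a logical tree.

    Assumptions: node 0 is the single root and holds the complete data;
    the parent of node i is (i - 1) // fanout; every node requests its
    missing chunks from its parent. Returns the number of repaired chunks.
    """
    full = set(range(num_chunks))
    received[0] = set(full)  # assumption: the root is the data source
    repairs = 0
    # Parents have smaller indices than their children, so a parent is
    # already complete by the time its children are processed.
    for node in range(1, len(received)):
        missing = full - received[node]
        repairs += len(missing)
        received[node] |= missing
    return repairs


if __name__ == "__main__":
    rng = random.Random(42)
    state = multicast_phase(num_nodes=1000, num_chunks=64, loss_rate=0.02, rng=rng)
    repaired = repair_phase(state, num_chunks=64, fanout=8)
    print(f"chunks repaired over the tree: {repaired}")
```

In this model the multicast phase delivers most chunks in a single transmission, and only the lost chunks are re-sent over tree edges, which spreads the repair load across many parents instead of concentrating it on the central file server.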
year | journal |
---|---|
2013-07-01 | 2013 IEEE Eighth International Conference on Networking, Architecture and Storage |