Research Article
Reliable Multicast Based on Erasure Resilient Codes over InfiniBand
@INPROCEEDINGS{10.1109/CHINACOM.2006.344802,
  author={Xigui Wang and Zifeng Xiao and Jizhong Han and Chengde Han},
  title={Reliable Multicast Based on Erasure Resilient Codes over InfiniBand},
  proceedings={1st International ICST Conference on Communications and Networking in China},
  publisher={IEEE},
  proceedings_a={CHINACOM},
  year={2007},
  month={4},
  keywords={Erasure Resilient Codes, InfiniBand, Multicast, Reed-Solomon code},
  doi={10.1109/CHINACOM.2006.344802}
}
Authors: Xigui Wang, Zifeng Xiao, Jizhong Han, Chengde Han
Year: 2007
Conference: CHINACOM
Publisher: IEEE
DOI: 10.1109/CHINACOM.2006.344802
Abstract
Many distributed applications and systems, e.g., an efficient implementation of a distributed cache coherence protocol in distributed shared-memory systems, require efficient, reliable, and scalable multicast capabilities from the low-level interconnect. However, the InfiniBand network, a high-performance interconnect with low latency and high bandwidth, lacks a reliable hardware multicast capability. To avoid inefficient multicast emulation with one-to-many point-to-point messages and ACKs, this paper proposes an efficient algorithm that provides reliable multicast over InfiniBand based on erasure resilient codes. The algorithm not only avoids the feedback implosion problem caused by emulating multicast with point-to-point messages, but also achieves lower latency and better scalability than automatic repeat request (ARQ) retransmission. Moreover, the algorithm can be optimized with a message pipeline mechanism to achieve the same level of latency as the unreliable InfiniBand hardware multicast. Performance analysis demonstrates that the probability of failing to recover a message is less than 1.4times10 even for a system with 1000 message receivers.
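The kind of failure-probability bound mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the paper's own analysis. It assumes an ideal erasure code (for example a Reed-Solomon code, as listed in the keywords) in which a message split into `k` data packets and sent as `n` encoded packets is recoverable from any `k` of them, and it assumes each packet is lost independently with probability `p` at each receiver. The function names and all parameter values are hypothetical.

```python
from math import comb, expm1, log1p


def receiver_failure_prob(n: int, k: int, p: float) -> float:
    """Probability that one receiver gets fewer than k of n packets.

    Assumption: each packet is lost independently with probability p,
    and any k received packets are enough to decode the message.
    """
    # Sum the binomial probabilities of receiving i = 0 .. k-1 packets.
    return sum(comb(n, i) * (1 - p) ** i * p ** (n - i) for i in range(k))


def any_receiver_fails(n: int, k: int, p: float, receivers: int) -> float:
    """Probability that at least one of `receivers` fails to decode.

    Assumption: losses at different receivers are independent.
    """
    q = receiver_failure_prob(n, k, p)
    if q >= 1.0:
        return 1.0
    # 1 - (1 - q)**receivers, computed stably when q is very small.
    return -expm1(receivers * log1p(-q))


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the paper.
    for n in (8, 10, 12):
        print(n, any_receiver_fails(n=n, k=8, p=1e-3, receivers=1000))
```

Under these assumptions, adding redundant packets (larger `n` for a fixed `k`) lowers the failure probability without any per-receiver ACK traffic, which is the intuition behind avoiding feedback implosion.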