
Research Article
A Unicast Packet Forwarding Accelerator Design Based on the RNS Algorithm
@INPROCEEDINGS{10.1007/978-3-031-67162-3_25, author={Lidan Zhu and Wen-Kang Jia}, title={A Unicast Packet Forwarding Accelerator Design Based on the RNS Algorithm}, proceedings={Communications and Networking. 18th EAI International Conference, ChinaCom 2023, Sanya, China, November 18--19, 2023, Proceedings}, proceedings_a={CHINACOM}, year={2024}, month={8}, keywords={Longest Prefix Matching (LPM) Packet Forwarding Engine (PFE) Scalar-pair Vectors Routing and Forwarding (SVRF)}, doi={10.1007/978-3-031-67162-3_25} }
- Lidan Zhu
Wen-Kang Jia
Year: 2024
A Unicast Packet Forwarding Accelerator Design Based on the RNS Algorithm
CHINACOM
Springer
DOI: 10.1007/978-3-031-67162-3_25
Abstract
The challenge of achieving the Longest Prefix Match (LPM) in IP address lookup remains a bottleneck in high-speed routers. Classless Inter-Domain Routing (CIDR) utilizes member query algorithms within Packet Forwarding Engines (PFE) to determine whether a packet's next hop should be forwarded through a specific output port. Among these, Bloom Filters (BF), Scalar Vector Routing and Forwarding (SVRF), and their improved variants have all proven efficient for rapid packet forwarding. In this paper, we propose a Unicast Packet Forwarding Accelerator design that utilizes the Chinese Remainder Theorem (CRT), Residue Number System (RNS), and SVRF. This framework associates prefix lengths with processing blocks by partitioning n-elements into 22 columns (sub-blocks) based on prefix lengths. Within each sub-block, IP addresses are parallelly computed for output port indexing, optimizing the partitioning of scalar vectors to facilitate the determination of address membership in a prefix set sorted by prefix length. This scheme optimization simultaneously addresses the Longest Prefix Match problem while reducing memory space usage and forwarding latency. Furthermore, we perform an analysis of snapshots from real-world IPV4 BGP tables to instantiate and simulate the performance of our algorithm. Simulation evaluations demonstrate that our proposed algorithm offers significant advantages in terms of both time and space efficiency under optimized partitioning.