TY - GEN
T1 - NeuroLPM - Scaling Longest Prefix Match Hardware with Neural Networks
AU - Rashelbach, Alon
AU - De Paula, Igor
AU - Silberstein, Mark
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/28
Y1 - 2023/10/28
N2 - Longest Prefix Match (LPM) engines are broadly used in computer systems, especially in modern network devices such as Network Interface Cards (NICs), switches, and routers. However, existing LPM hardware fails to scale to the millions of rules required by modern systems, is often optimized for specific applications, and is thus performance-sensitive to the structure of LPM rules. We describe NeuroLPM, a new architecture for multi-purpose LPM hardware that replaces queries in traditional memory-intensive trie and hash-table data structures with inference in a lightweight neural-network-based model called RQRMI. NeuroLPM scales to millions of rules under a small on-die SRAM budget and achieves stable, rule-structure-agnostic performance, allowing its use in a variety of applications. We solve several unique challenges in implementing RQRMI inference in hardware, including minimizing the amount of floating-point computation while maintaining query correctness, and scaling the rule-set size while ensuring small, deterministic off-chip memory bandwidth. We prototype NeuroLPM in Verilog and evaluate it on real-world packet-forwarding rule-sets and network traces. NeuroLPM offers substantial scalability benefits without any application-specific optimizations. For example, it is the only algorithm that can serve a 950K-rule set at an average of 196M queries per second with 4.5MB of SRAM, within 2% of the best-case throughput of the state-of-the-art Tree Bitmap and SAIL on smaller rule-sets. With 2MB of SRAM, it reduces the DRAM bandwidth per query, the dominant performance factor, by up to 9× and 3× compared to the state-of-the-art.
UR - http://www.scopus.com/inward/record.url?scp=85183471637&partnerID=8YFLogxK
U2 - 10.1145/3613424.3623769
DO - 10.1145/3613424.3623769
M3 - Conference contribution
T3 - Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
SP - 886
EP - 899
BT - Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
T2 - 56th Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2023
Y2 - 28 October 2023 through 1 November 2023
ER -