April 2, 2014

Libra: Divide and Conquer to Verify Forwarding Tables in Huge Networks

USENIX Symposium on Networked Systems Design and Implementation (NSDI)

By: James Hongyi Zeng, Shidong Zhang, Fei Ye, Vimal Kumar, Mickey Ju, Junda Liu, Nick McKeown, Amin Vahdat

Abstract

Data center networks often have errors in their forwarding tables, causing packets to loop indefinitely, fall into black holes, or simply get dropped before they reach the correct destination. Finding forwarding errors is possible using static analysis, but none of the existing tools scale to a large data center network with thousands of switches and millions of forwarding entries. Worse still, in a large data center network the forwarding state is constantly in flux, which makes it hard to take an accurate snapshot of the state for static analysis.

We solve these problems with Libra, a new tool for verifying forwarding tables in very large networks. Libra runs fast because it can exploit the scaling properties of MapReduce. We show how Libra can take an accurate snapshot of the forwarding state 99.9% of the time, and knows when the snapshot cannot be trusted. We show results for Libra analyzing a 10,000-switch network.
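
To make the error classes named in the abstract concrete, here is a minimal single-machine sketch of checking one destination subnet for loops and black holes. It is an illustration only, not Libra's MapReduce implementation: the function name `check_subnet`, the input layout (a map from each switch to the next hops chosen by its matching rule), and the switch names are all assumptions made for this example.

```python
from typing import Dict, List, Set


def check_subnet(next_hops: Dict[str, List[str]], dst: str) -> Dict[str, Set[str]]:
    """Check the forwarding graph of a single destination subnet.

    next_hops maps each switch to the next hops its matching rule uses
    (hypothetical layout); dst is the switch that delivers to the hosts.
    """
    # Normalize so that every switch mentioned as a next hop has an entry.
    graph = {s: list(hops) for s, hops in next_hops.items()}
    for hops in next_hops.values():
        for h in hops:
            graph.setdefault(h, [])

    # A black hole is a switch with no next hop that is not the destination.
    black_holes = {s for s, hops in graph.items() if not hops and s != dst}

    # Depth-first search: a back edge to a switch on the current path is a loop.
    state: Dict[str, int] = {}  # absent = unvisited, 1 = on path, 2 = finished
    on_loop: Set[str] = set()

    def visit(s: str, path: List[str]) -> None:
        state[s] = 1
        path.append(s)
        for nxt in graph[s]:
            if state.get(nxt) == 1:
                # Every switch from nxt to the end of the path is on the loop.
                on_loop.update(path[path.index(nxt):])
            elif nxt not in state:
                visit(nxt, path)
        path.pop()
        state[s] = 2

    for s in graph:
        if s not in state:
            visit(s, [])

    return {"loops": on_loop, "black_holes": black_holes}


if __name__ == "__main__":
    tables = {"s1": ["s2"], "s2": ["s3"], "s3": ["s2"], "s4": [], "d": []}
    print(check_subnet(tables, dst="d"))
```

In this toy input, `s2` and `s3` forward to each other and `s4` has no next hop, so the sketch reports a loop and a black hole respectively. A per-subnet check like this is independent of every other subnet, which is the kind of structure a divide-and-conquer approach can spread across many workers.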