While BGP routing datasets, consisting of raw routing data, are freely available and easy to obtain, extracting useful information from them is tedious. Currently, researchers and network operators implement their own custom data processing tools and scripts. A single tool that provides easy access to the information within large raw BGP datasets could be used by both communities to avoid rewriting these tools each time. Moreover, providing not just raw BGP messages but also some commonly used summary statistics can help guide deeper custom analyses. Based on these observations, this paper describes the first steps towards building such a scalable tool. We describe the techniques and algorithms we have used to build an efficient, generic tool called BGP-Inspect. When dealing with large datasets, dataset size, lookup speed, and data processing time are the most challenging issues. We describe our implementations of chunked compressed files and B+ tree indices that attempt to addr...
Dionysus Blazakis, Manish Karir, John S. Baras