In January, M-Lab launched a beta test of new BigQuery tables for M-Lab data. Today, M-Lab is pleased to announce that the beta test was successful. The new, faster-performing tables will be M-Lab’s new standard BigQuery tables.
Before we move on to specifics: when we say faster performing, we mean a lot faster. Certain queries that used to take over 2 hours now complete in 8 seconds. That means that playing with the data just became a lot more fun.
To help users dig into this data as quickly and seamlessly as possible, M-Lab has consolidated all of its data documentation and updated it to show how to take advantage of the new tables.
Today, M-Lab is happy to announce the public beta of new M-Lab BigQuery tables. These tables provide substantially improved performance and reduce the difficulty of writing BigQuery SQL.
The team working on archiving M-Lab data recently discovered that the M-Lab data hosted in BigQuery was affected by a bug that caused duplicates to appear in our dataset. Queries against M-Lab’s BigQuery dataset performed between May 2014 and April 2015 were impacted. The raw files in our Google Cloud Storage bucket were not.
Last October, Measurement Lab released the Internet Observatory, a data-visualization tool that enables consumers, policymakers, and researchers to better understand the impact of ISP relationships on Internet access and performance. The Observatory provides easier access to M-Lab’s rich dataset on network performance, so that users can reproduce the analysis in our report on “ISP Interconnection and its Impact on Consumer Internet Performance.”