Migration to the v2 Data Pipeline

If you typically use the measurement-lab.ndt.unified_uploads or measurement-lab.ndt.unified_downloads views, nothing will change for you. We are updating the ndt5, switch, and tcpinfo schemas, removing obsolete views, and renaming some views to prepare for improvements to ease of use and documentation.


Evolution of M-Lab's Geographic and Network Annotations

In our recent roadmap post, we shared a list of milestones that the team has been working on over the last quarter and this one. Our Datatype migration and Standardized Columns milestone references the gardener service, which maintains and reprocesses M-Lab data, as well as the UUID annotator, which generates per-connection metadata and saves it as annotations to user-conducted measurements. This post describes in more detail how these services have annotated measurements with geographic and network information, past and present, and expands on the current work mentioned in our roadmap post.


New ETL Pipeline and Transition to New BigQuery Tables

Since May 2017, the M-Lab team has been working on an updated, open-source pipeline, which pulls raw data from our servers, saves it to Google Cloud Storage, and then parses it into our BigQuery tables. The team is particularly excited about this update because it means that the pipeline no longer relies on closed-source libraries.


Transitioning to a New Backend Pipeline and Data Availability

M-Lab data is collected from distributed experiments hosted on servers all over the world, processed in a pipeline, and published for free in both raw and parsed (structured) formats. The backend processing component has served us well for many years, but it’s been showing its age recently. As M-Lab collects an increasing amount of data thanks to new partnerships, we have been concerned that it will become less reliable.

