Following the M-Lab platform upgrade in November 2019, the development team began a series of follow-up projects to make NDT data accessible to audiences with differing needs. The first step in that process was the publication of “unified views”, which present the most commonly used fields in NDT data and show only tests that meet our current best understanding of test completeness. This was one step toward Long Term Support of stable schemas for our tables and views in BigQuery. In other words, a lot of work is happening in the background to provide long term support for standard BigQuery columns across all M-Lab datasets.
If you’ve been following our blog over the last few months, you know M-Lab has been working toward a complete server platform upgrade. As of November 20, 2019, all M-Lab servers are managed by Kubernetes, running Docker container services for all experiments. This transition has greatly improved our platform management. This post addresses the short term impact on downstream data users and applications, and outlines a temporary solution and our longer term plan for new NDT tables/views.
Earlier this year, M-Lab published a blog post outlining our new ETL pipeline and transition to new BigQuery tables. That post also outlined where we’ve historically saved our datasets, tables, and views in BigQuery, and recommended tables and views for most researchers to use. At that time we also applied semantic versioning to new dataset and table releases, and began publishing BigQuery views that unify our NDT data across multiple schema iterations and migrations.