Product
April 7, 2023

Bigconfig continues to raise the bar for data monitoring as code

We're excited to announce new updates and improvements to Bigconfig, Bigeye’s industry-leading data monitoring-as-code solution.

Kendall Lovett

Last year we launched Bigconfig, the first monitoring-as-code solution to support enterprise-scale data observability. Bigconfig allows data engineering teams to define data monitoring as code and deploy it across their enterprise data pipelines in a fast, automated, and version-controlled way.

Bigeye customers love that Bigconfig allows them to…

  • use dynamic tagging to automatically deploy metrics on any new data that matches their specifications
  • apply any of Bigeye’s 60+ out-of-the-box data quality checks, or specify and customize their monitoring with full granular control
  • manage data observability from a central location and track changes with version-controlled audit logs
  • use optimized, human-readable YAML so there’s less code to write and no new language to learn
  • control their entire Bigeye operation from the command line

Since launching Bigconfig, we’ve had the opportunity to partner with dozens of top data engineering teams to develop additional improvements.

We’re excited to share what’s new.

Support for multiple Bigconfig files

Bigconfig allows teams to define data monitoring as code in a simple, human-readable YAML template. With the latest release, teams can now use multiple Bigconfig files to configure and manage monitoring for specific data sources, tables, or pipelines individually.

This is especially useful for teams that divide ownership of data quality responsibilities across different parts of their data stack. Alternatively, you may want to store different Bigconfig modules in different files. For example, you might keep your saved metric definitions in a single shared file used across teams, while each team maintains its own tag deployment files. Learn more about multiple file support.
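As a rough sketch of how a team might organize this (the file names and layout below are hypothetical; the Bigconfig documentation covers exactly how multiple files are discovered and applied):

  bigconfig/
    saved_metrics.yml      # shared saved metric definitions, owned centrally
    tags_marketing.yml     # marketing team's tag definitions and deployments
    tags_finance.yml       # finance team's tag definitions and deployments

Each file stays small and clearly owned, while version control still gives the whole organization a single audit trail of monitoring changes.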

Tags by column type

Bigconfig has always included dynamic wildcard tagging and reusable monitoring definitions so teams can define what they want to monitor and how they want to monitor it with just a few lines of code. While users can still choose to tag columns and tables by name, they can now also tag by column type.

This allows users to easily define metrics for specific types of data and apply them globally, or in conjunction with other dynamic tags, for increased granular control.

For example, one customer used a Bigconfig template file to create a tag for all tables whose names include “analytics_warehouse” and whose columns have an integer type. This allows for even more fine-tuned control over which metrics are deployed and where. Learn more about the flexibility of tag definitions.
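As an illustration only (this is not the customer's actual file, and the key names and selector syntax below are assumptions rather than the exact Bigconfig schema), a tag of that shape might look roughly like this:

  type: BIGCONFIG_FILE

  tag_definitions:
    - tag_id: ANALYTICS_WAREHOUSE_INTEGERS
      column_selectors:
        # wildcard match on the table name, combined with a column-type filter
        - name: "*.*.*analytics_warehouse*.*"
          type: INTEGER

  saved_metric_definitions:
    metrics:
      - saved_metric_id: null_rate
        metric_type:
          predefined_metric: PERCENT_NULL

  tag_deployments:
    - collection:
        name: Analytics warehouse monitoring
      deployments:
        - tag_id: ANALYTICS_WAREHOUSE_INTEGERS
          metrics:
            - saved_metric_id: null_rate

The value of the pattern is that the tag is defined once by shape (table name plus column type) rather than enumerated column by column, so any new integer column that lands in a matching table picks up the same metrics automatically.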

Auto apply on indexing

Now administrators can set Bigeye to apply Bigconfig files automatically each time sources are indexed. Indexing occurs automatically once a day and can be triggered on demand by selecting "rescan" in the catalog.

With auto-apply enabled, Bigconfig will automatically apply monitoring to any new tables or columns that match tag definitions in the Bigconfig file. This ensures all new tables get monitoring applied on day 1 and reduces the risk of new tables or columns slipping through the cracks. Learn more.

Queueing for CLI commands

In addition to the above feature enhancements, we’ve also invested in the scalability of Bigconfig and the Bigeye command-line interface. Bigconfig "apply" commands are now queued on the backend to eliminate timeout errors and ensure support at enterprise scale.
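For context, a typical Bigconfig workflow from the command line looks roughly like this (a sketch only: exact command names and flags vary by CLI version, so check the Bigeye CLI documentation before copying):

  # preview what Bigconfig would create, update, or delete; no changes are made
  bigeye bigconfig plan

  # apply the Bigconfig file(s); the apply now runs as a queued job on the backend,
  # so long-running applies against large warehouses no longer hit client-side timeouts
  bigeye bigconfig apply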

With these new enhancements, along with many other performance improvements and bug fixes, Bigconfig is now even more ready to take on enterprise data observability for your organization.

As a team of engineers, we love that Bigeye gives us the option to create version-controlled data monitoring as code with an elegant, ‘Terraform-like’ solution. With Bigconfig, we use a simple YAML file to define data monitoring rules and then let Bigeye automatically apply them across our entire data warehouse, including new tables that come online.

Simon Dong, Sr Manager, Data Engineering, Udacity

Check out an on-demand overview of Bigconfig or request a demo to see it in action.

