Introducing cross-table monitoring with Virtual Tables
Bigeye is unique among data observability tools in its ability to not only monitor widely across your tables but also monitor deeply into your most critical datasets. It’s nice to know that every table in your warehouse is getting refreshed on time, but it’s critical to know that you’re not missing any outages on the core datasets that drive your most important dashboards, in-product analytics, and machine learning models.
Now, with the release of Virtual Tables, we’re taking it even further. Teams frequently ask how Bigeye can help them monitor complex logic that spans multiple tables, and until now that was a challenge. Not anymore!
The power of Virtual Tables
Bigeye has always made it easy to monitor any given table within your data source. Advanced options like customizable Metric Templates support complex cross-column logic.
Now it’s just as easy to monitor for conditions that involve more than one table. One common example is checking foreign keys. Let’s say some critical information about a user is in the dim_user table, and some additional important information is in the fact_orders table. With Virtual Tables, you can join these tables and check for business rules that could cause outages when violated, such as an order that references a user missing from dim_user.
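To make the foreign key example concrete, here is a minimal sketch of the kind of SELECT statement one might save as a Virtual Table. It runs against an in-memory SQLite database purely for illustration. Only the table names dim_user and fact_orders come from the example above; the columns, the sample rows, and the is_orphaned flag are assumptions.

```python
import sqlite3

# Hypothetical schemas and rows; only the table names dim_user and
# fact_orders come from the example in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_user (user_id INTEGER PRIMARY KEY, plan TEXT);
    CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, amount REAL);
    INSERT INTO dim_user VALUES (1, 'free'), (2, 'pro');
    INSERT INTO fact_orders VALUES (10, 1, 25.0), (11, 2, 40.0), (12, 3, 15.0);
""")

# The kind of query one might save as a Virtual Table: every order joined to
# its user, with a flag marking orders whose user_id has no match in dim_user.
VIRTUAL_TABLE_SQL = """
    SELECT o.order_id,
           o.user_id,
           u.plan,
           CASE WHEN u.user_id IS NULL THEN 1 ELSE 0 END AS is_orphaned
    FROM fact_orders AS o
    LEFT JOIN dim_user AS u ON u.user_id = o.user_id
    ORDER BY o.order_id
"""

rows = conn.execute(VIRTUAL_TABLE_SQL).fetchall()
orphaned_order_ids = [order_id for order_id, _, _, is_orphaned in rows if is_orphaned]
print(orphaned_order_ids)
```

With a result shaped like this, a metric on the is_orphaned column (for example, its sum or average) would rise whenever orders arrive for users that are missing from the dimension table.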
Here are some monitoring goals data teams have asked us about that are now possible with Virtual Tables:
- Ensuring uniqueness of composite keys: Create a virtual table that concatenates two or more attributes used as a composite key. Enable a duplicate metric to monitor duplicates in the table.
- Measuring dimension aggregations: Write a join statement between a fact and dimension table to group rows by a dimension attribute, then monitor anomalies in that grouping. For example, create a virtual table that aggregates sales by product each week and monitor significant changes in the ratio between product categories or average revenue by product.
- Validating type 2 slowly changing dimension tables: Ensure accurate timestamps for slowly changing dimension tables by using a cartesian join of the table to itself, then add monitoring to validate active time periods.
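Two of the goals above can be sketched as plain SQL, again run against in-memory SQLite for illustration. The table names (order_items, dim_customer_history), columns, and rows are all hypothetical, and the self-join is filtered to overlapping periods rather than left as a full cartesian product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables, invented for illustration.
    CREATE TABLE order_items (order_id INTEGER, line_no INTEGER, sku TEXT);
    INSERT INTO order_items VALUES (1, 1, 'A'), (1, 2, 'B'), (2, 1, 'A'), (2, 1, 'C');

    CREATE TABLE dim_customer_history (
        customer_id INTEGER, tier TEXT, valid_from TEXT, valid_to TEXT
    );
    INSERT INTO dim_customer_history VALUES
        (7, 'bronze', '2023-01-01', '2023-06-01'),
        (7, 'silver', '2023-05-15', '2024-01-01'),
        (8, 'gold',   '2023-01-01', '2024-01-01');
""")

# Composite key: concatenate the key columns into one value so that a
# duplicate check on a single column covers the whole key.
COMPOSITE_KEY_SQL = """
    SELECT order_id || '-' || line_no AS composite_key, sku
    FROM order_items
"""
duplicate_keys = conn.execute(f"""
    SELECT COUNT(*) - COUNT(DISTINCT composite_key)
    FROM ({COMPOSITE_KEY_SQL})
""").fetchone()[0]

# Type 2 SCD: self-join the history table and keep pairs of rows for the same
# customer where the later row starts before the earlier row has ended.
# (Rows sharing an identical valid_from would need an extra tie-breaker.)
SCD2_OVERLAP_SQL = """
    SELECT a.customer_id, a.valid_from AS first_from, b.valid_from AS second_from
    FROM dim_customer_history AS a
    JOIN dim_customer_history AS b
      ON a.customer_id = b.customer_id
     AND a.valid_from < b.valid_from
     AND b.valid_from < a.valid_to
"""
overlaps = conn.execute(SCD2_OVERLAP_SQL).fetchall()

print(duplicate_keys)
print(overlaps)
```

In this sample data, order 2 has two rows for line 1, and customer 7 has a silver period that begins before the bronze period ends, so both queries surface one problem each.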
How it works
With Virtual Tables, you can encapsulate all your custom logic into something akin to a database view. Write a SQL statement, save it, and Bigeye will treat the query’s results as if they were a normal database table in your Bigeye catalog.
This “view-like” object only exists in Bigeye. Because it’s never materialized to your database, you don’t need write permissions to create it, so anyone can set up monitoring for complex conditions without waiting on — or creating more work for — the data engineering team.
Once the Virtual Table is created, it appears in your Bigeye catalog alongside all your materialized data, and you’re ready to go. Autometrics, Autothresholds, Grouped Metrics, custom Metric Templates, and every other feature in Bigeye work out of the box on your Virtual Table exactly as if it were a materialized table.
Much more scalable, much less work
But why design Virtual Tables instead of just asking users to write custom SQL for each business rule? Scalability.
Virtual Tables allow you to define as much business logic as you need while keeping it all in a single object, separate from your monitoring configuration. Business logic goes in the Virtual Table, and the rest of your monitoring setup happens like usual, with Autometrics and Autothresholds eliminating the repetitive work.
With Virtual Tables, you don’t end up with an ever-growing pile of custom metrics, and you don’t have to contort your business logic into boolean conditions. Virtual Tables let you define numeric results that Bigeye’s Autothresholds anomaly detection system can monitor, giving you finer measurement and more context than simple pass-fail results.
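As a rough illustration of a numeric result rather than a pass-fail check, the sketch below computes each product category's share of weekly revenue. The tables (dim_product, fact_sales), columns, and rows are hypothetical, and SQLite stands in for a real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact and dimension tables, invented for illustration.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (product_id INTEGER, week TEXT, revenue REAL);
    INSERT INTO dim_product VALUES (1, 'hardware'), (2, 'software');
    INSERT INTO fact_sales VALUES
        (1, '2024-W01', 300.0), (2, '2024-W01', 100.0),
        (1, '2024-W02', 100.0), (2, '2024-W02', 100.0);
""")

# One row per week and category, carrying a numeric revenue_share column
# instead of a true/false verdict.
REVENUE_SHARE_SQL = """
    WITH by_category AS (
        SELECT s.week, p.category, SUM(s.revenue) AS revenue
        FROM fact_sales AS s
        JOIN dim_product AS p ON p.product_id = s.product_id
        GROUP BY s.week, p.category
    ),
    by_week AS (
        SELECT week, SUM(revenue) AS total
        FROM by_category
        GROUP BY week
    )
    SELECT c.week, c.category, c.revenue, c.revenue / w.total AS revenue_share
    FROM by_category AS c
    JOIN by_week AS w ON w.week = c.week
    ORDER BY c.week, c.category
"""

shares = {
    (week, category): share
    for week, category, _, share in conn.execute(REVENUE_SHARE_SQL)
}
print(shares)
```

A column like revenue_share gives an anomaly detector a continuous value to track over time, so a shift from 0.75 to 0.5 shows up as a measurable change rather than a single failed rule.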
And because Virtual Tables are designed to work as if they were materialized, they can even be combined with other Bigeye features like Deltas. Want to compare some business logic about user data in Postgres, and confirm that the same logic holds up once the data has landed in Snowflake? Write a Virtual Table to capture the business logic on each, then run a Delta on the pair of Virtual Tables. This saves you from having to write a long list of one-off checks to cover each condition.
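The idea of comparing the same business logic across two sources can be sketched as follows. This is only an illustration of the concept, not how Bigeye's Deltas are implemented: two in-memory SQLite databases stand in for Postgres and Snowflake, and the users table, its columns, and the make_source helper are hypothetical.

```python
import sqlite3

def make_source(rows):
    """Create an in-memory database standing in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, country TEXT, is_active INTEGER)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)
    return conn

# The same business logic is run against both sources.
BUSINESS_LOGIC_SQL = """
    SELECT country, COUNT(*) AS active_users
    FROM users
    WHERE is_active = 1
    GROUP BY country
"""

# Hypothetical data: user 3 is active in the source but not in the target.
source = make_source([(1, "US", 1), (2, "US", 1), (3, "DE", 1), (4, "DE", 0)])
target = make_source([(1, "US", 1), (2, "US", 1), (3, "DE", 0), (4, "DE", 0)])

source_result = dict(source.execute(BUSINESS_LOGIC_SQL).fetchall())
target_result = dict(target.execute(BUSINESS_LOGIC_SQL).fetchall())

# Report every group whose value differs between the two sides.
mismatches = {
    country: (source_result.get(country), target_result.get(country))
    for country in sorted(set(source_result) | set(target_result))
    if source_result.get(country) != target_result.get(country)
}
print(mismatches)
```

Because the logic lives in one query per side, the comparison covers every group it produces without a separate hand-written check for each condition.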
How to get started with Virtual Tables
If you’d like to learn more, check out our documentation on Virtual Tables. Or if you’re ready to see how Bigeye can help your team address data quality and create more reliable data pipelines, we’d love to give you a demo.