I recently wrote about the role data observability plays in generating value from data by providing an environment for monitoring its quality and reliability. Data observability is a critical functional aspect of Data Operations, alongside the development, testing and deployment of data pipelines and data orchestration, as I explained in our Data Observability Buyers Guide. Maintaining data quality and trust is a perennial data management challenge, one that often prevents organizations from operating at the speed of business. Myriad new software providers have emerged in recent years with products designed to address this challenge through automation, including Monte Carlo.
Monte Carlo was founded in 2019 to provide data engineers with tools for monitoring the quality and reliability of data used for analytics and governance, much as IT engineers rely on observability platforms that track metrics, traces and logs to identify and resolve software and infrastructure failures and performance problems. Like other early data observability pioneers, the company invested in automation to handle the growing range of data sources and the volume of data involved in data-driven decision-making, differentiating its products from traditionally manual data quality tools.

As I previously explained, while data quality software is concerned with the suitability of the data to a given task, data observability is concerned with the reliability and health of the overall data environment. Data observability tools monitor not just the data in an individual environment for a specific purpose at a given point in time, but also the associated upstream and downstream data pipelines. In doing so, data observability software ensures that data is available and up to date, avoiding downtime caused by lost or inaccurate data resulting from schema changes, system failures or broken data pipelines. Whereas data quality software is designed to help users identify and resolve problems related to the validity of the data itself, data observability software is designed to automate the detection and identification of the causes of data quality problems. As such, data observability can potentially enable users to prevent data quality issues before they occur.
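To make that distinction concrete, the following is a minimal illustrative sketch in Python of the kind of automated freshness and volume checks a data observability tool might run against warehouse metadata. The table statistics, the six-hour freshness window and the alerting thresholds are all hypothetical assumptions for illustration, not a description of Monte Carlo's implementation.

```python
from datetime import datetime, timedelta, timezone
import statistics

# Hypothetical metadata a data observability tool might collect by
# querying the warehouse (e.g., daily row counts and max(updated_at)).
daily_row_counts = [10_210, 9_987, 10_450, 10_102, 9_876, 10_331, 4_012]
last_updated = datetime(2023, 6, 1, 2, 15, tzinfo=timezone.utc)
now = datetime(2023, 6, 1, 14, 0, tzinfo=timezone.utc)

# Freshness check: alert if the table has not been updated within the
# expected window (a 6-hour SLA is assumed here purely for illustration).
FRESHNESS_SLA = timedelta(hours=6)
if now - last_updated > FRESHNESS_SLA:
    print(f"FRESHNESS ALERT: last update was {now - last_updated} ago")

# Volume check: alert if today's row count deviates from the recent mean
# by more than three standard deviations. Commercial tools typically learn
# such thresholds from historical metadata rather than hard-coding them.
history, today = daily_row_counts[:-1], daily_row_counts[-1]
mean, stdev = statistics.mean(history), statistics.stdev(history)
if abs(today - mean) > 3 * stdev:
    print(f"VOLUME ALERT: {today} rows vs. expected ~{mean:.0f} +/- {3 * stdev:.0f}")
```

The point of the sketch is the automation: checks like these run continuously against pipeline metadata, rather than a user inspecting a dataset for fitness at a single point in time.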
Monte Carlo was initially focused primarily on issues related to the nature of the data—its freshness, quality, volume, schema and lineage—but is increasingly also focusing its attention on the additional aspect of the performance of the data pipelines that deliver it.
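The schema pillar can be illustrated with a similarly hedged sketch: comparing the columns a table currently exposes against a previously recorded snapshot to detect drift before it breaks downstream consumers. The column names and types below are hypothetical, and real tools would read them from warehouse metadata rather than literals.

```python
# Hypothetical schema snapshots: the previously recorded columns and the
# columns the table exposes now (in practice these come from warehouse
# metadata such as information_schema).
recorded = {"order_id": "BIGINT", "amount": "DECIMAL", "created_at": "TIMESTAMP"}
observed = {"order_id": "BIGINT", "amount": "VARCHAR", "updated_at": "TIMESTAMP"}

added = observed.keys() - recorded.keys()
removed = recorded.keys() - observed.keys()
retyped = {col for col in recorded.keys() & observed.keys()
           if recorded[col] != observed[col]}

# Each detected change is the kind of schema drift that can silently
# break downstream pipelines if it goes unnoticed.
for label, cols in (("added", added), ("removed", removed), ("type changed", retyped)):
    if cols:
        print(f"SCHEMA ALERT ({label}): {sorted(cols)}")
```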
Another recently announced key feature takes data observability closer to the origins of potential problems. In April 2023, Monte Carlo introduced integration with Fivetran's data movement and transformation platform, extending monitoring upstream so that data teams can detect and address data quality issues closer to the point at which data is ingested.
The new capabilities introduced by Monte Carlo in recent months reflect its drive to make data observability a core component of any data strategy and thereby improve trust in data. Enterprises that embrace data observability have the potential to improve the quality of data as it is generated and processed, rather than checking for quality problems after the event. I recommend that enterprises looking to improve trust in data and enable greater adoption of data-driven decision-making evaluate the potential of data observability software and include Monte Carlo in their assessments.
Regards,
Matt Aslett