Data observability is a hot topic. I have written about the importance of data observability for ensuring healthy data pipelines, and have covered multiple vendors with data observability capabilities, offered both as standalone products and as part of a larger data engineering system. Data observability software provides an environment that takes advantage of machine learning and DataOps to automate the monitoring of data quality and reliability. The term has been adopted by multiple vendors across the industry, and while they all have key functionality in common – including collecting and measuring metrics related to data quality and data lineage – there is also room for differentiation. A prime example is Acceldata, which takes the position that data observability requires monitoring not only data and data pipelines but also the underlying data processing compute infrastructure as well as data access and usage.
Acceldata was founded in 2018 by former executives and engineers of Apache Hadoop-specialist Hortonworks. The founders identified an opportunity to help organizations monitor and manage the reliability of data pipelines and data infrastructure by developing a product to address data infrastructure scaling and performance issues. Acceldata’s research and development has been fueled by $43.5 million in funding, including an $8.5 million Series A round and, most recently, a $35 million Series B round provided by Insight Partners, March Capital, Lightspeed, Sorenson Ventures and Emergent Ventures.
Acceldata’s focus on data infrastructure scaling and performance issues is a differentiator among data observability specialists, many of which are focused specifically on automating data quality and lineage monitoring. Acceldata’s software also manages the cost and performance of data projects, and monitors the quality and reliability of the data itself.
Data observability as a product segment is still nascent but is attracting standalone data observability software specialists as well as the inclusion of data observability functionality in wider data-operations platforms. I assert that, through 2025, data observability will continue to be a priority for the evolution of data operations products as vendors deliver more automated approaches to data engineering and improving trust in enterprise data.
Acceldata announced the general availability of Acceldata Data Observability Cloud in August 2022. The product provides a platform for monitoring data compute infrastructure, reliability, pipelines and users. It offers alerts, audits and reports for data platforms, including Databricks, Hadoop, Kafka and Snowflake. It monitors for anomalies, such as missing, late or erroneous data, and provides information related to cost controls and predictions.
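To make the idea of monitoring for missing, late or erroneous data concrete, here is a minimal sketch in Python. It is not Acceldata's implementation or API; the `Batch` record, the `classify_batch` function, the 15-minute grace period and the minimum row count are all hypothetical names and thresholds chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Batch:
    """One expected delivery of data into a pipeline (hypothetical record)."""
    name: str
    expected_by: datetime
    arrived_at: Optional[datetime]  # None means the batch never arrived
    row_count: int = 0


def classify_batch(batch: Batch,
                   min_rows: int = 1,
                   grace: timedelta = timedelta(minutes=15)) -> str:
    """Label a batch as 'missing', 'late', 'erroneous' or 'ok'."""
    if batch.arrived_at is None:
        return "missing"
    if batch.arrived_at > batch.expected_by + grace:
        return "late"
    if batch.row_count < min_rows:
        # Assumption: an unexpectedly small batch is treated as erroneous.
        return "erroneous"
    return "ok"


deadline = datetime(2022, 8, 1, 6, 0)
batches = [
    Batch("orders", deadline, datetime(2022, 8, 1, 5, 50), row_count=1200),
    Batch("payments", deadline, datetime(2022, 8, 1, 7, 30), row_count=900),
    Batch("refunds", deadline, None),
    Batch("returns", deadline, datetime(2022, 8, 1, 5, 55), row_count=0),
]
statuses = {b.name: classify_batch(b) for b in batches}
```

A real product would learn expected arrival times and volumes from history rather than rely on fixed thresholds, but the classification step has the same shape.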
While Data Observability Cloud is available as a cloud managed service, Acceldata recently announced a new open source version of its data platform, as well as six tools and utilities used to support its data observability functionality, enabling organizations to begin building data products without upfront licensing or subscription costs. The company also offers two self-managed software services: Pulse and Torch. Pulse provides standalone data infrastructure monitoring for Hadoop environments that delivers utilization, scheduling and capacity planning capabilities as well as performance optimization recommendations for Apache Spark jobs and Apache Hive queries. Torch is a standalone product for data profiling, automated data quality management and data pipeline monitoring that provides data reliability observability, including data reconciliation and data drift and anomaly detection. Torch delivers automated alerts based on user-defined data quality policies as well as artificial intelligence-based recommendations to remediate data quality issues. Providing both data infrastructure and data reliability observability enables Acceldata to correlate data for improved root-cause analysis and also expands the applicability of Acceldata beyond data engineers and IT professionals to include system resource engineers and data architects.
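Data reconciliation and data drift detection can be illustrated with a short Python sketch. This is an assumption-laden illustration and not how Torch works internally: `reconcile`, `drift_score` and the threshold of three standard deviations are hypothetical, and the drift measure used here (shift of the mean relative to the baseline standard deviation) is just one simple option among many.

```python
from statistics import mean, stdev


def reconcile(source_count: int, target_count: int, tolerance: int = 0) -> bool:
    """Return True when source and target row counts agree within tolerance."""
    return abs(source_count - target_count) <= tolerance


def drift_score(baseline: list, current: list) -> float:
    """Shift of the current mean, measured in baseline standard deviations."""
    spread = stdev(baseline)
    shift = abs(mean(current) - mean(baseline))
    if spread == 0:
        return 0.0 if shift == 0 else float("inf")
    return shift / spread


baseline_values = [10, 12, 11, 13, 9]   # e.g. last week's order amounts
current_values = [20, 21, 19]           # e.g. today's order amounts

# Hypothetical user-defined policy: alert when the mean moves by more
# than three baseline standard deviations.
drift_alert = drift_score(baseline_values, current_values) > 3.0
counts_match = reconcile(source_count=100, target_count=97)
```

In this example the current values sit far from the baseline, so `drift_alert` is raised, and the row counts differ by three, so reconciliation fails unless a tolerance is configured.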
As organizations strive to become more data-driven, the ability to rely on data used to make business decisions is more important than ever. Building trust in data requires the orchestration of data pipelines to automate and accelerate the flow of data from multiple sources to support analytics initiatives and drive business value. Data observability ensures that data used for analytics and governance projects is fit for its purpose. The monitoring of data quality and data lineage is already well-established as a data management discipline to ensure that data used for business decision-making is reliable.
Traditionally, data quality software has provided users an environment to manually check and correct data quality issues. This can be time-consuming, delaying time to insight. Almost two-thirds (64%) of participants in our Analytics and Data Benchmark Research cited reviewing data for quality issues as being one of the most time-consuming aspects of analytics initiatives, second only to preparing data for analysis. In contrast, data observability takes advantage of machine learning and DataOps to automate the monitoring of data used for analytics projects to ensure that it is complete, valid and consistent as well as relevant and free from duplication.
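As a rough illustration of what automating these checks means, the Python sketch below counts incomplete, invalid and duplicated records in a small data set. The `quality_report` function, the field names and the validation rule are hypothetical and are not drawn from any vendor's product; a data observability tool would typically infer such rules with machine learning instead of requiring them to be hand-written.

```python
from collections import Counter


def quality_report(rows, key, required, validators):
    """Count incomplete, invalid and duplicate records in a list of dicts.

    rows:       list of records (dicts)
    key:        field that should uniquely identify a record
    required:   fields that must be present and non-empty
    validators: mapping of field name to a function returning True if valid
    """
    incomplete = sum(
        1 for r in rows
        if any(r.get(field) in (None, "") for field in required)
    )
    invalid = sum(
        1 for r in rows
        if any(r.get(field) is not None and not check(r[field])
               for field, check in validators.items())
    )
    key_counts = Counter(r.get(key) for r in rows)
    duplicates = sum(count - 1 for count in key_counts.values() if count > 1)
    return {
        "total": len(rows),
        "incomplete": incomplete,
        "invalid": invalid,
        "duplicates": duplicates,
    }


records = [
    {"id": 1, "email": "a@example.com", "amount": 10.0},
    {"id": 2, "email": "", "amount": 5.0},                # incomplete
    {"id": 2, "email": "b@example.com", "amount": -3.0},  # duplicate id, invalid amount
]
report = quality_report(
    records,
    key="id",
    required=["id", "email"],
    validators={"amount": lambda value: value >= 0},
)
```

Running checks like these on every pipeline run, rather than reviewing data by hand, is what removes the manual review step that research participants described as time-consuming.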
Adoption of data observability software remains nascent, but it has been a hot topic during 2022 due to the importance of ensuring the quality of data used in analytic projects as well as the reliability of data infrastructure and data pipelines. These factors are likely to become increasingly important to businesses as data volumes continue to grow and organizations become increasingly reliant on data-driven decision-making. The launch of Acceldata’s Open Source Data Platform also has the potential to fuel adoption of data observability in general, and the Data Observability Cloud managed service in particular. I recommend that organizations exploring approaches to improve the reliability of data infrastructure and trust in data evaluate Acceldata.
Regards,
Matt Aslett
Matt Aslett leads the software research and advisory for Analytics and Data at ISG Software Research, covering software that improves the utilization and value of information. His focus areas of expertise and market coverage include analytics, data intelligence, data operations, data platforms, and streaming and events.
Ventana Research’s Analyst Perspectives are fact-based analysis and guidance on business,
Each is prepared in accordance with Ventana Research’s strict standards for accuracy and objectivity and reviewed to ensure it delivers reliable and actionable insights. It is reviewed and edited by research management and is approved by the Chief Research Officer; no individual or organization outside of Ventana Research reviews any Analyst Perspective before it is published. If you have any issue with an Analyst Perspective, please email it to ChiefResearchOfficer@isg-research.net