The 2023 Ventana Research Buyers Guide for Data Orchestration enables me to provide observations about how the market has advanced.
Data orchestration is a concept that has been growing in popularity in the past five years amid the rise of DataOps, which describes more agile approaches to data integration and data management. Data orchestration provides the capabilities to automate and accelerate the flow of data from multiple sources to support operational and analytics initiatives and drive business value.
At the highest level of abstraction, data orchestration covers three key capabilities: collection (including data ingestion, preparation and cleansing); transformation (which also includes integration and enrichment); and activation (making the results available to compute engines, analytics and data science tools, or operational applications).
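To make the three capabilities concrete, here is a minimal sketch in plain Python. The function names, record fields and the in-memory `sink` list are hypothetical stand-ins chosen for illustration; they do not come from the research or from any specific product, and a real pipeline would read from and write to actual systems.

```python
def collect(raw_rows):
    """Collection: ingest raw rows and cleanse them.

    Rows with a missing customer or amount are dropped; whitespace is
    stripped and amounts are converted to floats.
    """
    cleaned = []
    for row in raw_rows:
        if row.get("customer") and row.get("amount") not in (None, ""):
            cleaned.append(
                {"customer": row["customer"].strip(), "amount": float(row["amount"])}
            )
    return cleaned


def transform(rows, regions):
    """Transformation: enrich each customer with a region and sum amounts."""
    totals = {}
    for row in rows:
        key = row["customer"]
        entry = totals.setdefault(
            key,
            {"customer": key, "region": regions.get(key, "unknown"), "total": 0.0},
        )
        entry["total"] += row["amount"]
    return list(totals.values())


def activate(records, sink):
    """Activation: hand results to a downstream consumer (here, a plain list)."""
    sink.extend(records)
    return len(records)


# Hypothetical sample data, including two incomplete rows that get cleansed away.
raw = [
    {"customer": " acme ", "amount": "10.5"},
    {"customer": "acme", "amount": "4.5"},
    {"customer": "", "amount": "3"},
    {"customer": "globex", "amount": None},
    {"customer": "initech", "amount": "7"},
]
regions = {"acme": "EMEA"}  # assumed lookup table used for enrichment

sink = []
count = activate(transform(collect(raw), regions), sink)
```

Each stage takes the output of the previous one, which is the dependency structure that orchestration tools are designed to schedule, automate and monitor.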
This may sound very much like the tasks that data management practitioners have been undertaking for decades. As such, it is fair to ask what separates data orchestration from traditional approaches to data management.
Key to understanding why data orchestration is different, and necessary, is viewing data management challenges through the lens of modern data-processing requirements. Data-driven organizations stand to gain competitive advantage, responding faster to worker and customer demands for more innovative, data-rich applications and personalized experiences.
Being data-driven requires a combination of people, processes, information and technology improvements involving data culture, data literacy, data democracy, and data curiosity. Encouraging employees to discover and experiment with data is a key aspect of being data-driven that requires new, agile approaches to data management.
Meanwhile, the increasing reliance on real-time data processing is driving requirements for more agile, continuous data processing. Additionally, the rapid adoption of cloud computing has fragmented where data is accessed or consolidated, with data increasingly spread across multiple data centers and cloud providers.
Traditional approaches to data management are rooted in point-to-point batch data processing, whereby data is extracted from its source, transformed for a specific purpose, and loaded into a target environment for analysis. These approaches are unsuitable for the demands of modern analytics environments, which instead require agile data pipelines that can traverse multiple data-processing locations and can evolve in response to changing data sources and business requirements.
Given the increasing complexity of evolving data sources and requirements, there is a need to enable the flow of data across the organization through new approaches to the creation, scheduling, automation and monitoring of workflows. This is the realm of data orchestration, although the key capabilities of data orchestration will be familiar to existing data practitioners. Specific tasks related to these capabilities have traditionally been addressed with a variety of tools as well as manual effort, hand-coded scripts and expertise.
In comparison, data orchestration tools are designed to automate and coordinate the sequential or parallel execution of a complete set of tasks via data pipelines, typically based on directed acyclic graphs (DAGs) that represent the relationships and dependencies between the tasks. The capabilities delivered by data orchestration fall under three categories: pipeline monitoring, pipeline management, and workflow management.
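The DAG idea described above can be sketched with Python's standard-library `graphlib` module (Python 3.9 or later). This is only an illustration of dependency-ordered execution, not how any particular orchestration product is implemented; the task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "ingest_orders": set(),
    "ingest_customers": set(),
    "clean_orders": {"ingest_orders"},
    "join": {"clean_orders", "ingest_customers"},
    "publish": {"join"},
}


def run_pipeline(dependencies, run_task):
    """Run tasks in dependency order and return the batches that were executed.

    Tasks in the same batch do not depend on each other, so an orchestrator
    could run them in parallel; this sketch runs them one after another.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose dependencies are all done
        for task in ready:
            run_task(task)
        sorter.done(*ready)  # unblocks tasks that were waiting on this batch
        batches.append(ready)
    return batches


executed = []
batches = run_pipeline(dependencies, executed.append)
```

In this sketch the two ingestion tasks land in the first batch because neither depends on the other, while `join` waits until both of its upstream tasks have finished. Production tools layer scheduling, retries and monitoring on top of the same ordering principle.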
As is often the case with new approaches to data and analytics, the requirements for data orchestration were first experienced by digital-native brands at the forefront of data-driven business strategies. One of the most prominent data orchestration tools, Apache Airflow, began as an internal development project within Airbnb, becoming an Apache Software Foundation project in 2016; workflow automation platform Flyte was originally created and subsequently open-sourced by Lyft; and Metaflow was developed and open-sourced by Netflix.
Data orchestration is not just for digital natives, however, and a variety of vendors have sprung up with offerings based around these open-source projects, as well as other development initiatives, to bring the benefits of data orchestration to the masses.
In addition to stand-alone data orchestration software products and cloud services, data orchestration capabilities are also being built into larger data-engineering platforms addressing broader data management requirements, including data observability, often in the context of data fabric and data mesh.
Whether stand-alone or embedded in larger data-engineering platforms, data orchestration has the potential to drive improved efficiency and agility in data and analytics projects. Data orchestration addresses one of the most significant impediments to generating value from data. More than two-thirds (69%) of participants in Ventana Research’s Analytics and Data Benchmark Research cite preparing data for analysis as the most time-consuming task in analyzing data.
Adoption of data orchestration is still in the early stages and is closely linked to larger data transformation efforts that introduce greater agility and flexibility. However, I assert that by 2026, more than one-half of organizations will adopt data orchestration technologies to automate and coordinate data workflows and increase efficiency and agility in data and analytics projects.
If an organization’s data processes and skills remain rooted in traditional products and manual intervention, then data orchestration is not likely to be a quick fix. However, alongside the cultural and organizational changes involved in people, processes, and information improvements, data orchestration has the potential to play a key role in the technological improvement involved in becoming more data-driven. I recommend that all organizations investigate the potential advantages of data orchestration with a view to improving their use of data and analytics.
This research evaluates the following vendors, whose products address key elements of data orchestration as we define it: Alteryx, AWS, Astronomer, BMC, Databricks, DataKitchen, Google, Hitachi Vantara, IBM, Infoworks.io, Matillion, Microsoft, Prefect, Rivery, Saagie, SAP, Stonebranch, StreamSets and Y42.
You can find more details on our site as well as in the Buyers Guide Market Report.
Regards,
Matt Aslett
Matt Aslett leads the software research and advisory for Analytics and Data at ISG Software Research, covering software that improves the utilization and value of information. His focus areas of expertise and market coverage include analytics, data intelligence, data operations, data platforms, and streaming and events.