Streaming data has been part of the industry landscape for decades but has largely been focused on niche applications in segments with the highest real-time data processing and analytics performance requirements, such as financial services and telecommunications. As demand for real-time interactive applications becomes more pervasive, streaming data is becoming a more mainstream pursuit, aided by the proliferation of open-source streaming data and event technologies, which have lowered the cost and technical barriers to developing new applications that take advantage of data in motion. Ventana Research’s Streaming Data Dynamic Insights enables an organization to assess its relative maturity in achieving value from streaming data. I assert that by 2024, more than one-half of all organizations’ standard information architectures will include streaming data and event processing, allowing organizations to be more responsive and provide better customer experiences.
This trend is also impacting the vendor landscape. While streaming data was initially the preserve of specialist providers, increasingly we see data vendors
DataStax was founded in 2010 to build a business around Apache Cassandra, a distributed, non-relational database developed at Facebook, released as open source in 2008 and accepted into the Apache Incubator the following year. One of the early NoSQL database projects, Apache Cassandra gained adoption thanks to its ability to manage large data volumes at scale with high availability and fault tolerance. DataStax was the first company to provide a commercial support subscription for Apache Cassandra, an offering now known as Luna for Apache Cassandra. The company also launched its own DataStax Enterprise commercial distribution with added security and other enterprise features.
In addition to contributing to the development of both Apache Cassandra and DataStax Enterprise, the company also acquired capabilities for graph data processing as well as cloud management services. In 2020, it launched the Astra DB managed database-as-a-service offering. DataStax also recognized the growing demand for stream data processing as a complement to high-performance database workloads, and in 2021 announced the acquisition of cloud messaging managed service provider Kesque to support distributed event streaming, expanding its purview to address the management and processing of data in motion as well as at rest. DataStax supports hundreds of customers, including Audi, Barracuda Networks, Capital One, ESL Gaming, Macy’s, Saab and U.S. Bank. It has also raised more than $340 million in funding, including a recent $115 million funding round led by the growth equity business within Goldman Sachs Asset Management.
As I recently explained, the range of use cases for which NoSQL databases are a valid option has grown in recent years, thanks to evolving functionality
The acquisition of Kesque helped DataStax expand its addressable market with the addition of streaming data processing functionality and expertise based on the Apache Pulsar open-source project, which provides a cloud-native platform for publish-and-subscribe messaging and serverless stream data processing. DataStax now offers the Astra Streaming managed service, which is available on Amazon Web Services, Microsoft Azure and Google Cloud Platform.
For those that prefer to self-manage, the company also offers Luna Streaming, a commercially supported distribution of Apache Pulsar for deployment on-premises or on cloud infrastructure. Both offerings are targeted primarily at developers to enable them to create applications that take advantage of streamed data. While Astra Streaming and Astra DB can be used independently of each other, DataStax also highlights the benefits of using them in combination. Astra Streaming can be used to build pipelines to transport data into and out of the Astra DB database-as-a-service in real time, with Astra Streaming serving as a core component of CDC for Astra DB, providing change data capture functionality to synchronize data from Astra DB to other data platforms and applications as it is updated.
The Astra DB managed service is now DataStax’s flagship database, delivering global scalability with data replication across multiple cloud providers, regions and availability zones, without the need for manual configuration or database sizing. DataStax Enterprise also provides native Kubernetes support for deployment on-premises or on multiple clouds, albeit with associated management and operations requirements. Both Astra DB and DataStax Enterprise support Storage Attached Indexing, which enables the creation of multiple secondary indexes on the same database table. Both also support Stargate, an open-source data application programming interface gateway designed to abstract Cassandra-specific concepts, making it easier for application developers to interact with data using APIs including GraphQL, REST, schemaless JSON and gRPC.
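To make the change data capture pattern described above more concrete, the following is a minimal sketch in plain Python. It does not use any DataStax or Apache Pulsar API: the `Broker`, `SourceTable` and `ChangeEvent` names, the `orders-changes` topic and the event shape are all hypothetical stand-ins I have chosen for illustration. The idea it shows is simply that every write to a source table is published as an event on a topic, and any number of subscribers can apply those events to keep a downstream copy in sync.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class ChangeEvent:
    """One row-level change (hypothetical shape, not a real CDC format)."""
    op: str            # "upsert" or "delete"
    key: str
    value: Any = None


class Broker:
    """In-memory stand-in for a publish-and-subscribe messaging service."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Callable[[ChangeEvent], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[ChangeEvent], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: ChangeEvent) -> None:
        # Deliver the event to every subscriber of the topic, in order.
        for handler in self._subscribers[topic]:
            handler(event)


class SourceTable:
    """In-memory stand-in for a database table that emits an event per write."""

    def __init__(self, broker: Broker, topic: str) -> None:
        self.rows: Dict[str, Any] = {}
        self._broker = broker
        self._topic = topic

    def upsert(self, key: str, value: Any) -> None:
        self.rows[key] = value
        self._broker.publish(self._topic, ChangeEvent("upsert", key, value))

    def delete(self, key: str) -> None:
        # Only emit a delete event if the row actually existed.
        if key in self.rows:
            del self.rows[key]
            self._broker.publish(self._topic, ChangeEvent("delete", key))


def make_replica_handler(replica: Dict[str, Any]) -> Callable[[ChangeEvent], None]:
    """Return a subscriber that applies change events to a downstream copy."""
    def apply(event: ChangeEvent) -> None:
        if event.op == "upsert":
            replica[event.key] = event.value
        elif event.op == "delete":
            replica.pop(event.key, None)
    return apply


def run_demo():
    broker = Broker()
    table = SourceTable(broker, "orders-changes")  # hypothetical topic name
    replica: Dict[str, Any] = {}
    audit_log: List[ChangeEvent] = []

    # Two independent consumers of the same change feed.
    broker.subscribe("orders-changes", make_replica_handler(replica))
    broker.subscribe("orders-changes", audit_log.append)

    table.upsert("o-1", {"status": "new"})
    table.upsert("o-2", {"status": "new"})
    table.upsert("o-1", {"status": "shipped"})
    table.delete("o-2")
    return table.rows, replica, audit_log
```

Calling `run_demo()` returns the source rows, the replica and the audit log, so the replica can be compared against the source after the writes. A real deployment would add durable topics, ordering guarantees and delivery acknowledgements, none of which this sketch attempts to model.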
While Apache Cassandra and DataStax Enterprise have long been viable enterprise database platforms, DataStax has made significant progress in recent years in making them easier for developers to work with, both through the Stargate API gateway and through managed cloud services that reduce operational requirements. The addition of support for streaming data via Astra Streaming has further expanded the company’s addressable market, with CDC for Astra DB illustrating how the two can be used in combination. I recommend that organizations consider DataStax when evaluating data platforms to support applications requiring the processing of data in motion and at rest.
Regards,
Matt Aslett