ISG Software Research Analyst Perspectives

Databricks Scores Massive Funding Round, Continues to Expand Its Offerings

Written by David Menninger | Jan 29, 2025 11:00:00 AM

Databricks recently announced its Series J funding round, raising $10 billion at a valuation of $62 billion. The round was led by Thrive Capital alongside high-profile investors such as Andreessen Horowitz and Insight Partners. The company intends to invest this capital in new artificial intelligence (AI) products, acquisitions and significant expansion of its international operations. In the announcement, Databricks reported that it expects to achieve an annual revenue run rate of $3 billion in the quarter ending January 31, 2025.

Founded in 2013, Databricks first gained prominence for its cloud-based Apache Spark services, aimed at enhancing big data processing and creating an alternative to MapReduce. Spark was started by Databricks co-founder Matei Zaharia at UC Berkeley’s AMPLab in 2009, and in 2013 the project was donated to the Apache Software Foundation. The Spark framework includes Spark SQL, DataFrames, Spark Streaming, MLlib and GraphX. By providing robust analytics over massive data sets, Spark became the de facto way to process big data. Although Databricks’ commercial operations focus exclusively on providing managed cloud services, it has also supported the broader open-source Spark community.

Over time, the worlds of data lakes and data warehouses collided. Databricks introduced the concept of a data lakehouse, adding Databricks SQL as well as open table formats. My colleague Matt Aslett has written about the importance of open table formats and Databricks’ support for both Delta Lake and, more recently, Apache Iceberg, including its acquisition of Tabular.

Over the years, Databricks evolved beyond its initial offerings. The company’s Data Intelligence Platform is now positioned as a lakehouse-based environment for data engineering, data warehousing, stream data processing, data governance, data sharing, business intelligence (BI), data science and AI. While MLlib provided machine learning (ML) capabilities, the company doubled down on its investment in AI with the acquisition of MosaicML. Databricks also set its sights on generative AI (GenAI) and large language models (LLMs) when it introduced DBRX, an open-source, general-purpose LLM, along with supporting enhancements aimed at intelligent data applications. Databricks’ Unity Catalog provides a single layer for governing structured and unstructured data, as well as ML models, notebooks, dashboards and files. Through a combination of its own research and development efforts and its acquisition strategy, Databricks is able to offer state-of-the-art tools for enterprises aiming to build, train and deploy AI/ML models and LLMs using its Lakehouse Platform.

As a result, Databricks has been rated Exemplary in our Analytic Data Platforms, AI Platforms and GenAI Platforms Buyers Guides. These initiatives underline a clear strategy to support enterprises in the various stages and types of AI investments, combining multiple models and components to create high-quality AI applications supported by and integrated with the necessary underlying data infrastructure. This approach should prove instrumental as organizations focus on transitioning from pilot projects to scalable production systems. Our research shows that only 15% of enterprises are fully in production with their GenAI efforts.

One of the company’s hallmark events, the Data + AI Summit, has established itself as a nexus for industry stakeholders, offering insight into cutting-edge applications of data and AI technologies. The most recent Summit focused on GenAI integrations, including functionality that allows users to streamline the development of AI-based applications within the lakehouse environment. The emphasis was on announcing new capabilities such as the Mosaic AI Agent Framework and enhancements to the Lakehouse Platform for AI and data. The launch of Databricks AI/BI also represents the company’s entry into the self-service analytics market with two new AI-powered capabilities: low-code dashboarding and a conversational interface.

Databricks also unveiled LakeFlow to bring all types of data engineering processes together in a unified platform. Building on Databricks’ acquisition of Arcion as well as its existing Delta Live Tables and Databricks Workflows functionality, LakeFlow automates ingestion, transformation and monitoring, simplifying the overall experience for data teams. Such innovations are vital as businesses continue to grapple with the complexities of managing and optimizing data pipelines across various environments. Databricks was also rated Exemplary in our Data Intelligence, Data Integration and Data Governance Buyers Guides.

Shortly after the event, Databricks announced the general availability of Databricks Assistant, which can be used to generate code, provide help and troubleshoot errors. It can also be used to create visualizations and dashboards, although this feature is still in preview.

In conclusion, Databricks has made significant strides from its initial focus on Spark processing for big data, becoming an industry leader that unifies data and AI under a single platform. Its current push into GenAI, together with the substantial funding it has raised, provides the resources for continued investment. As enterprises consider their data and AI strategies, I recommend they include Databricks in their evaluations.

Regards,

David Menninger