Databricks Explained
Databricks, Inc.
Industry: Computer software
Location City: San Francisco, California
Location Country: United States
Revenue: $1.6 billion (2023)[1]
Num Employees: (2023)[2]
Databricks, Inc. is a global data, analytics and artificial intelligence company founded by the original creators of Apache Spark.[3]
The company provides a cloud-based platform to help enterprises build, scale, and govern data and AI, including generative AI and other machine learning models.[4]
Databricks pioneered the data lakehouse, a data and AI platform that combines the capabilities of a data warehouse with a data lake, allowing organizations to manage and use both structured and unstructured data for traditional business analytics and AI workloads.[5]
In November 2023, Databricks unveiled the Databricks Data Intelligence Platform, a new offering that combines the unification benefits of the lakehouse with MosaicML’s Generative AI technology to enable customers to better understand and use their own proprietary data.[6]
The company develops Delta Lake, an open-source project to bring reliability to data lakes for machine learning and other data science use cases.[7]
History
Databricks grew out of the AMPLab project at the University of California, Berkeley, which was involved in creating Apache Spark, an open-source distributed computing framework built atop Scala. The company was founded by Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia,[8] Patrick Wendell, and Reynold Xin.
In November 2017, the company was announced as a first-party service on Microsoft Azure through the Azure Databricks integration.[9] In February 2021, together with Google Cloud, Databricks provided integration with Google Kubernetes Engine and Google's BigQuery platform.[10] By this time, the company said more than 5,000 organizations used its products.[11]
Fortune ranked Databricks as one of the best large "Workplaces for Millennials" in 2021.[12]
Acquisitions
Much of the company's expansion has come through acquisition. In June 2020, it acquired Redash, an open-source tool designed to help data scientists and analysts visualize their data and build interactive dashboards.[13] Its second acquisition was the German no-code company 8080 Labs, the maker of bamboolib, a data exploration tool that requires no coding to use.[14] The third acquisition, in May 2023, was the data security group Okera, which extended Databricks' data governance capabilities.[15] The next month, the company bought the open-source generative AI startup MosaicML for $1.4 billion.[16] [17] In October of that year, Databricks acquired data replication startup Arcion for $100 million.[18] In what is believed to be its sixth acquisition, the company bought Tabular, a data-management system used by open source AI, for over $1 billion.[19]
In response to the popularity of OpenAI's ChatGPT, the company introduced an open-source language model in March 2023, named Dolly after Dolly the sheep, that developers could use to create their own chatbots. The model uses fewer parameters to produce results similar to ChatGPT's, but Databricks had not released formal benchmark tests showing whether it actually matched ChatGPT's performance.[20] [21] [22]
Databricks reported $1.6 billion in revenue for the 2023 fiscal year, more than doubling its previous level.[23]
Funding
In September 2013, Databricks announced it had raised $13.9 million from Andreessen Horowitz and said it aimed to offer an alternative to Google's MapReduce system.[24] [25] Microsoft was a noted investor in Databricks in 2019, participating in the company's Series E for an unspecified amount.[26] [27] The company has raised $1.9 billion in funding, including a $1 billion Series G led by Franklin Templeton at a $28 billion post-money valuation in February 2021. Other investors include Amazon Web Services, CapitalG (a growth equity firm under Alphabet Inc.) and Salesforce Ventures. In August 2021, Databricks completed its eighth round of funding, raising $1.6 billion at a valuation of $38 billion.[28]
Products
Databricks develops and sells a cloud data platform using the marketing term "lakehouse", a portmanteau based on the terms "data warehouse" and "data lake".[37] Databricks' lakehouse is based on the open source Apache Spark framework that allows analytical queries against semi-structured data without a traditional database schema.[38] In October 2022, Lakehouse received FedRAMP authorized status for use with the U.S. federal government and contractors.[39]
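The idea of querying semi-structured data without a predefined schema can be illustrated with a minimal sketch. This is plain Python using only the standard library, not Spark or any Databricks API; the records, field names, and helper functions are hypothetical and chosen only to show fields being discovered at read time.

```python
import json
from collections import defaultdict

# Hypothetical JSON-lines input: records do not share a fixed set of fields.
raw_lines = [
    '{"user": "a", "event": "click", "ms": 120}',
    '{"user": "b", "event": "view"}',
    '{"user": "a", "event": "click", "ms": 80, "device": "mobile"}',
]

def read_records(lines):
    """Parse each line as JSON; no schema is declared up front."""
    return [json.loads(line) for line in lines]

def infer_fields(records):
    """Collect every field seen, with the Python type names observed for it."""
    fields = {}
    for rec in records:
        for key, value in rec.items():
            fields.setdefault(key, set()).add(type(value).__name__)
    return fields

def avg_by(records, group_key, value_key):
    """Average value_key per group, skipping records that lack the field."""
    sums = defaultdict(lambda: [0.0, 0])
    for rec in records:
        if value_key in rec:
            acc = sums[rec[group_key]]
            acc[0] += rec[value_key]
            acc[1] += 1
    return {group: total / count for group, (total, count) in sums.items()}

records = read_records(raw_lines)
schema = infer_fields(records)           # fields discovered while reading
latency = avg_by(records, "event", "ms") # analytical query over sparse fields
```

Here the set of fields is derived from the data itself, and the aggregate simply ignores records that lack the queried field.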
Databricks' Delta Engine launched in June 2020 as a new query engine that layers on top of Delta Lake to boost query performance.[40] It is compatible with Apache Spark and MLflow, which are also open source projects Databricks employees helped create.[41]
In November 2020, Databricks introduced Databricks SQL (previously known as SQL Analytics) for running business intelligence and analytics reporting on top of data lakes. Analysts can query data sets directly with standard SQL or use product connectors to integrate directly with business intelligence tools such as Holistics, Tableau, Qlik, Sigma Computing, Looker, and ThoughtSpot.[42]
Databricks offers a platform for other workloads, including machine learning, data storage and processing, streaming analytics, and business intelligence.[43]
The company has also created Delta Lake, MLflow and Koalas, open source projects that span data engineering, data science and machine learning.[44] [45] In addition to building the Databricks platform, the company has co-organized massive open online courses about Spark[46] and a conference for the Spark community called the Data + AI Summit,[47] formerly known as Spark Summit.
In early 2024, Databricks released a portfolio of new tools to help customers customize, fine-tune or build their own AI systems, including Mosaic AI Vector Search, which enables companies to build retrieval-augmented generation (RAG) models; Mosaic AI Model Serving, a unified service for deploying, governing, querying and monitoring models fine-tuned or pre-deployed by Databricks; and Mosaic AI Pretraining, a platform for enterprises to create their own LLMs.[48]
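As a rough illustration of what vector search contributes to a RAG workflow, the sketch below ranks documents by cosine similarity to a query vector and places the best matches into a prompt. It is a toy example in plain Python, not the Mosaic AI Vector Search API; the three-dimensional embeddings, document names, and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_matches(query_vec, index, k=2):
    """Return the k document names whose vectors are most similar to the query."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in index.items()]
    scored.sort(reverse=True)
    return [doc for _, doc in scored[:k]]

# Hypothetical embeddings; a real system would obtain these from an embedding model.
index = {
    "refund policy": [0.9, 0.1, 0.0],
    "shipping times": [0.1, 0.9, 0.1],
    "warranty terms": [0.7, 0.2, 0.3],
}
query = [1.0, 0.0, 0.1]

# Retrieval step: pick the closest documents, then pass them to a model as context.
context = top_matches(query, index, k=2)
prompt = "Answer using only this context: " + "; ".join(context)
```

In a production setting the linear scan would typically be replaced by an approximate nearest-neighbor index, which is the part a managed vector search service is meant to handle.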
In March 2024, Databricks released DBRX, an open source foundation model. It relies on a mixture-of-experts architecture and is built on the MegaBlocks open source project.[49]
DBRX cost $10 million to create. At the time of launch, it was the fastest open source LLM, based on commonly used industry benchmarks. It beat other models such as Llama 2 at solving logic puzzles and answering general knowledge questions, among other tasks. Although the model has 132 billion parameters, it uses only 36 billion on average to generate outputs.[50]
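The reason a mixture-of-experts model uses only part of its parameters per output can be sketched as follows. This is a toy illustration of top-k routing, not DBRX's actual implementation: the expert functions, gate scores, and parameter counts are all hypothetical.

```python
import math

def softmax(scores):
    """Convert raw gate scores into probabilities."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]

def moe_forward(x, experts, gate_scores, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, [i for i, _ in routes]

def active_fraction(shared_params, params_per_expert, num_experts, k):
    """Share of all parameters that are touched when k experts are active."""
    total = shared_params + num_experts * params_per_expert
    active = shared_params + k * params_per_expert
    return active / total

# Hypothetical experts: each just scales its input by a different factor.
experts = [lambda x, scale=s: scale * x for s in (1.0, 2.0, 3.0, 4.0)]
output, used = moe_forward(1.0, experts, [0.1, 2.0, 0.3, 1.5], k=2)

# Hypothetical sizes: 10 shared units, 8 experts of 20 units each, 2 active.
fraction = active_fraction(10, 20, 8, 2)
```

Only the experts selected by the gate are evaluated, so the compute per output scales with the active parameters rather than the total.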
DBRX also serves as a foundation for companies to build or customize their own AI models. Companies can also use proprietary data to generate higher-quality outputs for specific use cases.[51]
Operations
Databricks is headquartered in San Francisco.[52] It also has operations in Canada, the United Kingdom, the Netherlands, Singapore, Australia, Germany, France, Japan, China, South Korea, India, Brazil, Switzerland, Costa Rica and Serbia.[53]
Notes and References
- Lin, Belle (2024-03-06). "AI is Driving Record Sales at Multibillion-Dollar Databricks. An IPO Can Wait …". Archived at https://archive.today/20240306145258/https://www.wsj.com/articles/ai-is-driving-record-sales-at-multibillion-dollar-databricks-an-ipo-can-wait-f8a55bd4 on 2024-03-06 (subscription required).
- Driebusch, Corrie (2023-07-29). "The Tech CEO Who Uses His Phone the Old-Fashioned Way". Archived at https://archive.today/20240228035031/https://www.wsj.com/articles/ali-ghodsi-databricks-phone-3cec6486 on 2024-02-28 (subscription required).
- Saul, Derek (September 14, 2023). "Top IPO Prospect Databricks Scores $43 Billion Valuation Thanks To $500 Million Funding Round Including AI Titan Nvidia". Forbes. Retrieved 2024-03-26.
- Sullivan, Mark (March 19, 2024). "How Databricks is helping customers develop their own customized AI models". Fast Company. Retrieved 2024-03-19.
- Clark, Lindsay (November 16, 2023). "Databricks' lakehouse becomes foundation under fresh layer of AI dreams". The Register. Retrieved 2023-11-16.
- Cai, Kenrick (November 16, 2023). "Databricks' New AI Product Adds A ChatGPT-Like Interface To Its Software". Forbes. Retrieved 2023-11-16.
- "Databricks launches Delta Lake, an open source data lake reliability project". VentureBeat. 2019-04-24. Retrieved 2021-04-06.
- Zaharia, Matei. "Matei Zaharia". Retrieved 2016-08-16.
- "Microsoft makes Databricks a first-party service on Azure". TechCrunch. 15 November 2017. Retrieved 2021-04-06.
- "Databricks brings its lakehouse to Google Cloud". TechCrunch. 17 February 2021. Retrieved 2021-02-18.
- Konrad, Alex (February 2, 2021). "Databricks Raises $1 Billion At $28 Billion Valuation, With The Cloud's Elite All Buying In". Forbes. Retrieved July 29, 2021.
- "100 Best Large Workplaces for Millennials". Fortune. June 16, 2021. Retrieved 2021-07-16.
- "Databricks acquires Redash, a visualizations service for data scientists". TechCrunch. 24 June 2020. Retrieved 2021-04-06.
- Rosenbaum, Eric (October 6, 2021). "$38 billion software start-up Databricks makes acquisition to leave code behind". CNBC. Retrieved February 20, 2022.
- Palazzolo, Stephanie (May 3, 2023). "Exclusive: $38 billion data and AI darling Databricks acquires security startup Okera". Business Insider. Archived at https://web.archive.org/web/20230503195102/https://www.businessinsider.com/data-artificial-intelligence-startup-databricks-acquire-governance-security-okera-2023-5 on May 3, 2023 (subscription required).
- Datta, Tiyashi; Hu, Krystal (June 26, 2023). "Databricks strikes $1.3 billion deal for generative AI startup MosaicML". Reuters.
- Council, Stephen (June 26, 2023). "SF tech firm Databricks to buy 2-year-old startup for $21 million per employee". SFGATE.
- "After $43B valuation, Databricks acquires data replication startup Arcion for $100M". TechCrunch. 2023-10-23. Retrieved 2023-10-23.
- Galloni, Allessandra (5 June 2024). "Databricks to buy data management firm Tabular for over $1 bln". Reuters.
- Hu, Krystal; Nellis, Stephen (March 24, 2023). "Databricks pushes open-source chatbot as cheaper ChatGPT alternative". Archived at https://web.archive.org/web/20230325141855/https://www.reuters.com/technology/databricks-pushes-open-source-chatbot-cheaper-chatgpt-alternative-2023-03-24/ on March 25, 2023.
- Loften, Angus (March 24, 2023). "Databricks Launches 'Dolly,' Another ChatGPT Rival". Archived at https://archive.today/20230324125524/https://www.wsj.com/amp/articles/databricks-launches-dolly-another-chatgpt-rival-31fd0f5f on March 24, 2023 (subscription required).
- Goldman, Sharon (March 24, 2023). "Databricks debuts ChatGPT-like Dolly, a clone any enterprise can own". Archived at https://web.archive.org/web/20230411011910/https://venturebeat.com/ai/databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own/ on April 11, 2023.
- Miller, Ron; Wilhelm, Alex (7 March 2024). "Databricks keeps marching forward with $1.6B in revenue". TechCrunch. Retrieved 8 March 2024.
- Harris, Derrick (September 25, 2013). "Databricks raises $14M from Andreessen Horowitz, wants to take on MapReduce with Spark". Retrieved September 28, 2014. Archived at https://web.archive.org/web/20220115071749/https://gigaom.com/2013/09/25/databricks-raises-14m-from-andreessen-horowitz-wants-to-take-on-mapreduce-with-spark/ on January 15, 2022.
- Lorica, Ben (September 25, 2013). "Databricks aims to build next-generation analytic tools for Big Data". O'Reilly Media. Retrieved September 28, 2014.
- "Databricks raises $250M at a $2.75B valuation for its analytics platform". TechCrunch. 5 February 2019. Retrieved 2021-04-08.
- Novet, Jordan (2019-02-05). "Microsoft used to scare start-ups but is now an 'outstandingly good partner,' says Silicon Valley investor Ben Horowitz". CNBC. Retrieved 2021-04-06.
- Mellor, Chris (2021-09-01). "Databricks raises data lake of cash at monstrous $38bn valuation". Blocks & Files. Retrieved 2021-09-04.
- Miller, Ron (June 30, 2014). "Databricks Snags $33M In Series B And Debuts Cloud Platform For Processing Big Data". TechCrunch. Retrieved September 28, 2014.
- Shieber, Jonathan (15 December 2016). "Databricks raises $60 million to be big data's next great leap forward". TechCrunch. Retrieved 2016-12-16.
- "Databricks Secures $140 Million to Accelerate Analytics and Artificial Intelligence in the Enterprise". Databricks. 22 August 2017. Retrieved 2019-05-16.
- "Databricks' $250 Million Funding Supports Explosive Growth and Global Demand for Unified Analytics; Brings Valuation to $2.75 Billion". Databricks. 5 February 2019. Retrieved 2019-02-05.
- "Databricks announces $400M round on $6.2B valuation as analytics platform continues to grow". TechCrunch. 22 October 2019. Retrieved 2019-10-24.
- "Databricks raises $1B at $28B valuation as it reaches $425M ARR". TechCrunch. February 2021. Retrieved 2021-02-14.
- "Databricks raises $1.6B at $38B valuation as it blasts past $600M ARR". TechCrunch. 2021-07-01.
- Nishant, Niket; Hu, Krystal (2023-09-14). "Databricks raises over $500 mln at $43 bln valuation". Reuters. Retrieved 2023-09-20.
- Armbrust, Michael; Ghodsi, Ali; Xin, Reynold; Zaharia, Matei (January 2021). "Lakehouse: A New Generation of Open Platforms that Unify Data Warehousing and Advanced Analytics". Retrieved July 29, 2021.
- "With massive $1B infusion, Databricks takes aim at IPO and rival Snowflake". SiliconANGLE. 2021-02-01. Retrieved 2021-04-08.
- Simone, Stephanie (2022-10-17). "Databricks achieves FedRAMP Authorized status". KMWorld. Retrieved 2022-10-20.
- "Databricks Cranks Delta Lake Performance, Nabs Redash for SQL Viz". Datanami. 2020-06-24. Retrieved 2021-04-08.
- "Databricks launches Delta Lake, an open source data lake reliability project". VentureBeat. 2019-04-24. Retrieved 2021-04-08.
- "Databricks launches SQL Analytics". TechCrunch. 12 November 2020. Retrieved 2021-04-08.
- Brust, Andrew. "Databricks, champion of data 'lakehouse' model, closes $1B series G funding round". ZDNet. Retrieved 2021-04-08.
- "The Two Sigma Ventures Open Source Index". Two Sigma Ventures. Retrieved 2021-04-08.
- "MLOps Tools - Ranking". OSS Insight. Retrieved 2024-04-03.
- "Databricks to run two massive online courses on Apache Spark". Databricks. 2014-12-02. Retrieved 2016-12-16.
- "Data + AI Summit". Databricks. Retrieved 2021-04-08.
- "Riding the data-powered AI wave: Inside Databricks’ unified stack solution". Databricks. 2024-03-14. Retrieved 2024-04-05.
- "Databricks open-sources its own large language model, DBRX". Databricks. 2024-03-27. Retrieved 2024-04-05.
- "Inside the Creation of the World’s Most Powerful Open Source AI Model". Databricks. 2024-03-27. Retrieved 2024-04-05.
- "Databricks’ new open-source AI model could offer enterprises a leaner alternative to OpenAI’s GPT-3.5". Databricks. 2024-03-27. Retrieved 2024-04-05.
- CNBC.com staff (2020-06-16). "36. Databricks". CNBC. Retrieved 2021-04-08.
- "Worldwide locations". Retrieved 2022-10-20.