DataCebo
DataCebo is a technology company.
Financial History
DataCebo has raised $9.0M across 1 funding round.
Frequently Asked Questions
How much funding has DataCebo raised?
DataCebo has raised $9.0M in total across 1 funding round.
DataCebo is a technology company that builds SDV Enterprise, an AI-powered synthetic data platform that lets enterprises generate privacy-preserving synthetic datasets for machine learning, data analysis, and application development.[1][2][4] It serves industries such as finance, banking, insurance, and healthcare, addressing data privacy risks, regulatory compliance, and the tedious manual creation of test data. Its generative AI models, rooted in MIT research, produce realistic synthetic data that mirrors the statistical properties, formats, and relationships of real data, allowing teams to complete 90% of data work without exposing real data.[1][2][3][4] The company has raised seed funding reported at between $7M and $8.5M, maintains a popular open-source SDV library with over 1 million downloads, and counts enterprise adopters such as ING Belgium, which achieved a 100x improvement in test coverage.[1][3][4][5]
DataCebo emerged from MIT's Data to AI Lab, where co-founders Kalyan Veeramachaneni (CEO) and Neha Patki began developing generative models for structured tabular and relational data in 2016.[2][3] Their work led to the open-source Synthetic Data Vault (SDV) library, which passed 1 million downloads by 2020; community feedback and bug fixes helped validate the technology.[2][3] Seeing enterprise demand, especially in finance, the two commercialized it and launched SDV Enterprise in late 2023 as a scalable version that handles up to 100 tables, beyond the limits of the open-source library. The launch was backed by $8.5M in seed funding co-led by Link Ventures and Zetta Venture Partners.[2][3][5]
DataCebo benefits from a surge in synthetic data use, driven by AI adoption, privacy regulations (e.g., GDPR), and data scarcity. Examples from 2024 include OpenAI's Voice Engine, Google's Gemma and AlphaGeometry, and Microsoft's Phi-4, which was trained mostly on synthetic data.[5] The timing is favorable: following the wave of interest in LLMs, enterprises need secure alternatives to real data for training and testing, especially in high-stakes fields such as finance (e.g., money laundering detection with 20K synthetic alerts) and healthcare.[2][3][4][5] Market forces, including rising AI compute costs, bias risks in real data, and demand for scalable tabular synthesis, favor DataCebo over tools focused on text and images. The company also shapes the ecosystem by open-sourcing its foundations (1M+ downloads) and enabling faster, safer AI pipelines.[1][2][5]
DataCebo is positioned to expand SDV Enterprise with features such as constraint-augmented generation and time-series enhancements, targeting deeper enterprise adoption in finance and insurance while exploring new sectors.[5] Trends such as multi-modal synthetic data and improved quality frameworks (e.g., Microsoft's) could amplify its role, potentially leading to acquisitions or larger funding rounds as AI shifts toward privacy-centric training.[5] Its position could evolve from open-source pioneer to category leader if synthetic data becomes as routine as real data in enterprise stacks, as its work on ING's payment testing suggests.[4][5]
DataCebo's investors include Basis Set Ventures, Next47, Zetta Venture Partners, and Pawan Deshpande.
DataCebo's most recent round was a $9.0M seed round in December 2023.
| Date | Round | Investors |
|---|---|---|
| Dec 1, 2023 | $9.0M Seed | Basis Set Ventures, Next47, Zetta Venture Partners, Pawan Deshpande |