
§ Private Profile · San Francisco, CA, USA
Data for LLMs through Competition
Key people at Sylvian.
Sylvian was founded in 2025 by Niall Kehoe (Founder) and William Huang (Founder).
LLMs continue to need large amounts of expert data. Unfortunately, existing data vendors like Scale or Mercor rely on part-time pay, which does not motivate the best experts. Competitions are stronger intrinsic motivators and naturally reward high-quality data.
Sylvian creates expert data for LLMs through competitions, starting with tool use data (e.g. Excel, VSCode). We already have 4,500+ experts, from IMO gold medalists to MIT/Stanford PhDs to full-time quantitative researchers (QRs) at Point72, producing data at 1B tokens/week!
Sylvian is a startup founded in 2025 that specializes in generating expert-level data for large language models (LLMs) through competitive crowdsourcing. Its core product involves hosting competitions where top experts contribute high-quality data focused initially on tool use, such as Excel and VSCode, to improve LLM training. Sylvian serves AI developers and organizations building LLMs by providing them with cutting-edge, expert-curated datasets that are difficult to obtain through traditional data vendors. The company has rapidly scaled its expert community to over 4,500 participants, including highly skilled individuals like IMO Gold medalists and PhDs, producing data at a rate of about 1 billion tokens per week, demonstrating strong growth momentum[2][1].
Sylvian was founded in 2025 by William Huang and Niall Kehoe, both of whom have distinguished backgrounds in competitive problem-solving and computer science. William Huang is an International Physics Olympiad Gold medalist, and Niall Kehoe has won international coding contests from a young age. Their combined experiences at Stanford, Harvard Medical School, and leading tech and finance firms inspired them to address a critical bottleneck in LLM development: the scarcity of expert data. They identified that existing data vendors failed to sufficiently motivate top experts, so they created a competition-based platform to engage these experts through leaderboards and prestige, rather than just part-time pay. This approach quickly gained traction, attracting a large, high-caliber expert community[2][1].
Sylvian is positioned at the intersection of two major trends: the explosive growth of LLMs and the increasing demand for high-quality, expert-generated training data. As LLMs scale, they require more specialized and nuanced datasets, especially for tasks involving tool use and real-world applications. Sylvian’s competition-driven data sourcing model addresses the limitations of traditional data vendors by tapping into motivated expert communities, thus accelerating LLM capabilities. This approach aligns with the broader AI ecosystem’s shift toward reinforcement learning and expert-in-the-loop data curation, making Sylvian a key enabler in advancing practical AI applications[2][1].
Looking ahead, Sylvian is likely to expand its data domains beyond initial tool use to cover other expert-driven tasks, potentially becoming a critical infrastructure provider for LLM training data. As AI models become more integrated into enterprise workflows, demand for high-quality, domain-specific data will increase, favoring Sylvian’s competitive crowdsourcing model. Their ability to scale expert engagement and maintain data quality will be pivotal. The company’s influence may grow as it helps shape how expert knowledge is harnessed for AI, potentially setting new standards for data sourcing in the AI industry[2][1].