# Not a Crypto Founder. A Scientist Who Found a Use for Bittensor

![](https://assets.louisebeattie.com/claims-most-important-knowledge-post.png)

Every year the world spends around $3 trillion producing science, and almost none of it is in a form an AI can reason over. Three hundred million papers, written for humans, sitting just out of reach of the models that could put them to work. Philipp Koellinger calls this the most valuable knowledge we have. **Claims, the subnet he's building on Bittensor, is built to unlock it.**

## Not a crypto founder

Koellinger isn't a crypto founder who found a use for science. He's a scientist who found a use for Bittensor. He spent years leading a 300-strong consortium that fixed a replication crisis in behavioural genetics and built the data standard that field still runs on, work published in *Nature* and *Science*. He runs DeSci Labs, and before Claims he built SciWeave, a research assistant that answers scientific questions from real papers with real citations. Now he wants to do for the whole scientific literature what he once did for one corner of it.

His interest in incentives didn't start with crypto. Koellinger and his co-founder Christian Roessler grew up in East Berlin and watched the GDR collapse, a failure he puts down largely to bad incentives. **"Once you've seen such a grand failure, you can't unsee it."** He spent years watching Bittensor from the sidelines, seeing in it what he calls an incentive design language for any digital commodity, and after a Bittensor side event at Token2049 in Singapore decided this was the crowd he wanted to build with. Starting a subnet, he says, has "literally been a dream come true."

## The problem isn't reading science, it's weighing it

The problem is bigger than it sounds. Ask Claude or ChatGPT a serious scientific question and you'll get a confident answer wrapped around references that don't exist.
This is measured, not anecdotal: in the *Nature* study behind the benchmark Koellinger's team works with, GPT-4o fabricated citations between 78 and 90% of the time. The models have read almost all the science. They still hallucinate, because the literature was written for people to read, not for machines to reason over.

Retrieval changes that. On the same benchmark, SciWeave returned zero hallucinated citations, and published the replication pipeline so anyone can check. But solved citations are only half of it: a retrieval tool can tell you what a paper says, not whether the claim underneath it holds up. **That distinction is the whole game.**

Take a sentence like "statins lower cardiovascular mortality." On its own it's useless to a machine. Is it backed by a single small correlation, or by a randomised trial repeated across independent labs? One you go and check. The other you can point a research budget at. A literature review can live with the ambiguity. A company deciding which drug target to spend tens of millions chasing cannot.

## What Claims actually does

What Claims does is make that difference machine-readable. Miners take a paper, pull out each claim, identify the evidence behind it, rate how credible that evidence is, and map the concepts to controlled vocabularies so the whole thing becomes queryable. **A claim stops being a sentence and becomes a node with a known evidence type attached.** The output is a canonical claim-evidence graph, served through an API, that other people can build on.

## Why a network, not a company

Koellinger could build this inside a company. He's putting it on Bittensor because a network is how data quality compounds. **"Whoever has the best data will win this market,"** he says. Put many independent miners on the same papers and you don't get one model's verdict, you get a distribution, and the spread itself is signal - calibrated confidence on each claim.
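
For readers who think in code, here is a rough Python sketch of the two ideas above: a claim as a node with an evidence type attached, and several miners' ratings of the same claim summarised as a distribution. Everything in it is an assumption made for illustration. The post does not publish Claims' schema, so `ClaimNode`, `EVIDENCE_TYPES`, `aggregate`, the `vocab:` identifiers and the ratings are all hypothetical, and the standard deviation is only a crude stand-in for whatever the subnet actually uses to measure agreement.

```python
from dataclasses import dataclass
from statistics import mean, pstdev

# Hypothetical evidence categories, ordered weakest to strongest.
# The post does not publish Claims' real schema; these labels are assumptions.
EVIDENCE_TYPES = ("correlation", "single_trial", "replicated_trial")


@dataclass(frozen=True)
class ClaimNode:
    """One miner's structured reading of one claim from one paper."""
    text: str            # the claim as written in the paper
    evidence_type: str   # one of EVIDENCE_TYPES
    credibility: float   # this miner's rating, between 0 and 1
    concept_ids: tuple   # placeholder controlled-vocabulary identifiers

    def __post_init__(self):
        # Reject nodes that fall outside the illustrative schema.
        if self.evidence_type not in EVIDENCE_TYPES:
            raise ValueError(f"unknown evidence type: {self.evidence_type!r}")
        if not 0.0 <= self.credibility <= 1.0:
            raise ValueError("credibility must be between 0 and 1")


def aggregate(nodes):
    """Summarise several miners' ratings of the same claim.

    Returns the number of miners, the mean rating, and the spread
    (population standard deviation). A small spread means the miners agree.
    """
    if not nodes:
        raise ValueError("need at least one miner submission")
    scores = [node.credibility for node in nodes]
    return {"n_miners": len(scores), "mean": mean(scores), "spread": pstdev(scores)}


# Illustrative ratings only; these numbers are invented for the example.
claim = "statins lower cardiovascular mortality"
submissions = [
    ClaimNode(claim, "replicated_trial", 0.8, ("vocab:statins",)),
    ClaimNode(claim, "replicated_trial", 0.9, ("vocab:statins",)),
    ClaimNode(claim, "single_trial", 1.0, ("vocab:statins",)),
]
summary = aggregate(submissions)
print(summary)
```

The point of keeping every miner's node rather than a single merged verdict is that disagreement survives: a claim with a high mean and a wide spread is a different object from one with the same mean and a narrow spread, and a downstream user can treat them differently.
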
An incentive mechanism can reward the one thing that matters here, which is continuously improving the quality of the data. That's what a subnet is for, and it's more capital-efficient than hiring an army of analysts to do the same work centrally.

**Underneath all of this is a real business, and it's close to earning.** He's already in negotiation with a paying enterprise client, a biotech weighing one of those multi-million-dollar drug-target decisions, and reckons revenue may arrive before the subnet even launches, built by hand if it has to be. The model is the Palantir Foundry playbook - take messy data, map it to the ontologies that capture how a field actually works, and the structured asset becomes something clients pay for and stay locked into. The Claims CTO came out of Palantir; the resemblance isn't an accident.

## The hard part he didn't hide

When he brought Claims to the Beyond Finance community, the hardest question came from a doctor: how do you validate any of this when so much published research is overclaimed or never replicated? Koellinger didn't dodge it. His answer is to anchor the system to truth from outside it: human researchers seeding gold-standard datasets, and rewarding miners for predicting which findings will actually replicate, using heuristics any working scientist knows - a huge effect from a tiny sample is probably a fluke, a small effect proven across a very large one usually isn't.

It would have been easy to gloss that. He named it instead. In a market full of confident pitches, **the builders worth watching tend to be the ones who'll show you where the hard part is.**

## Someone's already doing this work

There's already a workforce for this. Scientists read and judge each other's papers largely for nothing - by one 2021 estimate, peer reviewers put in over 100 million hours a year, time donated mostly to commercial publishers and worth, in the US alone, more than $1.5 billion. And they increasingly lean on AI to get through it.
**What they don't do is turn those judgements into something a machine can read.** That's the part Claims structures, and the part a network can actually pay for: work people are already doing for free, captured and rewarded.

## The data is the moat

The defensibility is the data. The more independent miners work the same papers and the evidence graph sharpens, the harder it is for anyone arriving later to catch up - the asset builds on itself, and a snapshot won't stay current. The build reflects that: harvest the open-access papers first, nail one niche, expand later.

None of this is finished. But the shape of it is rare: a serious scientist, a real problem, a business model proven in another domain, and a network built so the data compounds into an asset worth owning. **The science is already written.** Claims exists to make it machine-readable, and to pay the people who've been doing the reading for nothing.

---

**Sources:** Asai et al., "Synthesizing scientific literature with retrieval-augmented language models," *Nature* (2025), doi.org/10.1038/s41586-025-10072-4 · SciWeave ScholarQABench results, sciweave.com/blog/zero-hallucinated-citations · replication pipeline, github.com/desci-labs/SciWeave