Generations of Clinical Trial Matching
Adam Blum
Nov 20, 2025
Medical science is being drastically impeded by low patient uptake of trials: worldwide, an average of just 3% of patients participate in a clinical trial for treatment of their disease. 89% of trials are delayed beyond their original timeline, mostly due to insufficient patient enrollment. The biggest obstacle is that patients struggle to find relevant trials for which they are truly eligible.
So the world needs better clinical trial matching. Let’s discuss what that means in a bit more detail. Clinical trial matching systems have gone through four major generations.
The Base Registries
Starting in 2000 with clinicaltrials.gov, these are official registries of trials maintained by governments or cross-national organizations. Other prominent trial registries include EUCTR from the European Union, ICTRP from the WHO, and ANZCTR for Australia and New Zealand. Finding eligible, matching trials for a given patient in these registries is an enormous task for either clinicians or patients, taking days of dedicated effort. It took me weeks of searching these underlying registries to find matches for my disease.
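To illustrate how raw these registries are, here is a minimal sketch of querying clinicaltrials.gov directly, assuming its v2 studies API and the usual response layout. Note that each trial’s eligibility criteria come back as a single free-text blob that a patient or clinician must read and interpret by hand.

```python
# A minimal sketch of searching a base registry directly, assuming the
# clinicaltrials.gov v2 API's /studies endpoint and its response layout.
import requests

resp = requests.get(
    "https://clinicaltrials.gov/api/v2/studies",
    params={
        "query.cond": "follicular lymphoma",  # disease search term
        "pageSize": 5,                        # only the first few results
    },
    timeout=30,
)
resp.raise_for_status()

for study in resp.json().get("studies", []):
    protocol = study.get("protocolSection", {})
    ident = protocol.get("identificationModule", {})
    elig = protocol.get("eligibilityModule", {})
    print(ident.get("nctId"), "-", ident.get("briefTitle"))
    # Unstructured inclusion/exclusion text that must be interpreted by hand:
    print((elig.get("eligibilityCriteria") or "")[:300], "...\n")
```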
I was recently honored to work with Alexa McCray, the creator of clinicaltrials.gov, at the Harvard Radcliffe Institute seminar on centralized patient databases. It was clear from our discussions that extracting structured eligibility attributes from the unstructured trial text (discussed below) was always envisioned, and is thus long overdue.
Commercial Trial Matchers
In the 2010s, several commercial trial matching services launched to address the limits of free-form registry search. They all ask the patient a handful of questions: disease, stage, grade, age, gender, and sometimes (though not often) previous treatment. They don’t actually do full matching against all of the patient’s labs, diagnostics, and biomarkers. Instead, putative matches are shown, and then a “consultant” reaches out to the patient, manually tries to find a matching trial, and brokers that patient to a pharma researcher. In my own cancer journey looking for trials with these tools, I was presented with many matches by all the major services, none of which were actually true matches to my disease and lab value specifics. I still get calls, texts, and emails from these services with non-matching “new trials”.
Open Source LLM-Based Matchers
In late 2024, a raft of open source research projects based on AI large language models (LLMs) were published contemporaneously. These include TrialGPT, TrialMatchAI, MatchMiner AI and OncoLLM. These projects (not products that patients can access from a hosted site) all present a hypothetical “patient vignette” (a brief description of the patient that does not include all of their values) to an LLM along with a set of trials from a small test database (TREC 2022, with just 50 trials, is popular for this). Based on the vignette, each trial in the list is then ranked for suitability to the patient. They grade their own efficacy by whether the clinician-labeled “gold standard” best trial from the tiny toy database appears in the top n (generally top 5) of the matcher-ranked trials.
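To make that evaluation setup concrete, here is a minimal sketch of the kind of top-n check these projects report, with a hypothetical rank_trials() function standing in for whatever LLM prompting each project actually uses.

```python
# A minimal sketch of a top-n (recall@n) evaluation over a small test set
# (e.g., ~50 trials) with a single clinician-labeled "gold standard" trial
# per patient vignette. rank_trials() is a hypothetical stand-in for the
# LLM-based ranker each project implements.
from typing import Callable, Sequence

def recall_at_n(
    vignettes: Sequence[str],          # brief patient descriptions
    gold_trial_ids: Sequence[str],     # clinician-labeled best trial per vignette
    trial_ids: Sequence[str],          # the small test database of trials
    rank_trials: Callable[[str, Sequence[str]], list[str]],  # hypothetical LLM ranker
    n: int = 5,
) -> float:
    """Fraction of vignettes whose gold-standard trial lands in the top n."""
    hits = 0
    for vignette, gold in zip(vignettes, gold_trial_ids):
        ranked = rank_trials(vignette, trial_ids)  # LLM orders every candidate trial
        if gold in ranked[:n]:
            hits += 1
    return hits / len(vignettes)
```

With only around 50 candidate trials, a ranker that merely matches on disease name will usually clear this bar, which is exactly the limitation discussed next.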
Of course, there are many problems with this approach as the basis for a useful system for finding trials:
1) Patient vignettes do not have sufficient information to determine true eligibility according to all trial participation criteria.
2) The test databases are several orders of magnitude smaller than actual registries and do not reflect the difficulty of searching among thousands of possible trials. Usually, this is just a test of whether the matcher found the one trial of the 50 in the dataset that is related to the disease mentioned in the patient vignette. Of course that usually does work. As an example, my patient vignette as an FL sufferer would match me to the one FL trial in the tiny test dataset. But solving this trivial problem does not provide anything of particular value to anyone.
3) Although it’s impossible to determine true eligibility from a vignette, the clinician-chosen trial is also some opaque, uncommunicated mixture of the trial with the highest eligibility and the trial believed to be most beneficial for the patient.
4) The clinician-labeled gold standard reflects the doctor’s opinion of what the best trial is for the patient, using the doctor’s own similarly non-transparent weighting of the trial’s risk, benefit, and patient burden for that patient. It doesn’t reflect the patient’s priorities for those aspects in any way (no discussion with patients informs the clinician labeling process).
A New Generation: Precision Clinical Trial Matching
Having seen all the limitations of prior approaches as a patient, last year I started CancerBot and the underlying EXACT open source clinical trial matching system (which we offer free to all foundations and patient support organizations so they can run their own matcher). CancerBot’s foundational innovation is that we create a true structured database of all trial eligibility attributes, extracted from the unstructured trial text of all major registries.
While EXACT builds upon the major improvements in AI LLMs, just asking the LLM to bring back all the eligibility attributes from the original unstructured trial text does not come close to working reliably. Instead, this is done with a much more sophisticated approach described here. This approach results in extremely high accuracy of attribute extraction. We will be presenting these results in more detail at ASH 2025 in early December.
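To give a sense of what “structured eligibility attributes” means in practice, here is a purely illustrative sketch of one extracted criterion represented as a machine-checkable record. The field names and identifier are hypothetical and are not EXACT’s actual schema; the point is that each criterion becomes structured data rather than a sentence buried in free text.

```python
# A purely illustrative sketch of one extracted eligibility attribute.
# Field names are hypothetical, not EXACT's actual schema.
from dataclasses import dataclass

@dataclass
class EligibilityAttribute:
    trial_id: str        # registry identifier, e.g. an NCT number
    attribute: str       # e.g. "ecog_status", "prior_lines_of_therapy"
    operator: str        # "<=", ">=", "==", "in"
    value: object        # threshold or allowed set
    inclusion: bool      # True = inclusion criterion, False = exclusion

# Example: an inclusion criterion of "ECOG performance status 0-2"
ecog_rule = EligibilityAttribute(
    trial_id="NCT00000000",  # placeholder identifier
    attribute="ecog_status",
    operator="<=",
    value=2,
    inclusion=True,
)
```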
The resulting system can therefore do true 100% eligibility assessment, precisely matching against all patient attributes. We call this approach precision clinical trial matching. Patients see a list of trials that truly match (or potentially match, given some additional filled-in patient attributes) their own disease with all of its labs, diagnostics, and biomarkers.
We then order those eligible or potentially eligible trials by what matters to the patient: their own preferences regarding risk, benefit, patient burden, and distance. We score each trial’s risk, benefit, and burden based on the rubrics defined here. Patients express their priorities across these factors, and trials are ranked accordingly. The result is that patients now get a list of trials they are truly eligible for, ordered by the factors that are most important to them.
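As a rough illustration of preference-weighted ordering, a sketch might look like the following. The weights and score fields are illustrative only, not CancerBot’s actual rubrics; it assumes each trial has already been scored on comparable scales.

```python
# A minimal sketch of preference-weighted trial ordering. Weights and score
# fields are illustrative, not the actual CancerBot rubrics.
from dataclasses import dataclass

@dataclass
class ScoredTrial:
    trial_id: str
    benefit: float   # higher is better
    risk: float      # higher is worse
    burden: float    # higher is worse
    distance: float  # higher is worse (e.g., normalized travel distance)

def rank_by_preferences(trials, weights):
    """Order eligible trials by a patient's stated priorities."""
    def utility(t: ScoredTrial) -> float:
        return (weights["benefit"] * t.benefit
                - weights["risk"] * t.risk
                - weights["burden"] * t.burden
                - weights["distance"] * t.distance)
    return sorted(trials, key=utility, reverse=True)

# Example: a patient who prioritizes benefit and is willing to travel
preferences = {"benefit": 0.5, "risk": 0.2, "burden": 0.2, "distance": 0.1}
trials = [
    ScoredTrial("TRIAL-A", benefit=0.9, risk=0.6, burden=0.5, distance=0.8),
    ScoredTrial("TRIAL-B", benefit=0.7, risk=0.3, burden=0.2, distance=0.1),
]
for t in rank_by_preferences(trials, preferences):
    print(t.trial_id)
```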
Where To Now?
We have listed many things that we think all trial matchers should be doing here. To summarize, the BEGIN principles are:
- Bots — to get complete patient information and explain terminology to patients
- Eligibility — show truly eligible trials to patients based on their detailed health attributes
- Goodness — rank the trials based on what is most important (risk, benefit, burden and distance) to the patient
- Interoperability — read FHIR feeds to get full patient records (not done by any of the above generations) and store them in standard ways such as OMOP for future analysis (see the sketch after this list)
- Navigator — act as an empathetic patient-support nurse/navigator to explain trial risks and burdens
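As an illustration of the Interoperability principle, here is a minimal sketch of pulling a patient’s lab observations from a standard FHIR R4 server so matching can use real values rather than a short questionnaire. The server URL and patient id are placeholders.

```python
# A minimal sketch of reading lab Observations from a FHIR R4 server.
# The base URL and patient id are placeholders, not a real endpoint.
import requests

FHIR_BASE = "https://fhir.example.org/r4"   # hypothetical FHIR endpoint
PATIENT_ID = "example-patient-id"           # placeholder

resp = requests.get(
    f"{FHIR_BASE}/Observation",
    params={"patient": PATIENT_ID, "category": "laboratory"},
    headers={"Accept": "application/fhir+json"},
    timeout=30,
)
resp.raise_for_status()

# A FHIR search returns a Bundle; each entry wraps one Observation resource.
for entry in resp.json().get("entry", []):
    obs = entry["resource"]
    code = obs.get("code", {}).get("coding", [{}])[0].get("display", "unknown")
    qty = obs.get("valueQuantity", {})
    print(code, qty.get("value"), qty.get("unit"))
```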
We want the whole trial matching industry to start following these principles. This is why we are giving our software away to foundations. To that end, I am leading the database definition workgroup (WG1) of Harvard’s DCI Network trial matching pilot. If you are with another trial matcher, I would encourage you to consider adopting these principles. If you can move to precision matching and make your service freely available, you can participate in the pilot.
If you are a patient or clinician looking for trials, we encourage you to seek out services that provide such precision matching in your trial-finding journey. If you are a foundation or patient support organization, reach out to us to find out how we can provide your navigators and patients with trial matches from your own branded service.
Turning frustration into innovation
After being diagnosed with follicular lymphoma, AI tech entrepreneur Adam Blum assumed he could easily find cutting-edge treatment options. Instead, he faced resistance from doctors and an exhausting search process. Determined to fix this, he built CancerBot, an AI-powered tool that makes clinical trials more accessible, helping patients find potentially life-saving treatments faster.


