Drug development is notoriously time-consuming and expensive, but what if we could simulate clinical trials before they even begin? Orr Inbar, Co-Founder and CEO of QuantHealth, joins me to explore how his team is doing just that. By simulating trials with AI-native models, QuantHealth helps pharmaceutical companies make better decisions about how to design trials and test drugs.
Orr shares how QuantHealth uses real-world patient data and detailed drug biology to build deep-learning models capable of forecasting patient responses to new therapies. He breaks down their biggest challenges, like the complexities of messy healthcare data, hidden biases, and the importance of domain knowledge when building AI tools for regulated environments. He also shares a key lesson for any AI startup: focus on solving real problems, not just building clever models. Tune in for a fascinating look at how AI is reshaping drug development and what the future of clinical trials could look like!
Key Points:
- Some background on Orr, his parents, and how he founded QuantHealth.
- Key problems QuantHealth is solving as a clinical trial simulation company.
- A breakdown of the biggest challenges facing clinical trials.
- Why we need to improve data-driven trials of drugs.
- How QuantHealth builds their foundation models for trial simulations.
- Examples of the type of predictions their models make in clinical contexts.
- How they use patient and drug data to make predictions and build “digital drugs”.
- Key challenges of working with these different types of data.
- Methods for combating bias, including the use of exogenous data.
- How they incorporate the medical context in model development.
- QuantHealth’s validation process: how they meet rigorous industry standards.
- Orr’s advice to other AI startups on creating value, not just smart models.
- Where you can expect to see QuantHealth in the next three to five years.
Quotes:
“There is a constant desire in drug development and pharmaceutical research to get your hands on more data. This makes sense since it's a very data-driven industry. But at the same time, there was a mismatch there, because there's actually quite a lot of data already out there.” — Orr Inbar
“How do we bridge the gap between the data that we already have and the insights that we need to generate to answer those questions?” — Orr Inbar
“If you take a step back and look at how drugs are being developed today and with an emphasis on clinical trials, we're essentially doing the same things that we were doing 50 years ago.” — Orr Inbar
“Even in a world of GenAI, you can't just snap your fingers and get the solution. It requires a lot of work to structure and harmonize the data.” — Orr Inbar
“Every trial that we simulate, we first go through a data enrichment process where we look for the latest information in terms of research publications, recently completed trials that are relevant to our drug of interest, and incorporate that data into our data sets.” — Orr Inbar
Links:
Orr Inbar on LinkedIn
QuantHealth
LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.
[INTRODUCTION]
[0:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven machine learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.
[INTERVIEW]
[0:00:33] HC: Today, I’m joined by guest Orr Inbar, co-founder and CEO of QuantHealth, to talk about simulating clinical trials. Orr, welcome to the show.
[0:00:41] OI: Hi, Heather. Thank you. It’s nice to be here.
[0:00:43] HC: Orr, could you share a bit about your background and how that led you to create QuantHealth?
[0:00:46] OI: Sure. Rewind maybe 15, 20 years ago, I was a pre-med student in the Boston suburbs. My mother being a physician, and also just me being Jewish, being either a good doctor or a lawyer, kind of prompted me down that typical route. But somewhere along the road, I realized the long and arduous path that it takes to become a physician. I think even more compelling for me was the realization that becoming a doctor ultimately meant that I would only be able to treat one patient at a time, which may seem obvious, but I was always looking for ways to impact the world at scale.
My father being a computer scientist also, I guess, helped spark the other side of me, the engineer in me. So, after undergrad, I basically started exploring life as a consultant and a citizen of the world and realized the value that engineering, and that computational methods can have on pretty much all fields, but medicine especially. So, I decided to go down that road of pursuing a master’s in computer science, specifically focusing on data science and machine learning and very naturally combining that with my love of medicine and life sciences.
So, very quickly, that became the path that I was destined to be on. After school, I always tried to place myself at the cutting edge of medical research intersecting with computer science, machine learning, and AI. So, I founded my first company in that space in Boston in 2017, focusing on real-world data and precision diagnostics for oncology.
During that time, I got much more familiar with pharma and how pharma uses data to solve some of their biggest pain points and challenges. One of the things that became apparent to me, as a company that was providing both data and analytic services to pharma, was that there is a constant desire in drug development and pharmaceutical research to always bring more data and to generate, acquire, and get your hands on more data. This makes sense since it’s a very evidence-driven, very data-driven industry. But at the same time, there was a mismatch there, because there’s actually quite a lot of data already out there.
I think, when you look at the different industries, especially now with GenAI kind of really just regurgitating a lot of the same data over and over again, healthcare is probably one of the only spaces that is still generating huge amounts of data every day. So, when you look at that, that begs the question of, well, are we really doing enough with the data that we already have? So, there’s almost this knee-jerk reaction within pharma, whenever there’s a challenge that isn’t readily met, to just go out and seek more data.
Again, more often than not, the answer is already there. You just have to look harder and find the right way to answer it with the already available data. So, essentially, all of that can be summarized as the data insights gap, right? How do we bridge the gap between the data that we already have and the insights that we need to generate to answer those questions? That prompted me to essentially form my next company, which was QuantHealth, whose mission really was to solve the complex questions of drug-human biology, bridging the gaps across these very diverse and complex data sets to answer a new generation of questions that are, for the first time in human history, now able to be addressed given the advances in compute infrastructure and AI models.
It was a very exciting time to really start the company. I think it still is. When we started, this was well before GenAI was even a thing. So, that whole revolution has given us a lot of tailwinds and is definitely fueling what we’re doing. It’s an exciting time for sure and a lot to look out for.
[0:05:13] HC: Tell me more about what you’re doing at QuantHealth, like what problems are you trying to solve today?
[0:05:17] OI: QuantHealth is a clinical trial simulation company, essentially. If you take a step back and look at how drugs are being developed today, with an emphasis on clinical trials, we’re essentially doing the same things that we were doing 50 years ago. Basically, we have a drug that we think has some potential, and we just go ahead and find patients in the real world, give them the drug, and see what happens. There’s very little sophistication to that. Look at what we’re doing in pretty much every other highly advanced scientific pursuit across civilization, right? Look at semiconductor development and computer processor development, right?
There’s an enormous amount of simulation and virtual testing that goes on before the chip is actually fabricated. Look at aerospace engineering, right, shuttles, and rockets, and jets. All of those highly complex and expensive machines also go through extensive simulations, and then they go through wind tunnels and all these physics simulators to understand, right, the different safety and performance aspects of the craft before it gets produced at scale.
Go back, again, to drug development, and none of that happens. We discover the drug, we figure out how to manufacture it in small quantities, and straight to humans it goes, essentially, without any virtual testing, without any simulation. So, it’s no surprise that over 90% of drugs that make it to human trials ultimately fail. It’s because we don’t do sufficient testing that is cheap and scalable.
We’re looking at this in a very cold manner, but that doesn’t even address the human element, right? I mean, these are real people who are given actual drugs. Some of these drugs could be unsafe; some could be safe but ineffective. When you’re on a trial, it’s typically between that and another drug that you know is effective, but you’re taking a chance.
Besides just being expensive and inefficient, a lot of it is also just unethical. But in the end, it’s just the best we have, so we do it anyway, because without that, there would be no drugs at all. All that’s to say, again, that the need for doing better data-driven assessments of drugs is paramount, and trial simulation is one of the most promising ways to go about that, because you can do it in such a holistic and all-encompassing way. Again, trials are really, really complicated, and there are a lot of variables that go into that.
So, if you can truly simulate a trial and all its components and really give a good signal on what the results will be, you can really help the pharma company make better decisions about what drugs to even take to trials or not, and what programs to discontinue. For those that do go to trial, how do you make better decisions on what patients to target? How do you administer your drug most effectively, maybe synergizing it with other drugs? So that, ultimately, we get to the promised land, which is an approved drug, which just happens so rarely today. That’s essentially what we do. We help answer these questions around trial futility and trial optimization.
[0:08:44] HC: How does machine learning help in answering these questions?
[0:08:47] OI: It’s core. I mean, it’s central to everything we do. You know, there are a lot of companies out there that use AI or use machine learning, and today that’s probably pretty much any other SaaS startup. For us, the machine learning component is the essence. We actually build the models themselves. We don’t take, for instance, GenAI foundation models and fine-tune them. We actually build the foundation models, so to speak.
They’re not GenAI-based, they’re built on different architectures, but nonetheless, we’re an AI-native company, right? We actually build the models that fundamentally model out the drug-human biology to actually run these predictions on a patient level for these trial simulations. More than half the company is data scientists and engineers who really focus on that problem.
[0:09:42] HC: Are the models trained to predict something like, is this patient going to respond to treatment or not? Is it that type of binary decision, or are there other tasks that you tackle with machine learning?
[0:09:54] OI: Yeah. It’s more granular than that. In most clinical contexts, you’re interested in a temporal prediction. That is to say, for instance, will this patient experience a remission in their disease in the next six months? Sometimes the question is even more granular. For instance, if we’re talking about a weight loss drug, how much weight will the patient lose over the next three months? Things like that. The question can typically be nailed down to a single point in time where the outcome measurement is taken, but it’s very precise, because that’s how the data is ultimately standardized and analyzed for the FDA.
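Editor's sketch (not part of the conversation): the fixed-time-point outcome described above can be illustrated with a small Python example that computes a patient's weight change at a set horizon after treatment start. This is only an illustration, not QuantHealth's pipeline; the function names, the 14-day tolerance window, and the sample values are all assumptions.

```python
from datetime import date, timedelta


def value_near(measurements, target, window_days):
    """Return the value measured closest to `target`, or None if no
    measurement falls within `window_days` of it."""
    in_window = [
        (abs((day - target).days), value)
        for day, value in measurements
        if abs((day - target).days) <= window_days
    ]
    if not in_window:
        return None
    return min(in_window, key=lambda pair: pair[0])[1]


def change_at_horizon(measurements, index_date, horizon_days, window_days=14):
    """Change from baseline at a single follow-up time point.

    Hypothetical helper: baseline is the reading nearest the index date,
    follow-up is the reading nearest index_date + horizon_days.
    """
    baseline = value_near(measurements, index_date, window_days)
    follow_up = value_near(
        measurements, index_date + timedelta(days=horizon_days), window_days
    )
    if baseline is None or follow_up is None:
        return None  # outcome not observable for this patient
    return follow_up - baseline


# Made-up weights in kg for one illustrative patient.
weights = [
    (date(2024, 1, 1), 100.0),
    (date(2024, 2, 1), 97.5),
    (date(2024, 4, 2), 93.0),
]
print(change_at_horizon(weights, date(2024, 1, 1), horizon_days=90))
```

Returning None when no reading falls inside the window keeps unobservable outcomes explicit rather than silently borrowing a distant measurement.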
[0:10:41] HC: What type of data do you work with in applying these machine learning models?
[0:10:46] OI: It’s a combination of data sets. Basically, in order to predict how these patients will respond to novel therapies, you essentially need two core ingredients. The first is patient data, right? We’re modeling out these digital patients, if you will. So, we need patient data to represent those patients, right? Treatment histories, diagnostic information, outcome data, lab results, vitals, that sort of thing. We work with a variety of data aggregators that essentially extract information from the healthcare system, EMR systems, and insurance claims.
The second piece is essentially drug data. In order to model out novel drugs, we need to understand how those drugs actually work. So, for that, we build out knowledge graphs from a variety of different genetic databases, pharmacology databases, and a lot of different publications, and we stitch that data together to essentially build a map of drug-human biology, understanding how different targets participate in different cell processes, and then how those targets are affected by different therapeutic entities. So, that lets us build out these digital drugs.
We then have these two different data domains, right, the patient domain and the drug domain. Then, by combining those two things, along with some other data sets, though those are really the main ingredients, we’re able to train large deep learning models to understand how different drugs with different mechanistic properties affect different patients with different clinical characteristics, and thereby run inference on new patients and new drugs to predict how they will respond to a novel therapy in the context of a trial.
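Editor's sketch (not part of the conversation): one very simplified way to picture combining the two domains is to encode a drug by the biological processes its targets touch, encode a patient by a few clinical characteristics, and concatenate the two into a single model input. Everything below is hypothetical: the toy knowledge graph, the drug and target names, the feature scaling, and the helper functions are placeholders, and the actual models described in the interview are large deep learning models whose architecture is not specified.

```python
# Toy knowledge graph: drug -> targets, target -> biological processes.
# All names are placeholders, not real pharmacology.
DRUG_TARGETS = {
    "drug_a": {"target_1"},
    "drug_b": {"target_1", "target_2"},
}
TARGET_PROCESSES = {
    "target_1": {"glucose_metabolism"},
    "target_2": {"appetite_regulation", "glucose_metabolism"},
}
# Fixed, sorted feature order so every drug vector lines up.
PROCESSES = sorted(set().union(*TARGET_PROCESSES.values()))


def digital_drug(drug):
    """Multi-hot vector over processes reachable from the drug's targets."""
    reached = set().union(*(TARGET_PROCESSES[t] for t in DRUG_TARGETS[drug]))
    return [1.0 if p in reached else 0.0 for p in PROCESSES]


def digital_patient(record):
    """Crude numeric encoding of a few clinical characteristics."""
    return [
        record["age"] / 100,
        record["bmi"] / 50,
        1.0 if record["has_t2d"] else 0.0,
    ]


def model_input(record, drug):
    """Concatenate patient and drug features into one model input row."""
    return digital_patient(record) + digital_drug(drug)


patient = {"age": 50, "bmi": 30.0, "has_t2d": True}
print(model_input(patient, "drug_a"))
print(model_input(patient, "drug_b"))
```

Because a drug is represented by its mechanism rather than its name, a new drug with known targets gets a feature vector without any patient ever having taken it, which is the property that would make inference on novel therapies possible in principle.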
[0:12:40] HC: What kinds of challenges do you encounter in working with and training models based on these two different types of data?
[0:12:46] OI: Challenges are never-ending. Well, I mean, the first challenge is always with the data itself. Clinical data, for instance, can be extremely messy and large, and that combination of size and messiness can be particularly challenging. This is in a world where a lot of that data is semi-structured, and it’s not even text. Even in a world of GenAI, you can’t just snap your fingers and get the solution. It requires a lot of work to structure and harmonize the data.
I’ll give you one example. One of the models that we’re building tracks HbA1c, right, in the blood, different even a little bit measures and whatnot. For most patients, we only have HbA1c readings maybe once a year, if we’re lucky, but in a clinical trial, you’re interested in measuring the effect on a monthly or even weekly or even daily basis. So, we have to build these imputation models to essentially help fill out the missing data, so to speak, right?
There’s this whole layer in a lot of these data sets that is the latent information that you know is true about a patient, right? A patient is obviously a living being who experiences different events and is in a constantly changing environment, whether that data is captured or not in the electronic health record. You always have to find a way to infer some of that information, even if it’s not directly available in the data. That’s a big one.
Then you have data bias. Different patients are treated in different ways across different geographies and even across different socioeconomic contexts. Accounting for all that and finding the single source of truth on a lot of that is oftentimes difficult. So, those are typical problems that we deal with.
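Editor's sketch (not part of the conversation): the imputation problem described above, sparse HbA1c readings that need to be filled in on a finer time grid, can be illustrated with plain linear interpolation. This is a deliberately simple stand-in, not the imputation models QuantHealth builds; the function name, the day-based time axis, and the sample readings are assumptions.

```python
from bisect import bisect_left


def impute_linear(readings, query_days):
    """Estimate a lab value on each query day from sparse readings.

    readings: list of (day, value) pairs, day counted from an arbitrary
    start. Values between two readings are linearly interpolated; values
    before the first or after the last reading are held constant.
    """
    if not readings:
        raise ValueError("at least one reading is required")
    readings = sorted(readings)
    days = [day for day, _ in readings]
    estimates = []
    for q in query_days:
        i = bisect_left(days, q)
        if i == 0:
            estimates.append(readings[0][1])
        elif i == len(days):
            estimates.append(readings[-1][1])
        else:
            (d0, v0), (d1, v1) = readings[i - 1], readings[i]
            estimates.append(v0 + (v1 - v0) * (q - d0) / (d1 - d0))
    return estimates


# Made-up HbA1c readings (%), roughly one per year.
hba1c = [(0, 8.0), (360, 7.0)]
# Fill in a 30-day grid between the two yearly readings.
monthly = impute_linear(hba1c, range(0, 361, 30))
print(monthly)
```

A straight line between yearly readings ignores treatment changes and other events between visits, which is exactly the latent information the interview says a real imputation model has to infer.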
[0:14:52] HC: How do you validate your machine learning models? In particular, you mentioned bias. How do you make sure that your models don’t end up incorporating that bias?
[0:15:00] OI: Yeah. That’s a tricky one that we constantly try to work against. The general idea is to incorporate exogenous data sets that you can use in some external fashion to benchmark and ideally debias your models. So, we use publicly available data from ClinicalTrials.gov, which is one of the largest publicly available clinical trial registries, where companies are essentially required to post the results of their clinical trials. So, we’re fortunate enough, right, to have this very large database that you could consider a gold standard in a way, although clinical trials also have their own bias, but we can’t control for everything.
We have this gold-standard clinical trial data set that we can then use to measure ourselves against and see where we’re biased, or where we’re potentially systematically over- or underestimating, and make those adjustments and essentially debias the models as much as possible.
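Editor's sketch (not part of the conversation): benchmarking against an external reference to find systematic over- or underestimation can be illustrated with the simplest possible check, a mean signed error between simulated and observed trial results, followed by a constant offset correction. The numbers are invented, the function names are hypothetical, and a single global offset is an assumption made for brevity; the interview does not describe the actual adjustment method.

```python
from statistics import mean


def mean_signed_error(simulated, observed):
    """Average of (simulated - observed) across benchmark trials.

    A positive result means the simulations overestimate on average,
    a negative result means they underestimate.
    """
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("need two non-empty lists of equal length")
    return mean(s - o for s, o in zip(simulated, observed))


def debias(prediction, bias):
    """Shift a new prediction by the bias measured on the benchmark."""
    return prediction - bias


# Invented response rates for three completed benchmark trials.
simulated = [0.5, 0.75, 1.0]
observed = [0.25, 0.5, 0.75]

bias = mean_signed_error(simulated, observed)
print(bias)               # systematic overestimation on the benchmark
print(debias(0.5, bias))  # corrected prediction for a new trial
```

In practice the error would likely be broken down by indication, drug class, or patient subgroup, since a bias that averages to zero overall can still be large within one group.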
[0:16:09] HC: In getting these models working and validated, I imagine there’s a fair bit of knowledge about health care, about how drugs work and their characteristics that would be very important to incorporate. How do your machine learning developers get up to speed on this knowledge or learn it so they can incorporate the important characteristics into the models?
[0:16:28] OI: Yeah. It’s tough, right, because to your point, the state of knowledge on various drugs is constantly advancing, right, as different PhD students, researchers, and pharma companies are advancing the state of our understanding of these drugs. So, we have to track that, and we have to incorporate that. Basically, for every trial that we simulate, we first go through a data enrichment process where we look for the latest information in terms of research publications and recently completed trials that are relevant to our drug of interest, and incorporate that data into our data sets so that, again, we have the latest and greatest on any given drug in development. It’s a complex process that requires a lot of QA and automation, but it’s really important.
[0:17:23] HC: How does the regulatory process affect the way you develop machine learning models, or the things you do differently than if you weren’t in a regulated domain?
[0:17:31] OI: It’s a good question. What we do isn’t fundamentally regulated, because we’re essentially helping pharma companies make better decisions about their internal processes, right? That being said, what they do obviously is ultimately regulated, so they are used to being scrutinized very carefully. So, naturally, they scrutinize us as well, right? For good reason. And I think that’s a good thing. That sort of approach to having everything be controlled, validated, and regulated definitely spills over into what we do.
That just puts a very high bar right on what we do. For that reason, we go through an extensive model validation process to understand very granularly what the data is that goes into the model, how the models are behaving, how they are learning, how the models are actually performing on the prediction. Then how well those predictions and simulations validate against actual clinical trials. All of that is part of our validation process. It’s definitely a direct result of the scientific rigor of the industry.
[0:18:44] HC: Is there any advice you could offer to other leaders of AI powered startups?
[0:18:48] OI: Well, I would say, especially in today’s world where it’s so easy to say we’re an AI startup or we AI this and AI that, one of the truths that I’ve seen hold constant for the last decade or so in AI startups, and that still holds true today, is to focus on the product and on the value and not on the model, not on the prediction. That’s important too, right? But you can oftentimes miss the mark by focusing just on the core technology. How do you actually build a solution that solves a real problem, a real need, that fits into a real user workflow and gets them from point A to point B as quickly and as efficiently as possible, with as little doubt as possible? That might sound obvious, but I think a lot of entrepreneurs miss that somewhere along the way.
[0:19:49] HC: Finally, where do you see the impact of QuantHealth in three to five years?
[0:19:53] OI: I would say 50% of all phase two and phase three trials will utilize either QuantHealth or a technology like QuantHealth to design and execute their clinical trial.
[0:20:08] HC: I imagine overall that’ll make these clinical trials more efficient, with fewer drugs failing before they get to patients, and do a lot of good overall.
[0:20:17] OI: Oh, 100%. I mean, it should make a dramatic difference in the number of drugs that reach patients, as well as save a ton of money on drug development, which of course frees up resources to develop more and better drugs. It’ll be huge. No question.
[0:20:35] HC: This has been great, Orr. I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?
[0:20:42] OI: Well, Google QuantHealth, quanthealth.ai. Yeah, and you’ll see a lot of information there. It’s a lot of interesting things.
[0:20:50] HC: Thanks for joining me today.
[0:20:52] OI: Thank you. It’s a pleasure.
[0:20:53] HC: All right, everyone. Thanks for listening. I’m Heather Couture and I hope you join me again next time for Impact AI.
[END OF INTERVIEW]
[0:21:03] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. If you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.
[END]