As machine learning becomes increasingly widespread, AI holds the potential to revolutionize drug development, making it faster, safer, and more affordable than ever. In this episode, I'm joined by Jo Varshney, Founder and CEO of VeriSIM Life, to explore how her company is transforming drug translation through hybrid AI.
With her unique blend of expertise as a veterinarian and computer scientist, Jo leverages biology, chemistry, and machine learning knowledge to tackle the translational gap between animal models and human patients. You’ll learn about VeriSIM Life’s innovative approach to overcoming data limitations, synthesizing new data, and applying ML models tailored to various diseases, from rare conditions to neurological disorders. Jo also reveals VeriSIM’s unique translational index score, a tool that predicts clinical trial success rates and helps pharma companies identify promising drugs early and avoid costly failures.
For anyone curious about the future of AI in healthcare, this episode offers a fascinating glimpse into the world of biotech innovation. To discover how VeriSIM Life’s technology is poised to bring life-saving treatments to patients faster and more safely than ever before, be sure to tune in today!
Key Points:
- How Jo's background and interest in translational challenges led her to found VeriSIM Life.
- Addressing translational gaps between animal models and human trials with hybrid AI.
- Combining biology-based models with ML to enhance drug testing accuracy.
- Small molecules, peptides, large molecules, clinical trial outcomes, and other data inputs.
- Ways that VeriSIM’s models are tailored per data type, ensuring maximum accuracy.
- Insight into the challenge of overcoming data gaps and how VeriSIM solves it.
- How hybrid AI reduces overfitting, boosting model accuracy in data-limited scenarios.
- What goes into validating VeriSIM’s models through partnerships and external testing.
- Measuring the impact of this technology with VeriSIM’s translational index score.
- Jo’s advice for AI-powered startups: be specific, validate technology, and be adaptable.
- Her predictions for the impact VeriSIM will have in the next few years.
Quotes:
“[Hybrid AI] helps us not only unravel newer methods and mechanisms of action or novel targets but also helps us identify better drug candidates that could eventually be safer and more effective in human patients.” — Jo Varshney
“Biology is complex. We need to understand it enough to create a codified version of that biology.” — Jo Varshney
“If you're just using machine learning-based methods, you may not get the right features to see the accuracy that you would see with the hybrid AI approach that we take.” — Jo Varshney
“Focus on validation and showing some real-world outcomes [rather than] just building the marketing outcome because, ultimately, we want it to get to the patients. We want to know if the technology really works. If it doesn't work, you can still pivot.” — Jo Varshney
Links:
VeriSIM Life
Jo Varshney on LinkedIn
Jo Varshney on X
LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.
[INTRO]
[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven machine learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.
[EPISODE]
[00:00:34] HC: Today, I’m joined by guest Jo Varshney, Founder and CEO of VeriSIM Life, to talk about drug development. Jo, welcome to the show.
[00:00:42] JV: Thank you, Heather, for having me here.
[00:00:44] HC: Jo, could you share a bit about your background and how that led you to create VeriSIM Life?
[00:00:48] JV: Absolutely. I started my career really being curious about translatability and drug development. That curiosity really comes from my father, who’s been in pharma since I was two. Just for the audience’s sake, what translation means is how we look at different animal models to build a good understanding of what could and could not happen in humans; anything and everything is basically tested in animals before it goes into humans. That kind of translational understanding made me very curious to become a veterinarian, because you get to study different types of animals and the different physiological aspects of those animals.
I also wanted to deepen my knowledge in this area because I do believe that there are some things that are quite similar between animals and humans, and then there are things that are not similar. That’s why there are so many failures in phase two clinical trials in drug development. That really led me to the next question: how can I bridge the gap between animals and humans? Is there a way to do it in a more cost- and time-effective manner?
This is when I started taking computer programming courses, more specifically C++. I started going deeper into specific aspects of bioinformatics and pathobiology to understand how we model diseases in animals that replicate what’s happening in humans. Then I pursued a PhD in comparative genomics, oncology, and computer science, where I got my first exposure to supervised and unsupervised learning in AI.
All of my background and the curiosity about the translational challenges we see in drug development truly became an obsession. That led to finding ways to utilize computational methods to address some of these translational challenges that are not really possible to address experimentally. I can go into more examples on that in a bit, but that’s how my background led to the start of VeriSIM.
[00:03:14] HC: What does VeriSIM do, and why is it important for healthcare?
[00:03:18] JV: Yes. VeriSIM Life’s vision is to really solve the translational challenges that we see in drug development using what we call hybrid AI. The hybrid AI is not really a new concept. We did not really come up with it. Many sectors are already using it. But what basically we are saying is we’ve built a technology that takes into account biological representations that have been codified using mathematics and physics to show the physiological differences that you would see in human and animal systems. Then we leverage several types of AI to learn about these physiological differences.
When our platform interacts with a novel molecule, it can figure out which of those patterns are unique to that molecule’s physiological relationships versus which are similar to relationships it has already learned. That kind of understanding helps us not only unravel newer methods and mechanisms of action or novel targets, but also helps us identify better drug candidates that could eventually be safer and more effective in human patients.
Going a little deeper into the company’s goal: we want to use this to build our own drug portfolio across different diseases. We have expanded our drug portfolio from oncology and rare diseases to neurological disorders, where there is high unmet need. Think metastatic cancers, addiction, pulmonary fibrosis, and pulmonary arterial hypertension, where we don’t really have curative drugs on the market. We also help identify partnerships with pharma and research institutes where we can leverage this technology to identify novel targets or repurpose existing molecules into novel diseases, where you could get to clinical trials faster and also have a speedier approval process. That’s really VeriSIM Life’s mission: to address the translational challenges using hybrid AI.
[00:05:40] HC: How do you use AI to solve this?
[00:05:42] JV: That’s a good question, and it has a lot of different elements. We use different forms and different types of AI. Most of our work is in machine learning, where we have used algorithms to learn different patterns and also to fill in gaps. As you can imagine, in our industry, there is a lot of data. But most of the data is either inaccurate or has missing values, or it has not been standardized or curated for AI use, so we use several machine learning techniques to fill in those gaps based on the outcome.
This is where these mathematical representations help us unravel how a drug potentially interacts when there is an outcome of interest, for example, drug exposure in a particular patient population. When we don’t have the data, that helps us fill the gap and then create more synthetic data based on it.
Then we also use generative AI for computational chemistry-based features, where we look at how different molecules and their associated features can help us identify novel molecules, but can also help us run these translational scores we have, to identify the best chemistry features that connect with the physiological properties of interest for that program.
[00:07:16] HC: What type of data are you working with on these problems? What does this data look like?
[00:07:20] JV: Our database consists of several different types of data. Think chemistry data like small molecules, peptides, and different types of large molecules such as ADCs, mAbs, and bispecifics. The flavors and the amount of data vary heavily. For example, with large molecules, you don’t get a lot of data because there’s not a lot of large-molecule data out there. With small molecules, you have a lot of data, but it spans several different chemical series and so on.
Then we have data from in vitro and in vivo models, toxicity aspects, and potential adverse events. Ultimately, we have clinical trial data. Again, I want to emphasize that all of that comes in different flavors and different types, depending on the disease and the program we are looking into. The next question you may have is, okay, how do you have all this data, right?
What we have done is, first, we focused on building the infrastructure to identify the gaps: where we can fill them with either knowledge or these mathematical models, and where we need to go find data to validate these mathematical models or to identify the gaps within them. With that said, we have our own assets’ data that goes into our platform. Then we have data partnerships with companies such as Clarivate that help us identify novel aspects within their curated databases.
Then we have created this massive synthetic data set, which is basically what I was talking about: creating different variations of a similar aspect. For example, looking at drug exposure and how it will behave in a pediatric population, female versus male, across different ethnicities. We’ve created [inaudible 00:09:21] of that data to really be able to use it in helping us make better predictions and also to have enough data to use more sophisticated AI algorithms.
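To make the synthetic-data idea above concrete, here is a minimal, illustrative sketch of how drug-exposure data could be simulated from a simple mechanistic model across a virtual population. The one-compartment model, parameter distributions, and dose are hypothetical assumptions for demonstration, not VeriSIM Life’s actual platform.

```python
# Illustrative only: simulate synthetic drug-exposure data from a simple
# one-compartment oral PK model across a virtual population. The model,
# parameter distributions, and dose are hypothetical assumptions.
import numpy as np

rng = np.random.default_rng(0)

def concentration(t, dose, ka, ke, vd):
    """Plasma concentration over time for a one-compartment model with
    first-order absorption (ka), elimination (ke), and volume vd."""
    return (dose * ka) / (vd * (ka - ke)) * (np.exp(-ke * t) - np.exp(-ka * t))

t = np.linspace(0, 24, 49)  # hours after dosing
summaries = []
for _ in range(1000):  # 1,000 virtual subjects
    # Sample between-subject variability (hypothetical log-normal spreads).
    ka = rng.lognormal(np.log(1.0), 0.3)    # absorption rate, 1/h
    ke = rng.lognormal(np.log(0.2), 0.3)    # elimination rate, 1/h
    vd = rng.lognormal(np.log(40.0), 0.25)  # volume of distribution, L
    conc = concentration(t, dose=100.0, ka=ka, ke=ke, vd=vd)
    auc = np.sum((conc[1:] + conc[:-1]) / 2 * np.diff(t))  # trapezoidal AUC
    summaries.append((conc.max(), auc))

cmax, auc = np.median(summaries, axis=0)
print(f"Median Cmax: {cmax:.2f} mg/L, median AUC: {auc:.1f} mg*h/L")
```

Each virtual subject contributes summary exposure metrics (Cmax, AUC) of the kind that could stand in for missing measurements when training downstream models.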
[00:09:34] HC: Of all these different types of data, do multiple types go into building a single model? Or do you have different models for different types of data you’re working with?
[00:09:42] JV: It’s the latter because, as you can imagine, if you don’t have enough data for the specific problem you are looking at, you really have to work with different types of models. We actually have a suite of different AI models that we test on the same data set to see where we can hit the maximum accuracy. Then we go and select that model for that specific problem. One of the other things that we have done, and this is fundamentally a unique aspect of our company that I don’t see in many AI companies, is rely a lot on the knowledge aspect.
What that really means is we try to unravel what exactly the mechanism is and what exactly the physiological relevance is in order to build those mathematical representations. That helps us reduce the data needs that many companies have, but it also helps us connect the dots: if we’re seeing a lack of accuracy in a specific aspect, we can actually understand why that could happen, which creates more explainability and transparency in our models and our predictions.
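As a rough illustration of the “suite of models” idea described above, the sketch below compares a few candidate classifiers by cross-validated accuracy on the same data set and keeps the best performer. The candidate models and the synthetic data are assumptions for demonstration, not the models VeriSIM actually uses.

```python
# Illustrative only: evaluate a suite of candidate models on the same data set
# by cross-validated accuracy and keep the best performer for that problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Stand-in data set; in practice this would be a curated assay or PK/PD table.
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy for each candidate.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
print(scores)
print(f"Selected for this problem: {best}")
```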
[00:10:50] HC: That chemistry and biology knowledge goes into the models themselves. I imagine some kind of collaboration between those who understand biology and chemistry and those who understand the machine learning and some kind of merging of that knowledge.
[00:11:06] JV: Absolutely. It’s a lot, right? I don’t think I can say it’s a simple model. Biology is complex, and we need to understand it enough to really create a codified version of that biology. That’s really something we spend a lot of time doing before we even touch anything on the AI side, because if we don’t understand the biology or the chemistry, then we don’t really know how the predictions are going to be made or how to make the right assumptions for the predictions we want to make.
Having a good understanding of biology is so critical, especially for us and, I think, for any other AI company who’s really trying to be in biology. Without that, I think we are just making predictions that most likely are not going to have any meaningful outcomes.
[00:11:59] HC: What kinds of challenges do you encounter in working with all this data? One you mentioned already is the missing data. That might be the largest one, but what others are there?
[00:12:08] JV: Oh, where do I start? There’s just so much. There’s so much noise out in the public, where everybody’s posting that they have this amount of data or that amount of data. I think that’s the big challenge, right? There’s a lot of misinformation that makes the knowledge models look non-existent. But the reality is that we don’t have enough data. We need to get creative, because the other option is to do the same thing we have been doing forever, right?
Getting creative can mean two different things. One is you can raise hundreds of millions of dollars, create a lab, and create a high-throughput assay. But that assay is only specific to a specific problem. It could be a specific disease or a specific experiment. You can’t really scale all the different experiments you need to do to convince the FDA and then go into the clinical trial. It’s just not practical, and it’s very expensive. You need billions of dollars to really create that level of scalability. There are some companies who are honing in on one or two specific aspects. They can build a lab, do all the work, and then basically do non-AI kinds of work. The other approach, which is what we are taking to solve the challenge of data, is to create these mathematical models.
In the industry, we already have a lot of mathematical models. We have computational chemistry models. We have efficacy models. We have drug exposure models. We have what we call pharmacodynamic models that represent how the drug will impact the physiological response and how the physiological response will impact the drug. But they’re not connected with each other. When they’re not connected, whatever outcomes you’re getting from, say, the chemistry models don’t really translate into these other models.
We have connected all these things so that we can address and learn about the biological relationship with chemistry without needing a lot of data, because these models are fairly good, but they have not really been connected in the industry. That’s something we’ve done to solve these data gaps. Then we use AI to fill in the gaps, let the models learn from each other, and build better understanding and explainability.
Then the other aspect, staying on the challenge of the data, is explainability and transparency. Connecting these different mathematical models is one thing. The other thing is, how do these models really inform clinical success, right? Ultimately, the biggest way to figure out whether or not our technology is useful is in the clinical trials. To do that, we have come up with what we call a translational index score, which is very much inspired by a credit score. With your credit score, you know your financial health, and you know where you should be making the right decisions, whether you should be investing more or less, and all that stuff.
Similarly, the translational index is the nonlinear relationship between these models, and they have different weights. Then the score is basically a compilation of all these different outcomes into one to tell you what’s the likelihood of clinical success. It not only tells us which model is performing better or not, but it also tells us, okay, in reality, this molecule, even with everything going well, we may not see a clinical success for x, y, z reasons, unless you reformulate it or you change the chemistry or you change the dose without impacting the toxicity and all that good stuff.
It’s really a guiding light, just like your credit score, to help inform whether we are making the right investments in the drug chemistry or the type of drug program we are invested in, or whether we should look at different aspects or basically kill the program entirely at an earlier stage of drug development instead of spending millions of dollars and then not seeing the outcome. We are already actively mitigating those challenges and seeing significant improvements and results in our programs.
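To illustrate the credit-score analogy above, here is a minimal sketch of a composite score that combines several model outputs with different weights through a nonlinear squashing function. The component names, weights, and scaling are hypothetical; the actual translational index is VeriSIM’s own proprietary formulation.

```python
# Illustrative only: a credit-score-style composite that combines several model
# outputs (assumed to be scaled to [0, 1]) with different weights through a
# nonlinear logistic squashing into a single 0-100 score.
import math

def composite_score(components, weights):
    """Weighted average of component scores passed through a logistic curve."""
    total = sum(weights.values())
    weighted_mean = sum(weights[k] * components[k] for k in components) / total
    logit = 6.0 * (weighted_mean - 0.5)  # nonlinear stretch around the midpoint
    return 100.0 / (1.0 + math.exp(-logit))

# Hypothetical component scores and weights for one molecule.
components = {"exposure": 0.80, "efficacy": 0.60, "toxicity": 0.90, "chemistry": 0.70}
weights = {"exposure": 2.0, "efficacy": 3.0, "toxicity": 3.0, "chemistry": 1.0}
print(f"Composite score: {composite_score(components, weights):.1f} / 100")
```

Re-scoring after changing a single component (say, toxicity after a reformulation) shows how such a composite can flag a molecule worth fixing or deprioritizing, in the spirit of the guiding-light analogy above.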
[00:16:35] HC: One of the common challenges with machine learning, especially when you’re limited on the amount of data, is overfitting to the particular training set that you have. How do you ensure that your model generalizes to different species, different therapeutic areas, whatever it is that you need to generalize to in your case?
[00:16:52] JV: Yes. That’s a very good question. We use different types of class imbalance techniques to ensure that, especially in limited-data scenarios. Actually, one paper that should be coming out soon in collaboration with the FDA really talks about this: how our approach still achieves much more meaningful accuracy in data-limited scenarios versus where you already have data.
The machine learning techniques we use are of several different kinds, but the approach is what I really want to emphasize here. Especially in data-limited scenarios, we look to mathematical models to understand the relationships between different aspects within the translational challenge we’re trying to solve. To be more specific with the example from the publication, we’re talking about a challenge around liver toxicity that we want to address for different chemical series.
What we found is that this hybrid AI approach has really improved the accuracy from 50% to 82%, because the feature selection with hybrid AI has become more specific. What I’m trying to say is that if you’re just using machine learning-based methods, you may not get the right features to see the accuracy that you would see with the hybrid AI approach that we take, where the features are the most sensitive and have the most impact on the accuracy of the models.
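As a rough sketch of the kind of data-limited, imbalanced setup being discussed, the example below uses class weighting to handle imbalance and compares a classifier trained on all raw features against one restricted to a small, knowledge-informed feature subset. The data, feature indices, and models are synthetic stand-ins, not the liver-toxicity study or VeriSIM’s hybrid AI.

```python
# Illustrative only: a data-limited, imbalanced classification setup handled
# with class weighting, comparing all raw features against a small,
# knowledge-informed subset. The data and feature indices are stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Small data set with ~10% positives; shuffle=False keeps the 5 informative
# features in the first 5 columns so the "informed" subset is well defined.
X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           weights=[0.9, 0.1], shuffle=False, random_state=0)
perm = np.random.default_rng(0).permutation(len(y))  # shuffle rows only
X, y = X[perm], y[perm]

def cv_balanced_accuracy(features):
    clf = LogisticRegression(max_iter=1000, class_weight="balanced")
    return cross_val_score(clf, features, y, cv=5,
                           scoring="balanced_accuracy").mean()

print("All 30 features:       ", round(cv_balanced_accuracy(X), 3))
print("Informed 5-feature set:", round(cv_balanced_accuracy(X[:, :5]), 3))
```

This is only a stand-in for the effect described here: restricting models to features grounded in mechanism can matter most when data are scarce and imbalanced.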
[00:18:36] HC: You’re using your knowledge of chemistry and biology and all the work you do in order to identify more appropriate features, stronger features, so your models can generalize.
[00:18:46] JV: Yes, exactly. Once we know, okay, these are the features really impacting the accuracy, then we can use that generalizability in other programs and see similar things. To my pleasant surprise, that happens so often, because for most of the drugs that have been approved, the delta is not that high. The variations are not that high. One of the big things we try to focus on when generalizing is also looking at molecules, drug-like molecules, or just chemical particles to identify the best features and see how they would interact with the physiological response when we already know they’re not drugs, right?
The reality is we learned so much from so-called negative data, and that helps us manage bias, because there’s always going to be some bias in the models. We try to create a holistic view of these different molecular features with biology so that we try not to miss things that could create so much bias that we end up with a very heavily overfitted model for a specific condition that can’t really generalize. That has been a real eye-opener: how much we are learning and how much it has improved the accuracy of our models based on this approach.
[00:20:06] HC: How do you validate your models? Even just the ability to generalize, how do you get the data and really make sure that they do generalize?
[00:20:14] JV: Yes. As I mentioned, there are different types of data and different types of problems. It’s not that we have just one goal and one type of data that we validate. We have validated across different molecules. If there is a molecule out there that we have already trained and tested on, we check whether we’re seeing similar outcomes with those molecules. There’s definitely more small-molecule data than large-molecule data.
We also have partnerships with different companies. We are currently working with 30 different companies. We have validated our platform on different aspects, from ADCs to peptides and whatnot, to really understand where our platform is lacking understanding or where we’re doing generally well. Not only are we validating in-house, we’re also validating externally with partners to ensure that our platform is learning and also engaging with these experts who have spent decades learning about a particular program, mechanism of action, or disease.
We’re not a perfect system. Let me make that very clear. But we do believe we are probably the only system that has painfully sought out partnerships where we say, okay, we want to validate this aspect of the platform. We go seek out those partnerships and do those challenging projects because we want to make sure that the platform really can solve the challenge we think it can. The only way to do that is by having that external validation because, internally, we can bias the system and say, “Hey. Yes, it works.” We’ve done that, and we will continue to do that as time passes.
[00:21:57] HC: How do you measure the impact of your technology?
[00:22:00] JV: Great question. We measure the impact of our technology in several forms. One is the translational index. We’ve actually brought four different companies and their programs in, and most of them are post-phase one, so that’s really exciting. These are not our assets. These are assets of other companies, so I can’t really share further details on their programs.
But long story short, we used our translational index score to help them identify either the molecule that would be the most promising among the thousands of molecules they had, or to hone in on a specific formulation of that molecule to help make it more bioavailable before they enter clinical trials. That’s really one of the big validations, right? Has our technology really helped in clinical trials? The answer is yes.
The other is in reducing R&D cost for our own programs, because we use our technology to find the most promising molecule and the efficacy model that we want to test out. Then we go do the biological validation. We reduce the trial-and-error methods, and we save not only time and cost, but we are also moving our programs significantly faster than if we did not have the technology.
We used this technology to identify molecules for a rare disease program. In three months, we received orphan drug designation from the FDA, which is unheard of. Normally, you have to spend millions of dollars gathering data to go to the FDA and get orphan drug designation, so that’s really exciting. Now, we are entering IND-enabling studies within a time frame of less than two years for these programs. I think in a year or maybe a year and a half, we will have two programs of our own in the clinical stage, which is really exciting, given we have only spent two to three years to develop the programs. We have spent half the amount we would have if we did not use the technology. Really, that’s how we are validating that our technology is working, and we’re really excited about that.
[00:24:15] HC: Is there any advice you could offer to other leaders of AI-powered startups?
[00:24:19] JV: I would say be specific. I think that phrase is beaten to death, so what does it mean, right? Many of the startups I come across say they are AI for healthcare or AI for this. But what exactly are you solving within the value chain? Pharma or healthcare is huge, and it’s very segmented. We focus very specifically on translational challenges and then go find novel molecules. Which aspect are you focused on? It could be drug development, clinical trial management, or reimbursement. Be specific so that you’re not just another of the many companies saying the same thing.
Then focus on proving and getting that validation externally. I find this very interesting, especially at the current time. There are a lot of VCs pouring millions and millions of dollars into companies because they have a great team. But the technology has not been validated, and no one really knows how it works. We have several publications out there. We have testimonials from our clients. I don’t see that often in our industry, so I find it very surprising and honestly shocking that we are giving all this capital to build systems without knowing how exactly they really work or where exactly they’re going to end up.
My advice is to focus on that validation and showing some real-world outcomes rather than just building the marketing outcome because, ultimately, we want it to get to the patients, right? We want to know if the technology really works. If it doesn’t work, you can still pivot, right? That’s the fun part about having a software component in your technology. You can find a different home, a different application, and move to solve that problem, and be very clear about that. That’s my advice. Be clear about the application.
[00:26:14] HC: Finally, where do you see the impact of VeriSIM in three to five years?
[00:26:17] JV: I think our impact will be in managing different drug assets across many different diseases. As I mentioned, the diversity of diseases the platform has touched is really humbling. I never believed when I started the company that we could go into this many different diseases and start showing impact. I see that we’ll be having an impact on several different diseases, and we will, hopefully, have our own programs move beyond the clinical trials and to the patients who need them most, since we want them to have those treatments.
I also see several different pharma partnerships that we will be forging as we move our translational index score more towards standardization to really evaluate the clinical success of a program. I’m pretty excited about that future.
[00:27:09] HC: This has been great, Jo. I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?
[00:27:17] JV: Please reach out to us at verisimlife.com. We have several CTAs on our website, so you can always click on a research article, publication, or white paper, and one of our team members will reach out to you. Or you can email us directly at [email protected] and ask questions. We will be more than happy to address them. Hopefully, we can be partners so we can help you make the best choices for your program, just like your credit score does. Who knows? Potentially, that could be the most successful path for your company, so I’m looking forward to connecting further. Thank you, Heather, for this call to action from your side.
[00:28:00] HC: Well, thank you for joining me today.
[00:28:02] JV: Likewise. Thank you very much for the time and the invite. Appreciate it.
[00:28:07] HC: All right, everyone. Thanks for listening. I’m Heather Couture, and I hope you join me again next time for Impact AI.
[OUTRO]
[00:28:16] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. If you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.
[END]