Today’s guest believes that decoding the immune system is at the heart of improving drug efficacy. He is currently focused on this effort as the CEO and Co-founder of Immunai – a company that is building an AI model of the immune system to facilitate the development of next-generation immunomodulatory therapeutics. Noam Solomon begins our conversation by detailing his professional history and how it led to Immunai before explaining what Immunai does and why this work is vital for healthcare. Then, we discover how understanding the immune system will help to improve how drugs work in our bodies, how the team at Immunai accomplishes its goals, the major challenges of working with complex ML models, and some helpful recommendations for processing the high-dimensional nature of biological data. Noam also explains the collaborative landscape of Immunai, how the evolution of technology made his work possible, Immunai’s plans for the future, and his advice to others on a similar career path.
Key Points:
- Unpacking Noam Solomon’s professional journey that led to his founding of Immunai.
- What Immunai does and why this work is vital for the healthcare industry.
- How understanding the immune system will help to improve drug efficacy.
- Exploring how Noam and his team use AI to accomplish their goals.
- The standardization of data and other challenges of working with complex ML models.
- Techniques for handling the high-dimensional nature of biological data.
- How ML experts collaborate with other domains to inform and build Immunai’s models.
- The technical advancements that have made Noam’s work possible.
- His advice to other leaders of AI-powered startups, and imagining the future of Immunai.
- How to connect with Noam and his work.
Quotes:
“First, let’s talk about the problem, which is today, getting a drug from IND approval to FDA approval—which is the process of doing clinical trials—has less than a 10% chance of success, usually about a 5% chance, takes more than 10 years, and more than $2 billion [inaudible] immunotherapy.” — Noam Solomon
“Different people respond differently to the same drug, and the reason they respond differently is because their immune system is different.” — Noam Solomon
“You first need to fall in love with the problems. Many ML people—physicists, mathematicians, computer scientists—we love building models; we love solving puzzles. In biology, you need to really fall in love with the question you are trying to answer.” — Noam Solomon
“It’s a great decade for biology.” — Noam Solomon
Links:
Noam Solomon on LinkedIn
Noam Solomon on X
Immunai
LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.
[INTRODUCTION]
[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine-learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.
[INTERVIEW]
[0:00:34.1] HC: Today, I’m joined by guest Noam Solomon, CEO and co-founder of Immunai, to talk about decoding the immune system for drug discovery. Noam, welcome to the show.
[0:00:43.3] NS: Thanks, thanks for having me over.
[0:00:45.2] HC: Could you share a bit about your background and how that led you to create Immunai?
[0:00:48.6] NS: Yeah, happy to. I was born and raised in Israel and, for as long as I can remember, I was always fascinated by mathematics, solving puzzles, et cetera. So, I went to university when I was about 14. I did my bachelor’s in math and computer science, and then it led me to do two Ph.D.s and a few post-docs, a few industry jobs as a machine learning and data scientist, and then I founded the company.
[0:01:14.0] HC: So, what does Immunai do and why is this important for healthcare?
[0:01:17.8] NS: So, as a company, as our name suggests, we are mapping the immune system. We’re mapping it with single-cell technologies, biology, and AI. Mapping the immune system is considered incredibly complex, so we can talk more about what that means, but the impact that this has is around improvement of the drug development process: being able to inform critical decision-making, and I can give examples of what that means, to improve the incredibly low percentage of drugs being approved by the FDA.
[0:01:46.4] HC: So, how do you accomplish this by better understanding the immune system? What does it mean to understand the immune system in order to improve drugs?
[0:01:54.0] NS: Right. So, first, let’s talk about the problem, which is that today, getting a drug from IND approval to FDA approval, which is the process of doing clinical trials, has less than a 10% chance of success, usually about a 5% chance. It takes more than 10 years and more than two billion dollars [inaudible] immunotherapy. So, the question is, “Why is it so poor?” especially with growing technologies and improvement of science.
There are many answers to this question, but I think fundamentally, it’s because different people respond differently to the same drug, and the reason why they respond differently is because their immune system is different. So, if we accept this as a premise, the question is, “How can you measure the immune system and immune response to the drug with better precision and better accuracy?”
We believe the answer is to really map the immune system with the most granular technologies, and that’s what we have been doing from the get-go: mapping the immune system with single-cell multimodal profiling to build a very large atlas of the immune system before and after different therapies are administered, and then leveraging computational models, especially AI, to try to find the better ones that could explain to us how the drugs work for the patient.
[0:03:06.9] HC: So, how do you use AI to do this? Do you put in your single-cell data as input, and then what are you trying to predict with it? How do you set up the machine learning problem there?
[0:03:16.9] NS: Right. So, let’s talk about, you know, there is the data aspect and the AI or the machine learning aspect. Should I start with data or with AI?
[0:03:24.6] HC: Either, whichever is easier to explain first.
[0:03:27.5] NS: So, let’s start with the data because I think engineering data comes first and then the AI. So, when you’re measuring how drugs impact the immune system, you want to measure drugs both pre-clinically and then clinically, and what it means is that you want to measure the way molecules affect cells in a dish, or ex vivo, or in vivo. That means in different pre-clinical models, like in mice or in patient-derived tumor fragments, et cetera.
And you want to be able to infer from these conclusions or insights what will happen in the clinical trial to actual patients. So, what we do at Immunai is generate a lot of single-cell multimodal profiling of patients being treated in a clinical trial setting, before and after the drug has been administered. We also measure a lot of pre-clinical data that we create in our lab in New York.
And then we harmonize these datasets together, and we do it with as many drugs as we can, with as many patients as we can. Today, we have tens of thousands of patient samples in AMICA; it’s a growing database. So, this is the data front. We also harmonize and ingest a lot of sequencing data coming from the public domain, so not all of it is generated in our lab. That allows us to really curate all the public domain data on top of our own datasets.
And then, on the AI side, the AI actually enters the picture in a few different places, at least three. First, you need to harmonize the data. This means cleaning the data, building disease ontologies, and accounting for and reducing the batch effects that result from the fact that RNA data is very noisy and that the measurement machines also introduce noise. So, the batch effects are critical.
You then use ML to do cell type and other biological annotations for the data, and then you use AI and ML to make predictions on both clinical and other commercial questions that are relevant. So, there are different layers where ML and AI enter the picture.
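As a rough illustration of the batch-effect step mentioned above, the sketch below removes a per-batch offset by centering each gene within its batch and then restoring the overall mean. This is a deliberately simple stand-in, not Immunai's method; the function name `center_batches`, the batch labels, and the toy values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def center_batches(expression, batches):
    """Subtract each batch's per-gene mean, then add back the global per-gene mean.

    expression: list of samples, each a list with one value per gene.
    batches: list of batch labels, one per sample.
    """
    n_genes = len(expression[0])
    global_mean = [mean(row[g] for row in expression) for g in range(n_genes)]

    # Group the samples by the batch (for example, the sequencing run) they came from.
    rows_by_batch = defaultdict(list)
    for row, batch in zip(expression, batches):
        rows_by_batch[batch].append(row)

    batch_mean = {
        batch: [mean(r[g] for r in rows) for g in range(n_genes)]
        for batch, rows in rows_by_batch.items()
    }

    return [
        [row[g] - batch_mean[batch][g] + global_mean[g] for g in range(n_genes)]
        for row, batch in zip(expression, batches)
    ]


# Toy data (hypothetical): two genes, two runs; run_b reads 10 units higher everywhere.
expression = [
    [1.0, 5.0],
    [3.0, 7.0],
    [11.0, 15.0],
    [13.0, 17.0],
]
batches = ["run_a", "run_a", "run_b", "run_b"]

corrected = center_batches(expression, batches)
for label, row in zip(batches, corrected):
    print(label, row)
```

Mean-centering like this also erases any real biological difference that happens to line up with the batch, which is one reason batch design and richer correction methods matter in practice.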
[0:05:36.1] HC: So, you mentioned the standardization of data. I guess that’s one of the challenges you deal with. What are some of the other challenges in working with this complex data and training machine learning models based off of it?
[0:05:47.4] NS: Yeah, that’s a great question. I touched on the point of RNA and protein data being noisy. I think it’s kind of common knowledge that when you deal with DNA sequencing data, and that’s something that even forensics has been doing for many years, you can look at the DNA many years after and still see the sequencing data, the genes, and the genome quite well. RNA degrades really quickly.
So, when you take a biological specimen from a human patient and you start sequencing the RNA or the protein from the patient, the way that you process the samples really impacts the results. So, this is not biology, this is just technical noise. You don’t want to have the same patient sample being processed slightly differently, with different temperatures, different syringes, different weather conditions, and then end up seeing different data at the end.
So, that’s a key challenge of, you know, working with multimodal or multi-omic data: this biological data is noisy. That’s something that exists more in biology. Then the other aspect is that it’s very high dimensional. So, for example, when you take a biological sample from a patient, even if you only measure 5,000 or 10,000 cells, when there are billions of those cells, and then you measure 10,000 different genes for every cell, you get a very large matrix.
And the problem that we all know from ML and AI is that when you train models on small datasets with a very high parameter space, you can very easily overfit. So, that’s, you know, a key challenge of working with high-dimensional multi-omic data. And then I think the last challenge is that those assays that we’re using get better every quarter and every year. So, when you map the immune system with one assay, it could be too late.
If you’re going to use a more sophisticated, more up-to-date assay, you need to build it in a modular way so that the database will be relevant as you grow it over time.
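To make the overfitting risk concrete, here is a minimal sketch under stated assumptions: 10 training samples, 2,000 "genes" that are pure random noise, and a one-gene threshold rule chosen to fit the training labels. Every name and number here is hypothetical. With this many features and so few samples, some noise gene will almost always separate the training labels perfectly, while performance on fresh samples stays near chance.

```python
import random


def predict(row, rule):
    """Apply a one-gene threshold rule: (gene index, threshold, direction)."""
    gene, threshold, sign = rule
    return 1 if sign * (row[gene] - threshold) > 0 else 0


def accuracy(samples, labels, rule):
    hits = sum(predict(row, rule) == y for row, y in zip(samples, labels))
    return hits / len(labels)


def fit_best_single_gene(samples, labels):
    """Search every gene and threshold for the rule with the best training accuracy."""
    best_acc, best_rule = 0.0, None
    for gene in range(len(samples[0])):
        values = sorted(row[gene] for row in samples)
        # Candidate thresholds are midpoints between neighbouring observed values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for sign in (1, -1):
                rule = (gene, threshold, sign)
                acc = accuracy(samples, labels, rule)
                if acc > best_acc:
                    best_acc, best_rule = acc, rule
    return best_acc, best_rule


def noise_matrix(rng, n_samples, n_genes):
    """Samples-by-genes matrix of standard normal noise (no real signal)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]


rng = random.Random(0)
n_genes = 2000
train_x = noise_matrix(rng, 10, n_genes)
train_y = [0] * 5 + [1] * 5                  # labels unrelated to the features
test_x = noise_matrix(rng, 200, n_genes)
test_y = [rng.randint(0, 1) for _ in range(200)]

train_acc, rule = fit_best_single_gene(train_x, train_y)
test_acc = accuracy(test_x, test_y, rule)
print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {test_acc:.2f}")
```

The gap between the two numbers is the point: a model can look excellent on a small, wide dataset and still carry no predictive value.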
[0:07:51.1] HC: So, the overfitting challenge that you mentioned due to the high-dimensional nature of the data, how do you handle that? Are there specific techniques that make it tractable?
[0:07:59.5] NS: Yeah. I think the question to answer is how you know that your models are relevant and correct, because overfitting, or getting good results on small datasets, is a very common problem. So, I think you need to have metrics to validate your findings. I think that’s part of why we are building the database the way we are building it: we are leveraging a lot of clinical metadata on top of the biological data.
So, that means, for example, when you are doing a clinical trial and you’re sequencing the peripheral blood samples from patients before and after therapy, we would also know what the indication is, what disease the patients have, what drug was treating the disease, and whether the patients responded and to what extent. Then we can try to see if our models can actually represent it in a good enough way to make predictions.
And if the predictions can be clinically validated, we know that our models are successful. So, the secret is to actually test our models on other single datasets, and that’s part of what we do. And not only that, we have different types of machine learning models. We have models that are a better fit for very high-volume datasets, such as foundation models. We have other models that are a better fit for very small datasets, and we compare those.
We do a lot of bake-offs, a lot of competitions between different models, and see which ones are better.
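One common way to run the kind of model bake-off described here is k-fold cross-validation, where every candidate is scored on data it did not see during fitting. The sketch below is only an illustration with made-up data and two toy models (`MajorityBaseline` and `NearestCentroid` are hypothetical names written for this example); it does not represent the models or validation process Immunai uses.

```python
import random
from collections import Counter


class MajorityBaseline:
    """Always predicts the most common training label."""

    def fit(self, xs, ys):
        self.label = Counter(ys).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label


class NearestCentroid:
    """Predicts the label whose training centroid is closest to the sample."""

    def fit(self, xs, ys):
        counts = Counter(ys)
        sums = {}
        for x, y in zip(xs, ys):
            total = sums.setdefault(y, [0.0] * len(x))
            for i, value in enumerate(x):
                total[i] += value
        self.centroids = {y: [s / counts[y] for s in total] for y, total in sums.items()}
        return self

    def predict(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


def cross_val_accuracy(model_factory, xs, ys, k, seed):
    """Mean held-out accuracy over k folds; the same seed gives the same folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        model = model_factory().fit([xs[i] for i in train], [ys[i] for i in train])
        hits = sum(model.predict(xs[i]) == ys[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / k


# Toy data (hypothetical): two well-separated groups in two dimensions.
rng = random.Random(1)
xs = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
xs += [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(40)]
ys = [0] * 40 + [1] * 40

results = {
    "majority_baseline": cross_val_accuracy(MajorityBaseline, xs, ys, k=5, seed=42),
    "nearest_centroid": cross_val_accuracy(NearestCentroid, xs, ys, k=5, seed=42),
}
for name, score in results.items():
    print(f"{name}: {score:.2f}")
```

Using the same folds for every candidate keeps the comparison fair, and including a trivial baseline shows whether a model has learned anything at all.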
[0:09:26.8] HC: So, a lot of experimentation and a great deal of careful validation, it sounds like.
[0:09:31.5] NS: Right, and maybe I’ll give you and the audience an equation that I like to think about, which leverages a couple of these notions. So, we’re building this database that we call AMICA. AMICA stands for our Annotated Multi-Omic Immune Cell Atlas, which is ever-growing, and we want it to be a billion cells. We still have a way to grow, but we are approaching a hundred million cells, and it means a lot of [inaudible].
I think it’s the largest immune atlas at single-cell resolution, and then we have the ID engine, which is an engine you can think about as the foundation model. It is growing with the additional data that we have, and in the end, what we’re trying to build is immune knowledge, or immune intelligence, that we are unlocking with our models. The equation I like to think about is that AMICA, plus the ID foundation model, plus interpretation of the model, plus the validation that we do for every insight with functional genomics in our lab, is equal to immune knowledge.
So, immune knowledge is the accumulation, over time, of all of this together and not one over the other.
[0:10:37.2] HC: So, it sounds like there’s a lot of biology knowledge that goes into this, something I imagine your average machine learning developer might not have been exposed to before. How do your machine learning experts collaborate with people from other domains in order to build this knowledge into your models?
[0:10:53.2] NS: It’s a very key question in what we are doing and maybe just to say about my own background, as I mentioned, I don’t come from biology and about six and a half years ago, I kind of jumped into this very deep ocean of immunology and I think you first need to fall in love with the problems. Many ML people, physicists, mathematicians, computer scientists, we love building models, we love solving puzzles.
In biology, you really need to fall in love with, you know, what questions you are trying to answer, and then really sign up to this marriage of disciplines. Immunai is a very [inaudible 0:11:27.3] company, and we have people coming from different disciplines: biology, immunology, machine learning, engineering, medicine. Creating this language over time allows us to build models that are carefully labeled by domain experts.
And, as you know and our audience knows really well, in very high-dimensional problems, being able to do some sort of reduction of the dimensions of the problem, by careful labeling or dimensionality reduction, is something that requires domain expertise. So, when we have those immunologists working with us, and they’re telling us that we need to look very carefully at specific labels that they know from their own background and training, this is something very helpful for building the models.
And over time, we can refine that expert-designed labeling and make it better with AI and ML. So, I think it’s a critical marriage of domain experts with machine learning that will make it better and better over time.
[0:12:27.6] HC: So, it sounds like the experts from these two different domains are collaborating quite closely, probably even on a daily basis.
[0:12:35.0] NS: Right, and everybody working in R&D is expected to complement what they didn’t know before. So, if you have a very strong background in physics or mathematics, you’re going to learn about immunology and biology and try to understand why we’re trying to solve these problems and why you’re building these models.
Equally, if you’re coming from immunology and you really understand different genes and different mechanisms and pathways, understanding how we are leveraging very high-dimensional single-cell data for these problems, why we’re looking for a more rigorous way to think about the single cells than the human mind alone allows, and seeing the value in doing this is something critical. So, I think both approaches are very useful, but if you find a good marriage between those approaches, you have something that is really important.
[0:13:25.7] HC: Are there any specific technological advancements that made it possible to do this now when it wouldn’t have been feasible even a few years ago?
[0:13:32.4] NS: Yeah, yeah. Single-cell RNA sequencing, you know, a decade ago was not a technology. Single-cell RNA sequencing is a technology that allows you to go from sequencing tissues in bulk, which is, very simplistically, like breaking up tissue, making a milkshake of all the different cells in this tissue, and looking at the average expression, to doing it at a single-cell level.
You can measure different cells differently. So, this is a relatively new technology; about a decade ago, it was non-existent. I think we are seeing a huge advancement in the space where, every year, we see more assays measuring more of the biology. For one cell, you can now measure the RNA, the protein, the epigenomics, et cetera.
So, that’s a critical advancement; it is, like, exploding. The other, of course, is that the compute power is so much stronger than before, right? I mean, we are seeing this in the AI community. This is something we are talking about all the time: the ability to really work with sophisticated models that are supported by the creation of very large datasets that are now much cheaper than they were 10 years ago.
Like, significantly cheaper than 10 years ago. It’s a critical advancement, and I think we’re going to see more of all of these components in the next few years. So, I think it is a great decade for biology.
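The "milkshake" contrast between bulk and single-cell measurement can be shown with a toy calculation. In the sketch below, two hypothetical tissue samples have identical bulk (average) expression of one gene, yet look completely different once each cell is measured on its own. The values and the names `bulk_profile` and `single_cell_profile` are invented for illustration.

```python
from statistics import mean, pstdev

# Hypothetical expression of one gene across 50 cells in two tissue samples.
sample_uniform = [5.0] * 50               # every cell expresses the gene moderately
sample_mixed = [10.0] * 25 + [0.0] * 25   # half the cells are high, half are silent


def bulk_profile(cells):
    """Bulk sequencing, simplified: only the average across all cells is observed."""
    return mean(cells)


def single_cell_profile(cells):
    """Single-cell view, simplified: the per-cell distribution is observed."""
    return {
        "mean": mean(cells),
        "spread": pstdev(cells),
        "fraction_expressing": sum(value > 0 for value in cells) / len(cells),
    }


print("bulk:", bulk_profile(sample_uniform), bulk_profile(sample_mixed))
print("single-cell, uniform sample:", single_cell_profile(sample_uniform))
print("single-cell, mixed sample:", single_cell_profile(sample_mixed))
```

Both samples report the same bulk value, but only the single-cell view reveals that the second sample contains two distinct cell populations.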
[0:14:52.0] HC: Is there any advice you could offer to other leaders of AI-powered startups?
[0:14:56.0] NS: Yeah, I mean, maybe I’ll say that what we faced about six years ago was advice to go and leverage what we knew how to do, which is bioinformatics, on this problem, instead of building our own lab and doing our own single-cell sequencing, functional genomics, and CRISPR editing. But at the end of the day, we decided we were going to be the company to map the immune system and create a new data modality.
You know, our immunomics is something that does not exist elsewhere, and it was something that we had to insist on and invest a lot of time, money, and resources in building, despite a lack of enthusiasm. People told me the immune system is infinitely complex, don’t even bother going there. So, I believe that, as founders of new companies doing AI, you need to build something that is new.
Don’t just create datasets from data that is out there and hope that a new model or a new fine-tuned model is going to make a difference. This is going to be commoditized. Go after new measurement technologies and new technologies to create more data faster, better, and cheaper. Own it, create a model around it, and then build the AI on top. If these things are done by more founders and more companies, it’s going to be great, I think, for everybody. Definitely for the patients.
[0:16:16.1] HC: And finally, where do you see the impact of Immunai in three to five years?
[0:16:20.6] NS: So, we’re already working with multiple companies. We recently announced some multi-year collaborations with AstraZeneca, and others that we’re going to announce soon. So, we hope to see more and more impact on drug development decision-making by large pharma companies and smaller biotechnology companies leveraging our technology, and of course, this means that we’re going to get to impact patient outcomes.
So, we want to improve the drug development process to eventually improve patient outcomes and, you know, we hope in the next three to five years to see an impact first on decision-making and then on the outcomes.
[0:16:54.9] HC: This has been great, Noam, I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?
[0:17:03.0] NS: So, go to our social media pages, both on LinkedIn and on Twitter. I’m always happy to meet new people and you can reach out to me personally. I’m available on LinkedIn as well, and we have a website, Immunai.com, where you can read about our new developments and press releases.
[0:17:21.0] HC: Perfect. I’ll link to all of those in the show notes. Thanks for joining me today.
[0:17:25.3] NS: Thanks a lot, I enjoyed our time together.
[0:17:27.0] HC: All right, everyone, thanks for listening. I’m Heather Couture and I hope you join me again next time for Impact AI.
[END OF INTERVIEW]
[0:17:36.7] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. And if you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.
[END]