AI tools for healthcare are becoming more prevalent than ever before, and today, we explore how this could help usher in a future of democratized healthcare for all. I am joined by the neurocritical stroke and epilepsy specialist Junaid Kalia, MD, founder of NeuroCareAI – an innovative enterprise utilizing artificial intelligence solutions to enhance health outcomes and efficiency.
Junaid begins with his professional background and what led him to found NeuroCareAI before explaining what his company does and the products and services it offers. Then, we unpack the primary data sets that inform NeuroCareAI’s work, how to overcome the challenges of combining varied data types, the ethical responsibilities of AI, and how to ensure generalizability is upheld over long periods. To end, we learn why it’s essential to distinguish explainability from reasoning, how to mitigate the effects of bias on radiology data, how the regulatory process stunts the development of machine learning solutions, and Junaid’s vision for the future of NeuroCareAI.
Key Points:
- Junaid Kalia walks us through his professional background and why he formed NeuroCareAI.
- The ins and outs of NeuroCareAI and how it incorporates AI into its products and services.
- Understanding the two main forms of data that govern the company’s work.
- The challenges of combining different data types and how to overcome them.
- Unpacking the ethical responsibilities of AI.
- Generalizability over time: How Junaid and his team ensure their models continue to perform.
- Model accuracy versus explainability, and distinguishing explainability from reasoning.
- How bias affects models trained on radiology data and how to mitigate this.
- The way the regulatory process affects the development of machine learning solutions.
- Junaid Kalia’s advice for other leaders of AI-powered startups.
- His view on the future of NeuroCareAI.
Quotes:
“Coming from a very low-resource country like Pakistan, I wanted to start a project in which AI can help democratize healthcare in countries with low-resource settings.” — Junaid Kalia
“Our mission is if you save a life, it is as if you save the life of all mankind.” — Junaid Kalia
“When you are deploying artificial intelligence, you need to make sure that it's deployed ethically. [For] some of these things, we do expect our partner sites – [to] have a real quality assurance system in place before they can deploy my artificial intelligence, because I just want to be ethical.” — Junaid Kalia
“We need to differentiate [and] distinguish between reasoning and explainability. In the vision world, I believe that explainability is nice to have. In the large language models space, reasoning, in my opinion, is a must-have.” — Junaid Kalia
Links:
Junaid Kalia on LinkedIn
Junaid Kalia on X
NeuroCareAI
LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1-hour strategy session now to advance your project.
[INTRODUCTION]
[0:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven machine learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.
[INTERVIEW]
[0:00:33] HC: Today, I’m joined by Junaid Kalia, founder and CEO of NeuroCare.AI, to talk about AI tools for healthcare. Junaid, welcome to the show.
[0:00:42] JK: Thank you so much, Heather, for inviting me. I really appreciate it. I’m looking forward to it.
[0:00:46] HC: Junaid, could you share a bit about your background and how that led you to create NeuroCare.AI?
[0:00:50] JK: I’m Junaid Kalia, neurocritical care, stroke, and epilepsy specialist out of Dallas, Texas. Coming from a very low-resource country like Pakistan, I wanted to start a project in which AI can help democratize healthcare in countries with low-resource settings. We initially started as a project to automatically detect bleeds in the brain. The idea was that when you come in with a stroke, every second counts. As a matter of fact, every second, 32,000 neurons die. We have a medication that’s called a clot buster, essentially a powerful blood thinner.
The idea is that if you have a blocking type of stroke, as compared to a bleeding type, you can give that medication, which gets rid of the blockage. We made that one significantly cheaper. So, we initially started this as a project, and it still is free for low-income countries. Then, when we did all this R&D to make the project feasible – both in terms of developing mobile and web applications, integrations, etc., and AI model development, including getting data, etc. – we figured we might as well turn it into a company, so it’s a sustainable model. One thing led to another, and now we have three more models that are going to be FDA approved soon.
[0:02:05] HC: You started with one project. What does NeuroCare do today? What are these other products you have?
[0:02:11] JK: The vision of NeuroCare.AI now is that it’s a global healthcare AI platform. The idea behind it is that we take two pillars of artificial intelligence. Technically, there are three pillars. One of them is called computer vision, which is pixel-by-pixel analysis of medical imaging of any modality. The second one is generative AI – large language models – in which you bring in any clinical context that can be summarized, improved upon, and have recommendations made for it. The last one is called AI analytics, in which you take a continuous form of data – EEG, EKG, blood pressure monitoring – and do predictive analytics, for example, whether a patient will have a readmission or not.
We are concentrating on the first two pillars: number one is computer vision, and number two is generative AI. With computer vision, we always concentrate on bringing value in terms of patient care, saving lives and limbs. As a matter of fact, our mission is: if you save a life, it is as if you save the life of all mankind. The first one was bleed detection – clear management, clear benefit in terms of saving lives.
The second one is chest X-ray analysis. Just to give you an example, if you detect pneumonia even one hour earlier and start antibiotics, there’s a one-in-20 chance you’re going to avoid an ICU admission and a one-in-40 chance you’re going to save a life. That’s very important to us. Therefore, we went for that project. Then, of course, we did a mammography assessment. There’s now a clear need there, because the FDA wants density to be available on every radiology report, because that is a significant indicator of developing a breast mass, potentially cancer, in the future. Therefore, we’re doing density assessment through AI, along with breast mass detection.
Again, with artificial intelligence, the technology is far enough along that, through AI, you can detect breast cancer two years early. Again, that’s saving lives and limbs, and I’m going to be very honest with you: a woman’s life is very, very important, because it’s not just her life. Her whole family gets impacted. A mother’s life is so essential for her children as well. Honestly, those are the few areas we look at, where it brings the highest value, and we create technologies in a way that is fairly democratizable.
Our solutions run on the edge. That means you don’t need to upload to the cloud. It runs on CPU – almost any computer can do it. The idea is to make it available to as many people as possible. We are essentially lowering the cost of adoption as well.
[0:04:52] HC: These examples are all based on different imaging modalities. Where do the other pieces of AI come in? Are they extensions of the mammograms or x-rays or do they come in in other ways?
[0:05:04] JK: We use generative AI technologies to bring value to physicians. For example, if there is a bleed in the brain, the system will automatically extract its size and the type of bleed it is, and then push that into the context of the large language model. Then I can dictate a report by simply speaking my intent, rather than doing the whole “this is a 47-year-old female, comes in with large – oh, go back, fix the large.”
The dictation process is so awful otherwise. First of all, we have our own automated speech recognition feeding a large language model that creates clinical documentation in general. But the way the pieces pull together in terms of value is that the information computer vision already extracted gets pushed into our generative AI piece, and that reduces reporting time by 90%.
As a matter of fact – I’m not a radiologist, but when radiologists do a report, rather than taking seven minutes, it’s going to take them 17 seconds. That’s where the real value comes in. Of course, bringing them together is also very important to me. We bring vision and large language models together into what we call a value chain. Industry leaders call it VLM, Vision Language Models, but as I said, our idea is to deliver sometimes on the edge, so we sometimes keep the models separate; in highly resourced situations where GPUs are available, we combine them directly.
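The pipeline Junaid describes – a vision model’s structured findings injected into a language model prompt, so the physician dictates only intent – could be sketched roughly as follows. This is a minimal illustrative sketch; every name, field, and value here is an assumption, not NeuroCare.AI’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class BleedFinding:
    """Structured output a vision model might emit for a head CT (illustrative)."""
    bleed_type: str   # e.g. "intraparenchymal"
    volume_ml: float  # estimated bleed volume
    location: str     # anatomical location

def build_report_prompt(finding: BleedFinding, dictated_intent: str) -> str:
    """Fuse vision-model findings with the radiologist's spoken intent.

    The measurements come from computer vision; the physician only states
    their intent. The returned string would be sent to an LLM as context.
    """
    return (
        "You are drafting a radiology report.\n"
        f"Computer-vision findings: {finding.bleed_type} hemorrhage, "
        f"{finding.volume_ml:.1f} mL, located in the {finding.location}.\n"
        f"Radiologist intent: {dictated_intent}\n"
        "Produce a structured report with Findings and Impression sections."
    )

# Example usage with made-up values:
finding = BleedFinding(bleed_type="intraparenchymal", volume_ml=12.4,
                       location="left basal ganglia")
prompt = build_report_prompt(finding, "acute hemorrhage, recommend neurosurgery consult")
print(prompt)
```

The design point is that the prompt carries the quantitative findings automatically, so the dictation shrinks from a full narrative to a single sentence of intent.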
[0:06:42] HC: It sounds like you work with both the radiology images and the language and all of this comes together with those two main forms of data, is that right?
[0:06:51] JK: That is correct. That is the multimodal format, including the agentic format. Multimodal means that you have speech, vision, and language. The agentic model is that one agent goes through, for example, the vision model, and that output goes into the large language model. Then they can talk back and forth in terms of reasoning and understanding: “Hey, the report says such-and-such, but does it have an effusion or not?” That’s where the true agentic model also comes through.
[0:07:22] HC: What types of challenges do you encounter in working with this data, perhaps in the radiology data itself or in combining the different data types?
[0:07:30] JK: The first challenge, which is huge, is of course data quality. As a matter of fact, it’s interesting that within the last year and a half, the quantity of data required has gone down. However, what we have realized is that less is more – but that smaller amount of data needs to be higher-quality data. And in certain situations, we still do not have the proper data.
I’m just going to give you an example. I want to develop things that are truly generalizable, right? Mammography is the best example. We do have massive amounts of data from the US, India, Bangladesh, Venezuela, and Brazil, but I don’t have data from, I don’t know, Ethiopia or Nigeria. Do you understand? What I’m trying to do is make a fully generalizable model. For that, we need access to data that is truly massive at a global scale, so we can create something that can be implemented anywhere.
The second biggest challenge is image quality. Over here in the US, technicians are superbly trained. They know how to make sure that the mammogram is done perfectly; the quality is crisp. In some cases, it’s not – as a matter of fact, there’s a real issue even in the US, and there are companies dedicated to improving the quality of mammograms – but what I’m saying is that the overall average quality here is still high. Elsewhere, that is not always the case.
When you are deploying artificial intelligence, you need to make sure that it’s deployed ethically. For some of these things, we do expect our partner sites – not in the US, but elsewhere – to have a real quality assurance system in place before they can deploy my artificial intelligence, because I just want to be ethical. I mean, people have deployed without all that, but that’s on them. From the language side, initially, the problem was scalability and cost.
Of course, more and more large language models are coming out. They are decreasing in cost – and when I say cost, I mean the number of GPUs required – and the availability of different GPU providers has gone up. That problem was there, but it has been solved. Still, we remain very careful, because we started with radiology reporting and with what we call clinical documentation of the conversation between physician and patient. We have to fine-tune it. We have to do a bunch of testing. And I cannot go and take it to the FDA and say, “Hey, look, this is my sensitivity, specificity, accuracy – approve it.” The FDA doesn’t have a pathway.
I wish the FDA had that pathway, and that we had truly standardized benchmarks we could measure against, to show everyone that it is a truly accurate, ethical implementation. I truly believe that AI in healthcare has a big future, and we need to make sure that regulatory bodies implement some of these processes.
[0:10:30] HC: You mentioned generalizability with respect to gathering data from different countries. What about generalizability over time? How do you ensure that your models continue to perform, maybe a new scanner comes out or a new variant of a disease appears? How do you capture that in your models?
[0:10:47] JK: Correct. Currently, the niche is still very narrow. For example, right now my model for bleed in the brain detects five types of bleed. That’s the only thing it detects. It’s a copilot; the value it brings is that you increase the speed of care. But there are so many other things that could be wrong in the fricking brain, right? The ventricles could be larger. There could be a mass, etc. In the background, we know exactly how our AI can do those things – those models are just not deployed – but again, we do not have pathways with the FDA to actually deploy what we call more generalized AI.
That’s one problem on that side as well. But to your question of what happens over time: we did multiple studies internally – we haven’t published them – showing that after approximately 1.5 million images are processed through our artificial intelligence vision models, they drift. That phenomenon is called model drift, and that drift decreases sensitivity and specificity. What we do is refresh the model well before that point. Internally – whether it’s served via AWS, GCP, or Azure – we have a cycle: we tally how many images and studies are processed, and we refresh the model before the drift point so that it doesn’t lose sensitivity and specificity. That’s number one. Second, we monitor model drift internally when allowed, because our installations are highly cybersecure and privacy-preserving. If the institutions do allow us, we expect them to share one in 100 images along with its report, so we always make sure that quality assurance is going on.
Third, if we’re deploying on the edge, similar things apply. This is not a one-time installation that anyone can use forever. There’s accounting going on, even when deployed on the edge. That means that if a portable chest X-ray machine has our AI embedded in it, they still have to refresh the model on at least a yearly basis, or at a certain image count. Those are the three things we use to prevent and monitor model drift. On the large language model side, physicians are the ones using it. Believe me, if it wasn’t performing, even on one word, they’d give us hell and we’d improve it right away.
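The bookkeeping Junaid describes – tally processed studies, refresh the model before the empirical drift point, and sample roughly one in 100 studies for quality review – could be sketched like this. This is a minimal sketch under stated assumptions; the class, thresholds, and sampling mechanism are illustrative, not NeuroCare.AI’s actual system.

```python
import random

REFRESH_THRESHOLD = 1_500_000  # refresh well before the observed drift point (illustrative)
QA_SAMPLE_RATE = 1 / 100       # share roughly 1 in 100 studies for quality assurance

class DriftGuard:
    """Tally processed studies and flag when a model refresh is due (sketch)."""

    def __init__(self, refresh_threshold: int = REFRESH_THRESHOLD):
        self.refresh_threshold = refresh_threshold
        self.processed = 0
        self.qa_queue: list[str] = []

    def record_study(self, study_id: str) -> None:
        """Count a processed study and randomly sample it for human QA review."""
        self.processed += 1
        if random.random() < QA_SAMPLE_RATE:
            self.qa_queue.append(study_id)

    def refresh_due(self) -> bool:
        """True once enough studies have been processed to warrant a refresh."""
        return self.processed >= self.refresh_threshold

# Example usage with a tiny threshold for demonstration:
guard = DriftGuard(refresh_threshold=3)
for sid in ["study-1", "study-2", "study-3"]:
    guard.record_study(sid)
print(guard.refresh_due())  # True once the threshold is reached
```

The same counter works for edge deployments: the embedded model ships with a threshold, and the device refuses stale weights once the count (or a yearly clock) runs out.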
[0:13:19] HC: How do you think about the balance between model accuracy and explainability? Is explainability critical for this type of application?
[0:13:26] JK: That question has been asked to me multiple times and I’m going to be very honest with you. Heather, what do you think my explainability is?
[0:13:33] HC: Well, I think part of it is based on experience and you might be able to explain a lot of your intuition.
[0:13:38] JK: Correct. I mean, at the end of the day, people say, “Well, AI is a black box” – so am I. The expectation is that because it’s being done by a machine, we need explainability. Let me give you another example. I don’t know if you know this: there were drugs approved by the FDA without knowing their mechanism of action for decades. There were epilepsy drugs approved in the 1960s, ’70s, and ’80s, and nobody knew how the hell they worked. Only much later, with progress in genetics and molecular biology, did we figure out, “Oh, by the way, this is the protein it attaches to; that’s why it does that.” But we used drugs all the time without knowing the mechanism of action.
People make a big deal about explainability. I think it needs to be put into context. What is more important is exactly what I said: we need to make sure models that come out of any company are validated, and the FDA does a beautiful job there. For neuro-ICH, I had to prove that it works on multiple age brackets. I had to prove it works on different genders. I had to prove it works on different slice thicknesses. I had to prove that it works with devices from different manufacturers – Philips, Siemens, etc. When you have a system in which the accuracy, sensitivity, and specificity of that particular detection and diagnosis have been validated, then, secondarily, you take explainability into account.
Where is explainability important? It’s more important in large language models. Say the model summarizes everything, but then it makes a recommendation – or I write, for example, “start amoxicillin” and it puts a red mark on it. Then I have to click on it and ask, “Why are you saying I should not prescribe amoxicillin?” And it’s going to say, “Oh, by the way, three years ago, the patient had a reaction to a cephalosporin, and there’s 30% cross-reactivity between these two medications; you need to reconsider whether you really want to use amoxicillin.”
In this situation, explainability is not quite the right word. Reasoning is the word, along with the data it came from – like the fact that three years ago, the patient had this reaction. We need to differentiate, distinguish between reasoning and explainability. In the vision world, I believe that explainability is nice to have. In the large language models space, reasoning, in my opinion, is a must-have.
[0:16:13] HC: What about bias? How might bias manifest with models trained on radiology data? Are there some things your team is doing to mitigate it?
[0:16:22] JK: Oh, yeah, exactly. We have to make sure that we have massive training data sets, which we do. Generally, most companies do have large amounts of available data that they can train on. As I said, that also points to bias versus generalizability – they are two sides of the same coin. To reduce bias and essentially improve generalizability, it is all about a high-quality, broad set of data to train on. That is, again, on the vision-model side.
Now, the large language models are trained at least on US systems, because, let’s be honest, outside the US nobody’s really doing that good of record keeping, even in Europe – in the US, record keeping has to go through the billing process as well. We are documenting a few things, honestly, more for billing than for anything else. And even in those charts, a lot of EHR data is actually junk data, garbage data. Garbage in, garbage out.
The way we have done it is that we have preselected, curated, high-quality data. Then we are using that data to create what we call synthetic data, making sure that we are not missing any combination – X age, X gender, X diagnosis with X current situation. We have created so many variations, but that depends on high-quality base data; with the synthetic data added, the model is then fine-tuned.
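One concrete way to find the “X age, X gender, X diagnosis” combinations that a curated data set is missing – the cells where synthetic variants would then be generated – is a simple coverage check over the demographic axes. A minimal sketch, assuming made-up axes and record fields; real curation would use clinically chosen strata.

```python
from itertools import product

# Illustrative stratification axes (assumptions, not a clinical standard):
AGES = ["<40", "40-60", ">60"]
SEXES = ["female", "male"]
DIAGNOSES = ["pneumonia", "no acute finding"]

def coverage_gaps(records: list[dict]) -> list[tuple]:
    """Return (age, sex, diagnosis) combinations absent from the curated set.

    These are the cells where synthetic variations would be generated so the
    fine-tuning data covers every stratum, not just the ones the base data
    happens to contain.
    """
    seen = {(r["age"], r["sex"], r["diagnosis"]) for r in records}
    return [combo for combo in product(AGES, SEXES, DIAGNOSES) if combo not in seen]

# Example: a tiny curated set covering only 2 of the 3*2*2 = 12 cells.
curated = [
    {"age": "40-60", "sex": "female", "diagnosis": "pneumonia"},
    {"age": ">60", "sex": "male", "diagnosis": "no acute finding"},
]
gaps = coverage_gaps(curated)
print(len(gaps))  # 10 unpopulated cells remain
```

The point of the check is the one Junaid makes: synthetic augmentation is only as good as the high-quality base data it varies, so the gap list tells you where to generate, while the base records anchor what the variants look like.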
[0:18:04] HC: How does the regulatory process affect the way you develop machine learning solutions?
[0:18:09] JK: First of all, it’s constraining – it’s shackles on my ankles and hands – because clearly, as I told you, we cannot do more foundational-type models just yet. There’s one approved recently, but other than that, we are only inching toward it. Our decision is what to build, because model development is very easy; anyone who has data and an AI engineer can develop a model.
We are developing products. When you move from the word “model” to the word “product,” it has to be regulatory approved. That regulatory approval is a significant hurdle for us to innovate into different modalities, in different ways, at this point in time. Hopefully, the FDA is going to remove some of these shackles. Again, the new administration has adjusted things in general as a whole, too; that’s yet to be seen. But my selection of products depends very highly on whether the FDA is going to approve them or not.
Even if I think something is going to bring massive value, if the FDA is not going to be able to clear it, I might not do it in the vision space. And as far as large language models and generative AI are concerned, there’s clearly no FDA approval process at all. All three pieces need to be in place: we need synthetic data benchmarks, we need benchmarks for reasoning, and lastly, accuracy metrics, etc. I hope they catch up.
[0:19:40] HC: It sounds like you really need to think through that whole process and what will be coming from the outset when you’re starting to think about a new product.
[0:19:47] JK: Correct. The FDA actually signaled last year that it is moving toward a total product lifecycle, TPLC, approach, which is a very important standard in other industries, like military software, etc. But again, we still have to see how things change this year.
[0:20:04] HC: Is there any advice you could offer to other leaders of AI-powered startups?
[0:20:08] JK: Number one, concentrate on value. Any product that you bring to the market should have clearly defined value in terms of return on investment, patient outcomes, or physician satisfaction. Those are your three criteria. It’s extremely important that you are truly bringing value with your product selection, because product-market fit is exactly that: finding the right pain point.
The second thing I would recommend is to invest heavily, from the ground up, in a cybersecure and privacy-preserving system. We adopted an actual cybersecurity framework, and we were glad we started from the very beginning, because there are so many things you’ll figure out later that become technical debt if you don’t do it from the get-go.
[0:20:57] HC: Finally, where do you see the impact of NeuroCare AI in three to five years?
[0:21:02] JK: In three to five years, we are going to be a global company, probably delivering the highest value – which means the best accuracy at the lowest cost – globally. We already have installations going on here in the US and globally. We’ll see that within this year, we’re going to have four FDA approvals. By the end of year three, we’re looking at around 20 or so FDA-approved products.
[0:21:28] HC: This has been great, Junaid. I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?
[0:21:36] JK: I think the best place to find me is on LinkedIn. Of course, you can also follow our company, NeuroCare.AI, at the website for products, descriptions, etc.
[0:21:46] HC: Perfect. Thanks for joining me today.
[0:21:48] JK: Thank you so much for having me. I appreciate it.
[0:21:50] HC: All right, everyone. Thanks for listening. I’m Heather Couture, and I hope you join me again next time for Impact AI.
[END OF INTERVIEW]
[0:22:00] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. If you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.
[END]