Today, I am joined by Kilian Koepsell, co-founder and Chief Innovation Officer of Caption Health. We’re taking on the multifaceted topic of ultrasound for early disease detection. Join us as Kilian talks about the problem Caption Health identified in the world of ultrasound use, and how he is working to solve it. Hear how he is using machine learning to help practitioners to guide and interpret ultrasound imaging, why his first point of entry was cardiac health, and where the role of the machine ends and the medical expert begins. Kilian shares some challenges he has faced along the way, and encourages anyone with a similar idea to approach the FDA sooner, rather than later. Tune in today to hear how his concept aims to support healthcare in a changing world, and how he sees the future of Caption Health unfolding. Thanks for listening!

Key Points:
  • An introduction to Kilian Koepsell, co-founder and Chief Innovation Officer of Caption Health.
  • What Caption Health does and why it is important for imaging. 
  • Why there was a hurdle to get ultrasound technology used by more people.
  • The two kinds of feedback Caption Health provides: guidance and interpretation.
  • How machine learning is used to perform these two functions.
  • Why their first focus is on the heart and why it is one of the most difficult organs to image. 
  • Measurements taken by the machine for a practitioner to interpret.
  • How the quality meter works to guide the probe and gives practitioners the feedback and confidence they need.
  • Challenges in training machine learning on ultrasound imagery. 
  • Validating models across many variations.
  • Why it is so important to take FDA considerations into account from the beginning.
  • How Kilian ensures that he is developing technology that will be of use to practitioners.
  • How the Caption Health vision has changed since its inception.
  • Having a high-level thesis to survive a changing world.
  • Where Kilian sees the impact of Caption Health in five years.
“We realized that even though the hardware was available at a much lower cost to many more people, there was a big hurdle to get the ultrasound used by more people because it is actually very difficult to acquire good ultrasound images.”

“We use machine learning to understand the relationship between the imagery and the position of the probe in 3D space, and then guide the user to the right spot without the user having to even understand what they are looking at.”

“By just looking at the imagery you can see if the heart is not pumping well or if it’s enlarged or if the valves are not closing properly - all different kinds of structural heart diseases.”

“Normally you would require an expert to look over their shoulder and give them the feedback, but with this device, they can train themselves, and they get better over time by using it on patients.”

“Anyone who is trying something similar, I would encourage to get in contact with the FDA as early as possible.”




[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine learning powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at


[00:00:33] HC: Today, I’m joined by Kilian Koepsell, co-founder, and Chief Innovation Officer of Caption Health to talk about ultrasound for early disease detection. Kilian, welcome to the show.

[00:00:44] KK: Welcome, Heather.

[00:00:45] HC: Kilian, could you share a bit about your background and how that led you to create Caption Health?

[00:00:50] KK: Yes. So my background is originally in physics and mathematics, and I spent quite a while in academia, in neuroscience, theoretical neuroscience. My goal was to better understand how the brain processes images and information, and then, with this understanding, to help create real-world applications. So we worked for a while on image analysis and image recognition. This was at a time when neural networks weren’t really working yet, and were a little bit out of favor.

Then, around 2012, 2013, when the AI and deep learning took off, and was finally working, we were – the previous data was acquired, and we were really excited about the opportunities to really make a change in all kinds of different industries. We were particularly excited about medical applications and medical imaging. The vision was, that with the help of AI, we wanted to make medical imaging more accessible and give this image understanding that we thought the AI could achieve, give this in the hands of more people.

[00:01:58] HC: What does Caption Health do and why is this important for health care?

[00:02:02] KK: Yes. As I said, we wanted to focus on medical imaging, so we thought that with this vision of democratizing medical imaging, ultrasound is a great modality, because it doesn’t have any side effects and the devices over the last 10, 20 years became much more affordable and smaller. Today, you can even get a handheld probe that you can connect to your iPhone.

We realized that even though the hardware was available at much lower cost to many more people, there was a big burden or hurdle to get this ultrasound used by more people, because it is actually very difficult to acquire good ultrasound images and then, once you have acquired an ultrasound image, to understand what you’re seeing, interpret it, and detect diseases. We thought if we train an AI to do both of these tasks, the guidance to a good ultrasound image and the interpretation, then you could make this medical imaging or ultrasound images available to pretty much anyone around the world.

[00:03:09] HC: You mentioned two key components, guidance, and the interpretation. What role does machine learning play in each of those?

[00:03:15] KK: Yes. Machine learning is really the key for this to be possible. So there are basically two tasks, and both involve computer vision and image understanding. The first task is, if you put an ultrasound probe on your body, and we started with cardiac ultrasound, so if you put it on your chest and you see a piece of your heart, then it is really difficult to understand how you have to rotate, and move, and angle this ultrasound probe to get the image of the heart that you want to see, one that looks at a specific ventricle or a specific valve. This is learned typically by experts over many weeks, months, and years.

We use machine learning to basically understand the relationship between the imagery and the position of the probe in 3D space, and then guide the user to the right spot without the user having to understand what they’re even looking at. That is the first component, ultrasound guidance. Then once you get to a good image, the device would automatically record the image. Then there’s a second task. As a non-expert, you don’t really know what you’re looking at.

Again, we use machine learning to train on lots of ultrasound images, with labels of different diseases and conditions to recognize what’s going on in the image. So yes, it’s really an enabling technology, the machine learning for the application that we picked.

[00:04:45] HC: What are some examples of the types of diseases you’re looking at with ultrasound?

[00:04:49] KK: The interesting thing about ultrasound is, you can use it on most organs in the body, and you can detect a lot of different diseases. We focus first on the heart, and that is maybe one of the most difficult organs to image because it’s moving all the time. But you can see all kinds of different heart diseases by just looking at the imagery. You can see if the heart is not pumping well, or if it’s enlarged, or if the valves are not closing properly, all different kinds of structural heart diseases. Yes, this is what we focus on first.

[00:05:26] HC: Is this maybe at the level of classification, in this case, this patient has this condition or not? Or is it more locating a particular defect, or segmenting out structures, or something else that doesn’t occur to me?

[00:05:40] KK: Yes. No. Good question. You can definitely do classification and look for the presence, or absence, or maybe the severity of certain diseases. What we focused on first is that we thought, if we help the user to make certain measurements of the heart, then the diagnostic part can be left to physicians, and we don’t have to do this in the AI algorithm. Because there’s a little bit of resistance to having the AI make the final decision about a disease state.

What we did, for example, concerns one of the measurements. I mentioned the pumping function of the heart. So you can ask what percentage of the blood in your left ventricle is squeezed out with every heartbeat. Traditionally, you would segment the imagery, and then calculate the volumes when the heart has expanded and when it has contracted, and from those you can calculate how much blood is pumped out of the heart. We use an AI or deep learning algorithm to look at the video clip of the heart, and then regress a value, a percentage value, of what percentage of the blood is ejected. That’s called ejection fraction. That is probably the most important parameter of the heart.

Similarly, you can do this with all kinds of other measurements. There are on the order of a hundred measurements you can do on the heart, which then together give you a good picture of how normal or diseased the heart might be.
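
As a rough illustration of the traditional, volume-based calculation Kilian describes here, ejection fraction is the share of the end-diastolic volume (ventricle fully expanded) that is ejected by end-systole (ventricle fully contracted). The sketch below is not Caption Health's method, which regresses the value directly from a video clip; the function name and the example volumes are hypothetical and chosen only to show the arithmetic.

```python
def ejection_fraction(end_diastolic_ml: float, end_systolic_ml: float) -> float:
    """Return left-ventricular ejection fraction as a percentage.

    end_diastolic_ml: ventricle volume when fully expanded (EDV).
    end_systolic_ml:  ventricle volume when fully contracted (ESV).
    """
    if end_diastolic_ml <= 0:
        raise ValueError("end-diastolic volume must be positive")
    if not 0 <= end_systolic_ml <= end_diastolic_ml:
        raise ValueError("end-systolic volume must lie between 0 and EDV")
    # Stroke volume is the blood squeezed out in one heartbeat.
    stroke_volume_ml = end_diastolic_ml - end_systolic_ml
    return 100.0 * stroke_volume_ml / end_diastolic_ml


# Illustrative volumes only, not patient data.
example_ef = ejection_fraction(end_diastolic_ml=120.0, end_systolic_ml=50.0)
print(f"Ejection fraction: {example_ef:.1f}%")
```

In the traditional workflow the two volumes would come from segmenting the ventricle in the expanded and contracted frames; a regression model of the kind described in the interview skips that step and outputs the percentage directly.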

[00:07:08] HC: The goal of the deep learning algorithm is to come up with these measurements, and then leave it up to the clinician to make decisions based on that. Is that right?

[00:07:15] KK: Yes.

[00:07:16] HC: Going back to that guidance piece for a minute, can you give me an example of the type of guidance that those models produce? Is it things like rotate the probe, go in this direction?

[00:07:26] KK: Yes, that’s exactly right. There are these six possible dimensions. You can move the probe in all three spatial directions, and then there are different ways of changing the angle. That is rotating left, rotating right, and then angling it in the two other directions. We have two types of feedback. The most useful and the most important one is something that we call a quality meter. That just gives you a read on how close you are to the right image. It’s just a level that goes up and down.

Basically, if you reach a threshold level, if it goes up enough, if you’re close enough, then it tells you to hold the probe and to record. By just looking at this one-dimensional gauge parameter, you can actually try out how to wiggle the probe around. It’s often very fine movements to get to a better-quality image. But in addition, we have what we call prescriptive guidance. This is exactly what you mentioned: it will tell you that you have to rotate a little bit clockwise, or you have to move the probe a little bit closer to the patient’s head, and things like that.

Then, as you do that, you can observe that the quality meter goes up. Ideally, if someone reaches the threshold level, the device automatically records the image.
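
The threshold behavior of the quality meter can be pictured with a small sketch. Everything in it is an assumption made for illustration: the score range of 0 to 1, the threshold of 0.8, and the requirement that the score stay above the threshold for a few consecutive frames (a stand-in for "hold the probe") are not details given in the interview, and the function name is hypothetical.

```python
from typing import Iterable, Optional


def first_capture_frame(
    quality_scores: Iterable[float],
    threshold: float = 0.8,   # assumed value, not from the interview
    hold_frames: int = 3,     # assumed "hold the probe" duration in frames
) -> Optional[int]:
    """Return the index of the frame at which recording would trigger.

    Recording triggers once the quality score has been at or above the
    threshold for `hold_frames` consecutive frames. Returns None if that
    never happens.
    """
    if hold_frames < 1:
        raise ValueError("hold_frames must be at least 1")
    consecutive = 0
    for index, score in enumerate(quality_scores):
        if score >= threshold:
            consecutive += 1
            if consecutive >= hold_frames:
                return index
        else:
            # Quality dropped, so the user has to find the view again.
            consecutive = 0
    return None


# Hypothetical meter readings while a user wiggles the probe.
readings = [0.2, 0.85, 0.5, 0.81, 0.9, 0.95, 0.7]
print(first_capture_frame(readings))
```

The point of the sketch is only that a single one-dimensional gauge plus a threshold is enough to decide when to record; how the score itself is computed is the machine learning part discussed in the interview.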

[00:08:48] HC: I imagine that guidance is also quite helpful for training somebody who’s less experienced with ultrasound as well.

[00:08:55] KK: Yes, that’s true. As I mentioned, the traditional training takes a long time, and even after the formal training has ended, these sonographers, these are the experts in ultrasound acquisition, improve their skills over many years. We have found that with our guidance, within a couple of hours, you can get someone in a position to record an image. Then, if you’re not very experienced, there’s always this doubt, “Do I have the right image? Is this the right quality?” This quality meter really helps people: even if they’re able to get a good image, it gives them the confidence that they arrived at the right image.

It’s absolutely true that it’s useful for people who have less experience and are less trained. It gives them exactly the feedback they need in order to improve. Normally, you would require an expert to look over their shoulder and give them the feedback, but with this device, they can train themselves, and they get better over time by using it on patients.

[00:10:01] HC: What kind of challenges do you encounter in working with and training machine learning models on ultrasound imagery?

[00:10:08] KK: Yes. In general, I think it’s quite similar to training algorithms on natural images. However, there are of course a couple of things that are different. The one thing is that the ultrasound image varies a lot, depending on the physical properties of the person. It matters if you’re skinny, or more heavyweight, or if you have certain pathologies. So, if you have lung problems, sometimes that can affect the image research. It depends really on the patient, and it also depends very much on the device. It’s still pretty difficult to get a good ultrasound image, and there are many different manufacturers out there. They provide images of different quality.

In order to train the algorithm to be robust to those variations, you have to use many different devices, on many different patients, with many different pathologies. Some of them, you might only find in a hospital. This requires that you work together with a hospital to do these acquisitions. It’s harder to do it. You can’t just do it in-house with healthy subjects. That is, I would say is the main difficulty.

The other thing that we were surprised about is that, with ultrasound images specifically, there is the problem that for historical reasons, the images often have patient information in the pixel data. Even though the medical image format, which is called DICOM, would allow you to put all the metadata into different data fields, in the past this information was in the pixel image, because people would record those with videotapes. The problem for us is, if we want to take these kinds of images out of the hospital, we have to deidentify them, and so you have to find all these places where there’s patient information and remove them from the image, which is a machine learning task by itself. This was the first one we had to tackle, before we even got our hands on ultrasound images.
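
To make the deidentification step concrete, here is a minimal sketch of only the final, mechanical part: blanking out regions of a frame once the burned-in patient information has been located. Finding those regions is the machine learning task Kilian refers to and is not shown. The frame representation (a list of pixel rows), the box format, and the function name are all hypothetical choices made for this illustration.

```python
from typing import List, Tuple

# (top, left, bottom, right) in pixels; bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def mask_regions(frame: List[List[int]], boxes: List[Box], fill: int = 0) -> List[List[int]]:
    """Return a copy of `frame` with every box overwritten by `fill`.

    `boxes` is assumed to come from some upstream text detector; boxes that
    extend past the frame edges are clipped rather than rejected.
    """
    height = len(frame)
    width = len(frame[0]) if frame else 0
    masked = [row[:] for row in frame]  # leave the original frame untouched
    for top, left, bottom, right in boxes:
        for r in range(max(0, top), min(height, bottom)):
            for c in range(max(0, left), min(width, right)):
                masked[r][c] = fill
    return masked


# A tiny 3x4 grayscale "frame" with a hypothetical text banner in the top row.
frame = [[9, 9, 9, 9] for _ in range(3)]
clean = mask_regions(frame, boxes=[(0, 0, 1, 3)])
print(clean[0])
```

Overwriting the pixels, rather than only editing the DICOM metadata fields, matters here because the patient information lives in the image itself.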

[00:12:09] HC: Yes. Medical images often come with unique and sometimes backwards ways of encoding information that you need to accommodate. There’s an example in pathology, where a pathologist might circle the actual tumor on the slide. If you’re training a model on that, your model already knows where to look, and you have to get rid of that information first.

[00:12:32] KK: Exactly, yes. These kinds of labels, you’re right. These kinds of labels, you also find in ultrasound images.

[00:12:37] HC: You mentioned a bunch of sources of variation in your imagery, from different patients, different manufacturers, different medical centers. How do you go about validating your models across so many sources of variation?

[00:12:51] KK: That’s a very good question. The answer is a little bit different for the assessment, or measurement, algorithms and the guidance algorithms. If you assess an image that is already acquired, this is a task you can do based on imagery that you can find in large image databases in hospitals. This is where you have a variety of patients, and a variety of diseases. You also typically have lots of different devices that are used in larger hospitals. So, it’s just a question of getting your hands on those medical images through a partnership with hospitals, and then deidentifying them, and then following proper procedures in dividing them into training and validation data sets. That is fairly straightforward.

It’s the same way you would do it with natural images, just making sure that you account for the different parameters that matter, like for example, the body mass index, the BMI of the patient. You want to make sure you cover heavy and light patients. For certain diseases, you want to make sure that they’re in your data set. But as I said, that’s typically the case in these hospital data sets.

When it comes to guidance, it’s a little bit more difficult, because there is an interaction between what the patient does, and what the images look like, and we want to make this available to less experienced users. The images that our users acquire with our guidance might look quite different from the images that you find in a medical database that are recorded by experts.

We have an iterative approach. We first start during the algorithm development. We have a lot of trial and error, and do what we call a prototype study in-house, with users and subjects that we use in-house. But then, in order to see how it really performs in clinical practice, we collaborate with clinics or hospitals to do what we call the pilot study, where the actual users try out the algorithms on actual patients and we see how they perform.

Then when we reach the proper performance target, we have this formal clinical validation study that will support FDA submission. There are basically several steps, and each one of them requires working with healthcare providers to get real patients, and try it out with them, and also get real users. You have to make sure that the kinds of users that you use during testing are the ones that you market the device to later. The FDA looks very carefully for that.
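
The earlier point about dividing hospital images into training and validation sets while covering both heavy and light patients can be sketched as follows. This is one common way to do it, not a procedure described in the interview: the split is made per patient (so that one person's images never land on both sides) and within each BMI group (so that every group is represented in validation). The group labels, the 20% fraction, and the function name are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def split_patients(
    patient_bmi_group: Dict[str, str],
    validation_fraction: float = 0.2,  # assumed fraction, for illustration
    seed: int = 0,
) -> Tuple[Set[str], Set[str]]:
    """Split patient IDs into (train, validation), stratified by BMI group."""
    by_group: Dict[str, List[str]] = defaultdict(list)
    for patient_id, group in patient_bmi_group.items():
        by_group[group].append(patient_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train: Set[str] = set()
    validation: Set[str] = set()
    for group in sorted(by_group):
        ids = sorted(by_group[group])
        rng.shuffle(ids)
        # Keep at least one validation patient per group when the group
        # has more than one patient.
        n_val = max(1, round(len(ids) * validation_fraction)) if len(ids) > 1 else 0
        validation.update(ids[:n_val])
        train.update(ids[n_val:])
    return train, validation


# Ten hypothetical patients, five in each of two BMI groups.
patients = {f"p{i}": ("high_bmi" if i % 2 else "low_bmi") for i in range(10)}
train_ids, val_ids = split_patients(patients)
print(len(train_ids), len(val_ids))
```

The same grouping idea extends to other factors mentioned in the interview, such as device manufacturer or the presence of a particular disease.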

[00:15:36] HC: The regulatory process definitely influences how you validate your models. Are there any other ways that it influences how you develop machine learning models, maybe even thinking back to the beginning of a project, and how you plan it and execute it?

[00:15:50] KK: Absolutely. I think that’s one of the reasons why the first clearance for a company always takes so much longer, because initially, you might think that after development and validation, the FDA clearance is just something that you’re doing at the very end. But it’s very important to take certain considerations into account from the very beginning. One thing is, the FDA requires that you follow certain processes. There’s a software development lifecycle process, and you need to have procedures and what’s called the quality system in place to make sure that you follow all those recommended guidelines in how you go about acquiring data and then developing the software and validating it.

All these processes are something you have to get in place. But also, even before you start developing the machine learning algorithm, you have to decide what the inputs and especially the outputs of the model are. Then you have to decide if the output is a continuous parameter or if it’s a classification. You of course want to pick something that the user, the healthcare provider, can make sense of, and something that performs at the level of current clinical practice, and you want to prove that, and you design a study for how you want to prove that. It makes sense to present all of that to the FDA.

They allow for something that’s called a pre-submission where you can present what you’re planning to do in your validation, clinical validation. They give you feedback and tell you if you follow those plans if that would address all of their concerns. It’s really good to get that feedback before you start an expensive clinical validation study, and not find out about that afterwards when you can’t change that anymore.

We typically engage with the FDA very early on, when we just start with these machine learning models. We have found that they give very helpful advice. Yes, anyone who’s trying something similar, I would encourage you to try to get in contact as early as possible.

[00:18:01] HC: How do you ensure that the technology your team develops will fit in with a clinical workflow and provide the right kind of assistance to doctors and patients?

[00:18:09] KK: Yes, this is of course very important, because if the performance on the technical level is great, but it doesn’t fit in the workflow, of course, it will not be adopted. It’s actually very difficult because we have the vision that our technology might change the workflow or might enable people to do ultrasound that are not actually doing it today. That’s a big challenge, because how do you test that, and how do you make sure that this works? We do lots of interviews and usability testing with our users or proposed users. Then as I said, during these studies, where we evaluate the clinical performance, we also get, of course, a lot of feedback about how it fits in the workflow or not.

In fact, maybe the last thing I want to mention is that, even though we started out as a medical device company, and the idea was to – the model, the business model was that we would take the device and give it to a provider, and they would use it in their own practice with their own users. We found that because the users – this would require enough change of workflow that there was quite some resistance to make those changes, and buy the device and make those changes.

We pivoted our business model and we are now offering a full end-to-end service, which makes it very easy to integrate taking ultrasound images into the workflow. It’s more similar to what they do today. Because typically, it’s, for example, a primary care physician or a cardiologist who orders an ultrasound image, and then it’s all provided by some imaging center. We are basically a drop-in replacement for the existing services.

That was our way to make the integration into the workflow as easy as possible, by just saying, just refer the patient to us, and we take care of the rest, including scheduling the patient, recording the images, getting medical review, and then writing the report. These are all things where you would otherwise have to make sure that they work the same way with your product as they worked before. We basically do all of this in-house and provide the end-to-end service. That is definitely a solution that works very well. But it of course requires developing the necessary infrastructure in addition to the medical device.

[00:20:34] HC: Is there any advice you could offer to other leaders of AI-powered startups?

[00:20:39] KK: I think one thing that worked well for us, and it took quite a while to get from this first idea to developing the technology, then getting a proper product, and having it tested, and improving it, and then having it cleared, then figuring out the details of the business model. We have been at this thing more than eight years, and the world is changing very quickly. One thing that’s important is that you address a big enough problem, and you have some high-level thesis. In our case, that giving medical imaging to more people at a lower cost and high quality will help to detect disease earlier.

This thesis, we were very convinced, will stay true, even if the details of how care is delivered will change over time. I think you have to be high level enough so that you’re confident that five years later, when your device gets on the market, even with a slightly changed world, your technology is still meaningful. That’s, I think, one of the most important things. I think if you address some incremental problem, then it might be that five years later, that’s not even a big problem anymore. I think there are lots of problems out there that are tackled by humans, where no technology existed to automate them.

With modern AI, you can now address some of those problems. It’s about finding something where machine learning is an enabling technology, and finding a problem that’s big enough so that it’s still relevant in three to five years.

[00:22:19] HC: Yes, that’s definitely helpful insight. Where do you see the impact of Caption Health in three to five years?

[00:22:23] KK: We still have the same vision that we will bring medical imaging to more people, and allow medicine to become more preventative, and to detect disease earlier. Currently, it’s often the case that you only go to a medical provider after you have problems, and there are certain diseases, for example, heart disease, where, when you have symptoms, it’s typically pretty far along. So it would be very beneficial if you could detect heart disease and other diseases earlier.

Currently, today, this is often not how the system operates, and that is often due to limitations: that the diagnostics are too expensive to do in a preventative fashion, and that you don’t have enough experts to perform those diagnostics. I see the system moving in a different direction, in that there is more so-called value-based care, and so there are more providers that are actually trying to give the best possible care at the lowest possible cost. Devices like ours that allow you to get a lot of information through medical imaging at a fairly low cost, without expensive experts, can really help, in my opinion, to give more people access to early disease detection.

[00:23:46] HC: This has been great, Kilian. Your team at Caption Health is doing some really interesting work for health care. I expect that the insights you share will be valuable to other AI companies. Where can people find out more about you online?

[00:23:57] KK: So yeah, on our website, Yes, there we have videos and I think that this is good to get a start to get more information.

[00:24:08] HC: Perfect. Thanks for joining me today.

[00:24:10] KK: Thank you. Thanks for having me.

[00:24:11] HC: All right, everyone. Thanks for listening. I’m Heather Couture, and I hope you’ll join me again next time for Impact AI.


[00:24:21] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe, and share with a friend. If you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at

