Giving generic drugs a new life in oncology is a game-changing strategy for developing new and affordable treatment options for cancer patients. But it would take years to review the thousands upon thousands of published research studies on non-cancer drugs tested as cancer treatments to identify the most promising candidates. Luckily, nonprofit health tech startup Reboot Rx is stepping in to solve this problem! 

I spoke to Founder and CEO Laura Kleiman about how her company is fast-tracking the development of affordable cancer treatments using AI technology. Working with a team of biomedical and clinical scientists, Reboot Rx uses machine learning and natural language processing to analyze large volumes of scientific literature, identify the most viable drugs, and develop pathways to generate definitive evidence and change the standard of care so that patients can benefit from them.

In this episode, you’ll learn more about Reboot Rx’s multi-pronged approach and the challenges that come with processing such large volumes of data. Plus, Laura shares her advice for tech leaders looking to solve problems that have a meaningful societal impact.

Key Points:
  • A look at Laura’s background and the personal story of what led her to create Reboot Rx.
  • The important work Reboot Rx does to repurpose generic drugs for the treatment of cancer.
  • An example that illustrates the role that machine learning plays in this process.
  • Challenges that come with analyzing this type of data.
  • How Reboot Rx’s machine learning developers collaborate with healthcare experts to ensure that their models are effective.
  • Advice for leaders of AI-powered startups and nonprofits: choose problems that matter!
  • How Reboot Rx is working to make their AI technology scalable.

“Patients need both more effective but also more affordable treatment options. Reboot Rx is giving generic drugs a new life in oncology, taking drugs that are already available to treat other non-cancer indications and repurposing them for the treatment of cancer.”

“We use large language models to be able to process [large volumes] of scientific literature and predict which of these 600,000 studies are most likely to be relevant and extract key information about each of these studies.”

“There's so much opportunity for the use of AI and machine learning right now. I would encourage other leaders in the space to choose problems to solve that can have a meaningful societal impact.”



[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine learning powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at


[00:00:33] HC: Today, I’m joined by Laura Kleiman, founder and CEO of Reboot Rx, to talk about affordable cancer treatments using repurposed generic drugs. Laura, welcome to the show.

[00:00:42] LK: Thank you so much. Looking forward to the conversation.

[00:00:45] HC: Laura, could you share a bit about your background and how that led you to create Reboot Rx?

[00:00:50] LK: Sure. My background is as a systems biologist and cancer researcher. My motivation for forming Reboot Rx came from both my professional experiences and my personal experiences with cancer. I was working as a Research Director at the Dana-Farber Cancer Institute in Boston when my mom was diagnosed with cancer. I was researching treatment options for her when I really just stumbled across this opportunity to repurpose existing generic drugs for the treatment of cancer and found many promising cancer treatments that patients like my mom weren’t able to benefit from because they hadn’t been fully developed due to a lack of incentives for pharmaceutical companies to develop new uses for low-cost generic drugs.

Then, wearing my professional hat, I started thinking about how we could solve this challenge and get treatments like these to patients who need more treatment options and more affordable treatment options. The center that I was running at the Dana-Farber Cancer Institute was bringing together machine learning engineers and data scientists with cancer researchers and clinicians to help them solve challenging problems in cancer research and clinical care. And I saw how some new approaches in the AI space and natural language processing could help solve one of the challenges with repurposing, which is analyzing a lot of data to be able to prioritize which drugs could have the greatest benefit for patients.

I also started thinking about how new funding models could enable the generation of definitive evidence for these drugs so that they could reach patients. I quit my job in 2019 to launch Reboot Rx, and we launched as a not-for-profit company.

[00:02:42] HC: What does Reboot Rx do? And why is this important for treating cancer?

[00:02:46] LK: Well, about 40% of patients deplete their life savings within two years of cancer treatment, which is really shocking and unacceptable. Patients need both more effective but also more affordable treatment options. Reboot Rx is giving generic drugs a new life in oncology, taking drugs that are already available to treat other non-cancer indications and repurposing them for the treatment of cancer. We’re not identifying new uses for drugs in the prediction phase. These new uses for cancer indications have already been discovered and published on. These drugs have been tested through phase two clinical trials in some cases for these cancer indications, but they’ve never made it into the standard of care.

Reboot Rx is taking a multi-pronged approach to identify the most promising generic drugs, based on the totality of existing clinical evidence, and develop that pathway to generate definitive evidence and change the standard of care so patients can benefit from them.

[00:03:58] HC: And what role does machine learning play in this technology?

[00:04:02] LK: The key challenge in identifying the most promising generic drugs for repurposing is the amount of existing data that first needs to be reviewed so that we can systematically identify the most promising opportunities. These drugs, since they’re generic and off-patent, have been around for decades. They’ve been studied and used for decades. So, there’s a lot of data out there on their effects for treating cancer, from pre-clinical data up to small clinical trials. We need to understand everything that’s already known about these drugs first. That’s our first task, so that we can prioritize which drugs and drug combinations could provide the greatest benefit for which specific cancer patient populations.

The type of information that we work with in this first case is published literature. So, these are completed and published clinical studies in biomedical repositories like PubMed. We use machine learning and natural language processing to sift through a lot of this text-based information in PubMed, where we’re trying to analyze the potential effects of these drugs if used in a large patient population.

[00:05:22] HC: How do you apply machine learning to that specifically? Maybe there’s an example of a model you’ve created, input and outputs, that can help illustrate this?

[00:05:31] LK: Sure. Our starting point is all of the non-cancer generic drugs. These are drugs that have been FDA approved for something other than cancer. They’ve gone off patent and now they’re widely available as generic drugs typically with many manufacturers for each of those generic drugs. And we’re looking for studies where these drugs have now been tested as potential cancer treatments.

If you went to PubMed and did a search looking for co-mentions of the non-cancer generic drugs, there are around a thousand of them, and cancer, you would find around 600,000 published studies mentioning these non-cancer generics and cancer. Only a small fraction of these studies are actually relevant to us. Meaning they contain consequential data on the effects of these drugs for treating cancer. But we don’t want to have to read all 600,000 manually. It could take decades for us to do that.

And so, we use large language models to be able to process this large volume of scientific literature and predict which of these 600,000 studies are most likely to be relevant and extract key information about each of these studies. For example, what type of study was it? Was it a randomized controlled clinical trial? An observational study? A case report? And also, identify whether the drug was found to be effective or not in those studies.

The large BERT-based models are able to help us quickly assess that evidence and narrow down the search space for a manual in-depth review. We use this approach to screen a lot of information quickly. And then what the output is from that are predicted relevant studies, extracted features, and then a ranked list of drugs based on synthesizing the totality of information from PubMed.

And after that, our biomedical scientists manually curate the literature, manually validate what comes out of those models to ensure that what we’re pulling out could provide clinically meaningful benefits for patients.
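
To make the flow Laura describes more concrete (predict relevance, extract study type and effectiveness, then rank drugs), here is a minimal Python sketch. It is an illustration under stated assumptions, not Reboot Rx's code: the keyword-based `toy_predict` is a stand-in for a trained BERT-based model, and the `STUDY_TYPE_WEIGHT` values, the scoring rule that subtracts weight for negative results, and all drug names and abstracts are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical evidence weights per study type (illustrative assumptions only).
STUDY_TYPE_WEIGHT = {"rct": 3.0, "observational": 2.0, "case_report": 0.5}


@dataclass(frozen=True)
class Study:
    pmid: str
    drug: str
    abstract: str


@dataclass(frozen=True)
class Prediction:
    relevant: bool
    study_type: str  # "rct", "observational", "case_report", or "other"
    effective: bool


def toy_predict(study: Study) -> Prediction:
    """Keyword stand-in for a trained language model (assumption, not the real method)."""
    text = study.abstract.lower()
    if "randomized" in text:
        study_type = "rct"
    elif "cohort" in text or "retrospective" in text:
        study_type = "observational"
    elif "case report" in text:
        study_type = "case_report"
    else:
        study_type = "other"
    relevant = study_type != "other" and "cancer" in text
    effective = relevant and "improved" in text
    return Prediction(relevant, study_type, effective)


def rank_drugs(studies, predict=toy_predict):
    """Score each drug by summing weighted evidence over predicted-relevant studies."""
    scores = defaultdict(float)
    for study in studies:
        pred = predict(study)
        if not pred.relevant:
            continue  # screened out before any manual review
        weight = STUDY_TYPE_WEIGHT[pred.study_type]
        # Assumed rule: positive findings add weight, negative findings subtract it.
        scores[study.drug] += weight if pred.effective else -weight
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


studies = [
    Study("1", "drug_a", "Randomized trial in prostate cancer; survival improved."),
    Study("2", "drug_a", "Case report: cancer patient, no response."),
    Study("3", "drug_b", "Retrospective cohort in breast cancer; outcomes improved."),
    Study("4", "drug_c", "Drug pharmacokinetics in healthy volunteers."),
]
ranking = rank_drugs(studies)
print(ranking)
```

In this sketch the ranked list is only a shortlist for manual curation, mirroring the point above that biomedical scientists validate what comes out of the models.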

[00:07:50] HC: It sounds like these are supervised models and that the output is you’re trying to predict whether the study is relevant along with some other information. But they’re essentially supervised?

[00:07:59] LK: Yes.

[00:08:00] HC: How do you create the training set for that? How do you annotate data?

[00:08:04] LK: We take a subset of the 600,000 studies, although it really could be any type of study. It doesn’t have to be on these non-cancer generic drugs specifically. And our team of biomedical scientists manually curate a few thousand annotations on this scientific literature that are then used to train the language models so that we can understand the clinical context within these publications.

In essence, we take many examples and we label them regarding whether they are actually relevant in our situation. For example, is it a clinical study where the drug was being tested for the treatment of cancer with relevant phenotypic outcomes that were reported? And label other things like what type of study was it? And whether the drug was found to be effective or not. And use those as training data.
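
The labeling scheme described here (relevance, study type, and effectiveness for each publication) can be sketched as a small data structure. This is a hedged Python illustration only: the `Annotation` fields, the `STUDY_TYPES` vocabulary, the `-1` marker for a missing effectiveness label, and the 80/20 split are assumptions for the example, not details from the interview.

```python
import random
from dataclasses import dataclass
from typing import Optional

# Hypothetical label vocabulary for the study-type output.
STUDY_TYPES = ("rct", "observational", "case_report", "other")


@dataclass(frozen=True)
class Annotation:
    """One manually curated label set for a single publication abstract."""
    abstract: str
    relevant: bool
    study_type: str
    effective: Optional[bool]  # None when effectiveness does not apply


def encode(annotation: Annotation):
    """Turn one annotation into (text, targets) for a multi-output supervised model."""
    if annotation.study_type not in STUDY_TYPES:
        raise ValueError(f"unknown study type: {annotation.study_type}")
    targets = {
        "relevant": int(annotation.relevant),
        "study_type": STUDY_TYPES.index(annotation.study_type),
        # -1 marks "no label" so a training loop could mask this output (assumption).
        "effective": -1 if annotation.effective is None else int(annotation.effective),
    }
    return annotation.abstract, targets


def split(examples, validation_fraction=0.2, seed=0):
    """Shuffle deterministically and hold out a validation set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]


annotations = [
    Annotation("Randomized trial of drug A in prostate cancer.", True, "rct", True),
    Annotation("Cohort study of drug B in breast cancer.", True, "observational", False),
    Annotation("Case report of drug C in lung cancer.", True, "case_report", True),
    Annotation("Drug D pharmacokinetics in healthy adults.", False, "other", None),
    Annotation("Review of drug E safety in diabetes.", False, "other", None),
]
examples = [encode(a) for a in annotations]
train, validation = split(examples)
```

Keeping all three labels on one record reflects the point that a single annotated example feeds several model outputs at once.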

[00:08:55] HC: There’s multiple outputs. Not just is the study relevant. But some of the other criteria you mentioned as well.

[00:09:01] LK: Correct. Yeah.

[00:09:02] HC: What kind of challenges do you encounter in working with this type of study data and natural language to begin with?

[00:09:08] LK: A lot of the challenges are around the limitations of what information we have access to in the first place. There are of course challenges with working with large text-based data and how well any automated approach can do at making these predictions and extracting certain types of information.

For example, when we’re trying to capture or infer whether the drug was found to be effective or not, we’re not able to actually pull out quantitative outcome metrics in an automated way right now. And that is a step that then happens later in a manual way with our team of biomedical and clinical scientists. That’s one.

There are many challenges. For example, the database of drugs that is our starting point (the database of 1,000 non-cancer generic drugs) we had to create from scratch, because a database like that doesn’t exist yet. At least not in the public domain. And then, of course, there are limitations in terms of what information has been published and is in the public domain and in repositories like PubMed. Access to full-length, full-text articles is another, because many are behind paywalls.

And then there’s the publication bias. There are not as many negative studies that are published that we could even know about. Although I am surprised that around 50% of publications are reporting negative results. I think that’s a good start. It’s mostly around what information we have access to.

[00:10:44] HC: And how do your machine learning developers collaborate with other experts like oncology and pharmaceutical experts in order to make sure their models that they’re developing are approaching the problem in the right way? That your data is annotated in a helpful way? All those types of issues?

[00:11:00] LK: Yeah, it’s a constant collaboration between our tech team and the domain experts. And that collaboration starts at the very beginning when we were even designing the project. For example, we did our first deep dive into prostate cancer. That was the first deployment of our technology. And we had a prostate cancer oncologist, an expert in the field, who was working with us from the very beginning as we even were thinking about what type of information we wanted to consider relevant. What outcomes were relevant in the prostate cancer setting specifically? And how do we want to think about different patient subpopulations and define those?

They’re very involved as we’re designing the study and during early iterations. And then, of course at the conclusion of our evidence, our automated evidence synthesis process, where we’re now interpreting what comes out of – what came out of our technology so that we can decide which drugs we want to move forward with in the clinic.

[00:12:04] HC: Is there any advice you could offer to other leaders of AI-powered startups and non-profits?

[00:12:09] LK: I think there’s so much opportunity for the use of AI and machine learning right now. I would encourage other leaders in the space to choose problems to solve that can have a meaningful societal impact.

[00:12:24] HC: And finally, where do you see the impact of Reboot Rx in three to five years?

[00:12:28] LK: We’re working towards making our AI technology scalable so that we can identify promising repurposing opportunities across cancer types with a higher degree of accuracy and needing much less manual validation. That’s our focus right now. And in the next three to five years, generic drugs for multiple cancer types are going to be in definitive clinical trials based on our findings where new trials aren’t even needed because perhaps there’s already a fair amount of existing clinical evidence.

My hope is that some drugs will already be available for patients as part of their standard of care based on our work. And patients will therefore have more treatment options and the overall cost of healthcare will decrease.

[00:13:18] HC: I look forward to seeing some of those things come to play. This has been great.

[00:13:22] LK: Thank you.

[00:13:22] HC: And Laura, your team at Reboot Rx is doing some really interesting work to make cancer treatments more affordable. I expect that the insights you’ve shared will be valuable to other AI companies. Where can other people find out more about you online?

[00:13:36] LK: On our website, Other social media, Twitter, LinkedIn. Looking forward to hearing from some of the folks who are listening to this.

[00:13:46] HC: Perfect. Thanks for joining me.

[00:13:47] LK: Thank you.

[00:13:48] HC: All right, everyone. Thanks for listening. I’m Heather Couture. And I hope you join me again next time for Impact AI.


[00:13:58] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. And if you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at