Batteries are arguably the most important technological innovation of the century, powering everything from mobile phones to electric vehicles (EVs). Unfortunately, most batteries have a significant impact on the environment: they require increasingly scarce and valuable resources to manufacture, and they are typically not designed for easy repair, reuse, or recycling.

Today on Impact AI, I'm joined by Jason Koeller, Co-Founder and CTO of Chemix, to find out how his company is leveraging AI to create better, more sustainable EV batteries that could reduce our reliance on elements like lithium, nickel, and cobalt, all without compromising vehicle performance. For a fascinating conversation with a data-driven physicist working at the intersection of software, machine learning, chemistry, and materials science, be sure to tune in today!

Key Points:
  • Jason’s background in theoretical physics and how it led him to create Chemix.
  • Products and services offered by Chemix and the role that AI plays.
  • Four reasons that machine learning (ML) is at the core of everything Chemix does.
  • Unique challenges that their ML models need to contend with.
  • What goes into validating these models to ensure accuracy.
  • Why now is the right time for the technology that Chemix is developing.
  • Metrics for measuring the impact of a better EV battery.
  • Jason’s data-driven advice for leaders of AI-powered startups.
  • His “electrifying” vision for Chemix in the next three to five years.


“All data analysis and decision-making is automated by our AI system. This includes analyzing terabytes of battery test data each day.” — Jason Koeller

“Looking at broad trends, [electric vehicles (EVs)] and AI have both become [things] that people have been talking a lot more about in the past 10 years and even more so in the past four or five years, and that has happened simultaneously.” — Jason Koeller

“Why is everyone not buying an EV? It's largely because they're too expensive or because people are worried they're not charging fast enough or they don't hold enough range for long road trips. … Improving any one of these metrics would be a measure of impact.” — Jason Koeller


Jason Koeller on LinkedIn
Chemix on LinkedIn

Resources for Computer Vision Teams:

LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.



[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine-learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at


[00:00:34] HC: Today, I’m joined by guest, Jason Koeller, Co-Founder and CTO of Chemix, to talk about batteries. Jason, welcome to the show.

[00:00:42] JK: Thank you. Great to be here.

[00:00:43] HC: Jason, could you share a bit about your background and how that led you to create Chemix?

[00:00:47] JK: Yeah. My academic background is in theoretical physics. I was doing a PhD in quantum gravity at Berkeley. But I sort of had a bit of an identity crisis and came to the realization that the problems I was working on, while very exciting and intellectually stimulating to think about, didn’t feel like they were pressing. They didn’t feel like they needed to be solved urgently.

I ended up switching fields into the climate space and learning a lot about that, and got into energy storage and batteries. I actually joined a startup working on lithium metal batteries. This is a very specific battery chemistry niche for electric aviation. And while I was there and learning a lot about the field, AI was really starting to pick up. And a lot of my friends from physics grad school were actually getting pulled into the AI field, because there’s a lot of overlap between some of the methods in AI and physics.

And so, I started wondering whether it could make sense to bring AI and machine learning into the battery problems that I was working on. And this was kind of on my mind for some time. And I decided this needed to be pursued further. I actually got a job as a data scientist at a software company doing materials informatics. Still applied to material science and chemistry, but not applied to batteries, and doing a pure software-based approach, software platform.

Spent some time there. And Chemix is really a sort of fundamental unification of two core ideas. One is using data-driven methods for accelerated materials discovery and improving the performance of materials in a way that would be impactful to the world. And in particular, in batteries. And then number two is this idea of being really vertically integrated. Gathering our own data. Not relying on third parties for data. Ensuring that we have total control over the quality and the quantity of that data and allowing us to really get deep into a particular vertical.

And so, Chemix is really this kind of core data-driven research and data-driven development of new batteries. Very specific to the battery vertical and where we can basically customize data generation processes, the application of machine learning, and we can then customize all of the machine learning algorithms that we use for the battery-specific problems that we face every day.

[00:03:03] HC: What services and products do you offer? How do you do this in the market?

[00:03:07] JK: Yeah, exactly. Essentially, what we do is we actually work with companies that make EVs, whether it’s sports cars, motorcycles, sedans, long haul trucks even, electric flight, and basically understand what are the current pain points that they’re dealing with when it comes to the off-the-shelf batteries that they can buy today. And in what ways are their vehicles they’re building basically limited by batteries?

And then we essentially set our process in motion, which is an AI-guided formulation optimization process to optimize batteries. We come up with the formulations, the recipes, the materials that are needed to enable the batteries to deliver the performance that the EV customers want. We then contract-manufacture those batteries and supply them to the EV customers. In some sense, we are the link between the companies that use batteries and the raw material suppliers that make the battery materials, identifying the right combinations to deliver the performance they want.

[00:04:04] HC: How do you go about using machine learning for this? What type of data goes into the models? And then how do you set up the problem?

[00:04:11] JK: Yeah. Machine learning comes in everywhere throughout what we do. It’s really at the core. First is this concept of the self-driving lab, which is what we basically have built and operate every day in our facility in Sunnyvale, which to us just means that all data analysis and decision-making is automated by our AI system.

This includes analyzing terabytes of battery test data each day from our battery cyclers and extracting from that test data the key performance metrics associated with each of the batteries we test. For example: how long it lasted before it failed, what its internal resistance was, the capacity of the battery, the amount of energy it was able to deliver, how fast it can charge, and so on.

And then these performance metrics I just mentioned are essentially the labels. That labeled data set is then used in our AI system to basically design new materials, new material formulations, new recipes, new battery cell designs. And so, it’s this iterative optimization problem of essentially generating new experiment trials based on all of the data collected to date, carrying out those experiments, and getting the results.

Now that’s all well and good, but there’s a specific issue with this when it comes to batteries. And this is another area where we apply machine learning: cycle life, or longevity, forecasting. Essentially, a key battery performance metric is how many times you can charge and discharge it before it degrades by a certain amount. Anyone who’s ever had a phone or a laptop is familiar with this phenomenon. The battery is usually the first thing to degrade and require replacing the device. That’s a key performance metric. But there’s really no first-principles way to calculate how long a battery will last if you make a tweak to the recipe. And so, you end up having to do just lots of testing. Lots of charging and discharging of batteries over and over and over again.

And this can take months or even years, because these batteries need to last for a very long time inside the devices, inside the vehicles. And so, if we had to wait months or years before we could get the results of our experiments and basically iterate and generate the next set of designs, that would clearly be unfeasible.

Instead of waiting that length of time, we use machine learning to forecast how long a battery will last based on the early data collected from its charging and discharging process. The charging and discharging data in this case is voltage, current, and temperature as a function of time, collected throughout those months of battery testing. Based on the subtle signals in that data, basically how the battery responds to charging and discharging at different rates and in different conditions, we can essentially predict how it’s degrading and, therefore, how long it will last. That’s another key element of where machine learning comes in.
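Early-cycle longevity forecasting like Jason describes is the subject of a good deal of published battery research. A minimal sketch of the idea, with synthetic data and made-up feature choices standing in for a real cycler pipeline, might summarize the first 100 cycles of each cell and regress against observed cycle life:

```python
# Sketch: predict battery cycle life from early-cycle signals.
# Feature choices and data are synthetic stand-ins, not Chemix's actual pipeline.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

def early_cycle_features(voltage, capacity):
    """Summarize the first cycles of a cell: capacity-fade slope,
    mean voltage, and variance of the capacity curve."""
    fade = np.polyfit(np.arange(len(capacity)), capacity, 1)[0]
    return [fade, voltage.mean(), capacity.var()]

# Synthetic "cells": degradation slope over the first 100 cycles
# correlates (noisily) with eventual cycle life.
X, y = [], []
for _ in range(200):
    slope = rng.uniform(-2e-3, -1e-4)                 # capacity loss per cycle
    capacity = 1.0 + slope * np.arange(100) + rng.normal(0, 1e-3, 100)
    voltage = 3.7 + rng.normal(0, 0.01, 100)
    X.append(early_cycle_features(voltage, capacity))
    y.append(0.2 / -slope)                            # cycles to 20% fade

model = GradientBoostingRegressor().fit(X[:150], y[:150])
preds = model.predict(X[150:])
```

In practice the feature set, model class, and end-of-life definition would all differ; the point is only that early-cycle signals can carry enough information to rank designs months before testing completes.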

And then I’ll also just say, there are a handful of auxiliary problems as well which benefit from the use of machine learning. One that’s related to what I just talked about is the forecasting of remaining useful life of batteries in the field. When the batteries are actually in the vehicles, not just in the lab anymore, they’re also generating this time series data from the battery management system on the vehicle. And that can be used to forecast how long they’re going to last and understand the degradation.

Similarly, there are other things along the lines of optimizing fast-charging algorithms. Exactly how you apply current to the battery when you’re trying to charge it dictates how fast it can be charged and with how much degradation. There are also quality-control diagnostics in the factory and various other steps of the manufacturing process. Machine learning comes in everywhere, but the key areas are the self-driving lab that I mentioned and the longevity prediction that I talked about.

[00:07:46] HC: These models are largely based on time series: knowing about the battery’s history so far, or its early discharge data if you’re trying to predict the future.

[00:07:55] JK: Yeah. For the longevity models, that’s absolutely right. It’s a time series type of model. Sequence-based model. And then for the self-driving lab part, basically the design of new batteries, and new recipes, and new materials, these are different types of models because the data modality is different in that case. There are a handful of data modalities here in addition to the time series that I talked about.

For example, each of the materials inside a battery, and there are dozens of unique materials inside any battery. Each of them has their own properties, right? If it’s a solid active material that actually stores the charge, then things such as the size of the particles matter, various other properties like the chemical composition of the materials, the stoichiometry, molecular weights of polymers, various properties of molecules.

And so, each of these individual materials is characterized or parameterized by the values of these properties. And in many cases, we get this data from spec sheets from our suppliers of materials. Or we have to compute it ourselves or measure it ourselves in our lab. And then the third modality of data here is the cell design data: which of these materials, out of a large possible space of materials to select from, are actually present in any given battery, and exactly how much.

Each material in the battery, each of these dozens is there for a precise reason. And it needs to be in a precise concentration and precise amount to have the right balance of performance, and cost, and so on. And so, essentially, this representation of what’s inside the battery is another modality. And then a fourth modality is essentially the processing information. When a battery is made, it goes through a series of steps. You can think of this like cooking. Essentially, it goes through an oven at a certain temperature. In the oven for a certain amount of time. Or we’re mixing a bunch of liquids together and we’re mixing them at a certain rate to dissolve something. And they’re being added in a certain order. Very much like you would think about in sort of a cooking recipe. I mean, I think the analogy works very well here.

There’s all of this processing information. In addition to what is in the end product, it’s how it was added. How it was formed? And all of these different data modalities have to essentially come together, be modeled together to operate this self-driving lab that I mentioned, where, every day, new battery designs, new materials, new formulations, new processing conditions are being generated by the system and then are being executed in our lab. And then the following day, the system repeats.
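The modalities Jason lists, material properties, formulation composition, and processing conditions, ultimately have to be merged into one representation per cell before any model can consume them. A toy sketch of composition-weighted featurization, with invented material names, properties, and processing parameters:

```python
# Sketch: flattening material properties, formulation fractions, and
# processing conditions into one feature vector per cell.
# All names and values are illustrative, not real battery data.
import numpy as np

MATERIALS = {                     # per-material property vectors (spec-sheet style)
    "cathode_A": {"particle_um": 5.0, "mol_weight": 97.9},
    "binder_B":  {"particle_um": 0.0, "mol_weight": 1.2e5},
    "solvent_C": {"particle_um": 0.0, "mol_weight": 41.1},
}

def featurize(formulation, processing):
    """formulation: {material: mass fraction}; processing: dict of step params."""
    feats = []
    for name in sorted(MATERIALS):
        frac = formulation.get(name, 0.0)
        props = MATERIALS[name]
        # composition-weighted material properties
        feats += [frac, frac * props["particle_um"], frac * props["mol_weight"]]
    feats += [processing["oven_temp_c"], processing["mix_minutes"]]
    return np.array(feats)

x = featurize({"cathode_A": 0.9, "binder_B": 0.05, "solvent_C": 0.05},
              {"oven_temp_c": 120, "mix_minutes": 30})
```

A fixed material ordering keeps the vector aligned across cells; materials absent from a given formulation simply contribute zeros.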

[00:10:16] HC: What types of challenges do your models need to contend with?

[00:10:19] JK: Yeah, there are definitely challenges. It’s a very specific use case. I think an interesting one is multiple fidelities of data. I talked about how the cycle life estimation, the cycle life characterization, just the charging and discharging of the batteries over and over and over again, can take a very long time and how we need to accelerate that.

At any given day, there are batteries in our data set that reached end of life a long time ago because we started testing them years ago, some that were started just a few months ago and are partway through, and others that only started a day or two ago and are at the very beginning of their testing. Depending on how long they’ve undergone testing, the predictions for longevity have different accuracies. The longer they’ve been tested, the higher the accuracy of the prediction.

And so, this introduces sort of a continuum of fidelities of data into the downstream model that’s using that information to design the next set of recipes. And it’s even more complicated than that. It’s not just the cycling forecasting that we’re speeding up. There’s also various other steps along the way. Even before the battery begins its testing, it still has to be made. That manufacturing process can take a few days. Even during that time, it’s helpful to speed things up and get some sense of how the battery will perform from the very early signals from the manufacturing process. Essentially, you have this challenge of the data set changing continuously as more and more information comes in.

I think another interesting example of a challenge here is what I’m going to call a nonviable result. Essentially, this is a situation where the recipe that the model generates cannot, for whatever reason, be made as intended when you go to make it, which means that you can’t naively label that data point. I’ll give you a concrete example. Imagine you’re mixing a solution for the electrolyte or the cathode or anode slurry, and you mix a bunch of things together, and the viscosity is extremely high. It’s something like honey or molasses. If the viscosity is high enough, it would be impossible to actually make a battery out of this formulation. And so, we can’t even build the battery, test it, and extract its performance.

In some sense, the information that that design is not good and needs to be adjusted needs to be incorporated into the system. We need to label that data point somehow. But we’re not going to get the labels that we’re typically used to getting. And so, how to deal with this is an important problem that is not really unique to batteries. This comes up in a lot of cases in chemistry and materials science, whether you’re designing a new molecule and it turns out you can’t figure out how to synthesize it, or you try to synthesize it and end up synthesizing something else entirely. You have to deal with these kinds of challenges.
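One generic way to fold nonviable builds back into an optimization loop (not necessarily what Chemix does) is a two-stage model: a feasibility classifier trained on every attempted recipe, and a performance regressor trained only on the builds that succeeded, with candidates scored by the product of the two. A sketch on synthetic data:

```python
# Sketch: using "nonviable" builds (e.g. a slurry too viscous to coat).
# Two-stage pattern: a feasibility classifier trained on all trials, and
# a performance regressor trained only on viable ones. Generic pattern,
# not Chemix's actual approach; data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(300, 4))           # recipe parameters
viscosity = X[:, 0] * 2 + X[:, 1]              # hidden "physics"
viable = viscosity < 1.5                       # too thick -> cannot be built
perf = np.where(viable, 100 * X[:, 2] + 10 * X[:, 3], np.nan)

clf = RandomForestClassifier(random_state=0).fit(X, viable)
reg = RandomForestRegressor(random_state=0).fit(X[viable], perf[viable])

def score_candidate(x):
    """Expected value of a candidate recipe: predicted performance
    weighted by its probability of being buildable at all."""
    x = np.asarray(x).reshape(1, -1)
    p_ok = clf.predict_proba(x)[0, list(clf.classes_).index(True)]
    return p_ok * reg.predict(x)[0]

s = score_candidate([0.1, 0.1, 0.9, 0.9])      # thin slurry, high performance
```

The feasibility term steers the optimizer away from recipes that would waste a lab run, which is the practical role the "nonviable" label needs to play.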

And maybe I’ll just mention one more, which is around data drift. The way our system operates is we’re getting the battery performance requirements from our customers or from sort of general market signals. For example, perhaps for a given type of car, it’s really, really important to have a very long cycle life. Or really important to have really low internal resistance for the battery.

As we put more and more effort and resources into optimizing a battery for that specific case, it can mean that our system, in the space of possible battery designs, starts navigating away from where the data set was originally. And so, we start doing a lot more extrapolation. And, essentially, we start optimizing in a region of the space where the data is not as plentiful. This can cause significant data drift over time. And this then needs to be incorporated into any kind of benchmarking or evaluation of different models, basically different model types or different systems. Because if you just sample randomly, for example, and do a random train-test split, you’ll end up with a test set that perhaps looks significantly different from the actual test set you care about, which is the batteries that you’re going to be designing going forward, not the ones that you designed in the past.

[00:14:21] HC: I imagine that many of these scenarios also bring up the need for validation. How do you go about validating that your models are predicting the properties of the battery you’re really intending them to?

[00:14:33] JK: Yeah. It’s a really good question. The simple answer, which I’ll admit up front is sort of the cheap answer, is that the ultimate validation comes when we actually go into the lab, make the samples, and get the result. At the end of the day, we’re not using these models as some kind of ground truth whose predictions we’re going to rely on for a very long period of time before we actually measure anything. We’re going to just make the batteries and test them.

But still, it’s definitely important to be able to do some more systematic and sort of real-time validation for the comparison of different models. We’re constantly trying out new models. New features. New representations of the data. New architectures and so on. And so, essentially, the way we evaluate our models and evaluate any new candidate models that we’re interested in implementing is with a rigorous set of benchmarking on basically hold-out sets.

It’s a classic train-test split example, but where the test sets have been constructed deliberately to test the generalization ability on specific cases that we really care about. I mentioned one a minute ago, this kind of data drift where the regions of the space that are interesting to the system evolve over time. A simple thing to do there is train on all of the data collected before a certain date, or before we started working on a specific project, and then create a test set based on the data after that date or from that specific project. This really tests our ability to generalize in that sense.
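The date-based split described here is simple to implement; a small sketch with illustrative dates:

```python
# Sketch: temporal train/test split to mimic deployment, where the model
# must generalize to designs created *after* its training data.
# Dates are illustrative.
import numpy as np

def temporal_split(dates, cutoff):
    """Boolean masks: train on records before `cutoff`, test on the rest."""
    dates = np.asarray(dates, dtype="datetime64[D]")
    train = dates < np.datetime64(cutoff)
    return train, ~train

dates = ["2023-01-10", "2023-05-02", "2024-02-20", "2024-06-01"]
train_mask, test_mask = temporal_split(dates, "2024-01-01")
```

Unlike a random split, this forces the evaluation to measure extrapolation into the region the optimizer will actually explore next.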

But there are other things we do, and we run this on a schedule. For example, a hold-out set of only batteries of a specific chemistry, where we train on all the batteries except one chemistry and then test on that held-out chemistry to test the generalization there. We even test on batteries that contain a specific novel chemical.

At various steps, we identify through the process that a certain chemical is really important to performance. And what we do is we train on all of the designs that did not have that chemical and then test on some of the designs that do have that chemical and identify the accuracy in that case. Because if we can’t accurately generalize to new chemicals, then part of our pipeline breaks down.
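The chemistry hold-out maps naturally onto grouped cross-validation, for example scikit-learn’s LeaveOneGroupOut, sketched here on synthetic data with hypothetical chemistry labels:

```python
# Sketch: leave-one-chemistry-out validation. Train on all cells except
# one chemistry, test on the held-out chemistry. Data and chemistry
# labels are synthetic stand-ins.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(2)
X = rng.normal(size=(90, 5))                        # cell design features
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(0, 0.1, 90)
chemistry = np.repeat(["NMC", "LFP", "LMFP"], 30)   # group label per cell

scores = {}
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=chemistry):
    held_out = chemistry[test_idx][0]
    model = Ridge().fit(X[train_idx], y[train_idx])
    scores[held_out] = model.score(X[test_idx], y[test_idx])  # R^2 per chemistry
```

A low score for one held-out chemistry flags exactly the failure mode Jason mentions: a model that cannot generalize to materials it has never seen.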

There’s a part of our self-driving lab that I haven’t really talked about yet, which is screening. Constantly scanning through libraries of thousands and thousands of different chemicals and materials that are available to purchase and identifying which ones we should bring into our lab and incorporate into our daily process. This requires us to be able to generalize in chemical space.

Those are some examples of how we think about validation. It ultimately boils down to evaluating generalization accuracy on a held-out test set. And doing this in a very consistent and repeatable manner so that we can really numerically compare model accuracies and other performance metrics.

[00:17:16] HC: And you’ve been able to identify what properties it is that you’re trying to generalize to so that you can structure your training and test splits in order to test that.

[00:17:23] JK: That’s right. I think the classic thing people do, of course, is a random train-test split. And that’s great if your data is all drawn from the same distribution and, essentially, if you’re interested in applying your system to test cases that are also drawn from the same distribution; everything is well-covered in that case. But practically speaking, in reality, that’s usually not the case.

And so, you can definitely get different results depending on which way you construct the test set. I think this is certainly something people pay some attention to, but it’s probably under-appreciated, especially in academia. Because in academia, you’re usually dealing with benchmark data sets that everyone’s agreed on, and everyone’s evaluating the models in the same way. The intricacies of how a model is actually going to be deployed in production are where this level of customizing the test set comes in.

[00:18:15] HC: Why is now the right time to build this technology?

[00:18:18] JK: Yeah. I would say it’s a confluence of a few factors. I mean, broadly speaking, just looking at kind of broad trends, EVs and AI have both become things that people have been talking a lot more about in the past, say, 10 years, and even more so in the past four or five years. And that’s kind of happened simultaneously, which is interesting.

From the perspective of EVs becoming more important, this is largely coming from people trying to mitigate or prevent climate change, and then there’s a huge demand for EVs all over the world. I mean, this is really a global market. And this includes more and more electrified forms of transportation beyond just standard four-wheel passenger cars. Types of vehicles that actually have more unique requirements, whether it’s, again, kind of high-performance sports cars, or long-haul trucking, where the business model requires the battery to be charged and discharged many thousands of times before it degrades. There are other applications that require specific things like fast charging or low heat generation. In some cases, for smaller vehicles, there’s not sufficient space to put in a thermal management system. And so, you need a battery that’s going to generate less heat when it’s discharged or charged quickly.

But then there’s also the maturation of the battery manufacturing process. Battery manufacturing is becoming more and more commoditized and standardized, which is good for us, because it means that the design space of possible materials and possible processing conditions is more and more well-known, and we can essentially put a bounding box around what the system is generating. This is really a case where this approach thrives: when there’s a clearly defined space, but where navigating that space is highly challenging and the response surface is highly nonlinear in some sense.

And then, of course, AI in general picking up speed, which I mentioned a minute ago. It just means there’s more and more people working on open-source frameworks. The frameworks are getting better and better. The infrastructure is there. There’s plenty of providers of software as a service for infrastructure and cloud computing, all of this. I do think this is a really good time to work on this. And since batteries and AI are really the core pieces of what we’re doing, it’s sort of quite coincident that they’re both happening at the same time.

[00:20:28] HC: Thinking more broadly about what you’re doing at Chemix, how do you measure the impact of your technology?

[00:20:32] JK: Yeah. I mean, I think the best way to measure it would be looking at performance improvement over the state-of-the-art. Again, there are many different performance metrics for batteries depending on the application. But looking at any one of them, or any handful of them, getting better performance than state-of-the-art batteries ultimately does mean that there will be more electric vehicles on the road.

I think maybe this is sort of – maybe people know this already. But if you think about why are there not more EVs on the road? Why is not everyone buying an EV? It essentially completely boils down to the battery. Because the actual experience of driving an EV, for anyone who’s driven one, is I think objectively way better. I mean, there’s less noise. There’s no exhaust fumes. There’s no transmission. The acceleration is extremely smooth. They’re very responsive. Cheap to maintain. There’s no oil involved.

Why is everyone not buying an EV? Well, it’s largely because they’re too expensive or because people are worried they’re not charging fast enough, they don’t hold enough range for long road trips, or basically, the materials that go into them are too hard to source. Or there’s supply chain concerns. And so, improving any one of these metrics would be a measure of impact.

And we’ve already demonstrated a number of these. In one case, we’ve achieved more than 1,500 continuous cycles at highly elevated operating temperature, which is more than a 3X improvement over a commercial baseline that we test in the same configuration. And this is important again for these kinds of smaller vehicles, like motorcycles or two-wheelers, which don’t have sophisticated thermal management. The temperature can actually get quite high and significantly degrade performance.

In another case, we developed a battery which can be fast-charged for more than 2,000 cycles continuously, versus a baseline that lasts less than a fifth as long. And fast charging is important for people who don’t have a place to charge an EV at home and really have to charge it at charging stations. There are other performance metrics as well: energy density, safety, cost. Other areas where we’ve demonstrated significant performance improvements. And that’s getting the attention of a lot of customers.

I think another way of evaluating the impact is really around the number of batteries we can sell, because people are only buying from us if our batteries are better. And so, earlier this year, we actually signed a supply agreement for two gigawatt-hours of batteries to a cutting-edge EV manufacturer for a premium off-road SUV product, which equates to around 30,000 vehicles and a good amount of revenue for us at this stage, having only been around three years. And so, I think that’s good validation. But certainly, more to come.

[00:23:04] HC: Is there any advice you could offer to other leaders of AI-powered startups?

[00:23:08] JK: Yeah. I mean, I don’t know if I need to say this. Probably, a lot of people would already be thinking this way. But it’s probably worth emphasizing, which is data is absolutely critical. I mean, as a rule of thumb, I would say if you’re not putting way more effort into data collection and data curation than you maybe initially expected you would or really maybe wanted to, then you’re probably not doing enough.

This is the number one thing I see the general category of AI for X, applied AI in a particular vertical, getting wrong. And it’s often very painful to do this right. It’s not the sexy part of machine learning and AI by any means. But if you’re really working on this in order to make a difference and get real-world outcomes, I believe this is just what you have to do. It’s just part of the work. And I think people need to frame their thought process around what aspects of their business they spend the most time on and really highly optimize for. Frame that around the data as much as possible, because that’s often the limiting factor. And that’s often the thing that people joining your company won’t necessarily be thinking about in the beginning, especially if they come from academia.

[00:24:17] HC: And, finally, where do you see the impact of Chemix in three to five years?

[00:24:21] JK: Yeah. As I mentioned, I think our core impact would really be in enabling mass electrification for EVs. That’s really where I see our impact. I mean, it’s more EVs on the road, EVs lasting longer. And, especially, as I mentioned, in markets that are harder to electrify, beyond the premium passenger EV market where we see most EV adoption today and where most EV batteries are designed for. There are a lot of other applications out there, all of which have significant CO2 emissions and would see significant quality-of-life improvements from reduced pollution, beyond just the premium passenger EV market.

I think in three to five years or in some timeframe like that, that’s where I would see our biggest impact. And in that span of time as well, we’ll be expanding beyond just the core battery products, but also into some of the more service-oriented aspects of things. For example, I kind of alluded to this earlier, but the forecasting of remaining useful life of vehicles when they’re actually on the road. And the evaluation of state of health of a battery.

This is a challenging problem, which I believe you can also successfully bring machine learning to bear on, and that’s necessary for any kind of battery repurposing. In many cases, batteries degrade in a way that, while they’re not suitable for their original application anymore, they are potentially still very well-suited to a less demanding application. A classic example of this is taking batteries that have reached end of life in vehicle applications and using them to stabilize the grid.

But in order to do this, it really requires having a refined characterization of the state of health of the battery and forecasting how much longer it’s going to operate in that grid condition, say, to make sure that the investment needed to repurpose that battery (taking it out of the vehicle, hooking it up to the grid, doing all the re-engineering on the electronics side) is going to pay off. You need to know how much longer it’s going to last.

[00:26:15] HC: This has been great, Jason. I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?

[00:26:24] JK: You can go to our website, and you’ll see some info there. You can also find me or the company on LinkedIn. Yeah, keep an eye out for us in the news. We just had a press release last week, because we just closed our Series A. Raised some money from Porsche Ventures and BNP Paribas, some other big names out there. I think there’ll be more news to come.

[00:26:47] HC: Well, congratulations. That’s great to hear.

[00:26:48] JK: Thank you. Thank you, Heather. Great talking with you.

[00:26:50] HC: Yeah. Thanks so much for joining me today. All right, everyone. Thanks for listening. I’m Heather Couture. And I hope you join me again, next time, for Impact AI.


[00:27:03] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share it with a friend. And if you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at