AI is unlocking the future of materials science, and today’s guest, Jonathan Godwin, co-founder and CEO of Orbital Materials, is at the forefront of this transformation. With a background in AI research and experience leading groundbreaking projects at Google-owned DeepMind, Jonathan is now applying machine learning to develop advanced materials that can drive decarbonization.

In this episode, he explains how Orbital Materials is using foundation models (like ChatGPT for language or Midjourney for images) to design new materials that capture carbon, store energy, and improve industrial efficiency. He also shares insights into the company’s mission, the challenges of simulating atomic-level interactions, and why open-sourcing their model, Orb, is crucial for innovation.

To discover how AI is revolutionizing the fight against climate change and learn how these cutting-edge materials could shape a more sustainable future, don’t miss this inspiring conversation with Jonathan Godwin!


Key Points:
  • Insight into Jonathan’s diverse career path and how it led him to Orbital Materials.
  • What types of advanced materials Orbital develops and their potential impact.
  • The critical role AI plays in developing materials for decarbonization purposes.
  • Defining foundation models and why they’re an essential part of leveraging AI.
  • 3D atomic simulations and other types of data that go into Orbital’s foundation model.
  • The computing infrastructure required to build a foundation model for materials.
  • Engineering and other challenges encountered while building models at this scale.
  • How AI enhances scientific discovery without replacing human expertise.
  • Why open-sourcing Orbital’s foundation model, Orb, is key for innovation.
  • Lessons from developing this model that could be applied to other data types.
  • Jonathan’s detail-oriented advice for leaders of AI-powered startups.
  • Orbital’s exciting mission to accelerate new materials development.

Quotes:

“We develop materials that can capture CO2 from specific gas streams coming out of an industrial facility, new energy storage technologies that allow [data centers] to operate behind the meter, or ways to improve the water efficiency of a data center or industrial facility.” — Jonathan Godwin

“Foundation models are the crux of how we're able to leverage AI in this day and age. If you want to [say], 'We're pushing the limits of what AI is able to do. We're leveraging the most recent breakthroughs,' you've got to be building foundation models or using foundation models.” — Jonathan Godwin

“AI is a massively powerful creativity aid and accelerant. We’ve seen that in other areas of AI and we're bringing that to advanced materials.” — Jonathan Godwin


Links:

Orbital Materials
Orbital Materials on LinkedIn
Orbital Materials on X
Orbital Materials on GitHub
Jonathan Godwin on LinkedIn
Jonathan Godwin on X
Jonathan Godwin Substack


Resources for Computer Vision Teams:

LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1-hour strategy session now to advance your project.


Transcript:

[INTRODUCTION]

[0:00:03] HC: Welcome to Impact AI. Brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine-learning-powered company. This episode is part of a mini-series about foundation models. Really, I should say domain-specific foundation models.

Following the trends of language processing, domain-specific foundation models are enabling new possibilities for a variety of applications with different types of data, not just text or images. In this series, I hope to shed light on this paradigm shift, including why it’s important, what the challenges are, how it impacts your business, and where this trend is heading. Enjoy.

[EPISODE]

[0:00:49] HC: Today I’m joined by guest Jonathan Godwin, co-founder and CEO of Orbital Materials, to talk about a foundation model for developing advanced materials. Jonathan, welcome to the show.

[0:01:00] JG: Thank you for having me on.

[0:01:00] HC: Jonathan, could you share a bit about your background and how that led you to create Orbital Materials?

[0:01:05] JG: Yes. My background is actually not in material science, not in the energy transition. It’s something that I’ve always been passionate about, something that I’ve always cared a huge amount about, but professionally, I’m an AI researcher. I’m a computer scientist. Before Orbital Materials, I worked for a while at different startups, but ended up at a place called DeepMind where I was using AI for all sorts of scientific challenges.

So, I started off using large image recognition models to spot tumors in breast cancer scans, and that’s now a product that has been rolled out and used to save lives around the world. Then, I moved on to leveraging AI for more fundamental scientific challenges, areas like particle and fluid simulations and how we simulate the natural world, which then led me to the question of how you simulate things at the atomic level.

I ended up leading a team at DeepMind, applying large-scale machine-learning techniques to simulating what happens at the atomic scale within advanced materials, because advanced materials underpin really everything that we do. I mean, they’re the reason we’re able to speak right now. I’m in London, a long way away from where you are, and we’re able to speak in real time because of advanced materials. Advanced materials are also our best hope of technological solutions for mitigating climate change.

I thought this was an incredible area in which to use this incredibly powerful tool that we have, next-generation AI, to advance the discovery of new technologies that could really make a big difference. I led that team at DeepMind for a couple of years, and I felt that we’d made extraordinary progress. But DeepMind is a computational company. Its mission is artificial general intelligence. It wasn’t going to spend the time and the money to set up an industrial lab to make these materials, bring them to life, bring them to market, and have the impact that I was really looking for.

So, I decided to take what I’d learned and start a new company. That company is Orbital Materials, and our mission is to develop large-scale machine learning foundation models, and then really use them within our industrial facility in Princeton, New Jersey, to synthesize, make, and prototype new technologies for the energy transition.

[0:03:30] HC: What are some examples of the types of materials that you might make and the impact they can have? Why is this an important thing to do?

[0:03:37] JG: That’s a really good question. I think people are really familiar with the impact that designing things at the atomic level can have in areas like pharmaceuticals and drugs. Saving human lives by curing diseases is something we’re familiar with, and we can get an intuition for the impact that designing the molecular formulas of those drugs using AI could have on human health.

Another area where we want to be designing things at that atomic level is in things like semiconductors, new chips, or new batteries that could store and make use of renewable energy to power industrial facilities that ordinarily rely on fossil fuels. Or it could be designing new catalysts that enable really efficient chemical reactions that rely upon sustainable feedstocks, or that use electricity as an input rather than the heat that’s normally used.

These are all things we can design using AI in a way that we just weren’t able to do until very recently, thanks to the new generation of artificial intelligence. So, the breadth of what I think we’re able to achieve is really extraordinary. I think we’re on the cusp of a new revolution in our ability to engineer things at the atomic level.

For Orbital Materials, we’ve got to start in a focused way. We’re a startup, so we have a narrower focus in the things that we’re going to market with. Our first materials are really focused on decarbonization technologies: carbon removal technologies and other ways of decarbonizing industrial facilities. We develop materials that can capture CO2 from specific gas streams coming out of an industrial facility, new energy storage technologies that allow something like a data center to operate behind the meter, or even ways to improve the water efficiency of a data center or an industrial facility.

These are the three areas that we’re actively looking at, and where we have the competency and the skills to develop new products and new materials.

[0:05:54] HC: So then, how do you use AI to develop materials for these purposes?

[0:05:59] JG: Yes, it’s a really good question. You’ll be really familiar, I’m sure, as most people will be, with the mental model of how someone designs something like an airplane wing. You have a computer simulation that tells you the effectiveness of a new engineering design under the conditions in which you want to use it. Then you look at that result and adjust the design to make it as optimal as it can be. That computer-aided design revolution has really improved the efficacy and the performance of so many of the things that we use every day.

Up until recently, up until AI, our ability to do that sort of computer design at the atomic level has been really limited. That’s because what we’re simulating is just so complex, so much more complex than simulating something like fluid dynamics. The reason is that the interactions between these atoms are really, really difficult to model. They involve all sorts of quantum effects, and you’ve got to simulate millions of different atoms to really understand what’s happening. Now, AI offers an alternative to bottom-up physics simulations, which are the old-school way of doing this. AI is able to spot patterns and accelerate the simulations of what’s going on at the atomic level, which enables us to do computer-aided design in a way that preserves fidelity and is accurate enough for us to make reliable predictions and design materials on a computer.
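
To make that loop concrete, here is a minimal sketch of the idea: a fast learned model stands in for an expensive quantum simulation, and a simple optimizer nudges atoms along the predicted forces. The predict_energy_and_forces function below is a hypothetical stand-in (a toy Lennard-Jones potential so the loop actually runs), not Orbital’s model.

```python
# Minimal sketch of AI-surrogate-driven design: predict energy and forces
# cheaply, then adjust atomic positions to lower the predicted energy.
import numpy as np

def predict_energy_and_forces(positions):
    """Toy surrogate: returns (energy, per-atom forces) for N atoms.

    A real system would use a trained neural network potential; this toy
    Lennard-Jones form exists purely so the loop below runs end to end.
    """
    positions = np.asarray(positions, dtype=float)
    diffs = positions[:, None, :] - positions[None, :, :]          # (N, N, 3)
    dists = np.linalg.norm(diffs, axis=-1) + np.eye(len(positions))
    energy = np.sum(np.triu(4.0 * (dists**-12 - dists**-6), k=1))  # pair sum
    # Force on atom i is the negative energy gradient w.r.t. its position.
    coeff = 48.0 * dists**-14 - 24.0 * dists**-8
    forces = np.einsum("ij,ijk->ik", coeff, diffs)
    return energy, forces

def relax(positions, steps=500, lr=0.01):
    """Steepest-descent relaxation: step atoms along predicted forces."""
    positions = np.asarray(positions, dtype=float)
    for _ in range(steps):
        energy, forces = predict_energy_and_forces(positions)
        positions += lr * forces
    return positions, energy

# Two atoms placed too far apart relax toward the equilibrium bond length.
final_pos, final_energy = relax([[0.0, 0.0, 0.0], [1.4, 0.0, 0.0]])
```

In a real pipeline the toy function would be a trained neural network and the loop a proper optimizer, but the shape of the computation, cheap prediction followed by design adjustment, is the same.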

So, we’ve seen the huge advances that have been made in our ability to do this with pharmaceuticals and drugs, and there are a number of companies, not least Isomorphic Labs, coming out of DeepMind, where I used to work, that are doing this with proteins. Now, we’re applying those same techniques to enable us to design new materials on a computer by really accelerating and improving the fidelity of the simulations that we’re able to run.

[0:07:59] HC: You mentioned before that a foundation model is part of the solution here. Why is it the solution? Is there an alternative way you could have solved this? Or why did you choose to build a foundation model to solve this problem?

[0:08:11] JG: So, we use this phrase foundation model. What does it mean? It means that you train a really large-scale AI model on a huge amount of data that covers an incredible breadth of the modality you’re interested in. What does that look like in practice? Well, we can use ChatGPT, which is often held up as an example of a foundation model.

Now, in that case, what they’ve done is say, well, ChatGPT is going to work with language, so I’m going to gather all of the data that I possibly can that’s written as language, from the Internet, from any source I can possibly find, in multiple different languages. Then I’m going to train a really big model on all of that data, and that’s going to give me incredible, emergent properties. It’s going to look like intelligence. That’s the way in which I’m going to be able to engineer this system to have this sort of appearance of intelligence. Maybe it even gets categorized as real AI.

The incredible thing is that once you’ve trained on all of that, on code, on news articles, on Reddit comments, on all sorts of different stuff, what you find is that you become really good at each of those specific things. The best way to build an AI that’s really good at writing newspaper copy is to train on everything on the Internet. It’s called a foundation model because you use it as a base for a lot of specific downstream tasks. So, ChatGPT is now used for all sorts of really specific things, but it’s trained on a common base. The same thing is true for images. You’ll perhaps be familiar with things like DALL-E, Stable Diffusion, or Midjourney; they take the same approach but for the image modality.

Now, when I talk about a foundation model for advanced materials, what I’m talking about is getting training data across the gamut of advanced materials, organic molecules and inorganic crystals, across batteries, semiconductors, catalysts, organic small molecules, fiber optics, solar panels, all the data that we can possibly find, and training on all of it, so we have a model that can really stretch across and be a foundation for any sort of R&D across advanced materials. Foundation models are really the crux of how we’re able to leverage AI in this day and age. And if you really want to be saying, “Well, we’re pushing the limits of what AI is able to do. We’re leveraging the most recent breakthroughs that have given everyone so much excitement around this area,” you’ve got to be building foundation models or using foundation models. In our case, there wasn’t a foundation model for materials science until we came along. So, that’s what we’ve built.

[0:11:09] HC: What does the data look like that goes into this model? Is it text describing the materials that you’re building? Or is it a graph of the molecular structure?

[0:11:18] JG: That’s a really good question. You can think of it as a 3D simulation where you have atoms that interact with one another. Now, the data that goes into the model, as you say, is a little bit like a graph, where atoms are joined when they’re interacting strongly or they’re bonded. That graph structure is the input modality that we use for our foundation model. That’s because so many of the properties that we care about are best represented in that 3D representation.

It’s a little bit like saying, “Well, why don’t you decompose your image into what looks like a sequence of numbers, the RGB values?” Yes, you could have an AI model that worked on that. But of course, the right thing to do is design an AI model that works directly on images, because images are the right way to represent that information. The same thing is true for materials. The right way to represent the information that you care about for a material is by looking at the 3D structure of what’s going on, the atoms’ 3D structure. So, that’s the input that we use for our foundation model.
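
As a rough illustration of that input modality, here is a minimal sketch (not Orbital’s actual pipeline) of turning a 3D atomic structure into a graph: atoms become nodes, and edges connect atoms within a distance cutoff, a simple proxy for “interacting strongly or bonded.”

```python
# Sketch: represent a material as a graph built from its 3D structure.
import numpy as np

def atoms_to_graph(atomic_numbers, positions, cutoff=5.0):
    """Build node features and an edge list for a set of atoms.

    atomic_numbers: (N,) ints, e.g. 8 for oxygen
    positions:      (N, 3) Cartesian coordinates in angstroms
    """
    positions = np.asarray(positions, dtype=float)
    dists = np.linalg.norm(
        positions[:, None, :] - positions[None, :, :], axis=-1
    )
    # Directed edge (i, j) wherever two distinct atoms sit within the cutoff.
    senders, receivers = np.nonzero((dists < cutoff) & (dists > 0.0))
    return {
        "nodes": np.asarray(atomic_numbers),              # which element each atom is
        "edges": np.stack([senders, receivers], axis=1),  # who interacts with whom
        "edge_vectors": positions[receivers] - positions[senders],  # 3D geometry
    }

# Example: a single water molecule (O at the origin, two H atoms).
graph = atoms_to_graph([8, 1, 1], [[0.0, 0.0, 0.0],
                                   [0.96, 0.0, 0.0],
                                   [-0.24, 0.93, 0.0]])
```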

We get a lot of that data from experiments, and we also get a lot of it from really high-quality, highly expensive quantum simulations. That’s really great for us, because it allows us to scale the amount of data that we put into our models while keeping the accuracy really high.
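
For illustration, a single quantum-simulation-labeled training example might hold something like the record below; the field names here are assumptions for the sketch, not Orbital’s actual schema.

```python
# Sketch of one labeled training example: a 3D structure plus the
# quantities a quantum simulation (or experiment) labels it with.
from dataclasses import dataclass
import numpy as np

@dataclass
class AtomicDataPoint:
    atomic_numbers: np.ndarray  # (N,) which element each atom is
    positions: np.ndarray       # (N, 3) 3D coordinates in angstroms
    cell: np.ndarray            # (3, 3) lattice vectors for periodic crystals
    energy: float               # label: total energy from the simulation
    forces: np.ndarray          # label: (N, 3) per-atom forces
```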

[0:12:42] HC: What does it take to build a foundation model for materials? The core components would be data, algorithms, and compute infrastructure. But can you put some numbers around this, or give me an idea of what went into it?

[0:12:57] JG: Right. As a mental model, the amount of compute that’s required for an advanced materials foundation model is kind of similar to the amount of compute for an image foundation model, and image foundation models are a lot lower in their compute requirements than text foundation models. So, you can look at the image model that Grok uses, if you wanted to go on x.com and use Grok: that was built by a company that had raised $30 million. Now, $30 million would not be anywhere near enough to go and train a large language model like ChatGPT, but you get one of the world’s best-performing image models for that price.

That tells you a lot about the scale of compute. The scale of compute is still really significant, we’re still using a lot of computers here, but it’s nowhere near as big as what a large language model would need. In terms of the scale of the data, we have a large amount of data out there on the Internet that’s publicly available, and a large amount of data that we can generate ourselves to train these models. I think it’s going to be perhaps an order of magnitude less than the number of images that you can find, but it’s still going to be in the hundreds of millions, getting up to billions, of data points that we can use.

Each of those data points has a lot of information contained in it. Each one is akin to a paragraph or so; it might contain something like 200 tokens. So, you really are still in a very large data regime when you think about training models at this scale. It’s just not the scale of resources required to train a large language model.
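
As a quick back-of-envelope check on those figures (using the numbers from the conversation, not official ones):

```python
# Back-of-envelope scale check using the figures mentioned above.
data_points = 5e8           # "hundreds of millions, getting up to billions"
tokens_per_point = 200      # "akin to a paragraph or so"
total_tokens = data_points * tokens_per_point
print(f"~{total_tokens:.0e} tokens")  # ~1e+11: a genuinely large data regime,
                                      # though below the largest LLM training runs
```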

[0:14:43] HC: What are some of the challenges you’ve encountered in building a foundation model?

[0:14:47] JG: So, I think the engineering challenges are always there when you start scaling up the number of computers you’re using and the amount of data you’re using. I’ve been doing this for quite a while now, and I’m still surprised at the level of engineering challenge involved. We’ve made a number of algorithmic improvements, and we’ve actually just open-sourced a version of our foundation model, which outperforms the models from Google and Microsoft. So, we’ve done really, really well, and part of that is algorithms.

But really, the core of what we do is engineer things really well. There are always challenges there with things like device failure, or trying to get the throughput and utilization of your GPUs as high as they can possibly be.

I’m surprised at how much that challenge still exists after the amount of resources that have gone into trying to make training these models easier. It still requires, I think, a lot of distributed-systems engineering expertise to make this stuff work really well.

[0:15:48] HC: How are you using your foundation model currently, and how do you plan to use it in the future?

[0:15:53] JG: So, we currently use it day to day in our lab to help design materials for our first materials products. This is both conducting high-fidelity simulations of what’s going on in candidate materials that we’ve designed, and designing entirely new materials using the generative part of our foundation model. So, we use it in all sorts of different ways within our lab, and what we find is that it’s really an accelerant to hypothesis generation and scientific creativity. You say, “Well, I want to design a material that has this property.” The AI will generate a number of different materials that could have that property, or give you a better sense of what would go on if you were to put a material under certain conditions. The AI can help with that.

That quick, iterative back and forth with a really high-quality set of predictions really accelerates the ability of our scientists to get to the right answer, to run the right experiments, and to limit the number of trial-and-error steps they would otherwise take. Sometimes people talk about AI being a fully autonomous replacement for scientific discovery, where you might have a robotic lab in which an AI creates a design, the robotic lab tries to synthesize that design, the design gets characterized, and that information gets fed back into the AI.

I think we actually find that, especially given the ability of AI to be really creative, and you see this in text and images, that creativity is a stimulant and a supportive tool for the intrinsic creativity and the chemical and materials knowledge of our incredible scientists. That’s actually where we see the acceleration really happening in what we care about, which is synthesizing and making incredible new materials.

[0:17:53] HC: Do you consider your foundation model to be complete? Or do you see opportunities for further enhancements? Foundation models are still a relatively new concept. So, I’m curious how you go about identifying areas to enhance your model.

[0:18:06] JG: I think one of the things that we really care about is giving back to the community that we’ve grown up in and benefited hugely from, that kind of open-source community. That’s why we have open-sourced a version of our foundation model, called Orb, for the community to use for non-commercial purposes, for research purposes.

Now, a side benefit of that for us is that we get to understand all the things that people really care about in getting this stuff to work. It’s great battle-testing to figure out where the problems are. That’s a great way for us to figure out what we need to prioritize when we think about developing and adding new functionality to our foundation model.
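
For listeners who want to experiment, here is a sketch of what trying the open-sourced Orb looks like via the orb-models repository on GitHub. The module and function names below follow that repository’s README at the time of writing and may change between releases, so check the README for the current API before running anything.

```python
# Sketch of loading an open-sourced Orb model as an ASE calculator,
# following the orb-models README (github.com/orbital-materials/orb-models).
# Names like `pretrained.orb_v2` and `ORBCalculator` come from that README
# and may differ by release.
from ase.build import bulk
from orb_models.forcefield import pretrained
from orb_models.forcefield.calculator import ORBCalculator

orbff = pretrained.orb_v2(device="cpu")        # download pretrained weights
atoms = bulk("Cu", "fcc", a=3.58, cubic=True)  # a small copper crystal
atoms.calc = ORBCalculator(orbff, device="cpu")
print(atoms.get_potential_energy())            # model-predicted energy in eV
```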

I don’t think that it’s complete at all. We’re going to continue improving it and adding new functionality. I’m extremely excited about the ability to add in agentic behavior: conducting automatic computational experiments and then writing a report of what’s been done, as a way to deliver the analysis, the simulation, the prediction, and all the other things you would ordinarily do, directly to a scientist. I think those sorts of things are going to be orders of magnitude more effective at accelerating scientific discovery and product development in this area than what we currently have. I think we’re just one step along the way, and there’s so much more that’s exciting to come.

[0:19:33] HC: Are there any lessons you’ve learned in developing this foundation model that could be applied more broadly to other data types?

[0:19:38] JG: I think that we are really excited that we can leverage a lot of what people have learned in other modalities for AI. So, for the most part, right now, we’ve really been taking a lot of what our team has learned working on large-scale models in other modalities and bringing it to materials.

I think the overall lesson that we’ve embraced and really care about is what I mentioned: using these models as ways to accelerate human creativity. I often think of scientists as creatives, in the sense that you’ve got to have inspiration to understand and know what experiments to run next. There are too many different possibilities that you could possibly think about, so you’ve got to have that spark of inspiration to guide yourself in those experiments. I think AI is a massively powerful creativity aid and accelerant, and we’ve really seen that in other areas of AI, and we’re bringing that to advanced materials.

I think AI for materials science is a little bit behind other areas of AI. In a certain way, we’ve still got some catching up to do before we start really innovating in new ways that can then be brought back into something like language.

[0:21:03] HC: Thinking more broadly about your role as a founder, is there any advice you could offer to other leaders of AI-powered startups?

[0:21:08] JG: I think that AI is superficially simple, in that if you have a maths background and you go and read some papers on large language models, the maths is incredibly straightforward, right? It’s really not very complicated maths. That can give you a superficial sense that AI is a simple scientific discipline, but the beauty of AI is in the weeds. It’s in every single little choice that goes into engineering and building what you’re doing. It’s a discipline that has an extraordinary requirement for attention to detail. So, my general advice for people who want to not just be a user of something like ChatGPT, but to actually engineer things on top of it, or even engineer AI systems in new areas that wouldn’t be covered by ChatGPT or the image generation models, is to maintain extremely high standards of engineering and scientific discipline.

If you don’t have that yourself, then you need someone who does. Because without that, I think it’s really hard to have differentiation in the performance of your AI systems. You’ll have to find differentiation through something else, maybe through UX or, I guess, more traditional sources of technology competitive advantage. But if you’re really serious about being an AI founder, I think you yourself as a leader, or someone at the leadership level, has to have that degree of knowledge and commitment to attention to detail.

[0:22:51] HC: Finally, where do you see the impact of Orbital Materials in three to five years?

[0:22:55] JG: Our mission is really broad: we believe that AI has the power to accelerate new materials development across the board. The way that we measure success is how quickly we can start to commercialize new materials in a number of different areas. So, for us, we want the way that you design new technologies for the energy transition to be using AI, and ideally some of the AI that Orbital Materials has developed. We believe that Orbital Materials will have developed and brought to market new materials that have a really significant impact, commercially and on the energy transition, taking something that would ordinarily have taken 10 years to develop and bringing it down to one. That’s the order-of-magnitude improvement that we think is possible with AI, and I think Orbital Materials has the best shot of anyone at achieving it.

[0:23:57] HC: This has been great. Jonathan, I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?

[0:24:05] JG: So, you can visit our website, orbitalmaterials.com. We’re also on LinkedIn, Twitter, and other areas of social media. I should say x.com, shouldn’t I? There you can find out a lot more about what we’re up to. You can also follow us on GitHub, where you can start to experiment with some of our models, and any feedback that you have for us would be great.

[0:24:22] HC: Perfect. I’ll link to all of those in the show notes. Thanks for joining me today.

[0:24:26] JG: It’s a pleasure. Thank you so much for having me.

[0:24:28] HC: All right, everyone. Thanks for listening. I’m Heather Couture and I hope you join me again next time for Impact AI.

[OUTRO]

[0:24:37] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend. And if you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.

[END]