Climate Intelligence from Satellite Data with Abhilasha Purwar from Blue Sky Analytics

In this episode, I sit down with Abhilasha Purwar, founder and CEO of Blue Sky Analytics, to explore the groundbreaking realm of climate intelligence derived from satellite data. Abhilasha’s captivating journey, from engineering to environmental research and policy consulting, reveals her passion for addressing climate change through data and technology. Blue Sky Analytics is on a mission to bridge the gap between satellite data and actionable insights, monitoring everything from carbon projects to wildfire risks and infrastructure assets.

Discover the pivotal role of machine learning in analyzing vast amounts of satellite imagery and how it’s transforming our ability to measure and combat climate change with precision. Abhilasha shares compelling examples of Blue Sky Analytics’ models, from monitoring forests to assessing biodiversity and air quality. We dive into the challenges of satellite data procurement and the importance of open data and open source in advancing climate solutions. Find out how Blue Sky Analytics measures its impact, learn valuable advice for AI startup leaders, and get a glimpse of the inspiring future where all forests and lakes become digital public assets worldwide. Tune in now to discover the power of satellite data with pioneer Abhilasha Purwar!

Key Points:

Abhilasha's background and her journey to founding Blue Sky Analytics.
Blue Sky Analytics and how the company is helping combat climate change.
Discover the role of machine learning at Blue Sky Analytics.
Exciting applications of the Blue Sky Analytics models.
The challenges and hurdles of relying on remote sensing data.
Hear why open data and open-source software are essential.
How Blue Sky Analytics plans to bridge the paywall gap.
Fascinating and potential future applications of satellite data.
Insights into the single performance indicator Blue Sky Analytics uses.
She shares key advice for leaders of AI-powered startups.
The vision that Blue Sky Analytics has for the future.

Quotes:

“What Blue Sky really does is effectively monitor the pulse of the planet.” — Abhilasha Purwar

“Objectivity and numbers ground us and they serve as some sort of truth and some sort of objectivity against all kinds of these emotionally-driven debates.” — Abhilasha Purwar

“What machines are able to do in one day? It would take you and I 10,000 years or something to do.” — Abhilasha Purwar

“The bottleneck of building out that trust within the community, building out that trust for the community with other stakeholders, can really be solved if the community was to collaborate with each other.” — Abhilasha Purwar

Links:

Abhilasha Purwar on LinkedIn
Abhilasha Purwar on X
Blue Sky Analytics

Resources for Computer Vision Teams:

LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.

Transcript:

[INTRODUCTION]

[00:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven, machine-learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.

[INTERVIEW]

[0:00:34.0] HC: Today, I’m joined by guest, Abhilasha Purwar, founder and CEO of Blue Sky Analytics, to talk about using satellite data for climate intelligence. Abhi, welcome to the show.

[0:00:45.1] AP: Thanks Heather, I’m really glad to be here.

[0:00:47.6] HC: Abhi, could you share a bit about your background and how that led you to create Blue Sky Analytics?

[0:00:52.8] AP: So my name is Abhilasha Purwar. I have been in the climate space, honestly, since 2010. So it’s going to be a full 13 years soon. I went to an engineering school in India, at IIT, and I studied chemical and materials engineering, and I was really involved in, like, how to process dye pollution, like the textile-side pollution, and the use of different kinds of photo-oxidation to break down those dyes and clean that polluted water.

From there, I moved into solar cells and how to make different kinds of solar cells, and made flexible solar cells in the lab when I was 20 years old, and kept on working on that side of research. Somehow, I found myself working with the Indian government and with two economists who are now Nobel laureates, Abhijit Banerjee and Esther Duflo, at Jameel Poverty Action Lab, where we were consulting different Indian ministries, especially the Ministry of Environment, on a series of projects.

So my journey was really taking me from environmental problems that are technical problems, or problems around building technology solutions, to then moving into the more policy aspect of it, working with governments and looking at scale-up or different incentives to implement different solutions.

So to get deeper into the whole thing, I went to Yale Environment School in 2015 and I focused on environment and economics and really started going more into, like, the financial element of it. After graduation, I worked at a private equity fund in Connecticut on how to deploy different kinds of capital, different sources of capital, with different risk appetites, in various kinds of projects for clean energy transition.

And with that whole journey, across different spectrums, in 2020 I was like, “I think there is a role for something like data and technology, which can be more catalytic,” because as different frameworks were emerging, climate was no longer just a nonprofit, just an activist exercise. It was coming into the mainstream of business, financing, and decision-making, whether we should build a real estate project here or not.

What should different kinds of, you know, loan rates be, what should different insurance policies be? And I think another element was how climate was first coming into this thing as a real financial number, which it simply never was 10 years ago, 20 years ago. People wouldn’t think of climate change as a liability back then, and I really thought that at that moment, there was room to build something, a different kind of organization, almost like what Bloomberg told and molded in the 1990s, we set out to do that in the 2020s.

[0:03:25.4] HC: So what all does Blue Sky Analytics do and why is this important for tackling climate change?

[0:03:30.2] AP: So what Blue Sky really does is effectively monitor the pulse of the planet. Like, we look across all different kinds of like, let’s say, carbon projects or water, you know, risk or flood risk, drought risk in different places. We do wildfire monitoring, we monitor different kinds of infrastructure assets like how solar projects are being developed or how infrastructure projects are being developed.

So using satellite data, we monitor the planet, in very short and simple terms, but the way it really links us to the environment and climate is through either carbon projects or through asset monitoring or through climate risk analytics. And when we look into something, let’s just quickly take, you know, a forest. So there’s a large tract of land, which is either a national park or a managed timberland by a company.

This is a natural resource. Now, this natural resource could be depleted due to various reasons. It could be enhanced and improved, and it might have a series of risks like wildfire risk, and thereby, to monitor what was happening in the past, what is happening right now, and what may happen in the future becomes quite important. That’s what we do.

Previously what we found was that multiple organizations were doing this work in-house, and a lot of it was in silos, and there were some inefficiencies, so to speak, that were flowing through the system. In addition, we also looked into the space market and what was happening in the satellite data industry.

Just like the explosion of our iPhones and phone sensors, phone cameras, resolution cost, the same thing was happening in you know, the space industry but that data was not being really analyzed and used. So we found that there was a lot of data out there, there was a lot of data with various kinds of satellite companies but the analytics, the intelligence from it, the information was not going to the stakeholders.

So we decided to build in that middle, what you can call the neck of the funnel. Let’s take data from different kinds of sources, let’s process it, analyze it, make models on it, run different machine learning algorithms and different kinds of AI, and give different kinds of outputs to people, which they can use.

So something which is accessible, ready to use. Like, whether we talk about, let’s say, wildfire risk or flood risk or just flood verification or measurement of damage, all of these are final products, so that you don’t have to go out there, figure out which satellite data to buy, buy it, analyze it, build up your data scientist team. You can just think of that work as done by somebody else and, you know, do the final work yourself.

As for why it’s important for tackling climate change: I think that saying we have, that what you don’t measure, you cannot solve, really applies here, because the way most of us perceive climate change, and the reason why there’s so much discrepancy and so much doubt, and why even in 2023 you have people who don’t believe climate change is real, is that we’ve made it a very picture-heavy or video-heavy and heat-heavy sector, or a sort of emotionally driven sector, rather than a more data-driven sector.

When you have concrete numbers, like, “Hey, in Canada, these have been the wildfires in the past 50 years and this was the wildfire in 2023, in this season alone,” this kind of really proves that there is something else going on. These numbers are pure proof. I think then you really build out more cogency, a sort of harmony across different stakeholders. January of this year was rife with all the controversies in carbon markets, as you might be very aware.

Like, anybody who has been in the climate sector. I think the Guardian’s article on carbon markets got probably, you know, the highest views and so on. Everybody was talking about that and everybody was attacking each other. “Oh, you are more right than I am more right” and it’s really become this like, you know, race to who is more right.

But I think at some point numbers, methodologies, algorithms, which might have some bias here and there but are fairly mathematical, fairly objective, solve that kind of finger-pointing, and that’s why I think what we do, and what multiple organizations like us do, in the space of climate change is very important, because objectivity and numbers ground us and they serve as some sort of truth and some sort of objectivity against all kinds of these emotionally-driven debates.

[0:07:53.6] HC: So what role does machine learning play in your monitoring technology? Maybe you have some examples of the types of models that you train and the powerful insights that you’re able to get from them?

[0:08:04.6] AP: Yeah, great question. I mean, you know, there are so many pictures that are taken by different satellites of our planet. It’s quite impossible to analyze those pictures if we were not to use these different tools that are available to us. If you were from the GIS sector, you might remember that, you know, 10 years ago, you would take your tiles, put them in software like ArcGIS, draw boundaries on them, and then do certain kinds of analysis.

And now, a lot of that has been sort of like changed by automatic codes, that code goes and learns. So for instance, what we have in Blue Sky is like, if you know, we would scan the satellite images and then, learn that this is the water boundary. This is the water-land boundary, this is how this shape is changing, this is how this depth measurement is happening.

So there are statistical models and then there are machine learning models, and both combined, I think, really allow us the scale to do this kind of analysis for the entire planet, rather than for a very small geographical area, in the kind of, I would say, volume and cost. So that is the machine learning benefit, and I think there is a quote by somebody, which is that what machines are able to do in one day, it would take you and I 10,000 years or something to do that.

Obviously, now, what happens is that there are errors and biases and inaccuracies, and sometimes these models do not fit properly. Sometimes the models identify something which is not a water body as a water body, and so there is definitely very important room for correction and building it out, solving biases, improving accuracies.

Really having that discussion around the algorithm and methodology and having almost like that community engagement around improving models and accuracies but the matter of the fact is that today, in 2023, we can talk about monitoring the entire planet because we have both like AI machine learning and cloud computing with us. We could not have done this 10 years ago.
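To make the water-boundary idea above concrete, here is a minimal sketch in plain Python. It is not Blue Sky Analytics' method, which the interview does not describe. It assumes the widely used Normalized Difference Water Index, NDWI = (green - NIR) / (green + NIR), with a simple threshold as a stand-in for a learned model. The function names, the threshold of 0.0, and the toy reflectance values are all hypothetical.

```python
def ndwi(green, nir):
    """NDWI for one pixel: (green - NIR) / (green + NIR); 0.0 if both bands are 0."""
    total = green + nir
    return 0.0 if total == 0 else (green - nir) / total


def water_mask(green_band, nir_band, threshold=0.0):
    """Return a 2D list of booleans: True where NDWI exceeds the (assumed) threshold."""
    return [
        [ndwi(g, n) > threshold for g, n in zip(g_row, n_row)]
        for g_row, n_row in zip(green_band, nir_band)
    ]


def boundary_pixels(mask):
    """Return (row, col) of water pixels with at least one land pixel in the 4-neighbourhood."""
    rows, cols = len(mask), len(mask[0])
    edge = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if any(0 <= i < rows and 0 <= j < cols and not mask[i][j]
                   for i, j in neighbours):
                edge.append((r, c))
    return edge


# Toy 3x4 reflectance grids (made-up values): water is bright in green, dark in NIR.
green = [[0.30, 0.30, 0.10, 0.10],
         [0.30, 0.30, 0.10, 0.10],
         [0.30, 0.30, 0.30, 0.10]]
nir = [[0.05, 0.05, 0.40, 0.40],
       [0.05, 0.05, 0.40, 0.40],
       [0.05, 0.05, 0.05, 0.40]]

mask = water_mask(green, nir)
shoreline = boundary_pixels(mask)
water_fraction = sum(map(sum, mask)) / (len(mask) * len(mask[0]))
```

Running the same functions on images from different dates and comparing `shoreline` and `water_fraction` is one simple way to track how a water body's shape changes. The misclassifications mentioned above are exactly what a fixed threshold like this one tends to produce, which is where trained models and community review come in.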

[0:10:03.9] HC: What are some other examples of models you could build? You talked about water boundaries there. Are there some other common models that your team is building?

[0:10:12.2] AP: Oh my God, we have so many. We monitor forests, for instance, so we count the number of trees, we monitor deforestation, we monitor wildfires in a forest. So really, a lot of models around forests. We are also now in the process of building a model around biodiversity of forests using different spectrums of satellite data.

That is like still in works. So you know, it’s going to take some time for us to get into that, like time and capital both to get into that level. Yeah, so like forest, water. We started our journey with monitoring air quality. That was like our very first model and now, we also work a lot on infrastructure development.

So roads and highways and real estate, solar farms, different things like that. Just, I think, overall, what simple statistical model plus cloud computing can do is I think, tremendous. You add a layer of machine learning onto it and it becomes almost like 10, a hundred X.

[0:11:05.8] HC: And all these models are largely based on satellite data. What kinds of challenges do you encounter in working with this form of data?

[0:11:13.9] AP: For us, I think the challenge that satellite data is very heavy, that it’s crude, that it’s not a simple format of data, has largely been solved in our team. The challenge we really face is procurement. Most companies upstream kind of almost guard the data. So there’s a lot of data, a lot of pictures that are taken by, I think, the thousand-plus satellites in orbit, but procuring it is not always that easy.

Building partnerships to get the data streams is not always that easy. So I think the human element of accessing satellite data is probably more challenging than the type of data or processing it or how heavy or, you know, difficult it is. That is definitely one challenge. I think the second challenge is, especially with the rise of more and more private sector companies, so when we talk, you’re also from the remote sensing field, if I’m not wrong.

[0:12:07.4] HC: That’s right.

[0:12:07.9] AP: Yeah. So when you look at data from NASA or the European Space Agency, one dataset has multiple users, right? Lots of academic users, lots of users in the open source arena. So any discussion around any sort of bias or error is quite out there. So for data, the more that people use it, the better the quality of that data becomes.

For other sources of data, I would even say that where the data is not always that easily accessible from vendors, the discussion around quality and biases is ever so slightly poorer, and that’s another challenge with a lot of private data: you can get the images, but there might be a bias in those images.

The sensors might not be calibrated properly, you might have to correct for something, and because there’s just not that many users, like sometimes you simply don’t know what you don’t know. So I think from the perspective of consistency, even though the public data might be lower resolution, it is highly consistent and it has a high amount of documentation and usage and community around it.

With private data, the visible spectrum has been fairly easy for us to absorb and use within our workflows. With other spectrums you find some challenges. I think each of the upstream companies should work really harder on building out their downstream analytics community, because the more companies like us, or organizations and universities like us, in the downstream sector can use their data and analyze their data and build a sort of body of knowledge around it, the more they will find that their data is valuable.

But instead, the current business model, venture funding, everything, leads to almost a guarding of that data. It’s like, “Oh, it’s so expensive. I’m only going to sell it to you for like a thousand dollars a tile.” I think having a certain amount of data streams almost openly available to the community, it could be of not-so-important assets or something of that sort, would just build out more discussion around any sort of biases and inaccuracies.

“Hey, this is showing this but it does not this.” Those aspects, which sometimes many of us are left to work on in silos. So yeah, procurement and then secondly, lack of community around usage of private data.

[0:14:26.2] HC: Yeah, so related to the lack of community with the satellite data. Open-source tools are ubiquitous in AI, along with many publicly available datasets for benchmarking algorithms. What role do open data and open source play in the climate crisis?

[0:14:41.2] AP: Very important. I mean, take pretty much anybody who is making a, let’s say, higher resolution, higher frequency model. Let’s say we build something which has a resolution of every day, compared to another open data or open source model which has a resolution of quarterly or monthly or something. Those are datasets which are widely accepted, which are openly available.

The community knows them, and even if, let’s say, their resolution or frequency is lower, they all serve as a benchmark. So when you are improving, you’re building out a model, you can go back and say, “Okay, my outcomes are parallel.” Otherwise, without those models, there will be almost no way to calibrate against, no way to benchmark against, right? So I think a lot more research is needed in the open source, in the ground-truthing realm.

Let me pick an example of monitoring a forest, right? So multiple groups across the world, Blue Sky, various very heavily venture-funded companies, public agencies, foundations, nonprofits, universities, are mapping forests across the world. Now, what happens is that, let’s say, UNC, University of North Carolina, has mapped the forest and has made that data publicly available. Then what Blue Sky can do is this: say I’m doing, let’s say, a bunch of forests in India.

I can run my model on the UNC forest, see what my result is compared to the UNC result and if it is really, really widely off, I know that my model is horrible in simple words, right? So I can start to work on it, I can improve it, I can account for that, “Okay, this thing is more different than were available in India” and so on and so forth but when there is just simply no data because everybody is guarding their models very close to their chest, everybody is keeping their data private, then it really becomes very difficult to calibrate against anything.
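The calibration check described above, running your own model on an area where an open reference map exists and seeing how far off it is, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: both maps are taken to be binary forest masks on the same grid, the two toy masks are made up, and the 0.5 cut-off for "horrible" is a hypothetical choice, not a figure from the interview.

```python
def agreement(pred, ref):
    """Compare two equally sized binary forest masks (1 = forest, 0 = not forest)."""
    tp = fp = fn = tn = 0
    for p_row, r_row in zip(pred, ref):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1      # both maps say forest
            elif p:
                fp += 1      # only our model says forest
            elif r:
                fn += 1      # only the reference says forest
            else:
                tn += 1      # both maps say not forest
    union = tp + fp + fn
    return {
        "iou": tp / union if union else 1.0,           # intersection over union
        "accuracy": (tp + tn) / (tp + fp + fn + tn),   # share of pixels that agree
        "false_positives": fp,
        "false_negatives": fn,
    }


# Made-up 3x4 masks: an openly published reference map and our model's output.
reference = [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [1, 0, 0, 0]]
ours = [[1, 1, 1, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0]]

report = agreement(ours, reference)
needs_work = report["iou"] < 0.5  # hypothetical threshold for "widely off"
```

Intersection over union is used here rather than plain accuracy because forest often covers a small share of a scene, so a model that predicts no forest anywhere can still score high accuracy. None of this is possible without a shared reference map, which is the point being made about open data.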

So I think previously, the model, and by model I mean the business or the organizational model in the climate industry, was majorly universities and nonprofits. Now, whatever the technical competencies of those organizations, everything was in some format in the form of a peer-reviewed paper and everybody was calibrating against each other, right?

There has been a tremendous rise of climate tech companies in the last five years, with multiple private organizations, multiple think tanks, multiple technical or techno NGOs, so to speak, or 501(c)(3) tech companies or B Corps and so on in the space. I would say that if everybody starts to put just a little bit, like 1% of their work, in the sort of digital public commons kind of a thing, that would be tremendously helpful to everybody in the community because we can all improve our models and calibrate them.

But it’s really, really large, not a single organization, not a single company, would ever be able to really solve all of the global demand. The demand is way higher than the supply. So the bottleneck of building out that trust within the community, building out that trust for the community with other stakeholders, can really be solved if the community was to collaborate with each other.

[0:17:49.7] HC: Yeah, the progress of AI we’ve seen over the last 10 or so years, I think part of the reason that’s been so rapid is because of the open source and open data and all that and these large language models that the pace just keeps increasing. If open source and open data could help make progress towards fighting climate change a whole lot faster, that’s our – we all could definitely benefit from that.

[0:18:12.7] AP: Definitely. I mean, honestly, we have been thinking about this of late, and this would probably be the first public podcast. I still need to run it by our investors and the rest of the community, but within Blue Sky, we are really thinking about essentially either opening a foundation or one of those organizations, whereby a lot of the analysis that we’ve done, which is currently behind the paywall, can be brought ahead of the paywall and made readily available for pretty much the entire world.

So there are some assets which belong to a client, right? That’s a private asset, I cannot technically release its information. It is on the client to release the information, but then there are some assets which are just public assets. Let’s say a forest, a public forest, a national park, it’s a public asset. Now, obviously, you can’t always have all of the analysis of all of the public assets because there is a cloud cost.

There is a cost of procuring satellite data and everything, there is a cost of engineers, they have to have their salaries. So it cannot be technically completely free, but if the, you know, just base-level, no-margin-added funding is figured out for a digital public good, then that public good can be available to the public at large, for pretty much anybody to be able to access and use, almost like a Google Maps or something. To be able to – something very simple. You know, we had flooding in Delhi last month, and even right now, the state of Himachal Pradesh, which is just a few miles north of where I live, north of Delhi, has had an immense amount of flooding. I think in one month, they got more than 400 millimeters of rainfall. I think one city got 100 mm plus rainfall in a single day.

So all the water bodies, all the rivers are at maximum capacity. They’re overflowing; the Yamuna was at an all-time historical high level, I think, a month ago. Some of this data, sure, your governments and your news channels and your national weather agencies are able to inform via news, but if somebody else could join in there and inform people like, “Hey, the river is at an all-time high, and if there were to be this much rainfall, the probability is that this much area is going to be submerged.”

If this information was available publicly seven days prior, people can move. This information is powerful, it can save lives, and I find it really sad that some of this information is available with us behind paywalls and we don’t have the resources to either put it ahead of paywalls, make it accessible to people, or do a little more research whereby we could make those models more accurate, make those models more developed and advanced.

I think there is a tremendous role, especially within climate, for open data and open source. Something which is a challenge in this industry, though, is that open source and open data are typically an extremely foundation/nonprofit model. Technology works the best, as you might have seen, within venture-backed companies, with the engineers, the incentives, the data scientists, a lot of that, and both those models are growing.

I am not saying one is better than the other but both of those sectors exist and both of them are doing great work. So probably finding a middle-ground structure where both can collaborate more actively without like those typical stereotypes that we have about each other’s sectors could be really beneficial.

[0:21:28.7] HC: So you mentioned use cases related to water and related to forests. Are there other powerful use cases in machine learning and geospatial data that we have yet to see and that you would like to see?

[0:21:40.5] AP: I think currently we have only touched the surface of what is possible with the use cases in water and forest, like we’ve just touched the surface. Forest, water, wildfire, I would say the penetration of that analytics, that intelligence, within the market, with the stakeholders, people who could actually take decisions on that basis, with the layperson, with somebody.

For instance, there was a wildfire in Greece, and it is fairly easy for a high-quality machine learning model with geospatial data to do a seven or 10-day prediction and inform the people who are taking flights to Greece and booking resorts that, “Hey, there is a high likelihood of wildfire in that region, don’t go.” It’s quite simple. So we do have that kind of technology available with us today.

But the challenge has been in the penetration and adaptability of that technology with the right kind of stakeholders, decision-makers, and people who can do something with that information. Information is fine but information is as good as the person who has access to it and who has agency to use that information for the good, right? I think that’s where we really lack.

My personal belief is that taking that information, the final user who can do something about it is really a very private sector for them because you can take that information to airlines, to insurances, to resort owners. The incentives are different if we were to move that dissemination of information to typically a nonprofit model, it may or may not be adopted by people because you have to figure out the incentive why people will do something.

We’ve established that the incentive that people will do something because it’s just good may or may not be enough. If that was the case, we would have solved climate change a decade ago or something. So I think the real game in the industry for use cases of machine learning and open data and geospatial data is going to be adoption, it actually being used by the stakeholders.

You’d be surprised to know that most electric utilities across the world, most water utilities across the world, most cities for city planning, most disaster management agencies, they’re not using the best technology today, both I would say in developed countries and developing countries, US, North America, Europe, India, like you know, all of those geographies, technologies, their adoption is still some years to go.

[0:24:12.5] HC: Thinking more broadly about what you are doing at Blue Sky, how do you measure the impact of your technology?

[0:24:18.5] AP: We recently came up with a singular KPI. We had a series of multiple KPIs and we realized that that’s just not the best way to measure. So our singular KPI is area under monitoring. We think that more area under monitoring, that’s the achievement, like more area that various clients or various stakeholders are paying for. We are procuring satellite data for them, we are doing the analysis for them and obviously, it’s our job to make the analysis better.

Get more and more accuracy, better resolution, better frequency, essentially making a better product for your clients and your users. That goes without saying, but having more of that product, in very simple ways, I think is one of the singular best KPIs. I think if the whole foundation thing or the digital public asset thing that I’ve been thinking about takes off, then we’ll have another KPI, which will be how much area under monitoring is with digital public assets, and what is the API consumption, or how many people are hitting our APIs.

Is it a million, two million, three million different kinds of people who are hitting those APIs? So we will start to segment in those, but yeah, more area under monitoring and more utility of those APIs by various types of people.
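As a rough illustration of the KPI described above, the sketch below totals area under monitoring and splits out the share held as public assets, along with API usage on those assets. Everything here is hypothetical: the `MonitoredAsset` fields, the asset names, and the numbers are invented for the example and do not come from Blue Sky Analytics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonitoredAsset:
    name: str
    area_km2: float   # area covered by monitoring, in square kilometres
    is_public: bool   # True for a digital public asset, False for a client asset
    api_calls: int    # API requests served for this asset in the period


def kpi_summary(assets):
    """Aggregate the headline KPI (total area) plus the public-asset segment."""
    total = sum(a.area_km2 for a in assets)
    public = sum(a.area_km2 for a in assets if a.is_public)
    return {
        "area_under_monitoring_km2": total,
        "public_area_km2": public,
        "public_share": public / total if total else 0.0,
        "public_api_calls": sum(a.api_calls for a in assets if a.is_public),
    }


# Made-up portfolio for illustration only.
assets = [
    MonitoredAsset("client timberland", 1200.0, False, 0),
    MonitoredAsset("national park", 800.0, True, 15000),
    MonitoredAsset("public lake", 50.0, True, 5000),
]

summary = kpi_summary(assets)
```

A single number like `area_under_monitoring_km2` is easy to track over time, while the public segment and its API calls correspond to the second KPI that would be added if the public-asset effort goes ahead.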

[0:25:36.3] HC: Is there any advice you could offer to other leaders of AI-powered startups?

[0:25:41.4] AP: Oh my god, I am actually a fairly junior founder when it comes to this whole realm of AI and startups, so, you know, I will say we have made a lot of mistakes. So we’re on a different journey, and I would say that our biggest error, a mistake I would not make in the future, which is something that other founders coming into the space could probably look into, is this: ship out the product faster and give it to different kinds of customers quicker. So I think we took a little bit more time in pursuit of more accuracy, quality, things like that, and obviously that matters, but if I were to do it differently, I would send literally the crappiest version, like that model that we had discarded, to a bunch of people just for reference. You don’t have to use it, you don’t have to pay for it, but just for your reference, we’re working on this.

We’ll keep on improving, but look at this horrible thing that I made and just give a general comment. But I would say that one of the reasons why we ended up not doing that was because clients or users, general users, are all used to technology products like Google and Facebook. So we all have very high expectations, right? When we see an MVP (Minimum Viable Product), we want the MVP to look like this beautiful, smooth product.

So I think that was another reason why because we would put that as a bold and really in caps like disclaimer, “This is only a work in progress. Take it with a pinch of salt.” People would not read it but I would say you know, getting this to people faster, getting your product to people faster would be my biggest advice.

[0:27:18.7] HC: Finally, where do you see the impact of Blue Sky in three to five years?

[0:27:22.6] AP: Oh my god, that is a very difficult question. I think if you are able to get both, we’re doing pretty good on – we’ve identified a couple of areas where we have like we’re solving a direct customer need, ironically that need is outside of climate change and sustainability but it is still a big customer need and we are solving it. So if that area keeps on growing that will be able to provide us with like a very consistent and sustainable revenue.

While on the other side, if you are able to get more digital public assets out, my dream honestly would be to have all the forests and lakes of the world available to everybody as digital public assets, so that would be my dream. If we’re able to get there in three to five years, that would be unbelievably fantastic.

[0:28:07.1] HC: This has been great. Abhi, your team at Blue Sky Analytics is doing some really interesting work in tech and climate change. I expect that the insights you shared will be valuable to other AI companies. Where can people find out more about you online?

[0:28:19.8] AP: Oh, so we have a website called blueskyhq.io but if you write Blue Sky Analytics, we pop up. You can find me on LinkedIn as Abhilasha Purwar, and I frequently comment about different kinds of things on LinkedIn and Twitter but I am also writing a couple of books related to climate change. So I hope to have them out soon and then I’ll share them with you and if you can give me a shoutout and then people can read it, further along.

[0:28:44.2] HC: Perfect, I’ll look forward to it. Thanks for joining me today.

[0:28:47.6] AP: Thanks so much, Heather. It was great talking to you.

[0:28:50.4] HC: All right everyone, thanks for listening. I’m Heather Couture and I hope you join me again next time for Impact AI.

[END OF INTERVIEW]

[0:29:01.1] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share with a friend, and if you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.

[END]