There are now a few different AI foundation models available for Earth Observation (EO) data. These vast neural networks can be rapidly fine-tuned for many downstream tasks, making them a highly versatile and appealing tool.

Today on Impact AI, I am joined by Hamed Alemohammad, Associate Professor in the Department of Geography at Clark University, Director of the Clark Center for Geospatial Analytics, and former Chief Data Scientist of the Radiant Earth Foundation, to discuss the applications of foundation models for remote sensing. Hamed’s research interests lie at the intersection of geographic information science and geography, using observations and analytical methods like machine learning to better understand the changing systems of our planet.

In this episode, he shares his perspective on the myriad purposes that foundation models serve and offers insight into training and fine-tuning them for different downstream applications. We also discuss how to choose the right one for a given project, ethical considerations for using them responsibly, and more. For a glimpse at the future of foundation models for remote sensing, tune in today!


Key Points:
  • A look at Hamed’s professional journey and the research topics he focuses on today.
  • Defining foundation models and the purposes they serve.
  • The vast amount of data and resources required to train and fine-tune a foundation model.
  • Ways to determine whether or not a foundation model will be beneficial.
  • How foundation models improve generalizability for downstream tasks.
  • Factors to consider when selecting a foundation model for a given downstream task.
  • Insight into the future of foundation models for remote sensing.
  • Hamed’s advice for machine learning teams looking to give foundation models a try.
  • His take on the impact of foundation models in the next three to five years.
  • Ethical considerations for the responsible use of AI that apply to foundation models too.

Quotes:

“[Foundation models] are pre-trained on a large amount of unlabeled data. Secondly, they use self-supervised learning techniques – The third property is that you can fine-tune this model with a very small set of labeled data for multiple downstream tasks.” — Hamed Alemohammad

“It takes a lot to train a model, but you would not [do it] as frequently as you would [fine-tune] the model. You can use shared resources from different teams to do that - share it as an open-source model, and then anybody can fine-tune it for their downstream application.” — Hamed Alemohammad

“The promising future [for foundation models] will be combining different modes of data as input.” — Hamed Alemohammad

“There is a lot to do and the community is eager to learn, so if people are looking for challenging problems, I would encourage them to explore [the foundation model domain] and work with domain experts.” — Hamed Alemohammad


Links:

Hamed Alemohammad
Hamed Alemohammad, Clark University
Hamed Alemohammad on LinkedIn
Hamed Alemohammad on X
Hamed Alemohammad on GitHub
Foundation Models for Generalist Geospatial Artificial Intelligence
Prithvi-100M on Hugging Face
HLS Multi-Temporal Crop Classification Model on Hugging Face


Resources for Computer Vision Teams:

LinkedIn – Connect with Heather.
Computer Vision Insights Newsletter – A biweekly newsletter to help bring the latest machine learning and computer vision research to applications in people and planetary health.
Computer Vision Strategy Session – Not sure how to advance your computer vision project? Get unstuck with a clear set of next steps. Schedule a 1 hour strategy session now to advance your project.


Transcript:

[INTRODUCTION]

[0:00:03] HC: Welcome to Impact AI, brought to you by Pixel Scientia Labs. I’m your host, Heather Couture. On this podcast, I interview innovators and entrepreneurs about building a mission-driven machine learning-powered company. If you like what you hear, please subscribe to my newsletter to be notified about new episodes. Plus, follow the latest research in computer vision for people and planetary health. You can sign up at pixelscientia.com/newsletter.

[INTERVIEW]

[0:00:34] HC: Today, I’m joined by guest, Hamed Alemohammad. Hamed is an Associate Professor in the Department of Geography at Clark University and the Director of the Clark Center for Geospatial Analytics. We’re going to talk about foundation models for remote sensing. Hamed, welcome to the show.

[0:00:50] HA: Hi, Heather. Thanks for having me here.

[0:00:52] HC: Hamed, could you share a bit about your background and how that led you to Clark University?

[0:00:57] HA: Sure. By training, I’m actually an environmental scientist. I did my PhD in environmental science at MIT. In my early days, even during my undergrad, I started using satellite imagery for various environmental monitoring projects. I was always fascinated with data analytics. It was actually during my PhD, going back to around 2009 through 2010, that I was introduced to computer vision.

Back then, well, it wasn’t much about AI. It was the early days of deep learning. Everything was computer vision, feature detection. I was applying those to remote sensing imagery of precipitation, doing a bit of statistical analysis. That got me excited about doing more in that direction. I started combining my domain knowledge of remote sensing with statistical analysis and gradually getting into AI and machine learning.

After MIT, I did two years of postdoc, basically doubling down on my domain knowledge in remote sensing. Then I joined a foundation called Radiant Earth as a senior data scientist, working on more applied science, with applications across the spectrum, particularly in data-scarce regions, such as Africa, where remote sensing data is available and AI can help you get answers.

We found the bottleneck was training data and benchmarks. We initiated a new effort, and I became the chief data scientist there to lead what we called Radiant MLHub, which was publishing benchmark data in our field and making sure the broader community could benefit from advancements in AI applied to remote sensing imagery. Toward the end at Radiant, I was the executive director, leading the organization across different initiatives.

Then earlier this year, back in January, I joined Clark. The reason I joined was that Clark established a new center, as you mentioned in my title, called the Center for Geospatial Analytics, which is basically a new initiative started by the university to build on Clark’s historical success in geography and geospatial science and be at the forefront of the analytics wave, which is now being driven by AI advancements.

At CGA, we are basically in charge of bringing the new analytics to campus in terms of the curriculum and teaching students, and also enabling new research, working with the outside community, both academic and non-academic, to build partnerships, relationships, and collaborations with Clark faculty. I’ve been leading these efforts since January and am excited to be back in academia and working with the students.

[0:03:21] HC: Great. Now that you’re back in academia, what research topics do you focus on?

[0:03:26] HA: Broadly speaking, continuing the track that I started during graduate school, I basically work with advanced analytics applied to various types of observation data to better understand what I call our changing earth system. In our science, and in geospatial broadly speaking, we have different types of data. We have satellite data, which comes in different modes and different types. We have in-situ measurements from different sensors, and a growing amount from mobile phones and, basically, people who do data collection on the ground.

My research involves how we can better combine these data to better characterize these patterns. This is particularly about the land surface. For example, we monitor soil moisture using satellite data, and we look at the interactions of soil moisture and vegetation on the ground and how these dynamics are changing over time. That will tell us how the climate, basically, is changing.

From an ML perspective, this is a very challenging problem, because in many parts of the world, we have these observations from satellites, but we don’t necessarily have what we call labels in ML. Let’s pick an agriculture example, which is one of the areas I work on: monitoring what crop types are growing in a specific region, or what the status of the crop is in terms of the yield we’ll have toward the end of the growing season. If you think about the US and most European countries, you have a lot of ground reference data, coming particularly from official government reports, like the USDA in the US. You can use those historical benchmarks to build models that can tell you, okay, this season, if I look at the satellite imagery, I can tell you what the yield will be toward the end of the season.

If you want to apply the same approach in a data-scarce country, which happens to be most of the developing countries around the world, you don’t have much ground reference data, and there is a domain shift in terms of agricultural practices, the scale of farming, the crop types, the seasonality and, basically, the climatology there, the elevation, everything that impacts the crop phenology.

Transferring those models to that region is a challenge. That’s an area I have worked on and am continuing to do more in, now that I’m back in the research world. That deals with the applications of AI, broadly speaking, to remote sensing. But agriculture, land cover, land use change, and broadly speaking, land surface processes are the areas I apply these technologies and techniques to.

[0:05:39] HC: One of the solutions, I understand that you’re focused on, is foundation models. What is a foundation model and what is its purpose?

[0:05:47] HA: Yes. Foundation models, as some of your audience might know, is a relatively new term, but at the same time, a very rapidly growing field. The term was coined about two, two and a half years ago by a paper that came out of a large group of researchers from Stanford. It’s basically talking about a new type of model that has three, I would say, general properties. One is they are pre-trained on a large amount of unlabeled data. Secondly, they use self-supervised learning techniques. The third property is that you can fine-tune this model with a very small set of labeled data for multiple downstream tasks.

Those tasks can be very different. They can be segmentation, regression, or classification. These models were primarily applied, in the early days, to language models. Many of the large language models that we know these days, like the ChatGPTs of the world, all started as foundation models. Now, they have become products and applications that are used. The idea is that, for example, in the satellite imagery world, we have a lot of imagery captured by these satellites regularly across the world, and we don’t necessarily have labels for them.

This is a very, I would say, suitable problem for a foundation model, because you can use self-supervised techniques, which means you don’t necessarily have labels for your data, but you can mask part of your data set, either part of a sentence in a language model, or part of an image in the case of satellite imagery, and train a model, which in the architectural sense is an encoder and a decoder. The encoder receives all these images and embeds all the information into the embedding dimension that you have.

Then in the training phase, the decoder is basically predicting those hidden, or basically, masked pixels. You’re basically learning the patterns in the data. After this phase, once you have trained the model, you can then, for any fine-tuning, any downstream application, replace your decoder with a different decoder. It can be for segmentation. It can be for regression. The theory, and it has been proven with many applications, is that with a limited number of labeled samples, you can basically have a model performing well on your specific downstream task.
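To make that mask-then-swap-the-decoder idea concrete, here is a minimal PyTorch sketch. It is not the architecture discussed in the episode; the layer sizes, masking ratio, and six-band 224-by-224 chips are illustrative assumptions only.

```python
# Minimal sketch of masked self-supervised pretraining followed by swapping the
# decoder for a downstream head. Shapes and layers are illustrative assumptions.
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Embeds a 6-band satellite chip into a latent feature map."""
    def __init__(self, in_bands=6, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_bands, dim, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, kernel_size=3, padding=1), nn.ReLU(),
        )
    def forward(self, x):
        return self.net(x)

class ReconstructionDecoder(nn.Module):
    """Pretraining decoder: predicts the original bands, including masked pixels."""
    def __init__(self, dim=64, out_bands=6):
        super().__init__()
        self.net = nn.Conv2d(dim, out_bands, kernel_size=1)
    def forward(self, z):
        return self.net(z)

class SegmentationHead(nn.Module):
    """Downstream decoder: swapped in at fine-tuning time for per-pixel classes."""
    def __init__(self, dim=64, num_classes=5):
        super().__init__()
        self.net = nn.Conv2d(dim, num_classes, kernel_size=1)
    def forward(self, z):
        return self.net(z)

encoder = Encoder()

# --- Self-supervised pretraining on unlabeled chips ---
chips = torch.randn(8, 6, 224, 224)                 # stand-in for unlabeled imagery
mask = (torch.rand(8, 1, 224, 224) > 0.75).float()  # keep ~25% of pixels visible
recon_decoder = ReconstructionDecoder()
pred = recon_decoder(encoder(chips * mask))
# Reconstruction loss is evaluated on the hidden (masked-out) pixels.
pretrain_loss = ((pred - chips) ** 2 * (1 - mask)).mean()
pretrain_loss.backward()

# --- Fine-tuning: reuse the encoder, replace the decoder, few labeled samples ---
labeled_chips = torch.randn(4, 6, 224, 224)
labels = torch.randint(0, 5, (4, 224, 224))         # per-pixel land-cover classes
seg_head = SegmentationHead()
logits = seg_head(encoder(labeled_chips))
finetune_loss = nn.functional.cross_entropy(logits, labels)
finetune_loss.backward()
```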

What we did this year, and what led to my group working on foundation models, was a joint project between NASA IMPACT, IBM Research, and our team at Clark. We worked on a foundation model for a specific type of satellite imagery, which is the Harmonized Landsat and Sentinel-2 (HLS) imagery. These are optical, and actually, not just optical, multispectral satellite imagery, including visible bands and spectral bands at infrared wavelengths, that are available globally.

We built a foundation model and then tested it on several downstream applications to see if there is value, basically, in using this kind of model instead of an end-to-end, fully supervised model. What are the trade-offs in terms of the training time, the performance, the sample sizes that you have? Basically, our hypothesis was, is it worth doing foundation models for remote sensing? The short answer is yes, but there are always trade-offs in terms of the computation resources you need and everything. It has been a very successful project and we’re continuing to expand it in other directions as well.

[0:09:02] HC: Seeing as you’ve recently trained one of these foundation models, what does it take to train one? I’m thinking in terms of how much data, what resources, infrastructure, how long does it take? Can you put some numbers on these types of things?

[0:09:17] HA: To be fair, the actual training of the foundation model in our project was done by IBM as part of our collaboration. The numbers I’m going to share are from that team. We couldn’t do it alone, honestly, as an academic unit. But it takes a lot of effort. The reason is, as I mentioned, these are trained on massive amounts of unlabeled data. In the case of satellite imagery, you’re basically scraping the data stores of all the satellite data, or in the language case, scraping the web for all the content, and then feeding that into a model.

The model itself is very heavy in terms of architecture. The model that we used, and we have released it publicly on Hugging Face and the paper is also available on arXiv, includes 100 million parameters. It is not massive compared to, I would say, the GPT models, because those are in the billions of parameters. But 100 million is still a very significant number.

In terms of the training samples, we had 175,000 chips. For each chip, we actually had three scenes from the satellite, because it’s a time series model. It looks at the temporal patterns. You can imagine 175,000 images multiplied by three, and each of these, unlike the typical images in computer vision, which are three-band RGB, are actually six-band, because we have the multispectral data. That adds to the complexity and the size of the data.
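For a rough sense of scale, here is a quick back-of-envelope calculation using the figures quoted here and the 224-by-224 chip size mentioned just below; the 16-bit-per-value assumption is illustrative only.

```python
chips = 175_000
scenes_per_chip = 3        # time-series model: three acquisitions per chip
bands = 6                  # multispectral, versus 3-band RGB in typical computer vision
height = width = 224       # chip size mentioned below
bytes_per_value = 2        # assumption: 16-bit reflectance values

values = chips * scenes_per_chip * bands * height * width
print(f"{values:,} pixel values")                    # about 158 billion
print(f"~{values * bytes_per_value / 1e12:.2f} TB")  # roughly 0.32 TB uncompressed
```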

These are 224 pixels by 224 pixels, which is the typical size we use in these types of models. The training takes a lot of time. I think with all the cloud resources IBM had, it took about a week to train the model. It is heavy training on a lot of data, but the paradigm is, once you do that heavy lifting and release that model open source, then a group can basically take it, because it’s a trained model, so you don’t need all the GPU resources for training; for fine-tuning, you can use much smaller resources, basically. That’s what we did at Clark, for example. We received that trained model from the IBM team and used our own HPC cluster.

We only need two GPUs to fine-tune it for a downstream application, actually multiple of them. One was gap-filling of cloudy pixels, one was, for example, land cover segmentation. We could do it in half a day, with very limited data, around 3,000 samples, for example. That’s the paradigm shift. It takes a lot to train a model. That’s correct. But the idea is you would do it once, or I would say, not as frequently as you would do, for example, fine-tuning of the model. You can use shared resources from different teams to do that, then share it as an open-source model, and then anybody can fine-tune it for their downstream application.

That’s what’s happening in the language world. Many of the GPT models are being served even through APIs that people can go and feed their own data to and fine-tune. In remote sensing, I think we are behind. We are not at the production level yet, with products that people can consume. It’s more at the research and R&D level. That’s why the model is hosted, for example, on Hugging Face, or GitHub. People download it and use it on their own platforms. It’s getting there. There is a lot of interest from different groups, particularly with the growing availability of satellite data. There is a lot of interest in improved analytics, and foundation models are definitely one of those solutions.

[0:12:21] HC: You mentioned earlier that there’s some nuance to when a foundation model is beneficial for a downstream task and when it might not be. Do you have any general idea on some principles on when it might be helpful and in what situations a foundation model might not be the best choice?

[0:12:36] HA: Yeah. As I mentioned, this is basically an active area of research. There is good evidence in the language world that there’s definitely benefit in training foundation models and then fine-tuning them for downstream applications. In medical imaging, there is good evidence that these can do better compared to the traditional, basically, fully supervised, or even semi-supervised models.

In remote sensing, we are in the early days, but we have shown this in the paper, and actually, the fact that this field is growing so rapidly shows it too: within two months of us releasing our model, another paper came out that basically used our model for another downstream application, without even working or collaborating with us. Completely independent. It’s very rapid development.

That paper also shows there is value in building that foundation model, because in many cases, you don’t have the downstream labels. My take, in terms of the usefulness of foundation models in remote sensing, is that, as I mentioned earlier, we capture imagery globally, everywhere and consistently, particularly from many of the open-access satellites run by the NASAs of the world, or ESA, the European Space Agency. These are open satellite data that capture imagery very frequently and from everywhere on the earth. But we don’t necessarily have those target labels on the ground, and it’s sometimes challenging, or impossible, to collect those labels depending on what’s happening on the ground.

For that reason, foundation models that can basically learn the patterns in the data without having labels are becoming very useful. You can then fine-tune them with very limited data and potentially be able to generalize. We can talk a little bit more about that later: generalizing to applications, or to regions, that are outside of your training zone, basically, or training domain. Generally, they are becoming very helpful. But I would say, we are still exploring them.

I haven’t seen, and we haven’t come to, any, I would say, framework for saying, this is definitely useful for this application and not for that. So far, what we have seen is that for any application we have tested, it has been better than, basically, an end-to-end model. To simulate data scarcity, one of the things we did was basically randomly subsample the training data that we had, create data-scarce scenarios, and test the model against those when we were doing the fine-tuning.

In all of those cases, even with zero-shot learning, we were showing the foundation model can perform much better than a fully supervised model, even compared with generative models, like GANs. There’s definitely promise in the foundation model world. I would say, if someone is in this domain, they should definitely explore it for their application.
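Here is a hedged Python sketch of that data-scarcity experiment: randomly subsample the labeled fine-tuning set at several fractions and score whatever training routine you plug in. The `train_and_eval` callable is a hypothetical stand-in; in practice it would fine-tune the foundation model (or train a supervised baseline) on the subset and return a validation metric.

```python
import random

def simulate_data_scarcity(labeled_samples, train_and_eval,
                           fractions=(1.0, 0.5, 0.1, 0.01), seed=0):
    """Evaluate a training routine under shrinking label budgets."""
    rng = random.Random(seed)
    scores = {}
    for frac in fractions:
        k = max(1, int(len(labeled_samples) * frac))
        subset = rng.sample(labeled_samples, k)   # simulate a data-scarce region
        scores[frac] = train_and_eval(subset)
    return scores

# Toy usage: the "score" here just reports the subset size; a real run would
# compare a fine-tuned foundation model against a fully supervised baseline.
samples = list(range(3000))
print(simulate_data_scarcity(samples, train_and_eval=len))
```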

[0:15:03] HC: Let’s talk a bit more about generalizability. This is something I’ve done a fair amount of reading myself on, but mostly for medical applications. I’m interested in what you have to say, especially from the remote sensing perspective. How do foundation models improve the generalizability of models to domain shift for the downstream task?

[0:15:20] HA: Yeah. Generalizability in our field, particularly the out-of-distribution generalizability that we’re talking about, I want to be explicit about that; the challenge there is related particularly to the geographical diversity of the data, as I mentioned. I want to basically clarify the landscape and then say how foundation models are helpful. The challenge, going back to the agriculture example I mentioned earlier, is that if you look at the scale of farming in the US, for example, you have these massive fields in the Midwest, or other parts of the country, which are consistently managed. They have a very consistent, I would say, management practice for how they manage the farm in terms of the seed they grow. It’s very automated, with machinery that is used to plant, to water and irrigate, and also to harvest.

When you go to a smallholder-dominated farm region, which are farms smaller than one hectare or so, or actually, depending on which organization’s definition you look at, one acre or less, they are very heterogeneous in terms of management practices. Even the land itself within one farm may not be basically homogeneous. There’s a lot of variation there: in the seed, in the soil type, in the fertilizer that is applied, if they apply fertilizer at all.

You can’t necessarily collect data in those diverse and heterogeneous regions to have a good representative model there. The promise, and what we have seen so far in terms of the generalizability of foundation models, is that if you can train them where we have data-rich labels, and the same satellite data is captured everywhere, then we can transfer the model that is trained in a data-rich region to a data-scarce region with very few samples there. We are showing that in the data-scarcity scenario I mentioned, by reducing the sample size. With the caveat that we are still using in-distribution samples there, so we are not going out of the distribution. In the paper we haven’t shown that, but internally, we are working on examples showing that out-of-distribution samples are also performing well.

We are doing that literally by taking the model from one region to another region of the world, which has a completely different distribution in terms of the target variable we are interested in. We are also working with the Kenya Space Agency on more practical applications on the ground for land cover mapping, again applying the same model. The promise for us, and what generalizability means in the, basically, foundation model world, is that because the model learns the patterns in the raw data and the embeddings that I mentioned in the first place, it has a better, basically, capability to generalize to unseen data, because in the training phase, the model doesn’t learn from the labels. It learns from the input data, basically. It is not learning specific things in the target variable; it’s actually learning the original domain of the data. The X’s of the problem, basically.

The promise is that because of that, the model learns all those patterns and then it can quickly, with few samples, fine-tune to your target domain and, basically, generalize better to that problem. There is good evidence of this in a lot of fields. In remote sensing, it’s very new again, because foundation models have been around in our field for maybe less than six months since the first model came out, but there is evidence that it works very well in other domains, and soon in remote sensing as well.

[0:18:30] HC: Foundation models for remote sensing are newer than for other areas, like NLP or computer vision. If you had multiple foundation models to choose from when you start building for a downstream task, how would you decide which one is most suitable?

[0:18:43] HA: There are multiple aspects to this. One is to look at the data that the model is trained on. The difference in remote sensing compared to computer vision in other domains is that the imagery and the data fed to the model can be very different. In computer vision, everything is, for example, imagery from cellphones and digital cameras, and so on. Here, the satellite that is in orbit capturing the imagery has a particular spectral resolution, meaning what bands it’s collecting imagery in, and a particular spatial resolution. Is it 10-meter, is it 1-meter, is it 50-meter? Then there is the temporal resolution: if you’re interested in a time series model, do you have enough, basically, temporal observations?

Looking at what data the model was trained on is a critical aspect. Note that, and we show this in our paper, you can train a model on one satellite’s data and potentially use it on another one, if they have very similar spectral properties in the input data; we show that in an example in the paper. The other aspect is the representativeness of the data. If the data used for training comes from certain parts of the world and you’re relying on generalization for your fine-tuning, you need to be careful with that. Being aware of that problem is definitely something to take into account.

I think the solution to that is, honestly, to test the model. You can’t say in advance whether the model necessarily generalizes or not, unless they have shown in their paper, or in the publication, that it works in other parts of the world, or that it generalizes well to domain shifts. The other aspect to take into account is, basically, the properties of the satellite input data. I mentioned the resolution and everything, but some satellite data have better quality compared to others. For example, at 30-centimeter resolution, which is the highest spatial resolution you can get from satellite imagery, you can see many fine details, like roads and buildings and infrastructure as individual objects. But if you come to 30-meter, you can’t necessarily see those, basically, phenomena on the ground.

That scale matters in terms of what model you want to use, because now we are actually seeing foundation models in remote sensing that are pixel-based. With a typical convolutional neural network, you apply it to the image in 2D space and then derive, basically, insights, whether it’s for segmentation or any other application, but particularly for segmentation, honestly. Now we are seeing models in remote sensing that work with imagery of 30 meters and coarser that are pixel-based. At 30 meters, you are losing a lot of those spatial correlations, basically. It doesn’t make sense to apply the convolution in space; you apply it instead in time, because that’s more important for your application. You want to look at the phenology of, for example, surface properties over time.

There is actually a foundation model that is doing that, pixel-based in time, and it works pretty well. Very interestingly, that model is much lighter because it needs fewer parameters. It doesn’t care about space, and it’s easier to fine-tune and even to pre-train from scratch. That is, basically, a trade-off you need to take into account. If you’re looking at that level of resolution, 30 meters or more, you might be good enough with a pixel-based model, versus a convolutional or a ViT model, which uses vision transformers.
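As a rough illustration of that trade-off, here is a minimal PyTorch sketch contrasting the two families: a pixel-based model that convolves along the time axis only, and a spatial model that convolves over the image. Shapes, band counts, and layer sizes are illustrative assumptions, not any published model.

```python
import torch
import torch.nn as nn

# (batch, bands, time, height, width): a 6-band, 12-step time series of 30 m chips
x = torch.randn(2, 6, 12, 64, 64)

# Pixel-based temporal model: convolve along time only, treating each pixel
# independently; sensible when coarse pixels carry little spatial correlation
# but a strong phenological (seasonal) signal.
temporal_model = nn.Sequential(
    nn.Conv3d(6, 32, kernel_size=(3, 1, 1), padding=(1, 0, 0)), nn.ReLU(),
    nn.Conv3d(32, 8, kernel_size=(3, 1, 1), padding=(1, 0, 0)),  # 8 output classes
)

# Spatial (CNN/ViT-style) alternative: convolve over space for a single date,
# which matters more at sub-meter resolution where objects span many pixels.
spatial_model = nn.Sequential(
    nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 8, kernel_size=3, padding=1),
)

print(temporal_model(x).shape)          # torch.Size([2, 8, 12, 64, 64])
print(spatial_model(x[:, :, 0]).shape)  # torch.Size([2, 8, 64, 64])
```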

Those would be the factors I would suggest people take into account when they are looking at these models. At the end of the day, again, as I mentioned, you need to test a couple of them to see which one works better. But you can, basically, narrow down the jungle of models out there using these criteria. Yeah.

[0:22:07] HC: Thinking about the future of foundation models for remote sensing, I imagine there’s a few different avenues you could explore, pre-training them on larger data sets, trying different architectures, trying out different pretext tasks. What do you see as the future for foundation models here?

[0:22:23] HA: I think you mentioned some of them, and I mentioned the pixel-based model in the previous answer. That’s definitely one area, looking at different architectures. But I think the promising future will be combining different modes of data as input. Because so far, the foundation models we have seen work with one type of satellite data, which is the multispectral data from one particular satellite. But the challenge with those satellites is that they collect optical data, so they can be obscured by cloud, and many times you just don’t have a valid observation, because it’s all clouds and you don’t see the surface.

We have other types of satellites, like radar, which are in orbit and also collect imagery frequently. The idea would be to combine and fuse these types of data in the same model, so the model can basically run inference no matter what type of data is the input. I think that’s the promising future for foundation models in remote sensing: fusion of different types of data, and then exploring all the other avenues, like what architecture is best. Should we go pixel-based? Should we go with a ViT-based one, and so on? But fusion is the advancement that we are all looking forward to. That’s an area I’m also interested in, because I worked on radar data a lot in the past. But the fusion is the interesting part now for all of us in the community.
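Here is a hedged PyTorch sketch of that fusion direction: separate encoders for optical and radar chips project into a shared feature space, so inference can proceed with whichever modality is available, for example radar only under cloud. The architecture, band counts, and sizes are illustrative assumptions, not any model mentioned in the episode.

```python
import torch
import torch.nn as nn

class FusionModel(nn.Module):
    def __init__(self, dim=32, num_classes=5):
        super().__init__()
        self.optical_enc = nn.Conv2d(6, dim, kernel_size=3, padding=1)  # 6 multispectral bands
        self.radar_enc = nn.Conv2d(2, dim, kernel_size=3, padding=1)    # e.g. VV/VH backscatter
        self.head = nn.Conv2d(dim, num_classes, kernel_size=1)

    def forward(self, optical=None, radar=None):
        feats = []
        if optical is not None:
            feats.append(self.optical_enc(optical))
        if radar is not None:
            feats.append(self.radar_enc(radar))
        # Average the available modality embeddings into one shared representation.
        z = torch.stack(feats).mean(dim=0)
        return self.head(z)

model = FusionModel()
optical = torch.randn(1, 6, 224, 224)
radar = torch.randn(1, 2, 224, 224)
print(model(optical=optical, radar=radar).shape)  # both modalities available
print(model(radar=radar).shape)                   # cloudy scene: radar only
```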

[0:23:37] HC: Is there any advice you could offer to machine learning teams looking to give foundation models a try?

[0:23:42] HA: I mean, first, a suggestion which might sound very trivial, but catch up with the literature. It’s a very rapidly growing field. There’s a lot of development coming up. You want to learn what is going on in the community. Sometimes, as in our case, we released the model as soon as it was ready, and then we worked on the publication. Check out models that come out on Hugging Face or other repositories, be aware of the developments there, and then read the literature as well.

The second thing I would say is, understand the domain problem and see what the challenge is there before designing the actual model, because there might be some practical constraints in the domain itself that you want to incorporate in your design, whether it’s about the resources the model takes to run at training time or at inference time, or whether it’s about the availability of data. Maybe you’re training your model on some data which works pretty well, but in practice, when it goes to the operational stage, you don’t have access to that data, or in some parts of the world where you’re interested in deploying this model, you don’t have that data. That data problem can be something that we need to take into account in the training phase.

Then the quality of the data is another aspect. Sometimes when training models, particularly in the foundation model world, we have high-quality data, and this comes back to the generalization question: we can train it, for example, in the US, because we have very high-quality data, and that data is available in other parts of the world, but it is not necessarily the same quality. It might be noisier. Is the foundation model resistant to that noise, basically? Taking into account that aspect of foundation model training and inference is another key point.

I think in terms of foundation model research in remote sensing, there is a lot to explore, particularly with the different types of data we have. Another fusion problem to think about is bringing in-situ data into the game. I mentioned the fusion of different types of satellite data, but another area to explore is whether we can learn from in-situ measurements, whether it’s weather station data, for example, or other types of sensor data on the ground. Can a foundation model’s inference be improved by embedding those, basically, inputs into the model?
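One very simple way that could look in code, as a hedged sketch: pool an image embedding and concatenate it with a few ground measurements before a prediction head. The feature choices and sizes here are illustrative assumptions only.

```python
import torch
import torch.nn as nn

# Pool a 6-band chip into a single embedding vector.
image_encoder = nn.Sequential(
    nn.Conv2d(6, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),          # -> (batch, 32)
)
head = nn.Linear(32 + 3, 1)                          # 3 in-situ features, 1 target (e.g. yield)

chip = torch.randn(4, 6, 224, 224)
station = torch.randn(4, 3)                          # e.g. rainfall, temperature, soil moisture
prediction = head(torch.cat([image_encoder(chip), station], dim=1))
print(prediction.shape)                              # torch.Size([4, 1])
```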

There is a lot to do and the community is eager to learn, so there is a good opportunity. If people are looking for challenging problems, I would encourage them to explore this domain, and also work with domain experts, because there’s always good knowledge you can learn from them as you’re applying AI techniques to the field.

[0:25:59] HC: Finally, where do you see the impact of foundation models over the next three to five years?

[0:26:04] HA: Impact, I would classify into positive impacts and negative impacts, honestly. I think the positive one is definitely that we are going to solve problems and build new applications where we have limited data. The foundation models are going to help a lot with those problems and application areas. These can be regions where labeling is hard, or it can be in terms of temporal shifts, because one of the applications of, basically, foundation models in geospatial is climate projections. Can we basically do climate projection more accurately? I mean, with less uncertainty. When you look at the, basically, uncertainty envelopes on many of the physical models for climate projection, there is a wide range as you move toward 2100, for example.

One of the research areas is how we can embed AI, ML broadly speaking and foundation models more specifically, into that. I think foundation models have a role in impacting that field particularly. I’m confident we will see a lot of improvements in that field with the advancements in foundation models. The generalization there is in time, not necessarily in spatial or geographical domains.

Definitely, the fusion will come in. I’m sure we will see in the next three to five years a lot of foundation models that use different types of satellite data and let us benefit from the wealth of data we are collecting globally. But at the same time, I think we need to be careful with the negative impacts, and this might be easier to talk about in language terms. Many of us are using ChatGPT now. What will happen if everybody uses ChatGPT, or the ChatGPTs of the world, I’m not necessarily saying that particular app, for generating text? Are we going to lose creative writing skills if we do that?

Thinking about, basically, human skills and how we benefit from foundation models, I think, is a key aspect. In remote sensing, the downside might be that we are able to detect patterns and derive insights about human behavior, which touches on human privacy, basically. Being careful with the applications we apply these to is another aspect to think about, because we are detecting things, we are seeing patterns from satellite imagery that we couldn’t even imagine 10 years ago. That’s the side of ethical applications and responsible use of AI, and there is no exemption for foundation models either. I think that’s the downside. But the community is very much, I would say, aware and active in making sure we have discussions around that and we, basically, build in safeguards around it.

[0:28:24] HC: This has been great, Hamed. I appreciate your insights today. I think this will be valuable to many listeners. Where can people find out more about you online?

[0:28:32] HA: We have our website. I can share it with you to put on the page for the podcast. There’s also my LinkedIn account and CGA’s LinkedIn account, where we regularly share a lot of content about our work, our opportunities, and things ongoing at CGA. Yeah, I will share the links so people can follow us for updates. I look forward to interactions with the community; if they have feedback or want to share things with us, hopefully they’ll get in touch with us, definitely.

[0:28:54] HC: Yeah. Well, I’ll definitely include all of that in the show notes, as well as links to your recent papers about the foundation models. Thank you for joining me today.

[0:29:03] HA: Thank you, Heather.

[0:29:04] HC: All right, everyone. Thanks for listening. I’m Heather Couture, and I hope you join me again next time for Impact AI.

[END OF INTERVIEW]

[0:29:14] HC: Thank you for listening to Impact AI. If you enjoyed this episode, please subscribe and share it with a friend. If you’d like to learn more about computer vision applications for people and planetary health, you can sign up for my newsletter at pixelscientia.com/newsletter.

[END]