Artificial Intelligence Helps Distinguish the Forests From the Trees: Part 2
Since 2015, World Resources Institute (WRI) and Orbital Insight have worked together with a grant from the Generation Foundation to find new applications of computer vision and deep learning that will support Global Forest Watch in better monitoring the world’s forests.
This post is Part 2 of a series that explores our latest project, which aims to identify where oil palm is being planted and grown across large areas of the tropics. In this post, we dive into the technological foundation of our oil palm classifier. See Part 1 for a breakdown of the real-world implications of mapping agricultural plantations.
As companies around the world pledge to end deforestation-linked palm oil production, it is critical that we are able to monitor when and where oil palm plantations are actually replacing natural forests. The proliferation of high-resolution satellite imagery and advancements in deep learning are now making it easier to differentiate natural forests from vast oil palm plantations.
Orbital Insight and Global Forest Watch (GFW) are working together to leverage these cutting-edge technologies to create preliminary oil palm maps for Malaysia, Cambodia, Indonesia and Colombia, with plans to expand to other palm oil producing countries soon.
Training an algorithm to identify land use
To create a model that identifies industrial oil palm plantations, we used supervised machine learning. This process involves providing the algorithm with examples of satellite images of oil palm plantations alongside non-plantation areas such as cities, bodies of water and natural forest. With these examples, the model can “learn” what a plantation looks like in satellite imagery. To classify tree plantations reliably, the algorithm needs an abundance of training examples. Our teams manually labeled over 3,000 satellite images that covered a diverse set of plantations and geographies.
After we marked the images, we trained an algorithm to distinguish industrial oil palm plantations using a technique called a convolutional neural network (CNN). We provided the model with a set of images and corresponding maps marked by our human experts. After “learning” with our training examples, the algorithm is able to identify industrial oil palm plantations based on clues such as texture and pattern of trees and roads. We evaluated the performance of our trained models by comparing the predictions to ground truth markings. Following this training phase, models can then be applied to large numbers of images to create country and region-wide maps.
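To make the idea of picking up on "texture and pattern" more concrete, here is a minimal sketch of the sliding-filter operation at the core of a CNN, written in plain Python. It is an illustration only, not the project's model: the `stripe_kernel` weights and the toy 5x5 images are hypothetical and hand-written, whereas a real CNN learns its filter weights from the labeled examples.

```python
from typing import List

Grid = List[List[float]]


def convolve2d(image: Grid, kernel: Grid) -> Grid:
    """Slide the kernel over the image (no padding, stride 1) and sum the products.

    As in most deep learning libraries, the kernel is not flipped, so this is
    technically a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


# Hypothetical hand-written filter that reacts to alternating bright/dark rows.
# In a trained CNN, weights like these are learned rather than written by hand.
stripe_kernel = [
    [1.0, 1.0, 1.0],
    [-2.0, -2.0, -2.0],
    [1.0, 1.0, 1.0],
]

# Toy inputs: alternating rows (loosely "planted rows") and a flat surface.
striped = [[1.0] * 5 if r % 2 == 0 else [0.0] * 5 for r in range(5)]
uniform = [[0.5] * 5 for _ in range(5)]

striped_response = convolve2d(striped, stripe_kernel)
uniform_response = convolve2d(uniform, stripe_kernel)

print(striped_response[0])  # strong response: [6.0, 6.0, 6.0]
print(uniform_response[0])  # no response:     [0.0, 0.0, 0.0]
```

The striped image produces large positive and negative responses while the flat one produces none. A CNN stacks many such learned filters, which is roughly how it can separate the regular rows of a plantation from the irregular canopy of natural forest.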
Mastering high resolution at scale
For this project, we used high-resolution satellite imagery from the imaging company Planet. This allowed both the human markers and the algorithm to clearly identify fine details of land cover that wouldn’t be visible in freely available but lower-resolution image sources like Landsat.
Planet also has nearly daily global coverage, allowing us to run the algorithm on a large scale and ensuring that we will obtain imagery on clear days, even in extremely cloudy parts of the tropics. We trained an additional CNN to identify which portions of an image were covered by clouds, allowing us to combine the cloud-free parts of each image to increase accuracy and provide greater geographic coverage.
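The compositing step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the production pipeline: the function name `cloud_free_composite` is hypothetical, the images are assumed to be already aligned on the same pixel grid, and the rule of taking the first cloud-free observation is just one simple choice. The cloud masks stand in for the output of the cloud-detecting CNN.

```python
from typing import List, Optional

Image = List[List[float]]      # a single band of pixel values
CloudMask = List[List[bool]]   # True where a pixel is covered by cloud


def cloud_free_composite(
    images: List[Image], cloud_masks: List[CloudMask]
) -> List[List[Optional[float]]]:
    """For each pixel, keep the first cloud-free observation.

    Pixels that are cloudy in every image stay None.
    """
    height, width = len(images[0]), len(images[0][0])
    composite: List[List[Optional[float]]] = [
        [None] * width for _ in range(height)
    ]
    for image, mask in zip(images, cloud_masks):
        for r in range(height):
            for c in range(width):
                if composite[r][c] is None and not mask[r][c]:
                    composite[r][c] = image[r][c]
    return composite


# Two toy 2x2 acquisitions of the same area on different days.
day1 = [[0.2, 0.9], [0.3, 0.9]]
mask1 = [[False, True], [False, True]]   # right column is cloudy
day2 = [[0.25, 0.4], [0.9, 0.9]]
mask2 = [[False, False], [True, True]]   # bottom row is cloudy

composite = cloud_free_composite([day1, day2], [mask1, mask2])
print(composite)  # [[0.2, 0.4], [0.3, None]]
```

The bottom-right pixel is cloudy on both days, so it stays empty. This is why frequent revisits matter: each additional image is another chance to fill the remaining gaps.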
The frequency and resolution of the Planet images created a massive volume of data, which posed a problem for traditional processing methods. In order to ingest, process and store petabytes of geospatial data (one petabyte is 1 million gigabytes!), Orbital Insight spent thousands of hours developing a cloud computing-based pipeline, which has now analyzed over 600,000 satellite images for this project alone.
A promising start
We now have a process in place that can analyze large volumes of imagery across entire countries. The prototype is successfully identifying large-scale plantations and is correctly classifying over 90 percent of pixels in a dataset marked by our experts.
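A pixel-level accuracy figure like the one above is computed by comparing each predicted pixel with the expert-marked pixel at the same location. The sketch below shows one way to do that, with hypothetical names (`evaluate_mask`) and toy data. It also reports precision and recall for the plantation class as an optional extra, since accuracy alone can look high when plantations cover only a small share of the image.

```python
from typing import Dict, List

Mask = List[List[int]]  # 1 = oil palm plantation, 0 = everything else


def evaluate_mask(predicted: Mask, truth: Mask) -> Dict[str, float]:
    """Compare a predicted mask with an expert-labeled mask, pixel by pixel."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1   # plantation correctly found
            elif p == 1 and t == 0:
                fp += 1   # false alarm
            elif p == 0 and t == 1:
                fn += 1   # missed plantation
            else:
                tn += 1   # non-plantation correctly left alone
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy 2x3 example: 4 of the 6 pixels agree with the expert labels.
predicted = [[1, 1, 0], [0, 1, 0]]
truth = [[1, 0, 0], [0, 1, 1]]

metrics = evaluate_mask(predicted, truth)
print(round(metrics["accuracy"], 3))  # 0.667
```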
We’ve made significant progress over the last few years, but we still haven’t completely accomplished our goal. The current datasets pick up large-scale row plantations reasonably well, but we’re still seeing some misclassification between oil palm and other, similar-looking plantations like bananas. The algorithm also can’t identify plantations until they are mature enough to be visible in satellite imagery, which means we may not be able to attribute deforestation to oil palm until several years after it occurs.
This project has taught us a great deal about the challenges of implementing deep learning for land use mapping. As noted above, the sheer amount of effort required to provide training examples was daunting. The algorithm may learn by itself, but it needed significant human help to assemble the materials for its learning. This has implications for the feasibility of these methods for other mapping projects. In addition, the price of high-resolution imagery may make expanding these maps further cost-prohibitive.
Our vision for the future
Over the next couple of months, we intend to update the existing maps through 2018, and expand to Peru, Liberia, Guatemala, Honduras and Papua New Guinea, which are seeing an influx of oil palm plantations.
We see long-term opportunities to better detect oil palm, perhaps by analyzing even higher resolution imagery. We could also use the process now in place to run more frequent updates of the algorithm and better identify change over time. And, of course, there is the potential to use these methods to map other commodities, forest types or deforestation risks.
This experimental project between GFW and Orbital Insight has taught us a great deal about applying deep learning technology to some of the world’s trickiest forest monitoring problems. We see great potential for continuing to refine this approach and expand it beyond oil palm in the years to come, improving the data-driven foundation for the global effort to end deforestation.
BANNER PHOTO: Palm oil plantation. Ridhwan Siregar/WRI