Artificial Intelligence Helps Distinguish the Forests From the Trees: Part 2

Dec 17, 2018

Eric Lewandowski, Tom Swartz, Lisa Wang, Adam Kraft and Mikaela Weisse

5 minutes

Training an algorithm to identify land use

To create a model that identifies industrial oil palm plantations, we used supervised machine learning. This process involves providing the algorithm with satellite images of oil palm plantations alongside images of non-plantation areas such as cities, bodies of water and natural forest. With these examples, the model can “learn” what a plantation looks like in satellite imagery. To classify tree plantations reliably, the algorithm needs an abundance of training examples. Our teams manually labeled over 3,000 satellite images that covered a diverse set of plantations and geographies.

After we marked the images, we trained an algorithm to distinguish industrial oil palm plantations using a technique called a convolutional neural network (CNN). We provided the model with a set of images and corresponding maps marked by our human experts. After “learning” with our training examples, the algorithm is able to identify industrial oil palm plantations based on clues such as texture and pattern of trees and roads. We evaluated the performance of our trained models by comparing the predictions to ground truth markings. Following this training phase, models can then be applied to large numbers of images to create country and region-wide maps.
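To give a rough sense of how a convolutional layer can pick up clues such as the regular rows of a plantation, the sketch below slides one hand-written 3x3 filter over a tiny grayscale grid. It is an illustration only: the filter values, the toy patches, and the function names `convolve2d` and `mean_abs_response` are assumptions made for this example, not the model used in this project. A real CNN learns many such filters from the labeled images rather than having them written by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (nested lists) without padding.

    This is the cross-correlation that CNN layers typically compute.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


def mean_abs_response(feature_map):
    """Average strength of the filter response over the whole patch."""
    values = [abs(v) for row in feature_map for v in row]
    return sum(values) / len(values)


# Hypothetical hand-written filter that reacts to horizontal stripes.
ROW_FILTER = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]

# Toy 6x6 patches: alternating bright/dark rows versus a flat surface.
striped_patch = [[1.0 if r % 2 == 0 else 0.0 for _ in range(6)] for r in range(6)]
uniform_patch = [[0.5 for _ in range(6)] for _ in range(6)]

striped_score = mean_abs_response(convolve2d(striped_patch, ROW_FILTER))
uniform_score = mean_abs_response(convolve2d(uniform_patch, ROW_FILTER))
print(striped_score, uniform_score)
```

The striped patch produces a strong response while the flat patch produces none, which is the kind of texture signal a trained network can combine with many others to separate planted rows from other land cover.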

Mastering high resolution at scale

For this project, we used high-resolution satellite imagery from the imaging company Planet, which allowed both the human markers and the algorithm to identify fine details of land cover that would not be visible in freely available but lower-resolution image sources like Landsat.

Planet also has nearly daily global coverage, allowing us to run the algorithm on a large scale and ensuring that we will obtain imagery on clear days, even in extremely cloudy parts of the tropics. We trained an additional CNN to identify which portions of an image were covered by clouds, allowing us to combine the cloud-free parts of each image to increase accuracy and provide greater geographic coverage.
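The idea of combining the cloud-free parts of several images can be sketched as follows. This is a minimal example under stated assumptions: the cloud masks are taken as given (in the project they came from a separate CNN), and taking the per-pixel median of the clear observations is one common compositing rule chosen here for illustration, not necessarily the rule used in the actual pipeline. The function name `cloud_free_composite` and the toy values are hypothetical.

```python
from statistics import median


def cloud_free_composite(images, cloud_masks, nodata=None):
    """Per-pixel median of the observations that are not flagged as cloud.

    images:      list of 2D grids (nested lists) covering the same area
    cloud_masks: matching 2D grids; True means the pixel is cloudy
    nodata:      value used where every observation is cloudy
    """
    height, width = len(images[0]), len(images[0][0])
    composite = []
    for r in range(height):
        row = []
        for c in range(width):
            clear = [
                img[r][c]
                for img, mask in zip(images, cloud_masks)
                if not mask[r][c]
            ]
            row.append(median(clear) if clear else nodata)
        composite.append(row)
    return composite


# Three toy 2x2 acquisitions of the same place; values near 100 stand in for bright cloud.
images = [
    [[10, 99], [12, 14]],
    [[11, 20], [98, 15]],
    [[97, 22], [96, 95]],
]
cloud_masks = [
    [[False, True], [False, False]],
    [[False, False], [True, False]],
    [[True, False], [True, True]],
]

composite = cloud_free_composite(images, cloud_masks)
print(composite)
```

More acquisitions give each pixel more chances of a clear view, which is why frequent coverage matters in persistently cloudy regions.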

The frequency and resolution of the Planet images created a massive volume of data, which posed a problem for traditional processing methods. To ingest, process and store petabytes of geospatial data (one petabyte is 1 million gigabytes!), Orbital Insight spent thousands of hours developing a cloud computing-based pipeline, which has now analyzed over 600,000 satellite images for this project alone.

A promising start

We now have a process in place that can analyze large volumes of imagery across entire countries. The prototype is successfully identifying large-scale plantations and is correctly classifying over 90 percent of pixels in a dataset marked by our experts.
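The share of correctly classified pixels is a simple metric to compute. The sketch below compares a predicted map with a human-marked one, pixel by pixel. The function name `pixel_accuracy`, the class labels, and the toy grids are assumptions made for illustration, and the numbers are not project results.

```python
def pixel_accuracy(predicted, ground_truth):
    """Fraction of pixels whose predicted class matches the human-marked class."""
    correct = 0
    total = 0
    for pred_row, truth_row in zip(predicted, ground_truth):
        for pred, truth in zip(pred_row, truth_row):
            total += 1
            if pred == truth:
                correct += 1
    if total == 0:
        raise ValueError("no pixels to compare")
    return correct / total


# Toy 4x5 maps: "P" = planted forest, "N" = not planted forest.
ground_truth = [
    ["P", "P", "N", "N", "N"],
    ["P", "P", "N", "N", "N"],
    ["P", "P", "P", "N", "N"],
    ["P", "P", "P", "N", "N"],
]
predicted = [
    ["P", "P", "N", "N", "N"],
    ["P", "P", "P", "N", "N"],  # one pixel wrongly marked as planted
    ["P", "P", "P", "N", "N"],
    ["P", "P", "N", "N", "N"],  # one planted pixel missed
]

accuracy = pixel_accuracy(predicted, ground_truth)
print(f"{accuracy:.0%} of pixels classified correctly")
```

A single overall figure like this can hide which classes are being confused with each other, so it is usually read alongside per-class results.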

We’ve made significant progress over the last few years, but we still haven’t completely accomplished our goal. The current datasets pick up large-scale row plantations reasonably well, but we’re still seeing some misclassification between oil palm and other, similar-looking plantations like bananas. The algorithm also can’t identify plantations until they are mature enough to be visible in satellite imagery, which means we may not be able to attribute deforestation to oil palm until several years after it occurs.
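Confusion between similar-looking classes, such as oil palm and banana, is easier to see when the errors are counted per pair of classes rather than rolled into one accuracy figure. The sketch below is a hypothetical illustration: the class names, the toy pixels, and the helper names `confusion_counts` and `recall` are assumptions, not outputs or code from the project.

```python
from collections import Counter


def confusion_counts(predicted, ground_truth):
    """Count (true_class, predicted_class) pairs over all pixels."""
    counts = Counter()
    for pred_row, truth_row in zip(predicted, ground_truth):
        for pred, truth in zip(pred_row, truth_row):
            counts[(truth, pred)] += 1
    return counts


def recall(counts, cls):
    """Share of pixels truly in `cls` that the model also labeled `cls`."""
    truly_cls = sum(n for (truth, _), n in counts.items() if truth == cls)
    return counts[(cls, cls)] / truly_cls if truly_cls else None


# One toy row of six pixels; one banana pixel is mistaken for oil palm.
ground_truth = [["oil_palm", "oil_palm", "oil_palm", "banana", "banana", "forest"]]
predicted = [["oil_palm", "oil_palm", "oil_palm", "oil_palm", "banana", "forest"]]

counts = confusion_counts(predicted, ground_truth)
print(counts[("banana", "oil_palm")])  # banana pixels labeled as oil palm
print(recall(counts, "banana"), recall(counts, "oil_palm"))
```

In this toy case the overall accuracy is high, yet half of the banana pixels are mislabeled, which is the kind of class-specific weakness that the pair counts expose.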

This project has taught us a great deal about the challenges of implementing deep learning for land use mapping. As noted above, the effort required to provide training examples was daunting: the algorithm may learn by itself, but it needed significant human help to assemble the materials for its learning. This has implications for the feasibility of these methods in other mapping projects. In addition, the price of high-resolution imagery may make expanding these maps further cost-prohibitive.

Planet images of Colombia; Orbital Insight human-marked ground truth with planted forest class in light green and not planted forest in blue; the prediction made by our final trained model.

Our vision for the future

Over the next couple of months, we intend to update the existing maps through 2018, and expand to Peru, Liberia, Guatemala, Honduras and Papua New Guinea, which are seeing an influx of oil palm plantations.

Land use results for a region in Borneo. Blue represents natural forest class, green represents planted forest class and orange represents urban class.

We see long-term opportunities to better detect oil palm, perhaps by analyzing even higher resolution imagery. We could also use the process now in place to run more frequent updates of the algorithm and better identify change over time. And, of course, there is the potential to use these methods to map other commodities, forest types or deforestation risks.

This experimental project between GFW and Orbital Insight has taught us a great deal about applying deep learning technology to some of the world’s trickiest forest monitoring problems. We see great potential for continuing to refine this approach and expand it beyond oil palm in the years to come, improving the data-driven foundation for the global effort to end deforestation.

BANNER PHOTO: Palm oil plantation. Ridhwan Siregar/ WRI
