Tagging Earth Observation Data to Monitor Deforestation

AI-powered tagging for deforestation monitoring uses decades-long satellite-derived climate data and related information to detect habitat loss over multiple years. It combines programmatic access, standard frameworks, and native Python tools so that teams can quickly discover and process information. Building training records from consistent climate data and derived variables reduces the influence of noise and seasonality.

Quality and traceability are key. Each tag is linked to observations, metadata, and formal reviews, allowing AI models to learn from continuous changes rather than one-off events.

Quick Take

  • Uses long-term records to disentangle interannual variability from ongoing forest loss.
  • Consistent data and variables reduce noise and improve labeling quality.
  • Access is programmatic and Python-friendly, making it ideal for production workflows.
  • Tags link to observations and metadata, so each one can be traced and reviewed.
  • The complete workflow covers tag definition, detection, and processing.

Climate Data Records and Support for Consistent Tagging

Stable, calibrated records are the foundation for reliable markers of canopy change. The markers are tied to official records so that the training data reflects ongoing structural changes rather than transient noise.

Fundamental Climate Data Records (FCDRs) are calibrated sensor observations. They store brightness, backscatter, and auxiliary calibration information to ensure the time series is consistent across sensor changes.

Climate Data Records (CDRs) are derived geophysical variables used for tagging. Interim CDRs (ICDRs) are released with shorter latency, so teams can tag recent observations in near real time while maintaining comparability with final releases.

Combine passive visible-infrared and thermal measurements with active radar and lidar to capture both canopy structure and atmospheric context. Auxiliary sensor calibration data and metadata are essential for traceability.

  • Time series integrity. Calibration and homogenization prevent artificial jumps across years.
  • Origin of tags. Each tag records whether it comes from an FCDR, CDR, or ICDR.
  • Quality assurance. Reviews at each stage, from research to operations, check that data entries are accurate and valid.
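
One way to keep the origin of a tag explicit is to store the record type alongside the label. The sketch below is a minimal illustration, not a description of any particular product: the `RecordType` and `Tag` classes, their fields, and the `prefer_final` helper are all hypothetical names. It assumes that a final CDR tag should replace an ICDR tag for the same tile and year once the final release is available.

```python
from dataclasses import dataclass
from enum import Enum


class RecordType(Enum):
    FCDR = "FCDR"  # calibrated sensor observations
    CDR = "CDR"    # derived geophysical variables, final release
    ICDR = "ICDR"  # interim release of a CDR


@dataclass(frozen=True)
class Tag:
    tile_id: str             # hypothetical identifier of the labeled area
    year: int
    label: str               # e.g. "canopy_loss"
    record_type: RecordType  # where the underlying data came from
    source_id: str           # hypothetical dataset/version identifier


def prefer_final(tags):
    """Keep one tag per (tile, year), replacing ICDR tags with CDR tags."""
    best = {}
    for tag in tags:
        key = (tag.tile_id, tag.year)
        current = best.get(key)
        if current is None or (
            current.record_type is RecordType.ICDR
            and tag.record_type is RecordType.CDR
        ):
            best[key] = tag
    return list(best.values())


tags = [
    Tag("T001", 2023, "canopy_loss", RecordType.ICDR, "example-icdr-v1"),
    Tag("T001", 2023, "canopy_loss", RecordType.CDR, "example-cdr-v1"),
]
selected = prefer_final(tags)
```

Because the record type travels with the tag, a reviewer can later filter or re-check every tag that was created from interim data.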

Access to Satellite Climate Datasets

Access to satellite and climate datasets makes it possible to analyze the state of the atmosphere, the Earth's surface, the oceans, and climate change over long periods. For defense or applied projects, such data is a valuable resource. The portals and services below offer ways to view and download variables.

NASA Earthdata

Full access to Earth observation data: atmosphere, surface, oceans, ice, climate change.

EUMETSAT

European satellite operator: weather, climate, and ocean monitoring from orbit. 

NOAA 

Archive of climate and weather observations: station records, satellite data, and statistics.

Climate Engine

Cloud platform for analyzing decades of satellite and climate observations. 

CHIRPS

Specialized dataset for rainfall estimates (satellite + station data), quasi-global coverage (50°S to 50°N) from 1981 to present.

Core datasets for building deforestation labeling pipelines

Accurate input data and consistent variables make it possible to detect sustained changes in the canopy while filtering out seasonal noise. To achieve this, select records that cover land, water, radiation, and cloud contexts so that labeling stays robust over multiple years.
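
A common way to separate seasonal variation from sustained change is to subtract a per-month climatology from the time series. The following sketch assumes a simple list of `(year, month, value)` tuples and a hypothetical `monthly_anomalies` helper. The sample values are synthetic and only illustrate the idea.

```python
from collections import defaultdict
from statistics import mean


def monthly_anomalies(series):
    """Subtract each calendar month's multi-year mean from its values.

    series: list of (year, month, value) tuples.
    Returns a list of (year, month, anomaly) tuples in the same order.
    """
    by_month = defaultdict(list)
    for _, month, value in series:
        by_month[month].append(value)
    climatology = {m: mean(values) for m, values in by_month.items()}
    return [(y, m, v - climatology[m]) for y, m, v in series]


# Synthetic vegetation index: the same seasonal cycle in both years,
# with every value 0.2 lower in the second year.
series = [
    (2020, 1, 0.6), (2020, 7, 0.8),
    (2021, 1, 0.4), (2021, 7, 0.6),
]
anomalies = monthly_anomalies(series)
```

After the seasonal cycle is removed, both months of the second year show the same negative anomaly, which is the kind of signal a label should capture.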

Land cover and vegetation

Land cover data enable the identification of various land cover types, including forest areas, agricultural areas, water bodies, and urban areas. Vegetation datasets include satellite indices such as the NDVI (Normalized Difference Vegetation Index) and EVI (Enhanced Vegetation Index), as well as other spectral indices that reflect the density and health of vegetation cover. Combining these sources lets analysts automatically identify areas of deforestation, track land-cover change over time, and develop accurate AI models to monitor and predict the impacts on ecosystems.
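
For reference, NDVI is computed from near-infrared and red reflectance as (NIR - Red) / (NIR + Red). The sketch below shows the formula for a single pixel and a simple before/after comparison. The `is_canopy_drop` helper and its 0.3 threshold are illustrative assumptions, not recommended values.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    Returns None when the index is undefined (NIR + Red equals zero).
    """
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


def is_canopy_drop(before, after, threshold=0.3):
    """Flag a pixel whose NDVI fell by more than `threshold` (assumed value)."""
    if before is None or after is None:
        return False
    return (before - after) > threshold


# Illustrative reflectance values for the same pixel on two dates.
ndvi_before = ndvi(nir=0.5, red=0.1)
ndvi_after = ndvi(nir=0.3, red=0.2)
flagged = is_canopy_drop(ndvi_before, ndvi_after)
```

A single-date drop like this is only a candidate. In practice it would be checked against later observations before it becomes a label.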

Moisture and water cycle

Soil moisture, vegetation moisture, and the water cycle are of particular interest as they affect the health of forest ecosystems, canopy structure, and biomass estimation, which is critical for understanding the impact of logging.

Key datasets include:

  1. SMAP (Soil Moisture Active Passive, NASA): global satellite soil moisture data, used to monitor how logging affects the local water balance.
  2. MODIS (NASA Terra/Aqua): satellite data from which vegetation moisture indices can be derived, used to assess changes in the forest water cycle.
  3. GRACE / GRACE-FO (NASA with German partners DLR and GFZ): gravity data, used to assess changes in water storage, including soil water and underground aquifers.
  4. Copernicus Land Monitoring with Sentinel-1 and Sentinel-2 (ESA): radar and optical data, used to analyze forest structure and surface moisture, which helps identify logged areas and their impact on hydrological processes.

Using these datasets in annotation pipelines enables the automated recognition of changes in forest areas related to water balance, providing an assessment of the ecological consequences of logging.

Cloudiness, radiation, and surface energy

Cloudiness affects surface visibility, so data on cloud cover, its thickness, and dynamics are needed for analysis. Radiative indices reflect the amount of solar radiation reaching the Earth's surface. This is needed to assess plant photosynthetic activity and identify areas of forest degradation. Surface energy encompasses indices of heat flow between the ground and the atmosphere, which help detect changes in microclimate after logging and assess the impact on local ecosystems.
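
Because clouds hide the surface, a simple first step is to drop cloud-contaminated observations before computing any index. The sketch below assumes each observation carries a hypothetical `cloud_fraction` field between 0 and 1 and an `index` value. The 0.2 cutoff is an illustrative choice.

```python
from statistics import median


def clear_sky_composite(observations, max_cloud_fraction=0.2):
    """Median index value over observations that pass the cloud filter.

    observations: list of dicts with hypothetical keys "cloud_fraction"
    and "index". Returns None when no observation is clear enough.
    """
    clear = [
        obs["index"]
        for obs in observations
        if obs["cloud_fraction"] <= max_cloud_fraction
    ]
    return median(clear) if clear else None


observations = [
    {"cloud_fraction": 0.05, "index": 0.71},
    {"cloud_fraction": 0.90, "index": 0.12},  # cloud-contaminated, dropped
    {"cloud_fraction": 0.10, "index": 0.69},
    {"cloud_fraction": 0.15, "index": 0.70},
]
composite = clear_sky_composite(observations)
```

Using the median rather than the mean makes the composite less sensitive to any contaminated value that slips through the filter.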

Supporting factors and diagnostics

  1. CCMP winds and GHRSST SST for regional moisture transport and fire weather.
  2. AVISO sea surface height and global mean sea level to contextualize coastal impacts.
  3. OSI SAF sea ice concentrations where high-latitude forests require ice-related variables.

Record the coverage years and file formats of each dataset up front to streamline data collection and validation.
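
Keeping coverage years and formats in a small catalog makes it easy to check which period all inputs share. The sketch below uses a hypothetical `DatasetEntry` class. The dataset names, years, and formats are placeholders, not properties of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetEntry:
    name: str
    start_year: int
    end_year: int
    file_format: str


def common_period(entries):
    """Return the (start, end) years covered by every entry, or None."""
    start = max(entry.start_year for entry in entries)
    end = min(entry.end_year for entry in entries)
    return (start, end) if start <= end else None


# Placeholder catalog: names, years, and formats are illustrative only.
catalog = [
    DatasetEntry("rainfall_example", 1981, 2024, "GeoTIFF"),
    DatasetEntry("vegetation_example", 2000, 2023, "HDF5"),
    DatasetEntry("radar_example", 2015, 2024, "GeoTIFF"),
]
shared = common_period(catalog)
```

Here the shared period is limited by the latest start year and the earliest end year, so labels outside it would lack at least one input layer.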

Developing a high-quality training dataset for deforestation signals

Creating a high-quality training dataset for detecting deforestation signals is a crucial step in developing effective monitoring systems and automated detection of changes in forest areas. Such a dataset allows AI models to identify areas of deforestation, assess the intensity of disturbances, and predict further changes in the ecosystem.

1. Raw Data Collection

Acquisition of satellite imagery and climate data (optical and radar images, cloud cover, radiation, and surface energy metrics) for the selected area.

2. Preprocessing

Correction of satellite images, noise removal, multispectral channel merging, and data normalization for further analysis.
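
As an illustration of the normalization and channel-merging steps, the sketch below standardizes each band and then stacks the bands into per-pixel feature vectors. Z-score scaling is one possible choice among several, and the `zscore` and `stack_bands` helpers are hypothetical names.

```python
from statistics import mean, pstdev


def zscore(values):
    """Scale values to zero mean and unit variance (constant bands map to 0)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def stack_bands(bands):
    """Normalize each band and merge them into per-pixel feature vectors.

    bands: dict mapping band name to a list of pixel values; all lists
    must have the same length. Returns (band_names, pixels), where
    pixels[i] holds the normalized values of pixel i in band_names order.
    """
    names = sorted(bands)
    normalized = [zscore(bands[name]) for name in names]
    return names, [list(pixel) for pixel in zip(*normalized)]


# Illustrative three-pixel image with two bands.
names, pixels = stack_bands({"red": [1, 2, 3], "nir": [10, 10, 10]})
```

Sorting the band names gives a stable channel order, which matters when samples from different sources are merged into one training set.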

3. Canopy Change Annotation

Deforested areas are labeled manually and semi-automatically using historical data and expert assessments. The labels define classes for different types of change, including canopy loss, so that AI models learn from continuous changes rather than one-off events.
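
One simple rule for separating continuous change from one-off events is to require that a drop persists over several consecutive observations. The sketch below is a minimal illustration under that assumption. The `label_persistent_loss` helper, the drop of 0.2, and the run length of 3 are hypothetical choices.

```python
def label_persistent_loss(values, baseline, drop=0.2, min_run=3):
    """Return the index where a sustained drop begins, or None.

    A sustained drop is a run of at least `min_run` consecutive values
    below `baseline - drop`. Shorter dips are treated as one-off events.
    """
    run_start = None
    run_length = 0
    for i, value in enumerate(values):
        if value < baseline - drop:
            if run_length == 0:
                run_start = i
            run_length += 1
            if run_length >= min_run:
                return run_start
        else:
            run_length = 0
    return None


# Illustrative vegetation index series with a baseline of 0.8.
one_off = [0.8, 0.4, 0.8, 0.8, 0.8]          # single dip, then recovery
sustained = [0.8, 0.8, 0.4, 0.3, 0.35, 0.3]  # drop that does not recover
one_off_label = label_persistent_loss(one_off, baseline=0.8)
sustained_label = label_persistent_loss(sustained, baseline=0.8)
```

Candidates produced by a rule like this can then be passed to annotators for confirmation rather than used directly as labels.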

4. Verification & Quality Control

Checking annotation accuracy, expert cross-validation, correcting errors, and removing invalid samples.
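
Expert cross-validation can be supported with a simple agreement score between two annotators. The sketch below uses intersection over union (IoU) on sets of labeled pixel coordinates. The 0.7 review threshold is an illustrative assumption.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two sets of labeled pixel coordinates."""
    a, b = set(mask_a), set(mask_b)
    union = a | b
    if not union:
        return 1.0  # both annotators labeled nothing: full agreement
    return len(a & b) / len(union)


def needs_review(mask_a, mask_b, min_iou=0.7):
    """Flag a sample when annotator agreement is below `min_iou` (assumed)."""
    return iou(mask_a, mask_b) < min_iou


# Pixels marked as canopy loss by two annotators on the same tile.
annotator_1 = {(0, 0), (0, 1), (1, 0)}
annotator_2 = {(0, 1), (1, 0), (1, 1)}
agreement = iou(annotator_1, annotator_2)
flagged = needs_review(annotator_1, annotator_2)
```

Samples flagged this way can be sent back for correction or removed if the disagreement cannot be resolved.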

5. Integration of Climate and Energy Data

Adding layers for cloud cover, radiation, and surface energy to each sample to provide context for machine learning models.
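
The sketch below shows one way to attach context layers to samples by joining on tile and date. The field names (`tile_id`, `date`, `cloud_fraction`, `radiation`, `surface_energy`) and the sample values are hypothetical. Samples without matching context are kept and marked, so that the gap is visible later.

```python
def attach_context(samples, context):
    """Add context layers to each sample, keyed by (tile_id, date).

    samples: list of dicts with "tile_id" and "date" keys.
    context: dict mapping (tile_id, date) to a dict of context values.
    """
    enriched = []
    for sample in samples:
        layers = context.get((sample["tile_id"], sample["date"]))
        enriched.append(
            {**sample, "context": layers, "context_missing": layers is None}
        )
    return enriched


samples = [
    {"tile_id": "T001", "date": "2023-07-01", "label": "canopy_loss"},
    {"tile_id": "T002", "date": "2023-07-01", "label": "no_change"},
]
context = {
    # Illustrative values only.
    ("T001", "2023-07-01"): {
        "cloud_fraction": 0.1,
        "radiation": 210.0,
        "surface_energy": 95.0,
    },
}
enriched = attach_context(samples, context)
```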

6. Final Dataset Creation

Building a standardized dataset with all annotations and metadata, ready for training machine learning models.

7. Documentation & Metadata

Preparing a detailed description of the dataset, sources, formats, annotation methods, and usage guidelines to ensure transparency and reproducibility.
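
Dataset documentation can also be kept in machine-readable form next to the data. The sketch below writes a small description as JSON. The `build_dataset_card` helper, its field names, and the example values are hypothetical and do not follow any specific metadata standard.

```python
import json


def build_dataset_card(name, sources, file_format, annotation_method, classes):
    """Collect the basic facts a user needs to reproduce the dataset."""
    return {
        "name": name,
        "sources": sorted(sources),
        "format": file_format,
        "annotation_method": annotation_method,
        "classes": list(classes),
    }


card = build_dataset_card(
    name="deforestation-labels-example",
    sources=["radar_example", "optical_example"],
    file_format="GeoJSON",
    annotation_method="manual and semi-automated, expert reviewed",
    classes=["canopy_loss", "no_change"],
)
card_json = json.dumps(card, indent=2, sort_keys=True)
```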

Model readiness: combining observations with climate and machine learning models

For high accuracy in detecting changes in forest areas and predicting the consequences of deforestation, it is important to combine real-world observations, climate data, and machine learning models. Observations from satellites and ground stations provide information on the state of the forest, cloud cover, radiative fluxes, and surface energy. Climate models enable us to account for seasonal and long-term changes in weather conditions, assess risks of degradation, and predict the impact of climate factors on forests.

Machine learning models use this data for training, pattern recognition, and forecast generation. Combining the three sources of information gives more reliable results, helps compensate for the errors of individual methods, and improves the system's ability to adapt to changes in the ecosystem.

Thus, the readiness of an AI model depends both on the accuracy of its machine learning algorithms and on how well real-world observations and climate forecasts are integrated. Together, these provide a comprehensive picture of the state of forest areas and support informed decisions about preserving natural resources.

FAQ

What is Earth Observation Data Tagging for Deforestation Monitoring?

Earth Observation Data Tagging for Deforestation Monitoring is the process of creating annotations on satellite or ground-based imagery to indicate areas of deforestation and changes in forest cover.

Why is tagging important for deforestation monitoring today?

Tagging enables the accurate detection and tracking of changes in forest cover, allowing for a rapid response to illegal logging and an assessment of its impacts on ecosystems.

What is the difference between FCDR, CDR, and ICDR?

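An FCDR is a calibrated, long-term record of sensor observations; a CDR is a long-term record of geophysical variables derived from those observations; and an ICDR is an interim CDR, released with shorter latency so that recent data stays comparable with the final record.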

What types of sensors are useful for deforestation tagging?

Optical and radar satellite sensors, multispectral cameras, LiDAR, and ground-based sensors are useful for deforestation tagging to measure changes in forest cover and structure.

In what ways do professionals access relevant data for labeling?

Professionals can access relevant data for labeling through open satellite portals, climate databases, and licensed geospatial platforms.

What file formats and metadata standards do Trackable Tags support?

Trackable Tags support standard file formats, including CSV, JSON, GeoJSON, TIFF, and HDF5, as well as metadata in ISO 19115, CF-compliant, and Dublin Core formats, to ensure compatibility and reproducibility.
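
As an illustration of the GeoJSON option, the sketch below serializes one tag as a GeoJSON Feature with a polygon geometry. The `tag_to_feature` helper and the property names are hypothetical, and the coordinates are placeholders. GeoJSON expects longitude before latitude and a closed ring whose first and last positions are identical.

```python
import json


def tag_to_feature(ring, label, source_record, date):
    """Build a GeoJSON Feature for one tag.

    ring: list of (longitude, latitude) pairs outlining the tagged area.
    The ring is closed automatically if the last point differs from the first.
    """
    coordinates = [list(point) for point in ring]
    if coordinates[0] != coordinates[-1]:
        coordinates.append(list(coordinates[0]))
    return {
        "type": "Feature",
        "geometry": {"type": "Polygon", "coordinates": [coordinates]},
        "properties": {
            "label": label,
            "source_record": source_record,  # e.g. "CDR" or "ICDR"
            "date": date,
        },
    }


# Placeholder coordinates, listed counterclockwise.
feature = tag_to_feature(
    ring=[(-60.0, -3.0), (-59.9, -3.0), (-59.9, -2.9), (-60.0, -2.9)],
    label="canopy_loss",
    source_record="CDR",
    date="2023-07-01",
)
feature_json = json.dumps(feature)
```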