Prediction of Soil Organic Matter based on Multi-resolution Remote Sensing Data and Random Forest Algorithm

Fund Project:

Supported by the Special Foundation of National Science and Technology Basic Work Project of China (No. 2014FY110200A08)

    Abstract:

    Soil organic matter is closely related to soil fertility, so knowledge of its spatial distribution helps rationalize fertilization management and improve land use potential. As a carbon source, soil organic carbon is also an important factor in regional carbon budgets. Remote sensing data have been widely used in digital soil mapping and can improve the accuracy of soil property predictions to a certain extent. Taking the aeolian sandy fluvial land and the loess hills of Yuyang District as the study area, this study predicted the content and distribution of soil organic matter in the topsoil layer at three resolutions (30 m, 56 m and 250 m), using the random forest (RF) algorithm with Thematic Mapper (TM), Advanced Wide Field Sensor (AWIFS) and Moderate Resolution Imaging Spectroradiometer (MODIS) data, each used separately, together with Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) data and other factors affecting the distribution of soil organic matter. The predictions were validated with soil samples collected from 324 sampling sites. Predictor variables were screened according to the out-of-bag (OOB) error of the RF model. The mean error (ME), mean absolute error (MAE), root mean square error (RMSE) and Pearson correlation coefficient (R) were used to evaluate the differences between predicted and observed values of soil organic matter at each resolution. The entropies of the RF predictions of soil organic matter distribution were compared between the regions of differing topography.
In addition, how well the RF model explained the spatial variability of soil organic matter was compared across resolutions. The environmental variables in the aeolian sandy fluvial land area and in the loess hilly area were ranked by importance for each of the TM, AWIFS and MODIS datasets, so as to identify the variables that most strongly affect the distribution of soil organic matter, and the range over which each main variable acts was delineated from the partial dependence plots of soil organic matter on those variables. The results showed that: 1) In the aeolian sandy fluvial land area, the RF prediction based on the AWIFS data was the most accurate, with an OOB error of 3.52 and a correlation coefficient between predicted and measured values of 0.67, regardless of the percentage of samples taken for validation, while in the loess hilly area the prediction based on the TM data was the most accurate, with an OOB error of 3.31 and a correlation coefficient of 0.71. The prediction was better in the loess hilly area than in the aeolian sandy fluvial land area, with MAE ranging from 1.27 to 1.57 g kg⁻¹ in the former and from 1.46 to 2.08 g kg⁻¹ in the latter. 2) In the aeolian sandy fluvial land area, vegetation is the most important factor affecting the distribution of soil organic matter, and it is mostly positively related to soil organic matter. Among the TM-derived variables, the reduced simple ratio (RSR) has the strongest effect on soil organic matter (> 7.5 g kg⁻¹); among the AWIFS-derived variables, the normalized difference vegetation index (NDVI) and the ratio vegetation index (RVI) do (> 8.5 g kg⁻¹); and among the MODIS-derived variables, NDVI does (> 8 g kg⁻¹). Elevation is the second most important factor; its effect changes most sharply between 1 200 and 1 260 m and peaks at 1 220 m. Distance from water source is the third.
As water sources in the aeolian sandy fluvial land area are scattered and small in area, their influence on soil organic matter seldom extends beyond 500 m. 3) In the loess hilly area, elevation is the most important factor affecting soil organic matter and is negatively related to it. Geographic location is the second most important factor: soil organic matter content declines from southwest to northeast across the area. Vegetation is the third, and it is positively related to soil organic matter, but in all three datasets the effects of the vegetation indices on soil organic matter never exceed 8 g kg⁻¹. These results indicate that in areas of relatively simple topography it is advisable to use relatively low-resolution data rather than high-resolution data to predict soil organic matter, and that the RF model predicts more effectively in areas of complex topography.
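
    The four validation statistics named in the abstract (ME, MAE, RMSE and Pearson R) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `validation_metrics`, the sign convention for ME (predicted minus observed) and the sample values are assumptions made for illustration only.

```python
import math


def validation_metrics(observed, predicted):
    """Return ME, MAE, RMSE and Pearson R for paired observed/predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)

    # Assumed convention: error = predicted - observed, so ME > 0 means over-prediction.
    errors = [p - o for o, p in zip(observed, predicted)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # Pearson correlation coefficient between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("Pearson R is undefined when either sequence is constant")
    r = cov / math.sqrt(var_o * var_p)

    return {"ME": me, "MAE": mae, "RMSE": rmse, "R": r}


# Made-up illustrative values in g/kg; these are not data from the study.
observed = [6.0, 8.0, 10.0, 12.0]
predicted = [7.0, 7.0, 11.0, 11.0]
metrics = validation_metrics(observed, predicted)
print(metrics)
```

    In this toy case the positive and negative errors cancel, so ME is zero while MAE and RMSE are not, which is one reason to report the statistics together rather than ME alone.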

Get Citation

WANG Yinyin, QI Yanbing, CHEN Yang, XIE Fei. Prediction of Soil Organic Matter based on Multi-resolution Remote Sensing Data and Random Forest Algorithm[J]. Acta Pedologica Sinica,2016,53(2):342-354.

History
  • Received: July 02, 2015
  • Revised: October 23, 2015
  • Adopted: November 12, 2015
  • Online: December 15, 2015