Prediction of Spatial Distribution of Soil Organic Carbon in Farmland Based on Multi-Variables and Random Forest Algorithm—A Case Study of A Subtropical Complex Geomorphic Region in Fujian as An Example

Fund Project:

National Natural Science Foundation of China (No.41971050), Special Fund for Science and Technology Innovation of Fujian Agriculture and Forestry University (No.KFA17616A), Science and Technology Planning Project of Fujian Province (No.2017N5006) and National Undergraduate Innovation Program (No.201910389026)

    Abstract:

    [Objective] Soil organic carbon (SOC) plays an important role in soil fertility and the terrestrial ecosystem carbon cycle. A detailed understanding of the spatial distribution of SOC is vital to the management of soil resources and the mitigation of global climate change. With the development of 3S technology (remote sensing, geographic information systems and global positioning systems), models that predict soil properties from environmental variables are becoming increasingly popular. This study aims to simulate the complex, nonlinear relationship between SOC and environmental variables, and to evaluate how much soil attributes contribute to the accuracy of SOC mapping.

    [Method] A random forest (RF) model, a machine learning method, was applied to map the spatial distribution of topsoil organic carbon content in farmland in the high-yield agricultural areas of Southeast Fujian. A set of environmental variables, comprising 5 hard-to-obtain quantitative soil attributes (hydrolysable nitrogen, available phosphorus, pH, etc.) and 11 easy-to-obtain variables (topographic factors, vegetation indices and climate factors), was acquired, drawing on the analysis of a large number of soil samples collected from the region, and then processed with the RF algorithm to predict the spatial distribution of SOC content in the topsoil of the region's farmland. Two combinations of these variables were used as input to two separate models: the RF-S model used only the easy-to-obtain variables, whereas the RF-A model used all variables, both easy and hard to obtain. The root mean square error (RMSE), mean absolute error (MAE), Pearson correlation coefficient (r), coefficient of variation (CV), relative error (RE) and relative root mean square error (RRMSE) of the two models were calculated to evaluate prediction accuracy and to select the optimal RF model for mapping SOC in the study area based on the raster datasets of all variables. Cross-validation was then performed to compare the optimal RF model with the Ordinary Kriging (OK) interpolation model.

    [Result] Of the two models, which differed in their input environmental variables, the RF-A model, based on remote sensing variables, climate factors and soil attributes, performed much better and explained most of the spatial heterogeneity of SOC. Compared with the RF-S model, the RF-A model was significantly better in fitting and prediction (r increased by 7.95% and RMSE decreased by 45.13%). The SOC content of the region's farmland predicted with the RF-A model was 14.70±2.95 g·kg⁻¹, and its spatial distribution was quite similar to that obtained with the OK model, i.e. an ascending trend from the eastern coastal area to the western inland of the study area. Regardless of the sampling percentage, the RF-A model was generally better than the OK model in prediction accuracy and in capturing spatial heterogeneity, and it is especially preferable when sampling sites are relatively few. Among the variables, hydrolysable nitrogen (N) was the most important for the RF-A model, followed by elevation (DEM). Both significantly affected the spatial heterogeneity of SOC and were positively related to SOC.

    [Conclusion] The random forest model based on remote sensing variables, climate factors and soil attributes is therefore a promising approach to predicting the spatial distribution of SOC in Southeast Fujian. In addition, soil attribute variables such as N and P should be taken into account to improve the accuracy of SOC mapping in regions with complex geomorphology.
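
    As an illustration of how accuracy metrics such as RMSE, MAE, r and RRMSE can be used to compare two models, a minimal Python sketch follows. It is not the authors' code. The abstract does not give the formulas, so the definitions here are common textbook ones and should be read as assumptions, in particular RRMSE taken as RMSE divided by the mean observed value. All sample values and the names `observed`, `pred_rf_a` and `pred_rf_s` are hypothetical.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def pearson_r(obs, pred):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def rrmse(obs, pred):
    """Relative RMSE in percent (assumed definition: RMSE / mean observed)."""
    return 100.0 * rmse(obs, pred) / (sum(obs) / len(obs))


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means a decrease)."""
    return 100.0 * (new - old) / old


# Hypothetical SOC values in g/kg at four validation sites (not from the study).
observed = [12.0, 15.0, 18.0, 14.0]
pred_rf_a = [12.5, 14.5, 17.0, 14.5]  # stand-in for a model using all variables
pred_rf_s = [14.0, 14.0, 15.0, 15.0]  # stand-in for a model using fewer variables

for name, pred in (("RF-A", pred_rf_a), ("RF-S", pred_rf_s)):
    print(name, rmse(observed, pred), mae(observed, pred),
          pearson_r(observed, pred), rrmse(observed, pred))

# Relative change in RMSE when moving from the smaller to the full variable set.
print(percent_change(rmse(observed, pred_rf_s), rmse(observed, pred_rf_a)))
```

    Statements of the form "RMSE decreased by 45.13%" are read here as relative changes of this kind, which is an interpretation rather than something the abstract spells out.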

Get Citation

YUAN Yuqi, CHEN Hanyue, ZHANG Liming, REN Biwu, XING Shihe, TONG Junyue. Prediction of Spatial Distribution of Soil Organic Carbon in Farmland Based on Multi-Variables and Random Forest Algorithm—A Case Study of A Subtropical Complex Geomorphic Region in Fujian as An Example[J]. Acta Pedologica Sinica, 2021, 58(4): 887-899.

History
  • Received: January 14, 2020
  • Revised: May 21, 2020
  • Adopted: July 30, 2020
  • Online: December 08, 2020
  • Published: July 11, 2021