Cite this article: WANG Yinyin, QI Yanbing, CHEN Yang, XIE Fei. Prediction of Soil Organic Matter based on Multi-resolution Remote Sensing Data and Random Forest Algorithm[J]. Acta Pedologica Sinica, 2016, 53(2): 342-354. DOI: 10.11766/trxb201508170308
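Prediction of Soil Organic Matter based on Multi-resolution Remote Sensing Data and Random Forest Algorithm
WANG Yinyin, QI Yanbing, CHEN Yang, XIE Fei
College of Resources and Environment, Northwest A & F University
Abstract:
Remote sensing data are widely used in digital soil mapping and can improve the accuracy of soil property prediction to some extent. Taking two landform regions of Yuyang District, the loess hills and the aeolian sandy fluvial land, as examples, this study used remote sensing imagery at different resolutions from the Thematic Mapper (TM), the Advanced Wide Field Sensor (AWIFS) and the Moderate Resolution Imaging Spectroradiometer (MODIS) (30 m, 56 m and 250 m, respectively), terrain derivatives from the Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM), and other auxiliary factors that influence the distribution of soil organic matter. Topsoil organic matter was predicted with the random forest (RF) algorithm, and the predictions were validated by percentage sampling of the measured data. The results show that in the loess hilly area the prediction based on TM data performed better, whereas in the aeolian sandy fluvial land area the prediction based on AWIFS data performed better. RF-based prediction performed well in the loess hilly area, where the mean absolute error at the three resolutions ranged from 1.27 to 1.57 g kg⁻¹; accuracy was lower in the aeolian sandy fluvial land area, where the mean absolute error ranged from 1.46 to 2.08 g kg⁻¹. Elevation, geographic location and vegetation are the main factors affecting the prediction of soil organic matter in the loess hilly area; in the aeolian sandy fluvial land area, the main factors are vegetation, elevation and distance from water sources. Thus, in areas with relatively simple landforms, lower-resolution data can be used in place of higher-resolution data to predict soil organic matter content, and the RF algorithm is more effective for predicting soil organic matter in areas with complex landforms.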
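The percentage-sampling validation mentioned above can be illustrated with a simple random hold-out split. This is only a sketch under stated assumptions, not the authors' procedure: the function name `percentage_split`, the 20% validation fraction and the fixed seed are hypothetical choices, and the integer site identifiers merely stand in for the 324 sampling sites reported in the English abstract.

```python
import random


def percentage_split(samples, validation_fraction, seed=0):
    """Randomly hold out a fraction of the samples for validation."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_validation = round(len(samples) * validation_fraction)
    validation_idx = set(indices[:n_validation])
    # Preserve the original sample order inside each subset.
    train = [s for i, s in enumerate(samples) if i not in validation_idx]
    validation = [s for i, s in enumerate(samples) if i in validation_idx]
    return train, validation


# Hypothetical site identifiers standing in for the sampling sites.
sites = list(range(324))
# The 20% fraction is an assumption; the abstract does not state the percentages used.
train, validation = percentage_split(sites, validation_fraction=0.2)
print(len(train), len(validation))
```

Repeating the split with several fractions would give one way to check whether accuracy depends on the share of samples held out.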
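Key words:  Multi-resolution remote sensing data  Random forest  Soil organic matter
Funding: Supported by the National Science and Technology Basic Work Special Project (2014FY110200A08)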
Prediction of Soil Organic Matter based on Multi-resolution Remote Sensing Data and Random Forest Algorithm
WANG Yinyin, QI Yanbing, CHEN Yang, XIE Fei
College of Resources and Environment, Northwest A & F University
Abstract:
Soil organic matter is closely related to soil fertility, so knowledge of its spatial distribution supports rational fertilization management and the improvement of land use potential. As a carbon source, soil organic carbon is also an important factor in regional carbon budgets. Remote sensing data have been widely used in digital soil mapping and can improve the accuracy of soil property prediction to a certain extent. Taking the aeolian sandy fluvial land and the loess hills of Yuyang District as study areas, this study predicted the content and distribution of soil organic matter in the topsoil at three resolutions (30 m, 56 m and 250 m) with the random forest (RF) algorithm, using Thematic Mapper (TM), Advanced Wide Field Sensor (AWIFS) and Moderate Resolution Imaging Spectroradiometer (MODIS) data, respectively, together with Advanced Spaceborne Thermal Emission and Reflection Radiometer Global Digital Elevation Model (ASTER GDEM) data and other factors affecting the distribution of soil organic matter. The predictions were validated with soil samples collected from 324 sampling sites. Predictor variables were screened according to the out-of-bag (OOB) error of the RF algorithm. The mean error (ME), mean absolute error (MAE), root mean square error (RMSE) and Pearson correlation coefficient (R) were used to evaluate the differences between predicted and observed soil organic matter at each resolution. The entropies of the RF predictions of soil organic matter distribution were compared between the regions with different topography.
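The four accuracy measures named above can be computed as in the following sketch. It is an illustration only: the function name `evaluate` is hypothetical, the sign convention for ME (predicted minus observed) is an assumption because the abstract does not state it, and the sample values are invented purely to show the call.

```python
import math


def evaluate(observed, predicted):
    """Return ME, MAE, RMSE and Pearson R for paired observed/predicted values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(observed)
    # Assumed convention: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    me = sum(errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    # R is undefined (division by zero) when either series is constant.
    r = cov / math.sqrt(var_o * var_p)
    return {"ME": me, "MAE": mae, "RMSE": rmse, "R": r}


# Invented soil organic matter values (g/kg), for illustration only.
observed = [6.2, 7.9, 5.4, 9.1, 8.0]
predicted = [6.8, 7.1, 6.0, 8.5, 8.4]
print(evaluate(observed, predicted))
```

ME shows bias because positive and negative errors cancel, while MAE and RMSE measure error magnitude, with RMSE weighting large errors more heavily.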
In addition, the share of the spatial variability of soil organic matter explained by the RF model was compared across resolutions. The environmental variables in the aeolian sandy fluvial land area and the loess hilly area were ranked by importance for each of the TM, AWIFS and MODIS datasets to identify the variables that most strongly affect the distribution of soil organic matter, and partial dependence plots of soil organic matter on these variables were used to delineate the range over which each main variable acts.
The results showed that: 1) In the aeolian sandy fluvial land area, the RF prediction based on AWIFS data was the most accurate, with an OOB error of 3.52 and a correlation coefficient between predicted and measured values of 0.67, regardless of the percentage of samples taken for validation; in the loess hilly area, the prediction based on TM data was the most accurate, with an OOB error of 3.31 and a correlation coefficient of 0.71. Prediction was better in the loess hilly area than in the aeolian sandy fluvial land area, with MAE of 1.27 ~ 1.57 g kg⁻¹ in the former and 1.46 ~ 2.08 g kg⁻¹ in the latter. 2) In the aeolian sandy fluvial land area, vegetation is the most important factor affecting the distribution of soil organic matter and is mostly positively related to it. Among the TM data, the reduced simple ratio (RSR) has the largest effect on soil organic matter (> 7.5 g kg⁻¹); among the AWIFS data, the normalized difference vegetation index (NDVI) and the ratio vegetation index (RVI) have the largest effect (> 8.5 g kg⁻¹); and among the MODIS data, NDVI has the largest effect (> 8 g kg⁻¹). Elevation is the second most important factor; its effect changes most sharply between 1 200 and 1 260 m and peaks at 1 220 m. Distance from water sources is the third. Because water sources in the aeolian sandy fluvial land area are scattered and small in area, their influence on soil organic matter seldom extends beyond 500 m. 3) In the loess hilly area, elevation is the most important factor affecting soil organic matter and is negatively related to it. Geographic location is the second: soil organic matter content declines from southwest to northeast across the area. Vegetation is the third and is positively related to soil organic matter, but in all three datasets the effect of the vegetation indices on soil organic matter never exceeds 8 g kg⁻¹. Therefore, in areas of relatively simple topography, data of relatively low resolution can be used in place of high-resolution data to predict soil organic matter, and the RF model is more effective for prediction in areas of complex topography.
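The partial dependence analysis referred to in this abstract can be sketched generically: for each value on a grid, one predictor is forced to that value in every sample and the model predictions are averaged. The code below is a minimal illustration, not the authors' implementation. `partial_dependence` and `toy_model` are hypothetical names, `toy_model` is an invented linear stand-in for a fitted RF model, and its coefficients and the sample rows are made up for the example.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction over all rows with one feature fixed at each grid value."""
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)
            modified[feature_index] = value  # override only the feature of interest
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_model(row):
    """Invented stand-in for a fitted model; features are (elevation_m, ndvi)."""
    elevation_m, ndvi = row
    return 10.0 - 0.005 * (elevation_m - 1000.0) + 4.0 * ndvi


# Made-up samples: (elevation in metres, NDVI).
rows = [(1100.0, 0.2), (1200.0, 0.4), (1300.0, 0.6)]
grid = [1100.0, 1200.0, 1300.0]
curve = partial_dependence(toy_model, rows, feature_index=0, grid=grid)
for elevation, value in zip(grid, curve):
    print(elevation, round(value, 2))
```

Plotting such a curve against the grid shows where the response changes most sharply or stops responding, which is how ranges like those quoted for elevation and distance from water sources can be read off.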
Key words:  Multi-resolution remote sensing data  Random forest  Soil organic matter