Hyperspectral Soil Salinity Inversion and Interpretability Analysis Based on CR-FOD Transform and XGBoost Model

Affiliation: School of Civil Engineering and Geomatics, Shandong University of Technology

CLC number: X144

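    Summary:

    This study takes Dongying City in the Yellow River Delta as the study area and estimates soil salinity from hyperspectral data. The raw spectra were preprocessed with Savitzky-Golay (S-G) filtering and Multiplicative Scatter Correction (MSC), then subjected to reciprocal (1/R), logarithm of reciprocal (log(1/R)), and Continuum Removal (CR) transformations combined with Fractional Order Derivative (FOD) processing, and 10 typical two-dimensional spectral indices were constructed as features. On this basis, four models were trained with Bayesian Optimization (BO): Partial Least Squares Regression (PLSR), Convolutional Neural Network (CNN), eXtreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM). The results show that the Normalized Difference Index (NDI) obtained from continuum removal combined with the second-order derivative had the highest correlation with soil salt content (|r| = 0.91). The XGBoost model based on this transformation achieved the best inversion performance, with a test-set coefficient of determination (R²), Root Mean Square Error (RMSE), and Residual Prediction Deviation (RPD) of 0.94, 0.85 g·kg⁻¹, and 4.33, respectively. SHAP (SHapley Additive exPlanation) interpretability analysis further revealed that Generalized Index 1 (GDI 1) was the most important feature in the model. This study provides effective methodological support for regional hyperspectral monitoring of salinization.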
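The FOD and NDI steps named in the summary can be illustrated with a minimal sketch. The page does not state which fractional-derivative definition or index formula the authors used, so the Grünwald-Letnikov form and the common NDI definition (Ri - Rj) / (Ri + Rj) below are assumptions. The function names `gl_fractional_derivative` and `ndi` are hypothetical.

```python
import numpy as np

def gl_fractional_derivative(spectrum, order, step=1.0):
    """Grünwald-Letnikov fractional derivative of a 1-D spectrum.

    Assumption: the GL definition is used here for illustration only;
    the source does not say which FOD definition was applied.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    n = spectrum.size
    # GL weights w_k = (-1)^k * C(order, k), built with the recurrence
    # w_0 = 1, w_k = w_{k-1} * (1 - (order + 1) / k).
    weights = np.ones(n)
    for k in range(1, n):
        weights[k] = weights[k - 1] * (1.0 - (order + 1.0) / k)
    out = np.empty(n)
    for i in range(n):
        # Weighted sum over the current band and all preceding bands.
        out[i] = np.dot(weights[: i + 1], spectrum[i::-1])
    return out / step ** order

def ndi(band_i, band_j):
    """Normalized difference index (Ri - Rj) / (Ri + Rj); NaN where Ri + Rj == 0."""
    band_i = np.asarray(band_i, dtype=float)
    band_j = np.asarray(band_j, dtype=float)
    denom = band_i + band_j
    return np.divide(band_i - band_j, denom,
                     out=np.full_like(denom, np.nan), where=denom != 0)
```

With `order=1` or `order=2` the weights reduce to the ordinary backward first and second differences, and `order=0` returns the spectrum unchanged, which gives a quick sanity check before trying non-integer orders.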

    Abstract:

    【Objective】 Against the global backdrop of climate change and human activity, soil salinization has become increasingly severe. Traditional salinization monitoring is time-consuming, labor-intensive, and costly. Hyperspectral salinization monitoring often relies on a single mathematical transformation and one-dimensional spectral information, and the resulting models are poorly interpretable. The use of combined spectral transformations to construct spectral indices for salinization estimation remains underexplored. This study therefore aims to fully exploit spectral information, enhance the sensitivity of the data to salinity, and establish a high-precision, interpretable salinization inversion model based on spectral indices.

    【Method】 Dongying City was selected as the study area. Hyperspectral data were collected through field surveys, and soil samples were analyzed in the laboratory to determine salinity. The samples were divided into training and testing sets at a 7:3 ratio based on salinity gradients. Spectral data were preprocessed with Savitzky-Golay (S-G) filtering and Multiplicative Scatter Correction (MSC). Four spectral forms were used: Reflectance (R), Reciprocal (1/R), Logarithm of Reciprocal (log(1/R)), and Continuum Removal (CR). The Fractional Order Derivative (FOD) transformation was then applied to each form. Ten types of two-dimensional spectral indices were constructed from the transformed data at each derivative order. Optimal band combinations and derivative orders were identified from the correlation coefficients with soil salt content (SSC). Using these spectral indices as features and measured salinity as the dependent variable, four models were constructed: Partial Least Squares Regression (PLSR), Convolutional Neural Network (CNN), eXtreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM). The hyperparameters of all models were optimized with the Bayesian Optimization (BO) algorithm, which iteratively fits a probabilistic surrogate model to guide the search for hyperparameters that minimize cross-validation error. Each model was trained and tuned via ten-fold cross-validation. Performance was evaluated using the Coefficient of Determination (R²), Root Mean Square Error (RMSE), and Residual Prediction Deviation (RPD). The best-performing model was further interpreted using SHapley Additive exPlanations (SHAP) to identify influential spectral features.

    【Result】 The results showed that: (1) FOD enhances spectral sensitivity by highlighting gradient information along the spectral curve; (2) mathematical transformations combined with FOD significantly improve correlations between spectral data and SSC; (3) the second-order NDI after CR treatment achieved the highest absolute correlation coefficient (|r| = 0.91) with SSC; (4) the CR-FOD-XGBoost model delivered the best accuracy (testing set: R² = 0.94, RMSE = 0.85 g·kg⁻¹, RPD = 4.33); (5) in the optimal model, GDI1 contributed the most, while the contributions of DI clustered near zero and were minimal.

    【Conclusion】 This study demonstrates that constructing indices from combined spectral transformations and modeling them with Bayesian-optimized XGBoost improves soil salinity inversion accuracy, providing a scientific basis for salinization control and ecological sustainability. Future research should focus on further enhancing the sensitivity of spectral data to salinity in order to improve model performance, thereby supporting sustainable land-use and environmental conservation strategies.
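The three evaluation metrics named in the abstract (R², RMSE, RPD) can be computed as in the sketch below. The abstract does not give the exact formulas, so this assumes the common definitions: R² as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the measured values divided by RMSE. The function name `regression_metrics` is hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return R2, RMSE and RPD for measured vs. predicted salinity values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    # Root mean square error, in the same units as the measurements.
    rmse = np.sqrt(np.mean(residuals ** 2))
    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - np.sum(residuals ** 2) / ss_tot
    # Assumption: RPD uses the sample standard deviation (ddof=1).
    rpd = np.std(y_true, ddof=1) / rmse
    return {"r2": r2, "rmse": rmse, "rpd": rpd}
```

A call such as `regression_metrics(measured, predicted)` on the test set would yield values directly comparable to those reported, provided the same RPD convention is used.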

Cite this article

YANG Jicun (杨吉存), GUO Bing (郭兵), HAN Baomin (韩保民). Hyperspectral Soil Salinity Inversion and Interpretability Analysis Based on CR-FOD Transform and XGBoost Model[J]. Acta Pedologica Sinica (土壤学报), [In Press]

History
  • Received: 2025-08-02
  • Last revised: 2026-03-04
  • Accepted: 2026-03-13