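Total nitrogen is an important indicator of soil fertility and a decisive factor in crop yield, so timely acquisition of soil total nitrogen content by visible and near-infrared (Vis-NIR) spectral prediction is of great significance. A total of 450 soil samples from five provinces were used to test the applicability of locally weighted regression (LWR) combined with Vis-NIR spectroscopy for predicting soil total nitrogen content over large areas. The results show that the LWR model predicted better than the partial least squares regression (PLSR), artificial neural network (ANN) and support vector machine (SVM) models. With 5 principal components and 40 similar samples, the validation coefficient of determination (Rp²) was 0.83, the root mean square error of prediction (RMSEP) was 0.25 g kg⁻¹, and the ratio of the standard deviation of measured values to the standard error of prediction (RPD) reached 2.41. LWR selects from the calibration set the soil samples similar to each validation sample as local modeling samples, which reduces the interference of dissimilar samples and thereby improves the predictive ability of the model. LWR is therefore better suited to predicting total nitrogen and other soil properties over large regions from wide-ranging, large-sample soil spectral data.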
Diffuse reflectance spectroscopy in the visible and near-infrared (Vis-NIR) range is a promising technique for acquiring soil properties and for digital soil mapping. It is rapid, nondestructive, environmentally friendly and more efficient than conventional laboratory analysis. However, because soils are diverse and spatially heterogeneous, prediction models based on the technique face the problem of limited generality. Total nitrogen (TN) in soil is both a significant index of soil fertility and an important factor determining crop yield, so timely acquisition of soil TN information is essential. This paper introduces locally weighted regression (LWR) as a complement to Vis-NIR spectroscopy for predicting soil TN at a regional scale, and evaluates the prediction accuracy of Vis-NIR combined with LWR. To that end, a total of 450 soil samples were collected from Zhejiang, Jilin, Yunnan, Hainan and Gansu, air-dried and ground to pass a 2 mm sieve. Their Vis-NIR diffuse reflectance spectra were measured with a FieldSpec Pro FR spectrometer. The reflectance spectra in the 400 to 2450 nm range were denoised with a Savitzky-Golay filter and transformed to first derivatives. Three quarters of the samples were selected as the calibration dataset using the Kennard-Stone algorithm, and the remaining quarter served as the validation dataset. The core of the LWR method is to select, from the calibration dataset, the samples that are spectrally most similar to each validation sample. The LWR algorithm proceeds in three steps. First, the spectral matrix is decomposed and compressed by principal component analysis. Second, a local modeling subset of calibration samples similar to the validation sample is picked out by Euclidean distance.
Third, the weight of each sample in the local modeling subset is defined by a tri-cube weight function of its spectral distance to the validation sample. The number of principal components and the number of similar samples are the crucial parameters of the LWR model; in this study they were optimized to 5 and 40, respectively. For the PLSR model, the coefficient of determination (Rp²), the root mean square error of prediction (RMSEP) and the ratio of the standard deviation of measured values to the standard error of prediction (RPD) were 0.63, 0.36 g kg⁻¹ and 1.63, respectively. The support vector machine (SVM) and artificial neural network (ANN) models were more accurate than the PLSR model (Rp² = 0.75 to 0.80, RMSEP = 0.27 to 0.30 g kg⁻¹, RPD = 1.98 to 2.22). By restricting local modeling to similar samples, the LWR model reduced the interference of less similar samples and hence increased the accuracy of TN prediction further (Rp² = 0.83, RMSEP = 0.25 g kg⁻¹, RPD = 2.41). The correlation coefficient between soil TN and the first-derivative reflectance spectra peaks at 820, 1400, 1430, 1630, 1800, 1930, 2100, 2200 and 2300 nm, which overlap the important bands for spectral modeling of soil organic matter. Because of the spatial heterogeneity of study areas and soil samples, the two crucial parameters of the LWR model, i.e. the number of similar samples and the number of principal components, vary with the modeling dataset, so they should be optimized whenever LWR is used to predict TN. Compared with the PLSR model, the LWR model is less prone to underestimating TN content and brings the predictions closer to the 1:1 line. In addition, the LWR model outperforms the non-linear ANN and SVM models in TN prediction and does not suffer from the black-box problem.
Therefore, it can be concluded that LWR is a reliable method for predicting soil TN content when a large spectral database is available. As large-scale soil spectral libraries are built up and improved, LWR can be used to extract more useful information from these databases and put them to full use.
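The three LWR steps and the validation statistics described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several choices are assumptions made only for illustration: the function names (`tricube`, `lwr_predict`, `validation_metrics`) are hypothetical, the Euclidean distance is computed in principal component score space, the local model is a weighted least-squares regression on those scores (the abstract does not say which local regression was used), Rp² is computed as one minus the ratio of residual to total sum of squares, and RMSEP stands in for the standard error of prediction in RPD.

```python
import numpy as np


def tricube(d):
    """Tri-cube weights for distances d, scaled by the largest distance."""
    d = np.asarray(d, dtype=float)
    d_max = d.max()
    if d_max == 0.0:
        return np.ones_like(d)
    u = d / d_max
    return (1.0 - u**3) ** 3


def lwr_predict(X_cal, y_cal, x_new, n_components=5, n_neighbors=40):
    """Predict one validation sample from its most similar calibration samples."""
    X_cal = np.asarray(X_cal, dtype=float)
    y_cal = np.asarray(y_cal, dtype=float)
    mean = X_cal.mean(axis=0)

    # Step 1: compress the spectra with PCA (SVD of the centered matrix).
    _, _, Vt = np.linalg.svd(X_cal - mean, full_matrices=False)
    V = Vt[:n_components].T
    T_cal = (X_cal - mean) @ V
    t_new = (np.asarray(x_new, dtype=float) - mean) @ V

    # Step 2: pick the local subset by Euclidean distance
    # (assumed here to be measured in score space).
    dist = np.linalg.norm(T_cal - t_new, axis=1)
    idx = np.argsort(dist)[:n_neighbors]

    # Step 3: tri-cube weights, then weighted least squares on the scores
    # (assumed local model; rows are scaled by the square root of the weight).
    sw = np.sqrt(tricube(dist[idx]))[:, None]
    A = np.hstack([np.ones((idx.size, 1)), T_cal[idx]]) * sw
    b = y_cal[idx] * sw[:, 0]
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(coef[0] + t_new @ coef[1:])


def validation_metrics(y_obs, y_pred):
    """Return (R2, RMSEP, RPD) for observed and predicted values."""
    y_obs = np.asarray(y_obs, dtype=float)
    resid = y_obs - np.asarray(y_pred, dtype=float)
    rmsep = np.sqrt(np.mean(resid**2))
    r2 = 1.0 - np.sum(resid**2) / np.sum((y_obs - y_obs.mean()) ** 2)
    rpd = y_obs.std(ddof=1) / rmsep  # SD of measured values over RMSEP
    return r2, rmsep, rpd
```

The defaults `n_components=5` and `n_neighbors=40` mirror the values reported as optimal in this study. As the abstract notes, both would need to be re-optimized for another dataset, for example by scanning a grid of values and comparing the output of `validation_metrics` on held-out samples.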
Chen Songchao, Feng Lailei, Li Shuo, Ji Wenjun, Shi Zhou. Vis-NIR spectral inversion for prediction of soil total nitrogen content in laboratory based on locally weighted regression[J]. Acta Pedologica Sinica, 2015, 52(2): 312-320. DOI:10.11766/trxb201311290572