Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps

Fund Project:

The National Natural Science Foundation of China (41431177; 41471178), the Natural Science Research Program of Jiangsu (14KJA170001), the Graduate Research Innovation Program of Jiangsu (KYLX15_0715), the National Basic Research Program of China (973 Program) (2015CB954102), and the “One-Thousand Talents” Program of China

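    Summary:

    Updating soil maps with data mining models is an important research topic. In building a data mining model, the quality of the training samples determines how fully the soil-environment relationships of the study area are expressed, and it has a critical influence on the resulting inferred map. This paper proposes a method for selecting typical training samples based on grading of soil types by area: soil types are graded according to their area, and training samples are drawn from typical points according to the proportions between grades. The method was applied to update the conventional soil map of the Raffelson watershed in Wisconsin, USA, and compared with two other training sample selection methods to verify its effectiveness. Over 500 repeated experiments, the proportion able to update the conventional soil map was 79.5% for the proposed method and 71.8% and 63.6% for the other two methods, and the inferred maps of the proposed method agreed better with the characteristics of soil distribution in the study area. The proposed method is an effective training sample selection method.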

    Abstract:

    【Objective】Traditional soil surveys have produced a large number of conventional soil maps of various scales and kinds. Although these maps are not very high in spatial detail or accuracy, they contain a large amount of valuable expert knowledge about soil-environment relationships in the regions they cover. Data mining models can extract from these maps information useful for updating them. When data mining models are used to extract information on the spatial distribution of soils, selecting training samples is an essential step: the quality of the training samples largely determines how fully soil-environment relationships are expressed and how accurate the updated soil maps are. The area-weighted proportion method is a common way to select training samples. However, it usually assigns too much weight to soil types with large areas, so that too many training samples are selected for them. Meanwhile, randomly selecting training samples from polygons of the same soil type may bring in “noise” samples located in transition areas between soil types, which lowers the accuracy of the updated soil maps.

    【Method】In this paper, a new method was developed to select training samples from conventional soil maps based on grading of soil types by area. The method consists of two steps. The first step is to identify typical (representative) samples of each soil type from the conventional soil map, so as to avoid “noise pixels” caused by misplaced boundaries between soil polygons. It is assumed that most of the boundaries of the polygons of a given soil type are correctly delineated, so that the peak of the histogram of an environmental factor within those polygons represents the typical environmental condition under which the soil develops or exists. Pixels close to that condition, or within the peak zone of the histogram, are considered representative samples. All the representative samples selected through the histograms of the various environmental factors of a soil type are combined into the typical sample set of that soil type. The second step is to select training samples based on grading of soil types by area, with a view to keeping the numbers of samples of the soil types in balance. Soil types in the same grade receive the same number of training samples, each drawn from the typical sample set of its own soil type. A random forest model was then used to update the conventional soil map based on the selected training samples. To evaluate the proposed method, it was compared with two other training sample selection methods. One randomly selects training samples from the polygons of each soil type, with the number of samples for each soil type depending on the proportion of the grade the soil type is in. The other is the common area-weighted proportion method, which randomly selects training samples from the polygons of a soil type, with the number of samples depending on the area-weighted proportion of that soil type. The study area was the small Raffelson watershed in Wisconsin, USA. Each of the three selection methods was run 500 times, and the mean accuracy of the inferred maps and the proportion of the conventional soil map updated were validated against 92 independent field verification samples.

    【Result】Over the 500 trials, about 79.5% of the conventional soil map could be updated with the proposed method, compared with 71.8% and 63.6% with the other two methods. Meanwhile, the soil maps updated with the proposed training sample selection method are more consistent with the actual soil distribution in the Raffelson watershed.

    【Conclusion】The proposed method is an effective training sample selection method for data mining models used to update conventional soil maps.
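
    The two selection steps described under Method can be illustrated with a minimal Python sketch. It is not the authors' implementation: every function name is hypothetical, and the number of histogram bins, the use of a union to combine covariates, the area thresholds, and the per-grade sample counts are all assumptions, since the abstract does not specify them. The random forest step is not shown.

```python
import random


def peak_bin_members(values, n_bins=10):
    """Indices of the values that fall in the most populated histogram bin."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant covariate: every pixel is equally typical
        return set(range(len(values)))
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = [0] * n_bins
    for b in bins:
        counts[b] += 1
    peak = counts.index(max(counts))  # ties resolve to the lowest bin
    return {i for i, b in enumerate(bins) if b == peak}


def typical_samples(pixels, n_bins=10):
    """Step 1: typical sample set for one soil type.

    pixels is a list of dicts mapping covariate name to value, one dict per
    pixel inside the polygons of that soil type. A pixel is kept if it lies in
    the peak bin of at least one covariate (union: an assumption).
    """
    typical = set()
    for name in pixels[0]:
        typical |= peak_bin_members([p[name] for p in pixels], n_bins)
    return sorted(typical)


def grade_by_area(areas, thresholds):
    """Grade of a soil type = number of area thresholds it meets or exceeds."""
    return {t: sum(a >= th for th in thresholds) for t, a in areas.items()}


def select_training(typical_by_type, areas, thresholds, per_grade, seed=0):
    """Step 2: soil types in the same grade get the same number of samples,
    each drawn at random from that type's own typical sample set."""
    rng = random.Random(seed)
    grades = grade_by_area(areas, thresholds)
    chosen = {}
    for soil_type, candidates in typical_by_type.items():
        n = min(per_grade[grades[soil_type]], len(candidates))
        chosen[soil_type] = rng.sample(candidates, n)
    return chosen
```

    With thresholds of, say, 50 and 200 area units and per-grade counts of 10, 20 and 30, two small soil types of 40 and 45 units would each receive 10 training samples and a 500-unit type would receive 30, rather than counts strictly proportional to area.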

Cite this article

LIU Xueqi, ZHU Axing, YANG Lin, MIAO Yamin, ZENG Canying. Training Sample Selection Method Based on Grading of Soil Types by Area for Updating Conventional Soil Maps[J]. Acta Pedologica Sinica, 2017, 54(1): 36-47. DOI:10.11766/trxb201604210130

History
  • Received: 2016-03-21
  • Last revised: 2016-07-13
  • Accepted: 2016-07-27
  • Published online: 2016-10-17