The assessment of class frequency in soil map legends is affected by uncertainty, especially at small scales where generalization is greater. The aim of this study was to test the hypothesis that data mining techniques provide a better estimation of class frequency than traditional deterministic pedology in a national soil map. In the 1:5,000,000 map of Italian soil regions, the soil classes are the WRB reference soil groups (RSGs). Different data mining techniques, namely neural networks, random forests, boosted trees, classification and regression trees, and support vector machines (SVM), were tested, and the last one gave the best RSG predictions using selected auxiliary variables and 22,015 classified soil profiles. The five most frequent RSGs resulting from the two approaches were compared. The outcomes were validated with a Bayesian approach applied to a subset of 10% of geographically representative profiles, which were held out before data processing. The validation provided the values of both positive and negative prediction abilities. The most frequent classes were predicted equally by the two methods, which differed, however, in their forecasts of the other classes. The Bayesian validation indicated that the SVM method was more reliable than the deterministic pedological approach and that both approaches were more confident in predicting the absence rather than the presence of a soil type.
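The abstract does not spell out how the positive and negative prediction abilities were computed. As a minimal sketch, assuming they correspond to the usual positive and negative predictive values (the probability that a class is truly present when predicted, and truly absent when not predicted), the calculation for one RSG on a held-out set could look like the following. The function name `predictive_values` and the eight sample profiles are hypothetical and serve only as an illustration; they are not data from the study.

```python
def predictive_values(observed, predicted, target):
    """Return (PPV, NPV) for one class, treating it as present/absent.

    PPV = P(class observed | class predicted)
    NPV = P(class not observed | class not predicted)
    """
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == target and obs == target:
            tp += 1  # predicted present, truly present
        elif pred == target:
            fp += 1  # predicted present, truly absent
        elif obs == target:
            fn += 1  # predicted absent, truly present
        else:
            tn += 1  # predicted absent, truly absent
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Hypothetical held-out profiles (illustrative only, not from the paper).
observed = ["Cambisols", "Cambisols", "Luvisols", "Regosols",
            "Cambisols", "Luvisols", "Regosols", "Leptosols"]
predicted = ["Cambisols", "Luvisols", "Luvisols", "Cambisols",
             "Cambisols", "Regosols", "Regosols", "Leptosols"]

ppv, npv = predictive_values(observed, predicted, "Cambisols")
print(f"PPV={ppv:.2f}, NPV={npv:.2f}")  # PPV=0.67, NPV=0.80
```

In this toy case the negative predictive value exceeds the positive one, which is the same direction as the pattern the abstract reports: absence of a soil type is predicted with more confidence than presence.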
Lorenzetti, R., Barbetti, R., Fantappiè, M., L'Abate, G., Costantini, E. (2014). Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps. Geoderma, 237-238, 237-245 [doi:10.1016/j.geoderma.2014.09.006].
Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps
FANTAPPIE', Maria;
2014-01-01
File | Description | Size | Format | Access
---|---|---|---|---
Lorenzetti_et_al_2015_geoderma.pdf | Main article | 1.56 MB | Adobe PDF | Archive managers only
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.