Detailed Mapping of Soil Texture of a Paddy Growing Soil Using Multivariate Geostatistical Approaches

This study was conducted to explore performances of multivariate geostatistical techniques, co-kriging and regression kriging in contrast to univariate ordinary kriging, to generate detailed maps of soil texture by using proximally sensed apparent electrical conductivity (ECa) as secondary information. The survey of ECa (n = 21110) was conducted in a paddy tract (2.5 ha) located in Kurunegala, Sri Lanka using DUALEM-1S proximal soil sensor. Twenty-five soil samples were collected on the basis of conditioned Latin hypercube sampling approach. Soil texture was determined using pipette method. Additive log transformed sand and clay values were used to produce soil texture maps using multivariate co-kriging, regression kriging and univariate ordinary kriging. Data ranges of clay (3.3 – 19.5%) and sand (62.5 – 90.1%) showed a considerable variability within the study area. Correlation analysis revealed a strong relationship of clay% with horizontal (r = 0.86) and perpendicular (r = 0.89) coplanar ECa and their geometric mean (r = 0.89). Sand % showed strong negative relationships with horizontal (r = -0.89) and perpendicular (r = -0.89) coplanar ECa and their geometric mean (r = 0.90). Second order polynomial regression models were best fitted for the prediction of clay (R 2 = 0.83), and sand (R 2 = 0.83) and these relationships were used for regression kriging. Prediction accuracies of geostatistical approaches were investigated by leave-one-out cross validation procedure and estimation of mean error, mean absolute error and root mean square error. Detailed maps ) of clay and sand generated using co-kriging, regression kriging and ordinary kriging showed similar spatial patterns. However, multivariate geostatistical techniques produced more accurate detailed soil texture maps. Further, the comparison of mean error, mean absolute error and root mean square error values of three different interpolation techniques indicated that regression kriging produced the most accurate soil texture maps. This study emphasizes the high potential and robustness of regression kriging combined with proximally sensed apparent electrical conductivity to create high accurate detailed texture maps in efficient and cost effective manners.


INTRODUCTION
Detailed maps on different soil properties are essential for precision management of soils, process based land evaluation, land & water management, irrigation and drainage (Basayigit and Senol, 2008). With the advancement of technology, various computer based approaches of soil mapping have been evolved. These methods allow the fusion of many layers of secondary information such as proximally and remotely sensed soil information, and elevation to generate accurate soil maps. Secondary information are proxy data that strongly correlate with a target soil property such as soil texture, organic carbon or trace metals. Often, secondary information are available in high density. Thus, integration of secondary information with sparsely measured primary soil variables allows mapping of the latter in more detailed, accurately and in cost effective manner.
Apparent electrical conductivity (EC a ) is a secondary information which can be proximally sensed in cost effective manner. Proximal sensors such as DUALEM-1S and Geonics EM38 usually provide large number of EC a measurements in a single survey. Although, EC a sensors primarily measures soil salinity, many studies have showed a strong correlation of EC a with soil texture (Saey et al., 2012), organic carbon, CEC and soil moisture. Soil texture refers to relative proportions of sand (0.05 -2 mm), silt (0.002 -0.05 mm) and clay (<0.002 mm) particles in soil. Soil texture determines many soil functions such as cation exchange, nutrient and water retention (Kumaragamage and Kendaragama, 2010). Therefore, detailed maps of soil texture have been often used to delineate management zones and subsequent application of site-specific soil management practices.
Co-kriging (CK) and Regression kriging (RK) are two multivariate geostatistical interpolation techniques used for handling of incorporate secondary soil information for mapping of primary soil information (Li and Heap, 2008). However, use of these multivariate interpolation techniques for the prediction of soil texture using EC a in tropical soils have been rarely investigated. This study was conducted to explore the performances of CK and RK in contrast to univariate ordinary kriging (OK) to generate accurate maps of soil texture in a paddy growing tropical soil.

MATERIALS AND METHODS
The study was carried out in 2.5 ha paddy tract at Makulana in Kurunegala district ( Figure 1, central coordinates; 7° 27ʹ 15ʺ N, 80° 27ʹ 5ʺ E). The soil in this area belongs to the Kurunegala series (Typic Endoaqualf). The survey of EC a (n=10121) was carried out using DUALEM-1S (DUALEM Inc., 2012) proximal soil sensor. DUALEM-1S is an electromagnetic induction based proximal soil sensor consists of one vertically oriented transmitter coil and two receiver coils, each oriented vertically and horizontally. The transmitter coil is energized with an alternating current (9 kHz) forming a primary magnetic field around the transmitter coil which induces a comparable secondary magnetic field in the soil. This secondary magnetic field is superimposed by the primary magnetic field and the resultant magnetic field is measured by receiver coils (Saey et al., 2009). Thus, DUALEM-1S proximal soil sensor measures two EC a measurements, horizontal geometry apparent electrical conductivity (EC a HCP) and perpendicular geometry apparent electrical conductivity (EC a PRP). Furthermore, EC a PRP and EC a HCP are sensitive to topsoil (0-75 cm) and subsoil (75-150 cm) conductivities, respectively. Thus, the sensor provides simultaneous measurements of both topsoil and subsoil properties (DUALEM Inc., 2012). During the study, the sensor was mounted on a wooden sled and pulled along parallel lines spaced at 2 m at an approximate speed of 3 km h -1 and measurements were recorded at one second interval.

Figure 1. Spatial distribution of measurement locations of EC a (n = 10121).
Conditioned Latin hypercube sampling approach (Minasny and McBratney, 2006) was used to locate 25 soil sampling points representing spatial variability of proximally sensed EC a HCP and EC a PRP ( Figure 2). Soil samples were taken from the depth of 0-30 cm using a gouge auger. Air-dried samples were passed through a 2 mm sieve to remove gravel fraction. Soil texture was determined using pipette method (Gee and Bauder, 2002). Exploratory data analysis was performed prior to the spatial analysis. In soil textural analysis, sand, silt and clay contents are presented as relative proportions. Such data are called as compositional data.  Aitchison (1986) observed errors in predictions when such compositional data were directly used in spatial analysis. Therefore, additive log transformation (ALT) (Aitchison, 1986) was performed on sand, silt and clay contents using the following equation.
) and ) are the i th and k th components of the composition at the same location. Moreover, Pawlowsky-Glahn and Olea (2004) showed that the choice of the k th component does not influence analytical results (e.g. prediction results) of ALT transformed data. The total number of components in the composition is p. Additive log transformed values were back transformed by using generalized logistic transformation (Aitchison, 1986) using following equation.
where, ) is the ALT back transformed value of ) and ) is ALT transformed value of i th component of composition.
Semivariogram analysis was used to quantify the spatial structures of soil texture. Semivariogram was calculated using the following equation.
where, )is the semivariance of the textural fraction z for lag distance h calculated using pairs of observations ( )) separated by lag distance h. The empirical semivariogram calculation was followed by fitting a theoretical semivariogram model. Ordinary kriging is a univariate geostatistical interpolation technique and it is often considered as the best linear unbiased predictor (BLUP). Ordinary kriging estimates a value of a textural fraction z at location by: where, ) is the sum of the neighboring observed values i ) ) with the weights which were derived from the semivariogram by solving the OK system. The sum of the weights were set to one in order to ensure an unbiased estimate and the prediction was optimized by minimizing the expected error (Webster and Oliver, 1990).
Correlation analysis was performed to explore relationships between texture data and EC a HCP, EC a PRP, and their mean (EC a M) and geometric mean (EC a GM). Highly correlated measurement was selected as the secondary variable for the prediction of soil textural fractions using CK and RK.
Co-kriging is a multivariate interpolation technique which uses joint spatial structure (coregionalization) of primary (e.g. soil texture) and secondary variables (e.g. EC a ) to make (2) optimal predictions. Thus, joint spatial structure was modeled by calculating cross variograms (Goovaerts, 1997).
) are measured values of primary and secondary variables and at two points separated by a lag distance h. Variograms of individual soil properties and the corresponding cross variograms were modeled jointly as described by Carroll and Oliver (2005) under the constraint of positive semi-definiteness of co-regionalization matrices. Cokriging makes predictions of the primary variable using the following equation: where, ) is the estimate of at location ) is the value of primary variable at th location. ( ) is the value of secondary variable at th location. and are CK weights. Co-kriging weights are calculated by a system of equations identical to that of OK (Issaaks and Srivastava, 1989). Ordinary CK algorithm calculates weights subjecting to following two unbiasedness conditions: Regression kriging is a spatial extension of ordinary least square regression (OLS) which accounts for the spatial structure of residuals to make accurate predictions (Sun et al., 2010). In RK algorithm, soil property at unvisited location is predicted by summing the predicted drift and residual (Odeh et al., 1994).
where, ) is the fitted drift and )is the interpolated residual. The drift ) was modeled using OLS regression of soil texture and EC a and the residuals ) were interpolated using simple kriging with expected value 0 (Hengl et al., 2007). The accuracies of predictions were assessed by leave one out cross validation procedure. Thus, validation indices root mean square error (RMSE), mean error (ME) and mean absolute error (MAE) calculated using following equations: (8) where, ) and ) are interpolated and measured values, respectively at the location x i . The number of validation data points is denoted by n (n=25). In these calculations, texture data obtained from laboratory analysis and predicted data value in cross validation results are considered to be measure values and interpolated values. The relative improvement (RI) of texture maps generated through multivariate kriging techniques (CK and RK) was calculated over univariate OK using the following equation.

)
where, and are root means square error for univariate ordinary kriging and a given multivariate (CK or RK) method, respectively.

Variability of soil texture and EC a
Apparent electrical conductivity survey followed by exploratory data analysis resulted in 10121 data points of each of EC a HCP and EC a PRP. Descriptive statistics of proximally sensed EC a HCP, EC a PRP, ECaGM and laboratory measured clay and sand are given in the Table 1. Coefficient of variations (CVs) of both EC a measurements showed moderate level of variability according to CV classification of Warrick and Nielsen (1980). The coefficient of skewness values indicated that all data distributions are positively skewed (skewness > 0). Moreover, coefficient of kurtosis for EC a HCP distribution indicated that EC a HCP distribution is slightly platykurtic. Distributions of EC a PRP and EC a GM were leptokurtic (kurtosis > 0). Average values of EC a HCP and EC a PRP indicated comparatively higher conductivity of the subsurface soil. Moreover, CV values of EC a HCP and EC a PRP reflected comparatively heterogeneous surface soil. Positively skewed data distribution was observed for clay measurements. However, the distribution of sand was negatively skewed. Coefficients of kurtosis of both clay and sand distributions showed leptokurtic nature that deviates from the normal distribution. Average clay and sand contents were 7.1% and 83.5 %, respectively. According to Warrick and Nielsen (1980), clay data showed a high spatial variability (CV > 60%) whereas sand data showed low variability (CV< 12%).
The Pearson's correlation coefficient (r) between soil textural fractions and EC a are given in the Table 2. Correlation coefficient indicates the strength of the linear relationship between two variables (Mukaka, 2012). Mason et al. (1983) stated that range of r between 0.68 and 1 indicates a very strong linear relationship between two variables. Thus, the correlation between proximally sensed EC a and texture was very strong. A strong positive correlation was observed between clay and EC a . Studies have revealed that the negative charge of clay particles and its high water retention as reasons for positive relationship between clay and EC a ( Rhoades et al., 1999). Sand particles behave opposite manner resulting in a negative correlation between sand and EC a . The highest correlation of textural fractions was observed with EC a GM. Thus, EC a GM was used as secondary information when preparing detailed soil maps using CK and RK.

Regression kriging, ordinary kriging and co-kriging of soil texture
Scatter plots showing the relationship between clay and EC a GM and sand and EC a GM are provided in Figure 3a and 3b, respectively. Second order polynomial models were best fitted for the prediction of clay (R 2 = 0.83, equation 15) and sand (R 2 = 0.83, equation 16) by EC a GM.

Figure 3. Scatter plot of (a) clay and EC a GM and (b) sand and EC a GM and fitted polynomial prediction models
The Empirical variograms of ALT transformed residual clay% and sand% are shown in Figure 4a and 4b, respectively. Both variograms were best fitted with spherical model. The relative nugget effect (RNE) (ratio of the nugget to sill) indicates the proportion of spatially unstructured variation in relation to total variation (Vitharana, 2008). According to the classification of RNE proposed by Cambardella et al. (1994), RNEs ranging from 0% to 25% and 25% to 75% indicate a strongly and moderately structured spatial dependencies, respectively. Thus, residual variograms of ALT transformed residuals of clay% (RNE = 33%) and sand% (RNE = 61%) showed moderately structured spatial variability. Empirical Variograms of ALT transformed clay and sand are shown in Figure 5a and 5b, respectively. Both variograms were best fitted with spherical models. Bijanzadeh et al. (2014) have also shown that empirical semivariogram of many chemical and physical properties of soil are often fitted with spherical models and rarely with exponential models. RNE values of modeled variograms of clay (RNE = 3.7 %) and sand (RNE = 4.2 %) reflected a moderately structured spatial variability.

Figure 4. Variograms of ALT transformed (a) residual clay% and (b) residual sand%
Cross variography is an important step in CK interpolation which quantifies the coregionalisation between primary and secondary variable. Both cross variograms were fitted with spherical model (Figure 5c and 5d).

Figure 5. Variograms of additive log transformed (a) clay and (b) sand, cross variograms of additive log transformed (c) clay & (d) sand with EC a GM
However, cross variograms of EC a GM and ALT transformed sand showed a negative spherical shape (Figure 5d). Negative semi variance values indicate a positive increase of one variable over distance h corresponds with a decrease of the other variable over the same distance (Rossi et al., 1996).
Maps of soil clay prepared by OK, CK and RK are provided in Figure 6a, 6b and 6c, respectively. All interpolation techniques showed the same general pattern of spatial variability of clay across the study filed. Clay maps showed a distribution of high clay content (15-18%) in the northern part of the study field. The northern area of the field lies on the lower position of the landscape. Therefore, it can be stated that long-term translocation of clay particles from higher positions of landscape to lower position by soil erosion have resulted high clay content in northern part of the study field. Babalola et al. (2007) have also found comparatively higher clay contents at foot slopes of a typical catena resulted by erosion-sedimentation processes.

Figure 6. Spatial distribution of clay% prepared by (a) ordinary kriging, (b) co-kriging and (c) regression kriging
Maps of sand prepared by OK, CK and RK showed a similar spatial distributions (Figure 7a to 7c) which opposite to that of clay. Texture maps generated using OK showed less detailed patterns depicting the average spatial variability of the field (Figure 6a and 7a). In contrast, maps generated using RK and CK showed the spatial variability in detail (Figure 6b, 6c and 7b, 7c) that suits for site specific soil management through precision agricultural approaches.

Figure7. Spatial distribution of sand% prepared by (a) ordinary kriging, (b) cokriging and (c) regression kriging
Results of the cross validation of OK, CK and RK are summarized in Figure 8. Mean error computes the average bias of prediction and a value nearing zero is desired for accurate mapping (Isaaks and Srivastava, 1989). The lowest MEs for the prediction of clay (-0.13) and sand (0.24) were found for RK multivariate interpolation method. This was followed by OK (clay = -0.32, sand = 0.52) and CK (clay = 4.96, sand = -4.02).

Figure 8. Results of cross validation of (a) Mean Error, (b) Mean Absolute Error and (c) Root mean square error of texture maps prepared by ordinary kriging (OK), cokriging (CK) and regression kriging (RK)
Mean absolute error (Nalder and Wein, 1998) and RMSE are measures of the magnitude of prediction error (Hernandez-Stefanoni and Ponce-Hernandez, 2006). However, MAE is less sensitive to outliers while RMSE is sensitive to outliers. Values of MAE and RMSE ( Figure  8b and 8c) revealed a superior prediction accuracy of RK for clay (MAE = 1.22, RMSE = 1.73) and sand (MAE = 1.95, RMSE = 2.80) in comparison to the OK and CK. We observed very poor prediction accuracy of CK. Goovaerts (1997) has shown two intrinsic limitations of CK; some of the weights can take negative values, thus increasing the risk of getting unacceptable estimates and, often weights tend to be small, thus reducing the influence of the secondary data (Li and Heap, 2008). However, previous studies have revealed better accuracy of CK in comparison to univariate OK.
Regression kriging by combing 25 clay content measurements and 10121 EC a measurements has resulted 30% improvement of prediction accuracy in comparison to the univariate OK. However, accuracy improvement for the mapping of sand was 2.5%. Several studies (Odeh et al., 1994;Goovaerts, 1999;Bishop and McBratney, 2001) have also revealed that combination of kriging and correlation with secondary information data outperformed OK and CK.

CONCLUSIONS
This study explores the usability of multivariate geostatistical methods for detailed mapping of soil texture in a paddy grown soil of Kurunegala series. A considerable variability of soil texture exists in the study area. Variograms analysis revealed a structured spatial variability of soil texture. Thus, detailed mapping of soil texture will be of importance for the site specific management of the studied paddy grown soils of Kurunegala series. The correlation and regression analysis revealed that proximal sensed EC a is a very strong predictor (secondary information) for detailed mapping of soil texture. Comparison of OK, CK and RK revealed that RK is the best multivariate geostatitical approach to predict soil texture using densely measured proximally sensed EC a.