Ordinary and Simple Kriging#

This tutorial will teach us how to perform spatial interpolation with Ordinary and Simple Kriging. We will compare Kriging behavior in relation to a different number of ranges, and finally check how number of neighbors changes the interpolation quality. Ordinary and Simple Kriging are the simplest form of Kriging, but they’re still powerful, thus you should always start from those methods and then move to more sophisticated techniques.

Prerequisites#

  • Domain:

    • semivariance and covariance functions

    • kriging (basics)

  • Package:

    • TheoreticalVariogram

    • ExperimentalVariogram

  • Programming:

    • Python basics

Table of contents#

  1. Set semivariogram model (fit data into semivariogram).

  2. Create Ordinary and Simple Kriging models.

  3. Predict values at unknown locations and evaluate output.

1. Set semivariogram model (fit)#

[1]:
import geopandas as gpd
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from pyinterpolate import (
    build_experimental_variogram,
    build_theoretical_variogram,
    ordinary_kriging,
    simple_kriging
)
[2]:
# Load DEM data

df = pd.read_csv(
    '../data/dem.csv'
)
# Populate geometry column and set CRS

dem_geometry = gpd.points_from_xy(x=df['longitude'], y=df['latitude'], crs='epsg:4326')
dem = gpd.GeoDataFrame(df, geometry=dem_geometry)

# Transform crs to metric values
dem.to_crs(epsg=2180, inplace=True)
dem.plot(column='dem', cmap='gist_earth', alpha=0.6, vmin=0, legend=True);
../../../_images/usage_tutorials_functional_3-1-ordinary-and-simple-kriging_2_0.png

In the beginning, we remove 90 % of our points from the dataset, and we will leave them as a test set to calculate interpolation error of Kriging models.

Note: probably you have realized, that we put much more records in test set than in classic machine learning. This simulates real-world problems where we have only few samples and we must deliver a large surface of predicted values.

[3]:
train = dem.sample(n=int(0.1*len(dem)))
test = dem.loc[~dem.index.isin(train.index)]
[4]:
len(train), len(test), len(dem) - (len(train) + len(test))
[4]:
(689, 6206, 0)
[5]:
train.plot(column='dem', cmap='gist_earth', alpha=0.6, vmin=0, legend=True)
train.head()
[5]:
longitude latitude dem geometry
1478 15.168636 52.628530 19.145895 POINT (240802.993 536090.593)
1229 15.158352 52.723126 56.446690 POINT (240668.961 546640.642)
3055 15.226381 52.715467 18.290354 POINT (245214.816 545546.353)
1970 15.185643 52.595825 46.901897 POINT (241760.565 532394.821)
3945 15.261187 52.681934 18.407000 POINT (247370.181 541696.816)
../../../_images/usage_tutorials_functional_3-1-ordinary-and-simple-kriging_6_1.png

We have removed a subset of points from a dataset to be sure that Kriging is working. In this scenario, 90% of available points are removed, but in real-world cases, you will probably have even fewer points to perform estimations, down to the ~1% of known values.

We have two variables right now:

  • train: training set used for semivariogram model derivation,

  • test: test set used for the model error calculation.

The worst idea is to take sample as k first values from the list. Let’s imagine we have a sorted list of Digital Elevation Model points. The western part of our measurements covers a mountain, and the eastern part is plain. When we use the part of the west for modeling and the east part for tests, we are going directly into a catastrophe! Points for each set are chosen randomly to avoid bias related to the geographical location, but we cannot be sure that the sampled points are not from the neighborhood, thus in a fully controlled scenario we perform this experiment multiple times sampling dataset dozens or hundreds of times.

Having datasets, we can start modeling. Kriging requires fitted theoretical semivariogram.

[6]:
step_size = 500  # meters
max_range = 10000  # meters

exp_var = build_experimental_variogram(
    values=train['dem'],
    geometries=train['geometry'],
    step_size=step_size,
    max_range=max_range
)
[7]:
exp_var.plot();
../../../_images/usage_tutorials_functional_3-1-ordinary-and-simple-kriging_9_0.png
[8]:
theo_var = build_theoretical_variogram(
    experimental_variogram=exp_var,
    models_group='linear'
)
[9]:
theo_var.plot()
../../../_images/usage_tutorials_functional_3-1-ordinary-and-simple-kriging_11_0.png
[10]:
print(theo_var)
* Selected model: Linear model
* Nugget: 0.0
* Sill: 480.57557214875214
* Range: 8028.237650988341
* Spatial Dependency Strength is Undefined: nugget equal to 0, cannot estimate
* Mean Bias: None
* Mean RMSE: 25.496114727814966
* Error-lag weighting method: equal


+--------+--------------------+--------------------+---------------------+
|  lag   |    theoretical     |    experimental    |   bias (real-yhat)  |
+--------+--------------------+--------------------+---------------------+
| 500.0  | 29.930327989829088 | 16.982943545110665 | -12.947384444718423 |
| 1000.0 | 59.860655979658176 | 30.336797059220284 | -29.523858920437892 |
| 1500.0 | 89.79098396948727  | 55.19369693532188  |  -34.59728703416539 |
| 2000.0 | 119.72131195931635 | 72.64214325577524  | -47.079168703541114 |
| 2500.0 | 149.65163994914545 | 119.83419152308294 | -29.817448426062512 |
| 3000.0 | 179.58196793897454 | 127.38055541167519 | -52.201412527299354 |
| 3500.0 | 209.51229592880364 | 173.80396800141062 | -35.708327927393015 |
| 4000.0 | 239.4426239186327  | 221.17530533500022 | -18.267318583632488 |
| 4500.0 | 269.3729519084618  | 244.3961592666817  | -24.976792641780094 |
| 5000.0 | 299.3032798982909  | 279.82078294996256 | -19.482496948328333 |
| 5500.0 |  329.23360788812   | 332.5369149543105  |  3.3033070661904844 |
| 6000.0 | 359.1639358779491  | 367.62601681181036 |  8.462080933861273  |
| 6500.0 | 389.0942638677782  | 397.55117274410134 |  8.456908876323155  |
| 7000.0 | 419.0245918576073  | 432.9732345205596  |  13.948642662952295 |
| 7500.0 | 448.9549198474363  | 451.60459167886455 |  2.649671831428236  |
| 8000.0 | 478.8852478372654  | 462.7533281080885  |  -16.1319197291769  |
| 8500.0 | 480.57557214875214 | 488.04251358205227 |  7.466941433300121  |
| 9000.0 | 480.57557214875214 | 489.5822318182907  |  9.006659669538578  |
| 9500.0 | 480.57557214875214 | 510.8951955564648  |  30.31962340771264  |
+--------+--------------------+--------------------+---------------------+

2. Create Ordinary and Simple Kriging models#

This is the essential step of our tutorial. We’ve fit semivariances to the model that is utilized by Kriging algorithm. Knowing this model, we can interpolate values in empty locations.

We will predict a known and arbitrary point in the first run. It is not a real prediction, but test of Kriging. When we perform Kriging, we should get the same exact values as in training data because Kriging is an unbiased linear estimator. Then, we will interpolate values at unknown locations given in the test set, and calculate the Root Mean Squared Error (RMSE) between those predicted and real values.

Functions ordinary_kriging() and simple_kriging() are similar, there is only one parameter that is different - process_mean, and it is passed only to the simple_kriging() function. Kriging parameters are:

  • theoretical_model: fitted TheoreticalVariogram model

  • known_values: observed values at known locations

  • known_geometries: coordinates of known locations

  • unknown_locations: locations where you want to interpolate a missing value (or a single location)

  • neighbors_range: optional parameter, the maximum distance where we search for point neighbors. If you don’t provide it then it is equal to the range of theoretical variogram model

  • no_neighbors: number of the closest neighbors used for interpolation, by default it is set to 4, but for datasets with many point samples it is recommended to use more neighbors

  • max_tick: If searching for neighbors in a specific direction how large should be tolerance for increasing the search angle (how many degrees more)

  • use_all_neighbors_in_range: boolean parameter. True: if the number of neighbors within the neighbors_range is greater than the no_neighbors, then take all of them for modeling. Otherwise, take only n-closest neighbors

  • allow_approx_solutions: Allows the approximation of kriging weights based on the OLS algorithm. This is by default set to False, and parameter is useful for debugging purposes. Do not ignore errors coming from the Kriging system because usually those errors inform you about data quality (duplicated coordinates, clusters, etc.)

  • process_mean: The mean value of a process over a study area. Should be known before processing. That’s why Simple Kriging has a limited number of applications. You must have multiple samples and well-known area to know this parameter.

We use only the first four parameters and process_mean when we run Simple Kriging. Let’s start! The first step is an interpolation of the value known by our model.

[11]:
# Select one known value

known_value = train.sample()
known_value
[11]:
longitude latitude dem geometry
6260 15.34741 52.732855 48.885471 POINT (253481.516 547057.061)
[12]:
# Predict with Ordinary Kriging

unknown_location = known_value.geometry.iloc[0]
print(unknown_location)
POINT (253481.51620009867 547057.0611242605)
[13]:
ok_interpolation = ordinary_kriging(
    theoretical_model=theo_var,
    known_values=train['dem'],
    known_geometries=train['geometry'],
    unknown_locations=unknown_location,
    progress_bar=False
)
print(ok_interpolation)
[[4.88854713e+01 0.00000000e+00 2.53481516e+05 5.47057061e+05]]
[14]:
print(known_value['dem'] - ok_interpolation[0][0])
6260    0.0
Name: dem, dtype: float64

The first value is our prediction is precisely the same as the input in the training set! So far, the algorithm has worked well. The second value is prediction variance error, equal, or really close to zero - as it should be.

Simple Kriging differs from Ordinary Kriging, and we must set the process mean to retrieve valid results. It is rarely the case. That’s why Ordinary Kriging is the first choice for many applications. We know the global mean because we have the whole dataset, but in the real-world scenario, we cannot divide the set into training and test sets and then get the mean from the entire dataset - it is an information leak from the test set into a model!

[15]:
sk_interpolation = simple_kriging(
    theoretical_model=theo_var,
    known_values=train['dem'],
    known_geometries=train['geometry'],
    unknown_locations=unknown_location,
    process_mean=np.mean(df['dem'])
)
print(sk_interpolation)
100%|██████████| 1/1 [00:00<00:00, 614.73it/s]
[[4.88854713e+01 0.00000000e+00 2.53481516e+05 5.47057061e+05]]

The Simple Kriging algorithm returns the same output as the Ordinary Kriging: [prediction, error variance, pt x, pt y]. Simple Kriging has returned the same value as the actual value fed to the algorithm.

3. Predict values at unknown locations and evaluate output#

In this part of the tutorial we interpolate values at missing locations. Kriging does it (usually) better than other simple techniques like filling missing data with the averages or inverse distance weighting. Interpolation quality is driven mostly by a good semivariogram model, but we can make better predictions when we scan more neighbors around the missing location. We will test this assumption in the next step, validating output with test dataset, and calculating the root mean squared error of prediction.

We will perform those tests only for ordinary kriging, dividing dataset multiple times to get distribution of errors for each number of neighbors.

[16]:
from tqdm import tqdm


neighbors_number = [4, 8, 16, 32, 64, 128, 256, 512]
number_of_trials = 10

outputs = []

for _ in range(number_of_trials):
    for nn in tqdm(neighbors_number):
        # Divide dataset into train and test set
        train = dem.sample(n=int(0.1*len(dem)))
        test = dem.loc[~dem.index.isin(train.index)]

        # Take sample from test, we don't need a whole dataset
        test_sample = test.sample(n=100)

        # Fit semivariogram model
        exp_var = build_experimental_variogram(
            values=train['dem'],
            geometries=train['geometry'],
            step_size=step_size,
            max_range=max_range
        )
        theo_var = build_theoretical_variogram(
            experimental_variogram=exp_var,
            models_group='linear'
        )

        # Predict new values
        yhats = ordinary_kriging(
            theoretical_model=theo_var,
            known_values=train['dem'],
            known_geometries=train['geometry'],
            unknown_locations=test_sample['geometry'],
            no_neighbors=nn,
            use_all_neighbors_in_range=False,
            progress_bar=False
        )
        yhats = yhats[:, 0]
        avg_rmse = np.sqrt(np.mean((test_sample['dem'] - yhats)**2))
        outputs.append([avg_rmse, nn])
100%|██████████| 8/8 [00:02<00:00,  2.74it/s]
100%|██████████| 8/8 [00:02<00:00,  3.60it/s]
100%|██████████| 8/8 [00:02<00:00,  3.54it/s]
100%|██████████| 8/8 [00:02<00:00,  3.55it/s]
100%|██████████| 8/8 [00:02<00:00,  3.37it/s]
100%|██████████| 8/8 [00:02<00:00,  3.34it/s]
100%|██████████| 8/8 [00:02<00:00,  3.43it/s]
100%|██████████| 8/8 [00:02<00:00,  3.36it/s]
100%|██████████| 8/8 [00:02<00:00,  3.25it/s]
100%|██████████| 8/8 [00:02<00:00,  3.36it/s]
[17]:
# create dataframe from results

df = pd.DataFrame(outputs, columns=['rmse', 'nn'])
df.head()
[17]:
rmse nn
0 3.572057 4
1 4.489833 8
2 4.091741 16
3 5.458568 32
4 4.077269 64
[18]:
# First plot average rmse per number of neighbors

plt.figure()
plt.plot(df.groupby('nn')['rmse'].mean())
plt.xlabel('Number of neighbors')
plt.ylabel('RMSE')
plt.show()
../../../_images/usage_tutorials_functional_3-1-ordinary-and-simple-kriging_24_0.png

As we see, there could be too many neighbors - interpolations become wildly inaccurate. We can check general statistics for each number of neighbors using .describe() method on transformed dataframe:

[19]:
# First, set number of neighbors as a column

transformed = df.pivot(columns='nn', values='rmse')
transformed.head()
[19]:
nn 4 8 16 32 64 128 256 512
0 3.572057 NaN NaN NaN NaN NaN NaN NaN
1 NaN 4.489833 NaN NaN NaN NaN NaN NaN
2 NaN NaN 4.091741 NaN NaN NaN NaN NaN
3 NaN NaN NaN 5.458568 NaN NaN NaN NaN
4 NaN NaN NaN NaN 4.077269 NaN NaN NaN
[20]:
transformed.describe()
[20]:
nn 4 8 16 32 64 128 256 512
count 10.000000 10.000000 10.000000 10.000000 10.000000 10.000000 10.000000 10.000000
mean 4.424563 4.589075 4.531294 4.754989 4.510437 26.591935 429.343974 1588.650863
std 0.721704 0.745411 0.671426 1.025088 0.469793 45.112219 456.209873 2928.686785
min 3.572057 3.333597 3.688744 3.262639 3.769000 4.388957 45.168559 183.274077
25% 3.992381 4.271690 3.909511 4.317161 4.145270 5.895184 83.092058 243.304685
50% 4.195704 4.547581 4.599459 4.672608 4.537990 7.752335 272.913118 359.240210
75% 4.744351 5.106922 5.043499 4.879660 4.809733 17.771979 685.394778 691.069539
max 5.829943 5.718141 5.551242 7.140792 5.207114 149.477744 1406.556887 9413.566202

In general, too much neighbors might worsen the results; we are assuming, that the closest neighbors are the most important, and introducing more distant neighbors may amplify the noise in the model response.

The best idea is to look into semivariogram and limit range as much as you can. Then you will be sure, that only the closest neighbors are taken into account.

Changelog#

Date

Changes

Author

2025-11-07

Used values and geometries parameters instead of ds in the experimental variogram initialization, and known_values and known_geometries for kriging process (instead of known_locations parameter)

@SimonMolinsky (Szymon Moliński)

2025-04-30

Tutorial has been adapted to the 1.0 release

@SimonMolinsky (Szymon Moliński)

2025-07-10

Changed behavior of ordinary_kriging and simplified tutorial

@SimonMolinsky (Szymon Moliński)