Theoretical Semivariogram

class pyinterpolate.TheoreticalVariogram(model_params: dict | None = None, protect_from_overwriting=True, verbose=False)

Theoretical model of a spatial dissimilarity.

Parameters:
model_paramsUnion[dict, TheoreticalVariogramModel], optional

Dictionary with 'nugget', 'sill', 'rang' and 'variogram_model_type'.

protect_from_overwritingbool, default = True

Protect model parameters from overwriting.

verbosebool, default = False

Show autofit() iteration results.

Attributes:
experimental_variogramExperimentalVariogram, optional

Empirical Variogram class and its attributes.

experimental_semivariancesnumpy array, optional

Experimental semivariances.

yhatnumpy array, optional

Predicted semivariances.

model_typestr, optional

The name of the chosen model.

nuggetfloat, default=0

The nugget parameter (bias at the zero distance).

sillfloat, default=0

Partial sill, or sill when nugget is set to zero. Total sill is a sum of partial sill and nugget. If given, then partial sill is fixed to this value.

rangfloat, default=0

The semivariogram range is the distance up to which spatial correlation might be observed. It shouldn’t be set to a distance larger than half of the study extent.

directionfloat, in range [0, 360], optional

The direction of a semivariogram. If not given then semivariogram is isotropic.

rmsefloat, default=0

Root mean squared error of the difference between the empirical observations and the modeled curve.

maefloat, default=0

Mean Absolute Error of a model.

biasfloat, default=0

Forecast Bias of the modeled semivariogram vs experimental points. A large positive value means that the model underestimates the experimental semivariances. A large negative value means that the model overestimates them.

smapefloat, default=0

Symmetric Mean Absolute Percentage Error of the prediction - values from 0 to 100%.

spatial_dependency_ratiofloat, optional

The ratio of the nugget to the total sill (nugget plus partial sill), multiplied by 100. Levels from 0 to 25 indicate strong spatial dependency, from 25 to 75 moderate spatial dependency, and from 75 to 95 weak spatial dependency. Above 95 the process is considered not spatially dependent.

spatial_dependency_strengthstr, default = “Unknown”

Descriptive indicator of spatial dependency strength based on the spatial_dependency_ratio. It could be:

  • unknown if ratio is None,

  • strong if ratio is below 25,

  • moderate if ratio is between 25 and 75,

  • weak if ratio is between 75 and 95,

  • no spatial dependency if ratio is greater than 95.

protect_from_overwritingbool, default = True

Protect model parameters from overwriting.

verbosebool, default = False

Show autofit() results.
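
The thresholds listed for spatial_dependency_strength can be read as a simple lookup. Below is a minimal sketch, not the library's code: describe_dependency is a hypothetical helper, and the handling of values that fall exactly on 25 or 75 is an assumption, because the description above does not say which side of a boundary is inclusive.

```python
from typing import Optional


def describe_dependency(ratio: Optional[float]) -> str:
    """Map a nugget-to-sill ratio (in percent) to a descriptive label.

    Hypothetical helper; the treatment of exact boundary values is assumed.
    """
    if ratio is None:
        return "unknown"
    if ratio < 25:
        return "strong"
    if ratio < 75:
        return "moderate"
    if ratio <= 95:
        return "weak"
    # Only ratios greater than 95 reach this point.
    return "no spatial dependency"


labels = [describe_dependency(r) for r in (None, 10.0, 50.0, 80.0, 99.0)]
print(labels)
```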

Methods

fit()

Fits a theoretical model to experimental variogram data.

autofit()

The same as fit but tests multiple nuggets, ranges, sills and models.

calculate_model_error()

Evaluates the model performance against experimental values.

to_dict()

Store model parameters in a dict.

from_dict()

Read model parameters from a dict.

to_json()

Save model parameters in a JSON file.

from_json()

Read model parameters from a JSON file.

plot()

Shows theoretical model.

See also

ExperimentalVariogram

class to calculate experimental variogram and more.

Examples

>>> import numpy as np
>>> from pyinterpolate import ExperimentalVariogram, TheoreticalVariogram
>>>
>>>
>>> REFERENCE_INPUT = np.array([
...    [0, 0, 8],
...    [1, 0, 6],
...    [2, 0, 4],
...    [3, 0, 3],
...    [4, 0, 6],
...    [5, 0, 5],
...    [6, 0, 7],
...    [7, 0, 2],
...    [8, 0, 8],
...    [9, 0, 9],
...    [10, 0, 5],
...    [11, 0, 6],
...    [12, 0, 3]
...    ])
>>> step_size = 1
>>> max_range = 4.1
>>> empirical_smv = ExperimentalVariogram(
...     values=REFERENCE_INPUT[:, -1],
...     geometries=REFERENCE_INPUT[:, :-1],
...     step_size=step_size,
...     max_range=max_range
... )
>>> theoretical_var = TheoreticalVariogram()
>>> theoretical_var.fit(
...     experimental_variogram=empirical_smv,
...     model_type='linear',
...     sill=np.var(REFERENCE_INPUT[:, -1]),
...     rang=5
... )
>>> print(theoretical_var)
* Selected model: Linear model
* Nugget: 0.0
* Sill: 4.2485207100591715
* Range: 5
* Spatial Dependency Strength is Undefined: nugget equal to 0, cannot estimate
* Mean Bias: 2.949918937899707
* Mean RMSE: 3.150422552980984
* Error-lag weighting method: None
+-----+--------------------+--------------------+--------------------+
| lag |    theoretical     |    experimental    |  bias (real-yhat)  |
+-----+--------------------+--------------------+--------------------+
| 1.0 | 0.8497041420118343 |       4.625        | 3.7752958579881657 |
| 2.0 | 1.6994082840236686 | 5.2272727272727275 | 3.527864443249059  |
| 3.0 | 2.549112426035503  |        6.0         | 3.450887573964497  |
| 4.0 | 3.3988165680473372 | 4.444444444444445  | 1.0456278763971074 |
+-----+--------------------+--------------------+--------------------+
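
The theoretical column in the table above is consistent with a linear model that grows proportionally with distance until the range is reached and stays at nugget plus sill afterwards. The sketch below reproduces those numbers under that assumption. linear_model is a hypothetical helper written for illustration, not the function used inside pyinterpolate.

```python
import numpy as np


def linear_model(lags, nugget: float, sill: float, rang: float) -> np.ndarray:
    """Linear semivariogram sketch: rises up to the range, then stays flat."""
    lags = np.asarray(lags, dtype=float)
    # Fraction of the range covered by each lag, capped at 1 beyond the range.
    fraction = np.minimum(lags / rang, 1.0)
    return nugget + sill * fraction


lags = np.array([1.0, 2.0, 3.0, 4.0])
# Parameters taken from the printed model above.
theoretical = linear_model(lags, nugget=0.0, sill=4.2485207100591715, rang=5)
print(theoretical)
```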
autofit(experimental_variogram: ExperimentalVariogram | ndarray, models_group: str | list = 'safe', nugget=None, min_nugget=0, max_nugget=0.5, number_of_nuggets=16, rang=None, min_range=0.1, max_range=0.5, number_of_ranges=16, sill=None, n_sill_values=5, sill_from_variance=False, min_sill=0.5, max_sill=2, number_of_sills=16, direction=None, error_estimator='rmse', deviation_weighting='equal', return_params=True) -> TheoreticalVariogramModel | None

Method finds the optimal nugget, range, sill, and model (function) of the theoretical semivariogram.

Parameters:
experimental_variogramExperimentalVariogram

Experimental variogram model or array with lags and semivariances.

models_groupstr or list, default=’safe’

Models group to test:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’]

  • as a list: multiple model types to test

  • as a single model type from:
    • ‘circular’,

    • ‘cubic’,

    • ‘exponential’,

    • ‘gaussian’,

    • ‘linear’,

    • ‘power’,

    • ‘spherical’.

nuggetfloat, optional

Nugget (bias) of a variogram. If given then it is fixed to this value.

min_nuggetfloat, default = 0

The minimum nugget as the ratio of the parameter to the first lag variance.

max_nuggetfloat, default = 0.5

The maximum nugget as the ratio of the parameter to the first lag variance.

number_of_nuggetsint, default = 16

How many equally spaced nuggets are tested between min_nugget and max_nugget.

rangfloat, optional

If given, then range is fixed to this value.

min_rangefloat, default = 0.1

The minimal fraction of a variogram range, 0 < min_range <= max_range.

max_rangefloat, default = 0.5

The maximum fraction of a variogram range, min_range <= max_range <= 1. A max_range greater than 0.5 raises a warning.

number_of_rangesint, default = 16

How many equally spaced ranges are tested between min_range and max_range.

sillfloat, default = None

Partial sill, or sill when nugget is set to zero. Total sill is a sum of partial sill and nugget. If given, then partial sill is fixed to this value.

n_sill_valuesint, default = 5

The last n experimental semivariance records for sill estimation. (Used only when sill_from_variance is set to False).

sill_from_variancebool, default = False

Estimate sill from the variance (semivariance at distance 0).

min_sillfloat, default = 0.5

The minimal fraction of the reference value chosen by the sill estimation method. The reference value is the mean of the last n_sill_values experimental semivariances when sill_from_variance is False, or the experimental variogram variance when sill_from_variance is True.

max_sillfloat, default = 2

The maximum fraction of the reference value chosen by the sill estimation method. The reference value is the mean of the last n_sill_values experimental semivariances when sill_from_variance is False, or the experimental variogram variance when sill_from_variance is True.

number_of_sillsint, default = 16

How many equally spaced sill values are tested between min_sill and max_sill.

directionfloat, in range [0, 360], default=None

The direction of a semivariogram. If None, the semivariogram is isotropic. This parameter is required if the experimental variogram is passed as a numpy array.

error_estimatorstr, default = ‘rmse’

A model error estimation method. Available options are:

  • ‘rmse’: Root Mean Squared Error,

  • ‘mae’: Mean Absolute Error,

  • ‘bias’: Forecast Bias,

  • ‘smape’: Symmetric Mean Absolute Percentage Error.

deviation_weightingstr, default = “equal”

The name of the method used to weight errors at given lags. Works only with RMSE. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within lag.

return_paramsbool, default = True

If True, the method returns the fitted model.

Returns:
theoretical_variogram_modelTheoreticalVariogramModel

See TheoreticalVariogramModel class.

Raises:
AttributeError

Raised when the method is invoked on an already calculated variogram.

ValueError

Raised when wrong nugget, range, or sill limits are passed.

KeyError

Raised when an unknown error type is provided.
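
The min_nugget, max_nugget and number_of_nuggets parameters describe a search grid. A minimal sketch of how such a grid could be built, assuming that "equally spaced" corresponds to np.linspace and that the fractions are scaled by the first-lag semivariance (4.625 in the examples on this page). The variable names are illustrative, and the library may construct its grid differently.

```python
import numpy as np

# First-lag experimental semivariance, copied from the example tables on this page.
first_lag_semivariance = 4.625
min_nugget, max_nugget, number_of_nuggets = 0.0, 0.5, 16

# Equally spaced fractions, scaled to nugget values (assumed construction).
nugget_fractions = np.linspace(min_nugget, max_nugget, number_of_nuggets)
nugget_candidates = nugget_fractions * first_lag_semivariance
print(nugget_candidates)
```

Under this reading, the nugget of 1.85 reported in the build_theoretical_variogram example equals 0.4 times 4.625, which is one of the candidates on this grid.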

calculate_model_error(fitted_values: ndarray, rmse=True, bias=True, mae=True, smape=True, deviation_weighting='equal') -> dict

Method calculates error associated with a difference between the theoretical model and the experimental semivariances.

Parameters:
fitted_valuesnumpy array
rmsebool, default=True

Root Mean Squared Error of a model.

biasbool, default=True

Forecast Bias of a model.

maebool, default=True

Mean Absolute Error of a model.

smapebool, default=True

Symmetric Mean Absolute Percentage Error of a model.

deviation_weightingstr, default = “equal”

The name of the method used to weight errors at given lags. Works only with RMSE. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within a lag.

Returns:
model_errorsDict

Computed errors: rmse, bias, mae, smape.

Raises:
MetricsTypeSelectionError

User has set all error types to False.
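
The four metrics can be sketched with NumPy as follows. This is an unweighted illustration rather than the library's implementation: model_errors is a hypothetical helper, the residual sign follows the "bias (real-yhat)" column of the example tables, and the SMAPE variant bounded by 100% is an assumption based on the documented value range.

```python
import numpy as np


def model_errors(experimental, fitted) -> dict:
    """Unweighted error metrics between experimental and fitted semivariances."""
    experimental = np.asarray(experimental, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    residuals = experimental - fitted  # same sign convention as "bias (real-yhat)"
    # SMAPE variant bounded by 100%; assumes no pair where both values are zero.
    smape = 100 * np.mean(
        np.abs(residuals) / (np.abs(experimental) + np.abs(fitted))
    )
    return {
        "rmse": float(np.sqrt(np.mean(residuals ** 2))),
        "bias": float(np.mean(residuals)),
        "mae": float(np.mean(np.abs(residuals))),
        "smape": float(smape),
    }


# Values copied from the TheoreticalVariogram example table on this page.
experimental = np.array([4.625, 5.2272727272727275, 6.0, 4.444444444444445])
fitted = np.array([0.8497041420118343, 1.6994082840236686,
                   2.549112426035503, 3.3988165680473372])
errors = model_errors(experimental, fitted)
print(errors)
```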

fit(experimental_variogram: ExperimentalVariogram | ndarray, model_type: str, sill: float, rang: float, nugget=0.0, direction=None) -> Tuple[ndarray, dict]

Fits a theoretical model to experimental semivariances.

Parameters:
experimental_variogramExperimentalVariogram

Experimental variogram model or array with lags and semivariances.

model_typestr

The name of the model to check. Available models:

  • ‘circular’,

  • ‘cubic’,

  • ‘exponential’,

  • ‘gaussian’,

  • ‘linear’,

  • ‘power’,

  • ‘spherical’.

sillfloat

Partial sill, or sill when nugget is set to zero. Total sill is a sum of partial sill and nugget.

rangfloat

The semivariogram range is the distance up to which spatial correlation exists. It shouldn’t be set to a distance larger than half of the study extent.

nuggetfloat, default=0.

Nugget parameter (bias at a zero distance).

directionfloat, in range [0, 360], default=None

The direction of a semivariogram. If None, the semivariogram is isotropic. This parameter is required if the experimental variogram is passed as a numpy array.

Returns:
theoretical_values, error: Tuple[numpy array, dict]

Theoretical semivariances and a dictionary of model errors with the keys 'rmse', 'bias', 'smape' and 'mae'.

Raises:
AttributeError

Semivariogram parameters could be overwritten.

from_dict(parameters: dict)

Method updates model with a given set of parameters.

Parameters:
parametersDict

Dictionary with model’s 'variogram_model_type', 'nugget', 'sill', 'rang' and 'direction'.

from_json(fname: str)

Method reads data from a JSON file.

Parameters:
fnamestr

JSON file name.

property name

Returns theoretical model name.

plot(experimental=True)

Method plots theoretical semivariogram.

Parameters:
experimentalbool

If True, plots experimental observations together with the theoretical semivariogram.

Raises:
AttributeError

Model is not fitted yet, nothing to plot.

predict(distances: ndarray) -> ndarray

Method predicts semivariances from distances using fitted semivariogram model.

Parameters:
distancesnumpy array

Distances between points.

Returns:
predictednumpy array

Predicted semivariances.

to_dict() -> dict

Method exports the theoretical variogram parameters to a dictionary.

Returns:
model_parametersDict

Dictionary with model’s 'variogram_model_type', 'nugget', 'sill', 'rang' and 'direction'.

Raises:
AttributeError

The model parameters have not been derived yet.

to_json(fname: str)

Method stores semivariogram parameters into a JSON file.

Parameters:
fnamestr

JSON file name.
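
A sketch of the kind of round trip that to_json() and from_json() provide, using only the standard library. It assumes that the JSON file holds the same keys as the dictionary described for to_dict(); the actual file layout written by pyinterpolate may differ. The parameter values and the file name are illustrative.

```python
import json
import os
import tempfile

# Illustrative parameter values; the keys follow the to_dict() description.
model_parameters = {
    "variogram_model_type": "linear",
    "nugget": 0.0,
    "sill": 4.2485207100591715,
    "rang": 5,
    "direction": None,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    fname = os.path.join(tmp_dir, "variogram_model.json")  # hypothetical file name
    # Store the parameters, then read them back.
    with open(fname, "w", encoding="utf-8") as f:
        json.dump(model_parameters, f)
    with open(fname, "r", encoding="utf-8") as f:
        restored = json.load(f)

print(restored == model_parameters)
```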

pyinterpolate.build_theoretical_variogram(experimental_variogram: ExperimentalVariogram | ndarray, models_group: str | list = 'safe', nugget=None, min_nugget=0, max_nugget=0.5, number_of_nuggets=16, rang=None, min_range=0.1, max_range=0.5, number_of_ranges=16, sill=None, min_sill=0.0, max_sill=1, number_of_sills=16, direction=None, error_estimator='rmse', deviation_weighting='equal') -> TheoreticalVariogram

Function creates Theoretical Variogram.

Parameters:
experimental_variogramExperimentalVariogram

Experimental variogram model or array with lags and semivariances.

models_groupstr or list, default=’safe’

Models group to test:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’]

  • as a list: multiple model types to test

  • as a single model type from:
    • ‘circular’,

    • ‘cubic’,

    • ‘exponential’,

    • ‘gaussian’,

    • ‘linear’,

    • ‘power’,

    • ‘spherical’.

nuggetfloat, optional

Nugget (bias) of a variogram. If given then it is fixed to this value.

min_nuggetfloat, default = 0

The minimum nugget as the ratio of the parameter to the first lag variance.

max_nuggetfloat, default = 0.5

The maximum nugget as the ratio of the parameter to the first lag variance.

number_of_nuggetsint, default = 16

How many equally spaced nuggets are tested between min_nugget and max_nugget.

rangfloat, optional

If given, then range is fixed to this value.

min_rangefloat, default = 0.1

The minimal fraction of a variogram range, 0 < min_range <= max_range.

max_rangefloat, default = 0.5

The maximum fraction of a variogram range, min_range <= max_range <= 1. A max_range greater than 0.5 raises a warning.

number_of_rangesint, default = 16

How many equally spaced ranges are tested between min_range and max_range.

sillfloat, default = None

Partial sill, or sill when nugget is set to zero. Total sill is a sum of partial sill and nugget. If given, then partial sill is fixed to this value.

min_sillfloat, default = 0

The minimal fraction of the variogram variance at lag 0 to find partial sill, 0 <= min_sill <= max_sill.

max_sillfloat, default = 1

The maximum fraction of the variogram variance at lag 0 to find partial sill. It should be lower than or equal to 1. Values above 1 are allowed, but then a warning is printed.

number_of_sillsint, default = 16

How many equally spaced sill values are tested between min_sill and max_sill.

directionfloat, in range [0, 360], default=None

The direction of a semivariogram. If None, the semivariogram is isotropic. This parameter is required if the experimental variogram is passed as a numpy array.

error_estimatorstr, default = ‘rmse’

A model error estimation method. Available options are:

  • ‘rmse’: Root Mean Squared Error,

  • ‘mae’: Mean Absolute Error,

  • ‘bias’: Forecast Bias,

  • ‘smape’: Symmetric Mean Absolute Percentage Error.

deviation_weightingstr, default = “equal”

The name of the method used to weight errors at given lags. Works only with RMSE. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within lag.

Returns:
theo_varTheoreticalVariogram

Fitted theoretical semivariogram.

Examples

>>> import numpy as np
>>> from pyinterpolate import (
...     ExperimentalVariogram,
...     build_theoretical_variogram
... )
>>>
>>>
>>> REFERENCE_INPUT = np.array([
...    [0, 0, 8],
...    [1, 0, 6],
...    [2, 0, 4],
...    [3, 0, 3],
...    [4, 0, 6],
...    [5, 0, 5],
...    [6, 0, 7],
...    [7, 0, 2],
...    [8, 0, 8],
...    [9, 0, 9],
...    [10, 0, 5],
...    [11, 0, 6],
...    [12, 0, 3]
...    ])
>>> step_size = 1
>>> max_range = 4.1
>>> empirical_smv = ExperimentalVariogram(
...     values=REFERENCE_INPUT[:, -1],
...     geometries=REFERENCE_INPUT[:, :-1],
...     step_size=step_size,
...     max_range=max_range
... )
>>> theoretical_var = build_theoretical_variogram(
...     experimental_variogram=empirical_smv
... )
>>> print(theoretical_var)
* Selected model: Linear model
* Nugget: 1.85
* Sill: 3.3827861952861946
* Range: 1.2000000000000002
* Spatial Dependency Strength is moderate
* Mean Bias: None
* Mean RMSE: 0.5504691463605988
* Error-lag weighting method: equal
+-----+-------------------+--------------------+-----------------------+
| lag |    theoretical    |    experimental    |    bias (real-yhat)   |
+-----+-------------------+--------------------+-----------------------+
| 1.0 | 4.668988496071829 |       4.625        |  -0.04398849607182864 |
| 2.0 | 5.232786195286195 | 5.2272727272727275 | -0.005513468013467637 |
| 3.0 | 5.232786195286195 |        6.0         |   0.7672138047138048  |
| 4.0 | 5.232786195286195 | 4.444444444444445  |  -0.7883417508417505  |
+-----+-------------------+--------------------+-----------------------+
pyinterpolate.calculate_spatial_dependence_index(nugget: float, sill: float) -> Tuple

Function estimates the spatial dependence ratio and its descriptive strength.

Parameters:
nuggetfloat

Semivariogram nugget.

sillfloat

Partial sill, the difference between the total sill and the nugget.

Returns:
Tuple[float, str]

ratio, descriptive spatial dependency strength

Raises:
ValueError

Nugget is equal to zero.

References

[1] CAMBARDELLA, C.A.; MOORMAN, T.B.; PARKIN, T.B.; KARLEN, D.L.; NOVAK, J.M.; TURCO, R.F.; KONOPKA, A.E. Field-scale variability of soil properties in central Iowa soils. Soil Science Society of America Journal, v. 58, n. 5, p. 1501-1511, 1994.

Examples

>>> from pyinterpolate import calculate_spatial_dependence_index
>>>
>>>
>>> ratio_percent, strength = calculate_spatial_dependence_index(
...     nugget=0.1,
...     sill=0.9
... )
>>> print((ratio_percent, strength))
(10.0, 'strong')
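
The output above (10.0 for nugget=0.1 and sill=0.9) is consistent with expressing the nugget as a percentage of the total sill, that is nugget / (nugget + sill) * 100. A minimal sketch under that assumption; dependence_ratio is a hypothetical helper and not the library function.

```python
def dependence_ratio(nugget: float, sill: float) -> float:
    """Nugget as a percentage of the total sill (nugget + partial sill)."""
    if nugget == 0:
        # Mirrors the documented ValueError for a zero nugget.
        raise ValueError("Nugget is equal to zero, cannot estimate the ratio.")
    return nugget / (nugget + sill) * 100


ratio = dependence_ratio(nugget=0.1, sill=0.9)
print(ratio)
```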