Semivariogram Deconvolution#

Deconvolution#

class pyinterpolate.Deconvolution(verbose=True, store_models=False)[source]

Class performs deconvolution of semivariogram of areal data. Whole procedure is based on the iterative process described in: [1].

Steps to regularize semivariogram:

  • initialize your object (no parameters),

  • use fit() method to build the initial point support model,

  • use transform() method to perform the semivariogram regularization,

  • save deconvoluted semivariogram model with export_model() method.

Parameters:
verbosebool, default = True

Print process steps into a terminal.

store_modelsbool, default = False

Should theoretical and regularized models be stored after each iteration.

Attributes:
psPointSupport

Point support data.

blocksBlocks

Blocks with aggregated data.

deviationDeviation

Deviation object tracking the fit error between the initial model and regularized model.

min_deviation_ratiofloat

Minimal ratio of model deviation to initial deviation when algorithm is stopped. Parameter must be set within the limits (0, 1). Set in transform() method.

min_deviation_decreasefloat

The minimum ratio of the difference between model deviation and optimal deviation to the optimal deviation: |dev - opt_dev| / opt_dev. Parameter must be set within the limits (0, 1). Set in transform() method.

reps_deviation_decreaseint

How many repetitions of small deviation decrease before termination of the algorithm.

weightsList

Weights applied to each lag during each regularization iteration.

final_theoretical_modelTheoreticalVariogram

Optimal model with the lowest deviation between this model and initial theoretical variogram derived from blocks.

final_regularized_variogramnumpy array

[lag, semivariance]

is_fitbool, default = False

Was model fitted? (The first step of regularization).

is_transformedbool, default = False

Was model transformed?

Methods

fit()

Fits areal data and the point support data into a model, initializes the experimental semivariogram, the theoretical semivariogram model, regularized point support model, and deviation.

transform()

Performs semivariogram regularization.

fit_transform()

Performs fit() and transform() at once.

export_model()

Exports regularized (or fitted) model.

plot_variograms()

Plots semivariances before and after regularization.

plot_deviations()

Plots each deviation divided by the initial deviation.

plot_weights()

Plots the mean weight value per lag.

References

[1] Goovaerts P., Kriging and Semivariogram Deconvolution in the Presence of Irregular Geographical Units, Mathematical Geology 40(1), 101-128, 2008

Examples

>>> import os
>>> import geopandas as gpd
>>> from pyinterpolate import (
...     Blocks,
...     Deconvolution,
...     PointSupport,
... )
>>>
>>>
>>> FILENAME = 'cancer_data.gpkg'
>>> LAYER_NAME = 'areas'
>>> DS = gpd.read_file(FILENAME, layer=LAYER_NAME)
>>> AREA_VALUES = 'rate'
>>> AREA_INDEX = 'FIPS'
>>> AREA_GEOMETRY = 'geometry'
>>> PS_LAYER_NAME = 'points'
>>> PS_VALUES = 'POP10'
>>> PS_GEOMETRY = 'geometry'
>>> PS = gpd.read_file(FILENAME, layer=PS_LAYER_NAME)
>>>
>>> CANCER_DATA = {
...    'ds': DS,
...    'index_column_name': AREA_INDEX,
...    'value_column_name': AREA_VALUES,
...    'geometry_column_name': AREA_GEOMETRY
... }
>>> POINT_SUPPORT_DATA = {
...     'ps': PS,
...     'value_column_name': PS_VALUES,
...     'geometry_column_name': PS_GEOMETRY
... }
>>> BLOCKS = Blocks(**CANCER_DATA)
>>>
>>> PS = PointSupport(
...     points=POINT_SUPPORT_DATA['ps'],
...     ps_blocks=BLOCKS,
...     points_value_column=POINT_SUPPORT_DATA['value_column_name'],
...     points_geometry_column=POINT_SUPPORT_DATA['geometry_column_name']
... )
>>> dcv = Deconvolution(verbose=False)
>>> dcv.fit(
...     blocks=BLOCKS,
...     point_support=PS,
...     step_size=40000,
...     max_range=300001
... )
>>> dcv.transform(max_iters=5)
>>> print(dcv.final_theoretical_model)
* Selected model: Spherical model
* Nugget: 0.0
* Sill: 157.72180532432975
* Range: 28000.0
* Spatial Dependency Strength is Undefined: nugget equal to 0, cannot estimate
* Mean Bias: None
* Mean RMSE: 16.524700893642784
* Error-lag weighting method: closest
+----------+--------------------+--------------------+---------------------+
|   lag    |    theoretical     |    experimental    |   bias (real-yhat)  |
+----------+--------------------+--------------------+---------------------+
| 20000.0  | 140.2482525478734  | 193.16205582305136 |  52.91380327517797  |
| 40000.0  | 157.72180532432975 | 138.25751906267658 |  -19.46428626165317 |
| 60000.0  | 157.72180532432975 |  149.997349965762  |  -7.724455358567752 |
| 80000.0  | 157.72180532432975 | 142.84364743170343 |  -14.87815789262632 |
| 100000.0 | 157.72180532432975 | 142.4222925698438  |  -15.29951275448596 |
| 120000.0 | 157.72180532432975 | 154.47000840562424 | -3.2517969187055087 |
| 140000.0 | 157.72180532432975 | 145.67546701766707 | -12.046338306662676 |
| 160000.0 | 157.72180532432975 | 161.2524338627573  |  3.5306285384275498 |
| 180000.0 | 157.72180532432975 | 144.86386526565153 | -12.857940058678224 |
| 200000.0 | 157.72180532432975 | 141.46845881256164 | -16.253346511768115 |
| 220000.0 | 157.72180532432975 | 126.10100191843915 | -31.620803405890598 |
| 240000.0 | 157.72180532432975 | 142.37493167303575 | -15.346873651294004 |
| 260000.0 | 157.72180532432975 | 125.00372185725449 |  -32.71808346707526 |
| 280000.0 | 157.72180532432975 | 122.2260745900828  |  -35.49573073424695 |
+----------+--------------------+--------------------+---------------------+
>>> dcv.export_model_to_json('results.csv')
export_model_to_json(fname: str)[source]

Function exports final theoretical model.

Parameters:
fnamestr

File name for model parameters (nugget, sill, range, model type)

Raises:
RunetimeError

A model hasn’t been transformed yet.

fit(blocks: Blocks, point_support: PointSupport, step_size: float, max_range: float, nugget: float = None, direction: float = None, tolerance: float = None, variogram_weighting_method: str = 'closest', models_group: str | list = 'safe') None[source]

Fits the blocks semivariogram into the point support semivariogram. The initial step of regularization.

Parameters:
blocksBlocks

Aggregated data | choropleth map.

point_supportPointSupport

Point support of blocks.

step_sizefloat

Step size between lags - estimated for blocks.

max_rangefloat

Maximal distance of analysis - estimated for blocks.

nuggetfloat, default = 0

The nugget of a data - estimated for blocks.

directionfloat (in range [0, 360]), optional

Direction of blocks semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

tolerancefloat (in range [0, 1]), optional

If tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If tolerance is > 0 then the points are selected in elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == step_size.

  • The minor axis size is tolerance * step_size

  • The baseline point is at a center of the ellipse.

  • The tolerance == 1 creates an omnidirectional semivariogram.

variogram_weighting_methodstr, default = "closest"

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger custom_weights,

  • distant: lags that are further away have bigger custom_weights,

  • dense: error is weighted by the number of point pairs within lag.

models_groupstr or list, default=’safe’

Models group to test:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’]

  • as a list: multiple model types to test

  • as a single model type from:
    • ‘circular’,

    • ‘cubic’,

    • ‘exponential’,

    • ‘gaussian’,

    • ‘linear’,

    • ‘power’,

    • ‘spherical’.

fit_transform(blocks: Blocks, point_support: PointSupport, step_size: float, max_range: float, nugget: float = None, direction: float = None, tolerance: float = None, variogram_weighting_method: str = 'closest', models_group: str | list = 'safe', max_iters=25, limit_deviation_ratio=0.1, minimum_deviation_decrease=0.01, reps_deviation_decrease=3)[source]

Method performs fit() and transform() operations at once.

Fits the blocks semivariogram into the point support semivariogram. The initial step of regularization.

Parameters:
blocksBlocks

Aggregated data | choropleth map.

point_supportPointSupport

Point support of blocks.

step_sizefloat

Step size between lags - estimated for blocks.

max_rangefloat

Maximal distance of analysis - estimated for blocks.

nuggetfloat, default = 0

The nugget of a data - estimated for blocks.

directionfloat (in range [0, 360]), optional

Direction of blocks semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

tolerancefloat (in range [0, 1]), optional

If tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == step_size.

  • The minor axis size is tolerance * step_size

  • The baseline point is at a center of the ellipse.

  • The tolerance == 1 creates an omnidirectional semivariogram.

variogram_weighting_methodstr, default = "closest"

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger custom_weights,

  • distant: lags that are further away have bigger custom_weights,

  • dense: error is weighted by the number of point pairs within lag.

models_groupstr or list, default=’safe’

Models group to test:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’]

  • as a list: multiple model types to test

  • as a single model type from:
    • ‘circular’,

    • ‘cubic’,

    • ‘exponential’,

    • ‘gaussian’,

    • ‘linear’,

    • ‘power’,

    • ‘spherical’.

max_itersint, default = 25

Maximum number of iterations.

limit_deviation_ratiofloat, default = 0.01

Minimal ratio of model deviation to initial deviation when algorithm is stopped. Parameter must be set within the limits (0, 1).

minimum_deviation_decreasefloat, default = 0.001

The minimum ratio of the difference between model deviation and optimal deviation to the optimal deviation: |dev - opt_dev| / opt_dev. The parameter must be set within the limits (0, 1).

reps_deviation_decreaseint, default = 3

How many repetitions of small deviation decrease before termination of the algorithm.

plot_variograms()[source]

Function shows experimental semivariogram, theoretical semivariogram and regularized semivariogram after semivariogram regularization (transforming).

transform(max_iters=25, min_deviation_ratio=0.1, min_deviation_decrease=0.01, reps_deviation_decrease=3)[source]

Method performs semivariogram regularization after model fitting.

Parameters:
max_itersint, default = 25

Maximum number of iterations.

min_deviation_ratiofloat, default = 0.1

Minimal ratio of model deviation to initial deviation when algorithm is stopped. Parameter must be set within the limits (0, 1).

min_deviation_decreasefloat, default = 0.01

The minimum ratio of the difference between model deviation and optimal deviation to the optimal deviation: |dev - opt_dev| / opt_dev. Parameter must be set within the limits (0, 1).

reps_deviation_decreaseint, default = 3

How many repetitions of small deviation decrease before termination of the algorithm.

Raises:
AttributeError

initial_regularized_model is undefined (user didn’t perform fit() method).

ValueError

limit_deviation_ratio or minimum_deviation_decrease parameters <= 0 or >= 1.

Deviation#

class pyinterpolate.Deviation(theoretical_semivariances: ndarray, regularized_semivariances: ndarray, method: str = 'mrd')[source]

Regularization process deviation calculation and monitoring. Deviation is defined as absolute difference between the theoretical semivariances and the regularized semivariances divided by the regularized semivariances.

Parameters:
theoretical_semivariancesnumpy array

Predicted semivariances

regularized_semivariancesnumpy array

Regualrized semivariances

methodstr, default=`’mrd’`

Deviation monitoring method, available options:

  • ‘mrd’ - mean relative difference

  • ‘smrd’ - symmetric mean relative difference

  • ‘rmse’ - root mean squared error

Attributes:
methodstr

See method parameter.

initial_deviationfloat

See initial_deviation parameter.

deviationslist

The list of deviations. The first element is the initial deviation.

optimal_deviationfloat

Minimal deviation. During the first iteration it is equal to the initial deviation. It is updated only when the current deviation is lower than the optimal deviation.

Methods

current_deviation_decrease(), property

Current deviation decrease (current deviation minus the lowest deviation).

current_ratio(), property

Ratio of current deviation to initial deviation.

calculate_deviation_decrease()

Current deviation decrease (current deviation minus the lowest deviation).

calculate_deviation_ratio()

Ratio of current deviation to initial deviation.

deviation_direction()

Returns True if deviation is increasing, False if deviation is decreasing.

normalize()

Ratios of current deviations to initial deviations. (Element-wise).

plot()

Plots the deviations.

Raises:
KeyErrorUser provides unsupported deviation method name.

Examples

>>> import numpy as np
>>> from pyinterpolate import (
...     build_experimental_variogram,
...     build_theoretical_variogram
... )
>>> from pyinterpolate.semivariogram.deconvolution.deviation import Deviation
>>>
>>>
>>> ref_input = np.array([
...         [0, 0, 8],
...         [1, 0, 6],
...         [2, 0, 4],
...         [3, 0, 3],
...         [4, 0, 6],
...         [5, 0, 5],
...         [6, 0, 7],
...         [7, 0, 2],
...         [8, 0, 8],
...         [9, 0, 9],
...         [10, 0, 5],
...         [11, 0, 6],
...         [12, 0, 3]
...     ])
>>> step_size = 1
>>> max_range = 4
>>>
>>> experimental = build_experimental_variogram(
...     values=ref_input[:, -1],
...     geometries=ref_input[:, :-1],
...     step_size=step_size,
...     max_range=max_range
... )
>>>
>>> regularized = build_experimental_variogram(
...     values=ref_input[:, -1] + 2,
...     geometries=ref_input[:, :-1],
...     step_size=step_size,
...     max_range=max_range
... )
>>>
>>> dv = Deviation(
...     theoretical_semivariances=theoretical.yhat,
...     regularized_semivariances=regularized.semivariances
... )
>>> print(dv.optimal_deviation)
0.05754045538774986
calculate_deviation_decrease()[source]

Deviation decrease (current deviation minus the lowest deviation).

Returns:
: float
calculate_deviation_ratio()[source]

Ratio of current deviation to initial deviation.

Returns:
: float
property current_deviation_decrease

Current deviation decrease (current deviation minus the lowest deviation).

Returns:
: float
property current_ratio

Ratio of current deviation to initial deviation.

Returns:
: float
deviation_direction() bool[source]

Returns False if deviation is decreasing, True if deviation is increasing.

Returns:
: bool
normalize()[source]

Normalizes the deviations by dividing them by the initial deviation.

Returns:
: numpy array

Normalized array of deviations.

set_current_as_optimal()[source]

Sets current deviation as optimal deviation. It is used when the current deviation is lower than the optimal deviation.

update(theoretical_model: ndarray, regularized_variances: ndarray)[source]

Updates the deviation list with the current deviation.

Parameters:
theoretical_modelnumpy array

The initial semivariances.

regularized_variancesnumpy array

Regularized semivariances.

Aggregated Variogram#

pyinterpolate.regularize(blocks: Blocks, point_support: PointSupport, step_size: float, max_range: float, nugget: float = 0, direction: float = None, tolerance: float = None, average_inblock_semivariances: ndarray = None, semivariance_between_point_supports: ndarray = None, experimental_block_semivariances: ndarray = None, theoretical_block_model: TheoreticalVariogram = None, variogram_weighting_method: str = 'closest', verbose: bool = False, log_process: bool = False) ndarray[source]

Function is an alias for AggregatedVariogram class and performs semivariogram regularization. Function returns regularized variogram.

Parameters:
blocksBlocks

Aggregated data | choropleth map.

point_supportPointSupport

Point support of blocks.

step_sizefloat

Step size between lags.

max_rangefloat

Maximal distance of analysis.

nuggetfloat, default = 0

The nugget of blocks data.

directionfloat (in range [0, 360]), optional, default=0

Direction of blocks semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

tolerancefloat (in range [0, 1]), optional

If tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == step_size.

  • The minor axis size is tolerance * step_size

  • The baseline point is at a center of the ellipse.

  • The tolerance == 1 creates an omnidirectional semivariogram.

average_inblock_semivariancesnp.ndarray, default = None

The mean semivariance between the blocks. See Notes to learn more.

semivariance_between_point_supportsnp.ndarray, default = None

Semivariance between all blocks calculated from the theoretical model.

experimental_block_semivariancesnp.ndarray, default = None

The experimental semivariance between area coordinates.

theoretical_block_modelTheoreticalVariogram, default = None

A modeled variogram.

variogram_weighting_methodstr, default = “closest”

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger custom_weights,

  • distant: lags that are further away have bigger custom_weights,

  • dense: error is weighted by the number of point pairs within a lag - more pairs, smaller weight.

verbosebool, default = False

Print steps performed by the algorithm.

log_processbool, default = False

Log process info (Level DEBUG).

Returns:
regularizednumpy array

[lag, semivariance]

See also

AggregatedVariogram

core class for block semivariogram regularization

References

[1] Goovaerts P., Kriging and Semivariogram Deconvolution in the Presence of Irregular Geographical Units, Mathematical Geology 40(1), 101-128, 2008

Examples

>>> import os
>>> import geopandas as gpd
>>> from pyinterpolate import (
...     regularize,
...     Blocks,
...     PointSupport,
... )
>>>
>>>
>>> FILENAME = 'cancer_data.gpkg'
>>> LAYER_NAME = 'areas'
>>> DS = gpd.read_file(FILENAME, layer=LAYER_NAME)
>>> AREA_VALUES = 'rate'
>>> AREA_INDEX = 'FIPS'
>>> AREA_GEOMETRY = 'geometry'
>>> PS_LAYER_NAME = 'points'
>>> PS_VALUES = 'POP10'
>>> PS_GEOMETRY = 'geometry'
>>> PS = gpd.read_file(FILENAME, layer=PS_LAYER_NAME)
>>>
>>> CANCER_DATA = {
...    'ds': DS,
...    'index_column_name': AREA_INDEX,
...    'value_column_name': AREA_VALUES,
...    'geometry_column_name': AREA_GEOMETRY
... }
>>> POINT_SUPPORT_DATA = {
...     'ps': PS,
...     'value_column_name': PS_VALUES,
...     'geometry_column_name': PS_GEOMETRY
... }
>>> BLOCKS = Blocks(**CANCER_DATA)
>>> indexes = BLOCKS.block_indexes
>>>
>>> PS = PointSupport(
...     points=POINT_SUPPORT_DATA['ps'],
...     ps_blocks=BLOCKS,
...     points_value_column=POINT_SUPPORT_DATA['value_column_name'],
...     points_geometry_column=POINT_SUPPORT_DATA['geometry_column_name']
... )
>>> STEP_SIZE = 20000
>>> MAX_RANGE = 300000
>>> reg_variogram = regularize(
...     blocks=BLOCKS,
...     point_support=PS,
...     step_size=STEP_SIZE,
...     max_range=MAX_RANGE
... )
>>> print(reg_variogram[0])
[20000.            53.13549729]
class pyinterpolate.AggregatedVariogram(blocks: Blocks, point_support: PointSupport, step_size: float, max_range: float, nugget: float = 0, direction: float = None, tolerance: float = None, variogram_weighting_method: str = 'closest', models_group: str | list = 'safe', verbose: bool = False, log_process: bool = False)[source]

Class calculates semivariances between blocks of aggregated data.

Parameters:
blocksBlocks

Aggregated data | choropleth map.

point_supportPointSupport

Point supports of blocks.

step_sizefloat

Step size between lags.

max_rangefloat

Maximal distance of analysis.

nuggetfloat, default = 0

The nugget of blocks data.

directionfloat (in range [0, 360]), optional

Direction of blocks semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

tolerancefloat (in range [0, 1]), optional

If tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

variogram_weighting_methodstr, default = “closest”

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger custom_weights,

  • dense: error is weighted by the number of point pairs within a lag - more pairs, lower weight.

models_groupstr or list, default=’safe’

Models group to test:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - ['linear', 'power', 'spherical']

  • as a list: multiple model types to test

  • as a single model type from the list:
    • ‘circular’,

    • ‘cubic’,

    • ‘exponential’,

    • ‘gaussian’,

    • ‘linear’,

    • ‘power’,

    • ‘spherical’.

verbosebool, default = False

Print steps performed by the algorithm.

log_processbool, default = False

Log process info (Level DEBUG).

Attributes:
blocksBlocks

See aggregared_data parameter.

step_sizefloat

See step_size parameter.

max_rangefloat

See max_range parameter.

agg_lagsnumpy array

Lags calculated as a set of equidistant points from step_size to max_range with a step of size step_size.

tolerancefloat, default = 1

See tolerance parameter.

directionfloat, default = 0

See direction parameter.

nuggetfloat, default = 0

See nugget parameter.

models_groupstr or list

See models_group parameter.

point_supportPointSupport

See the point_support parameter.

weighting_methodstr, default = ‘closest’

See the variogram_weighting_method parameter.

verbosebool, default = False

See verbose parameter.

log_processbool, default = False

See the log_process parameter.

experimental_variogramExperimentalVariogram

The experimental variogram calculated from blocks (their coordinates).

theoretical_modelTheoreticalVariogram

The theoretical model derived from blocks’ coordinates.

inblock_semivarianceDict

{area id: the average inblock semivariance}

avg_inblock_semivariancenumpy array

[lag, average inblocks semivariances, number of blocks within a lag]

block_to_block_semivarianceDict

{(block i, block j): [distance, semivariance, number of point pairs between blocks]}

avg_block_to_block_semivariancenumpy array

[lag, semivariance, number of point pairs between blocks]

regularized_variogramnumpy array

[lag, semivariance]

distances_between_blocksDict

Weighted distances between all blocks: {block id : [distances to other blocks]}

Methods

regularize()

Method performs semivariogram regularization.

show_semivariograms()

Shows experimental variogram, theoretical model, average inblock semivariance, average block to block semivariance and regularized variogram.

References

[1] Goovaerts P., Kriging and Semivariogram Deconvolution in the Presence of Irregular Geographical Units, Mathematical Geology 40(1), 101-128, 2008

Examples

>>> import os
>>> import geopandas as gpd
>>> from pyinterpolate import (
...     AggregatedVariogram,
...     Blocks,
...     PointSupport
... )
>>>
>>>
>>> FILENAME = 'cancer_data.gpkg'
>>> LAYER_NAME = 'areas'
>>> DS = gpd.read_file(FILENAME, layer=LAYER_NAME)
>>> AREA_VALUES = 'rate'
>>> AREA_INDEX = 'FIPS'
>>> AREA_GEOMETRY = 'geometry'
>>> PS_LAYER_NAME = 'points'
>>> PS_VALUES = 'POP10'
>>> PS_GEOMETRY = 'geometry'
>>> PS = gpd.read_file(FILENAME, layer=PS_LAYER_NAME)
>>>
>>> CANCER_DATA = {
...    'ds': DS,
...    'index_column_name': AREA_INDEX,
...    'value_column_name': AREA_VALUES,
...    'geometry_column_name': AREA_GEOMETRY
... }
>>> POINT_SUPPORT_DATA = {
...     'ps': PS,
...     'value_column_name': PS_VALUES,
...     'geometry_column_name': PS_GEOMETRY
... }
>>> BLOCKS = Blocks(**CANCER_DATA)
>>> indexes = BLOCKS.block_indexes
>>>
>>> PS = PointSupport(
...     points=POINT_SUPPORT_DATA['ps'],
...     ps_blocks=BLOCKS,
...     points_value_column=POINT_SUPPORT_DATA['value_column_name'],
...     points_geometry_column=POINT_SUPPORT_DATA['geometry_column_name']
... )
>>> STEP_SIZE = 20000
>>> MAX_RANGE = 300000
>>> ag = AggregatedVariogram(
...     blocks=BLOCKS,
...     point_support=PS,
...     step_size=STEP_SIZE,
...     max_range=MAX_RANGE
... )
>>> reg_variogram = ag.regularize()
>>> print(reg_variogram[0])
[20000.            53.13549729]
calculate_avg_inblock_semivariance() ndarray[source]

Method calculates the average semivariance within blocks \(\gamma_h(v, v)\). The average inblock semivariance is calculated as:

\[\gamma_h(v, v) = \frac{1}{(2*N(h))} \sum_{a=1}^{N(h)} [\gamma(v_{a}, v_{a}) + \gamma(v_{a+h}, v_{a+h})]\]

where:

  • \(\gamma(v_{a}, v_{a})\) is a semivariance within a block \(a\),

  • \(\gamma(v_{a+h}, v_{a+h})\) is samivariance within a block at a distance \(h\) from the block \(a\).

Returns:
avg_inblock_semivariancenumpy array

[lag, semivariance, number of block pairs]

calculate_avg_semivariance_between_blocks() ndarray[source]

Function calculates semivariance between areas based on their division into smaller blocks. It is \(\gamma(v, v_h)\) - semivariogram value between any two blocks separated by the distance \(h\).

Returns:
avg_block_to_block_semivariancenumpy array

The average semivariance between the neighboring blocks point-supports: [lag, semivariance, number of block pairs within a range].

Notes

Block-to-block semivariance is calculated as:

\[\gamma(v_{a}, v_{a+h})=\frac{1} {P_{a}P_{a+h}}\sum_{s=1}^{P_{a}}\sum_{s'=1}^{P_{a+h}}\gamma(u_{s}, u_{s'})\]

where:

  • \(\gamma(v_{a}, v_{a+h})\) - block-to-block semivariance of the block \(a\) and paired block \(a+h\).

  • \(P_{a}\) and \(P_{a+h}\) - number of support points within the block \(a\) and block \(a+h\).

  • \(\gamma(u_{s}, u_{s'})\) - semivariance of point supports between blocks.

Then average block-to-block semivariance is calculated as:

\[\gamma_{h}(v, v_{h}) = \frac{1}{N(h)}\sum_{a=1}^{N(h)}\gamma(v_{a}, v_{a+h})\]

where:

  • \(\gamma_{h}(v, v_{h})\) - averaged block-to-block semivariances for a lag \(h\),

  • \(\gamma(v_{a}, v_{a+h})\) - semivariance of block \(a\) and paired block at a distance \(h\).

regularize(average_inblock_semivariances: ndarray = None, semivariance_between_point_supports: ndarray = None, experimental_block_semivariances: ndarray = None, theoretical_block_model: TheoreticalVariogram = None) ndarray[source]

Method regularizes point support model. Procedure is described in [1].

Parameters:
average_inblock_semivariancesnp.ndarray, default = None

The mean semivariance between blocks. See Notes to learn more.

semivariance_between_point_supportsnp.ndarray, default = None

Semivariance between all blocks calculated from the theoretical model.

experimental_block_semivariancesnp.ndarray, default = None

The experimental semivariance between area coordinates.

theoretical_block_modelTheoreticalVariogram, default = None

A modeled variogram.

Returns:
regularized_modelnumpy array

[lag, semivariance]

Notes

Function has the form:

\[\gamma_v(h) = \gamma(v, v_h) - \gamma_h(v, v)\]

where:

  • \(\gamma_v(h)\) - the regularized variogram,

  • \(\gamma(v, v_h)\) - a variogram value between any two blocks separated by the distance \(h\) (calculated from their point support),

  • \(\gamma_h(v, v)\) - average inblock semivariance between blocks.

Average inblock semivariance between blocks:

\[\gamma_h(v, v) = \frac{1}{(2*N(h))} \sum_{a=1}^{N(h)} [\gamma(v_{a}, v_{a}) + \gamma(v_{a+h}, v_{a+h})]\]

where \(\gamma(v_{a}, v_{a})\) and \(\gamma(v_{a+h}, v_{a+h})\) are inblock semivariances of block \(a\) and block \(a+h\) separated by the distance \(h\).

References

[1] Goovaerts P., Kriging and Semivariogram Deconvolution in the

Presence of Irregular Geographical Units, Mathematical Geology 40(1), 101-128, 2008

regularize_variogram() ndarray[source]

Function regularizes semivariograms.

Returns:
reg_variogramnumpy array

[lag, semivariance]

Raises:
ValueError

Semivariance at a given lag is below zero.

Notes

Regularized semivariogram is a difference between the average block to block semivariance \(\gamma(v, v_{h})\) and the average inblock semivariances \(\gamma_{h}(v, v)\) at a given lag \(h\).

\[\gamma_{v}(h)=\gamma(v, v_{h}) - \gamma_{h}(v, v)\]
show_semivariograms()[source]

Method plots:

  • experimental variogram,

  • theoretical model,

  • average inblock semivariance,

  • average block-to-block semivariance,

  • regularized variogram.

Raises:
AttributeError

The semivariogram regularization process has not been performed.