Deconvolution#

class Deconvolution(verbose=True, store_models=False)[source]

Class performs deconvolution of semivariogram of areal data. Whole procedure is based on the iterative process described in: [1].

Steps to regularize semivariogram:

  • initialize your object (no parameters),

  • use fit() method to build initial point support model,

  • use transform() method to perform semivariogram regularization,

  • save deconvoluted semivariogram model with export_model() method.

References

[1] Goovaerts P., Kriging and Semivariogram Deconvolution in the Presence of Irregular Geographical Units, Mathematical Geology 40(1), 101-128, 2008

Examples

>>> dcv = Deconvolution(verbose=True)
>>> dcv.fit(agg_dataset=...,
...         point_support_dataset=...,
...         agg_step_size=...,
...         agg_max_range=...,
...         variogram_weighting_method='closest')
>>> dcv.transform(max_iters=5)
>>> dcv.plot_variograms()
>>> dcv.plot_deviations()
>>> dcv.plot_weights()
>>> dcv.export_model('results.csv')
Attributes:
psUnion[Dict, np.ndarray, gpd.GeoDataFrame, pd.DataFrame, PointSupport]
  • Dict: {block id: [[point x, point y, value]]}

  • numpy array: [[block id, x, y, value]]

  • DataFrame and GeoDataFrame: columns = {x, y, ds, index}

  • PointSupport

aggUnion[Blocks, gpd.GeoDataFrame, pd.DataFrame, np.ndarray]

Blocks with aggregated data.

  • Blocks: Blocks() class object.

  • GeoDataFrame and DataFrame must have columns: centroid.x, centroid.y, ds, index. Geometry column with polygons is not used and optional.

  • numpy array: [[block index, centroid x, centroid y, value]].

initial_regularized_variogramnumpy array

[lag, semivariance]

initial_theoretical_agg_modelTheoreticalVariogram
initial_theoretical_model_predictionnumpy array

[lag, semivariance]

initial_experimental_variogramnumpy array

[lag, semivariance, number of pairs]

final_theoretical_modelTheoreticalVariogram
final_optimal_variogramnumpy array

[lag, semivariance]

agg_stepfloat

Step size between lags.

agg_rngfloat

Maximal distance of analysis.

rangesnumpy array

np.arange(agg_step, agg_rng, agg_step)

directionfloat (in range [0, 360])

Direction of semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

tolerancefloat (in range [0, 1])

If tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == step_size.

  • The minor axis size is tolerance * step_size

  • The baseline point is at a center of the ellipse.

  • The tolerance == 1 creates an omnidirectional semivariogram.

weighting_methodstr

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within a lag - more pairs, lesser weight.

deviationsList

List of deviations per iteration. The first element is the initial deviation.

weightsList

List of weights applied to lags in each iteration.

verbosebool

Should algorithm print() process steps into a terminal.

store_modelsbool

Should theoretical and regularized models be stored after each iteration.

theoretical_modelsList

List with theoretical models parameters.

regularized_modelsList

List with numpy arrays with regularized models.

Methods

fit()

Fits areal data and the point support data into a model, initializes the experimental semivariogram, the theoretical semivariogram model, regularized point support model, and deviation.

transform()

Performs semivariogram regularization.

fit_transform()

Performs fit() and transform() at one time.

export_model()

Exports regularized (or fitted) model.

plot_variograms()

Plots semivariances before and after regularization.

plot_deviations()

Plots each deviation divided by the initial deviation.

plot_weights()

Plots the mean weight value per lag.

export_model(fname)[source]

Function exports final theoretical model.

Parameters:
fnamestr

File name for model parameters (nugget, sill, range, model type)

Raises:
RunetimeError

A model hasn’t been transformed yet.

fit(agg_dataset, point_support_dataset, agg_step_size, agg_max_range, agg_nugget=None, agg_direction=None, agg_tolerance=1, variogram_weighting_method='closest', model_name='safe', model_types=None)[source]

Function fits given areal data variogram into point support variogram - it is the first step of regularization process.

Parameters:
agg_datasetUnion[Blocks, gpd.GeoDataFrame, pd.DataFrame, np.ndarray]

Blocks with aggregated data.

  • Blocks: Blocks() class object.

  • GeoDataFrame and DataFrame must have columns: centroid.x, centroid.y, ds, index. Geometry column with polygons is not used.

  • numpy array: [[block index, centroid x, centroid y, value]].

point_support_datasetUnion[Dict, np.ndarray, gpd.GeoDataFrame, pd.DataFrame, PointSupport]
  • Dict: {block id: [[point x, point y, value]]}

  • numpy array: [[block id, x, y, value]]

  • DataFrame and GeoDataFrame: columns = {x, y, ds, index}

  • PointSupport

agg_step_sizefloat

Step size between lags.

agg_max_rangefloat

Maximal distance of analysis.

agg_nuggetfloat, default = 0

The nugget of a data.

agg_directionfloat (in range [0, 360]), optional, default=0

Direction of semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

agg_tolerancefloat (in range [0, 1]), optional, default=1

If agg_tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If agg_tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == agg_step_size.

  • The minor axis size is agg_tolerance * agg_step_size

  • The baseline point is at a center of the ellipse.

  • agg_tolerance == 1 creates an omnidirectional semivariogram.

variogram_weighting_methodstr, default = “closest”

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within a lag - more pairs, lesser weight.

model_namestr, default=’safe’

The name of the model to check or special command for multiple models. Available models:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’],

  • ‘circular’,

  • ‘cubic’,

  • ‘exponential’,

  • ‘gaussian’,

  • ‘linear’,

  • ‘power’,

  • ‘spherical’.

model_typesList, default = None

The list with model names to check excluding ‘all’ and ‘safe’ names.

fit_transform(agg_dataset, point_support_dataset, agg_step_size, agg_max_range, agg_nugget=None, agg_direction=None, agg_tolerance=1, variogram_weighting_method='closest', model_name='safe', model_types=None, max_iters=25, limit_deviation_ratio=0.1, minimum_deviation_decrease=0.01, reps_deviation_decrease=3)[source]

Method performs fit() and transform() operations at once.

Parameters:
agg_datasetUnion[Blocks, gpd.GeoDataFrame, pd.DataFrame, np.ndarray]

Blocks with aggregated data.

  • Blocks: Blocks() class object.

  • GeoDataFrame and DataFrame must have columns: centroid.x, centroid.y, ds, index. Geometry column with polygons is not used.

  • numpy array: [[block index, centroid x, centroid y, value]].

point_support_datasetUnion[Dict, np.ndarray, gpd.GeoDataFrame, pd.DataFrame, PointSupport]
  • Dict: {block id: [[point x, point y, value]]}

  • numpy array: [[block id, x, y, value]]

  • DataFrame and GeoDataFrame: columns = {x, y, ds, index}

  • PointSupport

agg_step_sizefloat

Step size between lags.

agg_max_rangefloat

Maximal distance of analysis.

agg_nuggetfloat, default = None

The nugget of a dataset.

agg_directionfloat (in range [0, 360]), default=0

Direction of semivariogram, values from 0 to 360 degrees:

  • 0 or 180: is E-W,

  • 90 or 270 is N-S,

  • 45 or 225 is NE-SW,

  • 135 or 315 is NW-SE.

agg_tolerancefloat (in range [0, 1]), optional, default=1

If agg_tolerance is 0 then points must be placed at a single line with the beginning in the origin of the coordinate system and the direction given by y axis and direction parameter. If agg_tolerance is > 0 then the bin is selected as an elliptical area with major axis pointed in the same direction as the line for 0 tolerance.

  • The major axis size == agg_step_size.

  • The minor axis size is agg_tolerance * agg_step_size

  • The baseline point is at a center of the ellipse.

  • agg_tolerance == 1 creates an omnidirectional semivariogram.

variogram_weighting_methodstr, default = “closest”

Method used to weight error at a given lags. Available methods:

  • equal: no weighting,

  • closest: lags at a close range have bigger weights,

  • distant: lags that are further away have bigger weights,

  • dense: error is weighted by the number of point pairs within a lag - more pairs, lesser weight.

model_namestr, default=’safe’

The name of the model to check or special command for multiple models. Available models:

  • ‘all’ - the same as list with all models,

  • ‘safe’ - [‘linear’, ‘power’, ‘spherical’],

  • ‘circular’,

  • ‘cubic’,

  • ‘exponential’,

  • ‘gaussian’,

  • ‘linear’,

  • ‘power’,

  • ‘spherical’.

model_typesList, default = None

The list with model names to check excluding ‘all’ and ‘safe’ names.

max_itersint, default = 25

Maximum number of iterations.

limit_deviation_ratiofloat, default = 0.01

Minimal ratio of model deviation to initial deviation when algorithm is stopped. Parameter must be set within the limits (0, 1).

minimum_deviation_decreasefloat, default = 0.001

The minimum ratio of the difference between model deviation and optimal deviation to the optimal deviation: |dev - opt_dev| / opt_dev. The parameter must be set within the limits (0, 1).

reps_deviation_decreaseint, default = 3

How many repetitions of small deviation decrease before termination of the algorithm.

plot_variograms()[source]

Function shows experimental semivariogram, theoretical semivariogram and regularized semivariogram after semivariogram regularization with transform() method.

transform(max_iters=25, limit_deviation_ratio=0.1, minimum_deviation_decrease=0.01, reps_deviation_decrease=3)[source]

Method performs semivariogram regularization after model fitting.

Parameters:
max_itersint, default = 25

Maximum number of iterations.

limit_deviation_ratiofloat, default = 0.1

Minimal ratio of model deviation to initial deviation when algorithm is stopped. Parameter must be set within the limits (0, 1).

minimum_deviation_decreasefloat, default = 0.01

The minimum ratio of the difference between model deviation and optimal deviation to the optimal deviation: |dev - opt_dev| / opt_dev. Parameter must be set within the limits (0, 1).

reps_deviation_decreaseint, default = 3

How many repetitions of small deviation decrease before termination of the algorithm.

Raises:
AttributeError

initial_regularized_model is undefined (user didn’t perform fit() method).

ValueError

limit_deviation_ratio or minimum_deviation_decrease parameters <= 0 or >= 1.