Pipelines

Poisson Kriging pipelines

pyinterpolate.filter_blocks(semivariogram_model: TheoreticalVariogram, point_support: PointSupport, number_of_neighbors, kriging_type='ata', data_crs=None, raise_when_negative_prediction=True, raise_when_negative_error=False, negative_prediction_to_zero=False, verbose=True) → GeoDataFrame

Filters block data using Poisson Kriging. Filtering here means recomputing the aggregated block values from the point support data in order to regularize the ratios.

Parameters:

semivariogram_model (TheoreticalVariogram)

The fitted variogram model.

point_support (PointSupport)

Blocks and their point supports.

number_of_neighbors (int)

The minimum number of neighbours that potentially affect a block.

kriging_type (str, default='ata')

A type of Poisson Kriging operation. Available methods:

'ata': Area-to-Area Poisson Kriging.
'atp': Area-to-Point Poisson Kriging.
'cb': Centroid-based Poisson Kriging.

data_crs (str, default=None)

Coordinate reference system of the data, see: https://geopandas.org/projections.html. If None is given, the returned GeoDataFrame has no crs.

raise_when_negative_prediction (bool, default=True)

Raise error when prediction is negative.

raise_when_negative_error (bool, default=False)

Raise error when prediction error is negative.

negative_prediction_to_zero (bool, default=False)

When the predicted value is below zero, set it to zero.

verbose (bool, default=True)

Show progress bar.

Returns:

GeoPandas GeoDataFrame: Regularized set of blocks with columns ['id', 'geometry', 'reg.est', 'reg.err', 'rmse'].
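
The three negative-value flags interact, but the reference above does not state their precedence. The following is a minimal sketch of one plausible ordering, assuming that clipping to zero takes priority over raising for predictions. `check_prediction` is a hypothetical helper written for illustration; it is not part of pyinterpolate and may differ from what the library does internally.

```python
def check_prediction(prediction, error,
                     raise_when_negative_prediction=True,
                     raise_when_negative_error=False,
                     negative_prediction_to_zero=False):
    """Hypothetical helper, NOT pyinterpolate code.

    Assumption: clipping to zero is checked before raising.
    """
    if prediction < 0:
        if negative_prediction_to_zero:
            # Replace the negative estimate with zero instead of failing.
            prediction = 0.0
        elif raise_when_negative_prediction:
            raise ValueError(f"Negative prediction: {prediction}")
    if error < 0 and raise_when_negative_error:
        raise ValueError(f"Negative prediction error: {error}")
    return prediction, error


print(check_prediction(-0.3, 1.2, negative_prediction_to_zero=True))  # (0.0, 1.2)
```

With the defaults shown in the signature, a negative prediction raises an error while a negative prediction error passes through unchanged.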

Examples

>>> import os
>>> import geopandas as gpd
>>> from pyinterpolate import (
...     filter_blocks,
...     Blocks,
...     ExperimentalVariogram,
...     PointSupport,
...     TheoreticalVariogram
... )
>>>
>>>
>>> FILENAME = 'cancer_data.gpkg'
>>> LAYER_NAME = 'areas'
>>> DS = gpd.read_file(FILENAME, layer=LAYER_NAME)
>>> AREA_VALUES = 'rate'
>>> AREA_INDEX = 'FIPS'
>>> AREA_GEOMETRY = 'geometry'
>>> PS_LAYER_NAME = 'points'
>>> PS_VALUES = 'POP10'
>>> PS_GEOMETRY = 'geometry'
>>> PS = gpd.read_file(FILENAME, layer=PS_LAYER_NAME)
>>>
>>> CANCER_DATA = {
...    'ds': DS,
...    'index_column_name': AREA_INDEX,
...    'value_column_name': AREA_VALUES,
...    'geometry_column_name': AREA_GEOMETRY
... }
>>> POINT_SUPPORT_DATA = {
...     'ps': PS,
...     'value_column_name': PS_VALUES,
...     'geometry_column_name': PS_GEOMETRY
... }
>>> BLOCKS = Blocks(**CANCER_DATA)
>>> indexes = BLOCKS.block_indexes
>>>
>>> PS = PointSupport(
...     points=POINT_SUPPORT_DATA['ps'],
...     ps_blocks=BLOCKS,
...     points_value_column=POINT_SUPPORT_DATA['value_column_name'],
...     points_geometry_column=POINT_SUPPORT_DATA['geometry_column_name']
... )
>>>
>>> EXPERIMENTAL = ExperimentalVariogram(
...     ds=BLOCKS.representative_points_array(),
...     step_size=40000,
...     max_range=300001
... )
>>>
>>> THEO = TheoreticalVariogram()
>>> THEO.autofit(
...     experimental_variogram=EXPERIMENTAL,
...     sill=150
... )
>>> filtered = filter_blocks(
...     semivariogram_model=THEO,
...     point_support=PS,
...     number_of_neighbors=8,
...     kriging_type='cb',
...     raise_when_negative_error=False,
...     verbose=False
... )
>>> print(filtered.columns)
Index(['id', 'geometry', 'reg.est', 'reg.err', 'rmse'], dtype='object')

pyinterpolate.smooth_blocks(semivariogram_model: TheoreticalVariogram, point_support: PointSupport, number_of_neighbors, data_crs=None, raise_when_negative_prediction=True, raise_when_negative_error=True, negative_prediction_to_zero=False, verbose=True) → GeoDataFrame

Smooths aggregated block values and transforms them into point support.

Parameters:

semivariogram_model (TheoreticalVariogram): The fitted variogram model.
point_support (PointSupport): Blocks and their point supports.
number_of_neighbors (int): The minimum number of neighbours that potentially affect a block.
data_crs (str, default=None): Coordinate reference system of the data, see: https://geopandas.org/projections.html. If None is given, the returned GeoDataFrame has no crs.
raise_when_negative_prediction (bool, default=True): Raise error when prediction is negative.
raise_when_negative_error (bool, default=True): Raise error when prediction error is negative.
negative_prediction_to_zero (bool, default=False): When the predicted value is below zero, set it to zero.
verbose (bool, default=True): Show progress bar.

Returns:

GeoPandas GeoDataFrame: columns = ['id', 'geometry', 'reg.est', 'reg.err', 'rmse']

Examples

>>> import os
>>> import geopandas as gpd
>>> from pyinterpolate import (
...     smooth_blocks,
...     Blocks,
...     ExperimentalVariogram,
...     PointSupport,
...     TheoreticalVariogram
... )
>>>
>>>
>>> FILENAME = 'cancer_data.gpkg'
>>> LAYER_NAME = 'areas'
>>> DS = gpd.read_file(FILENAME, layer=LAYER_NAME)
>>> AREA_VALUES = 'rate'
>>> AREA_INDEX = 'FIPS'
>>> AREA_GEOMETRY = 'geometry'
>>> PS_LAYER_NAME = 'points'
>>> PS_VALUES = 'POP10'
>>> PS_GEOMETRY = 'geometry'
>>> PS = gpd.read_file(FILENAME, layer=PS_LAYER_NAME)
>>>
>>> CANCER_DATA = {
...    'ds': DS,
...    'index_column_name': AREA_INDEX,
...    'value_column_name': AREA_VALUES,
...    'geometry_column_name': AREA_GEOMETRY
... }
>>> POINT_SUPPORT_DATA = {
...     'ps': PS,
...     'value_column_name': PS_VALUES,
...     'geometry_column_name': PS_GEOMETRY
... }
>>> BLOCKS = Blocks(**CANCER_DATA)
>>> indexes = BLOCKS.block_indexes
>>>
>>> PS = PointSupport(
...     points=POINT_SUPPORT_DATA['ps'],
...     ps_blocks=BLOCKS,
...     points_value_column=POINT_SUPPORT_DATA['value_column_name'],
...     points_geometry_column=POINT_SUPPORT_DATA['geometry_column_name']
... )
>>>
>>> EXPERIMENTAL = ExperimentalVariogram(
...     ds=BLOCKS.representative_points_array(),
...     step_size=40000,
...     max_range=300001
... )
>>>
>>> THEO = TheoreticalVariogram()
>>> THEO.autofit(
...     experimental_variogram=EXPERIMENTAL,
...     sill=150
... )
>>> smoothed = smooth_blocks(
...     semivariogram_model=THEO,
...     point_support=PS,
...     number_of_neighbors=8,
...     verbose=True
... )
>>> print(smoothed.columns)
Index(['id', 'geometry', 'reg.est', 'reg.err', 'rmse'], dtype='object')

Ordinary Kriging pipelines

pyinterpolate.interpolate_points(theoretical_model: TheoreticalVariogram, unknown_locations: ArrayLike, known_locations: ArrayLike = None, known_values: ArrayLike = None, known_geometries: ArrayLike = None, neighbors_range=None, no_neighbors=4, max_tick=5.0, use_all_neighbors_in_range=False, allow_approximate_solutions=False, progress_bar=True) → ndarray

Function predicts values at unknown locations with Ordinary Kriging.

Parameters:

theoretical_model (TheoreticalVariogram): Fitted theoretical variogram model.
unknown_locations (numpy array): Points where you want to estimate values, [(x, y), ...] <-> [(lon, lat), ...].
known_locations (numpy array, optional): The known locations: [x, y, value].
known_values (ArrayLike, optional): Observation in the i-th geometry (from known_geometries). If not given, known_locations must be provided.
known_geometries (ArrayLike, optional): Array or similar structure with Point type geometries. It must have the same length as known_values. If not given, known_locations must be provided.
neighbors_range (float, default=None): The maximum distance within which neighbors are searched. If None is given, the range is taken from the theoretical model's rang attribute.
no_neighbors (int, default=4): The number of the n-closest neighbors used for interpolation.
max_tick (float, default=5.0): Maximum number of degrees for the neighbors search angle.
use_all_neighbors_in_range (bool, default=False): If True and the actual number of neighbors within neighbors_range is greater than the no_neighbors parameter, all of them are used anyway.
allow_approximate_solutions (bool, default=False): Allows kriging weights to be approximated with the OLS algorithm. Setting it to True is not recommended unless you know what you are doing. It can help when the dataset contains clusters, which may lead to a singular or near-singular matrix, but removing those clusters is the better option.
progress_bar (bool, default=True): Shows progress bar.

Returns:

numpy array: [predicted value, variance error, longitude (x), latitude (y)]
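
The interplay of no_neighbors, neighbors_range and use_all_neighbors_in_range can be illustrated with a small pure-Python sketch. It is not the pyinterpolate implementation: `select_neighbors` is a hypothetical function, it ignores max_tick, it requires an explicit neighbors_range (the library reads it from the model when None is given), and the fallback to the n closest points when too few lie in range is an assumption.

```python
import math


def select_neighbors(known, target, neighbors_range, no_neighbors=4,
                     use_all_neighbors_in_range=False):
    """Illustrative only, NOT pyinterpolate code.

    known  : list of (x, y, value) tuples
    target : (x, y) tuple
    Returns rows (distance, x, y, value) sorted by distance.
    """
    by_distance = sorted(
        ((math.dist(target, (x, y)), x, y, value) for x, y, value in known),
        key=lambda row: row[0],
    )
    in_range = [row for row in by_distance if row[0] <= neighbors_range]
    if len(in_range) < no_neighbors:
        # Assumption: too few points in range, fall back to the n closest.
        return by_distance[:no_neighbors]
    if use_all_neighbors_in_range:
        # Keep every point inside the search radius.
        return in_range
    # Otherwise keep only the n closest points inside the radius.
    return in_range[:no_neighbors]


known = [(0, 0, 1.0), (1, 0, 2.0), (0, 2, 3.0), (5, 5, 4.0), (3, 0, 5.0)]
closest = select_neighbors(known, (0, 0), neighbors_range=3.5, no_neighbors=2)
print([row[3] for row in closest])  # [1.0, 2.0]
```

Passing use_all_neighbors_in_range=True in the same call would return the four points that lie within 3.5 units of the target.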

Examples

>>> import geopandas as gpd
>>> import numpy as np
>>> import pandas as pd
>>>
>>> from pyinterpolate import (build_experimental_variogram,
...     build_theoretical_variogram)
>>> from pyinterpolate.core.pipelines.interpolate import interpolate_points
>>>
>>> dem = gpd.read_file('dem.gpkg')
>>> unknown_locations = gpd.read_file('unknown_locations.gpkg')
>>> step_size = 500
>>> max_range = 10000
>>> exp_variogram = build_experimental_variogram(
...     values=dem['dem'],
...     geometries=dem['geometry'],
...     step_size=step_size,
...     max_range=max_range
... )
>>> theo_variogram = build_theoretical_variogram(exp_variogram)
>>> interp = interpolate_points(
...     theoretical_model=theo_variogram,
...     unknown_locations=unknown_locations['geometry'],
...     known_values=dem['dem'],
...     known_geometries=dem['geometry']
... )
>>> print(interp[0])
[7.91222896e+01 9.72740449e+01 2.38012302e+05 5.51466805e+05]
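
Each returned row can be unpacked in the order given under Returns. The sketch below uses the row printed above and assumes that the second value is a kriging variance, so that its square root can be read as a standard deviation; the variable names are illustrative.

```python
import math

# First row printed in the example above.
row = [7.91222896e+01, 9.72740449e+01, 2.38012302e+05, 5.51466805e+05]

# Order follows the Returns description:
# [predicted value, variance error, longitude (x), latitude (y)]
prediction, variance_error, x, y = row

# Assumption: variance_error is a variance, so sqrt gives a standard deviation.
std = math.sqrt(variance_error)

print(f"({x:.0f}, {y:.0f}): {prediction:.2f} +/- {std:.2f}")
```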

pyinterpolate.interpolate_points_dask(theoretical_model: TheoreticalVariogram, unknown_locations: ArrayLike, known_locations: ArrayLike = None, known_values: ArrayLike = None, known_geometries: ArrayLike = None, neighbors_range=None, no_neighbors=4, max_tick=5.0, use_all_neighbors_in_range=False, allow_approximate_solutions=False, number_of_workers=1, progress_bar=True) → ndarray

Function predicts values at unknown locations with Ordinary Kriging using the Dask backend. It is worthwhile when you must interpolate a large number of points.

Parameters:

theoretical_model (TheoreticalVariogram): Fitted theoretical variogram model.
unknown_locations (numpy array): Points where you want to estimate values, [(x, y), ...] <-> [(lon, lat), ...].
known_locations (numpy array, optional): The known locations: [x, y, value].
known_values (ArrayLike, optional): Observation in the i-th geometry (from known_geometries). If not given, known_locations must be provided.
known_geometries (ArrayLike, optional): Array or similar structure with Point type geometries. It must have the same length as known_values. If not given, known_locations must be provided.
neighbors_range (float, default=None): The maximum distance within which neighbors are searched. If None is given, the range is taken from the theoretical model's rang attribute.
no_neighbors (int, default=4): The number of the n-closest neighbors used for interpolation.
max_tick (float, default=5.0): Maximum number of degrees for the neighbors search angle.
use_all_neighbors_in_range (bool, default=False): If True and the actual number of neighbors within neighbors_range is greater than the no_neighbors parameter, all of them are used anyway.
allow_approximate_solutions (bool, default=False): Allows kriging weights to be approximated with the OLS algorithm. Setting it to True is not recommended unless you know what you are doing. It can help when the dataset contains clusters, which may lead to a singular or near-singular matrix, but removing those clusters is the better option.
number_of_workers (int, default=1): How many processing units can be used for predictions. Increase it only for a very large number of interpolated points (~10k+).
progress_bar (bool, default=True): Shows progress bar.

Returns:

numpy array: [predicted value, variance error, longitude (x), latitude (y)]
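
The idea behind number_of_workers is that the unknown locations can be divided among several processing units. How the library partitions the work with Dask is not described here, so the sketch below only illustrates the general idea of splitting points into near-equal contiguous chunks. `split_into_chunks` is a hypothetical helper, not a pyinterpolate or Dask function.

```python
def split_into_chunks(points, number_of_workers=1):
    """Hypothetical helper, NOT pyinterpolate or Dask code.

    Splits a sequence into at most `number_of_workers` contiguous
    chunks whose sizes differ by no more than one element.
    """
    size, rest = divmod(len(points), number_of_workers)
    chunks = []
    start = 0
    for worker in range(number_of_workers):
        # The first `rest` chunks receive one extra element.
        stop = start + size + (1 if worker < rest else 0)
        if stop > start:
            chunks.append(points[start:stop])
        start = stop
    return chunks


points = list(range(10))
chunks = split_into_chunks(points, number_of_workers=3)
print([len(chunk) for chunk in chunks])  # [4, 3, 3]
```

For a handful of points the overhead of distributing chunks usually outweighs the gain, which is consistent with the advice to raise number_of_workers only for very large inputs.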

Examples

>>> import geopandas as gpd
>>> import numpy as np
>>> import pandas as pd
>>>
>>> from pyinterpolate import (build_experimental_variogram,
...     build_theoretical_variogram)
>>> from pyinterpolate.core.pipelines.interpolate import interpolate_points_dask
>>>
>>> dem = gpd.read_file('dem.gpkg')
>>> unknown_locations = gpd.read_file('unknown_locations.gpkg')
>>> step_size = 500
>>> max_range = 10000
>>> exp_variogram = build_experimental_variogram(
...     values=dem['dem'],
...     geometries=dem['geometry'],
...     step_size=step_size,
...     max_range=max_range
... )
>>> theo_variogram = build_theoretical_variogram(exp_variogram)
>>> interp = interpolate_points_dask(
...     theoretical_model=theo_variogram,
...     unknown_locations=unknown_locations['geometry'],
...     known_values=dem['dem'],
...     known_geometries=dem['geometry']
... )
>>> print(interp[0])
[7.91222896e+01 9.72740449e+01 2.38012302e+05 5.51466805e+05]
