Utility Functions & Classes

Several development utilities and other functions not part of the sampling API are included here.

Datasets

Utilities for generating data used for testing and exploration of provided samplers.

occuspytial.utils.rand_precision_mat(lat_row, lat_col, max_neighbors=8, rho=1)[source]

Generate a random spatial precision matrix.

The spatial precision matrix is generated using a rectangular lattice of dimensions lat_row x lat_col, so the matrix is square with lat_row * lat_col rows and columns.

Parameters
lat_rowint

Number of rows of the lattice used to generate the matrix.

lat_colint

Number of columns of the lattice used to generate the matrix.

max_neighbors{4, 8}, optional

The maximum number of neighbors for each site. The default is 8.

rhofloat, optional

The spatial weight parameter. Takes values between 0 and 1, with 0 implying independent random effects and 1 implying strong spatial autocorrelation. Setting the value to 1 is equivalent to generating the precision matrix of the Intrinsic Autoregressive Model.

Returns
scipy.sparse.coo_matrix

Spatial precision matrix

Raises
ValueError

If max_neighbors is any value other than 4 or 8.

Examples

>>> from occuspytial.utils import rand_precision_mat
>>> Q = rand_precision_mat(10, 5)
>>> Q
<50x50 sparse matrix of type '<class 'numpy.int64'>'
        with 364 stored elements in COOrdinate format>
# The matrix can be converted to numpy format using method ``toarray()``
>>> Q.toarray()
array([[ 3, -1,  0, ...,  0,  0,  0],
       [-1,  5, -1, ...,  0,  0,  0],
       [ 0, -1,  5, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  5, -1,  0],
       [ 0,  0,  0, ..., -1,  5, -1],
       [ 0,  0,  0, ...,  0, -1,  3]])
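
The structure of the output above can be reproduced with a small pure-Python sketch. It assumes the matrix has the common form Q = D - rho * A, where A is the lattice adjacency matrix and D holds each site's neighbor count on the diagonal; that form is an assumption that matches the printed example, not a statement about the library's internals. The name `lattice_precision` is hypothetical, and a dense nested list stands in for the sparse matrix.

```python
def lattice_precision(lat_row, lat_col, max_neighbors=8, rho=1):
    """Dense sketch of a lattice precision matrix, assuming Q = D - rho * A."""
    if max_neighbors not in (4, 8):
        raise ValueError("max_neighbors must be 4 or 8")
    # Rook neighbors (up, down, left, right) ...
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if max_neighbors == 8:
        # ... plus the four diagonal neighbors.
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    n = lat_row * lat_col
    Q = [[0] * n for _ in range(n)]
    for r in range(lat_row):
        for c in range(lat_col):
            i = r * lat_col + c  # row-major site index
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < lat_row and 0 <= cc < lat_col:
                    Q[i][rr * lat_col + cc] = -rho  # off-diagonal: -rho per neighbor
                    Q[i][i] += 1                    # diagonal: number of neighbors
    return Q


Q = lattice_precision(10, 5)
stored = sum(1 for row in Q for value in row if value != 0)
print(len(Q), stored)  # 50 364
```

With rho=1 every row sums to zero, which is the defining property of an intrinsic autoregressive precision matrix under this assumed form.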
occuspytial.utils.make_data(n=150, min_v=None, max_v=None, ns=None, p=3, q=3, tau_range=(0.25, 1.5), max_neighbors=8, random_state=None)[source]

Generate random data to use for modelling species occupancy.

Parameters
nint, optional

Number of sites. Defaults to 150.

min_vint, optional

Minimum number of visits per site. If None, the minimum number is set to 2. Defaults to None.

max_vint, optional

Maximum number of visits per site. If None, the maximum number is set to 10% of n. Defaults to None.

nsint, optional

Number of surveyed sites out of n. If None, then this parameter is set to 50% of n. Defaults to None.

pint, optional

Number of covariates to use for species occupancy. Defaults to 3.

qint, optional

Number of covariates to use for conditional detection. Defaults to 3.

tau_rangetuple, optional

The range to randomly sample the precision parameter value from. Defaults to (0.25, 1.5).

max_neighborsint, optional

Maximum number of neighbors per site. Should be one of {4, 8}. Default is 8.

random_stateint, optional

The seed to use for random number generation. Useful for reproducing generated data. If None then a random seed is chosen. Defaults to None.

Returns
Qscipy.sparse.coo_matrix

Spatial precision matrix

WDict[int, np.ndarray]

Dictionary of detection covariates where the keys are the site numbers of the surveyed sites and the values are arrays containing the design matrix of each corresponding site.

Xnp.ndarray

Design matrix of species occupancy covariates.

yDict[int, np.ndarray]

Dictionary of survey data where the keys are the site numbers of the surveyed sites and the values are NumPy arrays of 1’s and 0’s where 0’s indicate “no detection” and 1’s indicate “detection”. The length of each array equals the number of visits to the corresponding site.

alphanp.ndarray

True values of coefficients of detection covariates.

betanp.ndarray

True values of coefficients of occupancy covariates.

taunp.ndarray

True value of the precision parameter.

znp.ndarray

True occupancy state for all n sites.

Raises
ValueError

When n is less than the default 150 sites. When min_v is less than 1. When max_v is less than 2 or greater than n. When ns is not a positive integer or greater than n.

Examples

>>> from occuspytial.utils import make_data
>>> Q, W, X, y, alpha, beta, tau, z = make_data()
>>> Q
<150x150 sparse matrix of type '<class 'numpy.float64'>'
        with 1144 stored elements in COOrdinate format>
>>> Q.toarray()
array([[ 3., -1.,  0., ...,  0.,  0.,  0.],  # random
       [-1.,  5., -1., ...,  0.,  0.,  0.],
       [ 0., -1.,  5., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  5., -1.,  0.],
       [ 0.,  0.,  0., ..., -1.,  5., -1.],
       [ 0.,  0.,  0., ...,  0., -1.,  3.]])
>>> W
{81: array([[ 1.        ,  1.01334565,  0.93150242],  # random
        [ 1.        ,  0.19276808, -1.71939657],
        [ 1.        ,  0.23866531,  0.0559545 ],
        [ 1.        ,  1.36102304,  1.73611887],
        [ 1.        ,  0.47247886,  0.73410589],
        [ 1.        , -1.9018879 ,  0.0097963 ]]),
 131: array([[ 1.        ,  1.67846707, -1.12476746],
        [ 1.        , -1.63131532, -1.32216705],
        [ 1.        , -1.37431173, -0.79734213],
        ...,
 21: array([[ 1.        ,  1.6416734 , -1.91642502],
        [ 1.        ,  0.2256312 , -1.68929118],
        [ 1.        ,  1.36953093,  1.08758129],
        [ 1.        , -1.08029212,  0.40219588]])}
>>> X
array([[ 1.        ,  0.71582433,  1.76344395],
       [ 1.        ,  0.8561976 ,  1.0520401 ],
       [ 1.        , -0.28051247,  0.16809809],
       ...,
       [ 1.        ,  0.86702262, -1.18225448],
       [ 1.        , -0.41346399, -0.9633078 ],
       [ 1.        , -0.23182363,  1.69930761]])
>>> y
{15: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),  # random
 81: array([0, 0, 0, 1, 1, 0]),
 ...,
 21: array([0, 1, 0, 0])}
>>> alpha
array([-1.43291816, -0.87932413, -1.84927642])  # random
>>> beta
array([-0.62084322, -1.09645564, -0.93371374])  # random
>>> tau
1.415532667780688  # random
>>> z
array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1,
       1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0,
       0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
       0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0])
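
How W, y and z relate can be illustrated with a simplified, stdlib-only sketch of the standard occupancy model: a site can only yield detections if it is occupied, and each visit's detection probability is a logistic function of that visit's covariates. This is not the body of make_data; the function name `simulate_detections`, the constant occupancy probability, and the omission of spatial random effects are all assumptions made to keep the example short.

```python
import math
import random


def simulate_detections(n=20, ns=10, min_v=2, max_v=5, seed=0):
    """Toy occupancy data: W and y are keyed by surveyed site, z covers all n sites."""
    rng = random.Random(seed)
    alpha = [rng.uniform(-1, 1) for _ in range(2)]    # detection coefficients
    z = [int(rng.random() < 0.5) for _ in range(n)]   # occupancy state (assumed psi = 0.5)
    surveyed = rng.sample(range(n), ns)               # ns surveyed sites out of n
    W, y = {}, {}
    for site in surveyed:
        visits = rng.randint(min_v, max_v)
        # One design-matrix row per visit; the first column is the intercept.
        W[site] = [[1.0, rng.gauss(0, 1)] for _ in range(visits)]
        y[site] = []
        for row in W[site]:
            eta = sum(a * w for a, w in zip(alpha, row))
            p = 1 / (1 + math.exp(-eta))              # conditional detection probability
            # Detection is only possible at occupied sites.
            y[site].append(int(z[site] == 1 and rng.random() < p))
    return W, y, z, alpha


W, y, z, alpha = simulate_detections()
site = next(iter(y))
print(len(W[site]) == len(y[site]))  # True: one covariate row per visit
```

As in the output above, the keys of W and y are the surveyed site numbers, while z has one entry for every site.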

Development Utils

occuspytial.utils.get_generator(random_state=None)[source]

Get an instance of a numpy random number generator object.

The instance uses the SFC64 bit generator, the fastest that NumPy offers as of version 1.19. This function conveniently instantiates a generator of this kind and should be used in all modules.

Parameters
random_state{None, int, array_like[ints], numpy.random.SeedSequence}

A seed to initialize the bitgenerator. Defaults to None.

Returns
numpy.random.Generator

Instance of numpy’s Generator class, which exposes a number of random number generating methods.

Examples

>>> from occuspytial.utils import get_generator
>>> rng = get_generator()
# The instance can be used to access functions of ``numpy.random``
>>> rng.standard_normal()
-0.203  # random
occuspytial.gibbs.parallel.sample_parallel(class_, **kwargs)[source]

Perform MCMC sampling in parallel.

Parameters
class_object

Sampler instance that implements sample, step and _run methods.

**kwargs

Keyword arguments to pass to the sampler’s sample / _run methods.

Returns
outList[PosteriorParameter]

Posterior samples.
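
The general pattern of running independent chains concurrently can be sketched with the standard library. This is not the implementation of sample_parallel: `run_chain` and `sample_chains` are hypothetical names, the "sampler" just draws normal variates, and threads are used only to keep the sketch short (a CPU-bound sampler would normally need processes to gain real parallelism).

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_chain(seed, size):
    """Hypothetical stand-in for one sampler run; returns `size` draws."""
    rng = random.Random(seed)  # each chain gets its own generator
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def sample_chains(n_chains, size, base_seed=0):
    """Run n_chains independent chains concurrently and collect the results."""
    seeds = [base_seed + i for i in range(n_chains)]  # distinct seed per chain
    with ThreadPoolExecutor(max_workers=n_chains) as pool:
        # map preserves the order of the seeds in the returned list.
        return list(pool.map(run_chain, seeds, [size] * n_chains))


chains = sample_chains(n_chains=3, size=5)
print(len(chains), len(chains[0]))  # 3 5
```

Giving every chain a distinct seed keeps the chains independent while leaving the whole run reproducible.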

class occuspytial.data.Data

Container for Detection data.

This class is most useful for storing the conditional detection covariate data (W) and detection-survey data (y) in a more accessible form that allows convenient access to the data of multiple sites at once.

Parameters
dataDict[int, np.ndarray]

Dictionary of site data. The key should be the site numbers and value the relevant site data array (detection data or its design matrix).

Attributes
surveyedList[int]

Indices of the sites that were surveyed. This is calculated using the keys of data.

Methods

visits(sites)

__getitem__()

Return data of a site.

If sites is a sequence the returned data is a concatenated array of the data in all sites contained in sites along the first axis.

Parameters
sites{int, List[int], Tuple[int]}

Site id/number(s).

Returns
outnp.ndarray

Concatenated data per site provided in sites.

visits(sites)

Return the number of visits per site.

Parameters
sites{int, List[int], Tuple[int]}

Site(s) whose number of visits are to be returned.

Returns
out{Tuple[int], int}

Number of visits per site provided by sites.
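
The access pattern described above can be sketched with a small list-based container. `SiteData` is a hypothetical stand-in, not the Data class itself: plain lists replace NumPy arrays, so list concatenation plays the role of concatenating along the first axis.

```python
class SiteData:
    """List-based sketch of a per-site data container."""

    def __init__(self, data):
        self._data = dict(data)
        self.surveyed = list(self._data)  # site numbers taken from the keys

    def __getitem__(self, sites):
        if isinstance(sites, int):
            return self._data[sites]
        out = []
        for site in sites:
            out.extend(self._data[site])  # stack the sites' rows in order
        return out

    def visits(self, sites):
        if isinstance(sites, int):
            return len(self._data[sites])
        return tuple(len(self._data[site]) for site in sites)


y = SiteData({21: [0, 1, 0, 0], 81: [0, 0, 0, 1, 1, 0]})
print(y[[21, 81]])         # [0, 1, 0, 0, 0, 0, 0, 1, 1, 0]
print(y.visits((21, 81)))  # (4, 6)
```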

class occuspytial.posterior.PosteriorParameter(*chains)[source]

Container to store posterior samples, produce plots and summaries.

This object is returned by samplers so that posterior parameter samples can be easily accessed. It also provides several methods to perform basic inference on the posterior samples.

Parameters
*chains

Instances of Chain.

Attributes
summary

Return summary statistics of posterior parameter samples.

dataarviz.InferenceData

Inference data object.

plot_auto_corr(**kwargs)[source]

See arviz library documentation for a full list of legal parameters.

Parameters
**kwargs

Keyword arguments optionally passed to arviz.plot_autocorr.

Returns
axesmatplotlib axes
plot_density(**kwargs)[source]

See arviz library documentation for a full list of legal parameters.

Parameters
**kwargs

Keyword arguments optionally passed to arviz.plot_posterior.

Returns
axesmatplotlib axes
plot_ess(**kwargs)[source]

See arviz library documentation for a full list of legal parameters.

Parameters
**kwargs

Keyword arguments optionally passed to arviz.plot_ess.

Returns
axesmatplotlib axes
plot_pair(**kwargs)[source]

See arviz library documentation for a full list of legal parameters.

Parameters
**kwargs

Keyword arguments optionally passed to arviz.plot_pair.

Returns
axesmatplotlib axes
plot_trace(**kwargs)[source]

See arviz library documentation for a full list of legal parameters.

Parameters
**kwargs

Keyword arguments optionally passed to arviz.plot_trace.

Returns
axesmatplotlib axes
property summary

Return summary statistics of posterior parameter samples.

Default statistics are: mean, sd, hdi_3%, hdi_97%, mcse_mean, mcse_sd, ess_bulk, ess_tail, and r_hat. r_hat is only computed for traces with 2 or more chains.

Returns
pandas.DataFrame

A dataframe of the summary.

class occuspytial.chain.Chain(params, size)[source]

Container to store parameter chains during sampling.

Parameters
paramsDict[str, int]

Dictionary of parameter metadata. The keys are the parameter names and the values are the number of dimensions of the parameter.

sizeint

Length of the parameter chain.

Attributes
full

Return the full chain as a numpy array.

append(params)[source]

Append new values to the chain.

Parameters
paramsDict[str, Union[float, np.ndarray]]

Dictionary of values to append. Keys are the parameter names.

Raises
ValueError

If the chain is already full (i.e. the number of values per parameter is already equal to the size attribute).

expand(size)[source]

Extend the chain capacity by a specified length.

Parameters
sizeint

Length to extend the chain by.

property full

Return the full chain as a numpy array.

The returned array is a concatenation of the arrays of all parameters. The number of columns is the sum of the parameters’ dimensions. The number of rows may be less than size attribute if the chain is not full.

Returns
np.ndarray

Concatenated parameter chains.
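
The documented behavior of append, expand and full can be sketched with a fixed-capacity container. `ToyChain` is a hypothetical, list-based stand-in written for illustration; it makes no claim about how Chain stores its values internally.

```python
class ToyChain:
    """Fixed-capacity store of parameter draws, one row per append."""

    def __init__(self, params, size):
        self._dims = dict(params)  # parameter name -> number of dimensions
        self.size = size           # current capacity in rows
        self._rows = []

    def append(self, params):
        if len(self._rows) >= self.size:
            raise ValueError("chain is already full")
        row = []
        for name, dim in self._dims.items():
            value = params[name]
            values = list(value) if isinstance(value, (list, tuple)) else [value]
            if len(values) != dim:
                raise ValueError(f"{name!r} must have {dim} value(s)")
            row.extend(values)     # parameters are laid out side by side
        self._rows.append(row)

    def expand(self, size):
        self.size += size          # extend capacity; stored rows are untouched

    @property
    def full(self):
        return [list(row) for row in self._rows]


chain = ToyChain({"alpha": 2, "tau": 1}, size=1)
chain.append({"alpha": [0.1, 0.2], "tau": 1.5})
chain.expand(1)
chain.append({"alpha": [0.3, 0.4], "tau": 1.2})
print(chain.full)  # [[0.1, 0.2, 1.5], [0.3, 0.4, 1.2]]
```

Each row has three columns, the sum of the parameters' dimensions, and the second append only succeeds because expand raised the capacity first.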

class occuspytial.gibbs.state.State[source]

Store parameter variables so they can be accessed as attributes.

class occuspytial.gibbs.state.FixedState[source]

Store parameter variables so they can be accessed as attributes.

Values of variables assigned to an instance of this class cannot be changed. Thus this class should be used for values that remain constant during sampling.
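
One way to obtain this write-once behavior is to override `__setattr__`. The sketch below uses the hypothetical name `ToyFixedState` and assumes that reassignment raises AttributeError; the exception FixedState actually raises is not stated here.

```python
class ToyFixedState:
    """Attribute container whose values can be set once but not changed."""

    def __setattr__(self, name, value):
        if name in self.__dict__:
            # The attribute already exists, so refuse to overwrite it.
            raise AttributeError(f"{name!r} is fixed and cannot be reassigned")
        super().__setattr__(name, value)


fixed = ToyFixedState()
fixed.n = 150          # first assignment is allowed
try:
    fixed.n = 10       # second assignment is rejected
except AttributeError as err:
    print(err)         # 'n' is fixed and cannot be reassigned
```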