Utility Functions & Classes
Several development utilities and other functions not part of the sampling API are included here.
Datasets
Utilities for generating data used for testing and exploration of provided samplers.
-
occuspytial.utils.
rand_precision_mat
(lat_row, lat_col, max_neighbors=8, rho=1)[source]¶ Generate a random spatial precision matrix.
The spatial precision matrix is generated using a rectangular lattice of dimensions lat_row x lat_col, so the matrix has (lat_row x lat_col) rows and the same number of columns.
- Parameters
- lat_rowint
Number of rows of the lattice used to generate the matrix.
- lat_colint
Number of columns of the lattice used to generate the matrix.
- max_neighbors{4, 8}, optional
The maximum number of neighbors for each site. The default is 8.
- rhofloat, optional
The spatial weight parameter. Takes values between 0 and 1, with 0 implying independent random effects and 1 implying strong spatial autocorrelation. Setting the value to 1 is equivalent to generating the Intrinsic Autoregressive Model.
- Returns
- scipy.sparse.coo_matrix
Spatial precision matrix
- Raises
- ValueError
If max_neighbors is any value other than 4 or 8.
Examples
>>> from occuspytial.utils import rand_precision_mat
>>> Q = rand_precision_mat(10, 5)
>>> Q
<50x50 sparse matrix of type '<class 'numpy.int64'>'
    with 364 stored elements in COOrdinate format>
# The matrix can be converted to numpy format using method ``toarray()``
>>> Q.toarray()
array([[ 3, -1,  0, ...,  0,  0,  0],
       [-1,  5, -1, ...,  0,  0,  0],
       [ 0, -1,  5, ...,  0,  0,  0],
       ...,
       [ 0,  0,  0, ...,  5, -1,  0],
       [ 0,  0,  0, ..., -1,  5, -1],
       [ 0,  0,  0, ...,  0, -1,  3]])
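The values in the example output (3 on the diagonal for a corner site, 5 for an edge site, 364 stored elements) are consistent with a matrix of the form Q = D - rho * A, where A is the lattice adjacency matrix and D holds the neighbor counts. The following is a minimal plain-Python sketch under that assumption. The name `lattice_precision`, the row-major site numbering and the way rho enters the matrix are all assumptions made for illustration, not the library's implementation.

```python
def lattice_precision(lat_row, lat_col, max_neighbors=8, rho=1):
    """Hypothetical sketch: dense Q = D - rho * A for a rectangular lattice."""
    if max_neighbors not in (4, 8):
        raise ValueError("max_neighbors must be 4 or 8")
    # Rook moves give 4 neighbors; adding the diagonals gives 8.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if max_neighbors == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    n = lat_row * lat_col
    Q = [[0] * n for _ in range(n)]
    for r in range(lat_row):
        for c in range(lat_col):
            i = r * lat_col + c  # assumed row-major site numbering
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < lat_row and 0 <= cc < lat_col:
                    j = rr * lat_col + cc
                    Q[i][j] = -rho  # off-diagonal: neighbor weight
                    Q[i][i] += 1    # diagonal: number of neighbors
    return Q


Q = lattice_precision(10, 5)
stored = sum(1 for row in Q for value in row if value != 0)
print(Q[0][0], Q[1][1], stored)  # 3 5 364
```

With rho equal to 1 every row of this matrix sums to zero, which is the defining property of the intrinsic autoregressive case mentioned above.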
-
occuspytial.utils.
make_data
(n=150, min_v=None, max_v=None, ns=None, p=3, q=3, tau_range=(0.25, 1.5), max_neighbors=8, random_state=None)[source]¶ Generate random data to use for modelling species occupancy.
- Parameters
- nint, optional
Number of sites. Defaults to 150.
- min_vint, optional
Minimum number of visits per site. If None, the minimum number is set to 2. Defaults to None.
- max_vint, optional
Maximum number of visits per site. If None, the maximum number is set to 10% of n. Defaults to None.
- nsint, optional
Number of surveyed sites out of n. If None, then this parameter is set to 50% of n. Defaults to None.
- pint, optional
Number of covariates to use for species occupancy. Defaults to 3.
- qint, optional
Number of covariates to use for conditional detection. Defaults to 3.
- tau_rangetuple, optional
The range to randomly sample the precision parameter value from. Defaults to (0.25, 1.5).
- max_neighborsint, optional
Maximum number of neighbors per site. Should be one of {4, 8}. Default is 8.
- random_stateint, optional
The seed to use for random number generation. Useful for reproducing generated data. If None then a random seed is chosen. Defaults to None.
- Returns
- Qscipy.sparse.coo_matrix
Spatial precision matrix
- WDict[int, np.ndarray]
Dictionary of detection covariates where the keys are the site numbers of the surveyed sites and the values are arrays containing the design matrix of each corresponding site.
- Xnp.ndarray
Design matrix of species occupancy covariates.
- yDict[int, np.ndarray]
Dictionary of survey data where the keys are the site numbers of the surveyed sites and the values are NumPy arrays of 1’s and 0’s, where 0 indicates “no detection” and 1 indicates “detection”. The length of each array equals the number of visits to the corresponding site.
- alphanp.ndarray
True values of coefficients of detection covariates.
- betanp.ndarray
True values of coefficients of occupancy covariates.
- taunp.ndarray
True value of the precision parameter
- znp.ndarray
True occupancy state for all n sites.
- Raises
- ValueError
When n is less than the default 150 sites. When min_v is less than 1. When max_v is less than 2 or greater than n. When ns is not a positive integer or greater than n.
Examples
>>> from occuspytial.utils import make_data
>>> Q, W, X, y, alpha, beta, tau, z = make_data()
>>> Q
<150x150 sparse matrix of type '<class 'numpy.float64'>'
    with 1144 stored elements in COOrdinate format>
>>> Q.toarray()
array([[ 3., -1.,  0., ...,  0.,  0.,  0.], # random
       [-1.,  5., -1., ...,  0.,  0.,  0.],
       [ 0., -1.,  5., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  5., -1.,  0.],
       [ 0.,  0.,  0., ..., -1.,  5., -1.],
       [ 0.,  0.,  0., ...,  0., -1.,  3.]])
>>> W
{81: array([[ 1. , 1.01334565, 0.93150242], # random
       [ 1. , 0.19276808, -1.71939657],
       [ 1. , 0.23866531, 0.0559545 ],
       [ 1. , 1.36102304, 1.73611887],
       [ 1. , 0.47247886, 0.73410589],
       [ 1. , -1.9018879 , 0.0097963 ]]),
 131: array([[ 1. , 1.67846707, -1.12476746],
       [ 1. , -1.63131532, -1.32216705],
       [ 1. , -1.37431173, -0.79734213],
 ...,
 21: array([[ 1. , 1.6416734 , -1.91642502],
       [ 1. , 0.2256312 , -1.68929118],
       [ 1. , 1.36953093, 1.08758129],
       [ 1. , -1.08029212, 0.40219588]])}
>>> X
array([[ 1. , 0.71582433, 1.76344395],
       [ 1. , 0.8561976 , 1.0520401 ],
       [ 1. , -0.28051247, 0.16809809],
       ...,
       [ 1. , 0.86702262, -1.18225448],
       [ 1. , -0.41346399, -0.9633078 ],
       [ 1. , -0.23182363, 1.69930761]])
>>> y
{15: array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]), # random
 81: array([0, 0, 0, 1, 1, 0]),
 ...,
 21: array([0, 1, 0, 0])}
>>> alpha
array([-1.43291816, -0.87932413, -1.84927642]) # random
>>> beta
array([-0.62084322, -1.09645564, -0.93371374]) # random
>>> tau
1.415532667780688 # random
>>> z
array([0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
       1, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1,
       1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0,
       0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0,
       0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1,
       0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0,
       0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0])
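To make the relationship between W, y, alpha and z more concrete, the sketch below simulates detection histories for a few surveyed sites. It assumes a logit link for detection and that a species can only be detected at an occupied site. Both are common modelling choices but are assumptions here, as is the helper name `simulate_detections`; this is not how make_data is implemented.

```python
import math
import random


def simulate_detections(W, z, alpha, seed=None):
    """Hypothetical sketch: draw 0/1 detections per visit for each surveyed site."""
    rng = random.Random(seed)
    y = {}
    for site, design in W.items():
        history = []
        for row in design:  # one row of covariates per visit
            eta = sum(w * a for w, a in zip(row, alpha))
            p_detect = 1.0 / (1.0 + math.exp(-eta))  # assumed logit link
            # Detection is only possible when the site is occupied (z == 1).
            detected = z[site] == 1 and rng.random() < p_detect
            history.append(int(detected))
        y[site] = history
    return y


# Toy inputs: sites 0 and 2 are surveyed, with 2 and 3 visits respectively.
W = {
    0: [[1.0, 0.2, -1.3], [1.0, 1.1, 0.4]],
    2: [[1.0, -0.5, 0.9], [1.0, 0.3, 0.3], [1.0, 1.7, -0.2]],
}
z = [1, 0, 0]               # true occupancy state of all three sites
alpha = [-1.4, -0.9, -1.8]  # detection coefficients
y = simulate_detections(W, z, alpha, seed=42)
```

As in the documented return values, y and W share the same keys, and the length of each detection history equals the number of rows in that site's design matrix.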
Development Utils
-
occuspytial.utils.
get_generator
(random_state=None)[source]¶ Get an instance of a numpy random number generator object.
This instance uses the SFC64 bit generator, which is the fastest one numpy offers as of version 1.19. This function conveniently instantiates a generator of this kind and should be used in all modules.
- Parameters
- random_state{None, int, array_like[ints], numpy.random.SeedSequence}
A seed to initialize the bit generator. Defaults to None.
- Returns
- numpy.random.Generator
Instance of numpy’s Generator class, which exposes a number of random number generating methods.
Examples
>>> from occuspytial.utils import get_generator
>>> rng = get_generator()
# The instance can be used to access functions of ``numpy.random``
>>> rng.standard_normal()
-0.203 # random
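For readers who want the same kind of generator without the helper, the sketch below builds one directly from numpy's public API (`numpy.random.Generator` wrapping `numpy.random.SFC64`). The function name `make_sfc64_generator` is hypothetical, and whether get_generator does exactly this internally is an assumption.

```python
import numpy as np


def make_sfc64_generator(random_state=None):
    """Hypothetical stand-in: a Generator backed by the SFC64 bit generator."""
    return np.random.Generator(np.random.SFC64(random_state))


# Two generators built from the same seed produce the same stream.
rng_a = make_sfc64_generator(42)
rng_b = make_sfc64_generator(42)
same = rng_a.standard_normal(3).tolist() == rng_b.standard_normal(3).tolist()
print(same)  # True
```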
-
occuspytial.gibbs.parallel.
sample_parallel
(class_, **kwargs)[source]¶ Perform MCMC sampling in parallel.
- Parameters
- class_object
Sampler instance that implements the sample, step and _run methods.
- **kwargs
Keyword arguments to pass to the sampler’s sample / _run methods.
- Returns
- outList[PosteriorParameter]
Posterior samples.
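The general pattern of running several independent chains concurrently can be sketched with the standard library. Everything below is illustrative: `ToySampler` and `run_chains` are hypothetical names, the target is a standard normal sampled by random-walk Metropolis, and a thread pool is used only to keep the example self-contained (CPU-bound samplers would typically use processes). It does not reproduce the internals of sample_parallel.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


class ToySampler:
    """Hypothetical sampler: random-walk Metropolis on a standard normal."""

    def sample(self, size, seed):
        rng = random.Random(seed)
        x, draws = 0.0, []
        for _ in range(size):
            proposal = x + rng.gauss(0.0, 1.0)
            # Log acceptance ratio for a standard normal target density.
            log_ratio = 0.5 * (x * x - proposal * proposal)
            if rng.random() < math.exp(min(0.0, log_ratio)):
                x = proposal
            draws.append(x)
        return draws


def run_chains(sampler, chains=2, size=100, base_seed=0):
    """Run independent chains concurrently, one distinct seed per chain."""
    with ThreadPoolExecutor(max_workers=chains) as pool:
        futures = [
            pool.submit(sampler.sample, size, base_seed + i)
            for i in range(chains)
        ]
        return [future.result() for future in futures]


results = run_chains(ToySampler(), chains=2, size=100)
```

Giving each chain its own seed keeps the chains independent while leaving the whole run reproducible.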
-
class
occuspytial.data.
Data
¶ Container for detection data.
This class is most useful for storing the conditional detection covariate data (W) and detection-survey data (y) in a more accessible way that allows convenient access to the data of multiple sites at once.
- Parameters
- dataDict[int, np.ndarray]
Dictionary of site data. The key should be the site numbers and value the relevant site data array (detection data or its design matrix).
- Attributes
- surveyedList[int]
Indices of the sites that were surveyed. This is calculated using the keys of data.
Methods
visits(sites)
-
__getitem__
()¶ Return data of a site.
If sites is a sequence the returned data is a concatenated array of the data in all sites contained in sites along the first axis.
- Parameters
- sites{int, List[int], Tuple[int]}
Site id/number(s).
- Returns
- outnp.ndarray
Concatenated data per site provided in sites.
-
visits
(sites)¶ Return the number of visits per site.
- Parameters
- sites{int, List[int], Tuple[int]}
Site(s) whose number of visits are to be returned.
- Returns
- out{Tuple[int], int}
Number of visits per site provided by sites.
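The access pattern described for this container can be illustrated with a small stand-in. `SiteData` below is a hypothetical plain-Python class that uses lists instead of numpy arrays; it only mirrors the documented behavior of indexing and visits and is not the library's Data class.

```python
class SiteData:
    """Hypothetical stand-in mirroring the documented Data interface."""

    def __init__(self, data):
        self._data = dict(data)
        self.surveyed = list(self._data)  # site numbers, taken from the keys

    def __getitem__(self, sites):
        if isinstance(sites, int):
            return self._data[sites]
        # For a sequence of sites, concatenate along the first axis.
        out = []
        for site in sites:
            out.extend(self._data[site])
        return out

    def visits(self, sites):
        if isinstance(sites, int):
            return len(self._data[sites])
        return tuple(len(self._data[site]) for site in sites)


y = SiteData({81: [0, 0, 0, 1, 1, 0], 21: [0, 1, 0, 0]})
print(y[[81, 21]])         # [0, 0, 0, 1, 1, 0, 0, 1, 0, 0]
print(y.visits([81, 21]))  # (6, 4)
print(y.visits(21))        # 4
```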
-
class
occuspytial.posterior.
PosteriorParameter
(*chains)[source]¶ Container to store posterior samples, produce plots and summaries.
This object is returned by samplers so that posterior parameter samples can be easily accessed. It also provides several methods to perform basic inference on the posterior samples.
- Parameters
- *chains
Instances of Chain.
- Attributes
summary
Return summary statistics of posterior parameter samples.
- dataarviz.InferenceData
Inference data object.
-
plot_auto_corr
(**kwargs)[source]¶ See arviz library documentation for a full list of legal parameters.
- Parameters
- **kwargs
Keyword arguments optionally passed to arviz.plot_autocorr.
- Returns
- axesmatplotlib axes
-
plot_density
(**kwargs)[source]¶ See arviz library documentation for a full list of legal parameters.
- Parameters
- **kwargs
Keyword arguments optionally passed to arviz.plot_posterior.
- Returns
- axesmatplotlib axes
-
plot_ess
(**kwargs)[source]¶ See arviz library documentation for a full list of legal parameters.
- Parameters
- **kwargs
Keyword arguments optionally passed to arviz.plot_ess.
- Returns
- axesmatplotlib axes
-
plot_pair
(**kwargs)[source]¶ See arviz library documentation for a full list of legal parameters.
- Parameters
- **kwargs
Keyword arguments optionally passed to arviz.plot_pair.
- Returns
- axesmatplotlib axes
-
plot_trace
(**kwargs)[source]¶ See arviz library documentation for a full list of legal parameters.
- Parameters
- **kwargs
Keyword arguments optionally passed to arviz.plot_trace.
- Returns
- axesmatplotlib axes
-
property
summary
¶ Return summary statistics of posterior parameter samples.
Default statistics are: mean, sd, hdi_3%, hdi_97%, mcse_mean, mcse_sd, ess_bulk, ess_tail, and r_hat. r_hat is only computed for traces with 2 or more chains.
- Returns
- pandas.DataFrame
A dataframe of the summary.
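To show why r_hat needs at least two chains, the sketch below computes a mean, a sample standard deviation and the classic (non-split) Gelman-Rubin statistic in plain Python. It is a simplified illustration, assuming chains of equal length; `basic_summary` is a hypothetical helper, and arviz uses a more refined rank-normalized split version, so the values will generally differ from the summary property.

```python
import math
import statistics


def basic_summary(chains):
    """Hypothetical sketch: mean, sd and classic R-hat for one scalar parameter."""
    draws = [x for chain in chains for x in chain]
    summary = {
        "mean": statistics.fmean(draws),
        "sd": statistics.stdev(draws),
    }
    if len(chains) >= 2:  # between-chain variance needs 2 or more chains
        n = len(chains[0])
        within = statistics.fmean([statistics.variance(c) for c in chains])
        chain_means = [statistics.fmean(c) for c in chains]
        between = n * statistics.variance(chain_means)
        var_hat = (n - 1) / n * within + between / n
        summary["r_hat"] = math.sqrt(var_hat / within)
    return summary


two_chains = basic_summary([[0.1, 0.4, 0.2, 0.5], [0.3, 0.2, 0.6, 0.1]])
one_chain = basic_summary([[0.1, 0.4, 0.2, 0.5]])
print("r_hat" in two_chains, "r_hat" in one_chain)  # True False
```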
-
class
occuspytial.chain.
Chain
(params, size)[source]¶ Container to store parameter chains during sampling.
- Parameters
- paramsDict[str, int]
Dictionary of parameter metadata. The keys are the parameter names and the values are the number of dimensions of the parameter.
- sizeint
Length of the parameter chain.
- Attributes
full
Return the full chain as a numpy array.
-
append
(params)[source]¶ Append new values to the chain.
- Parameters
- paramsDict[str, Union[float, np.ndarray]]
Dictionary of values to append. Keys are the parameter names.
- Raises
- ValueError
If the chain is already full (i.e. the number of values per parameter is already equal to the size attribute).
-
expand
(size)[source]¶ Extend the chain capacity by a specified length.
- Parameters
- sizeint
Length to extend the chain by.
-
property
full
¶ Return the full chain as a numpy array.
The returned array is a concatenation of the arrays of all parameters. The number of columns is the sum of the parameters’ dimensions. The number of rows may be less than the size attribute if the chain is not full.
- Returns
- np.ndarray
concatenated parameter chains.
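The interplay of append, expand and full can be sketched with a small fixed-capacity container. `ToyChain` is a hypothetical plain-Python stand-in that stores rows in a list; it follows the documented behavior (capacity check, ValueError when full, one column per parameter dimension) but makes no claim about how Chain stores its data.

```python
class ToyChain:
    """Hypothetical stand-in for a fixed-capacity parameter chain."""

    def __init__(self, params, size):
        self._dims = dict(params)  # parameter name -> number of dimensions
        self.size = size           # current capacity in rows
        self._rows = []

    def append(self, params):
        if len(self._rows) >= self.size:
            raise ValueError("chain is full; call expand() to add capacity")
        row = []
        for name, dim in self._dims.items():
            value = params[name]
            values = list(value) if isinstance(value, (list, tuple)) else [value]
            if len(values) != dim:
                raise ValueError(f"{name} expects {dim} value(s), got {len(values)}")
            row.extend(values)
        self._rows.append(row)

    def expand(self, size):
        self.size += size  # extend the capacity by `size` rows

    @property
    def full(self):
        # One row per appended sample, columns concatenated across parameters.
        return [list(row) for row in self._rows]


chain = ToyChain({"alpha": 2, "tau": 1}, size=2)
chain.append({"alpha": [0.1, 0.2], "tau": 1.5})
chain.append({"alpha": [0.3, 0.4], "tau": 1.2})
try:
    chain.append({"alpha": [0.5, 0.6], "tau": 0.9})
    overflowed = False
except ValueError:
    overflowed = True  # capacity of 2 rows was already reached
chain.expand(1)
chain.append({"alpha": [0.5, 0.6], "tau": 0.9})
```

After the calls above the chain holds three rows of three columns each, two for alpha and one for tau.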