mot package

Submodules

mot.configuration module

Contains the runtime configuration of MOT.

This consists of two parts: functions to get the current runtime settings, and configuration actions to update these settings. To set a new configuration, create a new ConfigAction and use it within a context environment using config_context(). Example:

from mot.configuration import RuntimeConfigurationAction, config_context

with config_context(RuntimeConfigurationAction(...)):
    ...
class mot.configuration.CLRuntimeAction(cl_runtime_info)[source]

Bases: mot.configuration.SimpleConfigAction

Set the current configuration to use the information in the given configuration action.

Parameters:cl_runtime_info (CLRuntimeInfo) – the runtime info with the configuration options
class mot.configuration.CLRuntimeInfo(cl_environments=None, compile_flags=None, double_precision=None)[source]

Bases: object

All information necessary for applying operations using OpenCL.

Parameters:
  • cl_environments (list of mot.lib.cl_environments.CLEnvironment) – The list of CL environments used by this routine. If None is given we use the defaults in the current configuration.
  • compile_flags (list) – the list of compile flags to use during analysis.
  • double_precision (boolean) – if we apply the computations in double precision or in single float precision. By default we go for single float precision.
cl_environments
compile_flags

Get all defined compile flags.

double_precision
mot_float_dtype
class mot.configuration.ConfigAction[source]

Bases: object

Defines a configuration action for use in a configuration context.

Subclasses should define apply() and unapply() methods that set and unset the configuration options.

The apply() method needs to remember the state as it was before the action was applied, so that unapply() can restore it.

apply()[source]

Apply the current action to the current runtime configuration.

unapply()[source]

Reset the current configuration to the previous state.
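
For illustration, a minimal sketch of a custom action following this contract, built on this module's compile-flag getter and setter (the class name is hypothetical):

from mot.configuration import ConfigAction, get_compile_flags, set_compile_flags

class CompileFlagsAction(ConfigAction):
    """Hypothetical action that swaps the compile flags for the duration of a context."""

    def __init__(self, compile_flags):
        super().__init__()
        self._compile_flags = compile_flags
        self._old_flags = None

    def apply(self):
        self._old_flags = get_compile_flags()  # remember the previous state
        set_compile_flags(self._compile_flags)

    def unapply(self):
        set_compile_flags(self._old_flags)  # restore the previous state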

class mot.configuration.RuntimeConfigurationAction(cl_environments=None, compile_flags=None, double_precision=None)[source]

Bases: mot.configuration.SimpleConfigAction

Updates the runtime settings.

Parameters:
  • cl_environments (list of CLEnvironment) – the new CL environments we wish to use for future computations
  • compile_flags (list) – the list of compile flags to use during analysis.
  • double_precision (boolean) – if we compute in double precision or not
class mot.configuration.SimpleConfigAction[source]

Bases: mot.configuration.ConfigAction

Defines a default implementation of a configuration action.

This simple config implements a default apply() method that saves the current state and a default unapply() that restores the previous state.

For developers, it is easiest to implement _apply(), so that you do not need to manually store the old configuration.

apply()[source]

Apply the current action to the current runtime configuration.

unapply()[source]

Reset the current configuration to the previous state.
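
As an illustrative sketch, a subclass (hypothetical name) then only needs _apply(); the inherited apply() and unapply() take care of saving and restoring the previous state:

from mot.configuration import SimpleConfigAction, set_compile_flags

class SetCompileFlags(SimpleConfigAction):
    """Hypothetical action that only implements _apply()."""

    def __init__(self, compile_flags):
        super().__init__()
        self._compile_flags = compile_flags

    def _apply(self):
        set_compile_flags(self._compile_flags)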

class mot.configuration.VoidConfigurationAction[source]

Bases: mot.configuration.ConfigAction

Does nothing, useful as a default config action.

mot.configuration.config_context(config_action)[source]

Creates a context in which the config action is applied and unapplies the configuration after execution.

Parameters:config_action (ConfigAction) – the configuration action to use
mot.configuration.get_cl_environments()[source]

Get the current CL environments to use during CL calculations.

Returns:the current list of CL environments.
Return type:list of CLEnvironment
mot.configuration.get_compile_flags()[source]

Get the default compile flags to use in a CL routine.

Returns:the default list of compile flags we wish to use
Return type:list
mot.configuration.set_cl_environments(cl_environments)[source]

Set the current CL environments to the given list.

Please note that this will change the global configuration, i.e. this is a persistent change. If you do not want a persistent state change, consider using config_context() instead.

Parameters:cl_environments (list of CLEnvironment) – the new list of CL environments.
Raises:ValueError – if the list of environments is empty
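
For example, to persistently restrict MOT to the first suitable device environment (using smart_device_selection() from the package root):

from mot import smart_device_selection
from mot.configuration import set_cl_environments

set_cl_environments([smart_device_selection()[0]])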
mot.configuration.set_compile_flags(compile_flags)[source]

Set the current compile flags.

Parameters:compile_flags (list) – the new list of compile flags
mot.configuration.set_default_proposal_update(proposal_update)[source]

Set the default proposal update function to use in sampling.

Please note that this will change the global configuration, i.e. this is a persistent change. If you do not want a persistent state change, consider using config_context() instead.

Parameters:proposal_update (mot.model_building.parameter_functions.proposal_updates.ProposalUpdate) – the new proposal update function to use by default if no specific one is provided.
mot.configuration.set_use_double_precision(double_precision)[source]

Set the default use of double precision.

Parameters:double_precision (boolean) – if we use double precision by default or not
mot.configuration.use_double_precision()[source]

Check if we run the computations in double precision or not.

Returns:if we run the computations in double precision or not
Return type:boolean

mot.mcmc_diagnostics module

This module contains some diagnostic functions to diagnose the performance of MCMC sampling.

The two most important functions are multivariate_ess() and univariate_ess() to calculate the effective sample size of your samples.

class mot.mcmc_diagnostics.BatchMeansMCSE[source]

Bases: mot.mcmc_diagnostics.ComputeMonteCarloStandardError

Computes the Monte Carlo Standard Error using simple batch means.

compute_standard_error(chain, batch_size)[source]

Compute the standard error of the given chain and the given batch size.

Parameters:
  • chain (ndarray) – the chain for which to compute the SE
  • batch_size (int) – batch size or window size to use in the computations
Returns:the Monte Carlo Standard Error
Return type:float
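
As an illustration, a minimal NumPy sketch of the non-overlapping batch means estimator (assuming the chain is truncated to a multiple of the batch size; not necessarily MOT's exact implementation):

import numpy as np

def batch_means_mcse_sketch(chain, batch_size):
    # split the chain into consecutive batches and compute each batch mean
    nmr_batches = len(chain) // batch_size
    batch_means = chain[:nmr_batches * batch_size].reshape(
        nmr_batches, batch_size).mean(axis=1)
    # the batch size times the variance of the batch means estimates sigma^2;
    # the Monte Carlo Standard Error is then sqrt(sigma^2 / n)
    sigma_squared = batch_size * np.var(batch_means, ddof=1)
    return np.sqrt(sigma_squared / len(chain))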

class mot.mcmc_diagnostics.ComputeMonteCarloStandardError[source]

Bases: object

Base class for methods that compute the Monte Carlo Standard Error.

compute_standard_error(chain, batch_size)[source]

Compute the standard error of the given chain and the given batch size.

Parameters:
  • chain (ndarray) – the chain for which to compute the SE
  • batch_size (int) – batch size or window size to use in the computations
Returns:the Monte Carlo Standard Error
Return type:float

class mot.mcmc_diagnostics.CubeRootSingleBatch[source]

Bases: mot.mcmc_diagnostics.MultiVariateESSBatchSizeGenerator, mot.mcmc_diagnostics.UniVariateESSBatchSizeGenerator

Returns \(n^{1/3}\).

get_multivariate_ess_batch_sizes(nmr_params, chain_length)[source]

Get the batch sizes to use for the calculation of the Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\Sigma\)

Parameters:
  • nmr_params (int) – the number of parameters in the samples
  • chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list

get_univariate_ess_batch_sizes(chain_length)[source]

Get the batch sizes to use for the calculation of the univariate Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\sigma\)

Parameters:chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list
class mot.mcmc_diagnostics.LinearSpacedBatchSizes(nmr_batches=200)[source]

Bases: mot.mcmc_diagnostics.MultiVariateESSBatchSizeGenerator

Returns a number of batch sizes from which the ESS algorithm will select the one with the lowest ESS.

This is a conservative choice since the lowest ESS of all batch sizes is chosen.

The batch sizes are generated as linearly spaced values in:

\[\Big[ n^{1/4}, \max(\lfloor n/\max(20,p) \rfloor, \lfloor \sqrt{n} \rfloor) \Big]\]

where \(n\) is the chain length and \(p\) is the number of parameters.

Parameters:nmr_batches (int) – the number of linearly spaced batches we will generate.
get_multivariate_ess_batch_sizes(nmr_params, chain_length)[source]

Get the batch sizes to use for the calculation of the Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\Sigma\)

Parameters:
  • nmr_params (int) – the number of parameters in the samples
  • chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list
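
A sketch of how batch sizes could be generated from that interval (the rounding and deduplication are assumptions, not necessarily MOT's exact code):

import numpy as np

def linear_spaced_batch_sizes_sketch(nmr_params, chain_length, nmr_batches=200):
    b_min = np.floor(chain_length ** (1 / 4.0))
    b_max = max(np.floor(chain_length / max(20, nmr_params)),
                np.floor(np.sqrt(chain_length)))
    batch_sizes = np.linspace(b_min, b_max, nmr_batches).astype(int)
    return np.unique(batch_sizes).tolist()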

class mot.mcmc_diagnostics.MultiVariateESSBatchSizeGenerator[source]

Bases: object

Objects of this class are used as input to the multivariate ESS function.

The multivariate ESS function needs to have at least one batch size to use during the computations. More batch sizes are also possible and the batch size with the lowest ESS is then preferred. Objects of this class implement the logic behind choosing batch sizes.

get_multivariate_ess_batch_sizes(nmr_params, chain_length)[source]

Get the batch sizes to use for the calculation of the Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\Sigma\)

Parameters:
  • nmr_params (int) – the number of parameters in the samples
  • chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list
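
For example, a custom generator that always proposes a single fixed batch size could look like this (the class name is hypothetical):

from mot.mcmc_diagnostics import MultiVariateESSBatchSizeGenerator

class FixedBatchSize(MultiVariateESSBatchSizeGenerator):
    """Hypothetical generator that returns one fixed batch size."""

    def __init__(self, batch_size):
        self._batch_size = batch_size

    def get_multivariate_ess_batch_sizes(self, nmr_params, chain_length):
        return [self._batch_size]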

class mot.mcmc_diagnostics.OverlappingBatchMeansMCSE[source]

Bases: mot.mcmc_diagnostics.ComputeMonteCarloStandardError

Computes the Monte Carlo Standard Error using overlapping batch means.

compute_standard_error(chain, batch_size)[source]

Compute the standard error of the given chain and the given batch size.

Parameters:
  • chain (ndarray) – the chain for which to compute the SE
  • batch_size (int) – batch size or window size to use in the computations
Returns:the Monte Carlo Standard Error
Return type:float

class mot.mcmc_diagnostics.SquareRootSingleBatch[source]

Bases: mot.mcmc_diagnostics.MultiVariateESSBatchSizeGenerator, mot.mcmc_diagnostics.UniVariateESSBatchSizeGenerator

Returns \(\sqrt{n}\).

get_multivariate_ess_batch_sizes(nmr_params, chain_length)[source]

Get the batch sizes to use for the calculation of the Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\Sigma\)

Parameters:
  • nmr_params (int) – the number of parameters in the samples
  • chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list

get_univariate_ess_batch_sizes(chain_length)[source]

Get the batch sizes to use for the calculation of the univariate Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\sigma\)

Parameters:chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list
class mot.mcmc_diagnostics.UniVariateESSBatchSizeGenerator[source]

Bases: object

Objects of this class are used as input to the univariate ESS function that uses the batch means.

The univariate batch means ESS function needs to have at least one batch size to use during the computations. More batch sizes are also possible and the batch size with the lowest ESS is then preferred. Objects of this class implement the logic behind choosing batch sizes.

get_univariate_ess_batch_sizes(chain_length)[source]

Get the batch sizes to use for the calculation of the univariate Effective Sample Size (ESS).

This should return a list of batch sizes that the ESS calculation will use to determine \(\sigma\)

Parameters:chain_length (int) – the length of the chain
Returns:the batches of the given sizes we will test in the ESS calculations
Return type:list
mot.mcmc_diagnostics.estimate_multivariate_ess(samples, batch_size_generator=None, full_output=False)[source]

Compute the multivariate Effective Sample Size of your (single instance set of) samples.

This multivariate ESS is defined in Vats et al. (2016) and is given by:

\[ESS = n \bigg(\frac{|\Lambda|}{|\Sigma|}\bigg)^{1/p}\]

Where \(n\) is the number of samples, \(p\) the number of parameters, \(\Lambda\) is the covariance matrix of the parameters and \(\Sigma\) captures the covariance structure in the target together with the covariance due to correlated samples. \(\Sigma\) is estimated using estimate_multivariate_ess_sigma().

In the case of NaN in any part of the computation the ESS is set to 0.

To compute the multivariate ESS for multiple problems, please use multivariate_ess().

Parameters:
  • samples (ndarray) – a pxn matrix with p parameters and n samples.
  • batch_size_generator (MultiVariateESSBatchSizeGenerator) – the batch size generator, tells us how many batches and of which size we use for estimating the minimum ESS. Defaults to SquareRootSingleBatch
  • full_output (boolean) – set to True to return the estimated \(\Sigma\) and the optimal batch size.
Returns:when full_output is set to True we return a tuple with the estimated multivariate ESS, the estimated \(\Sigma\) matrix and the optimal batch size. When full_output is False (the default) we only return the ESS.
Return type:float or tuple

References

Vats D, Flegal J, Jones G (2016). Multivariate Output Analysis for Markov Chain Monte Carlo. arXiv:1512.07713v2 [math.ST]
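
A brief usage sketch (the random samples are purely illustrative):

import numpy as np
from mot.mcmc_diagnostics import estimate_multivariate_ess

samples = np.random.randn(4, 2000)  # p=4 parameters, n=2000 samples
ess = estimate_multivariate_ess(samples)
ess, sigma, batch_size = estimate_multivariate_ess(samples, full_output=True)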

mot.mcmc_diagnostics.estimate_multivariate_ess_sigma(samples, batch_size)[source]

Calculates the Sigma matrix which is part of the multivariate ESS calculation.

This implementation is based on the Matlab implementation found at: https://github.com/lacerbi/multiESS

The Sigma matrix is defined as:

\[\Sigma = \Lambda + 2 * \sum_{k=1}^{\infty}{Cov(Y_{1}, Y_{1+k})}\]

Where \(Y\) are our samples and \(\Lambda\) is the covariance matrix of the samples.

This implementation computes the \(\Sigma\) matrix using a Batch Mean estimator using the given batch size. The batch size has to be \(1 \le b_n \le n\) and a typical value is either \(\lfloor n^{1/2} \rfloor\) for slow mixing chains or \(\lfloor n^{1/3} \rfloor\) for reasonable mixing chains.

If the length of the chain is longer than the sum of the length of all the batches, this implementation calculates \(\Sigma\) for every offset and returns the average of those offsets.

Parameters:
  • samples (ndarray) – the samples for which we compute the sigma matrix. Expects a (p, n) array with p the number of parameters and n the sample size
  • batch_size (int) – the batch size used in the approximation of the correlation covariance
Returns:a pxp array, with p the number of parameters in the samples.
Return type:ndarray

References

Vats D, Flegal J, Jones G (2016). Multivariate Output Analysis for Markov Chain Monte Carlo. arXiv:1512.07713v2 [math.ST]

mot.mcmc_diagnostics.estimate_univariate_ess_autocorrelation(chain, max_lag=None)[source]

Estimate effective sample size (ESS) using the autocorrelation of the chain.

The ESS is an estimate of the size of an iid sample with the same variance as the current sample. This function implements the ESS as described in Kass et al. (1998) and Robert and Casella (2004; p. 500):

\[ESS(X) = \frac{n}{\tau} = \frac{n}{1 + 2 * \sum_{k=1}^{m}{\rho_{k}}}\]

where \(\rho_{k}\) is estimated as:

\[\hat{\rho}_{k} = \frac{E[(X_{t} - \mu)(X_{t + k} - \mu)]}{\sigma^{2}}\]

References

  • Kass, R. E., Carlin, B. P., Gelman, A., and Neal, R. (1998)
    Markov chain Monte Carlo in practice: A roundtable discussion. The American Statistician, 52, 93–100.
  • Robert, C. P. and Casella, G. (2004) Monte Carlo Statistical Methods. New York: Springer.
  • Geyer, C. J. (1992) Practical Markov chain Monte Carlo. Statistical Science, 7, 473–483.
Parameters:
  • chain (ndarray) – the chain for which to calculate the ESS, assumes a vector of length n samples
  • max_lag (int) – the maximum lag used in the variance calculations. If not given defaults to \(min(n/3, 1000)\).
Returns:the estimated ESS
Return type:float

mot.mcmc_diagnostics.estimate_univariate_ess_standard_error(chain, batch_size_generator=None, compute_method=None)[source]

Compute the univariate ESS using the standard error method.

This computes the ESS using:

\[ESS(X) = n * \frac{\lambda^{2}}{\sigma^{2}}\]

Where \(\lambda\) is the standard deviation of the chain and \(\sigma\) is estimated using the monte carlo standard error (which in turn is, by default, estimated using a batch means estimator).

Parameters:
  • chain (ndarray) – the chain for which to compute the ESS
  • batch_size_generator (UniVariateESSBatchSizeGenerator) – the generator that determines which batch sizes to use; if None a default is used
  • compute_method (ComputeMonteCarloStandardError) – the method used to compute the Monte Carlo standard error; if None a batch means estimator is used
Returns:the estimated ESS
Return type:float
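
This relation can be sketched directly with NumPy and this module's monte_carlo_standard_error(), under the usual identification \(\sigma = \sqrt{n} \cdot MCSE\):

import numpy as np
from mot.mcmc_diagnostics import monte_carlo_standard_error

def univariate_ess_sketch(chain):
    n = len(chain)
    lambda_squared = np.var(chain)  # variance of the chain
    sigma_squared = n * monte_carlo_standard_error(chain) ** 2
    return n * lambda_squared / sigma_squared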

mot.mcmc_diagnostics.get_auto_correlation(chain, lag)[source]

Estimates the auto correlation for the given chain (1d vector) with the given lag.

Given a lag \(k\), the auto correlation coefficient \(\rho_{k}\) is estimated as:

\[\hat{\rho}_{k} = \frac{E[(X_{t} - \mu)(X_{t + k} - \mu)]}{\sigma^{2}}\]

Please note that this equation only works for lags \(k < n\) where \(n\) is the number of samples in the chain.

Parameters:
  • chain (ndarray) – the vector with the samples
  • lag (int) – the lag to use in the autocorrelation computation
Returns:the autocorrelation with the given lag
Return type:float
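
A direct NumPy sketch of this estimator (valid for lags \(k < n\)):

import numpy as np

def auto_correlation_sketch(chain, lag):
    mu = np.mean(chain)
    variance = np.var(chain)
    # covariance between the chain and a lag-shifted copy of itself
    covariance = np.mean((chain[:len(chain) - lag] - mu) * (chain[lag:] - mu))
    return covariance / variance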

mot.mcmc_diagnostics.get_auto_correlation_time(chain, max_lag=None)[source]

Compute the auto correlation time up to the given lag for the given chain (1d vector).

This will halt when the maximum lag \(m\) is reached or when the sum of two consecutive lags for any odd lag is less than or equal to zero.

The auto correlation sum is estimated as:

\[\tau = 1 + 2 * \sum_{k=1}^{m}{\rho_{k}}\]

Where \(\rho_{k}\) is estimated as:

\[\hat{\rho}_{k} = \frac{E[(X_{t} - \mu)(X_{t + k} - \mu)]}{\sigma^{2}}\]
Parameters:
  • chain (ndarray) – the vector with the samples
  • max_lag (int) – the maximum lag to use in the autocorrelation computation. If not given we use: \(min(n/3, 1000)\).
mot.mcmc_diagnostics.minimum_multivariate_ess(nmr_params, alpha=0.05, epsilon=0.05)[source]

Calculate the minimum multivariate Effective Sample Size you will need to obtain the desired precision.

This implements the inequality from Vats et al. (2016):

\[\widehat{ESS} \geq \frac{2^{2/p}\pi}{(p\Gamma(p/2))^{2/p}} \frac{\chi^{2}_{1-\alpha,p}}{\epsilon^{2}}\]

Where \(p\) is the number of free parameters.

Parameters:
  • nmr_params (int) – the number of free parameters in the model
  • alpha (float) – the level of confidence of the confidence region. For example, an alpha of 0.05 means that we want to be in a 95% confidence region.
  • epsilon (float) – the level of precision in our multivariate ESS estimate. An epsilon of 0.05 means that we expect that the Monte Carlo error is 5% of the uncertainty in the target distribution.
Returns:the minimum multivariate Effective Sample Size that one should aim for in MCMC sampling to obtain the desired confidence region with the desired precision.
Return type:float

References

Vats D, Flegal J, Jones G (2016). Multivariate Output Analysis for Markov Chain Monte Carlo. arXiv:1512.07713v2 [math.ST]
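
For example, for a model with 10 free parameters at the default confidence and precision levels:

from mot.mcmc_diagnostics import minimum_multivariate_ess

min_ess = minimum_multivariate_ess(10, alpha=0.05, epsilon=0.05)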

mot.mcmc_diagnostics.monte_carlo_standard_error(chain, batch_size_generator=None, compute_method=None)[source]

Compute Monte Carlo standard errors for the expectations.

This is a convenience function that calls the compute method for each batch size and returns the lowest ESS over the used batch sizes.

Parameters:
  • chain (ndarray) – the chain for which to compute the standard error
  • batch_size_generator (UniVariateESSBatchSizeGenerator) – the generator that determines which batch sizes to use; if None a default is used
  • compute_method (ComputeMonteCarloStandardError) – the method used to compute the standard error per batch size; if None a batch means estimator is used
mot.mcmc_diagnostics.multivariate_ess(samples, batch_size_generator=None)[source]

Estimate the multivariate Effective Sample Size for the samples of every problem.

This essentially applies estimate_multivariate_ess() to every problem.

Parameters:
  • samples (ndarray, dict or generator) – either a matrix of shape (d, p, n) with d problems, p parameters and n samples; a dictionary with, for every parameter, a matrix of shape (d, n); or a generator function that yields sample arrays of shape (p, n).
  • batch_size_generator (MultiVariateESSBatchSizeGenerator) – the batch size generator, tells us how many batches and of which size we use in estimating the minimum ESS.
Returns:the multivariate ESS per problem
Return type:ndarray

mot.mcmc_diagnostics.multivariate_ess_precision(nmr_params, multi_variate_ess, alpha=0.05)[source]

Calculate the precision given your multivariate Effective Sample Size.

Given that you obtained \(ESS\) multivariate effective samples in your estimate you can calculate the precision with which you approximated your desired confidence region.

This implements the inequality from Vats et al. (2016), slightly restructured to give \(\epsilon\) back instead of the minimum ESS.

\[\epsilon = \sqrt{\frac{2^{2/p}\pi}{(p\Gamma(p/2))^{2/p}} \frac{\chi^{2}_{1-\alpha,p}}{\widehat{ESS}}}\]

Where \(p\) is the number of free parameters and ESS is the multivariate ESS from your samples.

Parameters:
  • nmr_params (int) – the number of free parameters in the model
  • multi_variate_ess (int) – the number of iid samples you obtained in your sample results.
  • alpha (float) – the level of confidence of the confidence region. For example, an alpha of 0.05 means that we want to be in a 95% confidence region.
Returns:the precision, \(\epsilon\), with which the desired confidence region was approximated given the obtained multivariate ESS.
Return type:float

References

Vats D, Flegal J, Jones G (2016). Multivariate Output Analysis for Markov Chain Monte Carlo. arXiv:1512.07713v2 [math.ST]
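
A combined usage sketch (the sample array is purely illustrative):

import numpy as np
from mot.mcmc_diagnostics import multivariate_ess, multivariate_ess_precision

samples = np.random.randn(100, 4, 2000)  # d=100 problems, p=4 parameters, n=2000 samples
ess_per_problem = multivariate_ess(samples)
precision = multivariate_ess_precision(4, float(np.min(ess_per_problem)))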

mot.mcmc_diagnostics.univariate_ess(samples, method='standard_error', **kwargs)[source]

Estimate the univariate Effective Sample Size for the samples of every problem.

This computes the ESS using:

\[ESS(X) = n * \frac{\lambda^{2}}{\sigma^{2}}\]

Where \(\lambda\) is the standard deviation of the chain and \(\sigma\) is estimated using the monte carlo standard error (which in turn is, by default, estimated using a batch means estimator).

Parameters:
  • samples (ndarray, dict or generator) – either a matrix of shape (d, p, n) with d problems, p parameters and n samples; a dictionary with, for every parameter, a matrix of shape (d, n); or a generator function that yields sample arrays of shape (p, n).
  • method (str) – one of ‘autocorrelation’ or ‘standard_error’, defaults to ‘standard_error’. If ‘autocorrelation’ is chosen we apply estimate_univariate_ess_autocorrelation(), if ‘standard_error’ is chosen we apply estimate_univariate_ess_standard_error().
  • **kwargs – passed to the chosen compute method
Returns:a matrix of size (d, p) with an ESS for every problem and every parameter.
Return type:ndarray

References

  • Flegal, J.M., Haran, M., and Jones, G.L. (2008). “Markov chain Monte Carlo: Can We Trust the Third Significant Figure?”. Statistical Science, 23, p. 250-260.
  • Marc S. Meketon and Bruce Schmeiser. 1984. Overlapping batch means: something for nothing?. In Proceedings of the 16th conference on Winter simulation (WSC ‘84), Sallie Sheppard (Ed.). IEEE Press, Piscataway, NJ, USA, 226-230.
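
A brief usage sketch (the sample array is purely illustrative):

import numpy as np
from mot.mcmc_diagnostics import univariate_ess

samples = np.random.randn(100, 4, 2000)  # d=100 problems, p=4 parameters, n=2000 samples
ess = univariate_ess(samples, method='standard_error')  # shape (100, 4)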

mot.random module

This uses the random123 library for generating multiple lists of random numbers.

From the Random123 documentation:

Unlike conventional RNGs, counter-based RNGs are stateless functions (or function classes i.e. functors) whose arguments are a counter and a key, and returns a result of the same type as the counter.

result = CBRNGname(counter, key)

The result is produced by a deterministic function of the key and counter, i.e. a unique (counter, key) tuple will always produce the same result. The result is highly sensitive to small changes in the inputs, so that the sequence of values produced by simply incrementing the counter (or key) is effectively indistinguishable from a sequence of samples of a uniformly distributed random variable.

All the Random123 generators are counter-based RNGs that use integer multiplication, xor and permutation of W-bit words to scramble its N-word input key.

In this implementation we generate a counter and key automatically from a single seed.

mot.random.normal(nmr_distributions, nmr_samples, mean=0, std=1, ctype='float', seed=None)[source]

Draw random samples from the Gaussian distribution.

Parameters:
  • nmr_distributions (int) – the number of unique distributions to create
  • nmr_samples (int) – The number of samples to draw
  • mean (float or ndarray) – The mean of the distribution
  • std (float or ndarray) – The standard deviation of the distribution
  • ctype (str) – the C type of the output samples
  • seed (float) – the seed for the RNG
Returns:A two dimensional numpy array of shape (nmr_distributions, nmr_samples).
Return type:ndarray

mot.random.uniform(nmr_distributions, nmr_samples, low=0, high=1, ctype='float', seed=None)[source]

Draw random samples from the Uniform distribution.

Parameters:
  • nmr_distributions (int) – the number of unique distributions to create
  • nmr_samples (int) – The number of samples to draw
  • low (double) – The minimum value of the random numbers
  • high (double) – The maximum value of the random numbers
  • ctype (str) – the C type of the output samples
  • seed (float) – the seed for the RNG
Returns:A two dimensional numpy array of shape (nmr_distributions, nmr_samples).
Return type:ndarray
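
A brief usage sketch of both generators:

from mot.random import normal, uniform

# 10 independent streams of 1000 samples each; both results have shape (10, 1000)
gaussian_samples = normal(10, 1000, mean=0, std=1, seed=1)
uniform_samples = uniform(10, 1000, low=0, high=1, seed=1)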

mot.stats module

mot.stats.deviance_information_criterions(mean_posterior_lls, ll_per_sample)[source]

Calculates the Deviance Information Criteria (DIC) using three methods.

This returns a dictionary containing the DIC_2002, the DIC_2004 and the DIC_Ando_2011 methods. The first is based on Spiegelhalter et al. (2002), the second on Gelman et al. (2004) and the last on Ando (2011). The methods differ in how they calculate the model complexity, i.e. the effective number of parameters in the model. In all cases the model with the smallest DIC is preferred.

All these DIC methods measure fitness using the deviance, which is, for a likelihood \(p(y | \theta)\) defined as:

\[D(\theta) = -2\log p(y|\theta)\]

From this, the posterior mean deviance,

\[\bar{D} = \mathbb{E}_{\theta}[D(\theta)]\]

is then used as a measure of how well the model fits the data.

The complexity, or measure of effective number of parameters, can be measured in several ways; see Spiegelhalter et al. (2002), Gelman et al. (2004) and Ando (2011). The first method calculates the parameter deviance as:

\begin{align} p_{D} &= \mathbb{E}_{\theta}[D(\theta)] - D(\mathbb{E}[\theta]) \\ &= \bar{D} - D(\bar{\theta}) \end{align}

i.e. posterior mean deviance minus the deviance evaluated at the posterior mean of the parameters.

The second method calculates \(p_{D}\) as:

\[p_{D} = p_{V} = \frac{1}{2}\hat{var}(D(\theta))\]

i.e. half the variance of the deviance is used as an estimate of the number of free parameters in the model.

The third method calculates the parameter deviance as:

\[p_{D} = 2 \cdot (\bar{D} - D(\bar{\theta}))\]

That is, twice the complexity of that of the first method.

Finally, the DIC is (for all cases) defined as:

\[DIC = \bar{D} + p_{D}\]
Parameters:
  • mean_posterior_lls (ndarray) – a 1d matrix containing the log likelihood for the average posterior point estimate. That is, the single log likelihood of the average parameters.
  • ll_per_sample (ndarray) – a (d, n) array with for d problems the n log likelihoods. This is the log likelihood per sample.
Returns:a dictionary containing the DIC_2002, the DIC_2004 and the DIC_Ando_2011 information criterion maps.
Return type:dict
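
A minimal NumPy sketch of the DIC_2002 variant described above (assuming ll_per_sample has shape (d, n) and mean_posterior_lls has shape (d,)):

import numpy as np

def dic_2002_sketch(mean_posterior_lls, ll_per_sample):
    mean_deviance = -2 * np.mean(ll_per_sample, axis=1)  # the posterior mean deviance
    deviance_at_mean = -2 * mean_posterior_lls           # deviance at the posterior mean
    p_d = mean_deviance - deviance_at_mean               # effective number of parameters
    return mean_deviance + p_d                           # DIC = mean deviance + p_D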

mot.stats.fit_circular_gaussian(samples, high=3.141592653589793, low=0)[source]

Compute the circular mean for samples in a range

Parameters:
  • samples (ndarray) – a one or two dimensional array. If one dimensional we calculate the fit using all values. If two dimensional, we fit the Gaussian for every set of samples over the first dimension.
  • high (float) – The maximum wrap point
  • low (float) – The minimum wrap point
mot.stats.fit_gaussian(samples, ddof=0)[source]

Calculates the mean and the standard deviation of the given samples.

Parameters:
  • samples (ndarray) – a one or two dimensional array. If one dimensional we calculate the fit using all values. If two dimensional, we fit the Gaussian for every set of samples over the first dimension.
  • ddof (int) – the delta degrees of freedom used in the std calculation; see the numpy.std documentation.
mot.stats.fit_truncated_gaussian(samples, lower_bounds, upper_bounds)[source]

Fits a truncated gaussian distribution on the given samples.

This will do a maximum likelihood estimation of a truncated Gaussian on the provided samples, with the truncation points given by the lower and upper bounds.

Parameters:
  • samples (ndarray) – a one or two dimensional array. If one dimensional we fit the truncated Gaussian on all values. If two dimensional, we calculate the truncated Gaussian for every set of samples over the first dimension.
  • lower_bounds (ndarray or float) – the lower bound, either a scalar or a lower bound per problem (first index of samples)
  • upper_bounds (ndarray or float) – the upper bound, either a scalar or an upper bound per problem (first index of samples)
Returns:

the mean and std of the fitted truncated Gaussian

Return type:

mean, std

mot.stats.gaussian_overlapping_coefficient(means_0, stds_0, means_1, stds_1, lower=None, upper=None)[source]

Compute the overlapping coefficient of two Gaussian distributions.

This computes the integral \(\int_{-\infty}^{\infty}{\min(f(x), g(x))\,dx}\) where \(f \sim \mathcal{N}(\mu_0, \sigma_0^{2})\) and \(g \sim \mathcal{N}(\mu_1, \sigma_1^{2})\) are normally distributed variables.

This will compute the overlap for each element in the first dimension.

Parameters:
  • means_0 (ndarray) – the set of means of the first distribution
  • stds_0 (ndarray) – the set of stds of the first distribution
  • means_1 (ndarray) – the set of means of the second distribution
  • stds_1 (ndarray) – the set of stds of the second distribution
  • lower (float) – the lower limit of the integration. If not set we set it to -inf.
  • upper (float) – the upper limit of the integration. If not set we set it to +inf.

Module contents

mot.smart_device_selection()[source]

Get a list of device environments suitable for use in MOT.

Returns:List with the CL device environments.
Return type:list of CLEnvironment