This section collects various statistical tests and tools. Some can be used independently of any model; others are intended as extensions to the models and model results.
API Warning: The functions and objects in this category are spread out in various modules and might still be moved around.
durbin_watson(resids) | Calculates the Durbin-Watson statistic |
jarque_bera(resids) | Calculate residual skewness, kurtosis, and do the JB test for normality |
omni_normtest(resids[, axis]) | Omnibus test for normality |
acorr_ljungbox(x[, lags, boxpierce]) | Ljung-Box test for no autocorrelation |
acorr_breush_godfrey(results[, nlags, store]) | Breusch-Godfrey Lagrange Multiplier tests for residual autocorrelation |
HetGoldfeldQuandt | test whether variance is the same in 2 subsamples |
het_goldfeldquandt | see class docstring |
het_breushpagan(resid, exog_het) | Breusch-Pagan Lagrange Multiplier test for heteroscedasticity |
het_white(resid, exog[, retres]) | White’s Lagrange Multiplier Test for Heteroscedasticity |
het_arch(resid[, maxlag, autolag, store, ddof]) | Engle’s Test for Autoregressive Conditional Heteroscedasticity (ARCH) |
linear_harvey_collier(res) | Harvey Collier test for linearity |
linear_rainbow(res[, frac]) | Rainbow test for linearity |
linear_lm(resid, exog[, func]) | Lagrange multiplier test for linearity against functional alternative |
breaks_cusumolsresid(olsresidual[, ddof]) | cusum test for parameter stability based on ols residuals |
breaks_hansen(olsresults) | test for model stability, breaks in parameters for ols, Hansen 1992 |
recursive_olsresiduals(olsresults[, skip, ...]) | calculate recursive ols with residuals and cusum test statistic |
CompareCox | Cox Test for non-nested models |
compare_cox | Cox Test for non-nested models |
CompareJ | J-Test for comparing non-nested models |
compare_j | J-Test for comparing non-nested models |
unitroot_adf(x[, maxlag, trendorder, ...]) | |
normal_ad(x[, axis]) | Anderson-Darling test for normal distribution unknown mean and variance |
kstest_normal(x[, pvalmethod]) | Lilliefors test for normality |
lillifors(x[, pvalmethod]) | Lilliefors test for normality |
OLSInfluence(results) | class to calculate outlier and influence measures for OLS result |
variance_inflation_factor(exog, exog_idx) | variance inflation factor, VIF, for one exogenous variable |
See also the notes on regression diagnostics.
Some tests for goodness of fit for univariate distributions
powerdiscrepancy(observed, expected[, ...]) | Calculates power discrepancy, a class of goodness-of-fit tests as a measure of discrepancy between observed and expected data. |
gof_chisquare_discrete(distfn, arg, rvs, ...) | perform chisquare test for random sample of a discrete distribution |
gof_binning_discrete(rvs, distfn, arg[, nsupp]) | get bins for chisquare type gof tests for a discrete distribution |
normal_ad(x[, axis]) | Anderson-Darling test for normal distribution unknown mean and variance |
kstest_normal(x[, pvalmethod]) | Lilliefors test for normality |
lillifors(x[, pvalmethod]) | Lilliefors test for normality |
mcnemar(x, y[, exact, correction]) | McNemar test |
median_test_ksample(x, groups) | chisquare test for equality of median/location |
runstest_1samp(x[, cutoff]) | use runs test on binary discretized data above/below cutoff |
runstest_2samp(x[, y, groups]) | Wald-Wolfowitz runstest for two samples |
cochran_q(x) | Cochran’s Q test for identical effect of k treatments |
Runs(x) | class for runs in a binary sequence |
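As an illustration of the kind of test listed above, the following sketch computes one common form of the exact McNemar test from the two discordant cell counts of a 2x2 paired table. The function name `mcnemar_exact_sketch` and its arguments `n12` and `n21` are hypothetical; the code assumes a two-sided binomial test with success probability 0.5 and is not taken from the library.

```python
from math import comb


def mcnemar_exact_sketch(n12, n21):
    """Two-sided exact test on the discordant counts of a paired 2x2 table."""
    n = n12 + n21                 # total number of discordant pairs
    k = min(n12, n21)             # test statistic: the smaller count
    # Binomial(n, 0.5) lower tail P(X <= k)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided p-value, capped at 1
    return k, min(1.0, 2 * lower_tail)


stat, pvalue = mcnemar_exact_sketch(1, 5)
print(stat, pvalue)  # 1 0.21875
```

Only the discordant pairs enter the calculation; pairs on which both measurements agree carry no information about a difference between them.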
multipletests is a function for p-value correction; it also covers the FDR-based corrections that are available separately in fdrcorrection. tukeyhsd performs simultaneous testing for the comparison of (independent) means. These three functions are verified. GroupsStats and MultiComparison are convenience classes for multiple comparisons similar to one-way ANOVA, but they are still in development.
multipletests(pvals[, alpha, method, ...]) | test results and p-value correction for multiple tests |
fdrcorrection0(pvals[, alpha, method]) | pvalue correction for false discovery rate |
tukeyhsd(mean_all, nobs_all, var_all[, df, ...]) | simultaneous Tukey HSD |
GroupsStats(x[, useranks, uni, intlab]) | statistics by groups (another version) |
MultiComparison(x, groups) | Tests for multiple comparisons |
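The idea behind FDR-based p-value correction can be sketched with the Benjamini-Hochberg step-up procedure. The function name `fdr_bh_sketch` is hypothetical, and the code below is a plain-Python illustration that assumes independent (or positively dependent) tests; it is not the implementation behind multipletests or fdrcorrection0.

```python
def fdr_bh_sketch(pvals, alpha=0.05):
    """Benjamini-Hochberg adjusted p-values and rejection decisions."""
    n = len(pvals)
    # Indices of the p-values in ascending order
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * n / rank)
        adjusted[idx] = running_min
    reject = [p <= alpha for p in adjusted]
    return reject, adjusted


reject, adjusted = fdr_bh_sketch([0.01, 0.04, 0.03, 0.5])
print(reject)  # [True, False, False, False]
```

Results are returned in the order of the input p-values, so they can be matched back to the original hypotheses.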
The following functions are not (yet) public (here for my own benefit, JP)
varcorrection_pairs_unbalanced(nobs_all[, ...]) | correction factor for variance with unequal sample sizes for all pairs |
varcorrection_pairs_unequal(var_all, ...) | return joint variance from samples with unequal variances and unequal |
varcorrection_unbalanced(nobs_all[, srange]) | correction factor for variance with unequal sample sizes |
varcorrection_unequal(var_all, nobs_all, df_all) | return joint variance from samples with unequal variances and unequal |
StepDown(vals, nobs_all, var_all[, df]) | a class for step down methods |
catstack(args) | |
ccols | |
compare_ordered(vals, alpha) | simple ordered sequential comparison of means |
distance_st_range(mean_all, nobs_all, var_all) | pairwise distance matrix, outsourced from tukeyhsd |
ecdf(x) | no frills empirical cdf used in fdrcorrection |
get_tukeyQcrit(k, df[, alpha]) | return critical values for Tukey’s HSD (Q) |
homogeneous_subsets(vals, dcrit) | recursively check all pairs of vals for minimum distance |
line | str(object) -> string |
maxzero(x) | find all up zero crossings and return the index of the highest |
maxzerodown(x) | find all down zero crossings and return the index of the highest |
mcfdr([nrepl, nobs, ntests, ntrue, mu, ...]) | MonteCarlo to test fdrcorrection |
qcrit | str(object) -> string |
randmvn(rho[, size, standardize]) | create random draws from equi-correlated multivariate normal distribution |
rankdata(x) | rankdata, equivalent to scipy.stats.rankdata |
rejectionline(n[, alpha]) | reference line for rejection in multiple tests |
set_partition(ssli) | extract a partition from a list of tuples |
set_remove_subs(ssli) | remove sets that are subsets of another set from a list of tuples |
tiecorrect(xranks) | should be equivalent of scipy.stats.tiecorrect |
CompareMeans(d1, d2) | temporary just to hold formulas |
DescrStatsW(data[, weights, ddof]) | descriptive statistics with weights for simple case |
tstat_generic(value, value2, std_diff, dof, ...) | generic ttest to save typing |
These are utility functions to convert between central and non-central moments, skew, kurtosis and cumulants.
cum2mc(kappa) | convert cumulants to central moments |
mc2mnc(mc) | convert central to non-central moments, uses recursive formula |
mc2mvsk(args) | convert central moments to mean, variance, skew, kurtosis |
mnc2cum(mnc) | convert non-central moments to cumulants |
mnc2mc(mnc[, wmean]) | convert non-central to central moments, uses recursive formula |
mnc2mvsk(args) | convert non-central moments to mean, variance, skew, kurtosis |
mvsk2mc(args) | convert mean, variance, skew, kurtosis to central moments |
mvsk2mnc(args) | convert mean, variance, skew, kurtosis to non-central moments |
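The conversions in the table above follow standard identities between moments. As a minimal sketch for the first four moments only, the hypothetical helpers `noncentral_to_central` and `central_to_mvsk` below go from non-central moments to central moments and then to mean, variance, skew and kurtosis. The sketch assumes that kurtosis means excess kurtosis (normal distribution = 0); the listed functions may use a recursive formula and different conventions.

```python
def noncentral_to_central(mnc):
    """Convert the first four non-central moments to (mean, mc2, mc3, mc4)."""
    m1, m2, m3, m4 = mnc
    mc2 = m2 - m1 ** 2
    mc3 = m3 - 3 * m1 * m2 + 2 * m1 ** 3
    mc4 = m4 - 4 * m1 * m3 + 6 * m1 ** 2 * m2 - 3 * m1 ** 4
    return m1, mc2, mc3, mc4


def central_to_mvsk(mc):
    """Convert (mean, mc2, mc3, mc4) to mean, variance, skew, excess kurtosis."""
    mean, mc2, mc3, mc4 = mc
    skew = mc3 / mc2 ** 1.5
    kurt = mc4 / mc2 ** 2 - 3.0  # assumption: excess kurtosis
    return mean, mc2, skew, kurt


# Non-central moments of a normal distribution with mean 1 and variance 1
central = noncentral_to_central((1.0, 2.0, 4.0, 10.0))
mvsk = central_to_mvsk(central)
print(central)  # (1.0, 1.0, 0.0, 3.0)
print(mvsk)     # (1.0, 1.0, 0.0, 0.0)
```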