3.1.4. Modules for analysis¶

3.1.4.1. Module hypotest¶

3.1.4.1.1. Description¶

Quantitative statistics on multi-grouped data. Building a proper hypothesis test and quantitative analysis requires some basic knowledge of mathematical statistics.

The hypothesis test module of informatics lives mainly in the namespace info.toolbox.libs.hypotest. For convenience, importing from the main entry (for example, from info.me import hypotest) is also supported.

The prefix hypoi denotes tests that require independent data populations, so the sizes of the populations need not be identical. The prefix hypoj denotes tests on joint pairs, which generally require the two samples to be intrinsically paired and therefore of the same size. The prefix hypos denotes simulation methods that use random sampling.
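The distinction between the prefixes can be illustrated with a minimal sketch in plain Python. It only enumerates the pairs of groups that a pair-wise test would visit and checks which of them could be treated as paired samples; the helper names `pairwise_groups` and `can_be_paired` and the toy data are hypothetical and are not part of hypotest.

```python
from itertools import combinations

# Hypothetical toy data: group names mapped to observations.
data = {
    "group1": [0.1, 0.4, 0.3],
    "group2": [0.5, 0.2, 0.9, 0.7],
    "group3": [0.6, 0.8, 0.1],
}

def pairwise_groups(groups):
    """Return every unordered pair of group names."""
    return list(combinations(groups, 2))

def can_be_paired(groups, a, b):
    """Paired (hypoj-style) comparisons need samples of equal length."""
    return len(groups[a]) == len(groups[b])

pairs = pairwise_groups(data)
pairable = [(a, b) for a, b in pairs if can_be_paired(data, a, b)]
print(pairs)     # all three pairs are valid for independent tests
print(pairable)  # only equally sized groups qualify for paired tests
```

Equal length is a necessary condition only; whether two samples are intrinsically paired depends on how the data were collected.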

`hypoi_f`	perform one-way ANOVA test among multi-grouped data.
`hypoi_t`	perform pair-wise independent T test among multi-grouped data.
`hypoi_sw`	perform Shapiro-Wilk test on each group among multi-grouped data.
`hypoi_normality`	perform Omnibus Normality test on each group among multi-grouped data.
`hypoi_ks`	perform Kolmogorov-Smirnov test among multi-grouped data.
`hypoi_cvm`	perform Cramér-von Mises test among multi-grouped data.
`hypoi_ag`	perform Alexander Govern test among multi-grouped data.
`hypoi_thsd`	perform Tukey's range test among multi-grouped data.
`hypoi_kw`	perform Kruskal-Wallis H-test among multi-grouped data.
`hypoi_mood`	perform Mood's median and scale test among multi-grouped data.
`hypoi_bartlett`	perform Bartlett's test among multi-grouped data.
`hypoi_levene`	perform Levene test among multi-grouped data.
`hypoi_fk`	perform Fligner-Killeen test among multi-grouped data.
`hypoi_ad`	perform Anderson-Darling test among multi-grouped data.
`hypoi_rank`	perform rank sum test among multi-grouped data.
`hypoi_es`	perform Epps-Singleton test on each possible pair among multi-grouped data.
`hypoi_u`	perform Mann–Whitney U test on each possible pair among multi-grouped data.
`hypoi_bm`	perform Brunner-Munzel test on each possible pair among multi-grouped data.
`hypoi_ab`	perform Ansari-Bradley test on each possible pair among multi-grouped data.
`hypoi_skew`	perform skew test on each group among multi-grouped data.
`hypoi_kurtosis`	perform kurtosis test on each group among multi-grouped data.
`hypoi_jb`	perform Jarque-Bera test on each group among multi-grouped data.
`hypoi_pd`	perform Cressie-Read power divergence test on each group among multi-grouped data.
`hypoi_chi2`	perform Chi-Squared test on each group among multi-grouped data.
`hypoj_pearson`	compute Pearson correlation coefficient on each possible pair among multi-grouped data.
`hypoj_spearman`	compute Spearman correlation coefficient on each possible pair among multi-grouped data.
`hypoj_kendall`	compute Kendall's tau correlation coefficient on each possible pair among multi-grouped data.
`hypoj_t`	perform pair-wise related T test among multi-grouped data.
`hypoj_rank`	perform single-rank test among multi-grouped data.
`hypoj_friedman`	perform Friedman test among multi-grouped data.
`hypoj_mgc`	perform Multiscale Graph Correlation test on each possible pair among multi-grouped high-dimensional data.
`hypos_mc`	perform Monte Carlo hypothesis test on each group among multi-grouped data.
`hypos_permu`	perform Permutation test on each possible permutation of groups among multi-grouped data.

3.1.4.1.2. Docstrings¶

hypoi_f¶

perform one-way ANOVA test among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Returns:: F statistic and \(p\)-value
Return type:: dict

Examples:

Code 3.83 one-way ANOVA on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_f(data=data)
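
For orientation, the F statistic of a one-way ANOVA can be computed by hand. The following is a minimal sketch of the textbook formula in plain Python, not the implementation used by hypoi_f; the helper `one_way_f` and the toy data are hypothetical.

```python
from statistics import fmean

def one_way_f(groups):
    """F = between-group mean square / within-group mean square."""
    samples = list(groups.values())
    k = len(samples)                          # number of groups
    n_total = sum(len(s) for s in samples)    # number of observations
    grand_mean = fmean([x for s in samples for x in s])
    ss_between = sum(len(s) * (fmean(s) - grand_mean) ** 2 for s in samples)
    ss_within = sum((x - fmean(s)) ** 2 for s in samples for x in s)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical toy data with group means 2, 3 and 4.
toy = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 4, 5]}
f_stat = one_way_f(toy)
print(f_stat)
```

The \(p\)-value then follows from the F distribution with \(k-1\) and \(N-k\) degrees of freedom, which this sketch does not evaluate.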

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_t¶

perform pair-wise independent T test among multi-grouped data. statistic uses Equation 4.6.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
equal_var (bool) – trigger to determine whether the groups under comparison are assumed to have identical variance; False as default
trim (float) – fraction to trim data from two-tails (Yuen’s T test); valid value ranges from 0.0 to 0.5; 0.0 as default
permutations (Optional[int]) – \(\mathbb{N}\), number of permutations used for calculating numerical solution for \(p\)-value; 0 or None for analytical solution using t distribution without permutations; None as default
random_state (Optional[int]) – random state in Monte Carlo; effective when permutations is activated; None as default
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

t statistics and \(p\)-values on pair-wised groups

Return type:

dict

Examples:

Code 3.84 independent t test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_t(data=data)
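
With equal_var left at False, an independent t test is commonly computed in Welch's form. The sketch below assumes that textbook form (Equation 4.6 is not reproduced here) and applies it to every pair of groups. The helper `welch_t`, the tuple keys of the result, and the toy data are hypothetical and do not describe the return layout of hypoi_t.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, variance

def welch_t(x, y):
    """Welch's t: mean difference over the unpooled standard error."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (fmean(x) - fmean(y)) / se

# Hypothetical toy data; group sizes may differ for independent tests.
toy = {"a": [1, 2, 3], "b": [2, 3, 4], "c": [3, 5, 7, 9]}

# One statistic per unordered pair of groups.
t_stats = {(p, q): welch_t(toy[p], toy[q]) for p, q in combinations(toy, 2)}
for pair, t in t_stats.items():
    print(pair, round(t, 4))
```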

See also:

hypoj_t

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_sw¶

perform Shapiro-Wilk test on each group among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Returns:: shapiro statistic and \(p\)-value via Monte Carlo simulation
Return type:: dict

Examples:

Code 3.85 Shapiro-Wilk test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_sw(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_normality¶

perform Omnibus Normality test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'omit' as default

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.86 omnibus test for normality on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_normality(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_ks¶

perform Kolmogorov-Smirnov test among multi-grouped data. The test is computed on each group, as well as on each pair of groups.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
dist (Union[dist, list[dist]]) – distribution pre-defined as criterion (or criteria); rv_frozen object, or list of those objects in scipy; the standard uni-variate gaussian scipy.stats.norm(loc=0, scale=1) as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default
method (Literal['exact', 'asymp', 'auto']) – the method to calculate \(p\)-value; 'exact' uses the exact distribution of the test statistic; 'asymp' uses the asymptotic distribution; 'auto' selects one of the above options; 'auto' as default
n_sample (int) – number of samples generated from pre-defined distribution(s); 20 as default

Returns:

statistic and \(p\)-value on each group, and pair-wised groups

Return type:

dict

Examples:

Code 3.87 Kolmogorov-Smirnov test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_ks(data=data)
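
The one-sample part of the test compares each group with the reference distribution given by dist. A minimal sketch of that statistic against the standard normal, in plain Python, looks as follows; `std_normal_cdf` and `ks_statistic` are hypothetical helpers, and the \(p\)-value computation is left out.

```python
from math import erf, sqrt

def std_normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def ks_statistic(sample, cdf=std_normal_cdf):
    """Largest gap between the empirical CDF and the reference CDF."""
    xs = sorted(sample)
    n = len(xs)
    # Gap just after each jump of the empirical CDF ...
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # ... and just before it.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)

print(ks_statistic([0.1, 0.5, 0.9]))
```

Samples drawn from the uniform distribution on [0, 1), as in the example above, lie entirely to the right of zero, so their statistic against the standard normal is at least 0.5.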

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_cvm¶

perform Cramér-von Mises test among multi-grouped data. The test is computed on each group, as well as on each pair of groups.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
dist (Union[dist, list[dist]]) – distribution pre-defined as criterion (or criteria); rv_frozen object, or list of those objects in scipy; the standard uni-variate gaussian scipy.stats.norm(loc=0, scale=1) as default
method (Literal['exact', 'asymp', 'auto']) – the method to calculate \(p\)-value; 'exact' uses the exact distribution of the test statistic; 'asymp' uses the asymptotic distribution; 'auto' selects one of the above options; 'auto' as default

Returns:

statistic and \(p\)-value on each group, and pair-wised groups

Return type:

dict

Examples:

Code 3.88 Cramér-von Mises test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_cvm(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_ag¶

perform Alexander Govern test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.89 Alexander Govern test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_ag(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_thsd¶

perform Tukey’s range test among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Variables:: ~full_return (bool) – if True, low and high of confidence interval will be returned as extra information as well; False as default
Returns:: statistic and \(p\)-value on pair-wised groups
Return type:: dict

Examples:

Code 3.90 Tukey’s range test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_thsd(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_kw¶

perform Kruskal-Wallis H-test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.91 Kruskal-Wallis H-test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_kw(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_mood¶

perform Mood’s median and scale test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
ties (Literal['below', 'above', 'ignore']) – determines how values equal to the grand median are classified; 'below' and 'above' counts for below and above respectively; 'ignore' will not count; 'below' as default
power_lambda (float) – number used for power divergence; 1.0 as default for Pearson’s chi-squared statistic
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Variables:

~full_return (bool) – if True, median and contingency table will be returned as extra information from median test as well; False as default

Returns:

statistics and \(p\)-values for median and scale tests

Return type:

dict

Examples:

Code 3.92 Mood’s median and scale test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_mood(data=data)
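
The effect of ties on the median test can be seen in the contingency table it is built on. The sketch below covers only that table (not the scale test or the \(p\)-value); `median_table` and the toy data are hypothetical, written in plain Python.

```python
from statistics import median

def median_table(groups, ties="below"):
    """Count observations above and below the grand median per group."""
    if ties not in ("below", "above", "ignore"):
        raise ValueError("ties must be 'below', 'above' or 'ignore'")
    grand = median([x for s in groups.values() for x in s])
    above, below = [], []
    for sample in groups.values():
        hi = sum(1 for x in sample if x > grand)
        lo = sum(1 for x in sample if x < grand)
        tied = len(sample) - hi - lo   # values equal to the grand median
        if ties == "below":
            lo += tied
        elif ties == "above":
            hi += tied
        # ties == "ignore": tied values are left out of both rows
        above.append(hi)
        below.append(lo)
    return grand, [above, below]

# Hypothetical toy data; the value 3 ties with the grand median.
toy = {"a": [1, 2, 3], "b": [3, 4, 5]}
print(median_table(toy, ties="below"))
print(median_table(toy, ties="ignore"))
```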

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_bartlett¶

perform Bartlett’s test among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Returns:: statistic and \(p\)-value
Return type:: dict

Examples:

Code 3.93 Bartlett’s test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_bartlett(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_levene¶

perform Levene test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
center (Literal['mean', 'median', 'trimmed']) – the reference center used to determine the absolute distance of each observation; 'mean' uses the mean; 'median' uses the median; 'trimmed' uses the mean calculated from trimmed data; 'median' as default
proportiontocut (float) – fraction from leftmost and rightmost to be trimmed; effective when center is 'trimmed'; valid value ranges from 0.0 to 0.5; 0.05 as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.94 Levene test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_levene(data=data)
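
The role of center becomes clearer when the statistic is written out: Levene's test is a one-way ANOVA on the absolute deviations of each observation from its group center. The sketch below uses the median as center, matching the documented default; `levene_w` and the toy data are hypothetical, and the \(p\)-value is not computed.

```python
from statistics import fmean, median

def levene_w(groups, center=median):
    """Levene statistic: ANOVA F on absolute deviations from the center."""
    devs = []
    for sample in groups.values():
        c = center(sample)
        devs.append([abs(x - c) for x in sample])
    k = len(devs)                       # number of groups
    n = sum(len(d) for d in devs)       # number of observations
    grand = fmean([z for d in devs for z in d])
    between = sum(len(d) * (fmean(d) - grand) ** 2 for d in devs)
    within = sum((z - fmean(d)) ** 2 for d in devs for z in d)
    return (n - k) / (k - 1) * between / within

# Hypothetical toy data: same median, different spread.
toy = {"a": [1, 2, 3], "b": [0, 2, 4]}
w_stat = levene_w(toy)
print(w_stat)
```

Passing `center=fmean` instead would give the mean-centered variant of the same statistic.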

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_fk¶

perform Fligner-Killeen test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
center (Literal['mean', 'median', 'trimmed']) – the reference center used to determine the absolute distance of each observation; 'mean' uses the mean; 'median' uses the median; 'trimmed' uses the mean calculated from trimmed data; 'median' as default
proportiontocut (float) – fraction from leftmost and rightmost to be trimmed; effective when center is 'trimmed'; valid value ranges from 0.0 to 0.5; 0.05 as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.95 Fligner-Killeen test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_fk(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_ad¶

perform Anderson-Darling test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
midrank (bool) – type of Anderson-Darling test; True for the midrank test, applicable to continuous and discrete distributions; False for the right side empirical distribution; True as default

Variables:

~full_return (bool) – if True, critical values in different significance levels will be returned as extra information; False as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.96 Anderson-Darling test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_ad(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_rank¶

perform rank sum test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.97 Rank sum test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_rank(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_es¶

perform Epps-Singleton test on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
es_t (tuple[float, float]) – where the characteristic function to be evaluated; (0.4, 0.8) as default

Returns:

statistic and \(p\)-value on each pair of groups

Return type:

dict

Examples:

Code 3.98 Epps-Singleton on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_es(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_u¶

perform Mann–Whitney U test on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
method (Literal['asymptotic', 'exact', 'auto']) – the method to calculate \(p\)-value; 'exact' uses the exact distribution of the test statistic; 'asymptotic' uses an approximate distribution; 'auto' selects one of the above options; 'auto' as default, which chooses 'exact' when the size of one of the samples is no greater than 8 and there are no ties, otherwise 'asymptotic'
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default
u_continuity (bool) – whether to apply continuity correction; effective when method is 'asymptotic'; True as default

Returns:

statistic and \(p\)-value on each pair of groups

Return type:

dict

Examples:

Code 3.99 Mann–Whitney U test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_u(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_bm¶

perform Brunner-Munzel test on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default
bm_dis (Literal['t', 'normal']) – determines whether the \(p\)-value is calculated from the t or the normal distribution; 't' as default

Returns:

statistic and \(p\)-value on each pair of groups

Return type:

dict

Examples:

Code 3.100 Brunner-Munzel test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_bm(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_ab¶

perform Ansari-Bradley test on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value on each pair of groups

Return type:

dict

Examples:

Code 3.101 Ansari-Bradley test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_ab(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_skew¶

perform skew test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.102 skew test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_skew(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_kurtosis¶

perform kurtosis test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.103 kurtosis test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_kurtosis(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_jb¶

perform Jarque-Bera test on each group among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Returns:: statistic and \(p\)-value on each group
Return type:: dict

Examples:

Code 3.104 Jarque-Bera test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_jb(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_pd¶

perform Cressie-Read power divergence test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
f_exp (Iterable[int]) – expected frequencies of all categories; None as default, which assumes equal expected frequencies for all categories
ddf (int) – number to be subtracted from the degrees of freedom; 0 as default, which uses \(k-1\) degrees of freedom, where \(k\) is the number of observed frequencies (categories)
pd_lambda (Numeric) – real-value to determine the power of statistic; 1 as default for Pearson version

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.105 Power divergence on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_pd(data=data)
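
The way pd_lambda shapes the statistic can be sketched directly from the Cressie-Read formula \(\frac{2}{\lambda(\lambda+1)}\sum_i O_i\left[(O_i/E_i)^{\lambda}-1\right]\). The helper `power_divergence` below is a hypothetical plain-Python illustration with made-up frequencies; it skips the limiting cases \(\lambda=0\) and \(\lambda=-1\) and does not compute the \(p\)-value.

```python
def power_divergence(f_obs, f_exp=None, lam=1.0):
    """Cressie-Read power divergence statistic for lam not in {0, -1}."""
    if lam in (0, -1):
        raise ValueError("lam = 0 and lam = -1 are limits not covered here")
    if f_exp is None:
        # Default: equal expected frequencies for all categories.
        f_exp = [sum(f_obs) / len(f_obs)] * len(f_obs)
    total = sum(o * ((o / e) ** lam - 1.0) for o, e in zip(f_obs, f_exp))
    return 2.0 * total / (lam * (lam + 1.0))

observed = [10, 20, 30]   # hypothetical observed frequencies
pearson = power_divergence(observed, lam=1.0)
cressie_read = power_divergence(observed, lam=2.0 / 3.0)
print(pearson, cressie_read)
```

With lam equal to 1 and matching totals, the expression reduces to Pearson's chi-squared statistic \(\sum_i (O_i-E_i)^2/E_i\).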

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoi_chi2¶

perform Chi-Squared test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
f_exp (Iterable[int]) – expected frequencies of all categories; None as default, which assumes equal expected frequencies for all categories

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.106 Chi-Squared test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoi_chi2(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_pearson¶

compute Pearson correlation coefficient on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.107 Pearson correlation coefficient on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoj_pearson(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_spearman¶

compute Spearman correlation coefficient on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.108 Spearman correlation coefficient on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoj_spearman(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_kendall¶

compute Kendall’s tau correlation coefficient on each possible pair among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default
method (Literal['asymptotic', 'exact', 'auto']) – the method to calculate \(p\)-value; 'exact' uses the exact distribution of the test statistic; 'approx' uses twice the single-tailed probability to approximate the two-tailed one; 'asymptotic' uses the asymptotic distribution; 'auto' selects one of the above options; 'auto' as default
kendall_tau (Literal['b', 'c', 'w']) – determines the type of \(\tau\) to be calculated; 'b' uses Kendall’s \(\tau\); 'c' uses Stuart’s \(\tau\); 'w' activates the weighted \(\tau\).
rank (bool) – whether to use the decreasing lexicographical rank; if False, the index of each element is used as its rank; effective when the weighted \(\tau\) is activated; True as default
weigher (Optional[Callable]) – trigger to determine whether weights are used when computing on rank \(r\); an acceptable mapping must convert a positive integer into a weight (e.g. \(f(r) = (1+r)^{-1}\)); None as default to use no weight
additive (bool) – determines how the weights enter the statistic; if True, the weights are added; otherwise they are multiplied; effective when the weighted \(\tau\) is activated; True as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.109 Kendall’s tau correlation coefficient on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoj_kendall(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_t¶

perform pair-wise related T test among multi-grouped data. statistic uses Equation 4.9.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
nan_policy (Literal['propagate', 'raise', 'omit']) – strategy for handling null values contained in data; 'propagate' returns nan; 'raise' throws an exception; 'omit' ignores null values; 'propagate' as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Variables:

~full_return (bool) – if True, degree of freedom will be returned as extra information as well; False as default

Returns:

t statistics and \(p\)-values on pair-wised groups

Return type:

dict

Examples:
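
The statistic of a related (paired) t test can be sketched in plain Python. The sketch assumes the textbook form, the mean of the pair-wise differences divided by its standard error (Equation 4.9 is not reproduced here); `paired_t` and the toy data are hypothetical and not part of hypotest.

```python
from math import sqrt
from statistics import fmean, stdev

def paired_t(x, y):
    """Paired t statistic and its degrees of freedom."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    t = fmean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return t, len(diffs) - 1

# Hypothetical toy data: two measurements on the same four subjects.
before = [1, 2, 3, 4]
after = [0, 2, 1, 3]
t_stat, dof = paired_t(before, after)
print(t_stat, dof)
```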

See also:

hypoi_t

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_rank¶

perform single-rank test among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
zero_method (Literal['wilcox', 'pratt', 'zsplit']) – method for handling pairs with equal values; 'wilcox' discards those cases; 'pratt' includes those cases only in the ranking process; 'zsplit' includes those cases in the ranking process and splits them half-and-half between the positive and negative counts; 'wilcox' as default
correction (bool) – whether to apply continuity correction to adjust the rank statistic if the normal approximation is used; False as default
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default
method (Literal['exact', 'approx', 'auto']) – the method to calculate \(p\)-value; 'exact' uses the exact distribution of the test statistic; 'approx' uses an approximate distribution; 'auto' selects one of the above options; 'auto' as default

Variables:

~full_return (bool) – if True, the \(Z\) statistic will be returned; False as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.111 Single rank test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoj_rank(data=data)
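
How zero_method='wilcox' treats equal pairs can be illustrated with a hand computation of the signed-rank sums. This is a minimal plain-Python sketch of the textbook procedure, with a hypothetical helper `signed_rank_wilcox` and made-up data; it is not the code behind hypoj_rank and does not compute the \(p\)-value.

```python
def signed_rank_wilcox(x, y):
    """Rank sums of positive and negative differences, zeros discarded."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # 'wilcox': drop zeros
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average = (i + 1 + j) / 2      # mean of the tied ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average
        i = j
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical paired data; the second pair is equal and gets discarded.
first = [5, 3, 8, 6, 7]
second = [4, 3, 6, 9, 5]
w_plus, w_minus = signed_rank_wilcox(first, second)
print(w_plus, w_minus)
```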

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_friedman¶

perform Friedman test among multi-grouped data.

Arguments:

Parameters:: data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
Returns:: statistic and \(p\)-value
Return type:: dict

Examples:

Code 3.112 Friedman test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypoj_friedman(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypoj_mgc¶

perform Multiscale Graph Correlation test on each possible pair among multi-grouped high-dimensional data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keywords and corresponding values
distance_criteria (Callable) – criterion to measure the distance between two elements when calculating the distance matrix; lambda x, y: np.linalg.norm(x-y, ord=2, axis=0) as default to calculate the Euclidean distance
n_resamples (int) – number of resampled permutations to calculate \(p\)-value; 1000 as default
random_state (Optional[int]) – random state to control random sample generation; None as default

Variables:

~full_return (bool) – if True, scale map, optimal scales, and random points for null distribution will be returned as extra information as well; False as default

Returns:

statistic and \(p\)-value

Return type:

dict

Examples:

Code 3.113 Multiscale Graph Correlation test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random((10, 5)) for _ in range(3)}

res = ht.hypoj_mgc(data=data)

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypos_mc¶

perform Monte Carlo hypothesis test on each group among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keys and the corresponding data as values
dist (Union[dist, list[dist]]) – predefined distribution(s); standard univariate Gaussian norm(loc=0, scale=1) as default
n_resamples (int) – number of resampled datapoints generated from predefined distribution(s); 9999 as default
agg_statistics (dict[str, Callable]) – dict mapping statistic names to the aggregation functions used to calculate them; {'mean': lambda x: numpy.mean(x)} as default
batch (Optional[int]) – number of samples used for each call of values in agg_statistics; None as default, which is equivalent to n_resamples
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Variables:

~full_return (bool) – if True, random points for null distribution will be returned as extra information as well; False as default

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.114 Monte Carlo hypothesis test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypos_mc(data=data)
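
The idea behind a Monte Carlo hypothesis test can be sketched in plain NumPy. The helper monte_carlo_test below is hypothetical and written only for this illustration; it is not the implementation of hypos_mc, and the two-sided rule (twice the smaller tail, capped at 1) is one common convention assumed here.

```python
import numpy as np

def monte_carlo_test(sample, rvs, statistic, n_resamples=9999,
                     alternative="two-sided", seed=None):
    """Hypothetical helper illustrating a Monte Carlo hypothesis test."""
    rng = np.random.default_rng(seed)
    observed = statistic(sample)
    # statistic of n_resamples synthetic samples drawn under the null hypothesis
    null = np.array([statistic(rvs(rng, len(sample))) for _ in range(n_resamples)])
    # the +1 terms count the observed value itself, so the p-value is never 0
    p_less = (np.sum(null <= observed) + 1) / (n_resamples + 1)
    p_greater = (np.sum(null >= observed) + 1) / (n_resamples + 1)
    if alternative == "less":
        p_value = p_less
    elif alternative == "greater":
        p_value = p_greater
    else:  # 'two-sided'
        p_value = min(1.0, 2 * min(p_less, p_greater))
    return observed, p_value

rng = np.random.default_rng(0)
data = {f"group{i + 1}": rng.random(20) for i in range(3)}
# null distribution: standard Gaussian, matching the documented default of dist
standard_normal = lambda gen, size: gen.normal(loc=0.0, scale=1.0, size=size)

res = {name: monte_carlo_test(values, standard_normal, np.mean, n_resamples=999, seed=1)
       for name, values in data.items()}
```

Each group is tested separately against the same null distribution, which mirrors the "on each group" behavior described above.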

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

hypos_permu¶

perform Permutation test on each possible permutation of groups among multi-grouped data.

Arguments:

Parameters:

data (dict[str, ndarray]) – dict composed of group names as keys and the corresponding data as values
permu_type (Literal['independent', 'samples', 'pairings']) – permutation type; 'samples' and 'pairings' require all data being compared to have the same size; 'independent' assumes all input data are independent; 'independent' as default
n_resamples (int) – number of random permutations used to approximate the null distribution; 9999 as default
binding_groups (int) – number of groups for each call of test; must be an integer greater than or equal to 2; 2 as default
agg_statistics (dict[str, Callable]) – dict mapping statistic names to the aggregation functions used to calculate them; {'std_of_mean': lambda *x: np.std([np.mean(_) for _ in x])} as default
batch (Optional[int]) – number of samples used for each call of values in agg_statistics; None as default, which is equivalent to n_resamples
alternative (Literal['two-sided', 'less', 'greater']) – type of alternative hypothesis \(H_1\); 'two-sided' as default

Variables:

~full_return (bool) – if True, random points for null distribution will be returned as extra information as well; False as default

Returns:

statistic and \(p\)-value on each group

Return type:

dict

Examples:

Code 3.115 Permutation test on multi groups¶

from info.me import hypotest as ht
import numpy as np
data = {f"group{_+1}": np.random.random(20) for _ in range(3)}

res = ht.hypos_permu(data=data)
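
The 'independent' permutation type can be sketched in plain NumPy as follows. The helper permutation_test is hypothetical and is not the implementation of hypos_permu; it only covers the one-sided 'greater' alternative, which suits the default std_of_mean statistic because larger values mean more separated group means.

```python
import numpy as np
from itertools import combinations

def permutation_test(groups, statistic, n_resamples=9999, seed=None):
    """Hypothetical helper: independent-sample permutation test, 'greater' alternative."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate(groups)
    split_at = np.cumsum([len(g) for g in groups])[:-1]
    observed = statistic(*groups)
    null = np.empty(n_resamples)
    for i in range(n_resamples):
        # reassign observations to groups at random, keeping the group sizes
        null[i] = statistic(*np.split(rng.permutation(pooled), split_at))
    p_value = (np.sum(null >= observed) + 1) / (n_resamples + 1)
    return observed, p_value

# default aggregation quoted in the parameter list
std_of_mean = lambda *x: np.std([np.mean(_) for _ in x])

rng = np.random.default_rng(0)
data = {f"group{i + 1}": rng.random(20) for i in range(3)}

# one test per combination of two groups (the documented default of binding_groups)
res = {(a, b): permutation_test([data[a], data[b]], std_of_mean, n_resamples=999, seed=1)
       for a, b in combinations(data, 2)}
```

With three groups and two groups per test, three tests are run; the result is keyed here by the pair of group names, which is a choice made for this sketch only.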

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

3.1.4.2. Module factors¶

3.1.4.2.1. Description¶

The module factors supports scientific experiment design and data exploration, allowing researchers to extract meaningful patterns and relationships from complex datasets. Refer to the supplementary material for its scientific background.

Similarly, import through the main entry info.me is also available (like from info.me import priori_scoring).

`priori_scoring`	priori scoring implementation for multi factors analysis.

3.1.4.2.2. Docstrings¶

priori_scoring¶

priori scoring implementation for multi factors analysis.

Arguments:

Parameters:

data (DataFrame) – table indexed by multi-factor labels, whose columns are unranked
constructor (dict[str, list[str]]) – constructor to parse the factors and corresponding levels in the index of data; a dict using factor names as keys, and lists of level names as the corresponding values
response_dimensions (list[str]) – list of factors expected to affect the final numeric values; the factor selection should follow common sense or expertise in that field
inertia_dimensions (Optional[list[str]]) – list of factors not expected to affect the final numeric values; None as default, which automatically uses the factors in constructor that are not selected in response_dimensions
measure (Optional[Callable]) – the callback aggregation function mapping the rearranged pseudo-tensor to a scalar; None as default, which uses normality combined with ANOVA to measure the extent to which the data depart from the priori hypothesis
empty_value (Optional[Any]) – value to fill the non-existent factor combinations; a customized measure function should be capable of dealing with this value; numpy.nan as default
score_output (Optional[bool]) – whether to export the final scores for all column names; False as default

Returns:

a dict mapping each importance level to the column names (and the corresponding scores, if score_output is True) in that level

Return type:

dict[str, ndarray]

Examples:

Code 3.116 factor analysis automation using priori scoring algorithm¶

from info.me import priori_scoring
from itertools import product
import numpy as np
import pandas as pd

cons = {
    'A': ['a1', 'a2'],
    'B': ['b1', 'b2', 'b3'],
    'C': ['c1', 'c2']
}

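# index labels such as 'a1-b1-c1'; each factor combination is repeated 10 times
index = np.repeat(['-'.join(_) for _ in product(*cons.values())], 10)
where_c1 = np.array(['c1' in _ for _ in index])  # rows at level c1 of factor C
columns = np.array([f"group_{_+1}" for _ in range(20)])
_values = np.random.random((len(index), len(columns)))
# scale rows at level c1 by 10.8 and all other rows by 0.3, so that factor C drives the values
values = np.array([vec * 10.8 if c1 else vec * 0.3 for c1, vec in zip(where_c1, _values)])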

df = pd.DataFrame(values, index=index, columns=columns)
#            group_1   group_2   group_3  ...  group_18  group_19  group_20
# a1-b1-c1  8.330263  0.224121  6.843401  ...  3.152262  9.911961  7.717418
# a1-b1-c1  5.859479  1.535437  4.032080  ...  8.949758  0.506480  6.763901
# ...            ...       ...       ...  ...       ...       ...       ...
# a2-b3-c2  0.205181  0.175918  0.293796  ...  0.020738  0.017385  0.094473
# a2-b3-c2  0.162649  0.077234  0.133392  ...  0.122661  0.200381  0.172522

res = priori_scoring(data=df, constructor=cons, response_dimensions=['C'], score_output=True)
# {'importance_level_0': array([['group_11', 10.583273152581747]]),  # most discriminative
#  'importance_level_1': array([['group_1', 5.543840398683746],
#                               ['group_2', 6.006970191046672],
#                               ['group_3', 4.691317734172809],
#                               ...}
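
To see why 'C' is a sensible response dimension for this data, the index labels can be split back into factor levels and the values compared per level. This is only an illustrative sketch: it is not the measure used by priori_scoring, the single column is hypothetical, and parsing labels on '-' is an assumption taken from how the index is built in the example above.

```python
import numpy as np
from itertools import product

cons = {'A': ['a1', 'a2'], 'B': ['b1', 'b2', 'b3'], 'C': ['c1', 'c2']}
index = np.repeat(['-'.join(_) for _ in product(*cons.values())], 10)

# split every label such as 'a1-b1-c1' back into one level per factor
levels = np.array([label.split('-') for label in index])   # shape (120, 3)
position = {factor: i for i, factor in enumerate(cons)}    # factor name -> column in levels

# one hypothetical data column, scaled by the level of factor C as in the example
rng = np.random.default_rng(0)
is_c1 = levels[:, position['C']] == 'c1'
column = np.where(is_c1, 10.8, 0.3) * rng.random(len(index))

# mean of the column at each level of the response factor C
means = {lv: column[levels[:, position['C']] == lv].mean() for lv in cons['C']}
print(means)
```

The means differ strongly between c1 and c2, whereas grouping the same column by the levels of 'A' or 'B' would give similar means, which is the intuition behind treating those factors as inertia dimensions.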

Logs:: Added in version 0.0.3.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

Authors:: Chen Zhang
Version:: 0.0.5
Created on:: Jun 30, 2023
