3.1.1. Framework utilities¶
3.1.1.1. Description¶
A set of utilities from the informatics framework for agile development. The functions cover attribute registration, docstring attachment, runtime type checking, unit testing, workflow design and building, and interface wrapping. For easy importing, all classes and functions listed here are available from the main entry point info.me.
T | indicator for setting default values, add constraint type for keyword arguments.
F | info lambda function.
FuncTools | decorator collections for multi purposes.
Unit | package a single or multiple data processing or operation steps within a Unit.
TrialDict | dict with trial method.
ExeDict | executable dict composed of execute keyword and function or generic function as value.
SingleMap | tool to make single map function.
traversal_on_params | traversal on parameters pool.
experiments | experiment test pipeline for info function or unit.
functest | unit test pipeline for info function or unit.
There are also meta implementation frameworks for data loading, processing, visualization, analysis, and exporting, as well as code block wrappers for easier development. These functions mainly live in the namespace info.toolbox.libs.operations, and all of them are integrated into info.me as well.
generic_printer | generic printing function to show attributes or methods of data.
generic_logger | generic logger function for saving export from feature extracting functions.
exception_logger | info function or Unit implementation for recording exceptive case.
default_param | dynamic default value setting in function body.
diagnosing_tests | diagnose unit test result then return a list of bool values.
3.1.1.2. Docstrings¶
- class T¶
Indicator for setting default values and adding type constraints for keyword arguments. Used in combination with FuncTools.
- Examples:

    from info.me import T, Null, FuncTools
    from numpy import ndarray
    import numpy as np

    img = np.random.random((40, 40, 40))  # 3-dimensional, to match the mask check below

    @FuncTools.params_setting(
        data=T[Null: ndarray],
        spacing=T[(1, 1, 1): tuple[int, ...]],
        mask=T[None: lambda x: x.dtype == bool and x.ndim == 3 if isinstance(x, ndarray) else True],
        whichever=T[12: int: True])
    def func(**params):
        # compare with None explicitly: the truth value of an ndarray is ambiguous
        mask = params.get('mask') if params.get('mask') is not None else np.ones_like(params.get('data'))
        ...

    func(x=img)  # error detected here: 'data' is Null but not assigned
    func(data=img, spacing=[1, 1, 1])  # error detected here: typing hint is tuple but got list
    func(data=img, spacing=(1, 0.8, 1))  # error detected here: int only based on typing hint
    func(data=img, mask=img)  # error detected here: img.dtype is not bool
    func(data=img)  # correct, using (1, 1, 1) as default spacing, all ones mask
    func(data=img, spacing=(1, 1, 2), mask=np.zeros_like(img, dtype=bool))  # correct as well
    func(data=img, whichever='not_int')  # correct, the trailing True in T skips this check
- Notes:
In info, the built-in value Null is a mnemonic name that marks required argument(s):

    @FuncTools.params_setting(a=T[Null: object], b=T[Null: object], c=T[1: int])
    def func(**params):
        ...

    func.required_args  # {'a', 'b'}

Note that in this case, calling func without assigning a or b raises TypeError.
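The required-argument behaviour described in these notes can be sketched in plain Python. This is only an illustration under stated assumptions, not info's implementation: REQUIRED and with_defaults are hypothetical stand-ins for Null and FuncTools.params_setting, and the (default, type) tuple format is invented for the sketch.

```python
from functools import wraps

REQUIRED = object()  # hypothetical sentinel playing the role of Null


def with_defaults(**spec):
    """Hypothetical decorator: spec maps name -> (default, expected type)."""
    required = {k for k, (default, _) in spec.items() if default is REQUIRED}

    def decorator(func):
        @wraps(func)
        def wrapper(**params):
            missing = required - params.keys()
            if missing:
                raise TypeError(f"missing required argument(s): {sorted(missing)}")
            # start from the declared defaults, then let call-time values win
            merged = {k: d for k, (d, _) in spec.items() if d is not REQUIRED}
            merged.update(params)
            for name, (_, tp) in spec.items():
                if not isinstance(merged[name], tp):
                    raise TypeError(f"{name!r} requires {tp}, but got {type(merged[name])}")
            return func(**merged)

        wrapper.required_args = required
        return wrapper

    return decorator


@with_defaults(a=(REQUIRED, object), b=(REQUIRED, object), c=(1, int))
def func(**params):
    return params['a'], params['b'], params['c']
```

Calling func(a='x', b='y') fills c with its default, while func(a='x') raises TypeError because b was never assigned.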
- See also:
- Logs:
Added in version 0.0.3.
Changed in version 0.0.4: Support new union typing hint like A | B in python 3.10 or later.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class F¶
info lambda function.
- Arguments:
- Parameters:
expr (Callable) – lambda expression with keyword arguments
**kwargs
- Returns:
anonymous function in info version
- Examples:

    from info.me import F, Unit

    f1 = F(lambda **_: _.get('data'))
    f2 = (lambda **_: _.get('data'))
    u1 = Unit(mappings=[f1])  # no problem here
    u2 = Unit(mappings=[f2])  # raises TypeError here, as f2 is not registered
- See also:
- Logs:
Added in version 0.0.3.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class FuncTools¶
decorator collections for multi purposes.
- Methods:
- attach_attr:
static method to add a docstring to a function, apply typing hints, or perform type checking with those hints
- Variables:
docstring (Union[str, Callable]) – replaces the docstring of the decorated function
info_func (bool) – marks whether the decorated function is an info function; if True, entry_tp and return_tp are added to the type-checking flow and the keyword argument 'data' must be included; otherwise they serve as hints only
entry_tp (type) – the data type the decorated function processes; use a Python builtin class or typing hint; when info_func is True, it is taken as the type of the 'data' value and is checked before processing starts
return_tp (type) – return type of the decorated function; can be a Python builtin class or typing hint; when info_func is True, the result is checked before it is actually returned
~deep_inspect (bool) – whether to check deeper when meeting a simple iterable object; True as default
~unknown_tp (Union[bool, list[type]]) – if False, raises UnSupportableTypeError when dealing with an unparseable type; True passes the test for unparseable types; a list of unparseable type(s) adds those types to the checking workflow; False as default
- Raises:
NoDataInflowError – if info_func is True and no keyword assignment for 'data'
TypeError – if info_func is True and the entry type of 'data', or the returned data, does not match the desired type
It can be used to attach a docstring to the decorated function, to check that the incoming data matches the desired type, and to check that the processed data matches the desired return type. For example:

    from info.me import FuncTools

    @FuncTools.attach_attr(docstring='''simple function''')
    def func1(**params): ...

    @FuncTools.attach_attr(docstring=func1)
    def func2(**params): ...

    help(func1)  # simple function
    help(func2)  # simple function

    @FuncTools.attach_attr(entry_tp=int, return_tp=float)
    def func3(**params):
        return float(params.get('x'))

    @FuncTools.attach_attr(entry_tp=int, return_tp=float)
    def func4(**params):
        return params.get('x')

    func3(x=5)  # 5.0
    func4(x=5)  # 5, without error

    @FuncTools.attach_attr(info_func=True, entry_tp=int, return_tp=float)
    def func5(**params):
        return float(params.get('x'))

    @FuncTools.attach_attr(info_func=True, entry_tp=int, return_tp=float)
    def func6(**params):
        return params.get('data')

    @FuncTools.attach_attr(info_func=True, entry_tp=int, return_tp=float)
    def func7(**params):
        return float(params.get('data'))

    func5(x=5)  # NoDataInflowError here, remind for using 'data' keyword
    func6(data='5')  # TypeError here, 'data' requires <class 'int'>, but got <class 'str'>
    func6(data=5)  # TypeError here, return requires <class 'float'>, but got int 5
    func7(data=5)  # correct
This decorator is also handy for attaching customized attributes if necessary:

    @FuncTools.attach_attr(author='Chen', foo=12, bar='sz')
    def func(**params): ...

    func.author  # 'Chen'
    func.foo  # 12
    func.bar  # 'sz'
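The entry and return checks that attach_attr performs when info_func is True can be pictured with a small plain-Python decorator. This is a sketch under assumptions, not the library's code: checked is a hypothetical name, and NoDataInflowError is redefined locally as a stand-in for the exception named above.

```python
from functools import wraps


class NoDataInflowError(Exception):
    """Local stand-in for the exception named in the documentation."""


def checked(entry_tp, return_tp):
    """Hypothetical decorator: check the type of 'data' and of the result."""
    def decorator(func):
        @wraps(func)
        def wrapper(**params):
            if 'data' not in params:
                raise NoDataInflowError("keyword argument 'data' is required")
            if not isinstance(params['data'], entry_tp):
                raise TypeError(f"'data' requires {entry_tp}, but got {type(params['data'])}")
            result = func(**params)
            if not isinstance(result, return_tp):
                raise TypeError(f"return requires {return_tp}, but got {type(result)}")
            return result
        return wrapper
    return decorator


@checked(entry_tp=int, return_tp=float)
def to_float(**params):
    return float(params['data'])


@checked(entry_tp=int, return_tp=float)
def identity(**params):
    return params['data']  # returns an int, so the return check fails
```

Here to_float(data=5) passes both checks, while identity(data=5) fails on the return type, mirroring func7 and func6 in the example above.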
- params_setting:
static method to set default values, apply typing hints, or perform type checking with those hints. More complicated situations can be checked with an anonymous lambda function.
- Variables:
~deep_inspect (bool) – whether to check deeper when meeting a simple iterable object; True as default
~unknown_tp (Union[bool, list[type]]) – if False, raises UnSupportableTypeError when meeting an unparseable type; True passes the test for unparseable types; a list of unparseable type(s) adds those types to the checking workflow; False as default
- Raises:
NoDataInflowError – if info_func is True and no keyword assignment for 'data'
TypeError – if info_func is True and the entry type of 'data', or the returned data, does not match the desired type
The following example shows how to pre-define default parameters for an info function.

    from info.me import T, Null, FuncTools, default_param
    from typing import Optional
    from numpy import ndarray
    import numpy as np

    @FuncTools.params_setting(
        data=T[Null: ndarray],
        clip=T[(0.2, 0.8): tuple[float, float]],
        coefficient=T[0.6: lambda x: 0 <= x <= 1],
        normalize=T[None: Optional[tuple[float, float]]])
    def func(**params):
        dt = params.get('data').clip(*params.get('clip')) * params.get('coefficient')
        _mean, _std = default_param(params, 'normalize', (dt.mean(), dt.std()))
        return (dt - _mean) / _std

    func.required_args  # {'data'}, if calling without assignment for 'data', error raised
This example shows how convenient params_setting is for building a function. For data, Null marks it as a required argument when calling, and ndarray confines its type; clip is treated the same way but has the default value (0.2, 0.8); for coefficient, the acceptable value is no less than zero and no greater than one.

As all conditions are guaranteed, a one-line Pythonic statement can safely obtain dt without worrying about wrong parameters being passed in. Sometimes an argument also needs a dynamic default. For instance, normalize in the code above can hardly be determined using T because it depends on data; in this circumstance, set it to None and use default_param to implement the calculation in the body. If the function is called without assigning normalize, _mean and _std are calculated automatically from dt; otherwise the passed-in values are used.

Additionally, using params_setting to initialize arguments is safe with mutable built-in Python objects. The following example shows this feature:

    from info.me import FuncTools, T

    def py_func(*, x=[]):
        x.append(len(x))
        return x

    @FuncTools.params_setting(x=T[[]: list])
    def info_func(**params):
        (res := params.get('x')).append(len(res))
        return res

    _ = [print(_, id(_)) for _ in [py_func(), py_func(), info_func(), info_func()]]
    # [0, 1] 2763346399552
    # [0, 1] 2763346399552
    # [0] 2763804769280
    # [0] 2761764188160
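The difference between the two functions comes down to whether the default object is shared or copied per call. A minimal plain-Python sketch of the copying approach follows; copy_defaults is a hypothetical decorator written for illustration, and the only thing taken from the source is that version 0.0.5 uses copy to initiate such arguments.

```python
import copy
from functools import wraps


def copy_defaults(**defaults):
    """Hypothetical decorator: deep-copy each unassigned default on every call."""
    def decorator(func):
        @wraps(func)
        def wrapper(**params):
            fresh = {k: copy.deepcopy(v) for k, v in defaults.items() if k not in params}
            return func(**fresh, **params)
        return wrapper
    return decorator


def py_func(*, x=[]):
    # the default list is created once and shared between calls
    x.append(len(x))
    return x


@copy_defaults(x=[])
def safe_func(**params):
    # receives a fresh list whenever 'x' is not assigned
    x = params['x']
    x.append(len(x))
    return x
```

Repeated calls to py_func keep growing the same list, whereas every call to safe_func starts from an empty one.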
- test_for:
- Variables:
~in_decorator (bool) – whether to use it as a decorator; if False, returns a zero-argument lambda function that yields the test result and the time cost; True as default
static method to test the decorated function.

    from info.me import FuncTools
    from time import sleep

    @FuncTools.test_for(2, 'foo')
    def func1(a, b):  # test for common function
        print('string here: ', b)
        sleep(a)
        return 'bar'

    # string here: foo
    # running test for func1(2, foo) ...
    # time cost: 0:00:02.007849
    # final result: bar

    @FuncTools.test_for(a=2, b='bar')
    def func2(**params):  # test for info like function
        print('another string here: ', params.get('b'))
        sleep(params.get('a'))
        return 'bar'

    # another string here: bar
    # running test for func2(a=2, b=bar) ...
    # time cost: 0:00:02.014251
    # final result: bar

There is no need to edit a separate test script; the function can be tested while it is being edited. When the function is finished, remove the test arguments, or comment out that line.
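The idea of testing a function at the moment it is defined can be sketched with a short decorator. This is an assumption-laden illustration rather than the test_for implementation: run_on_definition is a hypothetical name, and the calls list exists only to make the effect observable.

```python
from datetime import datetime

calls = []  # records each invocation so the effect is observable


def run_on_definition(*args, **kwargs):
    """Hypothetical decorator: call the function once when it is defined."""
    def decorator(func):
        start = datetime.now()
        result = func(*args, **kwargs)
        print(f"time cost: {datetime.now() - start}")
        print(f"final result: {result}")
        return func  # the function itself is left unchanged
    return decorator


@run_on_definition(2, 'foo')
def func1(a, b):
    calls.append((a, b))
    return 'bar'
```

Because the decorator returns the original function, removing or commenting out the decorator line leaves func1 exactly as written.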
- Examples:
With FuncTools, you can design, implement, and test a function all in one script. Taking the previous Code 3.6, turning it into an info version and then testing it, the script becomes:

    from info.me import T, Null, FuncTools, default_param
    from typing import Optional
    from numpy import ndarray
    import numpy as np

    @FuncTools.test_for(data=np.random.random((50, 60)), coefficient=0.8)
    @FuncTools.params_setting(
        data=T[Null: ndarray],
        clip=T[(0.2, 0.8): tuple[float, float]],
        coefficient=T[0.6: lambda x: 0 <= x <= 1],
        normalize=T[None: Optional[tuple[float, float]]])
    @FuncTools.attach_attr(docstring='under editing', info_func=True, entry_tp=ndarray, return_tp=ndarray)
    def func(**params):
        dt = params.get('data').clip(*params.get('clip')) * params.get('coefficient')
        _mean, _std = default_param(params, 'normalize', (dt.mean(), dt.std()))
        return (dt - _mean) / _std
- See also:
- Logs:
Added in version 0.0.3.
Changed in version 0.0.4: branch ~in_decorator in test_for method.
Changed in version 0.0.5: use copy to initiate arguments with set, list, or dict type; safe to use mutable object for default initialization in T indicator inside params_setting.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class Unit¶
package a single or multiple data processing or operation steps within a Unit.
- Arguments:
- Parameters:
mappings (list[Callable]) – list composed of info function(s), or Unit instance(s)
structure (Literal['sequential', 'parallel']) – 'sequential' or 'parallel', to determine the mapping order of the Unit; 'sequential' as default
entry_tp (Optional[type]) – the type of data in entry of this Unit; None as default to guess the inflow type based on structure
return_tp (Optional[type]) – the type of return data of this Unit; None as default to guess the outflow type based on structure
docstring (str) – docstring, or a formal function that carries the intended docstring; None as default to generate it automatically from the docstrings of the elements
- Returns:
a Unit for data processing or operation
- Return type:
- Raises:
TypeError – when mappings are not all registered functions or units, or when the inflow data does not pass the type checker
- Methods:
- add_locker:
lock arguments of the current unit as default values, through **kwargs assignments.
- refresh_locker:
update arguments of the current unit as default values, through **kwargs assignments.
- shadow:
return a copy of self, with a set of specified **kwargs parameters related to the inner functions.
- append_document:
overwrite docstring of the unit.
- Variables:
docstring (Union[str, Callable]) – str for the docstring, or a callable object containing that str.
- reset:
reset all arguments to the default values from when the Unit was initiated.
- get_equivalent_config:
output the equivalent values for all arguments when the unit is actually called.
- __call__:
call as function using data as required argument.
- Variables:
data (Any) – instance to be processed via this unit.
- __rshift__:
sequentially connects the next unit.
- __or__:
parallel connects the next unit.
- Examples:
Use an info Unit to build comprehensive processing steps sequentially:

    from info.me import tensorn as tsn
    from info.me import Unit, datasets

    img = datasets.blackcurrant()
    u = Unit(mappings=[tsn.cropper, tsn.resize])
    center_to_500x500 = u.shadow(crop_range=[(.25, .25), (.75, .75)], new_size=(500, 500))
    img1 = center_to_500x500(data=img)

Or apply a parallel structure to process data with different sets of processing parameters simultaneously. Here the inner functions are Unit instances themselves:

    topleft_to_200x200 = u.shadow(crop_range=[(0., 0.), (.25, .25)], new_size=(200, 200))
    u_parallel = Unit(mappings=[center_to_500x500, topleft_to_200x200], structure='parallel')
    img2, img3 = u_parallel(data=img)
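The sequential and parallel structures, and the >> and | operators listed under Methods, can be illustrated with a miniature class. This is a hypothetical sketch, not the Unit class: Step, double and increment are invented names, and type checking, lockers and docstring handling are left out.

```python
class Step:
    """Hypothetical miniature of a pipeline unit; not info's Unit class."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *, data):
        return self.func(data)

    def __rshift__(self, other):
        # sequential: feed this step's output into the next step
        return Step(lambda data: other(data=self(data=data)))

    def __or__(self, other):
        # parallel: apply both steps to the same input
        return Step(lambda data: (self(data=data), other(data=data)))


double = Step(lambda x: x * 2)
increment = Step(lambda x: x + 1)

sequential = double >> increment  # increment(double(data))
parallel = double | increment     # (double(data), increment(data))
```

With data=3, the sequential pipeline yields 7 and the parallel one yields the pair (6, 4).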
- Logs:
Added in version 0.0.1.
Changed in version 0.0.4: Support operator >> and | to build pipeline.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class TrialDict¶
dict with trial method. The argument assignment in trial does not modify the dict itself.
- Arguments:
- Parameters:
kwargs (dict) – **kwargs as general dict
- Returns:
the dict with trial method
- Return type:
- Methods:
- trial:
return a copy of self with updated keywords and values
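A dict with a non-mutating trial method can be sketched in a few lines. TrialDictSketch is a hypothetical illustration of the behaviour described here, not the class shipped with info.

```python
class TrialDictSketch(dict):
    """Hypothetical dict whose trial method returns an updated copy."""

    def trial(self, **kwargs):
        # merge into a new instance; the original dict is left untouched
        return type(self)({**self, **kwargs})


config = TrialDictSketch(size=3, mode='nearest')
candidate = config.trial(size=5)
```

After the call, candidate carries size=5 while config still holds size=3.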
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class ExeDict¶
executable dict composed of execute keyword and function or generic function as value. Used for high-order functions.
- Arguments:
- Parameters:
execute (Callable) – a Callable object, usually a function or generic function
kwargs (dict) – dict containing other keywords and default values as parameters of that executable object
- Return type:
an ExeDict object
- Raises:
TypeError – required argument execute is not assigned properly
- Methods:
- __call__:
run using execute as the function body, while other keywords and values as its parameter assignments
- Examples:
Without ExeDict, building derived functions with different default assignments requires a high-order function, which is usually highly abstract and hard to read:

    from info.me import ExeDict, datasets

    img = datasets.blackcurrant()
    add_baseline = (lambda **k1: (lambda **k2: (k2.update(**k1), k2.get('data') + k2.get('baseline'))[1]))
    aug_30 = add_baseline(baseline=30)
    dec_50 = add_baseline(baseline=-50)
    img_aug, img_dec = aug_30(data=img), dec_50(data=img)

Calling a high-order function is less explicit than calling a common function because of its currying character. In the previous example, the following calls are equivalent, but harder to understand than a common function call:

    img1 = add_baseline(data=img, baseline=20)()
    img2 = add_baseline(data=img)(baseline=20)
    img3 = add_baseline()(data=img, baseline=20)
    # img1 == img2 == img3

With ExeDict, a common function can be derived into versions with different argument assignments. This property also supports info functions intrinsically.

    add_baseline = (lambda **kwargs: kwargs.get('data') + kwargs.get('baseline'))
    aug_30 = ExeDict(execute=add_baseline, baseline=30)
    dec_50 = ExeDict(execute=add_baseline, baseline=-50)
    img_aug, img_dec = aug_30(data=img), dec_50(data=img)

The call of add_baseline in the last example is more explicit than that of the high-order function.
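How an executable dict might work internally can be sketched as a dict subclass with a __call__ method. ExeDictSketch is hypothetical, and letting call-time keywords override the stored defaults is an assumption of this sketch rather than documented behaviour.

```python
class ExeDictSketch(dict):
    """Hypothetical executable dict: 'execute' holds the function to run."""

    def __init__(self, *, execute, **kwargs):
        if not callable(execute):
            raise TypeError("'execute' must be a callable object")
        super().__init__(execute=execute, **kwargs)

    def __call__(self, **kwargs):
        # stored items act as defaults; call-time keywords override them (assumed)
        params = {k: v for k, v in self.items() if k != 'execute'}
        params.update(kwargs)
        return self['execute'](**params)


def add_baseline(**kwargs):
    return kwargs['data'] + kwargs['baseline']


aug_30 = ExeDictSketch(execute=add_baseline, baseline=30)
dec_50 = ExeDictSketch(execute=add_baseline, baseline=-50)
```

Both derived callables share one plain function; only the stored baseline differs.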
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- class SingleMap¶
tool to make single map function.
- Arguments:
- Parameters:
x (dict) – a dict of length 1 with a callable object as its key and, as its value, a dict of keywords and values giving the default argument assignments for this callable object
- Returns:
a partial function
- Return type:
Callable
- Raises:
TypeError – if the argument x was not assigned properly
- Examples:
This class is usually used for building a single map function f(x) when the arguments require adaptive modification in the main process. It is mainly used for lazy calculation, for example:

    from info.me import SingleMap, datasets

    imgs = [datasets.accent(), datasets.blackcurrant()]
    f = (lambda x, **k: (x-k.get('loc'))/k.get('scale'))
    fs_imgs = {SingleMap({f: {'loc': img.mean(), 'scale': img.std()}}): img for img in imgs}  # define calculation
    norm_imgs = [f(img) for f, img in fs_imgs.items()]  # execute calculation
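The same define-first, execute-later pattern can be reproduced with functools.partial from the standard library. This is a sketch under assumptions: single_map and standardize are hypothetical names, and plain lists replace the image arrays so the example runs without info or numpy.

```python
from functools import partial
from statistics import mean, pstdev


def single_map(x):
    """Hypothetical helper: turn {func: {defaults}} into a one-argument callable."""
    if not (isinstance(x, dict) and len(x) == 1):
        raise TypeError("expected a dict with exactly one item")
    (func, defaults), = x.items()
    if not callable(func) or not isinstance(defaults, dict):
        raise TypeError("expected {callable: dict of default arguments}")
    return partial(func, **defaults)


def standardize(values, *, loc, scale):
    return [(v - loc) / scale for v in values]


samples = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]
# define the calculations first, each with parameters adapted to its sample
maps = [single_map({standardize: {'loc': mean(s), 'scale': pstdev(s)}}) for s in samples]
# execute them later
normalized = [f(s) for f, s in zip(maps, samples)]
```

Each partial object carries the statistics of its own sample, so the later call only needs the data.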
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- traversal_on_params¶
traversal on parameters pool. Function for running automatic unit tests, or automatic experiments, on an info pipeline.
- Arguments:
- Parameters:
data (Callable) – info function or unit
params_pool (dict[str, list[Any]]) – pool for parameters, values to be investigated should be collected into a list
scope_in_builtin (bool) – trigger to determine whether to test built-in parameters in the pool; False as default
concise_result (bool) – trigger to determine whether to condense the test results; if False, the original results are included, otherwise the class type is reported when the final results are not short enough; True as default for unit test
- Returns:
DataFrame container for testing result
- Return type:
DataFrame
- Examples:

    from info.me import FuncTools
    from info.me import autotesting as tst

    @FuncTools.attach_attr(docstring="simple function", info_func=True, entry_tp=int, return_tp=int)
    def simple_function(**params):
        return (params.get('a') + params.get('b') * params.get('c')) ** params.get('d')

    test_pool = {
        'a': [1, 2, 3],
        'b': [3, 4],
        'c': [5, 6, 7, 8],
        'd': [2, 3]
    }
    tst.traversal_on_params(data=simple_function, params_pool=test_pool)
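Traversing a parameters pool amounts to calling the function on the Cartesian product of the listed values. The sketch below uses itertools.product and returns a list of row dicts instead of a DataFrame; traverse and the result and error keys are hypothetical choices for illustration, not the library's output layout.

```python
from itertools import product


def traverse(func, params_pool):
    """Hypothetical helper: call func on every combination in the pool."""
    names = list(params_pool)
    rows = []
    for values in product(*(params_pool[n] for n in names)):
        kwargs = dict(zip(names, values))
        try:
            rows.append({**kwargs, 'result': func(**kwargs), 'error': None})
        except Exception as err:  # record the failure instead of stopping
            rows.append({**kwargs, 'result': None, 'error': repr(err)})
    return rows


def simple_function(**params):
    return (params['a'] + params['b'] * params['c']) ** params['d']


test_pool = {'a': [1, 2, 3], 'b': [3, 4], 'c': [5, 6, 7, 8], 'd': [2, 3]}
rows = traverse(simple_function, test_pool)
```

The pool above expands to 3 × 2 × 4 × 2 = 48 combinations, one row per combination.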
- Logs:
Added in version 0.0.4.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- experiments¶
experiment test pipeline for info function or unit.
- Arguments:
- Parameters:
data (Callable) – info function or unit
params_pool (dict[str, list[Any]]) – pool for parameters, values to be investigated should be collected into a list
to_file (str) – cache file to dump result of test parameters; None will create a new dict in each invocation
branch_comment (bool) – prefix marker for callable object to be tested if necessary; '' as default for no prefix attached to the function name
scope_in_builtin (bool) – trigger to determine whether to test built-in parameters in the pool; False as default
- Returns:
DataFrame container for testing result
- Return type:
DataFrame
- Examples:

    from info.me import FuncTools
    from info.me import autotesting as tst

    @FuncTools.attach_attr(docstring="simple function", info_func=True, entry_tp=int, return_tp=int)
    def simple_function(**params):
        return (params.get('a') + params.get('b') * params.get('c')) ** params.get('d')

    test_pool = {
        'a': [1, 2, 3],
        'b': [3, 4],
        'c': [5, 6, 7, 8],
        'd': [2, 3]
    }
    res = tst.experiments(data=simple_function, params_pool=test_pool)

- Notes:
experiments dumps the original result for each test.
- Logs:
Added in version 0.0.4.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- functest¶
unit test pipeline for info function or unit.
- Arguments:
- Parameters:
data (Callable) – info function or unit
params_pool (dict[str, list[Any]]) – pool for parameters, values to be investigated should be collected into a list
to_file (str) – cache file to dump result of test parameters; None will create a new dict in each invocation
branch_comment (bool) – prefix marker for callable object to be tested if necessary; '' as default for no prefix attached to the function name
scope_in_builtin (bool) – trigger to determine whether to test built-in parameters in the pool; False as default
- Returns:
DataFrame container for testing result
- Return type:
DataFrame
- Examples:

    from info.me import FuncTools
    from info.me import autotesting as tst

    @FuncTools.attach_attr(docstring="simple function", info_func=True, entry_tp=int, return_tp=int)
    def simple_function(**params):
        return (params.get('a') + params.get('b') * params.get('c')) ** params.get('d')

    test_pool = {
        'a': [1, 2, 3],
        'b': [3, 4],
        'c': [5, 6, 7, 8],
        'd': [2, 3]
    }
    res = tst.functest(data=simple_function, params_pool=test_pool)

- Notes:
functest dumps the class type of the result for each test if the final result is difficult to print concisely.
- Logs:
Added in version 0.0.4.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- generic_printer¶
generic printing function to show attributes or methods of data.
- Arguments:
- Parameters:
data (object) – original data with attributes or methods to be showed
attrs (list[str]) – list of callable attributes; no assignment uses empty list [] as default
- Returns:
NoReturn
- Return type:
NoneType
- Raises:
AttributeError – invalid attribute or method calling in attrs assignment
- Examples:
To print the dimension, shape and max value of each image individually:

    from info.ins import datasets, generic_printer

    imgs = [datasets.cat(), datasets.accent(), datasets.blackcurrant()]
    for img in imgs:
        generic_printer(data=img, attrs=['ndim', 'shape', 'max()'])

Or an alternative implementation using the printing_u unit coupled with the ExeDict class:

    describe = printing_u.shadow(o_prt=ExeDict(execute=generic_printer, attrs=['ndim', 'max()', 'clip(min=0.2).sum()']))
    imgs = [describe(data=img) for img in imgs]
- See also:
printing_u
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- generic_logger¶
generic logger function for saving export from feature extracting functions.
- Arguments:
- Parameters:
data (object) – data prepared to be executed via extractors
extractors (dict[str, Union[Unit, Pipeline, Callable]]) – dict composed of feature names as keys and mapping methods on data as values; no assignment uses empty dict {} as default
directory (str) – path-like string for folder where the file will be saved; no assignment uses current work directory (os.getcwd()) as default
to_file (str) – file name for recording output; '.df_sav' as default
other_params (dict) – the global parameters passed on all mapping methods in extractors; no assignment uses empty dict {} as default
- Returns:
NoReturn
- Return type:
None
- Examples:
For example, logging max and percentile values for each image into the file 'describe.log':

    from info.me import generic_logger, datasets
    import numpy as np

    imgs = [datasets.cat(), datasets.accent(), datasets.blackcurrant()]

    # define two extraction functions for max and percentile values
    get_max = (lambda **kw: np.max(kw['data']))
    get_percentiles = (lambda **kw: np.percentile(kw['data'], q=kw['percentiles']))

    for img in imgs:
        generic_logger(data=img, extractors={'max': get_max, 'percentiles': get_percentiles},
                       to_file='describe.log', other_params={'percentiles': [25, 50, 75]})

It can also be wrapped into a saving unit:

    from info.me import saving_u, ExeDict

    # reuse the extraction functions get_max and get_percentiles defined above
    record = saving_u.shadow(o_sav=ExeDict(execute=generic_logger,
                                           extractors={'max': get_max, 'percentiles': get_percentiles},
                                           to_file='describe.log',
                                           other_params={'percentiles': [25, 50, 75]}))
    imgs = [record(data=img) for img in imgs]
- See also:
saving_u
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- exception_logger¶
info function or Unit implementation for recording exceptive case.
- Arguments:
- Parameters:
data (tuple[str, Exception]) – tuple composed of str for exceptive case, and exception raised during running the corresponding case
directory (str) – path-like string for folder where the file will be saved; no assignment uses current work directory (os.getcwd()) as default
to_file (str) – file name for recording exceptive cases; 'run_error.log' as default
- Returns:
NoReturn
- Return type:
None
- Examples:

    from info.me import exception_logger
    import numpy as np

    rand_sizes = np.random.randint(0, 50, 20)
    for i in range(20):
        try:
            dt = np.random.random(rand_sizes[i])
            _ = dt[17]  # IndexError raised whenever the array has fewer than 18 elements
        except Exception as err:
            exception_logger(data=(f"case_{i}", err))
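The logging step itself can be pictured as appending one record per failed case to a text file. The sketch below is an assumption about what such a logger might do: log_exception is a hypothetical function, the tab-separated record format is invented, and a temporary directory is used so that nothing is written to the working directory.

```python
import os
import tempfile


def log_exception(data, directory=None, to_file='run_error.log'):
    """Hypothetical logger: append 'case<TAB>ErrorType: message' to a file."""
    case, err = data
    path = os.path.join(directory or os.getcwd(), to_file)
    with open(path, 'a', encoding='utf-8') as fh:
        fh.write(f"{case}\t{type(err).__name__}: {err}\n")
    return path


workdir = tempfile.mkdtemp()
for i, size in enumerate([3, 20, 5]):
    try:
        _ = list(range(size))[17]  # fails when the list has fewer than 18 items
    except IndexError as err:
        log_path = log_exception((f"case_{i}", err), directory=workdir)

with open(log_path, encoding='utf-8') as fh:
    logged = fh.read().splitlines()
```

Only the first and third cases fail, so the log ends up with two records.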
- See also:
saving_u
- Logs:
Added in version 0.0.1.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- default_param¶
dynamic default value setting in function body.
- Arguments:
- Parameters:
param – dict params to be detected
k – keyword for parameter
v – default value
- Returns:
v if the value of k in params is None, otherwise the value in params
- Return type:
Any
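The rule stated under Returns fits in two lines of plain Python. default_param_sketch below is a hypothetical re-implementation for illustration; treating a missing key the same as an explicit None is an assumption of this sketch.

```python
def default_param_sketch(params, k, v):
    """Return v when params[k] is None (or, by assumption, missing), else params[k]."""
    value = params.get(k)
    return v if value is None else value


fallback = default_param_sketch({'normalize': None}, 'normalize', (0.0, 1.0))
explicit = default_param_sketch({'normalize': (2.0, 3.0)}, 'normalize', (0.0, 1.0))
```

Note that v is evaluated before the call, so an expensive default is computed even when the caller supplied a value.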
- See also:
- Logs:
Added in version 0.0.2.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- diagnosing_tests¶
diagnose unit test result then return a list of bool values.
- Arguments:
- Parameters:
data (DataFrame) – dataframe result
- Variables:
~verbosity (bool) – trigger to show details for the exceptive cases; True as default
- Returns:
list of bool for dataframe result; True for case with 0 exit code, otherwise False
- Return type:
list[bool]
- Notes:
This tool is used by the automatic unit test framework. If all cases pass, it returns a list containing only True.
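The mapping from exit codes to booleans can be sketched without a DataFrame. In this illustration, diagnose is a hypothetical function and each test case is a dict with an assumed exit_code key; the real column names of the result DataFrame are not given in this section.

```python
def diagnose(rows, verbosity=True):
    """Hypothetical helper: True for rows whose exit code is 0, else False."""
    flags = [row['exit_code'] == 0 for row in rows]
    if verbosity:
        for row, ok in zip(rows, flags):
            if not ok:
                print(f"failed case: {row}")  # show details of exceptive cases
    return flags


results = [{'case': 'a=1', 'exit_code': 0}, {'case': 'a=2', 'exit_code': 1}]
flags = diagnose(results, verbosity=False)
```

A test suite passes when all(flags) is True.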
- Logs:
Added in version 0.0.4.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- Authors:
Chen Zhang
- Version:
0.0.5
- Created on:
Jun 29, 2023