3.1.6. Option networks
3.1.6.1. Description
A tool-chain for neural networks, with conventional architectures implemented via metaprogramming. PyTorch is required as a dependency for normal use.
The namespace of this module is mainly info.toolbox.networks. For convenience in practice, use the main entry point info.net.
- Module: a flexible neural network base class with enhanced training/inference capabilities.
- full_connected_neural: a configurable fully connected neural network module with flexible architecture options.
- convolutional_neural: a configurable convolutional neural network (CNN) module with flexible architecture.
- unet: a configurable U-Net architecture for semantic segmentation with dynamic dimensionality support.
- transformer: a highly configurable transformer architecture supporting multiple attention mechanisms and embedding methods.
3.1.6.2. Docstrings
- class Module
a flexible neural network base class with enhanced training/inference capabilities. This module extends PyTorch’s nn.Module with additional features including:
configurable training/inference sessions
automatic data type handling
built-in training loop with stopping conditions
support for both regression and classification tasks
generator-based online learning support
- Logs:
Added in version 1.0.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
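The generator-based online learning listed above can be pictured as drawing fixed-size mini-batches from sample and target iterators without materializing them first. The following is a minimal sketch under that assumption; `iter_minibatches` and `batch_size` are hypothetical names used for illustration and are not part of `info.net`, whose actual loading logic is not described here.

```python
from itertools import islice


def iter_minibatches(samples, targets, batch_size):
    """Yield (batch_x, batch_y) lists drawn lazily from two iterables.

    Hypothetical helper for illustration; not the library's loader.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    pairs = zip(samples, targets)  # consumes both iterables in lockstep
    while True:
        chunk = list(islice(pairs, batch_size))
        if not chunk:
            return
        xs, ys = zip(*chunk)
        yield list(xs), list(ys)


# generators are consumed once, so nothing is held in memory up front
stream_x = (i * 0.5 for i in range(5))
stream_y = (i % 2 for i in range(5))
batches = list(iter_minibatches(stream_x, stream_y, batch_size=2))
```

The last batch may be shorter than `batch_size` when the stream length is not a multiple of it; a real loader would have to decide whether to keep or drop it.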
- full_connected_neural
a configurable fully connected neural network module with flexible architecture options. This implementation provides a multi-layer perceptron (MLP) with customizable layer dimensions, activation functions, dropout, and data type specifications. The network can be either statically sized or dynamically initialized with lazy weight initialization.
- Arguments:
- Parameters:
structure (list[int]) – list specifying layer dimensions; its first element can be None to enable lazy initialization; e.g. [None, 256, 128] for lazy input or [784, 256, 128] for fixed input
activation (Union[Callable, list[Callable]]) – activation function(s) between layers; can be a single function or a list with one entry per layer; e.g. nn.ReLU, or [nn.ReLU, nn.Sigmoid] for a different activation per layer
bias (bool) – whether to include bias terms in linear layers; True as default
dropout (Optional[float]) – dropout probability (0-1) applied after the last hidden layer; None as default to disable dropout
ctype_option (_Ctype) – torch datatype for network parameters; 'float32' as default
- Returns:
a fully connected neural network
- Return type:
- Examples:
import torch as tch
from info.net import full_connected_neural

num_samples, input_size, output_size = 120, 10, 5
x, y_classification, y_regression = (tch.randn(num_samples, input_size),
                                     tch.randint(0, output_size, (num_samples,)),
                                     tch.randn(num_samples, output_size))
x_train, x_validation = x[:100], x[100:]
yc_train, yc_validation = y_classification[:100], y_classification[100:]
yr_train, yr_validation = y_regression[:100], y_regression[100:]

# apply on classification task
model1 = full_connected_neural(structure=[input_size, 40, output_size], activation=tch.nn.ReLU)
with model1.train_session() as md:
    md.solve(train=x_train, target=yc_train, validation=(x_validation, yc_validation))

# apply on regression task, with specified configuration
model2 = full_connected_neural(structure=[None, 40, 50, output_size], bias=False,
                               activation=[tch.nn.LeakyReLU, tch.nn.Tanh], dropout=0.2)
with model2.train_session(criterion=tch.nn.MSELoss()) as md:
    md.solve(train=x_train, target=yr_train, validation=(x_validation, yr_validation))
- Notes:
this implementation provides a flexible fully connected network that supports:
both static and dynamic (lazy) input dimension handling
per-layer activation function specification
configurable precision through dtype options
optional dropout regularization
automatic training configuration (SGD optimizer with CrossEntropy loss by default)
the network uses PyTorch’s LazyLinear when input dimension is unspecified (None), which automatically infers input size during first forward pass.
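To make the mapping from the `structure` list to linear layers concrete, here is a minimal sketch in plain Python. It assumes, as the examples suggest, that activations sit between layers only (so a four-element `structure` takes two activations) and that a leading `None` marks a lazily sized first layer. `plan_layers` is a hypothetical helper for illustration, and activations are represented by name strings rather than PyTorch classes.

```python
def plan_layers(structure, activation):
    """Pair consecutive sizes into (in_features, out_features, activation).

    Hypothetical helper: an in_features of None marks a layer whose input
    size would be inferred lazily at the first forward pass.
    """
    if len(structure) < 2:
        raise ValueError("structure needs at least an input and an output size")
    n_linear = len(structure) - 1
    n_act = n_linear - 1  # assumption: no activation after the output layer
    if isinstance(activation, (list, tuple)):
        acts = list(activation)  # one entry per hidden layer
    else:
        acts = [activation] * n_act  # a single activation is broadcast
    if len(acts) != n_act:
        raise ValueError(f"expected {n_act} activations, got {len(acts)}")
    acts.append(None)
    return [(structure[i], structure[i + 1], acts[i]) for i in range(n_linear)]


lazy_plan = plan_layers([None, 40, 50, 5], ["LeakyReLU", "Tanh"])
fixed_plan = plan_layers([784, 256, 128], "ReLU")
```

Here `lazy_plan` starts with a `(None, 40, ...)` entry, which corresponds to the layer that would be built with a lazy linear module, while `fixed_plan` has known sizes throughout.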
- See Also:
- Logs:
Added in version 1.0.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- convolutional_neural
a configurable convolutional neural network (CNN) module with flexible architecture. The network consists of configurable convolutional blocks followed by a multi-stage fully connected network. Both the feature extraction components (convolutional operations) and the classification modules (MLP layers) can be customized through parameters, which covers baseline configurations as well as application-specific variations.
- Arguments:
- Parameters:
conv_structure (list[int]) – list specifying the number of output channels for each convolutional block
mpl_structure (list[int]) – list specifying the layer sizes for the final MLP
activation (Callable) – global activation function; nn.ReLU as default
in_dimension (int) – spatial dimension of input; must be 1, 2, or 3; 2 as default to suit natural-image tasks
conv_kernel (dict) – parameter dict containing 'kernel_size', 'stride', 'padding' or 'dilation'; each value can be a positive integer (applied to all dimensions), or a tuple of positive integers specifying per-dimension values matching the input data’s dimensional structure; {'kernel_size': 3, 'stride': 1, 'padding': 1} as default configuration
batch_norm (dict) – batch normalization configuration; None as default to disable batch normalization; if provided, it should be a dict with 'eps', 'momentum', 'affine', or 'track_running_state' as keys, each with a value allowed for that key
pre_activation (bool) – whether to use pre-activation ordering before convolution; False as default
pool (dict) – 1-length dict of pooling configuration; the key should be one of 'Max', 'FractionalMax', 'AdaptiveAvg', 'AdaptiveMax', 'Avg', 'LP', 'MaxUn', and the value has a form similar to the conv_kernel parameter; {'Max': {'kernel_size': 2, 'stride': 2}} as default to apply conventional max pooling
dropout (float) – dropout probability from 0 to 1; None as default to disable dropout
conv_customization (list[dict]) – list of dictionaries to customize each convolutional block’s parameters; each dict can override default conv parameters (from activation to dropout); None as default to apply the global configuration; e.g., for a conv_structure of [16, 32], [{'pre_activation': True}, {'dropout': 0.4}] enables pre-activation in the first block and sets dropout to 0.4 in the second
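The per-block override described for `conv_customization` amounts to merging each block's dictionary over the global settings. The sketch below illustrates that idea in plain Python. `resolve_block_configs` and `GLOBAL_DEFAULTS` are hypothetical names, and the requirement that the customization list has one entry per block is an assumption rather than documented behaviour.

```python
GLOBAL_DEFAULTS = {
    "activation": "ReLU",  # stands in for nn.ReLU
    "conv_kernel": {"kernel_size": 3, "stride": 1, "padding": 1},
    "batch_norm": None,
    "pre_activation": False,
    "pool": {"Max": {"kernel_size": 2, "stride": 2}},
    "dropout": None,
}


def resolve_block_configs(conv_structure, conv_customization=None):
    """Return one merged config dict per convolutional block (hypothetical)."""
    if conv_customization is None:
        conv_customization = [{} for _ in conv_structure]
    if len(conv_customization) != len(conv_structure):
        raise ValueError("expected one customization dict per block")
    configs = []
    for channels, override in zip(conv_structure, conv_customization):
        unknown = set(override) - set(GLOBAL_DEFAULTS)
        if unknown:
            raise KeyError(f"unsupported keys: {sorted(unknown)}")
        # later entries win, so the block-level override replaces the default
        configs.append({"out_channels": channels, **GLOBAL_DEFAULTS, **override})
    return configs


blocks = resolve_block_configs([16, 32], [{"pre_activation": True}, {"dropout": 0.4}])
```

With this input, only the first block uses pre-activation and only the second applies dropout; every other setting falls back to the global value.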
- Returns:
a convolutional neural network
- Return type:
- Examples:
import torch as tch
from info.net import convolutional_neural

sp1, sp2, nums, n = (128, 128), (64, 64, 32), 20, 10
x_2d, x_3d, y_c, y_r = (tch.randn(nums, 3, *sp1), tch.randn(nums, 1, *sp2),
                        tch.randint(0, 10, (nums,)), tch.randn(nums, n))

# natural image classification task:
model1 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n])
with model1.train_session() as md:
    md.solve(train=x_2d, target=y_c, loading_mode='local')

# 3D image classification task:
model2 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n], in_dimensions=3)
with model2.train_session() as md:
    md.solve(train=x_3d, target=y_c, loading_mode='local')

# online loading for 3D images with customized configuration:
model3 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n], in_dimensions=3,
                              pre_activation=True, dropout=0.13)
with model3.train_session(criterion=tch.nn.HingeEmbeddingLoss()) as md:
    md.solve(train=(_ for _ in x_3d), target=(_ for _ in y_c), stop_conditions={'epochs': 40})

# or application on regression task:
model4 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n])
with model4.train_session(criterion=tch.nn.MSELoss()) as md:
    md.solve(train=x_2d, target=y_r, loading_mode='local')
- Notes:
this implementation features:
dynamic input dimension handling via lazy layers
configurable per-block parameters
automatic flattening before MLP
default Adam optimizer (lr=0.001) and CrossEntropyLoss
it employs an MLP-based backend implementation, inheriting most of its features, such as adaptive capabilities for both classification and regression tasks.
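The size of the flattened vector that reaches the MLP can be worked out by hand from the kernel settings. The sketch below applies the standard output-size formula used for convolution and pooling windows, assuming each block is one convolution followed by one pooling step with the default settings quoted above. `out_size` and `flattened_features` are hypothetical helpers, not library functions.

```python
def out_size(n, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a convolution or pooling window along one axis."""
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def flattened_features(spatial, conv_structure, conv_kernel=None, pool_kernel=None):
    """Spatial sizes after all blocks and the flattened feature count.

    Assumption: every block applies one convolution, then one pooling step.
    """
    conv_kernel = conv_kernel or {"kernel_size": 3, "stride": 1, "padding": 1}
    pool_kernel = pool_kernel or {"kernel_size": 2, "stride": 2}
    sizes = list(spatial)
    for _ in conv_structure:
        sizes = [out_size(n, **conv_kernel) for n in sizes]  # convolution
        sizes = [out_size(n, **pool_kernel) for n in sizes]  # pooling
    total = conv_structure[-1]  # channels of the last block
    for n in sizes:
        total *= n
    return sizes, total


sizes_2d, feats_2d = flattened_features((128, 128), [16, 32])
sizes_3d, feats_3d = flattened_features((64, 64, 32), [16, 32])
```

Under these assumptions the default 3x3 convolution keeps the spatial size and each 2x2 pooling halves it, so the 128x128 input ends at 32x32 with 32 channels. Lazy layers spare the user this arithmetic, but it is useful when estimating the parameter count of the first MLP layer.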
- See Also:
- Logs:
Added in version 1.0.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- unet
a configurable U-Net architecture for semantic segmentation with dynamic dimensionality support. This implementation follows the classic U-Net encoder-decoder structure with skip connections, while providing extensive customization options through parameterized components.
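Skip connections require each decoder feature map to match the spatial size of its encoder counterpart, which is not automatic when a size is odd. The sketch below traces sizes along one axis, assuming one 2x2 pooling step between consecutive levels and a factor-2 upsampling in the decoder. The number of pooling steps per level and the helper names are assumptions for illustration, not taken from the library.

```python
def encoder_sizes(size, levels, kernel_size=2, stride=2):
    """Sizes along one axis at each encoder level (pooling between levels)."""
    sizes = [size]
    for _ in range(levels - 1):
        sizes.append((sizes[-1] - kernel_size) // stride + 1)
    return sizes


def decoder_alignment(size, levels, scale=2):
    """List (upsampled_size, skip_size) pairs from the deepest level upward."""
    enc = encoder_sizes(size, levels)
    report = []
    current = enc[-1]
    for skip in reversed(enc[:-1]):
        report.append((current * scale, skip))
        current = skip  # the decoder map is resized to the skip size
    return report


odd_axis = decoder_alignment(35, levels=2)
even_axis = decoder_alignment(40, levels=2)
deep_axis = decoder_alignment(20, levels=4)
```

For an axis of length 35, pooling gives 17 and doubling gives 34, one short of the skip map, so the decoder output has to be resized to the skip size before concatenation.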
- Arguments:
- Parameters:
mirrored_channels (list[int]) – channel dimensions for each level of the encoder-decoder blocks; the decoder path mirrors the encoder channel structure
in_dimension (int) – spatial dimension of input; must be 1, 2, or 3; 2 as default to suit natural-image tasks
activation (Callable) – activation function factory; the default uses in-place ReLU for memory efficiency
conv_kernel (dict) – parameter dict containing 'kernel_size', 'stride', 'padding' or 'dilation'; each value can be a positive integer (applied to all dimensions), or a tuple of positive integers specifying per-dimension values matching the input data’s dimensional structure; {'kernel_size': 3, 'stride': 1, 'padding': 1} as default configuration
batch_norm (dict) – batch normalization configuration with optional keys; None as default to disable batch normalization; if provided, it should be a dict with 'eps', 'momentum', 'affine', or 'track_running_state' as keys, each with a value allowed for that key
pre_activation (bool) – whether to use pre-activation ordering before convolution; False as default
pool (dict) – 1-length dict of pooling configuration; the key should be one of 'Max', 'FractionalMax', 'AdaptiveAvg', 'AdaptiveMax', 'Avg', 'LP', 'MaxUn', and the value has a form similar to the conv_kernel parameter; {'Max': {'kernel_size': 2, 'stride': 2}} as default to apply conventional max pooling
dropout (float) – dropout probability from 0 to 1; None as default to disable dropout
export_channel (int) – number of output channels; positive integer no greater than 3; 1 as default
- Returns:
a U-Net instance
- Return type:
- Examples:
import torch as tch
# dice is assumed to be importable from info.net, as the original net.dice(...) call suggests
from info.net import unet, dice

# standard 2D U-Net for binary segmentation
x, y = tch.randn(5, 1, 20, 40), tch.randint(0, 2, (5, 1, 20, 40)).float()
model1 = unet(mirrored_channels=[64, 128, 256, 512], in_dimensions=2)
with model1.train_session() as md:
    md.solve(train=x, target=y)

# 3D U-Net with custom normalization
x, y = tch.randn(5, 1, 20, 40, 35), tch.randint(0, 2, (5, 1, 20, 40, 35)).float()
model2 = unet(mirrored_channels=[16, 32], in_dimensions=3, batch_norm={'eps': 1e-6, 'momentum': 0.01},
              activation=(lambda: tch.nn.LeakyReLU(0.1)))
with model2.train_session(criterion=dice(1e-3)) as md:
    md.solve(train=x, target=y, loading_mode='local')

# 3D U-Net natively supports multimodal fusion
x_multi, y = tch.randn(5, 4, 20, 40, 35), tch.randint(0, 2, (5, 1, 20, 40, 35)).float()
model3 = unet(mirrored_channels=[16, 32], in_dimensions=3)
with model3.train_session() as md:
    md.solve(train=x_multi, target=y)

# 3D U-Net for multiple segmentations, trained using mixture loss
x, y_multi = tch.randn(5, 1, 20, 40, 35), tch.randint(0, 2, (5, 3, 20, 40, 35)).float()
model4 = unet(mirrored_channels=[16, 32], in_dimensions=3, export_channel=3)
mixture_loss = (lambda m1, m2: 0.9 * dice(1e-3)(m1, m2) + 0.1 * tch.nn.CrossEntropyLoss()(m1, m2))
with model4.train_session(criterion=mixture_loss) as md:
    md.solve(train=x, target=y_multi, loading_mode='local')
- Notes:
architectural features:
symmetric encoder-decoder structure with skip connection
automatic handling of input dimensions (1D/2D/3D)
dynamic channel sizing through the mirrored_channels argument
lazy initialization for input flexibility
nearest-exact interpolation for precise feature map alignment
the default training configuration uses the Adam optimizer with a learning rate of 0.001, and Dice loss with a smoothing factor of 1e-5; to customize it, override the criterion or optimizer argument when invoking the train session.
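For readers unfamiliar with the role of the smoothing factor, the sketch below implements one common soft Dice formulation on flat lists of values in [0, 1]. The exact formula used by the library's Dice loss is not stated in this section, so `dice_loss` here is an illustrative assumption, not the library's implementation.

```python
def dice_loss(prediction, target, smooth=1e-5):
    """Soft Dice loss: 1 - (2 * |P.T| + s) / (|P| + |T| + s)."""
    if len(prediction) != len(target):
        raise ValueError("prediction and target must have the same length")
    intersection = sum(p * t for p, t in zip(prediction, target))
    denominator = sum(prediction) + sum(target)
    # the smoothing term keeps the ratio defined when both masks are empty
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


perfect = dice_loss([1.0, 1.0, 0.0, 0.0], [1.0, 1.0, 0.0, 0.0])
disjoint = dice_loss([1.0, 0.0], [0.0, 1.0])
empty = dice_loss([0.0, 0.0], [0.0, 0.0])
```

A perfect overlap gives a loss near zero and disjoint masks give a loss near one. Two empty masks also give a loss near zero, where an unsmoothed ratio would divide by zero.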
- See Also:
- Logs:
Added in version 1.0.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- transformer
a highly configurable transformer architecture supporting multiple attention mechanisms and embedding methods. this implementation provides dynamic dimensionality handling, flexible positional encoding strategies, and modular attention configurations suitable for sequence-to-sequence tasks.
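As a reference point for the positional encoding strategies, the sketch below builds the canonical sinusoidal table in plain Python, with sine on even indices and cosine on odd indices. The arguments mirror the `max_length` and `base` keys of the `'sinusoid'` configuration, but how the library consumes them is an assumption. `sinusoid_encoding` is a hypothetical helper.

```python
import math


def sinusoid_encoding(max_length, dimension_model, base=10000):
    """Return a max_length x dimension_model table of sinusoidal encodings."""
    table = []
    for pos in range(max_length):
        row = []
        for i in range(dimension_model):
            # indices 2k and 2k + 1 share the frequency base ** (-2k / d)
            angle = pos / base ** (2 * (i // 2) / dimension_model)
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


table = sinusoid_encoding(max_length=4, dimension_model=6)
```

Each row would be added to the token embedding at the matching position; position 0 always encodes to alternating zeros and ones.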
- Arguments:
- Parameters:
dimension_model (int) – hidden dimension size, must be positive integer
num_heads (int) – number of attention heads, positive integer; if it does not evenly divide dimension_model, this value is heuristically adjusted
vocabulary_size (dict[Literal['in', 'out'], int]) – dictionary specifying input and output vocabulary sizes; {'in': 10000, 'out': 8000} as default
embedding_func (dict[Literal['in', 'out', 'endmost'], Optional[Callable]]) – custom embedding functions for input, output, and final layer; accepts a dictionary with 'in', 'out' and 'endmost' as keys and an embedding function as each value; the default configuration uses None to initialize each embedding function automatically
encoding_meth (Literal['sinusoid', 'trainable', 'relative', 'rotation']) – positional encoding method; must be one of 'sinusoid', 'trainable', 'relative', and 'rotation'; 'sinusoid' as default for the canonical transformer implementation
encoding_configs (dict) – configuration dict for positional encoding (method-specific parameters); the defaults are {'max_length': 5000, 'base': 10000} for 'sinusoid' encoding, {'max_relative': 3} for 'relative', and {'theta': 10000.0, 'start_pos': 0} for 'rotation'
dimension_feed_forward (int) – dimension of the feed forward network; 2048 as default
activation (Callable) – global activation function; torch.nn.ReLU as default
num_layers (Union[int, tuple[int, int]]) – encoder and decoder layer counts; a tuple gives an unbalanced encoder-decoder architecture; e.g., (6, 3) for a transformer with unequal encoder and decoder depths
attn_init (dict) – initialization parameters for the attention layer; the standard configuration uses {'bias': True, 'add_bias_kv': False, 'add_zero_attn': False, 'batch_first': True}; for the 'relative' and 'rotation' encoding methods, 'kdim' and 'vdim' are determined by dimension_model in cross attention and are None in self attention; for 'sinusoid' and 'trainable', they are None in both self and cross attention; 'dropout' is derived from the global dropout
attn_forward (dict) – configuration passed to the attention forward call; the standard setting uses 'need_weights' as True, 'attn_mask' as None, 'average_attn_weights' as True, and 'is_causal' as False; values provided in a customized configuration override these defaults
dropout (float) – global dropout rate; 0.1 as default
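The heuristic used to adjust `num_heads` is not specified in this section. One plausible reading, shown purely as an assumption, is to pick the divisor of `dimension_model` closest to the requested head count, so that every head receives an equal slice of the hidden dimension. `adjust_num_heads` is a hypothetical helper, not the library's rule.

```python
def adjust_num_heads(dimension_model, num_heads):
    """Return a head count that divides dimension_model evenly.

    Assumed heuristic: nearest divisor, ties resolved to the smaller count.
    """
    if dimension_model < 1 or num_heads < 1:
        raise ValueError("both arguments must be positive integers")
    if dimension_model % num_heads == 0:
        return num_heads
    divisors = [d for d in range(1, dimension_model + 1) if dimension_model % d == 0]
    return min(divisors, key=lambda d: (abs(d - num_heads), d))


kept = adjust_num_heads(512, 8)
tie = adjust_num_heads(512, 6)
nearest = adjust_num_heads(300, 7)
head_dim = 300 // nearest  # per-head dimension after adjustment
```

Whatever rule the library applies, the constraint it has to satisfy is the same: the hidden dimension must split into equally sized heads.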
- Returns:
a Transformer model
- Return type:
- Examples:
import torch as tch
from info.net import transformer

batch, seq1, seq2, voc1, voc2 = 32, 20, 15, 10000, 8000
src, tgt = tch.randint(0, voc1, (batch, seq1)), tch.randint(0, voc2, (batch, seq2))
src_msk, tgt_msk = tch.randn(src.shape) > 0, tch.randn(tgt.shape) > 0

# application on basic sequence-to-sequence task (e.g., machine translation):
model1 = transformer(dimension_model=512, num_heads=8)
with model1.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk))

# flexibility to support multiple data types input
src_np, tgt_pt, tgt_msk_gen = src.numpy(), tgt, (_ for _ in tgt_msk.clone())
model2 = transformer(dimension_model=512, num_heads=8)
with model2.train_session() as md:
    md.solve(train=(src_np, src_msk), target=(tgt_pt, tgt_msk_gen))

# parameter-reduced memory-efficient model for edge devices, importing locally stored data for training
model3 = transformer(dimension_model=256, num_heads=4, dimension_feed_forward=1024, num_layers=(4, 2),
                     dropout=0.05)
with model3.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk), loading_mode='local')

# rotary position embedding to comprehend long-range dependence of sequence
model4 = transformer(dimension_model=512, num_heads=8, encoding_meth='rotation',
                     encoding_configs={'max_length': 4096, 'theta': 10000.0, 'start_pos': 0})
with model4.train_session(optimizer=tch.optim.Adam(model4.parameters(), lr=0.005)) as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk))

# transfer learning using pre-trained embedding function
emb_func = base_model.from_pretrain(...)
model5 = transformer(dimension_model=512, num_heads=8, encoding_meth='trainable',
                     embedding_func={'in': emb_func, 'out': None, 'endmost': None})
with model5.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk), stop_conditions={'epochs': 50})
- Notes:
architectural features:
flexibility on encoding method options
dynamic attention mechanism selection
configurable encoder-decoder asymmetry
extensibility for integrating ongoing work
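Among the encoding options, the `'rotation'` method is the least self-explanatory. The sketch below shows the standard rotary idea in plain Python: consecutive pairs of features are rotated by an angle proportional to the position, so that the dot product of a rotated query and key depends only on their relative offset. Treating `start_pos` as an offset added to the position is an assumption, and `rotate` is a hypothetical helper rather than the library's code.

```python
import math


def rotate(vector, position, theta=10000.0, start_pos=0):
    """Apply a rotary position embedding to one even-length vector."""
    d = len(vector)
    if d % 2:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        # pair (i, i + 1) rotates with frequency theta ** (-i / d)
        angle = (start_pos + position) * theta ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vector[i], vector[i + 1]
        out.extend((x * c - y * s, x * s + y * c))
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q, k = [1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]
score_near = dot(rotate(q, 5), rotate(k, 3))    # offset of 2
score_far = dot(rotate(q, 12), rotate(k, 10))   # same offset, shifted
```

The two scores agree up to floating-point error because both pairs are two positions apart, which is the property that makes rotary encodings attractive for long sequences.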
- See Also:
- Logs:
Added in version 1.0.
– Created by Chen Zhang; Last updated on 01:34, 2025-09-06
- Authors:
Chen Zhang
- Version:
0.0.6
- Created on:
Jun 11, 2025