3.1.6. Option networks

3.1.6.1. Description

A neural-network tool-chain that provides conventional architectures implemented via metaprogramming. PyTorch is a required dependency for normal use.

This module lives mainly in the info.toolbox.networks namespace. In practice, the main entry point info.net is more convenient.

Module

a flexible neural network base class with enhanced training/inference capabilities.

full_connected_neural

a configurable fully connected neural network module with flexible architecture options.

convolutional_neural

a configurable convolutional neural network (CNN) module with flexible architecture.

unet

a configurable U-Net architecture for semantic segmentation with dynamic dimensionality support.

transformer

a highly configurable transformer architecture supporting multiple attention mechanisms and embedding methods.

3.1.6.2. Docstrings

class Module

a flexible neural network base class with enhanced training/inference capabilities. This class extends PyTorch’s nn.Module with additional features including:

  • configurable training/inference sessions

  • automatic data type handling

  • built-in training loop with stopping conditions

  • support for both regression and classification tasks

  • generator-based online learning support
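The idea of a training loop with stopping conditions can be pictured with a small pure-Python sketch. It is not the library's implementation: the `fit` function, the `'loss'` key, and the one-parameter toy model are hypothetical, and only the `'epochs'` key appears in the examples of this section.

```python
def fit(xs, ys, stop_conditions, lr=0.05):
    """Fit y = w * x by gradient descent until a stopping condition fires."""
    max_epochs = stop_conditions.get('epochs', 100)  # key used in this section's examples
    min_loss = stop_conditions.get('loss', 0.0)      # hypothetical key, for illustration only
    w, history = 0.0, []
    for _ in range(max_epochs):
        # mean squared error of the current parameter
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append(loss)
        if loss <= min_loss:
            break  # loss-based stop fires before the epoch budget is used up
        # gradient of the mean squared error with respect to w
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w, history


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
w_short, hist_short = fit(xs, ys, {'epochs': 5})
w_long, hist_long = fit(xs, ys, {'epochs': 500, 'loss': 1e-8})
```

With only an epoch budget, the loop runs all five epochs. With the additional loss threshold, it stops well before the 500-epoch limit.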

Logs:

Added in version 1.0.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

full_connected_neural

a configurable fully connected neural network module with flexible architecture options. This implementation provides a multi-layer perceptron (MLP) with customizable layer dimensions, activation functions, dropout, and data type specifications. The network can be either statically sized or dynamically initialized with lazy weight initialization.

Arguments:
Parameters:
  • structure (list[int]) – list specifying layer dimensions; its first element can be None to enable lazy initialization; e.g. [None, 256, 128] for lazy input or [784, 256, 128] for fixed input

  • activation (Union[Callable, list[Callable]]) – activation function(s) between layers; either a single function or a list with one entry per layer; e.g., nn.ReLU, or [nn.ReLU, nn.Sigmoid] for a different activation per layer

  • bias (bool) – whether to include bias terms in linear layers; True as default

  • dropout (Optional[float]) – dropout probability (0-1) applied after the last hidden layer; None as default to disable dropout

  • ctype_option (_Ctype) – torch datatype for network parameters; 'float32' as default

Returns:

a fully connected neural network

Return type:

Module

Examples:
Code 3.130 multi neural network
import torch as tch
from info.net import full_connected_neural
num_samples, input_size, output_size = 120, 10, 5
x, y_classification, y_regression = (tch.randn(num_samples, input_size),
                                     tch.randint(0, output_size, (num_samples,)),
                                     tch.randn(num_samples, output_size))
x_train, x_validation = x[:100], x[100:]
yc_train, yc_validation = y_classification[:100], y_classification[100:]
yr_train, yr_validation = y_regression[:100], y_regression[100:]

# apply on classification task
model1 = full_connected_neural(structure=[input_size, 40, output_size], activation=tch.nn.ReLU)
with model1.train_session() as md:
    md.solve(train=x_train, target=yc_train, validation=(x_validation, yc_validation))

# apply on regression task, with specified configuration
model2 = full_connected_neural(structure=[None, 40, 50, output_size], bias=False,
                               activation=[tch.nn.LeakyReLU, tch.nn.Tanh], dropout=0.2)
with model2.train_session(criterion=tch.nn.MSELoss()) as md:
    md.solve(train=x_train, target=yr_train, validation=(x_validation, yr_validation))
Notes:

this implementation provides a flexible fully connected network that supports:

  • both static and dynamic (lazy) input dimension handling

  • per-layer activation function specification

  • configurable precision through dtype options

  • optional dropout regularization

  • automatic training configuration (SGD optimizer with CrossEntropy loss by default)

the network uses PyTorch’s LazyLinear when the input dimension is unspecified (None), which automatically infers the input size during the first forward pass.
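How a `structure` list with a leading None might translate into layers can be sketched without PyTorch. The helpers `plan_layers` and `resolve_input` are hypothetical and only mirror the documented behaviour. The sketch also assumes that no activation follows the output layer, which the second example above (three linear layers, two activations) suggests.

```python
def plan_layers(structure, activation):
    """Return (in_features, out_features, activation) for each linear layer."""
    n_layers = len(structure) - 1
    # a single activation is repeated; a list supplies one per gap between layers
    acts = activation if isinstance(activation, list) else [activation] * (n_layers - 1)
    if len(acts) != n_layers - 1:
        raise ValueError('need one activation per gap between linear layers')
    acts = acts + [None]  # assumption: the output layer has no activation
    return [(structure[i], structure[i + 1], acts[i]) for i in range(n_layers)]


def resolve_input(plan, first_batch_width):
    """Fill in an unspecified input size from the first batch, as a lazy layer would."""
    (n_in, n_out, act), rest = plan[0], plan[1:]
    return [(first_batch_width if n_in is None else n_in, n_out, act)] + rest


# activation names are plain strings standing in for classes such as nn.LeakyReLU
plan = plan_layers([None, 40, 50, 5], ['LeakyReLU', 'Tanh'])
resolved = resolve_input(plan, first_batch_width=10)
```

Until the first batch arrives, the first layer keeps None as its input size. Once a batch with ten features arrives, that slot is filled in and the rest of the plan is unchanged.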

See Also:
Logs:

Added in version 1.0.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

convolutional_neural

a configurable convolutional neural network (CNN) module with flexible architecture. This framework provides a convolutional neural network architecture comprising configurable convolutional blocks cascaded with a multi-stage fully-connected network. The modular design enables flexible customization of both feature extraction components (convolutional operations) and classification modules (MLP layers), supporting both baseline configurations and application-specific topological variations through parameterized layer composition.

Arguments:
Parameters:
  • conv_structure (list[int]) – list specifying the number of output channels for each convolutional block

  • mpl_structure (list[int]) – list specifying the layer sizes for the final MLP

  • activation (Callable) – global activation function; nn.ReLU as default

  • in_dimension (int) – spatial dimension of input; must be 1, 2, or 3; 2 as default, which suits natural-image tasks

  • conv_kernel (dict) – parameter dict containing 'kernel_size', 'stride', 'padding' or 'dilation'; accepted values can be a positive integer (applied to all dimensions), or a tuple of positive integers specifying per-dimension values matching the input data’s dimensional structure; {'kernel_size': 3, 'stride': 1, 'padding': 1} as default configuration

  • batch_norm (dict) – batch normalization configuration; None as default to disable batch normalization; if provided, it should be a dict whose keys are among 'eps', 'momentum', 'affine', and 'track_running_stats', each with a value valid for that key

  • pre_activation (bool) – whether to apply the activation before the convolution (pre-activation ordering); False as default

  • pool (dict) – single-entry dict of pooling configuration; the key should be one of 'Max', 'FractionalMax', 'AdaptiveAvg', 'AdaptiveMax', 'Avg', 'LP', 'MaxUn', and its value takes the same form as the conv_kernel parameter; {'Max': {'kernel_size': 2, 'stride': 2}} as default to apply conventional max pooling

  • dropout (float) – dropout probability from 0 to 1; None as default to disable dropout

  • conv_customization (list[dict]) – list of dictionaries customizing each convolutional block’s parameters; each dict can override the default conv parameters (from activation to dropout); None as default to apply the global configuration; e.g., with conv_structure [16, 32], [{'pre_activation': True}, {'dropout': 0.4}] specifies two convolutional blocks, the first with pre-activation and the second with 0.4 dropout
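The override behaviour described for conv_customization amounts to layering per-block dictionaries over the global settings. The following is a minimal sketch that assumes a plain dictionary merge. The function `block_configs`, the `'out_channels'` key, and the string standing in for an activation class are hypothetical and not part of the library.

```python
def block_configs(conv_structure, global_config, conv_customization=None):
    """Merge the global settings with optional per-block overrides."""
    overrides = conv_customization or [{} for _ in conv_structure]
    if len(overrides) != len(conv_structure):
        raise ValueError('need one override dict per convolutional block')
    # later entries win, so a per-block value replaces the global one
    return [{'out_channels': ch, **global_config, **ov}
            for ch, ov in zip(conv_structure, overrides)]


global_config = {'activation': 'ReLU', 'pre_activation': False, 'dropout': None}
blocks = block_configs([16, 32], global_config,
                       [{'pre_activation': True}, {'dropout': 0.4}])
```

Here the first block keeps the global dropout setting but switches to pre-activation, while the second keeps the global ordering and sets dropout to 0.4.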

Returns:

a convolutional neural network

Return type:

Module

Examples:
Code 3.131 convolutional neural network
import torch as tch
from info.net import convolutional_neural

sp1, sp2, nums, n = (128, 128), (64, 64, 32), 20, 10
x_2d, x_3d, y_c, y_r = (tch.randn(nums, 3, *sp1), tch.randn(nums, 1, *sp2), tch.randint(0, 10, (nums,)),
                        tch.randn(nums, n))

# natural image classification task:
model1 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n])
with model1.train_session() as md:
    md.solve(train=x_2d, target=y_c, loading_mode='local')

# 3D image classification task:
model2 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n], in_dimensions=3)
with model2.train_session() as md:
    md.solve(train=x_3d, target=y_c, loading_mode='local')

# online loading for 3D images with customized configuration:
model3 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n], in_dimensions=3,
                              pre_activation=True, dropout=0.13)
with model3.train_session(criterion=tch.nn.HingeEmbeddingLoss()) as md:
    md.solve(train=(_ for _ in x_3d), target=(_ for _ in y_c), stop_conditions={'epochs': 40})

# or application on regression task:
model4 = convolutional_neural(conv_structure=[16, 32], mpl_structure=[128, n])
with model4.train_session(criterion=tch.nn.MSELoss()) as md:
    md.solve(train=x_2d, target=y_r, loading_mode='local')
Notes:

this implementation features:

  • dynamic input dimension handling via lazy layers

  • configurable per-block parameters

  • automatic flattening before MLP

  • default Adam optimizer (lr=0.001) and CrossEntropyLoss

it employs an MLP-based backend implementation and inherits most of its features, such as adaptability to both classification and regression tasks.
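Automatic flattening means the width of the first MLP layer depends on the spatial size left after the convolutional blocks. The sketch below estimates that width under two assumptions: each block is one convolution followed by one pooling step (the library's actual block layout may differ), and sizes follow the standard output-size formula that PyTorch documents for its convolution and pooling layers. The helper names are hypothetical.

```python
from math import prod


def out_size(n, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one axis for a convolution or pooling window."""
    return (n + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def flattened_features(spatial, conv_structure, conv_kernel, pool_kernel):
    """Number of features seen by the MLP after all blocks (assumed conv -> pool)."""
    for _ in conv_structure:
        spatial = tuple(out_size(n, **conv_kernel) for n in spatial)
        spatial = tuple(out_size(n, **pool_kernel) for n in spatial)
    return conv_structure[-1] * prod(spatial)


conv_kernel = {'kernel_size': 3, 'stride': 1, 'padding': 1}  # documented default
pool_kernel = {'kernel_size': 2, 'stride': 2}                # documented default for 'Max'
features_2d = flattened_features((128, 128), [16, 32], conv_kernel, pool_kernel)
features_3d = flattened_features((64, 64, 32), [16, 32], conv_kernel, pool_kernel)
```

Under these assumptions the default convolution preserves the spatial size and each pooling step halves it. The 128 x 128 input therefore reaches the MLP as 32 x 32 x 32 = 32768 features, and the 64 x 64 x 32 volume as 32 x 16 x 16 x 8 = 65536 features.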

See Also:
Logs:

Added in version 1.0.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

unet

a configurable U-Net architecture for semantic segmentation with dynamic dimensionality support. This implementation follows the classic U-Net encoder-decoder structure with skip connections, while providing extensive customization options through parameterized components.

Arguments:
Parameters:
  • mirrored_channels (list[int]) – channel dimensions for each level of the encoder-decoder blocks; the decoder path mirrors the encoder channel structure

  • in_dimension (int) – spatial dimension of input; must be 1, 2, or 3; 2 as default, which suits natural-image tasks

  • activation (Callable) – activation function factory; default uses in-place ReLU for memory efficiency

  • conv_kernel (dict) – parameter dict containing 'kernel_size', 'stride', 'padding' or 'dilation'; accepted values can be a positive integer (applied to all dimensions), or a tuple of positive integers specifying per-dimension values matching the input data’s dimensional structure; {'kernel_size': 3, 'stride': 1, 'padding': 1} as default configuration

  • batch_norm (dict) – batch normalization configuration with optional keys; None as default to disable batch normalization; if provided, it should be a dict whose keys are among 'eps', 'momentum', 'affine', and 'track_running_stats', each with a value valid for that key

  • pre_activation (bool) – whether to apply the activation before the convolution (pre-activation ordering); False as default

  • pool (dict) – single-entry dict of pooling configuration; the key should be one of 'Max', 'FractionalMax', 'AdaptiveAvg', 'AdaptiveMax', 'Avg', 'LP', 'MaxUn', and its value takes the same form as the conv_kernel parameter; {'Max': {'kernel_size': 2, 'stride': 2}} as default to apply conventional max pooling

  • dropout (float) – dropout probability from 0 to 1; None as default to disable dropout

  • export_channel (int) – number of output channels; positive integer no greater than 3; 1 as default

Returns:

a U-Net instance

Return type:

Module

Examples:
Code 3.132 U-Net demonstration
import torch as tch
# dice is assumed to be exposed by info.net, as the net.dice call in the original listing implies
from info.net import unet, dice

# standard 2D U-Net for binary segmentation
x, y = tch.randn(5, 1, 20, 40), tch.randint(0, 2, (5, 1, 20, 40)).float()
model1 = unet(mirrored_channels=[64, 128, 256, 512], in_dimensions=2)
with model1.train_session() as md:
    md.solve(train=x, target=y)

# 3D U-Net with custom normalization and a dice loss using a 1e-3 smoothing factor
x, y = tch.randn(5, 1, 20, 40, 35), tch.randint(0, 2, (5, 1, 20, 40, 35)).float()
model2 = unet(mirrored_channels=[16, 32], in_dimensions=3, batch_norm={'eps': 1e-6, 'momentum': 0.01},
              activation=(lambda: tch.nn.LeakyReLU(0.1)))
with model2.train_session(criterion=dice(1e-3)) as md:
    md.solve(train=x, target=y, loading_mode='local')

# 3D U-Net natively supports multimodal fusion (here, 4 input channels)
x_multi, y = tch.randn(5, 4, 20, 40, 35), tch.randint(0, 2, (5, 1, 20, 40, 35)).float()
model3 = unet(mirrored_channels=[16, 32], in_dimensions=3)
with model3.train_session() as md:
    md.solve(train=x_multi, target=y)

# 3D U-Net for multiple segmentations, trained using a mixture loss
x, y_multi = tch.randn(5, 1, 20, 40, 35), tch.randint(0, 2, (5, 3, 20, 40, 35)).float()
model4 = unet(mirrored_channels=[16, 32], in_dimensions=3, export_channel=3)
mixture_loss = (lambda m1, m2: 0.9 * dice(1e-3)(m1, m2) + 0.1 * tch.nn.CrossEntropyLoss()(m1, m2))
with model4.train_session(criterion=mixture_loss) as md:
    md.solve(train=x, target=y_multi, loading_mode='local')
Notes:

architectural features:

  • symmetric encoder-decoder structure with skip connection

  • automatic handling of input dimensions (1D/2D/3D)

  • dynamic channel sizing through mirrored_channels argument

  • lazy initialization for input flexibility

  • nearest-exact interpolation for precise feature map alignment
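Feature-map alignment matters because pooling odd-sized maps loses a row or column that plain doubling cannot restore. The sketch below shows nearest-neighbour resampling at pixel centres along one axis, which is the usual definition of the nearest-exact rule. How the library itself invokes interpolation is an assumption, and the helper names are hypothetical.

```python
def nearest_exact_indices(n_in, n_out):
    """Source index for each output position, sampling at pixel centres."""
    # integer form of floor((i + 0.5) * n_in / n_out), clamped to the last index
    return [min(((2 * i + 1) * n_in) // (2 * n_out), n_in - 1) for i in range(n_out)]


def resize_1d(values, n_out):
    """Resample a sequence to n_out entries with the rule above."""
    return [values[j] for j in nearest_exact_indices(len(values), n_out)]


# an axis of length 5 pooled with kernel 2 and stride 2 shrinks to 2;
# doubling 2 gives 4, not 5, so the decoder map is resized to the skip size
upsampled = resize_1d(['a', 'b'], 5)
```

Resizing directly to the length of the skip connection, rather than by a fixed factor of two, keeps the two maps compatible for concatenation whatever the input size.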

the default training configuration uses the Adam optimizer with a 0.001 learning rate and dice loss with a 1e-5 smoothing factor; to customize it, pass the criterion or optimizer argument when opening the train session.
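The role of the smoothing factor in a dice loss can be seen in a few lines of plain Python. This is a common soft-Dice formulation over flattened masks, shown as a sketch only. The library's own dice criterion may place the smoothing term differently, and `dice_loss` is a hypothetical name.

```python
def dice_loss(pred, target, smooth=1e-5):
    """Soft Dice loss over flat sequences of values in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    return 1.0 - (2.0 * intersection + smooth) / (sum(pred) + sum(target) + smooth)


perfect = dice_loss([1, 0, 1, 1], [1, 0, 1, 1])   # identical masks
disjoint = dice_loss([1, 1, 0, 0], [0, 0, 1, 1])  # no overlap at all
empty = dice_loss([0, 0, 0, 0], [0, 0, 0, 0])     # both masks empty
```

Identical masks give a loss of zero and disjoint masks a loss close to one. The smoothing term keeps the ratio defined when both masks are empty, in which case the loss is also zero.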

See Also:
Logs:

Added in version 1.0.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06

transformer

a highly configurable transformer architecture supporting multiple attention mechanisms and embedding methods. This implementation provides dynamic dimensionality handling, flexible positional encoding strategies, and modular attention configurations suitable for sequence-to-sequence tasks.

Arguments:
Parameters:
  • dimension_model (int) – hidden dimension size, must be positive integer

  • num_heads (int) – number of attention heads, positive integer; if it does not evenly divide dimension_model, the value is adjusted heuristically

  • vocabulary_size (dict[Literal['in', 'out'], int]) – dictionary specifying input and output vocabulary sizes; {'in': 10000, 'out': 8000} as default

  • embedding_func (dict[Literal['in', 'out', 'endmost'], Optional[Callable]]) – custom embedding functions for input, output, and final layer; acceptable value is dictionary containing 'in', 'out' and 'endmost' as keys, and embedding function as their respective value; default configuration uses None to automatically initialize the embedding function

  • encoding_meth (Literal['sinusoid', 'trainable', 'relative', 'rotation']) – positional encoding method; must be one of 'sinusoid', 'trainable', 'relative', and 'rotation'; 'sinusoid' as default, matching the canonical transformer implementation

  • encoding_configs (dict) – configuration dict for positional encoding (method-specific parameters); default configuration uses {'max_length': 5000, 'base': 10000} for 'sinusoid' encoding, {'max_relative': 3} for 'relative', and {'theta': 10000.0, 'start_pos': 0} for 'rotation'

  • dimension_feed_forward (int) – dimension of feed forward network; 2048 as default

  • activation (Callable) – global activation function; torch.nn.ReLU as default

  • num_layers (Union[int, tuple[int, int]]) – encoder and decoder layer counts; a tuple specifies an unbalanced encoder-decoder architecture; e.g., (6, 3) for a transformer with unequal encoder and decoder depths

  • attn_init (dict) – initialization parameters for the attention layers; the standard configuration uses {'bias': True, 'add_bias_kv': False, 'add_zero_attn': False, 'batch_first': True}; with the 'relative' and 'rotation' encoding methods, 'kdim' and 'vdim' are determined by dimension_model for cross attention and are None for self attention; with 'sinusoid' and 'trainable', they are None for both self and cross attention; 'dropout' is taken from the global dropout

  • attn_forward (dict) – configuration passed to the attention forward call; the standard setting uses 'need_weights' as True, 'attn_mask' as None, 'average_attn_weights' as True, and 'is_causal' as False; any customized values provided override these defaults

  • dropout (float) – global dropout rate; 0.1 as default
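Multi-head attention splits dimension_model evenly across heads, so num_heads must divide it. The adjustment rule the library applies is not documented in this section. The sketch below shows one plausible heuristic (the closest divisor, with ties going to the smaller value) purely as an illustration, and `adjust_num_heads` is a hypothetical helper.

```python
def adjust_num_heads(dimension_model, num_heads):
    """Return the divisor of dimension_model closest to num_heads (ties go to the smaller)."""
    divisors = [d for d in range(1, dimension_model + 1) if dimension_model % d == 0]
    return min(divisors, key=lambda d: (abs(d - num_heads), d))


kept = adjust_num_heads(512, 8)      # 8 already divides 512, so it is kept
adjusted = adjust_num_heads(512, 6)  # 6 does not; 4 and 8 are equally close
head_dim = 512 // adjusted           # per-head width after the adjustment
```

Passing a value that already divides dimension_model sidesteps the adjustment and keeps the per-head width predictable.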

Returns:

a Transformer model

Return type:

Module

Examples:
Code 3.133 transformer demonstration
import torch as tch
from info.net import transformer

batch, seq1, seq2, voc1, voc2 = 32, 20, 15, 10000, 8000
src, tgt = tch.randint(0, voc1, (batch, seq1)), tch.randint(0, voc2, (batch, seq2))
src_msk, tgt_msk = tch.randn(src.shape) > 0, tch.randn(tgt.shape) > 0

# application on basic sequence-to-sequence task (e.g., machine translation):
model1 = transformer(dimension_model=512, num_heads=8)
with model1.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk))

# flexibility to support multiple input data types (NumPy array, tensor, generator)
src_np, tgt_pt, tgt_msk_gen = src.numpy(), tgt, (_ for _ in tgt_msk.clone())
model2 = transformer(dimension_model=512, num_heads=8)
with model2.train_session() as md:
    md.solve(train=(src_np, src_msk), target=(tgt_pt, tgt_msk_gen))

# parameter-reduced memory-efficient model for edge devices, importing locally stored data for training
model3 = transformer(dimension_model=256, num_heads=4, dimension_feed_forward=1024, num_layers=(4, 2),
                     dropout=0.05)
with model3.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk), loading_mode='local')

# rotary position embedding to comprehend long-range dependence of sequence
model4 = transformer(dimension_model=512, num_heads=8, encoding_meth='rotation',
                     encoding_configs={'max_length': 4096, 'theta': 10000.0, 'start_pos': 0})
with model4.train_session(optimizer=tch.optim.Adam(model4.parameters(), lr=0.005)) as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk))

# transfer learning using a pre-trained embedding function (base_model is a placeholder for a user-supplied model)
emb_func = base_model.from_pretrain(...)
model5 = transformer(dimension_model=512, num_heads=8, encoding_meth='trainable',
                     embedding_func={'in': emb_func, 'out': None, 'endmost': None})
with model5.train_session() as md:
    md.solve(train=(src, src_msk), target=(tgt, tgt_msk), stop_conditions={'epochs': 50})
Notes:

architectural features:

  • flexibility on encoding method options

  • dynamic attention mechanism selection

  • configurable encoder-decoder asymmetry

  • expandability for integrating on-going works
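Of the encoding methods, 'sinusoid' is the easiest to write down explicitly. The sketch below follows the standard formulation from the original transformer design, with `max_length` and `base` named after the documented encoding_configs keys. Whether the library arranges the columns in exactly this interleaved way is an assumption, and `sinusoid_encoding` is a hypothetical name.

```python
import math


def sinusoid_encoding(max_length, dimension_model, base=10000.0):
    """Table of max_length rows and dimension_model columns: sin on even columns, cos on odd."""
    table = []
    for pos in range(max_length):
        row = []
        for i in range(dimension_model):
            # columns 2k and 2k + 1 share the same frequency
            angle = pos / base ** (2 * (i // 2) / dimension_model)
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


pe = sinusoid_encoding(max_length=50, dimension_model=8)
```

Each row is added to the token embedding at the matching position. Because the table is fixed, this method adds no trainable parameters, unlike the 'trainable' option.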

See Also:
Logs:

Added in version 1.0.

– Created by Chen Zhang; Last updated on 01:34, 2025-09-06


Authors:

Chen Zhang

Version:

0.0.6

Created on:

Jun 11, 2025