rtnn.models package
Submodules
rtnn.models.rnn module
Bidirectional recurrent neural network models for sequence modeling.
This module provides implementations of bidirectional recurrent neural networks (RNNs) using Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) cells. These models are designed for sequence-based data, such as time series or vertically structured physical profiles.
The module includes:
BaseRNN: A flexible base class supporting both LSTM and GRU architectures with bidirectional processing.
RNN_LSTM: A specialized LSTM-based model built on BaseRNN.
RNN_GRU: A specialized GRU-based model built on BaseRNN.
Features
Bidirectional sequence processing for improved context awareness
Support for stacked recurrent layers
Unified interface for LSTM and GRU architectures
Automatic hidden state initialization
Final 1D convolution layer for channel-wise output projection
Compatible with batched inputs and GPU acceleration
Notes
Inputs are expected in the shape (batch_size, feature_channel, seq_length).
Internally, inputs are permuted to (batch_size, seq_length, feature_channel) to match PyTorch RNN requirements.
Bidirectional RNNs double the hidden state size, which is handled automatically in the final projection layer.
Hidden states are initialized to zeros at each forward pass.
The final Conv1d layer maps hidden representations to the desired output channels while preserving sequence length.
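The shape flow described in these notes can be traced with plain PyTorch layers. This is a minimal sketch rather than the rtnn implementation: `batch_first=True` and `kernel_size=1` are assumptions chosen to match the permuted layout and the "preserving sequence length" note.

```python
import torch
import torch.nn as nn

batch_size, feature_channel, seq_length = 16, 6, 10
hidden_size, num_layers, output_channel = 128, 3, 4

# Bidirectional LSTM; batch_first=True is an assumption matching the
# (batch_size, seq_length, feature_channel) layout described above.
rnn = nn.LSTM(
    input_size=feature_channel,
    hidden_size=hidden_size,
    num_layers=num_layers,
    batch_first=True,
    bidirectional=True,
)
# Both directions are concatenated, so the projection sees 2 * hidden_size
# channels. kernel_size=1 is an assumption that keeps the sequence length.
final = nn.Conv1d(2 * hidden_size, output_channel, kernel_size=1)

x = torch.randn(batch_size, feature_channel, seq_length)

# Zero initial states: one slice per layer and per direction.
h0 = torch.zeros(2 * num_layers, batch_size, hidden_size)
c0 = torch.zeros(2 * num_layers, batch_size, hidden_size)

seq = x.permute(0, 2, 1)           # (batch, seq, features)
out, _ = rnn(seq, (h0, c0))        # (batch, seq, 2 * hidden_size)
y = final(out.permute(0, 2, 1))    # (batch, output_channel, seq)
```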
Dependencies
torch
torch.nn
typing
Examples
Using LSTM-based model:
>>> model = RNN_LSTM(
... feature_channel=6,
... output_channel=4,
... hidden_size=128,
... num_layers=3
... )
>>> x = torch.randn(16, 6, 10)
>>> y = model(x)
Using GRU-based model:
>>> model = RNN_GRU(
... feature_channel=6,
... output_channel=4,
... hidden_size=128,
... num_layers=3
... )
>>> x = torch.randn(16, 6, 10)
>>> y = model(x)
Using BaseRNN directly:
>>> model = BaseRNN(
... feature_channel=6,
... output_channel=4,
... hidden_size=64,
... num_layers=2,
... rnn_type="lstm"
... )
- class rtnn.models.rnn.BaseRNN(*args: Any, **kwargs: Any)[source]
Bases:
Module
Base class for bidirectional RNN modules (LSTM/GRU).
This class provides a common interface for both LSTM and GRU models with bidirectional processing and a final 1D convolutional layer to map the hidden states to the desired output channels.
- Parameters:
feature_channel (int) – Number of input features per time step.
output_channel (int) – Number of output channels (target variables).
hidden_size (int) – Number of hidden units in the RNN layers.
num_layers (int) – Number of stacked RNN layers.
rnn_type (str) – Type of RNN cell, either ‘lstm’ or ‘gru’.
- rnn
The bidirectional RNN layer.
- Type:
nn.LSTM or nn.GRU
- final
Final 1D convolution to project hidden states to output channels.
- Type:
nn.Conv1d
Number of hidden units.
- Type:
Examples
>>> model = BaseRNN(
... feature_channel=6,
... output_channel=4,
... hidden_size=64,
... num_layers=2,
... rnn_type='lstm'
... )
>>> x = torch.randn(32, 6, 10) # (batch, features, sequence)
>>> y = model(x)
>>> y.shape
torch.Size([32, 4, 10])
- __init__(feature_channel: int, output_channel: int, hidden_size: int, num_layers: int, rnn_type: str) None[source]
Initialize the BaseRNN module.
Initialize the hidden state for the RNN.
- Parameters:
batch_size (int) – Batch size for the input.
device (torch.device) – Device to create the hidden state on.
- Returns:
For GRU: hidden state tensor of shape (2 * num_layers, batch_size, hidden_size).
For LSTM: tuple (hidden, cell), each with that same shape.
- Return type:
torch.Tensor or tuple of torch.Tensor
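The return convention above can be illustrated with a small helper. The function name `zero_hidden` and its signature are hypothetical and exist only for this sketch. The factor of 2 in the leading dimension accounts for the two directions.

```python
import torch

def zero_hidden(rnn_type, num_layers, batch_size, hidden_size, device):
    """Hypothetical helper: zero initial state for a bidirectional RNN."""
    # Two directions per layer, hence 2 * num_layers.
    shape = (2 * num_layers, batch_size, hidden_size)
    hidden = torch.zeros(shape, device=device)
    if rnn_type == "lstm":
        # LSTM carries a cell state with the same shape as the hidden state.
        return hidden, torch.zeros(shape, device=device)
    return hidden

gru_state = zero_hidden("gru", 3, 16, 128, torch.device("cpu"))
lstm_hidden, lstm_cell = zero_hidden("lstm", 3, 16, 128, torch.device("cpu"))
```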
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass through the bidirectional RNN.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, feature_channel, seq_length).
- Returns:
Output tensor of shape (batch_size, output_channel, seq_length).
- Return type:
torch.Tensor
Notes
The input is permuted to (batch_size, seq_length, feature_channel) for the RNN, then the output is permuted back for the convolution.
- class rtnn.models.rnn.RNN_LSTM(*args: Any, **kwargs: Any)[source]
Bases:
BaseRNN
LSTM-based bidirectional RNN model.
This class inherits from BaseRNN and configures it to use LSTM cells.
- Parameters:
Examples
>>> model = RNN_LSTM(
... feature_channel=6,
... output_channel=4,
... hidden_size=128,
... num_layers=3
... )
>>> x = torch.randn(16, 6, 10)
>>> y = model(x)
>>> print(y.shape)
torch.Size([16, 4, 10])
- class rtnn.models.rnn.RNN_GRU(*args: Any, **kwargs: Any)[source]
Bases:
BaseRNN
GRU-based bidirectional RNN model.
This class inherits from BaseRNN and configures it to use GRU cells.
- Parameters:
Examples
>>> model = RNN_GRU(
... feature_channel=6,
... output_channel=4,
... hidden_size=128,
... num_layers=3
... )
>>> x = torch.randn(16, 6, 10)
>>> y = model(x)
>>> print(y.shape)
torch.Size([16, 4, 10])
rtnn.models.fcn module
Neural network building blocks and radiative transfer-inspired models.
This module provides PyTorch modules for fully connected networks designed for structured data, particularly vertical profile modeling such as atmospheric or canopy radiative transfer.
The module includes:
FCBlock: A reusable fully connected block with normalization and activation.
FCN: A configurable fully connected network for sequence-like inputs.
Features
Modular fully connected components with batch normalization
Flexible depth and width configuration for dense networks
Support for sequence reshaping and optional dimension expansion
Notes
FCN expects inputs shaped as (batch_size, feature_channel, seq_length) and internally flattens them before processing.
Examples
Using FCBlock:
>>> block = FCBlock(128, 64)
>>> x = torch.randn(32, 128)
>>> y = block(x)
Using FCN:
>>> model = FCN(
... feature_channel=6,
... output_channel=4,
... num_layers=3,
... hidden_size=196,
... seq_length=10
... )
>>> x = torch.randn(32, 6, 10)
>>> y = model(x)
- class rtnn.models.fcn.FCBlock(*args: Any, **kwargs: Any)[source]
Bases:
Module
A fully connected block with linear layer, batch normalization, and ReLU activation.
This module applies a linear transformation, followed by batch normalization, and then a ReLU activation function.
- Parameters:
- linear
Linear transformation layer.
- Type:
nn.Linear
- bn
Batch normalization layer.
- Type:
nn.BatchNorm1d
- relu
ReLU activation function.
- Type:
nn.ReLU
Examples
>>> block = FCBlock(128, 64)
>>> x = torch.randn(32, 128)
>>> y = block(x)
>>> y.shape
torch.Size([32, 64])
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass through the FCBlock.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, in_features).
- Returns:
Output tensor of shape (batch_size, out_features).
- Return type:
torch.Tensor
Notes
The forward pass applies: ReLU(BatchNorm(Linear(x)))
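The composition ReLU(BatchNorm(Linear(x))) can be reproduced with standard PyTorch layers. This is an equivalent sketch built with `nn.Sequential`, not the FCBlock source itself.

```python
import torch
import torch.nn as nn

in_features, out_features = 128, 64

# Layer order mirrors the attributes listed above: linear, bn, relu.
block = nn.Sequential(
    nn.Linear(in_features, out_features),
    nn.BatchNorm1d(out_features),
    nn.ReLU(),
)

x = torch.randn(32, in_features)
y = block(x)  # (32, 64), non-negative because ReLU is applied last
```

Because batch normalization uses batch statistics in training mode, a batch size greater than 1 is needed unless the block is switched to `eval()`.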
- class rtnn.models.fcn.FCN(*args: Any, **kwargs: Any)[source]
Bases:
Module
Fully Connected Network with configurable depth and width.
This model flattens the input sequence and processes it through a series of fully connected layers. It can optionally expand the sequence length using a linear transformation.
- Parameters:
feature_channel (int) – Number of input features per time step.
output_channel (int) – Number of output channels.
num_layers (int) – Number of hidden layers.
hidden_size (int) – Size of hidden layers.
seq_length (int, optional) – Length of the input sequence. Default is 10.
dim_expand (int, optional) – Number of time steps to expand the output sequence by. Default is 0 (no expansion).
Stack of hidden layers.
- Type:
nn.Sequential
- output_layer
Final output layer.
- Type:
nn.Linear
- dim_change
Optional layer for sequence length expansion.
- Type:
nn.Linear or None
Examples
>>> model = FCN(
... feature_channel=6,
... output_channel=4,
... num_layers=3,
... hidden_size=196,
... seq_length=10
... )
>>> x = torch.randn(32, 6, 10)
>>> y = model(x)
>>> y.shape
torch.Size([32, 4, 10])
- __init__(feature_channel: int, output_channel: int, num_layers: int, hidden_size: int, seq_length: int = 10, dim_expand: int = 0) None[source]
Initialize the FCN model.
- Parameters:
feature_channel (int) – Number of input features.
output_channel (int) – Number of output channels.
num_layers (int) – Number of hidden layers.
hidden_size (int) – Size of hidden layers.
seq_length (int, optional) – Length of the input sequence. Default is 10.
dim_expand (int, optional) – Number of time steps to expand the output sequence by. Default is 0 (no expansion).
- Raises:
ValueError – If num_layers is less than 1.
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass through the FCN.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, feature_channel, seq_length).
- Returns:
Output tensor of shape (batch_size, output_channel, seq_length + dim_expand) if dim_expand > 0, otherwise (batch_size, output_channel, seq_length).
- Return type:
torch.Tensor
Notes
The forward pass:
1. Flattens the input to (batch_size, feature_channel * seq_length)
2. Passes through FCBlocks
3. Projects to output dimensions
4. Reshapes to (batch_size, output_channel, seq_length)
5. Optionally expands sequence length
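The five steps can be followed shape by shape in a short sketch. The layer sizes here are illustrative, and two details are assumptions rather than confirmed behavior: the output layer producing `output_channel * seq_length` features, and the expansion being a linear map over the last axis from `seq_length` to `seq_length + dim_expand`.

```python
import torch
import torch.nn as nn

B, C, L, H, out_C, dim_expand = 32, 6, 10, 196, 4, 2

# Two Linear -> BatchNorm -> ReLU stages stand in for the FCBlock stack.
hidden = nn.Sequential(
    nn.Linear(C * L, H), nn.BatchNorm1d(H), nn.ReLU(),
    nn.Linear(H, H), nn.BatchNorm1d(H), nn.ReLU(),
)
output_layer = nn.Linear(H, out_C * L)      # assumed output width
dim_change = nn.Linear(L, L + dim_expand)   # assumed expansion layer

x = torch.randn(B, C, L)
flat = x.reshape(B, -1)                     # step 1: (B, C * L)
h = hidden(flat)                            # step 2: (B, H)
out = output_layer(h)                       # step 3: (B, out_C * L)
y = out.reshape(B, out_C, L)                # step 4: (B, out_C, L)
y_expanded = dim_change(y)                  # step 5: (B, out_C, L + dim_expand)
```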
rtnn.models.transformer module
Transformer-based encoder model for sequence modeling.
This module implements a Transformer encoder architecture using PyTorch’s
native nn.TransformerEncoder components. It is designed for processing
structured sequence data, such as time series or vertical profiles, where
contextual relationships across positions are important.
The model projects input features into an embedding space, adds learnable positional encodings, and processes the sequence through stacked self-attention layers before projecting to the desired output channels.
Features
Learnable input projection to embedding space
Learnable positional embeddings for sequence order awareness
Multi-head self-attention via Transformer encoder layers
Configurable depth, attention heads, and feedforward expansion
Dropout for regularization
Final 1D convolution for channel-wise output projection
Support for attention masks and padding masks
Notes
Inputs are expected in the shape (batch_size, feature_channel, seq_length).
Internally, inputs are permuted to (batch_size, seq_length, feature_channel) to match Transformer expectations.
Positional embeddings are added to the projected input features.
The mask argument is used for attention masking (e.g., causal masking).
The src_key_padding_mask is used to ignore padded positions in sequences.
The final output preserves the sequence length and maps embeddings to output_channel dimensions.
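The pipeline in these notes can be approximated with PyTorch's built-in encoder layers. This is a sketch under stated assumptions, not the EncoderTorch source: the input projection is taken to be `nn.Linear`, the positional embedding a zero-initialized `nn.Parameter`, the feedforward width `forward_expansion * embed_size`, and the final convolution `kernel_size=1`.

```python
import torch
import torch.nn as nn

B, C, L, E, out_C = 32, 6, 10, 128, 4
heads, forward_expansion, num_layers = 4, 4, 3

proj = nn.Linear(C, E)                       # assumed input projection
pos = nn.Parameter(torch.zeros(1, L, E))     # learnable positional embedding
layer = nn.TransformerEncoderLayer(
    d_model=E,
    nhead=heads,
    dim_feedforward=forward_expansion * E,   # assumed meaning of forward_expansion
    dropout=0.1,
    batch_first=True,
)
encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
final = nn.Conv1d(E, out_C, kernel_size=1)   # assumed kernel size

x = torch.randn(B, C, L)

# True marks positions that must NOT be attended to.
causal = torch.triu(torch.ones(L, L), diagonal=1).bool()   # (L, L)
padding = torch.zeros(B, L, dtype=torch.bool)              # (B, L)
padding[:, -2:] = True                                     # last two steps are padding

h = proj(x.permute(0, 2, 1)) + pos           # (B, L, E)
h = encoder(h, mask=causal, src_key_padding_mask=padding)
y = final(h.permute(0, 2, 1))                # (B, out_C, L)
```

The attention mask is shared across the batch and has shape (seq_length, seq_length), while the padding mask is per sample and has shape (batch_size, seq_length).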
Dependencies
torch
torch.nn
typing
Examples
Basic usage:
>>> model = EncoderTorch(
... feature_channel=6,
... output_channel=4,
... embed_size=128,
... num_layers=3,
... heads=4,
... forward_expansion=4,
... seq_length=10,
... dropout=0.1
... )
>>> x = torch.randn(32, 6, 10)
>>> y = model(x)
>>> y.shape
torch.Size([32, 4, 10])
Using attention masks:
>>> mask = torch.triu(torch.ones(10, 10), diagonal=1).bool()
>>> y = model(x, mask=mask)
- class rtnn.models.transformer.EncoderTorch(*args: Any, **kwargs: Any)[source]
Bases:
Module
- __init__(feature_channel: int, output_channel: int, embed_size: int, num_layers: int, heads: int, forward_expansion: int, seq_length: int, dropout: float) None[source]
- forward(x: torch.Tensor, mask: torch.Tensor | None = None, src_key_padding_mask: torch.Tensor | None = None) torch.Tensor[source]
x: (batch, feature_channel, seq_length)
rtnn.models.mlp module
Multi-layer perceptron architectures for structured and sequence-based modeling.
This module provides flexible and extensible implementations of multi-layer perceptrons (MLPs) tailored for tasks such as radiative transfer emulation and other scientific machine learning applications involving structured inputs.
The module includes:
MLPBlock: A configurable fully connected block with optional normalization, activation, and dropout.
MLP: A flexible MLP architecture supporting positional embeddings, residual connections, and customizable depth.
MLPResidual: A residual MLP with skip connections across all hidden layers for improved gradient flow and training stability.
Features
Configurable hidden layer sizes and depth
Support for multiple normalization strategies (batch norm, layer norm)
Choice of activation functions (ReLU, GELU, SiLU)
Optional dropout for regularization
Residual connections for improved optimization
Learnable positional embeddings for sequence-aware modeling
Designed for flattened sequence inputs and structured data
Notes
Inputs are expected in the shape (batch_size, feature_channel, seq_length) and are internally flattened before processing.
Positional embeddings, when enabled, are concatenated to the input features before passing through the network.
Residual connections in MLP are applied globally, while MLPResidual applies residual connections at every hidden layer.
Layer normalization is applied to outputs for improved numerical stability.
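One way to read "concatenated to the input features" is sketched below. The layout is an assumption: one learnable vector per sequence position, broadcast over the batch and concatenated along the channel axis before flattening. The actual MLP class may arrange this differently.

```python
import torch
import torch.nn as nn

B, C, L, P, out_C = 32, 6, 10, 16, 4   # P plays the role of positional_embed_dim

# Assumed layout: one learnable P-dimensional vector per sequence position.
pos_embed = nn.Parameter(torch.randn(L, P) * 0.02)

x = torch.randn(B, C, L)
pos = pos_embed.t().unsqueeze(0).expand(B, -1, -1)   # (B, P, L)
x_cat = torch.cat([x, pos], dim=1)                   # (B, C + P, L)
flat = x_cat.reshape(B, -1)                          # (B, (C + P) * L)

# A small dense stack stands in for the hidden layers.
mlp = nn.Sequential(
    nn.Linear((C + P) * L, 128),
    nn.ReLU(),
    nn.Linear(128, out_C * L),
)
y = mlp(flat).reshape(B, out_C, L)                   # (B, out_C, L)
```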
Dependencies
torch
torch.nn
typing
Examples
Basic MLP usage:
>>> model = MLP(
... feature_channel=6,
... output_channel=4,
... seq_length=10,
... hidden_sizes=[512, 256, 128]
... )
>>> x = torch.randn(32, 6, 10)
>>> y = model(x)
Using MLP with positional embeddings and residuals:
>>> model = MLP(
... feature_channel=6,
... output_channel=4,
... seq_length=10,
... use_positional_embedding=True,
... use_residual=True
... )
Using MLPResidual:
>>> model = MLPResidual(
... feature_channel=6,
... output_channel=4,
... seq_length=10,
... hidden_size=256,
... num_layers=4
... )
>>> x = torch.randn(16, 6, 10)
>>> y = model(x)
- class rtnn.models.mlp.MLPBlock(*args: Any, **kwargs: Any)[source]
Bases:
Module
A single MLP block with linear layer, normalization, activation, and dropout.
- Parameters:
in_features (int) – Number of input features.
out_features (int) – Number of output features.
dropout (float, optional) – Dropout rate. Default is 0.1.
use_batch_norm (bool, optional) – Whether to use batch normalization. Default is True.
use_layer_norm (bool, optional) – Whether to use layer normalization. Default is False.
activation (str, optional) – Activation function (‘relu’, ‘gelu’, ‘silu’). Default is ‘relu’.
- __init__(in_features: int, out_features: int, dropout: float = 0.1, use_batch_norm: bool = True, use_layer_norm: bool = False, activation: str = 'relu')[source]
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass.
- class rtnn.models.mlp.MLP(*args: Any, **kwargs: Any)[source]
Bases:
Module
Multi-Layer Perceptron for radiative transfer emulation.
- Parameters:
feature_channel (int) – Number of input features per time step.
output_channel (int) – Number of output channels.
seq_length (int) – Length of the input sequence.
hidden_sizes (List[int], optional) – List of hidden layer sizes. Default is [512, 256, 128].
dropout (float, optional) – Dropout rate. Default is 0.1.
use_batch_norm (bool, optional) – Whether to use batch normalization. Default is True.
use_layer_norm (bool, optional) – Whether to use layer normalization. Default is False.
use_residual (bool, optional) – Whether to use residual connections. Default is False.
activation (str, optional) – Activation function (‘relu’, ‘gelu’, ‘silu’). Default is ‘relu’.
use_positional_embedding (bool, optional) – Whether to add positional embeddings. Default is True.
positional_embed_dim (int, optional) – Dimension of positional embeddings. Default is 16.
- __init__(feature_channel: int, output_channel: int, seq_length: int = 10, hidden_sizes: List[int] = None, dropout: float = 0.1, use_batch_norm: bool = True, use_layer_norm: bool = False, use_residual: bool = False, activation: str = 'relu', use_positional_embedding: bool = True, positional_embed_dim: int = 16)[source]
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass.
- class rtnn.models.mlp.MLPResidual(*args: Any, **kwargs: Any)[source]
Bases:
Module
MLP with residual connections between all layers.
- Parameters:
- __init__(feature_channel: int, output_channel: int, seq_length: int = 10, hidden_size: int = 256, num_layers: int = 4, dropout: float = 0.1)[source]
- forward(x: torch.Tensor) torch.Tensor[source]
Forward pass with residual connections.
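A residual connection at every hidden layer means each block's output is added back to its input. The class below is a hypothetical stand-in that reuses the documented constructor arguments. Its internals (block contents, activation, flattening) are assumptions, not the MLPResidual source.

```python
import torch
import torch.nn as nn

class ResidualMLPSketch(nn.Module):
    """Hypothetical illustration of per-layer skip connections."""

    def __init__(self, feature_channel=6, output_channel=4, seq_length=10,
                 hidden_size=256, num_layers=4, dropout=0.1):
        super().__init__()
        self.output_channel = output_channel
        self.seq_length = seq_length
        self.input_layer = nn.Linear(feature_channel * seq_length, hidden_size)
        # Every block maps hidden_size -> hidden_size so the sum is well defined.
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, hidden_size),
                nn.ReLU(),
                nn.Dropout(dropout),
            )
            for _ in range(num_layers)
        ])
        self.output_layer = nn.Linear(hidden_size, output_channel * seq_length)

    def forward(self, x):
        h = self.input_layer(x.flatten(start_dim=1))   # (B, hidden_size)
        for block in self.blocks:
            h = h + block(h)                           # skip connection per layer
        out = self.output_layer(h)
        return out.view(x.size(0), self.output_channel, self.seq_length)

model = ResidualMLPSketch()
y = model(torch.randn(16, 6, 10))
```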
rtnn.models.pinn module
Physics-inspired neural network architectures for vertical profile modeling. These models are designed to capture the structured interactions in radiative transfer processes, particularly in vegetation canopies. The key innovation is the two-stream formulation that mimics the coupled upward and downward fluxes in radiative transfer, with learnable coupling coefficients and separate output heads for each stream.
- class rtnn.models.pinn.LayerPositionalEmbedding(*args: Any, **kwargs: Any)[source]
Bases:
Module
Learnable positional embedding for layer-wise data.
This module provides a learnable embedding for each layer in a vertical profile, allowing the model to distinguish between different physical layers (e.g., canopy levels). The embedding is added to the input features before processing.
- class rtnn.models.pinn.PINN(*args: Any, **kwargs: Any)[source]
Bases:
Module
Two-stream RT emulator for vegetation canopies.
This architecture is inspired by the coupled radiative transfer equations:
Coupled sweep: D and U interact at every layer via C_down/C_up, mirroring the γ2 cross-coupling in Eq. (2) of the paper:
d_new = T_down * d + C_down * u + S_down
u_new = T_up * u + C_up * d + S_up
The upward sweep then refines U[l] using the same coupling, with D[l] already fixed.
Separate projection heads for D and U before merging:
head_D(D[l]) + head_U(U[l]) + skip_proj(h[:,l])
This preserves the physical identity of each stream.
Sigmoid on C_down / C_up keeps coupling coefficients in (0,1), consistent with γ2 being a positive scattering fraction.
Input x shape: (B, C, L), where C = feature channels and L = 10 layers.
Output x shape: (B, out_C, L).
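The coupled update can be checked numerically with scalars standing in for the per-layer tensors. This sketch covers only the d_new/u_new step with sigmoid-bounded coupling. The function name `coupled_step`, its argument names, and the chosen coefficient values are illustrative assumptions, and the upward refinement sweep and projection heads are not modeled.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def coupled_step(d, u, t_down, t_up, c_down_logit, c_up_logit,
                 s_down=0.0, s_up=0.0):
    """Hypothetical single-layer update of the two streams."""
    # Sigmoid keeps both coupling coefficients strictly inside (0, 1).
    c_down = sigmoid(c_down_logit)
    c_up = sigmoid(c_up_logit)
    # Both updates read the values from before this step.
    d_new = t_down * d + c_down * u + s_down
    u_new = t_up * u + c_up * d + s_up
    return d_new, u_new

# Unit downward flux at the top, no upward flux, three identical layers.
d, u = 1.0, 0.0
for _ in range(3):
    d, u = coupled_step(d, u, t_down=0.5, t_up=0.5,
                        c_down_logit=0.0, c_up_logit=0.0)
```

With logits of 0 the coupling coefficients are exactly 0.5, so half of each stream is transferred to the other at every layer.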