AI Now Lives in Time: Temporal DenseNet

Independent Research

Abstract

This network is a fully connected “temporal” architecture in which each neuron receives input not only from the neurons in earlier layers of the current computation step, but also from all hidden neurons of the previous computation step (tick).

By spreading information across layers and time, the architecture makes it difficult for any single neuron to memorize an input-output pair directly. Instead, the network learns patterns in a distributed way, naturally favoring generalization over memorization. This makes it useful for tasks where datasets are small or overfitting is a risk, because the architecture itself discourages simple lookup-table memorization and encourages learning the underlying rules.

Temporal Accumulation Results

The model reaches very low training error on the XOR dataset by using time ticks to accumulate state across iterative computation steps.

  • Final Loss: 0.000002
  • Ticks: 3 iterative steps
  • Hidden layers: 3 layers of 8 neurons each (8×8×8)
Epoch    MSE Loss
200      0.000091
1000     0.000008
2000     0.000002

Final Prediction Accuracy

The network achieves near-perfect separation for non-linear logic:

Raw Predictions:
[[7.0460670e-04]
 [9.9793684e-01]
 [9.9922490e-01]
 [2.1156450e-03]]

Rounded: [0, 1, 1, 0]

PyTorch Implementation

The architecture uses nn.ModuleList to manage the current-tick layers ($U$) and the previous-tick recurrence layers ($W$).
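
In symbols, writing $h_i^{(t)}$ for the activations of hidden layer $i$ at tick $t$ and $[\,\cdot\,;\,\cdot\,]$ for concatenation (notation introduced here for clarity, not taken from the code), the forward pass below computes

$$h_i^{(t)} = \tanh\!\left(U_i\big[h_1^{(t)};\dots;h_{i-1}^{(t)}\big] + W_i\big[h_1^{(t-1)};\dots;h_L^{(t-1)}\big]\right),$$

where the first layer takes the raw input $x$ in place of earlier-layer activations, the $W_i$ term is dropped on the first tick, and the output is a sigmoid readout of the concatenated current-tick state $[h_1^{(t)};\dots;h_L^{(t)}]$.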


import torch
import torch.nn as nn

class TemporalDenseNet(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super().__init__()
        self.num_layers = len(hidden_sizes)
        self.hidden_sizes = hidden_sizes
        self.prev_concat_size = sum(hidden_sizes)  # width of the concatenated previous-tick hidden state
        
        # Current-tick linear layers U[i]
        self.U = nn.ModuleList()
        for i in range(self.num_layers):
            in_size = input_size if i == 0 else sum(hidden_sizes[:i])
            self.U.append(nn.Linear(in_size, hidden_sizes[i]))
        
        # Previous-tick linear layers W[i]
        self.W = nn.ModuleList([
            nn.Linear(self.prev_concat_size, hidden_sizes[i])
            for i in range(self.num_layers)
        ])
        
        self.out = nn.Linear(self.prev_concat_size, output_size)
        self.activation = torch.tanh
        
    def forward(self, x, prev_outputs=None):
        # prev_outputs: list of hidden activations from the previous tick, or None on the first tick
        layer_outputs = []
        prev_cat = torch.cat(prev_outputs, dim=1) if prev_outputs is not None else None
        
        for i in range(self.num_layers):
            # Dense within-tick input: raw x for the first layer, concat of all earlier layers otherwise
            current_input = x if i == 0 else torch.cat(layer_outputs, dim=1)
            out = self.U[i](current_input)
            # Recurrent contribution from every hidden layer of the previous tick
            if prev_cat is not None:
                out = out + self.W[i](prev_cat)
            out = self.activation(out)
            layer_outputs.append(out)
        
        # Output head reads the concatenation of all current-tick hidden layers
        final_cat = torch.cat(layer_outputs, dim=1)
        return layer_outputs, torch.sigmoid(self.out(final_cat))
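
The class above defines a single tick; the multi-tick unrolling that produces the accumulation behavior is not shown. The snippet below is a minimal sketch of how it could be driven on XOR, reusing TemporalDenseNet with the 3 ticks and 8×8×8 hidden layers reported above; the Adam optimizer and the 0.01 learning rate are assumptions, not values taken from the results.

import torch
import torch.nn as nn

# Standard XOR truth table
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

model = TemporalDenseNet(input_size=2, hidden_sizes=[8, 8, 8], output_size=1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # assumed optimizer and learning rate
criterion = nn.MSELoss()
TICKS = 3  # iterative steps per forward pass

def run_ticks(model, x, ticks=TICKS):
    # Unroll the network: each tick receives the previous tick's hidden activations.
    hidden, pred = None, None
    for _ in range(ticks):
        hidden, pred = model(x, hidden)
    return pred

for epoch in range(2000):  # train up to the final epoch shown in the loss table
    optimizer.zero_grad()
    loss = criterion(run_ticks(model, X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    print(run_ticks(model, X).round().squeeze(1))  # should round to [0., 1., 1., 0.]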