AI Now Lives in Time: Temporal DenseNet

Independent Research

Abstract

This network is a fully connected “temporal” architecture in which each neuron receives input not only from the neurons in earlier layers of the current computation step, but also from all hidden neurons of the previous computation step (tick).

By spreading information across layers and time, the architecture makes it difficult for any single neuron to memorize an input-output pair directly. Instead, the network learns patterns in a distributed way, naturally favoring generalization over memorization. This makes it useful for tasks where datasets are small or overfitting is a risk, because the architecture itself discourages simple lookup-table memorization and encourages learning the underlying rules.

Temporal Accumulation Results

The model reaches very low training error on the XOR dataset by using time ticks to accumulate state across iterative computation steps.

  • Final Loss: 0.000002
  • Ticks: 3 iterative steps
  • Hidden layers: 3 layers of 8 neurons each (8×8×8)
Epoch    MSE Loss
200      0.000091
1000     0.000008
2000     0.000002

Final Prediction Accuracy

The network achieves near-perfect separation for non-linear logic:

Raw Predictions:
[[7.0460670e-04]
 [9.9793684e-01]
 [9.9922490e-01]
 [2.1156450e-03]]

Rounded: [0, 1, 1, 0]

PyTorch Implementation

The architecture uses nn.ModuleList to manage the current-tick layers ($U$) and the previous-tick recurrence layers ($W$).
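
In symbols, writing $h_i^{(t)}$ for the activations of hidden layer $i$ at tick $t$ and $[\,\cdot\,;\,\cdot\,]$ for concatenation (notation introduced here for clarity, not taken from the code), the forward pass below computes

$$h_i^{(t)} = \tanh\!\left(U_i\big[h_1^{(t)};\dots;h_{i-1}^{(t)}\big] + W_i\big[h_1^{(t-1)};\dots;h_L^{(t-1)}\big]\right),$$

where the first layer takes the raw input $x$ in place of earlier-layer activations, the $W_i$ term is dropped on the first tick, and the output is a sigmoid readout of the concatenated current-tick state $[h_1^{(t)};\dots;h_L^{(t)}]$.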


import torch
import torch.nn as nn

class TemporalDenseNet(nn.Module):
    def __init__(self, input_size, hidden_sizes, output_size):
        super().__init__()
        self.num_layers = len(hidden_sizes)
        self.hidden_sizes = hidden_sizes
        self.prev_concat_size = sum(hidden_sizes)  # width of the concatenated previous-tick hidden state
        
        # Current-tick linear layers U[i]
        self.U = nn.ModuleList()
        for i in range(self.num_layers):
            in_size = input_size if i == 0 else sum(hidden_sizes[:i])
            self.U.append(nn.Linear(in_size, hidden_sizes[i]))
        
        # Previous-tick linear layers W[i]
        self.W = nn.ModuleList([
            nn.Linear(self.prev_concat_size, hidden_sizes[i])
            for i in range(self.num_layers)
        ])
        
        self.out = nn.Linear(self.prev_concat_size, output_size)
        self.activation = torch.tanh
        
    def forward(self, x, prev_outputs=None):
        # prev_outputs: list of hidden activations from the previous tick, or None on the first tick
        layer_outputs = []
        prev_cat = torch.cat(prev_outputs, dim=1) if prev_outputs is not None else None
        
        for i in range(self.num_layers):
            # Dense within-tick input: raw x for the first layer, concat of all earlier layers otherwise
            current_input = x if i == 0 else torch.cat(layer_outputs, dim=1)
            out = self.U[i](current_input)
            # Recurrent contribution from every hidden layer of the previous tick
            if prev_cat is not None:
                out = out + self.W[i](prev_cat)
            out = self.activation(out)
            layer_outputs.append(out)
        
        # Output head reads the concatenation of all current-tick hidden layers
        final_cat = torch.cat(layer_outputs, dim=1)
        return layer_outputs, torch.sigmoid(self.out(final_cat))
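
The class above defines a single tick; the multi-tick unrolling that produces the accumulation behavior is not shown. The snippet below is a minimal sketch of how it could be driven on XOR, reusing TemporalDenseNet with the 3 ticks and 8×8×8 hidden layers reported above; the Adam optimizer and the 0.01 learning rate are assumptions, not values taken from the results.

import torch
import torch.nn as nn

# Standard XOR truth table
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

model = TemporalDenseNet(input_size=2, hidden_sizes=[8, 8, 8], output_size=1)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)  # assumed optimizer and learning rate
criterion = nn.MSELoss()
TICKS = 3  # iterative steps per forward pass

def run_ticks(model, x, ticks=TICKS):
    # Unroll the network: each tick receives the previous tick's hidden activations.
    hidden, pred = None, None
    for _ in range(ticks):
        hidden, pred = model(x, hidden)
    return pred

for epoch in range(2000):  # train up to the final epoch shown in the loss table
    optimizer.zero_grad()
    loss = criterion(run_ticks(model, X), y)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    print(run_ticks(model, X).round().squeeze(1))  # should round to [0., 1., 1., 0.]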