Neural Mechanisms of Relational Learning and Fast Knowledge Reassembly in Plastic Neural Networks

Background

Humans and animals possess a remarkable ability to learn relationships between experienced items (such as stimuli, objects, and events), enabling structured generalization and rapid assimilation of new information. A fundamental form of such relational learning is order learning, which supports transitive inference (e.g., if a > b and b > c, then a > c) and list linking (e.g., separately learned lists a > b > c and d > e > f can be rapidly "reassembled" into a > b > c > d > e > f upon learning c > d). Despite longstanding research, the neurobiological mechanisms underlying transitive inference and rapid knowledge reassembly remain elusive. This paper demonstrates how neural networks endowed with neuromodulated synaptic plasticity (enabling self-directed learning) and identified through artificial metalearning (learning-to-learn) can perform both transitive inference and list linking, while also expressing behavioral patterns widely observed in humans and animals.

Source of the Paper

This paper was authored by Thomas Miconi and Kenneth Kay, affiliated with ML Collective (San Francisco, USA) and Columbia University (New York, USA), respectively. It was published in the journal Nature Neuroscience in February 2025, with the DOI 10.1038/s41593-024-01852-8.

Research Process

1. Task and Model Design

The study first designed a task paradigm based on classic transitive-inference and list-linking experiments. The task was organized into multiple episodes, each containing several trials; in each episode, the network had to learn the order of a completely new set of random stimuli. The stimuli were high-dimensional binary vectors, regenerated for each episode. Each episode consisted of 30 trials: the first 20 presented only adjacent pairs (e.g., ab, bc), while the last 10 presented all possible pairs (excluding identical pairs such as aa or bb), thereby probing transitive inference on pairs never shown during the first phase. The episode structure is sketched below.
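As a concrete illustration, the following sketch generates one such episode. The function name make_episode, the stimulus dimensionality, and the sampling details are illustrative assumptions, not the paper's exact implementation (presentation-side randomization, for instance, is omitted for brevity).

```python
import numpy as np

def make_episode(n_items=7, dim=30, n_adjacent=20, n_test=10, rng=None):
    """Generate one episode: fresh random binary stimuli plus a trial schedule.

    An item's rank is its index (index 0 is the highest-ranked item). Item
    count and dimensionality are illustrative, not the paper's exact values.
    """
    rng = rng or np.random.default_rng()
    # New random binary stimulus vectors are drawn for every episode.
    stimuli = rng.integers(0, 2, size=(n_items, dim)).astype(float)

    # Phase 1: only adjacent pairs (ab, bc, ...), sampled with replacement.
    adjacent = [(i, i + 1) for i in range(n_items - 1)]
    train = [adjacent[rng.integers(len(adjacent))] for _ in range(n_adjacent)]

    # Phase 2: all non-identical pairs, probing transitive inference on
    # pairs (e.g., b vs e) that were never shown during phase 1.
    all_pairs = [(i, j) for i in range(n_items) for j in range(n_items) if i != j]
    test = [all_pairs[rng.integers(len(all_pairs))] for _ in range(n_test)]
    return stimuli, train + test
```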

2. Network Structure and Metatraining

A recurrent neural network (RNN) was used, featuring synaptic plasticity under self-controlled neuromodulation. At each time step, the network's input comprised the current stimuli, the reward signal, and the response from the previous time step; its output was a probability distribution over the two possible responses. At the start of each episode, the network's activations and Hebbian plasticity traces were reset, while the metatrained synaptic weights carried over unchanged. One time step of such a network is sketched below.
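A minimal sketch of one time step, assuming the common "fixed base weight plus gated plastic component" formulation used in differentiable-plasticity work; the names rnn_step, w_in, w_base, alpha, and w_out are illustrative, not the paper's code.

```python
import torch
import torch.nn.functional as F

def rnn_step(h, plastic, stim, prev_resp, prev_reward, params):
    """One time step of the plastic RNN (tensor layout is an assumption).

    h:        hidden state, shape (n_hidden,)
    plastic:  within-episode plastic component of the recurrent weights
    params:   metatrained structural parameters (see plasticity sketch below)
    """
    w_in, w_base, alpha, w_out = params
    # Input = current stimulus + previous response (one-hot) + reward signal.
    x = torch.cat([stim, F.one_hot(prev_resp, 2).float(), prev_reward])
    w_eff = w_base + alpha * plastic          # effective recurrent weights
    h_new = torch.tanh(w_in @ x + w_eff @ h)
    probs = F.softmax(w_out @ h_new, dim=-1)  # distribution over the 2 responses
    return h_new, probs
```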

3. Synaptic Plasticity

The recurrent connections in the network were endowed with adjustable Hebbian plasticity. Each connection maintained a Hebbian eligibility trace, a decaying running average of the product of its output and input activities. The network also generated its own neuromodulatory signal, m(t), which gated the transformation of the Hebbian trace into actual synaptic weight changes, as sketched below.
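Continuing the sketch above, a hedged version of the trace-and-gate rule might read as follows; the trace constant eta and the exact functional form are assumptions, since plasticity rules of this family vary in detail.

```python
import torch

def plasticity_update(hebb, plastic, h_pre, h_post, m, eta=0.03):
    """Update the Hebbian trace, then gate it into the plastic weights.

    hebb: decaying running average of outer(post, pre) activity products.
    m:    neuromodulatory signal emitted by the network itself; it decides
          when (and with what sign) the trace becomes a lasting weight change.
    """
    hebb = (1 - eta) * hebb + eta * torch.outer(h_post, h_pre)
    plastic = plastic + m * hebb
    return hebb, plastic
```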

4. Metatraining Process

The goal of metatraining was to produce a network that could autonomously learn arbitrary new orders within an episode. After each episode, gradient descent was applied to the network's structural parameters (e.g., base weights and plasticity parameters) so as to improve within-episode, plasticity-based learning; the objective was the total reward obtained over the entire episode. One way to implement this outer loop is sketched below.
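Since responses are sampled from the network's output distribution, a policy-gradient (REINFORCE-style) surrogate is one standard way to turn total episode reward into a differentiable loss. The outer loop below is a sketch under that assumption; run_episode is a hypothetical helper that rolls out one full episode and returns per-trial log-probabilities of the chosen responses together with their rewards.

```python
import torch

def metatrain(params, run_episode, n_episodes=20000, lr=1e-4):
    """Outer-loop metatraining sketch (REINFORCE-style surrogate assumed).

    Gradients flow only into the structural parameters; all within-episode
    learning is carried by the plasticity rule, not by gradient descent.
    """
    opt = torch.optim.Adam(params, lr=lr)
    baseline = 0.0
    for _ in range(n_episodes):
        logps, rewards = run_episode(params)       # one inner-loop episode
        total = rewards.sum()
        loss = -logps.sum() * (total - baseline)   # maximize episode reward
        opt.zero_grad()
        loss.backward()
        opt.step()
        baseline = 0.9 * baseline + 0.1 * total.item()  # running-mean baseline
```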

Key Findings

1. Behavioral Patterns in Transitive Inference

The study first evaluated the behavior of successfully metatrained networks. In test trials, the networks exhibited classic behavioral signatures of order learning: the symbolic distance effect (accuracy increases with the distance between the two items' ranks in the list) and the end-anchor effect (accuracy is higher for pairs containing the first or last item of the list). Both patterns are widely reported in human and animal experiments; a simple way to quantify the first is sketched below.
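As an illustration, a small helper like the following (names hypothetical, not the paper's analysis code) groups test-trial accuracy by rank distance.

```python
import numpy as np

def accuracy_by_distance(pairs, correct):
    """Mean test accuracy for each symbolic distance |rank_i - rank_j|.

    pairs:   (n_trials, 2) array of item ranks shown on each test trial
    correct: boolean array, True where the higher-ranked item was chosen
    """
    dists = np.abs(np.diff(np.asarray(pairs), axis=1)).ravel()
    correct = np.asarray(correct)
    # A curve that rises with distance is the symbolic distance effect.
    return {int(d): correct[dists == d].mean() for d in np.unique(dists)}
```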

2. List-Linking Ability

The network also demonstrated rapid linking of separately learned lists. After learning two sublists (e.g., a > b > c > d and e > f > g > h), the network could infer relations spanning the full joint list (e.g., b > f) immediately after being shown only the single linking pair d > e. This indicates that the network rapidly reassembled existing knowledge rather than relearning the joint order from scratch. The trial structure of such an episode is sketched below.
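The phase structure of a list-linking episode can be sketched as follows; trial counts, repetitions, and presentation order are omitted, and the exact protocol in the paper may differ.

```python
def linking_schedule(n=4):
    """Trial phases for a list-linking episode with two n-item sublists.

    Phase 1: adjacent pairs within each sublist (a>b, b>c, ... and e>f, ...).
    Phase 2: the single linking pair (d > e).
    Phase 3: never-trained cross-list probes such as (b, f).
    """
    first, second = list(range(n)), list(range(n, 2 * n))
    within = [(i, i + 1) for i in first[:-1] + second[:-1]]
    link = (n - 1, n)
    probes = [(i, j) for i in first for j in second if (i, j) != link]
    return within, link, probes
```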

3. Analysis of Neural Mechanisms

Using principal component analysis (PCA), the study found that the first principal component of the network's activity was strongly aligned with the output weight vector. Further analysis revealed that the network encoded each stimulus's rank in the degree to which that stimulus's representation aligned with this output axis: where a stimulus's representation fell along the output weight vector signaled its position in the learned order. A simple check of this alignment is sketched below.
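A straightforward way to quantify this alignment (one possible analysis, not necessarily the paper's exact pipeline) is the cosine similarity between the first principal component of the recorded activity and the output weight axis.

```python
import numpy as np

def pc1_output_alignment(activity, w_out):
    """Cosine similarity between PC1 of hidden activity and the output axis.

    activity: (n_timesteps, n_units) matrix of recorded hidden states
    w_out:    output weight vector over the same units
    """
    centered = activity - activity.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    pc1 = vt[0]                     # top right singular vector = PC1 loadings
    cos = pc1 @ w_out / (np.linalg.norm(pc1) * np.linalg.norm(w_out))
    return abs(cos)                 # PC sign is arbitrary, so take magnitude
```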

4. Representation Learning and Reinstatement

The study also found that, during the delay periods within trials, the network reinstated representations of stimuli from earlier trials in a recoded form. This reinstatement allowed the network to modify the stored representations of previously presented stimuli well after they were shown, which is precisely what rapid knowledge reassembly (such as list linking) requires. One way to probe for such reinstatement is sketched below.
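One simple probe for reinstatement (a template-matching heuristic, not necessarily the paper's decoding method) is to project delay-period activity onto each stored stimulus's representation.

```python
import numpy as np

def reinstatement_scores(delay_activity, templates):
    """Alignment of delay-period activity with each stimulus template.

    delay_activity: (n_units,) hidden state during the post-trial delay
    templates:      (n_stimuli, n_units) reference representation per stimulus
    A high score for a stimulus absent from the current trial suggests that
    its representation is being reinstated.
    """
    t = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    a = delay_activity / np.linalg.norm(delay_activity)
    return t @ a
```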

Conclusion

By metatraining neural networks endowed with synaptic plasticity and neuromodulation, this study successfully achieved autonomous learning and knowledge reassembly in a classic transitive inference task. The study revealed that the network performed model-based learning by reinstating representations of previous stimuli, a mechanism reminiscent of memory replay observed in humans and animals. This research not only sheds light on the neural mechanisms of relational learning but also provides new insights for future cognitive models.

Highlights of the Study

  1. Transitive Inference and List Linking: The network successfully performed transitive inference and list linking, reproducing classic behavioral patterns observed in humans and animals.
  2. Reinstatement Mechanism: The network achieved knowledge reassembly by reinstating representations of previous stimuli, a mechanism similar to memory replay.
  3. Metalearning Approach: Metatraining optimized the network's synaptic plasticity parameters so that it could learn new orders autonomously within each episode, offering a new tool for building cognitive models.

Significance and Value

This study not only reveals candidate neural mechanisms of relational learning but also opens new directions for cognitive modeling. By metatraining neural networks, it demonstrates how autonomously optimized synaptic plasticity can accomplish complex cognitive tasks, offering a fresh perspective on learning mechanisms in humans and animals and theoretical support for advances in artificial intelligence.