Learning Neural Network Classifiers by Distributing Nearest Neighbors on Adaptive Hypersphere

Adaptive Hypersphere Neural Network Classifier: Overview of ASNN Research

Introduction and Research Background

In recent years, with the development of artificial intelligence and deep learning, neural networks (NNs) have been widely applied to classification tasks. The essence of these tasks lies in establishing decision boundaries through neural networks so that samples are assigned to their respective classes. However, in traditional neural network classification methods, limitations in embedding space scalability and inefficient positive/negative (P/N) pairing strategies have been major obstacles to further performance improvement. Specifically, existing pair-wise constraint-based (PWCB) methods primarily rely on contrastive loss functions (such as triplet loss or contrastive loss) and fixed embedding spaces to guide neural networks toward learning discriminative features of samples. Nevertheless, these methods face the following challenges:

  1. Fixed Embedding Space Limitation: Fixed-scale Euclidean or hypersphere embedding spaces struggle to adapt to the distribution requirements of samples across different problems. This leads to optimization difficulties and challenges in distinguishing between sample classes.
  2. Inefficient Positive/Negative Pairing Strategies: In large-scale datasets, selecting appropriate positive/negative sample pairs is extremely challenging. Inaccurate pairing choices can lead to premature convergence or suboptimal local solutions, subsequently affecting the learned discriminative features.

To address these challenges, a research team from the University of Jinan and Quan Cheng Laboratory proposed a method named “Adaptive Hypersphere Nearest Neighbor” (ASNN). The primary authors are Xiaojing Zhang, Shuangrong Liu, Lin Wang, among others, and this research was published in IEEE Transactions on Artificial Intelligence, Volume 6, Issue 1, in 2025. This study seeks to tackle the issues of embedding space scalability and inefficient P/N pairing by introducing a scale-adaptive hypersphere embedding space and a neighborhood-based probability loss (NPL). These innovations enable significant enhancements in the generalization performance of neural network classifiers.

Research Methodology and Workflow

Overview of Research Workflow

The ASNN research workflow includes the following steps:

  1. Design a scale-adaptive hypersphere embedding space to address embedding space scalability deficiencies.
  2. Develop a nearest-neighbor-based pairing strategy for dynamic selection of sample pairs.
  3. Construct a neighborhood-based probability loss (NPL) function to optimize neural network discriminative capabilities.
  4. Validate the proposed method on a variety of datasets, including 29 UCI machine learning datasets and 3 image recognition datasets.

Detailed Research Steps

1. Scale-Adaptive Hypersphere Embedding Space

A novel embedding space design was proposed, using a learnable scale factor (η) to dynamically adjust the boundaries of the embedding space:

$$ f^*(x) = \eta \cdot \frac{\langle w, \pi(x;\theta) \rangle}{\lVert w \rVert_2 \cdot \lVert \pi(x;\theta) \rVert_2} $$

Here, $f^*(x)$ represents the embedding point of sample $x$, $\pi(x;\theta)$ is the feature vector produced by the network with parameters $\theta$, $w$ is the weight vector of the fully connected output layer, and the fraction is their cosine similarity. The scale factor $\eta$ is optimized through gradient descent to adaptively adjust the size of the embedding space. This design allows the embedding space to self-explore an appropriate scale based on the sample distribution, yielding intraclass compactness and clear interclass separability.
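As an illustration, the sketch below implements such an embedding head in PyTorch: a learnable scale parameter multiplies the cosine similarity between the normalized weights of a fully connected layer and the normalized backbone features. The class name, layer sizes, and the initial value of $\eta$ are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveHypersphereHead(nn.Module):
    """Minimal sketch of a scale-adaptive hypersphere embedding head (illustrative only)."""

    def __init__(self, in_dim: int, embed_dim: int, init_eta: float = 1.0):
        super().__init__()
        # Fully connected layer whose weight rows play the role of w in the formula above.
        self.fc = nn.Linear(in_dim, embed_dim, bias=False)
        # Learnable scale factor eta, updated by gradient descent together with theta.
        self.eta = nn.Parameter(torch.tensor(init_eta))

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between each weight row and the feature vector pi(x; theta),
        # scaled by eta so the embedding lies on a hypersphere of adaptive radius.
        cos = F.linear(F.normalize(features, dim=1), F.normalize(self.fc.weight, dim=1))
        return self.eta * cos
```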

2. Nearest-Neighbor-Based Pairing Strategy

To improve the efficiency of P/N pairing, the study proposed a nearest-neighbor-based pairing strategy:

  - In each training iteration, compute the distance matrix over the mini-batch samples and determine the positive/negative neighbor sets for each anchor point.
  - The strategy dynamically adjusts the P/N pairing ratio based on the local sample distribution rather than relying on a fixed number of pairs, improving the accuracy with which the embedded sample distribution is evaluated.
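A minimal sketch of this pairing step is given below, assuming PyTorch tensors and a fixed neighborhood size `k` purely for illustration (the paper derives the neighbor sets from the local neighborhood rather than a hand-tuned pair count); the function name is hypothetical.

```python
import torch

def nearest_neighbor_pairs(embeddings: torch.Tensor, labels: torch.Tensor, k: int = 5):
    """Illustrative nearest-neighbor pairing for one mini-batch (hypothetical helper)."""
    # Pairwise Euclidean distance matrix over the mini-batch embeddings.
    dist = torch.cdist(embeddings, embeddings)
    same = labels.unsqueeze(0) == labels.unsqueeze(1)                    # same-class mask
    self_mask = torch.eye(len(labels), dtype=torch.bool, device=labels.device)

    # Positive candidates: same class, excluding the anchor itself.
    pos_dist = dist.masked_fill(~same | self_mask, float("inf"))
    # Negative candidates: any point from a different class.
    neg_dist = dist.masked_fill(same, float("inf"))

    # k nearest positive / negative neighbors for every anchor (requires k < batch size).
    pos_idx = pos_dist.topk(k, largest=False).indices
    neg_idx = neg_dist.topk(k, largest=False).indices
    return pos_idx, neg_idx
```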

3. Neighborhood-Based Probability Loss (NPL)

The authors designed two variants of the neighborhood-based probability loss (NPL): Partial-NPL and Global-NPL. Taking Partial-NPL as an example, the loss function is defined as:

$$ \mathcal{L} = - \frac{1}{m} \sum_{i=1}^{m} \Big[ \lambda \sum_{j \in P} \log \hat{p}_{ij} + (1-\lambda) \sum_{k \in N} \log \big(1 - \hat{p}_{ik}\big) \Big] $$

Here, $\hat{p}_{ij}$ and $\hat{p}_{ik}$ represent the probabilities of positive and negative neighbors for an anchor point:

$$ \hat{p}_{ij} = \frac{\exp\!\big(-d(x_a^i, x_p^j)/2\big)}{\sum_{j' \in P} \exp\!\big(-d(x_a^i, x_p^{j'})/2\big) + \sum_{k \in N} \exp\!\big(-d(x_a^i, x_n^k)/2\big)} $$

By comprehensively considering the spatial relationships between anchor points and their neighbors, NPL aims to maximize intraclass similarity while maximizing interclass separation, thereby strengthening the discriminative power of the features learned by the neural network.
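Putting the two formulas together, the sketch below computes the Partial-NPL term for a single anchor in PyTorch, assuming $d$ is the squared Euclidean distance; the per-anchor terms would then be averaged over the $m$ anchors of the mini-batch (the $1/m$ factor), with the neighbor sets coming from the pairing step above. The function name and the default value of $\lambda$ are illustrative assumptions.

```python
import torch

def partial_npl_anchor_loss(anchor: torch.Tensor, pos: torch.Tensor,
                            neg: torch.Tensor, lam: float = 0.5) -> torch.Tensor:
    """Illustrative Partial-NPL term for one anchor; average over the mini-batch to get L."""
    # Squared Euclidean distances d(x_a, x_p^j) and d(x_a, x_n^k) (assumed form of d).
    d_pos = (anchor - pos).pow(2).sum(dim=1)
    d_neg = (anchor - neg).pow(2).sum(dim=1)

    # log p over the joint neighbor set, i.e. exp(-d/2) normalized as in the formula above.
    logits = torch.cat([-d_pos / 2, -d_neg / 2])
    log_p = logits - torch.logsumexp(logits, dim=0)

    log_p_pos = log_p[: d_pos.numel()]           # log p_ij for positive neighbors
    p_neg = log_p[d_pos.numel():].exp()          # p_ik for negative neighbors

    # - [ lambda * sum_j log p_ij + (1 - lambda) * sum_k log(1 - p_ik) ]
    return -(lam * log_p_pos.sum() + (1 - lam) * torch.log1p(-p_neg).sum())
```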

Datasets and Experimental Settings

The study used 29 UCI datasets (e.g., Iris, Wine, Car Evaluation) and 3 image datasets (MNIST, CIFAR-10, CIFAR-100) for experiments. By comparing ASNN with existing optimization methods (e.g., Triplet Loss, Contrastive Loss, Softmax + Cross-Entropy), the authors evaluated its classification accuracy (ACC) and average F1-score (AFS).

Results and Analysis

The experimental results demonstrate that ASNN outperforms its competitors on most datasets. In the UCI experiments, ASNN's G-NPL variant achieved the highest accuracy on 23 of the 29 datasets. In the image recognition experiments, ASNN showed markedly lower test error rates than the other methods; for example, on CIFAR-100 it achieved a test error rate of 26.32%, a substantial improvement over the 42.20% of Triplet Loss.

ASNN exhibits the capability to adaptively adjust embedding space scales and dynamically select P/N pairs based on local neighbor distributions, thus significantly enhancing optimization efficiency and overall performance. Notably, ASNN showed exceptional performance on highly imbalanced datasets, such as Covertype and Poker Hand, indicating its effectiveness in addressing sample imbalance problems.

Conclusions and Implications

ASNN introduces an innovative optimization framework for neural networks by incorporating a scale-adaptive hypersphere embedding space and a neighborhood-based probability loss function. The study not only provides methodological innovation but also offers a valuable reference for the design of deep learning frameworks. ASNN's strong performance across diverse datasets highlights its potential for widespread application in classification tasks and other neural network scenarios.

By addressing the challenges of fixed embedding space and inefficiencies in P/N pairing, ASNN brings a fresh perspective to the study of neural network optimization methods. Its theoretical contributions and practical implications are both substantial and far-reaching.