Relation-Guided Versatile Regularization for Federated Semi-Supervised Learning

Academic Background and Problem Statement

With the increasing prominence of data privacy issues, Federated Learning (FL) has emerged as a decentralized machine learning paradigm, allowing multiple clients to collaboratively train a global model without sharing data, thereby protecting data privacy. However, existing FL methods typically assume that each client’s data is fully labeled, which is often unrealistic in practical applications, especially when labeling capabilities are limited. To address this issue, Federated Semi-Supervised Learning (FSSL) has been proposed. FSSL aims to leverage a large amount of unlabeled data for knowledge extraction, thereby enhancing model performance while preserving privacy.

However, existing FSSL methods rely primarily on data augmentation to enforce consistency between local and global models, which leads to biased classifiers and poor performance when the distribution of unlabeled client data is skewed. To tackle these problems, this paper proposes a novel FSSL framework, Relation-Guided Versatile Regularization (FedRVR). The framework introduces versatile regularization on the client side and a relation-guided aggregation strategy on the server side, with the aim of improving both the effectiveness of local training and the robustness of the global model.

Paper Source and Author Information

This paper is co-authored by Qiushi Yang, Zhen Chen, Zhe Peng, and Yixuan Yuan, affiliated with the Department of Electrical Engineering at City University of Hong Kong, the Centre for Artificial Intelligence and Robotics (CAIR) in Hong Kong, the Department of Industrial and Systems Engineering at The Hong Kong Polytechnic University, and the Department of Electronic Engineering at The Chinese University of Hong Kong, respectively. The paper was accepted on December 10, 2024, and published in the International Journal of Computer Vision.

Research Process and Experimental Design

1. Overview of the Research Process

The FedRVR framework consists of two core components: Versatile Regularization and Relation-Guided Aggregation Strategy. Versatile regularization at the client side introduces two extreme global models (one with superior ability and one with inferior ability) to provide richer regularization constraints, thereby enhancing the training effectiveness of local models. The relation-guided aggregation strategy at the server side uses a model relation predictor to capture relationships between client models and aggregates models based on these relationships to generate more robust global models.

2. Versatile Regularization

At the unlabeled client side, FedRVR enhances local model training through versatile regularization. Specifically, versatile regularization includes data-guided regularization and model-guided regularization.

  • Data-Guided Regularization: Through data augmentation techniques, FedRVR uses pseudo-labels generated by the global model to guide the training of the local model. Specifically, the global model predicts weakly augmented data to generate pseudo-labels, while the local model predicts strongly augmented data and is required to align with these pseudo-labels.

  • Model-Guided Regularization: FedRVR introduces a global model with inferior ability and uses the features it generates to regularize local training. Specifically, the features produced by the inferior global model are fed into the local classifier, and the resulting predictions are required to align with the local model's own predictions.

Through these two types of regularization, FedRVR provides richer regularization constraints at the client side, thereby improving the training effectiveness of local models.
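The two regularizers described above can be sketched as loss functions. This is a minimal NumPy illustration, not the authors' implementation: the confidence threshold, the cross-entropy and KL-divergence loss forms, the linear classifier (W, b), and all function names are assumptions made for this example.

```python
import numpy as np

def softmax(logits):
    # Numerically stable row-wise softmax.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def data_guided_loss(global_logits_weak, local_logits_strong, threshold=0.95):
    """Cross-entropy between the local model's predictions on strongly
    augmented inputs and pseudo-labels from the global model's predictions
    on weakly augmented inputs. The threshold is an assumption (not stated
    in the text): low-confidence pseudo-labels are masked out."""
    probs = softmax(global_logits_weak)
    pseudo = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= threshold
    log_p = np.log(softmax(local_logits_strong) + 1e-12)
    ce = -log_p[np.arange(len(pseudo)), pseudo]
    return float((ce * mask).sum() / max(mask.sum(), 1))

def model_guided_loss(inferior_feats, local_feats, W, b):
    """KL divergence between the local classifier's predictions on the local
    model's features (used as the target) and its predictions on features
    from the inferior global model. The classifier is assumed to be linear,
    logits = feats @ W + b, purely for illustration."""
    p_local = softmax(local_feats @ W + b)
    p_inferior = softmax(inferior_feats @ W + b)
    kl = (p_local * (np.log(p_local + 1e-12) - np.log(p_inferior + 1e-12))).sum(axis=1)
    return float(kl.mean())
```

In an actual training loop these terms would be computed in an autodiff framework and combined with a weighting that the summary above does not specify.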

3. Relation-Guided Aggregation Strategy

At the server side, FedRVR generates robust global models through a relation-guided aggregation strategy. Specifically, the server captures relationships between client models using a model relation predictor and aggregates models based on these relationships.

  • Model Relation Predictor: The server uses a parametric relation predictor to capture pairwise relationships between client models and generates a model ranking. Based on this ranking, the server can generate a superior global model and an inferior global model. The superior global model is used to enhance global training, while the inferior global model is used to strengthen regularization for local models.

Through this relation-guided aggregation strategy, FedRVR can generate more robust global models at the server side, thereby improving the effectiveness of global training.
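The rank-then-aggregate idea can be sketched as follows. This is a minimal illustration rather than the authors' method: the learned relation predictor is replaced by a precomputed matrix `pairwise`, and the row-sum ranking rule, the top-k / bottom-k averaging, and all function names are assumptions made for this example.

```python
import numpy as np

def rank_clients(pairwise):
    """Return client indices ordered from best to worst.

    pairwise[i, j] is assumed to be a score in [0, 1] expressing how likely
    client i's model is better than client j's (a stand-in for the output
    of a relation predictor). Diagonal entries are ignored."""
    scores = pairwise.sum(axis=1) - np.diag(pairwise)
    return np.argsort(-scores)

def aggregate(client_weights, indices):
    # Plain unweighted average of the selected clients' parameter vectors.
    return np.mean([client_weights[i] for i in indices], axis=0)

def superior_and_inferior(client_weights, pairwise, k):
    """Build a 'superior' model from the k top-ranked clients and an
    'inferior' model from the k bottom-ranked clients."""
    order = rank_clients(pairwise)
    superior = aggregate(client_weights, order[:k])
    inferior = aggregate(client_weights, order[-k:])
    return superior, inferior
```

Each entry of `client_weights` stands for one client's flattened parameters. The summary does not state how many clients contribute to each aggregate or how they are weighted, so `k` and the uniform average are placeholders.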

Experimental Results and Analysis

1. Experimental Setup

FedRVR was validated on three FSSL benchmark datasets: CIFAR-10, CIFAR-100, and ISIC-2018. The experiments covered two FSSL settings, labeled-unlabeled clients and partially labeled clients, each under both IID and non-IID data distributions.
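For readers unfamiliar with the IID / non-IID distinction, the sketch below shows one common way to simulate label-skewed (non-IID) clients: a Dirichlet partition over class labels. The text does not say how the paper builds its non-IID splits, so this scheme, the `alpha` parameter, and the function name are assumptions for illustration only.

```python
import numpy as np

def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with label skew.

    For each class, client shares are drawn from Dirichlet(alpha). A small
    alpha gives strongly skewed (non-IID) clients; a large alpha gives
    roughly uniform, IID-like clients."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for c in np.unique(labels):
        # Shuffle this class's samples, then cut them by the sampled shares.
        idx = rng.permutation(np.flatnonzero(labels == c))
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return client_indices
```

Every sample index is assigned to exactly one client, so the partition can be used directly to build per-client datasets.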

2. Main Results

The experimental results show that FedRVR outperforms existing FSSL methods across the tested federated learning settings. In the labeled-unlabeled client setting, FedRVR improved average accuracy over the second-best method by 1.21%, 1.67%, and 1.62% on CIFAR-10, CIFAR-100, and ISIC-2018, respectively. In the partially labeled client setting, FedRVR also outperformed the compared methods.

3. Ablation Studies

To verify the effectiveness of each component of FedRVR, ablation studies were conducted. The results show that both the relation-guided aggregation strategy and versatile regularization significantly contribute to improving model performance. Specifically, the relation-guided aggregation strategy improved accuracy by 1.26% and 0.88% in IID and non-IID settings, respectively, while versatile regularization improved accuracy by 2.06% and 1.31%, respectively.

Conclusion and Significance

The FedRVR framework proposed in this paper improves federated semi-supervised learning through versatile regularization and a relation-guided aggregation strategy. On the client side, it introduces two extreme global models to provide richer regularization constraints for local training. On the server side, it uses a model relation predictor to rank client models and generate more robust global models. The reported experiments show that FedRVR outperforms existing FSSL methods across the tested federated learning settings, which supports its practical potential.

Research Highlights

  1. Versatile Regularization: FedRVR is the first to introduce both data-guided and model-guided regularization at the client side, significantly improving local model training.
  2. Relation-Guided Aggregation Strategy: FedRVR uses a model relation predictor to capture relationships between client models and aggregates models based on these relationships, generating more robust global models.
  3. Extensive Experimental Validation: FedRVR was validated on three FSSL benchmark datasets (CIFAR-10, CIFAR-100, and ISIC-2018), where it outperformed existing FSSL methods in both the labeled-unlabeled and partially labeled client settings.

Other Valuable Information

The source code for FedRVR will be publicly released to facilitate replication and improvement by other researchers. Additionally, FedRVR can be extended to standard federated learning, further broadening its application scope.