ASYCO: An Asymmetric Dual-Task Co-Training Model for Partial-Label Learning

Research on an Asymmetric Dual-Task Co-Training Model for Improving Partial Label Learning in Deep Learning

Research Background

[Figure: Architecture of ASYCO]

In the field of deep learning, supervised learning has become the core method for many artificial intelligence tasks. However, training deep neural networks often requires a massive amount of accurately labeled data, which is time-consuming and costly to collect. As an effective alternative, weakly supervised learning has garnered extensive attention in recent years. Partial Label Learning (PLL) is a typical challenge in weakly supervised learning, assuming that each training instance is labeled with a candidate label set that contains both the true label and several false labels. Due to the inherent ambiguity in candidate labels, PLL has become a challenging problem.

A key objective in PLL research is to disambiguate these labels and correctly identify the true label for each sample. Existing approaches include maximum margin algorithms, graph models, expectation-maximization algorithms, contrastive learning, and consistency regularization. However, these traditional methods are often limited to classical machine learning models and struggle to handle large-scale data effectively.

Recent research has demonstrated that deep models based on self-training are effective solutions for PLL. These methods iteratively estimate label confidences for each sample and optimize the model accordingly. Nonetheless, self-training models suffer from error accumulation, where misidentified labels further mislead the model and degrade performance. Although co-training strategies have been widely applied to noisy-label learning, most existing co-training methods adopt symmetric designs in which two networks with identical architectures are trained on the same task. This makes them prone to similar mistakes and reduces their ability to correct each other's errors.

In this context, researchers from Chongqing University, the Institute of Software at the Chinese Academy of Sciences, Zhejiang University, and Nanyang Technological University proposed a novel model called ASYCO, an asymmetric dual-task co-training model for PLL. This model aims to overcome the limitations of symmetric co-training and enhance PLL performance.

Paper Source

This research was published in the May 2025 issue (Vol. 68, Issue 5) of Science China Information Sciences under the title ASYCO: An Asymmetric Dual-Task Co-Training Model for Partial-Label Learning. The authors include Beibei Li, Yiyuan Zheng, Beihong Jin, Tao Xiang, Haobo Wang, and Lei Feng, representing Chongqing University, the Institute of Software at the Chinese Academy of Sciences, Zhejiang University, and Nanyang Technological University.

Research Workflow

a) Research Design and Procedure

The ASYCO model is built upon an asymmetric co-training framework comprising two networks that share the same backbone architecture but are trained on different tasks: a Disambiguation Network and an Auxiliary Network. The research process can be summarized in the following steps:

  1. Disambiguation Network Construction and Training:

    • The primary role of the Disambiguation Network is to resolve label ambiguity by learning confidence vectors to identify the true labels within the candidate label set.
    • It is trained with PLL-specific loss functions, including the Classifier-Consistent (CC) loss and the Risk-Consistent (RC) loss (a minimal sketch of both losses follows this list).
    • Data augmentation techniques, such as AutoAugment and Cutout, are applied to enrich the training samples, thereby improving the model’s generalization ability.
  2. Auxiliary Network Construction and Training:

    • The Auxiliary Network is trained on low-noise pairwise similarity labels derived from the pseudo class labels identified by the Disambiguation Network.
    • For any pair of samples, a similarity label of 1 is assigned if they share the same pseudo-label and 0 otherwise; these transformed labels are then used to train the Auxiliary Network in a supervised manner (see the pairwise-label sketch after this list).
  3. Error Correction Module Design:

    • The Auxiliary Network mitigates the issue of error accumulation in the Disambiguation Network using two strategies: Information Distillation and Confidence Refinement.
    • Specifically, the Auxiliary Network’s predicted distributions guide the Disambiguation Network through a KL-divergence constraint, while the Auxiliary Network’s confidence estimates dynamically refine the Disambiguation Network’s confidence scores (see the correction sketch after this list).
  4. Unified Model Training and Prediction:

    • Initially, the Disambiguation Network is trained independently; its parameters are then used to initialize the Auxiliary Network for joint co-training.
    • During inference, a single network (either the Disambiguation or Auxiliary Network) is used for predictions to minimize computational overhead.
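
To make the training objectives in step 1 concrete, the following sketch shows the standard formulations of the CC and RC losses from the PLL literature: the CC loss penalizes probability mass assigned outside the candidate set, while the RC loss is a confidence-weighted cross-entropy restricted to candidate labels, with confidences re-estimated from the model’s own predictions. Function and variable names are illustrative; the exact weighting and scheduling used by ASYCO may differ.

```python
import torch
import torch.nn.functional as F

def cc_loss(logits, candidate_mask):
    """Classifier-consistent loss: negative log of the total probability
    assigned to the candidate label set (standard CC formulation)."""
    probs = F.softmax(logits, dim=1)                      # (B, C)
    candidate_prob = (probs * candidate_mask).sum(dim=1)  # (B,)
    return -torch.log(candidate_prob + 1e-12).mean()

def rc_loss(logits, candidate_mask, confidence):
    """Risk-consistent loss: cross-entropy weighted by the current
    label-confidence vector, restricted to candidate labels."""
    log_probs = F.log_softmax(logits, dim=1)
    weighted = confidence * candidate_mask * log_probs
    return -(weighted.sum(dim=1)).mean()

@torch.no_grad()
def update_confidence(logits, candidate_mask):
    """Re-estimate confidences as the predicted probabilities
    renormalized over each sample's candidate set."""
    probs = F.softmax(logits, dim=1) * candidate_mask
    return probs / probs.sum(dim=1, keepdim=True).clamp_min(1e-12)
```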
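Step 2’s label transformation can be sketched as follows, assuming pseudo-labels are taken as the argmax of the Disambiguation Network’s confidence vectors and that the Auxiliary Network’s pairwise objective is a binary cross-entropy on the dot product of two samples’ predicted class posteriors; that similarity score is an illustrative assumption, not necessarily the paper’s exact objective.

```python
import torch
import torch.nn.functional as F

def pairwise_similarity_labels(pseudo_labels):
    """Similarity label is 1 for sample pairs sharing a pseudo-label,
    0 otherwise (computed for all pairs in the mini-batch)."""
    return (pseudo_labels.unsqueeze(0) == pseudo_labels.unsqueeze(1)).float()

def auxiliary_pairwise_loss(aux_logits, pseudo_labels):
    """Binary cross-entropy between pairwise similarity labels and a
    similarity score; here the score is the dot product of the two
    samples' predicted class posteriors (an illustrative choice)."""
    probs = F.softmax(aux_logits, dim=1)       # (B, C)
    sim_score = probs @ probs.t()              # (B, B), values in [0, 1]
    sim_label = pairwise_similarity_labels(pseudo_labels)
    return F.binary_cross_entropy(sim_score.clamp(1e-6, 1 - 1e-6), sim_label)
```

Because a pair is only mislabeled when exactly one of its two pseudo-labels is wrong in a way that breaks (or fakes) agreement, the pairwise labels carry a lower noise rate than the pseudo class labels themselves, which is the motivation stated in step 2.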
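The two error-correction mechanisms in step 3 can be sketched as a KL-divergence distillation term plus a confidence-refinement step. The temperature, the mixing coefficient alpha, and the direction of the KL term are assumptions chosen for illustration; in this sketch, the refined confidences would feed the Disambiguation Network’s RC loss in the next iteration.

```python
import torch
import torch.nn.functional as F

def distillation_loss(disamb_logits, aux_logits, temperature=1.0):
    """Information distillation: KL divergence pulling the Disambiguation
    Network's distribution toward the Auxiliary Network's prediction."""
    teacher = F.softmax(aux_logits.detach() / temperature, dim=1)
    student_log = F.log_softmax(disamb_logits / temperature, dim=1)
    return F.kl_div(student_log, teacher, reduction="batchmean")

@torch.no_grad()
def refine_confidence(confidence, aux_logits, candidate_mask, alpha=0.5):
    """Confidence refinement: blend the Auxiliary Network's probabilities
    (restricted to the candidate set) into the current confidence vector."""
    aux_probs = F.softmax(aux_logits, dim=1) * candidate_mask
    aux_probs = aux_probs / aux_probs.sum(dim=1, keepdim=True).clamp_min(1e-12)
    refined = (1 - alpha) * confidence + alpha * aux_probs
    return refined / refined.sum(dim=1, keepdim=True).clamp_min(1e-12)
```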

b) Innovative Techniques and Designs in the Research

The core innovation of ASYCO lies in its asymmetric co-training framework. Unlike symmetric co-training approaches, ASYCO forces the two networks to learn from different perspectives through distinct task designs. Specific innovations include:

  1. Label transformation in the Auxiliary Network: pseudo-labels are transformed into pairwise similarity labels, effectively reducing the noise rate in the training data.
  2. Error correction strategies: Information Distillation and Confidence Refinement establish bidirectional communication between the two networks to correct errors dynamically.
  3. Data augmentation and temperature parameter optimization: both techniques enrich sample variation and sharpen the model’s confidence estimates.

c) Dataset and Empirical Validation

The research team conducted extensive experiments on multiple public datasets, including SVHN, CIFAR-10, CIFAR-100, CNAE-9, and the real-world dataset Birdsong. They used two candidate-label generation processes, uniform and instance-dependent, to validate the model’s performance under varying levels of noise.
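
For context, the uniform generation process commonly used in PLL benchmarks adds every incorrect label to a sample’s candidate set independently with a flipping probability q, while always keeping the true label; the instance-dependent process instead assigns flip probabilities based on a model’s per-instance predictions. A minimal sketch of the uniform case (variable names are illustrative):

```python
import torch

def uniform_candidate_labels(true_labels, num_classes, q):
    """Uniform PLL generation: the true label is always in the candidate
    set; every other label is added independently with probability q."""
    n = true_labels.size(0)
    flips = torch.rand(n, num_classes) < q               # random false candidates
    candidate_mask = flips.float()
    candidate_mask[torch.arange(n), true_labels] = 1.0   # keep the true label
    return candidate_mask
```

Larger values of q produce larger candidate sets and therefore more ambiguous supervision, which is why the accuracy comparisons below are reported as q increases.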

Experimental Results and Key Findings

1. Performance Comparison

Results demonstrate that ASYCO outperforms all compared methods across all datasets. For example, on CIFAR-10, ASYCO’s accuracy improvements range from 0.361% to 1.694% as the noise level q increases.

2. Comparison Against Symmetric Co-Training Designs

A comparison with the symmetric co-training variant SyCo highlights the effectiveness of asymmetric design. On multiple datasets, ASYCO achieves accuracy improvements of 0.607%-0.955% compared to SyCo.

3. Effectiveness of Error Correction Strategies

Both Information Distillation and Confidence Refinement significantly enhance the model’s prediction accuracy. Excluding either strategy leads to noticeable performance drops.

4. Impact of Label Processing in the Auxiliary Network

Converting pseudo-labels into pairwise similarity labels reduces the noise rate in the auxiliary training data, enabling stable training of the Auxiliary Network.

Conclusions and Significance

The ASYCO model effectively addresses the problem of error accumulation in PLL by introducing an asymmetric dual-task co-training strategy. Its experimental results and theoretical findings fully demonstrate the efficacy of this novel design. Key contributions of the study include:

  • Significantly improved prediction accuracy for PLL tasks, especially under high-noise conditions.
  • A novel co-training framework that broadens research directions in PLL.
  • Substantial potential for both theoretical advancements and practical applications, such as image annotation and multimedia content analysis.

Although ASYCO delivers strong performance, training two networks jointly requires more computational resources. The research team plans to further optimize the co-training architecture and network interaction mechanisms to reduce training costs and to explore potential application areas.