Negative Deterministic Information-Based Multiple Instance Learning for Weakly Supervised Object Detection and Segmentation

Figure: Example of weakly supervised object detection and segmentation.

Background

Over the past decade, computer vision has advanced considerably, particularly in object detection and semantic segmentation. However, most algorithms and models depend heavily on precisely annotated data, such as bounding boxes or pixel-level masks, which are labor-intensive and time-consuming to produce in practice. Weakly Supervised Learning (WSL) addresses this by requiring only coarse-grained annotations, such as image-level labels. In this context, Weakly Supervised Object Detection (WSOD) and Weakly Supervised Semantic Segmentation (WSSS) have attracted substantial attention because of their low annotation cost.

Multiple Instance Learning (MIL) offers a viable solution for these tasks: it treats each image as a bag of instances (object regions or pixels) and identifies the foreground instances that contribute to bag classification. However, traditional MIL paradigms suffer from two recurring problems: dominant discriminative instances, where a few highly distinctive instances absorb most of the score, and missing instances, where some foreground instances are not found at all. The paper observes that negative instances often carry valuable deterministic information, termed Negative Deterministic Information (NDI), which can be used to address both problems.
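To make the bag-and-instance idea concrete, here is a minimal sketch of MIL score aggregation. It assumes a WSDDN-style two-branch head (one softmax over classes, one over instances), which is a common formulation and not necessarily the exact head used in the paper; the function names and the toy logits are illustrative only.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def mil_bag_scores(cls_logits, det_logits):
    """Aggregate instance scores into bag (image-level) scores.

    Assumed WSDDN-style head, not taken from the paper:
    cls_logits[i][c] says how much instance i looks like class c,
    det_logits[i][c] says how much instance i matters for class c.
    """
    n_inst, n_cls = len(cls_logits), len(cls_logits[0])
    # Softmax over classes, separately for each instance.
    cls_prob = [softmax(row) for row in cls_logits]
    # Softmax over instances, separately for each class.
    det_prob = [softmax([det_logits[i][c] for i in range(n_inst)])
                for c in range(n_cls)]
    inst = [[cls_prob[i][c] * det_prob[c][i] for c in range(n_cls)]
            for i in range(n_inst)]
    # The bag score of a class is the sum of its instance scores.
    bag = [sum(inst[i][c] for i in range(n_inst)) for c in range(n_cls)]
    return inst, bag

def bag_bce_loss(bag_scores, labels, eps=1e-7):
    """Binary cross-entropy between bag scores and image-level labels."""
    loss = 0.0
    for score, label in zip(bag_scores, labels):
        score = min(max(score, eps), 1.0 - eps)
        loss -= label * math.log(score) + (1 - label) * math.log(1.0 - score)
    return loss

# Toy bag: three instances, two classes; instance 0 is very distinctive.
cls_logits = [[5.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
det_logits = [[5.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
inst_scores, bag_scores = mil_bag_scores(cls_logits, det_logits)
```

In this toy bag, instance 0 contributes almost all of the class-0 bag score, which is the dominant-instance behavior described above: the image-level loss is already small, so nothing pushes the other instances to be recognized.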

Paper Source

The article is authored by Guanchun Wang, Xiangrong Zhang (IEEE Senior Member), Zelin Peng, Tianyang Zhang, Xu Tang (IEEE Senior Member), Huiyu Zhou, and Licheng Jiao (IEEE Fellow), from the School of Artificial Intelligence, Xidian University, the Institute of Artificial Intelligence, Shanghai Jiao Tong University, and the School of Computing and Mathematical Sciences, University of Leicester. The paper is published in IEEE Transactions on Neural Networks and Learning Systems.

Research Process

Overview of Research Process

The method rests on two core designs: NDI Collection and Negative Contrastive Learning (NCL). First, the paper proposes an online NDI collection module in which a dynamic feature bank identifies and refines NDI from negative instances. The NCL mechanism then uses this information to locate and penalize excessively activated discriminative regions, mitigating the problems of dominant discriminative instances and missing instances and thereby improving the accuracy and completeness of object-level and pixel-level localization. In addition, an NDI-Guided Instance Selection (NGIS) strategy is designed to further improve performance.

Research Objects and Experimental Steps

The experiments use the public benchmark datasets Pascal VOC 2007, Pascal VOC 2012, and MS COCO. The method consists of the following components:

  1. Online NDI Collection Module: A dynamic feature bank extracts NDI from negative instances. Instances are monitored online during training; those assigned to categories that the image-level annotations mark as absent from the current image are identified as negative instances, and a threshold (τ) filters out low-value instances. A Confidence-Driven Momentum Update (CMU) strategy then updates the feature bank so that high-quality NDI is distilled from the collected instances.

  2. Negative Contrastive Learning Mechanism: Built on the collected NDI, the NCL mechanism uses NDI as a template to match over-fitted discriminative instances. Matched instances are pushed away from the NDI in the representation space, steering the network away from the problem of dominant discriminative instances.

  3. NDI-Guided Instance Selection Strategy: The NGIS strategy is applied after the MIL branch to further alleviate the problem of missing instances. Using NDI as a template to screen potential positive instances improves detection performance.
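The collection step in item 1 can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the class name `NegativeFeatureBank`, the single prototype per class, the default values of `tau` and `base_momentum`, and the exact form of the confidence-driven momentum are all hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

class NegativeFeatureBank:
    """Hypothetical feature bank holding one NDI prototype per class."""

    def __init__(self, num_classes, tau=0.5, base_momentum=0.9):
        self.tau = tau                      # confidence threshold (assumed value)
        self.base_momentum = base_momentum  # assumed value
        self.protos = [None] * num_classes

    def collect(self, features, scores, image_labels):
        """Scan one image's instances and update the bank.

        features[i]     feature vector of instance i
        scores[i][c]    predicted confidence of instance i for class c
        image_labels[c] 1 if class c is present in the image, else 0
        """
        for feat, score_row in zip(features, scores):
            for c, conf in enumerate(score_row):
                # Keep only negative instances: class c is absent from the
                # image, yet the instance is confidently scored as class c.
                if image_labels[c] == 1 or conf < self.tau:
                    continue
                self._update(c, l2_normalize(feat), conf)

    def _update(self, c, feat, conf):
        if self.protos[c] is None:
            self.protos[c] = feat
            return
        # Assumed confidence-driven momentum: higher confidence lowers the
        # momentum, so a confident instance moves the prototype further.
        m = 1.0 - (1.0 - self.base_momentum) * conf
        mixed = [m * p + (1.0 - m) * f for p, f in zip(self.protos[c], feat)]
        self.protos[c] = l2_normalize(mixed)

# Image containing only class 1; instance 0 is confidently mistaken for class 0.
bank = NegativeFeatureBank(num_classes=2)
bank.collect(features=[[1.0, 0.0], [0.0, 1.0]],
             scores=[[0.9, 0.1], [0.2, 0.8]],
             image_labels=[0, 1])
first_proto = list(bank.protos[0])
bank.collect(features=[[0.0, 1.0]], scores=[[1.0, 0.0]], image_labels=[0, 1])
```

Keeping a running prototype rather than every collected feature is one simple way to smooth out noisy early-training instances; a bank that stores several entries per class would follow the same filtering logic.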

Experiments and Analysis

Experiments on the Pascal VOC 2007, Pascal VOC 2012, and MS COCO datasets show clear improvements from the proposed method:

- On Pascal VOC 2007, NDI-MIL achieves 56.8% mAP and 71.0% CorLoc, outperforming the compared methods.
- On Pascal VOC 2012, NDI-MIL reaches 53.9% mAP.
- On MS COCO, NDI-MIL also performs well under stricter evaluation criteria, improving mAP[.5:.05:.95] and mAP by 0.7% and 1.9%, respectively.
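For readers unfamiliar with CorLoc, the sketch below shows how it is commonly computed in the WSOD literature: for each image and each class present in it, the top-scoring predicted box counts as correct if its IoU with any ground-truth box of that class is at least 0.5. The function names and toy boxes are illustrative, and the paper's exact evaluation script is not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def corloc(top_boxes, gt_boxes, thresh=0.5):
    """Fraction of (image, class) pairs whose top box hits a ground-truth box.

    top_boxes[k] is the highest-scoring predicted box for pair k;
    gt_boxes[k] is the list of ground-truth boxes for the same pair.
    """
    hits = sum(
        1 for pred, gts in zip(top_boxes, gt_boxes)
        if any(iou(pred, gt) >= thresh for gt in gts)
    )
    return hits / len(top_boxes)

# Two (image, class) pairs: a perfect match and a half-shifted box.
predictions = [(0, 0, 2, 2), (0, 0, 2, 2)]
ground_truth = [[(0, 0, 2, 2)], [(1, 0, 3, 2)]]
score = corloc(predictions, ground_truth)
```

The half-shifted box overlaps its ground truth with an IoU of only one third, so it is not counted, which shows why a detector that locks onto part of an object is penalized by this metric.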

Detailed Analysis

  1. NDI Collection Module: The paper details the NDI extraction process and the CMU strategy, explaining how the dynamic feature bank refines the selection of negative instances and suppresses the noisy instances produced while the network is still insufficiently trained, thereby improving NDI quality.

  2. Negative Contrastive Learning Mechanism: The paper gives explicit formulas for how NDI is used to penalize discriminative instances, relieving the problems of dominant discriminative instances and missing instances.

  3. Experimental Results: Detailed tables compare NDI-MIL with other popular methods and show that it achieves strong results without retraining a fully supervised model.
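The formulas themselves are not reproduced in this summary, so the following is only a stand-in for the penalty described in item 2: a hinge on cosine similarity that charges instances whose features lie close to an NDI prototype. The function names, the hinge form, and the `margin` value are assumptions for illustration and should not be read as the paper's loss.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u > 0 and norm_v > 0 else 0.0

def negative_contrastive_penalty(instance_feats, ndi_proto, margin=0.2):
    """Assumed hinge penalty: mean of max(0, cos(f_i, ndi) - margin).

    Instances that resemble the NDI prototype are penalized; an instance
    whose similarity is at or below `margin` contributes nothing.
    """
    if not instance_feats:
        return 0.0
    penalties = [max(0.0, cosine(f, ndi_proto) - margin)
                 for f in instance_feats]
    return sum(penalties) / len(penalties)

ndi = [1.0, 0.0]
close = negative_contrastive_penalty([[1.0, 0.0]], ndi)   # aligned with NDI
far = negative_contrastive_penalty([[0.0, 1.0]], ndi)     # orthogonal to NDI
```

Minimizing such a term during training moves matched instance features away from the NDI template, which is the qualitative behavior the summary attributes to NCL.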

Conclusion and Value

NDI-MIL is a novel MIL paradigm based on negative deterministic information. It mitigates the common problems of dominant discriminative instances and missing instances in weakly supervised tasks and improves both object detection and semantic segmentation performance. This matters for practical computer vision applications, especially where annotation budgets are limited and labels must be used efficiently.

Research Highlights

The highlights of this research are the discovery and use of valuable deterministic information in negative instances, the design of a novel NDI collection module and NCL mechanism, the introduction of an NGIS strategy that further improves performance, and comprehensive experiments demonstrating the method's effectiveness.