SeaFormer++: Squeeze-Enhanced Axial Transformer for Mobile Visual Recognition

SeaFormer++: An Efficient Transformer Architecture Designed for Mobile Visual Recognition. Research Background and Problem Statement: In recent years, the field of computer vision has undergone a significant shift from Convolutional Neural Networks (CNNs) to Transformer-based methods. However, despite Vision Transformers demonstrating excellent glob...

Lidar-guided Geometric Pretraining for Vision-centric 3D Object Detection

Lidar-Guided Geometric Pretraining Enhances Performance of Vision-Centric 3D Object Detection. Background: In recent years, multi-camera 3D object detection has garnered significant attention in the field of autonomous driving. However, vision-based methods still face challenges in precisely extracting geometric information from RGB imag...

An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-training

Academic Background: In recent years, self-supervised learning (SSL) has made significant progress in the field of computer vision. In particular, the successful application of masked image modeling (MIM) pre-training methods on large-sca...

A Memory-Assisted Knowledge Transferring Framework with Curriculum Anticipation for Weakly Supervised Online Activity Detection

Research Background and Significance: In recent years, weakly supervised online activity detection (WS-OAD), an important topic in high-level video understanding, has garnered widespread attention. Its primary goal is to detect ongoing activities frame by frame in streaming videos using only inexpensive video-level annotations. This task holds si...

Sample Correlation for Fingerprinting Deep Face Recognition

Report on Academic Paper: “Sample Correlation for Fingerprinting Deep Face Recognition”. Background and Research Problem: In recent years, rapid advances in deep learning have significantly propelled the development of face recognition. However, commercial face recognition models face increasing intellectual property (IP) threats...

Noninvasive Grading of Glioma by Knowledge Distillation Based Lightweight Convolutional Neural Network

Review of Non-Invasive Glioma Grading Research: Lightweight Convolutional Neural Networks Based on Knowledge Distillation. Background: Gliomas are the main tumors of the central nervous system, and early detection is crucial. The World Health Organization (WHO) classifies gliomas from grade I to IV, with grades I and II being low-grade gliomas (LGG) ...

Exploring Adaptive Inter-Sample Relationship in Data-Free Knowledge Distillation

In recent years, concerns such as privacy protection and the cost of large-scale data transmission have often made training data inaccessible. Researchers have proposed Data-Free Knowledge Distillation (DFKD) methods to address this issue. Knowledge Distillation (KD) is a method for training a lightweight model (student model) to le...

Prototype-Based Sample-Weighted Distillation Unified Framework Adapted to Missing Modality Sentiment Analysis

Application of a Prototype-Based Sample-Weighted Distillation Unified Framework in Missing Modality Sentiment Analysis. Research Background: Sentiment analysis is a significant field in Natural Language Processing (NLP). With the development of social media platforms, people increasingly tend to express their emotions through short video clips. This ...

Balancing Feature Alignment and Uniformity for Few-Shot Classification

Solving Few-Shot Classification Problems with Balanced Feature Alignment and Uniformity. Background and Motivation: The goal of Few-Shot Learning (FSL) is to correctly recognize new samples with only a few examples from new classes. Existing few-shot learning methods mainly learn transferable knowledge from base classes by maximizing the information ...

Weakly Supervised Semantic Segmentation via Alternate Self-Dual Teaching

Weakly Supervised Semantic Image Segmentation via Alternate Self-Dual Teaching. Background: With the continuous development of computer vision, semantic segmentation has become an important and active research direction. Traditional semantic segmentation methods rely on manually annotated pixel-level labels; however, obtaining thes...