Heuristic Underwater Perceptual Enhancement with Semantic Collaborative Learning

Academic Background and Problem Statement: Underwater images have significant application value in fields such as marine exploration, underwater robotics, and marine life identification. However, due to the refraction and absorption of light by water, underwater images often suffer from low contrast and color distortion, which severely impacts the a...

Blind Image Quality Assessment: Exploring Content Fidelity Perceptibility via Quality Adversarial Learning

Academic Background: Image Quality Assessment (IQA) is a fundamental problem in the field of computer vision, aiming to evaluate the fidelity of visual content in images. IQA has significant applications in areas such as image compression and restoration. Traditional IQA meth...

RepsNet: A Nucleus Instance Segmentation Model Based on Boundary Regression and Structural Re-parameterization

Academic Background: Pathological diagnosis is the gold standard for tumor diagnosis, and nucleus instance segmentation is a key step in digital pathology analysis and pathological diagnosis. However, the computational...

CSFRNet: Integrating Clothing Status Awareness for Long-Term Person Re-Identification

Introduction: Person Re-Identification (Re-ID) is a critical task in visual surveillance, aiming to match individuals across non-overlapping cameras captured at different times and locations. The challenge becomes more complex in Long-Term Per...

Pseudo-Plane Regularized Signed Distance Field for Neural Indoor Scene Reconstruction

Academic Background: 3D reconstruction of indoor scenes is a significant task in computer vision with broad applications, such as computer graphics and virtual reality. Traditional 3D reconstruction methods often rely on expensive 3D ground truth data. In recent ye...

AutoStory: Generating Diverse Storytelling Images with Minimal Human Efforts

Academic Background and Problem Statement: Story Visualization is a task aimed at generating a series of visually consistent images from a story described in text. This task requires the generated images to be of high quality, aligned with the text description, and consistent in character identities across different images. Despite its wide range of...

Combating Label Noise with a General Surrogate Model for Sample Selection

Academic Background and Problem Statement: With the rapid development of Deep Neural Networks (DNNs), visual intelligence systems have made significant progress in tasks such as image classification, object detection, and video understanding. However, these breakthroughs rely heavily on the collection of high-quality annotated data, which is often t...

Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person Re-Identification

Background: Visible-Infrared Person Re-Identification (VI-ReID) is an important research direction in the field of computer vision, aiming to retrieve images of the same pedestrian from different modalities (v...

AniClipart: Clipart Animation with Text-to-Video Priors

Academic Background and Problem Statement: Clipart, a pre-made graphic art form, is widely used in documents, presentations, and websites to quickly enhance visual content. However, traditional workflows for converting static clipart into motion sequences are laborious and time-consuming, often involving intricate steps such as rigging, keyframing, ...

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Academic Background: In recent years, with the breakthrough progress of Diffusion Models (DMs) in the field of image generation, Text-to-Image (T2I) generation technology has achieved significant success. However, extending this technology to Text-to-Video (T2V) generation st...