Artificial Intelligence-Discipline-FmRead Academic Frontier

Report on the Paper “RepsNet: A Nucleus Instance Segmentation Model Based on Boundary Regression and Structural Re-parameterization”

Academic Background: Pathological diagnosis is the gold standard for tumor diagnosis, and nucleus instance segmentation is a key step in digital pathology analysis and pathological diagnosis. However, the computational...

Report on the Paper “CSFRNet: Integrating Clothing Status Awareness for Long-Term Person Re-Identification”

Introduction: Person Re-Identification (Re-ID) is a critical task in visual surveillance, aiming to match individuals captured by non-overlapping cameras at different times and locations. The challenge becomes more complex in Long-Term Per...

Pseudo-Plane Regularized Signed Distance Field for Neural Indoor Scene Reconstruction

Academic Background: 3D reconstruction of indoor scenes is a significant task in computer vision with broad applications in areas such as computer graphics and virtual reality. Traditional 3D reconstruction methods often rely on expensive 3D ground truth data. In recent ye...

AutoStory: Generating Diverse Storytelling Images with Minimal Human Efforts

Academic Background and Problem Statement: Story Visualization is a task aimed at generating a series of visually consistent images from a story described in text. This task requires the generated images to be of high quality, aligned with the text description, and consistent in character identities across different images. Despite its wide range of...

Combating Label Noise with a General Surrogate Model for Sample Selection

Academic Background and Problem Statement: With the rapid development of Deep Neural Networks (DNNs), visual intelligence systems have made significant progress in tasks such as image classification, object detection, and video understanding. However, these breakthroughs rely heavily on the collection of high-quality annotated data, which is often t...

Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person Re-Identification

Background: Visible-Infrared Person Re-Identification (VI-ReID) is an important research direction in the field of computer vision, aiming to retrieve images of the same pedestrian from different modalities (v...

Aniclipart: Clipart Animation with Text-to-Video Priors

Academic Background and Problem Statement: Clipart, a pre-made graphic art form, is widely used in documents, presentations, and websites to quickly enhance visual content. However, traditional workflows for converting static clipart into motion sequences are laborious and time-consuming, often involving intricate steps such as rigging, keyframing, ...

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

Academic Background: In recent years, with the breakthrough progress of Diffusion Models (DMs) in the field of image generation, Text-to-Image (T2I) generation technology has achieved significant success. However, extending this technology to Text-to-Video (T2V) generation st...

SLIDE: A Unified Mesh and Texture Generation Framework with Enhanced Geometric Control and Multi-View Consistency

Academic Background: With the increasing demand for high-quality 3D content across industries such as gaming, architecture, and social media, the manual creation of 3D assets has become time-consuming, technically demanding, and costly. In the gaming industry, the aesthetic quality of assets like characters and furniture significantly impacts the im...

UAV Behavior Intent Recognition Based on Generative Models: A Cross-Modal Study From Behavior to Natural Language

Background and Research Objectives: In recent years, Unmanned Aerial Vehicle (UAV) technology has advanced rapidly and has found widespread application in civilian and military domains, including search and rescue, precision agriculture...

RepsNet: A Nucleus Instance Segmentation Model Based on Boundary Regression and Structural Re-parameterization

CSFRNet: Integrating Clothing Status Awareness for Long-Term Person Re-Identification

Pseudo-Plane Regularized Signed Distance Field for Neural Indoor Scene Reconstruction

AutoStory: Generating Diverse Storytelling Images with Minimal Human Efforts

Combating Label Noise with a General Surrogate Model for Sample Selection

Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person Re-Identification

Aniclipart: Clipart Animation with Text-to-Video Priors

LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models

SLIDE: A Unified Mesh and Texture Generation Framework with Enhanced Geometric Control and Multi-View Consistency

From Behavior to Natural Language: Generative Approach for UAV Intent Recognition