Unsupervised Fusion of Misaligned PAT and MRI Images via Mutually Reinforcing Cross-Modality Image Generation and Registration

Background and Research Objectives

In recent years, photoacoustic tomography (PAT) and magnetic resonance imaging (MRI) have been widely used in preclinical research as advanced biomedical imaging techniques. PAT provides high optical contrast and deep imaging but poor soft tissue contrast; MRI provides excellent soft tissue contrast but low temporal resolution. Multimodal data fusion has made progress, yet fusing PAT and MRI images remains challenging because the two images are typically misaligned and spatially distorted relative to each other.

Neural Network Architecture Designed in This Study

To address these issues, the authors propose PAMRFuse, a staged deep learning framework for fusing unaligned PAT and MRI images. The framework comprises a multimodal-to-single-modal registration network that accurately aligns the input PAT and MRI image pair, and a self-attention fusion network that selects information-rich features for fusion. The goal is to fuse information from unaligned PAT and MRI images so that researchers obtain more complete and detailed information about the target.

Source of the Paper

The paper was authored by Yutian Zhong, Shuangyang Zhang, Zhenyang Liu, Xiaoming Zhang, Zongxin Mo, Yizhe Zhang, Haoyu Hu, Wufan Chen (IEEE Senior Member), and Li Qi from Southern Medical University and its affiliated institutions. The paper was published in the May 2024 issue of IEEE Transactions on Medical Imaging.

Research Workflow and Methods

Workflow

The workflow of PAMRFuse is divided into two main parts: the multimodal-to-single-modal registration network and the self-attention fusion network.

  1. Multimodal-to-Single-Modal Registration Network:

    • Simplifies image registration through an image synthesis strategy, which reduces spatial shifts and ghosting. The synthesis uses a Generative Adversarial Network (GAN) consisting of a generator and a discriminator; residual connections in the generator improve the diversity and quality of the generated images.
    • Synthesizes pseudo-MRI images to assist the registration of real MRI images and PAT images, reducing alignment errors between the two.
  2. Self-Attention Fusion Network:

    • Includes a global path, a local path, and a merging module. The global path extracts global features using a self-attention mechanism, while the local path retains details. The merging module then fuses these features to generate the fused image.
    • Uses two adversarial discriminators to distinguish fused images from single-modal images, improving fusion quality.

Multimodal Image Synthesis Network

Because PAT and MRI are acquired in different imaging environments, the resulting images are often misaligned, and fusing them directly produces ghosting artifacts. The researchers therefore use a GAN to generate pseudo-MRI images, which reduces the multimodal registration problem to a single-modal one. The generator employs residual connections to improve training stability and the diversity of the generated images. The discriminator extracts image features with multiple convolutional layers and classifies the image as real or generated through fully convolutional layers.
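The adversarial training described above can be sketched with a standard GAN objective. This summary does not state which adversarial loss PAMRFuse uses, so the binary cross-entropy form below, the non-saturating generator loss, and the function names (bce, discriminator_loss, generator_loss) are assumptions chosen for illustration only. Inputs are discriminator output probabilities in the range 0 to 1.

```python
import math


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for a single probability and a 0/1 target."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real, d_fake):
    """Assumed loss: real MRI scores pushed toward 1, pseudo-MRI scores toward 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Assumed non-saturating loss: pseudo-MRI scores pushed toward 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for two real and two generated images.
    print(discriminator_loss([0.9, 0.8], [0.2, 0.1]))
    print(generator_loss([0.2, 0.1]))
```

The generator loss falls as the discriminator assigns higher probabilities to generated images, which is the pressure that pushes pseudo-MRI images toward a realistic MRI appearance.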

Registration Network

The registration network takes the pseudo-MRI image produced by the image synthesis network and the real MRI image, and estimates an image deformation field between them. Because both inputs have an MRI-like appearance, the task becomes a single-modal registration problem, which reduces the computational complexity of multimodal registration. The architecture resembles the U-Net model and integrates residual modules to improve performance.
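To make the role of a deformation field concrete, here is a minimal sketch of applying a dense displacement field to a 2-D image with bilinear interpolation. The function name warp_image, the (dy, dx) pixel-displacement convention, and the border clamping are assumptions for illustration; the paper's own warping layer and conventions are not described in this summary.

```python
def warp_image(image, flow):
    """Warp a 2-D image with a dense displacement field.

    image: list of rows of floats (H x W).
    flow:  H x W list of (dy, dx) displacements in pixels. Output pixel (y, x)
           samples the input at (y + dy, x + dx) using bilinear interpolation.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y][x]
            # Clamp the sampling location to the image border.
            sy = min(max(y + dy, 0.0), h - 1.0)
            sx = min(max(x + dx, 0.0), w - 1.0)
            y0, x0 = int(sy), int(sx)  # floor, since sy and sx are non-negative
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            wy, wx = sy - y0, sx - x0
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out


if __name__ == "__main__":
    img = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
    half_pixel_right = [[(0.0, 0.5)] * 3 for _ in range(3)]
    print(warp_image(img, half_pixel_right))
```

In a learned registration network the displacement values would come from the network output rather than being fixed by hand, and the interpolation would need to be differentiable so that the registration loss can be backpropagated through the warp.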

Self-Attention Fusion Network

The fusion network has three sub-modules: a global path, a local path, and a merging module. The global path uses a self-attention mechanism to model long-range dependencies, while the local path retains details. The merging module fuses the features extracted by both paths to generate the final fused image. During training, two adversarial discriminators, one for pseudo-MRI images and one for PAT images, distinguish the fused image from each single-modal source.
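The long-range modeling attributed to the global path can be illustrated with scaled dot-product self-attention over a short list of feature vectors. This is a minimal sketch that assumes queries, keys, and values are the input tokens themselves, with no learned projections; the actual PAMRFuse attention layers, their projections, and their dimensions are not given in this summary, and the names softmax and self_attention are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    tokens: list of N feature vectors, each a list of d floats.
    Returns N vectors, each a weighted average of all input tokens.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out


if __name__ == "__main__":
    # Three hypothetical 2-D feature vectors, e.g. flattened image patches.
    print(self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```

Every output vector depends on every input token regardless of spatial distance, which is the property that lets an attention path capture global context while a convolutional path preserves local detail.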

Key Research Results

Quantitative and Qualitative Analysis

The researchers conducted extensive quantitative and qualitative experiments to evaluate PAMRFuse on small animal PAT and MRI images. The results show that PAMRFuse effectively eliminates misalignment while retaining the soft tissue details from MRI and the brightness information from PAT. Compared with 10 other advanced fusion methods, PAMRFuse achieves higher image quality and better fusion performance across the reported metrics.

Performance Metrics

The effectiveness of PAMRFuse was further validated with a series of metrics, including Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR), and Correlation Coefficient (CC). In regions with high-intensity brightness information in particular, PAMRFuse retains the original image details while avoiding information loss and image blurring.
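For reference, the three named metrics can be computed as follows. This is a minimal sketch using their standard definitions on small 2-D images stored as lists of rows; the assumed intensity range (max_val=1.0) and the function names are illustrative, and the summary does not say how the paper pairs the fused image with each source image when reporting these values.

```python
import math


def mse(a, b):
    """Mean squared error between two equally sized 2-D images."""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)


def psnr(a, b, max_val=1.0):
    """Peak signal-to-noise ratio in dB; max_val is the assumed intensity range."""
    err = mse(a, b)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / err)


def correlation_coefficient(a, b):
    """Pearson correlation between the flattened pixel intensities."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    source = [[0.0, 0.5], [1.0, 0.5]]  # hypothetical source image
    fused = [[0.1, 0.5], [0.9, 0.5]]   # hypothetical fused image
    print(mse(source, fused), psnr(source, fused), correlation_coefficient(source, fused))
```

Lower MSE and higher PSNR and CC indicate that the fused image stays closer to the source it is compared against.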

Analysis of Modal Image Variability and Ablation Experiments

To assess the contribution of cross-modality image synthesis and of the individual loss functions, the researchers conducted multiple ablation experiments. The results show that using a GAN to generate pseudo-MRI images, combined with the global correlation loss (GCC) and the second-order deformation field gradient loss (LSmooth), significantly improves image registration accuracy and overall quality. Furthermore, adding a self-attention mechanism and a densely connected fusion network enables effective selection of salient features for fusion while retaining image details.
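The two registration losses are named here without formulas. The sketch below assumes that the global correlation loss is one minus the Pearson correlation over all pixels, and that the smoothness loss is the mean squared second-order finite difference of one displacement component. Both definitions and the function names are assumptions for illustration, not the paper's exact formulation.

```python
import math


def global_correlation_loss(fixed, moving, eps=1e-8):
    """Assumed GCC loss: 1 - Pearson correlation over the whole image."""
    xs = [v for row in fixed for v in row]
    ys = [v for row in moving for v in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    # eps guards against division by zero for constant images.
    return 1.0 - cov / (math.sqrt(var_x * var_y) + eps)


def second_order_smoothness(field):
    """Assumed smoothness loss for one displacement component (H x W).

    Averages the squared second-order central differences along both axes,
    so displacement fields that vary linearly in space are not penalized.
    """
    h, w = len(field), len(field[0])
    terms = []
    for y in range(h):
        for x in range(w):
            if 0 < y < h - 1:
                terms.append((field[y + 1][x] - 2 * field[y][x] + field[y - 1][x]) ** 2)
            if 0 < x < w - 1:
                terms.append((field[y][x + 1] - 2 * field[y][x] + field[y][x - 1]) ** 2)
    return sum(terms) / len(terms) if terms else 0.0


if __name__ == "__main__":
    ramp = [[float(x + 2 * y) for x in range(4)] for y in range(4)]
    print(second_order_smoothness(ramp))  # linear field, no curvature penalty
```

A total registration loss would typically be a weighted sum of the similarity term and the smoothness term applied to each displacement component; the weights used in the paper are not stated in this summary.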

Conclusion

PAMRFuse is presented as a pioneering approach that handles misalignment between PAT and MRI images as part of the fusion process, and it achieves significant results. The method outperforms both traditional and current deep learning methods in image fusion performance and demonstrates broad applicability to complex image fusion tasks. Although challenges regarding hardware integration remain, PAMRFuse offers a novel and practical method for multimodal image fusion and lays a solid foundation for future research.

Research Value

This study expands the application scope of multimodal image fusion and proposes an effective new method for complex image registration and fusion tasks. By retaining image details and rich information, PAMRFuse provides high-quality fused images that are valuable for preclinical research and for applications in other fields.