A Monolithic 3D IGZO-RRAM-SRAM-Integrated Architecture for Robust and Efficient Compute-in-Memory

Monolithic 3D IGZO-RRAM-SRAM Compute-in-Memory Architecture: A Breakthrough in Improving Neural Network Computation Efficiency

Background and Research Motivation

As neural networks (NNs) continue to find applications in artificial intelligence, traditional computing architectures struggle to meet their needs for energy efficiency, speed, and density. This challenge has led researchers to explore Compute-in-Memory (CIM) technology. CIM integrates computing units with memory units in one architecture, mitigating the “memory wall” caused by excessive data transfer between storage and computation units and thereby significantly improving system efficiency. Existing CIM architectures are mainly based on Static Random Access Memory (SRAM), Resistive Random Access Memory (RRAM), and Indium-Gallium-Zinc-Oxide (IGZO) devices.

However, current single-device-based CIM systems face significant challenges in balancing density, energy efficiency, and precision. Specifically:
  1. Non-Ideality Issues of Single Devices: Each memory device has its own limitations. For example, SRAM offers high precision but suffers from low density and high power consumption, while RRAM provides high density but faces challenges with cell-to-cell variations and limited write endurance.
  2. High Resource Proportion Outside CIM Arrays: This is especially evident in activation data storage. Some large-scale neural networks require significant storage for intermediate activation data, and traditional solutions rely on SRAM, whose low density makes the overall CIM system inefficient.

These challenges have driven researchers to seek a novel CIM architecture that combines the advantages of different devices while overcoming their limitations. This study, published in Science China Information Sciences, proposes a monolithic 3D IGZO-RRAM-SRAM integrated architecture to address these challenges.

Research Source

This work was jointly completed by the Institute of Microelectronics of the Chinese Academy of Sciences and the University of Chinese Academy of Sciences. The lead authors include Shengzhe Yan, Zhaori Cong, Zi Wang, and others. The paper, titled A monolithic 3D IGZO-RRAM-SRAM-integrated architecture for robust and efficient compute-in-memory enabling equivalent-ideal device metrics, was published online in Science China Information Sciences in February 2025.

Research Process and Technical Details

1. Introducing the “Equivalent-Ideal” CIM Architecture

The researchers proposed an “Equivalent-Ideal CIM Architecture” (EQ-CIM), which uses monolithic integration to achieve functional decomposition among SRAM, RRAM, and IGZO in a 3D architecture. Its goal is to combine the unique advantages of each device:
  • IGZO is responsible for activation storage, featuring ultra-low leakage currents for high density and low power.
  • RRAM serves as high-density weight storage.
  • SRAM is used for high-precision, efficient CIM operations.

This functional decomposition strategy takes advantage of the distinct capabilities of each device while avoiding their respective non-idealities through architectural design.

2. 3D Stacking and Device Modeling

The researchers adopted monolithic 3D stacking technology, integrating RRAM between metal layers (Metal 5 and Metal 6), placing IGZO on the topmost metal layer (Metal 9), and SRAM on the silicon layer. Key experiments include:
  • Modeling and Variation Analysis for RRAM and IGZO Devices: Using a 2 KB RRAM array and 52 IGZO devices for testing, they analyzed performance variations caused by temperature, geometric parameters (e.g., contact depth), and other factors.
  • Device Characteristics Extraction: They extracted metrics such as threshold voltage drift and on-state current variations for IGZO devices, as well as resistance distribution shifts over time for the high and low resistance states (HRS/LRS) in RRAM.

Additionally, to address frequency mismatches between different devices (e.g., SRAM operating at 400 MHz versus IGZO at a typical operating frequency of 50 MHz), the researchers proposed a bandwidth multiplication solution. This solution eliminates frequency disparities by operating multiple IGZO storage blocks in parallel.
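The arithmetic behind this bandwidth multiplication can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and it assumes each block delivers one word per cycle of its own clock and that blocks are accessed in round-robin order.

```python
import math
from typing import List


def parallel_blocks_needed(consumer_mhz: float, block_mhz: float) -> int:
    """Smallest number of parallel blocks whose combined access rate
    matches or exceeds the consumer clock (assumes one word per block cycle)."""
    if consumer_mhz <= 0 or block_mhz <= 0:
        raise ValueError("frequencies must be positive")
    return math.ceil(consumer_mhz / block_mhz)


def interleaved_schedule(num_blocks: int, num_cycles: int) -> List[int]:
    """Block index accessed on each fast-clock cycle (round-robin assumption)."""
    return [cycle % num_blocks for cycle in range(num_cycles)]


# Frequencies quoted in the text: SRAM at 400 MHz, IGZO at 50 MHz.
blocks = parallel_blocks_needed(400, 50)
schedule = interleaved_schedule(blocks, 16)
print(blocks)    # 8
print(schedule)  # each block index recurs once every 8 fast cycles
```

Under these assumptions, eight IGZO blocks are enough: each block is touched once every eight SRAM cycles, which corresponds to its own 50 MHz rate.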

3. Device-to-System Simulation Framework

The researchers constructed a simulation framework that connects device-level to system-level analysis:
  • At the device level, key parameters and variations (e.g., temperature-related drift, geometric changes) of RRAM and IGZO were extracted.
  • At the system level, these device-level effects were translated into algorithmic impacts to assess their influence on neural network precision and energy consumption.
The researchers used a Python toolchain based on PyTorch for this evaluation.
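To make the device-to-algorithm translation concrete, the sketch below shows one common way such a step can be modeled. It is an assumption for illustration only: it uses plain Python rather than the authors' PyTorch toolchain, a generic multiplicative Gaussian noise model rather than the extracted RRAM/IGZO statistics, and hypothetical function names.

```python
import random
from typing import List


def perturb_weights(weights: List[float], rel_sigma: float,
                    rng: random.Random) -> List[float]:
    """Apply multiplicative Gaussian noise to each stored weight
    (assumed variation model, not the paper's measured distribution)."""
    return [w * (1.0 + rng.gauss(0.0, rel_sigma)) for w in weights]


def dot(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def mean_relative_error(weights: List[float], activations: List[float],
                        rel_sigma: float, trials: int = 200,
                        seed: int = 0) -> float:
    """Monte Carlo estimate of the output error of one dot product
    caused by device-level weight variation."""
    ideal = dot(weights, activations)
    if ideal == 0:
        raise ValueError("ideal output must be non-zero for a relative error")
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        noisy = perturb_weights(weights, rel_sigma, rng)
        total += abs(dot(noisy, activations) - ideal) / abs(ideal)
    return total / trials


# Illustrative values only.
w = [0.5, 1.0, 0.25, 0.75]
a = [1.0, 2.0, 3.0, 4.0]
for sigma in (0.0, 0.05, 0.10):
    print(sigma, mean_relative_error(w, a, sigma))
```

In a full framework the same idea would be applied layer by layer, with the noise parameters replaced by the measured device statistics and the error metric replaced by end-to-end network accuracy.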

Neural network workloads were compiled and mapped to different storage layers (IGZO, RRAM, SRAM). The simulation calculated the energy consumption and area efficiency of the entire system based on weight and activation read/write operations.
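The accounting described here can be sketched as a simple sum over accesses per storage layer. The placement follows the roles stated in the text (activations in IGZO, weights in RRAM, compute in SRAM), but the per-access energies, access counts, and all names below are placeholder assumptions, not values or interfaces from the paper.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AccessCost:
    read_pj: float   # energy per read access, in picojoules
    write_pj: float  # energy per write access, in picojoules


# Placeholder energies for illustration only (NOT from the paper).
COSTS: Dict[str, AccessCost] = {
    "IGZO": AccessCost(read_pj=1.0, write_pj=2.0),
    "RRAM": AccessCost(read_pj=2.0, write_pj=50.0),
    "SRAM": AccessCost(read_pj=0.5, write_pj=0.5),
}

# Data placement following the functional decomposition described in the text.
PLACEMENT: Dict[str, str] = {
    "activation": "IGZO",
    "weight": "RRAM",
    "compute": "SRAM",
}


def layer_energy_pj(reads: Dict[str, int], writes: Dict[str, int]) -> float:
    """Total access energy of one network layer, summed over storage tiers."""
    total = 0.0
    for kind, tier in PLACEMENT.items():
        cost = COSTS[tier]
        total += reads.get(kind, 0) * cost.read_pj
        total += writes.get(kind, 0) * cost.write_pj
    return total


# Hypothetical access counts for a single layer during inference.
reads = {"activation": 1000, "weight": 500, "compute": 500}
writes = {"activation": 1000, "compute": 500}
energy = layer_energy_pj(reads, writes)
print(energy)  # 4500.0 pJ with the placeholder numbers above
```

Weights are not rewritten during inference in this example, so the high placeholder RRAM write cost never contributes, which reflects why write-limited devices suit weight storage.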

4. Workflow and Experimental Results

The researchers evaluated standard neural network models (such as VGG16 and ResNet50) on the CIFAR-10 and ImageNet datasets:
  • Storage Density: EQ-CIM achieved a storage density of 19.8 MB/mm², improving storage capacity by 5 to 11 times compared to existing CIM schemes (e.g., those based on RRAM or PCM).
  • Energy Efficiency: In ResNet50 tests, the EQ-CIM system achieved an energy efficiency of 95.2 TOPS/W, 2.45 times higher than the most efficient single-device solutions.
  • Neural Network Accuracy: On ImageNet tasks, EQ-CIM maintained high accuracy (with <0.27% loss) even within a wide operating temperature range (-40°C to 120°C).
  • Area Efficiency: Compared to pure SRAM or RRAM schemes, EQ-CIM achieved a 3.99× increase in system area efficiency.

Conclusions and Academic Implications

1. Study Conclusions

EQ-CIM successfully achieved breakthroughs in computing density, energy efficiency, and precision through the innovative integration of IGZO, RRAM, and SRAM. Additionally, its architecture exhibited excellent robustness under high temperatures and device variations, making it suitable for large-scale neural network models.

2. Scientific and Engineering Significance

This study demonstrated outstanding co-optimization across device, architecture, and system design, opening new directions for CIM development. Its scientific implications include:
  1. Proposing a novel solution to address the non-ideal characteristics of single-device-based designs.
  2. Expanding application scenarios for monolithic 3D stacking technology in memory and computation.
  3. Providing a device-to-system simulation framework as a powerful analytical tool for future CIM research.

Its engineering value lies in:
  • Pioneering advancements in high-efficiency CIM chip technologies for edge computing.
  • Offering a novel design approach for hardware accelerators for neural network inference and training.

3. Research Highlights

  • Innovatively combines multiple memory types to achieve “equivalent-ideal” CIM performance.
  • Demonstrates highly efficient 3D stacking technology complemented by robust temperature evaluation frameworks.
  • Experimental validation using standard neural networks indicates strong potential for real-world applications.

The researchers also noted several engineering challenges that remain unresolved, such as detailed 3D stacking processes, material selection, thermal management, and chip-level reliability, which will require further exploration in future studies.