Current Status and Future Challenge of PRAM
Yen-Hao Shih
IBM/Macronix PCRAM Joint Project
Macronix International Co., Ltd.
Tel: 886-3-5786688 ext. 78014, email: yhshih@mxic.com.tw

Abstract
Phase-Change RAM (PRAM) is one of the most promising new technologies that may scale beyond current charge-based flash memories. Because it relies on new materials, reliability is the most difficult challenge PRAM faces. Recent research shows that the biggest current challenge is to reduce the tail bits that limit chip retention time. For future technology nodes, the challenges will come from retention time after scaling and from suppressing resistance drift for stable multi-level cell (MLC) operation. In general, the challenges for PRAM will remain high as long as the fundamental understanding of phase change materials remains weak.

Introduction
It is well known that conventional charge-based non-volatile memories are approaching their physical scaling limits [1]. A number of new non-volatile memories have been proposed [2-6], and among them phase-change RAM has been the most promising candidate [5, 6]. To enable this new technology, the researchers and engineers in the PRAM community have been working on three major challenges: write power, phase change materials, and reliability. The write power issue mainly comes from the gap between the current density provided by silicon driving devices and the current density required to melt phase change materials. The gap can be narrowed through various current-confining structures [7-10]. Integrating phase change materials into BEOL is not as easy as integrating them into re-writable CD/DVD. The feature sizes of PRAM devices are small, in the nanometer range, and thus the devices are very sensitive to local variation of the phase change materials. The reliability of PRAM is the most challenging topic because PRAM is a new technology and it is very dependent on the phase change materials adopted in the chips. The most critical reliability problem is tail bits, or so-called early fail bits, due to their shorter retention time and their unstable nature [11, 12].

The memory arrays in this work were fabricated by a 0.18μm CMOS logic process, with the bottom electrodes made using a key-hole transfer process described elsewhere [10]. The phase-change material is doped Ge2Sb2Te5 (GST).

Challenge I—Tail Bits
Tail bits are defined as RESET bits that have a significantly shorter data retention time than the main distribution of RESET cells (normal bits) when subjected to high-temperature retention tests. Tail bits were first reported by B. Gleixner, et al. in 2007 [11] after these authors carefully examined the data from their PCM chips. Figure 1 shows how tail bits (called early fail bits in [11]) fail the chip retention test very early. The model proposed in [11] is based on percolation path(s) created by imperfect RESET operation and high temperature bake.

Figure 2 shows the R bit maps during 130°C baking experiments from our PRAM chips using doped GST. After 2 hours of baking, many tail bits appear, and longer baking brings all the cells below the failure criterion (R=100KΩ). Figure 3 shows an R-R plot, which is used to check individual cell’s R in two consecutive baking experiments at 130°C. A bit that at one time exhibits tail behavior may later be a normal bit, and vice versa. Once the R distribution is known, the overall group behaviors are well predicted (the inset in Fig. 3). No dependency among those tail bits is found, i.e., the tail bits in each bake are independent events, and they are randomly distributed across the memory array.
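The independence of tail bits between bakes can be checked numerically. The following is a minimal sketch, not the analysis used in this work: the function names and the example resistances are hypothetical, and only the 100KΩ failure criterion is taken from the text. If tail bits in two bakes are independent events, the number of cells that fail in both bakes should be close to N·p1·p2, where p1 and p2 are the tail-bit fractions of each bake.

```python
FAIL_OHMS = 100e3  # failure criterion from the text (R = 100 KΩ)

def tail_flags(resistances, threshold=FAIL_OHMS):
    """Mark each cell True if its resistance fell below the failure criterion."""
    return [r < threshold for r in resistances]

def overlap_stats(flags_a, flags_b):
    """Return (observed, expected) counts of cells that are tail bits in both bakes.

    'expected' is the count predicted if the two bakes are independent:
    N * (tails_a / N) * (tails_b / N).
    """
    n = len(flags_a)
    observed = sum(1 for a, b in zip(flags_a, flags_b) if a and b)
    expected = sum(flags_a) * sum(flags_b) / n
    return observed, expected

# Hypothetical resistances (ohms) of the same four cells after two bakes.
bake_1 = [50e3, 80e3, 2e6, 3e6]
bake_2 = [60e3, 2e6, 90e3, 4e6]
observed, expected = overlap_stats(tail_flags(bake_1), tail_flags(bake_2))
print(observed, expected)
```

An observed overlap far above the expected value would indicate that the same cells fail repeatedly, which is not what the R-R plot shows.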

We further tested the previous model through a bake-Vt stress-bake experiment (Figs. 4 and 5) to investigate the origin of the tail bits. The results suggest that tail bits may not be controlled by pre-existing grains left by the imperfect RESET operation, but rather by spontaneous nucleation in the material [13].

Tail bits may or may not be an issue for chip operation, depending on the application the PRAM chips are used for. Error correction code (ECC) could be adopted to correct the errors caused by tail bits, but it would slow access and incur a silicon area penalty.
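To illustrate how much ECC could help, the sketch below estimates the probability that a codeword contains more failed bits than the code can correct. It assumes that bit failures are independent (consistent with the random, independent tail bits reported above) and uses a binomial model. The codeword length, correction capability, and bit failure rate are hypothetical values chosen for illustration, not parameters of our chips.

```python
from math import comb

def word_failure_probability(n_bits, t_correctable, p_bit):
    """Probability that more than t_correctable of n_bits fail.

    Assumes each bit fails independently with probability p_bit.
    The upper tail is summed directly to avoid cancellation for small p_bit.
    """
    return sum(
        comb(n_bits, k) * p_bit**k * (1.0 - p_bit) ** (n_bits - k)
        for k in range(t_correctable + 1, n_bits + 1)
    )

# Hypothetical example: 72-bit codeword, 1e-6 tail-bit probability per bit.
no_ecc = word_failure_probability(72, 0, 1e-6)   # any failed bit is fatal
with_sec = word_failure_probability(72, 1, 1e-6) # single-error correction
print(no_ecc, with_sec)
```

Under these assumptions a single-error-correcting code lowers the word failure probability by several orders of magnitude, at the cost of the extra check bits and decoding time mentioned above.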

Challenge II—Retention after Scaling
Normal bits do not maintain high resistance (>1MΩ) forever. Figure 6 (a) shows that the cell resistance starts to drop when the bake time is longer than 22 hours at 130°C. This implies a separate mechanism for retention loss of normal bits. The mechanism is directly confirmed by TEM observation. We first prepared a TEM specimen containing 10 RESET cells. Every cell on the specimen was imaged (Fig. 6 (b)) before the specimen received 150°C annealing in nitrogen. After the annealing, the cells were imaged again and compared (Fig. 6 (c)). All 10 cells showed the same result: the amorphous GST (aGST) region shrank significantly after the annealing. This is evidence that grain growth from the aGST/polycrystalline GST (cGST) boundary dominates normal bit retention time in our doped GST PRAM devices. However, the nucleation mechanism [13] may still be valid for other PRAM devices using GST with different composition or doping.

A retention failure model (Fig. 7) for PRAM using doped GST has been proposed [12]. In general, RESET cells suffer from two retention loss mechanisms. Spontaneous nucleation and grain growth in aGST create the random tail bits, while grain growth from the aGST/cGST boundary dominates the normal bit retention time. This model explains well the different slopes in the failure probability plot (Fig. 8).
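Retention measured at bake temperature is commonly extrapolated to storage temperature with an Arrhenius relation, in which retention time scales as exp(Ea/kT). The sketch below shows that calculation under this assumption. The activation energy is a placeholder, not the value extracted in this work, and the function names are hypothetical.

```python
from math import exp

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

def acceleration_factor(ea_ev, bake_c, storage_c):
    """Ratio of retention time at storage_c to retention time at bake_c.

    Assumes retention time is proportional to exp(Ea / (k * T)).
    """
    t_bake = bake_c + 273.15    # convert Celsius to kelvin
    t_store = storage_c + 273.15
    return exp((ea_ev / K_B_EV_PER_K) * (1.0 / t_store - 1.0 / t_bake))

def extrapolated_retention_hours(bake_hours, ea_ev, bake_c, storage_c):
    """Scale a retention time measured at bake_c to storage_c."""
    return bake_hours * acceleration_factor(ea_ev, bake_c, storage_c)

EA_PLACEHOLDER_EV = 2.5  # hypothetical activation energy, for illustration only
hours = extrapolated_retention_hours(22.0, EA_PLACEHOLDER_EV, 130.0, 80.0)
print(f"{hours / 8760.0:.1f} years at 80°C (placeholder Ea)")
```

Because the result depends exponentially on Ea, the two mechanisms in the model have to be extrapolated separately, each with its own activation energy and slope.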

On the other hand, the grain growth model may introduce a potential issue for scaling: the mushroom-type PRAM may not provide enough data retention time after scaling to advanced nodes, where the driving devices supply less current for RESET. Figure 9 shows the retention time of cells RESET with different amounts of amorphizing current. Indeed, less current gives shorter retention time. How to guarantee retention after scaling is a challenge for the future.

Challenge III—Retention for MLC
Resistance drift [14] may not be important for single-level cell operation, but it narrows the MLC operation window. Figure 10 compares 1-2 MΩ cells programmed using two different approaches: i) a 600μA RESET current with fast quenching and ii) a ~1.4mA RESET current with a 400ns trailing pulse edge. Although there is no significant difference in resistance between bits programmed by the two approaches, our newly developed characterization method [15] shows that the aGST volume of the fast-quenched cells is smaller and that the trap density in their aGST is lower (Fig. 11). The cells programmed by high RESET current and slow quenching have a larger aGST volume but more traps inside. A higher trap density could lead to more serious R-drift and even shorten the retention time for MLC operation. Finding a way to completely stop R-drift will require more fundamental material studies using PRAM devices, or else more innovative ways to track and characterize R-drift to guarantee MLC retention.
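The effect of drift on the MLC window can be illustrated with the power law that is widely used as an empirical description of amorphous-phase resistance drift, R(t) = R0·(t/t0)^ν. This form is assumed here for illustration and is not fitted to the data in this work. The drift exponents and the level boundary in the sketch are hypothetical.

```python
def drifted_resistance(r0_ohms, t_seconds, nu, t0_seconds=1.0):
    """Resistance after t_seconds, assuming R(t) = R0 * (t / t0) ** nu."""
    return r0_ohms * (t_seconds / t0_seconds) ** nu

def time_to_reach(r0_ohms, r_limit_ohms, nu, t0_seconds=1.0):
    """Time at which a cell starting at r0_ohms drifts up to r_limit_ohms."""
    if nu <= 0 or r_limit_ohms <= r0_ohms:
        raise ValueError("need nu > 0 and r_limit_ohms > r0_ohms")
    return t0_seconds * (r_limit_ohms / r0_ohms) ** (1.0 / nu)

# Hypothetical MLC level at 1 MΩ whose upper read boundary sits at 2 MΩ.
for nu in (0.05, 0.10):  # hypothetical drift exponents
    t_cross = time_to_reach(1e6, 2e6, nu)
    print(f"nu={nu}: level crosses its boundary after {t_cross:.3g} s")
```

In this model, doubling the drift exponent takes the square root of the crossing time measured in units of t0, which is why a higher trap density, if it raises the exponent, can sharply shorten MLC retention.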

Conclusions
Although very significant progress has been made in PRAM, the fundamental understanding of phase change materials and PRAM devices is still weak. The technology challenges will continue even after PRAM is commercialized.

Acknowledgement
The author would like to thank the IBM/Macronix PCRAM team for their great contributions to this paper.

References
Fig. 1 Impact of early fail cells (called “tail bits” in this paper) on retention time and the proposed model, first published by B. Gleixner, et al., in 2007.

Fig. 2 R bitmaps of 5K cells from one of our PCM chips using doped Ge$_2$Sb$_2$Te$_5$, during 130°C retention test. (a) After RESET operation, (b) after 2 hours, and (c) after 50 hours at 130°C.

Fig. 3 An R-R plot is used to check the repeatability of tail bits after 10 hours at 130°C. The pattern and the inset imply that the tail bits in each bake are erratic events.

Fig. 4 A designed experiment for checking tail bit mechanism. If tail bits are caused by percolation path(s) after imperfect RESET operation and bake, the tail-bit rate should be much lower after the 2nd bake.

Fig. 5 Results of the designed experiment to check tail bit mechanism. (a) After the 1st 150°C bake, the tight RESET distribution became wider due to appearance of tail bits. (b) The $V_T$ stress test (1.1V) eliminated all tail bits. (c) After the 2nd 150°C bake, new tail bits repopulate from remnant high resistance cells, and the tail bits maintain the same failure rate.

Fig. 6 (a) Normal bits do not maintain high resistance forever; their resistance drops when the 130°C bake time is longer than 22 hours. (b) TEM image of a RESET cell and (c) the same cell after 150°C bake. The grain growth from the aGST/cGST boundary (white dotted curve) is directly observed.

Fig. 7 Retention model of PCM RESET cells. For normal cells, retention time is limited by the grain growth rate (red arrows) while tail bits are more likely driven by spontaneous nucleation generation.

Fig. 8 The failure probability plot presents two slopes: the shallower one (slope-B) from tail bits and the steeper one (slope-A) from grain growth. Based on slope-B and $E_a$, the storage temperature for 10-year retention at a 1ppb failure rate is 80°C.

Fig. 9 Chip-level data show that the retention time becomes shorter when the programming current is lower (limited by the word line voltage). From a scaling point of view, driving devices will provide less current in advanced nodes, and how to guarantee retention after scaling is a challenge for the future.

Fig. 10 For MLC operation, there are two programming approaches: by pulse amplitude or by pulse trailing edge (quenching time). Cells between 1MΩ and 2MΩ, programmed by the two approaches, are selected for detailed examination.

Fig. 11 Cells programmed with 2.5ns quenching time have a smaller aGST volume with a lower trap density. Cells programmed with 400ns trailing time show a bigger aGST volume and a higher trap density, which could lead to worse R-drift. R-drift in MLC is a challenge for future PRAM.