Improved Write Margin 6T-SRAM for Low Supply Voltage Applications

Farshad Moradi1, Dag T. Wisland1, Hamid Mahmoodi2, Tuan Vu Cao1
1Nanoelectronics Group, Department of Informatics, University of Oslo, NO-0316 Oslo, NORWAY
2School of Engineering San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94132, USA

Abstract: This paper proposes a new technique to increase the write margin of the 6T-SRAM cell. The technique reduces the area of a subthreshold SRAM cell and significantly improves the write cycle at a lower area overhead. A stacked PMOS network is used to perform the write operation. The technique is motivated by the behavior of 65nm devices in weak inversion and aims to decrease the area overhead of 6T-SRAM in the subthreshold region.

Keywords: SRAM, Sub-threshold, 65nm

I. Introduction:

Continued increase in process variability is perceived to be a major challenge to future technology scaling. Process variability stems from systematic effects, such as variations in critical dimensions and oxide thickness, as well as truly random effects, such as dopant fluctuations. Current design methodology hardly distinguishes systematic variations from truly random ones. Commonly, the entire process variability is lumped together and included in process corners. Traditionally, integrated circuit designs capture the impact of variability by satisfying the design constraints at various process corners, where the process corners are the extreme deviations of the process parameters from their typical values. For digital logic design, the worst-case corners typically capture 3 standard deviations. To satisfy the worst-case performance requirements, a large penalty is often paid in power. Also, most of today’s designs are power limited. In such cases, satisfying the power budget often requires a back-off in performance.

Memory design presents an extreme example of corner-based design. To satisfy the functionality of several tens of millions of SRAM cells, the designer has to capture 5 to 6 standard deviations of parameter variations. This is becoming increasingly challenging to satisfy, and may present a problem for continued scaling of memory density. Concurrently, high-end microprocessors have been increasing the amount of on-die cache to improve the performance.
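As a rough illustration of why memories need 5 to 6 standard deviations, the following minimal sketch estimates the per-cell margin required for a whole array to work. It assumes independent cells, a single Gaussian-distributed parameter, and a one-sided failure criterion; the function name `required_sigma`, the array sizes, and the 90% yield target are illustrative assumptions, not values from this paper.

```python
import math
from statistics import NormalDist


def required_sigma(num_cells: int, array_yield: float) -> float:
    """One-sided sigma margin per cell so that all cells pass with probability array_yield.

    Assumes independent cells and one Gaussian parameter (illustrative model only).
    """
    # Per-cell failure probability p solves (1 - p) ** num_cells == array_yield.
    # expm1/log keep precision when p is extremely small.
    p_fail = -math.expm1(math.log(array_yield) / num_cells)
    # Distance from the mean, in standard deviations, whose tail probability is p_fail.
    return -NormalDist().inv_cdf(p_fail)


if __name__ == "__main__":
    for cells in (1_000_000, 10_000_000, 50_000_000):  # hypothetical array sizes
        print(cells, round(required_sigma(cells, 0.90), 2))
```

Under these assumptions, array sizes from one million to fifty million cells all require a margin between 5 and 6 standard deviations, and the requirement grows with the number of cells.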

Presently, to achieve highest possible densities with high parametric yields in bulk-CMOS and SOI technologies, designers use a combination of multi-layered ad-hoc techniques and heuristics, which include device sizing, supply and threshold voltage selection, SRAM column height and sense-amplifier optimization, and redundancy and error correction techniques.

The paper is arranged as follows. Section II investigates the characteristics of the 6T-SRAM cell, shows the effects of process variations on SRAM operation in different modes, and presents techniques that enable the designer to scale the supply voltage further, including the proposed circuits. Section III presents simulation results and discusses the effect of process variations. Section IV gives the conclusions and a discussion of the simulation results.

II. SRAM Cell:

Fig. 1 shows the standard 6T-SRAM cell, in which the storage nodes are labeled X and Y. Suppose that node X stores “0” and node Y stores “1”. Because of the low supply voltage, leakage sources are important; they are shown in Fig. 1. In this case, M1 and M4 are turned on, and M2 and M3 are turned off. During the hold time, when WL is not selected (idle mode), M5 and M6 are turned off. In idle mode, leakage currents through M5 and M2 cause a small rise on node X, so node X sits at a few millivolts instead of zero. This increases the leakage through M3, potentially introducing failures [1].

When WL is selected (read cycle), M6 is turned off and M5 is turned on. The read is performed through M5 and M1, but the rise in the voltage of node X due to the stacking effect [2] causes more leakage through M3. This causes a drop in the voltage of node Y, so the read speed is degraded and in some cases the data on the storage nodes flips. The most direct way to improve read speed is to upsize M1 and M5, but there are limits on how far the transistor sizes can be increased. Increasing the ratio WM1/WM5 discharges BL faster, but upsizing these devices raises the voltage of node X, which increases the leakage current through M3 and discharges node Y further. This in turn increases the leakage through the PMOS, which is higher than that of the NMOS in 65nm CMOS technology at ultra-low supply voltages.

In the write cycle, when WL=1, the storage nodes are discharged through the write path, which consists of three stacked NMOS transistors including the NMOS pass transistor of the 6T-SRAM cell. Fig. 2 shows the write path in the 6T-SRAM cell. As shown, the write speed depends on these three stacked NMOS transistors. When Data=0, Write=1, and WL=1, node X is discharged through M7, M9, and the NMOS pass transistor. Suppose that X stores “1”; in write mode it is then discharged through the stacked NMOS transistors, which contend with PMOS transistor M3 as it tries to hold the stored data on node X. The sizing of the NMOS transistors therefore plays an important role in write speed. For high enough supply voltages, relatively small NMOS transistors suffice to discharge the storage node, but for lower supply voltages or sub-threshold design, M7-M10 must be upsized considerably. In [3] it is shown that PMOS transistors are faster than NMOS transistors at lower supply voltages. Implementing the write path with PMOS transistors therefore improves the write cycle. The proposed circuitry is shown in Fig. 3.

As can be seen in Fig. 3, in the write cycle, when WRITE=1, DATA=1, and WWL=0 (RDWL=0), node X is charged to VDD through three stacked PMOS transistors. As shown in [3], three stacked PMOS transistors are much faster than three stacked NMOS transistors. In this design the write and read cycles are separated and are performed in two different modes controlled by the WWL and RDWL signals. Simulation results show that this configuration has a much faster write cycle than the circuit of Fig. 2.

Another method, which is preferred, is to use PMOS transistors for both the write and read modes. Fig. 4 shows the configuration of this circuit, in which M5 and M6 are PMOS transistors. Two NMOS transistors are used to discharge BL and BLB to ground when WWL=1.

When WWL=0, the cell is not in hold mode, and its operation depends on the WRITE signal. When WRITE=1, the circuit is in write mode and the storage node values are changed according to the DATA value. If WRITE=0, the circuit is in read mode and BL and BLB change. Suppose that the circuit has been in hold mode, so BL and BLB are at zero. In the read cycle, when WWL=0, if node X stores “1”, BL is charged to “1” through M5 while BLB is unchanged and stays at zero.
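The mode selection described above can be summarized with a small behavioral sketch. This is a logic-level illustration only, not a circuit model: the class name `PmosAccessCell` and its fields are hypothetical, node Y is assumed to hold the complement of X after a write, and bit-line behavior during a write is not modeled.

```python
from dataclasses import dataclass


@dataclass
class PmosAccessCell:
    """Logic-level sketch of the all-PMOS-access cell (hypothetical model)."""
    x: int = 0    # storage node X
    y: int = 1    # storage node Y (assumed complement of X)
    bl: int = 0   # bit line BL
    blb: int = 0  # bit line BLB

    def step(self, wwl: int, write: int, data: int = 0) -> str:
        if wwl == 1:
            # Hold: PMOS access devices are off and both bit lines are discharged.
            self.bl = self.blb = 0
            return "hold"
        if write == 1:
            # Write: the storage nodes take the DATA value.
            self.x, self.y = data, 1 - data
            return "write"
        # Read: starting from discharged bit lines, only the side storing "1"
        # charges its bit line through the PMOS access device.
        self.bl, self.blb = self.x, self.y
        return "read"


if __name__ == "__main__":
    cell = PmosAccessCell()
    print(cell.step(wwl=0, write=1, data=1), cell.x, cell.y)
    print(cell.step(wwl=1, write=0), cell.bl, cell.blb)
    print(cell.step(wwl=0, write=0), cell.bl, cell.blb)
```

The sequence in the usage example writes a “1”, returns to hold, and then reads, which mirrors the case discussed in the text where BL rises and BLB stays at zero.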

Because PMOS transistors evaluate faster in sub-threshold mode, both read and write cycle speed are improved. The supply voltage can also be lowered further, because the stacked PMOS transistors can be upsized more. This is not possible in the standard 6T-SRAM cell, because the write cycle does not complete at lower supply voltages, so very large transistors are needed to discharge the storage nodes in the write cycle. There are some ways to work around this write margin problem. One of them is to use an isolation transistor to isolate M3 and M4 from VDD during the write cycle. However, this technique increases the area and also degrades reliability because of the PMOS transistor between the VDD rail and M3 and M4.

Fig. 4 shows the proposed SRAM design for sub-threshold applications. The proposed SRAM works as follows. In hold mode, when WWL=1, M5 and M6 are turned off and isolate the storage nodes from the bit lines. During the hold cycle, BL and BLB are discharged to ground. In the read cycle, when WWL=0 and WRITEBAR=“1”, BL and BLB are charged according to the values on storage nodes X and Y. Suppose that X=“1” and Y=“0”; then BL is charged to “1” through M5, and this turns on M2 to discharge node Y. In this case, M6 is turned off and isolates Y from BLB, so BLB cannot be charged. There is only a leakage through M6, which helps lower node Y during this mode. BL and BLB are then sensed by the sense amplifier to read the data. Because PMOS is faster than NMOS in sub-threshold design, read speed is improved, although it depends on the sizing of M5 and M6. During write mode, as can be seen in Fig. 5, when WWL=“0” and WRITEBAR=“0”, the input data values are written to the storage nodes through PMOS transistors. The write path consists of three stacked PMOS transistors, which are much faster than three stacked NMOS transistors, so no additional technique is needed to separate M3 and M4 from the supply voltage rail during the write cycle. As an example, the write circuit of the proposed design (two stacked PMOS transistors) is sized at WP=0.8um, whereas stacked NMOS devices must be sized at W=2um to reach the same write speed. This shows that a lower area can be obtained by using PMOS transistors in subthreshold design. As can be seen in Fig. 6, the circuit operates correctly.
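The sizing example can be turned into a quick width comparison. The sketch below uses the two widths quoted in the text and assumes, as a simplification not stated in the paper, that both devices share the same channel length, so that width is a fair proxy for device area. The helper name `width_saving` is hypothetical.

```python
W_PMOS_UM = 0.8  # stacked PMOS write-device width from the text, in micrometres
W_NMOS_UM = 2.0  # stacked NMOS width needed for the same write speed, in micrometres


def width_saving(w_new: float, w_ref: float) -> float:
    """Fractional width reduction of w_new relative to w_ref."""
    return 1.0 - w_new / w_ref


if __name__ == "__main__":
    # Assumes equal channel length, so the width ratio approximates the area ratio.
    print(f"Per-device width saving: {width_saving(W_PMOS_UM, W_NMOS_UM):.0%}")
```

Under this assumption, each PMOS write device is 2.5 times narrower than the NMOS device it replaces.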

III. Simulation Results:

Static noise margin (SNM) is used as a metric of SRAM stability. The butterfly-curve approach [4] is used to find the SNM in read and hold modes. Fig. 7 shows the butterfly curve for a 6T-SRAM cell at VDD=0.4V. As can be seen, lowering the supply voltage in submicron technologies degrades the butterfly curve, and hence the SNM, because of the lower drive current and process variations [5].
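To make the butterfly-curve approach concrete, the following minimal sketch extracts the SNM as the side of the largest square that fits inside the smaller lobe of the butterfly plot. It is not the simulation setup used in this paper: the sigmoid transfer curve `inverter_vtc`, its `gain` and switching-point parameters, and all numeric values are illustrative assumptions standing in for simulated inverter characteristics.

```python
import math


def inverter_vtc(v_in, vdd, v_m, gain):
    """Assumed sigmoid inverter transfer curve with slope -gain at v_in = v_m."""
    return vdd / (1.0 + math.exp(4.0 * gain * (v_in - v_m) / vdd))


def lobe_square(f_a, f_b, vdd, steps=400):
    """Side of the largest square between the curves y = f_a(x) and x = f_b(y)."""
    best = 0.0
    for i in range(steps + 1):
        y0 = vdd * i / steps  # bottom-left corner (x0, y0) lies on x = f_b(y)
        x0 = f_b(y0)

        def gap(s):
            # Positive while the top-right corner (x0 + s, y0 + s) is below y = f_a(x).
            return f_a(x0 + s) - (y0 + s)

        if gap(0.0) <= 0.0:
            continue  # no room for a square starting at this corner
        lo, hi = 0.0, vdd  # gap is decreasing in s and gap(vdd) < 0
        for _ in range(60):  # bisection for the side length where gap reaches zero
            mid = 0.5 * (lo + hi)
            if gap(mid) > 0.0:
                lo = mid
            else:
                hi = mid
        best = max(best, lo)
    return best


def butterfly_snm(vdd, v_m1, v_m2, gain):
    """SNM as the smaller of the two lobe squares of the butterfly plot."""
    f1 = lambda v: inverter_vtc(v, vdd, v_m1, gain)
    f2 = lambda v: inverter_vtc(v, vdd, v_m2, gain)
    return min(lobe_square(f1, f2, vdd), lobe_square(f2, f1, vdd))


if __name__ == "__main__":
    for vdd in (1.0, 0.6, 0.4):  # hypothetical supply voltages
        print(vdd, round(butterfly_snm(vdd, vdd / 2, vdd / 2, gain=10.0), 4))
```

In this idealized model the curve shape scales with the supply, so the SNM shrinks in proportion to VDD; a mismatch between the two switching points shrinks one lobe and lowers the SNM further.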

Several metrics have been used to find the write static noise margin (WSNM) [6-10]. In this paper we use the most common approach, which uses SNM as the criterion [7] and is shown in Fig. 8. To avoid a write failure, the curves should have exactly one crossing point, which indicates that the cell is monostable. As an example, the write SNM of the standard 6T-SRAM cell for the TT model at T=27°C is 0.15V, which shows that the write does not fail. Fig. 9 shows the WSNM results for the 6T-SRAM cell at VDD=0.5V. In this case the method in [7] has been used to see the difference between the results.
The benchmark topology used to find the write SNM is the one used in [9], illustrated in Fig. 10. The WSNM results for the proposed SRAM are illustrated in Fig. 11. As shown, the WSNM is improved by 47% in some cases.
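The single-crossing criterion for a successful write can be checked numerically by counting the intersections of the two transfer curves. The sketch below is only an illustration under strong assumptions: the sigmoid `vtc`, its gain, and the `pull` factor that crudely models the access path dragging the written node low are all hypothetical and are not taken from the method in [7].

```python
import math


def vtc(v_in, vdd=0.5, gain=10.0):
    """Assumed sigmoid inverter transfer curve (illustrative, not simulated data)."""
    return vdd / (1.0 + math.exp(4.0 * gain * (v_in - vdd / 2.0) / vdd))


def written_vtc(v_in, pull=0.2):
    """Hypothetical write-mode curve: the output level is scaled down by `pull`."""
    return pull * vtc(v_in)


def count_crossings(f_a, f_b, vdd, steps=2000):
    """Count intersections of y = f_a(x) and x = f_b(y) for x in [0, vdd]."""
    crossings, last_sign = 0, 0
    for i in range(steps + 1):
        x = vdd * i / steps
        h = f_b(f_a(x)) - x  # zero exactly at an intersection
        sign = (h > 0) - (h < 0)
        if sign and last_sign and sign != last_sign:
            crossings += 1
        if sign:
            last_sign = sign  # samples that are exactly zero are skipped
    return crossings


if __name__ == "__main__":
    print("hold :", count_crossings(vtc, vtc, 0.5))
    print("write:", count_crossings(vtc, written_vtc, 0.5))
```

With these assumed curves, the undisturbed cell shows three crossings (two stable states and one metastable point), while the written cell shows a single crossing, which is the monostable condition required for the write to succeed.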

Fig. 12 shows the SNM in both read and hold modes for a 6T-SRAM cell. As can be seen, the noise margin is very low at very low supply voltages. Monte Carlo simulation is used to see the effect of process variations on SNM. Fig. 13 shows the Monte Carlo simulation of SNM for a 6T-SRAM cell. As can be seen, the standard deviation (σ) is around 5.6mV. The data have been acquired using Monte Carlo simulation based on mismatch and process variations for a standard 6T-SRAM cell. The figure shows that the variation of SNM in read mode is high.
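The idea behind a Monte Carlo SNM analysis can be sketched with a toy model. The example below is not the transistor-level simulation used in this paper: it assumes two ideal step-like inverters whose switching points vary independently with a Gaussian spread, for which the SNM reduces to the smallest distance of either switching point from the supply rails. The function names, the spread `sigma_vm`, and the run count are illustrative assumptions.

```python
import random
import statistics


def ideal_snm(vdd, vm1, vm2):
    """SNM of a latch built from two ideal step-like inverters (assumed model)."""
    return max(0.0, min(vm1, vdd - vm1, vm2, vdd - vm2))


def monte_carlo_snm(vdd=0.4, sigma_vm=0.02, runs=5000, seed=1):
    """Mean and standard deviation of SNM under Gaussian switching-point mismatch."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    samples = [
        ideal_snm(vdd, rng.gauss(vdd / 2, sigma_vm), rng.gauss(vdd / 2, sigma_vm))
        for _ in range(runs)
    ]
    return statistics.mean(samples), statistics.stdev(samples)


if __name__ == "__main__":
    mean_snm, sigma_snm = monte_carlo_snm()
    print(f"mean SNM = {mean_snm * 1e3:.1f} mV, sigma = {sigma_snm * 1e3:.1f} mV")
```

Even in this simplified model, mismatch both lowers the mean SNM below its nominal value and spreads it out, which is the qualitative behavior that a Monte Carlo histogram of SNM captures.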

The effect of temperature on SNM in read and hold modes is shown in Fig. 14. As can be seen, the effect of temperature is small: the maximum change in SNM due to temperature is 18mV in the read cycle and 27mV in hold mode for the 6T-SRAM cell. As can be seen in Fig. 15, using a PMOS transistor as the pass transistor (W/L=0.2/0.06, compared to W/L=0.4/0.06 for the standard 6T-SRAM cell) gives acceptable results relative to the standard SRAM cell. The hold SNM is improved because of the smaller PMOS pass transistor and the lower leakage current of the proposed SRAM cell.

IV. Conclusions:

In this paper we showed that using PMOS transistors as pass transistors and in the write path performs much better than using NMOS transistors. The results showed a significant improvement in write SNM with a lower area than the standard 6T-SRAM cell. With the proposed circuit, the write static noise margin is improved by around 50% for the TT CMOS model. Simulation results showed that the area overhead of the proposed circuit is lower than that of the standard 6T-SRAM cell because of the smaller pass transistors.

References