Cache memory is essentially a “geographically close” copy of the part of main memory that the processor's cores are working on at any given moment. It serves three main purposes:
- Reduce the latency, and thus the execution time, of each instruction.
- Reduce the power consumed by the interface between processor and memory.
- Eliminate memory access conflicts caused by multiple clients.
Since cache capacity is limited, when the CPU or GPU looks for data and cannot find it at one level, it moves on to the next, with the last level being the one with the greatest storage capacity. Logically, then, it would seem sensible to build caches from memory with higher storage density, such as DRAM. So why is DRAM not used as a cache?
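The level-by-level search described above can be sketched as a simple lookup loop. The level names, latencies, and contents here are purely illustrative assumptions, not figures from any real processor:

```python
# Hypothetical sketch of a multi-level cache lookup: the request walks
# down the hierarchy (L1 -> L2 -> L3 -> DRAM) until the address is found.
# Latencies (in cycles) are made-up illustrative values.

LEVELS = [
    ("L1", 4),      # smallest, fastest
    ("L2", 12),
    ("L3", 40),     # last level: largest capacity among the caches
    ("DRAM", 200),  # main memory, which always holds the data in this model
]

def lookup(address, contents):
    """Return (level, accumulated cycles) for the first level holding `address`."""
    cycles = 0
    for name, latency in LEVELS:
        cycles += latency
        if name == "DRAM" or address in contents.get(name, set()):
            return name, cycles
    raise RuntimeError("unreachable: DRAM always hits")

# Example: address 0x40 sits only in L3, so L1 and L2 miss first.
contents = {"L1": set(), "L2": set(), "L3": {0x40}}
level, cycles = lookup(0x40, contents)
```

Note how every miss adds its level's latency before the next level is tried, which is why a hit in a distant level is so much more expensive than a hit in L1.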
DRAM vs. SRAM
Two types of memory, SRAM and DRAM, form the basis of the different kinds of volatile memory used in PCs throughout their history.
SRAM owes its name to Static Random Access Memory. It is the older of the two technologies: it served as main memory in the earliest PCs before being displaced by DRAM and its greater storage capacity. An SRAM cell uses six transistors in total to store a single bit of data.
DRAM, on the other hand, owes its name to Dynamic Random Access Memory and stores each bit with one transistor and one capacitor. A DRAM cell therefore occupies far less area, allowing more bits to fit in the same silicon, which makes it ideal as volatile memory for temporarily holding large amounts of data. And yet DRAM is not used as a cache.
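The density argument can be put into rough numbers. Treating each device (transistor or capacitor) as one unit of area is a deliberate oversimplification, and real cell layouts differ, but it conveys why the 1T1C cell packs tighter than the 6T cell:

```python
# Back-of-the-envelope density comparison between the two cell types.
# Counting each device as roughly one unit of area is an assumption for
# illustration only; actual cell areas depend on the process and layout.

SRAM_DEVICES_PER_BIT = 6   # classic 6-transistor (6T) SRAM cell
DRAM_DEVICES_PER_BIT = 2   # 1 transistor + 1 capacitor (1T1C) DRAM cell

density_advantage = SRAM_DEVICES_PER_BIT / DRAM_DEVICES_PER_BIT
print(density_advantage)  # 3.0 under this crude per-device model
```

In practice the gap is even wider, since DRAM capacitors can be built vertically (trench or stacked) rather than taking up planar area.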
Why is DRAM not used as a cache?
The reason is very simple: both DRAM and SRAM store data as an electric charge. In DRAM, that charge is held in the capacitor, which leaks over time, so the memory must be refreshed periodically.
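A toy model makes the leakage concrete. Assuming the capacitor's charge decays exponentially (the time constant below is invented; real DDR modules are refreshed roughly every 64 ms per JEDEC specifications), a stored '1' eventually becomes unreadable unless it is rewritten:

```python
# Toy model of DRAM cell leakage: exponential decay of capacitor charge.
# tau_ms is a made-up retention constant, chosen only for illustration.
import math

def charge_after(t_ms, initial=1.0, tau_ms=100.0):
    """Fraction of charge remaining after t_ms with no refresh."""
    return initial * math.exp(-t_ms / tau_ms)

THRESHOLD = 0.5  # below this, the sense amplifier can no longer read a '1'

# Left alone long enough, the stored '1' drops below the read threshold...
lost = charge_after(200) < THRESHOLD
# ...but a refresh (rewriting full charge) every 50 ms keeps it readable.
kept = charge_after(50) > THRESHOLD
```

The refresh operation itself is what matters for the cache argument: rewriting the charge takes time during which the array cannot serve reads.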
It is precisely during this refresh period that the CPU would be unable to access the data in a cache built from DRAM, and it would therefore stall in a bubble of dead time. To date, if Intel, AMD, or NVIDIA wanted to use DRAM inside their processors, at least at the last level, it would force them back to the drawing board for the entire design.
One trick used in some designs with DRAM as the last-level cache is to place SRAM at the same level of the hierarchy and have the CPU communicate only with that SRAM. An internal DMA mechanism copies data directly from the embedded DRAM into the SRAM. Because the whole combination sits inside the processor, it has much lower latency than external RAM and pairs the capacity advantage of DRAM with the speed of SRAM.
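The arrangement just described can be sketched as a small class: the CPU reads only from an SRAM front buffer, and a DMA-style copier fills it from the larger embedded DRAM behind it. Class and method names here are hypothetical, invented for the sketch:

```python
# Sketch of the SRAM-fronted eDRAM last-level cache idea. The CPU never
# touches the eDRAM directly; a DMA engine moves lines into SRAM first.
# All names and structures are illustrative assumptions.

class HybridLastLevelCache:
    def __init__(self):
        self.edram = {}  # large embedded DRAM store: address -> data
        self.sram = {}   # small SRAM buffer the CPU actually reads from

    def dma_fill(self, address):
        """DMA engine copies a line from eDRAM into the SRAM buffer."""
        if address in self.edram:
            self.sram[address] = self.edram[address]

    def cpu_read(self, address):
        """The CPU reads only from SRAM; a miss triggers a DMA fill first."""
        if address not in self.sram:
            self.dma_fill(address)
        return self.sram.get(address)

llc = HybridLastLevelCache()
llc.edram[0x100] = "payload"
value = llc.cpu_read(0x100)  # DMA copies the line in, then the CPU reads it
```

The key property is that refresh-related dead time is confined to the eDRAM side: the CPU's own accesses always land on SRAM, which needs no refresh.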