DRAM caches have emerged as an efficient new layer in the memory hierarchy to address the increasing diversity of memory components. When a small amount of fast memory is combined with slow but large memory, a cache-based organization of the fast memory can provide a SW-transparent solution for hybrid memory systems. In such DRAM cache designs, effectiveness is affected by the bandwidth and latency of both the fast and slow memory. To quantitatively assess the effect of memory configurations and application patterns on DRAM cache designs, this article first investigates how three prior approaches perform across six hybrid memory scenarios. From the investigation, we observe that no single DRAM cache organization always outperforms the others across the diverse hybrid memory configurations and memory access patterns. Based on this observation, this article proposes a reconfigurable DRAM cache design that can adapt to different hybrid memory combinations and workload patterns. Unlike the fixed tag and data arrays of conventional on-chip SRAM caches, this study advocates exploiting the flexibility of DRAM caches, which can store tags and data in DRAM in any arbitrary layout. Using a sample-based mechanism, the proposed DRAM cache controller dynamically finds the best organization among three candidates and applies it by reconfiguring the tag and data layout in the DRAM cache. Our evaluation shows that the proposed morphable DRAM cache can outperform the fixed DRAM cache organizations across six hybrid memory configurations.

The advent of new DRAM technologies, such as 3D stacked memory (Lee et al. 2014; Pawlowski 2011) and reduced-latency DRAM (RL-DIMM) (Micron 2016), has been increasing the heterogeneity of DRAM. In addition to the diversification of DRAM technologies, new non-volatile memory technologies have emerged to complement DRAM as high-capacity memory (Wong et al.). Such increasing heterogeneity in memory components has enabled composite memory systems consisting of two or more different memory technologies, such as hybrid HBM-DDR (Sodani et al.). Commonly, such a composite or hybrid memory system consists of low-latency, high-bandwidth near memory (fast memory) and capacity-oriented far memory (slow memory). Based on the temporal and spatial locality of data, data are migrated between the two memory types to provide fast access to frequently used data (Chou et al.; Loh and Hill 2011; Qureshi and Loh 2012; Sim et al.). Among the various approaches to organizing hybrid memory, one of the most promising techniques is to use the fast memory as a memory-based cache. Such DRAM-based caches can provide a SW-transparent performance improvement with HW-controlled data mapping and migration between the DRAM cache and the slow backing memory (Jevdjic et al. 2017).

Such DRAM cache organizations have advanced to reduce latencies while improving hit rates. DRAM caches commonly store both tags and data in DRAM instead of in a separate SRAM tag array, since SRAM tags would require a large SRAM storage as DRAM cache capacity grows. The main differences among prior DRAM cache designs stem from how tags are organized. Unlike in traditional on-chip SRAM caches, accessing tags is costly because the tags are also stored in the relatively slow DRAM. Due to this tag access limitation, prior DRAM cache organizations trade off associativity against hit latency. For example, LH cache supports 29 ways, as it places 29 data blocks and their tags in a single DRAM row (Loh and Hill 2011). However, to support such high associativity, it requires accessing DRAM three times for tag accesses alone. On the other hand, Alloy cache is a direct-mapped cache that allows a single access to retrieve both tag and data over an extended 72B interface (Qureshi and Loh 2012). In addition, the block size poses another tradeoff: a large block can reduce the tag overhead and amortize the migration cost, but it can waste the capacity of the fast memory if spatial locality is low (Jevdjic et al.).

Motivated by the tradeoffs in the prior DRAM cache designs, this article first investigates effective DRAM cache organizations for several different hybrid memory scenarios. Using HBM, DDR, PCM, and NAND technologies, this article constructs six different hybrid memory systems and evaluates the prior approaches with a range of applications on the six memory configurations. Unlike the results from prior work, this article reports that the best DRAM cache organization is highly affected by the hybrid memory configuration as well as the memory access patterns of the application. Our evaluation found that no single design is superior to the others across the diverse application workloads and hybrid memory configurations.
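To make the row-layout tradeoff between an LH-style set-associative cache and an Alloy-style direct-mapped cache concrete, here is a small back-of-the-envelope sketch. The parameters are assumptions, not figures from the article: a 2 KB DRAM row, 64 B cache blocks, and a 4 B per-block tag entry.

```python
# Back-of-the-envelope comparison of two DRAM-cache row layouts.
# Assumed parameters (not specified in the article): 2 KB row,
# 64 B blocks, 4 B per-block tag entries.
ROW_BYTES = 2048
BLOCK_BYTES = 64
TAG_BYTES = 4

def lh_style_ways(row=ROW_BYTES, block=BLOCK_BYTES, tag=TAG_BYTES):
    """LH-style layout: one set per row, tags and data sharing the row.
    Returns the largest way count W with W * (block + tag) <= row."""
    return row // (block + tag)

def alloy_style_entries(row=ROW_BYTES, block=BLOCK_BYTES):
    """Alloy-style layout: direct-mapped, with each 64 B block paired
    with its tag in an extended 72 B burst, so the row is divided into
    row // block tag-and-data entries."""
    return row // block

print(lh_style_ways())       # 30 ways fit arithmetically (LH reserves 29)
print(alloy_style_entries()) # 32 direct-mapped entries per row
```

The arithmetic shows why the set-associative layout sacrifices some row capacity to tags, while the direct-mapped layout keeps every 64 B slot for data at the cost of associativity.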
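The sample-based selection described above could, in principle, look like the following sketch. Everything here is illustrative rather than taken from the article: the candidate names, the hit-rate-only scoring, and the `MorphableController` class are assumptions. The idea is that a small fraction of sets is dedicated to each candidate organization, per-candidate hit rates are tracked, and the controller periodically reconfigures the whole cache to the current winner.

```python
# Illustrative sketch of sample-based selection among candidate DRAM
# cache organizations. Names and numbers are assumptions for exposition.
from dataclasses import dataclass

@dataclass
class SampleCounters:
    hits: int = 0
    accesses: int = 0

    def record(self, hit: bool):
        self.accesses += 1
        self.hits += hit

    def hit_rate(self) -> float:
        return self.hits / self.accesses if self.accesses else 0.0

class MorphableController:
    """Tracks per-candidate hit rates on sampled sets, then picks a winner."""

    def __init__(self, candidates):
        self.samples = {name: SampleCounters() for name in candidates}
        self.current = candidates[0]

    def record_sampled_access(self, candidate: str, hit: bool):
        # Called only for accesses that map to this candidate's sampled sets.
        self.samples[candidate].record(hit)

    def maybe_reconfigure(self):
        best = max(self.samples, key=lambda c: self.samples[c].hit_rate())
        if best != self.current:
            self.current = best  # in hardware: rewrite the tag/data layout
        return self.current

ctrl = MorphableController(["direct-mapped", "set-associative", "large-block"])
for _ in range(100):
    ctrl.record_sampled_access("direct-mapped", hit=False)
    ctrl.record_sampled_access("set-associative", hit=True)
ctrl.maybe_reconfigure()
print(ctrl.current)  # set-associative
```

A real controller would also weigh hit and miss latencies of the underlying fast and slow memories, not hit rate alone, since the article's central point is that the best organization depends on the hybrid memory configuration.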