Abstract:
With the advent of the many-core era, scalable hardware support for cache coherence has
become vital to system performance. Cache coherence protocols are provided in order
to ensure that multiple cached copies of a single memory block are kept up-to-date.
As the number of cores integrated on a single chip grows rapidly, the scalability of cache coherency presents a promising research opportunity. Cache coherency models are broadly based on either snoopy or directory-based coherence protocols. While snoopy coherence is unscalable because of its dependence on ordered networks that are inherently difficult to scale, directory-based coherence is weighed down
by excessive directory area overhead and by inaccurate sharer tracking caused by compressed sharer bits. In this work, we propose a scalable cache coherency model for multicore and many-core processors through a hardware and software co-design. We begin by modeling the performance metrics of current cache coherence protocols for high-performance multicore systems interconnected through regular packet-switched network-on-chip architectures, and we identify the bottlenecks they impose on system performance. We then design and develop a network-on-chip architecture augmented with wireless interconnects for efficient handling of broadcast traffic. We propose
a segmented design applicable to every level of cache memory according to the sharing pattern of the memory blocks among the cores. Finally, we design and implement an efficient and scalable cache coherence protocol that can exploit the proposed wireless-interconnect-based network-on-chip architecture and the sharing-pattern-aware
cache segmentation. We demonstrate that our proposed architecture improves upon the results produced by some well-known multicore architectures employing conventional cache coherence protocols. Moreover, owing to the modularity of the proposed design, it can be extended to future many-core systems by increasing the levels of hierarchy of interconnects, memory, and cache coherency.