The semiconductor industry has been driven by digital computation using binary digits, following Moore’s law for over half a century. However, the demands of emerging applications such as artificial intelligence (AI), cloud computing, exascale computing, and the Internet of Things (IoT) require scalable computational resources. Implementing these applications sometimes requires solving a certain class of problems, such as optimization problems, that require stochasticity to arrive at a solution. It should be noted that optimization problems can be solved using digital computation that includes a pseudo-random number generator; however, the area and power overhead of implementing a pseudo-random number generator is significant. One way to reduce this overhead is to choose a computational paradigm in which stochasticity is inherent in the fundamental building block. A recently proposed computational paradigm, probabilistic computing, has shown promising results as an energy- and area-efficient alternative to digital computation. The fundamental building block of probabilistic computing is the probabilistic bit, or p-bit. A p-bit is a classical entity with logic levels 0 and 1, similar to a bit; unlike a bit, however, a p-bit fluctuates between the two logic levels. P-bits can be realized using different semiconductor devices such as metal-oxide-semiconductor field-effect transistors (MOSFETs), diodes, and low-barrier magnets (LBMs). Among these alternatives, the LBM-based implementation has shown promising results in reducing area and power overhead. This work provides a comprehensive overview of probabilistic computing, from material physics to the system level, to optimize the entire stack from algorithms to device design and to address challenges in hardware implementation that affect the performance and robustness of applications using probabilistic computing.

In this thesis, the design of an LBM for integration into the one-transistor-one-magnetic-tunnel-junction spin-transfer torque magnetic random access memory (1T-1MTJ STT-MRAM) structure is investigated. A method is proposed for selecting material parameters in the LBM-based p-bit implementation to improve flips per second (fps), a critical system-level metric. The significance of specific material properties in the LBM design is highlighted. It is demonstrated that, beyond material selection, several design parameters are significantly influenced by process-induced variations and are therefore critical to the development of a robust p-bit-based computational system.

Subsequently, the impact of non-idealities, such as process variations, environmental factors, and ageing, on the performance of p-bit networks is systematically investigated. An analytical model is proposed to incorporate these non-idealities, and the model’s predictions are validated using numerical and Simulation Program with Integrated Circuit Emphasis (SPICE) simulations. For demonstration, an image completion problem for digits (0 to 9) is implemented using non-ideal p-bits. The analytical model closely aligns with the behavioral model, revealing that non-idealities in p-bits significantly affect the performance of probabilistic computing. Furthermore, the impact of these non-idealities on p-bit-based implementations is demonstrated in circuit simulations using SPICE models. This work highlights the importance of considering process-induced variations when designing p-bit networks; by incorporating these considerations, the performance and robustness of p-bit networks can be enhanced, paving the way for their real-world application.
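To make the behavioral picture concrete, the sketch below implements the tanh-based p-bit update commonly used in the probabilistic-computing literature, m_i = sgn(tanh(I_i) − r) with r drawn uniformly from (−1, 1) and I_i the weighted input from the other p-bits; clamping a subset of p-bits mirrors how known pixels are held fixed in an image-completion task. This is a minimal illustrative sketch: the network size, coupling values, and function names are assumptions, not the specific device or network models developed in the thesis.

```python
import numpy as np

def pbit_sweep(m, J, h, clamped=()):
    """One asynchronous update sweep over a network of p-bits.

    Uses the tanh-based behavioral p-bit model common in the literature:
        m_i = sgn(tanh(I_i) - r),  r ~ Uniform(-1, 1),
    where I_i = sum_j J[i, j] * m[j] + h[i] is the input to p-bit i.
    Indices in `clamped` are held fixed, as when known pixels are
    clamped in an image-completion task.
    """
    for i in np.random.permutation(len(m)):
        if i in clamped:
            continue                              # clamped p-bits do not fluctuate
        I_i = J[i] @ m + h[i]                     # weighted input from the network
        m[i] = np.sign(np.tanh(I_i) - np.random.uniform(-1.0, 1.0))
    return m

# Toy usage: four p-bits with a random symmetric coupling matrix
np.random.seed(0)
J = np.random.normal(size=(4, 4)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
h = np.zeros(4)
m = np.random.choice([-1.0, 1.0], size=4)
for _ in range(100):
    m = pbit_sweep(m, J, h, clamped={0})          # p-bit 0 acts as a fixed input
```

In this idealized form every p-bit follows the tanh activation exactly; the non-idealities studied in the thesis (process variations, environmental factors, ageing) would perturb the effective activation and bias of each p-bit away from this behavior.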
In line with efforts to enhance the robustness of p-bit networks, this work investigates the impact of faults arising from fabrication defects, ageing, and variability in the p-bits, which lead to stuck-at faults that can degrade system functionality. The effect of such faults is examined using the Modified National Institute of Standards and Technology (MNIST) dataset, and a mutual information-based criticality score (CS) is proposed to guide fault-tolerance strategies. To further improve fault resilience, testable, isolatable, and fault-tolerant p-bit architectures are proposed and validated through SPICE simulations using 14 nm fin field-effect transistor (FinFET) technology. Testable p-bits integrate conventional p-bits with scan cells, introducing controllability and observability into the network. Isolatable p-bits enable the disconnection of faulty p-bits, while fault-tolerant p-bits restore network functionality by replacing faulty p-bits with redundant counterparts. By selectively replacing only the most critical p-bits, accuracy degradation is minimized with limited overhead, demonstrating an effective framework for fault-tolerant p-bit systems.

Next, probabilistic computing is explored from an algorithmic perspective. The effectiveness of a p-bit system is demonstrated on tasks such as image completion, where the system uses partially clamped inputs, for example images of digits (0 to 9), to generate a complete output. Additionally, a method is proposed to sparsify a probabilistic computing network by leveraging mutual information, a concept from information theory. The findings show that the proposed method is computationally efficient and can produce a sparse network with only 42% of the original connections while delivering accuracy comparable to that of the fully connected network.

In summary, this work comprehensively investigates probabilistic computing, spanning device materials, circuit-level implementations, and system-level design considerations, focusing on understanding non-idealities and enabling fault tolerance for practical applications.
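As a rough illustration of the mutual-information idea behind both the criticality score and the sparsification method, the sketch below estimates pairwise mutual information from sampled binary p-bit states and keeps only the highest-ranked couplings; the 42% retention figure from the abstract is used as the keep fraction. This is a minimal sketch under assumed data structures (a T×n array of 0/1 samples and a coupling matrix J) and is not the exact scoring or pruning procedure developed in the thesis.

```python
import numpy as np
from itertools import combinations

def pairwise_mi(samples):
    """Mutual information (bits) between every pair of binary p-bit traces.

    `samples` is a (T, n) array of 0/1 states collected over T time steps.
    Returns a dict mapping each pair (i, j) to its estimated MI.
    """
    _, n = samples.shape
    mi = {}
    for i, j in combinations(range(n), 2):
        joint = np.array([[np.mean((samples[:, i] == a) & (samples[:, j] == b))
                           for b in (0, 1)] for a in (0, 1)])
        px, py = joint.sum(axis=1), joint.sum(axis=0)
        mi[(i, j)] = sum(joint[a, b] * np.log2(joint[a, b] / (px[a] * py[b]))
                         for a in (0, 1) for b in (0, 1) if joint[a, b] > 0)
    return mi

def sparsify(J, samples, keep_fraction=0.42):
    """Keep only the couplings between the most mutually informative p-bit pairs."""
    mi = pairwise_mi(samples)
    ranked = sorted(mi, key=mi.get, reverse=True)            # strongest pairs first
    keep = ranked[: int(keep_fraction * len(ranked))]
    J_sparse = np.zeros_like(J)
    for i, j in keep:
        J_sparse[i, j], J_sparse[j, i] = J[i, j], J[j, i]    # retain symmetric coupling
    return J_sparse
```

The same pairwise MI estimates could, in principle, be aggregated per p-bit to rank how critical each node is to network behavior, which is the spirit of the criticality score used to prioritize which faulty p-bits to replace.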