First Advisor

Christof Teuscher

Term of Graduation

Summer 2020

Date of Publication


Document Type


Degree Name

Doctor of Philosophy (Ph.D.) in Electrical and Computer Engineering


Electrical and Computer Engineering



Physical Description

1 online resource (xviii, 111 pages)


The goal of this thesis is to design fast, low-power, robust graph-based inference systems. Our approach to this is by using (1) in-memory computing, (2) approximate computing principles, and (3) memristive devices.

This work is motivated by the fact that conventional von Neumann architectures are not efficient for inference applications, mainly due to the data transfer bottleneck. Adding cache memories and using GPUs is a remedy, however, does not eliminate this bottleneck. In-memory computing is an alternative approach that performs all computations inside the memory, thus eliminating the data transfer bottleneck.

The memristor, which is a passive two-terminal device, is capable of providing storage and computation simultaneously. This capability and its compatibility with CMOS devices make it a good candidate to build compact, low power, in-memory computing systems.

To design graph-based inference systems, we rely on both probabilistic and non- probabilistic graph-based models. Bayesian networks and Markov models are the commonly used probabilistic models for inference applications. They use mutual probabilistic information to connect between the attributes of the data. Naive Bayes (NB) model is an approximated Bayesian model used to overcome the intractability of the non-approximated Bayesian models. Zhu presented a Bayesian Memory (BM) that can perform NB inference using an Associative Memory (AM) and showed that the posterior probabilities can be calculated using a distance metric, such as the Hamming Distance (HD). Although her design showed promising results, it lacked the hardware design and the associative memory used suffered from low capacity and high crosstalks.

Current HD systems suffer from either high latency and/or high-power consumption. We designed an in-memory computing, low-power, approximate, and parallel memristive HD circuit that can be used to implement an Associative Content Addressable Memory (ACAM). We showed that the operation of our HD circuit under non-ideal fabrication conditions changes only slightly, decreasing the correct classification rates for the MNIST handwritten digits dataset by < 1%. Also, its operation is independent of the memristor model used, as long as the model allows a reverse current. Because we leverage in- memory parallel computing, our circuit is nx faster than other HD circuits, where n is the number of HDs to be computed, and it consumes on average ≈500x less power compared to other memristive and CMOS HD circuits. Another benefit of this circuit is that it can be controlled to trade-off accuracy to power and vice versa. We then used this HD circuit to design a Hamming Distance Associative Content Addressable Memory (HDACAM). This HDACAM consumes on average ≈25x lower power than state-of-the- art HDACAMs. We then used the HDACAM to build the BM where we proposed a novel and a simple way to incorporate prior probabilities while adding a 4% power overhead only. We tested this BM on the MNIST dataset and showed that using variable prior probabilities can lead to better performance that exceeds the constant prior probabilities system by approximately 3% with a small increase in the consumed power. We also showed that we could conserve approximately 300x the power by using the priors as we can lower the input voltage with a 1% performance decrease only. The BM obtained a 77% classification rate on the MNIST dataset, which is comparable to the 79% obtained by using a univariate Gaussian NB system.

Graph Convolutional Neural Networks (GCNNs) can be used to match and infer on graphs. The computational cost of most GCNN models scales quadratically with the number of nodes. Defferrard et al. proposed a GCNN technique that uses localized spectral filters in the convolutional process and has a lower linear complexity. However, their design had a high computational cost when dealing with larger graphs.

We proposed a novel ensemble multi-attribute graph inference technique that reduces the computation cost and energy/power consumption by reducing the graph sizes. For example, when testing on the Yale faces database we were able to decrease the graph sizes by ≈70% and reduce the energy/power consumption by ≈60%, while maintaining comparable accuracy, with a standard deviation of 0.25, to using the single full-size graph. Compared to other face recognition/inference systems, our techniques outperformed them from the classification rate perspective while consuming about 8x less power.

To further reduce the computational time and cost, we introduced another novel multi- attribute graph inference technique called the Early Exit (EE) inference scheme that relies on a confidence measure. By using the EE technique, we were able to lower the power consumption by a further 30% while maintaining a comparable accuracy to our ensemble technique. Our schemes outperformed other face recognition techniques by an average of 9% while consuming ≈11x less power.

Our proposed approximate designs are useful for embedded inference applications where low power is of great importance and a slight loss in accuracy is tolerable.


In Copyright. URI: This Item is protected by copyright and/or related rights. You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use. For other uses you need to obtain permission from the rights-holder(s).

Persistent Identifier

Available for download on Tuesday, July 13, 2021