

Using novel technologies for removing computer memory bottlenecks

According to the website Statista, the U.S. had the largest machine learning market worldwide last year, valued at more than $21 billion. As machine learning is built into more everyday applications, the large data centers that serve these workloads face challenges in keeping pace with current and future demand.

But since last June, Computer Science and Engineering Assistant Professor Ramtin Zand has been working on a project to implement a platform technology that could offer a solution.

Zand’s three-year, nearly $600,000 National Science Foundation-funded project has its origins in 2019, when he first submitted the proposal as a new assistant professor at the college. It was declined several times, but instead of moving on to another project, Zand used the rejections to improve his proposal while continuing to publish journal articles and compile results.

“It's been six years that we've been working on this project, but the feedback from the rejections helped us build what we have today,” Zand says. “I appreciated all of the reviewers’ comments because we had fixed all of the early issues when it was accepted.”

Zand adds that the primary motivation for the project comes from his previous and current research in computer architecture, his area of expertise.

“There's an entire class known as Von Neumann computer architectures that are in almost all of today’s computer systems. In this architecture, the data is stored in the memory, and the processor fetches the data, processes it, and sends it back to the memory,” he says. 

While machine learning and artificial intelligence workloads perform repetitive and relatively simple operations, they use an extensive amount of data, which causes memory bottlenecks: a system’s memory resources cannot keep up with data processing and storage demands. Under a memory bottleneck, transferring data between processors and memory consumes a significant amount of energy, particularly in large data centers.
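To make that data-movement cost concrete (this is an illustration, not code from Zand’s project), the short Python sketch below models the fetch-process-write-back pattern for a simple element-wise operation and counts the bytes that cross the processor-memory interface; the 4-byte element size is an assumption.

```python
import numpy as np

# Toy model of the Von Neumann pattern: every operand is fetched from
# memory, processed, and the result is written back. The byte counter is
# illustrative only; real costs depend on caches and the memory hierarchy.

BYTES_PER_ELEMENT = 4  # assumption: 32-bit values


def scale_and_store(data: np.ndarray, factor: float):
    bytes_moved = 0
    result = np.empty_like(data)
    for i in range(data.size):
        x = data[i]                      # fetch operand from "memory"
        bytes_moved += BYTES_PER_ELEMENT
        result[i] = x * factor           # process, then write back to "memory"
        bytes_moved += BYTES_PER_ELEMENT
    return result, bytes_moved


if __name__ == "__main__":
    data = np.arange(1_000, dtype=np.float32)
    _, moved = scale_and_store(data, 2.0)
    print(f"{moved} bytes crossed the processor-memory interface "
          f"for {data.size} multiplications")
```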

Zand plans to address this issue by utilizing in-memory analog computing (IMAC), which reduces the processor-memory bottleneck by performing computations directly where the data resides, saving both time and energy. His team will develop a novel IMAC architecture and a corresponding framework for deploying machine learning workloads, eliminating the need for a separate memory and processor.

“The entire data transfer is not necessary because we’re doing the processing locally where data exists, then addressing the bottleneck concern,” Zand says. 
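In-memory analog computing is often illustrated with a resistive crossbar, where weights stored in the array act as conductances and a matrix-vector product emerges from the column currents, so the computation happens where the data lives. The sketch below is an idealized NumPy model of that general idea, not the project’s actual IMAC architecture; the array dimensions and value ranges are arbitrary assumptions.

```python
import numpy as np

# Idealized analog crossbar: weights live in the array as conductances,
# inputs are applied as voltages, and the column currents are the
# matrix-vector product -- no weight ever moves to a separate processor.

class IdealCrossbar:
    def __init__(self, weights: np.ndarray):
        # Store the weight matrix "in memory" as ideal, noiseless conductances.
        self.conductances = weights.astype(np.float64)

    def matvec(self, voltages: np.ndarray) -> np.ndarray:
        # Column current = sum over rows of G[i, j] * V[i] (Kirchhoff's current law).
        return self.conductances.T @ voltages


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.uniform(-1, 1, size=(64, 16))   # assumed 64x16 layer
    x = rng.uniform(0, 1, size=64)                # assumed input activations
    xbar = IdealCrossbar(weights)
    analog_out = xbar.matvec(x)
    digital_out = weights.T @ x                   # reference digital result
    print("max difference vs. digital:", np.max(np.abs(analog_out - digital_out)))
```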

Cross-layer design automation is a significant aspect of the project, as Zand wants to examine the problem across different design abstraction layers. The software-hardware co-design process spans several such layers, starting with electronic circuit design, moving up to the computer architecture, and finally to building a complete system. Zand believes this approach is necessary because the memory bottleneck cannot be treated as an isolated problem.

“Our cross-layer solution not only looks at the problem from the circuit level, but from the system, computer architecture and applications,” Zand says. “For example, you can have the best electronic circuits but can’t integrate it with the existing computer architecture to run applications. This is why we must look at it from the full software hardware stack.” 

Co-principal investigator Jason Bakos, a professor in the Department of Computer Science and Engineering, is leading an effort to develop a fast simulator for IMAC circuits, a tool that could help expand the research. While Zand works on the computer architecture and system side, Bakos is focusing on the tool chain development and simulation framework.

“We’re developing memory technology that can perform tensor operations - the fundamental operations performed by AI models - on data stored in the memory and without the use of a graphics processing unit. This can potentially provide 1,000 times greater energy efficiency,” Bakos says. “In addition, traditional, general-purpose circuit simulators are far too slow, so we’re developing substantially faster simulators that are tailored specifically for the type of circuit simulation needed.”
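The project’s simulator is not described in detail here, but one reason a tailored tool can outrun a general-purpose circuit simulator is that the dominant IMAC operation, an analog matrix-vector multiply, can be approximated with vectorized arithmetic plus simple nonideality models rather than full nodal analysis. The sketch below illustrates that idea; the noise level and ADC resolution are assumed parameters, and this is not Bakos’s tool chain.

```python
import numpy as np

# Vectorized approximation of an analog crossbar MVM with two simple
# nonidealities: conductance programming noise and finite ADC resolution.
# A general-purpose circuit simulator would solve nodal equations instead.

def simulate_imac_matvec(weights, x, g_noise_std=0.01, adc_bits=8, rng=None):
    rng = rng or np.random.default_rng()
    # Programming noise: each stored conductance deviates slightly from its target.
    noisy_w = weights + rng.normal(0.0, g_noise_std, size=weights.shape)
    currents = noisy_w.T @ x
    # ADC quantization: map column currents onto a limited number of output levels.
    levels = 2 ** adc_bits
    lo, hi = currents.min(), currents.max()
    step = (hi - lo) / (levels - 1) or 1.0
    return lo + np.round((currents - lo) / step) * step


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.uniform(-1, 1, size=(128, 32))
    x = rng.uniform(0, 1, size=128)
    approx = simulate_imac_matvec(W, x, rng=rng)
    exact = W.T @ x
    print("mean absolute error:", np.mean(np.abs(approx - exact)))
```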

Zand also wants to automate the process of designing these systems. The approach uses machine learning to automatically optimize neural networks for the available hardware, and he aims to combine this method with his work on the computer architecture.
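The article does not detail how that automation works, but hardware-aware optimization is commonly framed as a search over model configurations scored by both an accuracy proxy and a hardware cost model. The toy search below sketches that framing with invented layer names, cost tables and an assumed energy budget; it is not Zand’s framework.

```python
from itertools import product

# Toy hardware-aware search: pick a per-layer weight precision that fits an
# assumed energy budget while minimizing an (invented) accuracy-loss proxy.
# Real frameworks would train and evaluate candidates instead of using a table.

LAYERS = ["attention_proj", "mlp_up", "mlp_down"]           # hypothetical layers
PRECISIONS = [4, 6, 8]                                      # candidate bit-widths
ENERGY_PER_LAYER = {4: 1.0, 6: 1.6, 8: 2.3}                 # assumed relative cost
ACCURACY_LOSS = {4: 0.020, 6: 0.008, 8: 0.002}              # assumed proxy per layer
ENERGY_BUDGET = 5.5                                         # assumed budget


def search():
    best = None
    for combo in product(PRECISIONS, repeat=len(LAYERS)):
        energy = sum(ENERGY_PER_LAYER[b] for b in combo)
        loss = sum(ACCURACY_LOSS[b] for b in combo)
        if energy <= ENERGY_BUDGET and (best is None or loss < best[0]):
            best = (loss, energy, dict(zip(LAYERS, combo)))
    return best


if __name__ == "__main__":
    loss, energy, assignment = search()
    print(f"chosen precisions: {assignment} (energy {energy:.1f}, loss proxy {loss:.3f})")
```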

“We’ll build core processors and integrate them with existing units to make sure they are part of the machine learning and AI ecosystem. They could be integrated with the whole system in a seamless way,” Zand says. 

Zand is continuing to build on preliminary work, and one current focus is how his technology can be used for large language models (LLMs) such as the Generative Pre-trained Transformer (GPT). Services built on models like GPT are widely used, but they consume substantial energy and raise sustainability concerns because of the models’ size. To address this issue, Zand’s team has developed a computer architecture known as Processing-In-Memory LLM (PIM-LLM).

“We’re excited about achieving significant throughput, which is important to us,” Zand says. “Through the use of this technology, we can now deploy large language models on mobile devices and have a real-time conversational AI system without the need to send data to cloud. It's in the early stages, but the results so far are promising.”
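PIM-LLM’s internals are not given in the article, but the general processing-in-memory idea for transformers is to keep the large projection matrices resident in the memory arrays and compute their products there, leaving only small operations such as softmax to the host. The sketch below illustrates that split for a single attention head; the dimensions and the pim_matvec stand-in are assumptions, not the PIM-LLM design.

```python
import numpy as np

# Conceptual split for one attention head: weight matrices stay resident in
# in-memory compute arrays (modeled here as plain matmuls), while the host
# only handles the small scaling and softmax steps. Sizes are illustrative.

D_MODEL, D_HEAD, SEQ = 256, 64, 8                              # assumed dimensions
rng = np.random.default_rng(2)
W_q = rng.normal(size=(D_MODEL, D_HEAD)) / np.sqrt(D_MODEL)    # resident in PIM array
W_k = rng.normal(size=(D_MODEL, D_HEAD)) / np.sqrt(D_MODEL)    # resident in PIM array
W_v = rng.normal(size=(D_MODEL, D_HEAD)) / np.sqrt(D_MODEL)    # resident in PIM array


def pim_matvec(weights: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Stand-in for a matrix product executed inside the memory array."""
    return x @ weights


def attention(tokens: np.ndarray) -> np.ndarray:
    q, k, v = pim_matvec(W_q, tokens), pim_matvec(W_k, tokens), pim_matvec(W_v, tokens)
    scores = q @ k.T / np.sqrt(D_HEAD)                 # host-side, small
    attn = np.exp(scores - scores.max(axis=-1, keepdims=True))
    attn /= attn.sum(axis=-1, keepdims=True)           # host-side softmax
    return attn @ v


if __name__ == "__main__":
    tokens = rng.normal(size=(SEQ, D_MODEL))
    print("attention output shape:", attention(tokens).shape)
```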

Zand intends to develop several applications before the project ends in 2027, including language models and audio processing, and he hopes to meet the project’s targets for energy efficiency and speed. He is also interested in convincing stakeholders that the technology is promising enough to warrant larger investments in commercialization from major industry and government.

“Through our work and the cross-layer computer system architecture circuit design, we hope to improve energy consumption and throughput,” Zand says. “Through all the frameworks and tools that we’re developing, I'm hoping that we can prove what we proposed in terms of speed and energy efficiency of these systems for AI and machine learning technologies and different workloads.” 

