Memory-centric Chip Architecture for Deep Learning

Deep learning calls for chip architectures fundamentally different from the conventional von Neumann architecture, which separates computation from memory and is therefore inefficient for deep learning workloads. Future architectures will resemble biological neural networks, in which computation and memory are blurred together and distributed. Our work proceeds in two steps. In the short term, we improve the efficiency of deep learning execution on the conventional architecture. In the long term, we will devise new chip architectures designed specifically for deep learning. Throughout, we pay special attention to energy efficiency. On the conventional architecture, memory accesses are expected to dominate the total energy consumption of deep learning execution, especially when hardware accelerators are adopted; innovations are needed to reduce memory accesses while preserving inference quality. In the long term, brain-inspired circuits and architectures will be explored to enable brain-like, highly energy-efficient operation.
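The claim that memory accesses dominate energy consumption can be illustrated with a back-of-envelope estimate. The sketch below models one fully connected layer with no on-chip weight reuse, using commonly cited approximate per-operation energy figures for a 45 nm process (roughly 1 pJ for a 32-bit floating-point operation versus hundreds of pJ for a 32-bit off-chip DRAM read). The exact numbers, the layer size, and the no-reuse assumption are all illustrative, not measurements from our work.

```python
# Back-of-envelope energy estimate for one fully connected layer y = W @ x.
# Energy figures are illustrative assumptions based on commonly cited
# ~45 nm estimates: ~0.9 pJ per 32-bit FP add/multiply, ~640 pJ per
# 32-bit word read from off-chip DRAM.

MAC_PJ = 0.9 * 2        # one multiply + one add, in pJ
DRAM_READ_PJ = 640.0    # one 32-bit word fetched from off-chip DRAM, in pJ

def layer_energy_pj(n_in: int, n_out: int) -> dict:
    """Energy breakdown assuming every weight is fetched from DRAM once."""
    macs = n_in * n_out
    # Weights dominate traffic; the input and output vectors are tiny by comparison.
    words_from_dram = n_in * n_out + n_in + n_out
    return {
        "compute_pj": macs * MAC_PJ,
        "memory_pj": words_from_dram * DRAM_READ_PJ,
    }

e = layer_energy_pj(4096, 4096)
total = e["compute_pj"] + e["memory_pj"]
print(f"compute: {e['compute_pj'] / 1e6:.1f} uJ")
print(f"memory:  {e['memory_pj'] / 1e6:.1f} uJ")
print(f"memory share of total energy: {e['memory_pj'] / total:.0%}")
```

Under these assumptions the DRAM traffic accounts for nearly all of the layer's energy, which is why reducing memory accesses (through data reuse, compression, or reduced precision) matters more than speeding up arithmetic.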