Computing-in-memory technology for edge speech processing
Computing-in-memory technology is poised to eliminate the data communication bottlenecks otherwise associated with AI speech processing at the network’s edge, but it requires an embedded memory that simultaneously performs neural network computation and stores weights. Microchip Technology Inc., via its Silicon Storage Technology (SST) subsidiary, has announced that its SuperFlash memBrain neuromorphic memory has solved this problem for the WITINMEM neural processing SoC, the first SoC in volume production that enables sub-mA systems to reduce speech noise and recognize hundreds of command words in real time and immediately after power-up.
Microchip has worked to incorporate its memBrain analog in-memory computing product, based on SuperFlash technology, into the ultra-low-power SoC from WITINMEM. The SoC features computing-in-memory technology for neural network processing, including speech recognition, voice-print recognition, deep speech noise reduction, scene detection, and health status monitoring. WITINMEM, in turn, is working with multiple customers to bring products based on this SoC to market during 2022.
“WITINMEM is breaking new ground with Microchip’s memBrain for addressing the compute-intensive requirements of real-time AI speech at the network edge based on advanced neural network models,” said Shaodi Wang, CEO of WITINMEM. “We were the first to develop a computing-in-memory chip for audio in 2019, and now we have achieved another milestone with volume production of this technology in our ultra-low-power neural processing SoC that streamlines and improves speech processing performance in intelligent voice and health products.”
“The WITINMEM SoC showcases the value of using memBrain technology to create a single-chip solution based on a computing-in-memory neural processor that eliminates the problems of traditional processors that use digital DSP and SRAM/DRAM-based approaches for storing and executing machine learning models,” said Mark Reiten, vice president of the license division at SST.
The memBrain neuromorphic memory product is optimized to perform vector-matrix multiplication (VMM) for neural networks. It enables processors used in battery-powered and deeply embedded edge devices to deliver the highest possible AI inference performance per watt. This is accomplished by both storing the neural model weights as values in the memory array and using the memory array as the neural compute element. The result is 10 to 20 times lower power consumption than alternative approaches, along with lower overall processor bill of materials (BOM) costs because external DRAM and NOR flash are not required.
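The idea of storing weights in the array and computing with the same array can be illustrated with a small digital sketch. This is not memBrain's circuit or API: the function names, the 16-level default, and the [-1, 1] weight range are assumptions chosen only for illustration. Each weight is snapped to one of a fixed number of stored levels, and each output column accumulates the products of the inputs with the weights stored in that column.

```python
def program_array(weights, levels=16, w_max=1.0):
    """Snap each weight to one of `levels` evenly spaced values in [-w_max, w_max].

    Hypothetical stand-in for writing weights into memory cells; the level
    count and range are illustrative assumptions, not device parameters.
    """
    step = 2 * w_max / (levels - 1)
    stored = []
    for row in weights:
        stored_row = []
        for w in row:
            idx = round((w + w_max) / step)      # nearest level index
            idx = max(0, min(levels - 1, idx))   # clamp out-of-range weights
            stored_row.append(-w_max + idx * step)
        stored.append(stored_row)
    return stored


def vmm(inputs, stored):
    """Vector-matrix multiply: output[j] = sum over i of inputs[i] * stored[i][j]."""
    n_cols = len(stored[0])
    return [sum(x * row[j] for x, row in zip(inputs, stored))
            for j in range(n_cols)]


if __name__ == "__main__":
    weights = [[0.9, -0.2], [0.4, -1.0]]
    array = program_array(weights, levels=16)
    print(vmm([2.0, 3.0], array))
```

Because the weights stay in the array between calls to `vmm`, no weight data moves for each inference in this model, which is the property the paragraph above attributes to computing-in-memory.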
Permanently storing neural models inside the memBrain’s processing element also supports instant-on functionality for real-time neural network processing. WITINMEM has leveraged the nonvolatility of SuperFlash technology’s floating-gate cells to power down its computing-in-memory macros during the idle state, further reducing leakage power in demanding IoT use cases.