Cadence claims that the DNA 100 processor can deliver up to 4.7X better performance and up to 2.3X more performance per watt than other solutions with similar MAC array sizes. The processor's specialized hardware compute engine avoids both loading and multiplying zeros, so sparsity in the network translates directly into lower power consumption and less computation. That sparsity can be increased through retraining to get the most out of the sparse compute engine, enabling the DNA 100 processor to maximize throughput with a smaller array. According to Cadence, this efficiency lets the DNA 100 processor IP reach an estimated on-device inference performance of up to 2,550 fps and up to 3.4 TMACs/W (in 16 nm) on ResNet 50 for a 4K MAC configuration.
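To make the zero-skipping idea concrete, the following is a minimal software sketch of a dot product that skips any multiply-accumulate (MAC) whose weight or activation is zero. It is only an illustration of the general principle, not a description of how the DNA 100 hardware is built; the function name `sparse_dot` and the sample values are hypothetical and chosen for the example.

```python
def sparse_dot(weights, activations):
    """Dot product that skips every pair containing a zero operand.

    Illustrative only: a software analogy for zero-skipping,
    not the DNA 100's actual hardware mechanism.
    Returns (result, macs_performed).
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have the same length")

    total = 0
    macs_performed = 0
    for w, a in zip(weights, activations):
        if w == 0 or a == 0:
            # The product would be zero, so the MAC is skipped entirely.
            continue
        total += w * a
        macs_performed += 1
    return total, macs_performed


# Hypothetical sample data with many zeros in both operands.
weights = [0, 3, 0, -2, 0, 0, 5, 1]
activations = [4, 0, 7, 1, 0, 2, 2, 0]

result, macs_performed = sparse_dot(weights, activations)
dense_macs = len(weights)  # a dense engine would perform one MAC per pair

print(f"result={result}, MACs performed={macs_performed} of {dense_macs}")
```

In this toy case only two of the eight pairs have two nonzero operands, so the skipped work does not change the result. Retraining a network to contain more zeros would, under the same reasoning, raise the fraction of MACs that can be skipped.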
The processor is available as part of a complete AI software platform and is compatible with the latest version of the Tensilica Neural Network Compiler, which supports advanced AI frameworks including Caffe, TensorFlow, and TensorFlow Lite, as well as a broad spectrum of neural networks including convolutional and recurrent networks. The Tensilica Neural Network Compiler draws on a comprehensive set of optimized neural network library functions to map any neural network into highly optimized, high-performance executable code. The DNA 100 processor has robust software ecosystem support for different network types, including classification, object detection, segmentation, recurrent, and regression networks. It also supports the Android Neural Network (ANN) API for on-device AI inference in Android-powered devices.