However, although the company has already completed some of the compiler work, the technology is still in development and is not expected to tape out or have IP availability until the second half of 2019.
The company has exploited the similarities between the sea-of-lookup-tables fabric of an FPGA and the sea-of-MACs fabric that can support artificial neural networks for inferencing to create its NMAX architecture.
An FPGA-style interconnect structure is good for moving data efficiently, Flex Logix CEO Geoff Tate told eeNews Europe. The architecture is being optimized for inferencing at the edge, where features such as low batch size and low latency are valued. "Things like Google's TPU [tensor processing unit] batch up 100s of pictures for the efficient use of architecture but it is inevitably low latency," Tate said.
NMAX512 tile with standardized periphery to allow tiling and xFLX universal interconnect. Source: Flex Logix.
To address low-latency applications in areas such as sensor fusion and automotive vision processing, Flex Logix has developed the NMAX512 tile of 512 multiply-accumulate (MAC) units with local SRAM. Implemented in a 16nm CMOS process, the tile has a peak performance of about 1TOPS at a 1GHz clock frequency and occupies an area of about 2 square millimeters.
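As a rough check on the quoted tile figures, peak throughput can be estimated by counting each MAC as two operations (one multiply and one add) per clock cycle. That counting convention is an assumption here, not something Flex Logix states, and the function and variable names are purely illustrative.

```python
def peak_tops(mac_units: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Peak throughput in tera-operations per second.

    ops_per_mac=2 is an assumed convention: one multiply plus one accumulate.
    """
    return mac_units * ops_per_mac * clock_hz / 1e12


# Figures quoted for the NMAX512 tile: 512 MACs, 1GHz clock, about 2 mm^2.
tile_tops = peak_tops(mac_units=512, clock_hz=1e9)
tops_per_mm2 = tile_tops / 2.0

print(f"{tile_tops:.3f} TOPS, {tops_per_mm2:.3f} TOPS/mm^2")
```

Under that assumption the tile works out to roughly 1.02TOPS, in line with the "about 1TOPS" figure, or about 0.5TOPS per square millimeter.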
In neural inferencing, the computation consists primarily of trillions of operations (multiplies and accumulates), typically using 8-bit integer inputs and weights and sometimes 16-bit integers. This is what Flex Logix has chosen to support, with interconnects that allow reconfigurable connections between SRAM input banks, MAC clusters, and activation to SRAM output banks at each stage.
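The data flow described here, from an input bank through MACs and an activation to an output bank, can be sketched in a few lines of plain Python. This is a minimal illustration only, not the NMAX implementation: the ReLU activation, the right-shift rescaling, the saturation step, and the name `mac_stage` are all assumptions chosen for the example.

```python
def mac_stage(inputs, weights, shift=7):
    """One layer stage: 8-bit inputs x 8-bit weights -> wide accumulate
    -> activation -> 8-bit outputs.

    The ReLU activation and shift-based rescaling are assumed for
    illustration; the article does not specify them.
    """
    outputs = []
    for row in weights:              # one output value per row of weights
        acc = 0                      # wide accumulator (a Python int here)
        for x, w in zip(inputs, row):
            acc += x * w             # the multiply-accumulate operation
        acc = max(acc, 0)            # ReLU activation (assumed)
        acc >>= shift                # rescale toward 8-bit range (assumed)
        outputs.append(min(acc, 127))  # saturate to the signed 8-bit maximum
    return outputs


# Hypothetical 8-bit values standing in for an SRAM input bank and weights.
inputs = [12, -7, 100, 3]
weights = [
    [5, -3, 2, 1],
    [-8, 4, -1, 0],
    [127, 127, 127, 127],
]
out = mac_stage(inputs, weights)
print(out)
```

The point of the sketch is that products of two 8-bit values need a wider accumulator, and that each stage writes narrow values back out so the next stage can consume them as inputs.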