Wave Computing’s “TritonAI 64” IP for edge inferencing enables SoCs with up to 6x open-ISA MIPS-64 cores (with SIMD) running Google TensorFlow on a Debian stack plus WaveTensor and WaveFlow technologies for up to 8 TOPS/watt neural processing.
Earlier this month, Wave Computing released its first open source MIPS ISA without license fees or royalties, as promised when it announced its MIPS Open initiative last December. Now, the company has unveiled a licensable IP design for constructing system-on-chips that combine up to 6x open source MIPS-64 cores with a Linux stack that runs Google’s TensorFlow. The new TritonAI 64 design also includes Wave’s proprietary WaveTensor subsystem and WaveFlow fabric for neural processing.
TritonAI 64 conceptual diagram (left) and block diagram
The TritonAI 64 platform “delivers 8-to-32-bit integer-based support for high-performance AI inferencing at the edge now, with bfloat16 and 32-bit floating point-based support for edge training in the future,” says Wave Computing. TritonAI 64 starts with a MIPS 64-bit SIMD engine that supports up to 6x quad-threaded MIPS-64 cores. This CPU block “is integrated with Wave’s unique approach to dataflow and tensor-based configurable technology,” says the company.
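To make the "integer-based" inferencing claim more concrete, here is a minimal sketch of generic symmetric 8-bit quantization with wide-integer accumulation. It illustrates the general technique only, not Wave's implementation; the function names, scales, and sample values are all assumptions made for this example.

```python
def quantize_int8(values, scale):
    """Map floats to the int8 range [-128, 127] by rounding and saturating."""
    return [max(-128, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q):
    """Dot product of two quantized vectors, accumulated in a wider integer."""
    return sum(x * y for x, y in zip(a_q, b_q))


def dequantize(acc, scale_a, scale_b):
    """Convert the integer accumulator back to a float estimate."""
    return acc * scale_a * scale_b


# Hypothetical sample data; each scale is chosen so the largest value maps to 127.
weights = [0.5, -0.25, 1.0]
activations = [2.0, 4.0, -1.0]
w_scale = 1.0 / 127
a_scale = 4.0 / 127

w_q = quantize_int8(weights, w_scale)
a_q = quantize_int8(activations, a_scale)

approx = dequantize(int8_dot(w_q, a_q), w_scale, a_scale)
exact = sum(w * a for w, a in zip(weights, activations))
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The integer result tracks the floating-point one to within a small quantization error, which is the trade-off that lets 8-bit integer hardware handle inference workloads.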
Wave offers a TritonAI Application Programming Kit (APK) that includes the open source, Linux-based MIPS integrated developer environment (IDE). The APK also includes a Debian Linux-based TensorFlow programming environment with an API-based WaveRT runtime and libraries.
TritonAI 64 APK architecture
TensorFlow support includes TensorFlow Lite build support and updates, as well as a TensorFlow build for edge training. Additional AI frameworks, such as Caffe2, can be ported to the MIPS subsystem, and support for additional AI networks is enabled via ONNX conversion.
WaveTensor and WaveFlow
The TensorFlow environment interacts with both the MIPS cores and Wave’s AI inferencing and training IP: WaveTensor and WaveFlow. The WaveTensor subsystem is designed to execute Convolutional Neural Network (CNN) algorithms. Its processing engines can scale up to a PetaTOP of 8-bit integer operations on a single core instantiation by combining extensible slices of 4×4 or 8×8 kernel matrix multiplier engines, claims Wave. CNN execution performance can scale up to 8 TOPS/watt and over 10 TOPS/mm² in industry-standard 7nm process nodes “with libraries using typical voltage and processes,” says the company.
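The idea of building a large matrix multiply out of small fixed-size multiplier blocks can be sketched in software. The following is only an illustration of generic tiled matrix multiplication with a 4×4 block size; it does not describe how WaveTensor schedules its engines, and the names `TILE`, `multiply_tile`, and `tiled_matmul` are assumptions made for this example.

```python
TILE = 4  # assumed block edge, matching the 4x4 option mentioned above


def multiply_tile(a, b, acc, i0, j0, k0):
    """Accumulate the product of one TILE x TILE block pair into acc."""
    for i in range(i0, i0 + TILE):
        for j in range(j0, j0 + TILE):
            partial = 0
            for k in range(k0, k0 + TILE):
                partial += a[i][k] * b[k][j]
            acc[i][j] += partial


def tiled_matmul(a, b):
    """Multiply two square integer matrices whose size is a multiple of TILE."""
    n = len(a)
    if n % TILE != 0 or any(len(row) != n for row in a + b) or len(b) != n:
        raise ValueError("expected square matrices with size divisible by TILE")
    acc = [[0] * n for _ in range(n)]
    # Each (i0, j0, k0) step is one block-sized unit of work.
    for i0 in range(0, n, TILE):
        for j0 in range(0, n, TILE):
            for k0 in range(0, n, TILE):
                multiply_tile(a, b, acc, i0, j0, k0)
    return acc


# Hypothetical 8x8 inputs with small integer entries.
a = [[(i * 8 + j) % 7 - 3 for j in range(8)] for i in range(8)]
b = [[(i * 3 + j * 5) % 9 - 4 for j in range(8)] for i in range(8)]
c = tiled_matmul(a, b)
```

Because each block step is independent of the others apart from the final accumulation, adding more block-sized engines lets more steps run at once, which is the general intuition behind scaling by "slices."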
WaveTensor (left) and WaveFlow diagrams
TritonAI 64 also provides a “highly flexible, linearly scalable” WaveFlow fabric, which supports complex AI algorithms, as well as conventional signal processing and vision algorithms, says Wave Computing. WaveFlow is said to enable “low latency, single batch size AI network execution and reconfigurability to address concurrent AI network execution.”
The WaveFlow fabric scales from 2 to 1K tiles in a 2-D tiling layout. Each tile has 16x CPUs and 8x MACs (8-bit integer). WaveFlow can execute algorithms with or without intervention or support from the MIPS subsystem.
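As a quick back-of-the-envelope check on what those per-tile figures imply, the sketch below multiplies them out for a given tile count. The helper name `fabric_resources` is hypothetical, and reading "1K" as 1,024 tiles is an assumption; the announcement does not say whether it means 1,000 or 1,024.

```python
CPUS_PER_TILE = 16  # per-tile CPU count quoted above
MACS_PER_TILE = 8   # per-tile 8-bit MAC count quoted above


def fabric_resources(tiles):
    """Return (total CPUs, total MACs) for a fabric with the given tile count."""
    if tiles < 1:
        raise ValueError("tile count must be positive")
    return tiles * CPUS_PER_TILE, tiles * MACS_PER_TILE


# Smallest configuration, and the largest one assuming "1K" means 1,024 tiles.
for tiles in (2, 1024):
    cpus, macs = fabric_resources(tiles)
    print(f"{tiles} tiles: {cpus} CPUs, {macs} MACs")
```

Under these assumptions, the smallest fabric would have 32 CPUs and 16 MACs, while a 1,024-tile fabric would have 16,384 CPUs and 8,192 MACs.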
“The tremendous growth of edge-based AI use cases is exacerbating the challenges of SoC designers who continue to struggle with legacy IP products that were not designed for efficient AI processing,” stated Derek Meyer, CEO of Wave Computing. “Our TritonAI solution provides them with the investment protection of a programmable platform that can scale to support the AI applications of both today and tomorrow.”
No pricing or availability information was provided by Wave Computing for its TritonAI 64 IP. More information may be found in the TritonAI 64 announcement and product page.