NPU

NPU core improves inference performance by over 60%
Tech

Oaken’s quantization algorithm consists of three components: (a) threshold-based online-offline hybrid quantization, (b) group-shift quantization, and (c) fused dense-and-sparse encoding. Credit: Proceedings of...