
Tech startup proposes a novel way to tackle massive LLMs using the fastest memory available to mankind

  • GPU-like PCIe card delivers 9.6 PFLOPS of FP4 compute and 2GB of SRAM
  • SRAM is normally found only in small amounts, as a processor's L1 to L3 caches
  • The card pairs it with LPDDR5 rather than far more expensive HBM memory

Silicon Valley startup d-Matrix, which is backed by Microsoft, has developed a chiplet-based solution designed for fast, small-batch inference of LLMs in enterprise environments. Its architecture takes an all-digital compute-in-memory approach, using modified SRAM cells for speed and energy efficiency.
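
Compute-in-memory broadly means performing the multiply-accumulate work where the weights are stored, rather than shuttling them back and forth to a separate compute unit. As a minimal conceptual sketch of that idea in Python, the snippet below tiles a matrix-vector product across weight tiles that would notionally sit in SRAM; the tile size, the crude 4-bit quantisation step and all names are illustrative assumptions, not d-Matrix's actual design:

```python
import numpy as np

TILE = 256  # illustrative tile width; not a d-Matrix spec

def quantize_fp4(w):
    """Crude 4-bit-style quantisation stand-in: 16 levels over the weight range."""
    lo, hi = w.min(), w.max()
    levels = np.round((w - lo) / (hi - lo + 1e-12) * 15)
    return levels / 15 * (hi - lo) + lo

def cim_matvec(weights, x):
    """Matrix-vector product computed tile by tile, mimicking how an
    in-memory compute array accumulates partial sums locally."""
    out = np.zeros(weights.shape[0])
    for col in range(0, weights.shape[1], TILE):
        tile = quantize_fp4(weights[:, col:col + TILE])  # weights resident "in memory"
        out += tile @ x[col:col + TILE]                   # local multiply-accumulate
    return out

# Example: a 1024x1024 layer applied to one token's activations
w = np.random.randn(1024, 1024).astype(np.float32)
x = np.random.randn(1024).astype(np.float32)
y = cim_matvec(w, x)
```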

The Corsair, d-Matrix’s current product, is described as a “first-of-its-kind AI compute platform” and features two d-Matrix ASICs on a full-height, full-length PCIe card, with four chiplets per ASIC. It delivers a total of 9.6 PFLOPS of FP4 compute alongside 2GB of SRAM-based performance memory. Unlike traditional designs that rely on expensive HBM, Corsair uses LPDDR5 capacity memory, with up to 256GB per card for handling larger models or batched inference workloads.
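
For a rough sense of scale, a back-of-envelope calculation (my arithmetic from the figures above, not a number quoted by d-Matrix): at FP4 a weight occupies half a byte, so 2GB of SRAM could hold on the order of 4 billion parameters close to the compute, while 256GB of LPDDR5 could stage roughly 500 billion, before accounting for activations, KV cache or any overhead:

```python
BYTES_PER_FP4_PARAM = 0.5          # 4 bits per weight

sram_bytes  = 2   * 1024**3        # 2GB performance memory per card
lpddr_bytes = 256 * 1024**3        # up to 256GB capacity memory per card

print(f"Params in SRAM:   ~{sram_bytes  / BYTES_PER_FP4_PARAM / 1e9:.1f}B")
print(f"Params in LPDDR5: ~{lpddr_bytes / BYTES_PER_FP4_PARAM / 1e9:.1f}B")
```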

