‘A virtual DPU within a GPU’: Could clever hardware hack be behind DeepSeek’s groundbreaking AI efficiency?

  • A new approach called DualPipe seems to be the key to DeepSeek’s success
  • One expert describes it as an on-GPU virtual DPU that maximizes bandwidth efficiency
  • While DeepSeek has used Nvidia GPUs only, one wonders how AMD’s Instinct would fare

China’s DeepSeek AI chatbot has stunned the tech industry, representing a credible alternative to OpenAI’s ChatGPT at a fraction of the cost.

A recent paper revealed DeepSeek V3 was trained on a cluster of 2,048 Nvidia H800 GPUs, crippled versions of the H100 (we can only imagine how much more powerful it would be running on AMD Instinct accelerators!). Training reportedly required 2.79 million GPU-hours, covering pretraining on 14.8 trillion tokens and the fine-tuning that followed, and cost, according to calculations made by The Next Platform, a mere $5.58 million.
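For a sense of scale, the reported figures can be cross-checked with some quick arithmetic. The sketch below uses only the numbers quoted above (2.79 million GPU-hours, 2,048 GPUs, $5.58 million). The per-GPU-hour rate is derived from those numbers rather than stated in this article, and the wall-clock estimate assumes every GPU ran concurrently for the whole job, which is an idealization.

```python
# Figures quoted in the article.
GPU_HOURS = 2.79e6           # reported total GPU-hours
NUM_GPUS = 2048              # Nvidia H800 GPUs in the cluster
REPORTED_COST_USD = 5.58e6   # cost estimate cited in the article

# Rental rate implied by the two reported totals (derived, not quoted).
implied_rate = REPORTED_COST_USD / GPU_HOURS  # USD per GPU-hour

# Wall-clock duration, assuming all GPUs were busy for the entire run.
wall_clock_days = GPU_HOURS / NUM_GPUS / 24

print(f"Implied rate: ${implied_rate:.2f} per GPU-hour")
print(f"Approximate wall-clock time: {wall_clock_days:.1f} days")
```

On these assumptions, the totals imply a rate of about $2 per GPU-hour and a run lasting just under two months of continuous cluster time.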
