Apple embraces Nvidia GPUs to accelerate LLM inference via its open-source ReDrafter tech

  • ReDrafter delivers 2.7x more tokens per second than traditional auto-regressive decoding
  • ReDrafter could reduce latency for users while using fewer GPUs
  • Apple hasn’t said when ReDrafter will be deployed on rival AI GPUs from AMD and Intel

Apple has announced a collaboration with Nvidia to accelerate large language model (LLM) inference using Apple's open-source Recurrent Drafter technology, or ReDrafter for short.

The partnership aims to address the computational cost of auto-regressive token generation, in which a model produces its output one token at a time. Cutting that cost is crucial for improving efficiency and reducing latency in real-time LLM applications.
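
To make the bottleneck concrete, here is a toy Python sketch contrasting plain auto-regression with a greedy draft-and-verify loop, the general idea behind speculative decoding approaches such as ReDrafter. It is not Apple's or Nvidia's implementation: the function names, the toy models, and the draft length k are all hypothetical, and the "models" are simple functions over integer tokens.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]  # stand-in for a model's greedy next-token call


def toy_target(seq: List[Token]) -> Token:
    """Hypothetical 'large' model: counts upward modulo 10."""
    return (seq[-1] + 1) % 10


def toy_draft(seq: List[Token]) -> Token:
    """Hypothetical 'small' model: agrees with the target except after a 5."""
    return 0 if seq[-1] == 5 else (seq[-1] + 1) % 10


def autoregressive(target: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Plain auto-regression: one target-model call per generated token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(target(seq))
    return seq[len(prompt):]


def draft_and_verify(
    target: NextToken, draft: NextToken, prompt: List[Token], n: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Greedy draft-and-verify. Returns (tokens, number of verification passes)."""
    seq = list(prompt)
    passes = 0
    while len(seq) - len(prompt) < n:
        # The cheap draft model proposes k tokens in a row.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # A real system would score all k positions in one batched target pass;
        # this sketch simulates that with per-position calls but counts one pass.
        passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(seq + accepted)
            accepted.append(expected)  # always keep the target's own token
            if expected != tok:
                break  # first mismatch: keep the correction, drop the rest
        seq.extend(accepted)
    return seq[len(prompt):len(prompt) + n], passes


if __name__ == "__main__":
    baseline = autoregressive(toy_target, [0], 12)
    tokens, passes = draft_and_verify(toy_target, toy_draft, [0], 12)
    print(tokens == baseline, passes)
```

Because every kept token is the one the target model would have chosen, the output matches plain greedy decoding; the saving comes from needing fewer sequential passes through the large model whenever the draft guesses well.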
