Inference

8 Articles
Google Cloud unveils Ironwood, its 7th Gen TPU to help boost AI performance and inference
Tech

Google unveils Ironwood, its 7th-generation TPU. Ironwood is designed for inference, the new big challenge for AI. It offers huge advances in power...

SETI but for LLMs: how an LLM solution that’s barely a few months old could revolutionize the way inference is done
Tech

Exo supports LLaMA, Mistral, LLaVA, Qwen, and DeepSeek. It can run on Linux, macOS, Android, and iOS, but not Windows. AI models needing 16GB...

Bye bye Nvidia? Chinese cloud providers aggressively cut down AI inference costs by using Huawei’s controversial accelerators and DeepSeek’s tech
Tech

DeepSeek’s V3 and R1 models are available through Huawei’s Ascend cloud service. They are powered by the Ascend 910x accelerators banned in the...

Navigating the rising costs of AI inference in the era of large-scale applications
Tech

The momentum of AI-driven applications is accelerating around the world and shows little sign of slowing. According to data from IBM, 42% of...

AI energy efficiency monitoring ranks low among enterprise users, survey by inference CPU specialists finds
Tech

Swimlane survey finds many businesses aren’t keeping on top of AI energy needs. Nearly three quarters are aware of the dramatic energy demands...

Apple embraces Nvidia GPUs to accelerate LLM inference via its open source ReDrafter tech
Tech

ReDrafter delivers 2.7x more tokens per second compared to traditional auto-regression. ReDrafter could reduce latency for users while using fewer GPUs. Apple hasn’t...

Microsoft backed a tiny hardware startup that just launched its first AI processor that does inference without a GPU or expensive HBM memory, and a key Nvidia partner is collaborating with it
Tech

Microsoft-backed startup introduces GPU-free alternatives for generative AI. DIMC architecture delivers an ultra-high memory bandwidth of 150 TB/s. Corsair supports transformers, agentic AI,...

Nvidia’s closest rival once again obliterates cloud giants in AI performance; Cerebras Inference is 75x faster than AWS, 32x faster than Google on Llama 3.1 405B
Tech

Cerebras hits 969 tokens/second on Llama 3.1 405B, 75x faster than AWS. Claims industry-low 240ms latency, twice as fast as Google Vertex. Cerebras...