Reasoning

13 Articles
Reinforcement learning boosts reasoning skills in new diffusion-based language model d1
Tech

Log Probability Estimation in diffu-GRPO. Credit: arXiv (2025). DOI: 10.48550/arxiv.2504.12216

A team of AI researchers at the University of California, Los Angeles, working...

OpenAI beats DeepSeek on sentence-level reasoning
Tech

Credit: AI-generated image. ChatGPT and other AI chatbots based on large language models are known to occasionally make things up, including scientific and...

Deep Reasoning is coming to ChatGPT free, but I think it’s still worth paying for ChatGPT Plus
Tech

Deep Research is coming to the free tier of ChatGPT very soon. The Plus tier still offers considerable advantages over the free tier...

Gemini’s ‘most intelligent AI model’ yet is now available for free – here are 3 ways you can use its incredible reasoning capabilities
Tech

Google announced Gemini 2.5 last week. Now you can access its reasoning model, Gemini 2.5 Pro Experimental, for free. It tops Humanity’s Last...

Gemini 2.5 is now available for Advanced users and it seriously improves Google’s AI reasoning
Tech

Google announces Gemini 2.5. Gemini 2.5 Pro Experimental is available for paid subscribers right now. It tops Humanity’s Last Exam, the most difficult AI...

How Claude’s 3.7’s new ‘extended’ thinking compares to ChatGPT o1’s reasoning
Tech

Anthropic just released a new model called Claude 3.7 Sonnet, and while I’m always interested in the latest AI capabilities, it was the...

A new backdoor attack that leverages the reasoning capabilities of LLMs
Tech

The user submits two queries (Q1 and Q2) to the backdoored customized LLM (the middle entity, highlighted in red). In Q1’s reasoning steps,...

Academic researchers find a way to train an AI reasoning model for less than $50
Tech

Sequential and parallel test-time scaling. (a): Budget forcing shows clear scaling trends and extrapolates to some extent. For the three rightmost dots, we...

OpenAI responds to the DeepSeek buzz by launching its latest o3-mini reasoning model for all users
Tech

OpenAI has pushed out o3-mini models to ChatGPT. The launch has previously been teased. OpenAI is facing increasing competition from China. As promised...