
A machine learning approach leverages nuclear microreactor symmetry to reduce training time when modeling power output adjustments, according to a study led by University of Michigan researchers, published in the journal Energy Conversion and Management: X.
Improved training efficiency will help researchers model reactors faster, taking a step toward real-time automated nuclear microreactor control for operation in remote locations or eventually in space.
These compact reactors, able to generate up to 20 megawatts of thermal energy that can be used directly as heat or converted to electricity, could be easily transported and could potentially power cargo ships on very long voyages without refueling. If incorporated into an electrical grid, nuclear microreactors could provide stable, carbon-free energy when renewables like solar and wind are not abundantly available.
Small reactors sidestep the huge capital costs that come with large reactors, and partial automation of microreactor power output control would help keep costs low. In potential space applications—such as directly propelling a spacecraft or providing electrical power to the spacecraft’s systems—nuclear microreactors would need to operate completely autonomously.
As a first step toward automation, researchers are simulating load-following—when power plants increase or decrease output to match the electricity demand of the grid. This process is relatively simple to model compared to reactor start-up, which includes rapidly changing conditions that are harder to predict.
The Holos-Quad microreactor design modeled in this study adjusts power through the position of eight control drums that surround the reactor's core, where neutrons split uranium atoms to produce energy. One side of each drum's circumference is lined with a neutron-absorbing material, boron carbide.
When the absorbing side is rotated inward, the drums soak up neutrons from the core, causing the neutron population, and with it the power, to decrease. Rotating the drums outward keeps more neutrons in the core, increasing power output.
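As a rough illustration (ours, not the paper's), the reactivity a single drum inserts is often approximated with a sinusoidal worth curve, so the rotation-to-power relationship can be sketched like this; the worth value and the curve shape are illustrative assumptions:

```python
import math

DRUM_WORTH = 0.002  # illustrative total reactivity worth of one drum (delta-k/k)

def drum_reactivity(angle_deg: float) -> float:
    """Toy sinusoidal worth curve: 0 deg = absorber fully facing the core
    (maximum absorption), 180 deg = absorber fully rotated outward."""
    theta = math.radians(angle_deg)
    return DRUM_WORTH * (1.0 - math.cos(theta)) / 2.0

# Rotating the absorber away from the core adds reactivity, so power rises.
for angle in (0, 90, 180):
    print(f"{angle:3d} deg -> reactivity {drum_reactivity(angle):+.5f}")
```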
“Deep reinforcement learning builds a model of system dynamics, enabling real-time control—something traditional methods like model predictive control often struggle to achieve due to the repetitive optimization needs,” said Majdi Radaideh, an assistant professor of nuclear engineering and radiological sciences at U-M and senior author of the study.
The research team used reinforcement learning to simulate load-following, rotating the control drums in response to reactor feedback. Reinforcement learning is a machine learning paradigm in which agents learn to make decisions through trial-and-error interactions with their environment. While deep reinforcement learning is highly effective, it requires extensive training, which drives up computational time and cost.
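In outline, that trial-and-error loop pairs an agent with a simulated reactor: the agent proposes a drum rotation, observes the resulting power, and is rewarded for tracking the demanded power. A minimal sketch, with a made-up one-drum toy environment standing in for the reactor simulator:

```python
import random

class ToyReactorEnv:
    """Hypothetical one-drum stand-in for a reactor simulator:
    drum angle drives power through crude, made-up dynamics."""
    def __init__(self, demand=0.8):
        self.demand, self.power, self.angle = demand, 0.5, 90.0

    def step(self, d_angle):
        self.angle = min(180.0, max(0.0, self.angle + d_angle))
        self.power += 0.001 * (self.angle - 90.0)   # toy power response
        self.power = min(1.2, max(0.0, self.power))
        reward = -abs(self.power - self.demand)     # penalize tracking error
        return (self.power, self.angle), reward

env = ToyReactorEnv()
for _ in range(1000):
    action = random.uniform(-5.0, 5.0)  # a trained policy replaces this guess
    state, reward = env.step(action)
    # a learning algorithm would update the policy from (state, action, reward)
```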
For the first time, the researchers tested a multi-agent reinforcement learning approach that trains eight independent agents, each controlling a specific drum, while all agents share information about the core as a whole. This framework exploits the microreactor's symmetry to reduce training time by multiplying the learning experience gained at each step.
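One way to picture the setup (a sketch under our own assumptions, not the authors' code): each agent sees the shared core state plus its own drum angle, and because the drums are geometrically interchangeable, all eight can run one shared policy, so every control step produces eight learning samples instead of one:

```python
from dataclasses import dataclass

@dataclass
class CoreState:
    power: float    # shared, core-wide observation
    demand: float

def shared_policy(core: CoreState, own_angle: float) -> float:
    """Placeholder for the trained network each agent would run;
    here, a crude proportional guess at a drum rotation command."""
    return 2.0 * (core.demand - core.power)

core = CoreState(power=0.75, demand=0.80)
drum_angles = [90.0] * 8                    # one angle per symmetric drum
actions = [shared_policy(core, a) for a in drum_angles]
# One control step yields eight (state, action) learning samples, not one.
```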
The study evaluated multi-agent reinforcement learning against two other approaches: a single-agent approach, in which one agent observes the core status and controls all eight drums, and the industry-standard proportional-integral-derivative (PID) controller, which uses a feedback-based control loop.
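For reference, a PID controller computes its drum command from the running power error, its accumulated integral, and its rate of change; a textbook implementation with illustrative, untuned gains:

```python
class PID:
    """Standard proportional-integral-derivative feedback controller."""
    def __init__(self, kp, ki, kd, dt=0.1):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral, self.prev_error = 0.0, 0.0

    def command(self, demand, measured):
        error = demand - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

pid = PID(kp=50.0, ki=5.0, kd=1.0)               # gains are illustrative
d_angle = pid.command(demand=0.8, measured=0.75)  # drum rotation command
```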
The reinforcement learning approaches achieved load-following similar or superior to PID. In scenarios where sensors provided noisy readings or reactor conditions fluctuated, reinforcement learning maintained lower error rates than PID at up to 150% lower control cost, meaning it reached the desired power with less control effort.
The multi-agent approach trained at least twice as fast as the single-agent approach with only a slightly higher error rate.
The technique needs extensive validation in more complex, realistic conditions before real-world application, but the findings establish a more efficient path forward for reinforcement learning in autonomous nuclear microreactors.
“This study is a step toward a forward digital twin where reinforcement learning drives system actions. Next, we aim to close the loop with inverse calibration and high-fidelity simulations to enhance control accuracy,” Radaideh said.
More information:
Leo Tunkle et al, Nuclear microreactor transient and load-following control with deep reinforcement learning, Energy Conversion and Management: X (2025). DOI: 10.1016/j.ecmx.2025.101090