Chain-of-Zoom framework enables extreme super-resolution zoom without retraining

Credit: Bryan Sangwoo Kim et al

A trio of AI researchers at KAIST AI, in Korea, has developed what they call a Chain-of-Zoom framework that allows the generation of extreme super-resolution imagery using existing super-resolution models without the need for retraining.

In their study published on the arXiv preprint server, Bryan Sangwoo Kim, Jeongsol Kim, and Jong Chul Ye broke down the process of zooming in on an image and then used an existing super-resolution model at each step to refine the image, resulting in incremental improvements in resolution.

The team in Korea began by noting that existing frameworks for improving the resolution of pictures tend to use interpolation or regression when zooming, resulting in blurry imagery. To overcome these problems, they took a new approach—using a stepwise zooming process, in which subsequent steps improve on those that came before.

The researchers call their new framework Chain-of-Zoom (CoZ), due to the chain of processes that are used to improve resolution.

At each step, the framework uses an existing super-resolution (SR) model to refine the image. Alongside it, a vision-language model (VLM) generates descriptive prompts that guide the SR model's generation. The result is a generated, zoomed-in view of part of the original image.

(a) Conventional SR. When an SR backbone trained for a fixed up-scale factor (e.g., 4x) is pushed to much larger magnifications beyond its training regime, blur and artifacts are produced. (b) Chain-of-Zoom (ours). Starting from an LR input, a pretrained VLM generates a descriptive prompt, which—together with the image—is fed to the same SR backbone to yield the next HR scale-state. This prompt-and-upscale cycle is repeated, allowing a single off-the-shelf model to climb to extreme resolutions (16x–256x) while preserving sharp detail and semantic fidelity. Credit: arXiv (2025). DOI: 10.48550/arxiv.2505.18600
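
The magnification range quoted in the caption follows from simple compounding: if each pass through the SR backbone upscales by a fixed factor, repeated passes multiply that factor. The short Python sketch below works this out for a 4x backbone. The function name `passes_needed` is purely illustrative and does not come from the paper.

```python
def passes_needed(target: int, per_step: int = 4) -> int:
    """Count how many per_step-x passes are needed to reach at least target-x."""
    if target < 1 or per_step < 2:
        raise ValueError("target must be >= 1 and per_step must be >= 2")
    passes, total = 0, 1
    while total < target:
        total *= per_step  # each pass multiplies the overall magnification
        passes += 1
    return passes


for target in (16, 64, 256):
    print(f"{target}x needs {passes_needed(target)} passes of 4x")
```

Under this assumption, the 16x to 256x range corresponds to between two and four passes of a 4x model.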

The framework then repeats this process, drawing on a prompt from the VLM at each pass and improving the resolution of the zoomed image each time until it reaches a final version. To ensure that the prompts given by the VLM were useful, the research team applied reinforcement-learning techniques. In testing, the framework produced better imagery than the baseline approaches it was compared against.
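
The prompt-and-upscale cycle described above can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not the authors' implementation: `placeholder_vlm` and `placeholder_sr` are hypothetical stand-ins for the real vision-language and super-resolution models, the image is a plain grid of grayscale values, and each step is assumed to crop the central region by the same factor it then upscales.

```python
from typing import List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major


def center_crop(image: Image, factor: int) -> Image:
    """Keep the central 1/factor of the image along each axis."""
    h, w = len(image), len(image[0])
    ch, cw = h // factor, w // factor
    top, left = (h - ch) // 2, (w - cw) // 2
    return [row[left:left + cw] for row in image[top:top + ch]]


def placeholder_vlm(image: Image) -> str:
    """Hypothetical stand-in for the VLM: describes the patch in words."""
    mean = sum(map(sum, image)) / (len(image) * len(image[0]))
    return f"patch with mean brightness {mean:.1f}"


def placeholder_sr(image: Image, prompt: str, factor: int) -> Image:
    """Hypothetical stand-in for the SR model: nearest-neighbour upscaling.

    A real SR model would use the prompt to guide generation; this ignores it.
    """
    out: Image = []
    for row in image:
        wide = [p for p in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def chain_of_zoom(image: Image, steps: int, factor: int = 4) -> Tuple[Image, int, List[str]]:
    """Repeat crop -> prompt -> upscale, reusing the same model at every step."""
    prompts: List[str] = []
    for _ in range(steps):
        crop = center_crop(image, factor)           # zoom into the region of interest
        prompt = placeholder_vlm(crop)              # describe what the region shows
        image = placeholder_sr(crop, prompt, factor)  # upscale, guided by the prompt
        prompts.append(prompt)
    return image, factor ** steps, prompts


if __name__ == "__main__":
    start = [[(r * 16 + c) % 256 for c in range(16)] for r in range(16)]
    result, magnification, prompts = chain_of_zoom(start, steps=2)
    print(f"{len(result)}x{len(result[0])} output at {magnification}x magnification")
```

The output stays the same size at every step while the effective magnification compounds. Because the stand-in upscaler only repeats pixels, it adds no detail; in the real framework, any added detail comes from the generative SR model.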

The researchers note that their framework does not require retraining to improve image quality, which, they suggest, makes it more portable. They also state that users need to be careful about how their framework is used. The zoomed-in image is not real—it has been generated using artificial intelligence.

Thus, if it were used to make out the letters and numbers on the license plate of a getaway car in a bank robbery, for example, it might show very clear characters that do not match those on the real car.

More information:
Bryan Sangwoo Kim et al, Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment, arXiv (2025). DOI: 10.48550/arxiv.2505.18600

Project page: bryanswkim.github.io/chain-of-zoom/

Journal information:
arXiv


© 2025 Science X Network

Citation:
Chain-of-Zoom framework enables extreme super-resolution zoom without retraining (2025, June 4)
retrieved 4 June 2025
from

