
Cornell researchers have developed an AI-powered process that automatically transforms a short video of a room into an interactive, 3D simulation of the space.
Inside this highly accurate “digital twin,” users can open drawers and cabinets and handle objects on the countertop. The technology can be used to develop more realistic video games and virtually train robots to operate within a specific real-world space—essentially any application that needs a realistic, interactive model of a room.
“Existing techniques, although they allow you to synthesize what the world looks like from different viewpoints, sometimes lack this capability of being immersive, where you can really interact with the scene,” said Wei-Chiu Ma, assistant professor of computer science in the Cornell Ann S. Bowers College of Computing and Information Science, and senior researcher on the project. “Because of the advances in generative AI techniques, we finally have enough tools to make a baby step toward creating digital twins that are now interactable.”
His collaborators include Hongchi Xia, a Ph.D. student in computer science at the University of Illinois Urbana-Champaign. Xia presented their project, “DRAWER: Digital Reconstruction and Articulation With Environment Realism,” on June 15 at the IEEE/CVF Conference on Computer Vision and Pattern Recognition in Nashville, Tennessee.
The process of creating a digital twin of a room using DRAWER starts with just a few minutes of filming.
“Our input is just a video that you casually capture in the kitchen. You don’t need to interact with any cabinet doors or with the objects,” Xia said. “I just hold my iPhone—you don’t need an advanced video device or expensive camera.”
To turn that video into a digital room that is both photorealistic and interactive, the researchers chained together multiple AI models. They combined two methods for rendering digital images: one that produces a visually realistic appearance, and a second that recreates the scene with highly accurate dimensions. They also added a perception module, which determines which parts of the scene are movable and how they should move, such as how a refrigerator door should swing open. Finally, they included a model that fills in the unseen insides of the drawers.
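In very loose terms, that division of labor might look like the Python sketch below. Every class, function and parameter name here is a hypothetical placeholder chosen for illustration, not part of the actual DRAWER implementation:

    from dataclasses import dataclass

    # Hypothetical stand-ins for the stages the article describes:
    # appearance, geometry, articulation perception, and interior in-fill.

    @dataclass
    class ScenePart:
        name: str
        motion: str = "fixed"        # e.g. "hinge" for doors, "slide" for drawers
        interior_filled: bool = False

    @dataclass
    class DigitalTwin:
        appearance: dict             # photorealistic rendering state
        geometry: dict               # metrically accurate reconstruction
        parts: list

    def reconstruct_appearance(frames):
        """Fit a view-synthesis model so the twin looks like the real room."""
        return {"representation": "radiance_field", "frames_used": len(frames)}

    def reconstruct_geometry(frames):
        """Recover the scene's dimensions from the same frames."""
        return {"representation": "mesh", "units": "meters"}

    def detect_articulated_parts(geometry):
        """Perception stage: decide which parts move and how."""
        return [
            ScenePart("refrigerator_door", motion="hinge"),
            ScenePart("top_drawer", motion="slide"),
            ScenePart("kettle", motion="free"),
        ]

    def fill_hidden_interiors(parts):
        """Generative stage: fill in interiors the camera never observed."""
        for part in parts:
            if part.motion in ("hinge", "slide"):  # containers have hidden insides
                part.interior_filled = True
        return parts

    def build_digital_twin(frames):
        appearance = reconstruct_appearance(frames)
        geometry = reconstruct_geometry(frames)
        parts = fill_hidden_interiors(detect_articulated_parts(geometry))
        return DigitalTwin(appearance, geometry, parts)

    twin = build_digital_twin(frames=["frame"] * 300)  # a few minutes of casual video
    for part in twin.parts:
        print(part)

The key design point, in this reading, is that the appearance and geometry stages consume the same frames, so the photorealistic rendering and the physically accurate dimensions describe one aligned scene that the later stages can annotate with motion and hidden interiors.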
However, developing DRAWER wasn’t as simple as just linking up the modules, Xia said. He had to integrate them into a unified framework. Once completed, he used the method to develop recreations of a kitchen, a bathroom and even his office.
The digital twins generated by this approach work seamlessly with the game engines used to create video games, Xia said. The research team demonstrated this by creating a game where the user shoots balls to knock over objects in the kitchen, like the kettle and soap bottle.
The framework can also be used to virtually train robots to operate in real-world environments through a process called real-to-sim-to-real transfer. The researchers virtually trained a robotic arm in the digital twin of the kitchen and then showed that it could successfully put objects away in a drawer in the real world.
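The real-to-sim-to-real pattern can be summarized in a deliberately toy Python sketch. The policy, the simulator and the learning rule below are all illustrative placeholders standing in for the paper's actual training setup:

    import random

    def simulated_rollout(policy, twin, rng):
        """One practice attempt at the put-away task inside the digital twin."""
        return rng.random() < policy["skill"]  # toy stand-in for simulated physics

    def train_in_sim(twin, episodes=5000, seed=0):
        """Sim phase: thousands of cheap, safe trials refine the policy."""
        rng = random.Random(seed)
        policy = {"skill": 0.05}
        for _ in range(episodes):
            if not simulated_rollout(policy, twin, rng):
                policy["skill"] = min(1.0, policy["skill"] + 0.0005)  # learn from failure
        return policy

    def deploy_to_real_arm(policy):
        """Real phase: only the finished policy touches the physical robot."""
        print(f"Deploying policy, simulated success rate ~{policy['skill']:.2f}")

    kitchen_twin = {"containers": ["top_drawer"], "objects": ["kettle", "soap_bottle"]}
    deploy_to_real_arm(train_in_sim(kitchen_twin))

The point of the pattern is that the expensive trial-and-error happens inside the twin, where a failed grasp costs nothing, and the robot arrives at the real kitchen already trained.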
They envision that, in the near future, someone could order a robot and upload a video of their house; the resulting digital twin could then be used to train the robot to function within the space before it’s even out of the box. The simulation is a cheaper, faster and safer way to train a robot, Ma said.
Currently, DRAWER works only with rigid objects, like a kettle, but the researchers eventually plan to include soft, deformable objects, like cloth, as well as objects that can break, like windows.
Additionally, DRAWER currently recreates a single room, but Ma and Xia hope to extend this work to encompass entire buildings. They also envision creating digital twins of outdoor spaces where the technology could be used for designing cities or optimizing agricultural yields.
“Our final goal is to try to build a digital twin of everything in the world,” said Xia, “so there are a lot of things that we can explore in the future.”
Additional authors on the study include colleagues from the University of Washington, including Entong Su, Marius Memmel, Arhan Jain, Raymond Yu, Numfor Mbiziwo-Tiapo, Ali Farhadi (also at the Allen Institute for Artificial Intelligence) and Abhishek Gupta, as well as Shenlong Wang from the University of Illinois Urbana-Champaign.
More information: Paper: DRAWER: Digital Reconstruction and Articulation With Environment Realism