
Every day, millions of workers step onto construction sites—arguably some of the most hazardous environments in modern industry. Despite years of safety protocols, equipment upgrades, and training programs, construction continues to rank among the top industries for workplace injuries and fatalities worldwide. For years, we’ve asked: Could artificial intelligence help? So far, the results have been mixed.
Construction sites are dynamic, cluttered, and unpredictable—filled with workers, equipment, and heavy machinery. This chaos presents a major challenge for traditional computer vision systems, which are typically designed for clean, structured, and occlusion-free environments. While AI has transformed industries like autonomous driving and smart manufacturing, its impact on construction safety has lagged far behind.
At Carnegie Mellon University’s Mechanical Engineering Department, we set out to advance computer vision solutions that make construction sites safer, smarter, and more efficient—one of the most challenging and under-explored applications of AI in the broader computer vision community.
We designed a new model, Safe-Construct, which will be presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2025) Affective and Behavior Analysis In-the-Wild Workshop, held in Nashville June 11–15. The paper is available on the arXiv preprint server.
The research team started with a simple question: What would it take to teach a machine to care about safety the way a supervisor does? Not in theory, but in practice. Not by merely counting hard hats in photos, but by seeing risk the way a human would, or even better. The project, Safe-Construct, redefines how AI can “see” and respond to real-world safety risks in construction—not hypothetically, but operationally.

Rethinking safety: From pixels to poses
Unlike conventional models that encode workers as mere bounding boxes—an approach that often fails in the unpredictable conditions of active construction sites—Safe-Construct takes a fundamentally new approach. It uses multi-view, multi-person 3D human pose estimation to monitor workers in real time—identifying safety violations, tracking posture, and analyzing behavior across multiple viewpoints and under varying conditions.
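The full Safe-Construct pipeline is more involved, but the geometric core of multi-view 3D pose estimation—recovering a joint's 3D position from its 2D detections in several calibrated cameras—can be sketched with a standard direct linear transform (DLT) triangulation. The camera setup and numbers below are illustrative, not taken from the paper:

```python
import numpy as np

def triangulate_point(projections, points_2d):
    """Triangulate one 3D joint from its 2D detections in several views
    using the Direct Linear Transform (DLT).

    projections: list of 3x4 camera projection matrices
    points_2d:   list of (x, y) image coordinates, one per view
    """
    rows = []
    for P, (x, y) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.stack(rows)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # de-homogenize

# Two toy cameras observing the point (1, 2, 5).
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])            # camera at the origin
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0], [0]])])  # shifted 1 m on x
X_true = np.array([1.0, 2.0, 5.0, 1.0])
uv1 = (P1 @ X_true)[:2] / (P1 @ X_true)[2]
uv2 = (P2 @ X_true)[:2] / (P2 @ X_true)[2]

X_est = triangulate_point([P1, P2], [uv1, uv2])  # recovers (1, 2, 5)
```

Repeating this for every joint of every detected worker, across every camera, yields the 3D skeletons that downstream violation reasoning operates on.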
To our knowledge, this is the first time such a system has been designed specifically for dynamic construction environments. It can scale to any number of workers, adapt across industries, and operate live on-site. Most importantly, the model redefines construction safety violation recognition as a 3D Multi-View Engagement Task.

Built for the real world, trained in simulation
To train the model, the team leveraged synthetic data generation, Sim2Real transfer, and domain randomization—techniques that essentially throw the AI model into hundreds of simulated what-if scenarios, preparing it to handle real-world unpredictability. Testing took place at CMU’s Advanced Manufacturing Facility at Mill-19, a hub for robotics and industrial innovation.
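The idea behind domain randomization is simple: vary everything about the simulated scene that might vary on a real site, so the model never overfits to one look. A minimal sketch of such a scene sampler is below; all parameter names and ranges are illustrative, not the paper's actual randomization scheme:

```python
import random

def sample_scene_config(rng):
    """Sample one randomized synthetic training scene.

    Every parameter and range here is a made-up example of the kind of
    variation domain randomization introduces.
    """
    return {
        "num_workers": rng.randint(1, 10),
        "camera_height_m": rng.uniform(2.0, 6.0),
        "sun_elevation_deg": rng.uniform(5.0, 85.0),
        "hardhat_color": rng.choice(["yellow", "white", "orange", "blue"]),
        "occluder_count": rng.randint(0, 8),      # pallets, scaffolding, etc.
        "texture_seed": rng.randrange(10_000),    # randomized surface textures
    }

rng = random.Random(42)  # seeded for reproducible scene generation
configs = [sample_scene_config(rng) for _ in range(100)]
```

A renderer would then generate labeled images from each configuration, giving the model hundreds of cheap, perfectly annotated what-if scenarios before it ever sees a real site.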
The result: a generalizable system that doesn’t just detect whether a worker is wearing a hard hat—it understands how workers move, interact, and perform tasks, offering a deeper, more context-aware understanding of safety. It can even detect advanced violations, such as whether a ladder is being properly stabilized while another worker climbs—an interaction that involves multiple workers, tools, and contextual understanding.
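Once workers are represented as 3D poses rather than boxes, a multi-person check like the ladder example becomes a question about geometry between skeletons. The toy rule below illustrates the flavor of such a check; the threshold, keypoint choices, and the rule itself are hypothetical simplifications, not the system's actual logic:

```python
import numpy as np

def ladder_is_stabilized(climber_feet, ladder_base, other_workers_hands,
                         max_reach_m=0.5):
    """Toy rule: the ladder counts as stabilized if any *other* worker has a
    hand within `max_reach_m` of the ladder base while someone is on it.

    climber_feet:        list of (x, y, z) 3D foot keypoints of the climber
    ladder_base:         (x, y, z) position of the ladder's base
    other_workers_hands: list of hand-keypoint lists, one per other worker
    """
    climbing = min(f[2] for f in climber_feet) > 0.3  # both feet off the ground
    if not climbing:
        return True  # nobody on the ladder, nothing to stabilize
    base = np.asarray(ladder_base)
    for hands in other_workers_hands:
        for hand in hands:
            if np.linalg.norm(np.asarray(hand) - base) <= max_reach_m:
                return True
    return False

feet = [(0.0, 0.0, 1.2), (0.0, 0.3, 1.5)]           # climber is off the ground
stabilized = ladder_is_stabilized(feet, (0, 0, 0),
                                  [[(0.2, 0.1, 0.4)]])  # helper's hand at base -> True
unattended = ladder_is_stabilized(feet, (0, 0, 0), [])  # no helper -> False
```

Real multi-view pose data is noisier than this, but the example shows why 3D skeletons, rather than bounding boxes, are what make multi-worker interaction checks tractable.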
Beyond detection: Toward digital twins and egocentric vision
But Safe-Construct doesn’t stop at violation detection. The team is now developing a full digital twin ecosystem—a live, virtual replica of the construction site that enables managers to monitor key performance indicators (KPIs) like safety, productivity, and quality.

The team is also exploring 360-degree camera systems and egocentric (first-person) vision, which can bring richer context and worker-centric data into the analysis, reshaping how companies understand risk, assess workflows, and design safer operational protocols. The research team collaborated with YKK AP Inc. of Japan, a global industry leader in building solutions, ensuring that Safe-Construct remains grounded in real-world needs and industry constraints and extending its impact far beyond the lab.
Construction may never be risk-free. But the future could be a lot safer, smarter, faster—and more human-aware—than ever before.
This story is part of Science X Dialog, where researchers can report findings from their published research articles. Visit this page for information about Science X Dialog and how to participate.
More information:
Aviral Chharia et al, Safe-Construct: Redefining Construction Safety Violation Recognition as 3D Multi-View Engagement Task, arXiv (2025). DOI: 10.48550/arxiv.2504.10880
Aviral Chharia is a graduate student at Carnegie Mellon University. He has been awarded the ATK-Nick G. Vlahakis Graduate Fellowship at CMU, the Students’ Undergraduate Research Graduate Excellence (SURGE) fellowship at IIT Kanpur, India, and the MITACS Globalink Research Fellowship at the University of British Columbia. Additionally, he was a two-time recipient of the Dean’s List Scholarship during his undergraduate studies. His research interests include computer vision, computer graphics, and machine learning.
Citation:
Inside Safe-Construct: The AI system built for the world’s most dangerous workplaces (2025, May 22)
retrieved 22 May 2025