Last week, NVIDIA announced that it has released the open-source code for Neuralangelo, the company’s 3D reconstruction tool. Originally introduced back in June, Neuralangelo uses artificial intelligence to convert 2D videos capturing an object from multiple angles into a highly accurate 3D model. Now, the code for the AI model is available for anyone, and can be found here.
Neuralangelo, whose name is an homage to Renaissance sculptor Michelangelo, was introduced in June with a paper released by NVIDIA Research. In a blog post introducing the model, NVIDIA compares Neuralangelo’s ability to create 3D models to Michelangelo sculpting realistic visions from blocks of marble. Like the great sculptor, what Neuralangelo really excels at is getting the intricate details of a model right, capturing things like repetitive texture patterns, strong color variations, and homogeneous colors, all of which NVIDIA notes have been a challenge for other, similar AI tools. Below, you can see how it works in a video put together by NVIDIA.
The base for this tool is the same as what the company used to create its Instant NeRF technology, which was first introduced in 2022 and promised to change how photogrammetric projects are completed. Now, rather than relying on a set of still pictures, Neuralangelo completes a similar process using video. An object must be captured from different angles in a 2D video, and from there the model selects individual frames showing the object from various angles, essentially creating the same basis for a model that would be used in photogrammetry.
The model then determines the camera position for each of the selected frames and uses that to create an initial, rough 3D representation of the scene. After that initial reconstruction takes place, the model optimizes the render to sharpen the details, creating an accurate, high-fidelity reconstruction of a scene from just a single video. NVIDIA points to art, video game development, robotics, and digital twins as potential use cases for this technology.
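To make the frame-selection step described above a little more concrete, here is a minimal Python sketch of one simple way to pick evenly spaced frames from a video. This is an illustration only and not taken from NVIDIA’s code: the function name `select_frame_indices` and the even-spacing strategy are assumptions, and the actual Neuralangelo preprocessing may choose frames differently.

```python
def select_frame_indices(total_frames: int, num_samples: int) -> list[int]:
    """Return evenly spaced frame indices spanning a whole video.

    Hypothetical helper for illustration; not part of Neuralangelo.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_samples = min(num_samples, total_frames)
    if num_samples == 1:
        return [0]
    last = total_frames - 1
    # Integer arithmetic keeps the indices strictly increasing and
    # always includes the first and last frame.
    return [(i * last) // (num_samples - 1) for i in range(num_samples)]


if __name__ == "__main__":
    # A 300-frame clip sampled down to 4 frames.
    print(select_frame_indices(300, 4))  # [0, 99, 199, 299]
```

Spreading the samples across the full clip is meant to mirror the article’s point that the object needs to be seen from different angles; frames bunched together in time would likely show nearly the same viewpoint.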
We talked last year about the growing trend of democratizing reality capture, and while this Neuralangelo tool doesn’t exactly overlap with all reality capture use cases, it does open up the possibility of creating better models for a lot of spaces. Much of the focus in the release was around the ability to create reconstructions of individual objects like a statue or a truck, but NVIDIA does note that entire indoor spaces can be recreated as well. Being able to do it with a video makes that process significantly easier for many professionals, removing the need to buy special equipment and the process of ensuring photographs are captured from all necessary angles.
For things like video game design, it’s not hard at all to see how this kind of tool can be used to place realistic objects into the virtual world. It’s not limited to that use case, though. NVIDIA mentions industrial digital twins as a space in which Neuralangelo can thrive, and one can envision stakeholders taking 2D videos of spaces and creating the base of their digital twins much more easily than ever before. We could also consider the kind of work that is being done by a company like Alteia, which is using NeRF technology with images of power lines and other vertical structures to complete inspection workflows. In theory, that workflow would become even simpler with the use of video rather than images.
“The 3D reconstruction capabilities Neuralangelo offers will be a huge benefit to creators, helping them recreate the real world in the digital world,” said Ming-Yu Liu, senior director of research and co-author on the paper, in the aforementioned NVIDIA blog post. “This tool will eventually enable developers to import detailed objects — whether small statues or massive buildings — into virtual environments for video games or industrial digital twins.”