In July, Matterport and Facebook announced a unique collaboration in which Matterport would provide scans of indoor spaces for use in Facebook’s AI research. But how did a reality capture company find itself enabling cutting-edge robotics? We connected with Matterport CTO Japjit Tulsi to find out.
Matterport has been growing and evolving its business model from its inception as a hardware producer into a publicly traded company hosting the world’s largest data library of 3D captured scans. Once known primarily for its 3D camera, the Matterport Pro 2, the company has evolved into an integrated platform for capturing and hosting 3D scans across a variety of devices. Matterport users can now scan 3D spaces using Matterport Pro 2 cameras, but can also take advantage of additional lower-cost options including the lidar-based Leica BLK360, off-the-shelf 360 cameras and, most recently, mobile devices (including those enhanced with lidar sensors, such as the iPhone 12 Pro).
Tulsi says the key to this evolution has been the company’s growing research into creating 3D meshes from a variety of inputs.
“One of the ways we were able to do that, especially where you don’t have depth data, is by building a deep learning network as part of our Cortex AI vision pipeline so that we can infer it instead. Our AI maps trillions of depth data points, and based on that we were able to build a neural network that then allows us to infer depth and create a mesh, where nobody could have before.”
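Matterport’s Cortex AI pipeline itself is proprietary, but the geometry behind turning an inferred depth map into 3D points (the first step toward a mesh) is standard pinhole-camera math. A minimal NumPy sketch, with hypothetical camera intrinsics (`fx`, `fy`, `cx`, `cy`) chosen purely for illustration:

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (H x W, in metres) into an N x 3 point cloud
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # per-pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop pixels with no depth estimate

# Toy example: a flat wall 2 m in front of a tiny 4x4-pixel camera
depth = np.full((4, 4), 2.0)
cloud = depth_to_point_cloud(depth, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
print(cloud.shape)  # (16, 3)
```

A real pipeline would feed points like these into a surface-reconstruction step to produce the final mesh; the sketch only covers the depth-to-geometry conversion.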
Over the last few years, Matterport has also grown a capture services ecosystem that can be leveraged to quickly capture spaces, for example, at all corporate locations of a franchise or for stand-alone large projects, or even enlisted on a regular basis to plan and execute changing retail store displays and signage. Third-party integrations built for real estate and other companies are also possible via APIs and SDKs, and have allowed 3D Matterport scans of properties to be added to real estate listings on sites like Zillow.
All of this experience - and all of these scans - have contributed to Matterport’s ability to analyze those spaces and learn from them internally through their AI and deep spatial indexing, says Tulsi.
“Once we’ve actually created the space, which is part of our vision pipeline, we’re then taking that and really building out a spatial index that understands what that space is. Now that we’ve surpassed 5 million spaces scanned into the 3D data spatial library, we are really deeply understanding all the spaces, the objects within the spaces, understand the context of the overall space, as well as what type of room it might be - e.g., a conference room vs. a factory floor.”
“So we’re starting to enumerate that, measure them and then provide that as part of our deep spatial index, with a searchable capability as well.”
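Matterport has not published how its deep spatial index is implemented, but the idea Tulsi describes - enumerating rooms and objects per space and making them searchable - can be illustrated with a toy in-memory index. Everything here (class name, labels, scan IDs) is hypothetical:

```python
from collections import defaultdict

class SpatialIndex:
    """Illustrative only: a toy searchable index mapping labels
    (room types, objects) to the scanned spaces that contain them."""

    def __init__(self):
        self._by_label = defaultdict(set)

    def add_space(self, space_id, room_type, objects):
        # Index the space under its room type and every object it contains
        self._by_label[room_type].add(space_id)
        for obj in objects:
            self._by_label[obj].add(space_id)

    def search(self, *labels):
        """Return the spaces containing all of the given labels."""
        sets = [self._by_label[label] for label in labels]
        return set.intersection(*sets) if sets else set()

idx = SpatialIndex()
idx.add_space("scan-001", "conference room", ["table", "whiteboard"])
idx.add_space("scan-002", "factory floor", ["forklift", "table"])
print(idx.search("table", "whiteboard"))  # {'scan-001'}
```

At Matterport’s scale (millions of spaces), this would of course be backed by a real search infrastructure rather than an in-process dictionary, but the query model - intersecting label postings - is the same basic shape.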
A robust dataset for robotics research
While this internal research into deep spatial indexing is ongoing at Matterport, the collaboration with Facebook will allow Matterport’s data to be used in new ways - including for researchers looking into robotic navigation, personal AI-based assistance, and other embodied AI applications.
The collaboration, at its core, involves Matterport sharing the “largest-ever dataset of 3D indoor spaces” available exclusively for academic, non-commercial use. The dataset, known as the Habitat-Matterport 3D Research Dataset (HM3D), is an unprecedented collection of 1,000 high-resolution Matterport digital twins made up of residential, commercial, and civic spaces that were generated from real-world environments. By providing a baseline dataset with a variety of location types, HM3D is seen as a key step towards advancing embodied AI research - which seeks to train robots and virtual AI assistants to understand and interact with the complexities of the physical world.
While other datasets have been used to train AI for various purposes (mostly image-based), few (if any) have focused on indoor spaces. Facial recognition and object identification for autonomous vehicles have made the most news - but navigating indoor spaces autonomously is a growing need for robotics companies.
AI Habitat is Facebook AI’s state-of-the-art open simulation platform for training embodied agents for a range of tasks. It allows researchers to easily and efficiently complete a large number of repeated trials due to its faster-than-real-time speed. Thanks to this collaboration with Matterport, researchers can now gain the critical scale required to train robots and other intelligent systems on their spatial understanding.
HM3D might be a foundational step towards helping these agents navigate through real-world environments, by better understanding the variations of spaces such as bedrooms, bathrooms, kitchens and hallways, as well as the different configurations of those rooms within every structure.
Facebook’s research also aims to assist robots in recognizing how objects within rooms are typically arranged so that instructions are correctly understood. This research could one day be used in production applications like robots that can retrieve medicine from a bedroom nightstand or AR glasses that can help people remember where they left their keys.
“Until now, this rich spatial data has been glaringly absent in the field, so HM3D has the potential to change the landscape of embodied AI and 3D computer vision,” said Dhruv Batra, Research Scientist at Facebook AI Research.
“Our hope is that the 3D dataset brings researchers closer to building intelligent machines, to do for embodied AI what pioneers before us did for 2D computer vision and other areas of AI.”
The dataset is not exclusive to Facebook’s use, and has been made freely available for any researchers that want to use it, says Tulsi.
“In this particular case, it’s any or all research in the 3D space - that’s really the intention behind it. We haven’t really put any guardrails around it other than the fact that it’s for non-commercial use - it’s meant for research specifically. It is meant to be a core building block dataset, which is what folks really need in this type of research.”
Sharing scans while preserving privacy
If you’ve scanned your own spaces with Matterport’s software and are wondering whether your rooms are among the research dataset, there’s no cause for worry. Tulsi shared that the process of choosing the scans for HM3D involved direct outreach to the owners of every scan that was included.
“So they’ve given us permission, very specifically for these scans. And, secondarily, we also did do work around making the spaces privacy-enabled. We have something called a blur brush where we could do some automated blurring and manual blurring, so we’ve made our best efforts in ensuring that privacy is there. But I think the overriding thing is that each of the individual owners gave us very specific permissions to share the scans.”
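The internals of Matterport’s blur brush aren’t public, but region-based privacy blurring of the kind Tulsi describes is conceptually simple: apply a smoothing filter only inside a selected rectangle. A minimal NumPy sketch on a grayscale image (function name and parameters are illustrative, not Matterport’s API):

```python
import numpy as np

def blur_region(image, top, left, height, width, k=5):
    """Box-blur a rectangular region of a grayscale image in place,
    averaging each pixel over a k x k neighborhood (edge-padded)."""
    region = image[top:top + height, left:left + width].astype(float)
    pad = k // 2
    padded = np.pad(region, pad, mode="edge")
    out = np.zeros_like(region)
    # Sum shifted copies of the region, then divide: a simple box filter
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + region.shape[0], dx:dx + region.shape[1]]
    image[top:top + height, left:left + width] = out / (k * k)
    return image

img = np.zeros((8, 8))
img[3, 3] = 100.0  # a single identifiable "detail"
blurred = blur_region(img.copy(), top=1, left=1, height=6, width=6, k=3)
print(blurred[3, 3] < img[3, 3])  # True: the detail is smeared out
```

A production tool would work on color images and arbitrary brush strokes (and likely use a stronger filter such as a Gaussian), but the principle - irreversibly averaging away detail inside a user- or detector-selected region - is the same.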
“As a group we are very privacy-friendly, and as a company we’ve definitely taken a privacy-first stance. I also happen to be the privacy officer - so I can say that it is in my mind to make sure that’s always the case.”
While AI research may sound like distant science fiction to some, there is a crucial need for datasets to be created to enable the research that will power future AI and machine learning systems. With increasing computing power and demand for automation, research datasets like these can provide a robust foundation to “teach” the next generation of automated entities about the world around them. It will be interesting to follow the outcomes of the research that uses this dataset as the industry continues to evolve.