Apple’s machine learning team, together with researchers from Nanjing University and the Hong Kong University of Science and Technology, has developed a new 3D AI model called Matrix3D. This model can reconstruct 3D objects and scenes from as few as three 2D photographs, representing a significant step forward in the field of photogrammetry.
A Simplified Approach to 3D Reconstruction
Matrix3D belongs to a class of systems known as Large Photogrammetry Models. Photogrammetry uses photographs to measure objects and build accurate 3D models or maps. Traditional pipelines chain together separate models for tasks such as pose estimation and depth prediction, which adds complexity and can introduce inaccuracies into the workflow.
What sets Matrix3D apart is that a single architecture handles the entire process. It works with input images, camera parameters (such as viewing angle and focal length), and depth data together rather than in separate stages. This unified approach streamlines reconstruction and improves the precision and quality of the 3D output.
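To see why images, camera parameters, and depth belong together, it helps to look at how they combine geometrically. The sketch below uses the standard pinhole camera model, not Matrix3D's own code, to turn a pixel with a known depth into a 3D point. The function names, the single yaw angle, and the example numbers are illustrative assumptions.

```python
import math

# Illustrative pinhole-camera sketch (an assumption, not Matrix3D's code).
# focal_px is the focal length in pixels; (cx, cy) is the image center.

def unproject(u, v, depth, focal_px, cx, cy):
    """Map pixel (u, v) with depth along the optical axis to camera-space (x, y, z)."""
    x = (u - cx) * depth / focal_px
    y = (v - cy) * depth / focal_px
    return (x, y, depth)

def rotate_yaw(point, yaw_rad):
    """Rotate a point about the vertical (y) axis by the camera's yaw angle."""
    x, y, z = point
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return (c * x + s * z, y, -s * x + c * z)

def camera_to_world(point, yaw_rad, position):
    """Apply the camera's rotation, then its position, to get world coordinates."""
    rx, ry, rz = rotate_yaw(point, yaw_rad)
    px, py, pz = position
    return (rx + px, ry + py, rz + pz)

if __name__ == "__main__":
    # A pixel 500 px right of center, 2 m away, seen by a camera with a 500 px focal length.
    p_cam = unproject(820, 240, 2.0, focal_px=500, cx=320, cy=240)
    print(camera_to_world(p_cam, yaw_rad=0.0, position=(0.0, 0.0, 0.0)))
```

An error in any one of the three inputs (pixel, camera parameters, depth) shifts the resulting 3D point, which is one reason estimating them in separate stages can compound inaccuracies.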
Matrix3D was trained using a masked learning strategy. In this method, parts of the training data are deliberately hidden, and the model must infer the missing information from what remains visible. Because the model learns to fill gaps, training can proceed efficiently even on limited or incomplete datasets.
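The masking idea can be illustrated with a toy example. The sketch below is a minimal illustration under stated assumptions, not the researchers' training code: each view's image, pose, and depth is reduced to a single number, and the modality names, the `mask_prob` parameter, and the squared-error loss are all hypothetical choices made for clarity.

```python
import random

# Hypothetical modality names; each entry stands in for real image/pose/depth data.
MODALITIES = ("image", "pose", "depth")

def sample_mask(num_views, mask_prob, rng):
    """Return {modality: [bool per view]}; True means the entry is hidden."""
    mask = {m: [rng.random() < mask_prob for _ in range(num_views)]
            for m in MODALITIES}
    # Keep at least one image visible so there is something to condition on.
    if all(mask["image"]):
        mask["image"][rng.randrange(num_views)] = False
    return mask

def apply_mask(sample, mask):
    """Replace hidden entries with None and leave visible ones untouched."""
    return {m: [None if hidden else value
                for value, hidden in zip(sample[m], mask[m])]
            for m in MODALITIES}

def masked_loss(prediction, target, mask):
    """Mean squared error computed over the hidden entries only."""
    errors = [(prediction[m][i] - target[m][i]) ** 2
              for m in MODALITIES
              for i, hidden in enumerate(mask[m]) if hidden]
    return sum(errors) / len(errors) if errors else 0.0

if __name__ == "__main__":
    rng = random.Random(0)
    sample = {"image": [0.1, 0.2, 0.3],
              "pose": [10.0, 20.0, 30.0],
              "depth": [1.5, 2.5, 3.5]}
    mask = sample_mask(num_views=3, mask_prob=0.5, rng=rng)
    print(apply_mask(sample, mask))  # what a model would see during one training step
```

Scoring only the hidden entries is what pushes a model to infer missing data from visible data. The same mechanism lets a sample that lacks some entries still contribute to training, since those entries can simply be treated as masked.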
Potential Applications and Accessibility
With only three input images, Matrix3D can generate detailed 3D reconstructions of objects and entire environments. This opens up potential applications in areas such as augmented and virtual reality, including integration with immersive headsets like Apple’s Vision Pro.
The researchers have made the source code for Matrix3D publicly available on GitHub. In addition, a dedicated website has been launched where users can explore the capabilities of the model in more depth. These resources make it easier for developers and researchers to experiment with or build on the technology.
We’ll keep you updated as more integrations and use cases emerge around this promising development in 3D AI modeling.