HunyuanWorld-Mirror: Universal 3D World Reconstruction with Any-Prior Prompting (github.com)

🤖 AI Summary
Tencent released HunyuanWorld-Mirror, a feed-forward universal 3D reconstruction model, with inference code and weights (plus Gradio and Hugging Face demos). The model implements "Any-Prior Prompting": lightweight encoders convert any subset of calibrated intrinsics, camera poses, and depth maps into structured tokens, conditioning a single network that simultaneously predicts point clouds, multi-view depths, surface normals, camera intrinsics/poses, and 3D Gaussian splats (means, scales, opacities, quaternions, spherical harmonics). Outputs include per-view confidence maps and formats for downstream use (COLMAP export, gsplat rendering/optimization). The repo provides PyTorch 2.4/CUDA 12.4 install instructions, inference examples, and utilities to save and visualize results.

Significance: HunyuanWorld-Mirror unifies multiple 3D perception tasks into one fast forward pass and consistently improves accuracy by incorporating available priors, making it a practical alternative to slower optimization-based methods (e.g., NeRF pipelines) for tasks like novel-view synthesis and point-cloud reconstruction. On benchmarks (7-Scenes, NRGBD, DTU, Re10K, DL3DV) it achieves state-of-the-art feed-forward performance—e.g., higher PSNR/SSIM and lower LPIPS for view synthesis and reduced point-cloud error—and improves further when intrinsics, poses, or depths are provided.

The release includes full inference code, model checkpoints, evaluation scripts, and instructions for converting outputs to 3D Gaussian Splatting workflows for high-quality rendering and optimization.
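To make the "Any-Prior Prompting" idea concrete, here is a minimal, self-contained sketch of how an arbitrary subset of priors could be encoded into a token sequence that conditions a single network. This is purely illustrative: the projection matrices, token dimension, and function names are hypothetical and do not reflect the actual HunyuanWorld-Mirror API; the point is that missing priors simply contribute no tokens, so one model handles any combination.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 16  # token dimension (illustrative choice, not from the repo)

# Hypothetical learned projections, one per prior type (random stand-ins here).
proj = {
    "intrinsics": rng.normal(size=(9, D)),   # flattened 3x3 calibration matrix K
    "pose":       rng.normal(size=(16, D)),  # flattened 4x4 camera pose
    "depth":      rng.normal(size=(64, D)),  # 8x8 downsampled depth map
}

def priors_to_tokens(priors):
    """Encode whatever subset of priors is available into structured tokens.

    Each available prior is flattened and projected to one token; absent
    priors are skipped, so the downstream network can be conditioned on
    any combination (none, one, or all) without changing its interface.
    """
    tokens = []
    for name, value in priors.items():
        flat = np.asarray(value, dtype=float).ravel()
        tokens.append(flat @ proj[name])  # one D-dim token per prior type
    return np.stack(tokens) if tokens else np.zeros((0, D))

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0,   0.0,   1.0]])
pose = np.eye(4)
depth = rng.random((8, 8))

# Any subset works: intrinsics only, pose+depth, or no priors at all.
print(priors_to_tokens({"intrinsics": K}).shape)               # (1, 16)
print(priors_to_tokens({"pose": pose, "depth": depth}).shape)  # (2, 16)
print(priors_to_tokens({}).shape)                              # (0, 16)
```

In the real model the token sequence would be consumed by the transformer backbone alongside image tokens; here the sketch only shows the variable-length conditioning mechanism that lets one forward pass exploit whichever priors happen to be available.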