Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation

Update: 2024-12-20
Description

🤗 Upvotes: 10 | cs.CV



Authors:

Haotong Lin, Sida Peng, Jingxiao Chen, Songyou Peng, Jiaming Sun, Minghuan Liu, Hujun Bao, Jiashi Feng, Xiaowei Zhou, Bingyi Kang



Arxiv:

http://arxiv.org/abs/2412.14015v1



Abstract:

Prompts play a critical role in unleashing the power of language and vision foundation models for specific tasks. For the first time, we introduce prompting into depth foundation models, creating a new paradigm for metric depth estimation termed Prompt Depth Anything. Specifically, we use a low-cost LiDAR as the prompt to guide the Depth Anything model for accurate metric depth output, achieving up to 4K resolution. Our approach centers on a concise prompt fusion design that integrates the LiDAR at multiple scales within the depth decoder. To address training challenges posed by limited datasets containing both LiDAR depth and precise ground-truth (GT) depth, we propose a scalable data pipeline that includes LiDAR simulation for synthetic data and pseudo GT depth generation for real data. Our approach sets a new state of the art on the ARKitScenes and ScanNet++ datasets and benefits downstream applications, including 3D reconstruction and generalized robotic grasping.
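To make two ideas from the abstract more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that a low-cost LiDAR reading can be roughly mimicked by block-averaging a dense depth map and adding Gaussian noise, and that "multiple scales" means the prompt is resized to each decoder resolution before being fused. The function names (`simulate_lowres_lidar`, `resize_nearest`, `prompt_pyramid`) and all parameters are hypothetical; the fusion step itself, which in the paper happens inside the depth decoder, is not modeled here.

```python
import numpy as np

def simulate_lowres_lidar(depth, factor=8, noise_std=0.0, seed=0):
    """Assumed stand-in for LiDAR simulation: block-average, then add noise."""
    h, w = depth.shape
    h2, w2 = h // factor, w // factor
    # Crop so both dimensions divide evenly by the downsampling factor.
    cropped = depth[:h2 * factor, :w2 * factor]
    low = cropped.reshape(h2, factor, w2, factor).mean(axis=(1, 3))
    rng = np.random.default_rng(seed)
    return low + rng.normal(0.0, noise_std, size=low.shape)

def resize_nearest(prompt, out_h, out_w):
    """Nearest-neighbor resize of a 2D array to (out_h, out_w)."""
    rows = np.arange(out_h) * prompt.shape[0] // out_h
    cols = np.arange(out_w) * prompt.shape[1] // out_w
    return prompt[rows[:, None], cols[None, :]]

def prompt_pyramid(prompt, scales):
    """Resize one low-resolution depth prompt to each (height, width) in scales."""
    return [resize_nearest(prompt, h, w) for (h, w) in scales]

# Usage: a flat wall 2 m away, seen by a simulated 8x8 sensor.
dense = np.full((64, 64), 2.0)
lidar = simulate_lowres_lidar(dense, factor=8)
pyramid = prompt_pyramid(lidar, [(16, 16), (32, 32)])
```

Each entry of `pyramid` has the spatial size of one hypothetical decoder stage, so a model could combine it with the features at that stage. Nearest-neighbor resizing is used only to keep the sketch dependency-free.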

Jingwen Liang, Gengyu Wang