OceanGym: A Benchmark Environment for Underwater Embodied Agents

Update: 2025-10-02
Description

🤗 Upvotes: 30 | cs.CL, cs.AI, cs.CV, cs.LG, cs.RO



Authors:

Yida Xue, Mingjun Mao, Xiangyuan Ru, Yuqi Zhu, Baochang Ren, Shuofei Qiao, Mengru Wang, Shumin Deng, Xinyu An, Ningyu Zhang, Ying Chen, Huajun Chen



Title:

OceanGym: A Benchmark Environment for Underwater Embodied Agents



arXiv:

http://arxiv.org/abs/2509.26536v1



Abstract:

We introduce OceanGym, the first comprehensive benchmark for ocean underwater embodied agents, designed to advance AI in one of the most demanding real-world environments. Unlike terrestrial or aerial domains, underwater settings present extreme perceptual and decision-making challenges, including low visibility and dynamic ocean currents, making effective agent deployment exceptionally difficult. OceanGym encompasses eight realistic task domains and a unified agent framework driven by Multi-modal Large Language Models (MLLMs), which integrates perception, memory, and sequential decision-making. Agents are required to comprehend optical and sonar data, autonomously explore complex environments, and accomplish long-horizon objectives under these harsh conditions. Extensive experiments reveal substantial gaps between state-of-the-art MLLM-driven agents and human experts, highlighting the persistent difficulty of perception, planning, and adaptability in ocean underwater environments. By providing a high-fidelity, rigorously designed platform, OceanGym establishes a testbed for developing robust embodied AI and transferring these capabilities to real-world autonomous ocean underwater vehicles, marking a decisive step toward intelligent agents capable of operating in one of Earth's last unexplored frontiers. The code and data are available at https://github.com/OceanGPT/OceanGym.


Jingwen Liang, Gengyu Wang