Paper
SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints
Authors
Hyungmin Kim, Hobeom Jeon, Dohyung Kim, Minsu Jang, Jeahong Kim
Abstract
Embodied Task Planning with large language models faces safety challenges in real-world environments, where partial observability and physical constraints must be respected. Existing benchmarks often overlook these critical factors, limiting their ability to evaluate both feasibility and safety. We introduce SPOC, a benchmark for safety-aware embodied task planning, which integrates strict partial observability, physical constraints, step-by-step planning, and goal-condition-based evaluation. Covering diverse household hazards such as fire, fluid, injury, object damage, and pollution, SPOC enables rigorous assessment through both state and constraint-based online metrics. Experiments with state-of-the-art LLMs reveal that current models struggle to ensure safety-aware planning, particularly under implicit constraints. Code and dataset are available at https://github.com/khm159/SPOC
Metadata
Related papers
Vibe Coding XR: Accelerating AI + XR Prototyping with XR Blocks and Gemini
Ruofei Du, Benjamin Hersh, David Li, Nels Numan, Xun Qian, Yanhe Chen, Zhongy... • 2026-03-25
Comparing Developer and LLM Biases in Code Evaluation
Aditya Mittal, Ryan Shar, Zichu Wu, Shyam Agarwal, Tongshuang Wu, Chris Donah... • 2026-03-25
The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence
Biplab Pal, Santanu Bhattacharya • 2026-03-25
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA
Saahil Mathur, Ryan David Rittner, Vedant Ajit Thakur, Daniel Stuart Schiff, ... • 2026-03-25
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination
Zhuo Li, Yupeng Zhang, Pengyu Cheng, Jiajun Song, Mengyu Zhou, Hao Li, Shujie... • 2026-03-25
Raw Data (Debug)
{
"raw_xml": "<entry>\n <id>http://arxiv.org/abs/2602.21595v1</id>\n <title>SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints</title>\n <updated>2026-02-25T05:44:21Z</updated>\n <link href='https://arxiv.org/abs/2602.21595v1' rel='alternate' type='text/html'/>\n <link href='https://arxiv.org/pdf/2602.21595v1' rel='related' title='pdf' type='application/pdf'/>\n <summary>Embodied Task Planning with large language models faces safety challenges in real-world environments, where partial observability and physical constraints must be respected. Existing benchmarks often overlook these critical factors, limiting their ability to evaluate both feasibility and safety. We introduce SPOC, a benchmark for safety-aware embodied task planning, which integrates strict partial observability, physical constraints, step-by-step planning, and goal-condition-based evaluation. Covering diverse household hazards such as fire, fluid, injury, object damage, and pollution, SPOC enables rigorous assessment through both state and constraint-based online metrics. Experiments with state-of-the-art LLMs reveal that current models struggle to ensure safety-aware planning, particularly under implicit constraints. Code and dataset are available at https://github.com/khm159/SPOC</summary>\n <category scheme='http://arxiv.org/schemas/atom' term='cs.RO'/>\n <published>2026-02-25T05:44:21Z</published>\n <arxiv:comment>Accepted to IEEE ICASSP 2026</arxiv:comment>\n <arxiv:primary_category term='cs.RO'/>\n <author>\n <name>Hyungmin Kim</name>\n </author>\n <author>\n <name>Hobeom Jeon</name>\n </author>\n <author>\n <name>Dohyung Kim</name>\n </author>\n <author>\n <name>Minsu Jang</name>\n </author>\n <author>\n <name>Jeahong Kim</name>\n </author>\n </entry>"
}