SPOC: Safety-Aware Planning Under Partial Observability And Physical Constraints

Authors

Hyungmin Kim, Hobeom Jeon, Dohyung Kim, Minsu Jang, Jeahong Kim

Abstract

Embodied Task Planning with large language models faces safety challenges in real-world environments, where partial observability and physical constraints must be respected. Existing benchmarks often overlook these critical factors, limiting their ability to evaluate both feasibility and safety. We introduce SPOC, a benchmark for safety-aware embodied task planning, which integrates strict partial observability, physical constraints, step-by-step planning, and goal-condition-based evaluation. Covering diverse household hazards such as fire, fluid, injury, object damage, and pollution, SPOC enables rigorous assessment through both state and constraint-based online metrics. Experiments with state-of-the-art LLMs reveal that current models struggle to ensure safety-aware planning, particularly under implicit constraints. Code and dataset are available at https://github.com/khm159/SPOC

Metadata

arXiv ID: 2602.21595
Provider: ARXIV
Primary Category: cs.RO
Published: 2026-02-25
Fetched: 2026-02-26 05:00
Comment: Accepted to IEEE ICASSP 2026