Perception–Awareness–Decision (PAD)

Socially-Aware Robot Navigation and Interaction

Edison Jair Bejarano Sepulveda · Valerio Bo · Alberto Sanfeliu · Anaís Garrell
Institut de Robòtica i Informàtica Industrial (IRI), CSIC–UPC · Barcelona, Spain
Video: ./media/LBR-PAD.mp4
Scenario: Corridor-blocking


Abstract

Pipeline: SLAM + VLM + ASR + LLM
Output: Situational-awareness map

Robots working in spaces shared by people need more than geometric mapping: they must recognize people, understand social context, and decide whether to proceed or negotiate passage. We introduce a Perception–Awareness–Decision (PAD) framework that combines SLAM with Vision–Language Models (VLMs), speech recognition, and Large Language Models (LLMs) through an explicit situational-awareness representation. In a corridor-blocking task, PAD improves task success, increases safety margins, and produces behavior participants judged as more socially appropriate than a geometric baseline.

(Text adapted from the paper abstract.)

At a glance

Conditions: P0 / P1 / P2
Participants: 8

PAD separates (1) multimodal perception, (2) situation awareness, and (3) decision-making, enabling a robot to switch between safe replanning and context-grounded dialogue when navigation depends on human cooperation.

  • P0: SLAM-only baseline (no semantics, no dialogue)
  • P1: Context-aware conservative interaction
  • P2: Context-aware assertive interaction
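The three conditions above can be written down as a small configuration table. This is a minimal sketch: the names `Condition`, `DialogueStyle`, `CONDITIONS`, and `can_negotiate` are hypothetical and chosen for illustration, not taken from the paper's implementation. It only encodes what the list states: P0 has no semantics and no dialogue, while P1 and P2 are context-aware and differ in interaction style.

```python
from dataclasses import dataclass
from enum import Enum


class DialogueStyle(Enum):
    NONE = "none"
    CONSERVATIVE = "conservative"
    ASSERTIVE = "assertive"


@dataclass(frozen=True)
class Condition:
    """One experimental condition (hypothetical structure, for illustration)."""
    name: str
    uses_semantics: bool      # whether semantic context is available
    dialogue: DialogueStyle   # interaction style, NONE for the baseline


CONDITIONS = {
    "P0": Condition("P0", uses_semantics=False, dialogue=DialogueStyle.NONE),
    "P1": Condition("P1", uses_semantics=True, dialogue=DialogueStyle.CONSERVATIVE),
    "P2": Condition("P2", uses_semantics=True, dialogue=DialogueStyle.ASSERTIVE),
}


def can_negotiate(condition: Condition) -> bool:
    """Only conditions with a dialogue style can ask a person to clear the way."""
    return condition.dialogue is not DialogueStyle.NONE
```

Under this encoding, `can_negotiate(CONDITIONS["P0"])` is false, which matches the baseline's only option of replanning.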

Framework

Figure: PAD overview
Key idea: explicit awareness layer
PAD architecture. Perception fuses LiDAR/SLAM with VLM & language inputs; the situation-awareness layer builds a semantic contextual map; a meta-controller selects between navigation/control and an LLM-driven dialogue module.
Experimental setup. Corridor-blocking scenario with TIAGo++ (IVO) in an indoor environment, used to test whether the robot can resolve a navigation conflict requiring human cooperation.
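To make the meta-controller in the PAD architecture more concrete, the sketch below shows one way a controller could choose between navigation and dialogue from a summary of the awareness layer. Everything here is an assumption for illustration: the `Situation` fields, the `Action` names, and the decision rule itself are hypothetical and are not the paper's actual controller.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NAVIGATE = "navigate"   # keep following the current plan
    REPLAN = "replan"       # ask the planner for another path
    DIALOGUE = "dialogue"   # hand over to the LLM-driven dialogue module


@dataclass
class Situation:
    """Hypothetical summary of the situational-awareness map."""
    path_blocked: bool
    blocked_by_person: bool
    alternative_path_exists: bool


def select_action(s: Situation, dialogue_enabled: bool) -> Action:
    """Assumed rule: replan when geometry allows it, talk when it does not."""
    if not s.path_blocked:
        return Action.NAVIGATE
    if s.alternative_path_exists:
        return Action.REPLAN
    if s.blocked_by_person and dialogue_enabled:
        return Action.DIALOGUE
    # No semantics or dialogue available: replanning is all that is left.
    return Action.REPLAN
```

With `dialogue_enabled=False` the function can only ever return `NAVIGATE` or `REPLAN`, which mirrors a SLAM-only baseline; enabling dialogue adds the third branch for a corridor blocked by a person.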

Experiment

Task: corridor blocked by participant
Outcome: proceed vs negotiate

Each trial starts with the robot navigating toward a fixed goal. When the participant blocks the corridor, the robot either repeatedly replans (P0) or uses PAD to initiate context-grounded dialogue and then continues navigation (P1/P2).

Example trial sequence (TPA). The robot detects blockage, initiates dialogue, verifies clearance, and resumes navigation. (This sequence corresponds to the paper’s illustrated experimental trial.)
Example sequence EDA
EDA sequence. Additional example of interaction/navigation flow under PAD.
Example sequence NDA
NDA sequence. Additional example illustrating resolution and continuation to goal.
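The trial sequence described for the TPA example (detect blockage, initiate dialogue, verify clearance, resume navigation) can be read as a small state machine. The sketch below is one possible encoding under stated assumptions: the state names, the `step` and `run` helpers, and the boolean observations `blocked` and `at_goal` are all hypothetical simplifications, not the robot's actual control code.

```python
from enum import Enum, auto


class TrialState(Enum):
    NAVIGATING = auto()
    DIALOGUE = auto()
    VERIFYING = auto()
    GOAL_REACHED = auto()


def step(state: TrialState, blocked: bool, at_goal: bool) -> TrialState:
    """Advance one tick given two assumed boolean observations."""
    if state is TrialState.NAVIGATING:
        if at_goal:
            return TrialState.GOAL_REACHED
        return TrialState.DIALOGUE if blocked else TrialState.NAVIGATING
    if state is TrialState.DIALOGUE:
        # After speaking, always check whether the corridor is now clear.
        return TrialState.VERIFYING
    if state is TrialState.VERIFYING:
        # Still blocked: ask again. Clear: resume navigation.
        return TrialState.DIALOGUE if blocked else TrialState.NAVIGATING
    return state  # GOAL_REACHED is terminal


def run(observations):
    """Return the list of states visited for a sequence of (blocked, at_goal)."""
    state = TrialState.NAVIGATING
    trace = [state]
    for blocked, at_goal in observations:
        state = step(state, blocked, at_goal)
        trace.append(state)
    return trace
```

Feeding `run` the observations `[(False, False), (True, False), (True, False), (False, False), (False, True)]` walks through navigating, dialogue, verification, navigation again, and finally the goal.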

Results

Success: P1/P2 = 8/8
Baseline: P0 = 2/8

Both PAD conditions succeeded in all eight trials (8/8), against 2/8 for the SLAM-only baseline, and kept larger average stopping distances (0.83 m for P1 and 0.79 m for P2, versus 0.70 m for P0). On average, P1 finished faster with fewer dialogue turns but longer utterances, while P2 used more turns with shorter messages.

| Metric | P0 (SLAM-only) | P1 (Conservative) | P2 (Assertive) |
|---|---|---|---|
| Success rate | 2/8 | 8/8 | 8/8 |
| Re-planning attempts (avg.) | 7.25 | 4.71 | 4.29 |
| Stopping distance (m, avg.) | 0.70 | 0.83 | 0.79 |
| Trial duration (s, avg.) | 105.94 | 99.95 | 162.21 |
| Dialogue turns (avg.) | n/a | 3.4 | 6.2 |
| Tokens per turn (avg.) | n/a | 35.45 | 21.36 |

Metrics reproduced from Table 1 in the paper.
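A few derived quantities make the comparison easier to read. The sketch below recomputes them from the reported averages only; the dictionary layout and the `summarize` helper are hypothetical. Note the assumption in the last quantity: multiplying average turns by average tokens per turn only approximates the average tokens spoken per trial, since the per-trial data are not given here.

```python
# Averages copied from the metrics reported above (Table 1 in the paper).
TABLE_1 = {
    "P0": {"successes": 2, "trials": 8, "replans": 7.25, "stop_m": 0.70},
    "P1": {"successes": 8, "trials": 8, "replans": 4.71, "stop_m": 0.83,
           "turns": 3.4, "tokens_per_turn": 35.45},
    "P2": {"successes": 8, "trials": 8, "replans": 4.29, "stop_m": 0.79,
           "turns": 6.2, "tokens_per_turn": 21.36},
}


def summarize(cond: str, baseline: str = "P0") -> dict:
    """Derive simple comparisons against the baseline condition."""
    row, base = TABLE_1[cond], TABLE_1[baseline]
    out = {
        "success_rate": row["successes"] / row["trials"],
        # Fraction of baseline re-planning attempts that were avoided.
        "replan_reduction": (base["replans"] - row["replans"]) / base["replans"],
        # Additional stopping distance relative to the baseline, in metres.
        "extra_stop_m": row["stop_m"] - base["stop_m"],
    }
    if "turns" in row:
        # Approximation: product of averages, not the average of products.
        out["approx_tokens_per_trial"] = row["turns"] * row["tokens_per_turn"]
    return out


if __name__ == "__main__":
    for name in ("P1", "P2"):
        print(name, summarize(name))
```

Under these assumptions, P1 and P2 avoid roughly 35% and 41% of the baseline's re-planning attempts, and their dialogues come to roughly 121 and 132 tokens per trial, so the two styles spend a similar amount of speech overall despite the different turn counts.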

Clarity of explanation
Clarity. Participants rated PAD conditions higher in explanation clarity than the baseline.
Perceived naturalness
Naturalness. PAD interaction styles were judged more natural than SLAM-only navigation.
Comfort/safety
Comfort/Safety. Self-reported comfort/safety trends favored PAD-enabled behaviors.
Preference for real-life use
Preference. Asked which strategy they would choose for real-life use, participants tended to prefer PAD behaviors over the baseline.

Citation & links

DOI: 10.1145/3776734.3794394 (will activate after publication)
