SpatialCorpus · I/O browser

Input

image raw path

question raw

options raw

question_with_extra raw

extra_info (provenance / answer_type / task) derived

format note LLaVA conversation

Output

ground-truth answer raw

expected response format conventional

Varies by task: option letter (mcq), yes/no (judgement), scalar+unit (depth/size), or structured coords (3D detection).

evaluation from paper

Training target (SFT) — the gpt turn value, used directly as the supervision label.

Raw record from SpatialCorpus jsons/ · LLaVA entry (mapped)