SpatialScore · I/O browser

Input

image · raw path

question · raw

options · raw

question_with_extra (final prompt to model) · raw

extra_info (auxiliary info: bbox coords, etc.) · raw

system suffix (assistant_prompt) · from test_qwen.py

Output

ground-truth answer · raw

expected response format · conventional

A single letter (A, B, C, or D) corresponding to one of the options above.

evaluation · from paper

Multiple-choice accuracy · correct iff predicted letter == ground-truth letter
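
The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the benchmark's own evaluation script: the function names `extract_choice` and `multiple_choice_accuracy` are hypothetical, and the letter-extraction heuristic (accept a leading A to D, optionally parenthesized, followed by whitespace, punctuation, or end of string) is an assumption about how a free-form model response might be normalized before comparison.

```python
import re

# Hypothetical helper: pull a leading option letter out of a model response.
# Accepts forms such as "B", "b.", "(C) the left one"; rejects "Because ...".
_CHOICE_RE = re.compile(r"\s*\(?([A-Da-d])\)?(?:[\s.:,]|$)")


def extract_choice(response):
    """Return the leading option letter (upper-cased), or None if absent."""
    match = _CHOICE_RE.match(response)
    return match.group(1).upper() if match else None


def multiple_choice_accuracy(responses, ground_truths):
    """Fraction of responses whose extracted letter equals the ground truth."""
    if len(responses) != len(ground_truths):
        raise ValueError("responses and ground_truths must have equal length")
    if not responses:
        return 0.0
    correct = sum(
        extract_choice(resp) == truth.strip().upper()
        for resp, truth in zip(responses, ground_truths)
    )
    return correct / len(responses)
```

A response with no recognizable letter is counted as incorrect, which matches the "correct iff predicted letter == ground-truth letter" rule. Note that the heuristic would still read a response beginning with the article "A " as choice A, so stricter parsing may be needed for verbose outputs.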

Raw record from SpatialScore_benchmark.ndjson · entry as-is
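
NDJSON stores one JSON object per line, so a record like the one shown here can be read by parsing the file line by line. The sketch below is illustrative only: the key names mirror the field labels on this page (`question`, `options`, `question_with_extra`, `extra_info`), while `answer` and all sample values are made up for the example and may not match the actual file.

```python
import json


def parse_ndjson(text):
    """Parse newline-delimited JSON: one object per non-empty line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # tolerate blank lines between records
            records.append(json.loads(line))
    return records


# Made-up sample records; keys are assumptions based on the labels above.
sample = (
    '{"question": "Which object is closer?", "options": ["A. cup", "B. lamp"], '
    '"question_with_extra": "Which object is closer? ...", "extra_info": {}, '
    '"answer": "A"}\n'
    '\n'
    '{"question": "Which is taller?", "options": ["A. tree", "B. pole"], '
    '"question_with_extra": "Which is taller? ...", "extra_info": {}, '
    '"answer": "B"}\n'
)

records = parse_ndjson(sample)
```

For a file on disk, the same function can be applied to the result of reading `SpatialScore_benchmark.ndjson` as UTF-8 text.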