

Bibliographic Details
Main Author:	Dubey, Mradul
Format:	Digital resource
Language:
Published:	Zenodo 2026
Subjects:	anchoring-bias vision-language-models vlm object-detection yolov8 visual-reasoning prompt-engineering computer-vision
Online Access:	https://doi.org/10.5281/zenodo.19557723

Summary:

<p>Adding structured detection metadata to vision-language model prompts systematically degrades visual reasoning through anchoring bias, and the delivery channel determines the magnitude. Across seven controlled conditions on a surveillance scene, text-encoded bounding boxes dropped visual reasoning to 53%, visual overlays preserved 69%, and cross-modal ID-mapping collapsed to 47%, despite having a smaller text-to-image token ratio. Plausibly positioned fabricated detections pass unchallenged, and the cost of metadata to visual perception is monotonic in the scene-description case. This repository contains all raw prompts, model responses, scoring rubrics, and reproducibility artifacts for the study.</p>
