| Main Author: | |
|---|---|
| Format: | Digital resource |
| Language: | |
| Published: | Zenodo, 2026 |
| Online Access: | https://doi.org/10.5281/zenodo.19602437 |
Table of Contents:
- We propose a methodology for constructing geometric representations of specialized visual domains. The approach extends the EcoLoop synthesis framework beyond generation and evaluation to treat the domain’s embedding space as itself an object of study. Three architectural commitments distinguish the methodology from existing synthesis-and-preference-optimization pipelines. First, the primary training signal is expert-produced ranked distributions of image sets rather than pairwise preferences; listwise preference optimization, grounded in the Plackett–Luce model and the learning-to-rank tradition, is used in place of pairwise reward modeling, recovering the distance, cluster, and magnitude information that pairwise reduction discards. Second, the rank position assigned by experts is reinterpreted as a supervised scalar coordinate along which the embedding space can be introspected using the concept-direction and linear-representation tradition (TCAV, representation engineering, the linear representation hypothesis). This gives a principled way to ask whether realism is a linear direction, a nonlinear manifold, or a clustered structure in a domain-adapted representation. Third, a general inverse model is trained on the free byproducts of the primary synthesis pipeline (image–caption pairs, prompt–generation pairs, ranked sets), enabling geometric introspection by intervention in the style of causal-abstraction analyses and concept-slider interventions: a representation is perturbed along a discovered direction, inverted back to an input, and the induced change is examined. Together, these three commitments produce not only a generation system but also a representation of the domain that can itself be studied, one that supports natural extensions to video generation as latent-space trajectories and that generalizes across specialized visual domains (medical imaging, remote sensing, microscopy, materials science) in which expert perception diverges substantially from lay perception. The methodology is formulated here as a research program; experimental validation is the subject of ongoing work.
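
The listwise objective named in the abstract can be illustrated with a minimal sketch of the Plackett–Luce negative log-likelihood for one expert-ranked set. The record does not give the authors' formulation, so this is the textbook form; the function name `plackett_luce_nll` and the example scores are hypothetical, and the scores are assumed to come from some model that rates each image in the set.

```python
import math


def plackett_luce_nll(scores, ranking):
    """Negative log-likelihood of one observed ranking under Plackett-Luce.

    scores:  one real-valued model score per item (higher means preferred).
    ranking: item indices ordered from best to worst, as an expert ranked them.
    (Both names are illustrative assumptions, not taken from the source.)
    """
    # Reorder the scores so position 0 holds the expert's top choice.
    ordered = [float(scores[i]) for i in ranking]
    nll = 0.0
    for k in range(len(ordered)):
        # Items still "available" when position k is filled.
        tail = ordered[k:]
        # Numerically stable log-sum-exp over the remaining items.
        m = max(tail)
        log_norm = m + math.log(sum(math.exp(v - m) for v in tail))
        # -log P(item at position k is chosen from the remaining items).
        nll += log_norm - ordered[k]
    return nll


# Hypothetical scores for a set of four images and one expert ranking.
example_scores = [1.8, 0.2, 1.1, -0.5]
expert_ranking = [0, 2, 1, 3]
loss = plackett_luce_nll(example_scores, expert_ranking)
```

Because every position in the list contributes a term normalized over all items ranked below it, the loss depends on the whole ordering at once, which is the property the abstract contrasts with reduction to independent pairwise comparisons.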