Staff View: :: Library Catalog

Bibliographic Details
Main Authors:	Khan, Ufaq; Teja, L. D. M. S. Sai; Shakiru, Ayuba; Shaaban, Mai A.; Xie, Yutong; Bilal, Muhammad; Khan, Muhammad Haris
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2603.19364

_version_	1866908901819023360
author	Khan, Ufaq; Teja, L. D. M. S. Sai; Shakiru, Ayuba; Shaaban, Mai A.; Xie, Yutong; Bilal, Muhammad; Khan, Muhammad Haris
author_facet	Khan, Ufaq; Teja, L. D. M. S. Sai; Shakiru, Ayuba; Shaaban, Mai A.; Xie, Yutong; Bilal, Muhammad; Khan, Muhammad Haris
contents	Ultrasound images vary widely across scanners, operators, and anatomical targets, which often causes models trained in one setting to generalize poorly to new hospitals and clinical conditions. The Foundation Model Challenge for Ultrasound Image Analysis (FMC-UIA) reflects this difficulty by requiring a single model to handle multiple tasks, including segmentation, detection, classification, and landmark regression across diverse organs and datasets. We propose a unified multi-task framework based on a transformer visual encoder from the Qwen3-VL family. Intermediate token features are projected into spatial feature maps and fused using a lightweight multi-scale feature pyramid, enabling both pixel-level predictions and global reasoning within a shared representation. Each task is handled by a small task-specific prediction head, while training uses task-aware sampling and selective loss balancing to manage heterogeneous supervision and reduce task imbalance. Our method is designed to be simple to optimize and adaptable across a wide range of ultrasound analysis tasks. Performance on the validation set improved from 67% to 85%, and the method achieved an average score of 81.84% across all tasks on the official test set. The code is publicly available at: https://github.com/saitejalekkala33/FMCUIA-ISBI.git
format	Preprint
id	arxiv_https___arxiv_org_abs_2603_19364
institution	arXiv
publishDate	2026
record_format	arxiv
title	AURORA: Adaptive Unified Representation for Robust Ultrasound Analysis
topic	Computer Vision and Pattern Recognition
url	https://arxiv.org/abs/2603.19364
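
Note on the abstract: it mentions task-aware sampling to reduce task imbalance but does not state the sampling rule. The sketch below illustrates one common possibility, power-scaled (temperature-style) sampling, in which each task is drawn with probability proportional to its dataset size raised to an exponent `alpha`. The rule itself, the function names, the value of `alpha`, and the dataset sizes are all assumptions made for illustration; they are not taken from the paper or its repository.

```python
import random


def task_sampling_weights(dataset_sizes, alpha=0.5):
    """Return per-task sampling probabilities proportional to size ** alpha.

    alpha = 1.0 samples tasks in proportion to their dataset size,
    alpha = 0.0 samples all tasks uniformly, and values in between
    shift probability mass toward the smaller tasks.
    (Assumed rule; the abstract does not specify one.)
    """
    if not dataset_sizes:
        raise ValueError("dataset_sizes must not be empty")
    scaled = {task: size ** alpha for task, size in dataset_sizes.items()}
    total = sum(scaled.values())
    return {task: value / total for task, value in scaled.items()}


def sample_task_schedule(dataset_sizes, num_steps, alpha=0.5, seed=0):
    """Draw the task to train on at each step, using the weights above."""
    weights = task_sampling_weights(dataset_sizes, alpha)
    tasks = list(weights)
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    return rng.choices(tasks, weights=[weights[t] for t in tasks], k=num_steps)


# Hypothetical dataset sizes, chosen only to show an imbalanced setting.
sizes = {
    "segmentation": 8000,
    "detection": 2000,
    "classification": 4000,
    "landmark_regression": 500,
}

probs = task_sampling_weights(sizes, alpha=0.5)
schedule = sample_task_schedule(sizes, num_steps=1000, alpha=0.5, seed=0)
```

With `alpha` below 1, the smallest task (here the hypothetical `landmark_regression` set) is drawn more often than its raw share of the data would suggest, which is one way a sampler can counteract task imbalance.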
