Library Catalog

Bibliographic Details
Main Authors:	Chowdhury, Sanjoy; Yang, Karren D.; Liu, Xudong; Faghri, Fartash; Vasu, Pavan Kumar Anasosalu; Tuzel, Oncel; Manocha, Dinesh; Li, Chun-Liang; Vemulapalli, Raviteja
Format:	Preprint
Published:	2025
Subjects:	Artificial Intelligence; Multiagent Systems
Online Access:	https://arxiv.org/abs/2512.16250

Similar Items

MobileCLIP: Fast Image-Text Models through Multi-Modal Reinforced Training
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2023)

CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)

SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding
by: Wang, Haoxiang, et al.
Published: (2023)

FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations
by: Hsieh, Cheng-Yu, et al.
Published: (2025)

MobileCLIP2: Improving Multi-Modal Reinforced Training
by: Faghri, Fartash, et al.
Published: (2025)

Knowledge Transfer from Vision Foundation Models for Efficient Training of Small Task-specific Models
by: Vemulapalli, Raviteja, et al.
Published: (2023)

MUSCLE: A Model Update Strategy for Compatible LLM Evolution
by: Echterhoff, Jessica, et al.
Published: (2024)

VSAS-Bench: Real-Time Evaluation of Visual Streaming Assistant Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2026)

TiC-CLIP: Continual Training of CLIP Models
by: Garg, Saurabh, et al.
Published: (2023)

Proxy-FDA: Proxy-based Feature Distribution Alignment for Fine-tuning Vision Foundation Models without Forgetting
by: Huang, Chen, et al.
Published: (2025)

TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining
by: Li, Jeffrey, et al.
Published: (2025)

FastVLM: Efficient Vision Encoding for Vision Language Models
by: Vasu, Pavan Kumar Anasosalu, et al.
Published: (2024)

Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum
by: Pouransari, Hadi, et al.
Published: (2024)

GAMEOPT+: Improving Fuel Efficiency in Unregulated Heterogeneous Traffic Intersections via Optimal Multi-agent Cooperative Control
by: Suriyarachchi, Nilesh, et al.
Published: (2024)

Uncovering the Representation Geometry of Minimal Cores in Overcomplete Reasoning Traces
by: Chowdhury, Sanjoy, et al.
Published: (2026)

CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
by: Mehta, Sachin, et al.
Published: (2024)

Learning from Self Critique and Refinement for Faithful LLM Summarization
by: Hu, Ting-Yao, et al.
Published: (2025)

Graph-Based Captioning: Enhancing Visual Descriptions by Interconnecting Region Captions
by: Hsieh, Yu-Guan, et al.
Published: (2024)

AgentWebBench: Benchmarking Multi-Agent Coordination in Agentic Web
by: Zhong, Shanshan, et al.
Published: (2026)

Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning
by: Yu, Peihong, et al.
Published: (2024)

ClawMobile: Rethinking Smartphone-Native Agentic Systems
by: Du, Hongchao, et al.
Published: (2026)

CREW-WILDFIRE: Benchmarking Agentic Multi-Agent Collaborations at Scale
by: Hyun, Jonathan, et al.
Published: (2025)

Agentization of Digital Assets for the Agentic Web: Concepts, Techniques, and Benchmark
by: Chen, Linyao, et al.
Published: (2026)

Beyond Black-Box Benchmarking: Observability, Analytics, and Optimization of Agentic Systems
by: Moshkovich, Dany, et al.
Published: (2025)

Can LLMs Generate Human-Like Wayfinding Instructions? Towards Platform-Agnostic Embodied Instruction Synthesis
by: Dorbala, Vishnu Sashank, et al.
Published: (2024)

Meerkat: Audio-Visual Large Language Model for Grounding in Space and Time
by: Chowdhury, Sanjoy, et al.
Published: (2024)

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs
by: Chowdhury, Sanjoy, et al.
Published: (2025)

LVRPO: Language-Visual Alignment with GRPO for Multimodal Understanding and Generation
by: Mo, Shentong, et al.
Published: (2026)

Multi-Agent Game Generation and Evaluation via Audio-Visual Recordings
by: Jolicoeur-Martineau, Alexia
Published: (2025)

ATOD: An Evaluation Framework and Benchmark for Agentic Task-Oriented Dialogue Systems
by: Zhang, Yifei, et al.
Published: (2026)

Multi-Agent Medical Decision Consensus Matrix System: An Intelligent Collaborative Framework for Oncology MDT Consultations
by: Han, Xudong, et al.
Published: (2025)

LongCLI-Bench: A Preliminary Benchmark and Study for Long-horizon Agentic Programming in Command-Line Interfaces
by: Feng, Yukang, et al.
Published: (2026)

Benchmarking Agentic Workflow Generation
by: Qiao, Shuofei, et al.
Published: (2024)

Aurelia: Test-time Reasoning Distillation in Audio-Visual LLMs
by: Chowdhury, Sanjoy, et al.
Published: (2025)

Multi-Agentic Approach for History Matching of Oil Reservoirs
by: Samigullin, Linar, et al.
Published: (2026)

RobustFlow: Towards Robust Agentic Workflow Generation
by: Xu, Shengxiang, et al.
Published: (2025)

CARES: Collaborative Agentic Reasoning for Error Detection in Surgery
by: Low, Chang Han, et al.
Published: (2025)

Towards Understanding, Analyzing, and Optimizing Agentic AI Execution: A CPU-Centric Perspective
by: Raj, Ritik, et al.
Published: (2025)

SV-LLM: An Agentic Approach for SoC Security Verification using Large Language Models
by: Saha, Dipayan, et al.
Published: (2025)

Agentic SPARQL: Evaluating SPARQL-MCP-powered Intelligent Agents on the Federated KGQA Benchmark
by: Dobriy, Daniel, et al.
Published: (2026)