| Main authors: | Zhou, Chenyu; Shi, Xiaoming; Qiu, Hui; Zheng, Xiawu; Leng, Haitao; Jiang, Yankai; Liu, Shaoguo; Gao, Tingting; Ji, Rongrong |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online access: | https://arxiv.org/abs/2509.23836 |
Similar items
KwaiChat: A Large-Scale Video-Driven Multilingual Mixed-Type Dialogue Corpus
by: Shi, Xiaoming, et al.
Published: (2025)
OMPQ: Orthogonal Mixed Precision Quantization
by: Ma, Yuexiao, et al.
Published: (2021)
STAMPsy: Towards SpatioTemporal-Aware Mixed-Type Dialogues for Psychological Counseling
by: Wang, Jieyi, et al.
Published: (2024)
Towards Efficient Automatic Self-Pruning of Large Language Models
by: Huang, Weizhong, et al.
Published: (2025)
ContextQFormer: A New Context Modeling Method for Multi-Turn Multi-Modal Conversations
by: Lei, Yiming, et al.
Published: (2025)
SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding
by: Zhang, Chenkai, et al.
Published: (2025)
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art
by: Lei, Yiming, et al.
Published: (2025)
Benchmarking Abstract and Reasoning Abilities Through A Theoretical Perspective
by: Ma, Qingchuan, et al.
Published: (2025)
OxyEcomBench: Benchmarking Multimodal Foundation Models across E-Commerce Ecosystems
by: Liu, Yong, et al.
Published: (2026)
First‐Order Mixed Autoregressive Model for Bivariate Mixed Time Series
by: Weiyang Yu, et al.
Published: (2025)
VEGA: Learning Interleaved Image-Text Comprehension in Vision-Language Large Models
by: Zhou, Chenyu, et al.
Published: (2024)
Determining Layer-wise Sparsity for Large Language Models Through a Theoretical Perspective
by: Huang, Weizhong, et al.
Published: (2025)
EcomBench: Towards Holistic Evaluation of Foundation Agents in E-commerce
by: Min, Rui, et al.
Published: (2025)
HASTE: Training-Free Video Diffusion Acceleration via Head-Wise Adaptive Sparse Attention
by: Zheng, Xuzhe, et al.
Published: (2026)
An Efficient and Mixed Heterogeneous Model for Image Restoration
by: Gu, Yubin, et al.
Published: (2025)
SpecEyes: Accelerating Agentic Multimodal LLMs via Speculative Perception and Planning
by: Huang, Haoyu, et al.
Published: (2026)
Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models
by: Luo, Gen, et al.
Published: (2024)
Depth-Guided Semi-Supervised Instance Segmentation
by: Chen, Xin, et al.
Published: (2024)
ALGOGEN: Tool-Generated Verifiable Traces for Reliable Algorithm Visualization
by: Liao, Kunpeng, et al.
Published: (2026)
A2RBench: An Automatic Paradigm for Formally Verifiable Abstract Reasoning Benchmark Generation
by: Ma, Qingchuan, et al.
Published: (2026)
Multi-branch Collaborative Learning Network for 3D Visual Grounding
by: Qian, Zhipeng, et al.
Published: (2024)
From Objects to Events: Unlocking Complex Visual Understanding in Object Detectors via LLM-guided Symbolic Reasoning
by: Zeng, Yuhui, et al.
Published: (2025)
EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs
by: Guo, Song, et al.
Published: (2024)
Polybasic Speculative Decoding Through a Theoretical Perspective
by: Wang, Ruilin, et al.
Published: (2025)
Dynamic Low-Rank Sparse Adaptation for Large Language Models
by: Huang, Weizhong, et al.
Published: (2025)
Enhancing Supervised Composed Image Retrieval via Reasoning-Augmented Representation Engineering
by: Li, Jun, et al.
Published: (2025)
Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization
by: Luo, Yongdong, et al.
Published: (2024)
Mixed Degradation Image Restoration via Local Dynamic Optimization and Conditional Embedding
by: Gu, Yubin, et al.
Published: (2024)
UI-AGILE: Advancing GUI Agents with Effective Reinforcement Learning and Precise Inference-Time Grounding
by: Lian, Shuquan, et al.
Published: (2025)
Training-Free Multimodal Large Language Model Orchestration
by: Xie, Tianyu, et al.
Published: (2025)
Distilling Rule-based Knowledge into Large Language Models
by: Yang, Wenkai, et al.
Published: (2023)
Unleashing the Power of Intermediate Domains for Mixed Domain Semi-Supervised Medical Image Segmentation
by: Ma, Qinghe, et al.
Published: (2025)
Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation
by: Ma, Qinghe, et al.
Published: (2024)
Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation
by: Cheng, Zesen, et al.
Published: (2024)
Few-Shot Image Quality Assessment via Adaptation of Vision-Language Models
by: Li, Xudong, et al.
Published: (2024)
Event-Anchored Frame Selection for Effective Long-Video Understanding
by: Chen, Wang, et al.
Published: (2026)
Uncovering the Over-smoothing Challenge in Image Super-Resolution: Entropy-based Quantification and Contrastive Optimization
by: Xu, Tianshuo, et al.
Published: (2022)
Linear Discriminant Analysis with High-dimensional Mixed Variables
by: Jiang, Binyan, et al.
Published: (2021)
BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning
by: Zheng, Hongwei, et al.
Published: (2024)
Q-DeepSight: Incentivizing Thinking with Images for Image Quality Assessment and Refinement
by: Li, Xudong, et al.
Published: (2026)