| Main Authors: | Huang, Wei; Ge, Yi; Yang, Shuai; Xiao, Yicheng; Mao, Huizi; Lin, Yujun; Ye, Hanrong; Liu, Sifei; Cheung, Ka Chun; Yin, Hongxu; Lu, Yao; Qi, Xiaojuan; Han, Song; Chen, Yukang |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2510.11696 |
Similar Items
Scaling RL to Long Videos
by: Chen, Yukang, et al.
Published: (2025)
RegionGPT: Towards Region Understanding Vision Language Model
by: Guo, Qiushan, et al.
Published: (2024)
GSPN-2: Efficient Parallel Sequence Modeling
by: Wang, Hongjun, et al.
Published: (2025)
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation
by: Chen, Yukang, et al.
Published: (2026)
Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders
by: Jiang, Yitong, et al.
Published: (2026)
Parallel Sequence Modeling via Generalized Spatial Propagation Network
by: Wang, Hongjun, et al.
Published: (2025)
3D Aware Region Prompted Vision Language Model
by: Cheng, An-Chieh, et al.
Published: (2025)
VILA: On Pre-training for Visual Language Models
by: Lin, Ji, et al.
Published: (2023)
SpatialRGPT: Grounded Spatial Reasoning in Vision Language Models
by: Cheng, An-Chieh, et al.
Published: (2024)
HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments
by: He, Yongjun, et al.
Published: (2025)
Beyond Recommender: An Exploratory Study of the Effects of Different AI Roles in AI-Assisted Decision Making
by: Ma, Shuai, et al.
Published: (2024)
Enhancing Efficiency and Exploration in Reinforcement Learning for LLMs
by: Liao, Mengqi, et al.
Published: (2025)
GeoSVG-RL: Geometry-Aware Reinforcement Learning for Layout-Constrained Text-to-SVG Diagram Generation
by: Li, Sifan, et al.
Published: (2026)
GeometrySticker: Enabling Ownership Claim of Recolorized Neural Radiance Fields
by: Huang, Xiufeng, et al.
Published: (2024)
X-VILA: Cross-Modality Alignment for Large Language Model
by: Ye, Hanrong, et al.
Published: (2024)
GaussianMarker: Uncertainty-Aware Copyright Protection of 3D Gaussian Splatting
by: Huang, Xiufeng, et al.
Published: (2024)
SEED-Story: Multimodal Long Story Generation with Large Language Model
by: Yang, Shuai, et al.
Published: (2024)
CHAMP: A Competition-level Dataset for Fine-Grained Analyses of LLMs' Mathematical Reasoning Capabilities
by: Mao, Yujun, et al.
Published: (2024)
NaVILA: Legged Robot Vision-Language-Action Model for Navigation
by: Cheng, An-Chieh, et al.
Published: (2024)
From Uncertainty to Clarity: Uncertainty-Guided Class-Incremental Learning for Limited Biomedical Samples via Semantic Expansion
by: Yao, Yifei, et al.
Published: (2024)
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
by: Qian, Yusu, et al.
Published: (2024)
ImageSentinel: Protecting Visual Datasets from Unauthorized Retrieval-Augmented Image Generation
by: Luo, Ziyuan, et al.
Published: (2025)
Protecting NeRFs' Copyright via Plug-And-Play Watermarking Base Model
by: Song, Qi, et al.
Published: (2024)
Align 3D Representation and Text Embedding for 3D Content Personalization
by: Song, Qi, et al.
Published: (2025)
Stereo-GS: Multi-View Stereo Vision Model for Generalizable 3D Gaussian Splatting Reconstruction
by: Huang, Xiufeng, et al.
Published: (2025)
Geometry Cloak: Preventing TGS-based 3D Reconstruction from Copyrighted Images
by: Song, Qi, et al.
Published: (2024)
WorldMark: A Unified Benchmark Suite for Interactive Video World Models
by: Xu, Xiaojie, et al.
Published: (2026)
Low-Bit Quantization Favors Undertrained LLMs: Scaling Laws for Quantized LLMs with 100T Training Tokens
by: Ouyang, Xu, et al.
Published: (2024)
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
by: Huang, Wei, et al.
Published: (2024)
DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data
by: Ye, Hanrong, et al.
Published: (2024)
Neural Posterior Estimation for Spatial Individual-Level Epidemic Models
by: Mao, Yicheng, et al.
Published: (2026)
Towards Automating the Retrospective Generation of BIM Models: A Unified Framework for 3D Semantic Reconstruction of the Built Environment
by: Cheung, Ka Lung, et al.
Published: (2024)
ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds
by: Cheung, Ka Lung, et al.
Published: (2024)
Scaling Vision Pre-Training to 4K Resolution
by: Shi, Baifeng, et al.
Published: (2025)
Step Out and Seek Around: On Warm-Start Training with Incremental Data
by: Shen, Maying, et al.
Published: (2024)
Wukong's 72 Transformations: High-fidelity Textured 3D Morphing via Flow Models
by: Yin, Minghao, et al.
Published: (2025)
Simulated Annealing for Model-Robust Partial Profile Choice Designs in Healthcare Preference Studies
by: Mao, Yicheng, et al.
Published: (2026)
JetViT: Efficient High-Resolution Vision Transformer with Post-Training Attention Search
by: Zou, Dongyun, et al.
Published: (2026)
From Hallucination to Structure Snowballing: The Alignment Tax of Constrained Decoding in LLM Reflection
by: Zhou, Hongxu
Published: (2026)
QForce-RL: Quantized FPGA-Optimized Reinforcement Learning Compute Engine
by: Jha, Anushka, et al.
Published: (2025)