| Main Authors: | Kerola, Tommi, Masuda, Yuya, Masuko, Takashi, Nakanishi, Toshiki, Nishino, Daisuke, Takahashi, Kuniyuki, Wang, Hanqin, Yamada, Yoshihiro |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Online Access: | https://arxiv.org/abs/2604.19324 |
Similar Items
PLaMo 2 Technical Report
by: Networks, Preferred, et al.
Published: (2025)
Xiaomi MiMo-VL-Miloco Technical Report
by: Li, Jiaze, et al.
Published: (2025)
Singpath-VL Technical Report
by: Qiu, Zhen, et al.
Published: (2026)
Kimi-VL Technical Report
by: Kimi Team, et al.
Published: (2025)
Kwai Keye-VL Technical Report
by: Kwai Keye Team, et al.
Published: (2025)
SAIL-VL2 Technical Report
by: Yin, Weijie, et al.
Published: (2025)
PLaMo-100B: A Ground-Up Language Model Designed for Japanese Proficiency
by: Elements, Preferred, et al.
Published: (2024)
Qwen3-VL Technical Report
by: Bai, Shuai, et al.
Published: (2025)
STEP3-VL-10B Technical Report
by: Huang, Ailin, et al.
Published: (2026)
Kwai Keye-VL 1.5 Technical Report
by: Yang, Biao, et al.
Published: (2025)
SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects
by: Ummadisingu, Avinash, et al.
Published: (2024)
Qwen2.5-VL Technical Report
by: Bai, Shuai, et al.
Published: (2025)
Seed1.5-VL Technical Report
by: Guo, Dong, et al.
Published: (2025)
ZAYA1-VL-8B Technical Report
by: Shapourian, Hassan, et al.
Published: (2026)
Phoenix-VL 1.5 Medium Technical Report
by: Phoenix, Team, et al.
Published: (2026)
AndesVL Technical Report: An Efficient Mobile-side Multimodal Large Language Model
by: Jin, Zhiwei, et al.
Published: (2025)
CAT: Circular-Convolutional Attention for Sub-Quadratic Transformers
by: Yamada, Yoshihiro
Published: (2025)
Person-In-Situ: Scene-Consistent Human Image Insertion with Occlusion-Aware Pose Control
by: Masuda, Shun, et al.
Published: (2025)
TerraFusion: Joint Generation of Terrain Geometry and Texture Using Latent Diffusion Models
by: Higo, Kazuki, et al.
Published: (2025)
J-EDI QA: Benchmark for deep-sea organism-specific multimodal LLM
by: Yoshida, Takero, et al.
Published: (2024)
MiMo-Embodied: X-Embodied Foundation Model Technical Report
by: Hao, Xiaoshuai, et al.
Published: (2025)
Quantifying Cancer Likeness: A Statistical Approach for Pathological Image Diagnosis
by: Kindo, Toshiki
Published: (2024)
Kelix Technical Report
by: Ding, Boyang, et al.
Published: (2026)
VEN-VL: A Visual Ensemble MoE Framework for Effective and Efficient Multi-Modal Understanding
by: Wu, Yinghao, et al.
Published: (2026)
MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks
by: Tian, Haijiang, et al.
Published: (2024)
MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training
by: Uchida, Kengo, et al.
Published: (2024)
Detection of trade in products derived from threatened species using machine learning and a smartphone
by: Kulkarni, Ritwik, et al.
Published: (2025)
StreamingClaw Technical Report
by: Chen, Jiawei, et al.
Published: (2026)
Uni-Parser Technical Report
by: Fang, Xi, et al.
Published: (2025)
Step-GUI Technical Report
by: Yan, Haolong, et al.
Published: (2025)
Logics-Parsing Technical Report
by: Chen, Xiangyang, et al.
Published: (2025)
ABot-OCR Technical Report
by: Jiang, Kaitao, et al.
Published: (2026)
NeuroClaw Technical Report
by: Wang, Cheng, et al.
Published: (2026)
Qwen-Image Technical Report
by: Wu, Chenfei, et al.
Published: (2025)
Kling-Omni Technical Report
by: Kling Team, et al.
Published: (2025)
InternVL-X: Advancing and Accelerating InternVL Series with Efficient Visual Token Compression
by: Lu, Dongchen, et al.
Published: (2025)
HeatFormer: A Neural Optimizer for Multiview Human Mesh Recovery
by: Matsubara, Yuto, et al.
Published: (2024)
Diffusion Reflectance Map: Single-Image Stochastic Inverse Rendering of Illumination and Reflectance
by: Enyo, Yuto, et al.
Published: (2023)
ARTalk: Speech-Driven 3D Head Animation via Autoregressive Model
by: Chu, Xuangeng, et al.
Published: (2025)
Time-varying rPPG signal separation via block-sparse signal model
by: Kurihara, Kosuke, et al.
Published: (2026)