| Main Authors: | Wang, Peiyu, Peng, Yi, Gan, Yimeng, Hu, Liang, Xie, Tianyidan, Wang, Xiaokun, Wei, Yichen, Tang, Chuanxin, Zhu, Bo, Li, Changshi, Wei, Hongyang, Li, Eric, Song, Xuchen, Liu, Yang, Zhou, Yahui |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2508.03320 |
Similar Items
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model
by: Wei, Hongyang, et al.
Published: (2025)
Skywork UniPic 3.0: Unified Multi-Image Composition via Sequence Modeling
by: Wei, Hongyang, et al.
Published: (2026)
Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
by: Wang, Xiaokun, et al.
Published: (2025)
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning
by: Wang, Peiyu, et al.
Published: (2025)
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought
by: Peng, Yi, et al.
Published: (2025)
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch
by: Zhang, Yifan, et al.
Published: (2025)
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
by: Zeng, Liang, et al.
Published: (2025)
Skywork-R1V3 Technical Report
by: Shen, Wei, et al.
Published: (2025)
LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models
by: Zhao, Liang, et al.
Published: (2024)
CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs
by: Jian, Ai, et al.
Published: (2025)
Skywork Open Reasoner 1 Technical Report
by: He, Jujie, et al.
Published: (2025)
Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models
by: Wei, Tianwen, et al.
Published: (2024)
UniX: Unifying Autoregression and Diffusion for Chest X-Ray Understanding and Generation
by: Zhang, Ruiheng, et al.
Published: (2026)
UniShield: Unified Face Attack Detection via KG-Informed Multimodal Reasoning
by: Li, Hongrui, et al.
Published: (2026)
UniVL: Unified Vision-Language Embedding for Spatially Grounded Contextual Image Generation
by: Wang, Jiayun, et al.
Published: (2026)
Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
by: Liu, Chris Yuhao, et al.
Published: (2024)
UniVBench: Towards Unified Evaluation for Video Foundation Models
by: Wei, Jianhui, et al.
Published: (2026)
UniModel: A Visual-Only Framework for Unified Multimodal Understanding and Generation
by: Zhang, Chi, et al.
Published: (2025)
Autoregressive Semantic Visual Reconstruction Helps VLMs Understand Better
by: Wang, Dianyi, et al.
Published: (2025)
UniEval: Unified Holistic Evaluation for Unified Multimodal Understanding and Generation
by: Li, Yi, et al.
Published: (2025)
UniVideo: Unified Understanding, Generation, and Editing for Videos
by: Wei, Cong, et al.
Published: (2025)
Unified Medical Image Tokenizer for Autoregressive Synthesis and Understanding
by: Ma, Chenglong, et al.
Published: (2025)
UniT: Unified Geometry Learning with Group Autoregressive Transformer
by: Wang, Haotian, et al.
Published: (2026)
UniVoice: Unifying Autoregressive ASR and Flow-Matching based TTS with Large Language Models
by: Guan, Wenhao, et al.
Published: (2025)
Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy
by: Liu, Chris Yuhao, et al.
Published: (2025)
Unified Autoregressive Visual Generation and Understanding with Continuous Tokens
by: Fan, Lijie, et al.
Published: (2025)
UniARM: Towards a Unified Autoregressive Reward Model for Multi-Objective Test-Time Alignment
by: Xie, Hongyan, et al.
Published: (2026)
UniFlow: A Unified Pixel Flow Tokenizer for Visual Understanding and Generation
by: Yue, Zhengrong, et al.
Published: (2025)
Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models -- The Story Goes On
by: Zeng, Liang, et al.
Published: (2024)
OpenUni: A Simple Baseline for Unified Multimodal Understanding and Generation
by: Wu, Size, et al.
Published: (2025)
UniLiP: Adapting CLIP for Unified Multimodal Understanding, Generation and Editing
by: Tang, Hao, et al.
Published: (2025)
UniECG: Understanding and Generating ECG in One Unified Model
by: Jin, Jiarui, et al.
Published: (2025)
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models
by: Wei, Hongyang, et al.
Published: (2025)
PhysCodeBench: Benchmarking Physics-Aware Symbolic Simulation of 3D Scenes via Self-Corrective Multi-Agent Refinement
by: Xie, Tianyidan, et al.
Published: (2026)
UniWorld-V1: High-Resolution Semantic Encoders for Unified Visual Understanding and Generation
by: Lin, Bin, et al.
Published: (2025)
UniQueR: Unified Query-based Feedforward 3D Reconstruction
by: Peng, Chensheng, et al.
Published: (2026)
UniMo: Unified Motion Generation and Understanding with Chain of Thought
by: Wang, Guocun, et al.
Published: (2026)
UniCompress: Token Compression for Unified Vision-Language Understanding and Generation
by: Wang, Ziyao, et al.
Published: (2026)
UniTok: A Unified Tokenizer for Visual Generation and Understanding
by: Ma, Chuofan, et al.
Published: (2025)
UniFormer: Unifying Convolution and Self-attention for Visual Recognition
by: Li, Kunchang, et al.
Published: (2022)