| Main Authors: | Xu, Ke, Lian, Zhongyuan |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.27629 |
Similar Items
WaferLLM: Large Language Model Inference at Wafer Scale
by: He, Congjie, et al.
Published: (2025)
Observational and Experimental Insights into Machine Learning-Based Defect Classification in Wafers
by: Taha, Kamal
Published: (2023)
DarwinWafer: A Wafer-Scale Neuromorphic Chip
by: Zhu, Xiaolei, et al.
Published: (2025)
Optimsyn: Influence-Guided Rubrics Optimization for Synthetic Data Generation
by: Fan, Zhiting, et al.
Published: (2026)
FabGPT: An Efficient Large Multimodal Model for Complex Wafer Defect Knowledge Queries
by: Jiang, Yuqi, et al.
Published: (2024)
Configurable Preference Tuning with Rubric-Guided Synthetic Data
by: Gallego, Víctor
Published: (2025)
Deep Learning-based Multi Project InP Wafer Simulation for Unsupervised Surface Defect Detection
by: Cantú, Emílio Dolgener, et al.
Published: (2025)
RubricBench: Aligning Model-Generated Rubrics with Human Standards
by: Zhang, Qiyuan, et al.
Published: (2026)
Prognostics and Health Management of Wafer Chemical-Mechanical Polishing System using Autoencoder
by: Lim, Kart-Leong, et al.
Published: (2025)
Reinforcement Learning with Rubric Anchors
by: Huang, Zenan, et al.
Published: (2025)
Reinforcement Learning with Robust Rubric Rewards
by: Yu, Ya-Qi, et al.
Published: (2026)
Reward Hacking in Rubric-Based Reinforcement Learning
by: Mahmoud, Anas, et al.
Published: (2026)
Wafer Map Defect Classification Using Autoencoder-Based Data Augmentation and Convolutional Neural Network
by: Bao, Yin-Yin, et al.
Published: (2024)
SynAdapt: Learning Adaptive Reasoning in Large Language Models via Synthetic Continuous Chain-of-Thought
by: Wang, Jianwei, et al.
Published: (2025)
SCALER: Synthetic Scalable Adaptive Learning Environment for Reasoning
by: Xu, Caijun, et al.
Published: (2026)
RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning
by: Huang, Tzu-Heng, et al.
Published: (2026)
SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback
by: Xu, Fangyuan, et al.
Published: (2026)
Large Language Model for Verilog Generation with Code-Structure-Guided Reinforcement Learning
by: Wang, Ning, et al.
Published: (2024)
Synthetic Defect Data Generation Using Deep Learning Architecture for Improved Wafer Inspection Performance
by: Kumar, Roopesh
Published: (2026)
Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning
by: Si, Shuzheng, et al.
Published: (2025)
AMARIS: A Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning
by: Wu, Peilin, et al.
Published: (2026)
SynLLM: A Comparative Analysis of Large Language Models for Medical Tabular Synthetic Data Generation via Prompt Engineering
by: Ilaty, Arshia, et al.
Published: (2025)
Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
by: Zhou, Yang, et al.
Published: (2025)
Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning
by: Wan, Xu, et al.
Published: (2026)
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation
by: Li, Sunzhu, et al.
Published: (2026)
Rubric-Guided Process Reward for Stepwise Model Routing
by: Ye, Shenghao, et al.
Published: (2026)
Wafer-Level Etch Spatial Profiling for Process Monitoring from Time-Series with Time-LLM
by: Kim, Hyunwoo, et al.
Published: (2026)
Self-Preference Bias in Rubric-Based Evaluation of Large Language Models
by: Pombal, José, et al.
Published: (2026)
SAGE-RT: Synthetic Alignment data Generation for Safety Evaluation and Red Teaming
by: Kumar, Anurakt, et al.
Published: (2024)
CommonIT: Commonality-Aware Instruction Tuning for Large Language Models via Data Partitions
by: Rao, Jun, et al.
Published: (2024)
Auto-Rubric: Learning From Implicit Weights to Explicit Rubrics for Reward Modeling
by: Xie, Lipeng, et al.
Published: (2025)
Reinforcement-Guided Synthetic Data Generation for Privacy-Sensitive Identity Recognition
by: Jia, Xuemei, et al.
Published: (2026)
Synthetic Data Generation for Phrase Break Prediction with Large Language Model
by: Lee, Hoyeon, et al.
Published: (2025)
Generating Data-Driven Reasoning Rubrics for Domain-Adaptive Reward Modeling
by: Sanders, Kate, et al.
Published: (2026)
Rubrics as Rewards: Reinforcement Learning Beyond Verifiable Domains
by: Gunjal, Anisha, et al.
Published: (2025)
Synthesize-on-Graph: Knowledgeable Synthetic Data Generation for Continue Pre-training of Large Language Models
by: Ma, Shengjie, et al.
Published: (2025)
Reinforcement Learning and Data-Generation for Syntax-Guided Synthesis
by: Parsert, Julian, et al.
Published: (2023)
Alternating Reinforcement Learning with Contextual Rubric Rewards: Beyond the Scalarization Strategy
by: Lan, Guangchen, et al.
Published: (2026)
Efficient and Stable Reinforcement Learning for Diffusion Language Models
by: Liu, Jiawei, et al.
Published: (2026)
Piculet: Specialized Models-Guided Hallucination Decrease for MultiModal Large Language Models
by: Wang, Kohou, et al.
Published: (2024)