| Main Authors: | Luo, Yun, Yang, Zhen, Meng, Fandong, Li, Yingjie, Guo, Fang, Qi, Qinglin, Zhou, Jie, Zhang, Yue |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2310.05502 |
Similar Items
An Empirical Study of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
by: Luo, Yun, et al.
Published: (2023)
PerSphere: A Comprehensive Framework for Multi-Faceted Perspective Retrieval and Summarization
by: Luo, Yun, et al.
Published: (2024)
EAG: Extract and Generate Multi-way Aligned Corpus for Complete Multi-lingual Neural Machine Translation
by: Xu, Yulin, et al.
Published: (2022)
TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models
by: Yang, Zhen, et al.
Published: (2023)
DeepTrans: Deep Reasoning Translation via Reinforcement Learning
by: Wang, Jiaan, et al.
Published: (2025)
ExTrans: Multilingual Deep Reasoning Translation via Exemplar-Enhanced Reinforcement Learning
by: Wang, Jiaan, et al.
Published: (2025)
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
by: Liu, Yijin, et al.
Published: (2024)
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation
by: Liang, Yunlong, et al.
Published: (2025)
Figure It Out: Improve the Frontier of Reasoning with Executable Visual States
by: Chen, Meiqi, et al.
Published: (2025)
LongDPO: Unlock Better Long-form Generation Abilities for LLMs via Critique-augmented Stepwise Information
by: Ping, Bowen, et al.
Published: (2025)
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers
by: Hu, Yong, et al.
Published: (2022)
Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts
by: Zhang, Xue, et al.
Published: (2025)
Beyond Next Token Prediction: Patch-Level Training for Large Language Models
by: Shao, Chenze, et al.
Published: (2024)
Supervised Knowledge Makes Large Language Models Better In-context Learners
by: Yang, Linyi, et al.
Published: (2023)
C-LLM: Learn to Check Chinese Spelling Errors Character by Character
by: Li, Kunting, et al.
Published: (2024)
TIM: Teaching Large Language Models to Translate with Comparison
by: Zeng, Jiali, et al.
Published: (2023)
Improving Machine Translation with Large Language Models: A Preliminary Study with Cooperative Decoding
by: Zeng, Jiali, et al.
Published: (2023)
Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective
by: Shao, Chenze, et al.
Published: (2024)
SlangDIT: Benchmarking LLMs in Interpretative Slang Translation
by: Liang, Yunlong, et al.
Published: (2025)
Retrieval-Augmented Machine Translation with Unstructured Knowledge
by: Wang, Jiaan, et al.
Published: (2024)
Language Generation with Strictly Proper Scoring Rules
by: Shao, Chenze, et al.
Published: (2024)
DRT: Deep Reasoning Translation via Long Chain-of-Thought
by: Wang, Jiaan, et al.
Published: (2024)
Large Language Models Are Not Robust Multiple Choice Selectors
by: Zheng, Chujie, et al.
Published: (2023)
MiniPLM: Knowledge Distillation for Pre-Training Language Models
by: Gu, Yuxian, et al.
Published: (2024)
Continuous Autoregressive Language Models
by: Shao, Chenze, et al.
Published: (2025)
CRAT: A Multi-Agent Framework for Causality-Enhanced Reflective and Retrieval-Augmented Translation with Large Language Models
by: Chen, Meiqi, et al.
Published: (2024)
On the token distance modeling ability of higher RoPE attention dimension
by: Hong, Xiangyu, et al.
Published: (2024)
LLaVE: Large Language and Vision Embedding Models with Hardness-Weighted Contrastive Learning
by: Lan, Zhibin, et al.
Published: (2025)
Making Language Model a Hierarchical Classifier
by: Wang, Yihong, et al.
Published: (2025)
General learned delegation by clones
by: Li, Darren, et al.
Published: (2026)
Making Language Models Better Tool Learners with Execution Feedback
by: Qiao, Shuofei, et al.
Published: (2023)
Task Calibration: Calibrating Large Language Models on Inference Tasks
by: Li, Yingjie, et al.
Published: (2024)
Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space
by: Ma, Zhengrui, et al.
Published: (2025)
Make LVLMs Focus: Context-Aware Attention Modulation for Better Multimodal In-Context Learning
by: Li, Yanshu, et al.
Published: (2025)
Warmup-Distill: Bridge the Distribution Mismatch between Teacher and Student before Knowledge Distillation
by: Sun, Zengkui, et al.
Published: (2025)
Enhancing Cross-Tokenizer Knowledge Distillation with Contextual Dynamical Mapping
by: Chen, Yijie, et al.
Published: (2025)
Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words
by: Chen, Yijie, et al.
Published: (2024)
Comments as Natural Logic Pivots: Improve Code Generation via Comment Perspective
by: Chen, Yijie, et al.
Published: (2024)
On Large Language Models' Hallucination with Regard to Known Facts
by: Jiang, Che, et al.
Published: (2024)
What Makes Diffusion Language Models Super Data Learners?
by: Gao, Zitian, et al.
Published: (2025)