Bibliographic Details
Main Authors:	Lu, Junxin; Song, Tengfei; Wu, Zhanglin; Li, Pengfei; Liang, Xiaowei; Yang, Hui; Chen, Kun; Xie, Ning; Lu, Yunfei; Zhao, Jing; Sun, Shiliang; Wei, Daimeng
Format:	Preprint
Published:	2026
Subjects:	Computer Vision and Pattern Recognition
Online Access:	https://arxiv.org/abs/2602.21956

Similar Items

A Causality-Aware Spatiotemporal Model for Multi-Region and Multi-Pollutant Air Quality Forecasting
by: Lu, Junxin, et al.
Published: (2025)

Evaluating Menu OCR and Translation: A Benchmark for Aligning Human and Automated Evaluations in Large Vision-Language Models
by: Wu, Zhanglin, et al.
Published: (2025)

Multimodal Machine Translation with Visual Scene Graph Pruning
by: Lu, Chenyu, et al.
Published: (2025)

DIMT25@ICDAR2025: HW-TSC's End-to-End Document Image Machine Translation System Leveraging Large Vision-Language Model
by: Wu, Zhanglin, et al.
Published: (2025)

Memory Reviving, Continuing Learning and Beyond: Evaluation of Pre-trained Encoders and Decoders for Multimodal Machine Translation
by: Yu, Zhuang, et al.
Published: (2025)

Choose the Final Translation from NMT and LLM hypotheses Using MBR Decoding: HW-TSC's Submission to the WMT24 General MT Shared Task
by: Wu, Zhanglin, et al.
Published: (2024)

HW-TSC's Submission to the CCMT 2024 Machine Translation Tasks
by: Wu, Zhanglin, et al.
Published: (2024)

PromptEVC: Controllable Emotional Voice Conversion with Natural Language Prompts
by: Qi, Tianhua, et al.
Published: (2025)

R-BI: Regularized Batched Inputs enhance Incremental Decoding Framework for Low-Latency Simultaneous Speech Translation
by: Guo, Jiaxin, et al.
Published: (2024)

Combining the Best of Both Worlds: A Method for Hybrid NMT and LLM Translation
by: Wu, Zhanglin, et al.
Published: (2025)

Context-aware and Style-related Incremental Decoding framework for Discourse-Level Literary Translation
by: Luo, Yuanchang, et al.
Published: (2024)

LEMON: How Well Do MLLMs Perform Temporal Multimodal Understanding on Instructional Videos?
by: Yu, Zhuang, et al.
Published: (2026)

Machine Translation Advancements of Low-Resource Indian Languages by Transfer Learning
by: Wei, Bin, et al.
Published: (2024)

DoCIA: An Online Document-Level Context Incorporation Agent for Speech Translation
by: Lyu, Xinglin, et al.
Published: (2025)

Align-then-Slide: A complete evaluation framework for Ultra-Long Document-Level Machine Translation
by: Guo, Jiaxin, et al.
Published: (2025)

FCNR: Fast Compressive Neural Representation of Visualization Images
by: Lu, Yunfei, et al.
Published: (2024)

Text-Phase Synergy Network with Dual Priors for Unsupervised Cross-Domain Image Retrieval
by: Yang, Jing, et al.
Published: (2026)

Doc-Guided Sent2Sent++: A Sent2Sent++ Agent with Doc-Guided memory for Document-level Machine Translation
by: Guo, Jiaxin, et al.
Published: (2025)

MCAT: Scaling Many-to-Many Speech-to-Text Translation with MLLMs to 70 Languages
by: Du, Yexing, et al.
Published: (2025)

Free-MoRef: Instantly Multiplexing Context Perception Capabilities of Video-MLLMs within Single Inference
by: Wang, Kuo, et al.
Published: (2025)

Bridging Local Details and Global Context in Text-Attributed Graphs
by: Wang, Yaoke, et al.
Published: (2024)

From Pixels to Objects: A Hierarchical Approach for Part and Object Segmentation Using Local and Global Aggregation
by: Xie, Yunfei, et al.
Published: (2024)

Struct2D: A Perception-Guided Framework for Spatial Reasoning in MLLMs
by: Zhu, Fangrui, et al.
Published: (2025)

From Text to Pixel: Advancing Long-Context Understanding in MLLMs
by: Lu, Yujie, et al.
Published: (2024)

TiO2‐Engineered MOFs Activate Electron‐Rich Ni Sites for Efficient and Durable Hydrogen Production
by: Tao Liang, et al.
Published: (2026)

Enhanced Differential Evolution Based on Adaptive Mutation and Wrapper Local Search Strategies for Global Optimization Problems
by: Chun-Liang Lu
Published: (2014)

BovWGS-Pipeline
by: Gao, Junxin
Published: (2026)

Emotion Knowledge Enhancement for Vision Large Language Models: A Self-Verification Approach for High-Quality Emotion Instruction Data Generation
by: Wang, Feifan, et al.
Published: (2025)

Global-Local Stepwise Generative Network for Ultra High-Resolution Image Restoration
by: Feng, Xin, et al.
Published: (2022)

PEAN: A Diffusion-Based Prior-Enhanced Attention Network for Scene Text Image Super-Resolution
by: Zhao, Zuoyan, et al.
Published: (2023)

Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?
by: Kang, Caixin, et al.
Published: (2026)

Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement
by: Dong, Yichen, et al.
Published: (2025)

Depressive Symptoms and Cognitive Function in Older Adults with Subjective Cognitive Decline: Longitudinal Findings From 2010 to 2020
by: Jing Huang, et al.
Published: (2024)

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness
by: Tang, Yolo Y., et al.
Published: (2025)

Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding
by: Lu, Jianxiang, et al.
Published: (2024)

A Novel Paradigm Boosting Translation Capabilities of Large Language Models
by: Guo, Jiaxin, et al.
Published: (2024)

Linking Perception, Confidence and Accuracy in MLLMs
by: Du, Yuetian, et al.
Published: (2026)

Global existence, blowup phenomena, and asymptotic behavior for quasilinear Schrödinger equations
by: An, Xiaowei, et al.
Published: (2018)

Spatial Preference Rewarding for MLLMs Spatial Understanding
by: Qiu, Han, et al.
Published: (2025)

Med-Scout: Curing MLLMs' Geometric Blindness in Medical Perception via Geometry-Aware RL Post-Training
by: Liu, Anglin, et al.
Published: (2026)