| Main Authors: | Lin, Wei-Cheng, Lien, Chih-Ming, Lo, Chen, Yeh, Chia-Hung |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.05782 |
Similar Items
ObjectNLQ @ Ego4D Episodic Memory Challenge 2024
by: Feng, Yisen, et al.
Published: (2024)
HCQA-1.5 @ Ego4D EgoSchema Challenge 2025
by: Zhang, Haoyu, et al.
Published: (2025)
OSGNet @ Ego4D Episodic Memory Challenge 2025
by: Feng, Yisen, et al.
Published: (2025)
TextGaze: Gaze-Controllable Face Generation with Natural Language
by: Wang, Hengfei, et al.
Published: (2024)
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths
by: Chen, Xianyu, et al.
Published: (2024)
PCIE_EgoHandPose Solution for EgoExo4D Hand Pose Challenge
by: Chen, Feng, et al.
Published: (2024)
CuriosAI Submission to the EgoExo4D Proficiency Estimation Challenge 2025
by: Tanoue, Hayato, et al.
Published: (2025)
HCQA @ Ego4D EgoSchema Challenge 2024
by: Zhang, Haoyu, et al.
Published: (2024)
GazeGaussian: High-Fidelity Gaze Redirection with 3D Gaussian Splatting
by: Wei, Xiaobao, et al.
Published: (2024)
Gaze-Regularized VLMs for Ego-Centric Behavior Understanding
by: Pani, Anupam, et al.
Published: (2026)
PCIE_LAM Solution for Ego4D Looking At Me Challenge
by: Lertniphonphan, Kanokphan, et al.
Published: (2024)
PCIE_Interaction Solution for Ego4D Social Interaction Challenge
by: Lertniphonphan, Kanokphan, et al.
Published: (2025)
Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025
by: Fu, Yuqian, et al.
Published: (2025)
Technical Report for Ego4D Long-Term Action Anticipation Challenge 2025
by: Chu, Qiaohui, et al.
Published: (2025)
GTATrack: Winner Solution to SoccerTrack 2025 with Deep-EIoU and Global Tracklet Association
by: Jian, Rong-Lin, et al.
Published: (2026)
EgoCampus: Egocentric Pedestrian Eye Gaze Model and Dataset
by: John, Ronan, et al.
Published: (2025)
VL4Gaze: Unleashing Vision-Language Models for Gaze Following
by: Wang, Shijing, et al.
Published: (2025)
PCIE_Pose Solution for EgoExo4D Pose and Proficiency Estimation Challenge
by: Chen, Feng, et al.
Published: (2025)
OSGNet with MLLM Reranking @ Ego4D Episodic Memory Challenge 2026
by: Feng, Yisen, et al.
Published: (2026)
EgoTV: Egocentric Task Verification from Natural Language Task Descriptions
by: Hazra, Rishi, et al.
Published: (2023)
Divide and Conquer: Grounding a Bleeding Areas in Gastrointestinal Image with Two-Stage Model
by: Lin, Yu-Fan, et al.
Published: (2024)
Progressive Alignment with VLM-LLM Feature to Augment Defect Classification for the ASE Dataset
by: Hsu, Chih-Chung, et al.
Published: (2024)
DenseSR: Image Shadow Removal as Dense Prediction
by: Lin, Yu-Fan, et al.
Published: (2025)
WWE-UIE: A Wavelet & White Balance Efficient Network for Underwater Image Enhancement
by: Cheng, Ching-Heng, et al.
Published: (2025)
UMCL: Unimodal-generated Multimodal Contrastive Learning for Cross-compression-rate Deepfake Detection
by: Lai, Ching-Yi, et al.
Published: (2025)
MISS: Memory-efficient Instance Segmentation Framework By Visual Inductive Priors Flow Propagation
by: Hsu, Chih-Chung, et al.
Published: (2024)
DualGazeNet: A Biologically Inspired Dual-Gaze Query Network for Salient Object Detection
by: Zhang, Yu, et al.
Published: (2025)
TalkingEyes: Pluralistic Speech-Driven 3D Eye Gaze Animation
by: Zhuang, Yixiang, et al.
Published: (2025)
AntifakePrompt: Prompt-Tuned Vision-Language Models are Fake Image Detectors
by: Chang, You-Ming, et al.
Published: (2023)
VISTA: Validation-Guided Integration of Spatial and Temporal Foundation Models with Anatomical Decoding for Rare-Pathology VCE Event Detection -- after competition results
by: Qiu, Bo-Cheng, et al.
Published: (2026)
OCR is All you need: Importing Multi-Modality into Image-based Defect Detection System
by: Hsu, Chih-Chung, et al.
Published: (2024)
GPAFormer: Graph-guided Patch Aggregation Transformer for Efficient 3D Medical Image Segmentation
by: Lo, Chung-Ming, et al.
Published: (2026)
CARLOR @ Ego4D Step Grounding Challenge: Bayesian temporal-order priors for test time refinement
by: Plou, Carlos, et al.
Published: (2024)
Augment Before Copy-Paste: Data and Memory Efficiency-Oriented Instance Segmentation Framework for Sport-scenes
by: Hsu, Chih-Chung, et al.
Published: (2024)
Semantic Consistent Language Gaussian Splatting for Point-Level Open-vocabulary Querying
by: Yin, Hairong, et al.
Published: (2025)
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation
by: Yang, Chiao-An, et al.
Published: (2025)
EgoReasoner: Learning Egocentric 4D Reasoning via Task-Adaptive Structured Thinking
by: Zhu, Fangrui, et al.
Published: (2026)
4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding
by: Zhu, Wenxuan, et al.
Published: (2025)
MultiEgo: A Multi-View Egocentric Video Dataset for 4D Scene Reconstruction
by: Li, Bate, et al.
Published: (2025)
OmniEgo-R$^2$: A Routed Reasoning Framework for the 1st Cross-Domain EgoCross Challenge at CVPR 2026
by: Li, Zixu, et al.
Published: (2026)