| Main Authors: | Yasunaga, Ayaka, Saito, Hideo, Mori, Shohei |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Online Access: | https://arxiv.org/abs/2506.21009 |
Similar Items
IntelliCap: Intelligent Guidance for Consistent View Sampling
by: Yasunaga, Ayaka, et al.
Published: (2025)
Dense Depth from Event Focal Stack
by: Horikawa, Kenta, et al.
Published: (2024)
High-Quality Virtual Single-Viewpoint Surgical Video: Geometric Autocalibration of Multiple Cameras in Surgical Lights
by: Kato, Yuna, et al.
Published: (2025)
Disturbance-Free Surgical Video Generation from Multi-Camera Shadowless Lamps for Open Surgery
by: Kato, Yuna, et al.
Published: (2025)
Profile-Specific 3DMM Regression from a Single Lateral Face Image
by: Kanaya, Taiki, et al.
Published: (2026)
Enhancing Visual Prompting through Expanded Transformation Space and Overfitting Mitigation
by: Enomoto, Shohei
Published: (2025)
Hand Held Multi-Object Tracking Dataset in American Football
by: Otsubo, Rintaro, et al.
Published: (2025)
RealTraj: Towards Real-World Pedestrian Trajectory Forecasting
by: Fujii, Ryo, et al.
Published: (2024)
EMAG: Ego-motion Aware and Generalizable 2D Hand Forecasting from Egocentric Videos
by: Hatano, Masashi, et al.
Published: (2024)
RatBodyFormer: Rat Body Surface from Keypoints
by: Higami, Ayaka, et al.
Published: (2024)
The Invisible EgoHand: 3D Hand Forecasting through EgoBody Pose Estimation
by: Hatano, Masashi, et al.
Published: (2025)
Ev4DGS: Novel-view Rendering of Non-Rigid Objects from Monocular Event Streams
by: Nakabayashi, Takuya, et al.
Published: (2025)
Human Preference-Aligned Concept Customization Benchmark via Decomposed Evaluation
by: Ishikawa, Reina, et al.
Published: (2025)
Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition
by: Hatano, Masashi, et al.
Published: (2024)
E2GS: Event Enhanced Gaussian Splatting
by: Deguchi, Hiroyuki, et al.
Published: (2024)
Weakly Semi-supervised Tool Detection in Minimally Invasive Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)
VIOLA: Towards Video In-Context Learning with Minimal Annotations
by: Fujii, Ryo, et al.
Published: (2026)
Déjà View: Looping Transformers for Multi-View 3D Reconstruction
by: Burzio, Alessandro, et al.
Published: (2026)
LoopSparseGS: Loop Based Sparse-View Friendly Gaussian Splatting
by: Bao, Zhenyu, et al.
Published: (2024)
SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images
by: Shinoda, Risa, et al.
Published: (2024)
LoopViT: Scaling Visual ARC with Looped Transformers
by: Shu, Wen-Jie, et al.
Published: (2026)
CrowdMAC: Masked Crowd Density Completion for Robust Crowd Density Forecasting
by: Fujii, Ryo, et al.
Published: (2024)
V-Loop: Visual Logical Loop Verification for Hallucination Detection in Medical Visual Question Answering
by: Jin, Mengyuan, et al.
Published: (2026)
Towards Predicting Any Human Trajectory In Context
by: Fujii, Ryo, et al.
Published: (2025)
Leveraging LLMs with Iterative Loop Structure for Enhanced Social Intelligence in Video Question Answering
by: Mori, Erika, et al.
Published: (2025)
View Transformation Robustness for Multi-View 3D Object Reconstruction with Reconstruction Error-Guided View Selection
by: Zhang, Qi, et al.
Published: (2024)
EgoSurgery-Tool: A Dataset of Surgical Tool and Hand Detection from Egocentric Open Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)
HalDec-Bench: Benchmarking Hallucination Detector in Image Captioning
by: Saito, Kuniaki, et al.
Published: (2025)
HalDec-Bench: Benchmarking Hallucination Detector in Image Captioning
by: Saito, Kuniaki, et al.
Published: (2026)
Interactive Garment Recommendation with User in the Loop
by: Becattini, Federico, et al.
Published: (2024)
Map-Mono-Ego: Map-Grounded Global Human Pose Estimation from Monocular Egocentric Video
by: Deguchi, Hiroyuki, et al.
Published: (2026)
Multimodal RewardBench: Holistic Evaluation of Reward Models for Vision Language Models
by: Yasunaga, Michihiro, et al.
Published: (2025)
ELT: Elastic Looped Transformers for Visual Generation
by: Goyal, Sahil, et al.
Published: (2026)
Ctrl123: Consistent Novel View Synthesis via Closed-Loop Transcription
by: Zhao, Hongxiang, et al.
Published: (2024)
Prime and Reach: Synthesising Body Motion for Gaze-Primed Object Reach
by: Hatano, Masashi, et al.
Published: (2025)
Pseudo Multi-Source Domain Generalization: Bridging the Gap Between Single and Multi-Source Domain Generalization
by: Enomoto, Shohei
Published: (2025)
A Robust Error-Resistant View Selection Method for 3D Reconstruction
by: Zhang, Shaojie, et al.
Published: (2024)
Human-in-the-Loop Visual Re-ID for Population Size Estimation
by: Perez, Gustavo, et al.
Published: (2023)
EgoSurgery-HTS: A Dataset for Egocentric Hand-Tool Segmentation in Open Surgery Videos
by: Darjana, Nathan, et al.
Published: (2025)
EgoSurgery-Phase: A Dataset of Surgical Phase Recognition from Egocentric Open Surgery Videos
by: Fujii, Ryo, et al.
Published: (2024)