:: Library Catalog

Bibliographic Details
Main Authors:	Yang, Jia-Qi, Dai, Chenglei, OU, Dan, Li, Dongshuai, Huang, Ju, Zhan, De-Chuan, Zeng, Xiaoyi, Yang, Yang
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Machine Learning
Online Access:	https://arxiv.org/abs/2306.05001

Similar Items

Revisiting Content-Based Music Recommendation: Efficient Feature Aggregation from Large-Scale Music Models
by: Zhou, Yizhi, et al.
Published: (2026)

Guiding Diffusion-based Reconstruction with Contrastive Signals for Balanced Visual Representation
by: Han, Boyu, et al.
Published: (2026)

Predicting User Grasp Intentions in Virtual Reality
by: Zeng, Linghao
Published: (2025)

Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens
by: Huang, Ting-Ji, et al.
Published: (2024)

Learning Group Interactions and Semantic Intentions for Multi-Object Trajectory Prediction
by: Qi, Mengshi, et al.
Published: (2024)

PM25Vision: A Large-Scale Benchmark Dataset for Visual Estimation of Air Quality
by: Han, Yang
Published: (2025)

Enhancing Bandit Algorithms with LLMs for Time-varying User Preferences in Streaming Recommendations
by: Shen, Chenglei, et al.
Published: (2026)

EfficientGS: Streamlining Gaussian Splatting for Large-Scale High-Resolution Scene Representation
by: Liu, Wenkai, et al.
Published: (2024)

Mutual Information guided Visual Contrastive Learning
by: Chen, Hanyang, et al.
Published: (2025)

CACE-Net: Co-guidance Attention and Contrastive Enhancement for Effective Audio-Visual Event Localization
by: He, Xiang, et al.
Published: (2024)

TV100: A TV Series Dataset that Pre-Trained CLIP Has Not Seen
by: Zhou, Da-Wei, et al.
Published: (2024)

Pyramid Feature Attention Network for Monocular Depth Prediction
by: Xu, Yifang, et al.
Published: (2024)

CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs
by: Huang, Xiaoyi, et al.
Published: (2026)

eMotions: A Large-Scale Dataset and Audio-Visual Fusion Network for Emotion Analysis in Short-form Videos
by: Wu, Xuecheng, et al.
Published: (2025)

3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for Indoor 3D Object Detection
by: Cao, Yang, et al.
Published: (2024)

Polaris: Scaling Up Instruction-Guided Image Generation Towards Millions of Personalized Style Needs
by: Chen, Zhi-Kai, et al.
Published: (2026)

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
by: Si, Chenglei, et al.
Published: (2024)

3CAD: A Large-Scale Real-World 3C Product Dataset for Unsupervised Anomaly
by: Yang, Enquan, et al.
Published: (2025)

Design2Code: Benchmarking Multimodal Code Generation for Automated Front-End Engineering
by: Si, Chenglei, et al.
Published: (2024)

Free Geometry: Refining 3D Reconstruction from Longer Versions of Itself
by: Dai, Yuhang, et al.
Published: (2026)

PromptForge-350k: A Large-Scale Dataset and Contrastive Framework for Prompt-Based AI Image Forgery Localization
by: Wang, Jianpeng, et al.
Published: (2026)

Fitting Different Interactive Information: Joint Classification of Emotion and Intention
by: Li, Xinger, et al.
Published: (2025)

EMIE-MAP: Large-Scale Road Surface Reconstruction Based on Explicit Mesh and Implicit Encoding
by: Wu, Wenhua, et al.
Published: (2024)

VIAssist: Adapting Multi-modal Large Language Models for Users with Visual Impairments
by: Yang, Bufang, et al.
Published: (2024)

Understanding Bias in Large-Scale Visual Datasets
by: Zeng, Boya, et al.
Published: (2024)

MosaicDoc: A Large-Scale Bilingual Benchmark for Visually Rich Document Understanding
by: Chen, Ketong, et al.
Published: (2025)

PhysPatch: A Physically Realizable and Transferable Adversarial Patch Attack for Multimodal Large Language Models-based Autonomous Driving Systems
by: Guo, Qi, et al.
Published: (2025)

Contrastive Conditional Alignment based on Label Shift Calibration for Imbalanced Domain Adaptation
by: Sun, Xiaona, et al.
Published: (2024)

Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
by: Lyu, Xiaoyang, et al.
Published: (2024)

65 YEARS OF THE MILITARY TECHNICAL COURIER – ACKNOWLEDGMENTS
by: Nebojša N. Gaćeša
Published: (2017)

Jointly Understand Your Command and Intention: Reciprocal Co-Evolution between Scene-Aware 3D Human Motion Synthesis and Analysis
by: Gao, Xuehao, et al.
Published: (2025)

VRU-CIPI: Crossing Intention Prediction at Intersections for Improving Vulnerable Road Users Safety
by: Abdelrahman, Ahmed S., et al.
Published: (2025)

CityGaussianV2: Efficient and Geometrically Accurate Reconstruction for Large-Scale Scenes
by: Liu, Yang, et al.
Published: (2024)

Adaptive Adapter Routing for Long-Tailed Class-Incremental Learning
by: Qi, Zhi-Hong, et al.
Published: (2024)

AD3: Implicit Action is the Key for World Models to Distinguish the Diverse Visual Distractors
by: Wang, Yucen, et al.
Published: (2024)

Efficient Large Multi-modal Models via Visual Context Compression
by: Chen, Jieneng, et al.
Published: (2024)

Oblique-MERF: Revisiting and Improving MERF for Oblique Photography
by: Zeng, Xiaoyi, et al.
Published: (2024)

Leveraging Cross-Modal Neighbor Representation for Improved CLIP Classification
by: Yi, Chao, et al.
Published: (2024)

Understanding the Implicit User Intention via Reasoning with Large Language Model for Image Editing
by: Wang, Yijia, et al.
Published: (2025)

Task-Agnostic Guided Feature Expansion for Class-Incremental Learning
by: Zheng, Bowen, et al.
Published: (2025)