:: Library Catalog

Bibliographic Details
Main authors:	Chae, Hyunsik; Yoon, Seungwoo; Park, Jaden; Chun, Chloe Yewon; Cho, Yongin; Cai, Mu; Lee, Yong Jae; Ryu, Ernest K.
Type:	Preprint
Published:	2025
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online access:	https://arxiv.org/abs/2505.20021

Similar items

Contamination Detection for VLMs using Multi-Modal Semantic Perturbation
by: Park, Jaden, et al.
Published: (2025)

Encryption-Friendly LLM Architecture
by: Rho, Donghwan, et al.
Published: (2024)

Visual Preference Inference: An Image Sequence-Based Preference Reasoning in Tabletop Object Manipulation
by: Lee, Joonhyung, et al.
Published: (2024)

A Real-Time Defense Against Object Vanishing Adversarial Patch Attacks for Object Detection in Autonomous Vehicles
by: Mu, Jaden
Published: (2024)

QFlash: Bridging Quantization and Memory Efficiency in Vision Transformer Attention
by: Oh, Sehyeon, et al.
Published: (2026)

The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers
by: Son, Seungwoo, et al.
Published: (2023)

Mixed Non-linear Quantization for Vision Transformers
by: Kim, Gihwan, et al.
Published: (2024)

Seeing Through Touch: Tactile-Driven Visual Localization of Material Regions
by: Kim, Seongyu, et al.
Published: (2026)

Decomposed On-Policy Distillation for Vision-Language Reasoning: Steering Gradients for Visual Grounding
by: Yoon, Hee Suk, et al.
Published: (2026)

Improving Visual Token Reduction via Rectifying Distortions for Efficient Multimodal LLM Inference
by: Cho, Hyeonwoo, et al.
Published: (2026)

Q-HyViT: Post-Training Quantization of Hybrid Vision Transformers with Bridge Block Reconstruction for IoT Systems
by: Lee, Jemin, et al.
Published: (2023)

Multi‐Level Wettability Patterned Porous Matrix for Advanced Optical Information Encryption
by: Min Ryu, et al.
Published: (2024)

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts
by: Cai, Mu, et al.
Published: (2023)

Fourier Surfaces Reaching Full‐Color Diffraction Limits
by: Yongjun Lim, et al.
Published: (2024)

Fourier Surfaces Reaching Full‐Color Diffraction Limits (Adv. Mater. 40/2024)
by: Yongjun Lim, et al.
Published: (2024)

Exploration and Exploitation Errors Are Measurable for Language Model Agents
by: Park, Jaden, et al.
Published: (2026)

Accelerated Minimax Algorithms Flock Together
by: Yoon, TaeHo, et al.
Published: (2022)

Extend3D: Town-Scale 3D Generation
by: Yoon, Seungwoo, et al.
Published: (2026)

Unveiling the Response of Large Vision-Language Models to Visually Absent Tokens
by: Kim, Sohee, et al.
Published: (2025)

Toward New Organizational Sociology of Quantification: From Interlopers to Plurality and Contestation
by: Hyunsik Chun, et al.
Published: (2025)

MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models
by: Zou, Bocheng, et al.
Published: (2026)

LaViC: Adapting Large Vision-Language Models to Visually-Aware Conversational Recommendation
by: Jeon, Hyunsik, et al.
Published: (2025)

Gradient-free Decoder Inversion in Latent Diffusion Models
by: Hong, Seongmin, et al.
Published: (2024)

CHARTOM: A Visual Theory-of-Mind Benchmark for LLMs on Misleading Charts
by: Bharti, Shubham, et al.
Published: (2024)

Optimal First-Order Algorithms as a Function of Inequalities
by: Park, Chanwoo, et al.
Published: (2021)

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos
by: Zhang, Jianrui, et al.
Published: (2024)

Modeling the Spatiotemporal Spread and Control of African Swine Fever in the Republic of Korea Using a Patch‐Based Stochastic Framework
by: Changdae Son, et al.
Published: (2026)

Assessing LLM Reasoning Steps via Principal Knowledge Grounding
by: Hwang, Hyeon, et al.
Published: (2025)

On Epistemic Uncertainty of Visual Tokens for Object Hallucinations in Large Vision-Language Models
by: Seo, Hoigi, et al.
Published: (2025)

Latency in Real-Time 3D Volumetric Streaming: A Comprehensive Study
by: Hong, Seungwoo, et al.
Published: (2026)

Exploring the Trade-Offs: Quantization Methods, Task Difficulty, and Model Size in Large Language Models From Edge to Giant
by: Lee, Jemin, et al.
Published: (2024)

Deep neural network‐based infinitesimal dipole modeling using either near or far electric‐field
by: Jae‐Yoon Park, et al.
Published: (2024)

Comprehensive Evaluation of OATP‐ and BCRP‐Mediated Drug–Drug Interactions of Methotrexate Using Physiologically‐Based Pharmacokinetic Modeling
by: Sejung Hwang, et al.
Published: (2024)

Image Clustering Conditioned on Text Criteria
by: Kwon, Sehyun, et al.
Published: (2023)

Paradigm Shift to the Cross Economy: Transforming Waste Into Innovative Material Platforms
by: Hanna Kim, et al.
Published: (2026)

Paradigm Shift to the Cross Economy: Transforming Waste Into Innovative Material Platforms (Adv. Sustainable Syst. 1/2026)
by: Hanna Kim, et al.
Published: (2026)

Yo'LLaVA: Your Personalized Language and Vision Assistant
by: Nguyen, Thao, et al.
Published: (2024)

NOVO: Bridging LLaVA and SAM with Visual-only Prompts for Reasoning Segmentation
by: Yoon, Kyung-Yoon, et al.
Published: (2025)

Correction to “NEST‐C: A deep learning compiler framework for heterogeneous computing systems with artificial intelligence accelerators”
by: Jeman Park, et al.
Published: (2024)

Video-Skill-CoT: Skill-based Chain-of-Thoughts for Domain-Adaptive Video Reasoning
by: Lee, Daeun, et al.
Published: (2025)