Library Catalog

Bibliographic Details
Main Authors:	Huang, Junchao, Wu, Xiaoqi He Yebo, Zhao, Sheng
Format:	Preprint
Published:	2023
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2307.14591

Similar Items

Pre-training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation
by: An, Xiaoqi, et al.
Published: (2024)

LTOS: Layout-controllable Text-Object Synthesis via Adaptive Cross-attention Fusions
by: Zhao, Xiaoran, et al.
Published: (2024)

MBDS: A Multi-Body Dynamics Simulation Dataset for Graph Networks Simulators
by: Yang, Sheng, et al.
Published: (2024)

Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT
by: Dong, Zhuobai, et al.
Published: (2025)

MM-UNet: Morph Mamba U-shaped Convolutional Networks for Retinal Vessel Segmentation
by: Liu, Jiawen, et al.
Published: (2025)

EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization
by: Wang, Xiaoqi, et al.
Published: (2025)

Lifting Unlabeled Internet-level Data for 3D Scene Understanding
by: Chen, Yixin, et al.
Published: (2026)

KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving
by: Lai, Zhihao, et al.
Published: (2024)

Multi-modal user interface control detection using cross-attention
by: Moradi, Milad, et al.
Published: (2026)

Stroke-based Cyclic Amplifier: Image Super-Resolution at Arbitrary Ultra-Large Scales
by: Guo, Wenhao, et al.
Published: (2025)

Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines
by: Qu, Chongyu, et al.
Published: (2025)

PBCAT: Patch-based composite adversarial training against physically realizable attacks on object detection
by: Li, Xiao, et al.
Published: (2025)

Collaboration of Teachers for Semi-supervised Object Detection
by: Chen, Liyu, et al.
Published: (2024)

FR-TTS: Test-Time Scaling for NTP-based Image Generation with Effective Filling-based Reward Signal
by: Xu, Hang, et al.
Published: (2025)

SENSE: Satellite-based ENergy Synthesis for Sustainable Environment
by: Sun, Kailai, et al.
Published: (2026)

Computer Vision based group activity detection and action spotting
by: Sivalingam, Narthana, et al.
Published: (2025)

Deep learning based detection of collateral circulation in coronary angiographies
by: Hatfaludi, Cosmin-Andrei, et al.
Published: (2024)

Multi-identity Human Image Animation with Structural Video Diffusion
by: Wang, Zhenzhi, et al.
Published: (2025)

Knowledge-based anomaly detection for identifying network-induced shape artifacts
by: Deshpande, Rucha, et al.
Published: (2025)

A benchmark dataset for deep learning-based airplane detection: HRPlanes
by: Bakirman, Tolga, et al.
Published: (2022)

OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding
by: Huang, Wenyuan, et al.
Published: (2025)

Topology-Driven Transferability Estimation of Medical Foundation Models for Segmentation
by: Tang, Jiaqi, et al.
Published: (2026)

MoTiC: Momentum Tightness and Contrast for Few-Shot Class-Incremental Learning
by: He, Zeyu, et al.
Published: (2025)

Robotic Visual Instruction
by: Li, Yanbang, et al.
Published: (2025)

DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
by: Ye, Fulong, et al.
Published: (2025)

Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach
by: Zong, Chen-Chen, et al.
Published: (2025)

SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
by: Huang, Zhenglin, et al.
Published: (2024)

Computer vision-based model for detecting turning lane features on Florida's public roadways
by: Antwi, Richard Boadu, et al.
Published: (2024)

Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization
by: Zhang, Xiang, et al.
Published: (2025)

PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
by: Gong, Junchao, et al.
Published: (2024)

AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
by: Ye, Yilin, et al.
Published: (2025)

Research on target detection method of distracted driving behavior based on improved YOLOv8
by: Shen, Shiquan, et al.
Published: (2024)

PathReasoning: A multimodal reasoning agent for query-based ROI navigation on whole-slide images
by: Zhang, Kunpeng, et al.
Published: (2025)

A computer vision-based model for occupancy detection using low-resolution thermal images
by: Cui, Xue, et al.
Published: (2025)

Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input
by: Liu, Jiajun, et al.
Published: (2024)

Structured Click Control in Transformer-based Interactive Segmentation
by: Xu, Long, et al.
Published: (2024)

IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
by: Huang, Jiacui, et al.
Published: (2024)

RT-DEMT: A hybrid real-time acupoint detection model combining mamba and transformer
by: Yang, Shilong, et al.
Published: (2025)

ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning
by: Zhao, Yuan, et al.
Published: (2026)

Deep learning-based automated damage detection in concrete structures using images from earthquake events
by: Turer, Abdullah, et al.
Published: (2025)