| Main Authors: | Huang, Junchao, Wu, Xiaoqi He Yebo, Zhao, Sheng |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Online Access: | https://arxiv.org/abs/2307.14591 |
Similar Items
Pre-training a Density-Aware Pose Transformer for Robust LiDAR-based 3D Human Pose Estimation
by: An, Xiaoqi, et al.
Published: (2024)
LTOS: Layout-controllable Text-Object Synthesis via Adaptive Cross-attention Fusions
by: Zhao, Xiaoran, et al.
Published: (2024)
MBDS: A Multi-Body Dynamics Simulation Dataset for Graph Networks Simulators
by: Yang, Sheng, et al.
Published: (2024)
Seeing is Not Reasoning: MVPBench for Graph-based Evaluation of Multi-path Visual Physical CoT
by: Dong, Zhuobai, et al.
Published: (2025)
MM-UNet: Morph Mamba U-shaped Convolutional Networks for Retinal Vessel Segmentation
by: Liu, Jiawen, et al.
Published: (2025)
EVA02-AT: Egocentric Video-Language Understanding with Spatial-Temporal Rotary Positional Embeddings and Symmetric Optimization
by: Wang, Xiaoqi, et al.
Published: (2025)
Lifting Unlabeled Internet-level Data for 3D Scene Understanding
by: Chen, Yixin, et al.
Published: (2026)
KAN-RCBEVDepth: A multi-modal fusion algorithm in object detection for autonomous driving
by: Lai, Zhihao, et al.
Published: (2024)
Multi-modal user interface control detection using cross-attention
by: Moradi, Milad, et al.
Published: (2026)
Stroke-based Cyclic Amplifier: Image Super-Resolution at Arbitrary Ultra-Large Scales
by: Guo, Wenhao, et al.
Published: (2025)
Post-Training Quantization for 3D Medical Image Segmentation: A Practical Study on Real Inference Engines
by: Qu, Chongyu, et al.
Published: (2025)
PBCAT: Patch-based composite adversarial training against physically realizable attacks on object detection
by: Li, Xiao, et al.
Published: (2025)
Collaboration of Teachers for Semi-supervised Object Detection
by: Chen, Liyu, et al.
Published: (2024)
FR-TTS: Test-Time Scaling for NTP-based Image Generation with Effective Filling-based Reward Signal
by: Xu, Hang, et al.
Published: (2025)
SENSE: Satellite-based ENergy Synthesis for Sustainable Environment
by: Sun, Kailai, et al.
Published: (2026)
Computer Vision based group activity detection and action spotting
by: Sivalingam, Narthana, et al.
Published: (2025)
Deep learning based detection of collateral circulation in coronary angiographies
by: Hatfaludi, Cosmin-Andrei, et al.
Published: (2024)
Multi-identity Human Image Animation with Structural Video Diffusion
by: Wang, Zhenzhi, et al.
Published: (2025)
Knowledge-based anomaly detection for identifying network-induced shape artifacts
by: Deshpande, Rucha, et al.
Published: (2025)
A benchmark dataset for deep learning-based airplane detection: HRPlanes
by: Bakirman, Tolga, et al.
Published: (2022)
OpenGround: Active Cognition-based Reasoning for Open-World 3D Visual Grounding
by: Huang, Wenyuan, et al.
Published: (2025)
Topology-Driven Transferability Estimation of Medical Foundation Models for Segmentation
by: Tang, Jiaqi, et al.
Published: (2026)
MoTiC: Momentum Tightness and Contrast for Few-Shot Class-Incremental Learning
by: He, Zeyu, et al.
Published: (2025)
Robotic Visual Instruction
by: Li, Yanbang, et al.
Published: (2025)
DreamID: High-Fidelity and Fast diffusion-based Face Swapping via Triplet ID Group Learning
by: Ye, Fulong, et al.
Published: (2025)
Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach
by: Zong, Chen-Chen, et al.
Published: (2025)
SIDA: Social Media Image Deepfake Detection, Localization and Explanation with Large Multimodal Model
by: Huang, Zhenglin, et al.
Published: (2024)
Computer vision-based model for detecting turning lane features on Florida's public roadways
by: Antwi, Richard Boadu, et al.
Published: (2024)
Latent-Info and Low-Dimensional Learning for Human Mesh Recovery and Parallel Optimization
by: Zhang, Xiang, et al.
Published: (2025)
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
by: Gong, Junchao, et al.
Published: (2024)
AKRMap: Adaptive Kernel Regression for Trustworthy Visualization of Cross-Modal Embeddings
by: Ye, Yilin, et al.
Published: (2025)
Research on target detection method of distracted driving behavior based on improved YOLOv8
by: Shen, Shiquan, et al.
Published: (2024)
PathReasoning: A multimodal reasoning agent for query-based ROI navigation on whole-slide images
by: Zhang, Kunpeng, et al.
Published: (2025)
A computer vision-based model for occupancy detection using low-resolution thermal images
by: Cui, Xue, et al.
Published: (2025)
Kangaroo: A Powerful Video-Language Model Supporting Long-context Video Input
by: Liu, Jiajun, et al.
Published: (2024)
Structured Click Control in Transformer-based Interactive Segmentation
by: Xu, Long, et al.
Published: (2024)
IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
by: Huang, Jiacui, et al.
Published: (2024)
RT-DEMT: A hybrid real-time acupoint detection model combining mamba and transformer
by: Yang, Shilong, et al.
Published: (2025)
ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning
by: Zhao, Yuan, et al.
Published: (2026)
Deep learning-based automated damage detection in concrete structures using images from earthquake events
by: Turer, Abdullah, et al.
Published: (2025)