| Main Authors: | Sharma, Aaryam, Czarnecki, Chris, Chen, Yuhao, Xi, Pengcheng, Xu, Linlin, Wong, Alexander |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online Access: | https://arxiv.org/abs/2405.08717 |
Similar Items
In The Wild Ellipse Parameter Estimation for Circular Dining Plates and Bowls
by: Pathiranage, Akil, et al.
Published: (2024)
Food Portion Estimation: From Pixels to Calories
by: Vinod, Gautham, et al.
Published: (2026)
FoodTrack: Estimating Handheld Food Portions with Egocentric Video
by: Wang, Ervin, et al.
Published: (2025)
Understanding the Limitations of Diffusion Concept Algebra Through Food
by: Zeng, E. Zhixuan, et al.
Published: (2024)
Food Portion Estimation via 3D Object Scaling
by: Vinod, Gautham, et al.
Published: (2024)
NutritionVerse: Empirical Study of Various Dietary Intake Estimation Approaches
by: Tai, Chi-en Amy, et al.
Published: (2023)
6D Pose Estimation on Spoons and Hands
by: Tan, Kevin, et al.
Published: (2025)
Improving Remote Sensing Classification using Topological Data Analysis and Convolutional Neural Networks
by: Sharma, Aaryam
Published: (2025)
Size Matters: Reconstructing Real-Scale 3D Models from Monocular Images for Food Portion Estimation
by: Vinod, Gautham, et al.
Published: (2026)
MetaFood3D: 3D Food Dataset with Nutrition Values
by: Chen, Yuhao, et al.
Published: (2024)
How Much 3D Do Video Foundation Models Encode?
by: Huang, Zixuan, et al.
Published: (2025)
NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images
by: Keller, Matthew, et al.
Published: (2024)
Guess the Unified Model: How Much Can We Recover from Generated Images?
by: Cekinmez, Jasin, et al.
Published: (2026)
SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction
by: Bhattacharyya, Prarthana, et al.
Published: (2024)
Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts
by: Zeng, E. Zhixuan, et al.
Published: (2024)
DreamPose3D: Hallucinative Diffusion with Prompt Learning for 3D Human Pose Estimation
by: Bright, Jerrin, et al.
Published: (2025)
Domain-Guided Masked Autoencoders for Unique Player Identification
by: Balaji, Bavesh, et al.
Published: (2024)
Advancing Food Nutrition Estimation via Visual-Ingredient Feature Fusion
by: Qi, Huiyan, et al.
Published: (2025)
How Much To Guide: Revisiting Adaptive Guidance in Classifier-Free Guidance Text-to-Vision Diffusion Models
by: Zhang, Huixuan, et al.
Published: (2025)
When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning
by: Yu, Shoubin, et al.
Published: (2026)
PortionNet: Distilling 3D Geometric Knowledge for Food Nutrition Estimation
by: Bright, Darrin, et al.
Published: (2025)
LensWalk: Agentic Video Understanding by Planning How You See in Videos
by: Li, Keliang, et al.
Published: (2026)
HAWAII: Hierarchical Visual Knowledge Transfer for Efficient Vision-Language Models
by: Wang, Yimu, et al.
Published: (2025)
Fake It Till You Make It: Using Synthetic Data and Domain Knowledge for Improved Text-Based Learning for LGE Detection
by: Jacob, Athira J, et al.
Published: (2025)
Vision-Based Approach for Food Weight Estimation from 2D Images
by: Wimalasiri, Chathura, et al.
Published: (2024)
Artificial Intelligence in the Food Industry: Food Waste Estimation based on Computer Vision, a Brief Case Study in a University Dining Hall
by: Rokhva, Shayan, et al.
Published: (2025)
Zero-Shot Monocular Motion Segmentation in the Wild by Combining Deep Learning with Geometric Motion Model Fusion
by: Huang, Yuxiang, et al.
Published: (2024)
How Much Is a Dataset Worth? Scaling Laws, the Vendi Score, and Matrix Spectral Functions
by: Bilmes, Jeff A., et al.
Published: (2026)
Composite Classifier-Free Guidance for Multi-Modal Conditioning in Wind Dynamics Super-Resolution
by: Schnell, Jacob, et al.
Published: (2025)
From Canteen Food to Daily Meals: Generalizing Food Recognition to More Practical Scenarios
by: Liu, Guoshan, et al.
Published: (2024)
Audio-Infused Automatic Image Colorization by Exploiting Audio Scene Semantics
by: Zhao, Pengcheng, et al.
Published: (2024)
An Explainable Hybrid AI Framework for Enhanced Tuberculosis and Symptom Detection
by: Patel, Neel, et al.
Published: (2025)
Annolid: Annotate, Segment, and Track Anything You Need
by: Yang, Chen, et al.
Published: (2024)
How Far Are Surgeons from Surgical World Models? A Pilot Study on Zero-shot Surgical Video Generation with Expert Assessment
by: Chen, Zhen, et al.
Published: (2025)
SAEC: Scene-Aware Enhanced Edge-Cloud Collaborative Industrial Vision Inspection with Multimodal LLM
by: Tian, Yuhao, et al.
Published: (2025)
Ideal Registration? Segmentation is All You Need
by: Chen, Xiang, et al.
Published: (2025)
Memory augment is All You Need for image restoration
by: Zhang, Xiao Feng, et al.
Published: (2023)
Boosting Semi-Supervised Medical Image Segmentation via Masked Image Consistency and Discrepancy Learning
by: Zhou, Pengcheng, et al.
Published: (2025)
Debiasing Central Fixation Confounds Reveals a Peripheral "Sweet Spot" for Human-like Scanpaths in Hard-Attention Vision
by: Pan, Pengcheng, et al.
Published: (2026)
Emergence of Fixational and Saccadic Movements in a Multi-Level Recurrent Attention Model for Vision
by: Pan, Pengcheng, et al.
Published: (2025)