| Main Authors: | de Avalle, Guillermo Gil, Maruster, Laura, Sloot, Eric, Emmanouilidis, Christos |
|---|---|
| Format: | Preprint |
| Published: | 2026 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2604.06770 |
Similar Items
Procedural Knowledge Extraction from Industrial Troubleshooting Guides Using Vision Language Models
by: de Avalle, Guillermo Gil, et al.
Published: (2026)
FlowLearn: Evaluating Large Vision-Language Models on Flowchart Understanding
by: Pan, Huitong, et al.
Published: (2024)
SONIC-O1: A Real-World Benchmark for Evaluating Multimodal Large Language Models on Audio-Video Understanding
by: Radwan, Ahmed Y., et al.
Published: (2026)
JSynFlow: Japanese Synthesised Flowchart Visual Question Answering Dataset built with Large Language Models
by: Sasaki, Hiroshi
Published: (2026)
EdgeFlow: Edge-Map Augmented VLM-Based Flowchart Processing for Industrial Requirements Engineering
by: Dou, Zhifei, et al.
Published: (2026)
First Multi-Dimensional Evaluation of Flowchart Comprehension for Multimodal Large Language Models
by: Zhang, Enming, et al.
Published: (2024)
An Online Reference-Free Evaluation Framework for Flowchart Image-to-Code Generation
by: Nguyen, Giang Son, et al.
Published: (2026)
Arrow-Guided VLM: Enhancing Flowchart Understanding via Arrow Direction Encoding
by: Omasa, Takamitsu, et al.
Published: (2025)
Towards Making Flowchart Images Machine Interpretable
by: Shukla, Shreya, et al.
Published: (2025)
Procedure-Aware Surgical Video-language Pretraining with Hierarchical Knowledge Augmentation
by: Yuan, Kun, et al.
Published: (2024)
Guiding Video Prediction with Explicit Procedural Knowledge
by: Takenaka, Patrick, et al.
Published: (2024)
FC-Attack: Jailbreaking Multimodal Large Language Models via Auto-Generated Flowcharts
by: Zhang, Ziyi, et al.
Published: (2025)
Separating Knowledge and Perception with Procedural Data
by: Rodríguez-Muñoz, Adrián, et al.
Published: (2025)
Is There Knowledge Left to Extract? Evidence of Fragility in Medically Fine-Tuned Vision-Language Models
by: McLaughlin, Oliver, et al.
Published: (2026)
Optical Flow Matters: an Empirical Comparative Study on Fusing Monocular Extracted Modalities for Better Steering
by: Makiyeh, Fouad, et al.
Published: (2024)
Procedural terrain generation with style transfer
by: Merizzi, Fabio
Published: (2024)
FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation
by: Bill, Eric Tillmann, et al.
Published: (2026)
Detection-Fusion for Knowledge Graph Extraction from Videos
by: Das, Taniya, et al.
Published: (2024)
Learning Robust Intervention Representations with Delta Embeddings
by: Alimisis, Panagiotis, et al.
Published: (2025)
Spatiotemporal Object Detection for Improved Aerial Vehicle Detection in Traffic Monitoring
by: Telegraph, Kristina, et al.
Published: (2024)
Motion-Boundary-Driven Unsupervised Surgical Instrument Segmentation in Low-Quality Optical Flow
by: Liu, Yang, et al.
Published: (2024)
ViPro: Enabling and Controlling Video Prediction for Complex Dynamical Scenarios using Procedural Knowledge
by: Takenaka, Patrick, et al.
Published: (2024)
PGT: Procedurally Generated Tasks for improving visual grounding in MLLMs
by: Assouel, Rim, et al.
Published: (2026)
Less is More: Label-Guided Summarization of Procedural and Instructional Videos
by: Rajpal, Shreya, et al.
Published: (2026)
SceneX: Procedural Controllable Large-scale Scene Generation
by: Zhou, Mengqi, et al.
Published: (2024)
Enhanced Cascade Prostate Cancer Classifier in mp-MRI Utilizing Recall Feedback Adaptive Loss and Prior Knowledge-Based Feature Extraction
by: Luo, Kun, et al.
Published: (2024)
CogFlow: Bridging Perception and Reasoning through Knowledge Internalization for Visual Mathematical Problem Solving
by: Chen, Shuhang, et al.
Published: (2026)
Modality Translation for Object Detection Adaptation Without Forgetting Prior Knowledge
by: Medeiros, Heitor Rapela, et al.
Published: (2024)
MMCL-Bench: Multimodal Context Learning from Visual Rules, Procedures, and Evidence
by: Chen, Yifan, et al.
Published: (2026)
CityX: Controllable Procedural Content Generation for Unbounded 3D Cities
by: Zhang, Shougao, et al.
Published: (2024)
Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns
by: Kyrkou, Christos
Published: (2024)
FACE: Faithful Automatic Concept Extraction
by: Bhusal, Dipkamal, et al.
Published: (2025)
Sharingan: Extract User Action Sequence from Desktop Recordings
by: Chen, Yanting, et al.
Published: (2024)
IMPACT: A Dataset for Multi-Granularity Human Procedural Action Understanding in Industrial Assembly
by: Wen, Di, et al.
Published: (2026)
ReXSonoVQA: A Video QA Benchmark for Procedure-Centric Ultrasound Understanding
by: Wang, Xucheng, et al.
Published: (2026)
Designing and Generating Diverse, Equitable Face Image Datasets for Face Verification Tasks
by: Baltsou, Georgia, et al.
Published: (2025)
Personalized Federated Learning for Cross-view Geo-localization
by: Anagnostopoulos, Christos, et al.
Published: (2024)
Masked Generative Story Transformer with Character Guidance and Caption Augmentation
by: Papadimitriou, Christos, et al.
Published: (2024)
TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models
by: Huang, Yushi, et al.
Published: (2023)
A Stitch in Time: Learning Procedural Workflow via Self-Supervised Plackett-Luce Ranking
by: Che, Chengan, et al.
Published: (2025)