:: Library Catalog

Bibliographic Details
Main Authors:	Rašajski, Nemanja, Trivedi, Chintan, Makantasis, Konstantinos, Liapis, Antonios, Yannakakis, Georgios N.
Format:	Preprint
Published:	2024
Subjects:	Computer Vision and Pattern Recognition; Artificial Intelligence
Online Access:	https://arxiv.org/abs/2402.01335

Similar Items

GameVibe: A Multimodal Affective Game Corpus
by: Barthet, Matthew, et al.
Published: (2024)

Across-Game Engagement Modelling via Few-Shot Learning
by: Pinitas, Kosmas, et al.
Published: (2024)

Can Large Language Models Capture Video Game Engagement?
by: Melhart, David, et al.
Published: (2025)

Dynamic Quality-Diversity Search
by: Gallotta, Roberto, et al.
Published: (2024)

FREYR: A Framework for Recognizing and Executing Your Requests
by: Gallotta, Roberto, et al.
Published: (2025)

Large Language Models and Games: A Survey and Roadmap
by: Gallotta, Roberto, et al.
Published: (2024)

MTR-VP: Towards End-to-End Trajectory Planning through Context-Driven Image Encoding and Multiple Trajectory Prediction
by: Keskar, Maitrayee, et al.
Published: (2025)

Enhancing Object Detection with Privileged Information: A Model-Agnostic Teacher-Student Approach
by: Bartolo, Matthias, et al.
Published: (2026)

The Procedural Content Generation Benchmark: An Open-source Testbed for Generative Challenges in Games
by: Khalifa, Ahmed, et al.
Published: (2025)

mAVE: A Watermark for Joint Audio-Visual Generation Models
by: Si, Luyang, et al.
Published: (2026)

Affectively Framework: Towards Human-like Affect-Based Agents
by: Barthet, Matthew, et al.
Published: (2024)

Few-shot Semantic Encoding and Decoding for Video Surveillance
by: Cheng, Baoping, et al.
Published: (2025)

VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance
by: Taesiri, Mohammad Reza, et al.
Published: (2025)

An Integrated Framework for Multi-Granular Explanation of Video Summarization
by: Tsigos, Konstantinos, et al.
Published: (2024)

GameGen-X: Interactive Open-world Game Video Generation
by: Che, Haoxuan, et al.
Published: (2024)

CRAFT: Critic-Refined Adaptive Key-Frame Targeting for Multimodal Video Question Answering
by: Bhosale, Mahesh, et al.
Published: (2026)

PAS: A Training-Free Stabilizer for Temporal Encoding in Video LLMs
by: Sun, Bowen, et al.
Published: (2025)

How Much 3D Do Video Foundation Models Encode?
by: Huang, Zixuan, et al.
Published: (2025)

Encoding and Controlling Global Semantics for Long-form Video Question Answering
by: Nguyen, Thong Thanh, et al.
Published: (2024)

STANCE: Motion Coherent Video Generation Via Sparse-to-Dense Anchored Encoding
by: Chen, Zhifei, et al.
Published: (2025)

The More, the Merrier: Contrastive Fusion for Higher-Order Multimodal Alignment
by: Koutoupis, Stefanos, et al.
Published: (2025)

MoCA-Video: Motion-Aware Concept Alignment for Consistent Video Editing
by: Zhang, Tong, et al.
Published: (2025)

Slot-ID: Identity-Preserving Video Generation from Reference Videos via Slot-Based Temporal Identity Encoding
by: Lai, Yixuan, et al.
Published: (2026)

REMAP: Regularized Matching and Partial Alignment of Video Embeddings
by: Chandra, Soumyadeep, et al.
Published: (2025)

MAP-Elites with Transverse Assessment for Multimodal Problems in Creative Domains
by: Zammit, Marvin, et al.
Published: (2024)

Reasoning over the Behaviour of Objects in Video-Clips for Adverb-Type Recognition
by: Seshadri, Amrit Diggavi, et al.
Published: (2023)

Multi-Frame, Lightweight & Efficient Vision-Language Models for Question Answering in Autonomous Driving
by: Gopalkrishnan, Akshay, et al.
Published: (2024)

Uncertainty-Guided Self-Questioning and Answering for Video-Language Alignment
by: Chen, Jin, et al.
Published: (2024)

PIPE: Physics-Informed Position Encoding for Alignment of Satellite Images and Time Series
by: Li, Haobo, et al.
Published: (2025)

Graph Alignment via Dual-Pass Spectral Encoding and Latent Space Communication
by: Behmanesh, Maysam, et al.
Published: (2025)

Sense Less, Generate More: Pre-training LiDAR Perception with Masked Autoencoders for Ultra-Efficient 3D Sensing
by: Tayebati, Sina, et al.
Published: (2024)

Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video
by: Xia, Hongchi, et al.
Published: (2024)

Temporal Alignment-Free Video Matching for Few-shot Action Recognition
by: Lee, SuBeen, et al.
Published: (2025)

V-LynX: Token Interface Alignment for Video+X LLMs
by: Park, Jungin, et al.
Published: (2026)

Multimodal Alignment with Cross-Attentive GRUs for Fine-Grained Video Understanding
by: Kim, Namho, et al.
Published: (2025)

Learning Using Privileged Information for Litter Detection
by: Bartolo, Matthias, et al.
Published: (2025)

Comment-aided Video-Language Alignment via Contrastive Pre-training for Short-form Video Humor Detection
by: Liu, Yang, et al.
Published: (2024)

VideoLights: Feature Refinement and Cross-Task Alignment Transformer for Joint Video Highlight Detection and Moment Retrieval
by: Paul, Dhiman, et al.
Published: (2024)

Rethinking Weakly-supervised Video Temporal Grounding From a Game Perspective
by: Fang, Xiang, et al.
Published: (2026)

A Hybrid Co-Finetuning Approach for Visual Bug Detection in Video Games
by: Yi, Faliu, et al.
Published: (2025)