:: Library Catalog

Bibliographic Details
Main Authors:	Savadikar, Chinmay; Zhao, Mingyu; Zhu, Yuanzheng; Li, Han; Xie, Shuang; Castelo, Alberto; Wu, Tianfu; Wang, Lingyun
Format:	Preprint
Published:	2026
Subjects:	Artificial Intelligence
Online Access:	https://arxiv.org/abs/2605.16116

Similar Items

SimGym: A Framework for A/B Test Simulation in E-Commerce with Traffic-Grounded VLM Agents
by: Li, Han, et al.
Published: (2026)

WeGeFT: Weight-Generative Fine-Tuning for Multi-Faceted Efficient Adaptation of Large Models
by: Savadikar, Chinmay, et al.
Published: (2023)

CHEEM: Continual Learning by Reuse, New, Adapt and Skip -- A Hierarchical Exploration-Exploitation Approach
by: Savadikar, Chinmay, et al.
Published: (2023)

SimGym: Traffic-Grounded Browser Agents for Offline A/B Testing in E-Commerce
by: Castelo, Alberto, et al.
Published: (2026)

SimPersona: Learning Discrete Buyer Personas from Raw Clickstreams for Grounded E-Commerce Agents
by: Foumani, Zahra Zanjani, et al.
Published: (2026)

WebGym: Scaling Training Environments for Visual Web Agents with Realistic Tasks
by: Bai, Hao, et al.
Published: (2026)

ClawGym: A Scalable Framework for Building Effective Claw Agents
by: Bai, Fei, et al.
Published: (2026)

Shopping Companion: Benchmarking and Training LLM Agents for Long-Horizon Preference-Grounded E-Commerce Tasks
by: Yu, Zijian, et al.
Published: (2026)

WebMall -- A Multi-Shop Benchmark for Evaluating Web Agents
by: Peeters, Ralph, et al.
Published: (2025)

LLaSA: Large Language and E-Commerce Shopping Assistant
by: Zhang, Shuo, et al.
Published: (2024)

The BrowserGym Ecosystem for Web Agent Research
by: De Chezelles, Thibault Le Sellier, et al.
Published: (2024)

Weblica: Scalable and Reproducible Training Environments for Visual Web Agents
by: Kar, Oğuzhan Fatih, et al.
Published: (2026)

ViroGym: Realistic Large-Scale Benchmarks for Evaluating Viral Proteins
by: Zhou, Yichen, et al.
Published: (2026)

AgenticShop: Benchmarking Agentic Product Curation for Personalized Web Shopping
by: Kim, Sunghwan, et al.
Published: (2026)

InnoGym: Benchmarking the Innovation Potential of AI Agents
by: Zhang, Jintian, et al.
Published: (2025)

Odysseys: Benchmarking Web Agents on Realistic Long Horizon Tasks
by: Jang, Lawrence Keunho, et al.
Published: (2026)

RiskWebWorld: A Realistic Interactive Benchmark for GUI Agents in E-commerce Risk Management
by: Chen, Renqi, et al.
Published: (2026)

DeepShop: A Benchmark for Deep Research Shopping Agents
by: Lyu, Yougang, et al.
Published: (2025)

TimeSeriesGym: A Scalable Benchmark for (Time Series) Machine Learning Engineering Agents
by: Cai, Yifu, et al.
Published: (2025)

EconWebArena: Benchmarking Autonomous Agents on Economic Tasks in Realistic Web Environments
by: Liu, Zefang, et al.
Published: (2025)

OceanGym: A Benchmark Environment for Underwater Embodied Agents
by: Xue, Yida, et al.
Published: (2025)

VisGym: Diverse, Customizable, Scalable Environments for Multimodal Agents
by: Wang, Zirui, et al.
Published: (2026)

Strategic Exploitation in LLM Agent Markets: A Simulation Framework for E-Commerce Trust
by: Lei, Shijun, et al.
Published: (2026)

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation
by: Zhu, Yixin, et al.
Published: (2026)

StressWeb: A Diagnostic Benchmark for Web Agent Robustness under Realistic Interaction Variability
by: Bai, Haoyue, et al.
Published: (2026)

Evolution and Impact of Shopping in the Digital Age: Consumer Behavior, E-Commerce, and Retail Innovations
by: Tanmoy Dey
Published: (2025)

ShopSimulator: Evaluating and Exploring RL-Driven LLM Agent for Shopping Assistants
by: Wang, Pei, et al.
Published: (2026)

Towards a Realistic Long-Term Benchmark for Open-Web Research Agents
by: Mühlbacher, Peter, et al.
Published: (2024)

WebArena: A Realistic Web Environment for Building Autonomous Agents
by: Zhou, Shuyan, et al.
Published: (2023)

Consumer Shopping Intentions in Metaverse Food E‐Commerce: A Hybrid ISM‐DEMATEL Approach
by: M. K. P. Naik, et al.
Published: (2025)

Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce
by: Yang, Zhantao, et al.
Published: (2024)

DevOps-Gym: Benchmarking AI Agents in Software DevOps Cycle
by: Tang, Yuheng, et al.
Published: (2026)

ShoppingBench: A Real-World Intent-Grounded Shopping Benchmark for LLM-based Agents
by: Wang, Jiangyuan, et al.
Published: (2025)

Sliding Puzzles Gym: A Scalable Benchmark for State Representation in Visual Reinforcement Learning
by: de Oliveira, Bryan L. M., et al.
Published: (2024)

Memory Gym: Towards Endless Tasks to Benchmark Memory Capabilities of Agents
by: Pleines, Marco, et al.
Published: (2023)

Aerial Gym Simulator: A Framework for Highly Parallelized Simulation of Aerial Robots
by: Kulkarni, Mihir, et al.
Published: (2025)

E2E-GRec: An End-to-End Joint Training Framework for Graph Neural Networks and Recommender Systems
by: Xue, Rui, et al.
Published: (2025)

SciAgentGym: Benchmarking Multi-Step Scientific Tool-use in LLM Agents
by: Shen, Yujiong, et al.
Published: (2026)

NegotiationGym: Self-Optimizing Agents in a Multi-Agent Social Simulation Environment
by: Mangla, Shashank, et al.
Published: (2025)

Cyber Shopping Beyond Boundaries: The Metaverse Revolution in e‐Commerce and Consumer Behavior
by: Rana Muhammad Sohail Jafar, et al.
Published: (2025)