| Main authors: | Kabir, Md Rysul; Mochizuki-Freeman, James; Tiganj, Zoran |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Online access: | https://arxiv.org/abs/2412.15292 |
Similar documents
Different Paths to Harmful Compliance: Behavioral Side Effects and Mechanistic Divergence Across LLM Jailbreaks
by: Kabir, Md Rysul, et al.
Published: (2026)
Emergence of Episodic Memory in Transformers: Characterizing Changes in Temporal Structure of Attention Scores During Training
by: Mistry, Deven Mahesh, et al.
Published: (2025)
Who Do LLMs Trust? Human Experts Matter More Than Other LLMs
by: Bajaj, Anooshka, et al.
Published: (2026)
Gradual Forgetting: Logarithmic Compression for Extending Transformer Context Windows
by: Dickson, Billy, et al.
Published: (2025)
An advantage based policy transfer algorithm for reinforcement learning with measures of transferability
by: Alam, Md Ferdous, et al.
Published: (2023)
Normalization and effective learning rates in reinforcement learning
by: Lyle, Clare, et al.
Published: (2024)
A deep learning and machine learning approach to predict neonatal death in the context of São Paulo
by: Raihan, Mohon, et al.
Published: (2025)
PC-DeepNet: A GNSS Positioning Error Minimization Framework Using Permutation-Invariant Deep Neural Network
by: Kabir, M. Humayun, et al.
Published: (2025)
Deep reinforcement learning for irrigation scheduling using high-dimensional sensor feedback
by: Saikai, Yuji, et al.
Published: (2023)
Logic-informed reinforcement learning for cross-domain optimization of large-scale cyber-physical systems
by: Wan, Guangxi, et al.
Published: (2025)
Explaining Fine Tuned LLMs via Counterfactuals A Knowledge Graph Driven Framework
by: Wang, Yucheng, et al.
Published: (2025)
Deep reinforcement learning for weakly coupled MDP's with continuous actions
by: Robledo, Francisco, et al.
Published: (2024)
Curriculum reinforcement learning with measurable task representation learning
by: Wen, Yongyan, et al.
Published: (2026)
Scores as Actions: a framework of fine-tuning diffusion models by continuous-time reinforcement learning
by: Zhao, Hanyang, et al.
Published: (2024)
Dual-Temporal LSTM with Hybrid Attention for Airline Passenger Load Factor Forecasting: Integrating Intra-Flight and Inter-Flight Booking Dynamics
by: Islam, ASM Nazrul, et al.
Published: (2026)
Deep progressive reinforcement learning-based flexible resource scheduling framework for IRS and UAV-assisted MEC system
by: Dong, Li, et al.
Published: (2024)
Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning
by: Kobayashi, Seijin, et al.
Published: (2025)
Rysul119/A-2D-Heat-Conduction-Problem-with-Dirichlet-Boundary-Condition: Initial release
by: Md Rysul Kabir
Published: (2025)
Causal prompting model-based offline reinforcement learning
by: Yu, Xuehui, et al.
Published: (2024)
Offline reinforcement learning for job-shop scheduling problems
by: Echeverria, Imanol, et al.
Published: (2024)
Counterfactual experience augmented off-policy reinforcement learning
by: Lee, Sunbowen, et al.
Published: (2025)
Delayed homomorphic reinforcement learning for environments with delayed feedback
by: Lee, Jongsoo, et al.
Published: (2026)
Bellman operator convergence enhancements in reinforcement learning algorithms
by: Kadurha, David Krame, et al.
Published: (2025)
Deep reinforcement learning-based spacecraft attitude control with pointing keep-out constraint
by: Yang, Juntang, et al.
Published: (2025)
Dynamic feature selection in medical predictive monitoring by reinforcement learning
by: Chen, Yutong, et al.
Published: (2024)
Economic span selection of bridge based on deep reinforcement learning
by: Zhang, Leye, et al.
Published: (2024)
Leveraging weights signals -- Predicting and improving generalizability in reinforcement learning
by: Moulin, Olivier, et al.
Published: (2025)
Not all tokens are needed(NAT): token efficient reinforcement learning
by: Sang, Hejian, et al.
Published: (2026)
Deep reinforcement learning for machine scheduling: Methodology, the state-of-the-art, and future directions
by: Khadivi, Maziyar, et al.
Published: (2023)
Survey on reinforcement learning for language processing
by: Uc-Cetina, Victor, et al.
Published: (2021)
FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
by: Li, Zeyuan, et al.
Published: (2024)
Deep learning surrogate models of JULES-INFERNO for wildfire prediction on a global scale
by: Cheng, Sibo, et al.
Published: (2024)
Maximum diffusion reinforcement learning
by: Berrueta, Thomas A., et al.
Published: (2023)
On the consistency of hyper-parameter selection in value-based deep reinforcement learning
by: Obando-Ceron, Johan, et al.
Published: (2024)
CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives
by: Saghafian, Armin, et al.
Published: (2024)
Policy-shaped prediction: avoiding distractions in model-based reinforcement learning
by: Hutson, Miles, et al.
Published: (2024)
An efficient deep reinforcement learning environment for flexible job-shop scheduling
by: Wu, Xinquan, et al.
Published: (2025)
Task diversity produces systematic transfer but inhibits continual reinforcement learning
by: Seth, Purab, et al.
Published: (2026)
Found-RL: foundation model-enhanced reinforcement learning for autonomous driving
by: Qu, Yansong, et al.
Published: (2026)
Reinforcement learning for quantum processes with memory
by: Lumbreras, Josep, et al.
Published: (2026)