| Main authors: | Pochinkov, Nicholas; Pasero, Ben; Shibayama, Skylar |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2408.17322 |
Similar items
Modularity in Transformers: Investigating Neuron Separability & Specialization
by: Pochinkov, Nicholas, et al.
Published: (2024)
CogniLoad: A Synthetic Natural Language Reasoning Benchmark With Tunable Length, Intrinsic Difficulty, and Distractor Density
by: Kaiser, Daniel, et al.
Published: (2025)
Enhancing Ultra-Low-Bit Quantization of Large Language Models Through Saliency-Aware Partial Retraining
by: Cao, Deyu, et al.
Published: (2025)
Research on a hybrid LSTM-CNN-Attention model for text-based web content classification
by: Kuz, Mykola, et al.
Published: (2025)
NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution
by: Breneur, Oleksandr Marchenko, et al.
Published: (2026)
Rethinking the Multilingual Reasoning Gap with Layer Swap
by: Lasbordes, Maxence, et al.
Published: (2026)
Do Reasoning Models Enhance Embedding Models?
by: Chan, Wun Yu, et al.
Published: (2026)
Linguistic Collapse: Neural Collapse in (Large) Language Models
by: Wu, Robert, et al.
Published: (2024)
Sliced-Wasserstein Distribution Alignment Loss Improves the Ultra-Low-Bit Quantization of Large Language Models
by: Cao, Deyu, et al.
Published: (2026)
CLMN: Concept based Language Models via Neural Symbolic Reasoning
by: Yang, Yibo
Published: (2025)
Rule Extraction in Machine Learning: Chat Incremental Pattern Constructor
by: Nwokocha, Caleb Princewill
Published: (2022)
When Does Content-Based Routing Work? Representation Requirements for Selective Attention in Hybrid Sequence Models
by: Basu, Abhinaba
Published: (2026)
A Survey on Vision-Language-Action Models for Embodied AI
by: Ma, Yueen, et al.
Published: (2024)
Unpacking Hateful Memes: Presupposed Context and False Claims
by: Cai, Weibin, et al.
Published: (2025)
Mechanistic Analysis of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning
by: Imanov, Olaf Yunus Laitinen
Published: (2026)
Latent Object Permanence: Topological Phase Transitions, Free-Energy Principles, and Renormalization Group Flows in Deep Transformer Manifolds
by: Alpay, Faruk, et al.
Published: (2026)
Inference acceleration for large language models using "stairs" assisted greedy generation
by: Grigaliūnas, Domas, et al.
Published: (2024)
Strategic Doctrine Language Models (sdLM): A Learning-System Framework for Doctrinal Consistency and Geopolitical Forecasting
by: Imanov, Olaf Yunus Laitinen, et al.
Published: (2026)
Neuro-Symbolic Process Anomaly Detection
by: Gaikwad, Devashish, et al.
Published: (2026)
AIPsy-Affect: A Keyword-Free Clinical Stimulus Battery for Mechanistic Interpretability of Emotion in Language Models
by: Keeman, Michael
Published: (2026)
Product-of-Experts Training Reduces Dataset Artifacts in Natural Language Inference
by: Mathew, Aby Mammen
Published: (2026)
Vis-CoT: A Human-in-the-Loop Framework for Interactive Visualization and Intervention in LLM Chain-of-Thought Reasoning
by: Pather, Kaviraj, et al.
Published: (2025)
Harnessing non-adversarial robustness in large language models
by: Zhou, Qinghua, et al.
Published: (2026)
Heterogeneous LLM Methods for Ontology Learning (Few-Shot Prompting, Ensemble Typing, and Attention-Based Taxonomies)
by: Beliaeva, Aleksandra, et al.
Published: (2025)
Application of deep learning approaches for medieval historical documents transcription
by: Voloshchuk, Maksym, et al.
Published: (2025)
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
by: Laborde, Stanislas, et al.
Published: (2025)
InhibiDistilbert: Knowledge Distillation for a ReLU and Addition-based Transformer
by: Zhang, Tony, et al.
Published: (2025)
Approaches to Semantic Textual Similarity in Slovak Language: From Algorithms to Transformers
by: Radosky, Lukas, et al.
Published: (2026)
Who's Asking? Investigating Bias Through the Lens of Disability Framed Queries in LLMs
by: Hari, Vishnu, et al.
Published: (2025)
Understanding the Uncertainty of LLM Explanations: A Perspective Based on Reasoning Topology
by: Da, Longchao, et al.
Published: (2025)
On measuring grounding and generalizing grounding problems
by: Quigley, Daniel, et al.
Published: (2025)
Measuring Alignment-Induced Activation Shifts Correctly: A Template-Controlled Difference-in-Differences Protocol
by: Nakamura, Yuki
Published: (2026)
mHC-SSM: Manifold-Constrained Hyper-Connections for State Space Language Models with Stream-Specialized Adapters
by: Mutlu, Abdulvahap, et al.
Published: (2026)
Extracting Sentence Embeddings from Pretrained Transformer Models
by: Stankevičius, Lukas, et al.
Published: (2024)
Sentiment Analysis of Lithuanian Online Reviews Using Large Language Models
by: Vileikytė, Brigita, et al.
Published: (2024)
ReFactor GNNs: Revisiting Factorisation-based Models from a Message-Passing Perspective
by: Chen, Yihong, et al.
Published: (2022)
JAM: Controllable and Responsible Text Generation via Causal Reasoning and Latent Vector Manipulation
by: Huang, Yingbing, et al.
Published: (2025)
RACAS: Controlling Diverse Robots With a Single Agentic System
by: Ashley, Dylan R., et al.
Published: (2026)
ProbeScale: Probing Analysis to Optimize Neural Scaling Laws for Efficient Small Language Model Inference
by: Das, Sourav
Published: (2026)
Unsolvability Ceiling in Multi-LLM Routing: An Empirical Study of Evaluation Artifacts
by: Garg, Saloni, et al.
Published: (2026)