| Main Authors: | Çano, Erion, Lamaj, Dario |
|---|---|
| Format: | Preprint |
| Published: | 2024 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2402.04028 |
Similar Items
NSINA: A News Corpus for Sinhala
by: Hettiarachchi, Hansi, et al.
Published: (2024)
Differentially-private text generation degrades output language quality
by: Çano, Erion, et al.
Published: (2025)
Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training
by: Kesgin, H. Toprak, et al.
Published: (2024)
From Topic to Transition Structure: Unsupervised Concept Discovery at Corpus Scale via Predictive Associative Memory
by: Dury, Jason
Published: (2026)
Industry-Aligned Granular Topic Modeling
by: Moon, Sae Young, et al.
Published: (2026)
Discovering Forbidden Topics in Language Models
by: Rager, Can, et al.
Published: (2025)
TopicProphet: Prophesies on Temporal Topic Trends and Stocks
by: Kim, Olivia
Published: (2025)
TopicTag: Automatic Annotation of NMF Topic Models Using Chain of Thought and Prompt Tuning with LLMs
by: Wanna, Selma, et al.
Published: (2024)
TopicDiff: A Topic-enriched Diffusion Approach for Multimodal Conversational Emotion Detection
by: Luo, Jiamin, et al.
Published: (2024)
RedTopic: Toward Topic-Diverse Red Teaming of Large Language Models
by: Ding, Jiale, et al.
Published: (2025)
Evaluating Negative Sampling Approaches for Neural Topic Models
by: Adhya, Suman, et al.
Published: (2025)
S2WTM: Spherical Sliced-Wasserstein Autoencoder for Topic Modeling
by: Adhya, Suman, et al.
Published: (2025)
Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation
by: Yoo, YoungJoon, et al.
Published: (2023)
HLDC: Hindi Legal Documents Corpus
by: Kapoor, Arnav, et al.
Published: (2022)
Transformer-based Joint Modelling for Automatic Essay Scoring and Off-Topic Detection
by: Das, Sourya Dipta, et al.
Published: (2024)
Exploring Anti-Aging Literature via ConvexTopics and Large Language Models
by: Yeganova, Lana E., et al.
Published: (2026)
Predictable Confabulations: Factual Recall by LLMs Scales with Model Size and Topic Frequency
by: Smith, Matthew L., et al.
Published: (2026)
Topic Modelling Black Box Optimization
by: Akramov, Roman, et al.
Published: (2025)
MathPile: A Billion-Token-Scale Pretraining Corpus for Math
by: Wang, Zengzhi, et al.
Published: (2023)
WorldSpeech: A Multilingual Speech Corpus from Around the World
by: Asonitis, Antonis, et al.
Published: (2026)
IDIAPers @ Causal News Corpus 2022: Efficient Causal Relation Identification Through a Prompt-based Few-shot Approach
by: Burdisso, Sergio, et al.
Published: (2022)
Cross-lingual Named Entity Corpus for Slavic Languages
by: Piskorski, Jakub, et al.
Published: (2024)
Don't Shoot The Breeze: Topic Continuity Model Using Nonlinear Naive Bayes With Attention
by: Pi, Shu-Ting, et al.
Published: (2026)
When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation
by: Prouteau, Thibault, et al.
Published: (2026)
Meta4XNLI: A Crosslingual Parallel Corpus for Metaphor Detection and Interpretation
by: Sanchez-Bayona, Elisa, et al.
Published: (2024)
Filtered Corpus Training (FiCT) Shows that Language Models can Generalize from Indirect Evidence
by: Patil, Abhinav, et al.
Published: (2024)
Towards the TopMost: A Topic Modeling System Toolkit
by: Wu, Xiaobao, et al.
Published: (2023)
NepTam: A Nepali-Tamang Parallel Corpus and Baseline Machine Translation Experiments
by: Ghimire, Rupak Raj, et al.
Published: (2026)
Topic Classification of Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment
by: Sargeant, Holli, et al.
Published: (2024)
Retrieve, Then Classify: Corpus-Grounded Automation of Clinical Value Set Authoring
by: Mukherjee, Sumit, et al.
Published: (2026)
BeliN: A Novel Corpus for Bengali Religious News Headline Generation using Contextual Feature Fusion
by: Osama, Md, et al.
Published: (2025)
Improving Customer Service with Automatic Topic Detection in User Emails
by: Bašaragin, Bojana, et al.
Published: (2025)
GreenTEA: Gradient Descent with Topic-modeling and Evolutionary Auto-prompting
by: Dong, Zheng, et al.
Published: (2025)
Dynamics of Spontaneous Topic Changes in Next Token Prediction with Self-Attention
by: Jia, Mumin, et al.
Published: (2025)
Neural Multimodal Topic Modeling: A Comprehensive Evaluation
by: González-Pizarro, Felipe, et al.
Published: (2024)
Episodic Memories Generation and Evaluation Benchmark for Large Language Models
by: Huet, Alexis, et al.
Published: (2025)
PMOA-TTS: Introducing the PubMed Open Access Textual Times Series Corpus
by: Noroozizadeh, Shahriar, et al.
Published: (2025)
DiffETM: Diffusion Process Enhanced Embedded Topic Model
by: Shao, Wei, et al.
Published: (2025)
CoSD: Collaborative Stance Detection with Contrastive Heterogeneous Topic Graph Learning
by: Cheng, Yinghan, et al.
Published: (2024)
Interpretable Topic Extraction and Word Embedding Learning using row-stochastic DEDICOM
by: Hillebrand, Lars, et al.
Published: (2025)