| Main authors: | Collins, Jack; Buzea, Adrian; Collier, Chris; Rosen, Alejandro Ballesta; Maclaren, Julian; Lyon, Richard F.; Carlile, Simon |
|---|---|
| Format: | Preprint |
| Published: | 2025 |
| Subjects: | |
| Online access: | https://arxiv.org/abs/2507.23590 |
Similar items
A dataset and model for auditory scene recognition for hearing devices: AHEAD-DS and OpenYAMNet
by: Zhong, Henry, et al.
Published: (2025)
Modeling the Difficulty of Saxophone Music
by: Libřický, Šimon, et al.
Published: (2025)
Can Audio Reveal Music Performance Difficulty? Insights from the Piano Syllabus Dataset
by: Ramoneda, Pedro, et al.
Published: (2024)
HearFit+: Personalized Fitness Monitoring via Audio Signals on Smart Speakers
by: Xie, Yadong, et al.
Published: (2025)
The Rarity of Musical Audio Signals Within the Space of Possible Audio Generation
by: Collins, Nick
Published: (2024)
NoiseBandNet: Controllable Time-Varying Neural Synthesis of Sound Effects Using Filterbanks
by: Barahona-Ríos, Adrián, et al.
Published: (2023)
Language-based Audio Moment Retrieval
by: Munakata, Hokuto, et al.
Published: (2024)
Proactive Hearing Assistants that Isolate Egocentric Conversations
by: Hu, Guilin, et al.
Published: (2025)
Serenade: A Singing Style Conversion Framework Based On Audio Infilling
by: Violeta, Lester Phillip, et al.
Published: (2025)
HEAR: Hearing Enhanced Audio Response for Video-grounded Dialogue
by: Yoon, Sunjae, et al.
Published: (2023)
Vision Language Models Are Few-Shot Audio Spectrogram Classifiers
by: Dixit, Satvik, et al.
Published: (2024)
Leveraging Diverse Semantic-based Audio Pretrained Models for Singing Voice Conversion
by: Zhang, Xueyao, et al.
Published: (2023)
Continuous Audio Language Models
by: Rouard, Simon, et al.
Published: (2025)
Hearing from Silence: Reasoning Audio Descriptions from Silent Videos via Vision-Language Model
by: Ren, Yong, et al.
Published: (2025)
The CARFAC v2 Cochlear Model in Matlab, NumPy, and JAX
by: Lyon, Richard F., et al.
Published: (2024)
Generating Diverse Audio-Visual 360 Soundscapes for Sound Event Localization and Detection
by: Roman, Adrian S., et al.
Published: (2025)
I Can Hear You: Selective Robust Training for Deepfake Audio Detection
by: Zhang, Zirui, et al.
Published: (2024)
Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning
by: Kuan, Chun-Yi, et al.
Published: (2024)
Descriptor: Extended-Length Audio Dataset for Synthetic Voice Detection and Speaker Recognition (ELAD-SVDSR)
by: Vijaykumar, Rahul, et al.
Published: (2025)
Audio Conditioning for Music Generation via Discrete Bottleneck Features
by: Rouard, Simon, et al.
Published: (2024)
HybridVC: Efficient Voice Style Conversion with Text and Audio Prompts
by: Niu, Xinlei, et al.
Published: (2024)
The Extended SONICOM HRTF Dataset and Spatial Audio Metrics Toolbox
by: Poole, Katarina C., et al.
Published: (2025)
Pitch Contour Exploration Across Audio Domains: A Vision-Based Transfer Learning Approach
by: Abeßer, Jakob, et al.
Published: (2025)
RAVE for Speech: Efficient Voice Conversion at High Sampling Rates
by: Bargum, Anders R., et al.
Published: (2024)
Do Models Hear Like Us? Probing the Representational Alignment of Audio LLMs and Naturalistic EEG
by: Yang, Haoyun, et al.
Published: (2026)
UniAudio: An Audio Foundation Model Toward Universal Audio Generation
by: Yang, Dongchao, et al.
Published: (2023)
MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models
by: Gong, Yitian, et al.
Published: (2026)
Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)
Discrete Audio Representations for Automated Audio Captioning
by: Tian, Jingguang, et al.
Published: (2025)
Pengi: An Audio Language Model for Audio Tasks
by: Deshmukh, Soham, et al.
Published: (2023)
HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids
by: Wisnu, Dyah A. M. G., et al.
Published: (2024)
Did You Hear That? Introducing AADG: A Framework for Generating Benchmark Data in Audio Anomaly Detection
by: Raghavan, Ksheeraja, et al.
Published: (2024)
MACE: Leveraging Audio for Evaluating Audio Captioning Systems
by: Dixit, Satvik, et al.
Published: (2024)
Audio-Mind: An Auditable Agentic Framework for Audio Understanding
by: Wang, Yucheng, et al.
Published: (2026)
Audio Entailment: Assessing Deductive Reasoning for Audio Understanding
by: Deshmukh, Soham, et al.
Published: (2024)
SemanticAudio: Audio Generation and Editing in Semantic Space
by: Dai, Zheqi, et al.
Published: (2026)
Voice Conversion-based Privacy through Adversarial Information Hiding
by: Webber, Jacob J, et al.
Published: (2024)
A Multi-stage Low-latency Enhancement System for Hearing Aids
by: Ouyang, Chengwei, et al.
Published: (2025)
SRC-gAudio: Sampling-Rate-Controlled Audio Generation
by: Li, Chenxing, et al.
Published: (2024)
AudioLCM: Text-to-Audio Generation with Latent Consistency Models
by: Liu, Huadai, et al.
Published: (2024)