| Main Authors: | Chang, Chih-Cheng, Su, Li |
|---|---|
| Format: | Preprint |
| Published: | 2023 |
| Subjects: | |
| Online Access: | https://arxiv.org/abs/2312.17156 |
Similar Items
Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking with Self-supervised Learning Features
by: Deng, Jiajun, et al.
Published: (2025)
Beat and Downbeat Tracking in Performance MIDI Using an End-to-End Transformer Architecture
by: Murgul, Sebastian, et al.
Published: (2025)
Streaming Audio Transformers for Online Audio Tagging
by: Dinkel, Heinrich, et al.
Published: (2023)
HingeNet: A Harmonic-Aware Fine-Tuning Approach for Beat Tracking
by: Ru, Ganghui, et al.
Published: (2025)
MaskBeat: Loopable Drum Beat Generation
by: Lanzendörfer, Luca A., et al.
Published: (2025)
Enhancing Automatic Chord Recognition through LLM Chain-of-Thought Reasoning
by: Chang, Chih-Cheng, et al.
Published: (2025)
The SMC Blind Spot: A Failure Mode Analysis of State-of-the-Art Beat Tracking
by: Ahn, Jaehoon, et al.
Published: (2026)
Transformer-Based Rhythm Quantization of Performance MIDI Using Beat Annotations
by: Wachter, Maximilian, et al.
Published: (2026)
Controlling Contrastive Self-Supervised Learning with Knowledge-Driven Multiple Hypothesis: Application to Beat Tracking
by: Gagnere, Antonin, et al.
Published: (2025)
Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
by: Huang, Zikai, et al.
Published: (2024)
Streaming Sortformer: Speaker Cache-Based Online Speaker Diarization with Arrival-Time Ordering
by: Medennikov, Ivan, et al.
Published: (2025)
LS-EEND: Long-Form Streaming End-to-End Neural Diarization with Online Attractor Extraction
by: Liang, Di, et al.
Published: (2024)
A New Perspective on Speaker Verification: Joint Modeling with DFSMN and Transformer
by: Wang, Hongyu, et al.
Published: (2023)
SmoothSync: Dual-Stream Diffusion Transformers for Jitter-Robust Beat-Synchronized Gesture Generation from Quantized Audio
by: Jiang, Yujiao, et al.
Published: (2026)
StreamAAD: Decoding Spatial Auditory Attention with a Streaming Architecture
by: Qiu, Zelin, et al.
Published: (2024)
Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation
by: Bai, Ye, et al.
Published: (2024)
DualStream Contextual Fusion Network: Efficient Target Speaker Extraction by Leveraging Mixture and Enrollment Interactions
by: Xue, Ke, et al.
Published: (2025)
Lightweight Target-Speaker-Based Overlap Transcription for Practical Streaming ASR
by: Pražák, Aleš, et al.
Published: (2025)
Llasa+: Free Lunch for Accelerated and Streaming Llama-Based Speech Synthesis
by: Tian, Wenjie, et al.
Published: (2025)
Towards Streaming Synchronized Spatial Audio Generation via Autoregressive Diffusion Transformer
by: Lei, Ke, et al.
Published: (2026)
Joint Optimization of Streaming and Non-Streaming Automatic Speech Recognition with Multi-Decoder and Knowledge Distillation
by: Shakeel, Muhammad, et al.
Published: (2024)
Beat-Based Rhythm Quantization of MIDI Performances
by: Wachter, Maximilian, et al.
Published: (2025)
StreamVoice+: Evolving into End-to-end Streaming Zero-shot Voice Conversion
by: Wang, Zhichao, et al.
Published: (2024)
Streaming Keyword Spotting Boosted by Cross-layer Discrimination Consistency
by: Xi, Yu, et al.
Published: (2024)
Combining Deterministic Enhanced Conditions with Dual-Streaming Encoding for Diffusion-Based Speech Enhancement
by: Shi, Hao, et al.
Published: (2025)
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion
by: Ning, Ziqian, et al.
Published: (2023)
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement
by: Ren, Wenze, et al.
Published: (2024)
Multichannel Long-Term Streaming Neural Speech Enhancement for Static and Moving Speakers
by: Quan, Changsheng, et al.
Published: (2024)
StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding
by: Guo, Dake, et al.
Published: (2025)
CUSIDE-T: Chunking, Simulating Future and Decoding for Transducer based Streaming ASR
by: Zhao, Wenbo, et al.
Published: (2024)
Accelerating Diffusion Transformer-Based Text-to-Speech with Transformer Layer Caching
by: Sakpiboonchit, Siratish
Published: (2025)
PiCoGen2: Piano cover generation with transfer learning approach and weakly aligned data
by: Tan, Chih-Pin, et al.
Published: (2024)
DNN-Based Online Source Counting Based on Spatial Generalized Magnitude Squared Coherence
by: Gode, Henri, et al.
Published: (2026)
Phoenix-VAD: Streaming Semantic Endpoint Detection for Full-Duplex Speech Interaction
by: Wu, Weijie, et al.
Published: (2025)
Joint Speech and Text Training for LLM-Based End-to-End Spoken Dialogue State Tracking
by: Vendrame, Katia, et al.
Published: (2025)
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR
by: Li, Longhao, et al.
Published: (2025)
Robust Online Overdetermined Independent Vector Analysis Based on Bilinear Decomposition
by: Chen, Kang, et al.
Published: (2026)
ASTAR-NTU solution to AudioMOS Challenge 2025 Track1
by: Ritter-Gutierrez, Fabian, et al.
Published: (2025)
Relationships between Keywords and Strong Beats in Lyrical Music
by: Liao, Callie C., et al.
Published: (2024)
Beat this! Accurate beat tracking without DBN postprocessing
by: Foscarin, Francesco, et al.
Published: (2024)