Six Presentations at ICASSP 2026

We will present the following papers at ICASSP 2026.

  • SPATIAL-CLAP: LEARNING SPATIALLY-AWARE AUDIO–TEXT EMBEDDINGS FOR MULTI-SOURCE CONDITIONS
  • TTSOPS: A CLOSED-LOOP CORPUS OPTIMIZATION FRAMEWORK FOR TRAINING MULTI-SPEAKER TTS MODELS FROM DARK DATA
  • XACLE Challenge 2026: The first x-to-audio alignment challenge
  • MANGAVOX: DATASET OF ACTED VOICES ALIGNED WITH MANGA IMAGES TOWARDS COMPUTER UNDERSTANDING OF AUDIO COMICS
  • SS-JDSC: SINGLE-SPEAKER JAPANESE DYSARTHRIC SPEECH CORPUS
  • THREE-STAGE BSRNN FOR UNIVERSAL SPEECH ENHANCEMENT AND DATA CURATION USING A LARGE PRE-TRAINED SPEECH RESTORATION MODEL

References

2026

  1. SPATIAL-CLAP: LEARNING SPATIALLY-AWARE AUDIO–TEXT EMBEDDINGS FOR MULTI-SOURCE CONDITIONS
    Kentaro Seki, Yuki Okamoto, Kouei Yamaoka, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026
  2. TTSOPS: A CLOSED-LOOP CORPUS OPTIMIZATION FRAMEWORK FOR TRAINING MULTI-SPEAKER TTS MODELS FROM DARK DATA
    Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, and Hiroshi Saruwatari
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026
  3. XACLE Challenge 2026: The first x-to-audio alignment challenge
    Yuki Okamoto, Riki Takizawa, Minoru Kishi, Yusuke Kanamori, Noriyuki Tonami, Ryotaro Nagase, Shinnosuke Takamichi, and Keisuke Imoto
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026
  4. MANGAVOX: DATASET OF ACTED VOICES ALIGNED WITH MANGA IMAGES TOWARDS COMPUTER UNDERSTANDING OF AUDIO COMICS
    Shinnosuke Takamichi, Tomohiko Nakamura, Hitoshi Suda, Satoru Fukayama, and Jun Ogata
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026
  5. SS-JDSC: SINGLE-SPEAKER JAPANESE DYSARTHRIC SPEECH CORPUS
    Asahi Ogasawara, Shinnosuke Takamichi, Jianing Yang, Go Suenaga, and Yiyu Tan
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026
  6. THREE-STAGE BSRNN FOR UNIVERSAL SPEECH ENHANCEMENT AND DATA CURATION USING A LARGE PRE-TRAINED SPEECH RESTORATION MODEL
    Ryutaro Matsunaga, Ryo Takahashi, and Shinnosuke Takamichi
    In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026