Publications | Takamichi Lab. / 高道研究室

2026

Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN

Toranosuke Manabe, Yuto Shibata, Shinnosuke Takamichi, and Yoshimitsu Aoki

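In International Conference on Pattern Recognition (ICPR), Aug 2026

@inproceedings{manabe26icpr_sign-to-speech-prosody-transfer,
  abbr_publisher = {International Conference on Pattern Recognition (ICPR)},
  booktitle = {International Conference on Pattern Recognition (ICPR)},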
  title = {Sign-to-Speech Prosody Transfer via Sign Reconstruction-based GAN},
  author = {Manabe, Toranosuke and Shibata, Yuto and Takamichi, Shinnosuke and Aoki, Yoshimitsu},
  year = {2026},
  month = aug
}

SPATIAL-CLAP: LEARNING SPATIALLY-AWARE AUDIO–TEXT EMBEDDINGS FOR MULTI-SOURCE CONDITIONS

Kentaro Seki, Yuki Okamoto, Kouei Yamaoka, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{seki26icassp_spatial-clap,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {SPATIAL-CLAP: LEARNING SPATIALLY-AWARE AUDIO–TEXT EMBEDDINGS FOR MULTI-SOURCE CONDITIONS},
  author = {Seki, Kentaro and Okamoto, Yuki and Yamaoka, Kouei and Saito, Yuki and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2026},
  month = may
}

TTSOPS: A CLOSED-LOOP CORPUS OPTIMIZATION FRAMEWORK FOR TRAINING MULTI-SPEAKER TTS MODELS FROM DARK DATA

Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, and Hiroshi Saruwatari

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{seki26icassp_ttsops,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {TTSOPS: A CLOSED-LOOP CORPUS OPTIMIZATION FRAMEWORK FOR TRAINING MULTI-SPEAKER TTS MODELS FROM DARK DATA},
  author = {Seki, Kentaro and Takamichi, Shinnosuke and Saeki, Takaaki and Saruwatari, Hiroshi},
  year = {2026},
  month = may
}

XACLE Challenge 2026: The first x-to-audio alignment challenge

Yuki Okamoto, Riki Takizawa, Minoru Kishi, Yusuke Kanamori, Noriyuki Tonami, Ryotaro Nagase, Shinnosuke Takamichi, and Keisuke Imoto

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{okamoto26icassp_xacle-challenge,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {XACLE Challenge 2026: The first x-to-audio alignment challenge},
  author = {Okamoto, Yuki and Takizawa, Riki and Kishi, Minoru and Kanamori, Yusuke and Tonami, Noriyuki and Nagase, Ryotaro and Takamichi, Shinnosuke and Imoto, Keisuke},
  year = {2026},
  month = may
}

MANGAVOX: DATASET OF ACTED VOICES ALIGNED WITH MANGA IMAGES TOWARDS COMPUTER UNDERSTANDING OF AUDIO COMICS

Shinnosuke Takamichi, Tomohiko Nakamura, Hitoshi Suda, Satoru Fukayama, and Jun Ogata

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{takamichi26icassp_mangavox,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {MANGAVOX: DATASET OF ACTED VOICES ALIGNED WITH MANGA IMAGES TOWARDS COMPUTER UNDERSTANDING OF AUDIO COMICS},
  author = {Takamichi, Shinnosuke and Nakamura, Tomohiko and Suda, Hitoshi and Fukayama, Satoru and Ogata, Jun},
  year = {2026},
  month = may
}

SS-JDSC: SINGLE-SPEAKER JAPANESE DYSARTHRIC SPEECH CORPUS

Asahi Ogasawara, Shinnosuke Takamichi, Jianing Yang, Go Suenaga, and Yiyu Tan

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{ogasawara26icassp_ss-jdsc,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {SS-JDSC: SINGLE-SPEAKER JAPANESE DYSARTHRIC SPEECH CORPUS},
  author = {Ogasawara, Asahi and Takamichi, Shinnosuke and Yang, Jianing and Suenaga, Go and Tan, Yiyu},
  year = {2026},
  month = may
}

THREE-STAGE BSRNN FOR UNIVERSAL SPEECH ENHANCEMENT AND DATA CURATION USING A LARGE PRE-TRAINED SPEECH RESTORATION MODEL

Ryutaro Matsunaga, Ryo Takahashi, and Shinnosuke Takamichi

In Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2026

@inproceedings{matsunaga26icassp_three-stage-bsrnn,
  abbr_publisher = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  title = {THREE-STAGE BSRNN FOR UNIVERSAL SPEECH ENHANCEMENT AND DATA CURATION USING A LARGE PRE-TRAINED SPEECH RESTORATION MODEL},
  author = {Matsunaga, Ryutaro and Takahashi, Ryo and Takamichi, Shinnosuke},
  year = {2026},
  month = may
}

日本語コーパス案内図

高道慎之介

日本語学, Mar 2026

(Invited article / 招待記事)

@article{takamichi26meijishoin_nihongogaku-japanese-map,
  abbr_publisher = {コーパス},
  title = {日本語コーパス案内図},
  author = {慎之介, 高道},
  year = {2026},
  month = mar,
  journal = {日本語学},
  note = {(Invited article / 招待記事)}
}

LLMを用いたゲーム実況テキストの評価手法の検討

井手平彩夏, 須藤克仁, 高道慎之介, 齋藤佑樹, ニュービッググラム, 高村大也, and 石垣達也

In 情報処理学会自然言語処理研究会, Mar 2026

@inproceedings{idehira26ipsjnl_llm-game-commentary-eval,
  abbr_publisher = {情報処理学会 自然言語処理研究会},
  booktitle = {情報処理学会 自然言語処理研究会},
  title = {LLMを用いたゲーム実況テキストの評価手法の検討},
  author = {彩夏, 井手平 and 克仁, 須藤 and 慎之介, 高道 and 佑樹, 齋藤 and グラム, ニュービッグ and 大也, 高村 and 達也, 石垣},
  year = {2026},
  month = mar
}

声質変換による原話者音声出力を行う音声から音声への同時翻訳システム

須藤克仁, 譚皓天, 西川勇太, 加納保昌, サクティサクリアニ, 戸田智基, and 中村哲

In 情報処理学会自然言語処理研究会, Mar 2026

@inproceedings{sudo26ipsjnl_simul-s2st-voice-conversion,
  abbr_publisher = {情報処理学会 自然言語処理研究会},
  booktitle = {情報処理学会 自然言語処理研究会},
  title = {声質変換による原話者音声出力を行う音声から音声への同時翻訳システム},
  author = {克仁, 須藤 and 皓天, 譚 and 勇太, 西川 and 保昌, 加納 and サクティサクリアニ and 智基, 戸田 and 哲, 中村},
  year = {2026},
  month = mar
}

一般的・魅力的な男性の声質を用いた低遅延音声変換による聴覚提示が自尊心に与える影響の検証

國見友亮, 木村健太, 鳴海拓志, 高道慎之介, and 持丸正明

In 多感覚研究会, Feb 2026

@inproceedings{kunimi26multisens_selfesteem-voice-conversion,
  abbr_publisher = {多感覚研究会},
  booktitle = {多感覚研究会},
  title = {一般的・魅力的な男性の声質を用いた低遅延音声変換による聴覚提示が自尊心に与える影響の検証},
  author = {友亮, 國見 and 健太, 木村 and 拓志, 鳴海 and 慎之介, 高道 and 正明, 持丸},
  year = {2026},
  month = feb
}

競技かるたにおける決まり字の音声リアルタイム予測

高見澤芽生, 桑名真結香, 高木健, 高道慎之介, 松田孟留, and 鳴海紘也

In 情報処理学会全国大会, Mar 2026

@inproceedings{takamisawa26ipsj_karuta-kimariji,
  abbr_publisher = {情報処理学会 全国大会},
  booktitle = {情報処理学会 全国大会},
  title = {競技かるたにおける決まり字の音声リアルタイム予測},
  author = {芽生, 高見澤 and 真結香, 桑名 and 健, 高木 and 慎之介, 高道 and 孟留, 松田 and 紘也, 鳴海},
  year = {2026},
  month = mar
}

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling

Wataru Nakata, Kentaro Seki, Hitomi Yanaka, Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari

In Proceedings of the International Conference on Language Resources and Evaluation (LREC), May 2026

@inproceedings{nakata26lrec_j-chat,
  abbr_publisher = {Proceedings of the International Conference on Language Resources and Evaluation (LREC)},
  booktitle = {Proceedings of the International Conference on Language Resources and Evaluation (LREC)},
  title = {J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling},
  author = {Nakata, Wataru and Seki, Kentaro and Yanaka, Hitomi and Saito, Yuki and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2026},
  month = may
}

Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches

Anum Afzal, Hiroya Takamura, Katsuhito Sudoh, Yuki Saito, Shinnosuke Takamichi, Graham Neubig, Florian Matthes, and Tatsuya Ishigaki

In Proceedings of the International Conference on Language Resources and Evaluation (LREC), May 2026

@inproceedings{afzal26lrec_real-time-video-commentary,
  abbr_publisher = {Proceedings of the International Conference on Language Resources and Evaluation (LREC)},
  booktitle = {Proceedings of the International Conference on Language Resources and Evaluation (LREC)},
  title = {Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches},
  author = {Afzal, Anum and Takamura, Hiroya and Sudoh, Katsuhito and Saito, Yuki and Takamichi, Shinnosuke and Neubig, Graham and Matthes, Florian and Ishigaki, Tatsuya},
  year = {2026},
  month = may
}

声道パラメータ表現および強化学習を利用したText-to-Action-to-Speech

小野晶子, 加藤徳啓, and 高道慎之介

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{ono26speasip_text-to-action-to-speech,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {声道パラメータ表現および強化学習を利用したText-to-Action-to-Speech},
  author = {晶子, 小野 and 徳啓, 加藤 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

Full-duplex音声対話モデルにおける性別表現のプロービング

八木颯斗, 稲垣賢斗, 高島悠樹, 安藤厚志, and 高道慎之介

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{yagi26speasip_full-duplex-probing,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {Full-duplex音声対話モデルにおける性別表現のプロービング},
  author = {颯斗, 八木 and 賢斗, 稲垣 and 悠樹, 高島 and 厚志, 安藤 and 慎之介, 高道},
  year = {2026},
  month = mar
}

Altered auditory feedbackに基づく感情誘導における音声特徴量弁別閾の調査

中村颯, 福田航希, 高道慎之介, and 大畑龍

In 情報処理学会音声言語処理研究会, Mar 2026

@inproceedings{nakamura26speasip_auditory-feedback-emotion,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {Altered auditory feedbackに基づく感情誘導における音声特徴量弁別閾の調査},
  author = {颯, 中村 and 航希, 福田 and 慎之介, 高道 and 龍, 大畑},
  year = {2026},
  month = mar,
}

多ジャンルのスポーツ音声実況における音声特徴量の時間的構造の調査

松下嶺佑, 高道慎之介, 齋藤佑樹, ニュービッググラム, 須藤克仁, 高村大也, and 石垣達也

In 情報処理学会音声言語処理研究会, Mar 2026

@inproceedings{matsushita26speasip_sports-commentary-structure,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {多ジャンルのスポーツ音声実況における音声特徴量の時間的構造の調査},
  author = {嶺佑, 松下 and 慎之介, 高道 and 佑樹, 齋藤 and グラム, ニュービッグ and 克仁, 須藤 and 大也, 高村 and 達也, 石垣},
  year = {2026},
  month = mar,
}

既存データセットとの意図しない重複を避ける環境音評価データセットの半自動構築法

岸秀, 高道慎之介, 滝沢力, 金森勇介, 砺波紀之, 永瀬亮太郎, 井本桂右, and 岡本悠希

In 電子情報通信学会応用音響研究会, Mar 2026

@inproceedings{kishi26speasip_environmental-sound-dataset,
  abbr_publisher = {電子情報通信学会 応用音響研究会},
  booktitle = {電子情報通信学会 応用音響研究会},
  title = {既存データセットとの意図しない重複を避ける環境音評価データセットの半自動構築法},
  author = {秀, 岸 and 慎之介, 高道 and 力, 滝沢 and 勇介, 金森 and 紀之, 砺波 and 亮太郎, 永瀬 and 桂右, 井本 and 悠希, 岡本},
  year = {2026},
  month = mar,
}

SMASHコーパスDLC：対戦ゲーム動画に対する掛け合い実況解説音声コーパス

齋藤佑樹, 川松亮太, 高道慎之介, ニュービッググラム, 須藤克仁, 猿渡洋, 高村大也, and 石垣達也

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{saito26speasip_smash-corpus-dlc,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {SMASHコーパスDLC：対戦ゲーム動画に対する掛け合い実況解説音声コーパス},
  author = {佑樹, 齋藤 and 亮太, 川松 and 慎之介, 高道 and グラム, ニュービッグ and 克仁, 須藤 and 洋, 猿渡 and 大也, 高村 and 達也, 石垣},
  year = {2026},
  month = mar,
}

プロンプト音声合成を用いた漫画音声合成

越野颯太, 上治正太郎, 高道慎之介, and 中村友彦

In 情報処理学会音声言語処理研究会, Mar 2026

@inproceedings{koshino26speasip_manga-speech-synthesis,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {プロンプト音声合成を用いた漫画音声合成},
  author = {颯太, 越野 and 正太郎, 上治 and 慎之介, 高道 and 友彦, 中村},
  year = {2026},
  month = mar
}

不正収録音声から合成されたディープフェイク音声によるなりすまし攻撃

古林嵯羽仁, 高道慎之介, and 塩田さやか

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{furubayashi26speasip_deepfake-spoofing-attack,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {不正収録音声から合成されたディープフェイク音声によるなりすまし攻撃},
  author = {嵯羽仁, 古林 and 慎之介, 高道 and さやか, 塩田},
  year = {2026},
  month = mar,
}

J-SpAW2:録音再生攻撃によるなりすまし音声の収録環境を分析可能な日本語音声コーパス

堀江涼花, 高道慎之介, and 塩田さやか

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{horie26speasip_jspaw2-corpus,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {J-SpAW2:録音再生攻撃によるなりすまし音声の収録環境を分析可能な日本語音声コーパス},
  author = {涼花, 堀江 and 慎之介, 高道 and さやか, 塩田},
  year = {2026},
  month = mar,
}

環境音と説明文の意味的関連性に関する主観評価データセットの分析

金森勇介, 岡本悠希, 高道慎之介, 齋藤佑樹, and 猿渡洋

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{kanamori26speasip_environmental-sound-relevance,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {環境音と説明文の意味的関連性に関する主観評価データセットの分析},
  author = {勇介, 金森 and 悠希, 岡本 and 慎之介, 高道 and 佑樹, 齋藤 and 洋, 猿渡},
  year = {2026},
  month = mar
}

空間音とテキストの対照学習による音源情報と空間情報の分離表現学習

上治正太郎, 高道慎之介, and 山岡洸瑛

In 電子情報通信学会応用音響研究会, Mar 2026

@inproceedings{ueji26speasip_spatial-audio-text-learning,
  abbr_publisher = {電子情報通信学会 応用音響研究会},
  booktitle = {電子情報通信学会 応用音響研究会},
  title = {空間音とテキストの対照学習による音源情報と空間情報の分離表現学習},
  author = {正太郎, 上治 and 慎之介, 高道 and 洸瑛, 山岡},
  year = {2026},
  month = mar,
}

ニューラルオーディオコーデックにおける雑音頑健性分析～ Zipf則・Heaps則に基づく言語統計構造と劣化音声の関係～

朴浚鎔, 高道慎之介, David M. Chan, 神藤駿介, 齋藤佑樹, and 猿渡洋

In 電子情報通信学会音声研究会, Mar 2026

@inproceedings{park26speasip_neural-codec-robustness,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {ニューラルオーディオコーデックにおける雑音頑健性分析 ～ Zipf則・Heaps則に基づく言語統計構造と劣化音声の関係 ～},
  author = {浚鎔, 朴 and 慎之介, 高道 and Chan, David M. and 駿介, 神藤 and 佑樹, 齋藤 and 洋, 猿渡},
  year = {2026},
  month = mar,
}

大規模言語モデルの音象徴ベンチマーク

稲垣賢斗, 神藤駿介, and 高道慎之介

In 言語処理学会全国大会, Mar 2026

@inproceedings{inagaki26nlp_sound-symbolism-benchmark,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {大規模言語モデルの音象徴ベンチマーク},
  author = {賢斗, 稲垣 and 駿介, 神藤 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

一人称・三人称視点対話収録システムとエゴセントリック津軽弁音声対話コーパスの構築

阪井瞭介, 江舒婷, 郭傲, 高道慎之介, 小川哲司, and 東中竜一郎

In 言語処理学会全国大会, Mar 2026

@inproceedings{sakai26nlp_egocentric-tsugaru-corpus,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {一人称・三人称視点対話収録システムとエゴセントリック津軽弁音声対話コーパスの構築},
  author = {瞭介, 阪井 and 舒婷, 江 and 傲, 郭 and 慎之介, 高道 and 哲司, 小川 and 竜一郎, 東中},
  year = {2026},
  month = mar,
}

並列テキスト生成による低遅延ゲーム音声実況システム

川松亮太, 齋藤佑樹, 高道慎之介, ニュービッググラム, 須藤克仁, 高村大也, and 石垣達也

In 言語処理学会全国大会, Mar 2026

@inproceedings{kawamatsu26nlp_game-commentary-system,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {並列テキスト生成による低遅延ゲーム音声実況システム},
  author = {亮太, 川松 and 佑樹, 齋藤 and 慎之介, 高道 and グラム, ニュービッグ and 克仁, 須藤 and 大也, 高村 and 達也, 石垣},
  year = {2026},
  month = mar,
}

仙台市方言を合成する方言音声合成の構築と小学校 2 年生向け方言学習授業における活用報告

高道慎之介, 丹治尚子, 庄司潤子, 佐藤照一, and 田村文子

In 言語処理学会全国大会, Mar 2026

@inproceedings{takamichi26nlp_sendai-dialect-synthesis,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {仙台市方言を合成する方言音声合成の構築と小学校 2 年生向け方言学習授業における活用報告},
  author = {慎之介, 高道 and 尚子, 丹治 and 潤子, 庄司 and 照一, 佐藤 and 文子, 田村},
  year = {2026},
  month = mar,
}

記述言語学に基づいた方言音声合成評価枠組みの構築と池間西原方言音声合成の評価

佐藤なな子, 阪井瞭介, 中田亘, 高道慎之介, 中川奈津子, 林由華, 宮川創, and 坂井美日

In 言語処理学会全国大会, Mar 2026

@inproceedings{sato26nlp_ikema-dialect-evaluation,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {記述言語学に基づいた方言音声合成評価枠組みの構築と池間西原方言音声合成の評価},
  author = {なな子, 佐藤 and 瞭介, 阪井 and 亘, 中田 and 慎之介, 高道 and 奈津子, 中川 and 由華, 林 and 創, 宮川 and 美日, 坂井},
  year = {2026},
  month = mar,
}

Theory of mind のベンチマーク指標は対話能力と関係があるのか？ LLM における対話能力と Theory of Mind の相関分析

伊勢野晴久, 大橋厚元, 小川哲司, 高道慎之介, and 東中竜一郎

In 言語処理学会全国大会, Mar 2026

@inproceedings{iseno26nlp_tom-dialogue-analysis,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {Theory of mind のベンチマーク指標は対話能力と関係があるのか？ LLM における対話能力と Theory of Mind の相関分析},
  author = {晴久, 伊勢野 and 厚元, 大橋 and 哲司, 小川 and 慎之介, 高道 and 竜一郎, 東中},
  year = {2026},
  month = mar,
}

音楽基盤モデルにおける音響特徴と内在音高螺旋の関係

八木颯斗, 高道慎之介, 佐藤りん, 田中啓太郎, and 森島繁生

In 情報処理学会音楽情報科学研究会, Mar 2026

@inproceedings{yagi26mus_acoustic-pitch-spiral,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {音楽基盤モデルにおける音響特徴と内在音高螺旋の関係},
  author = {颯斗, 八木 and 慎之介, 高道 and りん, 佐藤 and 啓太郎, 田中 and 繁生, 森島},
  year = {2026},
  month = mar,
}

大規模言語モデルと自己修正に基づく歌唱可能な歌詞への phonemic translation

阪井瞭介, 深尾貫太, and 高道慎之介

In 情報処理学会音楽情報科学研究会, Mar 2026

@inproceedings{sakai26mus_phonemic-translation,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {大規模言語モデルと自己修正に基づく歌唱可能な歌詞への phonemic translation},
  author = {瞭介, 阪井 and 貫太, 深尾 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

音楽基盤モデルの表現形成における学習過程の解析手法の検討

佐藤りん, 田中啓太郎, 八木颯斗, 高道慎之介, and 森島繁生

In 情報処理学会音楽情報科学研究会, Mar 2026

@inproceedings{sato26mus_inverse-learning-analysis,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {音楽基盤モデルの表現形成における学習過程の解析手法の検討},
  author = {りん, 佐藤 and 啓太郎, 田中 and 颯斗, 八木 and 慎之介, 高道 and 繁生, 森島},
  year = {2026},
  month = mar,
}

人間-AI斉唱において合成歌声特徴量の変調が斉唱らしさにもたらす効果

三井啓史, 松下嶺佑, 深尾貫太, and 高道慎之介

In 情報処理学会音楽情報科学研究会, Mar 2026

@inproceedings{mitsui26mus_human-ai-unison-modulation,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {人間-AI斉唱において合成歌声特徴量の変調が斉唱らしさにもたらす効果},
  author = {啓史, 三井 and 嶺佑, 松下 and 貫太, 深尾 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

人間-AI斉唱における合成歌声の呼吸パラメータの歌唱者間リアルタイム同期

深尾貫太, 三井啓史, 小野晶子, 上原祟寛, and 高道慎之介

In 情報処理学会音楽情報科学研究会, Mar 2026

@inproceedings{fukao26mus_realtime-sync-breathing,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {人間-AI斉唱における合成歌声の呼吸パラメータの歌唱者間リアルタイム同期},
  author = {貫太, 深尾 and 啓史, 三井 and 晶子, 小野 and 祟寛, 上原 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

発話内容ドメイン教師あり LoRA による構音障害音声認識のドメイン適応

小笠原朝陽, 高道慎之介, 末永剛, and 談宜育

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{ogasawara26asjs_lora-domain-adaptation,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {発話内容ドメイン教師あり LoRA による構音障害音声認識のドメイン適応},
  author = {朝陽, 小笠原 and 慎之介, 高道 and 剛, 末永 and 宜育, 談},
  year = {2026},
  month = mar,
}

Moshi に基づく音声対話モデルの日本語ファインチューニングにおける対話データ特性の影響

阿部雄斗, 佐伯真於, 大橋厚元, 高道慎之介, 藤江真也, 小林哲則, 小川哲司, and 東中竜一郎

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{abe26asjs_moshi-dialogue,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {Moshi に基づく音声対話モデルの日本語ファインチューニングにおける対話データ特性の影響},
  author = {雄斗, 阿部 and 真於, 佐伯 and 厚元, 大橋 and 慎之介, 高道 and 真也, 藤江 and 哲則, 小林 and 哲司, 小川 and 竜一郎, 東中},
  year = {2026},
  month = mar,
}

XACLE Challenge 2026: 環境音とテキストにおける主観的意味関連性の自動評価に向けた国際コンペティション

岡本悠希, 滝沢力, 岸秀, 金森勇介, 砺波紀之, 永瀬亮太郎, 高道慎之介, and 井本桂右

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{okamoto26asjs_xacle-challenge,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {XACLE Challenge 2026: 環境音とテキストにおける主観的意味関連性の自動評価に向けた国際コンペティション},
  author = {悠希, 岡本 and 力, 滝沢 and 秀, 岸 and 勇介, 金森 and 紀之, 砺波 and 亮太郎, 永瀬 and 慎之介, 高道 and 桂右, 井本},
  year = {2026},
  month = mar,
}

Spatial Audio Captioning: 複数音源状況下における空間情報を伴う説明文の生成とその評価

関健太郎, 岡本悠希, 山岡洸瑛, 齋藤佑樹, 高道慎之介, and 猿渡洋

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{seki26asjs_spatial-captioning,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {Spatial Audio Captioning: 複数音源状況下における空間情報を伴う説明文の生成とその評価},
  author = {健太郎, 関 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2026},
  month = mar,
}

ボイスコミックデータセット MangaVox が拓く音声科学・工学タスク

高道慎之介, 中村友彦, 須田仁志, 深山覚, and 緒方淳

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{takamichi26asjs_mangavox,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {ボイスコミックデータセット MangaVox が拓く音声科学・工学タスク},
  author = {慎之介, 高道 and 友彦, 中村 and 仁志, 須田 and 覚, 深山 and 淳, 緒方},
  year = {2026},
  month = mar,
}

SS-JDSC：単一話者日本語構音障害音声コーパス

小笠原朝陽, 高道慎之介, 楊家寧, 末永剛, and 談宜育

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{takamichi26asjs_ssjdsc,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {SS-JDSC：単一話者日本語構音障害音声コーパス},
  author = {朝陽, 小笠原 and 慎之介, 高道 and 家寧, 楊 and 剛, 末永 and 宜育, 談},
  year = {2026},
  month = mar
}

TTSOps 2.0: テキスト音声合成におけるデータ収集・前処理・学習プロセスの統合的最適化

関健太郎, 齋藤佑樹, 高道慎之介, 佐伯高明, and 猿渡洋

In 日本音響学会春季研究発表会, Mar 2026

@inproceedings{seki26asjs_ttsops2,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {TTSOps 2.0: テキスト音声合成におけるデータ収集・前処理・学習プロセスの統合的最適化},
  author = {健太郎, 関 and 佑樹, 齋藤 and 慎之介, 高道 and 高明, 佐伯 and 洋, 猿渡},
  year = {2026},
  month = mar,
}

Effects of Dialogue Corpora Properties on Fine-Tuning a Moshi-Based Spoken Dialogue Model

Yuto Abe, Mao Saeki, Atsumoto Ohashi, Shinnosuke Takamichi, Shinya Fujie, Tetsunori Kobayashi, Tetsuji Ogawa, and Ryuichiro Higashinaka

In Proceedings of the International Workshop on Spoken Dialogue Systems (IWSDS), Feb 2026

@inproceedings{abe26iwsds_moshi-corpus-properties,
  abbr_publisher = {Proceedings of the International Workshop on Spoken Dialogue Systems (IWSDS)},
  booktitle = {Proceedings of the International Workshop on Spoken Dialogue Systems (IWSDS)},
  title = {Effects of Dialogue Corpora Properties on Fine-Tuning a Moshi-Based Spoken Dialogue Model},
  author = {Abe, Yuto and Saeki, Mao and Ohashi, Atsumoto and Takamichi, Shinnosuke and Fujie, Shinya and Kobayashi, Tetsunori and Ogawa, Tetsuji and Higashinaka, Ryuichiro},
  year = {2026},
  month = feb,
}

Investigating the Effects of Translation Quality on LLM Performance in Machine-Translated Theory of Mind Benchmarks

Haruhisa Iseno, Atsumoto Ohashi, Tetsuji Ogawa, Shinnosuke Takamichi, and Ryuichiro Higashinaka

In Proceedings of Advancing Artificial Intelligence through Theory of Mind (ToM4AI), Jan 2026

@inproceedings{iseno26aaai_translation-tom,
  abbr_publisher = {Proceedings of Advancing Artificial Intelligence through Theory of Mind (ToM4AI)},
  booktitle = {Proceedings of Advancing Artificial Intelligence through Theory of Mind (ToM4AI)},
  title = {Investigating the Effects of Translation Quality on LLM Performance in Machine-Translated Theory of Mind Benchmarks},
  author = {Iseno, Haruhisa and Ohashi, Atsumoto and Ogawa, Tetsuji and Takamichi, Shinnosuke and Higashinaka, Ryuichiro},
  year = {2026},
  month = jan,
}

AudioBERTScore: Objective Evaluation of Environmental Sound Synthesis Based on Similarity of Audio embedding Sequences

Minoru Kishi, Ryosuke Sakai, Shinnosuke Takamichi, Yusuke Kanamori, and Yuki Okamoto

In Proceedings of Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI), Jan 2026

@inproceedings{kishi26aaai_audiobertscore,
  abbr_publisher = {Proceedings of Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI)},
  booktitle = {Proceedings of Audio-Centric AI: Towards Real-World Multimodal Reasoning and Application Use Cases (Audio-AAAI)},
  title = {AudioBERTScore: Objective Evaluation of Environmental Sound Synthesis Based on Similarity of Audio embedding Sequences},
  author = {Kishi, Minoru and Sakai, Ryosuke and Takamichi, Shinnosuke and Kanamori, Yusuke and Okamoto, Yuki},
  year = {2026},
  month = jan,
}

2025

Excitement-Inducing Commentary Text-to-Speech System for Fighting Game Video Scenes

Kota Iura, Yuki Saito, Shinnosuke Takamichi, Graham Neubig, Katsuhito Sudoh, Hiroshi Saruwatari, Hiroya Takamura, and Tatsuya Ishigaki

IEEE Access, Jan 2025

@article{iura25ieee-access_commentary-tts,
  title = {Excitement-Inducing Commentary Text-to-Speech System for Fighting Game Video Scenes},
  author = {Iura, Kota and Saito, Yuki and Takamichi, Shinnosuke and Neubig, Graham and Sudoh, Katsuhito and Saruwatari, Hiroshi and Takamura, Hiroya and Ishigaki, Tatsuya},
  year = {2025},
  journal = {IEEE Access},
}

Toward Data-Efficient Speech Synthesis: Active Learning–Based Corpus Construction for Multi-Speaker Text-to-Speech Synthesis

Kentaro Seki, Yuki Saito, Shinnosuke Takamichi, Takaaki Saeki, and Hiroshi Saruwatari

IEEE Access, Jan 2025

@article{seki25ieee-access_data-efficient-tts,
  title = {Toward Data-Efficient Speech Synthesis: Active Learning–Based Corpus Construction for Multi-Speaker Text-to-Speech Synthesis},
  author = {Seki, Kentaro and Saito, Yuki and Takamichi, Shinnosuke and Saeki, Takaaki and Saruwatari, Hiroshi},
  year = {2025},
  journal = {IEEE Access},
}

情報工学から見た新しい音声研究とコーパス

高道慎之介

日本語学, Dec 2025

(Invited article / 招待記事)

@article{takamichi25meijishoin_nihongogaku,
  abbr_publisher = {コーパス},
  title = {情報工学から見た新しい音声研究とコーパス},
  author = {慎之介, 高道},
  year = {2025},
  month = dec,
  journal = {日本語学},
  note = {(Invited article / 招待記事)}
}

TTSOps: A Closed-Loop Corpus Optimization Framework for Training Multi-Speaker TTS Models from Dark Data

Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, and Hiroshi Saruwatari

IEEE Transactions on Audio, Speech, and Language Processing, Nov 2025

@article{seki25taslp_ttsops,
  abbr_publisher = {IEEE Transactions on Audio, Speech, and Language Processing},
  journal = {IEEE Transactions on Audio, Speech, and Language Processing},
  title = {TTSOps: A Closed-Loop Corpus Optimization Framework for Training Multi-Speaker TTS Models from Dark Data},
  author = {Seki, Kentaro and Takamichi, Shinnosuke and Saeki, Takaaki and Saruwatari, Hiroshi},
  year = {2025},
  month = nov,
}

How do audio foundation models understand sound?

Shinnosuke Takamichi

In Proceedings of the International Workshop on Symbolic-Neural Learning (SNL), Oct 2025

(Invited talk / 招待講演)

@inproceedings{takamichi25snl_audio-foundation-models,
  abbr_publisher = {Proceedings of the International Workshop on Symbolic-Neural Learning (SNL)},
  booktitle = {Proceedings of the International Workshop on Symbolic-Neural Learning (SNL)},
  title = {How do audio foundation models understand sound?},
  author = {Takamichi, Shinnosuke},
  year = {2025},
  month = oct,
  note = {(Invited talk / 招待講演)}
}

Analysis of the Correlation Between Theory of Mind and Dialogue Ability to Identify Essential ToM for Dialogue Systems

Haruhisa Iseno, Atsumoto Ohashi, Tetsuji Ogawa, Shinnosuke Takamichi, and Ryuichiro Higashinaka

In PACLIC, Dec 2025

@inproceedings{iseno25paclic_tom-dialogue,
  abbr_publisher = {PACLIC},
  booktitle = {PACLIC},
  title = {Analysis of the Correlation Between Theory of Mind and Dialogue Ability to Identify Essential ToM for Dialogue Systems},
  author = {Iseno, Haruhisa and Ohashi, Atsumoto and Ogawa, Tetsuji and Takamichi, Shinnosuke and Higashinaka, Ryuichiro},
  year = {2025},
  month = dec,
}

Blended English as an International Language Learning Program Utilizing Text-to-Speech Technology: A Pilot Study

Yasushige Ishikawa, Tomoko Kasamaki, Shinnosuke Takamichi, Yuta Matsunaga, Shigeo Fujiwara, Yusuke Yoshikawa, Kikuko Yui, and Takatoyo Umemoto

eLearn, Oct 2025

@article{ishikawa25elearn_tts-eil,
  abbr_publisher = {音声合成},
  title = {Blended English as an International Language Learning Program Utilizing Text-to-Speech Technology: A Pilot Study},
  author = {Ishikawa, Yasushige and Kasamaki, Tomoko and Takamichi, Shinnosuke and Matsunaga, Yuta and Fujiwara, Shigeo and Yoshikawa, Yusuke and Yui, Kikuko and Umemoto, Takatoyo},
  year = {2025},
  month = oct,
  journal = {eLearn}
}

RELATE: 環境音と説明文の意味的関連性の自動評価に向けた主観評価データセットの構築

金森勇介, 岡本悠希, 高道慎之介, 齋藤佑樹, and 猿渡洋

In 言語処理学会若手支援事業, Sep 2025

@inproceedings{kanamori25yans_relate,
  abbr_publisher = {言語処理学会 若手支援事業},
  booktitle = {言語処理学会 若手支援事業},
  title = {RELATE: 環境音と説明文の意味的関連性の自動評価に向けた主観評価データセットの構築},
  author = {勇介, 金森 and 悠希, 岡本 and 慎之介, 高道 and 佑樹, 齋藤 and 洋, 猿渡},
  year = {2025},
  month = sep
}

ステレオ信号に対する空間情報を伴う音響キャプショニング

関健太郎, 岡本悠希, 山岡洸瑛, 齋藤佑樹, 高道慎之介, and 猿渡洋

In 言語処理学会若手支援事業, Sep 2025

@inproceedings{seki25yans_audio-captioning,
  abbr_publisher = {言語処理学会 若手支援事業},
  booktitle = {言語処理学会 若手支援事業},
  title = {ステレオ信号に対する空間情報を伴う音響キャプショニング},
  author = {健太郎, 関 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
  month = sep
}

Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology

Rinka Nobukawa, Makito Kitamura, Tomohiko Nakamura, Shinnosuke Takamichi, and Hiroshi Saruwatari

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Oct 2025

@inproceedings{nobukawa25apsipa_drum-to-vocalpercussion,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology},
  author = {Nobukawa, Rinka and Kitamura, Makito and Nakamura, Tomohiko and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2025},
  month = oct,
}

Constructing an In-the-Wild Spoken Dialogue Dataset Based on YouTube Dialogue Videos

Yuki Sato, Sanae Yamashita, Shinnosuke Takamichi, and Ryuichiro Higashinaka

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Oct 2025

@inproceedings{sato25apsipa_youtube-dialogue,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {Constructing an In-the-Wild Spoken Dialogue Dataset Based on YouTube Dialogue Videos},
  author = {Sato, Yuki and Yamashita, Sanae and Takamichi, Shinnosuke and Higashinaka, Ryuichiro},
  year = {2025},
  month = oct,
}

Active Learning for Text-to-Speech Synthesis with Informative Sample Collection

Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, and Hiroshi Saruwatari

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), Oct 2025

@inproceedings{seki25apsipa_active-tts,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {Active Learning for Text-to-Speech Synthesis with Informative Sample Collection},
  author = {Seki, Kentaro and Takamichi, Shinnosuke and Saeki, Takaaki and Saruwatari, Hiroshi},
  year = {2025},
  month = oct,
}

VitaEval: Open-source Human Evaluation Tool for Video-to-Text and Video-to-Audio Systems

Goran Topić, Graham Neubig, Katsuhito Sudoh, Yuki Saito, Shinnosuke Takamichi, Ryosuke Matsushita, Kota Iura, Hiroya Takamura, and Tatsuya Ishigaki

In International Conference on Natural Language Generation, Mar 2025

@inproceedings{topic25inlg_vitaeval,
  abbr_publisher = {International Conference on Natural Language Generation},
  booktitle = {International Conference on Natural Language Generation},
  title = {VitaEval: Open-source Human Evaluation Tool for Video-to-Text and Video-to-Audio Systems},
  author = {Topić, Goran and Neubig, Graham and Sudoh, Katsuhito and Saito, Yuki and Takamichi, Shinnosuke and Matsushita, Ryosuke and Iura, Kota and Takamura, Hiroya and Ishigaki, Tatsuya},
  year = {2025},
  month = mar
}

Real-Time Drum-to-Vocal Percussion Sound Conversion System

Rinka Nobukawa , Tomohiko Nakamura , Shinnosuke Takamichi , and Hiroshi Saruwatari

In International Society for Music Information Retrieval Late-Breaking/Demo Session , Sep 2025

Bib PDF

@inproceedings{nobukawa25ismir_drum-to-vocal,
  abbr_publisher = {International Society for Music Information Retrieval Late-Breaking/Demo Session},
  booktitle = {International Society for Music Information Retrieval Late-Breaking/Demo Session},
  title = {Real-Time Drum-to-Vocal Percussion Sound Conversion System},
  author = {Nobukawa, Rinka and Nakamura, Tomohiko and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2025},
  month = sep,
}

Analysing the Language of Neural Audio Codecs

Joonyong Park , Shinnosuke Takamichi , David M. Chan , Shunsuke Kando , Yuki Saito , and Hiroshi Saruwatari

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , Dec 2025

Bib PDF

@inproceedings{park25asru_analysis-neural-audio-codec,
  abbr_publisher = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  booktitle = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  title = {Analysing the Language of Neural Audio Codecs},
  author = {Park, Joonyong and Takamichi, Shinnosuke and Chan, David M. and Kando, Shunsuke and Saito, Yuki and Saruwatari, Hiroshi},
  year = {2025},
  month = dec,
}

Learning Marmoset Vocal Patterns with a Masked Autoencoder for Robust Call Segmentation, Classification, and Caller Identification

Bin Wu , Shinnosuke Takamichi , Sakriani Sakti , and Satoshi Nakamura

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , Dec 2025

Bib PDF

@inproceedings{wu25asru_marmoset-masked-autoencoder,
  abbr_publisher = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  booktitle = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  title = {Learning Marmoset Vocal Patterns with a Masked Autoencoder for Robust Call Segmentation, Classification, and Caller Identification},
  author = {Wu, Bin and Takamichi, Shinnosuke and Sakti, Sakriani and Nakamura, Satoshi},
  year = {2025},
  month = dec,
}

Do statistical patterns in neural audio codec tokens by synthesized speech reveal structure beyond speech quality?

Joonyong Park , Shinnosuke Takamichi , David M. Chan , Shunsuke Kando , Yuki Saito , and Hiroshi Saruwatari

In Joint Meeting Acoustical Society of America and Acoustical Society of Japan , Dec 2025

Bib

@inproceedings{park25asj-asa_audio-codec-speech-quality,
  abbr_publisher = {Joint Meeting Acoustical Society of America and Acoustical Society of Japan},
  booktitle = {Joint Meeting Acoustical Society of America and Acoustical Society of Japan},
  title = {Do statistical patterns in neural audio codec tokens by synthesized speech reveal structure beyond speech quality?},
  author = {Park, Joonyong and Takamichi, Shinnosuke and Chan, David M. and Kando, Shunsuke and Saito, Yuki and Saruwatari, Hiroshi},
  year = {2025},
  month = dec
}

Analysis of a Dataset for Evaluating Semantic Relevance Between Text and Audio

Yusuke Kanamori , Yuki Okamoto , Shinnosuke Takamichi , Yuki Saito , and Hiroshi Saruwatari

In Joint Meeting Acoustical Society of America and Acoustical Society of Japan , Dec 2025

Bib

@inproceedings{kanamori25asj-asa_semantic-relevance-dataset-analysis,
  abbr_publisher = {Joint Meeting Acoustical Society of America and Acoustical Society of Japan},
  booktitle = {Joint Meeting Acoustical Society of America and Acoustical Society of Japan},
  title = {Analysis of a Dataset for Evaluating Semantic Relevance Between Text and Audio},
  author = {Kanamori, Yusuke and Okamoto, Yuki and Takamichi, Shinnosuke and Saito, Yuki and Saruwatari, Hiroshi},
  year = {2025},
  month = dec
}

Developing learners’ communicative competence in English as an international language using non-native English variations with AI speech synthesis technology

Tomoko Kasamaki , Shinnosuke Takamichi , Yuta Matsunaga , Shigeo Fujiwara , Yusuke Yoshikawa , Kikuko Yui , and Takatoyo Umemoto

In EUROCALL , Aug 2025

Bib

@inproceedings{kasamaki25euro_non-native-variations-AI,
  abbr_publisher = {EUROCALL},
  booktitle = {EUROCALL},
  title = {Developing learners’ communicative competence in English as an international language using non-native English variations with AI speech synthesis technology},
  author = {Kasamaki, Tomoko and Takamichi, Shinnosuke and Matsunaga, Yuta and Fujiwara, Shigeo and Yoshikawa, Yusuke and Yui, Kikuko and Umemoto, Takatoyo},
  year = {2025},
  month = aug
}

一般的・魅力的な男性の声質を用いた Voice Ownership Illusion による顕在的・潜在的自尊心の変化の検証

國見友亮 , 木村健太 , 鳴海拓志 , 高道慎之介 , and 持丸正明

In 日本バーチャルリアリティ学会 , Sep 2025

Bib

@inproceedings{kunimi25vrsj_voice-ownership-illusion,
  abbr_publisher = {日本バーチャルリアリティ学会},
  booktitle = {日本バーチャルリアリティ学会},
  title = {一般的・魅力的な男性の声質を用いた Voice Ownership Illusion による顕在的・潜在的自尊心の変化の検証},
  author = {友亮, 國見 and 健太, 木村 and 拓志, 鳴海 and 慎之介, 高道 and 正明, 持丸},
  year = {2025},
  month = sep
}

音楽基盤モデルは音高情報を螺旋構造に埋め込むか？

八木颯斗 , and 高道慎之介

In 情報処理学会音楽情報科学研究会 , Aug 2025

Bib PDF Slides

@inproceedings{yagi25mus_music-helicality,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {音楽基盤モデルは音高情報を螺旋構造に埋め込むか？},
  author = {颯斗, 八木 and 慎之介, 高道},
  year = {2025},
  month = aug,
}

記述言語学に基づく方言音声合成の評価枠組みの構築と宮古語池間西原方言への適用検討

佐藤なな子 , 高道慎之介 , 中川奈津子 , 宮川創 , and 坂井美日

In 言語処理学会若手支援事業 , Sep 2025

Bib Slides

@inproceedings{sato25yans_ikehara_eval,
  abbr_publisher = {言語処理学会 若手支援事業},
  booktitle = {言語処理学会 若手支援事業},
  title = {記述言語学に基づく方言音声合成の評価枠組みの構築と宮古語池間西原方言への適用検討},
  author = {なな子, 佐藤 and 慎之介, 高道 and 奈津子, 中川 and 創, 宮川 and 美日, 坂井},
  year = {2025},
  month = sep,
}

Common Crawl を用いた大規模音声音響データセットの構築

淺井航平 , 杉浦一瑳 , 中田亘 , 栗田修平 , 高道慎之介 , 小川哲司 , and 東中竜一郎

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{asai25asja_commoncrawl-dataset,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {Common Crawl を用いた大規模音声音響データセットの構築},
  author = {航平, 淺井 and 一瑳, 杉浦 and 亘, 中田 and 修平, 栗田 and 慎之介, 高道 and 哲司, 小川 and 竜一郎, 東中},
  year = {2025},
  month = sep,
}

パラ言語・非言語情報の記述文をクエリとした目的音声抽出

関健太郎 , 伊藤信貴 , 山内一輝 , 岡本悠希 , 山岡洸瑛 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{seki25asja_paralinguistic-query,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {パラ言語・非言語情報の記述文をクエリとした目的音声抽出},
  author = {健太郎, 関 and 信貴, 伊藤 and 一輝, 山内 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
  month = sep,
}

空間情報を伴う音響言語モデルの検討

関健太郎 , 岡本悠希 , 山岡洸瑛 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{seki25asja_acoustic-llm,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {空間情報を伴う音響言語モデルの検討},
  author = {健太郎, 関 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
  month = sep,
}

なりすまし音声検出に対する録音再生攻撃の収録条件に関する影響分析

堀江涼花 , 高道慎之介 , and 塩田さやか

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{horie25asja_replay-attack,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {なりすまし音声検出に対する録音再生攻撃の収録条件に関する影響分析},
  author = {涼花, 堀江 and 慎之介, 高道 and さやか, 塩田},
  year = {2025},
  month = sep,
}

なりすまし音声検出に対する話者適応を用いた音声合成攻撃

古林嵯羽仁 , 高道慎之介 , and 塩田さやか

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{furubayashi25asja_tts-attack,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {なりすまし音声検出に対する話者適応を用いた音声合成攻撃},
  author = {嵯羽仁, 古林 and 慎之介, 高道 and さやか, 塩田},
  year = {2025},
  month = sep,
}

大規模音声自己教師あり学習モデルにもとづく音声好感度予測

須田仁志 , 高道慎之介 , and 深山覚

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{suda25asja_likeability-prediction,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {大規模音声自己教師あり学習モデルにもとづく音声好感度予測},
  author = {仁志, 須田 and 慎之介, 高道 and 覚, 深山},
  year = {2025},
  month = sep,
}

Audio Captioning モデルの発達的カリキュラム学習

稲垣賢斗 , and 高道慎之介

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF Slides

@inproceedings{inagaki25asja_audio-captioning,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {Audio Captioning モデルの発達的カリキュラム学習},
  author = {賢斗, 稲垣 and 慎之介, 高道},
  year = {2025},
  month = sep,
}

どのような音声離散表現が音声の再合成と継続に適するか？

神藤駿介 , 高道慎之介 , and 宮尾祐介

In 日本音響学会秋季研究発表会 , Sep 2025

Bib PDF

@inproceedings{kando25asja_speech-repr,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {どのような音声離散表現が音声の再合成と継続に適するか？},
  author = {駿介, 神藤 and 慎之介, 高道 and 祐介, 宮尾},
  year = {2025},
  month = sep,
}

漫画画像理解性能が漫画音声合成の品質に与える影響の調査

越野颯太 , 上治正太郎 , 高道慎之介 , and 中村友彦

In 電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会 , Jul 2025

Bib PDF Slides

@inproceedings{koshino25comic_maga2voice-eval,
  abbr_publisher = {電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会},
  booktitle = {電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会},
  title = {漫画画像理解性能が漫画音声合成の品質に与える影響の調査},
  author = {颯太, 越野 and 正太郎, 上治 and 慎之介, 高道 and 友彦, 中村},
  year = {2025},
  month = jul,
}

Measuring Time Delay Tolerance in Third-Person Live Commentary for Super Smash Bros. Ultimate

Ryosuke Matsushita , Ryosuke Sakai , Koki Fukuda , Shinnosuke Takamichi , Kota Iura , Yuki Saito , Graham Neubig , Katsuhito Sudoh , Hiroya Takamura , and Tatsuya Ishigaki

In IEEE Conference on Games , Aug 2025

Bib PDF Slides

@inproceedings{matsushita25cog_time-delay-tolerance,
  abbr_publisher = {IEEE Conference on Games},
  booktitle = {IEEE Conference on Games},
  title = {Measuring Time Delay Tolerance in Third-Person Live Commentary for Super Smash Bros. Ultimate},
  author = {Matsushita, Ryosuke and Sakai, Ryosuke and Fukuda, Koki and Takamichi, Shinnosuke and Iura, Kota and Saito, Yuki and Neubig, Graham and Sudoh, Katsuhito and Takamura, Hiroya and Ishigaki, Tatsuya},
  month = aug,
  year = {2025},
}

Language-queried target speech extraction using para-linguistic and non-linguistic prompts

関健太郎 , 伊藤信貴 , 山内一輝 , 岡本悠希 , 山岡洸瑛 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

Acoustical Science and Technology, Jun 2025

Bib PDF

@article{seki25ast_language-queried,
  abbr_publisher = {Acoustical Science and Technology},
  title = {Language-queried target speech extraction using para-linguistic and non-linguistic prompts},
  author = {健太郎, 関 and 信貴, 伊藤 and 一輝, 山内 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
  month = jun,
  journal = {Acoustical Science and Technology},
}

Corpus

MangaVox：ボイスコミックの計算機理解に向けたマルチモーダル演技音声データセット

高道慎之介 , 中村友彦 , 須田仁志 , 深山覚 , and 緒方淳

In 電子情報通信学会パターン認識・メディア理解研究専門委員会 , Jul 2025

Bib PDF

@inproceedings{takamichi25miru_manga-vox,
  abbr_publisher = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  booktitle = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  title = {MangaVox：ボイスコミックの計算機理解に向けたマルチモーダル演技音声データセット},
  author = {慎之介, 高道 and 友彦, 中村 and 仁志, 須田 and 覚, 深山 and 淳, 緒方},
  year = {2025},
  month = jul
}

Sign-to-Speech Prosody Transfer

Toranosuke Manabe , Yuto Shibata , Shinnosuke Takamichi , and Yoshimitsu Aoki

In 電子情報通信学会パターン認識・メディア理解研究専門委員会 , Jul 2025

Bib

@inproceedings{manabe25miru_sign-to-speech,
  abbr_publisher = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  booktitle = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  title = {Sign-to-Speech Prosody Transfer},
  author = {Manabe, Toranosuke and Shibata, Yuto and Takamichi, Shinnosuke and Aoki, Yoshimitsu},
  year = {2025},
  month = jul
}

Exploring the Effect of Segmentation and Vocabulary Size on Speech Tokenization for Speech Language Models

Shunsuke Kando , Yusuke Miyao , and Shinnosuke Takamichi

In Proceedings of Interspeech , Aug 2025

Bib PDF

@inproceedings{kando25interspeech_speech-tokenization,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Exploring the Effect of Segmentation and Vocabulary Size on Speech Tokenization for Speech Language Models},
  author = {Kando, Shunsuke and Miyao, Yusuke and Takamichi, Shinnosuke},
  year = {2025},
  month = aug,
}

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Hitoshi Suda , Shinnosuke Takamichi , and Satoru Fukayama

In Proceedings of Interspeech , Aug 2025

Bib PDF

@inproceedings{suda25interspeech_likability-control,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora},
  author = {Suda, Hitoshi and Takamichi, Shinnosuke and Fukayama, Satoru},
  year = {2025},
  month = aug,
}

Japanese speaker verification and spoofing attacks recorded in-the-wild dataset

Sayaka Shiota , Suzuka Horie , Kouta Kanno , and Shinnosuke Takamichi

In Proceedings of Interspeech , Aug 2025

Bib PDF

@inproceedings{shiota25interspeech_japanese-spoofing,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Japanese speaker verification and spoofing attacks recorded in-the-wild dataset},
  author = {Shiota, Sayaka and Horie, Suzuka and Kanno, Kouta and Takamichi, Shinnosuke},
  year = {2025},
  month = aug,
}

RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio

Yusuke Kanamori , Yuki Okamoto , Taisei Takano , Shinnosuke Takamichi , Yuki Saito , and Hiroshi Saruwatari

In Proceedings of Interspeech , Aug 2025

Bib PDF

@inproceedings{kanamori25interspeech_relate,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {RELATE: Subjective evaluation dataset for automatic evaluation of relevance between text and audio},
  author = {Kanamori, Yusuke and Okamoto, Yuki and Takano, Taisei and Takamichi, Shinnosuke and Saito, Yuki and Saruwatari, Hiroshi},
  year = {2025},
  month = aug,
}

The text-to-speech in the wild (TITW) dataset

Jee-weon Jung , Wangyou Zhang , Soumi Maiti , Yihan Wu , Xin Wang , Ji-Hoon Kim , Yuta Matsunaga , Seyun Um , Jinchuan Tian , Hye-jin Shim , and 4 more authors

In Proceedings of Interspeech , Aug 2025

Bib PDF

@inproceedings{jung25interspeech_titw,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {The text-to-speech in the wild (TITW) dataset},
  author = {Jung, Jee-weon and Zhang, Wangyou and Maiti, Soumi and Wu, Yihan and Wang, Xin and Kim, Ji-Hoon and Matsunaga, Yuta and Um, Seyun and Tian, Jinchuan and Shim, Hye-jin and Evans, Nicholas and Chung, Joon Son and Takamichi, Shinnosuke and Watanabe, Shinji},
  year = {2025},
  month = aug,
}

環境音と説明文の意味的関連性の自動評価に向けたデータセット構築と基本性能評価

岡本悠希 , 金森勇介 , 高野大成 , 高道慎之介 , 齋藤佑樹 , 永瀬亮太郎 , and 猿渡洋

In 電子情報通信学会応用音響研究会 , May 2025

Bib PDF

@inproceedings{okamoto25ea_soundtext-dataset,
  abbr_publisher = {電子情報通信学会 応用音響研究会},
  booktitle = {電子情報通信学会 応用音響研究会},
  title = {環境音と説明文の意味的関連性の自動評価に向けたデータセット構築と基本性能評価},
  author = {悠希, 岡本 and 勇介, 金森 and 大成, 高野 and 慎之介, 高道 and 佑樹, 齋藤 and 亮太郎, 永瀬 and 洋, 猿渡},
  year = {2025},
  month = may,
}

音声トークンの言語に関する分析

朴浚鎔 , 高道慎之介 , David M. Chan , 神藤駿介 , 齋藤佑樹 , and 猿渡洋

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib PDF

@inproceedings{park25speasip_language-token,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {音声トークンの言語に関する分析},
  author = {浚鎔, 朴 and 慎之介, 高道 and Chan, David M. and 駿介, 神藤 and 佑樹, 齋藤 and 洋, 猿渡},
  year = {2025},
  month = jun
}

Spoken Language Technologies for Historical Speech and Audio

高道慎之介

国立民族学博物館, Apr 2025

(Invited talk / 招待講演)

Bib PDF

@article{takamichi25minpaku_invited-talk,
  abbr_publisher = {音声復元},
  title = {Spoken Language Technologies for Historical Speech and Audio},
  author = {慎之介, 高道},
  year = {2025},
  month = apr,
  journal = {国立民族学博物館},
  note = {(Invited talk / 招待講演)},
}

環境音埋め込みベクトル系列の類似度に基づく環境音生成の自動評価

岸秀 , 阪井瞭介 , 高道慎之介 , 金森勇介 , and 岡本悠希

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib PDF Slides

@inproceedings{kishi25muslp_envsound-generation-eval,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {環境音埋め込みベクトル系列の類似度に基づく環境音生成の自動評価},
  author = {秀, 岸 and 瞭介, 阪井 and 慎之介, 高道 and 勇介, 金森 and 悠希, 岡本},
  year = {2025},
  month = jun,
}

Corpus

音環境に適応する音声合成能力を搭載した音声対話システムの構築と実証実験に基づく検討

武伯寒 , 高道慎之介 , 関健太郎 , and 猿渡洋

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib PDF

@inproceedings{take25speasip_egotts-dialogue,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {音環境に適応する音声合成能力を搭載した音声対話システムの構築と実証実験に基づく検討},
  author = {伯寒, 武 and 慎之介, 高道 and 健太郎, 関 and 洋, 猿渡},
  year = {2025},
}

三人称ゲーム実況音声に対する時間遅延許容量の測定

松下嶺佑 , 阪井瞭介 , 福田航希 , 高道慎之介 , 井浦昂太 , 齋藤佑樹 , ニュービッググラム , 須藤克仁 , 高村大也 , and 石垣達也

In 電子情報通信学会音声研究会 , Jun 2025

Bib PDF Slides

@inproceedings{matsushita25speasip_delay-tolerance,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {三人称ゲーム実況音声に対する時間遅延許容量の測定},
  author = {嶺佑, 松下 and 瞭介, 阪井 and 航希, 福田 and 慎之介, 高道 and 昂太, 井浦 and 佑樹, 齋藤 and グラム, ニュービッグ and 克仁, 須藤 and 大也, 高村 and 達也, 石垣},
  year = {2025},
}

好感度自動推定モデルを利用した任意話者音声の好感度を制御可能な声質変換

須田仁志 , and 高道慎之介

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib PDF

@inproceedings{suda25speasip_voice-likeability,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {好感度自動推定モデルを利用した任意話者音声の好感度を制御可能な声質変換},
  author = {仁志, 須田 and 慎之介, 高道},
  year = {2025},
}

Text-to-audioにおける入出力関連性の自動評価に向けた主観評価データセット構築

金森勇介 , 岡本悠希 , 高野大成 , 高道慎之介 , 齋藤佑樹 , and 猿渡洋

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib

@inproceedings{kanamori25speasip_text-to-audio,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {Text-to-audioにおける入出力関連性の自動評価に向けた主観評価データセット構築},
  author = {勇介, 金森 and 悠希, 岡本 and 大成, 高野 and 慎之介, 高道 and 佑樹, 齋藤 and 洋, 猿渡},
  year = {2025}
}

盛り上がり制御可能な対戦ゲーム実況解説音声合成モデルの検討

井浦昂太 , 齋藤佑樹 , 高道慎之介 , ニュービッググラム , 須藤克仁 , 猿渡洋 , 高村大也 , and 石垣達也

In 情報処理学会音声言語処理研究会 , Jun 2025

Bib PDF

@inproceedings{iura25speasip_game-commentary,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {盛り上がり制御可能な対戦ゲーム実況解説音声合成モデルの検討},
  author = {昂太, 井浦 and 佑樹, 齋藤 and 慎之介, 高道 and グラム, ニュービッグ and 克仁, 須藤 and 洋, 猿渡 and 大也, 高村 and 達也, 石垣},
  year = {2025},
}

YouTube上の対話動画に基づく音声対話データセットの構築

佐藤友紀 , 高道慎之介 , and 東中竜一郎

In 日本音響学会春季研究発表会 , Jun 2025

Bib PDF

@inproceedings{sato25asjs_youtube-dialogue,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {YouTube上の対話動画に基づく音声対話データセットの構築},
  author = {友紀, 佐藤 and 慎之介, 高道 and 竜一郎, 東中},
  year = {2025},
}

データ単位前処理自動選択による音声合成コーパスのデータクレンジング

関健太郎 , 高道慎之介 , 佐伯高明 , and 猿渡洋

In 日本音響学会春季研究発表会 , Jun 2025

Bib PDF

@inproceedings{seki25asjs_tts-ops,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {データ単位前処理自動選択による音声合成コーパスのデータクレンジング},
  author = {健太郎, 関 and 慎之介, 高道 and 高明, 佐伯 and 洋, 猿渡},
  year = {2025},
}

音声・音響・音楽を扱うオープン基盤モデルの構築に向けたデータセット策定

高道慎之介 , 和田仰 , 小川諒 , 山岡洸瑛 , 中田亘 , 淺井航平 , 関健太郎 , 岡本悠希 , 齋藤佑樹 , 小川哲司 , and 3 more authors

In 言語処理学会全国大会 , Jun 2025

Bib PDF

@inproceedings{takamichi25nlp_foundation,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {音声・音響・音楽を扱うオープン基盤モデルの構築に向けたデータセット策定},
  author = {慎之介, 高道 and 仰, 和田 and 諒, 小川 and 洸瑛, 山岡 and 亘, 中田 and 航平, 淺井 and 健太郎, 関 and 悠希, 岡本 and 佑樹, 齋藤 and 哲司, 小川 and 洋, 猿渡 and 友彦, 中村 and 覚, 深山},
  year = {2025},
}

音声トークナイズが音声言語モデルの性能に与える影響の調査

神藤駿介 , 宮尾祐介 , and 高道慎之介

In 言語処理学会全国大会 , Jun 2025

Bib PDF

@inproceedings{kando25nlp_speech-tokenization,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {音声トークナイズが音声言語モデルの性能に与える影響の調査},
  author = {駿介, 神藤 and 祐介, 宮尾 and 慎之介, 高道},
  year = {2025},
}

Open-source Human Evaluation Framework for Video-to-Text and Video-to-Audio Systems

Goran Topić , Graham Neubig , Katsuhito Sudoh , Yuki Saito , Shinnosuke Takamichi , Ryosuke Matsushita , Kota Iura , Hiroya Takamura , and Tatsuya Ishigaki

In 言語処理学会全国大会 , Jun 2025

Bib PDF

@inproceedings{topic25nlp_video-evaluation,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {Open-source Human Evaluation Framework for Video-to-Text and Video-to-Audio Systems},
  author = {Topić, Goran and Neubig, Graham and Sudoh, Katsuhito and Saito, Yuki and Takamichi, Shinnosuke and Matsushita, Ryosuke and Iura, Kota and Takamura, Hiroya and Ishigaki, Tatsuya},
  year = {2025},
}

変分オートエンコーダによるドラムからボーカルパーカッションへの楽器音変換と評価

信川凜佳 , 北村優輝士 , 中村友彦 , 高道慎之介 , and 猿渡洋

In 情報処理学会音楽情報科学研究会 , Jun 2025

Bib PDF

@inproceedings{nobukawa25mus_drum-to-vocal,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {変分オートエンコーダによるドラムからボーカルパーカッションへの楽器音変換と評価},
  author = {凜佳, 信川 and 優輝士, 北村 and 友彦, 中村 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
}

知覚感情の不整合：人間が作曲した音楽，音楽を記述したテキスト，テキスト楽音合成による音楽の比較

阪井瞭介 , 福田航希 , 松下嶺佑 , 高道慎之介 , and 植村あい子

In 情報処理学会音楽情報科学研究会 , Jun 2025

Bib PDF Slides

@inproceedings{sakai25mus_perceptual-inconsistency,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {知覚感情の不整合：人間が作曲した音楽，音楽を記述したテキスト，テキスト楽音合成による音楽の比較},
  author = {瞭介, 阪井 and 航希, 福田 and 嶺佑, 松下 and 慎之介, 高道 and あい子, 植村},
  year = {2025},
}

ELEVATE：学習者自身の自己聴取音声で聴く講義システム

福田航希 , 阪井瞭介 , 松下嶺佑 , 國見友亮 , and 高道慎之介

In 情報処理学会インタラクション , Jun 2025

Bib PDF Slides

@inproceedings{fukuda25interaction_elevate,
  abbr_publisher = {情報処理学会 インタラクション},
  booktitle = {情報処理学会 インタラクション},
  title = {ELEVATE：学習者自身の自己聴取音声で聴く講義システム},
  author = {航希, 福田 and 瞭介, 阪井 and 嶺佑, 松下 and 友亮, 國見 and 慎之介, 高道},
  year = {2025},
}

なりすまし音声検出システムに対する音声合成攻撃手法の検討

古林嵯羽仁 , 高道慎之介 , and 塩田さやか

In 電子情報通信学会総合大会 , Jun 2025

Bib

@inproceedings{furubayashi25impersonation_speech_detection,
  abbr_publisher = {電子情報通信学会 総合大会},
  booktitle = {電子情報通信学会 総合大会},
  title = {なりすまし音声検出システムに対する音声合成攻撃手法の検討},
  author = {嵯羽仁, 古林 and 慎之介, 高道 and さやか, 塩田},
  year = {2025}
}

SpoofCeleb: Speech Deepfake Detection and SASV In The Wild

Jee-weon Jung , Yihan Wu , Xin Wang , Ji-Hoon Kim , Soumi Maiti , Yuta Matsunaga , Hye-jin Shim , Jinchuan Tian , Nicholas Evans , Joon Son Chung , and 4 more authors

IEEE Open Journal of Signal Processing, Jun 2025

Bib Website

@article{jung25ojsp_spoofceleb,
  abbr_publisher = {IEEE Open Journal of Signal Processing},
  journal = {IEEE Open Journal of Signal Processing},
  title = {SpoofCeleb: Speech Deepfake Detection and SASV In The Wild},
  author = {Jung, Jee-weon and Wu, Yihan and Wang, Xin and Kim, Ji-Hoon and Maiti, Soumi and Matsunaga, Yuta and Shim, Hye-jin and Tian, Jinchuan and Evans, Nicholas and Chung, Joon Son and Zhang, Wangyou and Um, Seyun and Takamichi, Shinnosuke and Watanabe, Shinji},
  year = {2025}
}

2024

Emerging Topics for Speech Synthesis: Versatility and Efficiency (Tutorial)

Yuki Saito , Shinnosuke Takamichi , and Wataru Nakata

In APSIPA , Jun 2024

Bib Slides

@inproceedings{saito24apsipa_tutorial,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {APSIPA},
  title = {Emerging Topics for Speech Synthesis: Versatility and Efficiency (Tutorial)},
  author = {Saito, Yuki and Takamichi, Shinnosuke and Nakata, Wataru},
  year = {2024}
}

Character-Voice Embodiment Impacts on the Cognitive Task Performance with the Voice Ownership Illusion

Yusuke Kunimi , Kenta Kimura , Keigo Matsumoto , Shinnosuke Takamichi , Takuji Narumi , and Masaaki Mochimaru

In Proceedings of International Conference on Artificial Reality and Telexistence & the Eurographics Symposium on Virtual Environments (ICAT-EGVE) , Jun 2024

Bib PDF

@inproceedings{kunimi24icategve_voice-embodiment,
  abbr_publisher = {Proceedings of International Conference on Artificial Reality and Telexistence \& the Eurographics Symposium on Virtual Environments (ICAT-EGVE)},
  booktitle = {Proceedings of International Conference on Artificial Reality and Telexistence \& the Eurographics Symposium on Virtual Environments (ICAT-EGVE)},
  title = {Character-Voice Embodiment Impacts on the Cognitive Task Performance with the Voice Ownership Illusion},
  author = {Kunimi, Yusuke and Kimura, Kenta and Matsumoto, Keigo and Takamichi, Shinnosuke and Narumi, Takuji and Mochimaru, Masaaki},
  year = {2024}
}

A Transformer Model for Segmentation, Classification, and Caller Identification of Marmoset Vocalization

Bin Wu , Shinnosuke Takamichi , Sakriani Sakti , and Satoshi Nakamura

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) , Jun 2024

arXiv Bib

@inproceedings{wu24slt_marmoset-transformer,
  abbr_publisher = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  booktitle = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  title = {A Transformer Model for Segmentation, Classification, and Caller Identification of Marmoset Vocalization},
  author = {Wu, Bin and Takamichi, Shinnosuke and Sakti, Sakriani and Nakamura, Satoshi},
  year = {2024}
}

Real-Time Noise Estimation for Lombard-Effect Speech Synthesis in Human–Avatar Dialogue Systems

Yuto Ishikawa , Osamu Take , Tomohiko Nakamura , Norihiro Takamune , Yuki Saito , Shinnosuke Takamichi , and Hiroshi Saruwatari

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) , Jun 2024

Bib PDF

@inproceedings{ishikawa24apsipa_lombard,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {Real-Time Noise Estimation for Lombard-Effect Speech Synthesis in Human–Avatar Dialogue Systems},
  author = {Ishikawa, Yuto and Take, Osamu and Nakamura, Tomohiko and Takamune, Norihiro and Saito, Yuki and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2024}
}

NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec

Wataru Nakata , Takaaki Saeki , Yuki Saito , Shinnosuke Takamichi , and Hiroshi Saruwatari

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) , Jun 2024

Bib PDF

@inproceedings{wataru24apsipa_necobert,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec},
  author = {Nakata, Wataru and Saeki, Takaaki and Saito, Yuki and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2024}
}

DNN-based ensemble singing voice synthesis with interactions between singers

Hiroaki Hyodo , Shinnosuke Takamichi , Tomohiko Nakamura , Junya Koguchi , and Hiroshi Saruwatari

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) , Jun 2024

arXiv Bib

@inproceedings{hyodo24slt_chorus,
  abbr_publisher = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  booktitle = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  title = {DNN-based ensemble singing voice synthesis with interactions between singers},
  author = {Hyodo, Hiroaki and Takamichi, Shinnosuke and Nakamura, Tomohiko and Koguchi, Junya and Saruwatari, Hiroshi},
  year = {2024}
}

基盤モデル時代に言語で音声を処理したい

高道慎之介

In 情報処理学会自然言語処理研究会 , Jun 2024

(Invited talk / 招待講演)

Bib PDF

@inproceedings{takamichi24nl_foundation-model,
  abbr_publisher = {情報処理学会 自然言語処理研究会},
  booktitle = {情報処理学会 自然言語処理研究会},
  title = {基盤モデル時代に言語で音声を処理したい},
  author = {慎之介, 高道},
  year = {2024},
  note = {(Invited talk / 招待講演)}
}

発話内容書き起こしを越えて音声と言語を結びつけたい

高道慎之介

In 言語処理学会若手支援事業 , Jun 2024

(Invited talk / 招待講演)

Bib PDF

@inproceedings{takamichi24yans_speech-language,
  abbr_publisher = {言語処理学会 若手支援事業},
  booktitle = {言語処理学会 若手支援事業},
  title = {発話内容書き起こしを越えて音声と言語を結びつけたい},
  author = {慎之介, 高道},
  year = {2024},
  note = {(Invited talk / 招待講演)}
}

J-CHAT: 音声言語モデルのための大規模日本語対話音声コーパス

中田亘 , 関健太郎 , 谷中瞳 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Jun 2024

Bib PDF

@inproceedings{nakata24asja_j-chat,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {J-CHAT: 音声言語モデルのための大規模日本語対話音声コーパス},
  author = {亘, 中田 and 健太郎, 関 and 瞳, 谷中 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2024}
}

二重唱の歌い出しタイミングに対する同時性知覚の刺激閾調査

兵藤弘明 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Jun 2024

Bib PDF

@inproceedings{hyodo24asja_duet-timing,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {二重唱の歌い出しタイミングに対する同時性知覚の刺激閾調査},
  author = {弘明, 兵藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2024}
}

人間とアバターとの対話システムにおける拡散性雑音下リアルタイム推定雑音を用いたLombard効果模擬音声合成のための検討

石川悠人 , 武伯寒 , 中村友彦 , 高宗典玄 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Jun 2024

Bib PDF

@inproceedings{ishikawa24asja_lombard,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {人間とアバターとの対話システムにおける拡散性雑音下リアルタイム推定雑音を用いたLombard効果模擬音声合成のための検討},
  author = {悠人, 石川 and 伯寒, 武 and 友彦, 中村 and 典玄, 高宗 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2024}
}

Corpus

最先端の予測性能を持つ合成音声品質の自動評価システム UTMOS について

佐伯高明 , and 高道慎之介

日本音響学会誌, Jun 2024

(Invited article / 招待記事)

Bib PDF

@article{saeki24asj-kaisetsu_utmos,
  title = {最先端の予測性能を持つ合成音声品質の自動評価システム UTMOS について},
  author = {高明, 佐伯 and 慎之介, 高道},
  year = {2024},
  journal = {日本音響学会誌},
  note = {(Invited article / 招待記事)},
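  memo = {This work was supported by KAKENHI grants 21H04900, 22H03639, 23H03418, and 23K18474, the JST FOREST Program JP23KJ0828, and Moonshot JPMJPS2011. Kentaro Seki of the Graduate School of the University of Tokyo provided advice during the writing of this article.}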
}

Speech analysis

Who Finds This Voice Attractive? A Large-Scale Experiment Using In-the-Wild Data

Hitoshi Suda , Aya Watanabe , and Shinnosuke Takamichi