人間に准ずる計算機

計算機が人間に准ずるための音声認識合成技術

Title / タイトル

計算機が人間に准ずるための音声認識合成技術 (2023-2030, 創発的研究支援事業代表)

Projects / プロジェクト

This research aims to develop speech recognition and synthesis technologies that allow a computer to be implemented as a human-like agent. Realizing such an agent requires speech synthesis that lets computers talk, cry, and laugh just as humans do and, conversely, speech recognition that recognizes these human vocal expressions. To this end, we research and develop speech design, machine learning methods, and a shared foundational database.

本研究は、人間に准ずる存在として計算機を実装するための音声認識合成技術です。そのような存在の実現には、人間と同じように計算機が喋り泣き笑う音声合成技術と、逆に人間のそれらを認識する音声認識技術が必要です。本研究では、そのための音声デザイン、機械学習、共通基盤データベースについて研究開発します。

Member / メンバ

Shinnosuke Takamichi / 高道慎之介（慶應義塾大学，代表）

Acknowledgement / 謝辞

JST FOREST JPMJFR226V (English)
xxx (日本語)

Website / ウェブサイト

https://www.jst.go.jp/souhatsu/research/panel_yagi.html

Reference / 発表文献

(Hyodo et al., 2024)
(佐伯 & 高道, 2024)
(Suda et al., 2024)
(Kando et al., 2024)
(Saeki et al., 2024)
(佐伯 et al., 2024)
(武 et al., 2024)
(松永 et al., 2024)
(兵藤 et al., 2024)
(渡邊 et al., 2024)
(Watanabe et al., 2023)
(辛 et al., 2024)
(Xin et al., 2023)
(Ueda et al., 2023)
(渡邊 et al., 2023)
(前田 et al., 2023)
(武 et al., 2024)
(Take et al., 2024)
(Kunimi et al., 2024)
(Hyodo et al., 2024)
(兵藤 et al., 2024)
(福田 et al., 2025)
(阪井 et al., 2025)
(信川 et al., 2025)
(松下 et al., 2025)
(Suda et al., 2025)
(須田 & 高道, 2025)
(関 et al., 2026)
(信川 et al., 2025)
(稲垣 & 高道, 2025)
(Nobukawa et al., 2025)
(松下 et al., 2025)
(越野 et al., 2025)
(関 et al., 2025)
(高道 et al., 2025)
(Nobukawa et al., 2025)
(淺井 et al., 2025)
(岸 et al., 2026)
(松下 et al., 2026)
(小野 et al., 2026)
(高道 et al., 2026)
(上治 et al., 2026)
(稲垣 et al., 2026)

References

2026

Spatial Audio Captioning: 複数音源状況下における空間情報を伴う説明文の生成とその評価

関健太郎 , 岡本悠希 , 山岡洸瑛 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会春季研究発表会 , Mar 2026

@inproceedings{seki26asjs_spatial-captioning,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {Spatial Audio Captioning: 複数音源状況下における空間情報を伴う説明文の生成とその評価},
  author = {健太郎, 関 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2026},
  month = mar,
}

既存データセットとの意図しない重複を避ける環境音評価データセットの半自動構築法

岸秀 , 高道慎之介 , 滝沢力 , 金森勇介 , 砺波紀之 , 永瀬亮太郎 , 井本桂右 , and 岡本悠希

In 電子情報通信学会応用音響研究会 , Mar 2026

@inproceedings{kishi26speasip_environmental-sound-dataset,
  abbr_publisher = {電子情報通信学会 応用音響研究会},
  booktitle = {電子情報通信学会 応用音響研究会},
  title = {既存データセットとの意図しない重複を避ける環境音評価データセットの半自動構築法},
  author = {秀, 岸 and 慎之介, 高道 and 力, 滝沢 and 勇介, 金森 and 紀之, 砺波 and 亮太郎, 永瀬 and 桂右, 井本 and 悠希, 岡本},
  year = {2026},
  month = mar,
}

多ジャンルのスポーツ音声実況における音声特徴量の時間的構造の調査

松下嶺佑 , 高道慎之介 , 齋藤佑樹 , ニュービッググラム , 須藤克仁 , 高村大也 , and 石垣達也

In 情報処理学会音声言語処理研究会 , Mar 2026

@inproceedings{matsushita26speasip_sports-commentary-structure,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {多ジャンルのスポーツ音声実況における音声特徴量の時間的構造の調査},
  author = {嶺佑, 松下 and 慎之介, 高道 and 佑樹, 齋藤 and グラム, ニュービッグ and 克仁, 須藤 and 大也, 高村 and 達也, 石垣},
  year = {2026},
  month = mar,
}

声道パラメータ表現および強化学習を利用したText-to-Action-to-Speech

小野晶子 , 加藤徳啓 , and 高道慎之介

In 電子情報通信学会音声研究会 , Mar 2026

@inproceedings{ono26speasip_text-to-action-to-speech,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {声道パラメータ表現および強化学習を利用したText-to-Action-to-Speech},
  author = {晶子, 小野 and 徳啓, 加藤 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

ボイスコミックデータセット MangaVox が拓く音声科学・工学タスク

高道慎之介 , 中村友彦 , 須田仁志 , 深山覚 , and 緒方淳

In 日本音響学会春季研究発表会 , Mar 2026

@inproceedings{takamichi26asjs_mangavox,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {ボイスコミックデータセット MangaVox が拓く音声科学・工学タスク},
  author = {慎之介, 高道 and 友彦, 中村 and 仁志, 須田 and 覚, 深山 and 淳, 緒方},
  year = {2026},
  month = mar,
}

空間音とテキストの対照学習による音源情報と空間情報の分離表現学習

上治正太郎 , 高道慎之介 , and 山岡洸瑛

In 電子情報通信学会応用音響研究会 , Mar 2026

@inproceedings{ueji26speasip_spatial-audio-text-learning,
  abbr_publisher = {電子情報通信学会 応用音響研究会},
  booktitle = {電子情報通信学会 応用音響研究会},
  title = {空間音とテキストの対照学習による音源情報と空間情報の分離表現学習},
  author = {正太郎, 上治 and 慎之介, 高道 and 洸瑛, 山岡},
  year = {2026},
  month = mar,
}

大規模言語モデルの音象徴ベンチマーク

稲垣賢斗 , 神藤駿介 , and 高道慎之介

In 言語処理学会全国大会 , Mar 2026

@inproceedings{inagaki26nlp_sound-symbolism-benchmark,
  abbr_publisher = {言語処理学会 全国大会},
  booktitle = {言語処理学会 全国大会},
  title = {大規模言語モデルの音象徴ベンチマーク},
  author = {賢斗, 稲垣 and 駿介, 神藤 and 慎之介, 高道},
  year = {2026},
  month = mar,
}

2025

ELEVATE：学習者自身の自己聴取音声で聴く講義システム

福田航希 , 阪井瞭介 , 松下嶺佑 , 國見友亮 , and 高道慎之介

In 情報処理学会インタラクション , Mar 2025

@inproceedings{fukuda25interaction_elevate,
  abbr_publisher = {情報処理学会 インタラクション},
  booktitle = {情報処理学会 インタラクション},
  title = {ELEVATE：学習者自身の自己聴取音声で聴く講義システム},
  author = {航希, 福田 and 瞭介, 阪井 and 嶺佑, 松下 and 友亮, 國見 and 慎之介, 高道},
  year = {2025},
}

知覚感情の不整合：人間が作曲した音楽，音楽を記述したテキスト，テキスト楽音合成による音楽の比較

阪井瞭介 , 福田航希 , 松下嶺佑 , 高道慎之介 , and 植村あい子

In 情報処理学会音楽情報科学研究会 , Mar 2025

@inproceedings{sakai25mus_perceptual-inconsistency,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {知覚感情の不整合：人間が作曲した音楽，音楽を記述したテキスト，テキスト楽音合成による音楽の比較},
  author = {瞭介, 阪井 and 航希, 福田 and 嶺佑, 松下 and 慎之介, 高道 and あい子, 植村},
  year = {2025},
}

変分オートエンコーダによるドラムからボーカルパーカッションへの楽器音変換と評価

信川凜佳 , 北村優輝士 , 中村友彦 , 高道慎之介 , and 猿渡洋

In 情報処理学会音楽情報科学研究会 , Mar 2025

@inproceedings{nobukawa25mus_drum-to-vocal,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {変分オートエンコーダによるドラムからボーカルパーカッションへの楽器音変換と評価},
  author = {凜佳, 信川 and 優輝士, 北村 and 友彦, 中村 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
}

三人称ゲーム実況音声に対する時間遅延許容量の測定

松下嶺佑 , 阪井瞭介 , 福田航希 , 高道慎之介 , 井浦昂太 , 齋藤佑樹 , ニュービッググラム , 須藤克仁 , 高村大也 , and 石垣達也

In 電子情報通信学会音声研究会 , Mar 2025

@inproceedings{matsushita25speasip_delay-tolerance,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {三人称ゲーム実況音声に対する時間遅延許容量の測定},
  author = {嶺佑, 松下 and 瞭介, 阪井 and 航希, 福田 and 慎之介, 高道 and 昂太, 井浦 and 佑樹, 齋藤 and グラム, ニュービッグ and 克仁, 須藤 and 大也, 高村 and 達也, 石垣},
  year = {2025},
}

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Hitoshi Suda , Shinnosuke Takamichi , and Satoru Fukayama

In Proceedings of Interspeech , Aug 2025

@inproceedings{suda25interspeech_likability-control,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora},
  author = {Suda, Hitoshi and Takamichi, Shinnosuke and Fukayama, Satoru},
  year = {2025},
  month = aug,
}

好感度自動推定モデルを利用した任意話者音声の好感度を制御可能な声質変換

須田仁志 , and 高道慎之介

In 情報処理学会音声言語処理研究会 , Aug 2025

@inproceedings{suda25speasip_voice-likeability,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {好感度自動推定モデルを利用した任意話者音声の好感度を制御可能な声質変換},
  author = {仁志, 須田 and 慎之介, 高道},
  year = {2025},
}

Audio Captioning モデルの発達的カリキュラム学習

稲垣賢斗 , and 高道慎之介

In 日本音響学会秋季研究発表会 , Sep 2025

@inproceedings{inagaki25asja_audio-captioning,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {Audio Captioning モデルの発達的カリキュラム学習},
  author = {賢斗, 稲垣 and 慎之介, 高道},
  year = {2025},
  month = sep,
}

Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology

Rinka Nobukawa , Makito Kitamura , Tomohiko Nakamura , Shinnosuke Takamichi , and Hiroshi Saruwatari

In Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) , Oct 2025

@inproceedings{nobukawa25apsipa_drum-to-vocalpercussion,
  abbr_publisher = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  booktitle = {Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  title = {Drum-to-Vocal Percussion Sound Conversion and Its Evaluation Methodology},
  author = {Nobukawa, Rinka and Kitamura, Makito and Nakamura, Tomohiko and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2025},
  month = oct,
}

漫画画像理解性能が漫画音声合成の品質に与える影響の調査

越野颯太 , 上治正太郎 , 高道慎之介 , and 中村友彦

In 電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会 , Jul 2025

@inproceedings{koshino25comic_maga2voice-eval,
  abbr_publisher = {電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会},
  booktitle = {電子情報通信学会ヒューマンコミュニケーショングループ・コミック工学研究会},
  title = {漫画画像理解性能が漫画音声合成の品質に与える影響の調査},
  author = {颯太, 越野 and 正太郎, 上治 and 慎之介, 高道 and 友彦, 中村},
  year = {2025},
  month = jul,
}

空間情報を伴う音響言語モデルの検討

関健太郎 , 岡本悠希 , 山岡洸瑛 , 齋藤佑樹 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2025

@inproceedings{seki25asja_acoustic-llm,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {空間情報を伴う音響言語モデルの検討},
  author = {健太郎, 関 and 悠希, 岡本 and 洸瑛, 山岡 and 佑樹, 齋藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2025},
  month = sep,
}

MangaVox：ボイスコミックの計算機理解に向けたマルチモーダル演技音声データセット

高道慎之介 , 中村友彦 , 須田仁志 , 深山覚 , and 緒方淳

In 電子情報通信学会パターン認識・メディア理解研究専門委員会 , Jul 2025

@inproceedings{takamichi25miru_manga-vox,
  abbr_publisher = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  booktitle = {電子情報通信学会パターン認識・メディア理解研究専門委員会},
  title = {MangaVox：ボイスコミックの計算機理解に向けたマルチモーダル演技音声データセット},
  author = {慎之介, 高道 and 友彦, 中村 and 仁志, 須田 and 覚, 深山 and 淳, 緒方},
  year = {2025},
  month = jul
}

Real-Time Drum-to-Vocal Percussion Sound Conversion System

Rinka Nobukawa , Tomohiko Nakamura , Shinnosuke Takamichi , and Hiroshi Saruwatari

In International Society for Music Information Retrieval Late-Breaking/Demo Session , Sep 2025

@inproceedings{nobukawa25ismir_drum-to-vocal,
  abbr_publisher = {International Society for Music Information Retrieval Late-Breaking/Demo Session},
  booktitle = {International Society for Music Information Retrieval Late-Breaking/Demo Session},
  title = {Real-Time Drum-to-Vocal Percussion Sound Conversion System},
  author = {Nobukawa, Rinka and Nakamura, Tomohiko and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2025},
  month = sep,
}

Common Crawl を用いた大規模音声音響データセットの構築

淺井航平 , 杉浦一瑳 , 中田亘 , 栗田修平 , 高道慎之介 , 小川哲司 , and 東中竜一郎

In 日本音響学会秋季研究発表会 , Sep 2025

@inproceedings{asai25asja_commoncrawl-dataset,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {Common Crawl を用いた大規模音声音響データセットの構築},
  author = {航平, 淺井 and 一瑳, 杉浦 and 亘, 中田 and 修平, 栗田 and 慎之介, 高道 and 哲司, 小川 and 竜一郎, 東中},
  year = {2025},
  month = sep,
}

2024

DNN-based ensemble singing voice synthesis with interactions between singers

Hiroaki Hyodo , Shinnosuke Takamichi , Tomohiko Nakamura , Junya Koguchi , and Hiroshi Saruwatari

In Proceedings of IEEE Spoken Language Technology Workshop (SLT) , Sep 2024

@inproceedings{hyodo24slt_chorus,
  abbr_publisher = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  booktitle = {Proceedings of IEEE Spoken Language Technology Workshop (SLT)},
  title = {DNN-based ensemble singing voice synthesis with interactions between singers},
  author = {Hyodo, Hiroaki and Takamichi, Shinnosuke and Nakamura, Tomohiko and Koguchi, Junya and Saruwatari, Hiroshi},
  year = {2024}
}

最先端の予測性能を持つ合成音声品質の自動評価システム UTMOS について

佐伯高明 , and 高道慎之介

日本音響学会誌, Sep 2024

(Invited article / 招待記事)

@article{saeki24asj-kaisetsu_utmos,
  title = {最先端の予測性能を持つ合成音声品質の自動評価システム UTMOS について},
  author = {高明, 佐伯 and 慎之介, 高道},
  year = {2024},
  journal = {日本音響学会誌},
  note = {(Invited article / 招待記事)},
  memo = {本研究は科研費 21H04900，22H03639，23H03418，23K18474，JST創発的研究支援事業 JP23KJ0828，ムーンショット JPMJPS2011 の助成を受けた．本解説記事の執筆に際し，東京大学大学院の関健太郎氏の助言を受けた．}
}

Who Finds This Voice Attractive? A Large-Scale Experiment Using In-the-Wild Data

Hitoshi Suda , Aya Watanabe , and Shinnosuke Takamichi

In Proceedings of Interspeech , Sep 2024

@inproceedings{suda24interspeech_sukikirai,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Who Finds This Voice Attractive? A Large-Scale Experiment Using In-the-Wild Data},
  author = {Suda, Hitoshi and Watanabe, Aya and Takamichi, Shinnosuke},
  year = {2024},
  memo = {This work was supported by JSPS KAKENHI Grant Number 23K20017, 21H04900, 22H03639, and 23H03418, and JST FOREST JPMJFR226V. This paper is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).}
}

Textless Dependency Parsing by Labeled Sequence Prediction

Shunsuke Kando , Yusuke Miyao , Jason Naradowsky , and Shinnosuke Takamichi

In Proceedings of Interspeech , Sep 2024

@inproceedings{kando24interspeech_textlessparsing,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Textless Dependency Parsing by Labeled Sequence Prediction},
  author = {Kando, Shunsuke and Miyao, Yusuke and Naradowsky, Jason and Takamichi, Shinnosuke},
  year = {2024},
  memo = {This work was supported by JST Moonshot JPMJMS2237 and JST FOREST JPMJFR226V.}
}

Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis

Takaaki Saeki , Soumi Maiti , Xinjian Li , Shinji Watanabe , Shinnosuke Takamichi , and Hiroshi Saruwatari

IEEE Transactions on Audio, Speech, and Language Processing, Sep 2024

@article{saeki24taslp_text-inductive-tts,
  title = {Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis},
  author = {Saeki, Takaaki and Maiti, Soumi and Li, Xinjian and Watanabe, Shinji and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  year = {2024},
  journal = {IEEE Transactions on Audio, Speech, and Language Processing}
}

テキスト生成の自動評価尺度に基づく音声生成の自動評価

佐伯高明 , マイティソウミ , 高道慎之介 , 渡部晋治 , and 猿渡洋

In 電子情報通信学会音声研究会 , Sep 2024

@inproceedings{saeki24sp_speechevaluation,
  abbr_publisher = {電子情報通信学会 音声研究会},
  booktitle = {電子情報通信学会 音声研究会},
  title = {テキスト生成の自動評価尺度に基づく音声生成の自動評価},
  author = {高明, 佐伯 and ソウミ, マイティ and 慎之介, 高道 and 晋治, 渡部 and 洋, 猿渡},
  year = {2024},
  memo = {JSPS 科研費 23H03418，23K18474，22H03639，21H05054，22KJ0838，ムーンショット研究開発費 JPMJPS2011，および JST FOREST JPMJFR226V によって支援された．}
}

音環境に適応するテキスト音声合成のための一人称視点コーパス構築

武伯寒 , 高道慎之介 , 関健太郎 , 坂東宜昭 , and 猿渡洋

In 情報処理学会音声言語処理研究会 , Sep 2024

@inproceedings{take24slp_1st-person-tts,
  abbr_publisher = {情報処理学会 音声言語処理研究会},
  booktitle = {情報処理学会 音声言語処理研究会},
  title = {音環境に適応するテキスト音声合成のための一人称視点コーパス構築},
  author = {伯寒, 武 and 慎之介, 高道 and 健太郎, 関 and 宜昭, 坂東 and 洋, 猿渡},
  year = {2024},
  memo = {本研究の一部は，科研費 22H03639，23K18474，JST 創発的研究支援事業 JP23KJ0828，及び JST ムーンショット型研究開発事業 JPMJMS2011 の助成を受け実施しました．また，原稿の作成に際して，渡邊亞椰さんには図の作成でご協力頂きました．この場を借りて感謝申し上げます．}
}

Cocktail Machine Speech Chain: 重複あり音声を用いた音声認識・音声合成モデルの統一的学習

松永裕太 , 高道慎之介 , 上乃聖 , and 猿渡洋

In 日本音響学会春季研究発表会 , Sep 2024

@inproceedings{matsunaga24asjs_cocktail-speech-chain,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {Cocktail Machine Speech Chain: 重複あり音声を用いた音声認識・音声合成モデルの統一的学習},
  author = {裕太, 松永 and 慎之介, 高道 and 聖, 上乃 and 洋, 猿渡},
  year = {2024},
  memo = {本研究は，JST 次世代研究者挑戦的研究プログラム JPMJSP2108，ムーンショット JPMJPS2011，JST 創発的研究支援事業 JP23KJ0828，科研費 21H05054，22H03639，23H03418 の支援と，東京大学の齋藤佑樹博士，佐伯高明氏の協力を受け実施したものです．}
}

歌唱者間相互作用を再現するDNN重唱歌声合成の検討

兵藤弘明 , 高道慎之介 , 中村友彦 , 小口純矢 , and 猿渡洋

In 情報処理学会音楽情報科学研究会 , Sep 2024

@inproceedings{hyodo24mus_chorus-synthesis,
  abbr_publisher = {情報処理学会 音楽情報科学研究会},
  booktitle = {情報処理学会 音楽情報科学研究会},
  title = {歌唱者間相互作用を再現する{DNN}重唱歌声合成の検討},
  author = {弘明, 兵藤 and 慎之介, 高道 and 友彦, 中村 and 純矢, 小口 and 洋, 猿渡},
  year = {2024},
  memo = {アノテーションの方法について，西山陽子様から多くの助言を受けた．本研究は JST 創発的研究支援事業 JPMJFR226V，JSPS 科研費 23H03418，23K18474 の助成を受けた．}
}

対照学習モデルによる音声-声質表現文の埋め込み表現獲得

渡邊亞椰 , 高道慎之介 , 齋藤佑樹 , 中田亘 , 辛徳泰 , and 猿渡洋

In 日本音響学会春季研究発表会 , Sep 2024

@inproceedings{watanabe24asjs_coconut-embedding,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {対照学習モデルによる音声-声質表現文の埋め込み表現獲得},
  author = {亞椰, 渡邊 and 慎之介, 高道 and 佑樹, 齋藤 and 亘, 中田 and 徳泰, 辛 and 洋, 猿渡},
  year = {2024},
  memo = {本研究は科研費 21H04900, 22H03639，23H03418，JST 創発的研究支援事業 JP23KJ0828，ムーンショット JPMJPS2011 の助成を受けたものです.}
}

大規模な日本語笑い声コーパスを用いたテキストレス笑い声合成

辛徳泰 , 高道慎之介 , 森松亜依 , and 猿渡洋

In 日本音響学会春季研究発表会 , Sep 2024

@inproceedings{xin24asjs_laughter-synthesis,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {大規模な日本語笑い声コーパスを用いたテキストレス笑い声合成},
  author = {徳泰, 辛 and 慎之介, 高道 and 亜依, 森松 and 洋, 猿渡},
  year = {2024},
  memo = {本研究は，JST 次世代研究者挑戦的研究プログラム JPMJSP2108，JSPS 科研費 JP23KJ0828，JST 創発的研究支援事業 JPMJFR22 の支援を受けたものです。}
}

複数のオーディオエフェクトが適用された楽音に対するエフェクトチェイン推定と原音復元

武伯寒 , 渡邉研斗 , 中塚貴之 , Tian Cheng , 中野倫靖 , 後藤真孝 , 高道慎之介 , and 猿渡洋

In 日本音響学会春季研究発表会 , Sep 2024

@inproceedings{take24asjs_audio-effect,
  abbr_publisher = {日本音響学会春季研究発表会},
  booktitle = {日本音響学会春季研究発表会},
  title = {複数のオーディオエフェクトが適用された楽音に対するエフェクトチェイン推定と原音復元},
  author = {伯寒, 武 and 研斗, 渡邉 and 貴之, 中塚 and Cheng, Tian and 倫靖, 中野 and 真孝, 後藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2024},
  memo = {本研究は科研費 21H04900，22H03639，23H03418，JST 創発的研究支援事業 JP23KJ0828，ムーンショット JPMJPS2011 の助成を受けたものです．}
}

Audio Effect Chain Estimation and Dry Signal Recovery from Multi-Effect-Processed Musical Signals

Osamu Take , Kento Watanabe , Takayuki Nakatsuka , Tian Cheng , Tomoyasu Nakano , Masataka Goto , Shinnosuke Takamichi , and Hiroshi Saruwatari

In Proceedings of International Conference on Digital Audio Effects (DAFx) , Sep 2024

@inproceedings{take24dafx_effect-chain,
  abbr_publisher = {Proceedings of International Conference on Digital Audio Effects (DAFx)},
  booktitle = {Proceedings of International Conference on Digital Audio Effects (DAFx)},
  title = {Audio Effect Chain Estimation and Dry Signal Recovery from Multi-Effect-Processed Musical Signals},
  author = {Take, Osamu and Watanabe, Kento and Nakatsuka, Takayuki and Cheng, Tian and Nakano, Tomoyasu and Goto, Masataka and Takamichi, Shinnosuke and Saruwatari, Hiroshi},
  memo = {This work is supported by JSPS KAKENHI 21H04900, 22H03639, and 23H03418, JST FOREST JPMJFR226V, and Moonshot R&D Grant Number JPMJPS2011.},
  year = {2024}
}

Character-Voice Embodiment Impacts on the Cognitive Task Performance with the Voice Ownership Illusion

Yusuke Kunimi , Kenta Kimura , Keigo Matsumoto , Shinnosuke Takamichi , Takuji Narumi , and Masaaki Mochimaru

In Proceedings of International Conference on Artificial Reality and Telexistence & the Eurographics Symposium on Virtual Environments (ICAT-EGVE) , Sep 2024

@inproceedings{kunimi24icategve_voice-embodiment,
  abbr_publisher = {Proceedings of International Conference on Artificial Reality and Telexistence & the Eurographics Symposium on Virtual Environments (ICAT-EGVE)},
  booktitle = {Proceedings of International Conference on Artificial Reality and Telexistence \& the Eurographics Symposium on Virtual Environments (ICAT-EGVE)},
  title = {Character-Voice Embodiment Impacts on the Cognitive Task Performance with the Voice Ownership Illusion},
  author = {Kunimi, Yusuke and Kimura, Kenta and Matsumoto, Keigo and Takamichi, Shinnosuke and Narumi, Takuji and Mochimaru, Masaaki},
  year = {2024}
}

二重唱の歌い出しタイミングに対する同時性知覚の刺激閾調査

兵藤弘明 , 高道慎之介 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2024

@inproceedings{hyodo24asja_duet-timing,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {二重唱の歌い出しタイミングに対する同時性知覚の刺激閾調査},
  author = {弘明, 兵藤 and 慎之介, 高道 and 洋, 猿渡},
  year = {2024}
}

2023

Coco-Nut: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-based Control

Aya Watanabe , Shinnosuke Takamichi , Yuki Saito , Wataru Nakata , Detai Xin , and Hiroshi Saruwatari

In IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) , Sep 2023

@inproceedings{watanabe23asru_coconut-corpus,
  abbr_publisher = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  booktitle = {IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
  title = {{Coco-Nut}: Corpus of {J}apanese Utterance and Voice Characteristics Description for Prompt-based Control},
  author = {Watanabe, Aya and Takamichi, Shinnosuke and Saito, Yuki and Nakata, Wataru and Xin, Detai and Saruwatari, Hiroshi},
  year = {2023}
}

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus

Detai Xin , Shinnosuke Takamichi , Ai Morimatsu , and Hiroshi Saruwatari

In Proceedings of Interspeech , Sep 2023

@inproceedings{xin23interspeech_laughter-synthesis,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus},
  author = {Xin, Detai and Takamichi, Shinnosuke and Morimatsu, Ai and Saruwatari, Hiroshi},
  memo = {This work was supported by JST SPRING, Grant Number JPMJSP2108, JSPS KAKENHI, Grant Number JP23KJ0828, and JST FOREST JPMJFR226V.},
  year = {2023}
}

HumanDiffusion: diffusion model using perceptual gradients

Yota Ueda , Shinnosuke Takamichi , Yuki Saito , Norihiro Takamune , and Hiroshi Saruwatari

In Proceedings of Interspeech , Sep 2023

@inproceedings{ueda23interspeech_humandiffusion,
  abbr_publisher = {Proceedings of Interspeech},
  booktitle = {Proceedings of Interspeech},
  title = {HumanDiffusion: diffusion model using perceptual gradients},
  author = {Ueda, Yota and Takamichi, Shinnosuke and Saito, Yuki and Takamune, Norihiro and Saruwatari, Hiroshi},
  year = {2023}
}

Coco-Nut: 自由記述文による声質制御に向けた多話者音声・声質自由記述ペアデータセット

渡邊亞椰 , 高道慎之介 , 齋藤佑樹 , 辛徳泰 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2023

@inproceedings{watanabe23asja_coconut,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {Coco-Nut: 自由記述文による声質制御に向けた多話者音声・声質自由記述ペアデータセット},
  author = {亞椰, 渡邊 and 慎之介, 高道 and 佑樹, 齋藤 and 徳泰, 辛 and 洋, 猿渡},
  year = {2023}
}

深層学習で獲得される音声シンボルは自然言語シンボルと同様に Zipf 則に従うか？

前田紘希 , 高道慎之介 , 朴浚鎔 , and 猿渡洋

In 日本音響学会秋季研究発表会 , Sep 2023

@inproceedings{maeda23asja_zipf,
  abbr_publisher = {日本音響学会秋季研究発表会},
  booktitle = {日本音響学会秋季研究発表会},
  title = {深層学習で獲得される音声シンボルは自然言語シンボルと同様に {Zipf} 則に従うか？},
  author = {紘希, 前田 and 慎之介, 高道 and 浚鎔, 朴 and 洋, 猿渡},
  year = {2023}
}