docs: add configuration and usage guide for custom TTS and voices

2025-04-13 00:47:29 +00:00 · 2024-06-16 17:34:38 +08:00 · 86feece787
commit 86feece787
parent cc451ac2e6
11 changed files with 159 additions and 81 deletions
--- a/.env.example
+++ b/.env.example
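@@ -15,6 +15,5 @@ OPENAI_BASE_URL=https://api.openai.com/v1
 # AUDIO_ACTIVE=URL of the wake-up prompt sound, same as above
 # AUDIO_ERROR=URL of the error prompt sound, same as above

-# Doubao TTS (optional, for calling a third-party TTS service such as Doubao)
-# TTS_DOUBAO=Doubao TTS endpoint
-# SPEAKERS_DOUBAO=Doubao TTS voice list endpoint
+# Third-party TTS (optional, for calling a third-party TTS service)
+# TTS_BASE_URL=your TTS endpoint, e.g. http://[your LAN/public address]:[port]/api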
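
Because `TTS_BASE_URL` is optional, the engine choice has to fall back when it is missing; `src/services/speaker/base.ts` in this commit does so by forcing `tts = "xiaoai"`. The rule can be sketched as below. This is only an illustration: `resolveTtsEngine` and its `env` parameter are hypothetical names, and trimming whitespace is an added assumption rather than something the project does.

```typescript
type TTSProvider = "xiaoai" | "custom";

// Decide which engine to use. `env` stands in for process.env so the
// function stays pure and easy to test (an assumption of this sketch).
function resolveTtsEngine(
  requested: TTSProvider,
  env: Record<string, string | undefined>
): TTSProvider {
  const baseUrl = env["TTS_BASE_URL"]?.trim();
  // Without a TTS endpoint, only the built-in Xiaoai TTS can be used.
  return requested === "custom" && baseUrl ? "custom" : "xiaoai";
}
```

Calling `resolveTtsEngine("custom", {})` yields `"xiaoai"`, so a config that asks for `custom` without setting the variable still falls back to the built-in voice.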
--- a/.migpt.example.js
+++ b/.migpt.example.js
@@ -9,6 +9,8 @@ export default {
    profile: masterProfile,
  },
  speaker: {
+    // TTS engine
+    tts: "xiaoai",
    // Xiaomi ID
    userId: "987654321", // Note: not your phone number or email; find it under "Personal Info" > "Xiaomi ID"
    // Account password
--- a/README.md
+++ b/README.md
@@ -96,6 +96,7 @@ main();

 - [⚙️ Settings](https://github.com/idootop/mi-gpt/blob/main/docs/settings.md)
 - [💬 FAQ](https://github.com/idootop/mi-gpt/blob/main/docs/faq.md)
+- [🚗 Using third-party TTS](https://github.com/idootop/mi-gpt/blob/main/docs/tts.md)
 - [🛠️ Local development](https://github.com/idootop/mi-gpt/blob/main/docs/development.md)
 - [💎 How it works](https://github.com/idootop/mi-gpt/blob/main/docs/how-it-works.md)
 - [✨ Changelog](https://github.com/idootop/mi-gpt/blob/main/docs/changelog.md)
--- a/docs/faq.md
+++ b/docs/faq.md
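@@ -1,5 +1,7 @@
 # 💬 FAQ

+> Search first: most questions are already answered here. If you have a new question, feel free to open an [issue](https://github.com/idootop/mi-gpt/issues).
+
 ### Q: Which Xiaoai speaker models are supported?

 Most Xiaoai speaker models are supported. The Xiaoai Speaker Pro is recommended (it runs perfectly).
@@ -8,6 +10,12 @@

 > Note: this project does not currently support Xiaodu, Tmall Genie, HomePod, or other smart speakers, and there are no plans to add support.

+### Q: Are other TTS services supported, and how do I connect one?
+
+Any TTS service can be connected, including locally deployed ones such as ChatTTS.
+
+For the configuration and usage guide, see [🚗 Using third-party TTS](https://github.com/idootop/mi-gpt/blob/main/docs/tts.md)
+
 ### Q: What is wake-up mode, and how do I wake the AI?

 `Wake-up mode` is similar to a Xiaoai skill: it lets you interact with Xiaoai without having to begin every sentence with the wake word “小爱同学”. Suppose your wake-up keywords are configured as follows: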
--- a/docs/settings.md
+++ b/docs/settings.md
@@ -6,36 +6,37 @@

 Then replace the configuration parameters with your own. The parameters are described below:

-| Parameter | Description | Example |
-| --- | --- | --- |
-| `systemTemplate` | System prompt template; gives finer control over the AI's behavior rules, whether to include context, and so on, [setup guide](https://github.com/idootop/mi-gpt/blob/main/docs/prompt.md) | `"你是一个博学多识的人，下面请友好的回答用户的提问，保持精简。"` |
-| **bot** | | |
-| `name` | The bot's name (the Xiaoai speaker) | `"傻妞"` |
-| `profile` | The bot's profile/persona | `"性别女，性格乖巧可爱，喜欢搞怪，爱吃醋。"` |
-| **master** | | |
-| `name` | The master's name (yourself) | `"陆小千"` |
-| `profile` | The master's profile/persona | `"性别男，善良正直，总是舍己为人，是傻妞的主人。"` |
-| **room** | | |
-| `name` | Chat room name | `"魔幻手机"` |
-| `description` | Chat room description | `"傻妞和陆小千的私聊"` |
-| **speaker** | | |
-| `userId` | [Xiaomi ID](https://account.xiaomi.com/fe/service/account/profile) (note: not your phone number or email) | `"987654321"` |
-| `password` | Account password | `"123456"` |
-| `did` | Xiaoai speaker ID or name | `"小爱音箱 Pro"` |
-| `ttsCommand` | Xiaoai speaker TTS command ([look it up here](https://home.miot-spec.com)) | `[5, 1]` |
-| `wakeUpCommand` | Xiaoai speaker wake-up command ([look it up here](https://home.miot-spec.com)) | `[5, 3]` |
+| Parameter | Description | Example |
+| --- | --- | --- |
+| `systemTemplate` | System prompt template; gives finer control over the AI's behavior rules, whether to include context, and so on 👉 [setup guide](https://github.com/idootop/mi-gpt/blob/main/docs/prompt.md) | `"你是一个博学多识的人，下面请友好的回答用户的提问，保持精简。"` |
+| **bot** | | |
+| `name` | The bot's name (the Xiaoai speaker) | `"傻妞"` |
+| `profile` | The bot's profile/persona | `"性别女，性格乖巧可爱，喜欢搞怪，爱吃醋。"` |
+| **master** | | |
+| `name` | The master's name (yourself) | `"陆小千"` |
+| `profile` | The master's profile/persona | `"性别男，善良正直，总是舍己为人，是傻妞的主人。"` |
+| **room** | | |
+| `name` | Chat room name | `"魔幻手机"` |
+| `description` | Chat room description | `"傻妞和陆小千的私聊"` |
+| **speaker** | | |
+| `userId` | [Xiaomi ID](https://account.xiaomi.com/fe/service/account/profile) (note: not your phone number or email) | `"987654321"` |
+| `password` | Account password | `"123456"` |
+| `did` | Xiaoai speaker ID or name | `"小爱音箱 Pro"` |
+| `ttsCommand` | Xiaoai speaker TTS command ([look it up here](https://home.miot-spec.com)) | `[5, 1]` |
+| `wakeUpCommand` | Xiaoai speaker wake-up command ([look it up here](https://home.miot-spec.com)) | `[5, 3]` |
 | **Other speaker parameters (optional)** |
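-| `callAIKeywords` | When a message starts with one of these keywords, the AI is called to respond | `["请", "傻妞"]` |
-| `wakeUpKeywords` | When a message starts with one of these keywords, the AI enters wake-up mode | `["召唤傻妞", "打开傻妞"]` |
-| `exitKeywords` | When a message starts with one of these keywords, the AI exits wake-up mode | `["退出傻妞", "关闭傻妞"]` |
-| `onEnterAI` | Greeting when entering AI mode | `["你好，我是傻妞，很高兴认识你"]` |
-| `onExitAI` | Prompt when exiting AI mode | `["傻妞已退出"]` |
-| `onAIAsking` | Prompt when the AI starts answering | `["让我先想想", "请稍等"]` |
-| `onAIReplied` | Prompt when the AI finishes answering | `["我说完了", "还有其他问题吗"]` |
-| `onAIError` | Prompt when the AI fails to answer | `["出错了，请稍后再试吧！"]` |
-| `playingCommand` | Command for querying whether the speaker is playing (note: not needed by default; try enabling it only if playback misbehaves) | `[3, 1, 1]` |
-| `streamResponse` | Whether to enable streaming responses (some models cannot report playback status; disable streaming for those) | `true` |
-| `exitKeepAliveAfter` | How long to wait without a response before automatically exiting wake-up mode (in seconds, default 30) | `30` |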
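+| `tts` | TTS engine (guide: [🚗 Using third-party TTS](https://github.com/idootop/mi-gpt/blob/main/docs/tts.md)) | `"xiaoai"` |
+| `callAIKeywords` | When a message starts with one of these keywords, the AI is called to respond | `["请", "傻妞"]` |
+| `wakeUpKeywords` | When a message starts with one of these keywords, the AI enters wake-up mode | `["召唤傻妞", "打开傻妞"]` |
+| `exitKeywords` | When a message starts with one of these keywords, the AI exits wake-up mode | `["退出傻妞", "关闭傻妞"]` |
+| `onEnterAI` | Greeting when entering AI mode | `["你好，我是傻妞，很高兴认识你"]` |
+| `onExitAI` | Prompt when exiting AI mode | `["傻妞已退出"]` |
+| `onAIAsking` | Prompt when the AI starts answering | `["让我先想想", "请稍等"]` |
+| `onAIReplied` | Prompt when the AI finishes answering | `["我说完了", "还有其他问题吗"]` |
+| `onAIError` | Prompt when the AI fails to answer | `["出错了，请稍后再试吧！"]` |
+| `playingCommand` | Command for querying whether the speaker is playing (note: not needed by default; try enabling it only if playback misbehaves) | `[3, 1, 1]` |
+| `streamResponse` | Whether to enable streaming responses (some models cannot report playback status; disable streaming for those) | `true` |
+| `exitKeepAliveAfter` | How long to wait without a response before automatically exiting wake-up mode (in seconds, default 30) | `30` |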

 ## Environment variables

@@ -43,18 +44,17 @@

 Then replace the environment variables with your own. The variables are described below:

-| Environment variable | Description | Example |
-| --- | --- | --- |
-| **OpenAI** | | |
-| `OPENAI_API_KEY` | OpenAI API key | `abc123` |
-| `OPENAI_MODEL` | OpenAI model to use | `gpt-4o` |
-| `OPENAI_BASE_URL` | Optional, OpenAI API base URL | `https://api.openai.com/v1` |
-| `AZURE_OPENAI_API_KEY` | Optional, [Microsoft Azure OpenAI](https://www.npmjs.com/package/openai#microsoft-azure-openai) | `abc123` |
-| **Prompt sounds (optional)** | | |
-| `AUDIO_SILENT` | Silent audio URL | `"https://example.com/silent.wav"` |
-| `AUDIO_BEEP` | Default prompt sound URL | `"https://example.com/beep.wav"` |
-| `AUDIO_ACTIVE` | Wake-up prompt sound URL | `"https://example.com/active.wav"` |
-| `AUDIO_ERROR` | Error prompt sound URL | `"https://example.com/error.wav"` |
-| **Doubao TTS (optional)** | | |
-| `TTS_DOUBAO` | Doubao TTS endpoint | `"https://example.com/tts.wav"` |
-| `SPEAKERS_DOUBAO` | Doubao TTS voice list endpoint | `"https://example.com/tts-speakers"` |
+| Environment variable | Description | Example |
+| --- | --- | --- |
+| **OpenAI** | | |
+| `OPENAI_API_KEY` | OpenAI API key | `abc123` |
+| `OPENAI_MODEL` | OpenAI model to use | `gpt-4o` |
+| `OPENAI_BASE_URL` | Optional, OpenAI API base URL | `https://api.openai.com/v1` |
+| `AZURE_OPENAI_API_KEY` | Optional, [Microsoft Azure OpenAI](https://www.npmjs.com/package/openai#microsoft-azure-openai) | `abc123` |
+| **Prompt sounds (optional)** | | |
+| `AUDIO_SILENT` | Silent audio URL | `"https://example.com/silent.wav"` |
+| `AUDIO_BEEP` | Default prompt sound URL | `"https://example.com/beep.wav"` |
+| `AUDIO_ACTIVE` | Wake-up prompt sound URL | `"https://example.com/active.wav"` |
+| `AUDIO_ERROR` | Error prompt sound URL | `"https://example.com/error.wav"` |
+| **Third-party TTS (optional)** | | |
+| `TTS_BASE_URL` | Base URL of the third-party TTS service | `"https://example.com/api"` |
--- a/docs/todo.md
+++ b/docs/todo.md
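@@ -29,7 +29,7 @@
 - ✅ Add support status and parameter list for common Xiaoai speaker models
 - ✅ Add a note that gpt-4 series models may be unavailable before topping up an OpenAI account
 - ✅ Add a note that MiGPT does not need to run on the same LAN as the Xiaoai speaker
-- Add configuration and usage guide for custom TTS and voices
+- ✅ Add configuration and usage guide for custom TTS and voices
 - Add more detailed usage and configuration video tutorials
 - Add 302.AI Sponsor link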

--- a/docs/tts.md
+++ b/docs/tts.md
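@@ -0,0 +1,74 @@
+# 🚗 Using third-party TTS
+
+By default, `MiGPT` uses Xiaomi's built-in TTS to read text aloud. If you need to:
+
+1. Work around Xiaomi TTS reporting that the text contains sensitive information
+2. Use a third-party or self-hosted TTS service with custom voices
+
+you can switch the TTS engine that `MiGPT` uses by following these steps:
+
+1. Set the `TTS_BASE_URL` environment variable
+2. Set `speaker.tts` to `custom`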
+
+```js
+// .env
+TTS_BASE_URL=http://[your LAN or public address]:[port]/api
+
+// .migpt.js
+export default {
+  speaker: {
+    // TTS engine
+    tts: 'custom',
+    // ...
+  },
+};
+```
+
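+## TTS_BASE_URL
+
+`TTS_BASE_URL` is the address of your external TTS service. A Node.js example is available: [MiGPT-TTS](https://github.com/idootop/mi-gpt-tts). It currently integrates only the [Volcengine](https://www.volcengine.com/docs/6561/79817) speech synthesis service, which offers 21 common voices for free after real-name verification.
+
+For deployment and usage instructions, see https://github.com/idootop/mi-gpt-tts
+
+## Supporting more TTS services
+
+To use a local TTS service (such as ChatTTS) or another TTS provider (such as Microsoft, iFLYTEK, or OpenAI), you can build your own server using the [MiGPT-TTS](https://github.com/idootop/mi-gpt-tts) code above as a reference. It only needs to implement the following endpoints:
+
+### GET `TTS_BASE_URL/tts.mp3`
+
+Synthesizes text into audio. Example request (with `TTS_BASE_URL` ending in `/api`): `/api/tts.mp3?speaker=BV700_streaming&text=很高兴认识你`
+
+The optional `speaker` request parameter is the name or identifier of the voice to use.
+
+### GET `TTS_BASE_URL/speakers`
+
+Returns the list of voices.
+
+| Field   | Description      | Example           |
+| ------- | ---------------- | ----------------- |
+| name    | Voice name       | `灿灿`            |
+| gender  | Gender           | `女`              |
+| speaker | Voice identifier | `BV700_streaming` |
+
+Example response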
+
+```json
+[
+  {
+    "name": "广西老表",
+    "gender": "男",
+    "speaker": "BV213_streaming"
+  },
+  {
+    "name": "甜美台妹",
+    "gender": "女",
+    "speaker": "BV025_streaming"
+  }
+]
+```
+
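+## Available TTS engines
+
+If you have added support for more TTS services, feel free to submit a PR and share your project with everyone.
+
+- [MiGPT-TTS](https://github.com/idootop/mi-gpt-tts): currently integrates the [Volcengine](https://www.volcengine.com/docs/6561/79817) speech synthesis service, which offers 21 common voices for free after real-name verification.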
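
The two endpoints that `docs/tts.md` asks a custom server to provide can be sketched as a small routing function. This is a hedged illustration, not the MiGPT-TTS implementation: `route`, `parseQuery`, `voices`, and the `synthesize` callback are hypothetical names, the audio body is modelled as a string placeholder instead of MP3 bytes, and paths are taken relative to `TTS_BASE_URL`.

```typescript
// Shape of one entry in the voice list; only `speaker` is required.
type Voice = { name?: string; gender?: string; speaker: string };
type Reply = { status: number; contentType: string; body: string };

// Hypothetical catalogue; a real service would list its own voices.
const voices: Voice[] = [
  { name: "广西老表", gender: "男", speaker: "BV213_streaming" },
  { name: "甜美台妹", gender: "女", speaker: "BV025_streaming" },
];

// Parse "a=1&b=2" into a plain object, decoding percent-escapes.
function parseQuery(query: string): Record<string, string> {
  const out: Record<string, string> = {};
  for (const pair of query.split("&")) {
    if (!pair) continue;
    const eq = pair.indexOf("=");
    const key = eq < 0 ? pair : pair.slice(0, eq);
    const value = eq < 0 ? "" : pair.slice(eq + 1);
    out[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  return out;
}

// Map a request path to a reply. `synthesize` stands in for the real
// TTS engine call, which would produce audio for the given text.
function route(
  pathAndQuery: string,
  synthesize: (text: string, speaker?: string) => string
): Reply {
  const q = pathAndQuery.indexOf("?");
  const path = q < 0 ? pathAndQuery : pathAndQuery.slice(0, q);
  const params = parseQuery(q < 0 ? "" : pathAndQuery.slice(q + 1));
  if (path === "/speakers") {
    return { status: 200, contentType: "application/json", body: JSON.stringify(voices) };
  }
  if (path === "/tts.mp3") {
    const text = params["text"];
    if (!text) return { status: 400, contentType: "text/plain", body: "missing text" };
    // `speaker` is optional, so it may be undefined here.
    return { status: 200, contentType: "audio/mpeg", body: synthesize(text, params["speaker"]) };
  }
  return { status: 404, contentType: "text/plain", body: "not found" };
}
```

Keeping the routing separate from the HTTP layer means the same function can sit behind any server framework; only `synthesize` has to know about the actual TTS provider.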
--- a/src/services/speaker/ai.ts
+++ b/src/services/speaker/ai.ts
@@ -192,7 +192,7 @@ export class AISpeaker extends Speaker {
            msg.text.startsWith(e)
          )!;
          const speaker = msg.text.replace(prefix, "");
-          const success = await this.switchDefaultSpeaker(speaker);
+          const success = await this.switchSpeaker(speaker);
          await this.response({
            text: success ? "音色已切换！" : "音色切换失败！",
            keepAlive: this.keepAlive,
--- a/src/services/speaker/base.ts
+++ b/src/services/speaker/base.ts
@@ -11,11 +11,11 @@ import { StreamResponse } from "./stream";
 import { kAreYouOK } from "../../utils/string";
 import { fastRetry } from "../../utils/retry";

-export type TTSProvider = "xiaoai" | "doubao";
+export type TTSProvider = "xiaoai" | "custom";

 type Speaker = {
-  name: string;
-  gender: "男" | "女";
+  name?: string;
+  gender?: string;
  speaker: string;
 };

@@ -205,9 +205,9 @@ export class BaseSpeaker {
      return;
    }

-    const doubaoTTS = process.env.TTS_DOUBAO;
-    if (!doubaoTTS) {
-      tts = "xiaoai"; // Without a Doubao TTS endpoint, only Xiaoai's built-in TTS can be used
+    const customTTS = process.env.TTS_BASE_URL;
+    if (!customTTS) {
+      tts = "xiaoai"; // Without a TTS endpoint, only Xiaoai's built-in TTS can be used
    }

    const ttsNotXiaoai = tts !== "xiaoai" && !audio;
@@ -300,16 +300,10 @@ export class BaseSpeaker {
      playSFX = true,
      keepAlive = false,
      tts = this.tts,
-      speaker = this._defaultSpeaker,
+      speaker = this._currentSpeaker,
    } = options ?? {};

-    const hasNewMsg = () => {
-      const flag = options.hasNewMsg?.();
-      if (this.debug) {
-        this.logger.debug("checkIfHasNewMsg:" + flag);
-      }
-      return flag;
-    };
+    const hasNewMsg = () => options.hasNewMsg?.();

    const ttsText = text?.replace(/\n\s*\n/g, "\n")?.trim();
    const ttsNotXiaoai = tts !== "xiaoai" && !audio;
@@ -399,10 +393,9 @@ export class BaseSpeaker {
    } else if (ttsText) {
      // Text reply
      switch (tts) {
-        case "doubao":
+        case "custom":
          const _text = encodeURIComponent(ttsText);
-          const doubaoTTS = process.env.TTS_DOUBAO;
-          const url = `${doubaoTTS}?speaker=${speaker}&text=${_text}`;
+          const url = `${process.env.TTS_BASE_URL}/tts.mp3?speaker=${speaker}&text=${_text}`;
          res = await play({ url });
          break;
        case "xiaoai":
@@ -414,26 +407,27 @@ export class BaseSpeaker {
    return res;
  }

-  private _doubaoSpeakers?: Speaker[];
-  private _defaultSpeaker = "zh_female_maomao_conversation_wvae_bigtts";
-  async switchDefaultSpeaker(speaker: string) {
-    const speakersAPI = process.env.SPEAKERS_DOUBAO;
-    if (!this._doubaoSpeakers && speakersAPI) {
-      const resp = await fetch(speakersAPI).catch(() => null);
+  private _speakers?: Speaker[];
+  private _currentSpeaker: string | undefined;
+  async switchSpeaker(speaker: string) {
+    if (!this._speakers && process.env.TTS_BASE_URL) {
+      const resp = await fetch(`${process.env.TTS_BASE_URL}/speakers`).catch(
+        () => null
+      );
      const res = await resp?.json().catch(() => null);
      if (Array.isArray(res)) {
-        this._doubaoSpeakers = res;
+        this._speakers = res;
      }
    }
-    if (!this._doubaoSpeakers) {
+    if (!this._speakers) {
      return false;
    }
-    const target = this._doubaoSpeakers.find(
+    const target = this._speakers.find(
      (e) => e.name === speaker || e.speaker === speaker
    );
    if (target) {
-      this._defaultSpeaker = target.speaker;
+      this._currentSpeaker = target.speaker;
+      return true;
    }
-    return this._defaultSpeaker === target?.speaker;
  }
 }
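
The request URL and voice lookup introduced in `base.ts` above can be sketched in isolation. This is an illustrative variant under stated assumptions, not the committed code: `buildTtsUrl` and `findSpeaker` are hypothetical helpers. The committed template string interpolates `speaker` directly, so an unset speaker appears to be sent as the literal text `undefined`; the sketch omits the parameter in that case and also tolerates a trailing slash on the base URL.

```typescript
type Speaker = { name?: string; gender?: string; speaker: string };

// Build the audio URL under TTS_BASE_URL. The `speaker` parameter is
// left out when no voice is selected, so the server can use its default.
function buildTtsUrl(baseUrl: string, text: string, speaker?: string): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const params = [`text=${encodeURIComponent(text)}`];
  if (speaker) params.unshift(`speaker=${encodeURIComponent(speaker)}`);
  return `${base}/tts.mp3?${params.join("&")}`;
}

// Resolve a voice by display name or identifier, mirroring the
// `name === speaker || speaker === speaker` match in switchSpeaker.
function findSpeaker(speakers: Speaker[], query: string): string | undefined {
  return speakers.find((e) => e.name === query || e.speaker === query)?.speaker;
}
```

For example, `buildTtsUrl("http://localhost:4321/api", "hi")` produces a URL with only the `text` parameter, while `findSpeaker` returns `undefined` for an unknown voice so the caller can report that the switch failed.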
--- a/tests/bot.ts
+++ b/tests/bot.ts
@@ -10,7 +10,7 @@ async function testRunBot() {
  const name = "傻妞";
  const speaker = new AISpeaker({
    name,
-    tts: "doubao",
+    tts: "custom",
    userId: process.env.MI_USER!,
    password: process.env.MI_PASS!,
    did: process.env.MI_DID,
@@ -41,7 +41,7 @@ async function testStreamResponse() {
    userId: process.env.MI_USER!,
    password: process.env.MI_PASS!,
    did: process.env.MI_DID,
-    tts: "doubao",
+    tts: "custom",
  };
  const speaker = new AISpeaker(config);
  await speaker.initMiServices();
--- a/tests/speaker.ts
+++ b/tests/speaker.ts
@@ -47,8 +47,8 @@ async function testSpeakerUnWakeUp(speaker: AISpeaker) {

 async function testSwitchSpeaker(speaker: AISpeaker) {
  await speaker.response({ text: "你好，我是傻妞，很高兴认识你！" });
-  const success = await speaker.switchDefaultSpeaker("魅力苏菲");
-  console.log("switchDefaultSpeaker 魅力苏菲", success);
+  const success = await speaker.switchSpeaker("魅力苏菲");
+  console.log("switchSpeaker 魅力苏菲", success);
  await speaker.response({ text: "你好，我是傻妞，很高兴认识你！" });
  console.log("hello");
 }