inworld-tts: Text-to-speech via the Inworld.ai API

crest · 2026-02-05 23:04:51

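Name: inworld-tts
Description: Text-to-speech via the Inworld.ai API. Use it to generate speech audio from text, create voice replies, or convert text to MP3/audio files. Supports multiple voices, adjustable speaking rate, and streaming for long text.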

Inworld TTS

Convert text to speech audio using Inworld.ai's TTS API.

Setup

Get an API key from https://platform.inworld.ai
Generate a key with the "Voices: Read" permission
Copy the key in "Basic (Base64)" format
Set the environment variable:

export INWORLD_API_KEY="your-base64-key-here"

To make this permanent, add that line to ~/.bashrc or ~/.clawdbot/.env.
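A script that relies on this variable can fail early when it is missing. The following is a minimal sketch of such a guard, not the actual contents of tts.sh; the function name require_api_key is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of tts.sh as published):
# succeed only if INWORLD_API_KEY is set and non-empty.
require_api_key() {
  if [ -z "${INWORLD_API_KEY:-}" ]; then
    echo "INWORLD_API_KEY not set" >&2
    return 1
  fi
}
```

A caller would typically run `require_api_key || exit 1` before making any request.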

Installation

# Copy the skill into your skills directory
cp -r inworld-tts /path/to/your/skills/

# Make the script executable
chmod +x /path/to/your/skills/inworld-tts/scripts/tts.sh

# Optional: create a symlink for global access
ln -sf /path/to/your/skills/inworld-tts/scripts/tts.sh /usr/local/bin/inworld-tts

Usage

# Basic usage
./scripts/tts.sh "Hello world" output.mp3

# With options
./scripts/tts.sh "Hello world" output.mp3 --voice Dennis --rate 1.2

# Streaming (for text longer than 4000 characters)
./scripts/tts.sh "Very long text..." output.mp3 --stream

Options

Option	Default	Description
`--voice`	Dennis	Voice ID
`--rate`	1.0	Speaking rate (0.5-2.0)
`--temp`	1.1	Temperature (0.1-2.0)
`--model`	inworld-tts-1.5-max	Model ID
`--stream`	false	Use the streaming endpoint
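The numeric ranges for --rate and --temp can be checked before any request is sent. The sketch below is one possible way to do that and is not taken from tts.sh; in_range is a hypothetical helper that delegates the floating-point comparison to awk.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if $1 is a plain decimal number within [$2, $3].
# awk does the floating-point comparison, which the shell's [ ] cannot.
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" 'BEGIN {
    # Reject anything that is not a plain decimal number.
    if (v !~ /^[0-9]+(\.[0-9]+)?$/) exit 1
    exit (v + 0 >= lo + 0 && v + 0 <= hi + 0) ? 0 : 1
  }'
}
```

For example, `in_range "$rate" 0.5 2.0` guards --rate and `in_range "$temp" 0.1 2.0` guards --temp, using the limits from the options table.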

API Reference

Endpoint	Purpose
`POST https://api.inworld.ai/tts/v1/voice`	Standard speech synthesis
`POST https://api.inworld.ai/tts/v1/voice:stream`	Streaming for long text
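As a rough illustration of how the two endpoints might be used, the sketch below picks an endpoint from the text length (the 4000-character threshold comes from the usage section) and issues a standard request. It is not the published tts.sh. The helper names, the `Authorization: Basic` header, the request fields `text`, `voiceId`, `modelId`, and the response field `audioContent` are all assumptions that should be checked against the Inworld API documentation.

```shell
#!/usr/bin/env bash
# Endpoints copied from the API reference table.
TTS_URL="https://api.inworld.ai/tts/v1/voice"
TTS_STREAM_URL="https://api.inworld.ai/tts/v1/voice:stream"

# Hypothetical helper: print the streaming endpoint for text over
# 4000 characters, otherwise the standard endpoint.
choose_endpoint() {
  local text="$1"
  if [ "${#text}" -gt 4000 ]; then
    echo "$TTS_STREAM_URL"
  else
    echo "$TTS_URL"
  fi
}

# Hypothetical helper: standard (non-streaming) synthesis to a file.
# ASSUMPTION: the JSON field names and the Basic auth header below are
# not stated in this document; verify them before relying on this.
synthesize() {
  local text="$1" out="$2"
  jq -n --arg t "$text" --arg v "Dennis" --arg m "inworld-tts-1.5-max" \
      '{text: $t, voiceId: $v, modelId: $m}' |
    curl -sS --fail -X POST "$TTS_URL" \
      -H "Authorization: Basic $INWORLD_API_KEY" \
      -H "Content-Type: application/json" \
      --data @- |
    jq -r '.audioContent' | base64 -d > "$out"
}
```

Building the payload with jq rather than string concatenation keeps quotes and newlines in the input text properly escaped.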

Dependencies

curl - sends HTTP requests
jq - processes JSON data
base64 - decodes audio data
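Whether these tools are installed can be verified up front. The following is a small sketch using the standard `command -v` lookup; missing_deps is a hypothetical helper name, not something shipped with the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command from the argument list that is
# not found on PATH, one per line. Prints nothing if all are present.
missing_deps() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Example call:
#   missing="$(missing_deps curl jq base64)"
#   [ -z "$missing" ] || { echo "Missing: $missing" >&2; exit 1; }
```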

Examples

# Quick test
export INWORLD_API_KEY="aXM2..."
./scripts/tts.sh "Testing one two three" test.mp3
mpv test.mp3  # or use any audio player

# Slower speaking rate (default voice)
./scripts/tts.sh "Slow and steady" slow.mp3 --rate 0.8

# Faster rate for news-style delivery
./scripts/tts.sh "Breaking news!" fast.mp3 --rate 1.5

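Troubleshooting

"INWORLD_API_KEY not set" - export the environment variable before running the script.

Output file is empty - check that the API key is valid and has the "Voices: Read" permission.

Streaming issues - make sure your jq supports the --unbuffered flag.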
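The empty-output case can be caught automatically after a run. This is a minimal sketch, assuming the output path is passed as the first argument; check_output is a hypothetical helper and is not part of the skill.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail if the given output file is missing or empty.
check_output() {
  if [ ! -s "$1" ]; then
    echo "Output file '$1' is missing or empty; check INWORLD_API_KEY and its \"Voices: Read\" permission" >&2
    return 1
  fi
}
```

It could be run right after the script, for example `./scripts/tts.sh "Hello world" output.mp3 && check_output output.mp3`.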