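name: audiopod
description: Use the AudioPod AI API for audio processing tasks, including AI music generation (text-to-music, text-to-rap, instrumentals, samples, vocals), stem separation, text-to-speech, noise reduction, speech-to-text transcription, speaker diarization, and media extraction. Use when the user needs to generate music/songs/rap from text, split a song into stems/vocals/instruments, generate speech from text, clean up noisy audio, transcribe audio/video, or extract audio from YouTube/URLs. Requires the AUDIOPOD_API_KEY environment variable to be set, or api_key to be passed directly.
Complete audio processing API: music generation, stem separation, text-to-speech, noise reduction, transcription, speaker diarization, and wallet management.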
pip install audiopod # Python
npm install audiopod # Node.js
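Authentication: set the AUDIOPOD_API_KEY environment variable or pass the key to the client constructor (API keys start with ap_).
from audiopod import AudioPod
client = AudioPod()  # uses the AUDIOPOD_API_KEY environment variable
# or: client = AudioPod(api_key="ap_...")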
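The lookup order described above (explicit key first, environment variable second) can be sketched as follows. `resolve_api_key` is a hypothetical helper and not part of the audiopod SDK; the `ap_` prefix check assumes every key shares the prefix shown in the constructor example.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(api_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None) -> str:
    """Return the explicit key if given, else AUDIOPOD_API_KEY from the environment."""
    env = os.environ if env is None else env
    key = api_key or env.get("AUDIOPOD_API_KEY", "")
    if not key:
        raise ValueError("No API key: set AUDIOPOD_API_KEY or pass api_key")
    if not key.startswith("ap_"):
        # Assumption: keys begin with "ap_", as in the constructor example.
        raise ValueError("API key does not start with 'ap_'")
    return key
```

Failing early with a clear message tends to be easier to debug than an authentication error returned by the first API call.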
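Generate songs, rap, instrumentals, samples, and vocals from text prompts.
Task types: text2music (songs with vocals), text2rap (rap), prompt2instrumental (instrumentals), lyric2vocals (vocals only), text2samples (loops/samples), audio2audio (style transfer), songbloom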
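# Generate a full song with lyrics
result = client.music.song(
    prompt="Upbeat pop, synths, drums, 120 bpm, female vocals, radio quality",
    lyrics="Verse 1:\nWalking down the street on a sunny day\n\nChorus:\nWe're on fire tonight!",
    duration=60
)
print(result["output_url"])
# Generate rap
result = client.music.rap(
    prompt="Lo-Fi hip hop, 100 BPM, male rap, melancholic, keyboard chords",
    lyrics="Verse 1:\nStarted from the bottom, now we're climbing...",
    duration=60
)
# Generate an instrumental (no lyrics needed)
result = client.music.instrumental(
    prompt="Atmospheric ambient soundscape, uplifting, driving mood",
    duration=30
)
# Generic generation with an explicit task type
result = client.music.generate(
    prompt="Electronic dance music, high energy",
    task="text2samples",  # any task type
    duration=30
)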
# Async: submit, then poll
job = client.music.create(
    prompt="Chill Lo-Fi beat",
    duration=30,
    task="prompt2instrumental"
)
result = client.music.wait_for_completion(job["id"], timeout=600)
# Get the available genre presets
presets = client.music.get_presets()
# List/manage jobs
jobs = client.music.list(skip=0, limit=50)
job = client.music.get(job_id=123)
client.music.delete(job_id=123)
# Song with lyrics
curl -X POST "https://api.audiopod.ai/api/v1/music/text2music" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Upbeat pop, synths, 120bpm, female vocals", "lyrics":"Walking down the street...", "audio_duration":60}'
# Rap
curl -X POST "https://api.audiopod.ai/api/v1/music/text2rap" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Lo-Fi Hip Hop, male rap, 100 BPM", "lyrics":"Started from the bottom...", "audio_duration":60}'
# Instrumental
curl -X POST "https://api.audiopod.ai/api/v1/music/prompt2instrumental" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Atmospheric soundscape, uplifting", "audio_duration":30}'
# Samples/loops
curl -X POST "https://api.audiopod.ai/api/v1/music/text2samples" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Drum loop, sad mood", "audio_duration":15}'
# Vocals only
curl -X POST "https://api.audiopod.ai/api/v1/music/lyric2vocals" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"Clean vocals, upbeat", "lyrics":"Eternal chorus of unity...", "audio_duration":30}'
# Check job status / get the result
curl "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Get genre presets
curl "https://api.audiopod.ai/api/v1/music/presets" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# List jobs
curl "https://api.audiopod.ai/api/v1/music/jobs?skip=0&limit=50" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Delete a job
curl -X DELETE "https://api.audiopod.ai/api/v1/music/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
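| Field | Required | Description |
|---|---|---|
| prompt | Yes | Style/genre description |
| lyrics | For songs/rap/vocals | Lyrics with verse/chorus structure |
| audio_duration | No | Duration in seconds, default 30 |
| genre_preset | No | Genre preset name (from the presets endpoint) |
| display_name | No | Track display name |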
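As a rough illustration of how the fields in this table map onto the raw HTTP calls, the sketch below builds (but does not send) a generation request using only the Python standard library. `build_music_request` and `LYRIC_TASKS` are hypothetical names, and treating lyrics as mandatory for exactly these three task types is an assumption drawn from the table and the curl examples.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.audiopod.ai/api/v1/music"
# Assumption: these are the task types that need lyrics (songs, rap, vocals).
LYRIC_TASKS = {"text2music", "text2rap", "lyric2vocals"}


def build_music_request(task: str, prompt: str, api_key: str,
                        lyrics: Optional[str] = None,
                        audio_duration: int = 30) -> urllib.request.Request:
    """Build (but do not send) a POST request for one generation task."""
    if not prompt:
        raise ValueError("prompt is required")
    if task in LYRIC_TASKS and not lyrics:
        raise ValueError(f"lyrics are required for task {task!r}")
    payload = {"prompt": prompt, "audio_duration": audio_duration}
    if lyrics:
        payload["lyrics"] = lyrics
    return urllib.request.Request(
        url=f"{BASE_URL}/{task}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job; keeping construction separate from sending makes the payload easy to inspect first.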
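Separate audio into individual instrument/vocal tracks.
| Mode | Stems | Output | Use case |
|---|---|---|---|
| single | 1 | Specified stem only | Vocal isolation, drum extraction |
| two | 2 | Vocals + instrumental | Karaoke tracks |
| four | 4 | Vocals, drums, bass, other | Standard remixing (default) |
| six | 6 | + guitar, piano | Full instrument separation |
| producer | 8 | + kick, snare, hi-hat | Beat production |
| studio | 12 | + cymbals, sub-bass, synths | Professional mixing |
| mastering | 16 | Maximum detail | Audio analysis |
Single-stem options: vocals, drums, bass, guitar, piano, other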
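The mode table and the single-stem options can be encoded as a small client-side check before submitting a job. This is only a sketch: `build_extract_fields` is a hypothetical helper, and rejecting `stem` for modes other than `single` is an assumption rather than documented API behavior.

```python
from typing import Dict, Optional

# Stem counts per mode, copied from the table above.
MODE_STEM_COUNTS = {
    "single": 1, "two": 2, "four": 4, "six": 6,
    "producer": 8, "studio": 12, "mastering": 16,
}
SINGLE_STEMS = {"vocals", "drums", "bass", "guitar", "piano", "other"}


def build_extract_fields(mode: str = "four",
                         stem: Optional[str] = None) -> Dict[str, str]:
    """Validate mode/stem and return the form fields for an extract request."""
    if mode not in MODE_STEM_COUNTS:
        raise ValueError(f"unknown mode {mode!r}")
    fields = {"mode": mode}
    if mode == "single":
        if stem not in SINGLE_STEMS:
            raise ValueError("mode 'single' needs one of: " + ", ".join(sorted(SINGLE_STEMS)))
        fields["stem"] = stem
    elif stem is not None:
        # Assumption: the stem field is only meaningful for single-stem mode.
        raise ValueError("stem is only used with mode='single'")
    return fields
```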
# Sync: extract and wait for the result
result = client.stems.separate(
    url="https://youtube.com/watch?v=VIDEO_ID",
    mode="six",
    timeout=600
)
for stem, url in result["download_urls"].items():
    print(f"{stem}: {url}")
# From a local file
result = client.stems.separate(file="/path/to/song.mp3", mode="four")
# Single-stem extraction
result = client.stems.separate(
    url="https://youtube.com/watch?v=ID",
    mode="single",
    stem="vocals"
)
# Async: submit, then poll
job = client.stems.extract(url="https://youtube.com/watch?v=ID", mode="six")
print(f"Job ID: {job['id']}")
status = client.stems.status(job["id"])
# Or wait:
result = client.stems.wait_for_completion(job["id"], timeout=600)
# List the available modes
modes = client.stems.modes()
# Job management
jobs = client.stems.list(skip=0, limit=50, status="COMPLETED")
job = client.stems.get(job_id=1234)
client.stems.delete(job_id=1234)
# Extract from a URL
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=https://youtube.com/watch?v=VIDEO_ID" \
  -F "mode=six"
# Extract from a file
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "file=@/path/to/song.mp3" \
  -F "mode=four"
# Single-stem extraction
curl -X POST "https://api.audiopod.ai/api/v1/stem-extraction/api/extract" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "url=URL" \
  -F "mode=single" \
  -F "stem=vocals"
# Check job status
curl "https://api.audiopod.ai/api/v1/stem-extraction/status/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# List the available modes
curl "https://api.audiopod.ai/api/v1/stem-extraction/modes" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# List jobs (filter by status: PENDING, PROCESSING, COMPLETED, FAILED)
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs?skip=0&limit=50&status=COMPLETED" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Get a specific job
curl "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Delete a job
curl -X DELETE "https://api.audiopod.ai/api/v1/stem-extraction/jobs/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
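For scripts that page through jobs, the list URL from the curl example can be assembled with `urllib.parse.urlencode`. `stem_jobs_url` is a hypothetical helper; the status values are the four named in the comment above, and the bounds check on `skip` and `limit` is an assumption.

```python
from typing import Optional
from urllib.parse import urlencode

JOBS_URL = "https://api.audiopod.ai/api/v1/stem-extraction/jobs"
JOB_STATUSES = {"PENDING", "PROCESSING", "COMPLETED", "FAILED"}


def stem_jobs_url(skip: int = 0, limit: int = 50,
                  status: Optional[str] = None) -> str:
    """Build the job-listing URL with pagination and an optional status filter."""
    if skip < 0 or limit <= 0:
        raise ValueError("skip must be >= 0 and limit must be > 0")
    params = {"skip": skip, "limit": limit}
    if status is not None:
        if status not in JOB_STATUSES:
            raise ValueError(f"unknown status {status!r}")
        params["status"] = status
    return f"{JOBS_URL}?{urlencode(params)}"
```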
{
"id": 1234,
"status": "COMPLETED",
"download_urls": {
"vocals": "https://...",
"drums": "https://...",
"bass": "https://...",
"other": "https://..."
},
"quality_scores": {
"vocals": 0.95,
"drums": 0.88
}
}
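A response shaped like the one above can be filtered by quality score before downloading anything. The sketch below is illustrative only: `stems_above` is a hypothetical helper, and it assumes scores lie between 0 and 1 with higher meaning better, which the example suggests but does not state. Stems without a score are skipped.

```python
from typing import Dict


def stems_above(job: dict, threshold: float) -> Dict[str, str]:
    """Return download URLs for stems whose quality score meets the threshold."""
    if job.get("status") != "COMPLETED":
        return {}
    urls = job.get("download_urls", {})
    scores = job.get("quality_scores", {})
    # Stems with no reported score are left out rather than guessed at.
    return {name: url for name, url in urls.items()
            if name in scores and scores[name] >= threshold}
```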
Generate speech from text with 50+ voices across 60+ languages. Voice cloning is supported.
# Generate speech and wait for the result
result = client.voice.generate(
    text="Hello, world! This is a test.",
    voice_id=123,
    speed=1.0
)
print(result["output_url"])
# Async: submit, then poll
job = client.voice.speak(
    text="Hello world",
    voice_id=123,
    speed=1.0
)
status = client.voice.get_job(job["id"])
result = client.voice.wait_for_completion(job["id"], timeout=300)
# List all available voices
voices = client.voice.list()
for v in voices:
    print(f"{v['id']}: {v['name']}")
# Clone a voice (requires an audio sample of about 5 seconds)
new_voice = client.voice.create(
    name="My voice clone",
    audio_file="./sample.mp3",
    description="Cloned from a recording"
)
# Get/delete a voice
voice = client.voice.get(voice_id=123)
client.voice.delete(voice_id=123)
# List all voices
curl "https://api.audiopod.ai/api/v1/voice/voice-profiles" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Generate speech (form data, not JSON!)
curl -X POST "https://api.audiopod.ai/api/v1/voice/voices/{VOICE_UUID}/generate" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY" \
  -d "input_text=Hello world, this is a test" \
  -d "audio_format=mp3" \
  -d "speed=1.0"
# Poll job status
curl "https://api.audiopod.ai/api/v1/voice/tts-jobs/{JOB_ID}/status" \
  -H "Authorization: Bearer $AUDIOPOD_API_KEY"
# SDK-style endpoints (alternative)
# Generate via the SDK endpoint
curl -X POST "https://api.audiopod.ai/api/v1/voice/tts/generate" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text":"Hello world","voice_id":123,"speed":1.0}'
# Poll via the SDK endpoint
curl "https://api.audiopod.ai/api/v1/voice/tts/status/JOB_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# List voices (SDK endpoint)
curl "https://api.audiopod.ai/api/v1/voice/voices" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
# Clone a voice
curl -X POST "https://api.audiopod.ai/api/v1/voice/voices" \
  -H "X-API-Key: $AUDIOPOD_API_KEY" \
  -F "name=My voice" \
  -F "file=@sample.mp3" \
  -F "description=Cloned voice"
# Delete a voice
curl -X DELETE "https://api.audiopod.ai/api/v1/voice/voices/VOICE_ID" \
  -H "X-API-Key: $AUDIOPOD_API_KEY"
| Field | Required | Description |
|---|---|---|
| input_text | Yes | Text to speak (max 5000 characters). Raw HTTP uses input_text; the SDK uses text |
| audio_format | No | mp3, wav, ogg (default: mp3) |
| speed | No | 0.25 - 4.0 (default: 1.0) |
| language | No | ISO language code; auto-detected if omitted |
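The limits in this table are easy to enforce on the client before a request is sent. The following is a minimal sketch for the raw HTTP form endpoint; `build_tts_form` is a hypothetical helper, and whether the server rejects or clamps out-of-range values is not stated in the source.

```python
from typing import Dict, Optional

ALLOWED_FORMATS = {"mp3", "wav", "ogg"}
MAX_TEXT_LENGTH = 5000


def build_tts_form(input_text: str, audio_format: str = "mp3",
                   speed: float = 1.0,
                   language: Optional[str] = None) -> Dict[str, str]:
    """Validate TTS parameters and return form fields for the raw HTTP endpoint."""
    if not input_text:
        raise ValueError("input_text is required")
    if len(input_text) > MAX_TEXT_LENGTH:
        raise ValueError(f"input_text exceeds {MAX_TEXT_LENGTH} characters")
    if audio_format not in ALLOWED_FORMATS:
        raise ValueError(f"audio_format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    form = {"input_text": input_text, "audio_format": audio_format, "speed": str(speed)}
    if language:
        # Omitting language leaves detection to the service.
        form["language"] = language
    return form
```

The resulting dictionary can be passed through `urllib.parse.urlencode` to produce the same form-encoded body as the curl `-d` flags.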
// Generation response
{"job_id": 12345, "status": "pending", "credits_reserved": 25}
// Status response (completed)
{"status": "completed", "output_url": "https://r2-url/generated.mp3"}
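Given the two response shapes above, a polling loop could look roughly like this. `wait_for_tts` is a hypothetical helper that takes any callable returning the parsed status JSON, so it is independent of the HTTP library used. Treating a `failed` status as terminal is an assumption; the source only shows `pending` and `completed` for TTS jobs.

```python
import time
from typing import Callable


def wait_for_tts(fetch_status: Callable[[], dict], timeout: float = 300.0,
                 interval: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll fetch_status() until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        state = str(status.get("status", "")).lower()
        if state == "completed":
            return status
        if state == "failed":
            # Assumption: failed jobs report status "failed".
            raise RuntimeError(f"TTS job failed: {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError("TTS job did not complete in time")
        sleep(interval)
```

Injecting `sleep` keeps the loop testable without real delays.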
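Raw HTTP requests use input_text rather than text.
The SDK endpoint (/api/v1/voice/tts/generate) uses JSON, with the field named text.
To convert the output to AAC, use ffmpeg -i output.mp3 -c:a aac real.m4a.

Separate the different speakers in an audio file through automatic speech segmentation.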
```python
# Identify speakers in a local file and wait for the result
result = client.speaker.identify(
    file="./meeting.mp3",
    num_speakers=3,  # optional hint that improves accuracy
    timeout=600
)
for segment in result["segments"]:
    print(f"Speaker {segment['speaker']}: {segment['text']} [{segment['start']:.1f}s - {segment['end']:.1f}s]")
# From a URL
result = client.speaker.identify(
    url="https://youtube.com/watch?v=VIDEO_ID",
    num_speakers=2
)
# Async: submit, then poll
job = client.speaker.diarize(
    file="./meeting.mp3",
    num_speakers=3
)
result = client.speaker.wait_for_completion(job["id"], timeout=600)
# Job management
jobs = client.speaker.list(skip=0, limit=50, status="COMPLETED")
job = client.speaker.get(j