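name: upstage-document-parse
description: Parse documents (PDF, images, DOCX, PPTX, XLSX, HWP) with the Upstage Document Parse API. Extracts text, tables, figures, and layout elements with bounding boxes. Use when the user asks to parse, extract, or analyze document content, convert documents to Markdown/HTML, or extract structured data from PDFs and images.
homepage: https://console.upstage.ai/api/document-digitization/document-parsing
metadata: {"openclaw":{"emoji":"📑","requires":{"bins":["curl"],"env":["UPSTAGE_API_KEY"]},"primaryEnv":"UPSTAGE_API_KEY"}}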
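Extract structured content from documents with the Upstage Document Parse API.
Supported formats: PDF (up to 1000 pages with asynchronous processing), PNG, JPG, JPEG, TIFF, BMP, GIF, WEBP, DOCX, PPTX, XLSX, HWP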
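Checking the file type before uploading avoids a wasted request. The following is a minimal sketch, assuming the extension list above is complete; the helper name `is_supported` is hypothetical and not part of the skill or the API.

```python
from pathlib import Path

# Extensions copied from the supported-format list above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".png", ".jpg", ".jpeg", ".tiff", ".bmp",
    ".gif", ".webp", ".docx", ".pptx", ".xlsx", ".hwp",
}


def is_supported(path: str) -> bool:
    """Return True if the file extension is in the supported list (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

The check looks only at the file name, so a renamed or corrupted file would still pass it.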
clawhub install upstage-document-parse
openclaw config set skills.entries.upstage-document-parse.apiKey "your-api-key"
Or add it to ~/.openclaw/openclaw.json:
{
  "skills": {
    "entries": {
      "upstage-document-parse": {
        "apiKey": "your-api-key"
      }
    }
  }
}
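A script that calls the API directly needs to find the same key. Below is a minimal sketch that reads it from the `UPSTAGE_API_KEY` environment variable and falls back to the config file shown above. The function name `load_api_key` and the lookup order (environment first) are assumptions for illustration, not documented behavior of OpenClaw.

```python
import json
import os
from pathlib import Path


def load_api_key(config_path="~/.openclaw/openclaw.json", env=None):
    """Return the API key from the environment, else from the config file, else None."""
    env = os.environ if env is None else env
    key = env.get("UPSTAGE_API_KEY")
    if key:
        return key
    path = Path(config_path).expanduser()
    if not path.is_file():
        return None
    config = json.loads(path.read_text(encoding="utf-8"))
    # Walk skills -> entries -> upstage-document-parse, tolerating missing levels.
    entry = config.get("skills", {}).get("entries", {}).get("upstage-document-parse", {})
    return entry.get("apiKey")
```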
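Just ask the agent to parse your document:
"Parse this PDF: ~/Documents/report.pdf"
"Parse: ~/Documents/report.jpg"
The synchronous API is suited to small documents (fewer than 20 pages recommended).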
| Parameter | Type | Default | Description |
|---|---|---|---|
| model | string | required | Use document-parse (latest) or document-parse-nightly |
| document | file | required | Document file to parse |
| mode | string | standard | standard (text-focused), enhanced (complex tables/images), auto |
| ocr | string | auto | auto (images only) or force (always run OCR) |
| output_formats | string | ['html'] | text, html, markdown (array format) |
| coordinates | boolean | true | Include bounding box coordinates |
| base64_encoding | string | [] | Elements to base64-encode: ["table"], ["figure"], etc. |
| chart_recognition | boolean | true | Convert charts to tables (Beta) |
| merge_multipage_tables | boolean | false | Merge tables across pages (Beta; max 20 pages when enabled) |
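Because the request is sent as multipart form data, array-valued parameters such as `output_formats` travel as strings like `['html', 'markdown']`. The sketch below shows one way to build those form fields from Python values. The helper `build_form_fields` is hypothetical, and sending booleans as lowercase `true`/`false` is an assumption based on the defaults in the table.

```python
def build_form_fields(model="document-parse", **options):
    """Turn Python values into the string form fields used in multipart requests."""
    fields = {"model": model}
    for name, value in options.items():
        if isinstance(value, bool):
            # Assumption: booleans are sent as lowercase strings.
            fields[name] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # Lists use Python's repr, e.g. "['html', 'markdown']".
            fields[name] = str(list(value))
        else:
            fields[name] = str(value)
    return fields


fields = build_form_fields(mode="enhanced", output_formats=["html", "markdown"])
# fields["output_formats"] is the string "['html', 'markdown']"
```

The resulting dictionary can be passed as the `data=` argument of `requests.post`, alongside `files={"document": ...}`.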
curl -X POST "https://api.upstage.ai/v1/document-digitization" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@/path/to/file.pdf" \
-F "model=document-parse"
curl -X POST "https://api.upstage.ai/v1/document-digitization" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@report.pdf" \
-F "model=document-parse" \
-F "output_formats=['markdown']"
curl -X POST "https://api.upstage.ai/v1/document-digitization" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@complex.pdf" \
-F "model=document-parse" \
-F "mode=enhanced" \
-F "output_formats=['html', 'markdown']"
curl -X POST "https://api.upstage.ai/v1/document-digitization" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@scan.pdf" \
-F "model=document-parse" \
-F "ocr=force"
curl -X POST "https://api.upstage.ai/v1/document-digitization" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@invoice.pdf" \
-F "model=document-parse" \
-F "base64_encoding=['table']"
{
  "api": "2.0",
  "model": "document-parse-251217",
  "content": {
    "html": "<h1>...</h1>",
    "markdown": "# ...",
    "text": "..."
  },
  "elements": [
    {
      "id": 0,
      "category": "heading1",
      "content": { "html": "...", "markdown": "...", "text": "..." },
      "page": 1,
      "coordinates": [{"x": 0.06, "y": 0.05}, ...]
    }
  ],
  "usage": { "pages": 1 }
}
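The `coordinates` values in the sample response look like fractions of the page size rather than pixels. Assuming that is the case, and assuming each element carries a list of corner points, the following sketch converts them to a pixel bounding box for a rendered page image. The function name `to_pixel_box` is hypothetical.

```python
def to_pixel_box(coordinates, page_width, page_height):
    """Convert relative corner points to a (left, top, right, bottom) pixel box.

    Assumption: x and y are fractions of the page width and height.
    """
    xs = [point["x"] * page_width for point in coordinates]
    ys = [point["y"] * page_height for point in coordinates]
    return (round(min(xs)), round(min(ys)), round(max(xs)), round(max(ys)))


corners = [
    {"x": 0.1, "y": 0.2},
    {"x": 0.5, "y": 0.2},
    {"x": 0.5, "y": 0.4},
    {"x": 0.1, "y": 0.4},
]
box = to_pixel_box(corners, page_width=1000, page_height=2000)
```

Taking the minimum and maximum of the corner points keeps the result correct regardless of the order in which the corners are listed.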
Element categories: paragraph, heading1, heading2, heading3, list, table, figure, chart, equation, caption, header, footer, index, footnote
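The `category` field makes it straightforward to pull one kind of element out of a response. A minimal sketch, using only the fields shown in the sample response; the helper name `elements_by_category` and the sample data are illustrative.

```python
def elements_by_category(response, category, fmt="markdown"):
    """Return the content (in the given format) of every element with the given category."""
    return [
        element["content"].get(fmt, "")
        for element in response.get("elements", [])
        if element.get("category") == category
    ]


sample = {
    "elements": [
        {"id": 0, "category": "heading1", "content": {"markdown": "# Report"}, "page": 1},
        {"id": 1, "category": "table", "content": {"markdown": "| a | b |"}, "page": 1},
        {"id": 2, "category": "paragraph", "content": {"markdown": "Body text"}, "page": 2},
    ]
}
tables = elements_by_category(sample, "table")
```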
The asynchronous API handles documents of up to 1000 pages. Documents are processed in batches of 10 pages.
curl -X POST "https://api.upstage.ai/v1/document-digitization/async" \
-H "Authorization: Bearer $UPSTAGE_API_KEY" \
-F "document=@large.pdf" \
-F "model=document-parse" \
-F "output_formats=['markdown']"
Response:
{"request_id": "uuid-here"}
curl "https://api.upstage.ai/v1/document-digitization/requests/{request_id}" \
-H "Authorization: Bearer $UPSTAGE_API_KEY"
The response contains a download_url for each batch (valid for 30 days).
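Since results arrive per batch, a client has to gather the URLs in page order before downloading. The sketch below assumes the status response holds a `batches` list whose items carry `download_url` and `start_page`; those field names other than `download_url` are assumptions and should be checked against a real response. The helper name `batch_download_urls` is hypothetical.

```python
def batch_download_urls(status_response):
    """Return the batch download URLs ordered by starting page.

    Assumption: the response has a "batches" list with "start_page" and
    "download_url" keys; batches without a URL yet are skipped.
    """
    batches = status_response.get("batches", [])
    ordered = sorted(batches, key=lambda batch: batch.get("start_page", 0))
    return [batch["download_url"] for batch in ordered if batch.get("download_url")]


example = {
    "status": "completed",
    "batches": [
        {"start_page": 11, "download_url": "https://example.invalid/batch-1"},
        {"start_page": 1, "download_url": "https://example.invalid/batch-0"},
        {"start_page": 21},
    ],
}
urls = batch_download_urls(example)
```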
curl "https://api.upstage.ai/v1/document-digitization/requests" \
-H "Authorization: Bearer $UPSTAGE_API_KEY"
submitted: request received
started: processing
completed: ready for download
failed: an error occurred (check failure_message)
import requests
import time

api_key = "up_xxx"

# Synchronous API
with open("doc.pdf", "rb") as f:
    response = requests.post(
        "https://api.upstage.ai/v1/document-digitization",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"document": f},
        data={"model": "document-parse", "output_formats": "['markdown']"}
    )
print(response.json()["content"]["markdown"])

# Asynchronous API (for large documents)
with open("large.pdf", "rb") as f:
    r = requests.post(
        "https://api.upstage.ai/v1/document-digitization/async",
        headers={"Authorization": f"Bearer {api_key}"},
        files={"document": f},
        data={"model": "document-parse"}
    )
request_id = r.json()["request_id"]

# Poll until the request finishes
while True:
    status = requests.get(
        f"https://api.upstage.ai/v1/document-digitization/requests/{request_id}",
        headers={"Authorization": f"Bearer {api_key}"}
    ).json()
    if status["status"] == "completed":
        break
    if status["status"] == "failed":
        # Stop instead of polling forever; failure_message explains the error
        raise RuntimeError(status.get("failure_message"))
    time.sleep(5)
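A polling loop is easier to reuse and test when the HTTP call is passed in and a timeout bounds the wait. The following is a sketch under those assumptions: `wait_for_completion` and its parameters are hypothetical, and it relies only on the `status` values and `failure_message` field listed above.

```python
import time


def wait_for_completion(fetch_status, interval=5.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the request completes, fails, or the timeout expires.

    fetch_status is any zero-argument callable returning the status JSON as a dict.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(status.get("failure_message") or "document parsing failed")
        if clock() >= deadline:
            raise TimeoutError(f"request still '{state}' after {timeout} seconds")
        sleep(interval)
```

In practice `fetch_status` would wrap the `requests.get(...).json()` call on the request status endpoint; in tests it can be a stub that returns canned dictionaries.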
from langchain_upstage import UpstageDocumentParseLoader

loader = UpstageDocumentParseLoader(
    file_path="document.pdf",
    output_format="markdown",
    ocr="auto"
)
docs = loader.load()
You can also set the API key as an environment variable:
export UPSTAGE_API_KEY="your-api-key"
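mode=enhanced handles complex tables and images
mode=auto lets the API choose the mode for each page
ocr=force always runs OCR, for example on scanned documents
merge_multipage_tables=true merges tables that span pages (max 20 pages in enhanced mode)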
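The page limits mentioned in this document can be turned into a small pre-flight check. The sketch below picks the synchronous endpoint for documents under 20 pages and the asynchronous one otherwise, and rejects requests that exceed the stated limits. The function name `choose_endpoint` is hypothetical, the 20-page cut-off is only a recommendation in the text, and applying the table-merging limit in every mode is a conservative assumption.

```python
SYNC_URL = "https://api.upstage.ai/v1/document-digitization"
ASYNC_URL = "https://api.upstage.ai/v1/document-digitization/async"


def choose_endpoint(page_count, merge_multipage_tables=False):
    """Pick an endpoint from the page count and validate the documented limits."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    if page_count > 1000:
        raise ValueError("asynchronous processing supports at most 1000 pages")
    if merge_multipage_tables and page_count > 20:
        # Conservative assumption: enforce the 20-page limit regardless of mode.
        raise ValueError("merge_multipage_tables is limited to 20 pages")
    # The synchronous API is recommended for fewer than 20 pages.
    return SYNC_URL if page_count < 20 else ASYNC_URL
```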