Document AI
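Document AI uses vision models (for example GPT-4o) to extract structured data from PDFs and images: invoices, forms, receipts, contracts, or any document that contains visible text.

The SDK exposes two namespaces:

- client.chatAi / client.chat_ai: direct document processing and model listing
- client.documentAi / client.document_ai: agent-based document processing, including schema management, agent CRUD, and end-to-end orchestration

client.chatAi: Process documents

List available models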
TypeScript:

    const models = await client.chatAi.listDocumentModels();

Python:

    models = client.chat_ai.list_document_models()

Direct document processing
TypeScript:

    const result = await client.chatAi.processDocument({
      modelName: "gpt-4o",
      url: "https://example.com/invoice.pdf",
      organizationId: "org_xxx",
    });

Python:

    result = client.chat_ai.process_document(
        model_name="gpt-4o",
        url="https://example.com/invoice.pdf",
        organization_id="org_xxx",
    )

Parameter reference
| Parameter | TS field | Python parameter | Type | Required |
|---|---|---|---|---|
| Model name | modelName | model_name | string / str | Yes |
| Document URL | url | url | string / str | Yes |
| Organization ID | organizationId | organization_id | string / str | Yes |
| Board ID | boardId | board_id | string / str | No |
| Language | language | language | string / str | No |
| Additional instructions | additionalInstructions | additional_instructions | string / str | No |
| Additional document instructions | additionalDocumentInstructions | additional_document_instructions | string / str | No |
| Processing model name | processModelName | process_model_name | string / str | No |
| URL of the file to fill | fileUrlToFill | file_url_to_fill | string / str | No |
| Tools | tools | tools | Record<string, unknown>[] / List[Dict] | No |
| UTC offset | utc | utc | number / int | No |
| Chunk size | chunkSize | chunk_size | number / int | No |
| Max concurrency | maxConcurrent | max_concurrent | number / int | No |
| Max retries | maxRetries | max_retries | number / int | No |
| Enhanced processing | useEnhancedProcessing | use_enhanced_processing | boolean / bool | No |
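
The table pairs each TypeScript field with a Python parameter that differs only in casing. As a minimal sketch of that correspondence, the helpers below turn Python-style arguments into the TS field names listed above and check that the three required parameters are present. The names to_camel_case and build_process_document_fields are hypothetical and not part of the SDK. Whether the SDK performs a conversion like this internally is not stated here.

```python
# Hypothetical helpers for illustration only; they are not part of the SDK.
REQUIRED_PARAMS = ("model_name", "url", "organization_id")


def to_camel_case(name: str) -> str:
    """Convert a snake_case parameter name to its camelCase field name."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def build_process_document_fields(**params) -> dict:
    """Drop unset (None) options, check required ones, and camelCase the keys."""
    given = {key: value for key, value in params.items() if value is not None}
    missing = [key for key in REQUIRED_PARAMS if key not in given]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return {to_camel_case(key): value for key, value in given.items()}
```

Calling build_process_document_fields(model_name="gpt-4o", url="https://example.com/invoice.pdf", organization_id="org_xxx", chunk_size=4) yields a dict keyed by modelName, url, organizationId and chunkSize.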
PDF form filling example
TypeScript:

    const result = await client.chatAi.processDocument({
      modelName: "gpt-4o",
      url: "https://example.com/blank-invoice.pdf",
      organizationId: "org_xxx",
      fileUrlToFill: "https://example.com/blank-invoice.pdf",
      language: "en",
    });

    if (result.success && result.data.filledPdfUrl) {
      console.log("Filled PDF:", result.data.filledPdfUrl);
    }

Python:

    result = client.chat_ai.process_document(
        model_name="gpt-4o",
        url="https://example.com/blank-invoice.pdf",
        organization_id="org_xxx",
        file_url_to_fill="https://example.com/blank-invoice.pdf",
        language="en",
    )

    if result["success"] and result["data"].get("filledPdfUrl"):
        print("Filled PDF:", result["data"]["filledPdfUrl"])

client.documentAi: Agent CRUD
A Document AI agent stores an extraction schema, instructions, and model settings so they can be reused.
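
Because agents are stored for reuse, a script will often want to look one up before creating a duplicate. The following is a minimal sketch, not part of the SDK: find_agent_by_name and get_or_create_agent are hypothetical helpers. The sketch assumes that list_agents returns a list of dicts carrying a name key and that create_agent returns the created agent. It relies only on the list_agents(name_contains=...) and create_agent(...) calls documented in this section.

```python
from typing import Any, Dict, Optional


def find_agent_by_name(client: Any, name: str) -> Optional[Dict[str, Any]]:
    """Return the first agent whose name matches exactly, or None.

    name_contains is a substring filter, so results are re-checked here.
    Assumption: each agent is a dict with a "name" key.
    """
    for agent in client.document_ai.list_agents(name_contains=name):
        if agent.get("name") == name:
            return agent
    return None


def get_or_create_agent(
    client: Any, name: str, instructions: str, model_id: str, schema: Dict[str, Any]
) -> Dict[str, Any]:
    """Reuse an existing agent with this exact name, otherwise create one."""
    existing = find_agent_by_name(client, name)
    if existing is not None:
        return existing
    # Assumption: create_agent returns the newly created agent.
    return client.document_ai.create_agent(
        name=name, instructions=instructions, model_id=model_id, schema=schema
    )
```

The exact-match check matters because a substring filter such as "Invoice" would also return an agent named "Invoice Extractor v2".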
List agents
TypeScript:

    // All agents
    const agents = await client.documentAi.listAgents();

    // Filter by name
    const filtered = await client.documentAi.listAgents({ nameContains: "Invoice" });

    // Document AI agents only (created via createFull or the webapp)
    const docAiAgents = await client.documentAi.listAgents({ documentAiOnly: true });

Python:

    # All agents
    agents = client.document_ai.list_agents()

    # Filter by name
    filtered = client.document_ai.list_agents(name_contains="Invoice")

    # Document AI agents only
    doc_ai_agents = client.document_ai.list_agents(document_ai_only=True)

Get an agent
TypeScript:

    const agent = await client.documentAi.getAgent("agent_id");

Python:

    agent = client.document_ai.get_agent("agent_id")

Create an agent
TypeScript:

    const agent = await client.documentAi.createAgent({
      name: "Invoice Extractor",
      instructions: "Extract invoice fields. Dates as YYYY-MM-DD.",
      model_id: "gpt-4o",
      schema: {
        invoice_number: { type: "string", description: "Invoice ID" },
        total: { type: "number" },
        date: { type: "string", format: "date" },
      },
    });

Python:

    agent = client.document_ai.create_agent(
        name="Invoice Extractor",
        instructions="Extract invoice fields. Dates as YYYY-MM-DD.",
        model_id="gpt-4o",
        schema={
            "invoice_number": {"type": "string", "description": "Invoice ID"},
            "total": {"type": "number"},
            "date": {"type": "string", "format": "date"},
        },
    )

Update an agent
TypeScript:

    await client.documentAi.updateAgent("agent_id", {
      name: "Invoice Extractor v2",
      instructions: "Updated extraction logic.",
    });

Python:

    client.document_ai.update_agent("agent_id", {
        "name": "Invoice Extractor v2",
        "instructions": "Updated extraction logic.",
    })

Delete an agent
TypeScript:

    await client.documentAi.deleteAgent("agent_id");

Python:

    client.document_ai.delete_agent("agent_id")

client.documentAi: Process with an agent
Process a document with a configured agent. The model and instructions are looked up from the agent.
TypeScript:

    const result = await client.documentAi.process({
      agentId: "agent_id",
      url: "https://example.com/invoice.pdf",
      organizationId: "org_xxx",
    });

Python:

    result = client.document_ai.process(
        agent_id="agent_id",
        url="https://example.com/invoice.pdf",
        organization_id="org_xxx",
    )

You can also override the agent's model or instructions:
TypeScript:

    const result = await client.documentAi.process({
      agentId: "agent_id",
      url: "https://example.com/invoice.pdf",
      organizationId: "org_xxx",
      modelName: "gpt-4o",            // overrides the agent's model
      instructions: "Custom prompt",  // overrides the agent's instructions
    });

Python:

    result = client.document_ai.process(
        agent_id="agent_id",
        url="https://example.com/invoice.pdf",
        organization_id="org_xxx",
        model_name="gpt-4o",            # overrides the agent's model
        instructions="Custom prompt",   # overrides the agent's instructions
    )

client.documentAi: Suggest a schema
Automatically propose a JSON schema by analyzing a sample document.
TypeScript:

    const schema = await client.documentAi.suggestSchema({
      url: "https://example.com/invoice.pdf",
      organizationId: "org_xxx",
      modelName: "gpt-4o", // optional, defaults to "gpt-4o"
    });

Python:

    schema = client.document_ai.suggest_schema(
        url="https://example.com/invoice.pdf",
        organization_id="org_xxx",
        model_name="gpt-4o",  # optional, defaults to "gpt-4o"
    )

client.documentAi: Full creation (Orchestrator)
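End-to-end Document AI agent creation, following the approach used by the iMBRACE webapp: it creates a board that holds the extraction schema, then creates a UseCase and an AI Agent linked to that board.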
TypeScript:

    const result = await client.documentAi.createFull({
      name: "Invoice Extractor",
      instructions: "Extract invoice fields. Dates as YYYY-MM-DD.",
      schemaFields: [
        { name: "invoice_number", type: "ShortText", description: "Invoice ID" },
        { name: "total", type: "Number", description: "Total amount" },
        { name: "date", type: "Date", description: "Invoice date" },
      ],
      modelId: "gpt-4o",
      providerId: "system",
    });

    console.log(result.board_id);    // "brd_xxx"
    console.log(result.ai_agent_id); // UUID of the created AI Agent
    console.log(result.usecase_id);  // UUID of the created UseCase

Python:

    result = client.document_ai.create_full(
        name="Invoice Extractor",
        instructions="Extract invoice fields. Dates as YYYY-MM-DD.",
        schema_fields=[
            {"name": "invoice_number", "type": "ShortText", "description": "Invoice ID"},
            {"name": "total", "type": "Number", "description": "Total amount"},
            {"name": "date", "type": "Date", "description": "Invoice date"},
        ],
        model_id="gpt-4o",
        provider_id="system",
    )

    print(result["board_id"])     # "brd_xxx"
    print(result["ai_agent_id"])  # UUID of the created AI Agent

Full creation options
| Parameter | TS field | Python parameter | Type | Default |
|---|---|---|---|---|
| Name | name | name | string / str | — |
| Instructions | instructions | instructions | string / str | — |
| Schema fields | schemaFields | schema_fields | CreateBoardFieldInput[] / List[Dict] | — |
| Model ID | modelId | model_id | string / str | — |
| Provider ID | providerId | provider_id | string / str | — |
| Description | description | description | string / str | None |
| VLM model | vlmModel | vlm_model | string / str | modelId |
| VLM provider | vlmProviderId | vlm_provider_id | string / str | providerId |
| Source languages | sourceLanguages | source_languages | string[] / List[str] | ["English"] |
| Handwriting support | handwritingSupport | handwriting_support | boolean / bool | false |
| Time offset | timeOffset | time_offset | string / str | "UTC+00:00" |
| Continue on failure | continueOnFailure | continue_on_failure | boolean / bool | false |
| Retry time | retryTime | retry_time | number / int | 2 |
| Temperature | temperature | temperature | number / float | 0.1 |
| Demo URL | demoUrl | demo_url | string / str | None |
| Team IDs | teamIds | team_ids | string[] / List[str] | [] |
| Extra AI Agent fields | extraAiAgent | extra_ai_agent | Record<string, unknown> / Dict | None |
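
Two defaults in the table are derived rather than fixed: the VLM model falls back to the model ID, and the VLM provider falls back to the provider ID. The sketch below illustrates how the listed defaults combine with caller-supplied options. It is an illustration of the table, not the SDK's implementation. resolve_create_full_options is a hypothetical helper, and treating the five parameters without a default as required is an assumption.

```python
import copy
from typing import Any, Dict

# Fixed defaults copied from the options table above.
STATIC_DEFAULTS: Dict[str, Any] = {
    "description": None,
    "source_languages": ["English"],
    "handwriting_support": False,
    "time_offset": "UTC+00:00",
    "continue_on_failure": False,
    "retry_time": 2,
    "temperature": 0.1,
    "demo_url": None,
    "team_ids": [],
    "extra_ai_agent": None,
}

# Assumption: parameters with no listed default must be supplied.
ASSUMED_REQUIRED = ("name", "instructions", "schema_fields", "model_id", "provider_id")


def resolve_create_full_options(**options: Any) -> Dict[str, Any]:
    """Merge caller options over the table defaults (hypothetical helper)."""
    missing = [key for key in ASSUMED_REQUIRED if key not in options]
    if missing:
        raise ValueError(f"missing options: {', '.join(missing)}")
    # Deep copy so list defaults are never shared between calls.
    resolved = copy.deepcopy(STATIC_DEFAULTS)
    resolved.update(options)
    # Derived defaults: the VLM settings fall back to the main model settings.
    resolved.setdefault("vlm_model", resolved["model_id"])
    resolved.setdefault("vlm_provider_id", resolved["provider_id"])
    return resolved
```

With only the five assumed-required options supplied, the resolved dict carries vlm_model equal to model_id and vlm_provider_id equal to provider_id, alongside the fixed defaults.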
Async usage (Python)
    import asyncio

    from imbrace import AsyncImbraceClient


    async def main():
        async with AsyncImbraceClient() as client:
            # Direct processing (chat_ai)
            models = await client.chat_ai.list_document_models()
            result = await client.chat_ai.process_document(
                model_name="gpt-4o",
                url="https://example.com/invoice.pdf",
                organization_id="org_xxx",
            )

            # Agent-based processing (document_ai)
            agents = await client.document_ai.list_agents(document_ai_only=True)
            result2 = await client.document_ai.process(
                agent_id=agents[0]["_id"],
                url="https://example.com/receipt.pdf",
                organization_id="org_xxx",
            )


    # "async with" must run inside a coroutine, so wrap it and start the event loop.
    asyncio.run(main())

See also
- Full Flow Guide §3, Knowledge Hub: upload files for use with RAG
- AI Agent, Embeddings and Knowledge Base: manage the embedding files used for retrieval