
Document AI

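Document AI uses vision models (such as GPT-4o) to extract structured data from PDFs and images: invoices, forms, receipts, contracts, or any document with visible text.

The SDK exposes two namespaces:

  • client.chatAi / client.chat_ai: direct document processing and model listing
  • client.documentAi / client.document_ai: agent-based document processing, with schema management, agent CRUD, and end-to-end orchestration

Initialize the client first (see Installation and Quick Start).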


client.chatAi: processing documents

List available models

const models = await client.chatAi.listDocumentModels();

Direct document processing

const result = await client.chatAi.processDocument({
  modelName: "gpt-4o",
  url: "https://example.com/invoice.pdf",
  organizationId: "org_xxx",
});

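Parameter reference

| Parameter | TS field | Python argument | Type | Required |
| --- | --- | --- | --- | --- |
| Model name | modelName | model_name | string / str | |
| Document URL | url | url | string / str | |
| Organization ID | organizationId | organization_id | string / str | |
| Board ID | boardId | board_id | string / str | |
| Language | language | language | string / str | |
| Additional instructions | additionalInstructions | additional_instructions | string / str | |
| Additional document instructions | additionalDocumentInstructions | additional_document_instructions | string / str | |
| Processing model name | processModelName | process_model_name | string / str | |
| URL of the file to fill | fileUrlToFill | file_url_to_fill | string / str | |
| Tools | tools | tools | Record<string, unknown>[] / List[Dict] | |
| UTC offset | utc | utc | number / int | |
| Chunk size | chunkSize | chunk_size | number / int | |
| Maximum concurrency | maxConcurrent | max_concurrent | number / int | |
| Maximum retries | maxRetries | max_retries | number / int | |
| Enhanced processing | useEnhancedProcessing | use_enhanced_processing | boolean / bool | |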
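
To keep call sites type-checked, the table above can be mirrored as a local TypeScript type. The sketch below is only illustrative: `ProcessDocumentParams` and `buildProcessDocumentParams` are hypothetical names that are not part of the SDK, and treating `modelName`, `url`, and `organizationId` as required is an assumption based on the examples on this page.

```typescript
// Hypothetical type mirroring the parameter table; it is not imported from the SDK.
// Assumption: only the three fields used in every example are required.
interface ProcessDocumentParams {
  modelName: string;
  url: string;
  organizationId: string;
  boardId?: string;
  language?: string;
  additionalInstructions?: string;
  additionalDocumentInstructions?: string;
  processModelName?: string;
  fileUrlToFill?: string;
  tools?: Record<string, unknown>[];
  utc?: number;
  chunkSize?: number;
  maxConcurrent?: number;
  maxRetries?: number;
  useEnhancedProcessing?: boolean;
}

// Drop keys whose value is undefined so only explicitly set options are sent.
function buildProcessDocumentParams(
  params: ProcessDocumentParams,
): ProcessDocumentParams {
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(params) as (keyof ProcessDocumentParams)[]) {
    const value = params[key];
    if (value !== undefined) cleaned[key] = value;
  }
  return cleaned as unknown as ProcessDocumentParams;
}

const documentRequest = buildProcessDocumentParams({
  modelName: "gpt-4o",
  url: "https://example.com/invoice.pdf",
  organizationId: "org_xxx",
  language: undefined, // removed from the final object
  chunkSize: 4, // illustrative value, not a documented default
  maxRetries: 2, // illustrative value, not a documented default
});
```

The resulting object could then be passed to `client.chatAi.processDocument(documentRequest)`.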

PDF form-filling example

const result = await client.chatAi.processDocument({
  modelName: "gpt-4o",
  url: "https://example.com/blank-invoice.pdf",
  organizationId: "org_xxx",
  fileUrlToFill: "https://example.com/blank-invoice.pdf",
  language: "en",
});
if (result.success && result.data.filledPdfUrl) {
  console.log("Filled PDF:", result.data.filledPdfUrl);
}
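
The success check in the example above can be factored into a small helper. This is a minimal sketch: `FillResult` is a hypothetical shape inferred only from the `result.success` and `result.data.filledPdfUrl` accesses shown here, and the real response may carry more fields.

```typescript
// Hypothetical result shape, inferred from the example above (not an SDK type).
interface FillResult {
  success: boolean;
  data?: { filledPdfUrl?: string | null };
}

// Return the filled PDF URL, or undefined when the call failed
// or no filled PDF was produced.
function getFilledPdfUrl(result: FillResult): string | undefined {
  if (!result.success) return undefined;
  const url = result.data?.filledPdfUrl;
  return typeof url === "string" && url.length > 0 ? url : undefined;
}
```

Returning `undefined` for every failure mode lets the caller handle "no filled PDF" with a single check.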

client.documentAi: agent CRUD

Document AI agents store extraction schemas, instructions, and model configuration for reuse.

List agents

// All agents
const agents = await client.documentAi.listAgents();
// Filter by name
const filtered = await client.documentAi.listAgents({ nameContains: "Invoice" });
// Document AI agents only (created via createFull or the webapp)
const docAiAgents = await client.documentAi.listAgents({ documentAiOnly: true });

Get an agent

const agent = await client.documentAi.getAgent("agent_id");

Create an agent

const agent = await client.documentAi.createAgent({
  name: "Invoice Extractor",
  instructions: "Extract invoice fields. Dates as YYYY-MM-DD.",
  model_id: "gpt-4o",
  schema: {
    invoice_number: { type: "string", description: "Invoice ID" },
    total: { type: "number" },
    date: { type: "string", format: "date" },
  },
});

Update an agent

await client.documentAi.updateAgent("agent_id", {
  name: "Invoice Extractor v2",
  instructions: "Updated extraction logic.",
});

Delete an agent

await client.documentAi.deleteAgent("agent_id");
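
The CRUD calls above can be combined into a create-or-update helper so that repeated deployments do not pile up duplicate agents. The sketch below makes several assumptions: `AgentCrud`, `AgentRecord`, `AgentInput`, and `upsertAgentByName` are hypothetical names, `listAgents` is assumed to resolve to an array of objects with `_id` and `name`, and `createAgent` is assumed to resolve to the created record. Only the method names, the `nameContains` filter, and the `_id` field appear on this page.

```typescript
// Hypothetical structural types; they are not imported from the SDK.
interface AgentRecord {
  _id: string;
  name: string;
}
interface AgentInput {
  name: string;
  instructions: string;
  model_id: string;
  schema: Record<string, unknown>;
}
interface AgentCrud {
  listAgents(filter?: { nameContains?: string }): Promise<AgentRecord[]>;
  createAgent(input: AgentInput): Promise<AgentRecord>;
  updateAgent(id: string, patch: Partial<AgentInput>): Promise<unknown>;
}

// Update the agent whose name matches exactly, or create it if none exists.
async function upsertAgentByName(api: AgentCrud, input: AgentInput): Promise<string> {
  const candidates = await api.listAgents({ nameContains: input.name });
  // nameContains presumably matches substrings, so require an exact match here.
  const existing = candidates.find((agent) => agent.name === input.name);
  if (existing) {
    await api.updateAgent(existing._id, input);
    return existing._id;
  }
  const created = await api.createAgent(input);
  return created._id;
}
```

If those assumptions hold for the real client, a call might look like `await upsertAgentByName(client.documentAi, { ... })`.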

client.documentAi: processing documents with an agent

Process a document with a configured agent; the model and instructions are looked up from the agent.

const result = await client.documentAi.process({
  agentId: "agent_id",
  url: "https://example.com/invoice.pdf",
  organizationId: "org_xxx",
});

You can also override the agent's model or instructions:

const result = await client.documentAi.process({
  agentId: "agent_id",
  url: "https://example.com/invoice.pdf",
  organizationId: "org_xxx",
  modelName: "gpt-4o", // overrides the agent's model
  instructions: "Custom prompt", // overrides the agent's instructions
});
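
When many documents go through the same agent, it can help to cap the number of calls in flight. The following is a sketch of a client-side worker pool, not an SDK feature: `AgentProcessor`, `BatchOutcome`, and `processBatch` are hypothetical names, and the default limit of 3 is arbitrary.

```typescript
// Hypothetical structural type covering only the `process` call shown above.
interface AgentProcessor {
  process(params: { agentId: string; url: string; organizationId: string }): Promise<unknown>;
}

type BatchOutcome =
  | { url: string; ok: true; result: unknown }
  | { url: string; ok: false; error: unknown };

// Process many URLs with one agent, running at most `limit` calls at a time.
// Failures are recorded per URL instead of aborting the whole batch.
async function processBatch(
  api: AgentProcessor,
  agentId: string,
  organizationId: string,
  urls: string[],
  limit = 3,
): Promise<BatchOutcome[]> {
  const outcomes: BatchOutcome[] = new Array(urls.length);
  let next = 0;
  // Each worker repeatedly claims the next unprocessed index.
  async function worker(): Promise<void> {
    while (next < urls.length) {
      const index = next++;
      const url = urls[index];
      try {
        const result = await api.process({ agentId, url, organizationId });
        outcomes[index] = { url, ok: true, result };
      } catch (error) {
        outcomes[index] = { url, ok: false, error };
      }
    }
  }
  const workerCount = Math.max(1, Math.min(limit, urls.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return outcomes;
}
```

Outcomes are returned in the same order as the input URLs, so a failed document can be retried by index.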

client.documentAi: suggesting a schema

Automatically propose a JSON schema by analyzing a sample document.

const schema = await client.documentAi.suggestSchema({
  url: "https://example.com/invoice.pdf",
  organizationId: "org_xxx",
  modelName: "gpt-4o", // optional, defaults to "gpt-4o"
});

client.documentAi: full creation (orchestrator)

End-to-end Document AI agent creation (what the iMBRACE webapp does): it creates a board with the extraction schema, then creates a UseCase and an AI agent linked to that board.

const result = await client.documentAi.createFull({
  name: "Invoice Extractor",
  instructions: "Extract invoice fields. Dates as YYYY-MM-DD.",
  schemaFields: [
    { name: "invoice_number", type: "ShortText", description: "Invoice ID" },
    { name: "total", type: "Number", description: "Total amount" },
    { name: "date", type: "Date", description: "Invoice date" },
  ],
  modelId: "gpt-4o",
  providerId: "system",
});
console.log(result.board_id); // "brd_xxx"
console.log(result.ai_agent_id); // UUID of the created AI agent
console.log(result.usecase_id); // UUID of the created UseCase
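
The `schema` object passed to `createAgent` and the `schemaFields` array passed to `createFull` describe the same fields in two different shapes. The sketch below converts one into the other. The type mapping (string to "ShortText", number to "Number", string with format "date" to "Date") is an assumption inferred from the two examples on this page, other board field types may exist, and `toSchemaFields`, `SchemaProperty`, and `BoardField` are hypothetical names.

```typescript
// Hypothetical types: the property shape follows the createAgent example,
// the field shape follows the createFull example.
interface SchemaProperty {
  type: string;
  format?: string;
  description?: string;
}
interface BoardField {
  name: string;
  type: string;
  description: string;
}

// Assumed mapping, inferred from the examples only.
function toSchemaFields(schema: Record<string, SchemaProperty>): BoardField[] {
  return Object.keys(schema).map((name) => {
    const prop = schema[name];
    let type: string;
    if (prop.type === "string" && prop.format === "date") type = "Date";
    else if (prop.type === "string") type = "ShortText";
    else if (prop.type === "number") type = "Number";
    else throw new Error(`No known field type for "${name}" (${prop.type})`);
    // Fall back to the field name when no description is given.
    return { name, type, description: prop.description ?? name };
  });
}

const convertedFields = toSchemaFields({
  invoice_number: { type: "string", description: "Invoice ID" },
  total: { type: "number" },
  date: { type: "string", format: "date" },
});
```

Throwing on unmapped types keeps the helper from silently guessing a board field type it has no evidence for.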

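Full-creation options

| Parameter | TS field | Python argument | Type | Default |
| --- | --- | --- | --- | --- |
| Name | name | name | string / str | |
| Instructions | instructions | instructions | string / str | |
| Schema fields | schemaFields | schema_fields | CreateBoardFieldInput[] / List[Dict] | |
| Model ID | modelId | model_id | string / str | |
| Provider ID | providerId | provider_id | string / str | |
| Description | description | description | string / str | None |
| VLM model | vlmModel | vlm_model | string / str | modelId |
| VLM provider | vlmProviderId | vlm_provider_id | string / str | providerId |
| Source languages | sourceLanguages | source_languages | string[] / List[str] | ["English"] |
| Handwriting support | handwritingSupport | handwriting_support | boolean / bool | false |
| Time offset | timeOffset | time_offset | string / str | "UTC+00:00" |
| Continue on failure | continueOnFailure | continue_on_failure | boolean / bool | false |
| Retry count | retryTime | retry_time | number / int | 2 |
| Temperature | temperature | temperature | number / float | 0.1 |
| Demo URL | demoUrl | demo_url | string / str | None |
| Team IDs | teamIds | team_ids | string[] / List[str] | [] |
| Extra AI agent fields | extraAiAgent | extra_ai_agent | Record<string, unknown> / Dict | None |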
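
The defaults in the table above can be made explicit on the caller's side, which is handy for logging or diffing the configuration that will be sent. This is a sketch only: `withCreateFullDefaults` and both interfaces are hypothetical names, the SDK presumably applies these defaults itself, and treating the five fields without a listed default as required is an assumption. Options whose default is None (`description`, `demoUrl`, `extraAiAgent`) are left out.

```typescript
// Hypothetical types mirroring the options table; not imported from the SDK.
interface CreateFullRequired {
  name: string;
  instructions: string;
  schemaFields: { name: string; type: string; description?: string }[];
  modelId: string;
  providerId: string;
}
interface CreateFullOptional {
  vlmModel: string;
  vlmProviderId: string;
  sourceLanguages: string[];
  handwritingSupport: boolean;
  timeOffset: string;
  continueOnFailure: boolean;
  retryTime: number;
  temperature: number;
  teamIds: string[];
}

// Fill in the documented defaults for every option the caller left unset.
function withCreateFullDefaults(
  input: CreateFullRequired & Partial<CreateFullOptional>,
): CreateFullRequired & CreateFullOptional {
  return {
    ...input,
    vlmModel: input.vlmModel ?? input.modelId,
    vlmProviderId: input.vlmProviderId ?? input.providerId,
    sourceLanguages: input.sourceLanguages ?? ["English"],
    handwritingSupport: input.handwritingSupport ?? false,
    timeOffset: input.timeOffset ?? "UTC+00:00",
    continueOnFailure: input.continueOnFailure ?? false,
    retryTime: input.retryTime ?? 2,
    temperature: input.temperature ?? 0.1,
    teamIds: input.teamIds ?? [],
  };
}

const resolvedOptions = withCreateFullDefaults({
  name: "Invoice Extractor",
  instructions: "Extract invoice fields. Dates as YYYY-MM-DD.",
  schemaFields: [{ name: "total", type: "Number", description: "Total amount" }],
  modelId: "gpt-4o",
  providerId: "system",
  temperature: 0, // an explicit 0 is kept, because ?? only replaces null or undefined
});
```

Using `??` rather than `||` matters here: values such as `0`, `false`, and `""` are legitimate settings and must not be overwritten by the defaults.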

Async usage (Python)

import asyncio

from imbrace import AsyncImbraceClient


async def main():
    async with AsyncImbraceClient() as client:
        # Direct processing (chat_ai)
        models = await client.chat_ai.list_document_models()
        result = await client.chat_ai.process_document(
            model_name="gpt-4o",
            url="https://example.com/invoice.pdf",
            organization_id="org_xxx",
        )
        # Agent-based processing (document_ai)
        agents = await client.document_ai.list_agents(document_ai_only=True)
        result2 = await client.document_ai.process(
            agent_id=agents[0]["_id"],
            url="https://example.com/receipt.pdf",
            organization_id="org_xxx",
        )


# `async with` is only valid inside a coroutine, so run it through asyncio.
asyncio.run(main())
