POST /datasets/{dataset_id}/document/create-by-text
Create Document from Text
curl --request POST \
  --url http://{apiBaseUrl}/datasets/{dataset_id}/document/create-by-text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "text": "<string>",
  "indexing_technique": "high_quality",
  "doc_form": "text_model",
  "doc_language": "中文",
  "process_rule": {
    "mode": "automatic",
    "rules": {
      "pre_processing_rules": [
        {
          "id": "remove_extra_spaces",
          "enabled": true
        }
      ],
      "segmentation": {
        "separator": "<string>",
        "max_tokens": 123
      },
      "parent_mode": "full-doc",
      "subchunk_segmentation": {
        "separator": "<string>",
        "max_tokens": 123,
        "chunk_overlap": 123
      }
    }
  },
  "retrieval_model": {
    "search_method": "hybrid_search",
    "reranking_enable": true,
    "reranking_mode": {
      "reranking_provider_name": "<string>",
      "reranking_model_name": "<string>"
    },
    "top_k": 123,
    "score_threshold_enabled": true,
    "score_threshold": 123,
    "weights": 123
  },
  "embedding_model": "<string>",
  "embedding_model_provider": "<string>"
}
'
{
  "document": {
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "position": 123,
    "data_source_type": "<string>",
    "data_source_info": {},
    "dataset_process_rule_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "name": "<string>",
    "created_from": "<string>",
    "created_by": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "created_at": 123,
    "tokens": 123,
    "indexing_status": "<string>",
    "error": "<string>",
    "enabled": true,
    "disabled_at": 123,
    "disabled_by": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "archived": true,
    "display_status": "<string>",
    "word_count": 123,
    "hit_count": 123,
    "doc_form": "<string>"
  },
  "batch": "<string>"
}

Authorizations

Authorization
string
header
required

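API-Key authentication. Every API request should include your API-Key in the Authorization HTTP header, in the format Bearer {API_KEY}. We strongly recommend storing the API-Key on the backend rather than sharing it or storing it on the client, so that a leaked API-Key does not lead to financial loss.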
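
As a minimal sketch of the recommendation above, a backend service might read the key from its environment and build the request headers there. The environment variable name `KNOWLEDGE_API_KEY`, the fallback value, and the helper `build_headers` are assumptions made for illustration; they are not defined by this API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return the headers this endpoint expects for a JSON request."""
    if not api_key:
        raise ValueError("API key is empty; configure it on the server side")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Hypothetical variable name and placeholder fallback. The key stays in the
# backend environment and is never shipped to client-side code.
headers = build_headers(os.environ.get("KNOWLEDGE_API_KEY", "example-key"))
```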

Path Parameters

dataset_id
string<uuid>
required

ID of the knowledge base to add the document to.
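
Because `dataset_id` is typed as a UUID, a client can reject malformed values before sending anything. The sketch below assumes a hypothetical helper `build_create_by_text_url` and a placeholder base URL; only the path layout comes from this page.

```python
import uuid


def build_create_by_text_url(api_base_url: str, dataset_id: str) -> str:
    """Build the endpoint URL, rejecting a dataset_id that is not a UUID."""
    # uuid.UUID raises ValueError for malformed input; str() gives the
    # canonical lowercase, hyphenated form.
    normalized = str(uuid.UUID(dataset_id))
    base = api_base_url.rstrip("/")
    return f"{base}/datasets/{normalized}/document/create-by-text"


# Placeholder base URL; the UUID is the sample value from the response example.
url = build_create_by_text_url(
    "http://localhost/v1", "3c90c3cc-0d44-4b50-8888-8dd25736052a"
)
```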

Body

application/json
name
string
required

Document name.

text
string
required

The full text content of the document.

indexing_technique
enum<string>

Indexing technique for the document.

Available options:
high_quality,
economy
doc_form
enum<string>

Format of the indexed content.

Available options:
text_model,
hierarchical_model,
qa_model
doc_language
string

Language of the document; this matters in Q&A mode.

Example:

"中文"

process_rule
object

Set of rules for processing the document, including cleaning and segmentation.

retrieval_model
object
embedding_model
string

Name of the embedding model to use.

embedding_model_provider
string

Provider of the embedding model.
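
The body fields above can be assembled and checked on the client before the request is sent. This is a sketch under stated assumptions: `build_payload` is a hypothetical helper, the sample name and text are placeholders, and only the two required fields and the enum values listed in this section are validated. It omits `process_rule`, `retrieval_model`, and the embedding fields.

```python
import json

# Allowed values copied from the enum lists in the Body section.
INDEXING_TECHNIQUES = {"high_quality", "economy"}
DOC_FORMS = {"text_model", "hierarchical_model", "qa_model"}


def build_payload(name, text, indexing_technique=None, doc_form=None,
                  doc_language=None):
    """Assemble the JSON body, validating required fields and enums."""
    if not name or not text:
        raise ValueError("name and text are required")
    payload = {"name": name, "text": text}
    if indexing_technique is not None:
        if indexing_technique not in INDEXING_TECHNIQUES:
            raise ValueError(f"unknown indexing_technique: {indexing_technique!r}")
        payload["indexing_technique"] = indexing_technique
    if doc_form is not None:
        if doc_form not in DOC_FORMS:
            raise ValueError(f"unknown doc_form: {doc_form!r}")
        payload["doc_form"] = doc_form
    if doc_language is not None:
        payload["doc_language"] = doc_language
    return payload


# ensure_ascii=False keeps non-ASCII text (for example "中文") readable.
body = json.dumps(
    build_payload("notes.txt", "Full text of the document.",
                  indexing_technique="high_quality", doc_form="text_model"),
    ensure_ascii=False,
)
```

Optional fields are left out of the payload entirely when not supplied, so the server-side defaults (whatever they are) still apply.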

Response

200 - application/json

The document was created successfully and is being indexed.

document
object
batch
string

Batch identifier used to track indexing progress.
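
A client will typically keep the document `id` and the `batch` value from the 200 response so that indexing progress can be followed later. The sketch below assumes a hypothetical helper `parse_create_response`; the sample body is illustrative, with made-up `indexing_status` and `batch` values rather than real server output.

```python
import json


def parse_create_response(body: str) -> dict:
    """Extract the fields needed to follow indexing from a 200 response."""
    data = json.loads(body)
    document = data["document"]
    return {
        "document_id": document["id"],
        "indexing_status": document["indexing_status"],
        "batch": data["batch"],
    }


# Illustrative response body; field names follow the example on this page,
# the status and batch values are placeholders.
sample = json.dumps({
    "document": {
        "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
        "indexing_status": "example-status",
    },
    "batch": "example-batch",
})
result = parse_create_response(sample)
```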