POST /datasets/{dataset_id}/document/create-by-text
Create a Document from Text
curl --request POST \
  --url http://{apiBaseUrl}/datasets/{dataset_id}/document/create-by-text \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "name": "<string>",
  "text": "<string>",
  "indexing_technique": "high_quality",
  "doc_form": "text_model",
  "doc_language": "English",
  "process_rule": {
    "mode": "automatic",
    "rules": {
      "pre_processing_rules": [
        {
          "id": "remove_extra_spaces",
          "enabled": true
        }
      ],
      "segmentation": {
        "separator": "<string>",
        "max_tokens": 123
      },
      "parent_mode": "full-doc",
      "subchunk_segmentation": {
        "separator": "<string>",
        "max_tokens": 123,
        "chunk_overlap": 123
      }
    }
  },
  "retrieval_model": {
    "search_method": "hybrid_search",
    "reranking_enable": true,
    "reranking_mode": {
      "reranking_provider_name": "<string>",
      "reranking_model_name": "<string>"
    },
    "top_k": 123,
    "score_threshold_enabled": true,
    "score_threshold": 123,
    "weights": 123
  },
  "embedding_model": "<string>",
  "embedding_model_provider": "<string>"
}
'

{
  "document": {
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "position": 123,
    "data_source_type": "<string>",
    "data_source_info": {},
    "dataset_process_rule_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "name": "<string>",
    "created_from": "<string>",
    "created_by": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "created_at": 123,
    "tokens": 123,
    "indexing_status": "<string>",
    "error": "<string>",
    "enabled": true,
    "disabled_at": 123,
    "disabled_by": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "archived": true,
    "display_status": "<string>",
    "word_count": 123,
    "hit_count": 123,
    "doc_form": "<string>"
  },
  "batch": "<string>"
}

Authorizations

Authorization
string
header
required

API key authentication. For every API request, include your API key in the Authorization HTTP header, prefixed with 'Bearer '. Example: Authorization: Bearer {API_KEY}. Store your API key on the server side; do not share it or store it on the client side, because a leaked key can have serious consequences.
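As a minimal sketch of the header format described above, the helper below builds the two headers used in the curl sample. The function name `auth_headers` and the environment variable `EXAMPLE_API_KEY` are hypothetical, chosen only for illustration; reading the key from the environment is one way to keep it on the server side.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the headers for a JSON request authenticated with a Bearer key."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a hypothetical variable name, not part of the API.
    headers = auth_headers(os.environ["EXAMPLE_API_KEY"])
    # Print only the header names so the key itself never reaches the logs.
    print(sorted(headers))
```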

Path Parameters

dataset_id
string<uuid>
required

The ID of the knowledge base to add the document to.
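Because `dataset_id` is typed as a UUID, a client can reject malformed IDs before sending anything. The sketch below assumes `api_base_url` already includes the scheme (for example `http://localhost/v1`); the function name `create_by_text_url` is hypothetical.

```python
import uuid


def create_by_text_url(api_base_url: str, dataset_id: str) -> str:
    """Build the endpoint URL, validating dataset_id as a UUID first."""
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    parsed = uuid.UUID(dataset_id)
    base = api_base_url.rstrip("/")
    return f"{base}/datasets/{parsed}/document/create-by-text"
```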

Body

application/json
name
string
required

Name of the document.

text
string
required

Full text content of the document.

indexing_technique
enum<string>

Indexing technique for the document.

Available options: high_quality, economy

doc_form
enum<string>

Format of the indexed content.

Available options: text_model, hierarchical_model, qa_model

doc_language
string

Language of the document. This matters mainly in Q&A mode (doc_form set to qa_model).

Example: "English"

process_rule
object

A set of rules for processing a document, including cleaning and segmentation.
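The field list does not spell out the shape of `process_rule`, so the sketch below simply mirrors the cleaning and segmentation fields from the curl sample. The function name `build_process_rule` and the default separator and token limit are illustrative assumptions, not documented defaults. Whether `rules` is honored when `mode` is "automatic" is not stated here.

```python
def build_process_rule(
    mode: str = "automatic",
    separator: str = "\n\n",   # illustrative value, not a documented default
    max_tokens: int = 500,     # illustrative value, not a documented default
) -> dict:
    """Assemble a process_rule object shaped like the request sample."""
    return {
        "mode": mode,
        "rules": {
            # Cleaning step: the rule id is taken from the request sample.
            "pre_processing_rules": [
                {"id": "remove_extra_spaces", "enabled": True},
            ],
            # Segmentation step: where to split and how large a chunk may be.
            "segmentation": {
                "separator": separator,
                "max_tokens": max_tokens,
            },
        },
    }
```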

retrieval_model
object

embedding_model
string

Name of the embedding model to use.

embedding_model_provider
string

Provider of the embedding model.
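The body fields above can be assembled into a request as sketched below, using only the Python standard library. The function name `build_create_request` is hypothetical. The sketch sends `name` and `text` always, checks the two enum fields against the options listed above, and omits any optional field left unset. It builds the request without sending it.

```python
import json
import urllib.request

INDEXING_TECHNIQUES = {"high_quality", "economy"}
DOC_FORMS = {"text_model", "hierarchical_model", "qa_model"}


def build_create_request(url, api_key, name, text,
                         indexing_technique=None, doc_form=None, **extra):
    """Build (but do not send) a POST request for create-by-text."""
    if indexing_technique is not None and indexing_technique not in INDEXING_TECHNIQUES:
        raise ValueError(f"unknown indexing_technique: {indexing_technique!r}")
    if doc_form is not None and doc_form not in DOC_FORMS:
        raise ValueError(f"unknown doc_form: {doc_form!r}")

    body = {"name": name, "text": text}  # the two required fields
    optional = {"indexing_technique": indexing_technique,
                "doc_form": doc_form, **extra}
    # Leave unset optional fields out of the payload entirely.
    body.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the JSON response body.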

Response

200 - application/json

Document created successfully and is being indexed.

document
object

batch
string

A batch identifier for tracking indexing progress.
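A 200 response means the document was accepted and indexing has started, not that it has finished, so a client will usually keep the document ID and the batch identifier for later status checks. The sketch below pulls those values out of a response shaped like the sample above; the function name `parse_create_response` is hypothetical.

```python
import json


def parse_create_response(raw: str) -> tuple[str, str, str]:
    """Return (document_id, indexing_status, batch) from the response JSON."""
    payload = json.loads(raw)
    document = payload["document"]
    return document["id"], document["indexing_status"], payload["batch"]
```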