POST /completion-messages
curl --request POST \
--url http://{api_base_url}/completion-messages \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data '
{
"inputs": {
"query": "Hello, world!"
},
"response_mode": "streaming",
"user": "abc-123"
}
'
{
  "event": "message",
  "message_id": "9da23599-e713-473b-982c-4328d4f5c78a",
  "mode": "completion",
  "answer": "Hello World!...",
  "metadata": {
    "usage": {
      "prompt_tokens": 1033,
      "prompt_unit_price": "0.001",
      "prompt_price_unit": "0.001",
      "prompt_price": "0.0010330",
      "completion_tokens": 128,
      "completion_unit_price": "0.002",
      "completion_price_unit": "0.001",
      "completion_price": "0.0002560",
      "total_tokens": 1161,
      "total_price": "0.0012890",
      "currency": "USD",
      "latency": 0.7682376249867957
    }
  },
  "created_at": 1705407629
}

Authorizations

Authorization
string
header
required

API key authentication. Include your API key in the Authorization HTTP header of every request, prefixed with 'Bearer '. Example: Authorization: Bearer {API_KEY}. Store the API key on the server side; do not share it or embed it in client-side code, because a leaked key can have serious consequences.
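
As a minimal sketch of the header convention described above, the helper below builds the two headers used in the curl example. The function name `build_headers` and the choice of reading the key from an `API_KEY` environment variable are illustrative assumptions, not part of the API.

```python
import os


def build_headers(api_key: str) -> dict:
    """Return headers for a JSON request authenticated with a Bearer key."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


# Assumption: the key lives in an environment variable named API_KEY on the
# server side; "placeholder-key" is only a stand-in so the sketch runs.
headers = build_headers(os.environ.get("API_KEY", "placeholder-key"))
```

Reading the key from the server environment keeps it out of source code and out of anything shipped to the client.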

Body

application/json

Request body to create a completion message.

inputs
object
required

Values for the variables defined by the App. Each key in inputs is the name of a variable, and each value is the value to use for that variable. A text generation application requires at least one key/value pair.

Example:
{ "query": "Translate 'hello' to Spanish." }
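
The request body can be assembled and checked before it is sent. The sketch below is one way to do that, assuming a hypothetical helper named `build_body`; it enforces the rule that inputs needs at least one key/value pair and restricts response_mode to the two documented options.

```python
import json
from typing import Optional


def build_body(inputs: dict, response_mode: str = "streaming",
               user: Optional[str] = None) -> str:
    """Serialize a completion request body, validating the documented rules."""
    if not inputs:
        # The text generation app requires at least one key/value pair.
        raise ValueError("inputs must contain at least one key/value pair")
    if response_mode not in ("streaming", "blocking"):
        raise ValueError("response_mode must be 'streaming' or 'blocking'")
    body = {"inputs": inputs, "response_mode": response_mode}
    if user is not None:
        body["user"] = user
    return json.dumps(body)


payload = build_body({"query": "Translate 'hello' to Spanish."}, user="user-12345")
```

The resulting string can be passed as the `--data` argument of the curl command, or as the request body in any HTTP client.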
response_mode
enum<string>

How the response is returned.

  • streaming: Streaming mode (recommended). Produces typewriter-like output through SSE (Server-Sent Events).
  • blocking: Blocking mode. Returns the result after execution completes. Due to Cloudflare restrictions, a blocking request that runs longer than 100 seconds is interrupted without returning a result.
Available options: streaming, blocking
Example:

"streaming"
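
In streaming mode the client reads the SSE stream line by line. The sketch below shows one way to do that, under the assumption that each event arrives as a single `data:` line whose payload is a JSON object with an `answer` fragment; this page does not show the exact wire format, so treat the sample lines and the helper name `iter_sse_json` as illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_sse_json(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of every `data:` line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload:
            yield json.loads(payload)


# Hypothetical stream contents, standing in for lines read from the response.
sample = [
    'data: {"event": "message", "answer": "Hello"}',
    "",
    'data: {"event": "message", "answer": " World!"}',
    "",
]
text = "".join(event.get("answer", "") for event in iter_sse_json(sample))
```

Concatenating the fragments as they arrive gives the typewriter-like output; a fuller client would also handle events whose data spans several `data:` lines.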

user
string

User identifier for the end user, used for retrieval and statistics. The developer should define it so that it is unique within the application.

Example:

"user-12345"

files
object[]

File list. Use it to send files (images) alongside text for combined understanding and question answering. Available only when the model supports Vision capability.

Response

Successful response. The content type and structure depend on the response_mode parameter in the request.

  • If response_mode is blocking, returns application/json with a CompletionResponse object.
  • If response_mode is streaming, returns text/event-stream with a ChunkCompletionResponse stream.

Response object for blocking mode completion.

event
string

Event type. In blocking mode this is typically 'message'.

Example:

"message"

message_id
string<uuid>

Unique message ID.

mode
string

App mode, fixed as completion for this response type.

Example:

"completion"

answer
string

Complete response content.

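metadata
object

Metadata about the response. In the example response it contains a usage object with token counts, prices, currency, and latency.
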
created_at
integer

Message creation timestamp (Unix epoch).

Example:

1705395332
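
A blocking-mode response can be reduced to the fields a caller typically needs. The sketch below uses values from the example response at the top of this page; the helper name `summarize` is hypothetical. Prices are returned as strings, so it parses them with `Decimal` rather than `float`, and it converts created_at from a Unix epoch to a UTC datetime.

```python
from datetime import datetime, timezone
from decimal import Decimal

# Subset of the example response shown at the top of this page.
response = {
    "event": "message",
    "mode": "completion",
    "answer": "Hello World!...",
    "metadata": {
        "usage": {
            "prompt_tokens": 1033,
            "completion_tokens": 128,
            "total_tokens": 1161,
            "total_price": "0.0012890",
            "currency": "USD",
        }
    },
    "created_at": 1705407629,
}


def summarize(resp: dict) -> dict:
    """Extract the answer, creation time, and usage totals from a response."""
    usage = resp.get("metadata", {}).get("usage", {})
    price = usage.get("total_price")
    return {
        "answer": resp["answer"],
        "created": datetime.fromtimestamp(resp["created_at"], tz=timezone.utc),
        "total_tokens": usage.get("total_tokens"),
        "total_price": Decimal(price) if price is not None else None,
    }


summary = summarize(response)
```

Using `.get` for metadata and usage keeps the helper from failing on a response that omits them.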