RAG QA

Performs retrieval-augmented generation (RAG) question answering over uploaded knowledge base documents. The endpoint retrieves the relevant document chunks, builds a context from them, and then calls the specified large language model to generate an answer.

Try It

POST /search/chat_message
Base URL: https://api-platform.ope.ai

Authentication

Uses Bearer Token authentication.

Header: Authorization: Bearer <token>
Example: Authorization: Bearer sk-xxxxxx
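
The header format above can be assembled in code roughly as follows. This is a minimal sketch: the helper name `buildHeaders` is hypothetical and not part of the API; it only produces the `Authorization` and `Content-Type` headers that appear in this document.

```javascript
// Hypothetical helper (not part of the API): builds the headers for a
// Bearer-authenticated JSON request.
function buildHeaders(apiKey) {
  if (typeof apiKey !== "string" || apiKey.trim() === "") {
    throw new Error("API key is required");
  }
  return {
    Authorization: `Bearer ${apiKey.trim()}`,
    "Content-Type": "application/json",
  };
}
```

Reading the key from an environment variable rather than hard-coding it keeps it out of source control.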

Request Body (application/json)

Field	Type	Required	Description	Default / Range
`question`	string	Yes	User question, used for document retrieval and answer generation	-
`llm_model_name`	string	Yes	Large language model name used to generate answers	-
`embedding_model_name`	string	Yes	Embedding model name used for text vectorization and retrieval	-
`session_id`	string	Yes	Session ID, used to associate conversation history	-
`top_k`	integer	No	Number of similar documents returned by retrieval	Default `5`
`stream`	boolean	No	Whether to use SSE streaming response	Default `true`
`return_sources`	boolean	No	Whether to return relevant source documents	Default `true`
`return_history`	boolean	No	Whether to return conversation history records	Default `false`
`extra_prompt`	string	No	Additional prompt to supplement answer constraints	Default `None`
`example_prompt`	string	No	Example prompt to constrain response style	Default example
`fallback_context`	string	No	Backup context when no relevant content is found in the knowledge base	Default empty string
`cosine_radius`	number	No	Cosine similarity threshold radius	Default `0.55`
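
As an illustration of the table above, the following sketch validates the required fields on the client and fills in the documented defaults before sending. The function name `buildChatRequest` is hypothetical. Defaults that the table does not give as a concrete value (`extra_prompt`, `example_prompt`) are deliberately left out so the server applies its own.

```javascript
// Required fields, per the request body table.
const REQUIRED_FIELDS = [
  "question",
  "llm_model_name",
  "embedding_model_name",
  "session_id",
];

// Documented defaults for optional fields with a concrete default value.
const DEFAULTS = {
  top_k: 5,
  stream: true,
  return_sources: true,
  return_history: false,
  fallback_context: "",
  cosine_radius: 0.55,
};

// Hypothetical helper: returns a request body with defaults applied,
// or throws if a required field is missing or not a non-empty string.
function buildChatRequest(params) {
  for (const field of REQUIRED_FIELDS) {
    const value = params[field];
    if (typeof value !== "string" || value.length === 0) {
      throw new TypeError(`"${field}" is required and must be a non-empty string`);
    }
  }
  const body = { ...DEFAULTS, ...params };
  if (!Number.isInteger(body.top_k)) {
    throw new TypeError('"top_k" must be an integer');
  }
  return body;
}
```

Because `stream` defaults to `true`, a caller that wants a single JSON response has to pass `stream: false` explicitly.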

Request Examples

Non-streaming cURL

curl -X POST "https://api-platform.ope.ai/search/chat_message" \
  -H "Authorization: Bearer $OPEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is RAG?",
    "llm_model_name": "deepseek-v4-flash",
    "embedding_model_name": "bge-m3",
    "top_k": 5,
    "session_id": "session_1",
    "stream": false,
    "return_sources": true
  }'

Streaming cURL

curl -N -X POST "https://api-platform.ope.ai/search/chat_message" \
  -H "Authorization: Bearer $OPEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Summarize product advantages based on the knowledge base",
    "llm_model_name": "deepseek-v4-flash",
    "embedding_model_name": "bge-m3",
    "session_id": "session_1",
    "stream": true,
    "return_sources": true
  }'

JavaScript

const res = await fetch("https://api-platform.ope.ai/search/chat_message", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPEAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    question: "What is RAG?",
    llm_model_name: "deepseek-v4-flash",
    embedding_model_name: "bge-m3",
    top_k: 5,
    session_id: "session_1",
    stream: false,
    return_sources: true,
  }),
});

console.log(await res.json());

Response Example

Non-streaming

{
  "message": "RAG is Retrieval-Augmented Generation, which retrieves relevant content from the knowledge base and generates answers combined with the model.",
  "sources": [
    {
      "doc_id": "doc_1",
      "chunk_id": "chunk_1",
      "content": "RAG combines retrieval with generation."
    }
  ],
  "info": {
    "top_k": 5
  },
  "history": []
}

Streaming

event: message
data: {"content":"RAG is"}

event: sources
data: [{"doc_id":"doc_1","chunk_id":"chunk_1"}]

event: end
data: {}
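
The streaming sample above can be parsed on the client along these lines. This is a sketch under stated assumptions: each event is an `event:` line followed by `data:` lines containing JSON, events are separated by a blank line, and every `message` event carries a fragment of the answer in `content`. The function names `parseSseEvents` and `collectAnswer` are hypothetical.

```javascript
// Parses a complete SSE text payload into { event, data } objects.
// Assumes JSON in every "data:" field, as in the sample above.
function parseSseEvents(text) {
  const events = [];
  // Normalize line endings, then split into blank-line-separated blocks.
  const blocks = text.replace(/\r\n/g, "\n").split(/\n{2,}/);
  for (const block of blocks) {
    let event = "message"; // SSE default type when no "event:" line is present
    const dataLines = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        dataLines.push(line.slice("data:".length).trim());
      }
    }
    if (dataLines.length === 0) continue; // skip empty blocks and comment lines
    events.push({ event, data: JSON.parse(dataLines.join("\n")) });
  }
  return events;
}

// Joins the "content" of every message event into the answer text
// (assumes message events are incremental fragments).
function collectAnswer(events) {
  return events
    .filter((e) => e.event === "message")
    .map((e) => e.data.content ?? "")
    .join("");
}
```

When reading a live response, network chunks may end in the middle of an event, so buffer the incoming text and pass only blocks terminated by a blank line to the parser.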

Error Responses

Status Code	Scenario	Description
`400`	Parameter validation error	A required field such as `question`, the model names, or `session_id` is missing or incorrectly formatted
`401`	Authentication failed	API Key is missing, invalid, or has insufficient permissions
`422`	Validation failed	Parameter structure validation failed
`500`	Processing failed	Database, retrieval, model call, or other server errors
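
One possible way to surface these status codes in client code is sketched below. The helper name `toApiError` is hypothetical, and treating only `500` as retryable is an assumption of this sketch, not documented API behavior.

```javascript
// Status codes and scenarios taken from the error table above.
const ERROR_SCENARIOS = {
  400: "Parameter validation error",
  401: "Authentication failed",
  422: "Validation failed",
  500: "Processing failed",
};

// Hypothetical helper: turns an HTTP status (and optional response text)
// into an Error with a readable message.
function toApiError(status, bodyText = "") {
  const scenario = ERROR_SCENARIOS[status] ?? "Unexpected error";
  const detail = bodyText ? `: ${bodyText}` : "";
  const error = new Error(`${status} ${scenario}${detail}`);
  error.status = status;
  // Assumption: only server-side failures are worth retrying.
  error.retryable = status === 500;
  return error;
}
```

A caller would typically check `res.ok` after `fetch` and throw `toApiError(res.status, await res.text())` when it is false.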
