RAG QA
Performs retrieval-augmented generation (RAG) question answering over uploaded knowledge base documents. The endpoint retrieves the relevant document chunks, builds a context from them, and then calls the specified large language model to generate an answer.
POST /search/chat_message
Base URL: https://api-platform.ope.ai
Authentication
Uses Bearer Token authentication.
- Header: `Authorization: Bearer <token>`
- Example: `Authorization: Bearer sk-xxxxxx`
Request Body (application/json)
| Field | Type | Required | Description | Default / Range |
|---|---|---|---|---|
| question | string | Yes | User question, used for document retrieval and answer generation | - |
| llm_model_name | string | Yes | Large language model name used to generate answers | - |
| embedding_model_name | string | Yes | Embedding model name used for text vectorization and retrieval | - |
| session_id | string | Yes | Session ID, used to associate conversation history | - |
| top_k | integer | No | Number of similar documents returned by retrieval | Default 5 |
| stream | boolean | No | Whether to use SSE streaming response | Default true |
| return_sources | boolean | No | Whether to return relevant source documents | Default true |
| return_history | boolean | No | Whether to return conversation history records | Default false |
| extra_prompt | string | No | Additional prompt to supplement answer constraints | Default None |
| example_prompt | string | No | Example prompt to constrain response style | Default example |
| fallback_context | string | No | Backup context when no relevant content is found in the knowledge base | Default empty string |
| cosine_radius | number | No | Cosine similarity threshold radius | Default 0.55 |
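The table above can be turned into a small client-side check before sending a request. The following is a minimal sketch, not part of any official SDK: `buildChatMessageBody` is a hypothetical helper name, the defaults are copied from the table, and the rule that `top_k` must be a positive integer is an assumption, since the table does not state a range. `extra_prompt` and `example_prompt` are left for the server to default.

```javascript
// Defaults copied from the request-body table.
const DEFAULTS = Object.freeze({
  top_k: 5,
  stream: true,
  return_sources: true,
  return_history: false,
  fallback_context: "",
  cosine_radius: 0.55,
});

// Fields marked "Yes" in the Required column.
const REQUIRED = ["question", "llm_model_name", "embedding_model_name", "session_id"];

// Hypothetical helper: validates required fields and fills in documented defaults.
function buildChatMessageBody(params) {
  const missing = REQUIRED.filter(
    (name) => typeof params[name] !== "string" || params[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new TypeError(`Missing required field(s): ${missing.join(", ")}`);
  }
  // Caller-supplied values override the defaults.
  const body = { ...DEFAULTS, ...params };
  // Assumption: top_k must be a positive integer (the table gives no range).
  if (!Number.isInteger(body.top_k) || body.top_k < 1) {
    throw new RangeError("top_k must be a positive integer");
  }
  return body;
}
```

Because `stream` defaults to true, a caller that expects a single JSON body should pass `stream: false` explicitly, for example `buildChatMessageBody({ question: "What is RAG?", llm_model_name: "deepseek-v4-flash", embedding_model_name: "bge-m3", session_id: "session_1", stream: false })`.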
Request Examples

Non-streaming cURL:
curl -X POST "https://api-platform.ope.ai/search/chat_message" \
  -H "Authorization: Bearer $OPEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "What is RAG?",
    "llm_model_name": "deepseek-v4-flash",
    "embedding_model_name": "bge-m3",
    "top_k": 5,
    "session_id": "session_1",
    "stream": false,
    "return_sources": true
  }'

Streaming cURL:
curl -N -X POST "https://api-platform.ope.ai/search/chat_message" \
  -H "Authorization: Bearer $OPEAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "question": "Summarize product advantages based on the knowledge base",
    "llm_model_name": "deepseek-v4-flash",
    "embedding_model_name": "bge-m3",
    "session_id": "session_1",
    "stream": true,
    "return_sources": true
  }'

JavaScript (non-streaming):
const res = await fetch("https://api-platform.ope.ai/search/chat_message", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.OPEAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    question: "What is RAG?",
    llm_model_name: "deepseek-v4-flash",
    embedding_model_name: "bge-m3",
    top_k: 5,
    session_id: "session_1",
    stream: false,
    return_sources: true,
  }),
});
console.log(await res.json());
Response Example

Non-streaming:
{
  "message": "RAG is Retrieval-Augmented Generation, which retrieves relevant content from the knowledge base and generates answers combined with the model.",
  "sources": [
    {
      "doc_id": "doc_1",
      "chunk_id": "chunk_1",
      "content": "RAG combines retrieval with generation."
    }
  ],
  "info": {
    "top_k": 5
  },
  "history": []
}

Streaming:
event: message
data: {"content":"RAG is"}

event: sources
data: [{"doc_id":"doc_1","chunk_id":"chunk_1"}]

event: end
data: {}
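The streaming response can be consumed in JavaScript by buffering the SSE text and folding the events into a final answer. The sketch below is illustrative only and rests on several assumptions: events are separated by blank lines as in the standard SSE format, every `data:` payload is JSON, and successive `message` events carry answer fragments to be concatenated. The helper names (`parseSseBlock`, `createSseReader`, `collectAnswer`, `streamChatMessage`) are hypothetical and not part of any official SDK.

```javascript
// Parse one SSE block ("event: ..." / "data: ..." lines) into { event, data }.
function parseSseBlock(block) {
  let event = "message"; // default event name in the SSE format
  const dataLines = [];
  for (const line of block.split(/\r?\n/)) {
    if (line.startsWith("event:")) event = line.slice(6).trim();
    else if (line.startsWith("data:")) dataLines.push(line.slice(5).replace(/^ /, ""));
  }
  if (dataLines.length === 0) return null;
  // Assumption: every data payload is JSON, as in the streaming example.
  return { event, data: JSON.parse(dataLines.join("\n")) };
}

// Buffer raw text chunks and emit only complete events.
function createSseReader() {
  let buffer = "";
  return {
    push(chunk) {
      buffer += chunk;
      const blocks = buffer.split(/\r?\n\r?\n/);
      buffer = blocks.pop(); // the last piece may be an unfinished event
      return blocks.map(parseSseBlock).filter((e) => e !== null);
    },
  };
}

// Fold a list of events into the final answer.
function collectAnswer(events) {
  let message = "";
  let sources = [];
  let done = false;
  for (const { event, data } of events) {
    // Assumption: message events are fragments to concatenate in order.
    if (event === "message") message += data.content ?? "";
    else if (event === "sources") sources = data;
    else if (event === "end") done = true;
  }
  return { message, sources, done };
}

// Sketch of wiring the reader to fetch (Node 18+ or a browser). Not invoked here.
async function streamChatMessage(body, apiKey) {
  const res = await fetch("https://api-platform.ope.ai/search/chat_message", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}`, "Content-Type": "application/json" },
    body: JSON.stringify({ ...body, stream: true }),
  });
  if (!res.ok) throw new Error(`Request failed with status ${res.status}`);
  const sse = createSseReader();
  const decoder = new TextDecoder();
  const reader = res.body.getReader();
  const events = [];
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    events.push(...sse.push(decoder.decode(value, { stream: true })));
  }
  return collectAnswer(events);
}
```

Buffering matters because a network chunk can end in the middle of an event; `push` holds back the trailing fragment until the blank line that terminates it arrives.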
Error Responses
| Status Code | Scenario | Description |
|---|---|---|
| 400 | Parameter validation error | A required field such as the question, model name, or session ID is missing or malformed |
| 401 | Authentication failed | API Key is missing, invalid, or has insufficient permissions |
| 422 | Validation failed | Parameter structure validation failed |
| 500 | Processing failed | Database, retrieval, model call, or other server errors |
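A client can map these status codes to a handling decision. The sketch below is one possible approach under stated assumptions: `describeStatus` is a hypothetical helper, and the policy of retrying only 5xx responses is an assumption, not something this API documents.

```javascript
// Scenarios copied from the error table.
const ERROR_SCENARIOS = {
  400: "Parameter validation error",
  401: "Authentication failed",
  422: "Validation failed",
  500: "Processing failed",
};

// Hypothetical helper: classify an HTTP status code from /search/chat_message.
function describeStatus(status) {
  if (status >= 200 && status < 300) {
    return { ok: true, scenario: "Success", retryable: false };
  }
  return {
    ok: false,
    scenario: ERROR_SCENARIOS[status] ?? "Unexpected status",
    // Assumption: 4xx errors need a corrected request or credentials,
    // so only server-side (5xx) failures are treated as retryable.
    retryable: status >= 500,
  };
}
```

For example, `describeStatus(res.status)` can be checked before calling `res.json()`, so that a 401 prompts a credentials check instead of a retry.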