API-GET-v1-chat-ask-stream

Abstract

The primary chat endpoint. Accepts a question and session context, classifies intent, retrieves relevant document chunks from Qdrant, streams an LLM response via SSE, and persists both the user and assistant messages to PostgreSQL via a BackgroundTask after streaming completes.


🔒 Authentication

None.


🛠️ Technical Specification

Request

| Property | Value |
| --- | --- |
| Method | `GET` |
| Path | `/api/v1/chat/ask-stream` |
| Tags | `["Chat"]` |
| Response type | `text/event-stream` (SSE) |

📦 Query Parameters

| Param | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `question` | string | Yes | (none) | User's message |
| `session_id` | UUID | Yes | (none) | Active chat session |
| `provider` | string | No | `"ollama"` | LLM provider |
| `model` | string | No | `"minimax-m2:cloud"` | Model name |
| `top_k` | int | No | reads `rag_top_k` from DB (default 5) | Number of Qdrant chunks to retrieve |
| `score_threshold` | float | No | reads `rag_score_threshold` from DB | Minimum similarity score for chunks |
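
For quick manual checks, the endpoint can be exercised with any HTTP client that supports response streaming. The sketch below is a minimal Python example, not the project's frontend code; the base URL, the example question, the placeholder `session_id`, and the assumption of standard SSE `data:` framing are all illustrative.

```python
# Minimal client sketch (not the project's frontend hook). Base URL, question,
# and session_id are placeholders; standard SSE "data:" framing is assumed.
import httpx

params = {
    "question": "What does the onboarding guide say about VPN access?",
    "session_id": "replace-with-an-existing-session-uuid",
    "provider": "ollama",
    "model": "minimax-m2:cloud",
    "top_k": 5,
}

with httpx.stream(
    "GET",
    "http://localhost:8000/api/v1/chat/ask-stream",
    params=params,
    timeout=None,  # the stream stays open until the answer finishes
) as resp:
    for line in resp.iter_lines():
        if line.startswith("data: "):
            print(line[len("data: "):])
```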

Logic Flow
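
The flow summarized in the abstract is: classify intent, retrieve chunks from Qdrant, stream LLM tokens, emit sources, then persist both messages in the background. The sketch below shows one plausible way to wire that flow in FastAPI; the stub services (`classify_intent`, `retrieve_chunks`, `stream_llm`, `persist_messages`) and the exact SSE framing are assumptions, not the project's real service APIs. The actual handler is `ask_question_stream()` in `backend/app/api/v1/chat.py`, backed by the modules listed under Implementation below.

```python
# Sketch of the streaming handler. The stub services below stand in for
# intent_service, retrieval_service, llm_service, and chat_history_service;
# the real router is ask_question_stream() in backend/app/api/v1/chat.py.
import json
from uuid import UUID

from fastapi import APIRouter, BackgroundTasks
from fastapi.responses import StreamingResponse

router = APIRouter(prefix="/api/v1/chat", tags=["Chat"])


# --- Stubs standing in for the real services; replace with actual imports. ---
async def classify_intent(question: str) -> dict:
    return {"mode": "rag", "label": "Knowledge base", "icon": "book"}

async def retrieve_chunks(question: str, top_k, score_threshold) -> list[dict]:
    return [{"meta": {"title": "example.pdf", "score": 0.82}}]

async def stream_llm(question, chunks, provider, model):
    for token in ("Hello", " ", "world"):
        yield token

async def persist_messages(session_id, question, answer) -> None:
    pass


def sse(payload: dict) -> str:
    """Frame one JSON payload as a single SSE data event."""
    return f"data: {json.dumps(payload)}\n\n"


@router.get("/ask-stream")
async def ask_question_stream(
    question: str,
    session_id: UUID,
    background_tasks: BackgroundTasks,
    provider: str = "ollama",
    model: str = "minimax-m2:cloud",
    top_k: int | None = None,              # None -> rag_top_k from the DB
    score_threshold: float | None = None,  # None -> rag_score_threshold from the DB
):
    async def event_stream():
        # 1. Classify intent and tell the client which mode badge to show.
        intent = await classify_intent(question)
        yield sse({"type": "intent", **intent})

        # 2. Retrieve the most similar chunks from Qdrant.
        chunks = await retrieve_chunks(question, top_k, score_threshold)

        # 3. Stream LLM tokens to the client as they are generated.
        answer = ""
        async for token in stream_llm(question, chunks, provider, model):
            answer += token
            yield sse({"type": "token", "content": token})

        # 4. Emit the sources used, then persist both messages after the
        #    response body has finished streaming.
        yield sse({"type": "sources", "sources": [c["meta"] for c in chunks]})
        background_tasks.add_task(persist_messages, session_id, question, answer)

    return StreamingResponse(event_stream(), media_type="text/event-stream")
```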

SSE Event Types

| `type` field | Payload | When sent |
| --- | --- | --- |
| `"intent"` | `{ mode, label, icon }` | Before retrieval; signals which mode badge to show |
| `"token"` | `{ content: "..." }` | Each LLM output chunk |
| `"sources"` | `{ sources: SourceItem[] }` | After streaming completes |

📤 HTTP Response

| Status | Description |
| --- | --- |
| `200 OK` | Stream opened; SSE begins immediately |
| Connection error | Frontend `use-chat-stream.ts` handles abort/error |
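
The table suggests that failures surface to the client as connection-level errors rather than a distinct HTTP status, with the frontend hook deciding how to present them. A rough Python analogue of that abort/error handling (the real logic lives in `frontend/src/hooks/use-chat-stream.ts`; the URL and parameters here are placeholders):

```python
# Rough Python analogue of the frontend's abort/error handling; the real logic
# lives in frontend/src/hooks/use-chat-stream.ts. URL and params are placeholders.
import httpx

params = {"question": "ping", "session_id": "replace-with-a-session-uuid"}

try:
    with httpx.stream(
        "GET",
        "http://localhost:8000/api/v1/chat/ask-stream",
        params=params,
        timeout=httpx.Timeout(10.0, read=None),  # allow a long-lived read
    ) as resp:
        resp.raise_for_status()  # any non-2xx status fails the turn up front
        for line in resp.iter_lines():
            pass  # hand each SSE line to the event dispatcher
except (httpx.TransportError, httpx.HTTPStatusError) as exc:
    # Dropped connections and error statuses are both treated as a failed turn.
    print(f"stream failed: {exc}")
```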

Implementation

| Role | File |
| --- | --- |
| Router | `backend/app/api/v1/chat.py :: ask_question_stream()` |
| Intent classification | `backend/app/services/intent_service.py` |
| Vector retrieval | `backend/app/services/retrieval_service.py` |
| LLM streaming | `backend/app/services/llm_service.py :: generate_answer_stream()` |
| Message persistence | `backend/app/services/chat_history_service.py :: add_message()` |
| RAG params | `backend/app/services/settings_service.py` (reads the `appsetting` table) |
| Frontend SSE handler | `frontend/src/hooks/use-chat-stream.ts` via `apiStream()` in `lib/api.ts` |

Related

| Role | Link |
| --- | --- |
| Non-streaming ask | API-POST-v1-chat-ask |
| History load | API-GET-v1-chat-history-id |
| RAG settings | DB - appsetting |
| DB message storage | DB - chatmessage |