Chat
Send messages to AI agents and receive streaming or non-streaming responses.
POST /v1/chat

The chat endpoint is the core of the Galadri API. It sends a message to an AI agent, which can use tools (search, data management, booking, etc.) to fulfill the request. Responses stream by default as Server-Sent Events when the agent's Streaming capability is enabled. If streaming is turned off on the agent, the endpoint returns a single JSON response by default.
Try it in the console
You can test chat requests interactively on the agent page in the Galadri Console. Open any agent and use the built-in chat panel to send messages, see tool calls, and watch data mutations in real time.
Request Body
Required fields
| Parameter | Type | Description |
|---|---|---|
agent (required) | string | The agent slug or UUID. Must be an active agent in your organization. |
message (required) | string | The user's message. Maximum 32,000 characters. |
end_user_id (required) | string | Your application's user identifier. Galadri uses this to scope data (vehicles, documents, etc.) to the correct user. If the user does not exist, one is created automatically. |
Optional fields
| Parameter | Type | Description |
|---|---|---|
session_id | string (UUID) | Continue an existing conversation. If omitted, a new session is created. The session must belong to the same organization and end user. |
timezone | string | IANA timezone for this request (e.g., "America/New_York"). The agent uses this to format times and schedule events correctly. Falls back to the timezone stored on the user record. |
system_prompt_append | string | Additional instructions appended to the agent's system prompt. Use this for per-request context like "The user is currently on the vehicle details page for their 2020 Camry." Maximum 4,000 characters. Does not replace the agent's prompt. |
callback_url | string (HTTPS URL) | Enable async mode. When provided, the API returns 202 Accepted immediately with a request_id, then processes the full agent pipeline in the background. On completion or failure, results are POSTed to this URL with an HMAC-SHA256 signature. See Async Mode below. |
Overrides
Per-request overrides are nested under the overrides object. All fields are optional.
| Parameter | Type | Description |
|---|---|---|
overrides.model | string | Override the agent's default model for this request. Use the OpenRouter model ID (e.g., "anthropic/claude-sonnet-4-5"). |
overrides.system_prompt | string | Fully replaces the agent's system prompt for this request. The org environment prompt, time context, entity context, tool reference, and behavioral rules are still included. Only the agent-level prompt is swapped. Maximum 50,000 characters. For appending instructions without replacing the prompt, use the top-level system_prompt_append field instead. |
overrides.tools | object | Override the agent's tool configuration for this request. |
overrides.tools.include | string[] | Only allow these tools (by slug). All other tools are disabled. |
overrides.tools.exclude | string[] | Disable specific tools. All other tools remain enabled. |
overrides.hidden | boolean | If true and a new session is created, mark it as hidden. Hidden sessions do not appear in the console session list. Useful for automated or background interactions. |
overrides.streaming | boolean | Set to false to force a single JSON response instead of SSE. When omitted, the agent's Streaming capability decides the default. |
overrides.thinking | boolean | Request thinking traces when the agent's Thinking Tokens capability is enabled. Set false to disable thinking for this request. Setting true cannot enable thinking for an agent that does not allow it. |
overrides.auto_prompts | boolean | Request suggested follow-up prompts when the agent's Suggested Follow-ups capability is enabled. Set false to disable them for this request. |
{
"agent": "my-agent",
"message": "Find oil change shops near me and check traffic to the closest highly rated option",
"end_user_id": "user-123",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"timezone": "America/New_York",
"system_prompt_append": "The user is viewing their 2020 Toyota Camry. Prefer shops within 5 miles.",
"overrides": {
"model": "anthropic/claude-sonnet-4-5",
"tools": {
"include": ["google-maps-search", "traffic"]
},
"streaming": true,
"thinking": true,
"auto_prompts": true
}
}

Streaming Response (SSE)
When the agent's Streaming capability is enabled, the API returns a streaming response using Server-Sent Events. Each event is a JSON object on a single data: line. The stream ends with data: [DONE].
Event types
| Field | Type | Description |
|---|---|---|
session | object | Sent first. Contains session_id (UUID) and is_new (boolean). Use session_id in subsequent requests to continue the conversation. |
content | object | A chunk of the agent's response text. Concatenate all content events to build the full message. |
thinking | object | A chunk of the agent's reasoning trace. Emitted only when the agent allows thinking and the request does not disable it. |
tool_call | object | The agent is invoking tools. Contains an id and an actions array with tool slugs and arguments. |
tool_result | object | Results from tool execution. Each result has a status of "success", "error", or "initiated" (for fire-and-forget tools). |
data_saved | object | Emitted when the agent creates or updates end-user data. Contains a mutations array with the action ("create" or "update"), table name, and full record. |
usage | object | Sent at the end. Contains prompt_tokens, completion_tokens, thinking_tokens, and cost_cents. |
error | object | An error occurred during processing. Contains an error message string. |
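As the content row above notes, the full assistant message is the concatenation of all content chunks. A minimal sketch of reassembling it from already-parsed events (assembleMessage is a hypothetical helper, not part of any SDK; wire-level parsing of the data: lines is shown in the client example further below):

```javascript
// Rebuild the assistant's full reply from parsed SSE event objects.
// Events of other types (session, tool_call, usage, ...) are ignored.
function assembleMessage(events) {
  return events
    .filter((event) => event.type === "content")
    .map((event) => event.content)
    .join("");
}
```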
{"type": "session", "session_id": "550e8400-e29b-41d4-a716-446655440000", "is_new": true}

{"type": "tool_call", "id": "tc_1", "actions": [
  {"tool": "google-maps-search", "args": {"query": "oil change shops near Phoenix, AZ"}},
  {"tool": "traffic", "args": {"action": "route_traffic", "origin": "Phoenix, AZ", "destination": "Best rated oil change shop near Phoenix, AZ"}}
]}

{"type": "tool_result", "id": "tc_1", "results": [
  {"tool": "google-maps-search", "status": "success", "data": {"results": [...]}},
  {"tool": "traffic", "status": "success", "data": {"route": {...}, "incidents": [...]}}
]}

{"type": "data_saved", "mutations": [
  {
    "action": "create",
    "table": "vehicles",
    "record": {
      "id": "a1b2c3d4-...",
      "year": 2020,
      "make": "Toyota",
      "model": "Camry",
      "trim": "SE",
      "vin": "4T1BF1FK5LU123456",
      "odometer_km": 72420
    }
  }
]}

{"type": "usage", "prompt_tokens": 1250, "completion_tokens": 340, "thinking_tokens": 0, "cost_cents": 42}

JavaScript client example
async function chat(agent, message, endUserId, sessionId) {
const response = await fetch("https://api.galadri.com/v1/chat", {
method: "POST",
headers: {
"Authorization": "Bearer gld_your_api_key",
"Content-Type": "application/json",
},
body: JSON.stringify({
agent,
message,
end_user_id: endUserId,
...(sessionId && { session_id: sessionId }),
}),
});
if (!response.ok) {
const err = await response.json();
throw new Error(err.error);
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let currentSessionId = sessionId;
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split("\n");
buffer = lines.pop() || "";
for (const line of lines) {
if (!line.startsWith("data: ")) continue;
const payload = line.slice(6);
if (payload === "[DONE]") return currentSessionId;
const event = JSON.parse(payload);
switch (event.type) {
case "session":
currentSessionId = event.session_id;
break;
case "content":
process.stdout.write(event.content);
break;
case "tool_call":
console.log("\nUsing tools:", event.actions.map(a => a.tool));
break;
case "data_saved":
console.log("\nData saved:", event.mutations);
break;
case "usage":
console.log("\nCost:", event.cost_cents, "cents");
break;
case "error":
console.error("\nError:", event.error);
break;
}
}
}
return currentSessionId;
}

Non-Streaming Response
Set "overrides": { "streaming": false } to receive a single JSON response. The API will wait until the agent finishes all tool calls and generates a complete response before returning. Agents with Streaming disabled use this JSON response mode by default.
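As a sketch, the request side of a non-streaming call might look like the following. buildChatBody and chatOnce are illustrative helper names, not SDK functions; the endpoint URL and headers follow the streaming client example above:

```javascript
// Build a request body that forces a single JSON response.
function buildChatBody(agent, message, endUserId, options = {}) {
  const body = { agent, message, end_user_id: endUserId };
  if (options.sessionId) body.session_id = options.sessionId;
  // streaming: false overrides the agent's Streaming capability default.
  body.overrides = { ...(options.overrides || {}), streaming: false };
  return body;
}

async function chatOnce(apiKey, agent, message, endUserId, options) {
  const response = await fetch("https://api.galadri.com/v1/chat", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildChatBody(agent, message, endUserId, options)),
  });
  if (!response.ok) throw new Error((await response.json()).error);
  return response.json(); // { session_id, message, tool_calls, usage, ... }
}
```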
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true,
"message": "I found several oil change shops near Phoenix, AZ:\n\n1. **Valvoline** — ...",
"thinking": "The user wants oil change shops. I should search Google Maps...",
"tool_calls": [
{
"actions": [
{"tool": "google-maps-search", "args": {"query": "oil change shops near Phoenix, AZ"}}
],
"results": [
{"tool": "google-maps-search", "status": "success", "data": {"places": [...]}}
]
}
],
"data_saved": [],
"usage": {
"prompt_tokens": 1250,
"completion_tokens": 340,
"thinking_tokens": 128,
"cost_cents": 42
},
"auto_prompts": ["Show EV chargers nearby", "Save this shop to my notes"]
}

Async Mode (Callback Webhook)
For background processing (scheduled runs, push notification triggers, or when the user closes the app mid-reply), include a callback_url in the request body. The API returns 202 Accepted immediately and processes the full agent pipeline in the background.
{
"agent": "my-agent",
"message": "Check the user's upcoming service schedule and send a reminder email",
"end_user_id": "user-123",
"callback_url": "https://your-app.com/webhooks/galadri"
}

The immediate 202 response:

{
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"status": "processing",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true
}

When processing completes, Galadri POSTs the full result to your callback URL:
Callback payloads are finalized background results. They omit thinking traces and send one post-tool assistant message instead of any interim tool-planning text.
{
"event": "chat.completed",
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true,
"message": "I've sent a service reminder email to the user.",
"tool_calls": [...],
"usage": { "prompt_tokens": 980, "completion_tokens": 120, "thinking_tokens": 0, "cost_cents": 18 },
"completed_at": "2026-04-01T14:30:00.000Z"
}

If processing fails, you receive a chat.failed event instead. Callback delivery retries up to 3 times with exponential backoff.
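Because delivery is retried, your endpoint may receive the same request_id more than once, so handling should be idempotent. A minimal dispatch sketch (handleGaladriCallback is a hypothetical name, and the in-memory Set stands in for durable deduplication storage):

```javascript
// Dispatch a Galadri callback payload, skipping duplicate deliveries.
// In production, persist seen request ids instead of using a Set.
const seenRequestIds = new Set();

function handleGaladriCallback(payload) {
  if (seenRequestIds.has(payload.request_id)) {
    return { duplicate: true };
  }
  seenRequestIds.add(payload.request_id);
  if (payload.event === "chat.completed") {
    return { duplicate: false, ok: true, message: payload.message };
  }
  // "chat.failed" (or anything unexpected) is treated as a failure.
  return { duplicate: false, ok: false };
}
```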
Verifying callback signatures
If your agent has a webhook credential configured, the callback includes an X-Galadri-Signature header containing an HMAC-SHA256 hex digest of the request body, signed with your webhook signing secret. Verify this to ensure the callback is from Galadri.
import { createHmac, timingSafeEqual } from "crypto";

function verifySignature(body: string, signature: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest("hex");
  // Compare in constant time to avoid leaking the expected digest via timing.
  const received = Buffer.from(signature, "hex");
  const computed = Buffer.from(expected, "hex");
  return received.length === computed.length && timingSafeEqual(received, computed);
}

Tool Calls
Galadri uses a meta-tool pattern where the AI agent can invoke multiple tools in a single batch. All independent actions execute in parallel, dramatically reducing latency compared to sequential tool calling.
The agent may make multiple tool rounds per request when actions have sequential dependencies. For example, searching for shops, then booking at the best one.
Tool execution modes
Tools run in one of two modes: sync (the agent waits for the result before responding) or fire-and-forget (the agent gets back "status": "initiated" immediately). Fire-and-forget is used for background tasks like sending emails or generating images.
You can override which tools are available per request using overrides.tools: include allows only the listed tools, while exclude disables the listed tools and leaves the rest enabled.
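For example, a request that keeps every tool except a couple of fire-and-forget ones might carry an exclude list (the tool slugs shown here are hypothetical):

```json
{
  "agent": "my-agent",
  "message": "Summarize my maintenance history",
  "end_user_id": "user-123",
  "overrides": {
    "tools": { "exclude": ["email", "image-generation"] }
  }
}
```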
Sessions
Every chat request creates or continues a session. Sessions maintain conversation context across multiple messages, allowing the agent to reference previous interactions.
Starting a new session
Omit session_id to start a new conversation. The response will include the new session_id in the first SSE event (or in the JSON response for non-streaming mode).
Continuing a session
Pass the session_id from a previous response to continue the conversation. The agent has access to the conversation history: by default, up to 10 recent messages and roughly 5,000 recent-message tokens, both adjustable per agent via the auto-compaction limits. The agent can also use the hidden conversation-history search tool to retrieve bounded older context for the same end user when asked about something discussed before. The session and any searched messages must belong to the same organization and end user.
{
"agent": "my-agent",
"message": "Book the first one for Saturday at 10am",
"end_user_id": "user-123",
"session_id": "550e8400-e29b-41d4-a716-446655440000"
}

Hidden sessions
Set "overrides": { "hidden": true } to create a hidden session. Hidden sessions work identically but do not appear in the console session list. Use this for automated or background interactions (e.g., nightly data enrichment tasks, system-initiated conversations).
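A request that starts a hidden session might look like this (the message text is illustrative):

```json
{
  "agent": "my-agent",
  "message": "Nightly enrichment: refresh the user's service intervals",
  "end_user_id": "user-123",
  "overrides": { "hidden": true }
}
```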
Error Handling
Errors can occur at two levels: HTTP-level errors (returned as JSON before streaming starts) and stream-level errors (sent as SSE events during processing).
HTTP errors
These are returned as standard JSON responses with the appropriate HTTP status code. See Authentication errors for the full list.
{
"error": "Agent not found or inactive"
}

Stream errors
If an error occurs after streaming has started (e.g., the model provider returns an error), it is sent as an SSE event. The stream will end after the error event.
data: {"type": "error", "error": "Model provider returned an error. Please try again."}
data: [DONE]

Tool errors are not fatal
If an individual tool fails during execution, the agent receives the error and communicates it to the user. The overall request does not fail. Check tool_result events for per-tool error status.
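A sketch of checking per-tool status from a tool_result event (summarizeToolResults is a hypothetical helper; the event shape follows the SSE event table above):

```javascript
// Partition a tool_result event's results by status so failed or
// fire-and-forget ("initiated") tools can be surfaced separately.
function summarizeToolResults(event) {
  const summary = { success: [], error: [], initiated: [] };
  for (const result of event.results) {
    summary[result.status].push(result.tool);
  }
  return summary;
}
```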