Chat
Send messages to AI agents and receive streaming or non-streaming responses.
POST /v1/chat

The chat endpoint is the core of the Galadri API. It sends a message to an AI agent, which can use tools (search, data management, booking, etc.) to fulfill the request. Responses stream by default as Server-Sent Events when the agent's Streaming capability is enabled. If streaming is turned off on the agent, the endpoint returns a single JSON response by default.
Try it in the console
You can test chat requests interactively on the agent page in the Galadri Console. Open any agent and use the built-in chat panel to send messages, see tool calls, and watch data mutations in real time.
Request Body
Required fields
| Parameter | Type | Description |
|---|---|---|
agent (required) | string | The agent slug or UUID. Must be an active agent in your organization. |
message (required) | string | The user's message. Maximum 32,000 characters. |
end_user_id (required) | string | Your application's user identifier. Galadri uses this to scope data (vehicles, documents, etc.) to the correct user. If the user does not exist, one is created automatically. |
Optional fields
| Parameter | Type | Description |
|---|---|---|
session_id | string (UUID) | Continue an existing conversation. If omitted, a new session is created. The session must belong to the same organization and end user. |
timezone | string | IANA timezone for this request (e.g., "America/New_York"). The agent uses this to format times and schedule events correctly. Falls back to the timezone stored on the user record. |
system_prompt_append | string | Additional instructions appended to the agent's system prompt. Use this for per-request context like "The user is currently on the vehicle details page for their 2020 Camry." Maximum 4,000 characters. Does not replace the agent's prompt. |
callback_url | string (HTTPS URL) | Enable async mode. When provided, the API returns 202 Accepted immediately with a request_id, then processes the full agent pipeline in the background. On completion or failure, results are POSTed to this URL with an HMAC-SHA256 signature. See Async Mode below. |
Overrides
Per-request overrides are nested under the overrides object. All fields are optional.
| Parameter | Type | Description |
|---|---|---|
overrides.model | string | Override the agent's default model for this request. Use the OpenRouter model ID (e.g., "anthropic/claude-sonnet-4-5"). |
overrides.system_prompt | string | Fully replaces the agent's system prompt for this request. The org environment prompt, time context, entity context, tool reference, and behavioral rules are still included. Only the agent-level prompt is swapped. Maximum 50,000 characters. For appending instructions without replacing the prompt, use the top-level system_prompt_append field instead. |
overrides.tools | object | Override the agent's tool configuration for this request. |
overrides.tools.include | string[] | Only allow these tools (by slug). All other tools are disabled. |
overrides.tools.exclude | string[] | Disable specific tools. All other tools remain enabled. |
overrides.hidden | boolean | If true and a new session is created, mark it as hidden. Hidden sessions do not appear in the console session list. Useful for automated or background interactions. |
overrides.streaming | boolean | Set to false to force a single JSON response instead of SSE. When omitted, the agent's Streaming capability decides the default. |
overrides.thinking | boolean | Request thinking traces when the agent's Thinking Tokens capability is enabled. Set false to disable thinking for this request. Setting true cannot enable thinking for an agent that does not allow it. |
overrides.auto_prompts | boolean | Request suggested follow-up prompts when the agent's Suggested Follow-ups capability is enabled. Set false to disable them for this request. |
{
"agent": "my-agent",
"message": "Find oil change shops near me and check traffic to the closest highly rated option",
"end_user_id": "user-123",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"timezone": "America/New_York",
"system_prompt_append": "The user is viewing their 2020 Toyota Camry. Prefer shops within 5 miles.",
"overrides": {
"model": "anthropic/claude-sonnet-4-5",
"tools": {
"include": ["google-maps-search", "traffic"]
},
"streaming": true,
"thinking": true,
"auto_prompts": true
}
}

Streaming Response (SSE)
When the agent's Streaming capability is enabled, the API returns a streaming response using Server-Sent Events. Each event is a JSON object on a single data: line. The stream ends with data: [DONE].
Event types
| Field | Type | Description |
|---|---|---|
session | object | Sent first. Contains session_id (UUID) and is_new (boolean). Use session_id in subsequent requests to continue the conversation. |
content | object | A chunk of the agent's response text. Concatenate all content events to build the full message. |
thinking | object | A chunk of the agent's reasoning trace. Emitted only when the agent allows thinking and the request does not disable it. |
tool_call | object | The agent is invoking tools. Contains an id and an actions array with tool slugs and arguments. |
tool_result | object | Results from tool execution. Each result has a status of "success", "error", or "initiated" (for fire-and-forget tools). |
data_saved | object | Emitted when the agent creates or updates end-user data. Contains a mutations array with the action ("create" or "update"), table name, and full record. |
usage | object | Sent at the end. Contains prompt_tokens, completion_tokens, thinking_tokens, and cost_cents. |
error | object | An error occurred during processing. Contains an error message string. |
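As the content row above notes, the full assistant message is the concatenation of all content chunks. A minimal sketch of reassembling it from already-parsed events (assembleMessage is a hypothetical helper, not part of any SDK; wire-level parsing of the data: lines is shown in the client example further below):

```javascript
// Rebuild the assistant's full reply from parsed SSE event objects.
// Events of other types (session, tool_call, usage, ...) are ignored.
function assembleMessage(events) {
  return events
    .filter((event) => event.type === "content")
    .map((event) => event.content)
    .join("");
}
```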
{"type": "session", "session_id": "550e8400-e29b-41d4-a716-446655440000", "is_new": true}

{"type": "tool_call", "id": "tc_1", "actions": [
  {"tool": "google-maps-search", "args": {"query": "oil change shops near Phoenix, AZ"}},
  {"tool": "traffic", "args": {"action": "route_traffic", "origin": "Phoenix, AZ", "destination": "Best rated oil change shop near Phoenix, AZ"}}
]}

{"type": "tool_result", "id": "tc_1", "results": [
  {"tool": "google-maps-search", "status": "success", "data": {"results": [...]}},
  {"tool": "traffic", "status": "success", "data": {"route": {...}, "incidents": [...]}}
]}

{"type": "data_saved", "mutations": [
  {
    "action": "create",
    "table": "vehicles",
    "record": {
      "id": "a1b2c3d4-...",
      "year": 2020,
      "make": "Toyota",
      "model": "Camry",
      "trim": "SE",
      "vin": "4T1BF1FK5LU123456",
      "odometer_km": 72420
    }
  }
]}

{"type": "usage", "prompt_tokens": 1250, "completion_tokens": 340, "thinking_tokens": 0, "cost_cents": 42}

JavaScript client example
async function chat(agent, message, endUserId, sessionId) {
const response = await fetch("https://api.galadri.com/v1/chat", {
method: "POST",
headers: {
"Authorization": "Bearer gld_your_api_key",
"Content-Type": "application/json",
},
body: JSON.stringify({
agent,
message,
end_user_id: endUserId,
...(sessionId && { session_id: sessionId }),
}),
});
if (!response.ok) {
const err = await response.json();
throw new Error(err.error);
}
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = "";
let currentSessionId = sessionId;
while (true) {
const { done, value } = await reader.read();
if (done) break;
buffer += decoder.decode(value, { stream: true });
const lines = buffer.split("\n");
buffer = lines.pop() || "";
for (const line of lines) {
if (!line.startsWith("data: ")) continue;
const payload = line.slice(6);
if (payload === "[DONE]") return currentSessionId;
const event = JSON.parse(payload);
switch (event.type) {
case "session":
currentSessionId = event.session_id;
break;
case "content":
process.stdout.write(event.content);
break;
case "tool_call":
console.log("\nUsing tools:", event.actions.map(a => a.tool));
break;
case "data_saved":
console.log("\nData saved:", event.mutations);
break;
case "usage":
console.log("\nCost:", event.cost_cents, "cents");
break;
case "error":
console.error("\nError:", event.error);
break;
}
}
}
return currentSessionId;
}

Non-Streaming Response
Set "overrides": { "streaming": false } to receive a single JSON response. The API will wait until the agent finishes all tool calls and generates a complete response before returning. Agents with Streaming disabled use this JSON response mode by default.
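As a sketch, the request side of a non-streaming call might look like the following. buildChatBody and chatOnce are illustrative helper names, not SDK functions; the endpoint URL and headers follow the streaming client example above:

```javascript
// Build a request body that forces a single JSON response.
function buildChatBody(agent, message, endUserId, options = {}) {
  const body = { agent, message, end_user_id: endUserId };
  if (options.sessionId) body.session_id = options.sessionId;
  // streaming: false overrides the agent's Streaming capability default.
  body.overrides = { ...(options.overrides || {}), streaming: false };
  return body;
}

async function chatOnce(apiKey, agent, message, endUserId, options) {
  const response = await fetch("https://api.galadri.com/v1/chat", {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(buildChatBody(agent, message, endUserId, options)),
  });
  if (!response.ok) throw new Error((await response.json()).error);
  return response.json(); // { session_id, message, tool_calls, usage, ... }
}
```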
{
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true,
"message": "I found several oil change shops near Phoenix, AZ:\n\n1. **Valvoline** — ...",
"thinking": "The user wants oil change shops. I should search Google Maps...",
"tool_calls": [
{
"actions": [
{"tool": "google-maps-search", "args": {"query": "oil change shops near Phoenix, AZ"}}
],
"results": [
{"tool": "google-maps-search", "status": "success", "data": {"places": [...]}}
]
}
],
"data_saved": [],
"usage": {
"prompt_tokens": 1250,
"completion_tokens": 340,
"thinking_tokens": 128,
"cost_cents": 42
},
"auto_prompts": ["Show EV chargers nearby", "Save this shop to my notes"]
}

Async Mode (Callback Webhook)
For background processing (scheduled runs, push notification triggers, or when the user closes the app mid-reply), include a callback_url in the request body. The API returns 202 Accepted immediately and processes the full agent pipeline in the background.
{
"agent": "my-agent",
"message": "Check the user's upcoming service schedule and send a reminder email",
"end_user_id": "user-123",
"callback_url": "https://your-app.com/webhooks/galadri"
}

The immediate 202 response:

{
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"status": "processing",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true
}

When processing completes, Galadri POSTs the full result to your callback URL:
Callback payloads are finalized background results. They omit thinking traces and send one post-tool assistant message instead of any interim tool-planning text.
{
"event": "chat.completed",
"request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
"session_id": "550e8400-e29b-41d4-a716-446655440000",
"is_new_session": true,
"message": "I've sent a service reminder email to the user.",
"tool_calls": [...],
"usage": { "prompt_tokens": 980, "completion_tokens": 120, "thinking_tokens": 0, "cost_cents": 18 },
"completed_at": "2026-04-01T14:30:00.000Z"
}

If processing fails, you receive a chat.failed event instead. Callback delivery retries up to 3 times with exponential backoff.
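Because delivery is retried, your endpoint may receive the same request_id more than once, so handling should be idempotent. A minimal dispatch sketch (handleGaladriCallback is a hypothetical name, and the in-memory Set stands in for durable deduplication storage):

```javascript
// Dispatch a Galadri callback payload, skipping duplicate deliveries.
// In production, persist seen request ids instead of using a Set.
const seenRequestIds = new Set();

function handleGaladriCallback(payload) {
  if (seenRequestIds.has(payload.request_id)) {
    return { duplicate: true };
  }
  seenRequestIds.add(payload.request_id);
  if (payload.event === "chat.completed") {
    return { duplicate: false, ok: true, message: payload.message };
  }
  // "chat.failed" (or anything unexpected) is treated as a failure.
  return { duplicate: false, ok: false };
}
```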
Verifying callback signatures
If your agent has a webhook credential configured, the callback includes an X-Galadri-Signature header containing an HMAC-SHA256 hex digest of the request body, signed with your webhook signing secret. Verify this to ensure the callback is from Galadri.
import { createHmac, timingSafeEqual } from "crypto";

function verifySignature(body: string, signature: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(body).digest("hex");
  // Compare in constant time to avoid leaking the expected digest via timing.
  const received = Buffer.from(signature, "hex");
  const computed = Buffer.from(expected, "hex");
  return received.length === computed.length && timingSafeEqual(received, computed);
}

Tool Calls
Galadri uses a meta-tool pattern where the AI agent can invoke multiple tools in a single batch. All independent actions execute in parallel, dramatically reducing latency compared to sequential tool calling.
The agent may make multiple tool rounds per request when actions have sequential dependencies. For example, searching for shops, then booking at the best one.
Tool execution modes
Tools run in one of two modes: sync (the agent waits for the result before responding) or fire-and-forget (the agent gets back "status": "initiated" immediately). Fire-and-forget is used for background tasks like sending emails or generating images.
You can override which tools are available per request using overrides.tools: include allows only the listed tools, while exclude disables the listed tools and leaves the rest enabled.
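For example, a request that keeps every tool except a couple of fire-and-forget ones might carry an exclude list (the tool slugs shown here are hypothetical):

```json
{
  "agent": "my-agent",
  "message": "Summarize my maintenance history",
  "end_user_id": "user-123",
  "overrides": {
    "tools": { "exclude": ["email", "image-generation"] }
  }
}
```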
Sessions
Every chat request creates or continues a session. Sessions maintain conversation context across multiple messages, allowing the agent to reference previous interactions.
Starting a new session
Omit session_id to start a new conversation. The response will include the new session_id in the first SSE event (or in the JSON response for non-streaming mode).
Continuing a session
Pass the session_id from a previous response to continue the conversation. The agent has access to the conversation history: by default, up to 10 recent messages and roughly 5,000 recent-message tokens, both adjustable per agent via the auto-compaction limits. The agent can also use the hidden conversation-history search tool to retrieve bounded older context for the same end user when asked about something discussed before. The session and any searched messages must belong to the same organization and end user.
{
"agent": "my-agent",
"message": "Book the first one for Saturday at 10am",
"end_user_id": "user-123",
"session_id": "550e8400-e29b-41d4-a716-446655440000"
}

Hidden sessions
Set "overrides": { "hidden": true } to create a hidden session. Hidden sessions work identically but do not appear in the console session list. Use this for automated or background interactions (e.g., nightly data enrichment tasks, system-initiated conversations).
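A request that starts a hidden session might look like this (the message text is illustrative):

```json
{
  "agent": "my-agent",
  "message": "Nightly enrichment: refresh the user's service intervals",
  "end_user_id": "user-123",
  "overrides": { "hidden": true }
}
```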
Error Handling
Errors can occur at two levels: HTTP-level errors (returned as JSON before streaming starts) and stream-level errors (sent as SSE events during processing).
HTTP errors
These are returned as standard JSON responses with the appropriate HTTP status code. See Authentication errors for the full list.
{
"error": "Agent not found or inactive"
}

Stream errors
If an error occurs after streaming has started (e.g., the model provider returns an error), it is sent as an SSE event. The stream will end after the error event.
data: {"type": "error", "error": "Model provider returned an error. Please try again."}
data: [DONE]

Tool errors are not fatal
If an individual tool fails during execution, the agent receives the error and communicates it to the user. The overall request does not fail. Check tool_result events for per-tool error status.
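A sketch of checking per-tool status from a tool_result event (summarizeToolResults is a hypothetical helper; the event shape follows the SSE event table above):

```javascript
// Partition a tool_result event's results by status so failed or
// fire-and-forget ("initiated") tools can be surfaced separately.
function summarizeToolResults(event) {
  const summary = { success: [], error: [], initiated: [] };
  for (const result of event.results) {
    summary[result.status].push(result.tool);
  }
  return summary;
}
```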