Whenever you deploy a workload on Blaxel, an inference endpoint is generated on the Global Inference Network, the infrastructure that hosts it.

The inference API URL depends on the type of workload (agent, model API, function) you are requesting:


POST run.blaxel.ai/{YOUR-WORKSPACE}/agents/{YOUR-AGENT}

Here is the full request, including the input payload:


curl -X POST "https://run.blaxel.ai/{your-workspace}/agents/{your-agent}" \
-H 'Content-Type: application/json' \
-H "X-Blaxel-Authorization: Bearer <YOUR_API_KEY>" \
-d '{"inputs":"Hello, world!"}'

Manage sessions

To simulate multi-turn conversations, you can include a thread ID in your agent request header. You’ll need to generate this ID yourself and pass it using either X-Blaxel-Thread-Id or Thread-Id. Without a thread ID, the agent won’t maintain or use any conversation memory when processing the request.

This is only available for agent requests.

Query agent with thread ID

curl -X POST "https://run.blaxel.ai/{your-workspace}/agents/{your-agent}" \
-H 'Content-Type: application/json' \
-H "X-Blaxel-Authorization: Bearer <YOUR_API_KEY>" \
-H "X-Blaxel-Thread-Id: <THREAD_ID>" \
-d '{"inputs":"Hello, world!"}'

Invoke pre-built functions

Pre-built functions are Model Context Protocol (MCP) servers deployed on Blaxel. They provide a toolkit of multiple tools: individual capabilities for accessing specific APIs or databases. You can interact with these functions using the WebSocket protocol.
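For orientation, this is roughly the shape of message an MCP client sends over such a connection. MCP is built on JSON-RPC 2.0 with a `tools/call` method; the tool name and arguments below are hypothetical, and actually sending the message over WebSocket is omitted here:

```python
import json

def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical tool and arguments, for illustration only.
message = make_tool_call(1, "search", {"query": "hello"})
```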

Read more about how to call pre-built functions in this documentation.

Product documentation

Read our product guide on querying an agent.