Run inferences on your Blaxel deployments.
Whenever you deploy a workload on Blaxel, an inference endpoint is generated on the Global Agentics Network, the infrastructure that hosts it.
The inference API URL depends on the type of workload (agent, model API, MCP server) you are calling.
A full request also carries the input payload.
MCP servers (Model Context Protocol) provide a toolkit of multiple capabilities for agents. You can interact with these servers through Blaxel’s WebSocket transport implementation on the server’s global endpoint.
Learn how to run invocation requests on your MCP server.
To simulate multi-turn conversations, you can pass a thread ID in the request headers. Your client needs to generate this ID and pass it using any header that you can retrieve in your code (e.g. Thread-Id). Without a thread ID, the agent neither maintains nor uses any conversation memory when processing the request.
This is only available for agent requests.
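The thread ID mechanism described above can be sketched as follows. This is a minimal illustration, not Blaxel's reference client: the endpoint URL is a placeholder, the `inputs` payload key is an assumption, the helper names `new_thread_id` and `build_agent_request` are hypothetical, and authentication is left out entirely. The sketch only builds the requests and does not send them.

```python
import json
import urllib.request
import uuid

# Placeholder URL (assumption): replace with your own agent's inference endpoint.
AGENT_URL = "https://example.invalid/agents/my-agent"


def new_thread_id() -> str:
    """Generate a client-side ID that identifies one conversation."""
    return str(uuid.uuid4())


def build_agent_request(url, payload, thread_id=None, header_name="Thread-Id"):
    """Build a POST request, attaching the thread ID header when one is given.

    The header name is configurable because any header your agent code
    reads can carry the ID; Thread-Id is just the example used here.
    """
    headers = {"Content-Type": "application/json"}
    if thread_id is not None:
        headers[header_name] = thread_id
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Reuse the same ID for every turn of one conversation.
thread_id = new_thread_id()
first = build_agent_request(AGENT_URL, {"inputs": "My name is Ada."}, thread_id)
second = build_agent_request(AGENT_URL, {"inputs": "What is my name?"}, thread_id)

# To send a request for real, pass it to urllib.request.urlopen(first)
# after adding whatever authentication your workspace requires.
```

Both requests carry the same header value, which is what lets the agent associate the second turn with the first. Generating a fresh ID starts a new conversation, and omitting the ID gives a stateless request.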