Blaxel provides serverless infrastructure for deploying MCP servers instantly. Each deployment receives a global inference endpoint, and workloads are served in a way that reduces cold-start time and latency. The main way to deploy an MCP server on Blaxel is with the Blaxel CLI.

Deploy an MCP server with Blaxel CLI

This section assumes you have developed the MCP server locally, as explained in this documentation, and are ready to deploy it.

Serve locally

Blaxel offers you a way to serve locally either:

Deploy on production

You can deploy the MCP server in order to make the entrypoint function (by default: server.ts / server.py) available on a global hosted endpoint. When deploying to Blaxel, you get a dedicated endpoint that enforces your deployment policies. Run the following command to build and deploy the MCP server on Blaxel:
bl deploy
You can now connect to the MCP server either from an agent on Blaxel (using the Blaxel SDK), or from an external client that supports WebSockets transport.

Connect to an MCP server

Learn how to run tool calls through your MCP server.

Customize an MCP server deployment

You can set custom parameters for an MCP server deployment (for example, the server name) in the blaxel.toml file at the root of your directory. For more information on MCP deployment settings, refer to the deployment reference at the end of this guide.

Deploy with a Dockerfile

While Blaxel uses predefined, optimized container images to build and deploy your code, you can also deploy your workload using your own Dockerfile.

Deploy using Dockerfile

Deploy resources using a custom Dockerfile.

Reference for deployment life-cycle

Choosing the infrastructure generation

Blaxel offers two infrastructure generations. When deploying a workload, you can select between Mk 2 infrastructure—which provides stable, globally distributed container-based workloads—and Mk 3 (in Alpha), which delivers ultra-fast cold starts. Choose the generation that best fits your specific requirements.

Maximum runtime

  • Deployed MCP servers have a runtime limit after which executions time out. This timeout duration is determined by your chosen infrastructure generation. For Mk 2 generation, the maximum timeout is 10 minutes.

Manage revisions

As you iterate, you will need to update the version of a function that is currently deployed and used by your consumers. Each time you build a new version of your function, Blaxel creates a revision, and it stores the 10 latest revisions for each object.

Revisions are atomic builds of your deployment that can be either deployed (accessible via the inference endpoint) or not. This system enables you to:
  • roll back a deployment to its exact state from an earlier date
  • create a revision without immediate deployment to prepare for a future release
  • implement progressive rollout strategies, such as canary deployments
Important: Revisions are not the same as versions. You cannot use revisions to return to a previous configuration and branch off from it. For version control, use your preferred system (such as GitHub) alongside Blaxel.

Deployment revisions are updated following a blue-green paradigm. The Global Inference Network will wait for the new revision to be completely up and ready before routing requests to the new deployment. You can also set up a canary deployment to split traffic between two revisions (maximum of two).
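To make the canary idea concrete, here is a minimal Python sketch of weighted routing between at most two revisions. It is an illustration only, not Blaxel's actual routing code: the function name pick_revision, the revision names, and the rule that percentages sum to 100 are all assumptions made for the example.

```python
import random


def pick_revision(weights: dict[str, int], rng: random.Random) -> str:
    """Pick which revision serves one request, given traffic percentages.

    Hypothetical model: `weights` maps a revision name to its share of
    traffic (0-100). At most two revisions may share traffic.
    """
    if not 1 <= len(weights) <= 2:
        raise ValueError("traffic can be split across at most two revisions")
    if sum(weights.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    names = list(weights)
    # Weighted random choice: a revision with weight 0 is never selected.
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]
```

With a 90/10 split, roughly one request in ten would reach the canary revision; with 100/0, every request stays on the first revision.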
When making a deployment using Blaxel CLI (bl deploy), the new traffic routing depends on the --traffic option. Without this option specified, Blaxel will automatically deploy the new revision with full traffic (100%) if the previous deployment was the latest revision. Otherwise, it will create the revision without deploying it (0% traffic).
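The default routing rule described above can be summarized as a small decision function. This Python sketch only models the behavior stated in this guide; the function name default_traffic and the assumption that --traffic takes an integer percentage are hypothetical and not taken from the CLI reference.

```python
from typing import Optional


def default_traffic(traffic_option: Optional[int], deployed_is_latest: bool) -> int:
    """Return the traffic percentage a newly built revision would receive.

    traffic_option: value passed to --traffic (assumed to be a percentage),
        or None when the option is omitted.
    deployed_is_latest: True if the revision currently serving traffic was
        the latest revision before this build.
    """
    if traffic_option is not None:
        # An explicit --traffic value takes precedence over the default rule.
        return traffic_option
    # No option given: full rollout only if the deployed revision was the latest.
    return 100 if deployed_is_latest else 0
```

In practice this means that if you previously rolled back to an older revision, a plain bl deploy creates the new revision without shifting traffic to it.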

Deployment reference

You configure the MCP server deployment through the blaxel.toml file in your MCP server directory. The only mandatory parameter is type, which tells Blaxel which kind of entity to deploy. The other parameters are optional and let you customize the deployment.
name = "my-mcp-server"
workspace = "my-workspace"
type = "function"

[env]
DEFAULT_CITY = "San Francisco"
  • name, workspace, and type fields are optional and serve as default values. Any bl command run in the folder will use these defaults rather than prompting you for input.
  • [env] section defines environment variables that the MCP server can access via the SDK. Note that these are NOT secrets.
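As an illustration of consuming such a value in a Python MCP server, the sketch below reads DEFAULT_CITY with a fallback. It assumes that values from the [env] section are exposed as ordinary process environment variables; this guide says they are accessible via the SDK, and the SDK's own accessor is not shown here. The helper name default_city is hypothetical.

```python
import os


def default_city(fallback: str = "San Francisco") -> str:
    """Return the configured default city, or a fallback if it is unset.

    Assumption: the [env] entries from blaxel.toml are visible to the
    process as regular environment variables.
    """
    return os.environ.get("DEFAULT_CITY", fallback)
```

Because these values are not secrets, keep credentials and tokens out of the [env] section.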

Deployment manifests (advanced usage)

When bl deploy runs, it generates a YAML configuration manifest automatically and deploys it to Blaxel’s hosting infrastructure. You can also create custom manifest files in the .blaxel folder and deploy them using the following command:
bl apply -f ./my-deployment.yaml
Read our reference for MCP server deployments.

Query MCP servers

Learn how to run tool calls on your MCP server.