What about Mark 1
Mark 1 infrastructure was originally designed for serverless ML model inference but proved inadequate for running agentic workloads. Built on Knative Custom Resource Definitions (CRDs) running atop managed Kubernetes clusters, it leveraged Knative Serving’s scale-to-zero capabilities and Kubernetes’ container orchestration features. The infrastructure handled pod autoscaling through the Knative Pod Autoscaler (KPA). It also allowed federating multiple clusters via a Blaxel agent that offloaded inference requests from one Knative cluster to another based on a usage metric. While it demonstrated reasonable stability even at 20+ requests per second and achieved passable cold starts through runtime optimization, its architecture wasn’t suited to the more lightweight workloads that make up most agentic activity: tool calls, agent orchestration, and external model routing. Mark 1 infrastructure was decommissioned in January 2025.
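For illustration, here is a minimal sketch of the kind of scale-to-zero Knative Service that an architecture like Mark 1 relies on, created through the Kubernetes Python client. The service name, namespace, image, and scaling values are hypothetical placeholders, not Blaxel's actual configuration; the autoscaling annotations themselves are standard Knative Serving settings.

```python
# Minimal sketch: creating a scale-to-zero Knative Service via the
# Kubernetes Python client. Names, namespace, image, and scaling
# values below are hypothetical examples.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

service = {
    "apiVersion": "serving.knative.dev/v1",
    "kind": "Service",
    "metadata": {"name": "inference-demo", "namespace": "default"},
    "spec": {
        "template": {
            "metadata": {
                "annotations": {
                    # Standard Knative autoscaling annotations: allow the
                    # revision to scale to zero when idle and back up under load.
                    "autoscaling.knative.dev/min-scale": "0",
                    "autoscaling.knative.dev/max-scale": "10",
                    # Target concurrent requests per pod, used by the
                    # Knative Pod Autoscaler (KPA) to make scaling decisions.
                    "autoscaling.knative.dev/target": "20",
                }
            },
            "spec": {
                "containers": [
                    # Hypothetical container image for an inference workload.
                    {"image": "ghcr.io/example/inference-model:latest"}
                ]
            },
        }
    },
}

# Knative Services are CRDs, so they are created through the
# custom objects API rather than the core API.
client.CustomObjectsApi().create_namespaced_custom_object(
    group="serving.knative.dev",
    version="v1",
    namespace="default",
    plural="services",
    body=service,
)
```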
What about Mark 2
Mark 2 infrastructure ran workloads in containers, emulating most Linux system calls. Cold starts typically took between 2 and 10 seconds. After a deployment was queried, it stayed warm for a period that varied with overall infrastructure usage, allowing it to serve subsequent requests instantly. Mark 2 infrastructure was suitable when:
- your workload required system calls not supported by Mark 3 infrastructure
- boot times of around 5 seconds were suitable for your needs
- your deployment received consistent traffic that kept it warm (see the keep-warm sketch after this list)
- you needed to run workloads in specific regions for sovereignty or regulatory compliance, using deployment policies
- you required revision control for rollbacks or canary deployments
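Because a warm instance served requests instantly while an idle one incurred the 2-10 second cold start, a common workaround for low-traffic deployments was synthetic keep-warm traffic. Below is a minimal sketch of that pattern, assuming a hypothetical deployment URL and an assumed ping interval shorter than the idle timeout; it is not a Blaxel-provided tool.

```python
# Minimal keep-warm sketch: ping a deployment on an interval shorter
# than its idle timeout so real requests avoid the 2-10 s cold start.
# The URL and interval below are hypothetical assumptions.
import time
import urllib.request

DEPLOYMENT_URL = "https://my-deployment.example.com/health"  # hypothetical endpoint
PING_INTERVAL_SECONDS = 60  # assumed shorter than the deployment's idle timeout

while True:
    try:
        with urllib.request.urlopen(DEPLOYMENT_URL, timeout=10) as resp:
            print(f"keep-warm ping: HTTP {resp.status}")
    except Exception as exc:
        # A failed ping may mean the instance went cold; the next
        # successful request will trigger a cold start.
        print(f"keep-warm ping failed: {exc}")
    time.sleep(PING_INTERVAL_SECONDS)
```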
