Instances

Placement previews, launch flows, instance lookup, and lifecycle management for running models.

📄️ Quick-launch a model placement

Place and launch a model, with Skulk choosing a valid concrete placement from the requested sharding, instance metadata, and minimum-node constraints. The placement is validated against the current cluster state before the command is forwarded. An impossible placement returns 400 with the specific reason (for example: no connected cycle, exclusions removed every candidate, or a node cannot fit its shard with runtime headroom). If node memory info is still being gathered because the cluster has just formed, the request waits up to 15 seconds for it; if the info is still unavailable after that, the request returns 503 and should be retried shortly.
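
A client might treat the two failure codes differently: 400 is final for the current cluster state, while 503 is worth retrying after a short pause. The sketch below is one minimal way to do that, not Skulk's own client. The `send` callable, the `PlacementRejected` exception, and the retry count and delay are all hypothetical names and values chosen for illustration. The endpoint path and request body are not given in this section, so they are left to the caller.

```python
import time
from typing import Callable, Tuple

# Hypothetical transport: performs one launch request and returns
# (HTTP status code, response body text). Not part of Skulk's API.
Send = Callable[[], Tuple[int, str]]


class PlacementRejected(Exception):
    """Raised on 400: the placement is impossible for the current cluster."""


def launch_with_retry(
    send: Send,
    attempts: int = 5,        # assumed value, tune for your deployment
    delay_s: float = 2.0,     # assumed pause between 503 retries
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Send a launch request, retrying only on 503."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        status, body = send()
        if 200 <= status < 300:
            return body
        if status == 400:
            # Retrying cannot help; the body carries the specific reason.
            raise PlacementRejected(body)
        if status == 503 and attempt < attempts - 1:
            # Node memory info may still be gathering; pause and retry.
            sleep(delay_s)
            continue
        raise RuntimeError(f"launch failed with HTTP {status}: {body}")
    raise RuntimeError("retry loop exited without a response")
```

In practice `send` would wrap a single HTTP call to the quick-launch endpoint. Keeping the transport injectable means the retry policy can be exercised without a running cluster.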