Quick-launch a model placement

POST /place_instance

Place and launch a model, letting Skulk choose a valid concrete placement from the requested sharding, instance metadata, and minimum-node constraints. The placement is validated against the current cluster state before the command is forwarded. An impossible placement returns 400 with the specific reason (no connected cycle, exclusions removed every candidate, a node cannot fit its shard with runtime headroom, ...). If node memory info is still being gathered because the cluster has just formed, the request waits up to 15 seconds for it; if the info is still unavailable after that, the request returns 503 and should be retried shortly.
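The 400 and 503 behavior described above suggests a client that treats 400 as a permanent rejection and 503 as a signal to retry. The following is a minimal sketch of such a client, not an official one. The helper names (`post_json`, `place_instance`), the retry count, and the delay are assumptions chosen for illustration, and the request body is passed through unchanged because its schema is not shown here. The client-side timeout is set above 15 seconds so that the server's own wait can finish first.

```python
import json
import time
import urllib.error
import urllib.request


def post_json(url, payload, timeout=30.0):
    """POST payload as JSON and return (status_code, body_text).

    The 30 s default is an assumption: it only needs to exceed the
    15 s the server may wait for node memory info.
    """
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses arrive as exceptions; turn them into a status.
        return err.code, err.read().decode("utf-8")


def place_instance(base_url, payload, send=post_json,
                   max_attempts=4, retry_delay=2.0, sleep=time.sleep):
    """Call POST /place_instance, retrying only on 503.

    400 means the placement is impossible, so it is raised immediately
    with the server's reason. 503 means cluster info is still being
    gathered, so the call is repeated after a short pause.
    """
    url = base_url.rstrip("/") + "/place_instance"
    for attempt in range(1, max_attempts + 1):
        status, body = send(url, payload)
        if status == 503 and attempt < max_attempts:
            sleep(retry_delay)  # hypothetical delay; tune as needed
            continue
        if status == 400:
            raise ValueError(f"placement rejected: {body}")
        if status >= 400:
            raise RuntimeError(f"place_instance failed with {status}: {body}")
        return body
```

A call might look like `place_instance("http://localhost:8000", payload)`, where both the base URL and the contents of `payload` are placeholders to be replaced with the values from the Request schema. Injecting `send` and `sleep` keeps the retry logic testable without a running cluster.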

Request

Responses

Successful Response