VisionCardConfig
Vision configuration attached to a model card for VLM support.
Populated from the [vision] section of a TOML model card or
auto-detected from config.json during card creation.
imageTokenId object
Token id the model uses as the image placeholder in the prompt. Required by
the MLX vision path, which splices image embeddings in at this token. null
is allowed for a llama.cpp-only vision GGUF, whose chat handler inserts image
features itself and never reads this field. MLX cards always set it (from
config.json).
- integer
- null
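The nullability rule above can be sketched as a small check. This is only an illustration, not the project's validation code: the function name check_image_token_id and the backend labels "mlx" and "llama.cpp" are assumptions made for the example.

```python
from typing import Optional


def check_image_token_id(image_token_id: Optional[int], backend: str) -> None:
    """Raise ValueError when the MLX path is requested without an image token id.

    `backend` uses hypothetical labels ("mlx", "llama.cpp") chosen for this sketch.
    """
    if backend == "mlx" and image_token_id is None:
        # The MLX vision path splices image embeddings at this token,
        # so it cannot run without the id.
        raise ValueError("imageTokenId is required for the MLX vision path")
    # For a llama.cpp-only vision GGUF the chat handler inserts image
    # features itself, so a missing id is accepted.


check_image_token_id(32000, "mlx")       # 32000 is an arbitrary example id
check_image_token_id(None, "llama.cpp")  # allowed: the field is never read
```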
modelType string
Vision model-type tag (from config.json's vision_config), selecting
the image processor (MLX) or chat handler (llama.cpp). Empty when a bare GGUF
repo only signals vision via its mmproj projector; the llama.cpp runner
then falls back to its general multimodal handler.
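The fallback described for an empty tag might look roughly like the following. The handler names, the registry contents, and the function pick_chat_handler are all hypothetical; how an unrecognized non-empty tag is handled is not stated above, so this sketch simply raises in that case.

```python
from typing import Dict

# Hypothetical name for the runner's general multimodal handler.
GENERAL_HANDLER = "general-multimodal"


def pick_chat_handler(model_type: str, handlers: Dict[str, str]) -> str:
    """Return the handler registered for `model_type`, or the general one if empty."""
    if not model_type:
        # Bare GGUF repo: vision is signalled only by the mmproj projector.
        return GENERAL_HANDLER
    if model_type not in handlers:
        # Assumption of this sketch: unknown tags are an error.
        raise KeyError(f"no chat handler registered for {model_type!r}")
    return handlers[model_type]


# Placeholder registry; real tags and handler names will differ.
registry = {"example_vlm": "example-vlm-handler"}
chosen = pick_chat_handler("", registry)
```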
weightsRepo string
Repo holding the vision-tower weights when separate from the LM; empty if bundled with the main weights.
imageToken object
The literal image placeholder string, when distinct from image_token_id.
- string
- null
processorRepo object
Repo providing the image processor/preprocessor config, if not the main repo.
- string
- null
boiTokenId object
Begin-of-image token id, for families that bracket image spans.
- integer
- null
eoiTokenId object
End-of-image token id, for families that bracket image spans.
- integer
- null
{
"imageTokenId": 0,
"modelType": "",
"weightsRepo": "",
"imageToken": "string",
"processorRepo": "string",
"boiTokenId": 0,
"eoiTokenId": 0
}
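As a rough illustration of the auto-detection path mentioned at the top, the sketch below builds an object of this shape from a parsed config.json. The function name detect_vision_card and the config keys image_token_id and vision_config.model_type are assumptions; actual key names vary between model families, and the real detection logic may differ.

```python
import json
from typing import Any, Dict, Optional


def detect_vision_card(config: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Build a VisionCardConfig-shaped dict from config.json, or None if text-only."""
    vision_config = config.get("vision_config")
    if not isinstance(vision_config, dict):
        return None  # no vision_config section: treat the model as text-only
    return {
        # Assumed key name; some families store the id under another key.
        "imageTokenId": config.get("image_token_id"),
        "modelType": vision_config.get("model_type") or "",
        "weightsRepo": "",       # empty: weights bundled with the main repo
        "imageToken": None,      # not detected in this sketch
        "processorRepo": None,   # not detected in this sketch
        "boiTokenId": None,      # not detected in this sketch
        "eoiTokenId": None,      # not detected in this sketch
    }


# Made-up config values, used only to show the call.
example = {"image_token_id": 32000, "vision_config": {"model_type": "example_vlm"}}
print(json.dumps(detect_vision_card(example), indent=2))
```

Returning None for a config without vision_config keeps text-only cards free of an empty vision section; that choice is part of the sketch, not something the schema requires.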