CLI Reference
The skulk-weights CLI has three jobs:
- inspect the merged catalog
- check the local environment
- plan or run one weight publication
Those jobs keep publication reviewable before LARQL extracts large weight directories. The published artifacts are the stable objects Skulk can later place across GPU inference nodes and CPU/high-memory weight-serving nodes.
catalogue is a legacy alias for catalog; every catalog subcommand below
also works under catalogue.
Global Options
Global options come before the subcommand. --config and --manifest are
mutually exclusive.
--config PATH loads PATH as skulk-weights.yaml. The built-in Foxlight
catalog is still included, and the config can add operator catalog sources.
skulk-weights --config skulk-weights.yaml catalog validate
--manifest PATH is a legacy single-file mode. It bypasses the merged
catalog and reads one manifest source directly.
skulk-weights --manifest /path/to/models.yaml catalog validate
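The mutual-exclusion rule can be illustrated with a small argparse sketch. This is not the tool's actual parser: `build_parser` is a hypothetical helper, and only the two global options and the catalog/catalogue alias described above are modeled.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser modeling only the documented global options."""
    parser = argparse.ArgumentParser(prog="skulk-weights")

    # Global options live on the top-level parser, so they must come
    # before the subcommand. The group rejects --config with --manifest.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--config", metavar="PATH")
    source.add_argument("--manifest", metavar="PATH")

    sub = parser.add_subparsers(dest="command")
    # "catalogue" is accepted as an alias for "catalog".
    catalog = sub.add_parser("catalog", aliases=["catalogue"])
    catalog.add_argument("action")
    return parser


args = build_parser().parse_args(
    ["--config", "skulk-weights.yaml", "catalog", "validate"]
)
```

Passing both options to a parser built this way makes argparse print a usage error and exit with status 2.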
skulk-weights catalog validate
Validates the effective catalog. Run this after changing
skulk-weights.yaml or any source file.
Example:
skulk-weights catalog validate
skulk-weights --config skulk-weights.yaml catalog validate
skulk-weights catalog sources
Prints the sources that contributed entries to the merged catalog.
Example:
skulk-weights catalog sources
skulk-weights catalog list
Lists effective catalog keys. Use this to see which entries are available by tier.
Options:
- --tier all
- --tier smoke
- --tier moe
Example:
skulk-weights catalog list --tier smoke
skulk-weights catalog show KEY
Prints one catalog entry as JSON. Use this when you want to inspect exactly what a namespaced key resolves to before publishing.
Example:
skulk-weights catalog show foxlight/gemma-3-4b-full-q4-k
skulk-weights catalog find HF_URL_OR_OWNER/REPO
Reverse lookup: given an upstream source model, prints every catalog entry
whose source_model matches. The lookup is one-to-many — a single source model
can map to several entries (for example a full slice plus an expert-server
slice) — so each matching entry is printed as a JSON object, one per line.
Accepts either a bare owner/repo string or a full https://huggingface.co/...
URL. Respects --config, so it searches operator sources too.
Exits 1 with a stderr message when no entry matches the given source model.
Example:
skulk-weights catalog find google/gemma-3-4b-it
skulk-weights catalog find https://huggingface.co/google/gemma-3-4b-it
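A script that wraps the reverse lookup might normalize its input and read the one-object-per-line output roughly as follows. This is a sketch under assumptions: `normalize_source_model` and `parse_find_output` are hypothetical helpers, not part of skulk-weights, and the normalization rule only illustrates how a URL and a bare owner/repo string can refer to the same model.

```python
import json
from urllib.parse import urlparse


def normalize_source_model(value: str) -> str:
    """Reduce a Hugging Face URL or a bare owner/repo string to owner/repo."""
    if value.startswith(("http://", "https://")):
        path = urlparse(value).path
    else:
        path = value
    parts = [part for part in path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected owner/repo, got {value!r}")
    return f"{parts[0]}/{parts[1]}"


def parse_find_output(stdout: str) -> list[dict]:
    """Parse one JSON object per non-empty line, as the command prints them."""
    return [json.loads(line) for line in stdout.splitlines() if line.strip()]
```

Because the lookup is one-to-many, the parser returns a list even when only one entry matches.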
skulk-weights catalog init
Writes a starter skulk-weights.yaml. The generated file is valid immediately
because the Foxlight catalog is included automatically. Add operator sources
when you are ready.
Options:
- --output PATH: write to a path other than skulk-weights.yaml
- --force: replace an existing file
Example:
skulk-weights catalog init
skulk-weights catalog add HF_MODEL_ID
Adds a new Foxlight catalog entry by fetching metadata from a Hugging Face
model repo and generating the YAML block automatically. Use this instead of
editing foxlight.yaml by hand.
Options:
- --dry-run: print the generated YAML block without writing anything
- --yes/-y: skip the confirmation prompt before writing
What it does:
- Resolves the HF model ID (accepts bare owner/repo strings)
- Fetches model card tags to detect quant scheme and tier
- Checks the base model for mtp.* tensor keys and populates MTP sidecar fields if found
- If no mtp.* keys are found, checks for a Gemma-4-style companion model named {model}-assistant and, when present, writes assistant_model_repo instead of any MTP fields
- Derives key, output_name, and hf_repo from the model ID
- Validates no key, output name, or repo collisions against the existing catalog
- Appends the entry to the built-in foxlight.yaml
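The collision step can be pictured as a per-field uniqueness check. The sketch below assumes catalog entries are plain dictionaries carrying the three derived fields; `find_collisions` is a hypothetical helper and not the tool's implementation.

```python
def find_collisions(candidate: dict, existing: list[dict]) -> list[str]:
    """Return the names of derived fields that clash with an existing entry."""
    collisions = []
    for field in ("key", "output_name", "hf_repo"):
        # Collect every value already used for this field in the catalog.
        taken = {entry.get(field) for entry in existing}
        if candidate[field] in taken:
            collisions.append(field)
    return collisions
```

An empty result means the entry is safe to append; any other result corresponds to the exit-1 case.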
Examples:
skulk-weights catalog add mlx-community/Qwen3-6B-4bit --dry-run
skulk-weights catalog add mlx-community/Qwen3-6B-4bit
The command exits 1 if the detected quant scheme is not supported, or if the
generated key, output_name, or hf_repo would collide with an existing entry.
Always use --dry-run first to review the generated block before writing.
Detection recognizes 8-bit models and labels them q8k, but validation only
allows q4k (ALLOWED_QUANTS is q4k-only). Adding an 8-bit model therefore
fails: catalog add exits 1 and reports that the detected quant is not
supported. Only 4-bit (q4k) entries can be added today.
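The gap between detection and validation can be sketched as an allow-list check. Only the ALLOWED_QUANTS name and the q4k and q8k labels come from the text above; `check_quant` and the error wording are assumptions made for illustration.

```python
# Per the text, the allow-list is q4k-only today.
ALLOWED_QUANTS = frozenset({"q4k"})


def check_quant(detected: str) -> None:
    """Stop with exit status 1 when the detected quant is not allowed."""
    if detected not in ALLOWED_QUANTS:
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit(f"detected quant {detected!r} is not supported")


check_quant("q4k")  # passes silently; check_quant("q8k") would exit
```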
Setting HF_TOKEN in the environment is recommended — it allows the MTP key
scan to access gated base model repos without hitting rate limits.
skulk-weights scratch clean
Deletes the scratch directory and all cached weight shards inside it. Use this to reclaim disk space after a publish run or to force a clean re-download.
Options:
- --scratch PATH: override SKULK_WEIGHTS_SCRATCH for this operation
- --yes/-y: skip the confirmation prompt
Examples:
skulk-weights scratch clean
skulk-weights scratch clean --yes
skulk-weights scratch clean --scratch /fast/skulk-weights --yes
The command refuses to delete paths that are too broad: home directory, root, current working directory, any ancestor of the current working directory, or any path fewer than three components deep.
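The refusal rules can be expressed as a small guard. This is a sketch, not the tool's code: `is_too_broad` is a hypothetical helper, paths are assumed to be absolute POSIX paths that are already resolved, and "components" is assumed to count the root (as `PurePosixPath.parts` does), which keeps /fast/skulk-weights from the example above acceptable.

```python
from pathlib import PurePosixPath


def is_too_broad(
    target: PurePosixPath, cwd: PurePosixPath, home: PurePosixPath
) -> bool:
    """Return True when deleting target would be refused under the rules above."""
    if not target.is_absolute():
        raise ValueError("resolve the path before checking it")
    # Root and the home directory are never valid scratch targets.
    if target == PurePosixPath("/") or target == home:
        return True
    # Neither is the current working directory or any of its ancestors.
    if target == cwd or target in cwd.parents:
        return True
    # Assumption: the root counts as a component, so "/fast" has two.
    return len(target.parts) < 3
```

Running a check like this before any deletion means a mistyped --scratch value fails fast instead of removing unrelated data.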
skulk-weights doctor
Checks local prerequisites that are safe on any machine: Python dependencies, scratch directory access, and catalog validity.
skulk-weights doctor --publish
Adds publication-specific checks for larql, HF_TOKEN, and the
huggingface_hub package used for collection updates.
skulk-weights publish --model KEY
Builds the publish plan for one catalog entry. With --dry-run, it only
prints the plan. Without --dry-run, it runs the selected artifact step:
- vindex: runs larql extract, larql publish, and files the repository into the configured Hugging Face collection.
- mtp: downloads only the shards that contain mtp.* tensor keys from the original BF16 checkpoint and uploads them at full precision (bf16, unquantized) as mtp.safetensors to the sidecar repository. Requires mtp_source_repo and mtp_sidecar_repo on the catalog entry.
- vision: mirrors the vision source repo's weights and configs into the vision sidecar repo byte-for-byte, with no quantization and no dtype conversion. Requires vision_source_repo and vision_sidecar_repo on the catalog entry; raises an error only when those are not configured.
- all (default): runs vindex, then mtp, then vision, skipping any artifact not configured on the entry.
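The artifact selection rules can be sketched as follows. The field names come from the text above; `select_artifacts` is a hypothetical helper, and it assumes that vindex is always available and that explicitly requesting an unconfigured mtp artifact fails the same way the text describes for vision.

```python
ARTIFACT_ORDER = ("vindex", "mtp", "vision")


def select_artifacts(entry: dict, artifact: str = "all") -> list[str]:
    """Return the artifact steps to run, in order, for one catalog entry."""
    configured = {
        "vindex": True,  # assumption: every entry can produce a vindex
        "mtp": bool(
            entry.get("mtp_source_repo") and entry.get("mtp_sidecar_repo")
        ),
        "vision": bool(
            entry.get("vision_source_repo") and entry.get("vision_sidecar_repo")
        ),
    }
    if artifact == "all":
        # "all" silently skips anything the entry does not declare.
        return [name for name in ARTIFACT_ORDER if configured[name]]
    if artifact not in configured:
        raise ValueError(f"unknown artifact: {artifact}")
    if not configured[artifact]:
        # An explicit request for an unconfigured artifact is an error.
        raise ValueError(f"{artifact} is not configured on this entry")
    return [artifact]
```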
Model cards
Every real publish (vindex, mtp, or vision) also uploads a self-describing
README.md model card to the published repo, with frontmatter recording the
base_model, tags, the inherited source license, and a foxlight: block
(artifact type, source repo and pinned source revision, target model, quant,
catalog key, and generation timestamp). The source revision and license are
resolved best-effort from the Hub using HF_TOKEN. See the
Manifest Reference for the entry fields these cards describe.
Collections
Each artifact is filed into the Hugging Face collection for its type:
- the vindex is filed into the configured slug exactly: the entry's hf_collection or the SKULK_WEIGHTS_COLLECTION override;
- mtp and vision sidecars are filed into their per-type collections, MTP Sidecars and Vision Sidecars, resolved by title (created if missing, reused if present).
Filing is disabled when no collection is configured for the entry, or when
SKULK_WEIGHTS_COLLECTION is set to a disable value (see the
Environment Reference).
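The vindex slug resolution can be pictured with the sketch below. `resolve_vindex_collection` is a hypothetical helper; the actual disable values are listed in the Environment Reference and are not reproduced here, so the caller supplies them. The precedence of the override over the entry's hf_collection is an assumption based on the word "override".

```python
from typing import Mapping, Optional


def resolve_vindex_collection(
    entry: Mapping[str, str],
    env: Mapping[str, str],
    disable_values: frozenset,
) -> Optional[str]:
    """Return the collection slug for the vindex, or None when filing is off."""
    override = env.get("SKULK_WEIGHTS_COLLECTION")
    if override is not None:
        # A disable value turns filing off; any other value is used as the slug.
        if override in disable_values:
            return None
        return override
    # Without an override, fall back to the entry's own collection, if any.
    return entry.get("hf_collection") or None
```

A None result corresponds to the collection: disabled line in the dry-run output.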
Options:
- --artifact vindex|mtp|vision|all: publish only the named artifact, or all declared artifacts when omitted
- --dry-run: print the plan without running any extraction or upload
- --force: replace an existing local output path
- --scratch PATH: override SKULK_WEIGHTS_SCRATCH
Examples:
skulk-weights publish --model foxlight/gemma-3-4b-full-q4-k --dry-run
skulk-weights publish --model foxlight/gemma-3-4b-full-q4-k
skulk-weights publish --model my-org/qwen3-6b-full-q4-k --artifact mtp --dry-run
skulk-weights publish --model my-org/qwen3-6b-full-q4-k --artifact mtp
skulk-weights --config skulk-weights.yaml publish \
--model my-org/llama-3-8b-full-q4-k \
--scratch /fast/skulk-weights
Expected dry-run output includes:
- catalog key
- tier
- artifact selection
- source model
- local output path
- target Hugging Face repository
- the vindex collection (vindex collection: <slug>, or collection: disabled)
- larql extract command (vindex artifact)
- larql publish command (vindex artifact)
- MTP source repo, sidecar repo, precision (bf16 (unquantized)), and output path, plus mtp collection: MTP Sidecars (mtp artifact)
- vision source repo, sidecar repo, the mirror note, plus vision collection: Vision Sidecars (vision artifact)
- a note when mtp or vision is not configured for the entry