v0.53.8 / v0.53.10 — Remote loaders, alternative hubs, tracker extras

Live fsspec remote loaders

```yaml
data:
  train: s3://my-bucket/instructions.jsonl
```

Closed scheme allowlist: s3, gs, gcs, az, abfs, abfss, oci. The schema gate from v0.42.0 is now backed by a real loader.
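A closed allowlist of this kind can be sketched in a few lines. This is only an illustration of the idea, not soup's actual loader code: the function name `is_allowed_remote` is hypothetical, and the only detail taken from the note above is the list of schemes.

```python
from urllib.parse import urlparse

# Schemes listed in the release note; anything else is rejected.
ALLOWED_SCHEMES = frozenset({"s3", "gs", "gcs", "az", "abfs", "abfss", "oci"})


def is_allowed_remote(uri: str) -> bool:
    """Return True only when the URI scheme is in the closed allowlist.

    Hypothetical helper for illustration; local paths (empty scheme)
    and schemes such as http/https/file are all rejected here.
    """
    scheme = urlparse(uri).scheme.lower()
    return scheme in ALLOWED_SCHEMES


if __name__ == "__main__":
    for uri in ("s3://my-bucket/instructions.jsonl", "https://example.com/x.jsonl"):
        print(uri, is_allowed_remote(uri))
```

Because the list is closed, a new storage backend has to be added explicitly rather than being accepted just because fsspec happens to support it.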

Alternative model hubs

```yaml
training:
  hub: modelscope   # hf | modelscope | modelers
```

v0.53.8 wires the dispatcher into soup train (pre-fetches the model before training starts). v0.53.10 plumbs the --hub flag through chat / serve / infer / merge / export / push, as well as soup data download.

SSRF hardening matches v0.29.0 HF endpoint validation: scheme allowlist, null-byte rejection, RFC1918 / link-local / cloud-metadata IPs blocked.
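The three checks named above can be sketched with the standard library. This is a minimal illustration under stated assumptions, not soup's validator: the function name `is_safe_endpoint` is hypothetical, the note does not say which schemes are allowed (https-only is assumed here), and the metadata host list is an example rather than the project's actual list.

```python
import ipaddress
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"https"}  # assumption: the note does not list the schemes
METADATA_HOSTS = {"169.254.169.254", "metadata.google.internal"}  # example list


def is_safe_endpoint(url: str) -> bool:
    """Hypothetical endpoint check: scheme allowlist, null bytes, blocked IPs."""
    if "\x00" in url:
        return False  # null-byte rejection
    parsed = urlparse(url)
    host = parsed.hostname
    if parsed.scheme not in ALLOWED_SCHEMES or not host:
        return False  # scheme allowlist
    if host in METADATA_HOSTS:
        return False  # cloud-metadata endpoints
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not a literal IP. A production check would also resolve the
        # hostname and re-check the resulting addresses; omitted here.
        return True
    # is_private covers the RFC 1918 ranges; link-local is 169.254.0.0/16.
    return not (ip.is_private or ip.is_link_local or ip.is_loopback)
```

Note that a literal-IP check alone does not stop a hostname that resolves to a private address, which is why the comment flags DNS resolution as a separate step.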

Tracker extras

```bash
pip install 'soup-cli[trackers]'
soup train --tracker mlflow      # or swanlab | trackio
```

The [trackers] extra installs mlflow / swanlab / trackio as one bundle. Telemetry is opt-in: it is enabled only when SOUP_TELEMETRY=1 is set. The API key and endpoint can be overridden via SOUP_POSTHOG_KEY and SOUP_POSTHOG_ENDPOINT respectively.
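As a rough sketch of the opt-in behavior, the environment variables could be read like this. Only the three variable names come from the note; the function name `telemetry_settings` and both default values are placeholders invented for the example.

```python
import os
from typing import Mapping, Optional

# Placeholder defaults for illustration only; the real values are not
# stated in the release note.
DEFAULT_KEY = "<built-in-key>"
DEFAULT_ENDPOINT = "<built-in-endpoint>"


def telemetry_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return telemetry settings, or None unless the user opted in."""
    if env.get("SOUP_TELEMETRY") != "1":
        return None  # opt-in: absent or any other value means disabled
    return {
        "api_key": env.get("SOUP_POSTHOG_KEY", DEFAULT_KEY),
        "endpoint": env.get("SOUP_POSTHOG_ENDPOINT", DEFAULT_ENDPOINT),
    }
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.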

HF Space SDK auto-pick

soup deploy hf-space detects streamlit vs gradio automatically from the template's requirements.txt.
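One plausible way to do this detection is to scan the requirement names, as in the sketch below. The function name `detect_space_sdk`, the first-match-wins rule, and the gradio fallback are all assumptions made for the example; the note does not describe how soup resolves ties or missing entries.

```python
import re


def detect_space_sdk(requirements_text: str, default: str = "gradio") -> str:
    """Guess the Space SDK from the contents of a requirements.txt file.

    Hypothetical helper: returns the first of streamlit / gradio that
    appears as a requirement name, else the (assumed) default.
    """
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Cut the package name off before any version specifier or extras.
        name = re.split(r"[<>=!~\[; ]", line, maxsplit=1)[0].lower()
        if name in ("streamlit", "gradio"):
            return name
    return default


if __name__ == "__main__":
    print(detect_space_sdk("torch\nstreamlit>=1.30\n"))
```

Matching on the exact package name avoids false positives from related packages such as `streamlit-extras`.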

Bundled JSONL fixtures

The 4 demo bundles from v0.43.0 are now packaged as wheel data — they survive pip install --target and zipapp deployment.

New extras

  • pip install 'soup-cli[mix]': scikit-optimize for the Bayesian mix optimizer
  • pip install 'soup-cli[data-pro]': langdetect + presidio-analyzer for proper language detection / PII detection

Web UI: Tool Outputs panel

The Web UI sidebar grows a "Tool Outputs" panel (3s polling, XSS-safe rendering) backed by GET /api/tool-outputs. SFT trainer-side record_call writes structured tool-call records that survive crashes.
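Records that survive a crash are commonly written as append-only JSONL with an explicit flush and fsync after each line. The sketch below shows that pattern; it is not soup's implementation, and the signature and field names given to `record_call` here are assumptions for illustration.

```python
import json
import os
import time


def record_call(path: str, tool: str, arguments: dict, output: str) -> None:
    """Append one structured tool-call record as a JSON line.

    Hypothetical signature. Each record is flushed and fsync'd so that
    records already written remain on disk if the trainer crashes.
    """
    record = {
        "ts": time.time(),
        "tool": tool,
        "arguments": arguments,
        "output": output,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
        fh.flush()             # push Python's buffer to the OS
        os.fsync(fh.fileno())  # ask the OS to persist it to disk
```

A reader such as the panel's backing endpoint can then parse the file line by line and skip a trailing partial line if the process died mid-write.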

Tests

  • v0.53.8: 8,162 → 8,257 (+95)
  • v0.53.10: 8,285 → 8,330 (+45)

See also

  • [HuggingFace Hub integration](/docs/hf-hub-integration)
  • [Data Pipeline Pro](/docs/data-pipeline-pro)