Getting started

Is hfo free?

Yes. MIT-licensed open source with no subscriptions, no accounts, and no telemetry. It only talks to the public Hugging Face API and your local Ollama daemon.

Which operating systems does hfo support?

Windows, macOS, and Linux. The npm package runs on all three; standalone binaries for each platform are published on the GitHub Releases page whenever a version is tagged.

Do I need to install Ollama first?

No. If Ollama isn't on your PATH, hfo will offer to install it using winget on Windows, Homebrew on macOS, or the official shell script on Linux. You always confirm before anything is installed.
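
The per-OS choice described above could be sketched like this. The command strings are the standard public install routes for Ollama (winget, Homebrew, and the official install script); whether hfo builds the command exactly this way is an assumption.

```typescript
import { platform } from "node:os";

// Sketch only: pick an Ollama install command by OS, as the FAQ describes.
// hfo's actual implementation may differ; it always asks before running it.
function ollamaInstallCommand(): string {
  switch (platform()) {
    case "win32":
      return "winget install Ollama.Ollama";
    case "darwin":
      return "brew install ollama";
    default:
      // Official Linux install script from ollama.com
      return "curl -fsSL https://ollama.com/install.sh | sh";
  }
}
```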

What Node.js version is required?

Node.js 20 or newer for the npm / pnpm / yarn / bun installs. The standalone binaries have no Node dependency at all.

How it works

How does hfo choose which GGUF to recommend?

Every quant in a Hugging Face repo is scored 0–100 against your detected VRAM and RAM headroom. The tool reads the model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p and context size, then synthesises a tuned Modelfile. On the quick-confirm screen you can accept with Enter, change the install directory with O, or customise the Modelfile parameters with C.
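
As a rough illustration of "scored 0–100 against your detected VRAM", here is a hypothetical scoring function. It is not hfo's actual algorithm; the 80%-fill sweet spot and the linear penalty are assumptions chosen to show the shape of the idea (big enough to matter, small enough to leave KV-cache headroom).

```typescript
interface Quant {
  name: string;   // e.g. "Q4_K_M"
  sizeGB: number; // on-disk size of the GGUF
}

// Hypothetical scorer: 0 if the quant can't fit in VRAM at all,
// otherwise peak score near ~80% VRAM utilisation.
function scoreQuant(q: Quant, vramGB: number): number {
  if (q.sizeGB > vramGB) return 0;
  const fill = q.sizeGB / vramGB;
  return Math.round(100 * (1 - Math.abs(fill - 0.8) / 0.8));
}
```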

What's a GGUF?

GGUF is the single-file format llama.cpp and Ollama use to package a quantised LLM. Typical quants (Q4_K_M, Q5_K_M, Q6_K, Q8_0, F16) trade output quality against file size: lower-bit quants are smaller but lossier. Q5/Q6 is a good sweet spot for most hardware.
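
To make the size/quality trade-off concrete, a quant's file size can be estimated from parameter count and effective bits per weight. The bits-per-weight figures below are approximations commonly quoted for llama.cpp K-quants, not exact values.

```typescript
// Approximate effective bits per weight for common quant types.
const bitsPerWeight: Record<string, number> = {
  Q4_K_M: 4.85,
  Q5_K_M: 5.69,
  Q6_K: 6.59,
  Q8_0: 8.5,
  F16: 16,
};

// Rough GGUF size in GiB for a model with `paramsB` billion parameters.
function estimateSizeGB(paramsB: number, quant: string): number {
  const bits = bitsPerWeight[quant] * paramsB * 1e9;
  return bits / 8 / 1024 ** 3; // bits -> bytes -> GiB
}
```

For an 8B model this puts F16 near 15 GiB and Q4_K_M near 4.5 GiB, which is why the quant choice dominates whether a model fits in VRAM.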

Why not just use ollama pull?

Ollama's library is curated; if the model you want is on Hugging Face only, you're on your own for picking the right quant, downloading the GGUF, writing a Modelfile, and running ollama create. hfo collapses that sequence into one Enter keystroke with hardware-aware recommendations.
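
The Modelfile-writing step in that manual sequence is small but fiddly. A minimal sketch of synthesising one (FROM and PARAMETER are real Modelfile keywords; the parameter values here are illustrative, not hfo's tuned output):

```typescript
// Build a minimal Modelfile for `ollama create <tag> -f Modelfile`.
function makeModelfile(
  ggufPath: string,
  params: Record<string, number>,
): string {
  const lines = [`FROM ${ggufPath}`];
  for (const [key, value] of Object.entries(params)) {
    lines.push(`PARAMETER ${key} ${value}`);
  }
  return lines.join("\n");
}

// Example: makeModelfile("./model.gguf", { temperature: 0.7, top_p: 0.9 })
```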

Can I launch Claude Code or Codex with a local model?

Yes. Press L on any installed model and hfo will run ollama launch against the chosen integration — Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, or VS Code — with that model as the backend.

Files & storage

Where are my model files stored?

By default, models are stored under the directory you launched hfo from (process.cwd()). Each model lives in <baseDir>/<repo-slug>/<quant>/, with the GGUF and a tuned Modelfile. hfo records the tag-to-directory mapping so Alt+d (deep delete) and B (backup) know what to operate on. Set a persistent default in the Settings tab via modelDir.
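
The layout above could be built like this. Note the slug scheme (replacing the "/" in the repo id with "--") is an assumption for illustration, not necessarily how hfo derives it.

```typescript
import { posix } from "node:path";

// Illustrative <baseDir>/<repo-slug>/<quant>/ path construction.
function modelDir(baseDir: string, repo: string, quant: string): string {
  const slug = repo.replace("/", "--"); // hypothetical slug scheme
  return posix.join(baseDir, slug, quant);
}
```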

How big can a backup be?

The archiver is streaming, so backups scale to multi-gigabyte models without exhausting memory. Level-9 compression typically shrinks GGUFs by 5–15%. Each backup is a standalone .zip with a metadata.json sidecar containing the tag, repo, quant, byte counts, and compression ratio.
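
The sidecar might look roughly like this. The field names follow the FAQ; the exact shape of hfo's metadata.json is an assumption.

```typescript
// Hypothetical shape of a backup's metadata.json sidecar.
interface BackupMeta {
  tag: string;
  repo: string;
  quant: string;
  originalBytes: number;
  compressedBytes: number;
  ratio: number; // compressed / original
}

function backupMeta(
  tag: string,
  repo: string,
  quant: string,
  originalBytes: number,
  compressedBytes: number,
): BackupMeta {
  return {
    tag,
    repo,
    quant,
    originalBytes,
    compressedBytes,
    ratio: compressedBytes / originalBytes,
  };
}
```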

What does the Tune tab actually change?

It reads your current Ollama environment variables (OLLAMA_FLASH_ATTENTION, OLLAMA_KV_CACHE_TYPE, OLLAMA_KEEP_ALIVE, OLLAMA_NUM_PARALLEL, OLLAMA_MAX_LOADED_MODELS, OLLAMA_MAX_QUEUE) and shows them side-by-side with defaults and hfo's ~90%-capacity suggestions. Pressing Apply persists the values via setx on Windows, launchctl plus ~/.zprofile on macOS, or ~/.profile plus a systemd override on Linux.
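
A sketch of the side-by-side view, comparing the current environment against defaults. The default values shown are approximate and can differ across Ollama versions; treat them as assumptions.

```typescript
// Approximate Ollama defaults (version-dependent; for illustration only).
const ollamaDefaults: Record<string, string> = {
  OLLAMA_FLASH_ATTENTION: "0",
  OLLAMA_KV_CACHE_TYPE: "f16",
  OLLAMA_KEEP_ALIVE: "5m",
  OLLAMA_MAX_QUEUE: "512",
};

// One row per variable: its current value (or "(unset)") next to the default.
function tuneRows(env: Record<string, string | undefined>) {
  return Object.entries(ollamaDefaults).map(([key, def]) => ({
    key,
    current: env[key] ?? "(unset)",
    default: def,
  }));
}
```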

Privacy & operations

Does hfo send any telemetry?

No. The only network traffic is the two Hugging Face endpoints (tree listing and file download); there are no analytics, no error reporting, and no phoning home. See the privacy page for the full data-flow breakdown.

Can I use gated or private HF repos?

Yes. Provide a token via the HF_TOKEN environment variable or the --token / -t flag. hfo never persists the token to its settings file.
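
Gated and private repos are authorised against the Hugging Face API with a standard Bearer token header; the tree-listing route is the real Hub endpoint (GET https://huggingface.co/api/models/<repo>/tree/<revision>). How hfo assembles the request internally is an assumption.

```typescript
// Build auth headers for a Hugging Face API request.
// No token -> no Authorization header at all (public repos work anonymously).
function hfHeaders(token?: string): Record<string, string> {
  return token ? { Authorization: `Bearer ${token}` } : {};
}
```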

Does hfo work offline after the initial install?

Yes. Everything except the Install tab (which needs Hugging Face) works offline: the Dashboard, Models, Tune, Help, and Settings tabs are fully local. You can also use --launch and --list without a network connection.