Getting started

Is hfo free?

Yes. MIT-licensed open source with no subscriptions, no accounts, and no telemetry. It only talks to the public Hugging Face API and your local Ollama daemon.

Which operating systems does hfo support?

Windows, macOS, and Linux. The npm package runs on all three; standalone binaries for each platform are published on the GitHub Releases page whenever a version is tagged.

Do I need to install Ollama first?

No. If Ollama isn't on your PATH, hfo will offer to install it using winget on Windows, Homebrew on macOS, or the official shell script on Linux. You always confirm before anything is installed.
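
The per-OS choice described above could be sketched like this. The command strings are the standard public install routes for Ollama (winget, Homebrew, and the official install script); whether hfo builds the command exactly this way is an assumption.

```typescript
import { platform } from "node:os";

// Sketch only: pick an Ollama install command by OS, as the FAQ describes.
// hfo's actual implementation may differ; it always asks before running it.
function ollamaInstallCommand(): string {
  switch (platform()) {
    case "win32":
      return "winget install Ollama.Ollama";
    case "darwin":
      return "brew install ollama";
    default:
      // Official Linux install script from ollama.com
      return "curl -fsSL https://ollama.com/install.sh | sh";
  }
}
```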

What Node.js version is required?

Node.js 20 or newer for the npm / pnpm / yarn / bun installs. The standalone binaries have no Node dependency at all.

How it works

How does hfo choose which GGUF to recommend?

Every quant in a Hugging Face repo is scored 0–100 against your detected VRAM and RAM headroom. The tool reads the model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p and context size, then synthesises a tuned Modelfile. On the quick-confirm screen you can accept with Enter, change the install directory with O, or customise the Modelfile parameters with C.
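
As a rough illustration of "scored 0–100 against your detected VRAM", here is a hypothetical scoring function. It is not hfo's actual algorithm; the 80%-fill sweet spot and the linear penalty are assumptions chosen to show the shape of the idea (big enough to matter, small enough to leave KV-cache headroom).

```typescript
interface Quant {
  name: string;   // e.g. "Q4_K_M"
  sizeGB: number; // on-disk size of the GGUF
}

// Hypothetical scorer: 0 if the quant can't fit in VRAM at all,
// otherwise peak score near ~80% VRAM utilisation.
function scoreQuant(q: Quant, vramGB: number): number {
  if (q.sizeGB > vramGB) return 0;
  const fill = q.sizeGB / vramGB;
  return Math.round(100 * (1 - Math.abs(fill - 0.8) / 0.8));
}
```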

What's a GGUF?

GGUF is the single-file format llama.cpp and Ollama use to package a quantised LLM. Typical quants (Q4_K_M, Q5_K_M, Q6_K, Q8_0, F16) trade output quality against file size: lower-bit quants are smaller but lossier. Q5/Q6 is a good sweet spot for most hardware.
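
To make the size/quality trade-off concrete, a quant's file size can be estimated from parameter count and effective bits per weight. The bits-per-weight figures below are approximations commonly quoted for llama.cpp K-quants, not exact values.

```typescript
// Approximate effective bits per weight for common quant types.
const bitsPerWeight: Record<string, number> = {
  Q4_K_M: 4.85,
  Q5_K_M: 5.69,
  Q6_K: 6.59,
  Q8_0: 8.5,
  F16: 16,
};

// Rough GGUF size in GiB for a model with `paramsB` billion parameters.
function estimateSizeGB(paramsB: number, quant: string): number {
  const bits = bitsPerWeight[quant] * paramsB * 1e9;
  return bits / 8 / 1024 ** 3; // bits -> bytes -> GiB
}
```

For an 8B model this puts F16 near 15 GiB and Q4_K_M near 4.5 GiB, which is why the quant choice dominates whether a model fits in VRAM.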

Why not just use ollama pull?

Ollama's library is curated; if the model you want is on Hugging Face only, you're on your own for picking the right quant, downloading the GGUF, writing a Modelfile, and running ollama create. hfo collapses that sequence into one Enter keystroke with hardware-aware recommendations.
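
The Modelfile-writing step in that manual sequence is small but fiddly. A minimal sketch of synthesising one (FROM and PARAMETER are real Modelfile keywords; the parameter values here are illustrative, not hfo's tuned output):

```typescript
// Build a minimal Modelfile for `ollama create <tag> -f Modelfile`.
function makeModelfile(
  ggufPath: string,
  params: Record<string, number>,
): string {
  const lines = [`FROM ${ggufPath}`];
  for (const [key, value] of Object.entries(params)) {
    lines.push(`PARAMETER ${key} ${value}`);
  }
  return lines.join("\n");
}

// Example: makeModelfile("./model.gguf", { temperature: 0.7, top_p: 0.9 })
```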

Can I launch Claude Code or Codex with a local model?

Yes. Press L on any installed model and hfo will run ollama launch against the chosen integration — Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, or VS Code — with that model as the backend.

Files & storage

Where are my model files stored?

By default, models are stored under the directory you launched hfo from (process.cwd()). Each model lives in <baseDir>/<repo-slug>/<quant>/, with the GGUF and a tuned Modelfile. hfo records the tag-to-directory mapping so Alt+d (deep delete) and B (backup) know what to operate on. Set a persistent default in the Settings tab via modelDir.
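
The layout above could be built like this. Note the slug scheme (replacing the "/" in the repo id with "--") is an assumption for illustration, not necessarily how hfo derives it.

```typescript
import { posix } from "node:path";

// Illustrative <baseDir>/<repo-slug>/<quant>/ path construction.
function modelDir(baseDir: string, repo: string, quant: string): string {
  const slug = repo.replace("/", "--"); // hypothetical slug scheme
  return posix.join(baseDir, slug, quant);
}
```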

How big can a backup be?

The archiver is streaming, so backups scale to multi-gigabyte models without exhausting memory. Level-9 compression typically shrinks GGUFs by 5–15%. Each backup is a standalone .zip with a metadata.json sidecar containing the tag, repo, quant, byte counts, and compression ratio.
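
The sidecar might look roughly like this. The field names follow the FAQ; the exact shape of hfo's metadata.json is an assumption.

```typescript
// Hypothetical shape of a backup's metadata.json sidecar.
interface BackupMeta {
  tag: string;
  repo: string;
  quant: string;
  originalBytes: number;
  compressedBytes: number;
  ratio: number; // compressed / original
}

function backupMeta(
  tag: string,
  repo: string,
  quant: string,
  originalBytes: number,
  compressedBytes: number,
): BackupMeta {
  return {
    tag,
    repo,
    quant,
    originalBytes,
    compressedBytes,
    ratio: compressedBytes / originalBytes,
  };
}
```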

What does the Tune tab actually change?

It reads your current Ollama environment variables (OLLAMA_FLASH_ATTENTION, OLLAMA_KV_CACHE_TYPE, OLLAMA_KEEP_ALIVE, OLLAMA_NUM_PARALLEL, OLLAMA_MAX_LOADED_MODELS, OLLAMA_MAX_QUEUE) and shows them side-by-side with defaults and hfo's ~90%-capacity suggestions. Pressing Apply persists the values via setx on Windows, launchctl plus ~/.zprofile on macOS, or ~/.profile plus a systemd override on Linux.
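
A sketch of the side-by-side view, comparing the current environment against defaults. The default values shown are approximate and can differ across Ollama versions; treat them as assumptions.

```typescript
// Approximate Ollama defaults (version-dependent; for illustration only).
const ollamaDefaults: Record<string, string> = {
  OLLAMA_FLASH_ATTENTION: "0",
  OLLAMA_KV_CACHE_TYPE: "f16",
  OLLAMA_KEEP_ALIVE: "5m",
  OLLAMA_MAX_QUEUE: "512",
};

// One row per variable: its current value (or "(unset)") next to the default.
function tuneRows(env: Record<string, string | undefined>) {
  return Object.entries(ollamaDefaults).map(([key, def]) => ({
    key,
    current: env[key] ?? "(unset)",
    default: def,
  }));
}
```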

Privacy & operations

Does hfo send any telemetry?

No. The only network traffic is the two Hugging Face endpoints (tree listing and file download); there are no analytics, no error reporting, and no phoning home. See the privacy page for the full data-flow breakdown.

Can I use gated or private HF repos?

Yes. Provide a token via the HF_TOKEN environment variable or the --token / -t flag. hfo never persists the token to its settings file.
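
Gated and private repos are authorised against the Hugging Face API with a standard Bearer token header; the tree-listing route is the real Hub endpoint (GET https://huggingface.co/api/models/<repo>/tree/<revision>). How hfo assembles the request internally is an assumption.

```typescript
// Build auth headers for a Hugging Face API request.
// No token -> no Authorization header at all (public repos work anonymously).
function hfHeaders(token?: string): Record<string, string> {
  return token ? { Authorization: `Bearer ${token}` } : {};
}
```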

Does hfo work offline after the initial install?

Yes. Everything except the Install tab (which needs Hugging Face) works offline: the Dashboard, Models, Tune, Help, and Settings tabs are fully local. You can also use --launch and --list without a network connection.