A fullscreen terminal UI and headless CLI that turns any Hugging Face GGUF repository into a running, hardware-tuned Ollama model — then lets you back it up, restore it, or hand it off to Claude Code, Codex, Copilot CLI, OpenCode, Droid, and six more integrations without leaving your shell.
Everything that normally takes a dozen shell commands and a hand-written Modelfile — folded into one tool that understands your rig.
Six tabs — Dashboard, Models, Install, Tune, Help, Settings — with live GPU / VRAM / RAM meters. Alt-screen buffer preserves your scrollback on exit.
New models install in one Enter keystroke: the tool scores every quant against your VRAM, parses the HF card, and proposes a ready-to-run Modelfile.
Every GGUF graded 0–100 against your usable VRAM and RAM headroom, with per-quant labels like Full GPU, Partial 87%, CPU-heavy.
Reads each model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p, and context size. Overlaid on hardware defaults.
Models removed from Ollama but still on disk appear as available to reinstall. Press I to rebuild the Modelfile from the GGUF alone.
Level-9 compression, streaming (multi-GB safe), sidecar metadata.json. hfo --restore <zip> extracts and re-registers automatically.
Press L on any model to run ollama launch with it as the backend: Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, VS Code.
Side-by-side default / suggested / current view of every key Ollama env var. Persists via setx · launchctl · systemd per OS.
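On Linux, "persists via systemd" typically means a drop-in override on the ollama service. The fragment below is only a sketch: the file name and values are illustrative, and hfo derives the real numbers from your hardware profile.

```ini
# /etc/systemd/system/ollama.service.d/hfo-tune.conf  (illustrative values only)
[Service]
Environment="OLLAMA_KEEP_ALIVE=10m"
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
```

After writing such a file, sudo systemctl daemon-reload && sudo systemctl restart ollama applies it.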
Dark, Light, Dracula, Solarized, Nord, Monokai. Translations in English, Spanish, Chinese, Hindi, Arabic, Portuguese, Bengali, Russian, Japanese, German, French, Korean, and more.
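The grading idea can be approximated with a back-of-envelope check: a quant roughly fits fully on GPU when its file size plus runtime overhead stays under usable VRAM. This is a sketch of the idea, not hfo's actual 0–100 formula; every number below is an example.

```shell
# Back-of-envelope quant fit check (illustrative only; not hfo's real scoring).
vram_gb=12        # usable VRAM, e.g. read from nvidia-smi
quant_gb=8.5      # GGUF file size on disk
overhead_gb=1.5   # rough allowance for KV cache and runtime buffers
label=$(awk -v v="$vram_gb" -v q="$quant_gb" -v o="$overhead_gb" 'BEGIN {
  if (q + o <= v) print "Full GPU"; else print "Partial / CPU-heavy"
}')
echo "$label"   # → Full GPU for these example numbers
```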
One-line installs for each OS, or a package manager of your choice. Standalone binaries ship on every tagged GitHub release — the install scripts below always fetch the latest one for your OS and architecture.
# Detects OS + arch, downloads the latest binary,
# drops it on your PATH (installs to ~/.local/bin, so no sudo needed).
$ curl -fsSL https://hfo.carrillo.app/install.sh | sh
# Installs to %LOCALAPPDATA%\Programs\hfo and adds it to user PATH.
# No admin prompt required.
PS> irm https://hfo.carrillo.app/install.ps1 | iex
Prefer a package manager? npm, pnpm, Yarn, and Bun all install the same hfo-cli package — use whichever you already have.
$ npm i -g hfo-cli
$ pnpm add -g hfo-cli
$ yarn global add hfo-cli
$ bun add -g hfo-cli
Tagged releases on GitHub ship per-OS executables. Download directly, or use the one-liner scripts above which wrap this for you.
# Pick the binary matching your OS + arch; available after the first tagged release.
# hfo-linux-x64 hfo-linux-arm64
# hfo-macos-x64 hfo-macos-arm64
# hfo-win-x64.exe
$ curl -LO https://github.com/carrilloapps/hfo/releases/latest/download/hfo-linux-x64
$ chmod +x hfo-linux-x64
$ sudo mv hfo-linux-x64 /usr/local/bin/hfo
$ hfo --version
# Explore — fullscreen dashboard
$ hfo
# Install a specific repo (opens the Install tab pre-filled)
$ hfo bartowski/Llama-3.2-3B-Instruct-GGUF
# Or skip the TUI entirely — every capability has a flag
$ hfo --view
$ hfo --list
$ hfo --launch claude --model llama3.1:8b
Every TUI feature ships a one-shot command equivalent — ideal for scripts, CI, or editor integrations.
| Command | What it does |
|---|---|
| hfo | Open the fullscreen TUI |
| hfo --help · -h | Print the full flag reference |
| hfo --version | Print version, license, and author |
| hfo --view | Print hardware profile, capacity score, tier, picks, and pre-filtered HF search URLs |
| hfo --list | List installed models and orphaned GGUFs on disk |
| hfo --tune | Persist the ~90%-capacity Ollama env profile, then restart the daemon |
| hfo --backup <tag> | Create a .zip backup of the model's folder |
| hfo --restore <zip> | Extract a backup and re-register it with Ollama |
| hfo --delete <tag> [--deep] | Remove the Ollama tag; with --deep, also delete the folder |
| hfo --launch <integration> | Run ollama launch <integration> (claude · codex · copilot · opencode · droid · cline · hermes · kimi · openclaw · pi · vscode) |
| hfo --tab <name> | Open the TUI on a specific tab (dashboard · models · install · tune · help · settings) |
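Because every capability is a flag, these commands compose into scripts. A hedged sketch of a nightly backup job follows; it assumes hfo --list prints one model tag per line, which may need adjusting to the real output format.

```shell
# Nightly-backup sketch built from the headless flags.
# Assumption: `hfo --list` prints one model tag per line.
backup_all() {
  command -v hfo >/dev/null 2>&1 || { echo "hfo not on PATH" >&2; return 1; }
  hfo --list | while read -r tag; do
    [ -n "$tag" ] && hfo --backup "$tag"
  done
}
backup_all || true   # degrades gracefully on machines without hfo
```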
Global shortcuts work on every tab. Tab-specific keys appear in the footer as you navigate.
| Scope | Keys | Action |
|---|---|---|
| Global | 1–6 | Jump to a tab |
| Global | Tab · Shift+Tab | Cycle tabs |
| Global | , | Open Settings |
| Global | ? | Open Help |
| Global | Ctrl+H | Open author homepage |
| Global | q · Ctrl+C | Quit (restores scrollback) |
| Dashboard | Alt+↑/↓ | Focus a panel (full-screen zoom) |
| Dashboard | Esc · b | Return to the 4-panel grid |
| Dashboard | Enter | Open the selected link in the default browser |
| Models | L / G | Launch integration · Launch menu with no preset |
| Models | I | Reinstall an orphan (regenerates Modelfile if missing) |
| Models | B | Create a .zip backup |
| Models | d · Alt+d | Remove tag · Remove tag AND delete files |
| Install (quick-confirm) | Enter | Install now with recommended defaults |
| Install (quick-confirm) | O / C | Change target dir · Customize parameters |
| Tune | ↑↓ · ←→ | Navigate · cycle enum values |
| Settings | Enter | Open dropdown (themes, 20 languages, searchable) |
Exactly two kinds of external requests, both to Hugging Face's public API. Everything else is local.
GET huggingface.co/api/models/<repo>/tree/main
GET huggingface.co/<repo>/resolve/main/<file>
nvidia-smi, systeminformation, ollama ps, ollama create, ollama launch — all invoked in your shell with your privileges.
Ollama install helpers (winget, brew, the official shell script) run only when you explicitly confirm from the installer overlay.
No telemetry. No accounts. No external calls beyond those two HF endpoints. Gated or private HF repos use a standard HF_TOKEN environment variable. See SECURITY.md for the full threat model and reporting policy.
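For a gated or private repo, export the standard HF_TOKEN variable before running hfo. The token value and repo name below are placeholders; substitute your own.

```shell
# Gated/private repos: hfo reads the standard HF_TOKEN env var.
# Token value and repo name are placeholders — substitute your own.
export HF_TOKEN="hf_your_token_here"
if command -v hfo >/dev/null 2>&1; then
  hfo your-org/your-gated-model-GGUF   # hypothetical gated repo
else
  echo "hfo not installed; token is ready for when it is"
fi
```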
Short answers to the questions people ask first.
Yes. hfo is MIT-licensed open source with no subscriptions, no accounts, and no telemetry. It only talks to the public Hugging Face API and your local Ollama daemon.
Windows, macOS, and Linux. The npm package is cross-OS; standalone binaries for each platform are published on the GitHub Releases page when a version is tagged.
No. If Ollama isn't on your PATH, hfo offers to install it using the platform-appropriate method — winget on Windows, Homebrew on macOS, or the official shell script on Linux. You always confirm before anything is installed.
Every quant in a Hugging Face repo is scored 0–100 against your detected VRAM and RAM headroom. The tool reads the model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p and context size, then synthesises a tuned Modelfile. On the quick-confirm screen you accept with Enter, change the install directory with O, or customise the parameters with C.
Yes. Press L on any installed model and hfo will run ollama launch against the chosen integration — Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, or VS Code — with that model as the backend.
By default, under the directory you pick in the file browser (or the modelDir setting). Each model lives in <baseDir>/<repo-slug>/<quant>/ alongside its GGUF and tuned Modelfile, and hfo records the tag → directory mapping so Alt+d (deep delete) and B (backup) know which folder to operate on.
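As an illustration, a single installed model's folder might look like this (repo slug, quant, and file names are hypothetical):

```
<baseDir>/
└── llama-3.2-3b-instruct-gguf/           # repo slug
    └── Q4_K_M/                           # chosen quant
        ├── Llama-3.2-3B-Instruct-Q4_K_M.gguf
        └── Modelfile                     # tuned by hfo at install time
```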