A fullscreen terminal UI and headless CLI that turns any Hugging Face GGUF repository into a running, hardware-tuned Ollama model — then lets you back it up, restore it, or hand it off to Claude Code, Codex, Copilot CLI, OpenCode, Droid, and six more integrations without leaving your shell.
Everything that normally takes a dozen shell commands and a hand-written Modelfile — folded into one tool that understands your rig.
Six tabs — Dashboard, Models, Install, Tune, Help, Settings — with live GPU / VRAM / RAM meters. Alt-screen buffer preserves your scrollback on exit.
New models install in one Enter keystroke: the tool scores every quant against your VRAM, parses the HF card, and proposes a ready-to-run Modelfile.
Every GGUF graded 0–100 against your usable VRAM and RAM headroom, with per-quant labels like Full GPU, Partial 87%, CPU-heavy.
Reads each model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p, and context size. Overlaid on hardware defaults.
Models removed from Ollama but still on disk appear as available to reinstall. Press I to rebuild the Modelfile from the GGUF alone.
Level-9 compression, streaming (multi-GB safe), sidecar metadata.json. hfo --restore <zip> extracts and re-registers automatically.
Press L on any model to run ollama launch with it as the backend: Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, VS Code.
Side-by-side default / suggested / current view of every key Ollama env var. Persists via setx · launchctl · systemd per OS.
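On Linux, "persists via systemd" typically means a drop-in override on the ollama service. The fragment below is only a sketch: the file name and values are illustrative, and hfo derives the real numbers from your hardware profile.

```ini
# /etc/systemd/system/ollama.service.d/hfo-tune.conf  (illustrative values only)
[Service]
Environment="OLLAMA_KEEP_ALIVE=10m"
Environment="OLLAMA_NUM_PARALLEL=2"
Environment="OLLAMA_MAX_LOADED_MODELS=1"
```

After writing such a file, sudo systemctl daemon-reload && sudo systemctl restart ollama applies it.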
Dark, Light, Dracula, Solarized, Nord, Monokai. Translations in English, Spanish, Chinese, Hindi, Arabic, Portuguese, Bengali, Russian, Japanese, German, French, Korean, and more.
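The grading idea can be approximated with a back-of-envelope check: a quant roughly fits fully on GPU when its file size plus runtime overhead stays under usable VRAM. This is a sketch of the idea, not hfo's actual 0–100 formula; every number below is an example.

```shell
# Back-of-envelope quant fit check (illustrative only; not hfo's real scoring).
vram_gb=12        # usable VRAM, e.g. read from nvidia-smi
quant_gb=8.5      # GGUF file size on disk
overhead_gb=1.5   # rough allowance for KV cache and runtime buffers
label=$(awk -v v="$vram_gb" -v q="$quant_gb" -v o="$overhead_gb" 'BEGIN {
  if (q + o <= v) print "Full GPU"; else print "Partial / CPU-heavy"
}')
echo "$label"   # → Full GPU for these example numbers
```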
One-line installs for each OS, or a package manager of your choice. Standalone binaries ship on every tagged GitHub release — the install scripts below always fetch the latest one for your OS and architecture.
# Detects OS + arch, downloads the latest binary,
# drops it on your PATH (installs to ~/.local/bin, so no sudo needed).
$ curl -fsSL https://hfo.carrillo.app/install.sh | sh
# Installs to %LOCALAPPDATA%\Programs\hfo and adds it to user PATH.
# No admin prompt required.
PS> irm https://hfo.carrillo.app/install.ps1 | iex
Prefer a package manager? npm, pnpm, Yarn, and Bun all install the same hfo-cli package — use whichever you already have.
$ npm i -g hfo-cli
$ pnpm add -g hfo-cli
$ yarn global add hfo-cli
$ bun add -g hfo-cli
Tagged releases on GitHub ship per-OS executables. Download directly, or use the one-liner scripts above which wrap this for you.
# Pick the binary matching your OS + arch; available after the first tagged release.
# hfo-linux-x64 hfo-linux-arm64
# hfo-macos-x64 hfo-macos-arm64
# hfo-win-x64.exe
$ curl -LO https://github.com/carrilloapps/hfo/releases/latest/download/hfo-linux-x64
$ chmod +x hfo-linux-x64
$ sudo mv hfo-linux-x64 /usr/local/bin/hfo
$ hfo --version
# Explore — fullscreen dashboard
$ hfo
# Install a specific repo (opens the Install tab pre-filled)
$ hfo bartowski/Llama-3.2-3B-Instruct-GGUF
# Or skip the TUI entirely — every capability has a flag
$ hfo --view
$ hfo --list
$ hfo --launch claude --model llama3.1:8b
Every TUI feature ships a one-shot command equivalent — ideal for scripts, CI, or editor integrations.
| Command | What it does |
|---|---|
| hfo | Open the fullscreen TUI |
| hfo --help · -h | Print the full flag reference |
| hfo --version | Print version, license, and author |
| hfo --view | Print hardware profile, capacity score, tier, picks, and pre-filtered HF search URLs |
| hfo --list | List installed models and orphaned GGUFs on disk |
| hfo --tune | Persist the ~90%-capacity Ollama env profile, then restart the daemon |
| hfo --backup <tag> | Create a .zip backup of the model's folder |
| hfo --restore <zip> | Extract a backup and re-register it with Ollama |
| hfo --delete <tag> [--deep] | Remove the Ollama tag; with --deep, also delete the folder |
| hfo --launch <integration> | Run ollama launch <integration> (claude · codex · copilot · opencode · droid · cline · hermes · kimi · openclaw · pi · vscode) |
| hfo --tab <name> | Open the TUI on a specific tab (dashboard · models · install · tune · help · settings) |
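Because every capability is a flag, these commands compose into scripts. A hedged sketch of a nightly backup job follows; it assumes hfo --list prints one model tag per line, which may need adjusting to the real output format.

```shell
# Nightly-backup sketch built from the headless flags.
# Assumption: `hfo --list` prints one model tag per line.
backup_all() {
  command -v hfo >/dev/null 2>&1 || { echo "hfo not on PATH" >&2; return 1; }
  hfo --list | while read -r tag; do
    [ -n "$tag" ] && hfo --backup "$tag"
  done
}
backup_all || true   # degrades gracefully on machines without hfo
```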
Global shortcuts work on every tab. Tab-specific keys appear in the footer as you navigate.
| Scope | Keys | Action |
|---|---|---|
| Global | 1–6 | Jump to a tab |
| Global | Tab · Shift+Tab | Cycle tabs |
| Global | , | Open Settings |
| Global | ? | Open Help |
| Global | Ctrl+H | Open author homepage |
| Global | q · Ctrl+C | Quit (restores scrollback) |
| Dashboard | Alt+↑/↓ | Focus a panel (full-screen zoom) |
| Dashboard | Esc · b | Return to the 4-panel grid |
| Dashboard | Enter | Open the selected link in the default browser |
| Models | L / G | Launch integration · Launch menu with no preset |
| Models | I | Reinstall an orphan (regenerates Modelfile if missing) |
| Models | B | Create a .zip backup |
| Models | d · Alt+d | Remove tag · Remove tag AND delete files |
| Install (quick-confirm) | Enter | Install now with recommended defaults |
| Install (quick-confirm) | O / C | Change target dir · Customize parameters |
| Tune | ↑↓ · ←→ | Navigate · cycle enum values |
| Settings | Enter | Open dropdown (themes, 20 languages, searchable) |
Exactly two kinds of external requests, both to Hugging Face's public API. Everything else is local.
GET huggingface.co/api/models/<repo>/tree/main
GET huggingface.co/<repo>/resolve/main/<file>
nvidia-smi, systeminformation, ollama ps, ollama create, ollama launch — all invoked in your shell with your privileges.
Ollama install helpers (winget, brew, the official shell script) run only when you explicitly confirm from the installer overlay.
No telemetry. No accounts. No external calls beyond those two HF endpoints. Gated or private HF repos use a standard HF_TOKEN environment variable. See SECURITY.md for the full threat model and reporting policy.
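For a gated or private repo, export the standard HF_TOKEN variable before running hfo. The token value and repo name below are placeholders; substitute your own.

```shell
# Gated/private repos: hfo reads the standard HF_TOKEN env var.
# Token value and repo name are placeholders — substitute your own.
export HF_TOKEN="hf_your_token_here"
if command -v hfo >/dev/null 2>&1; then
  hfo your-org/your-gated-model-GGUF   # hypothetical gated repo
else
  echo "hfo not installed; token is ready for when it is"
fi
```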
Short answers to the questions people ask first.
Yes. hfo is MIT-licensed open source with no subscriptions, no accounts, and no telemetry. It only talks to the public Hugging Face API and your local Ollama daemon.
Windows, macOS, and Linux. The npm package is cross-OS; standalone binaries for each platform are published on the GitHub Releases page when a version is tagged.
No. If Ollama isn't on your PATH, hfo offers to install it using the platform-appropriate method — winget on Windows, Homebrew on macOS, or the official shell script on Linux. You always confirm before anything is installed.
Every quant in a Hugging Face repo is scored 0–100 against your detected VRAM and RAM headroom. The tool reads the model's README for recommended temperature, top_p, top_k, repeat_penalty, min_p and context size, then synthesises a tuned Modelfile. On the quick-confirm screen you accept with Enter, change the install directory with O, or customise the parameters with C.
Yes. Press L on any installed model and hfo will run ollama launch against the chosen integration — Claude Code, Cline, Codex, Copilot CLI, Droid, Hermes, Kimi, OpenCode, OpenClaw, Pi, or VS Code — with that model as the backend.
By default, under the directory you pick in the file browser (or the modelDir setting). Each model lives in <baseDir>/<repo-slug>/<quant>/ alongside its GGUF and tuned Modelfile, and hfo records the tag → directory mapping so Alt+d (deep delete) and B (backup) know which folder to operate on.
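As an illustration, a single installed model's folder might look like this (repo slug, quant, and file names are hypothetical):

```
<baseDir>/
└── llama-3.2-3b-instruct-gguf/           # repo slug
    └── Q4_K_M/                           # chosen quant
        ├── Llama-3.2-3B-Instruct-Q4_K_M.gguf
        └── Modelfile                     # tuned by hfo at install time
```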