Codex can route a normal run through Hugging Face Inference Providers when a task needs an open model hosted behind Hugging Face's router instead of the built-in OpenAI provider or a local --oss backend. The handoff is a custom Codex model provider plus a named profile, so the command still starts from codex or codex exec while Hugging Face handles model routing.
The provider entry in ~/.codex/config.toml points Codex at https://router.huggingface.co/v1, reads the bearer token from HF_TOKEN, and uses the Responses API wire format. The reusable profile belongs in ~/.codex/huggingface.config.toml with top-level model_provider and model keys, not in a legacy [profiles.huggingface] table inside the main config file.
Choose a model that is available on Hugging Face Inference Providers for chat completion. Leaving the provider suffix off lets the router choose a route, while a suffix such as :groq pins the backend for that model. Keep the Hugging Face token out of Codex config files, and expect a missing or invalid HF_TOKEN to stop the run before a model response is returned.
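As a rough illustration of the two points above, the following shell sketch checks that HF_TOKEN is present before a run and splits a model ID into its name and optional provider suffix. The function names check_hf_token and describe_route are hypothetical helpers, not part of Codex or any Hugging Face tooling, and the hf_ prefix check is an assumption based on the token placeholder used in this guide.

```shell
# Hypothetical pre-flight helpers; not part of Codex or the Hugging Face CLI.

# Return 0 when HF_TOKEN looks usable, 1 otherwise.
# Assumption: tokens start with "hf_", as in the placeholder used in this guide.
check_hf_token() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set" >&2
    return 1
  fi
  case "$HF_TOKEN" in
    hf_*) return 0 ;;
    *) echo "HF_TOKEN does not start with hf_" >&2; return 1 ;;
  esac
}

# Print the model name and the pinned provider suffix.
# "unpinned" means no suffix was given, so the router picks the route.
describe_route() {
  case "$1" in
    *:*) echo "model=${1%%:*} provider=${1##*:}" ;;
    *)   echo "model=$1 provider=unpinned" ;;
  esac
}
```

Calling check_hf_token before codex exec gives an early, local failure instead of waiting for the router to reject the request. The check only inspects the shape of the value and cannot tell whether the token is valid or has Inference Providers permission.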
Steps to use Hugging Face models with Codex:
- Create a Hugging Face token with Inference Providers permission.
https://huggingface.co/settings/tokens
The fine-grained token needs permission to make calls to Inference Providers. Keep the token value out of shared terminal transcripts, shell history, and Codex config files.
- Export the token for the current shell session.
$ export HF_TOKEN=hf_your_token_here
Use a shell secret manager or session-scoped environment injection when the token should not be typed directly.
- Create the user-level Codex config directory.
$ mkdir -p ~/.codex
- Add the Hugging Face provider to ~/.codex/config.toml with wire_api set to responses.
[model_providers.huggingface]
name = "Hugging Face"
base_url = "https://router.huggingface.co/v1"
env_key = "HF_TOKEN"
wire_api = "responses"
- Create a profile file for the Hugging Face route at ~/.codex/huggingface.config.toml.
model_provider = "huggingface"
model = "openai/gpt-oss-120b"
- Verify the Hugging Face profile with a short Codex run.
$ codex exec --profile huggingface "Reply with exactly OK."
OK
Use codex --profile huggingface for an interactive session with the same provider and model.
Related: How to run Codex exec with a prompt
- Override the model for one run when a different Hugging Face route is needed.
$ codex exec --profile huggingface -m "openai/gpt-oss-120b:groq" "Reply with exactly OK."
OK
Append a provider suffix only when that backend should be pinned. Hugging Face also supports routing policies such as :fastest or :cheapest when those policies match the model and account settings.
Related: How to set the default model in Codex
Related: How to override Codex configuration for a single run
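The configuration steps above could be collected into a single shell function, sketched here under the assumption that the file layout matches this guide (config.toml plus huggingface.config.toml in one directory). The name write_hf_codex_config and its optional directory argument are hypothetical. The function never writes the token itself, only the name of the environment variable that holds it.

```shell
# Hypothetical helper: write the provider entry and the profile file
# into a Codex config directory (defaults to ~/.codex).
write_hf_codex_config() {
  codex_dir="${1:-$HOME/.codex}"
  mkdir -p "$codex_dir"

  # Append the provider table only if it is not already present,
  # so repeated runs do not duplicate it.
  if ! grep -q '^\[model_providers\.huggingface\]' "$codex_dir/config.toml" 2>/dev/null; then
    cat >> "$codex_dir/config.toml" <<'EOF'

[model_providers.huggingface]
name = "Hugging Face"
base_url = "https://router.huggingface.co/v1"
env_key = "HF_TOKEN"
wire_api = "responses"
EOF
  fi

  # Profile file with top-level keys, as described in this guide.
  # Note: this overwrites any existing huggingface.config.toml.
  cat > "$codex_dir/huggingface.config.toml" <<'EOF'
model_provider = "huggingface"
model = "openai/gpt-oss-120b"
EOF
}
```

Running write_hf_codex_config with no argument targets ~/.codex. Passing a scratch directory first is a low-risk way to inspect the generated files before touching a real configuration.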
Mohd Shakir Zakaria is a cloud architect with deep roots in software development and open-source advocacy. Certified in AWS, Red Hat, VMware, ITIL, and Linux, he specializes in designing and managing robust cloud and on-premises infrastructures.