Codex can route a normal run through Hugging Face Inference Providers when a task needs an open model hosted behind Hugging Face's router instead of the built-in OpenAI provider or a local --oss backend. The handoff is a custom Codex model provider plus a named profile, so the command still starts from codex or codex exec while Hugging Face handles model routing.
The provider entry in ~/.codex/config.toml points Codex at https://router.huggingface.co/v1, reads the bearer token from HF_TOKEN, and uses the Responses API wire format. The reusable profile belongs in ~/.codex/huggingface.config.toml with top-level model_provider and model keys, not in a legacy [profiles.huggingface] table inside the main config file.
Choose a model that is available on Hugging Face Inference Providers for chat completion. Leaving the provider suffix off lets the router choose a route, while a suffix such as :groq pins the backend for that model. Keep the Hugging Face token out of Codex config files, and expect a missing or invalid HF_TOKEN to stop the run before a model response is returned.
https://huggingface.co/settings/tokens
The fine-grained token needs permission to make calls to Inference Providers. Keep the token value out of shared terminal transcripts, shell history, and Codex config files.
$ export HF_TOKEN=hf_your_token_here
Use a shell secret manager or session-scoped environment injection when the token should not be typed directly.
$ mkdir -p ~/.codex
[model_providers.huggingface] name = "Hugging Face" base_url = "https://router.huggingface.co/v1" env_key = "HF_TOKEN" wire_api = "responses"
model_provider = "huggingface" model = "openai/gpt-oss-120b"
$ codex exec --profile huggingface "Reply with exactly OK." OK
Use codex --profile huggingface for an interactive session with the same provider and model.
Related: How to run Codex exec with a prompt
$ codex exec --profile huggingface -m "openai/gpt-oss-120b:groq" "Reply with exactly OK." OK
Append a provider suffix only when that backend should be pinned. Hugging Face also supports routing policies such as :fastest or :cheapest when those policies match the model and account settings.
Related: How to set the default model in Codex
Related: How to override Codex configuration for a single run