How to use Hugging Face models with Codex

Codex can route a run through Hugging Face Inference Providers when a task needs an open model served behind Hugging Face's router rather than the built-in OpenAI provider or a local --oss backend. The setup is a custom Codex model provider plus a named profile, so runs still start from codex or codex exec while Hugging Face handles model routing.

The provider entry in ~/.codex/config.toml points Codex at https://router.huggingface.co/v1, reads the bearer token from HF_TOKEN, and uses the Responses API wire format. The reusable profile belongs in ~/.codex/huggingface.config.toml with top-level model_provider and model keys, not in a legacy [profiles.huggingface] table inside the main config file.

Choose a model that is available on Hugging Face Inference Providers for chat completion. Leaving the provider suffix off lets the router choose a route, while a suffix such as :groq pins the backend for that model. Keep the Hugging Face token out of Codex config files, and expect a missing or invalid HF_TOKEN to stop the run before a model response is returned.

Steps to use Hugging Face models with Codex:

Create a Hugging Face token with Inference Providers permission.
```
https://huggingface.co/settings/tokens
```
The fine-grained token needs permission to make calls to Inference Providers. Keep the token value out of shared terminal transcripts, shell history, and Codex config files.
Export the token for the current shell session.
```
$ export HF_TOKEN=hf_your_token_here
```
Typing the token inline can record it in shell history. Use a secret manager or session-scoped environment injection when the token should not be typed directly.
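Because a missing or invalid HF_TOKEN stops the run, a small preflight check in the shell can catch the obvious cases before Codex starts. The sketch below is only an illustration: check_hf_token is a hypothetical helper name, and the hf_ prefix test is an assumption based on the placeholder in the export example. It does not contact Hugging Face, so a token that passes may still be rejected by the router.

```shell
# Hypothetical helper: returns 0 when HF_TOKEN is set and looks like a
# Hugging Face token, 1 otherwise. It never prints the token value.
check_hf_token() {
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set in this shell session" >&2
    return 1
  fi
  case "$HF_TOKEN" in
    hf_*)
      # Assumption: tokens start with hf_, as in the export example.
      return 0
      ;;
    *)
      echo "HF_TOKEN is set but does not start with hf_" >&2
      return 1
      ;;
  esac
}
```

One way to use it is to chain it in front of a run, for example check_hf_token && codex exec --profile huggingface "Reply with exactly OK.", so the command is skipped when the variable is empty.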
Create the user-level Codex config directory.
```
$ mkdir -p ~/.codex
```

Add the Hugging Face provider to ~/.codex/config.toml with wire_api set to responses.
```
[model_providers.huggingface]
name = "Hugging Face"
base_url = "https://router.huggingface.co/v1"
env_key = "HF_TOKEN"
wire_api = "responses"
```
Create a profile file for the Hugging Face route at ~/.codex/huggingface.config.toml.
```
model_provider = "huggingface"
model = "openai/gpt-oss-120b"
```
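The two files can also be drafted from the shell and checked against each other before they go into ~/.codex. The following is a minimal sketch, not an official setup script: CODEX_DIR is a hypothetical variable used only here, and it defaults to a scratch directory so that an existing config.toml is not overwritten. The final check relies on the profile's model_provider value matching the name in the [model_providers.…] table header.

```shell
# CODEX_DIR is a hypothetical variable for this sketch. It defaults to a
# scratch directory so nothing under the real ~/.codex is overwritten.
CODEX_DIR="${CODEX_DIR:-$(mktemp -d)}"
mkdir -p "$CODEX_DIR"

# Provider entry, with the same keys as the config.toml step.
cat > "$CODEX_DIR/config.toml" <<'EOF'
[model_providers.huggingface]
name = "Hugging Face"
base_url = "https://router.huggingface.co/v1"
env_key = "HF_TOKEN"
wire_api = "responses"
EOF

# Profile file with top-level model_provider and model keys.
cat > "$CODEX_DIR/huggingface.config.toml" <<'EOF'
model_provider = "huggingface"
model = "openai/gpt-oss-120b"
EOF

# Extract the provider name the profile refers to.
provider=$(sed -n 's/^model_provider = "\(.*\)"$/\1/p' "$CODEX_DIR/huggingface.config.toml")

# Confirm config.toml defines a provider table with that exact name.
if grep -qxF "[model_providers.${provider}]" "$CODEX_DIR/config.toml"; then
  echo "profile references a defined provider: ${provider}"
else
  echo "no [model_providers.${provider}] table found" >&2
fi
```

After reviewing the drafts, merge the provider table into the real ~/.codex/config.toml by hand rather than copying over it, since that file may already hold other settings.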
Verify the Hugging Face profile with a short Codex run.
```
$ codex exec --profile huggingface "Reply with exactly OK."
OK
```
Use codex --profile huggingface for an interactive session with the same provider and model.
Override the model for one run when a different Hugging Face route is needed.
```
$ codex exec --profile huggingface -m "openai/gpt-oss-120b:groq" "Reply with exactly OK."
OK
```
Append a provider suffix only when that backend should be pinned. Hugging Face also supports routing policies such as :fastest or :cheapest when those policies match the model and account settings.
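To make the suffix convention concrete, the sketch below splits a model identifier at its first colon and reports whether a backend or policy is pinned. It is illustrative only: split_model_route is a hypothetical helper, and the label router-default is a name invented here for the case where no suffix is present. It does not check that the model or suffix exists on Hugging Face.

```shell
# Hypothetical helper: split "<model>[:<suffix>]" into its two parts.
split_model_route() {
  case "$1" in
    *:*)
      # Text before the first colon is the model, the rest is the suffix.
      echo "model=${1%%:*} route=${1#*:}"
      ;;
    *)
      # No suffix: the router picks the route (label invented for this sketch).
      echo "model=$1 route=router-default"
      ;;
  esac
}

split_model_route "openai/gpt-oss-120b:groq"
split_model_route "openai/gpt-oss-120b"
```

The first call reports route=groq and the second reports route=router-default, which mirrors the difference between the pinned -m override and the unpinned model in the profile file.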

Author: Mohd Shakir Zakaria