How to use Hugging Face models with Codex

Using Hugging Face Inference Providers with Codex makes it possible to run prompts against hosted open-weight models without changing the normal Codex command flow. This is useful when a task needs a model that is available through Hugging Face's router rather than through the built-in OpenAI or local --oss paths.

Current Codex releases treat Hugging Face as a custom OpenAI-compatible provider rather than a built-in backend like the local --oss path. A saved provider entry in ~/.codex/config.toml points Codex at https://router.huggingface.co/v1, HF_TOKEN supplies the bearer token, and a saved profile can pin a default model while -m still overrides it for a single run.

Custom providers in current Codex builds use the Responses API, so older chat-completions-only examples do not apply unless the remote service also exposes /v1/responses. Hugging Face's current Inference Providers router is Responses-compatible and supports provider suffixes such as :groq to pin a specific backend; omitting the suffix lets the router choose the default route for the model.
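
A quick way to confirm the router behaves as described, before writing any Codex config, is to call the Responses endpoint directly. This is a minimal sketch, assuming the router accepts the standard OpenAI-style model and input fields and that a token is already exported as HF_TOKEN (step 2 below):

    $ # assumption: the router mirrors the OpenAI Responses request shape
    $ curl -s https://router.huggingface.co/v1/responses \
        -H "Authorization: Bearer $HF_TOKEN" \
        -H "Content-Type: application/json" \
        -d '{"model": "openai/gpt-oss-120b:groq", "input": "Return OK."}'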

Steps to use Hugging Face models with Codex:

  1. Create the Codex config directory if it does not already exist.
    $ mkdir -p ~/.codex
  2. Export a Hugging Face token that has the "Make calls to Inference Providers" permission.
    $ export HF_TOKEN=<your-hugging-face-token>

    The token stays in the current shell session instead of being stored in ~/.codex/config.toml.
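
    As an optional check, the token can be verified against the Hub's whoami endpoint before Codex ever uses it. This is a small sketch assuming the current api/whoami-v2 route:

    $ # assumes the Hub serves token details at /api/whoami-v2
    $ curl -s https://huggingface.co/api/whoami-v2 \
        -H "Authorization: Bearer $HF_TOKEN"

    A JSON document describing the account and token should come back; a 401 here points at the token rather than the Codex config.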

  3. Add a Hugging Face provider entry and a reusable profile in ~/.codex/config.toml.
    ~/.codex/config.toml
    [model_providers.huggingface]
    name = "Hugging Face"
    base_url = "https://router.huggingface.co/v1"
    env_key = "HF_TOKEN"
    wire_api = "responses"
     
    [profiles.huggingface]
    model_provider = "huggingface"
    model = "openai/gpt-oss-120b:groq"

    Use wire_api = "responses". As noted above, custom providers in current Codex builds use the Responses API rather than the older chat wire setting.
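
    For a one-off run that skips the saved profile, the same settings can likely be passed inline. This sketch assumes the -c key=value override behavior of current Codex builds and still relies on the [model_providers.huggingface] table above:

    $ # assumption: -c values are parsed as TOML, so the model string keeps its quotes
    $ codex exec -c model_provider=huggingface \
        -c 'model="openai/gpt-oss-120b:groq"' "Return OK."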

  4. Run Codex against the saved Hugging Face profile and confirm that the model responds normally.
    $ codex exec -p huggingface "Return OK."
    OK.

    The same profile also works for interactive sessions with codex -p huggingface.

  5. Override the saved model ID with -m when the router should choose its default route instead of a pinned provider suffix.
    $ codex exec -p huggingface -m openai/gpt-oss-120b "Return OK."
    OK.

    Append a provider suffix such as :groq only when a specific backend matters for latency, cost, or availability. Any other model currently served through Inference Providers can be selected the same way with -m.

    If a pinned suffix stops working, remove it or switch to a provider that the model page currently shows as available.
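
    One way to see which backends a model currently maps to is the Hub API rather than the model page. This sketch assumes the expand[] query parameter on the models endpoint:

    $ # assumes the Hub API exposes provider mappings via expand[]=inferenceProviderMapping
    $ curl -s "https://huggingface.co/api/models/openai/gpt-oss-120b?expand[]=inferenceProviderMapping"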