How to use local models with Codex

Using local models with Codex keeps prompts and code on a local model server instead of sending them to a hosted provider. This is useful when the task includes sensitive source code, internal identifiers, or data that should stay on the local machine.

The Codex CLI can route local runs through a supported provider such as Ollama or LM Studio. You still choose the model with -m, but the model name must exactly match the identifier exposed by the local provider.

A local run works only when the provider is reachable and supports POST /v1/responses on its local API. Keep that API bound to localhost unless network access is intentional, and consult the provider-specific setup guides if Codex cannot connect or the model name is rejected.
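
If the provider is not reachable, the run fails before any model is involved, so a quick request against the local API is a useful first check. The sketch below assumes the default local ports (11434 for Ollama, 1234 for LM Studio); adjust them if your server is configured differently.

    $ curl -s http://localhost:11434/api/tags >/dev/null && echo "Ollama is reachable"
    Ollama is reachable
    $ curl -s http://localhost:1234/v1/models >/dev/null && echo "LM Studio is reachable"
    LM Studio is reachable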

Steps to use local models with Codex:

  1. Check the local provider for the exact model ID you will pass to Codex.
    $ ollama list
    NAME           ID              SIZE     MODIFIED
    gpt-oss:20b    17052f91a42e    13 GB    4 months ago

    For LM Studio, note the model ID shown by the local server or app, for example openai/gpt-oss-20b.
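
    The same IDs can also be listed over the provider's OpenAI-compatible API when it is enabled. This is an optional check, sketched assuming the Ollama default port 11434 (LM Studio's server typically listens on 1234) and that jq is installed:
    $ curl -s http://localhost:11434/v1/models | jq -r '.data[].id'
    gpt-oss:20b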

  2. Run Codex against the local provider with the matching model ID.
    $ codex exec --oss --local-provider ollama -m gpt-oss:20b "Reply with exactly: OK"
    OK

    Replace ollama and gpt-oss:20b with the provider and model ID you actually use.
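
    If the provider or model varies between machines, shell variables keep the command itself unchanged. A small sketch; PROVIDER and MODEL are only illustrative variable names:
    $ PROVIDER=ollama MODEL=gpt-oss:20b
    $ codex exec --oss --local-provider "$PROVIDER" -m "$MODEL" "Reply with exactly: OK"
    OK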

  3. Save the final reply from the local model to a file when you need reusable output for scripts or review.
    $ codex exec --oss --local-provider ollama -m gpt-oss:20b --output-last-message /tmp/codex-local.txt "Reply with exactly: OK"
    OK

    --output-last-message overwrites the destination file when it already exists.
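
    When each run's reply should be kept, writing to a fresh path avoids the overwrite. A minimal sketch using mktemp to create a unique file; the template name is only an example:
    $ OUT=$(mktemp /tmp/codex-local.XXXXXX)
    $ codex exec --oss --local-provider ollama -m gpt-oss:20b --output-last-message "$OUT" "Reply with exactly: OK"
    OK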

  4. Confirm that the saved result contains the expected local-model response.
    $ cat /tmp/codex-local.txt
    OK
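
    In a script, this check can be made explicit rather than visual. A small sketch using grep to require an exact, whole-line match:
    $ grep -qx 'OK' /tmp/codex-local.txt && echo "local reply matches" || echo "unexpected local reply" >&2
    local reply matches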