Using local models with Codex keeps prompts and outputs on the local machine, which helps reduce data exposure and can lower latency for iterative work.

The Codex CLI routes requests to a local provider when invoked with --oss. The provider defaults to the configured value and can be overridden per run with --local-provider (ollama or lmstudio), while -m selects the model name.

A local provider server must already be running, and the chosen model must already be pulled or downloaded, before executing Codex commands. Behavior and capabilities vary by provider and model, and exposing a local model API beyond the workstation can leak prompts to other hosts.
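Before running Codex, it can help to confirm that the provider API answers locally and that the model is present. A minimal sketch, assuming the providers' default ports (11434 for Ollama, 1234 for LM Studio) and their documented model-listing endpoints; the `provider_endpoint` helper is an illustrative name, not part of either tool:

```shell
# Map a provider name to its default model-listing endpoint.
# Ports 11434 (Ollama) and 1234 (LM Studio) are the providers' defaults;
# adjust if your server is bound elsewhere.
provider_endpoint() {
  case "$1" in
    ollama)   echo "http://127.0.0.1:11434/api/tags" ;;
    lmstudio) echo "http://127.0.0.1:1234/v1/models" ;;
    *)        echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

# Probe before invoking codex, for example:
#   curl -sf "$(provider_endpoint ollama)" >/dev/null || echo "server not reachable"
```

Both endpoints list the models the server already has, so the same probe also confirms the exact model identifier to pass with -m.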

Steps to use local models with Codex:

  1. Start the local model server for Ollama or LM Studio.

    Binding the local model API to a non-local interface can allow other hosts to submit prompts and retrieve responses; keep the listener restricted to localhost unless network access is required.

  2. Pull or download the model intended for Codex prompts.

    The model name passed to Codex must match the provider's model identifier, which can differ between Ollama tags and LM Studio model names.

  3. Run Codex with the Ollama provider and a local model name.
    $ codex exec --oss --local-provider ollama -m llama3.2 "Return OK."
    OK
  4. Run Codex with the LM Studio provider and a local model name.
    $ codex exec --oss --local-provider lmstudio -m "llama-3.2-3b-instruct" "Return OK."
    OK
  5. Save a local model response to a file when you need to reuse it.
    $ codex exec --oss --local-provider ollama -m llama3.2 --output-last-message /tmp/codex-local.txt "Return OK."
    OK
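The steps above can be wrapped in a small helper when the same provider, model, and output file are used repeatedly. A sketch using only the flags shown in this guide; the `run_local` function, its argument names, and the example path are illustrative, not part of Codex:

```shell
# run_local: invoke Codex against a local provider and save the last
# message to a file for reuse. Flags are those shown in the steps above.
run_local() {
  provider="$1"; model="$2"; out="$3"; prompt="$4"
  codex exec --oss --local-provider "$provider" -m "$model" \
    --output-last-message "$out" "$prompt"
}

# Example: capture a response, then reuse it elsewhere.
#   run_local ollama llama3.2 /tmp/codex-local.txt "Return OK."
#   cat /tmp/codex-local.txt
```

Quoting "$provider", "$model", "$out", and "$prompt" keeps model names with spaces (common in LM Studio) and multi-word prompts intact.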