Bombshell Alert! Guide to Unlimited Local Tokens with Claude Code

Claude Code is powerful, but token consumption can be painful!

Finally, Claude Code can work with local models, and the setup is very simple.

The setup below was done on a Mac mini (M4). It also works on Windows.

If you're into desktop AI these days, an M-series Mac desktop makes a good personal AI workstation: for example, a Mac mini with an M4 or M4 Pro, or a machine with an M3 Ultra or M4 Max.

First, upgrade LM Studio to the latest version (0.4.1 at the time of writing), because that release adds support for Claude Code. (Ollama works too.)

You can run any open-source model locally, as long as your Mac has enough memory. We'll use gpt-oss-20b-mlx as the example, an MLX build of OpenAI's open-weight gpt-oss-20b model.

One thing to note: set the context length as high as the model supports. Agents running multi-turn tasks depend heavily on context length, and a window that is too short simply won't work. In practice, balance this setting against your Mac's memory and the model's inference speed. Also, on a Mac, prefer models in MLX format, which run inference faster than GGUF models.
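The context length can also be set from the command line rather than in the LM Studio window. The following is a minimal sketch, assuming LM Studio's `lms` CLI is installed and on your PATH and that its `load` command accepts `--context-length`. The helper name `load_with_context` is made up for this example, and the model key should be whatever `lms ls` prints on your machine.

```shell
# Hypothetical helper: load a model through LM Studio's `lms` CLI with an
# explicit context length.
# Assumptions: `lms` is on PATH and `lms load` accepts --context-length.
load_with_context() {
  local model="$1" ctx="$2"
  # Reject anything that is not a plain positive integer.
  case "$ctx" in
    ''|*[!0-9]*|0*)
      echo "context length must be a positive integer" >&2
      return 1
      ;;
  esac
  lms load "$model" --context-length "$ctx"
}
```

For example, `load_with_context gpt-oss-20b-mlx 32768` asks for a 32768-token window. The number is only illustrative: start with a value your memory can hold, then raise it toward the model's maximum while watching memory use and generation speed.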

Next, we set up Claude Code from the command line.

Configure environment variables:

export ANTHROPIC_AUTH_TOKEN=lmstudio

export ANTHROPIC_BASE_URL=http://localhost:1234
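Before going further, it can help to confirm that the LM Studio server is actually listening at the address these variables point to. The sketch below assumes the local server is running and answers `GET /v1/models` (its OpenAI-compatible model listing). The function names `lmstudio_models_url` and `lmstudio_check` are made up for this example.

```shell
# Build the model-listing URL from ANTHROPIC_BASE_URL, dropping a trailing
# slash so the path is not doubled. Falls back to the address used in this guide.
lmstudio_models_url() {
  local base="${ANTHROPIC_BASE_URL:-http://localhost:1234}"
  printf '%s/v1/models\n' "${base%/}"
}

# Query the server. curl -f turns HTTP errors into a non-zero exit status,
# -sS hides the progress bar but still prints real errors.
# Assumption: the LM Studio server is started and serves /v1/models.
lmstudio_check() {
  curl -fsS "$(lmstudio_models_url)"
}
```

Running `lmstudio_check` should print a JSON list of the loaded models. The model id shown there is the name to pass to `claude --model` later. If the command fails to connect, the server is probably not started or is using a different port.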

Install Claude Code itself:

npm install -g @anthropic-ai/claude-code

Then, start Claude Code:

claude --model gpt-oss-20b-mlx

At this point, Claude Code sends its requests to your local model instead of Anthropic's servers.
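Exported variables stay active for the rest of the shell session, so every later `claude` run in that terminal also goes to the local server. If you switch between local and hosted models, a small wrapper keeps the override scoped to a single command. This is only a sketch: the name `claude_local` is made up, and the address, token, and default model are the values used in this guide.

```shell
# Hypothetical helper: run Claude Code against the local LM Studio server
# without exporting the variables into the rest of the shell session.
# Usage: claude_local [model] [extra claude arguments...]
claude_local() {
  local model="${1:-gpt-oss-20b-mlx}"
  # Drop the model argument so "$@" holds only the extra arguments.
  if [ "$#" -gt 0 ]; then shift; fi
  # Variables set as a command prefix apply to this one invocation only.
  ANTHROPIC_BASE_URL="http://localhost:1234" \
  ANTHROPIC_AUTH_TOKEN="lmstudio" \
    claude --model "$model" "$@"
}
```

With this in your shell profile, `claude_local` starts a session on the default local model, while a plain `claude` in the same terminal keeps whatever configuration it had before.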

In addition to using it in the terminal, it can also be used in VS Code with the following configuration:

First, we install the Claude Code for VS Code extension.

Then set the environment variables in the VS Code settings (settings.json):

{
    "claudeCode.environmentVariables": [
        { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:1234" },
        { "name": "ANTHROPIC_AUTH_TOKEN", "value": "lmstudio" }
    ]
}

Then you can get to work.

Food for thought: Is Claude Code still the same Claude Code without using Anthropic models?

The gpt-oss-20b-mlx model used here certainly can't match Opus 4.5. If you have the hardware to deploy Kimi K2.5 locally, though, its capabilities are currently no weaker than those of Opus 4.5.
