Stars на GitHub не коррелируют с вашим burn rate. В 2026 году инженерный стек определяют RPM/RPD лимиты, OAuth refresh policy и совместимость BASE_URL с OpenAI API shape. Этот гайд — для тех, кто поднимает Gemini CLI, Codex CLI, Claude Code, OpenCode, OpenClaw, Copilot и Cursor на macOS: полная карта free tier, конфиги relay, 10 правил экономии токенов и runbook из шести шагов на облачном Mac NUKCLOUD. Контекст: миграция Gemini CLI → Antigravity, OpenRouter CLI ranking, SKILL.md стандарт.
00Карта free tier: specs на июнь 2026
Baseline snapshot. Proxy = нужен VPN/прокси для прямого доступа из РФ/СНГ без relay.
| Runtime / API | Quota | Paid acct | Proxy | Hook |
|---|---|---|---|---|
| Gemini CLI OAuth | 1000 RPD, 60 RPM | no | yes | no CC, 1M ctx window |
| Gemini API Studio | Flash 1500 RPD, Pro 100 RPD | no | yes | official free tier API |
| Codex CLI | ChatGPT Free (timeboxed) | no | yes | OS sandbox, GPT-5.3-Codex |
| Claude Code | Pro/Max $20+ or relay | yes/relay | yes | best codegen quality |
| OpenCode | tool free, API BYOK | no | no w/ CN relay | 146K stars, 75+ providers |
| OpenClaw | tool free, multi-auth | no | no w/ CN relay | reuse Gemini OAuth |
| Copilot Free | 2000 completions + 50 premium/mo | no | no | instant enable |
| Copilot Student | 300 premium/mo (Pro parity) | no (Edu) | no | ~$10/mo value |
| Cursor Hobby | 2000 Tab + 50 slow premium/mo | no | no | full VS Code fork |
| SiliconFlow | 20M tokens permanent | no | no (CN) | DeepSeek, Qwen, GLM-5 |
| Alibaba DashScope | 70M tokens promo | no | no (CN) | 70+ models |
| Zhipu GLM | 20M tokens permanent | no | no (CN) | GLM-5 for Claude relay |
| Groq | 14400 RPD permanent | no | yes | Llama 3.3, Mixtral |
| NVIDIA NIM | permanent free tier | no | partial | Llama, Nemotron, DeepSeek |
Stack math: SiliconFlow 20M + Alibaba 70M + Zhipu 20M = 110M tokens без подписки. Gemini CLI OAuth даёт до 1000 invocations/day — при avg 2K output tokens это theoretical ceiling ~2M tokens/day, на практике меньше за счёт Flash-Lite routing.
01Gemini CLI: OAuth flow и rate limit matrix
Install path (Node 18+): npm install -g @google/gemini-cli | brew install gemini-cli | zero-install: npx @google/gemini-cli. Auth: gemini → «Sign in with Google» → browser OAuth → token cached locally.
| Model | RPM | TPM | RPD |
|---|---|---|---|
| gemini-2.5-pro | 5 | 250000 | 100 |
| gemini-2.5-flash | 10 | 250000 | 250 |
| gemini-2.5-flash-lite | 15 | 250000 | 1000 |
| CLI OAuth aggregate | 60 | — | 1000 |
Runtime commands: /model, /model gemini-2.5-flash, /stats model. ToS constraint: OAuth token must not be proxied to third-party gateways — account ban risk. Enterprise/API-key path exempt from 2026-06-18 Antigravity cutover for personal OAuth users per Google blog.
02Codex CLI: Rust sandbox + relay config
OpenAI Codex CLI — 83K+ stars, kernel-level sandbox, headless CI capable. Install: npm install -g @openai/codex | brew install --cask codex. Auth path A: ChatGPT OAuth (codex → browser). Path B: CN relay without VPN.
~/.codex/config.toml example:
openai_base_url = "https://api.siliconflow.cn/v1"model = "deepseek-ai/DeepSeek-V3"sandbox_mode = "workspace-write"web_search = "disabled"approval_policy = "on-request"
Env: export OPENAI_API_KEY="sk-...". Health check: codex doctor. One-shot: codex "fix rustc error E0382 in src/main.rs".
03Claude Code: Pro gate или API relay
No permanent free tier. Install vectors: curl -fsSL https://claude.ai/install.sh | bash, brew install --cask claude-code, npm install -g @anthropic-ai/claude-code, Windows: winget install Anthropic.ClaudeCode.
Relay via ~/.claude/settings.json:
"env": { "ANTHROPIC_BASE_URL": "https://api.siliconflow.cn/v1", "ANTHROPIC_API_KEY": "sk-xxxx" }
Or shell: export ANTHROPIC_BASE_URL=... export ANTHROPIC_API_KEY=... claude. Model switch: /model | claude --model claude-sonnet-4-6. Critical: never run /init on monorepo — single invocation can burn 100K–500K tokens indexing entire tree.
04OpenCode + OpenClaw: BYOK multi-provider stack
OpenCode install: curl -fsSL https://opencode.ai/install | bash | brew install anomalyco/tap/opencode | Docker: docker run -it --rm ghcr.io/anomalyco/opencode. Provider wiring: TUI /connect or ~/.config/opencode/config.json.
| Provider | Model | Cost |
|---|---|---|
| Gemini API | gemini-2.5-flash | 1500 RPD free |
| SiliconFlow | deepseek-ai/DeepSeek-V3 | 20M reg bonus |
| Ollama local | qwen3:8b, deepseek-coder-v2:16b | $0, RAM-bound |
| OpenCode Zen | Grok Code Fast, GLM 4.7 | time-limited free |
OpenClaw: curl -fsSL https://openclaw.ai/install.sh | bash, openclaw onboard --install-daemon. Gemini OAuth reuse: openclaw models auth login --provider google-gemini-cli --set-default. Claude token: claude setup-token → openclaw models auth paste-token --provider anthropic. Free model discovery: openclaw models scan. Production deploy: OpenClaw на Mac Mini M4.
05Copilot + Cursor: IDE-side quotas
Copilot Free: GitHub Settings → Copilot → Enable. Limits: 2000 completions/mo, 50 premium requests/mo. Copilot Student: GitHub Education → auto Pro parity, 300 premium/mo. Note: since 2026-04-20 new Copilot Pro signups may be paused — check UI state. OSS maintainers: github-copilot/signup/orgs.
Cursor Hobby: download from cursor.com, 2000 Tab completions + 50 slow premium/mo, no CC. Student Pro: cursor.com/students. Usage dashboard: app.cursor.sh/account/usage. Agent workflows: Copilot Coding Agent runbook.
06API endpoints: copy-paste reference
| Provider | BASE_URL | Bonus | Models |
|---|---|---|---|
| SiliconFlow | https://api.siliconflow.cn/v1 | 20M permanent | DeepSeek-V3, Qwen3.5, GLM-5 |
| Alibaba | https://dashscope.aliyuncs.com/compatible-mode/v1 | 70M promo | Qwen3.5-Max/Plus |
| Zhipu | https://open.bigmodel.cn/api/paas/v4 | 20M permanent | GLM-5, GLM-4.7-Flash |
| InfinigenceAI | cloud.infini-ai.com | billion-token promo | GenStudio API |
| Groq | https://api.groq.com/openai/v1 | 14400 RPD | Llama 3.3, Mixtral |
| NVIDIA NIM | build.nvidia.com | permanent free | Llama, Nemotron |
Universal env pattern for Codex/OpenCode/Claude relay:
export OPENAI_API_KEY="sk-..."export OPENAI_BASE_URL="https://api.siliconflow.cn/v1"
OpenRouter free models complement CN stack — routing policy in LLM trends June 2026.
07Token optimization: 10 hard rules
- Rule 1: ban
/initon monorepos — Claude Code/Codex will ingest entire tree. - Rule 2: single-file scope per prompt, not «refactor entire codebase».
- Rule 3: route trivial tasks to gemini-2.5-flash-lite (1000 RPD) before Pro.
- Rule 4: set 80% quota alerts on SiliconFlow/Alibaba dashboards.
- Rule 5: local Ollama for lint/format —
ollama pull qwen3:8b, zero API burn. - Rule 6: compress system prompts in CLI configs — 5–15% overhead reduction per call.
- Rule 7: version SKILL.md for recurring prompts — reduces multi-CLI drift.
- Rule 8:
openclaw models scanfor OpenRouter permanent free models. - Rule 9: multi-key rotation when hitting HTTP 429 rate limits.
- Rule 10: track vendor promo campaigns — CN platforms run seasonal credit drops.
08Decision matrix: workload × stack × RAM
| Workload | Stack | Free lever | Cloud Mac RAM |
|---|---|---|---|
| daily terminal coding | Gemini CLI OAuth | 1000 RPD | 16 GB |
| max codegen quality | Claude Code + GLM relay | Zhipu 20M | 32 GB |
| student IDE workflow | Copilot Student + Cursor Edu | $0 Pro parity | 16 GB |
| multi-model routing | OpenCode + OpenRouter | BYOK + scan | 32 GB |
| 7×24 Telegram gateway | OpenClaw + Hermes | Gemini OAuth reuse | 32–96 GB |
| CI headless agent | Codex CLI + DeepSeek relay | SiliconFlow 20M | 32 GB+ |
| offline / air-gap | Ollama + OpenCode | zero API cost | 64 GB+ |
Rent vs buy: 3–6 month token experiments → hourly cloud Mac from страница цен, then lock spec on заказ. Dimension by agent parallelism and Docker sandbox count, not weekly model hype.
09Runbook ×6: free tokens on NUKCLOUD cloud Mac
Free tiers solve model access. Dedicated Apple Silicon solves OAuth session persistence, launchd KeepAlive, and tenant isolation. Aligns with console provisioning runbook.
-
01
Provision instance: NUKCLOUD консоль — pick region, 16 GB for light CLI OAuth, 32 GB+ for OpenClaw/Hermes + Docker. SSH key, verify disk quota for OAuth token cache and agent logs.
-
02
SSH baseline:
ssh user@your-cloud-mac,xcode-select --installif needed,brew install git node python@3.12. Ping Google OAuth endpoints, GitHub, and your chosen API BASE_URLs. -
03
Register free API keys: SiliconFlow, Groq, Gemini AI Studio. Write to
~/.zshrc:export GEMINI_API_KEY,OPENAI_API_KEY,OPENROUTER_API_KEY. Enable 80% quota alerts on provider dashboards. -
04
Install CLI stack:
npm install -g @google/gemini-cli @openai/codex, OpenCode via curl, OpenClaw viaopenclaw onboard. Complete Gemini OAuth in browser; runcodex doctor; OpenCode/connect. -
05
launchd 7×24 gateway:
~/Library/LaunchAgents/com.team.freeagent.plistwithRunAtLoad+KeepAlivefor OpenClaw/Hermes. Centralize prompts in SKILL.md — see Hermes install guide. - 06
Shared minute-pool macOS VPS resets long OAuth sessions and kills launchd gateways — fatal for 12h agent loops with thousands of tool calls. NUKCLOUD dedicated cloud Mac nodes provide auditable uptime for Gemini OAuth cache and parallel CLI processes.