Deploying Ollama + Qwen2.5-Coder locally, with an FRP tunnel for access from the public internet

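### Preface

Lately I have gone all-in on Cursor and Codex, and my wallet can no longer keep up. That gave me the idea of deploying a large model locally. I happen to have an idle Windows laptop (32GB DDR4, GTX 1080 8GB), and the cloud server that hosts my blog can act as the relay for the tunnel.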

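### 1. Choosing and loading the model

1. Download [Ollama](https://ollama.com/) from the official site and install it.

2. Given the specs of my idle machine, I settled on `Qwen2.5-Coder-7B-Instruct-GGUF`.

3. Load the model file.

Ollama does not pick up a bare GGUF file on its own; it needs a Modelfile that points at the file so the model can be read correctly. The model file can be downloaded from [ModelScope](https://modelscope.cn/models/Qwen/Qwen2.5-Coder-14B-Instruct-GGUF). Note that this link and the commands below use the 14B q4_k_m build; if you downloaded a different build, substitute its file name and tag.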

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash"># Create a directory for the model and put the downloaded GGUF file in it
mkdir -p ~/ollama-models/qwen-coder
cd ~/ollama-models/qwen-coder</code></pre></div>

In the same directory, create the Modelfile:

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">touch Modelfile</code></pre></div>

with the following content:

<div class="code-toolbar"><pre class="language-Bash line-numbers"><code class="language-Bash">FROM ./qwen2.5-coder-14b-instruct-q4_k_m.gguf

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{- end }}
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER repeat_penalty 1.1</code></pre></div>

4. Read the Modelfile and create the Ollama model.

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">ollama create qwen2.5-coder:14b -f Modelfile</code></pre></div>

5. Run the model.

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">ollama run qwen2.5-coder:14b</code></pre></div>

Once started, Ollama runs an API service in the background by default. Open `http://127.0.0.1:11434` in a browser; if you see "Ollama is running", the deployment succeeded.
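The same check can be scripted. The sketch below assumes `curl` is installed and that Ollama listens on its default address. `OLLAMA_URL`, `ollama_is_up` and `ollama_has_model` are names made up for this post, while `/api/tags` is the Ollama endpoint that lists local models.

```shell
# Sketch: check that the local Ollama API answers and that a model is registered.
# OLLAMA_URL and both function names are illustrative, not part of Ollama.
OLLAMA_URL="${OLLAMA_URL:-http://127.0.0.1:11434}"

# Succeeds when the root endpoint returns the "Ollama is running" banner.
ollama_is_up() {
  curl -fsS --max-time 5 "$OLLAMA_URL/" | grep -q "Ollama is running"
}

# Succeeds when /api/tags (the local model list) contains the given tag.
ollama_has_model() {
  curl -fsS --max-time 5 "$OLLAMA_URL/api/tags" | grep -Fq "\"$1\""
}
```

For example, `ollama_is_up && ollama_has_model "qwen2.5-coder:14b" && echo ready` prints `ready` only when both checks pass.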

### 2. NAT traversal

I use the open-source **FRP** to build a dedicated tunnel. My server runs Linux, so I downloaded the matching archive from the [FRP Release](https://github.com/fatedier/frp/releases) page.

#### 1. Configure frps on the server

I use the BaoTa (宝塔) panel. Extract the downloaded FRP archive under the `server` directory and edit the server-side config file, `frps.ini`.

![](https://images.xxzxka.com/typecho/image-20260430100032958.png)

The server side only needs the port that frpc connects to and a shared token:

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">[common]
# Port that frpc clients connect to
bind_port = 7000
# Must match the token in the client config
token = your_strong_password_here</code></pre></div>

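#### 2. Configure the site used as the reverse proxy

Go to "Websites" in the BaoTa panel and create a new site. Set PHP to pure static and skip creating a database. For the domain, a subdomain of a domain you already own usually works, e.g. ai.yourdomain.com

Once the site is added, configure its reverse proxy settings so that it forwards to the port frp exposes on the server (`http://127.0.0.1:11434` with the port mapping used below).

Request and install a Let's Encrypt certificate for the proxy site. Choose DNS validation or file validation and, as prompted, add the corresponding records at your domain provider. When that is done, open a terminal in the frp directory and start the server-side frp process (`frps`).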

> ⚠️ Note:
>
> Remember to open port `7000` in the cloud server's **security group/firewall**, along with port `11434`, which is mapped in the next step.
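To confirm that the ports are actually reachable from outside, a quick probe helps. This is a minimal sketch that assumes `nc` (netcat) is available on the machine you test from; `port_open` is a hypothetical helper name.

```shell
# Sketch: report whether a TCP port on the server accepts connections.
# port_open is an illustrative helper, not part of frp.
port_open() {
  host="$1"
  port="$2"
  # -z: only probe the port, send no data; -w 3: give up after 3 seconds
  if nc -z -w 3 "$host" "$port" >/dev/null 2>&1; then
    echo "$host:$port open"
  else
    echo "$host:$port closed or filtered"
    return 1
  fi
}
```

Call it as `port_open <server-ip> 7000`. A port only shows as open while something is listening on it, so `7000` should respond once the server side of frp is running and `11434` only after the Windows client has connected.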

#### 3. Windows client configuration (frpc)

Download the Windows build of FRP from the release page above, extract it somewhere, and edit `frpc.toml` in that directory.

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">serverAddr = "YOUR_SERVER_PUBLIC_IP"
serverPort = 7000
auth.token = "your_strong_password_here" # must match the server side

[[proxies]]
name = "ollama-api"   # any name you like
type = "tcp"
localIP = "127.0.0.1" # address of Ollama on the Windows machine
localPort = 11434     # Ollama's default port
remotePort = 11434    # port exposed on the cloud server (customizable; remember to open it on the server)</code></pre></div>

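Then run frpc from the root of the extracted directory, pointing it at the config file with `-c` (for example `frpc.exe -c frpc.toml`).

If it prints `start proxy success`, the tunnel is up. You can now test it by opening `http://YOUR_SERVER_PUBLIC_IP:11434` in a phone browser.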

![](https://images.xxzxka.com/typecho/9767827232bcbcf5baa3ba11ac18d107.jpg)

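### 3. Testing and connecting an agent

#### 1. Test the connection

On another computer, check that the connection works by running the following in a terminal. The `model` value has to match a name listed by `ollama list` on the Windows machine: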

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash">curl -X POST https://ai.yourdomain.com/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer your-password" \
-d '{
  "model": "qwen2.5-coder:7b",
  "messages": [
    {
      "role": "user",
      "content": "Connection test. Reply with exactly: connection OK"
    }
  ],
  "stream": false
}'</code></pre></div>

![](https://images.xxzxka.com/typecho/46EA7B61-AF1A-4155-AAF7-DDED72A3C2D5_1_201_a.jpeg)
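If the request fails, the HTTP status code usually narrows down which hop is broken. A hedged sketch follows: `probe_chat_endpoint` is a made-up helper, the URL and key are placeholders, and the hints reflect a typical Nginx reverse proxy setup rather than anything specific to this deployment. It queries `/v1/models`, which I am assuming is available as part of Ollama's OpenAI-compatible API.

```shell
# Sketch: map the status code of the proxied endpoint to a likely cause.
# probe_chat_endpoint is illustrative; both arguments are placeholders.
probe_chat_endpoint() {
  base_url="$1"
  api_key="$2"
  # -o /dev/null discards the body; -w '%{http_code}' prints only the status code
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -H "Authorization: Bearer $api_key" "$base_url/v1/models")
  case "$code" in
    200) echo "200: proxy, tunnel and Ollama all respond" ;;
    401|403) echo "$code: rejected by the proxy, check the key" ;;
    404) echo "404: reached a server, but the path is wrong" ;;
    502|504) echo "$code: proxy is up, but the tunnel or Ollama is not" ;;
    000) echo "000: no HTTP response, check DNS, TLS and the firewall" ;;
    *) echo "$code: unexpected status" ;;
  esac
}
```

Run it as `probe_chat_endpoint https://ai.yourdomain.com your-password`. A `502` typically means the proxy could not reach port `11434` on the server, which points at frpc or Ollama on the Windows side.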

#### 2. Connecting the Codex desktop app

![C9C04548-C3CE-4631-BF19-B3E08B1C14D8](https://images.xxzxka.com/typecho/C9C04548-C3CE-4631-BF19-B3E08B1C14D8.png)

Open `config.toml` and override it as follows:

<div class="code-toolbar"><pre class="language-toml line-numbers"><code class="language-toml"># --- Change this: the model and provider to use ---
model = "qwen2.5-coder:7b" # Note: this must match the name shown by `ollama list` exactly
model_provider = "my_ollama"
model_reasoning_effort = "medium"

# --- Add this: the private API provider ---
[model_providers.my_ollama]
name = "Local Qwen via Ollama"
base_url = "https://ai.yourdomain.com/v1" # replace with your real tunnel domain; the trailing /v1 must stay
env_key = "OLLAMA_DUMMY_KEY"

# --- Leave the rest of your existing config untouched ---
[marketplaces.openai-bundled]
last_updated = "2026-04-30T06:00:34Z"
source_type = "local"
source = "/Users/panpan/.codex/.tmp/bundled-marketplaces/openai-bundled"

[marketplaces.openai-primary-runtime]
last_updated = "2026-04-27T01:45:09Z"
source_type = "local"
source = "/Users/panpan/.cache/codex-runtimes/codex-primary-runtime/plugins/openai-primary-runtime"

[plugins."browser-use@openai-bundled"]
enabled = true

[plugins."documents@openai-primary-runtime"]
enabled = true

[plugins."spreadsheets@openai-primary-runtime"]
enabled = true

[plugins."presentations@openai-primary-runtime"]
enabled = true

[projects."/Users/panpan/工作/om-client"]
trust_level = "trusted"</code></pre></div>

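> Note:
>
> `env_key = "OLLAMA_DUMMY_KEY"` holds the name of an environment variable. You cannot put the token you set on the Windows side there directly, because Codex does not accept a plaintext token in this field. Instead, set the environment variable first and let Codex read it when it starts.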

<div class="code-toolbar"><pre class="language-bash line-numbers"><code class="language-bash"># Append the key to the shell profile
echo 'export OLLAMA_DUMMY_KEY="sk-123456789"' >> ~/.zshrc

# Reload the profile
source ~/.zshrc

# Check that the variable is set
echo $OLLAMA_DUMMY_KEY

# Use macOS's launchctl to expose the variable to GUI apps, so Codex reads it when opened by double-clicking
launchctl setenv OLLAMA_DUMMY_KEY "sk-123456789"</code></pre></div>
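Before launching Codex it is worth confirming that the variable is visible both in the shell and to apps started from Finder. A small sketch, assuming macOS; `check_key` is a hypothetical helper name.

```shell
# Sketch: confirm the key is set in the current shell and in the launchd session.
# check_key is an illustrative helper name.
check_key() {
  name="$1"
  # printenv exits non-zero when the variable is not exported in this shell
  if printenv "$name" >/dev/null; then
    echo "shell: $name is set"
  else
    echo "shell: $name is missing"
    return 1
  fi
  # launchctl getenv prints the value that GUI apps will inherit (macOS only)
  if [ -n "$(launchctl getenv "$name" 2>/dev/null)" ]; then
    echo "launchd: $name is set"
  else
    echo "launchd: $name is missing"
    return 1
  fi
}
```

Run `check_key OLLAMA_DUMMY_KEY` after the commands above. Values set with `launchctl setenv` do not survive a reboot, so the command has to be rerun after restarting the Mac.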

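Written by yuin.
Unless marked as reposted or attributed to another source, articles on this site are original works or translations by this site. Please credit the author when reposting.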