---
title: "Integrating the bge-rerank Re-ranking Model"
description: "Integrating the bge-rerank re-ranking model"
icon: sort
draft: false
toc: true
weight: 920
---
## Recommended configurations per model

The recommended configurations are as follows:
{{< table "table-hover table-striped-columns" >}}

| Model | RAM | VRAM | Disk | Launch command |
| --- | --- | --- | --- | --- |
| bge-reranker-base | >=4GB | >=4GB | >=8GB | python app.py |
| bge-reranker-large | >=8GB | >=8GB | >=8GB | python app.py |
| bge-reranker-v2-m3 | >=8GB | >=8GB | >=8GB | python app.py |

{{< /table >}}
## Deploying from source

1. Set up the environment

   - Python 3.9 or 3.10
   - CUDA 11.7
   - Unrestricted internet access (GitHub and Hugging Face must be reachable)
2. Download the code

   The code for the three models lives at:

   - https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-base
   - https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-large
   - https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-v2-m3
3. Install the dependencies

   ```bash
   pip install -r requirements.txt
   ```
4. Download the model weights

   The Hugging Face repositories of the three models:

   - https://huggingface.co/BAAI/bge-reranker-base
   - https://huggingface.co/BAAI/bge-reranker-large
   - https://huggingface.co/BAAI/bge-reranker-v2-m3

   Clone each model into its corresponding code directory. The directory layout looks like:

   ```
   bge-reranker-base/
   ├── app.py
   ├── Dockerfile
   └── requirements.txt
   ```
5. Run the service

   ```bash
   python app.py
   ```

   After a successful start, the listening address is printed to the console; `http://0.0.0.0:6006` is the connection address.
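Once the service is up, it can be called over HTTP with the `ACCESS_TOKEN` sent as a Bearer token. The sketch below builds such a request in Python; note that the payload field names (`query`, `documents`) follow the common rerank-API convention and are assumptions — check `app.py` in the model directory for the exact schema.

```python
# Minimal sketch of a request against the /v1/rerank endpoint.
# The payload shape ({"query": ..., "documents": [...]}) is an assumption
# based on common rerank APIs; verify it against app.py.
import json
import urllib.request


def build_rerank_request(host: str, token: str, query: str,
                         documents: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request to {host}/v1/rerank."""
    payload = json.dumps({"query": query, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        url=f"{host}/v1/rerank",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Must match the ACCESS_TOKEN environment variable of the service.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


req = build_rerank_request("http://0.0.0.0:6006", "mytoken",
                           "what is a rerank model", ["doc A", "doc B"])
print(req.full_url)  # → http://0.0.0.0:6006/v1/rerank
# urllib.request.urlopen(req) would send it against a running service.
```

Sending the request with `urllib.request.urlopen(req)` (or any HTTP client) should return the re-ranked scores for the supplied documents.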
## Docker deployment

The image names are:

- registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1 (4 GB+)
- registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-large:v0.1 (5 GB+)
- registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1 (5 GB+)

Port: `6006`

Environment variables:

- `ACCESS_TOKEN` — access credential; requests must carry the header `Authorization: Bearer ${ACCESS_TOKEN}`.

Example run command:

```bash
# Use "mytoken" as the auth token
docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken \
  --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
```
Example `docker-compose.yml`:

```yaml
version: "3"
services:
  reranker:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
    container_name: reranker
    # GPU runtime; if the host has no GPU environment installed, remove the deploy block
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 6006:6006
    environment:
      - ACCESS_TOKEN=mytoken
```
## Integrating with FastGPT

- Open the FastGPT model configuration page and add a new re-rank model.
- Fill in the model configuration form: set the model ID to `bge-reranker-base` and the request URL to `{{host}}/v1/rerank`, where `host` is the domain or `IP:Port` of your deployment.
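For FastGPT versions that still use a file-based model configuration, the same model can be declared in `config.json`. The fragment below is a sketch only — the field names (`reRankModels`, `requestUrl`, `requestAuth`) are assumptions drawn from older FastGPT configurations and may differ in your version:

```json
{
  "reRankModels": [
    {
      "model": "bge-reranker-base",
      "name": "bge-reranker-base",
      "requestUrl": "http://0.0.0.0:6006/v1/rerank",
      "requestAuth": "mytoken"
    }
  ]
}
```

Here `requestAuth` should match the `ACCESS_TOKEN` environment variable of the reranker service.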
## FAQ

### 403 error

The custom request token configured in FastGPT does not match the `ACCESS_TOKEN` environment variable of the reranker service.

### Docker reports "Bus error (core dumped)"

Try adding the `shm_size` option to `docker-compose.yml` to enlarge the container's shared-memory directory:
```yaml
...
services:
  reranker:
    ...
    container_name: reranker
    shm_size: '2gb'
...
```