Files
FastGPT/docSite/content/zh-cn/docs/development/custom-models/bge-rerank.md
Archer db2c0a0bdb V4.8.20 feature (#3686)
* Aiproxy (#3649)

* model config

* feat: model config ui

* perf: rename variable

* feat: custom request url

* perf: model buffer

* perf: init model

* feat: json model config

* auto login

* fix: ts

* update packages

* package

* fix: dockerfile

* feat: usage filter & export & dashbord (#3538)

* feat: usage filter & export & dashbord

* adjust ui

* fix tmb scroll

* fix code & selecte all

* merge

* perf: usages list;perf: move components (#3654)

* perf: usages list

* team sub plan load

* perf: usage dashboard code

* perf: dashboard ui

* perf: move components

* add default model config (#3653)

* 4.8.20 test (#3656)

* provider

* perf: model config

* model perf (#3657)

* fix: model

* dataset quote

* perf: model config

* model tag

* doubao model config

* perf: config model

* feat: model test

* fix: POST 500 error on dingtalk bot (#3655)

* feat: default model (#3662)

* move model config

* feat: default model

* fix: false triggerd org selection (#3661)

* export usage csv i18n (#3660)

* export usage csv i18n

* fix build

* feat: markdown extension (#3663)

* feat: markdown extension

* media cros

* rerank test

* default price

* perf: default model

* fix: cannot custom provider

* fix: default model select

* update bg

* perf: default model selector

* fix: usage export

* i18n

* fix: rerank

* update init extension

* perf: ip limit check

* doubao model order

* web default modle

* perf: tts selector

* perf: tts error

* qrcode package

* reload buffer (#3665)

* reload buffer

* reload buffer

* tts selector

* fix: err tip (#3666)

* fix: err tip

* perf: training queue

* doc

* fix interactive edge (#3659)

* fix interactive edge

* fix

* comment

* add gemini model

* fix: chat model select

* perf: supplement assistant empty response (#3669)

* perf: supplement assistant empty response

* check array

* perf: max_token count;feat: support resoner output;fix: member scroll (#3681)

* perf: supplement assistant empty response

* check array

* perf: max_token count

* feat: support resoner output

* member scroll

* update provider order

* i18n

* fix: stream response (#3682)

* perf: supplement assistant empty response

* check array

* fix: stream response

* fix: model config cannot set to null

* fix: reasoning response (#3684)

* perf: supplement assistant empty response

* check array

* fix: reasoning response

* fix: reasoning response

* doc (#3685)

* perf: supplement assistant empty response

* check array

* doc

* lock

* animation

* update doc

* update compose

* doc

* doc

---------

Co-authored-by: heheer <heheer@sealos.io>
Co-authored-by: a.e. <49438478+I-Info@users.noreply.github.com>
2025-02-05 00:10:47 +08:00

145 lines
3.6 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

---
title: '接入 bge-rerank 重排模型'
description: '接入 bge-rerank 重排模型'
icon: 'sort'
draft: false
toc: true
weight: 920
---
## 不同模型推荐配置
推荐配置如下:
{{< table "table-hover table-striped-columns" >}}
| 模型名 | 内存 | 显存 | 硬盘空间 | 启动命令 |
|------|---------|---------|----------|--------------------------|
| bge-reranker-base | >=4GB | >=4GB | >=8GB | python app.py |
| bge-reranker-large | >=8GB | >=8GB | >=8GB | python app.py |
| bge-reranker-v2-m3 | >=8GB | >=8GB | >=8GB | python app.py |
{{< /table >}}
## 源码部署
### 1. 安装环境
- Python 3.9, 3.10
- CUDA 11.7
- 科学上网环境
### 2. 下载代码
3 个模型代码分别为:
1. [https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-base](https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-base)
2. [https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-large](https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-large)
3. [https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-v2-m3](https://github.com/labring/FastGPT/tree/main/python/bge-rerank/bge-reranker-v2-m3)
### 3. 安装依赖
```sh
pip install -r requirements.txt
```
### 4. 下载模型
3个模型的 huggingface 仓库地址如下:
1. [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)
2. [https://huggingface.co/BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)
3. [https://huggingface.co/BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
在对应代码目录下 clone 模型。目录结构:
```
bge-reranker-base/
app.py
Dockerfile
requirements.txt
```
### 5. 运行代码
```bash
python app.py
```
启动成功后应该会显示如下地址:
![](/imgs/rerank1.png)
> 这里的 `http://0.0.0.0:6006` 就是连接地址。
## docker 部署
**镜像名分别为:**
1. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1 (4 GB+)
2. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-large:v0.1 (5 GB+)
3. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1 (5 GB+)
**端口**
6006
**环境变量**
```
ACCESS_TOKEN=访问安全凭证请求时Authorization: Bearer ${ACCESS_TOKEN}
```
**运行命令示例**
```sh
# auth token 为mytoken
docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
```
**docker-compose.yml示例**
```
version: "3"
services:
reranker:
image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
container_name: reranker
# GPU运行环境如果宿主机未安装将deploy配置隐藏即可
deploy:
resources:
reservations:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
ports:
- 6006:6006
environment:
- ACCESS_TOKEN=mytoken
```
## 接入 FastGPT
1. 打开 FastGPT 模型配置,新增一个重排模型。
2. 填写模型配置表单:模型 ID 为`bge-reranker-base`,地址填写`{{host}}/v1/rerank`host 为你部署的域名/IP:Port。
![alt text](/imgs/image-102.png)
## QA
### 403报错
FastGPT中自定义请求 Token 和环境变量的 ACCESS_TOKEN 不一致。
### Docker 运行提示 `Bus error (core dumped)`
尝试增加 `docker-compose.yml` 配置项 `shm_size` ,以增加容器中的共享内存目录大小。
```
...
services:
reranker:
...
container_name: reranker
shm_size: '2gb'
...
```