Mirror of https://github.com/labring/FastGPT.git, synced 2025-07-24 05:23:57 +00:00
4.7.1-alpha2 (#1153)
Co-authored-by: UUUUnotfound <31206589+UUUUnotfound@users.noreply.github.com> Co-authored-by: Hexiao Zhang <731931282qq@gmail.com> Co-authored-by: heheer <71265218+newfish-cmyk@users.noreply.github.com>
114
python/bge-rerank/README.md
Normal file
@@ -0,0 +1,114 @@
# Integrating the bge-rerank rerank models

## Recommended configuration per model

Recommended configurations:

| Model name       | RAM   | VRAM  | Disk  | Start command |
| ---------------- | ----- | ----- | ----- | ------------- |
| bge-rerank-base  | >=4GB | >=4GB | >=8GB | python app.py |
| bge-rerank-large | >=8GB | >=8GB | >=8GB | python app.py |
| bge-rerank-v2-m3 | >=8GB | >=8GB | >=8GB | python app.py |

## Deploying from source

### 1. Set up the environment

- Python 3.9 or 3.10
- CUDA 11.7
- Network access to GitHub and Hugging Face (a proxy may be required)

### 2. Download the code

The code for the three models lives at:

1. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-base](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-base)
2. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-large](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-large)
3. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-rerank-v2-m3](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-rerank-v2-m3)

### 3. Install dependencies

```sh
pip install -r requirements.txt
```

### 4. Download the models

The Hugging Face repositories for the three models:

1. [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)
2. [https://huggingface.co/BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)
3. [https://huggingface.co/BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)

Clone each model into its corresponding code directory. Directory layout:

```
bge-reranker-base/
    bge-reranker-base/   # model weights cloned from Hugging Face
    app.py
    Dockerfile
    requirements.txt
```

### 5. Run the code

```bash
python app.py
```

After a successful start it prints an address like the following:

![rerank start log](rerank1.png)

> `http://0.0.0.0:6006` here is the request address.
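Once the service is up, you can exercise the `/v1/rerank` endpoint directly. The following is a minimal sketch using only the Python standard library; the query, documents, and `mytoken` values are made-up examples, and the request shape follows the `QADocs` model defined in `app.py`:

```python
import json
import urllib.request

# Hypothetical query/documents; the field names match the QADocs model in app.py.
payload = {
    "query": "what is a rerank model",
    "documents": [
        "bge-reranker scores query-document pairs.",
        "FastGPT is a knowledge-base QA platform.",
    ],
}
req = urllib.request.Request(
    "http://0.0.0.0:6006/v1/rerank",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer mytoken",  # must match your ACCESS_TOKEN
    },
)
# Uncomment with the service running; the response body looks like
# {"results": [{"index": ..., "relevance_score": ...}, ...]}
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```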
## Docker deployment

**Image names:**

1. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
2. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-large:v0.1
3. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1

**Port**

6006

**Environment variables**

```
ACCESS_TOKEN=<access credential; requests must send the header Authorization: Bearer ${ACCESS_TOKEN}>
```

**Example run command**

```sh
# auth token is "mytoken"
docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
```
**Example docker-compose.yml**

```yml
version: "3"
services:
  reranker:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
    container_name: reranker
    # GPU runtime; if the host has no GPU environment installed, just remove the deploy section
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 6006:6006
    environment:
      - ACCESS_TOKEN=mytoken
```
## Connecting to FastGPT

See [ReRank model integration](https://doc.fastai.site/docs/development/configuration/#rerank-接入).
@@ -17,20 +17,9 @@ from FlagEmbedding import FlagReranker
 from pydantic import Field, BaseModel, validator
 from typing import Optional, List
 
-def response(code, msg, data=None):
-    time = str(datetime.datetime.now())
-    if data is None:
-        data = []
-    result = {
-        "code": code,
-        "message": msg,
-        "data": data,
-        "time": time
-    }
-    return result
-
-def success(data=None, msg=''):
-    return
+app = FastAPI()
+security = HTTPBearer()
+env_bearer_token = 'ACCESS_TOKEN'
 
 class QADocs(BaseModel):
     query: Optional[str]
@@ -46,42 +35,35 @@ class Singleton(type):
 
 RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-base")
 
-class Reranker(metaclass=Singleton):
+class ReRanker(metaclass=Singleton):
     def __init__(self, model_path):
-        self.reranker = FlagReranker(model_path,
-                                     use_fp16=False)
+        self.reranker = FlagReranker(model_path, use_fp16=False)
 
     def compute_score(self, pairs: List[List[str]]):
         if len(pairs) > 0:
-            result = self.reranker.compute_score(pairs)
+            result = self.reranker.compute_score(pairs, normalize=True)
             if isinstance(result, float):
                 result = [result]
             return result
         else:
             return None
 
 
 class Chat(object):
     def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
-        self.reranker = Reranker(rerank_model_path)
+        self.reranker = ReRanker(rerank_model_path)
 
     def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
         if query_docs is None or len(query_docs.documents) == 0:
             return []
-        new_docs = []
-        pair = []
-        for answer in query_docs.documents:
-            pair.append([query_docs.query, answer])
-        scores = self.reranker.compute_score(pair)
-        for index, score in enumerate(scores):
-            new_docs.append({"index": index, "text": query_docs.documents[index], "score": 1 / (1 + np.exp(-score))})
-        #results = [{"document": {"text": documents["text"]}, "index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
-        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
-        return {"results": results}
-
-app = FastAPI()
-security = HTTPBearer()
-env_bearer_token = 'ACCESS_TOKEN'
+
+        pair = [[query_docs.query, doc] for doc in query_docs.documents]
+        scores = self.reranker.compute_score(pair)
+
+        new_docs = []
+        for index, score in enumerate(scores):
+            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
+        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
+        return results
 
 @app.post('/v1/rerank')
 async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
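The dropped `1 / (1 + np.exp(-score))` line and the new `normalize=True` argument are two routes to the same thing: squashing raw cross-encoder scores into (0, 1) with a sigmoid, which `FlagReranker.compute_score` applies internally when `normalize=True` is set. A standalone sketch with made-up scores:

```python
import math

def sigmoid(x: float) -> float:
    # Equivalent to the removed NumPy expression: 1 / (1 + np.exp(-score))
    return 1.0 / (1.0 + math.exp(-x))

raw_scores = [-4.2, 0.0, 3.1]  # hypothetical raw cross-encoder outputs
normalized = [sigmoid(s) for s in raw_scores]

# Sigmoid is monotonic, so normalization never changes the ranking order.
assert normalized == sorted(normalized)
assert all(0.0 < s < 1.0 for s in normalized)
```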
@@ -89,8 +71,12 @@ async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCreden
     if env_bearer_token is not None and token != env_bearer_token:
         raise HTTPException(status_code=401, detail="Invalid token")
     chat = Chat()
-    qa_docs_with_rerank = chat.fit_query_answer_rerank(docs)
-    return response(200, msg="重排成功", data=qa_docs_with_rerank)
+    try:
+        results = chat.fit_query_answer_rerank(docs)
+        return {"results": results}
+    except Exception as e:
+        print(f"报错:\n{e}")
+        return {"error": "重排出错"}
 
 if __name__ == "__main__":
     token = os.getenv("ACCESS_TOKEN")
@@ -1,6 +1,6 @@
 fastapi==0.104.1
 transformers[sentencepiece]
-FlagEmbedding==1.1.5
+FlagEmbedding==1.2.8
 pydantic==1.10.13
 uvicorn==0.17.6
 itsdangerous
12
python/bge-rerank/bge-reranker-large/Dockerfile
Normal file
@@ -0,0 +1,12 @@
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# please download the model from https://huggingface.co/BAAI/bge-reranker-large and put it in the same directory as Dockerfile
COPY ./bge-reranker-large ./bge-reranker-large

COPY requirements.txt .

RUN python3 -m pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

COPY app.py Dockerfile .

ENTRYPOINT python3 app.py
88
python/bge-rerank/bge-reranker-large/app.py
Normal file
@@ -0,0 +1,88 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Time: 2023/11/7 22:45
@Author: zhidong
@File: reranker.py
@Desc:
"""
import os
import numpy as np
import logging
import uvicorn
import datetime
from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'


class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]


class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance
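The `Singleton` metaclass above makes instantiation idempotent: the first call builds and caches the instance on the class, and every later call returns that same object, so each worker holds exactly one copy of the loaded model. A minimal standalone demonstration of the pattern (the `Model` class here is a stand-in for `ReRanker`):

```python
class Singleton(type):
    def __call__(cls, *args, **kwargs):
        # First call: build and cache the instance; later calls: return the cache.
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Model(metaclass=Singleton):
    def __init__(self, path):
        self.path = path  # an expensive model load would happen here, once


a = Model("bge-reranker-large")
b = Model("some-other-path")  # __init__ never runs again; arguments are ignored

print(a is b)    # True
print(b.path)    # bge-reranker-large
```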

RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-large")


class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None


class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        if query_docs is None or len(query_docs.documents) == 0:
            return []

        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)

        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results
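`fit_query_answer_rerank` tags every document with its position in the request before sorting, so the caller can map each `relevance_score` back to the original document via `index`. The shaping logic in isolation, with made-up scores standing in for `compute_score` output:

```python
documents = ["doc a", "doc b", "doc c"]
scores = [0.2, 0.9, 0.5]  # hypothetical normalized scores, one per document

# Remember each document's original position, then sort by score descending.
new_docs = [{"index": i, "text": documents[i], "score": s} for i, s in enumerate(scores)]
results = [{"index": d["index"], "relevance_score": d["score"]}
           for d in sorted(new_docs, key=lambda x: x["score"], reverse=True)]

print(results)
# [{'index': 1, 'relevance_score': 0.9}, {'index': 2, 'relevance_score': 0.5}, {'index': 0, 'relevance_score': 0.2}]
```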
@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"报错:\n{e}")
        return {"error": "重排出错"}


if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=6006)
    except Exception as e:
        print(f"API启动失败!\n报错:\n{e}")
7
python/bge-rerank/bge-reranker-large/requirements.txt
Normal file
@@ -0,0 +1,7 @@
fastapi==0.104.1
transformers[sentencepiece]
FlagEmbedding==1.2.8
pydantic==1.10.13
uvicorn==0.17.6
itsdangerous
protobuf
12
python/bge-rerank/bge-reranker-v2-m3/Dockerfile
Normal file
@@ -0,0 +1,12 @@
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime

# please download the model from https://huggingface.co/BAAI/bge-reranker-v2-m3 and put it in the same directory as Dockerfile
COPY ./bge-reranker-v2-m3 ./bge-reranker-v2-m3

COPY requirements.txt .

RUN python3 -m pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple

COPY app.py Dockerfile .

ENTRYPOINT python3 app.py
88
python/bge-rerank/bge-reranker-v2-m3/app.py
Normal file
@@ -0,0 +1,88 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Time: 2023/11/7 22:45
@Author: zhidong
@File: reranker.py
@Desc:
"""
import os
import numpy as np
import logging
import uvicorn
import datetime
from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'


class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]


class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-v2-m3")


class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None


class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        if query_docs is None or len(query_docs.documents) == 0:
            return []

        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)

        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results


@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"报错:\n{e}")
        return {"error": "重排出错"}


if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=6006)
    except Exception as e:
        print(f"API启动失败!\n报错:\n{e}")
7
python/bge-rerank/bge-reranker-v2-m3/requirements.txt
Normal file
@@ -0,0 +1,7 @@
fastapi==0.104.1
transformers[sentencepiece]
FlagEmbedding==1.2.8
pydantic==1.10.13
uvicorn==0.17.6
itsdangerous
protobuf
BIN
python/bge-rerank/rerank1.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 91 KiB |
@@ -1,48 +0,0 @@

## Recommended configuration

Recommended configuration:

{{< table "table-hover table-striped-columns" >}}
| Type | RAM   | VRAM  | Disk  | Start command |
|------|-------|-------|-------|---------------|
| base | >=4GB | >=3GB | >=8GB | python app.py |
{{< /table >}}

## Deployment

### Environment requirements

- Python 3.10.11
- CUDA 11.7
- Network access to GitHub and Hugging Face (a proxy may be required)

### Deploying from source

1. Set up the environment per the requirements above; ask GPT for a walkthrough if needed;
2. Download the [python file](app.py);
3. Run `pip install -r requirements.txt`;
4. Download the model repository from [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) into the same directory as app.py;
5. Set the environment variable `export ACCESS_TOKEN=XXXXXX` to configure the token. The token is only an extra layer of authentication to keep the endpoint from being abused; it defaults to `ACCESS_TOKEN`;
6. Run `python app.py`.

Then wait for the model to download and finish loading. If an error occurs, ask GPT first.

After a successful start it prints an address like the following:



> `http://0.0.0.0:6006` here is the connection address.

### Docker deployment

**Image and port**

+ Image name: `registry.cn-hangzhou.aliyuncs.com/fastgpt/rerank:v0.2`
+ Port: 6006

```
# Set the access credential (i.e. the channel key in oneapi)
# It is passed in via the ACCESS_TOKEN environment variable; default value: ACCESS_TOKEN.
# See any Docker tutorial for passing environment variables; not repeated here.
```