4.7.1-alpha2 (#1153)

Co-authored-by: UUUUnotfound <31206589+UUUUnotfound@users.noreply.github.com>
Co-authored-by: Hexiao Zhang <731931282qq@gmail.com>
Co-authored-by: heheer <71265218+newfish-cmyk@users.noreply.github.com>
Authored by Archer on 2024-04-08 21:17:33 +08:00; committed by GitHub.
parent 3b0b2d68cc
commit 1fbc407ecf
84 changed files with 1773 additions and 715 deletions

python/bge-rerank/README.md — new file, 114 lines

@@ -0,0 +1,114 @@
# Integrating the bge-rerank Reranking Model
## Recommended Configurations per Model
The recommended configurations are:
| Model | RAM | VRAM | Disk space | Launch command |
| ---------------- | ----- | ----- | ---------- | ------------- |
| bge-rerank-base | >=4GB | >=4GB | >=8GB | python app.py |
| bge-rerank-large | >=8GB | >=8GB | >=8GB | python app.py |
| bge-rerank-v2-m3 | >=8GB | >=8GB | >=8GB | python app.py |
## Deploying from Source
### 1. Set Up the Environment
- Python 3.9 or 3.10
- CUDA 11.7
- Unrestricted internet access (a proxy may be needed to reach GitHub and Hugging Face in some regions)
### 2. Download the Code
The code for the three models:
1. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-base](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-base)
2. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-large](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-reranker-large)
3. [https://github.com/labring/FastGPT/tree/main/python/reranker/bge-rerank-v2-m3](https://github.com/labring/FastGPT/tree/main/python/reranker/bge-rerank-v2-m3)
### 3. Install Dependencies
```sh
pip install -r requirements.txt
```
### 4. Download the Model
The Hugging Face repositories for the three models:
1. [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base)
2. [https://huggingface.co/BAAI/bge-reranker-large](https://huggingface.co/BAAI/bge-reranker-large)
3. [https://huggingface.co/BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3)
Clone the model into the corresponding code directory. The resulting layout:
```
bge-reranker-base/
app.py
Dockerfile
requirements.txt
```
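One way to fetch the weights, assuming `git-lfs` is installed, is to clone the Hugging Face repository inside the code directory (shown here for `bge-reranker-base`; substitute the repo for your model):

```shell
# run inside the bge-reranker-base code directory
git lfs install
git clone https://huggingface.co/BAAI/bge-reranker-base
# the model folder now sits next to app.py, as in the layout above
```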
### 5. Run the Code
```bash
python app.py
```
After a successful start, the service address should be printed:
![](./rerank1.png)
> The `http://0.0.0.0:6006` shown here is the request address.
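Once the service is up, it can be called over HTTP. Below is a minimal client sketch (the `build_rerank_request` helper, base URL, and sample texts are illustrative; the payload shape follows the `QADocs` model and `/v1/rerank` route defined in `app.py`):

```python
import json
import urllib.request

def build_rerank_request(base_url, token, query, documents):
    """Build the authenticated POST request for the /v1/rerank endpoint."""
    payload = json.dumps({"query": query, "documents": documents}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/rerank",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,  # must match ACCESS_TOKEN
        },
        method="POST",
    )

req = build_rerank_request(
    "http://127.0.0.1:6006", "mytoken",
    query="What is FastGPT?",
    documents=["FastGPT is a knowledge-base QA platform.", "Bananas are yellow."],
)
# Uncomment once the service is running; the response has the shape
# {"results": [{"index": ..., "relevance_score": ...}, ...]}
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```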
## Docker Deployment
**Image names:**
1. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
2. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-large:v0.1
3. registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-v2-m3:v0.1
**Port**
6006
**Environment variables**
```
ACCESS_TOKEN=<access token; requests must carry the header "Authorization: Bearer ${ACCESS_TOKEN}">
```
**Example run command**
```sh
# the auth token is mytoken
docker run -d --name reranker -p 6006:6006 -e ACCESS_TOKEN=mytoken --gpus all registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
```
**Example docker-compose.yml**
```
version: "3"
services:
  reranker:
    image: registry.cn-hangzhou.aliyuncs.com/fastgpt/bge-rerank-base:v0.1
    container_name: reranker
    # GPU runtime; if no GPU driver is installed on the host, remove the deploy section
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    ports:
      - 6006:6006
    environment:
      - ACCESS_TOKEN=mytoken
```
## Integrating with FastGPT
See [ReRank model integration](https://doc.fastai.site/docs/development/configuration/#rerank-接入)


@@ -17,20 +17,9 @@ from FlagEmbedding import FlagReranker
 from pydantic import Field, BaseModel, validator
 from typing import Optional, List
-def response(code, msg, data=None):
-    time = str(datetime.datetime.now())
-    if data is None:
-        data = []
-    result = {
-        "code": code,
-        "message": msg,
-        "data": data,
-        "time": time
-    }
-    return result
-def success(data=None, msg=''):
-    return
+app = FastAPI()
+security = HTTPBearer()
+env_bearer_token = 'ACCESS_TOKEN'
 class QADocs(BaseModel):
     query: Optional[str]
@@ -46,42 +35,35 @@ class Singleton(type):
 RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-base")
-class Reranker(metaclass=Singleton):
+class ReRanker(metaclass=Singleton):
     def __init__(self, model_path):
-        self.reranker = FlagReranker(model_path,
-                                     use_fp16=False)
+        self.reranker = FlagReranker(model_path, use_fp16=False)
     def compute_score(self, pairs: List[List[str]]):
         if len(pairs) > 0:
-            result = self.reranker.compute_score(pairs)
+            result = self.reranker.compute_score(pairs, normalize=True)
             if isinstance(result, float):
                 result = [result]
             return result
         else:
             return None
 class Chat(object):
     def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
-        self.reranker = Reranker(rerank_model_path)
+        self.reranker = ReRanker(rerank_model_path)
     def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
         if query_docs is None or len(query_docs.documents) == 0:
             return []
-        new_docs = []
-        pair = []
-        for answer in query_docs.documents:
-            pair.append([query_docs.query, answer])
-        scores = self.reranker.compute_score(pair)
-        for index, score in enumerate(scores):
-            new_docs.append({"index": index, "text": query_docs.documents[index], "score": 1 / (1 + np.exp(-score))})
-        #results = [{"document": {"text": documents["text"]}, "index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
-        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
-        return {"results": results}
-app = FastAPI()
-security = HTTPBearer()
-env_bearer_token = 'ACCESS_TOKEN'
+        pair = [[query_docs.query, doc] for doc in query_docs.documents]
+        scores = self.reranker.compute_score(pair)
+        new_docs = []
+        for index, score in enumerate(scores):
+            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
+        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
+        return results
 @app.post('/v1/rerank')
 async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
@@ -89,8 +71,12 @@ async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCreden
     if env_bearer_token is not None and token != env_bearer_token:
         raise HTTPException(status_code=401, detail="Invalid token")
     chat = Chat()
-    qa_docs_with_rerank = chat.fit_query_answer_rerank(docs)
-    return response(200, msg="rerank succeeded", data=qa_docs_with_rerank)
+    try:
+        results = chat.fit_query_answer_rerank(docs)
+        return {"results": results}
+    except Exception as e:
+        print(f"Error:\n{e}")
+        return {"error": "rerank error"}
 if __name__ == "__main__":
     token = os.getenv("ACCESS_TOKEN")
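The change above drops the hand-rolled `1 / (1 + np.exp(-score))` in favor of FlagEmbedding's `normalize=True` flag, which likewise squashes raw reranker logits into (0, 1) with a sigmoid. A standalone sketch of that mapping (pure Python, no model required; the sample logits are hypothetical):

```python
import math

def sigmoid(x: float) -> float:
    """Map a raw reranker logit into (0, 1), as the removed code did manually
    and as compute_score(..., normalize=True) does internally."""
    return 1.0 / (1.0 + math.exp(-x))

raw_scores = [2.3, -1.7, 0.0]  # hypothetical reranker logits
normalized = [sigmoid(s) for s in raw_scores]
print(normalized)  # values in (0, 1); the sigmoid is monotonic, so ranking order is unchanged
```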


@@ -1,6 +1,6 @@
 fastapi==0.104.1
 transformers[sentencepiece]
-FlagEmbedding==1.1.5
+FlagEmbedding==1.2.8
 pydantic==1.10.13
 uvicorn==0.17.6
 itsdangerous


@@ -0,0 +1,12 @@
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
# please download the model from https://huggingface.co/BAAI/bge-reranker-large and put it in the same directory as Dockerfile
COPY ./bge-reranker-large ./bge-reranker-large
COPY requirements.txt .
RUN python3 -m pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
COPY app.py Dockerfile .
ENTRYPOINT python3 app.py
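With the model cloned next to the Dockerfile as its comment requires, the image can be built and started roughly as follows (the local tag `bge-reranker-large:local` is illustrative; `ACCESS_TOKEN` and the port match the README above):

```shell
docker build -t bge-reranker-large:local .
docker run -d --name reranker -p 6006:6006 \
  -e ACCESS_TOKEN=mytoken --gpus all bge-reranker-large:local
```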


@@ -0,0 +1,88 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Time: 2023/11/7 22:45
@Author: zhidong
@File: reranker.py
@Desc:
"""
import os
import numpy as np
import logging
import uvicorn
import datetime

from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'

class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]

class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance

RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-large")

class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None

class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        if query_docs is None or len(query_docs.documents) == 0:
            return []
        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)
        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results

@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"Error:\n{e}")
        return {"error": "rerank error"}

if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=6006)
    except Exception as e:
        print(f"API failed to start\nError:\n{e}")
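The result-shaping step of `fit_query_answer_rerank` can be exercised without loading a model. The sketch below (the `rank` helper and sample scores are illustrative) reproduces its pair-index-with-score-then-sort-descending logic:

```python
from typing import List

def rank(documents: List[str], scores: List[float]) -> List[dict]:
    """Mirror fit_query_answer_rerank's shaping: attach each document's
    index and score, then sort by score descending."""
    new_docs = [{"index": i, "text": documents[i], "score": s}
                for i, s in enumerate(scores)]
    return [{"index": d["index"], "relevance_score": d["score"]}
            for d in sorted(new_docs, key=lambda x: x["score"], reverse=True)]

results = rank(["doc a", "doc b", "doc c"], [0.12, 0.93, 0.40])
print(results)  # highest-scoring document first
```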


@@ -0,0 +1,7 @@
fastapi==0.104.1
transformers[sentencepiece]
FlagEmbedding==1.2.8
pydantic==1.10.13
uvicorn==0.17.6
itsdangerous
protobuf


@@ -0,0 +1,12 @@
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
# please download the model from https://huggingface.co/BAAI/bge-reranker-v2-m3 and put it in the same directory as Dockerfile
COPY ./bge-reranker-v2-m3 ./bge-reranker-v2-m3
COPY requirements.txt .
RUN python3 -m pip install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple
COPY app.py Dockerfile .
ENTRYPOINT python3 app.py


@@ -0,0 +1,88 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Time: 2023/11/7 22:45
@Author: zhidong
@File: reranker.py
@Desc:
"""
import os
import numpy as np
import logging
import uvicorn
import datetime

from fastapi import FastAPI, Security, HTTPException
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from FlagEmbedding import FlagReranker
from pydantic import Field, BaseModel, validator
from typing import Optional, List

app = FastAPI()
security = HTTPBearer()
env_bearer_token = 'ACCESS_TOKEN'

class QADocs(BaseModel):
    query: Optional[str]
    documents: Optional[List[str]]

class Singleton(type):
    def __call__(cls, *args, **kwargs):
        if not hasattr(cls, '_instance'):
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance

RERANK_MODEL_PATH = os.path.join(os.path.dirname(__file__), "bge-reranker-v2-m3")

class ReRanker(metaclass=Singleton):
    def __init__(self, model_path):
        self.reranker = FlagReranker(model_path, use_fp16=False)

    def compute_score(self, pairs: List[List[str]]):
        if len(pairs) > 0:
            result = self.reranker.compute_score(pairs, normalize=True)
            if isinstance(result, float):
                result = [result]
            return result
        else:
            return None

class Chat(object):
    def __init__(self, rerank_model_path: str = RERANK_MODEL_PATH):
        self.reranker = ReRanker(rerank_model_path)

    def fit_query_answer_rerank(self, query_docs: QADocs) -> List:
        if query_docs is None or len(query_docs.documents) == 0:
            return []
        pair = [[query_docs.query, doc] for doc in query_docs.documents]
        scores = self.reranker.compute_score(pair)
        new_docs = []
        for index, score in enumerate(scores):
            new_docs.append({"index": index, "text": query_docs.documents[index], "score": score})
        results = [{"index": documents["index"], "relevance_score": documents["score"]} for documents in list(sorted(new_docs, key=lambda x: x["score"], reverse=True))]
        return results

@app.post('/v1/rerank')
async def handle_post_request(docs: QADocs, credentials: HTTPAuthorizationCredentials = Security(security)):
    token = credentials.credentials
    if env_bearer_token is not None and token != env_bearer_token:
        raise HTTPException(status_code=401, detail="Invalid token")
    chat = Chat()
    try:
        results = chat.fit_query_answer_rerank(docs)
        return {"results": results}
    except Exception as e:
        print(f"Error:\n{e}")
        return {"error": "rerank error"}

if __name__ == "__main__":
    token = os.getenv("ACCESS_TOKEN")
    if token is not None:
        env_bearer_token = token
    try:
        uvicorn.run(app, host='0.0.0.0', port=6006)
    except Exception as e:
        print(f"API failed to start\nError:\n{e}")


@@ -0,0 +1,7 @@
fastapi==0.104.1
transformers[sentencepiece]
FlagEmbedding==1.2.8
pydantic==1.10.13
uvicorn==0.17.6
itsdangerous
protobuf

(New binary image file, 91 KiB; not shown.)


@@ -1,48 +0,0 @@
## Recommended Configuration
The recommended configuration is:
{{< table "table-hover table-striped-columns" >}}
| Type | RAM | VRAM | Disk space | Launch command |
|------|---------|---------|------------|--------------------------|
| base | >=4GB | >=3GB | >=8GB | python app.py |
{{< /table >}}
## Deployment
### Requirements
- Python 3.10.11
- CUDA 11.7
- Unrestricted internet access (a proxy may be needed in some regions)
### Deploying from source
1. Set up the environment described above (ask GPT for detailed steps if needed)
2. Download the [python file](app.py)
3. Run `pip install -r requirements.txt`
4. Download the model repository from [https://huggingface.co/BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) into the same directory as app.py
5. Set the token with `export ACCESS_TOKEN=XXXXXX`; the token is just an extra layer of authentication to keep the API from being abused. The default value is `ACCESS_TOKEN`
6. Run `python app.py`
Then wait until the model has downloaded and finished loading. If an error appears, ask GPT first.
After a successful start the service address should be printed:
![](/imgs/chatglm2.png)
> The `http://0.0.0.0:6006` shown here is the connection address.
### Docker deployment
**Image and port**
+ Image: `registry.cn-hangzhou.aliyuncs.com/fastgpt/rerank:v0.2`
+ Port: 6006
```
# Set the access token (i.e. the channel key in oneapi)
# via the ACCESS_TOKEN environment variable; default value: ACCESS_TOKEN.
# See any Docker tutorial for passing environment variables; not repeated here.
```