4.6.7-alpha commit (#743)

Co-authored-by: Archer <545436317@qq.com> Co-authored-by: heheer <71265218+newfish-cmyk@users.noreply.github.com>
2025-10-19 01:54:04 +00:00 · 2024-01-19 11:17:28 +08:00
parent 8ee7407c4c
commit c031e6dcc9
324 changed files with 8509 additions and 4757 deletions
--- a/docSite/content/docs/use-cases/datasetEngine.md
+++ b/docSite/content/docs/use-cases/datasetEngine.md
@@ -25,7 +25,9 @@ FastGPT 采用了 RAG 中的 Embedding 方案构建知识库，要使用好 Fast

 FastGPT 采用了 `PostgresSQL` 的 `PG Vector` 插件作为向量检索器，索引为`HNSW`。且`PostgresSQL`仅用于向量检索，`MongoDB`用于其他数据的存取。

-在`PostgresSQL`的表中，设置一个 `index` 字段用于存储向量，以及一个`data_id`用于在`MongoDB`中寻找对应的映射值。多个`index`可以对应一组`data_id`，也就是说，一组向量可以对应多组数据。在进行检索时，相同数据会进行合并。
+在`MongoDB`的`dataset.datas`表中，会存储向量原数据的信息，同时有一个`indexes`字段，会记录其对应的向量ID，这是一个数组，也就是说，一组向量可以对应多组数据。
+
+在`PostgresSQL`的表中，设置一个 `index` 字段用于存储向量。在检索时，会先召回向量，再根据向量的ID，去`MongoDB`中寻找原数据内容，如果对应了同一组原数据，则进行合并，向量得分取最高得分。

 ![](/imgs/datasetSetting1.png)