Files
FastGPT/docSite/content/zh-cn/docs/guide/knowledge_base/websync.md
Archer e9d52ada73 4.8.13 feature (#3118)
* chore(ui): login page & workflow page (#3046)

* login page & number input & multirow select & llm select

* workflow

* adjust nodes

* New file upload (#3058)

* feat: toolNode aiNode readFileNode adapt new version

* update docker-compose

* update tip

* feat: adapt new file version

* perf: file input

* fix: ts

* feat: add chat history time label (#3024)

* feat:add chat and logs time

* feat: add chat history time label

* code perf

* code perf

---------

Co-authored-by: 勤劳上班的卑微小张 <jiazhan.zhang@ggimage.com>

* add chatType (#3060)

* pref: slow query of full text search (#3044)

* Adapt findLast api;perf: markdown zh format. (#3066)

* perf: context code

* fix: adapt findLast api

* perf: commercial plugin run error

* perf: markdown zh format

* perf: dockerfile proxy (#3067)

* fix ui (#3065)

* fix ui

* fix

* feat: support array reference multi-select (#3041)

* feat: support array reference multi-select

* fix build

* fix

* fix loop multi-select

* adjust condition

* fix get value

* array and non-array conversion

* fix plugin input

* merge func

* feat: iframe code block;perf: workflow selector type (#3076)

* feat: iframe code block

* perf: workflow selector type

* node pluginoutput check (#3074)

* feat: View will move when workflow check error;fix: ui refresh error when continuous file upload (#3077)

* fix: plugin output check

* fix: ui refresh error when continuous file upload

* feat: View will move when workflow check error

* add dispatch try catch (#3075)

* perf: workflow context split (#3083)

* perf: workflow context split

* perf: context

* 4.8.13 test (#3085)

* perf: workflow node ui

* chat iframe url

* feat: support sub route config (#3071)

* feat: support sub route config

* dockerfile

* fix upload

* delete unused code

* 4.8.13 test (#3087)

* fix: image expired

* fix: datacard navbar ui

* perf: build action

* fix: workflow file upload refresh (#3088)

* fix: http tool response (#3097)

* loop node dynamic height (#3092)

* loop node dynamic height

* fix

* fix

* feat: support push chat log (#3093)

* feat: custom uid/metadata

* to: custom info

* fix: chat push latest

* feat: add chat log envs

* refactor: move timer to pushChatLog

* fix: using precise log

---------

Co-authored-by: Finley Ge <m13203533462@163.com>

* 4.8.13 test (#3098)

* perf: loop node refresh

* rename context

* comment

* fix: ts

* perf: push chat log

* array reference check & node ui (#3100)

* feat: loop start add index (#3101)

* feat: loop start add index

* update doc

* 4.8.13 test (#3102)

* fix: loop index;edge parent check

* perf: reference invalid check

* fix: ts

* fix: plugin select files and ai response check (#3104)

* fix: plugin select files and ai response check

* perf: text editor selector;tool call tip;remove invalid image url;

* perf: select file

* perf: drop files

* feat: source id prefix env (#3103)

* 4.8.13 test (#3106)

* perf: select file

* perf: drop files

* perf: env template

* 4.8.13 test (#3107)

* perf: select file

* perf: drop files

* fix: imple mode adapt files

* perf: push chat log (#3109)

* fix: share page load title error (#3111)

* 4.8.13 perf (#3112)

* fix: share page load title error

* update file input doc

* perf: auto add file urls

* perf: auto ser loop node offset height

* 4.8.13 test (#3117)

* perf: plugin

* updat eaction

* feat: add more share config (#3120)

* feat: add more share config

* add i18n en

* fix: missing subroute (#3121)

* perf: outlink config (#3128)

* update action

* perf: outlink config

* fix: ts (#3129)

* 更新 docSite 文档内容 (#3131)

* fix: null pointer (#3130)

* fix: null pointer

* perf: not input text

* update doc url

* perf: outlink default value (#3134)

* update doc (#3136)

* 4.8.13 test (#3137)

* update doc

* perf: completions chat api

* Restore docSite content based on upstream/4.8.13-dev (#3138)

* Restore docSite content based on upstream/4.8.13-dev

* 4813.md缺少更正

* update doc (#3141)

---------

Co-authored-by: heheer <heheer@sealos.io>
Co-authored-by: papapatrick <109422393+Patrickill@users.noreply.github.com>
Co-authored-by: 勤劳上班的卑微小张 <jiazhan.zhang@ggimage.com>
Co-authored-by: Finley Ge <32237950+FinleyGe@users.noreply.github.com>
Co-authored-by: a.e. <49438478+I-Info@users.noreply.github.com>
Co-authored-by: Finley Ge <m13203533462@163.com>
Co-authored-by: Jiangween <145003935+Jiangween@users.noreply.github.com>
2024-11-13 11:29:53 +08:00

2.7 KiB
Raw Blame History

title, description, icon, draft, toc, weight
title description icon draft toc weight
Web 站点同步 FastGPT Web 站点同步功能介绍和使用方式 language false true 406

该功能目前仅向商业版用户开放。

什么是 Web 站点同步

Web 站点同步利用爬虫的技术,可以通过一个入口网站,自动捕获同域名下的所有网站,目前最多支持200个子页面。出于合规与安全角度FastGPT 仅支持静态站点的爬取,主要用于各个文档站点快速构建知识库。

Tips: 国内的媒体站点基本不可用公众号、csdn、知乎等。可以通过终端发送curl请求检测是否为静态站点,例如:

curl https://doc.tryfastgpt.ai/docs/intro/

如何使用

1. 新建知识库,选择 Web 站点同步

2. 点击配置站点信息

3. 填写网址和选择器

好了, 现在点击开始同步,静等系统自动抓取网站信息即可。

创建应用,绑定知识库

选择器如何使用

选择器是 HTML CSS JS 的产物,你可以通过选择器来定位到你需要抓取的具体内容,而不是整个站点。使用方式为:

首先打开浏览器调试面板(通常是 F12或者【右键 - 检查】)

输入对应元素的选择器

菜鸟教程 css 选择器,具体选择器的使用方式可以参考菜鸟教程。

上图中,我们选中了一个区域,对应的是div标签,它有 data-prismjs-copy, data-prismjs-copy-success, data-prismjs-copy-error 三个属性,这里我们用到一个就够。所以选择器是: div[data-prismjs-copy]

除了属性选择器常见的还有类和ID选择器。例如

上图 class 里的是类名(可能包含多个类名,都是空格隔开的,选择一个即可),选择器可以为:.docs-content

多选择器使用

在开头的演示中,我们对 FastGPT 文档是使用了多选择器的方式来选择,通过逗号隔开了两个选择器。

我们希望选中上图两个标签中的内容,此时就需要两组选择器。一组是:.docs-content .mb-0.d-flex,含义是 docs-content 类下同时包含 mb-0d-flex 两个类的子元素;

另一组是.docs-content div[data-prismjs-copy],含义是docs-content 类下包含data-prismjs-copy属性的div元素。

把两组选择器用逗号隔开即可:.docs-content .mb-0.d-flex, .docs-content div[data-prismjs-copy]