Commit Graph

792 Commits

Author SHA1 Message Date
chanzhi82020
31c17999b8 This PR introduces evaluation support designed specifically to track and benchmark applications built on the FastGPT platform. (#5476)
- Adds a lightweight evaluation framework for app-level tracking and benchmarking.
- Changes: 28 files, +1455 additions, -66 deletions.
- Branch: add-evaluations -> main.
- PR: https://github.com/chanzhi82020/FastGPT/pull/1

Applications built on FastGPT need repeatable, comparable benchmarks to measure regressions, track improvements, and validate releases. This initial implementation provides the primitives to define evaluation scenarios, run them against app endpoints or model components, and persist results for later analysis.

I updated the PR description to emphasize that the evaluation system is targeted at FastGPT-built apps and expanded the explanation of the core pieces so reviewers understand the scope and intended use. The new description outlines the feature intent, core components, and how results are captured and aggregated for benchmarking.

- Evaluation definitions
  - Define evaluation tasks that reference an app (app id, version, endpoint), test datasets or input cases, expected outputs (when applicable), and run configuration (parallelism, timeouts).
  - Support for custom metric plugins so teams can add domain-specific measures.

- Runner / Executor
  - Executes evaluation cases against app endpoints or internal model interfaces.
  - Captures raw responses, response times, status codes, and any runtime errors.
  - Computes per-case metrics (e.g., correctness, latency) immediately after each case run.

- Metrics & Aggregation
  - Built-in metrics: accuracy/success rate, latency (p50/p90/p99), throughput, error rate.
  - Aggregation produces per-run summaries and per-app historical summaries for trend analysis.
  - Allows combining metrics into composite scores for high-level benchmarking.

- Persistence & Logging
  - Stores run results, input/output pairs (when needed), timestamps, environment info, and app/version metadata so runs are reproducible and auditable.
  - Logs are retained to facilitate debugging and root-cause analysis of regressions.

- Reporting & Comparison
  - Produces aggregated reports suitable for CI gating, release notes, or dashboards.
  - Supports comparing multiple app versions or deployments side-by-side.

- Extensibility & Integration
  - Designed to plug into CI (automated runs on PRs or releases), dashboards, and downstream analysis tools.
  - Easy to add new metrics, evaluators, or dataset connectors.
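
As a rough illustration of the runner and aggregation pieces described above, here is a minimal TypeScript sketch. It is not taken from the PR: the type names (`EvalCase`, `CaseResult`, `RunSummary`), the `callApp` callback standing in for an app endpoint, and the nearest-rank percentile method are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration only; none of these names come from the FastGPT codebase.
interface EvalCase {
  input: string;
  expected?: string; // optional: not every case has a reference answer
}

interface CaseResult {
  ok: boolean; // whether the case counted as a success
  latencyMs: number; // wall-clock response time in milliseconds
  error?: string; // runtime error message, if the call threw
}

interface RunSummary {
  total: number;
  successRate: number;
  errorRate: number;
  latency: { p50: number; p90: number; p99: number };
}

// Nearest-rank percentile over an ascending-sorted array (assumed method).
function percentile(sorted: number[], p: number): number {
  if (sorted.length === 0) return NaN;
  const rank = Math.ceil((p / 100) * sorted.length);
  const idx = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[idx];
}

// Aggregate per-case results into a per-run summary.
function summarize(results: CaseResult[]): RunSummary {
  const total = results.length;
  const latencies = results.map((r) => r.latencyMs).sort((a, b) => a - b);
  const succeeded = results.filter((r) => r.ok).length;
  const errored = results.filter((r) => r.error !== undefined).length;
  return {
    total,
    successRate: total === 0 ? 0 : succeeded / total,
    errorRate: total === 0 ? 0 : errored / total,
    latency: {
      p50: percentile(latencies, 50),
      p90: percentile(latencies, 90),
      p99: percentile(latencies, 99),
    },
  };
}

// Run cases sequentially against a caller-supplied function (a stand-in for an app endpoint).
async function runCases(
  cases: EvalCase[],
  callApp: (input: string) => Promise<string>
): Promise<CaseResult[]> {
  const results: CaseResult[] = [];
  for (const c of cases) {
    const start = Date.now();
    try {
      const output = await callApp(c.input);
      // Exact string match is a placeholder for a real correctness metric.
      const ok = c.expected === undefined || output === c.expected;
      results.push({ ok, latencyMs: Date.now() - start });
    } catch (e) {
      results.push({ ok: false, latencyMs: Date.now() - start, error: String(e) });
    }
  }
  return results;
}
```

A caller would pass the output of `runCases` to `summarize` and persist the resulting `RunSummary` alongside app and version metadata. Parallelism, timeouts, and custom metric plugins are left out to keep the sketch short.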

By centering the evaluation system on FastGPT apps, teams can benchmark full application behavior (not only raw model outputs), correlate metrics with deployment configurations, and make informed release decisions.

Next steps:

- Expand the built-in metric suite (e.g., F1, BLEU/ROUGE where applicable), add dataset connectors, and provide example evaluation scenarios for sample apps.
- Integrate with CI pipelines and add basic dashboarding for trend visualization.

Related Issue: N/A

Co-authored-by: Archer <545436317@qq.com>
2025-09-16 15:20:59 +08:00
Archer
cb7d1a3205 perf: init shell (#5651)
* perf: init shell

* fix: tool run select

* border radius
2025-09-15 22:21:24 +08:00
Archer
2ed1545eb5 V4.12.4 features (#5626)
* fix: push again, user select option button and form input radio content overflow (#5601)

* fix: push again, user select option button and form input radio content overflow

* fix: use useCallback instead of useMemo, fix unnecessary delete

* fix: Move the variable inside the component

* fix: do not pass valueLabel to MySelect

* ui

* del collection api adapt

* refactor: inherit permission (#5529)

* refactor: permission update conflict check function

* refactor(permission): app collaborator update api

* refactor(permission): support app update collaborator

* feat: support fe permission conflict check

* refactor(permission): app permission

* refactor(permission): dataset permission

* refactor(permission): team permission

* chore: fe adjust

* fix: type error

* fix: audit pagination

* fix: tc

* chore: initv4130

* fix: app/dataset auth logic

* chore: move code

* refactor(permission): remove selfPermission

* fix: mock

* fix: test

* fix: app & dataset auth

* fix: inherit

* test(inheritPermission): test syncChildrenPermission

* prompt editor add list plugin (#5620)

* perf: search result (#5608)

* fix: table size (#5598)

* temp: list value

* backspace

* optimize code

---------

Co-authored-by: Archer <545436317@qq.com>
Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>

* fix: fe & member list (#5619)

* chore: initv4130

* fix: MemberItemCard

* fix: MemberItemCard

* chore: fe adjust & init script

* perf: test code

* doc

* fix debug variables (#5617)

* perf: search result (#5608)

* fix: table size (#5598)

* fix debug variables

* fix

---------

Co-authored-by: Archer <545436317@qq.com>
Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>

* perf: member ui

* fix: inherit bug (#5624)

* refactor(permission): remove getClbsWithInfo, which is useless

* fix: app list privateApp

* fix: get infos

* perf(fe): remove delete icon when it is disabled in MemberItemCard

* fix: dataset private dataset

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Archer <545436317@qq.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* perf: auto coupon

* chore: upgrade script & get infos avatar (#5625)

* fix: get infos

* chore: initv4130

* feat: support WecomRobot publish, and fix AesKey can not save bug (#5526)

* feat: resolve conflicts

* fix: add param 'show_publish_wecom'

* feat: abstract out WecomCrypto type

* doc: wecom robot document

* fix: solve instability in AI output

* doc: update some pictures

* feat: remove functions from request.ts to chat.ts and toolCall.ts

* doc: wecom robot doc update

* fix

* delete unused code

* doc: update version and prompt

* feat: remove wecom crypto, delete wecom code in workflow

* feat: delete unused codes

---------

Co-authored-by: heheer <zhiyu44@qq.com>

* remove test

* rename init shell

* feat: collection page store

* reload sandbox

* pysandbox

* remove log

* chore: remove useless code (#5629)

* chore: remove useless code

* fix: checkConflict

* perf: support hidden type for RoleList

* fix: copy node

* update doc

* fix(permission): some bug (#5632)

* fix: app/dataset list

* fix: inherit bug

* perf: del app;i18n;save chat

* fix: test

* i18n

* fix: sumper overflow return OwnerRoleVal (#5633)

* remove invalid code

* fix: scroll

* fix: objectId

* update next

* update package

* object id

* mock redis

* feat: add redis append to resolve wecom stream response (#5643)

* feat: resolve conflicts

* fix: add param 'show_publish_wecom'

* feat: abstract out WecomCrypto type

* doc: wecom robot document

* fix: solve instability in AI output

* doc: update some pictures

* feat: remove functions from request.ts to chat.ts and toolCall.ts

* doc: wecom robot doc update

* fix

* delete unused code

* doc: update version and prompt

* feat: remove wecom crypto, delete wecom code in workflow

* feat: delete unused codes

* feat: add redis append method

---------

Co-authored-by: heheer <zhiyu44@qq.com>

* cache per

* fix(test): init team sub when creating mocked user (#5646)

* fix: button is not vertically centered (#5647)

* doc

* fix: gridFs objectId (#5649)

---------

Co-authored-by: Zeng Qingwen <143274079+fishwww-ww@users.noreply.github.com>
Co-authored-by: Finley Ge <32237950+FinleyGe@users.noreply.github.com>
Co-authored-by: heheer <heheer@sealos.io>
Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: heheer <zhiyu44@qq.com>
2025-09-15 20:02:54 +08:00
Archer
25207c5060 perf: search result (#5608) 2025-09-09 14:06:31 +08:00
Archer
c4632a2222 V4.12.3 document (#5600)
* doc

* doc

* perf: log
2025-09-07 20:55:14 +08:00
Archer
3f9b0fa1d4 V4.12.3 features (#5595)
* refactor: remove ModelProviderIdType and update related types (#5549)

* perf: model provider

* fix eval create split (#5570)

* git rebase --continuedoc

* add more variable types (#5540)

* variable types

* password

* time picker

* internal var

* file

* fix-test

* time select default value & range

* password & type render

* fix

* fix build

* fix

* move method

* split date select

* icon

* perf: variable code

* prompt editor add markdown plugin (#5556)

* editor markdown

* fix build

* pnpm lock

* add props

* update code

* fix list

* editor ui

* fix variable reset (#5586)

* perf: variables type code

* customize lexical indent (#5588)

* perf: multiple selector

* perf: tab plugin

* doc

* refactor: update workflow constants to use ToolTypeEnum (#5491)

* refactor: replace FlowNodeTemplateTypeEnum with string literals in workflow templates

* perf: tool type

---------

Co-authored-by: archer <545436317@qq.com>

* update doc

* fix: make table's row more natural while dragging it (#5596)

* feat: add APIGetTemplate function and refactor template fetching logic (#5498)

* feat: add APIGetTemplate function and refactor template fetching logic

* chore: adjust the code

* chore: update sdk

---------

Co-authored-by: FinleyGe <m13203533462@163.com>

* perf init system

* doc

* remove log

* remove i18n

* perf: variables render

---------

Co-authored-by: Ctrlz <143257420+ctrlz526@users.noreply.github.com>
Co-authored-by: heheer <heheer@sealos.io>
Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>
Co-authored-by: FinleyGe <m13203533462@163.com>
2025-09-07 14:41:48 +08:00
Archer
85ea117481 Fix workflow (#5592)
* fix: fileselector default

* fix: workflow run process
2025-09-04 20:22:35 +08:00
伍闲犬
9be1e591d3 fix: delete "Content-Length" while redirect request to pro api (#5589) 2025-09-04 14:17:07 +08:00
Archer
c67e645469 perf: login page (#5571) 2025-09-01 21:03:58 +08:00
Finley Ge
f41775fe56 fix: leave team (#5554)
* chore: fe copywriting

* feat: auto accept invitation when login

* perf: only show forbidden filter in sync mode

* chore: auto accept invitation
2025-09-01 20:33:29 +08:00
Archer
42e249f30f perf: rrf code (#5559) 2025-08-29 01:32:08 +08:00
Archer
a952539875 perf: rrf code (#5558) 2025-08-29 01:24:19 +08:00
YeYuheng
e4756c76dd rrf_weight (#5551)
Co-authored-by: xxYyh <xxyyh@xxYyhdeMacBook-Pro.local>
2025-08-29 00:54:29 +08:00
Finley Ge
486d791b94 fix: extract node can not extract when using tool-calling-able model. (#5555) 2025-08-28 18:09:33 +08:00
Archer
c4799df3fd perf: workflow code (#5548)
* perf: workflow code

* add tool call limit
2025-08-27 11:45:46 +08:00
Archer
324aaae769 fix: ai response test (#5544)
* fix: ai response test

* fix: skip edge check

* fix: app list

* fix: toolset conflict interactive node

* fix: username show
2025-08-27 00:08:22 +08:00
伍闲犬
3fb1ff2614 fix: read permission; incorrect name; redirect (#5541) 2025-08-26 17:57:59 +08:00
Archer
93e9cb675d fix: oceanbase insert (#5539) 2025-08-26 17:29:42 +08:00
Archer
7cd4a8b0bc feat: Store pdfparse in local (#5534) 2025-08-26 14:35:39 +08:00
伍闲犬
3c2bf20666 feat: add switch to control if enable home (#5531) 2025-08-26 12:00:18 +08:00
Archer
3b25bf57c4 perf: search key refresh parentId (#5530) 2025-08-26 11:30:02 +08:00
Archer
830eb19055 feature: V4.12.2 (#5525)
* feat: favorite apps & quick apps with their own configuration (#5515)

* chore: extract chat history and drawer; fix model selector

* feat: display favourite apps and make it configurable

* feat: favorite apps & quick apps with their own configuration

* fix: fix tab title and add loading state for searching

* fix: cascade delete favorite app and quick app while deleting relative app

* chore: make improvements

* fix: favourite apps ui

* fix: add permission for quick apps

* chore: fix permission & clear redundant code

* perf: chat home page code

* chatbox ui

* fix: 4.12.2-dev (#5520)

* fix: add empty placeholder; fix app quick status; fix tag and layout

* chore: add tab query for the setting tabs

* chore: use `useConfirm` hook instead of `MyModal`

* remove log

* fix: fix modal padding (#5521)

* perf: manage app

* feat: enhance model provider handling and update icon references (#5493)

* perf: model provider

* sdk package

* refactor: create llm response (#5499)

* feat: add LLM response processing functions, including the creation of stream-based and complete responses

* feat: add volta configuration for node and pnpm versions

* refactor: update LLM response handling and event structure in tool choice logic

* feat: update LLM response structure and integrate with tool choice logic

* refactor: clean up imports and remove unused streamResponse function in chat and toolChoice modules

* refactor: rename answer variable to answerBuffer for clarity in LLM response handling

* feat: enhance LLM response handling with tool options and integrate tools into chat and tool choice logic

* refactor: remove volta configuration from package.json

* refactor: reorganize LLM response types and ensure default values for token counts

* refactor: streamline LLM response handling by consolidating response structure and removing redundant checks

* refactor: enhance LLM response handling by consolidating tool options and streamlining event callbacks

* fix: build error

* refactor: update tool type definitions for consistency in tool handling

* feat: llm request function

* fix: ts

* fix: ts

* fix: ahook ts

* fix: variable name

* update lock

* ts version

* doc

* remove log

* fix: translation type

* perf: workflow status check

* fix: ts

* fix: prompt tool call

* fix: fix missing plugin interact window & make tag draggable (#5527)

* fix: incorrect select quick apps state; filter apps type (#5528)

* fix: usesafe translation

* perf: add quickapp modal

---------

Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>
Co-authored-by: Ctrlz <143257420+ctrlz526@users.noreply.github.com>
Co-authored-by: francis <zhichengfan18@gmail.com>
2025-08-25 19:19:43 +08:00
Archer
b6318aa35a fix: version schema ref error (#5518) 2025-08-22 11:34:46 +08:00
Archer
95325346ff perf: vector format (#5516)
* perf: vector format

* feat: embedding batch size
2025-08-22 10:18:24 +08:00
heheer
a92917c05f fix team app template search (#5514) 2025-08-21 20:41:46 +08:00
Archer
e19eddf976 fix: model selector overlay (#5511) 2025-08-20 21:58:13 +08:00
Finley Ge
37eec3d452 perf: customizable embedding chunk size via env var (#5494)
* perf: customizable embedding chunk size via env var

* Update .env.template

---------

Co-authored-by: Archer <545436317@qq.com>
2025-08-20 18:42:15 +08:00
Finley Ge
f41e3ffc68 fix: multiple select value type when empty string does not have map function (#5487) 2025-08-19 14:32:58 +08:00
Archer
ce36230285 perf: page ui (#5469)
* perf: page ui

* fix: icon

* limit chat items

* limit chat items
2025-08-15 17:56:49 +08:00
Archer
76dc23c2e4 perf: getInitData api cache; perf: tool description field; signoz store level (#5465)
* perf: auto focus

* perf: getInitData api cache

* perf: tool description field

* signoz store level

* perf: chat logs index
2025-08-15 15:01:20 +08:00
Ctrlz
d78a0e9e4b feat: add toolDescription field across various schemas and update related functions (#5452) 2025-08-15 14:13:14 +08:00
Archer
5cd1c2af14 perf: chat pane (#5462)
* fix: sync pane with URL appId vs Home appId to avoid cross-tab interference (#5456)

* perf: chat pane

* perf: markdown render

* update app chat logs index

* doc

* doc redirect

---------

Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>
2025-08-15 11:03:38 +08:00
Archer
eadf2fd54c fix: index (#5458)
* doc

* fix: home app name

* fix: char init error status

* fix: index

* fix: secret input
2025-08-14 18:54:47 +08:00
heheer
9a9f094e15 prompt optimize loading (#5461) 2025-08-14 18:26:16 +08:00
heheer
c5cabd0efc export chat detail (#5454)
* export chat detail

* fix

* key name
2025-08-14 16:25:40 +08:00
Ctrlz
8f3424cea1 feat: enhance SystemPluginTemplateItemType to include user instructions (#5455) 2025-08-14 16:00:35 +08:00
heheer
d72929dcf8 fix dataset auth filter (#5457) 2025-08-14 15:59:59 +08:00
Archer
9fbfabac61 perf: variable replace; feat: prompt optimizer code (#5453)
* feat: add prompt optimizer (#5444)

* feat: add prompt optimizer

* fix

* perf: variable replace

* perf: prompt optimizer code

* feat: init charts shell

* perf: user error remove

---------

Co-authored-by: heheer <heheer@sealos.io>
2025-08-14 15:48:22 +08:00
Ctrlz
6a02d2a2e5 fix: concatenate answerText in dispatchRunTool function (#5451) 2025-08-13 21:05:00 +08:00
Archer
ad550f4444 perf: workflow response field (#5443) 2025-08-13 14:29:13 +08:00
Archer
83aa3a855f redirect (#5440) 2025-08-12 23:08:00 +08:00
Archer
c51395b2c8 V4.12.0 features (#5435)
* add logs chart (#5352)

* charts

* chart data

* log chart

* delete

* rename api

* fix

* move api

* fix

* fix

* pro config

* fix

* feat: Repository interaction (#5356)

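* feat: 1 functionality seems to work now, will test again tomorrow

* feat: 2 fixed the bug left over from yesterday, but the select-all button is buggy again

* feat: 3 third version, fixed the select-all bug

* feat: 4 fourth version, small details to be adjusted next

* feat: 5 oh my goodness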

* feat: 6

* feat: 6 pr

* feat: 7

* feat: 8

* feat: 9

* feat: 10

* feat: 11

* feat: 12

* perf: checkbox ui

* refactor: tweak login layout (#5357)

Co-authored-by: Archer <545436317@qq.com>

* login ui

* app chat log chart pro display (#5392)

* app chat log chart pro display

* add canopen props

* perf: pro tag tip

* perf: pro tag tip

* feat: openrouter provider (#5406)

* perf: login ui

* feat: openrouter provider

* provider

* perf: custom error throw

* perf: emb batch (#5407)

* perf: emb batch

* perf: vector retry

* doc

* doc (#5411)

* doc

* fix: team folder will add to workflow

* fix: generateToc shell

* Tool price (#5376)

* resolve conflicts for cherry-pick

* fix i18n

* Enhance system plugin template data structure and update ToolSelectModal to include CostTooltip component

* refactor: update systemKeyCost type to support array of objects in plugin and workflow types

* refactor: simplify systemKeyCost type across plugin and workflow types to a single number

* refactor: streamline systemKeyCost handling in plugin and workflow components

* fix

* fix

* perf: toolset price config;fix: workflow array selector ui (#5419)

* fix: workflow array selector ui

* update default model tip

* perf: toolset price config

* doc

* fix: test

* Refactor/chat (#5418)

* refactor: add homepage configuration; add home chat page; add side bar animated collapse and layout

* fix: fix lint rules

* chore: improve logics and code

* chore: clearer logic

* chore: adjust api

---------

Co-authored-by: Archer <545436317@qq.com>

* perf: chat setting code

* del history

* logo image

* perf: home chat ui

* feat: enhance chat response handling with external links and user info (#5427)

* feat: enhance chat response handling with external links and user info

* fix

* cite code

* perf: toolset add in workflow

* fix: test

* fix: search parentId

* Fix/chat (#5434)

* wip: rebased onto upstream

* wip: adapt mobile UI

* fix: fix chat page logic and UI

* fix: fix UI and improve some logics

* fix: model selector missing logo; vision model to retrieve file

* perf: role selector

* fix: chat ui

* optimize export app chat log (#5436)

* doc

* chore: move components to proper directory; fix the api to get app list (#5437)

* chore: improve team app panel display form (#5438)

* feat: add home chat log tab

* chore: improve team app panel display form

* chore: improve log panel

* fix: spec

* doc

* fix: log permission

* fix: dataset schema required

* add loading status

* remove ui weight

* manage log

* fix: log detail per

* doc

* fix: log menu

* rename permission

* bg color

* fix: app log per

* fix: log key selector

* fix: log

* doc

---------

Co-authored-by: heheer <zhiyu44@qq.com>
Co-authored-by: colnii <1286949794@qq.com>
Co-authored-by: 伍闲犬 <76519998+xqvvu@users.noreply.github.com>
Co-authored-by: Ctrlz <143257420+ctrlz526@users.noreply.github.com>
Co-authored-by: 伍闲犬 <whoeverimf5@gmail.com>
Co-authored-by: heheer <heheer@sealos.io>
2025-08-12 22:22:18 +08:00
Finley Ge
c6e58291f7 fix: permission can not edit admin permission (#5433) 2025-08-11 21:46:27 +08:00
Finley Ge
57e1ef1176 refactor: permission role & app read chat log permission (#5416)
* refactor: permission role

* refactor: permission type

* fix: permission manage

* fix: group owner cannot be deleted

* chore: common per map

* chore: openapi

* chore: rename

* fix: type error

* chore: app chat log permission

* chore: add initv4112
2025-08-11 10:51:44 +08:00
Archer
29edf1ea5f Perf: llm parse paragraph (#5420)
* feat: llm directory optimization (#5400)

* perf: llm parse

* doc

---------

Co-authored-by: colnii <1286949794@qq.com>
2025-08-09 18:38:58 +08:00
Archer
1fc1e3fa80 fix: max tokens config (#5409) 2025-08-08 10:38:57 +08:00
Finley Ge
17599d95bb fix: old mcp tool compatible (#5399) 2025-08-07 11:37:26 +08:00
Archer
37648d5c71 fix: mcp not response output (#5388) 2025-08-05 10:51:42 +08:00
Archer
6a0b0b1991 update doc search engine (#5386)
* update doc search engine

* custom tokenizer

* tokenizer
2025-08-04 22:07:52 +08:00
Archer
16a74c909d fix: doc preview action;update doc (#5383)
* fix: doc preview action

* update doc

* doc
2025-08-04 18:10:58 +08:00