---
title: Knowledge Base Usage
description: Common Knowledge Base usage questions
---
## Uploaded file content shows garbled characters
Re-save the file with UTF-8 encoding.
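If you'd rather script the conversion than re-save by hand, a few lines of Python can do it. This is a minimal sketch that assumes the original file is GBK-encoded (a common encoding for Chinese-language files); adjust `source_encoding` to match your actual file.

```python
from pathlib import Path

def resave_as_utf8(path: str, source_encoding: str = "gbk") -> None:
    """Read a file in its original encoding and rewrite it as UTF-8.

    `source_encoding` is an assumption -- replace it with whatever
    encoding your file was actually saved in (e.g. "gb2312", "big5").
    """
    text = Path(path).read_text(encoding=source_encoding)
    Path(path).write_text(text, encoding="utf-8")

# Usage: resave_as_utf8("my_dataset.csv")  # hypothetical filename
```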
## What's the difference between the File Processing Model and the Index Model in Knowledge Base settings?
* **File Processing Model**: Used for **Enhanced Processing** and **Q&A Splitting** during data ingestion. Enhanced Processing generates related questions and summaries; Q&A Splitting generates question-answer pairs.
* **Index Model**: Used for vectorization — it processes and organizes text data into a structure optimized for fast retrieval.
## Does the Knowledge Base support Excel files?
Yes. You can upload xlsx and other spreadsheet formats, not just CSV.
## How are Knowledge Base tokens calculated?
All token counts use the GPT-3.5 tokenizer as the standard.
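For exact counts you can run the same tokenizer locally (the GPT-3.5 tokenizer is `cl100k_base`, available via the `tiktoken` package). If you only need a ballpark figure, a common rule of thumb for English text is roughly 4 characters per token. A minimal sketch of that heuristic, clearly not the billing-accurate count:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.

    This is a heuristic only -- FastGPT's accounting uses the actual
    GPT-3.5 tokenizer (cl100k_base), which the `tiktoken` package can
    run locally for exact numbers. Chinese text averages fewer
    characters per token, so expect the estimate to skew low there.
    """
    return max(1, len(text) // 4)
```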
## I accidentally deleted the rerank model. How do I add it back?
![](/imgs/dataset3.png)
Add the rerank model configuration in your `config.json` file, then you'll be able to select it again.
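As a rough sketch, a rerank entry might look like the fragment below. The exact field names vary by FastGPT version, so check the config reference for your release; the model name, URL, and key here are placeholders, not working values.

```json
{
  "reRankModels": [
    {
      "model": "bge-reranker-base",
      "name": "bge-reranker-base",
      "requestUrl": "http://localhost:6006/v1/rerank",
      "requestAuth": "sk-xxx"
    }
  ]
}
```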
## If I created apps and Knowledge Bases on the cloud platform, will my data be deleted if I don't renew right away?
On the free plan, Knowledge Base data is cleared after 30 days of inactivity (no login). Apps are not affected. Paid plans automatically downgrade to the free plan upon expiration.
![](/imgs/dataset4.png)
## The AI stops responding mid-answer when there are too many relevant Knowledge Base results.
FastGPT calculates the maximum response length as:

```
Max Response = min(Configured Max Response, Max Context Window - History)
```
For example, with an 18K context model, input + output share the same window. As output grows, available input shrinks.
To fix this:
1. Check your configured max response (response limit) setting.
2. Reduce input to free up space for output — specifically, reduce the number of chat history turns included in the workflow.
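The formula above can be checked with a quick calculation (the numbers here are illustrative, not defaults):

```python
def max_response_tokens(configured_max: int, context_window: int,
                        history_tokens: int) -> int:
    """FastGPT caps the reply at whichever is smaller: the configured
    max response, or the room left in the context window after the
    chat history."""
    return min(configured_max, context_window - history_tokens)

# With a 16K-token model, a 4,000-token configured max response, and
# 13,000 tokens of history, only 3,000 tokens remain for the reply --
# so long answers get cut off mid-sentence:
# max_response_tokens(4000, 16000, 13000) -> 3000
```

Trimming history (step 2 above) raises the second term of the `min`, which is usually the binding one when responses get truncated.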
Where to find the max response setting:
![](/imgs/dataset1.png)
![](/imgs/dataset2.png)
For self-hosted deployments, you can reserve headroom when configuring model context limits. For example, set a 128K model to 120K — the remaining space will be allocated to output.
## I'm hitting context limit errors before reaching the configured number of chat history turns.
This has the same cause and fix as the previous question: input and output share the model's context window, and FastGPT caps the reply at `min(Configured Max Response, Max Context Window - History)`. Lower the configured max response, or reduce the number of chat history turns included in the workflow. For self-hosted deployments, you can also reserve headroom when configuring the model's context limit — for example, set a 128K model to 120K so the remaining space is available for output.