FastGPT/packages/web/i18n/zh-CN/dataset.json
Archer c30f069f2f V4.9.11 feature (#4969)
* Feat: Images dataset collection (#4941)

* New pic (#4858)

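* Update dataset-related types, adding support for image file ID and preview URL; optimize dataset import and add an image dataset processing component; fix some i18n text; update file upload logic to support the new features.

* Differences from the original code

* Add V4.9.10 release notes: support setting the `systemEnv.hnswMaxScanTuples` parameter for PG, optimize LLM stream call timeout, and fix sorting across multiple knowledge bases in full-text search. Also update dataset indexes, removing the datasetId field to simplify queries.

* Switch to the fileId_image logic and add training queue matching logic

* Add image collection detection logic and optimize preview URL generation so that preview URLs are generated only when the dataset is an image collection; add related log output for debugging.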

* Refactor Docker Compose configuration to comment out exposed ports for production environments, update image versions for pgvector, fastgpt, and mcp_server, and enhance Redis service with a health check. Additionally, standardize dataset collection labels in constants and improve internationalization strings across multiple languages.

* Enhance TrainingStates component by adding internationalization support for the imageParse training mode and update defaultCounts to include imageParse mode in trainingDetail API.

* Enhance dataset import context by adding additional steps for image dataset import process and improve internationalization strings for modal buttons in the useEditTitle hook.

* Update DatasetImportContext to conditionally render MyStep component based on data source type, improving the import process for non-image datasets.

* Refactor image dataset handling by improving internationalization strings, enhancing error messages, and streamlining the preview URL generation process.

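* Upload images to the newly created dataset_collection_images table and update the logic accordingly

* Fix issues in parts other than the controller

* Consolidate the image dataset logic into the controller

* Add missing i18n

* Add missing i18n

* Resolve review comments: mainly upload logic changes and component reuse

* Icon display for image names

* Fix naming issues that caused compile errors

* Remove the unneeded collectionid parts

* Handle redundant files and modify a delete button

* Resolve everything except loading and the unified imageId

* Fix icon errors

* Reuse MyPhotoView and rename imageFileId to imageId via replace-all

* Revert unnecessary file changes

* Fix errors and update fields

* Add logic to delete temporary files after a successful upload and revert some changes

* Remove the path field, store images in GridFS, and update the code for create/delete operations

* Fix compile errors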

---------

Co-authored-by: archer <545436317@qq.com>

* perf: image dataset

* feat: insert image

* perf: image icon

* fix: training state

---------

Co-authored-by: Zhuangzai fa <143257420+ctrlz526@users.noreply.github.com>

* fix: ts (#4948)

* Thirddatasetmd (#4942)

* add thirddataset.md

* fix thirddataset.md

* fix

* delete wrong png

---------

Co-authored-by: dreamer6680 <146868355@qq.com>

* perf: api dataset code

* perf: log

* add secondary.tsx (#4946)

* add secondary.tsx

* fix

---------

Co-authored-by: dreamer6680 <146868355@qq.com>

* perf: multiple menu

* perf: i18n

* feat: parse queue (#4960)

* feat: parse queue

* feat: sync parse queue

* fix thirddataset.md (#4962)

* fix thirddataset-4.png (#4963)

* feat: Dataset template import (#4934)

* Template import is done except for the documentation

* Fix build errors in template import

* Document production

* compress pictures

* Change some constants to variables

---------

Co-authored-by: Archer <545436317@qq.com>

* perf: template import

* doc

* llm paragraph

* bocha tool

* fix: del collection

---------

Co-authored-by: Zhuangzai fa <143257420+ctrlz526@users.noreply.github.com>
Co-authored-by: dreamer6680 <1468683855@qq.com>
Co-authored-by: dreamer6680 <146868355@qq.com>
2025-06-06 14:48:44 +08:00

210 lines
12 KiB
JSON

{
"Enable": "启用",
"add_file": "添加文件",
"api_file": "API 文件库",
"api_url": "接口地址",
"apidataset_configuration": "配置信息",
"auto_indexes": "自动生成补充索引",
"auto_indexes_tips": "通过大模型进行额外索引生成,提高语义丰富度,提高检索的精度。",
"auto_training_queue": "增强索引排队",
"backup_collection": "备份数据",
"backup_dataset": "备份导入",
"backup_dataset_success": "备份创建成功",
"backup_dataset_tip": "可以将导出知识库时,下载的 csv 文件重新导入。",
"backup_mode": "备份导入",
"backup_template_invalid": "备份文件格式不正确,应该是首列为 q,a,indexes 的 csv 文件",
"chunk_max_tokens": "分块上限",
"chunk_process_params": "分块处理参数",
"chunk_size": "分块大小",
"chunk_trigger": "分块条件",
"chunk_trigger_force_chunk": "强制分块",
"chunk_trigger_max_size": "原文长度大于文件处理模型最大上下文70%",
"chunk_trigger_min_size": "原文长度大于",
"chunk_trigger_tips": "当满足一定条件时才触发分块存储,否则会直接完整存储原文",
"close_auto_sync": "确认关闭自动同步功能?",
"collection.Create update time": "创建/更新时间",
"collection.Training type": "训练模式",
"collection.training_type": "处理模式",
"collection_data_count": "数据量",
"collection_metadata_custom_pdf_parse": "PDF增强解析",
"collection_name": "数据集名称",
"collection_not_support_retraining": "该集合类型不支持重新调整参数",
"collection_not_support_sync": "该集合不支持同步",
"collection_sync": "立即同步",
"collection_sync_confirm_tip": "确认开始同步数据?系统将会拉取最新数据进行比较,如果内容不相同,则会创建一个新的集合并删除旧的集合,请确认!",
"collection_tags": "集合标签",
"common.dataset.data.Input Error Tip": "[图片数据集] 处理过程错误:",
"common.error.unKnow": "未知错误",
"common_dataset": "通用知识库",
"common_dataset_desc": "通过导入文件、网页链接或手动录入形式构建知识库",
"condition": "条件",
"config_sync_schedule": "配置定时同步",
"confirm_import_images": "共 {{num}} 张图片 | 确认创建",
"confirm_to_rebuild_embedding_tip": "确认为知识库切换索引?\n切换索引是一个非常重量的操作,需要对您知识库内所有数据进行重新索引,时间可能较长,请确保账号内剩余积分充足。\n\n此外,你还需要注意修改选择该知识库的应用,避免它们与其他索引模型知识库混用。",
"core.dataset.Image collection": "图片数据集",
"core.dataset.import.Adjust parameters": "调整参数",
"custom_data_process_params": "自定义",
"custom_data_process_params_desc": "自定义设置数据处理规则",
"custom_split_sign_tip": "允许你根据自定义的分隔符进行分块。通常用于已处理好的数据,使用特定的分隔符来精确分块。可以使用 | 符号表示多个分隔符,例如:“。|.” 表示中英文句号。\n尽量避免使用正则相关特殊符号,例如: * () [] {} 等。",
"data_amount": "{{dataAmount}} 组数据, {{indexAmount}} 组索引",
"data_error_amount": "{{errorAmount}} 组训练异常",
"data_index_image": "图片索引",
"data_index_num": "索引 {{index}}",
"data_parsing": "数据解析中",
"data_process_params": "处理参数",
"data_process_setting": "数据处理配置",
"data_uploading": "数据上传中: {{num}}%",
"dataset.Chunk_Number": "分块号",
"dataset.Completed": "完成",
"dataset.Delete_Chunk": "删除",
"dataset.Edit_Chunk": "编辑",
"dataset.Error_Message": "报错信息",
"dataset.No_Error": "暂无异常信息",
"dataset.Operation": "操作",
"dataset.ReTrain": "重试",
"dataset.Training Process": "训练状态",
"dataset.Training_Count": "{{count}} 组训练中",
"dataset.Training_Errors": "异常 ({{count}})",
"dataset.Training_QA": "{{count}} 组问答对训练中",
"dataset.Training_Status": "训练状态",
"dataset.Training_Waiting": "需等待 {{count}} 组数据",
"dataset.Unsupported operation": "操作不支持",
"dataset.no_collections": "暂无数据集",
"dataset.no_tags": "暂无标签",
"default_params": "默认",
"default_params_desc": "使用系统默认的参数和规则",
"download_csv_template": "点击下载 CSV 模板",
"edit_dataset_config": "编辑知识库配置",
"empty_collection": "空白数据集",
"enhanced_indexes": "索引增强",
"error.collectionNotFound": "集合找不到了~",
"external_file": "外部文件库",
"external_file_dataset_desc": "可以通过 API,使用外部文件库构建知识库",
"external_id": "文件阅读 ID",
"external_other_dataset_desc": "自定义API、飞书、语雀等外部文档作为知识库",
"external_read_url": "外部预览地址",
"external_read_url_tip": "可以配置你文件库的阅读地址。便于对用户进行阅读鉴权操作。目前可以使用 {{fileId}} 变量来指代外部文件 ID。",
"external_url": "文件访问 URL",
"failedToLoadRootDirectories": "加载根目录失败",
"failedToLoadSubDirectories": "加载子目录失败",
"feishu_dataset": "飞书知识库",
"feishu_dataset_config": "配置飞书知识库",
"feishu_dataset_desc": "可通过配置飞书文档权限,使用飞书文档构建知识库,文档不会进行二次存储",
"file_list": "文件列表",
"file_model_function_tip": "用于增强索引和 QA 生成",
"filename": "文件名",
"folder_dataset": "文件夹",
"getDirectoryFailed": "获取目录失败",
"image_auto_parse": "图片自动索引",
"image_auto_parse_tips": "调用 VLM 自动标注文档里的图片,并生成额外的检索索引",
"image_training_queue": "图片处理排队",
"images_creating": "正在创建",
"immediate_sync": "立即同步",
"import.Auto mode Estimated Price Tips": "需调用文本理解模型,需要消耗较多 AI 积分:{{price}} 积分/1K tokens",
"import.Embedding Estimated Price Tips": "仅使用索引模型,消耗少量 AI 积分:{{price}} 积分/1K tokens",
"import_confirm": "确认上传",
"import_data_preview": "数据预览",
"import_data_process_setting": "数据处理方式设置",
"import_file_parse_setting": "文件解析设置",
"import_model_config": "模型选择",
"import_param_setting": "参数设置",
"import_select_file": "选择文件",
"import_select_link": "输入链接",
"index_size": "索引大小",
"index_size_tips": "向量化时内容的长度,系统会自动按该大小对分块进行进一步的分割。",
"input_required_field_to_select_baseurl": "请先输入必填信息",
"insert_images": "新增图片",
"insert_images_success": "新增图片成功,需等待训练完成才会展示",
"is_open_schedule": "启用定时同步",
"keep_image": "保留图片",
"loading": "加载中...",
"max_chunk_size": "最大分块大小",
"move.hint": "移动后,所选知识库/文件夹将继承新文件夹的权限设置,原先的权限设置失效。",
"noChildren": "无子目录",
"noSelectedFolder": "没有选择文件夹",
"noSelectedId": "没有选择 ID",
"noValidId": "没有有效的 ID",
"open_auto_sync": "开启定时同步后,系统将会每天不定时尝试同步集合,集合同步期间,会出现无法搜索到该集合数据现象。",
"other_dataset": "第三方知识库",
"paragraph_max_deep": "最大段落深度",
"paragraph_split": "按段落分块",
"paragraph_split_tip": "优先按 Markdown 标题段落进行分块,如果分块过长,再按长度进行二次分块",
"params_config": "配置",
"pdf_enhance_parse": "PDF增强解析",
"pdf_enhance_parse_price": "{{price}}积分/页",
"pdf_enhance_parse_tips": "调用 PDF 识别模型进行解析,可以将其转换成 Markdown 并保留文档中的图片,同时也可以对扫描件进行识别,识别时间较长。",
"permission.des.manage": "可管理整个知识库数据和信息",
"permission.des.read": "可查看知识库内容",
"permission.des.write": "可增加和变更知识库内容",
"pleaseFillUserIdAndToken": "请填写 User ID 和 Token",
"preview_chunk": "分块预览",
"preview_chunk_empty": "文件内容为空",
"preview_chunk_intro": "共 {{total}} 个分块,最多展示 10 个",
"preview_chunk_not_selected": "点击左侧文件后进行预览",
"process.Auto_Index": "自动索引生成",
"process.Get QA": "问答对提取",
"process.Image_Index": "图片索引生成",
"process.Is_Ready": "已就绪",
"process.Is_Ready_Count": "{{count}} 组已就绪",
"process.Parse_Image": "图片解析中",
"process.Parsing": "内容解析中",
"process.Vectorizing": "索引向量化",
"process.Waiting": "排队中",
"rebuild_embedding_start_tip": "切换索引模型任务已开始",
"rebuilding_index_count": "重建中索引数量:{{count}}",
"request_headers": "请求头参数,会自动补充 Bearer",
"retain_collection": "调整训练参数",
"retrain_task_submitted": "重新训练任务已提交",
"rootDirectoryFormatError": "根目录数据格式不正确",
"rootdirectory": "/根目录",
"same_api_collection": "存在相同的 API 集合",
"selectDirectory": "选择",
"selectRootFolder": "选择根目录",
"split_chunk_char": "按指定分割符分块",
"split_chunk_size": "按长度分块",
"split_sign_break": "1 个换行符",
"split_sign_break2": "2 个换行符",
"split_sign_custom": "自定义",
"split_sign_exclamatiob": "感叹号",
"split_sign_null": "不设置",
"split_sign_period": "句号",
"split_sign_question": "问号",
"split_sign_semicolon": "分号",
"start_sync_website_tip": "确认开始同步数据?将会删除旧数据后重新获取,请确认!",
"status_error": "运行异常",
"sync_collection_failed": "同步集合错误,请检查是否能正常访问源文件",
"sync_schedule": "定时同步",
"sync_schedule_tip": "仅会同步已存在的集合。包括链接集合以及 API 知识库里所有集合。系统会每天进行轮询更新,无法确定具体的更新时间。",
"tag.Add_new_tag": "新建标签",
"tag.Edit_tag": "编辑标签",
"tag.add": "创建",
"tag.add_new": "新建",
"tag.cancel": "取消选择",
"tag.delete_tag_confirm": "确定删除标签?",
"tag.manage": "标签管理",
"tag.searchOrAddTag": "搜索或添加标签",
"tag.tags": "标签",
"tag.total_tags": "共{{total}}个标签",
"template_dataset": "模版导入",
"template_file_invalid": "模板文件格式不正确,应该是首列为 q,a,indexes 的 csv 文件",
"template_mode": "模板导入",
"the_knowledge_base_has_indexes_that_are_being_trained_or_being_rebuilt": "知识库有训练中或正在重建的索引",
"total_num_files": "共 {{total}} 个文件",
"training.Error": "{{count}} 组异常",
"training.Image mode": "图片处理",
"training.Normal": "正常",
"training_mode": "处理方式",
"training_ready": "{{count}} 组",
"upload_by_template_format": "按模版文件上传",
"uploading_progress": "上传中: {{num}}%",
"vector_model_max_tokens_tip": "每个分块数据,最大长度为 3000 tokens",
"vllm_model": "图片理解模型",
"vlm_model_required_warning": "需要图片理解模型",
"website_dataset": "Web 站点同步",
"website_dataset_desc": "通过爬虫,批量爬取网页数据构建知识库",
"website_info": "网站信息",
"yuque_dataset": "语雀知识库",
"yuque_dataset_config": "配置语雀知识库",
"yuque_dataset_desc": "可通过配置语雀文档权限,使用语雀文档构建知识库,文档不会进行二次存储"
}