FastGPT/packages/web/i18n/zh-Hant/dataset.json
Archer c30f069f2f V4.9.11 feature (#4969)
* Feat: Images dataset collection (#4941)

* New pic (#4858)

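* Update dataset-related types to add image file ID and preview URL support; improve dataset import with a new image dataset handling component; fix some i18n text; update the file upload logic to support the new features.

* Differences from the original code

* Add V4.9.10 release notes: support setting the PG `systemEnv.hnswMaxScanTuples` parameter, optimize the LLM stream call timeout, and fix the sorting issue in full-text search across multiple knowledge bases. Also update dataset indexes, removing the datasetId field to simplify queries.

* Switch to the fileId_image logic and add training queue matching logic

* Add image collection detection logic and streamline preview URL generation so that a preview URL is generated only when the dataset is an image collection; add related log output for debugging.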

* Refactor Docker Compose configuration to comment out exposed ports for production environments, update image versions for pgvector, fastgpt, and mcp_server, and enhance Redis service with a health check. Additionally, standardize dataset collection labels in constants and improve internationalization strings across multiple languages.

* Enhance TrainingStates component by adding internationalization support for the imageParse training mode and update defaultCounts to include imageParse mode in trainingDetail API.

* Enhance dataset import context by adding additional steps for image dataset import process and improve internationalization strings for modal buttons in the useEditTitle hook.

* Update DatasetImportContext to conditionally render MyStep component based on data source type, improving the import process for non-image datasets.

* Refactor image dataset handling by improving internationalization strings, enhancing error messages, and streamlining the preview URL generation process.

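* Upload images to the newly created dataset_collection_images table and adjust the logic accordingly

* Fix issues in everything except the controller

* Consolidate the image dataset logic into the controller

* Add missing i18n

* Add missing i18n

* Resolve review comments: mainly upload logic changes and component reuse

* Show an icon for image names

* Fix naming issues that caused build errors

* Remove the unneeded collectionid part

* Clean up redundant files and change a delete button

* Resolve everything except loading and the unified imageId

* Fix icon errors

* Reuse MyPhotoView and rename imageFileId to imageId via replace-all

* Revert unnecessary file changes

* Fix errors and field changes

* Add logic to delete temporary files after a successful upload, and revert some changes

* Remove the path field, store images in GridFS, and update the add/delete code accordingly

* Fix build errors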

---------

Co-authored-by: archer <545436317@qq.com>

* perf: image dataset

* feat: insert image

* perf: image icon

* fix: training state

---------

Co-authored-by: Zhuangzai fa <143257420+ctrlz526@users.noreply.github.com>

* fix: ts (#4948)

* Thirddatasetmd (#4942)

* add thirddataset.md

* fix thirddataset.md

* fix

* delete wrong png

---------

Co-authored-by: dreamer6680 <146868355@qq.com>

* perf: api dataset code

* perf: log

* add secondary.tsx (#4946)

* add secondary.tsx

* fix

---------

Co-authored-by: dreamer6680 <146868355@qq.com>

* perf: multiple menu

* perf: i18n

* feat: parse queue (#4960)

* feat: parse queue

* feat: sync parse queue

* fix thirddataset.md (#4962)

* fix thirddataset-4.png (#4963)

* feat: Dataset template import (#4934)

* Template import is done except for the docs, which are not written yet

* Fix build errors in template import

* Document production

* compress pictures

* Change some constants to variables

---------

Co-authored-by: Archer <545436317@qq.com>

* perf: template import

* doc

* llm paragraph

* bocha tool

* fix: del collection

---------

Co-authored-by: Zhuangzai fa <143257420+ctrlz526@users.noreply.github.com>
Co-authored-by: dreamer6680 <1468683855@qq.com>
Co-authored-by: dreamer6680 <146868355@qq.com>
2025-06-06 14:48:44 +08:00

208 lines
12 KiB
JSON

{
"Enable": "啟用",
"add_file": "新增文件",
"api_file": "API 檔案庫",
"api_url": "介面位址",
"apidataset_configuration": "配置資訊",
"auto_indexes": "自動生成補充索引",
"auto_indexes_tips": "透過大模型進行額外索引生成,提高語義豐富度,提高檢索的精度。",
"auto_training_queue": "增強索引排隊",
"backup_collection": "備份數據",
"backup_dataset": "備份導入",
"backup_dataset_success": "備份創建成功",
"backup_dataset_tip": "可以將導出知識庫時,下載的 csv 文件重新導入。",
"backup_mode": "備份導入",
"backup_template_invalid": "備份文件格式不正確,應該是首列為 q,a,indexes 的 csv 文件",
"chunk_max_tokens": "分塊上限",
"chunk_process_params": "分塊處理參數",
"chunk_size": "分塊大小",
"chunk_trigger": "分塊條件",
"chunk_trigger_force_chunk": "強制分塊",
"chunk_trigger_max_size": "原文長度大於文件處理模型最大上下文70%",
"chunk_trigger_min_size": "原文長度大於",
"close_auto_sync": "確認關閉自動同步功能?",
"collection.Create update time": "建立/更新時間",
"collection.Training type": "分段模式",
"collection.training_type": "處理模式",
"collection_data_count": "資料量",
"collection_metadata_custom_pdf_parse": "PDF 增強解析",
"collection_name": "數據集名稱",
"collection_not_support_retraining": "此集合類型不支援重新調整參數",
"collection_not_support_sync": "該集合不支援同步",
"collection_sync": "立即同步",
"collection_sync_confirm_tip": "確認開始同步資料?\n系統將會拉取最新資料進行比較,如果內容不相同,則會建立一個新的集合並刪除舊的集合,請確認",
"collection_tags": "集合標籤",
"common.dataset.data.Input Error Tip": "[圖片數據集] 處理過程錯誤:",
"common.error.unKnow": "未知錯誤",
"common_dataset": "通用資料集",
"common_dataset_desc": "通過導入文件、網頁鏈接或手動錄入形式構建知識庫",
"condition": "條件",
"config_sync_schedule": "設定定時同步",
"confirm_import_images": "共 {{num}} 張圖片 | 確認創建",
"confirm_to_rebuild_embedding_tip": "確定要為資料集切換索引嗎?\n切換索引是一個重要的操作,需要對您資料集內所有資料重新建立索引,可能需要較長時間,請確保帳號內剩餘點數充足。\n\n此外,您還需要注意修改使用此資料集的應用程式,避免與其他索引模型資料集混用。",
"core.dataset.Image collection": "圖片數據集",
"core.dataset.import.Adjust parameters": "調整參數",
"custom_data_process_params": "自訂",
"custom_data_process_params_desc": "自訂資料處理規則",
"custom_split_sign_tip": "允許你根據自定義的分隔符進行分塊。\n通常用於已處理好的資料,使用特定的分隔符來精確分塊。\n可以使用 | 符號表示多個分割符,例如:“。|.”表示中英文句號。\n\n盡量避免使用正則相關特殊符號,例如:* () [] {} 等。",
"data_amount": "{{dataAmount}} 組資料,{{indexAmount}} 組索引",
"data_error_amount": "{{errorAmount}} 組訓練異常",
"data_index_image": "圖片索引",
"data_index_num": "索引 {{index}}",
"data_parsing": "數據解析中",
"data_process_params": "處理參數",
"data_process_setting": "資料處理設定",
"data_uploading": "數據上傳中: {{num}}%",
"dataset.Chunk_Number": "分塊號",
"dataset.Completed": "完成",
"dataset.Delete_Chunk": "刪除",
"dataset.Edit_Chunk": "編輯",
"dataset.Error_Message": "報錯資訊",
"dataset.No_Error": "暫無異常資訊",
"dataset.Operation": "操作",
"dataset.ReTrain": "重試",
"dataset.Training Process": "訓練狀態",
"dataset.Training_Count": "{{count}} 組訓練中",
"dataset.Training_Errors": "異常",
"dataset.Training_QA": "{{count}} 組問答對訓練中",
"dataset.Training_Status": "訓練狀態",
"dataset.Training_Waiting": "需等待 {{count}} 組資料",
"dataset.Unsupported operation": "操作不支援",
"dataset.no_collections": "尚無資料集",
"dataset.no_tags": "尚無標籤",
"default_params": "預設",
"default_params_desc": "使用系統預設的參數和規則",
"download_csv_template": "點擊下載 CSV 模板",
"edit_dataset_config": "編輯知識庫設定",
"empty_collection": "空白數據集",
"enhanced_indexes": "索引增強",
"error.collectionNotFound": "找不到集合",
"external_file": "外部檔案庫",
"external_file_dataset_desc": "可以通過 API,使用外部文件庫構建知識庫",
"external_id": "檔案讀取識別碼",
"external_other_dataset_desc": "自定義API、飛書、語雀等外部文檔作為知識庫",
"external_read_url": "外部預覽網址",
"external_read_url_tip": "可以設定您檔案庫的讀取網址,方便對使用者進行讀取權限驗證。目前可使用 {{fileId}} 變數來代表外部檔案識別碼。",
"external_url": "檔案存取網址",
"failedToLoadRootDirectories": "加載根目錄失敗",
"failedToLoadSubDirectories": "加載子目錄失敗",
"feishu_dataset": "飛書知識庫",
"feishu_dataset_config": "設定飛書知識庫",
"feishu_dataset_desc": "可透過設定飛書文件權限,使用飛書文件建構知識庫,文件不會進行二次儲存",
"file_list": "文件列表",
"file_model_function_tip": "用於增強索引和問答生成",
"filename": "檔案名稱",
"folder_dataset": "資料夾",
"getDirectoryFailed": "獲取目錄失敗",
"image_auto_parse": "圖片自動索引",
"image_auto_parse_tips": "呼叫 VLM 自動標註文件裡的圖片,並生成額外的檢索索引",
"image_training_queue": "圖片處理排隊",
"images_creating": "正在創建",
"immediate_sync": "立即同步",
"import.Auto mode Estimated Price Tips": "需呼叫文字理解模型,將消耗較多 AI 點數:{{price}} 點數 / 1K tokens",
"import.Embedding Estimated Price Tips": "僅使用索引模型,消耗少量 AI 點數:{{price}} 點數 / 1K tokens",
"import_confirm": "確認上傳",
"import_data_preview": "資料預覽",
"import_data_process_setting": "資料處理方式設定",
"import_file_parse_setting": "文件解析設定",
"import_model_config": "模型選擇",
"import_param_setting": "參數設定",
"import_select_file": "選擇文件",
"import_select_link": "輸入連結",
"index_size": "索引大小",
"index_size_tips": "向量化時內容的長度,系統會自動按該大小對分塊進行進一步的分割。",
"input_required_field_to_select_baseurl": "請先輸入必填信息",
"insert_images": "新增圖片",
"insert_images_success": "新增圖片成功,需等待訓練完成才會展示",
"is_open_schedule": "啟用定時同步",
"keep_image": "保留圖片",
"loading": "加載中...",
"max_chunk_size": "最大分塊大小",
"move.hint": "移動後,所選資料集/資料夾將繼承新資料夾的權限設定,原先的權限設定將失效。",
"noChildren": "無子目錄",
"noSelectedFolder": "沒有選擇文件夾",
"noSelectedId": "沒有選擇 ID",
"noValidId": "沒有有效的 ID",
"open_auto_sync": "開啟定時同步後,系統將每天不定時嘗試同步集合,集合同步期間,會出現無法搜尋到該集合資料現象。",
"other_dataset": "第三方知識庫",
"paragraph_max_deep": "最大段落深度",
"paragraph_split": "按段落分塊",
"paragraph_split_tip": "優先按 Markdown 標題段落進行分塊,如果分塊過長,再按長度進行二次分塊",
"params_config": "設定",
"pdf_enhance_parse": "PDF 增強解析",
"pdf_enhance_parse_price": "{{price}}積分/頁",
"pdf_enhance_parse_tips": "呼叫 PDF 識別模型進行解析,可以將其轉換成 Markdown 並保留文件中的圖片,同時也可以對掃描件進行識別,識別時間較長。",
"permission.des.manage": "可管理整個資料集的資料和資訊",
"permission.des.read": "可檢視資料集內容",
"permission.des.write": "可新增和變更資料集內容",
"pleaseFillUserIdAndToken": "請填寫 User ID 和 Token",
"preview_chunk": "分塊預覽",
"preview_chunk_empty": "文件內容為空",
"preview_chunk_intro": "共 {{total}} 個分塊,最多展示 10 個",
"preview_chunk_not_selected": "點選左側文件後進行預覽",
"process.Auto_Index": "自動索引生成",
"process.Get QA": "問答對提取",
"process.Image_Index": "圖片索引生成",
"process.Is_Ready": "已就緒",
"process.Is_Ready_Count": "{{count}} 組已就緒",
"process.Parse_Image": "圖片解析中",
"process.Parsing": "內容解析中",
"process.Vectorizing": "索引向量化",
"process.Waiting": "排隊中",
"rebuild_embedding_start_tip": "切換索引模型任務已開始",
"rebuilding_index_count": "重建中索引數量:{{count}}",
"request_headers": "請求頭",
"retain_collection": "調整訓練參數",
"retrain_task_submitted": "重新訓練任務已提交",
"rootDirectoryFormatError": "根目錄資料格式不正確",
"rootdirectory": "/根目錄",
"same_api_collection": "存在相同的 API 集合",
"selectDirectory": "選擇",
"selectRootFolder": "選擇根目錄",
"split_chunk_char": "按指定分割符分塊",
"split_chunk_size": "按長度分塊",
"split_sign_break": "1 個換行符",
"split_sign_break2": "2 個換行符",
"split_sign_custom": "自定義",
"split_sign_exclamatiob": "驚嘆號",
"split_sign_null": "不設定",
"split_sign_period": "句號",
"split_sign_question": "問號",
"split_sign_semicolon": "分號",
"start_sync_website_tip": "確認開始同步資料?\n將會刪除舊資料後重新取得,請確認",
"status_error": "執行異常",
"sync_collection_failed": "同步集合錯誤,請檢查是否能正常存取來源檔案",
"sync_schedule": "定時同步",
"sync_schedule_tip": "只會同步已存在的集合。\n包括連結集合以及 API 知識庫裡所有集合。\n系統會每天進行輪詢更新,無法確定特定的更新時間。",
"tag.Add_new_tag": "新增標籤",
"tag.Edit_tag": "編輯標籤",
"tag.add": "建立",
"tag.add_new": "新增",
"tag.cancel": "取消",
"tag.delete_tag_confirm": "確定要刪除標籤嗎?",
"tag.manage": "標籤管理",
"tag.searchOrAddTag": "搜尋或新增標籤",
"tag.tags": "標籤",
"tag.total_tags": "共 {{total}} 個標籤",
"template_dataset": "模版導入",
"template_file_invalid": "模板文件格式不正確,應該是首列為 q,a,indexes 的 csv 文件",
"template_mode": "模板導入",
"the_knowledge_base_has_indexes_that_are_being_trained_or_being_rebuilt": "資料集有索引正在訓練或重建中",
"total_num_files": "共 {{total}} 個文件",
"training.Error": "{{count}} 組異常",
"training.Image mode": "圖片處理",
"training.Normal": "正常",
"training_mode": "分段模式",
"training_ready": "{{count}} 組",
"upload_by_template_format": "按模版文件上傳",
"vector_model_max_tokens_tip": "每個分塊資料,最大長度為 3000 tokens",
"vllm_model": "圖片理解模型",
"vlm_model_required_warning": "需要圖片理解模型",
"website_dataset": "網站同步",
"website_dataset_desc": "通過爬蟲,批量爬取網頁數據構建知識庫",
"website_info": "網站資訊",
"yuque_dataset": "語雀知識庫",
"yuque_dataset_config": "設定語雀知識庫",
"yuque_dataset_desc": "可透過設定語雀文件權限,使用語雀文件建構知識庫,文件不會進行二次儲存"
}
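
Many values in this file carry `{{name}}` placeholders (for example `confirm_import_images` and `data_amount`) that are filled in at render time. The sketch below only illustrates that placeholder format. The `interpolate` helper and the `messages` object are hypothetical names introduced for this example; this is not the i18n library's own implementation, and it assumes placeholders are plain word-character names.

```typescript
// Hypothetical helper (an assumption for illustration): replaces {{name}}
// placeholders with values from `vars`. Unknown placeholders are left as-is.
type Vars = Record<string, string | number>;

function interpolate(template: string, vars: Vars): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(vars, key) ? String(vars[key]) : match
  );
}

// Two entries copied verbatim from the JSON file.
const messages = {
  confirm_import_images: "共 {{num}} 張圖片 | 確認創建",
  data_amount: "{{dataAmount}} 組資料,{{indexAmount}} 組索引"
};

const confirmText = interpolate(messages.confirm_import_images, { num: 12 });
const amountText = interpolate(messages.data_amount, { dataAmount: 3, indexAmount: 9 });

console.log(confirmText);
console.log(amountText);
```

Leaving unknown placeholders untouched is a deliberate choice in this sketch: a missing variable then shows up visibly in the UI string instead of silently becoming an empty value.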