[Deep Learning] The GLM-4-9B Chat Large Model: Fine-Tuning and Deployment (1)

Published 2024-07-26 10:33 · Deep Learning, Artificial Intelligence

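Download everything first. The commands below expect the GLM-4 repository and the glm-4-9b-chat weights under /ssd/xiedong/glm-4-9b-xd.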

Start the container environment:

docker run -it --gpus all --net host  --shm-size=8g -v /ssd/xiedong/glm-4-9b-xd:/ssd/xiedong/glm-4-9b-xd  kevinchina/deeplearning:pytorch2.3.0-cuda12.1-cudnn8-devel-yolov8train  bash

pip install typer tiktoken numpy==1.25 -i https://pypi.tuna.tsinghua.edu.cn/simple

Install the fine-tuning environment:

cd /ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/

pip install -r requirements.txt   -i https://pypi.tuna.tsinghua.edu.cn/simple

Download the dataset ccfbdci.jsonl into the same directory:
https://huggingface.co/datasets/qgyd2021/chinese_ner_sft/tree/main/data

Convert the dataset to the GLM-4 format:

import json
import random

def convert_jsonl(input_file, train_output_file, test_output_file, split_ratio=0.8):
    system_message = {"role": "system", "content": "你是一个命名实体提取的专家。"}
    all_data = []

    with open(input_file, 'r', encoding='utf-8') as infile:
        for line in infile:
            data = json.loads(line)
            user_content = data["text"]
            entities = data["entities"]

            if entities:
                entity_texts = [entity["entity_text"] for entity in entities]
                assistant_content = ", ".join(entity_texts)
            else:
                assistant_content = "无"

            conversation = {
                "messages": [
                    system_message,
                    {"role": "user", "content": user_content},
                    {"role": "assistant", "content": assistant_content}
                ]
            }

            all_data.append(conversation)

    # Shuffle the data for random splitting
    random.shuffle(all_data)

    # Calculate split index
    split_index = int(len(all_data) * split_ratio)

    # Split the data into training and testing sets
    train_data = all_data[:split_index]
    test_data = all_data[split_index:]

    # Write training data to file
    with open(train_output_file, 'w', encoding='utf-8') as train_outfile:
        for item in train_data:
            json.dump(item, train_outfile, ensure_ascii=False)
            train_outfile.write('\n')

    # Write testing data to file
    with open(test_output_file, 'w', encoding='utf-8') as test_outfile:
        for item in test_data:
            json.dump(item, test_outfile, ensure_ascii=False)
            test_outfile.write('\n')

input_file = 'ccfbdci.jsonl'
train_output_file = 'ccfbdci_train.jsonl'
test_output_file = 'ccfbdci_test.jsonl'
convert_jsonl(input_file, train_output_file, test_output_file)
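
Before training, it may be worth checking that every converted line has the system / user / assistant layout the script above writes. The following is a minimal sketch under that assumption; `check_record` and `count_bad_records` are hypothetical helper names, not part of the GLM-4 repository.

```python
import json

# Role order written by convert_jsonl above.
EXPECTED_ROLES = ["system", "user", "assistant"]

def check_record(line):
    """Return True if one JSONL line has the three expected messages with non-empty text."""
    record = json.loads(line)
    if not isinstance(record, dict):
        return False
    messages = record.get("messages")
    if not isinstance(messages, list):
        return False
    if not all(isinstance(m, dict) for m in messages):
        return False
    roles = [m.get("role") for m in messages]
    has_text = all(isinstance(m.get("content"), str) and m["content"] for m in messages)
    return roles == EXPECTED_ROLES and has_text

def count_bad_records(path):
    """Count non-blank lines in a JSONL file that fail check_record."""
    bad = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip() and not check_record(line):
                bad += 1
    return bad
```

Calling `count_bad_records('ccfbdci_train.jsonl')` after the conversion should return 0 if every source record had non-empty text; a non-zero count points at records to inspect by hand.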

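Configuration files
The fine-tuning configuration files are in the configs directory and include:

ds_zero_2.json / ds_zero_3.json: DeepSpeed configuration files.
lora.yaml / ptuning_v2.yaml / sft.yaml: configuration files for the different fine-tuning modes, covering model parameters, optimizer parameters, training parameters, and so on.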

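Some of the important parameters are explained below.

data_config section

train_file: path to the training dataset file.
val_file: path to the validation dataset file.
test_file: path to the test dataset file.
num_proc: number of processes used when loading the data.
max_input_length: maximum length of the input sequence.
max_output_length: maximum length of the output sequence.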
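
As a rough sketch of how the data_config entries relate to the files produced earlier, the check below looks for each configured file under the data directory. It assumes the file names are resolved relative to the directory passed as the first argument to finetune.py; the values in the dictionary are illustrative, not copied from the repository's YAML, and `missing_data_files` is a hypothetical helper.

```python
from pathlib import Path

# Illustrative values only; the real ones live in configs/*.yaml.
data_config = {
    "train_file": "ccfbdci_train.jsonl",
    "val_file": "ccfbdci_test.jsonl",
    "test_file": "ccfbdci_test.jsonl",
    "num_proc": 1,
    "max_input_length": 512,
    "max_output_length": 128,
}

def missing_data_files(config, data_dir):
    """Return the data_config keys whose file is not found under data_dir."""
    missing = []
    for key in ("train_file", "val_file", "test_file"):
        if not (Path(data_dir) / config[key]).is_file():
            missing.append(key)
    return missing
```

Running `missing_data_files(data_config, "/ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/")` before launching training gives an early warning if a file name in the YAML does not match what the conversion script wrote.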

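training_args section

output_dir: directory where the model and other outputs are saved.
max_steps: maximum number of training steps.
per_device_train_batch_size: training batch size per device (for example, per GPU).
dataloader_num_workers: number of worker processes used for data loading.
remove_unused_columns: whether to remove unused columns from the data.
save_strategy: checkpoint saving strategy (for example, save every N steps).
save_steps: number of steps between checkpoint saves.
log_level: log level (for example, info).
logging_strategy: logging strategy.
logging_steps: number of steps between log entries.
per_device_eval_batch_size: evaluation batch size per device.
evaluation_strategy: evaluation strategy (for example, evaluate every N steps).
eval_steps: number of steps between evaluations.
predict_with_generate: whether to use generation mode for prediction.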
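
The step-based settings above interact in a simple way, which the following sketch works out. It assumes step-based save and evaluation strategies and no gradient accumulation, and it counts only the periodic saves and evaluations at multiples of save_steps and eval_steps; `schedule_summary` and the `num_devices` argument are hypothetical names introduced for the example.

```python
def schedule_summary(max_steps, save_steps, eval_steps,
                     per_device_train_batch_size, num_devices):
    """Summarize what a step-based training schedule implies.

    Assumes gradient accumulation of 1 and step-based save/eval strategies.
    """
    samples_per_step = per_device_train_batch_size * num_devices
    return {
        # Samples consumed by one optimizer step across all devices.
        "samples_per_step": samples_per_step,
        # Total samples seen by the end of training.
        "samples_seen": max_steps * samples_per_step,
        # Periodic checkpoints written at multiples of save_steps.
        "checkpoints": max_steps // save_steps,
        # Periodic evaluations run at multiples of eval_steps.
        "evaluations": max_steps // eval_steps,
    }

summary = schedule_summary(max_steps=3000, save_steps=500, eval_steps=500,
                           per_device_train_batch_size=1, num_devices=2)
print(summary)
```

Comparing `samples_seen` with the number of lines in the training file gives a rough idea of how many passes over the data a given max_steps amounts to.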

generation_config section

max_new_tokens: maximum number of new tokens to generate.

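peft_config section

peft_type: the parameter-efficient fine-tuning type to use (LORA and PREFIX_TUNING are supported).
task_type: the task type, here causal language modeling (do not change).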

LoRA parameters

r: rank of the LoRA matrices.
lora_alpha: LoRA scaling factor.
lora_dropout: dropout probability used in the LoRA layers.
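
To make r and lora_alpha concrete, here is a small pure-Python sketch of the standard LoRA update, where the adapted layer computes W x + (lora_alpha / r) · B (A x). It is an illustration of the general technique, not the code path used by finetune.py; dropout is left out, and all function names are hypothetical.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, r, lora_alpha):
    """Frozen weight W (d_out x d_in) plus the scaled low-rank update B (d_out x r) @ A (r x d_in)."""
    base = matvec(W, x)                 # frozen path
    low_rank = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = lora_alpha / r              # standard LoRA scaling
    return [b + scale * l for b, l in zip(base, low_rank)]

def lora_trainable_params(d_in, d_out, r):
    """Parameters in A (r x d_in) plus B (d_out x r) for one adapted layer."""
    return r * (d_in + d_out)

# Tiny example with r = 1.
W = [[1, 0], [0, 1]]
A = [[1, 1]]
B = [[1], [2]]
print(lora_forward(W, A, B, [1, 2], r=1, lora_alpha=2))
print(lora_trainable_params(4096, 4096, 8))
```

A larger r adds trainable parameters linearly, while lora_alpha only rescales the update, so the two are usually tuned together.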

P-Tuning v2 parameters

num_virtual_tokens: number of virtual tokens.
num_attention_heads: number of attention heads for P-Tuning v2 (do not change).
token_dim: token dimension for P-Tuning v2 (do not change).
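
For a sense of scale, the sketch below estimates the number of trainable prefix parameters. It assumes a prefix encoder without a projection layer that stores one key vector and one value vector of size token_dim per virtual token per layer; `num_layers` is not among the options listed above and, like the numbers used, is an assumption for illustration only.

```python
def prefix_trainable_params(num_virtual_tokens, num_layers, token_dim):
    """Estimate prefix parameters: one key and one value vector per token per layer.

    Assumes no prefix projection layer is used.
    """
    return num_virtual_tokens * num_layers * 2 * token_dim

# Illustrative values, not read from ptuning_v2.yaml.
print(prefix_trainable_params(num_virtual_tokens=128, num_layers=40, token_dim=256))
```

Since the count grows linearly with num_virtual_tokens, that is the main knob here; the other two values are tied to the model architecture, which is why the configuration says not to change them.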

CUDA_VISIBLE_DEVICES=2,3 OMP_NUM_THREADS=1 torchrun --standalone --nnodes=1 --nproc_per_node=2  finetune.py  /ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/ /ssd/xiedong/glm-4-9b-xd/glm-4-9b-chat configs/ptuning_v2.yaml # For Chat Fine-tune

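Training runs, but saving the model fails with an error when using multiple GPUs. Commit the container to a new image, restart from it, and try again: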

docker commit b512e777882f kevinchina/deeplearning:pytorch2.3.0-cuda12.1-cudnn8-devel-glm4train

docker run -it --gpus all --net host  --shm-size=8g -v /ssd/xiedong/glm-4-9b-xd:/ssd/xiedong/glm-4-9b-xd  kevinchina/deeplearning:pytorch2.3.0-cuda12.1-cudnn8-devel-glm4train  bash

cd /ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/

CUDA_VISIBLE_DEVICES=2,3 OMP_NUM_THREADS=1 torchrun --standalone --nnodes=1 --nproc_per_node=2  finetune.py  /ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/ /ssd/xiedong/glm-4-9b-xd/glm-4-9b-chat configs/ptuning_v2.yaml # For Chat Fine-tune

CUDA_VISIBLE_DEVICES=2 python finetune.py  /ssd/xiedong/glm-4-9b-xd/GLM-4/finetune_demo/ /ssd/xiedong/glm-4-9b-xd/glm-4-9b-chat configs/ptuning_v2.yaml # For Chat Fine-tune

It still fails, so I am switching to another project's training method:

https://github.com/hiyouga/LLaMA-Factory/blob/main/README_zh.md

Original article: https://blog.csdn.net/x1131230123/article/details/140611195
