[Paper Notes] Pai-megatron Qwen1.5-14B-CT Continued Pre-training: Pitfalls Encountered
1. Model weight conversion fails in hf2mcore_1.5_v2.py
The error is raised in:
/mnt/cpfs/kexin/dlc_code/qwen1.5/PAI-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v2.py
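The corrected file is given below. Line 477 was changed to drop the args.hidden_size dimension from the view, since a bias is 1-D and has no hidden_size axis. With this change the conversion also works when tp > 1:
elif 'linear_qkv.bias' in k and 'norm' not in k:
    # original (raw) line:
    # viewed = v.view(args.num_query_groups, -1, head_dim, args.hidden_size)
    # changed line:
    viewed = v.view(args.num_query_groups, -1, head_dim)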
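To see why the extra dimension breaks the bias branch, the shape arithmetic can be sketched in plain Python. This is only an illustration: infer_view_shape is a hypothetical helper that mimics how a view resolves a single -1, and the sizes are toy values chosen for readability, not the real Qwen1.5-14B configuration.

```python
from math import prod


def infer_view_shape(numel, shape):
    """Resolve the single -1 in `shape` for a tensor of `numel` elements.

    Hypothetical helper for illustration; raises ValueError when the
    requested shape cannot hold exactly `numel` elements.
    """
    if shape.count(-1) != 1:
        raise ValueError("this sketch expects exactly one -1")
    known = prod(d for d in shape if d != -1)
    if known == 0 or numel % known != 0:
        raise ValueError(f"shape {shape} is invalid for input of size {numel}")
    return tuple(numel // known if d == -1 else d for d in shape)


# Toy sizes (assumptions for illustration only).
hidden_size, num_heads, num_query_groups, head_dim = 16, 4, 4, 4
qkv_rows = (num_heads + 2 * num_query_groups) * head_dim  # fused Q, K, V rows

weight_numel = qkv_rows * hidden_size  # 2-D weight: [qkv_rows, hidden_size]
bias_numel = qkv_rows                  # 1-D bias:   [qkv_rows]

# The 4-D view fits the weight, and the 3-D view fits the bias.
weight_shape = infer_view_shape(
    weight_numel, (num_query_groups, -1, head_dim, hidden_size)
)
bias_shape = infer_view_shape(bias_numel, (num_query_groups, -1, head_dim))

# The original 4-D view applied to the bias cannot be resolved.
try:
    infer_view_shape(bias_numel, (num_query_groups, -1, head_dim, hidden_size))
    old_bias_view_ok = True
except ValueError:
    old_bias_view_ok = False
```

The bias carries one value per fused QKV output row, so only the group, per-group, and head_dim axes exist to be split. The trailing hidden_size axis belongs to the weight matrix alone.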
Replace the file with:
import os
import re
import json
import torch
import transformers
import torch.nn as nn
from functools import partial
from collections import defaultdict
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
)
from transformers.models.mixtral
Original article: https://blog.csdn.net/Trance95/article/details/137689388