
[Paper Notes] Pai-Megatron Qwen1.5-14B-CT continued pretraining: pitfalls encountered

1. Model weight conversion error in hf2mcore_1.5_v2.py

The error comes from this script:

/mnt/cpfs/kexin/dlc_code/qwen1.5/PAI-Megatron-Patch/toolkits/model_checkpoints_convertor/qwen/hf2mcore_1.5_v2.py

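The fix is to replace the file with the corrected version below. Line 477 was changed: the args.hidden_size dimension was removed from the view call, so the conversion also works when tp > 1: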

elif 'linear_qkv.bias' in k and 'norm' not in k:
  # raw
  viewed = v.view(args.num_query_groups, -1, head_dim, args.hidden_size)
  # changed
  viewed = v.view(args.num_query_groups, -1, head_dim)
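
One plausible reading of why this change is needed: a fused QKV bias is a 1-D tensor, so its element count cannot be split into a shape that also carries a hidden_size axis. The sketch below checks that arithmetic in plain Python without loading any checkpoint. The helper infer_view_shape is hypothetical (it only imitates how a view resolves a single -1 dimension), and the config values are illustrative assumptions, not values read from the original script.

```python
from math import prod


def infer_view_shape(numel, shape):
    """Hypothetical helper: resolve a single -1 in `shape` for a tensor
    holding `numel` elements, raising ValueError when the shape cannot fit."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension can be inferred")
    known = prod(d for d in shape if d != -1)
    if -1 in shape:
        if known == 0 or numel % known != 0:
            raise ValueError(f"shape {shape} is invalid for input of size {numel}")
        return tuple(numel // known if d == -1 else d for d in shape)
    if known != numel:
        raise ValueError(f"shape {shape} is invalid for input of size {numel}")
    return tuple(shape)


# Illustrative config values (assumptions for this sketch).
hidden_size = 5120
num_attention_heads = 40
num_query_groups = 40
head_dim = hidden_size // num_attention_heads

# A fused QKV bias holds, per query group, the query heads plus one key
# head and one value head, each of length head_dim.
heads_per_group = num_attention_heads // num_query_groups
bias_numel = num_query_groups * (heads_per_group + 2) * head_dim

# The changed view (no hidden_size axis) resolves cleanly.
new_shape = infer_view_shape(bias_numel, (num_query_groups, -1, head_dim))
print("changed view resolves to:", new_shape)

# The raw view (with a trailing hidden_size axis) cannot fit a 1-D bias.
try:
    infer_view_shape(bias_numel, (num_query_groups, -1, head_dim, hidden_size))
except ValueError as err:
    print("raw view fails:", err)
```

The trailing hidden_size axis only makes sense for the 2-D linear_qkv weight, whose rows each span hidden_size input features; the bias has one value per output row and nothing more.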

The full replacement file:

import os
import re
import json
import torch
import transformers
import torch.nn as nn
from functools import partial
from collections import defaultdict
from transformers import (
    AutoConfig,
    AutoModelForCausalLM,
    AutoTokenizer,
)
from transformers.models.mixtral

Original post: https://blog.csdn.net/Trance95/article/details/137689388
