Llama Fine-Tuning Test Notes

🕗 Published 2024-11-14 16:02 · llama · large-model fine-tuning

Using a Llama model (Atom-7B-Chat)

Reference GitHub repository: https://github.com/LlamaFamily/Llama-Chinese
Create a Python 3.11 environment with conda.
Run pip install -r requirements.txt.
Download the Atom-7B-Chat model from Hugging Face. A convenient mirror is https://hf-mirror.com/FlagAlpha/Atom-7B-Chat
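
The download step can also be scripted. The following is a minimal sketch, assuming the `huggingface_hub` package is installed and that the mirror is selected through the `HF_ENDPOINT` environment variable. The helper names `configure_mirror` and `download_model` are hypothetical, and the `Atom-7B-Chat` target directory is chosen only so that it matches the local path used for loading the model.

```python
import os

MIRROR = "https://hf-mirror.com"      # mirror recommended in the text
REPO_ID = "FlagAlpha/Atom-7B-Chat"    # repository named in the mirror URL


def configure_mirror(endpoint: str = MIRROR) -> None:
    """Point huggingface_hub at the mirror (hypothetical helper)."""
    # The variable is read when huggingface_hub is first imported,
    # so it has to be set before that import happens.
    os.environ["HF_ENDPOINT"] = endpoint


def download_model(local_dir: str = "Atom-7B-Chat") -> str:
    """Download the full model repository into local_dir and return its path."""
    configure_mirror()
    # Imported here, after the environment variable is set.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=REPO_ID, local_dir=local_dir)
```

Calling `download_model()` once fetches the weights, which total several gigabytes for a 7B model, so it is best run before starting the inference script.
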
Running inference with Atom-7B-Chat

Create a file named quick_start.py and copy the following into it (modified slightly from the official version).

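import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
# "cuda:4" pins the model to GPU index 4; change the index to match your machine
device_map = "cuda:4" if torch.cuda.is_available() else "auto"
# Load the local Atom-7B-Chat directory with 8-bit weights and FlashAttention 2
model = AutoModelForCausalLM.from_pretrained(
    'Atom-7B-Chat', device_map=device_map, torch_dtype=torch.float16,
    load_in_8bit=True, trust_remote_code=True, use_flash_attention_2=True,
)
model = model.eval()
tokenizer = AutoTokenizer.from_pretrained('Atom-7B-Chat', use_fast=False)
# Reuse the EOS token as the padding token
tokenizer.pad_token = tokenizer.eos_token
# Single-turn prompt in the Human/Assistant format; the special tokens are written out by hand
input_ids = tokenizer(['<s>Human: 介绍一下中国\n</s><s>Assistant: '], return_tensors="pt", add_special_tokens=False).input_ids
# Move the inputs to the model's device; a bare 'cuda' means cuda:0, which would not match cuda:4
input_ids = input_ids.to(model.device)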
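
The listing stops after the inputs are moved to the GPU, so the generation step is not shown. The following is a hedged sketch of how the prompt could be built and a reply generated, not the author's code. `build_prompt` and `generate_reply` are hypothetical helpers, the multi-turn layout is an assumption extrapolated from the single-turn prompt above, and the sampling values are illustrative defaults rather than settings taken from the article.

```python
def build_prompt(user_message, history=()):
    """Build a prompt in the Human/Assistant format used above.

    history is a sequence of (human, assistant) pairs from earlier turns;
    how earlier turns are laid out is an assumption, not confirmed by the article.
    """
    parts = []
    for human, assistant in history:
        parts.append(f"<s>Human: {human}\n</s><s>Assistant: {assistant}\n</s>")
    # The final turn is left open so the model continues as the assistant.
    parts.append(f"<s>Human: {user_message}\n</s><s>Assistant: ")
    return "".join(parts)


def generate_reply(model, tokenizer, prompt, max_new_tokens=512):
    """Generate a reply with a loaded transformers model and tokenizer."""
    input_ids = tokenizer(
        [prompt], return_tensors="pt", add_special_tokens=False
    ).input_ids.to(model.device)
    output_ids = model.generate(
        input_ids=input_ids,
        max_new_tokens=max_new_tokens,
        do_sample=True,          # illustrative sampling settings (assumed)
        top_k=50,
        top_p=0.95,
        temperature=0.3,
        repetition_penalty=1.3,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    # Drop the prompt tokens and decode only the newly generated part.
    new_tokens = output_ids[0][input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

With the `model` and `tokenizer` objects from quick_start.py, `print(generate_reply(model, tokenizer, build_prompt("介绍一下中国")))` would print only the assistant's answer, since the prompt tokens are sliced off before decoding.
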

Original article: https://blog.csdn.net/qq_45734745/article/details/143659811
