跟着官方文档快速入门RAGAS

🕗 发布于 2024-12-05 17:29 llama 人工智能 知识图谱

官网: Ragas

Ragas（Retrieval-Augmented Generation, RAG）是一个基于简单手写提示的评估框架，通过这些提示全自动地衡量答案的准确性、相关性和上下文相关性。这种评估方法不需要访问人工注释的数据集或参考答案，使得评估过程更为简便和高效。

跟着官方的入门案例来了解该框架

安装

    pip install ragas

加载数据

从Hugging face中加载数据集

from datasets import load_dataset
dataset = load_dataset(
    "explodinggradients/amnesty_qa",
    "english_v3",
    trust_remote_code=True
)

# 若无法使用代理也可以直接加载
dataset = {'question': 'Which private companies in the Americas are the largest GHG emitters according to the Carbon Majors database?',
 'ground_truths': ['The largest private companies in the Americas that are the largest GHG emitters according to the Carbon Majors database are ExxonMobil, Chevron, and Peabody.'],
 'answer': 'According to the Carbon Majors database, the largest private companies in the Americas that are the largest GHG emitters are:\n\n1. Chevron Corporation (United States)\n2. ExxonMobil Corporation (United States)\n3. ConocoPhillips Company (United States)\n4. BP plc (United Kingdom, but with significant operations in the Americas)\n5. Royal Dutch Shell plc (Netherlands, but with significant operations in the Americas)\n6. Peabody Energy Corporation (United States)\n7. Duke Energy Corporation (United States)\n8. TotalEnergies SE (France, but with significant operations in the Americas)\n9. BHP Group Limited (Australia, but with significant operations in the Americas)\n10. Rio Tinto Group (United Kingdom/Australia, but with significant operations in the Americas)\n\nPlease note that the rankings may change over time as new data becomes available.',
 'contexts': ['The private companies responsible for the most emissions during this period, according to the database, are from the United States: ExxonMobil, Chevron and Peabody.\nThe largest emitter amongst state-owned companies in the Americas is Mexican company Pemex, followed by Venezuelan company Petróleos de Venezuela, S.A.']}

将数据集加载到 Ragas EvaluationDataset 对象中。

from ragas import EvaluationDataset

eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

# 若无法使用代理也可以直接加载
eval_dataset = Dataset.from_dict(data_samples)

选择对应的指标并运行(以OpenAI为例)

metrics = [
    LLMContextRecall(llm=evaluator_llm), #对应着上下文召回率
    FactualCorrectness(llm=evaluator_llm), #对应着事实正确性
    Faithfulness(llm=evaluator_llm), #对应着忠诚度
    SemanticSimilarity(embeddings=evaluator_embeddings) #对应着语义相似度
]
results = evaluate(dataset=eval_dataset, metrics=metrics)

导出和分析结果

df = results.to_pandas()
print(df.head())

效果展示:

代码总览

from datasets import load_dataset
from ragas import EvaluationDataset
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity
from ragas import evaluate
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper
from langchain_openai import ChatOpenAI
from langchain_openai import OpenAIEmbeddings


dataset = load_dataset(
    "explodinggradients/amnesty_qa",
    "english_v3",
    trust_remote_code=True
)
eval_dataset = EvaluationDataset.from_hf_dataset(dataset["eval"])

evaluator_llm = LangchainLLMWrapper(ChatOpenAI(model="gpt-3.5-turbo", api_key='sk-L9Vf0rGL7kkNTGXCvDG1vHBj5wQj1jAwX1K8wp0qbIO9XCFx', base_url='https://api.openai-proxy.org/v1'))
evaluator_embeddings = LangchainEmbeddingsWrapper(OpenAIEmbeddings())

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm),
    SemanticSimilarity(embeddings=evaluator_embeddings)
]
results = evaluate(dataset=eval_dataset, metrics=metrics)

df = results.to_pandas()

print(df.head())

原文地址：https://blog.csdn.net/manfulshark/article/details/144225857

免责声明：本站文章内容转载自网络资源，如本站内容侵犯了原著者的合法权益，可联系本站删除。更多内容请关注自学内容网（zxcms.com）！

上一篇：Python几种常用数据结构（重制版）
下一篇：C#/.NET/.NET Core技术前沿周刊 | 第 15 期（2024年11.25-11.30）

.NET(C#) 如何配置用户首选项及保存用户设置
.NET(C#) 如何配置用户首选项及保存用户设置
阅读更多2024-12-14
【最新】北大数字普惠金融指数数据集-省市县（2011-2023年）
郭峰,王靖一,王芳,孔涛,张勋,程志云.测度中国数字普惠金融发展:指数编制与空间特征[J].经济学(季刊),2020,19(04):1401-1418.时间跨度：省级和城市级指数时间跨度为2011-2
阅读更多2024-12-14
GESP202412 四级【Recamán】题解（AC）
a11ak−1−kkakak−1−kak−1k小杨想知道 Recamán 数列的前n项从小到大排序后的结果。手动计算非常困难，小杨希望你能帮他解决这个问题。
阅读更多2024-12-14
IDEA遇到EasyConnect中的网络资源无法访问的问题
版权声明：本文为博主原创文章，遵循 CC 4.0 BY-SA 版权协议，转载请附上原文出处链接和本声明。原文链接：https://blog.csdn.net/wanshanyu_/article/de
阅读更多2024-12-14
双目摄像头标定方法
此时已经完成标定，左下角为反投影误差，右边为外参可视化。将双目左右目拍的图像上传（左右目最好不少于20张）此时回到主页面，即可看到成功导出。把这些误差大的删除即可。
阅读更多2024-12-14
Servlet、omcat服务器架构与工作原理
Servlet是运行在服务器端的Java程序，它的主要职责之一是接收并处理来自客户端（如浏览器）的HTTP请求。当客户端发送一个请求到服务器时，Servlet可以解析请求中的信息，例如请求的URL路径
阅读更多2024-12-14
Vue生命周期钩子函数：深入解析与实践
作为高级Vue前端开发人员，对Vue组件的生命周期钩子函数有着深刻的理解是至关重要的。生命周期钩子函数是指在Vue组件的创建、更新、销毁等过程中，Vue自动调用的一系列方法。通过这些钩子函数，我们可以
阅读更多2024-12-14
安卓开发--使用android studio发布APP
app发布
阅读更多2024-12-14
数据结构与算法学习笔记----拓扑排序
@ author: 明月清了个风。
阅读更多2024-12-14
python 将数据保存到现有的Excel文件的新工作表
out_file = ‘query.xlsx’df1 = pd.DataFrame(out_data)若直接写入：df1.to_excel(out_file, index=False, sheet_n
阅读更多2024-12-14

跟着官方文档快速入门RAGAS

安装

加载数据

选择对应的指标并运行(以OpenAI为例)

导出和分析结果

代码总览

相关文章