Prometheus Series: Installing and Configuring Prometheus
Basic structure of the configuration file
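- global: global settings
  - scrape_interval: how often targets are scraped; defaults to 1m
  - evaluation_interval: how often alerting and recording rules are evaluated; defaults to 1m
  - scrape_timeout: how long a scrape may run before it times out; defaults to 10s and must not exceed scrape_interval. If scrapes fail with "context deadline exceeded", raise this value in the affected job
  - external_labels: labels attached to time series and alerts when Prometheus talks to external systems (federation, remote storage, Alertmanager)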
# Example
global:
  scrape_interval: 15s     # scrape every 15 seconds
  evaluation_interval: 15s # evaluate rules every 15 seconds
  scrape_timeout: 15s      # time out scrapes after 15 seconds
- alerting: optional; alerting settings, which can point at several Alertmanager instances
  - alert_relabel_configs: relabels alerts before they are sent
  - alertmanagers: the Alertmanager instances that Prometheus sends alerts to
# Example
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093 # assumes Alertmanager runs locally on port 9093
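The alert_relabel_configs field described above has no example of its own, so here is a minimal sketch. It assumes a high-availability pair of Prometheus servers that differ only in a `replica` external label; that label name is hypothetical and not part of the original setup.

```yaml
alerting:
  alert_relabel_configs:
    # Drop the hypothetical "replica" label before alerts are sent,
    # so identical alerts from both servers deduplicate in Alertmanager.
    - action: labeldrop
      regex: replica
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
```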
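- scrape_configs: defines how Prometheus scrapes targets; each scrape_config block describes one group of targets and its settings
  - metrics_path: HTTP path that metrics are scraped from; defaults to /metrics
  - scheme: protocol used for scraping; defaults to http
  - job_name: name that identifies this group of targets
  - static_configs: statically configured list of targets
    - targets: list of target addresses that Prometheus scrapes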
# Example
scrape_configs:
  - job_name: 'node_exporter'
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets: ['localhost:9100'] # assumes node_exporter runs locally on port 9100
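Scrape settings can also be overridden per job, which is the usual remedy for the "context deadline exceeded" error mentioned under scrape_timeout. The sketch below is illustrative only: the job name, target address, and label are hypothetical.

```yaml
scrape_configs:
  - job_name: 'slow_exporter'        # hypothetical job name
    scrape_interval: 60s             # overrides global.scrape_interval for this job
    scrape_timeout: 30s              # must not exceed this job's scrape_interval
    metrics_path: /metrics
    static_configs:
      - targets: ['localhost:9200']  # hypothetical target
        labels:
          env: dev                   # hypothetical label added to every series from these targets
```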
- remote_write: optional; settings for writing samples to remote storage
- remote_read: optional; settings for reading samples from remote storage
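A minimal sketch of the two remote storage blocks, assuming a storage adapter that exposes write and read endpoints. The host name and port are placeholders, not values from the original article.

```yaml
remote_write:
  - url: "http://remote-storage.example:9201/write"  # hypothetical endpoint
remote_read:
  - url: "http://remote-storage.example:9201/read"   # hypothetical endpoint
    read_recent: false  # answer queries from local storage where it already has the data
```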
- rule_files: list of rule files that Prometheus loads; these files define the rules that trigger alerts
# Example
rule_files:
  - "alert_rules.yml" # assumes the alerting rules are defined in alert_rules.yml
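For context, a rule file such as alert_rules.yml could look like the sketch below. The group name, alert name, threshold, and labels are illustrative assumptions; only the overall `groups` / `rules` layout is the standard rule file format.

```yaml
groups:
  - name: example-alerts            # hypothetical group name
    rules:
      - alert: InstanceDown         # hypothetical alert name
        expr: up == 0               # fires when a target fails its scrape
        for: 5m                     # condition must hold for 5 minutes first
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```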
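Prometheus.service
Basic options
- --config.file="prometheus.yml": path to the Prometheus configuration file, which defines how targets are discovered, how metrics are scraped, and so on
Alerting options
- --alertmanager.notification-queue-capacity=10000: capacity of the queue for pending Alertmanager notifications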
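Query options
- --query.lookback-delta=5m: maximum lookback duration for retrieving metrics during expression evaluation and federation
- --query.timeout=2m: maximum time a query may run before it is aborted
- --query.max-concurrency=20: maximum number of queries executed concurrently
- --query.max-samples=50000000: maximum number of samples a single query can load into memory; this also caps the number of samples a query can return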
Logging options
- --log.level=info: only log messages at or above the given severity
- --log.format=logfmt: output format of log messages
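Storage options
- --storage.tsdb.path="data/": base path for metrics storage; server mode only
- --storage.tsdb.retention.time: how long samples are retained in storage
- --storage.tsdb.retention.size: maximum number of bytes of stored data to retain
- --[no-]storage.tsdb.no-lockfile: do not create a lock file in the data directory
- --storage.tsdb.head-chunks-write-queue-size=0: size of the queue through which head chunks are written to disk
- --storage.agent.path="data-agent/": base path for metrics storage; agent mode only
- --storage.agent.wal-compression: compress the agent WAL (write-ahead log)
- --storage.agent.retention.min-time: minimum age a sample must reach before it is considered for deletion when the WAL is truncated
- --storage.agent.retention.max-time: maximum age a sample may reach before it is forcibly deleted when the WAL is truncated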
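Web options
- --web.listen-address="0.0.0.0:9090": address and port to listen on for the UI, API, and telemetry; by default Prometheus listens on port 9090 on all interfaces
- --web.config.file="": path to a configuration file that enables TLS or authentication
- --web.read-timeout=5m: maximum duration before a request read times out and idle connections are closed
- --web.max-connections=512: maximum number of simultaneous connections
- --web.external-url=<URL>: URL under which Prometheus is externally reachable, typically used behind a reverse proxy; it is used to generate relative and absolute links back to Prometheus itself
- --web.route-prefix=<path>: prefix for the internal routes of web endpoints; defaults to the path component of --web.external-url
- --web.user-assets=<path>: path to a static asset directory, served at /user
- --[no-]web.enable-lifecycle: enable shutdown and reload via HTTP request
- --[no-]web.enable-admin-api: enable API endpoints for admin control actions
- --[no-]web.enable-remote-write-receiver: enable the API endpoint that accepts remote write requests
- --web.console.templates="consoles": path to the console template directory, served at /consoles
- --web.console.libraries="console_libraries": path to the console library directory
- --web.page-title="Prometheus Time Series Collection and Processing Server": document title of the Prometheus instance
- --web.cors.origin=".*": regular expression for the CORS origin, used for cross-origin resource sharing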
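The file passed to --web.config.file is itself YAML. A minimal sketch that turns on TLS and basic authentication might look like this; the file name, certificate paths, and user name are hypothetical, and the password value is a placeholder where a bcrypt hash would go.

```yaml
# web-config.yml (hypothetical file name)
tls_server_config:
  cert_file: /etc/prometheus/server.crt   # hypothetical path
  key_file: /etc/prometheus/server.key    # hypothetical path
basic_auth_users:
  # user name -> bcrypt hash of the password (placeholder, not a real hash)
  admin: "<bcrypt-hash>"
```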
Feature flags
- --enable-feature= ...: enables specific feature flags, which turn on experimental or advanced features
Setting up the monitoring environment (Ansible version)
Download page: https://prometheus.io/download/
Directory layout
> hostk
> monitor.yml
> roles
> - monitor
>   - vars
>     - main.yml
>   - tasks
>     - main.yml
>     - install_prometheus.yml
>     - install_grafana.yml
>   - templates
>     - prometheus.service.j2
>     - prometheus.yml.j2
Download, installation, and configuration
# hostk
[monitorServer]
monitor-server ansible_host=172.16.13.212 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_pass='123456'
[monitorServer:vars]
gra_version=7.5.4-1
pro_version=2.53.3
# monitor.yml
---
- name: monitor
  hosts: monitorServer
  become: yes
  vars_files:
    - roles/monitor/vars/main.yml
  roles:
    - monitor
# vars/main.yml
GRAFANA_VERSION: "{{ hostvars['monitor-server']['gra_version'] }}"
PROMETHEUS_VERSION: "{{ hostvars['monitor-server']['pro_version'] }}"
MONITOR_IP: "{{ ansible_default_ipv4.address }}"
SOURCE_DIR: /data/tools
# tasks/main.yml
---
- import_tasks: install_prometheus.yml
- import_tasks: install_grafana.yml
# tasks/install_prometheus.yml
---
- name: Install multiple packages using yum module
  ansible.builtin.yum:
    name:
      - fontconfig
      - urw-fonts
    state: present
# - name: Copy source prometheus to remote server (copy variant)
#   ansible.builtin.copy:
#     src: "{{ item }}"
#     dest: "{{ SOURCE_DIR }}"
#   with_fileglob:
#     - "../files/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz"
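- name: Ensure the source directory exists
  ansible.builtin.file:
    path: "{{ SOURCE_DIR }}"
    state: directory
    mode: '0755'
- name: Download and unpack prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz (download variant)
  ansible.builtin.unarchive:
    # Prometheus release tarballs are published in the prometheus/prometheus repository
    src: 'https://github.com/prometheus/prometheus/releases/download/v{{ PROMETHEUS_VERSION }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz'
    dest: '{{ SOURCE_DIR }}'
    remote_src: yes
    # Skip the download once the renamed directory is already in place
    creates: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}"
  register: unarchive_result # read by the rename task below
# - name: unarchive source prometheus package (use together with the copy variant)
#   ansible.builtin.unarchive:
#     src: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz"
#     dest: "{{ SOURCE_DIR }}"
#     remote_src: yes
#     creates: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}"
#   register: unarchive_result
- name: Rename extracted directory if necessary
  ansible.builtin.command: mv {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64 {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}
  when: unarchive_result.changed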
- name: create prometheus directory
  ansible.builtin.file:
    path: "{{ item }}"
    state: directory
    mode: '0755'
  with_items:
    - "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/data"
    - "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/rules"
- name: Copy template prometheus.yml to remote server
  ansible.builtin.template:
    src: "prometheus.yml.j2"
    dest: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml"
- name: Copy template prometheus.service to remote server
  ansible.builtin.template:
    src: "prometheus.service.j2"
    dest: "/usr/lib/systemd/system/prometheus.service"
    owner: root
    group: root
- name: start prometheus
  ansible.builtin.systemd:
    name: prometheus
    state: started
    enabled: yes
    daemon_reload: yes # make systemd pick up the newly installed unit file
- name: check prometheus port is already running
  ansible.builtin.wait_for:
    port: 9090
    state: started
    delay: 1
    timeout: 60
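One optional hardening step for the tasks above is to validate the rendered configuration before it replaces the live file. The sketch below is a variant of the template task that assumes the `promtool` binary shipped in the release tarball sits next to `prometheus` in the extracted directory; the `validate` parameter of the template module runs the command against the temporary file (`%s`) and aborts the task if it fails.

```yaml
- name: Copy template prometheus.yml to remote server (validated with promtool)
  ansible.builtin.template:
    src: "prometheus.yml.j2"
    dest: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml"
    # Assumption: promtool was unpacked alongside the prometheus binary
    validate: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/promtool check config %s"
```

With this in place, a template typo stops the play before Prometheus ever sees the broken file.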
# tasks/install_grafana.yml
---
# - name: Copy source grafana to remote server
#   ansible.builtin.copy:
#     src: "{{ item }}"
#     dest: "{{ SOURCE_DIR }}"
#   with_fileglob:
#     - "../files/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"
- name: Download Grafana RPM package
  ansible.builtin.get_url:
    url: "https://mirrors.aliyun.com/grafana/yum/rpm/Packages/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"
    dest: "{{ SOURCE_DIR }}"
- name: Install grafana on the remote server
  # The yum module is idempotent, unlike a raw "rpm -ivh", which fails on reruns
  ansible.builtin.yum:
    name: "{{ SOURCE_DIR }}/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"
    state: present
    disable_gpg_check: yes # local RPM file; the Grafana signing key is not imported here
- name: start grafana
  ansible.builtin.systemd:
    name: grafana-server
    state: started
    enabled: yes
- name: check grafana port is already running
  ansible.builtin.wait_for:
    port: 3000
    state: started
    delay: 1
    timeout: 60
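Once Grafana is running, it still needs to be pointed at Prometheus. One way to do that without clicking through the UI is a provisioning file, sketched below. It assumes the default provisioning directory of the RPM package (/etc/grafana/provisioning/datasources/) and that Prometheus listens on the same host; the file name and data source name are arbitrary choices.

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yml (hypothetical file name)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                # Grafana's backend issues the queries
    url: http://127.0.0.1:9090   # assumes Prometheus runs on the same host
    isDefault: true
```

Grafana reads provisioning files at startup, so the service needs a restart after the file is added.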
# prometheus.service.j2
[Unit]
Description=prometheus
After=network.target
[Service]
Type=simple
User=root
ExecStart={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus \
  --web.enable-lifecycle \
  --config.file={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml \
  --storage.tsdb.path={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/data/ \
  --storage.tsdb.retention.time=45d
Restart=on-failure
[Install]
WantedBy=multi-user.target
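Because the unit starts Prometheus with --web.enable-lifecycle, configuration changes can be applied without restarting the service by sending a POST request to the reload endpoint. A minimal sketch of an Ansible task that does so, assuming Prometheus listens on the default port on the managed host:

```yaml
- name: Reload Prometheus configuration over HTTP
  ansible.builtin.uri:
    url: "http://127.0.0.1:9090/-/reload"  # only served when --web.enable-lifecycle is set
    method: POST
    status_code: 200
```

A task like this fits naturally as a handler notified by the prometheus.yml template task.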
# prometheus.yml.j2
global:
  scrape_interval: 20s
  scrape_timeout: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'ops-dev-prometheus'
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - 127.0.0.1:9093
rule_files:
  - '{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/rules/*.yml'
scrape_configs:
  - job_name: 'node_exporter'
    metrics_path: /metrics
    static_configs:
      - targets: ['{{ MONITOR_IP }}:9100']
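Since the file is a Jinja2 template, the target list can also be generated from the inventory rather than hard-coded. The sketch below assumes node_exporter runs on port 9100 on every host in the `monitorServer` group and that each host defines `ansible_host`, as the inventory above does; it would replace the `scrape_configs` section of the template.

```yaml
scrape_configs:
  - job_name: 'node_exporter'
    metrics_path: /metrics
    static_configs:
      - targets:
{% for host in groups['monitorServer'] %}
          - '{{ hostvars[host]['ansible_host'] }}:9100'
{% endfor %}
```

Adding a host to the inventory group and rerunning the play is then enough to start scraping it.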
Original article: https://blog.csdn.net/m0_37868230/article/details/144829934