
Prometheus Series: Installing and Configuring Prometheus

Basic structure of the configuration file

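  • global: global settings
    • scrape_interval: how often targets are scraped; defaults to 1m
    • evaluation_interval: how often alerting and recording rules are evaluated; defaults to 1m
    • scrape_timeout: how long a scrape may run before it times out; defaults to 10s. If scrapes fail with "context deadline exceeded", set this field in the affected job; it must not exceed that job's scrape_interval
    • external_labels: labels attached to time series and alerts when the server communicates with external systems (federation, remote storage, Alertmanager)
# Example
global:
  scrape_interval:     15s # scrape every 15 seconds
  evaluation_interval: 15s # evaluate rules every 15 seconds
  scrape_timeout:      15s # time out a scrape after 15 seconds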
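
The note about "context deadline exceeded" can be illustrated with a per-job override. The following is a minimal sketch, not a recommended setup: the job name, the target address, and the label value are hypothetical, and only the field names come from Prometheus itself.

```yaml
global:
  scrape_interval: 1m
  scrape_timeout: 10s
  external_labels:
    cluster: demo                    # hypothetical label value

scrape_configs:
  - job_name: 'slow_exporter'        # hypothetical job
    scrape_interval: 1m
    scrape_timeout: 30s              # overrides the global 10s for this job only
    static_configs:
      - targets: ['localhost:9200']  # hypothetical address
```

Prometheus rejects a configuration in which a job's scrape_timeout is greater than its scrape_interval, so both values usually need to be raised together.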
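  • alerting: optional; alerting-related settings. Several Alertmanager instances can be listed
    • alert_relabel_configs: relabeling rules applied to alerts before they are sent
    • alertmanagers: the Alertmanager instances that Prometheus sends alerts to
# Example
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093 # assumes Alertmanager runs locally on port 9093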
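
As an illustration of alert_relabel_configs, the sketch below drops a label from every outgoing alert and sends alerts to two Alertmanager instances. The label name `replica` and the second address are assumptions made for the example; a setup like this is typically used when two identical Prometheus servers differ only in one external label and their alerts should deduplicate.

```yaml
alerting:
  alert_relabel_configs:
    - action: labeldrop        # remove matching labels from every outgoing alert
      regex: replica           # hypothetical label name
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093
            - localhost:9094   # hypothetical second instance
```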
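  • scrape_configs: defines how Prometheus scrapes its targets; each scrape_config block describes one group of targets and their settings
    • metrics_path: the HTTP path metrics are scraped from; defaults to /metrics
    • scheme: the protocol used for scraping; defaults to http
    • job_name: the name that identifies this group of targets
    • static_configs: a statically configured list of targets
      • targets: the list of addresses Prometheus scrapes
# Example
scrape_configs:
  - job_name: 'node_exporter'
    metrics_path: /metrics
    scheme: http
    static_configs:
      - targets: ['localhost:9100'] # assumes node_exporter runs locally on port 9100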
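
A job can hold several static_configs entries, each with its own targets and extra labels. A small sketch with made-up addresses and label values; metrics_path and scheme are omitted, so the defaults (/metrics and http) apply:

```yaml
scrape_configs:
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['10.0.0.1:9100', '10.0.0.2:9100']   # hypothetical hosts
        labels:
          env: dev                                     # added to every series from these targets
      - targets: ['10.0.1.1:9100']                     # hypothetical host
        labels:
          env: prod
```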
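  • remote_write: optional; settings for writing samples to remote storage
  • remote_read: optional; settings for reading samples from remote storage
  • rule_files: the list of rule files Prometheus loads; these files define the alerting (and recording) rules
# Example
rule_files:
  - "alert_rules.yml" # assumes the alerting rules are defined in alert_rules.yml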
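
For reference, a rule file such as alert_rules.yml might look like the sketch below. The group name, threshold, and label values are illustrative choices, not taken from the original article.

```yaml
groups:
  - name: example                # hypothetical group name
    rules:
      - alert: InstanceDown
        expr: up == 0            # `up` is 0 when the last scrape of a target failed
        for: 5m                  # the condition must hold for 5 minutes before the alert fires
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

If such a file is rendered through Ansible's template module, the `{{ $labels.instance }}` expression collides with Jinja syntax and needs to be wrapped in `{% raw %} ... {% endraw %}`; copying the file verbatim avoids the problem.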

Prometheus.service

Basic options

  • --config.file="prometheus.yml" : path to the Prometheus configuration file, which defines how Prometheus discovers targets, scrapes metrics, and so on

Alerting options

  • --alertmanager.notification-queue-capacity=10000 : capacity of the queue for pending Alertmanager notifications

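Query options

  • --query.lookback-delta=5m : maximum lookback duration for retrieving metrics during expression evaluation and federation
  • --query.timeout=2m : maximum time a query may run before it is aborted
  • --query.max-concurrency=20 : maximum number of queries executed concurrently
  • --query.max-samples=50000000 : maximum number of samples a single query can load into memory; this also caps the number of samples a query can return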

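Logging options

  • --log.level=info : only log messages at the given severity or above (debug, info, warn, error)
  • --log.format=logfmt : output format of log messages (logfmt or json)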

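Storage options

  • --storage.tsdb.path="data/" : base path for metrics storage; server mode only
  • --storage.tsdb.retention.time : how long samples are retained in storage; defaults to 15d when neither retention flag is set
  • --storage.tsdb.retention.size : maximum number of bytes of storage blocks to retain
  • --[no-]storage.tsdb.no-lockfile : do not create a lock file in the data directory
  • --storage.tsdb.head-chunks-write-queue-size=0 : size of the queue through which head chunks are written to disk
  • --storage.agent.path="data-agent/" : base path for metrics storage; agent mode only
  • --storage.agent.wal-compression : compress the agent's WAL (write-ahead log)
  • --storage.agent.retention.min-time : minimum age a sample must reach before it is considered for deletion when the WAL is truncated
  • --storage.agent.retention.max-time : maximum age a sample may reach before it is forcibly deleted when the WAL is truncated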

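Web options

  • --web.listen-address="0.0.0.0:9090" : address and port to listen on for the UI, API, and telemetry. By default Prometheus listens on port 9090 on all interfaces
  • --web.config.file="" : path to a configuration file that enables TLS or authentication
  • --web.read-timeout=5m : maximum duration before a request read times out and idle connections are closed
  • --web.max-connections=512 : maximum number of simultaneous connections
  • --web.external-url= : the URL under which Prometheus is externally reachable, typically used behind a reverse proxy. It is used to generate relative and absolute links back to Prometheus itself
  • --web.route-prefix= : prefix for the internal routes of web endpoints; defaults to the path component of --web.external-url
  • --web.user-assets= : path to a static asset directory, served at /user
  • --[no-]web.enable-lifecycle : enable shutdown and reload via HTTP requests
  • --[no-]web.enable-admin-api : enable the API endpoints for admin control actions
  • --[no-]web.enable-remote-write-receiver : enable the API endpoint that accepts remote write requests
  • --web.console.templates="consoles" : path to the console template directory, served at /consoles
  • --web.console.libraries="console_libraries" : path to the console library directory
  • --web.page-title="Prometheus Time Series Collection and Processing Server" : document title of the Prometheus instance
  • --web.cors.origin=".*" : regular expression for the CORS origin, used to configure cross-origin resource sharing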
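
The file passed to --web.config.file is itself YAML. A minimal sketch, assuming hypothetical certificate paths and a single user named admin; the hash is a placeholder that must be replaced with a real bcrypt hash of the password:

```yaml
# web-config.yml (hypothetical file name), passed as --web.config.file=web-config.yml
tls_server_config:
  cert_file: /etc/prometheus/tls/server.crt   # hypothetical path
  key_file: /etc/prometheus/tls/server.key    # hypothetical path

basic_auth_users:
  # user name -> bcrypt hash of the password; the value below is a placeholder, not a real hash
  admin: "<bcrypt-hash>"
```

Either section can be used on its own: tls_server_config alone enables HTTPS, and basic_auth_users alone enables basic authentication over plain HTTP.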

Feature flags

  • --enable-feature= … : enable the listed feature flags; used to turn on experimental or advanced features

Setting up the monitoring environment (Ansible version)

Download: https://prometheus.io/download/

Directory layout

> hostk
> monitor.yml
> roles
>   - monitor
>       - vars 
>            - main.yml
>       - tasks
>            - main.yml
>            - install_prometheus.yml
>            - install_grafana.yml
>       - templates
>            - prometheus.service.j2
>            - prometheus.yml.j2

Download, installation, and configuration

# hostk
[monitorServer]
monitor-server ansible_host=172.16.13.212 ansible_ssh_port=22 ansible_ssh_user='root' ansible_ssh_pass='123456'
[monitorServer:vars]
gra_version=7.5.4-1
pro_version=2.53.3
# monitor.yml
---
- name: monitor
  hosts: monitorServer
  become: yes
  vars_files:
    - roles/monitor/vars/main.yml
  roles:
    - monitor
# vars/main.yml
GRAFANA_VERSION: "{{ hostvars['monitor-server']['gra_version'] }}"
PROMETHEUS_VERSION: "{{ hostvars['monitor-server']['pro_version'] }}"
MONITOR_IP: "{{ ansible_default_ipv4.address }}"
SOURCE_DIR: /data/tools
# tasks/main.yml
---
- import_tasks: install_prometheus.yml
- import_tasks: install_grafana.yml
# tasks/install_prometheus.yml
---
- name: Install multiple packages using yum module
  ansible.builtin.yum:
    name:
      - fontconfig
      - urw-fonts
    state: present

# - name: Copy source prometheus to remote server (copy variant)
#   ansible.builtin.copy:
#     src: "{{ item }}"
#     dest: "{{ SOURCE_DIR }}"
#   with_fileglob:
#     - "../files/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz"

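- name: Download and unpack prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz (download variant)
  ansible.builtin.unarchive:
    # Prometheus release tarballs are hosted in the prometheus/prometheus repository
    src: 'https://github.com/prometheus/prometheus/releases/download/v{{ PROMETHEUS_VERSION }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz'
    dest: '{{ SOURCE_DIR }}'
    remote_src: yes
    # Skip the download once the renamed directory exists, so reruns stay idempotent
    creates: '{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}'
  # The rename task tests unarchive_result.changed
  register: unarchive_result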

# - name: unarchive source prometheus package
#   ansible.builtin.unarchive:
#     src: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64.tar.gz"
#     dest: "{{ SOURCE_DIR }}"
#     remote_src: yes
#     creates: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}"
#   register: unarchive_result

- name: Rename extracted directory if necessary
  ansible.builtin.command: mv {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}.linux-amd64 {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}
  when: unarchive_result.changed

- name: create prometheus directory
  file:
    path: "{{ item }}"
    state: directory
    mode: '0755'
  with_items:
    - "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/data"
    - "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/rules"

- name: Copy template prometheus.yml to remote server
  ansible.builtin.template:
    src: "prometheus.yml.j2"
    dest: "{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml"

- name: Copy template prometheus.service to remote server
  ansible.builtin.template:
    src: "prometheus.service.j2"
    dest: "/usr/lib/systemd/system/prometheus.service"
    owner: root
    group: root

- name: start prometheus
  ansible.builtin.systemd:
    name: prometheus
    state: started
    enabled: yes
    daemon_reload: yes # make systemd re-read unit files so the new prometheus.service is picked up

- name: check prometheus port is already running
  ansible.builtin.wait_for:
    port: 9090
    state: started
    delay: 1
    timeout: 60
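
Because the unit file starts Prometheus with --web.enable-lifecycle, later configuration changes can be applied without restarting the service. The two tasks below are a sketch of how that might look, not part of the original role; they assume the promtool binary shipped in the release tarball sits next to the prometheus binary and that Prometheus listens on 127.0.0.1:9090.

```yaml
- name: Validate the rendered configuration with promtool
  ansible.builtin.command: >-
    {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/promtool check config
    {{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml
  changed_when: false            # a pure check never changes the host

- name: Reload Prometheus through the lifecycle endpoint
  ansible.builtin.uri:
    url: "http://127.0.0.1:9090/-/reload"   # only available with --web.enable-lifecycle
    method: POST
    status_code: 200
```

Running the check first means a broken prometheus.yml fails the play instead of being handed to the running server.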

# tasks/install_grafana.yml
---
# - name: Copy source grafana to remote server
#   ansible.builtin.copy:
#     src: "{{ item }}"
#     dest: "{{ SOURCE_DIR }}"
#   with_fileglob:
#     - "../files/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"

- name: Download Grafana RPM package
  ansible.builtin.get_url:
    # Use the inventory version so the file name matches the install task
    url: "https://mirrors.aliyun.com/grafana/yum/rpm/Packages/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"
    dest: "{{ SOURCE_DIR }}"

- name: Install grafana on the remote server
  ansible.builtin.command: "rpm -ivh {{ SOURCE_DIR }}/grafana-{{ GRAFANA_VERSION }}.x86_64.rpm"

- name: start grafana
  ansible.builtin.systemd:
    name: grafana-server
    state: started
    enabled: yes

- name: check grafana port is already running
  ansible.builtin.wait_for:
    port: 3000
    state: started
    delay: 1
    timeout: 60
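
Once both services are up, Grafana still needs a data source that points at Prometheus. One way to avoid adding it by hand is a provisioning file; the sketch below assumes the default provisioning directory of the RPM package (/etc/grafana/provisioning) and that Grafana and Prometheus run on the same host, as in this playbook. The file name is hypothetical.

```yaml
# /etc/grafana/provisioning/datasources/prometheus.yml (hypothetical file name)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend queries Prometheus on behalf of the browser
    url: http://127.0.0.1:9090    # assumes both services share a host
    isDefault: true
```

Grafana reads provisioning files at startup, so grafana-server has to be restarted after the file is added.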
# prometheus.service.j2
[Unit]
Description = prometheus
After=network.target

[Service]
Type=simple
User=root
ExecStart={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus --web.enable-lifecycle --config.file={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/prometheus.yml --storage.tsdb.path={{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/data/  --storage.tsdb.retention.time=45d
Restart=on-failure

[Install]
WantedBy=multi-user.target
# prometheus.yml.j2
global:
  scrape_interval: 20s
  scrape_timeout: 15s
  evaluation_interval: 15s
  external_labels:
    monitor: 'ops-dev-prometheus'

alerting:
  alertmanagers:
  - static_configs:
    - targets:
       - 127.0.0.1:9093

rule_files:
  - '{{ SOURCE_DIR }}/prometheus-{{ PROMETHEUS_VERSION }}/rules/*.yml'

scrape_configs:
  - job_name: 'node_exporter'
    metrics_path: /metrics
    static_configs:
    - targets: ['{{ MONITOR_IP }}:9100']
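
The template above scrapes a single node_exporter. As a sketch of how the scrape_configs section could grow, the variant below adds a job for Prometheus itself and builds the node_exporter target list from an inventory group. The group name `nodes` is hypothetical; it does not exist in the inventory shown earlier and would have to be defined there with an ansible_host for each member.

```yaml
scrape_configs:
  - job_name: 'prometheus'            # Prometheus scraping its own /metrics endpoint
    static_configs:
    - targets: ['127.0.0.1:9090']

  - job_name: 'node_exporter'
    static_configs:
    - targets:
{% for host in groups['nodes'] %}
      - '{{ hostvars[host]['ansible_host'] }}:9100'
{% endfor %}
```

Ansible's template module trims the newline after each `{% ... %}` block tag by default, so the loop leaves no blank lines in the rendered YAML.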

Original article: https://blog.csdn.net/m0_37868230/article/details/144829934
