Monitoring RocketMQ with kube-prometheus
References
https://www.cnblogs.com/fengjian2016/p/12666361.html
kube-prometheus deployment:
https://blog.csdn.net/wuxingge/article/details/143964814
Deploying RocketMQ
JDK
cd /opt
tar xf jdk-8u212-linux-x64.tar.gz
ln -s jdk1.8.0_212 jdk
JDK environment variables
vim /etc/profile
export JAVA_HOME=/opt/jdk
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=.:$CLASSPATH:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$JAVA_HOME/lib/tools.jar
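To make the new variables take effect in the current shell and confirm the JDK is usable (a quick sanity check, not part of the original post):
source /etc/profile
java -version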
Download RocketMQ
wget https://mirrors.aliyun.com/apache/rocketmq/4.9.8/rocketmq-all-4.9.8-bin-release.zip
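The post goes straight to editing bin/runserver.sh, so the archive presumably has to be unpacked first; a minimal sketch (the directory name is assumed from the release file name and may differ):
unzip rocketmq-all-4.9.8-bin-release.zip
cd rocketmq-all-4.9.8-bin-release    # adjust if the zip unpacks to a different directory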
Start the Name Server
Adjust the JVM heap size
vi bin/runserver.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g -Xmn2g ...
...
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g ...
nohup bash bin/mqnamesrv &> startnameserver.log &
ss -lntup |grep java
tcp LISTEN 0 1024 0.0.0.0:9876 0.0.0.0:* users:(("java",pid=22646,fd=90))
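To double-check that the Name Server actually started, inspect the log file created by the nohup redirect above (the exact message may vary by version):
tail startnameserver.log
# expect a line like: The Name Server boot success. serializeType=JSON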
Start the broker
vim bin/runbroker.sh
JAVA_OPT="${JAVA_OPT} -server -Xms2g -Xmx2g"
nohup bash bin/mqbroker -n 192.168.0.134:9876 &> startbroker.log &
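One way to confirm the broker has registered with the Name Server is the mqadmin tool shipped in bin/ (the address is assumed to match the start command above):
sh bin/mqadmin clusterList -n 192.168.0.134:9876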
Deploying rocketmq-exporter
Maven
cd /opt
tar xf apache-maven-3.8.8-bin.tar.gz
ln -s apache-maven-3.8.8 maven
Maven environment variables
vim /etc/profile
export MAVEN_HOME=/opt/maven
export PATH="$MAVEN_HOME/bin:$PATH"
Maven configuration
Add an Aliyun mirror inside the <mirrors> section of settings.xml:
vim maven/conf/settings.xml
...
<mirror>
<id>aliyunmaven</id>
<mirrorOf>*</mirrorOf>
<name>Aliyun public repository</name>
<url>https://maven.aliyun.com/repository/public</url>
</mirror>
...
Build and package rocketmq-exporter
git clone https://github.com/apache/rocketmq-exporter
cd rocketmq-exporter
mvn clean install
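If the build succeeds, the runnable jar used in the start command below should sit under target/ (file name assumed from that command):
ls target/rocketmq-exporter-0.0.2-exec.jar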
Start rocketmq-exporter
nohup java -jar rocketmq-exporter-0.0.2-exec.jar --rocketmq.config.namesrvAddr="192.168.0.134:9876" &> rocketmq-exporter.log &
# If RocketMQ runs as a cluster, start the exporter with the following command instead
java -jar rocketmq-exporter-0.0.2-exec.jar --rocketmq.config.namesrvAddr="192.168.0.134:9876;192.168.0.135:9876;192.168.0.136:9876"
rocketmq-exporter supports the following run options (they can also be passed on the command line, as in the example after this list):
rocketmq.config.namesrvAddr="127.0.0.1:9876"   nameSrv address of the MQ cluster
rocketmq.config.webTelemetryPath="/metrics"    path under which metrics are exposed
server.port=5557                               port on which the HTTP service listens
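For example, to pass several of these options on the command line at once (the values here are only illustrative):
java -jar rocketmq-exporter-0.0.2-exec.jar --rocketmq.config.namesrvAddr="192.168.0.134:9876" --rocketmq.config.webTelemetryPath="/metrics" --server.port=5557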
Access rocketmq-exporter
curl http://192.168.0.134:5557/metrics
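If the exporter can reach the cluster, RocketMQ metrics should appear in the output; for example (the metric name is taken from the alert rules further below):
curl -s http://192.168.0.134:5557/metrics | grep rocketmq_producer_tps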
kube-prometheus: add a rocketmq Service and Endpoints to bring the external RocketMQ exporter into the cluster
vi rocketmq-service-endpoint.yaml
apiVersion: v1
kind: Endpoints
metadata:
  name: exporter-rocketmq
  namespace: kube-system
  labels:
    app: exporter-rocketmq
subsets:
- addresses:
  - ip: 192.168.0.134
  ports:
  - name: port
    port: 5557
    protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
  name: exporter-rocketmq
  namespace: kube-system
  labels:
    app: exporter-rocketmq
spec:
  type: ClusterIP
  clusterIP: None
  ports:
  - name: port
    port: 5557
    protocol: TCP
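The original post does not show applying this manifest; presumably it is applied and verified like this:
kubectl apply -f rocketmq-service-endpoint.yaml
kubectl -n kube-system get svc,endpoints exporter-rocketmq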
kube-prometheus: add a ServiceMonitor for the exporter
vi prometheus-serviceMonitorrocketmq.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: exporter-rocketmq
  namespace: monitoring
  labels:
    app: exporter-rocketmq
spec:
  jobLabel: exporter-rocketmq
  endpoints:
  - port: port
    interval: 30s
    scheme: http
  selector:
    matchLabels:
      app: exporter-rocketmq
  namespaceSelector:
    matchNames:
    - kube-system
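Likewise, apply the ServiceMonitor and confirm it was created (this step is not shown in the original post):
kubectl apply -f prometheus-serviceMonitorrocketmq.yaml
kubectl -n monitoring get servicemonitors.monitoring.coreos.com exporter-rocketmq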
Create the alert rules with the Prometheus Operator
kube-prometheus picks up PrometheusRule alert rules from the monitoring namespace.
vi rocketmq-prometheusRule.yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    app.kubernetes.io/component: rocketmq
    app.kubernetes.io/name: rocketmq
    app.kubernetes.io/part-of: kube-prometheus
    #prometheus: k8s
    #role: alert-rules
  name: rocketmq-rules
  namespace: monitoring
spec:
  groups:
  - name: rocketmq
    rules:
    - alert: RocketMQExporterIsDown
      expr: up{job="exporter-rocketmq"} == 0
      for: 20s
      labels:
        severity: disaster
      annotations:
        summary: RocketMQ exporter {{ $labels.instance }} is down
    - alert: RocketMQMessageBacklog
      expr: (sum(irate(rocketmq_producer_offset[1m])) by (topic) - on(topic) group_right sum(irate(rocketmq_consumer_offset[1m])) by (group,topic)) > 5
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: RocketMQ (group={{ $labels.group }} topic={{ $labels.topic }}) backlog = {{ .Value }}
    - alert: GroupGetLatencyByStoretime
      expr: rocketmq_group_get_latency_by_storetime/1000 > 10 and rate(rocketmq_group_get_latency_by_storetime[5m]) > 0
      for: 3m
      labels:
        severity: warning
      annotations:
        description: 'consumer {{$labels.group}} on {{$labels.broker}}, {{$labels.topic}} consume time lag behind message store time and (behind value is {{$value}}).'
        summary: Consumer group consume latency is too high
    - alert: RocketMQClusterProduceHigh
      expr: sum(rocketmq_producer_tps) by (cluster) >= 20
      for: 3m
      labels:
        severity: warning
      annotations:
        description: '{{$labels.cluster}} Sending tps too high. now TPS = {{ .Value }}'
        summary: cluster send tps too high
kubectl apply -f rocketmq-prometheusRule.yaml
kubectl -n monitoring get prometheusrules.monitoring.coreos.com
NAME                              AGE
alertmanager-main-rules           9d
grafana-rules                     9d
kube-prometheus-rules             9d
kube-state-metrics-rules          9d
kubernetes-monitoring-rules       9d
node-exporter-rules               9d
prometheus-k8s-prometheus-rules   9d
prometheus-operator-rules         9d
rocketmq-rules                    22m
Inside the Prometheus container, the operator has already generated the rule files automatically:
kubectl -n monitoring exec -it prometheus-k8s-0 -- sh
/prometheus $ cd /etc/prometheus/rules/prometheus-k8s-rulefiles-0/
/etc/prometheus/rules/prometheus-k8s-rulefiles-0 $ ls
monitoring-alertmanager-main-rules-5643f949-3d8d-4942-bbb0-0e93e51aeafd.yaml monitoring-node-exporter-rules-e8704a33-1fcb-4390-baac-7e4ced854824.yaml
monitoring-grafana-rules-d16a41fd-a917-4b35-aedb-38b1dccbc0e9.yaml monitoring-prometheus-k8s-prometheus-rules-fa554d2b-a09b-438c-a264-a46f956c4f4b.yaml
monitoring-kube-prometheus-rules-f3949e47-9cee-41d7-94a0-f03cc00cad8f.yaml monitoring-prometheus-operator-rules-3dc8eb3b-96bb-4e7b-b721-2a98317393bc.yaml
monitoring-kube-state-metrics-rules-10e6eb2c-5f58-4d48-be29-458d2938c10a.yaml monitoring-rocketmq-rules-8da5d119-c3c8-4212-a68e-50d2d1156c0a.yaml
monitoring-kubernetes-monitoring-rules-b4d35d4d-6656-4199-baaa-2c256c0eb1ed.yaml
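To confirm that Prometheus has loaded the rules and is scraping the exporter, one option is to port-forward the prometheus-k8s Service that a default kube-prometheus install creates and check the web UI (service name assumed from that default install):
kubectl -n monitoring port-forward svc/prometheus-k8s 9090:9090
# then open http://localhost:9090/rules and http://localhost:9090/targets in a browser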
Original article: https://blog.csdn.net/wuxingge/article/details/144026496