Using prometheus-operator (7): 3 Alertmanager instances

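Prometheus Operator manages and deploys Alertmanager through the Alertmanager CRD. In this setup it deploys 3 instances by default; the 3 instances form a cluster and keep their state consistent through a gossip protocol.
Because a headless service is deployed, each Alertmanager instance gets its own unique, stable identifier (a per-pod DNS name).
1. Three Alertmanager instances
Alertmanager is deployed as a StatefulSet with 3 pods: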

    # kubectl get all -n monitoring | grep alertmanager-main
    pod/alertmanager-main-0              2/2   Running   0   19d
    pod/alertmanager-main-1              2/2   Running   0   19d
    pod/alertmanager-main-2              2/2   Running   1   18d
    service/alertmanager-main            ClusterIP   10.233.5.73   9093/TCP   19d
    statefulset.apps/alertmanager-main   3/3   19d

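The 3 Alertmanager instances run as a cluster and stay consistent with each other through a gossip protocol. Each process is started with one --cluster.peer flag per cluster member: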
    # ps -ef | grep alertmanager
    1000  51237  51205  0 Feb01 ?  01:34:55 /bin/alertmanager
        --config.file=/etc/alertmanager/config/alertmanager.yaml
        --storage.path=/alertmanager
        --data.retention=120h
        --cluster.listen-address=[10.233.97.7]:9094
        --web.listen-address=:9093
        --web.route-prefix=/
        --cluster.peer=alertmanager-main-0.alertmanager-operated:9094
        --cluster.peer=alertmanager-main-1.alertmanager-operated:9094
        --cluster.peer=alertmanager-main-2.alertmanager-operated:9094

A headless service (clusterIP: None) is defined so that each Alertmanager instance has a unique identifier:
    apiVersion: v1
    kind: Service
    metadata:
      name: alertmanager-operated
      namespace: monitoring
    spec:
      clusterIP: None
      ports:
      - name: web
        port: 9093
        protocol: TCP
        targetPort: web
      - name: tcp-mesh
        port: 9094
        protocol: TCP
        targetPort: 9094
      - name: udp-mesh
        port: 9094
        protocol: UDP
        targetPort: 9094
      selector:
        app: alertmanager
      sessionAffinity: None
      type: ClusterIP
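The peer addresses seen in the --cluster.peer flags follow the usual StatefulSet naming pattern: pod name (StatefulSet name plus ordinal), then the headless service name, then the mesh port. The following is only a minimal Python sketch of that pattern; the helper names peer_addresses and cluster_peer_flags are hypothetical and are not part of prometheus-operator or Alertmanager.

```python
from typing import List


def peer_addresses(statefulset: str, service: str, replicas: int, port: int = 9094) -> List[str]:
    """Build one '<pod>.<headless-service>:<port>' address per StatefulSet replica.

    StatefulSet pods are named '<statefulset>-<ordinal>', with ordinals 0..replicas-1.
    """
    return [f"{statefulset}-{i}.{service}:{port}" for i in range(replicas)]


def cluster_peer_flags(addresses: List[str]) -> List[str]:
    """Turn peer addresses into Alertmanager --cluster.peer flags."""
    return [f"--cluster.peer={address}" for address in addresses]


if __name__ == "__main__":
    # Values taken from the manifests in this article.
    peers = peer_addresses("alertmanager-main", "alertmanager-operated", replicas=3)
    for flag in cluster_peer_flags(peers):
        print(flag)
```

Because the names depend only on the StatefulSet name, the ordinal, and the headless service, a restarted pod keeps the same address even though its pod IP may change.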

2. How Prometheus is configured to reach Alertmanager
Take a look inside the Prometheus pod:
    # kubectl exec -it prometheus-k8s-0 /bin/sh -n monitoring
    /etc/prometheus/config_out $ vi prometheus.env.yaml
    ......
    alerting:
      alert_relabel_configs:
      - action: labeldrop
        regex: prometheus_replica
      alertmanagers:
      - path_prefix: /
        scheme: http
        kubernetes_sd_configs:
        - role: endpoints
          namespaces:
            names:
            - monitoring
        relabel_configs:
        - action: keep
          source_labels:
          - __meta_kubernetes_service_name
          regex: alertmanager-main
        - action: keep
          source_labels:
          - __meta_kubernetes_endpoint_port_name
          regex: web
    ......

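As shown, the Alertmanager instances are also discovered automatically through kubernetes_sd_configs. The filter rules are:
  • __meta_kubernetes_service_name = alertmanager-main;
  • __meta_kubernetes_endpoint_port_name = web.
alertmanager-service.yaml satisfies both conditions (the service is named alertmanager-main and its port is named web):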
    # cat alertmanager-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        alertmanager: main
      name: alertmanager-main
      namespace: monitoring
    spec:
      ports:
      - name: web
        port: 9093
        targetPort: web
      selector:
        alertmanager: main
        app: alertmanager
      sessionAffinity: ClientIP
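The effect of the two keep rules can be illustrated with a small Python simulation. This is a sketch, not Prometheus code: the keep function and the candidate list are hypothetical, the addresses other than 10.233.97.7 are made up for illustration, and Python's re module stands in for the RE2 engine Prometheus uses. What it does reflect is that a keep rule drops every target whose label value does not fully match the regex.

```python
import re
from typing import Dict, List, Tuple

# The two keep rules from prometheus.env.yaml: (source label, regex).
KEEP_RULES: List[Tuple[str, str]] = [
    ("__meta_kubernetes_service_name", "alertmanager-main"),
    ("__meta_kubernetes_endpoint_port_name", "web"),
]

# Hypothetical endpoints discovered in the monitoring namespace.
CANDIDATES: List[Dict[str, str]] = [
    {"__address__": "10.233.97.7:9093",
     "__meta_kubernetes_service_name": "alertmanager-main",
     "__meta_kubernetes_endpoint_port_name": "web"},
    {"__address__": "10.233.97.7:9094",
     "__meta_kubernetes_service_name": "alertmanager-operated",
     "__meta_kubernetes_endpoint_port_name": "tcp-mesh"},
    {"__address__": "10.233.97.7:9093",
     "__meta_kubernetes_service_name": "alertmanager-operated",
     "__meta_kubernetes_endpoint_port_name": "web"},
    {"__address__": "10.233.90.12:9090",  # illustrative address only
     "__meta_kubernetes_service_name": "prometheus-k8s",
     "__meta_kubernetes_endpoint_port_name": "web"},
]


def keep(targets: List[Dict[str, str]], rules: List[Tuple[str, str]]) -> List[Dict[str, str]]:
    """Keep only targets whose label values fully match every rule's regex."""
    kept = []
    for target in targets:
        if all(re.fullmatch(regex, target.get(label, "")) for label, regex in rules):
            kept.append(target)
    return kept


if __name__ == "__main__":
    for target in keep(CANDIDATES, KEEP_RULES):
        print(target["__address__"], target["__meta_kubernetes_service_name"])
```

Only the endpoint that belongs to the alertmanager-main service and uses the web port survives both rules; the same pod reached through alertmanager-operated is filtered out by the service-name rule.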

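3. The Alertmanager CRD, Service, and ServiceMonitor definitions
The Alertmanager StatefulSet is not written by hand; it is generated from an Alertmanager custom resource defined by the alertmanagers CRD: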
    # kubectl get crd -n monitoring
    NAME                                  CREATED AT
    alertmanagers.monitoring.coreos.com   2021-02-01T03:13:36Z
    ......

    # cat alertmanager-alertmanager.yaml
    apiVersion: monitoring.coreos.com/v1
    kind: Alertmanager
    metadata:
      labels:
        alertmanager: main
      name: main
      namespace: monitoring
    spec:
      image: "178.104.162.39:443/dev/huayun/amd64/alertmanager:v0.21.0"
      nodeSelector:
        kubernetes.io/os: linux
      replicas: 3
      securityContext:
        fsGroup: 2000
        runAsNonRoot: true
        runAsUser: 1000
      serviceAccountName: alertmanager-main
      version: v0.21.0

As shown, the resource above sets replicas: 3.
The Service and ServiceMonitor are defined as follows:
    # cat alertmanager-service.yaml
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        alertmanager: main
      name: alertmanager-main
      namespace: monitoring
    spec:
      ports:
      - name: web
        port: 9093
        targetPort: web
      selector:
        alertmanager: main
        app: alertmanager
      sessionAffinity: ClientIP

    # cat alertmanager-serviceMonitor.yaml
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      labels:
        k8s-app: alertmanager
      name: alertmanager
      namespace: monitoring
    spec:
      endpoints:
      - interval: 30s
        port: web
      selector:
        matchLabels:
          alertmanager: main

Because the ServiceMonitor is defined, the /metrics endpoint exposed by each Alertmanager instance can be scraped by Prometheus.
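The link between the ServiceMonitor and the Service is the label selector: spec.selector.matchLabels is compared against the Service's metadata.labels, not against the ServiceMonitor's own labels. A minimal Python sketch of that equality match follows; the function name matches_labels is hypothetical, and the real selector logic lives in Kubernetes and the operator.

```python
from typing import Dict


def matches_labels(match_labels: Dict[str, str], object_labels: Dict[str, str]) -> bool:
    """Return True if every key/value pair in match_labels is present in object_labels."""
    return all(object_labels.get(key) == value for key, value in match_labels.items())


# spec.selector.matchLabels from alertmanager-serviceMonitor.yaml
SERVICE_MONITOR_SELECTOR = {"alertmanager": "main"}

# metadata.labels from alertmanager-service.yaml
ALERTMANAGER_MAIN_SERVICE_LABELS = {"alertmanager": "main"}

if __name__ == "__main__":
    selected = matches_labels(SERVICE_MONITOR_SELECTOR, ALERTMANAGER_MAIN_SERVICE_LABELS)
    print("alertmanager-main selected:", selected)
```

Once the Service is selected, the endpoint entry with port: web and interval: 30s describes which named port is scraped and how often.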
