Routing PrometheusAlert alerts to different DingTalk groups
Method 1
Modify the alert rules
- alert: CPU usage above 88%
  expr: instance:node_cpu_utilization:ratio * 100 > 88
  for: 5m
  labels:
    severity: critical
    level: 3
    kind: CpuUsage
  annotations:
    summary: "CPU usage above 88%"
    description: "CPU usage on host {{ $labels.hostname }} is {{ $value | humanize }}"
Alerts are told apart by the kind label: rule one carries kind1 and rule two carries kind2.
Example Alertmanager configuration
global:
  resolve_timeout: 5m
  smtp_from: from@email.com
  smtp_smarthost: smtp.net:port
  smtp_auth_username: from@email.com
  smtp_auth_password: PASS
  smtp_require_tls: false
route:
  receiver: 'email'
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 10m
  routes:
  - receiver: 'our'
    group_wait: 10s
    match_re:
      severity: warning
  - receiver: 'other'
    group_wait: 10s
    match_re:
      severity: busi
templates:
- '*.html'
receivers:
- name: 'email'
  email_configs:
  - to: 'xuxd@email.com'
    send_resolved: false
    html: '{{ template "default-monitor.html" . }}'
    headers: { Subject: "[WARN] Alert email" } # email subject
- name: 'our'
  webhook_configs:
  - url: http://127.0.0.1:8060/dingtalk/our/send
- name: 'other'
  webhook_configs:
  - url: http://127.0.0.1:8060/dingtalk/our/send
  - url: http://127.0.0.1:8060/dingtalk/other/send
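- route: besides the default receiver "email", the routes section defines two specific receivers. The one named "our" matches alerts whose severity is warning; the one named "other" matches alerts whose severity is busi. These severity values come from the labels set in the alert rules. They are not reserved keywords, just markers you define yourself.
- receivers: this section configures the receivers named above. "email" sets who the mail goes to. "our" sets the DingTalk webhook URL; note the end of the URL, where the path segment before send is "our". "other" lists two URLs that differ only in the segment before send: one is "our" and the other is "other".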
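The routing described above can be sketched in a few lines of Python. This is a simplified model, not Alertmanager's code: it assumes that match_re patterns are anchored to the whole label value, that a missing label counts as an empty string, that the first matching child route wins (no continue), and that an alert matching no child falls back to the parent's receiver. The names ROUTE, matches and pick_receiver are made up for illustration.

```python
import re

# Hypothetical in-memory copy of the route tree from the configuration above.
ROUTE = {
    "receiver": "email",
    "routes": [
        {"receiver": "our", "match_re": {"severity": "warning"}},
        {"receiver": "other", "match_re": {"severity": "busi"}},
    ],
}


def matches(route, labels):
    """True if every match_re pattern fully matches the corresponding label."""
    return all(
        re.fullmatch(pattern, labels.get(name, "")) is not None
        for name, pattern in route.get("match_re", {}).items()
    )


def pick_receiver(route, labels):
    """Walk the tree: first matching child wins, otherwise use this route."""
    for child in route.get("routes", []):
        if matches(child, labels):
            return pick_receiver(child, labels)
    return route["receiver"]


if __name__ == "__main__":
    for severity in ("warning", "busi", "critical"):
        print(severity, "->", pick_receiver(ROUTE, {"severity": severity}))
```

Under these assumptions an alert labelled severity: critical matches neither child route and ends up at the default "email" receiver, so the severity value in a rule has to agree with one of the match_re patterns for the alert to reach a DingTalk receiver.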
prometheus-webhook-dingtalk configuration
## Customizable templates path
templates:
  - /home/user/monitor/alert/prometheus-webhook-dingtalk-1.4.0.linux-amd64/template/template.tmpl

## Targets, previously was known as "profiles"
targets:
  our:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxx
    secret: xxx_secret
  other:
    url: https://oapi.dingtalk.com/robot/send?access_token=xxx_other
    secret: xxx_other_secret
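There are two entries under targets, "our" and "other". They correspond to the "our" and "other" segments in the webhook URLs of the Alertmanager configuration above.
With this configuration, when rule one fires, the notification is sent by the Alertmanager receiver named other and reaches both our DingTalk group and the business-side DingTalk group. When rule two fires, it is sent through our and reaches only our DingTalk group.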
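To make the mapping concrete, here is a small Python sketch of how a receiver's webhook URLs resolve to DingTalk groups. It only models the naming convention described in the text (the path segment between dingtalk and send names a target); TARGETS, RECEIVER_URLS, target_of and groups_for are illustrative names, and the group descriptions are placeholders rather than real configuration values.

```python
from urllib.parse import urlparse

# Placeholder descriptions of the two targets in the dingtalk config.
TARGETS = {
    "our": "our DingTalk group",
    "other": "business-side DingTalk group",
}

# Webhook URLs per Alertmanager receiver, as the text describes them.
RECEIVER_URLS = {
    "our": ["http://127.0.0.1:8060/dingtalk/our/send"],
    "other": [
        "http://127.0.0.1:8060/dingtalk/our/send",
        "http://127.0.0.1:8060/dingtalk/other/send",
    ],
}


def target_of(url):
    """Extract the target name from a /dingtalk/<target>/send URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "dingtalk" or parts[2] != "send":
        raise ValueError(f"unexpected webhook path: {url}")
    return parts[1]


def groups_for(receiver):
    """List the groups that a receiver's notifications end up in."""
    return [TARGETS[target_of(url)] for url in RECEIVER_URLS[receiver]]


if __name__ == "__main__":
    for name in RECEIVER_URLS:
        print(name, "->", groups_for(name))
```

Because the "other" receiver lists both URLs, one alert routed to it is posted twice, once per target, which is how the business-side alerts also show up in our own group.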
vmalert configuration file value.yaml
# Default values for victoria-metrics-alert.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

serviceAccount:
  # Specifies whether a service account should be created
  create: true
  # Annotations to add to the service account
  annotations: {}
  # The name of the service account to use.
  # If not set and create is true, a name is generated using the fullname template
  name:
  # mount API token to pod directly
  automountToken: true

imagePullSecrets: []

rbac:
  create: true
  pspEnabled: true
  namespaced: false
  extraLabels: {}
  annotations: {}

server:
  name: server
  enabled: true
  image:
    repository: victoriametrics/vmalert
    tag: "" # rewrites Chart.AppVersion
    pullPolicy: IfNotPresent
  nameOverride: ""
  fullnameOverride: ""

  ## See `kubectl explain poddisruptionbudget.spec` for more
  ## ref: https://kubernetes.io/docs/tasks/run-application/configure-pdb/
  podDisruptionBudget:
    enabled: false
    # minAvailable: 1
    # maxUnavailable: 1
    labels: {}

  # -- Additional environment variables (ex.: secret tokens, flags) https://github.com/VictoriaMetrics/VictoriaMetrics#environment-variables
  env:
    []
    # - name: VM_remoteWrite_basicAuth_password
    #   valueFrom:
    #     secretKeyRef:
    #       name: auth_secret
    #       key: password

  replicaCount: 1

  # deployment strategy, set to standard k8s default
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%

  # specifies the minimum number of seconds for which a newly created Pod should be ready without any of its containers crashing/terminating
  # 0 is the standard k8s default
  minReadySeconds: 0

  # vmalert reads metrics from source, next section represents its configuration. It can be any service which supports
  # MetricsQL or PromQL.
  datasource:
    url: "http://192.168.47.9:8481/select/0/prometheus/"
    basicAuth:
      username: ""
      password: ""

  remote:
    write:
      url: ""
    read:
      url: ""

  notifier:
    alertmanager:
      url: "http://x.x.x.x:9093"

  extraArgs:
    envflag.enable: "true"
    envflag.prefix: VM_
    loggerFormat: json

  # Additional hostPath mounts
  extraHostPathMounts:
    []
    # - name: certs-dir
    #   mountPath: /etc/kubernetes/certs
    #   subPath: ""
    #   hostPath: /etc/kubernetes/certs
    #   readOnly: true

  # Extra Volumes for the pod
  extraVolumes:
    []
    #- name: example
    #  configMap:
    #    name: example

  # Extra Volume Mounts for the container
  extraVolumeMounts:
    []
    # - name: example
    #   mountPath: /example

  extraContainers:
    []
    #- name: config-reloader
    #  image: reloader-image

  service:
    annotations: {}
    labels: {}
    clusterIP: ""
    ## Ref: https://kubernetes.io/docs/user-guide/services/#external-ips
    ##
    externalIPs: []
    loadBalancerIP: ""
    loadBalancerSourceRanges: []
    servicePort: 8880
    type: ClusterIP
    # Ref: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
    # externalTrafficPolicy: "local"
    # healthCheckNodePort: 0

  ingress:
    enabled: false
    annotations: {}
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'
    extraLabels: {}
    hosts: []
    #   - name: vmselect.local
    #     path: /select
    #     port: http
    tls: []
    #   - secretName: vmselect-ingress-tls
    #     hosts:
    #       - vmselect.local
    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix

  podSecurityContext: {}
  # fsGroup: 2000

  securityContext:
    {}
    # capabilities:
    #   drop:
    #   - ALL
    # readOnlyRootFilesystem: true
    # runAsNonRoot: true
    # runAsUser: 1000

  resources:
    {}
    # We usually recommend not to specify default resources and to leave this as a conscious
    # choice for the user. This also increases chances charts run on environments with little
    # resources, such as Minikube. If you do want to specify resources, uncomment the following
    # lines, adjust them as necessary, and remove the curly braces after 'resources:'.
    # limits:
    #   cpu: 100m
    #   memory: 128Mi
    # requests:
    #   cpu: 100m
    #   memory: 128Mi

  # Annotations to be added to the deployment
  annotations: {}
  # labels to be added to the deployment
  labels: {}
  # Annotations to be added to pod
  podAnnotations: {}
  podLabels: {}
  nodeSelector: {}
  priorityClassName: ""
  tolerations: []
  affinity: {}

  # vmalert alert rules configuration configuration:
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    alerts:
      groups:
        - name: Disk mount error
          rules:
            - alert: Disk mount error
              annotations:
                description: 'Disk mount error on node {{$labels.instance}} of chain {{$labels.job}}'
              expr: mount_error{job=~"dev|sit"} == 1
              for: 1m
              labels:
                severity: critical
                kind: kind1
        - name: Process not found
          rules:
            - alert: Process not found
              annotations:
                description: 'Process not found on {{$labels.instance}} of chain {{$labels.job}}'
              expr: process_total_error{job=~"dev|sit"} == 1
              for: 1m
              labels:
                severity: critical
                kind: kind2

serviceMonitor:
  enabled: false
  extraLabels: {}
  annotations: {}
  # interval: 15s
  # scrapeTimeout: 5s
  # -- Commented. HTTP scheme to use for scraping.
  # scheme: https
  # -- Commented. TLS configuration to use when scraping the endpoint
  # tlsConfig:
  #   insecureSkipVerify: true

alertmanager:
  enabled: true
  replicaCount: 1
  podMetadata:
    labels: {}
    annotations: {}
  image: prom/alertmanager
  tag: v0.20.0
  retention: 120h
  nodeSelector: {}
  priorityClassName: ""
  resources: {}
  tolerations: []
  imagePullSecrets: []
  podSecurityContext: {}
  extraArgs: {}
  # key: value

  # external URL, that alertmanager will expose to receivers
  baseURL: ""
  # use existing configmap if specified
  # otherwise .config values will be used
  configMap: ""
  config:
    global:
      resolve_timeout: 5m
    route:
      # default receiver
      receiver: aldaba
      # tag to group by
      group_by: [alertname]
      # How long to initially wait to send a notification for a group of alerts
      group_wait: 30s
      # How long to wait before sending a notification about new alerts that are added to a group
      group_interval: 60s
      # How long to wait before sending a notification again if it has already been sent successfully for an alert
      repeat_interval: 1h
      routes:
        - receiver: 'mychain'
          group_wait: 10s
          match_re:
            kind: mychain
    receivers:
      - name: aldaba
        webhook_configs:
          - url: http://192.168.208.133:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=72a3a55795094a6878c2c2443a81a3545add1f688ddee18701c0dd753dbb3b2a&split=false
            send_resolved: true
      - name: mychain
        webhook_configs:
          - url: http://192.168.208.133:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=307270fdcd1bb0c4b0533e29005cca7cb353c27d7f988fdff0ec00e6affc6e83&split=false
            send_resolved: true
    inhibit_rules:
      - source_match:
        #severity: 'warning'
        target_match:
        #severity: 'warning'
        #equal: ['alertname', 'job']

  templates: {}
  # alertmanager.tmpl: |-

  service:
    annotations: {}
    type: ClusterIP
    port: 9093
    # if you want to force a specific nodePort. Must be use with service.type=NodePort
    # nodePort:

  ingress:
    enabled: false
    annotations:
    #   nginx.ingress.kubernetes.io/auth-realm: Authentication Required
    #   nginx.ingress.kubernetes.io/auth-secret: victoria-metrics/basic-auth
    #   nginx.ingress.kubernetes.io/auth-type: basic
    #   kubernetes.io/ingress.class: nginx
    #   kubernetes.io/tls-acme: 'true'
    extraLabels: {}
    hosts: {}
    #   - name: wangjuan.test.com
    #     path: /
    #     port: web
    tls: []
    #   - secretName: alertmanager-ingress-tls
    #     hosts:
    #       - alertmanager.local
    # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
    # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
    # ingressClassName: nginx
    # -- pathType is only for k8s >= 1.1=
    pathType: Prefix

  persistentVolume:
    # -- Create/use Persistent Volume Claim for alertmanager component. Empty dir if false
    enabled: false
    # -- Array of access modes. Must match those of existing PV or dynamic provisioner. Ref: [http://kubernetes.io/docs/user-guide/persistent-volumes/](http://kubernetes.io/docs/user-guide/persistent-volumes/)
    accessModes:
      - ReadWriteOnce
    # -- Persistent volume annotations
    annotations: {}
    # -- StorageClass to use for persistent volume. Requires alertmanager.persistentVolume.enabled: true. If defined, PVC created automatically
    storageClass: ""
    # -- Existing Claim name. If defined, PVC must be created manually before volume will be bound
    existingClaim: ""
    # -- Mount path. Alertmanager data Persistent Volume mount root path.
    mountPath: /data
    # -- Mount subpath
    subPath: ""
    # -- Size of the volume. Better to set the same as resource limit memory property.
    size: 50Mi
Method 2
Filter by job
Alertmanager configuration
apiVersion: v1
data:
  alertmanager.yaml: |-
    global:
      resolve_timeout: 5m
    inhibit_rules:
    - equal:
      - alertname
      - job
      source_match:
        severity: warning
      target_match:
        severity: warning
    receivers:
    - name: nft
      webhook_configs:
      - send_resolved: false
        url: http://x.x.x.x:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxx&split=false
    - name: poap
      webhook_configs:
      - send_resolved: false
        url: http://x.x.x.x:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxx&split=false
    - name: ipforce
      webhook_configs:
      - send_resolved: false
        url: http://x.x.x.x:8080/prometheusalert?type=dd&tpl=prometheus-dd&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxxxxx&split=false
    route:
      group_by:
      - alertname
      group_interval: 60s
      group_wait: 30s
      receiver: nft
      repeat_interval: 1h
      routes:
      - group_wait: 10s
        match_re:
          job: test_poap
        receiver: poap
      - group_wait: 10s
        match_re:
          job: test_ipforce
        receiver: ipforce
kind: ConfigMap
metadata:
  annotations:
    meta.helm.sh/release-name: vmalert
    meta.helm.sh/release-namespace: victoria-metrics
  creationTimestamp: '2022-04-06T07:31:38Z'
  labels:
    app: alertmanager
    app.kubernetes.io/instance: vmalert
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: victoria-metrics-alert
    helm.sh/chart: victoria-metrics-alert-0.4.33
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      'f:data': {}
      'f:metadata':
        'f:annotations':
          .: {}
          'f:meta.helm.sh/release-name': {}
          'f:meta.helm.sh/release-namespace': {}
        'f:labels':
          .: {}
          'f:app': {}
          'f:app.kubernetes.io/instance': {}
          'f:app.kubernetes.io/managed-by': {}
          'f:app.kubernetes.io/name': {}
          'f:helm.sh/chart': {}
    manager: helm
    operation: Update
    time: '2022-04-06T07:31:38Z'
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      'f:data':
        'f:alertmanager.yaml': {}
    manager: ACK-Console Apache-HttpClient
    operation: Update
    time: '2023-01-05T07:52:13Z'
  name: vmalert-alertmanager-alertmanager-config
  namespace: victoria-metrics
  resourceVersion: '80954053'
  uid: 653e4633-86e5-41ce-9a17-301f75224e9c
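The three receivers in this ConfigMap differ only in the DingTalk token embedded in the ddurl parameter of their webhook URL. As a rough check of what such a URL carries, the Python sketch below splits one of the placeholder URLs with the standard library's query-string parser. It is an assumption that PrometheusAlert reads the parameters the same way; webhook_params is a hypothetical helper, and the URL is the "nft" placeholder from the listing, not a working endpoint.

```python
from urllib.parse import parse_qs, urlsplit

# Placeholder URL copied from the "nft" receiver in the ConfigMap.
WEBHOOK_URL = (
    "http://x.x.x.x:8080/prometheusalert?type=dd&tpl=prometheus-dd"
    "&ddurl=https://oapi.dingtalk.com/robot/send?access_token=xxxxxxxxxxxxx"
    "&split=false"
)


def webhook_params(url):
    """Return the query parameters of a webhook URL as a flat dict."""
    query = urlsplit(url).query
    # parse_qs yields a list per name; each name appears once here.
    return {name: values[0] for name, values in parse_qs(query).items()}


if __name__ == "__main__":
    for name, value in webhook_params(WEBHOOK_URL).items():
        print(f"{name} = {value}")
```

With this parser the nested DingTalk URL, including its access_token, stays inside ddurl, while split is read as a parameter of the outer request. Adding a DingTalk group for another job then comes down to one more receiver whose ddurl carries that group's token, plus a route whose match_re selects the job.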