Parsing alert-judgment expressions in Open-falcon judge
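When the judge component evaluates an alert, it parses the configured alert strategy into a fn, and fn.Compute() decides whether the alert is triggered. For example:
- all(#3)>90 triggers when each of the last 3 data points is > 90;
- max(#3)>90 triggers when the maximum of the last 3 data points is > 90;
- min(#3)<10 triggers when the minimum of the last 3 data points is < 10;
- avg(#3)>90 triggers when the average of the last 3 data points is > 90.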
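1. Alert judgment
The entry point for alert judgment, which was introduced in the walkthrough of the judge code:
- ParseFuncFromString() generates fn;
- fn.Compute(L) evaluates the most recent data and decides whether the alert is triggered.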
// modules/judge/store/judge.go
func judgeItemWithStrategy(L *SafeLinkedList, strategy model.Strategy, firstItem *model.JudgeItem, now int64) {
    fn, err := ParseFuncFromString(strategy.Func, strategy.Operator, strategy.RightValue)
    if err != nil {
        // the strategy expression could not be parsed
        return
    }
    historyData, leftValue, isTriggered, isEnough := fn.Compute(L)
    // too few data points so far to make an alert judgment
    if !isEnough {
        return
    }
    ......
}
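The isEnough flag can be illustrated with a minimal sketch. The history type and its lastN method below are hypothetical stand-ins for SafeLinkedList and L.HistoryData(), whose implementation is not shown in this article. The sketch assumes only that the newest point comes first and that the call reports whether the requested number of points is available.

```go
package main

import "fmt"

// history is a hypothetical stand-in for judge's SafeLinkedList:
// it keeps data points with the newest value at index 0.
type history struct {
	values []float64
}

// push records a new data point at the front.
func (h *history) push(v float64) {
	h.values = append([]float64{v}, h.values...)
}

// lastN returns the newest limit points and reports whether
// that many points were available.
func (h *history) lastN(limit int) (vs []float64, isEnough bool) {
	if limit < 1 || len(h.values) < limit {
		return nil, false
	}
	return h.values[:limit], true
}

func main() {
	h := &history{}
	h.push(91)
	h.push(95)

	// Only two points so far, so all(#3) cannot be evaluated yet.
	_, ok := h.lastN(3)
	fmt.Println(ok)

	// A third point arrives and the window is complete.
	h.push(97)
	vs, ok := h.lastN(3)
	fmt.Println(vs, ok)
}
```

With fewer than 3 points the sketch reports isEnough = false, which is the case in which judgeItemWithStrategy() returns early without judging.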
2. How fn is generated
fn is generated from the string configured in the alert strategy. For example, with the configuration all(#3) > 90:
- str = all(#3)
- operator = >
- rightValue = 90
// modules/judge/store/func.go
// @str: e.g. all(#3) sum(#3) avg(#10)
// @operator: > <
func ParseFuncFromString(str string, operator string, rightValue float64) (fn Function, err error) {
    if str == "" {
        return nil, fmt.Errorf("func can not be null!")
    }
    // idx points at "#": the function name ends before "(#",
    // and the arguments sit between "#" and the closing ")"
    idx := strings.Index(str, "#")
    args, err := atois(str[idx+1 : len(str)-1])
    if err != nil {
        return nil, err
    }
    switch str[:idx-1] {
    case "max":
        fn = &MaxFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "min":
        fn = &MinFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "all":
        fn = &AllFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "sum":
        fn = &SumFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    case "avg":
        fn = &AvgFunction{Limit: args[0], Operator: operator, RightValue: rightValue}
    ......
    default:
        err = fmt.Errorf("not_supported_method")
    }
    return
}
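The index arithmetic in ParseFuncFromString() can be traced with a small sketch. splitFunc is a hypothetical helper written for this illustration and is not part of judge. It assumes a single integer argument, uses strconv.Atoi in place of the atois helper (whose source is not shown here), and adds shape checks that the excerpt above does not have.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// splitFunc splits a strategy function string such as "all(#3)" into
// its name ("all") and its point count (3).
func splitFunc(str string) (name string, limit int, err error) {
	idx := strings.Index(str, "#")
	// Expect at least one name character and "(" before "#",
	// and at least one character plus ")" after it.
	if idx < 2 || len(str) < idx+3 || str[idx-1] != '(' || str[len(str)-1] != ')' {
		return "", 0, fmt.Errorf("malformed func: %q", str)
	}
	name = str[:idx-1] // text before "(#"
	// Text between "#" and the closing ")".
	limit, err = strconv.Atoi(str[idx+1 : len(str)-1])
	if err != nil {
		return "", 0, err
	}
	return name, limit, nil
}

func main() {
	name, limit, err := splitFunc("all(#3)")
	fmt.Println(name, limit, err)

	// A string without "(#...)" is rejected instead of being sliced.
	_, _, err = splitFunc("all")
	fmt.Println(err != nil)
}
```

For all(#3), idx is 4, so str[:idx-1] yields all and str[idx+1 : len(str)-1] yields 3, matching the slices taken in ParseFuncFromString().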
The returned Function is an interface type; AllFunction, AvgFunction and the other function types all implement it:
// modules/judge/store/func.go
type Function interface {
    Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool)
}
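As a rough illustration of how a type satisfies such an interface, the sketch below uses a simplified interface over a plain slice. The names computer and avgGreater are hypothetical and do not appear in judge; the sketch handles only the ">" operator and is not the actual AvgFunction, whose source is not shown in this article.

```go
package main

import "fmt"

// computer is a simplified, hypothetical version of judge's Function
// interface: it takes the history as a plain slice, newest point first.
type computer interface {
	Compute(vs []float64) (leftValue float64, isTriggered bool, isEnough bool)
}

// avgGreater is a hypothetical implementation that triggers when the
// average of the newest Limit points is greater than RightValue.
type avgGreater struct {
	Limit      int
	RightValue float64
}

// Compute averages the newest Limit points and compares the result
// with RightValue; it reports isEnough = false when points are missing.
func (f avgGreater) Compute(vs []float64) (leftValue float64, isTriggered bool, isEnough bool) {
	if f.Limit < 1 || len(vs) < f.Limit {
		return 0, false, false
	}
	sum := 0.0
	for _, v := range vs[:f.Limit] {
		sum += v
	}
	leftValue = sum / float64(f.Limit)
	return leftValue, leftValue > f.RightValue, true
}

func main() {
	// The caller only depends on the interface, not the concrete type.
	var fn computer = avgGreater{Limit: 3, RightValue: 90}
	fmt.Println(fn.Compute([]float64{95, 92, 89})) // average is 92
	fmt.Println(fn.Compute([]float64{95}))         // too few points
}
```

Because the caller holds only the interface value, the same calling code works for any function type chosen by the parser.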
3. How fn is computed
Each fn is computed inside the Compute() method of its implementation.
AllFunction requires every one of the most recent points to satisfy the condition:
func (this AllFunction) Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool) {
    vs, isEnough = L.HistoryData(this.Limit)
    if !isEnough {
        return
    }
    isTriggered = true
    for i := 0; i < this.Limit; i++ {
        isTriggered = checkIsTriggered(vs[i].Value, this.Operator, this.RightValue)
        if !isTriggered {
            break
        }
    }
    leftValue = vs[0].Value
    return
}
checkIsTriggered() is a simple numeric comparison:
// modules/judge/store/func.go
func checkIsTriggered(leftValue float64, operator string, rightValue float64) (isTriggered bool) {
    switch operator {
    case "=", "==":
        isTriggered = math.Abs(leftValue-rightValue) < 0.0001
    case "!=":
        isTriggered = math.Abs(leftValue-rightValue) > 0.0001
    case "<":
        isTriggered = leftValue < rightValue
    case "<=":
        isTriggered = leftValue <= rightValue
    case ">":
        isTriggered = leftValue > rightValue
    case ">=":
        isTriggered = leftValue >= rightValue
    }
    return
}
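The reason "=" and "==" compare against a tolerance rather than using == directly can be seen in a short sketch. approxEqual is a hypothetical helper that copies only the 0.0001 tolerance check from the code above.

```go
package main

import (
	"fmt"
	"math"
)

// approxEqual mirrors the tolerance check used for "=" and "==" above.
func approxEqual(leftValue, rightValue float64) bool {
	return math.Abs(leftValue-rightValue) < 0.0001
}

func main() {
	// Variables (not constants) so the addition happens in float64.
	a, b := 0.1, 0.2
	sum := a + b

	fmt.Println(sum == 0.3)            // exact comparison
	fmt.Println(approxEqual(sum, 0.3)) // tolerance comparison
}
```

The exact comparison fails because of float64 rounding, while the tolerance comparison succeeds. Two details follow from the switch in checkIsTriggered(): a difference of exactly 0.0001 triggers neither "=" nor "!=", and an operator that matches no case leaves isTriggered at false.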
MaxFunction requires the maximum of the most recent N points to satisfy the threshold:
func (this MaxFunction) Compute(L *SafeLinkedList) (vs []*model.HistoryData, leftValue float64, isTriggered bool, isEnough bool) {
    vs, isEnough = L.HistoryData(this.Limit)
    if !isEnough {
        return
    }
    // first compute the maximum
    max := vs[0].Value
    for i := 1; i < this.Limit; i++ {
        if max < vs[i].Value {
            max = vs[i].Value
        }
    }
    leftValue = max
    // then check whether the maximum crosses the threshold
    isTriggered = checkIsTriggered(leftValue, this.Operator, this.RightValue)
    return
}
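To compare the strategies side by side, the sketch below applies them to the same three sample points. The helpers allAbove, maxOf and minOf are hypothetical, work on a plain slice, and handle only fixed comparisons; the sample values are made up for illustration.

```go
package main

import "fmt"

// allAbove reports whether every point is greater than threshold.
// It returns true for an empty slice.
func allAbove(vs []float64, threshold float64) bool {
	for _, v := range vs {
		if !(v > threshold) {
			return false
		}
	}
	return true
}

// maxOf returns the largest point; vs must be non-empty.
func maxOf(vs []float64) float64 {
	m := vs[0]
	for _, v := range vs[1:] {
		if v > m {
			m = v
		}
	}
	return m
}

// minOf returns the smallest point; vs must be non-empty.
func minOf(vs []float64) float64 {
	m := vs[0]
	for _, v := range vs[1:] {
		if v < m {
			m = v
		}
	}
	return m
}

func main() {
	vs := []float64{95, 88, 97} // sample data, newest point first

	fmt.Println("all(#3)>90:", allAbove(vs, 90)) // 88 breaks the condition
	fmt.Println("max(#3)>90:", maxOf(vs) > 90)   // the maximum is 97
	fmt.Println("min(#3)<10:", minOf(vs) < 10)   // the minimum is 88
}
```

On this data only max(#3)>90 triggers: a single point at 88 is enough to stop all(#3)>90, while max looks at one extreme value only.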