Open-falcon aggregator: how expressions are parsed and computed, and how to optimize it

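Using the two aggregation rules below as examples, this post walks through how the aggregator parses and computes its expressions, and points out where the process can be optimized.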

    # Average of cpu.used.percent
    numerator:   $(cpu.used.percent)
    denominator: $#

    # Number of machines whose cpu.idle + cpu.busy adds up to 100 (the rule tests > 90)
    numerator:   ($(cpu.idle) + $(cpu.busy)) > 90
    denominator: 1

1. Parsing
numeratorStr and denominatorStr are the expression strings for the numerator and the denominator.
An expression that contains $( needs to be computed:
    // $(cpu.used.percent): true
    // $#: false
    // ($(cpu.idle) + $(cpu.busy)) > 90: true
    // 1: false
    needComputeNumerator := needCompute(numeratorStr)
    needComputeDenominator := needCompute(denominatorStr)

    func needCompute(val string) bool {
        return strings.Contains(val, "$(")
    }

If an expression does not need to be computed, the returned numeratorOperands, numeratorOperators, and numeratorComputeMode are all empty:
    // $(cpu.used.percent) returns [cpu.used.percent] [] ""
    // ($(cpu.idle) + $(cpu.busy)) > 90 returns [cpu.idle cpu.busy] [+] ">90"
    // (assuming whitespace has been stripped from the expression beforehand)
    numeratorOperands, numeratorOperators, numeratorComputeMode := parse(numeratorStr, needComputeNumerator)
    denominatorOperands, denominatorOperators, denominatorComputeMode := parse(denominatorStr, needComputeDenominator)

    func parse(expression string, needCompute bool) (operands []string, operators []string, computeMode string) {
        if !needCompute {
            return
        }
        // Split on runs of '$', '(' and ')'.
        splitCounter, _ := regexp.Compile(`[\$\(\)]+`)
        items := splitCounter.Split(expression, -1)
        count := len(items)
        // Between the first and last item, operands and operators alternate.
        for i, val := range items[1 : count-1] {
            if i%2 == 0 {
                operands = append(operands, val)
            } else {
                operators = append(operators, val)
            }
        }
        // The trailing item is the comparison, e.g. ">90", or "" if there is none.
        computeMode = items[count-1]
        return
    }
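To make the split step concrete, here is a minimal sketch that runs the same regular expression over both example expressions. The helper names stripSpaces and splitExpr are hypothetical, and stripping whitespace before splitting is an assumption made so that the operator comes out as "+" rather than " + ".

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitCounter matches runs of '$', '(' and ')', the same pattern parse() uses.
var splitCounter = regexp.MustCompile(`[\$\(\)]+`)

// stripSpaces is a hypothetical helper: it removes all whitespace so that
// operators and the compare mode contain no padding.
func stripSpaces(s string) string {
	return strings.Join(strings.Fields(s), "")
}

// splitExpr returns the raw pieces that parse() would iterate over.
func splitExpr(expression string) []string {
	return splitCounter.Split(stripSpaces(expression), -1)
}

func main() {
	exprs := []string{
		"$(cpu.used.percent)",
		"($(cpu.idle) + $(cpu.busy)) > 90",
	}
	for _, e := range exprs {
		fmt.Printf("%q -> %q\n", e, splitExpr(e))
	}
}
```

The first expression splits into three pieces (an empty string, cpu.used.percent, and an empty string), so the compute mode is empty. The second splits into five pieces (an empty string, cpu.idle, +, cpu.busy, and >90), which is why parse() skips the first item, alternates operands and operators in the middle, and takes the last item as the compute mode.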

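2. Computation
First query the list of hosts in the aggregation group, then query the latest metric values for those hosts and store them in a map for use during aggregation: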
    hostnames, err := sdk.HostnamesByID(item.GroupId)
    // numeratorOperands and denominatorOperands hold the metric names that appear in the expressions
    valueMap, err := queryCounterLast(numeratorOperands, denominatorOperands, hostnames, now-int64(item.Step*2), now)

The aggregation code iterates over the hosts and ends up with numerator, denominator, and validCount:
    for _, hostname := range hostnames {
        if needComputeNumerator {
            numeratorVal, err = compute(numeratorOperands, numeratorOperators, numeratorComputeMode, hostname, valueMap)
        }
        if needComputeDenominator {
            denominatorVal, err = compute(denominatorOperands, denominatorOperators, denominatorComputeMode, hostname, valueMap)
        }
        numerator += numeratorVal
        denominator += denominatorVal
        validCount += 1
    }

    if !needComputeNumerator {
        if numeratorStr == "$#" {
            numerator = float64(validCount)
        } else {
            numerator, err = strconv.ParseFloat(numeratorStr, 64)
        }
    }
    if !needComputeDenominator {
        if denominatorStr == "$#" {
            denominator = float64(validCount)
        } else {
            denominator, err = strconv.ParseFloat(denominatorStr, 64)
        }
    }

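Aggregation for the first rule:
  • Numerator: $(cpu.used.percent) needs to be computed. The value of cpu.used.percent is looked up for each host and added to numerator;
  • Denominator: $# does not need to be computed. It is the number of hosts counted during the loop, stored in denominator;
  • So numerator/denominator = the average cpu.used.percent across all hosts.
Aggregation for the second rule:
  • Numerator: ($(cpu.idle) + $(cpu.busy)) > 90 is evaluated per host. If cpu.idle + cpu.busy > 90 holds the host contributes 1, otherwise 0, and the result is added to numerator;
  • Denominator: 1 does not need to be computed, so denominator = 1;
  • So numerator/denominator = the number of hosts for which cpu.idle + cpu.busy > 90.
How compute() works, using the most complex expression, ($(cpu.idle) + $(cpu.busy)) > 90, as the example:
  • operands: there are two operands, cpu.idle and cpu.busy;
  • operators: there is one operator, +;
  • computeMode = ">90": a boolean test.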
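The two aggregations can be checked by hand with a small worked example. This is a simplified sketch, not the aggregator's code: the hostMetrics type, the sample hosts, and the function names are made up for illustration, and skipping hosts that lack a value in average is an assumption.

```go
package main

import "fmt"

// hostMetrics is hypothetical sample data: host -> metric name -> latest value.
type hostMetrics map[string]map[string]float64

// average mirrors rule 1: sum $(metric) over the hosts, then divide by $#,
// the number of hosts that were counted.
func average(data hostMetrics, metric string) float64 {
	var numerator float64
	validCount := 0
	for _, metrics := range data {
		v, ok := metrics[metric]
		if !ok {
			continue // assumption: hosts without a value are not counted
		}
		numerator += v
		validCount++
	}
	if validCount == 0 {
		return 0
	}
	return numerator / float64(validCount)
}

// countAbove mirrors rule 2: each host contributes 1 to the numerator when
// a + b > threshold, and the denominator is the constant 1.
func countAbove(data hostMetrics, a, b string, threshold float64) float64 {
	var numerator float64
	for _, metrics := range data {
		if metrics[a]+metrics[b] > threshold {
			numerator++
		}
	}
	denominator := 1.0
	return numerator / denominator
}

// sample returns three made-up hosts; host-c has idle + busy = 80.
func sample() hostMetrics {
	return hostMetrics{
		"host-a": {"cpu.used.percent": 20, "cpu.idle": 80, "cpu.busy": 20},
		"host-b": {"cpu.used.percent": 40, "cpu.idle": 60, "cpu.busy": 40},
		"host-c": {"cpu.used.percent": 60, "cpu.idle": 30, "cpu.busy": 50},
	}
}

func main() {
	data := sample()
	fmt.Println(average(data, "cpu.used.percent"))           // (20+40+60)/3
	fmt.Println(countAbove(data, "cpu.idle", "cpu.busy", 90)) // host-a and host-b qualify
}
```

With this sample data the first rule yields 40 and the second yields 2, since only host-a and host-b have cpu.idle + cpu.busy above 90.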
    func compute(operands []string, operators []string, computeMode string, hostname string, valMap map[string]float64) (val float64, err error) {
        ....
        vals := queryOperands(operands, hostname, valMap)
        if len(vals) != count {
            return val, errors.New("value invalid")
        }
        sum := vals[0]
        // Only the + and - operators are supported.
        for i, v := range vals[1:] {
            if operators[i] == "+" {
                sum += v
            } else {
                sum -= v
            }
        }
        // If there is a boolean test and it succeeds, val = 1.
        if computeMode != "" {
            if compareSum(sum, computeMode) {
                val = 1
            }
        } else {
            val = sum
        }
        return val, nil
    }

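The boolean comparison is also simple: sum is the accumulated value, computeMode is ">90", and the comparison operator is obtained by parsing that string: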
    func compareSum(sum float64, computeMode string) bool {
        regMatch, _ := regexp.Compile(`([><=]+)([\d\.]+)`)
        match := regMatch.FindStringSubmatch(computeMode)

        mode := match[1]
        val, _ := strconv.ParseFloat(match[2], 64)

        switch {
        case mode == ">" && sum > val:
        case mode == "<" && sum < val:
        case mode == "=" && sum == val:
        case mode == ">=" && sum >= val:
        case mode == "<=" && sum <= val:
        default:
            return false
        }
        return true
    }
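Note that compareSum indexes match[1] without checking whether the regular expression matched, and it ignores the ParseFloat error. The following is a sketch of a more defensive variant, not the aggregator's own code: the name compareSumChecked, the error return, and the ^...$ anchors on the pattern are all assumptions added for illustration.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// modePattern is the same pattern as in compareSum, anchored (an assumption)
// so that trailing garbage is rejected.
var modePattern = regexp.MustCompile(`^([><=]+)([\d\.]+)$`)

// compareSumChecked is a hypothetical variant of compareSum that reports
// malformed compute modes as errors instead of panicking or returning false.
func compareSumChecked(sum float64, computeMode string) (bool, error) {
	match := modePattern.FindStringSubmatch(computeMode)
	if match == nil {
		return false, fmt.Errorf("unsupported compute mode %q", computeMode)
	}
	val, err := strconv.ParseFloat(match[2], 64)
	if err != nil {
		return false, fmt.Errorf("bad threshold in %q: %w", computeMode, err)
	}
	switch match[1] {
	case ">":
		return sum > val, nil
	case "<":
		return sum < val, nil
	case "=":
		return sum == val, nil
	case ">=":
		return sum >= val, nil
	case "<=":
		return sum <= val, nil
	default:
		return false, fmt.Errorf("unsupported comparison operator %q", match[1])
	}
}

func main() {
	for _, mode := range []string{">90", "<=90", "=>90", "abc"} {
		ok, err := compareSumChecked(100, mode)
		fmt.Println(mode, ok, err)
	}
}
```

Returning an error lets the caller tell a failed comparison apart from a rule that was written incorrectly, at the cost of a slightly wider function signature.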

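3. Optimization
The parsing and computation above work by matching strings against a fixed format. For example, a regular expression parses >90, yielding first the operator > and then the operand 90.
An implementation could use an expression engine instead, such as github.com/antonmedv/expr. That removes the need to handle the expression syntax yourself, so attention can go to the business logic. The drawback is that a general-purpose expression engine may run more slowly than hand-written parsing.
A demo using the expression engine github.com/antonmedv/expr: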
    env := map[string]interface{}{
        "cpuIdle": 30,
        "cpuBusy": 40,
    }
    code := `( cpuIdle + cpuBusy ) > 50`
    program, err := expr.Compile(code, expr.Env(env))
    output, err := expr.Run(program, env)
