golang|go-kit 上手之example stringsvc2 添加日志和监控


  • 日志
  • 添加instrument
  • 对于Histograms和Summary的区别
  • 主函数
  • 错误总结
    • 用logMiddleware和instrumentingMiddleware对svc进行封装
    • label misss error
    • inconsistent label cardinality
  • 查看metrics结果
    • 结果分析
      • thread数
      • count接口返回结果的统计
      • 接口uppercase 和count接口的请求次数统计
      • 接口uppercase 和count接口的请求时间统计

//StringService的结构体定义、以及输入输出参数结构定义和stringsvc1中是一样的
日志 //loggingMiddleware 这个方法的缺点在于无论是uppercase还是count参数都是uppercase,通用的就没有办法针对每个功能进行定制日志。
//由于StringService是一个接口,所以我们只需要定义一个新的类型包装之前的StringService,实现StringService的接口,在实现过程中加入log。
type logMiddleware struct { logger log.Logger nextStringService }func (mw logMiddleware) Uppercase(ctx context.Context, s string) (output string, err error) { defer func(begin time.Time) { mw.logger.Log( "logmethod", "uppercase", "input", s, "output", output, "err", err, "took", time.Since(begin), ) }(time.Now())output, err = mw.next.Uppercase(ctx, s) return }func (mw logMiddleware) Count(ctx context.Context, s string) (n int) { defer func(begin time.Time) { mw.logger.Log( "logmethod", "uppercase", "input", s, "output", n, // "err", err, "took", time.Since(begin), ) }(time.Now())n = mw.next.Count(ctx, s) return }

添加instrument
instrumentation means using package metrics to record statistics about your service’s runtime behavior. Counting the number of jobs processed, recording the duration of requests after they’ve finished, and tracking the number of in-flight operations would all be considered instrumentation.
说白了就是对服务运行情况的统计
type instrumentingMiddleware struct { requestCountmetrics.Counter requestLatency metrics.Histogram //官网文档这里似乎错了 countResultmetrics.Histogram nextStringService }func (mw instrumentingMiddleware) Uppercase(ctx context.Context, s string) (output string, err error) { defer func(begin time.Time) { //官网文档是 metrics.Field 似乎是错的 //methodFiled 要和 main中的fieldKeys保持一致 methodField := []string{"method", "uppercase", "error", fmt.Sprintf("%T", err != nil)} mw.requestCount.With(methodField...).Add(1) mw.requestLatency.With(methodField...).Observe(time.Since(begin).Seconds())}(time.Now())output, err = mw.next.Uppercase(ctx, s) return }func (mw instrumentingMiddleware) Count(ctx context.Context, s string) (n int) { defer func(begin time.Time) { //官网文档里面是 metrics.Field 似乎是错的 methodField := []string{"method", "count", "error", "false"} mw.requestCount.With(methodField...).Add(1) mw.requestLatency.With(methodField...).Observe(time.Since(begin).Seconds()) //文档里面是int64是错的, mw.countResult.Observe(float64(n))}(time.Now())n = mw.next.Count(ctx, s) return }

对于Histograms和Summary的区别
// Note that Histograms, in contrast to Summaries, can be aggregated with the // Prometheus query language (see the documentation for detailed // procedures). However, Histograms require the user to pre-define suitable // buckets, and they are in general less accurate. The Observe method of a // Histogram has a very low performance overhead in comparison with the Observe // method of a Summary.// A Summary captures individual observations from an event or sample stream and // summarizes them in a manner similar to traditional summary statistics: 1. sum // of observations, 2. observation count, 3. rank estimations. // // A typical use-case is the observation of request latencies. By default, a // Summary provides the median, the 90th and the 99th percentile of the latency // as rank estimations. However, the default behavior will change in the // upcoming v0.10 of the library. There will be no rank estiamtions at all by // default. For a sane transition, it is recommended to set the desired rank // estimations explicitly. // // Note that the rank estimations cannot be aggregated in a meaningful way with // the Prometheus query language (i.e. you cannot average or add them). If you // need aggregatable quantiles (e.g. you want the 99th percentile latency of all // queries served across all instances of a service), consider the Histogram // metric type. See the Prometheus documentation for more details.

更加详细的参见代码文档中的说明
主函数
func main() { logger := log.NewLogfmtLogger(os.Stderr) //注意这里的 filedkeys要和 methodField 中的一致,不然会报错 //fieldKeys := []string{"metod", "error"} //2017/10/19 18:09:28 http: panic serving [::1]:55246: label name "metod" missing in label mapfieldKeys := []string{"method", "error"}//统计请求次数 requestCount := kitprometheus.NewCounterFrom(stdprometheus.CounterOpts{ Namespace: "my_gropu", Subsystem: "string_service", Name:"request_count", Help:"Number of requests received.", }, fieldKeys)//总结请求的延迟时间 requestLatency := kitprometheus.NewSummaryFrom(stdprometheus.SummaryOpts{ Namespace: "my_gropu", Subsystem: "string_service", Name:"request_latence_microseconds", Help:"Number of requests in misroseconds.", }, fieldKeys)//总结Count()功能返回的结果 countResult := kitprometheus.NewSummaryFrom(stdprometheus.SummaryOpts{ Namespace: "my_gropu", Subsystem: "string_service", Name:"count_result", Help:"The result of each count method.", }, []string{})//svc := stringService{} //cannot use logMiddleware literal (type logMiddleware) as type stringService in assignmentvar svc StringService svc = stringService{} svc = logMiddleware{logger, svc} svc = instrumentingMiddleware{requestCount, requestLatency, countResult, svc}uppercaseHandler := httptransport.NewServer( makeUppercaseEndpoint(svc), decodeUpperCaseRequest, encodeResponse, )countHandler := httptransport.NewServer( makeCountEndpoint(svc), decodeCountRequest, encodeResponse, ) http.Handle("/uppercase", uppercaseHandler) http.Handle("/count", countHandler) http.Handle("/metrics", promhttp.Handler()) logger.Log("msg", "HTTP", "addr", ":8082") logger.Log("err", http.ListenAndServe(":8082", nil)) }

错误总结 用logMiddleware和instrumentingMiddleware对svc进行封装 下面这段代码
var svc StringService svc = stringService{} svc = logMiddleware{logger, svc} svc = instrumentingMiddleware{requestCount, requestLatency, countResult, svc}

如果写成
svc := stringService{} svc = logMiddleware{logger, svc} svc = instrumentingMiddleware{requestCount, requestLatency, countResult, svc}

报错
golang|go-kit 上手之example stringsvc2 添加日志和监控
文章图片

//cannot use logMiddleware literal (type logMiddleware) as type stringService in assignment

这是因为StrinService是一个接口,所以 stringService,logMiddleware,instrumentingMiddleware都能给它赋值。
但是如果定义为具体的stringService,那么logMiddleware赋值给它就会类型不一致报错了。
label misss error 在main函数里面
fieldKeys := []string{"method", "error"}

【golang|go-kit 上手之example stringsvc2 添加日志和监控】如果写为
fieldKeys := []string{"metod", "error"}

那么就会报错
msg=HTTP addr=:8082 logmethod=uppercase input=Hello output=HELLO err=null took=2.531μs 2017/10/19 18:09:28 http: panic serving [::1]:55246: label name "metod" missing in label map goroutine 40 [running]: net/http.(*conn).serve.func1(0xc4200d0600) /usr/local/Cellar/go/1.7.5/libexec/src/net/http/server.go:1491 +0x12a

inconsistent label cardinality 在main函数里countResult对应的instrument声明如下:
//总结Count()功能返回的结果 countResult := kitprometheus.NewSummaryFrom(stdprometheus.SummaryOpts{ Namespace: "my_gropu", Subsystem: "string_service", Name:"count_result", Help:"The result of each count method.", }, []string{})

countResult 设置的field为[]string{} 但是在instrumentingMiddleware中Count()如果使用下面语句记录

mw.countResult.With(methodField...).Observe(float64(n))

就会报错:
2017/10/19 17:26:49 http: panic serving [::1]:54929: inconsistent label cardinality goroutine 25 [running]: net/http.(*conn).serve.func1(0xc420122580) /usr/local/Cellar/go/1.7.5/libexec/src/net/http/server.go:1491 +0x12a panic(0x2c70a0, 0xc42000d850) /usr/local/Cellar/go/1.7.5/libexec/src/runtime/panic.go:458 +0x243 github.com/prometheus/client_golang/prometheus.(*metricVec).with(0xc42011c100, 0xc420117710, 0xc42011cef0, 0xc420117710) /Users/kes/goproject/src/github.com/prometheus/client_golang/prometheus/vec.go:103 +0x88

查看metrics结果 分别调用3次uppsercase count接口
Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/count {"v":5} Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/count {"v":5} Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/count {"v":5} Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/uppercase {"v":"HELLO"} Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/uppercase {"v":"HELLO"} Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/uppercase

然后查看统计结果如下:
Sean-MacBook-Air:sringsrv1 kes$ curl -XPOST -d'{"s":"Hello"}' localhost:8082/metrics # HELP go_gc_duration_seconds A summary of the GC invocation durations. # TYPE go_gc_duration_seconds summary go_gc_duration_seconds{quantile="0"} 0 go_gc_duration_seconds{quantile="0.25"} 0 go_gc_duration_seconds{quantile="0.5"} 0 go_gc_duration_seconds{quantile="0.75"} 0 go_gc_duration_seconds{quantile="1"} 0 go_gc_duration_seconds_sum 0 go_gc_duration_seconds_count 0 # HELP go_goroutines Number of goroutines that currently exist. # TYPE go_goroutines gauge go_goroutines 8 # HELP go_info Information about the Go environment. # TYPE go_info gauge go_info{version="go1.7.5"} 1 # HELP go_memstats_alloc_bytes Number of bytes allocated and still in use. # TYPE go_memstats_alloc_bytes gauge go_memstats_alloc_bytes 745448 # HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed. # TYPE go_memstats_alloc_bytes_total counter go_memstats_alloc_bytes_total 745448 # HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table. # TYPE go_memstats_buck_hash_sys_bytes gauge go_memstats_buck_hash_sys_bytes 1.442694e+06 # HELP go_memstats_frees_total Total number of frees. # TYPE go_memstats_frees_total counter go_memstats_frees_total 183 # HELP go_memstats_gc_cpu_fraction The fraction of this program's available CPU time used by the GC since the program started. # TYPE go_memstats_gc_cpu_fraction gauge go_memstats_gc_cpu_fraction 0 # HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata. # TYPE go_memstats_gc_sys_bytes gauge go_memstats_gc_sys_bytes 131072 # HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use. # TYPE go_memstats_heap_alloc_bytes gauge go_memstats_heap_alloc_bytes 745448 # HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used. # TYPE go_memstats_heap_idle_bytes gauge go_memstats_heap_idle_bytes 466944 # HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use. # TYPE go_memstats_heap_inuse_bytes gauge go_memstats_heap_inuse_bytes 1.302528e+06 # HELP go_memstats_heap_objects Number of allocated objects. # TYPE go_memstats_heap_objects gauge go_memstats_heap_objects 6072 # HELP go_memstats_heap_released_bytes Number of heap bytes released to OS. # TYPE go_memstats_heap_released_bytes gauge go_memstats_heap_released_bytes 0 # HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system. # TYPE go_memstats_heap_sys_bytes gauge go_memstats_heap_sys_bytes 1.769472e+06 # HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection. # TYPE go_memstats_last_gc_time_seconds gauge go_memstats_last_gc_time_seconds 0 # HELP go_memstats_lookups_total Total number of pointer lookups. # TYPE go_memstats_lookups_total counter go_memstats_lookups_total 17 # HELP go_memstats_mallocs_total Total number of mallocs. # TYPE go_memstats_mallocs_total counter go_memstats_mallocs_total 6255 # HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures. # TYPE go_memstats_mcache_inuse_bytes gauge go_memstats_mcache_inuse_bytes 4800 # HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system. # TYPE go_memstats_mcache_sys_bytes gauge go_memstats_mcache_sys_bytes 16384 # HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures. # TYPE go_memstats_mspan_inuse_bytes gauge go_memstats_mspan_inuse_bytes 19680 # HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system. # TYPE go_memstats_mspan_sys_bytes gauge go_memstats_mspan_sys_bytes 32768 # HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place. # TYPE go_memstats_next_gc_bytes gauge go_memstats_next_gc_bytes 4.194304e+06 # HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations. # TYPE go_memstats_other_sys_bytes gauge go_memstats_other_sys_bytes 804210 # HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator. # TYPE go_memstats_stack_inuse_bytes gauge go_memstats_stack_inuse_bytes 327680 # HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator. # TYPE go_memstats_stack_sys_bytes gauge go_memstats_stack_sys_bytes 327680 # HELP go_memstats_sys_bytes Number of bytes obtained from system. # TYPE go_memstats_sys_bytes gauge go_memstats_sys_bytes 4.52428e+06 # HELP go_threads Number of OS threads created. # TYPE go_threads gauge go_threads 7 # HELP my_gropu_string_service_count_result The result of each count method. # TYPE my_gropu_string_service_count_result summary my_gropu_string_service_count_result{quantile="0.5"} 5 my_gropu_string_service_count_result{quantile="0.9"} 5 my_gropu_string_service_count_result{quantile="0.99"} 5 my_gropu_string_service_count_result_sum 15 my_gropu_string_service_count_result_count 3 # HELP my_gropu_string_service_request_count Number of requests received. # TYPE my_gropu_string_service_request_count counter my_gropu_string_service_request_count{error="bool",method="uppercase"} 3 my_gropu_string_service_request_count{error="false",method="count"} 3 # HELP my_gropu_string_service_request_latence_microseconds Number of requests in misroseconds. # TYPE my_gropu_string_service_request_latence_microseconds summary my_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.5"} 4.7236e-05 my_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.9"} 5.4974000000000006e-05 my_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.99"} 5.4974000000000006e-05 my_gropu_string_service_request_latence_microseconds_sum{error="bool",method="uppercase"} 0.00014803100000000002 my_gropu_string_service_request_latence_microseconds_count{error="bool",method="uppercase"} 3 my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.5"} 4.6348000000000004e-05 my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.9"} 7.964e-05 my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.99"} 7.964e-05 my_gropu_string_service_request_latence_microseconds_sum{error="false",method="count"} 0.000169707 my_gropu_string_service_request_latence_microseconds_count{error="false",method="count"} 3

结果分析 如果分析有误还请大家指正。
thread数
记录系统起的go_threads 数,由于main本身算一个,然后3次uppercase,3次count,go中每个handler都会单独起个goroutine,所以一共7个。
# HELP go_threads Number of OS threads created. # TYPE go_threads gauge go_threads 7

/count接口返回结果的统计
用summary统计的/count接口的返回结果
# HELP my_gropu_string_service_count_result The result of each count method. # TYPE my_gropu_string_service_count_result summary my_gropu_string_service_count_result{quantile="0.5"} 5 my_gropu_string_service_count_result{quantile="0.9"} 5 my_gropu_string_service_count_result{quantile="0.99"} 5 my_gropu_string_service_count_result_sum 15 my_gropu_string_service_count_result_count 3

/count接口3次请求的结果都是5,和为15.
接口/uppercase 和/count接口的请求次数统计
/uppercase请求了3次,/count请求了3次
# HELP my_gropu_string_service_request_count Number of requests received. # TYPE my_gropu_string_service_request_count counter my_gropu_string_service_request_count{error="bool",method="uppercase"} 3 my_gropu_string_service_request_count{error="false",method="count"} 3

接口/uppercase 和/count接口的请求时间统计
# HELP my_gropu_string_service_request_latence_microseconds Number of requests in misroseconds. # TYPE my_gropu_string_service_request_latence_microseconds summarymy_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.5"} 4.7236e-05 my_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.9"} 5.4974000000000006e-05 my_gropu_string_service_request_latence_microseconds{error="bool",method="uppercase",quantile="0.99"} 5.4974000000000006e-05 my_gropu_string_service_request_latence_microseconds_sum{error="bool",method="uppercase"} 0.00014803100000000002 my_gropu_string_service_request_latence_microseconds_count{error="bool",method="uppercase"} 3my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.5"} 4.6348000000000004e-05 my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.9"} 7.964e-05 my_gropu_string_service_request_latence_microseconds{error="false",method="count",quantile="0.99"} 7.964e-05 my_gropu_string_service_request_latence_microseconds_sum{error="false",method="count"} 0.000169707 my_gropu_string_service_request_latence_microseconds_count{error="false",method="count"} 3

数据展示了每次请求的时间以及总时间和请求次数

    推荐阅读