Azure China Application Gateway Performance Monitoring

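At present, the Azure Portal in the China region does not support diagnostic logging, display, or alerting for Application Gateway. The Application Gateway documentation on www.azure.cn once included an article on configuring diagnostic logs, but it was a direct translation of the Azure Global article and has since been removed.
This means that Chinese customers who run websites or APIs behind AppGW cannot monitor AppGW usage directly in order to scale capacity appropriately. The only way to see the data is to open a ticket with 21V (21Vianet) and ask them to pull performance charts, which is far from real time and inconvenient. Customers have complained about this many times.
Last weekend, in response to strong customer demand, Microsoft's backend engineering team hotfixed Application Gateway in the China region and enabled part of the global feature set: generating diagnostic logs and writing them to Blob Storage. The rest of this post shares how to use it.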
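1. Enabling and checking Application Gateway diagnostic logs
At present, diagnostic logs for AppGW in the China region can only be enabled with the following PowerShell command: Set-AzureRmDiagnosticSetting -ResourceId <String> -StorageAccountId <String> -Enabled <Boolean>
ResourceId is the resource on which diagnostic logging is to be enabled, in this post the application gateway. StorageAccountId is the storage account the logs are written to, chosen by the user. For example, after logging in to my own Azure China test subscription in PowerShell, I enabled diagnostic logging on an AppGW as follows: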
PS C:\Users\huxu> Set-AzureRmDiagnosticSetting -ResourceId /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/DWTEST01/providers/Microsoft.Network/applicationGateways/aapgwmon01 -StorageAccountId /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/dwtest01/providers/Microsoft.Storage/storageAccounts/appgwtest -Enabled $true

StorageAccountId   : /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/dwtest01/providers/Microsoft.Storage/storageAccounts/appgwtest
ServiceBusRuleId   :
StorageAccountName :
Metrics
    Enabled   : True
    Timegrain : PT1M
    RetentionPolicy
        Enabled : False
        Days    : 0
Logs
    Enabled  : True
    Category : ApplicationGatewayAccessLog
    RetentionPolicy
        Enabled : False
        Days    : 0
    Enabled  : True
    Category : ApplicationGatewayPerformanceLog
    RetentionPolicy
        Enabled : False
        Days    : 0
    Enabled  : True
    Category : ApplicationGatewayFirewallLog
    RetentionPolicy
        Enabled : False
        Days    : 0
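Both -ResourceId and -StorageAccountId take full Azure Resource Manager IDs of the form /subscriptions/<subscription>/resourceGroups/<group>/providers/<namespace>/<type>/<name>, as the command above shows. A minimal Python sketch for assembling them when scripting; the helper name arm_resource_id is hypothetical and is not part of any Azure SDK:

```python
def arm_resource_id(subscription_id, resource_group, provider, resource_type, name):
    """Build a full ARM resource ID (hypothetical helper, not an Azure SDK call)."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/{provider}/{resource_type}/{name}"
    )


# Values taken from the command above
SUBSCRIPTION = "3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0"
gateway_id = arm_resource_id(
    SUBSCRIPTION, "DWTEST01", "Microsoft.Network", "applicationGateways", "aapgwmon01")
storage_id = arm_resource_id(
    SUBSCRIPTION, "dwtest01", "Microsoft.Storage", "storageAccounts", "appgwtest")
print(gateway_id)
print(storage_id)
```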
 
After a while, the generated log files appear in this storage account.

[Image: log files generated in the storage account]

The log files are in JSON format. The access log looks like this:
{
  "records": [
    {
      "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
      "operationName": "ApplicationGatewayAccess",
      "time": "2017-06-12T06:50:23Z",
      "category": "ApplicationGatewayAccessLog",
      "properties": {
        "instanceId": "ApplicationGatewayRole_IN_1",
        "clientIP": "211.156.223.30",
        "clientPort": 44107,
        "httpMethod": "POST",
        "requestUri": "/expressApi/ems/managePush",
        "requestQuery": "X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&X-AzureApplicationGateway-LOG-ID=9e84951a-62ca-4903-b66a-86a8cff8eee8&SERVER-STATUS=200",
        "userAgent": "java/1.6.0_20",
        "httpStatus": 200,
        "httpVersion": "HTTP/1.1",
        "receivedBytes": 668,
        "sentBytes": 335,
        "timeTaken": 80,
        "sslEnabled": "off"
      }
    },
    {
      "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
      "operationName": "ApplicationGatewayAccess",
      "time": "2017-06-12T06:50:22Z",
      "category": "ApplicationGatewayAccessLog",
      "properties": {
        "instanceId": "ApplicationGatewayRole_IN_2",
        "clientIP": "211.156.223.30",
        "clientPort": 43661,
        "httpMethod": "POST",
        "requestUri": "/expressApi/ems/managePush",
        "requestQuery": "X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&X-AzureApplicationGateway-LOG-ID=9b79883f-639a-45c1-aa26-4797295b62a0&SERVER-STATUS=200",
        "userAgent": "Java/1.6.0_20",
        "httpStatus": 200,
        "httpVersion": "HTTP/1.1",
        "receivedBytes": 668,
        "sentBytes": 335,
        "timeTaken": 125,
        "sslEnabled": "off"
      }
    }
  ]
}
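Since each access log record carries httpStatus and timeTaken, a few lines of Python are enough for a quick summary. The following is a minimal sketch using only the standard library; the function name summarize_access_log and the trimmed sample records are illustrative assumptions, and only the field names come from the log shown above:

```python
import json
from collections import Counter


def summarize_access_log(log):
    """Return (status code counts, mean timeTaken) for a parsed access log."""
    props = [r["properties"] for r in log["records"]
             if r.get("category") == "ApplicationGatewayAccessLog"]
    status_counts = Counter(p["httpStatus"] for p in props)
    mean_time = sum(p["timeTaken"] for p in props) / len(props) if props else None
    return status_counts, mean_time


# Trimmed-down sample that keeps only the fields used here
sample = json.loads("""
{"records": [
  {"category": "ApplicationGatewayAccessLog",
   "properties": {"httpStatus": 200, "timeTaken": 80}},
  {"category": "ApplicationGatewayAccessLog",
   "properties": {"httpStatus": 200, "timeTaken": 125}}
]}
""")
status_counts, mean_time = summarize_access_log(sample)
print(dict(status_counts), mean_time)
```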

 
The performance log looks like this:

{
  "records": [
    {
      "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
      "operationName": "ApplicationGatewayPerformance",
      "time": "2017-06-12T06:54:00Z",
      "category": "ApplicationGatewayPerformanceLog",
      "properties": {
        "instanceId": "ApplicationGatewayRole_IN_2",
        "healthyHostCount": 2,
        "unHealthyHostCount": 0,
        "requestCount": 0,
        "latency": 22,
        "failedRequestCount": 0,
        "throughput": 0
      }
    },
    {
      "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
      "operationName": "ApplicationGatewayPerformance",
      "time": "2017-06-12T06:54:00Z",
      "category": "ApplicationGatewayPerformanceLog",
      "properties": {
        "instanceId": "ApplicationGatewayRole_IN_0",
        "healthyHostCount": 2,
        "unHealthyHostCount": 0,
        "requestCount": 0,
        "latency": 6,
        "failedRequestCount": 0,
        "throughput": 0
      }
    }
  ]
}

 
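At present the China-region feature is a stripped-down version: application gateway logs cannot be managed or displayed in the portal, and custom metrics such as per-instance CPU or memory are not available. Only the default metrics in the logs above can be seen. Performance log records are generated once per minute for every instance. requestCount and throughput are per-second averages over the past minute, and latency is the average response latency of the backend servers.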
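To make these fields concrete, the sketch below groups performance records by instance and reports the average latency and the lowest healthy-host count seen. It is an illustration only: summarize_performance is a hypothetical name, the sample is cut down from the log above, and multiplying requestCount by 60 to estimate the number of requests relies on the per-second-average interpretation described in this section.

```python
from collections import defaultdict


def summarize_performance(log):
    """Per-instance summary of a parsed performance log (illustrative helper)."""
    grouped = defaultdict(list)
    for record in log["records"]:
        props = record["properties"]
        grouped[props["instanceId"]].append(props)

    summary = {}
    for instance, rows in grouped.items():
        summary[instance] = {
            "samples": len(rows),
            "avg_latency": sum(r["latency"] for r in rows) / len(rows),
            "min_healthy_hosts": min(r["healthyHostCount"] for r in rows),
            # Assumes requestCount is a per-second average over one minute
            "est_requests": sum(r["requestCount"] * 60 for r in rows),
        }
    return summary


sample = {"records": [
    {"properties": {"instanceId": "ApplicationGatewayRole_IN_2", "healthyHostCount": 2,
                    "requestCount": 0, "latency": 22}},
    {"properties": {"instanceId": "ApplicationGatewayRole_IN_0", "healthyHostCount": 2,
                    "requestCount": 0, "latency": 6}},
]}
for instance, stats in sorted(summarize_performance(sample).items()):
    print(instance, stats)
```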
 
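2. Visualizing Application Gateway diagnostic logs with Python
We only have this much thanks to the engineering team's efforts, and more features will take a long wait. Meanwhile the customer complaints have not stopped: are we supposed to read JSON by eye during system performance tests?
There are many ways to visualize the log files. For example, the official documentation describes converting the JSON into a table and presenting it with Power BI. Here I show how to do it with Python's data-processing and plotting packages, which I find simpler and more flexible.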
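For the table-based route, the JSON first has to be flattened into rows. A minimal standard-library sketch, assuming the performance log layout shown earlier; performance_log_to_csv and the column order are my own choices, not something the official documentation prescribes:

```python
import csv
import io

COLUMNS = ["time", "instanceId", "healthyHostCount", "unHealthyHostCount",
           "requestCount", "latency", "failedRequestCount", "throughput"]


def performance_log_to_csv(log, out):
    """Write one CSV row per performance record to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    for record in log["records"]:
        row = dict(record["properties"])
        row["time"] = record["time"]  # the timestamp lives outside "properties"
        writer.writerow(row)


sample = {"records": [{
    "time": "2017-06-12T06:54:00Z",
    "properties": {"instanceId": "ApplicationGatewayRole_IN_2", "healthyHostCount": 2,
                   "unHealthyHostCount": 0, "requestCount": 0, "latency": 22,
                   "failedRequestCount": 0, "throughput": 0},
}]}
buffer = io.StringIO()
performance_log_to_csv(sample, buffer)
print(buffer.getvalue())
```

To produce a file instead of an in-memory string, pass open('PerformanceLog.csv', 'w', newline='') in place of the StringIO buffer.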
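The first step, of course, is to fetch the log files from Blob Storage with Python. The official documentation covers this in detail, so I won't repeat it here.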
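For readers who want a starting point anyway, the sketch below downloads every blob in a container and merges the records into one dictionary. It rests on several assumptions that do not come from this post: it uses the current azure-storage-blob (v12) package, the container name insights-logs-applicationgatewayperformancelog follows the usual naming of diagnostic log containers and should be checked against your own storage account, each blob is assumed to be a single JSON document with a top-level records list as in the samples above, and the function names are hypothetical.

```python
import json

# Assumed container name; verify it in your own storage account.
CONTAINER = "insights-logs-applicationgatewayperformancelog"


def make_container_client(connection_string, container=CONTAINER):
    """Create a container client (needs `pip install azure-storage-blob`)."""
    from azure.storage.blob import BlobServiceClient  # imported lazily
    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_container_client(container)


def load_records(container_client, prefix=None):
    """Download every blob under `prefix` and merge their "records" lists."""
    records = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        raw = container_client.download_blob(blob.name).readall()
        records.extend(json.loads(raw)["records"])
    return {"records": records}


def save_performance_log(connection_string, path="PerformanceLog.json"):
    """Fetch all performance records and store them in one local JSON file."""
    log = load_records(make_container_client(connection_string))
    with open(path, "w") as f:
        json.dump(log, f)
```

Call save_performance_log with your storage account's connection string. For Azure China the connection string needs EndpointSuffix=core.chinacloudapi.cn, for example "DefaultEndpointsProtocol=https;AccountName=<account>;AccountKey=<key>;EndpointSuffix=core.chinacloudapi.cn", where <account> and <key> are placeholders.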
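Then chart the downloaded JSON file. As an example, the code below plots the latency and the number of healthy backend servers for the instance ApplicationGatewayRole_IN_2: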
 
import datetime as dt
import json
import os

import matplotlib.dates as dat
import matplotlib.pyplot as plt
import matplotlib.ticker as tic
import pandas as pd
from matplotlib.font_manager import FontProperties

INSTANCE = 'ApplicationGatewayRole_IN_2'
TIME_FORMAT = '%Y-%m-%dT%H:%M:%SZ'

# Load the log file
os.chdir('C:\\Work\\AppGW0613')
with open('PerformanceLog.json', 'r') as f:
    data = json.load(f)

# Collect (time, latency, healthyHostCount) for the chosen instance only,
# sorted by time so the line and the tick range run left to right
points = sorted(
    (dt.datetime.strptime(r['time'], TIME_FORMAT),
     r['properties']['latency'],
     r['properties']['healthyHostCount'])
    for r in data['records']
    if r['properties']['instanceId'] == INSTANCE
)
times = [p[0] for p in points]
latencies = [p[1] for p in points]
healthy_hosts = [p[2] for p in points]
ticks = pd.date_range(times[0], times[-1], freq='1min')


def format_time_axis(axis):
    """Put one tick per minute on the x axis and rotate the labels."""
    axis.xaxis.set_major_formatter(dat.DateFormatter('%Y-%m-%d %H:%M:%S'))
    axis.set_xticks(ticks)
    for label in axis.get_xticklabels():
        label.set_rotation(20)
        label.set_horizontalalignment('right')


plt.figure(figsize=(8, 20))

# Latency chart
ax = plt.subplot(211)
ax.plot(times, latencies, 'g')
ax.set_xlabel('Time')
ax.set_ylabel('Latency')
ax.set_ylim(bottom=0, top=25)
ax.yaxis.set_major_locator(tic.MultipleLocator(5))
ax.set_title(INSTANCE, fontproperties=FontProperties(size=14))
format_time_axis(ax)

# Healthy backend server count chart
ax2 = plt.subplot(212)
ax2.plot(times, healthy_hosts, 'r')
ax2.set_xlabel('Time')
ax2.set_ylabel('HealthyHost')
ax2.set_ylim(bottom=0, top=10)
ax2.yaxis.set_major_locator(tic.MultipleLocator(1))
format_time_axis(ax2)

plt.show()

 
 
The result looks like this:
[Image: latency and healthy-host charts for ApplicationGatewayRole_IN_2]

 
Plotting with Python is convenient and easy, although my charts are a bit ugly. I hope this helps :)