Web Scraping | Python in Practice: An Honor of Kings Skin Scraper

Contents

- Method 1
- Method 2

As a longtime Honor of Kings (王者荣耀) player, today I will show you how to fetch the game's skin images with a Python scraper.


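This article covers two ways to scrape Honor of Kings skins: a simple one and a more involved one, so you can learn from both.
First, open the official Honor of Kings website.
Open the browser's developer tools and find the request that loads the hero skins. The herolist.json request is the hero list we are looking for. It contains each hero's ID, name, type, skins, and other information. Copy its URL: http://pvp.qq.com/web201605/js/herolist.json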
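To make the mapping from a hero entry to image URLs concrete, here is a minimal offline sketch. The field names (`ename`, `cname`, `skin_name`) and the URL pattern are the ones used in the article's code. The sample entry is hypothetical and hand-written for illustration, and the helper name `skin_urls` is not part of any library.

```python
# URL pattern taken from the article's code; {index} is the 1-based skin number.
SKIN_URL = (
    "http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"
    "{ename}/{ename}-bigskin-{index}.jpg"
)


def skin_urls(hero):
    """Return (skin_name, url) pairs for one herolist.json entry."""
    # Skin names are joined with '|'; drop empty names so an entry
    # without a skin_name field yields an empty list.
    names = [n for n in hero.get("skin_name", "").split("|") if n]
    return [
        (name, SKIN_URL.format(ename=hero["ename"], index=i))
        for i, name in enumerate(names, start=1)
    ]


# Hypothetical sample entry (assumption); real entries come from herolist.json.
sample_hero = {"ename": 105, "cname": "SampleHero", "skin_name": "SkinA|SkinB"}

for name, url in skin_urls(sample_hero):
    print(name, url)
```

Keeping the URL construction in a small pure function makes it easy to check the pattern before any image is requested.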

Method 1 (see the comments)

    # Import the required modules
    import urllib.request
    import json
    import os

    # Request the hero list
    response = urllib.request.urlopen("http://pvp.qq.com/web201605/js/herolist.json")
    # Read the hero list and parse it into hero_json
    hero_json = json.loads(response.read())

    # Directory where the images are saved
    save_dir = 'Dheroskin'
    # Create the directory if it does not exist yet
    os.makedirs(save_dir, exist_ok=True)

    for hero in hero_json:
        # Skin names are separated by '|'; entries without skin names are skipped
        skin_names = [s for s in hero.get('skin_name', '').split('|') if s]
        for cnt, skin_name in enumerate(skin_names, start=1):
            save_file_name = os.path.join(
                save_dir, f"{hero['ename']}-{hero['cname']}-{skin_name}.jpg")
            skin_url = ('http://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/'
                        f"{hero['ename']}/{hero['ename']}-bigskin-{cnt}.jpg")
            print(skin_url)
            # Skip the download if the image file already exists
            if not os.path.exists(save_file_name):
                urllib.request.urlretrieve(skin_url, save_file_name)

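Result: each skin is saved in the save directory as a .jpg file named after the hero ID, hero name, and skin name.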
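The existence check in Method 1 is what makes the script safe to re-run: files that are already on disk are not downloaded again. A minimal sketch of that step in isolation might look like the following. The helper name `download_if_missing` and the injectable `fetch` parameter are assumptions added for illustration, and the demo uses a fake downloader and a placeholder URL so it runs without network access.

```python
import os
import tempfile
import urllib.request


def download_if_missing(url, path, fetch=urllib.request.urlretrieve):
    """Download url to path unless the file already exists.

    Returns True if a download happened, False if it was skipped.
    fetch(url, path) defaults to urllib.request.urlretrieve.
    """
    if os.path.exists(path):
        return False
    fetch(url, path)
    return True


# Demo with a fake downloader (assumption) instead of a real request.
calls = []


def fake_fetch(url, path):
    calls.append(url)
    with open(path, "wb") as f:
        f.write(b"fake image bytes")


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "105-SampleHero-SkinA.jpg")
    first = download_if_missing("http://example.invalid/a.jpg", target, fake_fetch)
    second = download_if_missing("http://example.invalid/a.jpg", target, fake_fetch)

print(first, second, len(calls))
```

The second call is skipped because the file written by the first call already exists.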

Method 2 (see the comments)

    import requests
    import re
    import json
    import os
    import time

    # Current timestamp, used to measure how long the whole crawl takes
    now = lambda: time.time()

    # Request headers (the Cookie value was copied from the author's browser session)
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.198 Safari/537.36",
        "Cookie": "pgv_pvid=120529985; pgv_pvi=8147644416; RK=iSx1Z7fSHW; ptcz=d094d0d03f513f6762a4c18a13ddae168782ec153f43b16b604723b27069d0a7; luin=o0894028891; lskey=000100008bc32936da345e2a5268733bf022b5be1613bd2600c10ad315c7559ff138e170f30e0dcd6a325a38; tvfe_boss_uuid=8f47030b9d8237f7; o_cookie=894028891; LW_sid=s116T01788a5f6T2U8I0j4F1K8; LW_uid=Z1q620M7a8E5G6b2m8p0R4U280; eas_sid=m1j6R057x88566P2Z8k074T2N7; eas_entry=https%3A%2F%2Fcn.bing.com%2F; pgv_si=s8425377792; PTTuserFirstTime=1607817600000; isHostDate=18609; isOsSysDate=18609; PTTosSysFirstTime=1607817600000; isOsDate=18609; PTTosFirstTime=1607817600000; pgv_info=ssid=s5339727114; ts_refer=cn.bing.com/; ts_uid=120529985; weekloop=0-0-0-51; ieg_ingame_userid=Qh3nEjEJwxHvg8utb4rT2AJKkM0fsWng; pvpqqcomrouteLine=index_herolist_herolist_herodetail_herodetail_herodetail_herodetail; ts_last=pvp.qq.com/web201605/herolist.shtml; PTTDate=1607856398243",
        "referer": "https://pvp.qq.com/",
    }


    # Fetch a URL; returns text, bytes (when get_b is True), or None on failure
    def parse_url(url, get_b=False):
        try:
            response = requests.get(url, headers=headers)
        except requests.RequestException:
            print("request failed (from parse_url)")
            return None
        if response.status_code != 200:
            print("status_code != 200 (from parse_url)")
            return None
        if get_b:
            return response.content
        response.encoding = "gbk"
        return response.text


    # Save binary image data to a local file
    def download_img(path, b_data):
        with open(path, "wb") as f:
            f.write(b_data)


    # Process a single hero
    def parse_hero_detail(hero_id, name):
        # Local directory for all skin images of this hero ("英雄皮肤" means "hero skins")
        path = f"./英雄皮肤/{name}"
        os.makedirs(path, exist_ok=True)
        # The number of skins per hero is unknown, so request up to 10 per hero
        # and stop at the first one that does not exist
        for num in range(1, 11):
            # URL of one skin image of this hero
            api_url = f"https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/{hero_id}/{hero_id}-bigskin-{num}.jpg"
            # None means the status code was not 200, i.e. this skin does not exist
            b_data = parse_url(api_url, get_b=True)
            if b_data is None:
                print(f"{name} has {num - 1} skins in total")
                print("--------------------------------------------------")
                # No more skins; leave the loop immediately
                break
            img_path = f"{path}/demo{num}.jpg"
            if not os.path.exists(img_path):
                try:
                    download_img(img_path, b_data)
                except OSError:
                    return
                print(f"{name}: skin image {num} downloaded")


    def main():
        # URL of the file that maps each hero ID to the hero name
        api_url = "https://game.gtimg.cn/images/yxzj/web201706/js/heroid.js"
        text = parse_url(api_url)
        if text is None:
            return
        search_result = re.search('var module_exports = ({.*?})', text, re.S)
        hero_info_str = search_result.group(1)
        # Replace single quotes with double quotes so the text is valid JSON
        hero_info_str = re.sub("'", '"', hero_info_str)
        # Dictionary of all heroes: hero ID -> hero name
        hero_info_dict = json.loads(hero_info_str)
        for hero_id, name in hero_info_dict.items():
            print(name, hero_id)
            parse_hero_detail(hero_id, name)


    if __name__ == '__main__':
        start = now()  # Record the start time
        main()  # Run the scraper
        print(f"Elapsed: {now() - start}")  # Time taken by the whole crawl
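The two core steps of Method 2 can be tried without touching the network: extracting the ID-to-name mapping from the text of heroid.js, and requesting skins one by one until the first missing image. The sketch below assumes the file has the shape implied by the article's regular expression (a `var module_exports = {...}` object with single-quoted keys and values). The sample text, the fake fetcher, and the helper names `parse_hero_ids` and `probe_skins` are hypothetical.

```python
import json
import re

# URL pattern taken from the article's code.
SKIN_URL = (
    "https://game.gtimg.cn/images/yxzj/img201606/skin/hero-info/"
    "{hero_id}/{hero_id}-bigskin-{num}.jpg"
)


def parse_hero_ids(js_text):
    """Extract the {hero_id: name} mapping from the text of heroid.js."""
    match = re.search(r"var module_exports = (\{.*?\})", js_text, re.S)
    if match is None:
        raise ValueError("module_exports object not found")
    # The object uses single quotes; JSON requires double quotes.
    return json.loads(match.group(1).replace("'", '"'))


def probe_skins(hero_id, fetch, max_skins=10):
    """Request skins 1..max_skins and stop at the first missing one.

    fetch(url) must return the image bytes, or None if the image is missing.
    """
    images = []
    for num in range(1, max_skins + 1):
        data = fetch(SKIN_URL.format(hero_id=hero_id, num=num))
        if data is None:
            break
        images.append(data)
    return images


# Assumed sample of the file's content, not real data.
sample_js = "var module_exports = {'105': 'SampleA', '106': 'SampleB'};"
hero_ids = parse_hero_ids(sample_js)


def fake_fetch(url):
    # Pretend that only skins 1 and 2 exist.
    if url.endswith(("-bigskin-1.jpg", "-bigskin-2.jpg")):
        return b"fake image bytes"
    return None


skins = probe_skins("105", fake_fetch)
print(hero_ids, len(skins))
```

In the real scraper, `fetch` would be something like the article's `parse_url(url, get_b=True)`. Note that the quote replacement only works while no hero name contains a quote character.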
