A Simple Python Web Crawler 02

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import urllib.request

url = "http://www.baidu.com"
# Fetch the page; read() returns the raw response body as bytes.
data = urllib.request.urlopen(url).read()
# Decode the bytes into a str so the Chinese text is readable.
data = data.decode('UTF-8')
print(data)

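This fetches the content of the Baidu homepage and prints it as a string decoded as UTF-8, including the Simplified Chinese text in the page.
Note: if the Chinese text comes out garbled when you run the script from the command line, try running it inside PyCharm instead.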
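The garbled output mentioned in the note often comes from a console whose encoding cannot represent Chinese characters. That cause is an assumption here, not something the original post states. The sketch below shows one way to work around it. The helper names `decode_body` and `console_safe` are hypothetical, and a hard-coded byte string stands in for the result of `urlopen(url).read()` so the example runs without network access.

```python
import sys


def decode_body(raw, charset=None):
    """Decode response bytes, falling back to UTF-8 when no charset is given."""
    return raw.decode(charset or "utf-8", errors="replace")


def console_safe(text, encoding=None):
    """Round-trip text through the console encoding, replacing
    characters the console cannot display with '?'."""
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


# Stand-in for urllib.request.urlopen(url).read(), which returns bytes.
raw = "百度一下".encode("utf-8")
page = decode_body(raw)

# Printing the sanitized text avoids UnicodeEncodeError on limited consoles.
print(console_safe(page))
```

On a UTF-8 console the text prints unchanged; on a console that cannot show a character, that character is replaced with `?` instead of raising an error.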

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import urllib.parse
import urllib.request

# Query parameters for the search; 'word' holds the search keywords.
data = {}
data['word'] = 'Tianxu Notes'
# urlencode() turns the dict into a query string: word=Tianxu+Notes
url_values = urllib.parse.urlencode(data)
url = 'http://www.baidu.com/s?'
full_url = url + url_values
# Fetch the search results page and decode the bytes into a str.
data = urllib.request.urlopen(full_url).read()
data = data.decode('utf-8')
print(data)
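To see what the URL-building step produces without sending a request, the query string can be built and parsed back on its own. This is a minimal sketch: `build_search_url` is a hypothetical helper, not part of the original script, and it uses only `urllib.parse` from the standard library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(keyword, base="http://www.baidu.com/s"):
    """Return the base URL followed by '?' and the URL-encoded query."""
    return base + "?" + urlencode({"word": keyword})


full_url = build_search_url("Tianxu Notes")
print(full_url)  # http://www.baidu.com/s?word=Tianxu+Notes

# Round trip: parse the query string back into a dict of lists.
query = parse_qs(urlsplit(full_url).query)
print(query)  # {'word': ['Tianxu Notes']}
```

By default `urlencode` encodes spaces as `+` and percent-encodes non-ASCII characters as UTF-8, so Chinese keywords can be passed in the same way.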

In Python 3, print must be followed by parentheses. Before Python 3, print was a statement; from Python 3 onward it is a function, so the parentheses are required.
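A short illustration of what print being a function means in practice: it takes keyword arguments and can be assigned to another name like any other function. The name `log` and the sample strings are chosen for illustration only.

```python
import io

# As a function, print accepts keyword arguments such as sep, end and file.
buffer = io.StringIO()
print("python", "crawler", sep="-", end="!\n", file=buffer)
captured = buffer.getvalue()  # "python-crawler!\n"

# It can also be bound to another name and called through it.
log = print
log(captured, end="")
```

In Python 2, `print "text"` without parentheses was valid; in Python 3 the same line is a syntax error.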