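A dictionary in Python is a mutable container made up of key-value pairs. This kind of structure is commonly called a mapping or an associative array, and Python implements it as a hash table. Within each pair the key and the value are separated by ":", the pairs are separated from one another by ",", and the whole dictionary is enclosed in "{}".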
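The syntax described above can be sketched as follows. The variable names and sample values here are made up for illustration and do not come from the original article.

```python
# A dictionary literal: key and value joined by ":", pairs separated by ",",
# the whole thing wrapped in "{}". The data is hypothetical sample data.
person = {"name": "Alice", "age": 30}

# Look up a value by its key.
name = person["name"]

# Dictionaries are mutable: add a new pair, then overwrite an existing one.
person["city"] = "Beijing"
person["age"] = 31

# get() returns a default instead of raising KeyError for a missing key.
email = person.get("email", "unknown")

# Remove a pair by key.
del person["city"]

# Iterate over the remaining pairs in key order.
for key, value in sorted(person.items()):
    print("%s -> %s" % (key, value))
```

Indexing with a key that is absent raises KeyError, so get() is usually the safer choice when a key may be missing.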
A dictionary fits wherever data comes as key-value pairs. In a web crawler, for example, the HTTP request headers are naturally stored in a dictionary:
# coding:utf-8
import requests
from bs4 import BeautifulSoup


class SpiderProxy(object):
    # Python 2.7 or later
    # Request headers as a dictionary: header name -> header value
    headers = {
        "Host": "www.xicidaili.com",
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:47.0) Gecko/20100101 Firefox/47.0",
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "en-US,en;q=0.5",
        "Accept-Encoding": "gzip, deflate",
        "Referer": "http://www.xicidaili.com/wt/1",
    }

    def __init__(self, session_url):
        # Open a session and visit the first page so cookies are set
        self.req = requests.session()
        self.req.get(session_url)

    def get_pagesource(self, url):
        # Send the headers dictionary with every request
        html = self.req.get(url, headers=self.headers)
        return html.content

    def get_all_proxy(self, url, n):
        data = []
        for i in range(1, n):
            html = self.get_pagesource(url + str(i))
            soup = BeautifulSoup(html, "lxml")
            table = soup.find("table", id="ip_list")
            for row in table.find_all("tr"):
                cells = row.find_all("td")
                tmp = []
                for item in cells:
                    tmp.append(item.find(text=True))
                # Columns 1 and 2 hold the IP address and the port
                data.append(tmp[1:3])
        return data


session_url = 'http://www.xicidaili.com/wt/1'
url = 'http://www.xicidaili.com/wt/'
p = SpiderProxy(session_url)
proxy_ip = p.get_all_proxy(url, 10)
for item in proxy_ip:
    # Skip empty rows such as the table header, which has no td cells
    if item:
        print(item)
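The crawler above collects each proxy as a two-item list of IP address and port. As a further illustration of where a dictionary fits, the sketch below turns rows of that shape into the kind of dictionary that requests accepts for its `proxies` argument. The helper name `to_proxies` and the sample rows are hypothetical and are not part of the original code.

```python
def to_proxies(row):
    """Build a requests-style proxies dictionary from one [ip, port] row."""
    ip, port = row
    return {"http": "http://%s:%s" % (ip, port)}


# Hypothetical sample rows in the shape the crawler produces.
# Empty or incomplete rows are skipped.
rows = [["10.0.0.1", "8080"], [], ["10.0.0.2", "3128"]]
proxy_dicts = [to_proxies(row) for row in rows if len(row) == 2]

for proxies in proxy_dicts:
    print(proxies)
```

A dictionary built this way could then be passed along with a request, for example `requests.get(url, proxies=proxies)`.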