Selecting Elements with Beautiful Soup

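Beautiful Soup is a Python library for parsing web pages. It is quick to work with, supports several parsers, and offers a rich feature set. This tutorial covers Beautiful Soup in depth, including node selectors, CSS selectors, and the method selectors of Beautiful Soup 4, all of which are foundational for learning web scraping.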

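Learning objectives

Learn how to select elements with node selectors
Understand the type of the selected element
Understand which node is returned when several nodes share the same tag name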

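1. How to select elements

A node selector selects node elements through Tag objects. A Tag object corresponds to a tag in the original HTML or XML document.

For example:

<title>The Dormouse's story</title>
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>

The title and a tags, together with their contents, become Tag objects.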

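1.1 Format

Getting an element

Format: soup.tag

soup is the Beautiful Soup 4 object (a BeautifulSoup instance), and tag is the name of the tag to select
Return value: the node element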
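
As a minimal sketch of the soup.tag form: attribute access behaves like a call to find() with that tag name, and it yields None when the document has no such tag. The one-line document and the choice of the built-in html.parser parser are assumptions made for illustration, not part of the tutorial's own example.

```python
from bs4 import BeautifulSoup

# Hypothetical one-line document, used only for illustration.
html_doc = "<html><head><title>Demo page</title></head><body><p>Hello</p></body></html>"

# 'html.parser' ships with Python, so no extra parser needs to be installed.
soup = BeautifulSoup(html_doc, "html.parser")

title_by_attr = soup.title          # attribute-style access: soup.tag
title_by_find = soup.find("title")  # the equivalent explicit call
missing = soup.table                # the document contains no <table>

print(title_by_attr)                   # <title>Demo page</title>
print(title_by_attr == title_by_find)  # True
print(missing)                         # None
```

Because a missing tag gives None rather than raising an error, it is worth checking the result before reading attributes from it.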

1.2 Example

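from bs4 import BeautifulSoup

html_str = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>
<p class="story">Once upon a time there were three little sisters; and their names were
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>,
<a class="sister" href="http://example.com/lacie" id="link2">Lacie</a> and
<a class="sister" href="http://example.com/tillie" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>
<p class="story">...</p>
"""
# Parse the document with the lxml parser (requires the lxml package)
soup = BeautifulSoup(html_str, 'lxml')
# Extract the title tag
print(soup.title)
# Print the type of soup.title
print(type(soup.title))
# Extract the a tag
print(soup.a)
# Print the type of soup.a
print(type(soup.a))
# Output
<title>The Dormouse's story</title>
<class 'bs4.element.Tag'>
<a class="sister" href="http://example.com/elsie" id="link1">Elsie</a>
<class 'bs4.element.Tag'>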

1.3 Conclusion

The printed output shows that every object obtained through a tag name is of type bs4.element.Tag, one of the core data structures in Beautiful Soup.
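
To make the type concrete, here is a small sketch that checks the result against bs4.element.Tag and reads a few standard Tag attributes (name, attrs, string). The sample markup and the html.parser choice are assumptions for illustration.

```python
from bs4 import BeautifulSoup
from bs4.element import Tag

# Hypothetical markup, for illustration only.
html_doc = '<p><a href="http://example.com/elsie" id="link1">Elsie</a></p>'
soup = BeautifulSoup(html_doc, "html.parser")

link = soup.a

print(isinstance(link, Tag))  # True
print(link.name)              # a
print(link.attrs)             # the tag's attributes as a dict
print(link["href"])           # http://example.com/elsie
print(link.string)            # Elsie
```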

Note:

When the document contains several tags with the same name, this approach returns only the first matching node; the other matching nodes are ignored.
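
The first-match behavior can be sketched as follows. The comparison with find_all(), which returns every match, goes slightly beyond this lesson and is included only for contrast; the markup and the html.parser choice are illustrative assumptions.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with three <a> tags.
html_doc = """
<p>
<a id="link1">Elsie</a>
<a id="link2">Lacie</a>
<a id="link3">Tillie</a>
</p>
"""
soup = BeautifulSoup(html_doc, "html.parser")

first_link = soup.a             # only the first <a> in document order
all_links = soup.find_all("a")  # every <a>, in document order

print(first_link["id"])                     # link1
print([link["id"] for link in all_links])   # ['link1', 'link2', 'link3']
```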

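2. Summary

(1) How to select an element:

Append the tag name to the Beautiful Soup object, for example: soup.title

(2) The node obtained is a Tag object

(3) When several nodes share the same tag name, only the first one is returned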
