select : CSS selector method

The select() method runs a CSS selector against the parsed document and returns a list of all matching elements. A plain tag name is also a valid selector, so select() can find tags as well.
import requests
link = "https://www.plus2net.com/html_tutorial/html_form.php"
content = requests.get(link)

from bs4 import BeautifulSoup
soup = BeautifulSoup(content.text, 'html.parser')

print(soup.select("title"))
Output
[<title>Web Form tag  elements in HTML</title>]
All <meta> tags inside the <head> tag
print(soup.select("head meta")) 
Tags with class='table-striped'
print(soup.select('.table-striped'))
All the links inside elements with class='table-striped'
print(soup.select(".table-striped a")) 
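The descendant selector above returns Tag objects, so attributes and text can be read from each match. Here is a minimal sketch that runs without a network call; the HTML string and its URLs are made up for illustration and only stand in for the downloaded page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the downloaded page.
html = """
<table class="table table-striped">
  <tr><td><a href="/html_tutorial/a.php">A</a></td></tr>
  <tr><td><a href="/html_tutorial/b.php">B</a></td></tr>
</table>
<p><a href="/outside.php">Outside the table</a></p>
"""
soup = BeautifulSoup(html, "html.parser")

# select() returns a list-like collection of Tag objects.
links = soup.select(".table-striped a")

# Read the href attribute and the visible text of each match.
hrefs = [a["href"] for a in links]
texts = [a.get_text(strip=True) for a in links]
print(hrefs)
print(texts)
```

The link outside the table is not returned because it has no ancestor with class table-striped.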
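Links at even positions (2nd, 4th, ...) among their sibling <a> tags
print(soup.select("a:nth-of-type(even)")) 
Links at odd positions (1st, 3rd, ...)
print(soup.select("a:nth-of-type(odd)")) 
The formula 2n selects the same links as even
print(soup.select("a:nth-of-type(2n)"))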
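To see how nth-of-type counts, here is a small sketch with four sibling links. The HTML is an invented sample, not taken from the page above. Positions are counted among siblings of the same tag type, starting at 1.

```python
from bs4 import BeautifulSoup

# Invented sample: four <a> tags sharing one parent.
html = """
<div>
  <a href="#1">one</a>
  <a href="#2">two</a>
  <a href="#3">three</a>
  <a href="#4">four</a>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# even and 2n both pick the 2nd and 4th <a>; odd picks the 1st and 3rd.
even = [a.get_text() for a in soup.select("a:nth-of-type(even)")]
odd = [a.get_text() for a in soup.select("a:nth-of-type(odd)")]
two_n = [a.get_text() for a in soup.select("a:nth-of-type(2n)")]

print(even)
print(odd)
print(two_n)
```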

Example

A selector can use a class name, an id, a tag combined with a class, a tag combined with an id, and so on.
content = """<h2>List of web programming languages</h2>
<div class=my_list>
<p>My Pages one </p>
<p class=my_pages>My Pages </p>
<p id=ck1>My ck1 page</p>
<a href='https://www.plus2net.com' class='home_link'>Home page</a>
</div>"""

from bs4 import BeautifulSoup
soup = BeautifulSoup(content, 'html.parser')

print(soup.select("div")) # all div tags
The code above returns all <div> tags.
Collect the tags having class=my_list
print(soup.select('.my_list')) # class=my_list
Print tags with class=my_pages
print(soup.select('.my_pages')) # class=my_pages
Output
[<p class="my_pages">My Pages </p>]
Print tags with id
print(soup.select('#ck1'))
Output
[<p id="ck1">My ck1 page</p>]
Print all <a> tags having class=home_link
print(soup.select('a.home_link'))
Output
[<a class="home_link" href="https://www.plus2net.com">Home page</a>]
Print all <p>
print(soup.select('p')) # all p tags
Output
[<p>My Pages one </p>, <p class="my_pages">My Pages </p>, 
<p id="ck1">My ck1 page</p>]
All <p> tags having class
print(soup.select('p[class]'))
Output
[<p class="my_pages">My Pages </p>]
All <p> tags having id
print(soup.select('p[id]'))
Output
[<p id="ck1">My ck1 page</p>]
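The combined forms (tag with class, tag with id) can be sketched on the same sample content. The child combinator (>) and the comma-separated group are additions here, included only to show two more standard CSS forms that select() accepts.

```python
from bs4 import BeautifulSoup

content = """<h2>List of web programming languages</h2>
<div class=my_list>
<p>My Pages one </p>
<p class=my_pages>My Pages </p>
<p id=ck1>My ck1 page</p>
<a href='https://www.plus2net.com' class='home_link'>Home page</a>
</div>"""
soup = BeautifulSoup(content, "html.parser")

# Tag combined with a class: only <p> tags with class=my_pages.
by_tag_class = soup.select("p.my_pages")

# Tag combined with an id: only a <p> tag with id=ck1.
by_tag_id = soup.select("p#ck1")

# Child combinator: <a> tags that are direct children of div.my_list.
child_links = soup.select("div.my_list > a")

# Comma-separated group: matches of either selector, in document order.
grouped = soup.select("h2, a.home_link")

print(by_tag_class)
print(by_tag_id)
print(child_links)
print([tag.name for tag in grouped])
```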
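Find the HTML table with width="170" (in a page that contains one; the sample content above does not), then collect the values of the 2nd and 3rd <td> tags
str1 = soup.select('table[width="170"] td')
print(str1[1].string)  # 2nd <td>, list index starts at 0
print(str1[2].string)  # 3rd <td>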
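The sample content above has no table, so here is a self-contained sketch of the same attribute selector. The table markup and cell values are invented for illustration.

```python
from bs4 import BeautifulSoup

# Invented markup: two tables, only one has width="170".
html = """
<table width="170">
  <tr><td>Name</td><td>Alex</td><td>Four</td></tr>
</table>
<table width="300">
  <tr><td>Other</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# All <td> tags inside the table whose width attribute is exactly "170".
cells = soup.select('table[width="170"] td')

second = cells[1].string  # 2nd <td>
third = cells[2].string   # 3rd <td>
print(second)
print(third)
```

Note that .string gives None when a tag has more than one child node, so get_text() may be a safer choice for cells that contain nested tags.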

select_one

select_one() returns only the first matching element instead of a list.
Print only the first <p> tag
print(soup.select_one('p')) # the first p tag only. 
Output
<p>My Pages one </p>
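One difference worth checking in code: select() returns an empty list when nothing matches, while select_one() returns None. A short sketch on a trimmed-down version of the sample content:

```python
from bs4 import BeautifulSoup

html = "<div><p>My Pages one </p><p class='my_pages'>My Pages </p></div>"
soup = BeautifulSoup(html, "html.parser")

first_p = soup.select_one("p")       # a single Tag, not a list
same_tag = soup.select("p")[0]       # the same element, taken from the list
missing = soup.select_one("span")    # no <span> in the sample, so None
empty = soup.select("span")          # no match gives an empty list

print(first_p)
print(missing)

# Guard against None before reading text or attributes.
if missing is not None:
    print(missing.get_text())
```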

Using CSS selector for XML

The same selectors work on XML documents and return the matching XML tags.

The "xml" parser requires the lxml library, which can be installed with pip:
pip install lxml
import requests
link = "https://www.plus2net.com/php_tutorial/file-xml-demo.xml"

content = requests.get(link)

from bs4 import BeautifulSoup
soup = BeautifulSoup(content.text, "xml")

print(soup.select("name"))
Output (sample)
[<name>John Deo</name>, <name>Max Ruin</name>, 
------
------
<name>Rows Noump</name>]
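The same idea can be tried without downloading the file. In the sketch below the XML string is hypothetical: the students, student and mark element names are assumptions and the real file may be structured differently; only the name tag and the two names come from the output above.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Hypothetical XML; the real file's structure may differ.
xml_doc = """<students>
  <student><name>John Deo</name><mark>75</mark></student>
  <student><name>Max Ruin</name><mark>85</mark></student>
</students>"""

try:
    soup = BeautifulSoup(xml_doc, "xml")
except FeatureNotFound:
    # Raised when the lxml library is not installed.
    soup = None
    print("Install lxml first: pip install lxml")

names = []
if soup is not None:
    # <name> tags that are direct children of a <student> tag.
    names = [tag.get_text() for tag in soup.select("student > name")]
    print(names)
```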
