find_all : List of tags

BeautifulSoup Basics
We can collect every occurrence of a tag in a web page by using find_all. We pass the name of the tag and get back a list of all the matching tags in the page.

Let us find all the h2 tags of the webpage.
import requests
link = "https://www.plus2net.com/html_tutorial/html_form.php"
content = requests.get(link)

from bs4 import BeautifulSoup
soup = BeautifulSoup(content.text, 'html.parser')

print(soup.find_all("h2"))
The output is here
[<h2>How to select a form component</h2>,
 <h2>Form tag</h2>, <h2>Method attribute of the html form</h2>,
 <h2>Action attribute</h2>, 
 <h2>Applications and uses of html form elements</h2>]
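The value returned by find_all behaves like a normal Python list, so we can check its length, index it, or loop over it. Here is a minimal sketch that runs without a network connection. The HTML string is a made-up sample and not the real page, and the variable names are only for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for a downloaded page
html = """
<html><body>
<h1>Web forms</h1>
<h2>Form tag</h2>
<p>Some text</p>
<h2>Action attribute</h2>
</body></html>
"""
soup = BeautifulSoup(html, "html.parser")

headings = soup.find_all("h2")              # list-like collection of Tag objects
first_only = soup.find_all("h2", limit=1)   # stop searching after the first match
missing = soup.find_all("h3")               # no match gives an empty list, not an error

print(len(headings))      # number of h2 tags found
print(headings[0].name)   # tag name of the first match
print(first_only)
print(len(missing))
```

Because an unmatched tag returns an empty list, a for loop over the result simply does nothing when the tag is absent.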
If you want only the text and not the <h2> </h2> tags, then read the string attribute of each tag like this
my_list=soup.find_all("h2")
for my_tags in my_list:
    print(my_tags.string)
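One point to watch: the string attribute returns None when a tag holds more than one child, for example a heading with a nested <b> tag. The sketch below uses a made-up HTML string (an assumption, not taken from the page) to compare string with get_text(), which joins all the text inside the tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the second heading contains a nested <b> tag
html = "<h2>Form tag</h2><h2>The <b>action</b> attribute</h2>"
soup = BeautifulSoup(html, "html.parser")

my_list = soup.find_all("h2")

# .string works only when the tag has a single child
strings = [my_tags.string for my_tags in my_list]

# get_text() collects the text of the tag and of every tag nested in it
texts = [my_tags.get_text() for my_tags in my_list]

print(strings)
print(texts)
```

If the headings of a page may contain nested markup, get_text() is usually the safer choice.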

Collecting all the links of a webpage

A common requirement is to collect all the links present in a webpage. We will use find_all to get the links ( <a href=… > … </a> ), then read the anchor text and the URL (address) of each link. find_all returns a list of links, so we use a for loop to display them one by one.
import requests
link = "https://www.plus2net.com/html_tutorial/html_form.php"
content = requests.get(link)

from bs4 import BeautifulSoup
soup = BeautifulSoup(content.text, 'html.parser')

print(soup.find_all('a')) # all the links with string and tags 
The output will be all the links present in the webpage.

Now let us collect the anchor text and the URL (address) of each link.
my_list = soup.find_all("a")
for my_tags in my_list:
    print(my_tags.get('href')) # URL of the link, None if the tag has no href attribute
    print(my_tags.string)      # anchor text of the link
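Many pages use relative URLs in href, and some <a> tags carry no href at all. Here is one possible sketch for building a list of (text, full URL) pairs. The HTML string and the file name page2.html are hypothetical, and urljoin from the standard library is our own addition for resolving relative addresses.

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup

base_url = "https://www.plus2net.com/html_tutorial/html_form.php"

# Hypothetical markup: a relative link, an absolute link, and an anchor without href
html = """
<a href="page2.html">Next page</a>
<a href="https://www.example.com/">External <b>site</b></a>
<a name="top">Anchor without href</a>
"""
soup = BeautifulSoup(html, "html.parser")

links = []
for my_tags in soup.find_all("a", href=True):    # href=True skips tags with no href
    url = urljoin(base_url, my_tags["href"])     # resolve relative URLs against the page address
    links.append((my_tags.get_text(), url))

for text, url in links:
    print(text, "->", url)
```

Passing href=True to find_all keeps only the tags that have the attribute, so my_tags["href"] cannot raise a KeyError inside the loop.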

Using Regular expression

We can pass a compiled regular expression to find_all to get all the tags whose names match the pattern.
Let us find all the h1 and h2 tags.
import requests
link = "https://www.plus2net.com/html_tutorial/html_form.php"
content = requests.get(link)

from bs4 import BeautifulSoup
soup = BeautifulSoup(content.text, 'html.parser')

import re
print(soup.find_all(re.compile("^h[12]$"))) # ^ and $ make the pattern match the whole tag name
We will get one list as output
[<h1 itemprop="headline">Web Form tag & HTML elements</h1>,
 <h2>How to select a form component</h2>, <h2>Form tag</h2>,
 <h2>Method attribute of the html form</h2>,
 <h2>Action attribute</h2>,
 <h2>Applications and uses of html form elements</h2>]
To get all the a or div tags
import re
# without ^ and $ the pattern also matches tag names such as table or span
print(soup.find_all(re.compile("^(a|div)$"))) # all a or div tags
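BeautifulSoup tests a regular expression against the tag name with search(), so a pattern matches if it appears anywhere in the name. The sketch below, on a made-up HTML string, compares an unanchored pattern, an anchored pattern, and a plain list of tag names, which is an alternative when no pattern is really needed.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup with several tag names that contain the letter a
html = "<div><table><tr><td><a href='#'>x</a></td></tr></table><span>y</span></div>"
soup = BeautifulSoup(html, "html.parser")

# Unanchored: matches any tag name that contains "a" or "div"
loose = {tag.name for tag in soup.find_all(re.compile("(a|div)"))}

# Anchored: matches only the exact names "a" and "div"
strict = {tag.name for tag in soup.find_all(re.compile("^(a|div)$"))}

# A list of names gives the same result without a regular expression
by_list = {tag.name for tag in soup.find_all(["a", "div"])}

print(sorted(loose))
print(sorted(strict))
print(sorted(by_list))
```

Here the unanchored pattern also picks up table and span, because both names contain the letter a.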


©2000-2020 plus2net.com All rights reserved worldwide Privacy Policy Disclaimer