SEO reports using Pandas

Pandas: download GA4 search traffic
From inside the Google Analytics (GA4) report, download the CSV (Comma Separated Values) file.

Google organic search traffic: Landing page + query string
CSV file structure
We can skip the top rows and manage our own column headers. While creating the DataFrame from the CSV file we will use the options skiprows, usecols, index_col and names to customize the DataFrame as per our requirement.
With usecols we select 6 columns (0-indexed, so the first column is 0). To match these 6 columns we pass a list of 6 headers, l_headers, holding our required column names.
l_headers=['page','clicks','imp','ctr','position','tm'] # our own column names
df=pd.read_csv(path,skiprows=11,usecols=[0,1,2,3,4,8],index_col=0,names=l_headers)
After creating the DataFrame df we store it in a SQLite table using to_sql(). We can also use MySQL to store the DataFrame.
import pandas as pd
# path and name of the downloaded CSV file
path='C:\\testing\\analytics\\data\\y23_04.csv'
table_name='y23_04' # change the name of the SQLite table
####### No edit required below this line ######
l_headers=['page','clicks','imp','ctr','position','tm'] # our own column names
df=pd.read_csv(path,skiprows=11,usecols=[0,1,2,3,4,8],index_col=0,names=l_headers)
df['position']=df['position'].round(2) # round to two decimal places
df['ctr']=df['ctr'].round(2)
df['tm']=df['tm'].round(2)
print(df.head())

## Insert the DataFrame into a SQLite database
from sqlalchemy import create_engine
my_conn = create_engine("sqlite:///C:\\my_folder\\analytics3.db")
my_conn=my_conn.connect()
df.to_sql(con=my_conn,name=table_name,if_exists='replace')
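To confirm that the data reached the database, we can read the table back. A minimal sketch using pd.read_sql(), assuming the same database path and table name as above; the commented line shows an equivalent MySQL engine URL with placeholder credentials.

import pandas as pd
from sqlalchemy import create_engine
my_conn = create_engine("sqlite:///C:\\my_folder\\analytics3.db")
# my_conn = create_engine("mysql+pymysql://user:password@localhost/my_db") # MySQL alternative, hypothetical credentials
df_check=pd.read_sql('SELECT * FROM y23_04',my_conn) # read the stored table back
print(df_check.head()) # should match the DataFrame we inserted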

How to Analyze Keyword Combinations and Their Rank Positions

Download the keyword (query) list from Google Search Console in CSV format: the Performance on Search results data.
This step-by-step approach helps identify which keywords are performing well and which ones need more focus for optimization.

To download the query list in CSV format from Google Search Console:

Performance > Search results >
At the top right corner, click the download link and select Download CSV.

Inside the ZIP file there will be a file Queries.csv; we will use this file for our analysis.

Read the CSV file using read_csv() to create the DataFrame df; we will then filter it on the queries column to create a new DataFrame df2.
import pandas as pd
df=pd.read_csv('F:\\webmaster-console-query-data\\Queries.csv') # use your system path
list_words=['tkinter','color'] # list of keywords to check ranking for
my_str = '|'.join(list_words) # tkinter|color , the OR combination
df2=df[df['queries'].str.contains(my_str,case=False)] # 'queries' should match your CSV column header
print(df2)
The above code gives output based on the OR combination, so each output row contains at least one keyword from the given list.

For an AND combination, where the presence of all keywords is to be checked against each query, we modify the code like this.
list_words=['tkinter','color'] # list of keywords to check ranking for
## my_str = '|'.join(list_words) # tkinter|color , only for the OR combination
my_str=''
for w in list_words:
    my_str=my_str + "(?=.*" + w + ")" # one regex lookahead per keyword

print(my_str) # check the pattern before applying it
#df2=df[df['queries'].str.contains(r'^(?=.*tkinter)(?=.*color)',case=False)]
df2=df[df['queries'].str.contains(my_str,case=False)]
print(df2)
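If a keyword may contain regex special characters (such as . or +), escaping each word keeps the lookahead pattern safe. A minimal sketch, assuming the same list_words and queries column as above:

import re
list_words=['tkinter','color'] # keywords; re.escape() protects any special characters
my_str=''.join("(?=.*" + re.escape(w) + ")" for w in list_words)
df2=df[df['queries'].str.contains(my_str,case=False)]
print(df2)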
Using the above output we can filter the rows further based on ranking position. From those rows (DataFrame df2) we will collect the queries whose rank (column name Position) is between 6 and 10.
df2=df[df['queries'].str.contains(my_str,case=False)]
df2 = df2[df2["Position"].between(6,10)] # Position between two ranks 
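Queries ranking between positions 6 and 10 are often the best candidates for optimization. As a follow-up sketch (assuming the Impressions column of the Search Console export is numeric), we can sort them by impressions and save the result; the output path is hypothetical:

df2 = df2.sort_values('Impressions',ascending=False) # highest traffic potential first
df2.to_csv('F:\\webmaster-console-query-data\\rank_6_10.csv') # hypothetical output path
print(df2.head(10))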

Frequency of keywords


From the queries column we will use the str accessor to split each query string into words, then use stack() to get one word per row. Finally we will use value_counts() to create a list of keywords with the frequency of their occurrence.

We store the final keyword list with the frequency of occurrence in a CSV file using to_csv().
This gives us the list of words for which, as per Google, the site has relatively better authoritativeness (part of E-A-T: Expertise, Authoritativeness and Trustworthiness).
import pandas as pd
df=pd.read_csv('F:\\webmaster-console-query-data\\Queries.csv') # use your system path
df2 = df["queries"].str.split(expand=True).stack().value_counts() # word frequency count
print(df2.head(15)) # printing the top 15 keywords
df2.to_csv("H:\\data\\seo\\to_words.csv") # saving to a CSV file, change to your path
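Common words such as 'to', 'in' or 'how' can dominate such a frequency list. A minimal sketch that drops them before counting; the stop-word list here is only an illustrative sample, extend it to suit your queries:

import pandas as pd
df=pd.read_csv('F:\\webmaster-console-query-data\\Queries.csv') # use your system path
stop_words={'to','in','how','the','a','of'} # sample list, extend as required
words=df['queries'].str.lower().str.split(expand=True).stack() # one word per row
words=words[~words.isin(stop_words)] # drop the stop words
print(words.value_counts().head(15)) # top 15 meaningful keywords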
