DataFrame from Analytics CSV file using Tkinter GUI
Download data from your Google Analytics account for a chosen period as a CSV ( Comma Separated Values ) file. This script has four parts.
Part 1 : Download data from Google Analytics and save it as a CSV file, or use the sample file available at the end of this page.
Part 2 : Read the data and create a Pandas DataFrame.
Part 3 : Clean the data to match the requirement.
Part 4 : Save the DataFrame as a CSV file.
Part 1 : Download Analytics data from Google
Inside your Google Analytics account, visit this page:
Behavior > Site Content > All Pages
Set a date range using the From date and To date, then select Export and choose CSV ( Comma Separated Values ) as the file type.
At the bottom, change the Show rows option to a higher number ( the default is 10 ) based on the number of pages you want to download.
Part 2 : Read the CSV file and create DataFrame
Use the file browser to select any CSV ( Comma Separated Values ) file at the click of a button. Then use the read_csv() method to create a Pandas DataFrame from the selected CSV file.
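The step above could be sketched as follows. This is a minimal sketch, not the original script of this page: the function names load_csv and browse_and_load, the window title and the button label are assumptions made for illustration. filedialog.askopenfilename() and pd.read_csv() are the standard Tkinter and Pandas calls.

```python
import pandas as pd


def load_csv(path, skiprows=0):
    """Read the CSV file at `path` into a DataFrame (hypothetical helper)."""
    return pd.read_csv(path, skiprows=skiprows)


def browse_and_load():
    """Open a small window whose button lets the user pick a CSV file."""
    # Imported here so that load_csv() can also be used without a display.
    import tkinter as tk
    from tkinter import filedialog

    result = {}  # holds the DataFrame created inside the button callback

    def on_click():
        path = filedialog.askopenfilename(filetypes=[("CSV files", "*.csv")])
        if path:  # an empty string means the dialog was cancelled
            result["df"] = load_csv(path)
            print(result["df"].head())

    window = tk.Tk()
    window.title("Select a CSV file")
    tk.Button(window, text="Browse", command=on_click).pack(padx=40, pady=20)
    window.mainloop()  # returns when the window is closed
    return result.get("df")
```

Calling browse_and_load() opens the window; the function returns the last DataFrame loaded, or None if no file was chosen.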
When creating the DataFrame with read_csv(), we have to understand the structure of the downloaded CSV file. The file you download from Google Analytics has some blank lines at the top and at the end of the file. So we skip the first 6 rows by using the skiprows option.
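To see what skiprows does, here is a small sketch that reads an in-memory sample instead of a real download. The sample text, its preamble lines and its numbers are made up for illustration; only the layout (six lines before the header row) follows the description above.

```python
import io

import pandas as pd

# Made-up sample: six lines of preamble (the last one blank),
# then the header row and two data rows.
sample = (
    "# ----------------\n"
    "# All Web Site Data\n"
    "# Pages\n"
    "# 20200101-20200131\n"
    "# ----------------\n"
    "\n"
    "Page,Pageviews,Page Value\n"
    "/,120,0\n"
    "/python/tkinter/index.php,45,0\n"
)

# skiprows=6 drops the first six lines, so the seventh line becomes the header.
df = pd.read_csv(io.StringIO(sample), skiprows=6)
print(df.columns.tolist())
```

If your own file has a different number of lines before the header row, adjust the value of skiprows to match.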
Remove rows with blank data, or with data that is not a real page, based on the length of the value in the Page column. Here any row whose Page value is longer than 3 characters but shorter than 12 is removed. The second condition ( more than 3 ) is added to keep the home page, which has a very short Page value. For all other pages the length has to be at least 12.
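The length rule can be sketched as below. The rows are hypothetical, and the limits 3 and 12 are the ones given in the text; this is one possible way to write the filter, not necessarily the code used in the original script.

```python
import pandas as pd

# Hypothetical rows: two real pages plus leftovers from the end of the export.
df = pd.DataFrame({
    "Page": ["/", "/python/tkinter/index.php", "Day Index", "1/1/20", None],
    "Pageviews": [120, 45, None, 7, None],
})

# Length of each Page value; a missing value has no length and fails both tests.
length = df["Page"].str.len()

# Keep very short values (the home page) and values of 12 or more characters.
df = df[(length <= 3) | (length >= 12)]
print(df["Page"].tolist())
```

Because a missing Page value fails both comparisons, blank rows are dropped by the same filter.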
We don't want multiple rows for the same page with different query strings. To identify such pages, we check for the presence of ? in the Page column and remove the matching rows.
df=df[~df.Page.str.contains(r'\?')]
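As a small illustration on hypothetical data, the same filter can also be written without a regular expression. Passing regex=False to str.contains() makes Pandas treat ? as a plain character, so no escaping is needed. The page names below are made up.

```python
import pandas as pd

df = pd.DataFrame({
    "Page": ["/python/index.php", "/python/index.php?id=5", "/sql/select.php"],
    "Pageviews": [30, 2, 18],
})

# True for every row whose Page value contains a literal question mark.
has_query = df["Page"].str.contains("?", regex=False)

# Keep only the rows without a query string.
df = df[~has_query]
print(df["Page"].tolist())
```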
Delete column
You can delete any column based on your requirement. Here the column Page Value is removed along with its data.
df.drop(labels='Page Value',axis=1,inplace=True)
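An equivalent form that returns a new DataFrame instead of changing it in place is sketched below on a one-row sample. The sample values are made up, and errors="ignore" is an optional addition that keeps the line from failing when the column is already gone.

```python
import pandas as pd

df = pd.DataFrame({"Page": ["/"], "Pageviews": [120], "Page Value": [0.0]})

# columns= names the column directly, so axis=1 is not needed.
# errors="ignore" skips the column silently if it is not present.
df = df.drop(columns="Page Value", errors="ignore")
print(df.columns.tolist())
```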
Change the column names
To match the requirements of a database, we can change the column names by removing spaces and adding underscores ( _ ). Keep the new names in the same sequence as the existing columns.
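Two possible ways to do this are sketched below. The three column names are only a sample and the variable names are hypothetical. The first way assigns a full list of new names, which is why the sequence matters. The second replaces every space with an underscore, so the sequence is kept automatically.

```python
import pandas as pd

original = pd.DataFrame(columns=["Page", "Unique Pageviews", "Bounce Rate"])

# Way 1: assign a full list; each new name lands on the column in the same position.
by_list = original.copy()
by_list.columns = ["Page", "Unique_Pageviews", "Bounce_Rate"]

# Way 2: replace each space with an underscore in every column name.
by_replace = original.copy()
by_replace.columns = by_replace.columns.str.replace(" ", "_", regex=False)

print(by_list.columns.tolist())
print(by_replace.columns.tolist())
```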