Google Colab offers a powerful Python environment with free access to GPUs, ideal for machine learning and data analysis tasks. If you're working with datasets from Kaggle, you can easily connect the two platforms using the Kaggle API.
In this guide, we'll show you how to import Kaggle datasets into Google Colab in five simple steps.

Step 1: Download kaggle.json to Your Computer
In your Kaggle account settings, click "Create New Token" to download the kaggle.json file to your local system.
Step 2: Upload kaggle.json to Colab
Use the code below in your Colab notebook to upload the kaggle.json file.
from google.colab import files
files.upload() # Choose kaggle.json when prompted
Move the file to the correct location and set permissions:
# Create the directory if it doesn't exist
!mkdir -p ~/.kaggle
# Move kaggle.json to the correct directory. Assumes kaggle.json is in the current working directory.
# If your kaggle.json is in a different location, please update the path below.
!mv kaggle.json ~/.kaggle/
# Set permissions for the kaggle.json file
!chmod 600 ~/.kaggle/kaggle.json
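The chmod matters: the Kaggle CLI complains when the API key is readable by other users. As a quick sanity check, a small sketch (the helper name kaggle_json_ok is mine, not part of the Kaggle API) that confirms the file landed with owner-only permissions:

```python
import os
import stat

def kaggle_json_ok(path="~/.kaggle/kaggle.json"):
    """True if the API key file exists with owner-only (600) permissions."""
    full = os.path.expanduser(path)
    if not os.path.exists(full):
        return False
    return stat.S_IMODE(os.stat(full).st_mode) == 0o600

print(kaggle_json_ok())
```

If this prints False, re-run the mkdir/mv/chmod cell above before calling the Kaggle CLI.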

Step 3: Find the Dataset Path
Go to the Kaggle dataset page and copy the dataset path (owner/dataset-name) from the URL or from the download snippet Kaggle provides. For example, in:
kagglehub.dataset_download("yasserh/titanic-dataset")
the dataset path is: yasserh/titanic-dataset
Step 4: Download and Unzip the Dataset
Use the commands below to download and extract it:
!kaggle datasets download -d yasserh/titanic-dataset
!unzip -q titanic-dataset.zip
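If you prefer to stay in pure Python rather than shelling out to unzip, the standard-library zipfile module does the same job. A sketch (the archive name comes from the CLI download above):

```python
import zipfile

def extract_archive(zip_path, dest="."):
    """Extract every member of the archive into dest and return their names."""
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest)
        return zf.namelist()

# In Colab, after the kaggle CLI download above:
# extract_archive("titanic-dataset.zip")
```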
Clean up the leftover files:
# Remove the zip file after extraction
!rm titanic-dataset.zip
# Remove other metadata files if they exist and are not needed
!rm -f titanic-dataset.zip.json
Step 5: Load the Dataset with pandas
import pandas as pd
df = pd.read_csv("Titanic-Dataset.csv")
df.head()
How many rows and columns are in the dataset?
# Get the total number of rows and columns
num_rows, num_cols = df.shape
print(f"Total number of rows: {num_rows}")
print(f"Total number of columns: {num_cols}")
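Beyond the shape, a quick look at column dtypes and missing-value counts is often the next step. Shown here on a small stand-in frame so the snippet is self-contained; in the notebook you would run df.dtypes and df.isna().sum() on the Titanic DataFrame loaded above:

```python
import pandas as pd

# Small stand-in frame for illustration only
sample = pd.DataFrame({"Age": [22.0, None, 26.0],
                       "Sex": ["male", "female", "female"]})
print(sample.dtypes)      # type of each column
print(sample.isna().sum())  # missing values per column
```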
Convert CSV to an SQLite Database
import pandas as pd
import sqlite3
# Name of the CSV file to convert
csv_file = 'Titanic-Dataset.csv'
# Name of the SQLite database file to create
db_file = 'titanic.db'
# Name of the table within the SQLite database
table_name = 'titanic_data'
# Read the CSV file into a pandas DataFrame
df = pd.read_csv(csv_file)
# Create an SQLite database connection
conn = sqlite3.connect(db_file)
# Write the DataFrame to an SQLite table
df.to_sql(table_name, conn, if_exists='replace', index=False)
# Close the connection
conn.close()
print(f"Successfully converted '{csv_file}' to SQLite database '{db_file}' with table '{table_name}'.")
Checking the Output from the SQLite Database
import sqlite3
import pandas as pd
# Connect to the SQLite database
conn = sqlite3.connect('titanic.db')
# Read the data from the table into a pandas DataFrame
db_df = pd.read_sql_query("SELECT * FROM titanic_data LIMIT 5;", conn)
# Display the DataFrame
display(db_df)
# Close the connection
conn.close()
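One advantage of the SQLite copy is that you can aggregate in SQL instead of pulling the whole table into pandas. A sketch using a tiny in-memory stand-in table; in the notebook you would run the same query against the titanic_data table in titanic.db:

```python
import sqlite3
import pandas as pd

# Stand-in data for illustration only
demo_df = pd.DataFrame({"Sex": ["male", "female", "female", "male"],
                        "Survived": [0, 1, 1, 0]})
demo_conn = sqlite3.connect(":memory:")
demo_df.to_sql("titanic_data", demo_conn, index=False)

# Group and count in SQL, then read only the summary into pandas
query = ("SELECT Sex, COUNT(*) AS n, SUM(Survived) AS survived "
         "FROM titanic_data GROUP BY Sex")
summary = pd.read_sql_query(query, demo_conn)
print(summary)
demo_conn.close()
```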
Convert CSV to JSON
import pandas as pd
# Name of the CSV file to convert
csv_file = 'Titanic-Dataset.csv'
# Name of the JSON file to create
json_file = 'titanic.json'
# Read the CSV file into a pandas DataFrame
df = pd.read_csv(csv_file)
# Convert the DataFrame to a JSON file
df.to_json(json_file, orient='records', indent=4)
print(f"Successfully converted '{csv_file}' to JSON file '{json_file}'.")
Checking output
# Read the first few lines of the JSON file
with open('titanic.json', 'r') as f:
    for i, line in enumerate(f):
        if i < 15:  # display only the first 15 lines for brevity
            print(line.strip())
        else:
            break
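The JSON file can be loaded straight back into pandas with pd.read_json. A round-trip sketch with a small stand-in frame; in the notebook you would call pd.read_json("titanic.json", orient="records") on the file written above:

```python
import pandas as pd
from io import StringIO

# Stand-in frame for illustration only
demo_df = pd.DataFrame({"PassengerId": [1, 2], "Survived": [0, 1]})
json_text = demo_df.to_json(orient="records", indent=4)

# Read the JSON back into a DataFrame
restored = pd.read_json(StringIO(json_text), orient="records")
print(restored.equals(demo_df))
```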
Convert CSV to XML
import pandas as pd
import xml.etree.ElementTree as ET
# Name of the CSV file to convert
csv_file = 'Titanic-Dataset.csv'
# Name of the XML file to create
xml_file = 'titanic.xml'
# Read the CSV file into a pandas DataFrame
df = pd.read_csv(csv_file)
# Create the root element for the XML
root = ET.Element('TitanicData')
# Iterate over DataFrame rows and add them to the XML structure
for index, row in df.iterrows():
    record = ET.SubElement(root, 'Record')
    for col_name, value in row.items():
        child = ET.SubElement(record, col_name)
        # Convert NaN to an empty string for the XML representation
        child.text = str(value) if pd.notna(value) else ''
# Pretty-print the XML for better readability
from xml.dom.minidom import parseString
xml_string = parseString(ET.tostring(root)).toprettyxml(indent=" ")
with open(xml_file, "w", encoding="utf-8") as f:
f.write(xml_string)
print(f"Successfully converted '{csv_file}' to XML file '{xml_file}'.")
Checking output
import os
# Read the first few lines of the XML file
# (more lines than for JSON, since XML is more verbose)
num_lines_to_display = 30
if os.path.exists('titanic.xml'):
    with open('titanic.xml', 'r') as f:
        for i, line in enumerate(f):
            if i < num_lines_to_display:
                print(line.strip())
            else:
                break
else:
    print("Error: 'titanic.xml' not found.")
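The XML file can also be read straight back into pandas with pd.read_xml; parser="etree" uses only the standard library, so no extra install is needed. A sketch with a small inline stand-in document; in the notebook you would call pd.read_xml("titanic.xml", parser="etree"):

```python
import pandas as pd
from io import StringIO

# Stand-in XML in the same shape the conversion above produces
xml_text = """<TitanicData>
  <Record><PassengerId>1</PassengerId><Survived>0</Survived></Record>
  <Record><PassengerId>2</PassengerId><Survived>1</Survived></Record>
</TitanicData>"""

# Each <Record> element becomes one row
df_back = pd.read_xml(StringIO(xml_text), parser="etree")
print(df_back)
```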