Creating a Dataset for Your Spotify Library

Appu
3 min read · Sep 29, 2020

Hey Guys,

In this article I'm going to show you how to create a dataset for your Spotify playlist, or any other playlist. For each individual track, this dataset will contain values like "artist", "album", "track_name", "track_id", "danceability", "energy", "key", "loudness", "mode", "speechiness", "instrumentalness", "liveness", "valence", "tempo", "duration_ms" and "time_signature".

So let's get started. The first thing to do is go to https://developer.spotify.com/. Go to the Dashboard and click Log in; you can use your regular Spotify login credentials. After you log in, you should see an option called "Create an App". Click on it and give your app a name and a description.

Check the two permissions and click Create. Your app will now appear on the dashboard; click on it and go to Edit Settings. Set the Redirect URL to http://localhost/. You will see the Client ID and, below that, a button called "Show client secret". Copy both values and paste them into Notepad; you are going to use them soon.

Now create a new directory and create a virtual environment inside it for Jupyter Notebook. You are going to use two libraries to create the dataset from Spotify.

One you may already be familiar with, "pandas"; the other is "spotipy".

Spotipy is a Python library for the Spotify Web API. Make sure you install both of these with pip install before starting Jupyter Notebook.

pip install pandas
pip install spotipy

or

pip3 install pandas
pip3 install spotipy

Now open Jupyter Notebook in the directory where you installed these libraries. Open a new notebook and name it whatever you like. Now we jump to the main part of the article: how to extract the data from Spotify.

First, import the pandas and spotipy libraries:

import pandas as pd 
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

Next comes the code that obtains an access token, which we need to fetch details from Spotify:

cid = "YOUR_CLIENT_ID"
secret = "YOUR_CLIENT_SECRET"
client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
sp.trace = False
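If you would rather not paste the secret into the notebook, one option is to keep both values in environment variables. The sketch below assumes the variables are named SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET (the names spotipy itself looks for); the helper load_spotify_credentials is a made-up name for illustration, not part of spotipy.

```python
import os


def load_spotify_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables.

    `env` is any dict-like mapping; it defaults to the real environment.
    """
    cid = env.get("SPOTIPY_CLIENT_ID")
    secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not cid or not secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before starting Jupyter"
        )
    return cid, secret
```

You could then write cid, secret = load_spotify_credentials() and pass them to SpotifyClientCredentials exactly as shown above.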

You now have a token, and you can use "sp" to analyse any public playlist.

sp.user_playlist_tracks("USERNAMEOFCREATOR", "PLAYLISTID")

The username is not the display name (not "Appu aravind", for example); you will find it in the URL.

(Screenshot: the username, marked in the URL)
(Screenshot: the playlist ID, marked in the URL)

Copy both of those values.
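If you prefer not to copy the values by hand, here is a small sketch that pulls them out of a playlist link. It assumes the link looks like https://open.spotify.com/user/USERNAME/playlist/PLAYLISTID or https://open.spotify.com/playlist/PLAYLISTID; the function name parse_playlist_url is my own, and other URL shapes are not handled.

```python
from urllib.parse import urlparse


def parse_playlist_url(url):
    """Return (username, playlist_id) taken from a playlist link.

    username is None when the link has no /user/<name>/ part.
    """
    # Split the path into its non-empty segments; the query string is ignored.
    parts = [p for p in urlparse(url).path.split("/") if p]
    username = None
    playlist_id = None
    # Each keyword ("user", "playlist") is followed by its value.
    for i, part in enumerate(parts[:-1]):
        if part == "user":
            username = parts[i + 1]
        elif part == "playlist":
            playlist_id = parts[i + 1]
    if playlist_id is None:
        raise ValueError("No playlist ID found in: " + url)
    return username, playlist_id
```

For a link without a username part, the function returns None in the first position, so you still have to supply the creator yourself.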

When you execute that command, you get JSON containing the details of all the tracks in the playlist. Now we have to convert it to a table format, and for this we will use pandas, which we already imported. We define a function called analyze_playlist that takes two arguments, creator and playlist_id. First we create a list, playlist_features_list, which contains the column headers mentioned above: "artist", "album", "track_name", "track_id", "danceability", etc.

Next we create an empty DataFrame with pandas, using those headers as columns. Then we loop through every track in the playlist and extract its features. For each track we create an empty playlist_features dict and add values to it: first metadata such as "artist", "album", "track_name" and "track_id", then the "audio_features", which hold the rest of the song's attributes.
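To make the metadata step easier to picture, here is a minimal sketch that runs on a hand-written item instead of a real API response. The sample values are made up, and the dict only includes the keys that the function below reads; a real item contains many more fields. The helper name track_metadata is also just for illustration.

```python
def track_metadata(item):
    """Pick the four metadata fields out of one playlist item."""
    track = item["track"]
    return {
        "artist": track["album"]["artists"][0]["name"],
        "album": track["album"]["name"],
        "track_name": track["name"],
        "track_id": track["id"],
    }


# Made-up item that mimics only the parts of the JSON we use.
sample_item = {
    "track": {
        "id": "TRACK_ID_PLACEHOLDER",
        "name": "Example Song",
        "album": {
            "name": "Example Album",
            "artists": [{"name": "Example Artist"}],
        },
    }
}

row = track_metadata(sample_item)
print(row["artist"], "-", row["track_name"])
```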

def analyze_playlist(creator, playlist_id):

    # Create empty dataframe
    playlist_features_list = ["artist", "album", "track_name", "track_id", "danceability", "energy", "key", "loudness", "mode", "speechiness", "instrumentalness", "liveness", "valence", "tempo", "duration_ms", "time_signature"]

    playlist_df = pd.DataFrame(columns=playlist_features_list)

    # Loop through every track in the playlist, extract features and append the features to the playlist df

    playlist = sp.user_playlist_tracks(creator, playlist_id)["items"]
    for track in playlist:
        # Create empty dict
        playlist_features = {}
        # Get metadata
        playlist_features["artist"] = track["track"]["album"]["artists"][0]["name"]
        playlist_features["album"] = track["track"]["album"]["name"]
        playlist_features["track_name"] = track["track"]["name"]
        playlist_features["track_id"] = track["track"]["id"]

        # Get audio features
        audio_features = sp.audio_features(playlist_features["track_id"])[0]
        for feature in playlist_features_list[4:]:
            playlist_features[feature] = audio_features[feature]

        # Concat the dfs
        track_df = pd.DataFrame(playlist_features, index=[0])
        playlist_df = pd.concat([playlist_df, track_df], ignore_index=True)

    return playlist_df
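One thing to keep in mind: the Spotify Web API returns playlist tracks in pages (as far as I know, up to 100 per request), so the function above only sees the first page of a long playlist. The sketch below shows one way to collect every page using spotipy's sp.next() method. The helper name fetch_all_items is my own, and it is not part of the original function.

```python
def fetch_all_items(sp, creator, playlist_id):
    """Collect the items from every page of a playlist.

    `sp` is an authenticated spotipy.Spotify client, as created earlier.
    """
    results = sp.user_playlist_tracks(creator, playlist_id)
    items = list(results["items"])
    # Each page carries a "next" URL until the last page is reached.
    while results.get("next"):
        results = sp.next(results)
        items.extend(results["items"])
    return items
```

To use it, you could replace the line that builds playlist inside analyze_playlist with playlist = fetch_all_items(sp, creator, playlist_id).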

Now all you have to do is call the function. For the public playlists you browse, the creator will be "spotify".

analyze_playlist("spotify", "Playlist_id")

Now you'll be able to see the table.

To save it in CSV format:

analyze_playlist("spotify", "Playlist_id").to_csv("dataframe.csv", index=False)

You can then download the saved CSV file from the Jupyter home page.
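Once the file is saved, you can load it back with pandas whenever you want to explore it. Here is a minimal sketch that reads the CSV and averages a few of the audio feature columns; the function name summarize_playlist_csv and the choice of columns are just examples.

```python
import pandas as pd


def summarize_playlist_csv(path):
    """Load the saved CSV and return the mean of a few audio features."""
    df = pd.read_csv(path)
    numeric = df[["danceability", "energy", "valence", "tempo"]]
    return numeric.mean()
```

Calling summarize_playlist_csv("dataframe.csv") returns a pandas Series with one average per column.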

So that's it for this article. Thank you.
