A bit of stardust on text analysis

All published Bowie lyrics analyzed

David Bowie is considered one of the most influential, enduring and versatile contemporary music artists. With a career spanning almost six decades, he managed to remain active, innovative and critically acclaimed by constantly reimagining himself and his music. This project aims to analyze all the officially published lyrics in Bowie's songbook. Through text and sentiment analysis, I aim to recognise patterns that could be correlated with the artist's own life on planet Earth but also with major global events.


Positivity, negativity, fear, sadness, anger, anticipation, surprise, all lie in Bowie's songs. How have these sentiments changed through the years?

Lyrics ranging from 1964 to 2015

The dataset was collected using LyricWikia, an online wiki-based lyrics database that offers access through an API and a Python library, as well as APIseeds' lyrics API. Data on the published songbook was obtained by scraping Allmusic and Wikipedia. Here you can find all song lyrics used in this project.

Click to navigate through the visualisation below.

Text mining: applying TFIDF and more

To relate the lyrics to the zeitgeist of each decade, and also to events in Bowie's own personal life, we use TFIDF. Short for term frequency–inverse document frequency, TFIDF is a numerical statistic intended to reflect how important a word is to a document in a collection or corpus, in this case the whole lyrics database, divided by decade. The tf–idf value increases proportionally to the number of times a word appears in the document and is offset by the number of documents in the corpus that contain the word, which adjusts for the fact that some words appear more frequently in general. Tf–idf is one of the most popular term-weighting schemes today: a 2015 survey found that 83% of text-based recommender systems in digital libraries used it.
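As a rough illustration of the weighting described above, here is a minimal sketch in plain Python. It is not the code behind this project: the function names (tokenize, tfidf_by_document, top_terms), the simple tf × log(N / df) variant and the toy corpus are all assumptions made for the example, and the toy strings are placeholder words rather than real lyrics.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    return re.findall(r"[a-z']+", text.lower())


def tfidf_by_document(docs):
    """docs maps a label (e.g. a decade) to its text.

    Returns {label: {term: score}} using tf * log(N / df), where tf is the
    term's share of the document's tokens, N is the number of documents and
    df is the number of documents containing the term.
    """
    counts = {label: Counter(tokenize(text)) for label, text in docs.items()}
    n_docs = len(counts)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())

    scores = {}
    for label, term_counts in counts.items():
        total = sum(term_counts.values())
        scores[label] = {
            term: (n / total) * math.log(n_docs / doc_freq[term])
            for term, n in term_counts.items()
        }
    return scores


def top_terms(term_scores, k=20):
    # Highest-scoring terms first; ties keep their first-seen order.
    return sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)[:k]


# Hypothetical toy corpus: one "document" per decade, placeholder words only.
toy_corpus = {
    "1970s": "star star space rocket dance",
    "1980s": "dance dance city night",
    "2010s": "dark night star silence",
}
scores = tfidf_by_document(toy_corpus)
for decade, term_scores in scores.items():
    print(decade, top_terms(term_scores, k=3))
```

Note that a word found in every decade gets a score of zero under this variant, since log(N / N) = 0. Library implementations often use smoothed or normalised variants, so their absolute scores will differ from this sketch.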

Hover or tap on the circles to see the top 20 terms and their TFIDF scores.

Planet Earth is blue and there's nothing I can do

Sentiment analysis was also applied to the lyrics, this time grouped by album, with all of Bowie's officially published albums placed in chronological order. What was the main sentiment in each one of Bowie's albums? Which are the happiest and which the saddest of his career? Using Python with the Pandas and TextBlob libraries, we can analyse the lyrics with Natural Language Processing (NLP). TextBlob's sentiment property returns a namedtuple of the form Sentiment(polarity, subjectivity). The polarity score is a float within the range [-1.0, 1.0], whereas the subjectivity is a float within the range [0.0, 1.0], where 0.0 is very objective and 1.0 is very subjective. This gives a ranking of the albums, with negative sentiment scoring below zero and positive sentiment above zero. Quite unsurprisingly, David Bowie's swansong album (★ or Blackstar) was filled with negative sentiments, whereas his early 70s and 80s eras were the most upbeat.
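The album ranking can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual pipeline: it uses plain dictionaries instead of Pandas, it assumes each album's score is the mean polarity of its songs (the aggregation actually used is not stated here), and the helper names textblob_polarity and rank_albums are hypothetical. The only TextBlob call relied on is TextBlob(text).sentiment.polarity.

```python
from statistics import mean


def textblob_polarity(text):
    """Polarity in [-1.0, 1.0] from TextBlob (needs `pip install textblob`)."""
    # Imported lazily so the ranking logic can be used with another scorer.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def rank_albums(albums, polarity_fn=textblob_polarity):
    """albums maps an album title to a list of song lyric strings.

    Returns [(album, mean polarity)] sorted from most negative to most
    positive. Averaging per song is an assumption of this sketch.
    Albums with no songs are skipped.
    """
    scored = {
        album: mean(polarity_fn(song) for song in songs)
        for album, songs in albums.items()
        if songs
    }
    return sorted(scored.items(), key=lambda kv: kv[1])
```

With TextBlob installed, calling rank_albums(albums) on a dictionary of album titles and lyric strings returns the albums ordered from the most negative to the most positive mean polarity. Passing a different polarity_fn makes it possible to swap in another sentiment scorer without changing the ranking logic.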




TO BE CONTINUED