Don’t forget the lyrics: Can song sentiment predict Eurovision results?

From 2007 to 2009, Wayne Brady hosted the short-lived game show Don’t Forget the Lyrics!, where contestants competed to recall the lyrics to well-known songs for the chance to win cash and prizes. In our latest article, YDAWG explores whether the lyrics of Eurovision songs, as in Brady’s show, are what determine a song’s chances of success.

The Eurovision Song Contest has been running since 1956 and has become well-known for its extravagant spectacles and larger-than-life performances. Despite the name, the competition features entries from countries across the globe, with songs performed in a variety of different languages. To begin our analysis, we translated the lyrics of each song into English to ensure consistency between comparisons, although some tone and sentiment may be lost in translation.

Figure 1: Song language breakdown

A pie chart showing the breakdown of Eurovision song languages across the competition's history. English is the most common at 38.8%, followed by Other languages at 32.3%, French at 8.7%, German at 3.9%, Italian at 3.1%, Dutch at 3.1%, Spanish at 2.9%, Portuguese at 2.7%, Greek at 2.4%, and Swedish at 2.1%.

Over 60% of Eurovision songs are not sung in English

Figure 2: Language breakdown by year

A line chart showing the percentage of Eurovision songs performed in English, French, German, and other languages from 1956 to approximately 2025. Other languages dominated until the late 1990s, when English rose sharply to become the most common language, a trend that has continued with some variation into the 2020s.

Changes to the native language rule have meant that songs are performed more frequently in English

After translating the lyrics into English, we begin by performing an initial sentiment analysis. This involves assigning each song a distribution of sentiment, indicating the proportion of the lyrics that are negative, neutral and positive.

This process uses Natural Language Processing (NLP) to classify each individual word in the text based on the feeling it conveys. Positive words such as “love” increase the score, while negative words such as “bad” decrease it. However, because this approach considers each word in isolation, its results can differ from what the broader context of the lyrics would suggest.
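As a minimal sketch of word-level scoring, the snippet below assigns each word a label from a tiny hand-made lexicon and returns the share of negative, neutral and positive words. The article does not name the tool or lexicon used, so `TOY_LEXICON` and `sentiment_distribution` are hypothetical names chosen for illustration only.

```python
import re

# Hypothetical toy lexicon for illustration; real tools ship far larger ones.
TOY_LEXICON = {"love": 1, "happy": 1, "shine": 1, "bad": -1, "cry": -1, "lonely": -1}

def sentiment_distribution(lyrics: str) -> dict:
    """Return the share of words that are negative, neutral and positive."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    if not words:
        return {"neg": 0.0, "neu": 0.0, "pos": 0.0}
    pos = sum(1 for w in words if TOY_LEXICON.get(w, 0) > 0)
    neg = sum(1 for w in words if TOY_LEXICON.get(w, 0) < 0)
    neu = len(words) - pos - neg
    n = len(words)
    return {"neg": neg / n, "neu": neu / n, "pos": pos / n}

# Each word is scored in isolation, so the negation "not" is ignored
# and "love" still counts as a positive word.
example = sentiment_distribution("I do not love you")
```

The example line illustrates the limitation described above: a negated phrase is still scored as partly positive because context is not considered.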

Does lyrical sentiment determine Eurovision success?

After this, we fitted a random forest to predict whether or not a song won Eurovision based on its sentiment. Although the model achieved an accuracy of over 90% (since it’s easy to correctly guess that a song did not win), its practical performance was poor. The area under the ROC curve was a relatively uninformative 63%, and none of the Eurovision winners in the test dataset were correctly predicted.
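To see why high accuracy says little when winners are rare, consider a "model" that never predicts a winner. The sketch below assumes one winner per 25 entries (an illustrative ratio, not the article's actual dataset) and uses hypothetical helper functions `accuracy` and `recall`.

```python
def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Share of true positives (winners) that were predicted as positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    return sum(p == positive for p in preds_for_positives) / len(preds_for_positives)

# Illustrative labels: 4 contests of 25 entries with one winner each (assumed ratio).
y_true = ([1] + [0] * 24) * 4
y_pred = [0] * len(y_true)  # always predict "did not win"

baseline_accuracy = accuracy(y_true, y_pred)  # high, despite finding no winners
winner_recall = recall(y_true, y_pred)        # no winner is ever identified
```

Under these assumed numbers the always-negative baseline is right for every non-winner and wrong for every winner, which mirrors the pattern reported for the random forest.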

Figure 3: Random forest - Visualisation of first decision tree

A visualisation of the first decision tree in the random forest model, showing five levels of splits based on sentiment variables including compound, positive, negative, and neutral scores. Nodes shaded in blue indicate a predicted class of Winner and nodes shaded in orange indicate Not Winner, with each node displaying its Gini impurity, sample size, and value distribution.

A max depth of 5 was selected for our random forest

The original dataset was heavily imbalanced, with the vast majority of entries being non-winners, which limited the analysis. To address this issue, we considered the following:

Could a different model be a better predictor?
Could a different prediction criterion be better?
Could we introduce more variables to help the model improve accuracy?

We extend the analysis by comparing four modelling approaches: a gradient boosting machine (GBM), a logistic regression model, a linear regression model and the previously used random forest.

We introduced two new outcome variables to assess song performance: final placing (e.g. 1st, 2nd) and total points scored. Instead of modelling Eurovision success as a binary outcome (win vs. not win), we reframed the problem as a regression task by predicting final placement on a scale from 1 to 25. Entries that did not reach the final stage (i.e. failed to qualify) were excluded from this version of the dataset to maintain consistency in the outcome variable. We also separately modelled total points scored as an alternative measure of performance.

Do repeated lyrics help Eurovision songs win more votes?

Finally, to improve model performance and capture additional sources of variation, we introduced new features into the dataset. First, we included language as a predictor. While this is not directly derived from lyrical content, it may capture broader nuances that are potentially lost in translation. In addition, we included a binary indicator distinguishing between English and non-English songs.

Figure 4: Languages of winning songs

A pie chart showing the languages of Eurovision winning songs. English dominates at 45.1%, followed by French at 19.7%, Other languages at 15.5%, Hebrew and Dutch at 4.2% each, and Italian, Spanish, Swedish, and Norwegian at 2.8% each.

Secondly, we examined whether lyrical structure provides additional information beyond sentiment alone. In particular, we considered whether songs that are more “lyrically pleasing” perform better. To investigate this, we introduced additional variables such as the number of adjacent rhyming pairs — that is, successive lines that end with a rhyme, also known as couplets.
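One rough way to count adjacent rhyming pairs is sketched below. It compares the final letters of the last word on successive lines, which is a spelling-based heuristic and only an assumption about how such a feature could be built; the article does not describe its rhyme detection, and a phonetic approach would be more accurate. The function names and the sample verse are hypothetical.

```python
import re

def last_word(line: str) -> str:
    """Return the final word of a line in lowercase, or '' if there is none."""
    words = re.findall(r"[a-z']+", line.lower())
    return words[-1] if words else ""

def crude_rhyme(a: str, b: str, suffix_len: int = 2) -> bool:
    """Spelling heuristic: two different words that share the same ending letters."""
    if not a or not b or a == b:
        return False
    return a[-suffix_len:] == b[-suffix_len:]

def count_adjacent_rhymes(lyrics: str) -> int:
    """Count successive non-empty lines whose last words appear to rhyme."""
    lines = [ln for ln in lyrics.splitlines() if ln.strip()]
    endings = [last_word(ln) for ln in lines]
    return sum(crude_rhyme(a, b) for a, b in zip(endings, endings[1:]))

# Made-up verse for illustration.
verse = """The stars are shining bright
We dance into the night
I hear the music play
And then you walk away"""
couplets = count_adjacent_rhymes(verse)
```

Identical line endings are deliberately not counted as rhymes here, so a repeated hook line does not inflate the couplet count; that choice is also an assumption.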

We also wanted to consider how a song is written. For instance, do songs that have made-up words (think MMMBop by Hanson) get the crowd humming? Or would a dense, lyrically intricate song with a powerful message receive more votes at Eurovision?

To explore this, we introduced additional features capturing the linguistic composition of each song, including the number of verbs, nouns, adjectives and other grammatical categories present in the lyrics. We also included total word count as a separate variable. This allows us to investigate whether longer songs are associated with more favourable outcomes, or whether shorter, simpler songs are more effective in influencing voting behaviour.

Finally, we examined word usage patterns within each song, focusing on whether repetition plays a role in voting outcomes. Specifically, we measured the proportion of unique words in each song (that is, words that appear only once) relative to repeated words, as a way of capturing how lexically varied the lyrics are. This allows us to assess whether repeated lyrical phrases or hooks affect a song’s popularity and its ability to attract votes.
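The repetition features described here could be computed along the following lines. This is a sketch under assumptions: the share of unique words is taken relative to the total word count (the article does not state the denominator), and the function and key names are hypothetical.

```python
import re
from collections import Counter

def repetition_features(lyrics: str) -> dict:
    """Compute simple repetition measures from a song's lyrics."""
    words = re.findall(r"[a-z']+", lyrics.lower())
    total = len(words)
    if total == 0:
        return {"total_words": 0, "unique_word_share": 0.0,
                "most_repeated_word": None, "most_repeated_count": 0}
    counts = Counter(words)
    # "Unique" words are those that appear exactly once in the song.
    appear_once = sum(1 for c in counts.values() if c == 1)
    top_word, top_count = counts.most_common(1)[0]
    return {
        "total_words": total,
        "unique_word_share": appear_once / total,  # assumed denominator: all words
        "most_repeated_word": top_word,
        "most_repeated_count": top_count,
    }

# Made-up lyric for illustration.
features = repetition_features("la la la love is all love")
```

A hook-heavy song yields a low unique word share and a high count for its most repeated word, which is the contrast these features are meant to capture.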

However, many songs also include repeated non-lexical elements such as vocal sounds, filler phrases or stylistic repetitions that do not carry clear semantic meaning. Determiners such as “the” are likewise common but not particularly informative, so we focused on nouns, which give a clearer indication of the key subjects being referenced. As part of this, we identified the most frequently used noun in each song. We used an NLP module to standardise the words first, so that plural forms are grouped with their singular forms and counted together rather than as separate words.

“Love” emerged as a particularly common theme in the lyrics, with approximately 10% of songs featuring it as the most frequently used noun.

Figure 5: Top word in each song

A pie chart showing the languages of Eurovision winning songs, with English the most common at 45.1%, followed by French at 19.7% and a spread of other languages making up the remainder.

English and French account for nearly two-thirds of all Eurovision winners.

As part of this analysis, we applied k-means clustering to group words based on their usage patterns. This approach allowed us to classify frequently occurring terms into distinct clusters with similar characteristics. In our dataset, we defined 50 different groups, with words such as 'love', 'lover' and 'loving' grouped together within the same cluster. Each song's cluster assignment — that is, the group its most frequently used word belongs to — was then included as a feature in the model.

Now that we have engineered our features, we proceed to the modelling stage. We trained each model and evaluated its performance using a range of metrics to assess overall effectiveness.

As discussed above, we ran the data through a variety of models and outcome variables. Unfortunately, despite introducing these additional models and features, we were unable to develop a model that accurately predicts Eurovision performance. The best-performing model returned low R-squared values, suggesting that the factors included in the analysis have limited predictive power on their own. In some cases, the models produced a negative R-squared value, indicating that they performed worse than a simple baseline model that predicts the mean outcome (i.e. a horizontal line).
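The meaning of a negative R-squared can be checked with a small calculation. The sketch below uses the standard definition (one minus the ratio of residual to total sum of squares) on made-up placings; the data and variable names are illustrative and are not taken from the article's dataset.

```python
def r_squared(y_true, y_pred):
    """R-squared = 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

placings = [1, 5, 10, 15, 20]  # illustrative final placings
mean_baseline = [sum(placings) / len(placings)] * len(placings)  # horizontal line
poor_model = [20, 15, 10, 5, 1]  # predictions that invert the true order

r2_baseline = r_squared(placings, mean_baseline)  # the mean predictor scores zero
r2_poor = r_squared(placings, poor_model)         # worse than the mean: negative
```

Predicting the mean for every song gives an R-squared of zero by construction, so any model whose squared errors exceed those of the mean predictor falls below zero.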

Figure 6: Feature importance (random forest)

A horizontal bar chart showing feature importance scores from the random forest model. The number of proper nouns (num_propn) has the highest importance score at approximately 0.043, followed closely by num_part and compound sentiment. Features such as total word count, punctuation count, and letter count rank lowest, all below 0.025.

Proper nouns and compound sentiment score are the strongest predictors of Eurovision performance in the random forest model.

Interestingly, song sentiment (negative, positive, compound) remained a consistently important feature across the different models. However, its degree of importance varied depending on the specific model and whether positive or negative sentiment was considered. In Figure 7, we present partial dependence plots for the key features to illustrate how changes in each variable influence the model’s predictions. Songs with a positive sentiment had a 2-3 percentage point improvement in their win prediction.

Figure 7: Partial dependence plot (feature effects on predictions)

A set of six partial dependence plots showing how individual features influence the model's predictions. Higher compound and positive sentiment scores are associated with increased predicted probability of success, while higher negative sentiment scores show a declining effect. The number of participles (num_part) shows a sharp drop in influence beyond low values, total word count has a relatively flat effect, and higher paragraph counts are associated with a modest increase in predicted probability.

How key lyrical and sentiment features influence the predicted probability of Eurovision success

For example, if we focus on our GBM model, we obtain the SHAP plot shown below, which illustrates each feature’s contribution to the model’s predictions. In the beeswarm plot, the count of the most repeated word (e.g. “love”) has the largest SHAP values, indicating it is the most influential feature for this model. In contrast, the same feature shows only moderate importance in our random forest model. Similarly, neutral and negative sentiment scores were less influential in the GBM than they were in the random forest model.

Figure 8: SHAP values beeswarm (GBM)

A SHAP beeswarm plot showing each feature's contribution to the GBM model's predictions. Points are coloured from blue (low feature value) to red (high feature value). The most repeated word count (most_repeated_count) has the widest spread of SHAP values, indicating the greatest influence on model output. Positive sentiment (pos) shows several high-value outliers with notably large positive SHAP values, while features such as num_verb, unique_words, and num_sconj cluster tightly around zero, suggesting limited influence on predictions.

Positive sentiment and word repetition have the greatest influence on the GBM model's predictions

Unfortunately, our models couldn’t reliably determine Eurovision winners based on song lyrics. Success at Eurovision depends on a variety of factors, and while a song’s lyrics might carry emotion, narrative, or harmonic resonance, they are only a fragment of a greater performance. Elements such as the musical composition, staging, beat, rhythm and everything in between ultimately contribute to a song’s overall success and make Eurovision the spectacle that it is. As a result, we encourage readers to tune in and enjoy the remaining performances as they unfold.

The performance times (AEST) are provided below:

Semi-Final 2: Friday, May 15 at 5:00 AM (Live)
Grand Final: Sunday, May 17 at 5:00 AM (Live)
