In his final Normal Deviance column for the year, Hugh writes on how he plans to sneak data science into the season.
It's been a busy year. I say this partly because it's true, but mainly because it seems to be the default answer for a lot of people at the moment when discussing work (perhaps with the occasional exception). One nice thing about reaching December/January is that it gives the opportunity to catch up on some of the reading and viewing to-do list that has built up.
For those with a data science bent, you may have a collection of things on your list to expand your toolkit for the year ahead. Here are a few things sitting in my reading pile, for what it's worth:
- CAS released a monograph this year (these are always worth a look) on penalised regression and credibility (a topic that is close to my heart). Even more interesting is a rebuttal paper that questions its use.
- I'm often faced with statistical matching problems - trying to find a comparison group to provide a counterfactual. Causal analysis continues to develop, and there's always a bunch of interesting things to read. Coarsened exact matching, and its variants, is a popular approach that managed to pass me by, until now.
- It's a surprisingly poorly known fact that actuarial event presentations get put up for free access. There were lots of good things at this year's Summit that are worth a look, or a re-look.
- My deep learning knowledge is pretty patchy, but I think there's plenty of untapped potential in conditional variational autoencoders, so will give some of the foundational papers a look.
Christmas is also the time of year for tacky themed analysis (I still like last year's most Christmassy song sleuthing). For something a little simpler this year, here are some Google Trends searches that see big spikes at this time of year (thanks, pytrends package!).
Figure 1: Google trends for selected Christmas-themed searches, 2016-2023. Darker gradients are more recent years
Source: Google Trends, 2016-2023
A few easy observations:
- Aside from a big Turkey outlier, Ham has the biggest (absolute) Christmas spike. The Turkey outlier corresponds to the February 2023 Türkiye earthquake, obviously a conflation of search terms sneaking into Google Trends.
- Relatively speaking, gingerbread is champion - it spends the year in virtual hibernation to pull out a credible effort in December.
- Christmas trees seem to be getting more popular over time; Turkey may be fading a little.
- Mariah Carey gets a Christmas bump but it's pretty small compared to some of the other terms tested. Stockings gets virtually nothing.
- Gift ideas and Christmas trees start early - certainly in November. The Christmas food can be left till the last week.
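The difference between an absolute and a relative Christmas spike in the observations above can be made concrete with a few lines of Python. This is only a rough sketch, not the code behind the figure: the helper names `fetch_interest` and `december_spike` are made up for illustration, the numbers at the bottom are invented rather than real Google Trends values, and the fetch step assumes the unofficial pytrends package plus network access.

```python
from datetime import date
from statistics import mean


def fetch_interest(terms, timeframe="2016-01-01 2023-12-31"):
    """Fetch Google Trends interest as {term: [(date, value), ...]}.

    Assumes the third-party pytrends package and network access. It is an
    unofficial client, so it may break if Google changes its endpoints.
    """
    from pytrends.request import TrendReq  # imported lazily: optional dependency

    client = TrendReq(hl="en-US", tz=0)
    # Google Trends compares at most five terms per request, and values are
    # scaled 0-100 relative to the largest value within that request.
    client.build_payload(terms, timeframe=timeframe)
    frame = client.interest_over_time()
    return {t: [(ts.date(), float(v)) for ts, v in frame[t].items()] for t in terms}


def december_spike(points):
    """Return (absolute, relative) December spike for [(date, value), ...].

    absolute = December mean minus the mean of all other months
    relative = December mean divided by the mean of all other months
    """
    december = [v for d, v in points if d.month == 12]
    rest = [v for d, v in points if d.month != 12]
    if not december or not rest:
        raise ValueError("need both December and non-December observations")
    base = mean(rest)
    absolute = mean(december) - base
    relative = mean(december) / base if base > 0 else float("inf")
    return absolute, relative


# Invented numbers for illustration only, not real Google Trends values.
ham = [(date(2023, m, 1), 40.0) for m in range(1, 12)] + [(date(2023, 12, 1), 100.0)]
gingerbread = [(date(2023, m, 1), 2.0) for m in range(1, 12)] + [(date(2023, 12, 1), 30.0)]

ham_abs, ham_rel = december_spike(ham)
ginger_abs, ginger_rel = december_spike(gingerbread)
```

With these invented inputs the ham-like series has the larger absolute jump while the gingerbread-like series has the larger ratio, which is the same pattern as in the bullets above. Because Trends values are rescaled within each request, series pulled in separate requests are not directly comparable, and a single outlier (like the Turkey one) compresses everything else in its request.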
No matter how you celebrate the season, I hope that the new year brings joy and refreshment. Thanks for reading (really - I love doing this column) and see you next year!