Analytics Snippet: In the Library

by Jacky Poon

Posted 14 June 2018

Tags

Analytics, Analytics Snippet, Big Data, data, Data Analytics, Data visualisation

Welcome to the Analytics Snippet column, where we showcase short code snippets in R, Python, or other programming languages, to demo the power of data analytics.

Whether you are a complete beginner or an experienced practitioner, we hope you can learn a trick or two from this column.

In this snippet we will dive into checkouts of book titles from the Brisbane City Council Library and its branches during three days in December 2017, to see what Brisbanians have been reading!

Libraries and Packages

In R, we will be using: 

plotly for the pie charts, 

dplyr for data manipulation,

tm for text mining,

wordcloud for word clouds, and

RColorBrewer for a touch of colour.

library("plotly")

library("dplyr")

library("tm")

library("wordcloud")

library("RColorBrewer")

If you have not installed these packages previously, you will need to run install.packages to install them (e.g. install.packages("plotly")).
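As a convenience, the check for missing packages can be scripted. The following is a minimal sketch rather than part of the original snippet: the helper name missing_packages is hypothetical, and the install call is left commented out so that nothing is installed without your say-so.

```r
# Packages used in this snippet
required <- c("plotly", "dplyr", "tm", "wordcloud", "RColorBrewer")

# Hypothetical helper: return the packages from `pkgs` that cannot be loaded
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

to_install <- missing_packages(required)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install from your CRAN mirror
}
```

requireNamespace only checks whether a package can be loaded, so running the check does not attach anything to your session.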

Reading the Data

The data used for this snippet can be downloaded from the Brisbane City Open Data Portal, where it is publicly available under the Creative Commons Attribution 4.0 licence.

First, in R, we provision a temporary file location for the download, and then download the zip file to the location:

temp <- tempfile()

# Save the zip to the temporary file; mode = "wb" keeps the binary file intact on Windows
download.file("https://www.data.brisbane.qld.gov.au/data/dataset/53d02339-1818-43df-9845-83808e5e247e/resource/ed431a68-15f2-430e-b140-4c603597680a/download/library-checkouts-all-branches-december-2017.csv.zip", temp, mode = "wb")




We then unzip and read the data from the comma separated values file.




data <- read.csv(unz(temp, "Library Checkouts all Branches December 2017.csv"))
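If the file name inside the archive is uncertain, it can help to list the archive contents before reading. The sketch below is one possible approach, not the method used in this snippet: read_zipped_csv is a hypothetical helper that reads the first CSV it finds in a zip file.

```r
# Hypothetical helper: read the first CSV found inside a zip archive
read_zipped_csv <- function(zip_path, pattern = "\\.csv$") {
  # unzip(list = TRUE) returns a data frame of archive contents without extracting
  contents <- unzip(zip_path, list = TRUE)
  csv_names <- grep(pattern, contents$Name, value = TRUE, ignore.case = TRUE)
  if (length(csv_names) == 0) {
    stop("No CSV file found in ", zip_path)
  }
  # unz() opens a connection to a single file within the archive
  read.csv(unz(zip_path, csv_names[1]), stringsAsFactors = FALSE)
}

# Example usage, assuming `temp` holds the downloaded zip as above:
# data <- read_zipped_csv(temp)
```

Setting stringsAsFactors = FALSE keeps titles and authors as plain text, which tends to be easier to work with for text mining.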



Inspecting the first 100 records with head(), we see the title, author, item type, age, and the library branch each item was checked out from, as well as various IDs.




head(data, n=100)




Titles from the first rows of the output:

Ashes to ashes / Jenny Han & Siobhan Vivian

Silicon chip

Nepal / written and researched by Bradley Mayhew, Lindsay Brown, Trent Holden

Trekking in the Nepal Himalaya / written and researched by Bradley Mayhew, Lindsay Brown, Stuart Butler

The destroyers / Christopher Bollen

The lonely city : adventures in the art of being alone / Olivia Laing

Too many elephants in this house / Ursula Dubosarsky ; pictures by Andrew Joyner

Lost cities of the ancients [dvd]

Blind faith / Rebecca Zanetti

Divided / Sharon M. Johnston
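Besides head(), a few base R functions give a quick overview of a data frame's shape and column types. The sketch below uses a toy stand-in rather than the real download: the column names and branch values are hypothetical, and only the titles are taken from the output above.

```r
# Toy stand-in for the checkout data (column names and branch values are hypothetical)
checkouts <- data.frame(
  Title = c("Silicon chip",
            "The destroyers / Christopher Bollen",
            "Blind faith / Rebecca Zanetti"),
  Branch = c("Branch A", "Branch B", "Branch A"),
  stringsAsFactors = FALSE
)

dim(checkouts)            # number of rows and columns
names(checkouts)          # column names
str(checkouts)            # column types with a preview of the values
table(checkouts$Branch)   # number of records per branch
```

The same calls can be run on the real data object once it has been read in.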

 

About the authors
Jacky Poon
Jacky is the current Chair of the Young Data Analytics Working Group and the Head of Finance - nib Travel, the Travel Insurance division of nib Health Funds. He is an editor of the monthly Data Analytics Newsletter for Actuaries Institute members. He is also a member of the IFoA Machine Learning Reserving Working Party and has a keen interest in research on the use of data analytics and machine learning techniques to complement the traditional actuarial skillset in insurance.