Text analysis of students’ diary entries during the Covid-19 lockdown in South Africa
Author
Jonathan Jayes
Published
July 6, 2021
Students Speak
Hi Johan - here are some visualizations I have put together from your students’ diary entries. I think that they tell quite a nice story. I hope that some are useful. I’ve done them in black and white - I’m not sure where you want to publish them in the end. If you’d like some colour I can add it easily.
I’ve written up the process mostly so that I can remember. The visualizations are at the bottom of the post.
Context
The Stellenbosch students of Economic History 281 were encouraged to keep a diary during the lockdown as the Covid-19 pandemic overtook the world in March 2020. This post is a short text analysis of the content of their diary entries.
Data
The students’ diary entries have been ingested to form a dataset such that each row is one student’s observation on one day. Additional columns specify the date and the week of the log. There are 333 observations in total. Three examples are shown in the table below.
| Date | Text | Week |
|---|---|---|
| 2020-03-21 | Today has probably been the most challenging day yet. I feel like my mind hasn’t taken a break. There are just so many unknown factors right now and I don’t know how to fully prepare myself. I like plan and to know how things are going to be. It is driving me crazy with everything being so fluid at the moment. I understand it must be difficult for the University to give us clear answers as to what is going to happen exactly in the coming weeks, but it is so difficult to know what head space to be in and how to prepare. […] I just feel like I need a break from it all to focus on other things and pretend that life is normal, even just for a minute. | 1 |
| 2020-03-21 | I realized today that people only care about things that affect them directly. A family friend turned 50 today and I declined to go to the party, but my family still wants to proceed with my grandmother’s 70th birthday party. This does not seem fair to me.[…] I’ve come to the conclusion that it is our civil duty to educate those around us on how we’re supposed to act and act in union with the rest of the world, even if said people are your own family. | 1 |
| 2020-03-21 | Today I went hiking up Lion’s Head with two friends. Later, my family from Switzerland called me and informed me that their government implemented a state of emergency. From now on no stores will open apart from pharmacies and grocery stores. Hospital visits are only allowed for emergencies. Everyone is to stay at home and gatherings of more than 6 persons are forbidden. I suspect that South Africa will enforce similar arrangements if the situation gets worse. For now, I am glad that some gyms and stores haven’t closed yet. It still feels like a holiday at the moment. | 1 |
These data were supplemented to include the number of Covid-19 cases in South Africa, the number of deaths, and the number of tests performed. These may provide some context around the change in content of the diary entries over time.
Word Cloud
We start with a word cloud which shows the words used by the students in their diary entries.
The size of each word reflects how frequently it is used. The sentiment of each word is scored with the bing sentiment lexicon, a general-purpose English sentiment lexicon that categorizes words in a binary fashion, as either positive or negative.
We can see that common positive words include “support”, “privileged”, “healthy”, “productive”, and “excited”. Common negative words are dominated by “virus”, followed by “difficult”, “struggling”, and “infected”.
This is slightly more informative than a generic word cloud showing word frequency. However, it should be noted that the words must occur in both the students’ diary entries and the bing sentiment lexicon in order to be shown in the word cloud.
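The filtering step described above can be sketched in a few lines. This is a minimal illustration, not the code used for the post: the lexicon here is a tiny hand-typed stand-in for bing (using the labels reported above), and the function names `tokenize`, `sentiment_word_counts` and `dropped_word_counts` are hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon in the style of bing: every word is labelled
# either "positive" or "negative". The real lexicon is far larger.
TOY_LEXICON = {
    "support": "positive",
    "healthy": "positive",
    "excited": "positive",
    "virus": "negative",
    "difficult": "negative",
    "struggling": "negative",
}


def tokenize(text):
    """Lower-case the text and split it into runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_word_counts(entries, lexicon):
    """Count only the words that occur in both the entries and the lexicon."""
    counts = Counter()
    for entry in entries:
        for word in tokenize(entry):
            if word in lexicon:
                counts[(word, lexicon[word])] += 1
    return counts


def dropped_word_counts(entries, lexicon):
    """Count the words that the lexicon does not cover (and so are not drawn)."""
    counts = Counter()
    for entry in entries:
        for word in tokenize(entry):
            if word not in lexicon:
                counts[word] += 1
    return counts
```

Words such as "lockdown" end up in the second counter because the lexicon does not assign them a sentiment, which is why frequent words can be missing from the cloud.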
The table below shows some common words in the students’ diary entries that are excluded from the word cloud.
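Common words excluded from wordcloud

| Word | Number of uses |
|---|---|
| Lockdown | 251 |
| People | 211 |
| Day | 183 |
| Time | 172 |
| Family | 111 |
| Feel | 92 |
| South | 90 |
| Home | 80 |
| World | 78 |
| Days | 71 |
| Africa | 61 |
| Life | 54 |
| Online | 51 |
| 19 | 50 |
| Friends | 50 |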
We can also include a conventional word cloud beside the comparison cloud, as shown in the figure below.
Evolution of students’ diary entry sentiment over time.
The figure below shows the change in sentiment of the students’ diary entries over the course of the lockdown. It requires some explanation: the words used by the students are grouped by week and scored according to a sentiment lexicon, and the scores are then averaged within each week. Each point on the graph represents the average sentiment of the students’ diary entries in a particular week.
We can see that sentiment is poor at the outset, improves, and then drops dramatically at the end of the period. It is noteworthy that the average sentiment is negative for the entire period, as highlighted by the dotted line at zero.
This can be explained by the choice of sentiment lexicon used to score the words. The AFINN-111 dataset is a lexicon of English words rated for valence with an integer between minus five and plus five. The words were manually labelled by Finn Årup Nielsen between 2009 and 2011. An example of the scores assigned to words in the students’ diary entries is shown in the table below.
AFINN sentiment scores

| Word | Sentiment score |
|---|---|
| Bullshit | -4 |
| Catastrophic | -4 |
| Panic | -3 |
| Fake | -3 |
| Worse | -3 |
| Funny | 4 |
| Fun | 4 |
| Wonderful | 4 |
| Thrilled | 5 |
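The weekly scoring described above can be sketched as follows. This is a minimal illustration under assumptions, not the code behind the figure: the lexicon contains only the nine words from the AFINN table, the average is taken over every scored word in a week (one reading of the description; averaging per entry first would be another), and the name `weekly_mean_sentiment` is hypothetical.

```python
import re
from collections import defaultdict

# Scores copied from the AFINN table in this post; the full lexicon is much larger.
AFINN_SUBSET = {
    "bullshit": -4,
    "catastrophic": -4,
    "panic": -3,
    "fake": -3,
    "worse": -3,
    "funny": 4,
    "fun": 4,
    "wonderful": 4,
    "thrilled": 5,
}


def weekly_mean_sentiment(rows, lexicon):
    """Average the lexicon scores of all scored words within each week.

    rows: iterable of (week, text) pairs, one per diary entry.
    Returns {week: mean score}; weeks with no scored words are omitted.
    """
    totals = defaultdict(int)
    n_scored = defaultdict(int)
    for week, text in rows:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in lexicon:
                totals[week] += lexicon[word]
                n_scored[week] += 1
    return {week: totals[week] / n_scored[week] for week in sorted(n_scored)}
```

Words that are not in the lexicon contribute nothing, so the weekly average depends only on the subset of words that AFINN happens to score.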
What are the words most specific to each week of the student diary entries?
The table below shows the words most specific to each week.
Week-specific words calculated with weighted log odds

| Week 1 | Week 2 | Week 3 | Week 4 | Week 5 | Week 6 |
|---|---|---|---|---|---|
| African leaders | Lockdown starts | Privilege | Easter Sunday | Zoom | Clothing bank |
| Church | Virus | Conspiracy theories | Payment | R500 billion | Lockdown restrictions |
| Airports | Townships | An obligation | Hot Cross buns | SUN Learn | Level 5 |
| NSFAS | Cases recorded | Continues to rise | Extension | Economic stimulus | Livelihoods |
Nice! We can see that we capture some elements of the experience in each week of lockdown.
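For readers curious about what a weighted log odds score does, here is a simplified sketch. It follows the general idea of a log odds ratio with a uniform Dirichlet prior, divided by its estimated standard deviation, and is not necessarily the exact variant used to produce the table. The function name `weighted_log_odds` and the `alpha` pseudo-count are assumptions made for illustration.

```python
import math
from collections import Counter


def weighted_log_odds(counts_by_group, alpha=1.0):
    """Score how specific each word is to each group (here, each week).

    counts_by_group: {group: Counter mapping word -> count}.
    alpha: pseudo-count added to every word (an assumed uniform prior).
    Returns {group: {word: z-score}}; large positive scores mark words that
    are used more in that group than in all the other groups combined.
    Needs at least two distinct words overall.
    """
    overall = Counter()
    for counts in counts_by_group.values():
        overall.update(counts)
    vocab = list(overall)
    alpha0 = alpha * len(vocab)
    n_total = sum(overall.values())

    scores = {}
    for group, counts in counts_by_group.items():
        n_i = sum(counts.values())  # words used in this group
        n_j = n_total - n_i         # words used in all other groups
        group_scores = {}
        for word in vocab:
            y_i = counts.get(word, 0)
            y_j = overall[word] - y_i
            log_odds_i = math.log((y_i + alpha) / (n_i + alpha0 - y_i - alpha))
            log_odds_j = math.log((y_j + alpha) / (n_j + alpha0 - y_j - alpha))
            variance = 1.0 / (y_i + alpha) + 1.0 / (y_j + alpha)
            group_scores[word] = (log_odds_i - log_odds_j) / math.sqrt(variance)
        scores[group] = group_scores
    return scores


def top_words(scores, group, k=4):
    """Return the k words with the highest score for one group."""
    ranked = sorted(scores[group], key=scores[group].get, reverse=True)
    return ranked[:k]
```

The useful property is that a word like "lockdown", which is frequent in every week, scores close to zero everywhere, while a rarer word concentrated in a single week rises to the top of that week's list.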
Sentiment and week-specific words
This figure superimposes the week-specific words above the line graph that shows the evolution of the students’ sentiment across the weeks.
I think it captures a bit of the experience - at the outset there was anxiety about the lockdown, difficulties with internet access and a worry about the rise in cases. This was followed by conspiracy theories and discussions of obligation and privilege. The collective mood improved toward Easter, and was further buoyed by the announcement of a large stimulus package by the government. Finally there was exasperation about the state of employment and livelihoods.
Comparison of students’ diary entries with Covid-19 statistics.
The figure below shows the evolution of the sentiment of the students’ diary entries beside the rising Covid-19 case numbers in South Africa.
It is difficult to draw conclusions about a relationship between the number of cases and the sentiment of the students’ reflections. While average sentiment and the number of tests appear to move together at the outset of the lockdown, I think this is statistical noise rather than a meaningful relationship.
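One rough way to check an impression like this is a permutation test on the weekly series. The sketch below is an illustration only and was not part of the original analysis; the function names are hypothetical, and the test assumes the weekly values are exchangeable, which trending time series generally are not.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with some variation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def permutation_p_value(x, y, n_permutations=10_000, seed=0):
    """Share of random re-orderings of y whose |correlation| with x is at
    least as large as the observed one (a two-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        # Small tolerance so that floating-point ties count as "at least as large".
        if abs(pearson(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return hits / n_permutations
```

With only six weekly points there are just 720 possible orderings, so even a strong-looking correlation carries little evidence, and two series that both trend over time will often correlate for reasons unrelated to each other.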
Contextualization of timing of diary entries
The purpose of this selection of figures is to emphasize that the diary entries were recorded at the outset of the pandemic in South Africa. The number of cases was relatively low compared to the steep increase in cases which followed in winter of 2020.
The figures below compare the period of diary entries to the number of cases and deaths in the first year of the pandemic.
I think option two conveys the message clearly and without clutter.
Option one is a two-panel plot of the Covid-19 statistics and the number of diary entries recorded by the students.
Option two is a single panel with the period of diary entries superimposed on the Covid-19 statistics.
Option three annotates the plot with a thick line to show where the diary entries occur.
Option four is a variant of option two with a legend.
Financial markets comparison
The figure below shows the mean sentiment of the students’ diary entries alongside the JSE All Share Index for the same period, as well as the Rand to US Dollar exchange rate. Several students ask in their entries what will happen to the stock market, with one stating, “I was looking at good stock picks on the JSE today. Every disaster can be an opportunity…”.
Again the trends displayed may constitute statistical noise. The JSE was rising out of an enormous trough created as investors panicked with Covid-19 spreading into Europe and the US. The Rand is a notoriously volatile currency. Yet, these trends are interesting to show in the context of the early weeks of lockdown.
Interactive figure
We can also make this a little more attractive as an interactive chart with some colour and hover labels.
The figure below shows the same information as above, with a hover field to show the week-specific words. Mouse over the points to see the words most specific to each week and the average sentiment of the students’ diary entries.