November 26, 2014

What are we most thankful for?

By: Lada Adamic, Moira Burke, Winter Mason, Funda Kivran-Swaine

Over the past few months, many people have been challenging one another to share on Facebook the things for which they are most grateful. So, for example, one friend might challenge another to “write 3 things you are thankful for over the next 5 days.” In the spirit of Thanksgiving, we thought we would see what people are most thankful for.

The following analysis was conducted on anonymized, aggregate data from English speakers in the United States.

We started by collecting a set of anonymized English status updates that contained “grateful” or “thankful,” as well as the word “day” preceded or followed by a number. These status updates were then aggregated and processed by a text-clustering algorithm so we could identify what people were grateful for.
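As a rough illustration of the selection step described above, the filter could be sketched as follows. This is a minimal sketch, not the code used for the analysis: the function name, the exact patterns, and the decision to accept "days" and forms like "day #3" are all assumptions made for the example.

```python
import re

# Hypothetical patterns: the post only says the update contained "grateful"
# or "thankful", plus the word "day" preceded or followed by a number.
GRATITUDE = re.compile(r"\b(?:grateful|thankful)\b", re.IGNORECASE)
DAY_NUMBER = re.compile(r"\b(?:day\s*#?\s*\d+|\d+\s*days?)\b", re.IGNORECASE)


def looks_like_challenge_post(text):
    """Return True if the text mentions gratitude and a numbered day."""
    return bool(GRATITUDE.search(text)) and bool(DAY_NUMBER.search(text))
```

A status such as "Day 3: I am thankful for my family" passes both checks, while "I am grateful for coffee" is rejected because it has no numbered day.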

One of the first things we discovered is that the people who participated in this challenge were overwhelmingly women: 90% of participants identified as female on their profile. There are several possible explanations: women may be more likely to participate in challenges such as this; women may be more likely to nominate other women than to nominate men; or women may be more willing to share what they are grateful for on Facebook. To be clear, we think it is unlikely that women are actually more grateful than men.

So now, without further ado, we present the top 10 things people are thankful for:

post00022_image0001

The #1 thing people on Facebook are most thankful for? Friends! Also in the top ten are “family and friends,” “husband,” “children,” and “daughter.” It appears that we are most thankful for the people we are closest to. In this figure, and the other bar charts that follow, blue and teal indicate friends and family, red indicates tangible things, and orange indicates intangible things.

We also looked at what topics are most distinctive for each state.

post00022_image0002

Weather patterns stand out, since the gratitude challenge was most popular during the summer and fall: people in the Southwest are grateful for much-needed rain, those in the Midwest for summer thunderstorms, residents of Minnesota and Connecticut for the fall colors, and those in balmy Hawaii and Louisiana for rainbows.

post00022_image0003

Why is Michigan grateful for electricity? We think it may be because heavy summer storms knocked out power to hundreds of thousands of houses right around the challenge.

New Yorkers are thankful for their apartments, those on the Eastern seaboard for the beach, Oregonians for yoga, and much of the South for religion and God.

post00022_image0004

Social media makes an appearance, too (Pinterest! Netflix! Google! YouTube!). Facebook was mentioned far more often than any other form of social media, and uniformly enough that it does not show up on the map for any particular state.

post00022_image0005

We also analyzed, for each topic, what fraction of people saying they were thankful for that topic were men and what fraction were women. Of course, because the challenge was done mostly by women, most topics had a smaller proportion of men — with one exception: the wife!

post00022_image0006

When we look at the topics talked about almost exclusively by women, we also see that significant others get a lot of love, as do babies and “fur babies” (pets).

post00022_image0007

We also looked at which gratitude topics received the most likes on average.

post00022_image0008

Here we see evidence of social support: when someone says they are thankful for “sobriety” or “recovery,” their friends come out and show their support by liking that status update more than any other. We also see that some of the most-liked topics are those posted disproportionately by men: “wife” and “girlfriend.”

Does the focus of our gratitude change as we get older? Next, we plot the frequency of each topic by the age of the poster.

post00022_image0009

We can see two things from this figure. First, most participants were between the ages of 28 and 55. Second, we see some trends that you would expect: friends are always one of the most important things people are thankful for, “husband” doesn’t enter the charts until the early twenties, opportunities start decreasing after the early thirties, and health becomes more important later in life. To emphasize this, let’s look at the proportion of people in each age group who are grateful for each topic.

post00022_image0010

With this figure, the changing priorities are even clearer: as one gets older, one is less likely to be thankful for music, coffee, and friends, and more thankful for one’s spouse and children.

Next we look at people’s priorities: what do they say they are grateful for on the first day relative to the later days of the challenge?

post00022_image0011

Here we see that people are more likely to begin the challenge by expressing their gratitude for their immediate families and significant others. “Husband” is much more likely to appear on the first day than on any other day, and although “friends” is the most frequently mentioned topic, it is most likely to be mentioned on the second day or later. As the days of the challenge progress, we also see increases in thankfulness for opportunities and blessings.

post00022_image0012

On Facebook, people are expressing the things they’re most grateful for, especially close friends and family. We hope you’re able to reflect on the things you’re most grateful for this Thanksgiving.

Technical note on text clustering:
To programmatically identify topics mentioned in response to the gratitude challenge, we segmented status updates on newlines and numbers (for example, when someone enumerated all the things they were grateful for). We partitioned each resulting string segment into a series of overlapping 4-grams (4 consecutive characters) and computed the cosine similarity between the 4-gram vector of the new string and those of previously seen strings. If the match was close, the string was added to an existing ‘cluster’ of strings; otherwise, a new cluster was created. For example, ‘my health’, ‘good health’, ‘being healthy’, and ‘I am in good health’ were clustered together. Topics mentioned fewer than 250 times were omitted.
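The clustering procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code behind the analysis: the similarity threshold of 0.5, the lowercasing, the choice to compare each new string against a cluster's accumulated 4-gram counts, and all function names are hypothetical, since the note above does not specify them.

```python
import math
import re
from collections import Counter


def segments(status):
    """Split a status update on newlines and on numbers (with optional
    list punctuation). Note that this also splits on numbers inside a phrase."""
    parts = re.split(r"\n|\d+[.):]?", status)
    return [p.strip() for p in parts if p.strip()]


def four_grams(text, n=4):
    """Count overlapping character n-grams; strings shorter than n yield none."""
    text = text.lower()  # assumption: case is ignored
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(strings, threshold=0.5):
    """Greedy one-pass clustering: join the closest cluster if it is
    similar enough (threshold is an assumed value), else start a new one."""
    clusters = []
    for s in strings:
        vec = four_grams(s)
        best, best_sim = None, 0.0
        for c in clusters:
            sim = cosine(vec, c["vector"])
            if sim > best_sim:
                best, best_sim = c, sim
        if best is not None and best_sim >= threshold:
            best["members"].append(s)
            best["vector"].update(vec)  # Counter.update adds the counts
        else:
            clusters.append({"vector": Counter(vec), "members": [s]})
    return clusters
```

With these assumed settings, the four health phrases from the example above end up in a single cluster, while an unrelated phrase such as ‘my husband’ starts a new one. A stricter threshold would split near matches apart, and a looser one would merge unrelated topics, so the value would need tuning on real data.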

Technical note on “distinctive” topics:
For each topic, we calculated the conditional probability of the state given the topic, that is, the share of all mentions of the topic that came from that state. Then, for each state, we chose the topic that maximized that probability. Put another way, every topic is associated with every state to some degree, and for each state we chose the topic most strongly associated with it.
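The selection of a distinctive topic per state can be sketched as below. This is a minimal sketch: the input format (one `(state, topic)` pair per mention) and the function name are assumptions for illustration, and ties are broken by whichever topic is encountered first.

```python
from collections import Counter


def most_distinctive_topic(mentions):
    """For each state, pick the topic that maximizes P(state | topic).

    mentions: iterable of (state, topic) pairs, one per mention
    (a hypothetical input format).
    Returns {state: (topic, probability)}.
    """
    mentions = list(mentions)
    pair_counts = Counter(mentions)
    topic_totals = Counter(topic for _, topic in mentions)

    best = {}
    for (state, topic), n in pair_counts.items():
        p = n / topic_totals[topic]  # P(state | topic)
        if state not in best or p > best[state][1]:
            best[state] = (topic, p)
    return best
```

Because the probability is conditioned on the topic, a topic that is popular everywhere (such as friends) is spread thinly across states and rarely wins, while a less common topic concentrated in one state scores highly there.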