Language people use in their Facebook posts can predict a future diagnosis of depression as accurately as the tools clinicians use in medical settings to screen for the disease, suggests new research.
"Social media data contain markers akin to the genome," said one of the researchers, Johannes Eichstaedt of the University of Pennsylvania in the US.
"With surprisingly similar methods to those used in genomics, we can comb social media data to find these markers. Depression appears to be something quite detectable in this way," Eichstaedt said.
For the study, published in the journal Proceedings of the National Academy of Sciences (PNAS), the researchers used data from nearly 1,200 people who consented to share their Facebook statuses and electronic medical-record information.
They then analysed the statuses using machine learning techniques to distinguish those with a formal depression diagnosis.
Analysing social media data shared by the participants across the months leading up to a depression diagnosis, the researchers found their algorithm could accurately predict future depression.
To build the algorithm, the researchers looked back at 524,292 Facebook updates from the years leading up to diagnosis for each individual with depression and from the same time span for each control participant.
They determined the most frequently used words and phrases and then modelled 200 topics to figure out what they called "depression-associated language markers."
Finally, they compared in what manner and how frequently depressed versus control participants used such phrasing.
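The comparison step described above can be illustrated with a small sketch. It is not the researchers' code: it skips the topic modelling entirely and only compares how often each word appears in two groups of posts. The function names, the smoothing value and the toy posts are all assumptions made up for illustration.

```python
import re
from collections import Counter


def word_counts(posts):
    """Count lowercase word tokens across a list of post strings."""
    counts = Counter()
    for post in posts:
        counts.update(re.findall(r"[a-z']+", post.lower()))
    return counts


def relative_frequency_ratio(group_a_posts, group_b_posts, smoothing=1.0):
    """Return {word: ratio} of smoothed relative frequencies, group A over group B.

    A ratio above 1 means the word is relatively more common in group A.
    Add-one style smoothing (an assumption here) avoids division by zero
    for words that appear in only one group.
    """
    a = word_counts(group_a_posts)
    b = word_counts(group_b_posts)
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + smoothing * len(vocab)
    total_b = sum(b.values()) + smoothing * len(vocab)
    return {
        word: ((a[word] + smoothing) / total_a) / ((b[word] + smoothing) / total_b)
        for word in vocab
    }


# Invented toy posts, not data from the study.
depressed_posts = ["i feel so alone tonight", "alone again and tired"]
control_posts = ["great game tonight", "lunch with friends was great"]

ratios = relative_frequency_ratio(depressed_posts, control_posts)
# Words ranked by how much more often the first group uses them.
top_words = sorted(ratios, key=ratios.get, reverse=True)[:3]
print(top_words)
```

In this toy data, "alone" ranks first because it appears twice in the first group and never in the second. A real analysis would need far more text, proper statistical testing and the topic-level features the study describes.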
The researchers learned that these markers comprised emotional, cognitive, and interpersonal processes such as hostility and loneliness, sadness and rumination, and that they could predict future depression as early as three months before first documentation of the illness in a medical record.