There may be a powerful epidemiological tool at our fingertips, available anywhere, any time. Most of us use the popular internet search engine Google for many of our queries, and many of us know that it has a built-in dictionary.
But, unlike a conventional dictionary, when we ask it for the meaning of a word, the engine also generates an intriguing graph named “usage over time”, which appears when we click the down arrow on the box containing the meaning.
For instance, on typing “obesity meaning” into the Google search engine, we are presented with a graph like the one below. It suggests that usage of the term obesity rose steadily from around 1930, with a sharp increase around the year 2010.
This raises the question of whether the usage of the word obesity in the billions of digitised text materials searched corresponds to the actual increase in the incidence and prevalence of obesity. I cannot help but be surprised that the result appears to be a true reflection of this.
We could call this method “culturomic epidemiology”, in which lexical analysis is used to infer epidemiological data.
To understand how this works, we should first look at how this graph is generated. A nifty program called Google Books Ngram Viewer searches a mammoth corpus of digitised literature for instances of a particular word and plots how frequently the word appears in each year. While this method has shortcomings, such as large numbers of incorrectly dated and categorised texts in the corpus searched, it seems to produce interesting results that are open to interpretation.
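The counting idea described above can be sketched in a few lines of Python. This is only an illustration and not how Google's own software is written: the function name `yearly_relative_frequency` and the three-sentence `toy_corpus` are made up for this example. For each year, it reports the share of all words that match the term of interest.

```python
import re
from collections import Counter


def yearly_relative_frequency(documents, term):
    """Return {year: share of all word tokens in that year equal to `term`}.

    `documents` is an iterable of (year, text) pairs. The name and the
    tokenising rule are illustrative assumptions, not Google's method.
    """
    term = term.lower()
    hits = Counter()    # occurrences of the term, per year
    totals = Counter()  # all word tokens, per year
    for year, text in documents:
        tokens = re.findall(r"[a-z']+", text.lower())
        totals[year] += len(tokens)
        hits[year] += sum(1 for token in tokens if token == term)
    # Skip years with no text at all to avoid dividing by zero.
    return {
        year: hits[year] / totals[year]
        for year in sorted(totals)
        if totals[year] > 0
    }


# Made-up toy corpus, for illustration only (not real data).
toy_corpus = [
    (1930, "the patient was thin and pale"),
    (1980, "obesity was noted in the patient"),
    (2010, "obesity and diabetes rise as obesity spreads"),
]

freq = yearly_relative_frequency(toy_corpus, "obesity")
for year, share in freq.items():
    print(year, round(share, 3))
```

Dividing by the total number of words in each year matters because far more text is published (and digitised) for recent years than for earlier ones, so raw counts alone would rise even if a word's popularity did not.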
Culturomics refers to a form of computational lexicology that studies human behaviour and cultural trends through the quantitative analysis of digitised texts. Analysis of words – their emergence, usage and disappearance from parlance (eg, in newspapers) – could provide insight into various aspects of human behaviour, from current fashion trends to predicting a potential winner in an election.
But for us in the health care arena, such culturomic analysis could potentially help us better understand diseases of public health importance. It could also help us understand how the public perceives the impact of certain diseases.
For instance, searches for the terms diabetes, hypertension and AIDS yield the following graphs:
Whether these graphs indicate a plateauing diabetes incidence, a declining hypertension incidence and a waning AIDS epidemic is a question that requires further research into this tool. I also invite readers to enter the terms smallpox, tuberculosis and whooping cough by directly clicking the following link for the Google Books Ngram Viewer:
I am sure you will be surprised at the results obtained.
On the surface, this appears to be a powerful epidemiological tool, available anytime and anywhere to anyone with a computer or mobile phone connected to the internet. However, the data must be compared with validated epidemiological data in formal studies before we make further inferences.
Until such time, the results obtained are food for thought!
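As a starting point for the kind of comparison suggested above, one could measure how closely a word's yearly usage tracks a validated prevalence series for the same years. The sketch below computes a Pearson correlation coefficient. The helper name `pearson` and both number lists are placeholders invented for illustration; they are not real usage or prevalence figures. Two series that both simply rise over time will correlate strongly whether or not they are related, so a formal study would need more than this.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Placeholder values only: NOT real usage or prevalence data.
word_share_by_year = [1.0, 2.0, 3.0, 4.0]      # hypothetical word-usage share
prevalence_by_year = [10.0, 20.0, 30.0, 40.0]  # hypothetical prevalence (%)

r = pearson(word_share_by_year, prevalence_by_year)
print(round(r, 3))
```

A value near 1 would mean the two series rise and fall together, a value near -1 that they move in opposite directions, and a value near 0 that there is no linear relationship.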
Dr Balaji Bikshandi is an intensivist based in Canberra and an inventor attached to the Department of Industrial Design at the University of Canberra.