City Research Online

The proportion of cancer related entries in PubMed has increased considerably

Reyes-Aldasoro, C. C. (2017). The proportion of cancer related entries in PubMed has increased considerably. Paper presented at the 2017 NCRI Conference, 5-8 Nov 2017, Liverpool, UK.


This work explored the presence of Cancer-related publications in PubMed. The database MEDLINE of the United States National Library of Medicine (NLM) and its search engine PubMed have grown to include over 26 million entries, of which more than 3 million correspond to Cancer, roughly 12% of the total.

The public database of biomedical literature PubMed was mined systematically using queries that combined keywords: Cancer-related terms, organs, funding and year restrictions. In addition, queries combining Cancer with DNA, Computing and Mathematics were performed to explore the impact of these scientific advances on Cancer Research. All queries and figures were generated with the software platform Matlab® and the files are freely available.
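The keyword-and-year queries described above can be illustrated with a minimal sketch. It is written in Python rather than the Matlab used in this work, and it is not the author's code. The helper names `build_term` and `count_url` are hypothetical, and the keywords are examples only. The sketch assumes the public NCBI E-utilities `esearch` endpoint and the PubMed `[dp]` (date of publication) field tag; it only builds the request URL and does not contact the server.

```python
from urllib.parse import urlencode

# Public NCBI E-utilities search endpoint (assumed here; the original
# Matlab scripts may have queried PubMed differently).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(keywords, year=None):
    """Combine keywords with AND and optionally restrict to one year.

    Hypothetical helper: each keyword is wrapped in parentheses so that
    a keyword containing OR keeps its grouping.
    """
    parts = [f"({keyword})" for keyword in keywords]
    if year is not None:
        # [dp] is the PubMed field tag for date of publication.
        parts.append(f"{year}[dp]")
    return " AND ".join(parts)


def count_url(term):
    """Return an esearch URL that asks only for the number of matches."""
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    return ESEARCH + "?" + urlencode(params)


# Example keywords only; not the exact queries used in this work.
term = build_term(["cancer", "DNA"], year=2016)
print(term)
print(count_url(term))
```

Repeating the same query without the "cancer" keyword would give the yearly total against which a proportion can be computed.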

The proportion of Cancer-related entries per year in PubMed has risen from around 6% in 1950 to more than 16% in 2016. This increase is not shared by other conditions such as AIDS, Malaria, Tuberculosis, Diabetes, Cardiovascular, Stroke and Infection, some of which have, on the contrary, decreased as a proportion of the total entries per year. Interestingly, the proportion of Cancer-related entries that contain “DNA”, “Computational” or “Mathematical” has increased, which suggests that the impact of these scientific advances on Cancer has been stronger than in other conditions.
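The proportions reported here are ratios of entry counts, which the following sketch makes explicit. It is written in Python as an illustration and is not the author's Matlab code. The function names `percentage` and `yearly_proportions` are hypothetical, and the only figures used are the rounded totals quoted in this abstract (more than 3 million Cancer entries out of over 26 million), so the result is approximate.

```python
def percentage(part, total):
    """Return part as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


def yearly_proportions(condition_counts, total_counts):
    """Map each year to the percentage of entries matching a condition.

    Both arguments are dicts keyed by year. Years with no recorded
    total are skipped to avoid dividing by zero.
    """
    return {
        year: percentage(condition_counts[year], total_counts[year])
        for year in sorted(condition_counts)
        if total_counts.get(year)
    }


# Rounded totals from the abstract, so the share is approximate.
overall = percentage(3_000_000, 26_000_000)
print(f"Cancer share of all entries: {overall:.1f}%")
```

With the per-year counts returned by the queries, `yearly_proportions` would produce the time series whose rise from around 6% to more than 16% is described above.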

The sharp increase in Cancer Research, as evidenced by the number of entries in PubMed, may be due to the strong impact of scientific advances in the areas of Genetics, Computing and Mathematics, which have had a stronger influence on Cancer than on other areas such as cardiovascular disease. It is important to highlight that the results were obtained with a data mining approach and thus are limited to the presence or absence of the keywords in a single, yet extensive, database.

Publication Type: Conference or Workshop Item (Paper)
Additional Information: Paper presented at the 2017 NCRI Cancer Conference, 5-8 November, Liverpool, UK
Departments: School of Science & Technology > Engineering
School of Science & Technology > Computer Science > giCentre

