Key takeaways:
A researcher-level FoR metric reveals disciplinary engagement patterns not obvious at the publication level.
NTD research remains clinically and biologically anchored, but largely disconnected from climate-related disciplines.
I joined Digital Science in 2017, as dimensions.ai was launching. The data was stored in a search index: a query for disease would match both disease and diseases, and analyse and analyze returned the same results. There is a research data bite about exact search, because sometimes you do want to search for a specific spelling. Early on, I queried the index directly using Python; it was internal and undocumented. In 2018, we introduced the Dimensions DSL API, which made the platform more usable: it allowed us to query relationships between items, facet results, and automate complex data retrievals. Pagination was still a hurdle until Michele (later a member of our team) developed the Python dimcli library, which made it all cleaner.
Between 2018 and 2019, we explored Snowflake as another backend, before migrating fully to Google BigQuery in 2020. The API remains valuable—particularly for text search and classification queries—but BigQuery transformed our ability to scale analyses. We can now combine Dimensions data with World Bank indicators, ORCID records, or internal data like gender prediction, field classification, and so on.
As querying became easier, Kathryn (who leads our work on metrics and indicators and is co-author of this bite) wanted to make metrics development more visible and accessible, hence this new series: research metrics bites. It complements our other research musings series (data, thoughts, and AI) and focuses on developing metrics not currently available in Dimensions on GBQ and applying them to public-facing questions we’re interested in.
From publications to researcher-level Fields of Research
Today I want to explore Fields of Research (FoR) not just for publications, but for researchers. Each item in Dimensions has up to four FoR, but what about researchers themselves?
While studying research on Neglected Tropical Diseases (NTDs), we wanted to understand whether NTD researchers were specialised or whether NTD research was only one part of their research portfolio. We used the WHO list of 22 diseases (including leprosy, rabies, dengue, chikungunya, and lesser-known diseases like dracunculiasis or onchocerciasis), yielding over 10,000 publications in 2010 and more than 22,000 publications in 2024. By contrast, malaria (not an NTD) alone accounted for 5,000 to 9,000 publications in those same years. NTD research is also less cited (an average of 14.51 citations vs 18.84 for malaria), and the top 5 nations of NTD researchers include Brazil, which is replaced by France in malaria research.
In 2010, for researchers who worked on NTDs, NTDs represented 62.4% of their portfolio, while in 2024, they represented only 53.7%. This shift could reflect a growing multidisciplinarity of NTD researchers, a reduction in field specialisation, or simply a broadening of their overall publication activity. It could also signal a rise in occasional NTD contributors whose primary research lies elsewhere.
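How the 62.4% and 53.7% figures are aggregated is not spelled out above, so the following is only a minimal sketch of what an "NTD share of portfolio" could look like, on hypothetical toy data. The names `authorships` and `ntd_portfolio_shares` are made up for illustration. It shows two plausible aggregations, the mean of per-researcher shares and a pooled share, which generally give different numbers.

```python
from collections import defaultdict

# Hypothetical toy data (not Dimensions data): one row per authorship,
# as (researcher_id, publication_id, is_ntd_publication).
authorships = [
    ("r1", "p1", True),
    ("r1", "p2", True),
    ("r1", "p3", False),
    ("r2", "p1", True),
    ("r2", "p4", False),
    ("r2", "p5", False),
    ("r2", "p6", False),
    ("r3", "p7", False),  # never published on NTDs, so excluded below
]


def ntd_portfolio_shares(rows):
    """Map each NTD researcher to the NTD share of their distinct publications."""
    all_pubs = defaultdict(set)
    ntd_pubs = defaultdict(set)
    for researcher_id, publication_id, is_ntd in rows:
        all_pubs[researcher_id].add(publication_id)
        if is_ntd:
            ntd_pubs[researcher_id].add(publication_id)
    # Only researchers with at least one NTD publication have a key in ntd_pubs.
    return {r: len(ntd_pubs[r]) / len(all_pubs[r]) for r in ntd_pubs}


shares = ntd_portfolio_shares(authorships)

# Aggregation A (assumption): every researcher weighs the same.
mean_share = sum(shares.values()) / len(shares)

# Aggregation B (assumption): every authorship weighs the same,
# so prolific researchers count for more.
ntd_total = sum(1 for r, _, is_ntd in authorships if r in shares and is_ntd)
all_total = sum(1 for r, _, _ in authorships if r in shares)
pooled_share = ntd_total / all_total

print(shares, mean_share, pooled_share)
```

Which aggregation is appropriate depends on the question: the mean describes a typical researcher, while the pooled share describes the combined output of the group.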
A researcher’s Fields of Research could, in principle, be derived from their entire record in Dimensions—including both publications and grants—since both types of records carry FoR classifications. However, for now we focus exclusively on publications because their volume and coverage are significantly higher, providing a more consistent and granular basis for analysis across the researcher population.
Comparing NTD focus and full portfolios
To explore this, I compared the top 10 FoRs for NTD researchers in two ways: (1) based only on their NTD publications, and (2) based on all their publications. The results reveal what NTD researchers bring into NTD research—and what they leave behind.
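The ranking behind both views follows the same pattern as the SQL in the Code section: count distinct publications per FoR and year, turn the counts into shares, rank within each year, and keep the top entries. Here is a small Python sketch of that logic on hypothetical rows; the function name `top_for_by_year` and the sample data are illustrative assumptions, not part of the actual pipeline.

```python
from collections import defaultdict


def top_for_by_year(rows, top_n=10):
    """rows: iterable of (publication_id, year, for_label).

    Returns {year: [(rank, for_label, pub_count, share), ...]}, keeping
    entries whose rank is <= top_n.
    """
    pubs = defaultdict(set)  # (year, for_label) -> distinct publication ids
    for publication_id, year, for_label in rows:
        pubs[(year, for_label)].add(publication_id)

    by_year = defaultdict(list)
    for (year, for_label), ids in pubs.items():
        by_year[year].append((for_label, len(ids)))

    result = {}
    for year, counts in by_year.items():
        # A publication with several FoR is counted once per FoR,
        # so the total is a number of FoR assignments.
        total = sum(count for _, count in counts)
        counts.sort(key=lambda item: (-item[1], item[0]))
        ranked = []
        for position, (for_label, count) in enumerate(counts, start=1):
            # Ties share the rank of the first tied row (1, 2, 2, 4, ...),
            # which is how SQL RANK() behaves.
            if ranked and ranked[-1][2] == count:
                rank = ranked[-1][0]
            else:
                rank = position
            ranked.append((rank, for_label, count, count / total))
        result[year] = [row for row in ranked if row[0] <= top_n]
    return result


# Hypothetical toy rows, for illustration only.
rows = [
    ("p1", 2010, "3202. Clinical Sciences"),
    ("p1", 2010, "3207. Medical Microbiology"),
    ("p2", 2010, "3202. Clinical Sciences"),
    ("p3", 2010, "3009. Veterinary Sciences"),
    ("p4", 2024, "4206. Public Health"),
    ("p4", 2024, "3202. Clinical Sciences"),
    ("p5", 2024, "4206. Public Health"),
]

tops = top_for_by_year(rows, top_n=10)
for year in sorted(tops):
    print(year, tops[year])
```

Because the share is each count divided by the same yearly total, ranking by share and ranking by publication count give the same order; the share is mostly useful for comparing years of different sizes.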
Both bump charts (you can switch between them within the chart itself, using the orangey button) show a shared disciplinary base rooted in 32. Biomedical and Clinical Sciences and 42. Health Sciences. In the NTD-only view, 3202. Clinical Sciences, 3207. Medical Microbiology, and 3009. Veterinary Sciences dominate, highlighting the clinical, microbial, and zoonotic foundations of NTD work. But in the full-portfolio view, we see a shift: 31. Biological Sciences enters with 3105. Genetics, and 3204. Immunology appears too, alongside 3211. Oncology and 3201. Cardiovascular Medicine, which do not appear at all in the NTD-specific ranking. Meanwhile, 3009. Veterinary Sciences is prominent only in the NTD view, confirming its tight coupling to this domain. 4206. Public Health and 4203. Health Services rise in both views, showing a growing focus on health systems. Overall, the NTD-only FoR profile is narrower and application-oriented; the full-portfolio profile reveals a broader biomedical scope, including disease areas not addressed within NTDs.
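Reading two bump charts side by side boils down to a set comparison: which FoR are in both rankings, which only in the NTD view, and which only in the full portfolio. A minimal sketch follows. The two lists are short illustrative placeholders built from labels mentioned above, not the actual top 10 for any year, and `compare_views` is a hypothetical helper name.

```python
def compare_views(ntd_top, full_top):
    """Split two lists of FoR labels into shared and view-specific entries."""
    ntd_set, full_set = set(ntd_top), set(full_top)
    return {
        "shared": sorted(ntd_set & full_set),
        "ntd_only": sorted(ntd_set - full_set),
        "full_only": sorted(full_set - ntd_set),
    }


# Illustrative placeholders only (assumption): not the real rankings.
ntd_view = [
    "3202. Clinical Sciences",
    "3009. Veterinary Sciences",
    "4206. Public Health",
    "4203. Health Services",
]
full_view = [
    "3202. Clinical Sciences",
    "3211. Oncology",
    "3201. Cardiovascular Medicine",
    "4206. Public Health",
    "4203. Health Services",
]

comparison = compare_views(ntd_view, full_view)
for key, labels in comparison.items():
    print(key, labels)
```

Fed with the real top 10 lists for a given year, the `ntd_only` entries are what researchers bring specifically to NTD work, and the `full_only` entries are what they leave behind.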
Notably absent from both views, however, are FoR linked to environmental science or climate adaptation, despite the well-documented sensitivity of NTD vectors to changing ecological conditions. This absence suggests that NTD research has yet to meaningfully integrate climate-related expertise, pointing to a missed opportunity for anticipatory, transdisciplinary approaches. There is some movement, though: we tracked 4101. Climate Change Impacts and Adaptation rising from 83rd to 63rd place in the researchers’ overall portfolio.
Conclusion
By comparing the Fields of Research of NTD researchers across their NTD-specific and full publication portfolios, we gain a deeper understanding of disciplinary focus and divergence. NTD research draws on domain-specific expertise—like veterinary science—that is tightly linked to neglected disease challenges but rarely appears in researchers’ broader work. At the same time, their full portfolios include fields like oncology and cardiovascular medicine, which are largely disconnected from NTDs but dominate global health research agendas. This contrast demonstrates the dual position of NTD research: highly specialised and applied, yet often sustained by researchers whose primary focus lies elsewhere. It also reflects the neglected status of NTDs—important enough to draw occasional contributions from researchers working in high-prestige fields, but not central enough to anchor full careers. This kind of analysis provides a scalable framework for mapping disciplinary engagement—and disciplinary neglect—across any research domain.
Code
NTD research
-- Top 10 second-level FoR per year, based on NTD publications only.
WITH ntd_for_y AS (
  -- One row per (NTD publication, second-level FoR).
  SELECT
    for_code.code AS for_code,
    CONCAT(for_code.code, '. ', for_code.name) AS for_label,
    pub.year,
    pub.id AS publication_id
  FROM `dimensions-ai.data_analytics.publications` AS pub
  INNER JOIN `NTDs_publications` AS ntd
    ON pub.id = ntd.publication_id
  LEFT JOIN UNNEST(pub.category_for.second_level.full) AS for_code
  WHERE pub.year BETWEEN 2010 AND 2024
    AND for_code.code IS NOT NULL
),
pub_counts AS (
  -- Distinct publications per FoR and year.
  SELECT
    for_code,
    for_label,
    year,
    COUNT(DISTINCT publication_id) AS pub_count
  FROM ntd_for_y
  GROUP BY for_code, for_label, year
),
with_share AS (
  -- Share of each FoR among all FoR assignments of that year
  -- (a publication with several FoR counts once per FoR).
  SELECT
    year,
    for_label,
    pub_count,
    pub_count * 1.0 / SUM(pub_count) OVER (PARTITION BY year) AS share
  FROM pub_counts
),
ranked AS (
  SELECT
    *,
    RANK() OVER (PARTITION BY year ORDER BY share DESC) AS rank_in_year
  FROM with_share
)
SELECT year, for_label, rank_in_year
FROM ranked
WHERE rank_in_year <= 10
ORDER BY year, rank_in_year;
All publications
-- Top 10 second-level FoR per year, based on all publications of NTD researchers.
WITH ntd_researchers AS (
  -- Every identified researcher with at least one NTD publication.
  SELECT DISTINCT author.researcher_id
  FROM `dimensions-ai.data_analytics.publications` AS pub
  INNER JOIN `NTDs_publications` AS ntd
    ON pub.id = ntd.publication_id
  LEFT JOIN UNNEST(pub.authors) AS author
  WHERE author.researcher_id IS NOT NULL
),
all_publications AS (
  -- One row per (publication, NTD researcher, second-level FoR), NTD or not.
  SELECT
    pub.id AS publication_id,
    pub.year,
    author.researcher_id,
    for_code.code AS for_code,
    CONCAT(for_code.code, '. ', for_code.name) AS for_label
  FROM `dimensions-ai.data_analytics.publications` AS pub
  LEFT JOIN UNNEST(pub.authors) AS author
  LEFT JOIN UNNEST(pub.category_for.second_level.full) AS for_code
  WHERE pub.year BETWEEN 2010 AND 2024
    AND author.researcher_id IN (SELECT researcher_id FROM ntd_researchers)
    AND for_code.code IS NOT NULL
),
pub_counts AS (
  -- DISTINCT keeps a publication from being counted once per co-author.
  SELECT
    for_code,
    for_label,
    year,
    COUNT(DISTINCT publication_id) AS pub_count
  FROM all_publications
  GROUP BY for_code, for_label, year
),
with_share AS (
  -- Share of each FoR among all FoR assignments of that year.
  SELECT
    year,
    for_label,
    pub_count,
    pub_count * 1.0 / SUM(pub_count) OVER (PARTITION BY year) AS share
  FROM pub_counts
),
ranked AS (
  SELECT
    *,
    RANK() OVER (PARTITION BY year ORDER BY share DESC) AS rank_in_year
  FROM with_share
)
SELECT year, for_label, rank_in_year
FROM ranked
WHERE rank_in_year <= 10
ORDER BY year, rank_in_year;