Key takeaway:
NL2Query aids in the rapid creation of Boolean queries, making it easier to start bibliometric studies, but further validation is needed.
A bibliometric study always starts with the delineation of the corpus, whether it is defined by a specific set of grants, a funder, a field of research, or other criteria. I am currently working on a research project about Neglected Tropical Diseases (NTDs), so creating my corpus first requires identifying the publications and grants related to any of the 22 NTDs listed by WHO.
I first thought of using MeSH terms, but quickly realised that this would exclude some work. NTDs are, as the name indicates, neglected, and I want to make sure my study includes everything. I therefore decided to build the search queries myself before asking researchers in the area to validate them.
As I am not an expert in NTDs, I needed assistance to construct accurate search queries. While I could have manually used WHO fact sheets, I thought it would be interesting to test the new GenAI tool by Dimensions: NL2Query.
I asked it to:
select publications that talk about buruli ulcer, including similar terms found on WHO documentation or Wikipedia articles.
The tool created the following query:
((("buruli ulcer" OR "Mycobacterium ulcerans") OR "M. ulcerans") OR "Buruli ulcer disease")
The query included some redundant terms (like both "buruli ulcer" and "Buruli ulcer disease") but also helpful terms that I would otherwise have had to look up in the WHO fact sheets. The tool does not handle parentheses well: it nests them unnecessarily. Fortunately they do not matter here, because every operator is OR, so the grouping does not change the result.
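To illustrate why the parentheses and the redundant term are harmless, here is a minimal Python sketch that flattens an OR-only query and drops any phrase already covered by a shorter one. The function names are my own, and the sketch assumes that phrase matching is case-insensitive and works on whole words, which should be checked against the search engine's actual behaviour.

```python
import re


def flatten_or_query(query: str) -> list[str]:
    """Return the quoted phrases of an OR-only query, ignoring parentheses."""
    # Remove quoted phrases and parentheses; only OR operators may remain.
    rest = re.sub(r'"[^"]*"', " ", query).replace("(", " ").replace(")", " ")
    if set(rest.split()) - {"OR"}:
        raise ValueError("this sketch only handles queries built with OR")
    return re.findall(r'"([^"]*)"', query)


def drop_redundant(phrases: list[str]) -> list[str]:
    """Drop phrases that contain another phrase as a whole-word sub-phrase.

    Assumption: matching is case-insensitive, so any document containing
    "Buruli ulcer disease" also matches "buruli ulcer".
    """
    padded = [f" {p.lower()} " for p in phrases]
    kept, seen = [], set()
    for phrase, key in zip(phrases, padded):
        covered = any(other != key and other in key for other in padded)
        if not covered and key not in seen:
            seen.add(key)
            kept.append(phrase)
    return kept


def build_or_query(phrases: list[str]) -> str:
    """Join phrases into a flat OR query with each phrase quoted."""
    return " OR ".join(f'"{p}"' for p in phrases)


generated = ('((("buruli ulcer" OR "Mycobacterium ulcerans") '
             'OR "M. ulcerans") OR "Buruli ulcer disease")')
simplified = build_or_query(drop_redundant(flatten_or_query(generated)))
print(simplified)
```

The simplified query should return the same publications as the generated one under the stated assumptions. For queries that mix AND and OR, the parentheses do matter and this shortcut no longer applies.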
Search results and analysis
Instead of using the default Full Text search, I searched in Title and Abstract and obtained 2,349 publications. I repeated this process for the first 11 NTDs in Dimensions. To provide context, I also noted the Disability-Adjusted Life Years (DALYs) for each disease in the following table (see GBD 2019).
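For readers who prefer to script the search rather than use the web interface, the Title and Abstract restriction can be written as a Dimensions Search Language (DSL) query. The sketch below only builds the query string. The `title_abstract_only` search index and the backslash escaping of inner quotes reflect my understanding of the DSL and should be checked against the current Dimensions documentation. The helper name `to_dsl` is hypothetical.

```python
def to_dsl(boolean_query: str) -> str:
    """Wrap a Boolean query in a DSL statement limited to title and abstract.

    Assumption: the DSL expects the whole query inside double quotes,
    with inner double quotes escaped by a backslash.
    """
    escaped = boolean_query.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "search publications in title_abstract_only "
        f'for "{escaped}" return publications'
    )


dsl_query = to_dsl('"buruli ulcer" OR "Mycobacterium ulcerans" OR "M. ulcerans"')
print(dsl_query)
```

Running the resulting string requires access to the Dimensions API, which is not shown here.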
Publication analysis
I queried Dimensions to examine the volume and growth trends of publications. This analysis revealed that:
Dengue, leishmaniasis, Chagas disease, and chikungunya have seen a surge in publications, especially since the 2000s, when WHO first introduced the concept of NTDs.
Dengue dominates research publications among the NTDs.
Despite having the highest DALYs, lymphatic filariasis has very few publications and limited growth.
Chikungunya, which had long been under-published, saw an increase in publications after the WHO’s First NTD Roadmap in 2012.
Publication and grant analysis
Next, I looked at the number of grants (from 2014 to 2023), which reflect interest from public and private funders. I plotted this data on a scatter plot, with DALYs representing the size of each dot.
The data indicates that, over the past decade, there have been fewer publications on chikungunya, Chagas disease, and leishmaniasis than might be expected given the number of grants (potentially reflecting smaller grant sizes). Lymphatic filariasis, unsurprisingly, receives little funding, which correlates with its low publication output.
Conclusion
While bibliometric analysis can be challenging to initiate, GenAI tools like NL2Query offer a helpful starting point for research studies. However, these tools should not be used without further refinement and validation.

