The reviewer paradox: more publications, fewer peers?

Research data bites 12.

Jan 14, 2025

Key takeaways:
Dimensions on GBQ enables powerful and flexible queries for analysing peer review trends; Dimensions data includes a “research article” classification and uniquely identified authors.
As publication numbers grow, the pool of potential reviewers also expands. However, when seeking expert reviewers (e.g., with 5+ years of publishing experience), the pool remains insufficient in most research Divisions.

I recently read that “as the number of publications increases, the number and availability of peers decrease”. This didn’t seem intuitive to me. If each publication has multiple authors, the pool of potential reviewers should grow. However, if authors publish multiple papers in the same year, there could indeed be a deficit of peers. After all, we have all heard about the peer-reviewing crisis. So what is the mechanism behind this? And what fields of research are potentially the most affected?

The peer review system varies across disciplines. It dates back to 1665 with the Philosophical Transactions of the Royal Society, which relied on editorial committees and trusted experts. By the 1950s, formal peer review had been adopted widely as a quality control mechanism in scientific publishing. There is no universal standard for the number of reviewers per article: two is most common, and the number typically ranges from one to three.

Identifying peer reviewers

I decided to test the hypothesis: “the more publications, the fewer peers are available for reviewing”, using the following assumptions:

each publication needs to be reviewed by two peers from the same field of research Division (e.g., “32 Biomedical and Clinical Sciences”).
each author on a research article is a potential reviewer for two years after its publication, provided they have been publishing for at least five years.
only publications of type article and document type research article were included.
Dimensions only contains published articles, so the number of submissions was estimated using an average acceptance rate of 32% (about 3.125 submissions per published article).
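
The two-reviewer and acceptance-rate assumptions above combine into a simple demand estimate. Here is a minimal Python sketch of that arithmetic; the function name `reviewer_demand` and the count of 1,000 articles are illustrative assumptions, not part of the original query.

```python
ACCEPTANCE_RATE = 0.32        # average acceptance rate assumed in this post
REVIEWERS_PER_SUBMISSION = 2  # each submission is assumed to need two reviews


def reviewer_demand(published_articles: int) -> float:
    """Estimate the reviews needed to end up with this many published articles."""
    # Only published articles are observed, so submissions are estimated
    # by dividing the published count by the acceptance rate.
    estimated_submissions = published_articles / ACCEPTANCE_RATE
    return estimated_submissions * REVIEWERS_PER_SUBMISSION


# Hypothetical example: 1,000 published research articles in one Division and year.
print(reviewer_demand(1000))
```

Under these assumptions, every published article stands for 2 / 0.32 = 6.25 reviews.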

Using Dimensions on GBQ, I ran the query for 1950-2024 (starting the pool of reviewers in 1948). I calculated:

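total_publications: Number of research articles likely to be peer reviewed (type article in Dimensions, excluding obvious preprints)
reviewer_demand: Twice `total_publications`, divided by the 32% acceptance rate (that is, `total_publications` × 2 × 3.125)
unique_peer_reviewers: Number of unique potential reviewers, i.e. authors with at least five years of publishing history who published in the same Division during the two preceding years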
reviews_per_reviewer: reviewer_demand / unique_peer_reviewers
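
To make these definitions concrete, here is a small Python sketch that computes the same quantities on a toy dataset. All researcher IDs, years, and the "Division A" label are hypothetical; the logic is a simplified stand-in for the SQL in the Code section, not a replacement for it.

```python
# Hypothetical authorship records: (researcher_id, division, publication_year).
authorships = [
    ("r1", "Division A", 2021),
    ("r1", "Division A", 2022),
    ("r2", "Division A", 2022),
    ("r3", "Division A", 2020),  # outside the two-year window for 2023
    ("r4", "Division A", 2022),  # recent, but publishing for fewer than 5 years
]
# Hypothetical year of first publication for each researcher.
first_publication_year = {"r1": 2010, "r2": 2018, "r3": 2005, "r4": 2021}


def unique_peer_reviewers(division, review_year, window=2, min_seniority=5):
    """Count distinct senior authors who published in the preceding `window` years."""
    pool = set()
    for researcher_id, div, year in authorships:
        recent = review_year - window <= year <= review_year - 1
        senior = review_year - first_publication_year[researcher_id] >= min_seniority
        if div == division and recent and senior:
            pool.add(researcher_id)
    return len(pool)


def reviews_per_reviewer(total_publications, reviewers, acceptance_rate=0.32):
    """reviewer_demand / unique_peer_reviewers, or None when the pool is empty."""
    reviewer_demand = total_publications * 2 / acceptance_rate
    return round(reviewer_demand / reviewers, 2) if reviewers else None


pool_size = unique_peer_reviewers("Division A", 2023)
print(pool_size, reviews_per_reviewer(total_publications=4, reviewers=pool_size))
```

In this toy example only r1 and r2 qualify as reviewers for 2023: r3 did not publish in 2021 or 2022, and r4 first published in 2021.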

As usual, starting the analysis in the mid-20th century is primarily to show a trend; data before 2010 (when metadata collection first became digitised) is less reliable.

Peer reviews needed per potential reviewer

Displaying the trend for each of the 22 Divisions, we obtain the following graphs showing the number of reviews each suitable peer would need to complete per year, between 1950 and 2024. The Divisions are ordered by the value of the end point (2024), showing that the most in-demand Divisions are in the Humanities and Social Sciences, where there are fewer authors per publication. According to our data, potential reviewers would have to review between 1.75 and 6.16 publications per year.
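
The ordering by end point can be sketched in a few lines of Python. The rows below are hypothetical placeholders shaped like the query output (year, field_of_research, reviews_per_reviewer), and sorting from highest to lowest is an assumption about how the graphs are arranged.

```python
# Hypothetical rows; the values are made up for illustration only.
rows = [
    (2023, "Division A", 2.1), (2024, "Division A", 2.0),
    (2023, "Division B", 5.5), (2024, "Division B", 6.0),
    (2023, "Division C", 3.4), (2024, "Division C", 3.2),
]


def order_by_end_point(rows, end_year):
    """Return Division names sorted by their value in `end_year`, highest first."""
    end_values = {field: value for year, field, value in rows if year == end_year}
    return sorted(end_values, key=end_values.get, reverse=True)


print(order_by_end_point(rows, 2024))
```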

Limitations

A 2008 survey found that researchers reviewed eight publications a year, which is above the need estimated for any Division in my analysis; this suggests that this research data bite considerably simplifies the problem at hand, because it assumes that:

researchers within a Division can review all articles within the same Division, which is unlikely,
the rejection rate is the same in every Division and constant from 1950 to 2024,
publications are submitted only once (we know this is not true, although some journals shortcut the process by allowing a publication to be accepted into another journal without re-reviewing),
reviewing takes less than a year,
researchers are always available to review, and
there is no reviewer fatigue (today's researchers write more papers than 15 years ago, so availability for review has decreased too).
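
To show how much the constant rejection rate assumption matters, here is a short Python sketch of the number of reviews implied by each published article at different acceptance rates. Only the 32% rate comes from this analysis; the 20% and 50% rates and the function name are hypothetical, chosen purely for illustration.

```python
def reviews_per_published_article(acceptance_rate, reviewers_per_submission=2):
    """Reviews implied by one published article at a given acceptance rate."""
    if not 0 < acceptance_rate <= 1:
        raise ValueError("acceptance_rate must be in (0, 1]")
    return reviewers_per_submission / acceptance_rate


# 0.32 is the rate used in this post; 0.20 and 0.50 are hypothetical alternatives.
for rate in (0.20, 0.32, 0.50):
    reviews = reviews_per_published_article(rate)
    print(f"{rate:.0%} acceptance -> {reviews:.2f} reviews per published article")
```

Because demand scales with 1 / acceptance rate, a Division with a lower acceptance rate than the assumed average would need proportionally more reviews per reviewer than estimated here.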

However, the results demonstrate that the relationship between publication volume and reviewer availability is not as simple as “more publications = fewer reviewers”, since the research Divisions that publish the most also require fewer reviews per adequate peer. Instead, the dynamics vary significantly across disciplines and depend on the researcher age pyramid, the number of researchers usually co-authoring publications, and so on.

Acknowledgements

Thanks to the two reviewers who suggested adding seniority and rejection rate.

Code

WITH 
  exploded_authors AS (
    SELECT 
      pub.id AS publication_id, year, author.researcher_id,
      CONCAT(ffl.code, ". ", ffl.name) AS field_of_research
    FROM `dimensions-ai.data_analytics.publications` pub,
         UNNEST(authors) AS author,
         UNNEST(category_for.first_level.full) AS ffl
    WHERE authors IS NOT NULL 
      AND document_type.classification = "RESEARCH_ARTICLE" 
      AND type = "article" 
      AND year BETWEEN 1948 AND 2024
  ),

  reviewer_pool AS (
    SELECT DISTINCT r.year AS review_year, e.field_of_research, e.researcher_id
    FROM (SELECT DISTINCT year, field_of_research FROM exploded_authors) r
    JOIN exploded_authors e
      ON e.year BETWEEN r.year - 2 AND r.year - 1
      AND r.field_of_research = e.field_of_research
    JOIN `dimensions-ai.data_analytics.researchers` res
      ON e.researcher_id = res.id
    WHERE e.researcher_id IS NOT NULL
      AND r.year - res.first_publication_year >= 5 -- Only include researchers active for 5+ years
  ),

  peer_reviewer_supply AS (
    SELECT review_year AS year, field_of_research, 
           COUNT(DISTINCT researcher_id) AS unique_peer_reviewers
    FROM reviewer_pool GROUP BY review_year, field_of_research
  ),

  yearly_publications AS (
    SELECT year, CONCAT(ffl.code, ". ", ffl.name) AS field_of_research,
           COUNT(pub.id) AS total_publications,
           -- 2 reviewers per submission; 1 / 0.32 acceptance rate = 3.125 submissions per published article
           COUNT(pub.id) * 2 * 3.125 AS reviewer_demand
    FROM `dimensions-ai.data_analytics.publications` AS pub,
         UNNEST(category_for.first_level.full) AS ffl
    WHERE document_type.classification = "RESEARCH_ARTICLE"
      AND type = "article" AND year BETWEEN 1950 AND 2024
    GROUP BY year, field_of_research
  )

SELECT 
  p.year, 
  p.field_of_research, 
  p.total_publications, 
  p.reviewer_demand,
  COALESCE(r.unique_peer_reviewers, 0) AS unique_peer_reviewers,
  p.reviewer_demand - COALESCE(r.unique_peer_reviewers, 0) AS peer_reviewer_gap,
  ROUND(COALESCE(p.reviewer_demand, 0) / NULLIF(COALESCE(r.unique_peer_reviewers, 0), 0), 2) AS reviews_per_reviewer
FROM yearly_publications p
LEFT JOIN peer_reviewer_supply r
  ON p.year = r.year AND p.field_of_research = r.field_of_research
ORDER BY p.year, p.field_of_research;
