Peer Review Simulation Sheds Light on Bias in Grant Funding


In an ideal world, the grant proposals most likely to advance scientific discovery would win financial support from the National Institutes of Health (NIH). But the reality is that science is not immune to human bias. Race, ethnicity, gender, career stage, institution of origin, and other factors can give rise to competing biases that accumulate and may influence, consciously or unconsciously, how peer reviewers score submissions.

While this is no surprise to the research community, investigators may be taken aback by the results of a simulation study published in the journal Research Policy by T. Eugene Day, DSc, of the Office of Safety and Medical Operations at The Children’s Hospital of Philadelphia. He assessed how much bias is needed before grant funding decisions are swayed.

In his role as principal health systems specialist, Dr. Day uses simulation tools to examine clinical delivery systems and predict which potential quality or safety interventions are most likely to produce a positive change at CHOP. For example, he was the principal investigator for a recently published study that demonstrated how his team used simulation to test plans to improve scheduling of elective procedures in pediatric cardiac care.

During his free time, Dr. Day designed a thought experiment based on a simulation model of a simplified grant review process. He formulated the idea after a social media conversation with colleagues about how slight advantages could play a big role in determining who is lucky enough to submit a successful grant application when federal research dollars are scarce.

Dr. Day created two fictional classes of investigators for the simulation — preferred and non-preferred — and he assigned each class 1,000 grant applications that had been given intrinsic quality ratings. He ensured that the quality of the grant applications from each group was statistically identical. Then, in order to mimic the peer review process, he generated three reviewers who were imperfect at determining the intrinsic quality of the grants.
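The setup described above can be sketched in a few lines of Python. This is only an illustration, not Dr. Day's model: the normal distributions, the 0-to-100-style quality scale, the reviewer noise level, and the function names `make_applications` and `review` are all assumptions made for the example.

```python
import random
import statistics

def make_applications(n, rng, mean=50.0, sd=10.0):
    """Draw n intrinsic quality ratings (assumed normal; not taken from the study)."""
    return [rng.gauss(mean, sd) for _ in range(n)]

def review(quality, rng, n_reviewers=3, noise_sd=5.0):
    """Each reviewer sees the true quality plus an independent random error."""
    return [quality + rng.gauss(0.0, noise_sd) for _ in range(n_reviewers)]

rng = random.Random(42)  # fixed seed so the run is repeatable

# Both classes are drawn from the same distribution, so their quality
# is statistically identical by construction.
preferred = make_applications(1000, rng)
non_preferred = make_applications(1000, rng)

# An application's score is the average of its three imperfect reviews.
preferred_scores = [statistics.mean(review(q, rng)) for q in preferred]
non_preferred_scores = [statistics.mean(review(q, rng)) for q in non_preferred]

print("mean preferred score:    ", round(statistics.mean(preferred_scores), 2))
print("mean non-preferred score:", round(statistics.mean(non_preferred_scores), 2))
```

With no bias in the reviewers, any gap between the two printed averages comes from random variation alone.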

“Just like in the real world, no three reviewers on an NIH grant ever agree on exactly how good it is,” Dr. Day said. “The same was true in the simulation. I based the distribution of that randomness from my own grant score history.”

After validating the model, Dr. Day conducted a sensitivity analysis. He introduced small biases against the non-preferred investigators into one or all three reviewers and then increased the level of bias until he found statistically significant differences in the scores and in the actual awards, even though the quality of the grants was the same.

“What I found was that it takes an alarmingly small bias to make a significant difference,” Dr. Day said. “With only about 2 percent of the score being bias, we saw statistically significant differences in the scores of grants. That did not translate to the number of funded awards at that level. But with 3 percent, we saw statistically significant differences in the actual funds distributed. Very small biases can make these rather dramatic impacts on who gets funded.”

The simulation also revealed a provocative side effect: while the more privileged investigators received funding and the underprivileged investigators were left behind, the average quality of the funded grants was lower than it would have been without bias.

“So not only do these biases influence who gets funded, but they may degrade the overall quality of science,” Dr. Day said.
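A rough sketch of this kind of sensitivity analysis follows. It is not the published model: the function name `run_review`, the quality and noise distributions, the 200-award budget, and the choice to apply the bias as a fraction removed from every reviewer's score are assumptions for illustration. It reports raw differences only and does not run a significance test, which would require many replications.

```python
import random
import statistics

def run_review(bias, seed=7, n_per_class=1000, n_funded=200, noise_sd=5.0):
    """Simulate one funding round; every numeric default here is an assumption."""
    rng = random.Random(seed)  # same seed at every bias level, so only the bias changes
    apps = []  # (score, true quality, class label)
    for label in ("preferred", "non_preferred"):
        for _ in range(n_per_class):
            quality = rng.gauss(50.0, 10.0)
            # Average of three noisy reviews of the same application.
            score = statistics.mean(quality + rng.gauss(0.0, noise_sd) for _ in range(3))
            if label == "non_preferred":
                score *= 1.0 - bias  # bias removes a fraction of the score
            apps.append((score, quality, label))

    def mean_score(label):
        return statistics.mean(s for s, _, lab in apps if lab == label)

    funded = sorted(apps, reverse=True)[:n_funded]  # fund the highest scores
    return {
        "score_gap": mean_score("preferred") - mean_score("non_preferred"),
        "preferred_awards": sum(1 for _, _, lab in funded if lab == "preferred"),
        "funded_quality": statistics.mean(q for _, q, _ in funded),
    }

# Increase the bias step by step and watch the three outcomes move.
for bias in (0.0, 0.01, 0.02, 0.03, 0.05):
    result = run_review(bias)
    print(f"bias={bias:.2f}", {k: round(v, 2) for k, v in result.items()})
```

Here `score_gap` is the difference in average score between the two classes, `preferred_awards` is how many of the funded grants went to the preferred class, and `funded_quality` is the average intrinsic quality of the funded grants. Reusing the same seed at each bias level keeps the random draws identical, so any change in the output is due to the bias.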

In addition, Dr. Day pointed out that bias can be difficult to detect because it is overshadowed by the random variation in how good reviewers are at determining the quality of the grants. In other words, the signal is a lot smaller than the noise. A first step to homing in on bias would be to promote more transparency in the grant review process, Dr. Day said, specifically by publishing the variation in individual reviewers’ scores.

“The first thing that we need to do is understand how big the variation is, and then next we need to work on narrowing it,” Dr. Day said. “That will allow us to identify real-world bias, which might then be addressed.”

Several efforts are already underway to maximize fairness in NIH peer review. Last spring, Richard Nakamura, PhD, director of the NIH’s Center for Scientific Review (CSR), announced an initiative to comb through grant proposals to remove identity cues and then test whether anonymization has any effect on funding disparities. The CSR, which is the gateway for NIH grant applications, also launched a challenge to produce the best ideas to detect possible bias in peer review and named the winners in September.

It is crucial for the NIH to seek solutions to overcome bias, Dr. Day said, because challenges in securing grant funding that are unrelated to submissions’ merit can have long-term consequences for investigators who are striving to pursue promising biomedical research.

“If a bias prevents you from getting that first grant, then maybe you don’t have the preliminary data that you need for the next grant,” Dr. Day noted. “Maybe you don’t have the funding to continue your laboratory, and you exit science.”
