Tarsila Machado

AI in Public Health, Part Two: Bias and Inequities

Updated: Jul 21, 2021



Source: https://unsplash.com/photos/w7ZyuGYNpRQ

Although artificial intelligence (AI) can help us track and predict disease patterns and outcomes more accurately, it also risks exacerbating existing inequities if those analyses are built on biased data. Diverse datasets that capture the impacts of public health issues like COVID-19 on marginalized groups are necessary to produce findings that reflect the health inequities present in society. When equitable, unbiased data is collected, AI shows great promise as a supplement to human expertise, reducing the time needed to analyze complex information. However, if measures are not taken to ensure that both the data used and the individuals designing the algorithms are free of bias, the resulting models and frameworks can have even more detrimental effects on the most vulnerable populations in need of targeted health interventions.

A recent article in the Journal of the American Medical Informatics Association found that the rapid transition to AI usage amid the COVID-19 pandemic carries a substantial risk of biased or under-representative data modeling. In the rush to use AI to extract statistical insight from the pandemic, the risk of developing inaccurate prediction models increases when the data samples used do not fully represent the affected populations. Black, Asian, and Hispanic/Latinx populations have been frequently cited as disproportionately suffering from the impacts of COVID-19 transmission, suggesting that pandemic relief and related healthcare funding and resources should be distributed in proportion to the populations most at risk.

These racial and ethnic disparities in COVID-19 disease burden largely result from socioeconomic disadvantages that leave members of these populations less able to socially distance from others, increasing their chances of infection. In addition, existing structural and health-related inequities can worsen the progression of COVID-19 once infection sets in. Prior research found that although Black patients in the U.S. are, on average, sicker than white patients, AI algorithms predicted that less healthcare spending would be needed to treat Black patients, because inequities in access to medical care and preventive measures have historically suppressed spending on their care. If the datasets used to generate predictions do not fully encompass the structural and access inequities actually present, the result can be an exacerbation of those disparities through inadequate COVID-19 response plans and misallocation of resources away from the communities that need them most.
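The underlying mechanism is easiest to see in code. The sketch below is a deliberately simplified, synthetic illustration, not the algorithm examined in that research: it trains a "risk score" on healthcare cost rather than true health need, with an assumed access gap (the 0.6 spending factor and all data are hypothetical), and shows that two equally sick patients receive different scores.

```python
# Hypothetical sketch: how training a "risk score" on healthcare *cost*
# rather than health *need* can encode an access gap. All data is synthetic.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 10_000

# True illness burden, identical in distribution across both groups.
illness = rng.gamma(shape=2.0, scale=1.0, size=n)
group = rng.integers(0, 2, size=n)  # 0 = full access, 1 = reduced access

# Assumed access gap: the reduced-access group generates 40% less spending
# at the same illness level, because care is harder to obtain.
access = np.where(group == 1, 0.6, 1.0)
cost = illness * access + rng.normal(0, 0.1, size=n)

# Fit a model that predicts cost, mimicking a "risk score" that is
# actually a spending forecast. (In real data the group signal would
# enter through correlated proxies rather than an explicit column.)
X = np.column_stack([illness, group])
model = LinearRegression().fit(X, cost)

# Compare predicted "risk" for two equally sick patients.
sick = 3.0
for g in (0, 1):
    score = model.predict([[sick, g]])[0]
    print(f"group {g}, illness {sick}: predicted score {score:.2f}")
# The reduced-access patient gets a lower score despite identical illness.
```

Any program that allocates extra care to the highest-scoring patients would then systematically under-enroll the group whose spending was suppressed by access barriers, even though the model is an accurate predictor of cost.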

When used properly, AI-based disease transmission tracking and modeling can supplement the epidemiological work of public health and related professionals. It is effective at filling gaps with inferential modeling when clinical evidence is missing or otherwise difficult to identify, interpret, or use for forecasting. However, systematic reviews of AI-generated COVID-19 prediction models have cited biases in the form of under-representative data samples, model overfitting, and imprecise reporting of the study populations used as foundations for prediction modeling. The authors caution those using these tools that a balance is needed between producing results quickly and producing models that are unbiased, high quality, and equitable across all groups experiencing the pandemic.
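A small synthetic example can illustrate the under-representation problem these reviews describe. In the sketch below (all data and group effects are invented for illustration), a classifier is trained on a sample that is 95% one population; its performance on the well-represented group looks fine, while performance on the under-represented group collapses.

```python
# Minimal sketch of the under-representation problem: a model trained mostly
# on one population can look accurate overall while failing on the group it
# rarely saw. All data and group-specific effects here are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

def make_group(n, slope):
    """Synthetic patients: one risk factor, group-specific outcome relation."""
    x = rng.normal(0, 1, size=(n, 1))
    p = 1 / (1 + np.exp(-slope * x[:, 0]))
    y = rng.binomial(1, p)
    return x, y

# 95% of the training sample comes from group A; group B is nearly absent,
# and its risk factor relates to the outcome differently (assumed slopes).
xa, ya = make_group(9_500, slope=2.0)
xb, yb = make_group(500, slope=-1.0)
X = np.vstack([xa, xb])
y = np.concatenate([ya, yb])

clf = LogisticRegression().fit(X, y)

# Evaluate on fresh, balanced samples from each group.
for name, slope in [("A (well represented)", 2.0),
                    ("B (under-represented)", -1.0)]:
    xt, yt = make_group(5_000, slope)
    print(f"group {name}: accuracy {clf.score(xt, yt):.2f}")
# Group B's accuracy falls to chance level or below, yet a single
# aggregate metric computed on the full sample would hide this.
```

This is why the reviews flag results reported only in aggregate on imprecisely described study populations: a single overall metric can conceal near-random performance on exactly the groups most affected.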

A report published by Duke University’s Margolis Center for Health Policy emphasizes bias mitigation as a major component of ethical AI use and development. Controlling bias in algorithms and machine learning technology helps ensure that disease monitoring and predictive modeling more accurately serve the needs of those most disproportionately affected. The report advocates for stronger policy processes to regulate AI usage and reduce the risk of racial bias arising in AI algorithms and frameworks, and recommends best practices such as reporting standards and the public release of source code within health systems that use AI and related tools. The contributing authors call for strategies specifically designed to govern how AI data is collected and used so that healthcare spending and pandemic relief resources are allocated equitably, improving health outcomes and reducing disparities as communities deal with COVID-19.

When bias in the source data and code behind public health AI is mitigated, these tools make disease tracking and forecasting more efficient, reducing the time and cost of generating models compared to human analysis alone. However, if caution is not taken when building predictive models, they can inadvertently worsen racial and ethnic health inequities by misrepresenting the data, leading to insufficient resource allocation and healthcare spending for the populations that most urgently need them. Targeted policy processes and regulatory standards aimed at increasing the transparency of collected datasets and incorporating equitable reporting procedures can reduce AI bias and its potential harm to marginalized populations. With greater control over bias in AI prediction modeling, public health and technology experts alike can make strides toward addressing systemic racism and related social inequities in the COVID-19 pandemic, with the hope of improving health outcomes and reducing disparities for years to come.

Speaking Plainly:

Artificial Intelligence (AI) and other machine learning systems run the risk of exacerbating racial and ethnic health disparities through bias in source data collection.

When AI models are built on biased data that does not accurately capture the disease burden in specific populations, the resulting predictions misrepresent the true severity of disease spread and the need for resources.

When biased AI source data and code produce biased prediction models, fewer resources and healthcare spending dollars are allocated to the populations most in need.

Greater transparency in data collection and coding, along with regulatory standards and policies, can help reduce AI bias, subsequently improving health outcomes and reducing inequities.