Using Outbreak Data for Hypothesis Generation: A Vehicle Prediction Tool for Disease Outbreaks Caused by Salmonella and Shiga Toxin-Producing Escherichia coli

Foodborne Pathog Dis. 2022 Apr;19(4):281-289. doi: 10.1089/fpd.2021.0090. Epub 2022 Feb 15.

Abstract

Hypothesis generation about potential food and other exposures is a critical step in an enteric disease outbreak investigation, helping to focus investigation efforts and use of limited resources. Historical outbreak data are an important source of information for hypothesis generation, providing data on common food- and animal-pathogen pairs and other epidemiological trends. We developed a model to predict vehicles for Shiga toxin-producing Escherichia coli and Salmonella outbreaks using demographic and outbreak characteristics from outbreaks in the Centers for Disease Control and Prevention's Foodborne Disease Outbreak Surveillance System (1998-2019) and Animal Contact Outbreak Surveillance System (2009-2019). We evaluated six algorithmic methods for prediction based on their ability to predict multiple class probabilities, selecting the random forest prediction model, which performed best with the lowest Brier score (0.0953) and highest accuracy (0.54). The model performed best for outbreaks transmitted by animal contact and foodborne outbreaks associated with eggs, meat, or vegetables. When the criteria were expanded to count a prediction as correct if the true vehicle was among the two highest predicted vehicles, 83% of egg outbreaks were predicted correctly, followed by meat (82%), vegetables (74%), poultry (67%), and animal contact (62%). The model performed less well for fruit and poultry vehicles, and it did not predict any dairy outbreaks. The final model was translated into a free, publicly available online tool that can be used by investigators to provide data-driven hypotheses about outbreak vehicles as part of ongoing outbreak investigations. Investigators should use the tool for hypothesis generation alongside other sources, such as food-pathogen pairs, descriptive data, and case exposure assessments. The tool should be implemented in the context of individual outbreaks and with an awareness of its limitations, including the heterogeneity of outbreaks and the possibility of novel food vehicles.
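The two evaluation ideas in the abstract, a Brier score over multiple class probabilities and a "two highest predicted vehicles" criterion, can be illustrated with a minimal Python sketch. It is not the authors' code. The function names, the three vehicle classes, and the probabilities below are hypothetical toy values chosen only for illustration. The sketch also assumes the original multiclass Brier definition (squared error summed over classes, then averaged over outbreaks); the abstract does not state which normalization was used, and some implementations additionally divide by the number of classes.

```python
def multiclass_brier(prob_rows, true_labels, classes):
    """Mean over outbreaks of the summed squared error between the
    predicted class probabilities and the one-hot true vehicle.
    Assumed convention: sum over classes, mean over outbreaks."""
    total = 0.0
    for probs, truth in zip(prob_rows, true_labels):
        total += sum(
            (probs[c] - (1.0 if c == truth else 0.0)) ** 2 for c in classes
        )
    return total / len(true_labels)


def top_k_accuracy(prob_rows, true_labels, k=2):
    """Share of outbreaks whose true vehicle is among the k vehicles
    with the highest predicted probability."""
    hits = 0
    for probs, truth in zip(prob_rows, true_labels):
        ranked = sorted(probs, key=probs.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Hypothetical toy data: three outbreaks, three vehicle classes.
classes = ["eggs", "meat", "vegetables"]
preds = [
    {"eggs": 0.70, "meat": 0.20, "vegetables": 0.10},
    {"eggs": 0.30, "meat": 0.50, "vegetables": 0.20},
    {"eggs": 0.40, "meat": 0.25, "vegetables": 0.35},
]
truth = ["eggs", "meat", "vegetables"]

brier = multiclass_brier(preds, truth, classes)
top1 = top_k_accuracy(preds, truth, k=1)
top2 = top_k_accuracy(preds, truth, k=2)
print(brier, top1, top2)
```

In the third toy outbreak the true vehicle is ranked second, so it counts as a miss under the single-best criterion but as a hit under the top-two criterion, which is why top-two percentages can exceed plain accuracy.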

Keywords: Salmonella; Shiga toxin–producing Escherichia coli; hypothesis generation; outbreaks.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Disease Outbreaks
  • Escherichia coli Infections* / epidemiology
  • Foodborne Diseases* / epidemiology
  • Salmonella
  • Shiga-Toxigenic Escherichia coli*
  • Vegetables