Abstract:
There is a substantial increase in sexually transmitted infections (STIs) among men who
have sex with men (MSM) globally. Unprotected sexual practices, multiple sex partners,
criminalization, stigmatisation, fear of discrimination, substance use, poor access to care,
and lack of early STI screening tools are among the contributing factors. Therefore, this
study applied multilayer perceptron (MLP), extremely randomized trees (ExtraTrees) and
XGBoost machine learning models to predict STIs among MSM using bio-behavioural survey
(BBS) data in Zimbabwe. Data were collected from 1538 MSM. The data
set was split into training and testing sets using the ratio of 80% and 20%, respectively. The
synthetic minority oversampling technique (SMOTE) was applied to address class imbalance.
Using a stepwise logistic regression model, the study revealed several predictors of
STIs among MSM, such as age, cohabitation with sex partners, education status and
employment status. The results show that MLP outperformed the other STI predictive models
(XGBoost and ExtraTrees), achieving an accuracy of 87.54%, recall of 97.29%, precision of
89.64%, F1-Score of 93.31% and AUC of 66.78%. XGBoost achieved an accuracy of
86.51%, recall of 96.51%, precision of 89.25%, F1-Score of 92.74% and AUC of 54.83%.
ExtraTrees recorded an accuracy of 85.47%, recall of 95.35%, precision of 89.13%, F1-Score
of 92.13% and AUC of 60.21%. These models can be effectively used to identify
MSM at high risk, to support STI surveillance and to further develop STI screening tools
to improve the health outcomes of MSM.
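
The data preparation steps described above (an 80%/20% split followed by SMOTE) can be illustrated with a minimal, standard-library-only sketch. It is not the study's implementation: the feature values and labels are random placeholders, the 15% minority share is an assumption made only for illustration, and the helper names `train_test_split` and `smote` are hypothetical. Only the sample size (1538) and the split ratio come from the text. The sketch oversamples the training set only, which is the usual way to keep synthetic points out of the test set; whether the study did the same is not stated here.

```python
import random


def train_test_split(rows, labels, test_frac=0.2, seed=0):
    """Shuffle sample indices and hold out `test_frac` of them for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def smote(rows, labels, minority, k=5, seed=0):
    """Binary-label SMOTE sketch: add synthetic minority samples until both
    classes have the same size. Each synthetic sample lies on the segment
    between a minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples."""
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    n_needed = (len(rows) - len(minority_rows)) - len(minority_rows)
    synthetic = []
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        # k nearest minority neighbours of `base` (squared Euclidean distance)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: sum((a - b) ** 2 for a, b in zip(base, r)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return rows + synthetic, labels + [minority] * len(synthetic)


# Placeholder data: 1538 samples (as in the study), two made-up numeric
# features, and an assumed ~15% minority class. Not the BBS data.
rng = random.Random(42)
X = [[rng.random(), rng.random()] for _ in range(1538)]
y = [1 if rng.random() < 0.15 else 0 for _ in range(1538)]

X_train, y_train, X_test, y_test = train_test_split(X, y, test_frac=0.2)
X_bal, y_bal = smote(X_train, y_train, minority=1)

print(len(X_train), len(X_test))         # sizes of the 80/20 split
print(y_bal.count(0), y_bal.count(1))    # class counts after oversampling
```

In practice a maintained implementation (for example the SMOTE class in the imbalanced-learn package) would normally replace the hand-written `smote` helper, and the balanced training set would then be passed to the MLP, ExtraTrees and XGBoost classifiers.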