Nevin Manimala Statistics

Identifying Potential Factors Associated with High HIV viral load in KwaZulu-Natal, South Africa using Multiple Correspondence Analysis and Random Forest Analysis

BMC Med Res Methodol. 2022 Jun 17;22(1):174. doi: 10.1186/s12874-022-01625-6.


BACKGROUND: Sustainable Human Immunodeficiency Virus (HIV) virological suppression is crucial to achieving the Joint United Nations Programme of HIV/AIDS (UNAIDS) 95-95-95 treatment targets to reduce the risk of onward HIV transmission. Exploratory data analysis is an integral part of statistical analysis which aids variable selection from complex survey data for further confirmatory analysis.

METHODS: In this study, we divulge participants’ epidemiological and biological factors with high HIV RNA viral load (HHVL) from an HIV Incidence Provincial Surveillance System (HIPSS) sequential cross-sectional survey between 2014 and 2015 KwaZulu-Natal, South Africa. Using multiple correspondence analysis (MCA) and random forest analysis (RFA), we analyzed the linkage between socio-demographic, behavioral, psycho-social, and biological factors associated with HHVL, defined as ≥400 copies per m/L.

RESULTS: Out of 3956 in 2014 and 3868 in 2015, 50.1% and 41% of participants, respectively, had HHVL. MCA and RFA revealed that knowledge of HIV status, ART use, ARV dosage, current CD4 cell count, perceived risk of contracting HIV, number of lifetime HIV tests, number of lifetime sex partners, and ever diagnosed with TB were consistent potential factors identified to be associated with high HIV viral load in the 2014 and 2015 surveys. Based on MCA findings, diverse categories of variables identified with HHVL were, did not know HIV status, not on ART, on multiple dosages of ARV, with less likely perceived risk of contracting HIV and having two or more lifetime sexual partners.

CONCLUSION: The high proportion of individuals with HHVL suggests that the UNAIDS 95-95-95 goal of HIV viral suppression is less likely to be achieved. Based on performance and visualization evaluation, MCA was selected as the best and essential exploration tool for identifying and understanding categorical variables’ significant associations and interactions to enhance individual epidemiological understanding of high HIV viral load. When faced with complex survey data and challenges of variables selection in research, exploratory data analysis with robust graphical visualization and reliability that can reveal divers’ structures should be considered.

PMID:35715730 | DOI:10.1186/s12874-022-01625-6

By Nevin Manimala

Portfolio Website for Nevin Manimala