Expanding Research Capacity in Sub-Saharan Africa Through Informatics, Bioinformatics, and Data Science Training Programs in Mali

Bioinformatics and data science research have boundless potential across Africa due to its high levels of genetic diversity and disproportionate burden of infectious diseases, including malaria, tuberculosis, HIV and AIDS, Ebola virus disease, and Lassa fever. This work lays out an incremental approach for reaching underserved countries in bioinformatics and data science research through a progression of capacity building, training, and research efforts. Two global health informatics training programs sponsored by the Fogarty International Center (FIC) were carried out at the University of Sciences, Techniques and Technologies of Bamako, Mali (USTTB) between 1999 and 2011. Together with capacity building efforts through the West Africa International Centers of Excellence in Malaria Research (ICEMR), this progress laid the groundwork for a bioinformatics and data science training program launched at USTTB as part of the Human Heredity and Health in Africa (H3Africa) initiative. Prior to the global health informatics training, its trainees published first or second authorship and third or higher authorship manuscripts at rates of 0.40 and 0.10 per year, respectively. Following the training, these rates increased to 0.70 and 1.23 per year, respectively, which was a statistically significant increase (p < 0.001). The bioinformatics and data science training program at USTTB commenced in 2017 focusing on student, faculty, and curriculum tiers of enhancement. The program’s sustainable measures included institutional support for core elements, university tuition and fees, resource sharing and coordination with local research projects and companion training programs, increased student and faculty publication rates, and increased research proposal submissions. 
Challenges included the reliance of high-speed bandwidth availability on short-term funding, lack of a discounted software portal for basic software applications, protracted application processes for United States visas, lack of industry job positions, and low publication rates in the areas of bioinformatics and data science. Long-term, incremental processes are necessary for engaging historically underserved countries in bioinformatics and data science research. The multi-tiered enhancement approach laid out here provides a platform for generating bioinformatics and data science technicians, teachers, researchers, and program managers. Increased literature on bioinformatics and data science training approaches and progress is needed to provide a framework for establishing benchmarks on the topics.

INTRODUCTION
African countries have long been disproportionately burdened by the "big three" infectious diseases (HIV and AIDS, tuberculosis, and malaria) and neglected emerging infectious diseases such as Ebola virus disease (EVD) and Lassa fever. African populations maintain the world's highest levels of genetic diversity, which decline proportionately with increasing distance from Africa (Tishkoff et al., 2009). Bioinformatics and data science research [considered in this context, respectively, as the methods and software tools for understanding biological data and the unification of data design, collection, and analysis (Hayashi, 1998; Wikipedia, 2019a)] thrives on genetically diverse populations, as variation in population substructure contributes to the identification of true associations in complex disorders and drug response (Campbell and Tishkoff, 2008; Tishkoff et al., 2009; Quansah and McGregor, 2018). Research on these topics within Africa provides considerable opportunities for improving health outcomes through its application in infectious disease research, vaccine and drug development, and the study of drug resistance patterns. The completion of the Human Genome Project and technological advances have led to significant cost reductions for genomic data acquisition and also provide immense opportunities for novel insights into etiology, diagnosis, and therapy (Tishkoff et al., 2009).
African researchers and participant populations have historically been underrepresented in genome-wide association studies (GWAS). Through 2014, only 11 of the thousands of GWAS conducted had included African participants (Rotimi et al., 2014). While African countries such as South Africa have strong bioinformatics and data science capabilities, such capacity is unevenly distributed across Africa, and many of its countries have yet to develop any of these capacities. Currently, bioinformatics and data science degree programs are concentrated within several African institutions (Karikari et al., 2015b). Other factors negatively impacting bioinformatics and data science research in Africa include weak biomedical infrastructure; lack of governmental financial support; limited computational expertise; lack of participation in collaborative research beyond sample collection; and limited training opportunities, biorepositories, and databases (Tishkoff et al., 2009; Woolley et al., 2010; Rotimi et al., 2014; Karikari et al., 2015a; World Health Organization, 2015; Nielsen et al., 2017).
To fully benefit from advances in bioinformatics and data science research, it is imperative to train the next generation of African scientists on their use (Adoga et al., 2014; Human Heredity and Health in Africa, 2018). Tastan Bishop et al. (2015) note that the shortage of trained bioinformaticians is among the main obstacles to the development of bioinformatics in Africa. Doumbo and Krogstad (1998) note that doctoral training on advanced topics is essential for African countries to define and implement their own health priorities. These demands call for building local university programs and infrastructure to establish environments that are conducive to bioinformatics and data science training. Bioinformatics is known to require less infrastructural investment than bench science initiatives, but essential resources remain necessary, such as powerful computer systems, reliable high-speed internet, access to databases and software programs, and reliable electricity (Karikari et al., 2015a). Karikari et al. (2015a) also note the importance of research infrastructure, research funding, training programs, scientific networking, and collaborations as key elements for developing bioinformatics expertise. Other factors affecting the implementation of training programs include teaching laboratories, server systems, airfare cost, timeliness of visas, suitable computational infrastructure, socio-political stability, and availability of open training spots (Shaffer et al., 2018). This capacity may be gained through research and training on overlapping computationally intensive topics such as data management and data capture (Shaffer et al., 2018). Attwood et al. (2017) describe the importance of data management, data storage, data integration, data sharing, and data science in bioinformatics training.
The importance of DCMSs is regularly noted in the literature as a key tool for establishing sustainable and collaborative research efforts (Lansang and Dennis, 2004;World Health Organization, 2004;Abou Zahr and Boerma, 2005;Kirigia and Wambebe, 2006;Gezmu et al., 2011;Gutierrez et al., 2015;Mulder et al., 2017;Shaffer et al., 2018).
Multi-country organizations such as the H3Africa and H3Africa BioNet (H3ABioNet) consortiums have yielded extensive training and research opportunities within Africa (Human Heredity and Health in Africa, 2013; National Institutes of Health, 2018). The H3Africa initiative aims to study genomics and environmental diseases to improve the health of African populations through a partnership among the AESA, the Wellcome Trust, the ASHG, and the NIH (Adoga et al., 2014; Human Heredity and Health in Africa, 2018). The H3Africa Consortium has diversified bioinformatics skills and training in Africa, providing genomics training for over 500 Africans in approximately 5 years (Mulder et al., 2018a). H3ABioNet is a Pan-African bioinformatics network consisting of 32 bioinformatics research groups in 15 African countries and partner institutions in the United States. It provides bioinformatics training in both introductory topics and specialized topics such as next generation sequencing (NGS) and GWAS (National Institutes of Health, 2018). The H3ABioNet bioinformatics training platform includes distance-based online training courses using virtual classrooms across 20 African institutions. The Eastern Africa Network of Bioinformatics Training (EANBiT) provides bioinformatics training in Kenya as part of an M.Sc. program in bioinformatics (International Centre of Insect Physiology and Ecology, 2018). Doctoral training in bioinformatics is also provided in Botswana and Uganda through the Collaborative African Genomics Network (CAfGEN; Mlotshwa et al., 2017). Karikari (2015) discusses the current bioinformatics training programs in Ghana. Tastan Bishop et al. (2015) lay out the development of bioinformatics as a discipline and list the current bioinformatics degree programs in Africa. Mulder et al. (2018b) provide guidelines for competencies for bioinformatics training in Africa.
The African Genomic Center maintains the first genome sequencing facility that was launched in Cape Town in 2018 and includes a strong bioinformatics training component (SAMRC, 2018). Other organizations promoting bioinformatics in Africa include The African Society for Bioinformatics and Computational Biology and formerly The ABioNET (SciDevNet, 2004; African Society for Bioinformatics and Computational Biology, 2019).
The focus of the current work is bioinformatics and data science training in Mali, sub-Saharan Africa. Research in Mali has emphasized malaria as it is the country's primary cause of morbidity and mortality, representing 42% of consultations in its health centers (Sissoko et al., 2017). Malaria control strategies in Mali have emphasized universal intervention coverage, epidemic and entomological surveillance, and targeted operational research (President's Malaria Initiative, 2018). Substantial progress in malaria reduction has occurred through scaling up malaria prevention and control interventions, resulting in a nearly 50% reduction in malaria mortality rates in children under 5 years of age (President's Malaria Initiative, 2018). However, resistance to antimalarial drugs has complicated efforts to fully control malaria. The utilization of genomic and clinical data to understand parasite evolution, predict behaviors of resistance to new antimalarial medication, and inform strategies to prevent the spread of drug-resistant malaria is thus of great importance (Flegg et al., 2011; Fairhurst et al., 2012; Maiga et al., 2012; Takala-Harrison and Laufer, 2015; Oboh et al., 2018). Other infectious diseases with significant burden in Mali include leishmaniasis, filariasis, and tick-borne diseases. Neglected infectious diseases that have not been extensively studied (but are not necessarily absent) in Mali include Lassa fever and EVD (Shaffer et al., 2014; Traore et al., 2016).
As with many countries in sub-Saharan Africa, Mali has significant limitations in developing, implementing, sustaining, and expanding innovative mechanisms for research efforts and clinical trials that are central to its health improvement (Miiro et al., 2013; Mwangoka et al., 2013; Richie et al., 2015; Dicko et al., 2016; Niare et al., 2016). Recent studies on health information systems (HIS) in Mali reported limited expertise in data management, data analysis, and report generation (MEASURE Evaluation, 2014). Mali also shares the difficult task of collecting data through a weak HIS for monitoring the health of its population (Asangansi, 2012; Ndabarora et al., 2014). Despite these limitations, research investments in Mali have been substantial. Mali was established as an International Center of Excellence in Research (ICER) in 2002 and is currently ranked as the seventh highest investment country for malaria research (Head et al., 2017). The USTTB regularly serves as the lead institution for research and training projects, including several recent awards as part of the H3Africa initiative (Landoure et al., 2016; Human Heredity and Health in Africa, 2019).
Here we describe an incremental approach for engaging the next generation of African scientists in research through a progressive sequence of informatics, bioinformatics, and data science training programs at the USTTB. We describe the approaches, developments, and challenges encountered, culminating with the West African Center of Excellence for Global Health Bioinformatics Research Training program, in an effort to assist researchers in reaching underserved populations in similar environments.

Study Site
Situated in urban Bamako, Mali, USTTB is comprised of schools of medicine, pharmacy, and basic sciences; an institute of applied science; and research laboratories focusing on malaria, tuberculosis, and retrovirology (Harvard T.H. Chan School of Public Health, 2018). The site maintains teaching computer laboratories; server systems; and a formal data center including computer workstations, printers and internet access in controlled-access spaces. USTTB is a member of the REDCap (Vanderbilt, TN) Consortium. The site is situated near the epicenter for a host of infectious diseases and is surrounded by numerous complementary research efforts and networks, including the West Africa International Centers of Excellence for Malaria Research [ICEMR (National Institute of Allergy and Infectious Diseases, 2018a)].

An Incremental Approach for Engaging Underserved Populations in Bioinformatics and Data Science Research
Formal research and training infrastructure at USTTB dates back to 1989 with the launch of the MRTC. The facility maintained hardwired internet access, laboratories, classrooms, conference rooms, and a library (Science Blog, 2000). The MRTC supported a host of internationally funded research projects (particularly from the NIAID) and training programs and worked closely with Mali's National Malaria Control Program (NMCP; Science Blog, 2000). While the MRTC's mission was not initially focused on molecular research, it spawned growth in the area through its capacity building, particularly in the area of epidemiology. Bioinformatics was formally introduced to Mali in 2003 through the African Center for Training in Functional Genomics of Insect Vectors of Human Disease (AFRO VECTGEN), which was sponsored by the WHO as part of its Special Programme for Research and Training in Tropical Diseases (TDR) initiative. A timeline of incremental developments in bioinformatics and data science capacity building, research, and training at USTTB is listed in Table 1.
The West African Center of Excellence for Global Health Bioinformatics Research Training program was launched in 2017. The program leveraged infrastructure and personnel from two earlier informatics training programs, a malaria research project, the USTTB bioinformatics M.Sc. program, and the African Center of Excellence in Bioinformatics (ACE) teaching computer laboratories (Doumbia et al., 2012; Koita et al., 2016, 2017; National Institute of Allergy and Infectious Diseases, 2018b). Descriptions of these efforts follow.

International Training in Medical Informatics (ITMI)
From 1999 to 2003, the ITMI program provided short- and long-term training in informatics for Malian researchers at the MRTC and governmental health agencies across West Africa. The ITMI program was complemented with research on determinants of drug resistance, immune evasion and virulence in malaria, development of field research sites to study drug resistance, human response to malaria, pathogenesis of severe malaria, and malaria vaccine trials. The informatics focus of the ITMI covered research question formulation and data collection, capture, linkage, management, and analysis. The program included five trainees with the overall goal of completing master's degrees in public health and preparing manuscripts and submitting them for publication in peer-reviewed journals.

Informatics Training in Global Health (ITGH)
Building on the ITMI program, the ITGH program was carried out between 2004 and 2011 and provided training toward completion of M.Sc. degrees in public health. Several trainees also participated in an online master's diploma program known as epidemiology and public health (Epidemiologie et Sante Publique en ligne; ESPEL). With training delivered entirely in French, ESPEL was a consortium serving Francophone countries in the Mediterranean and North Africa through the University of Bordeaux with courses in statistics and epidemiology. Course instruction in the ESPEL program was provided by USTTB medical faculty and online tutoring tools. (Doumbia et al., 2012). These countries provided four study sites with differential seasonal prevalence of Plasmodium falciparum (P. falciparum) infection and incidence in uncomplicated malaria (Shaffer et al., 2018). The primary goal of the study was to collect epidemiologic, clinical, and molecular data to better understand the transmission and human impact of malaria. Significant byproducts of this work were trained research personnel and established DCMS (Shaffer et al., 2018). These efforts continued in 2017 focusing on the study of malaria control interventions and antimalarial drug resistance (National Institute of Allergy and Infectious Diseases, 2018b).

African Centers of Excellence in Bioinformatics Program (ACE)
The ACE program is a public-private partnership with the NIAID and the FNIH to strengthen bioinformatics research capacity in low and middle income ( The program is one of only 13 such programs in 7 African countries (Tastan Bishop et al., 2015;Mulder et al., 2016). The program includes 20 courses arranged over 4 semesters with cohorts of 15 students over quarterly semesters, including three semesters of coursework and short-term internships and a single semester of thesis research in bioinformatics. The 1st year of study includes two semesters of core coursework equivalent to 60 academic credits (European Credit system), and the 2nd year consists of 60 credits of coursework and a 4-month practicum and a master thesis research project. Training is provided in collaboration by USTTB faculty; the H3ABioNet Consortium (from instructors in Tunisia, South Africa, and Ghana); and collaborating institutions in France and the United States (through video conferencing and webinars). The program's curriculum is shown in Table 2.
A key component of the curriculum for engaging underserved populations in research included a formal course on English speaking and writing in scientific research (Table 1, course code BIN 103).

West African Center of Excellence for Global Health Bioinformatics Research Training
Launched in October 2017, the West African Center of Excellence for Global Health Bioinformatics Research Training is a collaborative bioinformatics and data science training program between USTTB and Tulane University. The program provides bioinformatics and data science training to faculty and students at USTTB and is sponsored by the NIH Fogarty International Center as part of the H3Africa initiative. The program seeks to establish a sustainable bioinformatics and data science research training program at USTTB, focusing on advancing the USTTB bioinformatics curriculum, increasing faculty and student authorship in bioinformatics and data science journals, developing grant proposals, and improving success in gaining extramural research funding.

RESULTS
The primary outcomes of the ITMI and ITGH training programs included numbers of college degrees earned and publication frequencies and rates. These programs laid the foundation for subsequent research and training efforts. Among the trainees in the ITMI and ITGH programs were investigators in the West Africa ICEMR and the West African Center of Excellence for Global Health Bioinformatics Research Training program.

International Training in Medical Informatics (ITMI)
This program provided long-term training to five trainees between 1999 and 2003. Each of these trainees successfully completed an M.Sc. in Public Health (MSPH) degree. The impact of the ITMI on publication productivity is shown in Table 3.
Publication rates per year in first authorship and third and higher authorship increased by 690% (0.10 per year to 0.79 per year) and 253% (0.40 per year to 1.41 per year), respectively, following the ITMI training program. Each of these increases was statistically significant (p < 0.001). Publications following the ITMI program were focused in the areas of malaria interventions, vaccine development, and epidemiology (Sagara et al., 2014;Portugal et al., 2017).
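The percentage increases quoted above follow from the usual relative-change formula, (after − before) / before × 100. The short Python sketch below reproduces that arithmetic from the rates reported in the text; the function name `percent_increase` and the dictionary layout are illustrative choices made here, not part of the study's analysis, and the sketch does not reproduce the significance test.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative change from `before` to `after`, expressed as a percentage."""
    if before <= 0:
        raise ValueError("baseline rate must be positive")
    return (after - before) / before * 100.0


# Publication rates (manuscripts per year) before and after the ITMI
# program, copied from the text above.
rates = {
    "first authorship": (0.10, 0.79),
    "third and higher authorship": (0.40, 1.41),
}

for label, (before, after) in rates.items():
    print(f"{label}: {percent_increase(before, after):.1f}% increase")
```

With these inputs the sketch gives 690.0% and 252.5%; the second value corresponds to the 253% reported above after rounding to a whole percent.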

Informatics Training in Global Health (ITGH)
The ITGH program provided short- and long-term informatics training in Mali between 2004 and 2011. The program included 53 short-term trainees and 7 long-term trainees from the MRTC, local governmental agencies, field sites, and neighboring Francophone West African countries. Short-term workshop training was delivered in both French and English, and ten short-term trainees completed the ESPEL online diploma training in biostatistics and epidemiology. Three of the long-term trainees earned master of public health degrees in biostatistics programs, and four of the long-term trainees completed the online ESPEL training program.

West African Center of Excellence for Global Health Bioinformatics Research Training Program (WABT)
The WABT was sponsored by the National Institutes of Health through its H3Africa initiative. The USTTB served as the WABT's lead institution, partnering with Tulane University (New Orleans, LA, United States) and the University of Strasbourg (Alsace, France). The WABT integrated three intertwined training components, namely faculty training and development, curriculum enhancement, and student training enhancement for students enrolled in the USTTB master's degree program in bioinformatics. The feedback loop illustrating the approach for launching new trainees into academic and research positions is shown in Figure 1.
The program provided a direct pipeline of trainees into the USTTB bioinformatics program and local research projects. An advisory board provided independent oversight and insight for the program. The three tiers of enhancement covered in the training program follow.

Faculty Enhancement
Faculty trainees were recruited from USTTB faculty with responsibilities in developing and overseeing the USTTB bioinformatics and data science curriculum. This component was carried out through participation in scientific workshops; delivery of oral and poster presentations at scientific conferences; mentorship on proposal development and manuscript preparation; and development and implementation of an annual bioinformatics symposium at the USTTB site. Research proposal topics focused on mobile health in malaria surveillance and efficacy evaluation for seasonal malaria chemotherapy in malaria prevention through the application of bioinformatics and data science approaches and technologies.

Curriculum Enhancement
Training activities included mentored program and curriculum development for the USTTB M.Sc. in bioinformatics program. The program's course competencies were compared and evaluated according to Mulder et al. (2018a,b). Curriculum modifications included an expansion of the program's component through providing options for completing thesis work at outside institutions. Additionally, a certificate training program in bioinformatics was developed based on current course offerings to expand participation and generate additional revenue for supporting the program. The ultimate goal of the curriculum enhancement activities was to lay the groundwork for a doctoral program in bioinformatics at USTTB.

Student Enhancement
The project provided partial scholarships for current and incoming students in the USTTB M.Sc. in bioinformatics program as well as partial scholarships for related doctoral programs. Trainees were recruited among students enrolled in the USTTB M.Sc. in bioinformatics program or doctoral students working in research programs with a bioinformatics focus. Training activities included "study abroad" training at outside institutions through the following mentored activities: formal coursework; literature review preparation; data capture, management, and analysis; and manuscript preparation. Manuscript data were provided through the West Africa ICEMR research projects. Trainees were responsible for attending and presenting research findings at professional research conferences, including the American Society of Tropical Medicine and Hygiene (ASTMH) and H3Africa consortium meetings. Funds in the amount of $10,000 USD per year were allocated for pilot research projects, which were intended to foster mentorship, incorporation of research into the classroom, and research evaluation. An online portal was developed for proposal submissions, and proposals were reviewed and scored by the training program's key investigators and advisory board (Figure 2).

Workshop Training
Workshop training was provided annually at USTTB aimed toward two aspects: (1) enabling junior trainees to effectively manage and interpret genetic data, including major bioinformatics database sources and integration with biological data; and (2) performing computational tasks and carrying out analytical approaches to process, analyze, and interpret biological data. The program's workshop themes are listed in Table 4.
The official language in Mali is French, but Bambara is the most widely spoken (Wikipedia, 2019b). The workshop training was delivered in English, and periodic translation summaries in French were delivered by USTTB faculty. Each day of the workshops concluded with student oral summaries of concepts and activities. Certificates of completion were awarded following successful completion of the workshops and were presented by the USTTB president and the project's principal investigators.

Additional Training Activities
Financial program management training was provided for USTTB financial administrators through in-person discussion sessions with trained sponsored projects personnel. Training was also provided on biographical sketch development, COI and disclosure, and research ethics.

Challenges
The challenges encountered during the bioinformatics and data science training program included language barriers, complexity in obtaining United States Exchange Visitor (J-1) visas, high-speed internet availability, and the lack of discounted software portals. While the training was primarily delivered in English, many of the trainees were not fully fluent in English. This effort also required high-speed bandwidth for utilizing software extensions and accessing biomedical databases. While high-speed bandwidth was available for training at USTTB, it depended on ongoing short-term funding. The lack of a discounted software portal for commercial software presented challenges for acquiring and upgrading several common software applications such as Microsoft Access (Redmond, WA, United States). This effort therefore focused on freeware applications including R, REDCap, ArcGIS Online (Esri, Redlands, CA, United States), and QGIS (formerly Quantum GIS; Open Source Geospatial Foundation, Chicago, IL, United States). While the training program included cohorts of trainees with similar academic focuses, the participants' bioinformatics expertise ranged from beginning to advanced skill sets. The program strategy therefore covered basic biological concepts before more advanced topics in bioinformatics and data science. The lack of industry opportunities for trainees limited post-training employment prospects primarily to academia, research, and governmental health agencies.
A PubMed search with the key words "Mali bioinformatics" yielded N = 63 publications between 2006 (the year of the first observed bioinformatics publication) and November 2018. None of these hits focused exclusively on bioinformatics or data science training (Figure 3).
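A count of this kind can be approximated programmatically through the NCBI E-utilities `esearch` endpoint. The Python sketch below is only one way such a query might be issued: the exact search string, field tags, and date bounds used for the search above are not stated in the text, so the arguments shown in the usage note are assumptions, and the helper names `build_esearch_url` and `count_hits` are hypothetical. Because PubMed indexing changes over time, the returned count need not equal 63.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities search endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term: str, mindate: str, maxdate: str) -> str:
    """Build an esearch URL that asks PubMed only for the number of hits."""
    params = {
        "db": "pubmed",
        "term": term,
        "datetype": "pdat",   # filter on publication date
        "mindate": mindate,   # e.g. "2006"
        "maxdate": maxdate,   # e.g. "2018/11"
        "rettype": "count",
        "retmode": "json",
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


def count_hits(term: str, mindate: str, maxdate: str) -> int:
    """Query PubMed (requires network access) and return the hit count."""
    with urlopen(build_esearch_url(term, mindate, maxdate), timeout=30) as resp:
        payload = json.load(resp)
    return int(payload["esearchresult"]["count"])
```

Under these assumptions, `count_hits("Mali bioinformatics", "2006", "2018/11")` would return the number of matching records at the time of the call.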

Sustainability
Sustainability was a core part of the program's study design that was envisioned through its development in a university setting capable of: maintaining key resources during lapses in short-term funding, generating tuition and fees, developing workforces and human capital, and providing teaching computer and wet laboratory capacity. The program's sustainability approaches and measures are shown in Table 5.

DISCUSSION
Bioinformatics and data science expertise arguably has the greatest potential and impact in underserved parts of Africa due to their high levels of disease and genetic diversity. The computational capacity and dynamic nature of bioinformatics research and training necessitate incremental processes in capacity building and training on related data-intensive topics such as data capture and management. This capacity may be used to establish research networks and improve site suitability for hosting additional research. The training programs here yielded a sustainable platform for launching trainees into academia and complementary research projects. They also benefited from a longstanding partnership between USTTB and Tulane University in both training and research capacities. This partnership fostered local participation in content and program development to target the specific needs and health outcomes found in Mali. The authors acknowledge that the progress described throughout this work did not occur in a vacuum and directly benefited from a host of efforts by other researchers over the past several decades.
Indeed, the teaching laboratory facilities available through the African Center of Excellence in Bioinformatics and the bioinformatics curriculum development by Mulder et al. (2016) were vital to the launch of our bioinformatics and data science training program. We found English training to be a key component for engaging underserved populations, and thus our earlier informatics training programs included formal English training. We believe that providing English training at outside institutions is becoming more difficult as United States visa programs such as the Exchange Visitor (J-1) visa program mandate English proficiency (U.S. Department of State, 2019). Inclusion of formal coursework in English in the USTTB bioinformatics curriculum also provided key advancements in this regard. It is of central importance for African universities with research missions to incorporate research-driven courses on overlapping, computationally intensive topics into their curricula, including biostatistics, geographic information systems (GIS), bioinformatics, and data science, to foster a workforce capable of competing for large-scale research projects.

Trainee Outcomes
Trainee outcomes were considered primarily in terms of publications, grant proposal submissions and awards, research conference presentations, and employment outcomes. In the absence of a strong industry or pharmaceutical presence, it is likely that trainees will ultimately gain employment in research, academic, or government settings. Emphasizing proposal preparation within the training programs therefore has great utility in this regard. The training efforts in this work benefited from several complementary malaria research projects that provided training and data sources for manuscript development and publication. Our programs also benefited from the biannual H3Africa research conferences, which provided an international venue for our trainees to present their work in oral or written form. Professional advancement in Africa is perhaps less dependent on scientific publishing as a measure of scientific productivity than in other parts of the world, and thus additional incentives for publishing may be useful for meeting the expectations of the extramural funding process, where publication is highly prioritized. One such incentive occurs at the University of Cape Town, where governmental supplements are provided for completed publications (Whitworth et al., 2010). While overall publication rates are improving across African institutions, the resulting publications are not always indexed in search engines such as PubMed. Schoonbaert (2009) notes that a more complete resource for African literature is CABI's Global Health Database. Similar issues may arise for country-specific grants or foundation grants, as they often go uncaptured in research repositories such as NIH RePORTER (U.S. Department of Health and Human Services, 2018). To this end, country research databases may be useful in improving research visibility.

Correlate Training
Ideally, trainees should develop diverse research portfolios that include topics focused on practical needs with utility in all facets of research, such as DCMS development and oversight. These practical skills provide opportunities over the entire course of a research endeavor, as opposed to skill sets limited to advanced analytical techniques that are suited to the later stages of research. The challenges associated with bioinformatics and data science training parallel those for related data intensive research processes such as DCMS. Core bioinformatics and data science research infrastructure shares common elements with data intensive epidemiological and clinical research, such as the setup of data systems, data management, and data warehousing. Because of the overlap in bioinformatics and DCMS responsibilities, we provided training on the development and use of REDCap databases and tablet-based data collection. Other practical training in our programs included program management and the development of biographical sketches, curricula vitae, and resumes.

Regional Thinking and Sustainability of Bioinformatics and Data Science Training Programs
While the training programs covered in this work focused on Mali, we believe that they made a positive impact more broadly across the region of sub-Saharan Africa. These programs regularly hosted trainees from other West African countries, including Nigeria and Ghana. The West Africa ICEMR also fostered collaborations among multiple sub-Saharan countries, including Mali, Senegal, and The Gambia.
Regional training approaches may increase research participation for countries lacking the capacity necessary to host training efforts independently. Regional approaches are recognized to promote the sharing of study protocols and the standardization of case definitions and reporting practices (Shaffer et al., 2018). Integrating data sources across study sites or countries provides opportunities for more advanced, multifactor approaches for evaluating treatments and vaccines. Such efforts are facilitated when host countries consider the health problems of neighboring countries as their own. When defining appropriate regional groupings, areas where a disease is absent may also be considered as sources of control populations for research. Virtual regional infrastructures have also been shown to improve engagement with countries with sparse research resources (Jennings et al., 2004).
Long-term sustainability of training and capacity development in Africa will likely require additional support for research and development within the host countries. Among 13 African countries surveyed (Cameroon, Gabon, Ghana, Kenya, Malawi, Mali, Mozambique, Nigeria, Senegal, South Africa, Tanzania, Uganda, and Zambia), only 3 (Uganda, Malawi, and South Africa) achieved the modest goal of spending at least 1% of GDP on research and development. Expenditures on research and development among the remaining surveyed countries ranged between 0.20 and 0.48% of GDP (NEPAD Planning and Coordinating Agency, 2010). By contrast, these expenditures for the United States and Japan were 2.74 and 3.15% of GDP, respectively (The World Bank, 2019).

Clinical Trials Infrastructure and Drug Development
Increased pharmaceutical involvement within Africa is greatly needed for developing bioinformatics and data science expertise and clinical research participation within the continent. Koita et al. (2016) note that key priorities in West Africa are the development of clinical research facilities and the training of host country investigators to ensure that the facilities and expertise necessary to evaluate candidate interventions are available in endemic regions when and where they are needed. The authors also note that many treatments deployed in Africa may never have been evaluated in participants from their target countries. Bioinformatics and data science training programs provide an opportunity for showcasing workforce capacity to attract pharmaceutical and commercial investment. In turn, such investments will likely provide competitive advantages for short-term research. Partnerships with pharmaceutical companies may also serve as another means for sustaining core infrastructure during lapses in short-term funding.

Importance of Literature on Bioinformatics and Data Science Training Efforts
To our knowledge, this is the first manuscript on bioinformatics and data science training in Mali. Additional literature on bioinformatics and data science training in Africa is needed for establishing training priorities, monitoring progress, and developing goal-based strategies for its improvement. Such literature also allows investigators developing new training programs to build on prior efforts and adapt training approaches. It is therefore essential for journal publishers to recognize the importance of publishing work on training programs as they often serve as the backbone for their associated research.

CONCLUSION
Bioinformatics and data science training programs in developing countries necessitate incremental and collaborative strategies for their feasible and sustainable development. The progress described here covered decades of collaborative efforts centered on training and research on computationally intensive topics. These efforts laid the groundwork and built platforms conducive to hosting a bioinformatics and data science training program in Mali. Training programs are perhaps best facilitated through Africa's university systems, as universities are well positioned to maintain core resources during lapses in short-term funding. While bioinformatics and data science training programs are rapidly growing across Africa, much of the continent currently lacks substantial commercial investment and is reliant on short-term funding for training and research efforts. It is therefore critical to incentivize commercial and governmental investment within African countries to complement short-term funding efforts. It is also of central importance to publish literature on scientific training programs to monitor and evaluate progress, develop standards, and share training approaches and experiences.

FUNDING
This study was supported by NIH Global Health Training awards D43TW01086 and D43TW007000 and by NIH Cooperative Agreements U2R TW010673 for the West African Center of Excellence for Global Health Bioinformatics Research Training and U19 AI 089696 and U19 AI 129387 for the West African Center of Excellence for Malaria Research. Data sets for trainee publications were financially supported through NIH/NIAID Cooperative Agreements for the West African International Centers for Excellence in Malaria awards U19 AI 089696 and U19 AI 129387. AAD is supported through the DELTAS Africa Initiative, an independent funding scheme of the African Academy of Sciences (AAS)'s Alliance for Accelerating Excellence in Science in Africa (AESA), which is supported by the New Partnership for Africa's Development Planning and Coordinating Agency (NEPAD Agency) with funding from the Wellcome Trust (DELGEME grant #107740/Z/15/Z) and the United Kingdom (UK) Government. The views expressed in this publication are those of the author(s) and not necessarily those of AAS, NEPAD Agency, Wellcome Trust, or the UK Government.