Transdisciplinary Approach for Bioinformatics Education in Southern Brazil

The development and application of bioinformatics have been growing steadily, but its learning and training have been lagging. We have approached this problem through a biennial event, called EGB (Escola Gaúcha de Bioinformática), in which undergraduate and graduate students (mainly from biology, biomedicine, chemistry, physics, and computer sciences), as well as professionals, mingle and are introduced to bioinformatics from sequence, structure, and computational standpoints simultaneously. The interactive environment provided by EGB allows participants to mingle regardless of their training background, fostering collaborative learning and the exchange of experience. Both lecturers and students are encouraged to collaborate and communicate, with no formal acknowledgement of “status differentiation”.


INTRODUCTION
Bioinformatics can be defined as simply as the "application of tools of computation and analysis to the capture and interpretation of biological data" (Bayat, 2002). Nonetheless, such a definition is not as clear-cut as it seems, with some authors being much more detailed in their description, e.g., "Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical chemistry) and then applying 'informatics' techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale" (Luscombe et al., 2001). More recently, bioinformatics has been defined as "an interdisciplinary field that is concerned with the development and application of algorithms that analyze biological data to investigate the structure and function of biological polymers and their relationships to living systems" (Tapprich et al., 2021). Despite this array of definitions, there is no disputing that bioinformatics has been growing steadily, both in its use and as a research field, at least since the early 2000s (Hodcroft et al., 2021; Wilson Sayres et al., 2018; Brusic, 2007; Perez-Iratxeta et al., 2007). The lack of agreement on a definition reflects the fast-paced development of an evolving discipline whose target is not yet well defined. This lack of a settled focus is one of the major problems in teaching bioinformatics: despite the field's half-century history (Gauthier et al., 2019), its learning and teaching seem to be trailing behind its observed growth and use (Hack and Kendall, 2005; Wilson Sayres et al., 2018). This trend is observed especially in undergraduate courses focused on biological and health sciences (Madlung, 2018), but computational science courses also lack intersection with biological applications and implications (Atwood et al., 2019).
Thus, while students of life sciences lack formal training in data management and programming, computer science students have little to no contact with biological data and its intrinsic variability. This poses an immediate problem: how can the shortcomings in bioinformatics teaching and learning be addressed when dealing with such a diverse set of participant backgrounds? Considering this problem, we herein present the results from a bioinformatics learning model based on: 1) a space for undergraduate and graduate students to interact, among each other and with professionals; 2) integration of participants from different backgrounds; 3) exposure of the participants, simultaneously, to bioinformatics from sequence, structure, and computational standpoints, through 4) both theoretical lectures and hands-on courses, from introductory to advanced and from methodological to applied standpoints. This model has so far been implemented in three editions of the Escola Gaúcha de Bioinformática (EGB), or "Southernmost Brazilian School of Bioinformatics" in a free translation. Similar models have been applied with variations in Brazil since the 1980s in multiple fields, such as physics, chemistry, molecular simulation, and others, and have had a generational impact on the training of highly qualified researchers.

EGB: ESCOLA GAÚCHA DE BIOINFORMÁTICA
Background
Different strategies to consolidate bioinformatics as part of the main curriculum of different courses, especially in the biological sciences, have been proposed through the years. These include the definition of its core competencies (Welch et al., 2014), its defining elements (Tapprich et al., 2021), the inspection of successful teaching cases in the US and United Kingdom (Hack and Kendall, 2005), the need to go beyond traditional classroom short courses (Atwood et al., 2019), and the analysis of a dedicated learning module of bioinformatics for biology students (Madlung, 2018). The frequently mentioned transdisciplinary nature of the field, however, remains observational rather than practical in most of these propositions, with disciplines borrowing techniques from each other, but without forming a single, major area of knowledge. Thus, in order to show explicitly the combination of technical fields that make up this fast-moving field, we decided that, during the EGB, transdisciplinarity should be presented as a living experience, bringing together participants from the multiple backgrounds that the definition of bioinformatics encompasses.
The quick development of bioinformatics is also often spatially uneven, with some regions developing faster than others, and with each group often specializing in a very limited scope of the field that is of temporary interest for its research, neglecting other possibilities that might, if considered, improve its capacities. For example, as a research field, bioinformatics has been concentrated in the Southeast Region of Brazil, especially in the states of São Paulo and Minas Gerais (Bicudo, 2016). Even with government initiatives to spread research facilities all over the country (from 2007 onwards), such as the BioMicro nucleation effort (CAPES, 2008) and a major Computational Biology financial aid call (CAPES, 2013), other regions remained underrepresented (Bicudo, 2016). This also results in numerous groups working in specialized subfields, with little knowledge of other areas, which further hampers the development of comprehensive bioinformatics curricula. By taking advantage of the budding bioinformatics groups in the South Region of Brazil, particularly those in the southernmost state of Rio Grande do Sul, and connecting them with professionals from all over Brazil and Latin America, EGB presents itself as a networking environment for the supervised development of the abilities required to navigate the field. It is noteworthy that AB3C (Brazilian Association for Bioinformatics and Computational Biology), via its annual X-meeting event, provides training and networking opportunities. Likewise, the Brazilian ISCB Regional Student Group has been connecting bioinformatics students regionally. Nonetheless, these initiatives target people already working with bioinformatics rather than beginners and inexperienced users. The EGB works both as a scientific event (such as a symposium) and as a teaching-learning environment (like a "school").
This model has been successfully applied to more specific study areas in Brazil, for instance in the Brazilian School of Electronic Structure (since 1987) (dos Santos et al., 2003; EBEE, 2018) and in the School for Molecular Modeling in Biological Systems (since 2002) (EMMSB, 2021). Participation can include research paper submission and poster presentation, but these are not mandatory activities. The integration of all participants in an experience-sharing environment with provocative ideas is the main goal of EGB. A regular day at the "school" includes three to four lectures in the morning, followed by a lunch break, and flash courses in the afternoon, with both morning and afternoon periods having coffee break intermissions. These intermissions are particularly relevant for allowing participants to mingle in a friendly, casual environment. The school is designed to last 5 days (Monday to Friday), taking place during the Brazilian college winter break (in July). From a practical standpoint, the school requires a large auditorium for lectures, open halls for intermission activities, and small individual computer labs for flash courses. Participants have the option to validate their participation as a credit in partner undergraduate and graduate programs. Thus far, three editions of EGB have been organized (in 2015, 2017, and 2019), taking place at the Informatics Institute of the Federal University of Rio Grande do Sul, in Porto Alegre, Brazil. Supporting lab spaces were also provided by the Center of Biotechnology and the Institute of Biosciences at the same university. Financial management and enrollment assistance were provided by the Brazilian Genetics Society (SBG).
Conceptually, the transdisciplinary character of bioinformatics is approached by simultaneously exposing the students to lectures from introductory to advanced graduate levels, and from methodological to applied standpoints. Particular care is taken to include, in the same morning, lecturers from multiple fields in bioinformatics, such as those working with sequence- or structure-based approaches, from biological to computer science backgrounds. Consequently, for example, an undergraduate biology student will become familiar with coding while a computer science graduate student will be introduced to protein biochemistry and molecular biology. During the intermissions, a more experienced student can help a newcomer to a particular area of bioinformatics to better understand previously discussed concepts, while the lecturer can either go deeper into methodological details or explain some aspect at an even more basic level for participants. Finally, in the afternoon, the aspects of bioinformatics discussed in the morning lectures are consolidated in hands-on courses in computer labs, also from basic to more advanced levels. While it may be particularly challenging for some students to have their first contact with advanced coding, protein structure modeling, or genome annotation using this approach, it is simultaneously a highly contextualized introduction to such a field, supported by the student's familiarity with some of the other fields discussed during the event. In addition, these interactions have the benefit of encouraging and fostering inter-group cooperation.

Participant Demographics
The target audience for EGB was conceived to be wide, from undergraduate to graduate students and professionals. Accordingly, since its first edition in 2015, similar numbers of undergraduate and graduate students have participated in the event (Figure 1), alongside a steady increase in interest in the school, at an approximate 25% rate per edition, ranging from 174 participants in 2015 to 251 participants in 2019. We have observed a slight increase in graduate students as event editions progressed (Figures 1A,C). The same tendency was observed for the interest in focused flash courses (Figure 1A). More data are required to interpret these trends, but we hypothesize that graduate courses (especially those in the biological areas) are not providing their students with the bioinformatics training they need or aspire to. Since it is not directly enforced, the varied audience emerges as a reflection of the public interest in the subject. Admissions are processed on a first-come, first-served basis, with the only selection criterion in place being the preference for new participants (over returning ones). So far, no exclusion has occurred in the admissions process.
Around 90% of the participants were Brazilians, with three quarters of them being from the state of Rio Grande do Sul. These participants listed traditional university cities as their origin, with a concentration in the capital city of Porto Alegre and its surroundings. Non-Brazilian participants were Latin American, coming from Argentina, Bolivia, Chile, Colombia, Mexico, Peru, and Uruguay (Figure 2). Direct involvement of international participants remains dependent on additional funding for travel expenses, something that is yet to be secured. The home institutions of the participants (universities, research centers, etc.) were also listed, forming a total of 42 different affiliations. At the end of each event, all participants were invited to provide anonymous evaluations on multiple aspects of the school. These qualitative data have been used to adjust and improve subsequent editions of EGB. These observations are reported in Section 3. Participant demographic questionnaire and obtained results are

Lectures and Flash Courses
Following the general concept of "mixing and matching", the lectures were planned to expand the participants' perspectives on the field. Accordingly, each morning, specialists gave talks to the students covering multiple aspects of bioinformatics, under three main standpoints: sequence- or structure-based methods, biological applications and questions, and computational approaches to solve them. The main goal here is for the students to become familiar with multiple perspectives of bioinformatics, which are reinforced during the event. Thus, a lecture cycle could begin with observations on ways bioinformatics assisted zoologists to solve how tigers got their stripes (based, for example, on genomic data), followed by drug discovery against a specific pharmacological target (supported by docking, for instance), and ending with examples of computer clusters being used for crop enhancement via synthetic biology. Themes covered by lectures included (to cite a few) agricultural enhancements via genetic manipulation, big data in health and ecology, chemoinformatics, evolution, forensic applications of bioinformatics, massively parallel processing for bioapplications, molecular docking, multiscale molecular modeling and simulation, protein engineering, and vaccinology. Themes were changed at each event, and a theme given at a basic level in one edition could be offered at an intermediate level in another. Such an approach accommodates students from multiple backgrounds and stimulates participants to return for a new edition of the event, thereby supporting continuous learning in the field. A similar approach was used for the flash courses, which must include some level of hands-on activity, from physical modeling of protein structures to viral genome assembly.
The main themes covered by flash courses included introductory biochemistry, genome assembly and annotation, molecular dynamics, Python programming, structural biology, systems biology, and transcriptome data processing. All flash courses allot time in their initial session for familiarization with computer use and (when needed) with non-graphical interfaces. A ratio of one assistant for every five flash course participants is recommended. One day of the event was reserved for lectures from researchers in industry, emphasizing commercial applications and opportunities for the students. These entrepreneurs also reserved part of the day to mix with the participants. Course instructors and lecturers comprised Brazilian and Latin American experts from various fields pertaining to bioinformatics. These included professors, researchers, product managers, and industry representatives. Each lecture had an expected length of 50 minutes, followed by around 10 minutes of questions, while flash courses lasted

OBSERVATIONS AND PERSPECTIVES
The qualitative evaluation of the event by the participants was highly positive, with courses and lectures being considered "excellent" or "good" by more than 90% of the participants. Likewise, the scientific program and the venue were deemed excellent, and about three quarters of the participants expressed interest in participating in future editions of EGB. This interest is confirmed by the number of returning participants at each new edition (Figure 1B). The open-ended questionnaire brought up some aspects that have helped shape a more inclusive program in bioinformatics. Such aspects encompass the requirement for more physically accessible auditoria for mobility-challenged individuals, the option of sign language interpreters, and a commitment to a gender-balanced lineup of lecturers and instructors. Considering the need for training in bioinformatics and the observed success of the EGB strategy, which directly impacts around 200 people per edition, plans were underway for expanding its geographical coverage. Initial arrangements were in progress for providing simultaneous instances of the event in different regions/countries, with live webcasting of lectures and local offerings of flash courses. Nonetheless, with the prolongation of the COVID-19 pandemic and its ongoing effects (Zoumpourlis et al., 2020), these plans were halted. Because social mingling, which connects people of different backgrounds who share an interest in bioinformatics, has been considered the highlight of EGB, and because the flash courses additionally require supervised activities, the organizing committee deemed a total conversion of the event to distance activities inadequate.
Although a proper assessment is still lacking, information volunteered by former participants indicates that the interaction opportunities provided by EGB have fostered collaborations among researchers and students from different institutions, allowing multiple groups to join forces on topical research projects. Such collaborations were also stimulated by the bridging of academia and different economic sectors, with students familiarizing themselves with the entrepreneurial environment. Undergraduate students were also able to socialize with graduate students, sparking interest in pursuing further academic career pathways. Still, the enrolled students came mostly from life sciences backgrounds, with computer science students representing a minor portion of the attending public. Even if the social component of the event may not apply to all participants, the knowledge gathered can supplement current curricula by providing up-to-date lectures and high-level discussions. We hope these strategies may aid other endeavors to consolidate bioinformatics as lasting knowledge among students and professionals alike.
In conclusion, the opportunity for interaction between people from biological and computational science backgrounds in a casual environment emerges as the main advantage of the model for bioinformatics training, teaching, and learning presented in the EGB school events.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/Supplementary Material; further inquiries can be directed to the corresponding author.