Data-driven, cross-disciplinary collaboration: lessons learned at the largest academic health center in Latin America during the COVID-19 pandemic

Introduction The COVID-19 pandemic has prompted global research efforts to reduce infection impact, highlighting the potential of cross-disciplinary collaboration to enhance research quality and efficiency. Methods At the FMUSP-HC academic health system, we implemented innovative flow management routines for collecting, organizing and analyzing demographic data, COVID-related data and biological materials from over 4,500 patients with confirmed SARS-CoV-2 infection hospitalized from 2020 to 2022. This strategy was mainly planned in three areas: organizing a database with data from the hospitalizations; setting-up a multidisciplinary taskforce to conduct follow-up assessments after discharge; and organizing a biobank. Additionally, a COVID-19 curated collection was created within the institutional digital library of academic papers to map the research output. Results Over the course of the experience, the possible benefits and challenges of this type of research support approach were identified and discussed, leading to a set of recommended strategies to enhance collaboration within the research institution. Demographic and clinical data from COVID-19 hospitalizations were compiled in a database including adults and a minority of children and adolescents with laboratory confirmed COVID-19, covering 2020–2022, with approximately 350 fields per patient. To date, this database has been used in 16 published studies. Additionally, we assessed 700 adults 6 to 11 months after hospitalization through comprehensive, multidisciplinary in-person evaluations; this database, comprising around 2000 fields per subject, was used in 15 publications. Furthermore, thousands of blood samples collected during the acute phase and follow-up assessments remain stored for future investigations. To date, more than 3,700 aliquots have been used in ongoing research investigating various aspects of COVID-19. 
Lastly, the mapping of the overall research output revealed that between 2020 and 2022 our academic system produced 1,394 scientific articles on COVID-19. Discussion Research is a crucial component of an effective epidemic response, and the preparation process should include a well-defined plan for organizing and sharing resources. The initiatives described in the present paper were successful in our aim to foster large-scale research in our institution. Although a single model may not be appropriate for all contexts, cross-disciplinary collaboration and open data sharing should make health research systems more efficient to generate the best evidence.


Introduction
The COVID-19 pandemic, caused by SARS-CoV-2 infection, triggered an urgent global research effort to mitigate its impact. At the time, it was essential to devise effective strategies to reduce the rate, severity, and economic aftermath of SARS-CoV-2 infection (1–3). This need persists, owing to the recognition that many patients present a post-acute COVID-19 syndrome (PACS), which can appear as an intricate combination of persisting and novel symptoms (2). In recent years, collaboration among healthcare researchers with diverse expertise and access to large-scale patient data has emerged as a critical approach to enhancing research quality and efficiency (4–11). Cross-disciplinary collaboration involves a joint and equal contribution from a broad range of health research experts, crossing disciplinary boundaries to work collaboratively (12–16). This emerging view of science can and should also be employed for PACS, following the efforts during the acute phase of the pandemic.
In response to the COVID-19 outbreak in Brazil (one of the most affected countries), the Hospital das Clínicas & Faculdade de Medicina da Universidade de São Paulo (HC-FMUSP), the largest academic health system in Latin America, established a crisis committee in January 2020. Over the following 2 years, the HC-FMUSP complex admitted over 9,000 patients with suspected SARS-CoV-2 infection, mostly moderate and severe cases. To cope with the surge in demand for COVID-19 care during the first wave of the pandemic (from March through August 2020), the crisis committee converted the Central Institute, one of the eight HC-FMUSP institutes, into a specialized COVID-19 inpatient facility, with a total of 900 beds (including 300 intensive care beds) (17, 18).
At the onset of the pandemic, several research groups initiated clinical studies on COVID-19 and explored various preventive strategies for the disease (19). In May 2020, HC-FMUSP set up an emergency institutional taskforce to support research infrastructure and logistics for those studies, which had until then been conducted with little connection or collaboration among teams. This taskforce implemented a set of institutional cross-disciplinary research initiatives to study and provide solutions for COVID-19, with the purpose of fostering scientific collaborations among groups affiliated with HC-FMUSP. This enterprise was designed to reach far beyond co-authorship and, instead, involved joint institutional efforts across disciplines with a focus on cooperation, equity, and transparency (20–23). This paper aims to describe, in detail, the successful implementation of such initiatives (including flow management routines to capture, organize, share and analyze large amounts of data), and to outline the challenges and barriers identified over the course of this unprecedented experience in the country.

Context
This paper examines the benefits and challenges of an institutional research management initiative implemented to facilitate large-scale, cross-disciplinary scientific collaborations during the COVID-19 pandemic. In response to the urgent need for knowledge about the disease and the resource constraints faced during the pandemic, all the actions described below were designed and implemented simultaneously, rendering this a particularly challenging and complex endeavor.

Overall strategy and governance
Research management strategies were planned in three main areas: organization of a large database consisting of clinical data from hospitalized COVID-19 patients; setting up of a multidisciplinary taskforce to conduct follow-up assessments of these patients; and organization of a biobank of blood samples collected during both inpatient stay and follow-up assessments.
A COVID-19 Steering Committee was established, comprising institutional leaders with expertise in scientific management and representatives from the COVID-19 crisis committee. This committee had several key responsibilities, including mapping, monitoring, and supporting research groups utilizing data from the institutional databases. The committee also proposed strategies to encourage collaborative publications and approved applications from HC-FMUSP researchers seeking access to the databases. To ensure fair decision-making, the vice-chair of FMUSP served as an adjunct member of the Steering Committee, responsible for reviewing and adjudicating appeals from dissatisfied applicants regarding committee decisions.
To ensure effective management of the COVID-19 data organization initiatives, specific teams were created for each of the three institutional fronts. Additionally, a small team was responsible for the overall day-to-day management of these fronts. This group, led by a university full professor who was also a member of the Steering Committee, facilitated communication and collaboration, acting as a catalyst for the exchange of relevant information and intelligence related to COVID-19 research within the institution (Figure 1). With support from the FMUSP Library, this direct management team prepared an institutional data management plan for the various initiatives, which was validated by the COVID-19 Steering Committee and approved by the HC-FMUSP board of directors. This document outlined the criteria for granting access to institutional data and biological material, as well as periods of retention prior to open data sharing.
Regarding the initial financing for the research initiatives outlined in this report, the HC-FMUSP superintendence rapidly provided seed funds generated from a crowdfunding campaign launched during the pandemic 1 (see details of funding allocation in Supplementary materials).

Institutional database of hospitalizations due to COVID-19
The dedicated inpatient facility for COVID-19 patients was operational at the Central Institute of HC-FMUSP until September 2020, coinciding with the abatement of the first wave of COVID-19 cases in São Paulo. From then onwards, inpatient admissions due to acute COVID-19 during the subsequent waves of the pandemic continued to take place at HC-FMUSP, allocated to different units of the hospital complex.
In May 2020, HC-FMUSP initiated the development of the institutional database focused on hospitalization data. This database included information from consecutive patients admitted for at least 24 h as inpatients due to suspected SARS-CoV-2 infection. The HC-FMUSP Information Technology (IT) Center extracted data from structured fields within electronic health records (EHR) and populated the database. These records followed a specific case report form designed for COVID-19 within the HC-FMUSP EHR system, facilitating the collection of pertinent information during hospital admissions. The basic set of variables was defined by a panel of experts in clinical emergencies, intensive care and infectious diseases, combined with the case report form proposed by the World Health Organization to globally standardize COVID-19 records (24). Data regarding vital signs, laboratory and radiology tests, and drug prescriptions were also extracted by the IT Center, assisted by physicians who determined where the most accurate clinical information was available within the EHR. A team of data science specialists was hired to organize all data into a set of variables usable for research, to further mine the EHR, and to organize all the processes involved in the construction of the institutional databases (including cleaning, structuring, and reconciliation; Figure 2). The database was stored on a Research Electronic Data Capture (REDCap) system (25) hosted on HC-FMUSP servers. This database was further expanded with two additional sources: 1. contributions from research groups who agreed to share the data that they had already been collecting for their own studies with inpatients; and 2. information manually extracted from unstructured fields of the EHR by a taskforce of young researchers supervised by experienced scientists, in order to fill in missing data for selected variables.
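As a purely illustrative aid, the reconciliation of the three data sources described above (structured EHR extraction, data shared by research groups, and manual review of unstructured fields) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the procedure actually used at HC-FMUSP: the source names, the priority order, the field names and the example values are all hypothetical.

```python
from typing import Optional

# Hypothetical priority order (an assumption, not taken from the paper):
# structured EHR extraction first, then data shared by research groups,
# then values recovered by manual review of unstructured EHR fields.
SOURCE_PRIORITY = ("ehr_structured", "shared_by_group", "manual_review")


def reconcile(records: dict[str, dict[str, Optional[str]]]) -> dict[str, Optional[str]]:
    """Merge one patient's values field by field.

    For each field, keep the first non-missing value found when walking
    the sources in priority order; the field stays None if every source
    is missing it.
    """
    fields = {name for source in records.values() for name in source}
    merged: dict[str, Optional[str]] = {}
    for name in sorted(fields):
        merged[name] = None
        for source in SOURCE_PRIORITY:
            value = records.get(source, {}).get(name)
            if value not in (None, ""):  # treat None and "" as missing
                merged[name] = value
                break
    return merged


# Invented example values, for illustration only.
patient = {
    "ehr_structured": {"age": "63", "smoking_status": "", "admission_date": "2020-04-02"},
    "shared_by_group": {"smoking_status": None, "bmi": "31.2"},
    "manual_review": {"smoking_status": "former"},
}
merged = reconcile(patient)
```

In this sketch, the manually extracted value is used only where the higher-priority sources leave a field empty, mirroring the stated purpose of the manual extraction (filling in missing data for selected variables).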
Gradually, this hospitalization database was integrated with information from two other institutional initiatives, which are described below (i.e., the multidisciplinary follow-up assessment of COVID-19 patients after hospital discharge, and the COVID-19 biobank).
Direct access to data from the hospitalization database was provided solely by the direct management team, after swift authorization by the COVID-19 Steering Committee using objective and previously advertised criteria. This ensured objectivity and transparency in the process of granting access to the data.

Multidisciplinary follow-up assessments after discharge
The multidisciplinary follow-up assessment program of COVID-19 patients was carried out from October 2020 to April 2021, as detailed elsewhere (26). All surviving adult (≥18 years) patients who had been admitted to HC-FMUSP between March and August 2020 due to COVID-19 were consecutively invited for a follow-up visit scheduled to take place around 6 months after their hospitalization. Comorbid conditions prior to COVID-19 were identified using the hospitalization database described in the previous sub-section, and patients with a previous diagnosis of dementia or end-stage cancer were excluded. Additional exclusion criteria were pregnancy or the postpartum period, residence in a nursing home or long-term care facility, and insufficient physical mobility to leave home.
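The inclusion and exclusion criteria listed above can be expressed as a simple eligibility check. The following Python sketch is illustrative only: the `Patient` fields, the exact day boundaries chosen for "between March and August 2020", and the example patients are assumptions made for the example, not details reported for the cohort.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Patient:
    # Hypothetical field names; the real database variables may differ.
    age: int
    admission: date
    survived: bool
    dementia: bool = False
    end_stage_cancer: bool = False
    pregnant_or_postpartum: bool = False
    in_long_term_care: bool = False
    can_leave_home: bool = True


# Assumed day-level bounds for "between March and August 2020".
ADMISSION_WINDOW = (date(2020, 3, 1), date(2020, 8, 31))


def eligible(p: Patient) -> bool:
    """Return True if the patient meets the follow-up invitation criteria."""
    start, end = ADMISSION_WINDOW
    if not p.survived or p.age < 18:
        return False
    if not (start <= p.admission <= end):
        return False
    excluded = (
        p.dementia
        or p.end_stage_cancer
        or p.pregnant_or_postpartum
        or p.in_long_term_care
        or not p.can_leave_home
    )
    return not excluded


# Invented example patients, for illustration only.
invited = eligible(Patient(age=57, admission=date(2020, 5, 10), survived=True))
not_invited = eligible(
    Patient(age=57, admission=date(2020, 5, 10), survived=True, dementia=True)
)
```

Encoding the criteria in a single function makes each exclusion explicit and easy to audit against the published protocol (26).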
The plans for this follow-up initiative were extensively advertised throughout HC-FMUSP, and all interested research groups were invited to participate. To ensure cooperation and minimize inconvenience for the patients, groups were required to collect data in an integrated and coordinated manner. To optimize participants' time during the in-person visit, participants of the follow-up cohort were initially evaluated remotely. All interviews and protocols that could be administered remotely were completed during this telehealth consultation, taking advantage of the infrastructure and training of health care professionals that were implemented for innovative tele-ICU practices during the pandemic at HC-FMUSP (27–29). Most in-person assessments were concentrated on a single day, approximately 1 week after the remote evaluation, optimizing the use of institutional resources, maximizing multidisciplinary interchange of experiences, and fostering a comprehensive outlook on the health needs of the subjects who underwent those follow-up assessments. Participants with a history of ICU admission and diagnosed with lung damage during hospitalization, according to predefined criteria (30, 31), were invited for a second visit to undergo specific tests (plethysmography, cardiac stress test and chest computed tomography). To facilitate interactions between research groups and to avoid the physical circulation of subjects and their relatives, all in-person evaluations (except radiological exams) were conducted at one single hospital sector. Two separate facilities were used: a temporary outpatient center for patients without a history of ICU admission during in-hospital stay and the clinical research center at the Instituto do Coração at HC-FMUSP for patients who had been admitted to an ICU during acute COVID-19 (26).
The multidisciplinary follow-up assessment also included the evaluation of hospitalized pediatric COVID-19 patients (<18 years), specifically focusing on multisystem inflammatory syndrome in children (MIS-C) (32). To facilitate this assessment, a dedicated outpatient clinic was established at HC-FMUSP's Children's Institute, where patients were scheduled for visits every 6 months. The prospective studies conducted on children and adolescents who had COVID-19 encompassed various areas, such as linear and pubertal development, dietary habits, mental health, innate immunity errors, autoimmune conditions, metabolomics, gut microbiota, genetic determinants, bone mineral density, and home-based exercise training (33).
Collected data were stored on the REDCap system hosted at HC-FMUSP, fully integrated with the hospitalization database. Access to those data was provided solely by the direct management team, with swift authorization. The variables that could be accessed by each participating team and the principles for the collaborative sharing of information were agreed between those groups. Information on periods of retention for the broader sharing of those data was included in the institutional data management plan.

COVID-19 biobank
Our institution's COVID-related activities included a pioneering effort to collect and store large amounts of biological material from hospitalized COVID-19 patients for both short-term and future scientific studies. This initiative utilized an existing biobank at the Tropical Medicine Institute of HC-FMUSP, which had prior approval from the Brazilian Council of Ethics in Research to incorporate residual biological material from diagnostic samples collected during routine clinical procedures at HC-FMUSP, with explicit patient consent. A dedicated COVID-19 branch of the biobank was established at the Central Laboratory of HC-FMUSP, allowing for the systematic processing and storage of leftover blood samples collected from hospitalized COVID-19 patients starting in June 2020.

Creation of a COVID-19 curated collection within the institutional digital library of academic papers at the FMUSP-HC system
Using DSpace software and in line with the institution's strategic needs, the FMUSP-HC Library developed a COVID-19 curated collection within the Intellectual Production Observatory of the FMUSP-HC academic system (OPI). 2 OPI is an institutional digital library of academic papers created in 2014 to facilitate the mapping, monitoring and analysis of quantitative metrics related to the research output of FMUSP-HC groups.

Ethical approval, consent and data security aspects
The implementation of all actions described in this paper strictly followed ethical and data security principles, adhering to standards of consent, privacy, confidentiality, and data protection. All research protocols included in the initiatives described herein received ethical approval. The multidisciplinary follow-up cohort integrates the results of several research projects led by health specialist teams within HC-FMUSP. All projects were approved by HC-FMUSP's institutional review board (CAPPesq, Comissão de Ética para Análise de Projetos de Pesquisa; approval numbers 4.270.242, 4.502.334, 4.524.031, 4.302.745 and 4.391.560). Participants provided signed informed consent.
In 2020, volunteer medical students worked to obtain informed consent for the COVID-19 biobank from hospitalized individuals and their relatives during the inpatient stay. These efforts were continued through subsequent telephone and face-to-face contacts during the follow-up program.
To ensure data security and confidentiality, the REDCap system hosted at HC-FMUSP complies with the U.S. Health Insurance Portability and Accountability Act (HIPAA) and the Brazilian General Personal Data Protection Act (in Portuguese, LGPD). Researchers accessing data and samples are required to sign agreements acknowledging their ethical and legal responsibilities and ensuring strict confidentiality of participants' data.

Institutional database of hospitalizations due to COVID-19
Data from COVID-19 hospitalizations were consistently extracted and compiled in the research database from all disease waves, through June 2022. By that date, the institutional research database contained hospitalization data from more than 4,500 adults with a laboratory-confirmed diagnosis of COVID-19, including cases from March 2020 to June 2022 (see Table 1; Supplementary Table S1), with approximately 350 fields for each patient (see Supplementary Table S2). The pediatric database, including hospitalization data from more than 150 children and adolescents admitted to HC-FMUSP due to COVID-19, was organized by research groups based at the specialized HC-FMUSP's Children's Institute. The collaborative efforts of several teams at HC-FMUSP were crucial to maximize the quality of the data compiled in the hospitalization database above. The teams from the HC-FMUSP Infectious Diseases section and the HC-FMUSP Central Laboratory were responsible for developing and overseeing the application of the criteria for laboratory-based diagnosis of COVID-19 (34, 35). The Pulmonology and Radiology teams worked to validate and apply the criteria for radiological diagnosis of COVID-19, mainly based on lung computed tomography (CT) findings. The Infectious Diseases team, together with the Pulmonology and Radiology groups, devised the clinical criteria for highly suspect cases of COVID-19 (36). Using the defined criteria, the specialized Epidemiological Surveillance team at HC-FMUSP validated the inclusion of cases in the institutional research database while excluding patients with nosocomial COVID-19 infections.
Finally, the expertise of two HC-FMUSP groups involved in environmental research allowed the generation of neighborhood variables based on each patient's zip code of residence. These variables included factors such as air pollution levels and exposure to green areas, which were incorporated into the research database to help explore potential environmental risk factors associated with post-COVID-19 syndrome (37).
Upon completion of the database and case validation, the possibility of accessing the institutional database above was widely advertised in successive calls open to HC-FMUSP-based research groups. Thus far, the database has been used in 17 published studies, attracting several research groups (36, 38–53) (see Supplementary Table S3). The hospitalization database is currently being used for a few additional analyses, and it will continue to be accessible for new studies proposed in the near future. This unique database also provides the means for assessing the long-term outcomes of patients, as it offers a wealth of baseline data on different clinical parameters, allowing the continuous longitudinal follow-up of patients.

Multidisciplinary follow-up assessments of COVID-19 patients after hospital discharge
More than 20 HC-FMUSP-based research groups from different disciplines agreed to join the multidisciplinary follow-up assessment program, bringing human and operational resources to make the collection of comprehensive data from hundreds of patients feasible over a few months without the need for large external financial resources.
This collaborative experience paved the way for an ambitious multidisciplinary grant proposal submitted to the state-run São Paulo Research Foundation (FAPESP), to fund two additional waves of follow-up assessments of the same cohort (three and four years after hospital admission), combining research proposals from the different groups involved thus far. This proposal, worth approximately 1.8 million dollars, was approved by FAPESP at the beginning of 2023, acknowledging the potential impact of the proposed research on advancing the understanding, prevention, and treatment of COVID-19.

COVID-19 biobank
By October 2020, blood serum and plasma samples had been collected, processed, and stored at -80 °C from over 2,000 patients hospitalized due to COVID-19 at HC-FMUSP. Additional vials of serum were stored for more than 700 patients who agreed to attend the follow-up visit. In total, the biobank contains over 45,000 blood vials (each containing approximately 0.5 mL). To date, over 3,700 aliquots have been dispensed for 10 different ongoing research initiatives investigating pathophysiological aspects of COVID-19, relating distinct acute symptoms and sequelae of the disease to a wide range of biomarkers of inflammation, neurodegeneration, intestinal permeability, peptidomics and metabolomics, among others (see Supplementary Table S5). There was considerable variability in how these biomarker studies were conceived. One was a top-down initiative led by the COVID-19 Steering Committee to assess a large panel of inflammatory markers, whereby all research groups with expertise and interest in the field at HC-FMUSP were contacted and encouraged to work collaboratively, planning and conducting analyses on hundreds of samples and sharing the costs of consumables.
To date, two collaborative papers from the above studies have been published (65, 66).

COVID-19 curated collection within the institutional digital library of academic papers
By the end of 2022, the FMUSP-HC System's researchers had authored 1,394 papers on COVID-19 published in high-impact journals, encompassing original articles, case reports, technical notes, reviews, commentaries and editorials. Of these 1,394 papers, at least 90 were original studies containing patient data. A total of 48% of the articles housed in the institutional digital library of academic papers are available through open access. FMUSP-HC has actively encouraged researchers to submit their work for publication in open access journals, reflecting a strategic alignment with the broader movement toward open science.

Implementation challenges
The COVID-19 Steering Committee and the research management team faced several implementation challenges during the course of their work at HC-FMUSP. These challenges, their possible causes and the ways in which they were dealt with are discussed in the sub-items below and summarized in Table 2.

Dissemination of information and questions regarding leadership
Large academic health system complexes like HC-FMUSP often face communication challenges, hindering the dissemination of information about institutional initiatives (22, 67). During the pandemic, our research management teams encountered difficulties in reaching all potentially interested research groups. Additionally, the shift toward a more collaborative research approach was met with hesitation by the HC-FMUSP scientific community, leading to concerns about conflicts over leadership and data ownership.
To address these challenges, we appointed experienced and respected HC-FMUSP researchers to lead different components of the institutional initiatives. We maintained frequent communication with these leaders to make strategic decisions and ensure consistent implementation of actions with transparency and cooperation. Several channels of communication were used to increase overall institutional awareness about the principles of inclusion, transparency, and cooperation of the initiatives, through small-group discussions, sharing of presentations, and sending of memos and progress reports. This process was time-consuming but rewarding, since approximately 23% of the total number of research groups of HC-FMUSP (50 out of 220) eventually agreed to participate in the collaborative initiatives.
Over time, open communication appeared to reduce the number of conflicts regarding leadership, data ownership, and data sharing that had initially arisen.
Table note: Participants who consented to remote assessments but declined the invitation to attend in-person appointments due to health concerns. Despite the implementation of measures to mitigate exposure risk during follow-up evaluations, there was still a level of heightened health apprehension, since the multidisciplinary follow-up assessments occurred between October 2020 and January 2021, when the pandemic was still at its peak in São Paulo.
Despite our efforts to improve communication and promote collaboration among research groups at HC-FMUSP, some expressed dissatisfaction with their level of inclusion and access to information.
To address this, we launched a website for the institutional initiatives. 3 While this initiative was implemented relatively late in the process (September 2022), we hope that it will serve as a valuable resource for all interested parties, providing updates on progress, relevant publications, and opportunities for involvement in ongoing research activities.

Questions about credit and authorship
A second challenge we faced was how to give credit to the healthcare professionals and management teams of HC-FMUSP whose contributions were essential in creating the institutional databases discussed in this paper. A corporate coauthor, the HC-FMUSP COVID-19 Study Group, was created to acknowledge their contributions and grant them opportunities to have some degree of intellectual involvement in research activities. The members of this group, consisting of 31 professionals, were selected based on objective criteria. We set a rule whereby those individuals would be invited to contribute intellectually to (and approve the final version of) all original scientific publications based on data collected from large numbers (>800) of patients from the institutional databases. The goal was to acknowledge individuals who did not take part in the conception and writing of the first draft of articles, but who helped significantly in the construction of the databases. The use of this corporate coauthor gave its members the opportunity to contribute intellectually to the interpretation of the results and finalization of several manuscripts.
The contributions of other key professionals from the crisis committee who had no familiarity with or interest in research on clinical aspects of COVID-19 were acknowledged by the creation of a second corporate name (HC-FMUSP COVID-19 task force), cited in the Acknowledgements sections of manuscripts, listing the individuals involved in overall infrastructure and logistics during the pandemic and their key contributions. Some of those individuals were also invited to contribute intellectually to (and therefore were included as individual authors in) a few specific manuscripts evaluating non-clinical aspects of COVID-19, e.g., costs of care (44).
An associated challenge that emerged concerned the risk of professors involved in the management of the initiatives being offered honorary co-authorship in any papers that utilized the institutional databases, simply based on their administrative leadership. This situation was deemed unacceptable as per the guidelines recommended by the International Committee of Medical Journal Editors (68), and also went against the principle of true cooperativeness that our institution aimed to bring to the initiatives. Therefore, the professors whose leadership was restricted to research administration of the initiatives were neither included as members of the corporate coauthor cited above (HC-FMUSP COVID-19 Study Group) nor individually named in the list of authors of papers (except in situations in which those leaders did play an intellectual role in the planning of investigations and analyses, interpretation of results and writing-up of manuscripts).

Lack of familiarity with and/or difficulties understanding the databases' structure
A third challenge was that some research groups were unfamiliar with, or had difficulty understanding, the databases' structure, the clinical case definitions based on key variables for symptom-based, laboratory and radiological diagnoses (34, 35), and the structured flow for inclusion and exclusion of cases (36). To address this, the research management team (Figure 1) played a constant role in overseeing and assisting research teams. Through repeated interactions, the management team developed domain expertise, gaining a deep understanding of the database structure and variables, and ultimately helping researchers select relevant data fields and understand how the data were coded within the database. This domain expertise facilitated the application of data to healthcare problems and research questions (10, 69), preventing duplication of analyses, interpretation errors, and discrepancies in reported data across papers.

Choice of instruments and examinations to be included in the multidisciplinary follow-up assessment battery
A fourth challenge involved negotiations among research groups with different interests to determine the scope of the multidisciplinary follow-up assessments. The goal was to ensure a comprehensive assessment while avoiding an undue burden on the participants. Compromises were made to prevent overlap and reach consensus on assessment instruments. The leadership (Figure 1) mediated these negotiations, aiming to democratize access to the program for diverse research groups, and most research groups demonstrated flexibility and a willingness to compromise.

Participants' consent
A particularly relevant challenge faced over the course of the institutional initiatives described herein concerned patient consent. Our overall approach was carefully planned to avoid privacy breaches, re-identification, and misuse of data extracted from medical files. The Good Clinical Practice (GCP) guidelines (70) were followed to guarantee ethical and scientific quality standards in the conduct of the studies.
Obtaining consent from patients or family members for the COVID-19 biobank was a complex and time-consuming process due to the conditions of hospital strain and strict isolation needs. The follow-up visits conducted months after hospitalization were highly valuable to retrospectively obtain permission for use of leftover blood from diagnostic tests stored during hospitalization. The use of blood samples from surviving patients for whom we had not obtained consent was strictly forbidden. For deceased patients, the prospect of posthumously using their biological samples for research was discussed with the next-of-kin whenever possible, in order to obtain consent. Nonetheless, during the unusual and hectic circumstances of the COVID-19 pandemic, identifying and successfully contacting deceased patients' relatives was often not realistic. Considering the invaluable importance of medical research during the pandemic, and to prevent further loss of human life, the Brazilian National Research Ethics Committee granted permission for de-identified healthcare data and biological materials from patients deceased due to COVID-19 to be used by research groups in their ethically approved research projects, even without patient consent.

Difficulties in the dispensing of blood samples from the biobank
A final challenge was the imbalance between the large amount of stored blood samples and the slow rate at which this biological material was dispensed for use in research studies. The research management teams were unprepared for this initiative (given its novelty) and burdened with work on the two other fronts of data organization, causing delays in advertising the biobank and providing samples to approved studies. This frustrated research groups at HC-FMUSP eager to use the material promptly. Some interested groups also faced financial constraints for sample processing and analysis. However, a grant recently approved by FAPESP has secured funds for biomarker investigations that will draw on the stored samples. Furthermore, we have expanded collaboration opportunities with external research groups (both from Brazil and abroad), offering access to samples and clinical data for joint investigations. Research collaboration agreements are currently under preparation, whereby we will share both blood samples and clinical data for additional investigations in collaboration with those groups. These measures will help to increase the pace of sample dispensing and facilitate research using the valuable resources of the biobank.

Recommended strategies to enhance collaboration within research institutions
Based on the lessons learned from the actions described in this paper, combined with previous literature (10, 16, 22, 69, 71–73), we present in Table 3 a set of recommendations for strategies aimed at enhancing collaboration within research institutions. The rationale behind these recommendations is to foster collaborations, complementing rather than replacing traditional research. All of the strategies summarized in Table 3 were fully or partially implemented in our institutional approach.

Discussion
Based on the relatively large size of the databases that the HC-FMUSP teams were able to compile, the institutional initiatives described in the present paper may be judged as successful in their aim to foster productive, large-scale research. These initiatives captured demographic and clinical data from thousands of COVID-19 cases treated in a densely urbanized region of a low- and middle-income country (LMIC), organized in interconnected REDCap databases, available for investigations by over 30 research groups so far. Additionally, follow-up data from hundreds of COVID-19 patients, assessed 6 to 11 months after hospitalization through comprehensive in-person evaluations, have been used by more than 20 research groups. Finally, thousands of blood samples collected during the acute phase and follow-up assessments remain stored for future investigations. Most papers published within this initiative were interdisciplinary, with an unprecedented level of interaction between internal groups that had not previously worked together. To our knowledge, this was the first large-scale collaborative experience of this kind inside an academic hospital complex in Brazil.
Innovative strategies, different from traditional clinical research methods, are necessary to drive advances in the healthcare field and further improve public health (5). Implementing collaborative research management models offers several advantages, including pooling diverse knowledge, enhancing research productivity, cross-disciplinary fertilization, and improved access to expertise, equipment, and funds (7). The extraordinary context of the COVID-19 pandemic confirmed the notion that complex human health problems demand innovative and collaborative solutions combining knowledge from different scientific disciplines (2,3,7). Additionally, the pandemic has emphasized the decisive role of data sharing and open access to scientific publications in expediting scientific advancement. During the global health crisis, journals and publishers responded by unlocking access to their content and by markedly reducing the time required for article publication. Furthermore, there has been a surge in the release of preprints, albeit without formal peer review. While these initiatives have accelerated the pace of scientific communication, they have concurrently evidenced the essential need for rigor in the scientific community. While not a one-size-fits-all solution, large-scale cross-disciplinary research management models, like the one described in this paper, can foster collaboration, reduce inefficiency, and produce high-quality, large-scale research results (20).
While over thirty studies on COVID-19 have been published in peer-reviewed journals using institutional databases (30,31,33,, contributing significant data to the existing literature, there was a considerable delay in their production, with most being accepted for publication in late 2021 or later. This delay can be attributed to various factors, such as research groups being occupied with completing their own studies and others being overwhelmed with healthcare and teaching activities during the pandemic. However, the major cause of the publication delay was the time required for organizing this process in our institution. Additionally, the high submission rate of COVID-19-related manuscripts from different parts of the world to highly-ranked peer-reviewed journals possibly increased competitiveness, resulting in a higher threshold for acceptance of papers by those journals. Nonetheless, we are optimistic about the future of the program as we consolidate the data, establish the biobank, and receive grant support, which will ensure a more robust and sustainable program.
Regarding the multidisciplinary follow-up initiative, the COVID-19 Steering Committee encouraged participating groups to publish interim findings [e.g., (58)]. However, most teams opted to wait until data collection was completed in April 2021. By that time, several observational studies on long COVID had already been published by research groups from China, Europe, and the United States [e.g., (74–77)], and that led to some of our manuscripts being rejected by high-profile journals on the grounds of lack of novelty.
For the blood samples from the COVID-19 biobank, there were difficulties and delays in dispensing aliquots, which may explain why only two studies have been published to date using this biological material. Nevertheless, our collection of biological material is still regarded as highly valuable, as it was obtained from a large sample of unvaccinated COVID-19 patients for whom we also have comprehensive data on both the acute disease and the follow-up assessments. This explains the current interest of external research groups, both from Brazil and abroad, in using such databases in further collaborative research studies with HC-FMUSP teams. Moreover, a new wave of biomarker investigations by HC-FMUSP groups is expected to take place thanks to the funds that have been recently secured through the large grant approved by FAPESP.
Establishing a framework for decision-making, task allocation, and accountability helps prevent misunderstandings and ensures that everyone knows their contribution and commitment to the collaboration.

Create a team of professional data analytics and data science experts
To effectively handle large datasets and/or datasets that involve the integration of secondary data (such as data extracted from electronic health records), it is important to create a dedicated team of professional data analytics and data science experts to ensure that data is accurate, consistent, and reliable. The team should develop robust processes to prevent data errors, duplicates, and inconsistencies, resulting in improved data quality and integrity. Additionally, they should streamline data integration, standardization, and harmonization across various systems and departments. The team's responsibilities also encompass managing data throughout its lifecycle, including identifying and mitigating security risks, ensuring data protection and compliance, and providing necessary technical assistance and support.

Efforts to foster large-scale data-driven research require multidisciplinary collaboration, crossing the boundaries of healthcare, with additional teams required with skills spanning statistics, computational systems and data science (6,78,79). Implementation of EHR brings healthcare closer to data science, computational biology, and artificial intelligence (10). In our initiatives, we applied artificial intelligence and contemporary computational methods to analyze hospitalization data through collaborations with computer science groups (30,48,50). Caution is advised regarding such secondary uses of healthcare data from EHR due to potential misinterpretation and concerns about data quality, especially missing or inaccurate data (72). Nevertheless, routine healthcare data, i.e., data generated from routine, standard care of patients, may be a particularly valuable source to inform treatment decisions, because it better represents the real-world uncontrolled conditions faced in clinical practice.
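As a minimal illustration of the kind of automated data-quality check referred to above (detecting duplicated records and quantifying missing data), the following Python sketch may be useful. It is not the pipeline used at HC-FMUSP; the function name quality_report, the field names (record_id, age, crp) and the set of values treated as missing are all assumptions made for this example.

```python
from collections import Counter

# Values treated as "missing" in this sketch (an assumption for illustration,
# not the coding scheme of the institutional databases).
MISSING = {None, "", "NA"}


def quality_report(records, id_field="record_id"):
    """Return duplicated identifiers and the fraction of missing values per field."""
    if not records:
        return {"duplicates": [], "missing_fraction": {}}

    # Count how many times each identifier appears; more than once means a duplicate.
    id_counts = Counter(r.get(id_field) for r in records)
    duplicates = sorted(i for i, n in id_counts.items() if n > 1 and i is not None)

    # Collect every field seen in any record, then compute its share of missing values.
    fields = sorted({key for r in records for key in r})
    missing_fraction = {
        f: sum(1 for r in records if r.get(f) in MISSING) / len(records)
        for f in fields
    }
    return {"duplicates": duplicates, "missing_fraction": missing_fraction}


# Hypothetical example records (invented values, not patient data).
records = [
    {"record_id": "P001", "age": 63, "crp": "12.4"},
    {"record_id": "P002", "age": None, "crp": ""},
    {"record_id": "P001", "age": 63, "crp": "12.4"},
    {"record_id": "P003", "age": 47, "crp": "NA"},
]
report = quality_report(records)
print(report["duplicates"])
print(report["missing_fraction"])
```

In practice, a report of this kind would be only a first screening step; flagged duplicates and fields with high missingness would still need review by staff familiar with how each variable was collected and coded.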
Although large, our COVID-19 hospitalization databases were substantially more modest in size than initiatives conducted in other settings using EHR. While we collected data from thousands of patients during hospital stays and hundreds of follow-up assessments, studies in other countries have included hundreds of thousands or millions of subjects [e.g., (80–86)]. However, our institutional approach combining different sources of data and involving several teams working in collaboration improved the quantity and quality of the health data obtained from each subject. This led to the construction of comprehensive institutional databases from a representative cohort of subjects from a large LMIC city, with information on complex patients with multi-morbidity and polypharmacy who were treated in a real-world setting. These databases include detailed information for subjects from racial-ethnic minorities, socioeconomically disadvantaged, and other underprivileged or discriminated-against populations, who continue to experience a disproportionate share of many acute or chronic diseases and adverse health outcomes (9,87,88). Despite all the limitations and challenges, the implemented collaborative research actions resulted in one of the largest severe COVID-19 cohorts with in-person follow-up multidisciplinary evaluations to date.
As appears to be the norm in most universities (16,22,67), the different research groups at HC-FMUSP distinguish themselves by their varied areas of interest, assumptions, priorities, methods, and research practices. These structural and cultural differences between disciplines may constitute significant barriers to collaborative research, and that was a difficulty faced during the implementation of our institutional collaborative COVID-19 research approach. It is not uncommon for talented, high-performing research leaders to find collaboration unnatural, after years working to set themselves apart and propel their academic careers (89). To date, there is limited research that explicitly examines how to encourage collaboration in settings similar to the HC-FMUSP system (16,22,73). Additional studies are necessary to increase understanding of how to help researchers overcome barriers and lean toward more collaborative science. Institutional initiatives such as the one described herein should be evaluated using qualitative survey methods, in order to investigate the perceptions of members of the research community about the proposed management approach and the challenges faced during its implementation.

Conclusion
Several experts have predicted that we are moving toward an era of research where openly shared data will become the norm (5,10,23,90,91). The results obtained from shared knowledge and discovery diminish the importance of securing intellectual property of healthcare data (without forgoing patient privacy) (90, 91). Consequently, independent research might become less sustainable than collaborative research. Thus, researchers are beginning to prepare for a future when science will be led by those who have the resources and skills to exploit knowledge assets fastest, rather than by those who own them (23). In this context, scientific collaboration provides a highly effective means to produce knowledge by allowing the sharing of skills, expertise and resources (5,15).
Research is a crucial component of an effective epidemic response, and the preparation process should include a well-defined plan for organizing and sharing data. This aspect is just as important as all other elements of the response. Although a single model may not be appropriate for all contexts, cross-disciplinary collaboration should make health research systems more efficient in generating the best evidence (5). The top-down collaborative model implemented at HC-FMUSP during the COVID-19 pandemic aspires to motivate broader use of this kind of institutional approach to enable further scientific developments, helping to transform health care and improve human health. Our current COVID-19 databases may serve as prototypes for the development of additional databases addressing other areas of clinical interest. Such large-scale databases are likely to grow more rapidly, be more complete and be more useful if the three following conditions are met: universal use of automatically-extracted electronic health records; a greater acceptance of cross-disciplinary collaboration; and the cultivation of a culture of more open data sharing.

FIGURE 1 Flowchart of the HC-FMUSP research-managing structure to foster large-scale cross-disciplinary collaborative research studies on COVID-19.

FIGURE 2 Data curation processes involved in the development of HC-FMUSP COVID-19 institutional databases. EHR, electronic health record; SQL, Structured Query Language; CSV, comma-separated values; CRF, case report form.

FIGURE 3 Flowchart of the multidisciplinary follow-up assessment program cohort. a Exclusion criteria: previous diagnosis of dementia or end-stage cancer, pregnant or postpartum patients, patients living in nursing homes or long-term care facilities or insufficient physical mobility to leave home. b

TABLE 2
Challenges identified during the implementation of institutional research initiatives at HC-FMUSP in the COVID-19 pandemic.

Strategies:
- internal communication about the initiatives (one-on-one discussions with research leaders, presentations to groups and internal collegiate, repeated electronic memos to mailing lists, progress reports to participating research groups)
- Building of an open website (https://sites.google.com/view/covid-19-hcfmusp) and frequent communication with) a few experienced and respected HC-FMUSP researchers willing to manage key components of the institutional collaborative research initiatives
- Transparent alignment with those leaders to guide the uniform implementation of actions fostering maximal inclusion of (and cooperation between) potentially interested research groups
- Stressing of the principles of inclusion, transparency and cooperativeness in all communications with research groups
- Access to data from the REDCap databases provided solely by the direct management team, after swift authorization of the COVID-19 Steering Committee using objective and previously advertised criteria

Challenge: How to award fair credit and co-authorship opportunities in publications
Difficulties:
- Frustration of professionals that might not have opportunities to exercise their research interests due to overload of clinical and management work during the pandemic
- Risk of honorary authorship
Strategies:
- Use of a corporate coauthor including members selected according to objective criteria (i.e., professionals who helped significantly in the construction of the hospital databases). Setting of rules whereby individuals from the corporate coauthor were invited to contribute intellectually to (and approve the final version of) original papers that used data from significant numbers (>800) of patients
- For other key professionals from the crisis committee who had no familiarity with or interests in research on clinical aspects of COVID-19, use of a second corporate name cited in the Acknowledgements sections of manuscripts. Some of those individuals were also invited to contribute intellectually to specific papers evaluating non-clinical aspects of COVID-19
- Avoidance of inclusion of coauthors that did not meet the criteria of the International Committee of Medical Journal Editors. Professors whose leadership was restricted to research administration of the initiatives were listed as authors only in specific cases when they did take part in the planning of investigations and analyses, interpretation of results and writing-up of manuscripts

Challenge: How to facilitate the hands-on use of institutional databases by researchers
Difficulties:
- Lack of familiarity of the research groups with the databases' structure
Strategies:
- Strengthening of the role of the direct research managing team shown in Figure 1 in the overseeing of the research teams that worked on analyses using the databases. The management team helped researchers to select data fields relevant to their study goals and to understand how those elements were coded in the databases, as well as working to prevent unnecessary duplication of analyses by different groups, errors in the interpretation of numbers and variables, and discrepancies when similar data was reported across separate papers

Difficulties:
- proposed by some research groups would not be valued as equally meritorious by other teams
Strategies:
- Validation by the COVID-19 Steering Committee of the direct management team as the mediator in the negotiations between research groups, in order to ensure that the assessment battery would be as thorough as possible without overburdening research participants
- Democratic mediation of decisions, in order to facilitate the inclusion of the largest possible number of research groups with diverse interests

Challenge: How to deal with ethical issues regarding use of data from hospitalized patients with COVID-19
Difficulties:
- Risk of privacy breaches, re-identification and misuse of data extracted from medical records
- Difficulties in obtaining informed consent from hospitalized patients or their family members for storing blood samples in the biobank
Strategies:
- Recruitment of a team of medical students to seek informed consent during in-hospital stay from patients and family members for storing leftover blood from diagnostic tests
- Use of the follow-up visit conducted months after hospitalization to retrospectively obtain permission for use of leftover blood from diagnostic tests stored during hospitalization
- Request for the Brazilian National Research Ethics Committee to grant permission for the scientific use of de-identified healthcare data and biological materials from patients deceased due to COVID-19, considering the invaluable importance of medical research during the extraordinary pandemic circumstances

Challenge: How to minimize delays in the dispensing and use of blood samples from the biobank
Difficulties:
- Lack of previous experience of the management teams
- Overload of the management teams with work on the other fronts of data organization
- Difficulties of potentially interested research groups to allocate funds to cover costs of processing/analyzing samples
Strategies:
- Top-down orientation for individual research groups to work collaboratively in the sharing of costs whenever possible
- Submission of a multidisciplinary grant proposal to FAPESP in order to raise further funds for large-scale analyses of samples
- Expansion of opportunities for experienced groups outside the HC-FMUSP system to get access to blood samples for collaborative studies

FAPESP, Fundação de Amparo à Pesquisa do Estado de São Paulo (São Paulo Research Foundation); HC-FMUSP, Hospital das Clínicas da Faculdade de Medicina da Universidade de São Paulo; REDCap, Research Electronic Data Capture.

TABLE 1
Baseline and hospitalization characteristics of adult patients (≥ 18 years) with confirmed SARS-CoV-2 infection hospitalized from 2020 to 2022 available in our database.
a Either: (1) positive reverse-transcriptase polymerase chain reaction (RT-PCR) for SARS-CoV-2 on swab from nasopharyngeal and/or oropharyngeal samples (collected at admission and repeated after 48 h if negative); or (2) positive testing by chemiluminescent immunoassays to detect serum antibodies, performed for highly suspect cases with at least two negative RT-PCR samples or for whom an RT-PCR test was not available up to day 10 of symptom onset. Patients with nosocomial COVID-19 infections were excluded.
b WHO scale categories: 3-4, no continuous supplemental oxygen needed; 5, continuous supplemental oxygen only; 6, continuous positive airway pressure ventilation, bi-level positive airway pressure or high flow nasal oxygen; 7-8-9, invasive mechanical ventilation and/or extracorporeal membrane oxygenation (ECMO). WHO Working Group on the Clinical Characterization and Management of COVID-19 infection (2020).

TABLE 3
Strategies recommended for enhancing collaboration within research institutions, based on lessons learned and previous literature.