Computational Psychiatry Research Map (CPSYMAP): A New Database for Visualizing Research Papers

The field of computational psychiatry is growing in prominence along with recent advances in computational neuroscience, machine learning, and the cumulative scientific understanding of psychiatric disorders. Computational approaches based on cutting-edge technologies and high-dimensional data are expected to provide an understanding of psychiatric disorders with integrating the notions of psychology and neuroscience, and to contribute to clinical practices. However, the multidisciplinary nature of this field seems to limit the development of computational psychiatry studies. Computational psychiatry combines knowledge from neuroscience, psychiatry, and computation; thus, there is an emerging need for a platform to integrate and coordinate these perspectives. In this study, we developed a new database for visualizing research papers as a two-dimensional “map” called the Computational Psychiatry Research Map (CPSYMAP). This map shows the distribution of papers along neuroscientific, psychiatric, and computational dimensions to enable anyone to find niche research and deepen their understanding ofthe field.


INTRODUCTION
The understanding of psychiatric disorders is based on multiple interacting levels, from genetics to cells, neural circuits, cognition, behavior, and the surrounding environment. In addition to efforts to accumulate large-scale biological and psychological evidence such as the ENIGMA (1) and COCORO (2) projects, recent advances in computational methodology have helped to handle high-dimensional data to understand psychiatric disorders (3,4) Bringing mathematical methodologies and theoretical frameworks to psychiatric research is expected to have a central role in the development of treatments and preventive strategies; the use of such methods constitutes an area of research called computational psychiatry (CPSY), which has come to be treated as an important discipline of psychiatry. The development of this field has been reviewed in many papers from the perspective of clinical and computational neuroscience (5)(6)(7)(8)(9)(10). This is a highly multidisciplinary field that requires researchers from a variety of backgrounds ranging across neuroscience, psychiatry, and mathematical modeling. Therefore, to accelerate CPSY research, it is crucial to link each CPSY study to the knowledge derived from different fields, including accumulated evidence of neuroscience, traditional psychiatry, and computational techniques.

Link to Neuroscience
Traditionally, mental disorders have been diagnosed based on patients' self-reporting of subjective experiences and behavioral observation by clinicians; disorders are conceptualized by accumulating those observations. While this view provides benefits for topics such as clinical communications based on the reliability of a diagnosis (11), the current diagnostic category faces several challenges in terms of biological plausibility. Namely, the underlying biological pathology associated with a particular psychiatric disorder category is often observed in other disorder categories (non-specificity) [e.g., (12,13)]. Moreover, patients diagnosed with a psychiatric disorder of a particular category indicates diverse differences in treatment responses, disease processes, and outcomes (lack of predictive validity) (14). The National Institute of Mental Health (NIMH) has developed the Research Domain Criteria (RDoC) framework to systematize psychiatric research based on the findings of behavioral neuroscience that is not bound by conventional disease categories (15,16). Moreover, the NIMH has provided the RDoC matrix, which integrates many levels of information. The observable basic component functions based on behavioral neuroscience are defined as constructs. Psychiatric disorders are considered on a spectrum from normal to abnormal states of constructs and related constructs are organized into a domain. In addition, they enumerated the variables of the unit of analysis from micro to macroscopic levels. Although the dimensional approach of RDoC must work welltogether with the computational approach in the sense that most computational models represent individual differences as continuous parameters, the correspondence between the contents of each CPSY research paper and RDoC is not mentioned in many cases. Linking RDoC to each study based on accumulated neuroscience findings is essential to enhance the significance of CPSY research, particularly with regard to showing the expected role of CPSY research in connecting the knowledge of different aspects using mathematical models.

Link to Traditional Psychiatry
Since RDoC does not cover the traditional taxonomy and symptomatology of mental disorders, linking CPSY studies to RDoC is not sufficient. As mentioned, traditional psychiatric symptomatology has contributed significantly to the research and diagnostic classification criteria because of the inter-rater reliability (11), which has been accumulated in the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) (17,18). Thus, an integration of the terms of psychiatric symptomatology and neuroscientific knowledge defined in the RDoC is needed. However, it is difficult for researchers outside the domain of psychiatry to obtain an overview of which symptoms are important and what should be studied.

Link to Computation
In CPSY research, various powerful computational tools and conceptual frameworks are used to analyze and manipulate high-dimensional, multimodal data sets, including clinical, genetic, behavioral, neuroimaging, and other data types obtained by animal models, human experiments, clinical reports, and simulations. In this domain, there are different cultures, such as the theory-driven vs. data-driven approaches (10). The theory-driven approach models the information processing of perception and cognition in the brain based on the computational framework. The data-driven approach applies machine learning methods to large-scale data on mental disorders, then clusters the data and develops discriminatory models. Additionally, various types of models (e.g., reinforcement learning, neural networks, and biophysical models) have been utilized in this field. It is, therefore, necessary to organize the methodologies, data, theories, and models that are used to accelerate the research.

Database to Integrate the Three Perspectives
Due to cumulative efforts in the fields of neuroscience, psychiatry, and mathematical modeling, CPSY research could have a central role in the rational development of psychological pathology, treatments, ontologies, and preventive strategies. However, insufficient connections between the perspectives of neuroscience, psychiatry, and computational models may hinder the extension of research. For example, clear connections between a computational modeling study to and RDoC constructs and the unit of analysis could provide insights from the accumulated findings of neuroscience in the related domains with different granularity. In addition, this link can be used to explore what kind of computational models may be applicable to a particular impairment of cognitive functions. Therefore, in order to enable further development in the field, it is necessary to organize individual studies from the following perspectives: (1) Connection to neuroscience: What cognitive functions and units of analysis, such as behavior and circuitry, are targeted? (2) Connection to psychiatry: Which mental disorders and symptoms are being addressed and, which are not being addressed? (3) Connection to computation: What methodologies, data, theories, and models are used?
To address this issue, we developed a database and created an environment wherein anyone can search, register, and obtain an overview of CPSY research as a web application. In this report, a CPSY Research Map (CPSYMAP) (RRID:SCR_018942; accessible online at https://ncnp-cpsy-rmap.web.app/) (19) is described, which is an online database that tags CPSY papers using the terminology of neuroscience, psychiatry, and computation to help researchers organize and visualize the status of research areas along with the tags on a two-dimensional map.

Database Architecture
CPSYMAP is accessed through a public website at https:// ncnp-cpsy-rmap.web.app. The database was built using Google Firebase (20) and the source code was written in Javascript. The Modeling computational processes of the brain Modeling the information processing of the brain at the computational and algorithmic levels (partly including the implementation level) using mathematical formulations.

Model-fitting
Estimating and evaluating values of model parameters that can be considered as features reflecting the cognition/impairment of participants from behavioral/neural/other data.

Simulated lesion
Modeling normal functions and manipulating model parameters to simulate pathological mechanisms.
Classification/Discrimination Discriminating/Classifying disease categories/subjects (e.g., diagnosis) based on neuro-physiological, behavioral, and clinical data using supervised learning techniques.

Clustering
Clustering of subjects/symptoms using neuro-physiological, behavioral, and clinical data using unsupervised learning techniques.
Prediction of disease states Predicting disease states including severities, prognosis, and responses to treatments, using neuro-physiological, behavioral, and clinical data.
core of CPSYMAP is a database consisting of two components: information about individual papers and the tags based on neuroscientific, psychiatric, and computational knowledge. The information about the papers is taken from API Crossref (21). The data points stored are the title of the paper, the names of authors, the citation count, digital object identifier (DOI), data type, and abstract.

Tags Used in the Database
Individual CPSY papers are tagged according to the categories of neuroscience, psychiatry, and computation in order to characterize and visualize their relationships from the three perspectives described above; lists of all tags are shown in Supplementary Table 1.
From the neuroscientific perspective, the RDoC domain, construct, and subconstruct are provided as options showing which cognitive functions are targeted. To arrange the papers by the unit of analysis, users can choose tags including genes, molecules, cells, neural circuits, physiology, behaviors, selfreports, and paradigms.
To visualize the mental disorders and symptoms that are being addressed and those not being addressed, we provide options from both the DSM-5 and symptomatology. The DSM-5 is the authoritative list of mental disorders proposed by the American Psychiatric Association (17). Psychiatric symptomatology is a method of psychiatry attempting to describe and reconstruct the characteristics of patients' experiences from the perspectives of psychology and phenomenology. We are referencing the lexicon of psychiatric and mental health terms (WHO) (18).
To arrange the different types of computational frameworks, we provide options in a category called experimental design. In this category, we defined seven tags including Modeling computational processes of the brain, Model-fitting, Simulated lesion, Classification/Discrimination, Clustering, Prediction of disease states and Others. The definitions are provided in the database (19) and in Table 1.
The data type can be selected from animal data, human data, simulation, and literature review. Although we could not cover all the models used in this field, the tags the for models are also provided. For the model names, 13 tags are prepared and 5 models (Reinforcement learning, Biophysical model, Neural network model, Bayes, and Regression model) are shown on the axis.

Organization
CPSYMAP provides a simple graphical user interface to create a web application environment where anyone can register, search, and, overview CPSY research. Through the graphical user interface, users can easily select and tag the scientific papers they want to add. Note that for the registration of a paper, CPSYMAP uses a community augmented approach, i.e., anyone can register and add tags to a paper based on their judgements.

Reliability of CPSYMAP
This is a community augmented approach, meaning that CPSYMAP is publicly editable. Thus, like other crowd-sourced projects, the registered information is not guaranteed to be accurate. We aimed to avoid flooding the CPSYMAP with inaccurate information and making it less reliable by implementing the following measures. First, if a user adds a tagged paper, they can review and edit the related tag information at any time. We have also implemented a simple reporting system, called "Edit request." When a user finds mistakes or wants to make suggestions regarding the tags added on papers by other users, they can send an edit request via the CPSYMAP website. Administrators of the CPSYMAP (i.e., the authors) validate these requests and make appropriate changes to the database.

Data Availability
The information regarding the registered papers and tags can be downloaded from the site in.csv format and can be used for secondary analysis by other researchers with credit to CPSYMAP by citing this paper. In addition, we could potentially share our source code in case of inquiry.

Visualizing, Filtering, and Sorting
This database organizes and visualizes the status of the research areas along the tags on a two-dimensional map. In the Heatmap section, the degree of green color shading indicates areas with many papers vs. areas with few papers (Figure 1).
Each cell contains the papers with tags specified in the vertical and horizontal axes. The users can choose the vertical and horizontal axes they wish to view. Clicking the cell takes users to the Pickup Table. This section allows users to view the paper   list in the cell and sort the tags in ascending or descending order. When users choose a paper from the table, the details of the paper appear. The filter function can be used to further narrow the contents in the map.

One-Dimensional Analysis of Registered Papers
Using these functions, we introduced a preliminary notice from the database. When accessed on October 10th, 2020, there were 248 papers in total (32 users other than the authors registered papers). If users choose the same category on both the horizontal and vertical axes, they can see the distribution of the papers on a single axis (Figure 2). Since we allow multiple selections of tags during the registration, there are some papers outside of the diagonal cells. Regarding the data type, for example, 80 papers used a simulation as their main result and 149 papers reported experiments with human participants; 16 papers dealt with both simulations and human experiments.
Although this trend could be temporary, the most popular diseases discussed in this database were schizophrenia spectrum and other psychotic disorders (95 papers, Figure 3). The bar labeled "Others" included papers targeting 13 other disorders and normal cognitive functions that were tagged as cognitive processes at the registration process (Figure 3,  bottom). On the other hand, there are relatively few papers about bipolar and related disorders or anxiety according to the current data set despite their relatively high disorder prevalence (22). Regarding the experimental design from the computational perspective, both modeling computational processes of the brain (101 papers) and model-fitting (95 papers) were frequently used in the registered CPSY papers (Figure 4). However, all designs were used to FIGURE 4 | Number of papers grouped by experimental design. In terms of experimental design from the computational perspective, both modeling computational processes of the brain and model-fitting were frequently used in the registered CPSY papers. Regarding the RDoC domain, the cognitive system was the term most frequently tagged to papers (122 papers). By using the pickup table and filter function, users can shortlist the papers. Within the constructs of the cognitive system, cognitive control was tagged to 50 papers, while working memory and perception were studied in 30 papers. Within the six domains, the arousal FIGURE 6 | Heatmap with models and experimental design as the vertical and horizontal axes, respectively, as in Figure 5. However, this map was filtered by DSM-5 category, schizophrenia spectrum, and other psychotic disorders. and regulatory systems are less frequently tagged in this database. Concerning the unit of analysis of the RDoC matrix, fewer papers deal with the cellular or molecular level; this was reflected in the small number of papers using biophysical models.

Two to Three-Dimensional Analysis of the Registered Papers
If the unit of analysis and models are chosen for the axes in the current database, the cells with the largest number of papers concern behavior crossed with reinforcement learning (84 papers, Figure 1). The papers included in the cell dealt with a wide range of disorders such as schizophrenia (20 papers) and depressive disorders (20 papers); however, papers on substance use disorders were the most prevalent (26 papers).
In cases where the target disease is already determined, users can see the different landscapes by choosing the disease in the filter and perusing the Heatmap section. Before filtering by the DSM-5 category, the reinforcement learning crossed with modelfitting (58 papers) is the most common methodology ( Figure 5); however, the cell with neural network models crossed with modeling computational processes of the brain and the cells with Bayes have a similarly large number of the papers (13-14 papers) after filtering by schizophrenia (Figure 6). This difference in the landscape might provide insights to find the niches of each field depending on the type of the disorders.

DISCUSSION
We developed CPSYMAP to visualize research papers on a two-dimensional map and to provide an interactive environment where anyone can search, register, and obtain an overview of CPSY research. Further, we described the details and importance of the perspectives of neuroscientific, psychiatric, and computational methods and how we implemented those dimensions in this database. Finally, we showed some preliminary analysis of the registered papers in the database. This is the first database of research papers in the field of CPSY. In addition, the CPSYMAP provides the benefit of presenting new ways of organizing papers, such as expressing distributions with the use of colors in maps and switching axes. It is designed in a way such that it can be updated and augmented though the efforts of the community. This database can enable a new form of reviewing research articles in the sense that the database can help researchers design new research plans and organize previously published works.
However, other approaches to organize and visualize research papers are also available. For instance, Microsoft Academia is a semantic search engine for academic research that is not keyword-based (23). It provides a sophisticated user interface to visualize the keywords of papers and organize academic knowledge, including relationships between authors, in a variety of research fields. However, since the scope of that engine includes the entirety of academia, it may not cover the granularity we aimed to achieve in our database. A notable advantage of our database is that it attempts to cover the entire field of CPSY while ensuring fine granularity and organized dimensionality (neuroscience, psychiatry, and computation).
In addition, the significance of systematic evidence maps in several fields, including epidemiology (24), environmental science (25), and psychiatry (26)(27)(28), is increasing. These maps provide a comprehensive overview of the evidence with visual depictions such as figures or tables in most cases (29). CPSYMAP can be considered as one form of an evidence map. However, a unique feature of our database is that it can be updated to include all newly published papers in this field. This allows for an open and community-augmented approach, which is expected to promote the field of computational psychiatry.
Conversely, there are some limitations to this database. Currently, the number and categories of registered papers are limited. This could be resolved as the community begins to assist us with registration of additional research papers and as more computational psychiatrists become familiar with our database.
The second limitation is the potential for credibility issues with tagging. Specifically, there may be disagreements and differing views on tagging among the users and contributors in the community. At present, users are asked to identify problems using the edit request function.
To overcome these limitations, we plan to update CPSYMAP, especially the registration process. To improve the coverage and objectivity, a semi-automatic registration system will be introduced. The algorithm will recommend potential papers that are suitable for registration and suggest tags for the individual candidate papers. Human researchers will review these suggestions, and approved papers will be registered in the database. We will develop automated tag suggestions as part of the next major update scheduled around April 2021.

CONCLUSION
We expect that the CPSYMAP database will provide a better perspective for non-specialists regarding topics that are not explored often and that are important in the CPSY field. This may attract more experts from other fields such as informatics, physics, and engineering, as well as attracting more experts from psychiatry and neuroscience to CPSY research.
Moreover, the user interface allows researchers to understand CPSY papers with a multidimensional view, which we hope will assist with the development of computational methods and theoretical frameworks to establish links between traditional psychiatry and neuroscience. This can contribute to the integration of different methods, especially the interactive and cyclical development of both theory-driven and data-driven approaches. By building a platform for the organization and storage of information, we hope that the database contributes to the efficiency of research and vitalization of the field.

DATA AVAILABILITY STATEMENT
The original contributions generated for this study are included in the article/Supplementary Materials, further inquiries can be directed to the corresponding author/s.