Processing and Analysis of Multichannel Extracellular Neuronal Signals: State-of-the-Art and Challenges

Mahmud, Mufti; Vassanelli, Stefano

doi:10.3389/fnins.2016.00248

REVIEW article

Front. Neurosci., 02 June 2016

Sec. Neural Technology

Volume 10 - 2016 | https://doi.org/10.3389/fnins.2016.00248

This article is part of the Research Topic: Current challenges and new avenues in neural interfacing: from nanomaterials and microfabrication state-of-the-art, to advanced control-theoretical and signal-processing principles

Mufti Mahmud^*

Stefano Vassanelli^*

NeuroChip Laboratory, Department of Biomedical Sciences, University of Padova, Padova, Italy

In recent years, multichannel neuronal signal acquisition systems have allowed scientists to address research questions that were otherwise out of reach. They act as a powerful means to study brain (dys)functions in in-vivo and in-vitro animal models. Typically, each session of electrophysiological experiments with multichannel data acquisition systems generates a large amount of raw data. For example, a 128-channel signal acquisition system with 16-bit A/D conversion and a 20 kHz sampling rate will generate approximately 17 GB of uncompressed data per hour. This poses an important and challenging problem of inferring conclusions from the large amounts of acquired data. Thus, automated signal processing and analysis tools are becoming a key component in neuroscience research, facilitating extraction of relevant information from neuronal recordings in a reasonable time. The purpose of this review is to introduce the reader to the current state-of-the-art of open-source packages for (semi)automated processing and analysis of multichannel extracellular neuronal signals (i.e., neuronal spikes, local field potentials, electroencephalogram, etc.), and the existing Neuroinformatics infrastructure for tool and data sharing. The review concludes by pinpointing some major challenges currently being faced, which include the development of novel benchmarking techniques and of cloud-based distributed processing and analysis tools, as well as the definition of novel means to share and standardize data.

1. Introduction

The open question of the structure-function relationship has attracted a lot of interest in Systems Neuroscience. Recent works on anatomical substructures of the brain (Briggman and Denk, 2006; Mikula et al., 2012) promise to improve our understanding of neuronal network physiology and drive the development of novel applications of neurotechnology by interpreting the activities of large neuronal ensembles via extracellular methods (Buzsaki, 2004; Nicolelis and Lebedev, 2009).

On the other hand, neuronal signals recorded by means of neuronal probes require rigorous (pre)processing and analysis. In terms of technological advancement, the extracellular interfacing of neurons with artificial chip-based devices has taken a considerable leap forward, even in comparison with the very popular patch-clamp, EEG, and fMRI techniques (Vassanelli, 2011; Spira and Hai, 2013). In the last two decades, such advances have allowed neuroscientists to record neural activity simultaneously from many neurons, with up to thousands of recording sites in a single neuronal probe and at sampling rates from a few up to hundreds of kilohertz (kHz) (Buzsaki, 2004; Schröder et al., 2015).

The wide variety of electrode sizes and dimensions allows different types of neuronal signals to be recorded from the extracellular space. Single-unit activities (action potentials) from single neurons can be sensed by small electrodes in their close proximity (Buzsaki et al., 2012). Such electrodes also pick up multi-unit activities from several simultaneously active neurons near the electrode (Einevoll et al., 2012). With increasing electrode dimensions, local field potentials (LFPs) are sensed from neighboring neuronal populations as the synchronous net activity of several hundreds to thousands of neurons (Tsytsarev et al., 2006; Vassanelli, 2011, 2014; Vassanelli et al., 2012; Khodagholy et al., 2015). Therefore, the neurophysiological signals from different brain structures can be measured using a wide range of techniques based on the dimensions of the electrodes (see Figure 1; Sejnowski et al., 2014).

FIGURE 1

Figure 1. Spatiotemporal range of neurophysiological signal acquisition techniques. Spatiotemporal range of the main techniques to measure neurophysiological signals from the brain. EEG, electroencephalography; MEG, magnetoencephalography.

Also, the massive growth in the field of brain imaging techniques allowed scientists to image brain activities at very different scales, from imaging single ion-channels to the whole brain (for a review, see Freeman, 2015).

Recently developed neural probes have allowed neuroscientists to investigate neural processing by monitoring groups of neurons and their activation patterns at unprecedented resolution (Brown et al., 2004; Giocomo, 2015), thus also helping to bridge the gap between neuronal network activity and behavior (Berenyi et al., 2014). In addition, they have provided deep insights into the pathological basis of brain disorders (Friston et al., 2015). As a drawback, investigating brain function and pathology can require massive data mining. For example, in an hour, a 128-channel signal acquisition system with 16-bit A/D conversion and a 20 kHz sampling rate will generate approximately 17 GB of uncompressed data (Mahmud et al., 2014). Inferring meaningful conclusions from this massive amount of data is pivotal to the neuroscience and neuroengineering community (Mahmud et al., 2010a, 2012a), and tools for the analysis of such multichannel extracellular recordings that support rapid and accurate data interpretation are still missing (Stevenson and Kording, 2011). Although computing power has increased and costs have decreased, processing and analysis of signals have remained labor-intensive. This poses a huge challenge to computational neuroscientists: to develop tools for analyzing such complex data that are optimized for both memory management and processing time (Stevenson and Kording, 2011).
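The data-volume figure quoted above follows from simple arithmetic. The snippet below is a minimal sketch of that calculation; the function name is illustrative, and reading the approximate 17 GB as binary gigabytes (GiB) is an assumption about how the figure was rounded.

```python
def raw_data_volume_bytes(n_channels, bits_per_sample, sampling_rate_hz, duration_s):
    """Uncompressed size of a continuous multichannel recording, in bytes."""
    bytes_per_sample = bits_per_sample / 8
    return n_channels * bytes_per_sample * sampling_rate_hz * duration_s


# Example from the text: 128 channels, 16-bit A/D conversion, 20 kHz, one hour.
one_hour = raw_data_volume_bytes(128, 16, 20_000, 3600)
print(f"{one_hour / 1e9:.2f} GB (decimal), {one_hour / 2**30:.2f} GiB (binary)")
# prints: 18.43 GB (decimal), 17.17 GiB (binary)
```

The same function makes it easy to see how quickly the volume scales: doubling either the channel count or the sampling rate doubles the hourly data volume.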

Over the years, to make data handling and analysis fast, interactive, and user friendly, several software tools have been developed by individual laboratories, e.g., Mahmud et al. (2012a), but only a negligible number of them have been released to the community. In practice, a large number of analysis scripts are kept private, which reduces the transparency of analyses and hampers the reproducibility of their results (Schofield et al., 2009).

It has also been argued that the acquired data, despite being in digitized form, have been only minimally made publicly available for other scientists to explore and validate (Van Horn and Ball, 2008). To overcome this, in recent years, the community sees a growing need to have standardized and publicly available tools (Gardner et al., 2008; Akil et al., 2011) as well as experimental data repositories (Ascoli, 2006a; De Schutter, 2010). To this aim, a paradigm shift has been initiated by a set of laboratories to share their analysis tools through open-source licenses fostering standardization (Ince et al., 2010). Given the circumstances, distributed and cloud-based computing solutions have become an obvious and valuable option (Mahmud et al., 2014).

This review will introduce the readers to the available major open-source academic toolboxes for processing and analysis of neurophysiological signals acquired by means of multichannel probes, and the available infrastructure for sharing such tools and the experimental data. Also, some of the challenges and bottlenecks the community is currently facing will be identified and highlighted, and development perspectives which, in our opinion, will facilitate result reproducibility, flexibility, and standardization will be provided.

2. State-of-the-Art

The state-of-the-art for processing and analysis of neurophysiological signals can be categorized based on signal types, i.e., electroencephalography or magnetoencephalography, (local) field potentials, and spikes. Though the majority of the toolboxes specialize in processing and analyzing one specific type of signal, there exist a few which provide rather comprehensive methods covering two or more signal types. Therefore, based on the signal types, we categorized the toolboxes into three broad categories:

• Toolboxes for Electroencephalography (EEG) analysis;

• Toolboxes for spike trains and field potentials analysis;

• Toolboxes for spike sorting.

Most of the tools were developed mainly in the MATLAB (Mathworks Inc., Natick, USA; www.mathworks.com) and Python (www.python.org) programming languages due to their widespread use in the neuroscience community. Other programming languages such as C, C++, R, Delphi 7, and Java were also used to code parts of some packages.

2.1. Toolboxes for Electroencephalography (EEG) Analysis

In the last decade, various techniques have been developed and applied to EEG data analysis, and focused reviews on specific techniques have been reported (Pascual-Marqui et al., 2002; Stam, 2005; Hallez et al., 2007; Grech et al., 2008; Lenkov et al., 2013; de Cheveigné and Parra, 2014). Table 1 summarizes some of the popular open-source EEG analysis tools, which are listed below, with their representative features.

TABLE 1

Table 1. Popular EEG processing and analysis toolboxes with their representative features.

2.1.1. EEGLAB

“EEGLAB” is a MATLAB based EEG signal processing environment with time-frequency and ICA methods (Delorme and Makeig, 2004). It allows the user to: plot channel spectra and maps, remove artifacts, extract signal epochs, average data, select and compare multiple data, plot event related potential (ERP) images, decompose data using ICA and time/frequency methods, and estimate source locations. In addition, it allows handling data from multiple subjects and performing statistical analysis on them. It can be obtained from http://sccn.ucsd.edu/eeglab/.

2.1.2. ERPWAVELAB

“ERPWAVELAB” is another MATLAB based EEG processing toolbox (Morup et al., 2007) which depends on EEGLAB for certain functionalities. It is capable of multi-channel time-frequency analysis of ERPs in EEG and MEG data, and it provides data decomposition using multiway (tensor) factorization. The features include: various visualizations and maps, artifact rejection in the time-frequency domain, clustering dendrogram, statistical analysis across different groups and subjects, cross coherence analysis, etc. It can be obtained from www.erpwavelab.org.

2.1.3. pyMVPA

“pyMVPA” is a multivariate pattern analysis package developed in Python that aims to facilitate statistical learning analyses of large datasets (Hanke et al., 2009). It offers data handling and an extensible framework for multivariate statistical analyses such as classification, regression, and feature selection. It can be downloaded from www.pymvpa.org/.

2.1.4. eConnectome

“eConnectome” is a MATLAB based software with interactive graphical interfaces for EEG/ECoG/MEG preprocessing, source estimation, connectivity analysis and visualization where the connectivity from EEG/ECoG/MEG can be mapped over sensor and source domains (He et al., 2011). It can be obtained from http://econnectome.umn.edu/.

2.1.5. FieldTrip

“FieldTrip” is a MATLAB based toolbox developed for the analysis of MEG, EEG, and other noninvasively recorded electrophysiological data (Oostenveld et al., 2011). It can read data directly from many proprietary formats (e.g., BrainProducts/BrainVision, NeuroScan, Electrical Geodesics Inc., BCI2000, Micromed, Nexstim, European data format, Generic standard formats, etc.) and allows the user to perform time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. It can be obtained from www.fieldtriptoolbox.org.

2.1.6. EEGVIS

“EEGVIS” is a MATLAB based toolbox that allows users to explore multichannel EEG and other large array-based data sets using multiscale drill-down techniques (Robbins, 2012). It is available at http://visual.cs.utsa.edu/research/projects/eegvis and is usable as a plugin to “EEGLAB.”

2.1.7. SCoT

“SCoT” is a toolbox written in Python for connectivity analysis on EEG/MEG sources. It performs blind source separation, connectivity estimation, resampling statistics, and visualization (Billinger et al., 2014). It works with both multi-trial and single trial data. The source code can be downloaded from https://github.com/SCoT-dev/SCoT.

2.1.8. EMDLAB

“EMDLAB” is developed in MATLAB as a plugin to EEGLAB to perform various empirical mode decomposition (EMD) methods, e.g., plain EMD, ensemble EMD, weighted sliding EMD, and multivariate EMD (MEMD), on EEG data (Al-Subari et al., 2015). It can be obtained from http://sccn.ucsd.edu/eeglab/plugins/EMDLAB_Plugin.zip.

2.1.9. PREP

“PREP” is a MATLAB based preprocessing pipeline for early-stage EEG processing that aims at cleaning the EEG signals (e.g., removing line noise, correcting drifts, interpolating corrupt channels, etc.) (Bigdely-Shamlo et al., 2015). The library is available at http://eegstudy.org/prepcode.

2.2. Toolboxes for Spike Trains and Field Potentials Analysis

With the increasing capability to record simultaneously from a growing number of neurons, computational neuroscientists have developed automated toolboxes addressing the required processing and analyses. We touch upon a few of the publicly available ones below. Table 2 summarizes these packages with their representative features.

TABLE 2

Table 2. Popular spike trains and field potentials processing and analysis toolboxes with their representative features.

2.2.1. DATA-MEAns

“DATA-MEAns” is a toolbox developed in Borland Delphi 7 (Embarcadero Technologies Inc., Austin, USA) and MATLAB (Bonomini et al., 2005). It provides data visualization, basic analysis (i.e., autocorrelations, perievent histograms, rate curves, PSTHs, ISIs, etc.), and nearest neighbor or k-means clustering. Available at http://cortivis.umh.es/.
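To make two of the basic analyses named above concrete, the following is a minimal sketch of inter-spike intervals (ISIs) and a peri-stimulus time histogram (PSTH). It is not taken from DATA-MEAns; the function names, the integer-millisecond time base, and the example values are illustrative assumptions.

```python
def interspike_intervals(spike_times):
    """Differences between consecutive spike times (same unit as the input)."""
    ordered = sorted(spike_times)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def psth(spike_times, stimulus_times, window, bin_width):
    """Peri-stimulus time histogram as mean spike count per bin per trial.

    window is (start, stop) relative to each stimulus onset, in the same
    time unit as the spike and stimulus times.
    """
    start, stop = window
    n_bins = int(round((stop - start) / bin_width))
    counts = [0] * n_bins
    for onset in stimulus_times:
        for t in spike_times:
            # Floor division places the spike in its bin relative to onset.
            idx = int((t - onset - start) // bin_width)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return [c / len(stimulus_times) for c in counts]


# Hypothetical spike and stimulus times in milliseconds.
spikes_ms = [12, 18, 55, 112, 130, 160]
stimuli_ms = [10, 110]
print(interspike_intervals(spikes_ms))
print(psth(spikes_ms, stimuli_ms, window=(0, 50), bin_width=10))
```

Dividing each PSTH bin by the bin width (in seconds) would convert the mean counts into firing rates.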

2.2.2. MeaBench

“MeaBench” is a toolbox written mainly in C++ with certain parts written in Perl¹ and MATLAB. It is intended for data acquisition and online analysis of commercial multielectrode array recordings from Multichannel Systems GmbH (Reutlingen, Germany) (Wagenaar et al., 2005). It allows real-time data visualization, line and stimulus artifact suppression, spike and burst detection and validation. Available at www.danielwagenaar.net/res/software/meabench/.

2.2.3. Klusters, NeuroScope, NDManager

“Klusters,” “NeuroScope,” and “NDManager” are three integrated modules bundled together for processing and analysis of spike and field potential signals (Hazan et al., 2006). Klusters performs spike sorting using KlustaKwik (see Section 2.3.2) and displays 2D projection of features, spike traces, correlograms, and error matrix view. NeuroScope allows inspection, selection, and event editing of spike signals as well as local field potentials (LFPs). NDManager facilitates experimental and preprocessing parameter management. Available at http://neurosuite.sourceforge.net/.

2.2.4. Brain-System for Multivariate AutoRegressive Time Series (BSMART)

The “BSMART” toolbox is written in MATLAB/C for spectral analysis of neurophysiological signals (Cui et al., 2008). It provides bivariate and multivariate autoregressive modeling, spectral analysis through coherence and Granger causality, and network analysis. Available at http://www.brain-smart.org/.

2.2.5. Finding Information in Neural Data (FIND)

“FIND” is a platform-independent framework for the analysis of neuronal data based on MATLAB (Meier et al., 2008). It provides a unified data import function for various proprietary formats, which simplifies standardized interfacing with analysis tools, and allows analysis of discrete series of spike events, continuous time series, and imaging data. It also allows simulating multielectrode activity using a point-process based stochastic model. Available at http://find.bccn.uni-freiburg.de/.

2.2.6. Spike Train Analysis Toolkit (STAToolkit)

“STAToolkit” is a MATLAB/C-hybrid toolbox implementing information theoretic methods to quantify how well the stimuli can be distinguished based on the timing of neuronal firing patterns in a spike train (Goldberg et al., 2009). Available at http://neuroanalysis.org.
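One common information-theoretic measure of stimulus discriminability is the mutual information between stimulus identity and a discretized response such as a spike count. The sketch below computes the simple plug-in estimate from paired observations. It is an illustration only and not STAToolkit's implementation; the function name and example data are assumptions, and the plug-in estimator is known to be biased for small samples.

```python
from collections import Counter
from math import log2


def mutual_information_bits(stimuli, responses):
    """Plug-in estimate of I(S;R) in bits from paired observations."""
    n = len(stimuli)
    joint = Counter(zip(stimuli, responses))
    count_s = Counter(stimuli)
    count_r = Counter(responses)
    info = 0.0
    for (s, r), count in joint.items():
        p_sr = count / n
        # Ratio of the joint probability to the product of the marginals.
        info += p_sr * log2(p_sr / ((count_s[s] / n) * (count_r[r] / n)))
    return info


# Hypothetical trials: stimulus label and spike count on each trial.
print(mutual_information_bits(["A", "A", "B", "B"], [0, 0, 3, 3]))  # fully separable
print(mutual_information_bits(["A", "A", "B", "B"], [0, 3, 0, 3]))  # independent
```

With two equiprobable stimuli, the first call yields one bit (the response identifies the stimulus) and the second yields zero bits (the response carries no information about the stimulus).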

2.2.7. PANDORA

“PANDORA” is a MATLAB-based toolbox that extracts user-defined characteristics from spike train signals and creates numerical database tables from them (Gunay et al., 2009). Further analyses (e.g., drug and parameter effects, spike shape characterization, histogramming and comparison of distributions, cross-correlation, etc.) can then be performed on these tables. Spike detection and feature extraction can also be performed. It is available at http://software.incf.org/software/pandora.

2.2.8. sigTOOL

“sigTOOL” toolbox is written in MATLAB and allows direct loading of a wide range of proprietary file formats (Lidierth, 2009). It provides (auto-)cross-correlation, power spectral analysis, and coherence estimation in addition to usual spike train analysis (i.e., ISI, event auto- and cross-correlations, spike-triggered averaging, peri-event time histograms, frequencygrams, etc.). Available at http://sigtool.sourceforge.net/.

2.2.9. Information Breakdown ToolBox (ibTB)

“ibTB” is a MATLAB-based toolbox which implements information theory methods for spike, LFP, and EEG analysis (Magri et al., 2009). It provides the information breakdown technique for studying how sensory stimuli are encoded by different groups of neurons. The source code can be obtained from the publisher's website (http://static-content.springer.com/esm/art%3A10.1186%2F1471-2202-10-81/MediaObjects/1471-2202-10-81-S1.zip).

2.2.10. Chronux

“Chronux” toolbox is developed in MATLAB for the analysis of both point process and continuous data (Bokil et al., 2010). It provides spike sorting, and local regression and multitaper spectral analysis of neural signals. Available at http://chronux.org/.

2.2.11. SPKTool

“SPKTool” is coded in MATLAB for the detection and analysis of neural spiking activity (Liu et al., 2011). It performs spike detection, feature extraction, manual and semi-automatic clustering of spike trains. Available at http://spktool.sourceforge.net/.
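Spike detection is often performed by amplitude thresholding. The sketch below shows one widely used variant, in which the threshold is a multiple of a robust, median-based noise estimate. This is not SPKTool's code; the function name, the factor of 4, and the refractory window are assumptions chosen for illustration.

```python
from statistics import median


def detect_spikes(signal, k=4.0, refractory_samples=30):
    """Return indices of supra-threshold samples, at most one per refractory window.

    The threshold is k * sigma_n, where sigma_n = median(|x|) / 0.6745 is a
    robust noise estimate that is less inflated by the spikes themselves
    than the plain standard deviation.
    """
    sigma_n = median(abs(x) for x in signal) / 0.6745
    threshold = k * sigma_n
    spikes = []
    last = -refractory_samples
    for i, x in enumerate(signal):
        if abs(x) > threshold and i - last >= refractory_samples:
            spikes.append(i)
            last = i
    return spikes


# Hypothetical trace: low-amplitude background with two large deflections.
trace = [1, -1] * 50
trace[20], trace[21], trace[70] = 50, 40, -60
print(detect_spikes(trace))
```

A practical detector would normally band-pass filter the signal first and align each event to its peak; both steps are omitted here for brevity.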

2.2.12. nSTAT

“nSTAT” toolbox is coded in MATLAB and performs spike train analysis in time domain (e.g., Kalman Filtering), frequency domain (e.g., multi-taper spectral estimation), and mixed time-frequency domain (e.g., spectrogram) (Cajigas et al., 2012). Available at www.neurostat.mit.edu/nstat/.

2.2.13. SigMate

“SigMate” is a MATLAB-based comprehensive framework that allows preprocessing and analysis of EEG, LFPs, and spike signals (Mahmud et al., 2012a). Its main contribution is in the analysis of LFPs, which includes data display, file operations, baseline correction, artifact removal, noise characterization, current source density (CSD) analysis, latency estimation from LFPs and CSDs, determination of cortical layer activation order using LFPs and CSDs, and single LFP clustering. The EEG and spike analyses are provided through the EEGLAB (see Section 2.1.1) and Wave_Clus (see Section 2.3.1) toolboxes. It can be obtained from https://sites.google.com/site/muftimahmud/codes.
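For readers unfamiliar with CSD analysis, a standard textbook estimate for a linear probe with equally spaced contacts is the negative, conductivity-scaled second spatial difference of the LFP. The sketch below implements that estimate for a single time sample. It is not necessarily the method used in SigMate; the function name and the default conductivity of 0.3 S/m are assumptions.

```python
def csd_second_difference(lfp_by_depth, spacing_m, conductivity_s_per_m=0.3):
    """Estimate CSD at the interior contacts of a linear probe (one time sample).

    CSD(z_i) ~ -sigma * (phi[i+1] - 2*phi[i] + phi[i-1]) / h**2,
    with phi in volts, h in meters, and sigma in S/m, giving A/m^3.
    """
    csd = []
    for i in range(1, len(lfp_by_depth) - 1):
        second_diff = lfp_by_depth[i + 1] - 2 * lfp_by_depth[i] + lfp_by_depth[i - 1]
        csd.append(-conductivity_s_per_m * second_diff / spacing_m ** 2)
    return csd


# Sanity check with unit spacing and conductivity: a quadratic potential
# profile has a constant second derivative, so the CSD is constant.
print(csd_second_difference([0, 1, 4, 9, 16], spacing_m=1.0, conductivity_s_per_m=1.0))
```

Because the estimate needs both neighbors, the two outermost contacts yield no CSD value; a profile that is linear in depth gives zero CSD everywhere.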

2.2.14. Multivariate Granger Causality Toolbox (MVGC)

“MVGC” is a toolbox written in MATLAB that implements Wiener-Granger causality (G-causality) on multiple equivalent representations of a vector autoregressive model in both time and frequency domains (Barnett and Seth, 2014). It can be applied to neuroelectric, neuromagnetic, and fMRI signals and can be obtained from http://www.sussex.ac.uk/sackler/mvgc/.

2.2.15. QSpike Tools

“QSpike Tools” is a Linux/Unix-based cloud-computing framework, modeled using client-server architecture and developed in MATLAB / Bash scripts², for processing and analysis of extracellular spike trains (Mahmud et al., 2014). It batch-processes CPU-intensive operations for each channel (e.g., filtering, multi-unit activity detection, spike-sorting, etc.) in parallel by delegating them to a multi-core computer or to a computer cluster. It can be obtained from https://sites.google.com/site/qspiketool/.
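The per-channel delegation pattern described above can be illustrated with a short sketch. QSpike Tools itself is implemented in MATLAB and Bash, so the Python code below is only an analogy: the function names and the placeholder preprocessing step (mean removal) are assumptions, and threads are used by default purely to keep the example portable.

```python
from concurrent.futures import ThreadPoolExecutor


def preprocess_channel(samples):
    """Placeholder per-channel step: remove the channel mean (DC offset)."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


def preprocess_all(channels, executor_factory=ThreadPoolExecutor, max_workers=4):
    """Run preprocess_channel on every channel concurrently, keeping channel order."""
    with executor_factory(max_workers=max_workers) as pool:
        # Executor.map returns results in the order of the input channels.
        return list(pool.map(preprocess_channel, channels))


# Two hypothetical channels with three samples each.
print(preprocess_all([[1, 2, 3], [10, 10, 10]]))
```

For genuinely CPU-bound steps, passing `concurrent.futures.ProcessPoolExecutor` as `executor_factory` would be the usual choice on a multi-core machine, since each channel can be processed independently of the others.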

2.3. Toolboxes for Spike Sorting

As seen in the literature, the majority of the efforts have been devoted to developing tools for spike sorting and analysis. A recent review by Rey et al. (2015) outlines the basic concepts of spike sorting, applicability requirements, and shortcomings of currently available algorithms. Detailing all spike-sorting packages and their functionalities would require a complete review; therefore, we restrict our discussion here to some of the popular open-source toolboxes.

2.3.1. Wave_Clus

“Wave_Clus” is the most popular spike sorting package to date. Developed in MATLAB, it uses wavelet transformation based feature selection method and superparamagnetic clustering (Blatt et al., 1996) method to sort the spikes into different classes (Quian Quiroga et al., 2004). It is available at https://vis.caltech.edu/~rodri/Wave_clus/Wave_clus_home.htm.

2.3.2. KlustaKwik

“KlustaKwik” is a stand-alone program written in C++ for automatic clustering analysis (Harris et al., 2000) by fitting a mixture of Gaussians and using masked expectation-maximization (Kadir et al., 2014; Rossant et al., 2016). It can be downloaded from https://github.com/klusta-team/klustakwik.

2.3.3. OSort

“OSort” is a template-based, unsupervised, online spike sorting algorithm written in MATLAB (Rutishauser et al., 2006). It uses a residual-sum-of-squares based distance measure and custom thresholds to sort the recorded spikes on the fly. Available at http://www.urut.ch/new/serendipity/index.php?/pages/osort.html.
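The idea of online, template-based sorting with a residual-sum-of-squares distance can be sketched as follows. This is a simplified illustration and not the OSort implementation: the function name, the fixed user-supplied threshold, and the running-mean template update are assumptions, and steps that a full sorter would need (such as choosing the threshold from the data) are omitted.

```python
def online_sort(spikes, threshold):
    """Assign each waveform to the nearest template or start a new cluster.

    Returns (labels, templates). The distance is the residual sum of squares
    between a waveform and a cluster's running mean waveform.
    """
    templates, counts, labels = [], [], []
    for waveform in spikes:
        distances = [
            sum((w - t) ** 2 for w, t in zip(waveform, template))
            for template in templates
        ]
        if distances and min(distances) < threshold:
            k = distances.index(min(distances))
            counts[k] += 1
            # Incremental update of the running mean template.
            templates[k] = [t + (w - t) / counts[k] for w, t in zip(waveform, templates[k])]
        else:
            # No template is close enough: this waveform seeds a new cluster.
            k = len(templates)
            templates.append(list(waveform))
            counts.append(1)
        labels.append(k)
    return labels, templates


# Hypothetical three-sample waveforms from two units of opposite polarity.
waveforms = [[0, 10, 0], [0, 11, 0], [0, -10, 0], [0, 9, 0], [0, -11, 0]]
labels, templates = online_sort(waveforms, threshold=5)
print(labels)
```

Because each spike is processed once, in arrival order, the procedure can run while data are still being recorded, which is what makes this family of methods suitable for online use.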

2.3.4. SpikeOMatic

“SpikeOMatic” is a spike sorting package developed in R (Pouzat and Chaffiol, 2009). It implements Gaussian Mixture and Dynamic Hidden Markov Models using expectation-maximization and Markov Chain Monte Carlo methods, respectively. Available at http://www.biomedicale.univ-paris5.fr/SpikeOMatic/.

2.3.5. Spyke

“Spyke” is a Python based toolbox for visualizing, navigating, and spike sorting of high-density multichannel extracellular spikes (Spacek et al., 2009). It uses PCA for dimensionality reduction and modified gradient ascent clustering algorithm (Fukunaga and Hostetler, 1975; Swindale and Spacek, 2014) to classify the features. Available at http://spyke.github.io/.

2.3.6. UltraMegaSort2000

“UltraMegaSort2000” is a MATLAB based toolbox for spike detection and clustering which implements a hierarchical clustering scheme using similarities of spike shape and spike timing statistics, and provides false-positive and false-negative errors as quality evaluation metrics (Fee et al., 1996; Hill et al., 2011). Available at http://physics.ucsd.edu/neurophysics/software.php.

2.3.7. EToS

“EToS” is a spike sorting toolbox written in C++ implementing multimodality-weighted PCA and variational Bayes for Student's t mixture model (Takekawa et al., 2012). The spike sorting code is parallelized through OpenMP (www.openmp.org) and is available at http://etos.sourceforge.net/.

2.3.8. MClust

“MClust” is a spike sorting toolbox developed in MATLAB. It supports both manual and automated clustering, with the possibility of manual feature selection (Redish, 2014). It can be obtained from http://redishlab.neuroscience.umn.edu/MClust/MClust.html.

2.3.9. NEV2lKit

“NEV2lKit” is a package written in C++ with routines for analysis, visualization, and classification of spikes (Bongard et al., 2014). Its results are reported to be accurate, efficient, and consistent across experiments. Available at http://nev2lkit.sourceforge.net/.

2.3.10. WIToolbox

“WIToolbox” implements a combination of wavelet transform and information theory in MATLAB for better classification of spikes in the presence of spike time jitter, background noise, and sample size problems (Lopes-dos Santos et al., 2015). Available at www.le.ac.uk/csn/WI.

3. Sharing of Analysis Tools and Experimental Data

Making analysis toolboxes for easy and efficient handling of massive neuronal data available to the community is just one part of the solution. The other part is the availability of infrastructures that allow these tools and the experimental data to be shared. Computational neuroscientists are putting constant and significant effort into building and refining “Neuroinformatics” infrastructures, as outlined below, to make data, tools, and resources electronically accessible over the web (Ascoli, 2006b), which is believed to facilitate standardization and benchmarking and to foster collaborative research (Mahmud et al., 2012b). As stated by Prof. Jan G. Bjaalie, “Neuroinformatics applies the methods and approaches required for large scale data integration and thereby paves the way toward understanding the brain”³.

3.1. Neuroshare

The Neuroshare (http://neuroshare.sourceforge.net/) framework started with the goal of creating and supporting open data file format specifications for neurophysiology, a set of open libraries to access those data, and open-source software tools for their analysis. This is particularly important because the community faces a situation in which many proprietary neuronal signal file formats are used by different acquisition software packages. Leveraging the “Neuroshare API,” the framework aims at standardizing access to the individual file formats of neurophysiological experiment data by creating low-level handling and processing tools. This was designed to be achieved in two successive phases: (i) creation of open library and format standards for the experimental data, and (ii) development of free and open-source tools for low-level handling and processing of the data. Currently, it provides eight Neuroshare-compliant dynamic link libraries (DLLs) to access raw data files recorded with proprietary acquisition setups, i.e., Alpha-Omega, Blackrock Microsystems, Cambridge Electronic Design, Multichannel Systems, NeuroExplorer, Plexon, RC Electronics, and Tucker-Davis Technologies.

3.2. International Neuroinformatics Coordinating Facility (INCF)

To facilitate tool and data sharing and to foster development in the field of Neuroinformatics, an organization called the International Neuroinformatics Coordinating Facility (INCF, www.incf.org) was formed by 12 member countries of the Organization for Economic Co-operation and Development (OECD, www.oecd.org/). It is financed by Belgium, Czech Republic, Finland, France, Germany, Italy, Japan, The Netherlands, Norway, Sweden, Switzerland, the United States, and the European Commission, and many of these member countries have their own nodes to provide this facility locally (Rautenberg et al., 2011). An article by the Executive Director of INCF during 2006–2008 defined its aims, quoted here, as being to (Bjaalie and Grillner, 2007):

• coordinate and foster international activities in Neuroinformatics;

• contribute to the development of scalable, portable, and extensible applications that can be used for furthering our knowledge of the human brain and its diseases;

• contribute to the development and maintenance of specific database and other computational infrastructures and support mechanisms; and

• focus on developing mechanisms for the seamless flow of information and knowledge between academia, private enterprizes, and the publication industry.

3.3. Code Analysis, Repository and Modeling for e-Neuroscience (CARMEN)

The Code Analysis, Repository and Modeling for e-Neuroscience (CARMEN) project was one of a kind in developing a virtual neuroscience laboratory, especially for electrophysiology data, facilitating e-Neuroscience by creating a unique infrastructure for data and tool sharing and services (Watson et al., 2010). These secure services allow a user to curate data and analysis code to defined storages, document experimental protocols, and execute data analysis (Fletcher et al., 2008). Data cannot be curated into the CARMEN databases without a proper metadata description. This description is essential for retrieving the correct data from the thousands of available datasets and for interpreting them using the appropriate analysis codes (Jessop et al., 2010).

The CARMEN framework currently supports analysis codes written in MATLAB, Python, C/C++, and R. The users may upload their codes, in the form of non-interactive standalone command-line applications, wrapping them using a Service Builder tool to create a suitable service format to be executed on the platform (Weeks et al., 2013).

Recently, a programming document demonstrated the usage of a curated repository of multielectrode array recordings of spontaneous activity from mouse and ferret retina. The dataset was in HDF5⁴ format (a format for hierarchical data organization), and the document provided guidelines for the efficient use of the CARMEN software workflow. Moreover, the dataset structure, along with examples of reproducible research using those data files, was reported (Eglen et al., 2014).

3.4. Neurodata without Borders: Neurophysiology (NWB:N)

To facilitate research reproducibility and to make it possible to explore someone else's data, data standardization is a must. The Neurodata Without Borders: Neurophysiology (NWB:N, http://www.nwb.org/) is an initiative aiming at promoting data standardization and sharing. Since its infancy, the NWB:N has been keen on producing a common data format for recordings and metadata of cellular electrophysiology, which has recently been released along with a sample dataset (Teeters et al., 2015).

4. Challenges and Future Perspectives

Secure infrastructures are vital for the success of large-scale, multi-institutional Neuroinformatics research. It is foreseeable that Neuroinformatics research facilities will need to integrate data seamlessly from different sources for data sharing, while also being secure enough to address challenging issues such as:

• research collaboration with the option for participants to protect their proprietary data,

• user friendliness allowing users with minimal information technology skills to explore, navigate, and use scientific data and services provided by the environment.

In recent years, the emergence and popularity of distributed computing have provided an opportunity to share resources that would otherwise require more effort. In particular, cloud computing and service oriented architecture open novel avenues necessary to foster collaborative neuronal signal analysis through distributed infrastructure. These approaches allow a better representation of the responsibilities taken by the different users in accordance with their granted privileges. In our opinion, development is expected to move toward:

• Designing and implementing secure and protected systems;

• Advancing cloud-based web applications;

• Facilitating easy deployment of data;

• Enabling reusability and sharing of tools with adaptability to changing requirements;

• Empowering researchers to share the functionalities that they want to publish.

Based on the current state-of-the-art, we identified several challenges that require the immediate attention of the community; a few are indicated below:

1. Over the last few years, neuroscientists have put together quite a few useful neuroimage repositories and associated analysis tools (Eickhoff et al., 2016), but neurophysiology is lagging behind. Though there exist a few individual databases (e.g., http://brainliner.jp/, http://www.g-node.org/, https://www.ieeg.org/, etc.), they are very limited in comparison to their imaging counterparts (Tripathy et al., 2014).

2. With current acquisition systems and the data format changes they require, interoperability and data conversion are still a nightmare due to the lack of widely adopted standards. In addition, when data are curated in a database, the metadata used to describe them are often incompatible among different labs and curators, which hampers meaningful analyses using data from another lab. This unnecessarily increases the time and effort required for data discovery and analysis.

3. Because rapid, customized analyses are a practical necessity, most labs develop their own analysis scripts to perform the analyses they require. This approach has severe drawbacks on the global scale: interoperability, compatibility, and sharing of tools with other laboratories are highly restricted. Thus, the problems of creating a common set of analyses and of making benchmark analysis tools available are yet to be addressed.

4. Though the price of computing power has dropped significantly over the years, the power required to demystify large neuronal ensembles is still alarmingly high. From a Neuroinformatics perspective, the availability of powerful international computing facilities will greatly facilitate remote, automated, and standardized multichannel neuronal signal processing and analysis.

5. The popularity of cloud computing is growing rapidly. A Competitor-to-Collaborator concept that exploits the benefits of distributed computing would be very interesting: small clusters of laboratories working on similar research questions would share their resources and tools through a unified cloud-based framework, for other laboratories to use as web services.
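To make the metadata incompatibility described in challenge 2 more concrete, the following minimal Python sketch shows how two labs might describe the same recording with different keys and units, and how a shared mapping could translate both into one common description. All lab names, field names, and units here are hypothetical and chosen only for illustration; they are not taken from any existing standard or from the tools reviewed above.

```python
# Hypothetical illustration: none of these field names come from a real standard.
COMMON_FIELDS = ("sampling_rate_hz", "n_channels", "species")

# Assumed per-lab mapping from local metadata keys to the common field names.
LAB_KEY_MAPS = {
    "lab_a": {"fs": "sampling_rate_hz", "channels": "n_channels", "animal": "species"},
    "lab_b": {
        "SamplingRate_kHz": "sampling_rate_hz",
        "NumElectrodes": "n_channels",
        "Species": "species",
    },
}

# Assumed unit conversions, keyed by (lab, local key).
UNIT_CONVERSIONS = {
    ("lab_b", "SamplingRate_kHz"): lambda khz: khz * 1000.0,
}


def harmonize(lab, record):
    """Translate one lab-specific metadata record into the common schema."""
    key_map = LAB_KEY_MAPS[lab]
    out = {}
    for local_key, value in record.items():
        if local_key not in key_map:
            continue  # fields without a mapping are dropped in this sketch
        convert = UNIT_CONVERSIONS.get((lab, local_key), lambda v: v)
        out[key_map[local_key]] = convert(value)
    missing = [field for field in COMMON_FIELDS if field not in out]
    if missing:
        raise ValueError(f"{lab} record lacks required fields: {missing}")
    return out


# The same recording, described in two different ways.
record_a = harmonize("lab_a", {"fs": 20000.0, "channels": 128, "animal": "rat"})
record_b = harmonize(
    "lab_b", {"SamplingRate_kHz": 20, "NumElectrodes": 128, "Species": "rat"}
)
```

After harmonization, `record_a` and `record_b` compare equal, so a single analysis script could consume either. Without a widely adopted standard, every pair of labs has to maintain such mappings by hand, which is the overhead that challenge 2 points to.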

Author Contributions

MM performed the reported study. MM wrote and SV edited the paper. Both authors have seen and approved the final manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Acknowledgments

Financial support by the 7th Framework Programme of the European Commission through the “RAMP” project (www.rampproject.eu), contract no. 612058, is gratefully acknowledged.

Footnotes

1. https://www.perl.org/.

2. https://en.wikipedia.org/wiki/Bash_(Unix_shell).

3. https://www.incf.org/community/people/bjaalie/person_view.

4. https://www.hdfgroup.org/HDF5/.

References

Akil, H., Martone, M. E., and Van Essen, D. C. (2011). Challenges and opportunities in mining neuroscience data. Science 331, 708–712. doi: 10.1126/science.1199305

Al-Subari, K., Al-Baddai, S., Tome, A. M., Goldhacker, M., Faltermeier, R., and Lang, E. W. (2015). EMDLAB: a toolbox for analysis of single-trial EEG dynamics using empirical mode decomposition. J. Neurosci. Methods 253, 193–205. doi: 10.1016/j.jneumeth.2015.06.020

Ascoli, G. A. (2006a). Mobilizing the base of neuroscience data: the case of neuronal morphologies. Nat. Rev. Neurosci. 7, 318–324. doi: 10.1038/nrn1885

Ascoli, G. A. (2006b). The ups and downs of neuroscience shares. Neuroinformatics 4, 213–216. doi: 10.1385/NI:4:3:213

Baccala, L. A., and Sameshima, K. (2001). Partial directed coherence: a new concept in neural structure determination. Biol. Cybern. 84, 463–474. doi: 10.1007/pl00007990

Barnett, L., and Seth, A. K. (2014). The MVGC multivariate Granger causality toolbox: a new approach to Granger-causal inference. J. Neurosci. Methods 223, 50–68. doi: 10.1016/j.jneumeth.2013.10.018

Berenyi, A., Somogyvari, Z., Nagy, A. J., Roux, L., Long, J. D., Fujisawa, S., et al. (2014). Large-scale, high-density (up to 512 channels) recording of local circuits in behaving animals. J. Neurophysiol. 111, 1132–1149. doi: 10.1152/jn.00785.2013

Bigdely-Shamlo, N., Mullen, T., Kothe, C., Su, K. M., and Robbins, K. A. (2015). The PREP pipeline: standardized preprocessing for large-scale EEG analysis. Front. Neuroinform. 9:16. doi: 10.3389/fninf.2015.00016

Billinger, M., Brunner, C., and Muller-Putz, G. R. (2014). SCoT: a Python toolbox for EEG source connectivity. Front. Neuroinform. 8:22. doi: 10.3389/fninf.2014.00022

Bjaalie, J. G., and Grillner, S. (2007). Global neuroinformatics: the international neuroinformatics coordinating facility. J. Neurosci. 27, 3613–3615. doi: 10.1523/jneurosci.0558-07.2007

Blatt, M., Wiseman, S., and Domany, E. (1996). Superparamagnetic clustering of data. Phys. Rev. Lett. 76, 3251–3254. doi: 10.1103/physrevlett.76.3251

Bokil, H., Andrews, P., Kulkarni, J. E., Mehta, S., and Mitra, P. P. (2010). Chronux: a platform for analyzing neural signals. J. Neurosci. Methods 192, 146–151. doi: 10.1016/j.jneumeth.2010.06.020

Bongard, M., Micol, D., and Fernandez, E. (2014). NEV2lkit: a new open source tool for handling neuronal event files from multi-electrode recordings. Int. J. Neural Syst. 24, 1450009. doi: 10.1142/s0129065714500099

Bonomini, M. P., Ferrandez, J. M., Bolea, J. A., and Fernandez, E. (2005). DATA-MEAns: an open source tool for the classification and management of neural ensemble recordings. J. Neurosci. Methods 148, 137–146. doi: 10.1016/j.jneumeth.2005.04.008

Briggman, K. L., and Denk, W. (2006). Towards neural circuit reconstruction with volume electron microscopy techniques. Curr. Opin. Neurobiol. 16, 562–570. doi: 10.1016/j.conb.2006.08.010

Brown, E. N., Kass, R. E., and Mitra, P. P. (2004). Multiple neural spike train data analysis: state-of-the-art and future challenges. Nat. Neurosci. 7, 456–461. doi: 10.1038/nn1228

Buzsaki, G. (2004). Large-scale recording of neuronal ensembles. Nat. Neurosci. 7, 446–451. doi: 10.1038/nn1233

Buzsaki, G., Anastassiou, C. A., and Koch, C. (2012). The origin of extracellular fields and currents–EEG, ECoG, LFP and spikes. Nat. Rev. Neurosci. 13, 407–420. doi: 10.1038/nrn3241

Cajigas, I., Malik, W. Q., and Brown, E. N. (2012). nSTAT: open-source neural spike train analysis toolbox for Matlab. J. Neurosci. Methods 211, 245–264. doi: 10.1016/j.jneumeth.2012.08.009

Celeux, G., and Govaert, G. (1992). A classification EM algorithm for clustering and two stochastic versions. Comput. Stat. Data Anal. 14, 315–332. doi: 10.1016/0167-9473(92)90042-e