Short-Term Classification Learning Promotes Rapid Global Improvements of Information Processing in Human Brain Functional Connectome

Classification learning is a preeminent human ability within the animal kingdom, but the key mechanisms by which brain networks regulate learning remain mostly elusive. Recent neuroimaging advances have depicted the human brain as a complex graph machinery where brain regions are nodes and coherent activities among them represent the functional connections. While long-term motor memories have been found to alter functional connectivity in the resting human brain, the short-term effects of learning have not yet been widely investigated from a graph topological perspective. For instance, classification learning is known to orchestrate rapid modulation of diverse memory systems like short-term and visual working memories, but how the brain functional connectome accommodates such modulations is unclear. We used publicly available repositories (openfmri.org) and selected three experiments: two focused on short-term classification learning along two consecutive runs, where learning was promoted by trial-by-trial feedback errors, while a further experiment was used as supplementary control. We analyzed the functional connectivity extracted from BOLD fMRI signals and estimated the graph information processing in the cerebral networks. The information processing capability, characterized by complex network statistics, significantly improved over runs, together with the subject classification accuracy. In contrast, null-learning experiments, where feedback was delivered with poor consistency, did not provoke any significant change in the functional connectivity over runs. We propose that learning induces fast modifications in the overall brain network dynamics, improving the short-term potential of the brain to process and integrate information, a dynamic consistently orchestrated by modulations of the functional connections among specific brain regions.


INTRODUCTION
Learning from sensory inputs is a crucial ability that allows humans and animals to adapt their behaviors to the variability of the external environment and to survive (Tetzlaff et al., 2012). In many cases, these conditions demand fast learning that occurs over short temporal intervals (i.e., from seconds to minutes). One specific type of learning (classification learning) requires the classification of objects into categories, an objective typically achievable by providing an adequate number of correctly labeled examples. For instance, a subject untrained in ornithology can quickly learn to discriminate robins from songbirds after appropriate instruction with examples of both species.
In the human brain, classification learning essentially involves two memory systems: the visual working memory and the visual short-term memory (Knowlton et al., 1994; Foerde et al., 2007). The brain functional correlates of these systems have been broadly identified by functional magnetic resonance imaging (fMRI) as a distributed network spanning many cortical regions (Bettencourt and Xu, 2016) such as the prefrontal cortex, the middle temporal gyrus, the posterior parietal areas, and the occipital regions. Several subcortical regions like the hippocampus (Aron et al., 2004), the amygdala and the striatum also seem to be involved (Harrison and Tong, 2009).
From a complex network perspective, the human brain regions have massive mutual dependencies, combined in a spatial arrangement known as the modules-and-hubs architecture (Sporns, 2009, 2012; van den Heuvel and Sporns, 2013), which promotes a wide variety of coherent subnetwork dynamics and tasks serving resting-state (Greicius et al., 2003), attention (Dosenbach et al., 2007; Sheremata et al., 2018), salience (Fox and Raichle, 2007), sensorimotor inputs and outputs (Bressler and Menon, 2010), language (Friederici and Gierhan, 2013) and other functions (Achard et al., 2006). However, the precise spatial and temporal superposition of each subsystem during cognitive tasks still remains unclear, possibly because of their reciprocal interconnections (Karahanoglu and Van De Ville, 2015), which blur the boundaries between their dynamic profiles. Specifically, although statistical techniques such as Independent Component Analysis (ICA) (Calhoun et al., 2001; Stone et al., 2002) are particularly fine and may identify coherent activities in the spatial and temporal domains, their formal model relies on substantial assumptions such as the non-Gaussianity of data, a limitation likely to be overcome by a pure complex network approach which models the brain as a single whole unit.
Therefore, in this work, we propose a functional connectome investigation of the global large-scale neurophysiological bases of the visual working and short-term memory dynamics elicited by a classification learning task. In classification learning, human subjects have to learn, in a few minutes, the association of some visual stimuli to specific choices (e.g., a keystroke between two buttons). User choices are driven by visual feedback, which leads the learning process. If feedback is consistently provided over time (deterministically), participants eventually learn the associations between visual stimuli and the correct responses, while, if feedback is administered by chance (probabilistically), participants do not learn any associations. We used fMRI data from publicly available repositories related to two similar classification learning experiments performed by Poldrack et al. (2001) and Aron et al. (2006) on a total of 30 healthy subjects. In a first part, the participants learned the proper stationary associations by visual error feedback in two consecutive sets of trials (Run 1 and 2) where the classification accuracy increased over runs (Poldrack et al., 2001; Aron et al., 2006). Subsequently, participants were challenged in another pair of runs with non-stationary visual feedback, which disrupted the already formed memories and prevented the formation of new associations. As a control condition, we used a third dataset where a cognitive task not activating short-term memory systems (one-back working memory task) was similarly executed along two runs. Examiners acquired the Blood-Oxygen-Level Dependent (BOLD) signal from fMRI together with a preliminary structural MRI of each subject.
Functional connectomes of subjects were extracted for each run after the AFNI (Analysis of Functional NeuroImages) preprocessing pipeline (Cox, 2012) and embedded in two different atlases [Harvard-Oxford FSL (Rademacher et al., 1992; Fischl et al., 2004) and Brainnetome (Fan et al., 2016)] to reduce the effects of the choice of the anatomical parcellation. On the extracted graphs, we applied a common set of complex network statistics widely used for brain functional connectomes (i.e., node degree, global and local efficiency, clustering coefficient and the average shortest path length) to investigate the information processing dynamics of the brain large-scale networks. Specifically, the aforementioned measures complementarily estimated the extent of functional segregation and functional integration (Tononi et al., 1994), two crucial properties highlighting the information processing capability of complex brain networks.
The results showed a consistent and significant increment of the information processing efficiency in terms of functional segregation and integration in the second runs as compared to the first ones. This suggests that distributed and ample functional connectivity modifications also emerge in fast short-term learning, enabling faster information integration in classification learning tasks. Of note, these effects were mediated by coherent co-activation or deactivation of specific brain regions, mainly from the temporal, fusiform and insular gyri and the parietal lobe. No evident session effects emerged from the third dataset of the one-back memory task.

Subject Data
Data were retrieved from the OpenfMRI project [openfmri.org, now converged into the openneuro.org portal, accession numbers "ds002" (Poldrack et al., 2001) and "ds052" (Aron et al., 2006)] managed by the Poldrack Lab and the Center for Reproducible Neuroscience at Stanford University (United States). The third dataset, "ds107", was uploaded by Duncan et al. (2009). The database and its contents are made available under the Public Domain Dedication and License v1.0 (PDDL) 1. The ds002 dataset was populated by 17 healthy right-handed participants (female = 10, age = 23.3 ± 2.8). The ds052 dataset contained 13 healthy subjects (female = 7, age = 22.8 ± 3.2). The ds107 dataset contained 49 healthy monolingual English speakers. For each participant, both fMRI acquisitions (repetition time, TR, of 2.0 s in ds002 and ds052 and of 3.0 s in ds107; echo time, TE, of 4 ms in ds002 and ds052 and of 50 ms in ds107) and structural MRIs were included. All details about MRI and BOLD acquisitions can be found in the related works (Poldrack et al., 2001; Aron et al., 2006; Duncan et al., 2009). The classification accuracy of experiments ds002 and ds052 was computed from the metadata contained in the ds002 and ds052 datasets.

FIGURE 1 | The experimental and computational frameworks. (A) Healthy participants performed a weather-prediction task through the association of card types to a binary weather output (sunny/rainy). (B) Two stages of trials were presented sequentially to subjects, where each trial was composed of four sections: a first phase characterized by the visual presentation of the card, a second stage wherein the user makes the choice (sun/rain), a third phase with the visual feedback (correct/wrong) and a short final rest phase with a blank screen. Depending on the task type, the feedback could be assigned deterministically or probabilistically. (C) The AFNI preprocessing pipeline used for the structural MRI and the BOLD signals. (D) Axial view samples of the two atlases used to parcellate the fMRI volumes: the FSL and the Brainnetome (BN). (E-G) Examples of, respectively, adjacency matrices (E), their related topological (F) and MNI space embeddings (G). (H) Exemplary collections of complex network statistics plotted in Box-Whisker (1st, 25th, 50th, and 99th percentiles) with scattering points as measure of dispersion. (I) The classification accuracy reported by the original works (Poldrack et al., 2001; Aron et al., 2006) shows that probabilistic feedback did not evoke any consistent association learning.

1 https://opendatacommons.org/licenses/pddl/index.html

Deterministic and Probabilistic Classification Tasks
The objective of a classification learning experiment is to promote the learning of a set of associations between visual stimuli and specific user responses. Learning is driven by visual feedback, which leads the participant to choose the responses that return "correct" feedback. When feedback is consistently provided over the experimental session (i.e., with a deterministic assignment), participants eventually learn the associations between visual stimuli and correct responses while, if feedback is administered by chance (i.e., by a probabilistic assignment), participants do not learn the arbitrary associations. During the fMRI scans in the ds002 and ds052 experiments, the subjects had to perform two different classification learning tasks along two consecutive runs in a "weather prediction" setup (Figure 1A). Participants had to learn associations between cards (four in ds002, one to three in ds052) and a binary output, visually represented (as feedback) by a sun or a rainy cloud after their responses. Learning occurs trial-by-trial while the visual feedback errors (correct/incorrect) drive subjects towards the correct card-weather associations. According to the metadata in the datasets, which are consistent with the materials presented in Poldrack et al. (2001) and Aron et al. (2006), in the first series of two consecutive runs, trials were characterized by deterministic associations between cards and weather outcomes. In the second series, instead, the associations were assigned probabilistically. In each run there were 80 trials in ds002 (∼5 min of total duration) and 48 in ds052 (∼3 min of total duration).

One-Back Working Memory Task
In the ds107 experiment, the participants (N = 49) observed a sequence of objects during MRI scanning and had to press a specific key on a keyboard whenever the current object was identical to the previous one. Visual stimuli belonged to four categories (Duncan et al., 2009): written words, pictures of common objects, scrambled pictures and consonant letter strings. Stimuli were presented in a sequence of four blocks. Each block consisted of 16 trials from a single category. Objects appeared on a screen for 350 ms each. A trial began with a 650 ms fixation cross, for a total of 1 s per trial.

Signal Processing
Data were preprocessed and analyzed using the following MATLAB toolboxes: SPM12 (Friston et al., 2007), CONN (Whitfield-Gabrieli and Nieto-Castanon, 2012) and BCT (Rubinov and Sporns, 2010). Prior to analyses, all images underwent preprocessing steps according to the AFNI pipeline (Cox, 2012) in the following order: realignment and unwarping of functional slices, centering of functional slices, slice-timing correction of functional volumes, outlier detection in functional volumes, direct segmentation and normalization in Montreal Neurological Institute (MNI) space of the functional volumes, centering of the structural slices, segmentation and normalization in MNI space of the structural volumes, and smoothing of the functional volumes. We used the default parameters (functional outlier detection = 97th percentiles, global-signal z-value threshold = 5, subject-motion mm threshold = 0.9, structural target resolution = 1 mm, functional target resolution = 2 mm, smoothing kernel FWHM = 8 mm) suggested within the CONN framework for all processing steps (Whitfield-Gabrieli and Nieto-Castanon, 2012). In addition, to avoid errors derived from the choice of the atlas, we used two different atlases: the FSL (Rademacher et al., 1992; Fischl et al., 2004) and the Brainnetome (Fan et al., 2016).

Functional Connectivity Estimation
After BOLD signal preprocessing, data underwent a denoising step through a band-pass filter in the frequency range [0.008, 0.09] Hz and a despiking procedure to further remove motion artifacts after the Artifact Detection Tool (ART)-based scrubbing 2 (Power et al., 2012; Van Dijk et al., 2012; Jo et al., 2013). Voxelwise time series were transformed into region of interest (ROI) series by averaging the signal over all ROI voxels. Two parcellation atlases were used in this study: the FSL (Fischl et al., 2004) and the Brainnetome (Fan et al., 2016). In the subsequent first-level analysis, we computed the ROI-to-ROI connectomes (by means of the CONN toolbox), represented by adjacency matrices obtained through a bivariate analysis of the Pearson correlation coefficient between all ROI pairs, transformed with the Fisher z-transformation [setting to 0 those with a False Discovery Rate, FDR (Benjamini and Hochberg, 1995), corrected p-value larger than 0.05].

Formally, let $R_i(t)$ be the BOLD signal of the ith (of n distinct) ROI measured at the tth scan, obtained by averaging all voxels within the ith ROI and centered to zero mean (i.e., by subtracting the estimated mean value). Pearson's correlation coefficient $r$ and the matrix $Z$ are defined as follows:

$$r(i, j) = \frac{\sum_{t=1}^{T} R_i(t)\, R_j(t)}{\sqrt{\sum_{t=1}^{T} R_i(t)^2}\, \sqrt{\sum_{t=1}^{T} R_j(t)^2}}, \qquad Z_{i,j} = \tanh^{-1} r(i, j),$$

with $i, j = 1, 2, \cdots, n$, where $T$ is the total number of fMRI scans.

For each subject and condition, $Z$ is the adjacency matrix of the resulting graph $G = (V, E)$, where $V = \{v_i : i = 1, 2, \cdots, n\}$ is the set of all ROIs and $E = \{e_{i,j} \mid \forall\, v_i, v_j \in V\}$ is the set of all edges, i.e., the functional connectome of interest, which comprises all ROI pairs $(i, j)$. In summary, a node $v_i$ of the graph $G$ denotes the ith ROI, while an edge $e_{i,j}$ connecting nodes $v_i$ and $v_j$ is computed using the Z transform of the Pearson correlation coefficient between the ROI signals $R_i(t)$ and $R_j(t)$, such that $e_{i,j} = Z_{i,j}$. All graphs were maintained in their weighted form.

2 nitrc.org/projects/artifact_detect/
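As an illustration of the definitions above, the following minimal Python sketch computes a Fisher z-transformed correlation matrix from a set of ROI time series. It is not the CONN implementation used in this work: the function name `fisher_z_connectome` and the clipping constant are assumptions introduced here for illustration, the series are assumed to be non-constant, and the FDR-based zeroing of non-significant edges is omitted.

```python
import math


def fisher_z_connectome(roi_series):
    """Return the n x n matrix Z with Z[i][j] = atanh(r(i, j)).

    roi_series: list of n ROI time series, each a list of T samples.
    Assumes every series is non-constant (non-zero variance).
    """
    n = len(roi_series)
    # Center each ROI series to zero mean, as in the definition of R_i(t).
    centered = []
    for series in roi_series:
        mean = sum(series) / len(series)
        centered.append([x - mean for x in series])
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]

    z = [[0.0] * n for _ in range(n)]  # the diagonal is left at 0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            r = dot / (norms[i] * norms[j])
            # Assumption: clip r so that atanh stays finite when |r| = 1.
            r = max(min(r, 0.999999), -0.999999)
            z[i][j] = z[j][i] = math.atanh(r)
    return z


# Two toy ROI signals with Pearson correlation 0.8.
example = fisher_z_connectome([[1, 2, 3, 4], [1, 3, 2, 4]])
```

The diagonal is kept at zero here so that the matrix can be used directly as an adjacency matrix without self-loops; this is a choice of the sketch, not a statement about the original pipeline.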
We analyzed the functional connectivity graphs (i.e., the Z matrices) with a set of common network statistics (node degree, global and local efficiency, clustering coefficient and the average shortest path length), avoiding thresholding techniques, which provoke loss of information and make analyses more complicated because of the introduction of the threshold parameter (Rubinov and Sporns, 2011). For both atlases, we selected only forebrain regions (see Tables 1, 2 for details), excluding the cerebellum because its ubiquitous role (E et al., 2014) in high cognition is still debated (Yu et al., 2015; Gelal et al., 2016).

Complex Network Statistics
For the analysis of the connectome graphs, we selected a set of common statistics from Complex Network Theory able to estimate the extent of network information processing. Table 3 shows the measures (definition and interpretation) used in the present work.

Measure | Definition | Interpretation
Node degree (also known as node strength) | $k_i = \sum_{j \in V} e_{i,j}$ | The sum of weights connected to a given node i
Average shortest path length | Given $d_{ij} = \sum_{e_{f,g} \in r_{i \leftrightarrow j}} 1/e_{f,g}$, where $r_{i \leftrightarrow j}$ is the shortest path between i and j: $L = \frac{1}{n} \sum_{i \in V} \frac{\sum_{j \in V, j \neq i} d_{ij}}{n-1}$ | The average edge weights encountered in the shortest path between node i and j
Local efficiency (Latora and Marchiori, 2001) | Based on the length of the shortest path between i and j that contains only nodes directly connected to i | Measure of local network segregation. Supplementary to the clustering coefficient
Global efficiency (Latora and Marchiori, 2001) | The inverse of the average shortest path length, which remains meaningful in disconnected networks with infinite-length paths | Measure of network integration
Clustering coefficient (Watts and Strogatz, 1998) | Counts the average weight of triangles t (3-node fully connected graphs) present in the network | Measure of fine-grain network segregation

As assumption, $Z$ is the adjacency matrix of the graph $G = (V, E)$ with $V = \{v_i : i = 1, 2, \cdots, n\}$ and $E = \{e_{i,j} \mid \forall\, v_i, v_j \in V\}$, where the element $e_{i,j}$ represents the connection weight between nodes i and j.

Analyses on the extracted functional brain networks were performed in MATLAB with the Brain Connectivity Toolbox (BCT) (Rubinov and Sporns, 2010), with the Python graph toolbox (Peixoto, 2014c), and with other ad hoc routines developed in our lab. Specifically, we used a complementary measure of information integration, called Compression Flow (CF), that we previously showed to effectively discriminate patients diagnosed with mild cognitive impairment from those with probable Alzheimer's disease. For a better numerical treatment of the results, the original fourth stage, consisting in a summation, was replaced by an average, as follows.

Algorithm:
Inputs: $Z$, the adjacency matrix of the graph $G = (V, E)$ with $V = \{v_i : i = 1, 2, \cdots, n\}$ and $E = \{e_{i,j} \mid \forall\, v_i, v_j \in V\}$; the node betweenness centrality (BC) of G; and the edge betweenness centrality (EBC) of G.
Output: the extent of CF for the graph G.
Steps:
1. Set a pivot value $\vartheta$ in the BC distribution, usually a low percentile of the BC distribution (values from 5 to 20 do not affect results);
2. Establish which nodes have a BC lower than $\vartheta$, thus obtaining the subset $\phi \subset V$ with $|\phi| = k$ of the putative most peripheral nodes of G ($|\cdot|$ is the cardinality operator);
3. For $w = 1, 2, \cdots, k$, compute and collect the random walks $r_w$ from the periphery to the network center for each input load w; at each step the w activated nodes are randomly chosen from $\phi$;
4. For $w = 1, 2, \cdots, k$, estimate the compression ratio by computing (through the function c) and counting the number of connected components $|c(\hat{G})|$ of the provisional graph $\hat{G}$ obtained by collecting all edges encountered in all paths of $r_w$; the compression ratio is set to $\rho_w = \frac{w}{n - |c(\hat{G})|}$;
5. Average the obtained compression ratios: $CF = \frac{1}{k} \sum_{w=1}^{k} \rho_w$.

The algorithm is written in MATLAB and the code is available upon request. The connected components of graphs are computed by a depth-first search algorithm.
From the graph toolbox we used an efficient routine to extract the hierarchical modularity from networks (Peixoto, 2014a,b, 2015a). The implemented algorithm (the stochastic block models) outperforms many other common modularity methods (Newman, 2006; Blondel et al., 2008) and it has been chosen for this reason.
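To make the path-based measures of Table 3 concrete, the sketch below computes the average shortest path length and the global efficiency of a small weighted graph, using edge lengths equal to the inverse of the connection weights. It is a simplified illustration rather than the BCT routines used in this work: the function name `path_length_and_efficiency` is hypothetical, weights are assumed to be non-negative with 0 meaning "no edge", and shortest paths are obtained with a plain Floyd-Warshall loop.

```python
import math


def path_length_and_efficiency(weights):
    """Return (average shortest path length, global efficiency).

    weights: symmetric n x n list of non-negative connection weights,
    where 0 means that the two nodes are not connected.
    """
    n = len(weights)
    # Edge length is the inverse of the weight: stronger links are shorter.
    dist = [
        [
            0.0 if i == j
            else (1.0 / weights[i][j] if weights[i][j] > 0 else math.inf)
            for j in range(n)
        ]
        for i in range(n)
    ]
    # Floyd-Warshall: all-pairs shortest path lengths d_ij.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]

    pairs = n * (n - 1)
    off_diagonal = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    avg_path = sum(off_diagonal) / pairs
    # 1 / inf evaluates to 0.0, so unreachable pairs add nothing here.
    efficiency = sum(1.0 / d for d in off_diagonal) / pairs
    return avg_path, efficiency


# Three-node chain: 0 -(1.0)- 1 -(0.5)- 2, with no direct 0-2 edge.
chain = [[0, 1.0, 0], [1.0, 0, 0.5], [0, 0.5, 0]]
L, E = path_length_and_efficiency(chain)
```

With a disconnected graph the average path length becomes infinite while the efficiency stays finite, which is the reason given in Table 3 for preferring the global efficiency in disconnected networks.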

Statistical Tests
We performed statistical comparisons of the complex network statistics within each experimental condition. Specifically, the conditions included the type of feedback (probabilistic or deterministic), the chosen atlas (FSL or BN) and the dataset (ds002, ds052, or ds107).
For hypothesis testing, we made no assumption about the a priori data distribution and thus used non-parametric models. Pairwise comparisons were performed with the non-parametric Wilcoxon signed rank test with Bonferroni correction for multiple contrasts (by multiplying the p-values by the total number of hypotheses), while for multiple group comparisons we used the Kruskal-Wallis test with the False Discovery Rate (FDR) correction. The significance level was set to 0.05 in all hypothesis tests.
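The two multiple-comparison corrections mentioned above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the FDR adjustment shown is the standard Benjamini-Hochberg step-up procedure, which we assume here since the text does not state which FDR variant was applied to the Kruskal-Wallis tests.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of hypotheses, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]
significant_bonferroni = [p < 0.05 for p in bonferroni(raw)]
significant_fdr = [p < 0.05 for p in benjamini_hochberg(raw)]
```

In both cases the adjusted p-values are compared against the same 0.05 significance level used throughout the analyses.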

Edge Filtering
To identify relevant edges (i.e., functional connections between ROIs) which supported the observed information processing enhancement from run 1 to run 2, we set up a statistical procedure that selected edges which significantly changed between runs. Edge weights were modeled by a linear model and fitted with univariate ANOVA criteria in the R language environment (R Core Team, 2019). Pairwise comparisons between runs were subsequently performed with the Tukey post-hoc test. Edges below the significance level (0.05) were further filtered to select those with a high absolute magnitude. For this reason, we picked edges whose differences were either greater than the 95th percentile (namely, positive differences) or lower than the 5th percentile (negative differences). Since the distribution of weight differences had approximately zero mean, the latter set grouped only edges with negative weight differences.

FIGURE 2 | Complex Network statistics for FSL atlas. A complete overview of the complex network statistics (respectively, Global and Local Efficiency, Clustering Coefficient, Average Shortest Path Length and Node Degree, see Table 3) computed on the functional connectomes for both datasets (ds002, ds052, rows 1-2 and 3-4, respectively) and both experimental conditions (deterministic/probabilistic) embedded in the FSL atlas. Plots report the statistical significance according to the Wilcoxon signed rank test with Bonferroni correction. Boxplot colors indicate the run: blue for run 1 and yellow for run 2. Significant p-values (<0.05) are highlighted in red.
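The percentile-based selection described in this section can be illustrated with the following Python sketch. It is not the R procedure used in this work: the helper names are hypothetical, the input is assumed to contain only edges that already passed the significance test, and the percentiles are assumed to be computed over the supplied differences with linear interpolation.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def select_extreme_edges(edge_diffs, low_q=5.0, high_q=95.0):
    """Split edges into strengthened and weakened sets.

    edge_diffs: dict mapping an edge (i, j) to its weight difference
    (run 2 minus run 1); only significant edges are assumed to be passed.
    """
    diffs = list(edge_diffs.values())
    low = percentile(diffs, low_q)
    high = percentile(diffs, high_q)
    strengthened = [edge for edge, d in edge_diffs.items() if d > high]
    weakened = [edge for edge, d in edge_diffs.items() if d < low]
    return strengthened, weakened


# Toy example: 21 edges with differences -10, -9, ..., 10.
toy = {(0, k): float(k - 10) for k in range(21)}
strengthened, weakened = select_extreme_edges(toy)
```

Using strict inequalities keeps only the edges beyond the two percentile cut-offs, so the two returned sets never overlap.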

RESULTS
In the present work, we preprocessed fMRI volumes from two classification learning experiments (Figures 1A,B) according to the AFNI pipeline (Figure 1C) and subsequently extracted the ROI-to-ROI functional connectivity for each subject (Figures 1D-H) according to the two atlas templates (FSL and BN, coordinates in Tables 1, 2). The classification learning tasks were of two types: deterministic and probabilistic (Figures 1A,B). The former represented the actual classification learning assumed to occur in subjects (reported performances in Figure 1I), while the latter represented the null hypothesis, where learning was dampened through probabilistic feedback and thus memory associations were precluded. We considered a third experiment, as additional control, to evaluate the role of possible session effects in a different cognitive task recruiting only visual working memory systems (visual one-back task). On the functional connectomes we computed a common set of complex network statistics (Table 3) to compare the network information processing capability between the first (run 1) and the second (run 2) group of trials, across which classification accuracy significantly increased (Figure 1I). Finally, we used a recently presented refined measure of functional integration, the compression flow (CF), which stochastically estimates the network capability to learn and predict external inputs.
We found that, in deterministic sessions, the node degree distribution, the global and local efficiency, the clustering coefficient and the characteristic path length were all significantly different between runs. Specifically, the node degree distribution, the global and the local efficiency and the clustering coefficient were higher in run 2 while, conversely, the average shortest path length was smaller (results of Wilcoxon's tests in Figures 2, 3, rows one and three). Moreover, analyses from both datasets (ds002, ds052) and both atlases (FSL, BN) generated congruent observations. In detail, the node degree increment indicated that new functional connections were activated in the second run. The global and local efficiencies measured how proficiently the information was exchanged within, respectively, the entire graph and the neighborhood of each node. The observed efficiency dynamics suggested that, in run 2, information exchange was optimized, thus minimizing the energetic expenditure of processing (Bullmore and Sporns, 2012). The clustering coefficient changes, instead, demonstrated that the networks were more prone to segregate information in run 2 compared to run 1. Finally, the decrease of the characteristic path length in run 2 expressed a reduction in the average path length between random node pairs.

FIGURE 3 | Complex Network statistics for BN atlas. A complete overview of the complex network statistics (respectively, Global and Local Efficiency, Clustering Coefficient, Average Shortest Path Length and Node Degree, see Table 3) computed on the functional connectomes for both datasets (ds002, ds052, rows 1-2 and 3-4, respectively) and both experimental conditions (deterministic/probabilistic) embedded in the Brainnetome atlas. Plots report the statistical significance according to the Wilcoxon signed rank test with Bonferroni correction. Boxplot colors indicate the run: blue for run 1 and yellow for run 2. Significant p-values (<0.05) are highlighted in red.
These outcomes indicated that the observed networks became more topologically efficient in run 2 and, therefore, according to complex network statistics, the brain networks became more effective in terms of information processing capabilities and more prone to integrate and segregate information. Conversely, in probabilistic trials (considered as control) we did not observe any significant modulations between runs (results of Wilcoxon's tests in Figures 2, 3, rows two and four) in either dataset (ds002, ds052) or atlas (FSL, BN). Altogether, these results suggested a scenario where the brain functional connectome, when challenged by a learning demand, alters its connections in order to optimize the information storage of the putative predictive associations (Tetzlaff et al., 2012).
Subsequently, we asked which putative functional connections shaped the observed network dynamics and, accordingly, we analyzed edge fluctuations between runs with statistical hypothesis tests (see section "Materials and Methods"). For more statistical robustness and consistent interpretation of the data, we combined sessions from both datasets ds002 and ds052. We found that only deterministic sessions identified statistically significant and remarkable connections (Figure 4A), and we divided these relevant edges into two sets: the first set containing the edges strengthened in run 2, the second containing the edges weakened in run 2. Within the FSL atlas (Figure 4B), we found four strengthened connections, namely the left insular cortex with the left temporal fusiform cortex, the left superior parietal lobe with the right frontal operculum, the right inferior temporal cortex with the subcallosal cortex, and the left nucleus accumbens with the right parahippocampal gyrus. These results were coherent with those obtained with the BN atlas, which covered more than 80% of each corresponding brain region. Instead, the weakened edges were those connecting the left inferior frontal gyrus with the right fusiform cortex, the right temporo-occipital inferior temporal cortex with the left inferior frontal gyrus, and the left planum temporale with the left inferior frontal gyrus. Again, these results were remarkably coherent with those obtained with the BN atlas, where brain regions overlapped between atlases by at least 84%.
Looking for a further indication of the increment of information integration in run 2, we averaged all FSL connectomes for all participants of both experiments (ds002, ds052) and we analyzed the hierarchical modular organization of nodes in communities, comparing deterministic and probabilistic conditions. We observed that the number of modules and hierarchical levels dropped from run 1 to run 2, indicating that network information processing took place in more integrated topological architectures (Figures 5A,B). Statistically, prior to averaging, we found 7.4 ± 1.9 (mean and standard deviation) modules in run 1 and 4.9 ± 0.7 in run 2, with a significant difference (p = 0.001, non-parametric Wilcoxon signed ranksum test). Vice versa, the number of modules did not decrease in the probabilistic conditions, as a sign of a missed integrative merging among modules (Figures 5C,D, 8.1 ± 1.4 in run 1 vs. 7.8 ± 1.9 in run 2, p = 0.349, ranksum test). Finally, the CF estimations coherently confirmed the significant increment of the topological information integration of the connectomes (Figures 5E-H) in the second run of deterministic trials. Lastly, to evaluate the possible role of the session effect (run 1 vs. run 2), we decided to include a further dataset from the same repository (ds107) where participants performed a one-back working memory task. Results in Figure 6 did not show any significant differences (Wilcoxon's test), confirming that the results observed in the previous analyses were not merely an outcome of the comparison between runs.
Altogether, this evidence suggests that critical topological modifications of the functional connectome allow the large-scale architecture to accommodate the incoming cognitive demand, achieving high efficiency with low energy expenditure.

DISCUSSION
In this work, we investigated the fast and transient topological dynamics of a short-term memory task in broad functional connectomes. We found a consistent enhancement of the functional integration and segregation during the trial-by-trial generation of the associative learning. Namely, connectomes became more efficient in information processing capability in diverse experimental conditions and analyses, a property absent in both sham and control experiments. Therefore, as highlighted by our estimates and analyses, higher cognitive tasks involve global connectome adaptations rather than mere local topological modifications of a few regions. This property calls for a new assessment of general brain dynamics and for reconsidering the conventional view of brain functional specialization, a common refrain in studies correlating few but specific brain regions with distinct cognitive tasks. From a clinical perspective, this widespread interpretation originates from the neurological judgements of past centuries, which assumed a causality between anatomically observable lesions and specific disruptions of behavioral or cognitive functions. In contrast to such a perspective, the functional connectivity network of the human brain suggests a strong global interdependency among regions, where alterations of single node dynamics may be echoed widely over the entire network, significantly changing the brain network dynamics. The assumption of localized lesional models inevitably neglects the complex and diffuse damage to the globally connected brain network, as well as the compensatory or repair mechanisms that, with diverse strength and at diverse times, may arise from the original alterations (Catricalà et al., 2015).

FIGURE 5 | Hierarchical Modularity Structure and Compression Flow statistics. Hierarchical modularity analysis of the FSL grand average networks (run 1 vs. run 2, respectively, A, B) among subjects (N = 30) and experiments (ds002, ds052) for the deterministic condition. In run 2, the functional modules of the connectome collapse, as a sign of the arisen functional integration, into five communities with a single hierarchical level, from the eight communities of run 1 arranged in two hierarchical levels (five modules in the second level). Conversely, in the probabilistic condition the modular organization did not change (C,D). Edge colors mark community membership and are arbitrarily chosen by the graph plotting routine. Analyses of the compression flow measure of brain graphs by using the FSL atlas (E,G) and the BN atlas (F,H) for the ds002 (E,F) and ds052 (G,H) experiments. Plots report the statistical significance according to the Kruskal-Wallis non-parametric test with a False Discovery Rate (FDR) correction for group comparisons while, for pairwise comparisons, the Wilcoxon signed rank test significance with Bonferroni correction is reported. In (E-H), Deterministic is referred to as "Det." and Probabilistic as "Prob.". Boxplot colors (blue, yellow, gray, and red) denote the diverse conditions (respectively deterministic run 1, deterministic run 2, probabilistic run 1 and probabilistic run 2).

FIGURE 6 | Complex Network Statistics in a non-classification learning cognitive task. The first row represents the collection of network statistics obtained by extracting the ROIs according to the FSL atlas, while in the second row the statistics are computed with the BN atlas. Altogether, the lack of statistically significant differences indicates that no session effect is present between run 1 and run 2. Boxplot colors indicate the run: blue for run 1 and yellow for run 2.
From a computational point of view, classification learning implies an information storage demand to be met in short time intervals (from seconds to a few minutes). Indeed, according to Friston's free-energy principle (Friston, 2010), nervous systems work to minimize the discrepancy between information from the external world and the related internal representations in neuronal networks. This theoretical approach is, seemingly, time-independent and applicable across widely different time scales.
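For reference, the discrepancy mentioned above is commonly formalized as the variational free energy. The expression below is the standard textbook form, not a quantity estimated in this study; the symbols o (sensory observations), s (hidden states of the external world), q (the internal, approximate representation) and p (the generative model) are notation assumed here for illustration only.

```latex
F \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  \;=\; D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] \;-\; \ln p(o)
  \;\ge\; -\ln p(o)
```

Because the Kullback-Leibler term is non-negative, lowering F brings the internal representation q(s) closer to the posterior p(s | o) while bounding the surprise, -ln p(o), from above.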
Furthermore, in our previous work we conjectured that compression flow is inversely related to the free energy: when brain networks increase the extent of compression flow, the system free energy decreases. Therefore, we suggest that the new information required by the classification learning task induces a transient rise in free energy that, theoretically, is likely to be reduced by means of topological modifications of the functional connectivity.
In the literature, as cited in the introduction of this work, early neurophysiological studies on non-human primates suggested that visual working memory was related solely to the prefrontal cortex, the parietal cortex and some associative areas in the occipital lobe. Subsequent studies extended the list of involved areas to include the premotor cortex, the intraparietal sulcus, the caudate, the hippocampus, the thalamus and several occipitotemporal regions (Doron et al., 2012). This progressive spatial extension of the neuronal networks involved in visual tasks strengthens the global over the local account of brain dynamics and, as a consequence, suggests a more extended design applicable to other systems and task conditions. From a more classical anatomical view, it shows augmented functional connectivity along the rostro-caudal axis (Kuo et al., 2011), which dramatically enriches the neuronal patterns ignited by a specific external stimulus, and it functionally requires a widely distributed, dense network for the active maintenance of a perceptual representation (Gazzaley et al., 2004).
From an edge-centric perspective, our results showed an enhancement of connections between specific brain regions. In particular, the strengthened functional connection between the left insular and the left fusiform cortices is in accordance with the putative roles of these regions, implicated, respectively, in the consolidation of object recognition memory (Bermudez-Rattoni et al., 2005) and in working memory tasks (Druzgal and D'Esposito, 2001; Postle et al., 2003; Hofer et al., 2007), two executive functions heavily recruited in classification learning assignments. Similarly, the strengthened connections between the right parahippocampal gyrus and the left nucleus accumbens are consonant with their putative roles in, respectively, short-term memory (Daselaar et al., 2001; Peters et al., 2009) and visual memory consolidation (Setlow, 1997; Deadwyler et al., 2004). The same holds for the connection between the right frontal operculum and the left superior parietal lobe, regions participating in task control (Higo et al., 2011), in episodic memory retrieval (Wagner et al., 2005) and in the maintenance of internal representations (Wolpert et al., 1998).
Finally, the enhanced connections included the one between the right posterior inferior temporal cortex, a crucial region of ventral-stream visual processing directly involved in object recognition (Gross, 1992; Nobre et al., 1994), and the subcallosal cortex, responsible instead for the monitoring and control of executive processes (Hebscher et al., 2016). In contrast, other functional connections were inhibited. Specifically, these connections link the inferior temporal and fusiform cortices with the inferior frontal gyrus, usually recruited in response inhibition (Swick et al., 2008), in the selection among competing alternatives (Moss et al., 2005; Hirshorn and Thompson-Schill, 2006) and in attentional control (Hampshire et al., 2010).
Past works focusing on locally activated networks showed that there are two distinct stages in accessing information in short-term memory, a process elicited in classification learning, which recruit the inferior temporal regions with frontal and posterior parietal contributions, the medial temporal lobe and the left mid-ventrolateral prefrontal cortex (Nee and Jonides, 2008). In contrast, van den Berg and coworkers stated that the neural representation of visual short-term memory is continuous and variable rather than discrete and fixed, thus softening this modular interpretation (van den Berg et al., 2012). In addition, early evidence suggested that the cortico-limbic neurophysiological substrate of visual short-term memory changed globally, rather than through focal modifications, in healthy elderly subjects (Della-Maggiore et al., 2000). Along the same line, D'Esposito (2007) concluded that visual working memory is not localized to a single brain region but more likely represents an emergent property of the functional interactions between the prefrontal cortex and the rest of the brain, a key step in the shift from a local towards a global appraisal of brain functional domains. However, our results from a visual working memory task (one-back) seem to support the idea that the observed global topological optimization causally emerged from visual short-term memory rather than from working memory demands.
Other studies on the fast dynamics of visual working memory suggested the involvement of several EEG frequency bands (α, β, γ) over large-scale, densely connected cortical areas (frontal, parietal and occipital) for maintenance and coordination. These findings add further complexity to the picture: a more temporally accurate technique (i.e., EEG) highlighted the dynamic complexity of global brain involvement even in "simple" visual tasks. Another recent work suggests a cross-modal recruitment of sensory-related short-term memory, where visual memory also implicated auditory regions and, vice versa, auditory short-term memory was associated with the activity of the dorsal and ventral visual pathways (Michalka et al., 2015). Moreover, a recent work has shown fast modifications of functional connectomes and remarked on the importance of even minute topological changes for the global network capacity to integrate information (Fransson et al., 2018). Finally, our previous study, focused on the alternating dynamics of segregation and integration in a visual working memory task, suggested that the interchange between segregation and integration required a quasi-continuous coherent activation of most of the recorded cortical regions, resulting in a global complex network orchestration (Zippo et al., 2018). Therefore, the observed involvement of global network dynamics appears coherent with these recent results.
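As an illustration of why even a minute topological change can matter for global integration, the following minimal sketch computes global efficiency (the mean inverse shortest-path length over all node pairs, a standard integration measure) on a toy graph before and after adding a single edge. The 12-node ring, the function names and the unweighted, binary treatment of edges are assumptions made for illustration only; they are not the networks or the pipeline analyzed in this study.

```python
from collections import deque


def shortest_path_lengths(adjacency, source):
    """Hop counts from `source` to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def global_efficiency(adjacency):
    """Mean of 1/d(u, v) over all ordered node pairs; unreachable pairs add 0."""
    n = len(adjacency)
    if n < 2:
        return 0.0
    total = 0.0
    for u in adjacency:
        dist = shortest_path_lengths(adjacency, u)
        total += sum(1.0 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))


def ring_graph(n):
    """Toy graph (hypothetical): each node is linked to its two ring neighbors."""
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}


def with_edge(adjacency, u, v):
    """Return a copy of the graph with one extra undirected edge u-v."""
    copy = {node: set(neighbors) for node, neighbors in adjacency.items()}
    copy[u].add(v)
    copy[v].add(u)
    return copy


ring = ring_graph(12)
shortcut = with_edge(ring, 0, 6)  # a single long-range connection

eff_ring = global_efficiency(ring)
eff_shortcut = global_efficiency(shortcut)
print(f"ring: {eff_ring:.4f}  ring + one edge: {eff_shortcut:.4f}")
```

A single long-range edge shortens paths between many node pairs at once, so the global measure rises although only one connection changed. A weighted variant would replace the breadth-first search with a weighted shortest-path algorithm.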
Although the present study analyzed data from two independent experiments, it is limited to a small sample of just 30 participants of young age (23 years old on average, with small variance). Thus, to generalize the results, further investigations are needed on larger populations homogeneously distributed in age. In addition, both studies referred to a single type of visual short-term and working memory task; thus, for more robust conclusions, similar studies with different experimental conditions and modalities (e.g., auditory or motor short-term learning) should be performed. The BOLD signal generated by the functional MRI scanner is still considered an indirect measure of neuronal metabolism, unclearly linked with synaptic activity; therefore, the robustness of the proposed results needs to be investigated in different experimental setups with more direct measures of neuronal activity (e.g., EEG/MEG).
Notwithstanding these limitations, it would be a curious exception if other sensory and cognitive tasks involved in individual survival and environmental adaptation exhibited different topological dynamics. Such uniformity could be due to an implicit principle of parsimonious evolutionary conservation of basic schemes for coherent cognitive abilities.
In conclusion, the results of this work highlight the effectiveness of a global topological strategy in the processing and storage demanded by a task (in this case, temporary visual memory retention), which drives the functional topologies towards configurations more optimized for information processing. Novel interpretations of whole-brain functional networks could therefore be envisaged in investigations of brain cognitive functions.

AUTHOR CONTRIBUTIONS
AZ designed the study, performed the analyses, and wrote the manuscript. IC, JL, VB, MV, and GB revised all analyses and the manuscript.