A perspective on correlation-based financial networks and entropy measures

In this brief review, we critically examine the recent work done on correlation-based networks in financial systems. The structure of empirical correlation matrices constructed from financial market data changes as the individual stock prices fluctuate with time, showing interesting evolutionary patterns, especially during critical events such as market crashes and bubbles. We show that the study of correlation-based networks and their evolution with time is useful for extracting important information about the underlying market dynamics. We also present our perspective on the use of recently developed entropy measures, such as structural entropy and eigen-entropy, for continuous monitoring of correlation-based networks.


INTRODUCTION
There has been a growing interest in understanding the dynamics of complex systems in the real world. Network science has emerged as an important tool and a convenient framework for analyzing a wide variety of social, financial, biological and information systems [1][2][3]. Network science began with the seminal papers of Erdős and Rényi 4,5, who proposed random graphs in 1959-60. Since the late 1990s, when a number of scientists started using networks in physical, social, and biological domains, random graphs have been used for comparison with real-world complex networks. Watts and Strogatz 6 renewed the modeling of networks with "small world" properties, that is, random graphs with small diameter that are nevertheless highly clustered, like regular lattices. Barabási and Albert investigated the vertex connectivity of large networks with "scale-free" power-law distributions 7. These were followed by a flood of papers (see, e.g., [8][9][10][11][12][13][14][15][16][17]). Thus, network science emerged as an important tool for studying different phenomena: the spread of infectious diseases 18, economic development 19,20, the detection, characterisation, and identification of long-term precursors to financial crashes [21][22][23], the construction of robust, sustainable infrastructure and technological networks 24, etc.
Here, we briefly review the role of network science in understanding complex financial markets. Firstly, we consider its use in uncovering the structure of complex interactions among stocks at a particular instant of time (static picture). For this purpose, one starts with the cross-correlations among stock returns and then uses various methods of network analysis, such as threshold networks, the Minimum Spanning Tree (MST) 25,26, the Planar Maximally Filtered Graph (PMFG) 27, etc. Using these methods, one can identify stocks (or sectors) that are strongly or weakly correlated and also study their hierarchy in the network structures. Correlations among stocks change with time, which makes the underlying dynamics of the market very intriguing. Secondly, continuous monitoring of the financial market is both useful and necessary 28, since there are sizable fluctuations during crashes and bubbles. Thus, we discuss here the role of entropy measures in continuous monitoring of the financial market (dynamic picture).

CORRELATION-BASED NETWORKS
Mantegna studied the hierarchical structures of correlation-based networks in financial markets 25,26. Similar studies of correlation-based networks followed (see, e.g., [29][30][31][32][33]). These correlation-based networks provide an easy visual representation of multivariate time series and extract meaningful information about the complex market dynamics. Analysing the evolution of correlation-based networks provides a deep understanding of the underlying market trends, especially during periods of crisis 34,35. We briefly discuss a few methods to construct correlation-based networks from an empirical correlation matrix (ECM): MST, threshold network and PMFG.
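As a rough illustration of how an ECM might be obtained before any network is built, the sketch below computes Pearson correlations between return series, both for a single window and for a sequence of rolling windows. The use of logarithmic returns, and the names `log_returns`, `pearson`, `correlation_matrix`, `rolling_ecms`, `window` and `shift`, are assumptions made for this example and are not taken from the works reviewed here.

```python
import math


def log_returns(prices):
    """Logarithmic returns r(t) = ln P(t) - ln P(t-1) of one price series (assumed return definition)."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def pearson(x, y):
    """Pearson correlation coefficient between two equally long, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_matrix(returns):
    """ECM C_ij for a list of return series (one list per stock)."""
    n = len(returns)
    return [[1.0 if i == j else pearson(returns[i], returns[j]) for j in range(n)]
            for i in range(n)]


def rolling_ecms(returns, window, shift):
    """ECMs over rolling windows of `window` returns, moved forward by `shift` steps."""
    total = len(returns[0])
    ecms = []
    for start in range(0, total - window + 1, shift):
        chunk = [series[start:start + window] for series in returns]
        ecms.append(correlation_matrix(chunk))
    return ecms
```

Each matrix returned by `rolling_ecms` can then be passed to any of the filtering methods discussed in this section to obtain one network per time window.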

Minimum Spanning Tree
The MST is constructed using the distances $d_{ij} = \sqrt{2(1 - C_{ij})}$ [36][37][38], where the $C_{ij}$ are the elements of the ECM (correlations between pairs of stocks i, j = 1, ..., N in a market for a specific time window), such that all N vertices (stocks) are connected with exactly N − 1 edges under the constraint that the total distance is minimum. The algorithms of Kruskal and Prim are generally used to obtain the MST from a distance matrix. For a non-degenerate distance matrix, the MST is uniquely determined. Two of the main advantages of the MST are that (i) it produces a network structure without imposing any arbitrary threshold, and (ii) it has the property of inherent hierarchical clustering. There have been many papers with applications of the MST to equity markets 34,39, currency exchange rates 40, and global foreign exchange dynamics 41. A disadvantage is that the order and classification of nodes in a cluster of the MST are not robust, and are often sensitive to minor changes in correlations or to spurious correlations. Therefore, to improve the results, either noise suppression techniques such as Random Matrix Theory (RMT) 42 and power mapping 23 have been used, or alternative algorithms such as the PMFG, the Triangulated Maximally Filtered Graph (TMFG), the Average Linkage Minimum Spanning Tree (ALMST), and the Directed Bubble Hierarchical Tree (DBHT) [43][44][45][46][47] have been proposed. Instead of pair-wise Pearson correlations, partial correlations and mutual information have also been used in some studies 48,49.
The MST is useful for studying the taxonomy or the sector classification 50, with potential applications in portfolio optimization. Researchers have also analysed dynamical correlations using the MST 51. Dynamical studies of this type have the potential to catch important changes and to allow continuous monitoring of the market. By calculating correlations over rolling windows of different lengths, one can construct and analyze temporal networks. Such analyses have found that the configuration of the MST changes during a crisis and that there exist strong correlations between the normalized tree length and the investment diversification potential 52. The MSTs corresponding to the ECMs shown in Figure 1A-C are shown in Figure 1D-F; they have been generated using Prim's algorithm. Different colors in the MSTs correspond to different sectors in the market. One can easily see the changes in the structures of the MSTs in different periods of market evolution.
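The construction described above can be sketched in a few lines. The example assumes the commonly used metric form d_ij = sqrt(2(1 − C_ij)); because the MST depends only on the ordering of the distances, any monotonically decreasing function of C_ij gives the same tree when the distances are non-degenerate. The function names and the toy correlation matrix are hypothetical and serve only as an illustration of Prim's algorithm on a dense distance matrix.

```python
import math


def correlation_to_distance(corr):
    """Map correlations to distances d_ij = sqrt(2 (1 - C_ij)) (assumed metric form)."""
    return [[math.sqrt(max(0.0, 2.0 * (1.0 - c))) for c in row] for row in corr]


def prim_mst(dist):
    """Prim's algorithm on a dense symmetric distance matrix.

    Returns the N - 1 tree edges as (parent, child, distance) tuples,
    growing the tree from vertex 0.
    """
    n = len(dist)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection of each vertex to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # pick the vertex outside the tree that is closest to it
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, dist[parent[u]][u]))
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
                parent[v] = u
    return edges


# Toy 4-stock correlation matrix (made-up numbers, for illustration only)
toy_corr = [
    [1.0, 0.8, 0.2, 0.1],
    [0.8, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.7],
    [0.1, 0.2, 0.7, 1.0],
]
toy_mst = prim_mst(correlation_to_distance(toy_corr))
```

A useful summary of such a tree for monitoring purposes is its total or normalized length, i.e., the sum or mean of the distances stored in the returned edges.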

Threshold Networks
In this approach, an adjacency matrix is constructed by applying a threshold value to the correlations or by fixing the number of edges of the network 53. It retains the strongest correlations, namely those above a chosen threshold, and discards all remaining correlations below this threshold. A sufficiently small threshold value gives rise to a completely connected graph, while increasing the threshold reduces the number of connections. Thus, one can tune the threshold in order to select the weakly or strongly connected nodes. For a particular value of the threshold, as the correlation matrices change with time, the threshold networks also change, as shown in Figure 1G-I corresponding to the ECMs shown in Figure 1A-C. Here the Fruchterman-Reingold (force-based) layout 54 has been used to visualize the threshold networks.
One drawback of threshold networks is the loss of information: when we apply a threshold to the correlation matrix, we discard some nodes and edges. Also, threshold networks are very sensitive to noise (random fluctuations). Nevertheless, threshold networks have been constructed and applied in different areas of finance 55.
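The thresholding step can be illustrated with a minimal sketch. It assumes that an edge is kept when C_ij is greater than or equal to the threshold and that self-loops are excluded; the function names are hypothetical. The helper `isolated_nodes` makes the loss of information mentioned above explicit by listing the stocks that end up without any edge.

```python
def threshold_network(corr, threshold):
    """Binary adjacency matrix keeping only pairs with C_ij >= threshold (no self-loops)."""
    n = len(corr)
    return [[1 if i != j and corr[i][j] >= threshold else 0 for j in range(n)]
            for i in range(n)]


def edge_count(adj):
    """Number of undirected edges in a symmetric adjacency matrix."""
    n = len(adj)
    return sum(adj[i][j] for i in range(n) for j in range(i + 1, n))


def isolated_nodes(adj):
    """Indices of nodes left without any edge after thresholding."""
    return [i for i, row in enumerate(adj) if sum(row) == 0]
```

Scanning the threshold from low to high values and recording `edge_count` and `isolated_nodes` at each step is one simple way to see how quickly the network fragments.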

Planar Maximally Filtered Graph
The PMFG is a network drawn in a plane such that there are no intersecting links 27,56. If N is the total number of stocks, then it contains 3(N − 2) links. The PMFG has the advantage that it retains the structure of the MST (which contains N − 1 links) and provides additional information about the connections 43,44. However, the PMFG has the disadvantage that there is a certain arbitrariness in its results, since the data are embedded from a higher dimension into a lower-dimensional surface of genus zero. Figure 1J-L show the PMFGs of the matrices shown in Figure 1A-C. We find significant changes in the structures of the PMFGs in different periods of analysis.
Recently, the PMFG and the threshold network have been combined to produce PMFG-based threshold networks 57. Threshold networks of the financial market have also been constructed at multiple scales and multiple thresholds 58.

Robustness: Noise suppression and community detection
We have seen that many of the correlation-based networks have shown clustering with communities of stocks. Thus, community detection in network science serves as an important technique for extraction of the clustering information from ECM of a multivariate time series. Several community detection algorithms have been proposed [59][60][61] . The problem is that different community detection algorithms yield different results for the same ECM. So, often domain knowledge is required to determine what is a sensible or meaningful community.
Further, we have seen that many of the networks are sensitive to noise or spurious correlations. Properties of random matrices 62 have turned out to be useful in reducing noise and thus understanding the dynamics of complex systems 63 ... ensemble of random matrices, also known as stationary or standard random (Gaussian) matrix ensemble 62 [73][74][75][76]. Notably, any ECM of a financial market can be decomposed into partial correlations consisting of the market mode $C_M$, the group mode $C_G$ and the random mode $C_R$ 77,78. This decomposition enables us to identify the dominant stocks, sectors and inherent structures of the market. Recently, detailed analyses of ECMs using these approaches have been carried out to understand the complexity in the dynamics of the stock market 23,63,79. It has been found that during a crisis, the eigenvalue spectrum behaves very differently from that of a normal period.
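One common way to write such a decomposition is through the spectral expansion of the ECM, sketched below. It assumes that the eigenvalues are sorted in decreasing order, $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N$, with orthonormal eigenvectors $\mathbf{e}_i$; the cutoff index $N_G$ separating group and random modes is a hypothetical symbol introduced here, and its value has to be chosen by the analyst.

```latex
C \;=\; \sum_{i=1}^{N} \lambda_i \,\mathbf{e}_i \mathbf{e}_i^{\top}
  \;=\; \underbrace{\lambda_1 \,\mathbf{e}_1 \mathbf{e}_1^{\top}}_{C_M}
  \;+\; \underbrace{\sum_{i=2}^{N_G} \lambda_i \,\mathbf{e}_i \mathbf{e}_i^{\top}}_{C_G}
  \;+\; \underbrace{\sum_{i=N_G+1}^{N} \lambda_i \,\mathbf{e}_i \mathbf{e}_i^{\top}}_{C_R}
```

In this form the market mode is carried by the largest eigenvalue alone, while the split between $C_G$ and $C_R$ depends on where $N_G$ is placed.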

ENTROPY MEASURES
Entropy measures provide an easy way to monitor the financial market continuously, and also prove useful in various other applications in finance, as summarized below. Philippatos and Wilson used entropy to select possible efficient portfolios by applying a mean-entropy approach to a random sample of 50 securities over 14 years 80. Using a hybrid entropy model, Xu et al. evaluated the asset risk due to the randomness of the system 81. In 1996, Buchen and Kelly used the principle of maximum entropy in option pricing to estimate the distribution 82, which fitted a known probability density function accurately. The minimum cross-entropy principle (MCEP), developed by Kullback and Leibler 83, has been very useful in finance. Later, Frittelli found sufficient conditions that give an interpretation of the minimal entropy martingale measure 84.
Entropy has also been used to understand financial hazards as well as to construct an early warning indicator for predicting systematic risks 85,86. Maasoumi and Racine examined the predictability of market returns using an entropy measure and found that it is capable of detecting nonlinear dependence within the time series of market returns, as well as between returns and other prediction variables obtained from other models 87.

Structural Entropy
Recently, the concept of Structural Entropy (SE) has been used to monitor the dynamical correlation-based networks of the financial market 89. The SE resolves the problem of choosing different periods of crisis and of extracting substantial information from the large network of the stock market. The SE measures the amount of heterogeneity of the network nodes, under the assumption that more connected nodes share more common attributes than others. The authors treat clusters as independent sub-units of the network. Calculating the structural entropy involves two steps: (i) calculating an optimal partition function that places every node in a certain cluster, using a community detection algorithm; (ii) analysing the partition function and extracting a representative value of the diversity level. Consider a network G with N nodes, which the community detection algorithm partitions into M communities. Let σ denote the N-dimensional vector whose i-th component is the community assigned to node i. From σ, one calculates the M-dimensional probability vector $P \equiv \left[\frac{c_1}{N}, \frac{c_2}{N}, \ldots, \frac{c_M}{N}\right]$, where $c_i$ is the size of community i, so that $P_i = c_i/N$ is the proportional size of cluster i in the network. The structural entropy is then the Shannon entropy of the probability vector P, $S \equiv H(P) \equiv -\sum_{i=1}^{M} P_i \ln P_i$. The structural entropy S of the network provides a way to continuously monitor the state of the network. However, it is sensitive to the choice of the community detection algorithm used to detect communities. This arbitrariness makes the calculated entropy dependent on the choice of the user, and hence it is not universal.
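The second step of the procedure, going from a community assignment to the entropy value, can be sketched as follows. The community detection step itself is not implemented: the example assumes that a label vector σ is already available from some algorithm, and the function name `structural_entropy` is hypothetical.

```python
import math
from collections import Counter


def structural_entropy(labels):
    """Shannon entropy S = -sum_i P_i ln P_i of the community-size distribution.

    `labels[i]` is the community assigned to node i (the vector sigma);
    P_i = c_i / N, where c_i is the size of community i and N the number of nodes.
    """
    n = len(labels)
    sizes = Counter(labels).values()
    return -sum((c / n) * math.log(c / n) for c in sizes)
```

Under this definition S vanishes when all nodes fall into a single community and reaches ln M when the M communities have equal size, which gives a simple scale for interpreting changes of S between time windows.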

Eigen-entropy
Very recently, the concept of eigen-entropy was used in studying financial markets 90. It is computed from the eigenvector centrality of the network obtained from short-time-series correlation matrices 90,91. In order to capture the global features of the network, every node is ranked by its eigenvector centrality, and the entropy formula from information theory is then used to compute the eigen-entropy. Let G(E,V) be a graph consisting of vertices V and edges E. Let A be its adjacency matrix and λ the largest eigenvalue of A. A is a symmetric positive semi-definite matrix, so all its eigenvalues are non-negative and its eigenvectors are orthogonal. According to the Perron-Frobenius theorem, a square matrix with all positive entries has a unique largest eigenvalue, and the corresponding eigenvector can be chosen to have all positive components. The v-th component of this eigenvector gives the relative eigen-centrality score of node v in the network. For an absolute score one must normalize the eigenvector so that its components $p_i$ satisfy $\sum_{i=1}^{N} p_i = 1$. The disorder and randomness of the system can then be measured by the eigen-entropy, defined as $H = -\sum_{i=1}^{N} p_i \ln p_i$. The higher the disorder of the system, the higher the eigen-entropy. The empirical correlation matrix of the market can be decomposed in two logical ways: (i) into three separate modes, i.e., the market mode $C_M$, the group mode $C_G$ and the random mode $C_R$, where the choice of the range of eigenvalues corresponding to the group mode $C_G$ and the random mode $C_R$ is arbitrary; and (ii) into a market mode $C_M$ and a group-random mode $C_{GR}$, with no arbitrariness. The decomposition into $C_M$ and $C_{GR}$ is preferable, and the corresponding eigen-entropies $H_M$ and $H_{GR}$ are calculated with $A = |C_M|^2$ and $A = |C_{GR}|^2$ (both element-wise), respectively. The eigen-entropy computed in this way gives a simple yet robust measure of the randomness of the financial market that does not rely on any arbitrary thresholds. Further, Chakraborti et al. investigated the relative entropy, which separates the phase space on the basis of disorder 90. The evolution dynamics of these relative entropies in the phase space show phase separation with possible order-disorder transitions. These results are certainly of deep significance for understanding financial market behavior and for designing strategies for risk management. Figure 2 shows how the entropy measures can be used for continuous monitoring of the financial markets.
Figure 2. Continuous monitoring of the financial market using entropy measures. It can be seen that eigen-entropy easily quantifies the order and disorder in the stock market. The evolution of the structural entropy S(τ), calculated using a community detection algorithm, is shown in panel C. The dashed vertical lines correspond to the different periods (normal, bubble, and crash) whose static results are shown in Figure 1.
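A minimal sketch of the eigen-entropy computation is given below. It takes A as the element-wise square of a correlation-type matrix and approximates the leading eigenvector by power iteration, which is used here only to keep the example free of external libraries; a standard eigen-solver would serve equally well. The function name and the parameters `iterations` and `tol` are hypothetical.

```python
import math


def eigen_entropy(corr, iterations=1000, tol=1e-12):
    """Eigen-entropy H = -sum_i p_i ln p_i of the eigenvector centrality.

    A = |C|^2 (element-wise) plays the role of the adjacency matrix; its leading
    eigenvector is approximated by power iteration (an assumption of this sketch)
    and normalized so that its components p_i sum to one.
    Returns the pair (H, p).
    """
    n = len(corr)
    a = [[abs(c) ** 2 for c in row] for row in corr]
    p = [1.0 / n] * n
    for _ in range(iterations):
        q = [sum(a[i][j] * p[j] for j in range(n)) for i in range(n)]
        total = sum(q)
        q = [x / total for x in q]          # keep sum(p) = 1 at every step
        converged = max(abs(x - y) for x, y in zip(q, p)) < tol
        p = q
        if converged:
            break
    entropy = -sum(x * math.log(x) for x in p if x > 0)
    return entropy, p
```

When all stocks are equally central, the centralities are uniform and H attains its maximum ln N; the same function can be applied to $C_M$ and $C_{GR}$ separately to obtain $H_M$ and $H_{GR}$.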

CONCLUDING REMARKS
In this review, we have discussed different methods for the analysis of static and dynamic correlation-based networks of financial markets, and also studied how entropy measures can be used to identify normal, bubble, and crash periods. Specifically, we have compared the recently developed concepts of structural entropy and eigen-entropy.
The prediction of collapses of financial markets using traditional economic theories has been a daunting task. These new and alternative methods have the potential to be used for continuous monitoring and for understanding the complex structures and dynamics of financial markets. These are a few of the attempts physicists have made to generate early warning signals for crises, and these methods can be used for timely intervention.

CONFLICT OF INTEREST STATEMENT
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTHOR CONTRIBUTIONS
SK and HKP designed the idea, wrote the main manuscript text and prepared figures. VK and PG contributed to the literature review. All authors reviewed the manuscript.