Data Depth and Core-based Trend Detection on Blockchain Transaction Networks

Blockchains are significantly easing trade finance, with billions of dollars worth of assets being transacted daily. However, analyzing these networks remains challenging due to the sheer volume and complexity of the data. We introduce a method named InnerCore that detects market manipulators within blockchain-based networks and offers a sentiment indicator for these networks. This is achieved through data depth-based core decomposition and centered motif discovery, ensuring scalability. InnerCore is a computationally efficient, unsupervised approach suitable for analyzing large temporal graphs. We demonstrate its effectiveness by analyzing and detecting three recent real-world incidents from our datasets: the catastrophic collapse of LunaTerra, the Proof-of-Stake switch of Ethereum, and the temporary peg loss of USDC, while also verifying our results against external ground truth. Our experiments show that InnerCore can match the qualified analysis accurately without human involvement, automating blockchain analysis in a scalable manner, while being more effective and efficient than baselines and a state-of-the-art attributed change detection approach for dynamic graphs.


INTRODUCTION
Blockchain technology [49,72] is revolutionizing the way we store and transfer digital assets in multiple domains including internet-of-things [41], healthcare [40], and digital evidence [66]. Public blockchain networks are completely open, allowing anonymous addresses to utilize transactions for cryptocurrency movement and asset trading/investment. While the technology offers numerous benefits, it poses significant challenges, particularly in the area of cybersecurity. Blockchains enable electronic crimes in a variety of ways [73], ranging from demands for ransomware [19] to transactions in darknet markets [25].
One of the biggest challenges in securing blockchain networks is detecting and preventing e-crime. E-crime detection requires scalable analysis of large-scale blockchain graphs in real-time, where results are both qualified and manageable by human analysts. To address this challenge, researchers have developed tools and algorithms for analyzing blockchain networks [1,29,63,67].
Unfortunately, analyzing blockchain networks is an arduous task, given their large size and the involvement of anonymous actors. It is crucial to devise scalable and effective methods that can analyze blockchain networks in real-time, to preempt future losses. The failure to conduct a timely analysis of blockchain networks has already resulted in a staggering loss of billions of dollars to blockchain users, as exemplified by the recent downfall of LunaTerra [14].
In this article, we introduce a new approach to detecting e-crime and trends. Our approach, InnerCore, involves identifying influential addresses with data depth-based core decomposition and further filtering out the role of addresses by using centered motifs. InnerCore analysis reduces large graphs having more than 400K nodes and 1M edges to an induced subgraph of fewer than 300 nodes and 90K edges, while still being able to detect the influential nodes. InnerCore is unsupervised and highly scalable, yielding only ∼4-second running times on daily Ethereum graphs with ∼500K nodes and >1M edges. We apply InnerCore to three recent important events in the blockchain world: the collapse of LunaTerra in May 2022, the Proof-of-Stake (PoS) switch of Ethereum in September 2022, and the temporary peg loss of USDC in March 2023. Experimental results demonstrate that our proposed approach effectively detects significant changes in the network without human intervention. Moreover, InnerCore excels in accurately identifying market-manipulating addresses within the network, underscoring its effectiveness in pinpointing key actors.
Our key novelties and contributions are summarized below.
• InnerCore: We propose InnerCore, a data depth-based core discovery method that can identify the influential traders in blockchain-based asset networks (§5.1).
• Explainable behavior: We develop two metrics, InnerCore expansion and decay (§5.2), that provide a sentiment indicator for the networks and explain trader mood (§5.3).
• Unsupervised address discovery: Through conducting node ranking with a centered-motif approach in temporal asset networks, we demonstrate that InnerCore tracking detects market manipulators and e-crime behavior and warns the network about possible long-term instability, without the need for supervised address discovery (§5.4).
• Scalability: Due to their computational efficiency and ability to utilize only a small portion of graph nodes and edges to analyze overall behavior, the InnerCore discovery and expansion/decay calculations are suitable for large temporal graphs including Ethereum transaction and stablecoin networks. InnerCore is more effective and efficient than baselines [9,67] and the state-of-the-art attributed change detection method in dynamic graphs [20] (§6).

RELATED WORK
In recent years, several studies focused on analyzing different aspects of the blockchain networks [2,12,18,26], particularly in the Ethereum network. Researchers working on natural language processing and sentiment analysis using tweets, news articles, cryptocurrency prices, and charts, Google Trends about blockchains [33,70] could find supporting evidence based on blockchain data analysis. Oliveira et al. [52] performed an analysis of the effects of external events on the Ethereum platform, highlighting short-term changes in the behavior of accounts and transactions on the network. Aspembitova et al. [5] used temporal complex network analysis to determine the properties of users in the Bitcoin and Ethereum markets and developed a methodology to derive behavioral types of users.
Other studies focused on specific aspects of the Ethereum network. For instance, Casale-Brunet et al. [10] analyzed the networks of Ethereum Non-Fungible Tokens using a graph-based approach, while Silva [62] characterized relationships between primary miners in Ethereum using on-chain transactions. Meanwhile, Victor and Lüders [68] measured Ethereum-based ERC20 token networks, and Kiffer et al. [30] examined how contracts in Ethereum are created and how users interact with them.
Numerous researchers found success in anomaly detection through the strategic exploration of the Ethereum transaction network using graph representation. In particular, Patel et al. [54] proposed a one-class graph neural network-based anomaly detection framework for Ethereum transaction networks that harnesses graph representation. Wu et al. [74] proposed a scalable transaction tracing tool which incorporates a biased search method to guide the search of fund transfer traces on transaction graphs.
Zhao et al. [78] investigated the evolutionary nature of Ethereum interaction networks from a temporal graph perspective, detecting anomalies based on temporal changes in global network properties and forecasting the survival of network communities using relevant graph features and machine learning models. Li et al. [37] analyzed the magnitude of illicit activities in the Ethereum ecosystem using proprietary labeling data and machine learning techniques to identify additional malicious addresses. Kılıç et al. [31] predicted whether given addresses are blacklisted or not in the Ethereum network using a transaction graph and local and global features.
Our temporal approach for analyzing the effects of external events on a blockchain platform is similar to the one used by Anoaica and Levard [4]. The authors examined the temporal variation of transaction features in the Ethereum network and observed an increase in activity following the announcement of the Ethereum Alliance creation. Zanelatto Gavião Mascarenhas et al. [75] also studied the evolution of users and transactions over time, showing the centralization tendency of the transaction network. Kapengut and Mizrach [27] studied the Ethereum blockchain around the BeaconChain phase of the PoS transition (September 15, 2022), but the authors focused on the power efficiency and miners' rewards around the transition.
Finally, Khan [28] conducted a survey of datasets, methods, and future work related to graph analysis of the Ethereum blockchain data, while Poursafaei's PhD thesis [55] presented results on temporal anomaly detection in blockchain networks.

BACKGROUND AND PROBLEM
We discuss preliminaries on blockchain and stablecoins (§3.1, §3.2), followed by a key technique, AlphaCore decomposition, which is based on data depth (§3.3). We introduce our problem in §3.4.

Blockchain and Smart Contracts
A blockchain is an immutable public ledger that records transactions in discrete data structures called blocks. The earliest blockchains are cryptocurrencies such as Bitcoin and Litecoin, where a transaction is a transfer of coins. The Ethereum project [72] was created in July 2015 to provide smart contract functionality on a blockchain. Smart contracts are Turing-complete programs that are replicated across a blockchain network, execute deterministically, and can be verified publicly. Smart contracts have implemented mechanisms to trade digital assets, known as tokens [68]. Similar to cryptocurrencies, a token is transferred publicly between accounts (addresses), and may have an associated value in fiat currency which is arbitrated by token demand and supply in the real world.
Blockchain Transaction Network vs. Mining Network. In blockchain transaction networks, the nodes represent individual participating addresses within the network, while the edges signify the actual transactions involving transfer of assets between these addresses. On the other hand, in blockchain mining networks, nodes are computational entities that play a crucial role in maintaining blockchain integrity by validating and appending transactions to the ledger through a consensus mechanism. We focus on blockchain transaction networks, where edges are directed and weighted. An edge weight corresponds to the numerical value associated with the edge incident to a node. For instance, in a blockchain token transaction network, the numerical value denotes the amount of token sent from one address to another.

Stablecoins
A stablecoin is a smart contract-based asset whose price is protected against volatility by i) collateralizing the stablecoin with one or more offline real-life assets (e.g., USD, gold), ii) using a dual coin, or by iii) employing algorithmic trading mechanisms [36,46].
In the pegged asset mechanism, an increase in the price is countered by creating more stablecoins (i.e., coin minting) and selling them to traders at the pegged price. The dual coin mechanism operates by having a management coin, referred to as the dual coin, to oversee a stablecoin. The traders of the dual coin participate in decision making through voting and receive benefits from the stablecoin's transactions. In the event that the stablecoin's price rises, some of the dual coin will be sold to purchase and decrease the supply of the stablecoin. Conflicting demand and supply dynamics of the two coins are assumed to stabilize the stablecoin's price. However, traders may lose faith in the stablecoin to such a degree that they might also not buy the dual coin, however cheap it becomes. Stablecoins that are based on algorithmic trading do not require collateral for stability. They achieve stability through the utilization of a blockchain-based algorithm that adjusts the supply of tokens automatically in response to changes in demand.
It is worth noting that for an Ethereum token such as the UST (TerraUSD) stablecoin, there can be at most $N$ tokens issued within this network, with the value of $N$ being set by the project owner, subject to the condition that it must be $\le 2^{256} - 1$ (due to the Ethereum virtual machine operating on 256-bit words). Furthermore, each of these $N$ tokens can be subdivided into a maximum of $10^{18}$ subunits (an Ethereum protocol specified value). Therefore, the total subunit capacity for a token within the system is $N \times 10^{18}$ subunits.

Data Depth-based Core Decomposition
Core decomposition [44] is a central technique used in network science to determine the significance of nodes and to find community structures in a wide range of applications such as biology [42], social networks [3], and visualization [77]. One of the best-known representatives of core decomposition algorithms, graph-$k$-core [9,58], finds the maximal subgraph where each node has at least $k$ neighbors in that subgraph. Although the graph-$k$-core algorithm demonstrates high utility for the analysis of graph structural properties, it does not account for important graph information such as the direction of edges, edge weights, and node features.
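As a point of reference for the graph-$k$-core definition above, the following is a minimal sketch of the standard peeling procedure: nodes with fewer than $k$ surviving neighbors are removed repeatedly until none remain. The function name `k_core` and the toy graph are illustrative assumptions and do not come from the paper; edge direction and weight are deliberately ignored, which is exactly the limitation discussed in the text.

```python
from collections import defaultdict


def k_core(edges, k):
    """Return the node set of the maximal subgraph in which every node
    has at least k neighbors (edge direction and weight are ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops do not count as neighbors
            adj[u].add(v)
            adj[v].add(u)
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            # Degree is measured only among nodes that are still alive.
            if len(adj[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Toy graph (hypothetical): a 4-clique {a, b, c, d} with a pendant path d - e - f.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"),
             ("b", "d"), ("c", "d"), ("d", "e"), ("e", "f")]
core3 = k_core(toy_edges, 3)
```

In this toy graph the 3-core is the clique, regardless of how much value flows along each edge.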
To address these limitations, modifications to graph-$k$-core have been proposed, e.g., graph-$k$-core in weighted and directed graphs and generalized $k$-core [3,8,16,17,38,79]. Different from them, AlphaCore [67] is a recent core decomposition algorithm that combines multiple node properties using the statistical methodology of data depth [47]. The key idea of data depth is to offer a center-outward ordering of all observations by assigning a numeric score in $(0, 1]$ to each data point with respect to its position within a cloud of a multivariate probability distribution. Using such a data depth function designed for directed and weighted graphs, AlphaCore maps a node with multiple features to a single numeric score, while preserving its relative importance with respect to other nodes.
Consider a directed and weighted multigraph $G(V, E, w)$, where $V$ represents the set of nodes and $E$ is a multiset of edges. The weight of each edge is designated by the weight function $w : E \to \mathbb{R}^{+}$. In accordance with the generalized core definitions introduced in Batagelj and Zaveršnik [8], a node property function can assign a real value to each node $v \in V$, based on edge properties such as weight and node features. A node $v$ can be represented by its feature vector $x \in \mathbb{R}^{d}$, where $d$ features have been computed for the node $v$.
Definition 1 (Mahalanobis depth to the origin (MhDO)). Let $x \in \mathbb{R}^{d}$ be an observed data point. Then the Mahalanobis depth to the origin of $x$ with respect to a $d$-variate probability distribution $F$ with mean vector $\mu_F \in \mathbb{R}^{d}$ and covariance matrix $\Sigma_F \in \mathbb{R}^{d \times d}$ is given by
$$\mathrm{MhDO}_F(x) = \left(1 + x^{\top} \Sigma_F^{-1} x\right)^{-1},$$
where $\Sigma_F$ is the covariance matrix of $F$. The Mahalanobis data depth to origin (MhDO) measures the degree of "outlyingness" of point $x$ (in this context, the node property column vector) in relation to the origin $0$.
As the AlphaCore decomposition unfolds, the core value $\alpha$ of a node is established using a data depth threshold $\epsilon \in [0, 1]$ that is applied to remove neighboring high-depth nodes iteratively. Nodes with high property values, such as large edge weights, generally have a low depth, while nodes with low property values often have a high depth, such as most blockchain nodes that trade small amounts of tokens. However, node property values are not the only factor that determines depth; the community structure around the node also plays a role. Nodes are considered to be in the $\alpha = (1 - \epsilon)$-core if their depth, relative to themselves, is no more than $\epsilon$.
Why Data Depth? Data depth provides a more precise identification of crucial nodes compared to state-of-the-art core decomposition algorithms and acts as a combination of centrality measure and core decomposition [67]. Unlike traditional decomposition algorithms, a depth-based decomposition does not require the specification of multiple feature weighting parameters to perform effectively on a particular task.
An Example of AlphaCore. To better illustrate the differences between the traditional graph-$k$-core and AlphaCore decomposition methods, we showcase an example in Figure 1. In the case of graph-$k$-core, the innermost core is the 3-core, whereas the InnerCore of AlphaCore would be the core of $\alpha > 0.75$. Note that the 3-core consists of nodes that trade frequently with themselves, but their trade volumes with themselves are not that significant compared to other transactions which exist in the network. In certain analyses of financial networks such as anomalous address detection, being able to filter out these negligible transactions and their participating nodes, while still capturing more meaningful ones, significantly improves the accuracy and scalability of subsequent computations on the decomposed network core. On the other hand, the AlphaCore of $\alpha > 0.75$ is able to capture the nodes that participate in the largest transactions which occur in the example network, while filtering the negligible transactions and their participating nodes. We point out that the main limitation of graph-$k$-core is that it only considers node degrees, whereas AlphaCore is flexible and can consider any combination of node features as outlined in Table 1, without requiring any feature weighting parameters to be specified to perform effectively on a particular task. Therefore, in networks where edge weights fall under a broad range and are meaningful distinguishing factors, we recommend AlphaCore over the traditional graph-$k$-core decomposition.

Problem Definition
Given a weighted, directed, multi-graph representation of a blockchain transaction network over successive timestamps, where $G_t(V_t, E_t, w_t)$ denotes the graph at timestamp $t$, $V_t$ its set of nodes (traders, exchanges, liquidity pools, etc.), and $E_t$ the multiset of edges (i.e., transactions) representing the amount of asset transferred between two nodes, (i) detect the node set $S_t \subseteq V_t$ at time $t$ such that the behavior of nodes in $S_t$ can characterize the future success of the underlying asset at $t' > t$, and (ii) categorize nodes' behavior in terms of the future health and success of the underlying asset.
E-crime Detection vs. Prediction. In blockchain space, predictions can only go so far, as we are unable to anticipate malicious transactions that originate from the external world. At most, what we can do is to detect e-crime transactions among the vast number of transactions taking place. This detection process is highly valuable because when a significant crime occurs, we have access to public graphs of the affected assets. However, the sheer volume of addresses and transactions makes qualitative analysis impractical. This is where blockchain data analytics tools come into play, aiming to narrow down the search space by providing a ranking of maliciousness to addresses and transactions.

DATA DEPTH
Depth functions were initially introduced in the setting of non-parametric multivariate analysis to define affine invariant versions of median, quantiles, and ranks in higher dimensional spaces where there is no natural order (see historical overviews by Mosler [47], Nieto-Reyes and Battey [51]). The key idea of the depth approach is to offer a center-outward ordering of all observations by assigning a numeric score in the $(0, 1]$ range to each data point with respect to its position within a cloud of multivariate or functional observations or a probability distribution. Nowadays, data depth is a rapidly developing field that gains increasing momentum due to the wide applicability of depth concepts to classification, visualization, high dimensional and functional data analysis [22,48,50,59,76]. Most recently, depth approaches have found novel applications in density-based clustering and space-time data mining [21,24,69], shape recognition and uncertainty quantification in computer graphics [61,71], ordinal data analysis [32] and computational geometry for privacy-preserving data analysis [43]. Nevertheless, data depth is yet a largely unexplored concept in network sciences [15,56,64,65].
Definition 2 (Data Depth). Formally, let $S$ be a Banach space (e.g., $S = \mathbb{R}^{d}$), $\mathcal{B}$ its Borel sets in $S$, and $\mathcal{P}$ be a set of probability distributions on $\mathcal{B}$. We view $\mathcal{P}$ as the class of empirical distributions giving equal probabilities $1/n$ to $n$ data points in $S$. Then, a data depth function is a function $D : S \times \mathcal{P} \to [0, 1]$, $(x, P) \mapsto D(x \mid P)$, $x \in S$, $P \in \mathcal{P}$, that satisfies the following desirable properties: affine invariant, upper semi-continuous in $x$, quasiconcave in $x$ (i.e., having convex upper level sets), and vanishing as $\|x\| \to \infty$. Specifically, a data depth function $D(x)$ measures how closely an observed point $x \in \mathbb{R}^{d}$, $d \ge 1$, is located to the center of a finite set $X \subset \mathbb{R}^{d}$, or relative to $F$, which is a probability distribution in $\mathbb{R}^{d}$. In complex network analysis, these points may correspond to nodes or edges having features.
Among many depth functions formulated to date, the Mahalanobis depth is one of the most prominent in current practice.
Definition 3 (Mahalanobis (MhD) depth). Let $x \in \mathbb{R}^{d}$ be an observed data point. Then the Mahalanobis (MhD) depth of $x$ with respect to a $d$-variate probability distribution $F$ having mean vector $\mu_F \in \mathbb{R}^{d}$ and covariance matrix $\Sigma_F \in \mathbb{R}^{d \times d}$ is given by
$$\mathrm{MhD}_F(x) = \left(1 + (x - \mu_F)^{\top} \Sigma_F^{-1} (x - \mu_F)\right)^{-1}.$$
Here $\top$ denotes matrix transpose. The MhD depth measures the outlyingness of the point with respect to the deepest point of the distribution (here $\mu_F$), and makes it easy to handle the elliptical family of distributions, including the Gaussian case.
MhD offers flexibility in changing the reference point with respect to which we compute data rankings. For instance, instead of $\mu_F$ we can select an arbitrary point $x_0 \in \mathbb{R}^{d}$ and compute MhD with respect to this new reference point $x_0$:
$$\mathrm{MhD}_F(x \mid x_0) = \left(1 + (x - x_0)^{\top} \Sigma_F^{-1} (x - x_0)\right)^{-1}.$$
Furthermore, $\Sigma_F$ can be substituted by any empirical estimator $\hat{\Sigma}$ of the covariance matrix obtained from the observed data sample $x_1, x_2, \ldots, x_n$.
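The Mahalanobis depth with a configurable reference point can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the authors' code: the function name `mahalanobis_depth`, the use of the sample covariance as the empirical estimator, and the pseudo-inverse (to tolerate a singular covariance) are all assumptions made here. Passing the origin as `reference` yields the depth to the origin used by AlphaCore.

```python
import numpy as np


def mahalanobis_depth(points, reference=None):
    """Depth of each row of `points` with respect to `reference`
    (default: the sample mean), using the sample covariance of `points`."""
    X = np.asarray(points, dtype=float)
    if reference is None:
        reference = X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False))
    cov_inv = np.linalg.pinv(cov)  # assumption: pseudo-inverse for robustness
    diff = X - np.asarray(reference, dtype=float)
    # Row-wise quadratic form diff_i^T * cov_inv * diff_i.
    sq_dist = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return 1.0 / (1.0 + sq_dist)


# Hypothetical 2-feature sample; the last point has by far the largest values.
sample = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [10.0, 12.0]])
depth_to_mean = mahalanobis_depth(sample)
depth_to_origin = mahalanobis_depth(sample, reference=np.zeros(2))
```

With the origin as reference, the point with the largest feature values receives the lowest depth, which matches the intuition that heavy traders sit at low depth.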

METHODOLOGY
Our methodology is illustrated in Figure 2. In keeping with the routine of daily life, blockchain transaction networks are frequently examined on a 24-hour basis [10,11]. We divide a blockchain transaction network into daily intervals, using a reference time zone to create a set of snapshot graphs. In a snapshot graph of a blockchain transaction network, a node represents a participant (traders, exchanges, liquidity pools, etc.), whereas a directed edge denotes a financial transaction involving the transfer of assets from one participant to the other. Next, we define InnerCore, InnerCore expansion, and InnerCore decay on the snapshot graphs. InnerCore helps us eliminate unimportant edges and nodes (e.g., addresses trading small amounts). We then compute daily temporal InnerCore expansion and decay measures to identify significant days and trends for further investigation (§5.
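The daily snapshot construction described above can be sketched as follows. The transaction tuple layout `(unix_timestamp, sender, receiver, amount)`, the function name `daily_snapshots`, and the choice of UTC as the reference time zone are assumptions made for illustration; the text only states that a reference time zone is used.

```python
from collections import defaultdict
from datetime import datetime, timezone


def daily_snapshots(transactions, tz=timezone.utc):
    """Group transactions into per-day edge lists.

    transactions: iterable of (unix_timestamp, sender, receiver, amount).
    Returns {date: [(sender, receiver, amount), ...]} keyed by calendar day in tz.
    """
    snapshots = defaultdict(list)
    for ts, sender, receiver, amount in transactions:
        day = datetime.fromtimestamp(ts, tz=tz).date()
        snapshots[day].append((sender, receiver, amount))
    return dict(snapshots)


# Hypothetical transfers straddling a UTC midnight boundary.
txs = [(0, "a", "b", 1.0), (86399, "b", "c", 2.0), (86400, "c", "a", 3.0)]
snaps = daily_snapshots(txs)
```

Each value in the returned dictionary is the edge multiset of one snapshot graph, so repeated transfers between the same pair of addresses are kept as separate edges.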

InnerCore of a Graph
Consider the weighted, directed multi-graph $G_t(V_t, E_t, w_t)$ defined in Section 3.4. We define the data depth of a node $v \in V_t$ as the degree of "outlyingness" of the node properties in relation to the origin $0$. We use In-Degree, Out-Degree, In-Strength, and Out-Strength as node properties (defined in Table 1) to compute the InnerCore of a snapshot graph (§5.1), as these node features can be defined easily for a weighted, directed, multi-graph.
We define the InnerCore of $G_t$ as the set of nodes $V'_t \subseteq V_t$ whose data depth, relative to themselves, is less than an $\epsilon$ value. We set $\epsilon$ to a small value, and iteratively recompute the depth of each node as we remove nodes whose data depth is greater than or equal to $\epsilon$ in each iteration. This process continues until no more nodes can be removed. The resulting set of nodes is the InnerCore.
Algorithm 1 computes a feature matrix $X$ based on each node property function in line 1. In particular, edge weight is used for computing the Strength, In-Strength, and Out-Strength node property functions, where the numerical values of all incident edges to a node irrespective of direction, inbound to a node, and outbound from a node, respectively, are aggregated. For example, if we have a network $A \xrightarrow{10} B \xleftarrow{5} C$, the In-Strength node property function will return 15 for node $B$. The feature matrix $X$ is used to compute the inverse covariance matrix $\Sigma^{-1}$ in line 2, which will be utilized for future data depth calculations. The initial depth of each node is determined using the Mahalanobis depth with respect to the origin at line 3. Nodes with a depth greater than or equal to the input $\epsilon$ are removed from the node set $V$ at line 6. Once one batch of node removals has been performed, the feature matrix and depth values are re-evaluated in lines 7-8. If any remaining nodes still have a depth greater than or equal to $\epsilon$, the next batch is initiated at the same $\epsilon$ level. When there are no nodes left with a depth greater than or equal to $\epsilon$, the algorithm is considered complete, and the remaining nodes in $V$ are returned as the InnerCore.
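The pruning loop described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: degrees are taken to count parallel edges, features are recomputed on the subgraph induced by the surviving nodes, the inverse covariance is estimated once from the full snapshot (as the description of line 2 suggests) via a pseudo-inverse, and the names `node_features`, `depth_to_origin`, and `inner_core` are hypothetical.

```python
import numpy as np


def node_features(edges, nodes):
    """Rows: [in-degree, out-degree, in-strength, out-strength] per node,
    computed only over edges whose endpoints are both in `nodes`."""
    index = {v: i for i, v in enumerate(nodes)}
    X = np.zeros((len(nodes), 4))
    for u, v, w in edges:
        if u in index and v in index:
            X[index[v], 0] += 1   # in-degree of the receiver
            X[index[u], 1] += 1   # out-degree of the sender
            X[index[v], 2] += w   # in-strength of the receiver
            X[index[u], 3] += w   # out-strength of the sender
    return X


def depth_to_origin(X, cov_inv):
    """Mahalanobis depth of each row of X with respect to the origin."""
    return 1.0 / (1.0 + np.einsum("ij,jk,ik->i", X, cov_inv, X))


def inner_core(edges, eps=0.1):
    """Iteratively drop nodes whose depth is >= eps; return the survivors."""
    nodes = sorted({u for u, _, _ in edges} | {v for _, v, _ in edges})
    X = node_features(edges, nodes)
    # Assumption: the inverse covariance is computed once, up front.
    cov_inv = np.linalg.pinv(np.atleast_2d(np.cov(X, rowvar=False)))
    depth = depth_to_origin(X, cov_inv)
    while nodes and np.any(depth >= eps):
        nodes = [v for v, d in zip(nodes, depth) if d < eps]
        X = node_features(edges, nodes)
        depth = depth_to_origin(X, cov_inv) if nodes else np.array([])
    return set(nodes)


# Hypothetical snapshot: (sender, receiver, amount), parallel edges allowed.
toy_edges = [("A", "B", 10), ("C", "B", 5), ("B", "D", 100), ("D", "A", 50),
             ("D", "E", 1), ("E", "A", 2), ("A", "B", 3), ("C", "E", 7)]
core = inner_core(toy_edges, eps=0.5)
```

Because the threshold is fixed, a single call replaces the full sweep over thresholds that a complete core decomposition would need.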
InnerCore vs. AlphaCore. InnerCore discovery of a graph $G_t$ does not require a complete decomposition of all graph cores by varying $\epsilon$, as it is done in AlphaCore [67]. Instead, we set an $\epsilon$ value (e.g., $\epsilon = 0.1$) just once, and then use the value to iteratively prune nodes until all remaining nodes, relative to themselves, satisfy a

InnerCore Expansion and Decay
By analyzing how a temporal graph expands and shrinks in relation to entry and exit of nodes on a daily basis, we gain valuable insights into market sentiment. We define the influential nodes of a graph as its InnerCore nodes (i.e., $V'_t$). We propose two measures to quantify the activity of influential nodes in the network: expansion and decay. Expansion counts the number of new influential nodes on day $t$ that were not influential in the preceding $h$ days, while decay quantifies the number of influential nodes from the previous $h$ days that are not present in the influential nodes of day $t$. The goals of measuring InnerCore expansion and decay are two-fold: (1) correctly accentuate anomalous days to motivate further analysis using motifs and NF-IAF score ranking; and (2) accurately depict trends in the market to provide a sentiment indicator and explain mood. InnerCore, based on its output, isolates the key participants in the daily transaction network snapshot, whereas the expansion and decay measures provide a unique perspective on market trends and sentiment from the activity of key participants. As prefaced in §3.4, our InnerCore methodology focuses on detection rather than prediction, acknowledging the inherent unpredictability of malicious transactions originating from the external world.
To this end, we first discover $V'_t$ as the set of nodes in the InnerCore of the snapshot graph at timestamp $t$, and define expansion and decay over a history of $h$ days as the counts
$$\mathrm{Expansion}_t = \Big| V'_t \setminus \bigcup_{i=1}^{h} V'_{t-i} \Big|, \qquad \mathrm{Decay}_t = \Big| \bigcup_{i=1}^{h} V'_{t-i} \setminus V'_t \Big|.$$
A substantial expansion measure observed on a particular day often indicates the presence of excessive buy or sell behavior from new traders entering the daily InnerCore. Such behavior may arise either from a large group of traders acting in unison or from a selected group of traders whose significant transactions prompt other traders to follow a similar pattern. Consequently, heavy-buy or heavy-sell behaviors coincide with days characterized by considerable influxes of new traders entering the daily InnerCore. On the other hand, a substantial decay measure observed on a particular day is often a reaction to a significant change in the state of a currency caused by the transactions of key traders in the preceding days. Therefore, we regard days with significant expansion measures, followed by days with significant decay measures, as anomalies and prime candidates for detecting market manipulator addresses.
Parameters in Experimental Setup. In the context of InnerCore expansion and decay, a greater $h$ (i.e., the history parameter from §3.2) produces an averaging effect, coupled with the tendency to lower expansion and inflate decay. Setting a specific $h$ value depends on the application. We use $h = 1$ to improve the accentuation of expansion and decay in the InnerCore to better depict the shift in market sentiment during the days of significant events. In InnerCore decomposition, depth values range between $(0, 1]$; nodes with high property values (e.g., many transactions, higher transacted amounts) tend to have low depth, while nodes with low property values tend to have high depth [67]. With data depth threshold $\epsilon = 1$, all nodes will be returned as InnerCore members; while for $\epsilon = 0$, the empty set will be returned. Setting an appropriate $\epsilon$ depends on the desired size of the InnerCore returned specific to an application. In our experiments, we set $\epsilon = 0.1$ to ensure that the average number of nodes in each daily InnerCore is above 150.
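The expansion and decay measures reduce to set differences over daily InnerCore node sets. The sketch below assumes the count-based reading given in the verbal description (new influential nodes versus departed ones over a history window); the function name `expansion_and_decay` and the parameter name `history` are illustrative choices, not taken from the paper.

```python
def expansion_and_decay(daily_cores, history=1):
    """daily_cores: list of sets of InnerCore node ids, one per day, in order.

    Returns a list of (expansion, decay) counts, one for each day that has
    `history` preceding days available.
    """
    results = []
    for t in range(history, len(daily_cores)):
        # Union of the InnerCores of the preceding `history` days.
        past = set().union(*daily_cores[t - history:t])
        today = daily_cores[t]
        results.append((len(today - past), len(past - today)))
    return results


# Hypothetical InnerCores for three consecutive days.
cores = [{"a", "b", "c"}, {"b", "c", "d", "e"}, {"e"}]
measures = expansion_and_decay(cores, history=1)
```

In this toy series the second day gains two new influential nodes and loses one, while the third day gains none and loses three, which is the expansion-then-decay sequence the text singles out for closer inspection.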

Behavioral Patterns in Temporal Networks
Temporal networks, including blockchain networks, exhibit continuous evolution and can experience notable shifts in user sentiment and node activity triggered by technological advancements and significant events, sometimes occurring within a few days.
By utilizing expansion and decay, we have identified four behavioral patterns that provide sentiment indication and capture node activity.These patterns serve as the foundation for network analysis in our experiments detailed in §6. Figure 3 illustrates the expansion and decay values for each pattern.To gain a better understanding of these patterns, particularly when examining the temporal graph of a financial network such as the Ethereum transaction network, it is helpful to consider the network's underlying transaction semantics.
• The Despair pattern is characterized by a reduction in expansion and an increase in decay, implying that previously influential nodes are leaving the network, while the InnerCore is shrinking due to a decrease in the number of new influential nodes.
• The Uncertainty pattern is distinguished by an increase in both expansion and decay. This is primarily due to the influx of many new traders into the network who do not remain active for a significant period of time.
• The Hope pattern is characterized by a reduction in decay and an increase in expansion, indicating the presence of many newcomers to the network who remain active within the network.
• The Faith pattern is identified by a decrease in both decay and expansion, which initially suggests a state of confusion. On the positive side, nodes, such as traders, may have faith in the network's ability to withstand a catastrophic event, as demonstrated in the LunaTerra case in our experimental results. On the negative side, it may indicate a sense of hopelessness as traders may hold onto their assets without engaging in transactions or exiting the system altogether.
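The four patterns above amount to a sign table over the day-to-day change in expansion and decay. The following sketch encodes that table; the function name `classify_pattern` is hypothetical, and returning `None` when either measure is unchanged is an assumption, since the text does not assign a pattern to flat values.

```python
def classify_pattern(prev, curr):
    """prev, curr: (expansion, decay) pairs for two consecutive days."""
    d_exp = curr[0] - prev[0]
    d_dec = curr[1] - prev[1]
    if d_exp == 0 or d_dec == 0:
        return None  # assumption: no pattern is defined for a flat measure
    if d_exp < 0 and d_dec > 0:
        return "Despair"      # fewer newcomers, more departures
    if d_exp > 0 and d_dec > 0:
        return "Uncertainty"  # more newcomers, more departures
    if d_exp > 0 and d_dec < 0:
        return "Hope"         # more newcomers, fewer departures
    return "Faith"            # fewer newcomers, fewer departures


# Hypothetical consecutive-day measurements.
label = classify_pattern((10, 5), (4, 9))
```

Applied to a sliding pair of days, this produces a label sequence that can be read as a coarse sentiment indicator.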

Motif Analysis in InnerCore
Our rationale behind using motif analysis in conjunction with InnerCore is to accurately discover larger and potentially influential players in the daily network, referred to as market manipulators. The structure of a motif defines a behavior of interest and its existence in a network indicates the presence of such behavior. Motif analysis has been a popular tool to identify subgraph patterns and the addresses involved in them [6,35,45,53,77]. We have decided to use three-node motifs since they can be identified more quickly than higher-order motifs, while still capturing the direct buying or selling behavior between addresses. Our decision is consistent with previous research on temporal motifs [53].
Scalability. The fastest triangular motif discovery algorithm has time complexity $O(|V'_t|^{\omega})$, where $\omega < 2.376$ is the fast matrix product exponent [13,34]. The number of nodes in the InnerCore is denoted by $|V'_t|$. We demonstrate in §6 that triangular motif discovery on InnerCores has low time costs because of the relatively small size of daily networks' InnerCores. In particular, we consider a simpler implementation of triangular motif discovery, where for each node we explore its local neighborhood. For every triple consisting of the current node and its two neighbors, we verify if a motif can be formed. The time complexity of our approach is $O\!\left(|V'_t| \times \binom{\Delta}{2}\right)$, where $\Delta$ denotes the maximum number of neighbors per node. The daily temporal InnerCore networks from our Ethereum stablecoin dataset have, on average, 180 nodes, with each node having 11 neighbors on average (the maximum number of neighbors of a node is 134). In contrast, the entire daily temporal Ethereum stablecoin networks have, on average, 89,500 nodes and though each node has only 3 neighbors on average, the maximum number of neighbors per node is 69,381. This explains why our triangular motif discovery method is quite efficient on the InnerCore networks as opposed to on entire daily temporal graphs.
We define the center of each 3-node motif as a node that either receives incoming edges from the two other nodes (buy behavior) or delivers outgoing edges to two other nodes (sell behavior).This definition ensures that motif centers exhibit only buy or sell behavior, and they do not act as intermediary nodes between the other two nodes in a motif.
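The center definition can be sketched with the neighborhood-based enumeration described earlier. The code below is a minimal sketch under one reading of the definition: within a triple, a center either only receives from the other two nodes (buy) or only sends to them (sell). The function name and the edge-list input are illustrative assumptions, and the sketch does not classify triples into the five motif types.

```python
from collections import defaultdict
from itertools import combinations


def find_motif_centers(edges):
    """Return (center, role, a, b) tuples for 3-node triples with a center.

    `edges` is an iterable of directed (source, target) pairs. Multi-edges
    collapse to a single edge and self-loops are ignored.
    """
    succ = defaultdict(set)  # node -> nodes it sends to
    pred = defaultdict(set)  # node -> nodes it receives from
    for u, v in edges:
        if u != v:
            succ[u].add(v)
            pred[v].add(u)
    nodes = set(succ) | set(pred)

    found = []
    for c in sorted(nodes):
        neighbors = sorted(succ[c] | pred[c])
        # Every pair of neighbors forms a candidate triple with c.
        for a, b in combinations(neighbors, 2):
            pair = {a, b}
            receives_only = pair <= pred[c] and not (pair & succ[c])
            sends_only = pair <= succ[c] and not (pair & pred[c])
            if receives_only:
                found.append((c, "buy", a, b))
            elif sends_only:
                found.append((c, "sell", a, b))
    return found
```

A node that receives from one neighbor and sends to the other is an intermediary in that triple, so it is not reported as a center.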
Out of the 16 connected three-node motifs (see Figure 1B in Milo et al. [45]), only five contain a center node (Figure 4). We identify all instances of these five motifs and their centers from our daily networks' InnerCores. Finally, we utilize the well-known TF-IDF measure from information retrieval [57] to rank the discovered center nodes. TF-IDF is a statistical measure that reflects the relevance of a word in a collection of documents. In our setting, we treat each discovered center address as a word and the daily instances of each motif as a collection of documents to propose a novel node relevance score for temporal graphs: NF-IAF.
Formally, let $\mathcal{M} = \{M_1, M_4, M_5, M_6, M_{11}\}$ be the set of five motifs of interest, and let $\mathcal{D} = \{d_1, d_2, \ldots, d_n\}$ be the set of $n$ days under consideration. For each $M_i \in \mathcal{M}$ and $d_j \in \mathcal{D}$, let $f(v, M_i, d_j)$ denote the number of occurrences of node $v \in V_I$ in all instances of motif $M_i$ on day $d_j$. For all $v \in V_I$, $M_i \in \mathcal{M}$, and $d_j \in \mathcal{D}$, we define the node frequency (NF) and inverse appearance frequency (IAF) as follows.

Definition 6 (Node Frequency). We define the node frequency of node $v$ for motif $M_i$ on day $d_j$ as
$$NF(v, M_i, d_j) = \frac{f(v, M_i, d_j)}{\sum_{u \in V_I} f(u, M_i, d_j)}.$$
The NF measures how frequently a particular node occurs in a specific motif on a specific day relative to the total number of occurrences of all nodes in that motif on that day.

Definition 7 (Inverse Appearance Frequency). We define the inverse appearance frequency of node $v$ for motif $M_i$ as
$$IAF(v, M_i) = \log \frac{|\mathcal{D}|}{n_d(v, M_i)},$$
where $|\mathcal{D}|$ is the total number of days in the dataset, and $n_d(v, M_i)$ is defined as the number of days $d_j \in \mathcal{D}$ where $f(v, M_i, d_j) > 0$.
The IAF measures the importance of a node by how frequently it appears across all days for a motif. If a node appears on many days for a motif, its IAF will be low, indicating that it is not very informative. On the other hand, if a node appears on only a few days for a motif, its IAF will be high, indicating that it is a rare and potentially important node.

Definition 8 (NF-IAF Score). The NF-IAF score of node $v$ for motif $M_i$ on day $d_j$ is given as
$$\text{NF-IAF}(v, M_i, d_j) = NF(v, M_i, d_j) \times IAF(v, M_i).$$
A greater NF-IAF score of a center node on a particular day indicates greater relevance between that node and the behavior associated with the motif type. Therefore, a motif center with a high NF-IAF score on a particular day is more likely to have influenced the network on that day, while a lower NF-IAF score indicates the opposite.
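The scoring can be sketched as follows. This is a minimal sketch, assuming that the IAF takes the usual IDF-style form log(|D| / n_d) with a natural logarithm; the function name and the dictionary-based input layout are illustrative assumptions.

```python
import math
from collections import defaultdict


def nf_iaf(counts, num_days):
    """Compute NF-IAF scores from center-occurrence counts.

    `counts` maps (node, motif, day) -> number of occurrences of the node
    in instances of that motif on that day. `num_days` is the total number
    of days in the window. Entries with zero occurrences are skipped.
    """
    totals = defaultdict(int)       # (motif, day) -> occurrences of all nodes
    active_days = defaultdict(set)  # (node, motif) -> days with a positive count
    for (node, motif, day), f in counts.items():
        if f > 0:
            totals[(motif, day)] += f
            active_days[(node, motif)].add(day)

    scores = {}
    for (node, motif, day), f in counts.items():
        if f <= 0:
            continue
        nf = f / totals[(motif, day)]
        # Assumed IDF-style form: rarer appearance across days gives a higher IAF.
        iaf = math.log(num_days / len(active_days[(node, motif)]))
        scores[(node, motif, day)] = nf * iaf
    return scores
```

Under this form, a node that appears as a center on every day of the window has an IAF of zero, so its score is zero regardless of its node frequency.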

EXPERIMENTAL RESULTS
In this section, we first describe three large temporal blockchain graphs that we use to answer our research questions (§3.4). Next, we analyze the scalability of InnerCore discovery and centered-motif analysis on these graphs. After presenting our scalability results, we illustrate how our methods provide predictive insights into anomalies stemming from external events and identify the addresses that played a significant role in such events. Our code and datasets are available at https://github.com/JZ-FSDev/InnerCore.

In May 2022, the Terra blockchain and its cryptocurrency Luna collapsed, owing to TerraUSD loans that could not be repaid. A Luna coin that was valued at $USD116 in April plummeted to a fraction of a penny during the collapse.² This resulted in a loss of confidence in both WLUNA and UST on Ethereum. On May 9th, 2022, UST lost its $USD1 peg and fell as low as 35 cents.³ The Ethereum Stablecoin dataset covers the period from April 1st, 2022, to November 1st, 2022, spanning about one month before the crash to six months after it. We construct a transaction network consisting of USDT, USDC, DAI, UST, PAX, and WLUNA transactions within this period for §6.3.1. We also use the address labels dataset from Shamsi et al. [60], in which labels of 296 addresses from 149 centralized and decentralized Ethereum exchanges are listed publicly, to distinguish exchange addresses.

Environment Setup
In March 2023, Silicon Valley Bank, holding over $3 billion of Circle's collateralized reserves, collapsed abruptly, causing a mass liquidation of USDC by traders. Consequently, on March 11th, 2023, Circle's USDC temporarily lost its $USD1 peg, dropping to an all-time low of 87 cents. The USDC dataset covers the period from February 25th, 2023, to March 23rd, 2023, spanning approximately two weeks before and after the peg loss. We use a transaction network consisting of only USDC transactions for §6.3.3.

Ethereum Transaction Network. We collected ether transactions from the Ethereum blockchain for the period between August 21st and October 1st, 2022. On an average day during this period, there were 480,000 addresses, with approximately 1 million edges connecting them. Ether is a type of cryptocurrency, similar to bitcoin, and its value can be converted to various fiat currencies such as USD and JPY. Ethereum changed its block creation process during this time, moving from the costly Proof-of-Work method to the more efficient Proof-of-Stake algorithm in two phases on September 6th and 15th, 2022.

Scalability Analysis
System Specifications. The machine used for experiments has an Intel Core i7-8700K CPU @ 3.70GHz, 32.0GB RAM, Windows 10 OS, and a GeForce GTX 1070 GPU. A combination of Python and R was used for coding.

InnerCore Discovery. Since we are interested in directly finding the InnerCore, the InnerCore discovery method (§3.1), unlike AlphaCore decomposition [67], does not associate different $\alpha$ values with intermediate cores generated in an iterative stepwise fashion. Instead, a fixed threshold $\epsilon$, or upper bound for depth, is set, and all nodes with a depth greater than $\epsilon$ are pruned repeatedly until all remaining nodes, relative to each other in the resulting network, have a depth < $\epsilon$. This allows InnerCore discovery to run approximately 1/stepsize times faster than AlphaCore decomposition, since the computations of all intermediate cores are skipped. As depicted in Figure 5, the average running time for InnerCore discovery is only 4.06 seconds on graphs with approximately 480,000 nodes and 1 million edges. Furthermore, InnerCore discovery has a running time of only one-tenth of that for AlphaCore decomposition.
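The fixed-threshold pruning loop can be sketched as below. This is a sketch under stated assumptions rather than the released implementation: it assumes in-strength and out-strength as the two node features, a Mahalanobis depth measured relative to the origin, a covariance matrix computed once on the full graph, and pruning of nodes whose depth is at least the threshold. The function name and the edge-list input are illustrative.

```python
def inner_core(edges, eps=0.1):
    """Return the set of nodes left after pruning nodes with depth >= eps.

    `edges` is an iterable of (source, target, weight) triples.
    """
    edges = list(edges)
    nodes = {n for u, v, _ in edges for n in (u, v)}

    def features(active):
        # [in-strength, out-strength] over edges between active nodes only.
        feats = {n: [0.0, 0.0] for n in active}
        for u, v, w in edges:
            if u in active and v in active:
                feats[u][1] += w
                feats[v][0] += w
        return feats

    # Assumption: the 2x2 covariance is computed once, on the full graph.
    full = list(features(nodes).values())
    count = len(full)
    mean_in = sum(f[0] for f in full) / count
    mean_out = sum(f[1] for f in full) / count
    s_ii = sum((f[0] - mean_in) ** 2 for f in full) / count
    s_oo = sum((f[1] - mean_out) ** 2 for f in full) / count
    s_io = sum((f[0] - mean_in) * (f[1] - mean_out) for f in full) / count
    det = s_ii * s_oo - s_io ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")

    def depth(x, y):
        # Assumed depth: 1 / (1 + squared Mahalanobis distance to the origin).
        quad = (s_oo * x * x - 2.0 * s_io * x * y + s_ii * y * y) / det
        return 1.0 / (1.0 + quad)

    active = set(nodes)
    while active:
        feats = features(active)  # features are recomputed after each pruning
        shallow = {n for n in active if depth(*feats[n]) >= eps}
        if not shallow:
            break
        active -= shallow
    return active
```

On a toy graph with two heavily trading addresses and a few small ones, a threshold of 0.3 leaves only the two heavy traders. A threshold as small as 0.1 is better suited to graphs with hundreds of thousands of nodes, and on this toy graph it prunes every node.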
Because graph-$k$-core must repeatedly iterate over all remaining nodes with each peeling until the highest $k$-core remains, we find InnerCore to be nearly 8x faster on each daily graph snapshot.
SCPD is a state-of-the-art method to identify anomalies from attributed graph snapshots [20]. Due to its spectral approach, we find it slower: InnerCore discovery runs nearly 7x faster on each daily graph snapshot, which demonstrates the scalability of our solution.

Three-Node Motifs Counting. Instead of conducting motif analysis on all nodes, our approach utilizes the InnerCore. By focusing on this core subset of nodes, we reduce a daily network of approximately 480,000 nodes and 1 million edges to an induced subgraph of roughly 300 nodes and 90,000 edges (counting multi-edges), which is far more manageable. Whereas centered motif counting on each full snapshot graph takes more than one day to complete, motif counting inside the InnerCore requires fewer than 20 seconds, which illustrates our scalability.

Experiment 1:
The Collapse of LunaTerra. Stablecoins are meant to be a safe haven, as they are generally pegged to and maintain a 1:1 ratio with a fiat currency, resisting the volatility associated with other popular cryptocurrencies. Commonly, traders keep blockchain assets not needed for immediate use in a transaction as a stablecoin, analogous to people keeping extra money in a bank. For this reason, the LunaTerra collapse was a historic event in the decentralized financial space, as it called into question traders' trust in cryptocurrencies: if even stablecoins are susceptible to collapse, then is any cryptocurrency truly safe?

Behavioral Patterns via Expansion and Decay. First, we analyze this event from the perspective of traders' market sentiment via expansion and decay measures of the temporal stablecoin network for the days surrounding the collapse. In Figure 6, four days after the collapse unfolded, on May 13, 2022, there was a substantial increase in decay and a decrease in expansion: a prime indicator of the despair behavioral pattern (§3.3). We can infer from this signal that a large majority of regular traders stopped trading by this time, either from the conversion or sale of any assets stored as UST out of the stablecoin ecosystem or simply due to uncertainty and inaction in response to the collapse. Following this cue, for approximately two weeks afterward, we see a consistent behavioral pattern of faith characterized by low expansion and low decay. During this period, few new traders entered or left the stablecoin network. The remaining traders still had faith that a large stablecoin such as UST could rebound and restore its peg with USD, and thus they refrained from engaging in any transactions. On the other hand, the decay and expansion values also indicate a sign of hopelessness, as the bulk of traders had already exited the network since the first signal of despair. We understand from this behavioral analysis that traders react with a delay, due to indecision, when a significant unannounced event occurs, and that there is a general trend of inactivity in the following period.

Why is this e-crime? We outline two reasons. Dumping of UST: On May 7th, large sums of UST were dumped, with 85 million UST swapped for 84.5 million USDC [39]. This massive dumping of UST contributed to its de-pegging and caused its value to drop significantly. Concealing past failures: The CEO of Terra, Do Kwon, was revealed to be a co-creator of the failed algorithmic stablecoin, Basis Cash [23]. The concealment of such information about the project's founder could mislead traders and hide potential risks.

SCPD vs. InnerCore. From Figure 7, we observe that SCPD places less emphasis on the critical event of UST's peg loss, whereas InnerCore more accurately depicts the impact of the collapse on the market relative to other days in the data time span. SCPD assigns an anomaly score to Sep 26, when USDC announced their plan to expand to five new blockchains,⁴ that is nearly twice the score assigned to May 4, the closest day to the LunaTerra collapse. In contrast, our stablecoin decay and expansion measures in Figure 6 clearly distinguish the impact of UST's peg loss on the stablecoin ecosystem from the less impactful events occurring on other days. This is evident from the pronounced decay peak on May 13, followed by a period of approximately two weeks of consistently low decay and expansion measures before a return to the more standard values seen on other days, clearly indicating that a significant event had transpired. This demonstrates that decay and expansion measures serve as a better indicator of the significance of an event on its corresponding network.
Identify Key Addresses. Before the LunaTerra collapse, it is reasonable to assume that traders responsible for the collapse would prepare for the anticipated negative consequences by exiting the UST network and entering another reliable stablecoin. In order to capture these transactions of traders converting between different stablecoins, we have included four stablecoins in our network along with UST. We focus on the unknown addresses that occurred most frequently as motif centers in InnerCores (defined in §3.4) on days immediately before the LunaTerra collapse, since they could have influenced the initial phase of the crash. Generally, a large amount of tokens transferred from one address to another is easily detectable due to the sheer volume. However, a trader trying to evade detection could produce multiple transactions with smaller volumes. Additionally, in a transaction where one token is exchanged for another, a series of multiple transfers can often arise for a single conversion transaction due to interactions with exchanges.⁵ Therefore, a trader is more likely to exhibit both selling and buying behaviors, making the trader a prime candidate as a 3-node motif center.

Ground Truth. Nansen (https://www.nansen.ai/) is a prominent blockchain analytics platform that frequently publishes comprehensive analyses of blockchain events, which are followed with great interest by the industry. Nansen.ai conducted a thorough analysis of the LunaTerra collapse in May 2022 and identified 11 important addresses that played central roles in the collapse [7]. We compare the addresses of interest detected by our InnerCore analysis using the centered-motif approach with those identified by Nansen.ai (Table 4) as the primary candidates for triggering the collapse.
Exchanges are an intermediary hub to facilitate transfers between traders.The addresses of exchanges are well-known for this reason, making them not very interesting in our context.In contrast, addresses that are not exchanges are mostly owned by traders and thus, the existence of such addresses and their edges in a network is a direct consequence of a trader's activity in the network.From Table 3, we observe that motif centers identified from InnerCores have a high ratio of non-exchange addresses to exchange addresses (≈99%).This shows the effectiveness of our method to identify potentially meaningful addresses in a network different from high-traffic exchange addresses.
In particular, we capture 9 of the 11 externally owned addresses (EoAs) in Table 4 identified by Nansen.ai that occurred as center addresses for our motif types (Figure 4) on days immediately leading up to the LunaTerra collapse. We notice that the NF-IAF score percentile ranks of these addresses are higher than those of other center addresses for the same motif type on the same day, indicating that these addresses were important traders contributing to the buy or sell behavior associated with the motif on that day. We surmise that certain EoAs found by our InnerCore method, coupled with centered-motif analysis, could have been responsible for the initial phase of the collapse.
Recall that in Figure 4, we defined motif centers $M_1$, $M_5$, and $M_{11}$ as exhibiting sell behavior, and motif centers $M_4$, $M_5$, and $M_6$ as exhibiting buy behavior. It is evident from Table 4 that every motif center on May 8, 2022, has at least one corresponding trader with an NF-IAF score percentile rank above 90. This suggests that addresses with greater NF-IAF percentiles exhibit stronger buy or sell behavior associated with the particular motif type on the day of the collapse. Specifically, we identify two traders, hs0327.eth and Heavy Dex Trader, as the most likely candidates for influencing the initial phase of the crash, since they had the greatest NF-IAF score percentile increases from May 7 to May 8, 2022, consistently across all their participating motif center types in comparison to other addresses. In addition, we identify the two traders masknft.eth and Oapital as key participants throughout the crash, since they are the two addresses with greater NF-IAF percentiles (above 90) occurring consistently across at least two motif types exhibiting sell behavior on days before, during, and after the crash. We identify Celsius as the least likely trader to have directly impacted the collapse, as it is the only address that had score percentiles < 90 across all three days.

K-Core vs. InnerCore. We notice that graph-$k$-core cannot find any of the 11 addresses indicated by Nansen.ai as prime candidates for triggering the initial phase of the LunaTerra collapse. In comparison, InnerCore with centered-motif analysis captures potentially anomalous buy and sell behaviors by identifying 9 of the 11 addresses.

Experiment 2:
Ethereum's Switch to PoS. Ethereum's transition from Proof-of-Work (PoW) to Proof-of-Stake (PoS) came with many benefits, including enhanced security for users and lower energy consumption. Together, these positives incentivized new traders to participate in the Ethereum network due to increased trust in the blockchain and lower barriers to entry. The transition occurred in two phases: the first phase was a preparatory hard forking of the blockchain into a PoS structure, and the second phase was a finalization of the upgrade.
A pattern of hope was expected, as the upgrade was highly anticipated due to the positives, transparency, and consistent updates regarding the official dates of the upgrade. From Figure 8, we indeed verify this behavioral pattern of hope, characterized by inflated expansion values coupled with relatively stable decay values, on three separate occasions. The first occurrence of hope is observed approximately a week before the first phase of the upgrade took place. It was around this time, the end of August 2022, that official news regarding the concrete dates of the upgrade was released to the public. We observe a surge of new hopeful traders participating in the Ethereum network and a significant dip in existing traders leaving the network in anticipation of the upgrade. The other two instances of hope are seen during the days immediately surrounding and between the two phases of the upgrade. These occurrences indicate that market sentiment during the upgrade was positive and that the overall transition of Ethereum to PoS was well received by traders.

SCPD vs. InnerCore. We next apply SCPD on the Ethereum transaction network to compare against our expansion and decay results. From Figure 9, we notice that SCPD less accurately captures the two phases of Ethereum's transition to PoS occurring on Sep 6 and 15, 2022. SCPD identifies Sep 4 and 16 as anomalous, which are two days before the first phase and one day after the second phase, respectively, of Ethereum's transition to PoS. In contrast, our expansion measures in Figure 8 more accurately capture the phases of Ethereum's transition to PoS by producing a peak on Sep 5, one day before the first phase, and on Sep 15, the same day as the second phase. InnerCore detects the second phase of the switch on the day of the event, whereas SCPD detects it only after it has occurred. Therefore, InnerCore expansion measures more accurately detect an anomaly on days when a significant event actually unfolded.

6.3.3 Experiment 3: USDC's Temporary Peg Loss.
On March 11th, 2023, a significant event unfolded in the stablecoin market as Circle's stablecoin, USDC, experienced a temporary loss of its peg, plummeting to a value of 87 cents.⁶ The abrupt collapse of Silicon Valley Bank, which held over $3 billion of Circle's reserves, triggered panic among traders. Fearing a collapse, many traders liquidated their USDC holdings and sought refuge in alternative stablecoins like MakerDAO's DAI.
By analyzing the expansion and decay measures surrounding the incident, we observe how traders responded differently over the course of this event. Figure 10 shows a sudden surge in expansion on March 11th, 2023, attributable to a wave of traders liquidating their USDC holdings in response to the stablecoin's all-time low value of 87 cents. In the three days following the temporary loss of USDC's peg, a distinct series of behavioral patterns emerged, characterized by alternating signals of despair, hope, and despair again, before eventually stabilizing. During this three-day period, Circle's reassurances regarding the recovery of lost reserves gradually restored trust among its traders. This is evident through the decreasing extent of despair patterns observed on the 12th and 14th.
In summary, traders' reactions were initially marked by panic and a rush to sell USDC, causing a surge in expansion. However, as Circle provided updates on their efforts to recover the lost reserves, a sense of hope permeated the market, leading to a decline in the extent of despair patterns. Ultimately, the stablecoin regained stability, with expansion and decay returning to typical levels.

SCPD vs. InnerCore. We also apply SCPD to the USDC network in order to compare with our decay and expansion results. From Figure 11, we observe that SCPD less accurately captures USDC's temporary peg loss occurring on Mar 11. SCPD identifies Mar 12 and 15 as anomalous, which are one day and four days, respectively, after USDC's peg loss. Conversely, our expansion measures in Figure 10 accurately capture USDC's peg loss by producing a prominent peak on Mar 11. InnerCore detects the temporary peg loss on the day of the event, whereas SCPD detects it only after it has occurred. Our InnerCore expansion measures thus more accurately indicate an anomaly on days when a significant event occurred.

Figure 1: A running example to compare between the graph-$k$-core and AlphaCore decomposition methods. The coreness of nodes according to graph-$k$-core decomposition is shown with different node colors, whereas AlphaCore is run with in-strength and out-strength as node features with a step size of 0.25. Different AlphaCores are shown using dotted boundaries.

Figure 2: Flowchart of our methodology for identification of significant days and subsequent anomalous addresses.

Figure 3: In a temporal graph (e.g., transaction network), changes in decay and expansion reflect varying levels of hope, despair, uncertainty, and faith in the asset being represented.

Figure 4: Five 3-node motifs exhibiting buy and sell behaviors. Nodes labeled C denote the center, where a center with an in-degree = 2 indicates buy behavior and an out-degree = 2 indicates sell behavior. Out of the 16 connected 3-node motifs (see Figure 1B in Milo et al. [45]), only the five given above (motifs 1, 4, 5, 6, and 11) contain a center node.

6.1.1 Datasets. Our experiments investigate the Ethereum transaction network and Ethereum stablecoin networks across three recent real-world events: the LunaTerra collapse, Ethereum's transition to Proof-of-Stake, and USDC's temporary peg loss. For each of our experiments, we construct a transaction network from the following datasets.

Ethereum Stablecoin Transaction Networks. We retrieve transaction data for the top five stablecoins based on market capitalization (USDT, USDC, DAI, UST, PAX) and WLUNA from the Chartalist repository [60]. The data pertains only to transactions conducted on the Ethereum blockchain; each transaction in the dataset corresponds to a transfer of the asset indicated by the contract address. However, the UST collapse event that we are studying involved another blockchain called Terra, with its own network and a cryptocurrency called Luna, acting as a parallel to ether on Ethereum. Terra issued a stablecoin named UST (also known as TerraUSD), which offered high interest rates to lenders and was pegged to the value of $USD1. Additionally, Terra's owners created an ERC-20 version of UST on the Ethereum blockchain, and a Wrapped LUNA (WLUNA) token was established to trade Luna tokens on Ethereum.

Figure 5: Comparison between running times of AlphaCore with the starting $\epsilon = 1.0$ and step size $s = 0.1$, and InnerCore with $\epsilon = 0.1$, on daily Ethereum transaction networks to return the InnerCore of depth < 0.1. An average of approximately 480,000 nodes (addresses) and 1 million edges (transactions) exist in each network. The average computation time of InnerCore is 4.06 seconds (max 8.1 s), which is approximately 0.10 times the average computation time of AlphaCore, 0.12 times the average computation time of the highest graph-$k$-core, and 0.14 times the average computation time of SCPD.

Figure 6: Stablecoin decay and expansion measures.On May 8 (shown with the vertical blue line), UST loses its $1 peg and falls to as low as 35 cents.

Figure 7: Stablecoin anomalous days identified by SCPD. Unlike the decay and expansion measures by InnerCore, SCPD places less emphasis on the critical event of UST's peg loss in Ethereum stablecoin networks than on other anomalies that occurred between Apr 3 and Oct 30, 2022.

Figure 8: Ethereum decay and expansion measures.The move of Ethereum to Proof-of-Stake mining took place in two stages, indicated by 2 vertical blue lines (Sep 6 and 15, 2022).An expansion peak on Sep 5, 2022 detects the anomaly one day before the first stage commenced.

Figure 9: Ethereum anomalous days identified by SCPD. Compared to decay and expansion measures by InnerCore, SCPD less accurately captures the two phases of Ethereum's transition to PoS occurring on Sep 6 and 15, 2022.

Figure 10: USDC decay and expansion measures. On Mar 11, 2023 (shown with the vertical blue line), USDC loses its $1 peg and falls to as low as 87 cents. An expansion peak detects the anomaly on the day the event transpires.

Figure 11: USDC anomalous days identified by SCPD. Compared to decay and expansion measures by InnerCore, SCPD less accurately captures USDC's temporary peg loss occurring on Mar 11, 2023.

Table 1 :
[9]mple node property functions. The InnerCore approach is also different from graph-$k$-core decomposition [9], where the outer cores are computed first before the higher $k$-core can be determined. As a result, InnerCore discovery is quite scalable and can be applied to very large graphs. Our experiments in §6 reveal that InnerCore discovery has a running time that is only one-tenth of that required for AlphaCore decomposition.
Scalability. Computing the InnerCore requires performing Cholesky decomposition on the covariance matrix at line 2 once, which has time complexity $O(p^3)$ for $p$ features. Node features need to be recomputed at each iteration of the while loop with a cost of $O(|V| \times \bar{d})$, where $\bar{d}$ is the average degree in the graph. There are at most $|V|$ iterations (number of nodes). In the worst case, the total time complexity is $O(p^3 + |V| \times \bar{d} \times |V|)$. However, since the neighborhood of a node can be sparse, the value of $\bar{d}$ is small. Moreover, since multiple nodes are removed in batches, the number of iterations is much smaller than $|V|$. For example, in a network with approximately 480,000 nodes and 1 million edges (§6.1.1), only 4 iterations on average are needed for $\epsilon = 0.1$.

Table 2: Occurrences and NF-IAF scores of nodes $v_1$, $v_2$, and $v_3$ across three days $d_1$, $d_2$, and $d_3$ in instances of motifs $M_4$ and $M_5$. $v_3$ does not appear for motif $M_4$ on any day, whereas $v_1$ does not appear on days $d_1$ and $d_2$ for motif $M_5$.

Table 3: Numbers of center addresses in motifs identified by our method (§3.4) that are known exchanges. The numbers represent the total counts per motif across all days.

Table 4: NF-IAF score percentile ranks of InnerCore motif centers matching addresses highlighted by Nansen.ai as having played key roles before (May 7), during (May 8), and after (May 9, 2022) the LunaTerra collapse. The percentile scores for individual addresses on a specific day of a particular motif center are determined relative to all addresses associated with the same motif center throughout all days in the data window. Motif centers $M_1$, $M_5$, $M_{11}$ exhibit sell behavior, while motif centers $M_4$, $M_5$, $M_6$ exhibit buy behavior. Addresses with percentiles ≥ 90 across at least one motif center type (given in red color) are considered impactful on a given day. Dashes indicate absence of the address as the motif center.
[Table body not recoverable. Panels: LunaTerra addresses on May 7, May 8, and May 9. Columns: Address, followed by motif centers 1, 4, 5, 5, 6, 11.]