Node centrality measures are among the most commonly used analytical techniques for networks. They have long helped analysts to identify “important” nodes that hold power in a social context, where damages could have dire consequences for transportation applications, or who should be a focus for prevention in epidemiology. Given the ubiquity of network data, new measures have been proposed, occasionally motivated by emerging applications or by the ability to interpolate existing measures. Before analysts use these measures and interpret results, the fundamental question is: are these measures likely to complete within the time window allotted to the analysis? In this paper, we comprehensively examine how the time necessary to run 18 new measures (introduced from 2005 to 2020) scales as a function of the number of nodes in the network. Our focus is on giving analysts a simple and practical estimate for sparse networks. As the time consumption depends on the properties in the network, we nuance our analysis by considering whether the network is scale-free, small-world, or random. Our results identify that several metrics run in the order of O(nlogn) and could scale to large networks, whereas others can require O(n2) or O(n3) and may become prime targets in future works for approximation algorithms or distributed implementations.
Often thought of as higher-order entities, events have recently become important subjects of research in the computational sciences, including within complex systems and natural language processing (NLP). One such application is event link prediction. Given an input event, event link prediction is the problem of retrieving a relevant set of events, similar to the problem of retrieving relevant documents on the Web in response to keyword queries. Since geopolitical events have complex semantics, it is an open question as to how to best model and represent events within the framework of event link prediction. In this paper, we formalize the problem and discuss how established representation learning algorithms from the machine learning community could potentially be applied to it. We then conduct a detailed empirical study on the Global Terrorism Database (GTD) using a set of metrics inspired by the information retrieval community. Our results show that, while there is considerable signal in both network-theoretic and text-centric models of the problem, classic text-only models such as bag-of-words prove surprisingly difficult to outperform. Our results establish both a baseline for event link prediction on GTD, and currently outstanding challenges for the research community to tackle in this space.
Interactions between humans give rise to complex social networks that are characterized by heterogeneous degree distribution, weight-topology relation, overlapping community structure, and dynamics of links. Understanding these characteristics of social networks is the primary goal of their research as they constitute scaffolds for various emergent social phenomena from disease spreading to political movements. An appropriate tool for studying them is agent-based modeling, in which nodes, representing individuals, make decisions about creating and deleting links, thus yielding various macroscopic behavioral patterns. Here we focus on studying a generalization of the weighted social network model, being one of the most fundamental agent-based models for describing the formation of social ties and social networks. This generalized weighted social network (GWSN) model incorporates triadic closure, homophilic interactions, and various link termination mechanisms, which have been studied separately in the previous works. Accordingly, the GWSN model has an increased number of input parameters and the model behavior gets excessively complex, making it challenging to clarify the model behavior. We have executed massive simulations with a supercomputer and used the results as the training data for deep neural networks to conduct regression analysis for predicting the properties of the generated networks from the input parameters. The obtained regression model was also used for global sensitivity analysis to identify which parameters are influential or insignificant. We believe that this methodology is applicable for a large class of complex network models, thus opening the way for more realistic quantitative agent-based modeling.