Original Research ARTICLE
The Core Literature of the Historians of Venice
- Digital Humanities Laboratory, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
Over the past decades, the humanities have been accumulating a growing body of literature at an increasing pace. How does this impact their traditional organization into disciplines and fields of research therein? This article considers history by examining a citation network among recent monographs on the history of Venice. The resulting network is almost connected; clusters of monographs are identifiable according to specific disciplinary areas (history, history of architecture, and history of arts) or periods of time (middle ages, early modern, and modern history); and a map of the recent trends in the field is sketched. Most notably, a set of highly cited works emerges as the core literature of the historians of Venice. This core literature comprises a mix of primary sources, works of reference, and scholarly monographs and is important in keeping the field connected: monographs usually cite a combination of a few core works and a variety of less well-cited ones. Core primary sources and works of reference never age, while core scholarly monographs are replaced at a very slow rate by new ones. The reliance of new publications on the core literature is slowly rising over time, as the field becomes increasingly varied.
The incessant expansion in the volume of scientific publications is a well-known phenomenon of modern science. A similar process is under way in the humanities, where a growing number of practitioners have to deal with an increasing amount of literature being published. From the point of view of bibliometrics, the discipline interested in the quantitative analysis of written publications, the humanities are mostly uncharted territory. Open questions include how they are organized intellectually, how the knowledge they produce accumulates, and how the increasing volume of publications is affecting the way scholars in the humanities conduct and publish their research. A variety of challenges make it more difficult to approach these questions in the humanities than in the sciences, among them the lack of citation data, which is especially acute for important publication typologies such as monographs.
History is, in this respect, a particularly compelling example. Often seen as a boundary discipline in between the social sciences and the humanities, history is characterized by deeply rooted intellectual traditions and a practically open-ended wealth of primary sources it relies upon, which determines its strong grounding in space and time. We focus on one of its many fields: the history of Venice. The city was at the forefront of the discipline during its first modern period, the nineteenth century, and since the 1950s has again been welcoming a great number of local and foreign scholars into its libraries and archives. Not unlike other fields of enquiry in history, for a few decades an encounter has been taking place between the local community of scholars, inclined toward strongly felt interests often approached by traditional methods, and the most dynamic international communities. In this respect, the recent historiography on Venice can be considered to be representative of the mixing of local and international perspectives occurring more broadly in modern historiography.
We consider here a specific bibliometric point of view by using a dataset of citations between monographs. Monographs are still the most important publication typology in most humanities disciplines, a fact that shows no sign of change in recent years. At the same time their “citation profile” is poorly understood, given that citation indexes such as the Web of Science and Scopus have only recently begun to index them. We thus start by considering a recent and representative set of monographs on the history of Venice, to map the current trends in the field and its intellectual landscape. We then explore the most cited monographs in this field, or its core literature, to qualify it and discuss its structural role, with the goal of uncovering how historians relate to their past literature in the modern, rapidly expanding fabric of historiography. Despite the focus on history, our results and discussion might be relevant more generally for other disciplines in the humanities.
2. State of the Art
Science has been growing at an increasingly rapid pace since the nineteenth century (Bornmann and Mutz, 2015), and so have the humanities, in due proportion. This growth raised worries among scientists about the effects of information overload (Bush, 1945), addressed to some extent by citation indexes. In the humanities, similar concerns have been voiced repeatedly in the form of criticisms and alarms about the overspecialized nature of new research, a consequence of the explosion of contributions and novel avenues taken by scholars, all in the absence of reliable and more effective ways to navigate previous literature (Tyrrell, 2005). Historians in some cases even proved reluctant to acknowledge the need to update their research methods and publication behaviors despite the digital turn (Hitchcock, 2013). It would thus be most interesting to understand the process of knowledge accumulation in the humanities and explore how the current organization and growth of science influences it. Finally, the need to advance our understanding in view of “a bibliometrics for the humanities” must be put in context with the growing demand for the quantitative evaluation of scientific output (Hammarfelt, 2016). It goes without saying that the blind application of metrics developed for the sciences might not be at all appropriate for the humanities.
The humanities possess a set of characteristics that makes it more challenging to acquire and use citation data to study their intellectual organization and communication practices. Among these features we can find the importance of the national and local dimensions, the variety of publication typologies with a preference for monographs, the slow or absent aging of their sources, the richness of citation semantics (and syntaxes), the individual endeavor as the preferred way to conduct research, the variety of sources and topics being investigated, and the resulting less focused and wider information retrieval behavior (Garfield, 1980; Hicks, 1999; Nederhof, 2006; Huang and Chang, 2008; Hellqvist, 2009; Linmans, 2009). Partially as a consequence, it is more difficult to build comprehensive citation indexes in the humanities, a condition that hindered the development of bibliometric research in this area for a long time (Ardanuy, 2013). This remains the case today, despite slow progress (Hammarfelt, 2016).
2.1. Mapping the Humanities
Science can be conceptualized in a variety of ways; among them, it can be viewed as a process of accumulation of new knowledge. Such a conceptualization leads naturally to so-called maps of science, or attempts to localize and relate, by relative positioning, some entities of interest, such as publications, authors or journals, using some relations among them, for example, citations (Börner and Scharnhorst, 2009). As much as the mapping of the sciences is a well-developed area of research (Börner, 2010), the humanities are often omitted or sidelined (Klavans and Boyack, 2009): “[their] fine-structures […] have been black-boxed and insufficiently unpacked; the available studies focused mainly on their positions relative to the social and natural sciences.” (Leydesdorff et al., 2011). In the early decades of bibliometrics, the humanities were the object of general descriptive citation studies (Hérubel, 1994). More recently, we can find some examples of mapping attempts at a general scale or at the scale of a single discipline or field, which we consider here only if they directly or tangentially discuss history.
In an early effort, Hérubel and Goedeken (2001) analyzed the French journal Annales using the A&HI (Arts and Humanities Citation Index), assessing its international reach, despite the preponderant French share of authors, as well as its capacity to rely on a broad array of literature from a variety of fields. Leydesdorff and Salah (2010) instead analyzed two journals in the arts, Leonardo and the Arts Journal, using data from the A&HI and considering their positioning among all A&HI journals. The authors found that both journals cite mostly within the span of their original domain, but are cited widely outside of it, while for comparison a small set of articles in the digital humanities was found to cite widely but to be cited only by a narrower community, resembling the sciences with respect to its “being-cited patterns.” Leydesdorff et al. (2011) provided the largest attempt to date to map all the humanities using the whole A&HI dataset for the year 2008. Perhaps the most salient finding is a coherent set of twelve dimensions (latent factors) clearly organized in more or less proximal areas of research, among which we find classics, religion, and archeology; linguistics and the history and philosophy of science; literature and history; arts; music. A different perspective is taken by Zuccala et al. (2015), who attempt to rank scholarly book publishers in historiography using citations to books from articles indexed in Scopus. The resulting map of publishers shows a strong polarity toward prestigious English or American publishers, with only some topical organization. Finally, another aspect of the humanities, which has barely started to be explored, is mapping the use of primary sources, attempted, for example, by Romanello (2016) considering L’Année Philologique in the domain of Classics.
2.2. Monographs and the Core Literature
Almost no previous work has considered monographs, mainly due to the lack of data, only recently made available in the Web of Science or Scopus (Mingers and Leydesdorff, 2015). Yet one of the main features of the humanities is their reliance on monographs, which are still the main publication channel in most humanities disciplines (Thompson, 2002; Knievel and Kellsey, 2005; Larivière et al., 2006; Williams et al., 2009), and specifically in historiography (Jones et al., 1972). As a consequence, the most cited literature in any field within the humanities should essentially include monographs (Hicks, 1999), which is indeed the conclusion reached by some previous studies (Lindholm-Romantschuk and Warner, 1996; Hammarfelt, 2011, 2012), even if others struggled to find a set of core works in specific fields (McCain, 1987; Thompson, 2002; Nolen and Richardson, 2016). The contrasting results provided by previous literature can be explained by a set of considerations that relate to the citation patterns of the humanities more generally. The humanities have been found to undergo an increase in interdisciplinary citing of sources in recent times (Leydesdorff and Salah, 2010; Hammarfelt, 2011), which is also coupled with a growing international projection (Hicks, 1999; Engels et al., 2012). This might not help a core literature to emerge, as “a less demarcated discipline lacking a central core is heavily influenced by other research fields and therefore more interdisciplinary in referencing practices” (Hammarfelt, 2016). Furthermore, publications in the humanities usually accumulate citations at a slower pace (Nederhof, 2006; Linmans, 2009). It appears clear that a thorough exploration of the recent trends in a humanities field and of the role of the core literature should consider citations to monographs either as source or non-source items (i.e., citations from and to, or just to monographs) (Hammarfelt, 2011).
2.3. The Historiography on Venice
The investigation of intellectual structures and core literatures in historiography might be a particularly compelling case to consider, one where a by now rich tradition of research questions, possible answers, as well as abundant if scattered evidence, is in constant dialog with new perspectives and avenues of research put forward by a community of practitioners that is growing both in numbers and in international outlook. The case of Venice is no exception in this respect. Relying on 200 years of erudite scholarship, just to consider modern times (Dursteler, 2013), and often mixed with political or ideological motivations (Infelise, 2002; Povolo, 2002), the most recent historiography on Venice is inevitably conditioned by its past. At the same time, and like many other fields in history and beyond, Venice saw a surge in internationalization during the past few decades, effectively managing to connect its local community to other, mostly English- and French-speaking ones (Grubb, 1986; Davidson, 1997; Dursteler, 2013). As a consequence, studies proliferate and new avenues of research are being opened with increased frequency. Venice can effectively be considered as a playground, representative of the most recent trends in historiographical research (Horodowich, 2004). In this context, it appears not at all trivial to ask how the intellectual landscape of the historians of Venice is organized, given the novelty of new scholarship, but also its need to dialog with the past to forge its identity (Davidson, 1997).
As much as it would be important to have a clear map of the humanities, and a good understanding of their knowledge accumulation processes, we are still far from it. It is especially problematic should we want to understand how humanists are coping with the growing size of their literature. In this context it might be useful to consider specific case studies at a more granular level and explore dimensions previously neglected such as the role of monographs.
3. Methods and Data
There are perhaps two main challenges in analyzing intellectual landscapes in the humanities, as mapped by citations: identifying a representative sample of the literature of the field and acquiring its citation dataset. In the absence of comprehensive book citation indexes, the only viable option is to use the resources available from research libraries and the advice of domain experts to delineate a first sample of works, extract their citations and then proceed to enlarge the corpus iteratively. For the purpose of this article, a citation dataset among monographs on the history of Venice is used (Romanello and Colavizza, 2017), whose details are given in Colavizza et al. (2017). A set of monographs was selected with the aim of covering in-demand works and representing recent trends in the field, including the tightly connected areas of the histories of art and architecture. Different means were used to identify these monographs, among which the shelving strategy of the library (selecting works in rapid consultation shelves), catalog classification, and scholarly bibliographies. Furthermore, only monographs with reference lists were considered to extract their references, thus there is no ambition of comprehensiveness. To be sure, this selection did not entail specific biases by publisher or date of publication. As a consequence of these choices, the dataset only considers monograph-to-monograph citations, irrespective of the frequency of in-text references, therefore resulting in an unweighted directed citation network. The exclusion of journal articles is partially justified by the fact that they likely do not become part of the core literature (see, e.g., Hammarfelt, 2011).
The dataset comprises 700 citing monographs and 37,362 cited monographs. Of the citing monographs, 264 are never cited in turn. The total number of individual citations (citing to cited) is 73,268, or slightly more than 100 for every citing monograph. The distribution of the number of citations made by these 700 monographs is given in Figure 1A for reference. Values mostly fall between 20–30 and 300, with some more extensive but rare reference lists. The distribution of the received citations is, instead, more skewed, as shown in Figure 1B. In particular, 27,109 works are cited only once, and just 769 ten or more times. We consider this last group of monographs to be the core literature, which will be discussed in what follows.
Figure 1. The distribution of the given and received citations (or the out and in degrees of the directed citation network). The distribution of the number of received citations is particularly skewed. Only 769 works are cited ten or more times, and constitute the core literature. Please note the scales in two plots differ significantly. (A) Number of given citations from citing works (out degree). (B) Number of received citations by cited works (in degree). The y axis is on log scale.
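As a minimal sketch of how such figures can be derived from a list of citation pairs, the following Python snippet computes the given and received citations and extracts the works cited ten or more times. The toy `citations` list and the function name `degree_summary` are illustrative assumptions; they are not part of the released dataset or code.

```python
from collections import Counter

def degree_summary(citations, core_threshold=10):
    """Summarize an unweighted directed citation list of (citing, cited) pairs."""
    # Deduplicate pairs: repeated in-text references count only once.
    pairs = set(citations)
    out_degree = Counter(citing for citing, _ in pairs)  # citations given
    in_degree = Counter(cited for _, cited in pairs)     # citations received
    # Works cited at least `core_threshold` times form the core literature.
    core = {work for work, n in in_degree.items() if n >= core_threshold}
    # Citing works that never appear as a cited work.
    never_cited = set(out_degree) - set(in_degree)
    return out_degree, in_degree, core, never_cited

# Hypothetical toy data: twelve citing monographs all cite "core_work",
# and the first one also cites "rare_work".
citations = [(f"m{i}", "core_work") for i in range(12)] + [("m0", "rare_work")]
out_degree, in_degree, core, never_cited = degree_summary(citations)
```

In this toy example only `core_work` passes the threshold of ten received citations, mirroring the definition of the core literature used here.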
The age of the cited monographs is given in Figure 2B. The age of some cited works is very considerable, with publications dating back to the Renaissance. Some turning points in the historiography on Venice also emerge, notably the end of the Republic of Venice in 1797 and the two world wars, which determined a reduction in the number of new publications, in the latter case common to all domains of science (De Solla Price, 1965). Besides, the volume of cited literature rises considerably moving closer in time, another phenomenon in common with the sciences. The distribution of the age of citing monographs is, instead, concentrated for the most part between the years 1980 and 2013, as shown in Figure 2A. The citing group thus represents recent historiography and is relatively up to date, at least by humanities standards, as intended.
Figure 2. The distribution of the age of the citing and cited works, respectively. Citing works mainly concentrate from 1980 to 2013, while cited works essentially span from the Renaissance to the present day. (A) Age of citing works. (B) Age of cited works.
The languages and places of publication of the citing and cited works are given in Table 1. Italian is by and large the most represented language, followed by the main Western languages. The dataset thus strongly represents local as well as international historiography on the topic and confirms the tendency of scholarship to rely heavily on research published in national languages.
A note on terminology. Networks commonly follow the terminology of graph theory and are thus made of nodes (or vertices) connected by edges. In our case, nodes are monographs, and edges are citation relations among them. Edges can also be weighted to distinguish between stronger and weaker relations. In this article, three kinds of citation networks will be used, which are all often used to map different aspects of the intellectual structure of a field or discipline.1 The most basic one is the directed, unweighted citation network where every node is a monograph, and an edge exists from one node to another if the former cites the latter. This network comprises 37,200 nodes and 68,748 edges. Given this representation, two other networks can be constructed. The bibliographic coupling network is a weighted, undirected network where every node is a citing monograph, and every edge represents the overlap of references between the two monographs (Kessler, 1963). For example, if two monographs both refer to the same three monographs, they will be connected by an edge of weight 3. This network comprises 673 nodes and 87,419 edges and accounts for how recent literature defines an intellectual landscape according to its use of the literature. The co-citation network is a weighted, undirected network where every node is a cited monograph, and an edge is established between two nodes if the two are cited together in the same reference list (Marshakova Shaikevich, 1973; Small, 1973). The weight of the edge is given by the number of times the two monographs were cited together in different reference lists. A minimum weight of 2 is established as a threshold, to filter out monographs cited only once as well as otherwise weak and possibly episodic relations. This last network comprises 9,061 nodes and 288,782 edges among them and accounts for the way the literature of the field was used by recent scholarship.
Recent trends in the literature can be mapped by bibliographic coupling, while the literature they rely on, or “intellectual base,” can be mapped using co-citation networks (Persson, 1994; Hammarfelt, 2011).
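The two derived networks can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code released with the article: the input format (a dictionary mapping each citing work to the set of works it cites) and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def bibliographic_coupling(reference_lists):
    """Edge weight between two citing works = number of references they share."""
    weights = {}
    for a, b in combinations(sorted(reference_lists), 2):
        shared = len(reference_lists[a] & reference_lists[b])
        if shared > 0:
            weights[(a, b)] = shared
    return weights

def co_citation(reference_lists, min_weight=2):
    """Edge weight between two cited works = number of lists citing both."""
    counts = Counter()
    for refs in reference_lists.values():
        # Every unordered pair in one reference list is co-cited once.
        counts.update(combinations(sorted(refs), 2))
    # Drop weak, possibly episodic relations below the threshold.
    return {pair: w for pair, w in counts.items() if w >= min_weight}

# Hypothetical reference lists: citing work -> set of cited works.
reference_lists = {
    "A": {"x", "y", "z"},
    "B": {"x", "y", "z", "w"},
    "C": {"w"},
}
coupling = bibliographic_coupling(reference_lists)
cocited = co_citation(reference_lists)
```

Here `A` and `B` share three references and are therefore coupled with weight 3, as in the example given in the text, while `w` drops out of the co-citation network because none of its pairs reaches the minimum weight of 2.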
4. The Recent Historiography on Venice
The starting point of the analysis is the topology of the bibliographic coupling network. The monographs in the sample were selected considering a broad definition of historiography, also including the histories of arts and architecture. It is therefore important to assess to what extent the citation network at the monograph level allows us to characterize the field as a whole (i.e., whether the network is connected or not) and to identify its subfields and topics of interest through clustering (i.e., community detection).
To evaluate the results of any clustering, all nodes (citing monographs) have been classified with a unique keyword corresponding to their general subfield (history, arts, or architecture), and with two groups of keywords (every monograph can receive no, one, or more keywords for these two groups, as appropriate) for topics and periods under consideration. This classification has been performed manually by experts. It relies on the Dewey and subject classifications of the Italian National Catalog, which could not be used directly because their granularity was either too generic or too specific for the dataset at hand. It should be noted that manual classification of publications, in itself important to interpret results, is perhaps the least scalable part of the whole study. The resulting classification is made available for inspection (see Data Availability). There are 419 monographs classed under history, 129 under arts, and 125 under architecture. The 42 keywords for topics include, for example, “social” history (86 monographs), “politics” (80), “individuals” (62), “churches” and religious institutions (52), and “urban” life and architecture (45). In total, 29 monographs could not be classed with topic keywords. The keywords for the periods under consideration are the Renaissance (234), eighteenth century (161), seventeenth century (149), the middle ages and late ancient period (122), nineteenth century (85), and more recent times (20). In total, 190 monographs could not be clearly classified by period. It is clear at a glance that the historiography on Venice has a strong focus on the early modern period, especially the Renaissance, with less attention given to the early middle ages (likely due to the lack of sources) and to the modern period (likely in part due to the over-abundance of sources), and that it covers a great variety of topics, both long established and recently emerged.
This classification according to the library catalog can both provide a direct clustering of monographs into communities and serve as a way to qualify, but not evaluate, the results of automated clustering using citation data. In particular, frequent keywords and periods can help qualify clusters much in the same way the most significant words in a topic model can help assign a label to it. It must be stressed that the two perspectives, namely library classification and clustering based on citations, need not coincide.
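Qualifying clusters by their most frequent keywords can be sketched as follows. The cluster assignments, the keyword lists and the function name `describe_clusters` are hypothetical and serve only as an illustration of the idea.

```python
from collections import Counter, defaultdict

def describe_clusters(cluster_of, keywords_of, top_n=3):
    """Return the `top_n` most frequent keywords for every cluster."""
    counts = defaultdict(Counter)
    for work, cluster in cluster_of.items():
        # A monograph may carry zero, one, or several keywords.
        counts[cluster].update(keywords_of.get(work, ()))
    return {c: counter.most_common(top_n) for c, counter in counts.items()}

# Hypothetical assignments, for illustration only.
cluster_of = {"m1": 0, "m2": 0, "m3": 1}
keywords_of = {
    "m1": ["Renaissance", "politics"],
    "m2": ["Renaissance", "social"],
    "m3": ["middle ages"],
}
labels = describe_clusters(cluster_of, keywords_of, top_n=1)
```

The resulting keyword counts only suggest a label for each cluster; as noted above, they do not measure how good the clustering is.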
Several methods exist for the detection of clusters of nodes (communities) in networks (Fortunato and Hric, 2016), and their application to citation networks has been extensively explored (Šubelj et al., 2016). One particularly popular method relies on modularity maximization (Newman and Girvan, 2004), for which a fast implementation exists, known as the Louvain algorithm (Blondel et al., 2008), which has also been extended to incorporate a resolution parameter, helping to tune the size and thus the resulting number of clusters (Reichardt and Bornholdt, 2006). This method has its peculiarities (for example, it is not deterministic, thus different runs can yield different results) and shortcomings (Fortunato and Barthelemy, 2007; Good et al., 2010), thus it is important to compare it with other methods or use external information to interpret any clustering result. Yet, modularity maximization gave by far the most interpretable results on the dataset under analysis here, where several other methods even failed to distinguish any structure in the network, mainly due to its density. The interested reader can experiment using the code released with this article.2 In the absence of further specification, when a clustering solution is discussed it is one of the possible similar results from modularity maximization, with the resolution parameter set to 1.
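To make the objective concrete, the sketch below computes the modularity of a given partition of a weighted, undirected network, with a resolution parameter. It only evaluates the quantity that methods such as the Louvain algorithm try to maximize; it is not an implementation of Louvain, nor the code released with the article. The function name, the input format and the toy graph are assumptions.

```python
from collections import defaultdict

def modularity(edges, community_of, resolution=1.0):
    """Modularity of a partition of a weighted, undirected graph.

    edges: iterable of (u, v, weight) with u != v, each edge listed once
           (at least one edge is assumed).
    community_of: dict mapping every node to its community label.
    """
    total_weight = 0.0                 # m: sum of all edge weights
    internal = defaultdict(float)      # weight of edges inside each community
    degree_sum = defaultdict(float)    # summed weighted degree per community
    for u, v, w in edges:
        total_weight += w
        degree_sum[community_of[u]] += w
        degree_sum[community_of[v]] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    # Q = sum over communities c of [ L_c / m - resolution * (d_c / 2m)^2 ]
    return sum(
        internal[c] / total_weight
        - resolution * (degree_sum[c] / (2 * total_weight)) ** 2
        for c in degree_sum
    )

# Hypothetical toy graph: two triangles joined by a single bridge edge.
edges = [("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
         ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
         ("c", "d", 1)]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}
q_split = modularity(edges, split)
q_single = modularity(edges, single)
```

Separating the two triangles yields a higher modularity than placing all nodes in one cluster. Raising `resolution` above 1 penalizes large clusters more strongly, which is how the parameter tunes the number of clusters.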
At large scale, and despite considering quite different subfields such as arts and history, the network is almost connected: only two nodes are not part of its giant component (the largest set of connected nodes; two nodes are connected if a path exists between them). The network is also very dense: it contains almost 40% of all possible edges among nodes, entailing that a strong overlap exists across the reference lists of historians. Such a well-connected network inevitably brings some difficulty in finding clusters of nodes. A comparison of a clustering solution with the general categories from the library catalog is given in Figure 3. The labels of the clusters in Figure 3B were assigned by inspecting the general categories, keywords and periods of the monographs within each cluster. With respect to this dataset, the field appears intellectually organized according to two main subfields, namely, history on one side and arts and architecture on the other, as well as over the dimension of time, according to the main periods of interest for the historians of Venice. Most notably, the history of the early modern Republic, especially its Renaissance period, is the focal point of attention by number of publications. The middle ages receive less attention, and the nineteenth century and beyond much less. Borderline smaller areas of activity, such as the applied arts, emerge as well at this level.
Figure 3. Different clustering of citing monographs (giant component of the bibliographic coupling network) according to catalog metadata (left) and citation information (right). At this level, citation information captures general categories and the periods under consideration, as well as smaller subfields such as applied arts (yellow on the right). This visualization was made with Gephi 0.9.1 (Bastian et al., 2009), using Force Atlas 2 with default parameters but for LinLog mode, scaling 0.5 and edge influence 0.8. Edges are omitted: the network is connected. It is important to note that the disposition of the nodes is related to but is not determined only by clusters found by maximizing modularity (Jacomy et al., 2014). (A) The clusters according to catalog metadata at the highest level. Red/grey: history, blue/darker grey: history of architecture, green/lighter grey: history of the arts. (B) The clusters according to modularity maximization. Red: early modern history, green: arts and architecture, blue: history of the middle ages, cyan: history of the nineteenth century, yellow: applied arts (e.g., textile).
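The density and the giant component mentioned above can be computed as in the following minimal sketch. The function names and the toy graph are illustrative assumptions; the node and edge counts passed to `density` are those reported earlier for the bibliographic coupling network.

```python
from collections import deque

def density(n_nodes, n_edges):
    """Fraction of all possible undirected edges that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

def giant_component(nodes, edges):
    """Largest set of nodes mutually reachable through undirected edges."""
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    seen, largest = set(), set()
    for start in neighbors:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:  # breadth-first search from `start`
            for nxt in neighbors[queue.popleft()]:
                if nxt not in component:
                    component.add(nxt)
                    queue.append(nxt)
        seen |= component
        if len(component) > len(largest):
            largest = component
    return largest

# Counts reported for the bibliographic coupling network: 673 nodes, 87,419 edges.
coupling_density = density(673, 87_419)

# Hypothetical toy graph: node "d" is isolated and falls outside the giant component.
toy_giant = giant_component(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

With the reported counts the density evaluates to roughly 0.39, consistent with the statement that the network contains almost 40% of all possible edges.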
Starting from this most general situation, finer-grained clusters can be determined, either by tuning the resolution parameter or by further clustering an already identified cluster. By further clustering the largest history cluster in Figure 3B (in red), a set of smaller clusters emerges, which we might consider as broad areas of interest of the recent literature. Four clusters relate to the Renaissance period, from different perspectives:
1. Aspects related to the political history and the elites, touching on foreign relations and the Venetian empire. An example is Donald Queller’s “Il patriziato veneziano: la realtà contro il mito” (The patriciate of Venice: reality vs myth).
2. Social and religious history, also touching upon censorship, gender, and culture. Examples are Satya Datta’s “Women and men in early modern Venice: reassessing history” and Muir’s “Civic ritual in Renaissance Venice.” Most of the publications in this cluster are quite recent, after the year 2002.
3. The government of the city and its Mainland state. For example, Claudio Povolo’s “L’intrigo dell’onore: poteri e istituzioni nella Repubblica di Venezia tra Cinque e Seicento” (The intrigues of honor: powers and institutions in the Republic of Venice between the sixteenth and seventeenth century).
4. Economic history. E.g., Richard Rapp’s “Industry and economic decline in seventeenth-century Venice.” This is a quite old cluster dating back mostly to the 1970s and 1980s.
Another cluster is instead made up of works devoted to the eighteenth century, with a mix of perspectives spanning from politics and the role of elites, to the reform of government or the social and cultural aspects of the period. An example is given by Volker Hunecke’s “Il patriziato veneziano alla fine della Repubblica: 1646–1797” (The Venetian patriciate at the end of the Republic). With few exceptions, all clusters include relatively recent works (the 1990s and 2000s for the most part).
The bulk of the historiography on early modern Venice is a mix of old topics often reconsidered under new perspectives. Most of these areas of interest of the recent literature have a long tradition of study among historians of the city (Grubb, 1986; Davidson, 1997; Dursteler, 2013). In fact, their emergence in the network signals a continuity in the use of the literature. What is remarkable is instead the relative lack of recent efforts in the study of the economic history of Venice, at least within the dataset. The social and economic history of Venice had its heyday during the 1960s and 1970s, mainly due to the influence of the works of Fernand Braudel and the École des Annales, but it has since become less important. The most notable novelty in this recent historiography, and something already discussed in the literature (Horodowich, 2004), is the surge in the number of important studies dealing with a new social history, marking “a shift in interest from order to disorder, from orthodoxy to dissent, from the center of power to the broader social context” (Davidson, 1997). Examples in this respect are the relatively new trends of gender and women’s history.
With respect to the histories of arts and architecture cluster, the division is simpler and historically more stable: architecture broadly organizes itself into an urban dimension, where palaces and the city more generally are considered, and a dimension related to religious buildings, especially churches and convents. The arts are instead largely dominated by the study of individual painters and their schools, with a division by period into the Renaissance and the later eighteenth century and beyond. Interestingly, in this later period, more attention is given to private collecting, while in the previous period applied arts such as jewelry have an influence due to the proximity with the middle ages, when painting played a subordinate role.
The middle ages cluster generally orbits around two dimensions too: the history of the establishment of the Venetian empire, with strong focus on its commercial as well as political dimensions, and the history of the urban development of the city and its relation to the lagoon and its natural environment. The works of John Julius Norwich (“Venice: the rise to empire”) and Gerhard Rösch (“Venezia e l’Impero,” Venice and the Empire) feature among the former group; Wladimiro Dorigo’s “Venezia romanica: la formazione della città medioevale fino all’età gotica” (Romanesque Venice: the formation of the medieval city until the Gothic period) is the most important work in the latter, and one that could have fitted into the architecture cluster as well.
Finally, the nineteenth century cluster is, in its more recent works, mainly devoted to the social, cultural, and political history of the city after the fall of the Republic. Another, older cluster deals with the events of the year 1848, when a short-lived Republic was established between two periods of Austrian domination. All these considerations evidently apply only with respect to the sample under consideration.
Although a relatively clear landscape of the recent historiography on Venice emerges from the citation network at the level of monographs, it must be noted that the quality of the clustering, as measured by the modularity of the partitions as well as by direct inspection, rapidly degrades as the number of clusters increases. The bibliographic coupling network among citing monographs is very well connected and effectively provides a broad overview of the field, without allowing for a fine-grained identification of small clusters, whose emergence might require further information, such as citations to journal articles and primary sources. This specific citation landscape relies for its tight organization on the use of previous literature, or the intellectual base of the historians of Venice. It is possible to consider the previously introduced co-citation network to explore how such literature has been used, and to relate it to specific clusters of citing monographs. It will soon become evident that a tiny part of the literature, its core, plays an important role as a shared reference for the historians of Venice, within and between clusters.
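For readers less familiar with the two networks, the following minimal sketch shows how bibliographic coupling and co-citation edges can be derived from reference lists. It is an illustration only, not the code used for this article: the toy reference lists, the function names, and the `min_weight` filter are assumptions introduced here (the filter mirrors the threshold of two applied to the co-citation network in the next section).

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: citing monograph -> set of cited works.
references = {
    "M1": {"Sanudo", "Cicogna", "Romanin", "A"},
    "M2": {"Sanudo", "Romanin", "B"},
    "M3": {"Sanudo", "Cicogna", "C"},
}

def bibliographic_coupling(refs):
    """Edge weight between two citing works = number of shared references."""
    edges = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared > 0:
            edges[(a, b)] = shared
    return edges

def co_citation(refs, min_weight=1):
    """Edge weight between two cited works = number of works citing both."""
    edges = Counter()
    for cited in refs.values():
        for a, b in combinations(sorted(cited), 2):
            edges[(a, b)] += 1
    # Drop edges below the (assumed) minimum weight.
    return {e: w for e, w in edges.items() if w >= min_weight}

def density(n_nodes, n_edges):
    """Density of an undirected simple graph."""
    if n_nodes < 2:
        return 0.0
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

coupling = bibliographic_coupling(references)
cocit_filtered = co_citation(references, min_weight=2)
nodes = {v for edge in cocit_filtered for v in edge}
print(coupling)
print(cocit_filtered, density(len(nodes), len(cocit_filtered)))
```

In this toy case only the pairs involving the frequently cited works survive the filter, which is the same mechanism that makes the filtered co-citation network sparse.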
5. The Intellectual Base and Its Core
The co-citation network, filtered to include only edges with a weight of two or more, is again an almost connected graph (only 73 out of 9,061 nodes are not part of the giant component). Furthermore, its density is much lower than that of the bibliographic coupling network, at 0.007. This follows directly from the fact that most of the literature is cited only a few times. To highlight the role of the core literature in the co-citation network, three centrality measures at the node level are considered (see note 3):
• Betweenness: accounts for the capacity of a node to bridge different areas of the network, which would be less well connected without it.
• PageRank: accounts for the importance of a node with respect to it being connected to other important nodes.
• Local clustering coefficient: the proportion of pairs of neighbor nodes that are in turn connected to each other. A neighbor node is one directly connected to the node of interest. If all the neighbors of a node are connected to one another, its local clustering will be 1. It gives an idea of how densely connected the local neighborhood of a node is.
These three measures work together: for the core literature, betweenness and PageRank are expected to be high and local clustering low. Intuitively, this would mean that the core literature connects different areas of the network, that is, groups of works that have been cited by different communities (this entails high betweenness and low local clustering), and is particularly well connected internally (high PageRank) because core works are frequently cited together.
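The expected behavior of the three measures can be checked on a toy graph. The sketch below is a plain-Python illustration under stated assumptions, not the igraph-based analysis of this article: the graph (two fully connected groups joined only through a hypothetical "core" node), the unweighted treatment of edges, and the PageRank damping factor of 0.85 are all choices made here for the example.

```python
from collections import deque
from itertools import combinations

def build_graph(edges):
    """Undirected adjacency sets from a list of edges."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def local_clustering(adj, v):
    """Share of pairs of neighbors of v that are themselves connected."""
    nb = list(adj[v])
    k = len(nb)
    if k < 2:
        return 0.0
    links = sum(1 for x, y in combinations(nb, 2) if y in adj[x])
    return 2 * links / (k * (k - 1))

def betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search counting shortest paths
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice

def pagerank(adj, damping=0.85, iterations=100):
    """Power iteration; assumes every node has at least one neighbor."""
    n = len(adj)
    rank = dict.fromkeys(adj, 1 / n)
    for _ in range(iterations):
        new = dict.fromkeys(adj, (1 - damping) / n)
        for v, neighbors in adj.items():
            share = damping * rank[v] / len(neighbors)
            for w in neighbors:
                new[w] += share
        rank = new
    return rank

group_a, group_b = ["a1", "a2", "a3"], ["b1", "b2", "b3"]
edges = (list(combinations(group_a, 2)) + list(combinations(group_b, 2))
         + [("core", v) for v in group_a + group_b])
adj = build_graph(edges)
bc, pr = betweenness(adj), pagerank(adj)
cc = {v: local_clustering(adj, v) for v in adj}
print(bc["core"], cc["core"], cc["a1"], max(pr, key=pr.get))
```

In this toy graph the "core" node lies on every shortest path between the two groups (betweenness 9, all other nodes 0), has the highest PageRank, and has a local clustering of 0.4 against 1.0 for the other nodes, which matches the pattern described above.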
Figure 4 displays the giant component of the co-citation network and highlights the core literature within it for reference (in red/dark gray). With this picture in mind, it is possible to appreciate how the intuitive role of the core finds confirmation using the three proposed measures. In particular, Figure 4 shows that the core literature has high betweenness and PageRank, meaning that it bridges different areas of the network. The core also has a lower local clustering coefficient, because it helps connect groups of sources that are densely connected within the group but not across groups. The intuitive explanation is that groups of sources here represent the reference lists of a few citing monographs, which are fully connected among themselves but are connected with other such groups only through the core literature.
Figure 4. The core literature highlighted in the giant component of the co-citation network, the betweenness and PageRank centralities that are higher for the core, and the local clustering that is instead lower for the core. This visualization uses Gephi’s Force Atlas 2 with LinLog mode and edge weight of 3.5. (A) The core literature (red/dark grey) and the rest (cyan/light grey). (B) Betweenness centrality is higher for darker nodes (i.e., mostly the core). (C) PageRank is higher for darker nodes (i.e., mostly the core). (D) Local clustering coefficient is higher for darker nodes (i.e., not the core).
Visual intuitions find confirmation using correlation coefficients, as shown in Table 2. Perhaps interestingly, and although the core behaves as expected, the correlation coefficients are not high enough to warrant too narrow an explanation. The number of received citations in the directed network certainly determines the important role of the core in bridging groups of literature that would otherwise be barely connected, but this role is not played exclusively by the core. The core likely plays the most prominent role in this respect, but other works also help keep the network connected. It should be clear by now that using a threshold on the number of received citations is but one method to define the core literature. It could also have been identified, with similar but not identical results, using the properties of the co-citation network, e.g., according to centrality measures such as PageRank or betweenness. This was indeed one of the purposes for the introduction of co-citation networks in the first place (Small, 1973). Different aspects of the core literature can, in this way, be put into play, besides its popularity (number of received citations).
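The difference between the two ways of defining the core can be made concrete with a small sketch. The citation counts and PageRank scores below are invented for illustration, the function names are hypothetical, and the Jaccard index is only one possible way to compare the two resulting sets; none of this reproduces the analysis behind Table 2.

```python
def core_by_threshold(citations, threshold=10):
    """Core = works with at least `threshold` received citations."""
    return {w for w, c in citations.items() if c >= threshold}

def core_by_rank(scores, size):
    """Core = the `size` works with the highest centrality score."""
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return set(ranked[:size])

def jaccard(a, b):
    """Overlap between two sets, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

# Invented toy values, not taken from the dataset.
citations = {"w1": 25, "w2": 14, "w3": 10, "w4": 6, "w5": 2}
pagerank_scores = {"w1": 0.30, "w2": 0.12, "w3": 0.08, "w4": 0.20, "w5": 0.05}

core_cit = core_by_threshold(citations)
core_pr = core_by_rank(pagerank_scores, size=len(core_cit))
print(sorted(core_cit), sorted(core_pr), jaccard(core_cit, core_pr))
```

Here the two definitions agree on two works out of four in their union, a toy version of the "similar but not identical" outcome mentioned above.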
Yet the main point holds: the core literature exists, and it is the main reason why the field appears to be connected at the citation level. Scholars from different subfields, dealing with a variety of topics, still share a (small) set of works to which they all refer. The next section explores these works in more detail.
6. The Core Literature
The core literature, composed of 769 monographs cited ten or more times, represents almost uniformly all periods of publication of the cited material. Still, it is comparatively older due to the time needed to accumulate citations in this field. It is also, as a consequence, quite varied in its contents. Two groupings can be proposed for the core literature. The first, more trivial, groups core works by their publication age: pre-1800, 1800–1949, and 1950 to the present. The second uses the typology of the publication itself, which allows three different groups to be identified: primary sources, works of reference, and scholarly monographs. A summary is given in Table 3.
The first group of core works by age (defined as age 1) is composed of publications dating before the year 1800, mostly early printed books. Yet several of the most cited primary sources have been edited at a later time in a critical edition, made to provide easier access to historians. A notable example of a primary source that was edited and published at a later time is the Diaries of Marin Sanudo, a Venetian nobleman who recorded the daily life of the city for several decades across the fifteenth and sixteenth centuries. This edition was published between 1879 and 1903. Early works of scholarship published before the nineteenth century are also included in this category. A second group by age (age 2) is composed of sources published between 1800 and 1949. This phase of the historiography on Venice, developing since the fall of the Republic, is characterized by the efforts of local historians to cast a positive light on the city’s past, but more importantly by the effects of the general positivistic turn in historical studies, which fostered the production of works of reference and overarching syntheses of the history of the Republic (Infelise, 2002; Povolo, 2002; Dursteler, 2013). Works of reference can be critical editions of documents, with associated historical studies, as well as historical dictionaries, repertories, bibliographies, or any kind of work meant to aid future historians by providing digested information. The most notable example is perhaps “Delle Inscrizioni Veneziane,” by Emmanuele Cicogna, a wide repertory of Venetian epigraphs. Additionally, during the same period, modern historiography developed, and ambitious works of historical synthesis were produced on the basis of newly discovered documentary evidence. An example is the Documented History of Venice by Samuele Romanin, published between 1853 and 1861. Several works in this group are also multi-volume works.
A third and last group (age 3) is more recent and abundant, gathering all works published from the year 1950 onward, in what we might term the contemporary historiography on Venice. This group of 477 monographs comprises some works of enduring importance such as the History of the Population of Venice by Daniele Beltrami (1954) or the Economic History of Venice by Gino Luzzatto (1961), but fewer works of reference or editions of sources. Every core group by age includes between 1.4 and 2.4% of the cited works for the given period, with proportionally more works from period two being core than from the other periods.
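The grouping by age and the share of core works per period can be sketched as follows. The period boundaries (pre-1800, 1800–1949, 1950 onward) and the threshold of ten citations come from the text; the toy list of works, the function names, and the resulting shares are invented for illustration and bear no relation to the values in Table 3.

```python
from collections import defaultdict

CORE_THRESHOLD = 10  # "cited ten or more times", as stated in the text

def age_group(year):
    """Map a publication year to age group 1, 2, or 3."""
    if year < 1800:
        return 1
    if year < 1950:
        return 2
    return 3

def core_share_by_age(works):
    """works: iterable of (publication_year, citation_count) pairs."""
    totals = defaultdict(int)
    core = defaultdict(int)
    for year, n_citations in works:
        group = age_group(year)
        totals[group] += 1
        if n_citations >= CORE_THRESHOLD:
            core[group] += 1
    return {g: core[g] / totals[g] for g in sorted(totals)}

# Invented toy data: (publication year, received citations).
toy_works = [(1581, 12), (1750, 1), (1853, 30), (1900, 3),
             (1920, 2), (1961, 15), (1990, 4), (2005, 1)]
shares = core_share_by_age(toy_works)
print(shares)
```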
The groups by typology are organized differently. A first typology (type 1) comprises primary sources, identified as publications or documentary records not originally meant as scholarly works, including critical editions. In practice, all works not published as scholarly works, plus all editions of documents, are included in typology one. The third typology (type 3) comprises all works of scholarship, published at any time. Using this definition, several works from age one and, even more, age two, end up in typology three. Finally, the second typology (type 2) gathers all works of reference made by historians for historians (for example, dictionaries, bibliographies, indexes, and guides), according to the definition given previously. Most of these works were published during the nineteenth and early twentieth centuries. A summary of this second classification method is also given in Table 3, while the five most cited works per typology, along with their citation counts, are further detailed in Table 4.
The presence of a core literature, and its three main typologies of primary sources, works of reference, and scholarly works, highlights what connects the field. Well-known primary sources can become commonplace, especially if published in a critical edition. Works of reference often entail an investment of resources that is not easily replicated, which determines their enduring importance. Some might even contain materials on long-disappeared records or artifacts, for which they represent the only surviving evidence. Works of reference are also often a product of specific periods during which their status as a scholarly product was deemed on par with, if not above, that of scholarly monographs, such as the second half of the nineteenth century. Primary sources and works of reference can be considered as shared resources for the community, works on top of which it is possible to build further scholarship, and which do not fall into oblivion until another comparable and better work is acknowledged in their place. Finally, scholarly monographs of recognized status emerge quite slowly, often after one or more generations have passed. Clearly, citations in the humanities accumulate at a slow pace, especially for monographs. Yet the fact that recent historiography so often cites old scholarship can be explained in several ways. For one, long-forgotten topics can enjoy a second life: such is the case for private life and the history of interiors, a topic discussed early on by Pompeo Molmenti in his highly cited work (the most cited in typology three) and rediscovered by several scholars over the last 30 years. Another motivation to cite old, well-known works is that they are, effectively, widely recognized, so that mentioning them is important to signal membership in the community. The importance of citing to contextualize or signal, especially in monographs where citations are more abundant, might be a factor contributing to the importance of the core literature.
Lastly, highly cited works are also landmark works that originated, or at any rate greatly contributed to, a specific topic of enduring relevance; they are thus cited to reconstruct its main developments.
Figure 5 considers the use of the core literature over time and shows that the proportion of citations to the core literature is relatively stable across typologies. Typologies one and two in fact comprise fairly specialized works, which are marginal in terms of the total number of received citations but stable in their presence. Typology three is instead more substantially represented, rising and leveling off at 20% of received citations over recent decades. With respect to the categories of core literature by age, it is possible to appreciate the waning of older scholarly literature, displaced by more recent works over time in a process of slow update of the scholarly literature of reference, which affects neither primary sources nor works of reference. The proportion of references to old literature is in fact slightly rising over time. We consider (a modified version of) the Price Index (De Solla Price, 1970), that is, the proportion of citations to works published at most 10 years before the citing one, in Table 5. The values, already very low, are slowly decreasing over time. This is interesting as it points to a possible growing preference of scholars for older and well-known sources over more recent (and abundant) literature. As the core literature is often old, this would mean that its importance is slowly growing over time. The top-cited sources over the same intervals of time in fact highlight the stable popularity of core sources of typologies one and two, with some change happening in typology three (results are omitted here for brevity and can be found in the code repository).
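The modified Price Index described above can be written as a short function. This is a minimal sketch assuming citations are available as pairs of publication years; the function name, the input format, and the toy pairs are hypothetical, and the treatment of edge cases (an empty input, or a cited work dated after the citing one, which is counted as recent here) is a choice made for the example rather than something stated in the article.

```python
def price_index(citations, window=10):
    """Share of citations to works published at most `window` years
    before the citing work.

    citations: iterable of (citing_year, cited_year) pairs.
    """
    pairs = list(citations)
    if not pairs:
        return 0.0
    recent = sum(1 for citing, cited in pairs if citing - cited <= window)
    return recent / len(pairs)

# Invented toy data: one citing work from 2010 and four cited works.
toy_citations = [(2010, 2005), (2010, 2000), (2010, 1954), (2010, 1879)]
print(price_index(toy_citations))            # 10-year window, as in the text
print(price_index(toy_citations, window=5))  # a narrower window for comparison
```

With the 10-year window, two of the four toy citations count as recent, so the index is 0.5; narrowing the window to 5 years lowers it to 0.25.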
Figure 5. The proportion of citations given to core and non-core works over time. Proportions are calculated using a smoothing window of 6 years; for every point in time, the total (y axis) sums to one. The proportion of citations to age category two decreases, and that to category three rises, as new scholarship supplements older works in recent years. With respect to typologies, the role of typologies one and two is instead marginal but stable, while typology three rose to occupy a stable 20% of citations, which are given to highly cited, well-known scholarly monographs. (A) Proportion of citations to the core by age. (B) Proportion of citations to the core by typology.
The proportion of citations to the core literature can be compared with the proportion of citations given to uniquely cited works, that is, works that are cited only once in the dataset. These distributions are given jointly in Figure 6. Interestingly, citing monographs have a more uniform distribution of citations to unique works, with a mean of 30–40% but high variance, while core works occupy a more limited yet significant role, taking on average 10–20% of citations. Most monographs balance their citations between a fraction of core works and less well-cited works, in what appears to be a trade-off between contextualized and specialized referencing.
Figure 6. The joint distribution of the proportion of citations given to the core literature and to uniquely cited works (i.e., works cited only once). Most citing monographs cite 10–20% core and 30–40% uniquely cited works.
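The two proportions plotted in Figure 6 can be computed per citing monograph along the following lines. This is a sketch under assumptions: the reference lists are invented, the function name is hypothetical, and the core threshold is lowered to three citations so that the toy data contains a core work (the article uses ten).

```python
from collections import Counter

def citation_profile(references, core_threshold=10):
    """For each citing work, return (share of citations to core works,
    share of citations to works cited only once in the dataset)."""
    counts = Counter(w for cited in references.values() for w in cited)
    profile = {}
    for citing, cited in references.items():
        n = len(cited)
        if n == 0:
            profile[citing] = (0.0, 0.0)
            continue
        core = sum(1 for w in cited if counts[w] >= core_threshold)
        unique = sum(1 for w in cited if counts[w] == 1)
        profile[citing] = (core / n, unique / n)
    return profile

# Invented toy reference lists (each work cited at most once per list).
toy_references = {
    "M1": ["S", "R", "x1", "x2"],
    "M2": ["S", "R", "y1", "q"],
    "M3": ["S", "q", "z1", "z2"],
}
profile = citation_profile(toy_references, core_threshold=3)
print(profile)
```

In the toy data every monograph gives a quarter of its citations to the single core work, while the share going to uniquely cited works varies between a quarter and a half.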
The core literature, which glues together the field of the history of Venice, represents all periods of its development, as well as different typologies of publications. The historians of Venice share, it seems, a set of sources, works of reference, and monographs, which are widely known by practitioners and remain relevant to this day, despite a rapidly increasing variety of perspectives. This is a limited set of well-known monographs that have accrued sufficient recognition to be cited even outside of their original specialization, and to become part of the common ground of the scholars of the field. On one side, we have primary sources and works of reference, which never become outdated until substituted; on the other, scholarly works of particular importance, which are slowly updated or rediscovered over time, as the field shifts attention to different topics but grounds them in previous work. This situation might well be shared by other fields in history and beyond, as further work will explore.
In this article, we suggested the importance of the core literature in history, and in the humanities more generally, in bridging different clusters of research into a coherent field. We explored its existence, quality, and structural role for the case study of the historiography on Venice, using a dataset of monograph-to-monograph citations where source items (citing works) were selected from recent historiography on the topic. A fine-grained manual classification was used to qualify the results of different clustering methods.
Starting from the shared point of view that the humanities present a holistic intellectual structure, we indeed found that this is the case for the historiography on Venice as well. Yet, a group of core, highly cited works emerges as the main reason why both the recent literature (bibliographic coupling network) and the intellectual base (co-citation network) are almost connected and organized in coherent clusters bridged by it. The structural role of the core literature is also found to be rising over time, as the field becomes increasingly varied. The core literature mainly comprises primary sources and works of reference, which never age until substituted by similar contributions, and scholarly works of substantial importance, which become well known in the field. This second group of core works is instead slowly updated over time, as the field moves to new topics or casts new light on older ones. Although the humanities and social sciences will likely (and hopefully) never become high-consensus, rapid-discovery sciences, the role of some primary sources and works of reference in grounding their discussions can perhaps be compared to the “genealogies of research technologies” so important to allow for the cumulative advance of the sciences (Collins, 1994). Their impact over time is perhaps a still under-acknowledged element with respect to the evaluation of research in the humanities.
Interestingly, in the case of Venice an established tradition of studies and resources still bears an influence on recent scholarship, which is growing considerably more elaborate and internationally oriented. The presence of a core literature is ultimately the reason why we can still consider the historiography on Venice a field of its own, instead of a set of increasingly fragmented areas of research. This research, if eventually replicated for other fields and disciplines, points to two more general considerations. Firstly, the core literature can influence a field for a very long time. This has implications for research evaluation, which evidently cannot be based on short-term citation counts. Secondly, the pace of research in recent times is likely resulting in intellectual fragmentation, as the rising importance of the core suggests. This could possibly be avoided by increased efforts to produce and use shared (digital) resources and reference works, which have the proven long-lasting effect of integrating research efforts.
8. Data Availability
The complete dataset is available online (Romanello and Colavizza, 2017). A repository with the replication of every figure and analysis of this article is provided at https://github.com/Giovanni1085/core_literature_historians_venice. At the same address, the list of citing and cited monographs can be found, as well as the three considered citation networks in csv and graphml formats.
The author is solely responsible for the whole article.
Conflict of Interest Statement
The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author would like to thank, in alphabetical order: Martina Babetto, Silvia Ferronato, and Matteo Romanello (EPFL, CH), members of the project team. Thanks also to Ludo Waltman and Vincent Traag (CWTS Leiden, NL), Massimo Franceschet (University of Udine, IT), and Dorit Raines (Ca’ Foscari University of Venice, IT) for very helpful discussions. The Library System of the Ca’ Foscari University of Venice, and especially so its Humanities Library (BAUM), and The Central Institute for the Union Catalogue of Italian Libraries and Bibliographic Information (ICCU) willingly collaborated with bibliographical resources and logistic support.
This research is funded by the Swiss National Fund with grants 205121_159961 and P1ELP2_168489.
1. With respect to the full dataset, 27 citing monographs have been removed as duplicate editions. Although most of these editions constitute updates of a previous work, even a revised or extended edition is likely to overlap substantially with previous ones in terms of references. When multiple editions of a work exist, the most recent one is kept; when translations of a work exist, the original is kept, unless the translation also includes an updated version of the work, in which case the translation is retained instead.
2. Most analyses relied on igraph [0.7.1] (Csardi and Nepusz, 2006) and Vincent Traag’s community detection library [0.5.3] available at https://github.com/vtraag/louvain-igraph.
3. For formal definitions, see Newman (2010).
Bastian, M., Heymann, S., and Jacomy, M. (2009). Gephi: an open source software for exploring and manipulating networks. In International AAAI Conference on Weblogs and Social Media. San Jose, CA: Bastian.
Blondel, V.D., Guillaume, J.-L., Lambiotte, R., and Lefebvre, E. (2008). Fast unfolding of communities in large networks. Journal of Statistical Mechanics: Theory and Experiment 2008: P10008. doi:10.1088/1742-5468/2008/10/P10008
Bornmann, L., and Mutz, R. (2015). Growth rates of modern science: a bibliometric analysis based on the number of publications and cited references. Journal of the Association for Information Science and Technology 66: 2215–22. doi:10.1002/asi.23329
Colavizza, G., Romanello, M., and Kaplan, F. (2017). The references of references: a method to enrich humanities library catalogs with citation data. International Journal on Digital Libraries 1–11. doi:10.1007/s00799-017-0210-1
De Solla Price, D. (1970). Citation measures of hard science, soft science, technology, and nonscience. In Communication among Scientists and Engineers, Edited by C.E. Nelson and D.K. Pollock, 3–22. Lexington, MA: Heath Lexington Books.
Fortunato, S., and Barthelemy, M. (2007). Resolution limit in community detection. Proceedings of the National Academy of Sciences of the United States of America 104: 36–41. doi:10.1073/pnas.0605965104
Garfield, E. (1980). Is information retrieval in the arts and humanities inherently different from that in science? The effect that ISI®’s citation index for the arts and humanities is expected to have on future scholarship. The Library Quarterly 50: 40–57. doi:10.1086/629874
Good, B.H., de Montjoye, Y.-A., and Clauset, A. (2010). Performance of modularity maximization in practical contexts. Physical Review E Covering Statistical, Nonlinear, Biological, and Soft Matter Physics 81(4 Pt 2): 046106. doi:10.1103/PhysRevE.81.046106
Hammarfelt, B. (2016). Beyond coverage: toward a bibliometrics for the humanities. In Research Assessment in the Humanities, Edited by M. Ochsner, S.E. Hug, and H.-D. Daniel, 115–131. Cham: Springer International Publishing.
Hérubel, J.-P.V.M., and Goedeken, E.A. (2001). Using the arts and humanities citation index to identify a community of interdisciplinary historians: an exploratory bibliometric study. The Serials Librarian 41: 85–98. doi:10.1300/J123v41n01_07
Huang, M.-H., and Chang, Y.W. (2008). Characteristics of research output in social sciences and humanities: from a research evaluation perspective. Journal of the American Society for Information Science and Technology 59: 1819–28. doi:10.1002/asi.20885
Infelise, M. (2002). Venezia e il suo passato. Storia, miti, ‘fole’. In Storia di Venezia. L’Ottocento e il Novecento, Edited by M. Isnenghi and S. Woolf, 967–988. Rome: Istituto dell’Enciclopedia Italiana Treccani.
Jacomy, M., Venturini, T., Heymann, S., and Bastian, M. (2014). ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software. Plos One 9:e98679. doi:10.1371/journal.pone.0098679
Knievel, J.E., and Kellsey, C. (2005). Citation analysis for collection development: a comparative study of eight humanities fields. The Library Quarterly: Information, Community, Policy 75: 142–68. doi:10.1086/431331
Larivière, V., Archambault, É, Gingras, Y., and Vignola-Gagné, É (2006). The place of serials in referencing practices: comparing natural sciences and engineering with social sciences and humanities. Journal of the American Society for Information Science and Technology 57: 997–1004. doi:10.1002/asi.20349
Leydesdorff, L., Hammarfelt, B., and Salah, A. (2011). The structure of the arts & humanities citation index: a mapping on the basis of aggregated citations among 1,157 journals. Journal of the American Society for Information Science and Technology 62: 2414–26. doi:10.1002/asi.21636
Leydesdorff, L., and Salah, A. (2010). Maps on the basis of the arts & humanities citation index: the journals Leonardo and art journal versus “digital humanities” as a topic. Journal of the Association for Information Science and Technology 61: 787–801. doi:10.1002/asi.21303
Lindholm-Romantschuk, Y., and Warner, J. (1996). The role of monographs in scholarly communication: an empirical study of philosophy, sociology and economics. Journal of Documentation 52: 389–404. doi:10.1108/eb026972
Linmans, A.J.M. (2009). Why with bibliometrics the humanities does not need to be the weakest link: indicators for research evaluation based on citations, library holdings, and productivity measures. Scientometrics 83: 337–54. doi:10.1007/s11192-009-0088-9
Newman, M.E.J., and Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E Covering Statistical, Nonlinear, Biological, and Soft Matter Physics 69: 066133. doi:10.1103/PhysRevE.69.066133
Nolen, D.S., and Richardson, H.A. (2016). The search for landmark works in English literary studies: a citation analysis. The Journal of Academic Librarianship 42: 453–8. doi:10.1016/j.acalib.2016.04.002
Persson, O. (1994). The intellectual base and research fronts of “Jasis” 1986–1990. Journal of the American Society for Information Science 45: 31. doi:10.1002/(SICI)1097-4571(199401)45:1<31::AID-ASI4>3.0.CO;2-G
Povolo, C. (2002). The creation of Venetian historiography. In Venice Reconsidered: The History and Civilization of an Italian City-State, 1297–1797, Edited by J.J. Martin and D. Romano, 491–519. Baltimore: Johns Hopkins University Press.
Reichardt, J., and Bornholdt, S. (2006). Statistical mechanics of community detection. Physical Review E Covering Statistical, Nonlinear, Biological, and Soft Matter Physics 74: 016110. doi:10.1103/PhysRevE.74.016110
Small, H. (1973). Co-citation in the scientific literature: a new measure of the relationship between two documents. Journal of the American Society for Information Science 24: 265–9. doi:10.1002/asi.4630240406
Šubelj, L., van Eck, N.J., and Waltman, L. (2016). Clustering scientific publications based on citation relations: a systematic comparison of different methods. Plos One 11:e0154404. doi:10.1371/journal.pone.0154404
Williams, P., Stevenson, I., Nicholas, D., Watkinson, A., and Rowlands, I. (2009). The role and future of the monograph in arts and humanities research. Aslib Proceedings 61: 67–82. doi:10.1108/00012530910932294
Keywords: bibliometrics, citation networks, core literature, history, Venice, history of Venice
Citation: Colavizza G (2017) The Core Literature of the Historians of Venice. Front. Digit. Humanit. 4:14. doi: 10.3389/fdigh.2017.00014
Received: 21 March 2017; Accepted: 16 June 2017;
Published: 10 July 2017
Edited by: Arianna Ciula, King’s College London, United Kingdom
Reviewed by: Chris Alen Sula, Pratt Institute, United States
Jane Winters, University of London, United Kingdom
Copyright: © 2017 Colavizza. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Giovanni Colavizza, email@example.com