Entropy and Enumeration of Subtrees in a Cactus Network

For a given network, the number of spanning trees is a key parameter for measuring its reliability under edge failures, while the number of subtrees is a key parameter for measuring its reliability under both vertex and edge failures. Zhang et al. investigated the entropy of spanning trees, spanning forests and connected spanning subgraphs of Koch networks. In this paper, we extend Koch networks to 3-cactus networks and study the entropy of subtrees of 3-cactus networks. We present a linear algorithm to count the number of subtrees in a 3-cactus network, determine the upper and lower bounds of the entropy of subtrees of these networks, and characterize the networks attaining the extremal values. As an application, we develop a linear algorithm to count the number of subtrees in Koch networks, with complexity O(g), where g is the number of iterations. Finally, we determine the entropy of subtrees of Koch networks.


INTRODUCTION
The number of subtrees of a connected network is a well-studied parameter for measuring the reliability of a network under both vertex and edge failures. For a network $G$, let $p$ be the probability of failure of a vertex in $G$, and $q$ be the probability of failure of an edge in $G$. The reliability of the network $G$, denoted by $R(G; p, q)$, is the probability that the remaining vertices can communicate with each other. Zhao [1] showed that for given numbers $m$ and $n$, among the networks with $n$ vertices and $m$ edges, if $p \to 0$ and $q \to 1$ with $p$ and $1 - q$ being equivalent infinitesimals, then networks with more subtrees are more reliable. Therefore, it is of interest to develop efficient algorithms to count the number of subtrees in a network and, in a given family of networks, to characterize the networks with the extremal number of subtrees. Denote by $\eta(G)$ the number of subtrees of $G$. For a vertex $v \in V(G)$, let $\eta_v(G)$ denote the number of subtrees of $G$ containing $v$. Székely and Wang [2] determined the maximum and minimum values of $\eta(G)$ and characterized all extremal networks among all trees $G$ with a given number of vertices. Yan and Yeh [3] developed a linear algorithm to count the number of subtrees of a tree. In the same paper, they also characterized all trees having diameter at least $d$ with the maximum number of subtrees, as well as all trees having maximum degree at least $\Delta$ with the minimum number of subtrees. On the other hand, Kirk and Wang [4] characterized all trees with a given maximum degree that have the maximum number of subtrees. Zhang and Zhang [5] and Zhang et al. [6,7] characterized the trees attaining the maximum and minimum numbers of subtrees among all trees with a given degree sequence. Xiao et al. [8] showed that $5 + n + 2^{n-3} \le \eta(T) \le 2^{n-1} + n - 1$ for any tree $T$ with $n$ vertices. Yang et al. [9,10] studied the number of subtrees in spiro and polyphenyl hexagonal chains and in hexagonal and phenylene chains, respectively.
For more results on this topic, we refer the reader to references [11-17].
Koch networks, introduced by Zhang et al. [18], are typical fractal networks. Zhang et al. showed that Koch networks share some important properties of real-life networks, such as a power-law degree distribution with exponent between 2 and 3, a large clustering coefficient and a small average path length. However, the number of subtrees in Koch networks has not been determined. In this paper, we investigate the entropy and enumeration of subtrees in 3-cactus networks, which generalize Koch networks. We first establish a linear algorithm to count the number of subtrees in an arbitrary 3-cactus network. Then we characterize the 3-cactus networks attaining the upper and lower bounds of the entropy of subtrees. Finally, as an application, we obtain the entropy of subtrees in Koch networks.

PRELIMINARIES
We first introduce some general notation and definitions that will be used throughout the paper. Undefined notation and terminology will follow Bondy and Murty [19].
We follow Zhang et al. [18] and Wu et al. [20] to define Koch networks. Denote by $G_{m,g}$ the Koch network after $g$ iterations, which has $N_g$ vertices, $E_g$ edges and $L_g$ triangles. At $g = 0$, $G_{m,0}$ is a triangle with three vertices labeled $X$, $Y$, and $Z$, respectively. This triangle is called the initial triangle, and its three vertices are called hub vertices. For $g \ge 1$, $G_{m,g}$ is obtained from $G_{m,g-1}$ by attaching $m$ triangles to each of the three vertices in each triangle in $G_{m,g-1}$. The Koch networks for $m = 2$ are shown in Figure 1.
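The iterative construction just described can be mirrored in a few lines of Python. This is only an illustrative sketch: the function names `build_koch` and `counts`, and the choice of representing a network as a list of vertex triples with the hubs $X$, $Y$, $Z$ numbered 0, 1, 2, are assumptions made here and do not come from [18] or [20].

```python
def build_koch(m, g):
    """Return the triangles of the Koch network G_{m,g} as vertex triples.

    Vertices are integers; 0, 1, 2 stand for the hub vertices X, Y, Z
    (a labelling chosen for this sketch).
    """
    triangles = [(0, 1, 2)]                # G_{m,0}: the initial triangle
    next_vertex = 3
    for _step in range(g):
        new_triangles = []
        for tri in triangles:              # every triangle of G_{m,g-1}
            for v in tri:                  # each of its three vertices
                for _copy in range(m):     # receives m new triangles
                    new_triangles.append((v, next_vertex, next_vertex + 1))
                    next_vertex += 2
        triangles.extend(new_triangles)
    return triangles


def counts(triangles):
    """Return (number of vertices, number of edges, number of triangles)."""
    vertices = {v for tri in triangles for v in tri}
    edges = {frozenset(e) for a, b, c in triangles
             for e in ((a, b), (a, c), (b, c))}
    return len(vertices), len(edges), len(triangles)
```

Because every existing triangle receives $3m$ new triangles in each iteration, the number of triangles is multiplied by $3m + 1$ per iteration; for example, `counts(build_koch(2, 1))` gives 15 vertices, 21 edges and 7 triangles.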
To extend the notion of Koch networks, we introduce 3-cactus networks.
Definition 1. Let $G(t)$ be the set of all 3-cactus networks with $t$ triangles. We define a 3-cactus network $G_t \in G(t)$ by the following recursive iterations.
Step 1: When $t = 0$, $G_0$ is isomorphic to $K_1$, the graph with one vertex and no edges. The only vertex in $G_0$ is called the initial vertex of $G_t$.
Step 2: Assume that $t \ge 1$ and a 3-cactus network $G_{t-1}$ has already been generated by $t - 1$ iterations. Then a 3-cactus network $G_t$ is obtained from $G_{t-1}$ and a new triangle by identifying a vertex of the new triangle with a vertex of $G_{t-1}$.
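As a minimal sketch of Definition 1, the recursion can be encoded by recording, for each iteration, which existing vertex the new triangle is identified with. The names `build_three_cactus` and `vertex_and_edge_counts`, and the convention that the $i$-th triangle introduces the new vertices $2i - 1$ and $2i$, are assumptions made for illustration only.

```python
def build_three_cactus(attach_vertices):
    """Build a 3-cactus network following Definition 1.

    attach_vertices[i] is the already existing vertex with which the
    (i+1)-th new triangle is identified. Vertex 0 is the initial vertex
    of G_0; the i-th triangle adds the new vertices 2i-1 and 2i.
    Returns the list of triangles as (attach, u, v) triples.
    """
    triangles = []
    num_vertices = 1                       # Step 1: G_0 is K_1
    for attach in attach_vertices:         # Step 2, once per new triangle
        if not 0 <= attach < num_vertices:
            raise ValueError(f"vertex {attach} does not exist yet")
        u, v = num_vertices, num_vertices + 1
        triangles.append((attach, u, v))
        num_vertices += 2
    return triangles


def vertex_and_edge_counts(triangles):
    """Return (number of vertices, number of edges) of the network."""
    vertices = {0} | {x for tri in triangles for x in tri}
    edges = {frozenset(e) for p, u, v in triangles
             for e in ((p, u), (p, v), (u, v))}
    return len(vertices), len(edges)
```

For instance, `build_three_cactus([0, 0, 1])` first creates the triangle on vertices 0, 1, 2, then a second triangle at vertex 0, and finally a third triangle at vertex 1. Each iteration adds two vertices and three edges, so a network built in $t$ iterations has $2t + 1$ vertices and $3t$ edges.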
Let $G$ be a subnetwork of $G'$. If the vertex set (or edge set) of $G$ is a proper subset of the vertex set (or edge set) of $G'$, then $G$ is called a proper subnetwork of $G'$. Let $d_G(u)$ denote the degree of vertex $u$ in $G$, and let $(v_1, v_2, v_3)$ denote the triangle with vertices $v_1$, $v_2$, and $v_3$. A triangle $(v_1, v_2, v_3)$ in $G$ is called a $d$-pendant triangle if two of the vertices $v_1, v_2, v_3$ have degree 2 in $G$ and the third vertex has degree $d$ in $G$ for some integer $d \ge 3$. A $d$-pendant triangle is also called a pendant triangle.
By Definition 1, a network $G_t \in G(t)$ with $t \ge 2$ contains a pendant triangle with a vertex of degree at least 4. A 3-cactus network $G_t \in G(t)$ is said to be in Class I if $G_t$ contains a 4-pendant triangle, and in Class II if $G_t$ does not contain a 4-pendant triangle. The following observations follow from Definition 1.
Observation 1. Let $t \ge 0$ be an integer, and let $G_t \in G(t)$ be a network formed by Definition 1 using $t$ iterations. Each of the following holds.
(i) The 3-cactus networks are a generalization of Koch networks, as building a Koch network $G_{m,g}$ using $g$ iterations amounts to $(3m + 1)^g$ iterations in building a 3-cactus network.
(ii) $G_t$ has $t$ triangles, $2t + 1$ vertices and $3t$ edges.
(iii) If $(u_0, u_1, u_2)$ is a pendant triangle of $G_t$ with $d_{G_t}(u_1) = d_{G_t}(u_2) = 2$, then $t \ge 2$ and $u_0$ is a cut vertex of $G_t$.
(iv) Every Class II 3-cactus network $G$ contains two pendant triangles $(u_0, u_1, u_2)$ and $(u_0, v_1, v_2)$ sharing a common vertex $u_0$.
Proof: It suffices to justify (iv), as all the others are direct consequences of Definition 1. Let $G$ be a Class II 3-cactus network. By Definition 1, $G \in G(t)$ for some $t \ge 3$, and if $G \in G(3)$, then $G$ has a vertex of degree 6 and all other vertices of degree 2, and so (iv) holds. Assume that $t \ge 4$ and that (iv) holds for smaller values of $t$. By Definition 1, $G$ contains a pendant triangle $(x, z_1, z_2)$ with $d_G(z_1) = d_G(z_2) = 2$. Let $G' = G - \{z_1, z_2\}$.
If $G'$ is of Class II, then by induction, $G'$ has two pendant triangles $(u_0, u_1, u_2)$ and $(u_0, v_1, v_2)$ sharing a common vertex $u_0$. But as $G$ is of Class II, $u_0$ cannot be a vertex of degree 4 in a 4-pendant triangle in $G$. If $x \notin \{u_0, u_1, u_2, v_1, v_2\}$, then $(u_0, u_1, u_2)$ and $(u_0, v_1, v_2)$ are two pendant triangles in $G$, and so (iv) holds and we are done. Hence we assume that $x \in \{u_0, u_1, u_2, v_1, v_2\}$. If $x = u_0$, then $(u_0, u_1, u_2)$, $(u_0, v_1, v_2)$, and $(u_0, z_1, z_2)$ are three pendant triangles with a common vertex $u_0$, implying (iv) also. Hence we must have $x \in \{u_1, u_2, v_1, v_2\}$. By symmetry, we assume that $x = u_1$. Then the two triangles $(u_0, u_1, u_2)$ and $(u_1, z_1, z_2)$ share the common vertex $x = u_1$ with $d_G(u_1) = 4$, and so $(u_1, z_1, z_2)$ is a 4-pendant triangle. This implies that $G$ is of Class I, a contradiction.
Therefore, we assume that $G'$ is of Class I, and so $G'$ has a 4-pendant triangle $(u'_0, u'_1, u'_2)$ with $d_{G'}(u'_0) = 4$ and $d_{G'}(u'_1) = d_{G'}(u'_2) = 2$. As $G$ is of Class II, this triangle is not a 4-pendant triangle of $G$, and so $x \in \{u'_0, u'_1, u'_2\}$. Then either $x = u'_0$, whence $G$ has two pendant triangles $(u'_0, u'_1, u'_2)$ and $(x, z_1, z_2)$ satisfying (iv); or $x \in \{u'_1, u'_2\}$, whence $G$ is of Class I, contrary to the assumption that $G$ is of Class II. □
Thus every Koch network is a 3-cactus network. However, there exist 3-cactus networks that are not Koch networks. The triangle-path $P_t^{\triangle}$ and the triangle-star $S_t^{\triangle}$ are two special classes of 3-cactus networks with $t$ triangles. Examples of these 3-cactus networks are depicted in Figure 2.
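The notions of pendant triangle and of Class I and Class II can be checked mechanically on small examples. The following Python sketch is illustrative only. It assumes that a network is stored as a list of (attach, u, v) triples in construction order, that the triangle-star consists of $t$ triangles sharing one vertex, and that the triangle-path is the chain in which consecutive triangles share one vertex and no vertex lies in more than two triangles; the figures defining these networks are not reproduced here, so these two constructions are assumptions.

```python
from collections import Counter


def degrees(triangles):
    """Vertex degrees; each triangle adds two edges at each of its vertices."""
    deg = Counter()
    for tri in triangles:
        for v in tri:
            deg[v] += 2
    return deg


def pendant_triangles(triangles):
    """Return (triangle, d) for every d-pendant triangle of the network."""
    deg = degrees(triangles)
    result = []
    for tri in triangles:
        low, mid, high = sorted(deg[v] for v in tri)
        if low == 2 and mid == 2 and high >= 3:
            result.append((tri, high))
    return result


def network_class(triangles):
    """'I' if the network contains a 4-pendant triangle, otherwise 'II'."""
    has_four_pendant = any(d == 4 for _, d in pendant_triangles(triangles))
    return "I" if has_four_pendant else "II"


def triangle_star(t):
    """t triangles sharing the vertex 0 (assumed form of the triangle-star)."""
    return [(0, 2 * i - 1, 2 * i) for i in range(1, t + 1)]


def triangle_path(t):
    """Chain of t triangles; each new one hangs at a new vertex of the last."""
    return [(max(0, 2 * i - 3), 2 * i - 1, 2 * i) for i in range(1, t + 1)]
```

With these conventions, `network_class(triangle_star(3))` is `"II"` because every pendant triangle is 6-pendant, while `network_class(triangle_path(5))` is `"I"` because both end triangles are 4-pendant.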

ALGORITHM FOR COUNTING THE NUMBER OF SUBTREES
In this section, we present a linear algorithm to count the number of subtrees in a 3-cactus network, together with a proof of its correctness.
Algorithm A:
Initial condition: Let $G_t = (V, E)$ be a 3-cactus network with $t$ triangles, with vertex set $V = \{u_0, u_1, v_1, u_2, v_2, \ldots, u_t, v_t\}$ and edge set $E = \{e_1, e_2, \ldots, e_{3t}\}$, where $u_i$ and $v_i$ are the vertices added at the $i$-th iteration, for each $i = 1, 2, \ldots, t$. Let $\eta(G_t)$ be the number of subtrees of $G_t$. To every vertex $v \in V$, we assign the ordered pair of real numbers $(1, 0)$ as its initial weight.
Step 1. Set $k = t$.
Step 2. Let $(a_1, b_1)$, $(a_2, b_2)$, and $(a_3, b_3)$ denote the weights of $u_k$, $v_k$ and the common neighbor of $u_k$ and $v_k$, respectively. Delete the vertices $u_k$ and $v_k$, and reset the value of $(a_3, b_3)$ by the following:
$$a_3 := a_3(3a_1a_2 + a_1 + a_2 + 1), \qquad b_3 := b_3 + b_1 + b_2 + a_1 + a_2 + a_1a_2. \tag{1}$$
Step 3. If $k = 1$, then stop and output $\eta(G_t) = a_3 + b_3$. Otherwise, set $k := k - 1$ and go to Step 2.
As Step 2 is executed $t$ times, the complexity of Algorithm A is $O(t)$. Since Algorithm A can be run on a 3-cactus network $G_t \in G(t)$ in the reverse order of the construction of $G_t$ described in Definition 1, it is possible to arrange the iterations in Step 2 so that when Algorithm A terminates, only the initial vertex of $G_t$ is left.
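A minimal Python sketch of Algorithm A is given below. It assumes the network is supplied as a list of (attach, u, v) triples in construction order, where u and v are the two vertices added with the triangle and attach is their common neighbor; this input format and the name `count_subtrees` are choices made for this sketch. The first component of the update is the one stated in the proof of correctness, $a = a_3(3a_1a_2 + a_1 + a_2 + 1)$; the second component, $b = b_3 + b_1 + b_2 + a_1 + a_2 + a_1a_2$, is read off from the partition sizes used in that proof.

```python
def count_subtrees(triangles):
    """Run Algorithm A on a 3-cactus network.

    triangles: list of (attach, u, v) triples in construction order, where
    u and v are new vertices and attach already exists (assumed format).
    Returns (a, b): the number of subtrees containing the initial vertex
    and the number of subtrees avoiding it.
    """
    if not triangles:
        return (1, 0)                      # G_0 is a single vertex
    root = triangles[0][0]                 # the initial vertex
    weight = {}                            # vertex -> (a, b); default (1, 0)
    for attach, u, v in reversed(triangles):   # Step 2 for k = t, ..., 1
        a1, b1 = weight.pop(u, (1, 0))     # delete u_k
        a2, b2 = weight.pop(v, (1, 0))     # delete v_k
        a3, b3 = weight.get(attach, (1, 0))
        weight[attach] = (
            a3 * (3 * a1 * a2 + a1 + a2 + 1),
            b3 + b1 + b2 + a1 + a2 + a1 * a2,
        )
    return weight[root]
```

For a single triangle, `count_subtrees([(0, 1, 2)])` returns `(6, 3)`, so the triangle has 9 subtrees. Processing the triples in reverse order guarantees that, when a triangle is handled, every triangle hanging from its two new vertices has already been absorbed into their weights.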
As examples, one checks by inspection that $\eta(G_0) = 1$. Let $G_1 = (u_0, u_1, u_2)$ be a triangle. Then $G_1$ has three subtrees avoiding $u_0$, namely $\{u_1\}$, $\{u_2\}$ and the complete graph on $\{u_1, u_2\}$, and six subtrees containing $u_0$, induced by the vertex subset $\{u_0\}$ or by the edge subsets $\{u_0u_1\}$, $\{u_0u_2\}$, $\{u_0u_1, u_0u_2\}$, $\{u_0u_1, u_1u_2\}$, and $\{u_0u_2, u_1u_2\}$. Hence $\eta(G_1) = 9$. On the other hand, using Algorithm A with each of $u_0, u_1, u_2$ assigned the initial weight $(1, 0)$, by (1), we have $a_3 = 6$ and $b_3 = 3$. By Step 3 of Algorithm A, we also have $\eta(G_1) = 9$.
Theorem 1. Let $(a, b)$ denote the weight of vertex $u_0$ when Algorithm A stops. Each of the following holds. (i) The value of $a$, denoted by $\eta_{u_0}(G_t)$, is the number of subtrees containing vertex $u_0$ in $G_t$. (ii) The value of $b$, denoted by $\eta(G_t - u_0)$, is the number of subtrees that do not contain vertex $u_0$ in $G_t$. (iii) The value of $a + b$, denoted by $\eta(G_t)$, is the number of subtrees in $G_t$.

Proof:
We argue by induction on $t$. If $t = 1$, then $G_1$ is a triangle. By Algorithm A, we have $a = 6$ and $b = 3$, and by inspection, $\eta(G_1) = 9$. Thus, the result holds for $t = 1$. Suppose that $k \ge 2$ and the result holds for all values of $t < k$.
The structure of $G_t$ when $t = k \ge 2$ is depicted in Figure 4, and we shall use Figure 4 and its notation to illustrate our arguments. Thus, $H_1$, $H_2$ and $H_3$ are the three vertex-disjoint subgraphs of $G_t$ such that $H_1$ contains exactly one vertex $u_1$ of the initial 3-cycle $G_1 = (u_0, u_1, v_1)$, $H_2$ contains exactly one vertex $v_1$ of $G_1$, and $H_3$ contains exactly one vertex $u_0$ of $G_1$, as described in Definition 1.
Relabel the vertices $u_1$, $v_1$, and $u_0$ as $w_1$, $w_2$, and $w_3$ when each of them is considered as a vertex in $H_1$, $H_2$, and $H_3$, respectively. In the rest of the proof of this theorem, we keep in mind that $u_1 = w_1$, $v_1 = w_2$, and $u_0 = w_3$.
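As $G_1 = (u_0, u_1, v_1)$ is an initial 3-cycle of $G_t$, we can view $w_i$ as the initial vertex of the 3-cactus network $H_i$. For each $i \in \{1, 2, 3\}$, since each $H_i$ is a 3-cactus network generated by at most $t - 1$ iterations, by applying Algorithm A to $H_i$ ending at the initial vertex $w_i$ of $H_i$, we obtain the weights $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$ of the vertices $w_1$, $w_2$, and $w_3$, respectively. By the induction hypothesis, we have, for $i \in \{1, 2, 3\}$, both (A) and (B) of the following:
$$\text{(A)}\ a_i = \eta_{w_i}(H_i), \qquad \text{(B)}\ b_i = \eta(H_i - w_i). \tag{2}$$
At this stage of Algorithm A, the graph to be computed is $G_1 = (u_0, u_1, v_1)$ with the weights assigned to $u_0, u_1, v_1$ being $(a_3, b_3)$, $(a_1, b_1)$, and $(a_2, b_2)$, respectively. Running the last iteration of Algorithm A, we delete the vertices $u_1$ and $v_1$, and reset the weight $(a, b)$ of the vertex $u_0$ by (1):
$$a = a_3(3a_1a_2 + a_1 + a_2 + 1) \quad \text{and} \quad b = a_1 + a_2 + a_1a_2 + b_1 + b_2 + b_3. \tag{3}$$
Let $\mathcal{T}$ denote the collection of subtrees of $G_t$, and write $W = \{w_1, w_2, w_3\}$. Define
$$\mathcal{T}_0 = \{T \in \mathcal{T} : V(T) \cap W = \{w_3\}\}, \quad \mathcal{T}_3 = \{T \in \mathcal{T} : W \subseteq V(T)\},$$
$$\mathcal{T}_6 = \{T \in \mathcal{T} : V(T) \cap W = \{w_1, w_2\}\}, \quad \mathcal{T}_7 = \{T \in \mathcal{T} : V(T) \cap W = \emptyset\},$$
and for $i \in \{1, 2\}$, define
$$\mathcal{T}_i = \{T \in \mathcal{T} : V(T) \cap W = \{w_i, w_3\}\}, \quad \mathcal{T}_{3+i} = \{T \in \mathcal{T} : V(T) \cap W = \{w_i\}\}.$$
Thus $\mathcal{T} = \bigcup_{i=0}^{7} \mathcal{T}_i$ is a partition. By (A) and (B) in (2), $|\mathcal{T}_0| = a_3$, $|\mathcal{T}_4| = a_1$, $|\mathcal{T}_5| = a_2$ and $|\mathcal{T}_7| = b_1 + b_2 + b_3$. As the edge $u_0u_1$ lies in every tree in $\mathcal{T}_1$, we have $|\mathcal{T}_1| = a_1a_3$. Likewise, $|\mathcal{T}_2| = a_2a_3$ and $|\mathcal{T}_6| = a_1a_2$. Since every tree in $\mathcal{T}_3$ contains exactly two edges of $G_1$ and there are three pairs of such edges in $G_1$, we have $|\mathcal{T}_3| = 3a_1a_2a_3$. By definition and (3),
$$\eta_{u_0}(G_t) = \sum_{i=0}^{3} |\mathcal{T}_i| = a_3 + a_1a_3 + a_2a_3 + 3a_1a_2a_3 = a.$$
With a similar argument and using $|\mathcal{T}_7| = b_1 + b_2 + b_3$,
$$\eta(G_t - u_0) = \sum_{i=4}^{7} |\mathcal{T}_i| = a_1 + a_2 + a_1a_2 + b_1 + b_2 + b_3 = b.$$
It follows that $\eta(G_t) = a + b$. This shows that when Algorithm A terminates, it outputs the values as stated in Theorem 1 (i), (ii), and (iii), and justifies the correctness of Algorithm A.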
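The correctness argument can also be checked experimentally on small networks by comparing Algorithm A with exhaustive enumeration. The sketch below is an illustration, not part of the proof. It assumes networks are given as (attach, u, v) triples with initial vertex 0, it uses the update $a := a_3(3a_1a_2 + a_1 + a_2 + 1)$, $b := b_3 + b_1 + b_2 + a_1 + a_2 + a_1a_2$ that matches the partition sizes in the proof, and all function names are hypothetical.

```python
from itertools import combinations, product


def algorithm_a(triangles):
    """Total number of subtrees via the weight update of Algorithm A."""
    weight = {}
    for attach, u, v in reversed(triangles):
        a1, b1 = weight.pop(u, (1, 0))
        a2, b2 = weight.pop(v, (1, 0))
        a3, b3 = weight.get(attach, (1, 0))
        weight[attach] = (a3 * (3 * a1 * a2 + a1 + a2 + 1),
                          b3 + b1 + b2 + a1 + a2 + a1 * a2)
    a, b = weight.get(0, (1, 0))
    return a + b


def is_tree(edge_subset):
    """True if the given edges form a tree (small union-find check)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in edge_subset:
        root_x, root_y = find(x), find(y)
        if root_x == root_y:
            return False                   # the edge closes a cycle
        parent[root_x] = root_y
    # an acyclic edge set is connected exactly when |V| = |E| + 1
    return len(parent) == len(edge_subset) + 1


def brute_force(triangles):
    """Count subtrees by testing every edge subset (exponential time)."""
    edges = [e for p, u, v in triangles for e in ((p, u), (p, v), (u, v))]
    num_vertices = 2 * len(triangles) + 1
    total = num_vertices                   # the one-vertex subtrees
    for size in range(1, num_vertices):    # a tree has at most n - 1 edges
        for subset in combinations(edges, size):
            total += is_tree(subset)
    return total


def all_constructions(t):
    """Every way of carrying out Definition 1 for t iterations."""
    for choice in product(*(range(2 * i + 1) for i in range(t))):
        yield [(p, 2 * i + 1, 2 * i + 2) for i, p in enumerate(choice)]
```

A loop such as `all(algorithm_a(g) == brute_force(g) for g in all_constructions(3))` compares the two counts on every network with three triangles. The brute-force routine is only practical for a handful of triangles, which is the point of having the linear algorithm.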

UPPER AND LOWER BOUNDS OF THE NUMBER OF SUBTREES IN A 3-CACTUS NETWORK
The following lemma [8] will be used in our proof.

Theorem 2. For an integer $t$, $S_t^{\triangle}$ is the unique 3-cactus network with the maximum number of subtrees in $G(t)$.
Proof: We prove the theorem by induction on $t$, and observe that the theorem holds when $t \in \{1, 2\}$. Let $k \ge 3$ be an integer and assume that the result is true for $t < k$. We consider the case when $t = k$.
Let G ′ be a 3-cactus network with k triangles, and let (u, v, w) be a pendant triangle such that

where the equality holds if and only if
where $P_2$ is the path with 2 vertices. If $G' \ne S_k^{\triangle}$, then $(k - 1)P_2$ must be a proper subnetwork of $G'_1 - u$, which implies $\eta(G'_1 - u) > \eta((k - 1)P_2)$. Since $G'_2 = (u, v, w)$ and $G'_2 - u = P_2$, it follows from (4) that $\eta(G') \le \eta(S_k^{\triangle})$, where the equality holds if and only if $G' = S_k^{\triangle}$, which completes the proof. □
The next lemma presents some observations which follow from Algorithm A.
Lemma 3. Let $t \ge 3$ be an integer and $G \in G(t)$ be a Class II 3-cactus network. Then there exists a Class I 3-cactus network $G'$ in $G(t)$ satisfying $\eta(G') < \eta(G)$.
Proof: Let $G \in G(t)$ be a Class II 3-cactus network. By Observation 1(iv), $G$ contains two pendant triangles $(u_0, u_1, u_2)$ and $(u_0, u, w)$ such that $d_G(u_0) \ge 6$ and $d_G(u_1) = d_G(u_2) = d_G(u) = d_G(w) = 2$, as depicted in Figure 5A. Let $H$ be the graph depicted in Figure 5. Denote by $a_0$ ($b_0$, respectively) the number of subtrees containing $u_0$ (not containing $u_0$) in $H$. By Algorithm A, when deleting the vertices $u_1$ and $u_2$ from $G$, the weight of vertex $u_0$, denoted by $(a_{u_0}, b_{u_0})$, is reset as follows: $a_{u_0} = 6a_0$ and $b_{u_0} = b_0 + 3$. Then in the step of deleting the vertices $u_0$ and $w$ from $G$, the weight of vertex $u$ is given by
$$a_u = 24a_0 + 2, \qquad b_u = 12a_0 + b_0 + 4.$$
Thus, we have $\eta(G) = 36a_0 + b_0 + 6$.
Let $G'$ denote the graph depicted in Figure 5B, which is obtained from $H$ by identifying vertex $u_0$ of $H$ and a vertex of degree 2 in $P_2^{\triangle}$. By Algorithm A, when deleting the vertices $u_0$ and $u_2$ from $G'$, the weight of vertex $u_1$, denoted by $(a_{u_1}, b_{u_1})$, is obtained as $a_{u_1} = 4a_0 + 2$ and $b_{u_1} = 2a_0 + b_0 + 1$. Then in the step of deleting the vertices $u_1$ and $w$ from $G'$, the weight of vertex $u$ is given by
$$a_u = 16a_0 + 10, \qquad b_u = 10a_0 + b_0 + 6.$$
Thus, we have $\eta(G') = 26a_0 + b_0 + 16$. Algebraic computation yields that $\eta(G) - \eta(G') = 10a_0 - 10 > 0$, and so the lemma is proved.
Theorem 3. For an integer $t$, $P_t^{\triangle}$ is the unique 3-cactus network with the minimum number of subtrees in $G(t)$.
Proof: By Lemma 3, within the family $G(t)$, for every 3-cactus network $G$ in Class II, there is always a 3-cactus network $G'$ in Class I containing fewer subtrees. Thus, we only need to show that $P_t^{\triangle}$ is the unique 3-cactus network with the minimum number of subtrees among all the networks of $G(t)$ in Class I.
We prove the conclusion by induction on $t$. Let $G_t \in G(t)$ be a 3-cactus network with a pendant triangle $(u, v, w)$ such that $d_{G_t}(u) = 4$ and $d_{G_t}(w) = d_{G_t}(v) = 2$. We will prove the following results: (i) $P_t^{\triangle}$ is the unique 3-cactus network with the minimum number of subtrees in $G(t)$; (ii) $\eta_w(G_t)$ attains the minimum value if and only if $G_t = P_t^{\triangle}$. The results hold for $t = 1$ and $t = 2$. Suppose that $k \ge 3$ and the results are true for $t < k$; we consider the case when $t = k$. Let $G_{k-1} = G_k - \{w, v\}$. Since $d_{G_{k-1}}(u) = 2$, by Lemma 2 and the induction hypothesis, we have that where the equalities hold if and only if $G_k = P_k^{\triangle}$.

□
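The comparison made in the proof of Lemma 3 can be replayed numerically. The following Python sketch is illustrative: it treats the weight $(a_0, b_0)$ of $u_0$ in $H$ as free parameters, applies the update of Algorithm A (taken here as $a := a_3(3a_1a_2 + a_1 + a_2 + 1)$, $b := b_3 + b_1 + b_2 + a_1 + a_2 + a_1a_2$) twice for each of the two networks, and checks the stated difference $10a_0 - 10$. The names `merge`, `eta_class_two` and `eta_class_one` are hypothetical.

```python
LEAF = (1, 0)   # weight of a vertex with nothing else attached


def merge(w1, w2, w3):
    """Weight of the surviving vertex of a triangle after deleting two.

    w1 and w2 are the weights of the deleted vertices, w3 the weight of
    the vertex that is kept.
    """
    (a1, b1), (a2, b2), (a3, b3) = w1, w2, w3
    return (a3 * (3 * a1 * a2 + a1 + a2 + 1),
            b3 + b1 + b2 + a1 + a2 + a1 * a2)


def eta_class_two(a0, b0):
    """eta(G): pendant triangles (u0,u1,u2) and (u0,u,w) share u0 of H."""
    w_u0 = merge(LEAF, LEAF, (a0, b0))     # delete u1 and u2, keep u0
    a, b = merge(w_u0, LEAF, LEAF)         # delete u0 and w, keep u
    return a + b


def eta_class_one(a0, b0):
    """eta(G'): H hangs at u0 on (u1,u0,u2), and u1 lies on (u,u1,w)."""
    w_u1 = merge((a0, b0), LEAF, LEAF)     # delete u0 and u2, keep u1
    a, b = merge(w_u1, LEAF, LEAF)         # delete u1 and w, keep u
    return a + b
```

For example, with $(a_0, b_0) = (6, 3)$, that is, when $H$ is a single triangle, the two functions return 225 and 175, and the difference 50 agrees with $10a_0 - 10$.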
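With Algorithm A, we can compute the numbers of subtrees of $S_t^{\triangle}$ and $P_t^{\triangle}$, respectively. Let $u_0$ be the center of $S_t^{\triangle}$ and $v_0$ be a vertex of degree 2 in a pendant triangle of $P_t^{\triangle}$. From Algorithm A, it follows that
$$\eta_{u_0}(S_t^{\triangle}) = 6\,\eta_{u_0}(S_{t-1}^{\triangle}), \qquad \eta(S_t^{\triangle} - u_0) = \eta(S_{t-1}^{\triangle} - u_0) + 3,$$
$$\eta_{v_0}(P_t^{\triangle}) = 4\,\eta_{v_0}(P_{t-1}^{\triangle}) + 2, \qquad \eta(P_t^{\triangle} - v_0) = \eta(P_{t-1}^{\triangle} - v_0) + 2\,\eta_{v_0}(P_{t-1}^{\triangle}) + 1.$$
By algebraic manipulations, we obtain that
$$\eta(S_t^{\triangle}) = 6^t + 3t \tag{5}$$
and
$$\eta(P_t^{\triangle}) = \frac{25 \cdot 4^t - 3t - 16}{9}. \tag{6}$$
By Theorems 2 and 3, $\eta(S_t^{\triangle})$ and $\eta(P_t^{\triangle})$ represent the upper and lower bounds of the number of subtrees in a 3-cactus network with $t$ triangles, respectively. As shown in (5) and (6), the numbers of subtrees in $S_t^{\triangle}$ and $P_t^{\triangle}$ grow exponentially as the number of triangles increases, as seen in Figure 6, where the values of $\eta(S_t^{\triangle})$ and $\eta(P_t^{\triangle})$ are plotted against $t$.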
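The two extremal families can be evaluated directly with the update rule of Algorithm A. The sketch below is a hedged illustration: it assumes the triangle-star is $t$ triangles sharing the center $u_0$ and the triangle-path is a chain of triangles rooted at a degree-2 vertex $v_0$ of an end triangle, and it compares the results with the closed forms $\eta(S_t^{\triangle}) = 6^t + 3t$ and $\eta(P_t^{\triangle}) = (25 \cdot 4^t - 3t - 16)/9$, which were obtained for this sketch by solving the recurrences that the update rule produces. The function names are hypothetical.

```python
LEAF = (1, 0)   # weight of a vertex with nothing else attached


def merge(w1, w2, w3):
    """Algorithm A update: delete the vertices with weights w1, w2, keep w3."""
    (a1, b1), (a2, b2), (a3, b3) = w1, w2, w3
    return (a3 * (3 * a1 * a2 + a1 + a2 + 1),
            b3 + b1 + b2 + a1 + a2 + a1 * a2)


def star_weight(t):
    """Weight (a, b) of the center of the triangle-star with t triangles."""
    w = LEAF
    for _ in range(t):
        w = merge(LEAF, LEAF, w)           # absorb one pendant triangle
    return w


def path_weight(t):
    """Weight (a, b) of an end vertex v0 of the triangle-path."""
    w = LEAF
    for _ in range(t):
        w = merge(w, LEAF, LEAF)           # prepend one triangle to the chain
    return w


def star_closed_form(t):
    return 6 ** t + 3 * t


def path_closed_form(t):
    return (25 * 4 ** t - 3 * t - 16) // 9   # the numerator is divisible by 9
```

For $t = 3$ this gives $\eta(S_3^{\triangle}) = 225$ and $\eta(P_3^{\triangle}) = 175$, so the gap between the upper and lower bounds is already visible for three triangles.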

THE ENTROPY OF SUBTREES OF KOCH NETWORKS
The entropy of spanning trees has been well-studied, as seen in Lyons et al. [21] and Zhang et al. [22,23], among others. The entropy of subtrees can be similarly defined.

Definition 2.
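Let $G$ be a network with $N(G)$ vertices. The entropy of subtrees of $G$ is defined as follows:
$$E(G) = \lim_{N(G) \to \infty} \frac{\ln \eta(G)}{N(G)}.$$
Let $T$ be a tree with $n$ vertices. Székely and Wang [2] and Yan and Yeh [3] obtained the following bounds on the number of subtrees in a tree:
$$\frac{n(n+1)}{2} = \eta(P_n) \le \eta(T) \le \eta(K_{1,n-1}) = 2^{n-1} + n - 1,$$
where $P_n$ and $K_{1,n-1}$ are the path and the star with $n$ vertices, respectively. So, we have
$$E(P_n) = \lim_{n \to \infty} \frac{\ln \frac{n(n+1)}{2}}{n} = 0, \qquad E(K_{1,n-1}) = \lim_{n \to \infty} \frac{\ln(2^{n-1} + n - 1)}{n} = \ln 2.$$
From Equations (5) and (6), we can calculate the entropies of subtrees of $S_t^{\triangle}$ and $P_t^{\triangle}$, respectively, as follows:
$$E(S_t^{\triangle}) = \lim_{t \to \infty} \frac{\ln \eta(S_t^{\triangle})}{2t + 1} = \frac{\ln 6}{2}, \qquad E(P_t^{\triangle}) = \lim_{t \to \infty} \frac{\ln \eta(P_t^{\triangle})}{2t + 1} = \ln 2.$$
Therefore, for a tree $T$ with $n$ vertices and a 3-cactus network $G_t$ with $t$ triangles, we have
$$0 = E(P_n) \le E(T) \le E(K_{1,n-1}) = \ln 2 = E(P_t^{\triangle}) \le E(G_t) \le E(S_t^{\triangle}) = \frac{\ln 6}{2}.$$
Now we calculate the entropy of subtrees of Koch networks. The following definition and notation will be used in this section. Let $\alpha(x)$ and $\beta(x)$ be two quantities that tend to infinity as $x$ tends to infinity. If $\lim_{x \to \infty} \frac{\alpha(x)}{\beta(x)} = 1$, we say that $\alpha(x)$ and $\beta(x)$ are equivalent infinities, and write $\alpha(x) \sim \beta(x)$ ($x \to \infty$).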
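The limits in the definition of entropy can be explored numerically. The sketch below is illustrative only: it evaluates $\ln \eta(G) / N(G)$ for large but finite networks, using the tree formulas quoted above and, for the triangle-star and triangle-path, the update rule of Algorithm A (taken as $a := a_3(3a_1a_2 + a_1 + a_2 + 1)$, $b := b_3 + b_1 + b_2 + a_1 + a_2 + a_1a_2$). The function names, the assumed shapes of the two 3-cactus families, and the choice of size 2000 are all arbitrary choices made here.

```python
import math


def entropy_estimate(num_subtrees, num_vertices):
    """Finite-size estimate ln(eta(G)) / N(G) of the entropy of subtrees."""
    return math.log(num_subtrees) / num_vertices   # math.log accepts big ints


def eta_path_tree(n):
    return n * (n + 1) // 2                # eta(P_n)


def eta_star_tree(n):
    return 2 ** (n - 1) + n - 1            # eta(K_{1,n-1})


def merge(w1, w2, w3):
    """Algorithm A update: delete the vertices with weights w1, w2, keep w3."""
    (a1, b1), (a2, b2), (a3, b3) = w1, w2, w3
    return (a3 * (3 * a1 * a2 + a1 + a2 + 1),
            b3 + b1 + b2 + a1 + a2 + a1 * a2)


def eta_triangle_star(t):
    w = (1, 0)
    for _ in range(t):
        w = merge((1, 0), (1, 0), w)       # add a triangle at the center
    return sum(w)


def eta_triangle_path(t):
    w = (1, 0)
    for _ in range(t):
        w = merge(w, (1, 0), (1, 0))       # extend the chain by one triangle
    return sum(w)


if __name__ == "__main__":
    n = t = 2000
    print("path tree     ", entropy_estimate(eta_path_tree(n), n))
    print("star tree     ", entropy_estimate(eta_star_tree(n), n))
    print("triangle-path ", entropy_estimate(eta_triangle_path(t), 2 * t + 1))
    print("triangle-star ", entropy_estimate(eta_triangle_star(t), 2 * t + 1))
```

At this size the four estimates lie close to 0, $\ln 2$, $\ln 2$ and $(\ln 6)/2$, respectively. A 3-cactus network with $t$ triangles has $2t + 1$ vertices, which is why the two 3-cactus estimates are divided by $2t + 1$.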
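For Koch networks, another construction is given as follows. The network $G_{m,g}$ is obtained from $G_{m,i}$ ($i = 0, 1, 2, \ldots, g - 1$) and the initial triangle by adding $m$ copies of each $G_{m,i}$ to each of the three hub vertices of the initial triangle, as shown in Figure 7.
According to the construction, $G_{m,g}$ has the form illustrated in Figure 8, where $H_g$ is the subnetwork of $G_{m,g}$ obtained from $m$ copies of $G_{m,i}$ ($i = 0, 1, 2, \ldots, g - 1$) by identifying the $mg$ hub vertices $X$ in these copies. Then $\eta_X(H_g)$ and $\eta(H_g - X)$ are given by
$$\eta_X(H_g) = \prod_{i=0}^{g-1} \left[\eta_X(G_{m,i})\right]^m, \qquad \eta(H_g - X) = m \sum_{i=0}^{g-1} \eta(G_{m,i} - X). \tag{7}$$
Applying Algorithm A to the initial triangle $(X, Y, Z)$, we get that
$$\eta_X(G_{m,g}) = \eta_X(H_g)\left[3\eta_X(H_g)^2 + 2\eta_X(H_g) + 1\right], \qquad \eta(G_{m,g} - X) = 3\eta(H_g - X) + \eta_X(H_g)^2 + 2\eta_X(H_g), \tag{8}$$
and the initial conditions are $\eta_X(G_{m,0}) = 6$ and $\eta(G_{m,0} - X) = 3$. So, we arrive at
$$\eta(G_{m,g}) = 3\eta_X(H_g)^3 + 3\eta_X(H_g)^2 + 3\eta_X(H_g) + 3\eta(H_g - X). \tag{9}$$
From Equation (7), we obtain
$$\eta_X(H_g) = \eta_X(H_{g-1})\left[\eta_X(G_{m,g-1})\right]^m, \qquad \eta(H_g - X) = \eta(H_{g-1} - X) + m\,\eta(G_{m,g-1} - X). \tag{10}$$
According to the above theoretical analysis, we give a linear algorithm for counting the number of subtrees in a Koch network, as follows.
Algorithm B:
Initial condition: Let $G_{m,g}$ be a Koch network generated by $g$ iterations, and let $X$, $Y$ and $Z$ be the three hub vertices in $G_{m,g}$. Let $\eta(G_{m,g})$ denote the number of subtrees of $G_{m,g}$.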
Step 1. Set $k := 1$, $x := 1$, $y := 0$, $a := 6$ and $b := 3$.
Step 2. Suppose that, at iteration $k$, the number of subtrees containing $X$ in $H_k$ and the number of subtrees not containing $X$ in $H_k$ are denoted by $x$ and $y$, respectively. One gets the values $x$ and $y$ as $x := x a^m$, $y := y + mb$.
Step 3. Suppose that $a$ and $b$ denote the number of subtrees containing $X$ in $G_{m,k}$ and the number of subtrees not containing $X$ in $G_{m,k}$, respectively. We get $a$ and $b$ by $a := x(3x^2 + 2x + 1)$, $b := 3y + x^2 + 2x$.
Step 4. If $k = g$, then stop and output $\eta(G_{m,g}) = a + b$. Otherwise, set $k := k + 1$ and go to Step 2.
It is not difficult to see that the complexity of the above algorithm is $O(g)$. In order to calculate the entropy of subtrees of Koch networks, we need the following lemma:
Lemma 4. Let $G_{m,g}$ be a Koch network generated by $g$ iterations. Then
Proof: From Equation (10), we have the following recurrence relations:
$$\eta_X(H_g) = \eta_X(H_{g-1})\left[\eta_X(H_{g-1})\left(3\eta_X(H_{g-1})^2 + 2\eta_X(H_{g-1}) + 1\right)\right]^m,$$
$$\eta(H_g - X) = \eta(H_{g-1} - X) + m\left[3\eta(H_{g-1} - X) + \eta_X(H_{g-1})^2 + 2\eta_X(H_{g-1})\right].$$
It is easy to obtain $\eta_X(H_g) \sim 3^m \left(\eta_X(H_{g-1})\right)^{3m+1}$ ($g \to \infty$). Note that $\eta_X(H_1) = 6^m$.
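Steps 2 to 4 of Algorithm B translate into a short loop. The Python sketch below is an illustration under stated assumptions: the initialization $x = 1$, $y = 0$ (for $H_0$ taken to be the single vertex $X$) and $a = 6$, $b = 3$ (for the triangle $G_{m,0}$) is inferred from the initial conditions quoted in the text, and the function names are hypothetical. As a sanity check, the sketch also builds the Koch network explicitly by the iterative definition and counts its subtrees with the update rule of Algorithm A.

```python
def koch_subtrees(m, g):
    """Number of subtrees of the Koch network G_{m,g}, following Algorithm B."""
    x, y = 1, 0            # assumed start: H_0 is the single hub vertex X
    a, b = 6, 3            # G_{m,0} is a triangle
    for _ in range(g):
        x, y = x * a ** m, y + m * b                          # Step 2
        a, b = x * (3 * x * x + 2 * x + 1), 3 * y + x * x + 2 * x   # Step 3
    return a + b


def build_koch(m, g):
    """Triangles of G_{m,g} as (attach, u, v) triples, hub X being vertex 0."""
    triangles = [(0, 1, 2)]
    next_vertex = 3
    for _step in range(g):
        new_triangles = []
        for tri in triangles:
            for v in tri:
                for _copy in range(m):
                    new_triangles.append((v, next_vertex, next_vertex + 1))
                    next_vertex += 2
        triangles.extend(new_triangles)
    return triangles


def algorithm_a(triangles):
    """Number of subtrees via the weight update of Algorithm A."""
    weight = {}
    for attach, u, v in reversed(triangles):
        a1, b1 = weight.pop(u, (1, 0))
        a2, b2 = weight.pop(v, (1, 0))
        a3, b3 = weight.get(attach, (1, 0))
        weight[attach] = (a3 * (3 * a1 * a2 + a1 + a2 + 1),
                          b3 + b1 + b2 + a1 + a2 + a1 * a2)
    return sum(weight[0])
```

For example, `koch_subtrees(1, 1)` returns 783 and `koch_subtrees(2, 1)` returns 143982, in agreement with `algorithm_a(build_koch(1, 1))` and `algorithm_a(build_koch(2, 1))`. The loop performs $g$ iterations, whereas the explicit network has $(3m + 1)^g$ triangles, which is the saving Algorithm B is designed to achieve.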
The entropy of subtrees of Koch networks reflects the fact that although the number of subtrees of Koch networks grows exponentially, the growth rate is lower than that of the triangle-star and higher than that of the triangle-path. This result suggests that if the vertex failure probability $p$ satisfies $p \to 0$, the edge failure probability $q$ satisfies $q \to 1$, and $p$ and $1 - q$ are equivalent infinitesimals, then Koch networks are less reliable than the triangle-star and more reliable than the triangle-path. Noticing that $E(G_{m,g})$ can be considered as a function of the variable $m$, it is easy to see that the entropy of subtrees of Koch networks tends to the maximum entropy of subtrees of all 3-cactus networks as $m$ tends to infinity.

CONCLUSIONS
In this paper, we have given the definition of the entropy of subtrees, which is used to compare the average number of subtrees for networks of different sizes. We have established a linear algorithm for counting the number of subtrees in any 3-cactus network, and characterized the 3-cactus networks attaining the upper and lower bounds of the entropy of subtrees among all 3-cactus networks with $t$ triangles. In order to avoid exponential computation, we have also proposed a linear algorithm for calculating the number of subtrees in Koch networks. Finally, we have determined the entropy of subtrees of Koch networks, which tends to the maximum entropy of subtrees of all 3-cactus networks as $m$ tends to infinity.

DATA AVAILABILITY STATEMENT
The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.

AUTHOR CONTRIBUTIONS
LD, HZ, and H-JL contributed the conception and design of the study. LD organized the literature, wrote the simulation programs, and performed the design of figures. LD and HZ wrote the first draft of the manuscript. All authors contributed to the manuscript revision, read, and approved the submitted version.