Modified Particle Swarm Optimization Algorithms for the Generation of Stable Structures of Carbon Clusters, Cn (n = 3–6, 10)

Particle Swarm Optimization (PSO), a population based technique for stochastic search in a multidimensional space, has so far been employed successfully for solving a variety of optimization problems including many multifaceted problems, where other popular methods like steepest descent, gradient descent, conjugate gradient, Newton method, etc. do not give satisfactory results. Herein, we propose a modified PSO algorithm for unbiased global minima search by integrating with density functional theory which turns out to be superior to the other evolutionary methods such as simulated annealing, basin hopping and genetic algorithm. The present PSO code combines evolutionary algorithm with a variational optimization technique through interfacing of PSO with the Gaussian software, where the latter is used for single point energy calculation in each iteration step of PSO. Pure carbon and carbon containing systems have been of great interest for several decades due to their important role in the evolution of life as well as wide applications in various research fields. Our study shows how arbitrary and randomly generated small Cn clusters (n = 3–6, 10) can be transformed into the corresponding global minimum structure. The detailed results signify that the proposed technique is quite promising in finding the best global solution for small population size clusters.

On the other hand, the investigation on pure carbon molecules existing in various structural forms (chains/cyclic rings) has been a matter of great interest in the research area of organic, inorganic and physical chemistry (Weltner and Van Zee, 1989) as the study and production of carbon-riched molecules in the laboratory are notoriously difficult due to their high reactivity and transient like behavior. They are also very important in astrophysics, particularly in connection with the chemistry of carbon stars (Bernath et al., 1989), comets (Douglas, 1951), and interstellar molecular clouds (Bettens and Herbst, 1997). Long carbon chains are also believed to act as carriers of diffuse interstellar bands (Fulara et al., 1993). Moreover, carbon clusters are also important constituents in hydrocarbon flames and other soot-forming systems (Kroto and McKay, 1988) and they play an important role in gas-phase carbon chemistry where they serve as intermediates for the production of fullerenes, carbon tubes, thin diamond and silicon carbide films (Koinuma et al., 1996;Van Orden and Saykally, 1998). Therefore, the study about the structures and stabilities of carbon clusters is very important to thoroughly understand the complex chemical environment of such systems and also to shed light into the remarkable bonding capability of carbon which is able to form single, double and triple bonds. They together make the study on the structural information of carbon clusters in the field of theoretical research a subject of immense interest and it started before the development of fullerene chemistry (Pitzer and Clementi, 1959;Weltner and Van Zee, 1989;Martin et al., 1993;. Due to the reduction in angle strain, carbon clusters larger than C 10 are likely to exist as monocyclic rings, while smaller ones possess low-energy linear structures. Moreover, it was reported that for small clusters with even number of carbon atoms such as C 4 , C 6 , and C 8 , the cyclic form is either the lowest energy isomer or almost isoenergetic to their linear counterparts (Raghavachari and Binkley, 1987;Watts et al., 1992;Hutter and Lüthi, 1994;Pless et al., 1994;Martin and Taylor, 1996). In this study, we have checked the efficiency of our newly developed multi-threaded PSO code, written in python, and augmented by Gaussian 09 program package (Frisch et al., 2013) to locate global minimum energy structures for C n clusters (n = 3-6). Particularly, we want to test our code for the system where two minima are located at two deep well points on the PES as in the case of C 6 cluster. We kept the cluster size small in order to compare the performance of our code to other popular evolutionary simulation techniques such as SA, GA, and BH.

CURRENTLY PROPOSED AND IMPLEMENTED PSO TECHNIQUE
Initially, random structures are generated within certain range (−3, 3) in a multidimensional search space followed by upgradation of velocity and position vectors through swarm intelligence. After completion of every iteration, energy of each particle is calculated and a convergence criterion is verified with the help of the Gaussian 09 package interfaced with the present PSO algorithm. Individual best and global best positions are updated. If the energy values of successive 30 iterations remain same, the program automatically terminates. Finally a new set of initial structures are generated from the related output structures and the process is continued till the self-consistency is achieved.
In order to check the efficiency of our proposed PSO method over some most familiar GO methods like advanced BH, SA, and GA methods, the results for C 5 cluster have been analyzed, as a reference system.

A COMPARATIVE ACCOUNT OF THE CURRENT PSO METHOD WITH OTHER EXISTING APPROACHES
We have made the computer experiment to compare our proposed PSO with the other popular evolutionary simulation techniques such as SA, GA and advanced BH.

Comparison of Performances of PSO and GA
(a) The most important distinction between our proposed DFT-PSO with GA is the sharing of information. In GA, chromosomes share information with each other, whereas in PSO the best particle informs the others and the information of variables is stored in small memory. Again, PSO search for the global best solution is unidirectional, while GA follows the parallel searching process.
(b) In contrast to GA, PSO does not use any genetic kind of operator, i.e., crossover and mutation, and the internal velocity leads the particle to the next better place. (c) PSO implementation is more simple and easier than GA as it deals with few parameters (like position and velocity only). (d) GA provides satisfactory results in case of combinatorial problems, PSO being less suitable there. (e) PSO takes much less time to execute and the convergence rate is also faster than that of GA.
A previous study by Hassan et al. (2005) has been further recommended for more clarity and reliability of the efficiency of PSO over GA.

Comparison of Performances of PSO and SA
In SA technique, a small perturbation is given to cluster entity at each successive step, and energy estimation is carried out consecutively. Acceptance of perturbation depends on the obtained energy value. If the obtained energy is better than the previous one, the perturbation as well as the move with low cost is accepted. Otherwise, the process excludes it and the Boltzmann probability distribution is applied at a given temperature. The particle (individual cluster) in SA takes much time to generate different lower energy structures. The temperature decreases during the whole course of the process very slowly and at the end of the run it attains the least value. In contrast, such kind of perturbation or temperature variable is not present in PSO. Both exploitation and exploration techniques drive the particle in PSO, while only exploitation is used in SA. So, there are more chances to trap the particles in local minima in case of SA being a singlebased technique than PSO. On the other hand, PSO, being the population based technique, is able to swarm wherever (different places of mountain or lower point of valleys) be the particle in the search space.

Comparison With Basin Hopping
Wales and Doye jointly described basin hopping algorithm (Berg and Neuhaus, 1991;Wales and Doye, 1997;Doye et al., 1998) which has become a popular stochastic search process to find out the desired global best solution of an object function. This method is basically a Monte Carlo technique, which works in a perturbative and iterative manner. At first, a random coordinate of a particle is considered. Then, random perturbation is applied to the configuration considering the fact that the particle remains in a local basin which is then followed by the minimization of energy functional to get a better solution. Energy estimation is again carried out and the process is repeated until the best configuration or the lowest energy structure is achieved. The most important thing is that the applied perturbation should be large enough to get out of a local basin.

ALGORITHM AND COMPUTATIONAL DETAILS
At the beginning, a set of random coordinates of C n clusters (particles) with random positions and velocities are considered.
The newer sets of coordinates are updated through PSO run to find out global best position or configuration. The local best configuration (p best ) or that having the lowest energy value obtained locally so far is stored in a small memory variable which is then followed by the searching of global best (g best ) configuration (among the set of p best ) through an exploration technique. Ultimately, the best optimal solution is achieved.
The new velocities (v t+1 i ) and positions (x t+1 i ) of particles in ith generation obey the following equations where x t i and v t i are the current position and velocity.
ε 1 and ε 2 are chosen randomly in between (0,1). The tendency of a particle to remain in its current position is called inertia coefficient denoted by w. d 1 and d 2 (which can be modified as per requirement) which are referred to as individual coefficient of acceleration and global coefficient of acceleration, respectively. These two coefficients guide the particles to meet convergence so that all the candidate solutions in the problem space efficiently achieve the global minimum (see Table 1).
After the completion of each PSO run, optimization of global best structural units of C n clusters (n = 3-6) are performed at the B3LYP (Lee et al., 1988;Becke, 1993)/6-311+G * level in the Gaussian 09 program.

PARALLEL IMPLEMENTATION
One of the major advantages of using PSO as proposed in this paper over some of the classical optimization techniques is its parallelizability. The same implementation of the algorithm can be executed on machines having single core (serial implementation) or ones with multiple cores (laboratory grade clusters) or high performance computing (HPC) systems. Changing a couple of header parameters in the program is sufficient to make it portable across a wide range of platforms. We have tested both a serial as well as a parallel implementation of our programs. Results on parallel implementation are reported. It may be noted that our PSO algorithm implemented in Python invokes the Gaussian software as a system call. Each such parallel call, one for each particle of the PSO algorithm, causes a new instance of Gaussian to be executed. The number of cores on which each Gaussian instance runs is dependent on the available number of processor cores. However, at the  2 | The randomly chosen 10 different molecular frameworks of C n (n = 3-6, 10) with singlet and triplet spin multiplicity converge to the global minimum energy structures (Bond lengths are given in Å unit and the relative energies, E w.r.t the global minimum energy structures in brackets are given in kcal/mol).

Clusters
Initial structure Final structure using PSO Final optimized energy (bond lengths)    end of every iteration, PSO has to recompute the best and global best positions of individual particle before updating the velocity values from which the new positions of the particles are determined. These are done by reading the output log files generated by Gaussian for each particle. This implies that the results of all the parallel invocations of Gaussian need to be completed before the iteration-end processing can be done. We have implemented appropriate synchronization mechanisms to enable such parallel implementation and hence, the code base is portable across multiple platforms.

COMPUTATIONAL SETUP
All our computations were carried out on a single server having two Intel 2.70 GHz Xeon E5-2697 v2 processors and 256 GB of RAM. Each processor has 12 cores. Leaving aside a few cores for operating system and other housekeeping processes, we made use of 30 threads for executing our PSO algorithm. A PSO population size of 15 particles implies that 2 threads could be used for each instance of Gaussian. Also, 8 GB of RAM was dedicated to each such instance. As mentioned before, the number of PSO particles, RAM assignment and the number of threads for each Gaussian call are set as input hyper parameters. The completely parameterized implementation of PSO has been done in Python invoking Gaussian for energy calculation in a multi-threaded environment. This is one of the unique features of our work, which has not yet been reported in the literature for stable structure prediction of C n , to the best of our knowledge.

RESULTS AND DISCUSSION
In our study, each C n cluster unit (each individual unit) is considered to be a swarm particle in a multidimensional potential energy surface (PES) where the stationary points (maxima, minima, and higher order saddle points) are connected. The randomly generated individual particle is governed by a position vector and a velocity vector. Again, each position vector representing a candidate solution in the hyperspace starts searching for the optima of a given function of several variables by updating generations in iterative process without much of any assumption leading to a minimum energy structure. After iteration the particle driven by a velocity vector changes its search direction. The position and velocity vectors together store the information regarding its own best position or the local best position (called p best ) seen so far and a global best position (called g best ) which is obtained by communicating with its nearest neighbors. Further, the advancement of particles toward the global best position is attained via particle swarm optimizer ideology and they gravitate toward the global best solution with the help of the best variable memory values. Our proposed PSO implementation explores rapidly without being entrapped in local optima and executes extensively, followed by immediate convergence to the desired objective value, the global optima.
The results of global optimization of C n clusters (n = 3-6, 10) considering a maximum of 1,000 runs starting from the random choices of input configuration are shown in Table 2. The global stable structure (best solution) can be obtained by fulfilling the termination criteria along the convex function of the information matrix when one of the particles reaches the target. Initially, 10 different random configurations have been chosen by setting random initial positions and velocities of all particles followed by the Gaussian interfaced PSO driven operation to get the global optimum structure (see Table 2).
It is a very fascinating aspect that Gaussian optimization technique works in such a way that the guess structure can be stuck at local minima which may or may not be the global minimum. But, it is obvious that our proposed modified PSO implementation converges to the most stable structure where all the particles exist in a given range in the multidimensional hyperspace. However, sometimes atoms of the randomly generated particles (each individual cluster unit) are not in the limit of bonding perception and they might overlap on each other. In order to understand whether the atoms remain in the same molecular framework or not, we have connected the randomly deployed particles with solid lines in the following figures and they do not necessarily imply true bonds (see Table 2). In case of C 3 cluster, the structure obtained after the end of the PSO run (linear, D ∞h point group) exactly matches with the structure obtained after the final G09 optimization in terms of bond length and energy. C 5 cluster also shows linear geometry with D ∞h point group and singlet electronic state after final optimization step. A significantly higher energy cyclic isomer is also found in this case. On the other hand, C 4 and C 6 clusters (containing even number of C atoms) give both linear (D ∞h ) and ring structures (D 2h for C 4 and D 3h for C 6 ). Corresponding energies and bond lengths are provided in Table 2. The computed geometrical parameters and minimum energy structures match excellently with the previously reported experimental results (Raghavachari and Binkley, 1987;Watts et al., 1992;Hutter and Lüthi, 1994;Pless et al., 1994;Martin and Taylor, 1996;Van Orden and Saykally, 1998). For both C 4 and C 6 clusters, the lowest energy isomer has linear form in triplet state, whereas the linear singlet state is 17.8 (C 4 ) and 13.3 (C 6 ) kcal/mol higher in energy than the corresponding triplet forms. In addition to the small cluster systems, we have also checked the efficiency and the robustness of our implemented PSO code to find the global FIGURE 2 | Single point energy evolution landscape of C 5 cluster during each generation of convergence at the B3LYP/6-311+G* level.
Frontiers in Chemistry | www.frontiersin.org minimum for a relatively larger sized cluster, C 10 . The results show that the present code can successfully locate the desired D 10h symmetric ring structure which is the most stable isomer in this case.
In the present context, we have also carried out DFT-SA and DFT-BH methods considering same object energy function as in our proposed PSO method to compare the obtained results (see Table 3). The tabulated values clearly reflect that the present PSO method is superior to other methods based on the time to locate the GM, energy values after completion of all runs of the studied methods and the number of iteration steps needed to get the final structure.
A representative plot of C 5 cluster (as reference) is given below to ensure the fulfillment of convergence criteria up to 600 iteration steps (see Figure 2).

CONCLUSION
This systematic study for the searching of the most stable carbon based small clusters describes the effectiveness of the application of our proposed PSO technique. Currently employed less expensive and relatively less complicated computational method generates a vast potential search space depending only on the position and velocity variables. Our proposed method opens a new vista to find out global minimum energy structures effectively and accurately within a given multidimensional configuration search space. PSO implementation without much of any assumption like constraints of symmetry and externally imposed factors like temperature, pressure, etc. performs suitably and converges to a single configuration that presumably is a global minimum energy structure or may exactly fit it after Gaussian optimization. PSO can be used as a fast postprocessing technique to get a global minimum or close to global minimum structure. In fact, in this study we have introduced a new easy implementation and computationally less expensive approach for the reduction of iteration steps to obtain global best configurations of small carbon clusters with exact energy values.

DATA AVAILABILITY
All datasets generated for this study are included in the manuscript and/or the Supplementary Files.

AUTHOR CONTRIBUTIONS
GJ: interfacing with the Gaussian software, writing first version of the full manuscript, generation of the TOC, and analysis of the results. AM: implementation of the PSO algorithm in python including coding and parallelization. SP: revision of the draft manuscript and interpretation of data. SS: revision of the draft manuscript and checking of implemented code. PC: formulation of the problem, critical revision of the manuscript, and data interpretation.