Microdosimetry and Dose-Averaged LET Calculations of Protons in Liquid Water: A Novel Geant4-DNA Application

The spatial distribution of energy deposition events is an essential aspect in the determination of the radiobiological effects of ionizing radiation at the cellular level. Microdosimetry provides a theoretical framework for the description of these events, and has been used in several studies to address problems such as the characterization of Linear Energy Transfer (LET) and Relative Biological Effectiveness (RBE) of ion beams for proton therapy applications. Microdosimetry quantities and their distributions can be obtained by means of Monte Carlo simulations. In this work, we present a track structure Monte Carlo (MC) application, based on Geant4-DNA, for the computation of microdosimetric distributions of protons in liquid water. This application provides two sampling methods uniform and weighted, for the scoring of the quantities of interest in spherical sites, with diameters ranging from 1 to 10 μm. As an element of novelty, the work shows the approach followed to calculate, without resorting to dedicated simulations, the distribution of energy imparted to the site per electronic collision of the proton, which can be used to obtain the macroscopic dose-averaged LET as proposed by Kellerer. Furthermore, in this work the concept of effective mean chord length is proposed to take into account δ-ray influx and escape in the calculation of macroscopic dose-averaged LET for proton track segments and retrieve the agreement predicted by Kellerer’s formula. Finally, the results obtained demonstrate that our MC application is reliable and computational-efficient to perform calculations of microdosimetric distributions and dose-averaged LET of proton track segments in liquid water.


INTRODUCTION
The radiobiological effects of ionizing radiation can be traced back to the microscopic patterns of energy deposition in irradiated media, which can be described quantitatively with the formalism of microdosimetry [1][2][3]. In view of the growing awareness of the variations of proton Relative Biological Effectiveness (RBE) with increasing Linear Energy Transfer (LET) [4][5][6], several studies have turned to the framework of microdosimetry to approach problems such as the characterization of LET distributions for proton therapy beams and the determination of their RBE [7][8][9][10]. Indeed, by obtaining the LET distributions it is possible to optimize treatment plans, taking into account the actual biological impact of radiation by working on LET objectives and constraints. Microdosimetry, on one hand, permits the calculation of RBE from a microscopic approach by means of the Microdosimetric-Kinetic Model (MKM) [11][12][13] for cell killing endpoint and, on the other hand, provides physical concepts and computational tools to calculate macroscopic dose-averaged LET ( L D ) distributions. These, in fact, being representative both of the amount of energy imparted and of its spatial concentration, are strictly related to the biological damage induced by radiation.
The microscopic analogue of the LET is called lineal energy (y), which is a stochastic quantity that accounts for the energy imparted per unit length to a certain sensitive volume, called site, by an incident particle and all its secondary products [14]. As for the macroscopic distribution of LET, the microscopic distribution of y can be biologically characterised by its dose-weighted average, y D [15]. The difference between L D and y D is given by energy straggling, which can be accounted for by considering the weighted average δ 2 of the energy imparted to the site per individual collision ε c of the incident particle. The mathematical description of this relation was given by Kellerer, and can be used to extract the macroscopic dose-averaged LET of a beam from its microdosimetric distributions [16][17][18].
The calculation of microdosimetric distributions and their moments can be carried out both analytically or by means of Monte Carlo (MC) simulations. In this work, we present the main features of a MC code developed with Geant4 [19][20][21] for the computation of microdosimetric distributions generated by proton track segments in liquid water via track-structure simulations done with the package Geant4-DNA [22][23][24][25]. Our code provides two sampling methods, called uniform and weighted, for the computation of the quantities of interest in spherical sites. We also highlight the description of the novel scoring method followed to compute of the energy imparted to the site per electronic collision of the primary proton, in order to connect the obtained y D values to the macroscopic L D ones avoiding resorting to dedicated calculations as discussed in our previous work [26]. Following the description of the geometry and the sampling algorithm implemented, the results obtained for microdosimetric distributions for different site sizes and energies are presented. Finally, the code is tested and benchmarked by comparing the L D calculated with the formula proposed by Kellerer and the macroscopic one. As we found significant deviations from its prediction, probably due to the influence of secondary electrons, here we propose the usage of a quantity, named effective mean chord length, l * F , with which we calculated effective frequency-and dose-averaged lineal energies, y * F and y * D , respectively, so that we could recover the agreement predicted by Kellerer for protons between 10 and 90 MeV.

Theoretical Framework
The use of the LET for the description of the biological effect of different radiation qualities at a small scale presents some limitations, the principal ones being that the concept of LET does not consider the random nature of energy loss along a particle track and that it does not take into account the length of the track relative to a finite target structure. The effects of these limitations, and the ranges of proton energies and site dimensions in which they are relevant to energy deposition, were extensively studied by Kellerer and Chmelevsky [27][28][29][30]. For large site dimensions (of the order of 1-10 μm) and small proton energies (below 1 MeV), the energy deposition is strongly influenced by the limited range of protons, since the LET may change significantly along the path through the site. On the other hand, as the energy of the proton increases, energy straggling and eventually δ-rays influx and escape start playing an important role, which becomes more relevant as site dimensions get smaller [14]. The interplay of the various factors affecting the energy deposition in cellular or subcellular regions and their contribution to the macroscopic dose-averaged LET is discussed in detail in the following paragraphs.

Basic Microdosimetric Concepts
The basic microdosimetric quantity is the energy imparted to a site by a single energy-deposition event, ϵ s , where an event is defined as an energy deposition due to particles that are statistically correlated. The energy imparted is a random variable, and its value varies from event to event. Therefore, predictions can be only made on the basis of probability distributions.
The single-event distribution f(ϵ s ) of the energy imparted is defined as the distribution of energy deposited in the site by exactly one event. The expectation value of ϵ s following f(ϵ s ) is called frequency-averaged imparted energy per event: For radiobiological considerations, it is also useful to consider the weighted or dose distribution d(ϵ s ) of the energy imparted per event, where the contribution of each event is weighted by the energy it deposits: The expectation value of ϵ s following d(ϵ s ) is called doseaveraged imparted energy per event ϵ sD and it is given by: From Eq. 3 it is possible to relate both averages through: where σ 2 ϵs and V ϵ s are the variance and relative variance of ϵ s , respectively. In general any continuous stochastic quantity x can be characterized trough its frequency and weighted probability density functions f(x) and d(x), the second being obtained by weighting each event by the value of the stochastic quantity.
Frontiers in Physics | www.frontiersin.org October 2021 | Volume 9 | Article 726787 Therefore, the following general relations can be obtained for the frequency-( x F ) and weighted-averaged ( x D ) values of x: where σ x and V x are the variance and the relative variance of x, respectively. Another fundamental microdosimetric quantity is the lineal energy, y, defined as: where l is the mean chord length of the site. The chord length l is itself a stochastic quantity and represents the length of a particle track (considered a straight line) within a site. It depends on the shape and dimensions of the site and, if the track is considered as a straight line, the determination of l is a purely geometrical problem [18,31]. The frequency-averaged lineal energy y F ϵ sF / l F is the microdosimetric analogue of the frequency-or track-averaged LET, L T . If the so called "short track segment condition" holds, i.e., the site dimension is considerably smaller than the proton range so that the LET can be considered constant along the track segment across the site, the frequency-averaged lineal energy and the track-averaged LET coincide: y F L T . It must be noted that, in the specific situations considered in this work, the macroscopic LET values are to be considered as restricted, since the site dimension determines a maximum cut-off to the distance that secondary electrons can travel from the place they are set in motion and therefore, to their kinetic energy. This is especially relevant for clinical applications, where the interest is focused on the energy deposited "locally" by a particle and not on its total energy loss.

The Relative Variance Formula for the Derivation of Macroscopic LET From Microdosimetry
As already mentioned at the beginning of this section, if a medium is exposed to an absorbed dose D, the actual energy imparted to a specified site will depend on various factors [30]. Firstly, the number of independent charged particles traversing the region and their different LET values if a composite beam impinges on the site. Secondly, the chord length associated to each proton crossing the site. Finally, even particles with same LET and chord length can result in a different energy imparted, as both the number of collisions along the chord and the energy lost in each collision might vary. The impact of this last factor, called energy straggling, can be expressed as ϵ s L × l ± δ, where L and l are respectively the LET and chord length of a particle, and δ is the stochastic variability of the energy imparted with respect to the expected energy ϵ s L · l due to straggling. The influence of the different random factors on the single-event distribution, under the short track segment condition, can be considered in terms of the relative variance of the different quantities [17], i.e.: where V LET , V l and V δ are the relative variances of the distributions of LET, chord length and energy straggling, respectively. The computation of the straggling distribution is complicated by the fact that the escape and influx of δ-rays from and into the region of interest presents considerable theoretical difficulties, and the problem needs to be studied with Monte Carlo simulations of δ-ray tracks. The relative variance of the straggling distribution, in particular, is determined by the weighted average of the distribution of energy imparted per individual proton collision: where ε c and σ εc are the average and the standard deviation of the distribution of energy imparted per collision, respectively. The relative variance of the straggling distribution can be demonstrated to be equal to [16]: Consequently, from Eq. 7 the following formula can be obtained: By using the previous relations, and considering that y D ϵ sD / l F , the derivation of the subsequent relation for the dose-averaged LET is straightforward: As emerges from Eq. 11, identified hereafter as Kellerer's formula, macroscopic and microscopic quantities are related to each other. Therefore, the calculation through MC track structure simulations of the microdosimetric quantities y D , δ 2 and the chord length distributions can be used to obtain the doseaveraged LET of a given composite beam.

MC Track Structure Simulations
In order to calculate the various microdosimetric quantities in Eq. 11 through MC track structure simulations, a particle track needs to be simulated and sampled, placing a site in such a way that at least one energy deposit occurs inside it. There are different types of randomness for the interception of a convex body with straight lines; in the following, two types will be distinguished: uniform and weighted [31,32]. The first type results when a body is exposed to a uniform isotropic fluence of infinite straight lines, while the second occurs when an uniformly distributed random point is chosen in a body and is traversed by straight lines with uniform random direction. From a computational point of view, uniform sampling consist in randomly selecting a position for the Frontiers in Physics | www.frontiersin.org October 2021 | Volume 9 | Article 726787 center of the site within the region of interest, while weighted sampling consist in selecting an energy transfer point and then placing a site randomly around it. An immediate drawback of the first method is that it is highly inefficient, since the chances of selecting areas with no energy imparted to them are really high. On the other hand, the second method ensures the occurrence of at least one energy transfer within the site, and it is more suitable for MC track structure calculations. The term "weighted" in this context, refers to the fact that the probability of selecting a point centered at a certain elementary volume dV is not uniform but weighted by a factor that compensates for the non-uniform density of transfer points around the particle track [14,33]. The chord length is a stochastic quantity that depends on the geometry of the site and the straight line associated to each proton in the site. Therefore, its distribution and average values are strictly connected to the sampling method chosen, affecting the shape of Kellerer's formula [31]. However, with the selection of the appropriate weighting factor, uniform and random sampling methods can be considered as equivalent and, in the case of spherical sites of diameter d, Eq. 11 takes the form [18]: Straggling, on its side, is an independent random factor from LET and chord length, as even particles with same LET traversing the site with the same chord length might result in a different energy imparted per event ϵ s . Therefore, it is not affected by the sampling method considered. However, when simulating a beam composed by particles of different LET, attention must be put to not include the variance of the LET distribution into the straggling one for the computation of δ 2 , as explained extensively by Bertolet et al. [26].
In the following paragraphs, the main features of the MC application developed for the computation of the various quantities involved in Kellerer's formula are presented. All simulations were carried out with the Geant4 toolkit (version 10.5), by using our computing cluster hosted at Centro Informático Científico de Andalucía (CICA, Seville, Spain), consisting of 24 computational nodes, each with 2 × 12C AMD Abu Dhabi 6344 (2.6 GHz/6 MB) L3. For every case considered, a number of 250,000 proton tracks was simulated, divided into 250 parallel jobs. The Geant4-DNA package, available within Geant4, was used for this purpose. Compared to Geant4, which provides various physics list options based on a condensed-history approach for the transport of charged particles, Geant4-DNA includes models developed for track structure simulations, which allows the reproduction of each single interaction of neutral and charged particles with liquid water [22][23][24][25]. This is especially key to carry out an accurate modeling of electron transport below 1 keV, which affects to energy deposition tallied at submicrometric targets (below ∼100 nm) as pointed out by Kyriakou et al. [34]. Geant4-DNA provides various physics list options as well, which mostly differ on the electron transport models used below 255 keV or its algorithm speed [25]. In this work, intended to highlight the code capabilities and illustrate the accuracy with which the dose-averaged LET can be calculated from microdosimetry quantities, we used the "option 2" of the Geant4-DNA constructor, which is an accelerated version of Geant4-DNA default constructor. This package only allows calculations for proton energies below 100 MeV. Therefore, simulations were performed with monoenergetic proton beams in the range 10-90 MeV, choosing the lower limit to ensure the validity of the short track segment condition. Simulations carried out with weighted sampling ranged from 90 min (protons at 10-20 MeV) to 87 h (90 MeV) per 1,000 primary protons; as for simulations using uniform sampling of site position, they took from 170 min  to 87 h (90 MeV) including, to improve statistics, a loop of up to 1,000 tries which finished once at least one energy exchange point was found within the site.
For mono-energetic proton beams, the LET distribution approximates a delta function, centered at the LET of the particle for the specific energy considered. In this specific situation, therefore, the relation L D L T y F (being L D and L T averages on restricted LET values, as stated previously) holds and can be used for the validation of the code [26]. Furthermore, FIGURE 1 | Scheme of the geometry used for microdosimetry simulations (2D projection). The primary proton (red track) is generated at the surface (on the left) of a cubic water volume (world volume) and travels along the Z axis. The energy transfer points (hits) are then scored in the middle of the volume (shaded area), in a slab whose thickness (Z S ) varies depending on the dimension of the site. The spherical sites are not physically simulated, as they are virtually selected in the scoring volume with an algorithm that depends on the type of randomness considered. Finally, the half-dimension of the world volume (R max ) varies according to the maximum energy of the incident protons, and is slightly greater than the maximum range of the δ-rays emitted.
Frontiers in Physics | www.frontiersin.org October 2021 | Volume 9 | Article 726787 comparisons were also made with unrestricted macroscopic doseaveraged LET values, computed with the formula proposed by Cortés-Giraldo and Carabe [35]: where n and s are the indexes for an event and a step, respectively, L sn is the mean energy loss per unit path length in the material, ε sn is the energy deposited by the primary particle along the step and ω n its statistical weight. Further details of this calculation can be found in [35]. Figure 1 shows the geometry of our microdosimetry code. The world volume, i.e., the volume that represents the experimental area and contains all the other components, consist of a water cubic box having half dimensions R max slightly greater than the maximum range, R δ, max , of δ-rays emitted by the incident protons. This range is calculated with the formula proposed by Tabata et al. [36], and the condition R max ≥ R δ, max is necessary to ensure intra-track electronic equilibrium. The scoring volume, i.e., the volume in which the energy transfer points (hits) are detected and stored, is a water slab positioned in the center of the world volume with same transversal dimensions and thickness Z S 4d, where d is the diameter of the site. The value of Z S was set to ensure enough energy transfer points for each track within the scoring volume, being 4d a good compromise between its thickness and calculations carried out comparing the two site positioning methods explained below; more details can be found elsewhere [37]. Simulations were run with spherical sites of 1 and 10 μm diameter. In order to have the same proton kinetic energy distribution at the center of the world volume for both site diameters, the decision to set R max R δ, max + Z S, max /2 was taken, where Z S, max is the thickness of the scoring volume corresponding to a site diameter of 10 μm. In our simulations, the beam (red track in Figure 1) originates at the surface of the cubic volume located at z − R max and travels along the z-axis. The physical quantities of interest are scored in randomly placed spherical sites, that are virtually identified as regions of the scoring volume thanks to specific algorithms for the selection of their centers. For the site positioning, two sampling methods have been implemented, uniform and weighted random sampling, which are described in the next paragraph.

Scoring Algorithms
The most intuitive way of scoring energy deposition events in micrometric sites along a proton track would be to randomly select a point P C in the region of interest with a uniform probability, and to consider all the hit positions P h i occurring at a distance | P h i − P C | ≤ d/2 from it. This sampling method, identified in this work as uniform, is very robust and it is not subject to any bias. The main drawback of this technique is that it is likely to be highly inefficient, as the probability of selecting volumes with no hits is really high and increases for smaller sites and higher energies.
To increase the sampling efficiency, weighted sampling can be used instead. In this case, an energy transfer point P H is randomly chosen with uniform probability among the hits of the simulated track. Then, the center of the site P C is sampled with uniform probability in a sphere of radius d/2 around P H , i.e., at a distance | P C − P H | ≤ d/2. Finally, all the energy transfer points P hi , within a distance | P hi − P C | ≤ d/2 are considered for the computation of the microdosimetric quantities in the site. In this way, the presence of at least one hit in the site is always ensured, and a sampling efficiency of 100% can be achieved. The downside of this method is that it is spatially biased towards regions of high density of hits, and a compensation factor must be introduced to "correct" the results and make them equivalent to uniform ones. As suggested by other studies [33,38], this factor should be equal to the ratio of the number of hits that can be selected to the number of hits located in the randomly placed site.
The working principle of the two methods, and the way they were implemented in our code, is shown in Figure 2. In both cases, a hit selection region within the scoring volume, where P C and P H are sorted, can be identified. The dimensions of this region are chosen to ensure that all the randomly selected points are always fully included in the scoring volume, and depend on the scoring algorithm considered. In the case of weighted sampling, the compensation factor is set equal to N sel /N int (see Figure 3), where N sel is the number of hits (with energy different from zero) in the hit selection region that can be randomly selected and N int is the number of hits in the intersection volume of the site and the hit selection region.
Once the position of the site is sorted, three main quantities are computed: the total energy deposited in the site per event, ϵ s , the total energy deposited for each single proton collision, ε c , and the trajectory length, s, of the proton traversing the site. In particular, for the computation of the energy imparted per individual proton collision, a Sensitive Detector class was specifically coded to identify uniquely the electronic showers generated by each collision. To this end, a distinctive tag is assigned to every primary electron set in motion by a proton collision. Then, taking advantage of the specific structure of the tracking algorithm of Geant4, all secondary electrons created by further collisions of the electron originating the shower are assigned the same tag, which becames an identifier of the shower. In this way, the energy deposited in the site by each δ-ray shower originating in the site or entering the site can be scored independently, as graphically represented in Figure 4. Figure 5 shows the results obtained for the frequency-averaged energy imparted per event, ϵ sF , as a function of the energy and the site diameter, for the two sampling methods considered. Taking the uniform sapling method results as a reference, we obtained relative residuals below 5% for all the energies considered. Regarding the weighted-averaged energy deposited per individual proton collision, δ 2 , an intermediate step is

RESULTS
October 2021 | Volume 9 | Article 726787 necessary to extract its value from the simulations. This quantity is an independent random factor from LET and chord length, therefore their variances should not be included in the computation of δ 2 when calculating L D with Eq. 12. This means that only the distribution of energy imparted per individual proton collision corresponding exactly to an energy imparted per event ϵ sF and a chord length l F should be taken into account to calculate δ 2 . In our previous work [26], this was done with a dedicated simulation, positioning the site in such a way that the particle traversed it with a path length equal to l F . In the work presented here, however, a new approach is proposed to extract δ 2 directly from the main MC simulation, by recording the two dimensional plot of the distribution ε 2 c f(ε c ) as a function of the primary proton trajectory length s within the site. By doing so, projections could be taken for intervals of s with size 0.1d, to build a graph of δ 2 as a function of s. Examples of this graphs, for the weighted random sampling, are shown in Figure 6. For values of s < d/2, the main contribution to δ 2 is given by the δ-ray influx, whose influence decreases rapidly as s increases. For s > d/2, on the other hand, the value of δ 2 can be considered as constant. Therefore, since l F 2d/3 > d/2 for a spherical site, the final values of δ 2 were extracted from the average of the points in the range s > d/2, weighted by their uncertainty. In Figure 7 the values of δ 2 for the different sampling methods and site diameters considered are represented together as a function of the incident proton kinetic energy: again, taking the uniform sampling results as a reference, we obtained relative residuals below 6%.
The results obtained for the frequency-averaged lineal energy y F with both sampling methods are presented in Figure 8. In this figure, y F values and L D values obtained with Eq. 12 are compared, to verify the relation y F L D , valid for mono-energetic beams. Constant LET condition was fulfilled for all scenarios, with the worst one being 10 MeV protons on 10-μm in-diameter sites, in which a change below 2% was observed. Taking the y F values as a reference, the relative residuals of L D for the different site diameters and sampling methods considered are represented in Figure 9. While for the 10 μm diameter site we find residuals approximately constant with energy and below 6%, for the 1 μm diameter site deviations increase with energy, reaching values of up to 30%. This effect is more evident looking at the weighted random sampling points, which are not subject to the lack of statistics that affects the uniform random sampling method, especially at higher energies. Moreover, this increasing difference between L D and y F for the 1 μm site, results in an anomalous behaviour of L D if compared to the macroscopic dose-averaged LET obtained from Eq. 13, represented as a black shaded area in Figure 8. Indeed, this last value, being unrestricted, represents the maximum L D achievable with Kellerer's formula and should never be crossed, a statement that is even stronger in the case of the 1 μm site against what happens in the figure.
To solve this problem, we proposed the use of new quantities named effective mean chord length, l * F , and effective frequency-and dose-averaged lineal energy, y * F,D , respectively. As done for the energy imparted per individual proton collision, a two dimensional plot of ϵ s f(ϵ s ) as a function of s was produced from the simulation, and projections were extracted for intervals of s equal to 0.1d. In this way, plots of the mean energy imparted per event ϵ s as a function of the proton trajectory length could be built, as the ones shown in Figure 10 for weighted random sampling. When proton tracks do not cross the site or pass tangent to it, i.e. s ∼ 0, the only contribution to ϵ s is given by δ-ray influx. Then, for higher s values, ϵ s grows linearly with the trajectory length. Intersecting the frequency-averaged energies imparted per event obtained considering the full data set and the linear fits of ϵ s as a function of s shown in Figure 10, the value of s corresponding exactly to an energy imparted per event equal to ϵ sF should be obtained. In principle, this value should correspond to the "geometrical" mean chord length l F 2d/3, however, we found out that this was not the case as the mean chord lengths obtained, which we named effective mean chord lengths l * F , were slightly different and varied with proton energy (see Figure 11).
Using the effective chord lengths for the computation of frequency and dose-averaged effective lineal energies, the results shown in Figure 12 could be obtained, where all the values of L D obtained with Eq. 12 lay below the macroscopic unrestricted dose-averaged LET, as expected. In this figure, a general better agreement between uniform and weighted random sampling methods emerges, if compared to Figure 8. Relative residuals of our derived L D values from y * F were found to be below 6% for both site dimensions and for all energies (see Figure 13) with the only exception of the points obtained for the two higher energies and the smallest site with uniform random sampling, which are strongly affected by the lack of statistics.  Figure 2B. The compensation factor is computed as the ratio between the number of selectable hits (with deposited energy different from zero) in the hit selection region (N sel ) and the number of hits in the intersection volume of the site with the hit selection region (N int ). FIGURE 4 | Schematic representation of the scoring method for the evaluation of the energy deposited per single proton collision in the site, ε c . The primary δ-ray generated by the proton collision is identified and assigned one distinctive tag, that is then inherited by all the secondary electrons in the shower. Only the energy deposition events happening within the site are accounted for in the calculation of ε c , including δ-ray influx from outside the site.

DISCUSSION
In this work, a novel MC code for the computation of microdosimetric distributions generated by proton track segments in liquid water has been presented. Two different algorithms were implemented for the scoring of the quantities of interest: uniform and weighted random sampling for site positioning. While the first is ideally better as it is not subject to any bias, it is also very inefficient due to the high chances of selecting sites with no energy depositions within. On the contrary, weighted random sampling has a 100% efficiency, but might be   y F values are here superimposed to the respective L D values derived from Eq. 12, represented as shaded areas whose thickness is given by the error associated to L D for each diameter site. Finally, the macroscopic unrestricted L D computed with Eq. 13 is also reported as a reference.
Frontiers in Physics | www.frontiersin.org October 2021 | Volume 9 | Article 726787 subject to biases related to the choice of the weighting factor. One of the main elements of novelty of our code is the inclusion of an algorithm for the computation of the distribution of energy imparted per individual proton collision. This distribution, which is impossible to measure experimentally, reflects the action of energy straggling and δ-ray influx and escape on the energy deposited in the site, and it is fundamental for the computation of dose-averaged LET from microdosimetric quantities with Kellerer's formula (Eq. 11).
In this study, our weighted random sampling algorithm was optimized so that the results obtained could be considered equivalent to uniform ones, with relative differences generally lower than 6%. Then, the code was used to run various simulations with mono-energetic proton beams, with energies in the range 10-90 MeV impinging on spherical sites of 1 and 10 μm diameter. Since for mono-energetic proton beams the relation L D y F holds, frequency-averaged lineal energy values obtained with both sampling methods were used as a reference for the validation of the code and the comparison with dose-averaged LET values obtained with Eq. 12. Furthermore, since the site dimensions considered were generally smaller than the maximum range of δ-rays emitted by the incident proton, the L D values obtained in this way were expected to be restricted and always lower than the macroscopic unrestricted dose-averaged LET values, which were computed with Eq. 13.
While generally no evident problems emerged in the results obtained for the 10 μm diameter site, the 1 μm site arose questions about the validity of Kellerer's formula, as the L D curve intersected the unrestricted one. This apparent contradiction was solved with our proposal of introducing the concept of effective mean chord length l * F , defined as the trajectory length corresponding exactly to an energy imparted per event to the site equal to the ϵ sF obtained from the MC. Following this definition, l * F should be naturally equal to the "geometrical" mean chord length l F 2d/3. However, this was proved to be wrong, as the effective value of the mean chord length not only differed slightly from the expected one, but varied also with the energy of the incident proton. This behaviour is probably related to the action of δ-ray influx, which affects the distribution of energy imparted to the site per event, increasing the actual value of ϵ sF with respect to what would be obtained by only considering direct  respectively, a better agreement was observed both between uniform and weighted random sampling results and between L D and y * F curves. Furthermore, the nature of restricted LET emerged clearly in this case, being all the L D values computed with Kellerer's formula below the macroscopic ones, and generally lower for the 1 μm diameter site. Indeed, the difference between restricted and unrestricted dose-averaged LET values decreases as the energy increases, especially expected for the 10 μm site, since the maximum range of δ-rays starts to be comparable to the site dimensions for proton energies below 20 MeV. Although the agreement found with our alternative approach looks promising, we must point out that this work only involved protons from 10 to 90 MeV (currently, Geant4-DNA simulates protons only up to 100 MeV). Thus, a dedicated analysis (beyond the scope of this work) extending the energy range and/or involving different ions and site shapes seems to be of interest to confirm our proposal, as well as cross-checking with other track-structure codes and experimental measurements. For the latter, it must be remarked that the beam energy spread and detector response introduce further variances in the distributions of microdosimetry quantities which must be carefully taken into account in such an analysis.
Concerning the δ 2 -values, we observed deviations from previous works when comparing calculations with sites of 10 μm diameter. While the values obtained in this work were between 2.1 and 2.4 keV for protons from 10 to 90 MeV, lower values were reported previously using other approaches. With dedicated simulations, aiming at isolating its calculation from contributions due to the variance of single-event energy imparted variance as discussed in [26], we reported values of about 1.8 keV; previously, with a more approximated calculation with G4EmPenelopePhysics physics list [35], values of about 1.5 keV were reported for the same energy range. These deviations suggest  y * F as a function of the incident proton kinetic energy. Values obtained for uniform (open markers) and weighted (full markers) random sampling methods are compared for sites of 1 (blue triangles and diamonds) and 10 μm (red circles and stars) diameter. Values of y * F are superimposed to the respective L D ones derived considering y * D instead of y D in Eq. 12, represented as shaded areas whose thickness is given by the error associated to L D for each diameter site. Finally, the macroscopic unrestricted L D computed with Eq. 13 is also reported as a reference. Frontiers in Physics | www.frontiersin.org October 2021 | Volume 9 | Article 726787 that the methodology followed to calculate δ 2 must be explained in detail, as it may impact the values of L D calculated from microdosimetry quantities by means of Eq. 12. Moreover, different versions of the Geant4 toolkit were used across these works, thus the choice of transport models varied between them as well. This issue shows that a dedicated study to quantify the impact of different transport models on the calculated δ 2 -value is of interest, but out of the scope of this article as Geant4 itself offers a wide variety of transport models combinations or physics list.
Using a weighted sampling approach in MC simulations to obtain microdosimetric quantities is undoubtedly advantageous in terms of computation time and efficiency. Indeed, the lower efficiency of uniform random sampling has emerged in our results numerous times, being its effects especially evident for sites of 1 μm diameter and proton kinetic energies above 70 MeV. For the same number of primary protons generated, a loss in statistic going from 60% to more than 90% in the worst case was observed, when comparing uniform sampling with weighted one. Therefore, the observed differences in the computational efficiencies clearly justifies the adoption of a weighted random sampling method for the computation of the microdosimetric quantities of interest, since equivalent results for L D could be obtained by using the approach presented in this work. This method brings the advantage of permitting the calculation of RBE by means of models either based on microdosimetry, such as the MKM [11][12][13] or on LET [39][40][41] by means of calculating microdosimetry quantities only which, contrary to LET, can be measured. Thus, with the microdosimetry data provided by our code, one can derive analytical parameterisations to calculate the RBE by each approach, without having to derive a specific parameterisation for LET distributions [42][43][44]. Further, we expect the impact on the uncertainties found on our LET values retrieved from microdosimetry will be relatively small, and will not add up significantly to other more important sources of uncertainty, such as the (α/β) of the irradiated tissue.
The current version of the code, available upon request, is intended to be incorporated within the official examples of the Geant4 toolkit dealing with the extension Geant4-DNA, as it shows the main aspects of the calculation of microdosimetric quantities with track segments of protons and ions (such as He, Li, C, O or Si) using spherical sites, and illustrates how L D can be retrieved from y D calculated with protons. From a modelling perspective, spherical sites are good enough since sphere-based microdosimetry quantities correlate well with the yield of DSBs, as shown recently [45]. Thus, other shapes such as cylinders were not included in our study to avoid further complications in our analysis of the relations between macro-and microdosimetry distributions. However, they could be useful to benchmark our predictions against experimental results from microdosimetry with other shapes, as done previously in [46].

CONCLUSION
The use of a weighted random sampling approach in MC simulations for the calculation of microdosimetric quantities is widely supported in literature for its undeniable computational efficiency if compared to uniform random sampling. However, especially when considering segments of proton tracks traversing a sensitive volume, particular attention must be put in the computation of the weighting factors, to obtain unbiased results equivalent to uniform ones. In this work, we addressed this problem by developing a MC application for the computation of microdosimetric distributions generated by segments of proton tracks. A remarkable feature of our code, which would constitute a novelty as an official example of Geant4 toolkit, is the possibility of computing the distribution of energy imparted per single proton collision. This distribution, indeed, is strongly connected to the effect of energy straggling and δ-ray influx and escape on the energy deposited in a site per event, and represents the liaison between the microscopic dose-averaged lineal energy and the macroscopic dose-averaged LET, as theorised by Kellerer. To test and benchmark our code, we ran various simulations with mono-energetic proton beams, comparing frequency-averaged lineal energy results with dose-averaged LET ones obtained with Kellerer's formula. As predicted theoretically for proton beams of this kind in the short track segment condition, we obtained a very good agreement between the two quantities, with percentage differences always lower than 6%. In order to obtain this result, we resorted to a new quantity proposed in this work, the effective mean chord length l * F , defined from the value of the primary proton trajectory length corresponding to the calculated frequency-averaged energy imparted to the site per single event. In fact, we observed that the action of δ-ray influx in the site strongly affected the distribution of energy imparted to the site per event, displacing its mean value with respect to what would be obtained by only considering direct traversals of proton tracks through the site. For this reason, further calculations at other proton energies and with other ions are of interest to assess the validity of our proposal beyond the limits of this work (protons from 10 to 90 MeV). Such a further study may suggest the introduction of minor corrections of the fitting parameters of our analytical microdosimetric models, described in [42,43].

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.