A New Automatic Statistical Microcharcoal Analysis Method Based on Image Processing, Demonstrated in the Weiyuan Section, Northwest China

Microcharcoal is a proxy of biomass burning and widely used in paleoenvironment research to reconstruct the fire history, which is influenced by the climate and land cover changes of the past. At present, microcharcoal characteristics (amount, size, shape) are commonly quantified by visual inspection, which is a precise but time-consuming approach. A few computer-assisted methods have been developed, but with an insufficient degree of automation. This paper proposes a new methodology for microcharcoal statistical analysis based on digital image processing by ImageJ software, which improves statistical efficiency by 80–90%, and validation by manual statistical comparison. The method is then applied to reconstruct the fire-related environmental change in the Weiyuan loess section since about 40 thousand years before present (ka BP), northwest China with a semi-arid climate, found that the microcharcoal concentration is low in cold and dry climate and high in warm and humid climate. The two main contributions of this study are: 1) proposal of a new, reliable and high efficient automatic statistical method for microcharcoal analysis; and 2) using the new method in a semi-arid section, revealing the paleofire evolution patterns in the semi-arid region was mainly driven by the biomass rather than the aridity degree found in humid regions.


INTRODUCTION
Fire is an important ecological factor that can indicate changes of climate, vegetation and human activities. In recent years, with the frequent occurrence of extreme weather events, the incidence of biomass burning has increased (Jolly et al., 2015). This has had significant impacts: ecological damage, economic costs, and human casualties (Ashe et al., 2009;Goldammer et al., 2013), for example, the 2019-2020 Australian bushfire season (colloquially known as the Black Summer) (Campbell et al., 2020;Lindenmayer and Taylor, 2020). Therefore, it is of practical significance to understand the processes controlling wildfires. However, climate dynamics and vegetation variations operate at relatively larger scales than fire processes (Macias Fauria et al., 2011), which are considered to be the primary control factors of the fire (Moritz et al., 2005). Hence, understanding past fire dynamics and their relationship to environmental factors is a key aspect of preserving and managing present-day ecosystem functions and fire occurrence (Conedera et al., 2009).
An effective method to obtain paleofire data is from sedimentary records (Miao et al., 2016b;Han et al., 2020), using pyrogenic carbon as a proxy of paleofire, since this is produced by the incomplete combustion of organic matter during biomass burning or fossil fuel consumption (Goldberg, 1985). Microcharcoal is an indicator of pyrogenic carbon, and in palynological studies it is also extracted during the pollen extraction process. Microcharcoal can be characterized by its jet-black, opaque, angular particles in samples; otherwise, the clear or brown, amorphous, weakly -structured particles are considered as vegetal matter (Patterson et al., 1987).
At present, the statistical analysis of microcharcoal is mainly based on visual inspection, which is accurate in identifying the microcharcoal characteristics but time-consuming: the manual measurement of the size of each microcharcoal grain is not conducive to improving work efficiency. In terms of our work experience, it takes 2-8 h to count each sample depending on the impurity content, for instance, this study (Weiyuan loess section) includes 76 samples, it will take about 19-76 working days (8 h per working day) to complete the statistics. Therefore, an automatic approach would clearly be advantageous, by increasing the speed at which samples could be analyzed and allowing the analysis of larger numbers of samples (Rhodes, 1998). Automatic counting of microcharcoal is an area that has long been proposed, following the studies of MacDonald et al., (1991) and Horn et al., (1992). However, research in this field is restricted by the development of computer performance and microscopic imaging technology, and has only progressed slowly. In 2004, ImageJ, an open-source, high-performance and lightweight biological image processing software was proposed (Abràmoff et al., 2004). ImageJ was used in microcharcoal analysis for the first time in 2005 (Stevenson and Haberle, 2005), and the procedure has also been used by Hawthorne and Mitchell (2016), but the method is only suitable for a small sample amount (e.g., in a petri dish) and large microcharcoal particle size (>125 μm), and the automation is still inadequate. Another independent study on the automatic analysis of microcharcoal (Thevenon and Anselmetti, 2007) found a method that can be applied to a large number of samples, however, due to the lack of accurate assessment, the results cannot be used directly and still need manual verification. Therefore, the method can only be regarded as computer-aided analysis, rather than automatic analysis.
In this study, we firstly proposed a new automatic statistical method for microcharcoal analysis, based on ImageJ software (Abràmoff et al., 2004). We then applied the new method to the Weiyuan loess section in the semi-arid area of northwest China, to analyze environmental changes in this area since about 40 ka BP. As part of the second-largest arid to semi-arid area in the world, northwestern China is a unique location for studying fire history, along with vegetation and aridity evolution (Miao et al., 2016a); in addition, arid and semi-arid regions are also more vulnerable to global climate change (Prospero and Lamb, 2003;Cook et al., 2004;Sankaran et al., 2005). Therefore, studying the past fire evolution in this region (Huang et al., 2006;Tan et al., 2015;Han et al., 2020;Miao et al., 2020) is helpful to understand the impacts of future climate changes on the incidence and intensity of fire.

Study Site
The Weiyuan section (Yang S et al., 2015) (104.25°E, 35.13°N) is located in the west of the Loess Plateau, northwest China, at the boundary of the monsoon zone and non-monsoon zone. This region has a semi-arid and typical temperate continental climate, with an annual average temperature of about 6.8°C, and annual average precipitation of around 363 mm ; Figure 1A).

Microcharcoal Extraction and Identification
The sediment samples of the Weiyuan section were extracted following standard palynological methodology (Miao et al., 2017) as follows. 1) One tablet of Lycopodium (each one containing about 27, 600 Lycopodium spores) was initially added to each weighted sample as a reference for the calculation of concentration (Maher, 1981); 2) acid digestion with 10% HCl removed carbonates; 3) acid digestion with 40% HF removed silicates; 4) fine sieving (10 μm mesh) to enrich the microcharcoal particles (as well as pollen grains); 5) microscope slides of each sample were prepared for identification.
The microcharcoal concentration can be calculated according to the following formula: Charcoal concentration x C x L x × 27600 W x Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 2 where: x is sample number; C is the identified number of microcharcoals; L is the identified number of Lycopodium spores; and W is the sample dry weight.

Image Processing and Statistics
To apply the image processing method, scanned RGB digital images were acquired from the microslides by the Zeiss slide scanner Axio Scan. Z1, a 10× magnification objective lens was used with a CCD camera with an imaging accuracy of 0.44 μm/pixel. For a standard slide of 26 mm × 76 mm in size, a 1.5 GB image file in Zeiss CZI format was generated and converted in Zeiss Zen software to the more common TIFF format for analysis in ImageJ. Note that converting to TIFF will increase the size of the file to about 5GB, which is very demanding for computer performance, therefore, the minimum performance recommendations of computer processor are 3 GHz of clock rate, eight cores, and 32 GB of RAM, moreover, cropping the image into subsets for processing is also a remedy. an example image, is shown in Figure 2A. The RGB color model is an additive color model applied to display images in electronic systems, based on human perception of colors. All other colors are then defined as mixtures of the three additive primary colors, red, green, and blue, in different proportions (Hirsch, 2004). In the commonly adopted 8-bit storage format, its mathematical representation is: Here, the range 0-255 represents the resolution of typical 8-bit binary storage (2 8 256), for instance, (0, 0, 0) represents pure black and (255, 255, 255) represents pure white. Considering that microcharcoal is mainly characterized by being black in color, when image processing we convert RGB color to grayscale images in ImageJ (Ferreira and Rasband, 2012), using the formula: The converted grayscale image is shown in Figure 2B. The automated method needs to distinguish both microcharcoal and Lycopodium spores; however, testing showed the microcharcoal particles were successfully separated from microcharcoal-like particles, while no efficient method was found to distinguish the Lycopodium spores from the Lycopodium-like particles. Therefore, the statistics for the Lycopodium spores were determined manually. Fortunately, counting the Lycopodium spores is a relatively easy task, because according to Wang et al., (2020), a minimum count of 300 Lycopodium spores will enable stable microcharcoal results.
The key to automatic analysis of microcharcoal samples is to convert the human eye identification criteria described in natural language into computer image processing criteria described in mathematical language. First of all, the most intuitive criterion is that the grayscale values of microcharcoal particles are low, and we suggest this as the basic standard. Furthermore, as shown in Figure 2B, the grayscale values of microcharcoal-like particles are also low, and these are hard to distinguish based only on the grayscale value. Further observations showed that the range of grayscale values of microcharcoal particles is smaller than that of the microcharcoal-like particles: this is because of the grayscale color of microcharcoal particles is relatively consistent, while that of microcharcoal-like particles may have some variability. Therefore, we propose the median grayscale value or the standard deviation of a single particle as a further distinguishing criterion.

Image Interpretation by ImageJ
Microcharcoal image automatic recognition consists of two steps: training a classification model and testing the classification model.  .
Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 To train the classification model, we compiled a training set from sample Nos. 10, 20, 30, 40, 50, 60, and 70 to avoid individual samples introducing errors in the imaging system. We manually selected microcharcoal and microcharcoal-like particle samples from the seven test sets and calculated their statistical characteristics. Figure 3A shows histograms of the grayscale  . Lycopodium-like means that its spectral signature is similar to Lycopodium when classified by computer, although they can be easily distinguished manually.
Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 4 values of microcharcoal and microcharcoal-like pixels, and Figure 3B shows the frequency distributions; the grayscale value distributions of microcharcoal and microcharcoal-like particles are not clearly distinguishable, making it is difficult to simply set a threshold on a grayscale image and conclude that any pixel value less than that is definitely microcharcoal. However, the amount of microcharcoal is far greater than the amount of microcharcoal-like material, which supports the opinion that most of the black particles identified during pollen analysis could be regarded as microcharcoal (Sun et al., 2000). Therefore, further distinguishing strategies are needed. We calculated the median and standard deviation of the grayscale value of each selected particle (see Figures 3C,D); both achieve a better differentiation than the grayscale value alone, with the standard deviation performing better than the median. Therefore, we decided to use the standard deviation as a further distinction criterion.
Based on the above discussion, we propose the following distinguishing procedure based on ImageJ software. 1) Grayscale threshold: the image is converted from RGB to grayscale, and pixels with grayscale values of 100 or less are retained for the next step. According to the training set, the grayscale values of 98.28% of microcharcoal pixels and 84.41% of microcharcoal-like pixels meet this criterion. 2) Noise reduction: the selected results are processed in ImageJ to remove small patches and to fill holes. 3) Vector graph generation: raster images of individual pixels are converted to vector images of individual particles, using the ImageJ particle analysis tool. 4) Parameter analysis: required spectral and shape parameters can be calculated by ImageJ, such as median grayscale value, standard deviation of grayscale value, length, width, area, perimeter, etc. 5) Standard deviation threshold: objects with a grayscale standard deviation greater than 26 were excluded, because they were more likely to be not microcharcoal. 6) Removal of particles smaller than 10 microns in length, following standard palynological practice (Miao et al., 2017), because a 10 μm mesh is adopted for fine sieving of samples to remove the impurities. In addition, particles smaller than 10 microns are more difficult to identify, regardless of whether using computers or human eyes.

Evaluation by Comparison With Manual Analysis
Next, we manually tested the classification accuracy by selecting test samples. To avoid choosing samples from the training data, we selected test sets from sample Nos. 5,15,25,35,45,55,65, and 75 in a random position, each test sample size is approximately 1/ 9 to the result sample size. The results of ImageJ automatic analysis based on the above procedures were compared with the results of manual statistical analysis by three people: Yaguo Zou, Zisha Wang and Anxia Du (see Figure 4). The microcharcoal identification experience of the three people is 2 years, 3 years and 1 year respectively. The identification was done in ImageJ based on the cropped test sample from the scanned image under 400× Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 magnification (the original image is scanned by 10× objective lens and CCD camera into an image with a resolution of 0.44 um/ pixel, and can be further enlarged in ImageJ, similar to the function of the eyepiece), and the black, opaque, angular particles (Patterson et al., 1987) were identified as microcharcoal. Figure 4A shows the total number of counted particles, and reveals that the automatic statistical results are close to those of Yaguo Zou and Zisha Wang, but higher than those of Anxia Du; however, the trends of all four sets of results are consistent. It is interesting that anomalous values were produced by human analysis, not by the machine analysis, for the reasons explained below. Figure 4B shows the total area of counted particles, and unlike the number of particles, this is consistent across all four cases (3 humans, 1 machine). One obvious possibility to explain the statistical difference appeared in particle count but not in particle area, is that the smaller particles were overlooked: as a result, the number of particles recorded by Anxia Du was smaller but the total area was still similar. This is confirmed in Figure 4C, where the rank-size distributions of counted particles show that the four statistical sequences are broadly consistent in the large size particles range, but as the grain size gets smaller, the differences increase. This is easy to understand because our statistical rule requires that microcharcoal particles above 10 microns should be counted, but the visual implementation of this threshold requires experience unless every particle is measured, which is time-consuming. Overall, the grayscale value distributions of counted pixels ( Figure 4D) and the standard deviations of counted particles ( Figure 4E) show that the four sets of results for manual and machine analysis are generally consistent with each other.
In conclusion, the results of the machine and manual statistical analysis achieve high levels of consistency in their distributions of pixel grayscale values ( Figure 4D), particle grayscale standard deviations ( Figure 4E), and particle rank-size distribution greater than 20 microns ( Figure 4C). The main divergence occurs at around the 5-20 microns grain size ( Figure 4C), because the visual measurement is needed in the manual statistical analysis to determine whether or not a microcharcoal particle is larger than 10 microns. Therefore, the results are consistent in the total area counted ( Figure 4B), but not in the number of particles counted ( Figure 4A), however, the trends are still consistent.

Statistical Results
The results of analyzed microcharcoal concentrations in the Weiyuan section are shown in Figure 5. Note that we provide the traditional particle concentration ( Figure 5A), but for the convenience of comparing image processing with the manual statistics, the area concentration ( Figure 5B) is also given. We believe that the area concentration is more accurate than the particle concentration in reflecting the amount of microcharcoal; however, the high correlation coefficient (0.97) between the two curves shows that the particle concentration and area concentration are closely matched. In addition, the results analyzed manually in Figure 4 are dotted on the curves in  ; (D): magnetic susceptibility (MS) in the Weiyuan section ; (E): median grain size (Md) in the Weiyuan section ; (F): 10 Be based rainfall in the Baoji section (Beck et al., 2018); (G): phytoliths based mean annual precipitation (MAP) in the Weinan section (Lu et al., 2007);  Figure 4 as a reference for the accuracy of the automatic method; however, we note that the test sample size in Figure 4 is only 1/9 to the result sample size in (A) and (B).
Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 6 Figures 5A,B as a reference for the accuracy of the automatic method; however, we note that the test sample size in Figure 4 is only 1/9 to the result sample size in Figure 5; Nevertheless, we still find a good correlation between the manual and automatic analysis methods.
As the trend shown in the microcharcoal results, from the L1-2 layer (MIS 3) to the L1-1 layer (MIS2), along with the climate changed from cool and wet to dry and cold in the Last Glacial Period, it shows a low values state and a slowly decreasing trend in microcharcoal content. The S0 layer (MIS 1, Holocene), deposited during a warm and wet climate, shows three clear stages in its microcharcoal content: first, a rapid increase during the early Holocene, then a fluctuating state from ∼4 to 2 ka BP, and finally an increase since 2 ka BP. This pattern indicates that fire events in the region were weaker during the Last Glacial Period and stronger during the Holocene, which is inconsistent with the commonly recognized trends that fires occur more frequently in glacial periods and less frequently in interglacial periods.

Comparison of Microcharcoal Statistical Methods
As an important proxy reflecting the occurrence of fire events, microcharcoal has a unique research value; however, manual statistical analysis of microcharcoal contents is time-consuming, which hinders the application and development of this method. Based on previous research on the automatic analysis of microcharcoal, this paper proposed and verifies a new automatic statistical method, our method reduce the time to count a sample from 2-8 h to 20-30 min, and there was no reduction in accuracy compared with manual statistics, which is more convenient and greatly improves the statistical efficiency of microcharcoal.
Scientific statistical work needs comprehensive evaluation in terms of both accuracy and efficiency. The automatic statistical analysis of microcharcoal is a long-standing but rarely discussed problem, first addressed by MacDonald et al., (1991) in their study using a transmitted light microscope and an image analysis system designed for optical densitometry; in their method, microcharcoal was identified according to its optical density (opacity). A more complete automatic statistical system was proposed by Horn et al., (1992), based on statistical characteristics such as grayscale and its standard deviation. However, due to the limited computer performance and at that time, these early explorations were not developed further, and their methods were not widely adopted.
With the continuous development of computer technology, and successive improvements in image processing software, automatic pattern recognition has become widely used in many fields: for example, ImageJ in biology, metallographic analyzer in materials science, ENVI and Erdas in remotesensing. These image processing packages are also often used by other disciplines, including in research on computer-aided microcharcoal analysis (Stevenson and Haberle, 2005;Thevenon and Anselmetti, 2007;Lu et al., 2009;Tan et al., 2014;Hawthorne and Mitchell, 2016). However, many studies simply apply the software directly, without discussing the software's statistical basis and without a thorough evaluation of its accuracy.
This study further analyzed the problem of microcharcoal statistics from the perspective of methodology, yielding preliminary results as described above. However, there are still many deficiencies in this study: for example, we are not sure whether the threshold chosen in this paper is applicable to other sedimentary sections, because of their different sedimentary environments and soil textures. In addition, the problem with the automatic statistical analysis of Lycopodium spores has not been satisfactorily resolved, although this may be addressed by using the WEKA (Hall et al., 2009) expansion package in ImageJ, which is an image classification algorithm based on training samples and machine learning; however, this algorithm requires high-performance computer, because the microscope images are too large. Alternatively, more recognisable markers may be used for instead of the Lycopodium spores, such as plastic markers (Ogden and Gordon, 1986). In this sense, automated statistical algorithms are evaluated not only for their accuracy, but also for their achievement of the same accuracy but with the consumption of fewer computational resources. Therefore, we suggest strengthening research efforts in this field in the future, not only the address the automatic statistical analysis of microcharcoal and Lycopodium spores, but also to extend the method to the analysis of all spores and pollen species encountered in palynological studies.

Spatial Heterogeneity of Fire Event Controlling Factors
In combustion theory, the fire triangle (Countryman, 1972) is a simple model for understanding the three necessary ingredients for ignition and combustion: heat, fuel, and an oxidizing agent (usually oxygen). This is a microscopic model from a chemical perspective, then Moritz et al., (2005) extended it to different scales of time and space, including the large scale which vegetation, climate, and ignition sources constitute the three vertices of the triangle. Among them, vegetation conditions and extreme weather  are considered to be two basic factors to control the production of microcharcoal (Herring, 1985;Clark, 1988;Whitlock and Larsen, 2002;Huang et al., 2006;Adolf et al., 2018). More specifically, the solar insolation drives the terrestrial temperature changes (Bond et al., 1997;Chen et al., 1997;Petit et al., 1999;Sun et al., 2012), and the moisture (precipitation) is coevolved with the temperature (Miao et al., 2012), moreover, the burning of the biomass is more easily in dry conditions (Huang et al., 2006;Pechony and Shindell, 2010;Miao et al., 2019;Han et al., 2020).
Based on the above discussion, we selected the environmental indexes related to fire activities for comparative analysis ( Figure 5): 1) the pollen concentration ; Figure 5C) as a proxy for plant species richness; 2) the magnetic susceptibility  Figure 5D) as a proxy for East Asian Summer Monsoon strength; 3) median grain size  Figure 5E) as a proxy for East Asian Winter Monsoon; 4) the reconstructions of rainfall based on 10 Be in Frontiers in Earth Science | www.frontiersin.org February 2021 | Volume 9 | Article 609916 the Baoji section (Beck et al., 2018; Figure 5F); 5) mean annual precipitation based on phytoliths in the Weinan section (Lu et al., 2007) ( Figure 5G); 6) total organic carbon content in Luochuan ( Figure 5H), Xifeng ( Figure 5I), Weinan ( Figure 5J), and Lantian ( Figure 5K) respectively ; 7) summer insolation in 35°N (Laskar et al., 2004; Figure 5L). Based on the information in Figure 5, we interpret the paleofire characteristics in this section, since about 40 ka BP, as follows.
Fire activity during the Last Glacial Period was significantly reduced relative to that in the Holocene, which is contrary to the usual view of wildfire being more prevalent in cold and dry climates, leading to more microcharcoal in associated sediments (Kaars et al., 2000;Luo et al., 2001;Wang et al., 2005;Wang et al., 2012;Zhao et al., 2019;Xiao et al., 2020;Zhang et al., 2020). We speculate that the reason for this difference is that the above records are all from humid regions, while this study was located in a semi-arid region. Here, although the climate of the Last Glacial Period was dry and conducive to the occurrence of fire, the vegetation coverage in Weiyuan was low, as supported by palynological records ; Figure 5C), total organic carbon content   (Figures 5H-K) and model simulation (Liu et al., 2002;Ni et al., 2010). This would have limited the development of wildfire. Paleofire studies in other arid areas, for example, Sierra Leone in sub-Saharan Africa, have found similar patterns (Bird and Cali, 1998). However, long time-scale paleofire records in arid and semiarid areas remain sparse, and often show strong spatial heterogeneity (Wang et al., 2005;Wang et al., 2012), so this hypothesis needs to be verified by additional microcharcoal studies in arid areas in the future.
The first significant increase of fire was from 12 ka BP to 5 ka BP, this period can be further divided into two stages. At the earlier stage, summer insolation was at a high level (Laskar et al., 2004; Figure 5L), which is considered as a contributing factor to the occurrence of wildfires (Whitlock et al., 2010), and it was confirmed by sedimentary records (Millspaugh et al., 2000;Brunelle and Whitlock, 2003;Whitlock et al., 2008;Gil-Romera et al., 2014;Inoue et al., 2018) and simulation results (Hély et al., 2010). At the later stage, summer insolation has decreased, but the rainfall began to increase (Beck et al., 2018;Lu et al., 2007;Figures 5F,G), which promoted plant growth Lu et al., 2019;Figures 5C,H-K) and resulted in an abundant fuel supply.
The second significant increase of fire, since 2 ka BP, is more likely to be associated with human activity. Weiyuan section is located near the center of early ancient China, and thus lies in a region with strong ancient human activities (Dong et al., 2013;An et al., 2017). Wang et al., (2003) suggest that human activity is the strongest additional factor superimposed on the natural background trend. With the development of human productivity, anthropogenic influence on natural processes has gradually strengthened: in areas where early human activity was intense, the increase in microcharcoal over the last few thousand years is often explained as a result of human use of fire, and this is more evident in the Holocene microcharcoal record (Xue et al., 2018).

CONCLUSION
(1) In this study, an automatic statistical method for microcharcoal was proposed, based on ImageJ software. The microcharcoal was identified using the pixel grayscale value threshold and the particle grayscale standard deviation threshold. Results from the statistical method were compared with manual analysis to demonstrate the method's statistical effectiveness. This method is efficient and accurate, and provide a powerful tool for microcharcoal analysis.
(2) We applied the new automatic statistical method to reconstruct paleofire activity since ∼40 ka BP in the Weiyuan loess section, which is located in semi-arid northwest China. We compared this record with other related records, and revealed the spatial heterogeneity of factors controlling paleofire. We speculated that differences in vegetation coverage caused fire events in arid and humid areas to show opposing patterns. This is related to the contrasting abundance of fuel in arid and humid areas, but because of the lack of fire records in arid and semi-arid areas, more research is needed on the paleofire evolution in this area to verify this hypothesis.

DATA AVAILABILITY STATEMENT
The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

AUTHOR CONTRIBUTIONS
YZ designed the experiment and wrote the manuscript, YM, YZ, and SY guided the work and modify the paper, SY collected samples and provided guidance, ZW and GT participated in the experiment.

ACKNOWLEDGMENTS
We are grateful to Yindi Duan, Xicai Zheng, Hongjie Yan helped in lab, Anxia Du contributed in manual statistical check. We thank the anonymous reviewers for their constructive suggestions.