This article was submitted to Energy Storage, a section of the journal Frontiers in Energy Research

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

We determine the energy storage needed to achieve self-sufficiency to a given reliability as a function of excess capacity in a combined solar-energy generation and storage system. Based on 40 years of solar-energy data for the St. Louis region, we formulate a statistical model that we use to generate synthetic insolation data over millions of years. We use these data to monitor the energy depletion in the storage system near the winter solstice. From this information, we develop explicit formulas for the required storage and the nature of cost-optimized system configurations as a function of reliability and the excess generation capacity. Minimizing the cost of the combined generation and storage system gives the optimal mix of these two constituents. For an annual failure rate of less than 3%, it is sufficient to have a solar generation capacity that slightly exceeds the daily electrical load at the winter solstice, together with a few days of storage.

Moving away from fossil fuels to renewable energy is a crucial step to minimize the extent of global warming. Because renewable energy sources, such as wind and solar, are intermittent, achieving a 100% renewable scenario requires either a large excess generation capacity, a substantial amount of storage, or a judicious mixture of the two. Understanding the nature of this tradeoff between excess capacity and storage is crucial for the design and optimization of effective renewable energy systems. Understanding the factors that determine the tradeoff will improve our grasp of the right balance between the uncertain costs of generation and storage in the future.

This tradeoff is characterized by two fundamental parameters: The generation factor

Given the range of these predictions about optimal configurations, a need exists for an analytical theory that would: 1) clarify the relation between input physical parameters and the performance of a combined generation/storage system, and 2) help constrain the parameters of this system to guide the realm of feasibility. In this work, we construct such a theory that is based on an idealized, but general model that faithfully incorporates the actual solar irradiation statistics, including seasonality and day-to-day correlations. This theory allows us to specify the nature of an optimal generation/storage system and make explicit predictions about its cost and reliability. Although optimal systems will in general include both wind and solar energy, we treat only solar energy in order to obtain a theoretically tractable model. We believe that the general features of our results will hold for mixed systems as well.

Our model extends previous analytical theories that were based on simplified solar irradiation statistics.

We begin by outlining basic features of the solar-flux data for the St. Louis region, which typifies those of the entire United States. We then introduce our data-driven model and use it to develop analytic formulas for the failure rate and storage capacity needed to achieve a given reliability. We use these to calculate the generation and storage capacities of a combined system that minimizes the cost and yet is extremely reliable. We verify our predictions with simulations of millions of years of synthetic data.

To illustrate the issues and as a preliminary to developing our model, we first present and analyze data for the solar flux on a 270 km × 270 km region centered on St. Louis over the 40-year period 1980–2019. This region is large enough that its energy needs can be met by covering a small fraction of the total land area with solar panels, but small enough that power transmission across the region is nearly lossless and instantaneous. The solar data, from the MERRA-2 dataset ^{2} from the winter minimum to the summer maximum, with daily extrema of 1.53 MJ/m^{2} and 32.1 MJ/m^{2} over the 40 years of data (

Average daily energy ⟨

Insolation fluctuations. The black curve is the smoothed 45-day average daily energy per unit area on the St. Louis region, 1980–2019, and the dashed curves mark the one-standard-deviation range. Also shown are the daily data for 1980 (blue) and a typical realization of our synthetic data (red). The upper dashed blue line corresponds to the load with

The average daily solar energy is roughly sinusoidal, with the maximum at day 189 (July 7, roughly 2 weeks after the summer solstice) and the minimum at day 357 (December 23, a few days after the winter solstice). The standard deviation in the daily solar energy also has a systematic time dependence that ranges between 1.5 and 5.5 MJ/m^{2}, with maximal fluctuations occurring in the early spring. Near the winter solstice, the magnitude of the fluctuations is about 35% of the mean value. On the minimum-insolation day, ^{2} and ^{2}. These numbers, which will play a central role in our ensuing analysis, are based on averaging the daily energy data over a 45-day window.
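The 45-day averaging mentioned above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the use of a centered window, and the wrap-around at the end of the year are all assumptions made here for concreteness.

```python
import math
from statistics import mean, pstdev

def day_of_year_stats(years):
    """Per-day mean and standard deviation across years.

    `years` is a list of equal-length lists of daily energy (MJ/m^2),
    one inner list per year (leap days assumed already dropped).
    """
    n_days = len(years[0])
    means = [mean(y[d] for y in years) for d in range(n_days)]
    stds = [pstdev([y[d] for y in years]) for d in range(n_days)]
    return means, stds

def circular_smooth(values, window=45):
    """Centered moving average that wraps around the year boundary.

    The centered, periodic window is an assumption of this sketch.
    """
    n = len(values)
    half = window // 2
    width = 2 * half + 1
    return [
        sum(values[(d + k) % n] for k in range(-half, half + 1)) / width
        for d in range(n)
    ]

# Hypothetical stand-in for a noisy day-of-year average.
raw = [15.0 + 8.0 * math.cos(2 * math.pi * (d - 189) / 365) for d in range(365)]
smoothed = circular_smooth(raw)
```

A periodic window has the convenient property that it preserves the annual total, so smoothing changes the shape of the curve but not the yearly energy budget.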

To illustrate the influence of fluctuations on the daily insolation data,

Our determination of the optimal system configuration is based on two key costs: the cost C_{g} of the generation capacity to supply the daily electrical energy load, and the cost C_{s} of energy storage to cover 1 day of electrical load. This daily load of the St. Louis region^{1} is approximately 4.0 × 10^{14} J, or 1.1 × 10^{8} kWh. This corresponds to an average power usage of 4.6 × 10^{6} kW.

It is conventional to express the cost of solar panels in dollars per watt. Using the current solar panel cost of $1.50/W _{
g
} ≈ $75 billion. This cost grows roughly linearly with the area of the solar farm^{2}
^{14} J/(0.20 × 7.95 × 10^{6} J/m^{2}) ≈ 2.5 × 10^{8} m^{2} ≡ _{0}. This roughly corresponds to a 16 km × 16 km square. The cost of a solar farm of area _{0}, where _{
g
}. The excess generation capacity, (^{2}

The cost to store 1 day of electrical energy load for the St. Louis region at the current price of $200/kWh is _{
s
} = 1.1 × 10^{8} kWh×$200/kWh ≈ $22 billion _{
s
}
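The arithmetic behind the figures quoted in this section can be checked with a short script. This is only a sketch using the numbers stated in the text (1.1 × 10^{8} kWh daily load, 20% panel efficiency, 7.95 × 10^{6} J/m^{2} on the minimum-insolation day, $1.50/W, $200/kWh). The 1000 W/m^{2} rating irradiance used to convert panel area into rated watts is an assumption introduced here; the text does not state how the $75 billion figure was obtained.

```python
import math

KWH_TO_J = 3.6e6                      # joules per kilowatt-hour

daily_load_kwh = 1.1e8                # daily load of the region, from the text
daily_load_j = daily_load_kwh * KWH_TO_J
avg_power_kw = daily_load_kwh / 24.0  # average power over a 24-hour day

efficiency = 0.20                     # panel efficiency, from the text
min_insolation = 7.95e6               # J/m^2 on the minimum-insolation day

# Area that just supplies the daily load on an average solstice day.
area_m2 = daily_load_j / (efficiency * min_insolation)
side_km = math.sqrt(area_m2) / 1000.0

# ASSUMPTION: panels are rated at 1000 W/m^2 incident irradiance.
RATING_IRRADIANCE = 1000.0            # W/m^2, not taken from the text
rated_watts = area_m2 * efficiency * RATING_IRRADIANCE
generation_cost = 1.50 * rated_watts  # dollars, at $1.50/W

storage_cost = 200.0 * daily_load_kwh # dollars, at $200/kWh for 1 day of load

print(f"daily load      : {daily_load_j:.2e} J")
print(f"average power   : {avg_power_kw:.2e} kW")
print(f"farm area       : {area_m2:.2e} m^2 ({side_km:.1f} km square)")
print(f"generation cost : ${generation_cost / 1e9:.0f} billion")
print(f"storage cost    : ${storage_cost / 1e9:.0f} billion")
print(f"cost ratio      : {storage_cost / generation_cost:.2f}")
```

Under that rating assumption the script gives values close to the rounded figures in the text: about $75 billion for generation, $22 billion for one day of storage, and a storage-to-generation cost ratio near 0.3.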

Since roughly 60% of a 24-hour period is dark at the winter solstice in the St. Louis region and total electrical energy use is roughly time independent in the winter (_{0} could fully supply the regional electrical energy needs during a 24-hour period at the solstice, and thus throughout the year.

The existence of insolation fluctuations has several essential consequences. First, the optimal area of the solar farm must be larger than A_{0} and the storage capacity must be larger than the 60% of daily energy use that is needed to deal with the regular diurnal fluctuations. Second, we will see that it is impractical to achieve 100% reliability with this combined solar generation and electrical storage system. Thus it is necessary to balance the tradeoff between reliability and cost. Establishing how generation capacity and storage combine to achieve a given reliability, and understanding the tradeoff between reliability and cost, are primary goals of this work. We will find that the cost-optimal system configuration is determined by the ratio of storage to generation costs, C_{s}/C_{g}. The above numbers give roughly 0.3 for this ratio. Since storage costs are rapidly decreasing

Because of the substantial day-to-day fluctuations in the solar flux, the 40 years of available data are too sparse to determine the reliability of a combined solar farm/storage system with statistical significance. To formulate a generally applicable theory, we first construct synthetic daily insolation data that faithfully incorporate the annual trends, the daily fluctuations, and the day-to-day correlations that are present in the solar flux data for the St. Louis region. The simple and direct algorithm that we use to construct these data allows us to readily generate time series for millions of years. From these, we obtain statistically meaningful results about the reliability and cost of a combined solar power generation and storage system.

To construct the synthetic data, we require two additional features beyond the average daily incident energy and its standard deviation: 1) the distribution of energy for each day of the year, and 2) the day-to-day energy correlations. The energy distributions away from the winter solstice are irrelevant when

_{
j
} ≡ _{
j
}/⟨_{
j
}⟩ are all greater than or all less than 1 over ^{
n
}, with

There are also day-to-day correlations in the energy flux that reflect the well-known feature that the weather on consecutive days is more likely to be similar than different _{
j
} ≡ _{
j
}/⟨_{
j
}⟩}, where _{
j
} is the energy per unit area on the _{
j
}⟩ is its average, with ^{3}
_{
j
} are either all greater than 1 or all less than 1. We then obtain the probability distribution _{
j
} are greater than 1 or less than 1.

In the absence of correlations in the daily solar flux, the string length distribution would decay in ^{
n
}. However, the actual correlations decay as ^{
n
}, with ^{
n
} for all
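The string-length analysis described above can be illustrated with a short sketch. It splits a series of normalized daily energies into maximal runs that lie entirely above or entirely below 1 and tabulates the run lengths. For independent fair-coin signs, the probability of a run of length n is 2^{-n}. For a two-state persistent process with repeat probability p it is (1 − p)p^{n−1}, so the mean run length is 1/(1 − p). The function names and the mean-run-length estimator are assumptions of this sketch, not necessarily the fitting procedure used in the paper.

```python
from collections import Counter

def run_lengths(normalized):
    """Lengths of maximal runs of consecutive values all >1 or all <=1."""
    runs = []
    current_side = None   # True if the current run is above 1
    length = 0
    for value in normalized:
        side = value > 1.0
        if side == current_side:
            length += 1
        else:
            if length:
                runs.append(length)
            current_side = side
            length = 1
    if length:
        runs.append(length)
    return runs

def run_length_distribution(runs):
    """Empirical probability of each run length."""
    counts = Counter(runs)
    total = len(runs)
    return {n: counts[n] / total for n in sorted(counts)}

def persistence_from_runs(runs):
    """Estimate p from the mean run length, assuming geometric runs."""
    mean_length = sum(runs) / len(runs)
    return 1.0 - 1.0 / mean_length

example = [1.2, 1.1, 0.8, 0.9, 0.7, 1.3]
print(run_lengths(example))
```

Comparing the empirical distribution against 2^{-n} is a quick way to see whether day-to-day correlations are present: an excess of long runs corresponds to an estimated p above 1/2.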

It is now convenient to shift the time origin so that the year begins on July 1. The solar energy _{1} on July 1 (now day 1) is given by_{1} the standard deviation on July 1 (see

To determine the solar energy on successive days _{
j
} for _{1} = 1 − 2Θ[rand(0, 1) − 0.5], where Θ is the Heaviside function. Thus _{1} equals +1 or −1, each with probability 1/2. For _{
j
}, which also takes the values ±1 only, has the same sign as _{
j−1} with probability _{
j
} is given by_{
j
} the standard deviation on the _{
j
} − ⟨_{
j
}⟩, having the same sign as _{
j−1} − ⟨_{
j−1}⟩ with probability

This persistent random-walk construction ^{
n
}, as in
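A persistent random-walk construction of this kind might be sketched as follows. The first sign follows the rule quoted above, 1 − 2Θ[rand(0, 1) − 0.5], and each later sign repeats the previous one with a fixed probability. Everything else is an assumption for illustration: the name `p_same` and its value, the sinusoidal mean with a placeholder summer maximum, the half-normal fluctuation magnitude, the constant 35% relative standard deviation, and the clipping at zero are not taken from the paper, whose actual distributions are not reproduced here.

```python
import math
import random

DAYS = 365

def mean_energy(day, e_min=7.95, e_max=25.0, day_min=357):
    """Hypothetical sinusoidal mean insolation in MJ/m^2.

    e_min matches the solstice value quoted in the text; e_max is a
    placeholder chosen only for illustration.
    """
    mid = 0.5 * (e_max + e_min)
    amp = 0.5 * (e_max - e_min)
    return mid - amp * math.cos(2.0 * math.pi * (day - day_min) / DAYS)

def persistent_signs(n_days, p_same, rng):
    """Signs +1/-1: fair first sign, then repeat with probability p_same."""
    signs = [1 - 2 * int(rng.random() > 0.5)]
    for _ in range(n_days - 1):
        prev = signs[-1]
        signs.append(prev if rng.random() < p_same else -prev)
    return signs

def synthetic_year(p_same=0.6, rel_sigma=0.35, seed=0):
    """One year of synthetic daily energies (MJ/m^2), calendar-day indexed."""
    rng = random.Random(seed)
    signs = persistent_signs(DAYS, p_same, rng)
    energies = []
    for day, sign in enumerate(signs, start=1):
        mu = mean_energy(day)
        sigma = rel_sigma * mu                      # ASSUMED constant ratio
        deviation = sign * abs(rng.gauss(0.0, sigma))  # ASSUMED half-normal
        energies.append(max(0.0, mu + deviation))   # energy cannot be negative
    return energies

year = synthetic_year(seed=42)
print(min(year), max(year))
```

Separating the sign sequence from the magnitude keeps the two statistical ingredients independent: the sign process controls the day-to-day correlations, while the magnitude distribution controls the size of the fluctuations.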

This approach serves our purposes better than the “Moving-Average” models, “Auto-Regressive” models, or combinations thereof that have often been used to model insolation data

With this computational approach, we generate millions of years of synthetic insolation data over a two-dimensional mesh of thousands of (_{
j
} on the _{
j
} can never exceed

Schematic (not to scale) dependence of the daily energy minus the load near the winter solstice (blue curve), with three periods of below-average insolation (a,c,e) and two above-average periods (b,d). The extents of the energy deficits and surpluses are shown by the blue and red shaded areas. The green curve indicates the instantaneous storage

The time evolution in the model defines a biased random-walk-like process on the interval [0, _{
j
} reaches zero. The failure probability
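The storage dynamics described above, a random walk confined to an interval whose lower boundary marks a failure, can be sketched at daily resolution as follows. Energies are normalized so that 1.0 is the mean energy on the minimum-insolation day, storage is measured in days of load, and `g` multiplies the generation. The update rule, the choice to start each year with full storage, and the handling of storage after a failure are assumptions of this sketch. Diurnal storage needs are ignored.

```python
def count_failure_days(normalized_energy, g, s):
    """Number of days on which the stored energy is exhausted.

    normalized_energy : daily energies divided by the solstice-day mean
    g                 : generation factor (1.0 just meets the load on an
                        average minimum-insolation day)
    s                 : storage capacity in days of load
    """
    stored = s                      # ASSUMPTION: start with full storage
    failures = 0
    for energy in normalized_energy:
        # Daily surplus or deficit; storage can never exceed its capacity.
        stored = min(s, stored + g * energy - 1.0)
        if stored <= 0.0:
            failures += 1
            stored = 0.0            # ASSUMPTION: continue from empty storage
    return failures

def annual_failure_probability(years, g, s):
    """Fraction of simulated years that contain at least one failure."""
    failed = sum(1 for year in years if count_failure_days(year, g, s) > 0)
    return failed / len(years)

# Three consecutive days at half the solstice mean drain one day of storage.
print(count_failure_days([0.5, 0.5, 0.5], g=1.0, s=1.0))
```

Feeding many synthetic years through `annual_failure_probability` for a range of `g` and `s` values gives the kind of reliability surface that the mesh simulations explore.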

To obtain cost-optimized system configurations within the simulations for a given value of _{
i
} and the storage capacity values _{
j
}. We evaluate the system cost for each mesh point (_{
i
}, _{
j
}) as _{
i,j
}. Then we find the pair (_{
i,j
} ≤ _{
i
} and _{
j
} define the optimized system configuration.
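The mesh search described above can be sketched as a constrained minimum over a grid. The sketch assumes a cost that is linear in both the generation factor and the storage capacity, using the $75 billion and $22 billion unit costs quoted in the text. The function name, the argument names, and the toy failure model in the example are hypothetical stand-ins for the simulated failure probabilities.

```python
def cheapest_configuration(g_values, s_values, failure_prob, target,
                           cost_g=75.0, cost_s=22.0):
    """Return (cost, g, s) of the cheapest mesh point meeting the target.

    failure_prob(g, s) : annual failure probability at a mesh point
    target             : maximum acceptable annual failure probability
    cost_g, cost_s     : unit costs in billions of dollars (ASSUMED linear)
    Returns None if no mesh point satisfies the reliability target.
    """
    best = None
    for g in g_values:
        for s in s_values:
            if failure_prob(g, s) > target:
                continue            # not reliable enough
            cost = cost_g * g + cost_s * s
            if best is None or cost < best[0]:
                best = (cost, g, s)
    return best

def toy_failure(g, s):
    """Hypothetical failure model: reliable only when g * s >= 3."""
    return 0.0 if g * s >= 3 else 1.0

print(cheapest_configuration([1.0, 1.5, 2.0], [1, 2, 3, 4], toy_failure, 0.03))
```

With the toy model, the search prefers adding storage over adding generation, because a day of storage is cheaper than a proportional increase in generation. Replacing `toy_failure` with simulated failure probabilities turns this into the brute-force check against which analytic predictions can be compared.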

Due to fluctuations in daily insolation, even a system with

The stylized time history of the insolation and stored energy near the winter solstice (

For a solar farm of area _{0}, insolation tends to replenish the storage during most of the year; this time range corresponds to what we term the

For each day of the year, there is a day-specific average distribution of energy in the storage system. We will determine these daily distributions over a period around the winter solstice. From these distributions, we will determine how the annual failure probability

To compute the stored energy distribution on a single day, we first treat the idealized situation of a strong and time-independent bias. Based on the biased random-walk picture described above, the distribution of stored energy on the _{
j
}(

To begin, we determine the decay constant

When the bias is constant, the stored energy after each day changes by the average solar energy surplus (or deficit), (_{cb} = _{cb}(_{cb} deviates slightly from linearity for larger _{cb} = 0 at _{cb}
_{cb}

Seasonality causes the steady-state distribution of stored energy to be slightly different for each successive day of the year; thus we now write this distribution as _{
j
}(_{min}(_{min} the decay rate on this day. This decay rate would equal 0 when _{0}, which we will determine when

We also need the steady-state storage distributions _{
j
}(_{
j
}(_{
j
} on days near the winter solstice is well described by the quadratic _{min} is the day of minimum insolation and

Finally, we need to account for correlations in the daily insolation. To include these effects, we perform stochastic simulations of a system with constant bias

From the distribution of storage for each day of the year, we now determine the annual failure probability _{
j
} on each day

The day-specific failure probability for the _{
j
}(_{
j
} is written in the SM and _{
j
}(

To calculate the annual failure probability _{
j
}(

We now invert this expression to solve for the required storage as a function of the reliability _{0} is defined in

Eq.

Dependence of the storage

We now determine the optimal configuration of the combined system by minimizing the cost function:_{
g
} = $75 billion is the cost of a solar farm whose area _{0} is just sufficient to supply the daily electrical load _{
s
} = $22 billion is the cost of a storage system that can supply 1 day of electrical load for the region. As mentioned previously, the cost of the generation system is assumed to be linear in its area, so the cost of a solar farm of area _{0} will be _{
g
}
_{
s
}

To find the optimal parameters (_{
s
}/_{
g
}, once _{
s
}/_{
g
} and conventional measures of storage and generation costs is given in _{0} is given by

Here, and in what follows, we use _{0} = 0.038. If the cost ratio _{
s
}/_{
g
}, which currently is roughly 0.3, were to become less than 0.038, then Eq. _{
s
}/_{
g
} became less than 0.019. In this regime, our theory no longer applies, but is also unlikely to be reached by reductions in storage cost in the foreseeable future.

Our theoretical predictions for (

_{
s
}/_{
g
}. _{
s
}/_{
g
}. In both panels, circles are simulation points, while solid lines are the theoretical predictions of Eqs

We define this limit as _{
s
}/(_{
g
}
_{0}) ≫ 1, where (15) reduces to_{0} term in Eq.

Thus _{
g
}/_{
s
}. Combining Eq._{
s
}/_{
g
} exceeds 0.2.

From Eq. _{
s
}/_{
g
} in the range [0.1,0.3], this additional cost is roughly 50%–80% of _{
g
} or $40–$60 Billion. Eq.

Eq. _{
s
}, while _{
g
}. From Eq. _{
s
}/_{
g
} = 0.3. Thus a 30% reduction in storage cost has about the same impact as a 10% reduction in generation cost.

We define this limit by _{
s
}/(_{
g
}
_{0}) − 1 ≪ 1. Expanding Eq.

As _{
s
}/_{
g
} approaches _{0},

Combining Eq. _{
s
}/_{
g
} = 0.04, which is the smallest cost ratio value that we simulated, the additional cost due to weather fluctuations,

The relative influence of cost reductions in storage _{
s
}/_{
g
} = 0.04. Thus a 50% reduction in storage cost now has about the same impact as a 10% reduction in generation cost. In both the limits of expensive and inexpensive storage, reducing the generation cost has more impact on the overall cost than reducing storage cost.

We developed an analytic theory to determine the optimal mix of solar generation and storage that minimizes the overall system cost and achieves a given reliability. This system is specified by

• We have shown for the first time that in the presence of seasonal variations, the failure probability decays nearly exponentially with increasing storage and generation capacity (Eq.

• The storage capacity required to achieve a given reliability (Eq.

• The cost and configuration of the optimal generation/storage system [Eqs.

• A given percent reduction in the generation cost reduces the system cost by three to five times more than the same percent reduction in the storage cost (Eq.

A fundamental ingredient in our cost calculations is the ratio of the cost _{
g
} for a solar farm that can supply the daily load of the St. Louis region on an average insolation day at the winter solstice, to the cost _{
s
} of storing 1 day of energy load. With current technology, this cost ratio, _{
s
}/_{
g
}, is roughly 0.3. From _{
g
} + 1.3_{
s
} + 0.6_{
s
} ≈ $147 Billion (where the last term incorporates the diurnal storage need), consistent with

A system cost of roughly $100 Billion seems staggering. However, we emphasize that the long-term cost of a solar/storage system is likely cheaper than natural gas power generation. The construction cost for the requisite 5 GW of natural gas generation for the St. Louis region is roughly $4–5 Billion [

Within a 100% renewable system, costs can be reduced by deploying a mix of solar and wind energy

If one is willing to forgo 100% renewable energy generation, a solar/storage system could be augmented by natural gas “peaker” plants that operate only during solar energy deficit periods near the winter solstice. Because natural gas generation plants are relatively cheap to build (as mentioned above), they are well suited to being run for just a few days of the year. Thus consider a composite system that consists of a solar farm of area _{0}, with

In the absence of insolation fluctuations, the annual energy deficit for such a solar farm is approximately (see

We developed our mathematical methods for the specific case of the climate in the St. Louis region. The same approach can be applied to any geographic region. We expect the general aspects of our findings, such as the nearly-exponential dependence of the failure rate on the storage and excess generation capacities Eq. _{
s
}/_{
g
}. Regions at higher latitudes have lower insolation at the winter solstice, which increases _{
g
} since more solar panel area is needed to satisfy the load. The increased generation cost will shift the optimal system toward more storage. 2) Variations in _{0}, Γ, and

Publicly available datasets were analyzed in this study. This data can be found here:

This work was performed equally by AEC and SR.

This work was partly supported by the National Science Foundation, Grants DMR-1910736 and EF-2133863 to SR. We gratefully acknowledge support from Washington University’s International Center for Energy, Environment, and Sustainability (INCEES).

SR thanks Dan Shrag for helpful advice and conversations. We gratefully acknowledge support to AEC from Washington University’s International Center for Energy, Environment, and Sustainability.

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

The Supplementary Material for this article can be found online at:

This is obtained from the continental United States yearly electricity consumption of 4 × 10^{12} kWh (

There is little economy of scale for a large solar farm [

There is a small error in the correlation function because we drop the data for the 10 leap days.