The process of assessing resource adequacy requires detailed data on all the environmental and system elements that determine electrical demand and generation capacity as a function of time. Weather and climate data drive demand for heating and cooling and modulate power plant generation, either directly, for wind, solar, and hydropower generation, or indirectly, through their influence on generation efficiency and plant reliability. Data on the specifications and performance of the various components of the electrical grid are required to model their behavior. The quality and quantity requirements for each of these data streams are reviewed here, and guidance on reliable sources for these data is provided.
Multiple sections below propose three levels of data collection efforts, in recognition that tradeoffs are required in most assessment studies because of real-world constraints on resources, data, models and toolsets. These levels are recommendations that practitioners may use as a starting point to determine an appropriate course of action. Other solutions beyond those presented in this document are possible given the specifics of a given region and study. The three levels are defined as follows:
| | Level I | Level II | Level III |
|---|---|---|---|
| Resource or system feature | Most basic representation: may be sufficient when the outcome of an RA study is not sensitive to the data quality. | Mid-fidelity: may employ data preparation or synthesis techniques to extend or filter historical data or employ higher granularity data points. | Highest fidelity representation: these models systematically capture system component operations and behavior with the highest level of accuracy compared to Levels I and II. |
WEATHER AND CLIMATE DATA
Weather Data – Data Options
Level I:
15 years of weather data from a nearby observing station.
Level II:
30 years of weather data, including local observations and supplemental gridded reanalysis data.
Level III:
500 years of simulated weather data adjusted for climate, including plausible scenarios such as large volcanic eruptions and decadal drought.
Weather Data
Weather is a critical factor in the reliability of bulk electric systems because it affects both electricity demand and the performance of generation resources. RA studies depend on weather-driven simulations to capture this variability, making accurate and consistent weather data essential.
Electricity demand is mainly driven by temperature, which influences heating and cooling needs, along with calendar factors such as time of day and holidays. Other variables like wind, humidity, and solar radiation have smaller effects. On the supply side, weather impacts generation differently: thermal plants are sensitive to air and water temperature, humidity, and severe weather; wind generation depends on hub-height wind speed and air density, with icing and extreme temperatures posing risks; and solar generation relies on solar irradiance but suffers from high temperatures, snow, dust, or hail that can reduce efficiency or damage equipment.
To model these effects, RA studies require long and representative time series of temperature, pressure, wind (surface and hub-height), solar radiation, and precipitation. Ideally, these datasets come from a single consistent source to maintain correlations among variables. When this is not possible, the potential inaccuracies from combining separate datasets should be assessed. High-quality, sub-hourly observations are preferred. When local data are unavailable, sources such as NOAA, MesoWest, the US Climate Reference Network, or the New York State Mesonet are valuable, followed by synoptic observations such as those from the Automated Surface Observing System (ASOS) network.¹ For precipitation, which is highly variable, satellite and radar products like IMERG² provide the best coverage.
Because RA studies are forward-looking, past observations must be translated into estimates of future variability, introducing uncertainty. Assessing rare events such as extreme heatwaves or storms requires very long historical records or multiple simulated scenarios. Modern meteorological observation systems and modeling tools help extend datasets and generate independent realizations, improving the ability to evaluate these low-probability, high-impact events.
In short, reliable RA assessments depend on high-quality, consistent, and sufficiently long weather datasets that capture both normal variability and extreme events. These datasets are the foundation for understanding how weather affects electricity demand and generation, and for ensuring the resilience of the power system.
Extreme Weather
Historical extreme weather data can be extracted from raw data, and should already be captured in the historical record if sufficient weather years are used.
Many extreme events, such as severe thunderstorms, are not easily represented in gridded climate change data.
Extreme weather is a threshold-dependent function of weather variables and their climate statistics. What qualifies as an “extreme weather event” is dependent on the observed variability of weather variables at a given location and on the vulnerability of the system. For example, if a system’s power plants are not winterized, they will be more vulnerable to extreme cold events.
To ensure extreme weather is properly represented, it must be reflected consistently in all of the model’s inputs. For example, are weather-dependent outages represented in your model?
For specific extreme weather analysis, please review the Resource Adequacy Scenario Selection Guide.
Extreme Weather Category | Historical Understanding | Projection Capability |
Cold extremes | High | High capability, evolving methods |
Heat extremes | High | High |
Precipitation and droughts | Medium | Medium |
Wildfires | Medium | Low |
Coastal flooding | High | High |
Severe storms | Medium | Low |
Snow & Ice | Medium | Medium |
Variable/Data Type | Source Description |
Surface weather observations of pressure, temperature, relative humidity, and wind | |
Surface Radiation Data | |
Gridded wind speed at turbine hub height | |
Gridded consistent temperature, humidity, wind, radiation, hub-height wind from reanalysis | European Center for Medium Range Weather Forecasting ERA5 Reanalysis |
Gridded precipitation data | |
Climate Data
Climate data refers to long-term statistics of weather variables, typically averaged over 30 years, such as the mean July temperature or the variability of wind speeds. These data provide a baseline expectation for weather conditions, but forecasts often need refinement, especially when projecting beyond the historical period. One way to improve accuracy is to account for trends in the data, such as gradual warming. This can be done by removing the trend before calculating averages and then adding it back, extrapolated to the present. Longer records allow better testing of these methods, while climate models offer additional insight by simulating how future conditions may change under different greenhouse gas emission scenarios.
Observed and modeled trends consistently show global warming, though rates vary by region and season. Precipitation patterns are more complex, with increases in many areas and decreases in others, such as the Mediterranean. Trends in wind speed and solar radiation are less clear, often influenced by local factors like land-use changes. For RA studies, these changes matter because extreme temperatures drive peak loads, and shifts in their frequency or intensity affect system adequacy. Historical data show larger increases in daily minimum temperatures than maximums, reducing the diurnal temperature range, while winter warming generally exceeds summer warming, though models differ on magnitude.
Beyond temperature, severe weather events, such as flooding, icing on wind turbines, snow on solar panels, or hurricane winds, pose operational risks. While climate models can represent these phenomena, their coarse resolution limits confidence in projections of extreme event frequency. Observational data is also sparse, making statistical detection of trends challenging. As a result, uncertainty in future extremes remains high.
To address these uncertainties, forecasts are best expressed probabilistically. Climate projections combine multiple model runs, different initial conditions, and various emissions scenarios to capture natural variability and human-driven uncertainty. This ensemble approach, refined over decades of climate modeling, provides a range of plausible futures. Retrospective forecasts further allow evaluation of model skill, improving confidence in long-term planning.
Figure: Simulated time series with trend. The black star shows the actual average of the orange points. Blue star shows the mean of the blue points. Red star shows the mean of the orange points predicted by extrapolating the trend of the blue points.
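The trend adjustment illustrated in the figure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the figure: the synthetic series, the warming rate of 0.03 degrees per year, the noise level, and the function names `linear_fit` and `trend_adjusted_mean` are all hypothetical choices made for demonstration.

```python
import random

def linear_fit(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def trend_adjusted_mean(hist_years, hist_values, target_years):
    """Estimate the mean over target_years by extrapolating the linear
    trend fitted to the historical record."""
    intercept, slope = linear_fit(hist_years, hist_values)
    # Residuals around the fitted line average to zero, so the predicted
    # mean is the trend line averaged over the target period.
    predicted = [intercept + slope * y for y in target_years]
    return sum(predicted) / len(predicted)

# Synthetic record: all numbers below are illustrative assumptions.
rng = random.Random(0)
years = list(range(1960, 2030))
series = [20.0 + 0.03 * (y - 1960) + rng.gauss(0.0, 0.5) for y in years]

hist_years, hist_vals = years[:50], series[:50]        # "blue" points
later_years, later_vals = years[50:], series[50:]      # "orange" points

naive_mean = sum(hist_vals) / len(hist_vals)           # blue star
actual_mean = sum(later_vals) / len(later_vals)        # black star
trend_mean = trend_adjusted_mean(hist_years, hist_vals, later_years)  # red star
```

In this synthetic case the trend-extrapolated mean is expected to lie closer to the actual later-period mean than the plain historical average. The linear form of the trend is itself an assumption and should be tested against longer records where they exist.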
Climate Projection & Extreme Events
Projecting future years’ weather and climate requires extensive historical data to be coupled together with projections of future climate. Global climate models provide various scenarios and lenses through which future weather can be synthesized. The example shown below was used to inform the evolution of temperature when moving between the historical record, climate models and synthetic projections in the Northeast case study.
This process, based on a group of climate models and two shared socioeconomic pathway scenarios, illustrates neatly that the expected average temperature is projected to increase as we move towards 2065, compared with an ERA5 reanalysis data set baseline from 1950 to 2021. The average data can be downscaled to hourly resolution by appropriately blending detrended historical temperature profiles with projected temperature deltas for each quantile of temperature and time period. The value of such an approach is in the downstream effect on demand and asset availability.
Source: EPRI Resource Adequacy for a Decarbonized Future Case Study: Northeastern Power Coordinating Council (3002027835)
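The blending of historical profiles with quantile-specific temperature deltas described above can be sketched as follows. This is a simplified illustration and not the implementation used in the case study: the function name `apply_quantile_deltas`, the quantile bin edges, and the delta values are hypothetical, and the input series is assumed to be already detrended.

```python
from bisect import bisect_left, bisect_right

def apply_quantile_deltas(hourly_temps, quantile_edges, deltas):
    """Shift each hourly temperature by the delta assigned to its quantile bin.

    quantile_edges: ascending rank thresholds in (0, 1) separating the bins
    deltas: projected change in deg C per bin, len(quantile_edges) + 1 values
    """
    if len(deltas) != len(quantile_edges) + 1:
        raise ValueError("need exactly one delta per quantile bin")
    ordered = sorted(hourly_temps)
    n = len(ordered)
    adjusted = []
    for temp in hourly_temps:
        # Empirical rank of this hour within the historical distribution.
        rank = bisect_right(ordered, temp) / n
        # Bin whose upper edge is the first edge >= rank.
        bin_index = bisect_left(quantile_edges, rank)
        adjusted.append(temp + deltas[bin_index])
    return adjusted

# Hypothetical inputs: ten historical hours and three quantile bins
# (coldest 10%, middle 80%, hottest 10%) with assumed deltas in deg C.
historical = [float(t) for t in range(10)]
adjusted = apply_quantile_deltas(historical, [0.1, 0.9], [3.0, 2.0, 1.0])
```

Because the chronological order of the hours is preserved, the adjusted series keeps the historical weather patterns while its distribution reflects the projected change. In practice a separate set of deltas would be supplied for each time period, and finer quantile bins would reduce discontinuities at the bin edges.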
SUPPLY RESOURCE DATA
Resource Supply – Data Options
Level I:
Basic nameplate capacity for all resources; static assumptions for energy and flexibility; no weather adjustments.
Level II:
Includes seasonal derates, and simplified flexibility constraints (e.g., ramp rates, min up/down times); limited weather sensitivity.
Level III:
Full dynamic modeling: unit-specific ratings, weather-dependent derates, energy/fuel constraints, flexibility parameters, and predictive modeling for extreme conditions.
Level I:
Extreme outages from 15+ years of forced outage historical data
Level II:
Weather-dependent outages (WDO) generated using 30+ years of historical forced outage and temperature data.
Level III:
WDO and common cause outages generated using 50+ years of historical forced outage and temperature data, coupled with predictive modeling of extreme weather events.
Level I:
Generic annual maintenance rates and average outage durations; fixed scheduling during low-load seasons.
Level II:
Seasonal maintenance rates by resource type; optimized scheduling for expected peak net load; outage duration distributions.
Level III:
Unit-specific maintenance schedules and repair time distributions; stochastic modeling under uncertainty with multiple weather years and renewable output scenarios.
Generic Data Requirements
Thermal generation, which uses heat from fossil fuel combustion or nuclear fission to drive heat engines that generate electrical power, represents a very large fraction of present electrical generation and is expected to continue to do so under most decarbonization scenarios. Environmental data are needed to model the forced outage rate, which depends especially on cold temperatures at the plant or its associated fuel lines, and to model plant efficiency, which depends on the cooling of working fluids and is reduced in conditions of high temperature and, for evaporative cooling mechanisms, high humidity at the plant location.
Capacity Limits
Capacity limits represent the maximum output that an in-service asset could produce in each interval. Different limits may be used depending on the type of asset and the prevailing conditions. The common modeling options require the following data:
Installed Generation Capacity Rating
The most common source of installed generating capacity data is an amalgam of already installed capacity and planned installed capacity, from both grid interconnection requests and planned additions in further out horizons. Those planned additions may be the outcome of an expansion planning process. Installed capacity, given in MW, does not account for seasonal variation. In many cases, plant capacities resulting from investment planning models are directly related to input parameters, which may be in turn based on common capacity ratings for a given asset type.
In certain circumstances several power units may constitute a plant. Common examples include a combination of 2 gas turbines and a steam turbine in a 2 on 1 combined cycle gas turbine, or multiple blocks of reciprocating engines. In this case a decision needs to be made as to whether the combined plant or each block is to be used. If a plant approach is taken and partial outages are common, then partial failure and repair processes are a requirement for adequacy assessment. Guidance on partial outage is beyond the current scope of this document, but we acknowledge it is a key issue.
Data need: nameplate capacity (MW)
Contractually Declared Capacity Rating
In cases where RA assessments are conducted within the auspices of a contractual or market clearing process, the capacity used to represent an asset may be different to that of the nameplate installed capacity. In this case a participant has made a declaration of capacity which may be considered. The source of this data is the operator of the contractual process itself.
Data need: declared capacity (MW)
Seasonally Adjusted or Condition-Based Capacity Rating
Setting a rating dynamically at seasonal or more granular intervals may follow two approaches, because capacity ratings may vary depending on the underlying physics of the plant, the plant’s environment and the plant’s design. The first approach is a self-declaration of asset derating during certain conditions (e.g. summer derating on Rankine cycle machines). In this case the data is available from plant owner declarations, and fleetwide averages for the same type of asset may reasonably be used in place of missing data.
In the absence of self-declared ratings, the synthesis of dynamic ratings requires the installed capacity (as previously discussed), a projection of key weather variables (temperature, humidity, pressure, heat sink temperature) over the period for which the rating is to be used and a transfer function to relate the projected condition to the capacity rating.
Availability and sources of location-specific temperature and humidity data for the past, present and future are discussed in the Weather Data section above. A specific concern for certain thermal plant cooling is the temperature and availability of water from rivers, lakes or oceans. The US Geological Survey has comprehensive data on stream water flow and level and on lake and reservoir level; however, stream and lake temperatures are harder to obtain. Fortunately, surface water temperatures tend to be close to an average of the air temperature over their catchment basin, so the same procedures used to estimate future temperatures at a given location discussed in the Weather Data section may be used to estimate future stream and surface water temperature. The dependence of the forced outage rate on weather factors at thermal power plants is discussed in detail in the Common Cause Outages section.
The determination of an appropriate transfer function is specific to the asset and the timescale. Weekly or daily ratings may be more reasonably determined by using fleet level statistics of historical performance during specific weather conditions. Data to support this assessment may be ascertained from inspection of historical performance of similar generators or from vendor specification. Approaches to develop generic models are the subject of ongoing work in EPRI’s Climate READi initiative.
Data need:
Historical expected seasonally adjusted capacity (MW, interval), or
Nameplate capacity (MW), transfer function, weather variables
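As an illustration of the second data option, the sketch below applies a simple transfer function to nameplate capacity. It is a hedged example only: the piecewise-linear form, the reference temperature, the derate slope, and the floor are hypothetical placeholders, and a real study would calibrate the function from fleet statistics or vendor specifications as described above.

```python
def temperature_derated_capacity(nameplate_mw, ambient_c, reference_c=15.0,
                                 derate_per_deg=0.005, floor_fraction=0.8):
    """Return a condition-based capacity rating in MW.

    Assumed (hypothetical) transfer function: no derate at or below the
    reference temperature, a linear loss of derate_per_deg per deg C above
    it, and a lower bound of floor_fraction of nameplate.
    """
    excess = max(0.0, ambient_c - reference_c)
    fraction = max(floor_fraction, 1.0 - derate_per_deg * excess)
    return nameplate_mw * fraction

# Hourly ratings for an assumed 500 MW unit over a short temperature profile.
temperatures_c = [10.0, 25.0, 35.0, 42.0]
ratings_mw = [temperature_derated_capacity(500.0, t) for t in temperatures_c]
```

The same structure extends to other weather variables listed above, such as humidity or heat sink temperature, by adding arguments to the transfer function.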
Energy Limitations
Increasingly, energy delivery limitations are recognized as an important constraint on the delivery of power when required. Two modelling options are presented that require input parameters. Both options require the rate of consumption of the primary energy supply source per unit of power produced.
Fuel offtake limits represent a more straightforward approach to modelling energy inputs. This approach is commonly applied with wind and solar power and simplified into an hourly power production profile. A similar approach may be taken to limit the fuel consumption at a gas fired power plant to some pre-determined level, that may be related to gas purchase agreements, coal delivery schedule or other operational constraint in the upstream fuel network.
Fuel pools represent a finite amount of primary energy input available to a group of supply assets. Examples may include a reservoir supplying hydro units with fixed storage and inflow or a gas pipeline and storage supplying gas units. In this case the data need is the fuel pool capacity (energy, TWh), and inflows / outflows by time interval (TWh / interval). Depending on the fuel type, this may be available from asset operators (e.g. through fuel surveys).
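A fuel pool can be represented as a simple energy balance that caps what the supplied assets can deliver in each interval. The sketch below is illustrative only: the function name `dispatch_with_fuel_pool` and all quantities are hypothetical, and the pool is expressed directly in deliverable energy (MWh), which assumes the conversion from fuel to electricity has already been applied.

```python
def dispatch_with_fuel_pool(requested_mwh, inflow_mwh, capacity_mwh, initial_mwh):
    """Track a shared fuel pool and limit energy delivered in each interval.

    Returns (delivered, levels): energy delivered per interval and the pool
    level at the end of each interval.
    """
    level = initial_mwh
    delivered, levels = [], []
    for request, inflow in zip(requested_mwh, inflow_mwh):
        # Inflow beyond the storage capacity is spilled.
        level = min(capacity_mwh, level + inflow)
        # Output cannot exceed what is stored.
        output = min(request, level)
        level -= output
        delivered.append(output)
        levels.append(level)
    return delivered, levels

# Hypothetical three-interval example.
delivered, levels = dispatch_with_fuel_pool(
    requested_mwh=[50.0, 50.0, 50.0],
    inflow_mwh=[10.0, 10.0, 10.0],
    capacity_mwh=100.0,
    initial_mwh=80.0,
)
```

In this example the third interval is energy limited even though installed capacity is unchanged, which is the type of shortfall a capacity-only representation would miss.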
Flexibility Constraints
Asset flexibility relates to the extent to which an asset can change its power production over time. In many cases this level of detail is omitted in RA studies, but it is becoming increasingly clear that at some point the constraint that limits the supply of power to customers may not be capacity or energy related, but flexibility related. In cases where modelling guidelines highlight the need to consider this risk, several variables may be relevant. All of them are specific to the plant, although fleetwide expected values may be necessary where plant data are unavailable.
Potentially required data:
Minimum stable generation level (MW)
Start-up time (hours or intervals)
Minimum up / down times (hours or intervals)
Ramp rates (MW/hour)
The most common source for this data resides at a system level in production cost model databases for existing generation. For projected future generation, data is often uncertain, but vendor specification sheets provide an indication of the reasonable range for each asset class.
Forced Outage Data
Historically, most of the activity in adequacy modelling has focused on the representation of failure of generating units. The modelling guidelines describe several approaches to this that require a range of data inputs, extending upon existing industry guidelines (e.g. NERC Data Collection Guidelines).
A basic practice is to examine the historical availability of a unit type for dispatch during specific conditions. Such approaches require:
Historical declarations of generator availability or unavailability for dispatch, together with coincident system or weather conditions. Generator availability for dispatch should be distinct from outturn production and curtailment, and should include limitations on production because of low wind, low irradiance, or fuel availability. This data may only be available from plant or system operators.
Standard regional databases such as NERC GADS or the ENTSO-E transparency platform provide the basic data by which outages are reported and classified, allowing for the direct estimation of reliability statistics.
Derating factors are regularly employed to represent the likelihood of outages at a point in time. In many cases the formulation is a variant of the capacity available to dispatch multiplied by (1 – FOR), where FOR is a forced outage rate. FOR is commonly considered to be a % value, rather than a true rate (e.g. failures per period). Many variants of FOR are encountered, requiring different data input. These variants include:
EFOR – Equivalent Forced Outage Rate is the percentage of time that a unit is on forced outage, with partial outages converted to equivalent full outages.
EFORd – Equivalent Forced Outage Rate – Demand is the forced outage rate conditional on the unit being required to meet demand in that interval. Demand, other supply, and economics together determine when a plant is in demand. The historical duty cycle of a plant may also not be accurately projected into the future without an accompanying dispatch model.
Monte Carlo Markov Chain (MCMC) models represent the generator transitioning between two (or more) states: available and unavailable. Failure and repair rates determine whether a unit makes the transition from one state to another. These rates can be determined based on raw failure information that is found in industry standard databases or generator declarations to system operators. Projection of historical forced outage rates into the future is also not without uncertainty as underlying plant factors influence their FOR.
The common practice is to determine FORs on an annual basis, but increasingly conditional FORs are being determined for Summer or Winter assessments, reflecting the different failure mechanisms that are at play in each condition.
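The two-state representation and the (1 – FOR) derating described above can be related with a short simulation. The sketch below is a minimal illustration, assuming an hourly time step and constant per-hour transition probabilities. The function names, the probabilities, and the unit size are hypothetical, and real studies would derive the rates from outage databases.

```python
import random

def steady_state_for(fail_prob, repair_prob):
    """Long-run unavailability of a two-state (available / unavailable) chain."""
    return fail_prob / (fail_prob + repair_prob)

def simulate_availability(fail_prob, repair_prob, hours, seed=0):
    """Chronological hourly simulation; returns True for each available hour."""
    rng = random.Random(seed)
    available = True
    states = []
    for _ in range(hours):
        if available:
            # An available unit fails with probability fail_prob each hour.
            if rng.random() < fail_prob:
                available = False
        elif rng.random() < repair_prob:
            # An unavailable unit is repaired with probability repair_prob.
            available = True
        states.append(available)
    return states

# Hypothetical unit: 400 MW, mean time to failure of 500 h, mean repair of 50 h.
unit_mw = 400.0
fail_prob, repair_prob = 0.002, 0.02
states = simulate_availability(fail_prob, repair_prob, hours=300_000)
simulated_for = 1.0 - sum(states) / len(states)
analytic_for = steady_state_for(fail_prob, repair_prob)
derated_mw = unit_mw * (1.0 - analytic_for)
```

Over a long simulation the share of unavailable hours approaches the steady-state value, so the derating factor captures the average effect of outages. Only the chronological simulation preserves outage durations, which matter when energy or flexibility limits are also modelled.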
Common Mode Outage Data
A common mode (common cause) outage is defined as a related multiple outage event consisting of two or more primary outage occurrences initiated by a single incident or underlying cause where the outage occurrences are not consequences of each other.¹
The presence of a single “actor” or causal event differentiates a common mode outage from a dependent or cascading outage.² Common mode and dependent outages are influenced by several factors, such as failure of equipment (often due to aging), malfunctioning of protective devices, weather conditions (wind, lightning), natural disasters (hurricanes, earthquakes), loading conditions, power transfers, maintenance, and human error.
The traditional industry assumption for outages, particularly forced outages, is that they are independent and uncorrelated. This practice, which does not consider correlations like WDO, may no longer be valid with increased dependence on renewable technologies combined with a recognition of common-mode events that affect multiple asset types.³
Historically, common mode outage is assumed to be part of the historical availability performance data that goes into the unit’s failure rate statistic. Static forced outage rates are applied to the RA model, which randomly allocates outages of individual units across the one-year planning horizon. While this approach offers simplicity, the random distribution of outages ignores the causality of events and possible correlations with other factors.
Weather dependent outage (WDO) is an extension of common mode and dependent outages. Modeling WDO is one way to quantify correlated outages, in particular outages that correlate with severe weather events.⁴ Modeling WDO requires historical data on generator performance, temperature, and load. The generator data (generator unit name, the reason for the outage, start and end times of outage, and derating percentage) are used to generate a historical time series of all the transitions between states for each generator.
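The conversion of outage event records into a state time series can be sketched as follows. This is an illustrative example only: the record field names, the unit name, the outage reasons, and the dates are hypothetical and do not follow any particular database schema, and the events are assumed to have been filtered to a single unit.

```python
from datetime import datetime, timedelta

def hourly_availability(events, start, hours):
    """Build an hourly series of available fraction (1.0 = fully available).

    Overlapping events are resolved by taking the most severe derating.
    """
    series = [1.0] * hours
    for h in range(hours):
        t = start + timedelta(hours=h)
        for ev in events:
            if ev["start"] <= t < ev["end"]:
                available = 1.0 - ev["derate_pct"] / 100.0
                series[h] = min(series[h], available)
    return series

def transitions(series):
    """List (hour index, previous state, new state) for every state change."""
    return [(h, series[h - 1], series[h])
            for h in range(1, len(series)) if series[h] != series[h - 1]]

# Hypothetical records for one unit.
events = [
    {"unit": "GT-1", "reason": "frozen sensing line",
     "start": datetime(2020, 1, 10, 2), "end": datetime(2020, 1, 10, 5),
     "derate_pct": 100.0},
    {"unit": "GT-1", "reason": "low fuel pressure",
     "start": datetime(2020, 1, 10, 4), "end": datetime(2020, 1, 10, 7),
     "derate_pct": 50.0},
]
series = hourly_availability(events, datetime(2020, 1, 10, 0), 8)
changes = transitions(series)
```

Joining the resulting series with coincident temperature observations gives the paired data from which temperature-dependent transition rates can be estimated.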
The data gap associated with modeling WDO involves the availability and validity of historical data. Generator mapping is challenging when anonymized data from a large region is used. For other sources that provide generator names and do not need mapping, validating the data may still be necessary. Missing observations are inevitable and require careful assumptions when filling in the information. In many cases it may be prudent to evaluate weather dependent failure rate models based on a group of similar units, rather than on a plant specific basis. In certain conditions finding a group of similar assets may not be possible, and default assumptions may need to be generated using other asset types as a proxy. Future WDO and common mode outage modeling may necessitate a process coupled with predictive modeling to project future rates based on the anticipated future condition of the asset. This type of modeling will require additional data that is plant and system specific.
For planners without access to the above information, modeling common mode outage requires historical generator performance data. The key data needed to model common mode outage are generator unit name, the reason for the outage, start and end times of outage, and derating percentage. Database examples in the U.S., Europe, and Australia are:
NERC GADS. In the U.S., generator data pertaining to outage events is available from the GADS maintained by NERC. The electric utility industry initiated GADS in 1982, although data collection began in 1963. To date, GADS maintains histories on more than 5,000 generating units in North America [Source: Generating Availability Data System (GADS) (nerc.com)].
ENTSO-E Transparency Platform. In Europe, the member states are mandated to submit fundamental information related to electricity generation, load, transmission, and electricity balancing, which ENTSO-E has published on the ENTSO-E Transparency Platform since January 5, 2015 [Source: Data View (entsoe.eu)].
AEMO NEMWEB. In the Australian NEM, up to thirteen months of market data is stored on the AEMO website [Source: AEMO | Nemweb data].
The gaps identified call for an improvement in the data collection system, and similar gaps may be observed in other databases. Note that two of the examples cited in this section (ENTSO-E and AEMO) highlight the lack of more extended historical data, which could be supplied by commercial tools that regularly collate market information. This leaves the validity of the information, in particular whether an event is correctly classified as a common mode outage, as the primary data gap.
Database Name | Data Gaps |
NERC GADS | The matching of generator units is not straightforward. Events are classified by region code (e.g., MRO), fuel category (e.g., coal), and unit type (e.g., 100 - Fossil-Steam). Location information is not available. Matching only a few out of 1000+ units in the RA model is possible, requiring additional assumptions in mapping. For events with multiple cause codes, only the primary cause code is tallied, which could lead to some events with the same cause of outage not being classified as common mode outages. Note: Cause code refers to the identified cause of an event. Only units with a capacity of 20 MW or above are reported. |
ENTSO-E Transparency Platform | Available data is only from 2015. The reason for the outage may require further investigation to ensure whether an event can be classified as a common mode outage. |
AEMO NEMWEB | Available data only contains the latest thirteen months. The cause of the outage is not available. |
Weather Dependent Forced Outages
Incorporating temperature sensitivity in forced outage rate estimation is one of the key learnings from Winter Storms Uri and Elliott. While NERC GADS may provide the underlying data needed to develop such relationships, in many instances the considerable analysis required to relate weather to performance may not be feasible within the resources available to complete adequacy assessments. One approach to ensure that some level of weather-related uncertainty is captured without investing in the complete analysis is to leverage curves already developed in other studies that relate temperature to asset failure rates, and to make reasonable adjustments based on known calibration points. In the Texas case study, the team leveraged the U-shaped forced outage rate curve developed by Murphy et al. for gas combined cycle units in the PJM territory, which are more likely to be weatherized to manage extreme cold, given their increased exposure to such conditions.
In this case the team evaluated two shifts of the curve to the right, of 5 and 10 degrees Celsius, each increasing the forced outage rate for below-freezing temperatures relative to the original. When compared to the two data points available for Winter Storms Uri and Elliott, the shifted curves are in the region of what might be expected from a more detailed model. Given the uncertainty in the implementation and performance of weatherization measures, the loss in accuracy from this simplified approach may be acceptable compared with a more precise approach that relies on underlying assumptions that can change rapidly and unexpectedly.
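The effect of shifting a temperature-dependent forced outage rate curve can be illustrated with a stand-in function. The sketch below does not reproduce the published curve or the case study results: the quadratic U shape, its coefficients, and the function names are hypothetical. It also translates the whole curve, which is an assumption about how the shift was applied and lowers the rate on the hot branch as a side effect.

```python
def u_shaped_for(temp_c, base=0.03, cold_coeff=0.0004, hot_coeff=0.0002,
                 cold_knee_c=0.0, hot_knee_c=30.0):
    """Hypothetical U-shaped forced outage rate as a function of temperature.

    The rate is flat between the two knees and rises quadratically outside.
    """
    cold = max(0.0, cold_knee_c - temp_c)
    hot = max(0.0, temp_c - hot_knee_c)
    return min(1.0, base + cold_coeff * cold ** 2 + hot_coeff * hot ** 2)

def shifted_for(temp_c, shift_c):
    """Curve shifted right by shift_c degrees: f_shifted(T) = f(T - shift_c)."""
    return u_shaped_for(temp_c - shift_c)

# Compare the original curve with 5 and 10 deg C shifts at cold temperatures.
cold_temps_c = [-20.0, -10.0, 0.0]
comparison = [
    (t, u_shaped_for(t), shifted_for(t, 5.0), shifted_for(t, 10.0))
    for t in cold_temps_c
]
```

Calibration then amounts to choosing the shift whose outage rates at observed event temperatures fall closest to the available data points.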
Failure to Start
In systems that anticipate the need for significant flexibility from generation resources, additional cycling is foreseen, and a significant increase in start-ups may be a reality for generators managing an increasingly variable net load curve. In solar dominated systems, several generators may need to start in a short interval to ramp into a morning peak as well as a late afternoon peak. Generation failures are most often encountered in the start-up process. Given the increasing number and concentration of start-ups, failure to start may increasingly be modelled as part of the assessment process.
Start-up failure rates are reported differently in each region. NERC’s GADS requires reporting of such failures with a defined failure code. Similarly, balancing authorities and generators record start-up failure statistics. Condition based start failure may be determined based on a variety of dimensions – calendar or time related, or condition related (e.g. temperature). In this case the data required includes both historical weather data coincident with the startup failure events as well as future projected weather.
Key Data Sources for Outages
Variable/Data Type | Source Description | Web Address |
Generation Facility Outage Data | NERC Generating Availability Data System (GADS) | |
Summary Outage Data in the Western US & Canada | WECC | |
Hourly Resource Outage Data | ERCOT | |
Planned Maintenance Data
To optimally schedule maintenance profiles for a resource adequacy study, maintenance rates for thermal generators should be obtained from the entity responsible for conducting generator maintenance. In some cases that entity may be a utility that has full control over maintenance scheduling, while in other cases it may be a utility or third-party generation owner that must make a request of the grid operator (utility or ISO/RTO) to schedule maintenance. Ideally the maintenance rates and repair times for each generator should be obtained based on each generator’s historical and/or projected maintenance schedules. However, a good approximation for each generator’s maintenance rate is a resource type average maintenance rate based on each thermal generator’s resource type (nuclear, gas, coal, etc.) and size (100 MW, 200 MW, etc.). This fleet average approach is typically used for modeling large networks or utilizing only publicly available data for an analysis.
Some short-term maintenance plans and/or historical maintenance schedules for individual generators and some projected maintenance requirements (either in a set number of hours/events per year, or as a planned maintenance rate in percentage) can be found in the NERC Generating Availability Data System (GADS) and ENTSO-E transparency platform for benchmarking duration and frequency of individual generators.
Historically, maintenance schedules were planned during periods of the highest capacity margins on the system, which were typically spring and fall shoulder seasons. These seasons typically have the lowest risk of resource adequacy shortfalls. The major gaps in modeling maintenance schedules revolve around the uncertainties that come from increased renewable and energy storage penetration, and around determining when a traditional scheduling approach is no longer suitable for systems with high renewable and energy storage penetration. In such systems, resource adequacy risk can shift to periods outside of the summer and winter peak demand periods. In addition, climate change could lead to higher-than-expected demand during the shoulder seasons. In general, the following topics are presented as gaps that require careful consideration when developing optimized maintenance schedules in a resource adequacy analysis:
Representing the distribution of repair times for maintenance outages if only class averages are available (e.g., exponential, normal, uniform, etc.).
Scheduling maintenance with respect to renewable energy output and load uncertainty to avoid perfect maintenance foresight (e.g., different weather years may have different risk periods and require different maintenance schedules).
Determining the expected reliability risk periods of a resource mix for expected net peak load and/or net energy reserves.
The issues described in the latter two points above are further discussed below.
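As a minimal sketch of the first gap, the Python snippet below draws maintenance repair durations from three candidate distributions that share one class-average mean. The 120-hour mean, the spread assigned to each distribution, and the function name are illustrative assumptions, not values taken from any outage database.

```python
import random
import statistics

def sample_repair_hours(mean_hours, shape, n, seed=0):
    """Draw n repair durations (hours) with the given class-average mean."""
    rng = random.Random(seed)
    if shape == "exponential":
        return [rng.expovariate(1.0 / mean_hours) for _ in range(n)]
    if shape == "uniform":
        # Assumed spread: +/- 50% around the class average.
        return [rng.uniform(0.5 * mean_hours, 1.5 * mean_hours) for _ in range(n)]
    if shape == "normal":
        # Assumed standard deviation: 25% of the mean; negative draws are floored at zero.
        return [max(0.0, rng.gauss(mean_hours, 0.25 * mean_hours)) for _ in range(n)]
    raise ValueError(f"unknown distribution shape: {shape}")

for shape in ("exponential", "uniform", "normal"):
    draws = sorted(sample_repair_hours(120.0, shape, 20000))
    p95 = draws[int(0.95 * len(draws))]
    print(shape, round(statistics.mean(draws), 1), round(p95, 1))
```

All three choices reproduce the same average, but the exponential case has a much longer tail of extended outages, which is the feature most likely to matter for adequacy results.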
Maintenance Scheduling Under Uncertainty
An optimal maintenance schedule with perfect foresight based on net load (load minus available wind and solar) and hourly capacity reserves is likely dependent on the study year being simulated. Therefore, if perfect foresight is assumed for maintenance scheduling, a unique maintenance profile must be developed for each year that is simulated in resource adequacy modeling. However, when planners evaluate the impacts of a maintenance schedule on resource adequacy, the future weather is uncertain. Thus, in the case of imperfect foresight, optimal maintenance scheduling would be determined based on a range of correlated renewable output and load data. Using a maintenance profile optimized for a single weather year could lead to maintenance scheduling that is not representative if the net load and energy reserves in a study year are significantly different from those of the weather year for which the optimal schedule was developed.
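The difference between perfect and imperfect foresight can be sketched in a few lines of Python. The example picks a maintenance window by minimizing the peak net load inside the window, either for a single weather year or averaged across several weather years. The weekly net-load values, the two-week window, and the scoring rule are hypothetical; a real study would use its own risk metric and many more weather years.

```python
def best_window(net_load_by_year, window):
    """Return (start, score) for the window whose peak net load,
    averaged across the supplied weather years, is lowest."""
    n_periods = len(net_load_by_year[0])
    best_start, best_score = None, float("inf")
    for start in range(n_periods - window + 1):
        peaks = [max(year[start:start + window]) for year in net_load_by_year]
        score = sum(peaks) / len(peaks)
        if score < best_score:
            best_start, best_score = start, score
    return best_start, best_score

# Hypothetical weekly peak net load (percent of annual peak) for three weather years.
year_a = [90, 70, 60, 65, 80, 95, 85, 75]
year_b = [85, 80, 78, 60, 55, 70, 90, 95]
year_c = [95, 75, 65, 70, 60, 80, 85, 90]

print("perfect foresight, year A only:", best_window([year_a], 2))
print("across all weather years:      ", best_window([year_a, year_b, year_c], 2))
```

In this toy data set the window chosen with foresight of year A alone differs from the window chosen against all three years, which is the behavior the paragraph above describes.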
Determining Expected Reliability Risk Periods
Maintenance Scheduling within a Resource Adequacy model is typically optimized based on the expected available capacity reserves for a given scheduling period. The expected available capacity reserves can be calculated in multiple ways, and different practices may be required based on multiple factors, such as the penetration of variable renewable resources and battery storage. A few approaches are outlined below:
Maintenance schedule is optimized based on expected peak load risk (traditional)
Maintenance schedule is optimized based on expected peak net load risk (higher renewable penetration)
Maintenance schedule is optimized based on both expected peak net load and expected net energy reserves (high penetration of renewables and storage)
The traditional maintenance scheduling assumption (expected peak load risk) typically results in maintenance concentrated during low load periods (typically spring and fall). As renewables become a larger portion of the generation fleet, the hours of high system risk will shift, and the expected peak load hour may no longer be an hour of high system risk around which maintenance can be scheduled. Therefore, optimal maintenance scheduling should be focused more on periods with expected low net load (high expected renewable output) while limiting maintenance outages during periods of expected high net load (low expected renewable output). As energy storage penetration grows, expected net energy reserves (energy minus renewable generation and load) should also be considered so that there are both sufficient capacity and sufficient energy reserves during periods of maintenance.
VARIABLE RENEWABLE ENERGY DATA
Variable Renewable Energy – Data Options
Level I:
Decades of hourly data (40+ years), validated speed-to-power conversion, benchmarked against real-world generation data representing current and near-future wind technologies.
Level II:
Decades of hourly data (40+ years), validated speed-to-power conversion, benchmarked against real-world generation data representing current and near-future wind technologies.
Level III:
Similar to level II but with climate trends included, uncertainty modeled, and new/future wind technology represented.
Level I:
Five years of hourly GHI data at 0.25-degree resolution, conversion to power based on generic power curves.
Level II:
Decades of 5-minute data from in-situ instrumental observations of GHI/DNI, used to generate simulated hourly mean and generation time series as well as hourly statistics of 5-minute variability, conversion to power including tracking and inverter modeling.
Level III:
Decades of 5-minute GHI/DNI from a combination of modeled and observed radiation, converted to generation using power curves and tracking algorithms particular to the modeled facility.
Wind Power
Traditionally, wind power potential production data in RA studies have been generated using standard wind speed to power transformations. Historical wind power generation and wind speed at hub height have been used to estimate generic wind power curves. As the number of wind facilities has increased significantly, the geographical resolution of the historical data has grown as well. Wind turbines are typically capable of measuring wind speed, direction, and other relevant parameters. Other site-specific parameters (e.g., wind turbine type, hub height, layout, altitude, terrain) can be incorporated in the models to improve their accuracy. However, these data are typically collected by the utilities and are often difficult to find or not publicly available.
Data sources such as The Wind Power and CorRES tools used by ENTSO-E in the European Resource Adequacy Assessment include data about installed capacity, hub height, number of turbines and turbine models. Also, an open-source database OpenEnergy platform includes turbine data (power curves, hub heights). Finally, the US Geological Survey Wind Turbine Database provides the location of all wind turbines in the United States, together with the manufacturer and capacity and farm name.
To capture weather-driven uncertainty and the impact of extreme weather events on wind power outputs, it is advisable, in systems where wind power plays a material role in determining adequacy outcomes, to model a wide range of weather years (> 40 years) and relevant weather variables such as air density (jointly set by temperature and pressure), freezing precipitation, or other weather events that might have a significant impact on power output (e.g., severe storms, tornadoes, hurricanes, lightning, and snow).
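To illustrate why air density is listed as a relevant variable, the sketch below computes dry-air density from temperature and pressure with the ideal gas law. Below rated wind speed the power available in the wind is proportional to density, so the ratio to a reference density gives a first-order scaling. The function names and the 1.225 kg/m³ reference (standard sea-level conditions) are assumptions for illustration, and humidity is ignored.

```python
R_DRY_AIR = 287.05  # specific gas constant of dry air, J/(kg K)

def air_density(temp_c, pressure_pa):
    """Dry-air density in kg/m^3 from the ideal gas law."""
    return pressure_pa / (R_DRY_AIR * (temp_c + 273.15))

def density_scaling(temp_c, pressure_pa, ref_density=1.225):
    """Ratio of actual to reference density; scales available wind power
    at a given wind speed below rated output."""
    return air_density(temp_c, pressure_pa) / ref_density

print(round(air_density(15.0, 101325.0), 3))       # standard conditions
print(round(density_scaling(-20.0, 101325.0), 3))  # cold snap at the same pressure
print(round(density_scaling(35.0, 101325.0), 3))   # heat wave at the same pressure
```

A power curve that ignores this term will understate production in cold, dense air and overstate it in hot conditions.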
Although climate change is likely to change temperature trends, it is difficult to estimate how wind speed and other variables mentioned above will be impacted. Incorporating climate projections raises questions regarding the:
confidence in predictions and levels of uncertainty,
physical understanding of what drives climate change,
transformation of daily granularity provided by climate models to hourly granularity needed by RA models, and,
prediction skill for specific variables (e.g., wind speeds) and events (e.g., hurricanes and tornadoes).
As some near-term wind installations might be modeled similarly to other recent installations despite the lack of historical data, the representation of future wind turbine technology adds additional complexity in the data collection and modeling. Some of the key areas to focus on are:
Technological innovations data (e.g., turbine types, efficiencies, capacity, resiliency, ageing)
Design of the installation (e.g., hub heights, layout)
Location of the new projects
Potential repowering of existing sites
Other relevant data that can affect the reliability of wind power installations:
Age degradation is difficult to incorporate, and omitting it might lead long-term studies to underestimate adequacy risks. Some studies have calculated annual degradation factors, for example a 1.6% annual decline in capacity factors in UK wind farms.1
Blade icing can lead to losses of up to 20%.2 Characterizing the type and duration of ice (e.g., rime ice) is also key to properly estimating power losses. Offshore wind power plants are likely to include preventive anti-icing equipment, as they are more exposed to severe climate conditions. However, older onshore wind turbines might not include such equipment, a limitation that is normally overlooked by planners.
Wake losses can be derived from historical power generation. Regional studies might indicate a range of values to use, for example a 5%-15% reduction in the UK.1 However, they might be difficult to consider in future wind installations or installations without historical data. ENTSO-E ERAA uses PyWake for such cases.
Some important gaps in currently available data include:
Weather and turbine generation data at the site level are difficult to collect or not publicly available. Most of the time they are only available at an aggregated level and may incorporate grid-directed curtailment actions without explicit labelling.
Simplified models relying on wind power curves do not accurately capture weather uncertainty and underestimate the impact of extreme weather events as only few weather variables are considered (wind speed). Open-source models such as windpowerlib offer a more complete set of wind to power transformation methods.
Outage data is typically overlooked or implicitly incorporated in the wind power generation timeseries. Explicit modelling of wind generation outages is still a gap in many RA studies.
Climate data is becoming increasingly available but still difficult to interpret and use.
Solar Power
Solar power generation relies on the conversion of solar irradiance into electricity, primarily using photovoltaic (PV) cells. There are key variables that influence solar power generation such as solar irradiance, temperature, time of the day, and the characteristics of the PV system.
Solar irradiance is typically collected at weather stations located near the installation site. There are three main components of irradiance: Direct Normal Irradiance (DNI), Diffuse Horizontal Irradiance (DHI), and Global Horizontal Irradiance (GHI). DNI is typically used to model concentrating solar power (CSP), while GHI is typically used to model PV systems, which are the focus of this section. When only GHI is available, DNI and DHI can be derived using transformation methods.1 NREL manages and updates the National Solar Radiation Database, a free and complete collection of meteorological and solar irradiance data sets for the United States and a growing list of international locations. The European Solar Radiation Atlas provides solar radiation data for Europe based on satellite measurements and ground-based observations.
Temperature also has an impact on the performance of PV systems, especially during extreme events. PV cells are less efficient during high temperatures leading to a decrease in power output of around 0.5% per degree Celsius above the standard operating temperature (25 °C). This means that during heat waves (>40 °C) the solar power output will be reduced by at least 7.5%. In addition, high temperatures can also accelerate the degradation of the PV cells. This weather-dependent degradation has generally been overlooked in RA studies.
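The heat-wave figure quoted above follows from a simple linear derating, sketched below. The function name is hypothetical, the 0.5% per degree coefficient and 25 °C reference are taken from the paragraph above, and the input is treated as the cell temperature. Because cells usually run hotter than the surrounding air, applying the formula to air temperature gives a lower bound on the loss.

```python
def pv_temperature_loss(cell_temp_c, coeff_per_c=0.005, ref_temp_c=25.0):
    """Fractional power loss from operating above the reference temperature.

    Assumes a linear temperature coefficient and no gain below the reference.
    """
    return max(0.0, coeff_per_c * (cell_temp_c - ref_temp_c))

for temp_c in (25.0, 40.0, 60.0):
    print(temp_c, f"{pv_temperature_loss(temp_c):.1%}")
```

At 40 °C the loss is (40 - 25) x 0.5% = 7.5%, matching the text; a cell at 60 °C would lose 17.5% under the same assumption.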
PV system characteristics have a significant impact on solar power generation. Mounting systems determine the horizontal orientation and tilt of the PV panels. There are two main types:
Fixed mounting: More frequently used in rooftop PV. The tilt angle of panels on flat surfaces can be optimized for the latitude where the installation is located, typically tilted towards the equator at an angle a few degrees smaller than the latitude, subject to constraints for wind resistance. Panels on tilted surfaces are typically mounted flush to the surface.
Trackers: More frequently used in utility scale PV. Orientation and tilt angle are adjusted throughout the day and year to optimize the power output based on the solar position. Tracking systems must avoid shading by adjacent panels at low sun angles, so they typically do not follow the sun at low angles but instead revert towards a flat setting at sunset and sunrise. This occurs at a sun angle that depends on the spacing of rows of panels as well as on the location on the horizon of sunset and sunrise. Single-axis trackers usually have panels arranged in North-South oriented lines that tilt from East to West. Dual-axis trackers place an array of panels on a post and orient them so that the sun’s rays strike them at a right angle for much of the day. The presence of trackers introduces the potential for mis-operation of the trackers and lower yield than would otherwise be assumed. This effect may be material at the plant level but is unlikely to be material at a regional aggregation level.
Module and inverter parameters determine how power from the panels is delivered to the grid. The inverter is responsible for converting the DC power generated by the PV module into AC power. Typically, detailed information about the type of inverter is not publicly available. ENTSO-E’s European Resource Adequacy Assessment uses Canadian Solar modules and ABB inverters as a reference. The CEC module database, Sandia Module database, CEC Inverter database, and Anton Driesse Inverter database can offer a wider selection of possible modules and inverters. Their technical specifications may be incorporated in the synthesis of production time series using tools such as PVLib. Two main design parameters have a direct impact on the solar power output:
Inverter efficiency (~96%)
DC/AC ratio. This typically ranges from 1.15 to 1.3.2 Higher ratios result in less variation in power delivery, but lower ratios use a higher fraction of the power generated by the panels.
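The trade-off between the two DC/AC ratios can be sketched with a simple clipping model: DC output is scaled by a constant inverter efficiency and capped at the inverter AC rating. The seven-step daily profile, the 100 MW AC rating, and the function name are hypothetical, and real inverter efficiency varies with loading.

```python
def clipping_summary(dc_ac_ratio, dc_profile, ac_rating_mw=100.0, inverter_eff=0.96):
    """Return (delivered AC series in MW, fraction of converted energy clipped)."""
    dc_capacity_mw = dc_ac_ratio * ac_rating_mw
    unclipped = [f * dc_capacity_mw * inverter_eff for f in dc_profile]
    delivered = [min(p, ac_rating_mw) for p in unclipped]
    clipped_fraction = 1.0 - sum(delivered) / sum(unclipped)
    return delivered, clipped_fraction

# Hypothetical clear-day DC output as a fraction of DC nameplate capacity.
profile = [0.2, 0.5, 0.8, 1.0, 0.8, 0.5, 0.2]

for ratio in (1.15, 1.3):
    delivered, clipped = clipping_summary(ratio, profile)
    print(ratio, round(sum(delivered), 1), f"{clipped:.1%}")
```

With these assumptions the higher ratio delivers more energy and a flatter midday profile for the same inverter, while the lower ratio discards a smaller share of what the panels produce.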
Like wind power, solar generation can be modelled using historical meteorological data for times before the construction of the solar facility. Statistical models can be used to model historical solar power generation using key variables such as solar irradiance, temperature, and time of the day. However, it may not be appropriate to assume that a single profile may be scaled when a significant addition of capacity occurs, as the increased geographical diversity will change the area-specific model. In addition, operational units are likely to have limited historical data due to the rapid growth in solar capacity in recent years. Insufficiently long data records can lead to inaccurate models that underestimate the adequacy risk by not correctly accounting for low-probability, high-impact events.
On the other hand, physical models can incorporate key PV system characteristics along with site-specific weather conditions such as solar irradiance and temperature from the closest weather stations. These models can provide more accurate estimates of solar power generation. PVLib, an open-source library for Python, uses physical models to estimate solar power generation. It includes a complete set of functions that allow the user to model different types of PV modules and inverters and provides tools for simulating the performance of PV systems under different conditions, such as changes in weather or system design. This allows extreme weather events to be captured and weather effects to be modeled, such as snow cover, reduced soiling losses due to precipitation, and reduced performance due to extreme heat. However, physical models can be more computationally expensive and require detailed information about the installation that might not always be available.
Other factors that can affect the reliability of solar power installations are:
Age degradation is difficult to incorporate, and omitting it might lead long-term studies to underestimate adequacy risks. Some studies have calculated annual degradation factors (~2%-0.7%/yr). However, technology innovation is likely to reduce the impact of degradation for future installations.
Snow cover can lead to significant losses by blocking all the solar irradiance reaching the panel. Characterizing the melting process or collecting detailed information on removal mechanisms (e.g., tilting) is key to properly estimating power losses during critical periods for the power system.3, 4
Environmental losses such as soiling and shading can have a significant impact on the performance of the panels. To mitigate the effects of soiling, cleaning tasks need to be planned according to local conditions (e.g., pollen accumulation during the spring season) and the observed performance losses. Soiling losses, and the effect of precipitation in reducing them, can be modeled with physical models such as the Kimber PVLib function.5 In addition, nearby vegetation can also produce shading losses, especially when it has grown enough to block the sun’s rays.
Some important gaps in available data include:
Simple statistical models relying on historical data do not accurately capture weather uncertainty and underestimate the impact of extreme weather events, as only a few weather variables are considered. Physical models are typically better suited to RA studies because they can capture changes in climate conditions and the PV system characteristics.
Historical production data may incorporate the impact of grid directed curtailment. This may interfere with the training of models to project future production.
PV system characteristics are difficult to collect or are not publicly available. There is a need for publicly accessible information with detailed information about current and projected solar installation locations.
Outage data is typically overlooked or implicitly incorporated in the solar power generation time series. The NERC GADS database for solar installations is now available but can be difficult to use. Explicit modelling of solar outages is still a gap in many RA studies.
Climate data is becoming increasingly available but still difficult to interpret/use.
Other Sources of Data for Variable Renewable Energy Resources
Variable/Data Type | Source Description | Web Address |
Solar Facility Information | Technology (thin film vs. silicon), axis properties (fixed vs tilting, etc) DC Capacity, Inverter (AC) Capacity, Location (county), installation date. | Form EIA-860 detailed data with previous form data (EIA-860A/860B) |
Solar Facility Information | Location (exact), capacity, technology information on US solar farms. Membership required. | |
Wind Facility Information | Turbine location, manufacturer, capacity, installation date. |
BATTERY ENERGY STORAGE AND HYBRID POWER PLANT DATA
Battery Energy Storage – Data Options
Level I:
Rated power capacity (MW), energy capacity (MWh), state of charge, round-trip efficiency, and operating mode (e.g., only dispatch during shortfall events, charge/discharge when system energy cost is low/high or assign fixed dispatch) from available historical data.
Level II:
Weather dependent rated power capacity (MW), energy capacity (MWh), state of charge, round-trip efficiency, operating mode, and cycle life/lifetime from 15+ years historical data, coupled with experts/manufacturers' advice on future advancements.
Level III:
Weather dependent rated power capacity (MW), energy capacity (MWh), state of charge, round-trip efficiency, operating mode, cycle life/lifetime, and outages from 30+ years historical data, coupled with experts/manufacturers' advice on future advancements and predictive modeling of climate projections.
A standalone battery storage system requires three basic data items to characterize its operation: rated power capacity, energy capacity, and roundtrip efficiency. These are defined as follows:1
Rated power capacity is the total possible instantaneous discharge capability or the maximum discharge rate that a battery energy storage system can achieve starting from a fully charged state. Its unit of measurement is kilowatts (kW) or megawatts (MW).
Energy capacity is the maximum amount of stored energy of battery energy storage. It is measured in kilowatt-hours (kWh) or megawatt-hours (MWh). Dividing the total energy capacity by the rated power capacity provides information on the duration of the storage in hours (h).
Roundtrip efficiency refers to the ratio of the energy discharged from the battery to the energy charged to the battery. It can represent the battery energy storage total DC-DC or AC-AC efficiency, including losses from self-discharge and other electrical losses. DC-DC efficiency is often used by battery manufacturers. On the other hand, AC-AC efficiency is often more useful to utilities because they only see battery charging and discharging from the point of interconnection to the power system.
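The three basic quantities can be collected in a small data structure, as in the sketch below. It takes roundtrip efficiency as energy delivered on discharge divided by energy drawn on charge, measured AC-AC at the point of interconnection, and treats energy capacity as deliverable energy. The class name, field names, and the 100 MW / 400 MWh / 85% values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BatterySpec:
    power_mw: float     # rated power capacity
    energy_mwh: float   # energy capacity, assumed deliverable at the interconnection
    rte: float          # AC-AC roundtrip efficiency (energy out / energy in)

    @property
    def duration_h(self) -> float:
        """Storage duration: energy capacity divided by rated power."""
        return self.energy_mwh / self.power_mw

    def grid_energy_to_deliver(self, delivered_mwh: float) -> float:
        """Grid energy that must be drawn on charge to later deliver delivered_mwh."""
        return delivered_mwh / self.rte

spec = BatterySpec(power_mw=100.0, energy_mwh=400.0, rte=0.85)
print(spec.duration_h)
print(round(spec.grid_energy_to_deliver(spec.energy_mwh), 1))
```

The second figure shows why the efficiency matters for energy adequacy: the system must supply more energy during charging hours than the battery returns during shortfall hours.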
A hybrid power plant requires additional data, including a constraint characterizing how the battery charges/discharges when solar or wind energy is available/unavailable. Conditional data can improve the simulation of the operation of battery energy storage and hybrid power plants. Some of them may have limits under the battery warranty. They are listed as follows:
Lifetime or end of life: the condition at which the battery can no longer provide a minimum percentage of its rated power capacity or energy capacity.2
Cycle life/lifetime: the amount of time or cycles a battery can provide regular charging and discharging before failure or significant degradation. 1
Battery degradation: the reduction in rated power capacity and energy capacity of a battery as it ages with time and use. 3
State of charge (SOC): the battery’s present charge level, which ranges from completely discharged to fully charged. This includes initial and end of day target SOC.1
Depth of discharge (DOD): the energy discharged as a percentage of the energy capacity of the battery. This includes minimum and maximum DOD.2
Outages: number of partial and full unplanned outages of a battery in a year. Outages can vary depending on the age of the battery.1
Assumed costs: variable operations and maintenance (VOM) costs associated with battery’s operation.2
Coupling: refers to whether the battery is AC or DC coupled in a hybrid system.
Aggregated battery storage configuration: the detailed representation of an aggregated battery storage with charge/discharge characteristics of individual batteries, which is typically simplified.
Battery charging restriction: any restrictions that may have implications for the battery operation (e.g., the battery in a hybrid system may not be permitted to charge from the grid).1
Applications: the ways the battery is assumed to operate (e.g., arbitrage, capacity, or energy adequacy only, fixed dispatch, etc.).1
Temperature: used for developing battery temperature (or weather dependent) models, which have not yet been commonly included in RA studies. Charge/discharge currents generate heat and can lead to higher battery temperatures, so efficiently enforcing temperature limits could significantly reduce battery degradation.3
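Several of the conditional items above (SOC, depth-of-discharge limits, power rating, and efficiency) interact in even the simplest chronological simulation. The sketch below advances the state of charge by one time step and limits the request to what the power rating and SOC bounds allow. Splitting the roundtrip efficiency into equal one-way efficiencies, the 10% minimum SOC (a 90% maximum DOD), and all names are assumptions made for illustration.

```python
import math

def step_soc(soc_mwh, request_mw, dt_h, p_max_mw, e_max_mwh, rte,
             min_soc_frac=0.1, max_soc_frac=1.0):
    """Advance stored energy by one step.

    request_mw > 0 is discharge to the grid, request_mw < 0 is charge from it.
    Returns (new stored energy in MWh, power actually exchanged in MW).
    """
    eta = math.sqrt(rte)  # assumed symmetric one-way efficiency
    power = max(-p_max_mw, min(p_max_mw, request_mw))
    if power >= 0.0:
        usable = max(0.0, soc_mwh - min_soc_frac * e_max_mwh)
        power = min(power, usable * eta / dt_h)
        soc_mwh -= power * dt_h / eta
    else:
        room = max(0.0, max_soc_frac * e_max_mwh - soc_mwh)
        power = max(power, -room / (eta * dt_h))
        soc_mwh += -power * dt_h * eta
    return soc_mwh, power

soc, delivered = 400.0, []
for _ in range(5):  # request full output for five consecutive hours
    soc, p = step_soc(soc, 100.0, 1.0, p_max_mw=100.0, e_max_mwh=400.0, rte=0.81)
    delivered.append(round(p, 3))
print(delivered, round(soc, 3))
```

In this example a nominal 4-hour battery sustains full output for only three hours once the minimum SOC and discharge losses are enforced, which is the kind of effect a Level I representation can miss.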
The table below shows selected publicly available repositories of data on battery storage systems and their function. While some of the data needed for forward-looking studies are publicly available, important areas are still only poorly characterized, for example the behavior of aggregations of large numbers of combined rooftop PV and storage systems. Given the pace of development of battery storage technology, it is not clear whether historical data will be helpful in developing predictive models of future technology. Additional input from battery technology experts and greater transparency from manufacturers will be helpful in developing skillful simulations of future battery behavior.
Key Data Sources for Battery Storage
Variable/Data Type | Source Description | Web Address |
Grid storage technical data | EPRI Data Repository | |
Grid storage technology cost and performance | PNNL | 2022 Grid Energy Storage Technology Cost and Performance Assessment |
Battery Data Genome: planned comprehensive archive of battery science information | Argonne National Lab | Envisioning the Battery Data Genome, a central data hub for battery innovation |
Global electrical storage project database | DOE/Sandia National Lab | |
DER Database for Australia, aggregate installation data | AEMO |
HYDROELECTRIC DATA
Hydroelectric Power – Data Options
Level I:
Typical aggregate hydrogeneration hourly generation profiles by month, accounting for maintenance periods.
Level II:
Multiple years of hourly generation profiles for each hydrogeneration facility, together with hydrological data that allows modeling of generation constraints due to water availability and environmental requirements.
Level III:
Projections of hydrological constraints due to climate variability and change.
Current hydro modeling is well established and more sophisticated than most other resources, since it has always involved stored energy, uncertain future hydro conditions (primary fuel), complex watershed interactions, ecologic and recreation constraints, as well as traditional, mechanical characteristics of plant operations.
The table below describes these characteristics in broad terms that motivate why the data is needed and what data may be unique to hydro plants or shared with others. Each of these characteristics of hydroelectric generation modeling has important data requirements.
Characteristic | Implication |
Stored Energy | Need for proxy estimate of future power system conditions to balance storage against future needs, up to two years ahead. Pumped hydro technology has characteristics like other energy storage resources, but may be more complex to operate, depending on the physical design. |
Uncertain Future Hydro Conditions | Need to balance expected hydro conditions with risks of deviations up to two years ahead. |
Complex Watershed Interactions | Need to couple operating decisions among many hydro plants according to their downstream dependencies and the presence or absence of reservoir storage. |
Ecological and Recreational Constraints | Operating constraints for minimum flows, maximum ramps, and reservoir levels arise seasonally and as conditions warrant. |
Mechanical Characteristics | These are the same types of data seen in most power system resources, like variable costs, operating ranges for energy production and ramping, and response times. |
Stored Energy
Hydro plants rely on stored energy for reservoir systems, including pumped hydro. Run-of-river plants typically have very little stored energy and operate without this consideration. When operating a stored energy plant, it is essential to consider future system conditions, because the use or storage of energy at any moment must be compared to saving it for later.
The key information requirement is the future value of energy (FVOE), which is a system-level condition, shared by all stored energy resources, because they are all linked by the power network. It appears as a cost for using stored energy in the last period of an operating horizon within an RA calculation. Without this cost at the time boundary, a model will feel free to use all the energy, leaving none left for the next time period beyond the horizon.
Because a hydro plant can store energy for a year or more, it comes to dominate the FVOE calculation over future years. For instance, some plant operators consider system scenarios up to three years ahead when computing the FVOE for the next day.
The nature of the FVOE depends on the RA context of a hydro plant. Alternative scenarios for evaluating RA may require separate FVOE calculations for each one, because future system conditions may depend on current conditions. For instance, a scenario with two years of drought may increase the FVOE going forward.
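The role of the FVOE as a boundary condition can be illustrated with a deliberately simple sketch: a reservoir with no inflow releases energy only in hours whose marginal value exceeds the FVOE, starting with the most valuable hours. The hourly values, the storage and power limits, and the function name are hypothetical, and hourly steps are assumed so that MW and MWh are interchangeable per step. Production models solve this jointly with inflows, head effects, and network constraints.

```python
def dispatch_with_terminal_value(hourly_value, stored_mwh, p_max_mw, fvoe):
    """Greedy release of stored energy over hourly steps.

    Energy is released in descending order of hourly value, but only while
    the hourly value exceeds the future value of energy (fvoe).
    Returns (release per hour in MWh, energy left at the end of the horizon).
    """
    release = [0.0] * len(hourly_value)
    remaining = stored_mwh
    ranked = sorted(range(len(hourly_value)), key=lambda h: hourly_value[h], reverse=True)
    for h in ranked:
        if hourly_value[h] <= fvoe or remaining <= 0.0:
            break
        release[h] = min(p_max_mw, remaining)
        remaining -= release[h]
    return release, remaining

values = [30.0, 80.0, 45.0, 120.0, 60.0]  # hypothetical marginal value of energy by hour
print(dispatch_with_terminal_value(values, 300.0, 100.0, fvoe=0.0))
print(dispatch_with_terminal_value(values, 300.0, 100.0, fvoe=70.0))
```

With a zero terminal value the reservoir is emptied by the end of the horizon; with an FVOE of 70 one third of the energy is carried forward, which is the behavior the boundary cost is meant to produce.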
Uncertain Future Hydro Conditions
Hydro conditions are uncertain beyond short-term weather forecasts, and in an RA context, many scenarios are simulated either in a probabilistic way, using stochastic programming, or in a Monte Carlo or scenario-based method having repeated calculations for each condition. This requires historical conditions for each plant’s location, a statistical analysis about tail events, and potential long-term trends.
Complex Watershed Interactions
Watersheds form tree networks wherein water flows from the leaf nodes to the root node. Flows down the watershed are a combination of natural phenomena and hydro plant controls, with the result that the controlled decisions of upstream plants affect the decision making of downstream ones.
Watershed data includes the topology, natural flows into the watershed, and flow characteristics between nodes, such as flow limits, transit times, relationships between water temperature and flow rates, and temperature limits. Temperature modeling may be held externally to an RA analysis, being incorporated into weather- or time-dependent flow limits.
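A minimal sketch of how topology and transit times enter a model is shown below: each plant's release arrives at its downstream neighbor after a fixed delay and is added to the natural inflow there. The three-node watershed, the delays, and the flow values are hypothetical, and flow limits and temperature effects are omitted.

```python
def route_flows(releases, natural_inflow, downstream, delay_steps):
    """Total inflow per node and time step for a tree-shaped watershed.

    releases[node][t]       controlled release from an upstream plant
    natural_inflow[node][t] uncontrolled inflow at each node
    downstream[node]        next node downstream, or None at the root
    delay_steps[node]       transit time (in steps) to the downstream node
    """
    n_steps = len(next(iter(natural_inflow.values())))
    inflow = {node: list(series) for node, series in natural_inflow.items()}
    for up, down in downstream.items():
        if down is None:
            continue
        for t in range(n_steps):
            arrival = t + delay_steps[up]
            if arrival < n_steps:
                inflow[down][arrival] += releases[up][t]
    return inflow

downstream = {"A": "C", "B": "C", "C": None}
delay_steps = {"A": 1, "B": 2}
natural = {"A": [20, 20, 20, 20], "B": [8, 8, 8, 8], "C": [10, 10, 10, 10]}
releases = {"A": [5, 5, 5, 5], "B": [3, 3, 3, 3]}
print(route_flows(releases, natural, downstream, delay_steps)["C"])
```

The downstream plant sees upstream decisions only after the transit delay, so its feasible operations in any hour depend on choices made upstream in earlier hours.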
Ecological and Recreational Conditions
Ecological data describes the impact of hydro plant operations on the watershed biome, especially aquatic creatures. This data takes the form of flow limits and ramp limits. Recreation data describes human use of the watershed for swimming, boating, fishing, etc. It affects reservoir levels, and limits flow rates and ramping.
Mechanical Characteristics
One of the unique mechanical characteristics of hydro plants is the head height, the vertical distance between the reservoir level and the outlet below the hydro turbine generator. This distance is naturally a function of the quantity of water flowing through the turbine over time. As the reservoir level decreases, less and less potential energy can be derived from the same rate of flow.
Water storage is characteristic of hydro reservoirs, while run-of-river plants have limited to no storage. The reservoir height indicates the amount of water stored in the reservoir in a non-linear fashion that depends on the shape of the reservoir. The reservoir height is also subject to minimum and maximum limits. When the maximum is exceeded, the water is spilled, meaning that it flows over the top of the dam, down the spillway, and the potential energy is lost.
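The storage limits and spill behavior described above amount to a mass balance per time step, sketched below in terms of stored volume rather than reservoir height (the conversion between the two depends on the reservoir shape and is not modeled here). The function name, units, and numbers are hypothetical.

```python
def update_storage(storage, inflow, release, s_min, s_max):
    """One-step reservoir mass balance in consistent volume units.

    The requested release is reduced if it would draw storage below s_min;
    any volume above s_max is spilled and its potential energy is lost.
    Returns (new storage, actual release, spill).
    """
    release = min(release, max(0.0, storage + inflow - s_min))
    new_storage = storage + inflow - release
    spill = max(0.0, new_storage - s_max)
    return new_storage - spill, release, spill

print(update_storage(950.0, 120.0, 40.0, s_min=200.0, s_max=1000.0))  # high inflow
print(update_storage(210.0, 5.0, 40.0, s_min=200.0, s_max=1000.0))    # near minimum
```

The first call spills because turbine release cannot keep up with inflow, and the second curtails the release to protect the minimum level.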
Other characteristics are the typical energy production limits, generator control characteristics for real and reactive power, ramp rates, and efficiency curves that relate water flow and head height to electricity production, just as power curves relate wind generation to wind speed and air density.
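The dependence of output on flow and head can be sketched with the standard hydraulic power relation, power = efficiency x water density x g x flow x head. A single constant efficiency stands in here for the plant's efficiency curve, which is an assumption; the function name and example values are also hypothetical.

```python
RHO_WATER = 1000.0  # kg/m^3
G = 9.81            # m/s^2

def hydro_power_mw(flow_m3s, head_m, efficiency=0.9):
    """Electrical output in MW for a given turbine flow and net head."""
    return efficiency * RHO_WATER * G * flow_m3s * head_m / 1e6

print(round(hydro_power_mw(100.0, 50.0), 3))  # full reservoir
print(round(hydro_power_mw(100.0, 40.0), 3))  # same flow after drawdown
```

The same relation can be inverted to translate a minimum-flow requirement into a minimum generation level at a given head.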
Pumped hydro data additionally describes the pumping mechanism. Often the pumping and generating functions are interdependent, requiring a delay time between their operations, but some installations allow simultaneous pumping and generation, offering more seamless and flexible operations. These types of dependencies require modeling and data to support their representation.
Existing Sources of Data
Because hydro plants are custom designed for their location, the data specific to a plant must be obtained from the owner/operator or from a system model for the specific region. The following sources are relevant for the US northwest, Europe and Australia.
Relevant Data Sources for the US Northwest, Europe, and Australia
- NWPCC, GENESYS web page
- ENTSO-E, “Hydropower modelling – New database complementing PECD,” ENTSO-E, September 2019
- Iotti, G., "Hydropower Modelling in Mid-Term Adequacy Forecasts: the peculiar case of Austria," Master's dissertation, School of Industrial and Information Engineering, Politecnico di Milano, Milan, Italy, 2020.
- AEMO, “Market Modelling Methodologies,” July 2020.
- Blom, E., “Including Hydropower in Large Scale Power System Models,” School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Stockholm, Sweden, 2019.
Open Data Gaps
Some regions are experiencing extreme hydro events, like extended droughts and high precipitation. Formats and statistics for representing these events are needed as input to existing models.
Hydro reservoir interdependence is encapsulated by watershed networks and their characteristics. Representations of the interdependencies may be required in regions adding hydro resources and accuracy may become more important, incorporating time-dependent and environmental contexts.
Ownership of hydro resources may be joint between several partners with combinations of joint and independent interests over time. Data is needed to describe the character and changes of owner objectives over time.
Environmental constraints on water flow may include minimum values and maximum ramp rates, which characterize limits on daily water-flow variations. Such water flow limits typically need to be converted into electric power limits that depend on plant operating conditions and may include additional information about the dynamic flow characteristics of a watershed.
With the changing resource mix, hydro resources may become more competitive providing contingency and operating reserves, which would necessitate additional data about how flexible combinations of energy and reserves can be provided over time.
Hydro resources may couple their operations with other energy storage resources, like demand response and battery storage, to increase their ability to respond flexibly. Improving the operation of a collection of storage resources may require added emphasis on their relative strengths and weaknesses, like size, response times, and operating ranges, in relation to evolving net load characteristics.
Key Data Sources for Hydropower
Stored Energy | Future Value of Energy (FVOE) | Statistical study of future hydro operations, throughout and beyond the RA horizon. |
Future Hydro Conditions | Future projections of precipitation and temperatures and/or water flows, based on historical patterns. | |
Watershed Interactions | Topology Flow Characteristics | |
Ecologic and Recreation Conditions | Stream Flow Monitoring & Assessment | Environmental Flow Requirements from FERC Licenses across the US |
Mechanical Characteristics | Generator parameters | See section Supply Resource Data |
CUSTOMER SIDE DATA
Customer Side – Data Options
Level I:
Historical annual or monthly peak demand; ignores DER and electrification; no extreme event modeling.
Level II:
Hourly load profiles for 10+ years; includes DER impact and scenario-based electrification adjustments; stress-test scenarios for extreme weather.
Level III:
Full dynamic modeling: unit-specific ratings, weather-dependent derates, energy/fuel constraints, flexibility parameters, and predictive modeling for extreme conditions.
The simplest approach to generating estimated future demand time series is to simply draw from historical demand data. This may work well in some regions that have little change in economics and climate over time. Other regions with significant changes may already account for them with traditional statistical methods using historical data. However, with rapid change in demand technologies and underlying usage and conditions, historical data may be less useful and forward-looking models based on various trend analyses and physical models are required. These forward-looking analyses require extra off-line analyses and provide additional information that must be integrated into an RA analysis in new and complex ways. The following sections delve into various specific data needs for demand forecasting, without making recommendations as to the process of demand forecasting itself.
Historical Load Data
Historical demand is often readily available to utilities in detail through monthly customer meter readings and time series meter readings at substations. Substation readings may have high regional and temporal resolution, sufficient for analyzing important factors like economics and the weather and for projecting future load, given future economic and weather projections. These traditional techniques are well known in industry and have served well in most cases.
However, historical behavior does not always account well for future demand when end-use technology changes, for example due to adoption of LED lighting or electric vehicles. These adoption rates can sometimes be very rapid, introducing significant errors even in forecasts less than one year ahead. As such, end-use adoption models become critical for fully representing changes in demand over time, within the horizon of an RA analysis.
As an example of the types of adoption models that are incorporated in advanced economic modeling, consider the end-use technologies included in the US REGEN model.1 Each of these end-uses uses one of several models, each of which is based on econometric analysis, the parameters from which constitute the data used in US REGEN. From scenarios run on a model like US REGEN, synchronized to the RA scenarios, output comes in the form of operational parameters that represent them as simple demand or supply levels (uncontrolled), price responsive levels, or trigger sensitive levels, with response times and perhaps other, more sophisticated parameters for availability, degradation, and rebound behavior.
Taking cues from a higher-level econometric analysis, like US REGEN, means that sophisticated factors, like degradation, replacements, and upgrades, may not need to be included in the RA analysis itself if they are already accounted for in the higher-level model.
Behind-the-Meter DER Adoption and Net Load
Behind-the-meter (BTM) distributed energy resource (DER) adoption may be reported alongside periodic updates to regional load and supply resources, with rooftop PV (either domestic or commercial) and battery storage growing rapidly. For RA analysis, due to their high potential number, these resources may appear as aggregations of generic generation and storage at the regional level or be netted from demand. Data availability for behind-the-meter generation varies strongly with region and governance. Some grid operators require solar installers to provide access to inverter-level generation numbers, while others do not. When no specific generation data is available from public sources, several alternatives may be available. Inverter vendors will often sell aggregate generation data at the zip-code level, and if installed capacity numbers are available, subsampled generation data from the inverter vendor can be scaled to estimate total generation. If no such data is available (or available at a low enough cost), generation can be estimated.
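The scaling of subsampled vendor data by installed capacity can be sketched as follows. The function name and the simple linear scaling are illustrative assumptions; the approach presumes the sampled systems are representative of the wider fleet in orientation, shading, and location.

```python
def scale_sampled_generation(sample_mw, sample_capacity_mw, total_capacity_mw):
    """Scale a sampled generation time series to the total installed capacity.

    Assumes the sample is representative, so output scales linearly with capacity.
    """
    if sample_capacity_mw <= 0:
        raise ValueError("sample capacity must be positive")
    scale = total_capacity_mw / sample_capacity_mw
    return [g * scale for g in sample_mw]
```

For instance, a 5 MW sample producing 2 MW in a given hour would imply 20 MW from a 50 MW regional fleet under this assumption.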
As DER uptake increases, RA modeling must include not only demand, but net load values for represented regions and time frames. A region may require extensive detail, even when the RA model uses a copper sheet to represent transmission, because it may have many contrasting climate zones, economic zones, and other characteristics that affect the type and nature of end-uses and distributed resources. The combination of these two reveals the net load, for which system-level RA is responsible.
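A minimal sketch of the net load calculation, assuming hourly gross demand and behind-the-meter generation series that are already aligned in time (the function names are hypothetical):

```python
def net_load(demand_mw, btm_generation_mw):
    """Hourly net load: gross demand minus behind-the-meter generation."""
    if len(demand_mw) != len(btm_generation_mw):
        raise ValueError("time series must have the same length")
    return [d - g for d, g in zip(demand_mw, btm_generation_mw)]


def peak_hour(series_mw):
    """Index of the first hour at which the series reaches its maximum."""
    return max(range(len(series_mw)), key=lambda i: series_mw[i])
```

Comparing `peak_hour` for gross demand and for net load is a quick way to see whether DER output shifts the timing of the peak that system-level RA must cover.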
The time scales used in RA analysis govern the resolution in time for the required end-use and DER data. Some systems can rely simply on peak loads, having little DER adoption and demand response, in which case an analysis may focus on Summer and/or Winter peak values. Other systems are highly integrated along their supply chains, have numerous simultaneous changes occurring for different reasons, and are operated with maximal flexibility to adapt to rapidly changing conditions.
Climate Trends and Extreme Events
Climate trends may be evident within the RA analysis timeframe, in which case, a separate trend analysis is conducted to understand its important factors, change coefficients, and the way different climate aspects are affected on average and in the extremes. Clearly, climate affects demand across multiple end-uses, like heating, cooling, lighting, and transportation, which may be sensitive enough to include within the RA analysis. For example, when future weather years are generated, that information may feed into a factor model of end-use net loads. Data sources and recommendations for input data remain the same as those established in the Weather Data Section.
Extreme events may be covered in bookend scenarios or in probabilistic sampling, especially when representing tail events. More and more, details about the reasons for RA shortfalls become interesting so decision makers can be informed of system weaknesses caused by different types of extremes (heat, cold, wind, recessions). The process to develop a demand projection for a specific type of extreme event is specific to that type of event. For example, demand projections for extreme cold snaps, such as Winter Storm Uri in Texas in 2021, will differ considerably from those for heatwaves.
Recent ex post analyses2 of load losses due to extreme cold have revealed several sources of errors in short-term forecasts, like end-use heat pump adoption and its non-linear behavior in extreme cold. In the case of a cold snap, explicit modelling of the secondary heating source in heat pump heating becomes critical to the estimation of the unserved energy, whereas the same does not apply during heat waves, when air conditioning performance is more material. The operation of industrial facilities during such events is a similar source of uncertainty. Natural gas is used in industry for major processes, but its use for heating typically has priority, which means that high heating loads from natural gas can cause industrial loads to shut down. This is a coupling between natural gas demand and electricity demand that may need to be represented in an RA analysis specific to cold extremes or gas supply shocks.
Demand forecasts developed to project demand across all periods trade off accuracy in specific conditions for greater accuracy over all conditions. Short-term net load forecasting is typically quite accurate, which means that economic and weather forecasts are likely accurate in the short term. However, their limitations should be borne in mind by practitioners when conducting specific stress tests. The data needed is therefore system and case specific and may require extensive effort to consolidate. A substantial gap exists around this issue.
Forecast Error
In the long-term, over years and decades, all these forecasts are known to be inaccurate, and the technique for handling them is to conduct ex post analyses on demand forecasts to understand their statistical weaknesses and to further account for systemic errors when projecting future net loads. These error statistics may enter the RA analysis as part of a risk assessment of the confidence with which targets for RA metrics are satisfied.
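An ex post analysis of this kind can be sketched with a few summary statistics. The choice of mean bias and mean absolute percentage error, and the simple multiplicative bias correction, are illustrative assumptions rather than a prescribed method.

```python
from statistics import mean


def forecast_error_stats(forecast_mw, actual_mw):
    """Summarize historical forecast errors (actual minus forecast)."""
    errors = [a - f for f, a in zip(forecast_mw, actual_mw)]
    relative = [e / f for e, f in zip(errors, forecast_mw)]
    return {
        "bias_mw": mean(errors),                    # systematic over/under-forecast
        "relative_bias": mean(relative),            # bias as a fraction of forecast
        "mape": mean(abs(r) for r in relative),     # mean absolute percentage error
    }


def bias_corrected(forecast_mw, relative_bias):
    """Apply a multiplicative correction for the systematic error found ex post."""
    return forecast_mw * (1.0 + relative_bias)
```

The resulting error statistics could then feed a risk assessment, for example by widening the demand distribution sampled in the RA analysis.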
The Role of Electrification in Demand Forecasts
Accounting for behavioral and policy-driven changes in adequacy assessments is both critical to their meaningfulness and difficult to complete with certainty. As is widely acknowledged, electrification of new load classes is likely to occur soon, both as a response to the need to decarbonize and in response to changing technology and efficiency. Projection of how that load may change is the domain of energy-economic models that typically base their decisions on a greatly reduced set of representative periods. Resource adequacy assessment requires multitudes of hourly demand profiles as an input. Therefore, a comprehensive process to derive the future demand considering the impact of electrification explicitly will first generate a scenario outcome for the level of electrification of a given sector and then use that information to synthesize multiple weather years of data for a given technology buildout.
This approach was leveraged in two case studies: the Southwest Power Pool and the Northeast studies. The following figure shows the range in average hourly demand before and after the effect of electrification is considered using EPRI’s REGEN tool, as outlined earlier. The outcome demonstrates that, in summer, demand peaks earlier in the day, coinciding with solar generation, and nighttime demand levels are lower.
Source: EPRI Resource Adequacy for a Decarbonized Future Case Study: Southwest Power Pool (3002027836)
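The two-step process described in this section (fix an electrification level from a scenario, then synthesize hourly demand for each weather year) can be sketched as below. The linear heating response, the balance temperature, and all parameter values are hypothetical; they stand in for the detailed end-use models a tool such as REGEN would supply.

```python
def electrified_demand(base_mw, temps_c, heat_share,
                       mw_per_degree=20.0, balance_c=15.0):
    """Add temperature-driven electric heating load to a base demand profile.

    heat_share: fraction of heating electrified in the scenario (0 to 1).
    mw_per_degree and balance_c are illustrative assumptions.
    """
    return [
        b + heat_share * mw_per_degree * max(balance_c - t, 0.0)
        for b, t in zip(base_mw, temps_c)
    ]


def synthesize_weather_years(base_mw, weather_years, heat_share):
    """Build one demand profile per weather year for a given electrification level."""
    return {
        year: electrified_demand(base_mw, temps, heat_share)
        for year, temps in weather_years.items()
    }
```

Each weather year then yields its own demand trace for the same technology buildout, which is the form of input a probabilistic RA model needs.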
Data Sources on Demand Modeling
The following documents have useful information on how a wide range of grid operators and utilities have handled modeling of future electrical demand:
EIA, “Electricity Data Browser,” Energy Information Administration
ENTSO-E, European Resource Adequacy Assessment (ERAA), ENTSO-E
Duke Energy Indiana, "Duke Energy Indiana Integrated Resource Plan," Charlotte, NC, 2021.
AEMO, “Forecasting Approach - Electricity Demand Forecasting Methodology,” Sydney, May 2021.
ISO-NE, “Long-Term Load Forecast Methodology Overview,” Holyoke, MA, September 2019.
Extra Efforts on Demand Modeling
There are some notable areas where cutting edge RA modeling requires more data for demand modeling than has been needed in the past:
Price-Responsive Demand: Traditional models link demand to price, but emerging market structures (e.g., incentives for renewable production/consumption, rooftop PV with batteries) require capturing complex, non-linear, and weather-dependent behaviors, such as smart thermostat adjustments during extreme temperatures.
Weather-Dependent Loads: EV charging and other new technologies introduce weather-sensitive patterns (temperature, road conditions). Coordinated EV charging will be critical as EVs become major loads and resources. Advanced Metering Infrastructure data and Europe’s Generation and Load Data Provision methodology aim to improve load characterization and forecasting.
DER Aggregations: In the U.S., FERC Order 2222 mandates DER participation in wholesale markets. Accurate baselines and flexibility modeling for aggregated DERs are essential for capacity and operational planning.
Extreme Weather Forecasting: Stress-testing RA now includes scenarios beyond historical observations. Improved accuracy may require additional weather variables (solar irradiance, dew point, wind speed) to capture impacts on load and behind-the-meter solar.
Industrial Electrification: Large industrial loads need better modeling under extreme conditions. Rapid shifts from policy or technology could create new load classes quickly, requiring proactive monitoring and possibly new outage categories for responsive industrial loads.
Flexible Customer Demand
Demand flexibility is expected to play a more significant role in maintaining adequacy in future. While a variety of customer-side technologies will account for the flexibility, space and water heating will constitute a significant portion of the potential for load flexibility. Ascertaining the extent of the potential customer base that will respond and the manner in which they will respond can be challenging. To better understand how this may be achieved, EPRI research developed and evaluated models for different classes of demand flexibility in adequacy studies. These models were developed based on available data from a retail customer program and synthesized into models that allow for pre- and post-cooling or heating as well as a significant load reduction during an event.
The data gathering process is non-trivial, with extensive treatment needed to correct for selection bias implicit in the sampled data, to reconstruct data against future weather patterns, and to project to a wider customer base than those who participated in the program. Consistent and coherent data gathering for customer-side flexibility is strongly recommended, as extensive information is needed both to develop a baseline demand and to parameterize the potential response.
Source: EPRI, Demand Flexibility for Grid Reliability and Resilience: Planning Tool Integration of Demand Flexibility, EPRI, Palo Alto, CA, 2022 (3002025443)
DISTRIBUTED RESOURCES
Distributed Resources – Data Options
Level I:
Aggregate capacity of distributed generation and storage facilities, sufficient to allow estimated generation and charge and discharge for storage.
Level II:
Comprehensive facility location and technology data for distributed generation and storage facilities, together with sampled data, allowing for more accurate estimation of generation, charge, and discharge.
Level III:
Telemetered generation, charge, and discharge data for all distributed resource facilities, whether at individual facility or nodal aggregation.
Distributed generation and storage encompass any small (generally <10 MW) generation and storage resources which are located behind the customer meter.1 This section focuses specifically on distributed PV and on co-generation, or combined heat and power (CHP), technologies.
Distributed PV
As described in the section, there are key data needs that influence solar power generation such as solar irradiance, temperature, and the characteristics of the PV system. However, the number of distributed solar power installations can be significantly higher and more diverse than utility-scale installations. Collecting detailed information about each behind-the-meter installation such as PV system characteristics and historical power generation is a big challenge as data is distributed among many different residential, commercial, and industrial customers who own solar installations. However, some utilities, aggregators, system operators, and other energy providers can also have access to the data depending on the region or market.
Typically, historical distributed PV power generation data and installed capacity estimates are reported at system level and can be used in statistical and physical models as described in to estimate and validate distributed PV power output. While statistical models will be used in a similar way as for utility-scale PV installations, physical models require detailed information about PV system characteristics, which is difficult to capture for distributed PV installations due to the high number and variety of systems within a study area. In this case, optimizing the number of PV systems and system configurations (i.e., using different tilt and azimuth angles and module performance) to minimize the error against historical records allows more accurate estimates of behind-the-meter solar power generation.2
In most cases, asset-specific data for distributed PV systems are not publicly available. However, there is a growing trend in many countries towards centralized databases to facilitate data access and sharing among various stakeholders, such as policymakers, system operators, researchers, and industry professionals for monitoring and analysis purposes. Depending on the regulations and policies in place in each system, distributed PV system data might not be accessible at all, might be challenging to collect, or might not exist. Some jurisdictions require detailed PV system information whereas others require basic system information as part of a grant process.3
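The idea of tuning system configurations to minimize error against historical records can be illustrated with a deliberately small example. It assumes only two representative configurations with known normalized output profiles and searches for the capacity split between them that best reproduces observed generation; the function name and the grid search are illustrative choices, and a real application would fit many more configurations with a proper optimizer.

```python
def fit_blend(profile_a, profile_b, observed_mw, capacity_mw, steps=100):
    """Find the share of capacity in configuration A that minimizes squared error.

    profile_a, profile_b: normalized output (0 to 1) of two assumed configurations,
    e.g. different tilt and azimuth angles. Returns (best_share, squared_error).
    """
    best_share, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        share = k / steps
        err = sum(
            (capacity_mw * (share * a + (1.0 - share) * b) - o) ** 2
            for a, b, o in zip(profile_a, profile_b, observed_mw)
        )
        if err < best_err:
            best_share, best_err = share, err
    return best_share, best_err
```

The fitted share can then be applied to future weather years to estimate behind-the-meter generation for the same fleet.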
Several important gaps in information on distributed PV systems remain:
Data management and data quality are difficult to guarantee.
Data registration of PV installations can be inconsistent or lack accuracy in many regions. For instance, records may not make clear whether maximum power refers to the module (DC) or inverter (AC) output. Users of this data are recommended to clarify this to the greatest extent possible when gathering the information.
System-level, behind-the-meter PV generation data can sometimes be ascertained by third-party vendors that are informed by a limited set of field installations across the study area. This can help to validate against historical record, as in the case of ISO-NE.2 Other solutions involve estimation of installed and forecast distributed PV outputs based on GATS data as in the case of PJM.4
Cogeneration
Cogeneration is increasingly present in urban areas as part of an effort to lower the CO2 emissions of heating and cooling, and in certain regions to produce potable water. To model distributed cogeneration, data on installed capacities of individual CHP units and their dual use is a minimum requirement. Typically, large- and small-capacity CHP units are modeled separately in resource adequacy assessments.5,6
While large-scale CHP data requirements are similar to those for thermal generation, certain additional operational constraints need to be considered. Defining the dispatch strategy of large-scale CHPs is important to model must-run obligations. Heat-to-power ratios help determine whether a CHP unit is primarily used to serve thermal loads or electricity. In addition, historical CHP power generation and power prices help to understand the price-responsiveness and define the must-run obligations and dispatch behavior of the units. With the addition of heat storage, future CHP operational profiles may differ from current experience: it is recommended that the likelihood of such duty cycle evolutions be assessed before a study is started.
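As a simple illustration of how a heat-to-power ratio translates a thermal obligation into must-run electrical output, consider the sketch below. It assumes a heat-led unit with a fixed ratio and no heat storage; the function name and parameters are hypothetical.

```python
def chp_must_run_mw(heat_demand_mw_th, heat_to_power_ratio, p_max_mw):
    """Electrical output implied by serving a heat demand with a heat-led CHP unit.

    Assumes a fixed heat-to-power ratio (thermal MW per electrical MW) and no
    heat storage; output is capped at the unit's electrical capacity.
    """
    if heat_to_power_ratio <= 0:
        raise ValueError("heat-to-power ratio must be positive")
    return min(heat_demand_mw_th / heat_to_power_ratio, p_max_mw)
```

Under these assumptions, a 30 MW thermal demand served at a ratio of 1.5 implies 20 MW of electricity that must be produced regardless of power prices.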
Operational characteristics data for small-scale CHPs, by contrast, are more difficult to collect and analyze. System-level historical profiles are typically needed to model small-scale CHPs in RA studies. However, this data is not always available or is difficult to collect, presenting challenges similar to those described for distributed PV. Data collection for small-scale CHP units is typically driven by system-level historical power generation data based on the resource type (i.e., CCGT, ST) or price-responsive behavior (i.e., capacity that can come online if power prices are over certain thresholds).
The impact of weather factors such as ambient temperature, humidity, and pressure on the performance of CHP plants, and the thermosensitivity of the side process, are difficult to estimate. Failure to incorporate those factors in RA models will result in an underestimation of the adequacy risk, especially during extreme weather events.
Useful data sources for CHP units include:
Industry associations can provide data regarding installed capacity, operations, characteristics, and performance. For example, in the US, the Combined Heat and Power Alliance and the Department of Energy's CHP Technical Assistance Partnerships and in Europe, COGEN Europe provide data and resources for CHP analysis.
Regulatory bodies: The US Department of Energy's (DOE) CHP Installation Database contains information such as installed capacity, application, resource, and fuel type of more than 4,500 CHP systems across the country (U.S. Department of Energy Combined Heat & Power and Microgrid Installation Databases).
Energy agencies: In the United States, the Energy Information Administration (EIA) provides historical data on CHP electricity generation in the Electric Power Monthly report (Electricity Data - U.S. EIA). This report provides monthly and annual data on total electricity generation, including CHP, by state and type of generator. In Europe, each national energy agency can provide information on CHP installations and generation within their respective countries. Eurostat, the statistical office of the EU, can provide data about CHP energy production across the EU perimeter.7
Distribution or transmission system operators collect data from CHP units connected to their grid. On the transmission side, ENTSO-E transparency platform publishes historical CHP electricity generation for most of the bidding zones (markets) in Europe with hourly granularity.
Historical CHP power generation is not always available and can be difficult to collect. Some regions have more explicit CHP data available than others, where data is more limited or implicitly reported under thermal or other categories. Breaking down these categories will facilitate a more realistic representation of CHP in RA models to account for operational constraints that otherwise will not be captured correctly. If CHP operational constraints are not well accounted for, models may overstate the flexibility available to the grid and underestimate adequacy risks.
Projections on new CHP units need to be broken down to better represent the technology types and the degree of flexibility that this technology can offer so it can be well incorporated in RA models.
Key Data Sources for Distributed Energy Resources
Variable/Data Type | Source Description | Web Address |
DER facility maps, specifications, and hourly & monthly generation data for New York State | NYSERDA | |
Aggregate DER generation and capacity data for New England States | ISONE | |
Location of DER generation facilities in Vermont | Vermont Electric Coop | |
ELECTRICITY TRANSMISSION NETWORK DATA
Transmission Network – Data Options
Level I:
No data collected: transmission is not modeled in copper-plate RA models, with fixed prescribed imports and exports to neighboring grids.
Level II:
Transmission limits prescribed for zone-to-zone and grid-to-grid transmission, but variations within limits set according to modeled excess capacity in each zone.
Level III:
Detailed model of transmission line ratings with weather inputs, nodal model of transmission within grid and between grid and neighboring grids.
Modeling of the transmission system in resource adequacy models is largely divided into three categories (from simplest to most complex):
Copper Sheet models which have no transmission.
Zonal models which divide the system into geographic regions and usually connect zones via a transportation, “pipe-and-bubble” model.
Nodal models which represent most or all of the nodes and lines in the system, with the transmission network modeled as a DC optimal power flow (DC OPF).
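The difference between a copper sheet and a zonal transportation model can be shown with a two-zone sketch for a single hour. The function and its inputs are hypothetical; setting the transfer limits to infinity reproduces copper sheet behavior, while finite limits reproduce a simple "pipe-and-bubble" interface.

```python
def two_zone_dispatch(load_a, gen_a, load_b, gen_b, limit_a_to_b, limit_b_to_a):
    """Single-hour transportation model for two zones.

    Returns (flow, unserved_a, unserved_b), with flow positive from A to B.
    Only surplus is shared; no network physics (impedances) are represented.
    """
    surplus_a = gen_a - load_a
    surplus_b = gen_b - load_b
    flow = 0.0
    if surplus_a > 0 and surplus_b < 0:
        flow = min(surplus_a, -surplus_b, limit_a_to_b)
    elif surplus_b > 0 and surplus_a < 0:
        flow = -min(surplus_b, -surplus_a, limit_b_to_a)
    unserved_a = max(-(surplus_a - flow), 0.0)
    unserved_b = max(-(surplus_b + flow), 0.0)
    return flow, unserved_a, unserved_b
```

In the test case used here, zone B is 40 MW short and zone A has 50 MW spare: a 30 MW interface leaves 10 MW unserved, while an unlimited interface leaves none.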
In addition to transmission line characteristics and system transfer limits, transmission line availability is also important data for resource adequacy studies. Many studies assume that modelled transmission elements are always available and never forced out of service. However, applying forced and maintenance outages to transmission elements can help identify reliability needs for nodal models. For zonal models, applying outages to individual transmission elements (via a forced outage partial derate on the zonal interface, or collection of lines connecting zones) can better tune inter-zone power transfer limits.
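One way to derive a probabilistic derate for a zonal interface from element outage data is to enumerate the outage states of the lines that make up the interface. The sketch below assumes independent forced outages and, as a strong simplification, that interface capacity is the sum of in-service line ratings; in practice the limit after an outage would come from power flow studies.

```python
from itertools import product


def interface_capacity_distribution(lines):
    """Enumerate outage states of an interface.

    lines: list of (rating_mw, forced_outage_rate) tuples.
    Returns a dict mapping available capacity (MW) to probability.
    """
    dist = {}
    for states in product([True, False], repeat=len(lines)):
        prob, capacity = 1.0, 0.0
        for (rating, fo_rate), in_service in zip(lines, states):
            prob *= (1.0 - fo_rate) if in_service else fo_rate
            if in_service:
                capacity += rating
        dist[capacity] = dist.get(capacity, 0.0) + prob
    return dist


def expected_capacity(dist):
    """Probability-weighted mean capacity of the interface."""
    return sum(cap * p for cap, p in dist.items())
```

The resulting distribution can be sampled in a Monte Carlo RA simulation instead of assuming the interface is always fully available.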
All transmission owners are required to annually file a FERC 715 report which details all assets under their ownership and their properties, such as line resistance and reactance. This information is routinely used by transmission planners and used to perform a variety of system studies, such as contingency analysis and interconnection studies. This data can also be reviewed to develop representative limits for lines or interfaces (a group of lines) for the purpose of resource adequacy studies. This requires detailed AC contingency, and potentially voltage analysis, to quantify total transfer capabilities and is performed exogenously (as an input) to resource adequacy studies. Historical data on system transfer limits caused by non-thermal transmission limits are typically published by system operators but may not be available for all systems. Some examples are shown on the table below. Data on specific transmission element outages can be obtained from the owners of the transmission assets or from the transmission system operator. Average statistics for transmission outages are also available from NERC via the Transmission Availability Data System.
While a typical power flow analysis might evaluate a handful of hours during a study year (such as peak load and light load conditions), resource adequacy studies typically analyze an entire year across a variety of system conditions. This may prevent data from being taken directly from a data source like the FERC 715 and require careful calibration by running power flow studies.1,2,3
One study assumption to consider is line ratings themselves. Data in the FERC 715 may show a series of different ratings for a transmission line, such as “Normal” and “Emergency” ratings. The transmission line limit used in a resource adequacy study can have an impact on power transfer limits and impact the observed system events.
In addition, accurate modeling of transmission limits can be a difficult task. At any given point in real time operations, the limit that a line can support can change due to factors such as ambient conditions (temperature) or generator availability (VAR support) which may be difficult to reflect in a resource adequacy study. Applying seasonal transmission line ratings can help to address overall ambient conditions or other temporal constraints such as line shunts being applied.4 To apply generator availability, power flow studies can be performed to evaluate whether line limits change as a function of the operating state of specific generators. For example, NYISO evaluates the impact of a set of generators on the Central East interface,5 which is voltage limited, and a simplified version of this data is input into their resource adequacy models.6
Nodal transmission topology is computationally burdensome, even when applying DC OPF simplifications, and is typically not implemented in resource adequacy studies. There is an inherent tradeoff, both in terms of data needs and computational speed, between transmission fidelity and modeling feasibility. Today, most systems evaluate the most important transmission constraints that limit system operations during peak risk conditions and use a simplified zonal model to reflect that in resource adequacy simulations. However, future systems may have resource adequacy risk occur in new locations and during different time periods as the resource mix changes. It is therefore important to tightly couple transmission planning and resource adequacy planning for future systems.
Key Data Sources for Transmission
Variable/Data Type | Source Description | Web Address |
Transmission Total Transfer Capability | ISONE | |
Generic Transmission constraints | ERCOT | |
Power Export Stability Limits | NYISO | Northern Export and Cedars Import Stability Limit Analysis (NX-19) |
Transmission Line Locations and Voltage | Homeland Infrastructure Foundation Level Database (via ArcGIS) | https://hifld-geoplatform.opendata.arcgis.com/datasets/geoplatform::transmission-lines |
OPERATING RESERVES
Operating Reserves – Data Options
Level I:
Operating reserve requirements are updated according to expected and unexpected system characteristics. Requirements are more frequently updated when available resource capacity, energy (primary fuels) and demand are more uncertain and variable. Depending on these factors, risk is re-assessed up to multiple times per hour.
Level II:
Operating reserve serves as insurance against load loss and cascading outages. Requirements are determined very often and well in advance of many types of extreme events. Those events lacking forecasts more than a few hours in advance may not allow for preparing additional reserves.
Level III:
Determine reserve requirements considering probabilistic forecasts of capacity, energy, and demand. Consider whether reserve types can be generalized according to availability, capabilities, and response times.
Operating reserve is traditionally used to respond to demand forecast errors and unexpected resource outages. In some regions the use of operating reserve is becoming more challenging as it supports new system features like fast net load ramping, minimum inertia requirements, and behind-the-meter distributed energy resource outages. These supplemental uses may require operating reserves to influence the types of resources needed for RA (faster responding) and may affect the way operations and load losses are represented. These changes may add up in ways that cause traditional RA metrics to be insufficient and then require additional data about resource capabilities and operating reserve requirements.
Traditional operating reserve requirements were set, under assumptions of high transmission availability and with seasonal adjustments, to be sufficient to cover the largest generation contingency. With the changing resource mix, electrification, demand-side participation in energy dispatch, DERs, and pressure to decrease operating costs, the amount of operating reserve required is starting to be recalculated and procured hourly.
Dynamic reserve determination for setting reserve requirements has the potential benefit to reduce system risk and costs simultaneously.1 Recent advances with statistical methods leverage historical reserve needs indexed by existing system conditions, including reserve and energy shortfalls. This historical information is compiled into system risk distributions, and when coupled with near-term forecasts of the same parameters for the system conditions, the distributions can be used to determine reserve requirements meeting or exceeding risk targets. The EPRI DynADOR tool uses this method, which is appropriate under the assumption that historical conditions and their parameters continue to indicate near-term systemic risks.
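The statistical approach described above can be illustrated with a minimal sketch. It is not the DynADOR implementation; it only assumes that historical reserve needs are stored alongside one system-condition parameter, that conditions are grouped into bins, and that the requirement is the smallest historical need exceeded in no more than a target share of similar cases. The function name, bin edges, and all numbers are hypothetical.

```python
import math
from bisect import bisect_right


def reserve_requirement(history, forecast_condition, bin_edges, risk_target):
    """Reserve level (MW) covering at least (1 - risk_target) of the historical
    needs observed under conditions in the same bin as the forecast.

    history: list of (condition_value, reserve_need_mw) pairs (hypothetical data).
    bin_edges: sorted thresholds that split the condition into bins.
    """
    def bin_index(value):
        return bisect_right(bin_edges, value)

    target_bin = bin_index(forecast_condition)
    needs = sorted(need for cond, need in history if bin_index(cond) == target_bin)
    if not needs:
        raise ValueError("no historical observations in the forecast bin")
    # Index of the empirical (1 - risk_target) quantile; rounding guards
    # against floating-point noise before taking the ceiling.
    k = math.ceil(round((1.0 - risk_target) * len(needs), 9)) - 1
    return needs[max(k, 0)]


# Hypothetical history: (forecast wind output in MW, reserve need in MW).
history = [(100, 50), (150, 80), (180, 60),
           (400, 120), (450, 200), (480, 150), (420, 170), (410, 130)]

high_wind = reserve_requirement(history, 430, bin_edges=[300], risk_target=0.2)
low_wind = reserve_requirement(history, 120, bin_edges=[300], risk_target=0.2)
print(high_wind, low_wind)
```

A production method would index on several conditions at once and would need enough observations per bin for the tail of the distribution to be meaningful.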
A newer method for reserve determination integrates a production cost model (PCM) with the probability of resource availability to determine the risk of system inadequacy under a given unit commitment. The additional information about energy and capacity uncertainty is similar to that handled in RA, so a similar method may be embedded within an RA analysis, at the cost of added computational burden.
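As a rough illustration of the risk calculation under a given unit commitment, the sketch below enumerates the available-capacity distribution of the committed units and sums the probability that capacity falls short of load. It assumes independent two-state units and takes the commitment as an input list rather than from a PCM; the unit data and function name are hypothetical.

```python
def shortfall_probability(units, load_mw, reserve_mw=0.0):
    """Probability that available committed capacity plus held reserve is below load.

    units: list of (capacity_mw, outage_probability) for committed units,
           assumed independent and either fully available or fully out.
    """
    # Distribution of available capacity, built one unit at a time.
    distribution = {0.0: 1.0}
    for capacity, p_out in units:
        updated = {}
        for available, prob in distribution.items():
            up = available + capacity
            updated[up] = updated.get(up, 0.0) + prob * (1.0 - p_out)
            updated[available] = updated.get(available, 0.0) + prob * p_out
        distribution = updated
    return sum(p for available, p in distribution.items()
               if available + reserve_mw < load_mw)


# Hypothetical commitment: two 100 MW units, each with a 10% outage probability.
committed = [(100.0, 0.1), (100.0, 0.1)]
risk_without_reserve = shortfall_probability(committed, load_mw=150.0)
risk_with_reserve = shortfall_probability(committed, load_mw=150.0, reserve_mw=50.0)
print(risk_without_reserve, risk_with_reserve)
```

Exact enumeration is only practical for small or aggregated fleets; larger systems typically rely on sampling, which is where the added computational burden arises.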
The data needed for reserve determination depends on the method as follows:
- Simple Seasonal Determination – Peak loads in summer and winter seasons, resource ratings by season, and maximum forecast errors.
- Historical Statistical Analysis – Time series of reserve requirements by type, weather and system conditions, renewable energy production, probabilistic forecasts for resource energy and capacity and for load, and system risk targets.
- Integrated Production Cost Analysis – All data for historical statistical analysis, plus PCM resource models, which typically include production cost information.
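For the simplest of these methods, the data items map directly onto a short calculation. The sketch below assumes one common convention, namely that the seasonal requirement is the larger of the largest single resource rating and the maximum forecast error applied to the seasonal peak load. That rule and every number shown are illustrative assumptions, not a prescribed standard.

```python
def seasonal_reserve(peak_load_mw, unit_ratings_mw, max_forecast_error):
    """Seasonal reserve requirement in MW under an assumed rule:
    cover the larger of the largest contingency and the worst forecast error."""
    largest_contingency = max(unit_ratings_mw)
    forecast_margin = max_forecast_error * peak_load_mw
    return max(largest_contingency, forecast_margin)


# Hypothetical seasonal inputs: peak load, resource ratings, maximum forecast error.
seasons = {
    "summer": {"peak_load_mw": 10000.0,
               "unit_ratings_mw": [800.0, 650.0, 400.0],
               "max_forecast_error": 0.05},
    "winter": {"peak_load_mw": 12000.0,
               "unit_ratings_mw": [850.0, 700.0, 420.0],
               "max_forecast_error": 0.08},
}

requirements = {name: seasonal_reserve(**data) for name, data in seasons.items()}
print(requirements)
```

In this example the summer requirement is set by the largest unit, while the winter requirement is set by forecast error, which shows why both data items are needed for each season.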
EPRI’s DynADOR (Dynamic Assessment and Determination of Operating Reserve) software package manual has extensive data on reserve determination.
Key Data Sources for Operating Reserve
Variable/Data Type | Source Description | Web Address |
Simple Seasonal Determination | Historical peak net load; resource ratings | See section Customer Side Data; see chapter Supply Resource Data |
Historical Statistical Analysis | Weather data; historical reserve requirements; system conditions; renewable energy production; energy production and load forecasts | See chapter Weather Data; utility or ISO reserve procurement records; for energy production, see report Modeling New and Existing Technologies and System Components in Resource Adequacy; see section Demand Forecasting & Uncertainty Considerations |
Integrated Production Cost Analysis | Simulated energy production and reserve procurement | See report Modeling New and Existing Technologies and System Components in Resource Adequacy |
GAS NETWORK DATA
Gas Network Data – Data Options
Level I:
No gas network data needed; supply is assumed not to be a binding constraint on energy supply.
Level II:
Capacity of gas lines leading to power plants, and records of occasions when supply was not available to run plants at expected output, are collected.
Level III:
Comprehensive daily gas supply information for all gas lines supplying generation facilities is collected.
The Model Selection Guidelines site (EPRI Micro Sites) distinguishes between implicitly and explicitly modeling gas networks, i.e., either accounting for fuel inventory constraints and internalizing outages of the gas infrastructure, or explicitly representing the gas network within the RA assessment.
When gas constraints are modeled implicitly, data on outages of gas generators can be used, encompassing both plant failures and fuel supply issues. To further represent fuel constraints, data on the overall available gas inventory and on the availability of firm and non-firm gas contracts are required. Because non-firm contracts are interruptible, data on the historical availability of gas subject to system conditions (e.g., weather) can be collected. The source of these data varies, but pipeline operator bulletins provide updates for specific systems in the US.
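One way to turn these data into an implicit fuel constraint is sketched below. It assumes that a plant's firm-contract share is always deliverable and that the non-firm share is scaled by the fraction of interruptible gas historically delivered on days with similar temperatures. The threshold-based notion of "similar", the record format, and all values are hypothetical.

```python
def nonfirm_availability(records, temperature_c, threshold_c):
    """Average fraction of interruptible gas delivered on historical days that
    fall on the same side of the temperature threshold as the study day.

    records: list of (daily_temperature_c, fraction_delivered) pairs.
    """
    is_cold = temperature_c < threshold_c
    similar = [delivered for temp, delivered in records
               if (temp < threshold_c) == is_cold]
    if not similar:
        raise ValueError("no comparable historical days")
    return sum(similar) / len(similar)


def fuel_limited_capacity(nameplate_mw, firm_share, nonfirm_fraction):
    """Capacity (MW) that can be fueled: all firm gas plus the available non-firm gas."""
    firm_mw = nameplate_mw * firm_share
    nonfirm_mw = nameplate_mw * (1.0 - firm_share) * nonfirm_fraction
    return firm_mw + nonfirm_mw


# Hypothetical history of interruptible deliveries versus daily temperature.
records = [(-10.0, 0.2), (-5.0, 0.4), (5.0, 1.0), (12.0, 1.0), (-8.0, 0.3), (20.0, 1.0)]

cold_day_fraction = nonfirm_availability(records, temperature_c=-7.0, threshold_c=0.0)
cold_day_capacity = fuel_limited_capacity(500.0, firm_share=0.6,
                                          nonfirm_fraction=cold_day_fraction)
print(cold_day_fraction, cold_day_capacity)
```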
When explicitly modeling gas networks, data on the system topology and inflows are required. Depending on the level of detail of the model (e.g., transport model vs. hydraulic model), data on the characteristics and locations of the different network components (pipelines, compressor stations, valves, storage, gas supply, gas quality, etc.) may be required.1 High-level data on transfer capabilities for main lines may be publicly available; however, flow and contracted flow levels may not be. Typically, simplified data are sufficient for a transport model, whereas dynamic models require more detail on the individual components. One advantage of explicitly modeling the gas network is that outages of the different network components can be represented individually, providing insight into their vulnerabilities, if outage data on these components are available.
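A transport model can be reduced, in its simplest form, to a maximum-flow problem over pipeline transfer capabilities. The sketch below uses a standard breadth-first augmenting-path (Edmonds-Karp) search on a small hypothetical network and then removes one component to mimic an outage. Node names, capacities, and units are invented for illustration, and no hydraulic behavior (pressure, linepack) is represented.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Maximum deliverable flow from source to sink (Edmonds-Karp).

    capacity: dict mapping node -> {neighbor: transfer capability}.
    """
    # Residual graph with zero-capacity reverse arcs added.
    residual = {u: dict(arcs) for u, arcs in capacity.items()}
    for u, arcs in capacity.items():
        for v in arcs:
            residual.setdefault(v, {}).setdefault(u, 0.0)
    residual.setdefault(source, {})
    total = 0.0
    while True:
        # Breadth-first search for a path with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, spare in residual[u].items():
                if spare > 1e-9 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Trace the path back and push the bottleneck amount along it.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        total += push


def without_component(capacity, node):
    """Copy of the network with one component (node) removed, e.g. on outage."""
    return {u: {v: c for v, c in arcs.items() if v != node}
            for u, arcs in capacity.items() if u != node}


# Hypothetical network, capacities in arbitrary daily volume units.
network = {
    "supply": {"compressor": 100.0, "storage": 40.0},
    "compressor": {"plants": 80.0},
    "storage": {"plants": 30.0},
}

normal_delivery = max_flow(network, "supply", "plants")
outage_delivery = max_flow(without_component(network, "compressor"), "supply", "plants")
print(normal_delivery, outage_delivery)
```

Comparing the two results indicates how much generation fuel depends on a single component, which is the kind of vulnerability insight that component-level outage data would support.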
In the US, the inter-dependence of gas compressor stations on the electric grid has been the subject of considerable interest, with recent research comprehensively documenting the interfaces.2
The NERC Generating Availability Data System (GADS) dataset provides unscheduled outages of natural-gas-fired generators. EIA Form 860 identifies pipeline-to-power-plant connections. EIA Form 923 holds the natural gas fuel receipts and contracts (firm, non-firm) of power plants. Automatic incident reporting lies with the Pipeline and Hazardous Materials Safety Administration (PHMSA). The ABB Velocity Suite commercial tool compiles pipeline critical notices such as force majeure, curtailment, and operational flow orders, as well as natural gas scheduling data.3
In Europe, a collection of gas network instances based on real-world network data is coordinated by the Zuse Institute Berlin. Additionally, some open-source data sets are collected and made available by SciGRID_gas. Information on the pipeline topology in Europe is also available on the ENTSO-G website.
Compared with electrical transmission networks, there is a marked lack of data on gas infrastructure. When implicitly accounting for the fuel supply network, it is difficult to recreate firm/non-firm gas contract availabilities and to identify the root cause of a failure. When explicitly modeling the gas network, accessible data on the topology are missing, specifically on components and their characteristics (e.g., pipeline diameter, roughness, compressor stations, etc.). One example dataset is the Belgian gas network from 2010.4 Further, there appears to be no detailed outage reporting system for the gas network comparable to GADS.
Key Data Sources for Gas Networks
Variable/Data Type | Source Description | Web Address |
Collection of gas transportation networks | Open-source collection of realistic Gas Transport instances | |
Compressor data set | Smillie et al. US gas electric compressor data set | |
Information on pipeline failures | Pipeline and Hazardous Materials Safety Administration (PHMSA) | |
Information on firm/non-firm contracts | U.S. Energy Information Administration | Form EIA-923 detailed data with previous form data (EIA-906/920) |
HYDROGEN DATA
Hydrogen Data – Data Options
Level I:
Projected, aggregate data for proposed electrolysis or electricity production facilities.
Level II:
Historical or projected hydrogen production, storage, and hydrogen-fueled generation data.
Level III:
As plant history develops, outage data due to maintenance and equipment failure will accumulate and will be needed, along with correlated environmental/weather dependency for the technology.
Modeling the impact of hydrogen both as an industrial demand and as a potential source of dispatchable capacity is a relatively new endeavor but has similarities with traditional modeling methods for thermal generation, demand response, energy storage, gas networks, and variable renewable energy resources. Hydrogen is also prototypical of other sector coupling that may emerge in the future. The first step is to determine the desired level of fidelity for capturing the inter-dependence of hydrogen with electricity and gas. This section draws from the discussion on hydrogen modeling considerations on the Model Selection Guidelines site (EPRI Micro Sites) and outlines existing data available to address each consideration, as well as gaps that make high-fidelity modeling of a hydrogen system difficult to achieve at present. These considerations include the location of production; capacity and energy adequacy of a hydrogen generation system; conversion, transportation, storage, and leakage; competing end uses and economic operations; and maintenance and forced outage modeling.
Location of Production
Due to the increase in global attention towards hydrogen’s potential to decarbonize industrial processes requiring high temperatures, a multitude of projects have been announced around the world. High profile facilities or hydrogen hubs can be readily included at a zonal level in resource adequacy models to represent the spatial distribution of both hydrogen load/natural gas consumption and storage networks. This allows for energy and resource demands to be modeled in a way that accounts for their effect on resource adequacy in regions with constrained transmission or gas networks.
Data Sources on Siting of Existing and Future Hydrogen Production Facilities
- IEA, Worldwide Hydrogen Projects Database
- GIS U.S. Hydrogen Producers and Consumers Infrastructure
- U.S. Hydrogen Resource Data Set (NREL, 2009) – Includes information on potential for hydrogen production from renewable energy resources
- HYDRA, (NREL (undisclosed)), GIS for hydrogen demand, resource, infrastructure etc.
- Platts World Refinery Database, Gray hydrogen facilities
- Low-carbon hydrogen projects and their status (Pillsbury)
- CSIRO – Hydrogen projects Australia
- ENTSO-G, Hydrogen visualization platform for Europe
- Suitable Sites for Wind Hydrogen Production Based on GIS-MCDM Method in Algeria (2019)
Capacity and Energy Adequacy
Operational considerations for more mature hydrogen technologies, such as steam methane reformation, are addressable with today's information, and this industrial demand is largely accounted for in utility or system operator forecasts. Future demand from alternative production pathways (electrolysis) requires additional knowledge about how these facilities will operate. Over 98% of hydrogen is produced from fossil fuels today, which means there is little operational experience from which to understand how electrolyzers will be operated at large scale.
Most of the above data can be approximated in the near-term using existing data or using information supplied by original equipment manufacturers on the operational characteristics of different electrolyzers. Fleetwide averages by technology type may be acceptable until more detailed information is produced from real-world projects. The actual production potential of a facility can also be estimated based on its stated plant design (SMRs with CCS or green hydrogen with self-supplied renewables).
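The estimate of production potential from a stated plant design can be sketched as a simple energy balance. The example assumes a fleetwide-average specific electricity consumption in kWh per kg of hydrogen and a capacity factor set by the facility's electricity supply. The 100 MW size, 50% capacity factor, and 50 kWh/kg figure are hypothetical placeholders; 33.3 kWh/kg is the approximate lower heating value of hydrogen.

```python
HOURS_PER_YEAR = 8760
H2_LHV_KWH_PER_KG = 33.3  # approximate lower heating value of hydrogen


def annual_hydrogen_kg(capacity_mw, capacity_factor, specific_consumption_kwh_per_kg):
    """Annual hydrogen output (kg) from electrolyzer size, utilization, and
    electricity use per kg (all three are study assumptions)."""
    energy_kwh = capacity_mw * 1000.0 * HOURS_PER_YEAR * capacity_factor
    return energy_kwh / specific_consumption_kwh_per_kg


def lhv_efficiency(specific_consumption_kwh_per_kg):
    """Electrolyzer efficiency on a lower-heating-value basis."""
    return H2_LHV_KWH_PER_KG / specific_consumption_kwh_per_kg


# Hypothetical facility: 100 MW electrolyzer, 50% capacity factor, 50 kWh/kg.
production_kg = annual_hydrogen_kg(100.0, 0.5, 50.0)
efficiency = lhv_efficiency(50.0)
print(production_kg, efficiency)
```

The same inputs also give the facility's average electrical demand (capacity times capacity factor), which is the quantity that enters the load side of an RA study.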
Data Sources on Electrolyzer Constraints
- IEA, “The Future of Hydrogen,” 2019, Full Report
- IEA, “The Future of Hydrogen,” 2019, Technical Assumptions
- Götz, Manuel, et al. "Renewable Power-to-Gas: A technological and economic review." Renewable energy 85 (2016): 1371-1390
- Sánchez, Mónica, et al. "Semi-empirical model and experimental validation for the performance evaluation of a 15 kW alkaline water electrolyzer." International Journal of Hydrogen Energy 43.45 (2018): 20332-20345.
- ENTSO-E, Potential of P2H2 technologies to provide system services, 2022
Some important gaps in existing data sources include:
- Modeling of the associated hydrogen storage and delivery may not be readily available for proposed projects but is an important aspect of energy adequacy for a hydrogen system.
- Operational characteristics stated by original equipment manufacturers may be representative only of controlled environments and may not reflect real-world system costs that decrease efficiencies across the supply chain.
- Prices at which hydrogen production may curtail will vary by production pathway, location, and offtake agreements for the facility. Outside of firm participation in demand response programs, it may be difficult to quantify the willingness to curtail production during grid emergencies.
- Electrolyzer efficiencies currently reported come from original equipment manufacturers or the scientific literature. Climate-specific understanding of electrolyzer performance is needed.
Conversion, Transportation, Storage and Leakage
Economy-wide leakage of hydrogen is estimated to be about 3% for current infrastructure.1 The leakage rate for the future would depend on the mixture of end uses (leakage rates are believed to be higher for power generation than for domestic heat and hot water, for example). Other important data needs include the energy requirements for conversion of hydrogen for use (compression, conversion to ammonia, etc.), operational procedures for reducing or curtailing downstream operations when production is halted, transportation system capacity and frequency of delivery, storage type, size and characteristics, and leakage data on the entire supply chain from production down to end use.
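To show how leakage data would enter an energy adequacy calculation, the sketch below applies loss rates to a produced quantity. It first uses the roughly 3% economy-wide figure quoted above as a single aggregate rate, then compounds per-stage rates along the supply chain. The per-stage values are hypothetical, since empirical stage-level leakage data are one of the gaps noted in this section.

```python
def delivered_fraction(stage_loss_rates):
    """Fraction of produced hydrogen reaching end use after sequential losses."""
    fraction = 1.0
    for loss in stage_loss_rates:
        fraction *= 1.0 - loss
    return fraction


def delivered_hydrogen_kg(produced_kg, stage_loss_rates):
    """Hydrogen (kg) available at end use for a given produced quantity."""
    return produced_kg * delivered_fraction(stage_loss_rates)


# Aggregate view: about 3% economy-wide leakage.
aggregate_delivery = delivered_hydrogen_kg(1000.0, [0.03])

# Stage view with hypothetical per-stage loss rates.
stage_losses = {"compression": 0.01, "transport": 0.015, "storage": 0.005}
stage_fraction = delivered_fraction(stage_losses.values())
print(aggregate_delivery, stage_fraction)
```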
Important gaps in data for this area include the rate of degradation of the hydrogen system downstream from production, and its effects on hydrogen delivery, and empirical data on leakage and downstream hydrogen loss.
Data Sources on Conversion, Transportation, Storage, and Leakage
Competing End Uses and Economic Operations
Some data on willingness to pay for hydrogen (by each end use), firm contract agreements for hydrogen offtake, and transferability of hydrogen produced or stored to different end users can be found in the DOE’s National Clean Hydrogen Strategy and Roadmap. However, hydrogen is currently a concentrated market, both geographically and across industrial uses. For larger-scale systems and clean hydrogen production, there is little basis for developing models that capture the supply and demand balance of a larger system and the delivered hydrogen costs that consumers would be willing to pay. In addition, economic operations of hydrogen facilities are likely to be constrained to fleet-wide averages, which can lead to under- or overestimation of hydrogen availability for power in emergencies.
Maintenance and Forced Outage Rates
When upstream models of the hydrogen system are required, future maintenance and forced outage rates for hydrogen-fueled generation facilities need to be estimated. This requires data on expected downtimes for maintenance of both the hydrogen generation facilities and downstream transportation and storage facilities, failure rates for different components of the supply chain, repair times for different levels of outages (whether for maintenance or repairs), and seasonal and climate impacts on maintenance and forced outages. Some of this data may be available from original equipment manufacturers; however, at the present stage of technical readiness, the focus has been primarily on the production cost of hydrogen, and more information is needed on electrolyzer performance at scale and across different types of systems. Real-world data on operational experience and actual availability levels of hydrogen production and storage (especially for non-fossil methods) will also be needed.
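Failure rates and repair times of this kind are commonly combined into an availability figure. The sketch below uses the standard steady-state relation, availability = MTBF / (MTBF + MTTR), and assumes the supply chain components are independent and in series, so that hydrogen-fueled generation needs all of them at once. The component names and hours are hypothetical, not measured values.

```python
def availability(mean_time_between_failures_h, mean_time_to_repair_h):
    """Steady-state availability of a single repairable component."""
    return mean_time_between_failures_h / (
        mean_time_between_failures_h + mean_time_to_repair_h)


def series_availability(components):
    """Availability of a chain in which every component must be in service.

    components: dict of name -> (MTBF hours, MTTR hours), assumed independent.
    """
    result = 1.0
    for mtbf_h, mttr_h in components.values():
        result *= availability(mtbf_h, mttr_h)
    return result


# Hypothetical supply chain feeding a hydrogen-fueled generator.
chain = {
    "electrolyzer": (900.0, 100.0),
    "storage_and_delivery": (400.0, 100.0),
}

chain_availability = series_availability(chain)
forced_outage_rate = 1.0 - chain_availability
print(chain_availability, forced_outage_rate)
```

Storage that buffers upstream outages would break the series assumption and raise the effective availability, which is one reason storage size and characteristics appear among the data needs.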
Key Data Sources for Hydrogen
Variable/Data Type | Source Description | Web Address |
Siting of existing and possible future hydrogen facilities in the US | Hydrogen matchmaker by the U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy | |
Hydrogen potential from renewables across the US | NREL Hydrogen Resource Data, Tools, and Maps | |
Existing hydrogen projects in Australia | CSIRO, Australia | |
Existing hydrogen projects in Europe | ENTSO-G | |
Flexibility of electrolyzers | ENTSO-E | |
Large-Scale storage of hydrogen | Andersson and Grönkvist (2019) | |
End-uses of hydrogen | U.S. Department of Energy, Office of Energy Efficiency and Renewable Energy | |