Analysis of the same day of the week increases in peak electricity demand in South Africa

Modelling of the same day of the week increases in peak electricity demand using the Generalized Pareto-type (GP-type) distribution is discussed. The GP-type distribution discussed in this paper has one parameter to estimate and as such, it is referred to as the Generalized Single Pareto (GSP). The data is from Eskom, South Africa’s power utility company and is for the years 2000 to 2011. A comparative analysis is done with a Generalized Pareto Distribution (GPD). Although both the GSP and the GPD fit the data, the use of the GSP is easier since it has only one parameter to estimate instead of two as is the case with the GPD. Modelling of the same day of the week increases in peak electricity demand improves the reliability of a power network if an accurate assessment of the level and frequency of future extreme load forecasts is carried out.


Introduction
Modelling the same day of the week extreme peak loads is important to the system operator, who has to balance electricity generated with demand. It is important that the amount of electricity drawn from the grid and the amount generated balance [7,15]. The amount generated is called the electricity load, which is equal to electricity demand in the absence of blackouts and load-shedding. It is therefore important to have accurate predictions of peak loads. Accurate load forecasts are needed by both generators and distributors, particularly during peak periods.
This paper discusses an application of the Generalized Pareto-type (GP-type) distribution in predicting the probability of exceedance (POE) levels of the same day of the week increases in peak electricity demand. The GP-type distribution discussed in this paper has one parameter to estimate and as such it is referred to as the Generalized Single Pareto (GSP). Most recent work includes that of Verster & De Waal [16], who showed that above a sufficiently high threshold, the tail of a Generalized Burr Gamma (GBG) distribution can be approximated by a GP-type distribution which has one parameter to estimate, namely the extreme value index (EVI). A comparative analysis is performed with a Generalized Pareto Distribution (GPD).
The use of the GPD in modelling the tail of a distribution has been studied for over three decades. Balkema & De Haan [1] and Pickands [13] showed that the distribution function of the excesses above a high threshold converges to a GPD as the threshold tends to the right endpoint of the distribution. Smith [14] showed that estimation of GPD parameters with the maximum likelihood method is a non-regular problem when ξ < −1/2, where ξ is the shape parameter, which is also known as the EVI. Maximum likelihood estimation therefore relies on certain regularity conditions being met. It is therefore more attractive to use the Bayesian approach, which does not depend on these regularity conditions [3,5]. Since the peaks over threshold (POT) methodology involves tail estimation based on small data sets with little information available, the Bayes approach can be used to capture and take into account all the available information, including additional information through prior elicitation [3,5,17]. Various aspects of the Bayes approach and related methods have been studied in the literature [2,3,4,11]. The GPD has two parameters to estimate. Using the Bayesian approach to estimate the parameters involves the simultaneous simulation of the two parameters. This is generally computationally expensive and time consuming.
The remainder of the paper is organized as follows. Section 2 contains a description of the input data used in the models. The GSP and GPD models are discussed in §3. In §4 the empirical results are presented. A detailed discussion of the comparative analysis of the GSP with the GPD follows in §5, while §6 concludes the paper.

Data
Aggregated national Daily Peak Demand (DPD) electricity data is used for the commercial, residential and industrial sectors of South Africa, giving a total of 4 271 observations. The data is obtained from Eskom, South Africa's power utility company, and spans the years 2000 to 2011. DPD is the maximum hourly demand in a 24-hour period, and aggregated data is the net energy sent out (NESO) from the distribution division of Eskom in response to some demand of electrical power. We define same day of the week increases in peak electricity demand as follows. Let $y_t$ be DPD on day $t$ and $y_{t-7}$ DPD on day $t - 7$; then the same day of the week increase in peak electricity demand on day $t$ is defined as
$x_t = y_t - y_{t-7}$, for $y_t > y_{t-7}$. (1)
In this definition we only consider positive increases in peak demand. We have 2 265 same day of the week increases in peak electricity demand.
Electricity demand varies from day to day depending on the day of the week. The demand for electricity is generally higher during the week (Monday to Friday) than during the weekend (Saturday and Sunday). A summary of the demand indices for each day of the week is shown in Figure 1, with Day 2 (representing Tuesday) indicating that the demand is on average 2.869% above the average demand. The plot in Figure 2(a) shows that the same day of the week increases in peak electricity demand are highly volatile. This volatility is due to a range of uncertainties, including population growth, changing technology, economic conditions and activity, prevailing weather conditions as well as the general randomness in individual usage [10]. The histogram of the same day of the week increases in Figure 2(b) shows an exponentially decreasing pattern.
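The construction of the same day of the week increases can be sketched in a few lines. This is a minimal illustration, assuming the increase is the difference between the DPD on day t and on day t − 7, which agrees with the maximum increase reported in §5 (28 202 MW − 22 599 MW = 5 603 MW). The function name `same_day_increases` and the two-week series are hypothetical and are not Eskom data.

```python
import numpy as np

def same_day_increases(dpd):
    """Return the positive same-day-of-the-week increases y_t - y_{t-7}."""
    y = np.asarray(dpd, dtype=float)
    diff = y[7:] - y[:-7]      # y_t - y_{t-7} for every day with a value 7 days earlier
    return diff[diff > 0]      # only positive increases are kept

# Hypothetical two-week series of daily peak demand in MW (illustration only).
dpd = [30000, 30500, 30400, 30300, 29800, 27000, 26000,
       30200, 30100, 30900, 30300, 30500, 26500, 26800]
increases = same_day_increases(dpd)
```

Days on which demand fell or stayed level relative to the same weekday one week earlier contribute nothing, which is why the number of increases is smaller than the number of daily observations.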

Modelling of peaks over a threshold
Two distributions are introduced to model the peaks over threshold.

The Generalized Single Pareto (GSP) distribution
Above a reasonably high threshold, τ, the tail of a GBG distribution can be approximated by a GP-type distribution [16]. The GP-type distribution, which is also called the GSP, is an approximation to the GPD, a POT distribution used in extreme value theory to model observations above a sufficiently high threshold. The POT distribution considered here is the GSP with one parameter η, which is the EVI. The survival function of the GSP distribution is given in expression (2), where x is the same day of the week increase in peak electricity demand as defined in §2 and τ is a threshold.
The EVI η is estimated using the Bayesian approach. In this study the maximal data information (MDI) prior [17] is used, as it provides maximal data information, is easy to implement and allows constraints to be built into the prior. The MDI prior is given in expression (3) and the likelihood function in expression (4), with $X_i \sim \mathrm{GSP}(\eta)$ for $i = 1, \ldots, N_\tau$, where $N_\tau$ is the number of observations above the threshold and $X_i$ denotes the same day of the week increase in peak electricity demand. The posterior density of η, which is proportional to the product of the prior and the likelihood, is given in expression (5).
The posterior predictive tail probability of a future observation, $X_0$, is given by expression (6), which cannot be computed analytically. It is approximated through simulation: values of η are simulated from the posterior density given in expression (5) and substituted into expression (6). The average over all tail probabilities is then used as an estimate of the posterior predictive tail probability.
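The simulation step described above amounts to a Monte Carlo average over posterior draws. As a sketch in generic notation, write $\bar F(x \mid \eta)$ for the GSP survival function, $\pi(\eta \mid \mathbf{x})$ for the posterior density of expression (5), and $M$ for the number of simulated values $\eta^{(1)}, \ldots, \eta^{(M)}$. These symbols are introduced here for illustration and may differ from the notation of the original expressions.

```latex
P(X_0 > x \mid \mathbf{x})
  = \int \bar F(x \mid \eta)\, \pi(\eta \mid \mathbf{x})\, d\eta
  \approx \frac{1}{M} \sum_{m=1}^{M} \bar F\bigl(x \mid \eta^{(m)}\bigr),
  \qquad \eta^{(m)} \sim \pi(\eta \mid \mathbf{x}).
```

The approximation improves as $M$ grows, since the average converges to the integral by the law of large numbers.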

The Generalized Pareto Distribution (GPD)
The GPD is a POT distribution which may be used to model the observations above a sufficiently high threshold. The GPD has two parameters: ξ, the shape parameter (EVI), and σ, the scale parameter. The survival function of the GPD is given by expression (7), which shows that when ξ < 0 the distribution has a finite upper endpoint: the excess over the threshold cannot exceed σ/|ξ|.
The two parameters, ξ and σ, are estimated jointly using a Bayesian approach. The MDI prior [17] is given in expression (8) and the likelihood function in expression (9), with $X_i \sim \mathrm{GPD}(\sigma, \xi)$ for $i = 1, \ldots, N_\tau$, where $N_\tau$ is the number of observations above the threshold and $X_i$ denotes the same day of the week increase in peak electricity demand. The joint posterior density of ξ and σ is given in expression (10). The two parameters are estimated by simulating values of σ and ξ from the posterior density given in expression (10) and taking the means of the simulated values as estimates. To simulate a set of (ξ, σ) values from the posterior density, the Gibbs sampling method is used. The Gibbs sampling procedure is discussed in the appendix. It involves simulating σ from its conditional density function given a fixed ξ. The EVI ξ is then simulated from its conditional density given the selected σ. This process is repeated a large number of times.
The posterior predictive density is given in expression (11). Since expression (11) cannot be computed analytically, we take the same approximation approach as with the GSP. Values of σ and ξ are simulated from the posterior density given in expression (10) and substituted into expression (11). The average over all the tail probabilities is again used as an estimate of the posterior predictive tail probability.
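The averaging step for the GPD can be illustrated as follows. This is a minimal sketch, assuming the standard textbook parameterization of the GPD tail probability, (1 + ξ(x − τ)/σ)^(−1/ξ), conditional on exceeding τ; the paper's expression (7) may be written differently. The function names and the idea of passing in arrays of posterior draws are assumptions made for illustration, and the sampler that would produce the draws is not shown.

```python
import numpy as np

def gpd_survival(x, tau, sigma, xi):
    """Standard GPD tail probability P(X > x | X > tau), valid for x >= tau."""
    z = (np.asarray(x, dtype=float) - tau) / sigma
    if abs(xi) < 1e-12:                        # exponential limit as xi -> 0
        return np.exp(-z)
    base = np.maximum(1.0 + xi * z, 0.0)       # zero beyond the upper endpoint when xi < 0
    return base ** (-1.0 / xi)

def predictive_tail_probability(x, tau, sigma_draws, xi_draws):
    """Average the tail probability over paired posterior draws of (sigma, xi)."""
    probs = [gpd_survival(x, tau, s, k) for s, k in zip(sigma_draws, xi_draws)]
    return float(np.mean(probs))
```

With a negative ξ the tail probability is exactly zero beyond τ + σ/|ξ|, so draws with ξ < 0 pull the predictive probability of very large increases towards zero.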

Empirical results
In this section numerical results are presented for the data supplied by Eskom.

Predicting future observations using the GSP distribution
We use the Pareto quantile plot given in Figure 3 to obtain the threshold on the same day of the week increases in peak electricity demand. The observation on the y-axis where the plot starts to follow a straight line is taken as the threshold [3]. In this case τ = exp(8) = 2980.958. The exceedances are those $x_t$ such that $x_t > \tau$, and the values $z_t = x_t - \tau$ are the excesses over τ. There are 44 points that exceed τ. The posterior density of η is given in Figure 4. The mode of the posterior density is taken as the estimate of η. In this study η = 0.1909 is used. The empirical cumulative distribution function (cdf) and theoretical cdf of the GSP with the estimated parameter value η = 0.1909 are constructed for observations above the threshold and shown in Figure 5(a).
From Figure 5(a) it can be seen that the theoretical cdf is close to the empirical cdf, especially at the larger observations, indicating the appropriateness of using the GSP to model the observations above the threshold.
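The threshold step in this subsection can be sketched as follows. The coordinates use the common textbook convention for a Pareto quantile plot, (−log(1 − i/(n+1)), log x_(i)), which is an assumption about how Figure 3 was drawn; the function names are hypothetical. The threshold value exp(8) is the one quoted in the text.

```python
import numpy as np

def pareto_quantile_coords(x):
    """Return (-log(1 - i/(n+1)), log x_(i)) for the ordered sample x_(1) <= ... <= x_(n)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    i = np.arange(1, n + 1)
    return -np.log(1.0 - i / (n + 1.0)), np.log(xs)

def excesses_over(x, tau):
    """Return the excesses x - tau for the observations that exceed tau."""
    x = np.asarray(x, dtype=float)
    return x[x > tau] - tau

tau = np.exp(8.0)   # threshold read off the log scale of the plot, about 2980.958 MW
```

Reading the threshold on the log scale explains why it is reported as exp(8): the plot becomes approximately linear from a log-increase of about 8 onwards.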
In order to assess whether the GSP distribution is identical to the empirical distribution of the exceedances, a QQ plot is constructed. The theoretical quantiles from the GSP with the estimated parameter are plotted against the empirical quantiles in Figure 5(b). If the plot lies on the 45° line it indicates a good fit, which is the case in Figure 5(b). The only exception is the last observation, which is slightly underestimated.

Predicting future observations using the GPD distribution
The posterior mean of ξ is −0.0272. This negative value indicates the existence of an upper bound, which is probably due to the fact that the same day of the week increase in peak electricity demand cannot exceed supply because the maximum electricity supply is fixed over the short run. The empirical cdf and theoretical cdf of the same day of the week increase in peak demand with the estimated parameter values are constructed for observations above the threshold and shown in Figure 6(b). It is evident that the theoretical cdf is close to the empirical cdf, indicating the appropriateness of using the GPD to model the observations above the threshold.
In order to assess whether the empirical distribution is identical to the GPD, a QQ plot of the same day of the week increase in peak demand above the threshold is constructed. The theoretical quantiles from the GPD with the estimated parameters are plotted against the empirical quantiles. If the two distributions being compared are identical, the QQ plot follows the 45° line y = x, as is the case in Figure 7.
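The quantities behind such a QQ plot can be sketched as follows. The quantile formula is the inverse of the standard GPD tail probability (1 + ξz/σ)^(−1/ξ) for an excess z, and the plotting positions i/(n+1) are one common convention; both are assumptions about how Figure 7 was produced, and the function names are hypothetical.

```python
import numpy as np

def gpd_quantile(p, sigma, xi):
    """Excess z with P(Z <= z) = p under the standard GPD with scale sigma and shape xi."""
    p = np.asarray(p, dtype=float)
    if abs(xi) < 1e-12:                          # exponential limit as xi -> 0
        return -sigma * np.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)

def qq_pairs(excesses, sigma, xi):
    """Return (theoretical quantiles, ordered excesses) for a QQ plot."""
    z = np.sort(np.asarray(excesses, dtype=float))
    n = z.size
    p = np.arange(1, n + 1) / (n + 1.0)          # plotting positions i/(n+1)
    return gpd_quantile(p, sigma, xi), z
```

Plotting the two returned arrays against each other and comparing with the line y = x gives the visual check described in the text.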

Comparative analysis and discussion
Selected posterior predictive tail probabilities for various future extreme same day of the week increases in peak electricity demand are given in Table 2. For the GSP, 1000 values of η were simulated and substituted into expression (6). Similarly, for the GPD, 1000 values of ξ and 1000 values of σ were simulated and substituted into expression (11). After substitution, the averages were calculated. Both the GSP and the GPD are a good fit to the data. One of the main advantages of the GSP is that it requires the estimation of only one parameter instead of two, as is the case with the GPD. The simultaneous simulation of two parameters can be difficult and time consuming in terms of programming and computing time.
The maximum extreme same day increase above the threshold of 2 981 megawatts (MW) is 5 603 MW. This is the difference between the DPD of Thursday 8 January 2009 (28 202 MW) and that of Thursday 1 January 2009 (22 599 MW). This huge increase is possibly due to weather changes and randomness in individual usage of electricity. The daily frequency of the occurrence of exceedances given in Figure 8 shows that Monday has the highest frequency of 15, followed by Friday with a frequency of 13. This shows that large increases are more likely to be experienced on Mondays. An assessment of the level and frequency of exceedances above the threshold of 2 981 MW by month is given in Figure 9, which shows that most of the large same day of the week increases above the threshold are found in the month of April. This analysis is important for both load forecasters and system operators of power utilities for planning, load flow analysis and scheduling of short-term electricity demand, particularly during periods of peak demand.
Figure 10 shows that same day of the week increases above 2 981 MW are volatile and that there is a weak linear trend. This high volatility poses challenges to load forecasters and system operators scheduling peak electricity demand, and calls for accurate predictions of extreme same day of the week increases in peak electricity demand. Over- or underestimation can be very costly to a power utility. Underestimation results in insufficient generation of electricity, which leads to unmet demand and load shedding. Overestimation results in wastage of financial resources in the construction of excess generating plants.
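Frequency summaries such as those in Figures 8 and 9 can be produced by tallying the exceedances by weekday and by month. The sketch below is an illustration only: the function name and the record layout (date, increase in MW) are assumptions, the first record is the maximum increase reported above, and the other two records are hypothetical.

```python
from collections import Counter
from datetime import date

DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def exceedance_frequencies(dated_increases, tau):
    """Count the increases exceeding tau by weekday name and by month number."""
    by_day, by_month = Counter(), Counter()
    for day, increase in dated_increases:
        if increase > tau:
            by_day[DAY_NAMES[day.weekday()]] += 1   # weekday() is 0 for Monday
            by_month[day.month] += 1
    return by_day, by_month

records = [
    (date(2009, 1, 8), 5603.0),   # maximum increase reported in the text
    (date(2009, 4, 6), 3100.0),   # hypothetical exceedance
    (date(2009, 4, 7), 1200.0),   # hypothetical increase below the threshold
]
by_day, by_month = exceedance_frequencies(records, tau=2981.0)
```

Increases at or below the threshold are ignored, so the counts describe only the tail of the distribution.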

Conclusion
An application of a GSP distribution, which has one parameter to estimate, is discussed in this paper. The GSP distribution is used to model the same day of the week increases in peak electricity demand using data from Eskom, South Africa's power utility company.
The GSP is used for estimating extreme tail quantiles and probability of exceedance values for various future extreme same day of the week increases in peak electricity demand. A comparative analysis is performed with a GPD. Although both the GSP and the GPD provide a satisfactory fit to the data, the use of the GSP is preferred since it has only one parameter to estimate instead of two, as is the case with the GPD.
Modelling of the same day of the week increases in peak electricity demand improves the reliability of a power network if an accurate assessment of the level and frequency of future extreme load forecasts is carried out. This also helps to relieve peak capacity loads [6]. Accurate forecasts used together with real-time dynamic electricity pricing programs, such as critical peak pricing, may help in shifting peak loads to off-peak periods [9,12].
Possible areas for future research include density forecasting of long-term peak electricity demand, which provides full probability distributions of possible future load forecasts [10]. Another area for development would be modelling the tail of the residual distribution to understand how extremes may affect forecasts [8].

Figure 1: Weekly seasonal index plot of DPD.

Figure 2: Same day of the week increases in demand. (a) Same day of the week increases in peak electricity demand. (b) Histogram of same day of the week increases in peak electricity demand.

Figure 3: Pareto quantile plot on the same day of the week increases in peak electricity demand.

Figure 6(a) shows a scatter plot of the simulated ξ's against σ's for the data. The threshold is kept at τ = exp(8) = 2980.958, as obtained from the Pareto quantile plot in Figure 3. The means of the simulated ξ's and σ's are calculated as −0.0272 and 740.5557 respectively and are considered as the estimates of ξ and σ. The value of ξ is negative, indicating the existence of an upper bound.

Figure 4: A plot of the posterior density of η.

Figure 5: Using the GSP distribution to predict future observations. (a) Graphical plot of the empirical cdf (dotted curve) and the cdf of the GSP (solid curve) on the exceedances. (b) QQ plot of sample data (same day of the week increases in peak demand above the threshold) versus a GSP distribution.

Figure 6: Using GPD distribution for future predictions.

Figure 7: A QQ plot of sample data (same day of the week increases in peak demand above the threshold) versus a GPD.

Figure 8: Daily frequency of occurrence of exceedances.

Figure 9: Monthly frequency of occurrence of exceedances.

Figure 10: Same day of the week increases above 2 981 MW.

Table 2: A selection of the posterior predictive tail probabilities.