Improving Power Grid Resilience Through Predictive Outage Estimation

Rozhin Eskandarpour, Amin Khodaei
Department of Electrical and Computer Engineering
University of Denver
Denver, CO 800, USA
rozhin.eskandarpour@du.edu, amin.khodaei@du.edu

Ali Arab
Data Management and Advanced Analytics
Protiviti Inc.
New York City, NY 009, USA
ali.arab@protiviti.com

Abstract

In this paper, in an attempt to improve power grid resilience, a machine learning model is proposed to predictively estimate component states in response to extreme events. The proposed model is based on a multi-dimensional Support Vector Machine (SVM) that considers the associated resilience index, i.e., the infrastructure quality level and the time duration that each component can withstand the event, as well as the predicted path and intensity of the upcoming extreme event. The outcome of the proposed model is component state data classified into two categories, outage and operational, which can be further used to schedule system resources in a predictive manner with the objective of maximizing resilience. The proposed model is validated using k-fold cross-validation and model benchmarking techniques. The performance of the model is tested through numerical simulations and based on a well-defined and commonly used performance measure.

Index Terms-- extreme events, machine learning, power grid resilience, predictive analytics.

I. INTRODUCTION

Predictive analytics and emerging applications of machine intelligence tools are shaping every aspect of our daily lives. Data has become the epicenter of modern decision making by policy makers, corporations, and enterprises. Utilities and local governments are facing increasing expectations from their customers and constituencies to respond effectively to the aftermath of catastrophic events such as hurricanes, which can affect the quality of life of communities and interrupt business continuity.
In this climate, resilience enhancement has become an important risk management measure in addressing these challenges. Resilience denotes the capability of a system to absorb and adapt to external shocks, an important characteristic expected from critical lifeline systems such as electric power grids [1]. There are several types of external shocks to the power grid, most notably extreme events, which include adverse weather events and natural disasters that are known to cause considerable negative impacts not only on the system itself but also on society in general [2]. Among these extreme events, hurricanes are known to be the most frequent in the United States, occurring mainly along the Atlantic Ocean and the Gulf of Mexico [2]. The devastating aftermath of these events calls for disruptive strategies to ensure that the power grid can still supply electricity to customers or, even if considerably impacted, can quickly bounce back from the contingency state to its normal operating condition. In this case, accurate forecasting of the likely hurricane impacts on the power grid can be of significant value, as it can be leveraged to achieve enhanced grid resilience. This paper proposes a machine learning based method for predicting the state of power grid components in response to upcoming hurricane strikes.

The concept of resilience for complex systems was originally introduced by Holling [3] in the field of ecology. Holling defined the resilience of a system as the rate and speed of returning to normal conditions after an extreme event. The intent of resilience study is to anticipate unexpected change due to failure, considering that systems have limits and gaps, and that the atmosphere in which they operate constantly affects them, both in design and through external shocks [4]. In [5], the significance of the geographic and cascading interdependencies associated with urban infrastructure is highlighted, and a general method to describe infrastructure interdependencies is proposed.
In [6], the impact of resilient systems on diminishing the probabilities of failure in urban infrastructure is analyzed. This concept was extended to other systems, including power grids. In [7], an approach for calculating the resilience of a single infrastructure and its components is proposed. In [8], a proactive resource allocation method aiming to repair and recover the power grid after extreme events is proposed. In [9] and [10], a proactive recovery framework for power grid components is introduced, which develops a stochastic model for operating the components prior to the event, followed by a deterministic recovery model for managing resources after the event. In [11], a restoration model based on power flow constraints is proposed, which identifies an optimal schedule using the macroeconomic concept of the value of lost load (VOLL) in order to minimize the economic loss due to load interruptions in the post-disaster phase. A decision-making model, based on the unit commitment solution and system configuration, is proposed in [12] to find the optimal repair schedule after a hurricane, in the restoration phase of a damaged power grid.

This work has been supported in part by the U.S. National Science Foundation under Grant CMMI-43477.
In [13], a power grid resilience index is proposed by analyzing the process of generation, transmission, and consumption of electricity in various countries. The geometric mean of several factors, such as the generation efficiency of non-renewable fuel dependence, the distribution efficiency, the carbon intensity, and the diversity, is used to develop the resilience index. However, an index for individual components in the system is not considered in the methodology. In [14], a methodology to calculate the resilience index of power delivery systems in post-event infrastructure recovery is proposed. A multi-infrastructure system including electric power delivery, telecommunications, and transportation is considered, and the resilience measures of fragility and quality are combined with the input-output model of these infrastructures. The proposed index is evaluated using data collected after the landfall of Hurricane Katrina to assess the resilience and interdependence of a multi-system networked infrastructure during natural extreme events. The study in [15] proposes a framework for resilience enhancement of urban infrastructure systems. The time-dependent expected resilience metric is built on the performance and response of the power grid following an extreme event. The process is performed in the stages of disaster prevention, damage propagation, and assessment and recovery. The hurricane resilience of electric power grids is quantified through a probabilistic modeling approach in [16], using a Poisson process model for hurricane occurrence, component fragility models, and a grid restoration model with component repair priority. The model is then calibrated using actual customer outage and power grid restoration data in Harris County, Texas, in the aftermath of Hurricane Ike in 2008.

In many problems, a closed formulation of the problem and its solution cannot be easily derived. Machine learning investigates algorithms that are capable of learning from data and making forecasts based on it.
These algorithms can categorize the observed data for classification (supervised learning), combine similar patterns for clustering (unsupervised learning), and predict the output of the system based on its past behavior and historical data (regression modeling) [17]. Machine learning approaches have been utilized in a considerable number of research efforts in the power and energy sector [18]. Security assessment is one of the most versatile machine learning applications in power grids, with applications ranging from pattern recognition [18] to decision tree induction and nearest neighbor classifiers [19], to name a few. Forecasting is another popular application of machine learning. A number of Artificial Neural Networks (ANNs) have been proposed for short-term load forecasting [20] and wind power forecasting [21]. Some other examples of machine learning applications in power grids include risk analysis using regression models, ANNs, and Support Vector Machines (SVMs) [22], distribution fault detection applying ANNs and SVMs [23], and power outage duration prediction using regression models and regression trees/splines [24].

This paper proposes a three-dimensional SVM [25] model to predict the potential power grid component outages in response to hurricanes. The proposed SVM is trained on historical data with three features, namely the resiliency index of the component, the distance of the component from the center of the hurricane, and the category of the hurricane, which is determined based on the wind speed. The rest of the paper is organized as follows: Section II presents the model outline and formulation of the proposed machine learning method for outage prediction. Section III presents simulation results on a test system, and Section IV concludes the paper.

II. PROBLEM STATEMENT

Resilience index is measured by considering the quality of a component and its performance in a time period before and after extreme events.
The state of a component during a hurricane can be considered as damaged (component is on outage) or operational (component is in service). In order to classify the damaged/operational state of each component, various features drawn from historical data can be used. In this paper, a three-dimensional SVM is utilized in order to determine the state of the components and the decision boundary between the damaged and operational data points. In the following, a brief introduction to the multi-dimensional SVM is provided, and the features that are used to determine the state of the components are discussed.

A. Support Vector Machines

SVM is a discriminative classifier that defines a separating hyperplane between two classes. The best hyperplane in SVM is the one with the widest gap between the classes, which decreases the risk of misclassification and improves the generalization of the classifier. This gap is usually referred to as the margin, and SVM seeks to maximize this margin between the classes. The details of SVMs are fully described in the literature [25], so only a brief introduction to SVM in three-dimensional space is presented in this section.

Consider training samples $x_i \in \mathbb{R}^3$, $i = 1, \ldots, m$, with class labels $y_i \in \{-1, +1\}$ in a binary classification problem. The linear decision function is $f(x) = \mathrm{sign}(w^T x + b)$, $x \in \mathbb{R}^3$, where $w$ is the weight vector, which defines a direction perpendicular to the hyperplane of the decision function, while $b \in \mathbb{R}$ is a bias which moves the hyperplane parallel to itself. The optimal decision function given by the support vectors is the solution of the following optimization problem:

$$
\min_{w,\, b,\, \varepsilon} \; \frac{1}{2}\|w\|^2 + c \sum_{i=1}^{m} \varepsilon_i
\quad \text{s.t.} \quad y_i \left( w^T x_i + b \right) \ge 1 - \varepsilon_i, \;\; \varepsilon_i \ge 0, \;\; i = 1, \ldots, m \tag{1}
$$

where $w$ is the normal vector to the hyperplane separating the training examples, $|b| / \|w\|$ is the perpendicular distance of the hyperplane from the origin, $\varepsilon_i$ is the slack variable of sample $i$, and $c$ is a penalty parameter.
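As a minimal sketch of the quantities just introduced, the snippet below evaluates the linear decision function f(x) = sign(w^T x + b) and the soft-margin objective for a fixed, hand-picked hyperplane. The weight vector, bias, penalty, samples, and labels are hypothetical values chosen only for illustration, and the slack of each sample is taken as the smallest value that satisfies the constraints, max(0, 1 - y_i (w^T x_i + b)). It does not solve the optimization problem; it only shows how a candidate (w, b) would be scored.

```python
# Hypothetical values for illustration only; none of them come from the paper.

def score(w, b, x):
    """Signed score w^T x + b of a sample x."""
    return sum(wk * xk for wk, xk in zip(w, x)) + b


def decide(w, b, x):
    """Linear decision function f(x) = sign(w^T x + b), returned as +1 or -1."""
    return 1 if score(w, b, x) >= 0 else -1


def soft_margin_objective(w, b, c, samples, labels):
    """Evaluate 0.5 * ||w||^2 + c * sum(slack_i) for a fixed (w, b).

    Each slack is the smallest value satisfying the margin constraint:
    slack_i = max(0, 1 - y_i * (w^T x_i + b)).
    """
    regularizer = 0.5 * sum(wk * wk for wk in w)
    slack_total = sum(
        max(0.0, 1.0 - y * score(w, b, x)) for x, y in zip(samples, labels)
    )
    return regularizer + c * slack_total


if __name__ == "__main__":
    # Three assumed features per sample, already scaled to [0, 1].
    w, b = [1.0, -2.0, 0.5], -0.25
    samples = [[0.9, 0.1, 0.2], [0.2, 0.8, 0.4]]
    labels = [1, -1]
    print([decide(w, b, x) for x in samples])
    print(soft_margin_objective(w, b, c=1.0, samples=samples, labels=labels))
```

A larger penalty c makes each unit of slack more expensive, which is why increasing it pushes the solution toward fewer training errors.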
When $c \to \infty$, the SVM does not allow any training errors (hard margin classification), and when $0 < c < \infty$, the model allows some training errors and can therefore handle examples that are not linearly separable (soft margin). This is a quadratic programming problem which can be solved in terms of the problem's Lagrange multipliers $\alpha \in \mathbb{R}^m$ as follows:

$$
\max_{\alpha} \; \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} \alpha_i \alpha_j y_i y_j \left( x_i \cdot x_j \right)
\quad \text{s.t.} \quad 0 \le \alpha_i \le c, \;\; \sum_{i=1}^{m} \alpha_i y_i = 0 \tag{2}
$$

In order to solve the dual problem, many analytical approaches are proposed in the literature, depending on the size of the dataset and memory limitation considerations. Sequential Minimal Optimization (SMO) [26] is one of the analytic approaches used to solve the quadratic programming (QP) problem (2) in many SVM toolboxes, such as the LIBSVM tool in MATLAB [27]. SMO breaks the QP problem into multiple smaller subproblems, which are then solved analytically. SMO picks two samples, finds the corresponding Lagrange multipliers, and repeats this process until reaching convergence (within a user-defined tolerance) or a maximum number of iterations. By solving the dual problem (2), the final hyperplane depends only on the support vectors (i.e., sample points that lie on the margin), and the SVM needs to find only the inner products between the test samples and the support vectors. Fig. 1 shows the support vectors and optimal hyperplane in a separable two-class classification of SVM. With regard to the objective of this paper, Fig. 1 also shows the support vectors and optimal hyperplane to separate outage from operational components based on the associated resiliency index, distance from the center of the hurricane, and the wind speed.

Figure 1. Support vectors and optimal margin in SVM.

The idea of the maximum-margin hyperplane discussed above is based on the assumption that the training data are linearly separable. To apply SVM to nonlinear data (which is often the case, especially for hurricane data), kernel methods [25] can be used. The idea of a kernel method (sometimes called the kernel trick) is to map the input space into a linearly separable feature space, usually of higher dimension, where linear classifiers can separate the two classes (Fig. 2). The kernel trick simply states that for all $x_i$ and $x_j$ in the input space, a certain function $k(x_i, x_j)$ can be expressed as the inner product of the images of $x_i$ and $x_j$ in another space. For example, a Gaussian kernel can be defined as:

$$
k(x_i, x_j) = \exp\!\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) \tag{3}
$$

where $\sigma$ is the parameter of the kernel defined by the user. In practice, the best kernel is found by experiment while adjusting the kernel parameters via a search method to minimize the error on a test set.

Figure 2. The kernel method in SVM. Linearly inseparable data in a two-dimensional space can be linearly separable in higher dimensions (three dimensions in this figure).

B. Component Features

A feature, in machine learning, is defined as an individual measurable property of a phenomenon being observed [17]. Selection of discriminating, independent, and informative features plays a critical role in the performance of the classification method. Various features can be defined to determine the state of the components in response to a hurricane strike. In [28], the wind speed and the distance of each component from the center of the hurricane are proposed as features describing the component response to a hurricane. Although these features are informative, they do not provide information about the component itself. The resilience index of components is also an important factor during weather-related events. Similar to [16], we quantify the hurricane resilience of the electric power grid using a probabilistic modeling approach. For the sake of illustration, only the Poisson process model of hurricane occurrence during a given time period, along with fragility models, is considered in this work. Other factors used in [16], such as DC power flow, power grid restoration, and component repair priority, are not considered in this index. However, the proposed model is a general framework and can be extended to other resilience indices.
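Returning to the kernel method of Section II-A, the sketch below shows how a Gaussian kernel could replace the inner product in the decision function, so that a prediction needs only kernel evaluations between a test sample and the support vectors. It assumes the common scaling exp(-||x - z||^2 / (2 sigma^2)); the support vectors, multipliers, labels, bias, and sigma are hypothetical placeholders, not values obtained from the paper's model.

```python
import math


def gaussian_kernel(x, z, sigma):
    """Gaussian kernel k(x, z) = exp(-||x - z||^2 / (2 * sigma^2)) (assumed scaling)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def kernel_decide(support_vectors, alphas, labels, b, x, sigma):
    """Kernelized decision: sign(sum_i alpha_i * y_i * k(x_i, x) + b)."""
    total = b
    for sv, alpha, y in zip(support_vectors, alphas, labels):
        total += alpha * y * gaussian_kernel(sv, x, sigma)
    return 1 if total >= 0 else -1


if __name__ == "__main__":
    # Two made-up support vectors in a three-feature space, one per class.
    support_vectors = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    alphas = [1.0, 1.0]   # hypothetical multipliers
    labels = [-1, 1]
    for point in ([0.9, 0.8, 1.0], [0.1, 0.0, 0.2]):
        print(point, kernel_decide(support_vectors, alphas, labels, 0.0, point, sigma=0.5))
```

A smaller sigma makes each support vector influence only its close neighborhood, which gives a more flexible boundary at the cost of a higher risk of overfitting; this is one reason the kernel parameter is usually tuned by search.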
Based on this, hurricanes are described by a Poisson process of constant rate $\lambda_h$, such that the time interval between successive hurricane events has an exponential distribution with probability density function

$$
f(t) = \begin{cases} \lambda_h e^{-\lambda_h t}, & t \ge 0 \\ 0, & t < 0 \end{cases} \tag{4}
$$

Similar to [16] and based on historical data from 1900 to 1999 [29], the annual occurrence rate of hurricanes is considered as λh = /7 per year, and the probability of a hurricane belonging to each category is respectively calculated as 0.53, 0.19, 0.15, 0.08, and 0.05. In this paper, we consider the resilience index for four component types: a) generation units, b) transmission lines, c) distribution lines, and d) substations. For their flexible analytical properties, similar fragility models following a normal distribution are considered for all four categories, with probabilities of low, moderate, severe, and complete damage. The resilience index is then taken as the average of the fragility model and the probability of the hurricane. The category of the hurricane, the distance of each component from the center of the hurricane, and the calculated component resilience index are investigated as the three main features to predict the state of each component in response to the hurricane.

C. Evaluation Metrics

There are various evaluation metrics in the literature to measure the reliability and acceptable performance of a classification method. Accuracy is the most common measure of any classification system and is commonly defined as the number of correct predictions divided by the total number of samples in the test set. Reporting the overall accuracy alone may not be sufficient, as the classes may not be balanced in the test set. To test the performance of the obtained decision boundary, the F-Score [30] is computed on the held-out historical data, defined as:

$$
F = \frac{2PR}{P + R} \tag{5}
$$

where P is the number of correct positive predictions divided by the total number of positive predictions (i.e., precision), and R is the number of correct positive predictions divided by the number of positive class values in the test data (i.e., recall). For example, in the case of the outage prediction problem, precision (P) is the number of correctly predicted outages divided by the total number of predicted outages, and recall (R) is the number of correctly predicted outages divided by the total number of actual outages. The F-Score is a value ranging from 0 to 1, where higher values represent a higher predictive power as a measure of acceptable performance of the obtained decision boundary.

To evaluate the performance of the classifier, usually a subset of the historical dataset is reserved as a holdout sample for model validation. k-fold cross-validation is a common validation technique for assessing the results of a classification system and evaluating how well it can generalize on a dataset [31]. In k-fold cross-validation, the dataset is randomly partitioned into k equal-sized subsamples. A single subsample is reserved as the validation/test set, and the other k-1 subsamples are used as training data for the model. This process is iterated k times (the number of folds), where each of the k subsamples is used only once for validation. The k results from the folds are then averaged to obtain a single estimate.

III. NUMERICAL SIMULATION

Scarcity of readily available datasets still remains a challenge for the research community and industry practitioners. However, the limited historical data on past extreme hurricanes at the component granularity level should not preclude methodological developments in critical areas, including machine learning systems. Therefore, in this paper, a synthetic set of 1000 sample data is generated to train the SVM model, with half of the samples in the outage state and the other half in the operational state. The generated samples follow a normal distribution function of the one-minute sustained wind speed of different Saffir-Simpson Hurricane Scale categories with a small Gaussian noise. The features are normalized to [0,1] based on the maximum considered values of wind speed and distance. Fig. 3 shows the generated synthetic data.

Figure 3. Generated synthetic data for SVM training and validation.

A k-fold cross-validation (k=5) is performed to measure the performance of the proposed model. Different kernels (linear, polynomial (quadratic and cubic), and Gaussian) with various penalty parameters (c = 0.01, 0.1, 1, 10, 100) are examined. Since the considered dataset is relatively small, an off-the-shelf SVM model implemented in LIBSVM [27] is used in this paper. In the proposed work, the SMO tolerance for convergence is set to 1e-3 and the maximum number of iterations is set to a large value (5000 iterations). In practice, since the considered dataset is relatively small, it converges in about 350 iterations for different folds.
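The k-fold procedure described in Section II-C can be sketched as follows. This is a generic illustration rather than the LIBSVM workflow used in the paper: `fit_and_score` is a hypothetical callback that would train a classifier on the training indices and return a score (for example the F-Score) on the held-out indices, and the fixed seed is an assumption made only so the split is repeatable.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Randomly partition the indices 0..n_samples-1 into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n_samples, k, fit_and_score, seed=0):
    """Use each fold once as the test set, train on the other k-1, and average the scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / k


if __name__ == "__main__":
    # Placeholder scorer: reports the held-out fraction instead of training a model.
    def held_out_fraction(train_idx, test_idx):
        return len(test_idx) / (len(train_idx) + len(test_idx))

    print(cross_validate(1000, 5, held_out_fraction))
```

With 1000 samples and k=5, each fold holds 200 samples, so every model is trained on 800 samples and evaluated on the remaining 200.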
Table I shows the average F-Score for various penalty parameters and kernel shapes. As shown, the SVM with Gaussian kernel and c = 1 offers the best performance among the settings. A third-order polynomial logistic regression model is also trained and examined in the same fashion (i.e., k-fold cross-validation with k=5) to predict the component outages. Table II compares the evaluation metrics of SVM with different kernels (using penalty parameter c = 1) and the third-order polynomial logistic regression model. As shown, among the trained models, the Gaussian kernel SVM had the best overall classification accuracy, with a precision of 0.893, a recall of 0.826, and an overall F-Score of 0.858. Comparing the result of logistic regression with the proposed SVM indicates that the proposed SVM approach has better performance in both accuracy and F-Score. Table III shows the confusion matrix of predicting components as operational and outage using the Gaussian kernel SVM. The proposed model can predict operational and outage states with accuracies of 90.2% and 82.6%, respectively.

TABLE I. AVERAGE F-SCORE OF SVM WITH VARIOUS PENALTY PARAMETERS C AND KERNELS USING 5-FOLD CROSS-VALIDATION

| Kernel    | c = 0.1 | c = 1 | c = 10 | c = 100 |
|-----------|---------|-------|--------|---------|
| Linear    | 0.845   | 0.845 | 0.846  | 0.846   |
| Quadratic | 0.858   | 0.856 | 0.855  | 0.857   |
| Cubic     | 0.855   | 0.854 | 0.840  | 0.754   |
| Gaussian  | 0.857   | 0.858 | 0.850  | 0.847   |
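As a quick arithmetic cross-check of the Gaussian-kernel results, the metrics can be recomputed from confusion-matrix counts. The counts below are our reading of Table III (451 and 49 for actual operational components, 87 and 413 for actual outages, which assumes 500 samples per class), with outage treated as the positive class; the function name is hypothetical.

```python
def outage_metrics(tn, fp, fn, tp):
    """Accuracy, precision, recall, and F-Score with 'outage' as the positive class."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)          # correct outage predictions / predicted outages
    recall = tp / (tp + fn)             # correct outage predictions / actual outages
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
        "operational_rate": tn / (tn + fp),  # share of operational components kept as operational
    }


if __name__ == "__main__":
    # Assumed counts: actual operational -> (451 operational, 49 outage),
    # actual outage -> (87 operational, 413 outage).
    for name, value in outage_metrics(tn=451, fp=49, fn=87, tp=413).items():
        print(f"{name}: {value:.3f}")
```

Under these assumed counts the pooled values come out at an accuracy of 0.864, a recall of 0.826, a precision of about 0.894, and an F-Score of about 0.859. The small gap to the reported 0.893 and 0.858 would be expected if the reported figures are averages over the five folds rather than values computed from pooled counts.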
TABLE II. COMPARISON OF THE PERFORMANCE OF SVM WITH VARIOUS KERNELS AND THE LOGISTIC REGRESSION METHOD

| Model         | Accuracy | Precision | Recall | F-Score |
|---------------|----------|-----------|--------|---------|
| Linear SVM    | 0.847    | 0.853     | 0.838  | 0.845   |
| Quadratic SVM | 0.863    | 0.898     | 0.818  | 0.856   |
| Cubic SVM     | 0.86     | 0.896     | 0.816  | 0.854   |
| Gaussian SVM  | 0.864    | 0.893     | 0.826  | 0.858   |
| Logistic Reg. | 0.809    | 0.815     | 0.798  | 0.806   |

TABLE III. CONFUSION MATRIX OF CLASSIFYING SYSTEM COMPONENTS USING GAUSSIAN KERNEL SVM (NUMBER OF SAMPLES AND PERCENTAGE)

| Actual \ Predicted | Normal      | Outage      |
|--------------------|-------------|-------------|
| Normal             | 451 (90.2%) | 49 (9.8%)   |
| Outage             | 87 (17.4%)  | 413 (82.6%) |

IV. CONCLUSION

Prediction of a component state in response to an extreme event is a challenging task in practice. In this paper, a three-dimensional SVM was proposed to categorize system components into two classes, damaged and operational, in response to an upcoming hurricane. The proposed SVM was trained on data with three features related to each grid component, i.e., the resilience index, the distance of the component from the center of the hurricane, and the category of the hurricane (the wind speed). A synthetic set of data was generated to train the SVM, as the publicly available data on the impact of hurricanes on power grid components is limited. Simulation results showed the effectiveness of the proposed model compared to the results obtained from logistic regression, a popular benchmark for two-class classification problems, and further demonstrated its acceptable performance in reaching high-accuracy estimations.

REFERENCES

[1] L. M. Branscomb, "Sustainable cities: Safety and security," Technol. Soc., vol. 8, no., pp. 5 34, 2006.
[2] Executive Office of the President, "Economic Benefits of Increasing Electric Grid Resilience to Weather Outages," Aug. 2013.
[3] C. S. Holling, "Resilience and stability of ecological systems," Annu. Rev. Ecol. Syst., pp. 3, 1973.
[4] E. Hollnagel, D. D. Woods, and N. Leveson, Resilience Engineering: Concepts and Precepts. Ashgate Publishing, Ltd., 2007.
[5] S. M. Rinaldi, J. P. Peerenboom, and T. K. Kelly, "Identifying, understanding, and analyzing critical infrastructure interdependencies," IEEE Control Syst., vol., no. 6, pp. 5, Dec. 2001.
[6] N. O. Attoh-Okine, A. T. Cooper, and S. A. Mensah, "Formulation of resilience index of urban infrastructure using belief functions," IEEE Syst. J., vol. 3, no., pp. 47 53, 2009.
[7] O. O. Aderinlewo and N. O. Attoh-Okine, "Assessment of a Transportation Infrastructure System using Graph Theory," 2013.
[8] A. Arab, A. Khodaei, S. K. Khator, K. Ding, V. Emesih, and Z. Han, "Stochastic Pre-hurricane Restoration Planning for Electric Power Systems Infrastructure," IEEE Trans. Smart Grid, vol. 6, no., pp. 046 054, 2015.
[9] A. Arab, A. Khodaei, Z. Han, and S. K. Khator, "Proactive Recovery of Electric Power Assets for Resiliency Enhancement," IEEE Access, vol. 3, pp. 99 09, 2015.
[10] A. Arab, A. Khodaei, S. K. Khator, K. Ding, and Z. Han, "Post-hurricane transmission network outage management," in Proc. IEEE Great Lakes Symp. Smart Grid New Energy Econ., 2013, pp. 6.
[11] A. Arab, A. Khodaei, S. K. Khator, and Z. Han, "Transmission network restoration considering AC power flow constraints," in 2015 IEEE International Conference on Smart Grid Communications (SmartGridComm), 2015, pp. 86 8.
[12] A. Arab, A. Khodaei, S. Khator, and Z. Han, "Electric Power Grid Restoration Considering Disaster Economics," 2016.
[13] L. Molyneaux, L. Wagner, C. Froome, and J. Foster, "Resilience and electricity systems: A comparative analysis," Energy Policy, vol. 47, pp. 88 0, 2012.
[14] D. A. Reed, K. C. Kapur, and R. D. Christie, "Methodology for assessing the resilience of networked infrastructure," IEEE Syst. J., vol. 3, no., pp. 74 80, 2009.
[15] M. Ouyang, L. Dueñas-Osorio, and X. Min, "A three-stage resilience analysis framework for urban infrastructure systems," Struct. Saf., vol. 36, pp. 3 3, 2012.
[16] M. Ouyang and L. Dueñas-Osorio, "Multi-dimensional hurricane resilience assessment of electric power systems," Struct. Saf., vol. 48, pp. 5 4, 2014.
[17] C. M. Bishop, "Pattern recognition," Mach. Learn., vol. 8, pp. 58, 2006.
[18] C. K. Pang, F. S. Prabhakara, A. H. El-Abiad, and A. J. Koivo, "Security Evaluation in Power Systems Using Pattern Recognition," IEEE Trans. Power Appar. Syst., vol. PAS-93, no. 3, pp. 969 976, May 1974.
[19] D. J. Sobajic and Y.-H. Pao, "Artificial neural-net based dynamic security assessment for electric power systems," IEEE Trans. Power Syst., vol. 4, no., pp. 0 8, 1989.
[20] A. D. Papalexopoulos, S. Hao, and T.-M. Peng, "An implementation of a neural network based load forecasting model for the EMS," IEEE Trans. Power Syst., vol. 9, no. 4, pp. 956 96, 1994.
[21] G. N. Kariniotakis, G. S. Stavrakakis, and E. F. Nogaret, "Wind power forecasting using advanced neural networks models," IEEE Trans. Energy Convers., vol., no. 4, pp. 76 767, 1996.
[22] S. D. Guikema, "Natural disaster risk analysis for critical infrastructure systems: An approach based on statistical learning theory," Reliab. Eng. Syst. Saf., vol. 94, no. 4, pp. 855 860, 2009.
[23] D. Thukaram, H. P. Khincha, and H. P. Vijaynarasimha, "Artificial neural network and support vector machine approach for locating faults in radial distribution systems," IEEE Trans. Power Deliv., vol. 0, no., pp. 70 7, 2005.
[24] R. Nateghi, S. D. Guikema, and S. M. Quiring, "Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes," Risk Anal., vol. 3, no., pp. 897 906, 2011.
[25] C. Cortes and V. Vapnik, "Support-vector networks," Mach. Learn., vol. 0, no. 3, pp. 73 97, 1995.
[26] J. Platt, "Sequential minimal optimization: A fast algorithm for training support vector machines," 1998.
[27] C.-C. Chang and C.-J. Lin, "LIBSVM: a library for support vector machines," ACM Trans. Intell. Syst. Technol. TIST, vol., no. 3, p. 7, 2011.
[28] R. Eskandarpour and A. Khodaei, "Machine Learning based Power Grid Outage Prediction in Response to Extreme Events," IEEE Trans. Power Syst., 2016.
[29] M. Ouyang and L. Dueñas-Osorio, "An approach to design interface topologies across interdependent urban infrastructure systems," Reliab. Eng. Syst. Saf., vol. 96, no., pp. 46 473, 2011.
[30] C. Goutte and E. Gaussier, "A probabilistic interpretation of precision, recall and F-score, with implication for evaluation," in European Conference on Information Retrieval, 2005, pp. 345 359.
[31] R. Kohavi and others, "A study of cross-validation and bootstrap for accuracy estimation and model selection," in IJCAI, 1995, vol. 4, pp. 37 45.