Energy Consumption Prediction for Optimum Storage Utilization

Eric Boucher, Robin Schucker, Jose Ignacio del Villar
December 12, 2015

Introduction

Continuous access to energy is essential for commercial and industrial sites across the United States to keep producing goods and providing services. However, the price of electricity provided by utilities varies throughout the day. In addition, commercial and industrial customers pay demand charges that are proportional to the maximum power drawn from the grid during each month. This rate structure makes these sites very sensitive to how they consume power on a minute-to-minute basis. Addressing this issue involves providing businesses that use solar energy and storage with a system that lets them choose, in real time, between using the electricity they have produced and buying from their utility provider. This is exactly what the startup elum does. Making this work requires accurate predictions of the electricity consumption of each site at a one- or two-day horizon. Using data provided to us by elum, we tried to solve this problem and help sites optimize the resources they spend on energy.

Much research has been done in the field of Short Term Load Forecasting, using extremely varied methods. The literature points in particular to linear regression (as in [1] and [2]), KNN (as in [3] and [4]), and especially neural networks ([5], [6], and [7]).

Main Objective

The main objective of this project is to accurately predict the next-day energy consumption of 100 businesses in the USA. The input data is the consumption over a certain period (up to one year) at 5-minute intervals, and we want to predict the consumption of the day after that period.
Admittedly, this goal is complex, as we need to predict 288 consumption values to cover a whole day (one prediction every 5 minutes). Nonetheless, we considered this goal challenging and engaging enough to tackle. To count as a good method, a model had to work well on all the different sites.

Methods and Algorithms used

Error Used

$$\mathrm{Error} = \frac{1}{N_{\mathrm{sites}}} \sum_{\mathrm{site}} \frac{(Y_{\mathrm{pred}} - Y)^{T} (Y_{\mathrm{pred}} - Y)}{Y^{T} Y}$$

Here $Y$ and $Y_{\mathrm{pred}}$ represent the true consumption and the estimated consumption respectively. This error, while not perfect, is normalized by $Y^{T} Y$ so that we do not favor sites that consume more energy on average (greater $Y$). It still gives us a good estimate of the overall error.

Linear Regression

Our first approach to the problem was to implement a simple linear regression model. We randomly separated our data points into a 70% training set and a 30% testing set. As features, we used what was available to us: the date. Thus, we trained a linear regression model using the weekday and the hour of the day:

$$\mathrm{Consumption}_{\mathrm{est}} = \sum_{i,j} \theta_{i,j} X_{i,j}$$

where i = 1..7 is the weekday (i.e. Monday, Tuesday...) and j = 0..23 is the hour of the day. For example, $X_{2,8} = 1$ if we want to predict a Monday between 8:00 am and 8:59 am, and $X_{2,8} = 0$ for any other time or day of the week combination.

This very simple model gave us an average test error of 11% across all sites. This is promising, since the algorithm models the consumption over the whole year, so any day can be forecasted using only its weekday as information. This model is our baseline: any more sophisticated time series model needs to beat this error to have potential. The difference between test and train error is very small (10.96% vs 10.87%), and we only use 24 × 7 = 168 features to predict over 100,000 points. Thus, we believe that our error comes mostly from high bias rather than high variance.
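As a rough illustration of the baseline and the error metric above, here is a minimal Python sketch. It relies on one fact about this feature set: because the weekday-hour indicators are one-hot and there is no intercept, the least-squares coefficient of each cell is simply the mean of the training targets that fall into it. The function names (`fit_weekday_hour_model`, `predict`, `site_error`, `mean_error`) and the toy data are assumptions made for illustration and are not taken from the authors' code. Python's `weekday()` numbers Monday as 0, which differs from the 1..7 indexing used in the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def fit_weekday_hour_model(timestamps, consumption):
    """Fit one coefficient per (weekday, hour) cell.

    With one-hot indicators and no intercept, the least-squares
    coefficient of a cell equals the mean of the targets in that cell.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, y in zip(timestamps, consumption):
        cell = (ts.weekday(), ts.hour)  # Monday is 0 in Python
        sums[cell] += y
        counts[cell] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


def predict(theta, timestamps):
    """Look up the fitted coefficient of each timestamp's cell."""
    return [theta[(ts.weekday(), ts.hour)] for ts in timestamps]


def site_error(y_true, y_pred):
    """Normalized squared error of one site: (Yp - Y)^T (Yp - Y) / (Y^T Y)."""
    num = sum((p - t) ** 2 for p, t in zip(y_pred, y_true))
    den = sum(t * t for t in y_true)
    return num / den


def mean_error(sites):
    """Average the per-site error over (y_true, y_pred) pairs."""
    errors = [site_error(y_true, y_pred) for y_true, y_pred in sites]
    return sum(errors) / len(errors)


# Toy data (made up): two weeks of 5-minute readings whose value depends
# only on the weekday and the hour, so the model can fit it exactly.
start = datetime(2012, 1, 2)
stamps = [start + timedelta(minutes=5 * k) for k in range(14 * 288)]
load = [50.0 + 2.0 * ts.hour - 10.0 * (ts.weekday() >= 5) for ts in stamps]

theta = fit_weekday_hour_model(stamps, load)
fitted = predict(theta, stamps)
toy_error = mean_error([(load, fitted)])  # evaluated on the training data
```

On real data the fit would be computed on the 70% training split and the error on the held-out 30%; the toy example skips the split to stay short.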
As a result, we need more features in order to reduce the error.

Figure 1: Linear regression model of a site that has an average error (blue = true consumption, red = modeled consumption)

More Features

We found site-specific weather data at 30-minute intervals (from NREL NSRDB), including solar irradiance, temperature, wind speed, relative humidity, and pressure, for 2012. Intuitively, weather, especially temperature and solar irradiance (if the site has solar panels), should play a large role in energy consumption. Adding only linear weather terms did not help much, as the test error dropped only to 10.7%. However, adding polynomial terms (especially $\mathrm{temp}^2$, $\mathrm{wind}^2$, $\mathrm{wind}^2 \cdot \mathrm{temp}$, ...) helped a lot and brought the test error down to 7.0%. Again, the test and train errors were very similar, so we are not overfitting. The distribution of errors can be found in Figure 2.

We also tried adding holiday data, but this resulted in overfitting on those days, as there are very few of them and we only have data for one year. Looking at school consumption in particular, the academic calendar has a huge impact: during the long summer break, electricity consumption is far lower than in any other month. However, adding this feature did not change the error significantly.

Figure 2: Error histogram for all sites using Linear Regression

K Nearest Neighbors

As mentioned in [3], a K nearest neighbor (KNN) algorithm seems promising for this application. We implemented a KNN in order to predict an entire day (selected randomly) of our data set.

The KNN algorithm works in the following way for each facility. Looking at past and future data, it stores the electricity consumption of P days (P*24*12 points) as a key and the electricity consumption of the day right after those P days as the value. Using all the data in the train set (353 days), we then have 353 - P (key, value) pairs stored in a table (see Figure 3).

Figure 3: KNN key-value pair

To predict the day in the test set, we look at the last P days of the train set (the query key) and compare that query key to the keys stored in our table. For each key in the table, the distance to our query key is the norm of (query key - key). We then select the K keys with the lowest distance. Our predicted value (a full day) is the inverse-distance-weighted average of the corresponding values, where i = 1 is the key closest to the query key, i = 2 the second closest, and so on:

$$\text{predicted value} = \frac{\sum_{i=1}^{K} \dfrac{\mathrm{value}(i)}{\mathrm{distance}(i)}}{\sum_{i=1}^{K} \dfrac{1}{\mathrm{distance}(i)}}$$

The variables that we can tune in this model are the number of days we look into the past to predict the next day (P = size of the key vector in days) and the number of neighbors we include in the prediction (K). As a test set, we tried to predict 20 random days that were never included in our training data. The lowest error we obtained is 13.5%, for P = 1 and K = 5. Varying these values, we see that the model becomes worse as P increases. The dependence on K is weaker, and any value around 5 produces similar errors. This method does not seem to work well on our dataset, as its best error is higher than that of linear regression without weather data.

Fourier and STD

For time series such as energy consumption, a standard approach is to use an algorithm that finds periodic patterns in the data and uses these patterns to predict the future. A standard algorithm in this case is Fourier analysis, which approximates the data by a sum of trigonometric functions. A more elaborate version is STD (Standard Trend Decomposition), which approximates the function by a sum of periodic functions with additional seasonal and trend functions with lower or no periodicity. This gives more flexibility in our case, as it allows the algorithm to take into account rising demand or significant changes in equipment. We implemented both algorithms and, unsurprisingly, STD performs systematically better than simple Fourier analysis. However, we found that our error on the test set (i.e. the day to be predicted) varied significantly with the number of days taken into account during training, and more data points are not always better. We found that, on average, two weeks of data yields the best results (i.e. the lowest test error, see Figure 5). Our first hypothesis is that future points are more similar to data points from a few days ago than to those from a long time ago, as weather patterns generally play out on longer time scales.

Figure 4: Best and Worst STD Forecasts, E = 0.12*10e-3 vs E = 0.89

Figure 5: STD Error vs Training Size

Neural Nets - LSTM

Building on the literature on neural nets for time series prediction, we decided to implement an LSTM. LSTM stands for Long Short-Term Memory, a kind of Recurrent Neural Network that can learn from experience thanks to memory gates (see Figure 6).

Figure 6: Long-Short-Term-Memory

Using the Python package Keras [8], we first implemented an LSTM with a tanh activation followed by a linear activation for the output, starting with a one-hidden-layer LSTM model with a time step of 7. The model takes a sequence of 7 days stepwise and outputs the prediction for the following day. The variables that we can tune in this model are the number of days we look into the past to predict the next day (T), the number of hidden layers (H), and the number of training epochs (e). We chose T = 7, as it gives a complete view of a week, and e = 300 with dropout to avoid overfitting. We then tried to select the best possible value for H. Unfortunately, the time series of different sites behaved very differently. H = 300 gave good results overall and even outperformed STD on some sites, but gave extremely bad results for others.
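To make the input layout for such a sequence model concrete, the following sketch shows one way a series of 5-minute readings could be arranged into (7-day sequence, next-day target) pairs before being passed to a recurrent network. It covers only the data preparation, not the network itself. The helper names `split_into_days` and `make_windows` and the toy series are hypothetical and are not the authors' code.

```python
POINTS_PER_DAY = 288  # 24 hours * 12 five-minute readings per hour


def split_into_days(series, points_per_day=POINTS_PER_DAY):
    """Cut a flat series into consecutive whole days (leftover points dropped)."""
    n_days = len(series) // points_per_day
    return [
        series[d * points_per_day:(d + 1) * points_per_day]
        for d in range(n_days)
    ]


def make_windows(days, steps=7):
    """Pair every run of `steps` consecutive days with the day that follows it."""
    inputs, targets = [], []
    for start in range(len(days) - steps):
        inputs.append(days[start:start + steps])  # steps x 288 values
        targets.append(days[start + steps])       # 288 values to predict
    return inputs, targets


# Toy series (made up): 10 days where every reading equals the day index.
series = [float(k // POINTS_PER_DAY) for k in range(10 * POINTS_PER_DAY)]
days = split_into_days(series)
inputs, targets = make_windows(days, steps=7)
```

With 10 days and a window of 7, this yields 3 training pairs; each input holds 7 daily vectors of 288 values and each target is the 288-value vector of the following day.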
Feature selection was therefore a tough process, and our hope of finding a universal algorithm did not seem very realistic in the case of neural nets. Adding more weather data did not improve our results significantly, although our results on linear regression suggest that including higher-order terms or using a deeper neural net might help. We then moved on to different models, feeding the LSTM each five-minute value individually rather than a daily vector, with a step size of 500. Despite being a more classical approach, it did not return significant results. We suspect that we did not have enough data to really grasp the trends, and admittedly we were asking a lot of our model. Indeed, there is tremendous variability at the 5-minute level, even between two days that look extremely similar from a distance.

Lastly, we tried a simple feed-forward neural network on a 5-minute basis, but the results were inconclusive, likely due to error propagation, which is not surprising.

Figure 7: Error vs Hidden Layer on LSTM

Figure 8: Histogram of Errors for H = 300

Figure 9: Prediction of consumption of site 6 with LSTM

Limitations of Models and Next Steps

STD is the model that works best (4.7% on average across sites) even though it does not use any weather data (see figures ?? and ??). This makes sense because STD does not try to model the whole year: it only models the last two weeks and predicts the next day from that. Furthermore, STD seems to underestimate consistently (but with the right shape), which could be why this error is still very high. We think that this underestimation could be alleviated by augmenting our STD with weather data, and this would be our main goal for future work. Interestingly, it is hard to tweak the model into a one-size-fits-all algorithm, as very regular sites have variance issues while less regular ones have bias issues.

Another next step would be to improve our linear regression model so that it only models the few weeks before our query day, rather than all the data we have, in order to reduce the bias of the model. This would be fast to implement, but we decided to explore other methods rather than the classic linear regression.

As mentioned before, the neural network that we implemented has limitations, as shown by its bad performance on some sites despite good performance on others. We believe that the key reason is the lack of data (only one year) relative to the daunting task at hand, predicting a vector of 288 values.

Conclusions

Neural nets (LSTM), while promising, give good results on 84 sites, but on the remaining 16 we get extremely bad results (error > 1). Linear regression actually performs better than the LSTM overall, except on 5 sites.
Adding weather data is crucial to get an error lower than 10%, and currently our best algorithm for predicting consumption is linear regression. Adding weather features to a linear regression model is straightforward, and we were able to integrate all the weather data we had into our model. In contrast, as STD and KNN are time series prediction models, augmenting them with weather features was more challenging, and we were not able to implement a solution that leveraged all of the weather information at our disposal.
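For reference, the key-value construction and the inverse-distance-weighted prediction described in the K Nearest Neighbors section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `build_table` and `knn_predict` are hypothetical, the Euclidean norm is assumed for the distance, and the small `eps` term is an added guard against division by zero when a stored key matches the query exactly.

```python
import math


def build_table(days, p):
    """Store each run of p consecutive days as a key and the next day as its value."""
    table = []
    for start in range(len(days) - p):
        # Flatten the p daily vectors into one key vector.
        key = [x for day in days[start:start + p] for x in day]
        table.append((key, days[start + p]))
    return table


def knn_predict(table, query_key, k, eps=1e-12):
    """Inverse-distance-weighted average of the values of the k nearest keys."""
    nearest = sorted(
        ((math.dist(query_key, key), value) for key, value in table),
        key=lambda pair: pair[0],
    )[:k]
    # eps is an assumption added here to avoid dividing by a zero distance.
    weights = [1.0 / (distance + eps) for distance, _ in nearest]
    total = sum(weights)
    n_points = len(nearest[0][1])
    return [
        sum(w * value[j] for w, (_, value) in zip(weights, nearest)) / total
        for j in range(n_points)
    ]


# Toy data (made up): days of two readings each that alternate between
# two profiles, so a day like [1, 1] is always followed by [2, 2].
days = [[1.0, 1.0], [2.0, 2.0], [1.0, 1.0], [2.0, 2.0], [1.0, 1.0]]
table = build_table(days, p=1)
forecast = knn_predict(table, query_key=days[-1], k=2)
```

With 353 training days, `build_table` produces 353 - P pairs, matching the count given in the text. Any weather features would have to be appended to the key vectors and scaled against the consumption values, which hints at why extending this model was less direct than extending the regression.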

References

[1] Amral, N.; Ozveren, C.S.; King, D., Short Term Load Forecasting using Multiple Linear Regression, Universities Power Engineering Conference, 2007. UPEC 2007. 42nd International, pp. 1192-1198, 4-6 Sept. 2007
[2] Papalexopoulos, A.D.; Hesterberg, T.C., A regression-based approach to short-term system load forecasting, Power Systems, IEEE Transactions on, vol. 5, no. 4, pp. 1535-1547, Nov 1990
[3] Al-Qahtani, F.H.; Crone, S.F., Multivariate k-nearest neighbour regression for time series data: A novel algorithm for forecasting UK electricity demand, Neural Networks (IJCNN), The 2013 International Joint Conference on, pp. 1-8, 4-9 Aug. 2013
[4] Troncoso Lora, A.; Riquelme Santos, J.M.; Riquelme, J.C.; Gómez Expósito, A.; Martínez Ramos, J.L., Time-Series Prediction: Application to the Short-Term Electric Energy Demand, Current Topics in Artificial Intelligence, Springer Berlin Heidelberg, vol. 3040, 2004
[5] Hippert, H.S.; Pedreira, C.E.; Souza, R.C., Neural networks for short-term load forecasting: a review and evaluation, Power Systems, IEEE Transactions on, vol. 16, no. 1, pp. 44-55, Feb 2001
[6] Lee, K.Y.; Cha, Y.T.; Park, J.H., Short-term load forecasting using an artificial neural network, Power Systems, IEEE Transactions on, vol. 7, no. 1, pp. 124-132, Feb 1992
[7] Bakirtzis, A.G.; Petridis, V.; Kiartzis, S.J.; Alexiadis, M.C., A neural network short term load forecasting model for the Greek power system, Power Systems, IEEE Transactions on, vol. 11, no. 2, pp. 858-863, May 1996
[8] keras.io, Keras: Deep Learning library for Theano and TensorFlow, Last accessed: December 10th 2015