Res. Lett. Inf. Math. Sci., 2004, Vol. 6, pp 15-29
Available online at http://iims.massey.ac.nz/research/letters/

Equity trend prediction with neural networks

R. HALLIDAY
Institute of Information & Mathematical Sciences
Massey University at Albany, Auckland, New Zealand
Russell.Halliday.1@uni.massey.ac.nz

This paper presents results of neural network based trend prediction for equity markets. Raw equity exchange data is pre-processed before being fed into a series of neural networks. The use of Self Organising Maps (SOM) is investigated as a data classification method to limit neural network inputs and training data requirements. The resulting primary simulation is a neural network that can predict whether the next trading period will be, on average, higher or lower than the current. Combinations of pre-processing and feature-extracting SOMs are investigated to determine the more optimal system configuration.

1 Introduction

Prediction of financial markets has long been a holy grail in the minds of equity investors. With the advent of powerful computers, much attention has been focused on this field. The ability of neural networks to learn from training data has not been overlooked, and as such neural networks have been applied to a range of trading market applications, from equity markets to currency markets. Contrasting this is a level of scepticism surrounding the ability of any system to predict future prices of trading markets [2].

Equity market prices depend on many influences. Key factors that influence future equity prices can be broadly divided into quantitative and qualitative types. Primary quantitative factors include open, high, low, close and volume data for individual equities, market segments (equity groups), indexes and exchange markets as a whole. Qualitative factors include socio-economic, political, international, regional and performance factors, to name but a few. Due to the difficulty in accurately retrieving and quantifying historical qualitative factors, network inputs used in the model presented here have been confined to readily available quantitative data.
However, from quantitative factors the key qualitative factor of market sentiment can be derived. Market sentiment tells us if the market is bullish, where high confidence and rising prices prevail, or bearish, where there is a lack of investor confidence and prices are in decline. Thus historical data quantitatively reflects qualitative market sentiment to some extent, which in turn should give an indication of future price movements.

Simulation data was sourced from Yahoo! Finance [2]. The data used for network training and verification comprises daily figures for individual equities listed on the New York Stock Exchange (NYSE) from January 1st 1985 to December 31st 2000.

Traditional trading systems are almost exclusively mechanical systems that apply mathematical formulae to securities data to produce ternary buy, sell and hold indicators. The weakness of this traditional approach is that the trading system must be programmed to make explicit use of certain trading rules. Conversely, it is hypothesised that neural networks should be able to learn from training data and in turn make use of information that is intrinsically present in the input data set.

The primary aim of this paper was the creation of a neural network that, given a set of historical daily data, is capable of predicting the direction of the future price trend. The price trend prediction simply
being whether market prices would, on average, increase or decrease relative to a subset of the daily training data. In the simulation carried out in this paper, ranges of network-parameter combinations were tested in order to determine the best arrangement for the network.

2 Design Considerations

Many considerations must be examined in developing a neural network. Attention needs to be given to these design considerations before beginning the network implementation phase. Design considerations include feasibility of the proposed network, data collection limitations, pre-processing and training. All of these aspects influence the effectiveness of the desired goals. Ultimately it must be remembered that neural networks are the tool or vehicle, but the objective must be both measurable and viable given available data and computational limitations.

2.1 Feasibility

Feasibility is an important factor in the assessment of any neural network project [3]. A primary feasibility aspect is that of the information inherent in the training data. Like any statistical tool, neural networks are limited by the intrinsic information in the input data. It is not possible to predict information that is not reflected in the training set. A simple example of this is real-world events, such as company announcements, that due to a lack of pertinent information beforehand cannot be anticipated by the market. Even in isolation from external factors, it is unrealistic to presume that any set of historical data points inherently contains the information required for precise prediction of future market prices. At best a trading system, be it mechanical or artificial intelligence by design, can only aim to maximise pertinent information extraction from a historical data set. With a neural network approach we can at best hope the system derives information otherwise obscured in the training set.

2.2 Data Collection and Adjustment

As stated previously, Yahoo! Finance [2] was selected as the source of market information. Yahoo! Finance is a well-established and reliable source of equity prices on markets around the world.
Common occurrences in equity markets are so-called stock splits and reverse splits. Price data has been pre-adjusted to reflect these price abnormalities. Without such correction, trading data would exhibit unexpected price jumps up and down for reverse splits and splits respectively. For example, if a stock selling at $2 were split on a 2:1 basis then a downward price jump of $1 would be shown after the split, while the opposite scenario, with a jump from $2 to $4, would be true for a reverse split. Without adjustment this would create abnormalities in the training data, thereby adversely affecting the training and performance of the neural network.

Training data length is important in order to correctly assess the system's ability to achieve accurate predictions across a reasonable range of market conditions. 10-20 years is considered a reasonable range for system assessment [1], and in accordance with this the length of historical data used for this paper was 15 years. The 15 years of data across stocks listed on the New York Stock Exchange should provide balanced data with sufficient predictive power to forecast price trend movements, as given in [3].

2.3 Pre-processing

Pre-processing data is essential to the learning and subsequent predictive ability of a neural network. Consider Figure 1. If a network were trained over section A of the data then it would be unlikely to be able to generalise for the data covered in section B. The solution to this problem is the use of simple first-order differences (equation 1) [4].

δ(k + 1) = x(k + 1) - x(k)        (1)
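As an illustration, the first-order differencing of equation (1) can be sketched in a few lines. This is an illustrative Python fragment (the paper's own simulations were written in Matlab), and the function name is ours, not the author's:

```python
def first_differences(x):
    """Equation (1): the series of day-to-day changes delta(k+1) = x(k+1) - x(k)."""
    return [x[k + 1] - x[k] for k in range(len(x) - 1)]

prices = [10.0, 10.5, 10.2, 11.0]
# Three day-to-day changes: 0.5, -0.3, 0.8 (up to floating-point rounding).
print(first_differences(prices))
```

Differencing removes the absolute price level, so sections A and B of Figure 1 become comparable to the network even though they sit at entirely different levels.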
Figure 1: Example of difficulty in generalising from raw training data. Generalisation of section B from A is difficult given the entirely different trend of the data.

Normalisation is a key part of data pre-processing for neural networks and should enable the network to more accurately predict future price trends. Consider Figure 2.

Figure 2: Normalisation of Source Data

In the above figure both equities follow the same price changes but on a different scale of magnitude. The important feature of the data is the relative changes in daily stock prices rather than the absolute stock price [5]. Normalisation yields an identical price graph for both. Before normalisation both data sets exhibit the same qualities at different orders of magnitude. By normalising the data, the trend prediction neural network can be trained to identify generic trends in data rather than specific data arrangements.

Another form of pre-processing implemented in this paper is logarithmic scaling. Logarithmic scaling makes better use of the input data scope by evenly spreading data across the input range. This is a useful approach for reducing the effect of outliers [4]. Due to the sequential nature of equity price data and the importance of inter-day price changes, outlier elimination as outlined in [3] was not considered.
Pre-processing can greatly influence the effectiveness of a network. Careful pre-process selection can increase the success of a network's output. In line with this, various combinations of pre-processing were trialled. The various pre-processing combinations are further explored under the implementation details (Section 3).

2.4 Dimensionality Reduction and Noise

The number of training examples required increases with the number of network weights. This sometimes-exponential increase puts strain on both data collection and computation requirements [3]. Because of this, a Self Organising Map (SOM) stage was introduced to reduce the number of inputs to the neural network. This methodology groups input data into classifications according to data similarity, which in turn limits the number of input weights required.

The addition of noise to neural network training data helps to reduce the risk of overtraining, therefore allowing the network to generalise. While raw market data is inherently noisy, data passed into the neural network from the optional SOM stage was assessed with various amounts of post-processing, including both the addition of random noise as well as normalisation.

2.5 Training

Neural networks can suffer from overtraining. By overtraining a network, its ability to generalise is diminished. This phenomenon is common to all forms of neural network training, in addition to human tuning of mechanical trading systems [1]. A separate set of data was used for network error verification, with network training being stopped once the error on the verification set begins to increase [6].

3 Network Implementation and Architecture

The network architecture implemented in this paper is shown in Figure 3.

Figure 3: Neural Network Architecture

This basic design was proposed and implemented in [4]; however, the application has been transferred from international exchange markets to equity markets. The basic model has been extended in the ability of the network to undertake various types and combinations of neural network components and parameters.
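To make the SOM classification stage of Section 2.4 concrete, the toy sketch below trains a one-dimensional SOM in pure Python and uses the index of the winning node as the class label fed onwards. This is an illustrative stand-in only: the actual simulations were programmed in Matlab, and every name, parameter and value here is our assumption rather than the paper's configuration.

```python
import math, random

def train_som_1d(data, n_nodes=4, epochs=100, lr0=0.5, seed=0):
    """Train a toy 1-D self-organising map on a list of equal-length feature vectors.

    Returns the node weight vectors. Learning rate and neighbourhood radius
    decay linearly over the epochs (illustrative schedule, not the paper's).
    """
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                      # decaying learning rate
        radius = max(1.0, (n_nodes / 2) * (1 - epoch / epochs))
        for x in data:
            # Best-matching unit: the node closest to the sample.
            bmu = min(range(n_nodes),
                      key=lambda i: sum((nodes[i][j] - x[j]) ** 2 for j in range(dim)))
            for i in range(n_nodes):
                # Gaussian neighbourhood: nodes near the BMU move most.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for j in range(dim):
                    nodes[i][j] += lr * h * (x[j] - nodes[i][j])
    return nodes

def som_classify(nodes, x):
    """Index of the winning node: the class label that replaces the raw input vector."""
    return min(range(len(nodes)),
               key=lambda i: sum((nodes[i][j] - x[j]) ** 2 for j in range(len(x))))
```

After training, each raw input vector is replaced by a single small integer, which is how the SOM stage limits the number of network input weights.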
Pre-processing has been extended to include a wider range of alternatives. The SOM stage has been trained with a broad range of dimensions, in addition to a stage of post-processing before SOM data is fed into the main neural network. The neural network phase was simulated with both standard feed-forward and Elman network types.

The trend prediction neural network was programmed in Matlab and took 23 training parameters. A wrapper batch program was written that called the trend prediction neural network training program with various combinations of parameters. This design allowed better control of trend prediction in addition to a separation of functionality.

3.1 Data Pre-processing

Due to the inherently noisy nature of raw data, in addition to neural network limitations such as the curse of dimensionality, some level of pre-processing is required. In this implementation up to two stages of pre-processing were applied. To determine the effectiveness of various combinations of pre-processing, each stage could also be turned off, in effect acting in a simple pass-through manner.

Stage one of pre-processing calculates simple or logarithmic differences between days. A simple difference calculates the change between the current and previous days' prices (equation 2). The logarithmic
difference performs a log transformation on the simple difference (equation 3). Logarithmic scaling was also undertaken on a simple difference between the current and first days' trading (equation 4). Again, a simple pass-through mechanism was allowed (equation 5).

h5(i) = x(i) - x(i - 1)        (2)

h3(i) = sgn(σ(i)) · log(|σ(i)| + 1),  where σ(i) = x(i) - x(i - 1)        (3)

h2(i) = sgn(σ(i)) · log(|σ(i)| + 1),  where σ(i) = x(i) - x(1)        (4)

h1(i) = x(i)        (5)

Stage two of pre-processing implemented pre-process normalisation (equation 6).

x̄(i) = (x(i) - mean(X)) / stddev(X),  where X = x_1, x_2, ... x_chunksize        (6)

The combination of these two stages allows for a total of 10 combinations of pre-processing. Testing of all combinations allowed for determination of the more optimal form of pre-processing.

3.2 Self Organising Map & Post-Processing

A feature-extracting Self Organising Map stage was optionally used to cluster similar input data. This allows for a large reduction in the number of inputs to the neural network. Each SOM works on one set of data only. SOMs were used on a variety of combinations of source data and pre-processing functions.

Self Organising Maps are able to organise input data into similar groups. This organisation reduces the number of neural network inputs, in turn limiting the effects of the curse of dimensionality. The size of the SOM was varied between 1-by-4, 1-by-8, 5-by-5 and 8-by-8. The SOM stage of the network could also be optionally removed.

Two SOM post-processing phases were added to the network design. The first phase introduced noise to the SOM output, while the second optionally normalised the output of the SOM to a standard deviation of one and mean of zero, similar to the transformation performed in equation (6).

3.3 Neural Network

The two varieties of neural network tested were feed-forward back-propagation and Elman back-propagation, representing non-recursive and recursive neural network types respectively.
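The two pre-processing stages of Section 3.1 (equations 2-6) can be sketched as follows. This is an illustrative Python rendering, not the paper's Matlab code, and the function names are ours:

```python
import math

# Stage-one options (equations 2-5).
def simple_diff(x, i):
    """Equation (2): change from the previous day."""
    return x[i] - x[i - 1]

def log_diff(x, i):
    """Equation (3): sign-preserving log scaling of the day-to-day change."""
    s = x[i] - x[i - 1]
    return math.copysign(math.log(abs(s) + 1), s) if s else 0.0

def log_diff_origin(x, i):
    """Equation (4): sign-preserving log scaling of the change from the first day."""
    s = x[i] - x[0]
    return math.copysign(math.log(abs(s) + 1), s) if s else 0.0

def passthrough(x, i):
    """Equation (5): raw value, i.e. the stage turned off."""
    return x[i]

# Stage two (equation 6): normalise a chunk to zero mean and unit standard deviation.
def normalise(chunk):
    m = sum(chunk) / len(chunk)
    sd = (sum((v - m) ** 2 for v in chunk) / len(chunk)) ** 0.5
    return [(v - m) / sd for v in chunk] if sd else [0.0] * len(chunk)
```

Each stage-one option can be composed with stage-two normalisation turned on or off, which is what produces the combinations tested in Table 1.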
The input layer to the neural network was varied to include a number of data chunks, either fed from the SOM or, where no SOM was used, bypassing the SOM stage. These data chunks were allowed to be overlapping. Each chunk is created by taking a sample of chunksize days. An increment factor allows chunks to overlap, i.e. if the increment factor is smaller than the chunksize then the next chunk will overlap by (chunksize - increment) days. For example, three chunks of increment 10 and size 20 would cover 40 days, as illustrated in Figure 4.
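A minimal sketch of this overlapping-chunk scheme (again in illustrative Python rather than the paper's Matlab; the function name is ours):

```python
def make_chunks(x, n_chunks, chunksize, increment):
    """Overlapping windows over a daily series.

    Consecutive chunks overlap by (chunksize - increment) days
    whenever increment < chunksize.
    """
    return [x[k * increment : k * increment + chunksize] for k in range(n_chunks)]

days = list(range(1, 41))              # 40 trading days, as in Figure 4
chunks = make_chunks(days, 3, 20, 10)  # three chunks of size 20, increment 10
# chunks cover days 1-20, 11-30 and 21-40: a 10-day overlap between neighbours.
```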
Figure 4: Three chunks of 20 days with 10-day increment

The number-of-chunks and chunksize parameters were kept constant at 4 and 10 respectively for the simulation.

The Elman network was chosen as suggested by [4]. Elman networks have feedback to all hidden nodes from each hidden node (Figure 5). For comparative reasons a standard feed-forward network was also used in the simulation. A feed-forward network has no internal state memory, therefore its predictive ability is limited to the data provided to the inputs at the current time instance.

Figure 5: Elman Network

Various parameters for the neural networks were experimented with during training. Only one layer of hidden nodes was used in the experimentation. The number of nodes in the hidden layer was varied between 3, 5 and 10. The transfer function used by the neural network was tansig (Figure 6). The transfer function was not modified during the training of the network.
Figure 6: Tan-Sigmoid Transfer Function

3.4 Result Analysis

The network was trained to predict if a defined number of days ahead of time would be, on average, higher or lower than reference days in the training set. Thus the network is only expected to predict a higher or lower result from the training data. Outputs of the neural network were tested in both singular and dual arrangements.

The singular configuration has only one output from the neural network; this output is set to near-one for a higher prediction and near-zero for a lower prediction (equation 7). A singular-output prediction is classified as correct if the predicted and actual future values are both higher or both lower than 0.5 (equation 8).

T_x = { 0.8,  D_x > D_(x-1)
      { 0.2,  D_x <= D_(x-1)        (7)

where
D_x = daily price data for day x
T_x = target trend direction for day (x-1) to day x

Correct = { 1,  R > 0.5 and T > 0.5
          { 1,  R <= 0.5 and T <= 0.5        (8)
          { 0,  otherwise

where
R_x = predicted trend direction for day (x-1) to day x

Dual outputs were simply set to alternate values (one high, one low) depending on whether the target price was higher or lower than the reference days (equation 9). Results were classified as correct if the actual and predicted results were of the same polarity (equation 10). The target and predicted trends are labelled primary and secondary for reasons of clarity.

U_x = { 0.2,  D_x > D_(x-1)
      { 0.8,  D_x <= D_(x-1)        (9)

where
U_x = secondary target trend direction for day (x-1) to day x
D_x = daily price data for day x

Correct = { 1,  R > S and T > U
          { 1,  R < S and T < U        (10)
          { 0,  otherwise
where
a correct prediction is defined as 1 and an incorrect prediction as 0, and
R_x = primary predicted trend direction for day (x-1) to day x
S_x = secondary predicted trend direction for day (x-1) to day x
T_x = target trend direction for day (x-1) to day x

For comparative purposes, simple mechanical experimentation was carried out on the sample data collected. The intention of this analysis was to verify whether a simple moving average could predict the next trading day. The test asked whether a moving-average increase from the previous day to the current day predicted that the following day would also be higher than the current (equation 11), and vice versa for lower days. This test was carried out for 1-, 5- and 15-day moving averages.

R_x = { 1,  ma_x > ma_(x-1)
      { 0,  ma_x < ma_(x-1)        (11)

where
R_x = predicted trend direction
ma_x = moving average of daily values

4 Results

Due to the enormous number of combinations of parameters that could be fed into the neural network, a batch layer was used to cycle through the possible combinations. Limitations in available computing time restricted the number of combinations that could be tested. Fortunately, the addition of the batch-processing layer quickly revealed which parameters greatly impacted the performance of the network and which had little or no effect.

4.1 Neural Network Outputs

Outputs from the neural network were either dual or singular. It was found that a singular output gave vastly superior performance to that of dual outputs. In order to reduce the computational requirements, the following results have been limited to a singular network output only.

4.2 Data Pre-Processing

Both stages of pre-processing proved highly important to the predictive ability of the network. Results revealed that in the first stage of pre-processing, performance was not significantly altered between the simple-difference and logarithmic-difference pre-processing methodologies. However, the performance of the network was negatively affected by setting this first pre-processing stage to pass-through.
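The singular-output target encoding, the correctness test and the moving-average baseline (equations 7, 8 and 11) can be sketched together. As before this is an illustrative Python rendering, not the paper's Matlab code, and the names are ours:

```python
def target(d_curr, d_prev):
    """Equation (7): 0.8 encodes an upward move, 0.2 a downward (or flat) one."""
    return 0.8 if d_curr > d_prev else 0.2

def is_correct(predicted, actual):
    """Equation (8): correct when both values lie on the same side of 0.5."""
    return 1 if (predicted > 0.5) == (actual > 0.5) else 0

def moving_average(x, n):
    """Trailing n-day moving average, defined from the n-th day onwards."""
    return [sum(x[i - n + 1 : i + 1]) / n for i in range(n - 1, len(x))]

def ma_trend_signal(x, n):
    """Equation (11): 1 predicts a rise when the moving average rose, else 0."""
    ma = moving_average(x, n)
    return [1 if ma[i] > ma[i - 1] else 0 for i in range(1, len(ma))]
```

Comparing `is_correct` scores for the network's outputs against `ma_trend_signal` on the same series is the essence of the mechanical comparison reported in Section 4.5.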
The average percentage of correct predictions with pass-through was just 55.7%, versus 72.7% for the other forms of pre-processing.

Normalisation, the second stage of pre-processing, proved to have a significant impact on the results of the neural network. Without normalisation the network predicted results with a lower accuracy of around 55% and a narrow standard deviation of approximately 3% (Table 1). By contrast, with normalisation, results gave a much higher average accuracy in the range of 73%, but with a wider standard deviation of approximately 15%.
Pre-Process (Stage 1)             Normalisation (Stage 2)   Average Success (%)   Standard Deviation (%)
None                              N                         55.57                 2.64
None                              Y                         55.85                 2.98
Relative Logarithmic-Difference   Y                         73.60                 14.68
Origin Logarithmic-Difference     N                         56.14                 3.62
Origin Logarithmic-Difference     Y                         72.03                 15.16
Relative Simple-Difference        Y                         73.18                 14.87

Table 1: Effectiveness of Pre-processing

These results show the effectiveness of normalisation on the output of the neural network (Table 1). Without normalisation, results were significantly degraded. Additionally, without any form of differencing, correct predictions were also low. It is interesting to note that while normalisation and pre-processing yielded a higher average result, the subsequently larger standard deviation shows that the consistency of the result was reduced.

4.3 Self Organising Maps

The addition of post-processing normalisation to the SOM did not yield a significant improvement or degradation in the predictive ability of the network (Table 2).

Pre-Processing                    SOM Post-Processing   Average Success (%)   Standard Deviation (%)
Relative Simple-Difference        None                  56.46                 3.98
Origin Logarithmic-Difference     None                  58.94                 5.23
Relative Simple-Difference        Normalised            58.18                 3.30
Origin Logarithmic-Difference     Normalised            57.97                 3.22
Relative Logarithmic-Difference   Normalised            58.13                 3.51

Table 2: Effectiveness of Post-Processing Normalisation on Self Organising Maps

The effect of adding varying amounts of noise to the SOM was examined. The results show a minor drop in the average success of the neural network's predictive ability as noise is added (Table 3).

Average Percentage Error Added   Average Success (%)   Standard Deviation (%)
0%                               58.47                 3.77
2.5%                             58.41                 3.36
5.0%                             58.20                 3.76
12.5%                            57.21                 3.40

Table 3: Effect of Introduced Error on SOM Stage

Changes to the size of the SOM showed that accuracy of the network output improved with a larger 1-dimensional SOM (Table 4).

SOM Size   Average Success (%)   Standard Deviation (%)
1 x 4      57.82                 3.56
1 x 8      58.30                 3.64
5 x 5      51.86                 1.47
8 x 8      55.74                 3.29

Table 4: Effectiveness of SOM dimensions
Removal of the SOM from the network architecture was also tested. The results showed a significant increase in the average success of the network. This result is in contrast to that proposed by Giles, Lawrence and Tsoi (2001) [4]. Note that the SOM noise and normalisation post-processing phases are of course not used when the SOM phase is not present. Origin logarithmic differencing yielded a slightly worse average success rate when compared to other forms of differencing. Additionally, the consistency of the result was compromised by the removal of the SOM layer, as revealed by an increase in the standard deviation of results.

Pre-Processing                    SOM Post-Processing   Average Success (%)   Standard Deviation (%)
Relative Simple-Difference        No SOM                77.32                 14.05
Origin Logarithmic-Difference     No SOM                73.30                 15.17
Relative Logarithmic-Difference   No SOM                77.58                 13.81

Table 5: Average Success with SOM Stage Removed

The trend prediction network was very successful in its ability to predict price direction, with prediction ability averaging 77%.

4.4 Neural Network Type and Configuration

The network type was modified with and without a SOM stage to show the impact this would have. Results from simulation showed no significant difference in performance between Elman and standard feed-forward network topologies.

Results demonstrated a clear correlation between the number of hidden nodes and average predictive ability when input was taken from the SOM stage, with accuracy falling as nodes were added. Conversely, when input was taken directly from the pre-processed data, no significant difference in average performance was demonstrated; however, a slight improvement in the variation of the results was observed (Table 6).
Hidden Nodes   NN Type        SOM   Average Success (%)   Standard Deviation (%)
3              Elman          Y     58.35                 3.34
5              Elman          Y     57.23                 3.82
10             Elman          Y     52.68                 2.40
3              Feed-Forward   Y     57.85                 2.92
5              Feed-Forward   Y     58.77                 4.13
10             Feed-Forward   Y     52.69                 2.54
3              Elman          N     76.47                 15.63
5              Elman          N     75.88                 14.02
10             Elman          N     76.47                 13.68
3              Feed-Forward   N     76.13                 15.11
5              Feed-Forward   N     76.03                 13.96
10             Feed-Forward   N     75.06                 13.30

Table 6: Effect of Neural Network Type and Number of Hidden Nodes on Network Performance

Finally, the effects of predictive range and predictive reference were examined. The predictive range is the range of trading days over which the neural network is to predict the future trend. The predictive reference is the historical trading-day range used for calculating relative trend movement, measured in days relative to the last trading day. For example, a predictive range of 1-5 and a reference of 1-16 define that the network will try to predict whether the next 5 trading days (predictive range) will be higher or lower on average than the previous 16 trading days (predictive reference).
The results of varying the predictive range (horizontal) and predictive reference (vertical) of the neural network are shown below for 3, 5 and 10 hidden nodes (Tables 7, 8 and 9 respectively). The data below was obtained without inclusion of the SOM stage, due to the higher predictive ability of a SOM-free network architecture. Averaging was used to simplify data representation.

          1-1     1-2     1-5     1-10    Average
1-20      91.41   87.15   72.49   68.57   79.90
1-16      88.45   83.90   74.37   66.39   78.28
1-6       92.85   76.08   62.83   63.72   73.87
1-3       86.27   74.59   63.38   65.06   72.33
1-1       92.50   76.55   69.79   66.49   76.33
Average   90.30   79.66   68.57   66.05   76.14

Table 7: Network Predictive Success with 3 Hidden Neurons

          1-1     1-2     1-5     1-10    Average
1-20      91.21   84.36   73.65   68.54   79.44
1-16      90.09   78.52   71.45   66.94   76.75
1-6       95.25   80.61   68.68   63.00   76.88
1-3       88.16   79.33   69.95   65.62   75.76
1-1       94.35   79.22   66.37   64.29   76.06
Average   91.81   80.41   70.02   65.68   76.98

Table 8: Network Predictive Success with 5 Hidden Neurons

          1-1     1-2     1-5     1-10    Average
1-20      97.26   82.24   73.64   64.90   79.51
1-16      97.23   81.98   72.57   65.76   79.38
1-6       97.38   80.14   71.75   57.67   76.73
1-3       95.52   76.94   72.55   62.45   76.86
1-1       91.50   77.21   63.70   63.50   73.98
Average   95.78   79.70   70.84   62.86   77.29

Table 9: Network Predictive Success with 10 Hidden Neurons

Figure 7 shows the plotted averages for 3, 5 and 10 hidden neurons across varying predictive ranges. The graphical representation of the averages clearly shows the detrimental effect of an increased predictive range on the network's predictive ability.
Figure 7: Predictive Accuracy vs. Predictive Range for Neural Network (3, 5 and 10 hidden neurons)

Results of changes in predictive reference (Figure 8) show an increase in the predictive ability of the network as the size of the reference group is increased.

Figure 8: Predictive Accuracy vs. Predictive Reference for Neural Network (3, 5 and 10 hidden neurons)

4.5 Mechanical Average-Based Prediction

Mechanical prediction was carried out to verify whether the network was simply carrying on the prevailing trend, as described in Section 3.4 (Table 10).

Average Type   Average Success (%)
1-day          51.01
5-day          64.32
15-day         57.53

Table 10: Simple Average Trend Prediction
This result demonstrates the effectiveness of a simple mechanical system. While an average success rate of 64% was achieved by the simple mechanical system when using 5-day averaging, it falls short of the result obtained by the neural network. The trained neural network was able to achieve a much higher level of accuracy, with an average success rate exceeding 94% for a 1-1 predictive reference and range (Table 8). This demonstrates that the neural network was performing more than simple averaging to achieve its result.

5 Conclusion

The results of experimental simulation have given tangible results for both pre-processing requirements and general network architecture. Results show that raw data should always undergo some form of differencing, be it from the previous day or from a fixed reference point. The use of a logarithmic function on this differenced data yielded no significant increase or decrease in the network's predictive ability. Furthermore, normalisation across the input window proved to be a critical element of the pre-processing procedure.

Self Organising Maps were found to decrease the network's predictive abilities. This result could be attributed to an over-simplification of the training data: information inherent in the training set was likely not reflected in the SOM classification. When using SOMs it was found that the addition of a post-processing phase was unnecessary. This post-processing phase added output normalisation and noise addition to the SOM-output data. While it was hypothesised that noise would reduce the effects of over-fitting and therefore improve performance, this was found not to be the case in this circumstance. Noise was not added to the input data as it is already inherently noisy. It was, however, found that when using a SOM stage the size of the SOM had only a slight effect on the quality of the final neural network output.

Both Elman and standard feed-forward networks performed almost equally in the simulations.
This suggests that the information needed to predict the immediate direction of future prices is limited primarily to data contained in recent trading statistics. Comparatively long-term (seasonal) trends would very likely require more memory, or a lower resolution such as weekly instead of daily statistics.

When the SOM stage was used, an increase in the number of nodes in the second dimension resulted in a slight decrease in the predictive ability of the network. However, when the SOM stage was excluded from the network, an increased number of nodes resulted in a very slight reduction in the variation of the network output.

In conclusion, making use of normalised relative inputs to a neural network significantly improves results. A SOM stage in processing gives worse results when predicting trend; however, when a SOM stage was used it was best to keep a one-dimensional structure. Noise and normalisation added to the SOM stage had no effect on the predictive ability. Network type had no effect on network output; however, the use of more hidden nodes proved beneficial to output variation when not using a SOM layer, and detrimental to average predictive ability when using a SOM layer.

The optimal configuration of the network has proven to be one without a SOM layer. Pre-processing should include relative normalisation. To minimise the output variation an increased number of hidden nodes should be used. Best results were yielded with prediction calculated relative to a wider trading period for the following trading day only. This implies the difficulty of predicting multiple days into the future with such a model; however, next-day price prediction, a more useful measure in real-world applications, also yielded highly favourable results.

When compared to a simple mechanical averaging form of trend prediction, the trained neural network achieved on average a higher success rate. Simple mechanical average trend prediction was only able to predict with 65% accuracy, versus the optimally trained networks with predictive ability exceeding 90%.
6 Future Research Direction

Further research will utilise the current results as a foundation for a hierarchical neural network structure (Figure 9). Under this network topology many sub-networks are trained on focused and independent tasks. An example of one of these sub-networks is the trend prediction network developed in this paper. The funnelling of these multiple networks is beneficial due to an increase in training speed and a reduction in training data requirements over one larger network [7]. Each neural network allows a more focused form of prediction that does not rely upon complex formation of a predictive function from the input data set. Through combination of individual network results a higher predictive ability can be obtained.

Components of this future network will rely upon additional and more complex pre-processing. Much of the pre-processing in the future will likely be based on technical indicators. The aim of this future research will be the development of a trading system with the ability to predict Buy, Sell and Hold price points. The quality of output from such a system can be easily measured by the net profit or loss made by the system after accounting for transactional costs, i.e. brokerage.

Additional raw data will be necessary for such a system. This data will come in the form of foreign exchange and sector indexes. Such indexes will allow the system to account for international and sector qualitative events in a limited fashion. Due to the time difference between various markets, international indexes should signal major world events and allow the system to take these into account when making daily position decisions. Sector indexes should signal the trends of various sectors such as mining, agriculture and banking. These trends can assist in decisions on the distribution of funds across sectors.

Figure 9: Hierarchical Network Topology

Results presented in this paper are primarily exploratory. The aim of this initial phase of simulations was the discovery of better network topologies.
The trend prediction system, however, has proven to be highly successful and will be used in later phases of the research.
7 References

[1] Schwager, J. (1999). Getting Started in Technical Analysis. John Wiley & Sons. p. 2, 253-258.
[2] Yahoo! Finance. http://finance.yahoo.com
[3] Swingler, K. (1996). Applying Neural Networks: a Practical Guide. University of Stirling, Stirling, Scotland and Neural Innovation Limited. p. 24-25, 106, 191.
[4] Giles C., Lawrence S. and Tsoi A. (2001). Noisy Time Series Prediction using a Recurrent Neural Network and Grammatical Inference. Machine Learning, Volume 44, Number 1/2, July/August. p. 7, 161-183.
[5] Pring, M. (2001). Introduction to Technical Analysis. McGraw-Hill. p. 114-115.
[6] Shadbolt, J. (2002). Overfitting, Generalisation and Regularisation. In Shadbolt J. and Taylor J. (Eds), Neural Networks and the Financial Markets. Springer-Verlag. p. 56.
[7] Mendelsohn, L. (1993). Using Neural Networks For Financial Forecasting. Stocks & Commodities, Volume 11:12, October. p. 518-521.