Maximus-AI: Using Elman Neural Networks for Implementing a SLMR Trading Strategy Nuno C. Marques and Carlos Gomes CENTRIA — Departamento de Informática, Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa and GoBusiness, email: [email protected].

Abstract. This paper presents a stop-loss - maximum return (SLMR) trading strategy based on improving the classic moving average technical indicator with neural networks. We propose an improvement in the efficiency of the long-term moving average by using the limited recursion of Elman neural networks, jointly with a hybrid neuro-symbolic neural network, while still fully keeping all the learning capabilities of the non-recursive parts of the network. Simulations using the Eurostoxx50 financial index illustrate the potential of such a strategy for avoiding negative asset returns and decreasing investment risk.

1 Introduction

Several authors (e.g. [3], [11], [1]) present empirical evidence that the Normal distribution does not fit the behaviour of financial asset returns, and that this mismatch leads to risk underestimation. Risk underestimation in turn increases the probability of financial crises. We therefore assume an implicit time dependency in financial assets and detect that dependency with technical analysis indicators such as moving averages (i.e. technical indicators are used as a risk-reducing device). Unfortunately, traditional moving averages, taken over a fixed number of days, are too rigid and cannot adapt to different market conditions. This paper presents a model that can dynamically adjust the number of days used to calculate a moving average according to the patterns observed over a given set of fundamental and technical market conditions. The goal is a descriptive approach that can be used jointly with a stop-loss maximum return (SLMR) trading strategy. This approach therefore differs from the more direct trend of most previous work applying neural networks to stock market price prediction. The next section further formalizes the model presented in [10] and [1] for encoding knowledge in Elman neural networks in the case of financial markets. Section 3 then validates the strategy using the DJ Eurostoxx50 financial index. Finally, some conclusions are drawn.

2 Representing the Intelligent Moving Average for Training Elman Neural Networks

Elman neural networks are a direct extension of the back-propagation learning algorithm, obtained by extending the network with recurrent context units (or memory units) [4]. The recurrent weights have a fixed value (usually 1), so the back-propagation algorithm can still be applied to this class of networks. This variant of the algorithm is known as backpropagation through time [4]. The main problem with Elman neural networks (as with other kinds of neural networks) is that their learning capabilities are too generic: instead of representing logical functions (as in the initial proposal of [9]), the generic learning capabilities of neural networks have made them black boxes. The backpropagation algorithm distributes the learning for a given pattern over as many units as possible, while errors in the model output are minimised with respect to the target value provided in the dataset. Unfortunately, there are only generic a priori guidelines on the best input encodings and neural architectures for a given problem. Fuzzy Neural Networks (e.g. [5]), for instance, can encode so-called linguistic rules inside a neural network. However, the semantics of such a model is limited, and the neural network computation is usually replaced by a fuzzy classifier (e.g. the recursive approach of the Elman network is harder to implement). Unguided learning in neural networks is also sensitive to premature convergence to local minima and to poor generalisation beyond the training data (overfitting). We use an alternative based on hybrid neuro-symbolic methods ([7], [6]). Hybrid neuro-symbolic methods can encode logic programs in a feed-forward neural network core (the model is usually called the core method). In this method the neural network keeps all its computational power while still encoding both statistical and logic models; the system also learns to adjust itself (i.e. fit) to experimental data. Moreover, in [2] the authors present experimental evidence that backpropagation learning can be guided by encoding logical rules.
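The semi-recursive structure just described can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the layer sizes are arbitrary, and the uniform [−0.1, 0.1] weight initialisation simply mirrors the convention used later in the paper. The key point is that the context units receive a plain copy of the previous hidden state (a fixed recurrent weight of 1), so ordinary backpropagation still applies to all trainable weights.

```python
import numpy as np

rng = np.random.default_rng(0)

class ElmanNetwork:
    """Minimal Elman network: the hidden state is copied into context
    units with a fixed recurrent weight of 1, as described in the text."""

    def __init__(self, n_in, n_hidden, n_out):
        # all trainable weights start small and random (illustrative choice)
        self.w_in = rng.uniform(-0.1, 0.1, (n_hidden, n_in))
        self.w_ctx = rng.uniform(-0.1, 0.1, (n_hidden, n_hidden))
        self.w_out = rng.uniform(-0.1, 0.1, (n_out, n_hidden))
        self.context = np.zeros(n_hidden)

    def step(self, x):
        # hidden units see the current input plus last step's hidden state
        h = np.tanh(self.w_in @ x + self.w_ctx @ self.context)
        self.context = h.copy()   # fixed copy connection (weight 1)
        return np.tanh(self.w_out @ h)

net = ElmanNetwork(n_in=3, n_hidden=5, n_out=1)
for t in range(4):
    y = net.step(rng.uniform(-0.1, 0.1, 3))
```

Because the copy connection is not trained, unrolling the network over time reduces to the standard feed-forward case, which is why backpropagation through time remains applicable.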
[8] shows that logical knowledge may not be enough for different classes of problems and generalises the represented knowledge from true/false values to real values. Finally, by explicitly modelling the time series inside the network, [10] shows how such models can be used to compute intelligent moving averages. Here these results are applied to our new SLMR/Maximus proposal of a new, and safer, trading strategy.

SLMR model. To evaluate risk underestimation, five indicators were created: 1) Serial Correlation; 2) Fama Multiples; 3) Correlation Breakdown; 4) Moving Average; 5) Trading Range Break-Out. Combining the information from these five indicators of risk underestimation, the Stop Loss - Maximum Return (SLMR) investment strategy was proposed to achieve superior, statistically different returns without increased risk or exposure to rare events [1]. These results will be compared with a long position in the DJ Eurostoxx50. The SLMR strategy is summarised as follows:

– The Stop Loss component suggests signals to sell the financial asset, to avoid the large losses that follow risk underestimation;
– The Maximum Return component suggests signals to buy the financial asset when it shows strong signs of recovery, after a large fall has occurred.

Using an Intelligent Moving Average. Figure 1 shows the specification for implementing the calculation of the intelligent moving average function, taking the SLMR model into account. The moving average component is encoded in the lower part of the graph and follows the encoding studied in [10]: each node can be associated with a state
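The lower part of the encoding studied in [10] can be read as a shift register: each context unit memorises the previous unit's value (weight 1.0) and the output sums all units scaled by 1/N. A small sketch (illustrative only; the function name is ours) confirms that this wiring reproduces an ordinary N-day simple moving average:

```python
import numpy as np

def ima_shift_register(series, n):
    """Shift-register reading of the diagram's lower part: x_t enters,
    each unit passes its value to the next (weight 1.0), the oldest
    value is forgotten, and the output is the sum scaled by 1/n."""
    memory = np.zeros(n)              # x_t, x_{t-1}, ..., x_{t-n+1}
    out = []
    for x in series:
        memory = np.roll(memory, 1)   # shift every stored value back one step
        memory[0] = x                 # the new observation enters
        out.append(memory.sum() / n)  # the 1/n output connection
    return np.array(out)

prices = np.arange(1.0, 11.0)         # 1, 2, ..., 10
ima = ima_shift_register(prices, 5)   # last value: mean of 6..10 = 8.0
```

This equivalence is what lets the network start from an exact moving average and then let learning adjust the (initially fixed) weights.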

Fig. 1. Diagram of the proposed model for implementing the financial intelligent moving average.

or variable. Each arrow represents the memorisation of a given (previous-state) value by another state. When a transition has a number associated with it, the memorised value is multiplied by that number. In the diagram, the addition node is the sum of all connected variables (it is not itself a variable). The upper part of the diagram is reserved for the SLMR model. The two main states, crisis/Stop Loss and normal/Maximum Return, are represented. These states are activated by current financial indicators (F). Although we do not know how many days should be used to compute the moving average under each market condition (as defined by F), we know we should depart from extreme values, i.e. a 40-day short moving average during normal periods and a one-year moving average during crisis periods. We use the semi-recursive Elman network [4], where the initial state in the diagram represents the input value to the neural network (i.e. the value of the time series vector xt). Input values are connected to the first layer (in the graph), represented by the hidden unit layer and the contextual (Elman) layer. Each transition between units xt → xt−1 represents a recursive connection from xt to a memory contextual unit xt−1, followed by a feed-forward connection (without further information we assume weights of 1.0) to the hidden layer unit xt−1. The connection from the hidden layer to the output layer always uses the value 1/N (i.e. a multiplying factor). The states crisis and normal were implemented by two hidden neurons and corresponding contextual units in an Elman neural network. Since we still use the basic specification of the core method [7], true and false values can be encoded by 1.0 and −1.0, respectively. Due to the use of the hyperbolic tangent as activation function [8], these two boolean indicators can be connected to a second layer of hidden neurons (ynorm and ycrisis) [8], while the memory effect is achieved by the contextual units of the Elman neural network. To combine the non-linearity with the linear moving average calculation, all values of xt are normalized to the range [−0.1, 0.1] [8]. As a result, when activated, the −1/1 output will switch off a [−0.5, 0.5] value of the moving average (or, conjoined with the neuron bias, will accept it). In this way, the network computes the most appropriate moving average by conjoining both ystate values. All remaining feed-forward weights are randomly set in the [−0.1, 0.1] range (a usual procedure when we do not want to encode a priori knowledge in the model). Finally, a small amount of noise (±0.01, or 2% of the neural weights) was applied to all connection weights.
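The switching behaviour of the upper part of the diagram can be illustrated numerically. The sketch below is our own reading, under stated assumptions: a single scalar stands in for the feature vector F, a steep tanh plays the role of the near-boolean core-method units (±1), and each ±1 state, conjoined with a bias, either passes or blocks its moving average:

```python
import numpy as np

def gated_ima(x_short, x_long, crisis_signal):
    """Hypothetical reading of the diagram's upper part: two tanh units
    act as near-boolean switches (close to +1/-1, as in the core-method
    encoding); the active state selects which average reaches the output.
    `crisis_signal` is a scalar stand-in for the feature vector F."""
    y_crisis = np.tanh(10.0 * crisis_signal)   # ~+1 in crisis, ~-1 otherwise
    y_normal = -y_crisis
    # each gate maps {-1, +1} to {0, 1} via the bias term (1 + y) / 2
    return 0.5 * (1 + y_normal) * x_short + 0.5 * (1 + y_crisis) * x_long

normal = gated_ima(0.05, -0.02, crisis_signal=-1.0)   # passes the short MA
crisis = gated_ima(0.05, -0.02, crisis_signal=1.0)    # passes the long MA
```

With inputs normalised to [−0.1, 0.1], the gated sum stays in the linear regime of the output, which is what makes the moving average computation exact when a gate saturates.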

3 Predicting the Best Moving Average in SX5E

The network just described was tested on the European DJ EURO STOXX 50 financial index (SX5E). Daily prices in the nine-year period from 1 September 2000 to 9 November 2009 were used as our data. The F vector (in Figure 1) was set to represent several fundamental measures provided by our financial partner (e.g. the German CDS index, the US CPI urban consumers SA index, the US Generic Government 1 Month Yield, or even the Gold Spot $/Oz commodity) and technical measures (e.g. daily and other average variations and other more evolved risk measures). A moving average was precalculated over SX5E and set as the x value in Figure 1. Validation data was always set to the period starting on 14 June 2006 (selected to include the 2008 financial crisis, but also to provide some positive-return period). Notice that this period has a negative return of (2822.72 − 3414.21)/3414.21 = −17.32%. The network was trained using the following strategy: a premonition factor was set to 20 trading days. When the SX5E index has a positive variation over the next 20 trading days, the Ima training target is set to a short (40-day) moving average (this will advise our financial simulator to give buying instructions). On the other hand, if SX5E has a negative variation, the more conservative one-year Ima is set as the target value. This premonitive training Ima value is plotted against the SX5E index in figure 2 – A, which shows all available data. All values were normalised to the [−0.1, 0.1] interval (to enable regression while still using the hyperbolic tangent activation function).
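The training-target rule above can be sketched directly. This is an illustrative reconstruction (function names are ours; 256 trading days stands for the one-year average, and the 20-day premonition factor is the paper's):

```python
import numpy as np

def sma(prices, n):
    """n-day simple moving average; entry i covers prices[i:i+n]."""
    return np.convolve(prices, np.ones(n) / n, mode="valid")

def premonitive_target(prices, look_ahead=20, short_n=40, long_n=256):
    """If the index rises over the next `look_ahead` trading days, the
    target is the short (40-day) average; otherwise the conservative
    one-year (256-trading-day) average, as described in the text."""
    short_ma, long_ma = sma(prices, short_n), sma(prices, long_n)
    targets = []
    # usable days: both averages exist and the look-ahead stays in range
    for t in range(long_n - 1, len(prices) - look_ahead):
        rising = prices[t + look_ahead] > prices[t]
        targets.append(short_ma[t - short_n + 1] if rising
                       else long_ma[t - long_n + 1])
    return np.array(targets)

prices = np.linspace(100.0, 120.0, 300)   # toy, steadily rising index
targets = premonitive_target(prices)      # rising => always the short MA
```

Note that the look-ahead is only used to build training targets; at validation time the network sees no future information.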

Fig. 2. Dataset and results over validation data for the intelligent moving average (description in text).

As in [10], we can notice (figure 2 – B) that the initialised Elman network is much more predictable and stable than the randomly initialised one. Indeed, its output values are always near the target Ima, and the sharp effects of input index variations are much less noticeable in this network's output. Quite good results can also be noticed in the comparison with the targeted premonitive Ima. This suggests that during validation the neural network (without receiving any premonition information) learns how to use the F vector to predict the future¹.

Fig. 3. Dataset and results over training and validation data for the intelligent moving average (description in text).

For a more quantitative measure, a financial simulator was developed. The neural network Ima was used jointly with an R2 goodness-of-fit indicator to measure instability. Non-invested periods have a return given by the Euribor rate. Figure 3 plots normalised values of the technical indexes and the simulator's financial decisions (represented as vertical lines, green for buying decisions and red for selling decisions) over IMA(40, 256). A black vertical line identifies the start of the test data. The bar chart presents monthly and cumulative returns after each decision. Under these conditions, the global return of this Ima index during the test period was 58%, while the MA256 return was 43% and MA40 had a negative return of −37%. Comparisons were also made with several major fund houses using the same benchmark as SLMR (DJ EuroStoxx50) in this period: all of them lost value (from −12% to −28%).
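The simulator's accounting reduces to compounding the index return on invested days and a money-market rate otherwise. A toy version (our sketch; the 2% annual cash rate standing in for Euribor is an assumption, as are the example returns) shows why well-timed sell signals can beat a buy-and-hold position even in a falling market:

```python
import numpy as np

def strategy_return(daily_returns, invested, cash_rate=0.02 / 256):
    """Toy simulator accounting: invested days earn the index return,
    non-invested days earn a fixed daily money-market rate (a stand-in
    for Euribor; the 2% annual figure is an assumption)."""
    realised = np.where(invested, daily_returns, cash_rate)
    return float(np.prod(1.0 + realised) - 1.0)

rets = np.array([0.01, -0.02, 0.015, -0.03])       # illustrative daily returns
sell_before_drops = np.array([True, False, True, False])
always_in = np.ones(4, dtype=bool)                 # buy-and-hold baseline
timed = strategy_return(rets, sell_before_drops)
hold = strategy_return(rets, always_in)
```

In this toy example the timed strategy compounds to a positive return while the buy-and-hold position loses value, mirroring (in miniature) the 58% Ima result against the negative index return reported above.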

¹ Notice this will always be a risky prediction: the selected features (i.e. F) may be relevant only to this financial crisis, so, as usual, past returns cannot be used to predict future ones.

4 Conclusions

It is unfeasible to have a financial expert looking at all promising investment decisions in the world. Humans are also not particularly good at optimising decisions, e.g.: should we use a 40-day or a 56-day moving average for short selling periods, or what would be the best strategy for combining our fundamental and technical parameters? With the proposed method we have illustrated how a moving average calculation can be encoded inside a traditional neural network whose learning algorithms and properties are quite well studied: the Elman neural network [4]. This is very useful if we want to leave some information under-specified in our method. The backpropagation through time algorithm [4] was used to discover those patterns in the dataset. However, that search was guided, in the sense that we always want some kind of index or moving average. Indeed, we need an indicator whose performance and rating can be studied by our financial experts, not an opaque classification decision. The acquired network can be seen as a non-parametric statistical model of the proposed intelligent moving average. The two states used for the SLMR model have a logical interpretation, and further knowledge can be included by using the core method or its extensions (e.g. [7] and [8]). For this study, the only a priori assumption was the use of different moving averages for a given (changeable) period of time: not only would encoding more logical knowledge be outside the scope of this paper, but a no-knowledge approach is also more appropriate for validating the core Ima-NN extension on financial data. Future work addresses the inclusion of economic models for reasoning (based on fundamental and technical features) about the most probable short-term economic scenario.

References

1. Gomes, C.: Maximus investment fund. Tech. report, GoBusiness (2010)
2. Bader, S., Hölldobler, S., Marques, N.: Guiding backprop by inserting rules. In: ECAI08 Workshop on Neural-Symbolic Learning and Reasoning, Greece, vol. 366. CEUR (2008)
3. Brock, W., Lebaron, B., Lakonishok, J.: Simple technical rules and stochastic properties of stock returns. Journal of Finance 47, 1731–1764 (1992)
4. Elman, J.L.: Finding structure in time. Cognitive Science 14, 179–211 (1990)
5. Feuring, T.: Learning in fuzzy neural networks. In: Proc. IEEE Int. Conf. Neural Networks, pp. 1061–1066 (1996)
6. d'Avila Garcez, A.S., Broda, K.B., Gabbay, D.M.: Neural-Symbolic Learning Systems — Foundations and Applications. Perspectives in Neural Computing, Springer, Berlin (2002)
7. Hölldobler, S., Kalinke, Y.: Towards a massively parallel computational model for logic programming. In: ECAI94 Workshop on Combining Symbolic and Connectionist Processing, pp. 68–77 (1994)
8. Marques, N.C.: An extension of the core method for continuous values: Learning with probabilities. In: New Trends in Artificial Intelligence, pp. 319–328. APPIA (2009)
9. McCulloch, W.S., Pitts, W.: A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics 5, 115–133 (1943)
10. Marques, N., Gomes, C.: An intelligent moving average. In: Proceedings of the 19th European Conference on Artificial Intelligence — ECAI 2010 (2010)
11. Sheikh, A.Z., Qiao, H.: Non-normality of market returns. The Journal of Alternative Investments 12(3) (2009)