Craig,
<< I have an idea to offer with no specifics on how to implement it. In my professional area, researchers are developing instruments that identify which characteristics of a person correlate with specific future behaviors. A few years ago, folks depended on individual factors that correlated with future behaviors. The problems here are that not all factors predict equally well and that the factors overlap (i.e., they share many of the same qualities).
The way our field has gotten beyond this (and raised the accuracy of prediction) is through the use of linear regression statistical models. Basically, using statistical software, you find which factor contributes the most to prediction, then the next highest contributor, and the next. At some point, there is no further increase in predictive ability (far short of 50+ variables). The statistical packages will also identify which variables co-vary, so you might want to use one or the other (determined through testing). I'm not a statistician and am sure I have butchered this explanation, but this is one lead. One of the statistical packages commonly used is SPSS.>>
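For anyone who wants to see the idea in the quote in concrete form, here is a minimal sketch of forward stepwise selection in Python with NumPy. It is only an illustration under stated assumptions, not what SPSS or any other package actually does: the function names, the `min_gain` stopping threshold, and the synthetic data (six made-up variables, one of them a near-copy of another) are all hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2))

def forward_select(X, y, min_gain=0.01):
    """Greedily add the variable that raises R^2 the most.

    Stops when the best remaining variable adds less than min_gain
    (an assumed, arbitrary threshold).
    """
    chosen, best_r2 = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        # Score every candidate when added to the variables chosen so far.
        r2, j = max((r_squared(X[:, chosen + [k]], y), k) for k in remaining)
        if r2 - best_r2 < min_gain:
            break
        chosen.append(j)
        remaining.remove(j)
        best_r2 = r2
    return chosen, best_r2

# Synthetic, made-up data: only variables 0 and 1 drive y,
# and variable 3 is a near-duplicate of variable 0 (they co-vary).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 6))
X[:, 3] = X[:, 0] + 0.05 * rng.normal(size=n)
y = 3.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.normal(size=n)

chosen, r2 = forward_select(X, y)
print("selected variables:", chosen, "R^2:", round(r2, 3))
```

Because variables 0 and 3 carry nearly the same information, once one of them is in the model the other adds almost nothing and is left out, which is the "use one or the other" behavior the quote describes.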
About two years ago I started organizing an approach to studying the usefulness of about 75 variables that would suggest themselves to many QP2 users. I worked for the following 10 months on a way to retain the data on a daily basis and to let me apply any weighting I wanted to any subset of the variables, over a variety of holding periods.
While this was a trial-and-error process, I began getting results good enough to identify a 'best candidate', which has generally performed better than the alternatives I subsequently explored.
I am sure I haven't found an optimal approach, but I would certainly settle for the returns I have seen to date.
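As an illustration of what "any weighting with any subset of the variables" could look like in code, here is a minimal sketch in Python with NumPy. It is not the actual program described above; the standardization step, the function name, and the tiny data set (5 stocks, 3 variables) are assumptions made for the example, and holding periods are not modeled.

```python
import numpy as np

def composite_scores(values, weights):
    """Weighted sum of standardized variables, one score per stock.

    values:  array of shape (n_stocks, n_variables)
    weights: array of shape (n_variables,); a weight of 0 drops that variable
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    std = values.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    z = (values - values.mean(axis=0)) / std  # put variables on a common scale
    return z @ weights

# Hypothetical data: rows are stocks, columns are variables.
values = np.array([
    [1.0, 9.0, 5.0],
    [2.0, 1.0, 4.0],
    [3.0, 7.0, 3.0],
    [4.0, 3.0, 2.0],
    [5.0, 5.0, 1.0],
])
weights = np.array([1.0, 0.0, 2.0])  # second variable excluded from this subset

scores = composite_scores(values, weights)
ranking = np.argsort(-scores)  # stock indices, highest score first
print("ranking:", ranking.tolist())
```

Trying a different subset or weighting is then just a matter of passing a different `weights` vector, which is what makes a trial-and-error search over many variables practical.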
I have not used proprietary statistical models for several reasons, the main one being the need for a database covering a sufficient time frame. Another is my interest in looking programmatically at how the whole spectrum of stocks performs, not merely the few at one extreme.
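One simple way to look at the whole spectrum rather than only the top few is to sort stocks by score, split them into equal-sized groups, and compare the average subsequent return of each group. The sketch below, in Python with NumPy, shows the idea; the function name, the number of buckets, and the six scores and returns are hypothetical and chosen only to make the example small.

```python
import numpy as np

def bucket_mean_returns(scores, forward_returns, n_buckets=5):
    """Average forward return per score bucket, best-scored bucket first.

    Assumes there are at least n_buckets stocks.
    """
    scores = np.asarray(scores, dtype=float)
    forward_returns = np.asarray(forward_returns, dtype=float)
    order = np.argsort(-scores)                # stock indices, best score first
    groups = np.array_split(order, n_buckets)  # near-equal-sized buckets
    return [float(forward_returns[g].mean()) for g in groups]

# Hypothetical scores and the returns over some later holding period.
scores = [0.9, 0.1, 0.5, 0.7, 0.3, 0.2]
forward_returns = [0.06, -0.02, 0.01, 0.03, 0.00, -0.04]

means = bucket_mean_returns(scores, forward_returns, n_buckets=3)
print([round(m, 3) for m in means])
```

If the scoring has any merit, the bucket averages should decline more or less steadily from the first bucket to the last, not just be high at the top.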
I, too, thought I should be able to find 8-12 variables that would do as well as the 55 or 56 I have so far found to produce the best results. I have not been able to. But since the stock selection process is computerized, using 55 variables is no burden.