You wrote: So, in verification and test I might have a short signal going into November 13th but after the 11/13 COT data gets loaded I might find that the system went back to November 10th, for example, and tell me after the data was loaded that it's been long for 3 days.
This is called a leak from the future, and if you don't notice it, it will cost you thousands of dollars, as it did me. I was trading two models (unsuccessfully) for a couple of weeks when I noticed that they were changing their positions for days in the past. A model would tell me to go short, I would, and then a few days later it would tell me that it had been long all along.
I caused one of these myself by creating a target variable based on the price 3 days in advance, then switching to a target based on 2 days in advance, but failing to delete the old target, which then became an input. Thus I was using the stock price in 3 days to forecast where it would be in 2 days, a neat trick. With the other model I could never find what was wrong, but something clearly was.
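To make the stale-target mistake concrete, here is a minimal sketch in Python. The column names (`close`, `target_3d`, `target_2d`), the `target_` prefix, and the helpers `forward_price` and `input_columns` are all made up for illustration; they are not from any particular modeling package. The idea is only that every forward-looking column should be kept out of the inputs, not just the one currently being forecast.

```python
def forward_price(prices, horizon):
    """Price `horizon` days ahead of each day; None where the future is unknown."""
    n = len(prices)
    return [prices[t + horizon] if t + horizon < n else None for t in range(n)]


# Hypothetical data table, one list per column.
prices = [100.0, 101.0, 99.0, 102.0, 103.0, 104.0]
table = {
    "close": prices,
    "target_3d": forward_price(prices, 3),  # old target that was never deleted
    "target_2d": forward_price(prices, 2),  # the target actually being forecast
}


def input_columns(table, target, target_prefix="target_"):
    """Keep a column as an input only if it is not a forward-looking target."""
    return [
        name
        for name in table
        if name != target and not name.startswith(target_prefix)
    ]


# Dropping only the current target lets the old one slip in as an input.
naive_inputs = [name for name in table if name != "target_2d"]
safe_inputs = input_columns(table, "target_2d")

print(naive_inputs)  # ['close', 'target_3d']
print(safe_inputs)   # ['close']
```

A naming convention like this is only one possible guard. An explicit list of allowed input columns would work just as well, as long as old targets cannot end up on it by accident.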
One sign of this is a model whose long-term performance is awesome. Now, before I trade any model, I test it for leaks from the future. I look at each model and write down its signal for the last couple of days. Then I go back to the data and put in preposterous numbers for the next day, and reevaluate the model. If the prior days' signals are still the same, the model is safe.
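The test above can be sketched in a few lines of Python. This is only an illustration under some assumptions: a "model" here is any function that takes a list of prices and returns one signal per day (+1 long, -1 short), and the two toy models and the name `rewrites_history` are hypothetical stand-ins, not real trading systems. The sketch appends one absurd value as the next day's data rather than editing an existing row.

```python
def safe_model(prices):
    """Toy model: long when today's price is at or above its trailing 3-day mean."""
    signals = []
    for t in range(len(prices)):
        window = prices[max(0, t - 2): t + 1]  # today and up to two prior days
        signals.append(1 if prices[t] >= sum(window) / len(window) else -1)
    return signals


def leaky_model(prices):
    """Toy model with a deliberate leak: it peeks at tomorrow's price."""
    n = len(prices)
    return [1 if t + 1 < n and prices[t + 1] > prices[t] else -1 for t in range(n)]


def rewrites_history(model, prices, n_check=3, absurd=1e9):
    """Return True if adding a preposterous next-day value changes past signals."""
    n = len(prices)
    start = max(0, n - n_check)
    before = model(prices)[start:]                     # signals written down today
    after = model(list(prices) + [absurd])[start:n]    # same days, after new data
    return before != after


prices = [100.0, 101.0, 99.0, 102.0, 103.0]
print(rewrites_history(safe_model, prices))   # False: past signals unchanged
print(rewrites_history(leaky_model, prices))  # True: the last day's signal flipped
```

Passing this check does not prove a model is clean, since a leak might only reach back further than `n_check` days or only react to certain inputs, so it is worth trying more than one absurd value and a longer look-back.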
This has only happened to me on two models, and it did cost me money. Other than those two, my models have done well for me.
Carl