
I’ve got results similar to the above a couple of times now, where the results improve for a while and then get worse again. A search on Google suggests the most likely cause is overfitting, which would not be too surprising given that each of the 50 episodes is a complete run through the data.
Suggested solutions include reducing the complexity of the network, so it doesn’t learn the specifics of the data so well, and various approaches to regularization. I already have a couple of dropout layers, but I increased the percentage of one of them a bit, and also decreased the number of nodes in my layers. So I’m running it again; let’s see what happens this time. At about 3 hours per run-through, finding good values might take a while. For the chart above I also used RMSprop as the optimizer instead of Adam, which I had used previously. I do get some more positive results, so that’s a good thing.
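The post doesn’t say which framework the network is built in, so here’s a framework-agnostic sketch of what the dropout regularization mentioned above actually does: during training, “inverted” dropout zeroes a fraction of the units at random and rescales the survivors so the expected activation is unchanged. Increasing the rate, as I did, zeroes more units per pass.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero out roughly `rate` of the units during
    training, and rescale the survivors so the expected value of each
    activation is unchanged. At inference time it is a no-op."""
    if not training or rate == 0.0:
        return activations
    keep = 1.0 - rate
    mask = rng.random(activations.shape) < keep  # True = unit survives
    return activations * mask / keep

rng = np.random.default_rng(0)
acts = np.ones(10_000)
out = dropout(acts, rate=0.5, rng=rng)

# About half the units are zeroed, but the mean stays near 1.0
# because the surviving units are scaled up by 1 / (1 - rate).
print(round(float(out.mean()), 1))  # → 1.0
```

A higher rate is a stronger regularizer: the network can rely less on any individual unit, which is the usual first lever against the kind of overfitting described above.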
Later:
RMSprop started giving me strange results: a whole string of zero PnL. I think this has happened before with that optimizer. Anyway, back to Adam, with a slightly simpler network.

Now that’s better. Still, it looks like 50 episodes is a bit more than I need, and I’ll reduce it in future. In fact, I could save a model after about 10 episodes and then explore further variations using that as a starting point, avoiding the initial training period altogether except for the first time. Perhaps I’ll try that.
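The save-and-resume idea above is simple to sketch. Since the post doesn’t name a framework (Keras has `model.save`/`load_model`, PyTorch has `torch.save`), here’s a framework-agnostic version using a hypothetical dict of weight arrays and `pickle`:

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical stand-in for the trained network: a dict of weight arrays.
rng = np.random.default_rng(42)
weights = {
    "dense1": rng.normal(size=(8, 4)),
    "dense2": rng.normal(size=(4, 1)),
}

# After ~10 episodes of training, snapshot the model to disk once...
path = os.path.join(tempfile.mkdtemp(), "after_10_episodes.pkl")
with open(path, "wb") as f:
    pickle.dump(weights, f)

# ...then every later experiment loads the snapshot and continues
# training from it, skipping the initial training period entirely.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(all(np.array_equal(weights[k], restored[k]) for k in weights))  # → True
```

One run pays for the initial 10 episodes; every variation after that starts warm, which at roughly 3 hours per full run should cut the search time considerably.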
