Coding!

It’s great to be coding again. I really enjoy it: not so much the typing as the problem solving. Using the code from DeepLizard as a base I’m now writing scripts to achieve broader aims, and that requires a significant amount of coding on my part. The DeepLizard code was created to solve the CartPole game from OpenAI Gym, and getting it to work with a fairly different environment took a bit of work. A significant difference is that in CartPole the action of the agent affects the state of the environment (move cart left or right), but in the trading situation it doesn’t: there is a historic succession of prices which is completely determined (by history). However, the actions do affect the reward, which is the most important part. Buy, Hold, and Sell are what trading is all about. Perhaps if you’re a whale your trades would affect the prices, but not for me. I think I qualify as a shrimp in the BTC ecosystem.

What I’m trying to do is pretty basic, but until I’ve actually done it successfully it’s not so basic for me. I have the main application to train a DDQN on my ADA price data. Now I need to validate that the model is good, optimize it as far as possible, and then use it to backtest a trading strategy. This requires saving the trained model from the training script, loading it into the validation script, making sure it works with the correct data (train and test sets), and checking for overfitting. That check is done by making predictions on the training set and on the test set, and ensuring that the success rates are not very different. If the model performs significantly better on the training set, then it is too customised to specific details of the training data and does not generalize well enough to other data. A big problem with machine learning models.
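As a sketch, the check itself is tiny; everything here is a placeholder (hypothetical predictions and labels, made-up tolerance), but the logic is just: score both sets the same way and compare.

```python
# Sketch of the train-vs-test overfitting check.
# The predictions would come from the trained DDQN; these are made up.

def hit_rate(predictions, targets):
    """Fraction of predictions that match the target labels."""
    assert len(predictions) == len(targets)
    hits = sum(1 for p, t in zip(predictions, targets) if p == t)
    return hits / len(targets)

def looks_overfit(train_acc, test_acc, tolerance=0.05):
    """Flag the model if the training set outperforms the test set
    by more than `tolerance` (5 percentage points by default)."""
    return (train_acc - test_acc) > tolerance

# Toy illustration with made-up labels (0 = Sell, 1 = Hold, 2 = Buy):
train_acc = hit_rate([2, 1, 0, 2], [2, 1, 0, 0])   # 3/4 correct
test_acc = hit_rate([2, 1, 0, 2], [0, 1, 2, 2])    # 2/4 correct
print(looks_overfit(train_acc, test_acc))          # True: 0.75 vs 0.50
```

The tolerance is a judgment call; the point is only that a large gap between the two hit rates means the model has memorised the training set.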

In the training I had to run through the training data several times because there just wasn’t enough data to get converging results. No wonder the Quantra people were using 5 minute data for the SPY! How many 5 minute periods are there in 10 years of data? My 7 years of daily data for ADA/USD is proving problematic. I’m thinking of going to 6 hourly data, which will give me 4 times as much. I’m not thinking of trading bots here, and if I were to actually trade manually using this model for signals then checking prices a few times a day is actually realistic, but not every 5 minutes. Anyway, back to work!

Yesss!!

I fixed all the bugs in the code that uses a DDQN to make trading decisions for ADA/USDT. I learned some important things such as how to properly use the gather method in PyTorch. Seems the error that was giving me the most trouble (and which caused me to research said gather method) was due to my specifying that my network had 2 outputs when it should have had 3. My bad, but a good lesson learned. Also, must admit that the results look promising. However in the past every trading strategy that ‘looked promising’ ended up losing me money, so I’m not going to fall for that again.
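For anyone else fighting gather: along dim=1 it just picks, for each row of the Q-value matrix, the entry at that row’s action index. NumPy’s take_along_axis does the same selection, which is how I’d sketch it (made-up Q-values, three actions as in my setup):

```python
import numpy as np

# Q-values for a batch of 2 states, with 3 actions each (Sell, Hold, Buy).
# The network must output 3 values per state -- my bug was outputting only 2.
q_values = np.array([[0.1, 0.5, 0.2],
                     [0.9, 0.3, 0.4]])

# Actions actually taken for each state in the batch (column vector).
actions = np.array([[1],   # Hold
                    [0]])  # Sell

# torch.gather(q_values, dim=1, index=actions) performs this same selection:
chosen_q = np.take_along_axis(q_values, actions, axis=1)
print(chosen_q.ravel())  # [0.5 0.9]
```

The index tensor has to have the same number of dimensions as the Q-value tensor, which is why the actions are a column vector rather than a flat array.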

ETA: I haven’t done any validation on that promising backtest result, so it really doesn’t mean much. I’ll have to grab some more recent data and test out the model on that to see if it actually generalizes instead of overfitting the training data. Still, I might be a lot closer to Phases 3 and 4 of my plan than I realized in my last post.

Progress

Despite all my complaining about ‘problems’ I’m actually making solid progress. I think I understand conceptually how reinforcement algorithms work, and I can even read those pesky mathematical equations I complained about a few posts ago. I’m getting a better grip on the jargon, and the concepts generally. I’ve even managed to get a few examples of code up and running.

So that’s Phase 1. Phase 2 will be to consolidate by programming these algorithms myself, not just getting other people’s code to actually work on my computer. I’ve encountered quite a few numpy functions recently that I haven’t seen in the past couple of years of programming in quantitative finance. Most of them have to do with reshaping matrices in various ways. In the past the pandas DataFrame has been my go-to data structure. In machine learning it’s a bit more basic: numpy arrays and their tensor library equivalents (either torch or tensorflow).
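For my own reference, these are the reshaping functions that keep turning up (nothing here is from any particular codebase, just the NumPy calls themselves, on a toy price series):

```python
import numpy as np

prices = np.arange(6)              # shape (6,)   -- a flat series
window = prices.reshape(2, 3)      # shape (2, 3) -- two windows of three values
back = window.ravel()              # shape (6,)   -- flattened again
col = prices.reshape(-1, 1)        # shape (6, 1) -- column vector; -1 means "infer"
batch = np.expand_dims(prices, 0)  # shape (1, 6) -- add a batch dimension
flat = np.squeeze(batch)           # shape (6,)   -- drop size-1 dimensions

print(window.shape, col.shape, batch.shape, flat.shape)
```

Torch has the same operations under slightly different names (view/reshape, flatten, unsqueeze, squeeze), so the habits transfer.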

Phase 3 will be trying out different neural network architectures on this general problem. So far I’ve been staying with multi layer perceptrons, but they’re pretty basic. Lots of other options to explore.

And finally Phase 4 will concentrate on determining exactly what kind of input data will give the best results.

Castles in the Sand

In my ongoing studies of Reinforcement Learning I keep running into the same problem (one of many, I must admit). Many authors/tutors use the OpenAI Gym framework to demonstrate RL algorithms, because it’s so convenient to use (it comes with many premade environments) and to customise. A consistent interface is always a good thing. Except that it isn’t. In the past I’ve had to fix up some fairly trivial changes to the API, but the latest API change is a bit more serious. The author I’m currently reading has created a custom environment that is a subclass of a gym class – that no longer exists. This is a bit more challenging to fix. Maybe the fix is simple, if you are thoroughly familiar with the gym API. So, should I spend time becoming familiar with the gym API in order to fix all the examples (upon which many chapters of books are based!) because they no longer work? I don’t expect to be using gym going forward; it’s just a ‘convenient’ teaching tool. It wouldn’t be such a problem if new teaching material came out often enough to keep up to date, but it doesn’t. I haven’t found a single book/course/article that doesn’t have this problem. I’m looking forward to Yves Hilpisch’s book coming out in December, but will it already be outdated by the time it is published? Somehow I don’t think he’ll be using OpenAI Gym for introductory examples/exercises, but I might be wrong. Or if he does, it might have changed again by the time his book comes out.

ETA: I guess there is a simple fix: create a new Docker container with Tensorflow and an old version of gym. Not so simple though, as there has been a change in the maintainers of gym and I’m not sure if older versions are available for installation. Perhaps I should check that out.

Further ETA: I took the obvious solution and created a Docker container with old versions of everything! Had some compatibility issues, but not too many. So now the code that was giving me errors doesn’t. Installing an old version of gym was no problem after all. Hopefully this setup will allow me to explore all those books, courses, etc. that have given me problems lately.

That Went … Well

With some minor editing I converted the notebooks for the Quantra course on Reinforcement Learning in Trading to ordinary Python files and got them up and running in my new Docker container. And running. And running. Several hours later the screen seemed to have completely frozen, neither mouse nor keyboard had any effect, and I shut down the computer.

There was some output while it was running – the time for each step. At the start it was 10 – 20 msecs per step, but by the time I shut it down it was 10 – 20 secs. I have no idea how far it actually got in processing the data, of which there was quite a bit more than my previous ML exercises have involved. I think I’ll have to include some sort of logging so I can get a bit more feedback on what’s actually happening. Or perhaps I should just use a fraction of the data. Or something. It would be nice to actually get some results. But at least the program didn’t crash, at least not after I fixed the issues that caused the first few crashes!
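A minimal version of the logging I have in mind, so the next run tells me where the time is going (the step function here is a dummy placeholder for a real training step):

```python
import logging
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("training")

def timed_step(step_fn, *args):
    """Run one training step and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = step_fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def dummy_step():
    # Stand-in for a real training step; just burns a little CPU.
    return sum(range(1000))

for step in range(3):
    _, elapsed = timed_step(dummy_step)
    # In a real run, log every N steps rather than every step.
    log.info("step %d took %.1f ms", step, elapsed * 1000)
```

Logging the step number alongside the elapsed time would at least tell me how far through the data the run got before the machine ground to a halt.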

Being Flexible

A while back I decided to stick with the PyTorch library instead of Keras/Tensorflow for neural networks. However, that decision seems to be limiting me a bit too much. One of the reasons for it was that trying to get Tensorflow to work after I had set up PyTorch to work with the GPU caused errors that I couldn’t resolve.

Well, there is a way to resolve them, and that is to run Tensorflow in a Docker container. I’ve already tried that a couple of times and it works, although I’m not going to try to get it to work with the GPU from inside a Docker container! It probably can be done, but not by me.

Anyway, the main codebase I want to explore is from some Quantra courses involving deep learning. They use Tensorflow, and my attempts to convert to PyTorch were not as successful as I hoped they would be. I think it’s because that codebase is just a bit too complex for my current level of understanding. Quantra also use TA-Lib quite a bit for indicators. That has to be built from source, and luckily I found some code on StackOverflow that does exactly that, so I now have a Docker container with both Tensorflow and TA-Lib installed, and hopefully that will be enough (in addition to all the usual data science packages, of course). I forgot to install Jupyter Notebook but I can live without that. The original Quantra files are all Jupyter Notebooks, but I find those hopeless for debugging and prefer to reconfigure them all as ordinary py files anyway.

So after lots of very frustrating explorations of apps that never seem to work, for a range of reasons, I’m back to the original app that got me started in Reinforcement Learning. Maybe I can actually get it up and running this time, and more importantly, understand it well enough to get it to work with my own data and not just the data supplied in the course.

[[[1]]]

Dimensions are becoming the bane of my life. Take the title of this post. The value is the number 1, but what all those brackets mean is this. Imagine you have an MS Excel workbook (or whatever spreadsheet app you use). This workbook can contain several worksheets. Each worksheet has a number of rows, and each row a number of columns. The above notation means the value 1 in the first column of the first row of the first worksheet in the workbook. In other words, it’s a value sitting inside a 3-dimensional array.
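In NumPy terms (torch.tensor behaves the same way), the workbook analogy checks out directly:

```python
import numpy as np

# [[[1]]] -- one workbook containing one worksheet
# containing one row containing one column.
workbook = np.array([[[1]]])

print(workbook.ndim)      # 3 -- three dimensions
print(workbook.shape)     # (1, 1, 1) -- one sheet, one row, one column
print(workbook[0, 0, 0])  # 1 -- first sheet, first row, first column
```

Each level of nesting in the brackets adds one dimension, even when every dimension has size 1.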

Some of the issues I’m having converting code to work with different data come down to dimensions. The result of

torch.tensor([1, 2, 3])

is not the same as

torch.tensor([[1, 2, 3]])

That extra set of brackets creates an extra dimension, even though the values are the same. The problem is that functions expect input data to be in a certain format and throw Exceptions if the format isn’t what they expect. Only by carefully stepping through the code with the debugger, while it’s running, can I see what types and dimensions the data has. I’m still pretty new to PyTorch (and tensors generally) so it might take a while to get up to speed on this.
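These are the checks I keep typing into the debugger, shown in NumPy form here (torch tensors have the same shape and dtype attributes, with unsqueeze/squeeze playing the role of expand_dims/squeeze):

```python
import numpy as np

state = np.array([1, 2, 3])       # shape (3,)  -- a single state
batch = np.expand_dims(state, 0)  # shape (1, 3) -- same values, extra dimension
                                  # (torch equivalent: state.unsqueeze(0))
print(type(batch).__name__, batch.shape, batch.dtype)

# Going back the other way:
single = np.squeeze(batch)        # shape (3,) again
print(np.array_equal(single, state))  # True
```

Most of my dimension errors boil down to a function wanting the (1, 3) batched form when I’m handing it the flat (3,) form, or vice versa.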

Milestone

I have a complete Double Deep Q Network solution to a game (CartPole) up and running! It’s the first time I’ve achieved this. Admittedly it’s someone else’s code, only slightly reorganized by me, but I understand nearly all of it and so should be able to get it to work with other inputs, such as trading data.

I don’t know why it has been so hard to get to this point. Simply copy/pasting someone else’s code should be a no-brainer, but the process has been fraught with difficulties. Anyway, I’ll explore this solution further ’til I understand it completely, apply it to a variety of problems, and then develop things from there. The network topology used is very basic, and I’m sure it can be improved for a variety of other problems. Also, hyperparameter tuning might come in handy. Perhaps most important is that I now understand conceptually what’s going on; even those equations I mentioned a few posts ago make more sense now. It’s just a couple of implementation details that I haven’t quite got my head around. For one section of code even the author/presenter says to check out StackOverflow for an explanation of how it works!
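The conceptual core, the Double DQN target, can be written out in a few lines of plain NumPy. This is the standard DDQN update with a made-up batch of two transitions and three actions, not the code from the course: the "double" part is that the online network picks the best next action while the target network scores it.

```python
import numpy as np

gamma = 0.99
rewards = np.array([1.0, 0.5])
done = np.array([0.0, 1.0])  # the second transition ended the episode

# Q-values for the *next* states from the two networks (batch of 2, 3 actions):
q_online_next = np.array([[0.1, 0.7, 0.2],
                          [0.4, 0.3, 0.9]])
q_target_next = np.array([[0.2, 0.6, 0.1],
                          [0.5, 0.2, 0.8]])

# Online net chooses the action; target net evaluates it:
best_actions = np.argmax(q_online_next, axis=1)        # [1, 2]
evaluated = q_target_next[np.arange(2), best_actions]  # [0.6, 0.8]

# DDQN target: reward plus discounted value, zeroed at episode end.
targets = rewards + gamma * evaluated * (1.0 - done)
print(targets)  # approximately [1.594, 0.5]
```

Splitting selection and evaluation between the two networks is what reduces the overestimation bias of plain DQN.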

Reinforcement Learning faces a trade-off between Exploration and Exploitation, mentioned in a previous post. Exploration is fairly random, and develops knowledge of how the environment works. Exploitation involves using that knowledge to achieve real ends. Usually in RL there’s always some exploration, just in case there’s some undiscovered treasure in the environment.
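The usual implementation of that trade-off is epsilon-greedy action selection, which is only a few lines of plain Python:

```python
import random

def select_action(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise exploit the best known action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore: pick a random action
    # exploit: pick the action with the highest Q-value
    return max(range(len(q_values)), key=lambda a: q_values[a])

q = [0.1, 0.9, 0.3]
print(select_action(q, epsilon=0.0))        # always 1 -- pure exploitation
print(select_action(q, epsilon=1.0) in (0, 1, 2))  # True -- pure exploration
```

In practice epsilon usually starts near 1 and decays over training, so the agent explores early and exploits late, but never quite stops exploring.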

Having spent a lot of time on exploration of the subject of RL, I’m now about to exploit that knowledge. However I’ll continue to explore (study) some of the time to improve my general understanding and perhaps discover some golden nugget of knowledge that takes me to a new level.

Jumping the Shark

Well, a little shark, anyway. There’s an excellent course on RL produced by DeepLizard. All the videos are available on YT (for free of course) but they do have a paid course with some extras, although I’m not sure those extras are worth it (small community involvement, etc). However I don’t mind paying content creators, especially for good stuff.

Towards the end of the course we get to the CartPole environment in OpenAI Gym. Now this environment produces the state as a set of 4 values related to position and movement of the cart and the pole, but for some reason the course presenter has decided to use image data from renders as the state instead. She has mentioned quite a bit that this is how AI learned to play Atari games originally, but the code required for image processing seems like overkill to me, especially with a perfectly adequate alternative provided by default.

Well, is it adequate? I guess I’ll try to implement the solution using the provided state instead of her modification, and see what kind of results I get. At least the whole thing is in PyTorch so I won’t have to worry about conversions from TF (which I haven’t totally got sorted yet, unfortunately).

ETA: Actually, she did rework the solution to use the state data provided by the gym environment rather than using image data from renders, with better results. I guess a willingness to explore options is a good thing, especially in the realm of machine learning!

A Question of Identity

One of the things I really dislike about Python is that you never know what anything is. You see a variable states, and you wonder whether it is a list, or a dictionary, or a numpy array, or a Pandas series or dataframe. Because they all have to be treated differently. This is especially a problem when trying to understand someone else’s code.

When I was teaching IT I was mostly teaching Java. There you know what everything is, because you have to declare the type when you declare the variable name. And you can’t change the type just by assigning something else to it. Python is a breeze to write and a nightmare to read, especially complex stuff.

One can implement type hinting in Python, and I think I might do that for all code that I’m trying to read. Just add types wherever I can and hope that the author doesn’t just change them on the fly. Perhaps I can also make use of datatypes such as the namedtuple. Anything that helps identify what something is, so I know what I can do with it.
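What I have in mind looks something like this (a hypothetical trading-flavoured example, not code from any of the courses): a typed replay-memory record plus annotated functions, so both the reader and a type checker know what everything is.

```python
from typing import NamedTuple

class Experience(NamedTuple):
    """One transition for the replay memory -- the fields now have
    names and types instead of being an anonymous tuple."""
    state: list[float]
    action: int        # e.g. 0 = Sell, 1 = Hold, 2 = Buy
    reward: float
    next_state: list[float]
    done: bool

def total_reward(experiences: list[Experience]) -> float:
    # The annotations say exactly what goes in and what comes out.
    return sum(e.reward for e in experiences)

e = Experience(state=[1.0], action=2, reward=0.5, next_state=[1.1], done=False)
print(e.action, total_reward([e, e]))  # 2 1.0
```

Even if the hints aren’t enforced at runtime, tools like mypy can check them, and e.reward reads a lot better than e[2].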

Contretemps

After yesterday’s little contretemps I finally managed to get that code running by using code from Deep Learning with PyTorch as a template. However, there was practically no difference between my original code and the code from the book, so I still don’t know why it didn’t work yesterday.

However it is working now, and I might leave the rest of that book ’til later when I want to review other kinds of networks, such as RNNs, CNNs, LSTMs etc. For the moment I’ll just stick with plain old Feed Forward networks, aka Multi Layer Perceptrons. I did pick up some useful learning though, in relation to such things as batch processing, and also conversions between numpy arrays and torch tensors, which always seem to give me some trouble.
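One conversion trap worth writing down: NumPy creates 64-bit floats by default, torch layers generally expect float32, and torch.from_numpy keeps whatever dtype the array already has. The fix I keep needing is on the NumPy side:

```python
import numpy as np

prices = np.array([1.0, 2.0, 3.0])
print(prices.dtype)  # float64 -- NumPy's default for floats

# torch.from_numpy(prices) would give a float64 tensor, and feeding that
# to a float32 network raises a dtype error. Convert first:
prices32 = prices.astype(np.float32)
print(prices32.dtype)  # float32 -- what torch layers expect
```

Going the other way, tensor.numpy() hands back an array sharing the tensor’s memory, which is another thing to keep in mind when converting.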

So, on to phase two of my study plan, which is to translate all the Tensorflow code in the Quantra courses into PyTorch code. Hopefully that will now be fairly straightforward. Happy, happy, joy, joy.

ETA: Have implemented the TF model in one of the chapters of the course as a PT model, and it worked well. Pretty easy really. There may be some nuances of the TF model that I didn’t catch, I have read that TF provides better fine-grained control of layers than PyTorch does. The TF model certainly performs better on the data, but my implementation was a bit simpler and I haven’t optimized it. Looks like I can move forward from here.

I did some more work on that model, adding a couple of extra features, and ended up with a model that predicted whether or not an equity would go up in the next 5 days better than 60% of the time. The equity was not trending either way over the period, just some ups and downs. Not too shabby!

Lost in Translation

My current challenge is to convert neural networks created with the Tensorflow library (as used by many of the people who write books and create online courses) into equivalent networks created with the PyTorch library (which I have decided to use for all my deep learning activities). It should be fairly straightforward, but already I’ve run into a problem that I can’t fix: a fairly simple network whose PyTorch version gives nonsensical results.
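For reference, the rough shape such a conversion takes: a small Keras Sequential stack of Dense layers maps onto nn.Sequential like this (a hypothetical sketch with made-up layer sizes, not the actual network that’s misbehaving).

```python
import torch
import torch.nn as nn

# Keras original (for comparison):
#   Sequential([Dense(32, activation="relu", input_shape=(10,)),
#               Dense(3, activation="softmax")])
# A PyTorch sketch of the same architecture:
model = nn.Sequential(
    nn.Linear(10, 32),   # Dense(32) with 10 input features
    nn.ReLU(),
    nn.Linear(32, 3),    # Dense(3); softmax is deliberately omitted
)

x = torch.randn(4, 10)   # batch of 4 fake samples
out = model(x)
print(out.shape)  # torch.Size([4, 3])
```

One nuance that can produce nonsensical results: Keras attaches the softmax to the last layer, while in PyTorch it is usually left off because CrossEntropyLoss applies log-softmax internally; applying softmax twice, or forgetting it at inference time, skews the outputs.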

So what to do? This is a pretty important issue, because there’s a book being published in December that should be a great resource. However, I have another work by Yves Hilpisch and he uses Tensorflow, and I expect he will do so in this new book as well. Also, my trading focused courses from Quantra use Tensorflow for the neural networks. Theoretically I should be able to use either on my computer, but I’ve had conflicts, especially with regard to using the GPU, which is pretty important in deep learning. Besides, it SHOULD be easy to convert one to the other.

So my plan of action is to make sure I am completely familiar with the PyTorch library. I think the best resource for this is from machinelearningmastery.com, an ebook called Deep Learning with PyTorch, by Adrian Tam. I have several other resources but I think this one is the best. After that I’ll spend time converting TF networks to PyTorch networks, trying to make that process as seamless as possible. Then back to the Quantra courses, especially the one on Reinforcement Learning which I have discussed a bit in recent posts. That should take me to December. Then I can get back to learning to trade ADA. Finally!