Moving Right Along

I’ve written code for my ADA trading environment, and an Agent that decides to sell, hold or buy using the integers 1, 0 and 2 respectively. Here’s the code that currently makes that important decision:

    def select_action(self, state):
        """select action and pass to environment"""
        action = random.randint(0, 2)
        self.current_action = action    # needed for replay buffer
        reward, next_state, trade_closed = self.env.receive_action(action)
        return reward, next_state, trade_closed

The third line just selects a number from 0 to 2 (inclusive) at random! If I run my app with 8863 lines of data, I usually get an average return of about 0.02%, which might just cover the trading fees.

I need to replace that line with a neural network or two. I’m going to attempt to do that with as little reference to other people’s code as possible, as a real test of my understanding of how these things work. What mark will I get for this assignment? Well, that could be the profit that my code manages to produce, if I ever use it for live trading. Learning Reinforcement Learning is itself an exercise in Reinforcement Learning. How meta!

ETA: I now have an NN making decisions, but it isn’t actually learning yet. I sorted out numerous issues converting lists <-> np.ndarray <-> torch.tensor, all with the right number of dimensions, and with no integers amongst the floats. Overall profits are about the same as when selecting random actions. Now, on to the actual learning.
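
For anyone fighting the same conversion issues, here is a minimal sketch of the list -> np.ndarray -> torch.tensor path feeding an untrained network. It is not my actual agent code: PolicyNet, choose_action, the layer sizes and the four made-up state values are all hypothetical names and numbers picked for illustration. The only things taken from above are the three actions and the need for floats with the right number of dimensions.

```python
import numpy as np
import torch
import torch.nn as nn

N_ACTIONS = 3  # 0 = hold, 1 = sell, 2 = buy, as described in the post


class PolicyNet(nn.Module):
    """Hypothetical tiny net: state features in, one score per action out."""

    def __init__(self, n_features, n_hidden=32):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(n_features, n_hidden),
            nn.ReLU(),
            nn.Linear(n_hidden, N_ACTIONS),
        )

    def forward(self, x):
        return self.layers(x)


def choose_action(net, state):
    """Convert a list (or ndarray) state to a float tensor and pick an action."""
    # Forcing float32 here turns any stray integers into floats.
    state_array = np.asarray(state, dtype=np.float32)
    # Add a batch dimension: shape (n_features,) becomes (1, n_features).
    state_tensor = torch.from_numpy(state_array).unsqueeze(0)
    with torch.no_grad():  # inference only, nothing is learned here
        scores = net(state_tensor)  # shape (1, N_ACTIONS)
    # Back to a plain Python int so the environment gets 0, 1 or 2.
    return int(scores.argmax(dim=1).item())


net = PolicyNet(n_features=4)          # feature count is an assumption
example_state = [1, 0.35, 0.36, 1200]  # made-up values, note the stray integer
action = choose_action(net, example_state)
```

Because the weights are still at their random initial values, the chosen action is effectively arbitrary, which fits with profits looking much like the random version.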