RMSE = 0.02477
I set up a simple MLP in PyTorch, but I only used one node in one hidden layer, so I’m calling this a Uni Layer Perceptron. I figured this should give me the same result as the previous trial with Scikit-Learn’s LinearRegression class, since it only uses one weight and one bias, equivalent to y = mx + c. Indeed, with sufficient iterations of the training loop, the results were almost identical.
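As a quick sanity check (a minimal sketch, not part of the original code), you can count the parameters of a Linear(1, 1) layer to confirm it really is just one weight and one bias:

```python
import torch

# A Linear(1, 1) layer holds exactly one weight and one bias,
# so its forward pass computes y = w*x + b, i.e. y = mx + c.
layer = torch.nn.Linear(1, 1)
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 2
```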
I then made the model a bit more complex, with a couple of hidden layers (5 nodes each) and a ReLU activation layer, but the result was almost identical. I guess with a single input feature there isn’t much scope for improvement. The code (simple version):
import torch
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error

# Load daily closes and convert to returns, with the previous day's
# return ('t-1') as the single input feature.
df = pd.read_csv('data/btc.csv', usecols=['date', 'close'], index_col='date', parse_dates=True)
df_returns = df['close'].to_frame().pct_change()
df_returns.rename(columns={'close': 't'}, inplace=True)
df_returns.insert(0, 't-1', df_returns['t'].shift(1))
df_returns.dropna(inplace=True)

# PyTorch layers default to float32, so cast the arrays explicitly.
X = df_returns['t-1'].to_numpy(dtype=np.float32).reshape(-1, 1)
y = df_returns['t'].to_numpy(dtype=np.float32).reshape(-1, 1)

test_limit = 700
X_train, X_test = X[:test_limit], X[test_limit:]
y_train, y_test = y[:test_limit], y[test_limit:]

class MLP_LinReg(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(1, 1)  # one weight, one bias: y = wx + b

    def forward(self, x):
        return self.fc1(x)

model = MLP_LinReg()
criterion = torch.nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

# Tensors can be built once outside the loop; the deprecated
# Variable wrapper is no longer needed in modern PyTorch.
inputs = torch.from_numpy(X_train)
targets = torch.from_numpy(y_train)
for epoch in range(1, 10001):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, targets)
    loss.backward()
    optimizer.step()
    if epoch % 10 == 0:
        print('epoch {}, loss {}'.format(epoch, loss.item()))

with torch.no_grad():
    y_pred = model(torch.from_numpy(X_test)).numpy()

mse = mean_squared_error(y_test, y_pred)
print(mse)
print(np.sqrt(mse))  # RMSE
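For reference, the deeper variant described above might look like the sketch below: two hidden layers of 5 nodes each with a ReLU activation. The layer sizes come from the text; the exact placement of the activations is my assumption, as the post doesn’t show this version.

```python
import torch

# Sketch of the deeper model: 1 -> 5 -> 5 -> 1, with ReLU between
# layers. Same training loop as the simple version would apply.
class MLP_Deeper(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(1, 5)
        self.fc2 = torch.nn.Linear(5, 5)
        self.fc3 = torch.nn.Linear(5, 1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        out = self.relu(self.fc1(x))
        out = self.relu(self.fc2(out))
        return self.fc3(out)

model = MLP_Deeper()
# A batch of 3 one-feature inputs maps to 3 one-feature outputs.
out = model(torch.randn(3, 1))
print(out.shape)  # torch.Size([3, 1])
```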
It’s easy to get caught up on simple details. Initially I was getting an error running the code because one of the arrays was float and the other was double, and PyTorch didn’t like that. With so many ways to create Torch tensors from NumPy arrays, it can be hard to find the right syntax to specify the data type, but eventually I found a way to do it that didn’t give me further errors. One reason I’m writing this blog is so that I can find the correct syntax fairly easily when I need it again in the future. Not sure how well that will work, though.
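The mismatch comes down to defaults: NumPy creates float64 (“double”) arrays, while PyTorch layers expect float32. A minimal sketch of two ways to line them up:

```python
import numpy as np
import torch

# NumPy defaults to float64; torch.nn layers default to float32,
# which is the source of the float-vs-double error.
a = np.array([1.0, 2.0])                      # dtype: float64
# Option 1: cast the NumPy array before conversion.
t1 = torch.from_numpy(a.astype(np.float32))
# Option 2: convert first, then cast the tensor.
t2 = torch.from_numpy(a).float()
print(t1.dtype, t2.dtype)  # torch.float32 torch.float32
```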