Why is everything so difficult? I’m sure I’ve asked that question a few times already. So, hyperparameter tuning. Requires lots of iterations to find the best values. Need for speed. So run it all on the GPU, naturally. Except that the approach my current book uses, which is skorch to leverage sklearn’s grid search, doesn’t seem to make use of the GPU. I tried it on Google Colab with a GPU runtime, but it took 3 times longer than using the CPU on my local machine. I’m not sure what Google Colab actually does when it says it’s using a GPU, but I suspect it’s not doing Torch tensor operations there.
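One way to find out what a GPU runtime is actually doing is to ask PyTorch directly. The snippet below is a minimal sketch that assumes only that torch is installed; the variable names are mine. As far as I know, skorch's net classes also take a `device` argument that defaults to "cpu", so picking a GPU runtime does not by itself move the model onto the GPU.

```python
import torch

# Pick the GPU if PyTorch can see one, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("PyTorch version:", torch.__version__)
print("CUDA available: ", torch.cuda.is_available())
if device.type == "cuda":
    print("GPU name:       ", torch.cuda.get_device_name(0))

# A tensor only lives on the GPU if it is created there or moved there explicitly.
x = torch.ones(3, 3).to(device)
print("Tensor device:  ", x.device)
```

If "CUDA available" prints False on a GPU runtime, or the tensor device is still cpu, the tensor operations are running on the CPU no matter what the runtime setting says.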
So I went looking for alternatives. Optuna is an open-source hyperparameter tuning library that works with PyTorch (among other platforms) and doesn’t seem too complicated. I don’t want to have to spend a month learning a new sophisticated app. Most online tutorials I found were overly complicated. Nobody seems to have heard of the KISS principle. Anyway, it seems that about six months ago I actually bought a course on Udemy on hyperparameter tuning, and it does actually have a section on Optuna, so I’m looking at that. Maybe I’ll get it to work with PyTorch NNs running on a GPU. Who knows? And is it actually worth all the trouble? And will it run on Google Colab, with a GPU?
Well, I got a very simple example working, and with the GPU. Funny thing was the GPU took 3 times longer than the CPU did!! I think I read somewhere that GPUs work faster on large datasets, and this one was pretty small. I used the BTC MLP with a single input variable. The model was pretty simple, with only a single hidden layer in addition to the input and output layers.
import numpy as np
import optuna
import pandas as pd
import torch
from optuna.trial import TrialState
from sklearn.metrics import mean_squared_error

# Set to torch.device("cuda") to run on the GPU
DEVICE = torch.device("cpu")

# Load BTC closing prices and convert them to returns
df = pd.read_csv('data/btc.csv', usecols=['date', 'close'], index_col='date', parse_dates=True)
df_returns = df['close'].to_frame().pct_change()
df_returns.rename(columns={'close': 't'}, inplace=True)

# Single input variable: the previous period's return (t-1) predicts the current one (t)
df_returns.insert(0, 't-1', df_returns['t'].shift(1))
df_returns.dropna(inplace=True)

X = df_returns['t-1'].to_numpy(dtype=np.float32).reshape(-1, 1)
y = df_returns['t'].to_numpy(dtype=np.float32).reshape(-1, 1)

# Chronological split: first 700 rows for training, the rest for testing
train_limit = 700
X_train, X_test = X[:train_limit], X[train_limit:]
y_train, y_test = y[:train_limit], y[train_limit:]


class MLP_LinReg(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(1, 5)
        self.act = torch.nn.ReLU()
        self.fc2 = torch.nn.Linear(5, 1)

    def forward(self, x):
        out = self.act(self.fc1(x))
        out = self.fc2(out)
        return out


# Optuna function defining the model and the ranges of its hyperparameters
# Returns the value to be optimized
def objective(trial):
    model = MLP_LinReg().to(DEVICE)
    criterion = torch.nn.MSELoss()
    lr = trial.suggest_float("lr", 0.001, 0.01)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)

    # move the training data to the device once, not on every epoch
    # (the deprecated Variable wrapper is no longer needed)
    inputs = torch.from_numpy(X_train).to(DEVICE)
    targets = torch.from_numpy(y_train).to(DEVICE)

    # train the model
    for epoch in range(100):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, targets)
        loss.backward()
        optimizer.step()

    # run model on test inputs to get predictions
    with torch.no_grad():
        test_inputs = torch.from_numpy(X_test).to(DEVICE)
        y_pred = model(test_inputs).cpu().numpy()

    # compare predictions with actual returns (y_test) and return the RMSE
    loss = mean_squared_error(y_test, y_pred)
    return np.sqrt(loss)


if __name__ == '__main__':
    study = optuna.create_study(direction="minimize")
    study.optimize(objective, n_trials=10)

    pruned_trials = study.get_trials(deepcopy=False, states=[TrialState.PRUNED])
    complete_trials = study.get_trials(deepcopy=False, states=[TrialState.COMPLETE])

    print("Study statistics: ")
    print(" Number of finished trials: ", len(study.trials))
    print(" Number of pruned trials: ", len(pruned_trials))
    print(" Number of complete trials: ", len(complete_trials))

    print("Best trial:")
    trial = study.best_trial
    print(" Value: ", trial.value)
    print(" Params: ")
    for key, value in trial.params.items():
        print(" {}: {}".format(key, value))
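
On the GPU being slower than the CPU here: with a model this small and only 700 rows, each training step is so cheap that the fixed cost of launching work on the GPU probably outweighs the arithmetic it saves. The sketch below is one rough way to see that effect, not a proper benchmark. It uses random data instead of the BTC file, a `torch.nn.Sequential` stand-in for the same 1-5-1 network, and a hypothetical helper called `time_training`; the row counts are arbitrary.

```python
import time
import torch


def time_training(device, n_rows, epochs=100):
    """Time `epochs` full-batch SGD steps of a 1-5-1 MLP on random data."""
    torch.manual_seed(0)
    X = torch.randn(n_rows, 1, device=device)
    y = torch.randn(n_rows, 1, device=device)
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 5), torch.nn.ReLU(), torch.nn.Linear(5, 1)
    ).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    criterion = torch.nn.MSELoss()

    start = time.perf_counter()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = criterion(model(X), y)
        loss.backward()
        optimizer.step()
    if device.type == "cuda":
        # GPU kernels run asynchronously; wait for them before stopping the clock.
        torch.cuda.synchronize()
    return time.perf_counter() - start


devices = [torch.device("cpu")]
if torch.cuda.is_available():
    devices.append(torch.device("cuda"))

timings = {}
for device in devices:
    time_training(device, 700, epochs=5)  # warm-up run, result discarded
    for n_rows in (700, 200_000):
        elapsed = time_training(device, n_rows)
        timings[(device.type, n_rows)] = elapsed
        print(f"{device.type:>4}  rows={n_rows:>8,}  {elapsed:.3f}s")
```

On a machine without a GPU this only prints the CPU rows. With a GPU, comparing the 700-row and 200,000-row timings across devices gives a feel for where, if anywhere, the GPU starts to pay off for a network of this size.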