Hi,
I am running multi-class classification experiments and get the following error from ExpectedErrorReduction:
File "/home/julia/master_thesis/env/lib/python3.6/site-packages/alipy/query_strategy/query_labels.py", line 829, in select
score.append(pv[i, yi] * self.log_loss(prob))
IndexError: index 4 is out of bounds for axis 1 with size 4
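I think the error occurs because my initial seed set (`label_index`) does not contain all of the labels found in `y`, so the fitted model predicts probabilities for fewer classes than `np.unique(self.y)` returns.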
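A minimal sketch of that mismatch using scikit-learn alone. The toy data and the seed indices are made up for illustration (they are not my real dataset), and it uses the default `LogisticRegression` solver rather than `liblinear`. A model fitted on a seed set that lacks one class returns `predict_proba` output with one column fewer than `np.unique(y)` has entries:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data (assumption): 5 classes (0..4), 4 samples per class, 2 features.
rng = np.random.RandomState(0)
y = np.repeat(np.arange(5), 4)
X = rng.normal(size=(20, 2)) + y[:, None]

# Hypothetical seed set that happens to contain no sample of class 4.
label_index = np.arange(16)
unlabel_index = np.arange(16, 20)

model = LogisticRegression(max_iter=1000)
model.fit(X[label_index], y[label_index])
pv = model.predict_proba(X[unlabel_index])

n_all_classes = len(np.unique(y))  # classes present in the full y
n_proba_cols = pv.shape[1]         # classes the model was actually fitted on

# Indexing pv with a label the model never saw goes past the last column.
try:
    pv[0, 4]
    raised = False
except IndexError:
    raised = True

print(n_all_classes, n_proba_cols, raised)
```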
In the following code, shouldn't it be
classes = np.unique(label_y)
instead of
classes = np.unique(self.y)?
```python
if self.X is None or self.y is None:
    raise Exception('Data matrix is not provided.')
if model is None:
    model = LogisticRegression(solver='liblinear')
model.fit(self.X[label_index if isinstance(label_index, (list, np.ndarray)) else label_index.index],
          self.y[label_index if isinstance(label_index, (list, np.ndarray)) else label_index.index])
unlabel_x = self.X[unlabel_index]
label_y = self.y[label_index]
##################################
classes = np.unique(self.y)
pv, spv = _get_proba_pred(unlabel_x, model)
scores = []
for i in range(spv[0]):
    new_train_inds = np.append(label_index, unlabel_index[i])
    new_train_X = self.X[new_train_inds, :]
    unlabel_ind = list(unlabel_index)
    unlabel_ind.pop(i)
    new_unlabel_X = self.X[unlabel_ind, :]
    score = []
    for yi in classes:
        new_model = copy.deepcopy(model)
        new_model.fit(new_train_X, np.append(label_y, yi))
        prob = new_model.predict_proba(new_unlabel_X)
        score.append(pv[i, yi] * self.log_loss(prob))
    scores.append(np.sum(score))
return unlabel_index[nsmallestarg(scores, batch_size)]
```
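A related point, in case it is useful: `pv[i, yi]` uses the label value itself as the column index, which only lines up when the labels are exactly 0..k-1 and all of them were seen during `fit`. Below is a hedged sketch of indexing by column position instead, iterating over `model.classes_` (the scikit-learn attribute that lists the fitted labels in the column order of `predict_proba`). The helper `expected_loss_per_candidate` and the `loss_for` callback are hypothetical names for illustration, not part of alipy, and the constant loss only stands in for the retrain-and-score step:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def expected_loss_per_candidate(model, pv, loss_for):
    """Weight a per-label loss by the predicted class probability.

    model    : fitted classifier exposing `classes_`
    pv       : predict_proba output, shape (n_candidates, len(model.classes_))
    loss_for : hypothetical callback (candidate_position, label) -> float
    """
    scores = []
    for i in range(pv.shape[0]):
        score = 0.0
        # `col` is the column in pv, `label` is the actual class value.
        for col, label in enumerate(model.classes_):
            score += pv[i, col] * loss_for(i, label)
        scores.append(score)
    return np.array(scores)

# Toy data (assumption) with non-contiguous labels to show the difference.
y = np.array([1, 1, 3, 3, 7, 7])
X = np.array([[0.0, 0.1], [0.2, 0.0], [3.0, 3.1],
              [3.2, 2.9], [7.0, 7.1], [6.8, 7.2]])

model = LogisticRegression(max_iter=1000).fit(X, y)
pv = model.predict_proba(X)

# With a constant loss of 1.0 each score is just the row sum of pv.
scores = expected_loss_per_candidate(model, pv, lambda i, label: 1.0)
print(model.classes_, scores)
```

With this kind of loop the set of classes always matches the columns of `pv`, whether or not the seed set covers every label in `y`.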