Inference time #8
Sorry, the implementation of the Mahalanobis distance is not elegant and takes up most of the inference time; there may still be room for optimization. |
Thanks for the reply! I think it would be faster to compute the Mahalanobis distance with matrix multiplication instead of calculating the vectors in a for loop. |
Do you think that could be improved with the multiprocessing or joblib packages? |
Do you mean
This part takes a lot of time, right? |
@xiahaifeng1995, @okokchoi, you could also move the following into the training phase and save it with the mean.
So, in the training part:
I replaced the following: with:
|
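The suggestion above (invert the covariance once during training and store it next to the mean) could look roughly like the sketch below. It assumes the layout used by the code quoted in this thread: a mean of shape (C, H*W) and a covariance of shape (C, C, H*W). The helper name `invert_covariances`, the toy shapes, the small ridge term, and the in-memory buffer standing in for the pkl file are all illustrative assumptions, not code from the repository.

```python
import io
import pickle

import numpy as np


def invert_covariances(cov):
    """Invert every per-position covariance matrix once.

    cov: array of shape (C, C, N), one C x C matrix per position.
    Returns an array of the same shape holding the inverses.
    """
    cov_stack = np.moveaxis(cov, -1, 0)    # (N, C, C)
    inv_stack = np.linalg.inv(cov_stack)   # inverts each matrix in the stack
    return np.moveaxis(inv_stack, 0, -1)   # back to (C, C, N)


# Toy training statistics (shapes are assumptions for illustration).
rng = np.random.default_rng(0)
C, N = 4, 6
train_embeddings = rng.normal(size=(20, C, N))   # (B, C, N)
mean = train_embeddings.mean(axis=0)             # (C, N)
cov = np.zeros((C, C, N))
for i in range(N):
    # Small ridge term (an assumption here) keeps each matrix invertible.
    cov[:, :, i] = np.cov(train_embeddings[:, :, i], rowvar=False) + 0.01 * np.eye(C)

# Training: store the inverse instead of the covariance itself.
cov_inv = invert_covariances(cov)
buffer = io.BytesIO()                            # stands in for the pkl file
pickle.dump([mean, cov_inv], buffer)

# Testing: load and use cov_inv directly, with no np.linalg.inv in the loop.
buffer.seek(0)
loaded_mean, loaded_cov_inv = pickle.load(buffer)
```

A pkl file written before this change still contains the covariance rather than its inverse, so it would have to be regenerated before the testing side uses it as `VI`.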
Thanks a lot for your reply!
The dist values have the same length, but something is wrong with dist_list. |
@okokchoi, did you compute conv_inv and save it? See the training part, and replace it with:
and in testing:
|
I solved the problem: I had simply loaded the pkl file from the non-modified version. |
I think so, please try it and share your findings |
Ok I will 👍 |
I think it's pretty much the same. Besides the size of the feature map, the code below is heavy:
dist = SSD.cdist(embedding_vectors[:,:,i], mean[None, :], metric='mahalanobis', VI=conv_inv)
dist = list(itertools.chain(*dist))
Is there a way to run it in parallel? |
Improved 3.5 times through real process multiprocessing |
Awesome! Did you use the multiprocessing module in Pytorch? |
Thank you for the code, @DeepKnowledge1. I tried your code and was able to improve my inference time from 80 secs to 43 secs! I tried to use Cython with the code, but it didn't improve by much (this may be because SSD.cdist is already implemented in C). Based on the Mahalanobis equation (which can be referenced on SciPy's page), I used einsum to multiply the 3D matrices, which are the mean, inv_cov, and embedding vectors, without any looping. I was able to reduce my inference time from 43 secs to 2 secs! The code is as below
To improve further, maybe real-process multiprocessing, as mentioned by @ingbeeedd, could be implemented? By the way, I used this code for single-image inference, not for multiple images at a time, so with several images the mean, inv_cov and embedding_vectors matrices may be too large to calculate the Mahalanobis distance in one pass. Some modifications may be needed to process the data in batches. |
@GreatScherzo That's what I wanted to do: change the loop into a matrix calculation. I will apply some modifications to this. |
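The snippet referred to in the comment above did not survive in this copy of the thread. As a stand-in, here is a minimal sketch of the general einsum technique, not @GreatScherzo's actual code. It assumes embedding_vectors of shape (B, C, N), a mean of shape (C, N), and a precomputed inverse covariance of shape (C, C, N), with N = H*W; the function name `mahalanobis_map` is hypothetical.

```python
import numpy as np


def mahalanobis_map(embedding_vectors, mean, cov_inv):
    """Mahalanobis distance for every image and every position, without a Python loop.

    embedding_vectors: (B, C, N), mean: (C, N), cov_inv: (C, C, N).
    Returns distances of shape (B, N).
    """
    delta = embedding_vectors - mean[None, :, :]            # (B, C, N)
    # For each position n: projected[b, :, n] = cov_inv[:, :, n] @ delta[b, :, n]
    projected = np.einsum('ijn,bjn->bin', cov_inv, delta)   # (B, C, N)
    squared = np.sum(delta * projected, axis=1)             # (B, N)
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.clip(squared, 0.0, None))


# Toy inputs; shapes and values are assumptions for illustration only.
rng = np.random.default_rng(0)
B, C, N = 3, 5, 7
embedding_vectors = rng.normal(size=(B, C, N))
mean = rng.normal(size=(C, N))
factors = rng.normal(size=(N, C, C))
# Symmetric positive-definite matrices standing in for the inverse covariances.
cov_inv = np.moveaxis(factors @ factors.transpose(0, 2, 1) + np.eye(C), 0, -1)

dist = mahalanobis_map(embedding_vectors, mean, cov_inv)    # (B, N)
```

The result can be reshaped to (B, H, W) before interpolation. Because the batch axis is kept separate from the position axis, the same call covers one image or several; for large batches the intermediate (B, C, N) arrays may need to be computed in chunks to limit memory use.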
@fryegg @GreatScherzo
manager = multiprocessing.Manager()
cpu_core = 8
dist_list = manager.list()
for number in range(cpu_core):
    dist_list.append(manager.list())

def calculate_distance(number, start, end, train_outputs, embedding_vectors):
    global dist_list
    for i in range(start, end):
        mean = train_outputs[0][:, i]
        conv_inv = train_outputs[1][:, :, i]  # np.linalg.inv(train_outputs[1][:, :, i])
        dist = SSD.cdist(embedding_vectors[:, :, i], mean[None, :], metric='mahalanobis', VI=conv_inv)
        dist = list(itertools.chain(*dist))
        dist_list[number].append(dist)

# main function
procs = []
start = time.time()
for number in range(cpu_core):
    s = number * (H*W // cpu_core)
    e = (number + 1) * (H*W // cpu_core)
    proc = Process(target=calculate_distance, args=(number, s, e, train_outputs, embedding_vectors))
    procs.append(proc)
    proc.start()
for proc in procs:
    proc.join()
print("time :", time.time() - start)

global dist_list
final_list = []
for number in range(cpu_core):
    final_list.extend(dist_list[number])
final_list = np.array(final_list).transpose(1, 0).reshape(B, H, W)
final_list = torch.tensor(final_list)
score_map = F.interpolate(final_list.unsqueeze(1), size=x.size(2), mode='bilinear', align_corners=False).squeeze().numpy()

I'd appreciate it if you could give me your opinion. |
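One detail worth checking in the split above: with `H*W // cpu_core` as the chunk size, the last few positions are skipped whenever H*W is not divisible by cpu_core. A small sketch of boundaries that cover every index is shown below; the helper name `chunk_bounds` is hypothetical and not part of the posted code.

```python
def chunk_bounds(total, workers):
    """Split range(total) into `workers` contiguous (start, end) pairs.

    The first `total % workers` chunks get one extra element, so no index is dropped.
    """
    base, extra = divmod(total, workers)
    bounds = []
    start = 0
    for worker in range(workers):
        end = start + base + (1 if worker < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


# A 56 x 56 feature map with 8 workers splits evenly; 10 items over 4 workers do not.
even_split = chunk_bounds(56 * 56, 8)
uneven_split = chunk_bounds(10, 4)
```

Each `(start, end)` pair could be passed as the `s` and `e` arguments of `calculate_distance` in place of the arithmetic used in the loop.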
@ingbeeedd thank you very much for sharing your code! |
@fryegg @GreatScherzo |
@ingbeeedd |
@fryegg |
Hi @GreatScherzo, thanks for your improvement. It is faster, but the scores are different: the scores for the normal images are higher than those for the defective images. Do you have any explanation? |
@GreatScherzo It works fine with only one image, but if you have a batch, the scores are very different. |
By the way, I fixed that, so now the distance is vectorized and works whether you have one image or many. |
ok, thanks |
Thanks for your effort!
I have a question about PaDiM.
I saw in the paper that the average inference time with R18-Rd100 was 0.23 sec.
But in the test phase, calculating the Mahalanobis distance between the train and test image vectors takes about 9 sec when I use a GPU.
Any comments? Thanks!