Crash when calling MacenkoNormalizer.fit with tensorflow backend #37

bertrandchauveau · 2023-02-25T17:45:34Z

Hi,
I am sorry if my question is trivial but I have trouble using this package with the tensorflow backend.
Using torchstain 1.2.0, I have no problem performing a Macenko normalization with numpy. But as I try with tensorflow, it crashes using normalizer.fit

target_path = '/XXX.jpg'
target = cv2.cvtColor(cv2.imread(target_path), cv2.COLOR_BGR2RGB)
tf_normalizer = torchstain.normalizers.MacenkoNormalizer(backend='tensorflow')

The only thing that I am doing differently from the provided example is the tensor conversion of the numpy array.
That is, I am not doing this

T = transforms.Compose([
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x*255)
])

But rather tried this to match the transformation:

target = tf.constant(target, dtype=tf.float32)  #convert to tensor
target = tf.transpose(target, perm=[2, 0, 1])  #channel first

tf_normalizer.fit(target)

Is this why it crashes ? Is there a way to run this without using torchvision.transforms/on a pure TF basis?

I am using Tensorflow_2.10.0 and have installed torchstain using pip install torchstain[tf].
I currently do not use nor have installed torchvision in my TF environment.

Thank you for your advice

The text was updated successfully, but these errors were encountered:

andreped · 2023-02-25T17:59:55Z

I am sorry if my question is trivial but I have trouble using this package with the tensorflow backend.

Hello, @bertrandchauveau! I had this issue when making this myself, so no worries :]

You can take a look at what is done in the tests here.

Basically, do this instead:

import tensorflow as tf
import torchstain
import numpy as np

T = lambda x: tf.convert_to_tensor(np.moveaxis(x, -1, 0).astype("float32"))
t_to_transform = T(to_transform)

normalizer = torchstain.normalizers.MacenkoNormalizer(backend='tensorflow')
normalizer.fit(T(target))
result, _, _ = normalizer.normalize(I=t_to_transform, stains=True)

result = result.numpy().astype("float32")

Could you try this first to see if it resolves you issue? I'm a bit occupied right now, but could take a new look tomorrow, if you are still having issues.

This will be better documented in the upcoming release, which includes some new and interested stain normalization techniques and new backends (see here).

BTW: What is the status on the release, @carloalbertobarbano? Shall we aim to get it released by next week? I have a master student who would be interested in the new modified reinhard implementation.

bertrandchauveau · 2023-02-25T18:50:21Z

Thank you for your quick response!

Sadly the same problem occurs, i.e. crashes when running:

normalizer.fit(T(target))

the "T" conversion does the same as my attempt of tf tensor conversion

andreped · 2023-02-25T21:10:55Z

Sadly the same problem occurs, i.e. crashes when running:

Hmm, well, what I described above is what we do in the unit test, so that should work. Could you show me the error log from the terminal?

Also, could you try downloading the test data that we used for the unit tests here and here, and try running them through your code. I believe that should work. If that works, then the intensity range of your image after imread is in the wrong range. You can see the intensity range by running print(np.unique(image))

Also, I noticed that you were a pathologists. If you just want to get a method working, I would recommend trying the command line tool fast-stain-normalization that is based on torchstain. It enables you to normalize an entire folder without needing to code. Just provide arguments to a CLI and run it from the terminal. You can see how to use it here.

bertrandchauveau · 2023-02-26T16:24:13Z

I had the same issue with the test images that you provided.

This is the error message from the terminal:

2023-02-26 16:20:02.217992: I tensorflow/stream_executor/cuda/cuda_blas.cc:1614] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2023-02-26 16:20:02.225717: I tensorflow/core/util/cuda_solvers.cc:179] Creating GpuSolver handles for stream 000001CE08BFD700
2023-02-26 16:20:03.039762: F tensorflow/core/util/cuda_solvers.cc:114] Check failed: cusolverDnCreate(&cusolver_dn_handle) == CUSOLVER_STATUS_SUCCESS Failed to create cuSolverDN instance.
[I 16:20:28.009 NotebookApp] KernelRestarter: restarting kernel (1/5), keep random ports

I tried this kind of things from what I saw from stackoverflow, but the kernel still crashes:

gpu = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(device=gpu[0], enable=True)

As I understand it, Tensorflow tries to place the tensors on the GPU, but for whatever reason, it does not work (as you said, I'm a pathologist.) For note, I have an RTX 4090 in a Windows setup and I have not encountered similar issues when tranining deep learning models.

So by forcing Tensorflow to use the CPU with:

with tf.device('/CPU:0'):
    tf_normalizer.fit(T(target))
    result_tf, _, _ = tf_normalizer.normalize(I=t_to_transform, stains=True)

It works as intended.

Should it also work with the GPU?

andreped · 2023-02-26T19:51:31Z

I was unable to reproduce your issue. See gist.
As you can see from the gist, it works just fine with GPU, also for TF backend.

What you are observing I'm guessing is likely related to the TensorFloat-32 message your are seeing, which I have not seen before. This likely happens because you have a very new GPU, 4090, which I would think might produce some issues.

First I would try disabling TensorFloats, by adding this to the top of your script (after tf import): tf.enable_tensor_float_32_execution(False)

If that did not fix the issue, try installing the nightly release of TF to see if this has been fixed recently:

pip uninstall tensorflow && pip install tf-nightly

bertrandchauveau · 2023-03-02T08:36:12Z

Thank you for your response. Agree that it works nicely in colab.

On my local machine, I disabled TensorFloat-32 with :
tf.config.experimental.enable_tensor_float_32_execution(False)

But the kernel still crashes when fitting the normalizer.
Upgrading tensorflow won’t be as simple as that since I am currently running on native Windows and tf_2.10.0 was the last version that allowed this according to the tf documentation. Upgrading would require to use WSL2, but I am not ready for this right now.

My initial idea (perhaps not a good one) for my project was to use torchstain to normalize images on the fly using a custom data generator, this to avoid the duplication of the dataset (normalized and non-normalized).

For now, I will duplicate my dataset, as relying on the CPU for normalization slows down the batch preparation pretty much. I’ll give it a try when I’m ready to upgrade tensorflow or will try with pytorch which seems less windows-phobic.

carloalbertobarbano · 2023-03-02T09:24:50Z

Hi @bertrandchauveau, what version of CUDA and cuDNN are you using?

andreped · 2023-03-02T10:42:40Z

My initial idea (perhaps not a good one) for my project was to use torchstain to normalize images on the fly using a custom data generator, this to avoid the duplication of the dataset (normalized and non-normalized).

That's exactly what I do in my training frameworks and that works just fine. As long as you are using tf.data.Dataset and take advantage of multithreading, it is barely any lag :] But I guess it depends on how much lag you expect and can tolerate, how large the images are, which CPU and SSD/HDD you have, and whatnot.

I don't really work on windows for training models anymore. Note that multithreading does not work as well on windows, as for UNIX-based systems.

Hi @bertrandchauveau, what version of CUDA and cuDNN are you using?

I guess as you seem to be using anaconda, you have installed CUDA through something like this. As I said, I don't have that much experience with conda, as I don't use it myself, but I guess @carloalbertobarbano can help you on that.

bertrandchauveau · 2023-03-02T13:09:53Z

Hi @carloalbertobarbano,
cudatoolkit 11.2.2
cudnn 8.1.0.77
Exactly, installed via conda

andreped · 2023-04-21T06:29:52Z

@bertrandchauveau Are you still experiencing issues?

bertrandchauveau · 2023-04-23T19:16:15Z

Hi,
Thank you for your message and sorry for my late reply. Since my last message:

I installed torchstain 1.3.0
kernel still crashes when using the Macenko approach, in fact now when calling:
torchstain.normalizers.MacenkoNormalizer(backend='tensorflow')
Same error message as before.
With using modified Reinhard method on a single image, sometimes it worked with the GPU, sometimes it crashed. I did not have time to explore this more.

It works when I force torchstain to work on the CPU. With tf.data.Dataset, it is true that there is not much lag during pure training (about +10% for me as compared to no stain normalization) but the validation step after each training epoch is much longer.

As you suggested it, I tried to install the last tf.2.12 on WSL, but failed for now with it seems endless error messages for tf to simply work and recognize the GPU...

I should have a bit more time this week to see why sometimes it seems to work with the modified Reinhard method.

andreped · 2023-04-23T19:18:33Z

As you suggested it, I tried to install the last tf.2.12 on WSL, but failed for now with it seems endless error messages for tf to simply work and recognize the GPU...

AFAIK, there does not yet exist a precompiled binary of tf 2.12 on windows, so I believe that might result in some issues. But if you are using WSL it should work better. You could post the error messages you are getting and I could try to debug it for you. Note that I believe you need a nightly release, as the GPU you have might be too new, as discussed above.

I should have a bit more time this week to see why sometimes it seems to work with the modified Reinhard method.

Why it sometimes works and sometimes fails does not make much sense to me. Have you tried not using Anaconda and just regular Python virtual environments? You will need to setup CUDA yourself then.

andreped self-assigned this Feb 25, 2023

andreped added good first issue Good for newcomers documentation Improvements or additions to documentation labels Feb 25, 2023

andreped added this to torchstain Feb 25, 2023

RECranston mentioned this issue Oct 7, 2024

Crash when using normalizers.MacenkoNormalizer(backend='torch') #55

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Crash when calling MacenkoNormalizer.fit with tensorflow backend #37

Crash when calling MacenkoNormalizer.fit with tensorflow backend #37

bertrandchauveau commented Feb 25, 2023

andreped commented Feb 25, 2023 •

edited

Loading

bertrandchauveau commented Feb 25, 2023

andreped commented Feb 25, 2023

bertrandchauveau commented Feb 26, 2023

andreped commented Feb 26, 2023 •

edited

Loading

bertrandchauveau commented Mar 2, 2023

carloalbertobarbano commented Mar 2, 2023

andreped commented Mar 2, 2023

bertrandchauveau commented Mar 2, 2023

andreped commented Apr 21, 2023

bertrandchauveau commented Apr 23, 2023

andreped commented Apr 23, 2023 •

edited

Loading

Crash when calling MacenkoNormalizer.fit with tensorflow backend #37

Crash when calling MacenkoNormalizer.fit with tensorflow backend #37

Comments

bertrandchauveau commented Feb 25, 2023

andreped commented Feb 25, 2023 • edited Loading

bertrandchauveau commented Feb 25, 2023

andreped commented Feb 25, 2023

bertrandchauveau commented Feb 26, 2023

andreped commented Feb 26, 2023 • edited Loading

bertrandchauveau commented Mar 2, 2023

carloalbertobarbano commented Mar 2, 2023

andreped commented Mar 2, 2023

bertrandchauveau commented Mar 2, 2023

andreped commented Apr 21, 2023

bertrandchauveau commented Apr 23, 2023

andreped commented Apr 23, 2023 • edited Loading

andreped commented Feb 25, 2023 •

edited

Loading

andreped commented Feb 26, 2023 •

edited

Loading

andreped commented Apr 23, 2023 •

edited

Loading