Thread unsafety edge case when loading/using GeoImages on Windows #23

Open
kb173 opened this issue Apr 16, 2020 · 0 comments
Labels
bug Something isn't working


We're still not sure what exactly causes it, but crashes can occur on Windows in very specific circumstances. They appear to be related to loading new (non-cached) GeoImages from multiple threads. Here are some of our observations from testing in LandscapeLab:

  • Using a copy of the file in the other thread fixes the crash -> It's definitely related to loading from the same GeoTIFF
  • Using the same file all the time fixes the crash -> The cache is doing its job, only loading new files seems to be the problem
  • Removing any multi-threading fixes the crash -> It's definitely related to multi-threading

In 2419613, mutexes were added around every place where a new resource is created or loaded. Interestingly, this did not fix the crashes, although they did seem to become less frequent.

With these mutexes and the cache, the creation of the GeoImage (loading the GeoRaster, reading it as an array, etc.) is effectively serialized and never runs in parallel. That's why this behavior is really strange... I can't think of any situation where our two threads actually do something in parallel without a mutex, aside from GeoImage.get_image() and GeoImage.get_image_texture(). However, even those functions use a mutex internally.
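
The arrangement described above (a cache plus a mutex around every load) can be sketched roughly as follows. This is only an illustration in Python, not the plugin's actual code: `GeoImageCache`, `fake_load`, and the file names are hypothetical stand-ins. It shows why, with this pattern, the loader itself should never run on two threads at once.

```python
import threading
import time


class GeoImageCache:
    """Hypothetical cache: one lock guards both the lookup and the load."""

    def __init__(self, load_fn):
        self._load_fn = load_fn      # stand-in for the real raster loader
        self._lock = threading.Lock()
        self._cache = {}
        self.active_loads = 0        # loaders running right now
        self.max_active_loads = 0    # highest value ever observed
        self.load_count = 0          # completed loads (cache misses)

    def get(self, path):
        # Holding the lock for the whole call serializes every load,
        # so two threads can never be inside load_fn at the same time.
        with self._lock:
            if path in self._cache:
                return self._cache[path]
            self.active_loads += 1
            self.max_active_loads = max(self.max_active_loads, self.active_loads)
            try:
                image = self._load_fn(path)
            finally:
                self.active_loads -= 1
            self.load_count += 1
            self._cache[path] = image
            return image


def fake_load(path):
    time.sleep(0.01)  # pretend to read a raster from disk
    return {"path": path}


def run_demo():
    cache = GeoImageCache(fake_load)
    paths = ["a.tif", "b.tif", "a.tif", "c.tif"]
    threads = [threading.Thread(target=cache.get, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache
```

If crashes persist with a structure like this, the race presumably lies outside the guarded section, for example in state shared inside the underlying library or in how the loaded data is used after the lock is released.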

This crash has never been observed on Linux, while it is pretty frequent on Windows: it always occurred within roughly 5 attempts at loading terrain up to a high LOD. So I'd say the issue is related either to GDAL on Windows or to our compilation on Windows (which uses Visual C++, not g++).

There is no error message at all; the window simply closes.

We can currently work around this issue by using two copies of the same heightmap, one in the TerrainModule and one in the TerrainColliderModule. While this works, it's not very clean, so it would be good to figure out exactly what's going on.
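
The workaround amounts to giving each consumer a file of its own. A minimal sketch of automating that, assuming plain file copies are sufficient (`make_private_copy`, the consumer names, and the placeholder file are hypothetical; the real modules would open the copies through the plugin):

```python
import shutil
import tempfile
from pathlib import Path


def make_private_copy(source, consumer_name, target_dir):
    """Copy `source` so that `consumer_name` gets a file of its own."""
    source = Path(source)
    target = Path(target_dir) / f"{source.stem}.{consumer_name}{source.suffix}"
    shutil.copy2(source, target)
    return target


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        heightmap = Path(tmp) / "heightmap.tif"
        heightmap.write_bytes(b"placeholder raster bytes")  # not a real GeoTIFF
        terrain = make_private_copy(heightmap, "terrain", tmp)
        collider = make_private_copy(heightmap, "collider", tmp)
        same_content = terrain.read_bytes() == collider.read_bytes()
        return terrain.name, collider.name, same_content
```

This only hides the problem, at the cost of duplicated data on disk, which is why finding the root cause would still be preferable.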

(related to #10)

kb173 added the bug label Apr 16, 2020