
Implement on-disk cache during download #17

Open
derrod opened this issue May 2, 2020 · 16 comments · May be fixed by #683

@derrod
Owner

derrod commented May 2, 2020

While 1 GiB is enough for almost any title, World War Z defies that trend by requiring up to 5.3 GiB of cache to successfully reconstruct its files. This is due to heavy duplication on the game's part: it ships with both a client and a server that share large amounts of their files, and the resulting heavy deduplication on Epic's side means a lot of cache is required for successful reconstruction. There are basically three ways this could be addressed:

  • Optimize file processing order to discard cache sooner (i.e. grouping files that share lots of chunks)
  • Throw away cache when it gets full and redownload later (inefficient and would require redownload logic)
  • Add on-disk cache to temporarily store excess cache on disk during the download (annoying with mechanical drives)

Since the first two approaches are complicated and neither is an efficient cure-all, we will have to add an on-disk caching capability.
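
For illustration, a minimal sketch of what option 3 could look like: a chunk cache that keeps data in memory up to a configurable limit and spills the overflow to temporary files on disk. All names here are hypothetical, not legendary's actual classes.

```python
import os
import tempfile


class OverflowChunkCache:
    """Keep chunks in memory up to `mem_limit` bytes and spill anything
    beyond that to temporary files on disk (hypothetical sketch)."""

    def __init__(self, mem_limit=1024**3, cache_dir=None):
        self.mem_limit = mem_limit
        self.mem_used = 0
        self._mem = {}    # chunk guid -> bytes held in memory
        self._disk = {}   # chunk guid -> path of spilled chunk file
        self._dir = cache_dir or tempfile.mkdtemp(prefix='legendary_cache_')

    def put(self, guid, data):
        if self.mem_used + len(data) <= self.mem_limit:
            self._mem[guid] = data
            self.mem_used += len(data)
        else:
            # memory budget exhausted: write the chunk to disk instead
            path = os.path.join(self._dir, f'{guid}.chunk')
            with open(path, 'wb') as f:
                f.write(data)
            self._disk[guid] = path

    def get(self, guid):
        if guid in self._mem:
            return self._mem[guid]
        with open(self._disk[guid], 'rb') as f:
            return f.read()

    def discard(self, guid):
        # called once no remaining file references the chunk
        if guid in self._mem:
            self.mem_used -= len(self._mem.pop(guid))
        elif guid in self._disk:
            os.unlink(self._disk.pop(guid))
```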

@derrod derrod closed this as completed in 69383c4 May 4, 2020
@derrod derrod reopened this May 4, 2020
@derrod
Owner Author

derrod commented May 4, 2020

This issue is partially fixed now by implementing a simple version of No. 1. But an on-disk cache is ultimately still required.
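
For reference, the "simple version of No. 1" amounts to ordering files so that ones sharing many chunks are processed back to back, which lets shared chunks be dropped from the cache sooner. A toy greedy version might look like the following; the dict-of-chunk-sets input is a simplified stand-in for the manifest data, not the actual optimizer code.

```python
def reorder_files(files):
    """Greedy ordering: repeatedly pick the file that shares the most
    chunks with the previously processed file, so shared chunks can be
    discarded from the cache as early as possible.

    `files` maps file name -> set of chunk GUIDs (simplified stand-in
    for the manifest's chunk parts).
    """
    remaining = dict(files)
    order = []
    # start with the file that references the most chunks
    current = max(remaining, key=lambda f: len(remaining[f]))

    while remaining:
        order.append(current)
        current_chunks = remaining.pop(current)
        if not remaining:
            break
        # next file is the one with the biggest chunk overlap
        current = max(remaining, key=lambda f: len(remaining[f] & current_chunks))

    return order


if __name__ == '__main__':
    manifest = {
        'client.pak': {1, 2, 3, 4},
        'server.pak': {1, 2, 3, 5},   # mostly duplicates the client data
        'movies.bik': {9, 10},
    }
    print(reorder_files(manifest))    # ['client.pak', 'server.pak', 'movies.bik']
```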

@derrod derrod added this to the 0.1.0 milestone May 14, 2020
@RobVor

RobVor commented Nov 21, 2020

Hi, similar issue for "Ark: Survival Evolved". The cache requirement is much lower, but it still causes an error.

MemoryError: Current shared memory cache is smaller than required! 1024.0 MiB < 1319.0 MiB. Try running legendary with the --enable-reordering flag to reduce memory usage.

Adding the flag doesn't change anything.

Any other suggestion?

@derrod
Owner Author

derrod commented Nov 21, 2020

Increase the size of the shared memory with --max-shared-memory, e.g. --max-shared-memory 1536

@RobVor

RobVor commented Nov 21, 2020

Perfect! Thanks for this and the fast response!

@DoubleAgentDave

Downloading to shared memory is a bit scary. I've got 16 GB of RAM and I'm just watching it fill up while waiting for Pillars of Eternity to download, which apparently needs 23 GB of shared memory. It magically reduced itself to 12 GB, but that still leaves only 4 GB to run a browser to write this GitHub comment. This isn't how shared memory should be used, surely?

@DoubleAgentDave

[DLManager] INFO: - Cache usage: 7894.0 MiB, active tasks: 16

This is literally using up all the cache I have.

@DoubleAgentDave

This is using the flag --max-shared-memory 23385 which is logically what I should have done given the above comment. It won't install without the flag. I did it knowing it would be a mess, and it was.

@DoubleAgentDave

Tried again, and it needed less shared memory than before: legendary install bcc75c246fe04e45b0c1f1c3fd52503a --enable-reordering --max-shared-memory 15587. Still too much.

@derrod
Owner Author

derrod commented Dec 12, 2020

Yeah the developers fucked up. They essentially included multiple copies of the game so there's a lot of duplication that the legendary algorithm can't deal with (the reorder optimization would fix that, but it is disabled for games with too many files, as it gets rather slow).

What does work is downloading the game with --prefix win to ignore the duplicated files, but currently that disables the installation so the game wouldn't be launchable via legendary.

Edit: I ran the optimisation process with the limit disabled; it got the memory requirement down to less than 2 GiB, which is probably similar in size to the biggest duplicated file. Still not great that they messed up the upload, though. I'll have to provide some workaround, I guess.

@derrod
Owner Author

derrod commented Dec 14, 2020

For the time being I have reworked the optimizer; it is now significantly faster and can handle larger numbers of files.

Just as an example: the previous version took over 500 seconds (nearly 9 minutes) for Pillars of Eternity, the new version takes around 7 seconds (on my machine).

The optimizer is also now enabled by default for Pillars. Unfortunately that still leaves it above the default limit and will require a manual increase to 2 GiB to work, at least until Obsidian or Paradox fix their uploaded version. I did not want to implement a workaround that manually adds a prefix filter, as that would have required a bit more work for something that really is only required for a single game.

@DoubleAgentDave

That is a LOT better. You could maybe get around the manual increase of shared memory by stopping and resuming the download process once the cache reaches the specified limit; I noticed it uses a lot less shared memory the second time you start the process. It's a hacky way of doing it, and you could leave a warning that the limit should be increased. For now though, a change from requiring 15-20 GB to 2 GB is good.

@derrod
Owner Author

derrod commented Dec 15, 2020

That is effectively just doing 2) in the issue description, since you're just downloading the duplicated data twice and throwing it away between duplicated files instead of keeping it in the cache.

@Roman513

Roman513 commented Dec 21, 2024

@derrod What do you think about not trying to do everything in memory and instead downloading chunks to disk first, as the official client does, and then constructing the binaries as a separate job? How do you feel about this approach?
I got enormous memory requirements, around 12 GB even with reordering, for one game, and I think that is hard to accept. Frontends like Heroic cannot do anything about it either, so inexperienced users suffer.

We could use shutil.copyfileobj to do this efficiently and then do all the magic with the downloaded chunks.

I've started digging into it, but thought it better to ask you first because this is a big change.
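
A minimal sketch of the chunks-to-disk-first idea described above: chunks are staged in a directory, then each output file is assembled from them. Note that shutil.copyfileobj copies until EOF (its length argument is only a buffer size), so partial chunk slices still need a manual bounded copy. The (guid, offset, size) layout is a simplified assumption, not legendary's actual manifest structures.

```python
import os
import shutil


def assemble_file(out_path, parts, chunk_dir, bufsize=1024 * 1024):
    """Build one output file from chunk files that were downloaded to
    `chunk_dir` earlier (hypothetical sketch).

    `parts` is an ordered list of (chunk_guid, offset, size) tuples;
    size=None means "the rest of the chunk".
    """
    with open(out_path, 'wb') as out:
        for guid, offset, size in parts:
            chunk_path = os.path.join(chunk_dir, f'{guid}.chunk')
            with open(chunk_path, 'rb') as chunk:
                chunk.seek(offset)
                if size is None:
                    # whole remainder of the chunk: stream it without
                    # loading the chunk into memory
                    shutil.copyfileobj(chunk, out, bufsize)
                else:
                    # partial slice: copy exactly `size` bytes by hand
                    remaining = size
                    while remaining:
                        buf = chunk.read(min(bufsize, remaining))
                        if not buf:
                            raise IOError(f'chunk {guid} ended early')
                        out.write(buf)
                        remaining -= len(buf)
```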

@derrod
Owner Author

derrod commented Dec 22, 2024

What game is actually that poorly packaged that it has 12 GB of duplicate data?

I don't think there's any good way to really do this within the architecture of the current download manager. I wanted to rewrite it a long time ago, but never got around to it...

Perhaps the data should just be redownloaded if it's needed again and the cache isn't big enough; that might be an easier change to make.
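
A rough sketch of that fallback: if a chunk that is needed again was already evicted because the cache was full, queue it for another download instead of failing. The names here are illustrative, not the actual DLManager internals.

```python
class ChunkEvicted(Exception):
    """A chunk needed for the current file part was evicted from the
    cache and has been queued for re-download."""


def get_chunk(guid, cache, download_queue):
    """Return chunk data from the in-memory cache, or schedule a
    re-download if the chunk was thrown away to stay under the limit."""
    data = cache.get(guid)
    if data is None:
        # chunk is no longer cached: request it again and let the caller
        # defer writing this file part until the data arrives
        download_queue.put(guid)
        raise ChunkEvicted(guid)
    return data
```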

@Roman513

Roman513 commented Dec 22, 2024

What game is actually that poorly packaged that it has 12 GB of duplicate data?

The Talos Principle 2 (677565d027be4b98998c39a35e31767a)

Perhaps the data should just be redownloaded if it's needed again and the cache isn't big enough, that might be an easier change to make.

Yes, this is better than nothing.
Probably, with the current architecture, we could read chunks back from an already downloaded file instead of keeping them in memory, as already happens during updates, but that does not sound easy to write and maintain.

@Roman513

@derrod I found a way to dramatically decrease memory usage by reusing chunks from files that were already fully downloaded in the same download session, the same approach we already use for reusing files from the old manifest during analysis. In my case with The Talos Principle 2 I don't see even 300 MB of cache usage anymore.

See #683
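
The gist of that approach, sketched with hypothetical bookkeeping (the real change lives in #683): when a chunk is needed again but no longer cached, re-read its bytes from an output file that was already fully written earlier in the same session.

```python
import os


def read_chunk_from_finished_file(guid, offset, size, written_parts, install_dir):
    """Re-read a chunk slice from a file that was already fully written
    this session instead of keeping the chunk in memory.

    `written_parts` maps chunk GUID -> (relative path of a finished file,
    offset of that chunk's data inside that file) -- hypothetical layout.
    """
    rel_path, file_offset = written_parts[guid]
    with open(os.path.join(install_dir, rel_path), 'rb') as f:
        f.seek(file_offset + offset)
        return f.read(size)
```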
