Increasing archive throughput #901

Open
jayhohoho2019 opened this issue Aug 8, 2021 Discussed in #900 · 13 comments

@jayhohoho2019

Discussed in #900

Originally posted by jayhohoho2019 August 8, 2021
Hello,
Which parameter, if any, sets the archive polling period? Believe it or not, 1 minute is getting too long for me now. Thanks.

In addition to making the archive mode polling period configurable, could we allow running multiple rsync processes in parallel? I'm referring to the local_rsync mode only at this point.

My issue is that, with a fast plotter in use, it now takes longer to rsync a plot from my dst drive (NVME SSD) to the final archive HDD than to create a plot and save it to the dst drive, so the dst drive fills up after a while. Running 2 rsync to 2 archive HDDs would solve this.
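The bottleneck described here can be put into rough numbers. The following is only a back-of-the-envelope sketch: the function name is hypothetical, the timings are made up rather than measured on this setup, and it assumes each archive HDD sustains its own transfer rate independently of the other.

```python
def backlog_growth_per_hour(plot_minutes, transfer_minutes, parallel_transfers=1):
    """Net plots per hour piling up on the dst drive (negative means it drains)."""
    produced = 60.0 / plot_minutes                        # plots written to dst per hour
    archived = parallel_transfers * 60.0 / transfer_minutes  # plots moved off dst per hour
    return produced - archived


# Hypothetical timings: a plot every 30 minutes, 40 minutes per rsync transfer.
print(backlog_growth_per_hour(30, 40, parallel_transfers=1))  # 0.5: dst slowly fills
print(backlog_growth_per_hour(30, 40, parallel_transfers=2))  # -1.0: dst drains
```

With these illustrative numbers a single rsync falls behind by half a plot per hour, while two truly parallel transfers would keep the dst drive from filling.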

@altendky
Collaborator

altendky commented Aug 8, 2021

It is a thing that has been discussed but I can't say I've got any schedule for implementing it.

@jayhohoho2019
Author

Would it be possible to run two instances of plotman archive each with its own config.yaml, and a slightly different target definition?

@altendky
Collaborator

Yes, sorry, I didn't think to mention that. It detects based on the site root, so if you just mount to two different directories that would cut it. I don't think we actually have a configuration path override though... Definitely a missing feature. I should also have mentioned https://github.com/rjsears/chia_plot_manager. The author uses plotman to plot and their own tooling to do "higher end" archiving (plus whatever other features it has).

@jayhohoho2019
Author

jayhohoho2019 commented Aug 10, 2021

Thanks for the info. I just need to double the plotman archive throughput at this point so will take a look at the chia plot manager later. I am able to run two instances of plotman archive now each working off a different tmp dir (dst drive), and to a different site root.

@jayhohoho2019
Author

Yes it'll be more convenient to allow config file override.

@jayhohoho2019
Author

Actually I think there is some issue there. When 1 instance is running rsync, the other instance doesn't start rsync. But perhaps since I changed the archive sleep time to 10s, each instance from time to time starts rsync around the same minute. I recall the archive.py code is actually checking for the transfer script name and argument list. Any suggestions?

@altendky
Collaborator

plotman does check for the site_root or URL to be in the options of existing rsync processes. Part of the suggestion was to have the different drives under a different site_root so they wouldn't detect rsync processes from the other plotman archiving instance.

To be clear, yes, we are talking about an annoying hacky way to get to what you want (sort of). I'm not suggesting this is a good way for plotman to work.

@jayhohoho2019
Author

That's exactly how I set this up: two config files with different site_roots (and different buffer drive paths and different log directories). The inconvenience is that I had to copy the desired config file to the only location plotman looks in before starting each plotman instance. The real problem, it seems, is that plotman still detects the rsync process run by the other instance most of the time. The detection seems to fail only when the other rsync is started within the same minute. So I either see only 1 rsync running (most of the time) or 2 rsync processes that started in the same hour and minute. Is the code checking for the command_name (rsync in both instances) AND the site_root (different by 1 character between the two instances)?

So with this setup and the problem, I have increased my archive throughput but by nowhere close to doubling it. The buffer drive Use% is still growing, although at a much slower rate than when only 1 instance was running.

@altendky
Collaborator

if proc.name() == proc_name:
    args = proc.cmdline()
    for arg in args:
        if arg.startswith(dest):
            jobs.append(proc.pid)

What are the actual site roots? Is one just the other one plus a character? Perhaps just share both complete config files.

@jayhohoho2019
Author

Yes. The second site_root is the first site_root with a 1 appended (${site_root}1). So it's getting a partial match from startswith, I suppose.
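To make the partial match concrete, here is a small sketch that applies the same name and `startswith` test as the snippet quoted above, but to plain tuples instead of live processes. The function name, PIDs, paths, and rsync argument lists are all made up for illustration; they are not taken from the actual configs.

```python
def matching_pids(processes, proc_name, dest):
    """Apply the quoted name/startswith check to (pid, name, cmdline) tuples."""
    jobs = []
    for pid, name, cmdline in processes:
        if name == proc_name:
            for arg in cmdline:
                if arg.startswith(dest):
                    jobs.append(pid)
    return jobs


# Hypothetical rsync processes, one per archiving instance.
processes = [
    (101, "rsync", ["rsync", "--remove-source-files", "/buf1/a.plot", "/farm/sites/disk0/"]),
    (202, "rsync", ["rsync", "--remove-source-files", "/buf2/b.plot", "/farm/sites1/disk0/"]),
]

print(matching_pids(processes, "rsync", "/farm/sites"))   # [101, 202]
print(matching_pids(processes, "rsync", "/farm/sites1"))  # [202]
```

In this sketch the instance with the shorter root counts the other instance's rsync as its own, while the instance with the longer root does not. If the real setup behaves the same way, that asymmetry would be consistent with throughput improving without doubling.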

@jayhohoho2019
Author

Is there a function that does an exact match? Or I suppose I can change the site_root to something else, like appending a 2 to it.

@altendky
Collaborator

Yeah, for now, making it so that neither starts with the other seems best. I'm sure the code could change as well.
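One way the check could change is to compare whole path components rather than raw string prefixes. This is a minimal sketch, not plotman's code: `is_under` and `roots_conflict` are hypothetical helper names, the paths are made up, and `/` is assumed to be the separator for both local paths and rsync URLs.

```python
def is_under(arg, root, sep="/"):
    """True if arg is root itself or lies below it, matching whole path components."""
    root = root.rstrip(sep)
    return arg == root or arg.startswith(root + sep)


def roots_conflict(root_a, root_b):
    """True if a plain startswith check would confuse the two roots."""
    return root_a.startswith(root_b) or root_b.startswith(root_a)


# Boundary-aware matching keeps the two roots apart.
print(is_under("/farm/sites/disk0/", "/farm/sites"))   # True
print(is_under("/farm/sites1/disk0/", "/farm/sites"))  # False

# Quick check for the naming workaround with the current prefix test.
print(roots_conflict("/farm/sites", "/farm/sites1"))     # True: rename one of them
print(roots_conflict("/farm/sites_a", "/farm/sites_b"))  # False: safe as-is
```

`roots_conflict` covers the naming workaround under the current behavior, while `is_under` shows what a stricter comparison might look like.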

@jayhohoho2019
Author

Ok. So after naming the 2 site_roots so that neither starts with the other, both plotman archive instances now fire up 1 rsync each, regardless of the other instance. To summarize, I guess a few things would be good to have:

  1. allow config file override
  2. make archive mode polling period configurable, and
  3. make string match exact in the test for transfer script dest
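Items 1 and 2 could look roughly like the sketch below. It is only an illustration under stated assumptions: the `--config` and `--poll-seconds` flags, the `archive-sketch` program name, and the `archive_loop` helper are all hypothetical and are not existing plotman options. The 60 second default mirrors the 1 minute period mentioned at the top of the thread.

```python
import argparse
import time


def parse_args(argv=None):
    """Parse the two hypothetical options discussed in this thread."""
    parser = argparse.ArgumentParser(prog="archive-sketch")
    parser.add_argument("--config", default=None,
                        help="path to an alternate config file (hypothetical flag)")
    parser.add_argument("--poll-seconds", type=float, default=60.0,
                        help="seconds to wait between archive checks (hypothetical flag)")
    return parser.parse_args(argv)


def archive_loop(run_once, poll_seconds, iterations, sleep=time.sleep):
    """Call run_once, then wait poll_seconds, for a fixed number of iterations."""
    for _ in range(iterations):
        run_once()
        sleep(poll_seconds)


args = parse_args(["--config", "/tmp/second.yaml", "--poll-seconds", "10"])
print(args.config, args.poll_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise without actually waiting, for example by handing it a list's `append` method in a test.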

FYI I'm using plotman archive only and it's been working well.
