luci-app-statistics: Add backup/restore for RRD statistics #6646
USE CASES:
USE_PROCD=1

# We only want to restore a sysupgrade backup file if
Might want to add a note about this being better than a cronjob, as it doesn't constantly write to flash on devices with non-durable storage.
I added something, let me know if you like it.
applications/luci-app-statistics/root/etc/init.d/luci_statistics
Looks good to me!
I also added a README.backups explaining the use of |
@hnyman what do you think?
I haven't tested it yet, but one comment regarding verbosity: the init script and any file included in the .ipk package should be kept short and terse; please put verbose explanations into the commit message. Right now the init script looks like a novel ;-)
I wondered about that. I assume that since the scripts eat flash space, minimizing them is best practice for OpenWrt/LEDE and embedded distros in general. Would it be appropriate for @jtkohl to just move all that doc to the README? Actually, on second thought, I'd guess the best thing would be to put it in the wiki: https://openwrt.org/docs/guide-user/luci/luci_app_statistics
Sure thing... I'm still learning my way around the OpenWrt development norms. How about putting the design notes into the Makefile for the package, or as a README in the sources? I'm also happy to put notes in the wiki if desired. EDIT: I'm happy to update the wiki section on preserving statistics if/when this is merged into the project's repository.
applications/luci-app-statistics/root/etc/init.d/luci_statistics
I made some comments, generic in nature, aiming mainly for simplification.
I have not yet tested your code, but the idea looks ok in principle.
I wonder about the backup file list generation. Is it not enough to just provide a directory to back up? Does it need real evaluation? (What actually gets added to the "filelist" after the sed code?)
E.g. in my use case, the wlan interface names might change between 22.03, 23.05 and master due to changes in the default naming logic, so the exact files to be backed up vary a bit depending on the specific build on the same router, as there are RRD records unused by a specific OpenWrt version. (Well, I have named my wlan interfaces to avoid that, but in principle...)
Similarly, I am thinking about the logic to remove backup files.
Is that crash-proof? If the router crashes and reboots, what happens? Is something restored?
Some context: I flash a different main/master test build roughly every 3 days to my routers, and they do crash every now and then. My current manual (cronjob) "store a backup a day as .tar.gz" approach ensures that I have last night's copy. What happens with your new scheme if the router crashes, reboots and then continues operation? Does the backup made at the last sysupgrade survive, or is it lost? (Would daily backups be possible?)
(I am currently using daily timestamp-named files: if the router crashes and reboots (and there is no automatic restore), I do not want the next night's empty short-term backup to overwrite the single backup; I also want to preserve the earlier backups so I can manually select a suitable pre-crash backup to restore. So the standard backup name symlinks to the most recent daily backup, but the daily backups are individual files. Sure, I am in the minority, but the long-term monthly/yearly trend is the most useful tool to see gradual changes. And sure, I do need to manually remove the old backups.)
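For illustration, the daily timestamp-named scheme described in this comment could look roughly like the sketch below. It is not the reviewer's actual cron script: the function name `daily_rrd_backup`, the variables `RRD_DIR` and `BACKUP_DIR`, the default paths, and the link name `rrd-latest.tar.gz` are all assumptions made up for the example.

```shell
#!/bin/sh
# Sketch only: every path and name here is a hypothetical default,
# not something taken from the PR or from the reviewer's setup.
RRD_DIR="${RRD_DIR:-/tmp/rrd}"
BACKUP_DIR="${BACKUP_DIR:-/etc/luci_statistics}"

# Write one archive per day and point a stable name at the newest one.
daily_rrd_backup() {
	stamp="$(date +%Y%m%d)"
	archive="$BACKUP_DIR/rrd-$stamp.tar.gz"
	mkdir -p "$BACKUP_DIR" || return 1
	# Write to a temporary name first, so a crash in mid-write does not
	# leave a truncated file under the final name.
	tar -czf "$archive.tmp" -C "$RRD_DIR" . || { rm -f "$archive.tmp"; return 1; }
	mv "$archive.tmp" "$archive" || return 1
	# The stable name is a relative symlink to today's archive; older
	# daily archives stay in place until they are removed by hand.
	ln -sf "rrd-$stamp.tar.gz" "$BACKUP_DIR/rrd-latest.tar.gz"
}
```

Called once a night (for example from cron), this keeps each day's archive as a separate file, so a backup taken after a crash cannot overwrite an earlier, fuller one.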
### restart collectd if it was running before us
/etc/init.d/collectd status >/dev/null 2>&1 && /etc/init.d/collectd stop >/dev/null 2>&1
/etc/init.d/collectd start
How about just having a "collectd restart" here?
currently this is actually not about "restart collectd if it was running before us", but it is "stop collectd if running + always start collectd".
If you want "restart collectd if it was running before us", it might be just "/etc/init.d/collectd status >/dev/null 2>&1 && /etc/init.d/collectd restart"
Hmmm....
reading further, there actually was a "collectd" restart that you removed and replaced with a more polished logic. git diff hides that well at first glance.
Not sure if that is worthwhile. (I like the KISS principle.)
Hmmm.... reading further, there actually was a "collectd" restart that you removed and replaced with a more polished logic. git diff hides that well at first glance. Not sure if that is worthwhile. (I like the KISS principle.)
This is what got lost when moving the docs to README and out of the source code comments.
The problem I found (really, this is unrelated to the backups/restores) is that if you install luci_statistics with opkg install luci_statistics, it will load the dependencies first. collectd gets installed and started with its default configuration. Then luci_statistics gets installed and started, whereupon it creates a new config file. But in this case, it never restarts collectd so the new config file is not yet used.
This problem does not occur if the system is built with both collectd and luci_statistics in the base image. In that case, the first time it boots, luci_statistics goes first and sets up the config (with collectd not yet running), due to the /etc/rc.d/SXX ordering.
But in this case, it never restarts collectd so the new config file is not yet used.
Well, there was a "collectd restart" (that you removed) so it should restart collectd once luci_statistics starts. Interesting if it does not happen.
This problem does not occur if the system is built with both collectd and luci_statistics in the base image
Yeah, I have it always that way.
Well, there was a "collectd restart" (that you removed) so it should restart collectd once luci_statistics starts. Interesting if it does not happen.
That previous restart of collectd is only inside the code to restart luci_statistics, not in the code to start luci_statistics
currently this is actually not about "restart collectd if it was running before us", but it is "stop collectd if running + always start collectd".
If you want "restart collectd if it was running before us", it might be just "/etc/init.d/collectd status >/dev/null 2>&1 && /etc/init.d/collectd restart"
I think we want "stop if running and then start". Otherwise, if collectd is not actually running, then starting just luci_statistics won't really start gathering statistics. In the current code (before this PR), the sysadmin needs to know that if both collectd and luci_statistics are stopped, they have to start luci_statistics and then collectd in order to get statistics gathering started.
So I can either update the comment to explain that we want "stop collectd if necessary and always start it", or leave the comment and change the code to "restart collectd if it was running". I of course prefer the former.
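To make the two behaviours under discussion concrete, here is a minimal sketch of both variants side by side. The function names and the `COLLECTD_INIT` override are assumptions added so the logic can be exercised off-device; the commands inside the functions are the ones quoted in the review comments above.

```shell
#!/bin/sh
# Sketch only: COLLECTD_INIT is a hypothetical override for testing;
# on a router it would simply be /etc/init.d/collectd.
COLLECTD_INIT="${COLLECTD_INIT:-/etc/init.d/collectd}"

# Variant A: "restart collectd if it was running before us".
# If collectd was stopped, it stays stopped.
restart_if_running() {
	"$COLLECTD_INIT" status >/dev/null 2>&1 && "$COLLECTD_INIT" restart
	return 0
}

# Variant B: "stop collectd if running + always start collectd".
# collectd ends up running whatever its previous state was.
stop_then_start() {
	"$COLLECTD_INIT" status >/dev/null 2>&1 && "$COLLECTD_INIT" stop >/dev/null 2>&1
	"$COLLECTD_INIT" start
}
```

With collectd stopped beforehand, variant A leaves it stopped while variant B starts it. That is the difference behind the preference for "stop if running and then start".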
We want to create a new backup at the time of sysupgrade, so for non- You can test this out with |
I'm adding the following to the README.md.
During disorderly reboot
In a system crash or other disorderly reboot, the shutdown scripts do
Add a backup/restore capability for RRD data storage in luci_statistics. The data storage is typically in /tmp and does not survive reboot or sysupgrade. This adds an option for the administrator to configure the RRD plugin so that the RRD data are preserved with a backup copy in the overlay file system.

This works for shutdown/reboot and sysupgrade (backup config files, restore config files, and true sysupgrade).

Also fix a bug where starting luci_statistics for the first time would not restart a running collectd: when the package is installed separately rather than included in the base flashed image, collectd may already have been installed, configured and started before this package gets installed and configured. So we need to check whether it is running and restart it to use the luci_statistics configuration.

Signed-off-by: John Kohl <[email protected]>
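As a rough illustration of the save-on-shutdown, restore-on-start cycle this commit message describes, the sketch below archives a volatile RRD directory into persistent storage and unpacks it again. It is not the code from this PR: the function names, the `RRD_DIR` and `RRD_BACKUP` variables, and both default paths are assumptions chosen for the example.

```shell
#!/bin/sh
# Sketch only: the paths are hypothetical; the PR's real option names
# and file locations may differ.
RRD_DIR="${RRD_DIR:-/tmp/rrd}"
RRD_BACKUP="${RRD_BACKUP:-/etc/luci_statistics/rrd-backup.tar.gz}"

# Save the volatile RRD tree to persistent storage (e.g. at shutdown).
rrd_backup() {
	[ -d "$RRD_DIR" ] || return 0	# nothing to save
	mkdir -p "$(dirname "$RRD_BACKUP")" || return 1
	# Replace the old backup only after the new archive is complete.
	tar -czf "$RRD_BACKUP.tmp" -C "$RRD_DIR" . && mv "$RRD_BACKUP.tmp" "$RRD_BACKUP"
}

# Restore the saved tree before the collector starts (e.g. at boot).
rrd_restore() {
	[ -f "$RRD_BACKUP" ] || return 0	# no backup yet
	mkdir -p "$RRD_DIR" || return 1
	tar -xzf "$RRD_BACKUP" -C "$RRD_DIR"
}
```

Writing a single archive at shutdown, rather than copying files periodically, keeps flash writes to one per reboot. That matches the earlier review note about avoiding constant writes on non-durable storage.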
had to update https://openwrt.org/docs/guide-developer/procd-init-scripts#init_scripts_during_compilation |
Thanks.
Looks ok to me. I will merge this as it is, but a few suggestions for further development:
|
Crude example about that:
|
Thank you! I will take a look at your suggestions next week when I have some more time available. Any possibility of backporting to the 23.05 branch?
Backporting to 23.05 might be possible after a few weeks, when there is more evidence that there are no major negative surprises.
You could do |
I guess that you meant |
yup. oops! sorry, was traveling and not thinking 100% straight, clearly! |
ah, I'm somewhat new to GitHub (been using/developing other SCMs for 30+ years) so that would be like
I thought about a different name for README, and then decided to leave it as README and use a heading for Backups, figuring that if anybody wanted to document anything else about the package, they could add other sections to the README. But if you prefer to rename it, I can do that; let me know.
I get the point. I struggle with the best place for this in the LuCI UI. The backups are really all about the rrdtool database, so putting it on the rrd plugin config seems right. Is there a way for the collectd.js page logic to see the current values of the settings from the rrdplugin.js form? With your initial prototype, it only tracks the value as seen by UCI, not the temporary values in the LuCI GUI. I'm new to LuCI programming so any hints here would be helpful.
Seemed to work OK in master, so I backported it to 23.05.
While looking at the help output, I tested the two extra commands, and one of them seems to fail?
|
Ah, it's a documentation/usage problem, and insufficient protection. sysupgrade_backup is used by |
Is it meant to be used from CLI? |