rework the cache so that it also provides virtual folders like core, extra, community etc. #11
fixed in 4cd1151
Hmm, I think there is a missing piece here. The db files should be linked to `pkg.pacman.store/arch/x86_64/default/$repo/$repo.db`. If I set `pkg.pacman.store/arch/x86_64/default/$repo/` as my package mirror and try to perform an upgrade, pacman will try to download the repo db from `pkg.pacman.store/arch/x86_64/default/$repo/$repo.db`, which now 404s.
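For reference, a mirrorlist entry along these lines is what triggers that db request (the host and path are taken from this thread; this is a sketch of the failing setup, not a recommendation):

```
## /etc/pacman.d/mirrorlist - sketch of the setup described above.
## pacman substitutes $repo itself and then requests $repo.db from this URL.
Server = https://pkg.pacman.store/arch/x86_64/default/$repo
```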
I know the recommended usage is via FUSE, but I'd rather sync via HTTP (and pay double in disk space) than mount the internet on my filesystem.
Yes indeed, I'm intentionally breaking the conventions of a regular mirror in this case. The HTTP gateways are IMHO not meant for downloading update binaries from IPFS. I just want to avoid people adding the cluster via an IPFS HTTP gateway like ipfs.io to update their servers and publishing this as the optimal solution for avoiding censorship or doing distributed updates, you know?

Additionally, there need to be checks that the db file is not stuck in time, e.g. if the import got stuck, the project died, or the server importing into the cluster had an outage. Not checking when the update was last refreshed while having IPFS configured as the first server could have serious security implications, since you won't get security updates in that case. The dbs are meant to be copied from the ipfs to the local filesystem.

The packages, on the other hand, can be downloaded by browsing. If you need a specific one, like an older version, you can also download the corresponding database, which holds the signature for that package - that's the reason I keep them in the snapshots. The cache folder is meant to be mounted just as a read-only cache for pacman, which avoids - as you pointed out - having to spend the storage on packages twice.

If you'd like to keep local copies of the files you've installed, you can always simply pin them in ipfs with a script. You just need to fetch the package list with versions and match them to the files in the cache folder. To get a better overview of the files you've pinned, I recommend also copying them into an MFS structure, like
With the CID:
And then copy the CID of the package you have installed to a local folder:
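A minimal sketch of that workflow, assuming a hypothetical MFS folder `/pkg-archive` and placeholder package values (none of these names come from the project's tooling):

```shell
#!/bin/sh
# Hypothetical sketch: copy an installed package into a local MFS folder.
# /pkg-archive, $PKG_NAME, $PKG_VER and $PKG_CID are placeholders.

PKG_NAME="linux"
PKG_VER="5.8.5.arch1-1"
PKG_CID="QmExampleCid"   # take the real CID from the repo snapshot

# Build the MFS destination path, e.g. /pkg-archive/linux/5.8.5.arch1-1
MFS_DIR="/pkg-archive/${PKG_NAME}/${PKG_VER}"

archive_pkg() {
    ipfs files mkdir -p "$MFS_DIR"
    ipfs files cp "/ipfs/${PKG_CID}" "${MFS_DIR}/${PKG_NAME}-${PKG_VER}.pkg.tar.zst"
}

# Only touch the MFS when an ipfs daemon is actually available.
command -v ipfs >/dev/null 2>&1 && archive_pkg
```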
Note that files added this way are not pinned, so if you would just remove them from the MFS again, a garbage collection run could drop the data. You can circumvent this behaviour by pinning them additionally and unpinning afterwards (otherwise the file will still be pinned when you remove it from the MFS).

If you feel that having a history of installed packages stored locally in an MFS is a viable solution and more people than you might want to use it, I'm fine with adding this as an option to the scripting around pacman. The idea would be to hold either all versions, the installed ones, or just the last n versions of a package locally (in an MFS folder) and automatically remount it after changes have been made, so you can always just cd into it.
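The pin-then-unpin workaround could look roughly like this; all names and paths are placeholders, not part of the project's tooling:

```shell
#!/bin/sh
# Hypothetical sketch of the workaround: pin a package so its blocks survive
# garbage collection, remove the MFS reference, then drop the pin again.
# $PKG_CID and the MFS path are placeholders.

PKG_CID="QmExampleCid"

cleanup_old_version() {
    ipfs pin add "$PKG_CID"                            # keep the blocks around explicitly
    ipfs files rm -r "/pkg-archive/linux/old-version"  # drop the MFS reference
    ipfs pin rm "$PKG_CID"                             # finally release the pin
}

# Only run when an ipfs daemon is actually available.
command -v ipfs >/dev/null 2>&1 && cleanup_old_version
```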
Addition: In this case, please create a new ticket where you explain exactly what you need and why :)
As already explained here, the integrated mounting solution isn't stable, so I won't use it. It's also rather slow. I don't see any advantage to downloading via HTTP except the 'having a bunch of older versions locally' aspect. If we can just migrate those files into an MFS which is conveniently mounted somewhere all the time, we avoid the need to copy data back and forth on the same filesystem, and you can also share the files with the network... this will help the cluster with traffic, compared to a non-local HTTP gateway.
Wait, why not?
Well, this seems inevitable. Right now any third party can replicate the repo tree, add those db files where needed and perform upgrades from the new tree. Our cluster will still share resources, as the package tarball CIDs are still the same. The only way to counter such "abuse" is to start a private, exclusive IPFS network.
I don't think this will be a problem if you only update where the IPNS name points after a finished refresh. This way partial updates are not possible. If the project dies, its users should become aware of that by subscribing to Arch's security advisories and noticing that patches are not being installed.
I'd also like to comment that while updating via HTTP is less space-efficient, it's a simpler solution than tricking pacman into downloading packages while checking whether they exist in the cache by mounting pacman's cache on the repo tree. Using IPFS as a mirror also makes it easy to fall back to normal mirrors if the IPFS one underperforms. Also, this is how victorb's arch-on-ipfs project used to work.
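The fallback described here is just mirror ordering; a mirrorlist sketch, with the gateway host from this thread and `mirror.example.org` as a placeholder for any regular mirror:

```
## /etc/pacman.d/mirrorlist - sketch: IPFS-backed mirror first, regular mirror as fallback.
Server = https://pkg.pacman.store/arch/x86_64/default/$repo
Server = https://mirror.example.org/archlinux/$repo/os/$arch
```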
It's a convenient feature for users, but they offer this service for free while paying for the traffic. Causing a huge amount of traffic for them just to replace traditional static HTTP servers, which already do exactly this job, isn't fair use IMHO. I will never actively support this type of usage. The automatic HTTP redirects to the service are also just a convenience for users to browse the archive and its structure, so they can very easily download specific files and databases off of snapshots manually.
In that case, the cluster data would probably be blacklisted in the long term, because it's no longer considered fair use if thousands of machines download their updates purely from this web service. Then the convenience of easily browsing this repo in a browser would no longer be possible. It's like browsing a Wikipedia mirror via the gateway vs. scraping it completely with an HTTP downloader, fetching half a petabyte from the gateway instead of just using IPFS to pin it directly. As I stated above, I won't help in any way to abuse what I consider the fair use of a freely available service.
The IPNS records are constantly refreshed as long as the server is running (this is a necessity, like reproviding the content you're offering), but I won't update where they point.
True, they are not. I only update the record to a new version when the sync has completed successfully. Otherwise the old version (by CID) will still be available in the cluster for at least 2 months.
This doesn't change the fact that there are security implications in using this service. :)
Yes, that's completely true. But the alternative is to just use IPFS with its local HTTP gateway to do exactly the same thing. Feel free to write a how-to for this. :)
I think @victorb was just offering a proof of concept. But maybe this just wasn't part of his considerations.
I'd also like to point out that my approach is not to replace a centralized service with a different one. Sure, it's CURRENTLY centralized, since imports into the cluster happen in one place. But it's not meant to stay that way in the long term. I hope that the things I've said in the original discussion are going to come true in the long term:
I think I didn't explain my rationale correctly. I will open a new ticket explaining my situation top-to-bottom.
I haven't read the whole thread, but just use the local gateway provided by the IPFS daemon on |
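With the daemon's stock gateway address (127.0.0.1:8080 is the documented default), that could look like this sketch of a mirrorlist entry; the IPNS path is an assumption based on this thread:

```
## /etc/pacman.d/mirrorlist - sketch: use the local IPFS gateway instead of a public one.
Server = http://127.0.0.1:8080/ipns/pkg.pacman.store/arch/x86_64/default/$repo
```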
@hsanjuan true. He was looking into running an IPFS cluster follower in his network while configuring the ipfs.io gateway for his local Arch Linux machines. Regardless of his actual setup details, I will avoid using the ipfs.io gateway for anything in my setup recommendations, since replacing hundreds of update servers with the ipfs.io gateway doesn't work out - neither for the IPFS project nor for receiving updates fast - and it is not decentralized at all. Getting the current CID of an IPNS name and mounting it is IMHO still the best solution, since it avoids downloading the data to the IPFS cache, reading it again from disk, pushing it locally through an HTTP connection, and writing it back to the same disk in the pacman cache.
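Resolving the IPNS name to its current CID and bind-mounting that snapshot could be sketched like this, assuming `ipfs mount` already provides the FUSE mountpoint `/ipfs`; the target path `/mnt/pacman-cache` is a placeholder:

```shell
#!/bin/sh
# Hypothetical sketch: pin down the current snapshot CID of the IPNS name
# and expose it at a fixed path for pacman (/mnt/pacman-cache is a placeholder).

resolve_cid() {
    # `ipfs resolve -r` prints a path like /ipfs/<CID>; strip the prefix.
    ipfs resolve -r "/ipns/pkg.pacman.store" | cut -d/ -f3
}

if command -v ipfs >/dev/null 2>&1; then
    CID=$(resolve_cid)
    mkdir -p /mnt/pacman-cache
    # /ipfs is the FUSE mountpoint served by `ipfs mount`.
    mount --bind "/ipfs/${CID}/arch/x86_64/default" /mnt/pacman-cache
fi
```

Mounting by CID rather than by IPNS name also means the snapshot can't silently change underneath pacman mid-upgrade.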
what, no, I was looking at using my local follower's gateway, not ipfs.io's. |
This allows a script to cache updates:
#4