
Use Varnish cache instead of Squid proxy #194

Open
jpecar opened this issue Jun 13, 2024 · 5 comments


jpecar commented Jun 13, 2024

Due to Squid's age there is a strong motivation to use something more modern and faster. Varnish is a reverse proxy, typically used as a "web application accelerator", but it also fits the EESSI scenario because of how CVMFS works: clients fetch mostly immutable, content-addressed objects over plain HTTP GET requests, which cache very well.

Attached is a mashup of some existing cvmfs+varnish configs I found online (mainly the caching logic), with EESSI backends added plus some logic to select them when the URL looks like it wants them. It is showing signs of life and already offers lower load latencies on an initial cold cache, but it can of course be polished further.

eessi.vcl.txt
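
For context, a minimal sketch of the kind of VCL logic such a config uses (this is not the attached eessi.vcl.txt; the hostname, port, and TTL values are illustrative assumptions):

```vcl
vcl 4.1;

# Hypothetical local Stratum 1 backend; replace with your own host.
backend stratum1 {
    .host = "cvmfs-s1.internal.example.org";
    .port = "8000";
}

sub vcl_recv {
    # CVMFS clients fetch everything with GET; pass anything else through.
    if (req.method != "GET") {
        return (pass);
    }
    set req.backend_hint = stratum1;
}

sub vcl_backend_response {
    # Repository manifests (.cvmfspublished etc.) change on every publish,
    # so keep their TTL short; data objects under /data/ are
    # content-addressed and therefore immutable, so they can be cached
    # aggressively. The TTLs below are illustrative.
    if (bereq.url ~ "\.cvmfs") {
        set beresp.ttl = 60s;
    } else if (bereq.url ~ "/data/") {
        set beresp.ttl = 7d;
    }
}
```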


amadio commented Jun 13, 2024

You may also want to look into using XRootD for your cache (XCache). Documentation can be found here: https://xrootd.slac.stanford.edu/doc/dev56/pss_config.htm


boegel commented Jul 4, 2024

@jpecar It could be interesting to re-run the performance experiments I did with TensorFlow startup time for the tutorial (see https://multixscale.github.io/cvmfs-tutorial-hpc-best-practices/performance/) to check how much the (mainly cold) startup times improve with Varnish vs. a Squid proxy.


jpecar commented Jul 30, 2024

Gathered some numbers:

| Setup | First load (cold cache) | Subsequent loads (warm cache) |
| --- | --- | --- |
| Reference: local EasyBuild repo on NFS | 31.61 | 6.16 -0.47/+0.59 |
| Private Stratum 1, direct | 10.01 | 9.27 -0.14/+0.34 |
| Private Stratum 1 + Squid | 11.14 | 9.10 -0.23/+0.26 |
| Private Stratum 1 + Varnish | 11.11 | 9.38 -0.37/+0.50 |
| EESSI from AWS + Squid | 24.25 | 9.09 -0.15/+0.24 |
| EESSI from AWS + Varnish | 89.17 | 9.39 -0.26/+0.27 |

(For the NFS reference row, "cold"/"warm" refers to the VFS cache.)

Interestingly, the initial fetch from the AWS Stratum 1 through Varnish takes about 3x longer than through Squid. One needs to decide whether that's acceptable for their local environment or not.

Our prod env now consists of a private Stratum 1 with two Varnish instances (in-memory cache only), configured with a fallback director to prefer the local Stratum 1.
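
For anyone wanting to replicate the fallback-director idea, a minimal sketch follows; the hostnames, ports, and probe endpoint are assumptions, not our actual config. The fallback director relies on health probes to decide when to fail over:

```vcl
vcl 4.1;

import directors;

# Hypothetical local Stratum 1; preferred as long as its probe is healthy.
backend local_s1 {
    .host = "s1.internal.example.org";
    .port = "8000";
    .probe = {
        .url = "/cvmfs/info/v1/meta.json";
        .interval = 10s;
    }
}

# Hypothetical upstream EESSI Stratum 1, used only when local_s1 is down.
backend eessi_s1 {
    .host = "eessi-s1.example.org";
    .port = "80";
    .probe = {
        .url = "/cvmfs/info/v1/meta.json";
        .interval = 10s;
    }
}

sub vcl_init {
    new s1 = directors.fallback();
    s1.add_backend(local_s1);   # tried first
    s1.add_backend(eessi_s1);   # used only if local_s1 is unhealthy
}

sub vcl_recv {
    set req.backend_hint = s1.backend();
}
```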


ocaisa commented Jul 30, 2024

Probably what is more interesting to look at is what the performance looks like when squid/varnish is being hammered (imagine an 8k-process MPI job). From what I read, Varnish should be better at handling lots of simultaneous requests, which would mean fewer instances required per site.


jpecar commented Jul 30, 2024

Yes. In our env I basically cannot avoid things like R/Bioconductor in tight loops of tens of thousands of invocations ... and those have brought our NFS filers to their knees in the past.
