[Feature] Benchmarking PostgreSQL vs. SQLite for Large-Scale VPN Networks Using Headscale #2001
Comments
Does this highlight a need (regardless of database) to be keeping more info in memory, or doing less database work every time a peer update occurs (or a mixture of both)? I've been diving into the source, and I see a couple of TODOs by @kradalby mentioning caching. Would there be appetite for that?
Thanks for your meticulous benchmark. BTW, if you have raw profile data, maybe a flamegraph will provide more insight.
Creating 600 clients took several hours most likely because I made the requests synchronously (a shell script calling the tailscale CLI against headscale, one client at a time). I am analyzing the latency observed in the benchmark in two areas: network latency due to the distance between AWS regions, and the database. Once the analysis is complete, I'll share the results. Currently, I only have raw profile data for SQLite, so I am attaching a flamegraph for that.
I've posted a comment touching on some of these topics here: #1993 (comment)
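Since the comment above attributes the multi-hour creation time to synchronous requests, here is a minimal sketch of issuing registrations with bounded concurrency instead. It is not the script used in the benchmark: `register_all` and `fake_register` are hypothetical names, and `fake_register` only simulates a registration call (a real one would invoke the tailscale CLI or an API client, which is not shown here).

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def register_all(client_ids, register, max_workers=16):
    """Call register(client_id) for every id, at most max_workers at a time.

    Returns (succeeded, failed): a list of ids that registered cleanly and
    a dict mapping each failed id to the exception it raised.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(register, cid): cid for cid in client_ids}
        for fut in as_completed(futures):
            cid = futures[fut]
            try:
                fut.result()
                succeeded.append(cid)
            except Exception as exc:  # keep going; record the failure
                failed[cid] = exc
    return succeeded, failed


def fake_register(client_id):
    # Hypothetical stand-in for one client registration (assumption:
    # each call blocks on network and database latency).
    time.sleep(0.01)
    if client_id % 100 == 0:
        # Simulated failure, mirroring the error text seen in the issue.
        raise TimeoutError("context deadline exceeded")


if __name__ == "__main__":
    ok, errs = register_all(range(1, 601), fake_register)
    print(f"registered={len(ok)} failed={len(errs)}")
```

Bounding the pool size matters here: unbounded parallelism would change the load pattern on the database and could distort the very error rates the benchmark is trying to measure.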
Use case
Although Headscale is primarily designed for small-scale VPN environments (e.g., home VPNs), there is increasing interest in using Headscale to deploy large-scale VPN networks with over 500 clients. However, there is a lack of comprehensive guides and benchmark materials for such use cases. This benchmark provides recommendations for database configurations and comparison metrics to assist users in setting up large-scale VPN networks with Headscale.
Description
Benchmark Environment
- The benchmark was performed in a consistent environment where the only variable was the choice of database specified in the headscale config.yaml: either PostgreSQL or SQLite (with the WAL option).
Comparison Metrics and Results
The benchmark evaluated SQLite (with the WAL option) and PostgreSQL based on the following criteria: total client creation time, error occurrence rate, and profiling results (CPU and memory).
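As an illustration of how the creation-time and error-rate metrics named above could be derived from per-client records, here is a minimal sketch. The `Attempt` record and `summarize` function are hypothetical and the sample values are made up for the example; they are not the benchmark's data or tooling.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attempt:
    client: str
    started: float              # seconds, any consistent clock
    finished: float
    error: Optional[str] = None  # None means the client was created


def summarize(attempts):
    """Return total wall-clock time, error rate, and a per-error count."""
    if not attempts:
        raise ValueError("no attempts recorded")
    total_time = max(a.finished for a in attempts) - min(a.started for a in attempts)
    errors = Counter(a.error for a in attempts if a.error is not None)
    error_rate = sum(errors.values()) / len(attempts)
    return {"total_time": total_time, "error_rate": error_rate, "errors": dict(errors)}


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        Attempt("node-1", 0.0, 3.0),
        Attempt("node-2", 3.0, 6.0),
        Attempt("node-3", 6.0, 9.0, error="context deadline exceeded"),
        Attempt("node-4", 9.0, 12.0),
    ]
    print(summarize(sample))
```

Running the same summary over the PostgreSQL and SQLite runs would give directly comparable figures for the two backends.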
Total Client (600) Creation Time and Error Occurrence Rate
Error Occurrences
The following verifies the errors reported in issue #1966, "[Bug] Node Connection Issues (~600 nodes) in v0.23.0-alpha12", including their occurrence rates:
- "Cannot create user: context deadline exceeded" error.
Profiling Results
Profiling Results: cpu
postgres
sqlite
Profiling Results: memory
postgres
sqlite
Conclusion
Recommendation
Based on the benchmark results, PostgreSQL is recommended over SQLite for large-scale client environments. It is suggested to update the documentation and config.yaml to include guidelines for selecting the appropriate database based on the use case.
Attachments
postgres-node-list.json
sqlite-node-list.json
cpu-postgres.png
cpu-sqlite.png
heap-postgres.png
heap-sqlite.png
Contribution
How can it be implemented?
No response