
Conversation

@veloman-yunkan
Collaborator

This PR introduces support for a user-defined cost in the lru_cache and ConcurrentCache classes, an enhancement enabling the fix of #947. It was extracted from #960, which in turn was based on #956.

mgautierfr and others added 12 commits May 2, 2025 19:52
At the cache level, "size" is renamed to "cost", representing how much a
cache can store (and how much an item costs). "size" is kept for the
count of items.

At the higher level, we keep the "size" semantics, as we are speaking
about the size of the cache, whatever that is.

This is the first step to a cache limited by memory usage.
... and enhanced two of the lru_cache unittests with log_debug-assisted
testing.
The changes in the unit tests are better understood if whitespace
changes are ignored (git diff -w).
This test, too, fails intermittently (likely due to the suspected but not
yet tracked-down race condition in the logging and/or namedthread code).
This time, however, it fails either with a deadlock or with obvious memory
corruption like the one below:

[ RUN      ] ConcurrentCacheMultithreadedTest.accessSameExistingItem
../../../../home/leon/freelancing/kiwix/builddir/SOURCE/libzim/test/concurrentcache.cpp:429: Failure
Expected equality of these values:
  zim::Logging::getInMemLogContent()
    Which is: "thread#0: Output of interest starts from the next line\na  : ConcurrentCache::getOrPut(5) {\n  b: ConcurrentCache::getOrPut(5) {\na  :  ConcurrentCache::getCacheSlot(5) {\na  :   entered synchronized section\n  b:  ConcurrentCache::getCacheSlot(5) {\na  :   lru_cache::getOrPut(5) {\na  :    already in cache, moved to the beginning of the LRU list.\na  :   }\na  :   exiting synchronized section\na  :  }\n  b:   entered synchronized section\na  :  Obtained the cache slot\n  b:   lru_cache::getOrPut(5) {\na  : } (return value: 50)\n\0 b:    already in cache, moved to the beginning of the LRU list.\n  b:   }\n  b:   exiting synchronized section\n  b:  }\n  b:  Obtained the cache slot\n  b: } (return value: 50)\n\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0 "
  targetOutput
    Which is: "thread#0: Output of interest starts from the next line\na  : ConcurrentCache::getOrPut(5) {\n  b: ConcurrentCache::getOrPut(5) {\na  :  ConcurrentCache::getCacheSlot(5) {\na  :   entered synchronized section\n  b:  ConcurrentCache::getCacheSlot(5) {\na  :   lru_cache::getOrPut(5) {\na  :    already in cache, moved to the beginning of the LRU list.\na  :   }\na  :   exiting synchronized section\na  :  }\n  b:   entered synchronized section\na  :  Obtained the cache slot\n  b:   lru_cache::getOrPut(5) {\na  : } (return value: 50)\n  b:    already in cache, moved to the beginning of the LRU list.\n  b:   }\n  b:   exiting synchronized section\n  b:  }\n  b:  Obtained the cache slot\n  b: } (return value: 50)\n"
With diff:
@@ -14,5 +14,5 @@
   b:   lru_cache::getOrPut(5) {
 a  : } (return value: 50)
-\0 b:    already in cache, moved to the beginning of the LRU list.
+  b:    already in cache, moved to the beginning of the LRU list.
   b:   }
   b:   exiting synchronized section
@@ -19,4 +19,3 @@
   b:  }
   b:  Obtained the cache slot
-  b: } (return value: 50)
-\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0
+  b: } (return value: 50)\n

[  FAILED  ] ConcurrentCacheMultithreadedTest.accessSameExistingItem (2 ms)
If an item that is in the process of being added to the cache
is requested within the time window when it has been (temporarily)
evicted from the cache (as a result of the concurrent turmoil),
a concurrent evaluation of the same item is started while the
previous one is still in progress.

An item being added to the cache is now never evicted due to concurrent
turmoil. The LRU eviction heuristic was changed to only evict items
with non-zero cost.
@codecov

codecov bot commented May 3, 2025

Codecov Report

❌ Patch coverage is 49.54955% with 56 lines in your changes missing coverage. Please review.
✅ Project coverage is 57.65%. Comparing base (fe520a3) to head (d28077f).
⚠️ Report is 35 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/lrucache.h | 52.45% | 2 Missing and 27 partials ⚠️ |
| src/concurrent_cache.h | 38.63% | 0 Missing and 27 partials ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #982      +/-   ##
==========================================
- Coverage   57.97%   57.65%   -0.32%     
==========================================
  Files         103      103              
  Lines        4850     4922      +72     
  Branches     2018     2066      +48     
==========================================
+ Hits         2812     2838      +26     
- Misses        704      706       +2     
- Partials     1334     1378      +44     

☔ View full report in Codecov by Sentry.

@veloman-yunkan
Collaborator Author

There seems to be a bug in the concurrency orchestration of log_debug-based testing, leading with low probability to a deadlock. In this particular CI run it resulted in a timeout of the concurrentcache unit-test run in the Packages/build-deb (ubuntu-jammy) CI job, but I have observed similar failures with the log_debug unit test.

@kelson42
Contributor

kelson42 commented May 3, 2025

@veloman-yunkan OK, so I guess this is something to fix before merging?

@veloman-yunkan
Collaborator Author

> @veloman-yunkan OK, so I guess this is something to fix before merging?

@kelson42 Yes, found the bug and filed it as #983. Will fix it tomorrow.

@kelson42
Contributor

kelson42 commented May 3, 2025

@veloman-yunkan Considering that you will treat the issue separately, does that mean this PR is ready?

@veloman-yunkan
Collaborator Author

> @veloman-yunkan Considering that you will treat the issue separately, does that mean this PR is ready?

@kelson42 Yes, it is.

@veloman-yunkan veloman-yunkan requested a review from mgautierfr May 3, 2025 17:01
@veloman-yunkan
Collaborator Author

> @veloman-yunkan OK, so I guess this is something to fix before merging?
>
> @kelson42 Yes, found the bug and filed it as #983. Will fix it tomorrow.

Bug-fix is ready (#984).

@kelson42 kelson42 merged commit 5b80090 into main May 4, 2025
29 of 31 checks passed
@kelson42 kelson42 deleted the user_defined_cost_for_caches branch May 4, 2025 08:07
@kelson42 kelson42 added this to the 9.4.0 milestone May 4, 2025


Development

Successfully merging this pull request may close these issues.

About cache limited by memory
