
permissions: Rewrite the in-memory storage backend #3560

Open
wants to merge 2 commits into master
Conversation

@benceszigeti benceszigeti commented Jan 20, 2025

Summary
Removed the hardcoded entry limit in the permission module’s in-memory subnet storage and optimized lookup performance.

Details
Previously, exceeding 128 subnet entries prevented adding new subnets, requiring code modifications and custom builds to handle more. This update eliminates that limitation and enhances the scalability of in-memory storage for larger numbers of subnets and addresses.

Solution

  1. Dynamic Hash Table: Bucket count now grows dynamically.
  2. Prefix Tree: Introduced a prefix tree for subnet storage, enabling efficient address lookup within subnet lists in a memory-friendly way.
  3. Group Partitioning: Subnets and addresses are now partitioned by group ID in a hash table, removing the need for ordered storage and simplifying group lookups. UPDATE: Removed for backward compatibility.
  4. Refactor: Cleaned up the touched code paths to make them more consistent.

Compatibility
The permission module interfaces (scripting, MI) remain unchanged, except that on the MI interface, when dumping subnets or addresses, the result is no longer an ordered list.
UPDATE: get_source_group can return a different group if an address/subnet is added to multiple groups and another match is found first.
UPDATE: the PR has been updated so the change is backward compatible.

Closing Issues
#2551, #2636

c-leroy commented Jan 20, 2025

Hello @benceszigeti ,

I haven't had a look at your patch, but I have a question:
Does it work like IP routing (the finest network matches, i.e. the one with the longest network mask), or is it the first one found that matches?

benceszigeti commented Jan 20, 2025

@c-leroy

I haven't had a look at your patch, but I have a question: Does it work like IP routing (the finest network matches, i.e. the one with the longest network mask), or is it the first one found that matches?

It returns the first match.

For check_address and check_source_address, we only need to know if any subnet matches, so we can return early (fastest) while iterating through the prefix tree.

For get_source_group, I can modify the code to return the group with the finest network match -- but then we must also decide which group to return when the same match appears in multiple groups.

@bogdan-iancu bogdan-iancu self-assigned this Jan 21, 2025
benceszigeti commented Jan 21, 2025

If an address/subnet is defined in multiple groups, get_source_group may return a different group than it did previously. Previously, it returned the group with the lowest group ID where the match occurred (due to how the data was stored). Therefore, I am considering using a linked list for groups instead of a hash table, so we do not have to look up every group to find the smallest group ID. This change is necessary for backward compatibility.

@benceszigeti
Updated the PR: changes are now backward-compatible:

  • Address/subnet MI dump returns an array ordered by group ID.
  • get_source_group returns the lowest group ID with a match.

@liviuchircu liviuchircu left a comment

Great work on the internal refactoring, @benceszigeti! I have tested the code a bit and couldn't see any backwards-compatibility issues on either the opensips.cfg side or the MI side, so we are good to merge. Still, do make sure to add GPLv3 headers on each of the 4 new source code files! Either with copyrights to your company or just "OpenSIPS Project" as default.

And a couple optional suggestions, perhaps worthy of future work:

  • the 1.5f bucket growth factor seems a bit off, because it generates the bucket sequence: 16, 24, 36, 54, 81, 121, etc. Most of these numbers are not powers of two, which means the core_hash() function will have gaps in its output hashing, meaning that some buckets will be permanently empty, while others will be extra full. Simply using a factor of 2f would fix this.

  • in new_address_node() you could use a private, dynamic-sized struct with 4 fields (two fixed-sized and two dynamic-sized, determined at runtime) to perform x1 shm_malloc instead of x4, for all kinds of benefits (speed, better memory usage, etc.).

4 participants