TELECOM-10654: Rewrite permission module storage backend #87

benceszigeti · 2025-01-06T17:14:03Z

No description provided.

bcnewlin · 2025-01-06T21:52:08Z

What is the reasoning for using a fixed table length that grows by doubling in size vs copying the mechanism used for addresses using a hash table with linked list buckets? That allows the list to grow more closely aligned to usage, instead of periodic potentially large allocations to double the table.

If it is because of the group sorting mechanism that currently exists, that seems pretty unscalable as well. Perhaps what is needed is a multi-level table. First partitioned on group, and then for each group containing a hash table like for addresses? The first tier could then just be a sorted linked list and grow dynamically as well, since the group values are known to always be integers so hashing isn't required.

benceszigeti · 2025-01-07T01:32:43Z

@bcnewlin

What is the reasoning for using a fixed table length that grows by doubling in size vs copying the mechanism used for addresses using a hash table with linked list buckets? That allows the list to grow more closely aligned to usage, instead of periodic potentially large allocations to double the table.

Honestly, I went with an easy modification to remove the hardcoded limit only, since, anyway, the number of elements will be reasonable, so this implementation will not be a performance or space bottleneck.

(I implemented it similarly to how std::vector works in C++: multiplying the initially reserved capacity by 1.5 or 2.0 when growing, which results in fewer syscalls, a more cache-friendly memory layout, but it uses more space and is slower for looking up elements by value.)

I do agree that we can optimize. The hot/critical path seems to be the lookup, so I agree that an unordered map is definitely a better choice. What I miss from the current implementation is the growing and rebalancing when buckets become too big -- this can also harm performance.

If it is because of the group sorting mechanism that currently exists, that seems pretty unscalable as well. Perhaps what is needed is a multi-level table. First partitioned on group, and then for each group containing a hash table like for addresses? The first tier could then just be a sorted linked list and grow dynamically as well, since the group values are known to always be integers so hashing isn’t required.

Sounds good to me.

I will rework the PR to use more reasonable storage containers.

bcnewlin · 2025-01-07T02:24:44Z

I didn't mean that this solution wouldn't work. I think it will be fine since, as you said, the numbers should be reasonable. However, we don't actually have any limits in place and nothing would stop a few customers from being very unreasonable.

I certainly don't think a re-write is required, but would support if you want to optimize. The answer of going with this solution because it is a fast implementation and computationally less complex is a good answer. :)

This reverts commit ea7fb17.

benceszigeti · 2025-01-09T14:36:56Z

@bcnewlin

Your proposed solution, I believe, results in much faster runtime behavior, so I think we should proceed with it, with a slight modification.

struct subnet and struct address_list differ only in that struct subnet also stores the subnet mask. If we reuse the address storage for subnets as well, then:

We can remove a lot of code, resulting in a simpler codebase, though with slightly larger storage requirements -- since we store the mask (32 or 128) along with addresses, which is a fair tradeoff I think.
Both address and subnet lookups will benefit from partitioning into groups, leading to faster lookup.
Generally, lookup will become faster, as we currently perform an initial lookup by address and then by subnet if no results for the address lookup. With this change, we can merge the two processes, simplifying the usage side as well.

benceszigeti · 2025-01-16T14:57:51Z

@bcnewlin

I overlooked that we can not use a hash table for subnet lookups, as the key is an IP address inside a subnet, not a CIDR network address. We need to determine which subnet contains the IP, and if multiple subnets match, we must check their metadata -- port, protocol, and pattern -- for exact match. To address this, I implemented a prefix tree for more efficient subnet searches.

The new concept is partitioning addresses and subnets into groups using a hash table. Each group has a separate hash table for IP addresses and a prefix tree for subnets. The prefix tree does not store the metadata (subnet, port, protocol, pattern, and info) itself; it only stores a pointer to an element in the IP address hash table, which is the network address of the subnet. This approach makes lookup and memory usage more efficient.

This solution maybe more complex, but it is scalable.

I tested it locally with multiple test sets and large datasets, including frequent reloads/lookups, and observed no memory leaks, crashes or regression.

I will revert the draft mode but will continue testing (especially IPv6) while the PR is under review.

benceszigeti · 2025-01-20T12:02:45Z

Created a PR against OpenSIPS master: OpenSIPS#3560.

…p` backward compatibility

bcnewlin · 2025-01-22T17:13:20Z

Well this certainly is more complex, I see now why you went with the easy solution first. I probably should have kept my comments to myself! 😆

But you get bonus points from me for this solution because you were able to use my favorite data structure, which is a Trie.

benceszigeti added the enhancement New feature or request label Jan 6, 2025

benceszigeti requested review from bcnewlin, DHennessyGenesys, davidtrihy-genesys and genesys-adam January 6, 2025 17:14

benceszigeti self-assigned this Jan 6, 2025

benceszigeti force-pushed the TELECOM-10654 branch 2 times, most recently from f827f79 to be2c974 Compare January 6, 2025 17:16

TELECOM-10654: Make permission module entry allocate dynamic

ea7fb17

benceszigeti force-pushed the TELECOM-10654 branch from be2c974 to ea7fb17 Compare January 6, 2025 17:30

benceszigeti marked this pull request as draft January 7, 2025 01:34

Revert "TELECOM-10654: Make permission module entry allocate dynamic"

52c0ca8

This reverts commit ea7fb17.

benceszigeti force-pushed the TELECOM-10654 branch 12 times, most recently from 421f951 to 17e53c4 Compare January 16, 2025 14:57

benceszigeti marked this pull request as ready for review January 16, 2025 14:58

benceszigeti changed the title ~~TELECOM-10654: Make permission module entry allocate dynamic~~ TELECOM-10654: Rewrite permission module storage backend Jan 16, 2025

benceszigeti force-pushed the TELECOM-10654 branch from 17e53c4 to 0618ead Compare January 16, 2025 15:10

benceszigeti force-pushed the TELECOM-10654 branch from 0618ead to 40344b4 Compare January 20, 2025 14:05

TELECOM-10654: Rewrite permission module storage backend

e703a87

benceszigeti force-pushed the TELECOM-10654 branch 3 times, most recently from 74d25b2 to a09eb2c Compare January 21, 2025 19:08

TELECOM-10654: Use linked list for group storage for `get_source_grou…

07bc114

…p` backward compatibility

benceszigeti force-pushed the TELECOM-10654 branch from a09eb2c to 07bc114 Compare January 21, 2025 19:11

bcnewlin approved these changes Jan 22, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TELECOM-10654: Rewrite permission module storage backend #87

TELECOM-10654: Rewrite permission module storage backend #87

benceszigeti commented Jan 6, 2025

bcnewlin commented Jan 6, 2025 •

edited

Loading

benceszigeti commented Jan 7, 2025 •

edited

Loading

bcnewlin commented Jan 7, 2025 •

edited

Loading

benceszigeti commented Jan 9, 2025

benceszigeti commented Jan 16, 2025 •

edited

Loading

benceszigeti commented Jan 20, 2025

bcnewlin commented Jan 22, 2025

TELECOM-10654: Rewrite permission module storage backend #87

Are you sure you want to change the base?

TELECOM-10654: Rewrite permission module storage backend #87

Conversation

benceszigeti commented Jan 6, 2025

bcnewlin commented Jan 6, 2025 • edited Loading

benceszigeti commented Jan 7, 2025 • edited Loading

bcnewlin commented Jan 7, 2025 • edited Loading

benceszigeti commented Jan 9, 2025

benceszigeti commented Jan 16, 2025 • edited Loading

benceszigeti commented Jan 20, 2025

bcnewlin commented Jan 22, 2025

bcnewlin commented Jan 6, 2025 •

edited

Loading

benceszigeti commented Jan 7, 2025 •

edited

Loading

bcnewlin commented Jan 7, 2025 •

edited

Loading

benceszigeti commented Jan 16, 2025 •

edited

Loading