Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] permissions: redesign address permissions regarding networks #2551

Open
Nestorfish opened this issue Jun 11, 2021 · 7 comments
Open
Assignees

Comments

@Nestorfish
Copy link

Is your feature request related to a problem? Please describe.

We will soon have to insert quite a lot of IPv4 network ranges in address, so I took a look at the way OpenSIPS handles this, and more generally address permissions.

Here is what I found so far (the following applies to a partition):

Storage

Individual hosts (mask 32 or 128 depending on address family) are stored in a hash-map, IP address begin the key.
The maximum number of buckets is set during compilation (PERM_HASH_SIZE, default 128, used in perm_hash macro as parameter to core_hash)
There is no particular ordering for elements matching this key, they are inserted in front of a chained list as they are read from database.

Non-host network addresses are stored in a fixed-size array, whose size is set during compilation (the same PERM_HASH_SIZE, default 128).
They are ordered by group_id.

Match

OpenSIPS first looks into the hosts hash-map.
It checks if the given group_id (if any) exists in any value, and aborts if not.
Then it iterates the chained list corresponding to the IP address.

If no matching entry is found above, it uses the networks array.
It first checks if there is any entry for the given group_id (if any), and aborts if not.
Then it looks for a matching entry.
(By the way, it forgets meantime the index where the group_id was found in the array, it could have avoided restarting from first position by remembering it)

Remarks

  • Why does it first checks if the group_id exists? If it is absent, it only produces a LM_DBG statement along with a different return code (which is not documented for now I guess)
  • Networks entries are not scalable
  • Size of hashing function result could be set as a module parameter (like eg. cache_collections parameter of module cachedb_local which sets the number of bits). Not a must-have however
  • There is no fundamental difference between hosts and networks for this use case, they could all be considered as networks
  • This brings the last point: by looking first for individual hosts, it allows specific overrides of network entries. This behaviour can be generalized to networks: the match would be done on the longest mask entry found.

Describe the solution you'd like

Treat all entries the same way, namely:
No limitation for number of network entries
Match the smallest network first, being an host or not

Implementation

Proposal

Storage

  • Array of fixed size 129, the index being the mask of address entry (0 to 128).
  • Each entry of this array is a hash-map (as currently for hosts). During (re)load, we must ensure that the hashing is done on network address, ie. ip value after zeroing host bits.

Match

  1. Hash IP address

  2. Set the initial index to the IP address family size (eg. 128 or 32)

  3. Search for a matching entry in the hash-map present at this index, at the computed hash

  4. If none found:

    • Break if index == 0

    • Decrement index

    • If the bit that passed from network part to host part of IP address == 1:

      • Zero this bit
      • Refresh hash value
    • Go to 3.

  • Component: permissions module
  • Type: Internal
  • Name: Not relevant
@Nestorfish
Copy link
Author

Nestorfish commented Jul 16, 2021

Here is an attempt to implement it on 2.4.11 branch.
Could someone have a glance at it and give some feedback?
There must be some memory issues, I encountered a no private memory left once during an address reload.

EDIT: no private memory left is also there with legacy 2.4.11, it is due to PKG sizing on our test platform.

@Nestorfish
Copy link
Author

To explain it quickly, each partition holds arrays of struct netmask_list.
A netmask_list correspond to a mask length, and has a hash list of struct network_list entries (like before for plain host entries with struct address_list). Its position inside the array is given by the mask length.
In order to speed up address lookup, there is a next pointer pointing to the next non-empty netmask_list.

Subnet functions were removed, and hash functions were updated to use network_list entries.

@Nestorfish
Copy link
Author

The same based on branch master.

@Nestorfish
Copy link
Author

The patched OpenSIPS 2.4.11 has been tested for over a month on production systems, with no issues noted so far.
Anyone have any thoughts on any issues that would block the merge (outside of the module documentation)?
We have no way of testing higher versions of OpenSIPS in the near future in production, but our case tests have been successful.
Note: all tests were done with IPv4 only

@bogdan-iancu
Copy link
Member

@Nestorfish , thanks for your work and contribution here. Could you generate a PR against the master branch, so we can do a proper review ?

@Nestorfish
Copy link
Author

@bogdan-iancu The branch is updated against master. I left documentation untouched.

@liviuchircu
Copy link
Member

Hi, @Nestorfish! Regarding this FR and the PR in #2648, please note that #3560 is to be merged soon, which eliminates the "128" subnet limitation, while also boosting lookup performance by converting all internal data structures in permissions module to use trie data structures. Moreover, the whole extension is backwards-compatible: same MI output in address_dump and subnet_dump (despite the two concepts having been fully merged internally!), and same behavior from script functions such as get_source_group(), which still returns the 1st group in ascending order, as previously.

Is this enough to fullfill your use case too? I notice you would prefer a longest-prefix-match, but what is the use-case? Because, after all, if you're trying to match source IP in group 42, then a match of any address or subnet stored in that group would give you the desired true result back. Or are you storing a different context_info there?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants