Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add script for updating OUI records #2945

Merged
merged 8 commits into from
Sep 24, 2024
Merged

Conversation

stveit
Copy link
Contributor

@stveit stveit commented Aug 1, 2024

Closes #2958
Reliant on #2896

Adds a script that downloads a document of all OUIs and uses this to update OUI records in the database.

Django 3.2 didnt have great tools for updating database in a clean way so old ones are removed and current ones are updated (if an OUI got registered under a different name for instance), but it does have great tools for creating a lot of rows and deleting a lot of rows, so thats what I went with. Open for suggestions on improvements.

@stveit stveit self-assigned this Aug 1, 2024
Copy link

codecov bot commented Aug 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 56.79%. Comparing base (1652d80) to head (0568467).
Report is 63 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2945      +/-   ##
==========================================
+ Coverage   56.29%   56.79%   +0.49%     
==========================================
  Files         602      603       +1     
  Lines       43720    43773      +53     
  Branches       48       48              
==========================================
+ Hits        24611    24859     +248     
+ Misses      19097    18902     -195     
  Partials       12       12              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@stveit stveit force-pushed the mac-to-vendor-script branch 3 times, most recently from ba46670 to 0403943 Compare August 6, 2024 13:20
@stveit stveit marked this pull request as ready for review August 7, 2024 10:38
@stveit
Copy link
Contributor Author

stveit commented Aug 9, 2024

The fixtures are formatted horribly, my editor seems to enjoy converting some tabs to spaces and it creates a real mess

@johannaengland
Copy link
Contributor

Django 3.2 didnt have great tools for updating database in a clean way so old ones are removed and current ones are updated (if an OUI got registered under a different name for instance), but it does have great tools for creating a lot of rows and deleting a lot of rows, so thats what I went with. Open for suggestions on improvements.

It is quite an eyesore, but I've been thinking about a better solution for the last 30 min and I have not come up with anything better. Maybe @hmpf can help us out here

Copy link
Contributor

@johannaengland johannaengland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the small nitpick and the agreement to _update_database not being very pretty this looks good to me

python/nav/bin/update_ouis.py Outdated Show resolved Hide resolved
@johannaengland
Copy link
Contributor

I had another idea to improve the _update_database function by separating the OUIs into three groups:

  1. previously in the database, but no longer in the file
  2. overlap in OUI, but changed vendor name
  3. added OUI, vendor pair

and I worked on updating/adding/deleting those separately, but it turns out that that takes longer than the current solution 😁

Copy link
Member

@lunkwill42 lunkwill42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got a lot of feedback this time, but I think the actual update problem is the most important - but I have to think a bit more about that one. It doesn't have to be "nice", it just has to do the job without too many updates :)

python/nav/bin/update_ouis.py Outdated Show resolved Hide resolved
python/nav/bin/update_ouis.py Outdated Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
tests/integration/update_ouis_test.py Outdated Show resolved Hide resolved
python/nav/bin/update_ouis.py Outdated Show resolved Hide resolved
Comment on lines 77 to 82
line = line.strip()
split_line = line.split()
oui = split_line[0] + "-00-00-00"
split_vendor = split_line[2:]
vendor = " ".join(split_vendor)
return OUI(oui=oui, vendor=vendor)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have downloaded a file from a "random location on the internet". I think this could do well to verify even further that the data line you received is formatted as a valid MAC prefix.

Alternate suggestion could be (this assumes you've done from nav.macaddress import MacPrefix):

Suggested change
line = line.strip()
split_line = line.split()
oui = split_line[0] + "-00-00-00"
split_vendor = split_line[2:]
vendor = " ".join(split_vendor)
return OUI(oui=oui, vendor=vendor)
prefix, _, vendor = line.strip(maxsplit=2)
oui = str(MacPrefix(prefix)[0])
return OUI(oui=oui, vendor=vendor)

This will raise a ValueError if the prefix is not a valid MAC address prefix, or if the line cannot be split into three items. My suggestion is to ignore/log such an entry and continue processing, otherwise the whole update process would be derailed by a single bad value. There is, of course, also the potential that many , or even all, lines containing the (hex) string are invalid, which would cause a huge amount of errors to be logged. Makes me think there should also be some threshold where the process exists because there are just too many errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a draft of a solution like this, used a fairly arbitrary number of errors

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another valid solution could be to just count all the errors, rather than log the details of each one. It's not like the user can do much about the errors anyway, since they weren't the ones to create the file in the first place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it so it now logs an error and exits if its exceeds the max error value, but it does not log per error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But currently you are logging per error - I'll add an inline comment as well

@lunkwill42 lunkwill42 mentioned this pull request Aug 22, 2024
@lunkwill42
Copy link
Member

Ok, I've pushed a commit ecc7449 that shows how all the updating can be pushed to PostgreSQL while avoiding a full table rewrite on every script run.

I have some more feedback on your latest changes, @stveit, but those can wait until tomorrow :)

@stveit
Copy link
Contributor Author

stveit commented Sep 17, 2024

Force pushed to get Selenium stuff fixed

Copy link
Contributor

@johannaengland johannaengland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except for the small logging command all else looks nice

python/nav/bin/update_ouis.py Outdated Show resolved Hide resolved
Copy link
Member

@lunkwill42 lunkwill42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better :)

You've a got a huge history with very little content here, so feel free to do some history editing before merging :)

@stveit stveit force-pushed the mac-to-vendor-script branch 2 times, most recently from 0658806 to 47a4dcd Compare September 24, 2024 10:39
stveit and others added 5 commits September 24, 2024 12:42
This ensures we do not rewrite the entire table every time the script
runs, as that can cause table bloat in PostgreSQL.  Instead, all
the heavy lifting of figuring out which rows to insert, update or delete
is delegated to PostgreSQL.  We simply upload the full dataset to
a temporary table and issue SQL statements to ensure the `oui` table is
in sync.
@stveit stveit merged commit cce0a5a into Uninett:master Sep 24, 2024
12 checks passed
@stveit stveit deleted the mac-to-vendor-script branch September 24, 2024 11:01
@davidemiccone
Copy link

Please, how can I run update_ouis.py script?
I have a Debian 11 installation.
Running
python3 /opt/venvs/nav/lib/python3.9/site-packages/nav/bin/update_ouis.py
as root user doesn't works

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add maintenance script to download the current official OUI list and insert/update the database OUI model
4 participants