Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple entries ARP table #1191

Draft
wants to merge 17 commits into
base: pre-release
Choose a base branch
from
Draft

Conversation

FilMarini
Copy link
Collaborator

@FilMarini FilMarini commented Sep 16, 2024

Adding ARP table with multiple entries

PULL REQUEST IS STILL A DRAFT! - Seems to work but more testing is needed

Description

  • The file ethernet/UdpEngine/rtl/ArpIpTable.vhd contains a dynamic LUT which keeps in memory both IPs and MAC addressed. A new entry is memorized in the LUT every time a new 'clientRemoteIp' is detected
  • In the ethernet/UdpEngine/rtl/UdpEngineArp.vhd, every time the destination IP changes, the correspondent MAC address is either retrieved from the LUT if already memorized, or a new entry is created after an ARP request/response.
  • The IP/MAC LUT works as a circular buffer. Once the LUT is filled, then the first LUT address will be filled if a new unknown IP is given to the core
  • In the ethernet/UdpEngine/rtl/UdpEngineTx.vhd the incoming packets are checked for the tDest metadata field. If tDest /= 0, then the IP/MAC address saved in the tDest LUT address are used

Details

  • Currently the LUT is filled in order: first given IP is saved in position 1, second given IP is saved in position 2, and so on. Maybe its good to have a way to feed the ARP table from a file (like a JSON file for example) so the user has better control over the ARP table. I'm not sure, suggestions are welcome!
  • Also, what happens if the tDest field refers to an address that has not been filled? Currently I just flush the data (Or at least thats what I think its doing, havent tested it yet)
  • Also, emacs beautifier should be run with SLAC settings

@FilMarini FilMarini marked this pull request as draft September 16, 2024 20:40
@FilMarini
Copy link
Collaborator Author

FilMarini commented Sep 17, 2024

Everything seems to work in simulation.

Some possible criticalities (?):

  • Using the ARP table, the ARP timeout is only used for the currently active IP address (meaning the 'clientRemoteIp' set either via Axi-Lite or input port)
  • Selecting an IP with the tDest does not make it active, meaning that the ARP timer still refers to the IP set either via Axi-Lite or port

@FilMarini
Copy link
Collaborator Author

I'll be un-drafting this PR once I'll hardware test it to make sure everything works like in simulation.

In the meantime, if people have any thoughts about the points above, please let me know so I can fix them in the simulation environment before hw testing

@ruck314
Copy link
Contributor

ruck314 commented Sep 23, 2024

https://github.com/slaclab/surf/blob/main/.emacs
Can you update the formatting of these changed files to use that .emacs configuration above? It is really difficult to review the changes on Github with so many minor format changes (e.g. different # of spaces of indenting).

Using the ARP table, the ARP timeout is only used for the currently active IP address (meaning the 'clientRemoteIp' set either via Axi-Lite or input port)

I think there would need to be a FSM that cycles through IP addresses in the ARP table that periodically checks if the ARP timeout has happen and need to issue a ARP request.

Selecting an IP with the tDest does not make it active, meaning that the ARP timer still refers to the IP set either via Axi-Lite or port

Is the intent of these "Multiple entries ARP table" to support UDP client only, UDP server only, or both types?

@FilMarini
Copy link
Collaborator Author

Yeah, sorry about that.. maybe I should just add the SURF's .emacs file to my emacs conf file..
I've applied the SLAC format and commited. Anyway I found useful to disable whitespaces in the comparison window, that usually helps a little.

  • Ok, Ill proceed with implementing the FSM for cycling through the IP addresses in the LUT
  • So, for our purposes we only need it to support UDP client, but if you think its useful I'm definitely open to add the support for UDP server as well!

@ruck314
Copy link
Contributor

ruck314 commented Sep 26, 2024

So, for our purposes we only need it to support UDP client,
@FilMarini I thought about it more and seems like only support this for the UDP client is fine. Most cases the FPGA side of the UDP server is just a single UDP client/server pair for the register configuration path.

@FilMarini
Copy link
Collaborator Author

Hi @ruck314, I've updated the code. Here's the changes

  • In the LUT table, each entry has a timer, very similar to the timer in the 'UdpEngineArp' module. When an entry expires, that spot in the LUT is freed
  • Expired ARP entries are stored in a queue (FIFO). When a new entry needs to be added to the ARP table, the system checks the queue. If the queue has any entries, the new entry is placed in the spot of the oldest expired one
  • When an incoming communication is detected in 'UdpEngineRx', the IP address is sent to the ARP Table. If that matches with any entries, then the counter is refreshed

Does this make sense to you?

Now the issue is to check thoroughly the new functionality. Ill try to come up with a nice testbench in order to make sure everything works as intended

@FilMarini
Copy link
Collaborator Author

FilMarini commented Oct 9, 2024

I've created a testbench in order to verify the functionalities of this PR.
The testbench is here
It's a cocotb testbench where basically there's a docker container acting as the client and N docker containers acting as servers.
In the client the firmware simulation sends PRBS data to one server container at a time, while the servers just send the data back in loopback mode. The testbench verifies that the different servers are targeted and that the received data in the client match with the sent one.
Using a waveform viewer I make sure the ARP table functionalities are working as expected.

Currently it seems like everything is working as it is supposed to

I do think though that an hw test is needed before un-drafting this PR

Edit1: The test is not actually working as it is supposed to. There's a bug when entries on the ARP table expire. Working on solving it now

Edit2: Everything should be fixed now

@ruck314
Copy link
Contributor

ruck314 commented Oct 21, 2024

@FilMarini I cloned the latest "Simple-10GbE-RUDP-KCU105-Example", update the submodule/surf to FilMarini:multi-arp, built the FW using your branch and programmed my KCU105 with this new FW. However, when I try to run the software, I got this error message

Simple-10GbE-RUDP-KCU105-Example/software$ python scripts/devGui.py
Rogue/pyrogue version v6.4.0. https://github.com/slaclab/rogue
WARNING:pyrogue.Device.UdpRssiPack.SwRudpClient[1]:host=192.168.2.10, port=8193 -> Establishing link ...
Start: Started zmqServer on ports 9099-9101
    To start a gui: python -m pyrogue gui --server='localhost:9099'
    To use a virtual client: client = pyrogue.interfaces.VirtualClient(addr='localhost', port=9099)
terminate called after throwing an instance of 'rogue::GeneralError'
  what():  JtagDriver::query(): General Error: Received invalid word size.

If I comment out in the code listed below, then it runs (without the XVC support) and no errors observed.
https://github.com/slaclab/Simple-10GbE-RUDP-KCU105-Example/blob/main/firmware/python/simple_10gbe_rudp_kcu105_example/_Root.py#L93-L99

            # Create XVC server and UDP client
            self.udpClient = rogue.protocols.udp.Client( ip, 2542, False ) # Client(host, port, jumbo)
            self.xvc = rogue.protocols.xilinx.Xvc ( 2542 ) # Server(port)
            self.addProtocol( self.xvc )

            # Connect the UDP Client to the XVC
            self.udpClient == self.xvc

If I switch the submodule/surf to latest pre-release branch on slaclab/surf, rebuild FW and revert the python code, I do not get this error message and software runs as expected.

It appears that your changes in surf has broken this FW.UDP.Server/SW.UDP.client connection for XVC communication

@ruck314 ruck314 self-requested a review October 21, 2024 20:05
@FilMarini
Copy link
Collaborator Author

@ruck314 Hmmm, ok. Thanks for trying it out!
I'm currently not in the lab/office. Next week I'll take a closer look at this.

Do you have any idea if its possible to exercise the XVC UDP traffic in simulation in some ways?

@ruck314
Copy link
Contributor

ruck314 commented Oct 31, 2024

@FilMarini Here's how the XVC UDP is connected to the KCU105:
https://github.com/slaclab/Simple-10GbE-RUDP-KCU105-Example/blob/main/firmware/python/simple_10gbe_rudp_kcu105_example/_Root.py#L93-L99

On the SW side, this is a UDP client. On the FW side, this would be yet another UDP server.

In one terminal I tried to start the GUI:

$ python scripts/devGui.py
Rogue/pyrogue version v6.4.0. https://github.com/slaclab/rogue
Start: Started zmqServer on ports 9099-9101
    To start a gui: python -m pyrogue gui --server='localhost:9099'
    To use a virtual client: client = pyrogue.interfaces.VirtualClient(addr='localhost', port=9099)
terminate called after throwing an instance of 'rogue::GeneralError'
  what():  JtagDriver::query(): General Error: Received invalid word size. Please ensure the board is powered up.

Aborted (core dumped)

And in the other terminal I capture the UDP port 2542 traffic between SW/FW:

$ sudo tcpdump -i any udp dst port 2542 and dst host 192.168.2.10 -vvv -X -s 0 -tttt -n
tcpdump: data link type LINUX_SLL2
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
2024-10-31 08:53:30.435555 eth2  Out IP (tos 0x0, ttl 64, id 57830, offset 0, flags [DF], proto UDP (17), length 32)
    192.168.2.1.53486 > 192.168.2.10.2542: [bad udp cksum 0x8579 -> 0x9f9d!] UDP, length 4
        0x0000:  4500 0020 e1e6 4000 4011 d38a c0a8 0201  E.....@.@.......
        0x0010:  c0a8 020a d0ee 09ee 000c 8579 0000 0000  ...........y....

It looks like it fails on the 1st UDP packet sent from SW to FW because it never got a response packet back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants