Cisco 3745 + 2 NM-16ESW modules breaks IP routing #243

Open
Abyss-W4tcher opened this issue Jun 17, 2024 · 20 comments

@Abyss-W4tcher

Hello,

I am running a Cisco 3745 with IOS 12.4(25d) in a GNS3 topology. A single NM-16ESW works fine, but adding a second one in slot 2 breaks IP on my administration VLAN interface:

# debug ip packet
# ping 192.168.1.3
*Mar  1 00:04:04.139: IP: tableid=0, s=192.168.1.2 (local), d=192.168.1.3 (Vlan200), routed via RIB
*Mar  1 00:04:04.139: IP: s=192.168.1.2 (local), d=192.168.1.3 (Vlan200), len 100, sending
*Mar  1 00:04:04.143: IP: s=192.168.1.2 (local), d=192.168.1.3 (Vlan200), len 100, encapsulation failed.

However, switching still works fine, as hosts are able to reach the Internet. Removing the second module and rebooting brings everything back to normal. Could someone help me? Thanks!

@grossmj
Member

grossmj commented Jun 17, 2024

This may be a bug in either Dynamips or IOS itself. It could be worth trying with another IOS image, like the c3660.

@grossmj
Member

grossmj commented Jun 17, 2024

According to this post https://www.gns3.com/community/featured/cant-get-nm-16esw-module-to-work

The NM-16ESW modules must be in slot 0 and slot 1.

@Abyss-W4tcher
Author

> This may be a bug in either Dynamips or IOS itself. It could be worth trying with another IOS image, like the c3660.

Hello, I will give it a shot tomorrow

> According to this post https://www.gns3.com/community/featured/cant-get-nm-16esw-module-to-work
>
> The NM-16ESW modules must be in slot 0 and slot 1.

Even if that is true, I wanted to expand to 4 modules to get a 64-port "switch", and slot 0 is locked anyway... I do not have access to the paid Cisco switch images, so I use this router-as-a-switch setup.

@grossmj
Member

grossmj commented Jun 17, 2024

> Even if that is true, I wanted to expand to 4 modules to get a 64-port "switch", and slot 0 is locked anyway... I do not have access to the paid Cisco switch images, so I use this router-as-a-switch setup.

Correct, slot 0 is taken for c3725, c3745 and c3660. I think it is available for 3620 and 3640 though.

However, you cannot get 64 ports anyway, because the maximum is 2 NM-16ESW modules per router:

https://www.cisco.com/c/en/us/support/docs/interfaces-modules/network-modules/60565-etherswitch-FAQ.html

> Two modules in any router is the limit for an intrachassis stack

@Abyss-W4tcher
Author

Hello @grossmj, sorry for the delay. I switched to a 3640 image, and I can put an NM-16ESW in slot 0. However, adding another one in slot 1 produces the same "encapsulation failed" error as before.

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 21, 2024

Comparing the ARP table between the working and non-working setups, there is definitely a problem at this layer. I enabled ARP debugging, and I think the problem comes from there. Maybe adding another module creates some sort of loop or overlap. I will try to debug further.

Edit: this looks like the issues described in https://community.cisco.com/t5/switching/3745-nm-esw-16-issues/m-p/649599/highlight/true#M6037 and https://community.cisco.com/t5/other-network-architecture-subjects/nm-16esw-switching-problem/m-p/837398/highlight/true#M181651

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 21, 2024

OK, so thanks to the link you sent and some additional documentation (https://www.cisco.com/c/en/us/td/docs/routers/access/interfaces/nm/hardware/installation/guide/connswh.html#:~:text=If%20two%20Ethernet%20switch%20network,configuration%20requirements%20must%20be%20met%3A&text=Both%20Ethernet%20switch%20network%20modules,Gigabit%20Ethernet%20expansion%20board%20installed.):

image

The Gigabit interfaces of the two modules need to be connected together. However, I do not see this port on the module... I will now dig into the Dynamips source code to see if there is any mention of it.

Edit: a GE-DCARD-ESW daughterboard would be needed (https://community.cisco.com/t5/networking-knowledge-base/unable-to-configure-both-nm-16esw-switch-modules-on-a-3600-or/ta-p/3131532).

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 21, 2024

Hello again @grossmj, sorry for the ping. I would like to add a sub module to the NM-16ESW slot, typically inside the dev_nm_16esw_init function. I found the EEPROM data for the GE-DCARD-ESW, and I am trying to add a new daughterboard, which would be automatically added as a sub module when an NM-16ESW is instantiated.

I saw references to sub slots in the code, but maybe you have a similar example in mind that could help me ?

Thank you in advance !

@flaviojs
Contributor

flaviojs commented Jun 21, 2024

I don't see an example of a subslot card in the code.
Based on the CISCO_CARD_TYPE_NM case in vm_slot_get_info you should probably:

  1. increase the wic_slots field of dev_c3745_nm_16esw_driver from 0 to 1
  2. add the daughter card to card->sub_slots[0] in dev_c3745_nm_16esw_init
  3. maybe create a function for the card_get_sub_info field of dev_c3745_nm_16esw_driver?
  4. maybe add an EEPROM v4 field "01 XX(Number of Slots)" to eeprom_nm_16esw_data? (for IOS)
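
Steps 1 and 2 could be sketched roughly as follows. This is a minimal illustration with simplified stand-in types, not the real Dynamips card structures: every `sketch_` name and the `SKETCH_MAX_SUB_SLOTS` limit are hypothetical, and only the `wic_slots` and `sub_slots` field names come from the steps above.

```c
#include <stddef.h>

#define SKETCH_MAX_SUB_SLOTS 4   /* hypothetical limit, not taken from Dynamips */

/* Hypothetical, simplified stand-in for a card instance. */
struct sketch_card {
    const char *dev_type;                                /* e.g. "NM-16ESW" */
    struct sketch_card *sub_slots[SKETCH_MAX_SUB_SLOTS]; /* NULL = empty sub slot */
};

/* Hypothetical, simplified stand-in for a card driver. */
struct sketch_driver {
    const char *dev_type;
    unsigned int wic_slots;   /* step 1: number of sub slots the driver exposes */
};

/* Step 2: attach a daughter card to sub slot 0 when the NM is initialised.
 * Returns 0 on success, -1 if the driver exposes no sub slot. */
int sketch_nm_16esw_init(const struct sketch_driver *drv,
                         struct sketch_card *card,
                         struct sketch_card *daughter)
{
    if (drv->wic_slots < 1)
        return -1;
    card->sub_slots[0] = daughter;
    return 0;
}

/* Count populated sub slots, limited to what the driver exposes. */
unsigned int sketch_count_sub_cards(const struct sketch_driver *drv,
                                    const struct sketch_card *card)
{
    unsigned int i, n = 0;

    for (i = 0; i < drv->wic_slots && i < SKETCH_MAX_SUB_SLOTS; i++) {
        if (card->sub_slots[i] != NULL)
            n++;
    }
    return n;
}
```

With `wic_slots` left at 0 the init function refuses the daughter card, which mirrors why step 1 has to come first. Whether IOS then actually sees the sub card depends on steps 3 and 4, which this sketch does not cover.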

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 21, 2024

> I don't see an example of a subslot card in the code. Based on the CISCO_CARD_TYPE_NM case in vm_slot_get_info you should probably:
>
>   1. increase the wic_slots field of dev_c3745_nm_16esw_driver from 0 to 1
>   2. add the daughter card to card->sub_slots[0] in dev_c3745_nm_16esw_init
>   3. maybe create a function for the card_get_sub_info field of dev_c3745_nm_16esw_driver?
>   4. maybe add an EEPROM v4 field "01 XX(Number of Slots)" to eeprom_nm_16esw_data? (for IOS)

Hello !

I modified the EEPROM field, and I am now figuring out how to create a card to insert into the sub_slots list (which fields to set), and whether this sub card has to be initialized explicitly or is initialized automatically. Do I have to set daughter_card->dev_type = CISCO_CARD_TYPE_WIC;?

Edit: after a few attempts, I can't get the kind of inheritance I want :/ I am not sure WICs are necessary, as the 3600 series supports these additional daughter cards.

Edit 2: I think I figured out how to declare a port and make it "inherit" from a parent card. I found a sample output of show inventory raw:

NAME: "16 Port 10BaseT/100BaseTX EtherSwitch on Slot 1", DESCR: "16 Port 10BaseT/100BaseTX EtherSwitch"
PID: NM-16ESW          , VID: V01 , SN: FOC11332NVM
 
NAME: "DaughterCard Slot 0 on Card 1", DESCR: "C2821 DaughterCard Slot"
PID:                   , VID:    , SN:
 
NAME: "Power daughter card for 16 port EtherSwitch NM on Slot 1 SubSlot 0", DESCR: "Power daughter card for 16 port EtherSwitch NM"
PID: PPWR-DCARD-16ESW  , VID: V01 , SN: FOC11325M3W
 
NAME: "DaughterCard Slot 1 on Card 1", DESCR: "C2821 DaughterCard Slot"
PID:                   , VID:    , SN:
 
NAME: "Gigabit(1000BaseT) module for EtherSwitch NM on Slot 1 SubSlot 1", DESCR: "Gigabit(1000BaseT) module for EtherSwitch NM"
PID: GE-DCARD-ESW      , VID: V01 , SN: FOC11341MKK

I tried adding my card as a subslot of the NM-16ESW card, giving it a custom EEPROM and the dev_c7200_pa_ge_driver driver (GigabitEthernet port). However, this causes conflicts, as it tries to instantiate another PCI device in the existing slot of the VM board instead of the one on the right (though I am not at all sure it is a PCI port):

image

final edit:

I tried adding a device to the same PCI slot as the NM-16ESW, but it won't get detected. I think it needs to be bound to a "DaughterCard slot", which might need specific additional hooks :/

@flaviojs
Contributor

This daughtercard stuff is interesting to me, but I can't comment on your approach without seeing the code.
If you fork this repository and make a draft PR in your repository then I can see the code and comment there.
Maybe the daughtercard device needs to be in the same pci_bus?
Maybe the device needs to be mapped to a certain physical address?
There are no examples in the code so no idea.

There is DEBUG_* code, maybe enabling that in appropriate places can guide you.


The cards/slots/subslots are an IOS concept that dynamips mimics, but the actual functionality is in the chip devices.
In general terms IOS communicates with the chips through PCI.
The dev_*_access or dev->handler or dev.handler functions are the heart of the chip functionality.
Anything can happen when it writes or reads a register through PCI.
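
To make that concrete, here is a minimal sketch of what such a register-access handler can look like. It deliberately does not use the real Dynamips handler signature: the `sketch_` names, the register offsets, and the side effects are all invented for illustration.

```c
#include <stdint.h>

enum sketch_op { SKETCH_READ, SKETCH_WRITE };

/* Hypothetical chip state: one control register and one status register. */
struct sketch_chip {
    uint32_t ctrl;
    uint32_t status;
};

#define SKETCH_REG_CTRL   0x00u   /* made-up register offsets */
#define SKETCH_REG_STATUS 0x04u

/* Called for every guest access that lands in the chip's address range.
 * Returns 0 if the offset is handled, -1 otherwise. */
int sketch_chip_access(struct sketch_chip *chip, uint32_t offset,
                       enum sketch_op op, uint32_t *data)
{
    switch (offset) {
    case SKETCH_REG_CTRL:
        if (op == SKETCH_READ) {
            *data = chip->ctrl;
        } else {
            chip->ctrl = *data;
            /* Side effect: writing bit 0 ("enable") raises a ready flag. */
            if (*data & 1u)
                chip->status |= 1u;
        }
        return 0;
    case SKETCH_REG_STATUS:
        if (op == SKETCH_READ)
            *data = chip->status;
        else
            chip->status &= ~(*data);   /* write-1-to-clear, a common pattern */
        return 0;
    default:
        return -1;                      /* unmapped register */
    }
}
```

The point of the sketch is that a plain write to one register can change what a later read of another register returns, so IOS only "sees" a feature if the emulated chip answers the relevant accesses the way the real hardware would.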

Since the NM-16ESW uses the BCM5605 chip and BCM chips are mostly "system-on-a-chip", the original gigabit port is probably a part of it.
The comments say the device is experimental, maybe the original author never got around to the gigabit port.
If you can find the appropriate datasheet or programmer's guide then you can check and see if it's something you can do.

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 23, 2024

Thanks for the detailed answer !

My code is a bit messy right now, and I don't think making a fork would really help you. I've been trying to insert a generic 1-FE-TX card inside a sub slot of the NM-16ESW card, but I think it is not possible, as it is an am79c971.

I set the sub slot card on the same PCI bus and device, but it does not show up anywhere. However, I can see some available daughter card slots on the chassis, unrelated to the NM-16ESW card:

sh inventory raw
NAME: "3745 chassis", DESCR: "3745 chassis"
PID:                   , VID: 2.0, SN: FTX0945W0MY

NAME: "3640 Chassis Slot 0", DESCR: "3640 Chassis Slot"
PID:                   , VID:    , SN:            

NAME: "c3745 Motherboard with Fast Ethernet", DESCR: "c3745 Motherboard with Fast Ethernet"
PID: C3745-2FE         , VID: 2.0, SN: XXXXXXXXXXX

NAME: "DaughterCard Slot 0 on Card 0", DESCR: "3640 DaughterCard Slot"
PID:                   , VID:    , SN: 

So, my guess is that I need to add a "DaughterCard" type of slot in the bcm functions, similarly to the way the 16 Fa ports are created. This way, the 3745 might detect that this card has this kind of slot available (see https://pastebin.com/5B57icJw for reference).

I am currently analysing the 3745 image binary to find where the router queries a card for any daughter card slot existence, and also searching for any documentation on BCM5605 or BCM5618.

@flaviojs
Contributor

Since you are analyzing the image binary, you might want to check how it distinguishes between NM-16ESW and NM-16ESW-1GIG.

The IOS code might be something simple like this or equivalent:

if PID is NM-16ESW { add 16 FastEthernet }
else if PID is NM-16ESW-1GIG { add 16 FastEthernet and 1 GigabitEthernet }
...

@Abyss-W4tcher
Author

Abyss-W4tcher commented Jun 24, 2024

Looking at the code, I am not sure there is an NM-16ESW-1GIG specifically. There are checks against NM-16ESW and NMD-36ESW:

    if ( v8 == 0x2B1 )
    {
      v9 = &unk_6340486C;
      *(a3 + 4) = 0x469;
      *(a3 + 8) = "36 Port 10BaseT/100BaseTX EtherSwitch";
    }
    else                                        // 0x2a9, see https://github.com/GNS3/dynamips/blob/master/common/cisco_eeprom.c#L45
    {
      *(a3 + 4) = 0x468;
      *(a3 + 8) = "16 Port 10BaseT/100BaseTX EtherSwitch";
      v9 = &unk_634048A0;
    }

Then, separate logic handles the sub slots (in another function, apparently unrelated to the previous one):

  v11 = check_bcm5605_or_bcm5618(a2, a3, v6 + 0x2C);
  v7 = 0;
  if ( v11 )
{
// [...] *INTERMEDIATE LOGIC* [...]
        if ( v22 )                              // number of ports == 36 check
        {
          v25 = *(v6 + 0x54);
          if ( v25 && eeprom_v4_parser(v25, 0x80, 0x40, v65) && v65[0] == 0x2B4 )// nmd-36esw powercard
            *(v6 + 0x24) = 1;
          v26 = *(v6 + 0x4C);
          if ( v26 )
          {
            if ( eeprom_v4_parser(v26, 0x80, 0x40, v65) )
            {
              v20 = 0x2B2;
              if ( v65[0] == 0x2B2 )            // ge-dcard-esw (for nmd-36esw)
              {
                v27 = *(v6 + 0x40) | 0x1000000;
                ++*(v6 + 0x60);
                *(v6 + 0x40) = v27;
              }
            }
          }
          v24 = *(v6 + 0x50);
        }
        else
        {
          v23 = *(v6 + 0x54);
          if ( v23 && eeprom_v4_parser(v23, 0x80, 0x40, v65) && v65[0] == 0x2B3 )// nm-16esw powercard
            *(v6 + 0x24) = 1;
          v24 = *(v6 + 0x4C);
        }
        if ( v24 && eeprom_v4_parser(v24, 0x80, 0x40, v65) && v65[0] == 0x2B2 )// ge-dcard-esw (for nm-16esw)
        {
          v28 = *(v6 + 0x30) | 0x1000000;
          ++*(v6 + 0x60);
          *(v6 + 0x30) = v28;
        }
}

If I insert the GE-DCARD-ESW into a PCI slot (let's say 3), it is not detected, so it definitely needs to be attached to a main card. I think the BCM logic in Dynamips needs to advertise an additional "DaughterCard slot" to the router, or else the sub slot won't even be checked?
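
For readability, the decompiled sub-slot logic above can be paraphrased as ordinary C. This is only one reading of the snippet, not IOS or Dynamips code: the struct, its field names, and the function name are invented, a product id of 0 stands in for "no EEPROM or parse failed", and the mapping of the decompiled offsets (`+0x4C`, `+0x50`, `+0x54`, and so on) to these fields is an assumption.

```c
#include <stdint.h>

#define PID_GE_DCARD_ESW  0x2B2u
#define PID_PWR_NM_16ESW  0x2B3u
#define PID_PWR_NMD_36ESW 0x2B4u
#define SKETCH_GE_FLAG    0x1000000u   /* bit OR-ed into the flag words */

/* Hypothetical view of the structure behind v6 in the decompiled code. */
struct sketch_esw {
    int is_36_port;          /* v22: non-zero for the 36-port card       */
    uint32_t sub_pid_4c;     /* product id parsed from EEPROM at +0x4C   */
    uint32_t sub_pid_50;     /* product id parsed from EEPROM at +0x50   */
    uint32_t sub_pid_54;     /* product id parsed from EEPROM at +0x54   */
    int has_power_card;      /* *(v6 + 0x24)                             */
    uint32_t flags_30;       /* *(v6 + 0x30)                             */
    uint32_t flags_40;       /* *(v6 + 0x40)                             */
    unsigned int ge_count;   /* *(v6 + 0x60)                             */
};

void sketch_detect_daughter_cards(struct sketch_esw *c)
{
    uint32_t power_pid = c->is_36_port ? PID_PWR_NMD_36ESW : PID_PWR_NM_16ESW;
    uint32_t last_ge_pid;

    /* The power card is matched against the id that fits the main card. */
    if (c->sub_pid_54 == power_pid)
        c->has_power_card = 1;

    if (c->is_36_port) {
        /* The 36-port card has an extra gigabit sub slot. */
        if (c->sub_pid_4c == PID_GE_DCARD_ESW) {
            c->flags_40 |= SKETCH_GE_FLAG;
            c->ge_count++;
        }
        last_ge_pid = c->sub_pid_50;
    } else {
        last_ge_pid = c->sub_pid_4c;
    }

    /* Common tail: one gigabit sub slot checked for both card types. */
    if (last_ge_pid == PID_GE_DCARD_ESW) {
        c->flags_30 |= SKETCH_GE_FLAG;
        c->ge_count++;
    }
}
```

Under this reading, everything hinges on the sub-slot EEPROM being readable: if the EEPROM pointer is null or the parse fails, no gigabit port is counted.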

@flaviojs
Contributor

Looks like it uses the EEPROM field "40 XX XX(product id)" to distinguish the cards:

  • 0x2A9 = NM-16ESW
  • 0x2B1 = NMD-36ESW
  • 0x2B2 = GE-DCARD-ESW (subslot)
  • 0x2B3 = powercard for NM-16ESW (subslot)
  • 0x2B4 = powercard for NMD-36ESW (subslot)

Seems like the correct way to do this is to implement a GE-DCARD-ESW card with whatever is needed.
Then either add subslots to NM-16ESW and add it manually "-p 1:NM-16ESW -p 1:0:GE-DCARD-ESW",
or create a NM-16ESW-1GIG card that adds GE-DCARD-ESW automatically "-p 1:NM-16ESW-1GIG".
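
As a rough illustration of the product id check, the sketch below pulls a "40 XX XX" field out of raw EEPROM bytes and maps it to one of the ids listed above. The layout it assumes (a two-byte header, a type byte whose top two bits select a 1, 2, or 4 byte payload, `0xFF` as terminator) is a simplification and should be checked against the Dynamips EEPROM code; variable-length fields are not handled, and all `sketch_` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FIELD_PRODUCT_ID 0x40u

/* Scan an EEPROM image for the product id field.
 * Returns the 16-bit product id, or -1 if it is not found. */
int sketch_eeprom_product_id(const uint8_t *data, size_t len)
{
    size_t pos = 2;                       /* assumed: skip a 2-byte header */

    while (pos < len) {
        uint8_t type = data[pos++];
        size_t flen;

        if (type == 0xFF)                 /* assumed end-of-data marker */
            break;

        switch (type & 0xC0) {
        case 0x00: flen = 1; break;
        case 0x40: flen = 2; break;
        case 0x80: flen = 4; break;
        default:   return -1;             /* variable-length field: not handled */
        }

        if (flen > len - pos)
            return -1;                    /* truncated field */

        if (type == SKETCH_FIELD_PRODUCT_ID)
            return (data[pos] << 8) | data[pos + 1];

        pos += flen;
    }
    return -1;
}

/* Map a product id to the card names identified in this thread. */
const char *sketch_card_name(int product_id)
{
    switch (product_id) {
    case 0x2A9: return "NM-16ESW";
    case 0x2B1: return "NMD-36ESW";
    case 0x2B2: return "GE-DCARD-ESW";
    case 0x2B3: return "powercard for NM-16ESW";
    case 0x2B4: return "powercard for NMD-36ESW";
    default:    return "unknown";
    }
}
```

A GE-DCARD-ESW card in Dynamips would then need an EEPROM image whose product id field decodes to 0x2B2, and IOS would need to be able to read that image from the sub slot.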

@Abyss-W4tcher
Author

I agree, but the problem is that I don't really know what type of card GE-DCARD-ESW should be, even though I think it should be bound to the same PCI device and bus as the master. I tried adding some port bindings in the BCM definitions:

static int nm16esw_port_mapping[] = {
   2, 0, 6, 4, 10, 8, 14, 12, 3, 1, 7, 5, 11, 9, 15, 13,
};

But this doesn't really change anything, so maybe the card actually sends a BCM READ REQUEST (?) to check if there are any """DaughterCard""" slots on the chip. But that's the part to figure out, and I can't find the needed documentation (or how the original dev figured out those values).

For now, I am trying to edit NM-16ESW to automatically add the GE-DCARD, as that would be easier to integrate into GNS3. Additionally, the GE-DCARD's sole purpose should be to connect two NM-16ESW modules, but that's another part to figure out.

@flaviojs
Contributor

flaviojs commented Jun 24, 2024

The original dev clearly did trial and error + whatever datasheets were available.

Since the cards are identified by the product_id in EEPROM then making sure IOS reads the subslot EEPROM is required.
The 1st argument of your eeprom_v4_parser should point to a device address that reads the target EEPROM.
It is probably an address that triggers dev_c3745_iofpga_access like 0x1fa00000+offset,
or an address that triggers dev_bcm5605_access like 0x30000000+offset for slot0 and 0x32000000+offset for slot1.

@Abyss-W4tcher
Author

Hi, I won't have much time to devote to this for a few weeks. I still want to dive back into it later though!

@grossmj
Member

grossmj commented Oct 31, 2024

No worries, thanks for the links 👍
