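1. After deploying the server, we enabled LLDP on it so that we could check whether the copper and fiber links between the server and the switches were cabled correctly. However, the server's copper NIC ports (Intel 700 series, X722 in this case) showed no LLDP neighbors, so we suspected a NIC configuration problem.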
[root@BCONEST-X86-MON02 ~]# lspci |grep net
18:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
18:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
3d:00.0 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.1 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.2 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 1GbE (rev 09)
5f:00.0 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
5f:00.1 Ethernet controller: Mellanox Technologies MT27710 Family [ConnectX-4 Lx]
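2. To narrow the problem down, we ran tcpdump on the affected interface. It captured only the LLDP frames sent by the server; none of the frames sent by the switch appeared. We then checked the switch configuration and ran debug on the switch, which showed LLDP frames being both sent and received on the switch port. That pointed to the way the server NIC handles the frames.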
[root@BCONEST-X86-MON02 ~]# tcpdump -i enp61s0f1 |grep -i LLDP
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on enp61s0f1, link-type EN10MB (Ethernet), capture size 262144 bytes
11:27:38.357788 LLDP, length 262: BCONEST-X86-MON02
11:28:08.401968 LLDP, length 262: BCONEST-X86-MON02
11:28:38.445474 LLDP, length 262: BCONEST-X86-MON02
11:29:08.489210 LLDP, length 262: BCONEST-X86-MON02
11:29:38.533460 LLDP, length 262: BCONEST-X86-MON02
11:30:08.579707 LLDP, length 262: BCONEST-X86-MON02
11:30:38.624087 LLDP, length 262: BCONEST-X86-MON02
11:31:08.668239 LLDP, length 262: BCONEST-X86-MON02
11:31:38.712726 LLDP, length 262: BCONEST-X86-MON02
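The capture above can be made less noisy by filtering on the LLDP EtherType (0x88cc) instead of piping everything through grep. The following is a minimal sketch, not part of the original troubleshooting session: `count_foreign_lldp` is a hypothetical helper that reads tcpdump's one-line output and counts LLDP frames whose reported system name differs from the local one. It assumes the system name advertised by the server equals the name passed as the first argument.

```shell
# Hypothetical helper: count LLDP lines in tcpdump output (stdin) whose
# system name is not the local one. Lines that carry no system name are
# counted as foreign as well.
count_foreign_lldp() {
    awk -v host="$1" '
        $2 == "LLDP," {
            name = (NF >= 5) ? $5 : ""
            if (name != host) n++
        }
        END { print n + 0 }
    '
}

# Example use (needs root; stops after 10 LLDP frames):
#   tcpdump -l -nn -i enp61s0f1 -c 10 ether proto 0x88cc 2>/dev/null \
#       | count_foreign_lldp BCONEST-X86-MON02
```

A result of 0 over several advertisement intervals matches the symptom described here: the host sees only its own outbound LLDP frames.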
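3. After a good deal of searching, we found the cause in the Red Hat Knowledgebase, in the article "Intel X710 series NICs (i40e) do not receive LLDP frames":
Intel 700 series NICs run an LLDP agent in firmware that will process and “absorb” any LLDPDU frames received from the switch. The frames are therefore never visible to the OS.
In other words, the firmware agent consumes every LLDP frame arriving from the switch before the operating system can see it, which is why tcpdump on the host showed only outbound LLDP frames.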
Solution:
Red Hat provides two solutions.
① On kernel-3.10.0-957.el7 or later, set the disable-fw-lldp private flag with ethtool. This tells the NIC driver to turn off the built-in firmware LLDP agent.
ethtool --set-priv-flags <NIC name> disable-fw-lldp on
ethtool --set-priv-flags enp61s0f1 disable-fw-lldp on
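Before and after setting the flag, its state can be read back with `ethtool --show-priv-flags`. The sketch below is an illustration only: `fw_lldp_flag_state` is a hypothetical helper that parses that output from stdin and prints `on`, `off`, or `unsupported` when the driver does not list the flag. It assumes the usual `name : value` layout of the private-flags listing.

```shell
# Hypothetical helper: report the state of the disable-fw-lldp private flag
# from `ethtool --show-priv-flags <NIC name>` output read on stdin.
fw_lldp_flag_state() {
    awk -F':' '
        { name = $1; gsub(/^[ \t]+|[ \t]+$/, "", name) }
        name == "disable-fw-lldp" {
            val = $2
            gsub(/[ \t]/, "", val)
            print val
            found = 1
        }
        END { if (!found) print "unsupported" }
    '
}

# Example use:
#   ethtool --show-priv-flags enp61s0f1 | fw_lldp_flag_state
```

If the helper prints `unsupported`, the running driver does not expose the flag and the debugfs method is the remaining option.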
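② On older kernels, or when the first method has no effect, the agent can be stopped through the i40e debugfs interface instead. Note that this setting is lost on reboot.
echo "lldp stop" > /sys/kernel/debug/i40e/<pci bus address>/command
echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.0/command # stop the firmware LLDP agent on port 0
echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.1/command # stop the firmware LLDP agent on port 1
# Batch version: find every i40e command file and write "lldp stop" to each one in a for loop
for i in $(find /sys/kernel/debug/i40e/ -name command); do echo 'lldp stop' > "$i"; done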
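The debugfs path is keyed by PCI address, while tcpdump and lldptool work with interface names. For names of the form `enp<bus>s<slot>f<function>`, the bus and slot are written in decimal in the interface name but in hexadecimal in the PCI address, which is why enp61s0f1 corresponds to 0000:3d:00.1. The sketch below shows that conversion. `ifname_to_pci` and `stop_fw_lldp` are hypothetical helpers, and the conversion assumes PCI domain 0 and this exact naming scheme; `ethtool -i <NIC name>` (the bus-info field) is the authoritative source.

```shell
# Hypothetical helper: convert enp<bus>s<slot>[f<function>] to a PCI address.
# Assumes PCI domain 0000; fails for any other naming scheme.
ifname_to_pci() {
    # Split "enp61s0f1" into "61 0 1"; the function part is optional.
    set -- $(printf '%s\n' "$1" |
        sed -n -E 's/^enp([0-9]+)s([0-9]+)(f([0-9]+))?$/\1 \2 \4/p')
    [ $# -ge 2 ] || return 1
    printf '0000:%02x:%02x.%d\n' "$1" "$2" "${3:-0}"
}

# Hypothetical helper: stop the firmware LLDP agent for one interface
# through debugfs (needs root; lost on reboot).
stop_fw_lldp() {
    addr=$(ifname_to_pci "$1") || return 1
    echo "lldp stop" > "/sys/kernel/debug/i40e/$addr/command"
}

# Example use:
#   ifname_to_pci enp61s0f1
#   stop_fw_lldp enp61s0f1
```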
4. Check that the LLDP neighbor information is now displayed correctly.
[root@ZJNB-PSC-P10F2-SPOD3-PM-OS01-BCONEST-X86-MON02 ~]# echo "lldp stop" > /sys/kernel/debug/i40e/0000\:3d\:00.0/command
[root@ZJNB-PSC-P10F2-SPOD3-PM-OS01-BCONEST-X86-MON02 ~]# lldptool -t -n -i enp61s0f1
Chassis ID TLV
MAC: 00:01:7a:6a:02:15
Port ID TLV
Ifname: gigabitethernet2/0/44
Time to Live TLV
120
Port Description TLV
dT:[BCONEST-X86-MON02]-eno4-bond0-10.194.220.2
System Name TLV
ZJNB-PSC-P10F2-POD3-M-JR-4320-3&4
System Description TLV
MyPower (R) Operating System Software
Copyright (C) 2020 Maipu Communication Technology Co.,Ltd.All Rights Reserved.
System Capabilities TLV
System capabilities: Bridge, Router
Enabled capabilities: Bridge, Router
Management Address TLV
IPv4: 10.0.0.40
Ifindex: 4
Port VLAN ID TLV
PVID: 1
Port and Protocol VLAN ID TLV
PVID: 0, supported, not enabled
VLAN Name TLV
VID 1200: Name VLAN1200
MAC/PHY Configuration Status TLV
Auto-negotiation supported and enabled
PMD auto-negotiation capabilities: 0x009b
MAU type: 1000 BaseTFD
Power via MDI TLV
Port class PD
PSE MDI power not supported
PSE pairs not controllable
PSE Power pair: unkwown [0]
Power class 1
Link Aggregation TLV
Aggregation capable
Currently not aggregated
Aggregated Port ID: 0
Maximum Frame Size TLV
9216
End of LLDPDU TLV
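For the original goal of checking cabling, only the neighbor's system name and port are needed from this output. The following is a minimal sketch under the assumption that lldptool prints each TLV value on the line after its TLV heading, as in the output above. `lldp_neighbor_summary` is a hypothetical helper, not an lldptool feature.

```shell
# Hypothetical helper: reduce `lldptool -t -n -i <NIC name>` output (stdin)
# to one line: "<neighbor system name> <neighbor port>".
lldp_neighbor_summary() {
    awk '
        { sub(/^[ \t]+/, "") }                      # drop leading indentation
        prev == "Port ID TLV"     { port = $0; sub(/^[A-Za-z]+: /, "", port) }
        prev == "System Name TLV" { name = $0 }
        { prev = $0 }
        END { print name " " port }
    '
}

# Example use:
#   lldptool -t -n -i enp61s0f1 | lldp_neighbor_summary
```

Running this for each interface gives a compact list that can be compared against the cabling plan.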
References:
Red Hat Knowledgebase
Intel X700 series NICs (i40e driver) cannot receive LLDP packets
2488H v5 server: onboard X722 NIC does not send LLDP frames after installing Linux