* Re: regression: nvme rdma with bnxt_re0 broken [not found] ` <619411460.27128070.1562838433020.JavaMail.zimbra@redhat.com> @ 2019-07-11 16:13 ` Sagi Grimberg [not found] ` <AM0PR05MB48664657022ECA8526E3C967D1F30@AM0PR05MB4866.eurprd05.prod.outlook.com> 1 sibling, 0 replies; 18+ messages in thread From: Sagi Grimberg @ 2019-07-11 16:13 UTC (permalink / raw) To: Yi Zhang, linux-nvme; +Cc: danielj, parav, linux-rdma CC linux-rdma On 7/11/19 2:47 AM, Yi Zhang wrote: > Hello > > 'nvme connect' failed when use bnxt_re0 on latest upstream build[1], by bisecting I found it was introduced from v5.2.0-rc1 with [2], it works after I revert it. > Let me know if you need more info, thanks. > > [1] > [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn > Failed to write to /dev/nvme-fabrics: Bad address > > [root@rdma-perf-07 ~]$ dmesg > [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status 0x5 > [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH > [ 476.332525] nvme nvme2: rdma_connect failed (-14). > [ 476.343552] nvme nvme2: rdma connection establishment failed (-14) > > [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe > 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe > 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] (rev 02) > 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > 19:00.1 Ethernet controller: Broadcom Inc. 
and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > [2] > commit 823b23da71132b80d9f41ab667c68b112455f3b6 > Author: Parav Pandit <parav@mellanox.com> > Date: Wed Apr 10 11:23:03 2019 +0300 > > IB/core: Allow vlan link local address based RoCE GIDs > > IPv6 link local address for a VLAN netdevice has nothing to do with its > resemblance with the default GID, because VLAN link local GID is in > different layer 2 domain. > > Now that RoCE MAD packet processing and route resolution consider the > right GID index, there is no need for an unnecessary check which prevents > the addition of vlan based IPv6 link local GIDs. > > Signed-off-by: Parav Pandit <parav@mellanox.com> > Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > > > > Best Regards, > Yi Zhang > > > > _______________________________________________ > Linux-nvme mailing list > Linux-nvme@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme > ^ permalink raw reply [flat|nested] 18+ messages in thread
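[Editorial sketch, not part of the original mail: the two symptoms in [1] are one and the same errno. The -14 that rdma_connect reports in dmesg is -EFAULT, which the failed write() to /dev/nvme-fabrics surfaces to nvme-cli as "Bad address". A quick check (uses python3 only to look up the errno string):]

```shell
# -14 in the kernel log is -EFAULT; userspace sees the matching strerror text.
msg=$(python3 -c 'import os; print(os.strerror(14))')
echo "$msg"
```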
* RE: regression: nvme rdma with bnxt_re0 broken [not found] ` <AM0PR05MB48664657022ECA8526E3C967D1F30@AM0PR05MB4866.eurprd05.prod.outlook.com> @ 2019-07-11 16:18 ` Parav Pandit 2019-07-12 1:53 ` Yi Zhang 0 siblings, 1 reply; 18+ messages in thread From: Parav Pandit @ 2019-07-11 16:18 UTC (permalink / raw) To: Yi Zhang, linux-nvme Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, selvin.xavier Sagi, This is better one to cc to linux-rdma. + Devesh, Selvin. > -----Original Message----- > From: Parav Pandit > Sent: Thursday, July 11, 2019 6:25 PM > To: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org > Cc: Daniel Jurgens <danielj@mellanox.com> > Subject: RE: regression: nvme rdma with bnxt_re0 broken > > Hi Yi Zhang, > > > -----Original Message----- > > From: Yi Zhang <yi.zhang@redhat.com> > > Sent: Thursday, July 11, 2019 3:17 PM > > To: linux-nvme@lists.infradead.org > > Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit > > <parav@mellanox.com> > > Subject: regression: nvme rdma with bnxt_re0 broken > > > > Hello > > > > 'nvme connect' failed when use bnxt_re0 on latest upstream build[1], > > by bisecting I found it was introduced from v5.2.0-rc1 with [2], it > > works after I revert it. > > Let me know if you need more info, thanks. > > > > [1] > > [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 > > -n testnqn Failed to write to /dev/nvme-fabrics: Bad address > > > > [root@rdma-perf-07 ~]$ dmesg > > [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status > > 0x5 [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ > > 476.332525] nvme nvme2: rdma_connect failed (-14). > > [ 476.343552] nvme nvme2: rdma connection establishment failed (-14) > > > > [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > > 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme > > BCM5720 2-port Gigabit Ethernet PCIe > > 01:00.1 Ethernet controller: Broadcom Inc. 
and subsidiaries NetXtreme > > BCM5720 2-port Gigabit Ethernet PCIe > > 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] > > (rev > > 02) > > 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 > > NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 > > NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > > > > [2] > > commit 823b23da71132b80d9f41ab667c68b112455f3b6 > > Author: Parav Pandit <parav@mellanox.com> > > Date: Wed Apr 10 11:23:03 2019 +0300 > > > > IB/core: Allow vlan link local address based RoCE GIDs > > > > IPv6 link local address for a VLAN netdevice has nothing to do with its > > resemblance with the default GID, because VLAN link local GID is in > > different layer 2 domain. > > > > Now that RoCE MAD packet processing and route resolution consider the > > right GID index, there is no need for an unnecessary check which prevents > > the addition of vlan based IPv6 link local GIDs. > > > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > > > > > > > > Best Regards, > > Yi Zhang > > > > I need some more information from you to debug this issue as I don’t have the > hw. > The highlighted patch added support for IPv6 link local address for vlan. I am > unsure how this can affect IPv4 AH creation for which there is failure. > > 1. Before you assign the IP address to the netdevice, Please do, echo -n > "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control > > Please share below output before doing nvme connect. > 2. 
Output of script [1] > $ show_gids script > If getting this script is problematic, share the output of, > > $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* > $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* > $ ip link show > $ip addr show > $ dmesg > > [1] https://community.mellanox.com/s/article/understanding-show-gids- > script#jive_content_id_The_Script > > I suspect that driver's assumption about GID indices might have gone wrong > here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. > Lets see about results to confirm that. ^ permalink raw reply [flat|nested] 18+ messages in thread
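[Editorial sketch: the fallback commands Parav lists can be stitched into a minimal show_gids substitute by pairing each index under gids/ with the same index under gid_attrs/ndevs/. Since the bnxt_re hardware is not at hand, the sketch below runs against a fake directory tree mirroring the /sys/class/infiniband/<dev>/ports/1 layout; on a real system, point $port at that path instead.]

```shell
# Build a fake sysfs layout, then print "index gid netdev" per GID slot,
# roughly what the show_gids script produces.
port=$(mktemp -d)
mkdir -p "$port/gids" "$port/gid_attrs/ndevs"
echo 'fe80:0000:0000:0000:020a:f7ff:fee3:6e32' > "$port/gids/0"
echo 'bnxt_roce' > "$port/gid_attrs/ndevs/0"
echo '0000:0000:0000:0000:0000:ffff:ac1f:28bb' > "$port/gids/6"
echo 'bnxt_roce' > "$port/gid_attrs/ndevs/6"
table=$(for f in "$port"/gids/*; do
  i=${f##*/}
  printf '%s %s %s\n' "$i" "$(cat "$f")" "$(cat "$port/gid_attrs/ndevs/$i")"
done)
echo "$table"
```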
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-11 16:18 ` Parav Pandit @ 2019-07-12 1:53 ` Yi Zhang 2019-07-12 2:49 ` Parav Pandit 0 siblings, 1 reply; 18+ messages in thread From: Yi Zhang @ 2019-07-12 1:53 UTC (permalink / raw) To: Parav Pandit, linux-nvme Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, selvin.xavier Hi Parav Here is the info, let me know if it's enough, thanks. [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control [root@rdma-perf-07 ~]$ ifdown bnxt_roce Device 'bnxt_roce' successfully disconnected. [root@rdma-perf-07 ~]$ ifup bnxt_roce Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/16) [root@rdma-perf-07 ~]$ sh a.sh DEV PORT INDEX GID IPv4 VER DEV --- ---- ----- --- ------------ --- --- bnxt_re0 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v1 bnxt_roce bnxt_re0 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v2 bnxt_roce bnxt_re0 1 10 0000:0000:0000:0000:0000:ffff:ac1f:2bbb 172.31.43.187 v1 bnxt_roce.43 bnxt_re0 1 11 0000:0000:0000:0000:0000:ffff:ac1f:2bbb 172.31.43.187 v2 bnxt_roce.43 bnxt_re0 1 2 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v1 bnxt_roce.45 bnxt_re0 1 3 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v2 bnxt_roce.45 bnxt_re0 1 4 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v1 bnxt_roce.43 bnxt_re0 1 5 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 v2 bnxt_roce.43 bnxt_re0 1 6 0000:0000:0000:0000:0000:ffff:ac1f:28bb 172.31.40.187 v1 bnxt_roce bnxt_re0 1 7 0000:0000:0000:0000:0000:ffff:ac1f:28bb 172.31.40.187 v2 bnxt_roce bnxt_re0 1 8 0000:0000:0000:0000:0000:ffff:ac1f:2dbb 172.31.45.187 v1 bnxt_roce.45 bnxt_re0 1 9 0000:0000:0000:0000:0000:ffff:ac1f:2dbb 172.31.45.187 v2 bnxt_roce.45 bnxt_re1 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 v1 lom_2 bnxt_re1 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 v2 lom_2 cxgb4_0 1 0 0007:433b:f5b0:0000:0000:0000:0000:0000 v1 cxgb4_0 2 0 0007:433b:f5b8:0000:0000:0000:0000:0000 v1 hfi1_0 1 0 
fe80:0000:0000:0000:0011:7501:0109:6c60 v1 hfi1_0 1 1 fe80:0000:0000:0000:0006:6a00:0000:0005 v1 mlx5_0 1 0 fe80:0000:0000:0000:506b:4b03:00f3:8a38 v1 n_gids_found=19 [root@rdma-perf-07 ~]$ dmesg | tail -15 [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link becomes ready [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8004: link becomes ready [ 20.010469] hfi1 0000:d8:00.0: hfi1_0: Switching to NO_DMA_RTAIL [ 20.440580] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready [ 21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2. [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes requested: 8. Max supported is 2. [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes ready [ 211.407329] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.407334] _ib_cache_gid_del: can't delete gid fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.425275] infiniband bnxt_re0: del_gid port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb On 7/12/19 12:18 AM, Parav Pandit wrote: > Sagi, > > This is better one to cc to linux-rdma. > > + Devesh, Selvin. 
> >> -----Original Message----- >> From: Parav Pandit >> Sent: Thursday, July 11, 2019 6:25 PM >> To: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org >> Cc: Daniel Jurgens <danielj@mellanox.com> >> Subject: RE: regression: nvme rdma with bnxt_re0 broken >> >> Hi Yi Zhang, >> >>> -----Original Message----- >>> From: Yi Zhang <yi.zhang@redhat.com> >>> Sent: Thursday, July 11, 2019 3:17 PM >>> To: linux-nvme@lists.infradead.org >>> Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit >>> <parav@mellanox.com> >>> Subject: regression: nvme rdma with bnxt_re0 broken >>> >>> Hello >>> >>> 'nvme connect' failed when use bnxt_re0 on latest upstream build[1], >>> by bisecting I found it was introduced from v5.2.0-rc1 with [2], it >>> works after I revert it. >>> Let me know if you need more info, thanks. >>> >>> [1] >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 >>> -n testnqn Failed to write to /dev/nvme-fabrics: Bad address >>> >>> [root@rdma-perf-07 ~]$ dmesg >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status >>> 0x5 [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ >>> 476.332525] nvme nvme2: rdma_connect failed (-14). >>> [ 476.343552] nvme nvme2: rdma connection establishment failed (-14) >>> >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme >>> BCM5720 2-port Gigabit Ethernet PCIe >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme >>> BCM5720 2-port Gigabit Ethernet PCIe >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 [Fury] >>> (rev >>> 02) >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) >>> 19:00.1 Ethernet controller: Broadcom Inc. 
and subsidiaries BCM57412 >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) >>> >>> >>> [2] >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6 >>> Author: Parav Pandit <parav@mellanox.com> >>> Date: Wed Apr 10 11:23:03 2019 +0300 >>> >>> IB/core: Allow vlan link local address based RoCE GIDs >>> >>> IPv6 link local address for a VLAN netdevice has nothing to do with its >>> resemblance with the default GID, because VLAN link local GID is in >>> different layer 2 domain. >>> >>> Now that RoCE MAD packet processing and route resolution consider the >>> right GID index, there is no need for an unnecessary check which prevents >>> the addition of vlan based IPv6 link local GIDs. >>> >>> Signed-off-by: Parav Pandit <parav@mellanox.com> >>> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> >>> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> >>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> >>> >>> >>> >>> Best Regards, >>> Yi Zhang >>> >> I need some more information from you to debug this issue as I don’t have the >> hw. >> The highlighted patch added support for IPv6 link local address for vlan. I am >> unsure how this can affect IPv4 AH creation for which there is failure. >> >> 1. Before you assign the IP address to the netdevice, Please do, echo -n >> "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control >> >> Please share below output before doing nvme connect. >> 2. Output of script [1] >> $ show_gids script >> If getting this script is problematic, share the output of, >> >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* >> $ ip link show >> $ip addr show >> $ dmesg >> >> [1] https://community.mellanox.com/s/article/understanding-show-gids- >> script#jive_content_id_The_Script >> >> I suspect that driver's assumption about GID indices might have gone wrong >> here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. >> Lets see about results to confirm that. 
> _______________________________________________ > Linux-nvme mailing list > Linux-nvme@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 18+ messages in thread
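[Editorial sketch: one thing stands out in the show_gids output above: the link-local GID fe80:...:6e32 occupies six bnxt_re0 entries (v1 and v2 for the base interface and for each of the two vlans), whereas every other GID occupies exactly two. A quick tally over the GID column (data copied from the table above) flags exactly the pattern Parav suspected:]

```shell
# Count how often each bnxt_re0 GID repeats; anything above 2 (one RoCE v1
# plus one RoCE v2 entry) means base and vlan link-local GIDs collide.
gids='fe80:0000:0000:0000:020a:f7ff:fee3:6e32
fe80:0000:0000:0000:020a:f7ff:fee3:6e32
fe80:0000:0000:0000:020a:f7ff:fee3:6e32
fe80:0000:0000:0000:020a:f7ff:fee3:6e32
fe80:0000:0000:0000:020a:f7ff:fee3:6e32
fe80:0000:0000:0000:020a:f7ff:fee3:6e32
0000:0000:0000:0000:0000:ffff:ac1f:28bb
0000:0000:0000:0000:0000:ffff:ac1f:28bb
0000:0000:0000:0000:0000:ffff:ac1f:2bbb
0000:0000:0000:0000:0000:ffff:ac1f:2bbb
0000:0000:0000:0000:0000:ffff:ac1f:2dbb
0000:0000:0000:0000:0000:ffff:ac1f:2dbb'
dups=$(printf '%s\n' "$gids" | sort | uniq -c | awk '$1 > 2 {print $2}')
echo "$dups"
```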
* RE: regression: nvme rdma with bnxt_re0 broken 2019-07-12 1:53 ` Yi Zhang @ 2019-07-12 2:49 ` Parav Pandit 2019-07-12 3:45 ` Selvin Xavier 0 siblings, 1 reply; 18+ messages in thread From: Parav Pandit @ 2019-07-12 2:49 UTC (permalink / raw) To: Yi Zhang, linux-nvme Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, selvin.xavier Hi Yi Zhang, > -----Original Message----- > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > owner@vger.kernel.org> On Behalf Of Yi Zhang > Sent: Friday, July 12, 2019 7:23 AM > To: Parav Pandit <parav@mellanox.com>; linux-nvme@lists.infradead.org > Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > Devesh Sharma <devesh.sharma@broadcom.com>; > selvin.xavier@broadcom.com > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > Hi Parav > > Here is the info, let me know if it's enough, thanks. > > [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > > /sys/kernel/debug/dynamic_debug/control > [root@rdma-perf-07 ~]$ ifdown bnxt_roce > Device 'bnxt_roce' successfully disconnected. 
> [root@rdma-perf-07 ~]$ ifup bnxt_roce > Connection successfully activated (D-Bus active path: > /org/freedesktop/NetworkManager/ActiveConnection/16) > [root@rdma-perf-07 ~]$ sh a.sh > DEV PORT INDEX GID IPv4 VER DEV > --- ---- ----- --- ------------ --- --- > bnxt_re0 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v1 bnxt_roce > bnxt_re0 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v2 bnxt_roce > bnxt_re0 1 10 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > 172.31.43.187 v1 bnxt_roce.43 > bnxt_re0 1 11 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > 172.31.43.187 v2 bnxt_roce.43 > bnxt_re0 1 2 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v1 bnxt_roce.45 > bnxt_re0 1 3 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v2 bnxt_roce.45 > bnxt_re0 1 4 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v1 bnxt_roce.43 > bnxt_re0 1 5 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > v2 bnxt_roce.43 > bnxt_re0 1 6 0000:0000:0000:0000:0000:ffff:ac1f:28bb > 172.31.40.187 v1 bnxt_roce > bnxt_re0 1 7 0000:0000:0000:0000:0000:ffff:ac1f:28bb > 172.31.40.187 v2 bnxt_roce > bnxt_re0 1 8 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > 172.31.45.187 v1 bnxt_roce.45 > bnxt_re0 1 9 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > 172.31.45.187 v2 bnxt_roce.45 > bnxt_re1 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > v1 lom_2 > bnxt_re1 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > v2 lom_2 > cxgb4_0 1 0 0007:433b:f5b0:0000:0000:0000:0000:0000 v1 > cxgb4_0 2 0 0007:433b:f5b8:0000:0000:0000:0000:0000 v1 > hfi1_0 1 0 fe80:0000:0000:0000:0011:7501:0109:6c60 v1 > hfi1_0 1 1 fe80:0000:0000:0000:0006:6a00:0000:0005 v1 > mlx5_0 1 0 fe80:0000:0000:0000:506b:4b03:00f3:8a38 v1 > n_gids_found=19 > > [root@rdma-perf-07 ~]$ dmesg | tail -15 > [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link > becomes ready [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): > mlx5_ib0.8004: link becomes ready [ 20.010469] hfi1 0000:d8:00.0: hfi1_0: > Switching to NO_DMA_RTAIL [ 20.440580] IPv6: > ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready > [ 
21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > requested: 8. Max supported is 2. > [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > requested: 8. Max supported is 2. > [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes > ready [ 211.407329] _ib_cache_gid_del: can't delete gid > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.407334] > _ib_cache_gid_del: can't delete gid > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.425275] infiniband > bnxt_re0: del_gid port=1 index=6 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 index=6 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 index=7 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 gid > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > GID table looks fine. > On 7/12/19 12:18 AM, Parav Pandit wrote: > > Sagi, > > > > This is better one to cc to linux-rdma. > > > > + Devesh, Selvin. 
> > > >> -----Original Message----- > >> From: Parav Pandit > >> Sent: Thursday, July 11, 2019 6:25 PM > >> To: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org > >> Cc: Daniel Jurgens <danielj@mellanox.com> > >> Subject: RE: regression: nvme rdma with bnxt_re0 broken > >> > >> Hi Yi Zhang, > >> > >>> -----Original Message----- > >>> From: Yi Zhang <yi.zhang@redhat.com> > >>> Sent: Thursday, July 11, 2019 3:17 PM > >>> To: linux-nvme@lists.infradead.org > >>> Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit > >>> <parav@mellanox.com> > >>> Subject: regression: nvme rdma with bnxt_re0 broken > >>> > >>> Hello > >>> > >>> 'nvme connect' failed when use bnxt_re0 on latest upstream build[1], > >>> by bisecting I found it was introduced from v5.2.0-rc1 with [2], it > >>> works after I revert it. > >>> Let me know if you need more info, thanks. > >>> > >>> [1] > >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 > >>> -n testnqn Failed to write to /dev/nvme-fabrics: Bad address > >>> > >>> [root@rdma-perf-07 ~]$ dmesg > >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status 0x5 Devesh, Selvin, What does this error mean? bnxt_qplib_create_ah() is failing. > >>> [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ > >>> 476.332525] nvme nvme2: rdma_connect failed (-14). > >>> [ 476.343552] nvme nvme2: rdma connection establishment failed > >>> (-14) > >>> > >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries > >>> NetXtreme > >>> BCM5720 2-port Gigabit Ethernet PCIe > >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries > >>> NetXtreme > >>> BCM5720 2-port Gigabit Ethernet PCIe > >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 > >>> [Fury] (rev > >>> 02) > >>> 19:00.0 Ethernet controller: Broadcom Inc. 
and subsidiaries BCM57412 > >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 > >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > >>> > >>> > >>> [2] > >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6 > >>> Author: Parav Pandit <parav@mellanox.com> > >>> Date: Wed Apr 10 11:23:03 2019 +0300 > >>> > >>> IB/core: Allow vlan link local address based RoCE GIDs > >>> > >>> IPv6 link local address for a VLAN netdevice has nothing to do with its > >>> resemblance with the default GID, because VLAN link local GID is in > >>> different layer 2 domain. > >>> > >>> Now that RoCE MAD packet processing and route resolution consider > the > >>> right GID index, there is no need for an unnecessary check which > prevents > >>> the addition of vlan based IPv6 link local GIDs. > >>> > >>> Signed-off-by: Parav Pandit <parav@mellanox.com> > >>> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > >>> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > >>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > >>> > >>> > >>> > >>> Best Regards, > >>> Yi Zhang > >>> > >> I need some more information from you to debug this issue as I don’t > >> have the hw. > >> The highlighted patch added support for IPv6 link local address for > >> vlan. I am unsure how this can affect IPv4 AH creation for which there is > failure. > >> > >> 1. Before you assign the IP address to the netdevice, Please do, echo > >> -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control > >> > >> Please share below output before doing nvme connect. > >> 2. 
Output of script [1] > >> $ show_gids script > >> If getting this script is problematic, share the output of, > >> > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* > >> $ ip link show > >> $ip addr show > >> $ dmesg > >> > >> [1] https://community.mellanox.com/s/article/understanding-show-gids- > >> script#jive_content_id_The_Script > >> > >> I suspect that driver's assumption about GID indices might have gone > >> wrong here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. > >> Lets see about results to confirm that. > > _______________________________________________ > > Linux-nvme mailing list > > Linux-nvme@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 18+ messages in thread
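[Editorial sketch of the suspected index mismatch, purely illustrative arithmetic and not the actual bnxt_re code: take kernel GID-table index 6, the first 172.31.40.187 entry in Yi Zhang's table. "naive" halves the kernel index, which is only valid if every GID occupies exactly two kernel slots (one v1, one v2); "dedup" counts distinct GID values instead. With the link-local GID filling slots 0-5, the two disagree. GIDs abbreviated.]

```shell
# Kernel indices 0-5 hold the same link-local GID (base + two vlans, v1/v2);
# indices 6-7 hold the IPv4 GID. The two mappings diverge at index 6.
kidx=6
naive=$((kidx / 2))
dedup=$(printf '%s\n' fe80::6e32 fe80::6e32 fe80::6e32 fe80::6e32 fe80::6e32 fe80::6e32 ac1f:28bb ac1f:28bb |
  awk -v k="$kidx" '!seen[$0]++ { n++ } NR - 1 == k { print n - 1; exit }')
echo "naive=$naive dedup=$dedup"
```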
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 2:49 ` Parav Pandit @ 2019-07-12 3:45 ` Selvin Xavier 2019-07-12 9:28 ` Parav Pandit 0 siblings, 1 reply; 18+ messages in thread From: Selvin Xavier @ 2019-07-12 3:45 UTC (permalink / raw) To: Parav Pandit Cc: Yi Zhang, linux-nvme, Daniel Jurgens, linux-rdma, Devesh Sharma On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@mellanox.com> wrote: > > Hi Yi Zhang, > > > -----Original Message----- > > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > > owner@vger.kernel.org> On Behalf Of Yi Zhang > > Sent: Friday, July 12, 2019 7:23 AM > > To: Parav Pandit <parav@mellanox.com>; linux-nvme@lists.infradead.org > > Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > Devesh Sharma <devesh.sharma@broadcom.com>; > > selvin.xavier@broadcom.com > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > Hi Parav > > > > Here is the info, let me know if it's enough, thanks. > > > > [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > > > /sys/kernel/debug/dynamic_debug/control > > [root@rdma-perf-07 ~]$ ifdown bnxt_roce > > Device 'bnxt_roce' successfully disconnected. 
> > [root@rdma-perf-07 ~]$ ifup bnxt_roce > > Connection successfully activated (D-Bus active path: > > /org/freedesktop/NetworkManager/ActiveConnection/16) > > [root@rdma-perf-07 ~]$ sh a.sh > > DEV PORT INDEX GID IPv4 VER DEV > > --- ---- ----- --- ------------ --- --- > > bnxt_re0 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v1 bnxt_roce > > bnxt_re0 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v2 bnxt_roce > > bnxt_re0 1 10 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > 172.31.43.187 v1 bnxt_roce.43 > > bnxt_re0 1 11 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > 172.31.43.187 v2 bnxt_roce.43 > > bnxt_re0 1 2 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v1 bnxt_roce.45 > > bnxt_re0 1 3 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v2 bnxt_roce.45 > > bnxt_re0 1 4 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v1 bnxt_roce.43 > > bnxt_re0 1 5 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > v2 bnxt_roce.43 > > bnxt_re0 1 6 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > 172.31.40.187 v1 bnxt_roce > > bnxt_re0 1 7 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > 172.31.40.187 v2 bnxt_roce > > bnxt_re0 1 8 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > 172.31.45.187 v1 bnxt_roce.45 > > bnxt_re0 1 9 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > 172.31.45.187 v2 bnxt_roce.45 > > bnxt_re1 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > v1 lom_2 > > bnxt_re1 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > v2 lom_2 > > cxgb4_0 1 0 0007:433b:f5b0:0000:0000:0000:0000:0000 v1 > > cxgb4_0 2 0 0007:433b:f5b8:0000:0000:0000:0000:0000 v1 > > hfi1_0 1 0 fe80:0000:0000:0000:0011:7501:0109:6c60 v1 > > hfi1_0 1 1 fe80:0000:0000:0000:0006:6a00:0000:0005 v1 > > mlx5_0 1 0 fe80:0000:0000:0000:506b:4b03:00f3:8a38 v1 > > n_gids_found=19 > > > > [root@rdma-perf-07 ~]$ dmesg | tail -15 > > [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link > > becomes ready [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): > > mlx5_ib0.8004: link becomes ready [ 20.010469] hfi1 0000:d8:00.0: hfi1_0: > > Switching to 
NO_DMA_RTAIL [ 20.440580] IPv6: > > ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready > > [ 21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > requested: 8. Max supported is 2. > > [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > requested: 8. Max supported is 2. > > [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link becomes > > ready [ 211.407329] _ib_cache_gid_del: can't delete gid > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.407334] > > _ib_cache_gid_del: can't delete gid > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.425275] infiniband > > bnxt_re0: del_gid port=1 index=6 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 index=6 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 index=7 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 gid > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > > GID table looks fine. > The GID table has fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry repeated 6 times. 2 for each interface bnxt_roce, bnxt_roce.43 and bnxt_roce.45. Is this expected to have same gid entries for vlan and base interfaces? As you mentioned earlier, driver's assumption that only 2 GID entries identical (one for RoCE v1 and one for RoCE v2) is breaking here. > > On 7/12/19 12:18 AM, Parav Pandit wrote: > > > Sagi, > > > > > > This is better one to cc to linux-rdma. > > > > > > + Devesh, Selvin. 
> > > > > >> -----Original Message----- > > >> From: Parav Pandit > > >> Sent: Thursday, July 11, 2019 6:25 PM > > >> To: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org > > >> Cc: Daniel Jurgens <danielj@mellanox.com> > > >> Subject: RE: regression: nvme rdma with bnxt_re0 broken > > >> > > >> Hi Yi Zhang, > > >> > > >>> -----Original Message----- > > >>> From: Yi Zhang <yi.zhang@redhat.com> > > >>> Sent: Thursday, July 11, 2019 3:17 PM > > >>> To: linux-nvme@lists.infradead.org > > >>> Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit > > >>> <parav@mellanox.com> > > >>> Subject: regression: nvme rdma with bnxt_re0 broken > > >>> > > >>> Hello > > >>> > > >>> 'nvme connect' failed when use bnxt_re0 on latest upstream build[1], > > >>> by bisecting I found it was introduced from v5.2.0-rc1 with [2], it > > >>> works after I revert it. > > >>> Let me know if you need more info, thanks. > > >>> > > >>> [1] > > >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 > > >>> -n testnqn Failed to write to /dev/nvme-fabrics: Bad address > > >>> > > >>> [root@rdma-perf-07 ~]$ dmesg > > >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 status 0x5 > > Devesh, Selvin, > > What does this error mean? > bnxt_qplib_create_ah() is failing. > We are passing a wrong index for the GID to FW because of the assumption mentioned earlier. FW is failing command due to this. > > >>> [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ > > >>> 476.332525] nvme nvme2: rdma_connect failed (-14). > > >>> [ 476.343552] nvme nvme2: rdma connection establishment failed > > >>> (-14) > > >>> > > >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > > >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries > > >>> NetXtreme > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > >>> 01:00.1 Ethernet controller: Broadcom Inc. 
and subsidiaries > > >>> NetXtreme > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 > > >>> [Fury] (rev > > >>> 02) > > >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 > > >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 > > >>> NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > >>> > > >>> > > >>> [2] > > >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6 > > >>> Author: Parav Pandit <parav@mellanox.com> > > >>> Date: Wed Apr 10 11:23:03 2019 +0300 > > >>> > > >>> IB/core: Allow vlan link local address based RoCE GIDs > > >>> > > >>> IPv6 link local address for a VLAN netdevice has nothing to do with its > > >>> resemblance with the default GID, because VLAN link local GID is in > > >>> different layer 2 domain. > > >>> > > >>> Now that RoCE MAD packet processing and route resolution consider > > the > > >>> right GID index, there is no need for an unnecessary check which > > prevents > > >>> the addition of vlan based IPv6 link local GIDs. > > >>> > > >>> Signed-off-by: Parav Pandit <parav@mellanox.com> > > >>> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > > >>> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > > >>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > > >>> > > >>> > > >>> > > >>> Best Regards, > > >>> Yi Zhang > > >>> > > >> I need some more information from you to debug this issue as I don’t > > >> have the hw. > > >> The highlighted patch added support for IPv6 link local address for > > >> vlan. I am unsure how this can affect IPv4 AH creation for which there is > > failure. > > >> > > >> 1. Before you assign the IP address to the netdevice, Please do, echo > > >> -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control > > >> > > >> Please share below output before doing nvme connect. > > >> 2. 
Output of script [1] > > >> $ show_gids script > > >> If getting this script is problematic, share the output of, > > >> > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* > > >> $ ip link show > > >> $ip addr show > > >> $ dmesg > > >> > > >> [1] https://community.mellanox.com/s/article/understanding-show-gids- > > >> script#jive_content_id_The_Script > > >> > > >> I suspect that driver's assumption about GID indices might have gone > > >> wrong here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. > > >> Lets see about results to confirm that. > > > _______________________________________________ > > > Linux-nvme mailing list > > > Linux-nvme@lists.infradead.org > > > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: regression: nvme rdma with bnxt_re0 broken 2019-07-12 3:45 ` Selvin Xavier @ 2019-07-12 9:28 ` Parav Pandit 2019-07-12 9:39 ` Parav Pandit 0 siblings, 1 reply; 18+ messages in thread From: Parav Pandit @ 2019-07-12 9:28 UTC (permalink / raw) To: Selvin Xavier Cc: Yi Zhang, linux-nvme, Daniel Jurgens, linux-rdma, Devesh Sharma Hi Selvin, > -----Original Message----- > From: Selvin Xavier <selvin.xavier@broadcom.com> > Sent: Friday, July 12, 2019 9:16 AM > To: Parav Pandit <parav@mellanox.com> > Cc: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org; Daniel > Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; Devesh > Sharma <devesh.sharma@broadcom.com> > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@mellanox.com> wrote: > > > > Hi Yi Zhang, > > > > > -----Original Message----- > > > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > > > owner@vger.kernel.org> On Behalf Of Yi Zhang > > > Sent: Friday, July 12, 2019 7:23 AM > > > To: Parav Pandit <parav@mellanox.com>; > > > linux-nvme@lists.infradead.org > > > Cc: Daniel Jurgens <danielj@mellanox.com>; > > > linux-rdma@vger.kernel.org; Devesh Sharma > > > <devesh.sharma@broadcom.com>; selvin.xavier@broadcom.com > > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > > > Hi Parav > > > > > > Here is the info, let me know if it's enough, thanks. > > > > > > [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > > > > /sys/kernel/debug/dynamic_debug/control > > > [root@rdma-perf-07 ~]$ ifdown bnxt_roce Device 'bnxt_roce' > > > successfully disconnected. 
> > > [root@rdma-perf-07 ~]$ ifup bnxt_roce Connection successfully > > > activated (D-Bus active path: > > > /org/freedesktop/NetworkManager/ActiveConnection/16) > > > [root@rdma-perf-07 ~]$ sh a.sh > > > DEV PORT INDEX GID IPv4 VER DEV > > > --- ---- ----- --- ------------ --- --- > > > bnxt_re0 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v1 bnxt_roce > > > bnxt_re0 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v2 bnxt_roce > > > bnxt_re0 1 10 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > > 172.31.43.187 v1 bnxt_roce.43 > > > bnxt_re0 1 11 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > > 172.31.43.187 v2 bnxt_roce.43 > > > bnxt_re0 1 2 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v1 bnxt_roce.45 > > > bnxt_re0 1 3 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v2 bnxt_roce.45 > > > bnxt_re0 1 4 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v1 bnxt_roce.43 > > > bnxt_re0 1 5 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > v2 bnxt_roce.43 > > > bnxt_re0 1 6 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > 172.31.40.187 v1 bnxt_roce > > > bnxt_re0 1 7 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > 172.31.40.187 v2 bnxt_roce > > > bnxt_re0 1 8 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > > 172.31.45.187 v1 bnxt_roce.45 > > > bnxt_re0 1 9 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > > 172.31.45.187 v2 bnxt_roce.45 > > > bnxt_re1 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > > v1 lom_2 > > > bnxt_re1 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > > v2 lom_2 > > > cxgb4_0 1 0 0007:433b:f5b0:0000:0000:0000:0000:0000 v1 > > > cxgb4_0 2 0 0007:433b:f5b8:0000:0000:0000:0000:0000 v1 > > > hfi1_0 1 0 fe80:0000:0000:0000:0011:7501:0109:6c60 v1 > > > hfi1_0 1 1 fe80:0000:0000:0000:0006:6a00:0000:0005 v1 > > > mlx5_0 1 0 fe80:0000:0000:0000:506b:4b03:00f3:8a38 v1 > > > n_gids_found=19 > > > > > > [root@rdma-perf-07 ~]$ dmesg | tail -15 > > > [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link > > > becomes ready [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): > > > 
mlx5_ib0.8004: link becomes ready [ 20.010469] hfi1 0000:d8:00.0: hfi1_0: > > > Switching to NO_DMA_RTAIL [ 20.440580] IPv6: > > > ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready > > > [ 21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > > requested: 8. Max supported is 2. > > > [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > > requested: 8. Max supported is 2. > > > [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link > becomes > > > ready [ 211.407329] _ib_cache_gid_del: can't delete gid > > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.407334] > > > _ib_cache_gid_del: can't delete gid > > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.425275] > > > infiniband > > > bnxt_re0: del_gid port=1 index=6 gid > > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 > > > index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid > > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 > > > index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 > > > gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 > > > gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > > > > > GID table looks fine. > > > The GID table has fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry repeated 6 > times. 2 for each interface bnxt_roce, bnxt_roce.43 and bnxt_roce.45. Is this > expected to have same gid entries for vlan and base interfaces? As you > mentioned earlier, driver's assumption that only 2 GID entries identical (one for > RoCE v1 and one for RoCE > v2) is breaking here. > Yes, this is correct behavior. Each vlan netdev interface is in different L2 segment. 
Vlan netdev has this ipv6 link local address. Hence, it is added to the GID table.
A given GID table entry is identified uniquely by GID+ndev+type(v1/v2).

Reviewing bnxt_qplib_add_sgid(), it does the comparison below:

if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) {

This comparison looks incomplete: it ignores netdev and type. But even with that, I would expect GID entry addition to fail for vlans for ipv6 link local entries.

But I am puzzled now: with the above memcmp() check, how do both v1 and v2 entries get added in your table, and for vlans too? I expect add_gid() and core/cache.c add_roce_gid() to fail for the duplicate entry. But the GID table that Yi Zhang dumped has these vlan entries. I am missing something.

Yi Zhang,
Instead of the last 15 lines of dmesg, can you please share the whole dmesg log, with debugging enabled before creating the vlans, using:
echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control

Selvin,
Additionally, the driver shouldn't be looking for duplicate entries; the core already does it. You might only want to do it for the v1/v2 case, as the bnxt driver has some dependency on it. Can you please fix this part?

> > > > On 7/12/19 12:18 AM, Parav Pandit wrote:
> > > > > Sagi,
> > > > >
> > > > > This is better one to cc to linux-rdma.
> > > > >
> > > > > + Devesh, Selvin.
> > > > > > > >> -----Original Message----- > > > >> From: Parav Pandit > > > >> Sent: Thursday, July 11, 2019 6:25 PM > > > >> To: Yi Zhang <yi.zhang@redhat.com>; > > > >> linux-nvme@lists.infradead.org > > > >> Cc: Daniel Jurgens <danielj@mellanox.com> > > > >> Subject: RE: regression: nvme rdma with bnxt_re0 broken > > > >> > > > >> Hi Yi Zhang, > > > >> > > > >>> -----Original Message----- > > > >>> From: Yi Zhang <yi.zhang@redhat.com> > > > >>> Sent: Thursday, July 11, 2019 3:17 PM > > > >>> To: linux-nvme@lists.infradead.org > > > >>> Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit > > > >>> <parav@mellanox.com> > > > >>> Subject: regression: nvme rdma with bnxt_re0 broken > > > >>> > > > >>> Hello > > > >>> > > > >>> 'nvme connect' failed when use bnxt_re0 on latest upstream > > > >>> build[1], by bisecting I found it was introduced from v5.2.0-rc1 > > > >>> with [2], it works after I revert it. > > > >>> Let me know if you need more info, thanks. > > > >>> > > > >>> [1] > > > >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s > > > >>> 4420 -n testnqn Failed to write to /dev/nvme-fabrics: Bad > > > >>> address > > > >>> > > > >>> [root@rdma-perf-07 ~]$ dmesg > > > >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 > > > >>> status 0x5 > > > > Devesh, Selvin, > > > > What does this error mean? > > bnxt_qplib_create_ah() is failing. > > > We are passing a wrong index for the GID to FW because of the assumption > mentioned earlier. > FW is failing command due to this. > > > > >>> [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ > > > >>> 476.332525] nvme nvme2: rdma_connect failed (-14). > > > >>> [ 476.343552] nvme nvme2: rdma connection establishment failed > > > >>> (-14) > > > >>> > > > >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > > > >>> 01:00.0 Ethernet controller: Broadcom Inc. 
and subsidiaries > > > >>> NetXtreme > > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > > >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries > > > >>> NetXtreme > > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > > >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3008 > > > >>> [Fury] (rev > > > >>> 02) > > > >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries > > > >>> BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries > > > >>> BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > >>> > > > >>> > > > >>> [2] > > > >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6 > > > >>> Author: Parav Pandit <parav@mellanox.com> > > > >>> Date: Wed Apr 10 11:23:03 2019 +0300 > > > >>> > > > >>> IB/core: Allow vlan link local address based RoCE GIDs > > > >>> > > > >>> IPv6 link local address for a VLAN netdevice has nothing to do with > its > > > >>> resemblance with the default GID, because VLAN link local GID is in > > > >>> different layer 2 domain. > > > >>> > > > >>> Now that RoCE MAD packet processing and route resolution > > > >>> consider > > > the > > > >>> right GID index, there is no need for an unnecessary check > > > >>> which > > > prevents > > > >>> the addition of vlan based IPv6 link local GIDs. > > > >>> > > > >>> Signed-off-by: Parav Pandit <parav@mellanox.com> > > > >>> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > > > >>> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > > > >>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > > > >>> > > > >>> > > > >>> > > > >>> Best Regards, > > > >>> Yi Zhang > > > >>> > > > >> I need some more information from you to debug this issue as I > > > >> don’t have the hw. > > > >> The highlighted patch added support for IPv6 link local address > > > >> for vlan. I am unsure how this can affect IPv4 AH creation for > > > >> which there is > > > failure. > > > >> > > > >> 1. 
Before you assign the IP address to the netdevice, Please do, > > > >> echo -n "module ib_core +p" > > > > >> /sys/kernel/debug/dynamic_debug/control > > > >> > > > >> Please share below output before doing nvme connect. > > > >> 2. Output of script [1] > > > >> $ show_gids script > > > >> If getting this script is problematic, share the output of, > > > >> > > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* > > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* > > > >> $ ip link show > > > >> $ip addr show > > > >> $ dmesg > > > >> > > > >> [1] > > > >> https://community.mellanox.com/s/article/understanding-show-gids- > > > >> script#jive_content_id_The_Script > > > >> > > > >> I suspect that driver's assumption about GID indices might have > > > >> gone wrong here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. > > > >> Lets see about results to confirm that. > > > > _______________________________________________ > > > > Linux-nvme mailing list > > > > Linux-nvme@lists.infradead.org > > > > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: regression: nvme rdma with bnxt_re0 broken 2019-07-12 9:28 ` Parav Pandit @ 2019-07-12 9:39 ` Parav Pandit 2019-07-12 9:49 ` Parav Pandit 0 siblings, 1 reply; 18+ messages in thread From: Parav Pandit @ 2019-07-12 9:39 UTC (permalink / raw) To: Parav Pandit, Selvin Xavier Cc: Yi Zhang, linux-nvme, Daniel Jurgens, linux-rdma, Devesh Sharma > -----Original Message----- > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > owner@vger.kernel.org> On Behalf Of Parav Pandit > Sent: Friday, July 12, 2019 2:58 PM > To: Selvin Xavier <selvin.xavier@broadcom.com> > Cc: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org; Daniel > Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; Devesh > Sharma <devesh.sharma@broadcom.com> > Subject: RE: regression: nvme rdma with bnxt_re0 broken > > Hi Selvin, > > > -----Original Message----- > > From: Selvin Xavier <selvin.xavier@broadcom.com> > > Sent: Friday, July 12, 2019 9:16 AM > > To: Parav Pandit <parav@mellanox.com> > > Cc: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org; > > Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > Devesh Sharma <devesh.sharma@broadcom.com> > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@mellanox.com> wrote: > > > > > > Hi Yi Zhang, > > > > > > > -----Original Message----- > > > > From: linux-rdma-owner@vger.kernel.org <linux-rdma- > > > > owner@vger.kernel.org> On Behalf Of Yi Zhang > > > > Sent: Friday, July 12, 2019 7:23 AM > > > > To: Parav Pandit <parav@mellanox.com>; > > > > linux-nvme@lists.infradead.org > > > > Cc: Daniel Jurgens <danielj@mellanox.com>; > > > > linux-rdma@vger.kernel.org; Devesh Sharma > > > > <devesh.sharma@broadcom.com>; selvin.xavier@broadcom.com > > > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > > > > > Hi Parav > > > > > > > > Here is the info, let me know if it's enough, thanks. 
> > > > > > > > [root@rdma-perf-07 ~]$ echo -n "module ib_core +p" > > > > > /sys/kernel/debug/dynamic_debug/control > > > > [root@rdma-perf-07 ~]$ ifdown bnxt_roce Device 'bnxt_roce' > > > > successfully disconnected. > > > > [root@rdma-perf-07 ~]$ ifup bnxt_roce Connection successfully > > > > activated (D-Bus active path: > > > > /org/freedesktop/NetworkManager/ActiveConnection/16) > > > > [root@rdma-perf-07 ~]$ sh a.sh > > > > DEV PORT INDEX GID IPv4 VER DEV > > > > --- ---- ----- --- ------------ --- --- > > > > bnxt_re0 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v1 bnxt_roce > > > > bnxt_re0 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v2 bnxt_roce > > > > bnxt_re0 1 10 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > > > 172.31.43.187 v1 bnxt_roce.43 > > > > bnxt_re0 1 11 0000:0000:0000:0000:0000:ffff:ac1f:2bbb > > > > 172.31.43.187 v2 bnxt_roce.43 > > > > bnxt_re0 1 2 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v1 bnxt_roce.45 > > > > bnxt_re0 1 3 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v2 bnxt_roce.45 > > > > bnxt_re0 1 4 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v1 bnxt_roce.43 > > > > bnxt_re0 1 5 fe80:0000:0000:0000:020a:f7ff:fee3:6e32 > > > > v2 bnxt_roce.43 > > > > bnxt_re0 1 6 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > 172.31.40.187 v1 bnxt_roce > > > > bnxt_re0 1 7 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > 172.31.40.187 v2 bnxt_roce > > > > bnxt_re0 1 8 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > > > 172.31.45.187 v1 bnxt_roce.45 > > > > bnxt_re0 1 9 0000:0000:0000:0000:0000:ffff:ac1f:2dbb > > > > 172.31.45.187 v2 bnxt_roce.45 > > > > bnxt_re1 1 0 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > > > v1 lom_2 > > > > bnxt_re1 1 1 fe80:0000:0000:0000:020a:f7ff:fee3:6e33 > > > > v2 lom_2 > > > > cxgb4_0 1 0 0007:433b:f5b0:0000:0000:0000:0000:0000 v1 > > > > cxgb4_0 2 0 0007:433b:f5b8:0000:0000:0000:0000:0000 v1 > > > > hfi1_0 1 0 fe80:0000:0000:0000:0011:7501:0109:6c60 v1 > > > > hfi1_0 1 1 
fe80:0000:0000:0000:0006:6a00:0000:0005 v1 > > > > mlx5_0 1 0 fe80:0000:0000:0000:506b:4b03:00f3:8a38 v1 > > > > n_gids_found=19 > > > > > > > > [root@rdma-perf-07 ~]$ dmesg | tail -15 > > > > [ 19.744421] IPv6: ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8002: link > > > > becomes ready [ 19.758371] IPv6: ADDRCONF(NETDEV_CHANGE): > > > > mlx5_ib0.8004: link becomes ready [ 20.010469] hfi1 0000:d8:00.0: > hfi1_0: > > > > Switching to NO_DMA_RTAIL [ 20.440580] IPv6: > > > > ADDRCONF(NETDEV_CHANGE): mlx5_ib0.8006: link becomes ready > > > > [ 21.098510] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > > > requested: 8. Max supported is 2. > > > > [ 21.324341] bnxt_en 0000:19:00.0 bnxt_roce: Too many traffic classes > > > > requested: 8. Max supported is 2. > > > > [ 22.058647] IPv6: ADDRCONF(NETDEV_CHANGE): hfi1_opa0: link > > becomes > > > > ready [ 211.407329] _ib_cache_gid_del: can't delete gid > > > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.407334] > > > > _ib_cache_gid_del: can't delete gid > > > > fe80:0000:0000:0000:020a:f7ff:fee3:6e32 error=-22 [ 211.425275] > > > > infiniband > > > > bnxt_re0: del_gid port=1 index=6 gid > > > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > [ 211.425280] infiniband bnxt_re0: free_gid_entry_locked port=1 > > > > index=6 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > [ 211.425292] infiniband bnxt_re0: del_gid port=1 index=7 gid > > > > 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > [ 211.425461] infiniband bnxt_re0: free_gid_entry_locked port=1 > > > > index=7 gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > [ 225.474061] infiniband bnxt_re0: store_gid_entry port=1 index=6 > > > > gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > [ 225.474075] infiniband bnxt_re0: store_gid_entry port=1 index=7 > > > > gid 0000:0000:0000:0000:0000:ffff:ac1f:28bb > > > > > > > > > > > GID table looks fine. > > > > > The GID table has fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry > > repeated 6 times. 
2 for each interface bnxt_roce, bnxt_roce.43 and > > bnxt_roce.45. Is this expected to have same gid entries for vlan and > > base interfaces? As you mentioned earlier, driver's assumption that > > only 2 GID entries identical (one for RoCE v1 and one for RoCE > > v2) is breaking here. > > > Yes, this is correct behavior. Each vlan netdev interface is in different L2 > segment. > Vlan netdev has this ipv6 link local address. Hence, it is added to the GID table. > A given GID table entry is identified uniquely by GID+ndev+type(v1/v2). > > Reviewing bnxt_qplib_add_sgid() does the comparison below. > if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { > > This comparison looks incomplete which ignore netdev and type. > But even with that, I would expect GID entry addition failure for vlans for ipv6 > link local entries. > > But I am puzzled now, that , with above memcmp() check, how does both v1 > and v2 entries get added in your table and for vlans too? > I expect add_gid() and core/cache.c add_roce_gid () to fail for the duplicate > entry. > But GID table that Yi Zhang dumped has these vlan entries. > I am missing something. > Ah, found it. bnxt_re_add_gid() checks for -EALREADY and returns 0 to add_gid() callback. Ok. so you just need to extend bnxt_qplib_add_sgid() for considering vlan too. Let me see if I can share a patch in few minutes. > Yi Zhang, > Instead of last 15 lines of dmesg, can you please share the whole dmsg log > which should be enabled before creating vlans. > using > echo -n "module ib_core +p" /sys/kernel/debug/dynamic_debug/control > > Selvin, > Additionally, driver shouldn't be looking at the duplicate entries. core already > does it. > You might only want to do for v1/v2 case as bnxt driver has some dependency > with it. > Can you please fix this part? > > > > > On 7/12/19 12:18 AM, Parav Pandit wrote: > > > > > Sagi, > > > > > > > > > > This is better one to cc to linux-rdma. > > > > > > > > > > + Devesh, Selvin. 
> > > > > > > > > >> -----Original Message----- > > > > >> From: Parav Pandit > > > > >> Sent: Thursday, July 11, 2019 6:25 PM > > > > >> To: Yi Zhang <yi.zhang@redhat.com>; > > > > >> linux-nvme@lists.infradead.org > > > > >> Cc: Daniel Jurgens <danielj@mellanox.com> > > > > >> Subject: RE: regression: nvme rdma with bnxt_re0 broken > > > > >> > > > > >> Hi Yi Zhang, > > > > >> > > > > >>> -----Original Message----- > > > > >>> From: Yi Zhang <yi.zhang@redhat.com> > > > > >>> Sent: Thursday, July 11, 2019 3:17 PM > > > > >>> To: linux-nvme@lists.infradead.org > > > > >>> Cc: Daniel Jurgens <danielj@mellanox.com>; Parav Pandit > > > > >>> <parav@mellanox.com> > > > > >>> Subject: regression: nvme rdma with bnxt_re0 broken > > > > >>> > > > > >>> Hello > > > > >>> > > > > >>> 'nvme connect' failed when use bnxt_re0 on latest upstream > > > > >>> build[1], by bisecting I found it was introduced from > > > > >>> v5.2.0-rc1 with [2], it works after I revert it. > > > > >>> Let me know if you need more info, thanks. > > > > >>> > > > > >>> [1] > > > > >>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 > > > > >>> -s > > > > >>> 4420 -n testnqn Failed to write to /dev/nvme-fabrics: Bad > > > > >>> address > > > > >>> > > > > >>> [root@rdma-perf-07 ~]$ dmesg > > > > >>> [ 476.320742] bnxt_en 0000:19:00.0: QPLIB: cmdq[0x4b9]=0x15 > > > > >>> status 0x5 > > > > > > Devesh, Selvin, > > > > > > What does this error mean? > > > bnxt_qplib_create_ah() is failing. > > > > > We are passing a wrong index for the GID to FW because of the > > assumption mentioned earlier. > > FW is failing command due to this. > > > > > > >>> [ 476.327103] infiniband bnxt_re0: Failed to allocate HW AH [ > > > > >>> 476.332525] nvme nvme2: rdma_connect failed (-14). 
> > > > >>> [ 476.343552] nvme nvme2: rdma connection establishment > > > > >>> failed > > > > >>> (-14) > > > > >>> > > > > >>> [root@rdma-perf-07 ~]$ lspci | grep -i Broadcom > > > > >>> 01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries > > > > >>> NetXtreme > > > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > > > >>> 01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries > > > > >>> NetXtreme > > > > >>> BCM5720 2-port Gigabit Ethernet PCIe > > > > >>> 18:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 > > > > >>> 3008 [Fury] (rev > > > > >>> 02) > > > > >>> 19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries > > > > >>> BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > > >>> 19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries > > > > >>> BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01) > > > > >>> > > > > >>> > > > > >>> [2] > > > > >>> commit 823b23da71132b80d9f41ab667c68b112455f3b6 > > > > >>> Author: Parav Pandit <parav@mellanox.com> > > > > >>> Date: Wed Apr 10 11:23:03 2019 +0300 > > > > >>> > > > > >>> IB/core: Allow vlan link local address based RoCE GIDs > > > > >>> > > > > >>> IPv6 link local address for a VLAN netdevice has nothing > > > > >>> to do with > > its > > > > >>> resemblance with the default GID, because VLAN link local GID is > in > > > > >>> different layer 2 domain. > > > > >>> > > > > >>> Now that RoCE MAD packet processing and route resolution > > > > >>> consider > > > > the > > > > >>> right GID index, there is no need for an unnecessary > > > > >>> check which > > > > prevents > > > > >>> the addition of vlan based IPv6 link local GIDs. 
> > > > >>> > > > > >>> Signed-off-by: Parav Pandit <parav@mellanox.com> > > > > >>> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> > > > > >>> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> > > > > >>> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> > > > > >>> > > > > >>> > > > > >>> > > > > >>> Best Regards, > > > > >>> Yi Zhang > > > > >>> > > > > >> I need some more information from you to debug this issue as I > > > > >> don’t have the hw. > > > > >> The highlighted patch added support for IPv6 link local address > > > > >> for vlan. I am unsure how this can affect IPv4 AH creation for > > > > >> which there is > > > > failure. > > > > >> > > > > >> 1. Before you assign the IP address to the netdevice, Please > > > > >> do, echo -n "module ib_core +p" > > > > > >> /sys/kernel/debug/dynamic_debug/control > > > > >> > > > > >> Please share below output before doing nvme connect. > > > > >> 2. Output of script [1] > > > > >> $ show_gids script > > > > >> If getting this script is problematic, share the output of, > > > > >> > > > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gids/* > > > > >> $ cat /sys/class/infiniband/bnxt_re0/ports/1/gid_attrs/ndevs/* > > > > >> $ ip link show > > > > >> $ip addr show > > > > >> $ dmesg > > > > >> > > > > >> [1] > > > > >> https://community.mellanox.com/s/article/understanding-show-gid > > > > >> s- script#jive_content_id_The_Script > > > > >> > > > > >> I suspect that driver's assumption about GID indices might have > > > > >> gone wrong here in drivers/infiniband/hw/bnxt_re/ib_verbs.c. > > > > >> Lets see about results to confirm that. > > > > > _______________________________________________ > > > > > Linux-nvme mailing list > > > > > Linux-nvme@lists.infradead.org > > > > > http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: regression: nvme rdma with bnxt_re0 broken 2019-07-12 9:39 ` Parav Pandit @ 2019-07-12 9:49 ` Parav Pandit 2019-07-12 11:41 ` Yi Zhang 0 siblings, 1 reply; 18+ messages in thread From: Parav Pandit @ 2019-07-12 9:49 UTC (permalink / raw) To: Selvin Xavier Cc: Yi Zhang, linux-nvme, Daniel Jurgens, linux-rdma, Devesh Sharma > > Hi Selvin, > > > > > -----Original Message----- > > > From: Selvin Xavier <selvin.xavier@broadcom.com> > > > Sent: Friday, July 12, 2019 9:16 AM > > > To: Parav Pandit <parav@mellanox.com> > > > Cc: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org; > > > Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > > Devesh Sharma <devesh.sharma@broadcom.com> > > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > > > On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@mellanox.com> > wrote: > > > > > > > > GID table looks fine. > > > > > > > The GID table has fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry > > > repeated 6 times. 2 for each interface bnxt_roce, bnxt_roce.43 and > > > bnxt_roce.45. Is this expected to have same gid entries for vlan and > > > base interfaces? As you mentioned earlier, driver's assumption that > > > only 2 GID entries identical (one for RoCE v1 and one for RoCE > > > v2) is breaking here. > > > > > Yes, this is correct behavior. Each vlan netdev interface is in > > different L2 segment. > > Vlan netdev has this ipv6 link local address. Hence, it is added to the GID > table. > > A given GID table entry is identified uniquely by GID+ndev+type(v1/v2). > > > > Reviewing bnxt_qplib_add_sgid() does the comparison below. > > if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { > > > > This comparison looks incomplete which ignore netdev and type. > > But even with that, I would expect GID entry addition failure for > > vlans for ipv6 link local entries. 
> > But I am puzzled now: with the above memcmp() check, how do both
> > v1 and v2 entries get added in your table, and for vlans too?
> > I expect add_gid() and core/cache.c add_roce_gid() to fail for the
> > duplicate entry.
> > But the GID table that Yi Zhang dumped has these vlan entries.
> > I am missing something.
> >
> Ah, found it.
> bnxt_re_add_gid() checks for -EALREADY and returns 0 to the add_gid() callback.
> OK, so you just need to extend bnxt_qplib_add_sgid() to consider vlan too.
> Let me see if I can share a patch in a few minutes.
>
> > Yi Zhang,
> > Instead of the last 15 lines of dmesg, can you please share the whole dmesg
> > log, with debugging enabled before creating the vlans, using:
> > echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
> >
> > Selvin,
> > Additionally, the driver shouldn't be looking for duplicate entries;
> > the core already does it.
> > You might only want to do it for the v1/v2 case, as the bnxt driver has
> > some dependency on it.
> > Can you please fix this part?
> >

How about the fix below?

From f3f17008d34b5a0c38c190010281a3030a8e5771 Mon Sep 17 00:00:00 2001
From: Parav Pandit <parav@mellanox.com>
Date: Fri, 12 Jul 2019 04:34:52 -0500
Subject: [PATCH] IB/bnxt_re: Honor vlan_id in GID entry comparison

A GID entry consists of GID, vlan, netdev and smac.
Extend the GID duplicate check comparison to consider vlan_id as well,
to support IPv6 VLAN based link local addresses.
Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs")
Change-Id: I2e026ec8065c8425ba24fad8525323d112a2f4e4
Signed-off-by: Parav Pandit <parav@mellanox.com>
---
 drivers/infiniband/hw/bnxt_re/qplib_sp.c | 4 +++-
 drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index 48793d3512ac..8567b7367669 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -296,7 +296,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	}
 	free_idx = sgid_tbl->max;
 	for (i = 0; i < sgid_tbl->max; i++) {
-		if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) {
+		if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) &&
+		    sgid_tbl->tbl[i].vlan_id == vlan_id) {
 			dev_dbg(&res->pdev->dev,
 				"SGID entry already exist in entry %d!\n", i);
 			*index = i;
@@ -351,6 +352,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	}
 	/* Add GID to the sgid_tbl */
 	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
+	sgid_tbl->tbl[free_idx].vlan_id = vlan_id;
 	sgid_tbl->active++;
 	if (vlan_id != 0xFFFF)
 		sgid_tbl->vlan[free_idx] = 1;
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index 0ec3b12b0bcd..7a1957c9dc6f 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -82,6 +82,7 @@ struct bnxt_qplib_pd {
 
 struct bnxt_qplib_gid {
 	u8 data[16];
+	u16 vlan_id;
 };
 
 struct bnxt_qplib_ah {
-- 
2.19.2

^ permalink raw reply related	[flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken
  2019-07-12  9:49       ` Parav Pandit
@ 2019-07-12 11:41         ` Yi Zhang
  2019-07-12 12:52           ` Parav Pandit
  0 siblings, 1 reply; 18+ messages in thread
From: Yi Zhang @ 2019-07-12 11:41 UTC (permalink / raw)
To: Parav Pandit, Selvin Xavier
Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme

Hi Parav

The nvme connect still failed [1]; I've pasted the full dmesg log at [2], please check it.

[1]
[root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn
Failed to write to /dev/nvme-fabrics: Connection reset by peer

[2]
https://pastebin.com/ipvak0Sp

Thanks
Yi

On 7/12/19 5:49 PM, Parav Pandit wrote:
>>> Hi Selvin,
>>>
>>>> -----Original Message-----
>>>> From: Selvin Xavier <selvin.xavier@broadcom.com>
>>>> Sent: Friday, July 12, 2019 9:16 AM
>>>> To: Parav Pandit <parav@mellanox.com>
>>>> Cc: Yi Zhang <yi.zhang@redhat.com>; linux-nvme@lists.infradead.org;
>>>> Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org;
>>>> Devesh Sharma <devesh.sharma@broadcom.com>
>>>> Subject: Re: regression: nvme rdma with bnxt_re0 broken
>>>>
>>>> On Fri, Jul 12, 2019 at 8:19 AM Parav Pandit <parav@mellanox.com>
>> wrote:
>>>>> GID table looks fine.
>>>>>
>>>> The GID table has fe80:0000:0000:0000:020a:f7ff:fee3:6e32 entry
>>>> repeated 6 times. 2 for each interface bnxt_roce, bnxt_roce.43 and
>>>> bnxt_roce.45. Is this expected to have same gid entries for vlan and
>>>> base interfaces? As you mentioned earlier, driver's assumption that
>>>> only 2 GID entries identical (one for RoCE v1 and one for RoCE
>>>> v2) is breaking here.
>>>>
>>> Yes, this is correct behavior. Each vlan netdev interface is in
>>> different L2 segment.
>>> Vlan netdev has this ipv6 link local address. Hence, it is added to the GID
>> table.
>>> A given GID table entry is identified uniquely by GID+ndev+type(v1/v2).
>>>
>>> Reviewing bnxt_qplib_add_sgid() does the comparison below.
>>> if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) {
>>>
>>> This comparison looks incomplete; it ignores netdev and type.
>>> But even with that, I would expect GID entry addition failure for
>>> vlans for ipv6 link local entries.
>>>
>>> But I am puzzled now that, with the above memcmp() check, how do both
>>> v1 and v2 entries get added in your table, and for vlans too?
>>> I expect add_gid() and core/cache.c add_roce_gid() to fail for the
>>> duplicate entry.
>>> But the GID table that Yi Zhang dumped has these vlan entries.
>>> I am missing something.
>>>
>> Ah, found it.
>> bnxt_re_add_gid() checks for -EALREADY and returns 0 to the add_gid() callback.
>> Ok, so you just need to extend bnxt_qplib_add_sgid() to consider the vlan too.
>> Let me see if I can share a patch in a few minutes.
>>
>>> Yi Zhang,
>>> Instead of the last 15 lines of dmesg, can you please share the whole dmesg
>>> log? Dynamic debug should be enabled before creating the vlans, using:
>>> echo -n "module ib_core +p" > /sys/kernel/debug/dynamic_debug/control
>>>
>>> Selvin,
>>> Additionally, the driver shouldn't be looking at duplicate entries;
>>> the core already does it.
>>> You might only want to do it for the v1/v2 case, as the bnxt driver has some
>>> dependency on it.
>>> Can you please fix this part?
>>>
> How about the below fix?
>
> From f3f17008d34b5a0c38c190010281a3030a8e5771 Mon Sep 17 00:00:00 2001
> From: Parav Pandit <parav@mellanox.com>
> Date: Fri, 12 Jul 2019 04:34:52 -0500
> Subject: [PATCH] IB/bnxt_re: Honor vlan_id in GID entry comparison
>
> A GID entry consists of GID, vlan, netdev and smac.
> Extend the GID duplicate check comparison to consider the vlan_id as well,
> to support IPv6 VLAN based link local addresses.
> > Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") > Change-Id: I2e026ec8065c8425ba24fad8525323d112a2f4e4 > Signed-off-by: Parav Pandit <parav@mellanox.com> > --- > drivers/infiniband/hw/bnxt_re/qplib_sp.c | 4 +++- > drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 + > 2 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > index 48793d3512ac..8567b7367669 100644 > --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c > +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > @@ -296,7 +296,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > } > free_idx = sgid_tbl->max; > for (i = 0; i < sgid_tbl->max; i++) { > - if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { > + if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) && > + sgid_tbl->tbl[i].vlan_id == vlan_id) { > dev_dbg(&res->pdev->dev, > "SGID entry already exist in entry %d!\n", i); > *index = i; > @@ -351,6 +352,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > } > /* Add GID to the sgid_tbl */ > memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid)); > + sgid_tbl->tbl[free_idx].vlan_id = vlan_id; > sgid_tbl->active++; > if (vlan_id != 0xFFFF) > sgid_tbl->vlan[free_idx] = 1; > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h > index 0ec3b12b0bcd..7a1957c9dc6f 100644 > --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h > +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h > @@ -82,6 +82,7 @@ struct bnxt_qplib_pd { > > struct bnxt_qplib_gid { > u8 data[16]; > + u16 vlan_id; > }; > > struct bnxt_qplib_ah { ^ permalink raw reply [flat|nested] 18+ messages in thread
* RE: regression: nvme rdma with bnxt_re0 broken
  2019-07-12 11:41         ` Yi Zhang
@ 2019-07-12 12:52           ` Parav Pandit
  2019-07-12 15:40             ` Jason Gunthorpe
  2019-07-12 16:18             ` Selvin Xavier
  0 siblings, 2 replies; 18+ messages in thread
From: Parav Pandit @ 2019-07-12 12:52 UTC (permalink / raw)
  To: Yi Zhang, Selvin Xavier
  Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme

> -----Original Message-----
> From: Yi Zhang <yi.zhang@redhat.com>
> Sent: Friday, July 12, 2019 5:11 PM
> To: Parav Pandit <parav@mellanox.com>; Selvin Xavier
> <selvin.xavier@broadcom.com>
> Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org;
> Devesh Sharma <devesh.sharma@broadcom.com>; linux-
> nvme@lists.infradead.org
> Subject: Re: regression: nvme rdma with bnxt_re0 broken
>
> Hi Parav
> The nvme connect still failed[1], I've paste all the dmesg log to[2], pls check it.
>
>
> [1]
> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n
> testnqn
> Failed to write to /dev/nvme-fabrics: Connection reset by peer
> [2]
> https://pastebin.com/ipvak0Sp
>

I think vlan_id is not initialized to 0xffff, due to which the ipv4 entry addition also failed with my patch.
This probably solves it. Not sure; I am not very familiar with it.

Selvin,
Can you please take a look?

From 7b55e1d4500259cf7c02fb4d9fbbeb5ad1cfc623 Mon Sep 17 00:00:00 2001
From: Parav Pandit <parav@mellanox.com>
Date: Fri, 12 Jul 2019 04:34:52 -0500
Subject: [PATCH v1] IB/bnxt_re: Honor vlan_id in GID entry comparison

A GID entry consists of GID, vlan, netdev and smac.
Extend the GID duplicate check comparison to consider the vlan_id as well,
to support IPv6 VLAN based link local addresses.

GID deletion based on only the GID comparison is not correct.
It needs further fixes.
Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") Change-Id: I2e026ec8065c8425ba24fad8525323d112a2f4e4 Signed-off-by: Parav Pandit <parav@mellanox.com> --- drivers/infiniband/hw/bnxt_re/qplib_res.c | 5 +++++ drivers/infiniband/hw/bnxt_re/qplib_sp.c | 7 ++++++- drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 + 3 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c index 37928b1111df..216648b640ce 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c @@ -488,6 +488,8 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, struct bnxt_qplib_sgid_tbl *sgid_tbl, u16 max) { + u16 i; + sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid), GFP_KERNEL); if (!sgid_tbl->tbl) return -ENOMEM; @@ -500,6 +502,9 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, if (!sgid_tbl->ctx) goto out_free2; + for (i = 0; i < max; i++) + sgid_tbl->tbl[i].vlan_id = 0xffff; + sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL); if (!sgid_tbl->vlan) goto out_free3; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 48793d3512ac..0d90be88685f 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -236,6 +236,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, return -ENOMEM; } for (index = 0; index < sgid_tbl->max; index++) { + /* FIXME: GID delete should happen based on index + * and refcount + */ if (!memcmp(&sgid_tbl->tbl[index], gid, sizeof(*gid))) break; } @@ -296,7 +299,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, } free_idx = sgid_tbl->max; for (i = 0; i < sgid_tbl->max; i++) { - if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { + if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) && + sgid_tbl->tbl[i].vlan_id == vlan_id) { dev_dbg(&res->pdev->dev, 
"SGID entry already exist in entry %d!\n", i); *index = i; @@ -351,6 +355,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, } /* Add GID to the sgid_tbl */ memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid)); + sgid_tbl->tbl[free_idx].vlan_id = vlan_id; sgid_tbl->active++; if (vlan_id != 0xFFFF) sgid_tbl->vlan[free_idx] = 1; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h index 0ec3b12b0bcd..7a1957c9dc6f 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h @@ -82,6 +82,7 @@ struct bnxt_qplib_pd { struct bnxt_qplib_gid { u8 data[16]; + u16 vlan_id; }; struct bnxt_qplib_ah { -- 2.19.2 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 12:52 ` Parav Pandit @ 2019-07-12 15:40 ` Jason Gunthorpe 2019-07-12 16:29 ` Selvin Xavier 2019-07-12 16:18 ` Selvin Xavier 1 sibling, 1 reply; 18+ messages in thread From: Jason Gunthorpe @ 2019-07-12 15:40 UTC (permalink / raw) To: Parav Pandit Cc: Yi Zhang, Selvin Xavier, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Fri, Jul 12, 2019 at 12:52:24PM +0000, Parav Pandit wrote: > > > > From: Yi Zhang <yi.zhang@redhat.com> > > Sent: Friday, July 12, 2019 5:11 PM > > To: Parav Pandit <parav@mellanox.com>; Selvin Xavier > > <selvin.xavier@broadcom.com> > > Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > Devesh Sharma <devesh.sharma@broadcom.com>; linux- > > nvme@lists.infradead.org > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > Hi Parav > > The nvme connect still failed[1], I've paste all the dmesg log to[2], pls check it. > > > > > > [1] > > [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n > > testnqn > > Failed to write to /dev/nvme-fabrics: Connection reset by peer > > [2] > > https://pastebin.com/ipvak0Sp > > > > I think vlan_id is not initialized to 0xffff due to which ipv4 entry addition also failed with my patch. > This is probably solves it. Not sure. I am not much familiar with it. > > Selvin, > Can you please take a look? > > From 7b55e1d4500259cf7c02fb4d9fbbeb5ad1cfc623 Mon Sep 17 00:00:00 2001 > From: Parav Pandit <parav@mellanox.com> > Date: Fri, 12 Jul 2019 04:34:52 -0500 > Subject: [PATCH v1] IB/bnxt_re: Honor vlan_id in GID entry comparison > > GID entry consist of GID, vlan, netdev and smac. > Extend GID duplicate check companions to consider vlan_id as well > to support IPv6 VLAN based link local addresses. > > GID deletion based on only GID comparision is not correct. > It needs further fixes. 
> > Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") > Change-Id: I2e026ec8065c8425ba24fad8525323d112a2f4e4 > Signed-off-by: Parav Pandit <parav@mellanox.com> > drivers/infiniband/hw/bnxt_re/qplib_res.c | 5 +++++ > drivers/infiniband/hw/bnxt_re/qplib_sp.c | 7 ++++++- > drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 + > 3 files changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c > index 37928b1111df..216648b640ce 100644 > +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c > @@ -488,6 +488,8 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, > struct bnxt_qplib_sgid_tbl *sgid_tbl, > u16 max) > { > + u16 i; > + > sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid), GFP_KERNEL); > if (!sgid_tbl->tbl) > return -ENOMEM; > @@ -500,6 +502,9 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, > if (!sgid_tbl->ctx) > goto out_free2; > > + for (i = 0; i < max; i++) > + sgid_tbl->tbl[i].vlan_id = 0xffff; > + > sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL); > if (!sgid_tbl->vlan) > goto out_free3; > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > index 48793d3512ac..0d90be88685f 100644 > +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > @@ -236,6 +236,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > return -ENOMEM; > } > for (index = 0; index < sgid_tbl->max; index++) { > + /* FIXME: GID delete should happen based on index > + * and refcount > + */ > if (!memcmp(&sgid_tbl->tbl[index], gid, sizeof(*gid))) > break; > } > @@ -296,7 +299,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > } > free_idx = sgid_tbl->max; > for (i = 0; i < sgid_tbl->max; i++) { > - if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { > + if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) && > + sgid_tbl->tbl[i].vlan_id == vlan_id) { > 
dev_dbg(&res->pdev->dev, > "SGID entry already exist in entry %d!\n", i); bnxt guys: please just delete this duplicate detection code from the driver. Every GID provided from the core must be programmed into the given gid table index. Jason ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 15:40 ` Jason Gunthorpe @ 2019-07-12 16:29 ` Selvin Xavier 2019-07-12 17:42 ` Jason Gunthorpe 0 siblings, 1 reply; 18+ messages in thread From: Selvin Xavier @ 2019-07-12 16:29 UTC (permalink / raw) To: Jason Gunthorpe Cc: Parav Pandit, Yi Zhang, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Fri, Jul 12, 2019 at 9:10 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Fri, Jul 12, 2019 at 12:52:24PM +0000, Parav Pandit wrote: > > > > > > > From: Yi Zhang <yi.zhang@redhat.com> > > > Sent: Friday, July 12, 2019 5:11 PM > > > To: Parav Pandit <parav@mellanox.com>; Selvin Xavier > > > <selvin.xavier@broadcom.com> > > > Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > > Devesh Sharma <devesh.sharma@broadcom.com>; linux- > > > nvme@lists.infradead.org > > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > > > Hi Parav > > > The nvme connect still failed[1], I've paste all the dmesg log to[2], pls check it. > > > > > > > > > [1] > > > [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n > > > testnqn > > > Failed to write to /dev/nvme-fabrics: Connection reset by peer > > > [2] > > > https://pastebin.com/ipvak0Sp > > > > > > > I think vlan_id is not initialized to 0xffff due to which ipv4 entry addition also failed with my patch. > > This is probably solves it. Not sure. I am not much familiar with it. > > > > Selvin, > > Can you please take a look? > > > > From 7b55e1d4500259cf7c02fb4d9fbbeb5ad1cfc623 Mon Sep 17 00:00:00 2001 > > From: Parav Pandit <parav@mellanox.com> > > Date: Fri, 12 Jul 2019 04:34:52 -0500 > > Subject: [PATCH v1] IB/bnxt_re: Honor vlan_id in GID entry comparison > > > > GID entry consist of GID, vlan, netdev and smac. > > Extend GID duplicate check companions to consider vlan_id as well > > to support IPv6 VLAN based link local addresses. > > > > GID deletion based on only GID comparision is not correct. 
> > It needs further fixes. > > > > Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") > > Change-Id: I2e026ec8065c8425ba24fad8525323d112a2f4e4 > > Signed-off-by: Parav Pandit <parav@mellanox.com> > > drivers/infiniband/hw/bnxt_re/qplib_res.c | 5 +++++ > > drivers/infiniband/hw/bnxt_re/qplib_sp.c | 7 ++++++- > > drivers/infiniband/hw/bnxt_re/qplib_sp.h | 1 + > > 3 files changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c > > index 37928b1111df..216648b640ce 100644 > > +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c > > @@ -488,6 +488,8 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, > > struct bnxt_qplib_sgid_tbl *sgid_tbl, > > u16 max) > > { > > + u16 i; > > + > > sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid), GFP_KERNEL); > > if (!sgid_tbl->tbl) > > return -ENOMEM; > > @@ -500,6 +502,9 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, > > if (!sgid_tbl->ctx) > > goto out_free2; > > > > + for (i = 0; i < max; i++) > > + sgid_tbl->tbl[i].vlan_id = 0xffff; > > + > > sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL); > > if (!sgid_tbl->vlan) > > goto out_free3; > > diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > > index 48793d3512ac..0d90be88685f 100644 > > +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c > > @@ -236,6 +236,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > > return -ENOMEM; > > } > > for (index = 0; index < sgid_tbl->max; index++) { > > + /* FIXME: GID delete should happen based on index > > + * and refcount > > + */ > > if (!memcmp(&sgid_tbl->tbl[index], gid, sizeof(*gid))) > > break; > > } > > @@ -296,7 +299,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, > > } > > free_idx = sgid_tbl->max; > > for (i = 0; i < sgid_tbl->max; i++) { > > - if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) 
{ > > + if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) && > > + sgid_tbl->tbl[i].vlan_id == vlan_id) { > > dev_dbg(&res->pdev->dev, > > "SGID entry already exist in entry %d!\n", i); > > bnxt guys: please just delete this duplicate detection code from the > driver. Every GID provided from the core must be programmed into the > given gid table index. Jason, This check is required in bnxt_re because the HW need only one entry in its table for RoCE V1 and RoCE v2 Gids. Sending the second add_gid for RoCE V2 (with the same gid as RoCE v1) to the HW returns failure. So driver handles this using a ref count. During delete_gid, the entry in the HW is deleted only if the refcount is zero. Thanks, Selvin > > Jason ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 16:29 ` Selvin Xavier @ 2019-07-12 17:42 ` Jason Gunthorpe 2019-07-13 7:51 ` Selvin Xavier 0 siblings, 1 reply; 18+ messages in thread From: Jason Gunthorpe @ 2019-07-12 17:42 UTC (permalink / raw) To: Selvin Xavier Cc: Parav Pandit, Yi Zhang, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Fri, Jul 12, 2019 at 09:59:38PM +0530, Selvin Xavier wrote: > > bnxt guys: please just delete this duplicate detection code from the > > driver. Every GID provided from the core must be programmed into the > > given gid table index. > > Jason, > This check is required in bnxt_re because the HW need only one entry > in its table for RoCE V1 and RoCE v2 Gids. The HW doesn't have a 'GID table' if it has this restriction. It sounds like it has some 'IP table' or maybe 'IP and VLAN' table? So the driver must provide a full emulated 'GID Table' with all the normal semantics. Which looks sort of like what the add side is doing, other than the mis-naming it seems reasonable.. But then I see this in re_create_ah: /* * If RoCE V2 is enabled, stack will have two entries for * each GID entry. Avoiding this duplicte entry in HW. Dividing * the GID index by 2 for RoCE V2 */ ah->qplib_ah.sgid_index = grh->sgid_index / 2; Which seems completely wrong - that is making assumptions about the layout of the gid table that is just not allowed. Surely it needs to access the per-gid driver context that add_gid set and use the index into the sgid_table from there? Jason ^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 17:42 ` Jason Gunthorpe @ 2019-07-13 7:51 ` Selvin Xavier 2019-07-13 12:12 ` Jason Gunthorpe 0 siblings, 1 reply; 18+ messages in thread From: Selvin Xavier @ 2019-07-13 7:51 UTC (permalink / raw) To: Jason Gunthorpe Cc: Parav Pandit, Yi Zhang, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Fri, Jul 12, 2019 at 11:12 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > On Fri, Jul 12, 2019 at 09:59:38PM +0530, Selvin Xavier wrote: > > > > bnxt guys: please just delete this duplicate detection code from the > > > driver. Every GID provided from the core must be programmed into the > > > given gid table index. > > > > Jason, > > This check is required in bnxt_re because the HW need only one entry > > in its table for RoCE V1 and RoCE v2 Gids. > > The HW doesn't have a 'GID table' if it has this restriction. It > sounds like it has some 'IP table' or maybe 'IP and VLAN' table? > > So the driver must provide a full emulated 'GID Table' with all the > normal semantics. > > Which looks sort of like what the add side is doing, other than the > mis-naming it seems reasonable.. > > But then I see this in re_create_ah: > > /* > * If RoCE V2 is enabled, stack will have two entries for > * each GID entry. Avoiding this duplicte entry in HW. Dividing > * the GID index by 2 for RoCE V2 > */ > ah->qplib_ah.sgid_index = grh->sgid_index / 2; > > Which seems completely wrong - that is making assumptions about the > layout of the gid table that is just not allowed. > > Surely it needs to access the per-gid driver context that add_gid set > and use the index into the sgid_table from there? > Agree.. We need a mapping between HW table index and GID table index. We can either maintain a mapping in the driver or have an ib stack function to get the per gid driver context from gid table index. The later makes sense only if other drivers also needs such interface. I am not sure any other driver needs it. 
it.

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-13 7:51 ` Selvin Xavier @ 2019-07-13 12:12 ` Jason Gunthorpe 0 siblings, 0 replies; 18+ messages in thread From: Jason Gunthorpe @ 2019-07-13 12:12 UTC (permalink / raw) To: Selvin Xavier Cc: Parav Pandit, Yi Zhang, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Sat, Jul 13, 2019 at 01:21:54PM +0530, Selvin Xavier wrote: > On Fri, Jul 12, 2019 at 11:12 PM Jason Gunthorpe <jgg@ziepe.ca> wrote: > > > > On Fri, Jul 12, 2019 at 09:59:38PM +0530, Selvin Xavier wrote: > > > > > > bnxt guys: please just delete this duplicate detection code from the > > > > driver. Every GID provided from the core must be programmed into the > > > > given gid table index. > > > > > > Jason, > > > This check is required in bnxt_re because the HW need only one entry > > > in its table for RoCE V1 and RoCE v2 Gids. > > > > The HW doesn't have a 'GID table' if it has this restriction. It > > sounds like it has some 'IP table' or maybe 'IP and VLAN' table? > > > > So the driver must provide a full emulated 'GID Table' with all the > > normal semantics. > > > > Which looks sort of like what the add side is doing, other than the > > mis-naming it seems reasonable.. > > > > But then I see this in re_create_ah: > > > > /* > > * If RoCE V2 is enabled, stack will have two entries for > > * each GID entry. Avoiding this duplicte entry in HW. Dividing > > * the GID index by 2 for RoCE V2 > > */ > > ah->qplib_ah.sgid_index = grh->sgid_index / 2; > > > > Which seems completely wrong - that is making assumptions about the > > layout of the gid table that is just not allowed. > > > > Surely it needs to access the per-gid driver context that add_gid set > > and use the index into the sgid_table from there? > > > Agree.. We need a mapping between HW table index and GID table index. > We can either maintain a mapping in the driver or have an ib stack function to > get the per gid driver context from gid table index. 
> The later makes sense
> only if other drivers also needs such interface. I am not sure any
> other driver needs
> it.

I'd prefer a core function to return that context. Even better would be
to have the core allocate a larger entry and use container_of.

Jason

^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken 2019-07-12 12:52 ` Parav Pandit 2019-07-12 15:40 ` Jason Gunthorpe @ 2019-07-12 16:18 ` Selvin Xavier 2019-07-13 7:56 ` Yi Zhang 1 sibling, 1 reply; 18+ messages in thread From: Selvin Xavier @ 2019-07-12 16:18 UTC (permalink / raw) To: Parav Pandit Cc: Yi Zhang, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme On Fri, Jul 12, 2019 at 6:22 PM Parav Pandit <parav@mellanox.com> wrote: > > > > > -----Original Message----- > > From: Yi Zhang <yi.zhang@redhat.com> > > Sent: Friday, July 12, 2019 5:11 PM > > To: Parav Pandit <parav@mellanox.com>; Selvin Xavier > > <selvin.xavier@broadcom.com> > > Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org; > > Devesh Sharma <devesh.sharma@broadcom.com>; linux- > > nvme@lists.infradead.org > > Subject: Re: regression: nvme rdma with bnxt_re0 broken > > > > Hi Parav > > The nvme connect still failed[1], I've paste all the dmesg log to[2], pls check it. > > > > > > [1] > > [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n > > testnqn > > Failed to write to /dev/nvme-fabrics: Connection reset by peer > > [2] > > https://pastebin.com/ipvak0Sp > > > > I think vlan_id is not initialized to 0xffff due to which ipv4 entry addition also failed with my patch. > This is probably solves it. Not sure. I am not much familiar with it. > > Selvin, > Can you please take a look? > Parav, Thanks for the patch. I removed the change you made in struct bnxt_qplib_gid and added a new structure struct bnxt_qplib_gid_info to include both gid and vlan_id. struct bnxt_qplib_gid is used in multiple places to memcpy or compare the 16 bytes. Also, added vlan checking in the destroy path also. Some more cleanup possible in delete_gid path. That can be done once the basic issue is fixed. Yi, Can you please test this patch and see if it is solving the issue? 
Thanks, Selvin From 3d83613cfc5993bf9dae529ab0d4d25ddefff29b Mon Sep 17 00:00:00 2001 From: Parav Pandit <parav@mellanox.com> Date: Fri, 12 Jul 2019 10:32:51 -0400 Subject: [PATCH] IB/bnxt_re: Honor vlan_id in GID entry comparison GID entry consist of GID, vlan, netdev and smac. Extend GID duplicate check companions to consider vlan_id as well to support IPv6 VLAN based link local addresses. Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> --- drivers/infiniband/hw/bnxt_re/ib_verbs.c | 7 +++++-- drivers/infiniband/hw/bnxt_re/qplib_res.c | 17 +++++++++++++---- drivers/infiniband/hw/bnxt_re/qplib_res.h | 2 +- drivers/infiniband/hw/bnxt_re/qplib_sp.c | 14 +++++++++----- drivers/infiniband/hw/bnxt_re/qplib_sp.h | 7 ++++++- 5 files changed, 34 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c index a91653aabf38..098ab883733e 100644 --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c @@ -308,6 +308,7 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context) struct bnxt_re_dev *rdev = to_bnxt_re_dev(attr->device, ibdev); struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl; struct bnxt_qplib_gid *gid_to_del; + u16 vlan_id = 0xFFFF; /* Delete the entry from the hardware */ ctx = *context; @@ -317,7 +318,8 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context) if (sgid_tbl && sgid_tbl->active) { if (ctx->idx >= sgid_tbl->max) return -EINVAL; - gid_to_del = &sgid_tbl->tbl[ctx->idx]; + gid_to_del = &sgid_tbl->tbl[ctx->idx].gid; + vlan_id = sgid_tbl->tbl[ctx->idx].vlan_id; /* DEL_GID is called in WQ context(netdevice_event_work_handler) * or via the ib_unregister_device path. 
In the former case QP1 * may not be destroyed yet, in which case just return as FW @@ -335,7 +337,8 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context) } ctx->refcnt--; if (!ctx->refcnt) { - rc = bnxt_qplib_del_sgid(sgid_tbl, gid_to_del, true); + rc = bnxt_qplib_del_sgid(sgid_tbl, gid_to_del, + vlan_id, true); if (rc) { dev_err(rdev_to_dev(rdev), "Failed to remove GID: %#x", rc); diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c index 37928b1111df..7f2571f7a13f 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c @@ -488,7 +488,10 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, struct bnxt_qplib_sgid_tbl *sgid_tbl, u16 max) { - sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid), GFP_KERNEL); + u16 i; + + sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid_info), + GFP_KERNEL); if (!sgid_tbl->tbl) return -ENOMEM; @@ -500,6 +503,9 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res, if (!sgid_tbl->ctx) goto out_free2; + for (i = 0; i < max; i++) + sgid_tbl->tbl[i].vlan_id = 0xffff; + sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL); if (!sgid_tbl->vlan) goto out_free3; @@ -526,9 +532,11 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res, for (i = 0; i < sgid_tbl->max; i++) { if (memcmp(&sgid_tbl->tbl[i], &bnxt_qplib_gid_zero, sizeof(bnxt_qplib_gid_zero))) - bnxt_qplib_del_sgid(sgid_tbl, &sgid_tbl->tbl[i], true); + bnxt_qplib_del_sgid(sgid_tbl, &sgid_tbl->tbl[i].gid, + sgid_tbl->tbl[i].vlan_id, true); } - memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max); + memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid_info) * + sgid_tbl->max); memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max); memset(sgid_tbl->vlan, 0, sizeof(u8) * sgid_tbl->max); sgid_tbl->active = 0; @@ -537,7 +545,8 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res, 
static void bnxt_qplib_init_sgid_tbl(struct bnxt_qplib_sgid_tbl *sgid_tbl, struct net_device *netdev) { - memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max); + memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid_info) * + sgid_tbl->max); memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max); } diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h index 30c42c92fac7..fbda11a7ab1a 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_res.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h @@ -111,7 +111,7 @@ struct bnxt_qplib_pd_tbl { }; struct bnxt_qplib_sgid_tbl { - struct bnxt_qplib_gid *tbl; + struct bnxt_qplib_gid_info *tbl; u16 *hw_id; u16 max; u16 active; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c index 48793d3512ac..40296b97d21e 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c @@ -213,12 +213,12 @@ int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res, index, sgid_tbl->max); return -EINVAL; } - memcpy(gid, &sgid_tbl->tbl[index], sizeof(*gid)); + memcpy(gid, &sgid_tbl->tbl[index].gid, sizeof(*gid)); return 0; } int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, - struct bnxt_qplib_gid *gid, bool update) + struct bnxt_qplib_gid *gid, u16 vlan_id, bool update) { struct bnxt_qplib_res *res = to_bnxt_qplib(sgid_tbl, struct bnxt_qplib_res, @@ -236,7 +236,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, return -ENOMEM; } for (index = 0; index < sgid_tbl->max; index++) { - if (!memcmp(&sgid_tbl->tbl[index], gid, sizeof(*gid))) + if (!memcmp(&sgid_tbl->tbl[index].gid, gid, sizeof(*gid)) && + vlan_id == sgid_tbl->tbl[index].vlan_id) break; } if (index == sgid_tbl->max) { @@ -262,8 +263,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, if (rc) return rc; } - memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero, + memcpy(&sgid_tbl->tbl[index].gid, 
&bnxt_qplib_gid_zero, sizeof(bnxt_qplib_gid_zero)); + sgid_tbl->tbl[index].vlan_id = 0xFFFF; sgid_tbl->vlan[index] = 0; sgid_tbl->active--; dev_dbg(&res->pdev->dev, @@ -296,7 +298,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, } free_idx = sgid_tbl->max; for (i = 0; i < sgid_tbl->max; i++) { - if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) { + if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) && + sgid_tbl->tbl[i].vlan_id == vlan_id) { dev_dbg(&res->pdev->dev, "SGID entry already exist in entry %d!\n", i); *index = i; @@ -351,6 +354,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, } /* Add GID to the sgid_tbl */ memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid)); + sgid_tbl->tbl[free_idx].vlan_id = vlan_id; sgid_tbl->active++; if (vlan_id != 0xFFFF) sgid_tbl->vlan[free_idx] = 1; diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h index 0ec3b12b0bcd..b5c4ce302c61 100644 --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h @@ -84,6 +84,11 @@ struct bnxt_qplib_gid { u8 data[16]; }; +struct bnxt_qplib_gid_info { + struct bnxt_qplib_gid gid; + u16 vlan_id; +}; + struct bnxt_qplib_ah { struct bnxt_qplib_gid dgid; struct bnxt_qplib_pd *pd; @@ -221,7 +226,7 @@ int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res, struct bnxt_qplib_sgid_tbl *sgid_tbl, int index, struct bnxt_qplib_gid *gid); int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, - struct bnxt_qplib_gid *gid, bool update); + struct bnxt_qplib_gid *gid, u16 vlan_id, bool update); int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl, struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id, bool update, u32 *index); -- 2.18.1 ^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: regression: nvme rdma with bnxt_re0 broken
  2019-07-12 16:18           ` Selvin Xavier
@ 2019-07-13  7:56             ` Yi Zhang
  2019-07-13 16:00               ` Selvin Xavier
  0 siblings, 1 reply; 18+ messages in thread
From: Yi Zhang @ 2019-07-13  7:56 UTC (permalink / raw)
  To: Selvin Xavier, Parav Pandit
  Cc: Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme

Hi Selvin

Confirmed the patch fixed the login issue.

And from the dmesg I found there is only one I/O queue created; is that
reasonable? More queues are created during my testing with IB/iWARP.

# nvme connect-all -t rdma -a 172.31.40.125 -s 4420

# lsblk
NAME    MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda       8:0    0 931.5G  0 disk
├─sda1    8:1    0     1G  0 part /boot
├─sda2    8:2    0    25G  0 part /mnt/rdma-ext4
├─sda3    8:3    0    25G  0 part /mnt/rdma-xfs
├─sda4    8:4    0     1K  0 part
├─sda5    8:5    0  15.7G  0 part [SWAP]
└─sda6    8:6    0 864.9G  0 part /
nvme0n1 259:405  0   250G  0 disk

# dmesg
[ 4311.635430] nvme nvme0: new ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery", addr 172.31.40.125:4420
[ 4311.646687] nvme nvme0: Removing ctrl: NQN "nqn.2014-08.org.nvmexpress.discovery"
[ 4311.706052] nvme nvme0: creating 1 I/O queues.
[ 4311.717848] nvme nvme0: mapped 1/0/0 default/read/poll queues.
[ 4311.727384] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.40.125:4420

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              16
On-line CPU(s) list: 0-15
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           2
NUMA node(s):        2
Vendor ID:           GenuineIntel
CPU family:          6
Model:               63
Model name:          Intel(R) Xeon(R) CPU E5-2623 v3 @ 3.00GHz
Stepping:            2
CPU MHz:             3283.306
CPU max MHz:         3500.0000
CPU min MHz:         1200.0000
BogoMIPS:            5993.72
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            10240K
NUMA node0 CPU(s):   0,2,4,6,8,10,12,14
NUMA node1 CPU(s):   1,3,5,7,9,11,13,15
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida arat pln pts flush_l1d

On 7/13/19 12:18 AM, Selvin Xavier wrote:
> On Fri, Jul 12, 2019 at 6:22 PM Parav Pandit <parav@mellanox.com> wrote:
>>
>>> -----Original Message-----
>>> From: Yi Zhang <yi.zhang@redhat.com>
>>> Sent: Friday, July 12, 2019 5:11 PM
>>> To: Parav Pandit <parav@mellanox.com>; Selvin Xavier
>>> <selvin.xavier@broadcom.com>
>>> Cc: Daniel Jurgens <danielj@mellanox.com>; linux-rdma@vger.kernel.org;
>>> Devesh Sharma <devesh.sharma@broadcom.com>; linux-nvme@lists.infradead.org
>>> Subject: Re: regression: nvme rdma with bnxt_re0 broken
>>>
>>> Hi Parav
>>> The nvme connect still failed [1]; I've pasted all the dmesg log to [2], pls check it.
>>>
>>> [1]
>>> [root@rdma-perf-07 ~]$ nvme connect -t rdma -a 172.31.40.125 -s 4420 -n testnqn
>>> Failed to write to /dev/nvme-fabrics: Connection reset by peer
>>> [2]
>>> https://pastebin.com/ipvak0Sp
>>>
>> I think vlan_id is not initialized to 0xffff, due to which the ipv4 entry addition also failed with my patch.
>> This probably solves it. Not sure. I am not much familiar with it.
>>
>> Selvin,
>> Can you please take a look?
>>
> Parav,
> Thanks for the patch. I removed the change you made in struct bnxt_qplib_gid
> and added a new structure, struct bnxt_qplib_gid_info, to include both
> gid and vlan_id.
> struct bnxt_qplib_gid is used in multiple places to memcpy or compare
> the 16 bytes.
> Also added vlan checking in the destroy path. Some more cleanup is
> possible in the delete_gid path; that can be done once the basic issue is fixed.
>
> Yi, can you please test this patch and see if it solves the issue?
>
> Thanks,
> Selvin
>
> From 3d83613cfc5993bf9dae529ab0d4d25ddefff29b Mon Sep 17 00:00:00 2001
> From: Parav Pandit <parav@mellanox.com>
> Date: Fri, 12 Jul 2019 10:32:51 -0400
> Subject: [PATCH] IB/bnxt_re: Honor vlan_id in GID entry comparison
>
> A GID entry consists of GID, vlan, netdev and smac.
> Extend the GID duplicate check comparisons to consider vlan_id as well,
> to support IPv6 VLAN-based link local addresses.
>
> Fixes: 823b23da7113 ("IB/core: Allow vlan link local address based RoCE GIDs")
> Signed-off-by: Parav Pandit <parav@mellanox.com>
> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> ---
>  drivers/infiniband/hw/bnxt_re/ib_verbs.c  |  7 +++++--
>  drivers/infiniband/hw/bnxt_re/qplib_res.c | 17 +++++++++++++----
>  drivers/infiniband/hw/bnxt_re/qplib_res.h |  2 +-
>  drivers/infiniband/hw/bnxt_re/qplib_sp.c  | 14 +++++++++-----
>  drivers/infiniband/hw/bnxt_re/qplib_sp.h  |  7 ++++++-
>  5 files changed, 34 insertions(+), 13 deletions(-)
>
> diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> index a91653aabf38..098ab883733e 100644
> --- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> +++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
> @@ -308,6 +308,7 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context)
>  	struct bnxt_re_dev *rdev = to_bnxt_re_dev(attr->device, ibdev);
>  	struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl;
>  	struct bnxt_qplib_gid *gid_to_del;
> +	u16 vlan_id = 0xFFFF;
>
>  	/* Delete the entry from the hardware */
>  	ctx = *context;
> @@ -317,7 +318,8 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context)
>  	if (sgid_tbl && sgid_tbl->active) {
>  		if (ctx->idx >= sgid_tbl->max)
>  			return -EINVAL;
> -		gid_to_del = &sgid_tbl->tbl[ctx->idx];
> +		gid_to_del = &sgid_tbl->tbl[ctx->idx].gid;
> +		vlan_id = sgid_tbl->tbl[ctx->idx].vlan_id;
>  		/* DEL_GID is called in WQ context(netdevice_event_work_handler)
>  		 * or via the ib_unregister_device path. In the former case QP1
>  		 * may not be destroyed yet, in which case just return as FW
> @@ -335,7 +337,8 @@ int bnxt_re_del_gid(const struct ib_gid_attr *attr, void **context)
>  	}
>  	ctx->refcnt--;
>  	if (!ctx->refcnt) {
> -		rc = bnxt_qplib_del_sgid(sgid_tbl, gid_to_del, true);
> +		rc = bnxt_qplib_del_sgid(sgid_tbl, gid_to_del,
> +					 vlan_id, true);
>  		if (rc) {
>  			dev_err(rdev_to_dev(rdev),
>  				"Failed to remove GID: %#x", rc);
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> index 37928b1111df..7f2571f7a13f 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
> @@ -488,7 +488,10 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res,
>  				     struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  				     u16 max)
>  {
> -	sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid), GFP_KERNEL);
> +	u16 i;
> +
> +	sgid_tbl->tbl = kcalloc(max, sizeof(struct bnxt_qplib_gid_info),
> +				GFP_KERNEL);
>  	if (!sgid_tbl->tbl)
>  		return -ENOMEM;
>
> @@ -500,6 +503,9 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res,
>  	if (!sgid_tbl->ctx)
>  		goto out_free2;
>
> +	for (i = 0; i < max; i++)
> +		sgid_tbl->tbl[i].vlan_id = 0xffff;
> +
>  	sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL);
>  	if (!sgid_tbl->vlan)
>  		goto out_free3;
> @@ -526,9 +532,11 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res,
>  	for (i = 0; i < sgid_tbl->max; i++) {
>  		if (memcmp(&sgid_tbl->tbl[i], &bnxt_qplib_gid_zero,
>  			   sizeof(bnxt_qplib_gid_zero)))
> -			bnxt_qplib_del_sgid(sgid_tbl, &sgid_tbl->tbl[i], true);
> +			bnxt_qplib_del_sgid(sgid_tbl, &sgid_tbl->tbl[i].gid,
> +					    sgid_tbl->tbl[i].vlan_id, true);
>  	}
> -	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max);
> +	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid_info) *
> +	       sgid_tbl->max);
>  	memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max);
>  	memset(sgid_tbl->vlan, 0, sizeof(u8) * sgid_tbl->max);
>  	sgid_tbl->active = 0;
> @@ -537,7 +545,8 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res,
>  static void bnxt_qplib_init_sgid_tbl(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  				     struct net_device *netdev)
>  {
> -	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max);
> +	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid_info) *
> +	       sgid_tbl->max);
>  	memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max);
>  }
>
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
> index 30c42c92fac7..fbda11a7ab1a 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
> @@ -111,7 +111,7 @@ struct bnxt_qplib_pd_tbl {
>  };
>
>  struct bnxt_qplib_sgid_tbl {
> -	struct bnxt_qplib_gid *tbl;
> +	struct bnxt_qplib_gid_info *tbl;
>  	u16 *hw_id;
>  	u16 max;
>  	u16 active;
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> index 48793d3512ac..40296b97d21e 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
> @@ -213,12 +213,12 @@ int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res,
>  			index, sgid_tbl->max);
>  		return -EINVAL;
>  	}
> -	memcpy(gid, &sgid_tbl->tbl[index], sizeof(*gid));
> +	memcpy(gid, &sgid_tbl->tbl[index].gid, sizeof(*gid));
>  	return 0;
>  }
>
>  int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
> -			struct bnxt_qplib_gid *gid, bool update)
> +			struct bnxt_qplib_gid *gid, u16 vlan_id, bool update)
>  {
>  	struct bnxt_qplib_res *res = to_bnxt_qplib(sgid_tbl,
>  						   struct bnxt_qplib_res,
> @@ -236,7 +236,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  		return -ENOMEM;
>  	}
>  	for (index = 0; index < sgid_tbl->max; index++) {
> -		if (!memcmp(&sgid_tbl->tbl[index], gid, sizeof(*gid)))
> +		if (!memcmp(&sgid_tbl->tbl[index].gid, gid, sizeof(*gid)) &&
> +		    vlan_id == sgid_tbl->tbl[index].vlan_id)
>  			break;
>  	}
>  	if (index == sgid_tbl->max) {
> @@ -262,8 +263,9 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  		if (rc)
>  			return rc;
>  	}
> -	memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
> +	memcpy(&sgid_tbl->tbl[index].gid, &bnxt_qplib_gid_zero,
>  	       sizeof(bnxt_qplib_gid_zero));
> +	sgid_tbl->tbl[index].vlan_id = 0xFFFF;
>  	sgid_tbl->vlan[index] = 0;
>  	sgid_tbl->active--;
>  	dev_dbg(&res->pdev->dev,
> @@ -296,7 +298,8 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  	}
>  	free_idx = sgid_tbl->max;
>  	for (i = 0; i < sgid_tbl->max; i++) {
> -		if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid))) {
> +		if (!memcmp(&sgid_tbl->tbl[i], gid, sizeof(*gid)) &&
> +		    sgid_tbl->tbl[i].vlan_id == vlan_id) {
>  			dev_dbg(&res->pdev->dev,
>  				"SGID entry already exist in entry %d!\n", i);
>  			*index = i;
> @@ -351,6 +354,7 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  	}
>  	/* Add GID to the sgid_tbl */
>  	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
> +	sgid_tbl->tbl[free_idx].vlan_id = vlan_id;
>  	sgid_tbl->active++;
>  	if (vlan_id != 0xFFFF)
>  		sgid_tbl->vlan[free_idx] = 1;
> diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> index 0ec3b12b0bcd..b5c4ce302c61 100644
> --- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> +++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
> @@ -84,6 +84,11 @@ struct bnxt_qplib_gid {
>  	u8 data[16];
>  };
>
> +struct bnxt_qplib_gid_info {
> +	struct bnxt_qplib_gid gid;
> +	u16 vlan_id;
> +};
> +
>  struct bnxt_qplib_ah {
>  	struct bnxt_qplib_gid dgid;
>  	struct bnxt_qplib_pd *pd;
> @@ -221,7 +226,7 @@ int bnxt_qplib_get_sgid(struct bnxt_qplib_res *res,
>  			struct bnxt_qplib_sgid_tbl *sgid_tbl, int index,
>  			struct bnxt_qplib_gid *gid);
>  int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
> -			struct bnxt_qplib_gid *gid, bool update);
> +			struct bnxt_qplib_gid *gid, u16 vlan_id, bool update);
>  int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
>  			struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id,
>  			bool update, u32 *index);
* Re: regression: nvme rdma with bnxt_re0 broken
  2019-07-13  7:56             ` Yi Zhang
@ 2019-07-13 16:00               ` Selvin Xavier
  0 siblings, 0 replies; 18+ messages in thread
From: Selvin Xavier @ 2019-07-13 16:00 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Parav Pandit, Daniel Jurgens, linux-rdma, Devesh Sharma, linux-nvme

On Sat, Jul 13, 2019 at 1:26 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> Hi Selvin
>
> Confirmed the patch fixed the login issue.
>
Thanks, Yi, for the confirmation. I will post this patch.

> And from the dmesg I found there is only one I/O queue created; is that
> reasonable? More queues are created during my testing with IB/iWARP.
>
num_comp_vectors exported by bnxt_re is 1, and this is expected due to
that. This can be increased. I will post a patch after doing some
testing for the same.

> [...]
end of thread, other threads: [~2019-07-13 16:00 UTC | newest]

Thread overview: 18+ messages
  [not found] <1310083272.27124086.1562836112586.JavaMail.zimbra@redhat.com>
  [not found] ` <619411460.27128070.1562838433020.JavaMail.zimbra@redhat.com>
  2019-07-11 16:13 ` regression: nvme rdma with bnxt_re0 broken — Sagi Grimberg
  [not found] ` <AM0PR05MB48664657022ECA8526E3C967D1F30@AM0PR05MB4866.eurprd05.prod.outlook.com>
  2019-07-11 16:18 ` Parav Pandit
  2019-07-12  1:53 ` Yi Zhang
  2019-07-12  2:49 ` Parav Pandit
  2019-07-12  3:45 ` Selvin Xavier
  2019-07-12  9:28 ` Parav Pandit
  2019-07-12  9:39 ` Parav Pandit
  2019-07-12  9:49 ` Parav Pandit
  2019-07-12 11:41 ` Yi Zhang
  2019-07-12 12:52 ` Parav Pandit
  2019-07-12 15:40 ` Jason Gunthorpe
  2019-07-12 16:29 ` Selvin Xavier
  2019-07-12 17:42 ` Jason Gunthorpe
  2019-07-13  7:51 ` Selvin Xavier
  2019-07-13 12:12 ` Jason Gunthorpe
  2019-07-12 16:18 ` Selvin Xavier
  2019-07-13  7:56 ` Yi Zhang
  2019-07-13 16:00 ` Selvin Xavier