linux-rdma.vger.kernel.org archive mirror
* cx4 sriov problem in stable 4.14.47
@ 2018-06-20 10:40 Sagi Grimberg
  2018-06-20 12:01 ` Leon Romanovsky
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Sagi Grimberg @ 2018-06-20 10:40 UTC (permalink / raw)
  To: linux-rdma, Leon Romanovsky, Saeed Mahameed; +Cc: Stable, Roy Shterman

Hey folks,

Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
stable 4.14 kernel. I upgraded the firmware, but with no luck :(

Has anyone seen this issue as well?

Kernel: stable 4.14.47
Firmware: 12.22.1002
psid: MT_2140110033

sysfs port state is DOWN and phys port state is Disabled.

I tried to add as much debug as I could to dump for the experts:
--
[ 3243.373948] mlx5_core 0000:03:00.0: mlx5_core_sriov_configure:210:(pid 1616): requested num_vfs 1
[ 3243.374321] mlx5_core 0000:03:00.0: mlx5_device_enable_sriov:115:(pid 1616): successfully enabled VF* 0
[ 3243.482126] pci 0000:03:00.1: [15b3:1014] type 00 class 0x020000
[ 3243.482490] pci 0000:03:00.1: Max Payload Size set to 256 (was 128, max 512)
[ 3243.482510] pci 0000:03:00.1: enabling Extended Tags
[ 3243.487075] mlx5_core 0000:03:00.1: enabling device (0000 -> 0002)
[ 3243.487220] mlx5_core 0000:03:00.1: firmware version: 12.22.1002
[ 3243.570233] mlx5_core 0000:03:00.1: handle_hca_cap:517:(pid 5010): Current Pkey table size 128 Setting new size 128
[ 3244.302809] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 20
[ 3244.303052] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 21
[ 3244.303296] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 22
[ 3244.303590] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 23
[ 3244.303825] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 24
[ 3244.304049] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 25
[ 3244.304276] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 26
[ 3244.304499] mlx5_core 0000:03:00.1: alloc_comp_eqs:776:(pid 5010): allocated completion EQN 27
[ 3244.307241] mlx5_core 0000:03:00.1: mlx5_nic_vport_update_local_lb:939:(pid 5010): disable local_lb
[ 3244.307613] mlx5_core 0000:03:00.1: MLX5E: StrdRq(0) RqSz(1024) StrdSz(1) RxCqeCmprss(0)
[ 3244.308008] mlx5_core 0000:03:00.1: Assigned random MAC address 42:7b:57:5a:f2:06
[ 3244.460140] mlx5_core 0000:03:00.1 ens1f1: renamed from eth0
[ 3244.627854] mlx5_core 0000:03:00.1 ens1f1: Link down
[ 3244.629033] IPv6: ADDRCONF(NETDEV_UP): ens1f1: link is not ready
[ 3244.629480] IPv6: ADDRCONF(NETDEV_UP): ens1f1: link is not ready
[ 3244.639340] pkey = 0xffff
[ 3244.644081] pkey = 0xffff
[ 3244.645809] pkey = 0xffff
[ 3244.646510] pkey = 0x0
[ 3244.646557] pkey = 0x0
[ 3244.646600] pkey = 0x0
[ 3244.646649] pkey = 0x0
[ 3244.646692] pkey = 0x0
[ 3244.646751] pkey = 0x0
[ 3244.646793] pkey = 0x0
[ 3244.646835] pkey = 0x0
[ 3244.646882] pkey = 0x0
[ 3244.646933] pkey = 0x0
[ 3244.646986] pkey = 0x0
[ 3244.647044] pkey = 0x0
[ 3244.647086] pkey = 0x0
[ 3244.647136] pkey = 0x0
[ 3244.647182] pkey = 0x0
[ 3244.647240] pkey = 0x0
[ 3244.647283] pkey = 0x0
[ 3244.647333] pkey = 0x0
[ 3244.647384] pkey = 0x0
[ 3244.647434] pkey = 0x0
[ 3244.647497] pkey = 0x0
[ 3244.647554] pkey = 0x0
[ 3244.647598] pkey = 0x0
[ 3244.647652] pkey = 0x0
[ 3244.647695] pkey = 0x0
[ 3244.647737] pkey = 0x0
[ 3244.647786] pkey = 0x0
[ 3244.647839] pkey = 0x0
[ 3244.647884] pkey = 0x0
[ 3244.647934] pkey = 0x0
[ 3244.647986] pkey = 0x0
[ 3244.648049] pkey = 0x0
[ 3244.648092] pkey = 0x0
[ 3244.648134] pkey = 0x0
[ 3244.648184] pkey = 0x0
[ 3244.648241] pkey = 0x0
[ 3244.648286] pkey = 0x0
[ 3244.648334] pkey = 0x0
[ 3244.648385] pkey = 0x0
[ 3244.648435] pkey = 0x0
[ 3244.648495] pkey = 0x0
[ 3244.648539] pkey = 0x0
[ 3244.648584] pkey = 0x0
[ 3244.648634] pkey = 0x0
[ 3244.648685] pkey = 0x0
[ 3244.648748] pkey = 0x0
[ 3244.648792] pkey = 0x0
[ 3244.648836] pkey = 0x0
[ 3244.648885] pkey = 0x0
[ 3244.648935] pkey = 0x0
[ 3244.648998] pkey = 0x0
[ 3244.649042] pkey = 0x0
[ 3244.649086] pkey = 0x0
[ 3244.649138] pkey = 0x0
[ 3244.649185] pkey = 0x0
[ 3244.649235] pkey = 0x0
[ 3244.649285] pkey = 0x0
[ 3244.649344] pkey = 0x0
[ 3244.649388] pkey = 0x0
[ 3244.649435] pkey = 0x0
[ 3244.649484] pkey = 0x0
[ 3244.649549] pkey = 0x0
[ 3244.649593] pkey = 0x0
[ 3244.649638] pkey = 0x0
[ 3244.649686] pkey = 0x0
[ 3244.649741] pkey = 0x0
[ 3244.649785] pkey = 0x0
[ 3244.649836] pkey = 0x0
[ 3244.649886] pkey = 0x0
[ 3244.649936] pkey = 0x0
[ 3244.649988] pkey = 0x0
[ 3244.651604] pkey = 0x0
[ 3244.651652] pkey = 0x0
[ 3244.651697] pkey = 0x0
[ 3244.651744] pkey = 0x0
[ 3244.651788] pkey = 0x0
[ 3244.651839] pkey = 0x0
[ 3244.651889] pkey = 0x0
[ 3244.651939] pkey = 0x0
[ 3244.651989] pkey = 0x0
[ 3244.652048] pkey = 0x0
[ 3244.652090] pkey = 0x0
[ 3244.652140] pkey = 0x0
[ 3244.652195] pkey = 0x0
[ 3244.652241] pkey = 0x0
[ 3244.652300] pkey = 0x0
[ 3244.652343] pkey = 0x0
[ 3244.652403] pkey = 0x0
[ 3244.652450] pkey = 0x0
[ 3244.652493] pkey = 0x0
[ 3244.652540] pkey = 0x0
[ 3244.652593] pkey = 0x0
[ 3244.652640] pkey = 0x0
[ 3244.652697] pkey = 0x0
[ 3244.652755] pkey = 0x0
[ 3244.652800] pkey = 0x0
[ 3244.652848] pkey = 0x0
[ 3244.652892] pkey = 0x0
[ 3244.652950] pkey = 0x0
[ 3244.652993] pkey = 0x0
[ 3244.653041] pkey = 0x0
[ 3244.653091] pkey = 0x0
[ 3244.653141] pkey = 0x0
[ 3244.653191] pkey = 0x0
[ 3244.653241] pkey = 0x0
[ 3244.653294] pkey = 0x0
[ 3244.653341] pkey = 0x0
[ 3244.653391] pkey = 0x0
[ 3244.653531] pkey = 0x0
[ 3244.653593] pkey = 0x0
[ 3244.653935] pkey = 0x0
[ 3244.653996] pkey = 0x0
[ 3244.654253] pkey = 0x0
[ 3244.654974] pkey = 0x0
[ 3244.655038] pkey = 0x0
[ 3244.655097] pkey = 0x0
[ 3244.655143] pkey = 0x0
[ 3244.655194] pkey = 0x0
[ 3244.655253] pkey = 0x0
[ 3244.655308] pkey = 0x0
[ 3244.655350] pkey = 0x0
[ 3244.655393] pkey = 0x0
[ 3244.655448] pkey = 0x0
[ 3244.655494] pkey = 0x0
[ 3244.655544] pkey = 0x0
[ 3244.655595] pkey = 0x0
[ 3244.655648] pkey = 0x0


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 10:40 cx4 sriov problem in stable 4.14.47 Sagi Grimberg
@ 2018-06-20 12:01 ` Leon Romanovsky
  2018-06-20 12:28   ` Sagi Grimberg
  2018-06-20 15:47 ` Or Gerlitz
  2018-06-20 18:44 ` Greg KH
  2 siblings, 1 reply; 8+ messages in thread
From: Leon Romanovsky @ 2018-06-20 12:01 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-rdma, Saeed Mahameed, Stable, Roy Shterman


On Wed, Jun 20, 2018 at 01:40:16PM +0300, Sagi Grimberg wrote:
> Hey folks,
>
> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
> stable 4.14 kernel. I upgraded the firmware, but with no luck :(
>
> Has anyone seen this issue as well?

I'm shooting in the dark here, but isn't this related to commit
e3ca34880652250f524022ad89e516f8ba9a805b ?

Thanks



* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 12:01 ` Leon Romanovsky
@ 2018-06-20 12:28   ` Sagi Grimberg
  0 siblings, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2018-06-20 12:28 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma, Saeed Mahameed, Stable, Roy Shterman


>> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
>> stable 4.14 kernel. I upgraded the firmware, but with no luck :(
>>
>> Has anyone seen this issue as well?
> 
> I'm shooting in the dark here, but isn't this related to commit
> e3ca34880652250f524022ad89e516f8ba9a805b ?

Probably not; that commit only affects RDMA ULPs that want to map
to IRQ affinities. This is an Ethernet VF that probes with link down.


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 10:40 cx4 sriov problem in stable 4.14.47 Sagi Grimberg
  2018-06-20 12:01 ` Leon Romanovsky
@ 2018-06-20 15:47 ` Or Gerlitz
  2018-06-20 17:06   ` Sagi Grimberg
  2018-06-20 18:44 ` Greg KH
  2 siblings, 1 reply; 8+ messages in thread
From: Or Gerlitz @ 2018-06-20 15:47 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-rdma, Leon Romanovsky, Saeed Mahameed, Stable, Roy Shterman

On Wed, Jun 20, 2018 at 1:40 PM, Sagi Grimberg <sagi@grimberg.me> wrote:
> Hey folks,
>
> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
> stable 4.14 kernel. I upgraded the firmware, but with no luck :(
>
> Has anyone seen this issue as well?
>
> Kernel: stable 4.14.47
> Firmware: 12.22.1002
> psid: MT_2140110033
>
> sysfs port state is DOWN and

do you mean ip link on the pf shows link state down for the vf?

> phys port state is Disabled.

where do you see it is disabled and what happens if you take the PF
netdev link up?

We had a bug fix there recently [1] -- but it was for the switchdev
mode, not the legacy mode. Are you using the switchdev mode? (If not,
I recommend going there; the legacy mode is not going to last long.)

Also, FWIW the whole ethernet sriov upstreaming is done in netdev, not rdma

Or.


[1] 84c9c8f net/mlx5e: Don't override vport admin link state in switchdev mode


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 15:47 ` Or Gerlitz
@ 2018-06-20 17:06   ` Sagi Grimberg
  2018-06-21  7:05     ` Or Gerlitz
  0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2018-06-20 17:06 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: linux-rdma, Leon Romanovsky, Saeed Mahameed, Stable, Roy Shterman


>> Hey folks,
>>
>> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
>> stable 4.14 kernel. I upgraded the firmware, but with no luck :(
>>
>> Has anyone seen this issue as well?
>>
>> Kernel: stable 4.14.47
>> Firmware: 12.22.1002
>> psid: MT_2140110033
>>
>> sysfs port state is DOWN and
> 
> do you mean ip link on the pf shows link state down for the vf?

the PF is fine, the VF is down.

>> phys port state is Disabled.
> 
> where do you see it is disabled and what happens if you take the PF
> netdev link up?

Nothing happens if I take it up, the carrier is off. I see the port
phys state via infiniband sysfs (which exposes it from the hca vport
context).
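
For reference, a minimal sketch of reading those two attributes, assuming
the usual /sys/class/infiniband/<device>/ports/<port>/ layout where each
file holds "<number>: <NAME>". The helper name and the device name used in
the usage note are hypothetical and not taken from this thread.

```python
from pathlib import Path


def read_port_state(device, port=1, root="/sys/class/infiniband"):
    """Return (state, phys_state) names for one RDMA device port.

    `root` is a parameter only so the helper can be pointed at a
    test directory; on a real system the default should apply.
    """
    port_dir = Path(root) / device / "ports" / str(port)
    names = []
    for attr in ("state", "phys_state"):
        # Each attribute reads like "1: DOWN" or "3: Disabled".
        text = (port_dir / attr).read_text().strip()
        _, _, name = text.partition(":")
        # Fall back to the raw text if there is no "<number>:" prefix.
        names.append(name.strip() or text)
    return tuple(names)
```

Calling read_port_state("mlx5_1") (device name assumed) on the VF's RDMA
device would return the state and phys_state names as a tuple.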

> We had a bug fix there recently [1] -- but it was for the switchdev
> mode not the legacy mode,

It was backported to 4.14 stable...

> are you using the switchdev mode? (if not, I recommend going there, the legacy
> mode is not going to last long)

No, I wasn't sure if it was mature enough in 4.14. And stable backports
don't always propagate immediately...

I'll try switchdev mode.

> Also, FWIW the whole ethernet sriov upstreaming is done in netdev, not rdma

I know, it should have gone to netdev... Next time (which I hope won't
come too soon ;))


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 10:40 cx4 sriov problem in stable 4.14.47 Sagi Grimberg
  2018-06-20 12:01 ` Leon Romanovsky
  2018-06-20 15:47 ` Or Gerlitz
@ 2018-06-20 18:44 ` Greg KH
  2018-07-12 18:01   ` Or Gerlitz
  2 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2018-06-20 18:44 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-rdma, Leon Romanovsky, Saeed Mahameed, Stable, Roy Shterman

On Wed, Jun 20, 2018 at 01:40:16PM +0300, Sagi Grimberg wrote:
> Hey folks,
> 
> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
> stable 4.14 kernel. I upgraded the firmware, but with no luck :(

Did this work on older kernels?  If so, can you use 'git bisect' to
track down the offending commit?
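
The idea behind 'git bisect' can be sketched as a binary search over an
ordered history. This is only an illustration under simplifying
assumptions: the history is linear, the first entry is known good, the
last is known bad, and everything after the first bad commit is also bad.
The function name, commit labels, and is_bad callback are hypothetical;
real git bisect also handles non-linear history.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered, linear history.

    Assumes commits[0] is good and commits[-1] is bad.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        # Build and test the midpoint, then halve the search range.
        if is_bad(commits[mid]):
            bad = mid
        else:
            good = mid
    return commits[bad]


# Hypothetical history c0..c9 in which the regression enters at c6.
history = ["c%d" % i for i in range(10)]
culprit = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit)
```

Here is_bad would stand for "boot this kernel, enable a VF, and check
whether it probes with link down".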

thanks,

greg k-h


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 17:06   ` Sagi Grimberg
@ 2018-06-21  7:05     ` Or Gerlitz
  0 siblings, 0 replies; 8+ messages in thread
From: Or Gerlitz @ 2018-06-21  7:05 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: linux-rdma, Leon Romanovsky, Saeed Mahameed, Stable, Roy Shterman

On Wed, Jun 20, 2018 at 8:06 PM, Sagi Grimberg <sagi@grimberg.me> wrote:

> I'll try switchdev mode.

in switchdev mode you have to take the host side VF representor netdev up
such that the VF vport link will be enabled. Also in switchdev mode you need
some host side controller SW to program the e-switch forwarding rule.
You can make the experiment just to see if the link problem persists.
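
A rough sketch of the first two steps as command lines, assuming the
iproute2 'devlink' and 'ip' tools. The PF PCI address is the one from the
log earlier in the thread; the representor netdev name is a placeholder,
since it depends on the system. The sketch only builds the commands and
does not run them, and it does not cover programming e-switch forwarding
rules.

```python
from typing import List


def switchdev_commands(pf_pci_addr: str, vf_representor: str) -> List[List[str]]:
    """Build (but do not run) commands to try switchdev mode on a PF."""
    return [
        # Switch the PF's e-switch from legacy to switchdev mode.
        ["devlink", "dev", "eswitch", "set", "pci/" + pf_pci_addr,
         "mode", "switchdev"],
        # Bring the host-side VF representor netdev up.
        ["ip", "link", "set", "dev", vf_representor, "up"],
    ]


# "vf0_rep" is a hypothetical representor name, not taken from the thread.
for cmd in switchdev_commands("0000:03:00.0", "vf0_rep"):
    print(" ".join(cmd))
```

The exact steps may differ by kernel and iproute2 version, so treat this
as a starting point for the experiment rather than a recipe.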


* Re: cx4 sriov problem in stable 4.14.47
  2018-06-20 18:44 ` Greg KH
@ 2018-07-12 18:01   ` Or Gerlitz
  0 siblings, 0 replies; 8+ messages in thread
From: Or Gerlitz @ 2018-07-12 18:01 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-rdma, Saeed Mahameed, Stable, Roy Shterman, Greg KH

On Wed, Jun 20, 2018 at 9:44 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Jun 20, 2018 at 01:40:16PM +0300, Sagi Grimberg wrote:
>> Hey folks,
>>
>> Seems that CX4 (Ethernet) sriov ports probe with link down in the latest
>> stable 4.14 kernel. I upgraded the firmware, but with no luck :(
>
> Did this work on older kernels?  If so, can you use 'git bisect' to
> track down the offending commit?

so... what was/is the resolution here?


Or.

