* nvmeof Issues with Zen 3/Ryzen 5000 Initiator
@ 2021-05-26 20:47 Jonathan Wright
  2021-05-26 21:50 ` Chaitanya Kulkarni
  2021-05-27 21:36 ` Sagi Grimberg
  0 siblings, 2 replies; 10+ messages in thread
From: Jonathan Wright @ 2021-05-26 20:47 UTC (permalink / raw)
  To: linux-nvme

I've been testing NVMe over Fabrics for the past few weeks and the 
performance has been nothing short of incredible, though I'm running 
into some major issues that seem to be specifically related to AMD Zen 
3 Ryzen chips (in my case I'm testing with a 5900x).

Target:
Supermicro X10 board
Xeon E5-2620v4
Intel E810 NIC

Problematic Client/initiator:
ASRock X570 board
Ryzen 9 5900x
Intel E810 NIC

Stable Client/initiator:
Supermicro X10 board
Xeon E5-2620v4
Intel E810 NIC

I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  The 
NICs are directly connected with the DACs and there is no switch in the 
equation.  To trigger the issue I'm simply running fio with something 
similar to this:

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
--name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
--readwrite=randread --time_based --runtime=1200

I'm primarily using RDMA/iWARP right now but I've also tested RoCE2, 
which presents the same issues/symptoms.  Primary testing has been done 
with Ubuntu 20.04.2, with CentOS 8 in the mix as well just to try and 
rule out a weird distro-specific issue.  All tests used the latest 
ice/irdma drivers from Intel (1.5.8 and 1.5.2, respectively).
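
For reference, the connection itself is just the standard nvme-cli 
RDMA connect; the address and NQN below are placeholders rather than 
my exact values:

nvme connect -t rdma -a 192.168.100.1 -s 4420 -n nqn.2021-05.io.example:target1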

I've not yet tested a Ryzen 5900x target with an Intel initiator but I 
plan to, to see if it exhibits the same instability.

The issue presents itself as a loss of connectivity between the two 
hosts - but there is no actual network connectivity problem.  The issue 
is also somewhat inconsistent.  Sometimes it will show up after 1-2 
minutes of testing, sometimes instantly, and sometimes close to 10 
minutes in.

Target dmesg sample:
[ 3867.598007] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[ 3867.598384] nvmet: ctrl 1 fatal error occurred!

Initiator dmesg sample:
<snip>
[  348.122160] nvme nvme4: I/O 86 QID 17 timeout
[  348.122224] nvme nvme4: I/O 87 QID 17 timeout
[  348.122290] nvme nvme4: I/O 88 QID 17 timeout
[  348.122354] nvme nvme4: I/O 89 QID 17 timeout
[  348.122417] nvme nvme4: I/O 90 QID 17 timeout
[  348.122480] nvme nvme4: I/O 91 QID 17 timeout
[  348.122544] nvme nvme4: I/O 92 QID 17 timeout
[  348.122607] nvme nvme4: I/O 93 QID 17 timeout
[  348.122670] nvme nvme4: I/O 94 QID 17 timeout
[  348.122733] nvme nvme4: I/O 95 QID 17 timeout
[  348.122796] nvme nvme4: I/O 96 QID 17 timeout
<snip>
[  380.387212] nvme nvme4: creating 24 I/O queues.
[  380.573925] nvme nvme4: Successfully reconnected (1 attempts)

All the while the underlying connectivity is working just fine. There's 
a long delay between the timeout and the successful reconnect.  I 
haven't timed it but it seems like about 5 minutes.  This has luckily 
given me plenty of time to test connectivity, which has consistently 
been just fine on all fronts.

I'm testing with a single Micron 9300 Pro 7.68TB right now which can 
push about 850K read IOPS.  On the Intel target/initiator combo I can 
run it "balls to the walls" for hours on end with 0 issues.  On the AMD 
initiator I can trigger the disconnect/drop generally within 5 minutes.  
Here's where things get weird - if I limit the test to 200K IOPS or less 
then it's relatively stable on the AMD side and I've not seen any drops 
while that limit is in place.
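
For the rate-limited runs I'm just capping the same fio job, roughly 
like this (the exact cap varies per test):

fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
    --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G \
    --readwrite=randread --time_based --runtime=1200 --rate_iops=200000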

Here are some things I've tried which make no difference (or make things 
worse):

Ubuntu 20.04.2 kernel 5.4
Ubuntu 20.04.2 kernel 5.8
Ubuntu 20.04.2 kernel 5.10
CentOS 8 kernel 4.18
CentOS 8 kernel 5.10 (from elrepo)
CentOS 8 kernel 5.12 (from elrepo) - whole system actually freezes upon 
"nvme connect" command on this one
With and without multipath (native) - toggled roughly as sketched after 
this list
With and without round-robin on multipath (native)
Different NVMe drive models
With and without PFC
10G DAC
25G DAC
25G DAC negotiated at 10G
With and without a switch
iWARP and RoCE2
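
For reference, native multipath was switched on/off and its I/O policy 
changed roughly like this (a sketch - the subsystem index is a 
placeholder, and the module parameter may equally be set on the kernel 
command line):

# disable (or enable) native NVMe multipath via the nvme_core parameter
echo "options nvme_core multipath=N" > /etc/modprobe.d/nvme.conf
# select round-robin (vs. the default numa) I/O policy for a subsystem
echo round-robin > /sys/class/nvme-subsystem/nvme-subsys0/iopolicy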

I did do some testing with TCP/IP but cannot reach the >200k IOPS 
threshold with it, which seems to be important for triggering the 
issue.  I did not experience the drops with TCP/IP.
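
The TCP runs used the same connect syntax with the tcp transport 
(address and NQN placeholders again):

nvme connect -t tcp -a 192.168.100.1 -s 4420 -n nqn.2021-05.io.example:target1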

I can't seem to draw any conclusion other than this being something 
specific to Zen 3, but I'm not sure why.  Is there somewhere I should be 
looking aside from "dmesg" to get some useful debug info?  According to 
the irdma driver there are no rdma packets being lost, dropped, or 
erroring out.  Common things like rping and ib_read_bw/ib_write_bw tests 
all run indefinitely without error.
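
Those checks were the usual rping/perftest invocations, roughly as 
below (the address is a placeholder, not my exact command line):

rping -s -a 192.168.100.1 -v          # on the target
rping -c -a 192.168.100.1 -v          # on the initiator
ib_read_bw                            # server side, on the target
ib_read_bw 192.168.100.1              # client side, on the initiator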

I would appreciate any help or advice with this or how I can help 
confirm if this is indeed specific to Zen 3.

-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com



* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-26 20:47 nvmeof Issues with Zen 3/Ryzen 5000 Initiator Jonathan Wright
@ 2021-05-26 21:50 ` Chaitanya Kulkarni
  2021-05-26 23:07   ` Jonathan Wright
  2021-05-27 21:36 ` Sagi Grimberg
  1 sibling, 1 reply; 10+ messages in thread
From: Chaitanya Kulkarni @ 2021-05-26 21:50 UTC (permalink / raw)
  To: Jonathan Wright, linux-nvme

On 5/26/21 13:53, Jonathan Wright wrote:
> Here are some things I've tried which make no difference (or make things 
> worse):
>
> Ubuntu 20.04.2 kernel 5.4.
> Ubuntu 20.04.2 kernel 5.8
> Ubuntu 20.04.2 kernel 5.10
> CentOS 8 kernel 4.18
> CentOS 8 kernel 5.10 (from elrepo)
> CentOS 8 kernel 5.12 (from elrepo) - whole system actually freezes upon 
> "nvme connect" command on this one
> With and without multipath (native)
> With and without round-robin on multipath (native)
> Different NVMe drive models
> With and without PFC
> 10G DAC
> 25G DAC
> 25G DAC negotiated at 10G
> With and without a switch
> iWARP and RoCE2

I think you need to talk to the respective distros.

Have you seen any issues with the nvme repo's nvme-5.14 branch?




* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-26 21:50 ` Chaitanya Kulkarni
@ 2021-05-26 23:07   ` Jonathan Wright
  2021-05-27  2:36     ` Chaitanya Kulkarni
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2021-05-26 23:07 UTC (permalink / raw)
  To: linux-nvme

I grabbed/built the kernel from the nvme-5.14 branch 
(http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.14) and 
managed to get the NIC drivers compiled.

Unfortunately the issue still persists.

Target:
[  790.591424] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[  790.591846] nvmet: ctrl 1 fatal error occurred!

Initiator:
[  688.683933] nvme nvme2: I/O 1 QID 2 timeout
[  688.684066] nvme nvme2: starting error recovery
[  688.684089] nvme nvme2: I/O 2 QID 2 timeout
[  691.756492] nvme nvme2: I/O 3 QID 2 timeout
[  691.756522] nvme nvme2: I/O 4 QID 2 timeout
[  691.756542] nvme nvme2: I/O 5 QID 2 timeout
[  691.756561] nvme nvme2: I/O 6 QID 2 timeout
[  691.756578] nvme nvme2: I/O 7 QID 2 timeout
[  691.757922] nvme nvme2: I/O 8 QID 2 timeout
[  691.758995] nvme nvme2: I/O 9 QID 2 timeout
[  691.760063] nvme nvme2: I/O 10 QID 2 timeout
[  691.761129] nvme nvme2: I/O 11 QID 2 timeout
[  691.762192] nvme nvme2: I/O 12 QID 2 timeout
[  691.763236] nvme nvme2: I/O 13 QID 2 timeout
[  691.764268] nvme nvme2: I/O 14 QID 2 timeout
[  691.765283] nvme nvme2: I/O 15 QID 2 timeout
[  691.766285] nvme nvme2: I/O 16 QID 2 timeout
[  691.767268] nvme nvme2: I/O 17 QID 2 timeout
[  691.768230] nvme nvme2: I/O 18 QID 2 timeout
[  691.768275] block nvme2n1: no usable path - requeuing I/O
[  691.769194] nvme nvme2: I/O 19 QID 2 timeout
[  691.770265] block nvme2n1: no usable path - requeuing I/O
[  691.771210] nvme nvme2: I/O 20 QID 2 timeout
[  691.772159] block nvme2n1: no usable path - requeuing I/O
[  691.773086] nvme nvme2: I/O 21 QID 2 timeout
<snip>

[root@nvmeof-client-centos ~]# uname -a
Linux nvmeof-client-centos 5.12.0+ #2 SMP Wed May 26 18:39:31 EDT 2021 
x86_64 x86_64 x86_64 GNU/Linux

I wasn't totally sure who to start talking to about this issue, but I 
became pretty confident that it wasn't distro-specific, which is why I 
sent my message here.  If I'm still in the wrong place please let me know.

On 5/26/21 4:50 PM, Chaitanya Kulkarni wrote:
> On 5/26/21 13:53, Jonathan Wright wrote:
>> Here are some things I've tried which make no difference (or make things
>> worse):
>>
>> Ubuntu 20.04.2 kernel 5.4.
>> Ubuntu 20.04.2 kernel 5.8
>> Ubuntu 20.04.2 kernel 5.10
>> CentOS 8 kernel 4.18
>> CentOS 8 kernel 5.10 (from elrepo)
>> CentOS 8 kernel 5.12 (from elrepo) - whole system actually freezes upon
>> "nvme connect" command on this one
>> With and without multipath (native)
>> With and without round-robin on multipath (native)
>> Different NVMe drive models
>> With and without PFC
>> 10G DAC
>> 25G DAC
>> 25G DAC negotiated at 10G
>> With and without a switch
>> iWARP and RoCE2
> I think you need to talk to respective distros.
>
> Have you seen any issues with the nvme repo branch 5.14 ?
>
>
>

-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com



* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-26 23:07   ` Jonathan Wright
@ 2021-05-27  2:36     ` Chaitanya Kulkarni
  2021-05-27 16:51       ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Chaitanya Kulkarni @ 2021-05-27  2:36 UTC (permalink / raw)
  To: Jonathan Wright; +Cc: linux-nvme

On 5/26/21 16:14, Jonathan Wright wrote:
> I grabbed/built the kernel from the nvme-5.14 branch 
> (http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.14) and 
> managed to get the NIC drivers compiled.
>
> Unfortunately the issue still persists.

It will be easier for everyone to help you fix this if you can bisect
the problem.
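
Roughly like this (a sketch - the "good" point below is just an 
example, assuming you have any last-known-good kernel at all):

git bisect start
git bisect bad                 # current build that reproduces the timeouts
git bisect good v5.4           # last kernel believed to work, if one exists
# build/boot each commit git suggests, re-run the fio test, then mark it:
git bisect good                # or: git bisect bad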




* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-27  2:36     ` Chaitanya Kulkarni
@ 2021-05-27 16:51       ` Keith Busch
  2021-05-27 16:57         ` Jonathan Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2021-05-27 16:51 UTC (permalink / raw)
  To: Chaitanya Kulkarni; +Cc: Jonathan Wright, linux-nvme

On Thu, May 27, 2021 at 02:36:59AM +0000, Chaitanya Kulkarni wrote:
> On 5/26/21 16:14, Jonathan Wright wrote:
> > I grabbed/built the kernel from the nvme-5.14 branch 
> > (http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.14) and 
> > managed to get the NIC drivers compiled.
> >
> > Unfortunately the issue still persists.
> 
> It will allow everyone to help you to fix this if you can bisect
> the problem.

You need a "good" commit point in order to bisect, and we don't have one
here. The description sounds like no kernel release ever worked.


* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-27 16:51       ` Keith Busch
@ 2021-05-27 16:57         ` Jonathan Wright
  0 siblings, 0 replies; 10+ messages in thread
From: Jonathan Wright @ 2021-05-27 16:57 UTC (permalink / raw)
  To: Keith Busch, Chaitanya Kulkarni; +Cc: linux-nvme

Yes, in an attempt to put forth more effort to help here I've spent the 
morning compiling various branches of the nvme kernel tree.

5.8 - NIC drivers won't compile, cannot test
5.9 - Issue persists and when it happens the entire system locks up
5.10 - Issue persists

Honestly, I don't see much point in continuing further with such tests.  
I think this is something that has never worked with Zen 3 and is not a 
regression.

I'm gathering up more hardware to test later today on some other 
combinations - namely Zen 2 (Ryzen 3600) to see if the issue exists 
there.  I have some EPYCs (Zen 2) as well but it'd be later next week 
before I can get them in my lab for testing.

On 5/27/21 11:51 AM, Keith Busch wrote:
> On Thu, May 27, 2021 at 02:36:59AM +0000, Chaitanya Kulkarni wrote:
>> On 5/26/21 16:14, Jonathan Wright wrote:
>>> I grabbed/built the kernel from the nvme-5.14 branch
>>> (http://git.infradead.org/nvme.git/shortlog/refs/heads/nvme-5.14) and
>>> managed to get the NIC drivers compiled.
>>>
>>> Unfortunately the issue still persists.
>> It will allow everyone to help you to fix this if you can bisect
>> the problem.
> You need a "good" commit point in order to bisect, and we don't have one
> here. The description sounds like no kernel release ever worked.

-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com



* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-26 20:47 nvmeof Issues with Zen 3/Ryzen 5000 Initiator Jonathan Wright
  2021-05-26 21:50 ` Chaitanya Kulkarni
@ 2021-05-27 21:36 ` Sagi Grimberg
  2021-06-03 16:57   ` Jonathan Wright
  1 sibling, 1 reply; 10+ messages in thread
From: Sagi Grimberg @ 2021-05-27 21:36 UTC (permalink / raw)
  To: Jonathan Wright, linux-nvme, Shiraz Saleem


> I've been testing NVMe over Fabrics for the past few weeks and the 
> performance has been nothing short of incredible, though I'm running 
> into some major issues that seems to be specifically related to AMD Zen 
> 3 Ryzen chips (in my case I'm testing with 5900x).
> 
> Target:
> Supermicro X10 board
> Xeon E5-2620v4
> Intel E810 NIC
> 
> Problematic Client/initiator:
> ASRock X570 board
> Ryzen 9 5900x
> Intel E810 NIC
> 
> Stable Client/initiator:
> Supermicro X10 board
> Xeon E5-2620v4
> Intel E810 NIC
> 
> I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  The 
> NICs are directly connected with the DACs and there is no switch in the 
> equation.  To trigger the issue I'm simply using FIO similar to this:
> 
> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
> --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
> --readwrite=randread --time_based --runtime=1200
> 
> I'm primarily using RDMA/iWARP right now but I've also tested RoCE2 
> which presents the same issues/symptoms.  Primary testing has been done 
> with Ubuntu 20.04.2 with CentOS 8 in the mix as well just to try and 
> rule out a weird distro-specific issue.  All tests used the latest 
> ice/irdma drivers from Intel (1.5.8 and 1.5.2 respectively)

CCing Shiraz Saleem, who maintains irdma.

> 
> I've not yet tested a Ryzen 5900x target with an Intel initiator but i 
> plan to to see if it exhibits the same instability.
> 
> The issue presents itself as a connectivity loss between the two hosts - 
> but there is no connectivity issue.  The issue is also somewhat 
> inconsistent.  Sometimes it will show up after 1-2 minutes of testing, 
> sometimes instantly, and sometimes close to 10 minutes in.
> 
> Target dmesg sample:
> [ 3867.598007] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [ 3867.598384] nvmet: ctrl 1 fatal error occurred!
> 
> Initiator dmesg sample:
> <snip>
> [  348.122160] nvme nvme4: I/O 86 QID 17 timeout
> [  348.122224] nvme nvme4: I/O 87 QID 17 timeout
> [  348.122290] nvme nvme4: I/O 88 QID 17 timeout
> [  348.122354] nvme nvme4: I/O 89 QID 17 timeout
> [  348.122417] nvme nvme4: I/O 90 QID 17 timeout
> [  348.122480] nvme nvme4: I/O 91 QID 17 timeout
> [  348.122544] nvme nvme4: I/O 92 QID 17 timeout
> [  348.122607] nvme nvme4: I/O 93 QID 17 timeout
> [  348.122670] nvme nvme4: I/O 94 QID 17 timeout
> [  348.122733] nvme nvme4: I/O 95 QID 17 timeout
> [  348.122796] nvme nvme4: I/O 96 QID 17 timeout
> <snip>
> [  380.387212] nvme nvme4: creating 24 I/O queues.
> [  380.573925] nvme nvme4: Successfully reconnected (1 attempts)
> 
> All the while the underlying connectivity is working just fine. There's 
> a long delay between the timeout and the successful reconnect.  I 
> haven't timed it but it seems like about 5 minutes. This has luckily 
> given me plenty of time to test connectivity which has consistently been 
> just fine on all fronts.

Seems like loss of connectivity from the driver's perspective.
While this is happening, can you try an rdma application like
ib_send_bw/ib_send_lat or something?

I'd also suggest running both workloads concurrently to see if they
both suffer from the connectivity issue; this will help rule out
whether it is something specific to the nvme-rdma driver.
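
Something like this in parallel with the fio job (device name and 
address are placeholders):

ib_send_bw -d <rdma-dev> -D 1200                # server side, on the target
ib_send_bw -d <rdma-dev> -D 1200 <target-ip>    # client side, alongside fio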

> 
> I'm testing with a single Micron 9300 Pro 7.68TB right now which can 
> push about 850k read IOPs.  On the Intel target/initiator combo I can 
> run it "balls to the walls" for hours on end with 0 issues.  On the AMD 
> initiator I can trigger the disconnect/drop generally within 5 minutes. 
> Here's where things get weird - if I limit the test to 200K IOPs or less 
> then it's relatively stable on the AMD and I've not seen any drops when 
> this limitation is in place.
> 
> Here are some things I've tried which make no difference (or make things 
> worse):
> 
> Ubuntu 20.04.2 kernel 5.4.
> Ubuntu 20.04.2 kernel 5.8
> Ubuntu 20.04.2 kernel 5.10
> CentOS 8 kernel 4.18
> CentOS 8 kernel 5.10 (from elrepo)
> CentOS 8 kernel 5.12 (from elrepo) - whole system actually freezes upon 
> "nvme connect" command on this one
> With and without multipath (native)
> With and without round-robin on multipath (native)
> Different NVMe drive models
> With and without PFC
> 10G DAC
> 25G DAC
> 25G DAC negotiated at 10G
> With and without a switch
> iWARP and RoCE2

Looks like this probably always existed...

> 
> I did do some testing with TCP/IP but cannot reach the >200k IOPS 
> threshold with it which seems to be important for triggering the issue. 
> I did not experience the drops with TCP/IP.
> 
> I can't seem to draw any conclusion other than this being something 
> specific to Zen 3, but I'm not sure why.  Is there somewhere I should be 
> looking aside from "dmesg" to get some useful debug info?  According to 
> the irdma driver there are no rdma packets getting 
> lost/dropped/erroring, etc.  Common things like rping and 
> ib_read_bw/ib_write_bw tests all run indefinitely without error.

Ah, that is an important detail.

I think a packet sniffer can help here if this is the case, IIRC
there should be a way to sniff rdma traffic using tcpdump but I don't
remember the details. Perhaps the Intel folks can help you there...
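
If your libpcap is new enough (1.9+) and built with ibverbs support, 
the rdma device may show up as a capture interface - I'm not sure the 
E810/irdma exposes the required sniffer mode, so treat this as a rough 
sketch only:

tcpdump -D                          # check whether the rdma device is listed
tcpdump -i <rdma-dev> -w rdma.pcap  # capture on it if so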


* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-05-27 21:36 ` Sagi Grimberg
@ 2021-06-03 16:57   ` Jonathan Wright
  2021-06-21 16:04     ` Jonathan Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2021-06-03 16:57 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Shiraz Saleem


>> I've been testing NVMe over Fabrics for the past few weeks and the 
>> performance has been nothing short of incredible, though I'm running 
>> into some major issues that seems to be specifically related to AMD 
>> Zen 3 Ryzen chips (in my case I'm testing with 5900x).
>>
>> Target:
>> Supermicro X10 board
>> Xeon E5-2620v4
>> Intel E810 NIC
>>
>> Problematic Client/initiator:
>> ASRock X570 board
>> Ryzen 9 5900x
>> Intel E810 NIC
>>
>> Stable Client/initiator:
>> Supermicro X10 board
>> Xeon E5-2620v4
>> Intel E810 NIC
>>
>> I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  
>> The NICs are directly connected with the DACs and there is no switch 
>> in the equation.  To trigger the issue I'm simply using FIO similar 
>> to this:
>>
>> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
>> --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
>> --readwrite=randread --time_based --runtime=1200
>>
>> I'm primarily using RDMA/iWARP right now but I've also tested RoCE2 
>> which presents the same issues/symptoms.  Primary testing has been 
>> done with Ubuntu 20.04.2 with CentOS 8 in the mix as well just to try 
>> and rule out a weird distro-specific issue. All tests used the latest 
>> ice/irdma drivers from Intel (1.5.8 and 1.5.2 respectively)
>
> CCing Shiraz Saleem who maintains irdma.

Thanks.  I've done some testing now with Mellanox ConnectX-4 cards with 
Zen 3 and the issue does not exist.  This seems to point the finger at 
something specific between irdma and Zen 2/3, since irdma/E810 works 
fine on all-Intel hardware.  I tested the Mellanox cards on both the 
Ubuntu 20.04 stock kernel (5.4) and CentOS 8.3 (stock kernel 4.18), the 
same combinations I tested the E810 on.

Further, I tested an AMD target with an Intel initiator and the issue 
still exists, so it doesn't seem to matter which end the Zen 3 (and/or 
Zen 2) chip is on when paired with an E810/irdma.

The issue also exists with Zen 2 (Ryzen 3600).

@Shiraz, since I guess this isn't a common setup right now, let me know 
if I can be of any assistance with getting to the bottom of this 
apparent incompatibility.

>
>>
>> I've not yet tested a Ryzen 5900x target with an Intel initiator but 
>> i plan to to see if it exhibits the same instability.
>>
>> The issue presents itself as a connectivity loss between the two 
>> hosts - but there is no connectivity issue.  The issue is also 
>> somewhat inconsistent.  Sometimes it will show up after 1-2 minutes 
>> of testing, sometimes instantly, and sometimes close to 10 minutes in.
>>
>> Target dmesg sample:
>> [ 3867.598007] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
>> [ 3867.598384] nvmet: ctrl 1 fatal error occurred!
>>
>> Initiator dmesg sample:
>> <snip>
>> [  348.122160] nvme nvme4: I/O 86 QID 17 timeout
>> [  348.122224] nvme nvme4: I/O 87 QID 17 timeout
>> [  348.122290] nvme nvme4: I/O 88 QID 17 timeout
>> [  348.122354] nvme nvme4: I/O 89 QID 17 timeout
>> [  348.122417] nvme nvme4: I/O 90 QID 17 timeout
>> [  348.122480] nvme nvme4: I/O 91 QID 17 timeout
>> [  348.122544] nvme nvme4: I/O 92 QID 17 timeout
>> [  348.122607] nvme nvme4: I/O 93 QID 17 timeout
>> [  348.122670] nvme nvme4: I/O 94 QID 17 timeout
>> [  348.122733] nvme nvme4: I/O 95 QID 17 timeout
>> [  348.122796] nvme nvme4: I/O 96 QID 17 timeout
>> <snip>
>> [  380.387212] nvme nvme4: creating 24 I/O queues.
>> [  380.573925] nvme nvme4: Successfully reconnected (1 attempts)
>>
>> All the while the underlying connectivity is working just fine. 
>> There's a long delay between the timeout and the successful 
>> reconnect.  I haven't timed it but it seems like about 5 minutes. 
>> This has luckily given me plenty of time to test connectivity which 
>> has consistently been just fine on all fronts.
>
> Seems like loss of connectivity from the driver perspective.
> While this is happening, can you try an rdma application like
> ib_send_bwib_send_lat or something?
>
> I'd also suggest to run both workloads concurrently and see if they
> both suffer from a connectivity issue, this will help rule out
> if this is something specific to the nvme-rdma driver.
>
>>
>> I'm testing with a single Micron 9300 Pro 7.68TB right now which can 
>> push about 850k read IOPs.  On the Intel target/initiator combo I can 
>> run it "balls to the walls" for hours on end with 0 issues.  On the 
>> AMD initiator I can trigger the disconnect/drop generally within 5 
>> minutes. Here's where things get weird - if I limit the test to 200K 
>> IOPs or less then it's relatively stable on the AMD and I've not seen 
>> any drops when this limitation is in place.
>>
>> Here are some things I've tried which make no difference (or make 
>> things worse):
>>
>> Ubuntu 20.04.2 kernel 5.4.
>> Ubuntu 20.04.2 kernel 5.8
>> Ubuntu 20.04.2 kernel 5.10
>> CentOS 8 kernel 4.18
>> CentOS 8 kernel 5.10 (from elrepo)
>> CentOS 8 kernel 5.12 (from elrepo) - whole system actually freezes 
>> upon "nvme connect" command on this one
>> With and without multipath (native)
>> With and without round-robin on multipath (native)
>> Different NVMe drive models
>> With and without PFC
>> 10G DAC
>> 25G DAC
>> 25G DAC negotiated at 10G
>> With and without a switch
>> iWARP and RoCE2
>
> Looks like this probably always existed...
>
>>
>> I did do some testing with TCP/IP but cannot reach the >200k IOPS 
>> threshold with it which seems to be important for triggering the 
>> issue. I did not experience the drops with TCP/IP.
>>
>> I can't seem to draw any conclusion other than this being something 
>> specific to Zen 3, but I'm not sure why.  Is there somewhere I should 
>> be looking aside from "dmesg" to get some useful debug info?  
>> According to the irdma driver there are no rdma packets getting 
>> lost/dropped/erroring, etc.  Common things like rping and 
>> ib_read_bw/ib_write_bw tests all run indefinitely without error.
>
> Ah, that is an important detail.
>
> I think that packet sniffer can help here if this is the case, IIRC
> there should be way to sniff rdma traffic using tcpdump but I don't
> remember the details. Perhaps Intel folks can help you there...

-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com



* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-06-03 16:57   ` Jonathan Wright
@ 2021-06-21 16:04     ` Jonathan Wright
  2021-06-21 16:06       ` Jonathan Wright
  0 siblings, 1 reply; 10+ messages in thread
From: Jonathan Wright @ 2021-06-21 16:04 UTC (permalink / raw)
  To: Sagi Grimberg, linux-nvme, Shiraz Saleem


On 6/3/21 11:57 AM, Jonathan Wright wrote:
>
>>> I've been testing NVMe over Fabrics for the past few weeks and the 
>>> performance has been nothing short of incredible, though I'm running 
>>> into some major issues that seems to be specifically related to AMD 
>>> Zen 3 Ryzen chips (in my case I'm testing with 5900x).
>>>
>>> Target:
>>> Supermicro X10 board
>>> Xeon E5-2620v4
>>> Intel E810 NIC
>>>
>>> Problematic Client/initiator:
>>> ASRock X570 board
>>> Ryzen 9 5900x
>>> Intel E810 NIC
>>>
>>> Stable Client/initiator:
>>> Supermicro X10 board
>>> Xeon E5-2620v4
>>> Intel E810 NIC
>>>
>>> I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  
>>> The NICs are directly connected with the DACs and there is no switch 
>>> in the equation.  To trigger the issue I'm simply using FIO similar 
>>> to this:
>>>
>>> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
>>> --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
>>> --readwrite=randread --time_based --runtime=1200
>>>
>>> I'm primarily using RDMA/iWARP right now but I've also tested RoCE2 
>>> which presents the same issues/symptoms.  Primary testing has been 
>>> done with Ubuntu 20.04.2 with CentOS 8 in the mix as well just to 
>>> try and rule out a weird distro-specific issue. All tests used the 
>>> latest ice/irdma drivers from Intel (1.5.8 and 1.5.2 respectively)
>>
>> CCing Shiraz Saleem who maintains irdma.
>
> Thanks.  I've done some testing now with Mellanox ConnectX-4 cards 
> with Zen 3 and the issue does not exist.  This seems to point the 
> finger to something specific between irdma and Zen 2/3 since 
> irdma/E810 works fine on all-Intel hardware.  I tested the Mellanox on 
> both Ubuntu 20.04 stock kernel (5.4) and CentOS 8.3 (stock kernel 
> 4.18) as I tested the E810 on these combinations.
>
> Further I tested an AMD target with an Intel initiator and the issue 
> still exists so it doesn't seem to matter which end the Zen 3 (and/or 
> Zen2) chip is on when paired with an E810/irdma.
>
> The issue also exists with Zen 2 (Ryzen 3600).
>
> @Shriaz since I guess this isn't a common setup right now let me know 
> if I can be of any assistance with getting to the bottom of this 
> seeming incompatibility.

@Intel, is there any desire to fix this currently?  I'll be reassigning 
the current lab equipment, including the Zen 3 systems, to other tasks 
soon and will no longer be able to help troubleshoot or test.  Without a 
fix we'll have to look into other vendors for our fabric setups.



* Re: nvmeof Issues with Zen 3/Ryzen 5000 Initiator
  2021-06-21 16:04     ` Jonathan Wright
@ 2021-06-21 16:06       ` Jonathan Wright
  0 siblings, 0 replies; 10+ messages in thread
From: Jonathan Wright @ 2021-06-21 16:06 UTC (permalink / raw)
  To: linux-nvme, Shiraz Saleem, tatyana.e.nikolova, mustafa.ismail

Adding Tatyana E. Nikolova and Mustafa Ismail to CC per the 
auto-responder from @Shiraz.

On 6/21/21 11:04 AM, Jonathan Wright wrote:
>
> On 6/3/21 11:57 AM, Jonathan Wright wrote:
>>
>>>> I've been testing NVMe over Fabrics for the past few weeks and the 
>>>> performance has been nothing short of incredible, though I'm 
>>>> running into some major issues that seems to be specifically 
>>>> related to AMD Zen 3 Ryzen chips (in my case I'm testing with 5900x).
>>>>
>>>> Target:
>>>> Supermicro X10 board
>>>> Xeon E5-2620v4
>>>> Intel E810 NIC
>>>>
>>>> Problematic Client/initiator:
>>>> ASRock X570 board
>>>> Ryzen 9 5900x
>>>> Intel E810 NIC
>>>>
>>>> Stable Client/initiator:
>>>> Supermicro X10 board
>>>> Xeon E5-2620v4
>>>> Intel E810 NIC
>>>>
>>>> I'm using the same 2 E810 NICs and pair of 25G DACs in both cases.  
>>>> The NICs are directly connected with the DACs and there is no 
>>>> switch in the equation.  To trigger the issue I'm simply using FIO 
>>>> similar to this:
>>>>
>>>> fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 
>>>> --name=test --filename=/dev/nvme0n1 --bs=4k --iodepth=64 --size=10G 
>>>> --readwrite=randread --time_based --runtime=1200
>>>>
>>>> I'm primarily using RDMA/iWARP right now but I've also tested RoCE2 
>>>> which presents the same issues/symptoms. Primary testing has been 
>>>> done with Ubuntu 20.04.2 with CentOS 8 in the mix as well just to 
>>>> try and rule out a weird distro-specific issue. All tests used the 
>>>> latest ice/irdma drivers from Intel (1.5.8 and 1.5.2 respectively)
>>>
>>> CCing Shiraz Saleem who maintains irdma.
>>
>> Thanks.  I've done some testing now with Mellanox ConnectX-4 cards 
>> with Zen 3 and the issue does not exist.  This seems to point the 
>> finger to something specific between irdma and Zen 2/3 since 
>> irdma/E810 works fine on all-Intel hardware.  I tested the Mellanox 
>> on both Ubuntu 20.04 stock kernel (5.4) and CentOS 8.3 (stock kernel 
>> 4.18) as I tested the E810 on these combinations.
>>
>> Further I tested an AMD target with an Intel initiator and the issue 
>> still exists so it doesn't seem to matter which end the Zen 3 (and/or 
>> Zen2) chip is on when paired with an E810/irdma.
>>
>> The issue also exists with Zen 2 (Ryzen 3600).
>>
>> @Shriaz since I guess this isn't a common setup right now let me know 
>> if I can be of any assistance with getting to the bottom of this 
>> seeming incompatibility.
>
> @Intel is there any desire to fix this currently?  I'll be sending 
> this current lab equipment with the Zen3 systems to other tasks soon 
> and will no longer be able to help troubleshoot or test. Without a fix 
> we'll have to look into other vendors for fabric setups.
>
-- 
Jonathan Wright
KnownHost, LLC
https://www.knownhost.com


