Linux-NVME Archive on lore.kernel.org
* [PATCH v2 0/5] avoid race for time out
@ 2020-10-22  2:14 Chao Leng
  2020-10-28 11:36 ` Yi Zhang
  0 siblings, 1 reply; 8+ messages in thread
From: Chao Leng @ 2020-10-22  2:14 UTC (permalink / raw)
  To: linux-nvme; +Cc: kbusch, axboe, hch, lengchao, sagi

First, avoid the race between timeout and teardown for rdma and tcp.
Second, avoid repeated request completion from the timeout handler for rdma and tcp.

V2:
	- add avoiding repeated request completion in time out

Chao Leng (3):
  nvme-core: introduce sync io queues
  nvme-rdma: avoid race between time out and tear down
  nvme-tcp: avoid race between time out and tear down

Sagi Grimberg (2):
  nvme-rdma: avoid repeated request completion
  nvme-tcp: avoid repeated request completion

 drivers/nvme/host/core.c |  8 ++++++--
 drivers/nvme/host/nvme.h |  1 +
 drivers/nvme/host/rdma.c | 14 +++-----------
 drivers/nvme/host/tcp.c  | 16 ++++------------
 4 files changed, 14 insertions(+), 25 deletions(-)

-- 
2.16.4


_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-22  2:14 [PATCH v2 0/5] avoid race for time out Chao Leng
@ 2020-10-28 11:36 ` Yi Zhang
  2020-10-28 13:25   ` Ming Lei
                     ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Yi Zhang @ 2020-10-28 11:36 UTC (permalink / raw)
  To: Chao Leng, sagi, Ming Lei; +Cc: kbusch, axboe, hch, linux-nvme

Hello

This series fixes the WARNING issue I reported [1], but now nvme/012 [2] 
hangs and never finishes; here is the log [3].
[1]
https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/

[2]
[root@hpe-xw9400-02 blktests]# nvme_trtype=tcp ./check nvme/012
nvme/012 (run mkfs and data verification fio job on NVMeOF block 
device-backed ns)
     runtime  1199.651s  ...

[3]
[  120.550409] run blktests nvme/012 at 2020-10-28 06:50:11
[  121.138234] loop: module loaded
[  121.170869] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[  121.215930] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[  121.288229] nvmet: creating controller 1 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[  121.302597] nvme nvme0: creating 12 I/O queues.
[  121.308361] nvme nvme0: mapped 12/0/0 default/read/poll queues.
[  121.320030] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr 
127.0.0.1:4420
[  123.278903] XFS (nvme0n1): Mounting V5 Filesystem
[  123.291608] XFS (nvme0n1): Ending clean mount
[  123.297321] xfs filesystem being mounted at /mnt/blktests supports 
timestamps until 2038 (0x7fffffff)
[  183.872118] nvme nvme0: queue 1: timeout request 0x6c type 4
[  183.877792] nvme nvme0: starting error recovery
[  183.882376] nvme nvme0: queue 8: timeout request 0x11 type 4
[  183.888149] nvme nvme0: queue 8: timeout request 0x12 type 4
[  183.893805] nvme nvme0: queue 8: timeout request 0x13 type 4
[  183.899469] nvme nvme0: queue 8: timeout request 0x14 type 4
[  183.905130] nvme nvme0: queue 8: timeout request 0x15 type 4
[  183.910792] nvme nvme0: queue 8: timeout request 0x16 type 4
[  183.916453] nvme nvme0: queue 8: timeout request 0x17 type 4
[  183.922114] nvme nvme0: queue 8: timeout request 0x18 type 4
[  183.927777] nvme nvme0: queue 8: timeout request 0x19 type 4
[  183.933450] nvme nvme0: queue 8: timeout request 0x1a type 4
[  183.939110] nvme nvme0: queue 8: timeout request 0x1b type 4
[  183.944771] nvme nvme0: queue 8: timeout request 0x1c type 4
[  183.950431] nvme nvme0: queue 8: timeout request 0x1d type 4
[  183.956095] nvme nvme0: queue 8: timeout request 0x1e type 4
[  183.961755] nvme nvme0: queue 8: timeout request 0x1f type 4
[  183.967414] nvme nvme0: queue 8: timeout request 0x20 type 4
[  183.973218] block nvme0n1: no usable path - requeuing I/O
[  183.978623] block nvme0n1: no usable path - requeuing I/O
[  183.982492] nvme nvme0: Reconnecting in 10 seconds...
[  183.984022] block nvme0n1: no usable path - requeuing I/O
[  183.994476] block nvme0n1: no usable path - requeuing I/O
[  183.999870] block nvme0n1: no usable path - requeuing I/O
[  184.005264] block nvme0n1: no usable path - requeuing I/O
[  184.010669] block nvme0n1: no usable path - requeuing I/O
[  184.016080] block nvme0n1: no usable path - requeuing I/O
[  184.021463] block nvme0n1: no usable path - requeuing I/O
[  184.026858] block nvme0n1: no usable path - requeuing I/O
[  209.472647] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
[  209.478662] nvmet: ctrl 2 fatal error occurred!
[  213.568765] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[  213.574782] nvmet: ctrl 1 fatal error occurred!
[  238.064572] nvmet: creating controller 2 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[  256.577658] nvme nvme0: queue 0: timeout request 0x0 type 4
[  256.583333] nvme nvme0: Connect command failed, error wo/DNR bit: 881
[  256.589806] nvme nvme0: failed to connect queue: 0 ret=881
[  256.595326] nvme nvme0: Failed reconnect attempt 1
[  256.600119] nvme nvme0: Reconnecting in 10 seconds...
[  266.818455] nvmet: creating controller 1 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[  266.832356] nvme_ns_head_submit_bio: 30 callbacks suppressed
[  266.832362] block nvme0n1: no usable path - requeuing I/O
[  266.843443] block nvme0n1: no usable path - requeuing I/O
[  266.848848] block nvme0n1: no usable path - requeuing I/O
[  266.854244] block nvme0n1: no usable path - requeuing I/O
[  266.859663] block nvme0n1: no usable path - requeuing I/O
[  266.865059] block nvme0n1: no usable path - requeuing I/O
[  266.870454] block nvme0n1: no usable path - requeuing I/O
[  266.875845] block nvme0n1: no usable path - requeuing I/O
[  266.881234] block nvme0n1: no usable path - requeuing I/O
[  266.886632] block nvme0n1: no usable path - requeuing I/O
[  266.892237] nvme nvme0: creating 12 I/O queues.
[  266.903744] nvme nvme0: mapped 12/0/0 default/read/poll queues.
[  266.911929] nvme nvme0: Successfully reconnected (2 attempt)
[  327.747177] nvme nvme0: queue 2: timeout request 0x1e type 4
[  327.752883] nvme nvme0: starting error recovery
[  327.757450] nvme nvme0: queue 4: timeout request 0x63 type 4
[  327.763511] nvme_ns_head_submit_bio: 14 callbacks suppressed
[  327.763520] block nvme0n1: no usable path - requeuing I/O
[  327.774614] block nvme0n1: no usable path - requeuing I/O
[  327.780053] block nvme0n1: no usable path - requeuing I/O
[  327.785450] block nvme0n1: no usable path - requeuing I/O
[  327.790876] block nvme0n1: no usable path - requeuing I/O
[  327.796316] block nvme0n1: no usable path - requeuing I/O
[  327.801727] block nvme0n1: no usable path - requeuing I/O
[  327.807231] block nvme0n1: no usable path - requeuing I/O
[  327.812627] block nvme0n1: no usable path - requeuing I/O
[  327.818025] block nvme0n1: no usable path - requeuing I/O
[  353.859745] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[  353.865761] nvmet: ctrl 1 fatal error occurred!


On 10/22/20 10:14 AM, Chao Leng wrote:
> First, avoid the race between timeout and teardown for rdma and tcp.
> Second, avoid repeated request completion from the timeout handler for rdma and tcp.
>
> V2:
> 	- add avoiding repeated request completion in time out
>
> Chao Leng (3):
>    nvme-core: introduce sync io queues
>    nvme-rdma: avoid race between time out and tear down
>    nvme-tcp: avoid race between time out and tear down
>
> Sagi Grimberg (2):
>    nvme-rdma: avoid repeated request completion
>    nvme-tcp: avoid repeated request completion
>
>   drivers/nvme/host/core.c |  8 ++++++--
>   drivers/nvme/host/nvme.h |  1 +
>   drivers/nvme/host/rdma.c | 14 +++-----------
>   drivers/nvme/host/tcp.c  | 16 ++++------------
>   4 files changed, 14 insertions(+), 25 deletions(-)
>



* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-28 11:36 ` Yi Zhang
@ 2020-10-28 13:25   ` Ming Lei
  2020-10-29  6:13   ` Chao Leng
  2020-10-29 21:00   ` Sagi Grimberg
  2 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2020-10-28 13:25 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Sagi Grimberg, linux-nvme, Ming Lei, Jens Axboe, Chao Leng,
	Keith Busch, Christoph Hellwig

On Wed, Oct 28, 2020 at 7:38 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> Hello
>
> This series fixes the WARNING issue I reported [1], but now nvme/012
> [2] hangs and never finishes; here is the log [3].

Hello Yi,

Please try the following patch against Chao's patches and see if the
hang issue can be fixed:

https://lore.kernel.org/linux-block/20201020085301.1553959-2-ming.lei@redhat.com/T/#u


Thanks,
Ming Lei


* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-28 11:36 ` Yi Zhang
  2020-10-28 13:25   ` Ming Lei
@ 2020-10-29  6:13   ` Chao Leng
  2020-10-29 21:00   ` Sagi Grimberg
  2 siblings, 0 replies; 8+ messages in thread
From: Chao Leng @ 2020-10-29  6:13 UTC (permalink / raw)
  To: Yi Zhang, sagi, Ming Lei; +Cc: kbusch, axboe, hch, linux-nvme



On 2020/10/28 19:36, Yi Zhang wrote:
> Hello
> 
> This series fixes the WARNING issue I reported [1], but now nvme/012 [2] hangs and never finishes; here is the log [3].
This is another bug. Requests may hang in two scenarios:
1. With nvme native multipath, when no path is available, requests
hang until all controllers are deleted.
2. Without multipath, while the controller is reconnecting, requests
hang until the controller is deleted.
This patch may fix the request hang:
https://lore.kernel.org/linux-nvme/319b8b1869f34a48b26fbd902883ed71@kioxia.com/
It has been under discussion for a long time.

> [1]
> https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
> 
> [2]
> [root@hpe-xw9400-02 blktests]# nvme_trtype=tcp ./check nvme/012
> nvme/012 (run mkfs and data verification fio job on NVMeOF block device-backed ns)
>      runtime  1199.651s  ...
> 
> [3]
> [  120.550409] run blktests nvme/012 at 2020-10-28 06:50:11
> [  121.138234] loop: module loaded
> [  121.170869] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
> [  121.215930] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
> [  121.288229] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
> [  121.302597] nvme nvme0: creating 12 I/O queues.
> [  121.308361] nvme nvme0: mapped 12/0/0 default/read/poll queues.
> [  121.320030] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr 127.0.0.1:4420
> [  123.278903] XFS (nvme0n1): Mounting V5 Filesystem
> [  123.291608] XFS (nvme0n1): Ending clean mount
> [  123.297321] xfs filesystem being mounted at /mnt/blktests supports timestamps until 2038 (0x7fffffff)
> [  183.872118] nvme nvme0: queue 1: timeout request 0x6c type 4
> [  183.877792] nvme nvme0: starting error recovery
> [  183.882376] nvme nvme0: queue 8: timeout request 0x11 type 4
> [  183.888149] nvme nvme0: queue 8: timeout request 0x12 type 4
> [  183.893805] nvme nvme0: queue 8: timeout request 0x13 type 4
> [  183.899469] nvme nvme0: queue 8: timeout request 0x14 type 4
> [  183.905130] nvme nvme0: queue 8: timeout request 0x15 type 4
> [  183.910792] nvme nvme0: queue 8: timeout request 0x16 type 4
> [  183.916453] nvme nvme0: queue 8: timeout request 0x17 type 4
> [  183.922114] nvme nvme0: queue 8: timeout request 0x18 type 4
> [  183.927777] nvme nvme0: queue 8: timeout request 0x19 type 4
> [  183.933450] nvme nvme0: queue 8: timeout request 0x1a type 4
> [  183.939110] nvme nvme0: queue 8: timeout request 0x1b type 4
> [  183.944771] nvme nvme0: queue 8: timeout request 0x1c type 4
> [  183.950431] nvme nvme0: queue 8: timeout request 0x1d type 4
> [  183.956095] nvme nvme0: queue 8: timeout request 0x1e type 4
> [  183.961755] nvme nvme0: queue 8: timeout request 0x1f type 4
> [  183.967414] nvme nvme0: queue 8: timeout request 0x20 type 4
> [  183.973218] block nvme0n1: no usable path - requeuing I/O
> [  183.978623] block nvme0n1: no usable path - requeuing I/O
> [  183.982492] nvme nvme0: Reconnecting in 10 seconds...
> [  183.984022] block nvme0n1: no usable path - requeuing I/O
> [  183.994476] block nvme0n1: no usable path - requeuing I/O
> [  183.999870] block nvme0n1: no usable path - requeuing I/O
> [  184.005264] block nvme0n1: no usable path - requeuing I/O
> [  184.010669] block nvme0n1: no usable path - requeuing I/O
> [  184.016080] block nvme0n1: no usable path - requeuing I/O
> [  184.021463] block nvme0n1: no usable path - requeuing I/O
> [  184.026858] block nvme0n1: no usable path - requeuing I/O
> [  209.472647] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
> [  209.478662] nvmet: ctrl 2 fatal error occurred!
> [  213.568765] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [  213.574782] nvmet: ctrl 1 fatal error occurred!
> [  238.064572] nvmet: creating controller 2 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
> [  256.577658] nvme nvme0: queue 0: timeout request 0x0 type 4
> [  256.583333] nvme nvme0: Connect command failed, error wo/DNR bit: 881
> [  256.589806] nvme nvme0: failed to connect queue: 0 ret=881
> [  256.595326] nvme nvme0: Failed reconnect attempt 1
> [  256.600119] nvme nvme0: Reconnecting in 10 seconds...
> [  266.818455] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
> [  266.832356] nvme_ns_head_submit_bio: 30 callbacks suppressed
> [  266.832362] block nvme0n1: no usable path - requeuing I/O
> [  266.843443] block nvme0n1: no usable path - requeuing I/O
> [  266.848848] block nvme0n1: no usable path - requeuing I/O
> [  266.854244] block nvme0n1: no usable path - requeuing I/O
> [  266.859663] block nvme0n1: no usable path - requeuing I/O
> [  266.865059] block nvme0n1: no usable path - requeuing I/O
> [  266.870454] block nvme0n1: no usable path - requeuing I/O
> [  266.875845] block nvme0n1: no usable path - requeuing I/O
> [  266.881234] block nvme0n1: no usable path - requeuing I/O
> [  266.886632] block nvme0n1: no usable path - requeuing I/O
> [  266.892237] nvme nvme0: creating 12 I/O queues.
> [  266.903744] nvme nvme0: mapped 12/0/0 default/read/poll queues.
> [  266.911929] nvme nvme0: Successfully reconnected (2 attempt)
> [  327.747177] nvme nvme0: queue 2: timeout request 0x1e type 4
> [  327.752883] nvme nvme0: starting error recovery
> [  327.757450] nvme nvme0: queue 4: timeout request 0x63 type 4
> [  327.763511] nvme_ns_head_submit_bio: 14 callbacks suppressed
> [  327.763520] block nvme0n1: no usable path - requeuing I/O
> [  327.774614] block nvme0n1: no usable path - requeuing I/O
> [  327.780053] block nvme0n1: no usable path - requeuing I/O
> [  327.785450] block nvme0n1: no usable path - requeuing I/O
> [  327.790876] block nvme0n1: no usable path - requeuing I/O
> [  327.796316] block nvme0n1: no usable path - requeuing I/O
> [  327.801727] block nvme0n1: no usable path - requeuing I/O
> [  327.807231] block nvme0n1: no usable path - requeuing I/O
> [  327.812627] block nvme0n1: no usable path - requeuing I/O
> [  327.818025] block nvme0n1: no usable path - requeuing I/O
> [  353.859745] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
> [  353.865761] nvmet: ctrl 1 fatal error occurred!
> 
> 
> On 10/22/20 10:14 AM, Chao Leng wrote:
>> First, avoid the race between timeout and teardown for rdma and tcp.
>> Second, avoid repeated request completion from the timeout handler for rdma and tcp.
>>
>> V2:
>>     - add avoiding repeated request completion in time out
>>
>> Chao Leng (3):
>>    nvme-core: introduce sync io queues
>>    nvme-rdma: avoid race between time out and tear down
>>    nvme-tcp: avoid race between time out and tear down
>>
>> Sagi Grimberg (2):
>>    nvme-rdma: avoid repeated request completion
>>    nvme-tcp: avoid repeated request completion
>>
>>   drivers/nvme/host/core.c |  8 ++++++--
>>   drivers/nvme/host/nvme.h |  1 +
>>   drivers/nvme/host/rdma.c | 14 +++-----------
>>   drivers/nvme/host/tcp.c  | 16 ++++------------
>>   4 files changed, 14 insertions(+), 25 deletions(-)
>>
> 
> .


* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-28 11:36 ` Yi Zhang
  2020-10-28 13:25   ` Ming Lei
  2020-10-29  6:13   ` Chao Leng
@ 2020-10-29 21:00   ` Sagi Grimberg
  2020-10-29 21:01     ` Sagi Grimberg
  2020-10-30  0:04     ` Yi Zhang
  2 siblings, 2 replies; 8+ messages in thread
From: Sagi Grimberg @ 2020-10-29 21:00 UTC (permalink / raw)
  To: Yi Zhang, Chao Leng, Ming Lei; +Cc: kbusch, axboe, hch, linux-nvme


> Hello
> 
> This series fixes the WARNING issue I reported [1], but now nvme/012
> [2] hangs and never finishes; here is the log [3].
> [1]
> https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/ 
> 
> 
> [2]
> [root@hpe-xw9400-02 blktests]# nvme_trtype=tcp ./check nvme/012
> nvme/012 (run mkfs and data verification fio job on NVMeOF block 
> device-backed ns)
>      runtime  1199.651s  ...

Hey Yi,

This is a different issue. As I said, the test is not designed to
trigger this scenario, so it is odd that it happens at all.

For debug purposes, can you switch to run blktests with siw for
comparison?

This patch set should move forward for inclusion regardless.


* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-29 21:00   ` Sagi Grimberg
@ 2020-10-29 21:01     ` Sagi Grimberg
  2020-10-30  0:04     ` Yi Zhang
  1 sibling, 0 replies; 8+ messages in thread
From: Sagi Grimberg @ 2020-10-29 21:01 UTC (permalink / raw)
  To: Yi Zhang, Chao Leng, Ming Lei; +Cc: kbusch, axboe, hch, linux-nvme


> For debug purposes, can you switch to run blktests with siw for
> comparison?

Disregard this comment; I forgot which transport we are talking about
for a second...


* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-29 21:00   ` Sagi Grimberg
  2020-10-29 21:01     ` Sagi Grimberg
@ 2020-10-30  0:04     ` Yi Zhang
  2020-10-30  1:00       ` Ming Lei
  1 sibling, 1 reply; 8+ messages in thread
From: Yi Zhang @ 2020-10-30  0:04 UTC (permalink / raw)
  To: Sagi Grimberg, Chao Leng, Ming Lei; +Cc: kbusch, axboe, hch, linux-nvme



On 10/30/20 5:00 AM, Sagi Grimberg wrote:
>
>> Hello
>>
>> This series fixes the WARNING issue I reported [1], but now
>> nvme/012 [2] hangs and never finishes; here is the log [3].
>> [1]
>> https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/ 
>>
>>
>> [2]
>> [root@hpe-xw9400-02 blktests]# nvme_trtype=tcp ./check nvme/012
>> nvme/012 (run mkfs and data verification fio job on NVMeOF block 
>> device-backed ns)
>>      runtime  1199.651s  ...
>
> Hey Yi,
>
> This is a different issue. As I said, the test is not designed to
> trigger this scenario, so it is odd that it happens at all.
>
OK, I will keep monitoring this issue. The original issue was also seen 
on the stable branch; should we also CC stable?

Hi Ming

Just FYI: with your patch on top of this series, the test passed 100 
runs. There were still timeouts, but the controller eventually 
reconnected and the test finished.
Here is the log:
[47534.756506] run blktests nvme/012 at 2020-10-29 10:34:52
[47534.934534] nvmet: adding nsid 1 to subsystem blktests-subsystem-1
[47534.949247] nvmet_tcp: enabling port 0 (127.0.0.1:4420)
[47534.965075] nvmet: creating controller 1 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[47534.979730] nvme nvme0: creating 12 I/O queues.
[47534.985596] nvme nvme0: mapped 12/0/0 default/read/poll queues.
[47534.997327] nvme nvme0: new ctrl: NQN "blktests-subsystem-1", addr 
127.0.0.1:4420
[47536.885732] XFS (nvme0n1): Mounting V5 Filesystem
[47536.899366] XFS (nvme0n1): Ending clean mount
[47536.905082] xfs filesystem being mounted at /mnt/blktests supports 
timestamps until 2038 (0x7fffffff)
[47597.518197] nvme nvme0: queue 7: timeout request 0x63 type 4
[47597.523884] nvme nvme0: starting error recovery
[47597.528457] nvme nvme0: queue 11: timeout request 0x21 type 4
[47597.534865] nvme nvme0: queue 11: timeout request 0x22 type 4
[47597.540607] nvme nvme0: queue 11: timeout request 0x23 type 4
[47597.546351] nvme nvme0: queue 11: timeout request 0x24 type 4
[47597.552090] nvme nvme0: queue 11: timeout request 0x25 type 4
[47597.557830] nvme nvme0: queue 11: timeout request 0x26 type 4
[47597.563570] nvme nvme0: queue 11: timeout request 0x27 type 4
[47597.569310] nvme nvme0: queue 11: timeout request 0x28 type 4
[47597.575049] nvme nvme0: queue 11: timeout request 0x29 type 4
[47597.580803] nvme nvme0: queue 11: timeout request 0x2a type 4
[47597.586544] nvme nvme0: queue 11: timeout request 0x2b type 4
[47597.592287] nvme nvme0: queue 11: timeout request 0x2c type 4
[47597.598026] nvme nvme0: queue 11: timeout request 0x2d type 4
[47597.603765] nvme nvme0: queue 11: timeout request 0x2e type 4
[47597.609505] nvme nvme0: queue 11: timeout request 0x2f type 4
[47597.615244] nvme nvme0: queue 11: timeout request 0x30 type 4
[47597.621052] block nvme0n1: no usable path - requeuing I/O
[47597.622230] nvme nvme0: Reconnecting in 10 seconds...
[47597.626452] block nvme0n1: no usable path - requeuing I/O
[47597.636885] block nvme0n1: no usable path - requeuing I/O
[47597.642276] block nvme0n1: no usable path - requeuing I/O
[47597.647670] block nvme0n1: no usable path - requeuing I/O
[47597.653076] block nvme0n1: no usable path - requeuing I/O
[47597.658470] block nvme0n1: no usable path - requeuing I/O
[47597.663874] block nvme0n1: no usable path - requeuing I/O
[47597.669272] block nvme0n1: no usable path - requeuing I/O
[47597.674676] block nvme0n1: no usable path - requeuing I/O
[47623.118733] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
[47623.124745] nvmet: ctrl 2 fatal error occurred!
[47627.214836] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[47627.220856] nvmet: ctrl 1 fatal error occurred!
[47647.671350] nvmet: creating controller 2 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[47668.175734] nvme nvme0: queue 0: timeout request 0x0 type 4
[47668.181397] nvme nvme0: Connect command failed, error wo/DNR bit: 881
[47668.187884] nvme nvme0: failed to connect queue: 0 ret=881
[47668.193725] nvme nvme0: Failed reconnect attempt 1
[47668.198510] nvme nvme0: Reconnecting in 10 seconds...
[47678.416555] nvmet: creating controller 1 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[47678.430569] nvme_ns_head_submit_bio: 30 callbacks suppressed
[47678.430577] block nvme0n1: no usable path - requeuing I/O
[47678.441645] block nvme0n1: no usable path - requeuing I/O
[47678.447042] block nvme0n1: no usable path - requeuing I/O
[47678.452446] block nvme0n1: no usable path - requeuing I/O
[47678.457862] block nvme0n1: no usable path - requeuing I/O
[47678.463787] block nvme0n1: no usable path - requeuing I/O
[47678.469180] block nvme0n1: no usable path - requeuing I/O
[47678.474569] block nvme0n1: no usable path - requeuing I/O
[47678.479955] block nvme0n1: no usable path - requeuing I/O
[47678.485344] block nvme0n1: no usable path - requeuing I/O
[47678.490977] nvme nvme0: creating 12 I/O queues.
[47678.503323] nvme nvme0: mapped 12/0/0 default/read/poll queues.
[47678.511558] nvme nvme0: Successfully reconnected (2 attempt)
[47743.953398] nvme nvme0: queue 3: timeout request 0x31 type 4
[47743.959069] nvme nvme0: starting error recovery
[47743.963627] nvme nvme0: queue 3: timeout request 0x32 type 4
[47743.969367] nvme nvme0: queue 3: timeout request 0x33 type 4
[47743.975021] nvme nvme0: queue 3: timeout request 0x34 type 4
[47743.980826] nvme nvme0: queue 3: timeout request 0x35 type 4
[47743.986482] nvme nvme0: queue 3: timeout request 0x36 type 4
[47743.992291] nvme nvme0: queue 3: timeout request 0x37 type 4
[47743.997953] nvme nvme0: queue 3: timeout request 0x38 type 4
[47744.003609] nvme nvme0: queue 3: timeout request 0x39 type 4
[47744.009272] nvme nvme0: queue 3: timeout request 0x3a type 4
[47744.014949] nvme nvme0: queue 3: timeout request 0x3b type 4
[47744.020933] nvme nvme0: queue 3: timeout request 0x3c type 4
[47744.026589] nvme nvme0: queue 3: timeout request 0x3d type 4
[47744.032251] nvme nvme0: queue 3: timeout request 0x3e type 4
[47744.037914] nvme nvme0: queue 3: timeout request 0x3f type 4
[47744.043569] nvme nvme0: queue 3: timeout request 0x40 type 4
[47744.049227] nvme nvme0: queue 3: timeout request 0x41 type 4
[47744.054895] nvme nvme0: queue 5: timeout request 0x11 type 4
[47744.060659] nvme nvme0: queue 5: timeout request 0x12 type 4
[47744.066348] nvme nvme0: queue 5: timeout request 0x13 type 4
[47744.071999] nvme nvme0: queue 5: timeout request 0x14 type 4
[47744.077646] nvme nvme0: queue 5: timeout request 0x15 type 4
[47744.083305] nvme nvme0: queue 5: timeout request 0x1e type 4
[47744.088959] nvme nvme0: queue 5: timeout request 0x1f type 4
[47744.094608] nvme nvme0: queue 5: timeout request 0x20 type 4
[47744.100263] nvme nvme0: queue 11: timeout request 0x24 type 4
[47744.106083] nvme_ns_head_submit_bio: 14 callbacks suppressed
[47744.106087] block nvme0n1: no usable path - requeuing I/O
[47744.107067] nvme nvme0: Reconnecting in 10 seconds...
[47744.111745] block nvme0n1: no usable path - requeuing I/O
[47744.127552] block nvme0n1: no usable path - requeuing I/O
[47744.133104] block nvme0n1: no usable path - requeuing I/O
[47744.138493] block nvme0n1: no usable path - requeuing I/O
[47744.143885] block nvme0n1: no usable path - requeuing I/O
[47744.149280] block nvme0n1: no usable path - requeuing I/O
[47744.154668] block nvme0n1: no usable path - requeuing I/O
[47744.160066] block nvme0n1: no usable path - requeuing I/O
[47744.165459] block nvme0n1: no usable path - requeuing I/O
[47769.553925] nvmet: ctrl 2 keep-alive timer (15 seconds) expired!
[47769.559934] nvmet: ctrl 2 fatal error occurred!
[47770.577940] nvmet: ctrl 1 keep-alive timer (15 seconds) expired!
[47770.583962] nvmet: ctrl 1 fatal error occurred!
[47814.247111] nvmet: creating controller 2 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[47815.634922] nvme nvme0: queue 0: timeout request 0x1 type 4
[47815.640593] nvme nvme0: Connect command failed, error wo/DNR bit: 881
[47815.647088] nvme nvme0: failed to connect queue: 0 ret=881
[47815.653011] nvme nvme0: Failed reconnect attempt 1
[47815.657792] nvme nvme0: Reconnecting in 10 seconds...
[47825.875752] nvmet: creating controller 1 for subsystem 
blktests-subsystem-1 for NQN 
nqn.2014-08.org.nvmexpress:uuid:ffe2b140e76a45649005853f3b871859.
[47825.889828] nvme_ns_head_submit_bio: 101 callbacks suppressed
[47825.889835] block nvme0n1: no usable path - requeuing I/O
[47825.900998] block nvme0n1: no usable path - requeuing I/O
[47825.906396] block nvme0n1: no usable path - requeuing I/O
[47825.911800] block nvme0n1: no usable path - requeuing I/O
[47825.917213] block nvme0n1: no usable path - requeuing I/O
[47825.922607] block nvme0n1: no usable path - requeuing I/O
[47825.928004] block nvme0n1: no usable path - requeuing I/O
[47825.933387] block nvme0n1: no usable path - requeuing I/O
[47825.938779] block nvme0n1: no usable path - requeuing I/O
[47825.944169] block nvme0n1: no usable path - requeuing I/O
[47825.949765] nvme nvme0: creating 12 I/O queues.
[47825.962904] nvme nvme0: mapped 12/0/0 default/read/poll queues.
[47825.970958] nvme nvme0: Successfully reconnected (2 attempt)
[47831.211728] XFS (nvme0n1): Unmounting Filesystem
[47831.362650] nvme nvme0: Removing ctrl: NQN "blktests-subsystem-1"


> For debug purposes, can you switch to run blktests with siw for
> comparison?
>
> This patch set should move forward for inclusion regardless.
>


* Re: [PATCH v2 0/5] avoid race for time out
  2020-10-30  0:04     ` Yi Zhang
@ 2020-10-30  1:00       ` Ming Lei
  0 siblings, 0 replies; 8+ messages in thread
From: Ming Lei @ 2020-10-30  1:00 UTC (permalink / raw)
  To: Yi Zhang; +Cc: Sagi Grimberg, linux-nvme, axboe, Chao Leng, kbusch, hch

On Fri, Oct 30, 2020 at 08:04:07AM +0800, Yi Zhang wrote:
> 
> 
> On 10/30/20 5:00 AM, Sagi Grimberg wrote:
> > 
> > > Hello
> > > 
> > > This series fixes the WARNING issue I reported [1], but now
> > > nvme/012 [2] hangs and never finishes; here is the log [3].
> > > [1]
> > > https://lore.kernel.org/linux-nvme/1934331639.3314730.1602152202454.JavaMail.zimbra@redhat.com/
> > > 
> > > 
> > > [2]
> > > [root@hpe-xw9400-02 blktests]# nvme_trtype=tcp ./check nvme/012
> > > nvme/012 (run mkfs and data verification fio job on NVMeOF block
> > > device-backed ns)
> > >      runtime  1199.651s  ...
> > 
> > Hey Yi,
> > 
> > This is a different issue. As I said, the test is not designed to
> > trigger this scenario, so it is odd that it happens at all.
> > 
> OK, I will keep monitoring this issue. The original issue was also seen
> on the stable branch; should we also CC stable?
> 
> Hi Ming
> 
> Just FYI: with your patch on top of this series, the test passed 100
> runs. There were still timeouts, but the controller eventually
> reconnected and the test finished.

Hi Yi,

Thanks for your test. I will re-post the patch on linux-block today.


Thanks,
Ming


