* [bug report] NVMe/IB: reset_controller need more than 1min
@ 2021-05-21 16:12 ` Yi Zhang
  0 siblings, 0 replies; 45+ messages in thread
From: Yi Zhang @ 2021-05-21 16:12 UTC (permalink / raw)
  To: linux-nvme, linux-rdma; +Cc: Sagi Grimberg, maxg

Hi,
I found this issue on 5.13-rc2 in an NVMe/IB environment; could anyone
help check it?
Thanks.

$ time echo 1 >/sys/block/nvme0n1/device/reset_controller
real 0m10.678s
user 0m0.000s
sys 0m0.000s
$ time echo 1 >/sys/block/nvme0n1/device/reset_controller
real 1m11.530s
user 0m0.000s
sys 0m0.000s

target:
$ dmesg | grep nvme
[  276.891454] nvmet: creating controller 1 for subsystem testnqn for
NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
[  287.374412] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[  287.399317] nvmet: ctrl 1 fatal error occurred!
[  348.412672] nvmet: creating controller 1 for subsystem testnqn for
NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.

client:
$ dmesg | grep nvme
[  281.704475] nvme nvme0: creating 40 I/O queues.
[  285.557759] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[  353.187809] nvme nvme0: I/O 8 QID 0 timeout
[  353.193100] nvme nvme0: Property Set error: 881, offset 0x14
[  353.226082] nvme nvme0: creating 40 I/O queues.
[  357.088266] nvme nvme0: mapped 40/0/0 default/read/poll queues.
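
For reference, a setup along the following lines reproduces the timing
test above; the backing namespace device, IP address and port below are
placeholders rather than the actual configuration, only the subsystem
NQN "testnqn" comes from the logs:

# target side: export a namespace over RDMA via the nvmet configfs
modprobe nvmet-rdma
mkdir /sys/kernel/config/nvmet/subsystems/testnqn
echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/attr_allow_any_host
mkdir /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1
echo /dev/nvme1n1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/device_path
echo 1 > /sys/kernel/config/nvmet/subsystems/testnqn/namespaces/1/enable
mkdir /sys/kernel/config/nvmet/ports/1
echo rdma > /sys/kernel/config/nvmet/ports/1/addr_trtype
echo ipv4 > /sys/kernel/config/nvmet/ports/1/addr_adrfam
echo 172.31.0.1 > /sys/kernel/config/nvmet/ports/1/addr_traddr
echo 4420 > /sys/kernel/config/nvmet/ports/1/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/testnqn \
      /sys/kernel/config/nvmet/ports/1/subsystems/testnqn

# host side: connect over the IB interface and time the reset
modprobe nvme-rdma
nvme connect -t rdma -a 172.31.0.1 -s 4420 -n testnqn
time nvme reset /dev/nvme0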


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-05-21 16:12 ` Yi Zhang
@ 2021-05-21 18:00   ` Sagi Grimberg
  -1 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2021-05-21 18:00 UTC (permalink / raw)
  To: Yi Zhang, linux-nvme, linux-rdma; +Cc: maxg


> Hi
> I found this issue on 5.13-rc2 with NVMe/IB environment, could anyone
> help check it?
> Thanks.
> 
> $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> real 0m10.678s
> user 0m0.000s
> sys 0m0.000s
> $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> real 1m11.530s
> user 0m0.000s
> sys 0m0.000s
> 
> target:
> $ dmesg | grep nvme
> [  276.891454] nvmet: creating controller 1 for subsystem testnqn for
> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> [  287.374412] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> [  287.399317] nvmet: ctrl 1 fatal error occurred!
> [  348.412672] nvmet: creating controller 1 for subsystem testnqn for
> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> 
> client:
> $ dmesg | grep nvme
> [  281.704475] nvme nvme0: creating 40 I/O queues.
> [  285.557759] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [  353.187809] nvme nvme0: I/O 8 QID 0 timeout
> [  353.193100] nvme nvme0: Property Set error: 881, offset 0x14
> [  353.226082] nvme nvme0: creating 40 I/O queues.
> [  357.088266] nvme nvme0: mapped 40/0/0 default/read/poll queues.

It appears that there is an admin timeout that is either triggered
by the reset or unrelated.

Can you run nvme reset /dev/nvme0 instead so we can see the "resetting
controller" print?


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-05-21 18:00   ` Sagi Grimberg
@ 2021-05-22  4:27     ` Yi Zhang
  -1 siblings, 0 replies; 45+ messages in thread
From: Yi Zhang @ 2021-05-22  4:27 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, linux-rdma

On Sat, May 22, 2021 at 2:00 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> > Hi
> > I found this issue on 5.13-rc2 with NVMe/IB environment, could anyone
> > help check it?
> > Thanks.
> >
> > $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> > real 0m10.678s
> > user 0m0.000s
> > sys 0m0.000s
> > $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> > real 1m11.530s
> > user 0m0.000s
> > sys 0m0.000s
> >
> > target:
> > $ dmesg | grep nvme
> > [  276.891454] nvmet: creating controller 1 for subsystem testnqn for
> > NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > [  287.374412] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> > [  287.399317] nvmet: ctrl 1 fatal error occurred!
> > [  348.412672] nvmet: creating controller 1 for subsystem testnqn for
> > NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> >
> > client:
> > $ dmesg | grep nvme
> > [  281.704475] nvme nvme0: creating 40 I/O queues.
> > [  285.557759] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> > [  353.187809] nvme nvme0: I/O 8 QID 0 timeout
> > [  353.193100] nvme nvme0: Property Set error: 881, offset 0x14
> > [  353.226082] nvme nvme0: creating 40 I/O queues.
> > [  357.088266] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>
> It appears that there is an admin timeout that is either triggered
> by the reset or unrelated.
>
> Can you run nvme reset /dev/nvme0 instead so we can see the "resetting
> controller" print?
>
Yes, here is the log:
------------------------0
+ nvme reset /dev/nvme0
real 0m10.737s
user 0m0.004s
sys 0m0.004s
------------------------1
+ nvme reset /dev/nvme0
real 1m11.335s
user 0m0.002s
sys 0m0.005s

target:
[  934.306016] nvmet: creating controller 1 for subsystem testnqn for
NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
[  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[  944.900051] nvmet: ctrl 1 fatal error occurred!
[ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for
NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.

client:
[  857.264029] nvme nvme0: resetting controller
[  864.115369] nvme nvme0: creating 40 I/O queues.
[  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[  868.001673] nvme nvme0: resetting controller
[  935.396789] nvme nvme0: I/O 9 QID 0 timeout
[  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
[  935.438080] nvme nvme0: creating 40 I/O queues.
[  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-05-22  4:27     ` Yi Zhang
@ 2021-06-23 10:01       ` Yi Zhang
  -1 siblings, 0 replies; 45+ messages in thread
From: Yi Zhang @ 2021-06-23 10:01 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, linux-rdma

Hello

Gentle ping here; this issue still exists on the latest 5.13-rc7:

# time nvme reset /dev/nvme0

real 0m12.636s
user 0m0.002s
sys 0m0.005s
# time nvme reset /dev/nvme0

real 0m12.641s
user 0m0.000s
sys 0m0.007s
# time nvme reset /dev/nvme0

real 1m16.133s
user 0m0.000s
sys 0m0.007s

On Sat, May 22, 2021 at 12:27 PM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> On Sat, May 22, 2021 at 2:00 AM Sagi Grimberg <sagi@grimberg.me> wrote:
> >
> >
> > > Hi
> > > I found this issue on 5.13-rc2 with NVMe/IB environment, could anyone
> > > help check it?
> > > Thanks.
> > >
> > > $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> > > real 0m10.678s
> > > user 0m0.000s
> > > sys 0m0.000s
> > > $ time echo 1 >/sys/block/nvme0n1/device/reset_controller
> > > real 1m11.530s
> > > user 0m0.000s
> > > sys 0m0.000s
> > >
> > > target:
> > > $ dmesg | grep nvme
> > > [  276.891454] nvmet: creating controller 1 for subsystem testnqn for
> > > NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > > [  287.374412] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> > > [  287.399317] nvmet: ctrl 1 fatal error occurred!
> > > [  348.412672] nvmet: creating controller 1 for subsystem testnqn for
> > > NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > >
> > > client:
> > > $ dmesg | grep nvme
> > > [  281.704475] nvme nvme0: creating 40 I/O queues.
> > > [  285.557759] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> > > [  353.187809] nvme nvme0: I/O 8 QID 0 timeout
> > > [  353.193100] nvme nvme0: Property Set error: 881, offset 0x14
> > > [  353.226082] nvme nvme0: creating 40 I/O queues.
> > > [  357.088266] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >
> > It appears that there is an admin timeout that is either triggered
> > by the reset or unrelated.
> >
> > Can you run nvme reset /dev/nvme0 instead so we can see the "resetting
> > controller" print?
> >
> Yes, here is the log:
> ------------------------0
> + nvme reset /dev/nvme0
> real 0m10.737s
> user 0m0.004s
> sys 0m0.004s
> ------------------------1
> + nvme reset /dev/nvme0
> real 1m11.335s
> user 0m0.002s
> sys 0m0.005s
>
> target:
> [  934.306016] nvmet: creating controller 1 for subsystem testnqn for
> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> [  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> [  944.900051] nvmet: ctrl 1 fatal error occurred!
> [ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for
> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
>
> client:
> [  857.264029] nvme nvme0: resetting controller
> [  864.115369] nvme nvme0: creating 40 I/O queues.
> [  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [  868.001673] nvme nvme0: resetting controller
> [  935.396789] nvme nvme0: I/O 9 QID 0 timeout
> [  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
> [  935.438080] nvme nvme0: creating 40 I/O queues.
> [  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>
>
> --
> Best Regards,
>   Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-06-23 10:01       ` Yi Zhang
@ 2021-06-23 21:32         ` Sagi Grimberg
  -1 siblings, 0 replies; 45+ messages in thread
From: Sagi Grimberg @ 2021-06-23 21:32 UTC (permalink / raw)
  To: Yi Zhang; +Cc: linux-nvme, linux-rdma


> Hello
> 
> Gentle ping here, this issue still exists on latest 5.13-rc7
> 
> # time nvme reset /dev/nvme0
> 
> real 0m12.636s
> user 0m0.002s
> sys 0m0.005s
> # time nvme reset /dev/nvme0
> 
> real 0m12.641s
> user 0m0.000s
> sys 0m0.007s

Strange that even normal resets take so long...
What device are you using?

> # time nvme reset /dev/nvme0
> 
> real 1m16.133s
> user 0m0.000s
> sys 0m0.007s

There seems to be a spurious command timeout here, but maybe this
is due to the fact that the queues take so long to connect and
the target expires the keep-alive timer.

Does this patch help?
--
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index 7d0f3523fdab..f4a7db1ab3e5 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -142,6 +142,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl *ctrl, struct nvmet_req *req)
                 }
         }

+       /*
+        * Controller establishment flow may take some time, and the host may not
+        * send us keep-alive during this period, hence reset the
+        * traffic based keep-alive timer so we don't trigger a
+        * controller teardown as a result of a keep-alive expiration.
+        */
+       ctrl->reset_tbkas = true;
+
         return 0;

  err:
--
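
Independently of the patch, a rough way to sanity-check the keep-alive
theory from the host side is to reconnect with a larger KATO, giving the
5-second timer seen in the target log more slack; the address below is a
placeholder and the flags are standard nvme-cli options:

$ nvme disconnect -n testnqn
$ nvme connect -t rdma -a <target ip> -s 4420 -n testnqn --keep-alive-tmo=30
$ time nvme reset /dev/nvme0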

>> target:
>> [  934.306016] nvmet: creating controller 1 for subsystem testnqn for
>> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
>> [  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
>> [  944.900051] nvmet: ctrl 1 fatal error occurred!
>> [ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for
>> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
>>
>> client:
>> [  857.264029] nvme nvme0: resetting controller
>> [  864.115369] nvme nvme0: creating 40 I/O queues.
>> [  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>> [  868.001673] nvme nvme0: resetting controller
>> [  935.396789] nvme nvme0: I/O 9 QID 0 timeout
>> [  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
>> [  935.438080] nvme nvme0: creating 40 I/O queues.
>> [  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-06-23 21:32         ` Sagi Grimberg
@ 2021-06-24 16:14           ` Yi Zhang
  -1 siblings, 0 replies; 45+ messages in thread
From: Yi Zhang @ 2021-06-24 16:14 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: linux-nvme, linux-rdma

On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> > Hello
> >
> > Gentle ping here, this issue still exists on latest 5.13-rc7
> >
> > # time nvme reset /dev/nvme0
> >
> > real 0m12.636s
> > user 0m0.002s
> > sys 0m0.005s
> > # time nvme reset /dev/nvme0
> >
> > real 0m12.641s
> > user 0m0.000s
> > sys 0m0.007s
>
> Strange that even normal resets take so long...
> What device are you using?

Hi Sagi

Here is the device info:
Mellanox Technologies MT27700 Family [ConnectX-4]

>
> > # time nvme reset /dev/nvme0
> >
> > real 1m16.133s
> > user 0m0.000s
> > sys 0m0.007s
>
> There seems to be a spurious command timeout here, but maybe this
> is due to the fact that the queues take so long to connect and
> the target expires the keep-alive timer.
>
> Does this patch help?

The issue still exists; let me know if you need more testing for it. :)


> --
> diff --git a/drivers/nvme/target/fabrics-cmd.c
> b/drivers/nvme/target/fabrics-cmd.c
> index 7d0f3523fdab..f4a7db1ab3e5 100644
> --- a/drivers/nvme/target/fabrics-cmd.c
> +++ b/drivers/nvme/target/fabrics-cmd.c
> @@ -142,6 +142,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl
> *ctrl, struct nvmet_req *req)
>                  }
>          }
>
> +       /*
> +        * Controller establishment flow may take some time, and the
> host may not
> +        * send us keep-alive during this period, hence reset the
> +        * traffic based keep-alive timer so we don't trigger a
> +        * controller teardown as a result of a keep-alive expiration.
> +        */
> +       ctrl->reset_tbkas = true;
> +
>          return 0;
>
>   err:
> --
>
> >> target:
> >> [  934.306016] nvmet: creating controller 1 for subsystem testnqn for
> >> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> >> [  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> >> [  944.900051] nvmet: ctrl 1 fatal error occurred!
> >> [ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for
> >> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> >>
> >> client:
> >> [  857.264029] nvme nvme0: resetting controller
> >> [  864.115369] nvme nvme0: creating 40 I/O queues.
> >> [  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >> [  868.001673] nvme nvme0: resetting controller
> >> [  935.396789] nvme nvme0: I/O 9 QID 0 timeout
> >> [  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
> >> [  935.438080] nvme nvme0: creating 40 I/O queues.
> >> [  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-06-24 16:14           ` Yi Zhang
  (?)
@ 2021-12-11  3:01           ` Yi Zhang
  2021-12-12  9:45             ` Sagi Grimberg
  -1 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-11  3:01 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list

On Fri, Jun 25, 2021 at 12:14 AM Yi Zhang <yi.zhang@redhat.com> wrote:
>
> On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi@grimberg.me> wrote:
> >
> >
> > > Hello
> > >
> > > Gentle ping here, this issue still exists on latest 5.13-rc7
> > >
> > > # time nvme reset /dev/nvme0
> > >
> > > real 0m12.636s
> > > user 0m0.002s
> > > sys 0m0.005s
> > > # time nvme reset /dev/nvme0
> > >
> > > real 0m12.641s
> > > user 0m0.000s
> > > sys 0m0.007s
> >
> > Strange that even normal resets take so long...
> > What device are you using?
>
> Hi Sagi
>
> Here is the device info:
> Mellanox Technologies MT27700 Family [ConnectX-4]
>
> >
> > > # time nvme reset /dev/nvme0
> > >
> > > real 1m16.133s
> > > user 0m0.000s
> > > sys 0m0.007s
> >
> > There seems to be a spurious command timeout here, but maybe this
> > is due to the fact that the queues take so long to connect and
> > the target expires the keep-alive timer.
> >
> > Does this patch help?
>
> The issue still exists, let me know if you need more testing for it. :)

Hi Sagi,
Ping; this issue can still be reproduced on the latest
linux-block/for-next. Do you have a chance to recheck it? Thanks.


>
>
> > --
> > diff --git a/drivers/nvme/target/fabrics-cmd.c
> > b/drivers/nvme/target/fabrics-cmd.c
> > index 7d0f3523fdab..f4a7db1ab3e5 100644
> > --- a/drivers/nvme/target/fabrics-cmd.c
> > +++ b/drivers/nvme/target/fabrics-cmd.c
> > @@ -142,6 +142,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl
> > *ctrl, struct nvmet_req *req)
> >                  }
> >          }
> >
> > +       /*
> > +        * Controller establishment flow may take some time, and the
> > host may not
> > +        * send us keep-alive during this period, hence reset the
> > +        * traffic based keep-alive timer so we don't trigger a
> > +        * controller teardown as a result of a keep-alive expiration.
> > +        */
> > +       ctrl->reset_tbkas = true;
> > +
> >          return 0;
> >
> >   err:
> > --
> >
> > >> target:
> > >> [  934.306016] nvmet: creating controller 1 for subsystem testnqn for
> > >> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > >> [  944.875021] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> > >> [  944.900051] nvmet: ctrl 1 fatal error occurred!
> > >> [ 1005.628340] nvmet: creating controller 1 for subsystem testnqn for
> > >> NQN nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4c10-8058-b7c04f383432.
> > >>
> > >> client:
> > >> [  857.264029] nvme nvme0: resetting controller
> > >> [  864.115369] nvme nvme0: creating 40 I/O queues.
> > >> [  867.996746] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> > >> [  868.001673] nvme nvme0: resetting controller
> > >> [  935.396789] nvme nvme0: I/O 9 QID 0 timeout
> > >> [  935.402036] nvme nvme0: Property Set error: 881, offset 0x14
> > >> [  935.438080] nvme nvme0: creating 40 I/O queues.
> > >> [  939.332125] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >
>
>
> --
> Best Regards,
>   Yi Zhang



--
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-11  3:01           ` Yi Zhang
@ 2021-12-12  9:45             ` Sagi Grimberg
  2021-12-13  6:12               ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2021-12-12  9:45 UTC (permalink / raw)
  To: Yi Zhang; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list



On 12/11/21 5:01 AM, Yi Zhang wrote:
> On Fri, Jun 25, 2021 at 12:14 AM Yi Zhang <yi.zhang@redhat.com> wrote:
>>
>> On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi@grimberg.me> wrote:
>>>
>>>
>>>> Hello
>>>>
>>>> Gentle ping here, this issue still exists on latest 5.13-rc7
>>>>
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real 0m12.636s
>>>> user 0m0.002s
>>>> sys 0m0.005s
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real 0m12.641s
>>>> user 0m0.000s
>>>> sys 0m0.007s
>>>
>>> Strange that even normal resets take so long...
>>> What device are you using?
>>
>> Hi Sagi
>>
>> Here is the device info:
>> Mellanox Technologies MT27700 Family [ConnectX-4]
>>
>>>
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real 1m16.133s
>>>> user 0m0.000s
>>>> sys 0m0.007s
>>>
>>> There seems to be a spurious command timeout here, but maybe this
>>> is due to the fact that the queues take so long to connect and
>>> the target expires the keep-alive timer.
>>>
>>> Does this patch help?
>>
>> The issue still exists, let me know if you need more testing for it. :)
> 
> Hi Sagi
> ping, this issue still can be reproduced on the latest
> linux-block/for-next, do you have a chance to recheck it, thanks.

Can you check if it happens with the below patch:
--
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index f91a56180d3d..6e5aadfb07a0 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -191,6 +191,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl *ctrl, struct nvmet_req *req)
                 }
         }

+       /*
+        * Controller establishment flow may take some time, and the host may not
+        * send us keep-alive during this period, hence reset the
+        * traffic based keep-alive timer so we don't trigger a
+        * controller teardown as a result of a keep-alive expiration.
+        */
+       ctrl->reset_tbkas = true;
+
         return 0;

  err:
--


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-12  9:45             ` Sagi Grimberg
@ 2021-12-13  6:12               ` Yi Zhang
  2021-12-13  9:04                 ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-13  6:12 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list

On Sun, Dec 12, 2021 at 5:45 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
>
> On 12/11/21 5:01 AM, Yi Zhang wrote:
> > On Fri, Jun 25, 2021 at 12:14 AM Yi Zhang <yi.zhang@redhat.com> wrote:
> >>
> >> On Thu, Jun 24, 2021 at 5:32 AM Sagi Grimberg <sagi@grimberg.me> wrote:
> >>>
> >>>
> >>>> Hello
> >>>>
> >>>> Gentle ping here, this issue still exists on latest 5.13-rc7
> >>>>
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real 0m12.636s
> >>>> user 0m0.002s
> >>>> sys 0m0.005s
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real 0m12.641s
> >>>> user 0m0.000s
> >>>> sys 0m0.007s
> >>>
> >>> Strange that even normal resets take so long...
> >>> What device are you using?
> >>
> >> Hi Sagi
> >>
> >> Here is the device info:
> >> Mellanox Technologies MT27700 Family [ConnectX-4]
> >>
> >>>
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real 1m16.133s
> >>>> user 0m0.000s
> >>>> sys 0m0.007s
> >>>
> >>> There seems to be a spurious command timeout here, but maybe this
> >>> is due to the fact that the queues take so long to connect and
> >>> the target expires the keep-alive timer.
> >>>
> >>> Does this patch help?
> >>
> >> The issue still exists, let me know if you need more testing for it. :)
> >
> > Hi Sagi
> > ping, this issue still can be reproduced on the latest
> > linux-block/for-next, do you have a chance to recheck it, thanks.
>
> Can you check if it happens with the below patch:

Hi Sagi,
It is still reproducible with the change; here is the log:

# time nvme reset /dev/nvme0

real    0m12.973s
user    0m0.000s
sys     0m0.006s
# time nvme reset /dev/nvme0

real    1m15.606s
user    0m0.000s
sys     0m0.007s

# dmesg | grep nvme
[  900.634877] nvme nvme0: resetting controller
[  909.026958] nvme nvme0: creating 40 I/O queues.
[  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[  917.600993] nvme nvme0: resetting controller
[  988.562230] nvme nvme0: I/O 2 QID 0 timeout
[  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
[  988.608181] nvme nvme0: creating 40 I/O queues.
[  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.

BTW, this issue cannot be reproduced on my NVME/ROCE environment.

> --
> diff --git a/drivers/nvme/target/fabrics-cmd.c
> b/drivers/nvme/target/fabrics-cmd.c
> index f91a56180d3d..6e5aadfb07a0 100644
> --- a/drivers/nvme/target/fabrics-cmd.c
> +++ b/drivers/nvme/target/fabrics-cmd.c
> @@ -191,6 +191,14 @@ static u16 nvmet_install_queue(struct nvmet_ctrl
> *ctrl, struct nvmet_req *req)
>                  }
>          }
>
> +       /*
> +        * Controller establishment flow may take some time, and the
> host may not
> +        * send us keep-alive during this period, hence reset the
> +        * traffic based keep-alive timer so we don't trigger a
> +        * controller teardown as a result of a keep-alive expiration.
> +        */
> +       ctrl->reset_tbkas = true;
> +
>          return 0;
>
>   err:
> --
>


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-13  6:12               ` Yi Zhang
@ 2021-12-13  9:04                 ` Sagi Grimberg
  2021-12-13 17:05                   ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2021-12-13  9:04 UTC (permalink / raw)
  To: Yi Zhang; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list


>>>>>> Hello
>>>>>>
>>>>>> Gentle ping here, this issue still exists on latest 5.13-rc7
>>>>>>
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real 0m12.636s
>>>>>> user 0m0.002s
>>>>>> sys 0m0.005s
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real 0m12.641s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.007s
>>>>>
>>>>> Strange that even normal resets take so long...
>>>>> What device are you using?
>>>>
>>>> Hi Sagi
>>>>
>>>> Here is the device info:
>>>> Mellanox Technologies MT27700 Family [ConnectX-4]
>>>>
>>>>>
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real 1m16.133s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.007s
>>>>>
>>>>> There seems to be a spurious command timeout here, but maybe this
>>>>> is due to the fact that the queues take so long to connect and
>>>>> the target expires the keep-alive timer.
>>>>>
>>>>> Does this patch help?
>>>>
>>>> The issue still exists, let me know if you need more testing for it. :)
>>>
>>> Hi Sagi
>>> ping, this issue still can be reproduced on the latest
>>> linux-block/for-next, do you have a chance to recheck it, thanks.
>>
>> Can you check if it happens with the below patch:
> 
> Hi Sagi
> It is still reproducible with the change, here is the log:
> 
> # time nvme reset /dev/nvme0
> 
> real    0m12.973s
> user    0m0.000s
> sys     0m0.006s
> # time nvme reset /dev/nvme0
> 
> real    1m15.606s
> user    0m0.000s
> sys     0m0.007s

Does it speed up if you use less queues? (i.e. connect with -i 4) ?
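
Concretely, assuming an rdma connect like the one reported (the address
is a placeholder, the NQN is the one from the target logs), that check
would be something like:

$ nvme disconnect -n testnqn
$ nvme connect -t rdma -a <target ip> -s 4420 -n testnqn -i 4
$ time nvme reset /dev/nvme0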

> 
> # dmesg | grep nvme
> [  900.634877] nvme nvme0: resetting controller
> [  909.026958] nvme nvme0: creating 40 I/O queues.
> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [  917.600993] nvme nvme0: resetting controller
> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> [  988.608181] nvme nvme0: creating 40 I/O queues.
> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> 
> BTW, this issue cannot be reproduced on my NVME/ROCE environment.

Then I think that we need the rdma folks to help here...


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-13  9:04                 ` Sagi Grimberg
@ 2021-12-13 17:05                   ` Yi Zhang
  2021-12-14 10:39                     ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-13 17:05 UTC (permalink / raw)
  To: Sagi Grimberg; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list

On Mon, Dec 13, 2021 at 5:05 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> >>>>>> Hello
> >>>>>>
> >>>>>> Gentle ping here, this issue still exists on latest 5.13-rc7
> >>>>>>
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m12.636s
> >>>>>> user 0m0.002s
> >>>>>> sys 0m0.005s
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m12.641s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.007s
> >>>>>
> >>>>> Strange that even normal resets take so long...
> >>>>> What device are you using?
> >>>>
> >>>> Hi Sagi
> >>>>
> >>>> Here is the device info:
> >>>> Mellanox Technologies MT27700 Family [ConnectX-4]
> >>>>
> >>>>>
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 1m16.133s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.007s
> >>>>>
> >>>>> There seems to be a spurious command timeout here, but maybe this
> >>>>> is due to the fact that the queues take so long to connect and
> >>>>> the target expires the keep-alive timer.
> >>>>>
> >>>>> Does this patch help?
> >>>>
> >>>> The issue still exists, let me know if you need more testing for it. :)
> >>>
> >>> Hi Sagi
> >>> ping, this issue still can be reproduced on the latest
> >>> linux-block/for-next, do you have a chance to recheck it, thanks.
> >>
> >> Can you check if it happens with the below patch:
> >
> > Hi Sagi
> > It is still reproducible with the change, here is the log:
> >
> > # time nvme reset /dev/nvme0
> >
> > real    0m12.973s
> > user    0m0.000s
> > sys     0m0.006s
> > # time nvme reset /dev/nvme0
> >
> > real    1m15.606s
> > user    0m0.000s
> > sys     0m0.007s
>
> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
Yes, with -i 4, it takes a stable 1.3s:
# time nvme reset /dev/nvme0

real 0m1.225s
user 0m0.000s
sys 0m0.007s

>
> >
> > # dmesg | grep nvme
> > [  900.634877] nvme nvme0: resetting controller
> > [  909.026958] nvme nvme0: creating 40 I/O queues.
> > [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> > [  917.600993] nvme nvme0: resetting controller
> > [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> > [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> > [  988.608181] nvme nvme0: creating 40 I/O queues.
> > [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >
> > BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>
> Then I think that we need the rdma folks to help here...
>


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-13 17:05                   ` Yi Zhang
@ 2021-12-14 10:39                     ` Sagi Grimberg
  2021-12-14 12:00                       ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2021-12-14 10:39 UTC (permalink / raw)
  To: Yi Zhang, Max Gurtovoy; +Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list


>>> Hi Sagi
>>> It is still reproducible with the change, here is the log:
>>>
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.973s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    1m15.606s
>>> user    0m0.000s
>>> sys     0m0.007s
>>
>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
> Yes, with -i 4, it takes a stable 1.3s:
> # time nvme reset /dev/nvme0

So it appears that destroying a qp takes a long time on
IB for some reason...

> real 0m1.225s
> user 0m0.000s
> sys 0m0.007s
> 
>>
>>>
>>> # dmesg | grep nvme
>>> [  900.634877] nvme nvme0: resetting controller
>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>> [  917.600993] nvme nvme0: resetting controller
>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>
>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>
>> Then I think that we need the rdma folks to help here...

Max?


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-14 10:39                     ` Sagi Grimberg
@ 2021-12-14 12:00                       ` Max Gurtovoy
  2021-12-15  1:15                         ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2021-12-14 12:00 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang, Max Gurtovoy
  Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list


On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>
>>>> Hi Sagi
>>>> It is still reproducible with the change, here is the log:
>>>>
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real    0m12.973s
>>>> user    0m0.000s
>>>> sys     0m0.006s
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real    1m15.606s
>>>> user    0m0.000s
>>>> sys     0m0.007s
>>>
>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>> Yes, with -i 4, it has stablee 1.3s
>> # time nvme reset /dev/nvme0
>
> So it appears that destroying a qp takes a long time on
> IB for some reason...
>
>> real 0m1.225s
>> user 0m0.000s
>> sys 0m0.007s
>>
>>>
>>>>
>>>> # dmesg | grep nvme
>>>> [  900.634877] nvme nvme0: resetting controller
>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>> [  917.600993] nvme nvme0: resetting controller
>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>
>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>
>>> Then I think that we need the rdma folks to help here...
>
> Max?

It took me 12s to reset a controller with 63 I/O queues on 5.16-rc3+.

Can you try to reproduce with the latest versions, please?

Or give the exact scenario?




* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-14 12:00                       ` Max Gurtovoy
@ 2021-12-15  1:15                         ` Yi Zhang
  2021-12-15 12:10                           ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-15  1:15 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list

On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>
> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
> >
> >>>> Hi Sagi
> >>>> It is still reproducible with the change, here is the log:
> >>>>
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real    0m12.973s
> >>>> user    0m0.000s
> >>>> sys     0m0.006s
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real    1m15.606s
> >>>> user    0m0.000s
> >>>> sys     0m0.007s
> >>>
> >>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
> >> Yes, with -i 4, it has stablee 1.3s
> >> # time nvme reset /dev/nvme0
> >
> > So it appears that destroying a qp takes a long time on
> > IB for some reason...
> >
> >> real 0m1.225s
> >> user 0m0.000s
> >> sys 0m0.007s
> >>
> >>>
> >>>>
> >>>> # dmesg | grep nvme
> >>>> [  900.634877] nvme nvme0: resetting controller
> >>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
> >>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>> [  917.600993] nvme nvme0: resetting controller
> >>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> >>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> >>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
> >>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>
> >>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
> >>>
> >>> Then I think that we need the rdma folks to help here...
> >
> > Max?
>
> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
>
> Can you try repro with latest versions please ?
>
> Or give the exact scenario ?

Yeah, both the target and the client are using a Mellanox Technologies
MT27700 Family [ConnectX-4]. Could you try stressing "nvme reset
/dev/nvme0"? The first reset takes 12s, and the issue can always be
reproduced on the second reset operation.
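
A simple way to run that stress (device name as in the logs above,
iteration count arbitrary) is e.g.:

$ for i in $(seq 5); do time nvme reset /dev/nvme0; done
$ dmesg | grep 'nvme nvme0:'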

>
>


-- 
Best Regards,
  Yi Zhang



* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-15  1:15                         ` Yi Zhang
@ 2021-12-15 12:10                           ` Max Gurtovoy
  2021-12-16  2:18                             ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2021-12-15 12:10 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Sagi Grimberg, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list


On 12/15/2021 3:15 AM, Yi Zhang wrote:
> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>
>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>>>>>> Hi Sagi
>>>>>> It is still reproducible with the change, here is the log:
>>>>>>
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real    0m12.973s
>>>>>> user    0m0.000s
>>>>>> sys     0m0.006s
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real    1m15.606s
>>>>>> user    0m0.000s
>>>>>> sys     0m0.007s
>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>>>> Yes, with -i 4, it has stablee 1.3s
>>>> # time nvme reset /dev/nvme0
>>> So it appears that destroying a qp takes a long time on
>>> IB for some reason...
>>>
>>>> real 0m1.225s
>>>> user 0m0.000s
>>>> sys 0m0.007s
>>>>
>>>>>> # dmesg | grep nvme
>>>>>> [  900.634877] nvme nvme0: resetting controller
>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>> [  917.600993] nvme nvme0: resetting controller
>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>
>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>>> Then I think that we need the rdma folks to help here...
>>> Max?
>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
>>
>> Can you try repro with latest versions please ?
>>
>> Or give the exact scenario ?
> Yeah, both target and client are using Mellanox Technologies MT27700
> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
> first time reset will take 12s, and it always can be reproduced at the
> second reset operation.

I created a target with 1 namespace backed by null_blk and connected to
it from the same server in a loopback RDMA connection using the
ConnectX-4 adapter.

I ran a loop with the "nvme reset" command, and it took 4-5 secs to
reset each time.


>
>>
>


* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-15 12:10                           ` Max Gurtovoy
@ 2021-12-16  2:18                             ` Yi Zhang
  2021-12-16 13:21                               ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-16  2:18 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list

On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>
> On 12/15/2021 3:15 AM, Yi Zhang wrote:
> > On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>
> >> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
> >>>>>> Hi Sagi
> >>>>>> It is still reproducible with the change, here is the log:
> >>>>>>
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real    0m12.973s
> >>>>>> user    0m0.000s
> >>>>>> sys     0m0.006s
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real    1m15.606s
> >>>>>> user    0m0.000s
> >>>>>> sys     0m0.007s
> >>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
> >>>> Yes, with -i 4, it has stablee 1.3s
> >>>> # time nvme reset /dev/nvme0
> >>> So it appears that destroying a qp takes a long time on
> >>> IB for some reason...
> >>>
> >>>> real 0m1.225s
> >>>> user 0m0.000s
> >>>> sys 0m0.007s
> >>>>
> >>>>>> # dmesg | grep nvme
> >>>>>> [  900.634877] nvme nvme0: resetting controller
> >>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
> >>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>> [  917.600993] nvme nvme0: resetting controller
> >>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> >>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> >>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
> >>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>>
> >>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
> >>>>> Then I think that we need the rdma folks to help here...
> >>> Max?
> >> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
> >>
> >> Can you try repro with latest versions please ?
> >>
> >> Or give the exact scenario ?
> > Yeah, both target and client are using Mellanox Technologies MT27700
> > Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
> > first time reset will take 12s, and it always can be reproduced at the
> > second reset operation.
>
> I created a target with 1 namespace backed by null_blk and connected to
> it from the same server in loopback rdma connection using the ConnectX-4
> adapter.

Could you share your loop.json file so I can try it in my environment?

And can you try it with two servers that both have CX-4? That should 
make it easier to reproduce.

>
> I run a loop with the "nvme reset" command and it took me 4-5 secs to
> reset each time..
>
>
> >
> >>
> >
>


-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-16  2:18                             ` Yi Zhang
@ 2021-12-16 13:21                               ` Max Gurtovoy
  2021-12-16 16:32                                 ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2021-12-16 13:21 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Sagi Grimberg, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list


On 12/16/2021 4:18 AM, Yi Zhang wrote:
> On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>
>> On 12/15/2021 3:15 AM, Yi Zhang wrote:
>>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>>>>>>>> Hi Sagi
>>>>>>>> It is still reproducible with the change, here is the log:
>>>>>>>>
>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>
>>>>>>>> real    0m12.973s
>>>>>>>> user    0m0.000s
>>>>>>>> sys     0m0.006s
>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>
>>>>>>>> real    1m15.606s
>>>>>>>> user    0m0.000s
>>>>>>>> sys     0m0.007s
>>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>>>>>> Yes, with -i 4, it has stablee 1.3s
>>>>>> # time nvme reset /dev/nvme0
>>>>> So it appears that destroying a qp takes a long time on
>>>>> IB for some reason...
>>>>>
>>>>>> real 0m1.225s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.007s
>>>>>>
>>>>>>>> # dmesg | grep nvme
>>>>>>>> [  900.634877] nvme nvme0: resetting controller
>>>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>> [  917.600993] nvme nvme0: resetting controller
>>>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>
>>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>>>>> Then I think that we need the rdma folks to help here...
>>>>> Max?
>>>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
>>>>
>>>> Can you try repro with latest versions please ?
>>>>
>>>> Or give the exact scenario ?
>>> Yeah, both target and client are using Mellanox Technologies MT27700
>>> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
>>> first time reset will take 12s, and it always can be reproduced at the
>>> second reset operation.
>> I created a target with 1 namespace backed by null_blk and connected to
>> it from the same server in loopback rdma connection using the ConnectX-4
>> adapter.
> Could you share your loop.json file so I can try it on my environment?

{
   "hosts": [],
   "ports": [
     {
       "addr": {
         "adrfam": "ipv4",
         "traddr": "<ip>",
         "treq": "not specified",
         "trsvcid": "4420",
         "trtype": "rdma"
       },
       "portid": 1,
       "referrals": [],
       "subsystems": [
         "testsubsystem_0"
       ]
     }
   ],
   "subsystems": [
     {
       "allowed_hosts": [],
       "attr": {
         "allow_any_host": "1",
         "cntlid_max": "65519",
         "cntlid_min": "1",
         "model": "Linux",
         "serial": "3d83c78b76623f1d",
         "version": "1.3"
       },
       "namespaces": [
         {
           "device": {
             "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
             "path": "/dev/nullb0",
             "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
           },
           "enable": 1,
           "nsid": 1
         }
       ],
       "nqn": "testsubsystem_0"
     }
   ]
}
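
To load it (a sketch, assuming the JSON above is saved as loop.json and 
nvmetcli is installed):

modprobe null_blk nr_devices=1   # provides the /dev/nullb0 backing device
modprobe nvmet-rdma
nvmetcli restore loop.json       # creates the port/subsystem/namespace above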


>
> And can you try it with two servers that both have CX-4? This should
> be easier to reproduce it.

I did this experiment. I only have a setup with 12 cores, so I created 12 
nvmf queues.

The reset took 4 seconds. The test did 100 loops of "nvme reset".

I saw that you also complained about the disconnect flow, so I assume the 
root cause is the same.

My disconnect took 2 seconds.

My FW version is 12.28.2006.

>> I run a loop with the "nvme reset" command and it took me 4-5 secs to
>> reset each time..
>>
>>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-16 13:21                               ` Max Gurtovoy
@ 2021-12-16 16:32                                 ` Yi Zhang
  2021-12-16 17:33                                   ` Haakon Bugge
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-16 16:32 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list

On Thu, Dec 16, 2021 at 9:21 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>
> On 12/16/2021 4:18 AM, Yi Zhang wrote:
> > On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>
> >> On 12/15/2021 3:15 AM, Yi Zhang wrote:
> >>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
> >>>>>>>> Hi Sagi
> >>>>>>>> It is still reproducible with the change, here is the log:
> >>>>>>>>
> >>>>>>>> # time nvme reset /dev/nvme0
> >>>>>>>>
> >>>>>>>> real    0m12.973s
> >>>>>>>> user    0m0.000s
> >>>>>>>> sys     0m0.006s
> >>>>>>>> # time nvme reset /dev/nvme0
> >>>>>>>>
> >>>>>>>> real    1m15.606s
> >>>>>>>> user    0m0.000s
> >>>>>>>> sys     0m0.007s
> >>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
> >>>>>> Yes, with -i 4, it has stablee 1.3s
> >>>>>> # time nvme reset /dev/nvme0
> >>>>> So it appears that destroying a qp takes a long time on
> >>>>> IB for some reason...
> >>>>>
> >>>>>> real 0m1.225s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.007s
> >>>>>>
> >>>>>>>> # dmesg | grep nvme
> >>>>>>>> [  900.634877] nvme nvme0: resetting controller
> >>>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
> >>>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>>>> [  917.600993] nvme nvme0: resetting controller
> >>>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> >>>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> >>>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
> >>>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>>>>
> >>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
> >>>>>>> Then I think that we need the rdma folks to help here...
> >>>>> Max?
> >>>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
> >>>>
> >>>> Can you try repro with latest versions please ?
> >>>>
> >>>> Or give the exact scenario ?
> >>> Yeah, both target and client are using Mellanox Technologies MT27700
> >>> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
> >>> first time reset will take 12s, and it always can be reproduced at the
> >>> second reset operation.
> >> I created a target with 1 namespace backed by null_blk and connected to
> >> it from the same server in loopback rdma connection using the ConnectX-4
> >> adapter.
> > Could you share your loop.json file so I can try it on my environment?
>
> {
>    "hosts": [],
>    "ports": [
>      {
>        "addr": {
>          "adrfam": "ipv4",
>          "traddr": "<ip>",
>          "treq": "not specified",
>          "trsvcid": "4420",
>          "trtype": "rdma"
>        },
>        "portid": 1,
>        "referrals": [],
>        "subsystems": [
>          "testsubsystem_0"
>        ]
>      }
>    ],
>    "subsystems": [
>      {
>        "allowed_hosts": [],
>        "attr": {
>          "allow_any_host": "1",
>          "cntlid_max": "65519",
>          "cntlid_min": "1",
>          "model": "Linux",
>          "serial": "3d83c78b76623f1d",
>          "version": "1.3"
>        },
>        "namespaces": [
>          {
>            "device": {
>              "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
>              "path": "/dev/nullb0",
>              "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
>            },
>            "enable": 1,
>            "nsid": 1
>          }
>        ],
>        "nqn": "testsubsystem_0"
>      }
>    ]
> }

Thanks, I reproduced it with your json file on one server with a
loopback rdma connection:
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0

real 0m4.557s
user 0m0.000s
sys 0m0.005s
# time nvme reset /dev/nvme0

real 0m13.176s
user 0m0.000s
sys 0m0.007s
# time nvme reset /dev/nvme0

real 1m16.577s
user 0m0.002s
sys 0m0.005s
# time nvme disconnect -n testsubsystem_0

NQN:testsubsystem_0 disconnected 1 controller(s)

real 1m11.541s
user 0m0.000s
sys 0m0.187s

#dmesg
[96600.362827] nvmet: creating nvm controller 1 for subsystem
testsubsystem_0 for NQN
nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
[96600.363038] nvme nvme0: creating 40 I/O queues.
[96604.905514] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[96604.909161] nvme nvme0: new ctrl: NQN "testsubsystem_0", addr
172.31.0.202:4420
[96614.270825] nvme nvme0: Removing ctrl: NQN "testsubsystem_0"
[96659.268006] nvmet: creating nvm controller 1 for subsystem
testsubsystem_0 for NQN
nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
[96659.268215] nvme nvme0: creating 40 I/O queues.
[96663.801929] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[96663.805589] nvme nvme0: new ctrl: NQN "testsubsystem_0", addr
172.31.0.202:4420
[96673.130986] nvme nvme0: resetting controller
[96681.761992] nvmet: creating nvm controller 1 for subsystem
testsubsystem_0 for NQN
nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
[96681.762133] nvme nvme0: creating 40 I/O queues.
[96686.302544] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[96688.850272] nvme nvme0: resetting controller
[96697.336217] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[96697.361231] nvmet: ctrl 1 fatal error occurred!
[96760.824363] nvme nvme0: I/O 25 QID 0 timeout
[96760.847531] nvme nvme0: Property Set error: 881, offset 0x14
[96760.885731] nvmet: creating nvm controller 1 for subsystem
testsubsystem_0 for NQN
nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
[96760.885879] nvme nvme0: creating 40 I/O queues.
[96765.423099] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[96808.112730] nvme nvme0: Removing ctrl: NQN "testsubsystem_0"
[96816.632485] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
[96816.657537] nvmet: ctrl 1 fatal error occurred!
[96879.608632] nvme nvme0: I/O 12 QID 0 timeout
[96879.632104] nvme nvme0: Property Set error: 881, offset 0x14
>
>
> >
> > And can you try it with two servers that both have CX-4? This should
> > be easier to reproduce it.
>
> I did this experiment. I have only a setup with 12 cores so I created 12
> nvmf queues.
>
> The reset took 4 seconds. The test did 100 loops of "nvme reset".
>
> I saw that you also complained on the disconnect flow so I assume the
> root cause is the same.

Yeah, I think so

>
> My disconnect took 2 seconds.
>
> My FW version is 12.28.2006.

Yeah, mine is the same as yours.
# cat /sys/devices/pci0000\:00/0000\:00\:02.0/0000\:04\:00.0/infiniband/mlx5_0/fw_ver
12.28.2006


>
> >> I run a loop with the "nvme reset" command and it took me 4-5 secs to
> >> reset each time..
> >>
> >>
> >
>


-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-16 16:32                                 ` Yi Zhang
@ 2021-12-16 17:33                                   ` Haakon Bugge
  2021-12-17  7:03                                     ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Haakon Bugge @ 2021-12-16 17:33 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Max Gurtovoy, Sagi Grimberg, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list



> On 16 Dec 2021, at 17:32, Yi Zhang <yi.zhang@redhat.com> wrote:
> 
> On Thu, Dec 16, 2021 at 9:21 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>> 
>> 
>> On 12/16/2021 4:18 AM, Yi Zhang wrote:
>>> On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>> 
>>>> On 12/15/2021 3:15 AM, Yi Zhang wrote:
>>>>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>>>>>>>>>> Hi Sagi
>>>>>>>>>> It is still reproducible with the change, here is the log:
>>>>>>>>>> 
>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>> 
>>>>>>>>>> real    0m12.973s
>>>>>>>>>> user    0m0.000s
>>>>>>>>>> sys     0m0.006s
>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>> 
>>>>>>>>>> real    1m15.606s
>>>>>>>>>> user    0m0.000s
>>>>>>>>>> sys     0m0.007s
>>>>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>>>>>>>> Yes, with -i 4, it has stablee 1.3s
>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>> So it appears that destroying a qp takes a long time on
>>>>>>> IB for some reason...
>>>>>>> 
>>>>>>>> real 0m1.225s
>>>>>>>> user 0m0.000s
>>>>>>>> sys 0m0.007s
>>>>>>>> 
>>>>>>>>>> # dmesg | grep nvme
>>>>>>>>>> [  900.634877] nvme nvme0: resetting controller
>>>>>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>> [  917.600993] nvme nvme0: resetting controller
>>>>>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>>>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>>>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>> 
>>>>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>>>>>>> Then I think that we need the rdma folks to help here...
>>>>>>> Max?
>>>>>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
>>>>>> 
>>>>>> Can you try repro with latest versions please ?
>>>>>> 
>>>>>> Or give the exact scenario ?
>>>>> Yeah, both target and client are using Mellanox Technologies MT27700
>>>>> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
>>>>> first time reset will take 12s, and it always can be reproduced at the
>>>>> second reset operation.
>>>> I created a target with 1 namespace backed by null_blk and connected to
>>>> it from the same server in loopback rdma connection using the ConnectX-4
>>>> adapter.
>>> Could you share your loop.json file so I can try it on my environment?
>> 
>> {
>>   "hosts": [],
>>   "ports": [
>>     {
>>       "addr": {
>>         "adrfam": "ipv4",
>>         "traddr": "<ip>",
>>         "treq": "not specified",
>>         "trsvcid": "4420",
>>         "trtype": "rdma"
>>       },
>>       "portid": 1,
>>       "referrals": [],
>>       "subsystems": [
>>         "testsubsystem_0"
>>       ]
>>     }
>>   ],
>>   "subsystems": [
>>     {
>>       "allowed_hosts": [],
>>       "attr": {
>>         "allow_any_host": "1",
>>         "cntlid_max": "65519",
>>         "cntlid_min": "1",
>>         "model": "Linux",
>>         "serial": "3d83c78b76623f1d",
>>         "version": "1.3"
>>       },
>>       "namespaces": [
>>         {
>>           "device": {
>>             "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
>>             "path": "/dev/nullb0",
>>             "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
>>           },
>>           "enable": 1,
>>           "nsid": 1
>>         }
>>       ],
>>       "nqn": "testsubsystem_0"
>>     }
>>   ]
>> }
> 
> Thanks, I reproduced it with your json file on one server with
> loopback rdma connection:
> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0
> 
> real 0m4.557s
> user 0m0.000s
> sys 0m0.005s
> # time nvme reset /dev/nvme0
> 
> real 0m13.176s
> user 0m0.000s
> sys 0m0.007s
> # time nvme reset /dev/nvme0
> 
> real 1m16.577s
> user 0m0.002s
> sys 0m0.005s
> # time nvme disconnect -n testsubsystem_0

What does:

# rdma res show qp | grep -c ERR

say, when it is slow?
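
For example, sampled once a second while the reset is running (a rough 
sketch):

while true; do
        # count QPs currently in the ERR state
        rdma res show qp | grep -c ERR
        sleep 1
done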



Thxs, Håkon



> NQN:testsubsystem_0 disconnected 1 controller(s)
> 
> real 1m11.541s
> user 0m0.000s
> sys 0m0.187s
> 
> #dmesg
> [96600.362827] nvmet: creating nvm controller 1 for subsystem
> testsubsystem_0 for NQN
> nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
> [96600.363038] nvme nvme0: creating 40 I/O queues.
> [96604.905514] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [96604.909161] nvme nvme0: new ctrl: NQN "testsubsystem_0", addr
> 172.31.0.202:4420
> [96614.270825] nvme nvme0: Removing ctrl: NQN "testsubsystem_0"
> [96659.268006] nvmet: creating nvm controller 1 for subsystem
> testsubsystem_0 for NQN
> nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
> [96659.268215] nvme nvme0: creating 40 I/O queues.
> [96663.801929] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [96663.805589] nvme nvme0: new ctrl: NQN "testsubsystem_0", addr
> 172.31.0.202:4420
> [96673.130986] nvme nvme0: resetting controller
> [96681.761992] nvmet: creating nvm controller 1 for subsystem
> testsubsystem_0 for NQN
> nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
> [96681.762133] nvme nvme0: creating 40 I/O queues.
> [96686.302544] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [96688.850272] nvme nvme0: resetting controller
> [96697.336217] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> [96697.361231] nvmet: ctrl 1 fatal error occurred!
> [96760.824363] nvme nvme0: I/O 25 QID 0 timeout
> [96760.847531] nvme nvme0: Property Set error: 881, offset 0x14
> [96760.885731] nvmet: creating nvm controller 1 for subsystem
> testsubsystem_0 for NQN
> nqn.2014-08.org.nvmexpress:uuid:4c4c4544-0056-4d10-8030-b7c04f393432.
> [96760.885879] nvme nvme0: creating 40 I/O queues.
> [96765.423099] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> [96808.112730] nvme nvme0: Removing ctrl: NQN "testsubsystem_0"
> [96816.632485] nvmet: ctrl 1 keep-alive timer (5 seconds) expired!
> [96816.657537] nvmet: ctrl 1 fatal error occurred!
> [96879.608632] nvme nvme0: I/O 12 QID 0 timeout
> [96879.632104] nvme nvme0: Property Set error: 881, offset 0x14>
>> 
>>> 
>>> And can you try it with two servers that both have CX-4? This should
>>> be easier to reproduce it.
>> 
>> I did this experiment. I have only a setup with 12 cores so I created 12
>> nvmf queues.
>> 
>> The reset took 4 seconds. The test did 100 loops of "nvme reset".
>> 
>> I saw that you also complained on the disconnect flow so I assume the
>> root cause is the same.
> 
> Yeah, I think so
> 
>> 
>> My disconnect took 2 seconds.
>> 
>> My FW version is 12.28.2006.
> 
> Yeah, mine is same with yours.
> # cat /sys/devices/pci0000\:00/0000\:00\:02.0/0000\:04\:00.0/infiniband/mlx5_0/fw_ver
> 12.28.2006
> 
> 
>> 
>>>> I run a loop with the "nvme reset" command and it took me 4-5 secs to
>>>> reset each time..
>>>> 
>>>> 
>>> 
>> 
> 
> 
> -- 
> Best Regards,
>  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-16 17:33                                   ` Haakon Bugge
@ 2021-12-17  7:03                                     ` Yi Zhang
  2021-12-17 11:19                                       ` Haakon Bugge
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2021-12-17  7:03 UTC (permalink / raw)
  To: Haakon Bugge
  Cc: Max Gurtovoy, Sagi Grimberg, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list

On Fri, Dec 17, 2021 at 1:34 AM Haakon Bugge <haakon.bugge@oracle.com> wrote:
>
>
>
> > On 16 Dec 2021, at 17:32, Yi Zhang <yi.zhang@redhat.com> wrote:
> >
> > On Thu, Dec 16, 2021 at 9:21 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>
> >>
> >> On 12/16/2021 4:18 AM, Yi Zhang wrote:
> >>> On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>>>
> >>>> On 12/15/2021 3:15 AM, Yi Zhang wrote:
> >>>>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>>>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
> >>>>>>>>>> Hi Sagi
> >>>>>>>>>> It is still reproducible with the change, here is the log:
> >>>>>>>>>>
> >>>>>>>>>> # time nvme reset /dev/nvme0
> >>>>>>>>>>
> >>>>>>>>>> real    0m12.973s
> >>>>>>>>>> user    0m0.000s
> >>>>>>>>>> sys     0m0.006s
> >>>>>>>>>> # time nvme reset /dev/nvme0
> >>>>>>>>>>
> >>>>>>>>>> real    1m15.606s
> >>>>>>>>>> user    0m0.000s
> >>>>>>>>>> sys     0m0.007s
> >>>>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
> >>>>>>>> Yes, with -i 4, it has stablee 1.3s
> >>>>>>>> # time nvme reset /dev/nvme0
> >>>>>>> So it appears that destroying a qp takes a long time on
> >>>>>>> IB for some reason...
> >>>>>>>
> >>>>>>>> real 0m1.225s
> >>>>>>>> user 0m0.000s
> >>>>>>>> sys 0m0.007s
> >>>>>>>>
> >>>>>>>>>> # dmesg | grep nvme
> >>>>>>>>>> [  900.634877] nvme nvme0: resetting controller
> >>>>>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
> >>>>>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>>>>>> [  917.600993] nvme nvme0: resetting controller
> >>>>>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
> >>>>>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
> >>>>>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
> >>>>>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
> >>>>>>>>>>
> >>>>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
> >>>>>>>>> Then I think that we need the rdma folks to help here...
> >>>>>>> Max?
> >>>>>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
> >>>>>>
> >>>>>> Can you try repro with latest versions please ?
> >>>>>>
> >>>>>> Or give the exact scenario ?
> >>>>> Yeah, both target and client are using Mellanox Technologies MT27700
> >>>>> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
> >>>>> first time reset will take 12s, and it always can be reproduced at the
> >>>>> second reset operation.
> >>>> I created a target with 1 namespace backed by null_blk and connected to
> >>>> it from the same server in loopback rdma connection using the ConnectX-4
> >>>> adapter.
> >>> Could you share your loop.json file so I can try it on my environment?
> >>
> >> {
> >>   "hosts": [],
> >>   "ports": [
> >>     {
> >>       "addr": {
> >>         "adrfam": "ipv4",
> >>         "traddr": "<ip>",
> >>         "treq": "not specified",
> >>         "trsvcid": "4420",
> >>         "trtype": "rdma"
> >>       },
> >>       "portid": 1,
> >>       "referrals": [],
> >>       "subsystems": [
> >>         "testsubsystem_0"
> >>       ]
> >>     }
> >>   ],
> >>   "subsystems": [
> >>     {
> >>       "allowed_hosts": [],
> >>       "attr": {
> >>         "allow_any_host": "1",
> >>         "cntlid_max": "65519",
> >>         "cntlid_min": "1",
> >>         "model": "Linux",
> >>         "serial": "3d83c78b76623f1d",
> >>         "version": "1.3"
> >>       },
> >>       "namespaces": [
> >>         {
> >>           "device": {
> >>             "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
> >>             "path": "/dev/nullb0",
> >>             "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
> >>           },
> >>           "enable": 1,
> >>           "nsid": 1
> >>         }
> >>       ],
> >>       "nqn": "testsubsystem_0"
> >>     }
> >>   ]
> >> }
> >
> > Thanks, I reproduced it with your json file on one server with
> > loopback rdma connection:
> > # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0
> >
> > real 0m4.557s
> > user 0m0.000s
> > sys 0m0.005s
> > # time nvme reset /dev/nvme0
> >
> > real 0m13.176s
> > user 0m0.000s
> > sys 0m0.007s
> > # time nvme reset /dev/nvme0
> >
> > real 1m16.577s
> > user 0m0.002s
> > sys 0m0.005s
> > # time nvme disconnect -n testsubsystem_0
>
> What does:
>
> # rdma res show qp | grep -c ERR
>
> say, when it is slow?

Hi Haakon
Here is the output during the reset operation:

38
33
28
24
19
14
9
4
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
0
0
0

>
>
>
> Thxs, Håkon
>



-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-17  7:03                                     ` Yi Zhang
@ 2021-12-17 11:19                                       ` Haakon Bugge
  2022-02-14  9:47                                         ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Haakon Bugge @ 2021-12-17 11:19 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Max Gurtovoy, Sagi Grimberg, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list



> On 17 Dec 2021, at 08:03, Yi Zhang <yi.zhang@redhat.com> wrote:
> 
> On Fri, Dec 17, 2021 at 1:34 AM Haakon Bugge <haakon.bugge@oracle.com> wrote:
>> 
>> 
>> 
>>> On 16 Dec 2021, at 17:32, Yi Zhang <yi.zhang@redhat.com> wrote:
>>> 
>>> On Thu, Dec 16, 2021 at 9:21 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>> 
>>>> 
>>>> On 12/16/2021 4:18 AM, Yi Zhang wrote:
>>>>> On Wed, Dec 15, 2021 at 8:10 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>>> 
>>>>>> On 12/15/2021 3:15 AM, Yi Zhang wrote:
>>>>>>> On Tue, Dec 14, 2021 at 8:01 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>>>>>> On 12/14/2021 12:39 PM, Sagi Grimberg wrote:
>>>>>>>>>>>> Hi Sagi
>>>>>>>>>>>> It is still reproducible with the change, here is the log:
>>>>>>>>>>>> 
>>>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>>>> 
>>>>>>>>>>>> real    0m12.973s
>>>>>>>>>>>> user    0m0.000s
>>>>>>>>>>>> sys     0m0.006s
>>>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>>>>> 
>>>>>>>>>>>> real    1m15.606s
>>>>>>>>>>>> user    0m0.000s
>>>>>>>>>>>> sys     0m0.007s
>>>>>>>>>>> Does it speed up if you use less queues? (i.e. connect with -i 4) ?
>>>>>>>>>> Yes, with -i 4, it has stablee 1.3s
>>>>>>>>>> # time nvme reset /dev/nvme0
>>>>>>>>> So it appears that destroying a qp takes a long time on
>>>>>>>>> IB for some reason...
>>>>>>>>> 
>>>>>>>>>> real 0m1.225s
>>>>>>>>>> user 0m0.000s
>>>>>>>>>> sys 0m0.007s
>>>>>>>>>> 
>>>>>>>>>>>> # dmesg | grep nvme
>>>>>>>>>>>> [  900.634877] nvme nvme0: resetting controller
>>>>>>>>>>>> [  909.026958] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>>>> [  913.604297] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>>>> [  917.600993] nvme nvme0: resetting controller
>>>>>>>>>>>> [  988.562230] nvme nvme0: I/O 2 QID 0 timeout
>>>>>>>>>>>> [  988.567607] nvme nvme0: Property Set error: 881, offset 0x14
>>>>>>>>>>>> [  988.608181] nvme nvme0: creating 40 I/O queues.
>>>>>>>>>>>> [  993.203495] nvme nvme0: mapped 40/0/0 default/read/poll queues.
>>>>>>>>>>>> 
>>>>>>>>>>>> BTW, this issue cannot be reproduced on my NVME/ROCE environment.
>>>>>>>>>>> Then I think that we need the rdma folks to help here...
>>>>>>>>> Max?
>>>>>>>> It took me 12s to reset a controller with 63 IO queues with 5.16-rc3+.
>>>>>>>> 
>>>>>>>> Can you try repro with latest versions please ?
>>>>>>>> 
>>>>>>>> Or give the exact scenario ?
>>>>>>> Yeah, both target and client are using Mellanox Technologies MT27700
>>>>>>> Family [ConnectX-4], could you try stress "nvme reset /dev/nvme0", the
>>>>>>> first time reset will take 12s, and it always can be reproduced at the
>>>>>>> second reset operation.
>>>>>> I created a target with 1 namespace backed by null_blk and connected to
>>>>>> it from the same server in loopback rdma connection using the ConnectX-4
>>>>>> adapter.
>>>>> Could you share your loop.json file so I can try it on my environment?
>>>> 
>>>> {
>>>>  "hosts": [],
>>>>  "ports": [
>>>>    {
>>>>      "addr": {
>>>>        "adrfam": "ipv4",
>>>>        "traddr": "<ip>",
>>>>        "treq": "not specified",
>>>>        "trsvcid": "4420",
>>>>        "trtype": "rdma"
>>>>      },
>>>>      "portid": 1,
>>>>      "referrals": [],
>>>>      "subsystems": [
>>>>        "testsubsystem_0"
>>>>      ]
>>>>    }
>>>>  ],
>>>>  "subsystems": [
>>>>    {
>>>>      "allowed_hosts": [],
>>>>      "attr": {
>>>>        "allow_any_host": "1",
>>>>        "cntlid_max": "65519",
>>>>        "cntlid_min": "1",
>>>>        "model": "Linux",
>>>>        "serial": "3d83c78b76623f1d",
>>>>        "version": "1.3"
>>>>      },
>>>>      "namespaces": [
>>>>        {
>>>>          "device": {
>>>>            "nguid": "5b722b05-e9b6-542d-ba80-62010b57775d",
>>>>            "path": "/dev/nullb0",
>>>>            "uuid": "26ffc8ce-73b4-321d-9685-7d7a9872c460"
>>>>          },
>>>>          "enable": 1,
>>>>          "nsid": 1
>>>>        }
>>>>      ],
>>>>      "nqn": "testsubsystem_0"
>>>>    }
>>>>  ]
>>>> }
>>> 
>>> Thanks, I reproduced it with your json file on one server with
>>> loopback rdma connection:
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0
>>> 
>>> real 0m4.557s
>>> user 0m0.000s
>>> sys 0m0.005s
>>> # time nvme reset /dev/nvme0
>>> 
>>> real 0m13.176s
>>> user 0m0.000s
>>> sys 0m0.007s
>>> # time nvme reset /dev/nvme0
>>> 
>>> real 1m16.577s
>>> user 0m0.002s
>>> sys 0m0.005s
>>> # time nvme disconnect -n testsubsystem_0
>> 
>> What does:
>> 
>> # rdma res show qp | grep -c ERR
>> 
>> say, when it is slow?
> 
> Hi Haakon
> Here is the output during the reset operation:
> 
> 38
> 33
> 28
> 24
> 19
> 14
> 9
> 4
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 0
> 0
> 0


OK. I had a hunch that many QPs in the ERR state could be the reason. But this is not many, so I rest my case :-) Thanks for trying though!


Thxs, Håkon

> 
>> 
>> 
>> 
>> Thxs, Håkon
>> 
> 
> 
> 
> -- 
> Best Regards,
>  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2021-12-17 11:19                                       ` Haakon Bugge
@ 2022-02-14  9:47                                         ` Yi Zhang
  2022-02-14 11:00                                           ` Chaitanya Kulkarni
  2022-02-14 11:32                                           ` Sagi Grimberg
  0 siblings, 2 replies; 45+ messages in thread
From: Yi Zhang @ 2022-02-14  9:47 UTC (permalink / raw)
  To: Max Gurtovoy, Sagi Grimberg
  Cc: Haakon Bugge, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	OFED mailing list

Hi Sagi/Max
Here are more findings with the bisect:

The time for the reset operation changed from 3s[1] to 12s[2] after
commit[3], and after commit[4] the reset operation times out at the
second reset[5]. Let me know if you need any testing for it, thanks.

[1]
# time nvme reset /dev/nvme0

real 0m3.049s
user 0m0.000s
sys 0m0.006s
[2]
# time nvme reset /dev/nvme0

real 0m12.498s
user 0m0.000s
sys 0m0.006s
[3]
commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
Author: Max Gurtovoy <maxg@mellanox.com>
Date:   Tue May 19 17:05:56 2020 +0300

    nvme-rdma: add metadata/T10-PI support

[4]
commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
Author: Hannes Reinecke <hare@suse.de>
Date:   Fri Apr 16 13:46:20 2021 +0200

    nvme: sanitize KATO setting

[5]
# time nvme reset /dev/nvme0

real 0m12.628s
user 0m0.000s
sys 0m0.006s
# time nvme reset /dev/nvme0

real 1m15.617s
user 0m0.000s
sys 0m0.006s
#dmesg
[ 1866.068377] nvme nvme0: creating 40 I/O queues.
[ 1870.367851] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[ 1870.370949] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.0.202:4420
[ 1930.981494] nvme nvme0: resetting controller
[ 1939.316131] nvme nvme0: creating 40 I/O queues.
[ 1943.605427] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[ 1953.708327] nvme nvme0: resetting controller
[ 2024.985591] nvme nvme0: I/O 7 QID 0 timeout
[ 2024.990979] nvme nvme0: Property Set error: 881, offset 0x14
[ 2025.031069] nvme nvme0: creating 40 I/O queues.
[ 2029.321798] nvme nvme0: mapped 40/0/0 default/read/poll queues.


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-14  9:47                                         ` Yi Zhang
@ 2022-02-14 11:00                                           ` Chaitanya Kulkarni
  2022-02-14 11:32                                           ` Sagi Grimberg
  1 sibling, 0 replies; 45+ messages in thread
From: Chaitanya Kulkarni @ 2022-02-14 11:00 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Haakon Bugge, Max Gurtovoy, Sagi Grimberg,
	open list:NVM EXPRESS DRIVER, OFED mailing list, Max Gurtovoy

On 2/14/22 01:47, Yi Zhang wrote:
> Hi Sagi/Max
> Here are more findings with the bisect:
> 
> The time for reset operation changed from 3s[1] to 12s[2] after
> commit[3], and after commit[4], the reset operation timeout at the
> second reset[5], let me know if you need any testing for it, thanks.

You can collect perf numbers with the function call graph for the base
case and the problem case; that way you will know exactly where we are
spending more time ...
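
Something along these lines (a sketch, assuming perf is available; adjust
the options to taste):

# whole-system profile with call graphs while the reset runs
perf record -a -g -- nvme reset /dev/nvme0
# then look at where the time is spent
perf report --stdio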

-ck

> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-14  9:47                                         ` Yi Zhang
  2022-02-14 11:00                                           ` Chaitanya Kulkarni
@ 2022-02-14 11:32                                           ` Sagi Grimberg
  2022-02-14 12:11                                             ` Max Gurtovoy
  1 sibling, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2022-02-14 11:32 UTC (permalink / raw)
  To: Yi Zhang, Max Gurtovoy
  Cc: Haakon Bugge, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	OFED mailing list


> Hi Sagi/Max
> Here are more findings with the bisect:
> 
> The time for reset operation changed from 3s[1] to 12s[2] after
> commit[3], and after commit[4], the reset operation timeout at the
> second reset[5], let me know if you need any testing for it, thanks.

Does this at least eliminate the timeout?
--
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a162f6c6da6e..60e415078893 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
  extern unsigned int admin_timeout;
  #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)

-#define NVME_DEFAULT_KATO      5
+#define NVME_DEFAULT_KATO      10

  #ifdef CONFIG_ARCH_NO_SG_CHAIN
  #define  NVME_INLINE_SG_CNT  0
--

> 
> [1]
> # time nvme reset /dev/nvme0
> 
> real 0m3.049s
> user 0m0.000s
> sys 0m0.006s
> [2]
> # time nvme reset /dev/nvme0
> 
> real 0m12.498s
> user 0m0.000s
> sys 0m0.006s
> [3]
> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> Author: Max Gurtovoy <maxg@mellanox.com>
> Date:   Tue May 19 17:05:56 2020 +0300
> 
>      nvme-rdma: add metadata/T10-PI support
> 
> [4]
> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> Author: Hannes Reinecke <hare@suse.de>
> Date:   Fri Apr 16 13:46:20 2021 +0200
> 
>      nvme: sanitize KATO setting-

This change effectively changed the keep-alive timeout
from 15 to 5 and modified the host to send keepalives every
2.5 seconds instead of 5.

I guess that, combined with the fact that it now takes longer to
create and delete rdma resources (either qps or mrs),
it starts to time out in setups where there are a lot of
queues.

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-14 11:32                                           ` Sagi Grimberg
@ 2022-02-14 12:11                                             ` Max Gurtovoy
  2022-02-15 13:52                                               ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2022-02-14 12:11 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang
  Cc: Haakon Bugge, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	OFED mailing list


On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
>
>> Hi Sagi/Max
>> Here are more findings with the bisect:
>>
>> The time for reset operation changed from 3s[1] to 12s[2] after
>> commit[3], and after commit[4], the reset operation timeout at the
>> second reset[5], let me know if you need any testing for it, thanks.
>
> Does this at least eliminate the timeout?
> -- 
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index a162f6c6da6e..60e415078893 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
>  extern unsigned int admin_timeout;
>  #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
>
> -#define NVME_DEFAULT_KATO      5
> +#define NVME_DEFAULT_KATO      10
>
>  #ifdef CONFIG_ARCH_NO_SG_CHAIN
>  #define  NVME_INLINE_SG_CNT  0
> -- 
>
Or, for the initial test, you can use the --keep-alive-tmo=<10 or 15> flag in 
the connect command.
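
For example (a sketch reusing the address and subsystem name from earlier 
in this thread):

nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testsubsystem_0 --keep-alive-tmo=15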

>>
>> [1]
>> # time nvme reset /dev/nvme0
>>
>> real 0m3.049s
>> user 0m0.000s
>> sys 0m0.006s
>> [2]
>> # time nvme reset /dev/nvme0
>>
>> real 0m12.498s
>> user 0m0.000s
>> sys 0m0.006s
>> [3]
>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
>> Author: Max Gurtovoy <maxg@mellanox.com>
>> Date:   Tue May 19 17:05:56 2020 +0300
>>
>>      nvme-rdma: add metadata/T10-PI support
>>
>> [4]
>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
>> Author: Hannes Reinecke <hare@suse.de>
>> Date:   Fri Apr 16 13:46:20 2021 +0200
>>
>>      nvme: sanitize KATO setting-
>
> This change effectively changed the keep-alive timeout
> from 15 to 5 and modified the host to send keepalives every
> 2.5 seconds instead of 5.
>
> I guess that in combination that now it takes longer to
> create and delete rdma resources (either qps or mrs)
> it starts to timeout in setups where there are a lot of
> queues.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-14 12:11                                             ` Max Gurtovoy
@ 2022-02-15 13:52                                               ` Yi Zhang
  2022-02-15 14:30                                                 ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2022-02-15 13:52 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Haakon Bugge, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list

Hi Sagi/Max

Changing the value to 10 or 15 fixed the timeout issue.
The reset operation still needs more than 12s in my environment. I
also tried disabling pi_enable, and the reset operation goes back
to 3s, so it seems the added 9s is due to the PI-enabled code path.

On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
>
> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
> >
> >> Hi Sagi/Max
> >> Here are more findings with the bisect:
> >>
> >> The time for reset operation changed from 3s[1] to 12s[2] after
> >> commit[3], and after commit[4], the reset operation timeout at the
> >> second reset[5], let me know if you need any testing for it, thanks.
> >
> > Does this at least eliminate the timeout?
> > --
> > diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> > index a162f6c6da6e..60e415078893 100644
> > --- a/drivers/nvme/host/nvme.h
> > +++ b/drivers/nvme/host/nvme.h
> > @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
> >  extern unsigned int admin_timeout;
> >  #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
> >
> > -#define NVME_DEFAULT_KATO      5
> > +#define NVME_DEFAULT_KATO      10
> >
> >  #ifdef CONFIG_ARCH_NO_SG_CHAIN
> >  #define  NVME_INLINE_SG_CNT  0
> > --
> >
> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
> the connect command
>
> >>
> >> [1]
> >> # time nvme reset /dev/nvme0
> >>
> >> real 0m3.049s
> >> user 0m0.000s
> >> sys 0m0.006s
> >> [2]
> >> # time nvme reset /dev/nvme0
> >>
> >> real 0m12.498s
> >> user 0m0.000s
> >> sys 0m0.006s
> >> [3]
> >> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> >> Author: Max Gurtovoy <maxg@mellanox.com>
> >> Date:   Tue May 19 17:05:56 2020 +0300
> >>
> >>      nvme-rdma: add metadata/T10-PI support
> >>
> >> [4]
> >> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> >> Author: Hannes Reinecke <hare@suse.de>
> >> Date:   Fri Apr 16 13:46:20 2021 +0200
> >>
> >>      nvme: sanitize KATO setting-
> >
> > This change effectively changed the keep-alive timeout
> > from 15 to 5 and modified the host to send keepalives every
> > 2.5 seconds instead of 5.
> >
> > I guess that in combination that now it takes longer to
> > create and delete rdma resources (either qps or mrs)
> > it starts to timeout in setups where there are a lot of
> > queues.
>


-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-15 13:52                                               ` Yi Zhang
@ 2022-02-15 14:30                                                 ` Max Gurtovoy
  2022-02-21 10:00                                                   ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2022-02-15 14:30 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Sagi Grimberg, Haakon Bugge, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list

[-- Attachment #1: Type: text/plain, Size: 2820 bytes --]

Thanks Yi Zhang.

A few years ago I sent some patches that were supposed to fix the KA 
mechanism, but eventually they weren't accepted.

I haven't tested them since, but maybe you can run some tests with them.

The attached patches are partial and include only the rdma transport, for 
your testing.

If they work for you, we can work on them again and argue for correctness.

Please don't use the workaround we suggested earlier with these patches.
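
To try them, applying the three attachments on top of the tree under test 
should be enough, e.g.:

git am 0001-Revert-nvme-unexport-nvme_start_keep_alive.patch \
       0002-nvme-remove-association-between-ctrl-and-keep-alive.patch \
       0003-nvme-rdma-add-keep-alive-mechanism-as-admin_q-proper.patch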

-Max.

On 2/15/2022 3:52 PM, Yi Zhang wrote:
> Hi Sagi/Max
>
> Changing the value to 10 or 15 fixed the timeout issue.
> And the reset operation still needs more than 12s on my environment, I
> also tried disabling the pi_enable, the reset operation will be back
> to 3s, so seems the added 9s was due to the PI enabled code path.
>
> On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>
>> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
>>>> Hi Sagi/Max
>>>> Here are more findings with the bisect:
>>>>
>>>> The time for reset operation changed from 3s[1] to 12s[2] after
>>>> commit[3], and after commit[4], the reset operation timeout at the
>>>> second reset[5], let me know if you need any testing for it, thanks.
>>> Does this at least eliminate the timeout?
>>> --
>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>>> index a162f6c6da6e..60e415078893 100644
>>> --- a/drivers/nvme/host/nvme.h
>>> +++ b/drivers/nvme/host/nvme.h
>>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
>>>   extern unsigned int admin_timeout;
>>>   #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
>>>
>>> -#define NVME_DEFAULT_KATO      5
>>> +#define NVME_DEFAULT_KATO      10
>>>
>>>   #ifdef CONFIG_ARCH_NO_SG_CHAIN
>>>   #define  NVME_INLINE_SG_CNT  0
>>> --
>>>
>> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
>> the connect command
>>
>>>> [1]
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real 0m3.049s
>>>> user 0m0.000s
>>>> sys 0m0.006s
>>>> [2]
>>>> # time nvme reset /dev/nvme0
>>>>
>>>> real 0m12.498s
>>>> user 0m0.000s
>>>> sys 0m0.006s
>>>> [3]
>>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
>>>> Author: Max Gurtovoy <maxg@mellanox.com>
>>>> Date:   Tue May 19 17:05:56 2020 +0300
>>>>
>>>>       nvme-rdma: add metadata/T10-PI support
>>>>
>>>> [4]
>>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
>>>> Author: Hannes Reinecke <hare@suse.de>
>>>> Date:   Fri Apr 16 13:46:20 2021 +0200
>>>>
>>>>       nvme: sanitize KATO setting-
>>> This change effectively changed the keep-alive timeout
>>> from 15 to 5 and modified the host to send keepalives every
>>> 2.5 seconds instead of 5.
>>>
>>> I guess that in combination that now it takes longer to
>>> create and delete rdma resources (either qps or mrs)
>>> it starts to timeout in setups where there are a lot of
>>> queues.
>

[-- Attachment #2: 0001-Revert-nvme-unexport-nvme_start_keep_alive.patch --]
[-- Type: text/plain, Size: 1802 bytes --]

From 4f5629786ceaa6912d5c585c18cc5d83fa346ea9 Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <mgurtovoy@nvidia.com>
Date: Tue, 10 Apr 2018 15:21:29 +0000
Subject: [PATCH 1/3] Revert "nvme: unexport nvme_start_keep_alive"

This reverts commit c8799eee39e7523e5e0be10f8950b11cb66085bd.

nvme_start_keep_alive() will be used by the transport drivers
to fix keep-alive synchronization between NVMe-oF target/host.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/core.c | 3 ++-
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 961a5f8a44d2..46f2d85d4f31 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1284,13 +1284,14 @@ static void nvme_keep_alive_work(struct work_struct *work)
 	blk_execute_rq_nowait(rq, false, nvme_keep_alive_end_io);
 }
 
-static void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
+void nvme_start_keep_alive(struct nvme_ctrl *ctrl)
 {
 	if (unlikely(ctrl->kato == 0))
 		return;
 
 	nvme_queue_keep_alive_work(ctrl);
 }
+EXPORT_SYMBOL_GPL(nvme_start_keep_alive);
 
 void nvme_stop_keep_alive(struct nvme_ctrl *ctrl)
 {
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index a162f6c6da6e..1610ec764bfc 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -729,6 +729,7 @@ int nvme_get_features(struct nvme_ctrl *dev, unsigned int fid,
 		      unsigned int dword11, void *buffer, size_t buflen,
 		      u32 *result);
 int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count);
+void nvme_start_keep_alive(struct nvme_ctrl *ctrl);
 void nvme_stop_keep_alive(struct nvme_ctrl *ctrl);
 int nvme_reset_ctrl(struct nvme_ctrl *ctrl);
 int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl);
-- 
2.18.1


[-- Attachment #3: 0002-nvme-remove-association-between-ctrl-and-keep-alive.patch --]
[-- Type: text/plain, Size: 1135 bytes --]

From cbd2e21bc4487e345d37aa643404eab495c05e25 Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <mgurtovoy@nvidia.com>
Date: Tue, 10 Apr 2018 16:30:18 +0000
Subject: [PATCH 2/3] nvme: remove association between ctrl and keep-alive

Keep-alive mechanism is an admin queue property and
should be activated/deactivated during admin queue
creation/destruction.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/core.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 46f2d85d4f31..34b006a77270 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4389,7 +4389,6 @@ EXPORT_SYMBOL_GPL(nvme_complete_async_event);
 void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
 {
 	nvme_mpath_stop(ctrl);
-	nvme_stop_keep_alive(ctrl);
 	nvme_stop_failfast_work(ctrl);
 	flush_work(&ctrl->async_event_work);
 	cancel_work_sync(&ctrl->fw_act_work);
@@ -4398,8 +4397,6 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
 
 void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 {
-	nvme_start_keep_alive(ctrl);
-
 	nvme_enable_aen(ctrl);
 
 	if (ctrl->queue_count > 1) {
-- 
2.18.1


[-- Attachment #4: 0003-nvme-rdma-add-keep-alive-mechanism-as-admin_q-proper.patch --]
[-- Type: text/plain, Size: 1546 bytes --]

From 3a40795e1c47aae8297a1dcaa821f63fea5e4baa Mon Sep 17 00:00:00 2001
From: Max Gurtovoy <mgurtovoy@nvidia.com>
Date: Tue, 10 Apr 2018 16:33:06 +0000
Subject: [PATCH 3/3] nvme-rdma: add keep-alive mechanism as admin_q property

Activate/deactivate it during admin queue creation/destruction
and remove association to nvme ctrl.

Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 993e3a076a41..ab71f7d3d6a5 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -925,6 +925,8 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
 	if (error)
 		goto out_quiesce_queue;
 
+	nvme_start_keep_alive(&ctrl->ctrl);
+
 	return 0;
 
 out_quiesce_queue:
@@ -1026,6 +1028,7 @@ static int nvme_rdma_configure_io_queues(struct nvme_rdma_ctrl *ctrl, bool new)
 static void nvme_rdma_teardown_admin_queue(struct nvme_rdma_ctrl *ctrl,
 		bool remove)
 {
+	nvme_stop_keep_alive(&ctrl->ctrl);
 	nvme_stop_admin_queue(&ctrl->ctrl);
 	blk_sync_queue(ctrl->ctrl.admin_q);
 	nvme_rdma_stop_queue(&ctrl->queues[0]);
@@ -1199,7 +1202,6 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	struct nvme_rdma_ctrl *ctrl = container_of(work,
 			struct nvme_rdma_ctrl, err_work);
 
-	nvme_stop_keep_alive(&ctrl->ctrl);
 	flush_work(&ctrl->ctrl.async_event_work);
 	nvme_rdma_teardown_io_queues(ctrl, false);
 	nvme_start_queues(&ctrl->ctrl);
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-15 14:30                                                 ` Max Gurtovoy
@ 2022-02-21 10:00                                                   ` Yi Zhang
  2022-02-23 10:04                                                     ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2022-02-21 10:00 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: Sagi Grimberg, Haakon Bugge, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list

Hi Max

The patches fixed the timeout issue when I used a non-debug kernel,
but when I tested a debug kernel with your patches, the timeout can
still be triggered by "nvme reset"/"nvme disconnect-all" operations.

On Tue, Feb 15, 2022 at 10:31 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
> Thanks Yi Zhang.
>
> Few years ago I've sent some patches that were supposed to fix the KA
> mechanism but eventually they weren't accepted.
>
> I haven't tested it since but maybe you can run some tests with it.
>
> The attached patches are partial and include only rdma transport for
> your testing.
>
> If it work for you we can work on it again and argue for correctness.
>
> Please don't use the workaround we suggested earlier with these patches.
>
> -Max.
>
> On 2/15/2022 3:52 PM, Yi Zhang wrote:
> > Hi Sagi/Max
> >
> > Changing the value to 10 or 15 fixed the timeout issue.
> > And the reset operation still needs more than 12s on my environment, I
> > also tried disabling the pi_enable, the reset operation will be back
> > to 3s, so seems the added 9s was due to the PI enabled code path.
> >
> > On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>
> >> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
> >>>> Hi Sagi/Max
> >>>> Here are more findings with the bisect:
> >>>>
> >>>> The time for reset operation changed from 3s[1] to 12s[2] after
> >>>> commit[3], and after commit[4], the reset operation timeout at the
> >>>> second reset[5], let me know if you need any testing for it, thanks.
> >>> Does this at least eliminate the timeout?
> >>> --
> >>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> >>> index a162f6c6da6e..60e415078893 100644
> >>> --- a/drivers/nvme/host/nvme.h
> >>> +++ b/drivers/nvme/host/nvme.h
> >>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
> >>>   extern unsigned int admin_timeout;
> >>>   #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
> >>>
> >>> -#define NVME_DEFAULT_KATO      5
> >>> +#define NVME_DEFAULT_KATO      10
> >>>
> >>>   #ifdef CONFIG_ARCH_NO_SG_CHAIN
> >>>   #define  NVME_INLINE_SG_CNT  0
> >>> --
> >>>
> >> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
> >> the connect command
> >>
> >>>> [1]
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real 0m3.049s
> >>>> user 0m0.000s
> >>>> sys 0m0.006s
> >>>> [2]
> >>>> # time nvme reset /dev/nvme0
> >>>>
> >>>> real 0m12.498s
> >>>> user 0m0.000s
> >>>> sys 0m0.006s
> >>>> [3]
> >>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> >>>> Author: Max Gurtovoy <maxg@mellanox.com>
> >>>> Date:   Tue May 19 17:05:56 2020 +0300
> >>>>
> >>>>       nvme-rdma: add metadata/T10-PI support
> >>>>
> >>>> [4]
> >>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> >>>> Author: Hannes Reinecke <hare@suse.de>
> >>>> Date:   Fri Apr 16 13:46:20 2021 +0200
> >>>>
> >>>>       nvme: sanitize KATO setting-
> >>> This change effectively changed the keep-alive timeout
> >>> from 15 to 5 and modified the host to send keepalives every
> >>> 2.5 seconds instead of 5.
> >>>
> >>> I guess that in combination that now it takes longer to
> >>> create and delete rdma resources (either qps or mrs)
> >>> it starts to timeout in setups where there are a lot of
> >>> queues.
> >



-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-21 10:00                                                   ` Yi Zhang
@ 2022-02-23 10:04                                                     ` Max Gurtovoy
  2022-02-23 10:30                                                       ` Sagi Grimberg
  2022-03-01  0:06                                                       ` Yi Zhang
  0 siblings, 2 replies; 45+ messages in thread
From: Max Gurtovoy @ 2022-02-23 10:04 UTC (permalink / raw)
  To: Yi Zhang
  Cc: Sagi Grimberg, Haakon Bugge, Max Gurtovoy,
	open list:NVM EXPRESS DRIVER, OFED mailing list

Hi Yi Zhang,

thanks for testing the patches.

Can you provide more info on the time it took with both kernels?

The patches don't intend to decrease this time, but to re-start the KA at an 
early stage - as soon as we create the AQ.

I guess we need to debug it offline.

On 2/21/2022 12:00 PM, Yi Zhang wrote:
> Hi Max
>
> The patch fixed the timeout issue when I use one non-debug kernel,
> but when I tested on debug kernel with your patches, the timeout still
> can be triggered with "nvme reset/nvme disconnect-all" operations.
>
> On Tue, Feb 15, 2022 at 10:31 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>> Thanks Yi Zhang.
>>
>> Few years ago I've sent some patches that were supposed to fix the KA
>> mechanism but eventually they weren't accepted.
>>
>> I haven't tested it since but maybe you can run some tests with it.
>>
>> The attached patches are partial and include only rdma transport for
>> your testing.
>>
>> If it work for you we can work on it again and argue for correctness.
>>
>> Please don't use the workaround we suggested earlier with these patches.
>>
>> -Max.
>>
>> On 2/15/2022 3:52 PM, Yi Zhang wrote:
>>> Hi Sagi/Max
>>>
>>> Changing the value to 10 or 15 fixed the timeout issue.
>>> And the reset operation still needs more than 12s on my environment, I
>>> also tried disabling the pi_enable, the reset operation will be back
>>> to 3s, so seems the added 9s was due to the PI enabled code path.
>>>
>>> On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>>>> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
>>>>>> Hi Sagi/Max
>>>>>> Here are more findings with the bisect:
>>>>>>
>>>>>> The time for reset operation changed from 3s[1] to 12s[2] after
>>>>>> commit[3], and after commit[4], the reset operation timeout at the
>>>>>> second reset[5], let me know if you need any testing for it, thanks.
>>>>> Does this at least eliminate the timeout?
>>>>> --
>>>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
>>>>> index a162f6c6da6e..60e415078893 100644
>>>>> --- a/drivers/nvme/host/nvme.h
>>>>> +++ b/drivers/nvme/host/nvme.h
>>>>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
>>>>>    extern unsigned int admin_timeout;
>>>>>    #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
>>>>>
>>>>> -#define NVME_DEFAULT_KATO      5
>>>>> +#define NVME_DEFAULT_KATO      10
>>>>>
>>>>>    #ifdef CONFIG_ARCH_NO_SG_CHAIN
>>>>>    #define  NVME_INLINE_SG_CNT  0
>>>>> --
>>>>>
>>>> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
>>>> the connect command
>>>>
>>>>>> [1]
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real 0m3.049s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.006s
>>>>>> [2]
>>>>>> # time nvme reset /dev/nvme0
>>>>>>
>>>>>> real 0m12.498s
>>>>>> user 0m0.000s
>>>>>> sys 0m0.006s
>>>>>> [3]
>>>>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
>>>>>> Author: Max Gurtovoy <maxg@mellanox.com>
>>>>>> Date:   Tue May 19 17:05:56 2020 +0300
>>>>>>
>>>>>>        nvme-rdma: add metadata/T10-PI support
>>>>>>
>>>>>> [4]
>>>>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
>>>>>> Author: Hannes Reinecke <hare@suse.de>
>>>>>> Date:   Fri Apr 16 13:46:20 2021 +0200
>>>>>>
>>>>>>        nvme: sanitize KATO setting-
>>>>> This change effectively changed the keep-alive timeout
>>>>> from 15 to 5 and modified the host to send keepalives every
>>>>> 2.5 seconds instead of 5.
>>>>>
>>>>> I guess that in combination that now it takes longer to
>>>>> create and delete rdma resources (either qps or mrs)
>>>>> it starts to timeout in setups where there are a lot of
>>>>> queues.
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-23 10:04                                                     ` Max Gurtovoy
@ 2022-02-23 10:30                                                       ` Sagi Grimberg
  2022-02-23 11:20                                                         ` Max Gurtovoy
  2022-03-01  0:06                                                       ` Yi Zhang
  1 sibling, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2022-02-23 10:30 UTC (permalink / raw)
  To: Max Gurtovoy, Yi Zhang
  Cc: Haakon Bugge, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	OFED mailing list


> Hi Yi Zhang,
> 
> thanks for testing the patches.
> 
> Can you provide more info on the time it took with both kernels ?
> 
> The patches don't intend to decrease this time but re-start the KA in 
> early stage - as soon as we create the AQ.

Still not sure why this addresses the problem, because every I/O queue
connect should reset the keep-alive timer in the target.

But if at all, just move the keep-alive start to nvme_init_ctrl_finish and
don't expose it to drivers...
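
Something like this, I mean (untested, context lines approximate):

--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ ... @@ int nvme_init_ctrl_finish(struct nvme_ctrl *ctrl)
        ctrl->identified = true;

+       /* identify is done and the admin queue is usable - start KA here */
+       nvme_start_keep_alive(ctrl);
+
        return 0;
 }
@@ ... @@ void nvme_start_ctrl(struct nvme_ctrl *ctrl)
 {
-       nvme_start_keep_alive(ctrl);
+       /* keep-alive is already running since nvme_init_ctrl_finish() */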

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-23 10:30                                                       ` Sagi Grimberg
@ 2022-02-23 11:20                                                         ` Max Gurtovoy
  0 siblings, 0 replies; 45+ messages in thread
From: Max Gurtovoy @ 2022-02-23 11:20 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang
  Cc: Haakon Bugge, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	OFED mailing list


On 2/23/2022 12:30 PM, Sagi Grimberg wrote:
>
>> Hi Yi Zhang,
>>
>> thanks for testing the patches.
>>
>> Can you provide more info on the time it took with both kernels ?
>>
>> The patches don't intend to decrease this time but re-start the KA in 
>> early stage - as soon as we create the AQ.
>
> Still not sure why this addresses the problem, because every io queue
> connect should reset the keep alive timer in the target.

Right, in the NVMf connect - not in the transport connect.

You first allocate all the I/O queues (which takes time) and only then
nvmf_connect all of them.

During that window you probably hit the timeout, I guess.

Between nvme_rdma_alloc_io_queues and nvme_rdma_start_io_queues there is
no reason the admin_q can't send keep-alives.

This is what I tried pushing upstream a few years ago...

Anyway, we shouldn't assume anything about the target implementation. We
need to do our best to have working initiator logic.
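
Concretely, what the patches try to do is roughly this (untested sketch with
abbreviated context; it assumes nvme_start_keep_alive() is callable from the
transport, which it isn't upstream today):

--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ ... @@ static int nvme_rdma_setup_ctrl(struct nvme_rdma_ctrl *ctrl, bool new)
        ret = nvme_rdma_configure_admin_queue(ctrl, new);
        if (ret)
                return ret;

+       /*
+        * The admin queue is connected and KATO is already communicated to
+        * the target, so start sending keep-alives now instead of waiting
+        * for nvme_start_ctrl(), which only runs after all the I/O queues
+        * were allocated and connected.
+        */
+       nvme_start_keep_alive(&ctrl->ctrl);
+
        /* ... icdoff/sgls checks and nvme_rdma_configure_io_queues() ... */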

>
> But if at all, just move the keep alive start to nvme_init_ctrl_finish
> don't expose it to drivers...

Yes, back then there was no nvme_init_ctrl_finish code.



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-02-23 10:04                                                     ` Max Gurtovoy
  2022-02-23 10:30                                                       ` Sagi Grimberg
@ 2022-03-01  0:06                                                       ` Yi Zhang
  2022-03-16 15:16                                                         ` Sagi Grimberg
  1 sibling, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2022-03-01  0:06 UTC (permalink / raw)
  To: Max Gurtovoy, Max Gurtovoy
  Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list, Sagi Grimberg

On Wed, Feb 23, 2022 at 6:04 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
>
> Hi Yi Zhang,
>
> thanks for testing the patches.
>
> Can you provide more info on the time it took with both kernels ?

Hi Max
Sorry for the late response; here are the test results/dmesg on the
debug and non-debug kernels with your patch:
debug kernel: timeout
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
real    0m16.956s
user    0m0.000s
sys     0m0.237s
# time nvme reset /dev/nvme0
real    1m33.623s
user    0m0.000s
sys     0m0.024s
# time nvme disconnect-all
real    1m26.640s
user    0m0.000s
sys     0m9.969s

host dmesg:
https://pastebin.com/8T3Lqtkn
target dmesg:
https://pastebin.com/KpFP7xG2

non-debug kernel: no timeout issue, but still 12s for reset, and 8s
for disconnect
host:
# time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn

real    0m4.579s
user    0m0.000s
sys     0m0.004s
# time nvme reset /dev/nvme0

real    0m12.778s
user    0m0.000s
sys     0m0.006s
# time nvme reset /dev/nvme0

real    0m12.793s
user    0m0.000s
sys     0m0.006s
# time nvme reset /dev/nvme0

real    0m12.808s
user    0m0.000s
sys     0m0.006s
# time nvme disconnect-all

real    0m8.348s
user    0m0.000s
sys     0m0.189s

>
> The patches don't intend to decrease this time but re-start the KA in
> early stage - as soon as we create the AQ.
>
> I guess we need to debug it offline.
>
> On 2/21/2022 12:00 PM, Yi Zhang wrote:
> > Hi Max
> >
> > The patch fixed the timeout issue when I use one non-debug kernel,
> > but when I tested on debug kernel with your patches, the timeout still
> > can be triggered with "nvme reset/nvme disconnect-all" operations.
> >
> > On Tue, Feb 15, 2022 at 10:31 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >> Thanks Yi Zhang.
> >>
> >> Few years ago I've sent some patches that were supposed to fix the KA
> >> mechanism but eventually they weren't accepted.
> >>
> >> I haven't tested it since but maybe you can run some tests with it.
> >>
> >> The attached patches are partial and include only rdma transport for
> >> your testing.
> >>
> >> If it work for you we can work on it again and argue for correctness.
> >>
> >> Please don't use the workaround we suggested earlier with these patches.
> >>
> >> -Max.
> >>
> >> On 2/15/2022 3:52 PM, Yi Zhang wrote:
> >>> Hi Sagi/Max
> >>>
> >>> Changing the value to 10 or 15 fixed the timeout issue.
> >>> And the reset operation still needs more than 12s on my environment, I
> >>> also tried disabling the pi_enable, the reset operation will be back
> >>> to 3s, so seems the added 9s was due to the PI enabled code path.
> >>>
> >>> On Mon, Feb 14, 2022 at 8:12 PM Max Gurtovoy <mgurtovoy@nvidia.com> wrote:
> >>>> On 2/14/2022 1:32 PM, Sagi Grimberg wrote:
> >>>>>> Hi Sagi/Max
> >>>>>> Here are more findings with the bisect:
> >>>>>>
> >>>>>> The time for reset operation changed from 3s[1] to 12s[2] after
> >>>>>> commit[3], and after commit[4], the reset operation timeout at the
> >>>>>> second reset[5], let me know if you need any testing for it, thanks.
> >>>>> Does this at least eliminate the timeout?
> >>>>> --
> >>>>> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> >>>>> index a162f6c6da6e..60e415078893 100644
> >>>>> --- a/drivers/nvme/host/nvme.h
> >>>>> +++ b/drivers/nvme/host/nvme.h
> >>>>> @@ -25,7 +25,7 @@ extern unsigned int nvme_io_timeout;
> >>>>>    extern unsigned int admin_timeout;
> >>>>>    #define NVME_ADMIN_TIMEOUT     (admin_timeout * HZ)
> >>>>>
> >>>>> -#define NVME_DEFAULT_KATO      5
> >>>>> +#define NVME_DEFAULT_KATO      10
> >>>>>
> >>>>>    #ifdef CONFIG_ARCH_NO_SG_CHAIN
> >>>>>    #define  NVME_INLINE_SG_CNT  0
> >>>>> --
> >>>>>
> >>>> or for the initial test you can use --keep-alive-tmo=<10 or 15> flag in
> >>>> the connect command
> >>>>
> >>>>>> [1]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m3.049s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [2]
> >>>>>> # time nvme reset /dev/nvme0
> >>>>>>
> >>>>>> real 0m12.498s
> >>>>>> user 0m0.000s
> >>>>>> sys 0m0.006s
> >>>>>> [3]
> >>>>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc (HEAD)
> >>>>>> Author: Max Gurtovoy <maxg@mellanox.com>
> >>>>>> Date:   Tue May 19 17:05:56 2020 +0300
> >>>>>>
> >>>>>>        nvme-rdma: add metadata/T10-PI support
> >>>>>>
> >>>>>> [4]
> >>>>>> commit a70b81bd4d9d2d6c05cfe6ef2a10bccc2e04357a (HEAD)
> >>>>>> Author: Hannes Reinecke <hare@suse.de>
> >>>>>> Date:   Fri Apr 16 13:46:20 2021 +0200
> >>>>>>
> >>>>>>        nvme: sanitize KATO setting-
> >>>>> This change effectively changed the keep-alive timeout
> >>>>> from 15 to 5 and modified the host to send keepalives every
> >>>>> 2.5 seconds instead of 5.
> >>>>>
> >>>>> I guess that in combination that now it takes longer to
> >>>>> create and delete rdma resources (either qps or mrs)
> >>>>> it starts to timeout in setups where there are a lot of
> >>>>> queues.
> >
> >
>


-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-01  0:06                                                       ` Yi Zhang
@ 2022-03-16 15:16                                                         ` Sagi Grimberg
  2022-03-19  7:29                                                           ` Yi Zhang
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2022-03-16 15:16 UTC (permalink / raw)
  To: Yi Zhang, Max Gurtovoy, Max Gurtovoy
  Cc: open list:NVM EXPRESS DRIVER, RDMA mailing list


>> Hi Yi Zhang,
>>
>> thanks for testing the patches.
>>
>> Can you provide more info on the time it took with both kernels ?
> 
> Hi Max
> Sorry for the late response, here are the test results/dmesg on
> debug/non-debug kernel with your patch:
> debug kernel: timeout
> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> real    0m16.956s
> user    0m0.000s
> sys     0m0.237s
> # time nvme reset /dev/nvme0
> real    1m33.623s
> user    0m0.000s
> sys     0m0.024s
> # time nvme disconnect-all
> real    1m26.640s
> user    0m0.000s
> sys     0m9.969s
> 
> host dmesg:
> https://pastebin.com/8T3Lqtkn
> target dmesg:
> https://pastebin.com/KpFP7xG2
> 
> non-debug kernel: no timeout issue, but still 12s for reset, and 8s
> for disconnect
> host:
> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> 
> real    0m4.579s
> user    0m0.000s
> sys     0m0.004s
> # time nvme reset /dev/nvme0
> 
> real    0m12.778s
> user    0m0.000s
> sys     0m0.006s
> # time nvme reset /dev/nvme0
> 
> real    0m12.793s
> user    0m0.000s
> sys     0m0.006s
> # time nvme reset /dev/nvme0
> 
> real    0m12.808s
> user    0m0.000s
> sys     0m0.006s
> # time nvme disconnect-all
> 
> real    0m8.348s
> user    0m0.000s
> sys     0m0.189s

These are very long times for a non-debug kernel...
Max, do you see the root cause for this?

Yi, does this happen with rxe/siw as well?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-16 15:16                                                         ` Sagi Grimberg
@ 2022-03-19  7:29                                                           ` Yi Zhang
  2022-03-20 10:50                                                             ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Yi Zhang @ 2022-03-19  7:29 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Max Gurtovoy, Max Gurtovoy, open list:NVM EXPRESS DRIVER,
	RDMA mailing list

On Wed, Mar 16, 2022 at 11:16 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>
>
> >> Hi Yi Zhang,
> >>
> >> thanks for testing the patches.
> >>
> >> Can you provide more info on the time it took with both kernels ?
> >
> > Hi Max
> > Sorry for the late response, here are the test results/dmesg on
> > debug/non-debug kernel with your patch:
> > debug kernel: timeout
> > # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> > real    0m16.956s
> > user    0m0.000s
> > sys     0m0.237s
> > # time nvme reset /dev/nvme0
> > real    1m33.623s
> > user    0m0.000s
> > sys     0m0.024s
> > # time nvme disconnect-all
> > real    1m26.640s
> > user    0m0.000s
> > sys     0m9.969s
> >
> > host dmesg:
> > https://pastebin.com/8T3Lqtkn
> > target dmesg:
> > https://pastebin.com/KpFP7xG2
> >
> > non-debug kernel: no timeout issue, but still 12s for reset, and 8s
> > for disconnect
> > host:
> > # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
> >
> > real    0m4.579s
> > user    0m0.000s
> > sys     0m0.004s
> > # time nvme reset /dev/nvme0
> >
> > real    0m12.778s
> > user    0m0.000s
> > sys     0m0.006s
> > # time nvme reset /dev/nvme0
> >
> > real    0m12.793s
> > user    0m0.000s
> > sys     0m0.006s
> > # time nvme reset /dev/nvme0
> >
> > real    0m12.808s
> > user    0m0.000s
> > sys     0m0.006s
> > # time nvme disconnect-all
> >
> > real    0m8.348s
> > user    0m0.000s
> > sys     0m0.189s
>
> These are very long times for a non-debug kernel...
> Max, do you see the root cause for this?
>
> Yi, does this happen with rxe/siw as well?

Hi Sagi

rxe/siw take less than 1s:
with rdma_rxe
# time nvme reset /dev/nvme0
real 0m0.094s
user 0m0.000s
sys 0m0.006s

with siw
# time nvme reset /dev/nvme0
real 0m0.097s
user 0m0.000s
sys 0m0.006s

This is only reproducible with the mlx IB card. As I mentioned before, the
reset operation time changed from 3s to 12s after the commit below; could
you check it?

commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
Author: Max Gurtovoy <maxg@mellanox.com>
Date:   Tue May 19 17:05:56 2020 +0300

    nvme-rdma: add metadata/T10-PI support


>


-- 
Best Regards,
  Yi Zhang


^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-19  7:29                                                           ` Yi Zhang
@ 2022-03-20 10:50                                                             ` Max Gurtovoy
  2022-03-20 13:03                                                               ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2022-03-20 10:50 UTC (permalink / raw)
  To: Yi Zhang, Sagi Grimberg
  Cc: Max Gurtovoy, open list:NVM EXPRESS DRIVER, RDMA mailing list


On 3/19/2022 9:29 AM, Yi Zhang wrote:
> On Wed, Mar 16, 2022 at 11:16 PM Sagi Grimberg <sagi@grimberg.me> wrote:
>>
>>>> Hi Yi Zhang,
>>>>
>>>> thanks for testing the patches.
>>>>
>>>> Can you provide more info on the time it took with both kernels ?
>>> Hi Max
>>> Sorry for the late response, here are the test results/dmesg on
>>> debug/non-debug kernel with your patch:
>>> debug kernel: timeout
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
>>> real    0m16.956s
>>> user    0m0.000s
>>> sys     0m0.237s
>>> # time nvme reset /dev/nvme0
>>> real    1m33.623s
>>> user    0m0.000s
>>> sys     0m0.024s
>>> # time nvme disconnect-all
>>> real    1m26.640s
>>> user    0m0.000s
>>> sys     0m9.969s
>>>
>>> host dmesg:
>>> https://pastebin.com/8T3Lqtkn
>>> target dmesg:
>>> https://pastebin.com/KpFP7xG2
>>>
>>> non-debug kernel: no timeout issue, but still 12s for reset, and 8s
>>> for disconnect
>>> host:
>>> # time nvme connect -t rdma -a 172.31.0.202 -s 4420 -n testnqn
>>>
>>> real    0m4.579s
>>> user    0m0.000s
>>> sys     0m0.004s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.778s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.793s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme reset /dev/nvme0
>>>
>>> real    0m12.808s
>>> user    0m0.000s
>>> sys     0m0.006s
>>> # time nvme disconnect-all
>>>
>>> real    0m8.348s
>>> user    0m0.000s
>>> sys     0m0.189s
>> These are very long times for a non-debug kernel...
>> Max, do you see the root cause for this?
>>
>> Yi, does this happen with rxe/siw as well?
> Hi Sagi
>
> rxe/siw will take less than 1s
> with rdma_rxe
> # time nvme reset /dev/nvme0
> real 0m0.094s
> user 0m0.000s
> sys 0m0.006s
>
> with siw
> # time nvme reset /dev/nvme0
> real 0m0.097s
> user 0m0.000s
> sys 0m0.006s
>
> This is only reproducible with mlx IB card, as I mentioned before, the
> reset operation time changed from 3s to 12s after the below commit,
> could you check this commit?
>
> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
> Author: Max Gurtovoy <maxg@mellanox.com>
> Date:   Tue May 19 17:05:56 2020 +0300
>
>      nvme-rdma: add metadata/T10-PI support
>
I couldn't repro these long reset times.

Nevertheless, the above commit added T10-PI offloads.

In this commit, for supported devices we create extra resources in HW 
(more memory keys per task).
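
For reference, the per-queue allocation is roughly this (abridged from
nvme_rdma_create_queue_ib(), error handling omitted):

        pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev, queue->pi_support) + 1;

        /* one registration MR per request, on every queue */
        ret = ib_mr_pool_init(queue->qp, &queue->qp->rdma_mrs,
                              queue->queue_size, IB_MR_TYPE_MEM_REG,
                              pages_per_mr, 0);

        if (queue->pi_support) {
                /*
                 * The T10-PI commit adds a second, integrity MR pool per
                 * queue - these are the extra HW resources that make
                 * connect/reset slower when there are many queues.
                 */
                ret = ib_mr_pool_init(queue->qp, &queue->qp->sig_mrs,
                                      queue->queue_size, IB_MR_TYPE_INTEGRITY,
                                      pages_per_mr, pages_per_mr);
        }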

I suggested doing this configuration as part of the "nvme connect"
command and saving this resource allocation by default, but during the
review I was asked to make it the default behavior.

Sagi/Christoph,

WDYT? Should we reconsider the "nvme connect --with_metadata" option?

>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-20 10:50                                                             ` Max Gurtovoy
@ 2022-03-20 13:03                                                               ` Sagi Grimberg
  2022-03-20 15:11                                                                 ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2022-03-20 13:03 UTC (permalink / raw)
  To: Max Gurtovoy, Yi Zhang
  Cc: Max Gurtovoy, open list:NVM EXPRESS DRIVER, RDMA mailing list


>>> These are very long times for a non-debug kernel...
>>> Max, do you see the root cause for this?
>>>
>>> Yi, does this happen with rxe/siw as well?
>> Hi Sagi
>>
>> rxe/siw will take less than 1s
>> with rdma_rxe
>> # time nvme reset /dev/nvme0
>> real 0m0.094s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> with siw
>> # time nvme reset /dev/nvme0
>> real 0m0.097s
>> user 0m0.000s
>> sys 0m0.006s
>>
>> This is only reproducible with mlx IB card, as I mentioned before, the
>> reset operation time changed from 3s to 12s after the below commit,
>> could you check this commit?
>>
>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
>> Author: Max Gurtovoy <maxg@mellanox.com>
>> Date:   Tue May 19 17:05:56 2020 +0300
>>
>>      nvme-rdma: add metadata/T10-PI support
>>
> I couldn't repro these long reset times.

It appears to happen when setting up a controller with lots of queues,
maybe?

> Nevertheless, the above commit added T10-PI offloads.
> 
> In this commit, for supported devices we create extra resources in HW 
> (more memory keys per task).
> 
> I suggested doing this configuration as part of the "nvme connect" 
> command and save this resource allocation by default but during the 
> review I was asked to make it the default behavior.

I don't know if I gave you this feedback or not, but it probably didn't
occur to the commenter that it would make connection establishment
take tens of seconds.

> Sagi/Christoph,
> 
> WDYT ? should we reconsider the "nvme connect --with_metadata" option ?

Maybe you can make these lazily allocated?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-20 13:03                                                               ` Sagi Grimberg
@ 2022-03-20 15:11                                                                 ` Max Gurtovoy
  2022-03-21  9:28                                                                   ` Sagi Grimberg
  0 siblings, 1 reply; 45+ messages in thread
From: Max Gurtovoy @ 2022-03-20 15:11 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang
  Cc: Max Gurtovoy, open list:NVM EXPRESS DRIVER, RDMA mailing list


On 3/20/2022 3:03 PM, Sagi Grimberg wrote:
>
>>>> These are very long times for a non-debug kernel...
>>>> Max, do you see the root cause for this?
>>>>
>>>> Yi, does this happen with rxe/siw as well?
>>> Hi Sagi
>>>
>>> rxe/siw will take less than 1s
>>> with rdma_rxe
>>> # time nvme reset /dev/nvme0
>>> real 0m0.094s
>>> user 0m0.000s
>>> sys 0m0.006s
>>>
>>> with siw
>>> # time nvme reset /dev/nvme0
>>> real 0m0.097s
>>> user 0m0.000s
>>> sys 0m0.006s
>>>
>>> This is only reproducible with mlx IB card, as I mentioned before, the
>>> reset operation time changed from 3s to 12s after the below commit,
>>> could you check this commit?
>>>
>>> commit 5ec5d3bddc6b912b7de9e3eb6c1f2397faeca2bc
>>> Author: Max Gurtovoy <maxg@mellanox.com>
>>> Date:   Tue May 19 17:05:56 2020 +0300
>>>
>>>      nvme-rdma: add metadata/T10-PI support
>>>
>> I couldn't repro these long reset times.
>
> It appears to be when setting up a controller with lots of queues
> maybe?

I'm doing that.

>
>> Nevertheless, the above commit added T10-PI offloads.
>>
>> In this commit, for supported devices we create extra resources in HW 
>> (more memory keys per task).
>>
>> I suggested doing this configuration as part of the "nvme connect" 
>> command and save this resource allocation by default but during the 
>> review I was asked to make it the default behavior.
>
> Don't know if I gave you this feedback or not, but it probably didn't
> occur to the commenter that it will make the connection establishment
> take tens of seconds.

It was known that we create more resources than needed for 
"non-PI-desired" controllers.


>
>> Sagi/Christoph,
>>
>> WDYT ? should we reconsider the "nvme connect --with_metadata" option ?
>
> Maybe you can make these lazily allocated?

You mean something like:

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index fd4720d37cc0..367ba0bb62ab 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1620,10 +1620,19 @@ int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
 }

 #ifdef CONFIG_BLK_DEV_INTEGRITY
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
-                               u32 max_integrity_segments)
+static int nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
 {
        struct blk_integrity integrity = { };
+       u16 ms = ns->ms;
+       u8 pi_type = ns->pi_type;
+       u32 max_integrity_segments = ns->ctrl->max_integrity_segments;
+       int ret;
+
+       if (ns->ctrl->ops->init_integrity) {
+               ret = ns->ctrl->ops->init_integrity(ns->ctrl);
+               if (ret)
+                       return ret;
+       }

        switch (pi_type) {
        case NVME_NS_DPS_PI_TYPE3:
@@ -1644,11 +1653,13 @@ static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
        integrity.tuple_size = ms;
        blk_integrity_register(disk, &integrity);
        blk_queue_max_integrity_segments(disk->queue, max_integrity_segments);
+
+       return 0;
 }
 #else
-static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
-                               u32 max_integrity_segments)
+static int nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
 {
+       return 0;
 }
 #endif /* CONFIG_BLK_DEV_INTEGRITY */

@@ -1853,8 +1864,8 @@ static void nvme_update_disk_info(struct gendisk *disk,
        if (ns->ms) {
                if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
-                   (ns->features & NVME_NS_METADATA_SUPPORTED))
-                       nvme_init_integrity(disk, ns->ms, ns->pi_type,
-                                           ns->ctrl->max_integrity_segments);
-               else if (!nvme_ns_has_pi(ns))
+                   (ns->features & NVME_NS_METADATA_SUPPORTED)) {
+                       if (nvme_init_integrity(disk, ns))
+                               capacity = 0;
+               } else if (!nvme_ns_has_pi(ns))
                        capacity = 0;
        }
@@ -4395,7 +4406,7 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);


and create the resources for the first namespace we find that is PI-formatted?



^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-20 15:11                                                                 ` Max Gurtovoy
@ 2022-03-21  9:28                                                                   ` Sagi Grimberg
  2022-03-21 12:11                                                                     ` Max Gurtovoy
  0 siblings, 1 reply; 45+ messages in thread
From: Sagi Grimberg @ 2022-03-21  9:28 UTC (permalink / raw)
  To: Max Gurtovoy, Yi Zhang
  Cc: Max Gurtovoy, open list:NVM EXPRESS DRIVER, RDMA mailing list


>>> WDYT ? should we reconsider the "nvme connect --with_metadata" option ?
>>
>> Maybe you can make these lazily allocated?
> 
> You mean something like:
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index fd4720d37cc0..367ba0bb62ab 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1620,10 +1620,19 @@ int nvme_getgeo(struct block_device *bdev, 
> struct hd_geometry *geo)
>   }
> 
>   #ifdef CONFIG_BLK_DEV_INTEGRITY
> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
> -                               u32 max_integrity_segments)
> +static int nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
>   {
>          struct blk_integrity integrity = { };
> +       u16 ms = ns->ms;
> +       u8 pi_type = ns->pi_type;
> +       u32 max_integrity_segments = ns->ctrl->max_integrity_segments;
> +       int ret;
> +
> +       if (ns->ctrl->ops->init_integrity) {
> +               ret = ns->ctrl->ops->init_integrity(ns->ctrl);
> +               if (ret)
> +                       return ret;
> +       }
> 
>          switch (pi_type) {
>          case NVME_NS_DPS_PI_TYPE3:
> @@ -1644,11 +1653,13 @@ static void nvme_init_integrity(struct gendisk 
> *disk, u16 ms, u8 pi_type,
>          integrity.tuple_size = ms;
>          blk_integrity_register(disk, &integrity);
>          blk_queue_max_integrity_segments(disk->queue, 
> max_integrity_segments);
> +
> +       return 0;
>   }
>   #else
> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 pi_type,
> -                               u32 max_integrity_segments)
> +static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns *ns)
>   {
> +       return 0;
>   }
>   #endif /* CONFIG_BLK_DEV_INTEGRITY */
> 
> @@ -1853,8 +1864,8 @@ static void nvme_update_disk_info(struct gendisk 
> *disk,
>          if (ns->ms) {
>                  if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
>                      (ns->features & NVME_NS_METADATA_SUPPORTED))
> -                       nvme_init_integrity(disk, ns->ms, ns->pi_type,
> - ns->ctrl->max_integrity_segments);
> +                       if (nvme_init_integrity(disk, ns))
> +                               capacity = 0;
>                  else if (!nvme_ns_has_pi(ns))
>                          capacity = 0;
>          }
> @@ -4395,7 +4406,7 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
> 
> 
> and create the resources for the first namespace we find as PI formatted ?
> 
> 

I was thinking more along the lines of allocating them as soon as an I/O
with PI comes in (see the sketch below)... Is there something internal to
the driver that could be done in parallel to expedite the allocation of
these extra resources?
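
A rough illustration of the lazy idea (hypothetical sketch -
nvme_rdma_alloc_sig_mrs() does not exist, and the one-time guard below is
only a stand-in for proper per-queue serialization):

/* Defer the integrity MR pool to the first request that actually carries
 * PI, instead of paying for it at connect/reset time.
 */
static int nvme_rdma_alloc_sig_mrs(struct nvme_rdma_queue *queue)
{
        struct ib_device *ibdev = queue->device->dev;
        u32 pages_per_mr = nvme_rdma_get_max_fr_pages(ibdev, true) + 1;

        /* stand-in for a real one-time guard (per-queue flag + lock) */
        if (!list_empty(&queue->qp->sig_mrs))
                return 0;

        return ib_mr_pool_init(queue->qp, &queue->qp->sig_mrs,
                               queue->queue_size, IB_MR_TYPE_INTEGRITY,
                               pages_per_mr, pages_per_mr);
}

This would be called from the request mapping path only when
blk_integrity_rq() says the request carries PI, so controllers that never
see PI traffic never pay for the pool.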

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [bug report] NVMe/IB: reset_controller need more than 1min
  2022-03-21  9:28                                                                   ` Sagi Grimberg
@ 2022-03-21 12:11                                                                     ` Max Gurtovoy
  0 siblings, 0 replies; 45+ messages in thread
From: Max Gurtovoy @ 2022-03-21 12:11 UTC (permalink / raw)
  To: Sagi Grimberg, Yi Zhang
  Cc: Max Gurtovoy, open list:NVM EXPRESS DRIVER, RDMA mailing list


On 3/21/2022 11:28 AM, Sagi Grimberg wrote:
>
>>>> WDYT ? should we reconsider the "nvme connect --with_metadata" 
>>>> option ?
>>>
>>> Maybe you can make these lazily allocated?
>>
>> You mean something like:
>>
>> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
>> index fd4720d37cc0..367ba0bb62ab 100644
>> --- a/drivers/nvme/host/core.c
>> +++ b/drivers/nvme/host/core.c
>> @@ -1620,10 +1620,19 @@ int nvme_getgeo(struct block_device *bdev, 
>> struct hd_geometry *geo)
>>   }
>>
>>   #ifdef CONFIG_BLK_DEV_INTEGRITY
>> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 
>> pi_type,
>> -                               u32 max_integrity_segments)
>> +static int nvme_init_integrity(struct gendisk *disk, struct nvme_ns 
>> *ns)
>>   {
>>          struct blk_integrity integrity = { };
>> +       u16 ms = ns->ms;
>> +       u8 pi_type = ns->pi_type;
>> +       u32 max_integrity_segments = ns->ctrl->max_integrity_segments;
>> +       int ret;
>> +
>> +       if (ns->ctrl->ops->init_integrity) {
>> +               ret = ns->ctrl->ops->init_integrity(ns->ctrl);
>> +               if (ret)
>> +                       return ret;
>> +       }
>>
>>          switch (pi_type) {
>>          case NVME_NS_DPS_PI_TYPE3:
>> @@ -1644,11 +1653,13 @@ static void nvme_init_integrity(struct 
>> gendisk *disk, u16 ms, u8 pi_type,
>>          integrity.tuple_size = ms;
>>          blk_integrity_register(disk, &integrity);
>>          blk_queue_max_integrity_segments(disk->queue, 
>> max_integrity_segments);
>> +
>> +       return 0;
>>   }
>>   #else
>> -static void nvme_init_integrity(struct gendisk *disk, u16 ms, u8 
>> pi_type,
>> -                               u32 max_integrity_segments)
>> +static void nvme_init_integrity(struct gendisk *disk, struct nvme_ns 
>> *ns)
>>   {
>> +       return 0;
>>   }
>>   #endif /* CONFIG_BLK_DEV_INTEGRITY */
>>
>> @@ -1853,8 +1864,8 @@ static void nvme_update_disk_info(struct 
>> gendisk *disk,
>>          if (ns->ms) {
>>                  if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
>>                      (ns->features & NVME_NS_METADATA_SUPPORTED))
>> -                       nvme_init_integrity(disk, ns->ms, ns->pi_type,
>> - ns->ctrl->max_integrity_segments);
>> +                       if (nvme_init_integrity(disk, ns))
>> +                               capacity = 0;
>>                  else if (!nvme_ns_has_pi(ns))
>>                          capacity = 0;
>>          }
>> @@ -4395,7 +4406,7 @@ EXPORT_SYMBOL_GPL(nvme_stop_ctrl);
>>
>>
>> and create the resources for the first namespace we find as PI 
>> formatted ?
>>
>>
>
> I was thinking more along the lines of allocating it as soon as an I/O
> comes with pi... Is there something internal to the driver that can
> be done in parallel to expedite the allocation of these extra resources?

Since when do we allocate things in the fast path?

We allocate a pool of MRs per queue, not an MR per task.

Do you think it's better to allocate the whole pool on the first PI I/O
and pay the latency on that I/O?

-Max.



^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2022-03-21 12:12 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-21 16:12 [bug report] NVMe/IB: reset_controller need more than 1min Yi Zhang
2021-05-21 16:12 ` Yi Zhang
2021-05-21 18:00 ` Sagi Grimberg
2021-05-21 18:00   ` Sagi Grimberg
2021-05-22  4:27   ` Yi Zhang
2021-05-22  4:27     ` Yi Zhang
2021-06-23 10:01     ` Yi Zhang
2021-06-23 10:01       ` Yi Zhang
2021-06-23 21:32       ` Sagi Grimberg
2021-06-23 21:32         ` Sagi Grimberg
2021-06-24 16:14         ` Yi Zhang
2021-06-24 16:14           ` Yi Zhang
2021-12-11  3:01           ` Yi Zhang
2021-12-12  9:45             ` Sagi Grimberg
2021-12-13  6:12               ` Yi Zhang
2021-12-13  9:04                 ` Sagi Grimberg
2021-12-13 17:05                   ` Yi Zhang
2021-12-14 10:39                     ` Sagi Grimberg
2021-12-14 12:00                       ` Max Gurtovoy
2021-12-15  1:15                         ` Yi Zhang
2021-12-15 12:10                           ` Max Gurtovoy
2021-12-16  2:18                             ` Yi Zhang
2021-12-16 13:21                               ` Max Gurtovoy
2021-12-16 16:32                                 ` Yi Zhang
2021-12-16 17:33                                   ` Haakon Bugge
2021-12-17  7:03                                     ` Yi Zhang
2021-12-17 11:19                                       ` Haakon Bugge
2022-02-14  9:47                                         ` Yi Zhang
2022-02-14 11:00                                           ` Chaitanya Kulkarni
2022-02-14 11:32                                           ` Sagi Grimberg
2022-02-14 12:11                                             ` Max Gurtovoy
2022-02-15 13:52                                               ` Yi Zhang
2022-02-15 14:30                                                 ` Max Gurtovoy
2022-02-21 10:00                                                   ` Yi Zhang
2022-02-23 10:04                                                     ` Max Gurtovoy
2022-02-23 10:30                                                       ` Sagi Grimberg
2022-02-23 11:20                                                         ` Max Gurtovoy
2022-03-01  0:06                                                       ` Yi Zhang
2022-03-16 15:16                                                         ` Sagi Grimberg
2022-03-19  7:29                                                           ` Yi Zhang
2022-03-20 10:50                                                             ` Max Gurtovoy
2022-03-20 13:03                                                               ` Sagi Grimberg
2022-03-20 15:11                                                                 ` Max Gurtovoy
2022-03-21  9:28                                                                   ` Sagi Grimberg
2022-03-21 12:11                                                                     ` Max Gurtovoy
