* way to unbind a bad nvme device/controller without powering off system
@ 2022-10-24 21:40 James Puthukattukaran
  2022-10-24 22:36 ` Keith Busch
  0 siblings, 1 reply; 9+ messages in thread
From: James Puthukattukaran @ 2022-10-24 21:40 UTC (permalink / raw)
  To: linux-nvme

Hi -

I'm seeing a scenario with what appears to be a non-functioning nvme controller/drive: IO transactions are timing out and the controller is not responding to any commands. The controller seems to be disabled (nvme_dev_disable called via nvme_timeout), but we're still seeing the nvme_reset_work thread blocked and not making progress. I tried to remove the controller via the hotplug sysfs interface, and that also hangs behind the reset thread, waiting for it to complete.

I thought the disable-controller path does not talk to the controller and simply unblocks the queues and cleans them out before unbinding the controller from the device. I'm not sure why the reset thread is still stuck, then. Does the reset thread have to run its course even though the controller has been disabled? I'm trying to understand the flow here.

I guess what I'm really looking for is a way to simply unbind the device from the driver, kill any threads, and allow the device to be powered off via the hotplug interface (trying to avoid rebooting the system to remove the device).
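
For concreteness, the sort of thing I've been attempting looks like the sketch below; the slot number and PCI address are placeholders for my system:

  # find the controller's PCI address and its hotplug slot
  readlink /sys/class/nvme/nvme3/device
  ls /sys/bus/pci/slots/

  # power off the slot through the hotplug sysfs interface (this is
  # the write that hangs behind the reset thread)
  echo 0 > /sys/bus/pci/slots/<slot>/power

  # plain driver unbind as an alternative
  echo <pci-addr> > /sys/bus/pci/drivers/nvme/unbind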

thanks,
James


systemd[1]:info: Removed slice User Slice of root.
kernel:warning: [10416583.556274] nvme nvme3: I/O 527 QID 1 timeout, aborting
kernel:warning: [10416583.556287] nvme nvme3: I/O 603 QID 1 timeout, aborting
kernel:warning: [10416583.556291] nvme nvme3: I/O 607 QID 1 timeout, aborting
kernel:warning: [10416583.556301] nvme nvme3: I/O 765 QID 1 timeout, aborting
kernel:warning: [10416583.557700] nvme nvme3: Abort status: 0x0
kernel:warning: [10416583.557705] nvme nvme3: I/O 894 QID 22 timeout, aborting
kernel:warning: [10416583.557711] nvme nvme3: Abort status: 0x0
kernel:warning: [10416583.557713] nvme nvme3: I/O 897 QID 22 timeout, aborting
kernel:warning: [10416583.557714] nvme nvme3: Abort status: 0x0
....
kernel:warning: [10416594.580023] nvme nvme3: Abort status: 0x0
kernel:warning: [10416595.588219] nvme nvme3: I/O 191 QID 1 timeout, aborting
kernel:warning: [10416595.588228] nvme nvme3: I/O 192 QID 1 timeout, aborting
kernel:warning: [10416595.588233] nvme nvme3: I/O 195 QID 1 timeout, aborting
kernel:warning: [10416595.588237] nvme nvme3: I/O 196 QID 1 timeout, aborting
kernel:warning: [10416595.588244] nvme nvme3: I/O 209 QID 1 timeout, reset controller
kernel:warning: [10416595.588402] nvme nvme3: Abort status: 0x0
kernel:warning: [10416595.588449] nvme nvme3: Abort status: 0x0
kernel:warning: [10416595.588513] nvme nvme3: Abort status: 0x0
kernel:warning: [10416595.588561] nvme nvme3: Abort status: 0x0
kernel:info: [10416598.164310] nvme nvme3: Shutdown timeout set to 10 seconds
kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
...
...
kernel:info: [10419813.132341] Workqueue: nvme-reset-wq nvme_reset_work [nvme]
kernel:warning: [10419813.132342] Call Trace:
kernel:warning: [10419813.132345]  __schedule+0x2bc/0x89b
kernel:warning: [10419813.132348]  schedule+0x36/0x7c
kernel:warning: [10419813.132351]  blk_mq_freeze_queue_wait+0x4b/0xaa
kernel:warning: [10419813.132353]  ? remove_wait_queue+0x60/0x60
kernel:warning: [10419813.132359]  nvme_wait_freeze+0x33/0x50 [nvme_core]
kernel:warning: [10419813.132362]  nvme_reset_work+0x802/0xd84 [nvme]
kernel:warning: [10419813.132364]  ? __switch_to_asm+0x40/0x62
kernel:warning: [10419813.132365]  ? __switch_to_asm+0x34/0x62
kernel:warning: [10419813.132367]  ? __switch_to+0x9b/0x505
kernel:warning: [10419813.132368]  ? __switch_to_asm+0x40/0x62
kernel:warning: [10419813.132370]  ? __switch_to_asm+0x40/0x62
kernel:warning: [10419813.132372]  process_one_work+0x169/0x399
kernel:warning: [10419813.132374]  worker_thread+0x4d/0x3e5
kernel:warning: [10419813.132377]  kthread+0x105/0x138
kernel:warning: [10419813.132379]  ? rescuer_thread+0x380/0x375
kernel:warning: [10419813.132380]  ? kthread_bind+0x20/0x15
kernel:warning: [10419813.132382]  ret_from_fork+0x24/0x49
...
...
kernel:warning: [10419813.158116]  __schedule+0x2bc/0x89b
kernel:warning: [10419813.158119]  schedule+0x36/0x7c
kernel:warning: [10419813.158122]  schedule_timeout+0x1f6/0x31f
kernel:warning: [10419813.158124]  ? sched_clock_cpu+0x11/0xa5
kernel:warning: [10419813.158126]  ? try_to_wake_up+0x59/0x505
kernel:warning: [10419813.158130]  wait_for_completion+0x12b/0x18a
kernel:warning: [10419813.158132]  ? wake_up_q+0x80/0x73
kernel:warning: [10419813.158134]  flush_work+0x122/0x1a7
kernel:warning: [10419813.158137]  ? wake_up_worker+0x30/0x2b
kernel:warning: [10419813.158141]  nvme_remove+0x71/0x100 [nvme]
kernel:warning: [10419813.158146]  pci_device_remove+0x3e/0xb6
kernel:warning: [10419813.158149]  device_release_driver_internal+0x134/0x1eb
kernel:warning: [10419813.158151]  device_release_driver+0x12/0x14
kernel:warning: [10419813.158155]  pci_stop_bus_device+0x7c/0x96
kernel:warning: [10419813.158158]  pci_stop_bus_device+0x39/0x96
kernel:warning: [10419813.158164]  pci_stop_and_remove_bus_device+0x12/0x1d
kernel:warning: [10419813.158167]  pciehp_unconfigure_device+0x7a/0x1d7
kernel:warning: [10419813.158169]  pciehp_disable_slot+0x52/0xca
kernel:warning: [10419813.158171]  pciehp_sysfs_disable_slot+0x67/0x112
kernel:warning: [10419813.158174]  disable_slot+0x12/0x14
kernel:warning: [10419813.158175]  power_write_file+0x6e/0xf8
kernel:warning: [10419813.158179]  pci_slot_attr_store+0x24/0x2e
kernel:warning: [10419813.158180]  sysfs_kf_write+0x3f/0x46
kernel:warning: [10419813.158182]  kernfs_fop_write+0x124/0x1a3
kernel:warning: [10419813.158184]  __vfs_write+0x3a/0x16d
kernel:warning: [10419813.158187]  ? audit_filter_syscall+0x33/0xce
kernel:warning: [10419813.158189]  vfs_write+0xb2/0x1a1
...
...
kernel:warning: [10419813.158507] Call Trace:
kernel:warning: [10419813.158513]  __schedule+0x2bc/0x89b
kernel:warning: [10419813.158521]  schedule+0x36/0x7c
kernel:warning: [10419813.158527]  schedule_timeout+0x189/0x31f
kernel:warning: [10419813.158533]  ? __next_timer_interrupt+0xe0/0xd4
kernel:warning: [10419813.158539]  ? blk_rq_append_bio+0x96/0xa3
kernel:warning: [10419813.158544]  ? __blk_mq_insert_request+0x76/0x104
kernel:warning: [10419813.158549]  io_schedule_timeout+0x1e/0x43
kernel:warning: [10419813.158554]  wait_for_completion_io_timeout+0x13b/0x19d
kernel:warning: [10419813.158562]  ? wake_up_q+0x80/0x73
kernel:warning: [10419813.158568]  blk_execute_rq+0x6e/0x9a
kernel:warning: [10419813.158577]  nvme_submit_user_cmd+0xc4/0x320 [nvme_core]
kernel:warning: [10419813.158583]  nvme_user_cmd+0x21d/0x3c0 [nvme_core]
kernel:warning: [10419813.158591]  nvme_ioctl+0x175/0x220 [nvme_core]
kernel:warning: [10419813.158598]  blkdev_ioctl+0x878/0x912
kernel:warning: [10419813.158605]  block_ioctl+0x41/0x45
kernel:warning: [10419813.158611]  do_vfs_ioctl+0xaa/0x602
kernel:warning: [10419813.158616]  ? __audit_syscall_entry+0xac/0xef
kernel:warning: [10419813.158621]  ? syscall_trace_enter+0x1ce/0x2b8
kernel:warning: [10419813.158627]  SyS_ioctl+0x79/0x84



* Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-24 21:40 way to unbind a bad nvme device/controller without powering off system James Puthukattukaran
@ 2022-10-24 22:36 ` Keith Busch
  2022-10-25  0:02   ` [External] : " James Puthukattukaran
  0 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2022-10-24 22:36 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: linux-nvme

On Mon, Oct 24, 2022 at 05:40:30PM -0400, James Puthukattukaran wrote:
> Hi -
> 
> I'm seeing a scenario with what appears to be a non-functioning nvme controller/drive: IO transactions are timing out and the controller is not responding to any commands. The controller seems to be disabled (nvme_dev_disable called via nvme_timeout), but we're still seeing the nvme_reset_work thread blocked and not making progress. I tried to remove the controller via the hotplug sysfs interface, and that also hangs behind the reset thread, waiting for it to complete.

If it's in a hotplug slot, then just pull it out.
 
> I thought the disable-controller path does not talk to the controller and simply unblocks the queues and cleans them out before unbinding the controller from the device. I'm not sure why the reset thread is still stuck, then. Does the reset thread have to run its course even though the controller has been disabled? I'm trying to understand the flow here.
> 
> I guess what I'm really looking for is a way to simply unbind the device from the driver, kill any threads, and allow the device to be powered off via the hotplug interface (trying to avoid rebooting the system to remove the device).

What kernel are you using?

Generally, the default timeout is really long. If you have a broken
controller, it could take several minutes before the driver unblocks
forward progress to unbind.
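
For reference, the timeouts involved are nvme_core module parameters and can be shortened for experiments; the values below are examples only:

  cat /sys/module/nvme_core/parameters/io_timeout
  cat /sys/module/nvme_core/parameters/admin_timeout

  # e.g. at module load time:
  modprobe nvme_core io_timeout=30 admin_timeout=30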



* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-24 22:36 ` Keith Busch
@ 2022-10-25  0:02   ` James Puthukattukaran
  2022-10-25  2:26     ` Keith Busch
  0 siblings, 1 reply; 9+ messages in thread
From: James Puthukattukaran @ 2022-10-25  0:02 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme



On 10/24/22 18:36, Keith Busch wrote:
> On Mon, Oct 24, 2022 at 05:40:30PM -0400, James Puthukattukaran wrote:
>> Hi -
>>
>> I'm seeing a scenario with what appears to be a non-functioning nvme controller/drive: IO transactions are timing out and the controller is not responding to any commands. The controller seems to be disabled (nvme_dev_disable called via nvme_timeout), but we're still seeing the nvme_reset_work thread blocked and not making progress. I tried to remove the controller via the hotplug sysfs interface, and that also hangs behind the reset thread, waiting for it to complete.
> 
> If it's in a hotplug slot, then just pull it out.

I'm looking for a programmatic (remote) way to do it. Also, pulling the device causes a surprise removal; won't that leave the nvme controller data structures in a bad state, still not unbound from the driver?
>  
>> I thought the disable-controller path does not talk to the controller and simply unblocks the queues and cleans them out before unbinding the controller from the device. I'm not sure why the reset thread is still stuck, then. Does the reset thread have to run its course even though the controller has been disabled? I'm trying to understand the flow here.
>>
>> I guess what I'm really looking for is a way to simply unbind the device from the driver, kill any threads, and allow the device to be powered off via the hotplug interface (trying to avoid rebooting the system to remove the device).
> 
> What kernel are you using?

A 5.14-based kernel.

> 
> Generally, the default timeout is really long. If you have a broken
> controller, it could take several minutes before the driver unblocks
> forward progress to unbind.
One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
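
To make the idea concrete, usage would look something like the hypothetical knob below, as a sibling of the existing reset_controller attribute (which only schedules the very reset work that is wedged here):

  # existing: schedules nvme_reset_work
  echo 1 > /sys/class/nvme/nvme3/reset_controller

  # proposed (does not exist today): disable and tear down the
  # controller without attempting reinitialization
  echo 1 > /sys/class/nvme/nvme3/remove_controller
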
thanks




* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-25  0:02   ` [External] : " James Puthukattukaran
@ 2022-10-25  2:26     ` Keith Busch
  2022-10-25 16:56       ` Keith Busch
  0 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2022-10-25  2:26 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: linux-nvme

On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
> On 10/24/22 18:36, Keith Busch wrote:
> 
> > 
> > Generally, the default timeout is really long. If you have a broken
> > controller, it could take several minutes before the driver unblocks
> > forward progress to unbind.
> One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
> thanks

In your log snippet, there's this line:

  kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller

The next action the driver takes after logging that is to drain any
outstanding IO through a forced reset, and all subsequent tasks *should*
be unblocked after that completes to allow the unbinding, so I don't
think adding any new sysfs knobs is going to help if it's not already
succeeding.
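
One thing worth checking while it's wedged, assuming your kernel exposes the attribute, is the controller state:

  cat /sys/class/nvme/nvme3/state

A controller stuck in "resetting" there would match the blocked nvme_reset_work in your traces.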

The only other thing that looks odd is that one of your stuck tasks is a
user passthrough command, but that should have also been cleared out by
the reset. Do you know what command that process is sending? I'll need
to double check your kernel version to see if there's anything missing
in that driver to ensure the unbinding succeeds. 



* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-25  2:26     ` Keith Busch
@ 2022-10-25 16:56       ` Keith Busch
  2022-10-28  3:14         ` James Puthukattukaran
  2022-11-08 19:13         ` James Puthukattukaran
  0 siblings, 2 replies; 9+ messages in thread
From: Keith Busch @ 2022-10-25 16:56 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: linux-nvme

On Mon, Oct 24, 2022 at 08:26:54PM -0600, Keith Busch wrote:
> On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
> > On 10/24/22 18:36, Keith Busch wrote:
> > 
> > > 
> > > Generally, the default timeout is really long. If you have a broken
> > > controller, it could take several minutes before the driver unblocks
> > > forward progress to unbind.
> > One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
> > thanks
> 
> In your log snippet, there's this line:
> 
>   kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
> 
> The next action the driver takes after logging that is to drain any
> outstanding IO through a forced reset, and all subsequent tasks *should*
> be unblocked after that completes to allow the unbinding, so I don't
> think adding any new sysfs knobs is going to help if it's not already
> succeeding.
> 
> The only other thing that looks odd is that one of your stuck tasks is a
> user passthrough command, but that should have also been cleared out by
> the reset. Do you know what command that process is sending? I'll need
> to double check your kernel version to see if there's anything missing
> in that driver to ensure the unbinding succeeds. 

I think there could be a mismatched queue quiesce state happening, but
there's some fixes for this in later kernels. Could you possibly try
something newer, like 6.0-stable, as an experiment?



* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-25 16:56       ` Keith Busch
@ 2022-10-28  3:14         ` James Puthukattukaran
  2022-11-08 19:13         ` James Puthukattukaran
  1 sibling, 0 replies; 9+ messages in thread
From: James Puthukattukaran @ 2022-10-28  3:14 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme



On 10/25/22 12:56, Keith Busch wrote:
> On Mon, Oct 24, 2022 at 08:26:54PM -0600, Keith Busch wrote:
>> On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
>>> On 10/24/22 18:36, Keith Busch wrote:
>>>
>>>>
>>>> Generally, the default timeout is really long. If you have a broken
>>>> controller, it could take several minutes before the driver unblocks
>>>> forward progress to unbind.
>>> One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
>>> thanks
>>
>> In your log snippet, there's this line:
>>
>>   kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
>>
>> The next action the driver takes after logging that is to drain any
>> outstanding IO through a forced reset, and all subsequent tasks *should*
>> be unblocked after that completes to allow the unbinding, so I don't
>> think adding any new sysfs knobs is going to help if it's not already
>> succeeding.
>>
>> The only other thing that looks odd is that one of your stuck tasks is a
>> user passthrough command, but that should have also been cleared out by
>> the reset. Do you know what command that process is sending? I'll need
>> to double check your kernel version to see if there's anything missing
>> in that driver to ensure the unbinding succeeds. 
> 

The nvme command is either id-ctrl or id-ns; rather pedestrian.

> I think there could be a mismatched queue quiesce state happening, but
> there's some fixes for this in later kernels. Could you possibly try
> something newer, like 6.0-stable, as an experiment?


Can you point me to the patches for this? Would it be straightforward to backport?
thanks



* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-10-25 16:56       ` Keith Busch
  2022-10-28  3:14         ` James Puthukattukaran
@ 2022-11-08 19:13         ` James Puthukattukaran
  2022-11-08 23:15           ` Keith Busch
  1 sibling, 1 reply; 9+ messages in thread
From: James Puthukattukaran @ 2022-11-08 19:13 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme



On 10/25/22 12:56, Keith Busch wrote:
> On Mon, Oct 24, 2022 at 08:26:54PM -0600, Keith Busch wrote:
>> On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
>>> On 10/24/22 18:36, Keith Busch wrote:
>>>
>>>>
>>>> Generally, the default timeout is really long. If you have a broken
>>>> controller, it could take several minutes before the driver unblocks
>>>> forward progress to unbind.
>>> One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
>>> thanks
>>
>> In your log snippet, there's this line:
>>
>>   kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
>>
>> The next action the driver takes after logging that is to drain any
>> outstanding IO through a forced reset, and all subsequent tasks *should*
>> be unblocked after that completes to allow the unbinding, so I don't
>> think adding any new sysfs knobs is going to help if it's not already
>> succeeding.
>>
>> The only other thing that looks odd is that one of your stuck tasks is a
>> user passthrough command, but that should have also been cleared out by
>> the reset. Do you know what command that process is sending? I'll need
>> to double check your kernel version to see if there's anything missing
>> in that driver to ensure the unbinding succeeds. 
> 
> I think there could be a mismatched queue quiesce state happening, but
> there's some fixes for this in later kernels. Could you possibly try
> something newer, like 6.0-stable, as an experiment?

Is this the patch for the mismatch?

commit d4060d2be1132596154f31f4d57976bd103e969d
Author: Tao Chiu <taochiu@synology.com>
Date:   Mon Apr 26 10:53:55 2021 +0800

    nvme-pci: fix controller reset hang when racing with nvme_timeout
    




* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-11-08 19:13         ` James Puthukattukaran
@ 2022-11-08 23:15           ` Keith Busch
  2022-11-10 16:51             ` James Puthukattukaran
  0 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2022-11-08 23:15 UTC (permalink / raw)
  To: James Puthukattukaran; +Cc: linux-nvme

On Tue, Nov 08, 2022 at 02:13:25PM -0500, James Puthukattukaran wrote:
> On 10/25/22 12:56, Keith Busch wrote:
> > On Mon, Oct 24, 2022 at 08:26:54PM -0600, Keith Busch wrote:
> >> On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
> >>> On 10/24/22 18:36, Keith Busch wrote:
> >>>
> >>>>
> >>>> Generally, the default timeout is really long. If you have a broken
> >>>> controller, it could take several minutes before the driver unblocks
> >>>> forward progress to unbind.
> >>> One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
> >>> thanks
> >>
> >> In your log snippet, there's this line:
> >>
> >>   kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
> >>
> >> The next action the driver takes after logging that is to drain any
> >> outstanding IO through a forced reset, and all subsequent tasks *should*
> >> be unblocked after that completes to allow the unbinding, so I don't
> >> think adding any new sysfs knobs is going to help if it's not already
> >> succeeding.
> >>
> >> The only other thing that looks odd is that one of your stuck tasks is a
> >> user passthrough command, but that should have also been cleared out by
> >> the reset. Do you know what command that process is sending? I'll need
> >> to double check your kernel version to see if there's anything missing
> >> in that driver to ensure the unbinding succeeds. 
> > 
> > I think there could be a mismatched queue quiesce state happening, but
> > there's some fixes for this in later kernels. Could you possibly try
> > something newer, like 6.0-stable, as an experiment?
> 
> Is this the patch for the mismatch?
> 
> commit d4060d2be1132596154f31f4d57976bd103e969d
> Author: Tao Chiu <taochiu@synology.com>
> Date:   Mon Apr 26 10:53:55 2021 +0800
> 
>     nvme-pci: fix controller reset hang when racing with nvme_timeout

That doesn't look like what's happening here. I was thinking of this
one:

  commit 9e6a6b1212100148c109675e003369e3e219dbd9
  Author: Ming Lei <ming.lei@redhat.com>
  Date:   Thu Oct 14 16:17:08 2021 +0800
  
      nvme: paring quiesce/unquiesce

It's not a clean cherry-pick, though.



* Re: [External] : Re: way to unbind a bad nvme device/controller without powering off system
  2022-11-08 23:15           ` Keith Busch
@ 2022-11-10 16:51             ` James Puthukattukaran
  0 siblings, 0 replies; 9+ messages in thread
From: James Puthukattukaran @ 2022-11-10 16:51 UTC (permalink / raw)
  To: Keith Busch; +Cc: linux-nvme



On 11/8/22 18:15, Keith Busch wrote:
> On Tue, Nov 08, 2022 at 02:13:25PM -0500, James Puthukattukaran wrote:
>> On 10/25/22 12:56, Keith Busch wrote:
>>> On Mon, Oct 24, 2022 at 08:26:54PM -0600, Keith Busch wrote:
>>>> On Mon, Oct 24, 2022 at 08:02:33PM -0400, James Puthukattukaran wrote:
>>>>> On 10/24/22 18:36, Keith Busch wrote:
>>>>>
>>>>>>
>>>>>> Generally, the default timeout is really long. If you have a broken
>>>>>> controller, it could take several minutes before the driver unblocks
>>>>>> forward progress to unbind.
>>>>> One concern is that the reset-controller flow attempts to reinitialize the controller, and this will cause problems if the controller is bad. Would it make sense to have a sysfs "remove_controller" interface that simply goes through and does an nvme_dev_disable() with the assumption that the controller is dead? Will the nvme_kill_queues() in nvme_dev_disable() unwedge any potential nvme reset thread that is blocked and thus allow the nvme_remove() flow to complete?
>>>>> thanks
>>>>
>>>> In your log snippet, there's this line:
>>>>
>>>>   kernel:warning: [10416608.580157] nvme nvme3: I/O 209 QID 1 timeout, disable controller
>>>>
>>>> The next action the driver takes after logging that is to drain any
>>>> outstanding IO through a forced reset, and all subsequent tasks *should*
>>>> be unblocked after that completes to allow the unbinding, so I don't
>>>> think adding any new sysfs knobs is going to help if it's not already
>>>> succeeding.
>>>>
>>>> The only other thing that looks odd is that one of your stuck tasks is a
>>>> user passthrough command, but that should have also been cleared out by
>>>> the reset. Do you know what command that process is sending? I'll need
>>>> to double check your kernel version to see if there's anything missing
>>>> in that driver to ensure the unbinding succeeds. 
>>>
>>> I think there could be a mismatched queue quiesce state happening, but
>>> there's some fixes for this in later kernels. Could you possibly try
>>> something newer, like 6.0-stable, as an experiment?
>>
>> Is this the patch for the mismatch?
>>
>> commit d4060d2be1132596154f31f4d57976bd103e969d
>> Author: Tao Chiu <taochiu@synology.com>
>> Date:   Mon Apr 26 10:53:55 2021 +0800
>>
>>     nvme-pci: fix controller reset hang when racing with nvme_timeout
> 
> That doesn't look like what's happening here. I was thinking of this
> one:
> 
>   commit 9e6a6b1212100148c109675e003369e3e219dbd9
>   Author: Ming Lei <ming.lei@redhat.com>
>   Date:   Thu Oct 14 16:17:08 2021 +0800
>   
>       nvme: paring quiesce/unquiesce
> 
> It's not a clean cherry-pick, though.

Thanks. We are desperately trying to reproduce the issue so we can confirm the patch fixes it. The plan is to run nvme IO while repeatedly triggering a controller reset (via sysfs) and, in another thread, repeatedly issuing nvme user IO commands; a sketch of the plan is below. Fingers crossed.
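
Roughly, the reproducer sketch looks like this; device names, run times, and intervals are placeholders:

  # thread 1: steady IO load
  fio --name=load --filename=/dev/nvme3n1 --rw=randread --iodepth=32 \
      --ioengine=libaio --direct=1 --time_based --runtime=3600 &

  # thread 2: repeated controller resets via sysfs
  while true; do echo 1 > /sys/class/nvme/nvme3/reset_controller; sleep 5; done &

  # thread 3: repeated user passthrough commands
  while true; do nvme id-ctrl /dev/nvme3 >/dev/null; nvme id-ns /dev/nvme3n1 >/dev/null; done &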



