* Issue with discard with NVME and Infinibox Storage
@ 2023-04-03 17:35 Laurence Oberman
  2023-04-03 18:00 ` Keith Busch
  0 siblings, 1 reply; 5+ messages in thread
From: Laurence Oberman @ 2023-04-03 17:35 UTC (permalink / raw)
  To: minlei, jmeneghi, Hellwig, Christoph, axboe; +Cc: linux-block, linux-nvme

Hello Ming and Christoph

Issue with Infinibox storage
----------------------------
I actually found two issues here.

Issue 1
Kernels 5.15 to 5.18 inclusive recognize the discard support on the
Infinibox device but they fail in the nvme_setup_discard function call


[  339.591118] ------------[ cut here ]------------
[  339.591134] WARNING: CPU: 3 PID: 32 at drivers/nvme/host/core.c:868 nvme_setup_discard+0x16e/0x1e0 [nvme_core]

[  339.591349] CPU: 3 PID: 32 Comm: kworker/3:0H Not tainted 5.15.0 #1
[  339.591404] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[  339.591423] Workqueue: kblockd blk_mq_run_work_fn
[  339.591458] RIP: 0010:nvme_setup_discard+0x16e/0x1e0 [nvme_core]
[  339.591475] Code: 38 48 8b b8 48 0b 00 00 48 2b 3d 2d 69 43 d3 48 c1 ff 06 48 c1 e7 0c 48 03 3d 2e 69 43 d3 48 89 f8 48 85 f6 0f 85 dd fe ff ff <0f> 0b ba 00 00 00 80 48 01 d7 72 52 48 c7 c2 00 00 00 80 48 2b 15
[  339.591505] RSP: 0018:ffffbacb0052fcf8 EFLAGS: 00010212
[  339.591516] RAX: ffff93798b67e000 RBX: ffff937994565780 RCX: ffff937a0b67e000
[  339.591529] RDX: 0000000000000020 RSI: 0000000000000000 RDI: ffff93798b67e000
[  339.591541] RBP: ffff93799452f1b0 R08: ffff93798b67e000 R09: 00000000014000c0
[  339.591553] R10: 0000000000000800 R11: 0000000000000000 R12: ffff9379a0df1000
[  339.591566] R13: 0000000000000001 R14: ffffbacb0052fde0 R15: ffff9379a0df1000
[  339.591578] FS:  0000000000000000(0000) GS:ffff9379b9ec0000(0000) knlGS:0000000000000000
[  339.591602] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  339.591617] CR2: 00007f4b7792f000 CR3: 000000010dcf2003 CR4: 0000000000770ee0
[  339.591641] PKRU: 55555554
[  339.591648] Call Trace:
[  339.591656]  nvme_setup_cmd+0xac/0x650 [nvme_core]
[  339.591673]  nvme_tcp_queue_rq+0x6a/0x390 [nvme_tcp]
[  339.591685]  blk_mq_dispatch_rq_list+0x139/0x810
[  339.591698]  ? blk_mq_flush_busy_ctxs+0xf9/0x120
[  339.591708]  __blk_mq_sched_dispatch_requests+0x135/0x140
[  339.591720]  blk_mq_sched_dispatch_requests+0x30/0x60
[  339.591746]  __blk_mq_run_hw_queue+0x2b/0x60
[  339.591757]  process_one_work+0x1cb/0x370
[  339.592339]  worker_thread+0x30/0x380
[  339.593200]  ? process_one_work+0x370/0x370
[  339.593990]  kthread+0x118/0x140
[  339.594710]  ? set_kthread_struct+0x40/0x40
[  339.595267]  ret_from_fork+0x1f/0x30
[  339.596077] ---[ end trace 547450bc9931a628 ]---
[  339.596806] blk_update_request: I/O error, dev nvme1c1n1, sector 20971712 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.741735] blk_update_request: I/O error, dev nvme1c1n1, sector 21037248 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.743952] blk_update_request: I/O error, dev nvme1c1n1, sector 21102784 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.745480] blk_update_request: I/O error, dev nvme1c1n1, sector 21168320 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.746425] blk_update_request: I/O error, dev nvme1c1n1, sector 21233856 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.747150] blk_update_request: I/O error, dev nvme1c1n1, sector 21299392 op 0x3:(DISCARD) flags 0x2004000 phys_seg 1 prio class 0
[  339.747948] blk_update_request: I/O error, dev nvme1c1n1, sector 21364928 op 0x3:(DISCARD) flags 0x2000000 phys_seg 1 prio class 0


Issue 2
Trying to narrow this down: kernels 5.19 and higher (6.3 included) no longer
recognize discard support on the Infinibox device and log the message below,
so I cannot run the test for the discard issue there.

[   35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
[   64.810437] XFS (nvme1n1): mounting with "discard" option, but the device does not support discard
[   64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-4a26-894b-8b0f8d79a98a

I then bisected between 5.18 and 5.19 to this commit

1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit
commit 1a86924e4f464757546d7f7bdc469be237918395
Author: Tom Yan <tom.ty89@gmail.com>
Date:   Fri Apr 29 12:52:43 2022 +0800

    nvme: fix interpretation of DMRSL
    
    DMRSL is in the unit of logical blocks, while max_discard_sectors is
    in the unit of "linux sector".
    
    Signed-off-by: Tom Yan <tom.ty89@gmail.com>
    Signed-off-by: Christoph Hellwig <hch@lst.de>

 drivers/nvme/host/core.c | 6 ++++--
 drivers/nvme/host/nvme.h | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)
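
The unit mismatch that the commit message describes can be illustrated with a minimal sketch. This is not the kernel's code: `lba_to_sectors` and the local `SECTOR_SHIFT` definition are hypothetical names used for illustration, and the sketch assumes the logical block size is a power of two of at least 512 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: a "linux sector" is 512 bytes (2^9). */
#define SECTOR_SHIFT 9

/*
 * Hypothetical helper: convert a count of logical blocks (the unit DMRSL
 * is reported in) into a count of 512-byte sectors. lba_shift is log2 of
 * the logical block size, e.g. 12 for 4096-byte blocks.
 */
static inline uint64_t lba_to_sectors(uint64_t lba_count, unsigned int lba_shift)
{
    assert(lba_shift >= SECTOR_SHIFT);
    return lba_count << (lba_shift - SECTOR_SHIFT);
}
```

Under these assumptions, a DMRSL of 256 logical blocks on a namespace with 4096-byte blocks corresponds to 2048 sectors of 512 bytes, while on a namespace with 512-byte blocks the two units coincide.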


Note that Infinidat mentioned the following in the case they logged with us.
They say they fully adhere to TP4040 MDTS:
Towards the NVMe-oF 2.0 specification, TP4040 - Max Data Transfer for
non-IO Commands (MDTS) was released with additional fields to control these
parameters.
These parameters are supported in kernel versions 5.15 and above.  ****

Our storage target will reply with 0 for bit 2 of the ONCS, indicating
UNMAP is supported based on the DMRL, DMRSL, and DMSL values. 
(older kernels will interpret these values as UNMAP NOT SUPPORTED)


Please let me know your thoughts on both issues.

Regards
Laurence Oberman




* Re: Issue with discard with NVME and Infinibox Storage
  2023-04-03 17:35 Issue with discard with NVME and Infinibox Storage Laurence Oberman
@ 2023-04-03 18:00 ` Keith Busch
  2023-04-03 18:18   ` Laurence Oberman
  0 siblings, 1 reply; 5+ messages in thread
From: Keith Busch @ 2023-04-03 18:00 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: minlei, jmeneghi, Hellwig, Christoph, axboe, linux-block, linux-nvme

On Mon, Apr 03, 2023 at 01:35:22PM -0400, Laurence Oberman wrote:
> Hello Ming and Christoph
> 
> Issue with Infinibox storage
> ----------------------------
> Really discovered 2 issues here 
> 
> Issue 1
> Kernels 5.15 to 5.18 inclusive recognize the discard support on the
> Infinibox device but they fail in the nvme_setup_discard function call

This first issue should be fixed with this commit:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=37f0dc2ec78af0c3f35dd05578763de059f6fe77

> Issue 2
> Trying to narrow this down.
> 5.19 and higher (6.3 included), no longer support discard on the
> Infinibox device and log this message so I cannot run the test for the
> discard issue
> 
> [   35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-
> 01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
> [   64.810437] XFS (nvme1n1): mounting with "discard" option, but the
> device does not support discard
> [   64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-
> 4a26-894b-8b0f8d79a98a
> 
> I then bisected between 5.18 and 5.19 to this commit
> 
> 1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit


> commit 1a86924e4f464757546d7f7bdc469be237918395
> Author: Tom Yan <tom.ty89@gmail.com>
> Date:   Fri Apr 29 12:52:43 2022 +0800
> 
>     nvme: fix interpretation of DMRSL
>     
>     DMRSL is in the unit of logical blocks, while max_discard_sectors is
>     in the unit of "linux sector".
>     
>     Signed-off-by: Tom Yan <tom.ty89@gmail.com>
>     Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
>  drivers/nvme/host/core.c | 6 ++++--
>  drivers/nvme/host/nvme.h | 1 +
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> 
> Note that Infinidat mentioned this in our case they logged with us
> They say they fully adhere to TP4040 MDTS.
> Towards NVMe-oF 2.0 specification, TP4040  - Max Data Transfer for non-
> IO Commands (MDTS) was released with additional fields to control these
> parameters.
> These parameters are supported in kernel versions 5.15 and above.  ****
> 
> Our storage target will reply with 0 for bit 2 of the ONCS, indicating
> UNMAP is supported based on the DMRL, DMRSL, and DMSL values. 
> (older kernels will interpret these values as UNMAP NOT SUPPORTED)
> 
> 
> Let me know your thoughts please. for both issues

The commit you found unconditionally sets the discard queue limit to the
reported DMRSL, so it sounds like your target is reporting DMRSL as '0'. Prior
to that commit, we'd use that value only if it was non-zero. I hope that helps.
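
The behavior change described above can be modeled roughly as follows. This is a simplified sketch, not the driver's code: the function names and the `DEFAULT_MAX_DISCARD` fallback value are assumptions made purely for illustration.

```c
#include <stdint.h>

/* Assumed stand-in for the default limit used when discard is supported. */
#define DEFAULT_MAX_DISCARD UINT32_MAX

/* Model of the older logic as described: use DMRSL only if it is non-zero. */
static inline uint32_t discard_limit_old(uint32_t dmrsl)
{
    return dmrsl ? dmrsl : DEFAULT_MAX_DISCARD;
}

/* Model of the newer logic as described: always take the reported DMRSL. */
static inline uint32_t discard_limit_new(uint32_t dmrsl)
{
    return dmrsl;
}

/* In this model, a limit of zero means discard is treated as unsupported. */
static inline int discard_supported(uint32_t limit)
{
    return limit != 0;
}
```

In this model, a target that reports DMRSL as 0 still gets a non-zero limit under the old logic but ends up with a limit of 0 under the new one, which would be consistent with the "device does not support discard" message seen on 5.19 and later.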


* Re: Issue with discard with NVME and Infinibox Storage
  2023-04-03 18:00 ` Keith Busch
@ 2023-04-03 18:18   ` Laurence Oberman
  2023-04-03 18:29     ` Keith Busch
  2023-04-03 18:40     ` Laurence Oberman
  0 siblings, 2 replies; 5+ messages in thread
From: Laurence Oberman @ 2023-04-03 18:18 UTC (permalink / raw)
  To: Keith Busch
  Cc: minlei, jmeneghi, Hellwig, Christoph, axboe, linux-block, linux-nvme

On Mon, 2023-04-03 at 12:00 -0600, Keith Busch wrote:
> On Mon, Apr 03, 2023 at 01:35:22PM -0400, Laurence Oberman wrote:
> > Hello Ming and Christoph
> > 
> > Issue with Infinibox storage
> > ----------------------------
> > Really discovered 2 issues here 
> > 
> > Issue 1
> > Kernels 5.15 to 5.18 inclusive recognize the discard support on the
> > Infinibox device but they fail in the nvme_setup_discard function
> > call
> 
> This first issue should be fixed with this commit:
> 
>  
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=37f0dc2ec78af0c3f35dd05578763de059f6fe77
> 
> > Issue 2
> > Trying to narrow this down.
> > 5.19 and higher (6.3 included), no longer support discard on the
> > Infinibox device and log this message so I cannot run the test for
> > the
> > discard issue
> > 
> > [   35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-
> > 01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
> > [   64.810437] XFS (nvme1n1): mounting with "discard" option, but
> > the
> > device does not support discard
> > [   64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-
> > 4a26-894b-8b0f8d79a98a
> > 
> > I then bisected between 5.18 and 5.19 to this commit
> > 
> > 1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit
> 
> 
> > commit 1a86924e4f464757546d7f7bdc469be237918395
> > Author: Tom Yan <tom.ty89@gmail.com>
> > Date:   Fri Apr 29 12:52:43 2022 +0800
> > 
> >     nvme: fix interpretation of DMRSL
> >     
> >     DMRSL is in the unit of logical blocks, while max_discard_sectors is
> >     in the unit of "linux sector".
> >     
> >     Signed-off-by: Tom Yan <tom.ty89@gmail.com>
> >     Signed-off-by: Christoph Hellwig <hch@lst.de>
> > 
> >  drivers/nvme/host/core.c | 6 ++++--
> >  drivers/nvme/host/nvme.h | 1 +
> >  2 files changed, 5 insertions(+), 2 deletions(-)
> > 
> > 
> > Note that Infinidat mentioned this in our case they logged with us
> > They say they fully adhere to TP4040 MDTS.
> > Towards NVMe-oF 2.0 specification, TP4040  - Max Data Transfer for
> > non-
> > IO Commands (MDTS) was released with additional fields to control
> > these
> > parameters.
> > These parameters are supported in kernel versions 5.15 and above. 
> > ****
> > 
> > Our storage target will reply with 0 for bit 2 of the ONCS,
> > indicating
> > UNMAP is supported based on the DMRL, DMRSL, and DMSL values. 
> > (older kernels will interpret these values as UNMAP NOT SUPPORTED)
> > 
> > 
> > Let me know your thoughts please. for both issues
> 
> The commit you found unconditionally sets the discard queue limit to
> the
> reported DMRSL, so it sounds like your target is reporting DMRSL as
> '0'. Prior
> to that commit, we'd use that value only if it was non-zero. I hope
> that helps.
> 



Hello Keith,
Many Thanks as always
I will inform Infinidat and have them figure this out.

Regards
Laurence



* Re: Issue with discard with NVME and Infinibox Storage
  2023-04-03 18:18   ` Laurence Oberman
@ 2023-04-03 18:29     ` Keith Busch
  2023-04-03 18:40     ` Laurence Oberman
  1 sibling, 0 replies; 5+ messages in thread
From: Keith Busch @ 2023-04-03 18:29 UTC (permalink / raw)
  To: Laurence Oberman
  Cc: minlei, jmeneghi, Hellwig, Christoph, axboe, linux-block, linux-nvme

On Mon, Apr 03, 2023 at 02:18:21PM -0400, Laurence Oberman wrote:
> On Mon, 2023-04-03 at 12:00 -0600, Keith Busch wrote:
> > > 
> > > 
> > > Let me know your thoughts please. for both issues
> > 
> > The commit you found unconditionally sets the discard queue limit to
> > the
> > reported DMRSL, so it sounds like your target is reporting DMRSL as
> > '0'. Prior
> > to that commit, we'd use that value only if it was non-zero. I hope
> > that helps.
> > 
> 
> 
> 
> Hello Keith,
> Many Thanks as always
> I will inform Infinidat and have them figure this out.

If you have access to such a target and want to quickly verify, you could run:

  # nvme nvm-id-ctrl /dev/nvme0

and see what it reports for DMRSL.


* Re: Issue with discard with NVME and Infinibox Storage
  2023-04-03 18:18   ` Laurence Oberman
  2023-04-03 18:29     ` Keith Busch
@ 2023-04-03 18:40     ` Laurence Oberman
  1 sibling, 0 replies; 5+ messages in thread
From: Laurence Oberman @ 2023-04-03 18:40 UTC (permalink / raw)
  To: Keith Busch
  Cc: minlei, jmeneghi, Hellwig, Christoph, axboe, linux-block, linux-nvme

On Mon, 2023-04-03 at 14:18 -0400, Laurence Oberman wrote:
> On Mon, 2023-04-03 at 12:00 -0600, Keith Busch wrote:
> > On Mon, Apr 03, 2023 at 01:35:22PM -0400, Laurence Oberman wrote:
> > > Hello Ming and Christoph
> > > 
> > > Issue with Infinibox storage
> > > ----------------------------
> > > Really discovered 2 issues here 
> > > 
> > > Issue 1
> > > Kernels 5.15 to 5.18 inclusive recognize the discard support on
> > > the
> > > Infinibox device but they fail in the nvme_setup_discard function
> > > call
> > 
> > This first issue should be fixed with this commit:
> > 
> >  
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=37f0dc2ec78af0c3f35dd05578763de059f6fe77
> > 
> > > Issue 2
> > > Trying to narrow this down.
> > > 5.19 and higher (6.3 included), no longer support discard on the
> > > Infinibox device and log this message so I cannot run the test
> > > for
> > > the
> > > discard issue
> > > 
> > > [   35.989809] nvme nvme1: new ctrl: NQN "nqn.2020-
> > > 01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
> > > [   64.810437] XFS (nvme1n1): mounting with "discard" option, but
> > > the
> > > device does not support discard
> > > [   64.812298] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-
> > > 18cc-
> > > 4a26-894b-8b0f8d79a98a
> > > 
> > > I then bisected between 5.18 and 5.19 to this commit
> > > 
> > > 1a86924e4f464757546d7f7bdc469be237918395 is the first bad commit
> > 
> > 
> > > commit 1a86924e4f464757546d7f7bdc469be237918395
> > > Author: Tom Yan <tom.ty89@gmail.com>
> > > Date:   Fri Apr 29 12:52:43 2022 +0800
> > > 
> > >     nvme: fix interpretation of DMRSL
> > >     
> > >     DMRSL is in the unit of logical blocks, while max_discard_sectors is
> > >     in the unit of "linux sector".
> > >     
> > >     Signed-off-by: Tom Yan <tom.ty89@gmail.com>
> > >     Signed-off-by: Christoph Hellwig <hch@lst.de>
> > > 
> > >  drivers/nvme/host/core.c | 6 ++++--
> > >  drivers/nvme/host/nvme.h | 1 +
> > >  2 files changed, 5 insertions(+), 2 deletions(-)
> > > 
> > > 
> > > Note that Infinidat mentioned this in our case they logged with us
> > > They say they fully adhere to TP4040 MDTS.
> > > Towards NVMe-oF 2.0 specification, TP4040  - Max Data Transfer
> > > for
> > > non-
> > > IO Commands (MDTS) was released with additional fields to control
> > > these
> > > parameters.
> > > These parameters are supported in kernel versions 5.15 and
> > > above. 
> > > ****
> > > 
> > > Our storage target will reply with 0 for bit 2 of the ONCS,
> > > indicating
> > > UNMAP is supported based on the DMRL, DMRSL, and DMSL values. 
> > > (older kernels will interpret these values as UNMAP NOT
> > > SUPPORTED)
> > > 
> > > 
> > > Let me know your thoughts please. for both issues
> > 
> > The commit you found unconditionally sets the discard queue limit
> > to
> > the
> > reported DMRSL, so it sounds like your target is reporting DMRSL as
> > '0'. Prior
> > to that commit, we'd use that value only if it was non-zero. I hope
> > that helps.
> > 
> 
> 
> 
> Hello Keith,
> Many Thanks as always
> I will inform Infinidat and have them figure this out.
> 
> Regards
> Laurence


Hi Keith 
Closing the loop

The challenge was that the other issue was preventing me from testing the fix.

I reverted that commit (1a86924e4f464757546d7f7bdc469be237918395) and
now the original test passes so we are good here.

Linux localhost.localdomain 6.3.0-rc4+ #13 SMP PREEMPT_DYNAMIC Mon Apr 3 13:40:21 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

[   57.090328] nvme nvme1: new ctrl: NQN "nqn.2020-01.com.infinidat:36000-subsystem-696", addr 192.168.1.2:4420
[   61.441213] XFS (nvme1n1): Mounting V5 Filesystem 6763a33f-18cc-4a26-894b-8b0f8d79a98a
[   64.627670] XFS (nvme1n1): Ending clean mount
[   64.665657] xfs filesystem being mounted at /data supports timestamps until 2038 (0x7fffffff)

The fio test then passes with no issues, confirming that the fix you
called out (Ming had mentioned it too) resolves the original issue.


Infinidat will have to fix their DMRSL issue.

Thanks folks

Regards as always
Laurence Oberman


