linux-nvme.lists.infradead.org archive mirror
* 5.10.40-1 - Invalid SGL for payload:131072 nents:13
@ 2021-07-20 22:07 Andy Smith
  2021-07-21  0:34 ` Keith Busch
  2021-07-24  2:46 ` Ming Lei
  0 siblings, 2 replies; 8+ messages in thread
From: Andy Smith @ 2021-07-20 22:07 UTC (permalink / raw)
  To: Linux-nvme

Hi,

I have a Debian stable machine with a Samsung PM983 NVMe and a
Samsung SM883 in an MD RAID-1. It's been running the 4.19.x Debian
packaged kernel for almost 2 years now.

About 24 hours ago I upgraded its kernel to the buster-backports
kernel which is version 5.10.40-1~bpo10+1 and around four hours
after that I got this:

Jul 20 02:17:54 lamb kernel: [21061.388607] sg[0] phys_addr:0x00000015eb803000 offset:0 length:4096 dma_address:0x000000209e7b7000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.389775] sg[1] phys_addr:0x00000015eb7bc000 offset:0 length:4096 dma_address:0x000000209e7b8000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.390874] sg[2] phys_addr:0x00000015eb809000 offset:0 length:4096 dma_address:0x000000209e7b9000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.391974] sg[3] phys_addr:0x00000015eb766000 offset:0 length:4096 dma_address:0x000000209e7ba000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.393042] sg[4] phys_addr:0x00000015eb7a3000 offset:0 length:4096 dma_address:0x000000209e7bb000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.394086] sg[5] phys_addr:0x00000015eb7c6000 offset:0 length:4096 dma_address:0x000000209e7bc000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.395078] sg[6] phys_addr:0x00000015eb7c2000 offset:0 length:4096 dma_address:0x000000209e7bd000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.396042] sg[7] phys_addr:0x00000015eb7a9000 offset:0 length:4096 dma_address:0x000000209e7be000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.397004] sg[8] phys_addr:0x00000015eb775000 offset:0 length:4096 dma_address:0x000000209e7bf000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.397971] sg[9] phys_addr:0x00000015eb7c7000 offset:0 length:4096 dma_address:0x00000020ff520000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.398889] sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
Jul 20 02:17:54 lamb kernel: [21061.399814] sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
Jul 20 02:17:54 lamb kernel: [21061.400754] sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064
Jul 20 02:17:54 lamb kernel: [21061.401781] ------------[ cut here ]------------
Jul 20 02:17:54 lamb kernel: [21061.402738] Invalid SGL for payload:131072 nents:13
Jul 20 02:17:54 lamb kernel: [21061.403724] WARNING: CPU: 1 PID: 12669 at drivers/nvme/host/pci.c:716 nvme_map_data+0x7e0/0x820 [nvme]
Jul 20 02:17:54 lamb kernel: [21061.404728] Modules linked in: binfmt_misc ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_tcpmss nf_log_ipv6 nf_log_ipv4 nf_log_common xt_LOG xt_limit nfnetlink_log nfnetlink xt_NFLOG xt_multiport xt_tcpudp ip6table_filter ip6_tables iptable_filter bonding btrfs blake2b_generic dm_snapshot dm_bufio intel_rapl_msr intel_rapl_common skx_edac nfit libnvdimm intel_powerclamp crc32_pclmul ghash_clmulni_intel ipmi_ssif aesni_intel libaes crypto_simd cryptd glue_helper snd_hda_intel snd_intel_dspcfg mei_wdt soundwire_intel soundwire_generic_allocation nvme wdat_wdt snd_soc_core ast snd_compress watchdog drm_vram_helper drm_ttm_helper soundwire_cadence pcspkr nvme_core ttm snd_hda_codec drm_kms_helper snd_hda_core i2c_i801 snd_hwdep i2c_smbus cec soundwire_bus snd_pcm drm snd_timer snd soundcore igb ptp pps_core i2c_algo_bit joydev mei_me sg mei intel_lpss_pci intel_lpss idma64 acpi_ipmi ipmi_si ipmi_devintf ioatdma dca wmi ipmi_msghandler button dm_mod xenfs xen_acpi_processor
Jul 20 02:17:54 lamb kernel: [21061.404831]  xen_privcmd xen_pciback xen_netback xen_blkback xen_gntalloc xen_gntdev xen_evtchn ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 raid456 libcrc32c crc32c_generic async_raid6_recov async_memcpy async_pq async_xor xor async_tx evdev hid_generic usbhid hid raid6_pq raid0 multipath linear raid10 raid1 md_mod sd_mod t10_pi crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common xhci_pci ahci libahci crc32c_intel xhci_hcd libata usbcore scsi_mod usb_common
Jul 20 02:17:54 lamb kernel: [21061.417998] CPU: 1 PID: 12669 Comm: 62.xvda-0 Not tainted 5.10.0-0.bpo.7-amd64 #1 Debian 5.10.40-1~bpo10+1
Jul 20 02:17:54 lamb kernel: [21061.418459] Hardware name: Supermicro Super Server/X11SRM-VF, BIOS 1.2a 02/18/2019
Jul 20 02:17:54 lamb kernel: [21061.418922] RIP: e030:nvme_map_data+0x7e0/0x820 [nvme]
Jul 20 02:17:54 lamb kernel: [21061.419354] Code: d0 7b c0 48 c7 c7 40 d6 7b c0 e8 5b 44 c9 c0 8b 93 4c 01 00 00 f6 43 1e 04 75 36 8b 73 28 48 c7 c7 20 9c 7b c0 e8 8b 71 09 c1 <0f> 0b 41 bd 0a 00 00 00 e9 f7 fe ff ff 48 8d bd 68 02 00 00 48 89
Jul 20 02:17:54 lamb kernel: [21061.420271] RSP: e02b:ffffc90044797930 EFLAGS: 00010286
Jul 20 02:17:54 lamb kernel: [21061.420727] RAX: 0000000000000000 RBX: ffff888157db4200 RCX: 0000000000000027
Jul 20 02:17:54 lamb kernel: [21061.421186] RDX: 0000000000000027 RSI: ffff888292858a00 RDI: ffff888292858a08
Jul 20 02:17:54 lamb kernel: [21061.421639] RBP: ffff888103243000 R08: 0000000000000000 R09: c00000010000118b
Jul 20 02:17:54 lamb kernel: [21061.422090] R10: 0000000000165920 R11: ffffc90044797738 R12: ffffffffc07b9bd0
Jul 20 02:17:54 lamb kernel: [21061.422583] R13: 000000000000000d R14: 0000000000000000 R15: 000000000000000d
Jul 20 02:17:54 lamb kernel: [21061.423052] FS:  0000000000000000(0000) GS:ffff888292840000(0000) knlGS:0000000000000000
Jul 20 02:17:54 lamb kernel: [21061.423518] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
Jul 20 02:17:54 lamb kernel: [21061.423986] CR2: 00007f909a037c30 CR3: 000000010d2dc000 CR4: 0000000000050660
Jul 20 02:17:54 lamb kernel: [21061.424472] Call Trace:
Jul 20 02:17:54 lamb kernel: [21061.424943]  nvme_queue_rq+0x98/0x190 [nvme]
Jul 20 02:17:54 lamb kernel: [21061.425425]  blk_mq_dispatch_rq_list+0x123/0x7d0
Jul 20 02:17:54 lamb kernel: [21061.425904]  ? sbitmap_get+0x66/0x140
Jul 20 02:17:54 lamb kernel: [21061.426385]  ? elv_rb_del+0x1f/0x30
Jul 20 02:17:54 lamb kernel: [21061.426909]  ? deadline_remove_request+0x55/0xc0
Jul 20 02:17:54 lamb kernel: [21061.427373]  __blk_mq_do_dispatch_sched+0x164/0x2d0
Jul 20 02:17:54 lamb kernel: [21061.427843]  __blk_mq_sched_dispatch_requests+0x135/0x170
Jul 20 02:17:54 lamb kernel: [21061.428310]  blk_mq_sched_dispatch_requests+0x30/0x60
Jul 20 02:17:54 lamb kernel: [21061.428795]  __blk_mq_run_hw_queue+0x51/0xd0
Jul 20 02:17:54 lamb kernel: [21061.429269]  __blk_mq_delay_run_hw_queue+0x141/0x160
Jul 20 02:17:54 lamb kernel: [21061.429752]  blk_mq_sched_insert_requests+0x6a/0xf0
Jul 20 02:17:54 lamb kernel: [21061.430233]  blk_mq_flush_plug_list+0x119/0x1b0
Jul 20 02:17:54 lamb kernel: [21061.430756]  blk_flush_plug_list+0xd7/0x100
Jul 20 02:17:54 lamb kernel: [21061.431241]  blk_finish_plug+0x21/0x30
Jul 20 02:17:54 lamb kernel: [21061.431734]  dispatch_rw_block_io+0x6a5/0x9a0 [xen_blkback]
Jul 20 02:17:54 lamb kernel: [21061.432220]  __do_block_io_op+0x31d/0x620 [xen_blkback]
Jul 20 02:17:54 lamb kernel: [21061.432714]  ? _raw_spin_unlock_irqrestore+0x14/0x20
Jul 20 02:17:54 lamb kernel: [21061.433193]  ? try_to_del_timer_sync+0x4d/0x80
Jul 20 02:17:54 lamb kernel: [21061.433680]  xen_blkif_schedule+0xda/0x670 [xen_blkback]
Jul 20 02:17:54 lamb kernel: [21061.434160]  ? __schedule+0x2c6/0x770
Jul 20 02:17:54 lamb kernel: [21061.434679]  ? finish_wait+0x80/0x80
Jul 20 02:17:54 lamb kernel: [21061.435129]  ? xen_blkif_be_int+0x30/0x30 [xen_blkback]
Jul 20 02:17:54 lamb kernel: [21061.435571]  kthread+0x116/0x130
Jul 20 02:17:54 lamb kernel: [21061.436002]  ? kthread_park+0x80/0x80
Jul 20 02:17:54 lamb kernel: [21061.436422]  ret_from_fork+0x22/0x30
Jul 20 02:17:54 lamb kernel: [21061.436846] ---[ end trace 1d90be7aea2d9148 ]---
Jul 20 02:17:54 lamb kernel: [21061.437250] blk_update_request: I/O error, dev nvme0n1, sector 912000815 op 0x1:(WRITE) flags 0x800 phys_seg 13 prio class 0
Jul 20 02:17:54 lamb kernel: [21061.446344] md/raid1:md4: Disk failure on nvme0n1, disabling device.
Jul 20 02:17:54 lamb kernel: [21061.446344] md/raid1:md4: Operation continuing on 1 devices.

I was able to re-add nvme0n1 to the RAID-1 and continue without
rebooting but then later:

Jul 20 20:43:23 lamb kernel: [87388.876154] blk_update_request: I/O error, dev nvme0n1, sector 916064223 op 0x1:(WRITE) flags 0x800 phys_seg 28 prio class 0
Jul 20 20:43:23 lamb kernel: [87388.877750] md/raid1:md4: Disk failure on nvme0n1, disabling device.
Jul 20 20:43:23 lamb kernel: [87388.877750] md/raid1:md4: Operation continuing on 1 devices.

(no call trace this time)

Since this started happening so soon after changing the kernel, I
suspect a change in the kernel rather than a faulty NVMe device, or at
least a device fault that the previous 4.19.x kernels did not catch.

I found
<http://lists.infradead.org/pipermail/linux-nvme/2017-July/011930.html>
with a similar log warning, which seems to be about invalid DMA
addresses and maybe Xen; in
<http://lists.infradead.org/pipermail/linux-nvme/2017-July/012055.html>
Christoph says, "I wonder if swiotlb-xen is involved" but the thread
doesn't seem to come to a resolution. My system is a Xen system
also.

Does anyone know if this issue was isolated and fixed but maybe not
backported to the Debian backports kernel?

Or could it be lurking undetected in the 4.19.x kernels I was
previously running (and am still running on other hosts with the same
drives)?

Apart from the I/O error which causes the device to be kicked out
of RAID, can it be causing corruption either on this kernel or
silently on the 4.19.x kernels?

I am sorry that I haven't simply tried the latest mainline kernel yet.
This is a production server and I need to schedule that; that will
obviously happen very soon if data corruption is suspected or if
this is a known fixed issue.

Thanks!
Andy

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme


* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-20 22:07 5.10.40-1 - Invalid SGL for payload:131072 nents:13 Andy Smith
@ 2021-07-21  0:34 ` Keith Busch
  2021-07-21  2:47   ` Andy Smith
  2021-07-23 23:35   ` Andy Smith
  2021-07-24  2:46 ` Ming Lei
  1 sibling, 2 replies; 8+ messages in thread
From: Keith Busch @ 2021-07-21  0:34 UTC (permalink / raw)
  To: Andy Smith; +Cc: Linux-nvme

On Tue, Jul 20, 2021 at 10:07:33PM +0000, Andy Smith wrote:
> Hi,
> 
> I have a Debian stable machine with a Samsung PM983 NVMe and a
> Samsung SM883 in an MD RAID-1. It's been running the 4.19.x Debian
> packaged kernel for almost 2 years now.
> 
> About 24 hours ago I upgraded its kernel to the buster-backports
> kernel which is version 5.10.40-1~bpo10+1 and around four hours
> after that I got this:
> 
> Jul 20 02:17:54 lamb kernel: [21061.388607] sg[0] phys_addr:0x00000015eb803000 offset:0 length:4096 dma_address:0x000000209e7b7000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.389775] sg[1] phys_addr:0x00000015eb7bc000 offset:0 length:4096 dma_address:0x000000209e7b8000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.390874] sg[2] phys_addr:0x00000015eb809000 offset:0 length:4096 dma_address:0x000000209e7b9000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.391974] sg[3] phys_addr:0x00000015eb766000 offset:0 length:4096 dma_address:0x000000209e7ba000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.393042] sg[4] phys_addr:0x00000015eb7a3000 offset:0 length:4096 dma_address:0x000000209e7bb000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.394086] sg[5] phys_addr:0x00000015eb7c6000 offset:0 length:4096 dma_address:0x000000209e7bc000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.395078] sg[6] phys_addr:0x00000015eb7c2000 offset:0 length:4096 dma_address:0x000000209e7bd000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.396042] sg[7] phys_addr:0x00000015eb7a9000 offset:0 length:4096 dma_address:0x000000209e7be000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.397004] sg[8] phys_addr:0x00000015eb775000 offset:0 length:4096 dma_address:0x000000209e7bf000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.397971] sg[9] phys_addr:0x00000015eb7c7000 offset:0 length:4096 dma_address:0x00000020ff520000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.398889] sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.399814] sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
> Jul 20 02:17:54 lamb kernel: [21061.400754] sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064

Perhaps we should add the virt_addr in this print. If it was there, I
think it should show that the phys offset doesn't match the virtual
offset, which we are depending on.
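
To make the offset concern concrete, here is a rough Python sketch that
parses sg[] lines like the ones quoted above and checks the DMA addresses
against the layout that PRP-style mapping needs: every entry except the
first starts on a controller page boundary, and every entry except the
last ends on one. The 4096-byte page size and the helper names
(parse_sg, prp_violations) are assumptions made for illustration; they
are not taken from the driver.

```python
import re

PAGE = 4096  # assumed controller page size, not read from the device

SG_RE = re.compile(
    r"sg\[(\d+)\] phys_addr:(0x[0-9a-f]+) offset:(\d+) "
    r"length:(\d+) dma_address:(0x[0-9a-f]+) dma_length:(\d+)"
)

def parse_sg(log_text):
    """Extract scatterlist entries from the kernel's sg[] debug lines."""
    entries = []
    for m in SG_RE.finditer(log_text):
        entries.append({
            "idx": int(m.group(1)),
            "phys": int(m.group(2), 16),
            "offset": int(m.group(3)),
            "length": int(m.group(4)),
            "dma": int(m.group(5), 16),
            "dma_len": int(m.group(6)),
        })
    return entries

def prp_violations(entries, page=PAGE):
    """Report entries that start or end off a page boundary mid-list."""
    bad = []
    last = len(entries) - 1
    for i, e in enumerate(entries):
        if i != 0 and e["dma"] % page:
            bad.append((e["idx"], "start not page aligned"))
        if i != last and (e["dma"] + e["dma_len"]) % page:
            bad.append((e["idx"], "end not page aligned"))
    return bad

# The last three entries from the report above.
LOG = """
sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064
"""

for idx, reason in prp_violations(parse_sg(LOG)):
    print(f"sg[{idx}]: {reason}")
```

On this excerpt the check flags sg[11] (it ends 0x200 bytes past a page
boundary) and sg[12] (it starts at that same offset), which is the pair
of entries where the list stops looking like a run of whole pages.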

Are you using swiotlb? If so, this recent patch sounds like it should
fix offset issues:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=5f89468e2f060031cd89fd4287298e0eaf246bf6


* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-21  0:34 ` Keith Busch
@ 2021-07-21  2:47   ` Andy Smith
  2021-07-21  6:02     ` Christoph Hellwig
  2021-07-23 23:35   ` Andy Smith
  1 sibling, 1 reply; 8+ messages in thread
From: Andy Smith @ 2021-07-21  2:47 UTC (permalink / raw)
  To: Keith Busch; +Cc: Linux-nvme

Hi Keith,

On Tue, Jul 20, 2021 at 05:34:34PM -0700, Keith Busch wrote:
> Are you using swiotlb?

I've just realised that there was another change to my config
besides the kernel version: I increased the amount of memory
allocated to dom0 from 3GiB to 8GiB.

If I understand things correctly swiotlb is not used when memory is
< 4GiB, so I previously would not have been using swiotlb but now
will be.

What's the least invasive way of disabling swiotlb? Would it be

    swiotlb=noforce

Or

    iommu=off

Or maybe

    swiotlb=1

to reduce it to an amount that would never be used?

(Going by kernel parameters available in 5.10.40)

This is an Intel system and I see the various intel_iommu parameters
but don't know if any of them would be more appropriate.

> If so, this recent patch sounds like it should fix offset issues:

As I don't have an easy way to reproduce the problem on demand I
think I would like to first try going back to a situation where
swiotlb is not used, to see if the problem goes away in that case.
Happy to do more experimentation later…

Thanks!
Andy


* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-21  2:47   ` Andy Smith
@ 2021-07-21  6:02     ` Christoph Hellwig
  0 siblings, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2021-07-21  6:02 UTC (permalink / raw)
  To: Andy Smith; +Cc: Keith Busch, Linux-nvme

On Wed, Jul 21, 2021 at 02:47:25AM +0000, Andy Smith wrote:
> to reduce it to an amount that would never be used?

Just the mem parameter.  With Xen dom0 you do not have an easy way out.


* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-21  0:34 ` Keith Busch
  2021-07-21  2:47   ` Andy Smith
@ 2021-07-23 23:35   ` Andy Smith
  2021-07-25  9:52     ` Andy Smith
  1 sibling, 1 reply; 8+ messages in thread
From: Andy Smith @ 2021-07-23 23:35 UTC (permalink / raw)
  To: Keith Busch; +Cc: Linux-nvme

Hi,

On Tue, Jul 20, 2021 at 05:34:34PM -0700, Keith Busch wrote:
> On Tue, Jul 20, 2021 at 10:07:33PM +0000, Andy Smith wrote:
> > I have a Debian stable machine with a Samsung PM983 NVMe and a
> > Samsung SM883 in an MD RAID-1. It's been running the 4.19.x Debian
> > packaged kernel for almost 2 years now.
> > 
> > About 24 hours ago I upgraded its kernel to the buster-backports
> > kernel which is version 5.10.40-1~bpo10+1 and around four hours
> > after that I got this:
> > 
> > Jul 20 02:17:54 lamb kernel: [21061.388607] sg[0] phys_addr:0x00000015eb803000 offset:0 length:4096 dma_address:0x000000209e7b7000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.389775] sg[1] phys_addr:0x00000015eb7bc000 offset:0 length:4096 dma_address:0x000000209e7b8000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.390874] sg[2] phys_addr:0x00000015eb809000 offset:0 length:4096 dma_address:0x000000209e7b9000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.391974] sg[3] phys_addr:0x00000015eb766000 offset:0 length:4096 dma_address:0x000000209e7ba000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.393042] sg[4] phys_addr:0x00000015eb7a3000 offset:0 length:4096 dma_address:0x000000209e7bb000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.394086] sg[5] phys_addr:0x00000015eb7c6000 offset:0 length:4096 dma_address:0x000000209e7bc000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.395078] sg[6] phys_addr:0x00000015eb7c2000 offset:0 length:4096 dma_address:0x000000209e7bd000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.396042] sg[7] phys_addr:0x00000015eb7a9000 offset:0 length:4096 dma_address:0x000000209e7be000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.397004] sg[8] phys_addr:0x00000015eb775000 offset:0 length:4096 dma_address:0x000000209e7bf000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.397971] sg[9] phys_addr:0x00000015eb7c7000 offset:0 length:4096 dma_address:0x00000020ff520000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.398889] sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.399814] sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
> > Jul 20 02:17:54 lamb kernel: [21061.400754] sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064
> 
> Perhaps we should add the virt_addr in this print. If it was there, I
> think it should show that the phys offset doesn't match the virtual
> offset, which we are depending on.
> 
> Are you using swiotlb? If so, this recent patch sounds like it should
> fix offset issues:
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=5f89468e2f060031cd89fd4287298e0eaf246bf6

I was struggling to reproduce the above issue on my test hardware so
I took the time to resolve the sector offsets into logical volumes
to work out which Xen guests were involved. I found two guests
have triggered it and both of them have partitioned their block
device in an unaligned fashion e.g. partition starts at sector 63
with 512 byte sectors.
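
As a quick worked example of what "unaligned" means here, the sketch
below computes how far a partition start sits from a 4 KiB boundary. The
4096-byte granularity is an assumption chosen to match the page-sized
entries in the sg[] log, and the function name is only illustrative.

```python
SECTOR = 512   # logical sector size reported by the guests
PAGE = 4096    # assumed alignment granularity

def misalignment(start_sector, sector_size=SECTOR, page=PAGE):
    """Bytes by which a partition start overshoots the previous page boundary."""
    return (start_sector * sector_size) % page

print(misalignment(63))    # legacy DOS-style partition start
print(misalignment(2048))  # common 1 MiB-aligned partition start
```

A partition starting at sector 63 begins 3584 bytes past a 4 KiB
boundary, so every 4 KiB filesystem block inside it straddles two 4 KiB
regions of the underlying device; a start at sector 2048 has no such
remainder. Whether that alone explains the odd-sized segments above is
not something this arithmetic can show.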

With the same setup on a guest on my test host, I can now reliably
trigger this within a minute or so using fio.

I should now be able to test this patch mentioned above and/or
bisect to see what changed. Just thought I'd mention it in case the
unaligned nature sparked any other memories for anyone.

Thanks,
Andy



* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-20 22:07 5.10.40-1 - Invalid SGL for payload:131072 nents:13 Andy Smith
  2021-07-21  0:34 ` Keith Busch
@ 2021-07-24  2:46 ` Ming Lei
  2021-07-25  9:46   ` Andy Smith
  1 sibling, 1 reply; 8+ messages in thread
From: Ming Lei @ 2021-07-24  2:46 UTC (permalink / raw)
  To: Andy Smith; +Cc: Linux-nvme

On Tue, Jul 20, 2021 at 10:07:33PM +0000, Andy Smith wrote:
> Hi,
> 
> I have a Debian stable machine with a Samsung PM983 NVMe and a
> Samsung SM883 in an MD RAID-1. It's been running the 4.19.x Debian
> packaged kernel for almost 2 years now.
> 
> About 24 hours ago I upgraded its kernel to the buster-backports
> kernel which is version 5.10.40-1~bpo10+1 and around four hours
> after that I got this:
> 
> Jul 20 02:17:54 lamb kernel: [21061.388607] sg[0] phys_addr:0x00000015eb803000 offset:0 length:4096 dma_address:0x000000209e7b7000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.389775] sg[1] phys_addr:0x00000015eb7bc000 offset:0 length:4096 dma_address:0x000000209e7b8000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.390874] sg[2] phys_addr:0x00000015eb809000 offset:0 length:4096 dma_address:0x000000209e7b9000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.391974] sg[3] phys_addr:0x00000015eb766000 offset:0 length:4096 dma_address:0x000000209e7ba000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.393042] sg[4] phys_addr:0x00000015eb7a3000 offset:0 length:4096 dma_address:0x000000209e7bb000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.394086] sg[5] phys_addr:0x00000015eb7c6000 offset:0 length:4096 dma_address:0x000000209e7bc000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.395078] sg[6] phys_addr:0x00000015eb7c2000 offset:0 length:4096 dma_address:0x000000209e7bd000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.396042] sg[7] phys_addr:0x00000015eb7a9000 offset:0 length:4096 dma_address:0x000000209e7be000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.397004] sg[8] phys_addr:0x00000015eb775000 offset:0 length:4096 dma_address:0x000000209e7bf000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.397971] sg[9] phys_addr:0x00000015eb7c7000 offset:0 length:4096 dma_address:0x00000020ff520000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.398889] sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
> Jul 20 02:17:54 lamb kernel: [21061.399814] sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
> Jul 20 02:17:54 lamb kernel: [21061.400754] sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064

The last two segments are physically contiguous and should have been
merged into one segment; otherwise the virt boundary limit may be violated.
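
The contiguity is easy to confirm from the addresses in the log. The
following is a minimal Python sketch, not the block layer's actual merge
logic: it coalesces (address, length) pairs whenever one ends exactly
where the next begins, and it ignores queue limits such as
max_segment_size. The function name is made up for this example.

```python
def merge_contiguous(segments):
    """Coalesce (addr, length) pairs where one ends exactly where the next begins."""
    merged = []
    for addr, length in segments:
        if merged and merged[-1][0] + merged[-1][1] == addr:
            prev_addr, prev_len = merged[-1]
            merged[-1] = (prev_addr, prev_len + length)
        else:
            merged.append((addr, length))
    return merged

# phys_addr and length of the last three entries in the report.
segs = [
    (0x00000015eb7cb000, 4096),   # sg[10]
    (0x00000015eb7e3000, 61952),  # sg[11]
    (0x00000015eb7f2200, 24064),  # sg[12]
]

for addr, length in merge_contiguous(segs):
    print(hex(addr), length, "page multiple" if length % 4096 == 0 else "partial page")
```

sg[11] ends at 0x15eb7f2200, which is where sg[12] starts, so the two
collapse into a single 86016-byte segment (21 pages of 4096 bytes) that
starts and ends on a page boundary.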

But __blk_bvec_map_sg() doesn't merge the two into one segment. Can you
collect the queue limit log?

(cd /sys/block/$NVME/queue && find . -type f -exec grep -aH . {} \;)

In the meantime, can you try the following patch and see if it makes a
difference?

commit c9c9762d4d44dcb1b2ba90cfb4122dc11ceebf31
Author: Long Li <longli@microsoft.com>
Date:   Mon Jun 7 12:34:05 2021 -0700

    block: return the correct bvec when checking for gaps



Thanks, 
Ming



* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-24  2:46 ` Ming Lei
@ 2021-07-25  9:46   ` Andy Smith
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Smith @ 2021-07-25  9:46 UTC (permalink / raw)
  To: Ming Lei; +Cc: Linux-nvme

Hi Ming Lei,

On Sat, Jul 24, 2021 at 10:46:53AM +0800, Ming Lei wrote:
> On Tue, Jul 20, 2021 at 10:07:33PM +0000, Andy Smith wrote:
> > Hi,
> > 
> > I have a Debian stable machine with a Samsung PM983 NVMe and a
> > Samsung SM883 in an MD RAID-1. It's been running the 4.19.x Debian
> > packaged kernel for almost 2 years now.
> > 
> > About 24 hours ago I upgraded its kernel to the buster-backports
> > kernel which is version 5.10.40-1~bpo10+1 and around four hours
> > after that I got this:
> > 
> > Jul 20 02:17:54 lamb kernel: [21061.388607] sg[0] phys_addr:0x00000015eb803000 offset:0 length:4096 dma_address:0x000000209e7b7000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.389775] sg[1] phys_addr:0x00000015eb7bc000 offset:0 length:4096 dma_address:0x000000209e7b8000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.390874] sg[2] phys_addr:0x00000015eb809000 offset:0 length:4096 dma_address:0x000000209e7b9000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.391974] sg[3] phys_addr:0x00000015eb766000 offset:0 length:4096 dma_address:0x000000209e7ba000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.393042] sg[4] phys_addr:0x00000015eb7a3000 offset:0 length:4096 dma_address:0x000000209e7bb000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.394086] sg[5] phys_addr:0x00000015eb7c6000 offset:0 length:4096 dma_address:0x000000209e7bc000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.395078] sg[6] phys_addr:0x00000015eb7c2000 offset:0 length:4096 dma_address:0x000000209e7bd000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.396042] sg[7] phys_addr:0x00000015eb7a9000 offset:0 length:4096 dma_address:0x000000209e7be000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.397004] sg[8] phys_addr:0x00000015eb775000 offset:0 length:4096 dma_address:0x000000209e7bf000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.397971] sg[9] phys_addr:0x00000015eb7c7000 offset:0 length:4096 dma_address:0x00000020ff520000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.398889] sg[10] phys_addr:0x00000015eb7cb000 offset:0 length:4096 dma_address:0x00000020ff521000 dma_length:4096
> > Jul 20 02:17:54 lamb kernel: [21061.399814] sg[11] phys_addr:0x00000015eb7e3000 offset:0 length:61952 dma_address:0x00000020ff522000 dma_length:61952
> > Jul 20 02:17:54 lamb kernel: [21061.400754] sg[12] phys_addr:0x00000015eb7f2200 offset:512 length:24064 dma_address:0x00000020ff531200 dma_length:24064
> 
> The last two segments are physically contiguous and should have been
> merged into one segment; otherwise the virt boundary limit may be violated.
> 
> But __blk_bvec_map_sg() doesn't merge the two into one segment. Can you
> collect the queue limit log?
> 
> (cd /sys/block/$NVME/queue && find . -type f -exec grep -aH . {} \;)

$ (cd /sys/block/nvme0n1/queue && find . -type f -exec grep -aH . {} \;)
./io_poll_delay:-1
./max_integrity_segments:0
./zoned:none
./scheduler:[none] mq-deadline 
./io_poll:0
./discard_zeroes_data:0
./minimum_io_size:512
./nr_zones:0
./write_same_max_bytes:0
./max_segments:127
./dax:0
./physical_block_size:512
./logical_block_size:512
./zone_append_max_bytes:0
./io_timeout:30000
./nr_requests:1023
./write_cache:write through
./stable_writes:0
./max_segment_size:4294967295
./rotational:0
./discard_max_bytes:2199023255040
./add_random:0
./discard_max_hw_bytes:2199023255040
./optimal_io_size:0
./chunk_sectors:0
./read_ahead_kb:128
./max_discard_segments:256
./write_zeroes_max_bytes:2097152
./nomerges:0
./wbt_lat_usec:2000
./fua:0
./discard_granularity:512
./rq_affinity:1
./max_sectors_kb:1280
./hw_sector_size:512
./max_hw_sectors_kb:2048
./iostats:1
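
For anyone comparing limits across hosts, the output above is simple to
load into a dictionary. The sketch below is only an illustration: the
parser name is invented, the sample is a subset of the lines above, and
the payload and nents values are copied from the original warning. It
checks that the failing request was within the size and segment-count
limits reported here.

```python
def parse_queue_limits(text):
    """Turn './name:value' lines from the find/grep output into a dict."""
    limits = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("./"):
            continue
        name, _, value = line[2:].partition(":")
        limits[name] = value.strip()
    return limits

SAMPLE = """\
./scheduler:[none] mq-deadline
./max_segments:127
./max_segment_size:4294967295
./max_sectors_kb:1280
./max_hw_sectors_kb:2048
"""

limits = parse_queue_limits(SAMPLE)

payload = 131072  # bytes, from "Invalid SGL for payload:131072 nents:13"
nents = 13

fits = (payload <= int(limits["max_sectors_kb"]) * 1024
        and nents <= int(limits["max_segments"]))
print("within size and segment-count limits:", fits)
```

The request fits both limits comfortably, which suggests the warning is
about how the segments are laid out rather than how large or numerous
they are.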

> In the meantime, can you try the following patch and see if it makes a
> difference?
> 
> commit c9c9762d4d44dcb1b2ba90cfb4122dc11ceebf31
> Author: Long Li <longli@microsoft.com>
> Date:   Mon Jun 7 12:34:05 2021 -0700
> 
>     block: return the correct bvec when checking for gaps

I applied this patch to 5.10.40 and I am no longer able to reproduce
the issue, thanks!

I see this patch made it into 5.10.50, but Debian's buster-backports
kernel is based on 5.10.40 and the forthcoming bullseye kernel on
5.10.46, and neither has this change. I will enquire about backporting.

This is the fio command line I am using to reproduce:

fio --name=randread \
    --filename=/srv/fio/test \
    --size=35g \
    --numjobs=1 \
    --rw=randread \
    --direct=1 \
    --ioengine=libaio \
    --iodepth=16 \
    --blocksize_range=4k-4m \
    --blocksize_unaligned=0 \
    --gtod_reduce=1 \
    --iodepth=64 \
    --time_based \
    --runtime=4h

Where /srv/fio/ is an ext4 filesystem that is the first partition of
a block device deliberately misaligned by starting at sector 63.

I haven't been able to reproduce without this misalignment, but with
it I can reproduce within a few seconds. I've run it for several
hours (read 17TB off the device) with
c9c9762d4d44dcb1b2ba90cfb4122dc11ceebf31 and had no issues.

Thanks,
Andy


* Re: 5.10.40-1 - Invalid SGL for payload:131072 nents:13
  2021-07-23 23:35   ` Andy Smith
@ 2021-07-25  9:52     ` Andy Smith
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Smith @ 2021-07-25  9:52 UTC (permalink / raw)
  To: Keith Busch; +Cc: Linux-nvme

Hi,

On Fri, Jul 23, 2021 at 11:35:19PM +0000, Andy Smith wrote:
> On Tue, Jul 20, 2021 at 05:34:34PM -0700, Keith Busch wrote:
> > Are you using swiotlb? If so, this recent patch sounds like it should
> > fix offset issues:
> > 
> >   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=5f89468e2f060031cd89fd4287298e0eaf246bf6

I found where this one made it into 5.10.46:

    https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/diff/queue-5.10/swiotlb-manipulate-orig_addr-when-tlb_addr-has-offset.patch?id=fc7b255bfae6e62091006146ca685a25ec6f69c6

and tried it out but still had the same issue.

The patch that Ming Lei suggested:

    https://lore.kernel.org/patchwork/patch/1442338/

does seem to resolve this issue for me. I see that it made it into
5.10.50.

Thanks!
Andy
