* too large sg segments with commit 09324d32d2a08
From: Sebastian Ott @ 2019-06-05  9:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Hannes Reinecke, Jens Axboe, linux-block, linux-kernel

Hi,

this warning turned up on s390:

[    7.041512] ------------[ cut here ]------------
[    7.041518] DMA-API: nvme 0000:00:00.0: mapping sg segment longer than device claims to support [len=106496] [max=65536]
[    7.041531] WARNING: CPU: 1 PID: 229 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x21e/0x350
[    7.041537] Modules linked in: scm_block(+) eadm_sch sch_fq_codel autofs4
[    7.041547] CPU: 1 PID: 229 Comm: systemd-udevd Not tainted 5.2.0-rc3-00002-g112d38aa4733-dirty #146
[    7.041552] Hardware name: IBM 3906 M03 703 (LPAR)
[    7.041558] Krnl PSW : 0704d00180000000 00000000af580b6e (debug_dma_map_sg+0x21e/0x350)
[    7.041566]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[    7.041572] Krnl GPRS: 0000000095969122 0000000080000000 000000000000006c 00000000af5624bc
[    7.041578]            0000000000000007 0000000000000001 0000000000010000 0000000000000001
[    7.041583]            00000000b081d278 00000000fea06888 000000008fa43400 00000000f255b418
[    7.041589]            00000000f4a28100 ffffffff00000000 00000000af580b6a 000003e0004a36e0
[    7.041599] Krnl Code: 00000000af580b5e: c02000566fd7	larl	%r2,b004eb0c
                          00000000af580b64: c0e5fffad1fe	brasl	%r14,af4daf60
                         #00000000af580b6a: a7f40001		brc	15,af580b6c
                         >00000000af580b6e: c010005f45f5	larl	%r1,b0169758
                          00000000af580b74: e31010000012	lt	%r1,0(%r1)
                          00000000af580b7a: a774000f		brc	7,af580b98
                          00000000af580b7e: c010005fac23	larl	%r1,b01763c4
                          00000000af580b84: e31010000012	lt	%r1,0(%r1)
[    7.041620] Call Trace:
[    7.041626] ([<00000000af580b6a>] debug_dma_map_sg+0x21a/0x350)
[    7.041633]  [<00000000afbe2152>] nvme_queue_rq+0x49a/0xd18 
[    7.041639]  [<00000000afa178d0>] __blk_mq_try_issue_directly+0x108/0x1f0 
[    7.041645]  [<00000000afa18e96>] blk_mq_request_issue_directly+0x4e/0x70 
[    7.041651]  [<00000000afa18f42>] blk_mq_try_issue_list_directly+0x8a/0x118 
[    7.041657]  [<00000000afa1e42e>] blk_mq_sched_insert_requests+0x1c6/0x350 
[    7.041663]  [<00000000afa18e40>] blk_mq_flush_plug_list+0x4f8/0x500 
[    7.041669]  [<00000000afa0bb3e>] blk_flush_plug_list+0x106/0x110 
[    7.041674]  [<00000000afa0bb7c>] blk_finish_plug+0x34/0x50 
[    7.041680]  [<00000000af6938c2>] read_pages+0x152/0x160 
[    7.041687]  [<00000000af693b06>] __do_page_cache_readahead+0x236/0x268 
[    7.041693]  [<00000000af694458>] force_page_cache_readahead+0x110/0x120 
[    7.041699]  [<00000000af683fa4>] generic_file_buffered_read+0x144/0x968 
[    7.041706]  [<00000000af74d14c>] new_sync_read+0x13c/0x1b8 
[    7.041712]  [<00000000af74f9fa>] vfs_read+0x82/0x138 
[    7.041717]  [<00000000af74fd92>] ksys_read+0x62/0xd8 
[    7.041724]  [<00000000afe60e00>] system_call+0x2b0/0x2d0 
[    7.041729] 1 lock held by systemd-udevd/229:
[    7.041734]  #0: 00000000f715c4f3 (rcu_read_lock){....}, at: hctx_lock+0x28/0xf8
[    7.041743] Last Breaking-Event-Address:
[    7.041749]  [<00000000af580b6a>] debug_dma_map_sg+0x21a/0x350
[    7.041754] irq event stamp: 14457
[    7.041760] hardirqs last  enabled at (14465): [<00000000af560a2c>] console_unlock+0x63c/0x6a8
[    7.041766] hardirqs last disabled at (14472): [<00000000af5604ba>] console_unlock+0xca/0x6a8
[    7.041773] softirqs last  enabled at (11488): [<00000000afa20ed4>] get_gendisk+0xf4/0x148
[    7.041779] softirqs last disabled at (11486): [<00000000afa20e48>] get_gendisk+0x68/0x148
[    7.041784] ---[ end trace 9142fc6f63a22c6e ]---

The length of the sg entry created by blk_rq_map_sg is indeed larger than
the DMA max_segment_size.
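
For reference, the check that triggers here is dma-debug's per-segment
length test; a rough sketch of the v5.2-era logic (paraphrased from
kernel/dma/debug.c and include/linux/dma-mapping.h, not a verbatim copy)
also shows where the [max=65536] in the warning comes from:

static inline unsigned int dma_get_max_seg_size(struct device *dev)
{
	/* Fall back to 64K when the driver never set dma_parms -
	 * this is the [max=65536] in the warning above. */
	if (dev->dma_parms && dev->dma_parms->max_segment_size)
		return dev->dma_parms->max_segment_size;
	return SZ_64K;
}

static void check_sg_segment(struct device *dev, struct scatterlist *sg)
{
	/* Warn if a single scatterlist element is longer than the
	 * limit the device claims to support via its dma_parms. */
	if (sg->length > dma_get_max_seg_size(dev))
		err_printk(dev, NULL,
			   "mapping sg segment longer than device claims to support [len=%u] [max=%u]\n",
			   sg->length, dma_get_max_seg_size(dev));
}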

Bisecting points to commit 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary").
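
For context, that commit makes blk_queue_virt_boundary() drop the
queue's segment size limit whenever a virt boundary mask is set;
roughly (paraphrased from block/blk-settings.c, not a verbatim copy):

void blk_queue_virt_boundary(struct request_queue *q, unsigned long mask)
{
	q->limits.virt_boundary_mask = mask;

	/* A device with a virt boundary is split at every boundary
	 * crossing anyway, so a per-segment size limit is assumed to
	 * be meaningless and is lifted to unlimited. */
	if (mask)
		q->limits.max_segment_size = UINT_MAX;
}

So the block layer now builds segments up to UINT_MAX while the DMA
layer still believes the device is limited to 64K.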

Regards,
Sebastian



* Re: too large sg segments with commit 09324d32d2a08
From: Christoph Hellwig @ 2019-06-05 10:09 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Christoph Hellwig, Ming Lei, Hannes Reinecke, Jens Axboe,
	linux-block, linux-kernel

The problem is that we don't communicate the block-level max_segment_size
to the IOMMU, and the commit really is just the messenger.

I'll cook up a series to fix this, as papering over it in every driver
does not seem sustainable.
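
The per-driver papering-over would be each driver raising its DMA
segment limit by hand via dma_set_max_seg_size(), which is what the
patch later in this thread does for nvme; roughly what that v5.2-era
helper in include/linux/dma-mapping.h does (paraphrased):

static inline int dma_set_max_seg_size(struct device *dev,
				       unsigned int size)
{
	/* Record the limit that dma_get_max_seg_size() - and thus
	 * dma-debug - will report for this device. */
	if (dev->dma_parms) {
		dev->dma_parms->max_segment_size = size;
		return 0;
	}
	return -EIO;	/* no dma_parms to store the limit in */
}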


* Re: too large sg segments with commit 09324d32d2a08
From: Christoph Hellwig @ 2019-06-05 13:30 UTC (permalink / raw)
  To: Sebastian Ott
  Cc: Christoph Hellwig, Ming Lei, Hannes Reinecke, Jens Axboe,
	linux-block, linux-kernel

Actually, it looks like something completely general isn't easily doable
without some major DMA API work.  Here is what should fix nvme; a few
other drivers will need similar fixes as well:

---
From 745541130409bc837a3416300f529b16eded8513 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Wed, 5 Jun 2019 14:55:26 +0200
Subject: nvme-pci: don't limit DMA segment size

NVMe uses PRPs (or optionally unlimited SGLs) for data transfers and
has no specific limit for a single DMA segment.  Limiting the size
will cause problems because the block layer assumes PRP-ish devices
using a virt boundary mask don't have a segment limit.  And while this
is true, we also really need to tell the DMA mapping layer about it,
otherwise dma-debug will trip over it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Sebastian Ott <sebott@linux.ibm.com>
---
 drivers/nvme/host/pci.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f562154551ce..524d6bd6d095 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2513,6 +2513,12 @@ static void nvme_reset_work(struct work_struct *work)
 	 */
 	dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
 	dev->ctrl.max_segments = NVME_MAX_SEGS;
+
+	/*
+	 * Don't limit the IOMMU merged segment size.
+	 */
+	dma_set_max_seg_size(dev->dev, 0xffffffff);
+
 	mutex_unlock(&dev->shutdown_lock);
 
 	/*
-- 
2.20.1
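
As a design note: 0xffffffff is UINT_MAX, i.e. effectively no DMA-layer
cap at all, which matches the UINT_MAX max_segment_size the block layer
already assumes for queues with a virt boundary.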



* Re: too large sg segments with commit 09324d32d2a08
From: Sebastian Ott @ 2019-06-05 13:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Ming Lei, Hannes Reinecke, Jens Axboe, linux-block, linux-kernel

On Wed, 5 Jun 2019, Christoph Hellwig wrote:
> Actually, it looks like something completely general isn't easily doable
> without some major DMA API work.  Here is what should fix nvme; a few
> other drivers will need similar fixes as well:
> 
> ---
> From 745541130409bc837a3416300f529b16eded8513 Mon Sep 17 00:00:00 2001
> From: Christoph Hellwig <hch@lst.de>
> Date: Wed, 5 Jun 2019 14:55:26 +0200
> Subject: nvme-pci: don't limit DMA segment size
> 
> NVMe uses PRPs (or optionally unlimited SGLs) for data transfers and
> has no specific limit for a single DMA segment.  Limiting the size
> will cause problems because the block layer assumes PRP-ish devices
> using a virt boundary mask don't have a segment limit.  And while this
> is true, we also really need to tell the DMA mapping layer about it,
> otherwise dma-debug will trip over it.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reported-by: Sebastian Ott <sebott@linux.ibm.com>

Works for me. Thanks!


