* too large sg segments with commit 09324d32d2a08
From: Sebastian Ott @ 2019-06-05 9:13 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ming Lei, Hannes Reinecke, Jens Axboe, linux-block, linux-kernel
Hi,
this warning turned up on s390:
[ 7.041512] ------------[ cut here ]------------
[ 7.041518] DMA-API: nvme 0000:00:00.0: mapping sg segment longer than device claims to support [len=106496] [max=65536]
[ 7.041531] WARNING: CPU: 1 PID: 229 at kernel/dma/debug.c:1233 debug_dma_map_sg+0x21e/0x350
[ 7.041537] Modules linked in: scm_block(+) eadm_sch sch_fq_codel autofs4
[ 7.041547] CPU: 1 PID: 229 Comm: systemd-udevd Not tainted 5.2.0-rc3-00002-g112d38aa4733-dirty #146
[ 7.041552] Hardware name: IBM 3906 M03 703 (LPAR)
[ 7.041558] Krnl PSW : 0704d00180000000 00000000af580b6e (debug_dma_map_sg+0x21e/0x350)
[ 7.041566] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:1 PM:0 RI:0 EA:3
[ 7.041572] Krnl GPRS: 0000000095969122 0000000080000000 000000000000006c 00000000af5624bc
[ 7.041578] 0000000000000007 0000000000000001 0000000000010000 0000000000000001
[ 7.041583] 00000000b081d278 00000000fea06888 000000008fa43400 00000000f255b418
[ 7.041589] 00000000f4a28100 ffffffff00000000 00000000af580b6a 000003e0004a36e0
[ 7.041599] Krnl Code: 00000000af580b5e: c02000566fd7 larl %r2,b004eb0c
00000000af580b64: c0e5fffad1fe brasl %r14,af4daf60
#00000000af580b6a: a7f40001 brc 15,af580b6c
>00000000af580b6e: c010005f45f5 larl %r1,b0169758
00000000af580b74: e31010000012 lt %r1,0(%r1)
00000000af580b7a: a774000f brc 7,af580b98
00000000af580b7e: c010005fac23 larl %r1,b01763c4
00000000af580b84: e31010000012 lt %r1,0(%r1)
[ 7.041620] Call Trace:
[ 7.041626] ([<00000000af580b6a>] debug_dma_map_sg+0x21a/0x350)
[ 7.041633] [<00000000afbe2152>] nvme_queue_rq+0x49a/0xd18
[ 7.041639] [<00000000afa178d0>] __blk_mq_try_issue_directly+0x108/0x1f0
[ 7.041645] [<00000000afa18e96>] blk_mq_request_issue_directly+0x4e/0x70
[ 7.041651] [<00000000afa18f42>] blk_mq_try_issue_list_directly+0x8a/0x118
[ 7.041657] [<00000000afa1e42e>] blk_mq_sched_insert_requests+0x1c6/0x350
[ 7.041663] [<00000000afa18e40>] blk_mq_flush_plug_list+0x4f8/0x500
[ 7.041669] [<00000000afa0bb3e>] blk_flush_plug_list+0x106/0x110
[ 7.041674] [<00000000afa0bb7c>] blk_finish_plug+0x34/0x50
[ 7.041680] [<00000000af6938c2>] read_pages+0x152/0x160
[ 7.041687] [<00000000af693b06>] __do_page_cache_readahead+0x236/0x268
[ 7.041693] [<00000000af694458>] force_page_cache_readahead+0x110/0x120
[ 7.041699] [<00000000af683fa4>] generic_file_buffered_read+0x144/0x968
[ 7.041706] [<00000000af74d14c>] new_sync_read+0x13c/0x1b8
[ 7.041712] [<00000000af74f9fa>] vfs_read+0x82/0x138
[ 7.041717] [<00000000af74fd92>] ksys_read+0x62/0xd8
[ 7.041724] [<00000000afe60e00>] system_call+0x2b0/0x2d0
[ 7.041729] 1 lock held by systemd-udevd/229:
[ 7.041734] #0: 00000000f715c4f3 (rcu_read_lock){....}, at: hctx_lock+0x28/0xf8
[ 7.041743] Last Breaking-Event-Address:
[ 7.041749] [<00000000af580b6a>] debug_dma_map_sg+0x21a/0x350
[ 7.041754] irq event stamp: 14457
[ 7.041760] hardirqs last enabled at (14465): [<00000000af560a2c>] console_unlock+0x63c/0x6a8
[ 7.041766] hardirqs last disabled at (14472): [<00000000af5604ba>] console_unlock+0xca/0x6a8
[ 7.041773] softirqs last enabled at (11488): [<00000000afa20ed4>] get_gendisk+0xf4/0x148
[ 7.041779] softirqs last disabled at (11486): [<00000000afa20e48>] get_gendisk+0x68/0x148
[ 7.041784] ---[ end trace 9142fc6f63a22c6e ]---
The length of the sg entry created by blk_rq_map_sg is indeed larger than
the dma max_segment_size.
Bisect points to 09324d32d2a0 ("block: force an unlimited segment size on queues with a virt boundary")
Regards,
Sebastian
* Re: too large sg segments with commit 09324d32d2a08
From: Christoph Hellwig @ 2019-06-05 10:09 UTC (permalink / raw)
To: Sebastian Ott
Cc: Christoph Hellwig, Ming Lei, Hannes Reinecke, Jens Axboe,
linux-block, linux-kernel
The problem is that we don't communicate the block-level max_segment_size
to the iommu, and the commit really is just the messenger.
I'll cook up a series to fix this, as papering over it in every driver
does not seem sustainable.
* Re: too large sg segments with commit 09324d32d2a08
From: Christoph Hellwig @ 2019-06-05 13:30 UTC (permalink / raw)
To: Sebastian Ott
Cc: Christoph Hellwig, Ming Lei, Hannes Reinecke, Jens Axboe,
linux-block, linux-kernel
Actually, it looks like something completely general isn't
easily doable, not without some major dma API work. Here is what
should fix nvme, but a few other drivers will need fixes as well:
---
From 745541130409bc837a3416300f529b16eded8513 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Wed, 5 Jun 2019 14:55:26 +0200
Subject: nvme-pci: don't limit DMA segment size
NVMe uses PRPs (or optionally unlimited SGLs) for data transfers and
has no specific limit for a single DMA segment. Limiting the size
will cause problems because the block layer assumes PRP-ish devices
using a virt boundary mask don't have a segment limit. And while this
is true, we also really need to tell the DMA mapping layer about it,
otherwise dma-debug will trip over it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reported-by: Sebastian Ott <sebott@linux.ibm.com>
---
drivers/nvme/host/pci.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f562154551ce..524d6bd6d095 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2513,6 +2513,12 @@ static void nvme_reset_work(struct work_struct *work)
*/
dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
dev->ctrl.max_segments = NVME_MAX_SEGS;
+
+ /*
+ * Don't limit the IOMMU merged segment size.
+ */
+ dma_set_max_seg_size(dev->dev, 0xffffffff);
+
mutex_unlock(&dev->shutdown_lock);
/*
--
2.20.1
* Re: too large sg segments with commit 09324d32d2a08
From: Sebastian Ott @ 2019-06-05 13:57 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Ming Lei, Hannes Reinecke, Jens Axboe, linux-block, linux-kernel
On Wed, 5 Jun 2019, Christoph Hellwig wrote:
> Actually, it looks like something completely general isn't
> easily doable, not without some major dma API work. Here is what
> should fix nvme, but a few other drivers will need fixes as well:
> [full patch snipped]
Works for me. Thanks!