From: Hannes Reinecke <hare@suse.de>
To: Luis Chamberlain <mcgrof@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
Kent Overstreet <kent.overstreet@linux.dev>,
hare@kernel.org, Andrew Morton <akpm@linux-foundation.org>,
Pankaj Raghav <kernel@pankajraghav.com>,
linux-nvme@lists.infradead.org, linux-block@vger.kernel.org,
Pankaj Raghav <p.raghav@samsung.com>
Subject: Re: [PATCH 5/5] nvme: enable logical block size > PAGE_SIZE
Date: Tue, 14 May 2024 01:07:02 +0200 [thread overview]
Message-ID: <0dc0b13d-27b7-41c4-8bf3-64f1810d2b39@suse.de> (raw)
In-Reply-To: <ZkKAoHoC5yvNUKSE@bombadil.infradead.org>
On 5/13/24 23:05, Luis Chamberlain wrote:
> On Mon, May 13, 2024 at 06:07:55PM +0200, Hannes Reinecke wrote:
>> On 5/12/24 11:16, Luis Chamberlain wrote:
>>> On Sat, May 11, 2024 at 07:43:26PM -0700, Luis Chamberlain wrote:
>>>> I'll try next going above 512 KiB.
>>>
>>> At 1 MiB NVMe LBA format we crash with the BUG_ON(sectors <= 0) on bio_split().
>>>
>>> [ 13.401651] ------------[ cut here ]------------
>>> [ 13.403298] kernel BUG at block/bio.c:1626!
>> Ah. MAX_BUFS_PER_PAGE getting in the way.
>>
>> Can you test with the attached patch?
>
> Nope same crash:
>
> I've enabled you to easily test this with NVMe on libvirt with kdevops,
> please test.
>
> Luis
>
> [ 14.972734] ------------[ cut here ]------------
> [ 14.974731] kernel BUG at block/bio.c:1626!
> [ 14.976906] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> [ 14.978899] CPU: 3 PID: 59 Comm: kworker/u36:0 Not tainted 6.9.0-rc6+ #4
> [ 14.981005] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [ 14.983782] Workqueue: nvme-wq nvme_scan_work [nvme_core]
> [ 14.985431] RIP: 0010:bio_split+0xd5/0xf0
> [ 14.986627] Code: 5b 4c 89 e0 5d 41 5c 41 5d c3 cc cc cc cc c7 43 28 00 00 00 00 eb db 0f 0b 45 31 e4 5b 5d 4c 89 e0 41 5c 41 5d c3 cc cc cc cc <0f> 0b 0f 0b 4c 89 e7 e8 bf ee ff ff eb e1 66 66 2e 0f 1f 84 00 00
> [ 14.992063] RSP: 0018:ffffbecc002378d0 EFLAGS: 00010246
> [ 14.993416] RAX: 0000000000000001 RBX: ffff9e2fe8583e40 RCX: ffff9e2fdcb73060
> [ 14.995181] RDX: 0000000000000c00 RSI: 0000000000000000 RDI: ffff9e2fe8583e40
> [ 14.996960] RBP: 0000000000000000 R08: 0000000000000080 R09: 0000000000000000
> [ 14.998715] R10: ffff9e2fe8583e40 R11: ffff9e2fe8583eb8 R12: ffff9e2fe884b750
> [ 15.000510] R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000000
> [ 15.002128] FS: 0000000000000000(0000) GS:ffff9e303bcc0000(0000) knlGS:0000000000000000
> [ 15.003956] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 15.005294] CR2: 0000561b2b5ce478 CR3: 0000000102484002 CR4: 0000000000770ef0
> [ 15.006921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 15.008509] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
> [ 15.010001] PKRU: 55555554
> [ 15.010672] Call Trace:
> [ 15.011297] <TASK>
> [ 15.011868] ? die+0x32/0x80
> [ 15.012572] ? do_trap+0xd9/0x100
> [ 15.013306] ? bio_split+0xd5/0xf0
> [ 15.014051] ? do_error_trap+0x6a/0x90
> [ 15.014854] ? bio_split+0xd5/0xf0
> [ 15.015597] ? exc_invalid_op+0x4c/0x60
> [ 15.016419] ? bio_split+0xd5/0xf0
> [ 15.017113] ? asm_exc_invalid_op+0x16/0x20
> [ 15.017932] ? bio_split+0xd5/0xf0
> [ 15.018624] __bio_split_to_limits+0x90/0x2d0
> [ 15.019474] blk_mq_submit_bio+0x111/0x6a0
> [ 15.020280] ? kmem_cache_alloc+0x254/0x2e0
> [ 15.021040] submit_bio_noacct_nocheck+0x2f1/0x3d0
> [ 15.021893] ? submit_bio_noacct+0x42/0x5b0
> [ 15.022658] block_read_full_folio+0x2b7/0x350
> [ 15.023457] ? __pfx_blkdev_get_block+0x10/0x10
> [ 15.024284] ? __pfx_blkdev_read_folio+0x10/0x10
> [ 15.025073] ? __pfx_blkdev_read_folio+0x10/0x10
> [ 15.025851] filemap_read_folio+0x32/0xb0
> [ 15.026540] do_read_cache_folio+0x108/0x200
> [ 15.027271] ? __pfx_adfspart_check_ICS+0x10/0x10
> [ 15.028066] read_part_sector+0x32/0xe0
> [ 15.028701] adfspart_check_ICS+0x32/0x480
> [ 15.029334] ? snprintf+0x49/0x70
> [ 15.029875] ? __pfx_adfspart_check_ICS+0x10/0x10
> [ 15.030592] bdev_disk_changed+0x2a2/0x6e0
> [ 15.031226] blkdev_get_whole+0x5f/0xa0
> [ 15.031827] bdev_open+0x201/0x3c0
> [ 15.032360] bdev_file_open_by_dev+0xb5/0x110
> [ 15.032990] disk_scan_partitions+0x65/0xe0
> [ 15.033598] device_add_disk+0x3e0/0x3f0
> [ 15.034172] nvme_scan_ns+0x5f0/0xe50 [nvme_core]
> [ 15.034862] nvme_scan_work+0x26f/0x5a0 [nvme_core]
> [ 15.035568] process_one_work+0x189/0x3b0
> [ 15.036168] worker_thread+0x273/0x390
> [ 15.036713] ? __pfx_worker_thread+0x10/0x10
> [ 15.037312] kthread+0xda/0x110
> [ 15.037779] ? __pfx_kthread+0x10/0x10
> [ 15.038316] ret_from_fork+0x2d/0x50
> [ 15.038829] ? __pfx_kthread+0x10/0x10
> [ 15.039364] ret_from_fork_asm+0x1a/0x30
> [ 15.039924] </TASK>
>
Ah. So this should fix it:
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 4e3483a16b75..4fac11edd0c8 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -289,7 +289,7 @@ struct bio *bio_split_rw(struct bio *bio, const struct queue_limits *lim,
if (nsegs < lim->max_segments &&
bytes + bv.bv_len <= max_bytes &&
- bv.bv_offset + bv.bv_len <= PAGE_SIZE) {
+ bv.bv_offset + bv.bv_len <= lim->max_segment_size) {
nsegs++;
bytes += bv.bv_len;
} else {
Cheers,
Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@suse.de +49 911 74053 688
SUSE Software Solutions GmbH, Frankenstr. 146, 90461 Nürnberg
HRB 36809 (AG Nürnberg), GF: I. Totev, A. McDonald, W. Knoblich