linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block: fix splitting segments
@ 2019-12-29  2:32 Ming Lei
  2019-12-29 16:14 ` Jens Axboe
                   ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Ming Lei @ 2019-12-29  2:32 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Ming Lei

There are two issues in get_max_segment_size():

1) the default segment boudary mask is bypassed, and some devices still
require segment to not cross the default 4G boundary

2) the segment start address isn't taken into account when checking
segment boundary limit

Fixes the two issues.

Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 block/blk-merge.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/block/blk-merge.c b/block/blk-merge.c
index d783bdc4559b..234377d8146b 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -157,16 +157,14 @@ static inline unsigned get_max_io_size(struct request_queue *q,
 	return sectors & (lbs - 1);
 }
 
-static unsigned get_max_segment_size(const struct request_queue *q,
-				     unsigned offset)
+static inline unsigned get_max_segment_size(const struct request_queue *q,
+					    const struct page *start_page,
+					    unsigned long offset)
 {
 	unsigned long mask = queue_segment_boundary(q);
 
-	/* default segment boundary mask means no boundary limit */
-	if (mask == BLK_SEG_BOUNDARY_MASK)
-		return queue_max_segment_size(q);
-
-	return min_t(unsigned long, mask - (mask & offset) + 1,
+	offset = mask & (page_to_phys(start_page) + offset);
+	return min_t(unsigned long, mask - offset + 1,
 		     queue_max_segment_size(q));
 }
 
@@ -201,7 +199,8 @@ static bool bvec_split_segs(const struct request_queue *q,
 	unsigned seg_size = 0;
 
 	while (len && *nsegs < max_segs) {
-		seg_size = get_max_segment_size(q, bv->bv_offset + total_len);
+		seg_size = get_max_segment_size(q, bv->bv_page,
+						bv->bv_offset + total_len);
 		seg_size = min(seg_size, len);
 
 		(*nsegs)++;
@@ -419,7 +418,8 @@ static unsigned blk_bvec_map_sg(struct request_queue *q,
 
 	while (nbytes > 0) {
 		unsigned offset = bvec->bv_offset + total;
-		unsigned len = min(get_max_segment_size(q, offset), nbytes);
+		unsigned len = min(get_max_segment_size(q, bvec->bv_page,
+					offset), nbytes);
 		struct page *page = bvec->bv_page;
 
 		/*
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2019-12-29  2:32 [PATCH] block: fix splitting segments Ming Lei
@ 2019-12-29 16:14 ` Jens Axboe
  2019-12-30  1:15   ` Bart Van Assche
  2020-01-07 12:47 ` Guenter Roeck
  2020-01-08 14:02 ` Christoph Hellwig
  2 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2019-12-29 16:14 UTC (permalink / raw)
  To: Ming Lei; +Cc: linux-block, Chris Mason

On 12/28/19 7:32 PM, Ming Lei wrote:
> There are two issues in get_max_segment_size():
> 
> 1) the default segment boudary mask is bypassed, and some devices still
> require segment to not cross the default 4G boundary
> 
> 2) the segment start address isn't taken into account when checking
> segment boundary limit
> 
> Fixes the two issues.

Given the potential severity of the bug, I think it deserves a somewhat
richer explanation than just that. It should also go into stable. This
is what I queued up:

https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.5&id=add1fc07334260253dfa880d9c964edc8381deac

BTW, did you change your email setup recently? Your patches are coming
through mangled.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2019-12-29 16:14 ` Jens Axboe
@ 2019-12-30  1:15   ` Bart Van Assche
  2019-12-30  1:25     ` Ming Lei
  0 siblings, 1 reply; 24+ messages in thread
From: Bart Van Assche @ 2019-12-30  1:15 UTC (permalink / raw)
  To: Jens Axboe, Ming Lei; +Cc: linux-block, Chris Mason

On 2019-12-29 08:14, Jens Axboe wrote:
> On 12/28/19 7:32 PM, Ming Lei wrote:
>> There are two issues in get_max_segment_size():
>>
>> 1) the default segment boudary mask is bypassed, and some devices still
>> require segment to not cross the default 4G boundary
>>
>> 2) the segment start address isn't taken into account when checking
>> segment boundary limit
>>
>> Fixes the two issues.
> 
> Given the potential severity of the bug, I think it deserves a somewhat
> richer explanation than just that. It should also go into stable. This
> is what I queued up:
> 
> https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.5&id=add1fc07334260253dfa880d9c964edc8381deac

Although this patch looks fine to me, seeing Jens' patch description
makes me wonder whether the DMA segment boundary for the mpt3sas adapter
should be made explicit, e.g. by setting the SCSI host .dma_boundary
parameter in the mpt3sas driver? From the SCSI core:

	blk_queue_segment_boundary(q, shost->dma_boundary);

Bart.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2019-12-30  1:15   ` Bart Van Assche
@ 2019-12-30  1:25     ` Ming Lei
  0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2019-12-30  1:25 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: Jens Axboe, linux-block, Chris Mason

On Sun, Dec 29, 2019 at 05:15:49PM -0800, Bart Van Assche wrote:
> On 2019-12-29 08:14, Jens Axboe wrote:
> > On 12/28/19 7:32 PM, Ming Lei wrote:
> >> There are two issues in get_max_segment_size():
> >>
> >> 1) the default segment boudary mask is bypassed, and some devices still
> >> require segment to not cross the default 4G boundary
> >>
> >> 2) the segment start address isn't taken into account when checking
> >> segment boundary limit
> >>
> >> Fixes the two issues.
> > 
> > Given the potential severity of the bug, I think it deserves a somewhat
> > richer explanation than just that. It should also go into stable. This
> > is what I queued up:
> > 
> > https://git.kernel.dk/cgit/linux-block/commit/?h=block-5.5&id=add1fc07334260253dfa880d9c964edc8381deac
> 
> Although this patch looks fine to me, seeing Jens' patch description
> makes me wonder whether the DMA segment boundary for the mpt3sas adapter
> should be made explicit, e.g. by setting the SCSI host .dma_boundary
> parameter in the mpt3sas driver? From the SCSI core:
> 
> 	blk_queue_segment_boundary(q, shost->dma_boundary);

See scsi_host_alloc():
		...
        /*
         * assume a 4GB boundary, if not set
         */
        if (sht->dma_boundary)
                shost->dma_boundary = sht->dma_boundary;
        else
                shost->dma_boundary = 0xffffffff;


Thanks, 
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2019-12-29  2:32 [PATCH] block: fix splitting segments Ming Lei
  2019-12-29 16:14 ` Jens Axboe
@ 2020-01-07 12:47 ` Guenter Roeck
  2020-01-07 15:23   ` Ming Lei
  2020-01-08 14:02 ` Christoph Hellwig
  2 siblings, 1 reply; 24+ messages in thread
From: Guenter Roeck @ 2020-01-07 12:47 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, Chris Mason

Hi,

On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> There are two issues in get_max_segment_size():
> 
> 1) the default segment boudary mask is bypassed, and some devices still
> require segment to not cross the default 4G boundary
> 
> 2) the segment start address isn't taken into account when checking
> segment boundary limit
> 
> Fixes the two issues.
> 
> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> Signed-off-by: Ming Lei <ming.lei@redhat.com>

This patch, pushed into mainline as "block: fix splitting segments on
boundary masks", results in the following crash when booting 'versatilepb'
in qemu from disk. Bisect log is attached. Detailed log is at
https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio

Guenter

---
Crash:

kernel BUG at block/bio.c:1885!
Internal error: Oops - BUG: 0 [#1] ARM
Modules linked in:
CPU: 0 PID: 1 Comm: init Not tainted 5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
PC is at bio_split+0x10c/0x15c
LR is at __blk_queue_split+0x378/0x628
...
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20:                   b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
...
WARNING: CPU: 0 PID: 1 at kernel/exit.c:719 do_exit+0x54/0xb5c
Modules linked in:
CPU: 0 PID: 1 Comm: init Tainted: G      D           5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
[<c001e8b0>] (unwind_backtrace) from [<c001a774>] (show_stack+0x10/0x14)
[<c001a774>] (show_stack) from [<c0027dc4>] (__warn+0xe4/0x108)
[<c0027dc4>] (__warn) from [<c0027e90>] (warn_slowpath_fmt+0xa8/0xb8)
[<c0027e90>] (warn_slowpath_fmt) from [<c0029dfc>] (do_exit+0x54/0xb5c)
[<c0029dfc>] (do_exit) from [<c001a918>] (die+0x1a0/0x274)
[<c001a918>] (die) from [<c001abfc>] (do_undefinstr+0xac/0x258)
[<c001abfc>] (do_undefinstr) from [<c000a238>] (__und_svc_finish+0x0/0x48)
Exception stack(0xc783b950 to 0xc783b998)
b940:                                     c7226a80 00000000 00000c00 c7bebe00
b960: 00000000 c783b9e4 00000000 c7226a80 00007000 00000060 c7beb848 c783b9f0
b980: 00000060 c783b9a0 c03c24c8 c03b716c 60000153 ffffffff
[<c000a238>] (__und_svc_finish) from [<c03b716c>] (bio_split+0x10c/0x15c)
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20:                   b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
Exception stack(0xc783bfb0 to 0xc783bff8)
bfa0:                                     00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
irq event stamp: 159876
hardirqs last  enabled at (159875): [<c009123c>] ktime_get+0x74/0x16c
hardirqs last disabled at (159876): [<c000a21c>] __und_svc+0x5c/0x70
softirqs last  enabled at (159728): [<c000aaf0>] __do_softirq+0x308/0x4bc
softirqs last disabled at (159697): [<c002cf94>] irq_exit+0x150/0x178
---[ end trace 42dd349d5c0726c1 ]---
BUG: sleeping function called from invalid context at ./include/linux/cgroup-defs.h:747
in_atomic(): 0, irqs_disabled(): 128, non_block: 0, pid: 1, name: init
INFO: lockdep is turned off.
irq event stamp: 159876
hardirqs last  enabled at (159875): [<c009123c>] ktime_get+0x74/0x16c
hardirqs last disabled at (159876): [<c000a21c>] __und_svc+0x5c/0x70
softirqs last  enabled at (159728): [<c000aaf0>] __do_softirq+0x308/0x4bc
softirqs last disabled at (159697): [<c002cf94>] irq_exit+0x150/0x178
CPU: 0 PID: 1 Comm: init Tainted: G      D W         5.5.0-rc5 #1
Hardware name: ARM-Versatile (Device Tree Support)
[<c001e8b0>] (unwind_backtrace) from [<c001a774>] (show_stack+0x10/0x14)
[<c001a774>] (show_stack) from [<c004fc4c>] (___might_sleep+0x1a8/0x2bc)
[<c004fc4c>] (___might_sleep) from [<c00380d0>] (exit_signals+0x28/0x134)
[<c00380d0>] (exit_signals) from [<c0029e70>] (do_exit+0xc8/0xb5c)
[<c0029e70>] (do_exit) from [<c001a918>] (die+0x1a0/0x274)
[<c001a918>] (die) from [<c001abfc>] (do_undefinstr+0xac/0x258)
[<c001abfc>] (do_undefinstr) from [<c000a238>] (__und_svc_finish+0x0/0x48)
Exception stack(0xc783b950 to 0xc783b998)
b940:                                     c7226a80 00000000 00000c00 c7bebe00
b960: 00000000 c783b9e4 00000000 c7226a80 00007000 00000060 c7beb848 c783b9f0
b980: 00000060 c783b9a0 c03c24c8 c03b716c 60000153 ffffffff
[<c000a238>] (__und_svc_finish) from [<c03b716c>] (bio_split+0x10c/0x15c)
[<c03b716c>] (bio_split) from [<c03c24c8>] (__blk_queue_split+0x378/0x628)
[<c03c24c8>] (__blk_queue_split) from [<c03c82cc>] (blk_mq_make_request+0x6c/0x7e4)
[<c03c82cc>] (blk_mq_make_request) from [<c03bc7d0>] (generic_make_request+0xec/0x340)
[<c03bc7d0>] (generic_make_request) from [<c03bca70>] (submit_bio+0x4c/0x170)
[<c03bca70>] (submit_bio) from [<c020666c>] (ext4_mpage_readpages+0x54c/0x8e0)
[<c020666c>] (ext4_mpage_readpages) from [<c01e1ec8>] (ext4_readpages+0x40/0x50)
[<c01e1ec8>] (ext4_readpages) from [<c00df808>] (read_pages+0x50/0x13c)
[<c00df808>] (read_pages) from [<c00dfd2c>] (__do_page_cache_readahead+0x1a8/0x1f0)
[<c00dfd2c>] (__do_page_cache_readahead) from [<c00d500c>] (filemap_fault+0x440/0x8f4)
[<c00d500c>] (filemap_fault) from [<c01ed494>] (ext4_filemap_fault+0x28/0x3c)
[<c01ed494>] (ext4_filemap_fault) from [<c0100b88>] (__do_fault+0x3c/0x1c0)
[<c0100b88>] (__do_fault) from [<c0105360>] (handle_mm_fault+0x284/0xaf4)
[<c0105360>] (handle_mm_fault) from [<c001f01c>] (do_page_fault+0x114/0x2e0)
[<c001f01c>] (do_page_fault) from [<c001f34c>] (do_DataAbort+0x38/0xbc)
[<c001f34c>] (do_DataAbort) from [<c000a0dc>] (__dabt_svc+0x5c/0xa0)
Exception stack(0xc783be28 to 0xc783be70)
be20:                   b6f81898 00000760 00000000 00000055 00000051 c0af3068
be40: c71f7a00 c71f7800 b6f51000 c71df320 c7954c80 00000006 00000000 c783be78
be60: c01a099c c07a4bb4 20000153 ffffffff
[<c000a0dc>] (__dabt_svc) from [<c07a4bb4>] (__clear_user_std+0x34/0x68)
[<c07a4bb4>] (__clear_user_std) from [<c01a099c>] (clear_user+0x40/0x50)
[<c01a099c>] (clear_user) from [<c019f204>] (load_elf_binary+0x1354/0x13c4)
[<c019f204>] (load_elf_binary) from [<c013996c>] (search_binary_handler.part.4+0x58/0x1fc)
[<c013996c>] (search_binary_handler.part.4) from [<c013b18c>] (__do_execve_file+0x780/0x9a4)
[<c013b18c>] (__do_execve_file) from [<c013b54c>] (do_execve+0x28/0x30)
[<c013b54c>] (do_execve) from [<c000af1c>] (try_to_run_init_process+0xc/0x3c)
[<c000af1c>] (try_to_run_init_process) from [<c07c42fc>] (kernel_init+0x88/0xf0)
[<c07c42fc>] (kernel_init) from [<c00090b0>] (ret_from_fork+0x14/0x24)
Exception stack(0xc783bfb0 to 0xc783bff8)
bfa0:                                     00000000 00000000 00000000 00000000
bfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
bfe0: 00000000 00000000 00000000 00000000 00000013 00000000
Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b

---
bisect log:

# bad: [ae6088216ce4b99b3a4aaaccd2eb2dd40d473d42] Merge tag 'trace-v5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
# good: [738d2902773e30939a982c8df7a7f94293659810] Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
git bisect start 'HEAD' '738d2902773e'
# bad: [84029fd04c201a4c7e0b07ba262664900f47c6f5] memcg: account security cred as well to kmemcg
git bisect bad 84029fd04c201a4c7e0b07ba262664900f47c6f5
# good: [e35d0165908ad2d2bdb76773ef77b551763eedbd] Merge tag 'sound-5.5-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect good e35d0165908ad2d2bdb76773ef77b551763eedbd
# bad: [3a562aee727a7bfbb3a37b1aa934118397dad701] Merge tag 'for-5.5-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
git bisect bad 3a562aee727a7bfbb3a37b1aa934118397dad701
# good: [bed723519a72c0f68fbfaf68ed5bf55d04e46566] Merge tag 'kbuild-fixes-v5.5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
git bisect good bed723519a72c0f68fbfaf68ed5bf55d04e46566
# bad: [b6b4aafc99d7c8dbf7d9429bf054b591daab1ad0] Merge tag 'block-5.5-20200103' of git://git.kernel.dk/linux-block
git bisect bad b6b4aafc99d7c8dbf7d9429bf054b591daab1ad0
# bad: [429120f3df2dba2bf3a4a19f4212a53ecefc7102] block: fix splitting segments on boundary masks
git bisect bad 429120f3df2dba2bf3a4a19f4212a53ecefc7102
# good: [85a8ce62c2eabe28b9d76ca4eecf37922402df93] block: add bio_truncate to fix guard_bio_eod
git bisect good 85a8ce62c2eabe28b9d76ca4eecf37922402df93
# first bad commit: [429120f3df2dba2bf3a4a19f4212a53ecefc7102] block: fix splitting segments on boundary masks

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 12:47 ` Guenter Roeck
@ 2020-01-07 15:23   ` Ming Lei
  2020-01-07 18:11     ` Guenter Roeck
  0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-07 15:23 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jens Axboe, linux-block, linux-kernel, Chris Mason

On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> Hi,
> 
> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > There are two issues in get_max_segment_size():
> > 
> > 1) the default segment boudary mask is bypassed, and some devices still
> > require segment to not cross the default 4G boundary
> > 
> > 2) the segment start address isn't taken into account when checking
> > segment boundary limit
> > 
> > Fixes the two issues.
> > 
> > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> 
> This patch, pushed into mainline as "block: fix splitting segments on
> boundary masks", results in the following crash when booting 'versatilepb'
> in qemu from disk. Bisect log is attached. Detailed log is at
> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> 
> Guenter
> 
> ---
> Crash:
> 
> kernel BUG at block/bio.c:1885!
> Internal error: Oops - BUG: 0 [#1] ARM

Please apply the following debug patch, and post the log.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 347782a24a35..c4aa683a1c20 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -217,6 +217,33 @@ static bool bvec_split_segs(const struct request_queue *q,
 	return len > 0 || bv->bv_len > max_len;
 }
 
+static void blk_dump_bio(struct request_queue *q, struct bio *bio,
+		unsigned secs)
+{
+	struct bvec_iter iter;
+	struct bio_vec bvec;
+	int i = 0;
+	unsigned sectors = 0;
+
+	printk("max_sectors %u max_segs %u max_seg_size %u mask %lx\n",
+			get_max_io_size(q, bio), queue_max_segments(q),
+			queue_max_segment_size(q), queue_segment_boundary(q));
+	printk("%p: %hx/%hx %llu %u, %u\n",
+                       bio,
+                       bio->bi_flags, bio->bi_opf,
+                       (unsigned long long)bio->bi_iter.bi_sector,
+                       bio->bi_iter.bi_size, secs);
+	bio_for_each_bvec(bvec, bio, iter) {
+		sectors += bvec.bv_len >> 9;
+		trace_printk("\t %d: %lu %u %u(%u)\n", i++,
+				(unsigned long)page_to_pfn(bvec.bv_page),
+				bvec.bv_offset,
+				bvec.bv_len, bvec.bv_len >> 12);
+	}
+	printk("\t total sectors %u\n", sectors);
+}
+
+
 /**
  * blk_bio_segment_split - split a bio in two bios
  * @q:    [in] request queue pointer
@@ -273,6 +300,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 	return NULL;
 split:
 	*segs = nsegs;
+
+	if (sectors <= 0 || sectors >= bio_sectors(bio))
+		blk_dump_bio(q, bio, sectors);
 	return bio_split(bio, sectors, GFP_NOIO, bs);
 }
 

Thanks,
Ming


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 15:23   ` Ming Lei
@ 2020-01-07 18:11     ` Guenter Roeck
  2020-01-07 22:30       ` Ming Lei
  0 siblings, 1 reply; 24+ messages in thread
From: Guenter Roeck @ 2020-01-07 18:11 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, Chris Mason

On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > Hi,
> > 
> > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > There are two issues in get_max_segment_size():
> > > 
> > > 1) the default segment boudary mask is bypassed, and some devices still
> > > require segment to not cross the default 4G boundary
> > > 
> > > 2) the segment start address isn't taken into account when checking
> > > segment boundary limit
> > > 
> > > Fixes the two issues.
> > > 
> > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > 
> > This patch, pushed into mainline as "block: fix splitting segments on
> > boundary masks", results in the following crash when booting 'versatilepb'
> > in qemu from disk. Bisect log is attached. Detailed log is at
> > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> > 
> > Guenter
> > 
> > ---
> > Crash:
> > 
> > kernel BUG at block/bio.c:1885!
> > Internal error: Oops - BUG: 0 [#1] ARM
> 
> Please apply the following debug patch, and post the log.
> 

Here you are:

max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
c738da80: 8c80/0 2416 28672, 0
         total sectors 56

(I replaced %p with %px).

Guenter

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 347782a24a35..c4aa683a1c20 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -217,6 +217,33 @@ static bool bvec_split_segs(const struct request_queue *q,
>  	return len > 0 || bv->bv_len > max_len;
>  }
>  
> +static void blk_dump_bio(struct request_queue *q, struct bio *bio,
> +		unsigned secs)
> +{
> +	struct bvec_iter iter;
> +	struct bio_vec bvec;
> +	int i = 0;
> +	unsigned sectors = 0;
> +
> +	printk("max_sectors %u max_segs %u max_seg_size %u mask %lx\n",
> +			get_max_io_size(q, bio), queue_max_segments(q),
> +			queue_max_segment_size(q), queue_segment_boundary(q));
> +	printk("%p: %hx/%hx %llu %u, %u\n",
> +                       bio,
> +                       bio->bi_flags, bio->bi_opf,
> +                       (unsigned long long)bio->bi_iter.bi_sector,
> +                       bio->bi_iter.bi_size, secs);
> +	bio_for_each_bvec(bvec, bio, iter) {
> +		sectors += bvec.bv_len >> 9;
> +		trace_printk("\t %d: %lu %u %u(%u)\n", i++,
> +				(unsigned long)page_to_pfn(bvec.bv_page),
> +				bvec.bv_offset,
> +				bvec.bv_len, bvec.bv_len >> 12);
> +	}
> +	printk("\t total sectors %u\n", sectors);
> +}
> +
> +
>  /**
>   * blk_bio_segment_split - split a bio in two bios
>   * @q:    [in] request queue pointer
> @@ -273,6 +300,9 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
>  	return NULL;
>  split:
>  	*segs = nsegs;
> +
> +	if (sectors <= 0 || sectors >= bio_sectors(bio))
> +		blk_dump_bio(q, bio, sectors);
>  	return bio_split(bio, sectors, GFP_NOIO, bs);
>  }
>  
> 
> Thanks,
> Ming
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 18:11     ` Guenter Roeck
@ 2020-01-07 22:30       ` Ming Lei
  2020-01-07 22:32         ` Jens Axboe
  2020-01-07 23:06         ` Guenter Roeck
  0 siblings, 2 replies; 24+ messages in thread
From: Ming Lei @ 2020-01-07 22:30 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jens Axboe, linux-block, linux-kernel, Chris Mason

On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> > On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > > Hi,
> > > 
> > > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > > There are two issues in get_max_segment_size():
> > > > 
> > > > 1) the default segment boudary mask is bypassed, and some devices still
> > > > require segment to not cross the default 4G boundary
> > > > 
> > > > 2) the segment start address isn't taken into account when checking
> > > > segment boundary limit
> > > > 
> > > > Fixes the two issues.
> > > > 
> > > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > 
> > > This patch, pushed into mainline as "block: fix splitting segments on
> > > boundary masks", results in the following crash when booting 'versatilepb'
> > > in qemu from disk. Bisect log is attached. Detailed log is at
> > > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> > > 
> > > Guenter
> > > 
> > > ---
> > > Crash:
> > > 
> > > kernel BUG at block/bio.c:1885!
> > > Internal error: Oops - BUG: 0 [#1] ARM
> > 
> > Please apply the following debug patch, and post the log.
> > 
> 
> Here you are:
> 
> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> c738da80: 8c80/0 2416 28672, 0
>          total sectors 56
> 
> (I replaced %p with %px).
> 

Please try the following patch and see if it makes a difference.
If not, replace trace_printk with printk in previous debug patch,
and apply the debug patch only & post the log.

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 347782a24a35..f152bdee9b05 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -159,12 +159,12 @@ static inline unsigned get_max_io_size(struct request_queue *q,
 
 static inline unsigned get_max_segment_size(const struct request_queue *q,
 					    struct page *start_page,
-					    unsigned long offset)
+					    unsigned long long offset)
 {
 	unsigned long mask = queue_segment_boundary(q);
 
 	offset = mask & (page_to_phys(start_page) + offset);
-	return min_t(unsigned long, mask - offset + 1,
+	return min_t(unsigned long long, mask - offset + 1,
 		     queue_max_segment_size(q));
 }
 

Thanks,
Ming


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 22:30       ` Ming Lei
@ 2020-01-07 22:32         ` Jens Axboe
  2020-01-08  1:59           ` Ming Lei
  2020-01-07 23:06         ` Guenter Roeck
  1 sibling, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2020-01-07 22:32 UTC (permalink / raw)
  To: Ming Lei, Guenter Roeck; +Cc: linux-block, linux-kernel, Chris Mason

On 1/7/20 3:30 PM, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
>> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
>>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
>>>> Hi,
>>>>
>>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
>>>>> There are two issues in get_max_segment_size():
>>>>>
>>>>> 1) the default segment boudary mask is bypassed, and some devices still
>>>>> require segment to not cross the default 4G boundary
>>>>>
>>>>> 2) the segment start address isn't taken into account when checking
>>>>> segment boundary limit
>>>>>
>>>>> Fixes the two issues.
>>>>>
>>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
>>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>>>
>>>> This patch, pushed into mainline as "block: fix splitting segments on
>>>> boundary masks", results in the following crash when booting 'versatilepb'
>>>> in qemu from disk. Bisect log is attached. Detailed log is at
>>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
>>>>
>>>> Guenter
>>>>
>>>> ---
>>>> Crash:
>>>>
>>>> kernel BUG at block/bio.c:1885!
>>>> Internal error: Oops - BUG: 0 [#1] ARM
>>>
>>> Please apply the following debug patch, and post the log.
>>>
>>
>> Here you are:
>>
>> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
>> c738da80: 8c80/0 2416 28672, 0
>>          total sectors 56
>>
>> (I replaced %p with %px).
>>
> 
> Please try the following patch and see if it makes a difference.
> If not, replace trace_printk with printk in previous debug patch,
> and apply the debug patch only & post the log.

If it is a 32-bit issue, then we should use a 64-bit type to make
this nicer than ULL. But it seems reasonable that it could be!

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 22:30       ` Ming Lei
  2020-01-07 22:32         ` Jens Axboe
@ 2020-01-07 23:06         ` Guenter Roeck
  1 sibling, 0 replies; 24+ messages in thread
From: Guenter Roeck @ 2020-01-07 23:06 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block, linux-kernel, Chris Mason

On Wed, Jan 08, 2020 at 06:30:35AM +0800, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> > On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> > > On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> > > > Hi,
> > > > 
> > > > On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> > > > > There are two issues in get_max_segment_size():
> > > > > 
> > > > > 1) the default segment boudary mask is bypassed, and some devices still
> > > > > require segment to not cross the default 4G boundary
> > > > > 
> > > > > 2) the segment start address isn't taken into account when checking
> > > > > segment boundary limit
> > > > > 
> > > > > Fixes the two issues.
> > > > > 
> > > > > Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> > > > > Signed-off-by: Ming Lei <ming.lei@redhat.com>
> > > > 
> > > > This patch, pushed into mainline as "block: fix splitting segments on
> > > > boundary masks", results in the following crash when booting 'versatilepb'
> > > > in qemu from disk. Bisect log is attached. Detailed log is at
> > > > https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> > > > 
> > > > Guenter
> > > > 
> > > > ---
> > > > Crash:
> > > > 
> > > > kernel BUG at block/bio.c:1885!
> > > > Internal error: Oops - BUG: 0 [#1] ARM
> > > 
> > > Please apply the following debug patch, and post the log.
> > > 
> > 
> > Here you are:
> > 
> > max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> > c738da80: 8c80/0 2416 28672, 0
> >          total sectors 56
> > 
> > (I replaced %p with %px).
> > 
> 
> Please try the following patch and see if it makes a difference.
> If not, replace trace_printk with printk in previous debug patch,
> and apply the debug patch only & post the log.
> 

The patch below fixes the problem for me.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 347782a24a35..f152bdee9b05 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -159,12 +159,12 @@ static inline unsigned get_max_io_size(struct request_queue *q,
>  
>  static inline unsigned get_max_segment_size(const struct request_queue *q,
>  					    struct page *start_page,
> -					    unsigned long offset)
> +					    unsigned long long offset)
>  {
>  	unsigned long mask = queue_segment_boundary(q);
>  
>  	offset = mask & (page_to_phys(start_page) + offset);
> -	return min_t(unsigned long, mask - offset + 1,
> +	return min_t(unsigned long long, mask - offset + 1,
>  		     queue_max_segment_size(q));
>  }
>  
> 
> Thanks,
> Ming
> 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-07 22:32         ` Jens Axboe
@ 2020-01-08  1:59           ` Ming Lei
  2020-01-08  2:36             ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-08  1:59 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Guenter Roeck, linux-block, linux-kernel, Chris Mason

On Tue, Jan 07, 2020 at 03:32:58PM -0700, Jens Axboe wrote:
> On 1/7/20 3:30 PM, Ming Lei wrote:
> > On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
> >> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
> >>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
> >>>> Hi,
> >>>>
> >>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
> >>>>> There are two issues in get_max_segment_size():
> >>>>>
> >>>>> 1) the default segment boudary mask is bypassed, and some devices still
> >>>>> require segment to not cross the default 4G boundary
> >>>>>
> >>>>> 2) the segment start address isn't taken into account when checking
> >>>>> segment boundary limit
> >>>>>
> >>>>> Fixes the two issues.
> >>>>>
> >>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
> >>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> >>>>
> >>>> This patch, pushed into mainline as "block: fix splitting segments on
> >>>> boundary masks", results in the following crash when booting 'versatilepb'
> >>>> in qemu from disk. Bisect log is attached. Detailed log is at
> >>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
> >>>>
> >>>> Guenter
> >>>>
> >>>> ---
> >>>> Crash:
> >>>>
> >>>> kernel BUG at block/bio.c:1885!
> >>>> Internal error: Oops - BUG: 0 [#1] ARM
> >>>
> >>> Please apply the following debug patch, and post the log.
> >>>
> >>
> >> Here you are:
> >>
> >> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
> >> c738da80: 8c80/0 2416 28672, 0
> >>          total sectors 56
> >>
> >> (I replaced %p with %px).
> >>
> > 
> > Please try the following patch and see if it makes a difference.
> > If not, replace trace_printk with printk in previous debug patch,
> > and apply the debug patch only & post the log.
> 
> If it is a 32-bit issue, then we should use a 64-bit type to make
> this nicer than ULL. But it seems reasonable that it could be!

oops, just saw this email after sending out the patch.

Do you need V2 to change ULL to u64?

Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-08  1:59           ` Ming Lei
@ 2020-01-08  2:36             ` Jens Axboe
  0 siblings, 0 replies; 24+ messages in thread
From: Jens Axboe @ 2020-01-08  2:36 UTC (permalink / raw)
  To: Ming Lei; +Cc: Guenter Roeck, linux-block, linux-kernel, Chris Mason

On 1/7/20 6:59 PM, Ming Lei wrote:
> On Tue, Jan 07, 2020 at 03:32:58PM -0700, Jens Axboe wrote:
>> On 1/7/20 3:30 PM, Ming Lei wrote:
>>> On Tue, Jan 07, 2020 at 10:11:45AM -0800, Guenter Roeck wrote:
>>>> On Tue, Jan 07, 2020 at 11:23:39PM +0800, Ming Lei wrote:
>>>>> On Tue, Jan 07, 2020 at 04:47:08AM -0800, Guenter Roeck wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Sun, Dec 29, 2019 at 10:32:30AM +0800, Ming Lei wrote:
>>>>>>> There are two issues in get_max_segment_size():
>>>>>>>
>>>>>>> 1) the default segment boudary mask is bypassed, and some devices still
>>>>>>> require segment to not cross the default 4G boundary
>>>>>>>
>>>>>>> 2) the segment start address isn't taken into account when checking
>>>>>>> segment boundary limit
>>>>>>>
>>>>>>> Fixes the two issues.
>>>>>>>
>>>>>>> Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
>>>>>>> Signed-off-by: Ming Lei <ming.lei@redhat.com>
>>>>>>
>>>>>> This patch, pushed into mainline as "block: fix splitting segments on
>>>>>> boundary masks", results in the following crash when booting 'versatilepb'
>>>>>> in qemu from disk. Bisect log is attached. Detailed log is at
>>>>>> https://kerneltests.org/builders/qemu-arm-master/builds/1410/steps/qemubuildcommand/logs/stdio
>>>>>>
>>>>>> Guenter
>>>>>>
>>>>>> ---
>>>>>> Crash:
>>>>>>
>>>>>> kernel BUG at block/bio.c:1885!
>>>>>> Internal error: Oops - BUG: 0 [#1] ARM
>>>>>
>>>>> Please apply the following debug patch, and post the log.
>>>>>
>>>>
>>>> Here you are:
>>>>
>>>> max_sectors 2560 max_segs 96 max_seg_size 65536 mask ffffffff
>>>> c738da80: 8c80/0 2416 28672, 0
>>>>          total sectors 56
>>>>
>>>> (I replaced %p with %px).
>>>>
>>>
>>> Please try the following patch and see if it makes a difference.
>>> If not, replace trace_printk with printk in previous debug patch,
>>> and apply the debug patch only & post the log.
>>
>> If it is a 32-bit issue, then we should use a 64-bit type to make
>> this nicer than ULL. But it seems reasonable that it could be!
> 
> oops, just saw this email after sending out the patch.
> 
> Do you need V2 to change ULL to u64?

Nah, I can just edit it, that's fine.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2019-12-29  2:32 [PATCH] block: fix splitting segments Ming Lei
  2019-12-29 16:14 ` Jens Axboe
  2020-01-07 12:47 ` Guenter Roeck
@ 2020-01-08 14:02 ` Christoph Hellwig
  2020-01-09  2:03   ` Ming Lei
  2 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2020-01-08 14:02 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, linux-block

> +static inline unsigned get_max_segment_size(const struct request_queue *q,
> +					    const struct page *start_page,
> +					    unsigned long offset)
>  {
>  	unsigned long mask = queue_segment_boundary(q);
>  
> -	/* default segment boundary mask means no boundary limit */
> -	if (mask == BLK_SEG_BOUNDARY_MASK)
> -		return queue_max_segment_size(q);
> -
> -	return min_t(unsigned long, mask - (mask & offset) + 1,
> +	offset = mask & (page_to_phys(start_page) + offset);

This looks weird and potentionaly incorrect - once you add the offset
to page_phys it really isn't an offset anymore and should be in a
variable named paddr or similar.  And that needs to use a phys_addr_t
as we can have 32-bit architectures that use 64-bit physical addresses.

I'd also pass in the actual phys_addr_t instead of the page and offset.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-08 14:02 ` Christoph Hellwig
@ 2020-01-09  2:03   ` Ming Lei
  2020-01-09  7:16     ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-09  2:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Jens Axboe, linux-block

On Wed, Jan 08, 2020 at 06:02:48AM -0800, Christoph Hellwig wrote:
> > +static inline unsigned get_max_segment_size(const struct request_queue *q,
> > +					    const struct page *start_page,
> > +					    unsigned long offset)
> >  {
> >  	unsigned long mask = queue_segment_boundary(q);
> >  
> > -	/* default segment boundary mask means no boundary limit */
> > -	if (mask == BLK_SEG_BOUNDARY_MASK)
> > -		return queue_max_segment_size(q);
> > -
> > -	return min_t(unsigned long, mask - (mask & offset) + 1,
> > +	offset = mask & (page_to_phys(start_page) + offset);
> 
> This looks weird and potentionaly incorrect - once you add the offset
> to page_phys it really isn't an offset anymore and should be in a
> variable named paddr or similar.  And that needs to use a phys_addr_t

It can be thought as offset from viewpoint of segment boundary.

> as we can have 32-bit architectures that use 64-bit physical addresses.
> 
> I'd also pass in the actual phys_addr_t instead of the page and offset.
> 

It has been addressed in:

https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a

Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-09  2:03   ` Ming Lei
@ 2020-01-09  7:16     ` Christoph Hellwig
  2020-01-09 15:18       ` Jens Axboe
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2020-01-09  7:16 UTC (permalink / raw)
  To: Ming Lei; +Cc: Christoph Hellwig, Jens Axboe, linux-block

On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> It has been addressed in:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a

That is probably correct, but still highly suboptimal for most 32-bit
architectures where physical addresses are 32 bits wide.  To fix that
the proper phys_addr_t type should be used.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-09  7:16     ` Christoph Hellwig
@ 2020-01-09 15:18       ` Jens Axboe
  2020-01-10  2:58         ` Ming Lei
  0 siblings, 1 reply; 24+ messages in thread
From: Jens Axboe @ 2020-01-09 15:18 UTC (permalink / raw)
  To: Christoph Hellwig, Ming Lei; +Cc: linux-block

On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
>> It has been addressed in:
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> 
> That is probably correct, but still highly suboptimal for most 32-bit
> architectures where physical addresses are 32 bits wide.  To fix that
> the proper phys_addr_t type should be used.

I'll swap it for phys_addr_t - we used to use dma_address_t or something
like that, but I missed this type.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-09 15:18       ` Jens Axboe
@ 2020-01-10  2:58         ` Ming Lei
  2020-01-10  3:00           ` Ming Lei
  0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-10  2:58 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block

On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
> On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> > On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> >> It has been addressed in:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> > 
> > That is probably correct, but still highly suboptimal for most 32-bit
> > architectures where physical addresses are 32 bits wide.  To fix that
> > the proper phys_addr_t type should be used.
> 
> I'll swap it for phys_addr_t - we used to use dma_address_t or something
> like that, but I missed this type.

Guenter mentioned that 'page_to_phys(start_page) as well as offset are
sometimes 0'[1].

If that(zero page physical address) can happen when phys_addr_t is 32bit,
I guess phys_addr_t may not work too.

Guener, could you test the patch in link[2] again?


[1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
[2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4


Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  2:58         ` Ming Lei
@ 2020-01-10  3:00           ` Ming Lei
  2020-01-10  5:10             ` Guenter Roeck
  0 siblings, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-10  3:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Christoph Hellwig, linux-block, Guenter Roeck

On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
> On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
> > On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> > > On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> > >> It has been addressed in:
> > >>
> > >> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> > > 
> > > That is probably correct, but still highly suboptimal for most 32-bit
> > > architectures where physical addresses are 32 bits wide.  To fix that
> > > the proper phys_addr_t type should be used.
> > 
> > I'll swap it for phys_addr_t - we used to use dma_address_t or something
> > like that, but I missed this type.
> 
> Guenter mentioned that 'page_to_phys(start_page) as well as offset are
> sometimes 0'[1].
> 
> If that(zero page physical address) can happen when phys_addr_t is 32bit,
> I guess phys_addr_t may not work too.
> 
> Guener, could you test the patch in link[2] again?
> 
> 
> [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4

Loop Guener in.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  3:00           ` Ming Lei
@ 2020-01-10  5:10             ` Guenter Roeck
  2020-01-10  6:37               ` Ming Lei
  0 siblings, 1 reply; 24+ messages in thread
From: Guenter Roeck @ 2020-01-10  5:10 UTC (permalink / raw)
  To: Ming Lei, Jens Axboe; +Cc: Christoph Hellwig, linux-block

On 1/9/20 7:00 PM, Ming Lei wrote:
> On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
>> On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
>>> On 1/9/20 12:16 AM, Christoph Hellwig wrote:
>>>> On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
>>>>> It has been addressed in:
>>>>>
>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
>>>>
>>>> That is probably correct, but still highly suboptimal for most 32-bit
>>>> architectures where physical addresses are 32 bits wide.  To fix that
>>>> the proper phys_addr_t type should be used.
>>>
>>> I'll swap it for phys_addr_t - we used to use dma_address_t or something
>>> like that, but I missed this type.
>>
>> Guenter mentioned that 'page_to_phys(start_page) as well as offset are
>> sometimes 0'[1].
>>
>> If that(zero page physical address) can happen when phys_addr_t is 32bit,
>> I guess phys_addr_t may not work too.
>>
>> Guener, could you test the patch in link[2] again?
>>
>>
>> [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
> 
> Loop Guener in.
> 

The patch at [2] doesn't work.

My understanding is that the page in question is not mapped when
get_max_segment_size() is called (after all, the operation is the
result of a page fault). This is why page_to_phys() returns 0.

You'll either need a local u64 variable, or use some other means
to handle that situation. Something like

     phys_addr_t paddr = page_to_phys(start_page);

     if (paddr == 0)
	return queue_max_segment_size(q);

at the beginning of the function might do, though there might
still be a problem when the page is later assigned and crosses
a segment boundary (if that is possible).

Guenter

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  5:10             ` Guenter Roeck
@ 2020-01-10  6:37               ` Ming Lei
  2020-01-10  7:15                 ` Ming Lei
  2020-01-10 12:36                 ` Guenter Roeck
  0 siblings, 2 replies; 24+ messages in thread
From: Ming Lei @ 2020-01-10  6:37 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On Thu, Jan 09, 2020 at 09:10:24PM -0800, Guenter Roeck wrote:
> On 1/9/20 7:00 PM, Ming Lei wrote:
> > On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
> > > On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
> > > > On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> > > > > On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> > > > > > It has been addressed in:
> > > > > > 
> > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> > > > > 
> > > > > That is probably correct, but still highly suboptimal for most 32-bit
> > > > > architectures where physical addresses are 32 bits wide.  To fix that
> > > > > the proper phys_addr_t type should be used.
> > > > 
> > > > I'll swap it for phys_addr_t - we used to use dma_address_t or something
> > > > like that, but I missed this type.
> > > 
> > > Guenter mentioned that 'page_to_phys(start_page) as well as offset are
> > > sometimes 0'[1].
> > > 
> > > If that(zero page physical address) can happen when phys_addr_t is 32bit,
> > > I guess phys_addr_t may not work too.
> > > 
> > > Guener, could you test the patch in link[2] again?
> > > 
> > > 
> > > [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
> > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
> > 
> > Loop Guener in.
> > 
> 
> The patch at [2] doesn't work.
> 
> My understanding is that the page in question is not mapped when
> get_max_segment_size() is called (after all, the operation is the
> result of a page fault). This is why page_to_phys() returns 0.

page_to_phys() supposes to return page's physical address, which
should just depend on this machine's physical address space,
not related with page mapping.

I understand physical address 0 might be valid, such as the 1st
page frame of ram.

> 
> You'll either need a local u64 variable, or use some other means
> to handle that situation. Something like
> 
>     phys_addr_t paddr = page_to_phys(start_page);
> 
>     if (paddr == 0)
> 	return queue_max_segment_size(q);
> 
> at the beginning of the function might do, though there might
> still be a problem when the page is later assigned and crosses
> a segment boundary (if that is possible).

IMO, zero physical address case is the only corner case not
covered by using 'phys_addr_t'. If phys_addr_t is 32bit, sum of
page_to_phys(start_page) and offset shouldn't be >= 4G.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  6:37               ` Ming Lei
@ 2020-01-10  7:15                 ` Ming Lei
  2020-01-10 12:48                   ` Guenter Roeck
  2020-01-10 12:36                 ` Guenter Roeck
  1 sibling, 1 reply; 24+ messages in thread
From: Ming Lei @ 2020-01-10  7:15 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On Fri, Jan 10, 2020 at 02:37:44PM +0800, Ming Lei wrote:
> On Thu, Jan 09, 2020 at 09:10:24PM -0800, Guenter Roeck wrote:
> > On 1/9/20 7:00 PM, Ming Lei wrote:
> > > On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
> > > > On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
> > > > > On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> > > > > > On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> > > > > > > It has been addressed in:
> > > > > > > 
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> > > > > > 
> > > > > > That is probably correct, but still highly suboptimal for most 32-bit
> > > > > > architectures where physical addresses are 32 bits wide.  To fix that
> > > > > > the proper phys_addr_t type should be used.
> > > > > 
> > > > > I'll swap it for phys_addr_t - we used to use dma_address_t or something
> > > > > like that, but I missed this type.
> > > > 
> > > > Guenter mentioned that 'page_to_phys(start_page) as well as offset are
> > > > sometimes 0'[1].
> > > > 
> > > > If that(zero page physical address) can happen when phys_addr_t is 32bit,
> > > > I guess phys_addr_t may not work too.
> > > > 
> > > > Guener, could you test the patch in link[2] again?
> > > > 
> > > > 
> > > > [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
> > > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
> > > 
> > > Loop Guener in.
> > > 
> > 
> > The patch at [2] doesn't work.
> > 
> > My understanding is that the page in question is not mapped when
> > get_max_segment_size() is called (after all, the operation is the
> > result of a page fault). This is why page_to_phys() returns 0.
> 
> page_to_phys() supposes to return page's physical address, which
> should just depend on this machine's physical address space,
> not related with page mapping.
> 
> I understand physical address 0 might be valid, such as the 1st
> page frame of ram.
> 
> > 
> > You'll either need a local u64 variable, or use some other means
> > to handle that situation. Something like
> > 
> >     phys_addr_t paddr = page_to_phys(start_page);
> > 
> >     if (paddr == 0)
> > 	return queue_max_segment_size(q);
> > 
> > at the beginning of the function might do, though there might
> > still be a problem when the page is later assigned and crosses
> > a segment boundary (if that is possible).
> 
> IMO, zero physical address case is the only corner case not
> covered by using 'phys_addr_t'. If phys_addr_t is 32bit, sum of
> page_to_phys(start_page) and offset shouldn't be >= 4G.

Or could you test the following patch?

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 347782a24a35..d1bed8c382bf 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -164,8 +164,13 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
 	unsigned long mask = queue_segment_boundary(q);
 
 	offset = mask & (page_to_phys(start_page) + offset);
-	return min_t(unsigned long, mask - offset + 1,
-		     queue_max_segment_size(q));
+
+	/*
+	 * overflow may be triggered in case of zero page physical address
+	 * on 32bit arch, use queue's max segment size when that happens
+	 */
+	return min_not_zero(mask - offset + 1,
+			(unsigned long)queue_max_segment_size(q));
 }
 
 /**


Thanks,
Ming


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  6:37               ` Ming Lei
  2020-01-10  7:15                 ` Ming Lei
@ 2020-01-10 12:36                 ` Guenter Roeck
  2020-01-11 12:57                   ` Ming Lei
  1 sibling, 1 reply; 24+ messages in thread
From: Guenter Roeck @ 2020-01-10 12:36 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On 1/9/20 10:37 PM, Ming Lei wrote:
> On Thu, Jan 09, 2020 at 09:10:24PM -0800, Guenter Roeck wrote:
>> On 1/9/20 7:00 PM, Ming Lei wrote:
>>> On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
>>>> On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
>>>>> On 1/9/20 12:16 AM, Christoph Hellwig wrote:
>>>>>> On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
>>>>>>> It has been addressed in:
>>>>>>>
>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
>>>>>>
>>>>>> That is probably correct, but still highly suboptimal for most 32-bit
>>>>>> architectures where physical addresses are 32 bits wide.  To fix that
>>>>>> the proper phys_addr_t type should be used.
>>>>>
>>>>> I'll swap it for phys_addr_t - we used to use dma_address_t or something
>>>>> like that, but I missed this type.
>>>>
>>>> Guenter mentioned that 'page_to_phys(start_page) as well as offset are
>>>> sometimes 0'[1].
>>>>
>>>> If that(zero page physical address) can happen when phys_addr_t is 32bit,
>>>> I guess phys_addr_t may not work too.
>>>>
>>>> Guener, could you test the patch in link[2] again?
>>>>
>>>>
>>>> [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
>>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
>>>
>>> Loop Guener in.
>>>
>>
>> The patch at [2] doesn't work.
>>
>> My understanding is that the page in question is not mapped when
>> get_max_segment_size() is called (after all, the operation is the
>> result of a page fault). This is why page_to_phys() returns 0.
> 
> page_to_phys() supposes to return page's physical address, which
> should just depend on this machine's physical address space,
> not related with page mapping.
> 
> I understand physical address 0 might be valid, such as the 1st
> page frame of ram.
> 

Not sure if that happens here, but makes sense.

>>
>> You'll either need a local u64 variable, or use some other means
>> to handle that situation. Something like
>>
>>      phys_addr_t paddr = page_to_phys(start_page);
>>
>>      if (paddr == 0)
>> 	return queue_max_segment_size(q);
>>
>> at the beginning of the function might do, though there might
>> still be a problem when the page is later assigned and crosses
>> a segment boundary (if that is possible).
> 
> IMO, zero physical address case is the only corner case not
> covered by using 'phys_addr_t'. If phys_addr_t is 32bit, sum of
> page_to_phys(start_page) and offset shouldn't be >= 4G.
> 

Yes, but that isn't what is calculated. What is calculated is
         mask - offset + 1
where
         offset = mask & (page_to_phys(start_page) + offset);

with mask == 0xffffffff, offset == 0, we get:
         mask - offset + 1 = 0xffffffff - 0 + 1 = 0x100000000, which is > 4G.

Guenter

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10  7:15                 ` Ming Lei
@ 2020-01-10 12:48                   ` Guenter Roeck
  0 siblings, 0 replies; 24+ messages in thread
From: Guenter Roeck @ 2020-01-10 12:48 UTC (permalink / raw)
  To: Ming Lei; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On 1/9/20 11:15 PM, Ming Lei wrote:
> On Fri, Jan 10, 2020 at 02:37:44PM +0800, Ming Lei wrote:
>> On Thu, Jan 09, 2020 at 09:10:24PM -0800, Guenter Roeck wrote:
>>> On 1/9/20 7:00 PM, Ming Lei wrote:
>>>> On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
>>>>> On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
>>>>>> On 1/9/20 12:16 AM, Christoph Hellwig wrote:
>>>>>>> On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
>>>>>>>> It has been addressed in:
>>>>>>>>
>>>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
>>>>>>>
>>>>>>> That is probably correct, but still highly suboptimal for most 32-bit
>>>>>>> architectures where physical addresses are 32 bits wide.  To fix that
>>>>>>> the proper phys_addr_t type should be used.
>>>>>>
>>>>>> I'll swap it for phys_addr_t - we used to use dma_address_t or something
>>>>>> like that, but I missed this type.
>>>>>
>>>>> Guenter mentioned that 'page_to_phys(start_page) as well as offset are
>>>>> sometimes 0'[1].
>>>>>
>>>>> If that(zero page physical address) can happen when phys_addr_t is 32bit,
>>>>> I guess phys_addr_t may not work too.
>>>>>
>>>>> Guener, could you test the patch in link[2] again?
>>>>>
>>>>>
>>>>> [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
>>>>> [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
>>>>
>>>> Loop Guener in.
>>>>
>>>
>>> The patch at [2] doesn't work.
>>>
>>> My understanding is that the page in question is not mapped when
>>> get_max_segment_size() is called (after all, the operation is the
>>> result of a page fault). This is why page_to_phys() returns 0.
>>
>> page_to_phys() supposes to return page's physical address, which
>> should just depend on this machine's physical address space,
>> not related with page mapping.
>>
>> I understand physical address 0 might be valid, such as the 1st
>> page frame of ram.
>>
>>>
>>> You'll either need a local u64 variable, or use some other means
>>> to handle that situation. Something like
>>>
>>>      phys_addr_t paddr = page_to_phys(start_page);
>>>
>>>      if (paddr == 0)
>>> 	return queue_max_segment_size(q);
>>>
>>> at the beginning of the function might do, though there might
>>> still be a problem when the page is later assigned and crosses
>>> a segment boundary (if that is possible).
>>
>> IMO, zero physical address case is the only corner case not
>> covered by using 'phys_addr_t'. If phys_addr_t is 32bit, sum of
>> page_to_phys(start_page) and offset shouldn't be >= 4G.
> 
> Or could you test the following patch?
> 

That patch applied to ToT (without the other patch) fixes the problem for me.

Tested-by: Guenter Roeck <linux@roeck-us.net>

Guenter

> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 347782a24a35..d1bed8c382bf 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -164,8 +164,13 @@ static inline unsigned get_max_segment_size(const struct request_queue *q,
>   	unsigned long mask = queue_segment_boundary(q);
>   
>   	offset = mask & (page_to_phys(start_page) + offset);
> -	return min_t(unsigned long, mask - offset + 1,
> -		     queue_max_segment_size(q));
> +
> +	/*
> +	 * overflow may be triggered in case of zero page physical address
> +	 * on 32bit arch, use queue's max segment size when that happens
> +	 */
> +	return min_not_zero(mask - offset + 1,
> +			(unsigned long)queue_max_segment_size(q));
>   }
>   
>   /**
> 
> 
> Thanks,
> Ming
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH] block: fix splitting segments
  2020-01-10 12:36                 ` Guenter Roeck
@ 2020-01-11 12:57                   ` Ming Lei
  0 siblings, 0 replies; 24+ messages in thread
From: Ming Lei @ 2020-01-11 12:57 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Jens Axboe, Christoph Hellwig, linux-block

On Fri, Jan 10, 2020 at 04:36:05AM -0800, Guenter Roeck wrote:
> On 1/9/20 10:37 PM, Ming Lei wrote:
> > On Thu, Jan 09, 2020 at 09:10:24PM -0800, Guenter Roeck wrote:
> > > On 1/9/20 7:00 PM, Ming Lei wrote:
> > > > On Fri, Jan 10, 2020 at 10:58:01AM +0800, Ming Lei wrote:
> > > > > On Thu, Jan 09, 2020 at 08:18:04AM -0700, Jens Axboe wrote:
> > > > > > On 1/9/20 12:16 AM, Christoph Hellwig wrote:
> > > > > > > On Thu, Jan 09, 2020 at 10:03:41AM +0800, Ming Lei wrote:
> > > > > > > > It has been addressed in:
> > > > > > > > 
> > > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=ecd255974caa45901d0b8fab03626e0a18fbc81a
> > > > > > > 
> > > > > > > That is probably correct, but still highly suboptimal for most 32-bit
> > > > > > > architectures where physical addresses are 32 bits wide.  To fix that
> > > > > > > the proper phys_addr_t type should be used.
> > > > > > 
> > > > > > I'll swap it for phys_addr_t - we used to use dma_address_t or something
> > > > > > like that, but I missed this type.
> > > > > 
> > > > > Guenter mentioned that 'page_to_phys(start_page) as well as offset are
> > > > > sometimes 0'[1].
> > > > > 
> > > > > If that(zero page physical address) can happen when phys_addr_t is 32bit,
> > > > > I guess phys_addr_t may not work too.
> > > > > 
> > > > > Guener, could you test the patch in link[2] again?
> > > > > 
> > > > > 
> > > > > [1] https://lore.kernel.org/linux-block/20200108023822.GB28075@ming.t460p/T/#m5862216b960454fc41a85204defbb887983bfd75
> > > > > [2] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=block-5.5&id=b6a89c4a9590663f80486662fe9a9c1f4cee31f4
> > > > 
> > > > Loop Guener in.
> > > > 
> > > 
> > > The patch at [2] doesn't work.
> > > 
> > > My understanding is that the page in question is not mapped when
> > > get_max_segment_size() is called (after all, the operation is the
> > > result of a page fault). This is why page_to_phys() returns 0.
> > 
> > page_to_phys() supposes to return page's physical address, which
> > should just depend on this machine's physical address space,
> > not related with page mapping.
> > 
> > I understand physical address 0 might be valid, such as the 1st
> > page frame of ram.
> > 
> 
> Not sure if that happens here, but makes sense.
> 
> > > 
> > > You'll either need a local u64 variable, or use some other means
> > > to handle that situation. Something like
> > > 
> > >      phys_addr_t paddr = page_to_phys(start_page);
> > > 
> > >      if (paddr == 0)
> > > 	return queue_max_segment_size(q);
> > > 
> > > at the beginning of the function might do, though there might
> > > still be a problem when the page is later assigned and crosses
> > > a segment boundary (if that is possible).
> > 
> > IMO, zero physical address case is the only corner case not
> > covered by using 'phys_addr_t'. If phys_addr_t is 32bit, sum of
> > page_to_phys(start_page) and offset shouldn't be >= 4G.
> > 
> 
> Yes, but that isn't what is calculated. What is calculated is
>         mask - offset + 1
> where
>         offset = mask & (page_to_phys(start_page) + offset);
> 
> with mask == 0xffffffff, offset == 0, we get:
>         mask - offset + 1 = 0xffffffff - 0 + 1 = 0x100000000, which is > 4G.

Yes, queue_max_segment_size() returns 'unsigned int', which is always < 4G,
so min_not_zero() is correct to fix the issue.


Thanks,
Ming


^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-01-11 12:57 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-29  2:32 [PATCH] block: fix splitting segments Ming Lei
2019-12-29 16:14 ` Jens Axboe
2019-12-30  1:15   ` Bart Van Assche
2019-12-30  1:25     ` Ming Lei
2020-01-07 12:47 ` Guenter Roeck
2020-01-07 15:23   ` Ming Lei
2020-01-07 18:11     ` Guenter Roeck
2020-01-07 22:30       ` Ming Lei
2020-01-07 22:32         ` Jens Axboe
2020-01-08  1:59           ` Ming Lei
2020-01-08  2:36             ` Jens Axboe
2020-01-07 23:06         ` Guenter Roeck
2020-01-08 14:02 ` Christoph Hellwig
2020-01-09  2:03   ` Ming Lei
2020-01-09  7:16     ` Christoph Hellwig
2020-01-09 15:18       ` Jens Axboe
2020-01-10  2:58         ` Ming Lei
2020-01-10  3:00           ` Ming Lei
2020-01-10  5:10             ` Guenter Roeck
2020-01-10  6:37               ` Ming Lei
2020-01-10  7:15                 ` Ming Lei
2020-01-10 12:48                   ` Guenter Roeck
2020-01-10 12:36                 ` Guenter Roeck
2020-01-11 12:57                   ` Ming Lei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).