All of lore.kernel.org
 help / color / mirror / Atom feed
* bug in bcachefs -> bio_copy_data_iter
@ 2021-11-09 22:04 Kent Overstreet
  2021-11-10 12:02 ` Coly Li
  0 siblings, 1 reply; 5+ messages in thread
From: Kent Overstreet @ 2021-11-09 22:04 UTC (permalink / raw)
  To: hch; +Cc: ming.lei, colyli, linux-bcachefs

Hey Christoph, got a strange one.

I've got a user that's reporting a bug where we deref a bad ptr in bio_copy_data
-> memcpy, and reverting your patch "block: rewrite bio_copy_data_iter to use
bvec_kmap_local and memcpy_to_bvec" seems to make it go away.

I haven't figured out what's different yet between the two versions (your patch
looks like it should be functionally equivalent), but clearly I'm missing
something... wonder if there might be some relation to the bug you guys hit in
bcache with bvec_virt.

Any ideas?

[  395.978225] BUG: unable to handle page fault for address: ffff9b0b8e600000
[  395.979503] #PF: supervisor read access in kernel mode
[  395.980720] #PF: error_code(0x0000) - not-present page
[  395.981953] PGD 182c01067 P4D 182c01067 PUD 182c05067 PMD 26df23067 PTE 800ffffdb19ff060
[  395.983227] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[  395.984422] CPU: 3 PID: 10 Comm: kworker/u8:1 Not tainted 5.15.1-00987-g97f33a3143f5 #5
[  395.985700] Hardware name: MSI MS-7982/B150M PRO-VDH (MS-7982), BIOS 3.H0 07/10/2018
[  395.986939] Workqueue: writeback wb_workfn (flush-bcachefs-2)
[  395.988196] RIP: 0010:memcpy_erms (arch/x86/lib/memcpy_64.S:55)
[ 395.989495] Code: cc cc cc cc eb 1e 0f 1f 00 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07 f3 48 a5 89 d1 f3 a4 c3 66 0f 1f 44 00 00 48 89 f8 48 89 d1 <f3> a4 c3 0f 1f 80 00 00 00 00 48 89 f8 48 83 fa 20 72 7e 40 38 fe
All code
========
   0:	cc                   	int3
   1:	cc                   	int3
   2:	cc                   	int3
   3:	cc                   	int3
   4:	eb 1e                	jmp    0x24
   6:	0f 1f 00             	nopl   (%rax)
   9:	48 89 f8             	mov    %rdi,%rax
   c:	48 89 d1             	mov    %rdx,%rcx
   f:	48 c1 e9 03          	shr    $0x3,%rcx
  13:	83 e2 07             	and    $0x7,%edx
  16:	f3 48 a5             	rep movsq %ds:(%rsi),%es:(%rdi)
  19:	89 d1                	mov    %edx,%ecx
  1b:	f3 a4                	rep movsb %ds:(%rsi),%es:(%rdi)
  1d:	c3                   	ret
  1e:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)
  24:	48 89 f8             	mov    %rdi,%rax
  27:	48 89 d1             	mov    %rdx,%rcx
  2a:*	f3 a4                	rep movsb %ds:(%rsi),%es:(%rdi)		<-- trapping instruction
  2c:	c3                   	ret
  2d:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
  34:	48 89 f8             	mov    %rdi,%rax
  37:	48 83 fa 20          	cmp    $0x20,%rdx
  3b:	72 7e                	jb     0xbb
  3d:	40 38 fe             	cmp    %dil,%sil

Code starting with the faulting instruction
===========================================
   0:	f3 a4                	rep movsb %ds:(%rsi),%es:(%rdi)
   2:	c3                   	ret
   3:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
   a:	48 89 f8             	mov    %rdi,%rax
   d:	48 83 fa 20          	cmp    $0x20,%rdx
  11:	72 7e                	jb     0x91
  13:	40 38 fe             	cmp    %dil,%sil
[  395.990872] RSP: 0018:ffffad9b800ab760 EFLAGS: 00010286
[  395.992275] RAX: ffff9b0b1b21c000 RBX: 0000000000000200 RCX: 0000000000000e00
[  395.993705] RDX: 0000000000001000 RSI: ffff9b0b8e600000 RDI: ffff9b0b1b21c200
[  395.995042] RBP: ffff9b0a4306c300 R08: 0000000000000e00 R09: 0000000000001000
[  395.996436] R10: 0000000000000200 R11: 0000000000000200 R12: ffff9b0b44b54030
[  395.997880] R13: ffffad9b800ab7a0 R14: ffffad9b800ab7b8 R15: 0000000000001000
[  395.999307] FS:  0000000000000000(0000) GS:ffff9b0ba4f80000(0000) knlGS:0000000000000000
[  396.000697] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  396.002065] CR2: ffff9b0b8e600000 CR3: 0000000182210004 CR4: 00000000003706e0
[  396.003544] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  396.004997] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  396.006414] Call Trace:
[  396.007814] bio_copy_data_iter (./include/linux/bio.h:158 block/bio.c:1297)
[  396.009208] bio_copy_data (block/bio.c:1317)
[  396.010536] __bch2_write (fs/bcachefs/io.c:969 fs/bcachefs/io.c:1150)
[  396.011920] ? bch2_writepage_do_io (./include/linux/closure.h:229 fs/bcachefs/fs-io.c:1103)
[  396.013293] bch2_writepage_do_io (./include/linux/closure.h:229 fs/bcachefs/fs-io.c:1103)
[  396.014801] __bch2_writepage (fs/bcachefs/fs-io.c:1246)
[  396.016232] ? __mod_memcg_lruvec_state (mm/memcontrol.c:681)
[  396.017649] write_cache_pages (mm/page-writeback.c:2255)
[  396.019106] ? bch2_page_reservation_get.constprop.0 (fs/bcachefs/fs-io.c:1141)
[  396.020546] bch2_writepages (fs/bcachefs/fs-io.c:1283)
[  396.022013] ? update_load_avg (kernel/sched/fair.c:3619 kernel/sched/fair.c:3856)
[  396.023473] do_writepages (mm/page-writeback.c:2364)
[  396.024943] ? enqueue_task_fair (kernel/sched/fair.c:5626)
[  396.026408] ? psi_task_change (kernel/sched/psi.c:755 kernel/sched/psi.c:817)
[  396.027864] __writeback_single_inode (fs/fs-writeback.c:1616)
[  396.029287] writeback_sb_inodes (fs/fs-writeback.c:1883)
[  396.030724] __writeback_inodes_wb (fs/fs-writeback.c:1951)
[  396.032194] wb_writeback (fs/fs-writeback.c:2055)
[  396.033679] wb_workfn (fs/fs-writeback.c:2209 fs/fs-writeback.c:2237)
[  396.035115] ? __schedule (kernel/sched/core.c:6295)
[  396.036607] process_one_work (kernel/workqueue.c:2297)
[  396.038075] worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2445)
[  396.039552] ? rescuer_thread (kernel/workqueue.c:2387)
[  396.041035] kthread (kernel/kthread.c:319)
[  396.042535] ? set_kthread_struct (kernel/kthread.c:272)
[  396.043967] ret_from_fork (arch/x86/entry/entry_64.S:295)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: bug in bcachefs -> bio_copy_data_iter
  2021-11-09 22:04 bug in bcachefs -> bio_copy_data_iter Kent Overstreet
@ 2021-11-10 12:02 ` Coly Li
  2022-04-23 17:31   ` Christoph Hellwig
  0 siblings, 1 reply; 5+ messages in thread
From: Coly Li @ 2021-11-10 12:02 UTC (permalink / raw)
  To: hch, Kent Overstreet; +Cc: ming.lei, linux-bcachefs

On 11/10/21 6:04 AM, Kent Overstreet wrote:
> Hey Christoph, got a strange one.
>
> I've got a user that's reporting a bug where we deref a bad ptr in bio_copy_data
> -> memcpy, and reverting your patch "block: rewrite bio_copy_data_iter to use
> bvec_kmap_local and memcpy_to_bvec" seems to make it go away.
>
> I haven't figured out what's different yet between the two versions (your patch
> looks like it should be functionally equivalent), but clearly I'm missing
> something... wonder if there might be some relation to the bug you guys hit in
> bcache with bvec_virt.
>
> Any ideas?

I experience similar one in bcache code during my recent development,

[ 3134.522913] BUG: unable to handle page fault for address: 
ffffa100e2eaa3f8^M
[ 3134.605187] #PF: supervisor read access in kernel mode^M
[ 3134.666654] #PF: error_code(0x0000) - not-present page^M
[ 3134.728124] PGD 523c805067 P4D 523c805067 PUD 5b35ea1067 PMD 
5b35d89067 PTE 800ffffb5d155060^M
[ 3134.829113] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI^M
[ 3134.889545] CPU: 3 PID: 458 Comm: kworker/3:2 Kdump: loaded Tainted: 
G            EL    5.15.0-59.27-default+ #15^M
[ 3135.012373] Hardware name: Lenovo ThinkSystem SR650 
-[7X05CTO1WW]-/-[7X05CTO1WW]-, BIOS -[IVE164L-2.80]- 10/23/2020^M
[ 3135.137281] Workqueue: bcache cached_dev_read_done [bcache]^M
[ 3135.203970] RIP: 0010:bio_copy_data_iter+0x1a8/0x260^M
[ 3135.263365] Code: 8d 04 3e 48 3d 00 10 00 00 0f 87 a3 00 00 00 4c 01 
f9 41 83 f8 08 0f 82 89 fe ff ff 48 8b 06 48 8d 79 08 48 83 e7 f8 48 89 
01 <4a> 8b 44 36 f8 4a 89 44 31 f8 48 29 f9 48 29 ce 44 01 c1 c1 e9 03^M
[ 3135.488114] RSP: 0018:ffffba1607fcbdb8 EFLAGS: 00010282^M
[ 3135.550624] RAX: 0000000000000000 RBX: ffffa0ff6912f968 RCX: 
ffffa0fe22626000^M
[ 3135.636012] RDX: ffffa0ffa8cc4160 RSI: ffffa100e2ea9400 RDI: 
ffffa0fe22626008^M
[ 3135.721402] RBP: 0000000000001000 R08: 0000000000001000 R09: 
0000000000000c00^M
[ 3135.806793] R10: ffffba1607fcbdf8 R11: ffffba1607fcbe10 R12: 
ffffa0fc40000000^M
[ 3135.892180] R13: ffffe04500000000 R14: 0000000000001000 R15: 
0000000000000000^M
[ 3135.977570] FS:  0000000000000000(0000) GS:ffffa1084b000000(0000) 
knlGS:0000000000000000^M
[ 3136.074399] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033^M
[ 3136.143147] CR2: ffffa100e2eaa3f8 CR3: 000000036854c003 CR4: 
00000000007706e0^M
[ 3136.228538] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 
0000000000000000^M
[ 3136.313926] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 
0000000000000400^M
[ 3136.399317] PKRU: 55555554^M
[ 3136.431665] Call Trace:^M
[ 3136.460896]  bio_copy_data+0x5e/0x80^M
[ 3136.503647]  cached_dev_read_done+0xa8/0x210 [bcache]^M
[ 3136.564082]  process_one_work+0x2e3/0x640^M
[ 3136.612034]  worker_thread+0x39/0x400^M
[ 3136.655824]  ? process_one_work+0x640/0x640^M
[ 3136.705854]  kthread+0x13c/0x160^M
[ 3136.744442]  ? set_kthread_struct+0x40/0x40^M
[ 3136.794471]  ret_from_fork+0x22/0x30^M



Coly Li

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: bug in bcachefs -> bio_copy_data_iter
  2021-11-10 12:02 ` Coly Li
@ 2022-04-23 17:31   ` Christoph Hellwig
  2022-05-07 18:28     ` Kent Overstreet
  0 siblings, 1 reply; 5+ messages in thread
From: Christoph Hellwig @ 2022-04-23 17:31 UTC (permalink / raw)
  To: Coly Li; +Cc: hch, Kent Overstreet, ming.lei, linux-bcachefs

On Wed, Nov 10, 2021 at 08:02:06PM +0800, Coly Li wrote:
>> looks like it should be functionally equivalent), but clearly I'm missing
>> something... wonder if there might be some relation to the bug you guys hit in
>> bcache with bvec_virt.
>>
>> Any ideas?
>
> I experience similar one in bcache code during my recent development,

Can you try this patch?


diff --git a/block/bio.c b/block/bio.c
index 4259125e16ab2..ac29c87c6735c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1336,10 +1336,12 @@ void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
 		struct bio_vec src_bv = bio_iter_iovec(src, *src_iter);
 		struct bio_vec dst_bv = bio_iter_iovec(dst, *dst_iter);
 		unsigned int bytes = min(src_bv.bv_len, dst_bv.bv_len);
-		void *src_buf;
+		void *src_buf = bvec_kmap_local(&src_bv);
+		void *dst_buf = bvec_kmap_local(&dst_bv);
 
-		src_buf = bvec_kmap_local(&src_bv);
-		memcpy_to_bvec(&dst_bv, src_buf);
+		memcpy(dst_buf, src_buf, bytes);
+
+		kunmap_local(dst_buf);
 		kunmap_local(src_buf);
 
 		bio_advance_iter_single(src, src_iter, bytes);

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: bug in bcachefs -> bio_copy_data_iter
  2022-04-23 17:31   ` Christoph Hellwig
@ 2022-05-07 18:28     ` Kent Overstreet
  2022-05-08  7:41       ` Coly Li
  0 siblings, 1 reply; 5+ messages in thread
From: Kent Overstreet @ 2022-05-07 18:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Coly Li, ming.lei, linux-bcachefs

On Sat, Apr 23, 2022 at 07:31:23PM +0200, Christoph Hellwig wrote:
> On Wed, Nov 10, 2021 at 08:02:06PM +0800, Coly Li wrote:
> >> looks like it should be functionally equivalent), but clearly I'm missing
> >> something... wonder if there might be some relation to the bug you guys hit in
> >> bcache with bvec_virt.
> >>
> >> Any ideas?
> >
> > I experience similar one in bcache code during my recent development,
> 
> Can you try this patch?

Coly, did you look at this?

Christoph, the fix looks correct to me - let's get it in if you haven't sent it
to Jens yet.

> diff --git a/block/bio.c b/block/bio.c
> index 4259125e16ab2..ac29c87c6735c 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1336,10 +1336,12 @@ void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
>  		struct bio_vec src_bv = bio_iter_iovec(src, *src_iter);
>  		struct bio_vec dst_bv = bio_iter_iovec(dst, *dst_iter);
>  		unsigned int bytes = min(src_bv.bv_len, dst_bv.bv_len);
> -		void *src_buf;
> +		void *src_buf = bvec_kmap_local(&src_bv);
> +		void *dst_buf = bvec_kmap_local(&dst_bv);
>  
> -		src_buf = bvec_kmap_local(&src_bv);
> -		memcpy_to_bvec(&dst_bv, src_buf);
> +		memcpy(dst_buf, src_buf, bytes);
> +
> +		kunmap_local(dst_buf);
>  		kunmap_local(src_buf);
>  
>  		bio_advance_iter_single(src, src_iter, bytes);

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: bug in bcachefs -> bio_copy_data_iter
  2022-05-07 18:28     ` Kent Overstreet
@ 2022-05-08  7:41       ` Coly Li
  0 siblings, 0 replies; 5+ messages in thread
From: Coly Li @ 2022-05-08  7:41 UTC (permalink / raw)
  To: Kent Overstreet, Christoph Hellwig; +Cc: ming.lei, linux-bcachefs

On 5/8/22 2:28 AM, Kent Overstreet wrote:
> On Sat, Apr 23, 2022 at 07:31:23PM +0200, Christoph Hellwig wrote:
>> On Wed, Nov 10, 2021 at 08:02:06PM +0800, Coly Li wrote:
>>>> looks like it should be functionally equivalent), but clearly I'm missing
>>>> something... wonder if there might be some relation to the bug you guys hit in
>>>> bcache with bvec_virt.
>>>>
>>>> Any ideas?
>>> I experience similar one in bcache code during my recent development,
>> Can you try this patch?
> Coly, did you look at this?
>
> Christoph, the fix looks correct to me - let's get it in if you haven't sent it
> to Jens yet.
>

Hi Kent and Christoph,


I tried it and trying it still. Without this change, I don't observe 
immediate panic in 5.18-rcX, so applying it I don't see obvious 
different, just testing it with my current development patches.

It's fine to have it in mainline IMHO.


Coly Li


>> diff --git a/block/bio.c b/block/bio.c
>> index 4259125e16ab2..ac29c87c6735c 100644
>> --- a/block/bio.c
>> +++ b/block/bio.c
>> @@ -1336,10 +1336,12 @@ void bio_copy_data_iter(struct bio *dst, struct bvec_iter *dst_iter,
>>   		struct bio_vec src_bv = bio_iter_iovec(src, *src_iter);
>>   		struct bio_vec dst_bv = bio_iter_iovec(dst, *dst_iter);
>>   		unsigned int bytes = min(src_bv.bv_len, dst_bv.bv_len);
>> -		void *src_buf;
>> +		void *src_buf = bvec_kmap_local(&src_bv);
>> +		void *dst_buf = bvec_kmap_local(&dst_bv);
>>   
>> -		src_buf = bvec_kmap_local(&src_bv);
>> -		memcpy_to_bvec(&dst_bv, src_buf);
>> +		memcpy(dst_buf, src_buf, bytes);
>> +
>> +		kunmap_local(dst_buf);
>>   		kunmap_local(src_buf);
>>   
>>   		bio_advance_iter_single(src, src_iter, bytes);



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-05-08  7:41 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-09 22:04 bug in bcachefs -> bio_copy_data_iter Kent Overstreet
2021-11-10 12:02 ` Coly Li
2022-04-23 17:31   ` Christoph Hellwig
2022-05-07 18:28     ` Kent Overstreet
2022-05-08  7:41       ` Coly Li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.