* use-after-free access in bt_iter()
@ 2021-02-04 11:46 pragalla
  2021-02-04 15:51 ` Bart Van Assche
  0 siblings, 1 reply; 8+ messages in thread
From: pragalla @ 2021-02-04 11:46 UTC (permalink / raw)
  To: axboe, bvanassche, evgreen, jianchao.w.wang; +Cc: linux-block, stummala

Hi Jens, Bart,

This is regarding the use-after-free access in bt_iter().
I see this has been reported and discussed on several separate threads;
most of the discussion of a solution appears to be in [1], as pointed
out in [2].

[1] https://lore.kernel.org/linux-block/1545261885.185366.488.camel@acm.org/
[2] https://lkml.org/lkml/2019/2/14/942

A similar issue was seen again on a 5.4 kernel during internal 
stability testing.

<2> Unable to handle kernel paging request at virtual address ffffff8107929600
<2> Mem abort info:
<2>   ESR = 0x96000007
<2>   EC = 0x25: DABT (current EL), IL = 32 bits
<2>   SET = 0, FnV = 0
<2>   EA = 0, S1PTW = 0
<2> Data abort info:
<2>   ISV = 0, ISS = 0x00000007
<2>   CM = 0, WnR = 0
<2> swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000a2603000
<2> [ffffff8107929600] pgd=00000001bf909003, pud=00000001bf909003, pmd=00000001bf8cc003, pte=0068000187929f12
<2> Internal error: Oops: 96000007 [#1] PREEMPT SMP
<2> Skip md ftrace buffer dump for: 0x1609e0

<2> CPU: 0 PID: 220 Comm: kworker/0:1H Tainted: G S      W  O      5.4.61-qgki-debug-g85faaf6 #2
<2> Workqueue: kblockd blk_mq_timeout_work
<2> pstate: 20c00005 (nzCv daif +PAN +UAO)
<2> pc : bt_for_each+0x114/0x1a4
<2> lr : bt_for_each+0xe0/0x1a4
<2> sp : ffffffc017f7bc60
<2> x29: ffffffc017f7bc80 x28: 0000000000000001
<2> x27: 0000000000000008 x26: 0000000000000001
<2> x25: 0000000000000001 x24: 0000000000000008
<2> x23: ffffff8107bcd800 x22: ffffff810872bd10
<2> x21: ffffffd764e6ea50 x20: 0000000000000008
<2> x19: 0000000000000000 x18: ffffffc017f51030
<2> x17: 0000000005f5e100 x16: 0000000000000000
<2> x15: ffffffffff84bf5c x14: 0000000000000598
<2> x13: 0000000000000008 x12: 00000000212d4a53
<2> x11: 00000000000000ff x10: 0000000000000000
<2> x9 : ffffff810872cd00 x8 : 0000000000000009
<2> x7 : 0000000000000000 x6 : ffffffd763890758
<2> x5 : 0000000000000000 x4 : 0000000000000000
<2> x3 : ffffffc017f7bd20 x2 : 0000000000000001
<2> x1 : ffffff8107929600 x0 : 0000000000000001
<2> Call trace:
<2>  bt_for_each+0x114/0x1a4
<2>  blk_mq_queue_tag_busy_iter+0xd8/0x1a4
<2>  blk_mq_timeout_work+0xd4/0x1c0
<2>  process_one_work+0x280/0x460
<2>  worker_thread+0x27c/0x4dc
<2>  kthread+0x160/0x170
<2>  ret_from_fork+0x10/0x18
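
For readers decoding the "Mem abort info" lines above: the ESR value splits
into the fields that the kernel prints. The following is a minimal sketch,
assuming the standard AArch64 ESR_ELx layout (EC in bits [31:26], IL in bit
25, ISS in bits [24:0], and for data aborts the fault status code in ISS bits
[5:0] and WnR in ISS bit 6). The struct and function names are hypothetical
and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded view of an ESR_ELx value (illustrative names, not kernel types). */
struct esr_fields {
	unsigned int ec;   /* exception class, bits [31:26] */
	bool il;           /* instruction length bit 25: true = 32-bit instruction */
	uint32_t iss;      /* instruction specific syndrome, bits [24:0] */
	unsigned int dfsc; /* data fault status code, ISS bits [5:0] */
	bool wnr;          /* ISS bit 6: true = write access, false = read access */
};

/* Split a 32-bit syndrome value into the fields shown in the oops log. */
static void esr_decode(uint32_t esr, struct esr_fields *out)
{
	assert(out != NULL);

	out->ec = (esr >> 26) & 0x3f;
	out->il = ((esr >> 25) & 0x1) != 0;
	out->iss = esr & 0x01ffffff;
	out->dfsc = out->iss & 0x3f;
	out->wnr = ((out->iss >> 6) & 0x1) != 0;
}
```

For ESR = 0x96000007 this gives EC 0x25, IL set, ISS 0x7 and WnR clear,
matching the log. The fault status code 0x07 denotes a level 3 translation
fault, which is consistent with the pte value shown above having its valid
bit (bit 0) clear: the faulting access was a read of an unmapped address.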

Has this issue been fixed in any recent kernel? If so, could you please 
point me to the patch?
If it has not been fixed, can we work towards a final solution? I can help 
test it.

Thanks and Regards,
Pradeep


* Re: use-after-free access in bt_iter()
  2021-02-04 11:46 use-after-free access in bt_iter() pragalla
@ 2021-02-04 15:51 ` Bart Van Assche
  2021-02-04 16:17   ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: Bart Van Assche @ 2021-02-04 15:51 UTC (permalink / raw)
  To: pragalla, axboe, evgreen, jianchao.w.wang
  Cc: linux-block, stummala, John Garry

On 2/4/21 3:46 AM, pragalla@codeaurora.org wrote:
> Is this issue got fixed on any latest kernel ? if so, can you please
> help point the patch ?
> If not got fixed, can we have a final solution ? i can even help in
> testing the solution.

Hi John,

Some time ago you replied the following to an email from me with a
suggestion for a fix: "Please let me consider it a bit more." Are you
still working on a fix?

See also
https://lore.kernel.org/linux-block/1bcc1d9e-6a32-1e00-0d32-f5b7325b2f8c@huawei.com/

Thanks,

Bart.


* Re: use-after-free access in bt_iter()
  2021-02-04 15:51 ` Bart Van Assche
@ 2021-02-04 16:17   ` John Garry
  2021-02-05  2:39     ` Ming Lei
  2021-02-05 15:30     ` pragalla
  0 siblings, 2 replies; 8+ messages in thread
From: John Garry @ 2021-02-04 16:17 UTC (permalink / raw)
  To: Bart Van Assche, pragalla, axboe, evgreen, jianchao.w.wang
  Cc: linux-block, stummala, Ming Lei

On 04/02/2021 15:51, Bart Van Assche wrote:
> On 2/4/21 3:46 AM,pragalla@codeaurora.org  wrote:
>> Is this issue got fixed on any latest kernel ? if so, can you please
>> help point the patch ?
>> If not got fixed, can we have a final solution ? i can even help in
>> testing the solution.
> Hi John,
> 

Hi Bart,

> Some time ago you replied the following to an email from me with a
> suggestion for a fix: "Please let me consider it a bit more." Are you
> still working on a fix?

Unfortunately I have not had a chance, sorry. But I can look again.

So I have only seen KASAN use-after-free's myself, but never an actual 
oops. IIRC, someone did report an oops.

@Pradeep, do you have a reliable re-creator? I noticed the timeout 
handler stackframe in your mail, so I guess not. However, as an 
experiment, could you test:
https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/

This should fix the common issue, but there is no final solution yet for the 
issues discussed under patch 2/2, which are more exotic.

BTW, is this the same Pradeep who reported:
https://lore.kernel.org/linux-block/1606402925-24420-1-git-send-email-ppvk@codeaurora.org/

I did cc ppvk@codeaurora.org on an earlier version of my series, but it 
bounced.

> 
> See also
> https://lore.kernel.org/linux-block/1bcc1d9e-6a32-1e00-0d32-f5b7325b2f8c@huawei.com/

Thanks,
John


* Re: use-after-free access in bt_iter()
  2021-02-04 16:17   ` John Garry
@ 2021-02-05  2:39     ` Ming Lei
  2021-02-05 15:30     ` pragalla
  1 sibling, 0 replies; 8+ messages in thread
From: Ming Lei @ 2021-02-05  2:39 UTC (permalink / raw)
  To: John Garry
  Cc: Bart Van Assche, pragalla, axboe, evgreen, jianchao.w.wang,
	linux-block, stummala

On Thu, Feb 04, 2021 at 04:17:51PM +0000, John Garry wrote:
> On 04/02/2021 15:51, Bart Van Assche wrote:
> > On 2/4/21 3:46 AM,pragalla@codeaurora.org  wrote:
> > > Is this issue got fixed on any latest kernel ? if so, can you please
> > > help point the patch ?
> > > If not got fixed, can we have a final solution ? i can even help in
> > > testing the solution.
> > Hi John,
> > 
> 
> Hi Bart,
> 
> > Some time ago you replied the following to an email from me with a
> > suggestion for a fix: "Please let me consider it a bit more." Are you
> > still working on a fix?
> 
> Unfortunately I have not had a chance, sorry. But I can look again.
> 
> So I have only seen KASAN use-after-free's myself, but never an actual oops.
> IIRC, someone did report an oops.
> 
> @Pradeep, do you have a reliable re-creator? I noticed the timeout handler
> stackframe in your mail, so I guess not. However, as an experiment, could
> you test:
> https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/
> 
> This should fix the common issue. But no final solution to issues discussed
> from patch 2/2, which is more exotic.
> 

If there is still no progress, I'd suggest considering the patches I posted:

https://lore.kernel.org/linux-block/accb98d8-8186-2e74-a5c3-e0f09ce2b3ff@acm.org/#r

The idea is quite simple at least, :-)

-- 
Ming



* Re: use-after-free access in bt_iter()
  2021-02-04 16:17   ` John Garry
  2021-02-05  2:39     ` Ming Lei
@ 2021-02-05 15:30     ` pragalla
  2021-02-05 16:07       ` John Garry
  1 sibling, 1 reply; 8+ messages in thread
From: pragalla @ 2021-02-05 15:30 UTC (permalink / raw)
  To: John Garry
  Cc: Bart Van Assche, axboe, evgreen, jianchao.w.wang, linux-block,
	stummala, Ming Lei

On 2021-02-04 21:47, John Garry wrote:
> On 04/02/2021 15:51, Bart Van Assche wrote:
>> On 2/4/21 3:46 AM,pragalla@codeaurora.org  wrote:
>>> Is this issue got fixed on any latest kernel ? if so, can you please
>>> help point the patch ?
>>> If not got fixed, can we have a final solution ? i can even help in
>>> testing the solution.
>> Hi John,
>> 
> 
> Hi Bart,
> 
>> Some time ago you replied the following to an email from me with a
>> suggestion for a fix: "Please let me consider it a bit more." Are you
>> still working on a fix?
> 
> Unfortunately I have not had a chance, sorry. But I can look again.
> 
> So I have only seen KASAN use-after-free's myself, but never an actual
> oops. IIRC, someone did report an oops.
> 
Hi John,

> @Pradeep, do you have a reliable re-creator? I noticed the timeout
> handler stackframe in your mail, so I guess not. However, as an
> experiment, could you test:
> https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/
> 
Correct, I don't have a reliable re-creator. The oops was seen during 
stability testing, not in a deliberate attempt to trigger it, and it has 
been seen a couple of times.
Please share any steps that hit this more easily or exercise this path 
more frequently.
Meanwhile, I will go with the usual stability procedure and post the 
results here later.

> This should fix the common issue. But no final solution to issues
> discussed from patch 2/2, which is more exotic.
> 
> BTW, is this the same Pradeep who reported:
> https://lore.kernel.org/linux-block/1606402925-24420-1-git-send-email-ppvk@codeaurora.org/
> 
> I did cc ppvk@codeaurora.org on earlier version of my series, but it 
> bounced.
> 
Yes, it's the same Pradeep. Unfortunately my old address 
"ppvk@codeaurora.org" expired and could not be restored, hence the 
bounced emails. That is now resolved with the new address 
"pragalla@codeaurora.org", from which I am replying.

>> 
>> See also
>> https://lore.kernel.org/linux-block/1bcc1d9e-6a32-1e00-0d32-f5b7325b2f8c@huawei.com/
> 
> Thanks,
> John

Thanks and Regards,
Pradeep


* Re: use-after-free access in bt_iter()
  2021-02-05 15:30     ` pragalla
@ 2021-02-05 16:07       ` John Garry
       [not found]         ` <9ace4c26c47e84c3c6a1c68ef1a193f8@codeaurora.org>
  0 siblings, 1 reply; 8+ messages in thread
From: John Garry @ 2021-02-05 16:07 UTC (permalink / raw)
  To: pragalla; +Cc: Bart Van Assche, axboe, evgreen, linux-block, stummala, Ming Lei

- bouncing jianchao.w.wang@oracle.com

>>
>>> Some time ago you replied the following to an email from me with a
>>> suggestion for a fix: "Please let me consider it a bit more." Are you
>>> still working on a fix?
>>
>> Unfortunately I have not had a chance, sorry. But I can look again.
>>
>> So I have only seen KASAN use-after-free's myself, but never an actual
>> oops. IIRC, someone did report an oops.
>>
> Hi John,
> 
>> @Pradeep, do you have a reliable re-creator? I noticed the timeout
>> handler stackframe in your mail, so I guess not. However, as an
>> experiment, could you test:
>> https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/ 
>>
>>
> Yes, i don't have a reliable re-creator. The oops was noticed as a part 
> of stability testing and
> was not an intentional try. This was noticed couple of times.
> Please share the steps (if any) to easy hit or to exercise this path 
> more frequently.
> Meanwhile, i will go with the usual stability procedure. i will update 
> the results here later.
> 

Do you have a full kernel log for your crash?

So there are different flavors of this issue, and you reported a crash 
from blk_mq_queue_tag_busy_iter().

If you check:
https://lore.kernel.org/linux-block/76190c94-c5c1-9553-5509-9969fc323544@huawei.com/

You can see how I artificially trigger an issue in 
blk_mq_queue_tag_busy_iter().

>> This should fix the common issue. But no final solution to issues
>> discussed from patch 2/2, which is more exotic.
>>
>> BTW, is this the same Pradeep who reported:
>> https://lore.kernel.org/linux-block/1606402925-24420-1-git-send-email-ppvk@codeaurora.org/ 
>>

Thanks,
John



* Re: use-after-free access in bt_iter()
       [not found]         ` <9ace4c26c47e84c3c6a1c68ef1a193f8@codeaurora.org>
@ 2021-02-19  6:22           ` pragalla
  2021-02-19  9:34             ` John Garry
  0 siblings, 1 reply; 8+ messages in thread
From: pragalla @ 2021-02-19  6:22 UTC (permalink / raw)
  To: John Garry
  Cc: Bart Van Assche, axboe, evgreen, linux-block, stummala, Ming Lei

On 2021-02-05 21:51, pragalla@codeaurora.org wrote:
> On 2021-02-05 21:37, John Garry wrote:
>> - bouncing jianchao.w.wang@oracle.com
>> 
>>>> 
>>>>> Some time ago you replied the following to an email from me with a
>>>>> suggestion for a fix: "Please let me consider it a bit more." Are 
>>>>> you
>>>>> still working on a fix?
>>>> 
>>>> Unfortunately I have not had a chance, sorry. But I can look again.
>>>> 
>>>> So I have only seen KASAN use-after-free's myself, but never an 
>>>> actual
>>>> oops. IIRC, someone did report an oops.
>>>> 
>>> Hi John,
>>> 
>>>> @Pradeep, do you have a reliable re-creator? I noticed the timeout
>>>> handler stackframe in your mail, so I guess not. However, as an
>>>> experiment, could you test:
>>>> https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/
>>> Yes, i don't have a reliable re-creator. The oops was noticed as a 
>>> part of stability testing and
>>> was not an intentional try. This was noticed couple of times.
>>> Please share the steps (if any) to easy hit or to exercise this path 
>>> more frequently.
>>> Meanwhile, i will go with the usual stability procedure. i will 
>>> update the results here later.
>>> 
>> 
Hi John,

We ran the stability test with the above patch
(https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/)
while switching the I/O schedulers in between, for ~88 hours on 2 devices, 
and did not notice any crash or issue.
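
The thread does not include the script used to switch I/O schedulers during
this run. As a rough sketch of one building block, the helper below picks the
next scheduler to select, given the text of a block device's
queue/scheduler sysfs attribute, where the active scheduler is shown in
square brackets (for example "[mq-deadline] kyber none"). The function name
and the round-robin policy are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy into 'out' the scheduler name that follows the bracketed (active)
 * entry in 'list', wrapping around to the first name when the active entry
 * is the last one. Returns 0 on success, or -1 if no active entry is found
 * or 'out' is too small.
 */
static int next_scheduler(const char *list, char *out, size_t outlen)
{
	const char *open, *close, *p;
	size_t len;

	assert(list != NULL && out != NULL);

	open = strchr(list, '[');
	close = open ? strchr(open, ']') : NULL;
	if (!close)
		return -1;

	/* Candidate: the first name after the closing bracket. */
	p = close + 1;
	p += strspn(p, " \n");
	if (*p == '\0') {
		/* The active entry was last in the list: wrap to the start. */
		p = list + strspn(list, " \n");
		if (*p == '[')
			p++;
	}

	len = strcspn(p, " ]\n");
	if (len == 0 || len >= outlen)
		return -1;

	memcpy(out, p, len);
	out[len] = '\0';
	return 0;
}
```

A test loop could read the attribute, call this helper, and write the
returned name back to the same attribute while I/O is running.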

>> Do you have a full kernel log for your crash?
> Yes. Attaching the full kernel dmesg log.
>> 
>> So there are different flavors of this issue, and you reported a crash
>> from blk_mq_queue_tag_busy_iter().
>> 
>> If you check:
>> https://lore.kernel.org/linux-block/76190c94-c5c1-9553-5509-9969fc323544@huawei.com/
>> 
>> You can see how I artificially trigger an issue in 
>> blk_mq_queue_tag_busy_iter().
> Sure, i will go through the steps on the recreation part. Thanks.
>> 
>>>> This should fix the common issue. But no final solution to issues
>>>> discussed from patch 2/2, which is more exotic.
>>>> 
>>>> BTW, is this the same Pradeep who reported:
>>>> https://lore.kernel.org/linux-block/1606402925-24420-1-git-send-email-ppvk@codeaurora.org/
>> 
>> Thanks,
>> John
> 
> Thanks and Regards,
> Pradeep


* Re: use-after-free access in bt_iter()
  2021-02-19  6:22           ` pragalla
@ 2021-02-19  9:34             ` John Garry
  0 siblings, 0 replies; 8+ messages in thread
From: John Garry @ 2021-02-19  9:34 UTC (permalink / raw)
  To: pragalla; +Cc: Bart Van Assche, axboe, evgreen, linux-block, stummala, Ming Lei

On 19/02/2021 06:22, pragalla@codeaurora.org wrote:
>>>>
>>>
> Hi John,
> 
> we ran the stability with the above patch
> (https://lore.kernel.org/linux-block/1608203273-170555-2-git-send-email-john.garry@huawei.com/) 
> 
> with switching the io-schedulers in b/w for ~88hrs on 2 devices, we 
> didn't notice any crash/issue.
> 

Oh, good. I assume that this is the same test on which you were reporting 
crashes previously.

So we still have the issue of changing the IO scheduler while the tagset 
iters hold old references to tags. I tried Bart's idea to deal with the 
request queue tagset iter, and it seems to work, but I am not sure if it is 
acceptable. I might also have a solution for blk_mq_tagset_busy_iter().
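
The problem described here, iterators following references to requests that
were freed when the I/O scheduler changed, can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: all model_*
names are hypothetical, it is not the code from any patch in this thread, and
it is single-threaded, whereas a kernel change would also have to synchronize
with iterators that run concurrently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_DEPTH 8

/* Stand-in for a request; only the tag matters for this model. */
struct model_rq {
	int tag;
};

/* Stand-in for a tag map: slot i remembers the last request seen for tag i. */
struct model_tags {
	struct model_rq *rqs[MODEL_DEPTH];
};

/* Does 'rq' point into the pool of 'n' requests starting at 'pool'? */
static int model_in_pool(const struct model_rq *rq,
			 const struct model_rq *pool, size_t n)
{
	uintptr_t addr = (uintptr_t)rq;
	uintptr_t start = (uintptr_t)pool;
	uintptr_t end = (uintptr_t)(pool + n);

	return addr >= start && addr < end;
}

/*
 * Before a request pool is freed, drop every lookup entry that still points
 * into it, so a later iteration cannot dereference freed memory.
 * Returns the number of entries cleared.
 */
static size_t model_clear_stale(struct model_tags *tags,
				const struct model_rq *pool, size_t n)
{
	size_t i, cleared = 0;

	assert(tags != NULL && pool != NULL);

	for (i = 0; i < MODEL_DEPTH; i++) {
		if (tags->rqs[i] && model_in_pool(tags->rqs[i], pool, n)) {
			tags->rqs[i] = NULL;
			cleared++;
		}
	}
	return cleared;
}

/* Iterator stand-in: count the entries that are safe to visit. */
static size_t model_count_live(const struct model_tags *tags)
{
	size_t i, live = 0;

	assert(tags != NULL);

	for (i = 0; i < MODEL_DEPTH; i++)
		if (tags->rqs[i])
			live++;
	return live;
}
```

In this model, calling model_clear_stale() on a pool just before it is
released leaves entries that point at other pools untouched, and the iterator
simply skips the cleared slots.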

However I am not sure if these are seen in real-life, and whether the 
patch you tested is good enough for now.

Let me look at this again and I will report back.

Thanks very much,
John

>>> Do you have a full kernel log for your crash? 

