All of lore.kernel.org
 help / color / mirror / Atom feed
* KASAN warning in bt_for_each
@ 2019-02-14 18:04 Evan Green
  2019-02-14 19:33 ` Bart Van Assche
  0 siblings, 1 reply; 5+ messages in thread
From: Evan Green @ 2019-02-14 18:04 UTC (permalink / raw)
  To: linux-block; +Cc: LKML, Bart Van Assche, Jens Axboe

Greetings, block experts!

I'm trying to track down a KASAN warning I'm seeing in our downstream
4.19 kernel, and I could use a little help. The warning looks like
this:

[  224.564894] BUG: KASAN: use-after-free in bt_for_each+0x1ac/0x28c
[  224.571195] Read of size 8 at addr ffffffc17c621340 by task fio/3851
[  224.577745]
[  224.579320] CPU: 2 PID: 3851 Comm: fio Tainted: G        W
4.19.16 #398
[  224.591905] Call trace:
[  224.594465]  dump_backtrace+0x0/0x2d8
[  224.598263]  show_stack+0x20/0x2c
[  224.601708]  __dump_stack+0x20/0x28
[  224.605329]  dump_stack+0xc8/0xec
[  224.608769]  print_address_description+0x74/0x240
[  224.613632]  kasan_report+0x258/0x274
[  224.617431]  __asan_report_load8_noabort+0x20/0x28
[  224.622384]  bt_for_each+0x1ac/0x28c
[  224.626090]  blk_mq_queue_tag_busy_iter+0x1d4/0x2f8
[  224.631127]  blk_mq_in_flight+0xa0/0x100
[  224.635193]  part_in_flight+0x54/0x19c
[  224.639079]  part_round_stats+0x138/0x1f0
[  224.643230]  blk_account_io_start+0x23c/0x54c
[  224.647742]  blk_mq_bio_to_request+0x24/0x2c
[  224.652165]  blk_mq_make_request+0x674/0xf90
[  224.656581]  generic_make_request+0x32c/0x7b4
[  224.661090]  submit_bio+0x230/0x464
[  224.664702]  __blkdev_direct_IO_simple+0x3d0/0x79c
[  224.669656]  blkdev_direct_IO+0x60/0x88
[  224.673629]  generic_file_read_iter+0x368/0xeb8
[  224.678308]  blkdev_read_iter+0xc0/0xd0
[  224.682281]  __vfs_read+0x1d8/0x2f0
[  224.685900]  vfs_read+0xd0/0x1bc
[  224.689249]  ksys_pread64+0x9c/0xec
[  224.692869]  __arm64_compat_sys_aarch32_pread64+0x60/0x68
[  224.698441]  el0_svc_common+0x120/0x1e0
[  224.702409]  el0_svc_compat_handler+0x64/0x80
[  224.706919]  el0_svc_compat+0x8/0x18
[  224.710615]
[  224.712177] The buggy address belongs to the page:
[  224.717128] page:ffffffbf05f18840 count:0 mapcount:0
mapping:0000000000000000 index:0x0
[  224.725377] flags: 0x4000000000000000()
[  224.729361] raw: 4000000000000000 0000000000000000 ffffffff00000301
0000000000000000
[  224.737340] raw: 0000000000000000 0000000000000000 00000000ffffffff
0000000000000000
[  224.745318] page dumped because: kasan: bad access detected
[  224.751070]
[  224.752629] Memory state around the buggy address:
[  224.757583]  ffffffc17c621200: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[  224.765035]  ffffffc17c621280: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[  224.772483] >ffffffc17c621300: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[  224.779921]                                            ^
[  224.785403]  ffffffc17c621380: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[  224.792853]  ffffffc17c621400: ff ff ff ff ff ff ff ff ff ff ff ff
ff ff ff ff
[  224.800299]

What was I doing when I saw this?
I was running blktests block/005 (switch schedulers while doing IO) on
a loop device backed by a block device. I don't hit this on every run
of block/005, but with this script I can usually hit it in less than a
minute:

d=`losetup --show -f /dev/sda5` # Replace sda5 with your favorite
unused partition.
cd /usr/blktests
echo "TEST_DEVS=$d" > config
while true; do
    ./check block/005
done

I got totally lost in the blk-mq stuff, but my best hunch is related
to this comment in bt_iter:
        /*
         * We can hit rq == NULL here, because the tagging functions
         * test and set the bit before assining ->rqs[].
         */
        if (rq && rq->q == hctx->queue)
                iter_data->fn(hctx, rq, iter_data->data, reserved);

I believe the KASAN warning is about touching rq->q. So my question
is: blk_mq_put_tag doesn't appear to clear out tags->rqs[tag] when
releasing the tag, so what's to prevent bt_iter from looking through a
stale rq, in between when someone has allocated that sbitmap tag (for
the second time), but not yet assigned their own rq in ->rqs[tag].
Maybe there's some synchronization further up the stack that I'm
missing?

I was hoping to prove this by nulling out tags->rqs[tag] in
blk_mq_put_tag, but apparently hctx->tags is sometimes not valid
anymore by that point? Horrible things happen when I do that.

So I moved sideways and tried a partial fix by clearing rqs[tag] right
away in blk_mq_get_tag. This of course doesn't close my suspected
window, but does make it a lot smaller, and indeed I saw my repro rate
drop to zero (at least letting it go for a few minutes).

Do you agree this might be an issue, and if so, is there a good place
I can null out tags->rqs[tag] before calling blk_mq_put_tag?
-Evan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KASAN warning in bt_for_each
  2019-02-14 18:04 KASAN warning in bt_for_each Evan Green
@ 2019-02-14 19:33 ` Bart Van Assche
  2019-02-14 20:45   ` Evan Green
  0 siblings, 1 reply; 5+ messages in thread
From: Bart Van Assche @ 2019-02-14 19:33 UTC (permalink / raw)
  To: Evan Green, linux-block; +Cc: LKML, Jens Axboe

On Thu, 2019-02-14 at 10:04 -0800, Evan Green wrote:
> Greetings, block experts!
> 
> I'm trying to track down a KASAN warning I'm seeing in our downstream
> 4.19 kernel, and I could use a little help. The warning looks like
> this:
> 
> [  224.564894] BUG: KASAN: use-after-free in bt_for_each+0x1ac/0x28c
> [  224.571195] Read of size 8 at addr ffffffc17c621340 by task fio/3851
> [  224.577745]
> [  224.579320] CPU: 2 PID: 3851 Comm: fio Tainted: G W 4.19.16 #398

Please have a look at this e-mail thread:
https://lore.kernel.org/linux-block/1545261885.185366.488.camel@acm.org/

Bart.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KASAN warning in bt_for_each
  2019-02-14 19:33 ` Bart Van Assche
@ 2019-02-14 20:45   ` Evan Green
  2019-02-14 20:50     ` Jens Axboe
  0 siblings, 1 reply; 5+ messages in thread
From: Evan Green @ 2019-02-14 20:45 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: linux-block, LKML, Jens Axboe

On Thu, Feb 14, 2019 at 11:33 AM Bart Van Assche <bvanassche@acm.org> wrote:
>
> On Thu, 2019-02-14 at 10:04 -0800, Evan Green wrote:
> > Greetings, block experts!
> >
> > I'm trying to track down a KASAN warning I'm seeing in our downstream
> > 4.19 kernel, and I could use a little help. The warning looks like
> > this:
> >
> > [  224.564894] BUG: KASAN: use-after-free in bt_for_each+0x1ac/0x28c
> > [  224.571195] Read of size 8 at addr ffffffc17c621340 by task fio/3851
> > [  224.577745]
> > [  224.579320] CPU: 2 PID: 3851 Comm: fio Tainted: G W 4.19.16 #398
>
> Please have a look at this e-mail thread:
> https://lore.kernel.org/linux-block/1545261885.185366.488.camel@acm.org/

Oh, yep, that's totally it. Did that patch make it in anywhere?
-Evan

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KASAN warning in bt_for_each
  2019-02-14 20:45   ` Evan Green
@ 2019-02-14 20:50     ` Jens Axboe
  2019-02-14 22:20       ` Bart Van Assche
  0 siblings, 1 reply; 5+ messages in thread
From: Jens Axboe @ 2019-02-14 20:50 UTC (permalink / raw)
  To: Evan Green, Bart Van Assche; +Cc: linux-block, LKML

On 2/14/19 1:45 PM, Evan Green wrote:
> On Thu, Feb 14, 2019 at 11:33 AM Bart Van Assche <bvanassche@acm.org> wrote:
>>
>> On Thu, 2019-02-14 at 10:04 -0800, Evan Green wrote:
>>> Greetings, block experts!
>>>
>>> I'm trying to track down a KASAN warning I'm seeing in our downstream
>>> 4.19 kernel, and I could use a little help. The warning looks like
>>> this:
>>>
>>> [  224.564894] BUG: KASAN: use-after-free in bt_for_each+0x1ac/0x28c
>>> [  224.571195] Read of size 8 at addr ffffffc17c621340 by task fio/3851
>>> [  224.577745]
>>> [  224.579320] CPU: 2 PID: 3851 Comm: fio Tainted: G W 4.19.16 #398
>>
>> Please have a look at this e-mail thread:
>> https://lore.kernel.org/linux-block/1545261885.185366.488.camel@acm.org/
> 
> Oh, yep, that's totally it. Did that patch make it in anywhere?

It didn't, since I didn't get any response to it.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: KASAN warning in bt_for_each
  2019-02-14 20:50     ` Jens Axboe
@ 2019-02-14 22:20       ` Bart Van Assche
  0 siblings, 0 replies; 5+ messages in thread
From: Bart Van Assche @ 2019-02-14 22:20 UTC (permalink / raw)
  To: Jens Axboe, Evan Green; +Cc: linux-block, LKML

On Thu, 2019-02-14 at 13:50 -0700, Jens Axboe wrote:
> On 2/14/19 1:45 PM, Evan Green wrote:
> > On Thu, Feb 14, 2019 at 11:33 AM Bart Van Assche <bvanassche@acm.org> wrote:
> > > 
> > > On Thu, 2019-02-14 at 10:04 -0800, Evan Green wrote:
> > > > Greetings, block experts!
> > > > 
> > > > I'm trying to track down a KASAN warning I'm seeing in our downstream
> > > > 4.19 kernel, and I could use a little help. The warning looks like
> > > > this:
> > > > 
> > > > [  224.564894] BUG: KASAN: use-after-free in bt_for_each+0x1ac/0x28c
> > > > [  224.571195] Read of size 8 at addr ffffffc17c621340 by task fio/3851
> > > > [  224.577745]
> > > > [  224.579320] CPU: 2 PID: 3851 Comm: fio Tainted: G W 4.19.16 #398
> > > 
> > > Please have a look at this e-mail thread:
> > > https://lore.kernel.org/linux-block/1545261885.185366.488.camel@acm.org/
> > 
> > Oh, yep, that's totally it. Did that patch make it in anywhere?
> 
> It didn't, since I didn't get any response to it.

I will have a look. Sorry that I had dropped the ball on that patch.

Bart.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-02-14 22:20 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-14 18:04 KASAN warning in bt_for_each Evan Green
2019-02-14 19:33 ` Bart Van Assche
2019-02-14 20:45   ` Evan Green
2019-02-14 20:50     ` Jens Axboe
2019-02-14 22:20       ` Bart Van Assche

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.