linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] block, bfq: NULL out the bic when it's no longer valid
@ 2019-06-28  4:44 Douglas Anderson
  2019-06-28  4:57 ` Guenter Roeck
  2019-06-28 13:45 ` Jens Axboe
  0 siblings, 2 replies; 4+ messages in thread
From: Douglas Anderson @ 2019-06-28  4:44 UTC (permalink / raw)
  To: Paolo Valente, Jens Axboe
  Cc: groeck, drinkcat, Douglas Anderson, linux-block, linux-kernel

In reboot tests on several devices we were seeing a "use after free"
when slub_debug or KASAN was enabled.  The kernel complained about:

  Unable to handle kernel paging request at virtual address 6b6b6c2b

...which is a classic sign of use after free under slub_debug.  The
stack crawl in kgdb looked like:

 0  test_bit (addr=<optimized out>, nr=<optimized out>)
 1  bfq_bfqq_busy (bfqq=<optimized out>)
 2  bfq_select_queue (bfqd=<optimized out>)
 3  __bfq_dispatch_request (hctx=<optimized out>)
 4  bfq_dispatch_request (hctx=<optimized out>)
 5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
 6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
 7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
 8  0xc0568d94 in blk_mq_run_work_fn (work=<optimized out>)
 9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
 10 0xc024cff4 in worker_thread (__worker=0xec6d4640)

Digging in kgdb, it could be found that, though bfqq looked fine,
bfqq->bic had been freed.

Through further digging, I postulated that perhaps it is illegal to
access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
because the "bic" can be freed at some point in time after this call
is made.  I confirmed that there certainly were cases where the exact
crashing code path would access the "bic" after bfq_exit_icq() had
been called.  Sspecifically I set the "bfqq->bic" to (void *)0x7 and
saw that the bic was 0x7 at the time of the crash.

To understand a bit more about why this crash was fairly uncommon (I
saw it only once in a few hundred reboots), you can see that much of
the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
access the ->bic anymore.  The only case it doesn't is if
bfq_put_queue() sees a reference still held.

However, even in the case when bfqq isn't freed, the crash is still
rare.  Why?  I tracked what happened to the "bic" after the exit
routine.  It doesn't get freed right away.  Rather,
put_io_context_active() eventually called put_io_context() which
queued up freeing on a workqueue.  The freeing then actually happened
later than that through call_rcu().  Despite all these delays, some
extra debugging showed that all the hoops could be jumped through in
time and the memory could be freed causing the original crash.  Phew!

To make a long story short, assuming it truly is illegal to access an
icq after the "exit_icq" callback is finished, this patch is needed.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
Most of the testing of this was done on the Chrome OS 4.19 kernel with
BFQ backported (thanks to Paolo's help).  I did manage to reproduce a
crash on mainline Linux (v5.2-rc6) though.

To see some of the techniques used to debug this, see
<https://crrev.com/c/1679134> and <https://crrev.com/c/1681258/1>.

I'll also note that on linuxnext (next-20190627) I saw some other
use-after-frees that seemed related to BFQ but haven't had time to
debug.  They seemed unrelated.

 block/bfq-iosched.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index f8d430f88d25..6c0cff03f8f6 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -4584,6 +4584,7 @@ static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
 		unsigned long flags;
 
 		spin_lock_irqsave(&bfqd->lock, flags);
+		bfqq->bic = NULL;
 		bfq_exit_bfqq(bfqd, bfqq);
 		bic_set_bfqq(bic, NULL, is_sync);
 		spin_unlock_irqrestore(&bfqd->lock, flags);
-- 
2.22.0.410.gd8fdbe21b5-goog


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] block, bfq: NULL out the bic when it's no longer valid
  2019-06-28  4:44 [PATCH] block, bfq: NULL out the bic when it's no longer valid Douglas Anderson
@ 2019-06-28  4:57 ` Guenter Roeck
  2019-06-28  7:51   ` Paolo Valente
  2019-06-28 13:45 ` Jens Axboe
  1 sibling, 1 reply; 4+ messages in thread
From: Guenter Roeck @ 2019-06-28  4:57 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Paolo Valente, Jens Axboe, groeck, drinkcat, linux-block, linux-kernel

On Thu, Jun 27, 2019 at 09:44:09PM -0700, Douglas Anderson wrote:
> In reboot tests on several devices we were seeing a "use after free"
> when slub_debug or KASAN was enabled.  The kernel complained about:
> 
>   Unable to handle kernel paging request at virtual address 6b6b6c2b
> 
> ...which is a classic sign of use after free under slub_debug.  The
> stack crawl in kgdb looked like:
> 
>  0  test_bit (addr=<optimized out>, nr=<optimized out>)
>  1  bfq_bfqq_busy (bfqq=<optimized out>)
>  2  bfq_select_queue (bfqd=<optimized out>)
>  3  __bfq_dispatch_request (hctx=<optimized out>)
>  4  bfq_dispatch_request (hctx=<optimized out>)
>  5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
>  6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
>  7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
>  8  0xc0568d94 in blk_mq_run_work_fn (work=<optimized out>)
>  9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
>  10 0xc024cff4 in worker_thread (__worker=0xec6d4640)
> 
> Digging in kgdb, it could be found that, though bfqq looked fine,
> bfqq->bic had been freed.
> 
> Through further digging, I postulated that perhaps it is illegal to
> access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
> because the "bic" can be freed at some point in time after this call
> is made.  I confirmed that there certainly were cases where the exact
> crashing code path would access the "bic" after bfq_exit_icq() had
> been called.  Sspecifically I set the "bfqq->bic" to (void *)0x7 and
                ^^^

> saw that the bic was 0x7 at the time of the crash.
> 
> To understand a bit more about why this crash was fairly uncommon (I
> saw it only once in a few hundred reboots), you can see that much of
> the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
> access the ->bic anymore.  The only case it doesn't is if
> bfq_put_queue() sees a reference still held.
> 
> However, even in the case when bfqq isn't freed, the crash is still
> rare.  Why?  I tracked what happened to the "bic" after the exit
> routine.  It doesn't get freed right away.  Rather,
> put_io_context_active() eventually called put_io_context() which
> queued up freeing on a workqueue.  The freeing then actually happened
> later than that through call_rcu().  Despite all these delays, some
> extra debugging showed that all the hoops could be jumped through in
> time and the memory could be freed causing the original crash.  Phew!
> 
> To make a long story short, assuming it truly is illegal to access an
> icq after the "exit_icq" callback is finished, this patch is needed.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Nicely done ... thanks!

Reviewed-by: Guenter Roeck <groeck@chromium.org>

> ---
> Most of the testing of this was done on the Chrome OS 4.19 kernel with
> BFQ backported (thanks to Paolo's help).  I did manage to reproduce a
> crash on mainline Linux (v5.2-rc6) though.
> 
> To see some of the techniques used to debug this, see
> <https://crrev.com/c/1679134> and <https://crrev.com/c/1681258/1>.
> 
> I'll also note that on linuxnext (next-20190627) I saw some other
> use-after-frees that seemed related to BFQ but haven't had time to
> debug.  They seemed unrelated.
> 
>  block/bfq-iosched.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index f8d430f88d25..6c0cff03f8f6 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -4584,6 +4584,7 @@ static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
>  		unsigned long flags;
>  
>  		spin_lock_irqsave(&bfqd->lock, flags);
> +		bfqq->bic = NULL;
>  		bfq_exit_bfqq(bfqd, bfqq);
>  		bic_set_bfqq(bic, NULL, is_sync);
>  		spin_unlock_irqrestore(&bfqd->lock, flags);
> -- 
> 2.22.0.410.gd8fdbe21b5-goog
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] block, bfq: NULL out the bic when it's no longer valid
  2019-06-28  4:57 ` Guenter Roeck
@ 2019-06-28  7:51   ` Paolo Valente
  0 siblings, 0 replies; 4+ messages in thread
From: Paolo Valente @ 2019-06-28  7:51 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Douglas Anderson, Jens Axboe, Guenter Roeck, drinkcat,
	linux-block, linux-kernel



> Il giorno 28 giu 2019, alle ore 06:57, Guenter Roeck <linux@roeck-us.net> ha scritto:
> 
> On Thu, Jun 27, 2019 at 09:44:09PM -0700, Douglas Anderson wrote:
>> In reboot tests on several devices we were seeing a "use after free"
>> when slub_debug or KASAN was enabled.  The kernel complained about:
>> 
>>  Unable to handle kernel paging request at virtual address 6b6b6c2b
>> 
>> ...which is a classic sign of use after free under slub_debug. The
>> stack crawl in kgdb looked like:
>> 
>> 0  test_bit (addr=<optimized out>, nr=<optimized out>)
>> 1  bfq_bfqq_busy (bfqq=<optimized out>)
>> 2  bfq_select_queue (bfqd=<optimized out>)
>> 3  __bfq_dispatch_request (hctx=<optimized out>)
>> 4  bfq_dispatch_request (hctx=<optimized out>)
>> 5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
>> 6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
>> 7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
>> 8  0xc0568d94 in blk_mq_run_work_fn (work=<optimized out>)
>> 9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
>> 10 0xc024cff4 in worker_thread (__worker=0xec6d4640)
>> 
>> Digging in kgdb, it could be found that, though bfqq looked fine,
>> bfqq->bic had been freed.
>> 
>> Through further digging, I postulated that perhaps it is illegal to
>> access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
>> because the "bic" can be freed at some point in time after this call
>> is made.  I confirmed that there certainly were cases where the exact
>> crashing code path would access the "bic" after bfq_exit_icq() had
>> been called.  Sspecifically I set the "bfqq->bic" to (void *)0x7 and
>                ^^^
> 
>> saw that the bic was 0x7 at the time of the crash.
>> 
>> To understand a bit more about why this crash was fairly uncommon (I
>> saw it only once in a few hundred reboots), you can see that much of
>> the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
>> access the ->bic anymore.  The only case it doesn't is if
>> bfq_put_queue() sees a reference still held.
>> 
>> However, even in the case when bfqq isn't freed, the crash is still
>> rare.  Why?  I tracked what happened to the "bic" after the exit
>> routine.  It doesn't get freed right away.  Rather,
>> put_io_context_active() eventually called put_io_context() which
>> queued up freeing on a workqueue.  The freeing then actually happened
>> later than that through call_rcu().  Despite all these delays, some
>> extra debugging showed that all the hoops could be jumped through in
>> time and the memory could be freed causing the original crash. Phew!
>> 
>> To make a long story short, assuming it truly is illegal to access an
>> icq after the "exit_icq" callback is finished, this patch is needed.
>> 

Nice work! I think this bug has been there since when BFQ was born ...

Reviewed-by: Paolo Valente <paolo.valente@unimore.it>

Thanks,
Paolo

>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> 
> Nicely done ... thanks!
> 
> Reviewed-by: Guenter Roeck <groeck@chromium.org>
> 
>> ---
>> Most of the testing of this was done on the Chrome OS 4.19 kernel with
>> BFQ backported (thanks to Paolo's help).  I did manage to reproduce a
>> crash on mainline Linux (v5.2-rc6) though.
>> 
>> To see some of the techniques used to debug this, see
>> <https://crrev.com/c/1679134> and <https://crrev.com/c/1681258/1>.
>> 
>> I'll also note that on linuxnext (next-20190627) I saw some other
>> use-after-frees that seemed related to BFQ but haven't had time to
>> debug.  They seemed unrelated.
>> 
>> block/bfq-iosched.c | 1 +
>> 1 file changed, 1 insertion(+)
>> 
>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>> index f8d430f88d25..6c0cff03f8f6 100644
>> --- a/block/bfq-iosched.c
>> +++ b/block/bfq-iosched.c
>> @@ -4584,6 +4584,7 @@ static void bfq_exit_icq_bfqq(struct bfq_io_cq *bic, bool is_sync)
>> 		unsigned long flags;
>> 
>> 		spin_lock_irqsave(&bfqd->lock, flags);
>> +		bfqq->bic = NULL;
>> 		bfq_exit_bfqq(bfqd, bfqq);
>> 		bic_set_bfqq(bic, NULL, is_sync);
>> 		spin_unlock_irqrestore(&bfqd->lock, flags);
>> -- 
>> 2.22.0.410.gd8fdbe21b5-goog


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] block, bfq: NULL out the bic when it's no longer valid
  2019-06-28  4:44 [PATCH] block, bfq: NULL out the bic when it's no longer valid Douglas Anderson
  2019-06-28  4:57 ` Guenter Roeck
@ 2019-06-28 13:45 ` Jens Axboe
  1 sibling, 0 replies; 4+ messages in thread
From: Jens Axboe @ 2019-06-28 13:45 UTC (permalink / raw)
  To: Douglas Anderson, Paolo Valente
  Cc: groeck, drinkcat, linux-block, linux-kernel

On 6/27/19 10:44 PM, Douglas Anderson wrote:
> In reboot tests on several devices we were seeing a "use after free"
> when slub_debug or KASAN was enabled.  The kernel complained about:
> 
>    Unable to handle kernel paging request at virtual address 6b6b6c2b
> 
> ...which is a classic sign of use after free under slub_debug.  The
> stack crawl in kgdb looked like:
> 
>   0  test_bit (addr=<optimized out>, nr=<optimized out>)
>   1  bfq_bfqq_busy (bfqq=<optimized out>)
>   2  bfq_select_queue (bfqd=<optimized out>)
>   3  __bfq_dispatch_request (hctx=<optimized out>)
>   4  bfq_dispatch_request (hctx=<optimized out>)
>   5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
>   6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
>   7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
>   8  0xc0568d94 in blk_mq_run_work_fn (work=<optimized out>)
>   9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
>   10 0xc024cff4 in worker_thread (__worker=0xec6d4640)
> 
> Digging in kgdb, it could be found that, though bfqq looked fine,
> bfqq->bic had been freed.
> 
> Through further digging, I postulated that perhaps it is illegal to
> access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
> because the "bic" can be freed at some point in time after this call
> is made.  I confirmed that there certainly were cases where the exact
> crashing code path would access the "bic" after bfq_exit_icq() had
> been called.  Sspecifically I set the "bfqq->bic" to (void *)0x7 and
> saw that the bic was 0x7 at the time of the crash.
> 
> To understand a bit more about why this crash was fairly uncommon (I
> saw it only once in a few hundred reboots), you can see that much of
> the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
> access the ->bic anymore.  The only case it doesn't is if
> bfq_put_queue() sees a reference still held.
> 
> However, even in the case when bfqq isn't freed, the crash is still
> rare.  Why?  I tracked what happened to the "bic" after the exit
> routine.  It doesn't get freed right away.  Rather,
> put_io_context_active() eventually called put_io_context() which
> queued up freeing on a workqueue.  The freeing then actually happened
> later than that through call_rcu().  Despite all these delays, some
> extra debugging showed that all the hoops could be jumped through in
> time and the memory could be freed causing the original crash.  Phew!
> 
> To make a long story short, assuming it truly is illegal to access an
> icq after the "exit_icq" callback is finished, this patch is needed.
> 
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> Most of the testing of this was done on the Chrome OS 4.19 kernel with
> BFQ backported (thanks to Paolo's help).  I did manage to reproduce a
> crash on mainline Linux (v5.2-rc6) though.
> 
> To see some of the techniques used to debug this, see
> <https://crrev.com/c/1679134> and <https://crrev.com/c/1681258/1>.
> 
> I'll also note that on linuxnext (next-20190627) I saw some other
> use-after-frees that seemed related to BFQ but haven't had time to
> debug.  They seemed unrelated.

Applied for 5.3, but I marked it for stable as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2019-06-28 13:45 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-28  4:44 [PATCH] block, bfq: NULL out the bic when it's no longer valid Douglas Anderson
2019-06-28  4:57 ` Guenter Roeck
2019-06-28  7:51   ` Paolo Valente
2019-06-28 13:45 ` Jens Axboe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).