All of lore.kernel.org
 help / color / mirror / Atom feed
From: Guenter Roeck <linux@roeck-us.net>
To: Paolo Valente <paolo.valente@linaro.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-block <linux-block@vger.kernel.org>,
	kernel list <linux-kernel@vger.kernel.org>,
	Hsin-Yi Wang <hsinyi@google.com>,
	Nicolas Boichat <drinkcat@chromium.org>,
	Doug Anderson <dianders@chromium.org>
Subject: Re: [PATCH] bfq: Check if bfqq is NULL in bfq_insert_request
Date: Wed, 31 Jul 2019 06:20:20 -0700	[thread overview]
Message-ID: <ed9d530d-a5d5-44fb-1e9d-1c9f562f0ce3@roeck-us.net> (raw)
In-Reply-To: <5162CB3B-39B1-4348-AEBD-2197330A3BA3@linaro.org>

On 7/31/19 3:11 AM, Paolo Valente wrote:
> 
> 
>> Il giorno 30 lug 2019, alle ore 15:35, Guenter Roeck <linux@roeck-us.net> ha scritto:
>>
>> On 7/30/19 1:55 AM, Paolo Valente wrote:
>>> Hi Guenter,
>>> sorry for the delay (Dolomiti's fault).
>>> I didn't consider that rq->elv-icq might have been NULL also
>>> because of OOM.  Thanks for spotting this issue.
>>> As for the other places where the return value of bfq_init_rq is used,
>>> unfortunately I think they matter too.  Those other places are related
>>> to request merging, which is the alternative destiny of requests
>>> (instead of being just inserted).  But, regardless of whether a
>>> request is to be merged or inserted, that request may be destined to a
>>> bfq_queue (possibly merged with a request already in a bfq_queue), and
>>> a NULL return value by bfq_init_rq leads to a crash.  I guess you can
>>> reproduce your failure also for the merge case, by generating
>>> sequential, direct I/O with queue depth > 1, and of course by enabling
>>> failslab.
>> My assumption was that requests would only be merged if they are associated
>> with the same io context. In that case, that IO context isn't reallocated
>> with ioc_create_icq() but reused, and icq would thus never be NULL.
>> I guess that assumption was wrong.
> 
> I don't remember such a filtering.  I had a look again, but didn't
> find anything relevant.  However, more competent people see these

Me not either, when I had a closer look yesterday. My conclusion was
that your analysis is correct.

Thanks,
Guenter

> emails.  Maybe someone can give us better advice.  Otherwise, to stay
> on the safe side, I'd propose to handle any possible NULL return.
> 
> And I'll manage it, as per your request.
> 
> Thanks,
> Paolo
> 
>>
>>> If my considerations above are correct, do you want to propose a
>>> complete fix yourself?
>>
>> Sure, I'll send an updated patch.
>>
>> Thanks,
>> Guenter
>>
>>> Thanks,
>>> Paolo
>>>> Il giorno 28 lug 2019, alle ore 17:19, Guenter Roeck <linux@roeck-us.net> ha scritto:
>>>>
>>>> ping ... just in case this patch got lost in Paolo's queue.
>>>>
>>>> Guenter
>>>>
>>>> On Mon, Jul 22, 2019 at 10:30:48AM -0700, Guenter Roeck wrote:
>>>>> In bfq_insert_request(), bfqq is initialized with:
>>>>> 	bfqq = bfq_init_rq(rq);
>>>>> In bfq_init_rq(), we find:
>>>>> 	if (unlikely(!rq->elv.icq))
>>>>> 		return NULL;
>>>>> Indeed, rq->elv.icq can be NULL if the memory allocation in
>>>>> create_task_io_context() failed.
>>>>>
>>>>> A comment in bfq_insert_request() suggests that bfqq is supposed to be
>>>>> non-NULL if 'at_head || blk_rq_is_passthrough(rq)' is false. Yet, as
>>>>> debugging and practical experience shows, this is not the case in the
>>>>> above situation.
>>>>>
>>>>> This results in the following crash.
>>>>>
>>>>> Unable to handle kernel NULL pointer dereference
>>>>> 	at virtual address 00000000000001b0
>>>>> ...
>>>>> Call trace:
>>>>> bfq_setup_cooperator+0x44/0x134
>>>>> bfq_insert_requests+0x10c/0x630
>>>>> blk_mq_sched_insert_requests+0x60/0xb4
>>>>> blk_mq_flush_plug_list+0x290/0x2d4
>>>>> blk_flush_plug_list+0xe0/0x230
>>>>> blk_finish_plug+0x30/0x40
>>>>> generic_writepages+0x60/0x94
>>>>> blkdev_writepages+0x24/0x30
>>>>> do_writepages+0x74/0xac
>>>>> __filemap_fdatawrite_range+0x94/0xc8
>>>>> file_write_and_wait_range+0x44/0xa0
>>>>> blkdev_fsync+0x38/0x68
>>>>> vfs_fsync_range+0x68/0x80
>>>>> do_fsync+0x44/0x80
>>>>>
>>>>> The problem is relatively easy to reproduce by running an image with
>>>>> failslab enabled, such as with:
>>>>>
>>>>> cd /sys/kernel/debug/failslab
>>>>> echo 10 > probability
>>>>> echo 300 > times
>>>>>
>>>>> Avoid the problem by checking if bfqq is NULL before using it. With the
>>>>> NULL check in place, requests with missing io context are queued
>>>>> immediately, and the crash is no longer seen.
>>>>>
>>>>> Fixes: 18e5a57d79878 ("block, bfq: postpone rq preparation to insert or merge")
>>>>> Reported-by: Hsin-Yi Wang  <hsinyi@google.com>
>>>>> Cc: Hsin-Yi Wang <hsinyi@google.com>
>>>>> Cc: Nicolas Boichat <drinkcat@chromium.org>
>>>>> Cc: Doug Anderson <dianders@chromium.org>
>>>>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>>>>> ---
>>>>> block/bfq-iosched.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
>>>>> index 72860325245a..56f3f4227010 100644
>>>>> --- a/block/bfq-iosched.c
>>>>> +++ b/block/bfq-iosched.c
>>>>> @@ -5417,7 +5417,7 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>>>>>
>>>>> 	spin_lock_irq(&bfqd->lock);
>>>>> 	bfqq = bfq_init_rq(rq);
>>>>> -	if (at_head || blk_rq_is_passthrough(rq)) {
>>>>> +	if (!bfqq || at_head || blk_rq_is_passthrough(rq)) {
>>>>> 		if (at_head)
>>>>> 			list_add(&rq->queuelist, &bfqd->dispatch);
>>>>> 		else
>>>>> -- 
>>>>> 2.7.4
>>>>>
>>
> 
> 


  reply	other threads:[~2019-07-31 13:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-22 17:30 [PATCH] bfq: Check if bfqq is NULL in bfq_insert_request Guenter Roeck
2019-07-22 20:29 ` Guenter Roeck
2019-07-22 23:36   ` Bob Liu
2019-07-23  0:06     ` Guenter Roeck
2019-07-28 15:19 ` Guenter Roeck
2019-07-30  8:55   ` Paolo Valente
2019-07-30 13:35     ` Guenter Roeck
2019-07-31 10:11       ` Paolo Valente
2019-07-31 13:20         ` Guenter Roeck [this message]
2019-07-30 17:06     ` Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ed9d530d-a5d5-44fb-1e9d-1c9f562f0ce3@roeck-us.net \
    --to=linux@roeck-us.net \
    --cc=axboe@kernel.dk \
    --cc=dianders@chromium.org \
    --cc=drinkcat@chromium.org \
    --cc=hsinyi@google.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=paolo.valente@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.