All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@canonical.com>
To: Eric Wheeler <bcache@lists.ewheeler.net>
Cc: Jens Axboe <axboe@fb.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-block@vger.kernel.org,
	Kent Overstreet <kent.overstreet@gmail.com>,
	Christoph Hellwig <hch@infradead.org>,
	Sebastian Roesner <sroesner-kernelorg@roesner-online.de>,
	"4.2+" <stable@vger.kernel.org>, Shaohua Li <shli@fb.com>
Subject: Re: [PATCH] block: make sure big bio is splitted into at most 256 bvecs
Date: Thu, 7 Apr 2016 10:16:10 +0800	[thread overview]
Message-ID: <CACVXFVM+jk-UZ7ejDXASmRLqjn+FuNn-W4gw5=oAMifvVws-SQ@mail.gmail.com> (raw)
In-Reply-To: <alpine.LRH.2.11.1604070154410.12121@mail.ewheeler.net>

On Thu, Apr 7, 2016 at 9:56 AM, Eric Wheeler <bcache@lists.ewheeler.net> wrote:
> On Thu, 7 Apr 2016, Ming Lei wrote:
>
>> On Thu, Apr 7, 2016 at 9:36 AM, Eric Wheeler <bcache@lists.ewheeler.net> wrote:
>> > On Wed, 6 Apr 2016, Ming Lei wrote:
>> >
>> >> After arbitrary bio size is supported, the incoming bio may
>> >> be very big. We have to split the bio into small bios so that
>> >> each holds at most BIO_MAX_PAGES bvecs for safety reason, such
>> >> as bio_clone().
>> >>
>> >> This patch fixes the following kernel crash:
>> >>
>> >> > [  172.660142] BUG: unable to handle kernel NULL pointer dereference at
>> >> > 0000000000000028
>> >> > [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
>> >> > [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
>> >> > [  172.660399] Oops: 0000 [#1] SMP
>> >> > [...]
>> >> > [  172.664780] Call Trace:
>> >> > [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
>> >> > [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
>> >> > [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
>> >> > [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
>> >> > [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
>> >> > [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
>> >> > [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
>> >>
>> >> Fixes: 54efd50(block: make generic_make_request handle arbitrarily sized bios)
>> >> Reported-by: Sebastian Roesner <sroesner-kernelorg@roesner-online.de>
>> >> Reported-by: Eric Wheeler <bcache@lists.ewheeler.net>
>> >> Cc: stable@vger.kernel.org (4.2+)
>> >
>> > Ming Lei,
>> >
>> > get_max_io_size doesn't appear until 4.5 based on a quick LXR check so we
>> > won't see it in stable I don't think.
>> >
>> > It would be nice to see this fixed in 4.1 (if affected there).  Is there
>>
>> The issue should be introduced to v4.3 via 54efd50
>>
>> > another place this could be applied to be a bit more backward compatible?
>>
>> The v1 needn't change to get_max_io_size(), and it should be simple enough
>> to backport to previous stables, please try it:
>>
>> http://marc.info/?l=linux-block&m=145991422422927&w=2
>
> V1 changes blk_bio_segment_split() which doesn't exist until 4.3.
>
> How might you port this to v4.1.y?

Can you see the issue with v4.1?

You mentioned there are three reports:

> [2014-02-04] https://bugzilla.redhat.com/show_bug.cgi?id=1061339
> [2016-01-13] http://www.spinics.net/lists/linux-bcache/msg03335.html
             https://bugzilla.kernel.org/show_bug.cgi?id=110771
> [2016-03-25] http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3607 [this thread]
             https://bugzilla.kernel.org/show_bug.cgi?id=114871

The first one has been fixed by '8423ae3 block: Fix cloning of
discard/write same bios', as mentioned in bugzilla.

The other two are reported on v4.4 and v4.5.

If you look at the patch description, it is just needed for 4.3+.

Or I am wrong?


Thanks,

>
> --
> Eric Wheeler
>
>
>>
>> Thanks,
>>
>> >
>> >> Cc: Shaohua Li <shli@fb.com>
>> >> Signed-off-by: Ming Lei <ming.lei@canonical.com>
>> >> ---
>> >> I can reproduce the issue and verify the fix by the following approach:
>> >>       - create one raid1 over two virtio-blk
>> >>       - build bcache device over the above raid1 and another cache device.
>> >>       - set cache mode as writeback
>> >>       - run random write over ext4 on the bcache device
>> >>       - then the crash can be triggered
>> >>
>> >>  block/blk-merge.c | 12 ++++++++++++
>> >>  1 file changed, 12 insertions(+)
>> >>
>> >> diff --git a/block/blk-merge.c b/block/blk-merge.c
>> >> index 2613531..9a8651f 100644
>> >> --- a/block/blk-merge.c
>> >> +++ b/block/blk-merge.c
>> >> @@ -79,6 +79,18 @@ static inline unsigned get_max_io_size(struct request_queue *q,
>> >>       /* aligned to logical block size */
>> >>       sectors &= ~(mask >> 9);
>> >>
>> >> +     /*
>> >> +      * With arbitrary bio size, the incoming bio may be very big.
>> >> +      * We have to split the bio into small bios so that each holds
>> >> +      * at most BIO_MAX_PAGES bvecs for safety reason, such as
>> >> +      * bio_clone().
>> >> +      *
>> >> +      * In the future, the limit might be converted into per-queue
>> >> +      * flag.
>> >> +      */
>> >> +     sectors = min_t(unsigned, sectors, BIO_MAX_PAGES <<
>> >> +                     (PAGE_CACHE_SHIFT - 9));
>> >> +
>> >>       return sectors;
>> >>  }
>> >>
>> >> --
>> >> 1.9.1
>> >>
>> >>
>> > --
>> > To unsubscribe from this list: send the line "unsubscribe linux-block" in
>> > the body of a message to majordomo@vger.kernel.org
>> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>

  reply	other threads:[~2016-04-07  2:16 UTC|newest]

Thread overview: 33+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-05 17:44 [PATCH] block: make sure big bio is splitted into at most 256 bvecs Ming Lei
2016-04-05 18:27 ` Shaohua Li
2016-04-06  0:27   ` Kent Overstreet
2016-04-06  0:30     ` Shaohua Li
2016-04-06  0:36       ` Kent Overstreet
2016-04-06  0:41         ` Shaohua Li
2016-04-06  0:45           ` Kent Overstreet
2016-04-06  0:59             ` Shaohua Li
2016-04-06  1:05               ` Ming Lei
2016-04-06  0:47   ` Ming Lei
2016-04-06  1:04     ` Shaohua Li
2016-04-06  1:11       ` Ming Lei
2016-04-06  0:30 ` Kent Overstreet
2016-04-06  0:59   ` Ming Lei
2016-04-06  1:10     ` Kent Overstreet
2016-04-06  1:20       ` Ming Lei
2016-04-06  1:28         ` Kent Overstreet
2016-04-06  1:51           ` Ming Lei
2016-04-06  2:22             ` Kent Overstreet
2016-04-06  2:30               ` Ming Lei
2016-04-06  2:34                 ` Kent Overstreet
2016-04-06  2:37                   ` Ming Lei
2016-04-06  2:40                     ` Kent Overstreet
2016-04-06  2:51                       ` Ming Lei
2016-04-06  2:58                         ` Kent Overstreet
2016-04-06  1:02 ` Ming Lei
2016-04-07  1:48   ` Eric Wheeler
2016-04-07  1:36 ` Eric Wheeler
2016-04-07  1:49   ` Ming Lei
2016-04-07  1:56     ` Eric Wheeler
2016-04-07  2:16       ` Ming Lei [this message]
2016-04-07 23:29         ` Eric Wheeler
2016-04-08  0:21           ` Ming Lei

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CACVXFVM+jk-UZ7ejDXASmRLqjn+FuNn-W4gw5=oAMifvVws-SQ@mail.gmail.com' \
    --to=ming.lei@canonical.com \
    --cc=axboe@fb.com \
    --cc=bcache@lists.ewheeler.net \
    --cc=hch@infradead.org \
    --cc=kent.overstreet@gmail.com \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=shli@fb.com \
    --cc=sroesner-kernelorg@roesner-online.de \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.