linux-bcache.vger.kernel.org archive mirror
From: Coly Li <colyli@suse.de>
To: Ken Raeburn <raeburn@redhat.com>
Cc: linux-bcache@vger.kernel.org
Subject: Re: bcache integer overflow for large devices w/small io_opt
Date: Sun, 12 Jul 2020 17:31:29 +0800
Message-ID: <26ce0472-5727-0601-ed9e-ea9474f39210@suse.de>
In-Reply-To: <1de4ebce-c62f-e357-9827-3fa263f6b36a@redhat.com>

On 2020/7/12 11:06, Ken Raeburn wrote:
> On 7/11/20 11:28 AM, Coly Li wrote:
> 
>> On 2020/7/11 06:47, Ken Raeburn wrote:
>>> The long version is written up at
>>> https://bugzilla.redhat.com/show_bug.cgi?id=1783075 but the short
>>> version:
>>>
>>> There are devices out there which set q->limits.io_opt to small values
>>> like 4096 bytes, causing bcache to use that for the stripe size, but the
>>> device size could still be large enough that the computed stripe count
>>> is 2**32 or more. That value gets stuffed into a 32-bit (unsigned int)
>>> field, throwing away the high bits, and then that truncated value is
>>> range-checked and used. This can result in memory corruption or faults
>>> in some cases.
>>>
>>> The problem was brought up with us on Red Hat's VDO driver team by a
>>> bcache user on a 4.17.8 kernel, has been demonstrated in the Fedora
>>> 5.3.15-300.fc31 kernel, and by inspection appears to be present in
>>> Linus's tree as of this morning.
>>>
>>> The easy fix would be to keep the quotient in a 64-bit variable until
>>> it's validated, but that would simply limit the size of such devices as
>>> bcache backing storage (in this case, limiting VDO volumes to under 8
>>> TB). Is there a way to still be able to use larger devices? Perhaps
>>> scale up the stripe size from io_opt to the point where the stripe count
>>> falls in the allowed range?
>>>
>>> Ken Raeburn
>>> (Red Hat VDO driver developer)
>>>
>> We cannot extend the bit width of nr_stripes, because
>> d->full_dirty_stripes memory allocation depends on it.
>>
>> For the 18T volume, and stripe_size is 4KB, there are 4831838208
>> stripes. Then size of d->full_dirty_stripes will be
>> 4831838208*sizeof(atomic_t) > 140GB. This is too large for kernel memory
>> allocation.
> I didn't intend for nr_stripes to be made 64 bits. Just a temporary
> variable for the purposes of validation, to ensure that you won't be
> losing high bits when coercing to unsigned int.
>> Would it help if we had an option in bcache-tools to specify a
>> stripe_size number to override limit->io_opt? Then you could specify a
>> larger stripe size, which may avoid the nr_stripes overflow.
>>
>> Thanks for the report.
>>
>> Coly Li
>>
> Yes, I think letting the user choose a stripe size would be a good way
> to address the problem. Of course, the driver must still defend itself
> against memory corruption anyway, if the user doesn't do so, by
> rejecting or adjusting the values. But whereas I wouldn't recommend the
> driver alter the stripe size by more than necessary to make the stripe
> count fit, the user can make it as big as they want, if they want to
> bring the memory requirement down further, or if they've done some
> performance measurements of different configurations, or they know
> something interesting about their workload's access patterns, etc.

Copied. Correct me if I am wrong; I will make two fixes to solve the problem:
1. A quick fix to avoid the kernel panic reported in the bugzilla.
2. Permit people to set their own stripe_size to override the default
one, so that bcache may continue to work on devices with a small
limit->io_opt.

Coly Li
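
For illustration, the truncation described in this thread, and the
"validate in 64 bits, then narrow" quick fix, can be sketched in
userspace C. This is a minimal sketch and not the driver's code: the
function names and SKETCH_MAX_STRIPES are hypothetical, and the cap of
INT32_MAX is an assumption (it happens to match the "under 8 TB" figure
for 4096-byte stripes, since 2^31 * 4 KiB = 8 TiB).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed cap on the stripe count for this sketch; not taken from the thread. */
#define SKETCH_MAX_STRIPES ((uint64_t)INT32_MAX)

/* Round-up division on 64-bit values; d must be non-zero. */
static uint64_t div_round_up_u64(uint64_t n, uint64_t d)
{
	return n / d + (n % d != 0);
}

/*
 * Problem pattern: the 64-bit quotient is narrowed to 32 bits before any
 * range check, so the high bits are silently discarded.
 * stripe_bytes must be non-zero.
 */
static uint32_t stripe_count_truncated(uint64_t dev_bytes, uint32_t stripe_bytes)
{
	return (uint32_t)div_round_up_u64(dev_bytes, stripe_bytes);
}

/*
 * Checked pattern: keep the quotient in a 64-bit temporary, range-check it,
 * and only then narrow. Returns false (leaving *out untouched) on rejection.
 */
static bool stripe_count_checked(uint64_t dev_bytes, uint32_t stripe_bytes,
				 uint32_t *out)
{
	uint64_t n;

	if (stripe_bytes == 0)
		return false;

	n = div_round_up_u64(dev_bytes, stripe_bytes);
	if (n == 0 || n > SKETCH_MAX_STRIPES)
		return false;

	*out = (uint32_t)n;
	return true;
}
```

With an 18 TiB device and 4096-byte stripes, the true count is
4831838208 (2^32 + 2^29). The truncated version yields 536870912, which
would pass a later range check even though it is wrong; the checked
version rejects the device instead.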

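The other idea raised in the thread, scaling the stripe size up from
io_opt until the stripe count fits, could look roughly like the sketch
below. Doubling is an assumption made here so that the result stays a
multiple of io_opt; the thread does not specify a scaling rule. The
function name and SKETCH_MAX_STRIPES are hypothetical.

```c
#include <stdint.h>

/* Assumed cap on the stripe count for this sketch; not taken from the thread. */
#define SKETCH_MAX_STRIPES ((uint64_t)INT32_MAX)

/*
 * Start from io_opt_bytes and double the stripe size until the rounded-up
 * stripe count fits under the cap. Returns the chosen stripe size in bytes,
 * or 0 if io_opt_bytes is 0 or no size fits without overflowing.
 */
static uint64_t scale_stripe_bytes(uint64_t dev_bytes, uint64_t io_opt_bytes)
{
	uint64_t stripe = io_opt_bytes;

	if (stripe == 0)
		return 0;

	while (dev_bytes / stripe + (dev_bytes % stripe != 0) > SKETCH_MAX_STRIPES) {
		if (stripe > UINT64_MAX / 2)
			return 0;	/* doubling again would overflow */
		stripe *= 2;
	}
	return stripe;
}
```

For an 18 TiB device with io_opt of 4096 bytes, this settles on
16384-byte stripes (1207959552 stripes). A user-supplied stripe_size, as
proposed above, could still choose something larger.
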
Thread overview: 4+ messages
2020-07-10 22:47 bcache integer overflow for large devices w/small io_opt Ken Raeburn
2020-07-11 15:28 ` Coly Li
2020-07-12  3:06   ` Ken Raeburn
2020-07-12  9:31     ` Coly Li [this message]
