dm-devel.redhat.com archive mirror
 help / color / mirror / Atom feed
From: Sami Tolvanen <samitolvanen@google.com>
To: Milan Broz <gmazyland@gmail.com>
Cc: device-mapper development <dm-devel@redhat.com>,
	Mikulas Patocka <mpatocka@redhat.com>
Subject: Re: [dm-devel] [PATCH] dm-verity: Fix FEC for RS roots non-aligned to block size
Date: Tue, 23 Feb 2021 08:51:01 -0800	[thread overview]
Message-ID: <CABCJKufZjA4G4-ncU_+8aS_SS9wU5frFyCThxVg2i-rBTu-KkQ@mail.gmail.com> (raw)
In-Reply-To: <20210222211528.848441-1-gmazyland@gmail.com>

Hi Milan,

On Mon, Feb 22, 2021 at 1:15 PM Milan Broz <gmazyland@gmail.com> wrote:
>
> Optional Forward Error Correction (FEC) code in dm-verity uses
> Reed-Solomon code and should support roots from 2 to 24.
>
> The error correction parity bytes (of roots lengths per RS block) are stored
> on a separate device in sequence without any padding.
>
> Currently, to access FEC device, the dm-verity-fec code uses dm-bufio client
> with block size set to verity data block (usually 4096 or 512 bytes).
>
> Because this block size is not divisible by some (most!) of the roots
> supported lengths, data repair cannot work for partially stored
> parity bytes.

Ah, good catch!

> This patch changes FEC device dm-bufio block size to "roots << SECTOR_SHIFT"
> where we can be sure that the full parity data is always available.
> (There cannot be partial FEC blocks because parity must cover whole sectors.)
>
> Because the optional FEC starting offset could be unaligned to this
> new block size, we have to use dm_bufio_set_sector_offset() to configure it.
>
> The problem is easily reproducible using veritysetup,
> here for example for roots=13:
>
>   # create verity device with RS FEC
>   dd if=/dev/urandom of=data.img bs=4096 count=8 status=none
>   veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=13 | awk '/^Root hash/{ print $3 }' >roothash
>
>   # create an erasure that should be always repairable with this roots setting
>   dd if=/dev/zero of=data.img conv=notrunc bs=1 count=8 seek=4088 status=none
>
>   # try to read it through dm-verity
>   veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=13 $(cat roothash)
>   dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer
>   # wait for possible recursive recovery in kernel
>   udevadm settle
>   veritysetup close test
>
> Without it, FEC code usually ends on unrecoverable failure in RS decoder:
>   device-mapper: verity-fec: 7:1: FEC 0: failed to correct: -74
>   ...
>
> With the patch, errors are properly repaired.
>   device-mapper: verity-fec: 7:1: FEC 0: corrected 8 errors
>   ...
>
> This problem is present in all kernels since the FEC code introduction (kernel 4.5).
>
> AFAIK the problem is not visible in Android  ecosystem because it always
> use default RS roots=2.

That's correct, Android always uses the default value to minimize
space overhead, which is why we never ran into this.

>
> Signed-off-by: Milan Broz <gmazyland@gmail.com>
> ---
>  drivers/md/dm-verity-fec.c | 23 +++++++++++++----------
>  1 file changed, 13 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/md/dm-verity-fec.c b/drivers/md/dm-verity-fec.c
> index fb41b4f23c48..be170581eb69 100644
> --- a/drivers/md/dm-verity-fec.c
> +++ b/drivers/md/dm-verity-fec.c
> @@ -61,18 +61,18 @@ static int fec_decode_rs8(struct dm_verity *v, struct dm_verity_fec_io *fio,
>  static u8 *fec_read_parity(struct dm_verity *v, u64 rsb, int index,
>                            unsigned *offset, struct dm_buffer **buf)
>  {
> -       u64 position, block;
> +       u64 position, block, rem;
>         u8 *res;
>
>         position = (index + rsb) * v->fec->roots;
> -       block = position >> v->data_dev_block_bits;
> -       *offset = (unsigned)(position - (block << v->data_dev_block_bits));
> +       block = div64_u64_rem(position, v->fec->roots << SECTOR_SHIFT, &rem);
> +       *offset = (unsigned)rem;
>
> -       res = dm_bufio_read(v->fec->bufio, v->fec->start + block, buf);
> +       res = dm_bufio_read(v->fec->bufio, block, buf);
>         if (IS_ERR(res)) {
>                 DMERR("%s: FEC %llu: parity read failed (block %llu): %ld",
>                       v->data_dev->name, (unsigned long long)rsb,
> -                     (unsigned long long)(v->fec->start + block),
> +                     (unsigned long long)(block),

Nit: parentheses not needed around block.

>                       PTR_ERR(res));
>                 *buf = NULL;
>         }
> @@ -155,7 +155,7 @@ static int fec_decode_bufs(struct dm_verity *v, struct dm_verity_fec_io *fio,
>
>                 /* read the next block when we run out of parity bytes */
>                 offset += v->fec->roots;
> -               if (offset >= 1 << v->data_dev_block_bits) {
> +               if (offset >= (v->fec->roots << SECTOR_SHIFT)) {

Same here, but I suppose it might help with readability.

>                         dm_bufio_release(buf);
>
>                         par = fec_read_parity(v, rsb, block_offset, &offset, &buf);
> @@ -674,7 +674,7 @@ int verity_fec_ctr(struct dm_verity *v)
>  {
>         struct dm_verity_fec *f = v->fec;
>         struct dm_target *ti = v->ti;
> -       u64 hash_blocks;
> +       u64 hash_blocks, start_blocks, fec_blocks;
>         int ret;
>
>         if (!verity_fec_is_enabled(v)) {
> @@ -744,15 +744,18 @@ int verity_fec_ctr(struct dm_verity *v)
>         }
>
>         f->bufio = dm_bufio_client_create(f->dev->bdev,
> -                                         1 << v->data_dev_block_bits,
> +                                         f->roots << SECTOR_SHIFT,
>                                           1, 0, NULL, NULL);
>         if (IS_ERR(f->bufio)) {
>                 ti->error = "Cannot initialize FEC bufio client";
>                 return PTR_ERR(f->bufio);
>         }
>
> -       if (dm_bufio_get_device_size(f->bufio) <
> -           ((f->start + f->rounds * f->roots) >> v->data_dev_block_bits)) {
> +       dm_bufio_set_sector_offset(f->bufio, f->start << v->data_dev_block_bits >> SECTOR_SHIFT);
> +
> +       start_blocks = div64_u64(f->start << v->data_dev_block_bits, v->fec->roots << SECTOR_SHIFT);
> +       fec_blocks   = div64_u64(f->rounds * f->roots, v->fec->roots << SECTOR_SHIFT);
> +       if (dm_bufio_get_device_size(f->bufio) < (start_blocks + fec_blocks)) {
>                 ti->error = "FEC device is too small";
>                 return -E2BIG;
>         }

I haven't tested the code, but it looks correct to me. Thanks for fixing this!

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>

Sami

--
dm-devel mailing list
dm-devel@redhat.com
https://listman.redhat.com/mailman/listinfo/dm-devel


  parent reply	other threads:[~2021-02-23 16:52 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 21:15 [dm-devel] [PATCH] dm-verity: Fix FEC for RS roots non-aligned to block size Milan Broz
2021-02-23  3:00 ` Jérôme Carretero
2021-02-23 10:57   ` Milan Broz
2021-02-23 16:51 ` Sami Tolvanen [this message]
2021-02-23 20:21 ` [dm-devel] [PATCH 1/2] dm-bufio: subtract the number of initial sectors in dm_bufio_get_device_size Milan Broz
2021-02-23 20:21   ` [dm-devel] [PATCH 2/2] dm-verity: Fix FEC for RS roots non-aligned to block size Milan Broz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CABCJKufZjA4G4-ncU_+8aS_SS9wU5frFyCThxVg2i-rBTu-KkQ@mail.gmail.com \
    --to=samitolvanen@google.com \
    --cc=dm-devel@redhat.com \
    --cc=gmazyland@gmail.com \
    --cc=mpatocka@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).