From: Lukas Straub <lukasstraub2@web.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	Jane Chu <jane.chu@oracle.com>,
	"david@fromorbit.com" <david@fromorbit.com>,
	"djwong@kernel.org" <djwong@kernel.org>,
	"vishal.l.verma@intel.com" <vishal.l.verma@intel.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	"agk@redhat.com" <agk@redhat.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	"ira.weiny@intel.com" <ira.weiny@intel.com>,
	"willy@infradead.org" <willy@infradead.org>,
	"vgoyal@redhat.com" <vgoyal@redhat.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"nvdimm@lists.linux.dev" <nvdimm@lists.linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"linux-btrfs@vger.kernel.org" <linux-btrfs@vger.kernel.org>
Subject: Re: [dm-devel] [PATCH 0/6] dax poison recovery with RWF_RECOVERY_DATA flag
Date: Sat, 6 Nov 2021 07:41:46 +0000
Message-ID: <20211106074146.04fc36a3@gecko>
In-Reply-To: <CAPcyv4hK18DetEf9+NcDqM5y07Vp-=nhysHJ3JSnKbS-ET2ppw@mail.gmail.com>

On Tue, 2 Nov 2021 09:03:55 -0700
Dan Williams <dan.j.williams@intel.com> wrote:

> On Tue, Oct 26, 2021 at 11:50 PM Christoph Hellwig <hch@infradead.org> wrote:
> >
> > On Fri, Oct 22, 2021 at 08:52:55PM +0000, Jane Chu wrote:  
> > > Thanks - I try to be honest.  As far as I can tell, the argument
> > > about the flag is a philosophical argument between two views.
> > > One view assumes design based on perfect hardware, and media error
> > > > belongs to the category of brokenness. Another view sees media
> > > > error as a built-in hardware component and designs to include
> > > > dealing with such errors.
> >
> > No, I don't think so.  Bit errors do happen in all media, which is
> > why devices are built to handle them.  It is just the Intel-style
> > pmem interface to handle them which is completely broken.  
> 
> No, any media can report checksum / parity errors. NVME also seems to
> do a poor job with multi-bit ECC errors consumed from DRAM. There is
> nothing "pmem" or "Intel" specific here.
> 
> > > > errors in mind from start.  I guess I'm trying to articulate why
> > > > it is acceptable to add the RWF_RECOVERY_DATA flag to the
> > > > existing RWF_ flags: this way, pwritev2 remains fast on the fast path,
> > > > and its slow path (w/ error clearing) is faster than the alternative.
> > > > That alternative is one system call to clear the poison and
> > > > another system call to run the fast pwrite for recovery; what
> > > > happens if something happens in between?
> >
> > Well, my point is doing recovery from bit errors is by definition not
> > the fast path.  Which is why I'd rather keep it away from the pmem
> > read/write fast path, which also happens to be the (much more important)
> > non-pmem read/write path.  
> 
> I would expect this interface to be useful outside of pmem as a
> "failfast" or "try harder to recover" flag for reading over media
> errors.

Yeah, I think this flag could also be useful for non-RAID btrfs.

If you have an extent that is shared between multiple snapshots and
its data is corrupted (without the disk returning an I/O error), btrfs
won't be able to fix the corruption without RAID and will always return
an I/O error when accessing the affected range (due to checksum
mismatch).

Of course you could just overwrite the range in the file with good
data, but that would only fix the file you are operating on; snapshots
would still reference the corrupted data.

With this flag, a read could just return the corrupted data without an
I/O error, and a write could write directly to the on-disk data to fix
up the corruption everywhere. btrfs could also check that the newly
written data actually matches the checksum.
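
A minimal sketch of the read side described above, assuming a kernel
that accepts some recovery flag in preadv2(). No such flag is defined
in mainline, so `recovery_flag` is a hypothetical parameter whose value
would have to come from the patched kernel's headers, and `read_range`
is an illustrative helper name. The normal read is tried first; only an
EIO triggers the retry that asks for the data as-is.

```python
import errno
import os


def read_range(fd, length, offset, recovery_flag):
    """Return (data, recovered).

    recovered is True when the data came from the retry path and may
    therefore contain corrupted bytes.
    """
    try:
        # Fast path: an ordinary positional read.
        return os.pread(fd, length, offset), False
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise

    # Slow path: retry with the (hypothetical) recovery flag. A kernel
    # that does not know the flag is expected to reject the call.
    buf = bytearray(length)
    nread = os.preadv(fd, [buf], offset, recovery_flag)
    return bytes(buf[:nread]), True
```

The caller gets an explicit `recovered` marker, so data returned over a
checksum mismatch is never mistaken for verified data.
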
However, in this btrfs use case the process would still need
CAP_SYS_ADMIN or similar, since it's easy to create collisions for
crc32. Otherwise an attacker could write to a file they have no
permissions for, if that file shares an extent with one where they
have write permissions.
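
To illustrate how cheap such collisions are, here is a small sketch
that appends four bytes to arbitrary data so that its CRC-32 equals a
chosen target. It uses the CRC-32 variant from Python's zlib module
only so the result can be checked with the standard library. btrfs's
default checksum is, as far as I know, crc32c, which uses a different
polynomial but has the same structure, so the same approach should
carry over. `forge_suffix` and the sample byte strings are made up for
illustration.

```python
import zlib

POLY = 0xEDB88320  # reflected CRC-32 polynomial used by zlib


def _make_table():
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ POLY if c & 1 else c >> 1
        table.append(c)
    return table


TABLE = _make_table()
# The top byte of each table entry is unique, so it identifies the entry.
REV = {entry >> 24: index for index, entry in enumerate(TABLE)}


def forge_suffix(prefix, target_crc):
    """Return 4 bytes such that crc32(prefix + suffix) == target_crc."""
    # Walk backwards from the target state to find which table entries
    # the last four update steps must use.
    state = target_crc ^ 0xFFFFFFFF
    indices = []
    for _ in range(4):
        index = REV[state >> 24]
        indices.append(index)
        state = ((state ^ TABLE[index]) << 8) & 0xFFFFFFFF
    indices.reverse()

    # Walk forwards from the prefix state, choosing each byte so that
    # the update step selects the required table entry.
    state = zlib.crc32(prefix) ^ 0xFFFFFFFF
    suffix = bytearray()
    for index in indices:
        suffix.append(index ^ (state & 0xFF))
        state = TABLE[index] ^ (state >> 8)
    return bytes(suffix)


original = b"original extent contents"
prefix = b"attacker-chosen contents"
forged = prefix + forge_suffix(prefix, zlib.crc32(original))
```

Here `forged` differs from `original` but has the same CRC-32, with no
brute-force search involved. A checksum comparison alone therefore
cannot serve as the permission check for an in-place write.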

Regards,
Lukas Straub