linux-kernel.vger.kernel.org archive mirror
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Dave Chinner <david@fromorbit.com>
Cc: Jane Chu <jane.chu@oracle.com>,
	Ruan Shiyang <ruansy.fnst@cn.fujitsu.com>,
	linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, linux-raid@vger.kernel.org,
	dan.j.williams@intel.com, hch@lst.de, song@kernel.org,
	rgoldwyn@suse.de, qi.fuli@fujitsu.com, y-goto@fujitsu.com
Subject: Re: [RFC PATCH v2 0/6] fsdax: introduce fs query to support reflink
Date: Tue, 15 Dec 2020 18:46:18 -0800
Message-ID: <20201216024618.GC6918@magnolia>
In-Reply-To: <20201215231022.GL632069@dread.disaster.area>

On Wed, Dec 16, 2020 at 10:10:22AM +1100, Dave Chinner wrote:
> On Tue, Dec 15, 2020 at 11:05:07AM -0800, Jane Chu wrote:
> > On 12/15/2020 3:58 AM, Ruan Shiyang wrote:
> > > Hi Jane
> > > 
> > > On 2020/12/15 4:58 AM, Jane Chu wrote:
> > > > Hi, Shiyang,
> > > > 
> > > > On 11/22/2020 4:41 PM, Shiyang Ruan wrote:
> > > > > > This patchset is an attempt to resolve the problem of tracking shared
> > > > > > pages for fsdax.
> > > > > 
> > > > > Change from v1:
> > > > > >    - Introduce ->block_lost() for block device
> > > > >    - Support mapped device
> > > > >    - Add 'not available' warning for realtime device in XFS
> > > > >    - Rebased to v5.10-rc1
> > > > > 
> > > > > > This patchset moves owner tracking from dax_associate_entry() to the
> > > > > > pmem device by introducing a ->memory_failure() interface in struct
> > > > > > pagemap.  The interface is called by memory_failure() in mm and
> > > > > > implemented by the pmem device.  The pmem device then calls its
> > > > > > ->block_lost() to find the filesystem in which the damaged page is
> > > > > > located, and calls ->storage_lost() to track the files or metadata
> > > > > > associated with this page.
> > > > > Finally we are able to try to fix the damaged data in filesystem and do
> > > > 
> > > > Does that mean clearing poison? If so, would you mind elaborating on
> > > > which change specifically does that?
> > > 
> > > Data recovery for the filesystem (or pmem device) is not implemented in
> > > this patchset...  Here I only trigger the handler for the files sharing
> > > the corrupted page.
> > 
> > Thanks! That confirms my understanding.
> > 
> > With the framework provided by the patchset, how do you envision it
> > easing/simplifying poison recovery from the user's perspective?
> 
> At the moment, I'd say no change whatsoever. The behaviour is
> necessary so that we can kill whatever user application maps
> multiply-shared physical blocks if there's a memory error. The
> recovery method from that is unchanged. The only advantage may be
> that the filesystem (if rmap enabled) can tell you the exact file
> and offset into the file where data was corrupted.
> 
> However, it can be worse, too: it may also now completely shut down
> the filesystem if the filesystem discovers the error is in metadata
> rather than user data. That's much more complex to recover from, and
> right now will require downtime to take the filesystem offline and
> run fsck to correct the error. That may trash whatever the metadata
> that can't be recovered points to, so you still have a user data
> recovery process to perform after this...

...though for the future future I'd like to bypass the default behaviors
if there's somebody watching the sb notification that will also kick off
the appropriate repair activities.  The xfs auto-repair parts are coming
along nicely.  Dunno about userspace, though I figure if we can do
userspace page faults then some people could probably do autorepair
too.

--D

> > And how does it help in dealing with page faults on a poisoned
> > dax page?
> 
> It doesn't. If the page is poisoned, the same behaviour will occur
> as does now. This is simply error reporting infrastructure, not
> error handling.
> 
> Future work might change how we correct the faults found in the
> storage, but I think the user visible behaviour is going to be "kill
> apps mapping corrupted data" for a long time yet....
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
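
The reporting chain described in the quoted cover letter (memory_failure()
-> pagemap ->memory_failure() -> ->block_lost() -> ->storage_lost()) can be
pictured with a small userspace model in C.  This is only a sketch and not
the patchset's code: every struct, field, and function name below is
hypothetical, chosen to echo the hook names used in the thread, and the
real kernel signatures may well differ.  It also assumes one page maps to
exactly one block.  In line with the point made above, it only reports
owners (file and offset via a reverse map, or a shutdown when metadata is
hit); it repairs nothing.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace model of the reporting chain; not kernel types. */

struct rmap_rec {
	unsigned long start;	/* first filesystem block covered */
	unsigned long len;	/* number of blocks covered */
	unsigned long inode;	/* owning inode, or 0 for fs metadata */
	unsigned long offset;	/* file offset (in blocks) of 'start' */
};

struct fs_model {
	const struct rmap_rec *rmap;	/* reverse mapping: block -> owners */
	size_t nr_recs;
	int shut_down;			/* set when metadata is hit */
};

struct bdev_model {
	unsigned long fs_start;	/* device block where the fs begins */
	struct fs_model *fs;
};

struct pagemap_model {
	unsigned long first_pfn;	/* first page frame backed by the device */
	struct bdev_model *bdev;
};

/*
 * Models ->storage_lost(): report every owner of the lost block.
 * Returns the number of file owners; a shared (reflinked) block can
 * have more than one.  A metadata hit marks the fs as shut down.
 */
static int fs_storage_lost(struct fs_model *fs, unsigned long block)
{
	int file_owners = 0;

	for (size_t i = 0; i < fs->nr_recs; i++) {
		const struct rmap_rec *r = &fs->rmap[i];

		if (block < r->start || block >= r->start + r->len)
			continue;
		if (r->inode == 0) {
			fs->shut_down = 1;
			continue;
		}
		printf("inode %lu: data lost at file block %lu\n",
		       r->inode, r->offset + (block - r->start));
		file_owners++;
	}
	return file_owners;
}

/* Models ->block_lost(): translate a device block to a filesystem block. */
static int bdev_block_lost(struct bdev_model *bdev, unsigned long dev_block)
{
	if (!bdev->fs || dev_block < bdev->fs_start)
		return 0;	/* no filesystem covers this block */
	return fs_storage_lost(bdev->fs, dev_block - bdev->fs_start);
}

/* Models the pagemap ->memory_failure() hook called from mm. */
static int pagemap_memory_failure(struct pagemap_model *pgmap,
				  unsigned long pfn)
{
	if (pfn < pgmap->first_pfn)
		return 0;	/* page is not backed by this device */
	return bdev_block_lost(pgmap->bdev, pfn - pgmap->first_pfn);
}
```

A caller would build one pagemap_model per device and pass the failing page
frame number to pagemap_memory_failure(); the return value says how many
files share the bad block, and fs->shut_down says whether metadata was hit.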


Thread overview: 15+ messages
2020-11-23  0:41 [RFC PATCH v2 0/6] fsdax: introduce fs query to support reflink Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 1/6] fs: introduce ->storage_lost() for memory-failure Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 2/6] blk: introduce ->block_lost() to handle memory-failure Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 3/6] md: implement ->block_lost() for memory-failure Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 4/6] pagemap: introduce ->memory_failure() Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 5/6] mm, fsdax: refactor dax handler in memory-failure Shiyang Ruan
2020-11-23  0:41 ` [RFC PATCH v2 6/6] fsdax: remove useless (dis)associate functions Shiyang Ruan
2020-11-29 22:47 ` [RFC PATCH v2 0/6] fsdax: introduce fs query to support reflink Dave Chinner
2020-12-02  7:12   ` Ruan Shiyang
2020-12-06 22:55     ` Dave Chinner
2020-12-14 20:58 ` Jane Chu
2020-12-15 11:58   ` Ruan Shiyang
2020-12-15 19:05     ` Jane Chu
2020-12-15 23:10       ` Dave Chinner
2020-12-16  2:46         ` Darrick J. Wong [this message]
