From: Christoph Hellwig <hch@lst.de>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Christoph Hellwig <hch@lst.de>,
	"ruansy.fnst@fujitsu.com" <ruansy.fnst@fujitsu.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	linux-nvdimm <linux-nvdimm@lists.01.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	device-mapper development <dm-devel@redhat.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	david <david@fromorbit.com>, Alasdair Kergon <agk@redhat.com>,
	Mike Snitzer <snitzer@redhat.com>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>,
	"qi.fuli@fujitsu.com" <qi.fuli@fujitsu.com>,
	"y-goto@fujitsu.com" <y-goto@fujitsu.com>
Subject: Re: [PATCH v3 01/11] pagemap: Introduce ->memory_failure()
Date: Wed, 24 Mar 2021 18:39:35 +0100
Message-ID: <20210324173935.GB12770@lst.de>
In-Reply-To: <CAPcyv4hOrYCW=wjkxkCP+JbyD+A_Po0rW-61qQWAOm3zp_eyUQ@mail.gmail.com>

On Wed, Mar 24, 2021 at 09:37:01AM -0700, Dan Williams wrote:
> > Eww.  As I said I think the right way is that the file system (or
> > other consumer) can register a set of callbacks for opening the device.
> 
> How does that solve the problem of the driver being notified of all
> pfn failure events?

Ok, I probably just showed that I need to spend more time looking at
your proposal vs the actual code.

Don't we have a proper way for one of the nvdimm layers to own a
specific memory range, so that we can call directly into it instead of
going through a notifier?
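
To make the question concrete, here is a minimal userspace sketch of
"the owner of a range gets called directly", as opposed to a notifier
chain where every subscriber sees every event. It is a model only: all
names (range_owner, range_owner_ops, register_range_owner,
route_pfn_failure, pmem_model_*) are hypothetical and are not existing
kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct range_owner;

struct range_owner_ops {
	/* Called directly when pfns inside the owner's range fail. */
	int (*memory_failure)(struct range_owner *owner,
			      unsigned long pfn, unsigned long nr_pages);
};

struct range_owner {
	unsigned long start_pfn;
	unsigned long end_pfn;		/* inclusive */
	const struct range_owner_ops *ops;
	unsigned long failures_seen;	/* bookkeeping for the model only */
};

#define MAX_OWNERS 4
static struct range_owner *owners[MAX_OWNERS];

/* Record an owner; returns 0 on success, -1 if the table is full. */
static int register_range_owner(struct range_owner *owner)
{
	assert(owner && owner->ops && owner->ops->memory_failure);

	for (size_t i = 0; i < MAX_OWNERS; i++) {
		if (!owners[i]) {
			owners[i] = owner;
			return 0;
		}
	}
	return -1;
}

/* Route a failure to the one owner of the pfn; -1 if nobody owns it. */
static int route_pfn_failure(unsigned long pfn, unsigned long nr_pages)
{
	for (size_t i = 0; i < MAX_OWNERS; i++) {
		struct range_owner *o = owners[i];

		if (o && pfn >= o->start_pfn && pfn <= o->end_pfn)
			return o->ops->memory_failure(o, pfn, nr_pages);
	}
	return -1;
}

/* Stand-in for a driver callback: it only counts the failed pages. */
static int pmem_model_memory_failure(struct range_owner *owner,
				     unsigned long pfn,
				     unsigned long nr_pages)
{
	(void)pfn;
	owner->failures_seen += nr_pages;
	return 0;
}

static const struct range_owner_ops pmem_model_ops = {
	.memory_failure = pmem_model_memory_failure,
};
```

In this model only the owner whose range contains the failing pfn is
called, and a pfn that nobody owns is reported back to the caller
instead of being silently dropped.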

> Today pmem only finds out about the ones that are
> notified via native x86 machine check error handling via a notifier
> (yes "firmware-first" error handling fails to do the right thing for
> the pmem driver),

Did any kind of firmware-first error handling ever get anything
right?  I wish people had learned that by now.

> or the ones that are eventually reported via address
> range scrub, but only for the nvdimms that implement range scrubbing.
> memory_failure() seems a reasonable catch all point to route pfn
> failure events, in an arch independent way, to interested drivers.

Yeah.

> I'm fine swapping out dax_device blocking_notifier chains for your
> proposal, but that does not address all the proposed reworks in my
> list which are:
> 
> - delete "drivers/acpi/nfit/mce.c"
> 
> - teach memory_failure() to be able to communicate range failure
> 
> - enable memory_failure() to defer to a filesystem that can say
> "critical metadata is impacted, no point in trying to do file-by-file
> isolation, bring the whole fs down".

This all sounds sensible.
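
As an illustration of the last two items in the list (range failure
and letting the filesystem decide), here is a minimal userspace
sketch. It assumes the filesystem can tell which of its extents hold
metadata; the types and names (fs_extent_model, fs_failure_action,
fs_model_corrupted_range) are hypothetical and do not come from the
patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a filesystem's answer to a range failure. */
enum fs_failure_action {
	FS_NOT_AFFECTED,	/* no extent overlaps the failed range */
	FS_ISOLATE_FILES,	/* only file data hit: go file by file */
	FS_SHUTDOWN,		/* critical metadata hit: bring the fs down */
};

struct fs_extent_model {
	unsigned long start;	/* first block of the extent */
	unsigned long len;	/* length in blocks */
	bool is_metadata;
};

/* Half-open ranges [a, a + alen) and [b, b + blen) overlap? */
static bool ranges_overlap(unsigned long a, unsigned long alen,
			   unsigned long b, unsigned long blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Decide what to do about a failed range [start, start + len).
 * A single metadata hit wins over any number of data hits.
 */
static enum fs_failure_action
fs_model_corrupted_range(const struct fs_extent_model *extents, size_t nr,
			 unsigned long start, unsigned long len)
{
	enum fs_failure_action action = FS_NOT_AFFECTED;

	assert(extents || nr == 0);

	for (size_t i = 0; i < nr; i++) {
		if (!ranges_overlap(extents[i].start, extents[i].len,
				    start, len))
			continue;
		if (extents[i].is_metadata)
			return FS_SHUTDOWN;
		action = FS_ISOLATE_FILES;
	}
	return action;
}
```

Passing a whole range rather than a single pfn lets the filesystem
make one decision per failure event, which is the point of the
"communicate range failure" item.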



