linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-nvdimm <linux-nvdimm@lists.01.org>,
	Christoph Hellwig <hch@lst.de>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	Ross Zwisler <ross.zwisler@linux.intel.com>,
	Linux MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	"Luck, Tony" <tony.luck@intel.com>
Subject: Re: [PATCH 06/11] filesystem-dax: perform __dax_invalidate_mapping_entry() under the page lock
Date: Wed, 23 May 2018 06:50:59 -0700	[thread overview]
Message-ID: <CAPcyv4hbwPFxvfJVot14dtxJyxttChM06bsP+E5mCN4=VjG5BA@mail.gmail.com> (raw)
In-Reply-To: <20180523093537.duw6jlglcx7fnutw@quack2.suse.cz>

On Wed, May 23, 2018 at 2:35 AM, Jan Kara <jack@suse.cz> wrote:
> On Tue 22-05-18 07:40:03, Dan Williams wrote:
>> Hold the page lock while invalidating mapping entries to prevent races
>> between rmap using the address_space and the filesystem freeing the
>> address_space.
>>
>> This is more complicated than the simple description implies because
>> dev_pagemap pages that fsdax uses do not have any concept of page size.
>> Size information is stored in the radix and can only be safely read
>> while holding the xa_lock. Since lock_page() can not be taken while
>> holding xa_lock, drop xa_lock and speculatively lock all the associated
>> pages. Once all the pages are locked re-take the xa_lock and revalidate
>> that the radix entry did not change.
>>
>> Cc: Jan Kara <jack@suse.cz>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Matthew Wilcox <mawilcox@microsoft.com>
>> Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>
> IMO this is too ugly to live.

The same thought crossed my mind...

> The combination of entry locks in the radix
> tree and page locks is just too big mess. And from a quick look I don't see
> a reason why we could not use entry locks to protect rmap code as well -
> when you have PFN for which you need to walk rmap, you can grab
> rcu_read_lock(), then you can safely look at page->mapping, grab xa_lock,
> verify the radix tree points where it should and grab entry lock. I agree
> it's a bit complicated but for memory failure I think it is fine.

Ah, I missed this cleverness with rcu relative to keeping the
page->mapping valid. I'll take a look.

> Or we could talk about switching everything to page locks instead of entry
> locks but that isn't trivial either as we need something to serialized page
> faults on even before we go into the filesystem and allocate blocks for the
> fault...

I'd rather use entry locks everywhere and not depend on the page lock
for rmap if at all possible. Ideally lock_page is only used for
typical pages and not these dev_pagemap related structures.

  reply	other threads:[~2018-05-23 13:50 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-22 14:39 [PATCH 00/11] mm: Teach memory_failure() about ZONE_DEVICE pages Dan Williams
2018-05-22 14:39 ` [PATCH 01/11] device-dax: convert to vmf_insert_mixed and vm_fault_t Dan Williams
2018-05-22 14:39 ` [PATCH 02/11] device-dax: cleanup vm_fault de-reference chains Dan Williams
2018-05-22 14:39 ` [PATCH 03/11] device-dax: enable page_mapping() Dan Williams
2018-05-23  9:03   ` Jan Kara
2018-05-30 19:54   ` kbuild test robot
2018-05-22 14:39 ` [PATCH 04/11] device-dax: set page->index Dan Williams
2018-05-22 14:39 ` [PATCH 05/11] filesystem-dax: " Dan Williams
2018-05-23  8:40   ` Jan Kara
2018-05-30  1:38     ` Dan Williams
2018-05-30  8:13       ` Jan Kara
2018-05-30 23:21         ` Dan Williams
2018-05-31 10:08           ` Jan Kara
2018-05-31 21:49             ` Dan Williams
2018-05-22 14:40 ` [PATCH 06/11] filesystem-dax: perform __dax_invalidate_mapping_entry() under the page lock Dan Williams
2018-05-23  9:35   ` Jan Kara
2018-05-23 13:50     ` Dan Williams [this message]
2018-05-22 14:40 ` [PATCH 07/11] mm, madvise_inject_error: fix page count leak Dan Williams
2018-05-23  4:19   ` Naoya Horiguchi
2018-05-24 20:55     ` Dan Williams
2018-05-22 14:40 ` [PATCH 08/11] x86, memory_failure: introduce {set, clear}_mce_nospec() Dan Williams
2018-05-22 14:40 ` [PATCH 09/11] mm, memory_failure: pass page size to kill_proc() Dan Williams
2018-05-23  6:41   ` Naoya Horiguchi
2018-05-22 14:40 ` [PATCH 10/11] mm, memory_failure: teach memory_failure() about dev_pagemap pages Dan Williams
2018-05-23  6:48   ` Naoya Horiguchi
2018-05-22 14:40 ` [PATCH 11/11] libnvdimm, pmem: restore page attributes when clearing errors Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPcyv4hbwPFxvfJVot14dtxJyxttChM06bsP+E5mCN4=VjG5BA@mail.gmail.com' \
    --to=dan.j.williams@intel.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=mawilcox@microsoft.com \
    --cc=ross.zwisler@linux.intel.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).