All of lore.kernel.org
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Dave Chinner <david@fromorbit.com>,
	Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	Matthew Wilcox <willy@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	X86 ML <x86@kernel.org>, XFS Developers <xfs@oss.sgi.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v5 4/7] dax: add support for fsync/sync
Date: Sat, 19 Dec 2015 10:37:46 -0800	[thread overview]
Message-ID: <CAPcyv4irspQEPVdYfLK+QfW4t-1_y1gFFVuBm00=i03PFQwEYw@mail.gmail.com> (raw)
In-Reply-To: <1450502540-8744-5-git-send-email-ross.zwisler@linux.intel.com>

On Fri, Dec 18, 2015 at 9:22 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> To properly handle fsync/msync in an efficient way DAX needs to track dirty
> pages so it is able to flush them durably to media on demand.
>
> The tracking of dirty pages is done via the radix tree in struct
> address_space.  This radix tree is already used by the page writeback
> infrastructure for tracking dirty pages associated with an open file, and
> it already has support for exceptional (non struct page*) entries.  We
> build upon these features to add exceptional entries to the radix tree for
> DAX dirty PMD or PTE pages at fault time.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[..]
> +static void dax_writeback_one(struct address_space *mapping, pgoff_t index,
> +               void *entry)
> +{
> +       struct radix_tree_root *page_tree = &mapping->page_tree;
> +       int type = RADIX_DAX_TYPE(entry);
> +       struct radix_tree_node *node;
> +       void **slot;
> +
> +       if (type != RADIX_DAX_PTE && type != RADIX_DAX_PMD) {
> +               WARN_ON_ONCE(1);
> +               return;
> +       }
> +
> +       spin_lock_irq(&mapping->tree_lock);
> +       /*
> +        * Regular page slots are stabilized by the page lock even
> +        * without the tree itself locked.  These unlocked entries
> +        * need verification under the tree lock.
> +        */
> +       if (!__radix_tree_lookup(page_tree, index, &node, &slot))
> +               goto unlock;
> +       if (*slot != entry)
> +               goto unlock;
> +
> +       /* another fsync thread may have already written back this entry */
> +       if (!radix_tree_tag_get(page_tree, index, PAGECACHE_TAG_TOWRITE))
> +               goto unlock;
> +
> +       radix_tree_tag_clear(page_tree, index, PAGECACHE_TAG_TOWRITE);
> +
> +       if (type == RADIX_DAX_PMD)
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PMD_SIZE);
> +       else
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PAGE_SIZE);

Hi Ross, I should have realized this sooner, but what guarantees that
the address returned by RADIX_DAX_ADDR(entry) is still valid at this
point?  I think we need to store the sector in the radix tree and then
perform a new dax_map_atomic() operation to either lookup a valid
address or fail the sync request.  Otherwise, if the device is gone
we'll crash, or write into some other random vmalloc address space.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Dan Williams <dan.j.williams@intel.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Dave Chinner <david@fromorbit.com>,
	Ingo Molnar <mingo@redhat.com>, Jan Kara <jack@suse.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	Matthew Wilcox <willy@linux.intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@ml01.01.org>,
	X86 ML <x86@kernel.org>, XFS Developers <xfs@oss.sgi.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>
Subject: Re: [PATCH v5 4/7] dax: add support for fsync/sync
Date: Sat, 19 Dec 2015 10:37:46 -0800	[thread overview]
Message-ID: <CAPcyv4irspQEPVdYfLK+QfW4t-1_y1gFFVuBm00=i03PFQwEYw@mail.gmail.com> (raw)
In-Reply-To: <1450502540-8744-5-git-send-email-ross.zwisler@linux.intel.com>

On Fri, Dec 18, 2015 at 9:22 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> To properly handle fsync/msync in an efficient way DAX needs to track dirty
> pages so it is able to flush them durably to media on demand.
>
> The tracking of dirty pages is done via the radix tree in struct
> address_space.  This radix tree is already used by the page writeback
> infrastructure for tracking dirty pages associated with an open file, and
> it already has support for exceptional (non struct page*) entries.  We
> build upon these features to add exceptional entries to the radix tree for
> DAX dirty PMD or PTE pages at fault time.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[..]
> +static void dax_writeback_one(struct address_space *mapping, pgoff_t index,
> +               void *entry)
> +{
> +       struct radix_tree_root *page_tree = &mapping->page_tree;
> +       int type = RADIX_DAX_TYPE(entry);
> +       struct radix_tree_node *node;
> +       void **slot;
> +
> +       if (type != RADIX_DAX_PTE && type != RADIX_DAX_PMD) {
> +               WARN_ON_ONCE(1);
> +               return;
> +       }
> +
> +       spin_lock_irq(&mapping->tree_lock);
> +       /*
> +        * Regular page slots are stabilized by the page lock even
> +        * without the tree itself locked.  These unlocked entries
> +        * need verification under the tree lock.
> +        */
> +       if (!__radix_tree_lookup(page_tree, index, &node, &slot))
> +               goto unlock;
> +       if (*slot != entry)
> +               goto unlock;
> +
> +       /* another fsync thread may have already written back this entry */
> +       if (!radix_tree_tag_get(page_tree, index, PAGECACHE_TAG_TOWRITE))
> +               goto unlock;
> +
> +       radix_tree_tag_clear(page_tree, index, PAGECACHE_TAG_TOWRITE);
> +
> +       if (type == RADIX_DAX_PMD)
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PMD_SIZE);
> +       else
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PAGE_SIZE);

Hi Ross, I should have realized this sooner, but what guarantees that
the address returned by RADIX_DAX_ADDR(entry) is still valid at this
point?  I think we need to store the sector in the radix tree and then
perform a new dax_map_atomic() operation to either lookup a valid
address or fail the sync request.  Otherwise, if the device is gone
we'll crash, or write into some other random vmalloc address space.

WARNING: multiple messages have this Message-ID (diff)
From: Dan Williams <dan.j.williams@intel.com>
To: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: X86 ML <x86@kernel.org>, Theodore Ts'o <tytso@mit.edu>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linux-nvdimm@lists.01.org" <linux-nvdimm@lists.01.org>,
	Jan Kara <jack@suse.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	XFS Developers <xfs@oss.sgi.com>,
	"J. Bruce Fields" <bfields@fieldses.org>,
	Linux MM <linux-mm@kvack.org>, Ingo Molnar <mingo@redhat.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	"H. Peter Anvin" <hpa@zytor.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Matthew Wilcox <willy@linux.intel.com>,
	Jeff Layton <jlayton@poochiereds.net>,
	linux-ext4 <linux-ext4@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Matthew Wilcox <matthew.r.wilcox@intel.com>
Subject: Re: [PATCH v5 4/7] dax: add support for fsync/sync
Date: Sat, 19 Dec 2015 10:37:46 -0800	[thread overview]
Message-ID: <CAPcyv4irspQEPVdYfLK+QfW4t-1_y1gFFVuBm00=i03PFQwEYw@mail.gmail.com> (raw)
In-Reply-To: <1450502540-8744-5-git-send-email-ross.zwisler@linux.intel.com>

On Fri, Dec 18, 2015 at 9:22 PM, Ross Zwisler
<ross.zwisler@linux.intel.com> wrote:
> To properly handle fsync/msync in an efficient way DAX needs to track dirty
> pages so it is able to flush them durably to media on demand.
>
> The tracking of dirty pages is done via the radix tree in struct
> address_space.  This radix tree is already used by the page writeback
> infrastructure for tracking dirty pages associated with an open file, and
> it already has support for exceptional (non struct page*) entries.  We
> build upon these features to add exceptional entries to the radix tree for
> DAX dirty PMD or PTE pages at fault time.
>
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
[..]
> +static void dax_writeback_one(struct address_space *mapping, pgoff_t index,
> +               void *entry)
> +{
> +       struct radix_tree_root *page_tree = &mapping->page_tree;
> +       int type = RADIX_DAX_TYPE(entry);
> +       struct radix_tree_node *node;
> +       void **slot;
> +
> +       if (type != RADIX_DAX_PTE && type != RADIX_DAX_PMD) {
> +               WARN_ON_ONCE(1);
> +               return;
> +       }
> +
> +       spin_lock_irq(&mapping->tree_lock);
> +       /*
> +        * Regular page slots are stabilized by the page lock even
> +        * without the tree itself locked.  These unlocked entries
> +        * need verification under the tree lock.
> +        */
> +       if (!__radix_tree_lookup(page_tree, index, &node, &slot))
> +               goto unlock;
> +       if (*slot != entry)
> +               goto unlock;
> +
> +       /* another fsync thread may have already written back this entry */
> +       if (!radix_tree_tag_get(page_tree, index, PAGECACHE_TAG_TOWRITE))
> +               goto unlock;
> +
> +       radix_tree_tag_clear(page_tree, index, PAGECACHE_TAG_TOWRITE);
> +
> +       if (type == RADIX_DAX_PMD)
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PMD_SIZE);
> +       else
> +               wb_cache_pmem(RADIX_DAX_ADDR(entry), PAGE_SIZE);

Hi Ross, I should have realized this sooner, but what guarantees that
the address returned by RADIX_DAX_ADDR(entry) is still valid at this
point?  I think we need to store the sector in the radix tree and then
perform a new dax_map_atomic() operation to either lookup a valid
address or fail the sync request.  Otherwise, if the device is gone
we'll crash, or write into some other random vmalloc address space.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2015-12-19 18:37 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-19  5:22 [PATCH v5 0/7] DAX fsync/msync support Ross Zwisler
2015-12-19  5:22 ` Ross Zwisler
2015-12-19  5:22 ` Ross Zwisler
2015-12-19  5:22 ` [PATCH v5 1/7] pmem: add wb_cache_pmem() to the PMEM API Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-22 22:44   ` Andrew Morton
2015-12-22 22:44     ` Andrew Morton
2015-12-22 22:44     ` Andrew Morton
2015-12-23  0:00     ` Ross Zwisler
2015-12-23  0:00       ` Ross Zwisler
2015-12-23  0:00       ` Ross Zwisler
2015-12-19  5:22 ` [PATCH v5 2/7] dax: support dirty DAX entries in radix tree Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-21 17:15   ` Jan Kara
2015-12-21 17:15     ` Jan Kara
2015-12-21 17:15     ` Jan Kara
2015-12-21 17:15     ` Jan Kara
2015-12-21 17:45     ` Ross Zwisler
2015-12-21 17:45       ` Ross Zwisler
2015-12-21 17:45       ` Ross Zwisler
2015-12-22 22:46   ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-23  0:16     ` Ross Zwisler
2015-12-23  0:16       ` Ross Zwisler
2015-12-23  0:16       ` Ross Zwisler
2015-12-19  5:22 ` [PATCH v5 3/7] mm: add find_get_entries_tag() Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-22 22:46   ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-19  5:22 ` [PATCH v5 4/7] dax: add support for fsync/sync Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19 18:37   ` Dan Williams [this message]
2015-12-19 18:37     ` Dan Williams
2015-12-19 18:37     ` Dan Williams
2015-12-21 17:05     ` Ross Zwisler
2015-12-21 17:05       ` Ross Zwisler
2015-12-21 17:05       ` Ross Zwisler
2015-12-21 17:49       ` Dan Williams
2015-12-21 17:49         ` Dan Williams
2015-12-21 17:49         ` Dan Williams
2015-12-21 17:49         ` Dan Williams
2015-12-21 19:27       ` Dan Williams
2015-12-21 19:27         ` Dan Williams
2015-12-21 19:27         ` Dan Williams
2015-12-21 19:27         ` Dan Williams
2015-12-22 22:46   ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-22 22:46     ` Andrew Morton
2015-12-22 23:51     ` Ross Zwisler
2015-12-22 23:51       ` Ross Zwisler
2015-12-22 23:51       ` Ross Zwisler
2015-12-19  5:22 ` [PATCH v5 5/7] ext2: call dax_pfn_mkwrite() for DAX fsync/msync Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-21 17:32   ` Jan Kara
2015-12-21 17:32     ` Jan Kara
2015-12-21 17:32     ` Jan Kara
2015-12-21 17:32     ` Jan Kara
2015-12-19  5:22 ` [PATCH v5 6/7] ext4: " Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-21 17:32   ` Jan Kara
2015-12-21 17:32     ` Jan Kara
2015-12-21 17:32     ` Jan Kara
2015-12-19  5:22 ` [PATCH v5 7/7] xfs: " Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler
2015-12-19  5:22   ` Ross Zwisler

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPcyv4irspQEPVdYfLK+QfW4t-1_y1gFFVuBm00=i03PFQwEYw@mail.gmail.com' \
    --to=dan.j.williams@intel.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=bfields@fieldses.org \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@fromorbit.com \
    --cc=hpa@zytor.com \
    --cc=jack@suse.com \
    --cc=jlayton@poochiereds.net \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=matthew.r.wilcox@intel.com \
    --cc=mingo@redhat.com \
    --cc=ross.zwisler@linux.intel.com \
    --cc=tglx@linutronix.de \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    --cc=willy@linux.intel.com \
    --cc=x86@kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.