From: Ross Zwisler <ross.zwisler@linux.intel.com> To: Jan Kara <jack@suse.cz> Cc: linux-nvdimm@lists.01.org, NeilBrown <neilb@suse.com>, Wilcox, Subject: Re: [PATCH 05/10] dax: Allow DAX code to replace exceptional entries Date: Wed, 23 Mar 2016 11:52:58 -0600 [thread overview] Message-ID: <20160323175258.GD5544@linux.intel.com> (raw) In-Reply-To: <1458566575-28063-6-git-send-email-jack@suse.cz> On Mon, Mar 21, 2016 at 02:22:50PM +0100, Jan Kara wrote: > Currently we forbid page_cache_tree_insert() to replace exceptional radix > tree entries for DAX inodes. However to make DAX faults race free we will > lock radix tree entries and when hole is created, we need to replace > such locked radix tree entry with a hole page. So modify > page_cache_tree_insert() to allow that. Perhaps this is addressed later in the series, but at first glance this seems unsafe to me - what happens of we had tasks waiting on the locked entry? Are they woken up when we replace the locked entry with a page cache entry? If they are woken up, will they be alright with the fact that they were sleeping on a lock for an exceptional entry, and now that lock no longer exists? The radix tree entry will now be filled with a zero page hole, and so they can't acquire the lock they were sleeping for - instead if they wanted to lock that page they'd need to lock the zero page page lock, right? > Signed-off-by: Jan Kara <jack@suse.cz> > --- > include/linux/dax.h | 6 ++++++ > mm/filemap.c | 18 +++++++++++------- > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/include/linux/dax.h b/include/linux/dax.h > index 7c45ac7ea1d1..4b63923e1f8d 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -3,8 +3,14 @@ > > #include <linux/fs.h> > #include <linux/mm.h> > +#include <linux/radix-tree.h> > #include <asm/pgtable.h> > > +/* > + * Since exceptional entries do not use indirect bit, we reuse it as a lock bit > + */ > +#define DAX_ENTRY_LOCK RADIX_TREE_INDIRECT_PTR > + > ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t, > get_block_t, dio_iodone_t, int flags); > int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size); > diff --git a/mm/filemap.c b/mm/filemap.c > index 7c00f105845e..fbebedaf719e 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -597,14 +597,18 @@ static int page_cache_tree_insert(struct address_space *mapping, > if (!radix_tree_exceptional_entry(p)) > return -EEXIST; > > - if (WARN_ON(dax_mapping(mapping))) > - return -EINVAL; > - > - if (shadowp) > - *shadowp = p; > mapping->nrexceptional--; > - if (node) > - workingset_node_shadows_dec(node); > + if (!dax_mapping(mapping)) { > + if (shadowp) > + *shadowp = p; > + if (node) > + workingset_node_shadows_dec(node); > + } else { > + /* DAX can replace empty locked entry with a hole */ > + WARN_ON_ONCE(p != > + (void *)(RADIX_TREE_EXCEPTIONAL_ENTRY | > + DAX_ENTRY_LOCK)); > + } > } > radix_tree_replace_slot(slot, page); > mapping->nrpages++; > -- > 2.6.2 > _______________________________________________ Linux-nvdimm mailing list Linux-nvdimm@lists.01.org https://lists.01.org/mailman/listinfo/linux-nvdimm
WARNING: multiple messages have this Message-ID (diff)
From: Ross Zwisler <ross.zwisler@linux.intel.com> To: Jan Kara <jack@suse.cz> Cc: linux-fsdevel@vger.kernel.org, "Wilcox, Matthew R" <matthew.r.wilcox@intel.com>, Ross Zwisler <ross.zwisler@linux.intel.com>, Dan Williams <dan.j.williams@intel.com>, linux-nvdimm@lists.01.org, NeilBrown <neilb@suse.com> Subject: Re: [PATCH 05/10] dax: Allow DAX code to replace exceptional entries Date: Wed, 23 Mar 2016 11:52:58 -0600 [thread overview] Message-ID: <20160323175258.GD5544@linux.intel.com> (raw) In-Reply-To: <1458566575-28063-6-git-send-email-jack@suse.cz> On Mon, Mar 21, 2016 at 02:22:50PM +0100, Jan Kara wrote: > Currently we forbid page_cache_tree_insert() to replace exceptional radix > tree entries for DAX inodes. However to make DAX faults race free we will > lock radix tree entries and when hole is created, we need to replace > such locked radix tree entry with a hole page. So modify > page_cache_tree_insert() to allow that. Perhaps this is addressed later in the series, but at first glance this seems unsafe to me - what happens of we had tasks waiting on the locked entry? Are they woken up when we replace the locked entry with a page cache entry? If they are woken up, will they be alright with the fact that they were sleeping on a lock for an exceptional entry, and now that lock no longer exists? The radix tree entry will now be filled with a zero page hole, and so they can't acquire the lock they were sleeping for - instead if they wanted to lock that page they'd need to lock the zero page page lock, right? > Signed-off-by: Jan Kara <jack@suse.cz> > --- > include/linux/dax.h | 6 ++++++ > mm/filemap.c | 18 +++++++++++------- > 2 files changed, 17 insertions(+), 7 deletions(-) > > diff --git a/include/linux/dax.h b/include/linux/dax.h > index 7c45ac7ea1d1..4b63923e1f8d 100644 > --- a/include/linux/dax.h > +++ b/include/linux/dax.h > @@ -3,8 +3,14 @@ > > #include <linux/fs.h> > #include <linux/mm.h> > +#include <linux/radix-tree.h> > #include <asm/pgtable.h> > > +/* > + * Since exceptional entries do not use indirect bit, we reuse it as a lock bit > + */ > +#define DAX_ENTRY_LOCK RADIX_TREE_INDIRECT_PTR > + > ssize_t dax_do_io(struct kiocb *, struct inode *, struct iov_iter *, loff_t, > get_block_t, dio_iodone_t, int flags); > int dax_clear_sectors(struct block_device *bdev, sector_t _sector, long _size); > diff --git a/mm/filemap.c b/mm/filemap.c > index 7c00f105845e..fbebedaf719e 100644 > --- a/mm/filemap.c > +++ b/mm/filemap.c > @@ -597,14 +597,18 @@ static int page_cache_tree_insert(struct address_space *mapping, > if (!radix_tree_exceptional_entry(p)) > return -EEXIST; > > - if (WARN_ON(dax_mapping(mapping))) > - return -EINVAL; > - > - if (shadowp) > - *shadowp = p; > mapping->nrexceptional--; > - if (node) > - workingset_node_shadows_dec(node); > + if (!dax_mapping(mapping)) { > + if (shadowp) > + *shadowp = p; > + if (node) > + workingset_node_shadows_dec(node); > + } else { > + /* DAX can replace empty locked entry with a hole */ > + WARN_ON_ONCE(p != > + (void *)(RADIX_TREE_EXCEPTIONAL_ENTRY | > + DAX_ENTRY_LOCK)); > + } > } > radix_tree_replace_slot(slot, page); > mapping->nrpages++; > -- > 2.6.2 >
next prev parent reply other threads:[~2016-03-23 17:54 UTC|newest] Thread overview: 86+ messages / expand[flat|nested] mbox.gz Atom feed top 2016-03-21 13:22 [RFC v2] [PATCH 0/10] DAX page fault locking Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-21 13:22 ` [PATCH 01/10] DAX: move RADIX_DAX_ definitions to dax.c Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-21 17:25 ` Matthew Wilcox 2016-03-21 17:25 ` Matthew Wilcox 2016-03-21 13:22 ` [PATCH 02/10] radix-tree: make 'indirect' bit available to exception entries Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-21 17:34 ` Matthew Wilcox 2016-03-21 17:34 ` Matthew Wilcox 2016-03-22 9:12 ` Jan Kara 2016-03-22 9:12 ` Jan Kara 2016-03-22 9:27 ` Matthew Wilcox 2016-03-22 9:27 ` Matthew Wilcox 2016-03-22 10:37 ` Jan Kara 2016-03-22 10:37 ` Jan Kara 2016-03-23 16:41 ` Ross Zwisler 2016-03-23 16:41 ` Ross Zwisler 2016-03-24 12:31 ` Jan Kara 2016-03-24 12:31 ` Jan Kara 2016-03-21 13:22 ` [PATCH 03/10] dax: Remove complete_unwritten argument Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-23 17:12 ` Ross Zwisler 2016-03-23 17:12 ` Ross Zwisler 2016-03-24 12:32 ` Jan Kara 2016-03-24 12:32 ` Jan Kara 2016-03-21 13:22 ` [PATCH 04/10] dax: Fix data corruption for written and mmapped files Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-23 17:39 ` Ross Zwisler 2016-03-23 17:39 ` Ross Zwisler 2016-03-24 12:51 ` Jan Kara 2016-03-24 12:51 ` Jan Kara 2016-03-29 15:17 ` Ross Zwisler 2016-03-29 15:17 ` Ross Zwisler 2016-03-21 13:22 ` [PATCH 05/10] dax: Allow DAX code to replace exceptional entries Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-23 17:52 ` Ross Zwisler [this message] 2016-03-23 17:52 ` Ross Zwisler 2016-03-24 10:42 ` Jan Kara 2016-03-24 10:42 ` Jan Kara 2016-03-21 13:22 ` [PATCH 06/10] dax: Remove redundant inode size checks Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-23 21:08 ` Ross Zwisler 2016-03-23 21:08 ` Ross Zwisler 2016-03-21 13:22 ` [PATCH 07/10] dax: Disable huge page handling Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-23 20:50 ` Ross Zwisler 2016-03-23 20:50 ` Ross Zwisler 2016-03-24 12:56 ` Jan Kara 2016-03-24 12:56 ` Jan Kara 2016-03-21 13:22 ` [PATCH 08/10] dax: New fault locking Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-29 21:57 ` Ross Zwisler 2016-03-29 21:57 ` Ross Zwisler 2016-03-31 16:27 ` Jan Kara 2016-03-31 16:27 ` Jan Kara 2016-03-21 13:22 ` [PATCH 09/10] dax: Use radix tree entry lock to protect cow faults Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-21 19:11 ` Matthew Wilcox 2016-03-21 19:11 ` Matthew Wilcox 2016-03-22 7:03 ` Jan Kara 2016-03-22 7:03 ` Jan Kara 2016-03-29 22:18 ` Ross Zwisler 2016-03-29 22:18 ` Ross Zwisler 2016-03-21 13:22 ` [PATCH 10/10] dax: Remove i_mmap_lock protection Jan Kara 2016-03-21 13:22 ` Jan Kara 2016-03-29 22:17 ` Ross Zwisler 2016-03-29 22:17 ` Ross Zwisler 2016-03-21 17:41 ` [RFC v2] [PATCH 0/10] DAX page fault locking Matthew Wilcox 2016-03-21 17:41 ` Matthew Wilcox 2016-03-23 15:09 ` Jan Kara 2016-03-23 15:09 ` Jan Kara 2016-03-23 20:50 ` Matthew Wilcox 2016-03-23 20:50 ` Matthew Wilcox 2016-03-24 10:00 ` Matthew Wilcox 2016-03-24 10:00 ` Matthew Wilcox 2016-03-22 19:32 ` Ross Zwisler 2016-03-22 19:32 ` Ross Zwisler 2016-03-22 21:07 ` Toshi Kani 2016-03-22 21:07 ` Toshi Kani 2016-03-22 21:15 ` Dave Chinner 2016-03-22 21:15 ` Dave Chinner 2016-03-23 9:45 ` Jan Kara 2016-03-23 9:45 ` Jan Kara 2016-03-23 15:11 ` Toshi Kani 2016-03-23 15:11 ` Toshi Kani
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20160323175258.GD5544@linux.intel.com \ --to=ross.zwisler@linux.intel.com \ --cc=jack@suse.cz \ --cc=linux-nvdimm@lists.01.org \ --cc=neilb@suse.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.