nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] Two DAX fixes for 4.20
@ 2018-11-27 21:16 Matthew Wilcox
  2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
  2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
  0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-fsdevel, Jan Kara, Matthew Wilcox, linux-nvdimm

These both fix race conditions in dax_lock_mapping_entry().  I've tagged
them both for 4.19 backport, which will fail and I'll do the equivalent
patch for it.  Dan, do you want to take these through your tree?

Matthew Wilcox (2):
  dax: Check page->mapping isn't NULL
  dax: Don't access a freed inode

 fs/dax.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)

-- 
2.19.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/2] dax: Check page->mapping isn't NULL
  2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
@ 2018-11-27 21:16 ` Matthew Wilcox
  2018-11-28  9:18   ` Johannes Thumshirn
  2018-11-28 11:46   ` Jan Kara
  2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
  1 sibling, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, stable, Matthew Wilcox, linux-fsdevel, Jan Kara

If we race with inode destroy, it's possible for page->mapping to be
NULL before we even enter this routine, as well as after having slept
waiting for the dax entry to become unlocked.

Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: stable@vger.kernel.org
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
 fs/dax.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 9bcce89ea18e..e69fc231833b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -365,7 +365,7 @@ bool dax_lock_mapping_entry(struct page *page)
 		struct address_space *mapping = READ_ONCE(page->mapping);
 
 		locked = false;
-		if (!dax_mapping(mapping))
+		if (!mapping || !dax_mapping(mapping))
 			break;
 
 		/*
-- 
2.19.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] dax: Don't access a freed inode
  2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
  2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
@ 2018-11-27 21:16 ` Matthew Wilcox
  2018-11-28 11:53   ` Jan Kara
  2018-11-28 19:44   ` [PATCH v2] " Dan Williams
  1 sibling, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
  To: Dan Williams
  Cc: linux-nvdimm, stable, Matthew Wilcox, linux-fsdevel, Jan Kara

After we drop the i_pages lock, the inode can be freed at any time.
The get_unlocked_entry() code has no choice but to reacquire the lock,
so it can't be used here.  Create a new wait_entry_unlocked() which takes
care not to acquire the lock or dereference the address_space in any way.

Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
 fs/dax.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index e69fc231833b..cf1805645d18 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -232,6 +232,28 @@ static void *get_unlocked_entry(struct xa_state *xas)
 	}
 }
 
+/*
+ * The only thing keeping the address space around is the i_pages lock
+ * (it's cycled in clear_inode() after removing the entries from i_pages)
+ * After we call xas_unlock_irq(), we cannot touch xas->xa.
+ */
+static void wait_entry_unlocked(struct xa_state *xas, void *entry)
+{
+	struct wait_exceptional_entry_queue ewait;
+	wait_queue_head_t *wq;
+
+	init_wait(&ewait.wait);
+	ewait.wait.func = wake_exceptional_entry_func;
+
+	wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+	prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+	xas_unlock_irq(xas);
+	schedule();
+	finish_wait(wq, &ewait.wait);
+	if (waitqueue_active(wq))
+		__wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
 static void put_unlocked_entry(struct xa_state *xas, void *entry)
 {
 	/* If we were the only waiter woken, wake the next one */
@@ -389,9 +411,7 @@ bool dax_lock_mapping_entry(struct page *page)
 		entry = xas_load(&xas);
 		if (dax_is_locked(entry)) {
 			rcu_read_unlock();
-			entry = get_unlocked_entry(&xas);
-			xas_unlock_irq(&xas);
-			put_unlocked_entry(&xas, entry);
+			wait_entry_unlocked(&xas, entry);
 			rcu_read_lock();
 			continue;
 		}
-- 
2.19.1

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] dax: Check page->mapping isn't NULL
  2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
@ 2018-11-28  9:18   ` Johannes Thumshirn
  2018-11-28 11:46   ` Jan Kara
  1 sibling, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2018-11-28  9:18 UTC (permalink / raw)
  To: Matthew Wilcox, Dan Williams
  Cc: linux-fsdevel, Jan Kara, stable, linux-nvdimm

Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>

-- 
Johannes Thumshirn                                        SUSE Labs
jthumshirn@suse.de                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] dax: Check page->mapping isn't NULL
  2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
  2018-11-28  9:18   ` Johannes Thumshirn
@ 2018-11-28 11:46   ` Jan Kara
  1 sibling, 0 replies; 9+ messages in thread
From: Jan Kara @ 2018-11-28 11:46 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dan Williams, Jan Kara, Dave Jiang, linux-nvdimm, linux-fsdevel, stable

On Tue 27-11-18 13:16:33, Matthew Wilcox wrote:
> If we race with inode destroy, it's possible for page->mapping to be
> NULL before we even enter this routine, as well as after having slept
> waiting for the dax entry to become unlocked.
> 
> Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> Cc: stable@vger.kernel.org
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Matthew Wilcox <willy@infradead.org>

Thanks for writing this fix. The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/dax.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/dax.c b/fs/dax.c
> index 9bcce89ea18e..e69fc231833b 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -365,7 +365,7 @@ bool dax_lock_mapping_entry(struct page *page)
>  		struct address_space *mapping = READ_ONCE(page->mapping);
>  
>  		locked = false;
> -		if (!dax_mapping(mapping))
> +		if (!mapping || !dax_mapping(mapping))
>  			break;
>  
>  		/*
> -- 
> 2.19.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] dax: Don't access a freed inode
  2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
@ 2018-11-28 11:53   ` Jan Kara
  2018-11-28 17:08     ` Dan Williams
  2018-11-28 19:44   ` [PATCH v2] " Dan Williams
  1 sibling, 1 reply; 9+ messages in thread
From: Jan Kara @ 2018-11-28 11:53 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Dan Williams, Jan Kara, Dave Jiang, linux-nvdimm, linux-fsdevel, stable

On Tue 27-11-18 13:16:34, Matthew Wilcox wrote:
> After we drop the i_pages lock, the inode can be freed at any time.
> The get_unlocked_entry() code has no choice but to reacquire the lock,
> so it can't be used here.  Create a new wait_entry_unlocked() which takes
> care not to acquire the lock or dereference the address_space in any way.
> 
> Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox <willy@infradead.org>

The patch looks good. You can add:

Reviewed-by: Jan Kara <jack@suse.cz>

Just one nit below:

> +/*
> + * The only thing keeping the address space around is the i_pages lock
> + * (it's cycled in clear_inode() after removing the entries from i_pages)
> + * After we call xas_unlock_irq(), we cannot touch xas->xa.
> + */
> +static void wait_entry_unlocked(struct xa_state *xas, void *entry)
> +{
> +	struct wait_exceptional_entry_queue ewait;
> +	wait_queue_head_t *wq;
> +
> +	init_wait(&ewait.wait);
> +	ewait.wait.func = wake_exceptional_entry_func;
> +
> +	wq = dax_entry_waitqueue(xas, entry, &ewait.key);
> +	prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
> +	xas_unlock_irq(xas);
> +	schedule();
> +	finish_wait(wq, &ewait.wait);

Can we add a comment here like:

	/*
	 * Entry lock waits are exclusive. Wake up the next waiter since we
	 * aren't sure we will acquire the entry lock and thus wake the
	 * next waiter up on unlock.
	 */

Because I always wonder for a moment why this is needed.

> +	if (waitqueue_active(wq))
> +		__wake_up(wq, TASK_NORMAL, 1, &ewait.key);
> +}
> +

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] dax: Don't access a freed inode
  2018-11-28 11:53   ` Jan Kara
@ 2018-11-28 17:08     ` Dan Williams
  2018-11-28 17:10       ` Matthew Wilcox
  0 siblings, 1 reply; 9+ messages in thread
From: Dan Williams @ 2018-11-28 17:08 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-fsdevel, stable, Matthew Wilcox, linux-nvdimm

On Wed, Nov 28, 2018 at 3:54 AM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 27-11-18 13:16:34, Matthew Wilcox wrote:
> > After we drop the i_pages lock, the inode can be freed at any time.
> > The get_unlocked_entry() code has no choice but to reacquire the lock,
> > so it can't be used here.  Create a new wait_entry_unlocked() which takes
> > care not to acquire the lock or dereference the address_space in any way.
> >
> > Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Matthew Wilcox <willy@infradead.org>
>
> The patch looks good. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Just one nit below:
>
> > +/*
> > + * The only thing keeping the address space around is the i_pages lock
> > + * (it's cycled in clear_inode() after removing the entries from i_pages)
> > + * After we call xas_unlock_irq(), we cannot touch xas->xa.
> > + */
> > +static void wait_entry_unlocked(struct xa_state *xas, void *entry)
> > +{
> > +     struct wait_exceptional_entry_queue ewait;
> > +     wait_queue_head_t *wq;
> > +
> > +     init_wait(&ewait.wait);
> > +     ewait.wait.func = wake_exceptional_entry_func;
> > +
> > +     wq = dax_entry_waitqueue(xas, entry, &ewait.key);
> > +     prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
> > +     xas_unlock_irq(xas);
> > +     schedule();
> > +     finish_wait(wq, &ewait.wait);
>
> Can we add a comment here like:
>
>         /*
>          * Entry lock waits are exclusive. Wake up the next waiter since we
>          * aren't sure we will acquire the entry lock and thus wake the
>          * next waiter up on unlock.
>          */
>
> Because I always wonder for a moment why this is needed.

Looks good, I'll add that when applying.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] dax: Don't access a freed inode
  2018-11-28 17:08     ` Dan Williams
@ 2018-11-28 17:10       ` Matthew Wilcox
  0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-28 17:10 UTC (permalink / raw)
  To: Dan Williams; +Cc: linux-fsdevel, Jan Kara, stable, linux-nvdimm

On Wed, Nov 28, 2018 at 09:08:40AM -0800, Dan Williams wrote:
> > Can we add a comment here like:
> >
> >         /*
> >          * Entry lock waits are exclusive. Wake up the next waiter since we
> >          * aren't sure we will acquire the entry lock and thus wake the
> >          * next waiter up on unlock.
> >          */
> >
> > Because I always wonder for a moment why this is needed.
> 
> Looks good, I'll add that when applying.

Thanks.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] dax: Don't access a freed inode
  2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
  2018-11-28 11:53   ` Jan Kara
@ 2018-11-28 19:44   ` Dan Williams
  1 sibling, 0 replies; 9+ messages in thread
From: Dan Williams @ 2018-11-28 19:44 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Jan Kara, Matthew Wilcox, stable, linux-nvdimm

From: Matthew Wilcox <willy@infradead.org>

After we drop the i_pages lock, the inode can be freed at any time.
The get_unlocked_entry() code has no choice but to reacquire the lock,
so it can't be used here.  Create a new wait_entry_unlocked() which takes
care not to acquire the lock or dereference the address_space in any way.

Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/dax.c |   32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index e69fc231833b..3f592dc18d67 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -232,6 +232,34 @@ static void *get_unlocked_entry(struct xa_state *xas)
 	}
 }
 
+/*
+ * The only thing keeping the address space around is the i_pages lock
+ * (it's cycled in clear_inode() after removing the entries from i_pages)
+ * After we call xas_unlock_irq(), we cannot touch xas->xa.
+ */
+static void wait_entry_unlocked(struct xa_state *xas, void *entry)
+{
+	struct wait_exceptional_entry_queue ewait;
+	wait_queue_head_t *wq;
+
+	init_wait(&ewait.wait);
+	ewait.wait.func = wake_exceptional_entry_func;
+
+	wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+	prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+	xas_unlock_irq(xas);
+	schedule();
+	finish_wait(wq, &ewait.wait);
+
+	/*
+	 * Entry lock waits are exclusive. Wake up the next waiter since
+	 * we aren't sure we will acquire the entry lock and thus wake
+	 * the next waiter up on unlock.
+	 */
+	if (waitqueue_active(wq))
+		__wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
 static void put_unlocked_entry(struct xa_state *xas, void *entry)
 {
 	/* If we were the only waiter woken, wake the next one */
@@ -389,9 +417,7 @@ bool dax_lock_mapping_entry(struct page *page)
 		entry = xas_load(&xas);
 		if (dax_is_locked(entry)) {
 			rcu_read_unlock();
-			entry = get_unlocked_entry(&xas);
-			xas_unlock_irq(&xas);
-			put_unlocked_entry(&xas, entry);
+			wait_entry_unlocked(&xas, entry);
 			rcu_read_lock();
 			continue;
 		}

_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-11-28 19:56 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
2018-11-28  9:18   ` Johannes Thumshirn
2018-11-28 11:46   ` Jan Kara
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
2018-11-28 11:53   ` Jan Kara
2018-11-28 17:08     ` Dan Williams
2018-11-28 17:10       ` Matthew Wilcox
2018-11-28 19:44   ` [PATCH v2] " Dan Williams

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).