* [PATCH 0/2] Two DAX fixes for 4.20
@ 2018-11-27 21:16 Matthew Wilcox
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
0 siblings, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-fsdevel, Jan Kara, Matthew Wilcox, linux-nvdimm
These both fix race conditions in dax_lock_mapping_entry(). I've tagged
them both for 4.19 backport, which will fail and I'll do the equivalent
patch for it. Dan, do you want to take these through your tree?
Matthew Wilcox (2):
dax: Check page->mapping isn't NULL
dax: Don't access a freed inode
fs/dax.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)
--
2.19.1
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH 1/2] dax: Check page->mapping isn't NULL
2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
@ 2018-11-27 21:16 ` Matthew Wilcox
2018-11-28 9:18 ` Johannes Thumshirn
2018-11-28 11:46 ` Jan Kara
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
1 sibling, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
To: Dan Williams
Cc: linux-nvdimm, stable, Matthew Wilcox, linux-fsdevel, Jan Kara
If we race with inode destroy, it's possible for page->mapping to be
NULL before we even enter this routine, as well as after having slept
waiting for the dax entry to become unlocked.
Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: stable@vger.kernel.org
Reported-by: Jan Kara <jack@suse.cz>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
fs/dax.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/dax.c b/fs/dax.c
index 9bcce89ea18e..e69fc231833b 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -365,7 +365,7 @@ bool dax_lock_mapping_entry(struct page *page)
struct address_space *mapping = READ_ONCE(page->mapping);
locked = false;
- if (!dax_mapping(mapping))
+ if (!mapping || !dax_mapping(mapping))
break;
/*
--
2.19.1
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [PATCH 2/2] dax: Don't access a freed inode
2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
@ 2018-11-27 21:16 ` Matthew Wilcox
2018-11-28 11:53 ` Jan Kara
2018-11-28 19:44 ` [PATCH v2] " Dan Williams
1 sibling, 2 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-27 21:16 UTC (permalink / raw)
To: Dan Williams
Cc: linux-nvdimm, stable, Matthew Wilcox, linux-fsdevel, Jan Kara
After we drop the i_pages lock, the inode can be freed at any time.
The get_unlocked_entry() code has no choice but to reacquire the lock,
so it can't be used here. Create a new wait_entry_unlocked() which takes
care not to acquire the lock or dereference the address_space in any way.
Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Wilcox <willy@infradead.org>
---
fs/dax.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index e69fc231833b..cf1805645d18 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -232,6 +232,28 @@ static void *get_unlocked_entry(struct xa_state *xas)
}
}
+/*
+ * The only thing keeping the address space around is the i_pages lock
+ * (it's cycled in clear_inode() after removing the entries from i_pages)
+ * After we call xas_unlock_irq(), we cannot touch xas->xa.
+ */
+static void wait_entry_unlocked(struct xa_state *xas, void *entry)
+{
+ struct wait_exceptional_entry_queue ewait;
+ wait_queue_head_t *wq;
+
+ init_wait(&ewait.wait);
+ ewait.wait.func = wake_exceptional_entry_func;
+
+ wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+ prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+ xas_unlock_irq(xas);
+ schedule();
+ finish_wait(wq, &ewait.wait);
+ if (waitqueue_active(wq))
+ __wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
static void put_unlocked_entry(struct xa_state *xas, void *entry)
{
/* If we were the only waiter woken, wake the next one */
@@ -389,9 +411,7 @@ bool dax_lock_mapping_entry(struct page *page)
entry = xas_load(&xas);
if (dax_is_locked(entry)) {
rcu_read_unlock();
- entry = get_unlocked_entry(&xas);
- xas_unlock_irq(&xas);
- put_unlocked_entry(&xas, entry);
+ wait_entry_unlocked(&xas, entry);
rcu_read_lock();
continue;
}
--
2.19.1
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] dax: Check page->mapping isn't NULL
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
@ 2018-11-28 9:18 ` Johannes Thumshirn
2018-11-28 11:46 ` Jan Kara
1 sibling, 0 replies; 9+ messages in thread
From: Johannes Thumshirn @ 2018-11-28 9:18 UTC (permalink / raw)
To: Matthew Wilcox, Dan Williams
Cc: linux-fsdevel, Jan Kara, stable, linux-nvdimm
Looks good,
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
--
Johannes Thumshirn SUSE Labs
jthumshirn@suse.de +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 1/2] dax: Check page->mapping isn't NULL
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
2018-11-28 9:18 ` Johannes Thumshirn
@ 2018-11-28 11:46 ` Jan Kara
1 sibling, 0 replies; 9+ messages in thread
From: Jan Kara @ 2018-11-28 11:46 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Dan Williams, Jan Kara, Dave Jiang, linux-nvdimm, linux-fsdevel, stable
On Tue 27-11-18 13:16:33, Matthew Wilcox wrote:
> If we race with inode destroy, it's possible for page->mapping to be
> NULL before we even enter this routine, as well as after having slept
> waiting for the dax entry to become unlocked.
>
> Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> Cc: stable@vger.kernel.org
> Reported-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Matthew Wilcox <willy@infradead.org>
Thanks for writing this fix. The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/dax.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/dax.c b/fs/dax.c
> index 9bcce89ea18e..e69fc231833b 100644
> --- a/fs/dax.c
> +++ b/fs/dax.c
> @@ -365,7 +365,7 @@ bool dax_lock_mapping_entry(struct page *page)
> struct address_space *mapping = READ_ONCE(page->mapping);
>
> locked = false;
> - if (!dax_mapping(mapping))
> + if (!mapping || !dax_mapping(mapping))
> break;
>
> /*
> --
> 2.19.1
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] dax: Don't access a freed inode
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
@ 2018-11-28 11:53 ` Jan Kara
2018-11-28 17:08 ` Dan Williams
2018-11-28 19:44 ` [PATCH v2] " Dan Williams
1 sibling, 1 reply; 9+ messages in thread
From: Jan Kara @ 2018-11-28 11:53 UTC (permalink / raw)
To: Matthew Wilcox
Cc: Dan Williams, Jan Kara, Dave Jiang, linux-nvdimm, linux-fsdevel, stable
On Tue 27-11-18 13:16:34, Matthew Wilcox wrote:
> After we drop the i_pages lock, the inode can be freed at any time.
> The get_unlocked_entry() code has no choice but to reacquire the lock,
> so it can't be used here. Create a new wait_entry_unlocked() which takes
> care not to acquire the lock or dereference the address_space in any way.
>
> Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Matthew Wilcox <willy@infradead.org>
The patch looks good. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Just one nit below:
> +/*
> + * The only thing keeping the address space around is the i_pages lock
> + * (it's cycled in clear_inode() after removing the entries from i_pages)
> + * After we call xas_unlock_irq(), we cannot touch xas->xa.
> + */
> +static void wait_entry_unlocked(struct xa_state *xas, void *entry)
> +{
> + struct wait_exceptional_entry_queue ewait;
> + wait_queue_head_t *wq;
> +
> + init_wait(&ewait.wait);
> + ewait.wait.func = wake_exceptional_entry_func;
> +
> + wq = dax_entry_waitqueue(xas, entry, &ewait.key);
> + prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
> + xas_unlock_irq(xas);
> + schedule();
> + finish_wait(wq, &ewait.wait);
Can we add a comment here like:
/*
* Entry lock waits are exclusive. Wake up the next waiter since we
* aren't sure we will acquire the entry lock and thus wake the
* next waiter up on unlock.
*/
Because I always wonder for a moment why this is needed.
> + if (waitqueue_active(wq))
> + __wake_up(wq, TASK_NORMAL, 1, &ewait.key);
> +}
> +
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] dax: Don't access a freed inode
2018-11-28 11:53 ` Jan Kara
@ 2018-11-28 17:08 ` Dan Williams
2018-11-28 17:10 ` Matthew Wilcox
0 siblings, 1 reply; 9+ messages in thread
From: Dan Williams @ 2018-11-28 17:08 UTC (permalink / raw)
To: Jan Kara; +Cc: linux-fsdevel, stable, Matthew Wilcox, linux-nvdimm
On Wed, Nov 28, 2018 at 3:54 AM Jan Kara <jack@suse.cz> wrote:
>
> On Tue 27-11-18 13:16:34, Matthew Wilcox wrote:
> > After we drop the i_pages lock, the inode can be freed at any time.
> > The get_unlocked_entry() code has no choice but to reacquire the lock,
> > so it can't be used here. Create a new wait_entry_unlocked() which takes
> > care not to acquire the lock or dereference the address_space in any way.
> >
> > Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Matthew Wilcox <willy@infradead.org>
>
> The patch looks good. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Just one nit below:
>
> > +/*
> > + * The only thing keeping the address space around is the i_pages lock
> > + * (it's cycled in clear_inode() after removing the entries from i_pages)
> > + * After we call xas_unlock_irq(), we cannot touch xas->xa.
> > + */
> > +static void wait_entry_unlocked(struct xa_state *xas, void *entry)
> > +{
> > + struct wait_exceptional_entry_queue ewait;
> > + wait_queue_head_t *wq;
> > +
> > + init_wait(&ewait.wait);
> > + ewait.wait.func = wake_exceptional_entry_func;
> > +
> > + wq = dax_entry_waitqueue(xas, entry, &ewait.key);
> > + prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
> > + xas_unlock_irq(xas);
> > + schedule();
> > + finish_wait(wq, &ewait.wait);
>
> Can we add a comment here like:
>
> /*
> * Entry lock waits are exclusive. Wake up the next waiter since we
> * aren't sure we will acquire the entry lock and thus wake the
> * next waiter up on unlock.
> */
>
> Because I always wonder for a moment why this is needed.
Looks good, I'll add that when applying.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH 2/2] dax: Don't access a freed inode
2018-11-28 17:08 ` Dan Williams
@ 2018-11-28 17:10 ` Matthew Wilcox
0 siblings, 0 replies; 9+ messages in thread
From: Matthew Wilcox @ 2018-11-28 17:10 UTC (permalink / raw)
To: Dan Williams; +Cc: linux-fsdevel, Jan Kara, stable, linux-nvdimm
On Wed, Nov 28, 2018 at 09:08:40AM -0800, Dan Williams wrote:
> > Can we add a comment here like:
> >
> > /*
> > * Entry lock waits are exclusive. Wake up the next waiter since we
> > * aren't sure we will acquire the entry lock and thus wake the
> > * next waiter up on unlock.
> > */
> >
> > Because I always wonder for a moment why this is needed.
>
> Looks good, I'll add that when applying.
Thanks.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] dax: Don't access a freed inode
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
2018-11-28 11:53 ` Jan Kara
@ 2018-11-28 19:44 ` Dan Williams
1 sibling, 0 replies; 9+ messages in thread
From: Dan Williams @ 2018-11-28 19:44 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Jan Kara, Matthew Wilcox, stable, linux-nvdimm
From: Matthew Wilcox <willy@infradead.org>
After we drop the i_pages lock, the inode can be freed at any time.
The get_unlocked_entry() code has no choice but to reacquire the lock,
so it can't be used here. Create a new wait_entry_unlocked() which takes
care not to acquire the lock or dereference the address_space in any way.
Fixes: c2a7d2a11552 ("filesystem-dax: Introduce dax_lock_mapping_entry()")
Cc: <stable@vger.kernel.org>
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
fs/dax.c | 32 +++++++++++++++++++++++++++++---
1 file changed, 29 insertions(+), 3 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index e69fc231833b..3f592dc18d67 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -232,6 +232,34 @@ static void *get_unlocked_entry(struct xa_state *xas)
}
}
+/*
+ * The only thing keeping the address space around is the i_pages lock
+ * (it's cycled in clear_inode() after removing the entries from i_pages)
+ * After we call xas_unlock_irq(), we cannot touch xas->xa.
+ */
+static void wait_entry_unlocked(struct xa_state *xas, void *entry)
+{
+ struct wait_exceptional_entry_queue ewait;
+ wait_queue_head_t *wq;
+
+ init_wait(&ewait.wait);
+ ewait.wait.func = wake_exceptional_entry_func;
+
+ wq = dax_entry_waitqueue(xas, entry, &ewait.key);
+ prepare_to_wait_exclusive(wq, &ewait.wait, TASK_UNINTERRUPTIBLE);
+ xas_unlock_irq(xas);
+ schedule();
+ finish_wait(wq, &ewait.wait);
+
+ /*
+ * Entry lock waits are exclusive. Wake up the next waiter since
+ * we aren't sure we will acquire the entry lock and thus wake
+ * the next waiter up on unlock.
+ */
+ if (waitqueue_active(wq))
+ __wake_up(wq, TASK_NORMAL, 1, &ewait.key);
+}
+
static void put_unlocked_entry(struct xa_state *xas, void *entry)
{
/* If we were the only waiter woken, wake the next one */
@@ -389,9 +417,7 @@ bool dax_lock_mapping_entry(struct page *page)
entry = xas_load(&xas);
if (dax_is_locked(entry)) {
rcu_read_unlock();
- entry = get_unlocked_entry(&xas);
- xas_unlock_irq(&xas);
- put_unlocked_entry(&xas, entry);
+ wait_entry_unlocked(&xas, entry);
rcu_read_lock();
continue;
}
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-11-28 19:56 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-11-27 21:16 [PATCH 0/2] Two DAX fixes for 4.20 Matthew Wilcox
2018-11-27 21:16 ` [PATCH 1/2] dax: Check page->mapping isn't NULL Matthew Wilcox
2018-11-28 9:18 ` Johannes Thumshirn
2018-11-28 11:46 ` Jan Kara
2018-11-27 21:16 ` [PATCH 2/2] dax: Don't access a freed inode Matthew Wilcox
2018-11-28 11:53 ` Jan Kara
2018-11-28 17:08 ` Dan Williams
2018-11-28 17:10 ` Matthew Wilcox
2018-11-28 19:44 ` [PATCH v2] " Dan Williams
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).