* [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts()
@ 2018-08-07 22:11 Dave Jiang
2018-08-07 22:11 ` [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts() Dave Jiang
2018-08-08 8:49 ` [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Jan Kara
0 siblings, 2 replies; 5+ messages in thread
From: Dave Jiang @ 2018-08-07 22:11 UTC (permalink / raw)
To: tytso, darrick.wong, jack, zwisler
Cc: linux-nvdimm, david, linux-xfs, linux-fsdevel, lczerner, linux-ext4, hch
From: Ross Zwisler <zwisler@kernel.org>
If the refcount of a page is lowered between the time that it is returned
by dax_busy_page() and when the refcount is again checked in
ext4_break_layouts() => ___wait_var_event(), the waiting function
ext4_wait_dax_page() will never be called. This means that
ext4_break_layouts() will still have 'retry' set to false, so we'll stop
looping and never check the refcount of other pages in this inode.
Instead, always continue looping as long as dax_layout_busy_page() gives us
a page which it found with an elevated refcount.
Note that this works around the race exposed by my unit test, but I think
that there is another race that needs to be addressed, probably with
additional synchronization added between direct I/O and
{ext4,xfs}_break_layouts().
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/inode.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8f6ad7667974..d2663a1e3ec2 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4191,9 +4191,8 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
return 0;
}
-static void ext4_wait_dax_page(struct ext4_inode_info *ei, bool *did_unlock)
+static void ext4_wait_dax_page(struct ext4_inode_info *ei)
{
- *did_unlock = true;
up_write(&ei->i_mmap_sem);
schedule();
down_write(&ei->i_mmap_sem);
@@ -4203,14 +4202,12 @@ int ext4_break_layouts(struct inode *inode)
{
struct ext4_inode_info *ei = EXT4_I(inode);
struct page *page;
- bool retry;
int error;
if (WARN_ON_ONCE(!rwsem_is_locked(&ei->i_mmap_sem)))
return -EINVAL;
do {
- retry = false;
page = dax_layout_busy_page(inode->i_mapping);
if (!page)
return 0;
@@ -4218,8 +4215,8 @@ int ext4_break_layouts(struct inode *inode)
error = ___wait_var_event(&page->_refcount,
atomic_read(&page->_refcount) == 1,
TASK_INTERRUPTIBLE, 0, 0,
- ext4_wait_dax_page(ei, &retry));
- } while (error == 0 && retry);
+ ext4_wait_dax_page(ei));
+ } while (error == 0);
return error;
}
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts()
2018-08-07 22:11 [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Dave Jiang
@ 2018-08-07 22:11 ` Dave Jiang
2018-08-08 8:53 ` Jan Kara
2018-08-08 8:49 ` [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Jan Kara
1 sibling, 1 reply; 5+ messages in thread
From: Dave Jiang @ 2018-08-07 22:11 UTC (permalink / raw)
To: tytso, darrick.wong, jack, zwisler
Cc: linux-nvdimm, david, linux-xfs, linux-fsdevel, lczerner, linux-ext4, hch
This patch is the duplicate of ross's fix for ext4 for xfs.
If the refcount of a page is lowered between the time that it is returned
by dax_busy_page() and when the refcount is again checked in
xfs_break_layouts() => ___wait_var_event(), the waiting function
xfs_wait_dax_page() will never be called. This means that
xfs_break_layouts() will still have 'retry' set to false, so we'll stop
looping and never check the refcount of other pages in this inode.
Instead, always continue looping as long as dax_layout_busy_page() gives us
a page which it found with an elevated refcount.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
fs/xfs/xfs_file.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index a3e7767a5715..666c93fe5284 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -721,12 +721,10 @@ xfs_file_write_iter(
static void
xfs_wait_dax_page(
- struct inode *inode,
- bool *did_unlock)
+ struct inode *inode)
{
struct xfs_inode *ip = XFS_I(inode);
- *did_unlock = true;
xfs_iunlock(ip, XFS_MMAPLOCK_EXCL);
schedule();
xfs_ilock(ip, XFS_MMAPLOCK_EXCL);
@@ -746,9 +744,10 @@ xfs_break_dax_layouts(
if (!page)
return 0;
+ *did_unlock = true;
return ___wait_var_event(&page->_refcount,
atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE,
- 0, 0, xfs_wait_dax_page(inode, did_unlock));
+ 0, 0, xfs_wait_dax_page(inode));
}
int
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts()
2018-08-07 22:11 ` [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts() Dave Jiang
@ 2018-08-08 8:53 ` Jan Kara
2018-08-08 15:47 ` Dave Jiang
0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2018-08-08 8:53 UTC (permalink / raw)
To: Dave Jiang
Cc: lczerner, jack, linux-nvdimm, darrick.wong, david, linux-xfs,
zwisler, linux-fsdevel, tytso, linux-ext4, hch
On Tue 07-08-18 15:11:43, Dave Jiang wrote:
> This patch is the duplicate of ross's fix for ext4 for xfs.
>
> If the refcount of a page is lowered between the time that it is returned
> by dax_busy_page() and when the refcount is again checked in
> xfs_break_layouts() => ___wait_var_event(), the waiting function
> xfs_wait_dax_page() will never be called. This means that
> xfs_break_layouts() will still have 'retry' set to false, so we'll stop
> looping and never check the refcount of other pages in this inode.
>
> Instead, always continue looping as long as dax_layout_busy_page() gives us
> a page which it found with an elevated refcount.
>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
The patch looks good to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
Just one minor nit below:
> @@ -746,9 +744,10 @@ xfs_break_dax_layouts(
> if (!page)
> return 0;
>
> + *did_unlock = true;
I think it would be more understandable to name the argument of
xfs_break_dax_layouts() as 'retry' instead of 'did_unlock' as it's not
about unlocking anymore.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts()
2018-08-08 8:53 ` Jan Kara
@ 2018-08-08 15:47 ` Dave Jiang
0 siblings, 0 replies; 5+ messages in thread
From: Dave Jiang @ 2018-08-08 15:47 UTC (permalink / raw)
To: Jan Kara
Cc: tytso, linux-nvdimm, darrick.wong, david, linux-xfs, zwisler,
linux-fsdevel, lczerner, linux-ext4, hch
On 08/08/2018 01:53 AM, Jan Kara wrote:
> On Tue 07-08-18 15:11:43, Dave Jiang wrote:
>> This patch is the duplicate of ross's fix for ext4 for xfs.
>>
>> If the refcount of a page is lowered between the time that it is returned
>> by dax_busy_page() and when the refcount is again checked in
>> xfs_break_layouts() => ___wait_var_event(), the waiting function
>> xfs_wait_dax_page() will never be called. This means that
>> xfs_break_layouts() will still have 'retry' set to false, so we'll stop
>> looping and never check the refcount of other pages in this inode.
>>
>> Instead, always continue looping as long as dax_layout_busy_page() gives us
>> a page which it found with an elevated refcount.
>>
>> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
>
> The patch looks good to me. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Just one minor nit below:
>
>> @@ -746,9 +744,10 @@ xfs_break_dax_layouts(
>> if (!page)
>> return 0;
>>
>> + *did_unlock = true;
>
> I think it would be more understandable to name the argument of
> xfs_break_dax_layouts() as 'retry' instead of 'did_unlock' as it's not
> about unlocking anymore.
Thanks for the review Jan! I will change. I was trying to decide between
less code change vs more clear definition. :)
>
> Honza
>
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts()
2018-08-07 22:11 [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Dave Jiang
2018-08-07 22:11 ` [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts() Dave Jiang
@ 2018-08-08 8:49 ` Jan Kara
1 sibling, 0 replies; 5+ messages in thread
From: Jan Kara @ 2018-08-08 8:49 UTC (permalink / raw)
To: Dave Jiang
Cc: lczerner, jack, linux-nvdimm, darrick.wong, david, linux-xfs,
zwisler, linux-fsdevel, tytso, linux-ext4, hch
On Tue 07-08-18 15:11:37, Dave Jiang wrote:
> From: Ross Zwisler <zwisler@kernel.org>
>
> If the refcount of a page is lowered between the time that it is returned
> by dax_busy_page() and when the refcount is again checked in
> ext4_break_layouts() => ___wait_var_event(), the waiting function
> ext4_wait_dax_page() will never be called. This means that
> ext4_break_layouts() will still have 'retry' set to false, so we'll stop
> looping and never check the refcount of other pages in this inode.
>
> Instead, always continue looping as long as dax_layout_busy_page() gives us
> a page which it found with an elevated refcount.
>
> Note that this works around the race exposed by my unit test, but I think
> that there is another race that needs to be addressed, probably with
> additional synchronization added between direct I/O and
> {ext4,xfs}_break_layouts().
I'd just note that the race Ross suspected should be properly handled by
dax_layout_busy_page() so I think this last paragraph from the changelog
can go.
Also Ted, this fixes a problem in the DAX truncate patches you currently
carry in your tree so you can consider just pushing it with them during the
merge window. It's not necessary though - the patches already make the
problematic behavior much less likely, this patch just hopefully completely
closes the race window.
> Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
> Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> fs/ext4/inode.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 8f6ad7667974..d2663a1e3ec2 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4191,9 +4191,8 @@ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset,
> return 0;
> }
>
> -static void ext4_wait_dax_page(struct ext4_inode_info *ei, bool *did_unlock)
> +static void ext4_wait_dax_page(struct ext4_inode_info *ei)
> {
> - *did_unlock = true;
> up_write(&ei->i_mmap_sem);
> schedule();
> down_write(&ei->i_mmap_sem);
> @@ -4203,14 +4202,12 @@ int ext4_break_layouts(struct inode *inode)
> {
> struct ext4_inode_info *ei = EXT4_I(inode);
> struct page *page;
> - bool retry;
> int error;
>
> if (WARN_ON_ONCE(!rwsem_is_locked(&ei->i_mmap_sem)))
> return -EINVAL;
>
> do {
> - retry = false;
> page = dax_layout_busy_page(inode->i_mapping);
> if (!page)
> return 0;
> @@ -4218,8 +4215,8 @@ int ext4_break_layouts(struct inode *inode)
> error = ___wait_var_event(&page->_refcount,
> atomic_read(&page->_refcount) == 1,
> TASK_INTERRUPTIBLE, 0, 0,
> - ext4_wait_dax_page(ei, &retry));
> - } while (error == 0 && retry);
> + ext4_wait_dax_page(ei));
> + } while (error == 0);
>
> return error;
> }
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-08-08 15:47 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-07 22:11 [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Dave Jiang
2018-08-07 22:11 ` [PATCH 2/2] [PATCH] xfs: Close race between direct IO and xfs_break_layouts() Dave Jiang
2018-08-08 8:53 ` Jan Kara
2018-08-08 15:47 ` Dave Jiang
2018-08-08 8:49 ` [PATCH 1/2] ext4: Close race between direct IO and ext4_break_layouts() Jan Kara
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).