All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3 v2] ext4: Fix issues in ext4 truncate handling
@ 2019-05-22  9:03 Jan Kara
  2019-05-22  9:03 ` [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode Jan Kara
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Jan Kara @ 2019-05-22  9:03 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Ira Weiny, Jan Kara

Hello,

Ira Weiny has reported that ext4_setattr() doesn't handle properly failure
of ext4_break_layouts(). When revieweing truncate handling code in
ext4_setattr() I've found some more issues. This series fixes them.

Changes since v1:
* added Reviewed-by and Tested-by tags
* removed unnecessary ext4_orpan_del() call

								Honza

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode
  2019-05-22  9:03 [PATCH 0/3 v2] ext4: Fix issues in ext4 truncate handling Jan Kara
@ 2019-05-22  9:03 ` Jan Kara
  2019-05-24  3:46   ` Theodore Ts'o
  2019-05-22  9:03 ` [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate Jan Kara
  2019-05-22  9:03 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
  2 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2019-05-22  9:03 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Ira Weiny, Jan Kara, stable

We didn't wait for outstanding direct IO during truncate in nojournal
mode (as we skip orphan handling in that case). This can lead to fs
corruption or stale data exposure if truncate ends up freeing blocks
and these get reallocated before direct IO finishes. Fix the condition
determining whether the wait is necessary.

CC: stable@vger.kernel.org
Fixes: 1c9114f9c0f1 ("ext4: serialize unlocked dio reads with truncate")
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 82298c63ea6d..9bcb7f2b86dd 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5630,20 +5630,17 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 				goto err_out;
 			}
 		}
-		if (!shrink)
+		if (!shrink) {
 			pagecache_isize_extended(inode, oldsize, inode->i_size);
-
-		/*
-		 * Blocks are going to be removed from the inode. Wait
-		 * for dio in flight.  Temporarily disable
-		 * dioread_nolock to prevent livelock.
-		 */
-		if (orphan) {
-			if (!ext4_should_journal_data(inode)) {
-				inode_dio_wait(inode);
-			} else
-				ext4_wait_for_tail_page_commit(inode);
+		} else {
+			/*
+			 * Blocks are going to be removed from the inode. Wait
+			 * for dio in flight.
+			 */
+			inode_dio_wait(inode);
 		}
+		if (orphan && ext4_should_journal_data(inode))
+			ext4_wait_for_tail_page_commit(inode);
 		down_write(&EXT4_I(inode)->i_mmap_sem);
 
 		rc = ext4_break_layouts(inode);
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate
  2019-05-22  9:03 [PATCH 0/3 v2] ext4: Fix issues in ext4 truncate handling Jan Kara
  2019-05-22  9:03 ` [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode Jan Kara
@ 2019-05-22  9:03 ` Jan Kara
  2019-05-24  3:47   ` Theodore Ts'o
  2019-05-22  9:03 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
  2 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2019-05-22  9:03 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Ira Weiny, Jan Kara

It is possible that unlinked inode enters ext4_setattr() (e.g. if
somebody calls ftruncate(2) on unlinked but still open file). In such
case we should not delete the inode from the orphan list if truncate
fails. Note that this is mostly a theoretical concern as filesystem is
corrupted if we reach this path anyway but let's be consistent in our
orphan handling.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 9bcb7f2b86dd..c7f77c643008 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5625,7 +5625,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			up_write(&EXT4_I(inode)->i_data_sem);
 			ext4_journal_stop(handle);
 			if (error) {
-				if (orphan)
+				if (orphan && inode->i_nlink)
 					ext4_orphan_del(NULL, inode);
 				goto err_out;
 			}
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-22  9:03 [PATCH 0/3 v2] ext4: Fix issues in ext4 truncate handling Jan Kara
  2019-05-22  9:03 ` [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode Jan Kara
  2019-05-22  9:03 ` [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate Jan Kara
@ 2019-05-22  9:03 ` Jan Kara
  2019-05-24  4:18   ` Theodore Ts'o
  2019-05-25  3:32   ` Theodore Ts'o
  2 siblings, 2 replies; 13+ messages in thread
From: Jan Kara @ 2019-05-22  9:03 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Ira Weiny, Jan Kara

ext4_break_layouts() may fail e.g. due to a signal being delivered.
Thus we need to handle its failure gracefully and not by taking the
filesystem down. Currently ext4_break_layouts() failure is rare but it
may become more common once RDMA uses layout leases for handling
long-term page pins for DAX mappings.

To handle the failure we need to move ext4_break_layouts() earlier
during setattr handling before we do hard to undo changes such as
modifying inode size. To be able to do that we also have to move some
other checks which are better done without holding i_mmap_sem earlier.

Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 60 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 29 insertions(+), 31 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c7f77c643008..33411ba4546a 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5571,7 +5571,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 	if (attr->ia_valid & ATTR_SIZE) {
 		handle_t *handle;
 		loff_t oldsize = inode->i_size;
-		int shrink = (attr->ia_size <= inode->i_size);
+		int shrink = (attr->ia_size < inode->i_size);
 
 		if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) {
 			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
@@ -5585,18 +5585,35 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
 			inode_inc_iversion(inode);
 
-		if (ext4_should_order_data(inode) &&
-		    (attr->ia_size < inode->i_size)) {
-			error = ext4_begin_ordered_truncate(inode,
+		if (shrink) {
+			if (ext4_should_order_data(inode)) {
+				error = ext4_begin_ordered_truncate(inode,
 							    attr->ia_size);
-			if (error)
-				goto err_out;
+				if (error)
+					goto err_out;
+			}
+			/*
+			 * Blocks are going to be removed from the inode. Wait
+			 * for dio in flight.
+			 */
+			inode_dio_wait(inode);
+		} else {
+			pagecache_isize_extended(inode, oldsize, inode->i_size);
 		}
+
+		down_write(&EXT4_I(inode)->i_mmap_sem);
+
+		rc = ext4_break_layouts(inode);
+		if (rc) {
+			up_write(&EXT4_I(inode)->i_mmap_sem);
+			return rc;
+		}
+
 		if (attr->ia_size != inode->i_size) {
 			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
 			if (IS_ERR(handle)) {
 				error = PTR_ERR(handle);
-				goto err_out;
+				goto out_mmap_sem;
 			}
 			if (ext4_handle_valid(handle) && shrink) {
 				error = ext4_orphan_add(handle, inode);
@@ -5624,32 +5641,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 				i_size_write(inode, attr->ia_size);
 			up_write(&EXT4_I(inode)->i_data_sem);
 			ext4_journal_stop(handle);
-			if (error) {
-				if (orphan && inode->i_nlink)
-					ext4_orphan_del(NULL, inode);
-				goto err_out;
-			}
-		}
-		if (!shrink) {
-			pagecache_isize_extended(inode, oldsize, inode->i_size);
-		} else {
-			/*
-			 * Blocks are going to be removed from the inode. Wait
-			 * for dio in flight.
-			 */
-			inode_dio_wait(inode);
-		}
-		if (orphan && ext4_should_journal_data(inode))
-			ext4_wait_for_tail_page_commit(inode);
-		down_write(&EXT4_I(inode)->i_mmap_sem);
-
-		rc = ext4_break_layouts(inode);
-		if (rc) {
-			up_write(&EXT4_I(inode)->i_mmap_sem);
-			error = rc;
-			goto err_out;
+			if (error)
+				goto out_mmap_sem;
 		}
 
+		if (shrink && ext4_should_journal_data(inode))
+			ext4_wait_for_tail_page_commit(inode);
 		/*
 		 * Truncate pagecache after we've waited for commit
 		 * in data=journal mode to make pages freeable.
@@ -5660,6 +5657,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			if (rc)
 				error = rc;
 		}
+out_mmap_sem:
 		up_write(&EXT4_I(inode)->i_mmap_sem);
 	}
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode
  2019-05-22  9:03 ` [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode Jan Kara
@ 2019-05-24  3:46   ` Theodore Ts'o
  0 siblings, 0 replies; 13+ messages in thread
From: Theodore Ts'o @ 2019-05-24  3:46 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, Ira Weiny, stable

On Wed, May 22, 2019 at 11:03:15AM +0200, Jan Kara wrote:
> We didn't wait for outstanding direct IO during truncate in nojournal
> mode (as we skip orphan handling in that case). This can lead to fs
> corruption or stale data exposure if truncate ends up freeing blocks
> and these get reallocated before direct IO finishes. Fix the condition
> determining whether the wait is necessary.
> 
> CC: stable@vger.kernel.org
> Fixes: 1c9114f9c0f1 ("ext4: serialize unlocked dio reads with truncate")
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied.

					- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate
  2019-05-22  9:03 ` [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate Jan Kara
@ 2019-05-24  3:47   ` Theodore Ts'o
  0 siblings, 0 replies; 13+ messages in thread
From: Theodore Ts'o @ 2019-05-24  3:47 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, Ira Weiny

On Wed, May 22, 2019 at 11:03:16AM +0200, Jan Kara wrote:
> It is possible that unlinked inode enters ext4_setattr() (e.g. if
> somebody calls ftruncate(2) on unlinked but still open file). In such
> case we should not delete the inode from the orphan list if truncate
> fails. Note that this is mostly a theoretical concern as filesystem is
> corrupted if we reach this path anyway but let's be consistent in our
> orphan handling.
> 
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

Thanks, applied (and I added a cc:stable@kernel.org).

		       	       - Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-22  9:03 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
@ 2019-05-24  4:18   ` Theodore Ts'o
  2019-05-24  8:13     ` Jan Kara
  2019-05-25  3:32   ` Theodore Ts'o
  1 sibling, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2019-05-24  4:18 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, Ira Weiny

On Wed, May 22, 2019 at 11:03:17AM +0200, Jan Kara wrote:
> ext4_break_layouts() may fail e.g. due to a signal being delivered.
> Thus we need to handle its failure gracefully and not by taking the
> filesystem down. Currently ext4_break_layouts() failure is rare but it
> may become more common once RDMA uses layout leases for handling
> long-term page pins for DAX mappings.
> 
> To handle the failure we need to move ext4_break_layouts() earlier
> during setattr handling before we do hard to undo changes such as
> modifying inode sizhe. To be able to do that we also have to move some
> other checks which are better done uwithout holding i_mmap_sem earlier.
> 
> Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>hh

Thanks, applied.

What do people think about adding marking this for stable?  My take is
that DAX is still not that common for most stable kernel users, and
the patch moves enough stuff around that it's borderline for stable.
I'm going to leave off marking for stable unless someone wants to make
a case that we should so mark it.

       	       	     	       - Ted
			       

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-24  4:18   ` Theodore Ts'o
@ 2019-05-24  8:13     ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-05-24  8:13 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Jan Kara, linux-ext4, Ira Weiny

On Fri 24-05-19 00:18:29, Theodore Ts'o wrote:
> On Wed, May 22, 2019 at 11:03:17AM +0200, Jan Kara wrote:
> > ext4_break_layouts() may fail e.g. due to a signal being delivered.
> > Thus we need to handle its failure gracefully and not by taking the
> > filesystem down. Currently ext4_break_layouts() failure is rare but it
> > may become more common once RDMA uses layout leases for handling
> > long-term page pins for DAX mappings.
> > 
> > To handle the failure we need to move ext4_break_layouts() earlier
> > during setattr handling before we do hard to undo changes such as
> > modifying inode sizhe. To be able to do that we also have to move some
> > other checks which are better done uwithout holding i_mmap_sem earlier.
> > 
> > Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>hh
> 
> Thanks, applied.
> 
> What do people think about adding marking this for stable?  My take is
> that DAX is still not that common for most stable kernel users, and
> the patch moves enough stuff around that it's borderline for stable.
> I'm going to leave off marking for stable unless someone wants to make
> a case that we should so mark it.

Yeah, my take was that I'd care about backporting this patch for stable once
somebody complains that he has actually hit the problem...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-22  9:03 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
  2019-05-24  4:18   ` Theodore Ts'o
@ 2019-05-25  3:32   ` Theodore Ts'o
  2019-05-27 14:53     ` Jan Kara
  1 sibling, 1 reply; 13+ messages in thread
From: Theodore Ts'o @ 2019-05-25  3:32 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-ext4, Ira Weiny

On Wed, May 22, 2019 at 11:03:17AM +0200, Jan Kara wrote:
> ext4_break_layouts() may fail e.g. due to a signal being delivered.
> Thus we need to handle its failure gracefully and not by taking the
> filesystem down. Currently ext4_break_layouts() failure is rare but it
> may become more common once RDMA uses layout leases for handling
> long-term page pins for DAX mappings.
> 
> To handle the failure we need to move ext4_break_layouts() earlier
> during setattr handling before we do hard to undo changes such as
> modifying inode size. To be able to do that we also have to move some
> other checks which are better done without holding i_mmap_sem earlier.
> 
> Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>

When doing some final testing before sending a pull request to Linus,
I found a regression.  After bisecting, this patch fails reliably
under gce-xfstests:

TESTRUNID: tytso-20190524230226
KERNEL:    kernel 5.1.0-rc3-xfstests-00039-g079f9927c7bf #1016 SMP Fri May 24 23:00:47 EDT 2019 x86_64
CMDLINE:   -c 4k generic/092
CPUS:      2
MEM:       7680

ext4/4k: 1 tests, 1 failures, 2 seconds
  generic/092  Failed   1s
Totals: 1 tests, 0 skipped, 1 failures, 0 errors, 1s

FSTESTPRJ: gce-xfstests
FSTESTVER: fio  fio-3.2 (Fri, 3 Nov 2017 15:23:49 -0600)
FSTESTVER: quota  62661bd (Tue, 2 Apr 2019 17:04:37 +0200)
FSTESTVER: xfsprogs v5.0.0 (Fri, 3 May 2019 12:14:36 -0500)
FSTESTVER: xfstests-bld 9582562 (Sun, 12 May 2019 00:38:51 -0400)
FSTESTVER: xfstests linux-v3.8-2390-g64233614 (Thu, 16 May 2019 00:12:52 -0400)
FSTESTCFG: 4k
FSTESTSET: generic/092
FSTESTOPT: aex
GCE ID:    343197219467628221

generic/092 0s ... 	[23:05:07] [23:05:08]- output mismatch (see /results/ext4/results-4k/generic/092
.out.bad)
% diff -u /tmp/results-tytso-20190524230226/ext4/results-4k/generic/092.out.bad /usr/projects/xfstests-bld/build-64/xfstests-dev/tests/generic/092.out 
--- /tmp/results-tytso-20190524230226/ext4/results-4k/generic/092.out.bad	2019-05-24 23:05:08.000000000 -0400
+++ /usr/projects/xfstests-bld/build-64/xfstests-dev/tests/generic/092.out	2018-02-13 23:37:20.330097382 -0500
@@ -2,6 +2,5 @@
 wrote 5242880/5242880 bytes at offset 0
 XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 0: [0..10239]: data
-1: [10240..20479]: unwritten
 0: [0..10239]: data
 1: [10240..20479]: unwritten


Dropping this patch makes the test failure go away.  So I'm going to
drop it for now.  Jan, can you take a look?  Thanks!!

	      	    	      	   	      - Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-25  3:32   ` Theodore Ts'o
@ 2019-05-27 14:53     ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-05-27 14:53 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Jan Kara, linux-ext4, Ira Weiny

On Fri 24-05-19 23:32:35, Theodore Ts'o wrote:
> On Wed, May 22, 2019 at 11:03:17AM +0200, Jan Kara wrote:
> > ext4_break_layouts() may fail e.g. due to a signal being delivered.
> > Thus we need to handle its failure gracefully and not by taking the
> > filesystem down. Currently ext4_break_layouts() failure is rare but it
> > may become more common once RDMA uses layout leases for handling
> > long-term page pins for DAX mappings.
> > 
> > To handle the failure we need to move ext4_break_layouts() earlier
> > during setattr handling before we do hard to undo changes such as
> > modifying inode size. To be able to do that we also have to move some
> > other checks which are better done without holding i_mmap_sem earlier.
> > 
> > Reported-and-tested-by: Ira Weiny <ira.weiny@intel.com>
> > Reviewed-by: Ira Weiny <ira.weiny@intel.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> When doing some final testing before sending a pull request to Linus,
> I found a regression.  After bisecting, this patch fails reliably
> under gce-xfstests:
> 
> TESTRUNID: tytso-20190524230226
> KERNEL:    kernel 5.1.0-rc3-xfstests-00039-g079f9927c7bf #1016 SMP Fri May 24 23:00:47 EDT 2019 x86_64
> CMDLINE:   -c 4k generic/092
> CPUS:      2
> MEM:       7680
> 
> ext4/4k: 1 tests, 1 failures, 2 seconds
>   generic/092  Failed   1s
> Totals: 1 tests, 0 skipped, 1 failures, 0 errors, 1s
> 
> FSTESTPRJ: gce-xfstests
> FSTESTVER: fio  fio-3.2 (Fri, 3 Nov 2017 15:23:49 -0600)
> FSTESTVER: quota  62661bd (Tue, 2 Apr 2019 17:04:37 +0200)
> FSTESTVER: xfsprogs v5.0.0 (Fri, 3 May 2019 12:14:36 -0500)
> FSTESTVER: xfstests-bld 9582562 (Sun, 12 May 2019 00:38:51 -0400)
> FSTESTVER: xfstests linux-v3.8-2390-g64233614 (Thu, 16 May 2019 00:12:52 -0400)
> FSTESTCFG: 4k
> FSTESTSET: generic/092
> FSTESTOPT: aex
> GCE ID:    343197219467628221
> 
> generic/092 0s ... 	[23:05:07] [23:05:08]- output mismatch (see /results/ext4/results-4k/generic/092
> .out.bad)
> % diff -u /tmp/results-tytso-20190524230226/ext4/results-4k/generic/092.out.bad /usr/projects/xfstests-bld/build-64/xfstests-dev/tests/generic/092.out 
> --- /tmp/results-tytso-20190524230226/ext4/results-4k/generic/092.out.bad	2019-05-24 23:05:08.000000000 -0400
> +++ /usr/projects/xfstests-bld/build-64/xfstests-dev/tests/generic/092.out	2018-02-13 23:37:20.330097382 -0500
> @@ -2,6 +2,5 @@
>  wrote 5242880/5242880 bytes at offset 0
>  XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
>  0: [0..10239]: data
> -1: [10240..20479]: unwritten
>  0: [0..10239]: data
>  1: [10240..20479]: unwritten
> 
> 
> Dropping this patch makes the test failure go away.  So I'm going to
> drop it for now.  Jan, can you take a look?  Thanks!!

Ah, right. I wonder how I missed that failure in my test run. Anyway I see
what is the problem. I'll send updated patch.

								Honza

-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-21 18:27   ` Ira Weiny
@ 2019-05-22  8:57     ` Jan Kara
  0 siblings, 0 replies; 13+ messages in thread
From: Jan Kara @ 2019-05-22  8:57 UTC (permalink / raw)
  To: Ira Weiny; +Cc: Jan Kara, Ted Tso, linux-ext4

On Tue 21-05-19 11:27:32, Ira Weiny wrote:
> On Tue, May 21, 2019 at 09:43:58AM +0200, Jan Kara wrote:
> > ext4_break_layouts() may fail e.g. due to a signal being delivered.
> > Thus we need to handle its failure gracefully and not by taking the
> > filesystem down. Currently ext4_break_layouts() failure is rare but it
> > may become more common once RDMA uses layout leases for handling
> > long-term page pins for DAX mappings.
> > 
> > To handle the failure we need to move ext4_break_layouts() earlier
> > during setattr handling before we do hard to undo changes such as
> > modifying inode size. To be able to do that we also have to move some
> > other checks which are better done without holding i_mmap_sem earlier.
> > 
> > Reported-by: "Weiny, Ira" <ira.weiny@intel.com>
> > Signed-off-by: Jan Kara <jack@suse.cz>
> 
> 
> This fixes the bug I was seeing WRT ext4_break_layouts().  Thanks for the help!
> One more NIT comment below.
> 
> > @@ -5627,29 +5644,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
> >  			if (error) {
> >  				if (orphan && inode->i_nlink)
> >  					ext4_orphan_del(NULL, inode);
> > -				goto err_out;
> > +				goto out_mmap_sem;
> 
> This goto flows through a second ext4_orphan_del() call which threw me at
> first.  But I think this is ok.

It is OK but unnecessary. I've deleted this ext4_orphan_del() call. Thanks
for testing and review!

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-21  7:43 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
@ 2019-05-21 18:27   ` Ira Weiny
  2019-05-22  8:57     ` Jan Kara
  0 siblings, 1 reply; 13+ messages in thread
From: Ira Weiny @ 2019-05-21 18:27 UTC (permalink / raw)
  To: Jan Kara; +Cc: Ted Tso, linux-ext4

On Tue, May 21, 2019 at 09:43:58AM +0200, Jan Kara wrote:
> ext4_break_layouts() may fail e.g. due to a signal being delivered.
> Thus we need to handle its failure gracefully and not by taking the
> filesystem down. Currently ext4_break_layouts() failure is rare but it
> may become more common once RDMA uses layout leases for handling
> long-term page pins for DAX mappings.
> 
> To handle the failure we need to move ext4_break_layouts() earlier
> during setattr handling before we do hard to undo changes such as
> modifying inode size. To be able to do that we also have to move some
> other checks which are better done without holding i_mmap_sem earlier.
> 
> Reported-by: "Weiny, Ira" <ira.weiny@intel.com>
> Signed-off-by: Jan Kara <jack@suse.cz>


This fixes the bug I was seeing WRT ext4_break_layouts().  Thanks for the help!
One more NIT comment below.

> ---
>  fs/ext4/inode.c | 55 ++++++++++++++++++++++++++++---------------------------
>  1 file changed, 28 insertions(+), 27 deletions(-)
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index c7f77c643008..979570b42e18 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5571,7 +5571,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  	if (attr->ia_valid & ATTR_SIZE) {
>  		handle_t *handle;
>  		loff_t oldsize = inode->i_size;
> -		int shrink = (attr->ia_size <= inode->i_size);
> +		int shrink = (attr->ia_size < inode->i_size);
>  
>  		if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) {
>  			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> @@ -5585,18 +5585,35 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
>  			inode_inc_iversion(inode);
>  
> -		if (ext4_should_order_data(inode) &&
> -		    (attr->ia_size < inode->i_size)) {
> -			error = ext4_begin_ordered_truncate(inode,
> +		if (shrink) {
> +			if (ext4_should_order_data(inode)) {
> +				error = ext4_begin_ordered_truncate(inode,
>  							    attr->ia_size);
> -			if (error)
> -				goto err_out;
> +				if (error)
> +					goto err_out;
> +			}
> +			/*
> +			 * Blocks are going to be removed from the inode. Wait
> +			 * for dio in flight.
> +			 */
> +			inode_dio_wait(inode);
> +		} else {
> +			pagecache_isize_extended(inode, oldsize, inode->i_size);
>  		}
> +
> +		down_write(&EXT4_I(inode)->i_mmap_sem);
> +
> +		rc = ext4_break_layouts(inode);
> +		if (rc) {
> +			up_write(&EXT4_I(inode)->i_mmap_sem);
> +			return rc;
> +		}
> +
>  		if (attr->ia_size != inode->i_size) {
>  			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
>  			if (IS_ERR(handle)) {
>  				error = PTR_ERR(handle);
> -				goto err_out;
> +				goto out_mmap_sem;
>  			}
>  			if (ext4_handle_valid(handle) && shrink) {
>  				error = ext4_orphan_add(handle, inode);
> @@ -5627,29 +5644,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  			if (error) {
>  				if (orphan && inode->i_nlink)
>  					ext4_orphan_del(NULL, inode);
> -				goto err_out;
> +				goto out_mmap_sem;

This goto flows through a second ext4_orphan_del() call which threw me at
first.  But I think this is ok.

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

And with the series applied.

Tested-by: Ira Weiny <ira.weiny@intel.com>

>  			}
>  		}
> -		if (!shrink) {
> -			pagecache_isize_extended(inode, oldsize, inode->i_size);
> -		} else {
> -			/*
> -			 * Blocks are going to be removed from the inode. Wait
> -			 * for dio in flight.
> -			 */
> -			inode_dio_wait(inode);
> -		}
> -		if (orphan && ext4_should_journal_data(inode))
> -			ext4_wait_for_tail_page_commit(inode);
> -		down_write(&EXT4_I(inode)->i_mmap_sem);
> -
> -		rc = ext4_break_layouts(inode);
> -		if (rc) {
> -			up_write(&EXT4_I(inode)->i_mmap_sem);
> -			error = rc;
> -			goto err_out;
> -		}
>  
> +		if (shrink && ext4_should_journal_data(inode))
> +			ext4_wait_for_tail_page_commit(inode);
>  		/*
>  		 * Truncate pagecache after we've waited for commit
>  		 * in data=journal mode to make pages freeable.
> @@ -5660,6 +5660,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
>  			if (rc)
>  				error = rc;
>  		}
> +out_mmap_sem:
>  		up_write(&EXT4_I(inode)->i_mmap_sem);
>  	}
>  
> -- 
> 2.16.4
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate
  2019-05-21  7:43 [PATCH 0/3] ext4: Fix issues in ext4 truncate handling Jan Kara
@ 2019-05-21  7:43 ` Jan Kara
  2019-05-21 18:27   ` Ira Weiny
  0 siblings, 1 reply; 13+ messages in thread
From: Jan Kara @ 2019-05-21  7:43 UTC (permalink / raw)
  To: Ted Tso; +Cc: linux-ext4, Weiny, Ira, Jan Kara

ext4_break_layouts() may fail e.g. due to a signal being delivered.
Thus we need to handle its failure gracefully and not by taking the
filesystem down. Currently ext4_break_layouts() failure is rare but it
may become more common once RDMA uses layout leases for handling
long-term page pins for DAX mappings.

To handle the failure we need to move ext4_break_layouts() earlier
during setattr handling before we do hard to undo changes such as
modifying inode size. To be able to do that we also have to move some
other checks which are better done without holding i_mmap_sem earlier.

Reported-by: "Weiny, Ira" <ira.weiny@intel.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/ext4/inode.c | 55 ++++++++++++++++++++++++++++---------------------------
 1 file changed, 28 insertions(+), 27 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index c7f77c643008..979570b42e18 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5571,7 +5571,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 	if (attr->ia_valid & ATTR_SIZE) {
 		handle_t *handle;
 		loff_t oldsize = inode->i_size;
-		int shrink = (attr->ia_size <= inode->i_size);
+		int shrink = (attr->ia_size < inode->i_size);
 
 		if (!(ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))) {
 			struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
@@ -5585,18 +5585,35 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
 			inode_inc_iversion(inode);
 
-		if (ext4_should_order_data(inode) &&
-		    (attr->ia_size < inode->i_size)) {
-			error = ext4_begin_ordered_truncate(inode,
+		if (shrink) {
+			if (ext4_should_order_data(inode)) {
+				error = ext4_begin_ordered_truncate(inode,
 							    attr->ia_size);
-			if (error)
-				goto err_out;
+				if (error)
+					goto err_out;
+			}
+			/*
+			 * Blocks are going to be removed from the inode. Wait
+			 * for dio in flight.
+			 */
+			inode_dio_wait(inode);
+		} else {
+			pagecache_isize_extended(inode, oldsize, inode->i_size);
 		}
+
+		down_write(&EXT4_I(inode)->i_mmap_sem);
+
+		rc = ext4_break_layouts(inode);
+		if (rc) {
+			up_write(&EXT4_I(inode)->i_mmap_sem);
+			return rc;
+		}
+
 		if (attr->ia_size != inode->i_size) {
 			handle = ext4_journal_start(inode, EXT4_HT_INODE, 3);
 			if (IS_ERR(handle)) {
 				error = PTR_ERR(handle);
-				goto err_out;
+				goto out_mmap_sem;
 			}
 			if (ext4_handle_valid(handle) && shrink) {
 				error = ext4_orphan_add(handle, inode);
@@ -5627,29 +5644,12 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			if (error) {
 				if (orphan && inode->i_nlink)
 					ext4_orphan_del(NULL, inode);
-				goto err_out;
+				goto out_mmap_sem;
 			}
 		}
-		if (!shrink) {
-			pagecache_isize_extended(inode, oldsize, inode->i_size);
-		} else {
-			/*
-			 * Blocks are going to be removed from the inode. Wait
-			 * for dio in flight.
-			 */
-			inode_dio_wait(inode);
-		}
-		if (orphan && ext4_should_journal_data(inode))
-			ext4_wait_for_tail_page_commit(inode);
-		down_write(&EXT4_I(inode)->i_mmap_sem);
-
-		rc = ext4_break_layouts(inode);
-		if (rc) {
-			up_write(&EXT4_I(inode)->i_mmap_sem);
-			error = rc;
-			goto err_out;
-		}
 
+		if (shrink && ext4_should_journal_data(inode))
+			ext4_wait_for_tail_page_commit(inode);
 		/*
 		 * Truncate pagecache after we've waited for commit
 		 * in data=journal mode to make pages freeable.
@@ -5660,6 +5660,7 @@ int ext4_setattr(struct dentry *dentry, struct iattr *attr)
 			if (rc)
 				error = rc;
 		}
+out_mmap_sem:
 		up_write(&EXT4_I(inode)->i_mmap_sem);
 	}
 
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-05-27 14:53 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-22  9:03 [PATCH 0/3 v2] ext4: Fix issues in ext4 truncate handling Jan Kara
2019-05-22  9:03 ` [PATCH 1/3] ext4: Wait for outstanding dio during truncate in nojournal mode Jan Kara
2019-05-24  3:46   ` Theodore Ts'o
2019-05-22  9:03 ` [PATCH 2/3] ext4: Do not delete unlinked inode from orphan list on failed truncate Jan Kara
2019-05-24  3:47   ` Theodore Ts'o
2019-05-22  9:03 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
2019-05-24  4:18   ` Theodore Ts'o
2019-05-24  8:13     ` Jan Kara
2019-05-25  3:32   ` Theodore Ts'o
2019-05-27 14:53     ` Jan Kara
  -- strict thread matches above, loose matches on Subject: below --
2019-05-21  7:43 [PATCH 0/3] ext4: Fix issues in ext4 truncate handling Jan Kara
2019-05-21  7:43 ` [PATCH 3/3] ext4: Gracefully handle ext4_break_layouts() failure during truncate Jan Kara
2019-05-21 18:27   ` Ira Weiny
2019-05-22  8:57     ` Jan Kara

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.