* [PATCH 1/3] xfs: don't overflow xattr listent buffer
@ 2019-02-13 20:50 Darrick J. Wong
  2019-02-13 20:50 ` [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-13 20:50 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

For VFS listxattr calls, xfs_xattr_put_listent calls
__xfs_xattr_put_listent twice if it sees an attribute
"trusted.SGI_ACL_FILE": once for that name, and again for
"system.posix_acl_access".  Unfortunately, if we happen to run out of
buffer space while emitting the first name, we set count to -1 (so that
we can feed ERANGE to the caller).  The second invocation doesn't check that
the context parameters make sense and overwrites the byte before the
buffer, triggering a KASAN report:

==================================================================
BUG: KASAN: slab-out-of-bounds in strncpy+0xb3/0xd0
Write of size 1 at addr ffff88807fbd317f by task syz/1113

CPU: 3 PID: 1113 Comm: syz Not tainted 5.0.0-rc6-xfsx #rc6
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1ubuntu1 04/01/2014
Call Trace:
 dump_stack+0xcc/0x180
 print_address_description+0x6c/0x23c
 kasan_report.cold.3+0x1c/0x35
 strncpy+0xb3/0xd0
 __xfs_xattr_put_listent+0x1a9/0x2c0 [xfs]
 xfs_attr_list_int_ilocked+0x11af/0x1800 [xfs]
 xfs_attr_list_int+0x20c/0x2e0 [xfs]
 xfs_vn_listxattr+0x225/0x320 [xfs]
 listxattr+0x11f/0x1b0
 path_listxattr+0xbd/0x130
 do_syscall_64+0x139/0x560

While we're at it we add an assert to the other put_listent to avoid
this sort of thing ever happening to the attrlist_by_handle code.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_attr_list.c |    1 +
 fs/xfs/xfs_xattr.c     |    3 +++
 2 files changed, 4 insertions(+)


diff --git a/fs/xfs/xfs_attr_list.c b/fs/xfs/xfs_attr_list.c
index a58034049995..3d213a7394c5 100644
--- a/fs/xfs/xfs_attr_list.c
+++ b/fs/xfs/xfs_attr_list.c
@@ -555,6 +555,7 @@ xfs_attr_put_listent(
 	attrlist_ent_t *aep;
 	int arraytop;
 
+	ASSERT(!context->seen_enough);
 	ASSERT(!(context->flags & ATTR_KERNOVAL));
 	ASSERT(context->count >= 0);
 	ASSERT(context->count < (ATTR_MAX_VALUELEN/8));
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 63ee1d5bf1d7..9a63016009a1 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -129,6 +129,9 @@ __xfs_xattr_put_listent(
 	char *offset;
 	int arraytop;
 
+	if (context->count < 0 || context->seen_enough)
+		return;
+
 	if (!context->alist)
 		goto compute_size;
 


* [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list
  2019-02-13 20:50 [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
@ 2019-02-13 20:50 ` Darrick J. Wong
  2019-02-14  8:15   ` Christoph Hellwig
  2019-02-14 21:41   ` Christoph Hellwig
  2019-02-13 20:50 ` [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery Darrick J. Wong
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-13 20:50 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

When XFS creates an O_TMPFILE file, the inode is created with nlink = 1,
put on the unlinked list, and then the VFS sets nlink = 0 in d_tmpfile.
If we crash before anything logs the inode (it's dirty incore but the
vfs doesn't tell us it's dirty so we never log that change), the iunlink
processing part of recovery will then explode with a pile of:

XFS: Assertion failed: VFS_I(ip)->i_nlink == 0, file:
fs/xfs/xfs_log_recover.c, line: 5072

Worse yet, since nlink is nonzero, the inodes also don't get cleaned up
and they just leak until the next xfs_repair run.

Therefore, change xfs_iunlink to require that inodes being put on the
unlinked list have nlink == 0, change the tmpfile callers to instantiate
inodes that way, and set the nlink to 1 just prior to calling d_tmpfile.
Fix the comment for xfs_iunlink while we're at it.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_inode.c |   16 ++++++----------
 fs/xfs/xfs_iops.c  |   13 +++++++++++--
 2 files changed, 17 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 9aaa3143a277..9d683b455e01 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1332,7 +1332,7 @@ xfs_create_tmpfile(
 	if (error)
 		goto out_trans_cancel;
 
-	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip);
+	error = xfs_dir_ialloc(&tp, dp, mode, 0, 0, prid, &ip);
 	if (error)
 		goto out_trans_cancel;
 
@@ -2231,11 +2231,8 @@ xfs_iunlink_update_inode(
 }
 
 /*
- * This is called when the inode's link count goes to 0 or we are creating a
- * tmpfile via O_TMPFILE. In the case of a tmpfile, @ignore_linkcount will be
- * set to true as the link count is dropped to zero by the VFS after we've
- * created the file successfully, so we have to add it to the unlinked list
- * while the link count is non-zero.
+ * This is called when the inode's link count has gone to 0 or we are creating
+ * a tmpfile via O_TMPFILE.  The inode @ip must have nlink == 0.
  *
  * We place the on-disk inode on a list in the AGI.  It will be pulled from this
  * list when the inode is freed.
@@ -2254,6 +2251,7 @@ xfs_iunlink(
 	short			bucket_index = agino % XFS_AGI_UNLINKED_BUCKETS;
 	int			error;
 
+	ASSERT(VFS_I(ip)->i_nlink == 0);
 	ASSERT(VFS_I(ip)->i_mode != 0);
 	trace_xfs_iunlink(ip);
 
@@ -3184,11 +3182,9 @@ xfs_rename_alloc_whiteout(
 
 	/*
 	 * Prepare the tmpfile inode as if it were created through the VFS.
-	 * Otherwise, the link increment paths will complain about nlink 0->1.
-	 * Drop the link count as done by d_tmpfile(), complete the inode setup
-	 * and flag it as linkable.
+	 * Complete the inode setup and flag it as linkable.  nlink is already
+	 * zero, so we can skip the drop_nlink.
 	 */
-	drop_nlink(VFS_I(tmpfile));
 	xfs_setup_iops(tmpfile);
 	xfs_finish_inode_setup(tmpfile);
 	VFS_I(tmpfile)->i_state |= I_LINKABLE;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index f48ffd7a8d3e..1efef69a7f1c 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -191,9 +191,18 @@ xfs_generic_create(
 
 	xfs_setup_iops(ip);
 
-	if (tmpfile)
+	if (tmpfile) {
+		/*
+		 * The VFS requires that any inode fed to d_tmpfile must have
+		 * nlink == 1 so that it can decrement the nlink in d_tmpfile.
+		 * However, we created the temp file with nlink == 0 because
+		 * we're not allowed to put an inode with nlink > 0 on the
+		 * unlinked list.  Therefore we have to set nlink to 1 so that
+		 * d_tmpfile can immediately set it back to zero.
+		 */
+		set_nlink(inode, 1);
 		d_tmpfile(dentry, inode);
-	else
+	} else
 		d_instantiate(dentry, inode);
 
 	xfs_finish_inode_setup(ip);


* [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery
  2019-02-13 20:50 [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
  2019-02-13 20:50 ` [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list Darrick J. Wong
@ 2019-02-13 20:50 ` Darrick J. Wong
  2019-02-14  8:17   ` Christoph Hellwig
  2019-02-13 20:58 ` [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
  2019-02-14  8:11 ` [PATCH 1/3] xfs: don't overflow xattr listent buffer Christoph Hellwig
  3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-13 20:50 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Log recovery frees all the inodes stored in the unlinked list, which can
cause expansion of the free inode btree.  The ifree code skips block
reservations if it thinks there's a per-AG space reservation, but we
don't set up the reservation until after log recovery, which means that
a finobt expansion blows up in xfs_trans_mod_sb when we exceed the
transaction's block reservation.

To fix this, we set the "no finobt reservation" flag to true when we
create the xfs_mount and only set it to false if we confirm that every
AG had enough free space to put aside for the finobt.

While we're at it we change the flag name to be clearer about what it
actually does.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_ag_resv.c      |    2 +-
 fs/xfs/libxfs/xfs_ialloc_btree.c |    4 ++--
 fs/xfs/xfs_fsops.c               |    1 +
 fs/xfs/xfs_inode.c               |    2 +-
 fs/xfs/xfs_mount.h               |    2 +-
 fs/xfs/xfs_super.c               |    7 +++++++
 6 files changed, 13 insertions(+), 5 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_ag_resv.c b/fs/xfs/libxfs/xfs_ag_resv.c
index e701ebc36c06..e2ba2a3b63b2 100644
--- a/fs/xfs/libxfs/xfs_ag_resv.c
+++ b/fs/xfs/libxfs/xfs_ag_resv.c
@@ -281,7 +281,7 @@ xfs_ag_resv_init(
 			 */
 			ask = used = 0;
 
-			mp->m_inotbt_nores = true;
+			mp->m_finobt_nores = true;
 
 			error = xfs_refcountbt_calc_reserves(mp, tp, agno, &ask,
 					&used);
diff --git a/fs/xfs/libxfs/xfs_ialloc_btree.c b/fs/xfs/libxfs/xfs_ialloc_btree.c
index c2df1f89eec8..1080381ff243 100644
--- a/fs/xfs/libxfs/xfs_ialloc_btree.c
+++ b/fs/xfs/libxfs/xfs_ialloc_btree.c
@@ -124,7 +124,7 @@ xfs_finobt_alloc_block(
 	union xfs_btree_ptr	*new,
 	int			*stat)
 {
-	if (cur->bc_mp->m_inotbt_nores)
+	if (cur->bc_mp->m_finobt_nores)
 		return xfs_inobt_alloc_block(cur, start, new, stat);
 	return __xfs_inobt_alloc_block(cur, start, new, stat,
 			XFS_AG_RESV_METADATA);
@@ -154,7 +154,7 @@ xfs_finobt_free_block(
 	struct xfs_btree_cur	*cur,
 	struct xfs_buf		*bp)
 {
-	if (cur->bc_mp->m_inotbt_nores)
+	if (cur->bc_mp->m_finobt_nores)
 		return xfs_inobt_free_block(cur, bp);
 	return __xfs_inobt_free_block(cur, bp, XFS_AG_RESV_METADATA);
 }
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index f3ef70c542e1..584648582ba7 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -533,6 +533,7 @@ xfs_fs_reserve_ag_blocks(
 	int			error = 0;
 	int			err2;
 
+	mp->m_finobt_nores = false;
 	for (agno = 0; agno < mp->m_sb.sb_agcount; agno++) {
 		pag = xfs_perag_get(mp, agno);
 		err2 = xfs_ag_resv_init(pag, NULL);
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 9d683b455e01..f643a9295179 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1754,7 +1754,7 @@ xfs_inactive_ifree(
 	 * now remains allocated and sits on the unlinked list until the fs is
 	 * repaired.
 	 */
-	if (unlikely(mp->m_inotbt_nores)) {
+	if (unlikely(mp->m_finobt_nores)) {
 		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_ifree,
 				XFS_IFREE_SPACE_RES(mp), 0, XFS_TRANS_RESERVE,
 				&tp);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index a33f45077867..864ecf27aa75 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -138,7 +138,7 @@ typedef struct xfs_mount {
 	struct mutex		m_growlock;	/* growfs mutex */
 	int			m_fixedfsid[2];	/* unchanged for life of FS */
 	uint64_t		m_flags;	/* global mount flags */
-	bool			m_inotbt_nores; /* no per-AG finobt resv. */
+	bool			m_finobt_nores; /* no per-AG finobt resv. */
 	int			m_ialloc_inos;	/* inodes in inode allocation */
 	int			m_ialloc_blks;	/* blocks in inode allocation */
 	int			m_ialloc_min_blks;/* min blocks in sparse inode
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c9097cb0b955..08033ac040d6 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1594,6 +1594,13 @@ xfs_mount_alloc(
 	INIT_DELAYED_WORK(&mp->m_eofblocks_work, xfs_eofblocks_worker);
 	INIT_DELAYED_WORK(&mp->m_cowblocks_work, xfs_cowblocks_worker);
 	mp->m_kobj.kobject.kset = xfs_kset;
+	/*
+	 * We don't create the finobt per-ag space reservation until after log
+	 * recovery, so we must set this to true so that an ifree transaction
+	 * started during log recovery will not depend on space reservations
+	 * for finobt expansion.
+	 */
+	mp->m_finobt_nores = true;
 	return mp;
 }
 


* Re: [PATCH 1/3] xfs: don't overflow xattr listent buffer
  2019-02-13 20:50 [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
  2019-02-13 20:50 ` [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list Darrick J. Wong
  2019-02-13 20:50 ` [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery Darrick J. Wong
@ 2019-02-13 20:58 ` Darrick J. Wong
  2019-06-27 16:12   ` [STABLE 4.19] fixes for xfs memory and fs corruption Amir Goldstein
  2019-02-14  8:11 ` [PATCH 1/3] xfs: don't overflow xattr listent buffer Christoph Hellwig
  3 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-13 20:58 UTC (permalink / raw)
  To: linux-xfs

Bah, I forgot to send the cover letter.  Oh well.

xfs: various fixes

The first patch fixes a memory corruption that syzkaller found in the
attr listent code; see "generic: posix acl extended attribute memory
corruption test" for the relevant regression test.
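
To make the failure mode concrete, here is a minimal userspace sketch of
the listent bookkeeping.  It is not the kernel code: struct list_ctx,
put_listent() and put_acl_listent() are hypothetical names that only
loosely model struct xfs_attr_list_context and __xfs_xattr_put_listent.
It assumes, as the patch 1 commit message describes, that a put which
runs out of space sets count to -1 and that the ACL attribute is emitted
under two names in that order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct xfs_attr_list_context. */
struct list_ctx {
	char	*alist;		/* output buffer, or NULL to only size it */
	int	bufsize;	/* bytes available in alist */
	int	count;		/* bytes used so far; -1 means "too small" */
	bool	seen_enough;	/* set once the attribute walk should stop */
};

/* Append prefix + name + NUL to the buffer (illustrative model only). */
void put_listent(struct list_ctx *ctx, const char *prefix, const char *name)
{
	size_t prefix_len = strlen(prefix);
	size_t namelen = strlen(name);
	int entlen = (int)(prefix_len + namelen + 1);

	/* The guard patch 1 adds: never touch a context already in error. */
	if (ctx->count < 0 || ctx->seen_enough)
		return;

	if (ctx->alist) {
		char *offset;

		if (ctx->count + entlen > ctx->bufsize) {
			ctx->count = -1;	/* caller reports ERANGE */
			ctx->seen_enough = true;
			return;
		}
		/* Safe only because count >= 0 was checked above. */
		offset = ctx->alist + ctx->count;
		memcpy(offset, prefix, prefix_len);
		memcpy(offset + prefix_len, name, namelen);
		offset[prefix_len + namelen] = '\0';
	}
	ctx->count += entlen;
}

/* Emit both names for the ACL attribute, in the order the patch describes. */
void put_acl_listent(struct list_ctx *ctx)
{
	put_listent(ctx, "trusted.", "SGI_ACL_FILE");
	put_listent(ctx, "system.", "posix_acl_access");
}
```

Without the early return, a second put against a context whose count is
already -1 would compute its destination as alist + count, one byte
before the buffer, whenever the second name appeared to fit.  That is
the out-of-bounds write in the KASAN report.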

Patch 2 fixes problems found in XFS's unlinked inode recovery code
that were unearthed by some new testcases.  We're logging nlink==1 temp
files on the iunlinked list (and then the vfs sets nlink to 0 without
telling us) which means that we leak them in recovery if we crash
immediately after committing the creation of the temp file.
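
The ordering that patch 2 establishes can be sketched in userspace as
follows.  This is an illustrative model under stated assumptions, not
the kernel implementation: struct toy_inode and the toy_* functions are
hypothetical, and toy_iunlink() returns an error where the real
xfs_iunlink uses an ASSERT.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical in-memory inode; only the fields this sketch needs. */
struct toy_inode {
	unsigned int	nlink;
	bool		on_unlinked_list;
};

/* Models xfs_iunlink(): only inodes with nlink == 0 may be listed. */
int toy_iunlink(struct toy_inode *ip)
{
	if (ip->nlink != 0)
		return -1;	/* the patch enforces this with ASSERT() */
	ip->on_unlinked_list = true;
	return 0;
}

/* Models d_tmpfile(): the VFS expects nlink == 1 and drops it to 0. */
void toy_d_tmpfile(struct toy_inode *ip)
{
	assert(ip->nlink == 1);
	ip->nlink--;
}

/* Models the patched O_TMPFILE creation path. */
int toy_create_tmpfile(struct toy_inode *ip)
{
	int error;

	/* Allocate with nlink == 0, as the patched xfs_dir_ialloc call does. */
	ip->nlink = 0;
	ip->on_unlinked_list = false;

	/* The state that reaches the log now has nlink == 0. */
	error = toy_iunlink(ip);
	if (error)
		return error;

	/* set_nlink(inode, 1) so that d_tmpfile can drop it back to 0. */
	ip->nlink = 1;
	toy_d_tmpfile(ip);
	return 0;
}
```

The point of the ordering is that the link count seen when the inode
goes on the unlinked list already matches the final value, so a crash
right after the creation commits no longer leaves an nlink == 1 inode on
the list for recovery to trip over.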

Patch 3 fixes a problem where ifree during recovery can expand the
finobt: we need to force the ifree code to reserve blocks for the
transaction because perag reservations aren't set up yet.
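
The flag lifecycle in patch 3 can be modelled roughly like this.  It is
a sketch under stated assumptions rather than the real mount code:
struct toy_mount, the toy_* functions, TOY_AGCOUNT and
TOY_IFREE_SPACE_RES are all hypothetical, and a per-AG boolean stands in
for whether xfs_ag_resv_init could set aside finobt blocks.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_AGCOUNT		4	/* hypothetical number of AGs */
#define TOY_IFREE_SPACE_RES	2	/* hypothetical block reservation */

/* Hypothetical stand-in for the relevant parts of struct xfs_mount. */
struct toy_mount {
	bool	finobt_nores;			/* no per-AG finobt resv. */
	bool	ag_has_space[TOY_AGCOUNT];	/* can the AG reserve blocks? */
};

/* Models xfs_mount_alloc(): assume no reservation until proven otherwise. */
void toy_mount_alloc(struct toy_mount *mp)
{
	for (int agno = 0; agno < TOY_AGCOUNT; agno++)
		mp->ag_has_space[agno] = true;
	mp->finobt_nores = true;
}

/* Models xfs_fs_reserve_ag_blocks(), which runs after log recovery. */
void toy_reserve_ag_blocks(struct toy_mount *mp)
{
	mp->finobt_nores = false;
	for (int agno = 0; agno < TOY_AGCOUNT; agno++) {
		/* Any AG that cannot reserve space turns the flag back on. */
		if (!mp->ag_has_space[agno])
			mp->finobt_nores = true;
	}
}

/* Models the block count xfs_inactive_ifree() asks for in its transaction. */
unsigned int toy_ifree_block_res(const struct toy_mount *mp)
{
	return mp->finobt_nores ? TOY_IFREE_SPACE_RES : 0;
}
```

With the flag defaulting to true, an ifree that runs during log recovery
asks for its own blocks instead of relying on a per-AG reservation that
does not exist yet.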

See "[PATCH v2 2/2] generic: check the behavior of programs opening a
lot of O_TMPFILE files" for the regression test.

--D


* Re: [PATCH 1/3] xfs: don't overflow xattr listent buffer
  2019-02-13 20:50 [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
                   ` (2 preceding siblings ...)
  2019-02-13 20:58 ` [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
@ 2019-02-14  8:11 ` Christoph Hellwig
  3 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-02-14  8:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list
  2019-02-13 20:50 ` [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list Darrick J. Wong
@ 2019-02-14  8:15   ` Christoph Hellwig
  2019-02-14 16:03     ` Darrick J. Wong
  2019-02-14 21:41   ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-02-14  8:15 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote:
> +	if (tmpfile) {
> +		/*
> +		 * The VFS requires that any inode fed to d_tmpfile must have
> +		 * nlink == 1 so that it can decrement the nlink in d_tmpfile.
> +		 * However, we created the temp file with nlink == 0 because
> +		 * we're not allowed to put an inode with nlink > 0 on the
> +		 * unlinked list.  Therefore we have to set nlink to 1 so that
> +		 * d_tmpfile can immediately set it back to zero.
> +		 */
> +		set_nlink(inode, 1);
>  		d_tmpfile(dentry, inode);
> +	} else

At least btrfs has to work around these d_tmpfile assumptions as well.
Instead of piling hacks over hacks I'd rather move the call to
inode_dec_link_count from d_tmpfile, which should lead to a saner
interface.


* Re: [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery
  2019-02-13 20:50 ` [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery Darrick J. Wong
@ 2019-02-14  8:17   ` Christoph Hellwig
  2019-02-14 15:58     ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-02-14  8:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Feb 13, 2019 at 12:50:59PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Log recovery frees all the inodes stored in the unlinked list, which can
> cause expansion of the free inode btree.  The ifree code skips block
> reservations if it thinks there's a per-AG space reservation, but we
> don't set up the reservation until after log recovery, which means that
> a finobt expansion blows up in xfs_trans_mod_sb when we exceed the
> transaction's block reservation.
> 
> To fix this, we set the "no finobt reservation" flag to true when we
> create the xfs_mount and only set it to false if we confirm that every
> AG had enough free space to put aside for the finobt.
> 
> While we're at it we change the flag name to be clearer about what it
> actually does.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

But throwing in the field rename makes the patch way bigger and
not as obvious to understand.  Any reason it can't be split into
a separate patch?


* Re: [PATCH 3/3] xfs: reserve blocks for ifree transaction during log recovery
  2019-02-14  8:17   ` Christoph Hellwig
@ 2019-02-14 15:58     ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-14 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Feb 14, 2019 at 12:17:00AM -0800, Christoph Hellwig wrote:
> On Wed, Feb 13, 2019 at 12:50:59PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Log recovery frees all the inodes stored in the unlinked list, which can
> > cause expansion of the free inode btree.  The ifree code skips block
> > reservations if it thinks there's a per-AG space reservation, but we
> > don't set up the reservation until after log recovery, which means that
> > a finobt expansion blows up in xfs_trans_mod_sb when we exceed the
> > transaction's block reservation.
> > 
> > To fix this, we set the "no finobt reservation" flag to true when we
> > create the xfs_mount and only set it to false if we confirm that every
> > AG had enough free space to put aside for the finobt.
> > 
> > While we're at it we change the flag name to be clearer about what it
> > actually does.
> 
> Looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> But throwing in the field rename makes the patch way bigger and
> not as obvious to understand.  Any reason it can't be split into
> a separate patch?

Ok, I'll separate the two.

--D


* Re: [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list
  2019-02-14  8:15   ` Christoph Hellwig
@ 2019-02-14 16:03     ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-02-14 16:03 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, Feb 14, 2019 at 12:15:56AM -0800, Christoph Hellwig wrote:
> On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote:
> > +	if (tmpfile) {
> > +		/*
> > +		 * The VFS requires that any inode fed to d_tmpfile must have
> > +		 * nlink == 1 so that it can decrement the nlink in d_tmpfile.
> > +		 * However, we created the temp file with nlink == 0 because
> > +		 * we're not allowed to put an inode with nlink > 0 on the
> > +		 * unlinked list.  Therefore we have to set nlink to 1 so that
> > +		 * d_tmpfile can immediately set it back to zero.
> > +		 */
> > +		set_nlink(inode, 1);
> >  		d_tmpfile(dentry, inode);
> > +	} else
> 
> At least btrfs has to work around these d_tmpfile assumptions as well.
> Instead of piling hacks over hacks I'd rather move the call to
> inode_dec_link_count from d_tmpfile, which should lead to a saner
> interface.

I'm working on a bigger change to fix the d_tmpfile behavior, but that's
a complex multi-fs change that may or may not make it for 5.1. :(

In the meantime this prevents leaking inodes during unlink recovery by
ensuring that we never put linked inodes on the unlink list so I would
still like to get this one reviewed. :)

--D


* Re: [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list
  2019-02-13 20:50 ` [PATCH 2/3] xfs: don't ever put nlink > 0 inodes on the unlinked list Darrick J. Wong
  2019-02-14  8:15   ` Christoph Hellwig
@ 2019-02-14 21:41   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2019-02-14 21:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Feb 13, 2019 at 12:50:53PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> When XFS creates an O_TMPFILE file, the inode is created with nlink = 1,
> put on the unlinked list, and then the VFS sets nlink = 0 in d_tmpfile.
> If we crash before anything logs the inode (it's dirty incore but the
> vfs doesn't tell us it's dirty so we never log that change), the iunlink
> processing part of recovery will then explode with a pile of:
> 
> XFS: Assertion failed: VFS_I(ip)->i_nlink == 0, file:
> fs/xfs/xfs_log_recover.c, line: 5072
> 
> Worse yet, since nlink is nonzero, the inodes also don't get cleaned up
> and they just leak until the next xfs_repair run.
> 
> Therefore, change xfs_iunlink to require that inodes being put on the
> unlinked list have nlink == 0, change the tmpfile callers to instantiate
> inodes that way, and set the nlink to 1 just prior to calling d_tmpfile.
> Fix the comment for xfs_iunlink while we're at it.

Looks good for a quick fix, even if I think we should fix it differently
in the long term:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* [STABLE 4.19] fixes for xfs memory and fs corruption
  2019-02-13 20:58 ` [PATCH 1/3] xfs: don't overflow xattr listent buffer Darrick J. Wong
@ 2019-06-27 16:12   ` Amir Goldstein
  2019-06-27 17:08     ` Darrick J. Wong
  2019-06-27 23:32     ` Sasha Levin
  0 siblings, 2 replies; 14+ messages in thread
From: Amir Goldstein @ 2019-06-27 16:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Sasha Levin, Greg KH, linux-xfs, stable

Darrick,

Can I have your blessing on the choice of these upstream commits
as stable candidates?
I did not observe any xfstests regressions when testing v4.19.55
with these patches applied.

Sasha,

Can you run these patches through your xfstests setup?
They fix nasty bugs.

Make sure to update xfsprogs to the very latest, because
generic/530 used to blow up (OOM) my test machine...

>
> The first patch fixes a memory corruption that syzkaller found in the
> attr listent code;

3b50086f0c0d xfs: don't overflow xattr listent buffer

> see "generic: posix acl extended attribute memory
> corruption test" for the relevant regression test.

Fixes generic/529

>
> Patch 2 fixes problems found in XFS's unlinked inode recovery code
> that were unearthed by some new testcases.  We're logging nlink==1 temp
> files on the iunlinked list (and then the vfs sets nlink to 0 without
> telling us) which means that we leak them in recovery if we crash
> immediately after committing the creation of the temp file.
>
> Patch 3 fixes the problem that ifree during recovery can expand the
> finobt but we need to force the ifree code to reserve blocks for the
> transaction because perag reservations aren't set up yet.

e1f6ca113815 xfs: rename m_inotbt_nores to m_finobt_nores
15a268d9f263 xfs: reserve blocks for ifree transaction during log recovery
c4a6bf7f6cc7 xfs: don't ever put nlink > 0 inodes on the unlinked list

>
> See "[PATCH v2 2/2] generic: check the behavior of programs opening a
> lot of O_TMPFILE files" for the regression test.
>

Fixes generic/530

Thanks,
Amir.


* Re: [STABLE 4.19] fixes for xfs memory and fs corruption
  2019-06-27 16:12   ` [STABLE 4.19] fixes for xfs memory and fs corruption Amir Goldstein
@ 2019-06-27 17:08     ` Darrick J. Wong
  2019-06-27 23:32     ` Sasha Levin
  1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2019-06-27 17:08 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Sasha Levin, Greg KH, linux-xfs, stable

On Thu, Jun 27, 2019 at 07:12:48PM +0300, Amir Goldstein wrote:
> Darrick,
> 
> Can I have your blessing on the choice of these upstream commits
> as stable candidates?
> I did not observe any xfstests regressions when testing v4.19.55
> with these patches applied.

All four commits look reasonable to me. :)

--D

> Sasha,
> 
> Can you run these patches through your xfstests setup?
> They fix nasty bugs.
> 
> Make sure to update xfsprogs to the very latest, because
> generic/530 used to blow up (OOM) my test machine...
> 
> >
> > The first patch fixes a memory corruption that syzkaller found in the
> > attr listent code;
> 
> 3b50086f0c0d xfs: don't overflow xattr listent buffer
> 
> > see "generic: posix acl extended attribute memory
> > corruption test" for the relevant regression test.
> 
> Fixes generic/529
> 
> >
> > Patch 2 fixes problems found in XFS's unlinked inode recovery code
> > that were unearthed by some new testcases.  We're logging nlink==1 temp
> > files on the iunlinked list (and then the vfs sets nlink to 0 without
> > telling us) which means that we leak them in recovery if we crash
> > immediately after committing the creation of the temp file.
> >
> > Patch 3 fixes the problem that ifree during recovery can expand the
> > finobt but we need to force the ifree code to reserve blocks for the
> > transaction because perag reservations aren't set up yet.
> 
> e1f6ca113815 xfs: rename m_inotbt_nores to m_finobt_nores
> 15a268d9f263 xfs: reserve blocks for ifree transaction during log recovery
> c4a6bf7f6cc7 xfs: don't ever put nlink > 0 inodes on the unlinked list
> 
> >
> > See "[PATCH v2 2/2] generic: check the behavior of programs opening a
> > lot of O_TMPFILE files" for the regression test.
> >
> 
> Fixes generic/530
> 
> Thanks,
> Amir.


* Re: [STABLE 4.19] fixes for xfs memory and fs corruption
  2019-06-27 16:12   ` [STABLE 4.19] fixes for xfs memory and fs corruption Amir Goldstein
  2019-06-27 17:08     ` Darrick J. Wong
@ 2019-06-27 23:32     ` Sasha Levin
  2019-07-03  2:47       ` Sasha Levin
  1 sibling, 1 reply; 14+ messages in thread
From: Sasha Levin @ 2019-06-27 23:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Darrick J. Wong, Greg KH, linux-xfs, stable

On Thu, Jun 27, 2019 at 07:12:48PM +0300, Amir Goldstein wrote:
>Darrick,
>
>Can I have your blessing on the choice of these upstream commits
>as stable candidates?
>I did not observe any xfstests regressions when testing v4.19.55
>with these patches applied.
>
>Sasha,
>
>Can you run these patches through your xfstests setup?
>They fix nasty bugs.

Will do. Tests running now - I'll update tomorrow.

--
Thanks,
Sasha


* Re: [STABLE 4.19] fixes for xfs memory and fs corruption
  2019-06-27 23:32     ` Sasha Levin
@ 2019-07-03  2:47       ` Sasha Levin
  0 siblings, 0 replies; 14+ messages in thread
From: Sasha Levin @ 2019-07-03  2:47 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Darrick J. Wong, Greg KH, linux-xfs, stable

On Thu, Jun 27, 2019 at 07:32:17PM -0400, Sasha Levin wrote:
>On Thu, Jun 27, 2019 at 07:12:48PM +0300, Amir Goldstein wrote:
>>Darrick,
>>
>>Can I have your blessing on the choice of these upstream commits
>>as stable candidates?
>>I did not observe any xfstests regressions when testing v4.19.55
>>with these patches applied.
>>
>>Sasha,
>>
>>Can you run these patches through your xfstests setup?
>>They fix nasty bugs.
>
>Will do. Tests running now - I'll update tomorrow.

I gave it a few more days, and it looks good here.

--
Thanks,
Sasha
