All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: remove "no-allocation" reservations for file creations
@ 2017-08-21  8:24 Christoph Hellwig
  2017-08-23 20:22 ` Darrick J. Wong
  2017-12-07  0:15 ` Darrick J. Wong
  0 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2017-08-21  8:24 UTC (permalink / raw)
  To: linux-xfs

If we create a new file we will need an inode, and usually some metadata
in the parent direction.  Aiming for everything to go well despite the
lack of a reservation leads to dirty transactions cancelled under a heavy
create/delete load.  This patch removes those nospace transactions, which
will lead to slightly earlier ENOSPC on some workloads, but instead
prevent file system shutdowns due to cancelling dirty transactions for
others.

A customer could observe assertations failures and shutdowns due to
cancelation of dirty transactions during heavy NFS workloads as shown
below:

2017-05-30 21:17:06 kernel: WARNING: [ 2670.728125] XFS: Assertion failed: error != -ENOSPC, file: fs/xfs/xfs_inode.c, line: 1262

2017-05-30 21:17:06 kernel: WARNING: [ 2670.728222] Call Trace:
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728246]  [<ffffffff81795daf>] dump_stack+0x63/0x81
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728262]  [<ffffffff810a1a5a>] warn_slowpath_common+0x8a/0xc0
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728264]  [<ffffffff810a1b8a>] warn_slowpath_null+0x1a/0x20
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728285]  [<ffffffffa01bf403>] asswarn+0x33/0x40 [xfs]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728308]  [<ffffffffa01bb07e>] xfs_create+0x7be/0x7d0 [xfs]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728329]  [<ffffffffa01b6ffb>] xfs_generic_create+0x1fb/0x2e0 [xfs]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728348]  [<ffffffffa01b7114>] xfs_vn_mknod+0x14/0x20 [xfs]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728366]  [<ffffffffa01b7153>] xfs_vn_create+0x13/0x20 [xfs]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728380]  [<ffffffff81231de5>] vfs_create+0xd5/0x140
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728390]  [<ffffffffa045ddb9>] do_nfsd_create+0x499/0x610 [nfsd]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728396]  [<ffffffffa0465fa5>] nfsd3_proc_create+0x135/0x210 [nfsd]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728401]  [<ffffffffa04561e3>] nfsd_dispatch+0xc3/0x210 [nfsd]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728416]  [<ffffffffa03bfa43>] svc_process_common+0x453/0x6f0 [sunrpc]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728423]  [<ffffffffa03bfdf3>] svc_process+0x113/0x1f0 [sunrpc]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728427]  [<ffffffffa0455bcf>] nfsd+0x10f/0x180 [nfsd]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728432]  [<ffffffffa0455ac0>] ? nfsd_destroy+0x80/0x80 [nfsd]
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728438]  [<ffffffff810c0d58>] kthread+0xd8/0xf0
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728441]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728451]  [<ffffffff8179d962>] ret_from_fork+0x42/0x70
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728453]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
2017-05-30 21:17:06 kernel: WARNING: [ 2670.728454] ---[ end trace f9822c842fec81d4 ]---

2017-05-30 21:17:06 kernel: ALERT: [ 2670.728477] XFS (sdb): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x4ee/0x7d0 [xfs]

2017-05-30 21:17:06 kernel: ALERT: [ 2670.728684] XFS (sdb): Corruption of in-memory data detected. Shutting down filesystem
2017-05-30 21:17:06 kernel: ALERT: [ 2670.728685] XFS (sdb): Please umount the filesystem and rectify the problem(s)
---
 fs/xfs/libxfs/xfs_ialloc.c | 10 +++-------
 fs/xfs/libxfs/xfs_ialloc.h |  1 -
 fs/xfs/xfs_inode.c         | 33 +++++++--------------------------
 fs/xfs/xfs_inode.h         |  2 +-
 fs/xfs/xfs_qm.c            |  4 ++--
 fs/xfs/xfs_symlink.c       | 15 +--------------
 6 files changed, 14 insertions(+), 51 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index abf5beaae907..6af9ef5d33de 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -921,8 +921,7 @@ STATIC xfs_agnumber_t
 xfs_ialloc_ag_select(
 	xfs_trans_t	*tp,		/* transaction pointer */
 	xfs_ino_t	parent,		/* parent directory inode number */
-	umode_t		mode,		/* bits set to indicate file type */
-	int		okalloc)	/* ok to allocate more space */
+	umode_t		mode)		/* bits set to indicate file type */
 {
 	xfs_agnumber_t	agcount;	/* number of ag's in the filesystem */
 	xfs_agnumber_t	agno;		/* current ag number */
@@ -979,9 +978,6 @@ xfs_ialloc_ag_select(
 			return agno;
 		}
 
-		if (!okalloc)
-			goto nextag;
-
 		if (!pag->pagf_init) {
 			error = xfs_alloc_pagf_init(mp, tp, agno, flags);
 			if (error)
@@ -1682,7 +1678,6 @@ xfs_dialloc(
 	struct xfs_trans	*tp,
 	xfs_ino_t		parent,
 	umode_t			mode,
-	int			okalloc,
 	struct xfs_buf		**IO_agbp,
 	xfs_ino_t		*inop)
 {
@@ -1694,6 +1689,7 @@ xfs_dialloc(
 	int			noroom = 0;
 	xfs_agnumber_t		start_agno;
 	struct xfs_perag	*pag;
+	int			okalloc = 1;
 
 	if (*IO_agbp) {
 		/*
@@ -1709,7 +1705,7 @@ xfs_dialloc(
 	 * We do not have an agbp, so select an initial allocation
 	 * group for inode allocation.
 	 */
-	start_agno = xfs_ialloc_ag_select(tp, parent, mode, okalloc);
+	start_agno = xfs_ialloc_ag_select(tp, parent, mode);
 	if (start_agno == NULLAGNUMBER) {
 		*inop = NULLFSINO;
 		return 0;
diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
index b32cfb5aeb5b..72311a636458 100644
--- a/fs/xfs/libxfs/xfs_ialloc.h
+++ b/fs/xfs/libxfs/xfs_ialloc.h
@@ -81,7 +81,6 @@ xfs_dialloc(
 	struct xfs_trans *tp,		/* transaction pointer */
 	xfs_ino_t	parent,		/* parent inode (directory) */
 	umode_t		mode,		/* mode bits for new inode */
-	int		okalloc,	/* ok to allocate more space */
 	struct xfs_buf	**agbp,		/* buf for a.g. inode header */
 	xfs_ino_t	*inop);		/* inode number allocated */
 
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index ff48f0096810..6af59418f4ab 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -769,7 +769,6 @@ xfs_ialloc(
 	xfs_nlink_t	nlink,
 	xfs_dev_t	rdev,
 	prid_t		prid,
-	int		okalloc,
 	xfs_buf_t	**ialloc_context,
 	xfs_inode_t	**ipp)
 {
@@ -785,7 +784,7 @@ xfs_ialloc(
 	 * Call the space management code to pick
 	 * the on-disk inode to be allocated.
 	 */
-	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode, okalloc,
+	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode,
 			    ialloc_context, &ino);
 	if (error)
 		return error;
@@ -977,7 +976,6 @@ xfs_dir_ialloc(
 	xfs_nlink_t	nlink,
 	xfs_dev_t	rdev,
 	prid_t		prid,		/* project id */
-	int		okalloc,	/* ok to allocate new space */
 	xfs_inode_t	**ipp,		/* pointer to inode; it will be
 					   locked. */
 	int		*committed)
@@ -1008,8 +1006,8 @@ xfs_dir_ialloc(
 	 * transaction commit so that no other process can steal
 	 * the inode(s) that we've just allocated.
 	 */
-	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, okalloc,
-			  &ialloc_context, &ip);
+	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, &ialloc_context,
+			&ip);
 
 	/*
 	 * Return an error if we were unable to allocate a new inode.
@@ -1081,7 +1079,7 @@ xfs_dir_ialloc(
 		 * this call should always succeed.
 		 */
 		code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid,
-				  okalloc, &ialloc_context, &ip);
+				  &ialloc_context, &ip);
 
 		/*
 		 * If we get an error at this point, return to the caller
@@ -1203,11 +1201,6 @@ xfs_create(
 		xfs_flush_inodes(mp);
 		error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
 	}
-	if (error == -ENOSPC) {
-		/* No space at all so try a "no-allocation" reservation */
-		resblks = 0;
-		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
-	}
 	if (error)
 		goto out_release_inode;
 
@@ -1224,19 +1217,13 @@ xfs_create(
 	if (error)
 		goto out_trans_cancel;
 
-	if (!resblks) {
-		error = xfs_dir_canenter(tp, dp, name);
-		if (error)
-			goto out_trans_cancel;
-	}
-
 	/*
 	 * A newly created regular or special file just has one directory
 	 * entry pointing to them, but a directory also the "." entry
 	 * pointing to itself.
 	 */
-	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev,
-			       prid, resblks > 0, &ip, NULL);
+	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev, prid, &ip,
+			NULL);
 	if (error)
 		goto out_trans_cancel;
 
@@ -1361,11 +1348,6 @@ xfs_create_tmpfile(
 	tres = &M_RES(mp)->tr_create_tmpfile;
 
 	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
-	if (error == -ENOSPC) {
-		/* No space at all so try a "no-allocation" reservation */
-		resblks = 0;
-		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
-	}
 	if (error)
 		goto out_release_inode;
 
@@ -1374,8 +1356,7 @@ xfs_create_tmpfile(
 	if (error)
 		goto out_trans_cancel;
 
-	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0,
-				prid, resblks > 0, &ip, NULL);
+	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip, NULL);
 	if (error)
 		goto out_trans_cancel;
 
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 0ee453de239a..ef9b634e4a0a 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -428,7 +428,7 @@ xfs_extlen_t	xfs_get_extsz_hint(struct xfs_inode *ip);
 xfs_extlen_t	xfs_get_cowextsz_hint(struct xfs_inode *ip);
 
 int		xfs_dir_ialloc(struct xfs_trans **, struct xfs_inode *, umode_t,
-			       xfs_nlink_t, xfs_dev_t, prid_t, int,
+			       xfs_nlink_t, xfs_dev_t, prid_t,
 			       struct xfs_inode **, int *);
 
 /* from xfs_file.c */
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 15751dc2a27d..b1e45ac8ada7 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -792,8 +792,8 @@ xfs_qm_qino_alloc(
 		return error;
 
 	if (need_alloc) {
-		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, 1, ip,
-								&committed);
+		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, ip,
+				&committed);
 		if (error) {
 			xfs_trans_cancel(tp);
 			return error;
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 23a50d7aa46a..ae8071a4d035 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -232,11 +232,6 @@ xfs_symlink(
 	resblks = XFS_SYMLINK_SPACE_RES(mp, link_name->len, fs_blocks);
 
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, resblks, 0, 0, &tp);
-	if (error == -ENOSPC && fs_blocks == 0) {
-		resblks = 0;
-		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, 0, 0, 0,
-				&tp);
-	}
 	if (error)
 		goto out_release_inode;
 
@@ -260,14 +255,6 @@ xfs_symlink(
 		goto out_trans_cancel;
 
 	/*
-	 * Check for ability to enter directory entry, if no space reserved.
-	 */
-	if (!resblks) {
-		error = xfs_dir_canenter(tp, dp, link_name);
-		if (error)
-			goto out_trans_cancel;
-	}
-	/*
 	 * Initialize the bmap freelist prior to calling either
 	 * bmapi or the directory create code.
 	 */
@@ -277,7 +264,7 @@ xfs_symlink(
 	 * Allocate an inode for the symlink.
 	 */
 	error = xfs_dir_ialloc(&tp, dp, S_IFLNK | (mode & ~S_IFMT), 1, 0,
-			       prid, resblks > 0, &ip, NULL);
+			       prid, &ip, NULL);
 	if (error)
 		goto out_trans_cancel;
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-21  8:24 [PATCH] xfs: remove "no-allocation" reservations for file creations Christoph Hellwig
@ 2017-08-23 20:22 ` Darrick J. Wong
  2017-08-24  0:47   ` Dave Chinner
  2017-08-24 12:18   ` Christoph Hellwig
  2017-12-07  0:15 ` Darrick J. Wong
  1 sibling, 2 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-23 20:22 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Aug 21, 2017 at 10:24:04AM +0200, Christoph Hellwig wrote:
> If we create a new file we will need an inode, and usually some metadata
> in the parent direction.  Aiming for everything to go well despite the
> lack of a reservation leads to dirty transactions cancelled under a heavy
> create/delete load.  This patch removes those nospace transactions, which
> will lead to slightly earlier ENOSPC on some workloads, but instead
> prevent file system shutdowns due to cancelling dirty transactions for
> others.
> 
> A customer could observe assertations failures and shutdowns due to
> cancelation of dirty transactions during heavy NFS workloads as shown
> below:

Looks ok... but is there a xfstest somewhere that can be coaxed into
reproducing this?  I'm looking at what this code does and have been
wondering why it even tries this weird workaround in the first place?

IOWS, the weirdness removed by this patch didn't quite smell right, but
at the same time I want to know more about why the smelly weirdness was
there at all before I rip it out. Context, anyone? :)

Will throw it on the pile and test.

--D

> 
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728125] XFS: Assertion failed: error != -ENOSPC, file: fs/xfs/xfs_inode.c, line: 1262
> 
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728222] Call Trace:
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728246]  [<ffffffff81795daf>] dump_stack+0x63/0x81
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728262]  [<ffffffff810a1a5a>] warn_slowpath_common+0x8a/0xc0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728264]  [<ffffffff810a1b8a>] warn_slowpath_null+0x1a/0x20
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728285]  [<ffffffffa01bf403>] asswarn+0x33/0x40 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728308]  [<ffffffffa01bb07e>] xfs_create+0x7be/0x7d0 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728329]  [<ffffffffa01b6ffb>] xfs_generic_create+0x1fb/0x2e0 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728348]  [<ffffffffa01b7114>] xfs_vn_mknod+0x14/0x20 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728366]  [<ffffffffa01b7153>] xfs_vn_create+0x13/0x20 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728380]  [<ffffffff81231de5>] vfs_create+0xd5/0x140
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728390]  [<ffffffffa045ddb9>] do_nfsd_create+0x499/0x610 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728396]  [<ffffffffa0465fa5>] nfsd3_proc_create+0x135/0x210 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728401]  [<ffffffffa04561e3>] nfsd_dispatch+0xc3/0x210 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728416]  [<ffffffffa03bfa43>] svc_process_common+0x453/0x6f0 [sunrpc]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728423]  [<ffffffffa03bfdf3>] svc_process+0x113/0x1f0 [sunrpc]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728427]  [<ffffffffa0455bcf>] nfsd+0x10f/0x180 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728432]  [<ffffffffa0455ac0>] ? nfsd_destroy+0x80/0x80 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728438]  [<ffffffff810c0d58>] kthread+0xd8/0xf0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728441]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728451]  [<ffffffff8179d962>] ret_from_fork+0x42/0x70
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728453]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728454] ---[ end trace f9822c842fec81d4 ]---
> 
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728477] XFS (sdb): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x4ee/0x7d0 [xfs]
> 
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728684] XFS (sdb): Corruption of in-memory data detected. Shutting down filesystem
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728685] XFS (sdb): Please umount the filesystem and rectify the problem(s)
> ---
>  fs/xfs/libxfs/xfs_ialloc.c | 10 +++-------
>  fs/xfs/libxfs/xfs_ialloc.h |  1 -
>  fs/xfs/xfs_inode.c         | 33 +++++++--------------------------
>  fs/xfs/xfs_inode.h         |  2 +-
>  fs/xfs/xfs_qm.c            |  4 ++--
>  fs/xfs/xfs_symlink.c       | 15 +--------------
>  6 files changed, 14 insertions(+), 51 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index abf5beaae907..6af9ef5d33de 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
> @@ -921,8 +921,7 @@ STATIC xfs_agnumber_t
>  xfs_ialloc_ag_select(
>  	xfs_trans_t	*tp,		/* transaction pointer */
>  	xfs_ino_t	parent,		/* parent directory inode number */
> -	umode_t		mode,		/* bits set to indicate file type */
> -	int		okalloc)	/* ok to allocate more space */
> +	umode_t		mode)		/* bits set to indicate file type */
>  {
>  	xfs_agnumber_t	agcount;	/* number of ag's in the filesystem */
>  	xfs_agnumber_t	agno;		/* current ag number */
> @@ -979,9 +978,6 @@ xfs_ialloc_ag_select(
>  			return agno;
>  		}
>  
> -		if (!okalloc)
> -			goto nextag;
> -
>  		if (!pag->pagf_init) {
>  			error = xfs_alloc_pagf_init(mp, tp, agno, flags);
>  			if (error)
> @@ -1682,7 +1678,6 @@ xfs_dialloc(
>  	struct xfs_trans	*tp,
>  	xfs_ino_t		parent,
>  	umode_t			mode,
> -	int			okalloc,
>  	struct xfs_buf		**IO_agbp,
>  	xfs_ino_t		*inop)
>  {
> @@ -1694,6 +1689,7 @@ xfs_dialloc(
>  	int			noroom = 0;
>  	xfs_agnumber_t		start_agno;
>  	struct xfs_perag	*pag;
> +	int			okalloc = 1;
>  
>  	if (*IO_agbp) {
>  		/*
> @@ -1709,7 +1705,7 @@ xfs_dialloc(
>  	 * We do not have an agbp, so select an initial allocation
>  	 * group for inode allocation.
>  	 */
> -	start_agno = xfs_ialloc_ag_select(tp, parent, mode, okalloc);
> +	start_agno = xfs_ialloc_ag_select(tp, parent, mode);
>  	if (start_agno == NULLAGNUMBER) {
>  		*inop = NULLFSINO;
>  		return 0;
> diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
> index b32cfb5aeb5b..72311a636458 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.h
> +++ b/fs/xfs/libxfs/xfs_ialloc.h
> @@ -81,7 +81,6 @@ xfs_dialloc(
>  	struct xfs_trans *tp,		/* transaction pointer */
>  	xfs_ino_t	parent,		/* parent inode (directory) */
>  	umode_t		mode,		/* mode bits for new inode */
> -	int		okalloc,	/* ok to allocate more space */
>  	struct xfs_buf	**agbp,		/* buf for a.g. inode header */
>  	xfs_ino_t	*inop);		/* inode number allocated */
>  
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index ff48f0096810..6af59418f4ab 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -769,7 +769,6 @@ xfs_ialloc(
>  	xfs_nlink_t	nlink,
>  	xfs_dev_t	rdev,
>  	prid_t		prid,
> -	int		okalloc,
>  	xfs_buf_t	**ialloc_context,
>  	xfs_inode_t	**ipp)
>  {
> @@ -785,7 +784,7 @@ xfs_ialloc(
>  	 * Call the space management code to pick
>  	 * the on-disk inode to be allocated.
>  	 */
> -	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode, okalloc,
> +	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode,
>  			    ialloc_context, &ino);
>  	if (error)
>  		return error;
> @@ -977,7 +976,6 @@ xfs_dir_ialloc(
>  	xfs_nlink_t	nlink,
>  	xfs_dev_t	rdev,
>  	prid_t		prid,		/* project id */
> -	int		okalloc,	/* ok to allocate new space */
>  	xfs_inode_t	**ipp,		/* pointer to inode; it will be
>  					   locked. */
>  	int		*committed)
> @@ -1008,8 +1006,8 @@ xfs_dir_ialloc(
>  	 * transaction commit so that no other process can steal
>  	 * the inode(s) that we've just allocated.
>  	 */
> -	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, okalloc,
> -			  &ialloc_context, &ip);
> +	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, &ialloc_context,
> +			&ip);
>  
>  	/*
>  	 * Return an error if we were unable to allocate a new inode.
> @@ -1081,7 +1079,7 @@ xfs_dir_ialloc(
>  		 * this call should always succeed.
>  		 */
>  		code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid,
> -				  okalloc, &ialloc_context, &ip);
> +				  &ialloc_context, &ip);
>  
>  		/*
>  		 * If we get an error at this point, return to the caller
> @@ -1203,11 +1201,6 @@ xfs_create(
>  		xfs_flush_inodes(mp);
>  		error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
>  	}
> -	if (error == -ENOSPC) {
> -		/* No space at all so try a "no-allocation" reservation */
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -1224,19 +1217,13 @@ xfs_create(
>  	if (error)
>  		goto out_trans_cancel;
>  
> -	if (!resblks) {
> -		error = xfs_dir_canenter(tp, dp, name);
> -		if (error)
> -			goto out_trans_cancel;
> -	}
> -
>  	/*
>  	 * A newly created regular or special file just has one directory
>  	 * entry pointing to them, but a directory also the "." entry
>  	 * pointing to itself.
>  	 */
> -	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev,
> -			       prid, resblks > 0, &ip, NULL);
> +	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev, prid, &ip,
> +			NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> @@ -1361,11 +1348,6 @@ xfs_create_tmpfile(
>  	tres = &M_RES(mp)->tr_create_tmpfile;
>  
>  	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC) {
> -		/* No space at all so try a "no-allocation" reservation */
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -1374,8 +1356,7 @@ xfs_create_tmpfile(
>  	if (error)
>  		goto out_trans_cancel;
>  
> -	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0,
> -				prid, resblks > 0, &ip, NULL);
> +	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip, NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 0ee453de239a..ef9b634e4a0a 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -428,7 +428,7 @@ xfs_extlen_t	xfs_get_extsz_hint(struct xfs_inode *ip);
>  xfs_extlen_t	xfs_get_cowextsz_hint(struct xfs_inode *ip);
>  
>  int		xfs_dir_ialloc(struct xfs_trans **, struct xfs_inode *, umode_t,
> -			       xfs_nlink_t, xfs_dev_t, prid_t, int,
> +			       xfs_nlink_t, xfs_dev_t, prid_t,
>  			       struct xfs_inode **, int *);
>  
>  /* from xfs_file.c */
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 15751dc2a27d..b1e45ac8ada7 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -792,8 +792,8 @@ xfs_qm_qino_alloc(
>  		return error;
>  
>  	if (need_alloc) {
> -		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, 1, ip,
> -								&committed);
> +		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, ip,
> +				&committed);
>  		if (error) {
>  			xfs_trans_cancel(tp);
>  			return error;
> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> index 23a50d7aa46a..ae8071a4d035 100644
> --- a/fs/xfs/xfs_symlink.c
> +++ b/fs/xfs/xfs_symlink.c
> @@ -232,11 +232,6 @@ xfs_symlink(
>  	resblks = XFS_SYMLINK_SPACE_RES(mp, link_name->len, fs_blocks);
>  
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC && fs_blocks == 0) {
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, 0, 0, 0,
> -				&tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -260,14 +255,6 @@ xfs_symlink(
>  		goto out_trans_cancel;
>  
>  	/*
> -	 * Check for ability to enter directory entry, if no space reserved.
> -	 */
> -	if (!resblks) {
> -		error = xfs_dir_canenter(tp, dp, link_name);
> -		if (error)
> -			goto out_trans_cancel;
> -	}
> -	/*
>  	 * Initialize the bmap freelist prior to calling either
>  	 * bmapi or the directory create code.
>  	 */
> @@ -277,7 +264,7 @@ xfs_symlink(
>  	 * Allocate an inode for the symlink.
>  	 */
>  	error = xfs_dir_ialloc(&tp, dp, S_IFLNK | (mode & ~S_IFMT), 1, 0,
> -			       prid, resblks > 0, &ip, NULL);
> +			       prid, &ip, NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> -- 
> 2.11.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-23 20:22 ` Darrick J. Wong
@ 2017-08-24  0:47   ` Dave Chinner
  2017-08-24  0:57     ` Darrick J. Wong
  2017-08-29 18:04     ` Darrick J. Wong
  2017-08-24 12:18   ` Christoph Hellwig
  1 sibling, 2 replies; 14+ messages in thread
From: Dave Chinner @ 2017-08-24  0:47 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Wed, Aug 23, 2017 at 01:22:13PM -0700, Darrick J. Wong wrote:
> On Mon, Aug 21, 2017 at 10:24:04AM +0200, Christoph Hellwig wrote:
> > If we create a new file we will need an inode, and usually some metadata
> > in the parent direction.  Aiming for everything to go well despite the
> > lack of a reservation leads to dirty transactions cancelled under a heavy
> > create/delete load.  This patch removes those nospace transactions, which
> > will lead to slightly earlier ENOSPC on some workloads, but instead
> > prevent file system shutdowns due to cancelling dirty transactions for
> > others.
> > 
> > A customer could observe assertations failures and shutdowns due to
> > cancelation of dirty transactions during heavy NFS workloads as shown
> > below:
> 
> Looks ok... but is there a xfstest somewhere that can be coaxed into
> reproducing this?  I'm looking at what this code does and have been
> wondering why it even tries this weird workaround in the first place?

Because back in 1997 SGI had a customer that didn't like getting
ENOSPC being reported trying to rename file near ENOSPC when df said
the filesystem had <some tiny amount> of space available and the
directory blocks weren't full:

commit f5029ed542697e8daf728b57d8fec0d9f1abc66c
Author: Doug Doucette <doucette@engr.sgi.com>
Date:   Tue Jul 15 17:54:13 1997 +0000

    Add xfs_dir_canenter to check for entering name in a directory
    with no space allocation.  Initialize new da_arg field justcheck,
    use it in xfs_dir_node_addname.

commit 9b9c81137b07d40d864e468cf3168f1b55d83c13
Author: Doug Doucette <doucette@engr.sgi.com>
Date:   Fri Jul 11 16:33:02 1997 +0000

    Make xfs_dir_createname fail gracefully if the total argument is
    0 and we actually need space.  Same treatment for xfs_dir_node_addname.
    Part of making rename work sometimes with 0 space reservation.

> IOWS, the weirdness removed by this patch didn't quite smell right, but
> at the same time I want to know more about why the smelly weirdness was
> there at all before I rip it out. Context, anyone? :)

It's always been a crufty corner case. xfs_rename and xfs_remove
need to be able to operate at ENOSPC where reservations may not be
possible so they can free up space. However, create/symlink/link
don't really need to work when so close to ENOSPC we can't get a
reservation of a few blocks, so I don't see a huge problem with
this.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-24  0:47   ` Dave Chinner
@ 2017-08-24  0:57     ` Darrick J. Wong
  2017-08-29 18:04     ` Darrick J. Wong
  1 sibling, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-24  0:57 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, linux-xfs

On Thu, Aug 24, 2017 at 10:47:55AM +1000, Dave Chinner wrote:
> On Wed, Aug 23, 2017 at 01:22:13PM -0700, Darrick J. Wong wrote:
> > On Mon, Aug 21, 2017 at 10:24:04AM +0200, Christoph Hellwig wrote:
> > > If we create a new file we will need an inode, and usually some metadata
> > > in the parent direction.  Aiming for everything to go well despite the
> > > lack of a reservation leads to dirty transactions cancelled under a heavy
> > > create/delete load.  This patch removes those nospace transactions, which
> > > will lead to slightly earlier ENOSPC on some workloads, but instead
> > > prevent file system shutdowns due to cancelling dirty transactions for
> > > others.
> > > 
> > > A customer could observe assertations failures and shutdowns due to
> > > cancelation of dirty transactions during heavy NFS workloads as shown
> > > below:
> > 
> > Looks ok... but is there a xfstest somewhere that can be coaxed into
> > reproducing this?  I'm looking at what this code does and have been
> > wondering why it even tries this weird workaround in the first place?
> 
> Because back in 1997 SGI had a customer that didn't like getting
> ENOSPC being reported trying to rename file near ENOSPC when df said
> the filesystem had <some tiny amount> of space available and the
> directory blocks weren't full:
> 
> commit f5029ed542697e8daf728b57d8fec0d9f1abc66c
> Author: Doug Doucette <doucette@engr.sgi.com>
> Date:   Tue Jul 15 17:54:13 1997 +0000
> 
>     Add xfs_dir_canenter to check for entering name in a directory
>     with no space allocation.  Initialize new da_arg field justcheck,
>     use it in xfs_dir_node_addname.
> 
> commit 9b9c81137b07d40d864e468cf3168f1b55d83c13
> Author: Doug Doucette <doucette@engr.sgi.com>
> Date:   Fri Jul 11 16:33:02 1997 +0000
> 
>     Make xfs_dir_createname fail gracefully if the total argument is
>     0 and we actually need space.  Same treatment for xfs_dir_node_addname.
>     Part of making rename work sometimes with 0 space reservation.
> 
> > IOWS, the weirdness removed by this patch didn't quite smell right, but
> > at the same time I want to know more about why the smelly weirdness was
> > there at all before I rip it out. Context, anyone? :)
> 
> It's always been a crufty corner case. xfs_rename and xfs_remove
> need to be able to operate at ENOSPC where reservations may not be
> possible so they can free up space. However, create/symlink/link
> don't really need to work when so close to ENOSPC we can't get a
> reservation of a few blocks, so I don't see a huge problem with
> this.

<nod> I agree that for rm and rename we might have to do twisted things
to avoid blowing up; it was more that doing so for inode allocation as
part of create/symlink that just seemed ... weird to me.

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-23 20:22 ` Darrick J. Wong
  2017-08-24  0:47   ` Dave Chinner
@ 2017-08-24 12:18   ` Christoph Hellwig
  1 sibling, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2017-08-24 12:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Wed, Aug 23, 2017 at 01:22:13PM -0700, Darrick J. Wong wrote:
> Looks ok... but is there a xfstest somewhere that can be coaxed into
> reproducing this?

The QA team that reported this to me required multiple hours to reproduce
it.  I think the key is heavy metadata ops in a full fs, but all my
simple attempts to reproduce it failed.

> I'm looking at what this code does and have been
> wondering why it even tries this weird workaround in the first place?

I think this dates to the time where we weren't very strict about
transaction space reservations - it basically plays lose when near
enospc at the cost of rarely shutting down the whole file system
due to a dirty transaction.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-24  0:47   ` Dave Chinner
  2017-08-24  0:57     ` Darrick J. Wong
@ 2017-08-29 18:04     ` Darrick J. Wong
  2017-09-03  7:25       ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-08-29 18:04 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, linux-xfs

On Thu, Aug 24, 2017 at 10:47:55AM +1000, Dave Chinner wrote:
> On Wed, Aug 23, 2017 at 01:22:13PM -0700, Darrick J. Wong wrote:
> > On Mon, Aug 21, 2017 at 10:24:04AM +0200, Christoph Hellwig wrote:
> > > If we create a new file we will need an inode, and usually some metadata
> > > in the parent direction.  Aiming for everything to go well despite the
> > > lack of a reservation leads to dirty transactions cancelled under a heavy
> > > create/delete load.  This patch removes those nospace transactions, which
> > > will lead to slightly earlier ENOSPC on some workloads, but instead
> > > prevent file system shutdowns due to cancelling dirty transactions for
> > > others.
> > > 
> > > A customer could observe assertations failures and shutdowns due to
> > > cancelation of dirty transactions during heavy NFS workloads as shown
> > > below:
> > 
> > Looks ok... but is there a xfstest somewhere that can be coaxed into
> > reproducing this?  I'm looking at what this code does and have been
> > wondering why it even tries this weird workaround in the first place?
> 
> Because back in 1997 SGI had a customer that didn't like getting
> ENOSPC being reported trying to rename file near ENOSPC when df said
> the filesystem had <some tiny amount> of space available and the
> directory blocks weren't full:
> 
> commit f5029ed542697e8daf728b57d8fec0d9f1abc66c
> Author: Doug Doucette <doucette@engr.sgi.com>
> Date:   Tue Jul 15 17:54:13 1997 +0000
> 
>     Add xfs_dir_canenter to check for entering name in a directory
>     with no space allocation.  Initialize new da_arg field justcheck,
>     use it in xfs_dir_node_addname.
> 
> commit 9b9c81137b07d40d864e468cf3168f1b55d83c13
> Author: Doug Doucette <doucette@engr.sgi.com>
> Date:   Fri Jul 11 16:33:02 1997 +0000
> 
>     Make xfs_dir_createname fail gracefully if the total argument is
>     0 and we actually need space.  Same treatment for xfs_dir_node_addname.
>     Part of making rename work sometimes with 0 space reservation.
> 
> > IOWS, the weirdness removed by this patch didn't quite smell right, but
> > at the same time I want to know more about why the smelly weirdness was
> > there at all before I rip it out. Context, anyone? :)
> 
> It's always been a crufty corner case. xfs_rename and xfs_remove
> need to be able to operate at ENOSPC where reservations may not be
> possible so they can free up space. However, create/symlink/link
> don't really need to work when so close to ENOSPC we can't get a
> reservation of a few blocks, so I don't see a huge problem with
> this.

Hmm.  Across the testing farm, I see a new behavior -- some of the
xfs/013 runs now get stuck in the cleaner thread, waiting for fsstress
to create some directories for it to remove, only fsstress crashed out
with ENOSPC long ago.  Prior to that, the test would finish with a ton
of ENOSPC messages.

I'm /fairly/ sure this is just xfs/013 being slightly stupid, but in any
case I'm going to hang onto this one a little more while I try to figure
out what exactly xfs/013 is trying to do.  My guess is that the cleaner
thread should just exit if fsstress goes away, but then there's also the
question of whether or not it matter that we spew out the ENOSPC
messages even without this patch applied.

(IOWs I think the problem I see is due to the test, not this patch.)

<muttering>

Ultimately this comes down to a behavioral change, where we trade a
fairly improbably bad event (fs shutdown) for a somewhat more probably
not as bad event (enospc with a few free blocks), which makes this
decision hard, especially with the current cloudy trend of running
systems much closer to the limits than we used to, as Dave pointed out
elsewhere.

My off-the-cuff opinion is that systems ought to be provisioned with
enough storage to keep the fs from getting anywhere past 90% full, let
alone full to the point where we even need to try a no-allocation
transaction; and that shutdowns are bad, especially if they're
avoidable. :)

</muttering>

--D

> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-29 18:04     ` Darrick J. Wong
@ 2017-09-03  7:25       ` Christoph Hellwig
  2017-09-03 15:31         ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2017-09-03  7:25 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, Christoph Hellwig, linux-xfs

Looks like this didn't make the 4.14 for-next branch either.  I guess
we'll just consider it for the next round?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-09-03  7:25       ` Christoph Hellwig
@ 2017-09-03 15:31         ` Darrick J. Wong
  2017-12-05  1:34           ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-09-03 15:31 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs

On Sun, Sep 03, 2017 at 09:25:13AM +0200, Christoph Hellwig wrote:
> Looks like this didn't make the 4.14 for-next branch either.  I guess
> we'll just consider it for the next round?

Yes, it caused a number of new xfstests failures due to new ENOSPC
messages in the golden output so I decided to defer action on it until
someone can assess and fix the problems appropriately.  The worst part
was that the new failures are intermittent.

--D

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-09-03 15:31         ` Darrick J. Wong
@ 2017-12-05  1:34           ` Darrick J. Wong
  2017-12-05 18:59             ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-12-05  1:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs

On Sun, Sep 03, 2017 at 08:31:50AM -0700, Darrick J. Wong wrote:
> On Sun, Sep 03, 2017 at 09:25:13AM +0200, Christoph Hellwig wrote:
> > Looks like this didn't make the 4.14 for-next branch either.  I guess
> > we'll just consider it for the next round?
> 
> Yes, it caused a number of new xfstests failures due to new ENOSPC
> messages in the golden output so I decided to defer action on it until
> someone can assess and fix the problems appropriately.  The worst part
> was that the new failures are intermittent.

Heh, a customer slammed right into this, so I'll try to pick up this
patch again and fix all the xfstests....

--D

> --D
> 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-12-05  1:34           ` Darrick J. Wong
@ 2017-12-05 18:59             ` Darrick J. Wong
  2017-12-07  0:02               ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-12-05 18:59 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs

On Mon, Dec 04, 2017 at 05:34:22PM -0800, Darrick J. Wong wrote:
> On Sun, Sep 03, 2017 at 08:31:50AM -0700, Darrick J. Wong wrote:
> > On Sun, Sep 03, 2017 at 09:25:13AM +0200, Christoph Hellwig wrote:
> > > Looks like this didn't make the 4.14 for-next branch either.  I guess
> > > we'll just consider it for the next round?
> > 
> > Yes, it caused a number of new xfstests failures due to new ENOSPC
> > messages in the golden output so I decided to defer action on it until
> > someone can assess and fix the problems appropriately.  The worst part
> > was that the new failures are intermittent.
> 
> Heh, a customer slammed right into this, so I'll try to pick up this
> patch again and fix all the xfstests....

I fixed most of the auto group regressions with small fixes to xfstests,
but xfs/013 is a problem because once we run out of space to create new
dirX we simply get nothing.  Will have to study the behavior of that
test in more depth...

--D

> --D
> 
> > --D
> > 
> > > --
> > > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > > the body of a message to majordomo@vger.kernel.org
> > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-12-05 18:59             ` Darrick J. Wong
@ 2017-12-07  0:02               ` Christoph Hellwig
  2017-12-07  0:19                 ` Darrick J. Wong
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2017-12-07  0:02 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, Dave Chinner, linux-xfs

On Tue, Dec 05, 2017 at 10:59:36AM -0800, Darrick J. Wong wrote:
> I fixed most of the auto group regressions with small fixes to xfstests,
> but xfs/013 is a problem because once we run out of space to create new
> dirX we simply get nothing.  Will have to study the behavior of that
> test in more depth...

With what kind of config do you see xfs/013 failures?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-08-21  8:24 [PATCH] xfs: remove "no-allocation" reservations for file creations Christoph Hellwig
  2017-08-23 20:22 ` Darrick J. Wong
@ 2017-12-07  0:15 ` Darrick J. Wong
  2017-12-07  0:17   ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Darrick J. Wong @ 2017-12-07  0:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Mon, Aug 21, 2017 at 10:24:04AM +0200, Christoph Hellwig wrote:
> If we create a new file we will need an inode, and usually some metadata
> in the parent direction.  Aiming for everything to go well despite the
> lack of a reservation leads to dirty transactions cancelled under a heavy
> create/delete load.  This patch removes those nospace transactions, which
> will lead to slightly earlier ENOSPC on some workloads, but instead
> prevent file system shutdowns due to cancelling dirty transactions for
> others.
> 
> A customer could observe assertations failures and shutdowns due to
> cancelation of dirty transactions during heavy NFS workloads as shown
> below:
> 
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728125] XFS: Assertion failed: error != -ENOSPC, file: fs/xfs/xfs_inode.c, line: 1262
> 
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728222] Call Trace:
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728246]  [<ffffffff81795daf>] dump_stack+0x63/0x81
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728262]  [<ffffffff810a1a5a>] warn_slowpath_common+0x8a/0xc0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728264]  [<ffffffff810a1b8a>] warn_slowpath_null+0x1a/0x20
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728285]  [<ffffffffa01bf403>] asswarn+0x33/0x40 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728308]  [<ffffffffa01bb07e>] xfs_create+0x7be/0x7d0 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728329]  [<ffffffffa01b6ffb>] xfs_generic_create+0x1fb/0x2e0 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728348]  [<ffffffffa01b7114>] xfs_vn_mknod+0x14/0x20 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728366]  [<ffffffffa01b7153>] xfs_vn_create+0x13/0x20 [xfs]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728380]  [<ffffffff81231de5>] vfs_create+0xd5/0x140
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728390]  [<ffffffffa045ddb9>] do_nfsd_create+0x499/0x610 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728396]  [<ffffffffa0465fa5>] nfsd3_proc_create+0x135/0x210 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728401]  [<ffffffffa04561e3>] nfsd_dispatch+0xc3/0x210 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728416]  [<ffffffffa03bfa43>] svc_process_common+0x453/0x6f0 [sunrpc]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728423]  [<ffffffffa03bfdf3>] svc_process+0x113/0x1f0 [sunrpc]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728427]  [<ffffffffa0455bcf>] nfsd+0x10f/0x180 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728432]  [<ffffffffa0455ac0>] ? nfsd_destroy+0x80/0x80 [nfsd]
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728438]  [<ffffffff810c0d58>] kthread+0xd8/0xf0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728441]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728451]  [<ffffffff8179d962>] ret_from_fork+0x42/0x70
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728453]  [<ffffffff810c0c80>] ? kthread_create_on_node+0x1b0/0x1b0
> 2017-05-30 21:17:06 kernel: WARNING: [ 2670.728454] ---[ end trace f9822c842fec81d4 ]---
> 
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728477] XFS (sdb): Internal error xfs_trans_cancel at line 983 of file fs/xfs/xfs_trans.c.  Caller xfs_create+0x4ee/0x7d0 [xfs]
> 
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728684] XFS (sdb): Corruption of in-memory data detected. Shutting down filesystem
> 2017-05-30 21:17:06 kernel: ALERT: [ 2670.728685] XFS (sdb): Please umount the filesystem and rectify the problem(s)

Ok, so the xfstests fixes were pretty straightforward, so can I get a
signed-off-by for this patch?

--D

> ---
>  fs/xfs/libxfs/xfs_ialloc.c | 10 +++-------
>  fs/xfs/libxfs/xfs_ialloc.h |  1 -
>  fs/xfs/xfs_inode.c         | 33 +++++++--------------------------
>  fs/xfs/xfs_inode.h         |  2 +-
>  fs/xfs/xfs_qm.c            |  4 ++--
>  fs/xfs/xfs_symlink.c       | 15 +--------------
>  6 files changed, 14 insertions(+), 51 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
> index abf5beaae907..6af9ef5d33de 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.c
> +++ b/fs/xfs/libxfs/xfs_ialloc.c
> @@ -921,8 +921,7 @@ STATIC xfs_agnumber_t
>  xfs_ialloc_ag_select(
>  	xfs_trans_t	*tp,		/* transaction pointer */
>  	xfs_ino_t	parent,		/* parent directory inode number */
> -	umode_t		mode,		/* bits set to indicate file type */
> -	int		okalloc)	/* ok to allocate more space */
> +	umode_t		mode)		/* bits set to indicate file type */
>  {
>  	xfs_agnumber_t	agcount;	/* number of ag's in the filesystem */
>  	xfs_agnumber_t	agno;		/* current ag number */
> @@ -979,9 +978,6 @@ xfs_ialloc_ag_select(
>  			return agno;
>  		}
>  
> -		if (!okalloc)
> -			goto nextag;
> -
>  		if (!pag->pagf_init) {
>  			error = xfs_alloc_pagf_init(mp, tp, agno, flags);
>  			if (error)
> @@ -1682,7 +1678,6 @@ xfs_dialloc(
>  	struct xfs_trans	*tp,
>  	xfs_ino_t		parent,
>  	umode_t			mode,
> -	int			okalloc,
>  	struct xfs_buf		**IO_agbp,
>  	xfs_ino_t		*inop)
>  {
> @@ -1694,6 +1689,7 @@ xfs_dialloc(
>  	int			noroom = 0;
>  	xfs_agnumber_t		start_agno;
>  	struct xfs_perag	*pag;
> +	int			okalloc = 1;
>  
>  	if (*IO_agbp) {
>  		/*
> @@ -1709,7 +1705,7 @@ xfs_dialloc(
>  	 * We do not have an agbp, so select an initial allocation
>  	 * group for inode allocation.
>  	 */
> -	start_agno = xfs_ialloc_ag_select(tp, parent, mode, okalloc);
> +	start_agno = xfs_ialloc_ag_select(tp, parent, mode);
>  	if (start_agno == NULLAGNUMBER) {
>  		*inop = NULLFSINO;
>  		return 0;
> diff --git a/fs/xfs/libxfs/xfs_ialloc.h b/fs/xfs/libxfs/xfs_ialloc.h
> index b32cfb5aeb5b..72311a636458 100644
> --- a/fs/xfs/libxfs/xfs_ialloc.h
> +++ b/fs/xfs/libxfs/xfs_ialloc.h
> @@ -81,7 +81,6 @@ xfs_dialloc(
>  	struct xfs_trans *tp,		/* transaction pointer */
>  	xfs_ino_t	parent,		/* parent inode (directory) */
>  	umode_t		mode,		/* mode bits for new inode */
> -	int		okalloc,	/* ok to allocate more space */
>  	struct xfs_buf	**agbp,		/* buf for a.g. inode header */
>  	xfs_ino_t	*inop);		/* inode number allocated */
>  
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index ff48f0096810..6af59418f4ab 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -769,7 +769,6 @@ xfs_ialloc(
>  	xfs_nlink_t	nlink,
>  	xfs_dev_t	rdev,
>  	prid_t		prid,
> -	int		okalloc,
>  	xfs_buf_t	**ialloc_context,
>  	xfs_inode_t	**ipp)
>  {
> @@ -785,7 +784,7 @@ xfs_ialloc(
>  	 * Call the space management code to pick
>  	 * the on-disk inode to be allocated.
>  	 */
> -	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode, okalloc,
> +	error = xfs_dialloc(tp, pip ? pip->i_ino : 0, mode,
>  			    ialloc_context, &ino);
>  	if (error)
>  		return error;
> @@ -977,7 +976,6 @@ xfs_dir_ialloc(
>  	xfs_nlink_t	nlink,
>  	xfs_dev_t	rdev,
>  	prid_t		prid,		/* project id */
> -	int		okalloc,	/* ok to allocate new space */
>  	xfs_inode_t	**ipp,		/* pointer to inode; it will be
>  					   locked. */
>  	int		*committed)
> @@ -1008,8 +1006,8 @@ xfs_dir_ialloc(
>  	 * transaction commit so that no other process can steal
>  	 * the inode(s) that we've just allocated.
>  	 */
> -	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, okalloc,
> -			  &ialloc_context, &ip);
> +	code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid, &ialloc_context,
> +			&ip);
>  
>  	/*
>  	 * Return an error if we were unable to allocate a new inode.
> @@ -1081,7 +1079,7 @@ xfs_dir_ialloc(
>  		 * this call should always succeed.
>  		 */
>  		code = xfs_ialloc(tp, dp, mode, nlink, rdev, prid,
> -				  okalloc, &ialloc_context, &ip);
> +				  &ialloc_context, &ip);
>  
>  		/*
>  		 * If we get an error at this point, return to the caller
> @@ -1203,11 +1201,6 @@ xfs_create(
>  		xfs_flush_inodes(mp);
>  		error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
>  	}
> -	if (error == -ENOSPC) {
> -		/* No space at all so try a "no-allocation" reservation */
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -1224,19 +1217,13 @@ xfs_create(
>  	if (error)
>  		goto out_trans_cancel;
>  
> -	if (!resblks) {
> -		error = xfs_dir_canenter(tp, dp, name);
> -		if (error)
> -			goto out_trans_cancel;
> -	}
> -
>  	/*
>  	 * A newly created regular or special file just has one directory
>  	 * entry pointing to them, but a directory also the "." entry
>  	 * pointing to itself.
>  	 */
> -	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev,
> -			       prid, resblks > 0, &ip, NULL);
> +	error = xfs_dir_ialloc(&tp, dp, mode, is_dir ? 2 : 1, rdev, prid, &ip,
> +			NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> @@ -1361,11 +1348,6 @@ xfs_create_tmpfile(
>  	tres = &M_RES(mp)->tr_create_tmpfile;
>  
>  	error = xfs_trans_alloc(mp, tres, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC) {
> -		/* No space at all so try a "no-allocation" reservation */
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, tres, 0, 0, 0, &tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -1374,8 +1356,7 @@ xfs_create_tmpfile(
>  	if (error)
>  		goto out_trans_cancel;
>  
> -	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0,
> -				prid, resblks > 0, &ip, NULL);
> +	error = xfs_dir_ialloc(&tp, dp, mode, 1, 0, prid, &ip, NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 0ee453de239a..ef9b634e4a0a 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -428,7 +428,7 @@ xfs_extlen_t	xfs_get_extsz_hint(struct xfs_inode *ip);
>  xfs_extlen_t	xfs_get_cowextsz_hint(struct xfs_inode *ip);
>  
>  int		xfs_dir_ialloc(struct xfs_trans **, struct xfs_inode *, umode_t,
> -			       xfs_nlink_t, xfs_dev_t, prid_t, int,
> +			       xfs_nlink_t, xfs_dev_t, prid_t,
>  			       struct xfs_inode **, int *);
>  
>  /* from xfs_file.c */
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 15751dc2a27d..b1e45ac8ada7 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -792,8 +792,8 @@ xfs_qm_qino_alloc(
>  		return error;
>  
>  	if (need_alloc) {
> -		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, 1, ip,
> -								&committed);
> +		error = xfs_dir_ialloc(&tp, NULL, S_IFREG, 1, 0, 0, ip,
> +				&committed);
>  		if (error) {
>  			xfs_trans_cancel(tp);
>  			return error;
> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> index 23a50d7aa46a..ae8071a4d035 100644
> --- a/fs/xfs/xfs_symlink.c
> +++ b/fs/xfs/xfs_symlink.c
> @@ -232,11 +232,6 @@ xfs_symlink(
>  	resblks = XFS_SYMLINK_SPACE_RES(mp, link_name->len, fs_blocks);
>  
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, resblks, 0, 0, &tp);
> -	if (error == -ENOSPC && fs_blocks == 0) {
> -		resblks = 0;
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_symlink, 0, 0, 0,
> -				&tp);
> -	}
>  	if (error)
>  		goto out_release_inode;
>  
> @@ -260,14 +255,6 @@ xfs_symlink(
>  		goto out_trans_cancel;
>  
>  	/*
> -	 * Check for ability to enter directory entry, if no space reserved.
> -	 */
> -	if (!resblks) {
> -		error = xfs_dir_canenter(tp, dp, link_name);
> -		if (error)
> -			goto out_trans_cancel;
> -	}
> -	/*
>  	 * Initialize the bmap freelist prior to calling either
>  	 * bmapi or the directory create code.
>  	 */
> @@ -277,7 +264,7 @@ xfs_symlink(
>  	 * Allocate an inode for the symlink.
>  	 */
>  	error = xfs_dir_ialloc(&tp, dp, S_IFLNK | (mode & ~S_IFMT), 1, 0,
> -			       prid, resblks > 0, &ip, NULL);
> +			       prid, &ip, NULL);
>  	if (error)
>  		goto out_trans_cancel;
>  
> -- 
> 2.11.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-12-07  0:15 ` Darrick J. Wong
@ 2017-12-07  0:17   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2017-12-07  0:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs

On Wed, Dec 06, 2017 at 04:15:56PM -0800, Darrick J. Wong wrote:
> Ok, so the xfstests fixes were pretty straightforward, so can I get a
> signed-off-by for this patch?

I can't believe I missed that :)

Signed-off-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] xfs: remove "no-allocation" reservations for file creations
  2017-12-07  0:02               ` Christoph Hellwig
@ 2017-12-07  0:19                 ` Darrick J. Wong
  0 siblings, 0 replies; 14+ messages in thread
From: Darrick J. Wong @ 2017-12-07  0:19 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Dave Chinner, linux-xfs

On Thu, Dec 07, 2017 at 01:02:59AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 05, 2017 at 10:59:36AM -0800, Darrick J. Wong wrote:
> > I fixed most of the auto group regressions with small fixes to xfstests,
> > but xfs/013 is a problem because once we run out of space to create new
> > dirX we simply get nothing.  Will have to study the behavior of that
> > test in more depth...
> 
> With what kind of config do you see xfs/013 failures?

MKFS_OPTIONS='-m rmapbt=1,reflink=1 -i sparse=1'

I /think/ the fix is that the cp -Rl can sometimes fail to create
dir$((i+1)) because we happen to hit exactly the point where the cleaner
hasn't caught up with fsstress and so the dir$((i+1)) creation
transaction fails with ENOSPC.  AFAICT if I change that to:

while ! test -d $SCRATCH_MNT/dir$((i+1)); do
	cp -Rl ...
done

then xfs/013 exercises the finobt just fine and doesn't fail.

--D

> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2017-12-07  0:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-21  8:24 [PATCH] xfs: remove "no-allocation" reservations for file creations Christoph Hellwig
2017-08-23 20:22 ` Darrick J. Wong
2017-08-24  0:47   ` Dave Chinner
2017-08-24  0:57     ` Darrick J. Wong
2017-08-29 18:04     ` Darrick J. Wong
2017-09-03  7:25       ` Christoph Hellwig
2017-09-03 15:31         ` Darrick J. Wong
2017-12-05  1:34           ` Darrick J. Wong
2017-12-05 18:59             ` Darrick J. Wong
2017-12-07  0:02               ` Christoph Hellwig
2017-12-07  0:19                 ` Darrick J. Wong
2017-08-24 12:18   ` Christoph Hellwig
2017-12-07  0:15 ` Darrick J. Wong
2017-12-07  0:17   ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.