linux-xfs.vger.kernel.org archive mirror
* [PATCH 0/2] xfs_repair: two fixes
@ 2018-10-23  4:03 Eric Sandeen
  2018-10-23  4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen
  2018-10-23  4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen
  0 siblings, 2 replies; 7+ messages in thread
From: Eric Sandeen @ 2018-10-23  4:03 UTC
  To: linux-xfs

Two fixes for bugs that came up in the past 2 days ...


* [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check
  2018-10-23  4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen
@ 2018-10-23  4:04 ` Eric Sandeen
  2018-10-23  4:23   ` Darrick J. Wong
  2018-10-23  4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2018-10-23  4:04 UTC
  To: linux-xfs

If we need to realloc the bplist[] array holding buffers for a given
directory, we don't initialize the new slots.  This causes a problem
if the directory has holes, because those slots never get filled in.

At the end of the function we call libxfs_putbuf for every non-null
slot, and any uninitialized slots are segfault landmines.

Make sure we initialize all new slots to NULL for this reason.

Reported-by: Oleg Davydov <burunduk3@gmail.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

diff --git a/repair/phase6.c b/repair/phase6.c
index b87c751..9d24a4f 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -2348,6 +2348,8 @@ longform_dir2_entry_check(xfs_mount_t	*mp,
 
 		db = xfs_dir2_da_to_db(mp->m_dir_geo, da_bno);
 		if (db >= num_bps) {
+			int last_size = num_bps;
+
 			/* more data blocks than expected */
 			num_bps = db + 1;
 			bplist = realloc(bplist, num_bps * sizeof(struct xfs_buf*));
@@ -2355,6 +2357,9 @@ longform_dir2_entry_check(xfs_mount_t	*mp,
 				do_error(_("realloc failed in %s (%zu bytes)\n"),
 					__func__,
 					num_bps * sizeof(struct xfs_buf*));
+			/* Initialize the new elements */
+			for (i = last_size; i < num_bps; i++)
+				bplist[i] = NULL;
 		}
 
 		if (isblock)

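For illustration, the pattern this patch applies can be shown outside
xfs_repair. The following is a minimal standalone sketch, not the code
from phase6.c: the helper name grow_ptr_array() and its signature are
hypothetical. It grows a pointer array with realloc() and clears only
the newly added tail, since realloc() does not initialize that memory.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: grow an array of pointers from old_count to
 * new_count slots and set every newly added slot to NULL.
 *
 * Returns the (possibly moved) array, or NULL if realloc() fails, in
 * which case the original array is left untouched.
 */
static void **grow_ptr_array(void **list, size_t old_count, size_t new_count)
{
	void **tmp;
	size_t i;

	/* Nothing to do if the array is already large enough. */
	if (new_count <= old_count)
		return list;

	tmp = realloc(list, new_count * sizeof(*tmp));
	if (!tmp)
		return NULL;

	/* realloc() leaves the added tail uninitialized; clear it. */
	for (i = old_count; i < new_count; i++)
		tmp[i] = NULL;

	return tmp;
}
```

A later cleanup loop that releases every non-NULL slot is then safe
even if some of the new slots are never filled in, which is the
"directory has holes" case described in the commit message.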

* [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance
  2018-10-23  4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen
  2018-10-23  4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen
@ 2018-10-23  4:08 ` Eric Sandeen
  2018-10-23  4:22   ` Darrick J. Wong
  2018-10-23 13:57   ` [PATCH 2/2 V2] " Eric Sandeen
  1 sibling, 2 replies; 7+ messages in thread
From: Eric Sandeen @ 2018-10-23  4:08 UTC
  To: linux-xfs

After commit:

15a8bcc xfs: fix multi-AG deadlock in xfs_bunmapi

xfs_bunmapi can legitimately return before all work is done.
Sadly nobody told xfs_repair, so it fires an assert:

 phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. 

Fix this by calling back in until all work is done, as we do
in the kernel.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116
Reported-by: Tomasz Torcz <tomek@pipebreaker.pl>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

diff --git a/repair/phase6.c b/repair/phase6.c
index e017326..b87c751 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -1317,7 +1317,7 @@ longform_dir2_rebuild(
 	xfs_fileoff_t		lastblock;
 	xfs_inode_t		pip;
 	dir_hash_ent_t		*p;
-	int			done;
+	int			done = 0;
 
 	/*
 	 * trash directory completely and rebuild from scratch using the
@@ -1352,12 +1352,25 @@ longform_dir2_rebuild(
 			error);
 
 	/* free all data, leaf, node and freespace blocks */
-	error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0,
-				&done);
-	if (error) {
-		do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
-		goto out_bmap_cancel;
-	}
+	while (!done) {
+	       error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA,
+			               0, &done);
+	       if (error) {
+		       do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
+		       goto out_bmap_cancel;
+	       }
+	       error = xfs_defer_finish(&tp);
+	       if (error) {
+		       do_warn(("defer_finish failed -- error - %d\n"), error);
+		       goto out_bmap_cancel;
+	       }
+	       /*
+		* Close out trans and start the next one in the chain.
+		*/
+	       error = xfs_trans_roll_inode(&tp, ip);
+	       if (error)
+			goto out_bmap_cancel;
+        }
 
 	ASSERT(done);
 


* Re: [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance
  2018-10-23  4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen
@ 2018-10-23  4:22   ` Darrick J. Wong
  2018-10-23 13:57   ` [PATCH 2/2 V2] " Eric Sandeen
  1 sibling, 0 replies; 7+ messages in thread
From: Darrick J. Wong @ 2018-10-23  4:22 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Mon, Oct 22, 2018 at 11:08:49PM -0500, Eric Sandeen wrote:
> After commit:
> 
> 15a8bcc xfs: fix multi-AG deadlock in xfs_bunmapi
> 
> xfs_bunmapi can legitimately return before all work is done.
> Sadly nobody told xfs_repair, so it fires an assert:
> 
>  phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. 
> 
> Fix this by calling back in until all work is done, as we do
> in the kernel.

Looking at the rest of xfsprogs, I think the other directory-related
xfs_bunmapi callers probably need to be able to roll-and-continue, but
that seems like a topic for (a) the kernel and (b) separate patches.

> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116
> Reported-by: Tomasz Torcz <tomek@pipebreaker.pl>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> diff --git a/repair/phase6.c b/repair/phase6.c
> index e017326..b87c751 100644
> --- a/repair/phase6.c
> +++ b/repair/phase6.c
> @@ -1317,7 +1317,7 @@ longform_dir2_rebuild(
>  	xfs_fileoff_t		lastblock;
>  	xfs_inode_t		pip;
>  	dir_hash_ent_t		*p;
> -	int			done;
> +	int			done = 0;
>  
>  	/*
>  	 * trash directory completely and rebuild from scratch using the
> @@ -1352,12 +1352,25 @@ longform_dir2_rebuild(
>  			error);
>  
>  	/* free all data, leaf, node and freespace blocks */
> -	error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0,
> -				&done);
> -	if (error) {
> -		do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
> -		goto out_bmap_cancel;
> -	}
> +	while (!done) {
> +	       error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA,
> +			               0, &done);
> +	       if (error) {
> +		       do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
> +		       goto out_bmap_cancel;
> +	       }
> +	       error = xfs_defer_finish(&tp);

error = -libxfs_defer_finish(...);

> +	       if (error) {
> +		       do_warn(("defer_finish failed -- error - %d\n"), error);
> +		       goto out_bmap_cancel;
> +	       }
> +	       /*
> +		* Close out trans and start the next one in the chain.
> +		*/
> +	       error = xfs_trans_roll_inode(&tp, ip);

error = -libxfs_trans_roll_inode(...);

> +	       if (error)
> +			goto out_bmap_cancel;
> +        }
>  
>  	ASSERT(done);

This assert can go away since !done is the loop test condition.

--D

>  
> 

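The two review comments above ask for negated libxfs_ calls, matching
the existing "error = -libxfs_bunmapi(...)" line in the patch. The
sketch below illustrates that sign convention under the assumption
that the library-side helper returns 0 or a negative errno value,
kernel style, while the caller wants a positive errno. Both function
names are hypothetical stand-ins, not xfsprogs APIs.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical kernel-style helper: returns 0 on success or a
 * negative errno value on failure.
 */
static int kernel_style_op(int should_fail)
{
	return should_fail ? -ENOSPC : 0;
}

/*
 * Hypothetical userspace caller: negate the return value so that
 * "error" holds a positive errno, suitable for strerror().
 */
static int caller(int should_fail)
{
	int error = -kernel_style_op(should_fail);

	if (error) {
		fprintf(stderr, "op failed -- error - %d (%s)\n",
			error, strerror(error));
		return error;
	}
	return 0;
}
```

Without the negation, the value printed in the warning would be
negative and could not be passed to strerror() directly.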

* Re: [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check
  2018-10-23  4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen
@ 2018-10-23  4:23   ` Darrick J. Wong
  0 siblings, 0 replies; 7+ messages in thread
From: Darrick J. Wong @ 2018-10-23  4:23 UTC
  To: Eric Sandeen; +Cc: linux-xfs

On Mon, Oct 22, 2018 at 11:04:31PM -0500, Eric Sandeen wrote:
> If we need to realloc the bplist[] array holding buffers for a given
> directory, we don't initialize the new slots.  This causes a problem
> if the directory has holes, because those slots never get filled in.
> 
> At the end of the function we call libxfs_putbuf for every non-null
> slot, and any uninitialized slots are segfault landmines.
> 
> Make sure we initialize all new slots to NULL for this reason.
> 
> Reported-by: Oleg Davydov <burunduk3@gmail.com>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>

Yay realloc :P

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
> 
> diff --git a/repair/phase6.c b/repair/phase6.c
> index b87c751..9d24a4f 100644
> --- a/repair/phase6.c
> +++ b/repair/phase6.c
> @@ -2348,6 +2348,8 @@ longform_dir2_entry_check(xfs_mount_t	*mp,
>  
>  		db = xfs_dir2_da_to_db(mp->m_dir_geo, da_bno);
>  		if (db >= num_bps) {
> +			int last_size = num_bps;
> +
>  			/* more data blocks than expected */
>  			num_bps = db + 1;
>  			bplist = realloc(bplist, num_bps * sizeof(struct xfs_buf*));
> @@ -2355,6 +2357,9 @@ longform_dir2_entry_check(xfs_mount_t	*mp,
>  				do_error(_("realloc failed in %s (%zu bytes)\n"),
>  					__func__,
>  					num_bps * sizeof(struct xfs_buf*));
> +			/* Initialize the new elements */
> +			for (i = last_size; i < num_bps; i++)
> +				bplist[i] = NULL;
>  		}
>  
>  		if (isblock)
> 


* [PATCH 2/2 V2] xfs_repair: continue after xfs_bunmapi deadlock avoidance
  2018-10-23  4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen
  2018-10-23  4:22   ` Darrick J. Wong
@ 2018-10-23 13:57   ` Eric Sandeen
  2018-10-23 15:30     ` Darrick J. Wong
  1 sibling, 1 reply; 7+ messages in thread
From: Eric Sandeen @ 2018-10-23 13:57 UTC
  To: linux-xfs; +Cc: Tomasz Torcz

xfs_bunmapi can legitimately return before all work is done, to
avoid deadlocks across AGs.

Sadly nobody told xfs_repair, so it fires an assert if this happens:

 phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. 

Fix this by calling back in until all work is done, as we do
in the kernel.

Fixes: 15a8bcc ("xfs: fix multi-AG deadlock in xfs_bunmapi")
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116
Reported-by: Tomasz Torcz <tomek@pipebreaker.pl>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---

V2: libxfs-ify the code, remove now-pointless assert, thanks Darrick!


diff --git a/repair/phase6.c b/repair/phase6.c
index e017326..bdbbbaa 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -1317,7 +1317,7 @@ longform_dir2_rebuild(
 	xfs_fileoff_t		lastblock;
 	xfs_inode_t		pip;
 	dir_hash_ent_t		*p;
-	int			done;
+	int			done = 0;
 
 	/*
 	 * trash directory completely and rebuild from scratch using the
@@ -1352,14 +1352,25 @@ longform_dir2_rebuild(
 			error);
 
 	/* free all data, leaf, node and freespace blocks */
-	error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0,
-				&done);
-	if (error) {
-		do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
-		goto out_bmap_cancel;
-	}
-
-	ASSERT(done);
+	while (!done) {
+	       error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA,
+			               0, &done);
+	       if (error) {
+		       do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
+		       goto out_bmap_cancel;
+	       }
+	       error = -libxfs_defer_finish(&tp);
+	       if (error) {
+		       do_warn(("defer_finish failed -- error - %d\n"), error);
+		       goto out_bmap_cancel;
+	       }
+	       /*
+		* Close out trans and start the next one in the chain.
+		*/
+	       error = -libxfs_trans_roll_inode(&tp, ip);
+	       if (error)
+			goto out_bmap_cancel;
+        }
 
 	error = -libxfs_dir_init(tp, ip, &pip);
 	if (error) {

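The "call back in until all work is done" shape of this fix can be
sketched without any XFS code. In the example below, do_some_work()
is a hypothetical stand-in for a call that may legitimately stop
early and report completion through a "done" flag. The struct and
both function names are assumptions made for illustration only.

```c
#include <stddef.h>

/* Hypothetical work descriptor. */
struct work {
	size_t remaining;	/* units still to process */
	size_t per_call;	/* most units handled in one call */
};

/*
 * Process up to w->per_call units and set *done once nothing is left.
 * Returns 0 on success, or -1 if no progress is possible.
 */
static int do_some_work(struct work *w, int *done)
{
	size_t n;

	if (w->per_call == 0)
		return -1;

	n = w->remaining < w->per_call ? w->remaining : w->per_call;
	w->remaining -= n;
	*done = (w->remaining == 0);
	return 0;
}

/*
 * Keep calling until the callee reports completion.
 * Returns the number of calls made, or -1 on error.
 */
static int run_to_completion(struct work *w)
{
	int done = 0;	/* must start at 0 so the loop runs at least once */
	int calls = 0;

	while (!done) {
		int error = do_some_work(w, &done);

		if (error)
			return error;
		/* The patch finishes deferred ops and rolls the transaction here. */
		calls++;
	}
	return calls;
}
```

Because "!done" is the loop condition, a separate assertion on "done"
after the loop adds nothing, which is why V2 drops the ASSERT.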

* Re: [PATCH 2/2 V2] xfs_repair: continue after xfs_bunmapi deadlock avoidance
  2018-10-23 13:57   ` [PATCH 2/2 V2] " Eric Sandeen
@ 2018-10-23 15:30     ` Darrick J. Wong
  0 siblings, 0 replies; 7+ messages in thread
From: Darrick J. Wong @ 2018-10-23 15:30 UTC
  To: Eric Sandeen; +Cc: linux-xfs, Tomasz Torcz

On Tue, Oct 23, 2018 at 08:57:03AM -0500, Eric Sandeen wrote:
> xfs_bunmapi can legitimately return before all work is done, to
> avoid deadlocks across AGs.
> 
> Sadly nobody told xfs_repair, so it fires an assert if this happens:
> 
>  phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. 
> 
> Fix this by calling back in until all work is done, as we do
> in the kernel.
> 
> Fixes: 15a8bcc ("xfs: fix multi-AG deadlock in xfs_bunmapi")
> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116
> Reported-by: Tomasz Torcz <tomek@pipebreaker.pl>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> V2: libxfs-ify the code, remove now-pointless assert, thanks Darrick!
> 
> 
> diff --git a/repair/phase6.c b/repair/phase6.c
> index e017326..bdbbbaa 100644
> --- a/repair/phase6.c
> +++ b/repair/phase6.c
> @@ -1317,7 +1317,7 @@ longform_dir2_rebuild(
>  	xfs_fileoff_t		lastblock;
>  	xfs_inode_t		pip;
>  	dir_hash_ent_t		*p;
> -	int			done;
> +	int			done = 0;
>  
>  	/*
>  	 * trash directory completely and rebuild from scratch using the
> @@ -1352,14 +1352,25 @@ longform_dir2_rebuild(
>  			error);
>  
>  	/* free all data, leaf, node and freespace blocks */
> -	error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0,
> -				&done);
> -	if (error) {
> -		do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
> -		goto out_bmap_cancel;
> -	}
> -
> -	ASSERT(done);
> +	while (!done) {
> +	       error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA,
> +			               0, &done);
> +	       if (error) {
> +		       do_warn(_("xfs_bunmapi failed -- error - %d\n"), error);
> +		       goto out_bmap_cancel;
> +	       }
> +	       error = -libxfs_defer_finish(&tp);
> +	       if (error) {
> +		       do_warn(("defer_finish failed -- error - %d\n"), error);
> +		       goto out_bmap_cancel;
> +	       }
> +	       /*
> +		* Close out trans and start the next one in the chain.
> +		*/
> +	       error = -libxfs_trans_roll_inode(&tp, ip);
> +	       if (error)

No do_warn() here?  :)

With that,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> +			goto out_bmap_cancel;
> +        }
>  
>  	error = -libxfs_dir_init(tp, ip, &pip);
>  	if (error) {
> 
> 



Thread overview: 7+ messages
2018-10-23  4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen
2018-10-23  4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen
2018-10-23  4:23   ` Darrick J. Wong
2018-10-23  4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen
2018-10-23  4:22   ` Darrick J. Wong
2018-10-23 13:57   ` [PATCH 2/2 V2] " Eric Sandeen
2018-10-23 15:30     ` Darrick J. Wong
