* [PATCH 0/2] xfs_repair: two fixes @ 2018-10-23 4:03 Eric Sandeen 2018-10-23 4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen 2018-10-23 4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen 0 siblings, 2 replies; 7+ messages in thread From: Eric Sandeen @ 2018-10-23 4:03 UTC (permalink / raw) To: linux-xfs Two fixes for bugs that came up in the past 2 days ... ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check 2018-10-23 4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen @ 2018-10-23 4:04 ` Eric Sandeen 2018-10-23 4:23 ` Darrick J. Wong 2018-10-23 4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen 1 sibling, 1 reply; 7+ messages in thread From: Eric Sandeen @ 2018-10-23 4:04 UTC (permalink / raw) To: linux-xfs If we need to realloc the bplist[] array holding buffers for a given directory, we don't initialize the new slots. This causes a problem if the directory has holes, because those slots never get filled in. At the end of the function we call libxfs_putbuf for every non-null slot, and any uninitialized slots are segfault landmines. Make sure we initialize all new slots to NULL for this reason. Reported-by: Oleg Davydov <burunduk3@gmail.com> Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- diff --git a/repair/phase6.c b/repair/phase6.c index b87c751..9d24a4f 100644 --- a/repair/phase6.c +++ b/repair/phase6.c @@ -2348,6 +2348,8 @@ longform_dir2_entry_check(xfs_mount_t *mp, db = xfs_dir2_da_to_db(mp->m_dir_geo, da_bno); if (db >= num_bps) { + int last_size = num_bps; + /* more data blocks than expected */ num_bps = db + 1; bplist = realloc(bplist, num_bps * sizeof(struct xfs_buf*)); @@ -2355,6 +2357,9 @@ longform_dir2_entry_check(xfs_mount_t *mp, do_error(_("realloc failed in %s (%zu bytes)\n"), __func__, num_bps * sizeof(struct xfs_buf*)); + /* Initialize the new elements */ + for (i = last_size; i < num_bps; i++) + bplist[i] = NULL; } if (isblock) ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check 2018-10-23 4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen @ 2018-10-23 4:23 ` Darrick J. Wong 0 siblings, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2018-10-23 4:23 UTC (permalink / raw) To: Eric Sandeen; +Cc: linux-xfs On Mon, Oct 22, 2018 at 11:04:31PM -0500, Eric Sandeen wrote: > If we need to realloc the bplist[] array holding buffers for a given > directory, we don't initialize the new slots. This causes a problem > if the directory has holes, because those slots never get filled in. > > At the end of the function we call libxfs_putbuf for every non-null > slot, and any uninitialized slots are segfault landmines. > > Make sure we initialize all new slots to NULL for this reason. > > Reported-by: Oleg Davydov <burunduk3@gmail.com> > Signed-off-by: Eric Sandeen <sandeen@redhat.com> Yay realloc :P Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > > diff --git a/repair/phase6.c b/repair/phase6.c > index b87c751..9d24a4f 100644 > --- a/repair/phase6.c > +++ b/repair/phase6.c > @@ -2348,6 +2348,8 @@ longform_dir2_entry_check(xfs_mount_t *mp, > > db = xfs_dir2_da_to_db(mp->m_dir_geo, da_bno); > if (db >= num_bps) { > + int last_size = num_bps; > + > /* more data blocks than expected */ > num_bps = db + 1; > bplist = realloc(bplist, num_bps * sizeof(struct xfs_buf*)); > @@ -2355,6 +2357,9 @@ longform_dir2_entry_check(xfs_mount_t *mp, > do_error(_("realloc failed in %s (%zu bytes)\n"), > __func__, > num_bps * sizeof(struct xfs_buf*)); > + /* Initialize the new elements */ > + for (i = last_size; i < num_bps; i++) > + bplist[i] = NULL; > } > > if (isblock) > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance 2018-10-23 4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen 2018-10-23 4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen @ 2018-10-23 4:08 ` Eric Sandeen 2018-10-23 4:22 ` Darrick J. Wong 2018-10-23 13:57 ` [PATCH 2/2 V2] " Eric Sandeen 1 sibling, 2 replies; 7+ messages in thread From: Eric Sandeen @ 2018-10-23 4:08 UTC (permalink / raw) To: linux-xfs After commit: 15a8bcc xfs: fix multi-AG deadlock in xfs_bunmapi xfs_bunmapi can legitimately return before all work is done. Sadly nobody told xfs_repair, so it fires an assert: phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. Fix this by calling back in until all work is done, as we do in the kernel. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116 Reported-by: Tomasz Torcz <tomek@pipebreaker.pl> Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- diff --git a/repair/phase6.c b/repair/phase6.c index e017326..b87c751 100644 --- a/repair/phase6.c +++ b/repair/phase6.c @@ -1317,7 +1317,7 @@ longform_dir2_rebuild( xfs_fileoff_t lastblock; xfs_inode_t pip; dir_hash_ent_t *p; - int done; + int done = 0; /* * trash directory completely and rebuild from scratch using the @@ -1352,12 +1352,25 @@ longform_dir2_rebuild( error); /* free all data, leaf, node and freespace blocks */ - error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0, - &done); - if (error) { - do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); - goto out_bmap_cancel; - } + while (!done) { + error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, + 0, &done); + if (error) { + do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); + goto out_bmap_cancel; + } + error = xfs_defer_finish(&tp); + if (error) { + do_warn(("defer_finish failed -- error - %d\n"), error); + goto out_bmap_cancel; + } + /* + * Close out trans and start the next one in the chain. + */ + error = xfs_trans_roll_inode(&tp, ip); + if (error) + goto out_bmap_cancel; + } ASSERT(done); ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance 2018-10-23 4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen @ 2018-10-23 4:22 ` Darrick J. Wong 2018-10-23 13:57 ` [PATCH 2/2 V2] " Eric Sandeen 1 sibling, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2018-10-23 4:22 UTC (permalink / raw) To: Eric Sandeen; +Cc: linux-xfs On Mon, Oct 22, 2018 at 11:08:49PM -0500, Eric Sandeen wrote: > After commit: > > 15a8bcc xfs: fix multi-AG deadlock in xfs_bunmapi > > xfs_bunmapi can legitimately return before all work is done. > Sadly nobody told xfs_repair, so it fires an assert: > > phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. > > Fix this by calling back in until all work is done, as we do > in the kernel. Looking at the rest of xfsprogs, I think the other directory-related xfs_bunmapi callers probably need to be able to roll-and-continue, but that seems like a topic for (a) the kernel and (b) separate patches. > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116 > Reported-by: Tomasz Torcz <tomek@pipebreaker.pl> > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > --- > > diff --git a/repair/phase6.c b/repair/phase6.c > index e017326..b87c751 100644 > --- a/repair/phase6.c > +++ b/repair/phase6.c > @@ -1317,7 +1317,7 @@ longform_dir2_rebuild( > xfs_fileoff_t lastblock; > xfs_inode_t pip; > dir_hash_ent_t *p; > - int done; > + int done = 0; > > /* > * trash directory completely and rebuild from scratch using the > @@ -1352,12 +1352,25 @@ longform_dir2_rebuild( > error); > > /* free all data, leaf, node and freespace blocks */ > - error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0, > - &done); > - if (error) { > - do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); > - goto out_bmap_cancel; > - } > + while (!done) { > + error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, > + 0, &done); > + if (error) { > + do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); > + goto out_bmap_cancel; > + } > + error = xfs_defer_finish(&tp); error = -libxfs_defer_finish(...); > + if (error) { > + do_warn(("defer_finish failed -- error - %d\n"), error); > + goto out_bmap_cancel; > + } > + /* > + * Close out trans and start the next one in the chain. > + */ > + error = xfs_trans_roll_inode(&tp, ip); error = -libxfs_trans_roll_inode(...); > + if (error) > + goto out_bmap_cancel; > + } > > ASSERT(done); This assert can go away since !done is the loop test condition. --D > > ^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 2/2 V2] xfs_repair: continue after xfs_bunmapi deadlock avoidance 2018-10-23 4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen 2018-10-23 4:22 ` Darrick J. Wong @ 2018-10-23 13:57 ` Eric Sandeen 2018-10-23 15:30 ` Darrick J. Wong 1 sibling, 1 reply; 7+ messages in thread From: Eric Sandeen @ 2018-10-23 13:57 UTC (permalink / raw) To: linux-xfs; +Cc: Tomasz Torcz xfs_bunmapi can legitimately return before all work is done, to avoid deadlocks across AGs. Sadly nobody told xfs_repair, so it fires an assert if this happens: phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. Fix this by calling back in until all work is done, as we do in the kernel. Fixes: 5a8bcc ("xfs: fix multi-AG deadlock in xfs_bunmapi") Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116 Reported-by: Tomasz Torcz <tomek@pipebreaker.pl> Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- V2: libxfs-ify the code, remove now-pointless assert, thanks Darrick! diff --git a/repair/phase6.c b/repair/phase6.c index e017326..bdbbbaa 100644 --- a/repair/phase6.c +++ b/repair/phase6.c @@ -1317,7 +1317,7 @@ longform_dir2_rebuild( xfs_fileoff_t lastblock; xfs_inode_t pip; dir_hash_ent_t *p; - int done; + int done = 0; /* * trash directory completely and rebuild from scratch using the @@ -1352,14 +1352,25 @@ longform_dir2_rebuild( error); /* free all data, leaf, node and freespace blocks */ - error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0, - &done); - if (error) { - do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); - goto out_bmap_cancel; - } - - ASSERT(done); + while (!done) { + error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, + 0, &done); + if (error) { + do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); + goto out_bmap_cancel; + } + error = -libxfs_defer_finish(&tp); + if (error) { + do_warn(("defer_finish failed -- error - %d\n"), error); + goto out_bmap_cancel; + } + /* + * Close out trans and start the next one in the chain. + */ + error = -libxfs_trans_roll_inode(&tp, ip); + if (error) + goto out_bmap_cancel; + } error = -libxfs_dir_init(tp, ip, &pip); if (error) { ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2 V2] xfs_repair: continue after xfs_bunmapi deadlock avoidance 2018-10-23 13:57 ` [PATCH 2/2 V2] " Eric Sandeen @ 2018-10-23 15:30 ` Darrick J. Wong 0 siblings, 0 replies; 7+ messages in thread From: Darrick J. Wong @ 2018-10-23 15:30 UTC (permalink / raw) To: Eric Sandeen; +Cc: linux-xfs, Tomasz Torcz On Tue, Oct 23, 2018 at 08:57:03AM -0500, Eric Sandeen wrote: > xfs_bunmapi can legitimately return before all work is done, to > avoid deadlocks across AGs. > > Sadly nobody told xfs_repair, so it fires an assert if this happens: > > phase6.c:1410: longform_dir2_rebuild: Assertion `done' failed. > > Fix this by calling back in until all work is done, as we do > in the kernel. > > Fixes: 5a8bcc ("xfs: fix multi-AG deadlock in xfs_bunmapi") > Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1641116 > Reported-by: Tomasz Torcz <tomek@pipebreaker.pl> > Signed-off-by: Eric Sandeen <sandeen@redhat.com> > --- > > V2: libxfs-ify the code, remove now-pointless assert, thanks Darrick! > > > diff --git a/repair/phase6.c b/repair/phase6.c > index e017326..bdbbbaa 100644 > --- a/repair/phase6.c > +++ b/repair/phase6.c > @@ -1317,7 +1317,7 @@ longform_dir2_rebuild( > xfs_fileoff_t lastblock; > xfs_inode_t pip; > dir_hash_ent_t *p; > - int done; > + int done = 0; > > /* > * trash directory completely and rebuild from scratch using the > @@ -1352,14 +1352,25 @@ longform_dir2_rebuild( > error); > > /* free all data, leaf, node and freespace blocks */ > - error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, 0, > - &done); > - if (error) { > - do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); > - goto out_bmap_cancel; > - } > - > - ASSERT(done); > + while (!done) { > + error = -libxfs_bunmapi(tp, ip, 0, lastblock, XFS_BMAPI_METADATA, > + 0, &done); > + if (error) { > + do_warn(_("xfs_bunmapi failed -- error - %d\n"), error); > + goto out_bmap_cancel; > + } > + error = -libxfs_defer_finish(&tp); > + if (error) { > + do_warn(("defer_finish failed -- error - %d\n"), error); > + goto out_bmap_cancel; > + } > + /* > + * Close out trans and start the next one in the chain. > + */ > + error = -libxfs_trans_roll_inode(&tp, ip); > + if (error) No do_warn() here? :) With that, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > + goto out_bmap_cancel; > + } > > error = -libxfs_dir_init(tp, ip, &pip); > if (error) { > > ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2018-10-23 23:54 UTC | newest] Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-23 4:03 [PATCH 0/2] xfs_repair: two fixes Eric Sandeen 2018-10-23 4:04 ` [PATCH 1/2] xfs_repair: initialize realloced bplist in longform_dir2_entry_check Eric Sandeen 2018-10-23 4:23 ` Darrick J. Wong 2018-10-23 4:08 ` [PATCH 2/2] xfs_repair: continue after xfs_bunmapi deadlock avoidance Eric Sandeen 2018-10-23 4:22 ` Darrick J. Wong 2018-10-23 13:57 ` [PATCH 2/2 V2] " Eric Sandeen 2018-10-23 15:30 ` Darrick J. Wong
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.