* [PATCH v5 0/5] xfs: support shrinking free space in the last AG
From: Gao Xiang @ 2021-01-18 8:36 UTC
To: linux-xfs
Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang

Hi folks,

v4: https://lore.kernel.org/r/20210111132243.1180013-1-hsiangkao@redhat.com

This patchset attempts to support shrinking free space in the last AG.
This version mainly updates the per-AG reservation failure case mentioned
by Darrick, and also adds an error injection point to observe that path.

If I'm still missing something (e.g. I'm not sure the log reservation
calculation is right, given the additional free-extent dfop) or something
goes wrong, please kindly point it out.

xfsprogs: https://lore.kernel.org/r/20201028114010.545331-1-hsiangkao@redhat.com
xfstests: https://lore.kernel.org/r/20201028230909.639698-1-hsiangkao@redhat.com

Changes since v4:
 - [3/5] update a missing typedef case and move the comment to the top of
         the whole function (Christoph);
 - [4/5] put on-stack structs at the top of the declaration list;
         handle the per-AG reservation failure case;
         add an agf->agf_length vs. agi->agi_length sanity check;
         leave a comment in the error handling path above
         xfs_trans_commit() (Darrick);
 - [5/5] add an error injection path to observe the per-AG reservation
         failure path (Darrick).

Thanks,
Gao Xiang

Gao Xiang (5):
  xfs: rename `new' to `delta' in xfs_growfs_data_private()
  xfs: get rid of xfs_growfs_{data,log}_t
  xfs: hoist out xfs_resizefs_init_new_ags()
  xfs: support shrinking unused space in the last AG
  xfs: add error injection for per-AG resv failure when shrinkfs

 fs/xfs/libxfs/xfs_ag.c       |  93 +++++++++++++++++++
 fs/xfs/libxfs/xfs_ag.h       |   2 +
 fs/xfs/libxfs/xfs_errortag.h |   2 +
 fs/xfs/xfs_error.c           |   2 +
 fs/xfs/xfs_fsops.c           | 167 ++++++++++++++++++++++-------------
 fs/xfs/xfs_fsops.h           |   4 +-
 fs/xfs/xfs_ioctl.c           |   4 +-
 fs/xfs/xfs_trans.c           |   1 -
 8 files changed, 211 insertions(+), 64 deletions(-)

-- 
2.27.0

* [PATCH v5 1/5] xfs: rename `new' to `delta' in xfs_growfs_data_private()
From: Gao Xiang @ 2021-01-18 8:36 UTC
To: linux-xfs
Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang

It actually means the delta block count of growfs. Rename it in order
to make it clear. Also introduce nb_div to avoid reusing `delta`.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
 fs/xfs/xfs_fsops.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 5870db855e8b..6ad31e6b4a04 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -32,8 +32,8 @@ xfs_growfs_data_private(
 	int			error;
 	xfs_agnumber_t		nagcount;
 	xfs_agnumber_t		nagimax = 0;
-	xfs_rfsblock_t		nb, nb_mod;
-	xfs_rfsblock_t		new;
+	xfs_rfsblock_t		nb, nb_div, nb_mod;
+	xfs_rfsblock_t		delta;
 	xfs_agnumber_t		oagcount;
 	xfs_trans_t		*tp;
 	struct aghdr_init_data	id = {};
@@ -50,16 +50,16 @@ xfs_growfs_data_private(
 		return error;
 	xfs_buf_relse(bp);
 
-	new = nb;	/* use new as a temporary here */
-	nb_mod = do_div(new, mp->m_sb.sb_agblocks);
-	nagcount = new + (nb_mod != 0);
+	nb_div = nb;
+	nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks);
+	nagcount = nb_div + (nb_mod != 0);
 	if (nb_mod && nb_mod < XFS_MIN_AG_BLOCKS) {
 		nagcount--;
 		nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks;
 		if (nb < mp->m_sb.sb_dblocks)
 			return -EINVAL;
 	}
-	new = nb - mp->m_sb.sb_dblocks;
+	delta = nb - mp->m_sb.sb_dblocks;
 	oagcount = mp->m_sb.sb_agcount;
 
 	/* allocate the new per-ag structures */
@@ -89,7 +89,7 @@ xfs_growfs_data_private(
 	INIT_LIST_HEAD(&id.buffer_list);
 	for (id.agno = nagcount - 1;
 	     id.agno >= oagcount;
-	     id.agno--, new -= id.agsize) {
+	     id.agno--, delta -= id.agsize) {
 
 		if (id.agno == nagcount - 1)
 			id.agsize = nb -
@@ -110,8 +110,8 @@ xfs_growfs_data_private(
 	xfs_trans_agblocks_delta(tp, id.nfree);
 
 	/* If there are new blocks in the old last AG, extend it. */
-	if (new) {
-		error = xfs_ag_extend_space(mp, tp, &id, new);
+	if (delta) {
+		error = xfs_ag_extend_space(mp, tp, &id, delta);
 		if (error)
 			goto out_trans_cancel;
 	}
@@ -143,7 +143,7 @@ xfs_growfs_data_private(
 	 * If we expanded the last AG, free the per-AG reservation
 	 * so we can reinitialize it with the new size.
 	 */
-	if (new) {
+	if (delta) {
 		struct xfs_perag	*pag;
 
 		pag = xfs_perag_get(mp, id.agno);
-- 
2.27.0

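[Editor's note] The rounding that patch 1/5 renames variables around can be sketched in userspace as below. This is only an illustration of the arithmetic visible in the hunks, not the kernel code: `struct sketch_geom`, `sketch_round_growfs()` and `SKETCH_MIN_AG_BLOCKS` are hypothetical names, the minimum-AG constant is an assumed value, and plain `/` and `%` stand in for do_div().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed stand-in for XFS_MIN_AG_BLOCKS; the real value lives in the kernel. */
#define SKETCH_MIN_AG_BLOCKS 64

/* Hypothetical, minimal view of the superblock fields used by the rounding. */
struct sketch_geom {
	uint64_t dblocks;	/* current data block count (sb_dblocks) */
	uint32_t agblocks;	/* blocks in a full AG (sb_agblocks) */
};

/*
 * Round the requested size *nb to a usable AG layout: a trailing partial
 * AG smaller than the minimum is dropped, which may lower *nb.
 * On success returns 0 and fills *nagcount and *delta (blocks to add).
 */
static int sketch_round_growfs(const struct sketch_geom *g, uint64_t *nb,
			       uint32_t *nagcount, uint64_t *delta)
{
	uint64_t nb_div, nb_mod;

	assert(g->agblocks != 0);
	if (*nb < g->dblocks)
		return -EINVAL;		/* this version only grows */

	nb_div = *nb / g->agblocks;	/* number of full AGs */
	nb_mod = *nb % g->agblocks;	/* blocks in a trailing partial AG */
	*nagcount = (uint32_t)(nb_div + (nb_mod != 0));
	if (nb_mod && nb_mod < SKETCH_MIN_AG_BLOCKS) {
		/* partial AG too small to be usable: drop it */
		(*nagcount)--;
		*nb = (uint64_t)*nagcount * g->agblocks;
		if (*nb < g->dblocks)
			return -EINVAL;
	}
	*delta = *nb - g->dblocks;
	return 0;
}
```

Keeping the quotient in its own variable (nb_div in the patch) is what lets `delta` carry exactly one meaning, the block count difference, for the rest of the function.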
* [PATCH v5 2/5] xfs: get rid of xfs_growfs_{data,log}_t
From: Gao Xiang @ 2021-01-18 8:36 UTC
To: linux-xfs
Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang, Eric Sandeen

Such usage isn't encouraged by the kernel coding style. Leave the
definitions alone in case of userspace users.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
 fs/xfs/xfs_fsops.c | 12 ++++++------
 fs/xfs/xfs_fsops.h |  4 ++--
 fs/xfs/xfs_ioctl.c |  4 ++--
 3 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 6ad31e6b4a04..0bc9c5ebd199 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -25,8 +25,8 @@
  */
 static int
 xfs_growfs_data_private(
-	xfs_mount_t		*mp,		/* mount point for filesystem */
-	xfs_growfs_data_t	*in)		/* growfs data input struct */
+	struct xfs_mount	*mp,		/* mount point for filesystem */
+	struct xfs_growfs_data	*in)		/* growfs data input struct */
 {
 	struct xfs_buf		*bp;
 	int			error;
@@ -35,7 +35,7 @@ xfs_growfs_data_private(
 	xfs_rfsblock_t		nb, nb_div, nb_mod;
 	xfs_rfsblock_t		delta;
 	xfs_agnumber_t		oagcount;
-	xfs_trans_t		*tp;
+	struct xfs_trans	*tp;
 	struct aghdr_init_data	id = {};
 
 	nb = in->newblocks;
@@ -170,8 +170,8 @@ xfs_growfs_data_private(
 
 static int
 xfs_growfs_log_private(
-	xfs_mount_t		*mp,	/* mount point for filesystem */
-	xfs_growfs_log_t	*in)	/* growfs log input struct */
+	struct xfs_mount	*mp,	/* mount point for filesystem */
+	struct xfs_growfs_log	*in)	/* growfs log input struct */
 {
 	xfs_extlen_t		nb;
 
@@ -268,7 +268,7 @@ xfs_growfs_data(
 int
 xfs_growfs_log(
 	xfs_mount_t		*mp,
-	xfs_growfs_log_t	*in)
+	struct xfs_growfs_log	*in)
 {
 	int error;
 
diff --git a/fs/xfs/xfs_fsops.h b/fs/xfs/xfs_fsops.h
index 92869f6ec8d3..2cffe51a31e8 100644
--- a/fs/xfs/xfs_fsops.h
+++ b/fs/xfs/xfs_fsops.h
@@ -6,8 +6,8 @@
 #ifndef __XFS_FSOPS_H__
 #define	__XFS_FSOPS_H__
 
-extern int xfs_growfs_data(xfs_mount_t *mp, xfs_growfs_data_t *in);
-extern int xfs_growfs_log(xfs_mount_t *mp, xfs_growfs_log_t *in);
+extern int xfs_growfs_data(struct xfs_mount *mp, struct xfs_growfs_data *in);
+extern int xfs_growfs_log(struct xfs_mount *mp, struct xfs_growfs_log *in);
 extern void xfs_fs_counts(xfs_mount_t *mp, xfs_fsop_counts_t *cnt);
 extern int xfs_reserve_blocks(xfs_mount_t *mp, uint64_t *inval,
 				xfs_fsop_resblks_t *outval);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 3fbd98f61ea5..a62520f49ec5 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -2260,7 +2260,7 @@ xfs_file_ioctl(
 	}
 
 	case XFS_IOC_FSGROWFSDATA: {
-		xfs_growfs_data_t in;
+		struct xfs_growfs_data in;
 
 		if (copy_from_user(&in, arg, sizeof(in)))
 			return -EFAULT;
@@ -2274,7 +2274,7 @@ xfs_file_ioctl(
 	}
 
 	case XFS_IOC_FSGROWFSLOG: {
-		xfs_growfs_log_t in;
+		struct xfs_growfs_log in;
 
 		if (copy_from_user(&in, arg, sizeof(in)))
 			return -EFAULT;
-- 
2.27.0

* [PATCH v5 3/5] xfs: hoist out xfs_resizefs_init_new_ags()
From: Gao Xiang @ 2021-01-18 8:36 UTC
To: linux-xfs
Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang

Move the logic for initializing newly added AGs out into a new helper
in preparation for shrinking. No logic changes.

Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
---
 fs/xfs/xfs_fsops.c | 74 +++++++++++++++++++++++++++-------------------
 1 file changed, 44 insertions(+), 30 deletions(-)

diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 0bc9c5ebd199..db6ed354c465 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -20,6 +20,49 @@
 #include "xfs_ag.h"
 #include "xfs_ag_resv.h"
 
+/*
+ * Write new AG headers to disk. Non-transactional, but need to be
+ * written and completed prior to the growfs transaction being logged.
+ * To do this, we use a delayed write buffer list and wait for
+ * submission and IO completion of the list as a whole. This allows the
+ * IO subsystem to merge all the AG headers in a single AG into a single
+ * IO and hide most of the latency of the IO from us.
+ *
+ * This also means that if we get an error whilst building the buffer
+ * list to write, we can cancel the entire list without having written
+ * anything.
+ */
+static int
+xfs_resizefs_init_new_ags(
+	struct xfs_mount	*mp,
+	struct aghdr_init_data	*id,
+	xfs_agnumber_t		oagcount,
+	xfs_agnumber_t		nagcount,
+	xfs_rfsblock_t		*delta)
+{
+	xfs_rfsblock_t		nb = mp->m_sb.sb_dblocks + *delta;
+	int			error;
+
+	INIT_LIST_HEAD(&id->buffer_list);
+	for (id->agno = nagcount - 1;
+	     id->agno >= oagcount;
+	     id->agno--, *delta -= id->agsize) {
+
+		if (id->agno == nagcount - 1)
+			id->agsize = nb - (id->agno *
+					(xfs_rfsblock_t)mp->m_sb.sb_agblocks);
+		else
+			id->agsize = mp->m_sb.sb_agblocks;
+
+		error = xfs_ag_init_headers(mp, id);
+		if (error) {
+			xfs_buf_delwri_cancel(&id->buffer_list);
+			return error;
+		}
+	}
+	return xfs_buf_delwri_submit(&id->buffer_list);
+}
+
 /*
  * growfs operations
  */
@@ -74,36 +117,7 @@ xfs_growfs_data_private(
 	if (error)
 		return error;
 
-	/*
-	 * Write new AG headers to disk. Non-transactional, but need to be
-	 * written and completed prior to the growfs transaction being logged.
-	 * To do this, we use a delayed write buffer list and wait for
-	 * submission and IO completion of the list as a whole. This allows the
-	 * IO subsystem to merge all the AG headers in a single AG into a single
-	 * IO and hide most of the latency of the IO from us.
-	 *
-	 * This also means that if we get an error whilst building the buffer
-	 * list to write, we can cancel the entire list without having written
-	 * anything.
-	 */
-	INIT_LIST_HEAD(&id.buffer_list);
-	for (id.agno = nagcount - 1;
-	     id.agno >= oagcount;
-	     id.agno--, delta -= id.agsize) {
-
-		if (id.agno == nagcount - 1)
-			id.agsize = nb -
-				(id.agno * (xfs_rfsblock_t)mp->m_sb.sb_agblocks);
-		else
-			id.agsize = mp->m_sb.sb_agblocks;
-
-		error = xfs_ag_init_headers(mp, &id);
-		if (error) {
-			xfs_buf_delwri_cancel(&id.buffer_list);
-			goto out_trans_cancel;
-		}
-	}
-	error = xfs_buf_delwri_submit(&id.buffer_list);
+	error = xfs_resizefs_init_new_ags(mp, &id, oagcount, nagcount, &delta);
 	if (error)
 		goto out_trans_cancel;
 
-- 
2.27.0

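[Editor's note] The loop hoisted into xfs_resizefs_init_new_ags() also does the bookkeeping that decides how much of `delta` is left for the old last AG. A minimal userspace sketch of just that arithmetic, assuming nothing about header writes or buffer lists, could look like this; `sketch_new_ag_sizes()` and its `agsize_out` array are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Walk the new AGs from the highest-numbered one down to oagcount,
 * record each AG's size in agsize_out[agno - oagcount], and subtract it
 * from delta. Only the last new AG may be partial; all others are full.
 * The return value is what remains of delta, i.e. the number of blocks
 * by which the old last AG has to be extended.
 */
static uint64_t sketch_new_ag_sizes(uint64_t dblocks, uint32_t agblocks,
				    uint32_t oagcount, uint32_t nagcount,
				    uint64_t delta, uint64_t *agsize_out)
{
	uint64_t nb = dblocks + delta;	/* new total block count */
	uint32_t agno;

	/* oagcount > 0 keeps the unsigned countdown from wrapping */
	assert(oagcount > 0 && nagcount >= oagcount);

	for (agno = nagcount - 1; agno >= oagcount; agno--) {
		uint64_t agsize;

		if (agno == nagcount - 1)
			agsize = nb - (uint64_t)agno * agblocks;
		else
			agsize = agblocks;

		agsize_out[agno - oagcount] = agsize;
		delta -= agsize;
	}
	return delta;
}
```

For example, growing a 1000-block filesystem with 400-block AGs (three AGs, the last one holding 200 blocks) to 1700 blocks creates AG 3 with 400 blocks and AG 4 with 100 blocks, and leaves 200 blocks to extend the old last AG.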
* [PATCH v5 4/5] xfs: support shrinking unused space in the last AG 2021-01-18 8:36 [PATCH v5 0/5] xfs: support shrinking free space in the last AG Gao Xiang ` (2 preceding siblings ...) 2021-01-18 8:36 ` [PATCH v5 3/5] xfs: hoist out xfs_resizefs_init_new_ags() Gao Xiang @ 2021-01-18 8:36 ` Gao Xiang 2021-01-20 19:25 ` Darrick J. Wong 2021-01-18 8:37 ` [PATCH v5 5/5] xfs: add error injection for per-AG resv failure when shrinkfs Gao Xiang 4 siblings, 1 reply; 11+ messages in thread From: Gao Xiang @ 2021-01-18 8:36 UTC (permalink / raw) To: linux-xfs Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang As the first step of shrinking, this attempts to enable shrinking unused space in the last allocation group by fixing up freespace btree, agi, agf and adjusting super block and introduce a helper xfs_ag_shrink_space() to fixup the last AG. This can be all done in one transaction for now, so I think no additional protection is needed. Signed-off-by: Gao Xiang <hsiangkao@redhat.com> --- fs/xfs/libxfs/xfs_ag.c | 88 ++++++++++++++++++++++++++++++++++++++++++ fs/xfs/libxfs/xfs_ag.h | 2 + fs/xfs/xfs_fsops.c | 77 ++++++++++++++++++++++++++---------- fs/xfs/xfs_trans.c | 1 - 4 files changed, 146 insertions(+), 22 deletions(-) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index 9331f3516afa..04a7c9b20470 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -22,6 +22,8 @@ #include "xfs_ag.h" #include "xfs_ag_resv.h" #include "xfs_health.h" +#include "xfs_error.h" +#include "xfs_bmap.h" static int xfs_get_aghdr_buf( @@ -485,6 +487,92 @@ xfs_ag_init_headers( return error; } +int +xfs_ag_shrink_space( + struct xfs_mount *mp, + struct xfs_trans *tp, + struct aghdr_init_data *id, + xfs_extlen_t len) +{ + struct xfs_alloc_arg args = { + .tp = tp, + .mp = mp, + .type = XFS_ALLOCTYPE_THIS_BNO, + .minlen = len, + .maxlen = len, + .oinfo = XFS_RMAP_OINFO_SKIP_UPDATE, + .resv = XFS_AG_RESV_NONE, + .prod = 1 + }; + 
struct xfs_buf *agibp, *agfbp; + struct xfs_agi *agi; + struct xfs_agf *agf; + int error, err2; + + ASSERT(id->agno == mp->m_sb.sb_agcount - 1); + error = xfs_ialloc_read_agi(mp, tp, id->agno, &agibp); + if (error) + return error; + + agi = agibp->b_addr; + + error = xfs_alloc_read_agf(mp, tp, id->agno, 0, &agfbp); + if (error) + return error; + + agf = agfbp->b_addr; + if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length)) + return -EFSCORRUPTED; + + args.fsbno = XFS_AGB_TO_FSB(mp, id->agno, + be32_to_cpu(agi->agi_length) - len); + + /* remove the preallocations before allocation and re-establish then */ + error = xfs_ag_resv_free(agibp->b_pag); + if (error) + return error; + + /* internal log shouldn't also show up in the free space btrees */ + error = xfs_alloc_vextent(&args); + if (!error && args.agbno == NULLAGBLOCK) + error = -ENOSPC; + + if (error) { + err2 = xfs_ag_resv_init(agibp->b_pag, tp); + if (err2) + goto resv_err; + return error; + } + + /* + * if successfully deleted from freespace btrees, need to confirm + * per-AG reservation works as expected. 
+ */ + be32_add_cpu(&agi->agi_length, -len); + be32_add_cpu(&agf->agf_length, -len); + + err2 = xfs_ag_resv_init(agibp->b_pag, tp); + if (err2) { + be32_add_cpu(&agi->agi_length, len); + be32_add_cpu(&agf->agf_length, len); + if (err2 != -ENOSPC) + goto resv_err; + + __xfs_bmap_add_free(tp, args.fsbno, len, + &XFS_RMAP_OINFO_SKIP_UPDATE, true); + return err2; + } + xfs_ialloc_log_agi(tp, agibp, XFS_AGI_LENGTH); + xfs_alloc_log_agf(tp, agfbp, XFS_AGF_LENGTH); + return 0; + +resv_err: + xfs_warn(mp, +"Error %d reserving per-AG metadata reserve pool.", err2); + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); + return err2; +} + /* * Extent the AG indicated by the @id by the length passed in */ diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h index 5166322807e7..f3b5bbfeadce 100644 --- a/fs/xfs/libxfs/xfs_ag.h +++ b/fs/xfs/libxfs/xfs_ag.h @@ -24,6 +24,8 @@ struct aghdr_init_data { }; int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id); +int xfs_ag_shrink_space(struct xfs_mount *mp, struct xfs_trans *tp, + struct aghdr_init_data *id, xfs_extlen_t len); int xfs_ag_extend_space(struct xfs_mount *mp, struct xfs_trans *tp, struct aghdr_init_data *id, xfs_extlen_t len); int xfs_ag_get_geometry(struct xfs_mount *mp, xfs_agnumber_t agno, diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c index db6ed354c465..2ae4f33b42c9 100644 --- a/fs/xfs/xfs_fsops.c +++ b/fs/xfs/xfs_fsops.c @@ -38,7 +38,7 @@ xfs_resizefs_init_new_ags( struct aghdr_init_data *id, xfs_agnumber_t oagcount, xfs_agnumber_t nagcount, - xfs_rfsblock_t *delta) + int64_t *delta) { xfs_rfsblock_t nb = mp->m_sb.sb_dblocks + *delta; int error; @@ -76,33 +76,41 @@ xfs_growfs_data_private( xfs_agnumber_t nagcount; xfs_agnumber_t nagimax = 0; xfs_rfsblock_t nb, nb_div, nb_mod; - xfs_rfsblock_t delta; + int64_t delta; xfs_agnumber_t oagcount; struct xfs_trans *tp; + bool extend; struct aghdr_init_data id = {}; nb = in->newblocks; - if (nb < mp->m_sb.sb_dblocks) - return -EINVAL; - if 
((error = xfs_sb_validate_fsb_count(&mp->m_sb, nb))) + if (nb == mp->m_sb.sb_dblocks) + return 0; + + error = xfs_sb_validate_fsb_count(&mp->m_sb, nb); + if (error) return error; - error = xfs_buf_read_uncached(mp->m_ddev_targp, + + if (nb > mp->m_sb.sb_dblocks) { + error = xfs_buf_read_uncached(mp->m_ddev_targp, XFS_FSB_TO_BB(mp, nb) - XFS_FSS_TO_BB(mp, 1), XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL); - if (error) - return error; - xfs_buf_relse(bp); + if (error) + return error; + xfs_buf_relse(bp); + } nb_div = nb; nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks); nagcount = nb_div + (nb_mod != 0); if (nb_mod && nb_mod < XFS_MIN_AG_BLOCKS) { nagcount--; - nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; - if (nb < mp->m_sb.sb_dblocks) + if (nagcount < 2) return -EINVAL; + nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; } + delta = nb - mp->m_sb.sb_dblocks; + extend = (delta > 0); oagcount = mp->m_sb.sb_agcount; /* allocate the new per-ag structures */ @@ -110,22 +118,34 @@ xfs_growfs_data_private( error = xfs_initialize_perag(mp, nagcount, &nagimax); if (error) return error; + } else if (nagcount != oagcount) { + /* TODO: shrinking the entire AGs hasn't yet completed */ + return -EINVAL; } error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, - XFS_GROWFS_SPACE_RES(mp), 0, XFS_TRANS_RESERVE, &tp); + (extend ? XFS_GROWFS_SPACE_RES(mp) : -delta), 0, + XFS_TRANS_RESERVE, &tp); if (error) return error; - error = xfs_resizefs_init_new_ags(mp, &id, oagcount, nagcount, &delta); - if (error) - goto out_trans_cancel; - + if (extend) { + error = xfs_resizefs_init_new_ags(mp, &id, oagcount, + nagcount, &delta); + if (error) + goto out_trans_cancel; + } xfs_trans_agblocks_delta(tp, id.nfree); - /* If there are new blocks in the old last AG, extend it. */ + /* If there are some blocks in the last AG, resize it. 
*/ if (delta) { - error = xfs_ag_extend_space(mp, tp, &id, delta); + if (extend) { + error = xfs_ag_extend_space(mp, tp, &id, delta); + } else { + id.agno = nagcount - 1; + error = xfs_ag_shrink_space(mp, tp, &id, -delta); + } + if (error) goto out_trans_cancel; } @@ -137,11 +157,19 @@ xfs_growfs_data_private( */ if (nagcount > oagcount) xfs_trans_mod_sb(tp, XFS_TRANS_SB_AGCOUNT, nagcount - oagcount); - if (nb > mp->m_sb.sb_dblocks) + if (nb != mp->m_sb.sb_dblocks) xfs_trans_mod_sb(tp, XFS_TRANS_SB_DBLOCKS, nb - mp->m_sb.sb_dblocks); if (id.nfree) xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, id.nfree); + + /* + * update in-core counters (especially sb_fdblocks) now + * so xfs_validate_sb_write() can pass. + */ + if (xfs_sb_version_haslazysbcount(&mp->m_sb)) + xfs_log_sb(tp); + xfs_trans_set_sync(tp); error = xfs_trans_commit(tp); if (error) @@ -157,7 +185,7 @@ xfs_growfs_data_private( * If we expanded the last AG, free the per-AG reservation * so we can reinitialize it with the new size. */ - if (delta) { + if (delta > 0) { struct xfs_perag *pag; pag = xfs_perag_get(mp, id.agno); @@ -178,7 +206,14 @@ xfs_growfs_data_private( return error; out_trans_cancel: - xfs_trans_cancel(tp); + /* + * AGFL fixup can dirty the transaction, so it needs committing anyway. + */ + if (!extend && ((tp->t_flags & XFS_TRANS_DIRTY) || + !list_empty(&tp->t_dfops))) + xfs_trans_commit(tp); + else + xfs_trans_cancel(tp); return error; } diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c index e72730f85af1..fd2cbf414b80 100644 --- a/fs/xfs/xfs_trans.c +++ b/fs/xfs/xfs_trans.c @@ -419,7 +419,6 @@ xfs_trans_mod_sb( tp->t_res_frextents_delta += delta; break; case XFS_TRANS_SB_DBLOCKS: - ASSERT(delta > 0); tp->t_dblocks_delta += delta; break; case XFS_TRANS_SB_AGCOUNT: -- 2.27.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
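[Editor's note] The control flow of xfs_ag_shrink_space() in patch 4/5 is "take the tail extent, shrink the recorded AG length, check that the per-AG reservation still fits, and undo everything if it does not". The toy model below illustrates only that apply-verify-undo shape. It is an assumption-laden sketch: `struct sketch_ag`, `sketch_resv_init()` and `sketch_ag_shrink()` are hypothetical, the reservation check is reduced to a simple free-block comparison, and transactions, the AGFL and deferred frees are not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical in-memory model of one AG; not a kernel structure. */
struct sketch_ag {
	uint32_t length;	/* AG size in blocks (agf_length == agi_length) */
	uint32_t free_tail;	/* free blocks at the very end of the AG */
	uint32_t free_total;	/* all free blocks in the AG */
	uint32_t resv_needed;	/* blocks the per-AG reservation must find */
};

/* Assumed stand-in for the reservation check: enough free blocks left? */
static int sketch_resv_init(const struct sketch_ag *ag)
{
	return ag->free_total >= ag->resv_needed ? 0 : -ENOSPC;
}

/* Shrink the AG by len blocks, or leave it untouched and return an error. */
static int sketch_ag_shrink(struct sketch_ag *ag, uint32_t len)
{
	int err;

	assert(ag->free_tail <= ag->free_total);

	/* the exact tail extent must be free, like the THIS_BNO allocation */
	if (ag->free_tail < len)
		return -ENOSPC;

	/* tentatively remove the extent and shrink the recorded length */
	ag->free_tail -= len;
	ag->free_total -= len;
	ag->length -= len;

	err = sketch_resv_init(ag);
	if (err) {
		/* undo: restore the length and give the extent back */
		ag->length += len;
		ag->free_tail += len;
		ag->free_total += len;
		return err;
	}
	return 0;
}
```

In the patch the "give the extent back" step is a deferred free queued on the transaction, which is why the error path can still leave the transaction dirty.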
* Re: [PATCH v5 4/5] xfs: support shrinking unused space in the last AG 2021-01-18 8:36 ` [PATCH v5 4/5] xfs: support shrinking unused space in the last AG Gao Xiang @ 2021-01-20 19:25 ` Darrick J. Wong 2021-01-20 20:22 ` Gao Xiang 2021-01-21 1:51 ` Gao Xiang 0 siblings, 2 replies; 11+ messages in thread From: Darrick J. Wong @ 2021-01-20 19:25 UTC (permalink / raw) To: Gao Xiang Cc: linux-xfs, Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig On Mon, Jan 18, 2021 at 04:36:59PM +0800, Gao Xiang wrote: > As the first step of shrinking, this attempts to enable shrinking > unused space in the last allocation group by fixing up freespace > btree, agi, agf and adjusting super block and introduce a helper > xfs_ag_shrink_space() to fixup the last AG. > > This can be all done in one transaction for now, so I think no > additional protection is needed. > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com> > --- > fs/xfs/libxfs/xfs_ag.c | 88 ++++++++++++++++++++++++++++++++++++++++++ > fs/xfs/libxfs/xfs_ag.h | 2 + > fs/xfs/xfs_fsops.c | 77 ++++++++++++++++++++++++++---------- > fs/xfs/xfs_trans.c | 1 - > 4 files changed, 146 insertions(+), 22 deletions(-) > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > index 9331f3516afa..04a7c9b20470 100644 > --- a/fs/xfs/libxfs/xfs_ag.c > +++ b/fs/xfs/libxfs/xfs_ag.c > @@ -22,6 +22,8 @@ > #include "xfs_ag.h" > #include "xfs_ag_resv.h" > #include "xfs_health.h" > +#include "xfs_error.h" > +#include "xfs_bmap.h" > > static int > xfs_get_aghdr_buf( > @@ -485,6 +487,92 @@ xfs_ag_init_headers( > return error; > } > > +int > +xfs_ag_shrink_space( > + struct xfs_mount *mp, > + struct xfs_trans *tp, > + struct aghdr_init_data *id, > + xfs_extlen_t len) > +{ > + struct xfs_alloc_arg args = { > + .tp = tp, > + .mp = mp, > + .type = XFS_ALLOCTYPE_THIS_BNO, > + .minlen = len, > + .maxlen = len, > + .oinfo = XFS_RMAP_OINFO_SKIP_UPDATE, > + .resv = XFS_AG_RESV_NONE, > + .prod = 1 > + }; > + struct xfs_buf 
*agibp, *agfbp; > + struct xfs_agi *agi; > + struct xfs_agf *agf; > + int error, err2; > + > + ASSERT(id->agno == mp->m_sb.sb_agcount - 1); > + error = xfs_ialloc_read_agi(mp, tp, id->agno, &agibp); > + if (error) > + return error; > + > + agi = agibp->b_addr; > + > + error = xfs_alloc_read_agf(mp, tp, id->agno, 0, &agfbp); > + if (error) > + return error; > + > + agf = agfbp->b_addr; > + if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length)) > + return -EFSCORRUPTED; > + > + args.fsbno = XFS_AGB_TO_FSB(mp, id->agno, > + be32_to_cpu(agi->agi_length) - len); > + > + /* remove the preallocations before allocation and re-establish then */ > + error = xfs_ag_resv_free(agibp->b_pag); > + if (error) > + return error; > + > + /* internal log shouldn't also show up in the free space btrees */ > + error = xfs_alloc_vextent(&args); > + if (!error && args.agbno == NULLAGBLOCK) > + error = -ENOSPC; > + > + if (error) { Aha, now I see why this bit: if (!extend && ((tp->t_flags & XFS_TRANS_DIRTY) || !list_empty(&tp->t_dfops))) xfs_trans_commit(tp); is needed below -- we could have refilled the AGFL here but failed the allocation. At this point we have a dirty transaction /and/ an error code. We need to commit the AGFL refill changes and return to userspace with that error code, but calling xfs_trans_cancel on the dirty transaction causes an (unnecessary) shutdown. What if you rolled the transaction here and passed the new tp and the error code back to the caller? The new transaction is clean so it will cancel without any side effects, and then you can send the ENOSPC up to userspace. Granted, you could just as easily commit the transaction here and make the caller smart enough to know that it no longer has a transaction. I wonder if the transaction allocation and disposal ought to be part of the _ag_grow_space and _ag_shrink_space functions. Also fwiw I would make sure the transaction is clean before I tried to re-initialize the per-ag reservation. 
> + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > + if (err2) > + goto resv_err; > + return error; > + } > + > + /* > + * if successfully deleted from freespace btrees, need to confirm > + * per-AG reservation works as expected. > + */ > + be32_add_cpu(&agi->agi_length, -len); > + be32_add_cpu(&agf->agf_length, -len); > + > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > + if (err2) { > + be32_add_cpu(&agi->agi_length, len); > + be32_add_cpu(&agf->agf_length, len); > + if (err2 != -ENOSPC) > + goto resv_err; If we've just undone reducing ag[if]_length, don't we need to call xfs_ag_resv_init here to (try to) recreate the former per-ag reservations? Also, the comment above about cleaning the transaction before trying to reinit the per-ag reservation and returning ENOSPC applies here. > + > + __xfs_bmap_add_free(tp, args.fsbno, len, > + &XFS_RMAP_OINFO_SKIP_UPDATE, true); > + return err2; > + } > + xfs_ialloc_log_agi(tp, agibp, XFS_AGI_LENGTH); > + xfs_alloc_log_agf(tp, agfbp, XFS_AGF_LENGTH); > + return 0; > + > +resv_err: > + xfs_warn(mp, > +"Error %d reserving per-AG metadata reserve pool.", err2); > + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); > + return err2; > +} > + > /* > * Extent the AG indicated by the @id by the length passed in > */ > diff --git a/fs/xfs/libxfs/xfs_ag.h b/fs/xfs/libxfs/xfs_ag.h > index 5166322807e7..f3b5bbfeadce 100644 > --- a/fs/xfs/libxfs/xfs_ag.h > +++ b/fs/xfs/libxfs/xfs_ag.h > @@ -24,6 +24,8 @@ struct aghdr_init_data { > }; > > int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id); > +int xfs_ag_shrink_space(struct xfs_mount *mp, struct xfs_trans *tp, > + struct aghdr_init_data *id, xfs_extlen_t len); > int xfs_ag_extend_space(struct xfs_mount *mp, struct xfs_trans *tp, > struct aghdr_init_data *id, xfs_extlen_t len); > int xfs_ag_get_geometry(struct xfs_mount *mp, xfs_agnumber_t agno, > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > index db6ed354c465..2ae4f33b42c9 100644 > --- a/fs/xfs/xfs_fsops.c > 
+++ b/fs/xfs/xfs_fsops.c > @@ -38,7 +38,7 @@ xfs_resizefs_init_new_ags( > struct aghdr_init_data *id, > xfs_agnumber_t oagcount, > xfs_agnumber_t nagcount, > - xfs_rfsblock_t *delta) > + int64_t *delta) > { > xfs_rfsblock_t nb = mp->m_sb.sb_dblocks + *delta; > int error; > @@ -76,33 +76,41 @@ xfs_growfs_data_private( > xfs_agnumber_t nagcount; > xfs_agnumber_t nagimax = 0; > xfs_rfsblock_t nb, nb_div, nb_mod; > - xfs_rfsblock_t delta; > + int64_t delta; > xfs_agnumber_t oagcount; > struct xfs_trans *tp; > + bool extend; > struct aghdr_init_data id = {}; > > nb = in->newblocks; > - if (nb < mp->m_sb.sb_dblocks) > - return -EINVAL; > - if ((error = xfs_sb_validate_fsb_count(&mp->m_sb, nb))) > + if (nb == mp->m_sb.sb_dblocks) > + return 0; > + > + error = xfs_sb_validate_fsb_count(&mp->m_sb, nb); > + if (error) > return error; > - error = xfs_buf_read_uncached(mp->m_ddev_targp, > + > + if (nb > mp->m_sb.sb_dblocks) { > + error = xfs_buf_read_uncached(mp->m_ddev_targp, > XFS_FSB_TO_BB(mp, nb) - XFS_FSS_TO_BB(mp, 1), > XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL); > - if (error) > - return error; > - xfs_buf_relse(bp); > + if (error) > + return error; > + xfs_buf_relse(bp); > + } > > nb_div = nb; > nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks); > nagcount = nb_div + (nb_mod != 0); > if (nb_mod && nb_mod < XFS_MIN_AG_BLOCKS) { > nagcount--; > - nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; > - if (nb < mp->m_sb.sb_dblocks) > + if (nagcount < 2) > return -EINVAL; > + nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; > } > + > delta = nb - mp->m_sb.sb_dblocks; > + extend = (delta > 0); > oagcount = mp->m_sb.sb_agcount; > > /* allocate the new per-ag structures */ > @@ -110,22 +118,34 @@ xfs_growfs_data_private( > error = xfs_initialize_perag(mp, nagcount, &nagimax); > if (error) > return error; > + } else if (nagcount != oagcount) { Nit: nagcount < oagcount ? 
> + /* TODO: shrinking the entire AGs hasn't yet completed */ > + return -EINVAL; > } > > error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, > - XFS_GROWFS_SPACE_RES(mp), 0, XFS_TRANS_RESERVE, &tp); > + (extend ? XFS_GROWFS_SPACE_RES(mp) : -delta), 0, > + XFS_TRANS_RESERVE, &tp); > if (error) > return error; > > - error = xfs_resizefs_init_new_ags(mp, &id, oagcount, nagcount, &delta); > - if (error) > - goto out_trans_cancel; > - > + if (extend) { > + error = xfs_resizefs_init_new_ags(mp, &id, oagcount, > + nagcount, &delta); > + if (error) > + goto out_trans_cancel; > + } > xfs_trans_agblocks_delta(tp, id.nfree); > > - /* If there are new blocks in the old last AG, extend it. */ > + /* If there are some blocks in the last AG, resize it. */ > if (delta) { > - error = xfs_ag_extend_space(mp, tp, &id, delta); > + if (extend) { > + error = xfs_ag_extend_space(mp, tp, &id, delta); > + } else { > + id.agno = nagcount - 1; > + error = xfs_ag_shrink_space(mp, tp, &id, -delta); This is a little nitpicky, but I wonder if the reorganization of xfs_growfs_data_private ought to be in a separate preparation patch, wherein you'd define xfs_ag_shrink_space as a stub that returns EOPNOSUPP, and make all the necessary adjustments to the caller. That way, this second patch would concentrate on replacing the shrink_space stub an actual implementation. > + } > + > if (error) > goto out_trans_cancel; > } > @@ -137,11 +157,19 @@ xfs_growfs_data_private( > */ > if (nagcount > oagcount) > xfs_trans_mod_sb(tp, XFS_TRANS_SB_AGCOUNT, nagcount - oagcount); > - if (nb > mp->m_sb.sb_dblocks) > + if (nb != mp->m_sb.sb_dblocks) > xfs_trans_mod_sb(tp, XFS_TRANS_SB_DBLOCKS, > nb - mp->m_sb.sb_dblocks); > if (id.nfree) > xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, id.nfree); > + > + /* > + * update in-core counters (especially sb_fdblocks) now > + * so xfs_validate_sb_write() can pass. 
> + */ > + if (xfs_sb_version_haslazysbcount(&mp->m_sb)) > + xfs_log_sb(tp); How do we get a failure in xfs_validate_sb_write? We're changing fdblocks and dblocks in the same transaction, which means that both counters should have changed by the number of blocks we took out of the filesystem, right? Is the problem that the TRANS_SB_DBLOCKS change above makes the primary super's sb_dblocks decrease immediately, but since we're in lazycounters mode we defer updating sb_fdblocks until unmount, so in the meantime we fail the sb write verifier because fdblocks > dblocks? Or: is it the general case that we ought to be forcing fdblocks to get logged here even for fs grow operations? In which case this (minor) behavior change probably should go in a separate patch. --D > + > xfs_trans_set_sync(tp); > error = xfs_trans_commit(tp); > if (error) > @@ -157,7 +185,7 @@ xfs_growfs_data_private( > * If we expanded the last AG, free the per-AG reservation > * so we can reinitialize it with the new size. > */ > - if (delta) { > + if (delta > 0) { > struct xfs_perag *pag; > > pag = xfs_perag_get(mp, id.agno); > @@ -178,7 +206,14 @@ xfs_growfs_data_private( > return error; > > out_trans_cancel: > - xfs_trans_cancel(tp); > + /* > + * AGFL fixup can dirty the transaction, so it needs committing anyway. > + */ > + if (!extend && ((tp->t_flags & XFS_TRANS_DIRTY) || > + !list_empty(&tp->t_dfops))) > + xfs_trans_commit(tp); > + else > + xfs_trans_cancel(tp); > return error; > } > > diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c > index e72730f85af1..fd2cbf414b80 100644 > --- a/fs/xfs/xfs_trans.c > +++ b/fs/xfs/xfs_trans.c > @@ -419,7 +419,6 @@ xfs_trans_mod_sb( > tp->t_res_frextents_delta += delta; > break; > case XFS_TRANS_SB_DBLOCKS: > - ASSERT(delta > 0); > tp->t_dblocks_delta += delta; > break; > case XFS_TRANS_SB_AGCOUNT: > -- > 2.27.0 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v5 4/5] xfs: support shrinking unused space in the last AG 2021-01-20 19:25 ` Darrick J. Wong @ 2021-01-20 20:22 ` Gao Xiang 2021-01-20 20:31 ` Gao Xiang 2021-01-21 1:51 ` Gao Xiang 1 sibling, 1 reply; 11+ messages in thread From: Gao Xiang @ 2021-01-20 20:22 UTC (permalink / raw) To: Darrick J. Wong Cc: linux-xfs, Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig Hi Darrick, On Wed, Jan 20, 2021 at 11:25:06AM -0800, Darrick J. Wong wrote: > On Mon, Jan 18, 2021 at 04:36:59PM +0800, Gao Xiang wrote: ... > > > > +int > > +xfs_ag_shrink_space( > > + struct xfs_mount *mp, > > + struct xfs_trans *tp, > > + struct aghdr_init_data *id, > > + xfs_extlen_t len) > > +{ > > + struct xfs_alloc_arg args = { > > + .tp = tp, > > + .mp = mp, > > + .type = XFS_ALLOCTYPE_THIS_BNO, > > + .minlen = len, > > + .maxlen = len, > > + .oinfo = XFS_RMAP_OINFO_SKIP_UPDATE, > > + .resv = XFS_AG_RESV_NONE, > > + .prod = 1 > > + }; > > + struct xfs_buf *agibp, *agfbp; > > + struct xfs_agi *agi; > > + struct xfs_agf *agf; > > + int error, err2; > > + > > + ASSERT(id->agno == mp->m_sb.sb_agcount - 1); > > + error = xfs_ialloc_read_agi(mp, tp, id->agno, &agibp); > > + if (error) > > + return error; > > + > > + agi = agibp->b_addr; > > + > > + error = xfs_alloc_read_agf(mp, tp, id->agno, 0, &agfbp); > > + if (error) > > + return error; > > + > > + agf = agfbp->b_addr; > > + if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length)) > > + return -EFSCORRUPTED; > > + > > + args.fsbno = XFS_AGB_TO_FSB(mp, id->agno, > > + be32_to_cpu(agi->agi_length) - len); > > + > > + /* remove the preallocations before allocation and re-establish then */ > > + error = xfs_ag_resv_free(agibp->b_pag); > > + if (error) > > + return error; > > + > > + /* internal log shouldn't also show up in the free space btrees */ > > + error = xfs_alloc_vextent(&args); > > + if (!error && args.agbno == NULLAGBLOCK) > > + error = -ENOSPC; > > + > > + if (error) { > > Aha, now I see why this 
bit: > > if (!extend && ((tp->t_flags & XFS_TRANS_DIRTY) || > !list_empty(&tp->t_dfops))) > xfs_trans_commit(tp); > > is needed below -- we could have refilled the AGFL here but failed the > allocation. At this point we have a dirty transaction /and/ an error > code. We need to commit the AGFL refill changes and return to userspace > with that error code, but calling xfs_trans_cancel on the dirty > transaction causes an (unnecessary) shutdown. > > What if you rolled the transaction here and passed the new tp and the > error code back to the caller? The new transaction is clean so it will > cancel without any side effects, and then you can send the ENOSPC up to > userspace. > > Granted, you could just as easily commit the transaction here and make > the caller smart enough to know that it no longer has a transaction. I > wonder if the transaction allocation and disposal ought to be part of > the _ag_grow_space and _ag_shrink_space functions. > > Also fwiw I would make sure the transaction is clean before I tried to > re-initialize the per-ag reservation. Thanks for your review! Okay, will look into roll transaction way instead (at least for the new _ag_shrink_space() since I don't touch _grow_space and not sure if it has AGFL refill issue as well...) > > > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > > + if (err2) > > + goto resv_err; > > + return error; > > + } > > + > > + /* > > + * if successfully deleted from freespace btrees, need to confirm > > + * per-AG reservation works as expected. > > + */ > > + be32_add_cpu(&agi->agi_length, -len); > > + be32_add_cpu(&agf->agf_length, -len); > > + > > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > > + if (err2) { > > + be32_add_cpu(&agi->agi_length, len); > > + be32_add_cpu(&agf->agf_length, len); > > + if (err2 != -ENOSPC) > > + goto resv_err; > > If we've just undone reducing ag[if]_length, don't we need to call > xfs_ag_resv_init here to (try to) recreate the former per-ag > reservations? 
If my understanding is correct, xfs_fs_reserve_ag_blocks() in xfs_growfs_data_private() would do that for all AGs... Do we need to xfs_ag_resv_init() in advance here? I thought xfs_ag_resv_init() here is mainly used to guarantee the per-AG reservation for resized size is fine... if ag{i,f}_length don't change, leave such normal reservation to xfs_fs_reserve_ag_blocks() would be okay? > > Also, the comment above about cleaning the transaction before trying to > reinit the per-ag reservation and returning ENOSPC applies here. ok. that'd be here if rolling a new transaction is needed. Thanks for the reminder! Thanks, Gao Xiang > > > + > > + __xfs_bmap_add_free(tp, args.fsbno, len, > > + &XFS_RMAP_OINFO_SKIP_UPDATE, true); > > + return err2; > > + } > > + xfs_ialloc_log_agi(tp, agibp, XFS_AGI_LENGTH); > > + xfs_alloc_log_agf(tp, agfbp, XFS_AGF_LENGTH); > > + return 0; > > + > > +resv_err: > > + xfs_warn(mp, > > +"Error %d reserving per-AG metadata reserve pool.", err2); > > + xfs_force_shutdown(mp, SHUTDOWN_CORRUPT_INCORE); > > + return err2; > > +} ^ permalink raw reply [flat|nested] 11+ messages in thread
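The dirty-transaction problem discussed above (AGFL refill dirties the transaction, the allocation then fails, and cancelling a dirty transaction shuts the filesystem down) can be sketched with a toy model. Everything here is hypothetical and heavily simplified: `toy_trans`, `toy_fs`, `toy_cancel()`, `toy_roll()` and `toy_shrink_attempt()` are illustration-only names, and a "roll" is modelled as nothing more than committing the dirty state and handing back a clean transaction.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct toy_trans { bool dirty; };
struct toy_fs { bool shut_down; int commits; };

/* Cancelling a dirty transaction is modelled as a forced shutdown. */
static void toy_cancel(struct toy_fs *fs, struct toy_trans *tp)
{
	assert(fs && tp);
	if (tp->dirty)
		fs->shut_down = true;
	tp->dirty = false;
}

/* Rolling commits whatever is dirty and leaves a clean transaction. */
static void toy_roll(struct toy_fs *fs, struct toy_trans *tp)
{
	assert(fs && tp);
	if (tp->dirty)
		fs->commits++;
	tp->dirty = false;
}

/*
 * A shrink attempt whose AGFL fixup dirties the transaction before the
 * extent allocation fails with ENOSPC.  With 'roll_on_error' the dirty
 * state is committed before the error is returned to the caller.
 */
static int toy_shrink_attempt(struct toy_fs *fs, struct toy_trans *tp,
			      bool roll_on_error)
{
	tp->dirty = true;		/* AGFL refill */
	if (roll_on_error)
		toy_roll(fs, tp);
	return -ENOSPC;			/* allocation failed */
}
```

With `roll_on_error` set, the caller can cancel the returned (clean) transaction and pass ENOSPC up without triggering the modelled shutdown; without it, the cancel hits a dirty transaction.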
* Re: [PATCH v5 4/5] xfs: support shrinking unused space in the last AG 2021-01-20 20:22 ` Gao Xiang @ 2021-01-20 20:31 ` Gao Xiang 0 siblings, 0 replies; 11+ messages in thread From: Gao Xiang @ 2021-01-20 20:31 UTC (permalink / raw) To: Darrick J. Wong Cc: linux-xfs, Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig On Thu, Jan 21, 2021 at 04:22:59AM +0800, Gao Xiang wrote: ... (cont..) > > > > > > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > > > + if (err2) > > > + goto resv_err; > > > + return error; > > > + } > > > + > > > + /* > > > + * if successfully deleted from freespace btrees, need to confirm > > > + * per-AG reservation works as expected. > > > + */ > > > + be32_add_cpu(&agi->agi_length, -len); > > > + be32_add_cpu(&agf->agf_length, -len); > > > + > > > + err2 = xfs_ag_resv_init(agibp->b_pag, tp); > > > + if (err2) { > > > + be32_add_cpu(&agi->agi_length, len); > > > + be32_add_cpu(&agf->agf_length, len); > > > + if (err2 != -ENOSPC) > > > + goto resv_err; > > > > If we've just undone reducing ag[if]_length, don't we need to call > > xfs_ag_resv_init here to (try to) recreate the former per-ag > > reservations? > > If my understanding is correct, xfs_fs_reserve_ag_blocks() in > xfs_growfs_data_private() would do that for all AGs... Do we > need to xfs_ag_resv_init() in advance here? > > I thought xfs_ag_resv_init() here is mainly used to guarantee the > per-AG reservation for resized size is fine... if ag{i,f}_length > don't change, leave such normal reservation to > xfs_fs_reserve_ag_blocks() would be okay? > Although, by the time xfs_fs_reserve_ag_blocks() runs, the transaction has already been committed and the last AG is unlocked, so there might be a race window here... I will update it, thanks for this! Thanks, Gao Xiang
* Re: [PATCH v5 4/5] xfs: support shrinking unused space in the last AG 2021-01-20 19:25 ` Darrick J. Wong 2021-01-20 20:22 ` Gao Xiang @ 2021-01-21 1:51 ` Gao Xiang 1 sibling, 0 replies; 11+ messages in thread From: Gao Xiang @ 2021-01-21 1:51 UTC (permalink / raw) To: Darrick J. Wong Cc: linux-xfs, Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig (sorry, I was too sleepy at that time... so I didn't even realize if I replied them all...go on replying this.. at least for some record ... sorry for annoying) On Wed, Jan 20, 2021 at 11:25:06AM -0800, Darrick J. Wong wrote: > On Mon, Jan 18, 2021 at 04:36:59PM +0800, Gao Xiang wrote: ... > > > > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c > > index db6ed354c465..2ae4f33b42c9 100644 > > --- a/fs/xfs/xfs_fsops.c > > +++ b/fs/xfs/xfs_fsops.c > > @@ -38,7 +38,7 @@ xfs_resizefs_init_new_ags( > > struct aghdr_init_data *id, > > xfs_agnumber_t oagcount, > > xfs_agnumber_t nagcount, > > - xfs_rfsblock_t *delta) > > + int64_t *delta) > > { > > xfs_rfsblock_t nb = mp->m_sb.sb_dblocks + *delta; > > int error; > > @@ -76,33 +76,41 @@ xfs_growfs_data_private( > > xfs_agnumber_t nagcount; > > xfs_agnumber_t nagimax = 0; > > xfs_rfsblock_t nb, nb_div, nb_mod; > > - xfs_rfsblock_t delta; > > + int64_t delta; > > xfs_agnumber_t oagcount; > > struct xfs_trans *tp; > > + bool extend; > > struct aghdr_init_data id = {}; > > > > nb = in->newblocks; > > - if (nb < mp->m_sb.sb_dblocks) > > - return -EINVAL; > > - if ((error = xfs_sb_validate_fsb_count(&mp->m_sb, nb))) > > + if (nb == mp->m_sb.sb_dblocks) > > + return 0; > > + > > + error = xfs_sb_validate_fsb_count(&mp->m_sb, nb); > > + if (error) > > return error; > > - error = xfs_buf_read_uncached(mp->m_ddev_targp, > > + > > + if (nb > mp->m_sb.sb_dblocks) { > > + error = xfs_buf_read_uncached(mp->m_ddev_targp, > > XFS_FSB_TO_BB(mp, nb) - XFS_FSS_TO_BB(mp, 1), > > XFS_FSS_TO_BB(mp, 1), 0, &bp, NULL); > > - if (error) > > - return error; > > - 
xfs_buf_relse(bp); > > + if (error) > > + return error; > > + xfs_buf_relse(bp); > > + } > > > > nb_div = nb; > > nb_mod = do_div(nb_div, mp->m_sb.sb_agblocks); > > nagcount = nb_div + (nb_mod != 0); > > if (nb_mod && nb_mod < XFS_MIN_AG_BLOCKS) { > > nagcount--; > > - nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; > > - if (nb < mp->m_sb.sb_dblocks) > > + if (nagcount < 2) > > return -EINVAL; > > + nb = (xfs_rfsblock_t)nagcount * mp->m_sb.sb_agblocks; > > } > > + > > delta = nb - mp->m_sb.sb_dblocks; > > + extend = (delta > 0); > > oagcount = mp->m_sb.sb_agcount; > > > > /* allocate the new per-ag structures */ > > @@ -110,22 +118,34 @@ xfs_growfs_data_private( > > error = xfs_initialize_perag(mp, nagcount, &nagimax); > > if (error) > > return error; > > + } else if (nagcount != oagcount) { > > Nit: nagcount < oagcount ? (cont..) ok, that is equal.. will update this. > > > + /* TODO: shrinking the entire AGs hasn't yet completed */ > > + return -EINVAL; > > } > > > > error = xfs_trans_alloc(mp, &M_RES(mp)->tr_growdata, > > - XFS_GROWFS_SPACE_RES(mp), 0, XFS_TRANS_RESERVE, &tp); > > + (extend ? XFS_GROWFS_SPACE_RES(mp) : -delta), 0, > > + XFS_TRANS_RESERVE, &tp); > > if (error) > > return error; > > > > - error = xfs_resizefs_init_new_ags(mp, &id, oagcount, nagcount, &delta); > > - if (error) > > - goto out_trans_cancel; > > - > > + if (extend) { > > + error = xfs_resizefs_init_new_ags(mp, &id, oagcount, > > + nagcount, &delta); > > + if (error) > > + goto out_trans_cancel; > > + } > > xfs_trans_agblocks_delta(tp, id.nfree); > > > > - /* If there are new blocks in the old last AG, extend it. */ > > + /* If there are some blocks in the last AG, resize it. 
*/ > > if (delta) { > > - error = xfs_ag_extend_space(mp, tp, &id, delta); > > + if (extend) { > > + error = xfs_ag_extend_space(mp, tp, &id, delta); > > + } else { > > + id.agno = nagcount - 1; > > + error = xfs_ag_shrink_space(mp, tp, &id, -delta); > > This is a little nitpicky, but I wonder if the reorganization of > xfs_growfs_data_private ought to be in a separate preparation patch, > wherein you'd define xfs_ag_shrink_space as a stub that returns > EOPNOTSUPP, and make all the necessary adjustments to the caller. > > That way, this second patch would concentrate on replacing the > shrink_space stub with an actual implementation. I can give this a try. Another thought you mentioned on IRC was separating shrinkfs into another function, e.g. xfs_shrinkfs_data_private()... Although Brian once mentioned he liked the shared approach, I'm fine with both. So in the next version I would like to try separating it and see if it looks ok to people. > > > + } > > + > > if (error) > > goto out_trans_cancel; > > } > > @@ -137,11 +157,19 @@ xfs_growfs_data_private( > > */ > > if (nagcount > oagcount) > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_AGCOUNT, nagcount - oagcount); > > - if (nb > mp->m_sb.sb_dblocks) > > + if (nb != mp->m_sb.sb_dblocks) > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_DBLOCKS, > > nb - mp->m_sb.sb_dblocks); > > if (id.nfree) > > xfs_trans_mod_sb(tp, XFS_TRANS_SB_FDBLOCKS, id.nfree); > > + > > + /* > > + * update in-core counters (especially sb_fdblocks) now > > + * so xfs_validate_sb_write() can pass. > > + */ > > + if (xfs_sb_version_haslazysbcount(&mp->m_sb)) > > + xfs_log_sb(tp); > > How do we get a failure in xfs_validate_sb_write? We're changing > fdblocks and dblocks in the same transaction, which means that both > counters should have changed by the number of blocks we took out of > the filesystem, right?
>
> Is the problem that the TRANS_SB_DBLOCKS change above makes the primary
> super's sb_dblocks decrease immediately, but since we're in lazycounters
> mode we defer updating sb_fdblocks until unmount, so in the meantime
> we fail the sb write verifier because fdblocks > dblocks?

Yeah, this was mainly to deal with the sb write verifier at that time,
otherwise the sb verifier would complain about this:
https://lore.kernel.org/r/20201021142230.GA30714@xiangao.remote.csb/

>
> Or: is it the general case that we ought to be forcing fdblocks to get
> logged here even for fs grow operations? In which case this (minor)
> behavior change probably should go in a separate patch.

I think it also needs to apply to the growfs case. I haven't observed
anything strange with growfs, but I think that generally the lazy sb
counters (including sb_dblocks and sb_fdblocks) would be better updated
immediately for all resizing cases.

ok, will add another patch to handle this...

Thanks,
Gao Xiang

>
> --D
>
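For reference, the size rounding done near the top of xfs_growfs_data_private() in the quoted diff (nb_div, nb_mod, nagcount, then delta and its sign) can be worked through in userspace. This is a rough sketch under stated assumptions: `toy_resize` and `toy_resize_plan()` are hypothetical names, do_div() is replaced by plain 64-bit division, and the minimum AG size is passed in as a parameter instead of using XFS_MIN_AG_BLOCKS directly.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Result of the rounding step (hypothetical type). */
struct toy_resize {
	uint64_t nb;		/* block count after rounding */
	uint32_t nagcount;	/* resulting number of AGs */
	int64_t delta;		/* > 0 means grow, < 0 means shrink */
};

/*
 * Mirrors the arithmetic in the quoted hunk: a trailing partial AG that
 * is smaller than 'min_ag_blocks' is dropped, and the result must still
 * have at least two AGs.
 */
static int toy_resize_plan(uint64_t dblocks, uint32_t agblocks,
			   uint32_t min_ag_blocks, uint64_t newblocks,
			   struct toy_resize *out)
{
	uint64_t nb = newblocks;
	uint64_t nb_div, nb_mod;
	uint32_t nagcount;

	assert(agblocks != 0);
	nb_div = nb / agblocks;
	nb_mod = nb % agblocks;
	nagcount = (uint32_t)(nb_div + (nb_mod != 0));
	if (nb_mod && nb_mod < min_ag_blocks) {
		nagcount--;
		if (nagcount < 2)
			return -EINVAL;
		nb = (uint64_t)nagcount * agblocks;
	}
	out->nb = nb;
	out->nagcount = nagcount;
	out->delta = (int64_t)nb - (int64_t)dblocks;
	return 0;
}
```

For a 4-AG filesystem of 4000 blocks with 1000-block AGs and an assumed minimum of 64 blocks, asking for 3500 blocks keeps 4 AGs with delta = -500 (a shrink inside the last AG), while asking for 3010 blocks rounds down to 3000 blocks and 3 AGs, which is the whole-AG removal that the patch still rejects with -EINVAL.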
* [PATCH v5 5/5] xfs: add error injection for per-AG resv failure when shrinkfs 2021-01-18 8:36 [PATCH v5 0/5] xfs: support shrinking free space in the last AG Gao Xiang ` (3 preceding siblings ...) 2021-01-18 8:36 ` [PATCH v5 4/5] xfs: support shrinking unused space in the last AG Gao Xiang @ 2021-01-18 8:37 ` Gao Xiang 2021-01-20 19:25 ` Darrick J. Wong 4 siblings, 1 reply; 11+ messages in thread From: Gao Xiang @ 2021-01-18 8:37 UTC (permalink / raw) To: linux-xfs Cc: Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig, Gao Xiang per-AG resv failure after fixing up freespace is hard to test in an effective way, so directly add an error injection path to observe such error handling path works as expected. Signed-off-by: Gao Xiang <hsiangkao@redhat.com> --- fs/xfs/libxfs/xfs_ag.c | 5 +++++ fs/xfs/libxfs/xfs_errortag.h | 2 ++ fs/xfs/xfs_error.c | 2 ++ 3 files changed, 9 insertions(+) diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c index 04a7c9b20470..65e8e07f179b 100644 --- a/fs/xfs/libxfs/xfs_ag.c +++ b/fs/xfs/libxfs/xfs_ag.c @@ -23,6 +23,7 @@ #include "xfs_ag_resv.h" #include "xfs_health.h" #include "xfs_error.h" +#include "xfs_errortag.h" #include "xfs_bmap.h" static int @@ -552,6 +553,10 @@ xfs_ag_shrink_space( be32_add_cpu(&agf->agf_length, -len); err2 = xfs_ag_resv_init(agibp->b_pag, tp); + + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL)) + err2 = -ENOSPC; + if (err2) { be32_add_cpu(&agi->agi_length, len); be32_add_cpu(&agf->agf_length, len); diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h index 53b305dea381..89da08a451cf 100644 --- a/fs/xfs/libxfs/xfs_errortag.h +++ b/fs/xfs/libxfs/xfs_errortag.h @@ -40,6 +40,8 @@ #define XFS_ERRTAG_REFCOUNT_FINISH_ONE 25 #define XFS_ERRTAG_BMAP_FINISH_ONE 26 #define XFS_ERRTAG_AG_RESV_CRITICAL 27 +#define XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL 28 + /* * DEBUG mode instrumentation to test and/or trigger delayed allocation * block killing in the event 
of failed writes. When enabled, all diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c index 7f6e20899473..c864451ba7d0 100644 --- a/fs/xfs/xfs_error.c +++ b/fs/xfs/xfs_error.c @@ -164,6 +164,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair, XFS_ERRTAG_FORCE_SCRUB_REPAIR); XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC); XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK); XFS_ERRORTAG_ATTR_RW(buf_ioerror, XFS_ERRTAG_BUF_IOERROR); +XFS_ERRORTAG_ATTR_RW(shrinkfs_ag_resv_fail, XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL); static struct attribute *xfs_errortag_attrs[] = { XFS_ERRORTAG_ATTR_LIST(noerror), @@ -202,6 +203,7 @@ static struct attribute *xfs_errortag_attrs[] = { XFS_ERRORTAG_ATTR_LIST(bad_summary), XFS_ERRORTAG_ATTR_LIST(iunlink_fallback), XFS_ERRORTAG_ATTR_LIST(buf_ioerror), + XFS_ERRORTAG_ATTR_LIST(shrinkfs_ag_resv_fail), NULL, }; -- 2.27.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
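The shape of the injection point added by this patch can be illustrated with a small userspace sketch. It is only a model: `toy_test_error()`, `toy_errtag_enabled`, `toy_resv_init()` and `toy_shrink_step()` are hypothetical names, and the tag fires every time once enabled, which is a simplification and not a description of how the real XFS_TEST_ERROR() decides when to fire.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum { TOY_ERRTAG_SHRINKFS_AG_RESV_FAIL = 0, TOY_ERRTAG_MAX };

/* One on/off switch per tag; the sysfs knob is not modelled here. */
static bool toy_errtag_enabled[TOY_ERRTAG_MAX];

/* True if the real condition holds or the tag is switched on. */
static bool toy_test_error(bool expr, int tag)
{
	assert(tag >= 0 && tag < TOY_ERRTAG_MAX);
	return expr || toy_errtag_enabled[tag];
}

/* Stand-in for re-establishing the per-AG reservation; always succeeds. */
static int toy_resv_init(void)
{
	return 0;
}

/*
 * Same shape as the patched hunk: shrink the length, re-init the
 * reservation, let the error tag override the result, and undo the
 * length change if anything failed.
 */
static int toy_shrink_step(unsigned int *ag_length, unsigned int len)
{
	int err2;

	*ag_length -= len;
	err2 = toy_resv_init();
	if (toy_test_error(false, TOY_ERRTAG_SHRINKFS_AG_RESV_FAIL))
		err2 = -ENOSPC;
	if (err2) {
		*ag_length += len;	/* roll the length back */
		return err2;
	}
	return 0;
}
```

With the tag off the step succeeds and the length drops; with the tag on the step returns -ENOSPC and the length is restored, which is the rollback path the injection point is meant to exercise.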
* Re: [PATCH v5 5/5] xfs: add error injection for per-AG resv failure when shrinkfs 2021-01-18 8:37 ` [PATCH v5 5/5] xfs: add error injection for per-AG resv failure when shrinkfs Gao Xiang @ 2021-01-20 19:25 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2021-01-20 19:25 UTC (permalink / raw) To: Gao Xiang Cc: linux-xfs, Darrick J. Wong, Brian Foster, Eric Sandeen, Dave Chinner, Christoph Hellwig On Mon, Jan 18, 2021 at 04:37:00PM +0800, Gao Xiang wrote: > per-AG resv failure after fixing up freespace is hard to test in an > effective way, so directly add an error injection path to observe > such error handling path works as expected. > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Generally seems fine to me... Reviewed-by: Darrick J. Wong <djwong@kernel.org> --D > --- > fs/xfs/libxfs/xfs_ag.c | 5 +++++ > fs/xfs/libxfs/xfs_errortag.h | 2 ++ > fs/xfs/xfs_error.c | 2 ++ > 3 files changed, 9 insertions(+) > > diff --git a/fs/xfs/libxfs/xfs_ag.c b/fs/xfs/libxfs/xfs_ag.c > index 04a7c9b20470..65e8e07f179b 100644 > --- a/fs/xfs/libxfs/xfs_ag.c > +++ b/fs/xfs/libxfs/xfs_ag.c > @@ -23,6 +23,7 @@ > #include "xfs_ag_resv.h" > #include "xfs_health.h" > #include "xfs_error.h" > +#include "xfs_errortag.h" > #include "xfs_bmap.h" > > static int > @@ -552,6 +553,10 @@ xfs_ag_shrink_space( > be32_add_cpu(&agf->agf_length, -len); > > err2 = xfs_ag_resv_init(agibp->b_pag, tp); > + > + if (XFS_TEST_ERROR(false, mp, XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL)) > + err2 = -ENOSPC; > + > if (err2) { > be32_add_cpu(&agi->agi_length, len); > be32_add_cpu(&agf->agf_length, len); > diff --git a/fs/xfs/libxfs/xfs_errortag.h b/fs/xfs/libxfs/xfs_errortag.h > index 53b305dea381..89da08a451cf 100644 > --- a/fs/xfs/libxfs/xfs_errortag.h > +++ b/fs/xfs/libxfs/xfs_errortag.h > @@ -40,6 +40,8 @@ > #define XFS_ERRTAG_REFCOUNT_FINISH_ONE 25 > #define XFS_ERRTAG_BMAP_FINISH_ONE 26 > #define XFS_ERRTAG_AG_RESV_CRITICAL 27 > +#define XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL 28 
> + > /* > * DEBUG mode instrumentation to test and/or trigger delayed allocation > * block killing in the event of failed writes. When enabled, all > diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c > index 7f6e20899473..c864451ba7d0 100644 > --- a/fs/xfs/xfs_error.c > +++ b/fs/xfs/xfs_error.c > @@ -164,6 +164,7 @@ XFS_ERRORTAG_ATTR_RW(force_repair, XFS_ERRTAG_FORCE_SCRUB_REPAIR); > XFS_ERRORTAG_ATTR_RW(bad_summary, XFS_ERRTAG_FORCE_SUMMARY_RECALC); > XFS_ERRORTAG_ATTR_RW(iunlink_fallback, XFS_ERRTAG_IUNLINK_FALLBACK); > XFS_ERRORTAG_ATTR_RW(buf_ioerror, XFS_ERRTAG_BUF_IOERROR); > +XFS_ERRORTAG_ATTR_RW(shrinkfs_ag_resv_fail, XFS_ERRTAG_SHRINKFS_AG_RESV_FAIL); > > static struct attribute *xfs_errortag_attrs[] = { > XFS_ERRORTAG_ATTR_LIST(noerror), > @@ -202,6 +203,7 @@ static struct attribute *xfs_errortag_attrs[] = { > XFS_ERRORTAG_ATTR_LIST(bad_summary), > XFS_ERRORTAG_ATTR_LIST(iunlink_fallback), > XFS_ERRORTAG_ATTR_LIST(buf_ioerror), > + XFS_ERRORTAG_ATTR_LIST(shrinkfs_ag_resv_fail), > NULL, > }; > > -- > 2.27.0 > ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2021-01-21 3:52 UTC | newest]

Thread overview: 11+ messages
2021-01-18 8:36 [PATCH v5 0/5] xfs: support shrinking free space in the last AG Gao Xiang
2021-01-18 8:36 ` [PATCH v5 1/5] xfs: rename `new' to `delta' in xfs_growfs_data_private() Gao Xiang
2021-01-18 8:36 ` [PATCH v5 2/5] xfs: get rid of xfs_growfs_{data,log}_t Gao Xiang
2021-01-18 8:36 ` [PATCH v5 3/5] xfs: hoist out xfs_resizefs_init_new_ags() Gao Xiang
2021-01-18 8:36 ` [PATCH v5 4/5] xfs: support shrinking unused space in the last AG Gao Xiang
2021-01-20 19:25   ` Darrick J. Wong
2021-01-20 20:22     ` Gao Xiang
2021-01-20 20:31       ` Gao Xiang
2021-01-21 1:51     ` Gao Xiang
2021-01-18 8:37 ` [PATCH v5 5/5] xfs: add error injection for per-AG resv failure when shrinkfs Gao Xiang
2021-01-20 19:25   ` Darrick J. Wong