From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C725CC433E0 for ; Sat, 9 Jan 2021 05:27:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7D99023A68 for ; Sat, 9 Jan 2021 05:27:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726077AbhAIF12 (ORCPT ); Sat, 9 Jan 2021 00:27:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42850 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725300AbhAIF12 (ORCPT ); Sat, 9 Jan 2021 00:27:28 -0500 Received: from mail-pl1-x62c.google.com (mail-pl1-x62c.google.com [IPv6:2607:f8b0:4864:20::62c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E771FC061573 for ; Fri, 8 Jan 2021 21:26:47 -0800 (PST) Received: by mail-pl1-x62c.google.com with SMTP id j1so6796340pld.3 for ; Fri, 08 Jan 2021 21:26:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=references:user-agent:from:to:cc:subject:in-reply-to:date :message-id:mime-version; bh=r21AzQfz8Hp1Vq7D6zGuzdLQuTirHpf5GtiI0TmLmi8=; b=YembH6y/P+4Bow1hHMmhqm2EsB7AKq3xa00/HODTfTCq5pZquchZ8vYOc+ZUWRRBqF 4+7sgQUcto1SiDNhfqRKY99Cwe6OsrFoR/xwTRMrdOqnLXuSNvDDzDs7RIczge+NxKu6 brMG8GrS1M6IL+v4VlGeExkMCXP/vXPME4v9+U9xqTDM2u5zOp9qQEPF5At1H2SfRXsB 8SqVfaXT8FrfTaX27I3ewVJrRCsqmz4N5zh5F1cMy/xa/JoMR6jGUCesgFIJvbLejgMW ViBGDsqUo/5ShlFZDa1gk31CpXpNLibFiSY1cvkR6G+ywyQ/9Bfok4ANxB+6CSuiwpk7 Ppog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:references:user-agent:from:to:cc:subject :in-reply-to:date:message-id:mime-version; bh=r21AzQfz8Hp1Vq7D6zGuzdLQuTirHpf5GtiI0TmLmi8=; b=KO2pGdqn69LYrhvHYHHEVJOB9HDzSvZMrdCbz4uQkZd3EdRzKGD2DbzNilA7Iz3F2N nkhB0teHlTqRMZBD8mqsFe1MSIRPlYnCmEaJiuGgfO//HB1rrNcvXPtz2m0KLaznBuzC ushDJ9CM7fZlOep+owKR5t8xbXecBJVMaVoAxk9rx8kBF6GsFPGDP6aQO9ri8yXN79uQ eJ1NsOw2OlVCHIsSzj0SmK3t/QVu7O9ZhPu9ZCNjbwctbPMOpm/W4cmbnRTH0zN5qBQ5 CRVzYylL6GLx15qzwJSGzYD41vnGpH3udjpLp8COTm6BqIQbYjYfrTdll7/yVdl8tNGl lYcA== X-Gm-Message-State: AOAM531/c3Mmi+lzH02b68jX+VwnnmvwZGAqXVff9IsPTxzXkypmsYrM UXS6tmtbk5SVcsEiw1z8cH5Nm+uPZyw= X-Google-Smtp-Source: ABdhPJzJsI2gdJTDBrLXCNUPiVkhlyN31Cn4x4fu1hSvmOCZ0XMvyE07Rs6a+zc/NErFTyDPK6a1GQ== X-Received: by 2002:a17:90a:4a85:: with SMTP id f5mr7232079pjh.117.1610170007131; Fri, 08 Jan 2021 21:26:47 -0800 (PST) Received: from garuda ([122.179.56.226]) by smtp.gmail.com with ESMTPSA id p9sm9691305pjb.3.2021.01.08.21.26.44 (version=TLS1_2 cipher=ECDHE-ECDSA-CHACHA20-POLY1305 bits=256/256); Fri, 08 Jan 2021 21:26:46 -0800 (PST) References: <20210104103120.41158-1-chandanrlinux@gmail.com> <20210104103120.41158-5-chandanrlinux@gmail.com> <20210108011713.GP38809@magnolia> <2055679.3ixLThe9JB@garuda> <20210109013351.GS6918@magnolia> User-agent: mu4e 1.0; emacs 26.1 From: Chandan Babu R To: "Darrick J. Wong" Cc: Chandan Babu R , linux-xfs@vger.kernel.org, hch@lst.de, allison.henderson@oracle.com Subject: Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries In-reply-to: <20210109013351.GS6918@magnolia> Date: Sat, 09 Jan 2021 10:56:42 +0530 Message-ID: <87y2h2r88d.fsf@garuda> MIME-Version: 1.0 Content-Type: text/plain Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org On 09 Jan 2021 at 07:03, Darrick J. Wong wrote: > On Fri, Jan 08, 2021 at 10:25:35AM +0530, Chandan Babu R wrote: >> On Thu, 07 Jan 2021 17:17:13 -0800, Darrick J. Wong wrote: >> > On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote: >> > > Directory entry addition can cause the following, >> > > 1. Data block can be added/removed. >> > > A new extent can cause extent count to increase by 1. >> > > 2. Free disk block can be added/removed. >> > > Same behaviour as described above for Data block. >> > > 3. Dabtree blocks. >> > > XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these >> > > can be new extents. Hence extent count can increase by >> > > XFS_DA_NODE_MAXDEPTH. >> > > >> > > Directory entry remove and rename (applicable only to the source >> > > directory entry) operations are handled specially to allow them to >> > > succeed in low extent count availability scenarios >> > > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible >> > > extent count overflow is detected. -ENOSPC is already handled by higher >> > > layers of XFS by letting, >> > > 1. Empty Data/Free space index blocks to linger around until a future >> > > remove operation frees them. >> > > 2. Dabtree blocks would be swapped with the last block in the leaf space >> > > followed by unmapping of the new last block. >> > > >> > > Also, Extent overflow check is performed for the target directory entry >> > > of the rename operation only when the entry does not exist and a >> > > non-zero space reservation is obtained successfully. >> > > >> > > Signed-off-by: Chandan Babu R >> > > --- >> > > fs/xfs/libxfs/xfs_bmap.c | 15 ++++++++++++ >> > > fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++ >> > > fs/xfs/xfs_inode.c | 45 ++++++++++++++++++++++++++++++++++ >> > > fs/xfs/xfs_symlink.c | 5 ++++ >> > > 4 files changed, 78 insertions(+) >> > > >> > > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c >> > > index 32aeacf6f055..5fd804534e67 100644 >> > > --- a/fs/xfs/libxfs/xfs_bmap.c >> > > +++ b/fs/xfs/libxfs/xfs_bmap.c >> > > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real( >> > > /* >> > > * Deleting the middle of the extent. >> > > */ >> > > + >> > > + /* >> > > + * For directories, -ENOSPC will be handled by higher layers of >> > > + * XFS by letting the corresponding empty Data/Free blocks to >> > > + * linger around until a future remove operation. Dabtree blocks >> > > + * would be swapped with the last block in the leaf space and >> > > + * then the new last block will be unmapped. >> > > + */ >> > > + if (S_ISDIR(VFS_I(ip)->i_mode) && >> > > + whichfork == XFS_DATA_FORK && >> > > + xfs_iext_count_may_overflow(ip, whichfork, 1)) { >> > > + error = -ENOSPC; >> > > + goto done; >> > >> > Hmm... it strikes me as a little odd that we're checking file mode and >> > fork type in the middle of the bmap code. However, I think it's the >> > case that the only place where anyone would punch a hole in the /middle/ >> > of an extent is xattr trees and regular files, right? And both of those >> > cases are checked before we end up in the bmap code, right? >> >> Yes, your observation is correct. I will remove the file mode and fork type >> checks. > > You might want to leave an assert there just in case someone else trips > over that... Ok. > >> >> > >> > So we only really need this check to prevent extent count overflows when >> > removing dirents from directories, like the comment says, and only >> > because directories don't have a hard requirement that the bunmapi >> > succeeds. And I think this logic covers xfs_remove too? That's a bit >> > subtle, but as there's no extent count check in that function, there's >> > not much to attach a comment to... :) >> >> Yes, To provide more clarity, I should replace the above comment with >> following, >> >> /* >> * -ENOSPC is returned since a directory entry remove operation must not fail >> * due to low extent count availability. -ENOSPC will be handled by higher >> * layers of XFS by letting the corresponding empty Data/Free blocks to linger > > grammar nit: "...by letting the corresponding empty Data/Free blocks linger > until a future remove operation." > Ok, I will fix that. >> * around until a future remove operation. Dabtree blocks would be swapped with >> * the last block in the leaf space and then the new last block will be >> * unmapped. >> * >> * The above logic also applies to the source directory entry of a rename >> * operation. >> */ >> >> > >> > Hm. I think I'd like xfs_rename to get a brief comment that we're >> > protected from extent count overflows in xfs_remove() by virtue of this >> > "leave the dir block in place if we ENOSPC" capability: >> > >> > /* >> > * NOTE: We don't need to check for extent overflows here >> > * because the dir removename code will leave the dir block >> > * in place if the extent count would overflow. >> > */ >> > error = xfs_dir_removename(...); >> >> Sure, I will add that. >> >> > >> > Do xattr trees also have the same ability? I think they do, at least >> > for the dabtree part...? >> >> The following code snippet from xfs_da_shrink_inode() does the special casing >> only for the data fork i.e. for blocks holding directory entries. >> >> for (;;) { >> /* >> * Remove extents. If we get ENOSPC for a dir we have to move >> * the last block to the place we want to kill. >> */ >> error = xfs_bunmapi(tp, dp, dead_blkno, count, >> xfs_bmapi_aflag(w), 0, &done); >> if (error == -ENOSPC) { >> if (w != XFS_DATA_FORK) >> break; >> error = xfs_da3_swap_lastblock(args, &dead_blkno, >> &dead_buf); >> >> So this facility is available only for directory entries. >> >> Hence for xattrs, if we ever reach the extent count limit, the only way out is >> to delete the corresponding file. > > Ok. Just checking. :) > >> > >> > I think I would've split this patch into three pieces: >> > >> > - create, link, and symlink in one patch (adding dirents), >> > - the xfs_bmap_del_extent_real change and a comment for xfs_remove >> > (removing dirents) >> > - all the xfs_rename changes (adding and removing dirents) >> > >> > Though I dunno, this series is already 14 patches, and the part that I >> > care most about is not leaving that subtlety in xfs_remove(). :) >> >> I think you are right about that. I will split this patch according to what >> you have mentioned above. > > > Thanks! > > FWIW I didn't see any regressions in fstests... Cool. Thanks for the test run. > >> > >> > Other than that, I follow the logic in this patch and will give it a >> > testrun tonight. >> >> Thank you! >> >> > >> > --D >> > >> > > + } >> > > + >> > > old = got; >> > > >> > > got.br_blockcount = del->br_startoff - got.br_startoff; >> > > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h >> > > index bcac769a7df6..ea1a9dd8a763 100644 >> > > --- a/fs/xfs/libxfs/xfs_inode_fork.h >> > > +++ b/fs/xfs/libxfs/xfs_inode_fork.h >> > > @@ -47,6 +47,19 @@ struct xfs_ifork { >> > > */ >> > > #define XFS_IEXT_PUNCH_HOLE_CNT (1) >> > > >> > > +/* >> > > + * Directory entry addition can cause the following, >> > > + * 1. Data block can be added/removed. >> > > + * A new extent can cause extent count to increase by 1. >> > > + * 2. Free disk block can be added/removed. >> > > + * Same behaviour as described above for Data block. >> > > + * 3. Dabtree blocks. >> > > + * XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new >> > > + * extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH. >> > > + */ >> > > +#define XFS_IEXT_DIR_MANIP_CNT(mp) \ >> > > + ((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount) >> > > + >> > > /* >> > > * Fork handling. >> > > */ >> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c >> > > index b7352bc4c815..0db21368c7e1 100644 >> > > --- a/fs/xfs/xfs_inode.c >> > > +++ b/fs/xfs/xfs_inode.c >> > > @@ -1042,6 +1042,11 @@ xfs_create( >> > > if (error) >> > > goto out_trans_cancel; >> > > >> > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK, >> > > + XFS_IEXT_DIR_MANIP_CNT(mp)); >> > > + if (error) >> > > + goto out_trans_cancel; >> > > + >> > > /* >> > > * A newly created regular or special file just has one directory >> > > * entry pointing to them, but a directory also the "." entry >> > > @@ -1258,6 +1263,11 @@ xfs_link( >> > > xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL); >> > > xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL); >> > > >> > > + error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK, >> > > + XFS_IEXT_DIR_MANIP_CNT(mp)); >> > > + if (error) >> > > + goto error_return; >> > > + >> > > /* >> > > * If we are using project inheritance, we only allow hard link >> > > * creation in our tree when the project IDs are the same; else >> > > @@ -3106,6 +3116,35 @@ xfs_rename( >> > > /* >> > > * Check for expected errors before we dirty the transaction >> > > * so we can return an error without a transaction abort. >> > > + * >> > > + * Extent count overflow check: >> > > + * >> > > + * From the perspective of src_dp, a rename operation is essentially a >> > > + * directory entry remove operation. Hence the only place where we check >> > > + * for extent count overflow for src_dp is in >> > > + * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns >> > > + * -ENOSPC when it detects a possible extent count overflow and in >> > > + * response, the higher layers of directory handling code do the >> > > + * following: >> > > + * 1. Data/Free blocks: XFS lets these blocks linger around until a >> > > + * future remove operation removes them. >> > > + * 2. Dabtree blocks: XFS swaps the blocks with the last block in the >> > > + * Leaf space and unmaps the last block. >> > > + * >> > > + * For target_dp, there are two cases depending on whether the >> > > + * destination directory entry exists or not. >> > > + * >> > > + * When destination directory entry does not exist (i.e. target_ip == >> > > + * NULL), extent count overflow check is performed only when transaction >> > > + * has a non-zero sized space reservation associated with it. With a >> > > + * zero-sized space reservation, XFS allows a rename operation to >> > > + * continue only when the directory has sufficient free space in its >> > > + * data/leaf/free space blocks to hold the new entry. >> > > + * >> > > + * When destination directory entry exists (i.e. target_ip != NULL), all >> > > + * we need to do is change the inode number associated with the already >> > > + * existing entry. Hence there is no need to perform an extent count >> > > + * overflow check. >> > > */ >> > > if (target_ip == NULL) { >> > > /* >> > > @@ -3116,6 +3155,12 @@ xfs_rename( >> > > error = xfs_dir_canenter(tp, target_dp, target_name); >> > > if (error) >> > > goto out_trans_cancel; >> > > + } else { >> > > + error = xfs_iext_count_may_overflow(target_dp, >> > > + XFS_DATA_FORK, >> > > + XFS_IEXT_DIR_MANIP_CNT(mp)); >> > > + if (error) >> > > + goto out_trans_cancel; >> > > } >> > > } else { >> > > /* >> > > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c >> > > index 1f43fd7f3209..0b8136a32484 100644 >> > > --- a/fs/xfs/xfs_symlink.c >> > > +++ b/fs/xfs/xfs_symlink.c >> > > @@ -220,6 +220,11 @@ xfs_symlink( >> > > if (error) >> > > goto out_trans_cancel; >> > > >> > > + error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK, >> > > + XFS_IEXT_DIR_MANIP_CNT(mp)); >> > > + if (error) >> > > + goto out_trans_cancel; >> > > + >> > > /* >> > > * Allocate an inode for the symlink. >> > > */ >> > >> -- chandan