From: Amir Goldstein
Date: Sat, 12 Sep 2020 09:19:11 +0300
Subject: More filesystems need this fix (xfs: use MMAPLOCK around filemap_map_pages())
To: Andreas Gruenbacher, Jan Kara, Theodore Tso, Martin Brandenburg,
	Mike Marshall, Damien Le Moal, Jaegeuk Kim, Qiuyang Sun
Cc: linux-xfs, Dave Chinner, linux-fsdevel, Linux MM, linux-kernel,
	Matthew Wilcox, Linus Torvalds, "Kirill A. Shutemov",
	Andrew Morton, Al Viro
In-Reply-To: <20200623052059.1893966-1-david@fromorbit.com>
References: <20200623052059.1893966-1-david@fromorbit.com>

On Tue, Jun 23, 2020 at 8:21 AM Dave Chinner wrote:
>
> From: Dave Chinner
>
> The page faultround path ->map_pages is implemented in XFS via
> filemap_map_pages(). This function checks that pages found in page
> cache lookups have not raced with truncate based invalidation by
> checking page->mapping is correct and page->index is within EOF.
>
> However, we've known for a long time that this is not sufficient to
> protect against races with invalidations done by operations that do
> not change EOF. e.g. hole punching and other fallocate() based
> direct extent manipulations. The way we protect against these
> races is we wrap the page fault operations in a XFS_MMAPLOCK_SHARED
> lock so they serialise against fallocate and truncate before calling
> into the filemap function that processes the fault.
>
> Do the same for XFS's ->map_pages implementation to close this
> potential data corruption issue.
>
> Signed-off-by: Dave Chinner
> ---
>  fs/xfs/xfs_file.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 7b05f8fd7b3d..4b185a907432 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1266,10 +1266,23 @@ xfs_filemap_pfn_mkwrite(
>  	return __xfs_filemap_fault(vmf, PE_SIZE_PTE, true);
>  }
>
> +static void
> +xfs_filemap_map_pages(
> +	struct vm_fault		*vmf,
> +	pgoff_t			start_pgoff,
> +	pgoff_t			end_pgoff)
> +{
> +	struct inode		*inode = file_inode(vmf->vma->vm_file);
> +
> +	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
> +	filemap_map_pages(vmf, start_pgoff, end_pgoff);
> +	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
> +}
> +
>  static const struct vm_operations_struct xfs_file_vm_ops = {
>  	.fault		= xfs_filemap_fault,
>  	.huge_fault	= xfs_filemap_huge_fault,
> -	.map_pages	= filemap_map_pages,
> +	.map_pages	= xfs_filemap_map_pages,
>  	.page_mkwrite	= xfs_filemap_page_mkwrite,
>  	.pfn_mkwrite	= xfs_filemap_pfn_mkwrite,
>  };
> --
> 2.26.2.761.g0e0b3e54be
>

It appears that ext4, f2fs, gfs2, orangefs and zonefs also need this fix.
zonefs does not support hole punching, so it may not need to use mmap_sem
at all.
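For ext4, for example, I imagine the analogous fix would look something
like the below. This is a completely untested sketch: I am assuming that
i_mmap_sem, the lock that ext4_filemap_fault() already takes around
filemap_fault(), is the right lock to take here as well, and the helper
name is made up:

static void
ext4_filemap_map_pages(
	struct vm_fault *vmf,
	pgoff_t start_pgoff,
	pgoff_t end_pgoff)
{
	struct inode *inode = file_inode(vmf->vma->vm_file);

	/*
	 * Same idea as the xfs patch above: hold the fault-vs-invalidation
	 * lock in shared mode so truncate and hole punch cannot race with
	 * the fault-around path mapping already-uptodate pagecache pages
	 * into the page tables.
	 */
	down_read(&EXT4_I(inode)->i_mmap_sem);
	filemap_map_pages(vmf, start_pgoff, end_pgoff);
	up_read(&EXT4_I(inode)->i_mmap_sem);
}

and then pointing .map_pages at it in ext4's file vm_operations instead
of at filemap_map_pages() directly. The other filesystems listed above
would presumably do the same with whatever lock their own ->fault()
implementation takes.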
It is interesting to look at how this bug came to be duplicated in so many
filesystems, because there are lessons to be learned.

Commit f1820361f83d ("mm: implement ->map_pages for page cache") added the
->map_pages() operation, and its commit message said:

"...It should be safe to use filemap_map_pages() for ->map_pages() if
filesystem use filemap_fault() for ->fault()."

At the time, all of the aforementioned filesystems used filemap_fault()
for ->fault(). But since then, ext4, xfs, f2fs and, just recently, gfs2
have added their own filesystem ->fault() operations. orangefs has since
added vm_operations, and zonefs was added since then, probably copying the
mmap_sem handling from ext4; both also have a filesystem ->fault()
operation.

It was surprising for me to see that some of the filesystem developers who
signed off on the added ->fault() operations are not strangers to mm. The
recent gfs2 change was even reviewed by an established mm developer [1].

So what can we learn from this case study? How could we fix the interface
to avoid repeating the same mistake in the future?

Thanks,
Amir.

[1] https://lore.kernel.org/linux-fsdevel/20200703113801.GD25523@casper.infradead.org/
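P.S. To make the interface question a bit more concrete, one purely
illustrative direction (all names below are made up, this is not a
proposal) would be to encode the locking rule in the helper's signature,
so a filesystem cannot wire up fault-around without being confronted with
its fault-vs-invalidation lock:

/*
 * Illustrative sketch only: a fault-around helper that takes the
 * filesystem's invalidation lock as an explicit argument, so that
 * forgetting it is visible at the call site instead of becoming a
 * silent hole-punch race.
 */
static void filemap_map_pages_locked(struct vm_fault *vmf,
				     pgoff_t start_pgoff, pgoff_t end_pgoff,
				     struct rw_semaphore *invalidate_lock)
{
	down_read(invalidate_lock);
	filemap_map_pages(vmf, start_pgoff, end_pgoff);
	up_read(invalidate_lock);
}

Another conceivable direction would be to move such a lock into generic
code so the filemap helpers can take it themselves, but that is a much
bigger change.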