From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.3 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED,USER_AGENT_SANE_1 autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1A42C433E2 for ; Thu, 17 Sep 2020 07:55:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 991F720658 for ; Thu, 17 Sep 2020 07:55:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726340AbgIQHzt (ORCPT ); Thu, 17 Sep 2020 03:55:49 -0400 Received: from mx2.suse.de ([195.135.220.15]:38418 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726312AbgIQHys (ORCPT ); Thu, 17 Sep 2020 03:54:48 -0400 X-Greylist: delayed 791 seconds by postgrey-1.27 at vger.kernel.org; Thu, 17 Sep 2020 03:53:54 EDT X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 50DC9AC92; Thu, 17 Sep 2020 07:40:46 +0000 (UTC) Received: by quack2.suse.cz (Postfix, from userid 1000) id 647241E12E1; Thu, 17 Sep 2020 09:40:30 +0200 (CEST) Date: Thu, 17 Sep 2020 09:40:30 +0200 From: Jan Kara To: Nikolay Borisov Cc: Dave Chinner , Jan Kara , Amir Goldstein , Andreas Gruenbacher , Theodore Tso , Martin Brandenburg , Mike Marshall , Damien Le Moal , Jaegeuk Kim , Qiuyang Sun , linux-xfs , linux-fsdevel , Linux MM , linux-kernel , Matthew Wilcox , Linus Torvalds , "Kirill A. Shutemov" , Andrew Morton , Al Viro , nborisov@suse.de Subject: Re: More filesystem need this fix (xfs: use MMAPLOCK around filemap_map_pages()) Message-ID: <20200917074030.GA9555@quack2.suse.cz> References: <20200623052059.1893966-1-david@fromorbit.com> <20200916155851.GA1572@quack2.suse.cz> <20200917014454.GZ12131@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu 17-09-20 08:37:17, Nikolay Borisov wrote: > On 17.09.20 г. 4:44 ч., Dave Chinner wrote: > > On Wed, Sep 16, 2020 at 05:58:51PM +0200, Jan Kara wrote: > >> On Sat 12-09-20 09:19:11, Amir Goldstein wrote: > >>> On Tue, Jun 23, 2020 at 8:21 AM Dave Chinner wrote: > > > > > > > So.... > > > > P0 p1 > > > > hole punch starts > > takes XFS_MMAPLOCK_EXCL > > truncate_pagecache_range() > > unmap_mapping_range(start, end) > > > > > > do_fault_around() > > ->map_pages > > filemap_map_pages() > > page mapping valid, > > page is up to date > > maps PTEs > > > > truncate_inode_pages_range() > > truncate_cleanup_page(page) > > invalidates page > > delete_from_page_cache_batch(page) > > frees page > > > > > > That doesn't seem good to me. > > > > Sure, maybe the page hasn't been freed back to the free lists > > because of elevated refcounts. But it's been released by the > > filesystem and not longer in the page cache so nothing good can come > > of this situation... > > > > AFAICT, this race condition exists for the truncate case as well > > as filemap_map_pages() doesn't have a "page beyond inode size" check > > in it. > > (It's not relevant to the discussion at hand but for the sake of > completeness): > > It does have a check: > > max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); > if (page->index >= max_idx) > goto unlock; Yes, but this does something meaningful only for truncate. For other operations such as hole punch this check doesn't bring anything. That's why only filesystems supporting hole punching and similar advanced operations need an equivalent of mmap_lock. Honza -- Jan Kara SUSE Labs, CR