From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Sat, 31 Jul 2021 20:38:00 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
X-X-Sender: hugh@ripple.anvils
To: Yang Shi <shy828301@gmail.com>
cc: Hugh Dickins <hughd@google.com>, Andrew Morton <akpm@linux-foundation.org>, 
    Shakeel Butt <shakeelb@google.com>, 
    "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>, 
    Miaohe Lin <linmiaohe@huawei.com>, Mike Kravetz <mike.kravetz@oracle.com>, 
    Michal Hocko <mhocko@suse.com>, Rik van Riel <riel@surriel.com>, 
    Christoph Hellwig <hch@infradead.org>, 
    Matthew Wilcox <willy@infradead.org>, 
    "Eric W. Biederman" <ebiederm@xmission.com>, 
    Alexey Gladkov <legion@kernel.org>, 
    Chris Wilson <chris@chris-wilson.co.uk>, 
    Matthew Auld <matthew.auld@intel.com>, 
    Linux FS-devel Mailing List <linux-fsdevel@vger.kernel.org>, 
    Linux Kernel Mailing List <linux-kernel@vger.kernel.org>, 
    linux-api@vger.kernel.org, Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH 01/16] huge tmpfs: fix fallocate(vanilla) advance over
 huge pages
In-Reply-To: <CAHbLzkqp5-SrOBkpvxieswD6OwPT70gsztNpXCTBXW2JnrFpfg@mail.gmail.com>
Message-ID: <422db5c4-2490-749c-964b-dd2b93286ed5@google.com>
References: <2862852d-badd-7486-3a8e-c5ea9666d6fb@google.com> <af71608e-ecc-af95-3511-1a62cbf8d751@google.com> <CAHbLzkqp5-SrOBkpvxieswD6OwPT70gsztNpXCTBXW2JnrFpfg@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Fri, 30 Jul 2021, Yang Shi wrote:
> On Fri, Jul 30, 2021 at 12:25 AM Hugh Dickins <hughd@google.com> wrote:
> >
> > shmem_fallocate() goes to a lot of trouble to leave its newly allocated
> > pages !Uptodate, partly to identify and undo them on failure, partly to
> > leave the overhead of clearing them until later.  But the huge page case
> > did not skip to the end of the extent, walked through the tail pages one
> > by one, and appeared to work just fine: but in doing so, cleared and
> > Uptodated the huge page, so there was no way to undo it on failure.
> >
> > Now advance immediately to the end of the huge extent, with a comment on
> > why this is more than just an optimization.  But although this speeds up
> > huge tmpfs fallocation, it does leave the clearing until first use, and
> > some users may have come to appreciate slow fallocate but fast first use:
> > if they complain, then we can consider adding a pass to clear at the end.
> >
> > Fixes: 800d8c63b2e9 ("shmem: add huge pages support")
> > Signed-off-by: Hugh Dickins <hughd@google.com>
> 
> Reviewed-by: Yang Shi <shy828301@gmail.com>

Many thanks for reviewing so many of these.

> 
> A nit below:
> 
> > ---
> >  mm/shmem.c | 19 ++++++++++++++++---
> >  1 file changed, 16 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 70d9ce294bb4..0cd5c9156457 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2736,7 +2736,7 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
> >         inode->i_private = &shmem_falloc;
> >         spin_unlock(&inode->i_lock);
> >
> > -       for (index = start; index < end; index++) {
> > +       for (index = start; index < end; ) {
> >                 struct page *page;
> >
> >                 /*
> > @@ -2759,13 +2759,26 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
> >                         goto undone;
> >                 }
> >
> > +               index++;
> > +               /*
> > +                * Here is a more important optimization than it appears:
> > +                * a second SGP_FALLOC on the same huge page will clear it,
> > +                * making it PageUptodate and un-undoable if we fail later.
> > +                */
> > +               if (PageTransCompound(page)) {
> > +                       index = round_up(index, HPAGE_PMD_NR);
> > +                       /* Beware 32-bit wraparound */
> > +                       if (!index)
> > +                               index--;
> > +               }
> > +
> >                 /*
> >                  * Inform shmem_writepage() how far we have reached.
> >                  * No need for lock or barrier: we have the page lock.
> >                  */
> > -               shmem_falloc.next++;
> >                 if (!PageUptodate(page))
> > -                       shmem_falloc.nr_falloced++;
> > +                       shmem_falloc.nr_falloced += index - shmem_falloc.next;
> > +               shmem_falloc.next = index;
> 
> This also fixed the wrong accounting of nr_falloced, so it should be
> able to avoid returning -ENOMEM prematurely IIUC. Is it worth
> mentioning in the commit log?

It took me a long time to see your point there.  Ah yes: because the old
loop made the whole huge page Uptodate when it reached the first tail page,
there would have been only one nr_falloced++ for the whole of the huge
page.  Well spotted, thanks: I hadn't realized that.
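
A rough user-space sketch of that accounting, not the kernel code: it
assumes HPAGE_NR is 512 (2MB huge page over 4KB base pages), and the names
sim_getpage(), sim_old_loop(), sim_new_loop() and advance_index() are
hypothetical, made up for illustration.  It models one huge page whose
second SGP_FALLOC lookup clears it and marks it Uptodate.

```c
#include <assert.h>
#include <stdbool.h>

#define HPAGE_NR 512UL	/* assumed: base pages per huge page */

struct sim_result {
	unsigned long nr_falloced;
	bool uptodate;		/* true: huge page can no longer be undone */
};

/* Toy stand-in for an SGP_FALLOC lookup on one huge page. */
static void sim_getpage(bool *present, bool *uptodate)
{
	if (!*present) {
		*present = true;	/* newly allocated, left !Uptodate */
		return;
	}
	*uptodate = true;		/* found again: cleared, made Uptodate */
}

/* Step past the page just handled; skip to the extent end if it is huge. */
static unsigned long advance_index(unsigned long index, bool huge)
{
	assert((HPAGE_NR & (HPAGE_NR - 1)) == 0);	/* power of two */
	index++;
	if (huge) {
		/* same arithmetic as round_up(index, HPAGE_NR) */
		index = ((index - 1) | (HPAGE_NR - 1)) + 1;
		if (!index)	/* wrapped past the top: back off by one */
			index--;
	}
	return index;
}

/* Old behaviour: walk every subpage of the huge page. */
static struct sim_result sim_old_loop(void)
{
	bool present = false, uptodate = false;
	struct sim_result r = { 0, false };

	for (unsigned long index = 0; index < HPAGE_NR; index++) {
		sim_getpage(&present, &uptodate);
		if (!uptodate)
			r.nr_falloced++;
	}
	r.uptodate = uptodate;
	return r;
}

/* Patched behaviour: one lookup, then jump to the end of the extent. */
static struct sim_result sim_new_loop(void)
{
	bool present = false, uptodate = false;
	struct sim_result r = { 0, false };
	unsigned long next = 0;

	for (unsigned long index = 0; index < HPAGE_NR; ) {
		sim_getpage(&present, &uptodate);
		index = advance_index(index, true);
		if (!uptodate)
			r.nr_falloced += index - next;
		next = index;
	}
	r.uptodate = uptodate;
	return r;
}
```

In this model the old loop ends with nr_falloced at 1 and the page
Uptodate, while the patched loop ends with nr_falloced at 512 and the page
still !Uptodate.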

Though I'm not so sure about your premature -ENOMEM.  Once the old loop
has made the huge page Uptodate, the other end (shmem_writepage()) will
not be incrementing nr_unswapped at all, so -ENOMEM would have been
deferred rather than premature, wouldn't it?
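
To spell that out with a toy model (again user-space C, not the kernel
code; struct falloc_state, sim_writepage() and sim_falloc_check() are
hypothetical names): the fallocate loop gives up with -ENOMEM once
nr_unswapped exceeds nr_falloced, but writepage only counts pages that are
still !Uptodate.

```c
#include <errno.h>
#include <stdbool.h>

struct falloc_state {
	unsigned long nr_falloced;	/* pages fallocated, still !Uptodate */
	unsigned long nr_unswapped;	/* writepage attempts that were refused */
};

/*
 * Toy shmem_writepage() accounting during a fallocate.
 * Returns true if the page was counted and kept (redirtied).
 */
static bool sim_writepage(struct falloc_state *st, bool page_uptodate)
{
	if (page_uptodate)
		return false;	/* ordinary data: not counted here */
	st->nr_unswapped++;
	return true;
}

/* The check made by the fallocate loop before each further allocation. */
static int sim_falloc_check(const struct falloc_state *st)
{
	return st->nr_unswapped > st->nr_falloced ? -ENOMEM : 0;
}
```

With the undercounted nr_falloced of 1 but an Uptodate huge page,
sim_writepage() never bumps nr_unswapped, so the check keeps passing; only
with !Uptodate pages under reclaim pressure can it trip.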

As for mentioning this in the commit log: yes, I guess so, but I haven't
worked out what to write yet.

Hugh

> 
> >
> >                 /*
> >                  * If !PageUptodate, leave it that way so that freeable pages
> > --
> > 2.26.2