All of lore.kernel.org
 help / color / mirror / Atom feed
From: liuyuntao <liuyuntao10@huawei.com>
To: <hughd@google.com>
Cc: <akpm@linux-foundation.org>, <kirill.shutemov@linux.intel.com>,
	<kirill@shutemov.name>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <liusirui@huawei.com>,
	<liuyuntao10@huawei.com>, <windspectator@gmail.com>,
	<wuxu.wu@huawei.com>
Subject: Re: [PATCH v2] fix judgment error in shmem_is_huge()
Date: Sun, 26 Sep 2021 14:42:01 +0800	[thread overview]
Message-ID: <20210926064201.3416154-1-liuyuntao10@huawei.com> (raw)
In-Reply-To: <614538e2-16bb-2657-f374-64195c5c7c2@google.com>

On Sat, 25 Sep 2021, Hugh Dickins wrote:
> On Fri, 24 Sep 2021, Hugh Dickins wrote:
> > On Thu, 9 Sep 2021, Liu Yuntao wrote:
> > 
> > > In the case of SHMEM_HUGE_WITHIN_SIZE, the page index is not rounded
> > > up correctly. When the page index points to the first page in a huge
> > > page, round_up() cannot bring it to the end of the huge page, but
> > > to the end of the previous one.
> > > 
> > > an example:
> > > HPAGE_PMD_NR on my machine is 512(2 MB huge page size).
> > > After allcoating a 3000 KB buffer, I access it at location 2050 KB.
> > 
> > Your example is certainly helpful, but weird!  It's not impossible,
> > but wouldn't it be easier to understand if you said "2048 KB" there?

I wanted to emphasize that access to any bit in the first page will
trigger this problem, so I didn't use "2048 KB".

> > 
> > > In shmem_is_huge(), the corresponding index happens to be 512.
> > > After rounded up by HPAGE_PMD_NR, it will still be 512 which is
> > > smaller than i_size, and shmem_is_huge() will return true.
> > > As a result, my buffer takes an additional huge page, and that
> > > shouldn't happen when shmem_enabled is set to within_size.
> > 
> > A colleague very recently opened my eyes to within_size on shmem_enabled:
> > I've always been dubious of both, but they can work quite well together.
> > 
> > > 
> > > Fixes: f3f0e1d2150b2b ("khugepaged: add support of collapse for tmpfs/shmem pages")
> > > Signed-off-by: Liu Yuntao <liuyuntao10@huawei.com>
> > 
> > Thanks, with a nice simplification from Kirill.
> > 
> > Acked-by: Hugh Dickins <hughd@google.com>
> 
> Andrew has just sent this on to Linus - thanks - and that's fine:
> no need to get in the way of that.
> 
> But since replying, I have remembered more history, and there is
> something that we need to be aware of.
> 
> Whereas to you this is a straightforward off-by-one (or off-by-page)
> fix, it also results in a significant change in behaviour - I'd say
> usually for the better, but some might be surprised.  This patch has
> Kirill's Ack and my Ack, and I hope and believe that we can get away
> with the change in behaviour: but let's be aware of it.
> 
> The change that concerns me is when, for example, copying a large
> file into a huge=within_size tmpfs (or more generally, just writing
> to the file by appending at EOF in the usual way).
> 
> With the old WITHIN_SIZE code, the first 2MB was allocated in small
> pages, then subsequent 2MB extents were allocated with huge pages;
> including the final extent, even if it only needed a single byte.
> 
> I always thought that was very clunky behaviour, the small pages
> coming at the wrong end of the file; and that's why I was dubious
> about it as a sensible filesystem mount option.  But I was under
> the impression that it was the intended behaviour.
> 
> With your new WITHIN_SIZE code, all those appending allocations
> are outside i_size, and the whole file is allocated in small pages.
> (Then maybe later on khugepaged can assemble huge pages for it.)
> 
> Your patch makes within_size more sensible than it was for pre-sized
> files (and I think it's fair to say that the majority of files in
> shmem's internal mount, subject to thp/shmem_enabled, are likely to
> be fixed-size files); and better-defined than it used to be on
> growing files, but they won't get the huge pages they used to.

Although my patch changes shmem's behaviour, it makes shmem consistent
with the documentation. I think with the new code, it will be easier
for our users to understand.

> 
> Hugh

  reply	other threads:[~2021-09-26  6:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-09  3:20 [PATCH v2] fix judgment error in shmem_is_huge() Liu Yuntao
2021-09-09  6:17 ` Kirill A. Shutemov
2021-09-24 21:31 ` Hugh Dickins
2021-09-24 21:31   ` Hugh Dickins
2021-09-25  0:15   ` Hugh Dickins
2021-09-25  0:15     ` Hugh Dickins
2021-09-26  6:42     ` liuyuntao [this message]
2021-09-26 20:01       ` Hugh Dickins
2021-09-26 20:01         ` Hugh Dickins

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210926064201.3416154-1-liuyuntao10@huawei.com \
    --to=liuyuntao10@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=liusirui@huawei.com \
    --cc=windspectator@gmail.com \
    --cc=wuxu.wu@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.