From: Matthew Wilcox <willy@infradead.org>
To: Hugh Dickins <hughd@google.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	cgel.zte@gmail.com, kirill@shutemov.name, songliubraving@fb.com,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, yang.yang29@zte.com.cn,
	wang.yong12@zte.com.cn, Zeal Robot <zealci@zte.com.cn>
Subject: Re: [PATCH linux-next] Fix shmem huge page failed to set F_SEAL_WRITE attribute problem
Date: Thu, 17 Feb 2022 13:43:40 +0000
Message-ID: <Yg5RDDRLVsuT/Rfw@casper.infradead.org>
In-Reply-To: <d6e74520-88bc-9f57-e189-8e4f389726e@google.com>

On Wed, Feb 16, 2022 at 05:25:17PM -0800, Hugh Dickins wrote:
> On Wed, 16 Feb 2022, Mike Kravetz wrote:
> > On 2/14/22 23:37, cgel.zte@gmail.com wrote:
> > > From: wangyong <wang.yong12@zte.com.cn>
> > > 
> > > After enabling transparent hugepage support for tmpfs with the
> > > following command:
> > >  echo always > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> > > the docker program's attempt to add F_SEAL_WRITE fails with EBUSY:
> > >  fcntl(5, F_ADD_SEALS, F_SEAL_WRITE) = -1
> > > 
> > > In the memfd_wait_for_pins() function, the page_count of the
> > > hugepage is 512 and its page_mapcount is 0, so the page is treated
> > > as pinned by the check:
> > >  page_count(page) - page_mapcount(page) != 1
> > > But the page is not busy at this time. Therefore the order of the
> > > hugepage should be taken into account in the calculation.
> > > 
> > > Reported-by: Zeal Robot <zealci@zte.com.cn>
> > > Signed-off-by: wangyong <wang.yong12@zte.com.cn>
> > > ---
> > >  mm/memfd.c | 16 +++++++++++++---
> > >  1 file changed, 13 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/mm/memfd.c b/mm/memfd.c
> > > index 9f80f162791a..26d1d390a22a 100644
> > > --- a/mm/memfd.c
> > > +++ b/mm/memfd.c
> > > @@ -31,6 +31,7 @@
> > >  static void memfd_tag_pins(struct xa_state *xas)
> > >  {
> > >  	struct page *page;
> > > +	int count = 0;
> > >  	unsigned int tagged = 0;
> > >  
> > >  	lru_add_drain();
> > > @@ -39,8 +40,12 @@ static void memfd_tag_pins(struct xa_state *xas)
> > >  	xas_for_each(xas, page, ULONG_MAX) {
> > >  		if (xa_is_value(page))
> > >  			continue;
> > > +
> > >  		page = find_subpage(page, xas->xa_index);
> > > -		if (page_count(page) - page_mapcount(page) > 1)
> > > +		count = page_count(page);
> > > +		if (PageTransCompound(page))
> > 
> > PageTransCompound() is true for hugetlb pages as well as THP.  And, hugetlb
> > pages will not have a ref per subpage as THP does.  So, I believe this will
> > break hugetlb seal usage.
> 
> Yes, I think so too; and that is not the only issue with the patch
> (I don't think page_mapcount is enough, I had to use total_mapcount).
> 
> It's a good find, and thank you WangYong for the report.
> I found the same issue when testing my MFD_HUGEPAGE patch last year,
> and devised a patch to fix it (and keep MFD_HUGETLB working) then; but
> never sent that in because there wasn't time to re-present MFD_HUGEPAGE.
> 
> I'm currently retesting my patch: just found something failing which
> I thought should pass; but maybe I'm confused, or maybe the xarray is
> working differently now.  I'm rushing to reply now because I don't want
> others to waste their own time on it.

I did change how the XArray works for THP recently.

Kirill's original patch stored:

512: p
513: p+1
514: p+2
...
1023: p+511

A couple of years ago, I changed it to store:

512: p
513: p
514: p
...
1023: p

And in January, Linus merged the commit which changes it to:

512-575: p
576-639: (sibling of 512)
640-703: (sibling of 512)
...
960-1023: (sibling of 512)

That is, I removed a level of the tree and store sibling entries
rather than duplicate entries.  That wasn't for fun; I needed to do
that in order to make msync() work with large folios.  Commit
6b24ca4a1a8d has more detail and hopefully can inspire whatever
changes you need to make to your patch.


