From: Matthew Wilcox <willy@infradead.org>
To: John Hubbard <jhubbard@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
	Khalid Aziz <khalid.aziz@oracle.com>,
	Steven Sistare <steven.sistare@oracle.com>,
	akpm@linux-foundation.org, ying.huang@intel.com,
	mgorman@techsingularity.net, baolin.wang@linux.alibaba.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Khalid Aziz <khalid@kernel.org>
Subject: Re: [PATCH v4] mm, compaction: Skip all non-migratable pages during scan
Date: Mon, 29 May 2023 01:31:49 +0100
Message-ID: <ZHPydXSAfRq8sh0u@casper.infradead.org>
In-Reply-To: <9821bd9c-7c30-8f0c-68e4-6b1d312bc032@nvidia.com>

On Sun, May 28, 2023 at 04:49:52PM -0700, John Hubbard wrote:
> On 5/26/23 20:18, Matthew Wilcox wrote:
> > On Fri, May 26, 2023 at 07:11:05PM -0700, John Hubbard wrote:
> > > > So any user with 1024 processes can fragment physical memory? :/
> > > > 
> > > > Sorry, I'd like to minimize the usage of folio_maybe_dma_pinned().
> > > 
> > > I was actually thinking that we should minimize any more cases of
> > > fragile mapcount and refcount comparison, which then leads to
> > > Matthew's approach here!
> > 
> > I was wondering if we shouldn't make folio_maybe_dma_pinned() a little
> > more accurate.  eg:
> > 
> > > 	if (folio_test_large(folio))
> > > 		return atomic_read(&folio->_pincount) > 0;
> > > 	return (unsigned)(folio_ref_count(folio) - folio_mapcount(folio)) >=
> > > 			GUP_PIN_COUNTING_BIAS;
> 
> I'm trying to figure out what might be wrong with that, but it seems
> OK. We must have talked about this earlier, but I recall vaguely that
> there was not a lot of concern about the case of a page being mapped
> more than 1024 times. Because pinned or not, it's likely to be effectively
> locked into memory due to LRU effects. As mentioned here, too.

That was my point of view, but David convinced me that a hostile process
can effectively lock its own memory into place.
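
To make the concern concrete, here is a minimal userspace sketch of the two
small-folio checks being compared. It is a model, not kernel code:
`struct small_folio_model`, both `maybe_pinned_*` helpers and
`check_examples()` are hypothetical names invented for illustration, and the
assumption is that each mapping and the scanner each hold one reference while
a pin adds GUP_PIN_COUNTING_BIAS (1024) to the refcount.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel constant; everything else is a userspace model. */
#define GUP_PIN_COUNTING_BIAS (1U << 10)

/* Hypothetical stand-in for the counters of a small (order-0) folio. */
struct small_folio_model {
	int refcount;
	int mapcount;
};

/* Heuristic that looks at the refcount alone. */
static bool maybe_pinned_refcount_only(const struct small_folio_model *f)
{
	return (unsigned int)f->refcount >= GUP_PIN_COUNTING_BIAS;
}

/* Heuristic suggested in the thread: discount references held by mappings. */
static bool maybe_pinned_minus_mapcount(const struct small_folio_model *f)
{
	return (unsigned int)(f->refcount - f->mapcount) >= GUP_PIN_COUNTING_BIAS;
}

static void check_examples(void)
{
	/* 1024 mappings plus the scanner's own reference; nothing is pinned. */
	struct small_folio_model mapped = { .refcount = 1025, .mapcount = 1024 };
	/* One mapping, one pin (adds the bias), plus the scanner's reference. */
	struct small_folio_model pinned = { .refcount = 1026, .mapcount = 1 };

	assert(maybe_pinned_refcount_only(&mapped));   /* false positive */
	assert(!maybe_pinned_minus_mapcount(&mapped)); /* 1025 - 1024 = 1 */
	assert(maybe_pinned_minus_mapcount(&pinned));  /* 1026 - 1 = 1025 */
}
```

Under those assumptions, 1024 plain mappings are enough to make the
refcount-only check report "maybe pinned", while subtracting the mapcount
does not.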

> Anyway, sure.
> 
> A detail:
> 
> The unsigned cast, I'm not sure that helps or solves anything, right?
> That is, other than bugs, is it possible to get refcount < mapcount?
> 
> And if it's only due to bugs, then the casting, again, isn't likely to
> mitigate the fallout from whatever mess the bug caused.

I wasn't thinking too hard about the cast.  If the caller has the folio
lock, I don't think it's possible for refcount < mapcount.  This caller
has a refcount, but doesn't hold the lock, so it is possible for them
to read mapcount first, then have both mapcount and refcount decremented
and see refcount < mapcount.
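
The effect of the cast in that window can be sketched in userspace C. This is
an illustrative model only: `maybe_pinned_from_samples()` and its `use_cast`
parameter are hypothetical, and the two arguments stand for the values the
caller happened to sample, with mapcount read before the concurrent unmaps
and refcount read after them.

```c
#include <assert.h>
#include <stdbool.h>

/* Same value as the kernel constant; the helper below is a userspace model. */
#define GUP_PIN_COUNTING_BIAS (1U << 10)

/*
 * Hypothetical helper: apply the comparison to two separately sampled
 * counters, with or without the unsigned cast.
 */
static bool maybe_pinned_from_samples(int refcount_sample, int mapcount_sample,
				      bool use_cast)
{
	int diff = refcount_sample - mapcount_sample;

	if (use_cast)
		/* A negative diff wraps to a huge unsigned value. */
		return (unsigned int)diff >= GUP_PIN_COUNTING_BIAS;
	return diff >= (int)GUP_PIN_COUNTING_BIAS;
}

static void check_race_window(void)
{
	/*
	 * Folio mapped three times, refcount 4 (mappings plus the caller).
	 * The caller reads mapcount == 3, two unmaps then drop both counters,
	 * and the caller reads refcount == 2, so the difference is -1.
	 */
	assert(maybe_pinned_from_samples(2, 3, true));   /* reported pinned */
	assert(!maybe_pinned_from_samples(2, 3, false)); /* reported unpinned */
	/* Consistent samples agree regardless of the cast: 4 - 3 = 1. */
	assert(!maybe_pinned_from_samples(4, 3, true));
	assert(!maybe_pinned_from_samples(4, 3, false));
}
```

With the cast, a torn read errs towards "maybe pinned"; without it, towards
"not pinned". Either answer is only a snapshot that the next read could
contradict.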

I don't think it matters too much.  We don't hold the folio lock, so the
folio might go from pinned to unpinned at any moment, just as a refcount
might be decremented or a mapcount incremented.  What's important is that
a hostile process can't prevent memory from being moved indefinitely.

David, have I missed something else?
