All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: David Hildenbrand <david@redhat.com>
Cc: "zhangliang (AG)" <zhangliang5@huawei.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux-MM <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	wangzhigang17@huawei.com,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] mm: reuse the unshared swapcache page in do_wp_page
Date: Thu, 20 Jan 2022 15:36:55 +0000	[thread overview]
Message-ID: <YemBl4ZVtJqtAVwV@casper.infradead.org> (raw)
In-Reply-To: <e2580cfa-a529-934d-861a-091c4a9714d4@redhat.com>

On Thu, Jan 20, 2022 at 04:26:22PM +0100, David Hildenbrand wrote:
> On 20.01.22 15:39, Matthew Wilcox wrote:
> > On Thu, Jan 20, 2022 at 03:15:37PM +0100, David Hildenbrand wrote:
> >> On 17.01.22 14:31, zhangliang (AG) wrote:
> >>> Sure, I will do that :)
> >>
> >> I'm polishing up / testing the patches and might send something out for discussion shortly.
> >> Just a note that on my branch was a version with a wrong condition that should have been fixed now.
> >>
> >> I am still thinking about PTE mapped THP. For these, we'll always
> >> have page_count() > 1, essentially corresponding to the number of still-mapped sub-pages.
> >>
> >> So if we end up with a R/O mapped part of a THP, we'll always have to COW and cannot reuse ever,
> >> although it's really just a single process mapping the THP via PTEs.
> >>
> >> One approach would be to scan the currently locked page table for entries mapping
> >> this same page. If page_count() corresponds to that value, we know that only we are
> >> mapping the THP and there are no additional references. That would be a special case
> >> if we find an anon THP in do_wp_page(). Hm.
> > 
> > You're starting to optimise for some pretty weird cases at that point.
> 
> So your claim is that read-only, PTE mapped pages are weird? How do you
> come to that conclusion?

Because normally anon THP pages are PMD mapped.  That's rather
the point of anon THPs.

> If we adjust the THP reuse logic to split on additional references
> (page_count() == 1) -- similarly as suggested by Linus to fix the CVE --
> we're going to end up with exactly that more frequently.

I don't understand.  Are we talking past each other?  As I understand
the situation we're talking about here, a process has created a THP,
done something to cause it to be partially mapped (or mapped in a
misaligned way) in its own address space, then forked, and we're
trying to figure out if it's safe to reuse it?  I say that situation is
rare enough that it's OK to always allocate an order-0 page and
copy into it.

> > Anon THP is always going to start out aligned (and can be moved by
> > mremap()).  Arguably it should be broken up if it's moved so it can be
> > reformed into aligned THPs by khugepaged.
> 
> Can you elaborate, I'm missing the point where something gets moved. I
> don't care about mremap() at all here.
> 
> 
> 1. You have a read-only, PTE mapped THP
> 2. Write fault on the THP
> 3. We PTE-map the THP because we run into a false positive in our COW
>    logic to handle COW on PTE
> 4. Write fault on the PTE
> 5. We always have to COW each and every sub-page and can never reuse,
>    because page_count() > 1
> 
> That's essentially what reuse_swap_page() tried to handle before.
> Eventually optimizing for this is certainly the next step, but I'd like
> to document which effect the removal of reuse_swap_page() will have to THP.

I'm talking about step 0.  How do we get a read-only, PTE-mapped THP?
Through mremap() or perhaps through an mprotect()/mmap()/munmap() that
failed to split the THP.

  reply	other threads:[~2022-01-20 15:37 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-13 14:03 [PATCH] mm: reuse the unshared swapcache page in do_wp_page Liang Zhang
2022-01-13 14:39 ` Matthew Wilcox
2022-01-13 14:46   ` David Hildenbrand
2022-01-13 15:02     ` Matthew Wilcox
2022-01-13 15:04       ` David Hildenbrand
2022-01-13 16:37   ` Linus Torvalds
2022-01-13 16:48     ` David Hildenbrand
2022-01-13 17:14       ` Linus Torvalds
2022-01-13 17:25         ` David Hildenbrand
2022-01-13 17:44           ` Linus Torvalds
2022-01-13 17:55             ` David Hildenbrand
2022-01-13 18:55               ` Linus Torvalds
2022-01-13 21:07             ` Matthew Wilcox
2022-01-13 22:21               ` Linus Torvalds
2022-01-14  5:00       ` zhangliang (AG)
2022-01-14 11:23         ` David Hildenbrand
2022-01-17  2:11           ` zhangliang (AG)
2022-01-17 12:58             ` David Hildenbrand
2022-01-17 13:31               ` zhangliang (AG)
2022-01-20 14:15                 ` David Hildenbrand
2022-01-20 14:39                   ` Matthew Wilcox
2022-01-20 15:26                     ` David Hildenbrand
2022-01-20 15:36                       ` Matthew Wilcox [this message]
2022-01-20 15:39                         ` David Hildenbrand
2022-01-20 15:45                           ` Matthew Wilcox
2022-01-20 15:51                             ` David Hildenbrand
2022-01-20 16:09                               ` Matthew Wilcox
2022-01-20 16:35                                 ` David Hildenbrand
2022-01-20 15:37                       ` Linus Torvalds
2022-01-20 15:46                         ` David Hildenbrand
2022-01-20 17:22                           ` Linus Torvalds
2022-01-20 17:49                             ` David Hildenbrand
2022-01-20 17:48                   ` Nadav Amit
2022-01-20 18:00                     ` David Hildenbrand
2022-01-20 18:11                       ` Nadav Amit
2022-01-20 18:19                         ` David Hildenbrand
2022-01-20 19:55                         ` David Hildenbrand
2022-01-20 20:07                           ` Matthew Wilcox
2022-01-20 20:09                             ` David Hildenbrand
2022-01-20 20:37                               ` David Hildenbrand
2022-01-20 20:46                                 ` Nadav Amit
2022-01-20 20:49                                   ` David Hildenbrand
2022-01-21  9:01                                     ` David Hildenbrand
2022-01-21 17:43                                       ` Nadav Amit
2022-01-20 20:18                           ` David Hildenbrand
2022-01-14  3:29   ` zhangliang (AG)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YemBl4ZVtJqtAVwV@casper.infradead.org \
    --to=willy@infradead.org \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=torvalds@linux-foundation.org \
    --cc=wangzhigang17@huawei.com \
    --cc=zhangliang5@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.