From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "rppt@kernel.org" <rppt@kernel.org>
Cc: "kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>,
	"Hansen, Dave" <dave.hansen@intel.com>,
	"luto@kernel.org" <luto@kernel.org>,
	"x86@kernel.org" <x86@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"linux-hardening@vger.kernel.org"
	<linux-hardening@vger.kernel.org>,
	"Weiny, Ira" <ira.weiny@intel.com>
Subject: Re: [PATCH RFC 3/9] x86/mm/cpa: Add grouped page allocations
Date: Mon, 10 May 2021 19:38:37 +0000
Message-ID: <bcfe16c4b87c791b20aa1fb8090c01ed7ac4961a.camel@intel.com>
In-Reply-To: <YJet66kzbb6UB5Qe@kernel.org>

On Sun, 2021-05-09 at 12:39 +0300, Mike Rapoport wrote:
> On Wed, May 05, 2021 at 09:57:17PM +0000, Edgecombe, Rick P wrote:
> > On Wed, 2021-05-05 at 21:45 +0300, Mike Rapoport wrote:
> > > On Wed, May 05, 2021 at 03:09:12PM +0200, Peter Zijlstra wrote:
> > > > On Wed, May 05, 2021 at 03:08:27PM +0300, Mike Rapoport wrote:
> > > > > > On Tue, May 04, 2021 at 05:30:26PM -0700, Rick Edgecombe wrote:
> > > > > > > For x86, setting memory permissions on the direct map results in
> > > > > > > fracturing large pages. Direct map fracturing can be reduced by
> > > > > > > locating pages that will have their permissions set close together.
> > > > > > > 
> > > > > > > Create a simple page cache that allocates pages from huge page size
> > > > > > > blocks. Don't guarantee that a page will come from a huge page
> > > > > > > grouping, instead fallback to non-grouped pages to fulfill the
> > > > > > > allocation if needed. Also, register a shrinker such that the system
> > > > > > > can ask for the pages back if needed. Since this is only needed when
> > > > > > > there is a direct map, compile it out on highmem systems.
> > > > > 
> > > > > > I only had time to skim through the patches, I like the idea of
> > > > > > having a simple cache that allocates larger pages with a fallback to
> > > > > > basic page size.
> > > > > > 
> > > > > > I just think it should be more generic and closer to the page
> > > > > > allocator. I was thinking about adding a GFP flag that will tell that
> > > > > > the allocated pages should be removed from the direct map. Then
> > > > > > alloc_pages() could use such cache whenever this GFP flag is specified
> > > > > > with a fallback for lower order allocations.
> > > > 
> > > > > That doesn't provide enough information I think. Removing from direct
> > > > > map isn't the only consideration, you also want to group them by the
> > > > > target protection bits such that we don't get to use 4k pages quite so
> > > > > much.
> > > 
> > > > Unless I'm missing something we anyway hand out 4k pages from the cache
> > > > and the neighbouring 4k may end up with different protections.
> > > > 
> > > > This is also similar to what happens in the set Rick posted a while ago
> > > > to support grouped vmalloc allocations:
> > > 
> > 
> > > One issue is with the shrinker callbacks. If you are just trying to
> > > reset and free a single page because the system is low on memory, it
> > > could be problematic to have to break a large page, which would require
> > > another page.
> 
> I don't follow you here. Maybe I've misread the patches but AFAIU the large
> page is broken at allocation time and 4k pages remain 4k pages afterwards.

Yea that's right.

I thought Peter was saying that if the page allocator grouped all pages
with the same permission together, it could often leave the direct map
as large pages, and so the page allocator would have to know about
permissions.

So I was just trying to say that, to leave large pages on the direct
map, the shrinker has to handle breaking a large page when freeing a
single page. That would have to be addressed to get large pages with
permissions in the first place.

It doesn't seem impossible to solve I guess, so maybe not an important
point. It could maybe just hold a page in reserve.

Now that I think about it, since this PKS tables series holds all
potentially needed direct map page tables in reserve, it shouldn't
actually be a problem for this case. So this could leave the PKS tables
pages as large on the direct map.
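
To make the reserve idea concrete, here is a minimal userspace sketch, not
kernel code and not taken from the series. It only models the constraint
discussed above: splitting a large direct map mapping into 4k mappings
consumes one page (for the new page table), so the split can only succeed
if such a page was set aside beforehand. The names pt_reserve,
direct_map_block and split_large() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pool of pages set aside ahead of time for page tables. */
struct pt_reserve {
	int nr_pages;
};

/* Hypothetical state of one 2M-aligned region of the direct map. */
struct direct_map_block {
	bool is_large;	/* true while still mapped by a single 2M entry */
};

/*
 * Split a large mapping into 4k mappings. This needs one page for the
 * new page table, which is taken from the reserve so that no allocation
 * is required while the system is low on memory.
 * Returns false if the split is needed but the reserve is empty.
 */
static bool split_large(struct direct_map_block *b, struct pt_reserve *r)
{
	assert(r->nr_pages >= 0);

	if (!b->is_large)
		return true;	/* already split, nothing to consume */
	if (r->nr_pages == 0)
		return false;	/* no page available for the page table */

	r->nr_pages--;
	b->is_large = false;
	return true;
}
```

With one page in the reserve, the first split succeeds and a second split
of a different block fails until the reserve is refilled.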

> In my understanding the problem with a simple shrinker is that even if we
> have the entire 2M free it is not being reinstated as 2M page in the direct
> mapping.

Yea, that is a downside to this simple shrinker. 
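
As an illustration of what a less simple shrinker would need to track, here
is a userspace sketch under stated assumptions. It is not the code from the
patch. It assumes 512 4k pages per 2M block and assumes the mapping is split
when the first page is handed out. All names (grouped_block, block_alloc(),
block_free()) are hypothetical. The point is only that a per-block use count
makes it possible to notice when an entire 2M block is free again and could
be a candidate for being reinstated as a large page.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGES_PER_BLOCK 512	/* 4k pages in one 2M block */

/* Hypothetical bookkeeping for one 2M-aligned block of the cache. */
struct grouped_block {
	bool in_use[PAGES_PER_BLOCK];	/* true while the 4k page is handed out */
	int nr_in_use;
	bool mapped_4k;			/* direct map for this block was split */
};

static void block_init(struct grouped_block *b)
{
	memset(b, 0, sizeof(*b));
}

/* Hand out the lowest free 4k page; returns its index, or -1 if full. */
static int block_alloc(struct grouped_block *b)
{
	for (int i = 0; i < PAGES_PER_BLOCK; i++) {
		if (!b->in_use[i]) {
			b->in_use[i] = true;
			b->nr_in_use++;
			b->mapped_4k = true;	/* assumed: split at allocation time */
			return i;
		}
	}
	return -1;
}

/*
 * Return a page to the block. Returns true when the block has become
 * entirely free while its mapping is split, i.e. when it is a candidate
 * for being restored to a single 2M mapping. Invalid or double frees
 * are ignored and return false.
 */
static bool block_free(struct grouped_block *b, int idx)
{
	if (idx < 0 || idx >= PAGES_PER_BLOCK || !b->in_use[idx])
		return false;

	assert(b->nr_in_use > 0);
	b->in_use[idx] = false;
	b->nr_in_use--;
	return b->nr_in_use == 0 && b->mapped_4k;
}
```

The sketch stops at detecting the candidate block; actually restoring the
2M mapping is the part the simple shrinker does not do.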


Thread overview: 32+ messages
2021-05-05  0:30 [PATCH RFC 0/9] PKS write protected page tables Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 1/9] list: Support getting most recent element in list_lru Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 2/9] list: Support list head not in object for list_lru Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 3/9] x86/mm/cpa: Add grouped page allocations Rick Edgecombe
2021-05-05 12:08   ` Mike Rapoport
2021-05-05 13:09     ` Peter Zijlstra
2021-05-05 18:45       ` Mike Rapoport
2021-05-05 21:57         ` Edgecombe, Rick P
2021-05-09  9:39           ` Mike Rapoport
2021-05-10 19:38             ` Edgecombe, Rick P [this message]
2021-05-05  0:30 ` [PATCH RFC 4/9] mm: Explicitly zero page table lock ptr Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 5/9] x86, mm: Use cache of page tables Rick Edgecombe
2021-05-05  8:51   ` Peter Zijlstra
2021-05-05 12:09     ` Mike Rapoport
2021-05-05 13:19       ` Peter Zijlstra
2021-05-05 21:54         ` Edgecombe, Rick P
2021-05-06 17:59       ` Matthew Wilcox
2021-05-06 18:24   ` Shakeel Butt
2021-05-07 16:27     ` Edgecombe, Rick P
2021-05-05  0:30 ` [PATCH RFC 6/9] x86/mm/cpa: Add set_memory_pks() Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 7/9] x86/mm/cpa: Add perm callbacks to grouped pages Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 8/9] x86, mm: Protect page tables with PKS Rick Edgecombe
2021-05-05  0:30 ` [PATCH RFC 9/9] x86, cpa: PKS protect direct map page tables Rick Edgecombe
2021-05-05  2:03 ` [PATCH RFC 0/9] PKS write protected " Ira Weiny
2021-05-05  6:25 ` Kees Cook
2021-05-05  8:37   ` Peter Zijlstra
2021-05-05 18:38     ` Kees Cook
2021-05-05 19:51   ` Edgecombe, Rick P
2021-05-06  0:00   ` Ira Weiny
2021-05-05 11:08 ` Vlastimil Babka
2021-05-05 11:56   ` Peter Zijlstra
2021-05-05 19:46     ` Edgecombe, Rick P
