All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "luto@kernel.org" <luto@kernel.org>, "tycho@tycho.ws" <tycho@tycho.ws>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
	"ast@kernel.org" <ast@kernel.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"jeyu@kernel.org" <jeyu@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"jannh@google.com" <jannh@google.com>,
	"nadav.amit@gmail.com" <nadav.amit@gmail.com>,
	"Dock, Deneen T" <deneen.t.dock@intel.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>,
	"kernel-hardening@lists.openwall.com" 
	<kernel-hardening@lists.openwall.com>,
	"mhiramat@kernel.org" <mhiramat@kernel.org>,
	"naveen.n.rao@linux.vnet.ibm.com"
	<naveen.n.rao@linux.vnet.ibm.com>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Hansen, Dave" <dave.hansen@intel.com>
Subject: Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages
Date: Thu, 6 Dec 2018 20:19:35 +0000	[thread overview]
Message-ID: <f6096b80bdab59d2d21ece4ff31fcfd36bf6b809.camel@intel.com> (raw)
In-Reply-To: <CALCETrUmxht8dibJPBbPudQnoe6mHsKocEBgkJ7O1eFrVBfekQ@mail.gmail.com>

On Thu, 2018-12-06 at 11:19 -0800, Andy Lutomirski wrote:
> On Thu, Dec 6, 2018 at 11:01 AM Tycho Andersen <tycho@tycho.ws> wrote:
> > 
> > On Thu, Dec 06, 2018 at 10:53:50AM -0800, Andy Lutomirski wrote:
> > > > If we are going to unmap the linear alias, why not do it at vmalloc()
> > > > time rather than vfree() time?
> > > 
> > > That’s not totally nuts. Do we ever have code that expects __va() to
> > > work on module data?  Perhaps crypto code trying to encrypt static
> > > data because our APIs don’t understand virtual addresses.  I guess if
> > > highmem is ever used for modules, then we should be fine.
> > > 
> > > RO instead of not present might be safer.  But I do like the idea of
> > > renaming Rick's flag to something like VM_XPFO or VM_NO_DIRECT_MAP and
> > > making it do all of this.
> > 
> > Yeah, doing it for everything automatically seemed like it was/is
> > going to be a lot of work to debug all the corner cases where things
> > expect memory to be mapped but don't explicitly say it. And in
> > particular, the XPFO series only does it for user memory, whereas an
> > additional flag like this would work for extra paranoid allocations
> > of kernel memory too.
> > 
> 
> I just read the code, and I looks like vmalloc() is already using
> highmem (__GFP_HIGH) if available, so, on big x86_32 systems, for
> example, we already don't have modules in the direct map.
> 
> So I say we go for it.  This should be quite simple to implement --
> the pageattr code already has almost all the needed logic on x86.  The
> only arch support we should need is a pair of functions to remove a
> vmalloc address range from the address map (if it was present in the
> first place) and a function to put it back.  On x86, this should only
> be a few lines of code.
> 
> What do you all think?  This should solve most of the problems we have.
> 
> If we really wanted to optimize this, we'd make it so that
> module_alloc() allocates memory the normal way, then, later on, we
> call some function that, all at once, removes the memory from the
> direct map and applies the right permissions to the vmalloc alias (or
> just makes the vmalloc alias not-present so we can add permissions
> later without flushing), and flushes the TLB.  And we arrange for
> vunmap to zap the vmalloc range, then put the memory back into the
> direct map, then free the pages back to the page allocator, with the
> flush in the appropriate place.
> 
> I don't see why the page allocator needs to know about any of this.
> It's already okay with the permissions being changed out from under it
> on x86, and it seems fine.  Rick, do you want to give some variant of
> this a try?
Hi,

Sorry, I've been having email troubles today.

I found some cases where vmap with PAGE_KERNEL_RO happens, which would not set
NP/RO in the directmap, so it would be sort of inconsistent whether the
directmap of vmalloc range allocations were readable or not. I couldn't see any
places where it would cause problems today though.

I was ready to assume that all TLBs don't cache NP, because I don't know how
usages where a page fault is used to load something could work without lots of
flushes. If that's the case, then all archs with directmap permissions could
share a single vmalloc special permission flush implementation that works like
Andy described originally. It could be controlled with an
ARCH_HAS_DIRECT_MAP_PERMS. We would just need something like set_pages_np and
set_pages_rw on any archs with directmap permissions. So seems simpler to me
(and what I have been doing) unless I'm missing the problem.

If you all think so I can indeed take a shot at it, I just don't see what the
problem was with the original solution, that seems less likely to break
anything.

Thanks,

Rick

WARNING: multiple messages have this Message-ID (diff)
From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "luto@kernel.org" <luto@kernel.org>, "tycho@tycho.ws" <tycho@tycho.ws>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"daniel@iogearbox.net" <daniel@iogearbox.net>,
	"ard.biesheuvel@linaro.org" <ard.biesheuvel@linaro.org>,
	"ast@kernel.org" <ast@kernel.org>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"jeyu@kernel.org" <jeyu@kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"jannh@google.com" <jannh@google.com>,
	"nadav.amit@gmail.com" <nadav.amit@gmail.com>,
	"Dock, Deneen T" <deneen.t.dock@intel.com>,
	"peterz@infradead.org" <peterz@infradead.org>,
	"kristen@linux.intel.com" <kristen@linux.intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"Keshavamurthy, Anil S" <anil.s.keshavamurthy@intel.com>,
	"kernel-hard
Subject: Re: [PATCH 1/2] vmalloc: New flag for flush before releasing pages
Date: Thu, 6 Dec 2018 20:19:35 +0000	[thread overview]
Message-ID: <f6096b80bdab59d2d21ece4ff31fcfd36bf6b809.camel@intel.com> (raw)
In-Reply-To: <CALCETrUmxht8dibJPBbPudQnoe6mHsKocEBgkJ7O1eFrVBfekQ@mail.gmail.com>

On Thu, 2018-12-06 at 11:19 -0800, Andy Lutomirski wrote:
> On Thu, Dec 6, 2018 at 11:01 AM Tycho Andersen <tycho@tycho.ws> wrote:
> > 
> > On Thu, Dec 06, 2018 at 10:53:50AM -0800, Andy Lutomirski wrote:
> > > > If we are going to unmap the linear alias, why not do it at vmalloc()
> > > > time rather than vfree() time?
> > > 
> > > That’s not totally nuts. Do we ever have code that expects __va() to
> > > work on module data?  Perhaps crypto code trying to encrypt static
> > > data because our APIs don’t understand virtual addresses.  I guess if
> > > highmem is ever used for modules, then we should be fine.
> > > 
> > > RO instead of not present might be safer.  But I do like the idea of
> > > renaming Rick's flag to something like VM_XPFO or VM_NO_DIRECT_MAP and
> > > making it do all of this.
> > 
> > Yeah, doing it for everything automatically seemed like it was/is
> > going to be a lot of work to debug all the corner cases where things
> > expect memory to be mapped but don't explicitly say it. And in
> > particular, the XPFO series only does it for user memory, whereas an
> > additional flag like this would work for extra paranoid allocations
> > of kernel memory too.
> > 
> 
> I just read the code, and I looks like vmalloc() is already using
> highmem (__GFP_HIGH) if available, so, on big x86_32 systems, for
> example, we already don't have modules in the direct map.
> 
> So I say we go for it.  This should be quite simple to implement --
> the pageattr code already has almost all the needed logic on x86.  The
> only arch support we should need is a pair of functions to remove a
> vmalloc address range from the address map (if it was present in the
> first place) and a function to put it back.  On x86, this should only
> be a few lines of code.
> 
> What do you all think?  This should solve most of the problems we have.
> 
> If we really wanted to optimize this, we'd make it so that
> module_alloc() allocates memory the normal way, then, later on, we
> call some function that, all at once, removes the memory from the
> direct map and applies the right permissions to the vmalloc alias (or
> just makes the vmalloc alias not-present so we can add permissions
> later without flushing), and flushes the TLB.  And we arrange for
> vunmap to zap the vmalloc range, then put the memory back into the
> direct map, then free the pages back to the page allocator, with the
> flush in the appropriate place.
> 
> I don't see why the page allocator needs to know about any of this.
> It's already okay with the permissions being changed out from under it
> on x86, and it seems fine.  Rick, do you want to give some variant of
> this a try?
Hi,

Sorry, I've been having email troubles today.

I found some cases where vmap with PAGE_KERNEL_RO happens, which would not set
NP/RO in the directmap, so it would be sort of inconsistent whether the
directmap of vmalloc range allocations were readable or not. I couldn't see any
places where it would cause problems today though.

I was ready to assume that all TLBs don't cache NP, because I don't know how
usages where a page fault is used to load something could work without lots of
flushes. If that's the case, then all archs with directmap permissions could
share a single vmalloc special permission flush implementation that works like
Andy described originally. It could be controlled with an
ARCH_HAS_DIRECT_MAP_PERMS. We would just need something like set_pages_np and
set_pages_rw on any archs with directmap permissions. So seems simpler to me
(and what I have been doing) unless I'm missing the problem.

If you all think so I can indeed take a shot at it, I just don't see what the
problem was with the original solution, that seems less likely to break
anything.

Thanks,

Rick

  parent reply	other threads:[~2018-12-06 20:19 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-28  0:07 [PATCH 0/2] Don’t leave executable TLB entries to freed pages Rick Edgecombe
2018-11-28  0:07 ` [PATCH 1/2] vmalloc: New flag for flush before releasing pages Rick Edgecombe
2018-12-04  0:04   ` Edgecombe, Rick P
2018-12-04  0:04     ` Edgecombe, Rick P
2018-12-04  0:04     ` Edgecombe, Rick P
2018-12-04  0:04     ` Edgecombe, Rick P
2018-12-04  1:43   ` Nadav Amit
2018-12-04 16:03     ` Will Deacon
2018-12-04 20:02       ` Edgecombe, Rick P
2018-12-04 20:02         ` Edgecombe, Rick P
2018-12-04 20:02         ` Edgecombe, Rick P
2018-12-04 20:02         ` Edgecombe, Rick P
2018-12-04 20:09         ` Andy Lutomirski
2018-12-04 20:09           ` Andy Lutomirski
2018-12-04 23:52           ` Edgecombe, Rick P
2018-12-04 23:52             ` Edgecombe, Rick P
2018-12-04 23:52             ` Edgecombe, Rick P
2018-12-05  1:57             ` Andy Lutomirski
2018-12-05  1:57               ` Andy Lutomirski
2018-12-05  1:57               ` Andy Lutomirski
2018-12-05 11:41           ` Will Deacon
2018-12-05 11:41             ` Will Deacon
2018-12-05 23:16             ` Andy Lutomirski
2018-12-05 23:16               ` Andy Lutomirski
2018-12-06  7:29               ` Ard Biesheuvel
2018-12-06  7:29                 ` Ard Biesheuvel
2018-12-06 11:10                 ` Will Deacon
2018-12-06 11:10                   ` Will Deacon
2018-12-06 18:53                 ` Andy Lutomirski
2018-12-06 18:53                   ` Andy Lutomirski
2018-12-06 19:01                   ` Tycho Andersen
2018-12-06 19:01                     ` Tycho Andersen
2018-12-06 19:19                     ` Andy Lutomirski
2018-12-06 19:19                       ` Andy Lutomirski
2018-12-06 19:39                       ` Nadav Amit
2018-12-06 19:39                         ` Nadav Amit
2018-12-06 20:17                         ` Andy Lutomirski
2018-12-06 20:17                           ` Andy Lutomirski
2018-12-06 23:08                           ` Nadav Amit
2018-12-06 23:08                             ` Nadav Amit
2018-12-07  3:06                             ` Edgecombe, Rick P
2018-12-07  3:06                               ` Edgecombe, Rick P
2018-12-07  3:06                               ` Edgecombe, Rick P
2018-12-06 20:19                       ` Edgecombe, Rick P [this message]
2018-12-06 20:19                         ` Edgecombe, Rick P
2018-12-06 20:19                         ` Edgecombe, Rick P
2018-12-06 20:26                         ` Andy Lutomirski
2018-12-06 20:26                           ` Andy Lutomirski
2018-12-06 19:04                   ` Ard Biesheuvel
2018-12-06 19:04                     ` Ard Biesheuvel
2018-12-06 19:20                     ` Andy Lutomirski
2018-12-06 19:20                       ` Andy Lutomirski
2018-12-06 19:23                       ` Ard Biesheuvel
2018-12-06 19:23                         ` Ard Biesheuvel
2018-12-06 19:31                         ` Will Deacon
2018-12-06 19:31                           ` Will Deacon
2018-12-06 19:36                           ` Ard Biesheuvel
2018-12-06 19:36                             ` Ard Biesheuvel
2018-12-04 20:36         ` Nadav Amit
2018-12-04 20:36           ` Nadav Amit
2018-12-04 20:36           ` Nadav Amit
2018-12-04 23:51           ` Edgecombe, Rick P
2018-12-04 23:51             ` Edgecombe, Rick P
2018-12-05  0:01             ` Nadav Amit
2018-12-05  0:01               ` Nadav Amit
2018-12-05  0:01               ` Nadav Amit
2018-12-05  0:29               ` Edgecombe, Rick P
2018-12-05  0:29                 ` Edgecombe, Rick P
2018-12-05  0:29                 ` Edgecombe, Rick P
2018-12-05  0:53                 ` Nadav Amit
2018-12-05  0:53                   ` Nadav Amit
2018-12-05  0:53                   ` Nadav Amit
2018-12-05  1:45                   ` Edgecombe, Rick P
2018-12-05  1:45                     ` Edgecombe, Rick P
2018-12-05  1:45                     ` Edgecombe, Rick P
2018-12-05  2:09                     ` Nadav Amit
2018-12-05  2:09                       ` Nadav Amit
2018-12-05  2:09                       ` Nadav Amit
2018-12-04 18:56     ` Andy Lutomirski
2018-12-04 18:56       ` Andy Lutomirski
2018-12-04 19:44       ` Nadav Amit
2018-12-04 19:44         ` Nadav Amit
2018-12-04 19:48         ` Andy Lutomirski
2018-12-04 19:48           ` Andy Lutomirski
2018-12-04 22:48           ` Nadav Amit
2018-12-04 22:48             ` Nadav Amit
2018-12-04 23:27             ` Andy Lutomirski
2018-12-04 23:27               ` Andy Lutomirski
2018-12-04 23:34               ` Nadav Amit
2018-12-04 23:34                 ` Nadav Amit
2018-12-05  1:09             ` Edgecombe, Rick P
2018-12-05  1:09               ` Edgecombe, Rick P
2018-12-05  1:09               ` Edgecombe, Rick P
2018-12-05  1:45               ` Nadav Amit
2018-12-05  1:45                 ` Nadav Amit
2018-12-05  1:45                 ` Nadav Amit
2018-11-28  0:07 ` [PATCH 2/2] x86/modules: Make x86 allocs to flush when free Rick Edgecombe
2018-11-28 23:11   ` Andrew Morton
2018-11-29  0:02     ` Edgecombe, Rick P
2018-11-29  0:02       ` Edgecombe, Rick P
2018-11-29  0:02       ` Edgecombe, Rick P
2018-11-29  1:40   ` Andy Lutomirski
2018-11-29  1:40     ` Andy Lutomirski
2018-11-29  6:14     ` Edgecombe, Rick P
2018-11-29  6:14       ` Edgecombe, Rick P
2018-11-29  6:14       ` Edgecombe, Rick P
2018-11-28  1:06 ` [PATCH 0/2] Don’t leave executable TLB entries to freed pages Nadav Amit
2018-11-28  1:21   ` Nadav Amit
2018-11-28  9:57     ` Will Deacon
2018-11-28 18:29       ` Nadav Amit
2018-11-29 14:06 ` Masami Hiramatsu
2018-11-29 18:49   ` Edgecombe, Rick P
2018-11-29 18:49     ` Edgecombe, Rick P
2018-11-29 18:49     ` Edgecombe, Rick P
2018-11-29 23:19     ` Masami Hiramatsu
2018-11-29 23:19       ` Masami Hiramatsu
2018-11-29 23:19       ` Masami Hiramatsu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f6096b80bdab59d2d21ece4ff31fcfd36bf6b809.camel@intel.com \
    --to=rick.p.edgecombe@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=anil.s.keshavamurthy@intel.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=ast@kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=dave.hansen@intel.com \
    --cc=davem@davemloft.net \
    --cc=deneen.t.dock@intel.com \
    --cc=jannh@google.com \
    --cc=jeyu@kernel.org \
    --cc=kernel-hardening@lists.openwall.com \
    --cc=kristen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mhiramat@kernel.org \
    --cc=mingo@redhat.com \
    --cc=nadav.amit@gmail.com \
    --cc=naveen.n.rao@linux.vnet.ibm.com \
    --cc=netdev@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tycho@tycho.ws \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.