* [LSF/MM/BPF TOPIC] Restricted kernel address spaces
@ 2020-02-06 16:59 Mike Rapoport
  2020-02-07 17:39 ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Rapoport @ 2020-02-06 16:59 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-mm


Restricted mappings in kernel mode may improve mitigation of hardware
speculation vulnerabilities and minimize the damage exploitable kernel bugs
can cause.

There are several ongoing efforts to use restricted address spaces in
the Linux kernel for various use cases:
* speculation vulnerabilities mitigation in KVM [1]
* support for memory areas visible only in a single owning context, or more
  generically, memory areas with more restrictive protection than the
  defaults ("secret" memory) [2], [3], [4]
* hardening of Linux containers [ no reference yet :) ]

Last year we had vague ideas and possible directions; this year we have
several real challenges and design decisions we'd like to discuss:

* "Secret" memory userspace APIs

  Should such an API follow "native" MM interfaces like mmap(), mprotect()
  and madvise(), or would it be better to use a file descriptor, e.g. like
  memfd_create() does?

  MM "native" APIs would require VM_something flag and probably a page flag
  or page_ext. With file-descriptor VM_SPECIAL and custom implementation of
  .mmap() and .fault() would suffice. On the other hand, mmap() and
  mprotect() seem better fit semantically and they could be more easily
  adopted by the userspace.
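
  To make the comparison concrete, here is how the two styles could look
  from userspace. This is a sketch only: both MAP_SECRET and memfd_secret()
  are names made up for illustration, not existing interfaces.

	#include <sys/mman.h>
	#include <unistd.h>

	/* MM "native" style: a new mmap() flag; MAP_SECRET is made up */
	void *secret_alloc_mmap(size_t len)
	{
		return mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS | MAP_SECRET, -1, 0);
	}

	/* file descriptor style, a la memfd_create(); memfd_secret() is
	 * a made-up syscall wrapper, fd lifetime handling is trimmed */
	void *secret_alloc_fd(size_t len)
	{
		int fd = memfd_secret(0);

		if (fd < 0 || ftruncate(fd, len))
			return MAP_FAILED;
		return mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_SHARED, fd, 0);
	}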

* Direct/linear map fragmentation

  Whenever we want to drop some mappings from the direct map or even change
  the protection bits for some memory area, the gigantic and huge pages
  that comprise the direct map need to be broken and there's no THP for the
  kernel page tables to collapse them back. Moreover, the existing API
  defined in <asm/set_memory.h> by several architectures does not really
  presume it would be widely used.

  For the "secret" memory use-case the fragmentation can be minimized by
  caching large pages, using them to satisfy smaller "secret" allocations,
  and then collapsing them back once the "secret" memory is freed. Another
  possibility is to pre-allocate physical memory at boot time.
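
  As a minimal sketch of the caching scheme (all the secret_*() helper
  names below are made up, and the bookkeeping of sub-allocations inside
  a cached large page is omitted):

	#include <linux/gfp.h>
	#include <linux/list.h>
	#include <linux/mm.h>

	static LIST_HEAD(secret_page_pool);

	static struct page *secret_alloc_page(gfp_t gfp)
	{
		struct page *huge;

		if (list_empty(&secret_page_pool)) {
			huge = alloc_pages(gfp | __GFP_COMP,
					   PMD_SHIFT - PAGE_SHIFT);
			if (!huge)
				return NULL;
			/* drop the whole large page from the direct map
			 * in one go, so it is never split into 4K pages */
			secret_remove_from_direct_map(huge);
			list_add(&huge->lru, &secret_page_pool);
		}
		/* carve a 4K page out of a cached large page; when the
		 * last sub-page is freed, the large mapping is restored */
		return secret_carve_subpage(&secret_page_pool);
	}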

  Yet another idea is to make the page allocator aware of the direct map layout.

* Kernel page table management

  Currently we presume that only one kernel page table exists (well,
  mostly) and the page table abstraction is required only for the user page
  tables. As such, we presume that 'page table == struct mm_struct' and the
  mm_struct is used all over by the operations that manage the page tables.

  The management of the restricted address space in the kernel requires the
  ability to create, update and remove kernel contexts the same way we do
  for the userspace.

  One way is to overload the mm_struct, like EFI and text poking did. But
  it is quite overkill, because most of the mm_struct contains
  information required only to manage user mappings.

  My suggestion is to introduce a first class abstraction for the page
  table and then it could be used in the same way for user and kernel
  context management. For now I have a very basic POC that split several
  fields from the mm_struct into a new 'struct pg_table' [5]. This new
  abstraction can be used e.g. by the PTI implementation of page table
  cloning and the KVM ASI work.
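
  To illustrate the direction (the actual field split in the POC branch
  may differ), such an abstraction could start with just the fields that
  describe the page table itself:

	/* an illustrative sketch, not the layout from [5] */
	struct pg_table {
		pgd_t *pgd;			/* root of the page table */
		spinlock_t page_table_lock;	/* protects page tables */
		atomic_long_t pgtables_bytes;	/* page table accounting */
	};

	struct mm_struct {
		struct pg_table pgt;	/* was: pgd, lock, counters */
		/* ... everything else keeps managing user mappings:
		 * VMAs, mmap_sem, counters, etc. ... */
	};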


[1] https://lore.kernel.org/lkml/1557758315-12667-1-git-send-email-alexandre.chartre@oracle.com/
[2] https://lore.kernel.org/lkml/20190612170834.14855-1-mhillenb@amazon.de/
[3] https://lore.kernel.org/lkml/1572171452-7958-1-git-send-email-rppt@kernel.org/
[4] https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
[5] https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=pg_table/v0.0

-- 
Sincerely yours,
Mike.




* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-06 16:59 [LSF/MM/BPF TOPIC] Restricted kernel address spaces Mike Rapoport
@ 2020-02-07 17:39 ` Kirill A. Shutemov
  2020-02-11 17:20   ` Mike Rapoport
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2020-02-07 17:39 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: lsf-pc, linux-mm

On Thu, Feb 06, 2020 at 06:59:00PM +0200, Mike Rapoport wrote:
> 
> Restricted mappings in kernel mode may improve mitigation of hardware
> speculation vulnerabilities and minimize the damage exploitable kernel bugs
> can cause.
> 
> There are several ongoing efforts to use restricted address spaces in
> the Linux kernel for various use cases:
> * speculation vulnerabilities mitigation in KVM [1]
> * support for memory areas visible only in a single owning context, or more
>   generically, memory areas with more restrictive protection than the
>   defaults ("secret" memory) [2], [3], [4]
> * hardening of Linux containers [ no reference yet :) ]
> 
> Last year we had vague ideas and possible directions; this year we have
> several real challenges and design decisions we'd like to discuss:
> 
> * "Secret" memory userspace APIs
> 
>   Should such an API follow "native" MM interfaces like mmap(), mprotect()
>   and madvise(), or would it be better to use a file descriptor, e.g. like
>   memfd_create() does?

I don't really see a point in such a file descriptor. It is supposed to be
very private secret data. What functionality provided by a file descriptor
do you see as valuable in this scenario?

A file descriptor makes it easier to spill the secrets to other processes:
over fork(), a UNIX socket or via /proc/PID/fd/.

>   MM "native" APIs would require VM_something flag and probably a page flag
>   or page_ext. With file-descriptor VM_SPECIAL and custom implementation of
>   .mmap() and .fault() would suffice. On the other hand, mmap() and
>   mprotect() seem better fit semantically and they could be more easily
>   adopted by the userspace.

You mix up implementation and interface. You can provide an interface which
doesn't require a file descriptor, but still use a magic file internally to
make the VMA distinct.

> * Direct/linear map fragmentation
> 
>   Whenever we want to drop some mappings from the direct map or even change
>   the protection bits for some memory area, the gigantic and huge pages
>   that comprise the direct map need to be broken and there's no THP for the
>   kernel page tables to collapse them back. Moreover, the existing API
>   defined in <asm/set_memory.h> by several architectures does not really
>   presume it would be widely used.
> 
>   For the "secret" memory use-case the fragmentation can be minimized by
>   caching large pages, using them to satisfy smaller "secret" allocations,
>   and then collapsing them back once the "secret" memory is freed. Another
>   possibility is to pre-allocate physical memory at boot time.

I would rather go with the pre-allocation path, at least at first. We can
always come up with a more dynamic and complicated solution later if the
interface is widely adopted.

>   Yet another idea is to make the page allocator aware of the direct map layout.
> 
> * Kernel page table management
> 
>   Currently we presume that only one kernel page table exists (well,
>   mostly) and the page table abstraction is required only for the user page
>   tables. As such, we presume that 'page table == struct mm_struct' and the
>   mm_struct is used all over by the operations that manage the page tables.
> 
>   The management of the restricted address space in the kernel requires the
>   ability to create, update and remove kernel contexts the same way we do
>   for the userspace.
> 
>   One way is to overload the mm_struct, like EFI and text poking did. But
>   it is quite overkill, because most of the mm_struct contains
>   information required only to manage user mappings.

In what way is it overkill? Just memory overhead? How many such
contexts do you expect to see in the system?

>   My suggestion is to introduce a first class abstraction for the page
>   table and then it could be used in the same way for user and kernel
>   context management. For now I have a very basic POC that split several
>   fields from the mm_struct into a new 'struct pg_table' [5]. This new
>   abstraction can be used e.g. by the PTI implementation of page table
>   cloning and the KVM ASI work.
> 
> 
> [1] https://lore.kernel.org/lkml/1557758315-12667-1-git-send-email-alexandre.chartre@oracle.com/
> [2] https://lore.kernel.org/lkml/20190612170834.14855-1-mhillenb@amazon.de/
> [3] https://lore.kernel.org/lkml/1572171452-7958-1-git-send-email-rppt@kernel.org/
> [4] https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
> [5] https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=pg_table/v0.0
> 
> -- 
> Sincerely yours,
> Mike.
> 
> 

-- 
 Kirill A. Shutemov



* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-07 17:39 ` Kirill A. Shutemov
@ 2020-02-11 17:20   ` Mike Rapoport
  2020-02-11 21:53     ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Rapoport @ 2020-02-11 17:20 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Fri, Feb 07, 2020 at 08:39:09PM +0300, Kirill A. Shutemov wrote:
> On Thu, Feb 06, 2020 at 06:59:00PM +0200, Mike Rapoport wrote:
> > 
> > Restricted mappings in kernel mode may improve mitigation of hardware
> > speculation vulnerabilities and minimize the damage exploitable kernel bugs
> > can cause.
> > 
> > There are several ongoing efforts to use restricted address spaces in
> > the Linux kernel for various use cases:
> > * speculation vulnerabilities mitigation in KVM [1]
> > * support for memory areas visible only in a single owning context, or more
> >   generically, memory areas with more restrictive protection than the
> >   defaults ("secret" memory) [2], [3], [4]
> > * hardening of Linux containers [ no reference yet :) ]
> > 
> > Last year we had vague ideas and possible directions; this year we have
> > several real challenges and design decisions we'd like to discuss:
> > 
> > * "Secret" memory userspace APIs
> > 
> >   Should such an API follow "native" MM interfaces like mmap(), mprotect()
> >   and madvise(), or would it be better to use a file descriptor, e.g. like
> >   memfd_create() does?
> 
> I don't really see a point in such a file descriptor. It is supposed to be
> very private secret data. What functionality provided by a file descriptor
> do you see as valuable in this scenario?
> 
> A file descriptor makes it easier to spill the secrets to other processes:
> over fork(), a UNIX socket or via /proc/PID/fd/.

On the other hand, it may be desirable to share a secret between several
processes. Then a UNIX socket or fork() actually becomes handy.
 
> >   MM "native" APIs would require a VM_something flag and probably a page
> >   flag or page_ext. With a file descriptor, VM_SPECIAL and custom
> >   implementations of .mmap() and .fault() would suffice. On the other hand,
> >   mmap() and mprotect() seem a better fit semantically and they could be
> >   more easily adopted by userspace.
> 
> You mix up implementation and interface. You can provide an interface which
> doesn't require a file descriptor, but still use a magic file internally to
> make the VMA distinct.

If I understand correctly, if we go with the mmap(MAP_SECRET) example,
mmap() would implicitly create a magic file with its .mmap() and .fault()
implementing the protection? That's a possibility. But then, if we already
have a file, why not let the user get a handle for it and allow fine-grained
control over its sharing between processes?
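
Something along the lines of how ksys_mmap_pgoff() already handles
MAP_HUGETLB with hugetlb_file_setup(), I presume. As a sketch, with
MAP_SECRET and secretmem_file_setup() being made-up names:

	/* in ksys_mmap_pgoff(); secretmem_file_setup() would return a
	 * file whose .mmap()/.fault() ops enforce the protection */
	if (flags & MAP_SECRET) {
		file = secretmem_file_setup(len);
		if (IS_ERR(file))
			return PTR_ERR(file);
	}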

> > * Direct/linear map fragmentation
> > 
> >   Whenever we want to drop some mappings from the direct map or even change
> >   the protection bits for some memory area, the gigantic and huge pages
> >   that comprise the direct map need to be broken and there's no THP for the
> >   kernel page tables to collapse them back. Moreover, the existing API
> >   defined in <asm/set_memory.h> by several architectures does not really
> >   presume it would be widely used.
> > 
> >   For the "secret" memory use-case the fragmentation can be minimized by
> >   caching large pages, using them to satisfy smaller "secret" allocations,
> >   and then collapsing them back once the "secret" memory is freed. Another
> >   possibility is to pre-allocate physical memory at boot time.
> 
> I would rather go with the pre-allocation path, at least at first. We can
> always come up with a more dynamic and complicated solution later if the
> interface is widely adopted.

We still must manage the "secret" allocations, so I don't think that the
dynamic solution will be much more complicated.

> >   Yet another idea is to make the page allocator aware of the direct map layout.
> > 
> > * Kernel page table management
> > 
> >   Currently we presume that only one kernel page table exists (well,
> >   mostly) and the page table abstraction is required only for the user page
> >   tables. As such, we presume that 'page table == struct mm_struct' and the
> >   mm_struct is used all over by the operations that manage the page tables.
> > 
> >   The management of the restricted address space in the kernel requires the
> >   ability to create, update and remove kernel contexts the same way we do
> >   for the userspace.
> > 
> >   One way is to overload the mm_struct, like EFI and text poking did. But
> >   it is quite overkill, because most of the mm_struct contains
> >   information required only to manage user mappings.
> 
> In what way is it overkill? Just memory overhead? How many such
> contexts do you expect to see in the system?

Well, the memory overhead is not that big, but it's not negligible. For the
KVM ASI use case, for instance, there will be at least as many contexts as
running VMs. We also have thoughts about how to make namespaces use
restricted address spaces; for that use case there will be quite a lot of
such contexts.

Besides, it does not feel right to have the mm_struct represent a page
table.
 
> >   My suggestion is to introduce a first class abstraction for the page
> >   table and then it could be used in the same way for user and kernel
> >   context management. For now I have a very basic POC that split several
> >   fields from the mm_struct into a new 'struct pg_table' [5]. This new
> >   abstraction can be used e.g. by the PTI implementation of page table
> >   cloning and the KVM ASI work.
> > 
> > 
> > [1] https://lore.kernel.org/lkml/1557758315-12667-1-git-send-email-alexandre.chartre@oracle.com/
> > [2] https://lore.kernel.org/lkml/20190612170834.14855-1-mhillenb@amazon.de/
> > [3] https://lore.kernel.org/lkml/1572171452-7958-1-git-send-email-rppt@kernel.org/
> > [4] https://lore.kernel.org/lkml/20200130162340.GA14232@rapoport-lnx/
> > [5] https://git.kernel.org/pub/scm/linux/kernel/git/rppt/linux.git/log/?h=pg_table/v0.0
> > 
> 
> -- 
>  Kirill A. Shutemov

-- 
Sincerely yours,
Mike.



* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-11 17:20   ` Mike Rapoport
@ 2020-02-11 21:53     ` Kirill A. Shutemov
  2020-02-16  6:35       ` Mike Rapoport
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2020-02-11 21:53 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Tue, Feb 11, 2020 at 07:20:47PM +0200, Mike Rapoport wrote:
> On Fri, Feb 07, 2020 at 08:39:09PM +0300, Kirill A. Shutemov wrote:
> > On Thu, Feb 06, 2020 at 06:59:00PM +0200, Mike Rapoport wrote:
> > > 
> > > Restricted mappings in kernel mode may improve mitigation of hardware
> > > speculation vulnerabilities and minimize the damage exploitable kernel bugs
> > > can cause.
> > > 
> > > There are several ongoing efforts to use restricted address spaces in
> > > the Linux kernel for various use cases:
> > > * speculation vulnerabilities mitigation in KVM [1]
> > > * support for memory areas visible only in a single owning context, or more
> > >   generically, memory areas with more restrictive protection than the
> > >   defaults ("secret" memory) [2], [3], [4]
> > > * hardening of Linux containers [ no reference yet :) ]
> > > 
> > > Last year we had vague ideas and possible directions; this year we have
> > > several real challenges and design decisions we'd like to discuss:
> > > 
> > > * "Secret" memory userspace APIs
> > > 
> > >   Should such an API follow "native" MM interfaces like mmap(), mprotect()
> > >   and madvise(), or would it be better to use a file descriptor, e.g. like
> > >   memfd_create() does?
> > 
> > I don't really see a point in such a file descriptor. It is supposed to be
> > very private secret data. What functionality provided by a file descriptor
> > do you see as valuable in this scenario?
> > 
> > A file descriptor makes it easier to spill the secrets to other processes:
> > over fork(), a UNIX socket or via /proc/PID/fd/.
> 
> On the other hand, it may be desirable to share a secret between several
> processes. Then a UNIX socket or fork() actually becomes handy.

If more than one knows, it is secret no longer :P

> > >   MM "native" APIs would require a VM_something flag and probably a page
> > >   flag or page_ext. With a file descriptor, VM_SPECIAL and custom
> > >   implementations of .mmap() and .fault() would suffice. On the other hand,
> > >   mmap() and mprotect() seem a better fit semantically and they could be
> > >   more easily adopted by userspace.
> > 
> > You mix up implementation and interface. You can provide an interface which
> > doesn't require a file descriptor, but still use a magic file internally to
> > make the VMA distinct.
> 
> If I understand correctly, if we go with the mmap(MAP_SECRET) example,
> mmap() would implicitly create a magic file with its .mmap() and .fault()
> implementing the protection? That's a possibility. But then, if we already
> have a file, why not let the user get a handle for it and allow fine-grained
> control over its sharing between processes?

A proper file descriptor would have wider exposure with security
implications. It has to be at least scoped properly.

> > > * Direct/linear map fragmentation
> > > 
> > >   Whenever we want to drop some mappings from the direct map or even change
> > >   the protection bits for some memory area, the gigantic and huge pages
> > >   that comprise the direct map need to be broken and there's no THP for the
> > >   kernel page tables to collapse them back. Moreover, the existing API
> > >   defined in <asm/set_memory.h> by several architectures does not really
> > >   presume it would be widely used.
> > > 
> > >   For the "secret" memory use-case the fragmentation can be minimized by
> > >   caching large pages, using them to satisfy smaller "secret" allocations,
> > >   and then collapsing them back once the "secret" memory is freed. Another
> > >   possibility is to pre-allocate physical memory at boot time.
> > 
> > I would rather go with the pre-allocation path, at least at first. We can
> > always come up with a more dynamic and complicated solution later if the
> > interface is widely adopted.
> 
> We still must manage the "secret" allocations, so I don't think that the
> dynamic solution will be much more complicated.

Okay.

BTW, with the clarified scope of the AMD erratum, I believe we can implement
"collapse" for the direct mapping. Willing to try?

> > >   Yet another idea is to make the page allocator aware of the direct map layout.
> > > 
> > > * Kernel page table management
> > > 
> > >   Currently we presume that only one kernel page table exists (well,
> > >   mostly) and the page table abstraction is required only for the user page
> > >   tables. As such, we presume that 'page table == struct mm_struct' and the
> > >   mm_struct is used all over by the operations that manage the page tables.
> > > 
> > >   The management of the restricted address space in the kernel requires the
> > >   ability to create, update and remove kernel contexts the same way we do
> > >   for the userspace.
> > > 
> > >   One way is to overload the mm_struct, like EFI and text poking did. But
> > >   it is quite overkill, because most of the mm_struct contains
> > >   information required only to manage user mappings.
> > 
> > In what way is it overkill? Just memory overhead? How many such
> > contexts do you expect to see in the system?
> 
> Well, the memory overhead is not that big, but it's not negligible. For the
> KVM ASI use case, for instance, there will be at least as many contexts as
> running VMs. We also have thoughts about how to make namespaces use
> restricted address spaces; for that use case there will be quite a lot of
> such contexts.
> 
> Besides, it does not feel right to have the mm_struct represent a page
> table.

Fair enough. It might be interesting.

-- 
 Kirill A. Shutemov



* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-11 21:53     ` Kirill A. Shutemov
@ 2020-02-16  6:35       ` Mike Rapoport
  2020-02-17 10:34         ` Kirill A. Shutemov
  0 siblings, 1 reply; 7+ messages in thread
From: Mike Rapoport @ 2020-02-16  6:35 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Wed, Feb 12, 2020 at 12:53:34AM +0300, Kirill A. Shutemov wrote:
> On Tue, Feb 11, 2020 at 07:20:47PM +0200, Mike Rapoport wrote:
> > On Fri, Feb 07, 2020 at 08:39:09PM +0300, Kirill A. Shutemov wrote:
> > > On Thu, Feb 06, 2020 at 06:59:00PM +0200, Mike Rapoport wrote:
> > > > 
> > > > * "Secret" memory userspace APIs
> > > > 
> > > >   Should such an API follow "native" MM interfaces like mmap(), mprotect()
> > > >   and madvise(), or would it be better to use a file descriptor, e.g. like
> > > >   memfd_create() does?
> > > 
> > > I don't really see a point in such a file descriptor. It is supposed to be
> > > very private secret data. What functionality provided by a file descriptor
> > > do you see as valuable in this scenario?
> > > 
> > > A file descriptor makes it easier to spill the secrets to other processes:
> > > over fork(), a UNIX socket or via /proc/PID/fd/.
> > 
> > On the other hand, it may be desirable to share a secret between several
> > processes. Then a UNIX socket or fork() actually becomes handy.
> 
> If more than one knows, it is secret no longer :P

But even cryptographers define "shared secret" ;-)
 
> > > >   MM "native" APIs would require a VM_something flag and probably a page
> > > >   flag or page_ext. With a file descriptor, VM_SPECIAL and custom
> > > >   implementations of .mmap() and .fault() would suffice. On the other hand,
> > > >   mmap() and mprotect() seem a better fit semantically and they could be
> > > >   more easily adopted by userspace.
> > > 
> > > You mix up implementation and interface. You can provide an interface which
> > > doesn't require a file descriptor, but still use a magic file internally to
> > > make the VMA distinct.
> > 
> > If I understand correctly, if we go with the mmap(MAP_SECRET) example,
> > mmap() would implicitly create a magic file with its .mmap() and .fault()
> > implementing the protection? That's a possibility. But then, if we already
> > have a file, why not let the user get a handle for it and allow fine-grained
> > control over its sharing between processes?
> 
> A proper file descriptor would have wider exposure with security
> implications. It has to be at least scoped properly.
 
Agree.

> > > > * Direct/linear map fragmentation
> > > > 
> > > >   Whenever we want to drop some mappings from the direct map or even change
> > > >   the protection bits for some memory area, the gigantic and huge pages
> > > >   that comprise the direct map need to be broken and there's no THP for the
> > > >   kernel page tables to collapse them back. Moreover, the existing API
> > > >   defined in <asm/set_memory.h> by several architectures does not really
> > > >   presume it would be widely used.
> > > > 
> > > >   For the "secret" memory use-case the fragmentation can be minimized by
> > > >   caching large pages, using them to satisfy smaller "secret" allocations,
> > > >   and then collapsing them back once the "secret" memory is freed. Another
> > > >   possibility is to pre-allocate physical memory at boot time.
> > > 
> > > I would rather go with the pre-allocation path, at least at first. We can
> > > always come up with a more dynamic and complicated solution later if the
> > > interface is widely adopted.
> > 
> > We still must manage the "secret" allocations, so I don't think that the
> > dynamic solution will be much more complicated.
> 
> Okay.
> 
> BTW, with the clarified scope of the AMD erratum, I believe we can implement
> "collapse" for the direct mapping. Willing to try?
 
My initial plan was to use a pool of large pages to satisfy "secret"
allocation requests. Whenever a new large page is allocated for that pool,
it is removed from the direct map without being split into small pages, so
when it is later reinstated there is no need to collapse it.
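
As a sketch (the secret_pool_grow() name is made up; on x86 something like
set_memory_np() over an aligned 2M range should clear the single PMD
entry without splitting it):

	static int secret_pool_grow(struct list_head *pool)
	{
		unsigned int order = PMD_SHIFT - PAGE_SHIFT;
		struct page *page;

		page = alloc_pages(GFP_KERNEL | __GFP_COMP, order);
		if (!page)
			return -ENOMEM;

		/* whole-PMD removal: one entry cleared, nothing split */
		set_memory_np((unsigned long)page_address(page), 1 << order);

		list_add(&page->lru, pool);
		return 0;
	}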

> > > >   Yet another idea is to make the page allocator aware of the direct map layout.
> 
> -- 
>  Kirill A. Shutemov

-- 
Sincerely yours,
Mike.



* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-16  6:35       ` Mike Rapoport
@ 2020-02-17 10:34         ` Kirill A. Shutemov
  2020-02-18 15:06           ` Mike Rapoport
  0 siblings, 1 reply; 7+ messages in thread
From: Kirill A. Shutemov @ 2020-02-17 10:34 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Sun, Feb 16, 2020 at 08:35:04AM +0200, Mike Rapoport wrote:
> > BTW, with the clarified scope of the AMD erratum, I believe we can implement
> > "collapse" for the direct mapping. Willing to try?
>  
> My initial plan was to use a pool of large pages to satisfy "secret"
> allocation requests. Whenever a new large page is allocated for that pool,
> it is removed from the direct map without being split into small pages, so
> when it is later reinstated there is no need to collapse it.

It might be okay. But you will likely have to split 1G pages in the direct
mapping into 2M. Being able to repair the direct mapping is more generally
useful.
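
Roughly, the interesting part of such a repair pass is the check whether
a PMD range can be rewritten as a single large entry. An illustrative
sketch, x86-flavoured:

	/* a 2M range can be collapsed back into one PMD entry if all
	 * 512 PTEs are present, physically contiguous and carry the
	 * same protections */
	static bool can_collapse_pmd(pte_t *ptep, unsigned long first_pfn)
	{
		pgprot_t prot = pte_pgprot(ptep[0]);
		int i;

		for (i = 0; i < PTRS_PER_PTE; i++) {
			if (!pte_present(ptep[i]) ||
			    pte_pfn(ptep[i]) != first_pfn + i ||
			    pgprot_val(pte_pgprot(ptep[i])) !=
			    pgprot_val(prot))
				return false;
		}
		return true;
	}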

-- 
 Kirill A. Shutemov



* Re: [LSF/MM/BPF TOPIC] Restricted kernel address spaces
  2020-02-17 10:34         ` Kirill A. Shutemov
@ 2020-02-18 15:06           ` Mike Rapoport
  0 siblings, 0 replies; 7+ messages in thread
From: Mike Rapoport @ 2020-02-18 15:06 UTC (permalink / raw)
  To: Kirill A. Shutemov; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Mon, Feb 17, 2020 at 01:34:57PM +0300, Kirill A. Shutemov wrote:
> On Sun, Feb 16, 2020 at 08:35:04AM +0200, Mike Rapoport wrote:
> > > BTW, with the clarified scope of the AMD erratum, I believe we can implement
> > > "collapse" for the direct mapping. Willing to try?
> >  
> > My initial plan was to use a pool of large pages to satisfy "secret"
> > allocation requests. Whenever a new large page is allocated for that pool,
> > it is removed from the direct map without being split into small pages, so
> > when it is later reinstated there is no need to collapse it.
> 
> It might be okay. But you will likely have to split 1G pages in the direct
> mapping into 2M.

Right, it is quite likely, at least for the dynamic allocations.

> Being able to repair the direct mapping is more generally
> useful.

And it's not strictly related to "secret" memory. I'll try to have a
look at it.

 
> -- 
>  Kirill A. Shutemov

-- 
Sincerely yours,
Mike.


