linux-mm.kvack.org archive mirror
* [LSF/MM TOPIC] Address space isolation inside the kernel
@ 2019-02-07  7:24 Mike Rapoport
  2019-02-14 19:21 ` Kees Cook
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Mike Rapoport @ 2019-02-07  7:24 UTC (permalink / raw)
  To: lsf-pc; +Cc: linux-mm, James Bottomley

(Joint proposal with James Bottomley)

Address space isolation has been used to protect the kernel from
userspace, and userspace programs from each other, since the invention
of virtual memory.

Assuming that kernel bugs, and therefore vulnerabilities, are
inevitable, it might be worth isolating parts of the kernel to minimize
the damage these vulnerabilities can cause.

There is already ongoing work in a similar direction, like XPFO [1] and
temporary mappings proposed for the kernel text poking [2].

We have several vague ideas about how we can take this even further and
make different parts of the kernel run in different address spaces:
* Remove most of the kernel mappings from the syscall entry and add a
  trampoline when the syscall processing needs to call the "core
  kernel".
* Make the parts of the kernel that execute in a namespace use their
  own mappings for the namespace-private data.
* Extend EXPORT_SYMBOL to include a trampoline so that code running in
  modules won't map the entire kernel.
* Execute BPF programs in a dedicated address space.

These are very general possible directions. We are exploring some of
them now to understand if the security value is worth the complexity
and the performance impact.

We believe it would be helpful to discuss the general idea of address
space isolation inside the kernel, both from the technical aspect of
how it can be achieved simply and efficiently and from the isolation
aspect of what actual security guarantees it usefully provides.

[1] https://lore.kernel.org/lkml/cover.1547153058.git.khalid.aziz@oracle.com/
[2] https://lore.kernel.org/lkml/20190129003422.9328-4-rick.p.edgecombe@intel.com/

-- 
Sincerely yours,
Mike.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-07  7:24 [LSF/MM TOPIC] Address space isolation inside the kernel Mike Rapoport
@ 2019-02-14 19:21 ` Kees Cook
       [not found] ` <CA+VK+GOpjXQ2-CLZt6zrW6m-=WpWpvcrXGSJ-723tRDMeAeHmg@mail.gmail.com>
  2019-02-16 12:19 ` Balbir Singh
  2 siblings, 0 replies; 18+ messages in thread
From: Kees Cook @ 2019-02-14 19:21 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: lsf-pc, Linux-MM, James Bottomley

On Wed, Feb 6, 2019 at 11:24 PM Mike Rapoport <rppt@linux.ibm.com> wrote:
> Address space isolation has been used to protect the kernel from the
> userspace and userspace programs from each other since the invention of
> the virtual memory.

Well, traditionally the kernel's protection has been one-sided: we've
left userspace mapped while in the kernel, which has led to countless
exploits. SMEP/SMAP (or similar features on other architectures, like
ARM's PXN/PAN) have finally mitigated that, but we're still left with a
lot of older machines (and other architectures) that would benefit from
unmapping userspace while in the kernel.

> Assuming that kernel bugs and therefore vulnerabilities are inevitable
> it might be worth isolating parts of the kernel to minimize damage
> that these vulnerabilities can cause.

Yes please. :) Two cases jump to mind:

1) Make regions unwritable to avoid write-anywhere data modification
attacks. For code and rodata, this is already done with regular page
table bits making them read-only for the entire lifetime of the
kernel. For areas that need writing but are sensitive (e.g. the page
tables themselves, and function pointer tables generally), there needs
to be a way to keep modifications isolated to given code (to block
write-anywhere attacks), keeping them read-only for all other
accesses. This could be done with per-CPU page tables, a faster
version of the "write rarely" patch set[1], or maybe with the kernel
text poking (mentioned in your email). Attacking the page tables
directly is now the common way to gain execute control over the kernel,
since so much of the rest of memory is locked down[2]. How can we keep
page tables read-only except when the page table code needs to write
to them?

2) Make a region unreadable to avoid read-anywhere memory disclosure
attacks. This means it's either unmapped (for both data and code cases)
or we gain execute-not-read hardware bits (for code cases). Unmapping
code means a reduction in ROP gadgets; unmapping data means a reduction
in memory disclosure surface. Note that while both coarse (CET) and
fine-grain (function-prototype-checking) CFI vastly reduce the
availability of ROP gadgets, the kernel still has a lot of functions
that return void and take a single unsigned long, so anything that
removes more code from visibility is good.

> There is already ongoing work in a similar direction, like XPFO [1] and
> temporary mappings proposed for the kernel text poking [2].
>
> We have several vague ideas how we can take this even further and make
> different parts of kernel run in different address spaces:
> * Remove most of the kernel mappings from the syscall entry and add a
>   trampoline when the syscall processing needs to call the "core
>   kernel".

Defining this boundary may be very tricky, but maybe the same logic
used for CFI and function graph analysis could be used to find the
existing bright lines between code regions...

> * Make the parts of the kernel that execute in a namespace use their
>   own mappings for the namespace private data
> * Extend EXPORT_SYMBOL to include a trampoline so that the code
>   running in modules won't map the entire kernel
> * Execute BFP programs in a dedicated address space

Pushing drivers into isolated regions would be very interesting. If it
needs context-switching, though, we're headed to microkernel fun.

> These are very general possible directions. We are exploring some of
> them now to understand if the security value is worth the complexity
> and the performance impact.
>
> We believe it would be helpful to discuss the general idea of address
> space isolation inside the kernel, both from the technical aspect of
> how it can be achieved simply and efficiently and from the isolation
> aspect of what actual security guarantees it usefully provides.
>
> [1] https://lore.kernel.org/lkml/cover.1547153058.git.khalid.aziz@oracle.com/
> [2] https://lore.kernel.org/lkml/20190129003422.9328-4-rick.p.edgecombe@intel.com/

I won't be able to make it to the conference, but I'm very interested
in finding ways forward on this topic. :)

-Kees

[1] https://patchwork.kernel.org/project/kernel-hardening/list/?series=79855
[2] https://www.blackhat.com/docs/asia-18/asia-18-WANG-KSMA-Breaking-Android-kernel-isolation-and-Rooting-with-ARM-MMU-features.pdf

-- 
Kees Cook



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
       [not found] ` <CA+VK+GOpjXQ2-CLZt6zrW6m-=WpWpvcrXGSJ-723tRDMeAeHmg@mail.gmail.com>
@ 2019-02-16 11:13   ` Paul Turner
  2019-04-25 20:47     ` Jonathan Adams
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Turner @ 2019-02-16 11:13 UTC (permalink / raw)
  To: lsf-pc, linux-mm, James Bottomley, Mike Rapoport; +Cc: Jonathan Adams


I wanted to second the proposal for address space isolation.

We have some new techniques to introduce here as well, built around
some new ideas using page faults that we believe are interesting.

To wit, page faults uniquely allow us to fork speculative and
non-speculative execution, since we can control the retired path within
the fault itself (which, as it turns out, will obviously never be
executed speculatively).

This lets us provide isolation against variant 1 (Spectre
bounds-check-bypass) gadgets, as well as guarantee what data may or may
not be cache-present for the purposes of L1TF and Meltdown mitigation.

I'm not sure whether or not I'll be able to attend (I have a newborn and
there's a lot of other scheduling I'm trying to work out).  But Jonathan
Adams (cc'd) has been working on this and can speak to it.  We also have
some write-ups to publish independently of this.

Thanks,

- Paul

(Joint proposal with James Bottomley)
>
> Address space isolation has been used to protect the kernel from the
> userspace and userspace programs from each other since the invention of
> the virtual memory.
>
> Assuming that kernel bugs and therefore vulnerabilities are inevitable
> it might be worth isolating parts of the kernel to minimize damage
> that these vulnerabilities can cause.
>
> There is already ongoing work in a similar direction, like XPFO [1] and
> temporary mappings proposed for the kernel text poking [2].
>
> We have several vague ideas how we can take this even further and make
> different parts of kernel run in different address spaces:
> * Remove most of the kernel mappings from the syscall entry and add a
>   trampoline when the syscall processing needs to call the "core
>   kernel".
> * Make the parts of the kernel that execute in a namespace use their
>   own mappings for the namespace private data
> * Extend EXPORT_SYMBOL to include a trampoline so that the code
>   running in modules won't map the entire kernel
> * Execute BFP programs in a dedicated address space
>
> These are very general possible directions. We are exploring some of
> them now to understand if the security value is worth the complexity
> and the performance impact.
>
> We believe it would be helpful to discuss the general idea of address
> space isolation inside the kernel, both from the technical aspect of
> how it can be achieved simply and efficiently and from the isolation
> aspect of what actual security guarantees it usefully provides.
>
> [1]
> https://lore.kernel.org/lkml/cover.1547153058.git.khalid.aziz@oracle.com/
> [2]
> https://lore.kernel.org/lkml/20190129003422.9328-4-rick.p.edgecombe@intel.com/
>
> --
> Sincerely yours,
> Mike.
>



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-07  7:24 [LSF/MM TOPIC] Address space isolation inside the kernel Mike Rapoport
  2019-02-14 19:21 ` Kees Cook
       [not found] ` <CA+VK+GOpjXQ2-CLZt6zrW6m-=WpWpvcrXGSJ-723tRDMeAeHmg@mail.gmail.com>
@ 2019-02-16 12:19 ` Balbir Singh
  2019-02-16 16:30   ` James Bottomley
  2 siblings, 1 reply; 18+ messages in thread
From: Balbir Singh @ 2019-02-16 12:19 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: lsf-pc, linux-mm, James Bottomley

On Thu, Feb 07, 2019 at 09:24:22AM +0200, Mike Rapoport wrote:
> (Joint proposal with James Bottomley)
> 
> Address space isolation has been used to protect the kernel from the
> userspace and userspace programs from each other since the invention of
> the virtual memory.
> 
> Assuming that kernel bugs and therefore vulnerabilities are inevitable
> it might be worth isolating parts of the kernel to minimize damage
> that these vulnerabilities can cause.
>

Is address space isolation limited to user space and kernel space?
Where does the hypervisor fit into the picture?
 
> There is already ongoing work in a similar direction, like XPFO [1] and
> temporary mappings proposed for the kernel text poking [2].
> 
> We have several vague ideas how we can take this even further and make
> different parts of kernel run in different address spaces:
> * Remove most of the kernel mappings from the syscall entry and add a
>   trampoline when the syscall processing needs to call the "core
>   kernel".
> * Make the parts of the kernel that execute in a namespace use their
>   own mappings for the namespace private data

Is the key reason for removing mappings to prevent the processor from
speculating on data/text from those mappings? SMAP/SMEP already provide
a level of isolation from access and execution.

For namespaces, would allocating the right memory protection key work?
At some point we'll need to recycle the keys.

It'll be an interesting discussion, and I'd love to attend if invited.

Balbir Singh.



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-16 12:19 ` Balbir Singh
@ 2019-02-16 16:30   ` James Bottomley
  2019-02-17  8:01     ` Balbir Singh
  2019-02-17 19:34     ` Matthew Wilcox
  0 siblings, 2 replies; 18+ messages in thread
From: James Bottomley @ 2019-02-16 16:30 UTC (permalink / raw)
  To: Balbir Singh, Mike Rapoport; +Cc: lsf-pc, linux-mm

On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> On Thu, Feb 07, 2019 at 09:24:22AM +0200, Mike Rapoport wrote:
> > (Joint proposal with James Bottomley)
> > 
> > Address space isolation has been used to protect the kernel from
> > the userspace and userspace programs from each other since the
> > invention of the virtual memory.
> > 
> > Assuming that kernel bugs and therefore vulnerabilities are
> > inevitable it might be worth isolating parts of the kernel to
> > minimize damage that these vulnerabilities can cause.
> > 
> 
> Is Address Space limited to user space and kernel space, where does
> the hypervisor fit into the picture?

It doesn't really.  The work is driven by the Nabla HAP (Horizontal
Attack Profile) measure:

https://blog.hansenpartnership.com/measuring-the-horizontal-attack-profile-of-nabla-containers/

Although the results are spectacular (building a container that's
measurably more secure than a hypervisor-based system), they come at
the price of emulating a lot of the kernel and thus damaging the
precise resource control advantage containers have.  The idea, then, is
to render parts of the kernel syscall interface safe enough that they
have a security profile equivalent to the emulated one and can thus be
called directly instead of being emulated, hoping to restore most of
the container resource management properties.

In theory, I suppose it would buy you protection from things like the
kata containers host breach:

https://nabla-containers.github.io/2018/11/28/fs/


> > There is already ongoing work in a similar direction, like XPFO [1]
> > and temporary mappings proposed for the kernel text poking [2].
> > 
> > We have several vague ideas how we can take this even further and
> > make different parts of kernel run in different address spaces:
> > * Remove most of the kernel mappings from the syscall entry and add
> > a
> >   trampoline when the syscall processing needs to call the "core
> >   kernel".
> > * Make the parts of the kernel that execute in a namespace use
> > their
> >   own mappings for the namespace private data
> 
> Is the key reason for removing mappings -- to remove the processor
> from speculating data/text from those mappings? SMAP/SMEP provides
> a level of isolation from access and execution

Not really, it's to reduce the exploitability of the code path and
limit the exposure of data which can be compromised when you're
exploited.

> For namespaces, does allocating the right memory protection key
> work? At some point we'll need to recycle the keys

I don't think anyone mentioned memory keys and namespaces ... I take it
you're thinking of SEV/MKTME?  The idea being to shield one container's
execution from another using memory encryption?  We've speculated it's
possible, but the actual mechanism we were looking at is tagging pages
to namespaces (essentially using the mount namespace and tags on the
page cache) so the kernel would refuse to map a page into the wrong
namespace.  This approach doesn't seem to be as promising as the
separated address space one because the security properties are harder
to measure.

James


> It'll be an interesting discussion and I'd love to attend if invited
> 
> Balbir Singh.
> 



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-16 16:30   ` James Bottomley
@ 2019-02-17  8:01     ` Balbir Singh
  2019-02-17 16:43       ` James Bottomley
  2019-02-17 19:34     ` Matthew Wilcox
  1 sibling, 1 reply; 18+ messages in thread
From: Balbir Singh @ 2019-02-17  8:01 UTC (permalink / raw)
  To: James Bottomley; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > On Thu, Feb 07, 2019 at 09:24:22AM +0200, Mike Rapoport wrote:
> > > (Joint proposal with James Bottomley)
> > > 
> > > Address space isolation has been used to protect the kernel from
> > > the userspace and userspace programs from each other since the
> > > invention of the virtual memory.
> > > 
> > > Assuming that kernel bugs and therefore vulnerabilities are
> > > inevitable it might be worth isolating parts of the kernel to
> > > minimize damage that these vulnerabilities can cause.
> > > 
> > 
> > Is Address Space limited to user space and kernel space, where does
> > the hypervisor fit into the picture?
> 
> It doesn't really.  The work is driven by the Nabla HAP measure
> 
> https://blog.hansenpartnership.com/measuring-the-horizontal-attack-profile-of-nabla-containers/
> 
> Although the results are spectacular (building a container that's
> measurably more secure than a hypervisor based system), they come at
> the price of emulating a lot of the kernel and thus damaging the
> precise resource control advantage containers have.  The idea then is
> to render parts of the kernel syscall interface safe enough that they
> have a security profile equivalent to the emulated one and can thus be
> called directly instead of being emulated, hoping to restore most of
> the container resource management properties.
> 
> In theory, I suppose it would buy you protection from things like the
> kata containers host breach:
> 
> https://nabla-containers.github.io/2018/11/28/fs/
> 

Thanks, so it's largely about preventing escapes from the container
namespace. Since the topic thread was generic, I thought I'd ask.

> 
> > > There is already ongoing work in a similar direction, like XPFO [1]
> > > and temporary mappings proposed for the kernel text poking [2].
> > > 
> > > We have several vague ideas how we can take this even further and
> > > make different parts of kernel run in different address spaces:
> > > * Remove most of the kernel mappings from the syscall entry and add
> > > a
> > >   trampoline when the syscall processing needs to call the "core
> > >   kernel".
> > > * Make the parts of the kernel that execute in a namespace use
> > > their
> > >   own mappings for the namespace private data
> > 
> > Is the key reason for removing mappings -- to remove the processor
> > from speculating data/text from those mappings? SMAP/SMEP provides
> > a level of isolation from access and execution
> 
> Not really, it's to reduce the exploitability of the code path and
> limit the exposure of data which can be compromised when you're
> exploited.
> 

Yep, understood

> > For namespaces, does allocating the right memory protection key
> > work? At some point we'll need to recycle the keys
> 
> I don't think anyone mentioned memory keys and namespaces ... I take it
> you're thinking of SEV/MKTME?  The idea being to shield one container's

I was wondering why keys would not be sufficient? I know no one
mentioned them, but it's something I thought I'd bring up.

> execution from another using memory encryption?  We've speculated it's
> possible but the actual mechanism we were looking at is tagging pages
> to namespaces (essentially using the mount namspace and tags on the
> page cache) so the kernel would refuse to map a page into the wrong
> namespace.  This approach doesn't seem to be as promising as the
> separated address space one because the security properties are harder
> to measure.
> 

Thanks for clarifying the scope

Balbir

> James
> 
> 
> > It'll be an interesting discussion and I'd love to attend if invited
> > 
> > Balbir Singh.
> > 
> 



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17  8:01     ` Balbir Singh
@ 2019-02-17 16:43       ` James Bottomley
  0 siblings, 0 replies; 18+ messages in thread
From: James Bottomley @ 2019-02-17 16:43 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Mike Rapoport, lsf-pc, linux-mm

On Sun, 2019-02-17 at 19:01 +1100, Balbir Singh wrote:
> On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > On Thu, Feb 07, 2019 at 09:24:22AM +0200, Mike Rapoport wrote:
> > > > (Joint proposal with James Bottomley)
> > > > 
> > > > Address space isolation has been used to protect the kernel
> > > > from the userspace and userspace programs from each other since
> > > > the invention of the virtual memory.
> > > > 
> > > > Assuming that kernel bugs and therefore vulnerabilities are
> > > > inevitable it might be worth isolating parts of the kernel to
> > > > minimize damage that these vulnerabilities can cause.
> > > > 
> > > 
> > > Is Address Space limited to user space and kernel space, where
> > > does the hypervisor fit into the picture?
> > 
> > It doesn't really.  The work is driven by the Nabla HAP measure
> > 
> > https://blog.hansenpartnership.com/measuring-the-horizontal-attack-
> > profile-of-nabla-containers/
> > 
> > Although the results are spectacular (building a container that's
> > measurably more secure than a hypervisor based system), they come
> > at the price of emulating a lot of the kernel and thus damaging the
> > precise resource control advantage containers have.  The idea then
> > is to render parts of the kernel syscall interface safe enough that
> > they have a security profile equivalent to the emulated one and can
> > thus be called directly instead of being emulated, hoping to
> > restore most of the container resource management properties.
> > 
> > In theory, I suppose it would buy you protection from things like
> > the kata containers host breach:
> > 
> > https://nabla-containers.github.io/2018/11/28/fs/
> > 
> 
> Thanks, so it's largely to prevent escaping the container namespace.
> Since the topic thread was generic, I thought I'd ask

Actually, that's not quite it either.  The motivation is certainly
container security, but the current thrust of the work is generic
kernel security ... the rising tide principle.

James



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-16 16:30   ` James Bottomley
  2019-02-17  8:01     ` Balbir Singh
@ 2019-02-17 19:34     ` Matthew Wilcox
  2019-02-17 20:09       ` James Bottomley
  1 sibling, 1 reply; 18+ messages in thread
From: Matthew Wilcox @ 2019-02-17 19:34 UTC (permalink / raw)
  To: James Bottomley; +Cc: Balbir Singh, Mike Rapoport, lsf-pc, linux-mm

On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > For namespaces, does allocating the right memory protection key
> > work? At some point we'll need to recycle the keys
> 
> I don't think anyone mentioned memory keys and namespaces ... I take it
> you're thinking of SEV/MKTME?

I thought he meant Protection Keys
https://en.wikipedia.org/wiki/Memory_protection#Protection_keys

> The idea being to shield one container's
> execution from another using memory encryption?  We've speculated it's
> possible but the actual mechanism we were looking at is tagging pages
> to namespaces (essentially using the mount namspace and tags on the
> page cache) so the kernel would refuse to map a page into the wrong
> namespace.  This approach doesn't seem to be as promising as the
> separated address space one because the security properties are harder
> to measure.

What do you mean by "tags on the page cache"?  Is that different from
the radix tree tags (now renamed XArray marks), which are search keys?



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17 19:34     ` Matthew Wilcox
@ 2019-02-17 20:09       ` James Bottomley
  2019-02-17 21:54         ` Balbir Singh
  2019-02-17 22:01         ` Balbir Singh
  0 siblings, 2 replies; 18+ messages in thread
From: James Bottomley @ 2019-02-17 20:09 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Balbir Singh, Mike Rapoport, lsf-pc, linux-mm

On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > For namespaces, does allocating the right memory protection key
> > > work? At some point we'll need to recycle the keys
> > 
> > I don't think anyone mentioned memory keys and namespaces ... I
> > take it you're thinking of SEV/MKTME?
> 
> I thought he meant Protection Keys
> https://en.wikipedia.org/wiki/Memory_protection#Protection_keys

Really?  I wasn't really considering that, mainly because on parisc we
use them to implement no-execute, so they'd have to be repurposed.

> > The idea being to shield one container's execution from another
> > using memory encryption?  We've speculated it's possible but the
> > actual mechanism we were looking at is tagging pages to namespaces
> > (essentially using the mount namspace and tags on the
> > page cache) so the kernel would refuse to map a page into the wrong
> > namespace.  This approach doesn't seem to be as promising as the
> > separated address space one because the security properties are
> > harder
> > to measure.
> 
> What do you mean by "tags on the pages cache"?  Is that different
> from the radix tree tags (now renamed to XArray marks), which are
> search keys.

Tagging the page cache to namespaces means having a set of mount
namespaces per page in the page cache and not allowing a page to be
placed into a VMA unless the owning task's nsproxy is one of the tagged
mount namespaces.  The idea was to introduce kernel-supported fencing
between containers, particularly if they were handling sensitive data,
so that if a container used an exploit to map another container's page,
the mapping would fail.  However, since sensitive data should be on an
encrypted filesystem, it looks like SEV/MKTME coupled with file-based
encryption might provide a better mechanism.

James



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17 20:09       ` James Bottomley
@ 2019-02-17 21:54         ` Balbir Singh
  2019-02-17 22:01         ` Balbir Singh
  1 sibling, 0 replies; 18+ messages in thread
From: Balbir Singh @ 2019-02-17 21:54 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, Mike Rapoport, lsf-pc, linux-mm

On Sun, Feb 17, 2019 at 12:09:06PM -0800, James Bottomley wrote:
> On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> > On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > > For namespaces, does allocating the right memory protection key
> > > > work? At some point we'll need to recycle the keys
> > > 
> > > I don't think anyone mentioned memory keys and namespaces ... I
> > > take it you're thinking of SEV/MKTME?
> > 
> > I thought he meant Protection Keys
> > https://en.wikipedia.org/wiki/Memory_protection#Protection_keys
> 
> Really?  I wasn't really considering that mainly because in parisc we
> use them to implement no execute, so they'd have to be repurposed.
>

Yes, but x86 and powerpc have the capability to use them for no-read,
no-write, and (on powerpc) no-execute. I agree that this might not work
well across all architectures, but it could be an option for those
that support it.
 
> > > The idea being to shield one container's execution from another
> > > using memory encryption?  We've speculated it's possible but the
> > > actual mechanism we were looking at is tagging pages to namespaces
> > > (essentially using the mount namspace and tags on the
> > > page cache) so the kernel would refuse to map a page into the wrong
> > > namespace.  This approach doesn't seem to be as promising as the
> > > separated address space one because the security properties are
> > > harder
> > > to measure.
> > 
> > What do you mean by "tags on the pages cache"?  Is that different
> > from the radix tree tags (now renamed to XArray marks), which are
> > search keys.
> 
> Tagging the page cache to namespaces means having a set of mount
> namespaces per page in the page cache and not allowing placing the page
> into a VMA unless the owning task's nsproxy is one of the tagged mount
> namespaces.  The idea was to introduce kernel supported fencing between
> containers, particularly if they were handling sensitive data, so that
> if a container used an exploit to map another container's page, the
> mapping would fail.  However, since sensitive data should be on an
> encrypted filesystem, it looks like SEV/MKTME coupled with file based
> encryption might provide a better mechanism.
> 
> James

Balbir Singh



* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17 20:09       ` James Bottomley
  2019-02-17 21:54         ` Balbir Singh
@ 2019-02-17 22:01         ` Balbir Singh
  2019-02-17 22:20           ` [Lsf-pc] " James Bottomley
  1 sibling, 1 reply; 18+ messages in thread
From: Balbir Singh @ 2019-02-17 22:01 UTC (permalink / raw)
  To: James Bottomley; +Cc: Matthew Wilcox, Mike Rapoport, lsf-pc, linux-mm

On Sun, Feb 17, 2019 at 12:09:06PM -0800, James Bottomley wrote:
> On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> > On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > > For namespaces, does allocating the right memory protection key
> > > > work? At some point we'll need to recycle the keys
> > > 
> > > I don't think anyone mentioned memory keys and namespaces ... I
> > > take it you're thinking of SEV/MKTME?
> > 
> > I thought he meant Protection Keys
> > https://en.wikipedia.org/wiki/Memory_protection#Protection_keys
> 
> Really?  I wasn't really considering that mainly because in parisc we
> use them to implement no execute, so they'd have to be repurposed.
> 
> > > The idea being to shield one container's execution from another
> > > using memory encryption?  We've speculated it's possible but the
> > > actual mechanism we were looking at is tagging pages to namespaces
> > > (essentially using the mount namspace and tags on the
> > > page cache) so the kernel would refuse to map a page into the wrong
> > > namespace.  This approach doesn't seem to be as promising as the
> > > separated address space one because the security properties are
> > > harder
> > > to measure.
> > 
> > What do you mean by "tags on the pages cache"?  Is that different
> > from the radix tree tags (now renamed to XArray marks), which are
> > search keys.
> 
> Tagging the page cache to namespaces means having a set of mount
> namespaces per page in the page cache and not allowing placing the page
> into a VMA unless the owning task's nsproxy is one of the tagged mount
> namespaces.  The idea was to introduce kernel supported fencing between
> containers, particularly if they were handling sensitive data, so that
> if a container used an exploit to map another container's page, the
> mapping would fail.  However, since sensitive data should be on an
> encrypted filesystem, it looks like SEV/MKTME coupled with file based
> encryption might provide a better mechanism.
>

Splitting this point out to a different email: I think being able to
tag the page cache is quite interesting, and in the long run it might
help us get things like mincore() right across sharing boundaries.

But any fencing will get in the way of sharing and the density of
containers. I still don't see how a container can map page cache it
does not have the right permissions for. In an ideal world, any
writable (sensitive) pages should go to the writable layers of the
union mount filesystem that is private to the container (but I could be
making things up without having tried them out).

Balbir Singh.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17 22:01         ` Balbir Singh
@ 2019-02-17 22:20           ` James Bottomley
  2019-02-18 11:15             ` Balbir Singh
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2019-02-17 22:20 UTC (permalink / raw)
  To: Balbir Singh; +Cc: linux-mm, lsf-pc, Matthew Wilcox, Mike Rapoport

On Mon, 2019-02-18 at 09:01 +1100, Balbir Singh wrote:
> On Sun, Feb 17, 2019 at 12:09:06PM -0800, James Bottomley wrote:
> > On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> > > On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > > > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > > > For namespaces, does allocating the right memory protection
> > > > > key work? At some point we'll need to recycle the keys
> > > > 
> > > > I don't think anyone mentioned memory keys and namespaces ... I
> > > > take it you're thinking of SEV/MKTME?
> > > 
> > > I thought he meant Protection Keys
> > > https://en.wikipedia.org/wiki/Memory_protection#Protection_keys
> > 
> > Really?  I wasn't really considering that mainly because in parisc
> > we use them to implement no execute, so they'd have to be
> > repurposed.
> > 
> > > > The idea being to shield one container's execution from another
> > > > using memory encryption?  We've speculated it's possible but
> > > > the actual mechanism we were looking at is tagging pages to
> > > > namespaces (essentially using the mount namespace and tags on
> > > > the page cache) so the kernel would refuse to map a page into
> > > > the wrong namespace.  This approach doesn't seem to be as
> > > > promising as the separated address space one because the
> > > > security properties are harder to measure.
> > > 
> > > What do you mean by "tags on the page cache"?  Is that different
> > > from the radix tree tags (now renamed to XArray marks), which are
> > > search keys?
> > 
> > Tagging the page cache to namespaces means having a set of mount
> > namespaces per page in the page cache and not allowing the page to
> > be placed into a VMA unless the owning task's nsproxy is one of the
> > tagged mount namespaces.  The idea was to introduce kernel
> > supported fencing between containers, particularly if they were
> > handling sensitive data, so that if a container used an exploit to
> > map another container's page, the mapping would fail.  However,
> > since sensitive data should be on an encrypted filesystem, it looks
> > like SEV/MKTME coupled with file based encryption might provide a
> > better mechanism.
> > 
> 
> Splitting out this point to a different email, I think being able to
> tag page cache is quite interesting and in the long run might help
> us to get things like mincore() right across shared boundaries.
> 
> But any fencing will get in the way of sharing and the density of
> containers. I still don't see how a container can map page cache it
> does not have the right permissions for. In an ideal world any
> writable (sensitive) pages should go to the writable layer of the
> union mount filesystem, which is private to the container (but I
> could be making things up without trying them out).

As I said before, it's about reducing the horizontal attack profile
(HAP).  If the kernel were perfectly free from bugs and exploits,
containment would be perfect and the HAP would be zero.  In the real
world, where the kernel is trusted (it's your kernel) but potentially
vulnerable (it's not free from possibly exploitable defects), the HAP
is non-zero and the question becomes how do you prevent one tenant from
exploiting a defect to interfere with or exfiltrate data from another
tenant.

The idea behind page tagging is that modern techniques (like ROP
attacks) use existing code sequences within the kernel to perform the
exploit, so if all code sequences that map pages contain tag guards, the
defences against one container accessing another's pages remain in place
even in the face of exploits.

James


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-17 22:20           ` [Lsf-pc] " James Bottomley
@ 2019-02-18 11:15             ` Balbir Singh
  0 siblings, 0 replies; 18+ messages in thread
From: Balbir Singh @ 2019-02-18 11:15 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-mm, lsf-pc, Matthew Wilcox, Mike Rapoport

On Sun, Feb 17, 2019 at 02:20:50PM -0800, James Bottomley wrote:
> On Mon, 2019-02-18 at 09:01 +1100, Balbir Singh wrote:
> > On Sun, Feb 17, 2019 at 12:09:06PM -0800, James Bottomley wrote:
> > > On Sun, 2019-02-17 at 11:34 -0800, Matthew Wilcox wrote:
> > > > On Sat, Feb 16, 2019 at 08:30:16AM -0800, James Bottomley wrote:
> > > > > On Sat, 2019-02-16 at 23:19 +1100, Balbir Singh wrote:
> > > > > > For namespaces, does allocating the right memory protection
> > > > > > key work? At some point we'll need to recycle the keys
> > > > > 
> > > > > I don't think anyone mentioned memory keys and namespaces ... I
> > > > > take it you're thinking of SEV/MKTME?
> > > > 
> > > > I thought he meant Protection Keys
> > > > https://en.wikipedia.org/wiki/Memory_protection#Protection_keys
> > > 
> > > Really?  I wasn't really considering that mainly because in parisc
> > > we use them to implement no execute, so they'd have to be
> > > repurposed.
> > > 
> > > > > The idea being to shield one container's execution from another
> > > > > using memory encryption?  We've speculated it's possible but
> > > > > the actual mechanism we were looking at is tagging pages to
> > > > > namespaces (essentially using the mount namespace and tags on
> > > > > the page cache) so the kernel would refuse to map a page into
> > > > > the wrong namespace.  This approach doesn't seem to be as
> > > > > promising as the separated address space one because the
> > > > > security properties are harder to measure.
> > > > 
> > > > What do you mean by "tags on the page cache"?  Is that different
> > > > from the radix tree tags (now renamed to XArray marks), which are
> > > > search keys?
> > > 
> > > Tagging the page cache to namespaces means having a set of mount
> > > namespaces per page in the page cache and not allowing the page to
> > > be placed into a VMA unless the owning task's nsproxy is one of the
> > > tagged mount namespaces.  The idea was to introduce kernel
> > > supported fencing between containers, particularly if they were
> > > handling sensitive data, so that if a container used an exploit to
> > > map another container's page, the mapping would fail.  However,
> > > since sensitive data should be on an encrypted filesystem, it looks
> > > like SEV/MKTME coupled with file based encryption might provide a
> > > better mechanism.
> > > 
> > 
> > Splitting out this point to a different email, I think being able to
> > tag page cache is quite interesting and in the long run might help
> > us to get things like mincore() right across shared boundaries.
> > 
> > But any fencing will get in the way of sharing and the density of
> > containers. I still don't see how a container can map page cache it
> > does not have the right permissions for. In an ideal world any
> > writable (sensitive) pages should go to the writable layer of the
> > union mount filesystem, which is private to the container (but I
> > could be making things up without trying them out).
> 
> As I said before, it's about reducing the horizontal attack profile
> (HAP).  If the kernel were perfectly free from bugs and exploits,
> containment would be perfect and the HAP would be zero.  In the real
> world, where the kernel is trusted (it's your kernel) but potentially
> vulnerable (it's not free from possibly exploitable defects), the HAP
> is non-zero and the question becomes how do you prevent one tenant from
> exploiting a defect to interfere with or exfiltrate data from another
> tenant.
> 
> The idea behind page tagging is that modern techniques (like ROP
> attacks) use existing code sequences within the kernel to perform the
> exploit, so if all code sequences that map pages contain tag guards, the
> defences against one container accessing another's pages remain in place
> even in the face of exploits.
>

Agreed, and I believe in defense in depth. I'd love to participate and
see what the final proposal looks like and what elements are used.

Balbir Singh. 


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-02-16 11:13   ` Paul Turner
@ 2019-04-25 20:47     ` Jonathan Adams
  2019-04-25 21:56       ` James Bottomley
  0 siblings, 1 reply; 18+ messages in thread
From: Jonathan Adams @ 2019-04-25 20:47 UTC (permalink / raw)
  To: Paul Turner; +Cc: lsf-pc, linux-mm, James Bottomley, Mike Rapoport

It looks like the MM track isn't full, and I think this topic is an
important thing to discuss.

Cheers,
- Jonathan

On Sat, Feb 16, 2019 at 3:14 AM Paul Turner <pjt@google.com> wrote:
>
> I wanted to second the proposal for address space isolation.
>
> We have some new techniques to introduce here as well, built around some new ideas using page faults that we believe are interesting.
>
> To wit, page faults uniquely allow us to fork speculative and non-speculative execution as we can control the retired path within the fault itself (which, as it turns out, will obviously never be executed speculatively).
>
> This lets us provide isolation against variant1 gadgets, as well as guarantee what data may or may not be cache present for the purposes of L1TF and Meltdown mitigation.
>
> I'm not sure whether or not I'll be able to attend (I have a newborn and there's a lot of other scheduling I'm trying to work out).  But Jonathan Adams (cc'd) has been working on this and can speak to it.  We also have some write-ups to publish independently of this.
>
> Thanks,
>
> - Paul
>
>> (Joint proposal with James Bottomley)
>>
>> Address space isolation has been used to protect the kernel from the
>> userspace and userspace programs from each other since the invention of
>> the virtual memory.
>>
>> Assuming that kernel bugs and therefore vulnerabilities are inevitable
>> it might be worth isolating parts of the kernel to minimize damage
>> that these vulnerabilities can cause.
>>
>> There is already ongoing work in a similar direction, like XPFO [1] and
>> temporary mappings proposed for the kernel text poking [2].
>>
>> We have several vague ideas how we can take this even further and make
>> different parts of kernel run in different address spaces:
>> * Remove most of the kernel mappings from the syscall entry and add a
>>   trampoline when the syscall processing needs to call the "core
>>   kernel".
>> * Make the parts of the kernel that execute in a namespace use their
>>   own mappings for the namespace private data
>> * Extend EXPORT_SYMBOL to include a trampoline so that the code
>>   running in modules won't map the entire kernel
>> * Execute BPF programs in a dedicated address space
>>
>> These are very general possible directions. We are exploring some of
>> them now to understand if the security value is worth the complexity
>> and the performance impact.
>>
>> We believe it would be helpful to discuss the general idea of address
>> space isolation inside the kernel, both from the technical aspect of
>> how it can be achieved simply and efficiently and from the isolation
>> aspect of what actual security guarantees it usefully provides.
>>
>> [1] https://lore.kernel.org/lkml/cover.1547153058.git.khalid.aziz@oracle.com/
>> [2] https://lore.kernel.org/lkml/20190129003422.9328-4-rick.p.edgecombe@intel.com/
>>
>> --
>> Sincerely yours,
>> Mike.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-04-25 20:47     ` Jonathan Adams
@ 2019-04-25 21:56       ` James Bottomley
  2019-04-25 22:25         ` Paul Turner
  0 siblings, 1 reply; 18+ messages in thread
From: James Bottomley @ 2019-04-25 21:56 UTC (permalink / raw)
  To: Jonathan Adams, Paul Turner; +Cc: lsf-pc, linux-mm, Mike Rapoport

On Thu, 2019-04-25 at 13:47 -0700, Jonathan Adams wrote:
> It looks like the MM track isn't full, and I think this topic is an
> important thing to discuss.

Mike just posted the RFC patches for this using a ROP gadget preventer
as a demo:

https://lore.kernel.org/linux-mm/1556228754-12996-1-git-send-email-rppt@linux.ibm.com

but, unfortunately, he won't be at LSF/MM.

James


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-04-25 21:56       ` James Bottomley
@ 2019-04-25 22:25         ` Paul Turner
  2019-04-25 22:31           ` [Lsf-pc] " Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Paul Turner @ 2019-04-25 22:25 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jonathan Adams, lsf-pc, linux-mm, Mike Rapoport


On Thu, Apr 25, 2019 at 2:56 PM James Bottomley <
James.Bottomley@hansenpartnership.com> wrote:

> On Thu, 2019-04-25 at 13:47 -0700, Jonathan Adams wrote:
> > It looks like the MM track isn't full, and I think this topic is an
> > important thing to discuss.
>
> Mike just posted the RFC patches for this using a ROP gadget preventer
> as a demo:
>
>
> https://lore.kernel.org/linux-mm/1556228754-12996-1-git-send-email-rppt@linux.ibm.com
>
> but, unfortunately, he won't be at LSF/MM.
>
> James
>

Mike's proposal is quite different, and targeted at restricting ROP
execution.
The work proposed by Jonathan is aimed at transparently restricting
speculative execution to provide generic mitigation against Spectre-V1
gadgets (and similar) and potentially eliminating the current need for
page table switches under most syscalls due to Meltdown.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-04-25 22:25         ` Paul Turner
@ 2019-04-25 22:31           ` Alexei Starovoitov
  2019-04-25 22:40             ` Paul Turner
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2019-04-25 22:31 UTC (permalink / raw)
  To: Paul Turner
  Cc: James Bottomley, linux-mm, lsf-pc, Mike Rapoport, Jonathan Adams,
	Daniel Borkmann, Jann Horn

On Thu, Apr 25, 2019 at 3:27 PM Paul Turner via Lsf-pc
<lsf-pc@lists.linux-foundation.org> wrote:
>
> On Thu, Apr 25, 2019 at 2:56 PM James Bottomley <
> James.Bottomley@hansenpartnership.com> wrote:
>
> > On Thu, 2019-04-25 at 13:47 -0700, Jonathan Adams wrote:
> > > It looks like the MM track isn't full, and I think this topic is an
> > > important thing to discuss.
> >
> > Mike just posted the RFC patches for this using a ROP gadget preventer
> > as a demo:
> >
> >
> > https://lore.kernel.org/linux-mm/1556228754-12996-1-git-send-email-rppt@linux.ibm.com
> >
> > but, unfortunately, he won't be at LSF/MM.
> >
> > James
> >
>
> Mike's proposal is quite different, and targeted at restricting ROP
> execution.
> The work proposed by Jonathan is aimed at transparently restricting
> speculative execution to provide generic mitigation against Spectre-V1
> gadgets (and similar) and potentially eliminating the current need for
> page table switches under most syscalls due to Meltdown.

sounds very interesting.
"v1 gadgets" would include unpriv bpf code too?


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Lsf-pc] [LSF/MM TOPIC] Address space isolation inside the kernel
  2019-04-25 22:31           ` [Lsf-pc] " Alexei Starovoitov
@ 2019-04-25 22:40             ` Paul Turner
  0 siblings, 0 replies; 18+ messages in thread
From: Paul Turner @ 2019-04-25 22:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: James Bottomley, linux-mm, lsf-pc, Mike Rapoport, Jonathan Adams,
	Daniel Borkmann, Jann Horn


On Thu, Apr 25, 2019 at 3:31 PM Alexei Starovoitov <
alexei.starovoitov@gmail.com> wrote:

> On Thu, Apr 25, 2019 at 3:27 PM Paul Turner via Lsf-pc
> <lsf-pc@lists.linux-foundation.org> wrote:
> >
> > On Thu, Apr 25, 2019 at 2:56 PM James Bottomley <
> > James.Bottomley@hansenpartnership.com> wrote:
> >
> > > On Thu, 2019-04-25 at 13:47 -0700, Jonathan Adams wrote:
> > > > It looks like the MM track isn't full, and I think this topic is an
> > > > important thing to discuss.
> > >
> > > Mike just posted the RFC patches for this using a ROP gadget preventer
> > > as a demo:
> > >
> > >
> > >
> https://lore.kernel.org/linux-mm/1556228754-12996-1-git-send-email-rppt@linux.ibm.com
> > >
> > > but, unfortunately, he won't be at LSF/MM.
> > >
> > > James
> > >
> >
> > Mike's proposal is quite different, and targeted at restricting ROP
> > execution.
> > The work proposed by Jonathan is aimed at transparently restricting
> > speculative execution to provide generic mitigation against Spectre-V1
> > gadgets (and similar) and potentially eliminating the current need for
> > page table switches under most syscalls due to Meltdown.
>
> sounds very interesting.
> "v1 gadgets" would include unpriv bpf code too?
>

Yes


^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-04-25 22:41 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-02-07  7:24 [LSF/MM TOPIC] Address space isolation inside the kernel Mike Rapoport
2019-02-14 19:21 ` Kees Cook
     [not found] ` <CA+VK+GOpjXQ2-CLZt6zrW6m-=WpWpvcrXGSJ-723tRDMeAeHmg@mail.gmail.com>
2019-02-16 11:13   ` Paul Turner
2019-04-25 20:47     ` Jonathan Adams
2019-04-25 21:56       ` James Bottomley
2019-04-25 22:25         ` Paul Turner
2019-04-25 22:31           ` [Lsf-pc] " Alexei Starovoitov
2019-04-25 22:40             ` Paul Turner
2019-02-16 12:19 ` Balbir Singh
2019-02-16 16:30   ` James Bottomley
2019-02-17  8:01     ` Balbir Singh
2019-02-17 16:43       ` James Bottomley
2019-02-17 19:34     ` Matthew Wilcox
2019-02-17 20:09       ` James Bottomley
2019-02-17 21:54         ` Balbir Singh
2019-02-17 22:01         ` Balbir Singh
2019-02-17 22:20           ` [Lsf-pc] " James Bottomley
2019-02-18 11:15             ` Balbir Singh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).