* [Ksummit-discuss] security-related TODO items?
@ 2017-01-20 22:38 Kees Cook
  2017-01-21  0:14 ` Andy Lutomirski
                   ` (5 more replies)
  0 siblings, 6 replies; 29+ messages in thread
From: Kees Cook @ 2017-01-20 22:38 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: Josh Armour, Greg KH

Hi,

I've already got various Kernel Self-Protection Project TODO items
collected[1] (of varying size and complexity), but recently Google's
Patch Reward Program[2] is trying to expand by helping create a bounty
program for security-related TODOs. KSPP is just one corner of
interest in the kernel, and I'd love to know if any other maintainers
have TODO items that they'd like to see get done (and Google would
potentially provide bounty money for).

Let me know your security wish-lists, and I'll collect them all into a
single place. And if there is a better place than ksummit-discuss to
reach maintainers, I'm all ears. LKML tends to mostly just serve as a
public archive. :)

Thanks!

-Kees

[1] http://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project#Specific_TODO_Items
[2] https://www.google.com/about/appsecurity/patch-rewards/

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
@ 2017-01-21  0:14 ` Andy Lutomirski
  2017-01-21  0:26   ` Kees Cook
                     ` (2 more replies)
  2017-01-23 10:02 ` Alexey Dobriyan
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-21  0:14 UTC (permalink / raw)
  To: Kees Cook; +Cc: Josh Armour, Greg KH, ksummit-discuss

This is not easy at all, but: how about rewriting execve() so that the
actual binary format parsers run in user mode?

A minor one for x86: give binaries a way to opt out of the x86_64
vsyscall page.  I already did the hard part (in a branch), so all
that's really left is figuring out the ABI.
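
For context, whether a process currently has the legacy vsyscall page can be checked from user space. The sketch below is only an illustration, assuming an x86_64 Linux system on which the page, when present, shows up in /proc/<pid>/maps under the name [vsyscall]. The helper names line_is_vsyscall() and maps_has_vsyscall() are made up for this example and do not come from the branch mentioned above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if one line of a maps file names [vsyscall]. */
static int line_is_vsyscall(const char *line)
{
	return strstr(line, "[vsyscall]") != NULL;
}

/*
 * Hypothetical helper: returns 1 if the maps file at `path` lists a
 * [vsyscall] mapping, 0 if it does not, and -1 if the file cannot be opened.
 */
static int maps_has_vsyscall(const char *path)
{
	char line[512];
	int found = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (line_is_vsyscall(line)) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}
```

Calling maps_has_vsyscall("/proc/self/maps") before and after any opt-out would show whether the mapping went away for the calling process.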

-- 
Andy Lutomirski
AMA Capital Management, LLC

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-21  0:14 ` Andy Lutomirski
@ 2017-01-21  0:26   ` Kees Cook
  2017-01-21  1:10   ` Matthew Wilcox
  2017-01-21  1:47   ` Josh Triplett
  2 siblings, 0 replies; 29+ messages in thread
From: Kees Cook @ 2017-01-21  0:26 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, Greg KH, ksummit-discuss

On Fri, Jan 20, 2017 at 4:14 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> This is not easy at all, but: how about rewriting execve() so that the
> actual binary format parsers run in user mode?

Fun! :)

> A minor one for x86: give binaries a way to opt out of the x86_64
> vsyscall page.  I already did the hard part (in a branch), so all
> that's really left is figuring out the ABI.

Oh right, we'd talked about this one too. ELF note? Something else?

You got time to do the PCID stuff? :) Please please? :)

-Kees

-- 
Kees Cook
Nexus Security

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-21  0:14 ` Andy Lutomirski
  2017-01-21  0:26   ` Kees Cook
@ 2017-01-21  1:10   ` Matthew Wilcox
  2017-01-21  1:47   ` Josh Triplett
  2 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-01-21  1:10 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, Greg Kroah-Hartman, ksummit-discuss

[I swear Gmail used to have a "reply inline" option in the app]

Maybe we really want a "call this kernel function with user privilege"
ability? Such a function would be able to access only user space memory
(via get_user/...) and its own stack (all hail vmap).

All the compat code would benefit, and maybe some upper layers of ioctl
handling. Perhaps some of the filesystem parsing code would do well in this
kind of constrained environment, but it might need access to so much other
stuff we'd end up diluting the utility.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-21  0:14 ` Andy Lutomirski
  2017-01-21  0:26   ` Kees Cook
  2017-01-21  1:10   ` Matthew Wilcox
@ 2017-01-21  1:47   ` Josh Triplett
  2 siblings, 0 replies; 29+ messages in thread
From: Josh Triplett @ 2017-01-21  1:47 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, Greg KH, ksummit-discuss

On Fri, Jan 20, 2017 at 04:14:25PM -0800, Andy Lutomirski wrote:
> This is not easy at all, but: how about rewriting execve() so that the
> actual binary format parsers run in user mode?

I really like that idea.  And not just the binary format parsers;
everything except the "do what would happen on exec" transition within
the kernel (the bits documented in execve(2) as changing/resetting on
execve, other than those bits trivially doable in userspace).

(One potential challenge: this still has to handle setuid binaries
safely.)

I can think of other syscalls where a userspace implementation would
make sense, as well, if it can run with reasonable performance.  For
instance, imagine moving compatibility syscalls, x32 syscalls, or
deprecated syscalls into userspace, such that if a process found a way
to compromise that layer, it couldn't compromise any other process.

- Josh Triplett

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
  2017-01-21  0:14 ` Andy Lutomirski
@ 2017-01-23 10:02 ` Alexey Dobriyan
  2017-01-23 10:48 ` David Howells
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 29+ messages in thread
From: Alexey Dobriyan @ 2017-01-23 10:02 UTC (permalink / raw)
  To: Kees Cook; +Cc: Josh Armour, Greg KH, ksummit-discuss

On Sat, Jan 21, 2017 at 1:38 AM, Kees Cook <keescook@chromium.org> wrote:
> security wish-lists

Unmap module loader after setting kernel.modules_disabled=1.
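
For reference, kernel.modules_disabled is a one-way switch: once it is set to 1, it cannot be set back without a reboot. A minimal sketch of setting it from C follows, assuming the usual procfs location /proc/sys/kernel/modules_disabled and sufficient privilege. The helper name write_sysctl_one() and its path parameter are illustrative only.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative helper: write "1" to an existing sysctl file such as
 * /proc/sys/kernel/modules_disabled.
 * Returns 0 on success and a negative errno value on failure.
 */
static int write_sysctl_one(const char *path)
{
	int err = 0;
	ssize_t n;
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return -errno;
	n = write(fd, "1\n", 2);
	if (n < 0)
		err = -errno;
	else if (n != 2)
		err = -EIO;	/* short write: treat as an I/O error */
	if (close(fd) < 0 && !err)
		err = -errno;
	return err;
}
```

The wish-list item above is about what the kernel could do after this write succeeds; the sketch only covers the write itself.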

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
  2017-01-21  0:14 ` Andy Lutomirski
  2017-01-23 10:02 ` Alexey Dobriyan
@ 2017-01-23 10:48 ` David Howells
  2017-01-23 20:10     ` Andy Lutomirski
  2017-01-23 20:36     ` David Howells
  2017-01-23 20:15 ` Christoph Hellwig
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 29+ messages in thread
From: David Howells @ 2017-01-23 10:48 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, ksummit-discuss, Greg KH

Andy Lutomirski <luto@amacapital.net> wrote:

> This is not easy at all, but: how about rewriting execve() so that the
> actual binary format parsers run in user mode?

Sounds very chicken-and-egg-ish.  Issues you'd have:

 (1) You'd need at least one pre-loader binary image built into the kernel
     that you can map into userspace (you can't upcall to userspace to go get
     it for your core binfmt).  This could appear as, say, /proc/preloader,
     for the kernel to open and mmap.

 (2) Where would the kernel put the executable image?  It would have to parse
     the binary to find out where not to put it - otherwise the code might
     have to relocate itself.

 (3) How do you deal with address randomisation?

 (4) You may have to start without a stack as the kernel wouldn't necessarily
     know where to put it or how big it should be (see 6).  Or you might have
     to relocate it, including all the pointers it contains.

 (5) Where should the kernel put arguments, environment and other parameters?
     Currently, this presumes a stack, but see (4).

 (6) NOMMU could be particularly tricky.  For ELF-FDPIC at least, the stack
     size is set in the binary.  OTOH, you wouldn't have to relocate the
     pre-loader - you'd just mmap it MAP_PRIVATE and execute in place.

 (7) When the kernel finds it's dealing with a script, it goes back through
     the security calculation procedure again to deal with the interpreter.
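
To make point (7) concrete: for a file that starts with "#!", the interpreter path has to be pulled out of the first line before the exec can be restarted on the interpreter. The following is a rough user-space sketch of that line parsing only. It is not the kernel's script handler, and parse_shebang() with its buffer arguments is a hypothetical name chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if buf (len bytes) starts with "#!", copy the
 * interpreter path (the first token on the first line) into out and
 * return 0. Returns -1 if buf is not a script, the path is empty, or
 * out is too small to hold the path plus its terminating NUL.
 */
static int parse_shebang(const char *buf, size_t len, char *out, size_t outlen)
{
	size_t i = 2, start, n;

	if (len < 2 || buf[0] != '#' || buf[1] != '!')
		return -1;
	/* Skip blanks between "#!" and the interpreter path. */
	while (i < len && (buf[i] == ' ' || buf[i] == '\t'))
		i++;
	start = i;
	/* The path ends at the first blank, newline or NUL. */
	while (i < len && buf[i] != ' ' && buf[i] != '\t' &&
	       buf[i] != '\n' && buf[i] != '\0')
		i++;
	n = i - start;
	if (n == 0 || n >= outlen)
		return -1;
	memcpy(out, buf + start, n);
	out[n] = '\0';
	return 0;
}
```

Whatever component does this parsing, the path it returns is what the security calculation would then have to be rerun against.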

> A minor one for x86: give binaries a way to opt out of the x86_64
> vsyscall page.  I already did the hard part (in a branch), so all
> that's really left is figuring out the ABI.

munmap() it in the loader?

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 10:48 ` David Howells
@ 2017-01-23 20:10     ` Andy Lutomirski
  2017-01-23 20:36     ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-23 20:10 UTC (permalink / raw)
  To: David Howells, linux-mm; +Cc: Josh Armour, Greg KH, ksummit-discuss

On Mon, Jan 23, 2017 at 2:48 AM, David Howells <dhowells@redhat.com> wrote:
> Andy Lutomirski <luto@amacapital.net> wrote:
>
>> This is not easy at all, but: how about rewriting execve() so that the
>> actual binary format parsers run in user mode?
>
> Sounds very chicken-and-egg-ish.  Issues you'd have:
>
>  (1) You'd need at least one pre-loader binary image built into the kernel
>      that you can map into userspace (you can't upcall to userspace to go get
>      it for your core binfmt).  This could appear as, say, /proc/preloader,
>      for the kernel to open and mmap.

No need for it to be visible at all.  I'm imagining the kernel making
a fresh mm_struct, directly mapping some text, running that text, and
then using the result as the mm_struct after execve.

>
>  (2) Where would the kernel put the executable image?  It would have to parse
>      the binary to find out where not to put it - otherwise the code might
>      have to relocate itself.

In vmlinux.

>
>  (3) How do you deal with address randomisation?

Non-issue, I think.

>
>  (4) You may have to start without a stack as the kernel wouldn't necessarily
>      know where to put it or how big it should be (see 6).  Or you might have
>      to relocate it, including all the pointers it contains.

The relocation part is indeed a bit nasty.

>
>  (5) Where should the kernel put arguments, environment and other parameters?
>      Currently, this presumes a stack, but see (4).

Hmm.

>
>  (6) NOMMU could be particularly tricky.  For ELF-FDPIC at least, the stack
>      size is set in the binary.  OTOH, you wouldn't have to relocate the
>      pre-loader - you'd just mmap it MAP_PRIVATE and execute in place.

For nommu, forget about it.

>
>  (7) When the kernel finds it's dealing with a script, it goes back through
>      the security calculation procedure again to deal with the interpreter.

The security calculation isn't what I'm worried about.  I'm worried
about the parser.

Anyway, I didn't say this would be easy :)

>
>> A minor one for x86: give binaries a way to opt out of the x86_64
>> vsyscall page.  I already did the hard part (in a branch), so all
>> that's really left is figuring out the ABI.
>
> munmap() it in the loader?

Hmm, *that's* an interesting thought.  You can't remove the VMA (it's
not a VMA) but maybe munmap() could be made to work anyway.  Hey mm
folks, just how weird would it be to let arch code special-case
unmapping of the gate pseudo-vma?

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
                   ` (2 preceding siblings ...)
  2017-01-23 10:48 ` David Howells
@ 2017-01-23 20:15 ` Christoph Hellwig
  2017-01-24  2:38 ` Andy Lutomirski
  2017-02-02 21:12 ` David Howells
  5 siblings, 0 replies; 29+ messages in thread
From: Christoph Hellwig @ 2017-01-23 20:15 UTC (permalink / raw)
  To: Kees Cook; +Cc: Josh Armour, Greg KH, ksummit-discuss

It seems like you've accidentally sent this to ksummit-discuss instead
of linux-kernel - you probably want to resend it to the proper list.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 10:48 ` David Howells
@ 2017-01-23 20:36     ` David Howells
  2017-01-23 20:36     ` David Howells
  1 sibling, 0 replies; 29+ messages in thread
From: David Howells @ 2017-01-23 20:36 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, ksummit-discuss, Greg KH, linux-mm

Andy Lutomirski <luto@amacapital.net> wrote:

> >  (1) You'd need at least one pre-loader binary image built into the kernel
> >      that you can map into userspace (you can't upcall to userspace to go get
> >      it for your core binfmt).  This could appear as, say, /proc/preloader,
> >      for the kernel to open and mmap.
> 
> No need for it to be visible at all.  I'm imagining the kernel making
> a fresh mm_struct, directly mapping some text, running that text, and
> then using the result as the mm_struct after execve.

What would you see in /proc/pid/maps?

> >  (2) Where would the kernel put the executable image?  It would have to
> >      parse the binary to find out where not to put it - otherwise the code
> >      might have to relocate itself.
> 
> In vmlinux.

You misunderstood the question.  I meant at what address would you map it into
userspace?  You would have to avoid anywhere the executable needs to place
something - though as long as you can manage to start the loader, you can
ditch the pre-loader, so that might not be a problem.

> >  (6) NOMMU could be particularly tricky.  For ELF-FDPIC at least, the stack
> >      size is set in the binary.  OTOH, you wouldn't have to relocate the
> >      pre-loader - you'd just mmap it MAP_PRIVATE and execute in place.
> 
> For nommu, forget about it.

Why?  If you do that, you have to have bimodal binfmts.  Note that the
ELF-FDPIC binfmt, at least, can be used for both MMU and NOMMU environments.
This may also be true of FLAT.

> >  (7) When the kernel finds it's dealing with a script, it goes back through
> >      the security calculation procedure again to deal with the interpreter.
> 
> The security calculation isn't what I'm worried about.  I'm worried
> about the parser.

But you may have to redo the security calculation *after* doing the parsing.

> Anyway, I didn't say this would be easy :)

True... :-)

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 20:36     ` David Howells
@ 2017-01-23 20:59       ` Matthew Wilcox
  -1 siblings, 0 replies; 29+ messages in thread
From: Matthew Wilcox @ 2017-01-23 20:59 UTC (permalink / raw)
  To: David Howells; +Cc: linux-mm, Greg Kroah-Hartman, Josh Armour, ksummit-discuss

Why put it in the user address space? As I said earlier in this thread, we
want the facility to run code from kernel addresses in user mode, limited
to only being able to access its own stack and the user addresses. Of
course it should also be able to make syscalls, like mmap.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 20:59       ` Matthew Wilcox
@ 2017-01-23 21:53         ` Andy Lutomirski
  -1 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-23 21:53 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Greg Kroah-Hartman, Josh Armour, ksummit-discuss, linux-mm

On Mon, Jan 23, 2017 at 12:59 PM, Matthew Wilcox <willy6545@gmail.com> wrote:
> Why put it in the user address space? As I said earlier in this thread, we
> want the facility to run code from kernel addresses in user mode, limited to
> only being able to access its own stack and the user addresses. Of course it
> should also be able to make syscalls, like mmap.

Would you believe I've already started prototyping this (the
kernel-code-in-user-mode part, not the execve part)?

As a practical matter, though, I think the implementation would be
*much* simpler if code running in user mode sees user addresses.
Otherwise we'd end up with very messy and constrained code on
single-address-space arches like x86 and we might not be able to
implement it at all on split-address-space arches like s390.

That being said, writing a bit of PIC code that parses the ELF file,
finds some unused address space, and relocates itself out of the way
shouldn't be *that* hard.
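
The first two steps (parse the ELF file, work out which addresses the target claims) can be sketched in ordinary user-space C. This is not the prototype mentioned above. It assumes a 64-bit ELF image in host byte order that is already in memory, uses the type definitions from <elf.h>, and the helper name elf64_load_range() is hypothetical.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: scan the program headers of a 64-bit ELF image and
 * report the lowest and highest virtual addresses covered by PT_LOAD
 * segments. Returns 0 on success, -1 if the image is malformed, is not
 * ELFCLASS64, or has no PT_LOAD segment.
 */
static int elf64_load_range(const unsigned char *img, size_t len,
			    uint64_t *lo, uint64_t *hi)
{
	Elf64_Ehdr eh;
	Elf64_Phdr ph;
	uint64_t min = 0, max = 0;
	int found = 0;
	size_t i;

	if (len < sizeof(eh))
		return -1;
	memcpy(&eh, img, sizeof(eh));	/* copy out: img may be unaligned */
	if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
	    eh.e_ident[EI_CLASS] != ELFCLASS64 ||
	    eh.e_phentsize != sizeof(ph))
		return -1;
	/* Bounds-check the program header table before reading it. */
	if (eh.e_phoff > len ||
	    (uint64_t)eh.e_phnum * sizeof(ph) > len - eh.e_phoff)
		return -1;
	for (i = 0; i < eh.e_phnum; i++) {
		memcpy(&ph, img + eh.e_phoff + i * sizeof(ph), sizeof(ph));
		if (ph.p_type != PT_LOAD)
			continue;
		if (ph.p_vaddr + ph.p_memsz < ph.p_vaddr)
			return -1;	/* segment wraps the address space */
		if (!found || ph.p_vaddr < min)
			min = ph.p_vaddr;
		if (!found || ph.p_vaddr + ph.p_memsz > max)
			max = ph.p_vaddr + ph.p_memsz;
		found = 1;
	}
	if (!found)
		return -1;
	*lo = min;
	*hi = max;
	return 0;
}
```

Anything outside [lo, hi) is a candidate for where position-independent loader code could move itself, subject to the stack and any interpreter the binary requests, which this sketch ignores.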

--Andy

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 20:36     ` David Howells
@ 2017-01-23 23:26       ` Greg Ungerer
  -1 siblings, 0 replies; 29+ messages in thread
From: Greg Ungerer @ 2017-01-23 23:26 UTC (permalink / raw)
  To: David Howells, Andy Lutomirski
  Cc: Josh Armour, Greg KH, ksummit-discuss, linux-mm

On 24/01/17 06:36, David Howells wrote:
> Andy Lutomirski <luto@amacapital.net> wrote:
[snip]
>>>  (6) NOMMU could be particularly tricky.  For ELF-FDPIC at least, the stack
>>>      size is set in the binary.  OTOH, you wouldn't have to relocate the
>>>      pre-loader - you'd just mmap it MAP_PRIVATE and execute in place.
>>
>> For nommu, forget about it.
> 
> Why?  If you do that, you have to have bimodal binfmts.  Note that the
> ELF-FDPIC binfmt, at least, can be used for both MMU and NOMMU environments.
> This may also be true of FLAT.

It is true for FLAT as well; FLAT binaries can run on both MMU and NOMMU.

Regards
Greg

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
                   ` (3 preceding siblings ...)
  2017-01-23 20:15 ` Christoph Hellwig
@ 2017-01-24  2:38 ` Andy Lutomirski
  2017-01-24 10:03   ` Eric W. Biederman
                     ` (2 more replies)
  2017-02-02 21:12 ` David Howells
  5 siblings, 3 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-24  2:38 UTC (permalink / raw)
  To: Kees Cook, Djalal Harouni, Eric W. Biederman
  Cc: Josh Armour, Greg KH, ksummit-discuss

On Fri, Jan 20, 2017 at 2:38 PM, Kees Cook <keescook@chromium.org> wrote:
> Hi,
>
> I've already got various Kernel Self-Protection Project TODO items
> collected[1] (of varying size and complexity), but recently Google's
> Patch Reward Program[2] is trying to expand by helping create a bounty
> program for security-related TODOs. KSPP is just one corner of
> interest in the kernel, and I'd love to know if any other maintainers
> have TODO items that they'd like to see get done (and Google would
> potentially provide bounty money for).
>
> Let me know your security wish-lists, and I'll collect them all into a
> single place. And if there is a better place than ksummit-discuss to
> reach maintainers, I'm all ears. LKML tends to mostly just serve as a
> public archive. :)
>

Here's another one: split up and modernize /proc.

I'm imagining a whole series of changes:

 - Make a sysctlfs.  You could mount it and get all the sysctls if you
have global privilege.  If you only have privilege relative to some
namespace, you could pass a mount option like -o scope=net to get just
sysctls that belong to the mounting process' netns.  If done
carefully, this should be safe for unprivileged mounting without the
fs_fully_visible() checks.

 - Teach procfs to understand mount options for real (per-superblock).
Shouldn't be that hard.

 - Make it possible to control hidepid per mount.  systemd and such
could use this to tighten up daemons.

 - Make it possible to make /proc/PID/cmdline only show argv[0] via
per-mount option or perhaps sysctl.

 - Make it possible to mount a mini-proc that doesn't have all the
non-PID stuff.  Presumably it would still have an empty directory
called sys and maybe some other minimal contents for compatibility
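
To illustrate the cmdline item in the list above: /proc/PID/cmdline holds
the arguments as a sequence of NUL-terminated strings, so "only show
argv[0]" amounts to cutting the buffer after the first terminator. The
sketch below is a userspace model of that truncation, not kernel code; the
function name cmdline_argv0_len() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: given the raw contents of a cmdline buffer
 * (NUL-separated arguments), return how many leading bytes belong to
 * argv[0], including its terminating NUL. If no NUL is present, the
 * whole buffer is treated as argv[0].
 */
size_t cmdline_argv0_len(const char *buf, size_t len)
{
    const char *nul;

    assert(buf != NULL);

    nul = memchr(buf, '\0', len);
    return nul ? (size_t)(nul - buf) + 1 : len;
}
```

A restricted view would then expose only the first cmdline_argv0_len()
bytes to readers of the file.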

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-24  2:38 ` Andy Lutomirski
@ 2017-01-24 10:03   ` Eric W. Biederman
  2017-01-24 21:00     ` Andy Lutomirski
  2017-01-24 10:38     ` Alexey Dobriyan
       [not found]   ` <CAEiveUcTQK84qFNpYoET-cpSXJe0KYtnYQtp0uTPz=z0tc3W9A@mail.gmail.com>
  2 siblings, 1 reply; 29+ messages in thread
From: Eric W. Biederman @ 2017-01-24 10:03 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, Greg KH, Djalal Harouni, ksummit-discuss

Andy Lutomirski <luto@amacapital.net> writes:

> On Fri, Jan 20, 2017 at 2:38 PM, Kees Cook <keescook@chromium.org> wrote:
>> Hi,
>>
>> I've already got various Kernel Self-Protection Project TODO items
>> collected[1] (of varying size and complexity), but recently Google's
>> Patch Reward Program[2] is trying to expand by helping create a bounty
>> program for security-related TODOs. KSPP is just one corner of
>> interest in the kernel, and I'd love to know if any other maintainers
>> have TODO items that they'd like to see get done (and Google would
>> potentially provide bounty money for).
>>
>> Let me know your security wish-lists, and I'll collect them all into a
>> single place. And if there is a better place than ksummit-discuss to
>> reach maintainers, I'm all ears. LKML tends to mostly just serve as a
>> public archive. :)
>>
>
> Here's another one: split up and modernize /proc.
>
> I'm imagining a whole series of changes:
>
>  - Make a sysctlfs.  You could mount it and get all the sysctls if you
> have global privilege.  If you only have privilege relative to some
> namespace, you could pass a mount option like -o scope=net to get just
> sysctls that belong to the mounting process' netns.  If done
> carefully, this should be safe for unprivileged mounting without the
> fs_fully_visible() checks.

Nope.  Because the fs_fully_visible checks are there to support a root
policy of what can be used.  Any filesystem with content needs
fs_fully_visible or another way for root to say no you can't access
these files.

cgroupfs gets a pass from me because we can set the number of cgroup
namespaces to 0, and because changing it will break userspace.

Besides, if you split up proc into pieces, bind mounts should be
sufficient, and you should not need to allow unprivileged users to mount
any of the pieces of proc.

>  - Teach procfs to understand mount options for real (per-superblock).
> Shouldn't be that hard.
>
>  - Make it possible to control hidepid per mount.  systemd and such
> could use this to tighten up daemons.

How about we come up with a better answer than hidepid and kill the
hidepid option?

>  - Make it possible to make /proc/PID/cmdline only show argv[0] via
> per-mount option or perhaps sysctl.
>
>  - Make it possible to mount a mini-proc that doesn't have all the
> non-PID stuff.  Presumably it would still have an empty directory
> called sys and maybe some other minimal contents for compatibility

That is certainly something that, if done carefully, could be
mounted without fs_fully_visible checks.

Eric

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-23 20:10     ` Andy Lutomirski
  (?)
@ 2017-01-24 10:32     ` Tetsuo Handa
  2017-01-24 20:58         ` Andy Lutomirski
  -1 siblings, 1 reply; 29+ messages in thread
From: Tetsuo Handa @ 2017-01-24 10:32 UTC (permalink / raw)
  To: Andy Lutomirski, David Howells, linux-mm
  Cc: Kees Cook, Josh Armour, Greg KH, ksummit-discuss

Hello.

Where can I read the archive of this discussion from the beginning?
This topic looks like an opportunity to propose my execute handler
approach.

In the out-of-tree version of the TOMOYO LSM, the administrator can specify
a program, called the execute handler, that is executed on behalf of the
program requested by execve(). The execute handler performs validation
(e.g. whether argv[]/envp[] are appropriate) and setup (e.g. redirecting
file handles) before executing the program requested by execve().

Conceptually, an execute handler is something like

  #!/bin/sh
  test ... || exit 1
  test ... || exit 1
  test ... || exit 1
  exec ...

which in practice would be implemented in C, as in
https://osdn.net/projects/tomoyo/scm/svn/blobs/head/tags/ccs-tools/1.8.5p1/usr_lib_ccs/audit-exec-param.c .
It is not difficult to implement the kernel side as well.
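
A minimal C sketch of the shell outline above, under stated assumptions. It
is not taken from audit-exec-param.c; the function names and the policy
(a cap on argument count and on argument length) are hypothetical stand-ins
for whatever validation an administrator would actually configure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical policy check: accept argv only if it has at least one
 * and at most max_args entries, each shorter than max_len bytes.
 * Returns 1 if acceptable, 0 otherwise.
 */
int args_acceptable(char *const argv[], size_t max_args, size_t max_len)
{
    size_t n;

    assert(argv != NULL);

    for (n = 0; argv[n] != NULL; n++) {
        if (n >= max_args)
            return 0;               /* too many arguments */
        if (strlen(argv[n]) >= max_len)
            return 0;               /* argument too long */
    }
    return n > 0;                   /* an empty argv is rejected */
}

/*
 * Validate, then hand over to the requested program. On success
 * execve() does not return; -1 means the request was refused or the
 * exec itself failed.
 */
int run_checked(const char *path, char *const argv[], char *const envp[])
{
    if (!args_acceptable(argv, 64, 4096))
        return -1;
    execve(path, argv, envp);
    return -1;
}
```

The final execve() corresponds to the "exec ..." line of the shell version:
the handler decides whether to proceed, and the kernel still loads the
requested binary.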

Regards.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-24  2:38 ` Andy Lutomirski
@ 2017-01-24 10:38     ` Alexey Dobriyan
  2017-01-24 10:38     ` Alexey Dobriyan
       [not found]   ` <CAEiveUcTQK84qFNpYoET-cpSXJe0KYtnYQtp0uTPz=z0tc3W9A@mail.gmail.com>
  2 siblings, 0 replies; 29+ messages in thread
From: Alexey Dobriyan @ 2017-01-24 10:38 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Josh Armour, ksummit-discuss, Greg KH, Linux Kernel, Djalal Harouni

        [add linux-kernel]

On Tue, Jan 24, 2017 at 5:38 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> Here's another one: split up and modernize /proc.
>
> I'm imagining a whole series of changes:
>
>  - Make a sysctlfs.  You could mount it and get all the sysctls if you
> have global privilege.  If you only have privilege relative to some
> namespace, you could pass a mount option like -o scope=net to get just
> sysctls that belong to the mounting process' netns.  If done
> carefully, this should be safe for unprivileged mounting without the
> fs_fully_visible() checks.
>
>  - Teach procfs to understand mount options for real (per-superblock).
> Shouldn't be that hard.
>
>  - Make it possible to control hidepid per mount.  systemd and such
> could use this to tighten up daemons.
>
>  - Make it possible to make /proc/PID/cmdline only show argv[0] via
> per-mount option or perhaps sysctl.
>
>  - Make it possible to mount a mini-proc that doesn't have all the
> non-PID stuff.  Presumably it would still have an empty directory
> called sys and maybe some other minimal contents for compatibility

Yes, please!

mount -t sysctl ...
mount -t proc-pid ...
mount -t proc-kitchen-sink ...

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-24 10:32     ` Tetsuo Handa
@ 2017-01-24 20:58         ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-24 20:58 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Josh Armour, ksummit-discuss, Greg KH, linux-mm

On Tue, Jan 24, 2017 at 2:32 AM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Hello.
>
> Can I read archive of the discussion of this topic from the beginning?
> I felt that this topic might be an opportunity of proposing my execute handler
> approach.

It should be in the linux-mm archives.

>
> In TOMOYO LSM (out of tree version), administrator can specify a program
> called execute handler which should be executed on behalf of a program
> requested by execve(). The specified program performs validation (e.g. whether
> argv[]/envp[] are appropriate) and setup (e.g. redirect file handles) before
> executing the program requested by execve().
>
> Conceptually execute handler is something like
>
>   #!/bin/sh
>   test ... || exit 1
>   test ... || exit 1
>   test ... || exit 1
>   exec ...
>
> which would in practice be implemented using C like
> https://osdn.net/projects/tomoyo/scm/svn/blobs/head/tags/ccs-tools/1.8.5p1/usr_lib_ccs/audit-exec-param.c .
> It is not difficult to implement the kernel side as well.
>

The difference is that the final exec still leaves the kernel exposed
to any bugs in its ELF parser.  Moving the parser to user mode would
reduce the attack surface.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-24 10:03   ` Eric W. Biederman
@ 2017-01-24 21:00     ` Andy Lutomirski
  2017-01-24 21:55       ` Eric W. Biederman
  0 siblings, 1 reply; 29+ messages in thread
From: Andy Lutomirski @ 2017-01-24 21:00 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Josh Armour, Greg KH, Djalal Harouni, ksummit-discuss

On Tue, Jan 24, 2017 at 2:03 AM, Eric W. Biederman
<ebiederm@xmission.com> wrote:
> Andy Lutomirski <luto@amacapital.net> writes:
>
>> On Fri, Jan 20, 2017 at 2:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>> Hi,
>>>
>>> I've already got various Kernel Self-Protection Project TODO items
>>> collected[1] (of varying size and complexity), but recently Google's
>>> Patch Reward Program[2] is trying to expand by helping create a bounty
>>> program for security-related TODOs. KSPP is just one corner of
>>> interest in the kernel, and I'd love to know if any other maintainers
>>> have TODO items that they'd like to see get done (and Google would
>>> potentially provide bounty money for).
>>>
>>> Let me know your security wish-lists, and I'll collect them all into a
>>> single place. And if there is a better place than ksummit-discuss to
>>> reach maintainers, I'm all ears. LKML tends to mostly just serve as a
>>> public archive. :)
>>>
>>
>> Here's another one: split up and modernize /proc.
>>
>> I'm imagining a whole series of changes:
>>
>>  - Make a sysctlfs.  You could mount it and get all the sysctls if you
>> have global privilege.  If you only have privilege relative to some
>> namespace, you could pass a mount option like -o scope=net to get just
>> sysctls that belong to the mounting process' netns.  If done
>> carefully, this should be safe for unprivileged mounting without the
>> fs_fully_visible() checks.
>
> Nope.  Because the fs_fully_visible checks are there to support a root
> policy of what can be used.  Any filesystem with content needs
> fs_fully_visible or another way for root to say no you can't access
> these files.
>
> cgroupfs gets a pass from me because we can set the number of cgroup
> namespaces to 0, and because changing it will break userspace.
>
> Besides, if you split up proc into pieces, bind mounts should be
> sufficient, and you should not need to allow unprivileged users to mount
> any of the pieces of proc.
>

Let me clarify what I meant.

Currently, IIUC there are a large number of sysctls that are global to
the system and a smaller number that only affect a single namespace.
If you have global privilege, you could do:

# mount -t sysctlfs -o scope=global none /whatever

This would be disallowed entirely if you don't have global privilege.
You could also do:

# mount -t sysctlfs -o scope=net none /whatever

This would *not* require global privilege or fs_fully_visible, but it
would require ns_capable(current->nsproxy->net_ns, CAP_NET_ADMIN).
You would get a limited sysctlfs that only shows sysctls that are local
to the network namespace of the mounter.

Does that make sense?

--Andy
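
The rule described above can be modelled in a few lines of userspace C.
This is only a sketch of the decision, not kernel code: sysctlfs does not
exist, the type and function names are hypothetical, and the two booleans
stand in for "global privilege" and for
ns_capable(current->nsproxy->net_ns, CAP_NET_ADMIN) respectively.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum sysctl_scope { SCOPE_GLOBAL, SCOPE_NET, SCOPE_INVALID };

/* Hypothetical summary of what the mounting process is allowed to do. */
struct mounter_caps {
    bool global_admin;  /* stands in for global privilege */
    bool netns_admin;   /* stands in for CAP_NET_ADMIN over its own netns */
};

/* Map the proposed mount option string to a scope. */
enum sysctl_scope parse_scope(const char *opt)
{
    assert(opt != NULL);

    if (strcmp(opt, "scope=global") == 0)
        return SCOPE_GLOBAL;
    if (strcmp(opt, "scope=net") == 0)
        return SCOPE_NET;
    return SCOPE_INVALID;
}

/* Decide whether the mount would be permitted under the proposal. */
bool may_mount_sysctlfs(struct mounter_caps caps, enum sysctl_scope scope)
{
    switch (scope) {
    case SCOPE_GLOBAL:
        return caps.global_admin;   /* needs global privilege */
    case SCOPE_NET:
        return caps.netns_admin;    /* namespace-local privilege suffices */
    default:
        return false;               /* unknown scope: refuse */
    }
}
```

A process that is only privileged inside its own network namespace would
pass for scope=net and be refused for scope=global.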

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-24 21:00     ` Andy Lutomirski
@ 2017-01-24 21:55       ` Eric W. Biederman
  0 siblings, 0 replies; 29+ messages in thread
From: Eric W. Biederman @ 2017-01-24 21:55 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, Greg KH, Djalal Harouni, ksummit-discuss

Andy Lutomirski <luto@amacapital.net> writes:

> On Tue, Jan 24, 2017 at 2:03 AM, Eric W. Biederman
> <ebiederm@xmission.com> wrote:
>> Andy Lutomirski <luto@amacapital.net> writes:
>>
>>> On Fri, Jan 20, 2017 at 2:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>>> Hi,
>>>>
>>>> I've already got various Kernel Self-Protection Project TODO items
>>>> collected[1] (of varying size and complexity), but recently Google's
>>>> Patch Reward Program[2] is trying to expand by helping create a bounty
>>>> program for security-related TODOs. KSPP is just one corner of
>>>> interest in the kernel, and I'd love to know if any other maintainers
>>>> have TODO items that they'd like to see get done (and Google would
>>>> potentially provide bounty money for).
>>>>
>>>> Let me know your security wish-lists, and I'll collect them all into a
>>>> single place. And if there is a better place than ksummit-discuss to
>>>> reach maintainers, I'm all ears. LKML tends to mostly just serve as a
>>>> public archive. :)
>>>>
>>>
>>> Here's another one: split up and modernize /proc.
>>>
>>> I'm imagining a whole series of changes:
>>>
>>>  - Make a sysctlfs.  You could mount it and get all the sysctls if you
>>> have global privilege.  If you only have privilege relative to some
>>> namespace, you could pass a mount option like -o scope=net to get just
>>> sysctls that belong to the mounting process' netns.  If done
>>> carefully, this should be safe for unprivileged mounting without the
>>> fs_fully_visible() checks.
>>
>> Nope.  Because the fs_fully_visible checks are there to support a root
>> policy of what can be used.  Any filesystem with content needs
>> fs_fully_visible or another way for root to say no you can't access
>> these files.
>>
>> cgroupfs gets a pass from me because we can set the number of cgroup
>> namespaces to 0, and because changing it will break userspace.
>>
>> Besides, if you split up proc into pieces, bind mounts should be
>> sufficient, and you should not need to allow unprivileged users to mount
>> any of the pieces of proc.
>>
>
> Let me clarify what I meant.
>
> Currently, IIUC there are a large number of sysctls that are global to
> the system and a smaller number that only affect a single namespace.
> If you have global privilege, you could do:
>
> # mount -t sysctlfs -o scope=global none /whatever
>
> This would be disallowed entirely if you don't have global privilege.
> You could also do:
>
> # mount -t sysctlfs -o scope=net none /whatever
>
> This would *not* require global privilege or fs_fully_visible, but it
> would require ns_capable(current->nsproxy->net_ns, CAP_NET_ADMIN).
> You would get a limited sysctlfs that only shows sysctls that are local
> to the network namespace of the mounter.
>
> Does that make sense?

Yes that does make sense, and that is reasonable.

Eric

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
  2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
                   ` (4 preceding siblings ...)
  2017-01-24  2:38 ` Andy Lutomirski
@ 2017-02-02 21:12 ` David Howells
  5 siblings, 0 replies; 29+ messages in thread
From: David Howells @ 2017-02-02 21:12 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: Josh Armour, ksummit-discuss, Greg KH, Djalal Harouni

Andy Lutomirski <luto@amacapital.net> wrote:

> Here's another one: split up and modernize /proc.

Just remember: /proc is part of the user API.  It effectively contains system
calls that are implemented with open/read/write/close rather than as direct
syscalls.  As such, you may not alter functionality in a way that breaks
userspace[*].

[*] OTOH restricting stuff for security purposes does have merit, so I'm not
    totally against the idea.
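
To make the "system calls implemented with open/read/write/close" point
concrete, here is a minimal sketch of how userspace typically consumes a
/proc entry. The helper name read_small_file() is hypothetical; nothing in
it is specific to procfs, which is exactly why programs come to depend on
the file names and formats staying stable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read up to cap - 1 bytes of a file into buf and
 * NUL-terminate it. Returns the number of bytes read, or -1 on error.
 * A caller might pass a path such as "/proc/sys/kernel/ostype".
 */
ssize_t read_small_file(const char *path, char *buf, size_t cap)
{
    size_t used = 0;
    int fd;

    assert(path != NULL && buf != NULL && cap > 0);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    while (used < cap - 1) {
        ssize_t n = read(fd, buf + used, cap - 1 - used);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: retry the read */
            close(fd);
            return -1;
        }
        if (n == 0)
            break;                  /* end of file */
        used += (size_t)n;
    }
    close(fd);
    buf[used] = '\0';
    return (ssize_t)used;
}
```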

David

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [Ksummit-discuss] security-related TODO items?
       [not found]   ` <CAEiveUcTQK84qFNpYoET-cpSXJe0KYtnYQtp0uTPz=z0tc3W9A@mail.gmail.com>
@ 2017-03-07 16:25     ` Andy Lutomirski
  0 siblings, 0 replies; 29+ messages in thread
From: Andy Lutomirski @ 2017-03-07 16:25 UTC (permalink / raw)
  To: Djalal Harouni; +Cc: Josh Armour, ksummit-discuss, Greg KH, Linus Torvalds

On Tue, Mar 7, 2017 at 8:12 AM, Djalal Harouni <tixxdz@gmail.com> wrote:
>
>
> On Tue, Jan 24, 2017 at 3:38 AM, Andy Lutomirski <luto@amacapital.net>
> wrote:
>> On Fri, Jan 20, 2017 at 2:38 PM, Kees Cook <keescook@chromium.org> wrote:
>>> Hi,
>>>
>>> I've already got various Kernel Self-Protection Project TODO items
>>> collected[1] (of varying size and complexity), but recently Google's
>>> Patch Reward Program[2] is trying to expand by helping create a bounty
>>> program for security-related TODOs. KSPP is just one corner of
>>> interest in the kernel, and I'd love to know if any other maintainers
>>> have TODO items that they'd like to see get done (and Google would
>>> potentially provide bounty money for).
>>>
>>> Let me know your security wish-lists, and I'll collect them all into a
>>> single place. And if there is a better place than ksummit-discuss to
>>> reach maintainers, I'm all ears. LKML tends to mostly just serve as a
>>> public archive. :)
>>>
>>
>> Here's another one: split up and modernize /proc.
>>
>> I'm imagining a whole series of changes:
>>
>>  - Make a sysctlfs.  You could mount it and get all the sysctls if you
>> have global privilege.  If you only have privilege relative to some
>> namespace, you could pass a mount option like -o scope=net to get just
>> sysctls that belong to the mounting process' netns.  If done
>> carefully, this should be safe for unprivileged mounting without the
>> fs_fully_visible() checks.
>>
>>  - Teach procfs to understand mount options for real (per-superblock).
>> Shouldn't be that hard.
>
> I spent some time investigating this option in order to improve procfs
> hidepid and replace the per-task hidepid solution. The result: this
> proposal will break userspace in a really bad way...
>
> Since procfs is a virtual fs we always generate a new 'st_dev' device ID
> that's used to get the major and minor IDs for that device.

If necessary, we could change that for procfs and have the same st_dev
for mounts in the same pidns.
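
For context, a sketch of the kind of userspace check that the st_dev
concern is about. Programs commonly compare st_dev values from stat() to
decide whether two paths sit on the same filesystem instance; if two procfs
mounts in one pid namespace reported different st_dev values, such a check
would start answering "no". The helper name same_st_dev() is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper: returns 1 if both paths report the same st_dev,
 * 0 if they differ, and -1 if either stat() call fails.
 */
int same_st_dev(const char *a, const char *b)
{
    struct stat sa, sb;

    assert(a != NULL && b != NULL);

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev;
}
```

Keeping st_dev identical for mounts in the same pidns, as suggested above,
would leave the result of a comparison like this unchanged.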

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2017-03-07 16:26 UTC | newest]

Thread overview: 29+ messages
2017-01-20 22:38 [Ksummit-discuss] security-related TODO items? Kees Cook
2017-01-21  0:14 ` Andy Lutomirski
2017-01-21  0:26   ` Kees Cook
2017-01-21  1:10   ` Matthew Wilcox
2017-01-21  1:47   ` Josh Triplett
2017-01-23 10:02 ` Alexey Dobriyan
2017-01-23 10:48 ` David Howells
2017-01-23 20:10   ` Andy Lutomirski
2017-01-23 20:10     ` Andy Lutomirski
2017-01-24 10:32     ` Tetsuo Handa
2017-01-24 20:58       ` Andy Lutomirski
2017-01-24 20:58         ` Andy Lutomirski
2017-01-23 20:36   ` David Howells
2017-01-23 20:36     ` David Howells
2017-01-23 20:59     ` Matthew Wilcox
2017-01-23 20:59       ` Matthew Wilcox
2017-01-23 21:53       ` Andy Lutomirski
2017-01-23 21:53         ` Andy Lutomirski
2017-01-23 23:26     ` Greg Ungerer
2017-01-23 23:26       ` Greg Ungerer
2017-01-23 20:15 ` Christoph Hellwig
2017-01-24  2:38 ` Andy Lutomirski
2017-01-24 10:03   ` Eric W. Biederman
2017-01-24 21:00     ` Andy Lutomirski
2017-01-24 21:55       ` Eric W. Biederman
2017-01-24 10:38   ` Alexey Dobriyan
2017-01-24 10:38     ` Alexey Dobriyan
     [not found]   ` <CAEiveUcTQK84qFNpYoET-cpSXJe0KYtnYQtp0uTPz=z0tc3W9A@mail.gmail.com>
2017-03-07 16:25     ` Andy Lutomirski
2017-02-02 21:12 ` David Howells
