* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
       [not found] <20150309204321.AAF412E0@viggo.jf.intel.com>
@ 2015-03-09 21:31 ` Kees Cook
       [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2015-03-09 21:31 UTC
  To: Dave Hansen
  Cc: Eric W. Biederman, Andrew Morton, Theodore Ts'o,
	Oleg Nesterov, LKML, Dave Hansen

On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen <dave@sr71.net> wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Physical addresses are sensitive information.  There are
> existing, known exploits that are made easier if physical
> information is available.  Here is one example:
>
>         http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem).  It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace.  This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc.  It can be
> enabled like this:
>
>         mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially.  When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO.  It can be disabled like this:
>
>         mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name.  I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.
>
> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel.  It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>

Seems reasonable. I would note that even CAP_SYS_RAWIO isn't enough to
actually do anything with RAM in /dev/mem. That's entirely controlled
by CONFIG_STRICT_DEVMEM.
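
For illustration, this is roughly how the proposed check would surface
to a userspace tool. It is a minimal sketch and not part of the patch:
open_pagemap() is a hypothetical helper, and the -EPERM case only arises
on a kernel carrying the "paranoid" option from this RFC, for a caller
without CAP_SYS_RAWIO.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>

/*
 * Hypothetical helper: open a pagemap file read-only.
 * Returns a file descriptor (>= 0) on success or a negative errno value.
 */
int open_pagemap(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;	/* e.g. -EPERM under the proposed paranoid mode */
	return fd;
}
```

A caller would pass "/proc/self/pagemap" and could treat -EPERM as
"physical addresses are not available here" rather than as a fatal error.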

I think /proc/kpagecount and /proc/kpageflags should get filtered as
well, instead of them relying on the uid=0 check.

Reviewed-by: Kees Cook <keescook@chromium.org>

-Kees


> ---
>
>  b/fs/proc/root.c                |   10 +++++++++-
>  b/fs/proc/task_mmu.c            |   11 +++++++++++
>  b/include/linux/pid_namespace.h |    1 +
>  3 files changed, 21 insertions(+), 1 deletion(-)
>
> diff -puN fs/proc/root.c~privileged-pagemap fs/proc/root.c
> --- a/fs/proc/root.c~privileged-pagemap 2015-03-09 13:33:12.104796793 -0700
> +++ b/fs/proc/root.c    2015-03-09 13:33:12.111797109 -0700
> @@ -39,10 +39,12 @@ static int proc_set_super(struct super_b
>  }
>
>  enum {
> -       Opt_gid, Opt_hidepid, Opt_err,
> +       Opt_gid, Opt_hidepid, Opt_paranoid, Opt_notparanoid, Opt_err,
>  };
>
>  static const match_table_t tokens = {
> +       {Opt_paranoid, "paranoid"},
> +       {Opt_notparanoid, "notparanoid"},
>         {Opt_hidepid, "hidepid=%u"},
>         {Opt_gid, "gid=%u"},
>         {Opt_err, NULL},
> @@ -70,6 +72,12 @@ static int proc_parse_options(char *opti
>                                 return 0;
>                         pid->pid_gid = make_kgid(current_user_ns(), option);
>                         break;
> +               case Opt_paranoid:
> +                       pid->paranoid = 1;
> +                       break;
> +               case Opt_notparanoid:
> +                       pid->paranoid = 0;
> +                       break;
>                 case Opt_hidepid:
>                         if (match_int(&args[0], &option))
>                                 return 0;
> diff -puN fs/proc/task_mmu.c~privileged-pagemap fs/proc/task_mmu.c
> --- a/fs/proc/task_mmu.c~privileged-pagemap     2015-03-09 13:33:12.106796883 -0700
> +++ b/fs/proc/task_mmu.c        2015-03-09 13:33:12.112797154 -0700
> @@ -1322,9 +1322,20 @@ out:
>
>  static int pagemap_open(struct inode *inode, struct file *file)
>  {
> +       struct pid_namespace *ns = inode->i_sb->s_fs_info;
> +
>         pr_warn_once("Bits 55-60 of /proc/PID/pagemap entries are about "
>                         "to stop being page-shift some time soon. See the "
>                         "linux/Documentation/vm/pagemap.txt for details.\n");
> +
> +       /*
> +        * Use the RAWIO capability bit.  If you can not go open
> +        * /dev/mem, then you also have no business knowing the
> +        * physical addresses of things.
> +        */
> +       if (ns->paranoid && !capable(CAP_SYS_RAWIO))
> +               return -EPERM;
> +
>         return 0;
>  }
>
> diff -puN include/linux/pid_namespace.h~privileged-pagemap include/linux/pid_namespace.h
> --- a/include/linux/pid_namespace.h~privileged-pagemap  2015-03-09 13:33:12.108796973 -0700
> +++ b/include/linux/pid_namespace.h     2015-03-09 13:33:12.112797154 -0700
> @@ -43,6 +43,7 @@ struct pid_namespace {
>         struct work_struct proc_work;
>         kgid_t pid_gid;
>         int hide_pid;
> +       int paranoid;
>         int reboot;     /* group exit code if this pidns was rebooted */
>         struct ns_common ns;
>  };
> _



-- 
Kees Cook
Chrome OS Security


* Re: [RFC][PATCH 2/2] proc: config options for making privileged /proc the default
       [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com>
@ 2015-03-09 21:32   ` Kees Cook
  0 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2015-03-09 21:32 UTC
  To: Dave Hansen
  Cc: Eric W. Biederman, Andrew Morton, Theodore Ts'o,
	Oleg Nesterov, LKML, Dave Hansen

On Mon, Mar 9, 2015 at 1:43 PM, Dave Hansen <dave@sr71.net> wrote:
>
> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> This is for folks where /proc is mounted very early or where it
> is not convenient to go changing fstab everywhere.
>
> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>

It'd be nice to have a way to do this in a more arbitrary fashion. For
example, select a CONFIG to also add things like hidepid=2, paranoid,
etc.

-Kees

> ---
>
>  b/fs/proc/Kconfig        |   17 +++++++++++++++++
>  b/kernel/pid_namespace.c |    1 +
>  2 files changed, 18 insertions(+)
>
> diff -puN fs/proc/Kconfig~privileged-pagemap-default-config fs/proc/Kconfig
> --- a/fs/proc/Kconfig~privileged-pagemap-default-config 2015-03-09 13:32:23.628610423 -0700
> +++ b/fs/proc/Kconfig   2015-03-09 13:32:23.633610649 -0700
> @@ -71,3 +71,20 @@ config PROC_PAGE_MONITOR
>           /proc/pid/smaps, /proc/pid/clear_refs, /proc/pid/pagemap,
>           /proc/kpagecount, and /proc/kpageflags. Disabling these
>            interfaces will reduce the size of the kernel by approximately 4kb.
> +
> +config PROC_PARANOID_DEFAULT
> +       default y
> +       depends on PROC_FS
> +       bool "Enable paranoid /proc mount option by default"
> +       help
> +         Access to some sensitive /proc files is restricted when the
> +         "paranoid" mount option is specified:
> +
> +               mount -o paranoid -t proc none /proc
> +
> +         Enabling this config option will set the "paranoid" option
> +         by default on all /proc mounts.  It may still be disabled at
> +         mount or remount time:
> +
> +               mount -o remount,notparanoid /proc
> +
> diff -puN kernel/pid_namespace.c~privileged-pagemap-default-config kernel/pid_namespace.c
> --- a/kernel/pid_namespace.c~privileged-pagemap-default-config  2015-03-09 13:32:23.630610514 -0700
> +++ b/kernel/pid_namespace.c    2015-03-09 13:32:23.633610649 -0700
> @@ -115,6 +115,7 @@ static struct pid_namespace *create_pid_
>         ns->parent = get_pid_ns(parent_pid_ns);
>         ns->user_ns = get_user_ns(user_ns);
>         ns->nr_hashed = PIDNS_HASH_ADDING;
> +       ns->paranoid = IS_ENABLED(CONFIG_PROC_PARANOID_DEFAULT);
>         INIT_WORK(&ns->proc_work, proc_cleanup_work);
>
>         set_bit(0, ns->pidmap[0].page);
> _



-- 
Kees Cook
Chrome OS Security


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
       [not found] <20150309204321.AAF412E0@viggo.jf.intel.com>
  2015-03-09 21:31 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Kees Cook
       [not found] ` <20150309204322.50DA6B5D@viggo.jf.intel.com>
@ 2015-03-09 22:13 ` Eric W. Biederman
  2015-03-09 22:22   ` Kees Cook
  2015-03-12 22:35 ` Andrew Morton
  3 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-09 22:13 UTC
  To: Dave Hansen
  Cc: Andrew Morton, Kees Cook, tytso, Oleg Nesterov, linux-kernel,
	dave.hansen

Dave Hansen <dave@sr71.net> writes:

> From: Dave Hansen <dave.hansen@linux.intel.com>
>
> Physical addresses are sensitive information.  There are
> existing, known exploits that are made easier if physical
> information is available.  Here is one example:
>
> 	http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem).  It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
>
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace.  This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
>
> This patch introduces a "paranoid" option for /proc.  It can be
> enabled like this:
>
> 	mount -o remount,paranoid /proc
>
> Or when /proc is mounted initially.  When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO.  It can be disabled like this:
>
> 	mount -o remount,notparanoid /proc
>
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
>
> I'm not really that stuck on the name.  I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.
>
> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel.  It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.


There is already a way to make pagemap go away.  It is called
CONFIG_PROC_PAGE_MONITOR.

I suspect the right answer here is if you enable kernel address
randomization you disable CONFIG_PROC_PAGE_MONITOR.  Aka you make the
two options conflict with each other.

That is a lot less code and a lot less to maintain.

On the other hand, if this is truly a valuable interface that you can't
part with, we need an alternative to pagemap that does the same job
without the exploit potential.  And I don't know how to do that.

Arguing in favor of just making the options conflict is the fact that
kernel address randomization is pretty much snake oil.  At least on
x86_64 the address pool is so small it can be trivially brute forced.  I
think there are maybe 10 bits you can randomize within.

As for a way to disable this, I expect it would do better with something
like a set-once flag that prevents a process and all of its children
from accessing this file.

*Blink* *Blink* Did you say you are worried about escalating privileges
from root into kernel space?  That is nonsense.  We give root the
power to shoot themselves in the foot, and any proc option will be
something that root will be able to get around.

The pieces of the patch description don't add up.

Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Eric


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 22:13 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Eric W. Biederman
@ 2015-03-09 22:22   ` Kees Cook
  2015-03-09 23:08     ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2015-03-09 22:22 UTC
  To: Eric W. Biederman
  Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov,
	LKML, Dave Hansen

On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Dave Hansen <dave@sr71.net> writes:
>
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> Physical addresses are sensitive information.  There are
>> existing, known exploits that are made easier if physical
>> information is available.  Here is one example:
>>
>>       http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>
>> If you know the physical address of something you also know at
>> which kernel virtual address you can find something (modulo
>> highmem).  It means that things that keep the kernel from
>> accessing user mappings (like SMAP/SMEP) can be worked around
>> because the _kernel_ mapping can get used instead.
>>
>> But, /proc/$pid/pagemap exposes the physical addresses of all
>> pages accessible to userspace.  This works against all of the
>> efforts to keep kernel addresses out of places where unprivileged
>> apps can find them.
>>
>> This patch introduces a "paranoid" option for /proc.  It can be
>> enabled like this:
>>
>>       mount -o remount,paranoid /proc
>>
>> Or when /proc is mounted initially.  When 'paranoid' mode is
>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>> without CAP_SYS_RAWIO.  It can be disabled like this:
>>
>>       mount -o remount,notparanoid /proc
>>
>> The option is applied to the pid namespace, so an app that wanted
>> a separate policy from the rest of the system could get run in
>> its own pid namespace.
>>
>> I'm not really that stuck on the name.  I'm not opposed to making
>> it apply only to pagemap or to giving it a pagemap-specific
>> name.
>>
>> pagemap is also the kind of feature that could be used to escalate
>> privileges from root into the kernel.  It probably needs to be
>> protected in the same way that /dev/mem or module loading is in
>> cases where the kernel needs to be protected from root, thus the
>> choice to use CAP_SYS_RAWIO.
>
>
> There is already a way to make pagemap go away.  It is called
> CONFIG_PROC_PAGE_MONITOR.
>
> I suspect the right answer here is if you enable kernel address
> randomization you disable CONFIG_PROC_PAGE_MONITOR.  Aka you make the
> two options conflict with each other.

It's not a good idea to make CONFIG options conflict with each other
like this as it puts distros in a tricky spot to decide which to use.
Allowing both and having a runtime flag of some kind tends to be the
better option (e.g. kASLR vs Hibernation).

> That is a lot less code and a lot less to maintain.
>
> On the other hand, if this is truly a valuable interface that you can't
> part with, we need an alternative to pagemap that does the same job
> without the exploit potential.  And I don't know how to do that.
>
> Arguing in favor of just making the options conflict is the fact that
> kernel address randomization is pretty much snake oil.  At least on
> x86_64 the address pool is so small it can be trivially brute forced.  I
> think there are maybe 10 bits you can randomize within.
>
> As for a way to disable this, I expect it would do better with something
> like a set-once flag that prevents a process and all of its children
> from accessing this file.
>
> *Blink* *Blink* Did you say you are worried about escalating privileges
> from root into kernel space?  That is nonsense.  We give root the
> power to shoot themselves in the foot, and any proc option will be
> something that root will be able to get around.
>
> The pieces of the patch description don't add up.

No, that's an entirely valid use-case. You can trust the kernel but
not root. This is the point of the "trusted_kernel" patch series that
disables all sorts of dangerous interfaces that allow root to get at
physical memory.

This situation is more a memory leak than a direct compromise, so it
seems like providing at least some runtime control of it (separate
from potential future "trusted_kernel" stuff) makes sense.

-Kees

>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> Eric



-- 
Kees Cook
Chrome OS Security


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 22:22   ` Kees Cook
@ 2015-03-09 23:08     ` Eric W. Biederman
  2015-03-09 23:40       ` Kees Cook
                         ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-09 23:08 UTC
  To: Kees Cook
  Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov,
	LKML, Dave Hansen

Kees Cook <keescook@chromium.org> writes:

> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> Dave Hansen <dave@sr71.net> writes:
>>
>>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>>
>>> Physical addresses are sensitive information.  There are
>>> existing, known exploits that are made easier if physical
>>> information is available.  Here is one example:
>>>
>>>       http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>>
>>> If you know the physical address of something you also know at
>>> which kernel virtual address you can find something (modulo
>>> highmem).  It means that things that keep the kernel from
>>> accessing user mappings (like SMAP/SMEP) can be worked around
>>> because the _kernel_ mapping can get used instead.
>>>
>>> But, /proc/$pid/pagemap exposes the physical addresses of all
>>> pages accessible to userspace.  This works against all of the
>>> efforts to keep kernel addresses out of places where unprivileged
>>> apps can find them.
>>>
>>> This patch introduces a "paranoid" option for /proc.  It can be
>>> enabled like this:
>>>
>>>       mount -o remount,paranoid /proc
>>>
>>> Or when /proc is mounted initially.  When 'paranoid' mode is
>>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>>> without CAP_SYS_RAWIO.  It can be disabled like this:
>>>
>>>       mount -o remount,notparanoid /proc
>>>
>>> The option is applied to the pid namespace, so an app that wanted
>>> a separate policy from the rest of the system could get run in
>>> its own pid namespace.
>>>
>>> I'm not really that stuck on the name.  I'm not opposed to making
>>> it apply only to pagemap or to giving it a pagemap-specific
>>> name.
>>>
>>> pagemap is also the kind of feature that could be used to escalate
>>> privileges from root into the kernel.  It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>>
>>
>> There is already a way to make pagemap go away.  It is called
>> CONFIG_PROC_PAGE_MONITOR.
>>
>> I suspect the right answer here is if you enable kernel address
>> randomization you disable CONFIG_PROC_PAGE_MONITOR.  Aka you make the
>> two options conflict with each other.
>
> It's not a good idea to make CONFIG options conflict with each other
> like this as it puts distros in a tricky spot to decide which to use.
> Allowing both and having a runtime flag of some kind tends to be the
> better option (e.g. kASLR vs Hibernation).

But there is a fundamental conflict.  As such it might as well be
expressed in Kconfig.

>> That is a lot less code and a lot less to maintain.
>>
>> On the other hand, if this is truly a valuable interface that you can't
>> part with, we need an alternative to pagemap that does the same job
>> without the exploit potential.  And I don't know how to do that.
>>
>> Arguing in favor of just making the options conflict is the fact that
>> kernel address randomization is pretty much snake oil.  At least on
>> x86_64 the address pool is so small it can be trivially brute forced.  I
>> think there are maybe 10 bits you can randomize within.
>>
>> As for a way to disable this, I expect it would do better with something
>> like a set-once flag that prevents a process and all of its children
>> from accessing this file.
>>
>> *Blink* *Blink* Did you say you are worried about escalating privileges
>> from root into kernel space?  That is nonsense.  We give root the
>> power to shoot themselves in the foot, and any proc option will be
>> something that root will be able to get around.
>>
>> The pieces of the patch description don't add up.
>
> No, that's an entirely valid use-case. You can trust the kernel but
> not root. This is the point of the "trusted_kernel" patch series that
> disables all sorts of dangerous interfaces that allow root to get at
> physical memory.
>
> This situation is more a memory leak than a direct compromise, so it
> seems like providing at least some runtime control of it (separate
> from potential future "trusted_kernel" stuff) makes sense.

I am too tired to argue about the kASLR snake-oil.

I do not think a proc mount option is at all appropriate for controlling
the behavior of the pagemap file.  And "paranoid" is entirely too
generic of a string to have any meaning.

Either just tighten the permissions when kASLR is enabled, or have the
file go away entirely.

If you want run-time knobs there are all kinds of run-time knobs you can
use.

If the concern is to protect against root getting into the kernel (the
"trusted_kernel" snake-oil), just compile out the pagemap file.  Nothing
else is remotely interesting from a maintenance point of view.

As I said.
Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

Eric


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:08     ` Eric W. Biederman
@ 2015-03-09 23:40       ` Kees Cook
  2015-03-09 23:43       ` Eric W. Biederman
  2015-03-10  2:28       ` Dave Hansen
  2 siblings, 0 replies; 14+ messages in thread
From: Kees Cook @ 2015-03-09 23:40 UTC
  To: Eric W. Biederman
  Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov,
	LKML, Dave Hansen

On Mon, Mar 9, 2015 at 4:08 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
> Kees Cook <keescook@chromium.org> writes:
>
>> On Mon, Mar 9, 2015 at 3:13 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> Dave Hansen <dave@sr71.net> writes:
>>>
>>>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>>>
>>>> Physical addresses are sensitive information.  There are
>>>> existing, known exploits that are made easier if physical
>>>> information is available.  Here is one example:
>>>>
>>>>       http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>>>>
>>>> If you know the physical address of something you also know at
>>>> which kernel virtual address you can find something (modulo
>>>> highmem).  It means that things that keep the kernel from
>>>> accessing user mappings (like SMAP/SMEP) can be worked around
>>>> because the _kernel_ mapping can get used instead.
>>>>
>>>> But, /proc/$pid/pagemap exposes the physical addresses of all
>>>> pages accessible to userspace.  This works against all of the
>>>> efforts to keep kernel addresses out of places where unprivileged
>>>> apps can find them.
>>>>
>>>> This patch introduces a "paranoid" option for /proc.  It can be
>>>> enabled like this:
>>>>
>>>>       mount -o remount,paranoid /proc
>>>>
>>>> Or when /proc is mounted initially.  When 'paranoid' mode is
>>>> active, opens to /proc/$pid/pagemap will return -EPERM for users
>>>> without CAP_SYS_RAWIO.  It can be disabled like this:
>>>>
>>>>       mount -o remount,notparanoid /proc
>>>>
>>>> The option is applied to the pid namespace, so an app that wanted
>>>> a separate policy from the rest of the system could get run in
>>>> its own pid namespace.
>>>>
>>>> I'm not really that stuck on the name.  I'm not opposed to making
>>>> it apply only to pagemap or to giving it a pagemap-specific
>>>> name.
>>>>
>>>> pagemap is also the kind of feature that could be used to escalate
>>>> privileges from root into the kernel.  It probably needs to be
>>>> protected in the same way that /dev/mem or module loading is in
>>>> cases where the kernel needs to be protected from root, thus the
>>>> choice to use CAP_SYS_RAWIO.
>>>
>>>
>>> There is already a way to make pagemap go away.  It is called
>>> CONFIG_PROC_PAGE_MONITOR.
>>>
>>> I suspect the right answer here is if you enable kernel address
>>> randomization you disable CONFIG_PROC_PAGE_MONITOR.  Aka you make the
>>> two options conflict with each other.
>>
>> It's not a good idea to make CONFIG options conflict with each other
>> like this as it puts distros in a tricky spot to decide which to use.
>> Allowing both and having a runtime flag of some kind tends to be the
>> better option (e.g. kASLR vs Hibernation).
>
> But there is a fundamental conflict.  As such it might as well be
> expressed in Kconfig.

Hm? I was using kASLR vs Hibernation as an example of two features
that, even though they are currently at odds with each other, are
available as a runtime-selectable option (putting "kaslr" on the
command line enables it and disables hibernation, rather than forcing
a CONFIG choice to pick one or the other).

>
>>> That is a lot less code and a lot less to maintain.
>>>
>>> On the other hand, if this is truly a valuable interface that you can't
>>> part with, we need an alternative to pagemap that does the same job
>>> without the exploit potential.  And I don't know how to do that.
>>>
>>> Arguing in favor of just making the options conflict is the fact that
>>> kernel address randomization is pretty much snake oil.  At least on
>>> x86_64 the address pool is so small it can be trivially brute forced.  I
>>> think there are maybe 10 bits you can randomize within.
>>>
>>> As for a way to disable this, I expect it would do better with something
>>> like a set-once flag that prevents a process and all of its children
>>> from accessing this file.
>>>
>>> *Blink* *Blink* Did you say you are worried about escalating privileges
>>> from root into kernel space?  That is nonsense.  We give root the
>>> power to shoot themselves in the foot, and any proc option will be
>>> something that root will be able to get around.
>>>
>>> The pieces of the patch description don't add up.
>>
>> No, that's an entirely valid use-case. You can trust the kernel but
>> not root. This is the point of the "trusted_kernel" patch series that
>> disables all sorts of dangerous interfaces that allow root to get at
>> physical memory.
>>
>> This situation is more a memory leak than a direct compromise, so it
>> seems like providing at least some runtime control of it (separate
>> from potential future "trusted_kernel" stuff) makes sense.
>
> I am too tired to argue about the kASLR snake-oil.

No problem. :)

>
> I do not think a proc mount option is at all appropriate for controlling
> the behavior of the pagemap file.  And "paranoid" is entirely too
> generic of a string to have any meaning.
>
> Either just tighten the permissions when kASLR is enabled, or have the
> file go away entirely.
>
> If you want run-time knobs there are all kinds of run-time knobs you can
> use.
>
> If the concern is to protect against root getting into the kernel (the
> "trusted_kernel" snake-oil), just compile out the pagemap file.  Nothing
> else is remotely interesting from a maintenance point of view.

Distros cannot opt to compile out the pagemap file. They want to
provide end users with one kernel that can do both, selectable at
runtime. If I want to make it harder for things that need physical
page maps to attack my system, I'd like to be able to turn it on in my
distro. And since I can remove CAP_SYS_RAWIO from init during my
initramfs, I would love to have this flag.
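
One plausible way to do the capability removal mentioned above is to
drop CAP_SYS_RAWIO from the capability bounding set before exec'ing the
real init. The sketch below assumes that approach;
drop_rawio_from_bounding_set() is a hypothetical name, while
prctl(PR_CAPBSET_DROP) is the existing interface described in
capabilities(7). Dropping from the bounding set does not revoke
capabilities the caller already holds; it limits what later execve()
calls can gain.

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>
#include <linux/capability.h>

/*
 * Drop CAP_SYS_RAWIO from the calling thread's capability bounding set.
 * Returns 0 on success or a negative errno value (-EPERM when the
 * caller lacks CAP_SETPCAP).
 */
int drop_rawio_from_bounding_set(void)
{
	if (prctl(PR_CAPBSET_DROP, CAP_SYS_RAWIO, 0, 0, 0) < 0)
		return -errno;

	/* PR_CAPBSET_READ reports 0 once the capability is gone. */
	assert(prctl(PR_CAPBSET_READ, CAP_SYS_RAWIO, 0, 0, 0) == 0);
	return 0;
}
```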

-Kees

>
> As I said.
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> Eric



-- 
Kees Cook
Chrome OS Security


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:08     ` Eric W. Biederman
  2015-03-09 23:40       ` Kees Cook
@ 2015-03-09 23:43       ` Eric W. Biederman
  2015-03-10  0:03         ` Kees Cook
  2015-03-10  2:28       ` Dave Hansen
  2 siblings, 1 reply; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-09 23:43 UTC
  To: Kees Cook
  Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov,
	LKML, Dave Hansen


A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
appropriate random number ought to keep from revealing page numbers or
page adjacencies while not requiring any changes in userspace.

That way the revealed pfn and the physical pfn would be different but
you could still use pagemap for its intended purpose.
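
A minimal userspace sketch of the blinding idea above, not kernel code.
It assumes 32-bit page frame numbers and a per-boot key that is odd;
blind_pfn(), unblind_pfn() and mul_inverse32() are hypothetical names.
Multiplication by an odd constant modulo 2^32 is a bijection, so the
mapping can be undone by multiplying with the modular inverse of the key.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Inverse of an odd number modulo 2^32 via Newton iteration.
 * Each step doubles the number of correct low bits.
 */
uint32_t mul_inverse32(uint32_t key)
{
	uint32_t inv = key;	/* key * key == 1 (mod 8) for any odd key */
	int i;

	assert(key & 1);	/* even keys have no inverse modulo 2^32 */
	for (i = 0; i < 4; i++)	/* 3 -> 6 -> 12 -> 24 -> 48 correct bits */
		inv *= 2u - key * inv;
	return inv;
}

/* Value that would be shown to userspace in place of the real pfn. */
uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	return pfn * key;	/* unsigned arithmetic wraps modulo 2^32 */
}

/* Recover the real pfn from a blinded value, given the same key. */
uint32_t unblind_pfn(uint32_t blinded, uint32_t key)
{
	return blinded * mul_inverse32(key);
}
```

Whether such a key can be kept secret from userspace is a separate
question that this sketch does not address.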

Eric


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:43       ` Eric W. Biederman
@ 2015-03-10  0:03         ` Kees Cook
  2015-03-10  2:51           ` Dave Hansen
  0 siblings, 1 reply; 14+ messages in thread
From: Kees Cook @ 2015-03-10  0:03 UTC
  To: Eric W. Biederman
  Cc: Dave Hansen, Andrew Morton, Theodore Ts'o, Oleg Nesterov,
	LKML, Dave Hansen

On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
> appropriate random number ought to keep from revealing page numbers or
> page adjacencies while not requiring any changes in userspace.
>
> That way the revealed pfn and the physical pfn would be different but
> you could still use pagemap for its intended purpose.

If this could be done in a way where it was sufficiently hard to
expose the random number, we should absolutely do this. And this could
be done for socket handles in INET_DIAG too. We have a lot of these
kinds of "handle" leaks where the handles can be regarded as private
information leakage.

-Kees


-- 
Kees Cook
Chrome OS Security


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-09 23:08     ` Eric W. Biederman
  2015-03-09 23:40       ` Kees Cook
  2015-03-09 23:43       ` Eric W. Biederman
@ 2015-03-10  2:28       ` Dave Hansen
  2 siblings, 0 replies; 14+ messages in thread
From: Dave Hansen @ 2015-03-10  2:28 UTC
  To: Eric W. Biederman, Kees Cook
  Cc: Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

On 03/09/2015 04:08 PM, Eric W. Biederman wrote:
> If the concern is to protect against root getting into the kernel (the
> "trusted_kernel" snake-oil), just compile out the pagemap file.  Nothing
> else is remotely interesting from a maintenance point of view.

The paper I linked to showed one example of how pagemap makes a
user->kernel exploit _easier_.  Note that the authors had another way
of actually doing the exploit when pagemap was not available, but it
required some more trouble than if pagemap was around.
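
For readers unfamiliar with the interface, here is a sketch of why a
pagemap entry is useful for this kind of attack. Per
Documentation/vm/pagemap.txt, bit 63 of an entry means "page present"
and bits 0-54 hold the PFN. The direct-map base used below
(0xffff880000000000) and the 4 KiB page size are assumptions that match
x86_64 kernels of this era; the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PM_PRESENT	(1ULL << 63)		/* page is present in RAM */
#define PM_PFN_MASK	((1ULL << 55) - 1)	/* bits 0-54: page frame number */
#define PAGE_SHIFT_4K	12			/* assumed 4 KiB pages */
#define DIRECT_MAP_BASE	0xffff880000000000ULL	/* assumed x86_64 direct map */

/* Extract the PFN from one 64-bit pagemap entry; -1 if not present. */
int pagemap_entry_to_pfn(uint64_t entry, uint64_t *pfn)
{
	assert(pfn != NULL);
	if (!(entry & PM_PRESENT))
		return -1;
	*pfn = entry & PM_PFN_MASK;
	return 0;
}

/* Kernel virtual address at which the same physical page is also mapped. */
uint64_t pfn_to_direct_map(uint64_t pfn)
{
	return DIRECT_MAP_BASE + (pfn << PAGE_SHIFT_4K);
}
```

With both steps available to an unprivileged process, the kernel alias
of a user page follows directly from a single pagemap read.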

I mentioned the "trusted_kernel" stuff as an aside.  It's really not the
main concern.


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-10  0:03         ` Kees Cook
@ 2015-03-10  2:51           ` Dave Hansen
  2015-03-10  4:49             ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2015-03-10  2:51 UTC (permalink / raw)
  To: Kees Cook, Eric W. Biederman
  Cc: Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML, Dave Hansen

On 03/09/2015 05:03 PM, Kees Cook wrote:
> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>> appropriate random number ought to keep from revealing page numbers or
>> page adjacencies while not requiring any changes in userspace.
>>
>> That way the revealed pfn and the physical pfn would be different but
>> you could still use pagemap for its intended purpose.
> 
> If this could be done in a way where it was sufficiently hard to
> expose the random number, we should absolutely do this.

We would need something which is both reversible (so that the given
offsets can still be used in /proc/kpagemap) and also hard to do a
known-plaintext-type attack on it.

Transparent huge pages are a place where userspace knows the
relationship between 512 adjacent physical addresses.  That represents a
good chunk of known data.  Surely there are more of these kinds of things.
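
To make the known-plaintext concern concrete, here is a small sketch,
assuming the blinding is plain multiplication modulo 2^32 by a secret
odd key (the function names are hypothetical). If userspace knows that
two revealed values belong to physically adjacent pages, as it would
for the subpages of a huge page, subtracting them yields the key
directly.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed blinding scheme: multiplication modulo 2^32 by a secret key. */
static uint32_t blind_pfn(uint32_t pfn, uint32_t key)
{
	return pfn * key;
}

/*
 * Given the blinded values of two adjacent pfns p and p + 1:
 *   blind(p + 1) - blind(p) = (p + 1) * key - p * key = key  (mod 2^32)
 */
static uint32_t recover_key(uint32_t blinded, uint32_t blinded_next)
{
	return blinded_next - blinded;
}

/* Every neighbouring pair among the 512 subpages leaks the same key. */
static void check_huge_page_leak(uint32_t base_pfn, uint32_t key)
{
	uint32_t i;

	for (i = 0; i < 511; i++)
		assert(recover_key(blind_pfn(base_pfn + i, key),
				   blind_pfn(base_pfn + i + 1, key)) == key);
}
```

With the key in hand, the attacker can invert the multiplication and
recover every real pfn, so a purely linear blinding does not survive
this kind of known relationship.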

Right now, for instance, the ways in which a series of sequential
allocations come out of the page allocator are fairly deterministic.  We
would also need to do some kind of allocator randomization to ensure
that userspace couldn't make good guesses about the physical addresses
of things coming out of the allocator.

Or, we just be sure and turn the darn thing off. :)


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-10  2:51           ` Dave Hansen
@ 2015-03-10  4:49             ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-10  4:49 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Kees Cook, Andrew Morton, Theodore Ts'o, Oleg Nesterov, LKML,
	Dave Hansen

Dave Hansen <dave.hansen@intel.com> writes:

> On 03/09/2015 05:03 PM, Kees Cook wrote:
>> On Mon, Mar 9, 2015 at 4:43 PM, Eric W. Biederman <ebiederm@xmission.com> wrote:
>>> A 1 to 1 blinding function like integer multiplication modulo 2^32 by an
>>> appropriate random number ought to keep from revealing page numbers or
>>> page adjacencies while not requiring any changes in userspace.
>>>
>>> That way the revealed pfn and the physical pfn would be different but
>>> you could still use pagemap for its intended purpose.
>> 
>> If this could be done in a way where it was sufficiently hard to
>> expose the random number, we should absolutely do this.
>
> We would need something which is both reversible (so that the given
> offsets can still be used in /proc/kpagemap) and also hard to do a
> known-plaintext-type attack on it.
>
> Transparent huge pages are a place where userspace knows the
> relationship between 512 adjacent physical addresses.  That represents a
> good chunk of known data.  Surely there are more of these kinds of things.
>
> Right now, for instance, the ways in which a series of sequential
> allocations come out of the page allocator are fairly deterministic.  We
> would also need to do some kind of allocator randomization to ensure
> that userspace couldn't make good guesses about the physical addresses
> of things coming out of the allocator.
>
> Or, we just be sure and turn the darn thing off. :)

Yes.  If we are worried about something a big off switch is fine.

As for a one-to-one transform that is resistant to known-plaintext attacks,
I think that is the definition of a cipher.  That is, we should just use
AES or something well known to encrypt the page frame numbers if we want
to hide them.  I don't know if the block mode of AES would be a problem
or not.
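
As a rough illustration of "a cipher is a keyed one-to-one transform",
here is a toy Feistel network over 32-bit values. This is not AES and
is not cryptographically secure: the round function, the round count
and the demo keys are all arbitrary choices made for this sketch. It
only shows that such a construction is reversible for any round
function and can be sized to the width of the field being hidden,
which a fixed 128-bit block cipher cannot do directly.

```c
#include <assert.h>
#include <stdint.h>

#define FEISTEL_ROUNDS 4

/* Arbitrary demo round keys; a real user would derive these from a secret. */
static const uint32_t demo_round_keys[FEISTEL_ROUNDS] = {
	0x243f6a88u, 0x85a308d3u, 0x13198a2eu, 0x03707344u,
};

/* Toy round function: mixes a 16-bit half with a round key. NOT secure. */
static uint16_t round_fn(uint16_t half, uint32_t rk)
{
	uint32_t x = ((uint32_t)half + rk) * 0x9e3779b1u;

	return (uint16_t)(x >> 16);
}

static uint32_t scramble_pfn(uint32_t pfn, const uint32_t *rk)
{
	uint16_t l = (uint16_t)(pfn >> 16), r = (uint16_t)pfn;
	int i;

	for (i = 0; i < FEISTEL_ROUNDS; i++) {
		uint16_t tmp = r;

		r = (uint16_t)(l ^ round_fn(r, rk[i]));
		l = tmp;
	}
	return ((uint32_t)l << 16) | r;
}

/* Run the rounds in reverse order to undo scramble_pfn(). */
static uint32_t unscramble_pfn(uint32_t scrambled, const uint32_t *rk)
{
	uint16_t l = (uint16_t)(scrambled >> 16), r = (uint16_t)scrambled;
	int i;

	for (i = FEISTEL_ROUNDS - 1; i >= 0; i--) {
		uint16_t tmp = l;

		l = (uint16_t)(r ^ round_fn(l, rk[i]));
		r = tmp;
	}
	return ((uint32_t)l << 16) | r;
}

/* Round-trip check usable as a quick sanity test. */
static void check_round_trip(uint32_t pfn)
{
	assert(unscramble_pfn(scramble_pfn(pfn, demo_round_keys),
			      demo_round_keys) == pfn);
}
```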

Eric




* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
       [not found] <20150309204321.AAF412E0@viggo.jf.intel.com>
                   ` (2 preceding siblings ...)
  2015-03-09 22:13 ` [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface Eric W. Biederman
@ 2015-03-12 22:35 ` Andrew Morton
  2015-03-13 15:56   ` Dave Hansen
  3 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2015-03-12 22:35 UTC (permalink / raw)
  To: Dave Hansen
  Cc: Eric W. Biederman, Kees Cook, tytso, Oleg Nesterov, linux-kernel,
	dave.hansen

On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:

> 
> From: Dave Hansen <dave.hansen@linux.intel.com>
> 
> Physical addresses are sensitive information.  There are
> existing, known exploits that are made easier if physical
> information is available.  Here is one example:
> 
> 	http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
> 
> If you know the physical address of something you also know at
> which kernel virtual address you can find something (modulo
> highmem).  It means that things that keep the kernel from
> accessing user mappings (like SMAP/SMEP) can be worked around
> because the _kernel_ mapping can get used instead.
> 
> But, /proc/$pid/pagemap exposes the physical addresses of all
> pages accessible to userspace.  This works against all of the
> efforts to keep kernel addresses out of places where unprivileged
> apps can find them.
> 
> This patch introduces a "paranoid" option for /proc.  It can be
> enabled like this:
> 
> 	mount -o remount,paranoid /proc
> 
> Or when /proc is mounted initially.  When 'paranoid' mode is
> active, opens to /proc/$pid/pagemap will return -EPERM for users
> without CAP_SYS_RAWIO.  It can be disabled like this:
> 
> 	mount -o remount,notparanoid /proc
> 
> The option is applied to the pid namespace, so an app that wanted
> a separate policy from the rest of the system could get run in
> its own pid namespace.
> 
> I'm not really that stuck on the name.  I'm not opposed to making
> it apply only to pagemap or to giving it a pagemap-specific
> name.

Do we really need to disable pagemap entirely?  What happens if we just
obscure the addresses (ie: zero them)?

> pagemap is also the kind of feature that could be used to escalate
> privileges from root into the kernel.  It probably needs to be
> protected in the same way that /dev/mem or module loading is in
> cases where the kernel needs to be protected from root, thus the
> choice to use CAP_SYS_RAWIO.

Confused.  If you have root, you can do mount -o notparanoid.


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-12 22:35 ` Andrew Morton
@ 2015-03-13 15:56   ` Dave Hansen
  2015-03-13 17:16     ` Eric W. Biederman
  0 siblings, 1 reply; 14+ messages in thread
From: Dave Hansen @ 2015-03-13 15:56 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Eric W. Biederman, Kees Cook, tytso, Oleg Nesterov, linux-kernel

On 03/12/2015 03:35 PM, Andrew Morton wrote:
> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:
>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>
>> Physical addresses are sensitive information.  There are
>> existing, known exploits that are made easier if physical
>> information is available.  Here is one example:
>>
>> 	http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
> Do we really need to disable pagemap entirely?  What happens if we just
> obscure the addresses (ie: zero them)?

I think we have 3 basic options:

1.  Disable it entirely (-EPERM or whatever).  Apps using it break
    quickly and fairly obviously (diagnosable with an strace)
2.  Zero it, or return some nonsensical thing for the physical address
    portion, but maintain exporting the PTE flags.  Apps only caring
    about PTE flags work, but anything trying to do lookups in
    /proc/kpageflags breaks.  If we zero it, apps may get confused
    thinking they have the _actual_ pfn=0.
3.  Scramble it in some way obscuring the physical address.  Unscramble
    it upon access to /proc/kpageflags.

I think you're suggesting (2).  Doesn't that risk silently breaking apps?
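
For anyone following along, a small sketch of what option (2) looks
like at the level of a single 64-bit pagemap entry. The layout assumed
here follows the pagemap documentation (pfn in bits 0-54, swapped in
bit 62, present in bit 63); the macro and function names are made up
for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PM_PFN_BITS	55
#define PM_PFN_MASK	((UINT64_C(1) << PM_PFN_BITS) - 1)
#define PM_FLAG_SWAP	(UINT64_C(1) << 62)
#define PM_FLAG_PRESENT	(UINT64_C(1) << 63)

/* Build an entry from a pfn and flag bits (for demonstration only). */
static uint64_t pm_make_entry(uint64_t pfn, uint64_t flags)
{
	assert((pfn & ~PM_PFN_MASK) == 0);	/* pfn must fit in 55 bits */
	return pfn | flags;
}

static uint64_t pm_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

static int pm_present(uint64_t entry)
{
	return (entry & PM_FLAG_PRESENT) != 0;
}

/* Option (2): clear the pfn field but keep exporting the flag bits. */
static uint64_t pm_zero_pfn(uint64_t entry)
{
	return entry & ~PM_PFN_MASK;
}
```

Note that pm_zero_pfn() applied to a present page produces exactly the
same bits as a genuinely present page at pfn 0, which is the silent
confusion mentioned in (2).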

>> pagemap is also the kind of feature that could be used to escalate
>> privileges from root into the kernel.  It probably needs to be
>> protected in the same way that /dev/mem or module loading is in
>> cases where the kernel needs to be protected from root, thus the
>> choice to use CAP_SYS_RAWIO.
> 
> Confused.  If you have root, you can do mount -o notparanoid.

Good point.  I guess it doesn't protect us much here unless we also
restrict the ability to remount.


* Re: [RFC][PATCH 1/2] fs proc: make pagemap a privileged interface
  2015-03-13 15:56   ` Dave Hansen
@ 2015-03-13 17:16     ` Eric W. Biederman
  0 siblings, 0 replies; 14+ messages in thread
From: Eric W. Biederman @ 2015-03-13 17:16 UTC (permalink / raw)
  To: Dave Hansen; +Cc: Andrew Morton, Kees Cook, tytso, Oleg Nesterov, linux-kernel

Dave Hansen <dave@sr71.net> writes:

> On 03/12/2015 03:35 PM, Andrew Morton wrote:
>> On Mon, 09 Mar 2015 13:43:21 -0700 Dave Hansen <dave@sr71.net> wrote:
>>> From: Dave Hansen <dave.hansen@linux.intel.com>
>>>
>>> Physical addresses are sensitive information.  There are
>>> existing, known exploits that are made easier if physical
>>> information is available.  Here is one example:
>>>
>>> 	http://www.cs.columbia.edu/~vpk/papers/ret2dir.sec14.pdf
>> Do we really need to disable pagemap entirely?  What happens if we just
>> obscure the addresses (ie: zero them)?
>
> I think we have 3 basic options:
>
> 1.  Disable it entirely (-EPERM or whatever).  Apps using it break
>     quickly and fairly obviously (diagnosable with an strace)
> 2.  Zero it, or return some nonsensical thing for the physical address
>     portion, but maintain exporting the PTE flags.  Apps only caring
>     about PTE flags work, but anything trying to do lookups in
>     /proc/kpageflags breaks.  If we zero it, apps may get confused
>     thinking they have the _actual_ pfn=0.
> 3.  Scramble it in some way obscuring the physical address.  Unscramble
>     it upon access to /proc/kpageflags.
>
> I think you're suggesting (2).  Doesn't that risk silently breaking
> apps?

I think 3, where the scramble is something like AES, is likely to
obscure the addresses well and still protect us from known-plaintext
attacks.

>>> pagemap is also the kind of feature that could be used to escalate
>>> privileges from root into the kernel.  It probably needs to be
>>> protected in the same way that /dev/mem or module loading is in
>>> cases where the kernel needs to be protected from root, thus the
>>> choice to use CAP_SYS_RAWIO.
>> 
>> Confused.  If you have root, you can do mount -o notparanoid.
>
> Good point.  I guess it doesn't protect us much here unless we also
> restrict the ability to remount.

And the ability to unmount... 

A write-once sysctl or a boot time only parameter is much more likely to
be useful in the scenario where you are concerned about root.
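
The write-once semantics can be sketched as a one-way latch: the
restriction can be turned on at any time, but once on it cannot be
turned off again until reboot. This is a userspace illustration only,
not a sysctl handler; struct write_once_flag, write_once_set() and the
pagemap_restricted variable are hypothetical names. The existing
kernel.modules_disabled sysctl behaves in a similar one-way fashion.

```c
#include <assert.h>
#include <errno.h>

/* A flag that may go from 0 to 1 but never back. */
struct write_once_flag {
	int value;
};

/* Hypothetical knob controlling access to pagemap; off by default. */
static struct write_once_flag pagemap_restricted = { 0 };

/*
 * Returns 0 on success, -EINVAL for values other than 0 or 1,
 * and -EPERM when trying to clear a flag that is already set.
 */
static int write_once_set(struct write_once_flag *f, int new_value)
{
	assert(f);
	if (new_value != 0 && new_value != 1)
		return -EINVAL;
	if (f->value == 1 && new_value == 0)
		return -EPERM;
	f->value = new_value;
	return 0;
}
```

Unlike a mount option, a latch like this gives root no path to undo
the restriction by remounting or unmounting /proc.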

Eric



