* [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
@ 2014-04-14 19:02 Daniel Borkmann
  2014-04-14 20:13 ` Andy Lutomirski
  2014-04-15 22:52 ` Linus Torvalds
  0 siblings, 2 replies; 8+ messages in thread
From: Daniel Borkmann @ 2014-04-14 19:02 UTC (permalink / raw)
  To: davem
  Cc: netdev, linux-kernel, torvalds, Alexei Starovoitov, Eric Paris,
	James Morris, Kees Cook

Linus reports that on 32-bit x86 Chromium throws the following seccomp
resp. audit log messages:

  audit: type=1326 audit(1397359304.356:28108): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0
syscall=172 compat=0 ip=0xb2dd9852 code=0x30000

  audit: type=1326 audit(1397359304.356:28109): auid=500 uid=500
gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5
compat=0 ip=0xb2dd9852 code=0x50000

These audit messages are triggered via audit_seccomp() through
__secure_computing() in seccomp filter (BPF) mode, with seccomp return
codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO)
during filter runtime. Moreover, Linus reports that x86_64 Chromium
seems fine.
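
For readers decoding the code= field in the audit lines above, here is a
minimal sketch of how the action can be recovered from such a value. The
constants are meant to mirror include/uapi/linux/seccomp.h but are redefined
with a DEMO_ prefix so the snippet builds without kernel headers; the helper
names are hypothetical and not part of the patch.

```c
#include <stdint.h>

/* Assumed to mirror include/uapi/linux/seccomp.h; DEMO_ prefix is ours. */
#define DEMO_SECCOMP_RET_TRAP   0x00030000U
#define DEMO_SECCOMP_RET_ERRNO  0x00050000U
#define DEMO_SECCOMP_RET_ACTION 0x7fff0000U	/* action bits */
#define DEMO_SECCOMP_RET_DATA   0x0000ffffU	/* e.g. the errno value */

/* Hypothetical helper: name the action encoded in a filter return value. */
static const char *demo_seccomp_action_name(uint32_t code)
{
	switch (code & DEMO_SECCOMP_RET_ACTION) {
	case DEMO_SECCOMP_RET_TRAP:
		return "SECCOMP_RET_TRAP";
	case DEMO_SECCOMP_RET_ERRNO:
		return "SECCOMP_RET_ERRNO";
	default:
		return "other";
	}
}

/* Hypothetical helper: extract the 16-bit data part of a return value. */
static uint32_t demo_seccomp_ret_data(uint32_t code)
{
	return code & DEMO_SECCOMP_RET_DATA;
}
```

With these definitions, 0x30000 maps to SECCOMP_RET_TRAP and 0x50000 to
SECCOMP_RET_ERRNO, matching the two audit records.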

The underlying issue that explains this is that the implementation of
populate_seccomp_data() is wrong. Our seccomp data structure sd that
is being shared with user ABI is:

  struct seccomp_data {
    int nr;
    __u32 arch;
    __u64 instruction_pointer;
    __u64 args[6];
  };

Therefore, a simple cast to 'unsigned long *' for storing the value of
the syscall argument via syscall_get_arguments() is just wrong as on
32-bit x86 (or any other 32bit arch), it would result in storing a0-a5
at wrong offsets in args[] member, and thus i) could leak stack memory
to user space and ii) tampers with the logic of seccomp BPF programs
that read out and check for syscall arguments:

  syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);

Tested on 32-bit x86 with Google Chrome, unfortunately only via remote
test machine through slow ssh X forwarding, but it fixes the issue on
my side. So fix it up by storing args in type-correct variables; gcc
is clever and optimizes the copy away in other cases, e.g. x86_64.
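
A user-space sketch of the "type correct variables" pattern described above,
under stated assumptions: demo_ulong32 stands in for unsigned long on 32-bit
x86, demo_get_arguments() is a hypothetical stand-in for
syscall_get_arguments(), and the struct only copies the layout quoted earlier.
It is an illustration of the idea, not the kernel code.

```c
#include <stdint.h>

typedef uint32_t demo_ulong32;	/* assumption: unsigned long on 32-bit x86 */

/* Same field layout as the struct quoted in the changelog. */
struct demo_seccomp_data {
	int32_t  nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Hypothetical stand-in for syscall_get_arguments(): copy n values from i. */
static void demo_get_arguments(const demo_ulong32 *regs, unsigned int i,
			       unsigned int n, demo_ulong32 *out)
{
	unsigned int k;

	for (k = 0; k < n; k++)
		out[k] = regs[i + k];
}

/* Fetch into a correctly typed temporary, then assign into the u64 slots. */
static void demo_populate_args(struct demo_seccomp_data *sd,
			       const demo_ulong32 *regs)
{
	demo_ulong32 args[6];
	unsigned int k;

	demo_get_arguments(regs, 0, 6, args);
	for (k = 0; k < 6; k++)
		sd->args[k] = args[k];	/* conversion zero-extends to 64 bits */
}
```

Because each assignment converts the narrower unsigned value to a 64-bit one,
every byte of args[] is written, whatever the slots held before.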

Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set")
Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Paris <eparis@redhat.com>
Cc: James Morris <james.l.morris@oracle.com>
Cc: Kees Cook <keescook@chromium.org>
---
 Dave, do you want to pick this up?

 kernel/seccomp.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index d8d046c..590c379 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -69,18 +69,17 @@ static void populate_seccomp_data(struct seccomp_data *sd)
 {
 	struct task_struct *task = current;
 	struct pt_regs *regs = task_pt_regs(task);
+	unsigned long args[6];
 
 	sd->nr = syscall_get_nr(task, regs);
 	sd->arch = syscall_get_arch();
-
-	/* Unroll syscall_get_args to help gcc on arm. */
-	syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);
-	syscall_get_arguments(task, regs, 1, 1, (unsigned long *) &sd->args[1]);
-	syscall_get_arguments(task, regs, 2, 1, (unsigned long *) &sd->args[2]);
-	syscall_get_arguments(task, regs, 3, 1, (unsigned long *) &sd->args[3]);
-	syscall_get_arguments(task, regs, 4, 1, (unsigned long *) &sd->args[4]);
-	syscall_get_arguments(task, regs, 5, 1, (unsigned long *) &sd->args[5]);
-
+	syscall_get_arguments(task, regs, 0, 6, args);
+	sd->args[0] = args[0];
+	sd->args[1] = args[1];
+	sd->args[2] = args[2];
+	sd->args[3] = args[3];
+	sd->args[4] = args[4];
+	sd->args[5] = args[5];
 	sd->instruction_pointer = KSTK_EIP(task);
 }
 
-- 
1.7.11.7



* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-14 19:02 [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF Daniel Borkmann
@ 2014-04-14 20:13 ` Andy Lutomirski
  2014-04-14 20:24   ` David Miller
  2014-04-15 22:52 ` Linus Torvalds
  1 sibling, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2014-04-14 20:13 UTC (permalink / raw)
  To: Daniel Borkmann, davem
  Cc: netdev, linux-kernel, torvalds, Alexei Starovoitov, Eric Paris,
	James Morris, Kees Cook

On 04/14/2014 12:02 PM, Daniel Borkmann wrote:
> Linus reports that on 32-bit x86 Chromium throws the following seccomp
> resp. audit log messages:
> 
>   audit: type=1326 audit(1397359304.356:28108): auid=500 uid=500
> gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
> pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0
> syscall=172 compat=0 ip=0xb2dd9852 code=0x30000
> 
>   audit: type=1326 audit(1397359304.356:28109): auid=500 uid=500
> gid=500 ses=2 subj=unconfined_u:unconfined_r:chrome_sandbox_t:s0-s0:c0.c1023
> pid=3677 comm="chrome" exe="/opt/google/chrome/chrome" sig=0 syscall=5
> compat=0 ip=0xb2dd9852 code=0x50000
> 
> These audit messages are triggered via audit_seccomp() through
> __secure_computing() in seccomp filter (BPF) mode, with seccomp return
> codes 0x30000 (== SECCOMP_RET_TRAP) and 0x50000 (== SECCOMP_RET_ERRNO)
> during filter runtime. Moreover, Linus reports that x86_64 Chromium
> seems fine.
> 
> The underlying issue that explains this is that the implementation of
> populate_seccomp_data() is wrong. Our seccomp data structure sd that
> is being shared with user ABI is:
> 
>   struct seccomp_data {
>     int nr;
>     __u32 arch;
>     __u64 instruction_pointer;
>     __u64 args[6];
>   };
> 
> Therefore, a simple cast to 'unsigned long *' for storing the value of
> the syscall argument via syscall_get_arguments() is just wrong as on
> 32-bit x86 (or any other 32bit arch), it would result in storing a0-a5
> at wrong offsets in args[] member, and thus i) could leak stack memory
> to user space and ii) tampers with the logic of seccomp BPF programs
> that read out and check for syscall arguments:
> 
>   syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);

I think this description is wrong.  (unsigned long *) &sd->args[1] is
the right location, at least on 32-bit little-endian architectures.
((unsigned long *) &sd->args)[1] would be wrong, as I think you've
described, but that's not what the code does.
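
The difference between the two expressions can be checked with plain offset
arithmetic. This is a sketch under assumptions: demo_ulong32 models a 32-bit
unsigned long, the struct copies the layout quoted above, and the helper names
are hypothetical. No store is performed; only target byte offsets are
computed.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t demo_ulong32;	/* assumption: 32-bit unsigned long */

/* Same field layout as the struct quoted in the changelog. */
struct demo_seccomp_data {
	int32_t  nr;
	uint32_t arch;
	uint64_t instruction_pointer;
	uint64_t args[6];
};

/* Byte offset targeted by (unsigned long *) &sd->args[i]. */
static size_t demo_offset_cast_of_element(const struct demo_seccomp_data *sd,
					  int i)
{
	return (size_t)((const char *)&sd->args[i] - (const char *)sd);
}

/* Byte offset targeted by ((unsigned long *) &sd->args)[i]. */
static size_t demo_offset_index_after_cast(const struct demo_seccomp_data *sd,
					   int i)
{
	const char *base = (const char *)sd->args;

	return (size_t)((base + (size_t)i * sizeof(demo_ulong32)) -
			(const char *)sd);
}
```

For i == 1 the first form points at byte 24, the start of args[1], while the
second points at byte 20, the middle of args[0].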

I think the real problem is that 32-bit BE is hosed, and on 32-bit LE,
the high bits aren't getting cleared.

I would make this change conditional on BITS_PER_LONG != 64, since this
probably severely pessimizes architectures like ia-64.

> 
> Tested on 32-bit x86 with Google Chrome, unfortunately only via remote
> test machine through slow ssh X forwarding, but it fixes the issue on
> my side. So fix it up by storing args in type-correct variables; gcc
> is clever and optimizes the copy away in other cases, e.g. x86_64.
> 
> Fixes: bd4cf0ed331a ("net: filter: rework/optimize internal BPF interpreter's instruction set")
> Reported-and-bisected-by: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Eric Paris <eparis@redhat.com>
> Cc: James Morris <james.l.morris@oracle.com>
> Cc: Kees Cook <keescook@chromium.org>
> ---
>  Dave, do you want to pick this up?
> 
>  kernel/seccomp.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index d8d046c..590c379 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -69,18 +69,17 @@ static void populate_seccomp_data(struct seccomp_data *sd)
>  {
>  	struct task_struct *task = current;
>  	struct pt_regs *regs = task_pt_regs(task);
> +	unsigned long args[6];
>  
>  	sd->nr = syscall_get_nr(task, regs);
>  	sd->arch = syscall_get_arch();
> -
> -	/* Unroll syscall_get_args to help gcc on arm. */
> -	syscall_get_arguments(task, regs, 0, 1, (unsigned long *) &sd->args[0]);
> -	syscall_get_arguments(task, regs, 1, 1, (unsigned long *) &sd->args[1]);
> -	syscall_get_arguments(task, regs, 2, 1, (unsigned long *) &sd->args[2]);
> -	syscall_get_arguments(task, regs, 3, 1, (unsigned long *) &sd->args[3]);
> -	syscall_get_arguments(task, regs, 4, 1, (unsigned long *) &sd->args[4]);
> -	syscall_get_arguments(task, regs, 5, 1, (unsigned long *) &sd->args[5]);
> -
> +	syscall_get_arguments(task, regs, 0, 6, args);
> +	sd->args[0] = args[0];
> +	sd->args[1] = args[1];
> +	sd->args[2] = args[2];
> +	sd->args[3] = args[3];
> +	sd->args[4] = args[4];
> +	sd->args[5] = args[5];
>  	sd->instruction_pointer = KSTK_EIP(task);
>  }
>  
> 



* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-14 20:13 ` Andy Lutomirski
@ 2014-04-14 20:24   ` David Miller
  2014-04-14 20:28     ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2014-04-14 20:24 UTC (permalink / raw)
  To: luto
  Cc: dborkman, netdev, linux-kernel, torvalds, ast, eparis,
	james.l.morris, keescook

From: Andy Lutomirski <luto@amacapital.net>
Date: Mon, 14 Apr 2014 13:13:45 -0700

> I think this description is wrong.  (unsigned long *) &sd->args[1] is
> the right location, at least on 32-bit little-endian architectures.

It absolutely is not.

The thing is a u64, and we must respect that type in a completely
portable way.

Daniel's change is 100% correct, portable, and doesn't have any
ugly ifdef crap.

If you want to optimize this, and potentially break it again, do
it in the next merge window not now.

I'm going to apply Daniel's patch.


* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-14 20:24   ` David Miller
@ 2014-04-14 20:28     ` Andy Lutomirski
  2014-04-15  6:31       ` Alexei Starovoitov
  0 siblings, 1 reply; 8+ messages in thread
From: Andy Lutomirski @ 2014-04-14 20:28 UTC (permalink / raw)
  To: David Miller
  Cc: dborkman, Network Development, linux-kernel, Linus Torvalds, ast,
	Eric Paris, James Morris, Kees Cook

On Mon, Apr 14, 2014 at 1:24 PM, David Miller <davem@davemloft.net> wrote:
> From: Andy Lutomirski <luto@amacapital.net>
> Date: Mon, 14 Apr 2014 13:13:45 -0700
>
>> I think this description is wrong.  (unsigned long *) &sd->args[1] is
>> the right location, at least on 32-bit little-endian architectures.
>
> It absolutely is not.

Huh?  It's a pointer to the right address, but the type is wrong.

The changelog says "on 32-bit x86 (or any other 32bit arch), it would
result in storing a0-a5 at wrong offsets in args[] member".  Unless
I'm mistaken, this is incorrect: a0-a5 are at the correct offsets,
but they are stored with the wrong type, so the other bits in there
are garbage.
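
A small user-space sketch of that failure mode, with assumptions labeled: a
4-byte memcpy() to the start of the slot models a 32-bit store through the
cast pointer (and avoids aliasing problems in the demo), and the function name
is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/*
 * Model of a 32-bit store through (unsigned long *) &sd->args[i] on a
 * 32-bit arch: only the first four bytes of the 64-bit slot are written.
 */
static void demo_store_through_cast(uint64_t *slot, uint32_t val)
{
	memcpy(slot, &val, sizeof(val));
}
```

If the slot previously held 0xdeadbeefcafef00d and 5 is stored this way, a
little-endian host ends up with 0xdeadbeef00000005 and a big-endian host with
0x00000005cafef00d. Neither equals 5, whereas a plain `*slot = val`
assignment would yield exactly 5.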

>
> The thing is a u64, and we must respect that type in a completely
> portable way.
>
> Daniel's change is 100% correct, portable, and doesn't have any
> ugly ifdef crap.
>

I have no problem with the patch itself.  I'm suggesting that a better
changelog message would confuse other people reading the same patch
less.

--Andy


* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-14 20:28     ` Andy Lutomirski
@ 2014-04-15  6:31       ` Alexei Starovoitov
  2014-04-15 17:46         ` Andy Lutomirski
  0 siblings, 1 reply; 8+ messages in thread
From: Alexei Starovoitov @ 2014-04-15  6:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: David Miller, Daniel Borkmann, Network Development, linux-kernel,
	Linus Torvalds, Eric Paris, James Morris, Kees Cook

On Mon, Apr 14, 2014 at 1:28 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Mon, Apr 14, 2014 at 1:24 PM, David Miller <davem@davemloft.net> wrote:
>> From: Andy Lutomirski <luto@amacapital.net>
>> Date: Mon, 14 Apr 2014 13:13:45 -0700
>>
>>> I think this description is wrong.  (unsigned long *) &sd->args[1] is
>>> the right location, at least on 32-bit little-endian architectures.
>>
>> It absolutely is not.
>
> Huh?  It's a pointer to the right address, but the type is wrong.
>
> The changelog says "on 32-bit x86 (or any other 32bit arch), it would
> result in storing a0-a5 at wrong offsets in args[] member".  Unless
> I'm mistaken, this is incorrect: a0-a5 are at the correct offsets,
> but they are stored with the wrong type, so the other bits in there
> are garbage.

Agreed. Your description above is more accurate than the log.
We were focusing on the bug itself, and the log came out a bit misleading
as a result of multiple iterations back and forth between me and Daniel.

Also, the log says:
"gcc is clever and optimizes the copy away in other cases, e.g. x86_64"
because we actually checked the generated assembly, so the fix doesn't
pessimize 64-bit architectures :)
This function is in the critical path for seccomp, so performance definitely
matters.

>>
>> The thing is a u64, and we must respect that type in a completely
>> portable way.
>>
>> Daniel's change is 100% correct, portable, and doesn't have any
>> ugly ifdef crap.
>>
>
> I have no problem with the patch itself.  I'm suggesting that a better
> changelog message would confuse other people reading the same patch
> less.
>
> --Andy


* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-15  6:31       ` Alexei Starovoitov
@ 2014-04-15 17:46         ` Andy Lutomirski
  0 siblings, 0 replies; 8+ messages in thread
From: Andy Lutomirski @ 2014-04-15 17:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Miller, Daniel Borkmann, Network Development, linux-kernel,
	Linus Torvalds, Eric Paris, James Morris, Kees Cook

On Mon, Apr 14, 2014 at 11:31 PM, Alexei Starovoitov <ast@plumgrid.com> wrote:
> On Mon, Apr 14, 2014 at 1:28 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> On Mon, Apr 14, 2014 at 1:24 PM, David Miller <davem@davemloft.net> wrote:
>>> From: Andy Lutomirski <luto@amacapital.net>
>>> Date: Mon, 14 Apr 2014 13:13:45 -0700
>>>
>>>> I think this description is wrong.  (unsigned long *) &sd->args[1] is
>>>> the right location, at least on 32-bit little-endian architectures.
>>>
>>> It absolutely is not.
>>
>> Huh?  It's a pointer to the right address, but the type is wrong.
>>
>> The changelog says "on 32-bit x86 (or any other 32bit arch), it would
>> result in storing a0-a5 at wrong offsets in args[] member".  Unless
>> I'm mistaken, this is incorrect: a0-a5 are at the correct offsets,
>> but they are stored with the wrong type, so the other bits in there
>> are garbage.
>
> Agreed. Your description above is more accurate than the log.
> We were focusing on the bug itself, and the log came out a bit misleading
> as a result of multiple iterations back and forth between me and Daniel.
>
> Also, the log says:
> "gcc is clever and optimizes the copy away in other cases, e.g. x86_64"
> because we actually checked the generated assembly, so the fix doesn't
> pessimize 64-bit architectures :)
> This function is in the critical path for seccomp, so performance definitely
> matters.

Yeah, I'm not entirely sure what I was thinking when I wrote that
part.  The new code should actually be much better than the old code
for weird architectures like ia-64.

For reference, ia-64 uses the unwinder (!) to look up arguments, so
the fewer times it gets invoked, the better.

--Andy


* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-14 19:02 [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF Daniel Borkmann
  2014-04-14 20:13 ` Andy Lutomirski
@ 2014-04-15 22:52 ` Linus Torvalds
  2014-04-15 23:17   ` David Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2014-04-15 22:52 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, Network Development, Linux Kernel Mailing List,
	Alexei Starovoitov, Eric Paris, James Morris, Kees Cook

[ Sorry for delayed testing, I just came back home and didn't have
access to the affected 32-bit machine on the road ]

On Mon, Apr 14, 2014 at 12:02 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
> Linus reports that on 32-bit x86 Chromium throws the following seccomp
> resp. audit log messages:

Tested, and fixes the problem for me.

Thanks,

              Linus


* Re: [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF
  2014-04-15 22:52 ` Linus Torvalds
@ 2014-04-15 23:17   ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2014-04-15 23:17 UTC (permalink / raw)
  To: torvalds
  Cc: dborkman, netdev, linux-kernel, ast, eparis, james.l.morris, keescook

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Tue, 15 Apr 2014 15:52:42 -0700

> [ Sorry for delayed testing, I just came back home and didn't have
> access to the affected 32-bit machine on the road ]
> 
> On Mon, Apr 14, 2014 at 12:02 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
>> Linus reports that on 32-bit x86 Chromium throws the following seccomp
>> resp. audit log messages:
> 
> Tested, and fixes the problem for me.

I'll push this fix to you later this evening.


Thread overview: 8+ messages
2014-04-14 19:02 [PATCH] seccomp: fix populating a0-a5 syscall args in 32-bit x86 BPF Daniel Borkmann
2014-04-14 20:13 ` Andy Lutomirski
2014-04-14 20:24   ` David Miller
2014-04-14 20:28     ` Andy Lutomirski
2014-04-15  6:31       ` Alexei Starovoitov
2014-04-15 17:46         ` Andy Lutomirski
2014-04-15 22:52 ` Linus Torvalds
2014-04-15 23:17   ` David Miller
