LKML Archive on lore.kernel.org
 help / color / Atom feed
* ptrace: seccomp: Return value when the call was already invalid
@ 2020-05-23  1:01 Keno Fischer
  2020-07-03  8:39 ` Will Deacon
  0 siblings, 1 reply; 13+ messages in thread
From: Keno Fischer @ 2020-05-23  1:01 UTC (permalink / raw)
  To: Linux Kernel Mailing List
  Cc: Oleg Nesterov, Will Deacon, Kees Cook, Andy Lutomirski, Will Drewry

I'm seeing the following while porting a ptracer from
x86_64 to arm64 (cc'ing arm64 folks, but in this case
x86_64 is the odd one out, I think other archs would
be consistent with arm64).

Consider userspace code like the following:
```
int ret = syscall(-10, 0);
assert(ret == -ENOSYS);
```

(Never mind the fact that this is something userspace
shouldn't do, I saw this in our test suite that tests
corner cases where the ptracer shouldn't affect behavior).

Now, if we have a seccomp filter that simply does
SECCOMP_RET_TRACE, and a ptracer that simply
does PTRACE_CONT, then the assert will fire/fail on arm64,
but not on x86_64.

The reason this happens is that the return value gets set
early on x86_64, but this is not possible on arm64,
because doing so would clobber the first argument
register that it shares. As a result, no return value is
set and `ret` retains the value that the first syscall
argument used to have.

I can work around this of course, but I guess my
question is whether this is expected/ok,
or you would expect an active ptracer that does not
touch the registers not to affect behavior.

Interestingly, arm64 does do something different
if the syscall is -1 rather than -10, where early
in the ptrace stop it does.
```
/* set default errno for user-issued syscall(-1) */
if (scno == NO_SYSCALL)
    regs->regs[0] = -ENOSYS;
```

I'm not sure that's great either since the ptracer
may want to inspect x0 and arm64 does not
make orig_x0 available via ptrace. To me
this indicates that maybe this was intended
to apply to any syscall skipped here, not
just -1 (the different comes from the fact
that seccomp considers any negative
syscall a skip/fail, but on syscall-entry
stops arm64 only considers a literal -1
a skip).

On the other hand if this is deemed expected,
I'll go ahead and submit a man-page patch to at
least document this architecture difference.

Keno

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-05-23  1:01 ptrace: seccomp: Return value when the call was already invalid Keno Fischer
@ 2020-07-03  8:39 ` Will Deacon
  2020-07-03 15:17   ` Kees Cook
  2020-07-03 20:27   ` Keno Fischer
  0 siblings, 2 replies; 13+ messages in thread
From: Will Deacon @ 2020-07-03  8:39 UTC (permalink / raw)
  To: Keno Fischer
  Cc: Linux Kernel Mailing List, Oleg Nesterov, Kees Cook,
	Andy Lutomirski, Will Drewry

Hi Keno,

On Fri, May 22, 2020 at 09:01:01PM -0400, Keno Fischer wrote:
> I'm seeing the following while porting a ptracer from
> x86_64 to arm64 (cc'ing arm64 folks, but in this case
> x86_64 is the odd one out, I think other archs would
> be consistent with arm64).
> 
> Consider userspace code like the following:
> ```
> int ret = syscall(-10, 0);
> assert(ret == -ENOSYS);
> ```
> 
> (Never mind the fact that this is something userspace
> shouldn't do, I saw this in our test suite that tests
> corner cases where the ptracer shouldn't affect behavior).
> 
> Now, if we have a seccomp filter that simply does
> SECCOMP_RET_TRACE, and a ptracer that simply
> does PTRACE_CONT

Ok, so this means that we're _skipping_ the system call, right?

> then the assert will fire/fail on arm64, but not on x86_64.

It feels weird to me that skipping the system call has any effect on the
tracee registers...

> Interestingly, arm64 does do something different
> if the syscall is -1 rather than -10, where early
> in the ptrace stop it does.
> ```
> /* set default errno for user-issued syscall(-1) */
> if (scno == NO_SYSCALL)
>     regs->regs[0] = -ENOSYS;

... so I think this should be fixed too. How about the diff below?

Will

--->8

diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..cb3f653c9688 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1833,12 +1833,12 @@ int syscall_trace_enter(struct pt_regs *regs)
 	if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
 		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
 		if (!in_syscall(regs) || (flags & _TIF_SYSCALL_EMU))
-			return -1;
+			return -ENOSYS;
 	}
 
 	/* Do the secure computing after ptrace; failures should be fast. */
 	if (secure_computing() == -1)
-		return -1;
+		return -ENOSYS;
 
 	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
 		trace_sys_enter(regs, regs->syscallno);
@@ -1846,7 +1846,7 @@ int syscall_trace_enter(struct pt_regs *regs)
 	audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1],
 			    regs->regs[2], regs->regs[3]);
 
-	return regs->syscallno;
+	return 0;
 }
 
 void syscall_trace_exit(struct pt_regs *regs)
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..a13661f44818 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
 	user_exit();
 
 	if (has_syscall_work(flags)) {
-		/* set default errno for user-issued syscall(-1) */
-		if (scno == NO_SYSCALL)
-			regs->regs[0] = -ENOSYS;
-		scno = syscall_trace_enter(regs);
-		if (scno == NO_SYSCALL)
+		if (syscall_trace_enter(regs))
 			goto trace_exit;
+
+		scno = regs->syscallno;
 	}
 
 	invoke_syscall(regs, scno, sc_nr, syscall_table);

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03  8:39 ` Will Deacon
@ 2020-07-03 15:17   ` Kees Cook
  2020-07-03 15:44     ` Will Deacon
  2020-07-03 20:27   ` Keno Fischer
  1 sibling, 1 reply; 13+ messages in thread
From: Kees Cook @ 2020-07-03 15:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> Hi Keno,
> 
> On Fri, May 22, 2020 at 09:01:01PM -0400, Keno Fischer wrote:
> > I'm seeing the following while porting a ptracer from
> > x86_64 to arm64 (cc'ing arm64 folks, but in this case
> > x86_64 is the odd one out, I think other archs would
> > be consistent with arm64).
> > 
> > Consider userspace code like the following:
> > ```
> > int ret = syscall(-10, 0);
> > assert(ret == -ENOSYS);
> > ```
> > 
> > (Never mind the fact that this is something userspace
> > shouldn't do, I saw this in our test suite that tests
> > corner cases where the ptracer shouldn't affect behavior).
> > 
> > Now, if we have a seccomp filter that simply does
> > SECCOMP_RET_TRACE, and a ptracer that simply
> > does PTRACE_CONT
> 
> Ok, so this means that we're _skipping_ the system call, right?
> 
> > then the assert will fire/fail on arm64, but not on x86_64.
> 
> It feels weird to me that skipping the system call has any effect on the
> tracee registers...
> 
> > Interestingly, arm64 does do something different
> > if the syscall is -1 rather than -10, where early
> > in the ptrace stop it does.
> > ```
> > /* set default errno for user-issued syscall(-1) */
> > if (scno == NO_SYSCALL)
> >     regs->regs[0] = -ENOSYS;
> 
> ... so I think this should be fixed too. How about the diff below?
> 
> Will
> 
> --->8
> 
> diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
> index 68b7f34a08f5..cb3f653c9688 100644
> --- a/arch/arm64/kernel/ptrace.c
> +++ b/arch/arm64/kernel/ptrace.c
> @@ -1833,12 +1833,12 @@ int syscall_trace_enter(struct pt_regs *regs)
>  	if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
>  		tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
>  		if (!in_syscall(regs) || (flags & _TIF_SYSCALL_EMU))
> -			return -1;
> +			return -ENOSYS;
>  	}
>  
>  	/* Do the secure computing after ptrace; failures should be fast. */
>  	if (secure_computing() == -1)
> -		return -1;
> +		return -ENOSYS;
>  
>  	if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
>  		trace_sys_enter(regs, regs->syscallno);
> @@ -1846,7 +1846,7 @@ int syscall_trace_enter(struct pt_regs *regs)
>  	audit_syscall_entry(regs->syscallno, regs->orig_x0, regs->regs[1],
>  			    regs->regs[2], regs->regs[3]);
>  
> -	return regs->syscallno;
> +	return 0;
>  }
>  
>  void syscall_trace_exit(struct pt_regs *regs)
> diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> index 5f5b868292f5..a13661f44818 100644
> --- a/arch/arm64/kernel/syscall.c
> +++ b/arch/arm64/kernel/syscall.c
> @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>  	user_exit();
>  
>  	if (has_syscall_work(flags)) {
> -		/* set default errno for user-issued syscall(-1) */
> -		if (scno == NO_SYSCALL)
> -			regs->regs[0] = -ENOSYS;
> -		scno = syscall_trace_enter(regs);
> -		if (scno == NO_SYSCALL)
> +		if (syscall_trace_enter(regs))
>  			goto trace_exit;
> +
> +		scno = regs->syscallno;
>  	}
>  
>  	invoke_syscall(regs, scno, sc_nr, syscall_table);

What effect do either of these patches have on the existing seccomp
selftests: tools/testing/selftests/seccomp/seccomp_bpf ?

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03 15:17   ` Kees Cook
@ 2020-07-03 15:44     ` Will Deacon
  2020-07-03 15:52       ` Kees Cook
  0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2020-07-03 15:44 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > index 5f5b868292f5..a13661f44818 100644
> > --- a/arch/arm64/kernel/syscall.c
> > +++ b/arch/arm64/kernel/syscall.c
> > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> >  	user_exit();
> >  
> >  	if (has_syscall_work(flags)) {
> > -		/* set default errno for user-issued syscall(-1) */
> > -		if (scno == NO_SYSCALL)
> > -			regs->regs[0] = -ENOSYS;
> > -		scno = syscall_trace_enter(regs);
> > -		if (scno == NO_SYSCALL)
> > +		if (syscall_trace_enter(regs))
> >  			goto trace_exit;
> > +
> > +		scno = regs->syscallno;
> >  	}
> >  
> >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> 
> What effect do either of these patches have on the existing seccomp
> selftests: tools/testing/selftests/seccomp/seccomp_bpf ?

Tests! Thanks, I'll have a look.

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03 15:44     ` Will Deacon
@ 2020-07-03 15:52       ` Kees Cook
  2020-07-04 12:33         ` Will Deacon
  0 siblings, 1 reply; 13+ messages in thread
From: Kees Cook @ 2020-07-03 15:52 UTC (permalink / raw)
  To: Will Deacon
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Fri, Jul 03, 2020 at 04:44:27PM +0100, Will Deacon wrote:
> On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> > On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > index 5f5b868292f5..a13661f44818 100644
> > > --- a/arch/arm64/kernel/syscall.c
> > > +++ b/arch/arm64/kernel/syscall.c
> > > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > >  	user_exit();
> > >  
> > >  	if (has_syscall_work(flags)) {
> > > -		/* set default errno for user-issued syscall(-1) */
> > > -		if (scno == NO_SYSCALL)
> > > -			regs->regs[0] = -ENOSYS;
> > > -		scno = syscall_trace_enter(regs);
> > > -		if (scno == NO_SYSCALL)
> > > +		if (syscall_trace_enter(regs))
> > >  			goto trace_exit;
> > > +
> > > +		scno = regs->syscallno;
> > >  	}
> > >  
> > >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> > 
> > What effect do either of these patches have on the existing seccomp
> > selftests: tools/testing/selftests/seccomp/seccomp_bpf ?
> 
> Tests! Thanks, I'll have a look.

Thanks!

(And either way, that this behavioral difference went unnoticed means we
need to add a test to the selftests for this patch.)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03  8:39 ` Will Deacon
  2020-07-03 15:17   ` Kees Cook
@ 2020-07-03 20:27   ` Keno Fischer
  2020-07-04 12:50     ` Will Deacon
  1 sibling, 1 reply; 13+ messages in thread
From: Keno Fischer @ 2020-07-03 20:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: Linux Kernel Mailing List, Oleg Nesterov, Kees Cook,
	Andy Lutomirski, Will Drewry

> > Now, if we have a seccomp filter that simply does
> > SECCOMP_RET_TRACE, and a ptracer that simply
> > does PTRACE_CONT
>
> Ok, so this means that we're _skipping_ the system call, right?

If the system call were positive this would result in the system call
being executed. The notion of "skipping" the syscall is a bit odd in
this situation. Having the ptracer set the syscallno to -1 is generally
accepted as the way to do it, but what happens if the syscallno is
already -1 or negative is underspecified.

> > then the assert will fire/fail on arm64, but not on x86_64.
>
> It feels weird to me that skipping the system call has any effect on the
> tracee registers...

I think the correct way to frame it is to ask whether the behavior
matches that of the tracee in absence of the ptracer. I would argue
that if the ptracer doesn't explicitly modify register contents, then
the tracee shouldn't observe any behavior difference.

> > Interestingly, arm64 does do something different
> > if the syscall is -1 rather than -10, where early
> > in the ptrace stop it does.
> > ```
> > /* set default errno for user-issued syscall(-1) */
> > if (scno == NO_SYSCALL)
> >     regs->regs[0] = -ENOSYS;
>
> ... so I think this should be fixed too. How about the diff below?

I think the patch behavior is better overall, but I'm not sure it's ideal.
I think the biggest question is what the behavior should be here and
if we want a behavioral difference between *the syscall was -1 at entry*
and *the syscall was -1 because the ptracer wanted to skip the syscall*.
I think there is a bit of a semantic disconnect because "skipping" the
syscall is not really an operation that the ptracer has at its disposal
(unless it's using SYSEMU of course). The only thing it can do is set
the syscall to -1. However, arguably that already has semantics
(of returning -ENOSYS), so it's not at all clear to me that we should
deviate from that. Unfortunately, none of this is currently consistent
across architectures, so I think before we go changing arm64, we
should decide what we'd like to happen in theory and then see
what we can do to improve the situation without being too breaking.

Keno

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03 15:52       ` Kees Cook
@ 2020-07-04 12:33         ` Will Deacon
  2020-07-05  4:56           ` Kees Cook
  0 siblings, 1 reply; 13+ messages in thread
From: Will Deacon @ 2020-07-04 12:33 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Fri, Jul 03, 2020 at 08:52:05AM -0700, Kees Cook wrote:
> On Fri, Jul 03, 2020 at 04:44:27PM +0100, Will Deacon wrote:
> > On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> > > On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > > index 5f5b868292f5..a13661f44818 100644
> > > > --- a/arch/arm64/kernel/syscall.c
> > > > +++ b/arch/arm64/kernel/syscall.c
> > > > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > > >  	user_exit();
> > > >  
> > > >  	if (has_syscall_work(flags)) {
> > > > -		/* set default errno for user-issued syscall(-1) */
> > > > -		if (scno == NO_SYSCALL)
> > > > -			regs->regs[0] = -ENOSYS;
> > > > -		scno = syscall_trace_enter(regs);
> > > > -		if (scno == NO_SYSCALL)
> > > > +		if (syscall_trace_enter(regs))
> > > >  			goto trace_exit;
> > > > +
> > > > +		scno = regs->syscallno;
> > > >  	}
> > > >  
> > > >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> > > 
> > > What effect do either of these patches have on the existing seccomp
> > > selftests: tools/testing/selftests/seccomp/seccomp_bpf ?
> > 
> > Tests! Thanks, I'll have a look.
> 
> Thanks!
> 
> (And either way, that this behavioral difference went unnoticed means we
> need to add a test to the selftests for this patch.)

Unsurprisingly, I don't think the tests go near this. I get 75/77 passes
on arm64 defconfig with or without these changes.

We could add a test, but then we'd have to agree on what it's supposed to
be doing ;)

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-03 20:27   ` Keno Fischer
@ 2020-07-04 12:50     ` Will Deacon
  0 siblings, 0 replies; 13+ messages in thread
From: Will Deacon @ 2020-07-04 12:50 UTC (permalink / raw)
  To: Keno Fischer
  Cc: Linux Kernel Mailing List, Oleg Nesterov, Kees Cook,
	Andy Lutomirski, Will Drewry

On Fri, Jul 03, 2020 at 04:27:37PM -0400, Keno Fischer wrote:
> > > Now, if we have a seccomp filter that simply does
> > > SECCOMP_RET_TRACE, and a ptracer that simply
> > > does PTRACE_CONT
> >
> > Ok, so this means that we're _skipping_ the system call, right?
> 
> If the system call were positive this would result in the system call
> being executed. The notion of "skipping" the syscall is a bit odd in
> this situation. Having the ptracer set the syscallno to -1 is generally
> accepted as the way to do it, but what happens if the syscallno is
> already -1 or negative is underspecified.

Ok. I think it would be sensible for us to have the same behaviour for
all negative system calls though.

> > > then the assert will fire/fail on arm64, but not on x86_64.
> >
> > It feels weird to me that skipping the system call has any effect on the
> > tracee registers...
> 
> I think the correct way to frame it is to ask whether the behavior
> matches that of the tracee in absence of the ptracer. I would argue
> that if the ptracer doesn't explicitly modify register contents, then
> the tracee shouldn't observe any behavior difference.

That's a useful way of thinking about it and is still the case after this
patch. The difference now is that x0 isn't zapped in the case where a
syscall(-1) is skipped.

> > > Interestingly, arm64 does do something different
> > > if the syscall is -1 rather than -10, where early
> > > in the ptrace stop it does.
> > > ```
> > > /* set default errno for user-issued syscall(-1) */
> > > if (scno == NO_SYSCALL)
> > >     regs->regs[0] = -ENOSYS;
> >
> > ... so I think this should be fixed too. How about the diff below?
> 
> I think the patch behavior is better overall, but I'm not sure it's ideal.
> I think the biggest question is what the behavior should be here and
> if we want a behavioral difference between *the syscall was -1 at entry*
> and *the syscall was -1 because the ptracer wanted to skip the syscall*.
> I think there is a bit of a semantic disconnect because "skipping" the
> syscall is not really an operation that the ptracer has at its disposal
> (unless it's using SYSEMU of course). The only thing it can do is set
> the syscall to -1. However, arguably that already has semantics
> (of returning -ENOSYS), so it's not at all clear to me that we should
> deviate from that. Unfortunately, none of this is currently consistent
> across architectures, so I think before we go changing arm64, we
> should decide what we'd like to happen in theory and then see
> what we can do to improve the situation without being too breaking.

I just object to treating a user-issued -1 differently to any other negative
system call. With this patch, they're all treated the same, which is to say
that they return -ENOSYS normally, but when skipped by a tracer (e.g. by
setting the syscall number to -1 or because of a seccomp failure), then
x0 is preserved.

This means that, with this patch, skipping syscall(-1) behaves the same
way as skipping syscall(-2) with mainline today.

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-04 12:33         ` Will Deacon
@ 2020-07-05  4:56           ` Kees Cook
  2020-07-06  8:15             ` Will Deacon
  2020-07-10 12:42             ` Will Deacon
  0 siblings, 2 replies; 13+ messages in thread
From: Kees Cook @ 2020-07-05  4:56 UTC (permalink / raw)
  To: Will Deacon
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Sat, Jul 04, 2020 at 01:33:56PM +0100, Will Deacon wrote:
> On Fri, Jul 03, 2020 at 08:52:05AM -0700, Kees Cook wrote:
> > On Fri, Jul 03, 2020 at 04:44:27PM +0100, Will Deacon wrote:
> > > On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> > > > On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > > > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > > > index 5f5b868292f5..a13661f44818 100644
> > > > > --- a/arch/arm64/kernel/syscall.c
> > > > > +++ b/arch/arm64/kernel/syscall.c
> > > > > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > > > >  	user_exit();
> > > > >  
> > > > >  	if (has_syscall_work(flags)) {
> > > > > -		/* set default errno for user-issued syscall(-1) */
> > > > > -		if (scno == NO_SYSCALL)
> > > > > -			regs->regs[0] = -ENOSYS;
> > > > > -		scno = syscall_trace_enter(regs);
> > > > > -		if (scno == NO_SYSCALL)
> > > > > +		if (syscall_trace_enter(regs))
> > > > >  			goto trace_exit;
> > > > > +
> > > > > +		scno = regs->syscallno;
> > > > >  	}
> > > > >  
> > > > >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> > > > 
> > > > What effect do either of these patches have on the existing seccomp
> > > > selftests: tools/testing/selftests/seccomp/seccomp_bpf ?
> > > 
> > > Tests! Thanks, I'll have a look.
> > 
> > Thanks!
> > 
> > (And either way, that this behavioral difference went unnoticed means we
> > need to add a test to the selftests for this patch.)
> 
> Unsurprisingly, I don't think the tests go near this. I get 75/77 passes
> on arm64 defconfig with or without these changes.

(What doesn't pass for you? I tried to go find kernelci.org test output,
but it doesn't appear to actually run selftests yet?)

Anyway, good that the test output doesn't change, bad that seccomp has
missed a corner of this architecture interface. (i.e. the entire
TRACE_syscall fixture is dedicated to exercising the changing/skipping
interface, but I see now that it doesn't at all exercise any area of
ENOSYS results.)

> We could add a test, but then we'd have to agree on what it's supposed to
> be doing ;)

Well, if you look at change_syscall() in seccomp_bpf.c (once you stop
screaming) you'll likely share my desire to have more things that are
common across architectures. ;) So, to that end, yes, please, let's
define what we'd like to see, and then build out the (likely wildly
different per-architecture expectations). If I read this thread
correctly, we need to test:

	syscall(-1), direct,  returns ENOSYS
	syscall(-10), direct, returns ENOSYS
	syscall(-1), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
	syscall(-10), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
	syscall(-1), ptrace+PTRACE_SYSCALL, returns ENOSYS
	syscall(-10), ptrace+PTRACE_SYSCALL, returns ENOSYS

do we need to double-check that registers before/after are otherwise
unchanged too? (I *think* just looking at syscall return should be
sufficient to catch the visible results.)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-05  4:56           ` Kees Cook
@ 2020-07-06  8:15             ` Will Deacon
  2020-07-06 21:40               ` Kees Cook
  2020-07-10 12:42             ` Will Deacon
  1 sibling, 1 reply; 13+ messages in thread
From: Will Deacon @ 2020-07-06  8:15 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Sat, Jul 04, 2020 at 09:56:50PM -0700, Kees Cook wrote:
> On Sat, Jul 04, 2020 at 01:33:56PM +0100, Will Deacon wrote:
> > On Fri, Jul 03, 2020 at 08:52:05AM -0700, Kees Cook wrote:
> > > On Fri, Jul 03, 2020 at 04:44:27PM +0100, Will Deacon wrote:
> > > > On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> > > > > On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > > > > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > > > > index 5f5b868292f5..a13661f44818 100644
> > > > > > --- a/arch/arm64/kernel/syscall.c
> > > > > > +++ b/arch/arm64/kernel/syscall.c
> > > > > > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > > > > >  	user_exit();
> > > > > >  
> > > > > >  	if (has_syscall_work(flags)) {
> > > > > > -		/* set default errno for user-issued syscall(-1) */
> > > > > > -		if (scno == NO_SYSCALL)
> > > > > > -			regs->regs[0] = -ENOSYS;
> > > > > > -		scno = syscall_trace_enter(regs);
> > > > > > -		if (scno == NO_SYSCALL)
> > > > > > +		if (syscall_trace_enter(regs))
> > > > > >  			goto trace_exit;
> > > > > > +
> > > > > > +		scno = regs->syscallno;
> > > > > >  	}
> > > > > >  
> > > > > >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> > > > > 
> > > > > What effect do either of these patches have on the existing seccomp
> > > > > selftests: tools/testing/selftests/seccomp/seccomp_bpf ?
> > > > 
> > > > Tests! Thanks, I'll have a look.
> > > 
> > > Thanks!
> > > 
> > > (And either way, that this behavioral difference went unnoticed means we
> > > need to add a test to the selftests for this patch.)
> > 
> > Unsurprisingly, I don't think the tests go near this. I get 75/77 passes
> > on arm64 defconfig with or without these changes.
> 
> (What doesn't pass for you? I tried to go find kernelci.org test output,
> but it doesn't appear to actually run selftests yet?)
> 
> Anyway, good that the test output doesn't change, bad that seccomp has
> missed a corner of this architecture interface. (i.e. the entire
> TRACE_syscall fixture is dedicated to exercising the changing/skipping
> interface, but I see now that it doesn't at all exercise any area of
> ENOSYS results.)
> 
> > We could add a test, but then we'd have to agree on what it's supposed to
> > be doing ;)
> 
> Well, if you look at change_syscall() in seccomp_bpf.c (once you stop
> screaming) you'll likely share my desire to have more things that are
> common across architectures. ;) So, to that end, yes, please, let's
> define what we'd like to see, and then build out the (likely wildly
> different per-architecture expectations). If I read this thread
> correctly, we need to test:
> 
> 	syscall(-1), direct,  returns ENOSYS
> 	syscall(-10), direct, returns ENOSYS
> 	syscall(-1), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
> 	syscall(-10), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
> 	syscall(-1), ptrace+PTRACE_SYSCALL, returns ENOSYS
> 	syscall(-10), ptrace+PTRACE_SYSCALL, returns ENOSYS
> 
> do we need to double-check that registers before/after are otherwise
> unchanged too? (I *think* just looking at syscall return should be
> sufficient to catch the visible results.)

There's also the case where the tracer sets the system call to -1 to skip
it.

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-06  8:15             ` Will Deacon
@ 2020-07-06 21:40               ` Kees Cook
  0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2020-07-06 21:40 UTC (permalink / raw)
  To: Will Deacon
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Mon, Jul 06, 2020 at 09:15:51AM +0100, Will Deacon wrote:
> On Sat, Jul 04, 2020 at 09:56:50PM -0700, Kees Cook wrote:
> > different per-architecture expectations). If I read this thread
> > correctly, we need to test:
> > 
> > 	syscall(-1), direct,  returns ENOSYS
> > 	syscall(-10), direct, returns ENOSYS
> > 	syscall(-1), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
> > 	syscall(-10), SECCOMP_RET_TRACE+PTRACE_CONT, returns ENOSYS
> > 	syscall(-1), ptrace+PTRACE_SYSCALL, returns ENOSYS
> > 	syscall(-10), ptrace+PTRACE_SYSCALL, returns ENOSYS
> > 
> > do we need to double-check that registers before/after are otherwise
> > unchanged too? (I *think* just looking at syscall return should be
> > sufficient to catch the visible results.)
> 
> There's also the case where the tracer sets the system call to -1 to skip
> it.

Yes, though that's already part of the seccomp selftests. (Specifically
TRACE_syscall's syscall_faked.)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-05  4:56           ` Kees Cook
  2020-07-06  8:15             ` Will Deacon
@ 2020-07-10 12:42             ` Will Deacon
  2020-07-10 16:14               ` Kees Cook
  1 sibling, 1 reply; 13+ messages in thread
From: Will Deacon @ 2020-07-10 12:42 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Sat, Jul 04, 2020 at 09:56:50PM -0700, Kees Cook wrote:
> On Sat, Jul 04, 2020 at 01:33:56PM +0100, Will Deacon wrote:
> > On Fri, Jul 03, 2020 at 08:52:05AM -0700, Kees Cook wrote:
> > > On Fri, Jul 03, 2020 at 04:44:27PM +0100, Will Deacon wrote:
> > > > On Fri, Jul 03, 2020 at 08:17:19AM -0700, Kees Cook wrote:
> > > > > On Fri, Jul 03, 2020 at 09:39:14AM +0100, Will Deacon wrote:
> > > > > > diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
> > > > > > index 5f5b868292f5..a13661f44818 100644
> > > > > > --- a/arch/arm64/kernel/syscall.c
> > > > > > +++ b/arch/arm64/kernel/syscall.c
> > > > > > @@ -121,12 +121,10 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
> > > > > >  	user_exit();
> > > > > >  
> > > > > >  	if (has_syscall_work(flags)) {
> > > > > > -		/* set default errno for user-issued syscall(-1) */
> > > > > > -		if (scno == NO_SYSCALL)
> > > > > > -			regs->regs[0] = -ENOSYS;
> > > > > > -		scno = syscall_trace_enter(regs);
> > > > > > -		if (scno == NO_SYSCALL)
> > > > > > +		if (syscall_trace_enter(regs))
> > > > > >  			goto trace_exit;
> > > > > > +
> > > > > > +		scno = regs->syscallno;
> > > > > >  	}
> > > > > >  
> > > > > >  	invoke_syscall(regs, scno, sc_nr, syscall_table);
> > > > > 
> > > > > What effect do either of these patches have on the existing seccomp
> > > > > selftests: tools/testing/selftests/seccomp/seccomp_bpf ?
> > > > 
> > > > Tests! Thanks, I'll have a look.
> > > 
> > > Thanks!
> > > 
> > > (And either way, that this behavioral difference went unnoticed means we
> > > need to add a test to the selftests for this patch.)
> > 
> > Unsurprisingly, I don't think the tests go near this. I get 75/77 passes
> > on arm64 defconfig with or without these changes.
> 
> (What doesn't pass for you? I tried to go find kernelci.org test output,
> but it doesn't appear to actually run selftests yet?)

Sorry, realised I forgot to reply to this point. I was seeing assertion
failures in 'global.user_notification_with_tsync' and
'user_notification_sibling_pid_ns'. I started looking into the first one,
saw an -EACCESS kicking around, re-ran the tests as root and now they all
pass.

Are they expected to pass as a normal user?

Will

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ptrace: seccomp: Return value when the call was already invalid
  2020-07-10 12:42             ` Will Deacon
@ 2020-07-10 16:14               ` Kees Cook
  0 siblings, 0 replies; 13+ messages in thread
From: Kees Cook @ 2020-07-10 16:14 UTC (permalink / raw)
  To: Will Deacon
  Cc: Keno Fischer, Linux Kernel Mailing List, Oleg Nesterov,
	Andy Lutomirski, Will Drewry

On Fri, Jul 10, 2020 at 01:42:54PM +0100, Will Deacon wrote:
> On Sat, Jul 04, 2020 at 09:56:50PM -0700, Kees Cook wrote:
> > (What doesn't pass for you? I tried to go find kernelci.org test output,
> > but it doesn't appear to actually run selftests yet?)
> 
> Sorry, realised I forgot to reply to this point. I was seeing assertion
> failures in 'global.user_notification_with_tsync' and
> 'user_notification_sibling_pid_ns'. I started looking into the first one,
> saw an -EACCESS kicking around, re-ran the tests as root and now they all
> pass.
> 
> Are they expected to pass as a normal user?

Oh right, I still have that on my TODO list. Right now the tests are a
mix of root and normal, but since there are some root tests, it needs to
be run as root. I've been meaning to do the appropriate permission tests
and issue SKIPs for the ones needing root..

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, back to index

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-23  1:01 ptrace: seccomp: Return value when the call was already invalid Keno Fischer
2020-07-03  8:39 ` Will Deacon
2020-07-03 15:17   ` Kees Cook
2020-07-03 15:44     ` Will Deacon
2020-07-03 15:52       ` Kees Cook
2020-07-04 12:33         ` Will Deacon
2020-07-05  4:56           ` Kees Cook
2020-07-06  8:15             ` Will Deacon
2020-07-06 21:40               ` Kees Cook
2020-07-10 12:42             ` Will Deacon
2020-07-10 16:14               ` Kees Cook
2020-07-03 20:27   ` Keno Fischer
2020-07-04 12:50     ` Will Deacon

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git