[1/1] seccomp: Always "goto wait" if the list is empty
diff mbox series

Message ID 20210413160151.3301-2-rodrigo@kinvolk.io
State New, archived
Headers show
Series
  • seccomp: Erroneous return on interrupted addfd ioctl()
Related show

Commit Message

Rodrigo Campos April 13, 2021, 4:01 p.m. UTC
It is possible for the thread with the seccomp filter attached (target)
to be waken up by an addfd message, but the list be empty. This happens
when the addfd ioctl on the other side (seccomp agent) is interrupted by
a signal such as SIGURG. In that case, the target erroneously and
prematurely returns from the syscall to userspace even though the
seccomp agent didn't ask for it.

This happens in the following scenario:

seccomp_notify_addfd()                                           | seccomp_do_user_notification()
                                                                 |
                                                                 |  err = wait_for_completion_interruptible(&n.ready);
 complete(&knotif->ready);                                       |
 ret = wait_for_completion_interruptible(&kaddfd.completion);    |
 // interrupted                                                  |
                                                                 |
 mutex_lock(&filter->notify_lock);                               |
 list_del(&kaddfd.list);                                         |
 mutex_unlock(&filter->notify_lock);                             |
                                                                 |  mutex_lock(&match->notify_lock);
                                                                 |  // This is false, addfd is false
                                                                 |  if (addfd && n.state != SECCOMP_NOTIFY_REPLIED)
                                                                 |
                                                                 |  ret = n.val;
                                                                 |  err = n.error;
                                                                 |  flags = n.flags;

So, the process blocked in seccomp_do_user_notification() will see a
response. As n is 0 initialized and wasn't set, it will see a 0 as
return value from the syscall.

The seccomp agent, when retrying the interrupted syscall, will see an
ENOENT error as the notification no longer exists (it was already
answered by this bug).

This patch fixes the issue by splitting the if in two parts: if we were
woken up and the state is not replied, we will always do a "goto wait".
And if that happens and there is an addfd element on the list, we will
add the fd before "goto wait".

This issue is present since 5.9, when addfd was added.

Fixes: 7cf97b1254550
Cc: stable@vger.kernel.org # 5.9+
Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
---
 kernel/seccomp.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

Comments

Christian Brauner April 13, 2021, 5:53 p.m. UTC | #1
On Tue, Apr 13, 2021 at 06:01:51PM +0200, Rodrigo Campos wrote:
> It is possible for the thread with the seccomp filter attached (target)
> to be waken up by an addfd message, but the list be empty. This happens
> when the addfd ioctl on the other side (seccomp agent) is interrupted by
> a signal such as SIGURG. In that case, the target erroneously and
> prematurely returns from the syscall to userspace even though the
> seccomp agent didn't ask for it.
> 
> This happens in the following scenario:
> 
> seccomp_notify_addfd()                                           | seccomp_do_user_notification()
>                                                                  |
>                                                                  |  err = wait_for_completion_interruptible(&n.ready);
>  complete(&knotif->ready);                                       |
>  ret = wait_for_completion_interruptible(&kaddfd.completion);    |
>  // interrupted                                                  |
>                                                                  |
>  mutex_lock(&filter->notify_lock);                               |
>  list_del(&kaddfd.list);                                         |
>  mutex_unlock(&filter->notify_lock);                             |
>                                                                  |  mutex_lock(&match->notify_lock);
>                                                                  |  // This is false, addfd is false
>                                                                  |  if (addfd && n.state != SECCOMP_NOTIFY_REPLIED)
>                                                                  |
>                                                                  |  ret = n.val;
>                                                                  |  err = n.error;
>                                                                  |  flags = n.flags;
> 
> So, the process blocked in seccomp_do_user_notification() will see a
> response. As n is 0 initialized and wasn't set, it will see a 0 as
> return value from the syscall.
> 
> The seccomp agent, when retrying the interrupted syscall, will see an
> ENOENT error as the notification no longer exists (it was already
> answered by this bug).
> 
> This patch fixes the issue by splitting the if in two parts: if we were
> woken up and the state is not replied, we will always do a "goto wait".
> And if that happens and there is an addfd element on the list, we will
> add the fd before "goto wait".
> 
> This issue is present since 5.9, when addfd was added.
> 
> Fixes: 7cf97b1254550
> Cc: stable@vger.kernel.org # 5.9+
> Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
> ---

So the agent will see the return value from
wait_for_completion_interruptible() and know that the addfd wasn't
successful and the target will notice that no addfd request has actually
been added and essentially try again. Seems like a decent fix and can be
backported cleanly. I assume seccomp testsuite passes.

Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Rodrigo Campos April 13, 2021, 6:02 p.m. UTC | #2
On Tue, Apr 13, 2021 at 7:54 PM Christian Brauner
<christian.brauner@ubuntu.com> wrote:
> > Fixes: 7cf97b1254550
> > Cc: stable@vger.kernel.org # 5.9+
> > Signed-off-by: Rodrigo Campos <rodrigo@kinvolk.io>
> > ---
>
> So the agent will see the return value from
> wait_for_completion_interruptible() and know that the addfd wasn't
> successful and the target will notice that no addfd request has actually
> been added and essentially try again. Seems like a decent fix and can be

Yes, exactly!

> backported cleanly. I assume seccomp testsuite passes.

Yes, seccomp selftests (tools/testing/selftests/seccomp/seccomp_bpf) passes fine

> Acked-by: Christian Brauner <christian.brauner@ubuntu.com>

Thanks!

Patch
diff mbox series

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 63b40d12896b..1b34598f0e07 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -1107,11 +1107,20 @@  static int seccomp_do_user_notification(int this_syscall,
 	err = wait_for_completion_interruptible(&n.ready);
 	mutex_lock(&match->notify_lock);
 	if (err == 0) {
-		/* Check if we were woken up by a addfd message */
-		addfd = list_first_entry_or_null(&n.addfd,
-						 struct seccomp_kaddfd, list);
-		if (addfd && n.state != SECCOMP_NOTIFY_REPLIED) {
-			seccomp_handle_addfd(addfd);
+
+		if (n.state != SECCOMP_NOTIFY_REPLIED) {
+			/*
+			 * It is possible to be waken-up by an addfd message but
+			 * the list be empty. This can happen if the addfd
+			 * ioctl() is interrupted, as it deletes the element.
+			 *
+			 * So, check if indeed there is an element in the list.
+			 */
+			addfd = list_first_entry_or_null(&n.addfd,
+							 struct seccomp_kaddfd, list);
+			if (addfd)
+				seccomp_handle_addfd(addfd);
+
 			mutex_unlock(&match->notify_lock);
 			goto wait;
 		}