From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752140AbdBAMoU (ORCPT <rfc822;w@1wt.eu>);
        Wed, 1 Feb 2017 07:44:20 -0500
Received: from mail-ua0-f175.google.com ([209.85.217.175]:34029 "EHLO
        mail-ua0-f175.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751192AbdBAMoS (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Wed, 1 Feb 2017 07:44:18 -0500
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.20.1701311521430.3457@nanos>
References: <alpine.DEB.2.20.1701311521430.3457@nanos>
From: Dmitry Vyukov <dvyukov@google.com>
Date: Wed, 1 Feb 2017 13:43:57 +0100
Message-ID: <CACT4Y+Ye47OticsSsdUHbieyEyvFcUzHqWad00bp6k0tXM2fsQ@mail.gmail.com>
Subject: Re: [PATCH] timerfd: Protect the might cancel mechanism proper
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
        "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
        LKML <linux-kernel@vger.kernel.org>,
        syzkaller <syzkaller@googlegroups.com>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jan 31, 2017 at 3:24 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> The handling of the might_cancel queueing is not properly protected, so
> parallel operations on the file descriptor can race with each other and
> lead to list corruptions or use after free.
>
> Protect the context for these operations with a separate lock.
>
> The wait queue lock cannot be reused for this because that would create a
> lock inversion scenario vs. the cancel lock. Replacing might_cancel with an
> atomic (atomic_t or atomic bit) does not help either because it still can
> race vs. the actual list operation.
>
> Reported-by: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  fs/timerfd.c |   17 ++++++++++++++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> --- a/fs/timerfd.c
> +++ b/fs/timerfd.c
> @@ -40,6 +40,7 @@ struct timerfd_ctx {
>         short unsigned settime_flags;   /* to show in fdinfo */
>         struct rcu_head rcu;
>         struct list_head clist;
> +       spinlock_t cancel_lock;
>         bool might_cancel;
>  };
>
> @@ -112,7 +113,7 @@ void timerfd_clock_was_set(void)
>         rcu_read_unlock();
>  }
>
> -static void timerfd_remove_cancel(struct timerfd_ctx *ctx)
> +static void __timerfd_remove_cancel(struct timerfd_ctx *ctx)
>  {
>         if (ctx->might_cancel) {
>                 ctx->might_cancel = false;
> @@ -122,6 +123,13 @@ static void timerfd_remove_cancel(struct
>         }
>  }
>
> +static void timerfd_remove_cancel(struct timerfd_ctx *ctx)
> +{
> +       spin_lock(&ctx->cancel_lock);
> +       __timerfd_remove_cancel(ctx);
> +       spin_unlock(&ctx->cancel_lock);
> +}
> +
>  static bool timerfd_canceled(struct timerfd_ctx *ctx)
>  {
>         if (!ctx->might_cancel || ctx->moffs != KTIME_MAX)
> @@ -132,6 +140,7 @@ static bool timerfd_canceled(struct time
>
>  static void timerfd_setup_cancel(struct timerfd_ctx *ctx, int flags)
>  {
> +       spin_lock(&ctx->cancel_lock);
>         if ((ctx->clockid == CLOCK_REALTIME ||
>              ctx->clockid == CLOCK_REALTIME_ALARM) &&
>             (flags & TFD_TIMER_ABSTIME) && (flags & TFD_TIMER_CANCEL_ON_SET)) {
> @@ -141,9 +150,10 @@ static void timerfd_setup_cancel(struct
>                         list_add_rcu(&ctx->clist, &cancel_list);
>                         spin_unlock(&cancel_lock);
>                 }
> -       } else if (ctx->might_cancel) {
> -               timerfd_remove_cancel(ctx);
> +       } else {
> +               __timerfd_remove_cancel(ctx);
>         }
> +       spin_unlock(&ctx->cancel_lock);
>  }
>
>  static ktime_t timerfd_get_remaining(struct timerfd_ctx *ctx)
> @@ -400,6 +410,7 @@ SYSCALL_DEFINE2(timerfd_create, int, clo
>                 return -ENOMEM;
>
>         init_waitqueue_head(&ctx->wqh);
> +       spin_lock_init(&ctx->cancel_lock);
>         ctx->clockid = clockid;
>
>         if (isalarm(ctx))


Can't we still end up with an inconsistently set up timer?
do_timerfd_settime() executes timerfd_setup_cancel() and timerfd_setup()
as two separate steps that are not atomic with respect to each other. So
if two timerfd_settime() calls run concurrently, one that needs cancel
and one that does not, can't the end result be inconsistent? E.g. a
timer that is set up to need cancel but is not on cancel_list, or vice
versa.
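
To make the interleaving I have in mind concrete, here is a rough
user-space model. This is only a sketch and not kernel code: struct
model_ctx and all model_* names are hypothetical, each step is assumed
to be atomic on its own (as it would be with the new cancel_lock), and
"armed_wants_cancel" is an illustrative stand-in for whatever the last
timerfd_setup() call armed. The interleaving is replayed sequentially,
so no threads are needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the relevant timerfd_ctx state (not kernel code). */
struct model_ctx {
	bool might_cancel;       /* stands in for cancel_list membership */
	bool armed_wants_cancel; /* what the last armed timer asked for */
};

/* Step 1 of a settime call; models timerfd_setup_cancel(). */
static void model_setup_cancel(struct model_ctx *ctx, bool want_cancel)
{
	ctx->might_cancel = want_cancel;
}

/* Step 2 of a settime call; models timerfd_setup() arming the timer. */
static void model_setup_timer(struct model_ctx *ctx, bool want_cancel)
{
	ctx->armed_wants_cancel = want_cancel;
}

/* Consistent means: the armed timer and the cancel state agree. */
static bool model_consistent(const struct model_ctx *ctx)
{
	return ctx->might_cancel == ctx->armed_wants_cancel;
}

/*
 * Replay one interleaving of two settime calls:
 * A wants cancel-on-set, B does not.
 */
static struct model_ctx model_replay_race(void)
{
	struct model_ctx ctx = { false, false };

	assert(model_consistent(&ctx));   /* starts out consistent */

	model_setup_cancel(&ctx, true);   /* A, step 1 */
	model_setup_cancel(&ctx, false);  /* B, step 1 */
	model_setup_timer(&ctx, false);   /* B, step 2 */
	model_setup_timer(&ctx, true);    /* A, step 2 */

	return ctx;
}
```

Under these assumptions model_replay_race() ends with A's timer armed
while B's cancel state is in effect, so model_consistent() returns
false. Swapping the roles of A and B gives the opposite mismatch.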