From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752397AbdBJK1m (ORCPT ); Fri, 10 Feb 2017 05:27:42 -0500 Received: from mail-ua0-f181.google.com ([209.85.217.181]:33833 "EHLO mail-ua0-f181.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751305AbdBJK1i (ORCPT ); Fri, 10 Feb 2017 05:27:38 -0500 MIME-Version: 1.0 In-Reply-To: References: From: Dmitry Vyukov Date: Fri, 10 Feb 2017 11:26:39 +0100 Message-ID: Subject: Re: [PATCH] timerfd: Protect the might cancel mechanism proper To: Thomas Gleixner Cc: Al Viro , "linux-fsdevel@vger.kernel.org" , LKML , syzkaller Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Feb 10, 2017 at 11:13 AM, Thomas Gleixner wrote: > Dmitry, > > On Thu, 2 Feb 2017, Dmitry Vyukov wrote: > >> On Thu, Feb 2, 2017 at 7:54 PM, Thomas Gleixner wrote: >> > On Wed, 1 Feb 2017, Dmitry Vyukov wrote: >> >> >> >> Can't we still end up with an inconsistently setup timer? >> >> do_timerfd_settime executes timerfd_setup_cancel and timerfd_setup as >> >> two separate non-atomic actions. So if there are 2 concurrent >> >> timerfd_settime calls, one that needs cancel and another that does not >> >> need cancel, can't we end up with inconsistent setup? E.g. setup timer >> >> that needs cancel, but it won't be in cancel_list. Or vice versa. >> > >> > Do we really care? If an application arms the timer with cancel in one >> > thread and the same timer without cancel in another thread, then it's >> > probably completely irrelevant whether the state pair timeout/cancel is >> > correct or not. That's clearly an application bug and I don't want to add >> > more locking just to make something which is broken by definition pseudo >> > 'atomic'. >> >> I agree that the program is bogus, and don't have to ensure any sane >> behavior for it. But I am concerned about potential kernel corruptions >> due to this. For example, maybe kernel code will decide to not remove >> such timer from the cancel list on destruction because based on >> clockid/flags it should not be in the cancel list, but the timer is >> actually there and we will end up with a leak or a dangling pointer. I >> did not check that this actually happens, such inconsistent state just >> looks like a red flag for me. > > That can't happen. > > ctx->might_cancel and ctx->clist are always in sync with the new lock and > that's the only interesting thing. On destruction we don't look at clockid > or such, we only care about might_cancel. > > What is not guaranteed to be in sync is the timer expiry time and the > cancel stuff, if two threads operate on the same timerfd in > parallel. That's what I do not care about at all. Ack. Thanks for looking at it bearing with me. Then: Acked-by: Dmitry Vyukov