From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754140AbbBBAQa (ORCPT ); Sun, 1 Feb 2015 19:16:30 -0500 Received: from kanga.kvack.org ([205.233.56.17]:34821 "EHLO kanga.kvack.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752364AbbBBAQ3 (ORCPT ); Sun, 1 Feb 2015 19:16:29 -0500 Date: Sun, 1 Feb 2015 19:16:28 -0500 From: Benjamin LaHaise To: Linus Torvalds Cc: linux-aio@kvack.org, Linux Kernel Subject: Re: [GIT PULL] aio: fix sleeping while TASK_INTERRUPTIBLE Message-ID: <20150202001628.GO2974@kvack.org> References: <20150201144058.GM2974@kvack.org> <20150201221458.GN2974@kvack.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.2i Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Sun, Feb 01, 2015 at 03:33:25PM -0800, Linus Torvalds wrote: > On Sun, Feb 1, 2015 at 2:14 PM, Benjamin LaHaise wrote: > > > > It's ugly, but it actually is revealing a bug. Spurious wake ups caused > > by the task already being added to ctx->wait when calling into mutex_lock() > > could inadvertently cause things to go wrong. I can envision there being > > code invoked that possibly expects a 1-1 relationship between sleeps and > > wake ups, which being on the additional wait queue might break. > > So I'm looking at it, and I don't see it. > > One side uses wait_event_interruptible_hrtimeout(), which waits for > the return value (or the timeout), and it doesn't matter how many > times it gets woken up, regardless of what it's waiting for. If it > gets extra wakeups, it will just go through the loop again. > > The other side is just a plain aio_read_events() -> > aio_read_events_ring(), and that one just reads as many events as it > can, unless some error happens. > > In other words, it really looks like the warning is spurious, and the > comments about how the semaphore could cause it to loop around but it > all works look entirely correct. > > So no, I don't see it revealing a bug at all. All I see is a spurious warning. > > What's the bug you think could happen? The bug would be in code that gets run via mutex_lock(), kmap(), or (more likely) in the random mm or filesystem code that could be invoked via copy_to_user(). If someone has code that works like the following inside of, say, a filesystem that's doing i/o somewhere: static void foo_done(struct foo *foo) { /* stuff err somewhere */ wake_up(foo->task); } void some_random_code_in_some_fs() { struct foo *foo; /* setup the foo */ foo->task = current; set_current_state(TASK_UNINTERRUPTIBLE); /* send the foo on to do some other work */ schedule(); /* foo_done should have been called at this point. */ } When the task in question can receive spurious wake ups, there is the possibility that schedule() ends up returning without foo_done() having been called, which is not the case normally (that is, there should be a one to one relationship between the wake_up and schedule returning in this case). While I don't immediately see any code that relies on this, I'm not convinced that every possible code path that can be invoked (especially via copy_to_user()) does not rely on these semantics. Maybe I'm just being paranoid, but this is one of the concerns I raised when this issue came forth. Nobody has addressed it yet, though. -ben > Linus -- "Thought is the essence of where you are now."