From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757912AbcKCPhT (ORCPT ); Thu, 3 Nov 2016 11:37:19 -0400 Received: from mail-oi0-f54.google.com ([209.85.218.54]:32957 "EHLO mail-oi0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756850AbcKCPhR (ORCPT ); Thu, 3 Nov 2016 11:37:17 -0400 MIME-Version: 1.0 In-Reply-To: <1477551554-30349-11-git-send-email-binoy.jayan@linaro.org> References: <1477551554-30349-1-git-send-email-binoy.jayan@linaro.org> <1477551554-30349-11-git-send-email-binoy.jayan@linaro.org> From: Linus Torvalds Date: Thu, 3 Nov 2016 08:37:16 -0700 X-Google-Sender-Auth: UpombtBloWbYQvvqHJc9RzPUvO0 Message-ID: Subject: Re: [PATCH v4 10/10] IB/mlx5: Simplify completion into a wait_event To: Binoy Jayan Cc: Doug Ledford , Sean Hefty , Hal Rosenstock , Arnd Bergmann , "linux-rdma@vger.kernel.org" , Linux Kernel Mailing List Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Oct 26, 2016 at 11:59 PM, Binoy Jayan wrote: > Convert the completion 'mlx5_ib_umr_context:done' to a wait_event as it > just waits for the return value to be filled. This is wrong. Since that "umr_context" variable is on the stack, and you are waiting for it to be fully done, it really should be a completion. Why? Because a "complete()" guarantees that the *last* access to the variable is done by the thread that does "wait_for_completion()". In contrast, doing a "wait_event()" can actually *race* with whoever wakes it up, and the "wait_event()" may race with the wakeup, where the wakeup() ends up removing the entry from the list, so that "list_empty_careful()" sees that it's all done, but the "wakeup()" can still be holding (and about to release) the waitqueue spinlock. What's the problem? When the waiter is on the stack, the umr_context" variable may be about to be relased, and then the "wake_up()" may end up accessing the umr_context waitqueue spinlock after it has already become something else. This is unlikely, but it's very much one of the things that a "completion" is all about. A completion (along with a plain spinlock) is guaranteed to be synchronous in a way that semaphores and wait-queues are *not*, so that when somebody has done "complete()", there is no access to the variable that can race. So no. A wait-queue wakeup is NOT AT ALL the same thing as a completion. There really is a very real reason why "complete()" exists, and why it is used for one-time initializations like this. "complete()" along with spinlocks are synchronous in ways that other concepts are not. Linus