From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0E2BC4338F for ; Wed, 11 Aug 2021 12:26:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BFC6260EBC for ; Wed, 11 Aug 2021 12:26:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238345AbhHKM0V (ORCPT ); Wed, 11 Aug 2021 08:26:21 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:51076 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237737AbhHKMYF (ORCPT ); Wed, 11 Aug 2021 08:24:05 -0400 Message-ID: <20210811121417.532930310@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1628684621; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: references:references; bh=GL0Akvx+J+4uTUYuRwUlXnL+P0pnSl80bsHAu0ZT1EU=; b=ro4TpzAoNW5c8/1aIN4N49w09qOP3duQ7kQmoQWNy+sPEarDA4IY9cnG8WR2C84qJVNWt5 rx7i1OHQHrM71HvwxBC+D7qgwuIBzG+eFrQiqZSyxasFhu6izIGwkhm5Sl3AqdOPGAQzcJ kSQflwBqP2gTjZ4FaJW5YTRD/iOLTIdAhqofvYu6idPrZQWQ3PpM8nl/LukDTpwA2nmoQi dLUunrCh6fC6LU1vzbT9idbolEIFmeCRp0yKUXIQurGbq19mVjQPbr86ACUNOP6PC2ZTvu ydotOYRzPHOw/8gC+14y7xAnW43plUatPYoeR+tvxdXqVnqDJ5owL5BWAAu2lA== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1628684621; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: references:references; bh=GL0Akvx+J+4uTUYuRwUlXnL+P0pnSl80bsHAu0ZT1EU=; b=Dv+RLVY9eOVBd0eLIw+orTlRkMcjceFSKFN0wqhBBGMurZy7CkeogKKBtw7y2ZEBp/bQet EWhkP4gWUl+2CCBw== From: Thomas Gleixner To: LKML Cc: Peter Zijlstra , Ingo Molnar , Juri Lelli , Steven Rostedt , Daniel Bristot de Oliveira , Will Deacon , Waiman Long , Boqun Feng , Sebastian Andrzej Siewior , Davidlohr Bueso , Mike Galbraith Subject: [patch V4 57/68] futex: Clarify futex_requeue() PI handling References: <20210811120348.855823694@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-transfer-encoding: 8-bit Date: Wed, 11 Aug 2021 14:23:41 +0200 (CEST) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When requeuing to a PI futex then the requeue code tries to trylock the PI futex on behalf of the topmost waiter on the inner 'waitqueue' futex. If that succeeds then PI state has to be allocated in order to requeue further waiters to the PI futex. The comment and the code are confusing as the PI state allocation uses lookup_pi_state() which either attaches to an existing waiter or to the owner. As the PI futex was just acquired, there cannot be a waiter on the PI futex because the hash bucket lock is held. Clarify the comment and use attach_to_pi_owner() directly. As the task on which behalf the PI futex has been acquired is guaranteed to be alive and not exiting, this call must succeed. Add a WARN_ON() in case that fails. Signed-off-by: Thomas Gleixner --- V4: New patch --- kernel/futex.c | 61 +++++++++++++++++++++------------------------------------ 1 file changed, 23 insertions(+), 38 deletions(-) --- a/kernel/futex.c +++ b/kernel/futex.c @@ -1299,27 +1299,6 @@ static int attach_to_pi_owner(u32 __user return 0; } -static int lookup_pi_state(u32 __user *uaddr, u32 uval, - struct futex_hash_bucket *hb, - union futex_key *key, struct futex_pi_state **ps, - struct task_struct **exiting) -{ - struct futex_q *top_waiter = futex_top_waiter(hb, key); - - /* - * If there is a waiter on that futex, validate it and - * attach to the pi_state when the validation succeeds. - */ - if (top_waiter) - return attach_to_pi_state(uaddr, uval, top_waiter->pi_state, ps); - - /* - * We are the first waiter - try to look up the owner based on - * @uval and attach to it. - */ - return attach_to_pi_owner(uaddr, uval, key, ps, exiting); -} - static int lock_pi_update_atomic(u32 __user *uaddr, u32 uval, u32 newval) { int err; @@ -2038,8 +2017,8 @@ static int futex_requeue(u32 __user *uad * At this point the top_waiter has either taken uaddr2 or is * waiting on it. If the former, then the pi_state will not * exist yet, look it up one more time to ensure we have a - * reference to it. If the lock was taken, ret contains the - * vpid of the top waiter task. + * reference to it. If the lock was taken, @ret contains the + * VPID of the top waiter task. * If the lock was not taken, we have pi_state and an initial * refcount on it. In case of an error we have nothing. */ @@ -2047,19 +2026,25 @@ static int futex_requeue(u32 __user *uad WARN_ON(pi_state); task_count++; /* - * If we acquired the lock, then the user space value - * of uaddr2 should be vpid. It cannot be changed by - * the top waiter as it is blocked on hb2 lock if it - * tries to do so. If something fiddled with it behind - * our back the pi state lookup might unearth it. So - * we rather use the known value than rereading and - * handing potential crap to lookup_pi_state. + * If futex_proxy_trylock_atomic() acquired the + * user space futex, then the user space value + * @uaddr2 has been set to the @hb1's top waiter + * task VPID. This task is guaranteed to be alive + * and cannot be exiting because it is either + * sleeping or blocked on @hb2 lock. + * + * The @uaddr2 futex cannot have waiters either as + * otherwise futex_proxy_trylock_atomic() would not + * have succeeded. * - * If that call succeeds then we have pi_state and an - * initial refcount on it. + * In order to requeue waiters to @hb2, pi state is + * required. Hand in the VPID value (@ret) and + * allocate PI state with an initial refcount on + * it. */ - ret = lookup_pi_state(uaddr2, ret, hb2, &key2, - &pi_state, &exiting); + ret = attach_to_pi_owner(uaddr2, ret, &key2, &pi_state, + &exiting); + WARN_ON(ret); } switch (ret) { @@ -2183,9 +2168,9 @@ static int futex_requeue(u32 __user *uad } /* - * We took an extra initial reference to the pi_state either - * in futex_proxy_trylock_atomic() or in lookup_pi_state(). We - * need to drop it here again. + * We took an extra initial reference to the pi_state either in + * futex_proxy_trylock_atomic() or in attach_to_pi_owner(). We need + * to drop it here again. */ put_pi_state(pi_state); @@ -2364,7 +2349,7 @@ static int __fixup_pi_state_owner(u32 __ * Modifying pi_state _before_ the user space value would leave the * pi_state in an inconsistent state when we fault here, because we * need to drop the locks to handle the fault. This might be observed - * in the PID check in lookup_pi_state. + * in the PID checks when attaching to PI state . */ retry: if (!argowner) {