From mboxrd@z Thu Jan 1 00:00:00 1970
From: Zubin Mithra <zsm@chromium.org>
To: stable@vger.kernel.org
Cc: gregkh@linuxfoundation.org, groeck@chromium.org, tglx@linutronix.de,
    mingo@redhat.com, peterz@infradead.org, dvhart@infradead.org,
    zsm@chromium.org
Subject: [PATCH v4.9.y,v4.4.y v2] futex,rt_mutex: Restructure rt_mutex_finish_proxy_lock()
Date: Fri,  8 Mar 2019 13:10:09 -0800
Message-Id: <20190308211009.239345-1-zsm@chromium.org>
X-Mailer: git-send-email 2.21.0.360.g471c308f928-goog
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Mailing-List: stable@vger.kernel.org

From: Peter Zijlstra <peterz@infradead.org>

commit 38d589f2fd08f1296aea3ce62bebd185125c6d81 upstream

With the ultimate goal of keeping rt_mutex wait_list and futex_q waiters
consistent it's necessary to split 'rt_mutex_futex_lock()' into finer
parts, such that only the actual blocking can be done without hb->lock
held.

Split rt_mutex_finish_proxy_lock() into two parts, one that does the
blocking and one that does remove_waiter() when the lock acquire failed.
When the rtmutex was acquired successfully the waiter can be removed in
the acquisition path safely, since there is no concurrency on the lock
owner.

This means that, except for futex_lock_pi(), all wait_list modifications
are done with both hb->lock and wait_lock held.

[bigeasy@linutronix.de: fix for futex_requeue_pi_signal_restart]

Signed-off-by: Peter Zijlstra (Intel)
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104152.001659630@infradead.org
Signed-off-by: Thomas Gleixner
Signed-off-by: Zubin Mithra
---
Syzkaller reported a GPF in rt_mutex_top_waiter when fuzzing a 4.4
kernel. The corresponding call trace is below:

Call Trace:
 [] remove_waiter+0x1e/0x1c8 kernel/locking/rtmutex.c:1082
 [] rt_mutex_start_proxy_lock+0x95/0xb1 kernel/locking/rtmutex.c:1685
 [] futex_requeue+0x929/0xbc3 kernel/futex.c:1944
 [] do_futex+0xecf/0xf9a kernel/futex.c:3249
 [] SYSC_futex kernel/futex.c:3287 [inline]
 [] SyS_futex+0x253/0x29e kernel/futex.c:3255
 [] entry_SYSCALL_64_fastpath+0x31/0xb3
Code: e0 2a 53 48 c1 ea 03 80 3c 02 00 74 05 e8 f5 54 1e 00 49 8b 5c 24
40 b8 ff ff 37 00 48 c1 e0 2a 48 8d 7b 38 48 89 fa 48 c1 ea 03 <80> 3c
02 00 74 05 e8 d1 54 1e 00 4c 39 63 38 74 02 0f 0b 48 89
RIP  [] rt_mutex_top_waiter+0x42/0x5d kernel/locking/rtmutex_common.h:53
 RSP
---[ end trace ab9c561cca7592c2 ]---

The PoC triggers a crash on the mainline kernel at tag:v4.4, stable at
4.4.y and the 4.4 kernel being fuzzed.
The following tests were run after applying this patch:
 * LTP tests inside testcases/kernel/syscalls/futex
 * Syzkaller repro (does not cause a GPF with the backport)
 * Chrome OS tryjob tests
 * Some tests from within glibc/nptl

 kernel/futex.c                  |  7 +++--
 kernel/locking/rtmutex.c        | 52 ++++++++++++++++++++++++++++-----
 kernel/locking/rtmutex_common.h |  8 +++--
 3 files changed, 55 insertions(+), 12 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index a26d217c99fe7..0c92c8d34ffa2 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2923,10 +2923,13 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
 		 */
 		WARN_ON(!q.pi_state);
 		pi_mutex = &q.pi_state->pi_mutex;
-		ret = rt_mutex_finish_proxy_lock(pi_mutex, to, &rt_waiter);
-		debug_rt_mutex_free_waiter(&rt_waiter);
+		ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter);
 
 		spin_lock(q.lock_ptr);
+		if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
+			ret = 0;
+
+		debug_rt_mutex_free_waiter(&rt_waiter);
 		/*
 		 * Fixup the pi_state owner and possibly acquire the lock if we
 		 * haven't already.
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index b066724d7a5be..dd173df9ee5e5 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1712,21 +1712,23 @@ struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock)
 }
 
 /**
- * rt_mutex_finish_proxy_lock() - Complete lock acquisition
+ * rt_mutex_wait_proxy_lock() - Wait for lock acquisition
  * @lock:		the rt_mutex we were woken on
  * @to:			the timeout, null if none. hrtimer should already have
  *			been started.
  * @waiter:		the pre-initialized rt_mutex_waiter
  *
- * Complete the lock acquisition started our behalf by another thread.
+ * Wait for the the lock acquisition started on our behalf by
+ * rt_mutex_start_proxy_lock(). Upon failure, the caller must call
+ * rt_mutex_cleanup_proxy_lock().
  *
  * Returns:
  *  0 - success
  * <0 - error, one of -EINTR, -ETIMEDOUT
  *
- * Special API call for PI-futex requeue support
+ * Special API call for PI-futex support
  */
-int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
+int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
 			       struct hrtimer_sleeper *to,
 			       struct rt_mutex_waiter *waiter)
 {
@@ -1739,9 +1741,6 @@ int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 	/* sleep on the mutex */
 	ret = __rt_mutex_slowlock(lock, TASK_INTERRUPTIBLE, to, waiter);
 
-	if (unlikely(ret))
-		remove_waiter(lock, waiter);
-
 	/*
 	 * try_to_take_rt_mutex() sets the waiter bit unconditionally. We might
 	 * have to fix that up.
@@ -1752,3 +1751,42 @@ int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
 
 	return ret;
 }
+
+/**
+ * rt_mutex_cleanup_proxy_lock() - Cleanup failed lock acquisition
+ * @lock:		the rt_mutex we were woken on
+ * @waiter:		the pre-initialized rt_mutex_waiter
+ *
+ * Attempt to clean up after a failed rt_mutex_wait_proxy_lock().
+ *
+ * Unless we acquired the lock; we're still enqueued on the wait-list and can
+ * in fact still be granted ownership until we're removed. Therefore we can
+ * find we are in fact the owner and must disregard the
+ * rt_mutex_wait_proxy_lock() failure.
+ *
+ * Returns:
+ *  true  - did the cleanup, we done.
+ *  false - we acquired the lock after rt_mutex_wait_proxy_lock() returned,
+ *          caller should disregards its return value.
+ *
+ * Special API call for PI-futex support
+ */
+bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+				 struct rt_mutex_waiter *waiter)
+{
+	bool cleanup = false;
+
+	raw_spin_lock_irq(&lock->wait_lock);
+	/*
+	 * Unless we're the owner; we're still enqueued on the wait_list.
+	 * So check if we became owner, if not, take us off the wait_list.
+	 */
+	if (rt_mutex_owner(lock) != current) {
+		remove_waiter(lock, waiter);
+		fixup_rt_mutex_waiters(lock);
+		cleanup = true;
+	}
+	raw_spin_unlock_irq(&lock->wait_lock);
+
+	return cleanup;
+}
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index e317e1cbb3eba..6f8f68edb700c 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -106,9 +106,11 @@ extern void rt_mutex_proxy_unlock(struct rt_mutex *lock,
 extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
 				     struct rt_mutex_waiter *waiter,
 				     struct task_struct *task);
-extern int rt_mutex_finish_proxy_lock(struct rt_mutex *lock,
-				      struct hrtimer_sleeper *to,
-				      struct rt_mutex_waiter *waiter);
+extern int rt_mutex_wait_proxy_lock(struct rt_mutex *lock,
+			       struct hrtimer_sleeper *to,
+			       struct rt_mutex_waiter *waiter);
+extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
+				 struct rt_mutex_waiter *waiter);
 extern int rt_mutex_timed_futex_lock(struct rt_mutex *l, struct hrtimer_sleeper *to);
 extern bool rt_mutex_futex_unlock(struct rt_mutex *lock,
 				  struct wake_q_head *wqh);
-- 
2.21.0.360.g471c308f928-goog