Date: Fri, 8 Jul 2022 16:21:58 -0400
From: Steven Rostedt
To: Mintu Patel
Cc: badolevishal1116@gmail.com, chinmoyghosh2001@gmail.com,
 linux-kernel@vger.kernel.org, mingo@redhat.com, peterz@infradead.org,
 vimal.kumar32@gmail.com, will@kernel.org
Subject: Re: [PATCH v2] rt_spin_lock: To list the correct owner of rt_spin_lock
Message-ID: <20220708162158.03c82f47@gandalf.local.home>
In-Reply-To: <20220627161136.3468-1-mintupatel89@gmail.com>
References: <20220619142038.1274-1-mintupatel89@gmail.com>
 <20220627161136.3468-1-mintupatel89@gmail.com>

On Mon, 27 Jun 2022 21:41:38 +0530
Mintu Patel wrote:

Hi Mintu,

FYI, a v2 should never be a reply to the v1. It should always start its
own thread, otherwise tooling and such will miss it.

> rt_spin_lock is actually a mutex on the RT kernel, so tasks contend
> for the lock. Currently the owner of an rt_spin_lock is decided before
> the lock is actually acquired. This patch reports the correct owner of
> the rt_spin_lock, which helps in debugging crashes and deadlocks caused
> by race conditions around the lock.
>
>   acquiring rt_spin_lock      acquired the lock       released the lock
>                <-------->                <------->
>             contention period            held period
>
> Thread1                                Thread2
> _try_to_take_rt_mutex+0x95c+0x74       enqueue_task_dl+0x8cc/0x8dc
> rt_spin_lock_slowlock_locked+0xac+2    rt_mutex_setprio+0x28c/0x574
> rt_spin_lock_slowlock+0x5c/0x90        task_blocks_rt_mutex+0x240/0x310
> rt_spin_lock+0x58/0x5c                 rt_spin_lock_slowlock_locked+0xac/0x2
> driverA_acquire_lock+0x28/0x56         rt_spin_lock_slowlock+0x5c/0x90
>                                        rt_spin_lock+0x58/0x5c
>                                        driverB_acquire_lock+0x48/0x6c
>
> In the sample call traces above, Thread1 acquired the rt_spin_lock and
> entered its critical section, while Thread2 kept trying to acquire the
> same rt_spin_lock held by Thread1, i.e. the contention period was too
> long. Eventually Thread2 was moved to the dl queue because of the long
> time the lock was held by Thread1. The patch below lets us identify the
> correct owner of the rt_spin_lock and points us at the driver's critical
> section; the respective driver then needs to be debugged for the long
> lock hold time.
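A note on usage: since the event class is added to include/trace/events/lock.h,
the new events should show up under the "lock" subsystem in tracefs once
CONFIG_RT_SPIN_LOCK_TRACING is enabled. A minimal sketch of enabling them
before reading the trace (the exact event paths are my assumption, not
something shown in the patch):

  # enable the three new rt_spin_lock events
  # (assumes tracefs is mounted at /sys/kernel/debug/tracing)
  echo 1 > /sys/kernel/debug/tracing/events/lock/rt_spinlock_acquire/enable
  echo 1 > /sys/kernel/debug/tracing/events/lock/rt_spinlock_acquired/enable
  echo 1 > /sys/kernel/debug/tracing/events/lock/rt_spinlock_released/enable
  cat /sys/kernel/debug/tracing/trace

With those enabled, the output should look like the example quoted below.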
>
> ex: cat /sys/kernel/debug/tracing/trace
>
> kworker/u13:0-150 [003] .....11 202.761025: rt_spinlock_acquire:
> Process: kworker/u13:0 is acquiring lock: &kbdev->hwaccess_lock
> kworker/u13:0-150 [003] .....11 202.761039: rt_spinlock_acquired:
> Process: kworker/u13:0 has acquired lock: &kbdev->hwaccess_lock
> kworker/u13:0-150 [003] .....11 202.761042: rt_spinlock_released:
> Process: kworker/u13:0 has released lock: &kbdev->hwaccess_lock
>
> Signed-off-by: Mintu Patel
> Signed-off-by: Chinmoy Ghosh
> Signed-off-by: Vishal Badole
> Signed-off-by: Vimal Kumar
> ---
>  include/trace/events/lock.h     | 38 +++++++++++++++++++++++++++++++++
>  kernel/locking/rtmutex.c        |  4 ++++
>  kernel/locking/rtmutex_common.h | 14 ++++++++++++
>  3 files changed, 56 insertions(+)
>
> diff --git a/include/trace/events/lock.h b/include/trace/events/lock.h
> index d7512129a324..0564474341c8 100644
> --- a/include/trace/events/lock.h
> +++ b/include/trace/events/lock.h
> @@ -36,6 +36,44 @@ TRACE_EVENT(lock_acquire,
>                  __get_str(name))
>  );
>
> +DECLARE_EVENT_CLASS(rt_lock_class,
> +
> +        TP_PROTO(struct lockdep_map *lock, struct task_struct *pname),
> +
> +        TP_ARGS(lock, pname),
> +
> +        TP_STRUCT__entry(
> +                __string(name, lock->name)
> +                __string(process_name, pname->comm)
> +        ),
> +
> +        TP_fast_assign(
> +                __assign_str(name, lock->name);
> +                __assign_str(process_name, pname->comm);
> +        ),
> +
> +        TP_printk("Process: %s is acquiring lock: %s", __get_str(process_name),
> +                  __get_str(name))
> +);
>
> +DEFINE_EVENT(rt_lock_class, rt_spinlock_acquire,
> +        TP_PROTO(struct lockdep_map *lock, struct task_struct *pname),
> +        TP_ARGS(lock, pname));
>
> +DEFINE_EVENT_PRINT(rt_lock_class, rt_spinlock_acquired,
> +        TP_PROTO(struct lockdep_map *lock, struct task_struct *pname),
> +        TP_ARGS(lock, pname),
> +        TP_printk("Process: %s has acquired lock: %s", __get_str(process_name),
> +                  __get_str(name))
> +        );
>
> +DEFINE_EVENT_PRINT(rt_lock_class, rt_spinlock_released,
> +        TP_PROTO(struct lockdep_map *lock, struct task_struct *pname),
> +        TP_ARGS(lock, pname),
> +        TP_printk("Process: %s has released lock: %s", __get_str(process_name),
> +                  __get_str(name))
> +        );
> +
>  DECLARE_EVENT_CLASS(lock,
>
>          TP_PROTO(struct lockdep_map *lock, unsigned long ip),
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 602eb7821a1b..80ba2c0d7923 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -26,6 +26,7 @@
>  #include
>  #include
>  #include
> +#include
>
>  #include "rtmutex_common.h"
>
> @@ -1144,7 +1145,9 @@ void __lockfunc rt_spin_lock(spinlock_t *lock)
>          rcu_read_lock();
>          migrate_disable();
>          spin_acquire(&lock->dep_map, 0, 0, _RET_IP_);
> +        do_trace_rt_spinlock_acquire(lock, current);
>          rt_spin_lock_fastlock(&lock->lock, rt_spin_lock_slowlock);
> +        do_trace_rt_spinlock_acquired(lock, current);
>  }
>  EXPORT_SYMBOL(rt_spin_lock);
>
> @@ -1169,6 +1172,7 @@ void __lockfunc rt_spin_unlock(spinlock_t *lock)
>  {
>          /* NOTE: we always pass in '1' for nested, for simplicity */
>          spin_release(&lock->dep_map, 1, _RET_IP_);
> +        do_trace_rt_spinlock_released(lock, current);
>          rt_spin_lock_fastunlock(&lock->lock, rt_spin_lock_slowunlock);
>          migrate_enable();
>          rcu_read_unlock();
> diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
> index 546aaf058b9e..185ffc1e7015 100644
> --- a/kernel/locking/rtmutex_common.h
> +++ b/kernel/locking/rtmutex_common.h
> @@ -25,6 +25,20 @@
>   * @pi_tree_entry: pi node to enqueue into the mutex owner waiters tree
>   * @task:          task reference to the blocked task
>   */
> +
> +#ifdef CONFIG_RT_SPIN_LOCK_TRACING
> +#define do_trace_rt_spinlock_acquire(lock, task) \
> +        trace_rt_spinlock_acquire(&lock->dep_map, task)
> +#define do_trace_rt_spinlock_acquired(lock, task) \
> +        trace_rt_spinlock_acquired(&lock->dep_map, task)
> +#define do_trace_rt_spinlock_released(lock, task) \
> +        trace_rt_spinlock_released(&lock->dep_map, task)
> +#else
> +#define do_trace_rt_spinlock_acquire(lock, task) do {} while(0)
> +#define do_trace_rt_spinlock_acquired(lock, task) do {} while(0)
> +#define do_trace_rt_spinlock_released(lock, task) do {} while(0)
> +#endif
> +

Acked-by: Steven Rostedt (Google)

Although it will need to be accepted by Peter Zijlstra.

-- Steve

>  struct rt_mutex_waiter {
>          struct rb_node          tree_entry;
>          struct rb_node          pi_tree_entry;