* [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331!
@ 2022-07-20 7:28 yuxin.ye
2022-07-21 2:25 ` Waiman Long
0 siblings, 1 reply; 5+ messages in thread
From: yuxin.ye @ 2022-07-20 7:28 UTC (permalink / raw)
To: linux-rt-users, peterz, mingo, will, longman, boqun.feng
Cc: linux-kernel, yuxin.ye
before rt_mutex_adjust_prio_chain(),unlock lock->wait_lock will cause
BUG_ON at kernel/locking/rtmutex.c:1331:
[147800.420240] Call trace:
[147800.420243] dump_backtrace+0x0/0x1cc
[147800.420260] show_stack+0x18/0x24
[147800.420267] dump_stack+0xcc/0x12c
[147800.420278] nmi_cpu_backtrace+0xbc/0xc8
[147800.420287] nmi_trigger_cpumask_backtrace+0xcc/0x1c0
[147800.420295] arch_trigger_cpumask_backtrace+0x28/0x40
[147800.420304] oops_enter+0x3c/0x48
[147800.420313] die+0x54/0x228
[147800.420320] bug_handler+0x44/0x6c
[147800.420327] call_break_hook+0x40/0x74
[147800.420336] brk_handler+0x1c/0x5c
[147800.420342] do_debug_exception+0x88/0xc4
[147800.420350] el1_dbg+0x38/0x54
[147800.420359] el1_sync_handler+0xac/0xb8
[147800.420365] el1_sync+0x88/0x140
[147800.420372] task_blocks_on_rt_mutex+0x94/0x1b4
[147800.420383] rt_spin_lock_slowlock_locked+0x90/0x1bc
[147800.420391] rt_spin_lock_slowlock+0x5c/0x94
[147800.420399] rt_spin_lock_fastlock.constprop.0+0x28/0x34
[147800.420408] rt_spin_lock+0x10/0x24
[147800.420415] local_lock_acquire+0x28/0xd0
[147800.420425] free_unref_page+0x94/0x114
[147800.420432] free_the_page+0x14/0x2c
[147800.420440] __free_pages+0x30/0x78
[147800.420447] __vunmap+0x188/0x1c8
[147800.420453] __vfree+0x4c/0x50
[147800.420460] vfree+0x30/0x40
[147800.420465] free_thread_stack+0xd0/0x120
[147800.420472] put_task_stack+0x60/0x6c
[147800.420479] __put_task_struct+0x4c/0xd4
[147800.420485] put_task_struct+0x44/0x78
[147800.420493] rt_mutex_adjust_prio_chain+0x1a0/0x370
[147800.420502] task_blocks_on_rt_mutex+0x188/0x1b4
[147800.420511] rt_mutex_slowlock_locked+0xb0/0x170
[147800.420519] rt_mutex_slowlock+0x7c/0xd4
[147800.420526] __rt_mutex_lock_state+0x3c/0x50
[147800.420534] _mutex_lock_blk_flush+0x5c/0x6c
[147800.420543] _mutex_lock+0x14/0x20
[147800.420550] ion_alloc+0x5f8/0x62c
[147800.420561] ion_ioctl+0x18c/0x694
[147800.420569] vfs_ioctl+0x28/0x48
[147800.420578] __arm64_sys_ioctl+0x78/0xcc
[147800.420586] el0_svc_common.constprop.0+0x148/0x1e8
[147800.420593] do_el0_svc+0x50/0x80
[147800.420599] el0_svc+0x14/0x20
[147800.420606] el0_sync_handler+0xcc/0x154
[147800.420612] el0_sync+0x180/0x1c0
A\B\C is task.
L1\L2 is lock.
adj: means rt_mutex_adjust_prio_chain()
key process:
1. A owns L1,and blocked on L2.
2. B blocked on L1,B execute mutex_lock or spinlock will adjust A's
priority by execute adj func.
3. before execute adj,it will unlock L1->wait_lock
4. If at this point,C release L2.A owns L2,and finish the whole thread
work very quickly,Finally the B thread exited.In this process,
unlock L1 will assign 0x1 to L1->owner,what orign value is A
task_struct.But in adj func,the parameter of task is still A's
pointer.becaues of A already exited,put_task_struct will release
task A.
5. If local page.lock is locked,it will cause a BUG_ON,becaues one
task A be blocked on two lock.
====A================B===============C================
| | |->owns L2
|->owns L1 | |
|->block on L2 |->lock L1.rawspin_wait_lock
| |->block on L1 |
| | |->unlock L2
| |->get A task_truct
|->owns L2 |->unlocked L1.rawspin_wait_lock
|->lock L1.rawspin_wait_lock
|->unlock L1 | |
| | |
|->unlock L1.rawspin_wait_lock
|->release L2 | |
|->A exit & not free
| |->put A task_struct
| | ↓
[5]
Signed-off-by: yuxin.ye <yeyuxin0925@gmail.com>
---
kernel/locking/rtmutex.c | 8 --------
1 file changed, 8 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 7779ee8ab..52e9cebc3 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1175,13 +1175,9 @@ static int __sched task_blocks_on_rt_mutex(struct rt_mutex_base *lock,
*/
get_task_struct(owner);
- raw_spin_unlock_irq(&lock->wait_lock);
-
res = rt_mutex_adjust_prio_chain(owner, chwalk, lock,
next_lock, waiter, task);
- raw_spin_lock_irq(&lock->wait_lock);
-
return res;
}
@@ -1461,12 +1457,8 @@ static void __sched remove_waiter(struct rt_mutex_base *lock,
/* gets dropped in rt_mutex_adjust_prio_chain()! */
get_task_struct(owner);
- raw_spin_unlock_irq(&lock->wait_lock);
-
rt_mutex_adjust_prio_chain(owner, RT_MUTEX_MIN_CHAINWALK, lock,
next_lock, NULL, current);
-
- raw_spin_lock_irq(&lock->wait_lock);
}
/**
--
2.17.1
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331!
2022-07-20 7:28 [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331! yuxin.ye
@ 2022-07-21 2:25 ` Waiman Long
2022-07-21 7:17 ` yuxin.ye
0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2022-07-21 2:25 UTC (permalink / raw)
To: yuxin.ye, linux-rt-users, peterz, mingo, will, boqun.feng; +Cc: linux-kernel
On 7/20/22 03:28, yuxin.ye wrote:
> before rt_mutex_adjust_prio_chain(),unlock lock->wait_lock will cause
> BUG_ON at kernel/locking/rtmutex.c:1331:
The current upstream kernel/locking/rtmutex.c has no BUG_ON() call.
Which version of the kernel are you using?
Cheers,
Longman
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331!
2022-07-21 2:25 ` Waiman Long
@ 2022-07-21 7:17 ` yuxin.ye
2022-07-21 18:14 ` Waiman Long
0 siblings, 1 reply; 5+ messages in thread
From: yuxin.ye @ 2022-07-21 7:17 UTC (permalink / raw)
To: Waiman Long, linux-rt-users, peterz, mingo, will, boqun.feng; +Cc: linux-kernel
On Wed, Jul 20, 2022 at 10:25:17PM -0400, Waiman Long wrote:
> On 7/20/22 03:28, yuxin.ye wrote:
> > before rt_mutex_adjust_prio_chain(),unlock lock->wait_lock will cause
> > BUG_ON at kernel/locking/rtmutex.c:1331:
>
> The current upstream kernel/locking/rtmutex.c has no BUG_ON() call. Which
> version of the kernel are you using?
>
> Cheers,
> Longman
>
The Linux version is 5.10.
The upstream has indeed removed the BUG_ON, But in rt_mutex_adjust_prio_chain()
it is still possible to have a thread is blocked by two locks. Can this situation
be ignored without BUG_ON?
Thanks.
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331!
2022-07-21 7:17 ` yuxin.ye
@ 2022-07-21 18:14 ` Waiman Long
2022-07-22 8:01 ` yuxin.ye
0 siblings, 1 reply; 5+ messages in thread
From: Waiman Long @ 2022-07-21 18:14 UTC (permalink / raw)
To: yuxin.ye, linux-rt-users, peterz, mingo, will, boqun.feng; +Cc: linux-kernel
On 7/21/22 03:17, yuxin.ye wrote:
> On Wed, Jul 20, 2022 at 10:25:17PM -0400, Waiman Long wrote:
>> On 7/20/22 03:28, yuxin.ye wrote:
>>> before rt_mutex_adjust_prio_chain(),unlock lock->wait_lock will cause
>>> BUG_ON at kernel/locking/rtmutex.c:1331:
>> The current upstream kernel/locking/rtmutex.c has no BUG_ON() call. Which
>> version of the kernel are you using?
>>
>> Cheers,
>> Longman
>>
> The Linux version is 5.10.
> The upstream has indeed removed the BUG_ON, But in rt_mutex_adjust_prio_chain()
> it is still possible to have a thread is blocked by two locks. Can this situation
> be ignored without BUG_ON?
No. However, we don't remove the lock like what you do with your patch.
It will corrupt the data if multiple CPUs are allowed to run
rt_mutex_adjust_prio_chain() for the same rt_mutex simultaneously. You
need to find a way to fix the underlying problem.
BTW, I still can't see a BUG_ON at line 1331 of rtmutex.c with a v5.10
kernel. Does your source tree have some out-of-tree patches that
modifies rtmutex?
Cheers,
Longman
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331!
2022-07-21 18:14 ` Waiman Long
@ 2022-07-22 8:01 ` yuxin.ye
0 siblings, 0 replies; 5+ messages in thread
From: yuxin.ye @ 2022-07-22 8:01 UTC (permalink / raw)
To: Waiman Long, linux-rt-users, peterz, mingo, will, boqun.feng; +Cc: linux-kernel
[-- Attachment #1: Type: text/plain, Size: 1903 bytes --]
On Thu, Jul 21, 2022 at 02:14:16PM -0400, Waiman Long wrote:
>
> On 7/21/22 03:17, yuxin.ye wrote:
> > On Wed, Jul 20, 2022 at 10:25:17PM -0400, Waiman Long wrote:
> > > On 7/20/22 03:28, yuxin.ye wrote:
> > > > before rt_mutex_adjust_prio_chain(),unlock lock->wait_lock will cause
> > > > BUG_ON at kernel/locking/rtmutex.c:1331:
> > > The current upstream kernel/locking/rtmutex.c has no BUG_ON() call. Which
> > > version of the kernel are you using?
> > >
> > > Cheers,
> > > Longman
> > >
> > The Linux version is 5.10.
> > The upstream has indeed removed the BUG_ON, But in rt_mutex_adjust_prio_chain()
> > it is still possible to have a thread is blocked by two locks. Can this situation
> > be ignored without BUG_ON?
>
> No. However, we don't remove the lock like what you do with your patch. It
> will corrupt the data if multiple CPUs are allowed to run
> rt_mutex_adjust_prio_chain() for the same rt_mutex simultaneously. You need
> to find a way to fix the underlying problem.
>
> BTW, I still can't see a BUG_ON at line 1331 of rtmutex.c with a v5.10
> kernel. Does your source tree have some out-of-tree patches that modifies
> rtmutex?
>
> Cheers,
> Longman
>
Yes, I'm sorry I overlooked that earlier. We applied the RT patch,and
the BUG_ON are also introduced by these patches.
Back to the question, I think remove the wait_lock unlock before
rt_mutex_adjust_prio_chain() is more likely to protect some data. The
commont on task_blocks_on_rt_mutex() indicates that must be called with
wait_lock held, but it unlock before call rt_mutex_adjust_prio_chain().
This may cause the owner thread to unlock the orig_lock and exit the
thead. Finally, when calling put_task_struct(owner) in
rt_mutex_adjust_prio_chain(), the thread is blocked by another lock that
is deeply hidden.
Actully, I'm not sure why rt_mutex_adjust_prio_chain()
dosen't need wait_lock protection.
Thanks again.
[-- Attachment #2: 0162-locking-rtmutex-Handle-the-various-new-futex-race-co.patch --]
[-- Type: text/x-diff, Size: 8887 bytes --]
From 785a93a55536c33b674ef5d794f545a9d333852e Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx@linutronix.de>
Date: Fri, 10 Jun 2011 11:04:15 +0200
Subject: [PATCH 162/304] locking/rtmutex: Handle the various new futex race
conditions
RT opens a few new interesting race conditions in the rtmutex/futex
combo due to futex hash bucket lock being a 'sleeping' spinlock and
therefor not disabling preemption.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
kernel/futex.c | 77 ++++++++++++++++++++++++++-------
kernel/locking/rtmutex.c | 36 ++++++++++++---
kernel/locking/rtmutex_common.h | 2 +
3 files changed, 94 insertions(+), 21 deletions(-)
diff --git a/kernel/futex.c b/kernel/futex.c
index 98a6e1b80bfe..c2fe58c34409 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2155,6 +2155,16 @@ static int futex_requeue(u32 __user *uaddr1, unsigned int flags,
*/
requeue_pi_wake_futex(this, &key2, hb2);
continue;
+ } else if (ret == -EAGAIN) {
+ /*
+ * Waiter was woken by timeout or
+ * signal and has set pi_blocked_on to
+ * PI_WAKEUP_INPROGRESS before we
+ * tried to enqueue it on the rtmutex.
+ */
+ this->pi_state = NULL;
+ put_pi_state(pi_state);
+ continue;
} else if (ret) {
/*
* rt_mutex_start_proxy_lock() detected a
@@ -3172,7 +3182,7 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
{
struct hrtimer_sleeper timeout, *to;
struct rt_mutex_waiter rt_waiter;
- struct futex_hash_bucket *hb;
+ struct futex_hash_bucket *hb, *hb2;
union futex_key key2 = FUTEX_KEY_INIT;
struct futex_q q = futex_q_init;
int res, ret;
@@ -3224,20 +3234,55 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
/* Queue the futex_q, drop the hb lock, wait for wakeup. */
futex_wait_queue_me(hb, &q, to);
- spin_lock(&hb->lock);
- ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
- spin_unlock(&hb->lock);
- if (ret)
- goto out;
+ /*
+ * On RT we must avoid races with requeue and trying to block
+ * on two mutexes (hb->lock and uaddr2's rtmutex) by
+ * serializing access to pi_blocked_on with pi_lock.
+ */
+ raw_spin_lock_irq(¤t->pi_lock);
+ if (current->pi_blocked_on) {
+ /*
+ * We have been requeued or are in the process of
+ * being requeued.
+ */
+ raw_spin_unlock_irq(¤t->pi_lock);
+ } else {
+ /*
+ * Setting pi_blocked_on to PI_WAKEUP_INPROGRESS
+ * prevents a concurrent requeue from moving us to the
+ * uaddr2 rtmutex. After that we can safely acquire
+ * (and possibly block on) hb->lock.
+ */
+ current->pi_blocked_on = PI_WAKEUP_INPROGRESS;
+ raw_spin_unlock_irq(¤t->pi_lock);
+
+ spin_lock(&hb->lock);
+
+ /*
+ * Clean up pi_blocked_on. We might leak it otherwise
+ * when we succeeded with the hb->lock in the fast
+ * path.
+ */
+ raw_spin_lock_irq(¤t->pi_lock);
+ current->pi_blocked_on = NULL;
+ raw_spin_unlock_irq(¤t->pi_lock);
+
+ ret = handle_early_requeue_pi_wakeup(hb, &q, &key2, to);
+ spin_unlock(&hb->lock);
+ if (ret)
+ goto out;
+ }
/*
- * In order for us to be here, we know our q.key == key2, and since
- * we took the hb->lock above, we also know that futex_requeue() has
- * completed and we no longer have to concern ourselves with a wakeup
- * race with the atomic proxy lock acquisition by the requeue code. The
- * futex_requeue dropped our key1 reference and incremented our key2
- * reference count.
+ * In order to be here, we have either been requeued, are in
+ * the process of being requeued, or requeue successfully
+ * acquired uaddr2 on our behalf. If pi_blocked_on was
+ * non-null above, we may be racing with a requeue. Do not
+ * rely on q->lock_ptr to be hb2->lock until after blocking on
+ * hb->lock or hb2->lock. The futex_requeue dropped our key1
+ * reference and incremented our key2 reference count.
*/
+ hb2 = hash_futex(&key2);
/* Check if the requeue code acquired the second futex for us. */
if (!q.rt_waiter) {
@@ -3246,14 +3291,15 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
* did a lock-steal - fix up the PI-state in that case.
*/
if (q.pi_state && (q.pi_state->owner != current)) {
- spin_lock(q.lock_ptr);
+ spin_lock(&hb2->lock);
+ BUG_ON(&hb2->lock != q.lock_ptr);
ret = fixup_pi_state_owner(uaddr2, &q, current);
/*
* Drop the reference to the pi state which
* the requeue_pi() code acquired for us.
*/
put_pi_state(q.pi_state);
- spin_unlock(q.lock_ptr);
+ spin_unlock(&hb2->lock);
/*
* Adjust the return value. It's either -EFAULT or
* success (1) but the caller expects 0 for success.
@@ -3272,7 +3318,8 @@ static int futex_wait_requeue_pi(u32 __user *uaddr, unsigned int flags,
pi_mutex = &q.pi_state->pi_mutex;
ret = rt_mutex_wait_proxy_lock(pi_mutex, to, &rt_waiter);
- spin_lock(q.lock_ptr);
+ spin_lock(&hb2->lock);
+ BUG_ON(&hb2->lock != q.lock_ptr);
if (ret && !rt_mutex_cleanup_proxy_lock(pi_mutex, &rt_waiter))
ret = 0;
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 1d0e9bf0487a..97a5fb19119d 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -136,6 +136,11 @@ static void fixup_rt_mutex_waiters(struct rt_mutex *lock)
WRITE_ONCE(*p, owner & ~RT_MUTEX_HAS_WAITERS);
}
+static int rt_mutex_real_waiter(struct rt_mutex_waiter *waiter)
+{
+ return waiter && waiter != PI_WAKEUP_INPROGRESS;
+}
+
/*
* We can speed up the acquire/release, if there's no debugging state to be
* set up.
@@ -378,7 +383,8 @@ int max_lock_depth = 1024;
static inline struct rt_mutex *task_blocked_on_lock(struct task_struct *p)
{
- return p->pi_blocked_on ? p->pi_blocked_on->lock : NULL;
+ return rt_mutex_real_waiter(p->pi_blocked_on) ?
+ p->pi_blocked_on->lock : NULL;
}
/*
@@ -514,7 +520,7 @@ static int rt_mutex_adjust_prio_chain(struct task_struct *task,
* reached or the state of the chain has changed while we
* dropped the locks.
*/
- if (!waiter)
+ if (!rt_mutex_real_waiter(waiter))
goto out_unlock_pi;
/*
@@ -947,6 +953,22 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
return -EDEADLK;
raw_spin_lock(&task->pi_lock);
+ /*
+ * In the case of futex requeue PI, this will be a proxy
+ * lock. The task will wake unaware that it is enqueueed on
+ * this lock. Avoid blocking on two locks and corrupting
+ * pi_blocked_on via the PI_WAKEUP_INPROGRESS
+ * flag. futex_wait_requeue_pi() sets this when it wakes up
+ * before requeue (due to a signal or timeout). Do not enqueue
+ * the task if PI_WAKEUP_INPROGRESS is set.
+ */
+ if (task != current && task->pi_blocked_on == PI_WAKEUP_INPROGRESS) {
+ raw_spin_unlock(&task->pi_lock);
+ return -EAGAIN;
+ }
+
+ BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on));
+
waiter->task = task;
waiter->lock = lock;
waiter->prio = task->prio;
@@ -970,7 +992,7 @@ static int task_blocks_on_rt_mutex(struct rt_mutex *lock,
rt_mutex_enqueue_pi(owner, waiter);
rt_mutex_adjust_prio(owner);
- if (owner->pi_blocked_on)
+ if (rt_mutex_real_waiter(owner->pi_blocked_on))
chain_walk = 1;
} else if (rt_mutex_cond_detect_deadlock(waiter, chwalk)) {
chain_walk = 1;
@@ -1066,7 +1088,7 @@ static void remove_waiter(struct rt_mutex *lock,
{
bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
struct task_struct *owner = rt_mutex_owner(lock);
- struct rt_mutex *next_lock;
+ struct rt_mutex *next_lock = NULL;
lockdep_assert_held(&lock->wait_lock);
@@ -1092,7 +1114,8 @@ static void remove_waiter(struct rt_mutex *lock,
rt_mutex_adjust_prio(owner);
/* Store the lock on which owner is blocked or NULL */
- next_lock = task_blocked_on_lock(owner);
+ if (rt_mutex_real_waiter(owner->pi_blocked_on))
+ next_lock = task_blocked_on_lock(owner);
raw_spin_unlock(&owner->pi_lock);
@@ -1128,7 +1151,8 @@ void rt_mutex_adjust_pi(struct task_struct *task)
raw_spin_lock_irqsave(&task->pi_lock, flags);
waiter = task->pi_blocked_on;
- if (!waiter || rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
+ if (!rt_mutex_real_waiter(waiter) ||
+ rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
raw_spin_unlock_irqrestore(&task->pi_lock, flags);
return;
}
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index b1455dc2366f..096b16cfb096 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -125,6 +125,8 @@ enum rtmutex_chainwalk {
/*
* PI-futex support (proxy locking functions, etc.):
*/
+#define PI_WAKEUP_INPROGRESS ((struct rt_mutex_waiter *) 1)
+
extern struct task_struct *rt_mutex_next_owner(struct rt_mutex *lock);
extern void rt_mutex_init_proxy_locked(struct rt_mutex *lock,
struct task_struct *proxy_owner);
--
2.32.0
^ permalink raw reply related [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-07-22 8:02 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-20 7:28 [PATCH RFC] rtmutex: Fix BUG_ON at kernel/locking/rtmutex.c:1331! yuxin.ye
2022-07-21 2:25 ` Waiman Long
2022-07-21 7:17 ` yuxin.ye
2022-07-21 18:14 ` Waiman Long
2022-07-22 8:01 ` yuxin.ye
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).