* [ANNOUNCE] v4.11.5-rt1 @ 2017-06-16 10:56 Sebastian Andrzej Siewior 2017-06-17 8:14 ` Mike Galbraith 2017-06-18 17:01 ` [patch-rt] rtmutex: Fix lock stealing logic Mike Galbraith 0 siblings, 2 replies; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-16 10:56 UTC (permalink / raw) To: Thomas Gleixner; +Cc: LKML, linux-rt-users, Steven Rostedt Dear RT folks! I'm pleased to announce the v4.11.5-rt1 patch set. The release has been delayed due to the hotplug rework that was started before the final v4.11 release. However the new code has not been stabilized yet and it was decided to bring back the old patches before delaying the v4.11-RT release any longer. We will try to complete the hotplug rework (for RT) in the v4.11 cycle. Changes since v4.9.39-rt21: - rebase to v4.11.5 Known issues - CPU hotplug got a little better but can deadlock. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git v4.11.5-rt1 The RT patch against v4.11.5 can be found here: https://cdn.kernel.org/pub/linux/kernel/projects/rt/4.11/older/patch-4.11.5-rt1.patch.xz The split quilt queue is available at: https://cdn.kernel.org/pub/linux/kernel/projects/rt/4.11/older/patches-4.11.5-rt1.tar.xz Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-16 10:56 [ANNOUNCE] v4.11.5-rt1 Sebastian Andrzej Siewior @ 2017-06-17 8:14 ` Mike Galbraith 2017-06-18 6:46 ` Mike Galbraith 2017-06-19 8:52 ` Sebastian Andrzej Siewior 2017-06-18 17:01 ` [patch-rt] rtmutex: Fix lock stealing logic Mike Galbraith 1 sibling, 2 replies; 32+ messages in thread From: Mike Galbraith @ 2017-06-17 8:14 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Thomas Gleixner Cc: LKML, linux-rt-users, Steven Rostedt On Fri, 2017-06-16 at 12:56 +0200, Sebastian Andrzej Siewior wrote: > Dear RT folks! > > I'm pleased to announce the v4.11.5-rt1 patch set. > The release has been delayed due to the hotplug rework that was started > before the final v4.11 release. However the new code has not been > stabilized yet and it was decided to bring back the old patches before > delaying the v4.11-RT release any longer. We will try to complete the > hotplug rework (for RT) in the v4.11 cycle. > > Changes since v4.9.39-rt21: > > - rebase to v4.11.5 Hi Sebastian, I noticed a couple of things wrt migrate_disable() changes... During that rebase, migrate_disable() was changed to no longer map to preempt_disable() for nonrt, but some patches still assume it does. It now depends upon PREEMPT_COUNT; the RT workaround in futex.c induces grumbling in nonrt builds with PREEMPT_COUNT enabled. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-17 8:14 ` Mike Galbraith @ 2017-06-18 6:46 ` Mike Galbraith 2017-06-19 8:52 ` Sebastian Andrzej Siewior 1 sibling, 0 replies; 32+ messages in thread From: Mike Galbraith @ 2017-06-18 6:46 UTC (permalink / raw) To: Sebastian Andrzej Siewior, Thomas Gleixner Cc: LKML, linux-rt-users, Steven Rostedt On Sat, 2017-06-17 at 10:14 +0200, Mike Galbraith wrote: > >... the RT workaround in futex.c induces > grumbling in nonrt builds with PREEMPT_COUNT enabled. A trivial way to fix it up is to... futex: Fix migrate_disable/enable workaround for !PREEMPT_RT_FULL The imbalance fixed by aed0f50e58eb only exists for PREEMPT_RT_FULL, and creates one for other PREEMPT_COUNT configs. Create/use _rt variants of migrate_disable/enable() which are compiled away when not needed. Signed-off-by: Mike Galbraith <efault@gmx.de> Fixes: aed0f50e58eb ("futex: workaround migrate_disable/enable in different context")
---
 include/linux/preempt.h | 11 +++++++++++
 kernel/futex.c          |  8 ++++----
 2 files changed, 15 insertions(+), 4 deletions(-)

--- a/include/linux/preempt.h
+++ b/include/linux/preempt.h
@@ -227,12 +227,21 @@ do { \
 extern void migrate_disable(void);
 extern void migrate_enable(void);
+#ifdef CONFIG_PREEMPT_RT_FULL
+#define migrate_disable_rt()	migrate_disable()
+#define migrate_enable_rt()	migrate_enable()
+#else
+static inline void migrate_disable_rt(void) { }
+static inline void migrate_enable_rt(void) { }
+#endif

 int __migrate_disabled(struct task_struct *p);

 #else
 #define migrate_disable()	barrier()
 #define migrate_enable()	barrier()
+static inline void migrate_disable_rt(void) { }
+static inline void migrate_enable_rt(void) { }

 static inline int __migrate_disabled(struct task_struct *p)
 {
 	return 0;
@@ -323,6 +332,8 @@ do { \
 #define migrate_disable()	barrier()
 #define migrate_enable()	barrier()
+static inline void migrate_disable_rt(void) { }
+static inline void migrate_enable_rt(void) { }

 static inline int __migrate_disabled(struct task_struct *p)
 {
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2690,12 +2690,12 @@ static int futex_lock_pi(u32 __user *uad
 	 * one migrate_disable() pending in the slow-path which is reversed
 	 * after the raw_spin_unlock_irq() where we leave the atomic context.
 	 */
-	migrate_disable();
+	migrate_disable_rt();
 	spin_unlock(q.lock_ptr);
 	ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
 	raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
-	migrate_enable();
+	migrate_enable_rt();
 	if (ret) {
 		if (ret == 1)
@@ -2846,13 +2846,13 @@ static int futex_unlock_pi(u32 __user *u
 	 * won't undo the migrate_disable() which was issued when
 	 * locking hb->lock.
 	 */
-	migrate_disable();
+	migrate_disable_rt();
 	spin_unlock(&hb->lock);
 	/* Drops pi_state->pi_mutex.wait_lock */
 	ret = wake_futex_pi(uaddr, uval, pi_state);
-	migrate_enable();
+	migrate_enable_rt();
 	put_pi_state(pi_state);

^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-17 8:14 ` Mike Galbraith 2017-06-18 6:46 ` Mike Galbraith @ 2017-06-19 8:52 ` Sebastian Andrzej Siewior 2017-06-19 10:14 ` Mike Galbraith 1 sibling, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 8:52 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-17 10:14:37 [+0200], Mike Galbraith wrote: > Hi Sebastian, Hi Mike, > > I noticed a couple things wrt migrate_disable() changes... > > During that rebase, migrate_disable() was changed to no longer map to > preempt_disable() for nonrt, but some patches still assume it does. It > now depends upon PREEMPT_COUNT, the RT workaround in futex.c induces > grumbling in nonrt builds with PREEMPT_COUNT enabled. argh, right. It was planned to get it merged upstream but due to $reasons we never got that far. For that reason I would simply revert that change and let migrate_disable() map to preempt_disable() as it did earlier. > -Mike Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 8:52 ` Sebastian Andrzej Siewior @ 2017-06-19 10:14 ` Mike Galbraith 2017-06-19 10:44 ` Sebastian Andrzej Siewior 2017-06-19 14:08 ` Steven Rostedt 0 siblings, 2 replies; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 10:14 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 10:52 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-17 10:14:37 [+0200], Mike Galbraith wrote: > > > During that rebase, migrate_disable() was changed to no longer map to > > preempt_disable() for nonrt, but some patches still assume it does. It > > now depends upon PREEMPT_COUNT, the RT workaround in futex.c induces > > grumbling in nonrt builds with PREEMPT_COUNT enabled. > > argh, right. It was planned to get it merged upstream but due to > $reasons we never got that far. For that reason I would simply revert > that change and let migrate_disable() map to preempt_disable() as it did > earlier. Ok, doesn't matter for RT testing. What does matter, is that...

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 30b24f774198..10e832da70b6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2284,7 +2284,7 @@ EXPORT_SYMBOL(wake_up_process);
  */
 int wake_up_lock_sleeper(struct task_struct *p)
 {
-	return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER);
+	return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER);
 }

...appears to be inducing lost futex wakeups. Scratch that "appears": changing it to TASK_NORMAL just fixed my DL980 running otherwise absolutely pristine 4.9-rt21, after having double verified that rt20 works fine. Now to go back to 4.11/master/tip-rt, make sure that the little bugger really really REALLY ain't fscking with me for the sheer fun of it, futexes being made of pure evil :) My testcase is to run futex_wait -n 4 in a modest sized loop. Odd thing is that it only reproduces on the DL980 if I let it use multiple sockets; pin it to one, and all is peachy (or rather seems to be), whereas on the desktop box the hang is far more intermittent, but there. -Mike ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 10:14 ` Mike Galbraith @ 2017-06-19 10:44 ` Sebastian Andrzej Siewior 2017-06-19 11:31 ` Mike Galbraith 2017-06-19 14:08 ` Steven Rostedt 1 sibling, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 10:44 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-19 12:14:51 [+0200], Mike Galbraith wrote: > Ok, doesn't matter for RT testing. What does matter, is that... > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 30b24f774198..10e832da70b6 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2284,7 +2284,7 @@ EXPORT_SYMBOL(wake_up_process); > */ > int wake_up_lock_sleeper(struct task_struct *p) > { > - return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER); > + return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER); > } > > ...appears to be inducing lost futex wakeups. has this something to do with "rtmutex: Fix lock stealing logic" ? > Scratch that "appears", changing it to TASK_NORMAL just fixed my DL980 > running otherwise absolutely pristine 4.9-rt21, after having double > verified that rt20 works fine. Now to go back to 4.11/master/tip-rt, > make sure that the little bugger really really REALLY ain't fscking > with me for the sheer fun of it, futexes being made of pure evil :) So v4.9-rt20 works fine but -rt21 starts to lose wakeups on DL980 in general or just with "futex_wait -n 4" ? > My testcase is to run futex_wait -n 4 in a modest sized loop. Odd > thing is that it only reproduces on the DL980 if I let it use multiple > sockets, pin it to one, and all is peachy, (rather seems to be given) > whereas on desktop box, the hang is far more intermittent, but there. do I parse it right, as v4.9-rt21 (without the change above) works with the testcase mentioned if you pin it to one socket but does not work if you let it use multiple sockets. And your desktop box hangs no matter what? 
> -Mike Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 10:44 ` Sebastian Andrzej Siewior @ 2017-06-19 11:31 ` Mike Galbraith 2017-06-19 11:50 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 11:31 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 12:44 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-19 12:14:51 [+0200], Mike Galbraith wrote: > > Ok, doesn't matter for RT testing. What does matter, is that... > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index 30b24f774198..10e832da70b6 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -2284,7 +2284,7 @@ EXPORT_SYMBOL(wake_up_process); > > */ > > int wake_up_lock_sleeper(struct task_struct *p) > > { > > - return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER); > > + return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER); > > } > > > > ...appears to be inducing lost futex wakeups. > > has this something to do with "rtmutex: Fix lock stealing logic" ? Nope. The above is fallout of me being inspired to stare, that inspiration having come initially from seeing lost wakeup symptoms in my desktop, telling me something has gone sour in rt-land, so a hunting I did go. I expected to find I had made a booboo in my trees, but maybe not, as I found a suspiciously similar symptom to what I was looking for in virgin source. > > Scratch that "appears", changing it to TASK_NORMAL just fixed my DL980 > > running otherwise absolutely pristine 4.9-rt21, after having double > > verified that rt20 works fine. Now to go back to 4.11/master/tip-rt, > > make sure that the little bugger really really REALLY ain't fscking > > with me for the sheer fun of it, futexes being made of pure evil :) > > So v4.9-rt20 works fine but -rt21 starts to lose wakeups on DL980 in > general or just with "futex_wait -n 4" ? 
-rt20 is verified to work fine, -rt21 starts hanging with futextest. The futex_wait -n 4 testcase was distilled out of seeing the full futextest/run.sh hanging. The only symptom I've _seen_ on the DL980 is futextest hanging. On the desktop, I've seen more, and may still, I'll know when I see or don't see desktop gizmos occasionally go comatose. > > My testcase is to run futex_wait -n 4 in a modest sized loop. Odd > > thing is that it only reproduces on the DL980 if I let it use multiple > > sockets, pin it to one, and all is peachy, (rather seems to be given) > > whereas on desktop box, the hang is far more intermittent, but there. > > do I parse it right, as v4.9-rt21 (without the change above) works with > the testcase mentioned if you pin it to one socket but does not work if > you let it use multiple sockets. > And your desktop box hangs no matter what? No no, desktop box will reproduce, but not nearly as reliably as the 8 socket box does, but yes, it seems to work fine on the DL980 when pinned to one socket. I was testing 4.9-rt because hunt was in progress when 4.11-rt was born. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 11:31 ` Mike Galbraith @ 2017-06-19 11:50 ` Sebastian Andrzej Siewior 2017-06-19 12:55 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 11:50 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-19 13:31:28 [+0200], Mike Galbraith wrote: > > > Scratch that "appears", changing it to TASK_NORMAL just fixed my DL980 > > > running otherwise absolutely pristine 4.9-rt21, after having double > > > verified that rt20 works fine. Now to go back to 4.11/master/tip-rt, > > > make sure that the little bugger really really REALLY ain't fscking > > > with me for the sheer fun of it, futexes being made of pure evil :) > > > > So v4.9-rt20 works fine but -rt21 starts to lose wakeups on DL980 in > > general or just with "futex_wait -n 4" ? > > -rt20 is verified to work fine, -rt21 starts hanging with futextest. > The futex_wait -n 4 testcase was distilled out of seeing the full > futextest/run.sh hanging. The only symptom I've _seen_ on the DL980 is > futextest hanging. On the desktop, I've seen more, and may still, I'll > know when I see or don't see desktop gizmos occasionally go comatose. rt20…rt21 is just https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/sched-Prevent-task-state-corruption-by-spurious-lock.patch?h=linux-4.9.y-rt-patches Let me verify that here and maybe fire up the four-socket box. > > > My testcase is to run futex_wait -n 4 in a modest sized loop. Odd > > > thing is that it only reproduces on the DL980 if I let it use multiple > > > sockets, pin it to one, and all is peachy, (rather seems to be given) > > > whereas on desktop box, the hang is far more intermittent, but there. > > > > do I parse it right, as v4.9-rt21 (without the change above) works with > > the testcase mentioned if you pin it to one socket but does not work if > > you let it use multiple sockets. 
> > And your desktop box hangs no matter what? > > No no, desktop box will reproduce, but not nearly as reliably as the 8 > socket box does, but yes, it seems to work fine on the DL980 when > pinned to one socket. I was testing 4.9-rt because hunt was in > progress when 4.11-rt was born. Let me try futex-wait test on more boxes then… > -Mike Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 11:50 ` Sebastian Andrzej Siewior @ 2017-06-19 12:55 ` Mike Galbraith 2017-06-19 14:06 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 12:55 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 13:50 +0200, Sebastian Andrzej Siewior wrote: > > rt20…rt21 is just > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/sched-Prevent-task-state-corruption-by-spurious-lock.patch?h=linux-4.9.y-rt-patches Yup. I got there via git checkout v4.9.x-rtx, build/test. > Let me verify that here and fire maybe the four socket box. One thing perhaps interesting about the DL980 is that it is 8-socket, but with 7 memoryless nodes (poor thing). I wouldn't be surprised if you had to try a lot harder than I. DL980 now seems happy with master-rt21 fwiw. It's not virgin though, not even considering continuous merge 4.9..today. Hopefully desktop will cease with the comatose widget business too, I really don't want to go there. These things give tglx headaches, might well blow the point clean off of my head. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 12:55 ` Mike Galbraith @ 2017-06-19 14:06 ` Sebastian Andrzej Siewior 2017-06-19 14:36 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 14:06 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-19 14:55:35 [+0200], Mike Galbraith wrote: > On Mon, 2017-06-19 at 13:50 +0200, Sebastian Andrzej Siewior wrote: > > > > rt20…rt21 is just > > https://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git/tree/patches/sched-Prevent-task-state-corruption-by-spurious-lock.patch?h=linux-4.9.y-rt-patches > > Yup. I got there via git checkout v4.9.x-rtx, build/test. > > > Let me verify that here and fire maybe the four socket box. > > One thing perhaps interesting about the DL980 is that it is 8 socket, > but with 7 memoryless nodes (poor thing). I wouldn't be surprised if > you had to try a lot harder than I. > DL980 now seems happy with master-rt21 fwtw. It's not virgin though, > not even considering continuous merge 4.9..today. > > Hopefully desktop will cease with the comatose widget business too, I > really don't want to go there. These things give tglx headaches, might > well blow the point clean off of my head. I am surprised that your desktop shows any symptoms on rt21. I tried my smaller AMD box (A10), an Intel two-socket box and a four-socket box. Each of them was fine with the run.sh and manual futex_wait invocation. Could you please send the config of your desktop box? > -Mike Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 14:06 ` Sebastian Andrzej Siewior @ 2017-06-19 14:36 ` Mike Galbraith 2017-06-19 15:03 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 14:36 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 16:06 +0200, Sebastian Andrzej Siewior wrote: > > I am suppressed that your desktop shows any symptoms on rt21. I tried my > smaller AMD box (A10), an Intel two sockets and a four socket box. Each > of them was fine with the run.sh and manual futex_wait invocation. > Could you please send the config of your desktop box? (sent offline) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 14:36 ` Mike Galbraith @ 2017-06-19 15:03 ` Sebastian Andrzej Siewior 2017-06-19 16:14 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 15:03 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-19 16:36:22 [+0200], Mike Galbraith wrote: > On Mon, 2017-06-19 at 16:06 +0200, Sebastian Andrzej Siewior wrote: > > > > I am suppressed that your desktop shows any symptoms on rt21. I tried my > > smaller AMD box (A10), an Intel two sockets and a four socket box. Each > > of them was fine with the run.sh and manual futex_wait invocation. > > Could you please send the config of your desktop box? > > (sent offline) btw: I tried booting the two- and four-socket boxes with less memory so that the box is using just one memory node, but nothing changed. Let me try with your config. Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 15:03 ` Sebastian Andrzej Siewior @ 2017-06-19 16:14 ` Mike Galbraith 2017-06-19 16:27 ` Sebastian Andrzej Siewior 0 siblings, 1 reply; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 16:14 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 17:03 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-19 16:36:22 [+0200], Mike Galbraith wrote: > > On Mon, 2017-06-19 at 16:06 +0200, Sebastian Andrzej Siewior wrote: > > > > > > I am suppressed that your desktop shows any symptoms on rt21. I tried my > > > smaller AMD box (A10), an Intel two sockets and a four socket box. Each > > > of them was fine with the run.sh and manual futex_wait invocation. > > > Could you please send the config of your desktop box? > > > > (sent offline) > > btw: I tried booting the two and four sockets box with less memory so > that the box is using just one memory node but nothing changed. Let me > try with the config of yours. BTW back, I reran virgin 4.9-rt21 on desktop box while off doing the have something resembling a life thing, and it did not stall in 50 iterations of performance/run.sh (bloody fickle thing). Hohum, take it for whatever you think it's worth. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 16:14 ` Mike Galbraith @ 2017-06-19 16:27 ` Sebastian Andrzej Siewior 2017-06-19 16:46 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 16:27 UTC (permalink / raw) To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On 2017-06-19 18:14:52 [+0200], Mike Galbraith wrote: > > BTW back, I reran virgin 4.9-rt21 on desktop box while off doing the > have something resembling a life thing, and it did not stall in 50 > iterations of performance/run.sh (bloody fickle thing). Hohum, take it > for whatever you think it's worth. that sounds good, because I tried your config on three boxes with the same result. So the issue remains on the DLxxx box, which is an 8-socket NUMA box with only one memory node, right? I'll let run.sh run in a loop until tomorrow evening and check the results… > -Mike Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 16:27 ` Sebastian Andrzej Siewior @ 2017-06-19 16:46 ` Mike Galbraith 0 siblings, 0 replies; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 16:46 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt On Mon, 2017-06-19 at 18:27 +0200, Sebastian Andrzej Siewior wrote: > On 2017-06-19 18:14:52 [+0200], Mike Galbraith wrote: > > > > BTW back, I reran virgin 4.9-rt21 on desktop box while off doing the > > have something resembling a life thing, and it did not stall in 50 > > iterations of performance/run.sh (bloody fickle thing). Hohum, take it > > for whatever you think it's worth. > > that sounds good because I tried your config on three boxes with the > same result. > So the issue remains on the DLxxx box which is 8socket NUMA box with > only one memory node, right? Yes. Now _I_ think (having spent days chasing) that it is in fact generic, but yes, all else aside, my highly utilized and very reliable up until now (and crippled) DL980 G7 very clearly states that we have a bug lurking. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 10:14 ` Mike Galbraith 2017-06-19 10:44 ` Sebastian Andrzej Siewior @ 2017-06-19 14:08 ` Steven Rostedt 2017-06-19 14:13 ` Sebastian Andrzej Siewior 1 sibling, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2017-06-19 14:08 UTC (permalink / raw) To: Mike Galbraith Cc: Sebastian Andrzej Siewior, Thomas Gleixner, LKML, linux-rt-users On Mon, 19 Jun 2017 12:14:51 +0200 Mike Galbraith <efault@gmx.de> wrote: > On Mon, 2017-06-19 at 10:52 +0200, Sebastian Andrzej Siewior wrote: > > On 2017-06-17 10:14:37 [+0200], Mike Galbraith wrote: > > > > > During that rebase, migrate_disable() was changed to no longer map to > > > preempt_disable() for nonrt, but some patches still assume it does. It > > > now depends upon PREEMPT_COUNT, the RT workaround in futex.c induces > > > grumbling in nonrt builds with PREEMPT_COUNT enabled. > > > > argh, right. It was planned to get it merged upstream but due to > > $reasons we never got that far. For that reason I would simply revert > > that change and let migrate_disable() map to preempt_disable() as it did > > earlier. > > Ok, doesn't matter for RT testing. What does matter, is that... > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index 30b24f774198..10e832da70b6 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -2284,7 +2284,7 @@ EXPORT_SYMBOL(wake_up_process); > */ > int wake_up_lock_sleeper(struct task_struct *p) > { > - return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER); > + return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER); > } > > ...appears to be inducing lost futex wakeups. Hmm, it shouldn't affect futexes, as it's only called by rtmutex when waiter->savestate is true. And that should always be false for futex. -- Steve > > Scratch that "appears", changing it to TASK_NORMAL just fixed my DL980 > running otherwise absolutely pristine 4.9-rt21, after having double > verified that rt20 works fine. 
Now to go back to 4.11/master/tip-rt, > make sure that the little bugger really really REALLY ain't fscking > with me for the sheer fun of it, futexes being made of pure evil :) > > My testcase is to run futex_wait -n 4 in a modest sized loop. Odd > thing is that it only reproduces on the DL980 if I let it use multiple > sockets, pin it to one, and all is peachy, (rather seems to be given) > whereas on desktop box, the hang is far more intermittent, but there. > > -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 14:08 ` Steven Rostedt @ 2017-06-19 14:13 ` Sebastian Andrzej Siewior 2017-06-19 14:41 ` Steven Rostedt 0 siblings, 1 reply; 32+ messages in thread From: Sebastian Andrzej Siewior @ 2017-06-19 14:13 UTC (permalink / raw) To: Steven Rostedt; +Cc: Mike Galbraith, Thomas Gleixner, LKML, linux-rt-users On 2017-06-19 10:08:38 [-0400], Steven Rostedt wrote: > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index 30b24f774198..10e832da70b6 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -2284,7 +2284,7 @@ EXPORT_SYMBOL(wake_up_process); > > */ > > int wake_up_lock_sleeper(struct task_struct *p) > > { > > - return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER); > > + return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER); > > } > > > > ...appears to be inducing lost futex wakeups. > > Hmm, it shouldn't affect futexes, as it's only called by rtmutex when > waiter->savestate is true. And that should always be false for futex. you still have sleeping locks like the hb-lock (which might matter in case the task blocks on the lock and does not continue for some reason). > -- Steve Sebastian ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 14:13 ` Sebastian Andrzej Siewior @ 2017-06-19 14:41 ` Steven Rostedt 2017-06-19 16:29 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2017-06-19 14:41 UTC (permalink / raw) To: Sebastian Andrzej Siewior Cc: Mike Galbraith, Thomas Gleixner, LKML, linux-rt-users On Mon, 19 Jun 2017 16:13:41 +0200 Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > > Hmm, it shouldn't affect futexes, as it's only called by rtmutex when > > waiter->savestate is true. And that should always be false for futex. > > you still have sleeping locks like the hb-lock (which might matter in > case the task blocks on the lock and does not continue for some reason). But then I'd expect this to be an issue with any spinlock in the kernel. -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1 2017-06-19 14:41 ` Steven Rostedt @ 2017-06-19 16:29 ` Mike Galbraith 2017-06-20 7:45 ` Mike Galbraith 0 siblings, 1 reply; 32+ messages in thread From: Mike Galbraith @ 2017-06-19 16:29 UTC (permalink / raw) To: Steven Rostedt, Sebastian Andrzej Siewior Cc: Thomas Gleixner, LKML, linux-rt-users On Mon, 2017-06-19 at 10:41 -0400, Steven Rostedt wrote: > On Mon, 19 Jun 2017 16:13:41 +0200 > Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote: > > > > > Hmm, it shouldn't affect futexes, as it's only called by rtmutex when > > > waiter->savestate is true. And that should always be false for futex. > > > > you still have sleeping locks like the hb-lock (which might matter in > > case the task blocks on the lock and does not continue for some reason). > > But then I'd expect this to be an issue with any spinlock in the kernel. Thing to do is set a trap for the bugger. I'll give that a go after I get some beans/biscuits work done. For the nonce, what I've reported is all I know. -Mike ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-19 16:29           ` Mike Galbraith
@ 2017-06-20  7:45             ` Mike Galbraith
  2017-06-22 16:34               ` Sebastian Andrzej Siewior
  0 siblings, 1 reply; 32+ messages in thread
From: Mike Galbraith @ 2017-06-20 07:45 UTC (permalink / raw)
To: Steven Rostedt, Sebastian Andrzej Siewior
Cc: Thomas Gleixner, LKML, linux-rt-users

On Mon, 2017-06-19 at 18:29 +0200, Mike Galbraith wrote:
> On Mon, 2017-06-19 at 10:41 -0400, Steven Rostedt wrote:
> > On Mon, 19 Jun 2017 16:13:41 +0200
> > Sebastian Andrzej Siewior <bigeasy@linutronix.de> wrote:
> >
> > > > Hmm, it shouldn't affect futexes, as it's only called by rtmutex when
> > > > waiter->savestate is true. And that should always be false for futex.
> > >
> > > you still have sleeping locks like the hb-lock (which might matter in
> > > case the task blocks on the lock and does not continue for some reason).
> >
> > But then I'd expect this to be an issue with any spinlock in the kernel.
>
> Thing to do is set a trap for the bugger.

See ! and ?

vogelweide:~/:[1]# grep MIKE trace
MIKE                      913.728104: start
MIKE  futex_wait-7496  [031] .......   913.729160: sched_process_fork: comm=futex_wait pid=7496 child_comm=futex_wait child_pid=7497
MIKE  futex_wait-7496  [031] .......   913.729253: sched_process_fork: comm=futex_wait pid=7496 child_comm=futex_wait child_pid=7499
MIKE  futex_wait-7496  [031] .......   913.729307: sched_process_fork: comm=futex_wait pid=7496 child_comm=futex_wait child_pid=7500
MIKE  futex_wait-7496  [031] .......   913.729348: sched_process_fork: comm=futex_wait pid=7496 child_comm=futex_wait child_pid=7501
MIKE  futex_wait-7496  [031] d...2..   913.729430: sched_switch: prev_comm=futex_wait prev_pid=7496 prev_prio=120 prev_state=S ==> next_comm=swapper/31 next_pid=0 next_prio=120
MIKE--futex_wait-7500  [005] d...2..   920.040320: sched_switch: prev_comm=futex_wait prev_pid=7500 prev_prio=120 prev_state=S ==> next_comm=swapper/5 next_pid=0 next_prio=120
MIKE  futex_wait-7497  [058] d...211   920.060009: sched_waking: comm=futex_wait pid=7501 prio=120 target_cpu=044
MIKE  futex_wait-7497  [058] d...311   920.060012: sched_wake_idle_without_ipi: cpu=44
MIKE  futex_wait-7497  [058] d...311   920.060012: sched_wakeup: comm=futex_wait pid=7501 prio=120 target_cpu=044
MIKE--futex_wait-7497  [058] d...2..   920.060035: sched_switch: prev_comm=futex_wait prev_pid=7497 prev_prio=120 prev_state=S ==> next_comm=swapper/58 next_pid=0 next_prio=120
MIKE  futex_wait-7499  [043] dN..411   920.060035: sched_wakeup: comm=ktimersoftd/43 pid=438 prio=0 target_cpu=043
MIKE! futex_wait-7499  [043] .N..111   920.060037: wake_up_lock_sleeper: futex_wait:7497 state:1 saved_state:0
MIKE! futex_wait-7499  [043] dN..211   920.060037: try_to_wake_up: futex_wait:7497 WF_LOCK_SLEEPER !(p->state:1 & state:2) bailing
MIKE                      920.060037: futex_wait-7497 is never awakened again until ^C cleanup, 7499/7501 still standing
MIKE  futex_wait-7499  [043] ....111   920.060066: wake_up_lock_sleeper: futex_wait:7501 state:2 saved_state:0
MIKE  futex_wait-7499  [043] d...211   920.060066: try_to_wake_up: futex_wait:7501 WF_LOCK_SLEEPER !(p->state:0 & state:2) bailing
MIKE  futex_wait-7499  [043] ....111   920.060606: wake_up_lock_sleeper: futex_wait:7501 state:2 saved_state:0
MIKE                      920.355048: huh? wake_up_lock_sleeper sees state=2 but try_to_wake_up then sees state=0 and bails ?!?
MIKE  futex_wait-7501  [044] ....111   920.355048: wake_up_lock_sleeper: futex_wait:7499 state:2 saved_state:0
MIKE  futex_wait-7501  [044] d...211   920.355049: try_to_wake_up: futex_wait:7499 WF_LOCK_SLEEPER !(p->state:0 & state:2) bailing
MIKE  futex_wait-7499  [048] d..h311   920.355060: sched_stat_runtime: comm=futex_wait pid=7499 runtime=168537 [ns] vruntime=4850005377 [ns]
MIKE  futex_wait-7499  [048] d...311   920.355061: sched_waking: comm=ksoftirqd/48 pid=489 prio=120 target_cpu=048
MIKE  futex_wait-7499  [048] d...411   920.355062: sched_stat_runtime: comm=futex_wait pid=7499 runtime=2972 [ns] vruntime=4850008349 [ns]
MIKE  futex_wait-7499  [048] d.L.411   920.355063: sched_wakeup: comm=ksoftirqd/48 pid=489 prio=120 target_cpu=048
MIKE  futex_wait-7499  [048] d.L.311   920.355064: sched_waking: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] dNL.411   920.355065: sched_wakeup: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] .NL.111   920.355066: wake_up_lock_sleeper: futex_wait:7501 state:0 saved_state:0
MIKE  futex_wait-7499  [048] dNL.211   920.355067: try_to_wake_up: futex_wait:7501 WF_LOCK_SLEEPER !(p->state:0 & state:2) bailing
MIKE  futex_wait-7499  [048] dNL.211   920.355067: sched_stat_runtime: comm=futex_wait pid=7499 runtime=2011 [ns] vruntime=4850010360 [ns]
MIKE  futex_wait-7499  [048] d...211   920.355068: sched_switch: prev_comm=futex_wait prev_pid=7499 prev_prio=120 prev_state=R+ ==> next_comm=ktimersoftd/48 next_pid=488 next_prio=0
MIKE       <...>-488   [048] d...2..   920.355070: sched_switch: prev_comm=ktimersoftd/48 prev_pid=488 prev_prio=0 prev_state=S ==> next_comm=ksoftirqd/48 next_pid=489 next_prio=120
MIKE       <...>-489   [048] d...2..   920.355071: sched_stat_runtime: comm=ksoftirqd/48 pid=489 runtime=1925 [ns] vruntime=4838010274 [ns]
MIKE       <...>-489   [048] d...2..   920.355072: sched_switch: prev_comm=ksoftirqd/48 prev_pid=489 prev_prio=120 prev_state=S ==> next_comm=futex_wait next_pid=7499 next_prio=120
MIKE  futex_wait-7501  [044] d...2..   920.355072: sched_stat_runtime: comm=futex_wait pid=7501 runtime=32398 [ns] vruntime=6632181410 [ns]
MIKE--futex_wait-7501  [044] d...2..   920.355074: sched_switch: prev_comm=futex_wait prev_pid=7501 prev_prio=120 prev_state=S ==> next_comm=swapper/44 next_pid=0 next_prio=120
MIKE                      920.355074: 7499 is last man standing, but bored, just waking ktimersoftd
MIKE  futex_wait-7499  [048] d..h1..   920.356060: sched_stat_runtime: comm=futex_wait pid=7499 runtime=988793 [ns] vruntime=4850999153 [ns]
MIKE  futex_wait-7499  [048] d...1..   920.356061: sched_waking: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] dN..2..   920.356063: sched_wakeup: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] dN..2..   920.356063: sched_stat_runtime: comm=futex_wait pid=7499 runtime=1854 [ns] vruntime=4851001007 [ns]
MIKE  futex_wait-7499  [048] d...2..   920.356064: sched_switch: prev_comm=futex_wait prev_pid=7499 prev_prio=120 prev_state=R ==> next_comm=ktimersoftd/48 next_pid=488 next_prio=0
MIKE       <...>-488   [048] d...2..   920.356065: sched_switch: prev_comm=ktimersoftd/48 prev_pid=488 prev_prio=0 prev_state=S ==> next_comm=futex_wait next_pid=7499 next_prio=120
MIKE  futex_wait-7499  [048] d..h1..   920.384055: sched_stat_runtime: comm=futex_wait pid=7499 runtime=997267 [ns] vruntime=4878950744 [ns]
MIKE  futex_wait-7499  [048] d...1..   920.384056: sched_waking: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] dN..2..   920.384056: sched_wakeup: comm=ktimersoftd/48 pid=488 prio=0 target_cpu=048
MIKE  futex_wait-7499  [048] dN..2..   920.384057: sched_stat_runtime: comm=futex_wait pid=7499 runtime=1161 [ns] vruntime=4878951905 [ns]
MIKE  futex_wait-7499  [048] d...2..   920.384057: sched_switch: prev_comm=futex_wait prev_pid=7499 prev_prio=120 prev_state=R ==> next_comm=ktimersoftd/48 next_pid=488 next_prio=0
MIKE       <...>-488   [048] d...2..   920.384058: sched_switch: prev_comm=ktimersoftd/48 prev_pid=488 prev_prio=0 prev_state=S ==> next_comm=futex_wait next_pid=7499 next_prio=120
MIKE  futex_wait-7499  [048] d...2..   920.384271: sched_stat_runtime: comm=futex_wait pid=7499 runtime=213243 [ns] vruntime=4879165148 [ns]
MIKE  futex_wait-7499  [048] d...2..   920.384271: sched_switch: prev_comm=futex_wait prev_pid=7499 prev_prio=120 prev_state=S ==> next_comm=swapper/48 next_pid=0 next_prio=120
MIKE                      920.384271: we're stalled, all futex_wait have scheduled off, elide gap
MIKE                      981.447247: nothing to see above /me poking ^C, elide to EOF

^ permalink raw reply	[flat|nested] 32+ messages in thread
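The "bailing" annotations in the trace come down to the wakeup's state mask not matching the sleeper's state (e.g. `!(p->state:1 & state:2)`). The following is a toy userspace sketch of that check, not the kernel code; all `toy_*` names are invented for illustration:

```c
#include <stdbool.h>

/* Toy model of the "!(p->state & state)" bail seen in the trace.
 * All toy_* names are invented; this is not the kernel implementation. */
#define TASK_RUNNING         0x0000
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002

struct toy_task {
    unsigned int state;
};

/* A wakeup only takes effect if the task is currently in one of the
 * states the waker asked for; otherwise it is silently dropped. */
static bool toy_try_to_wake_up(struct toy_task *p, unsigned int state_mask)
{
    if (!(p->state & state_mask))
        return false;       /* wrong state: the wakeup is lost ("bailing") */
    p->state = TASK_RUNNING;
    return true;
}

/* The trace case: futex_wait-7497 sleeps in state 1 (TASK_INTERRUPTIBLE),
 * but the lock-sleeper wakeup targets state 2 (TASK_UNINTERRUPTIBLE),
 * so the wakeup is dropped and the task is never made runnable. */
static bool toy_trace_wakeup_is_lost(void)
{
    struct toy_task p = { .state = TASK_INTERRUPTIBLE };

    return !toy_try_to_wake_up(&p, TASK_UNINTERRUPTIBLE) &&
           p.state == TASK_INTERRUPTIBLE;
}
```

With a matching mask the same wakeup would succeed, which is exactly why a mask/state mismatch stalls the test program.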
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-20  7:45             ` Mike Galbraith
@ 2017-06-22 16:34               ` Sebastian Andrzej Siewior
  2017-06-22 17:30                 ` Mike Galbraith
  0 siblings, 1 reply; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-06-22 16:34 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Steven Rostedt, Thomas Gleixner, LKML, linux-rt-users

On 2017-06-20 09:45:06 [+0200], Mike Galbraith wrote:
> See ! and ?

See see.
What about this:

diff --git a/include/linux/sched.h b/include/linux/sched.h
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1014,8 +1014,20 @@ struct wake_q_head {
 #define WAKE_Q(name)	\
	struct wake_q_head name = { WAKE_Q_TAIL, &name.first }

-extern void wake_q_add(struct wake_q_head *head,
-		       struct task_struct *task);
+extern void __wake_q_add(struct wake_q_head *head,
+			 struct task_struct *task, bool sleeper);
+static inline void wake_q_add(struct wake_q_head *head,
+			      struct task_struct *task)
+{
+	__wake_q_add(head, task, false);
+}
+
+static inline void wake_q_add_sleeper(struct wake_q_head *head,
+				      struct task_struct *task)
+{
+	__wake_q_add(head, task, true);
+}
+
 extern void __wake_up_q(struct wake_q_head *head, bool sleeper);

 static inline void wake_up_q(struct wake_q_head *head)
@@ -1745,6 +1757,7 @@ struct task_struct {
	raw_spinlock_t pi_lock;

	struct wake_q_node wake_q;
+	struct wake_q_node wake_q_sleeper;

 #ifdef CONFIG_RT_MUTEXES
	/* PI waiters blocked on a rt_mutex held by this task */
diff --git a/kernel/fork.c b/kernel/fork.c
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -558,6 +558,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
	tsk->splice_pipe = NULL;
	tsk->task_frag.page = NULL;
	tsk->wake_q.next = NULL;
+	tsk->wake_q_sleeper.next = NULL;

	account_kernel_stack(tsk, 1);

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1506,7 +1506,7 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
	 */
	preempt_disable();
	if (waiter->savestate)
-		wake_q_add(wake_sleeper_q, waiter->task);
+		wake_q_add_sleeper(wake_sleeper_q, waiter->task);
	else
		wake_q_add(wake_q, waiter->task);
	raw_spin_unlock(&current->pi_lock);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -430,9 +430,15 @@ static bool set_nr_if_polling(struct task_struct *p)
 #endif
 #endif

-void wake_q_add(struct wake_q_head *head, struct task_struct *task)
+void __wake_q_add(struct wake_q_head *head, struct task_struct *task,
+		  bool sleeper)
 {
-	struct wake_q_node *node = &task->wake_q;
+	struct wake_q_node *node;
+
+	if (sleeper)
+		node = &task->wake_q_sleeper;
+	else
+		node = &task->wake_q;

	/*
	 * Atomically grab the task, if ->wake_q is !nil already it means
@@ -461,11 +467,17 @@ void __wake_up_q(struct wake_q_head *head, bool sleeper)
	while (node != WAKE_Q_TAIL) {
		struct task_struct *task;

-		task = container_of(node, struct task_struct, wake_q);
+		if (sleeper)
+			task = container_of(node, struct task_struct, wake_q_sleeper);
+		else
+			task = container_of(node, struct task_struct, wake_q);
		BUG_ON(!task);
		/* task can safely be re-inserted now */
		node = node->next;
-		task->wake_q.next = NULL;
+		if (sleeper)
+			task->wake_q_sleeper.next = NULL;
+		else
+			task->wake_q.next = NULL;

		/*
		 * wake_up_process() implies a wmb() to pair with the queueing

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-22 16:34               ` Sebastian Andrzej Siewior
@ 2017-06-22 17:30                 ` Mike Galbraith
  2017-06-22 21:36                   ` Thomas Gleixner
  ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Mike Galbraith @ 2017-06-22 17:30 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, Thomas Gleixner, LKML, linux-rt-users

On Thu, 2017-06-22 at 18:34 +0200, Sebastian Andrzej Siewior wrote:
> On 2017-06-20 09:45:06 [+0200], Mike Galbraith wrote:
> > See ! and ?
>
> See see.
> What about this:

I'll give it a go, likely during the weekend.

I moved to 4.11-rt today (it also reproduces the stall nicely), because
ftrace was annoying me.  After yet more staring at ever larger traces
(the opposite of the goal ;), then taking a break to stare at the source
again, I decided that the dual wake_q business should die.. and the
stall died with it.

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1014,8 +1014,20 @@ struct wake_q_head {
>  #define WAKE_Q(name)	\
> 	struct wake_q_head name = { WAKE_Q_TAIL, &name.first }
>
> -extern void wake_q_add(struct wake_q_head *head,
> -		       struct task_struct *task);
> +extern void __wake_q_add(struct wake_q_head *head,
> +			 struct task_struct *task, bool sleeper);
> +static inline void wake_q_add(struct wake_q_head *head,
> +			      struct task_struct *task)
> +{
> +	__wake_q_add(head, task, false);
> +}
> +
> +static inline void wake_q_add_sleeper(struct wake_q_head *head,
> +				      struct task_struct *task)
> +{
> +	__wake_q_add(head, task, true);
> +}
> +
>  extern void __wake_up_q(struct wake_q_head *head, bool sleeper);
>
>  static inline void wake_up_q(struct wake_q_head *head)
> @@ -1745,6 +1757,7 @@ struct task_struct {
> 	raw_spinlock_t pi_lock;
>
> 	struct wake_q_node wake_q;
> +	struct wake_q_node wake_q_sleeper;
>
>  #ifdef CONFIG_RT_MUTEXES
> 	/* PI waiters blocked on a rt_mutex held by this task */
> diff --git a/kernel/fork.c b/kernel/fork.c
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -558,6 +558,7 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
> 	tsk->splice_pipe = NULL;
> 	tsk->task_frag.page = NULL;
> 	tsk->wake_q.next = NULL;
> +	tsk->wake_q_sleeper.next = NULL;
>
> 	account_kernel_stack(tsk, 1);
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1506,7 +1506,7 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
> 	 */
> 	preempt_disable();
> 	if (waiter->savestate)
> -		wake_q_add(wake_sleeper_q, waiter->task);
> +		wake_q_add_sleeper(wake_sleeper_q, waiter->task);
> 	else
> 		wake_q_add(wake_q, waiter->task);
> 	raw_spin_unlock(&current->pi_lock);
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -430,9 +430,15 @@ static bool set_nr_if_polling(struct task_struct *p)
>  #endif
>  #endif
>
> -void wake_q_add(struct wake_q_head *head, struct task_struct *task)
> +void __wake_q_add(struct wake_q_head *head, struct task_struct *task,
> +		  bool sleeper)
>  {
> -	struct wake_q_node *node = &task->wake_q;
> +	struct wake_q_node *node;
> +
> +	if (sleeper)
> +		node = &task->wake_q_sleeper;
> +	else
> +		node = &task->wake_q;
>
> 	/*
> 	 * Atomically grab the task, if ->wake_q is !nil already it means
> @@ -461,11 +467,17 @@ void __wake_up_q(struct wake_q_head *head, bool sleeper)
> 	while (node != WAKE_Q_TAIL) {
> 		struct task_struct *task;
>
> -		task = container_of(node, struct task_struct, wake_q);
> +		if (sleeper)
> +			task = container_of(node, struct task_struct, wake_q_sleeper);
> +		else
> +			task = container_of(node, struct task_struct, wake_q);
> 		BUG_ON(!task);
> 		/* task can safely be re-inserted now */
> 		node = node->next;
> -		task->wake_q.next = NULL;
> +		if (sleeper)
> +			task->wake_q_sleeper.next = NULL;
> +		else
> +			task->wake_q.next = NULL;
>
> 		/*
> 		 * wake_up_process() implies a wmb() to pair with the queueing

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-22 17:30                 ` Mike Galbraith
@ 2017-06-22 21:36                   ` Thomas Gleixner
  2017-06-23  2:00                   ` Mike Galbraith
  2017-06-23 12:48                   ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 32+ messages in thread
From: Thomas Gleixner @ 2017-06-22 21:36 UTC (permalink / raw)
To: Mike Galbraith
Cc: Sebastian Andrzej Siewior, Steven Rostedt, LKML, linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 1146 bytes --]

On Thu, 22 Jun 2017, Mike Galbraith wrote:
> On Thu, 2017-06-22 at 18:34 +0200, Sebastian Andrzej Siewior wrote:
> > On 2017-06-20 09:45:06 [+0200], Mike Galbraith wrote:
> > > See ! and ?
> >
> > See see.
> > What about this:
>
> I'll give it a go, likely during the weekend.
>
> I moved 4.11-rt today (also repros nicely) due to ftrace annoying me.
> After yet more staring at ever more huge traces (opposite of goal;),
> then taking a break to stare at source again, I decided that the dual
> wake_q business should die.. and the stall died with it.

The dual wake_q business is the semantically correct separation.  The
evils of 'sleeping spinlocks' require preserving the task state across
the lock sleep and acting upon the real wakeup (not the lock wakeup)
proper.

The TASK_ALL wakeup of lock sleepers was hiding that non-separation of
the two, except for the resulting wreckage of ptrace, which we simply
ignored for a long time because ....

Now that we have figured out that TASK_ALL is wrong, we have exposed the
convolution of competing lock wakeups and regular wakeups.  Separating
the wake_q for those is the only sane solution.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 32+ messages in thread
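The state preservation tglx describes can be sketched in a few lines of toy userspace C. This is an invented illustration (all `toy_*` names are made up), not the kernel's rtmutex code: a task blocking on an RT "sleeping spinlock" stashes its real sleep state in `saved_state`, and the lock wakeup restores that state instead of making the task runnable outright, so the real wakeup can still be acted upon:

```c
#include <stdbool.h>

/* Toy model of RT sleeping-spinlock state preservation; toy_* names
 * are invented for illustration, this is not the kernel code. */
#define TASK_RUNNING         0x0000
#define TASK_INTERRUPTIBLE   0x0001
#define TASK_UNINTERRUPTIBLE 0x0002

struct toy_task {
    unsigned int state;
    unsigned int saved_state;
};

/* Blocking on a sleeping spinlock: preserve the task's current state. */
static void toy_lock_sleep(struct toy_task *p)
{
    p->saved_state = p->state;
    p->state = TASK_UNINTERRUPTIBLE;
}

/* The lock wakeup restores the preserved state rather than forcing the
 * task to TASK_RUNNING, so a pending real sleep is not clobbered. */
static void toy_wake_lock_sleeper(struct toy_task *p)
{
    p->state = p->saved_state;
    p->saved_state = TASK_RUNNING;
}

/* A task that was in TASK_INTERRUPTIBLE, briefly blocked on a lock,
 * comes out of the lock sleep still in TASK_INTERRUPTIBLE, awaiting
 * its real wakeup. */
static bool toy_state_survives_lock_sleep(void)
{
    struct toy_task p = { .state = TASK_INTERRUPTIBLE };

    toy_lock_sleep(&p);         /* blocked on the sleeping spinlock */
    toy_wake_lock_sleeper(&p);  /* lock released: state restored */
    return p.state == TASK_INTERRUPTIBLE;
}
```

A TASK_ALL-style wakeup would skip the restore and wake the task regardless of state, which is exactly the papering-over tglx says hid the missing separation.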
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-22 17:30                 ` Mike Galbraith
  2017-06-22 21:36                   ` Thomas Gleixner
@ 2017-06-23  2:00                   ` Mike Galbraith
  2017-06-23 12:48                   ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 32+ messages in thread
From: Mike Galbraith @ 2017-06-23 02:00 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Steven Rostedt, Thomas Gleixner, LKML, linux-rt-users

On Thu, 2017-06-22 at 19:30 +0200, Mike Galbraith wrote:
> On Thu, 2017-06-22 at 18:34 +0200, Sebastian Andrzej Siewior wrote:
> > On 2017-06-20 09:45:06 [+0200], Mike Galbraith wrote:
> > > See ! and ?
> >
> > See see.
> > What about this:
>
> I'll give it a go, likely during the weekend.

What the heck, after so much time expended squabbling with this bugger,
beans/biscuits work will hardly notice a mere hour or so...

Yup, it's confirmed dead, and no doubt my GUI goblins will be too.

	-Mike

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [ANNOUNCE] v4.11.5-rt1
  2017-06-22 17:30                 ` Mike Galbraith
  2017-06-22 21:36                   ` Thomas Gleixner
  2017-06-23  2:00                   ` Mike Galbraith
@ 2017-06-23 12:48                   ` Sebastian Andrzej Siewior
  2 siblings, 0 replies; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-06-23 12:48 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Steven Rostedt, Thomas Gleixner, LKML, linux-rt-users

On 2017-06-22 19:30:07 [+0200], Mike Galbraith wrote:
> On Thu, 2017-06-22 at 18:34 +0200, Sebastian Andrzej Siewior wrote:
> > On 2017-06-20 09:45:06 [+0200], Mike Galbraith wrote:
> > > See ! and ?
> >
> > See see.
> > What about this:
>
> I'll give it a go, likely during the weekend.

It survived >1d on my AMD-A10 box. I am adding this description:

|If a task is queued as a sleeper for a wakeup and never goes to
|schedule() (because it just obtained the lock) then it will receive a
|spurious wake up.  That is not "bad"; it is accounted for.  Until that
|wake up happens, this task can not be enqueued for any wake ups handled
|by the WAKE_Q infrastructure (because a task can only be enqueued
|once).  This wouldn't be bad if we used the same wakeup mechanism for
|the wake up of sleepers as we do for "normal" wake ups.  But we don't…
|
|So.
|   T1                            T2                         T3
|   spin_lock(x)                                             spin_unlock(x);
|                                                            wake_q_add(q1, T1)
|   spin_unlock(x)
|   set_state(TASK_INTERRUPTIBLE)
|   if (!condition)
|           schedule()
|                                 condition = true
|                                 wake_q_add(q2, T1)
|                                 // T1 not added, still enqueued
|                                 wake_up_q(q2)
|                                                            wake_up_q_sleeper(q1)
|                                                            // T1 not woken up, wrong task state
|
|In order to solve this race this patch adds a wake_q_node for the
|sleeper case.

and will consider this closed unless I hear otherwise from you :)

Sebastian

^ permalink raw reply	[flat|nested] 32+ messages in thread
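The "a task can only be enqueued once" rule in the description comes from each task embedding a single wake_q node that an add operation claims by writing its `next` pointer. A minimal userspace sketch of both the race and the fix follows; `toy_*` names are invented for illustration, and the kernel's atomic cmpxchg claim is reduced to a plain pointer check:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model of WAKE_Q enqueueing: wake_q_add() claims the task's
 * embedded node by writing ->next, and a second add silently no-ops
 * while the node is still claimed. In the kernel the claim is a
 * cmpxchg(); a plain check suffices for this single-threaded sketch. */
struct toy_wake_q_node {
    struct toy_wake_q_node *next;
};

#define TOY_WAKE_Q_TAIL ((struct toy_wake_q_node *)0x01)

struct toy_task {
    struct toy_wake_q_node wake_q;          /* normal wakeups */
    struct toy_wake_q_node wake_q_sleeper;  /* lock-sleeper wakeups (the fix) */
};

/* Returns true if the task was queued, false if it was already pending. */
static bool toy_wake_q_add(struct toy_wake_q_node *node)
{
    if (node->next != NULL)
        return false;           /* already queued somewhere: dropped */
    node->next = TOY_WAKE_Q_TAIL;
    return true;
}

/* The race: T1 still sits on a sleeper queue (it took the lock without
 * sleeping, so it was never dequeued), and a later *normal* wakeup that
 * shares the same node is silently dropped. */
static bool toy_shared_node_drops_wakeup(void)
{
    struct toy_task t1 = { { NULL }, { NULL } };

    toy_wake_q_add(&t1.wake_q);         /* queued as lock sleeper, shared node */
    return !toy_wake_q_add(&t1.wake_q); /* normal wakeup: dropped */
}

/* The fix: with separate nodes, both enqueues succeed independently. */
static bool toy_split_nodes_fix(void)
{
    struct toy_task t1 = { { NULL }, { NULL } };

    return toy_wake_q_add(&t1.wake_q_sleeper) &&
           toy_wake_q_add(&t1.wake_q);
}
```

This is why the patch above adds `wake_q_sleeper` to `task_struct` rather than changing the wakeup path itself: the dropped enqueue, not the wakeup, is the root of the race.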
* [patch-rt] rtmutex: Fix lock stealing logic
  2017-06-16 10:56 [ANNOUNCE] v4.11.5-rt1 Sebastian Andrzej Siewior
  2017-06-17  8:14 ` Mike Galbraith
@ 2017-06-18 17:01 ` Mike Galbraith
  2017-06-23  7:37   ` [patch-rt v2] " Mike Galbraith
  1 sibling, 1 reply; 32+ messages in thread
From: Mike Galbraith @ 2017-06-18 17:01 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: LKML, linux-rt-users, Steven Rostedt

1. When trying to acquire an rtmutex, we first try to grab it without
   queueing the waiter, and explicitly check for that initial attempt
   in the !waiter path of __try_to_take_rt_mutex().  Checking whether
   the lock taker is the top waiter before allowing a steal attempt in
   that path is a thinko: the lock taker has not yet blocked.

2. It seems wrong to change the definition of rt_mutex_waiter_less()
   to mean less or perhaps equal when we have an rt_mutex_waiter_equal().

Remove the thinko, and implement rt_mutex_stealable(steal_mode) using
a restored rt_mutex_waiter_less(), and rt_mutex_waiter_equal().

Signed-off-by: Mike Galbraith <efault@gmx.de>
Fixes: f34d077c6f28 ("rt: Add the preempt-rt lock replacement APIs")
---
 kernel/locking/rtmutex.c |   69 +++++++++++++++++++++++------------------------
 1 file changed, 35 insertions(+), 34 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -236,26 +236,19 @@ static inline bool unlock_rt_mutex_safe(
 }
 #endif

-#define STEAL_NORMAL  0
-#define STEAL_LATERAL 1
-
 /*
  * Only use with rt_mutex_waiter_{less,equal}()
  */
-#define task_to_waiter(p)	\
-	&(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline }
+#define task_to_waiter(p) &(struct rt_mutex_waiter) \
+	{ .prio = (p)->prio, .deadline = (p)->dl.deadline, .task = (p) }

 static inline int
 rt_mutex_waiter_less(struct rt_mutex_waiter *left,
-		     struct rt_mutex_waiter *right, int mode)
+		     struct rt_mutex_waiter *right)
 {
-	if (mode == STEAL_NORMAL) {
-		if (left->prio < right->prio)
-			return 1;
-	} else {
-		if (left->prio <= right->prio)
-			return 1;
-	}
+	if (left->prio < right->prio)
+		return 1;
+
 	/*
	 * If both waiters have dl_prio(), we check the deadlines of the
	 * associated tasks.
@@ -287,6 +280,25 @@ rt_mutex_waiter_equal(struct rt_mutex_wa
	return 1;
 }

+#define STEAL_NORMAL  0
+#define STEAL_LATERAL 1
+
+static inline int rt_mutex_stealable(struct rt_mutex_waiter *thief,
+				     struct rt_mutex_waiter *victim, int mode)
+{
+	if (rt_mutex_waiter_less(thief, victim))
+		return 1;
+
+	/*
+	 * Note that RT tasks are excluded from lateral-steals
+	 * to prevent the introduction of an unbounded latency.
+	 */
+	if (mode == STEAL_NORMAL || rt_task(thief->task))
+		return 0;
+
+	return rt_mutex_waiter_equal(thief, victim);
+}
+
 static void
 rt_mutex_enqueue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter)
 {
@@ -298,7 +310,7 @@ rt_mutex_enqueue(struct rt_mutex *lock,
	while (*link) {
		parent = *link;
		entry = rb_entry(parent, struct rt_mutex_waiter, tree_entry);
-		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
+		if (rt_mutex_waiter_less(waiter, entry)) {
			link = &parent->rb_left;
		} else {
			link = &parent->rb_right;
@@ -337,7 +349,7 @@ rt_mutex_enqueue_pi(struct task_struct *
	while (*link) {
		parent = *link;
		entry = rb_entry(parent, struct rt_mutex_waiter, pi_tree_entry);
-		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
+		if (rt_mutex_waiter_less(waiter, entry)) {
			link = &parent->rb_left;
		} else {
			link = &parent->rb_right;
@@ -847,6 +859,7 @@ static int rt_mutex_adjust_prio_chain(st
  * @task:   The task which wants to acquire the lock
  * @waiter: The waiter that is queued to the lock's wait tree if the
  *	    callsite called task_blocked_on_lock(), otherwise NULL
+ * @mode:   Lock steal mode (STEAL_NORMAL, STEAL_LATERAL)
  */
 static int __try_to_take_rt_mutex(struct rt_mutex *lock,
				  struct task_struct *task,
@@ -890,10 +903,9 @@ static int __try_to_take_rt_mutex(struct
		 * @lock, give up.
		 */
		if (waiter != rt_mutex_top_waiter(lock)) {
-			/* XXX rt_mutex_waiter_less() ? */
+			/* XXX rt_mutex_stealable() ? */
			return 0;
		}
-
		/*
		 * We can acquire the lock. Remove the waiter from the
		 * lock waiters tree.
@@ -910,25 +922,14 @@ static int __try_to_take_rt_mutex(struct
		 * not need to be dequeued.
		 */
		if (rt_mutex_has_waiters(lock)) {
-			struct task_struct *pown = rt_mutex_top_waiter(lock)->task;
-
-			if (task != pown)
-				return 0;
-
-			/*
-			 * Note that RT tasks are excluded from lateral-steals
-			 * to prevent the introduction of an unbounded latency.
-			 */
-			if (rt_task(task))
-				mode = STEAL_NORMAL;
			/*
-			 * If @task->prio is greater than or equal to
-			 * the top waiter priority (kernel view),
-			 * @task lost.
+			 * If @task->prio is greater than the top waiter
+			 * priority (kernel view), or equal to it when
+			 * lateral steal is forbidden, @task lost.
			 */
-			if (!rt_mutex_waiter_less(task_to_waiter(task),
-						  rt_mutex_top_waiter(lock),
-						  mode))
+			if (!rt_mutex_stealable(task_to_waiter(task),
+						rt_mutex_top_waiter(lock),
+						mode))
				return 0;
			/*
			 * The current top waiter stays enqueued. We

^ permalink raw reply	[flat|nested] 32+ messages in thread
* [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-18 17:01 ` [patch-rt] rtmutex: Fix lock stealing logic Mike Galbraith
@ 2017-06-23  7:37   ` Mike Galbraith
  2017-06-23 10:07     ` Mike Galbraith
  2017-06-23 13:33     ` Steven Rostedt
  0 siblings, 2 replies; 32+ messages in thread
From: Mike Galbraith @ 2017-06-23 07:37 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: LKML, linux-rt-users, Steven Rostedt

V2 changes:
  - beautification (ymmv)
  - enable lock stealing when waiter is queued

rtmutex: Fix lock stealing logic

1. When trying to acquire an rtmutex, we first try to grab it without
   queueing the waiter, and explicitly check for that initial attempt
   in the !waiter path of __try_to_take_rt_mutex().  Checking whether
   the lock taker is the top waiter before allowing a steal attempt in
   that path is a thinko: the lock taker has not yet blocked.

2. It seems wrong to change the definition of rt_mutex_waiter_less()
   to mean less or perhaps equal when we have an rt_mutex_waiter_equal().

Remove the thinko, restore rt_mutex_waiter_less(), implement and use
rt_mutex_steal() based upon rt_mutex_waiter_less/equal(), moving all
qualification criteria into the function itself.

Signed-off-by: Mike Galbraith <efault@gmx.de>
---
 kernel/locking/rtmutex.c |   76 ++++++++++++++++++++++-------------------------
 1 file changed, 37 insertions(+), 39 deletions(-)

--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -236,26 +236,19 @@ static inline bool unlock_rt_mutex_safe(
 }
 #endif

-#define STEAL_NORMAL  0
-#define STEAL_LATERAL 1
-
 /*
  * Only use with rt_mutex_waiter_{less,equal}()
  */
-#define task_to_waiter(p)	\
-	&(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline }
+#define task_to_waiter(p) &(struct rt_mutex_waiter) \
+	{ .prio = (p)->prio, .deadline = (p)->dl.deadline, .task = (p) }

 static inline int
 rt_mutex_waiter_less(struct rt_mutex_waiter *left,
-		     struct rt_mutex_waiter *right, int mode)
+		     struct rt_mutex_waiter *right)
 {
-	if (mode == STEAL_NORMAL) {
-		if (left->prio < right->prio)
-			return 1;
-	} else {
-		if (left->prio <= right->prio)
-			return 1;
-	}
+	if (left->prio < right->prio)
+		return 1;
+
 	/*
	 * If both waiters have dl_prio(), we check the deadlines of the
	 * associated tasks.
@@ -287,6 +280,27 @@ rt_mutex_waiter_equal(struct rt_mutex_wa
	return 1;
 }

+#define STEAL_NORMAL  0
+#define STEAL_LATERAL 1
+
+static inline int
+rt_mutex_steal(struct rt_mutex *lock, struct rt_mutex_waiter *waiter, int mode)
+{
+	struct rt_mutex_waiter *top_waiter = rt_mutex_top_waiter(lock);
+
+	if (waiter == top_waiter || rt_mutex_waiter_less(waiter, top_waiter))
+		return 1;
+
+	/*
+	 * Note that RT tasks are excluded from lateral-steals
+	 * to prevent the introduction of an unbounded latency.
+	 */
+	if (mode == STEAL_NORMAL || rt_task(waiter->task))
+		return 0;
+
+	return rt_mutex_waiter_equal(waiter, top_waiter);
+}
+
 static void
 rt_mutex_enqueue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter)
 {
@@ -298,7 +312,7 @@ rt_mutex_enqueue(struct rt_mutex *lock,
	while (*link) {
		parent = *link;
		entry = rb_entry(parent, struct rt_mutex_waiter, tree_entry);
-		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
+		if (rt_mutex_waiter_less(waiter, entry)) {
			link = &parent->rb_left;
		} else {
			link = &parent->rb_right;
@@ -337,7 +351,7 @@ rt_mutex_enqueue_pi(struct task_struct *
	while (*link) {
		parent = *link;
		entry = rb_entry(parent, struct rt_mutex_waiter, pi_tree_entry);
-		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
+		if (rt_mutex_waiter_less(waiter, entry)) {
			link = &parent->rb_left;
		} else {
			link = &parent->rb_right;
@@ -847,6 +861,7 @@ static int rt_mutex_adjust_prio_chain(st
  * @task:   The task which wants to acquire the lock
  * @waiter: The waiter that is queued to the lock's wait tree if the
  *	    callsite called task_blocked_on_lock(), otherwise NULL
+ * @mode:   Lock steal mode (STEAL_NORMAL, STEAL_LATERAL)
  */
 static int __try_to_take_rt_mutex(struct rt_mutex *lock,
				  struct task_struct *task,
@@ -886,20 +901,16 @@ static int __try_to_take_rt_mutex(struct
	 */
	if (waiter) {
		/*
-		 * If waiter is not the highest priority waiter of
-		 * @lock, give up.
+		 * If waiter is not the highest priority waiter of @lock,
+		 * or its peer when lateral steal is allowed, give up.
		 */
-		if (waiter != rt_mutex_top_waiter(lock)) {
-			/* XXX rt_mutex_waiter_less() ? */
+		if (!rt_mutex_steal(lock, waiter, mode))
			return 0;
-		}
-
		/*
		 * We can acquire the lock. Remove the waiter from the
		 * lock waiters tree.
		 */
		rt_mutex_dequeue(lock, waiter);
-
	} else {
		/*
		 * If the lock has waiters already we check whether @task is
@@ -910,25 +921,12 @@ static int __try_to_take_rt_mutex(struct
		 * not need to be dequeued.
		 */
		if (rt_mutex_has_waiters(lock)) {
-			struct task_struct *pown = rt_mutex_top_waiter(lock)->task;
-
-			if (task != pown)
-				return 0;
-
-			/*
-			 * Note that RT tasks are excluded from lateral-steals
-			 * to prevent the introduction of an unbounded latency.
-			 */
-			if (rt_task(task))
-				mode = STEAL_NORMAL;
			/*
-			 * If @task->prio is greater than or equal to
-			 * the top waiter priority (kernel view),
-			 * @task lost.
+			 * If @task->prio is greater than the top waiter
+			 * priority (kernel view), or equal to it when a
+			 * lateral steal is forbidden, @task lost.
			 */
-			if (!rt_mutex_waiter_less(task_to_waiter(task),
-						  rt_mutex_top_waiter(lock),
-						  mode))
+			if (!rt_mutex_steal(lock, task_to_waiter(task), mode))
				return 0;
			/*
			 * The current top waiter stays enqueued. We

^ permalink raw reply	[flat|nested] 32+ messages in thread
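The decision rt_mutex_steal() encodes can be sketched in userspace with the waiter reduced to a priority number plus an RT flag. This is an invented illustration (`toy_*` names are made up); it omits the pointer-identity shortcut for the top waiter itself and the deadline comparison used for DL tasks:

```c
#include <stdbool.h>

/* Toy sketch of the v2 rt_mutex_steal() decision. In the kernel view a
 * *lower* prio value is more important; DL-deadline handling and the
 * "waiter == top_waiter" shortcut of the real function are omitted. */
enum toy_steal_mode { TOY_STEAL_NORMAL, TOY_STEAL_LATERAL };

struct toy_waiter {
    int  prio;     /* kernel view: lower value wins */
    bool rt_task;  /* RT tasks may not lateral-steal */
};

static bool toy_waiter_less(const struct toy_waiter *l,
                            const struct toy_waiter *r)
{
    return l->prio < r->prio;
}

static bool toy_waiter_equal(const struct toy_waiter *l,
                             const struct toy_waiter *r)
{
    return l->prio == r->prio;
}

/* A steal succeeds on strictly higher priority, or on equal priority
 * when lateral stealing is allowed and the thief is not an RT task
 * (which would risk unbounded latency for the victim). */
static bool toy_steal(const struct toy_waiter *waiter,
                      const struct toy_waiter *top_waiter,
                      enum toy_steal_mode mode)
{
    if (toy_waiter_less(waiter, top_waiter))
        return true;

    if (mode == TOY_STEAL_NORMAL || waiter->rt_task)
        return false;

    return toy_waiter_equal(waiter, top_waiter);
}
```

This keeps all qualification criteria in one predicate, which is the point of the V2 rework: both the waiter path and the !waiter path of __try_to_take_rt_mutex() ask the same question.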
* Re: [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-23  7:37 ` [patch-rt v2] " Mike Galbraith
@ 2017-06-23 10:07   ` Mike Galbraith
  2017-06-26 13:47     ` Sebastian Andrzej Siewior
  2017-06-23 13:33   ` Steven Rostedt
  1 sibling, 1 reply; 32+ messages in thread
From: Mike Galbraith @ 2017-06-23 10:07 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Thomas Gleixner
Cc: LKML, linux-rt-users, Steven Rostedt

On Fri, 2017-06-23 at 09:37 +0200, Mike Galbraith wrote:
> V2 changes:
>   - beautification (ymmv)
>   - enable lock stealing when waiter is queued

(V3 s/steal/claim)

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-23 10:07   ` Mike Galbraith
@ 2017-06-26 13:47     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 32+ messages in thread
From: Sebastian Andrzej Siewior @ 2017-06-26 13:47 UTC (permalink / raw)
To: Mike Galbraith; +Cc: Thomas Gleixner, LKML, linux-rt-users, Steven Rostedt

On 2017-06-23 12:07:38 [+0200], Mike Galbraith wrote:
> On Fri, 2017-06-23 at 09:37 +0200, Mike Galbraith wrote:
> > V2 changes:
> >   - beautification (ymmv)
> >   - enable lock stealing when waiter is queued

Applied.

> (V3 s/steal/claim)

Sebastian

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-23  7:37 ` [patch-rt v2] " Mike Galbraith
  2017-06-23 10:07   ` Mike Galbraith
@ 2017-06-23 13:33   ` Steven Rostedt
  2017-06-23 14:07     ` Mike Galbraith
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2017-06-23 13:33 UTC (permalink / raw)
To: Mike Galbraith
Cc: Sebastian Andrzej Siewior, Thomas Gleixner, LKML, linux-rt-users

On Fri, 23 Jun 2017 09:37:14 +0200
Mike Galbraith <efault@gmx.de> wrote:

> V2 changes:
>   - beautification (ymmv)
>   - enable lock stealing when waiter is queued
>
> rtmutex: Fix lock stealing logic
>
> 1. When trying to acquire an rtmutex, we first try to grab it without
>    queueing the waiter, and explicitly check for that initial attempt
>    in the !waiter path of __try_to_take_rt_mutex().  Checking whether
>    the lock taker is the top waiter before allowing a steal attempt in
>    that path is a thinko: the lock taker has not yet blocked.
>
> 2. It seems wrong to change the definition of rt_mutex_waiter_less()
>    to mean less or perhaps equal when we have an rt_mutex_waiter_equal().
>
> Remove the thinko, restore rt_mutex_waiter_less(), implement and use
> rt_mutex_steal() based upon rt_mutex_waiter_less/equal(), moving all
> qualification criteria into the function itself.
>
> Signed-off-by: Mike Galbraith <efault@gmx.de>
> ---
>  kernel/locking/rtmutex.c |   76 ++++++++++++++++++++++-------------------------
>  1 file changed, 37 insertions(+), 39 deletions(-)
>
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -236,26 +236,19 @@ static inline bool unlock_rt_mutex_safe(
>  }
>  #endif
>
> -#define STEAL_NORMAL  0
> -#define STEAL_LATERAL 1
> -
>  /*
>   * Only use with rt_mutex_waiter_{less,equal}()
>   */
> -#define task_to_waiter(p)	\
> -	&(struct rt_mutex_waiter){ .prio = (p)->prio, .deadline = (p)->dl.deadline }
> +#define task_to_waiter(p) &(struct rt_mutex_waiter) \
> +	{ .prio = (p)->prio, .deadline = (p)->dl.deadline, .task = (p) }
>
>  static inline int
>  rt_mutex_waiter_less(struct rt_mutex_waiter *left,
> -		     struct rt_mutex_waiter *right, int mode)
> +		     struct rt_mutex_waiter *right)
>  {
> -	if (mode == STEAL_NORMAL) {
> -		if (left->prio < right->prio)
> -			return 1;
> -	} else {
> -		if (left->prio <= right->prio)
> -			return 1;
> -	}
> +	if (left->prio < right->prio)
> +		return 1;
> +
>  	/*
> 	 * If both waiters have dl_prio(), we check the deadlines of the
> 	 * associated tasks.
> @@ -287,6 +280,27 @@ rt_mutex_waiter_equal(struct rt_mutex_wa
> 	return 1;
>  }
>
> +#define STEAL_NORMAL  0
> +#define STEAL_LATERAL 1
> +
> +static inline int
> +rt_mutex_steal(struct rt_mutex *lock, struct rt_mutex_waiter *waiter, int mode)
> +{
> +	struct rt_mutex_waiter *top_waiter = rt_mutex_top_waiter(lock);
> +
> +	if (waiter == top_waiter || rt_mutex_waiter_less(waiter, top_waiter))
> +		return 1;
> +
> +	/*
> +	 * Note that RT tasks are excluded from lateral-steals
> +	 * to prevent the introduction of an unbounded latency.
> +	 */
> +	if (mode == STEAL_NORMAL || rt_task(waiter->task))
> +		return 0;
> +
> +	return rt_mutex_waiter_equal(waiter, top_waiter);
> +}
> +
>  static void
>  rt_mutex_enqueue(struct rt_mutex *lock, struct rt_mutex_waiter *waiter)
>  {
> @@ -298,7 +312,7 @@ rt_mutex_enqueue(struct rt_mutex *lock,
> 	while (*link) {
> 		parent = *link;
> 		entry = rb_entry(parent, struct rt_mutex_waiter, tree_entry);
> -		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
> +		if (rt_mutex_waiter_less(waiter, entry)) {
> 			link = &parent->rb_left;
> 		} else {
> 			link = &parent->rb_right;
> @@ -337,7 +351,7 @@ rt_mutex_enqueue_pi(struct task_struct *
> 	while (*link) {
> 		parent = *link;
> 		entry = rb_entry(parent, struct rt_mutex_waiter, pi_tree_entry);
> -		if (rt_mutex_waiter_less(waiter, entry, STEAL_NORMAL)) {
> +		if (rt_mutex_waiter_less(waiter, entry)) {
> 			link = &parent->rb_left;
> 		} else {
> 			link = &parent->rb_right;
> @@ -847,6 +861,7 @@ static int rt_mutex_adjust_prio_chain(st
>   * @task:   The task which wants to acquire the lock
>   * @waiter: The waiter that is queued to the lock's wait tree if the
>   *	    callsite called task_blocked_on_lock(), otherwise NULL
> + * @mode:   Lock steal mode (STEAL_NORMAL, STEAL_LATERAL)
>   */
>  static int __try_to_take_rt_mutex(struct rt_mutex *lock,
> 				  struct task_struct *task,
> @@ -886,20 +901,16 @@ static int __try_to_take_rt_mutex(struct
> 	 */
> 	if (waiter) {
> 		/*
> -		 * If waiter is not the highest priority waiter of
> -		 * @lock, give up.
> +		 * If waiter is not the highest priority waiter of @lock,
> +		 * or its peer when lateral steal is allowed, give up.
> 		 */
> -		if (waiter != rt_mutex_top_waiter(lock)) {
> -			/* XXX rt_mutex_waiter_less() ? */
> +		if (!rt_mutex_steal(lock, waiter, mode))
> 			return 0;
> -		}
> -
> 		/*
> 		 * We can acquire the lock. Remove the waiter from the
> 		 * lock waiters tree.
> 		 */
> 		rt_mutex_dequeue(lock, waiter);
> -

I liked that space.

> 	} else {
> 		/*
> 		 * If the lock has waiters already we check whether @task is
> @@ -910,25 +921,12 @@ static int __try_to_take_rt_mutex(struct
> 		 * not need to be dequeued.
> 		 */
> 		if (rt_mutex_has_waiters(lock)) {
> -			struct task_struct *pown = rt_mutex_top_waiter(lock)->task;
> -
> -			if (task != pown)
> -				return 0;

OK, I see what happened here. Yeah, this will always be true, because
when waiter == NULL, it means that the task isn't on the lock's list
yet, and pown != task is always true.

> -
> -			/*
> -			 * Note that RT tasks are excluded from lateral-steals
> -			 * to prevent the introduction of an unbounded latency.
> -			 */
> -			if (rt_task(task))
> -				mode = STEAL_NORMAL;
> 			/*
> -			 * If @task->prio is greater than or equal to
> -			 * the top waiter priority (kernel view),
> -			 * @task lost.
> +			 * If @task->prio is greater than the top waiter
> +			 * priority (kernel view), or equal to it when a
> +			 * lateral steal is forbidden, @task lost.
> 			 */
> -			if (!rt_mutex_waiter_less(task_to_waiter(task),
> -						  rt_mutex_top_waiter(lock),
> -						  mode))
> +			if (!rt_mutex_steal(lock, task_to_waiter(task), mode))
> 				return 0;
> 			/*
> 			 * The current top waiter stays enqueued. We

Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-23 13:33       ` Steven Rostedt
@ 2017-06-23 14:07         ` Mike Galbraith
  2017-06-23 14:14           ` Steven Rostedt
  0 siblings, 1 reply; 32+ messages in thread
From: Mike Galbraith @ 2017-06-23 14:07 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sebastian Andrzej Siewior, Thomas Gleixner, LKML, linux-rt-users

On Fri, 2017-06-23 at 09:33 -0400, Steven Rostedt wrote:
> >  				  struct task_struct *task,
> > @@ -886,20 +901,16 @@ static int __try_to_take_rt_mutex(struct
> >  	 */
> >  	if (waiter) {
> >  		/*
> > -		 * If waiter is not the highest priority waiter of
> > -		 * @lock, give up.
> > +		 * If waiter is not the highest priority waiter of @lock,
> > +		 * or its peer when lateral steal is allowed, give up.
> >  		 */
> > -		if (waiter != rt_mutex_top_waiter(lock)) {
> > -			/* XXX rt_mutex_waiter_less() ? */
> > +		if (!rt_mutex_steal(lock, waiter, mode))
> >  			return 0;
> > -		}
> > -
> >  		/*
> >  		 * We can acquire the lock. Remove the waiter from the
> >  		 * lock waiters tree.
> >  		 */
> >  		rt_mutex_dequeue(lock, waiter);
> > -
>
> I liked that space.

I like minus signs in diffstat, that one was a freebie.  Maintainers
can revive it if they like, or I can post a V3 with it revived as well
as s/rt_mutex_steal/rt_mutex_claim.

	-Mike
* Re: [patch-rt v2] rtmutex: Fix lock stealing logic
  2017-06-23 14:07         ` Mike Galbraith
@ 2017-06-23 14:14           ` Steven Rostedt
  0 siblings, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2017-06-23 14:14 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Sebastian Andrzej Siewior, Thomas Gleixner, LKML, linux-rt-users

On Fri, 23 Jun 2017 16:07:19 +0200
Mike Galbraith <efault@gmx.de> wrote:

> On Fri, 2017-06-23 at 09:33 -0400, Steven Rostedt wrote:
> > >  				  struct task_struct *task,
> > > @@ -886,20 +901,16 @@ static int __try_to_take_rt_mutex(struct
> > >  	 */
> > >  	if (waiter) {
> > >  		/*
> > > -		 * If waiter is not the highest priority waiter of
> > > -		 * @lock, give up.
> > > +		 * If waiter is not the highest priority waiter of @lock,
> > > +		 * or its peer when lateral steal is allowed, give up.
> > >  		 */
> > > -		if (waiter != rt_mutex_top_waiter(lock)) {
> > > -			/* XXX rt_mutex_waiter_less() ? */
> > > +		if (!rt_mutex_steal(lock, waiter, mode))
> > >  			return 0;
> > > -		}
> > > -
> > >  		/*
> > >  		 * We can acquire the lock. Remove the waiter from the
> > >  		 * lock waiters tree.
> > >  		 */
> > >  		rt_mutex_dequeue(lock, waiter);
> > > -
> >
> > I liked that space.
>
> I like minus signs in diffstat, that one was a freebee.  Maintainers
> can revive it if they like, or I can post a V3 with it revived as well
> as s/rt_mutex_steal/rt_mutex_claim.

It's not bigly to me.

-- Steve
end of thread, other threads:[~2017-06-26 13:47 UTC | newest]

Thread overview: 32+ messages
2017-06-16 10:56 [ANNOUNCE] v4.11.5-rt1 Sebastian Andrzej Siewior
2017-06-17  8:14 ` Mike Galbraith
2017-06-18  6:46   ` Mike Galbraith
2017-06-19  8:52   ` Sebastian Andrzej Siewior
2017-06-19 10:14     ` Mike Galbraith
2017-06-19 10:44       ` Sebastian Andrzej Siewior
2017-06-19 11:31         ` Mike Galbraith
2017-06-19 11:50           ` Sebastian Andrzej Siewior
2017-06-19 12:55             ` Mike Galbraith
2017-06-19 14:06               ` Sebastian Andrzej Siewior
2017-06-19 14:36                 ` Mike Galbraith
2017-06-19 15:03                   ` Sebastian Andrzej Siewior
2017-06-19 16:14                     ` Mike Galbraith
2017-06-19 16:27                       ` Sebastian Andrzej Siewior
2017-06-19 16:46                         ` Mike Galbraith
2017-06-19 14:08               ` Steven Rostedt
2017-06-19 14:13                 ` Sebastian Andrzej Siewior
2017-06-19 14:41                   ` Steven Rostedt
2017-06-19 16:29                     ` Mike Galbraith
2017-06-20  7:45                       ` Mike Galbraith
2017-06-22 16:34                         ` Sebastian Andrzej Siewior
2017-06-22 17:30                           ` Mike Galbraith
2017-06-22 21:36                             ` Thomas Gleixner
2017-06-23  2:00                               ` Mike Galbraith
2017-06-23 12:48                                 ` Sebastian Andrzej Siewior
2017-06-18 17:01 ` [patch-rt] rtmutex: Fix lock stealing logic Mike Galbraith
2017-06-23  7:37   ` [patch-rt v2] " Mike Galbraith
2017-06-23 10:07     ` Mike Galbraith
2017-06-26 13:47       ` Sebastian Andrzej Siewior
2017-06-23 13:33     ` Steven Rostedt
2017-06-23 14:07       ` Mike Galbraith
2017-06-23 14:14         ` Steven Rostedt