* [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
@ 2021-11-04 17:51 Vincent Donnefort
2021-11-05 15:08 ` Vincent Guittot
` (2 more replies)
0 siblings, 3 replies; 8+ messages in thread
From: Vincent Donnefort @ 2021-11-04 17:51 UTC (permalink / raw)
To: peterz, mingo, vincent.guittot
Cc: linux-kernel, dietmar.eggemann, valentin.schneider, jing-ting.wu,
Vincent Donnefort
Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:
CPU1 CPU2
==================================================================
per_cpu(sd_llc_id, CPUX) => 0
partition_sched_domains_locked()
detach_destroy_domains()
cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
per_cpu(sd_llc_id, CPUX) => 0
per_cpu(sd_llc_id, CPUX) = CPUX
per_cpu(sd_llc_id, CPUX) => CPUX
return false
ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().
Avoid a such race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.
Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f2611b9cf503..f5ca15cdcff4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
bool cpus_share_cache(int this_cpu, int that_cpu)
{
+ if (this_cpu == that_cpu)
+ return true;
+
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
}
--
2.25.1
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-04 17:51 [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() Vincent Donnefort
@ 2021-11-05 15:08 ` Vincent Guittot
2021-11-08 11:19 ` Peter Zijlstra
2021-11-11 12:22 ` [tip: sched/urgent] " tip-bot2 for Vincent Donnefort
2021-11-15 10:46 ` [PATCH] " Shinichiro Kawasaki
2 siblings, 1 reply; 8+ messages in thread
From: Vincent Guittot @ 2021-11-05 15:08 UTC (permalink / raw)
To: Vincent Donnefort
Cc: peterz, mingo, linux-kernel, dietmar.eggemann,
Valentin.Schneider, jing-ting.wu
On Thu, 4 Nov 2021 at 18:52, Vincent Donnefort
<vincent.donnefort@arm.com> wrote:
>
> Nothing protects the access to the per_cpu variable sd_llc_id. When testing
> the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
> update_top_cache_domain(). One scenario being:
>
> CPU1 CPU2
> ==================================================================
>
> per_cpu(sd_llc_id, CPUX) => 0
> partition_sched_domains_locked()
> detach_destroy_domains()
> cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
> per_cpu(sd_llc_id, CPUX) => 0
> per_cpu(sd_llc_id, CPUX) = CPUX
> per_cpu(sd_llc_id, CPUX) => CPUX
> return false
>
> ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
> is a warning triggered from ttwu_queue_wakelist().
>
> Avoid a such race in cpus_share_cache() by always returning true when
> this_cpu == that_cpu.
>
> Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
> Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
> Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f2611b9cf503..f5ca15cdcff4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
>
> bool cpus_share_cache(int this_cpu, int that_cpu)
> {
> + if (this_cpu == that_cpu)
> + return true;
> +
> return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
> }
>
> --
> 2.25.1
>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-05 15:08 ` Vincent Guittot
@ 2021-11-08 11:19 ` Peter Zijlstra
0 siblings, 0 replies; 8+ messages in thread
From: Peter Zijlstra @ 2021-11-08 11:19 UTC (permalink / raw)
To: Vincent Guittot
Cc: Vincent Donnefort, mingo, linux-kernel, dietmar.eggemann,
Valentin.Schneider, jing-ting.wu
On Fri, Nov 05, 2021 at 04:08:33PM +0100, Vincent Guittot wrote:
> On Thu, 4 Nov 2021 at 18:52, Vincent Donnefort
> <vincent.donnefort@arm.com> wrote:
> >
> > Nothing protects the access to the per_cpu variable sd_llc_id. When testing
> > the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
> > update_top_cache_domain(). One scenario being:
> >
> > CPU1 CPU2
> > ==================================================================
> >
> > per_cpu(sd_llc_id, CPUX) => 0
> > partition_sched_domains_locked()
> > detach_destroy_domains()
> > cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
> > per_cpu(sd_llc_id, CPUX) => 0
> > per_cpu(sd_llc_id, CPUX) = CPUX
> > per_cpu(sd_llc_id, CPUX) => CPUX
> > return false
> >
> > ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
> > is a warning triggered from ttwu_queue_wakelist().
> >
> > Avoid a such race in cpus_share_cache() by always returning true when
> > this_cpu == that_cpu.
> >
> > Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
> > Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
> > Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
> > Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
>
> Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
>
> >
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index f2611b9cf503..f5ca15cdcff4 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
> >
> > bool cpus_share_cache(int this_cpu, int that_cpu)
> > {
> > + if (this_cpu == that_cpu)
> > + return true;
> > +
> > return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
> > }
Blergh, that's annoying. Thanks guys, picked it up for /urgent
^ permalink raw reply [flat|nested] 8+ messages in thread
* [tip: sched/urgent] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-04 17:51 [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() Vincent Donnefort
2021-11-05 15:08 ` Vincent Guittot
@ 2021-11-11 12:22 ` tip-bot2 for Vincent Donnefort
2021-11-15 10:46 ` [PATCH] " Shinichiro Kawasaki
2 siblings, 0 replies; 8+ messages in thread
From: tip-bot2 for Vincent Donnefort @ 2021-11-11 12:22 UTC (permalink / raw)
To: linux-tip-commits
Cc: Jing-Ting Wu, Vincent Donnefort, Peter Zijlstra (Intel),
Valentin Schneider, Vincent Guittot, x86, linux-kernel
The following commit has been merged into the sched/urgent branch of tip:
Commit-ID: 42dc938a590c96eeb429e1830123fef2366d9c80
Gitweb: https://git.kernel.org/tip/42dc938a590c96eeb429e1830123fef2366d9c80
Author: Vincent Donnefort <vincent.donnefort@arm.com>
AuthorDate: Thu, 04 Nov 2021 17:51:20
Committer: Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 11 Nov 2021 13:09:32 +01:00
sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:
CPU1 CPU2
==================================================================
per_cpu(sd_llc_id, CPUX) => 0
partition_sched_domains_locked()
detach_destroy_domains()
cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
per_cpu(sd_llc_id, CPUX) => 0
per_cpu(sd_llc_id, CPUX) = CPUX
per_cpu(sd_llc_id, CPUX) => CPUX
return false
ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().
Avoid a such race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.
Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20211104175120.857087-1-vincent.donnefort@arm.com
---
kernel/sched/core.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 523fd60..cec173a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3726,6 +3726,9 @@ out:
bool cpus_share_cache(int this_cpu, int that_cpu)
{
+ if (this_cpu == that_cpu)
+ return true;
+
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
}
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-04 17:51 [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() Vincent Donnefort
2021-11-05 15:08 ` Vincent Guittot
2021-11-11 12:22 ` [tip: sched/urgent] " tip-bot2 for Vincent Donnefort
@ 2021-11-15 10:46 ` Shinichiro Kawasaki
2021-11-15 18:33 ` Valentin Schneider
2 siblings, 1 reply; 8+ messages in thread
From: Shinichiro Kawasaki @ 2021-11-15 10:46 UTC (permalink / raw)
To: Vincent Donnefort
Cc: peterz, mingo, vincent.guittot, linux-kernel, dietmar.eggemann,
valentin.schneider, jing-ting.wu, Damien Le Moal
On Nov 04, 2021 / 17:51, Vincent Donnefort wrote:
> Nothing protects the access to the per_cpu variable sd_llc_id. When testing
> the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
> update_top_cache_domain(). One scenario being:
>
> CPU1 CPU2
> ==================================================================
>
> per_cpu(sd_llc_id, CPUX) => 0
> partition_sched_domains_locked()
> detach_destroy_domains()
> cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
> per_cpu(sd_llc_id, CPUX) => 0
> per_cpu(sd_llc_id, CPUX) = CPUX
> per_cpu(sd_llc_id, CPUX) => CPUX
> return false
>
> ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
> is a warning triggered from ttwu_queue_wakelist().
>
> Avoid a such race in cpus_share_cache() by always returning true when
> this_cpu == that_cpu.
>
> Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
> Reported-by: Jing-Ting Wu <jing-ting.wu@mediatek.com>
> Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com>
> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com>
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f2611b9cf503..f5ca15cdcff4 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)
>
> bool cpus_share_cache(int this_cpu, int that_cpu)
> {
> + if (this_cpu == that_cpu)
> + return true;
> +
> return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
> }
>
> --
> 2.25.1
>
Oh, this is the exactly same fix as I posted before [1]. It is a little bit sad
that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
[1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@wdc.com/
--
Best Regards,
Shin'ichiro Kawasaki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-15 10:46 ` [PATCH] " Shinichiro Kawasaki
@ 2021-11-15 18:33 ` Valentin Schneider
2021-11-16 8:02 ` Vincent Guittot
0 siblings, 1 reply; 8+ messages in thread
From: Valentin Schneider @ 2021-11-15 18:33 UTC (permalink / raw)
To: Shinichiro Kawasaki, Vincent Donnefort
Cc: peterz, mingo, vincent.guittot, linux-kernel, dietmar.eggemann,
jing-ting.wu, Damien Le Moal
On 15/11/21 10:46, Shinichiro Kawasaki wrote:
> Oh, this is the exactly same fix as I posted before [1]. It is a little bit sad
> that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
>
> [1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@wdc.com/
>
Oh, that sucks, sorry about that. I do try to keep an eye out for sched
stuff sent to LKML, but clearly there are some that fall through :(
> --
> Best Regards,
> Shin'ichiro Kawasaki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-15 18:33 ` Valentin Schneider
@ 2021-11-16 8:02 ` Vincent Guittot
2021-11-16 8:45 ` Shinichiro Kawasaki
0 siblings, 1 reply; 8+ messages in thread
From: Vincent Guittot @ 2021-11-16 8:02 UTC (permalink / raw)
To: Valentin Schneider
Cc: Shinichiro Kawasaki, Vincent Donnefort, peterz, mingo,
linux-kernel, dietmar.eggemann, jing-ting.wu, Damien Le Moal
Hi Shinichiro
On Mon, 15 Nov 2021 at 19:34, Valentin Schneider
<valentin.schneider@arm.com> wrote:
>
> On 15/11/21 10:46, Shinichiro Kawasaki wrote:
> > Oh, this is the exactly same fix as I posted before [1]. It is a little bit sad
> > that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
> >
> > [1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@wdc.com/
> >
>
> Oh, that sucks, sorry about that. I do try to keep an eye out for sched
> stuff sent to LKML, but clearly there are some that fall through :(
I would advise you to use get_maintainer.pl to make sure to not miss
to cc someone.
Like valentin, this patch has been lost in the lkml flow and i haven't
noticed it
Regards,
Vincent
>
> > --
> > Best Regards,
> > Shin'ichiro Kawasaki
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()
2021-11-16 8:02 ` Vincent Guittot
@ 2021-11-16 8:45 ` Shinichiro Kawasaki
0 siblings, 0 replies; 8+ messages in thread
From: Shinichiro Kawasaki @ 2021-11-16 8:45 UTC (permalink / raw)
To: Vincent Guittot
Cc: Valentin Schneider, Vincent Donnefort, peterz, mingo,
linux-kernel, dietmar.eggemann, jing-ting.wu, Damien Le Moal
On Nov 16, 2021 / 09:02, Vincent Guittot wrote:
> Hi Shinichiro
>
> On Mon, 15 Nov 2021 at 19:34, Valentin Schneider
> <valentin.schneider@arm.com> wrote:
> >
> > On 15/11/21 10:46, Shinichiro Kawasaki wrote:
> > > Oh, this is the exactly same fix as I posted before [1]. It is a little bit sad
> > > that my post did not get reviewed. Anyway, good to see the issue fixed. Thanks.
> > >
> > > [1] https://lore.kernel.org/all/20211029005618.773579-1-shinichiro.kawasaki@wdc.com/
> > >
> >
> > Oh, that sucks, sorry about that. I do try to keep an eye out for sched
> > stuff sent to LKML, but clearly there are some that fall through :(
>
> I would advise you to use get_maintainer.pl to make sure to not miss
> to cc someone.
> Like valentin, this patch has been lost in the lkml flow and i haven't
> noticed it
Hi Velentin, Vincent, thank you for the comments.
I have run get_maintainer.pl on may patch, and it listed 9 maintainers and
reviewers. But my patch had only 2 out of the 9 in its To/Cc list, which was
too small. For next time, I will use get_maintainer.pl and post with full
To/Cc list. Thanks!
--
Best Regards,
Shin'ichiro Kawasaki
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-11-16 8:45 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-04 17:51 [PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain() Vincent Donnefort
2021-11-05 15:08 ` Vincent Guittot
2021-11-08 11:19 ` Peter Zijlstra
2021-11-11 12:22 ` [tip: sched/urgent] " tip-bot2 for Vincent Donnefort
2021-11-15 10:46 ` [PATCH] " Shinichiro Kawasaki
2021-11-15 18:33 ` Valentin Schneider
2021-11-16 8:02 ` Vincent Guittot
2021-11-16 8:45 ` Shinichiro Kawasaki
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).