linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b
@ 2020-06-25  5:24 kernelci.org bot
  2020-06-25  6:50 ` Guillaume Tucker
  0 siblings, 1 reply; 2+ messages in thread
From: kernelci.org bot @ 2020-06-25  5:24 UTC (permalink / raw)
  To: Tejun Heo, Mel Gorman, gtucker, Sasha Levin
  Cc: Li Zefan, Greg Kroah-Hartman, cgroups, Johannes Weiner, linux-kernel

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
* This automated bisection report was sent to you on the basis  *
* that you may be involved with the breaking commit it has      *
* found.  No manual investigation has been done to verify it,   *
* and the root cause of the problem may be somewhere else.      *
*                                                               *
* If you do send a fix, please include this trailer:            *
*   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
*                                                               *
* Hope this helps!                                              *
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b

Summary:
  Start:      61aba373f570 Linux 5.6.19
  Plain log:  https://storage.kernelci.org/stable-rc/linux-5.6.y/v5.6.19/arm64/defconfig/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.txt
  HTML log:   https://storage.kernelci.org/stable-rc/linux-5.6.y/v5.6.19/arm64/defconfig/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html
  Result:     9cf5d5444c78 Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"

Checks:
  revert:     PASS
  verify:     PASS

Parameters:
  Tree:       stable-rc
  URL:        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
  Branch:     linux-5.6.y
  Target:     bcm2837-rpi-3-b
  CPU arch:   arm64
  Lab:        lab-baylibre
  Compiler:   gcc-8
  Config:     defconfig
  Test case:  baseline.dmesg.crit

Breaking commit found:

-------------------------------------------------------------------------------
commit 9cf5d5444c78c14bb9f90dd21cef361ee14ba64b
Author: Tejun Heo <tj@kernel.org>
Date:   Thu Apr 9 14:55:35 2020 -0400

    Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
    
    [ Upstream commit d8ef4b38cb69d907f9b0e889c44d05fc0f890977 ]
    
    This reverts commit 9a9e97b2f1f2 ("cgroup: Add memory barriers to plug
    cgroup_rstat_updated() race window").
    
    The commit was added in anticipation of memcg rstat conversion which needed
    synchronous accounting for the event counters (e.g. oom kill count). However,
    the conversion didn't get merged due to percpu memory overhead concern which
    couldn't be addressed at the time.
    
    Unfortunately, the patch's addition of smp_mb() to cgroup_rstat_updated()
    meant that every scheduling event now had to go through an additional full
    barrier and Mel Gorman noticed it as 1% regression in netperf UDP_STREAM test.
    
    There's no need to have this barrier in tree now and even if we need
    synchronous accounting in the future, the right thing to do is separating that
    out to a separate function so that hot paths which don't care about
    synchronous behavior don't have to pay the overhead of the full barrier. Let's
    revert.
    
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Reported-by: Mel Gorman <mgorman@techsingularity.net>
    Link: http://lkml.kernel.org/r/20200409154413.GK3818@techsingularity.net
    Cc: v4.18+
    Signed-off-by: Sasha Levin <sashal@kernel.org>

diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index 6f87352f8219..41ca996568df 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -33,12 +33,9 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
 		return;
 
 	/*
-	 * Paired with the one in cgroup_rstat_cpu_pop_updated().  Either we
-	 * see NULL updated_next or they see our updated stat.
-	 */
-	smp_mb();
-
-	/*
+	 * Speculative already-on-list test. This may race leading to
+	 * temporary inaccuracies, which is fine.
+	 *
 	 * Because @parent's updated_children is terminated with @parent
 	 * instead of NULL, we can tell whether @cgrp is on the list by
 	 * testing the next pointer for NULL.
@@ -134,13 +131,6 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
 		*nextp = rstatc->updated_next;
 		rstatc->updated_next = NULL;
 
-		/*
-		 * Paired with the one in cgroup_rstat_cpu_updated().
-		 * Either they see NULL updated_next or we see their
-		 * updated stat.
-		 */
-		smp_mb();
-
 		return pos;
 	}
-------------------------------------------------------------------------------


Git bisection log:

-------------------------------------------------------------------------------
git bisect start
# good: [7111951b8d4973bda27ff663f2cf18b663d15b48] Linux 5.6
git bisect good 7111951b8d4973bda27ff663f2cf18b663d15b48
# bad: [61aba373f5708f2aebc3f72078e51949a068aa6f] Linux 5.6.19
git bisect bad 61aba373f5708f2aebc3f72078e51949a068aa6f
# good: [0bc4d081e87ef8fd43baeed7141267e558896f86] RDMA/cm: Fix an error check in cm_alloc_id_priv()
git bisect good 0bc4d081e87ef8fd43baeed7141267e558896f86
# good: [0deceb85de592b89f6476e2bfcb5321a95d23126] dmaengine: owl: Use correct lock in owl_dma_get_pchan()
git bisect good 0deceb85de592b89f6476e2bfcb5321a95d23126
# bad: [566f16e2d62fb3817529a05147ef46046ecffe6a] io_uring: initialize ctx->sqo_wait earlier
git bisect bad 566f16e2d62fb3817529a05147ef46046ecffe6a
# good: [23952755bc2f839844b5a60c29620cce98c9b09d] riscv: stacktrace: Fix undefined reference to `walk_stackframe'
git bisect good 23952755bc2f839844b5a60c29620cce98c9b09d
# good: [be31933db49c8e3d81b96ee6bc9933d2adafe13d] drm/amd/display: Fix potential integer wraparound resulting in a hang
git bisect good be31933db49c8e3d81b96ee6bc9933d2adafe13d
# good: [ee899046f79db390650e546446a7a0196788d39d] esp6: get the right proto for transport mode in esp6_gso_encap
git bisect good ee899046f79db390650e546446a7a0196788d39d
# good: [bdd07c33a3be1896ec1982fab1a2baf986fbe4dc] netfilter: conntrack: comparison of unsigned in cthelper confirmation
git bisect good bdd07c33a3be1896ec1982fab1a2baf986fbe4dc
# bad: [46f4a75fb5034c8d80a5388fe3bb46a8ed05df5b] ARC: [plat-eznps]: Restrict to CONFIG_ISA_ARCOMPACT
git bisect bad 46f4a75fb5034c8d80a5388fe3bb46a8ed05df5b
# good: [960a4cc3ec49f8292d0f837f0a6b28b03c54f042] Linux 5.6.16
git bisect good 960a4cc3ec49f8292d0f837f0a6b28b03c54f042
# bad: [9cf5d5444c78c14bb9f90dd21cef361ee14ba64b] Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
git bisect bad 9cf5d5444c78c14bb9f90dd21cef361ee14ba64b
# good: [2df9b670ad1a2120f54982e1b1dfd17a6ce3cc0c] x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
git bisect good 2df9b670ad1a2120f54982e1b1dfd17a6ce3cc0c
# first bad commit: [9cf5d5444c78c14bb9f90dd21cef361ee14ba64b] Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
-------------------------------------------------------------------------------

^ permalink raw reply related	[flat|nested] 2+ messages in thread

* Re: stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b
  2020-06-25  5:24 stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b kernelci.org bot
@ 2020-06-25  6:50 ` Guillaume Tucker
  0 siblings, 0 replies; 2+ messages in thread
From: Guillaume Tucker @ 2020-06-25  6:50 UTC (permalink / raw)
  To: Tejun Heo, Mel Gorman, Sasha Levin
  Cc: Li Zefan, Greg Kroah-Hartman, cgroups, Johannes Weiner, linux-kernel

On 25/06/2020 06:24, kernelci.org bot wrote:
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> *                                                               *
> * If you do send a fix, please include this trailer:            *
> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
> *                                                               *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b
> 
> Summary:
>   Start:      61aba373f570 Linux 5.6.19
>   Plain log:  https://storage.kernelci.org/stable-rc/linux-5.6.y/v5.6.19/arm64/defconfig/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.txt
>   HTML log:   https://storage.kernelci.org/stable-rc/linux-5.6.y/v5.6.19/arm64/defconfig/gcc-8/lab-baylibre/baseline-bcm2837-rpi-3-b.html
>   Result:     9cf5d5444c78 Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       stable-rc
>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
>   Branch:     linux-5.6.y
>   Target:     bcm2837-rpi-3-b
>   CPU arch:   arm64
>   Lab:        lab-baylibre
>   Compiler:   gcc-8
>   Config:     defconfig
>   Test case:  baseline.dmesg.crit

The "crit" kernel message is:

[   17.536674] hwmon hwmon1: Undervoltage detected!

which is a known intermittent issue on rpi-3-b that is most
likely to be unrelated to the patch found by this bisection.

Maybe the bisection landed on this patch because it makes the
undervoltage more likely by a side effect, if it affects timing
or the binary size or addresses in very subtle ways.  But even
then it wouldn't be a regression.

We should stop running bisections on rpi-3-b, at least for this
kernel error message, until it has been fixed to avoid future
false positives.

Sorry for the noise.

Guillaume


> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit 9cf5d5444c78c14bb9f90dd21cef361ee14ba64b
> Author: Tejun Heo <tj@kernel.org>
> Date:   Thu Apr 9 14:55:35 2020 -0400
> 
>     Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
>     
>     [ Upstream commit d8ef4b38cb69d907f9b0e889c44d05fc0f890977 ]
>     
>     This reverts commit 9a9e97b2f1f2 ("cgroup: Add memory barriers to plug
>     cgroup_rstat_updated() race window").
>     
>     The commit was added in anticipation of memcg rstat conversion which needed
>     synchronous accounting for the event counters (e.g. oom kill count). However,
>     the conversion didn't get merged due to percpu memory overhead concern which
>     couldn't be addressed at the time.
>     
>     Unfortunately, the patch's addition of smp_mb() to cgroup_rstat_updated()
>     meant that every scheduling event now had to go through an additional full
>     barrier and Mel Gorman noticed it as 1% regression in netperf UDP_STREAM test.
>     
>     There's no need to have this barrier in tree now and even if we need
>     synchronous accounting in the future, the right thing to do is separating that
>     out to a separate function so that hot paths which don't care about
>     synchronous behavior don't have to pay the overhead of the full barrier. Let's
>     revert.
>     
>     Signed-off-by: Tejun Heo <tj@kernel.org>
>     Reported-by: Mel Gorman <mgorman@techsingularity.net>
>     Link: http://lkml.kernel.org/r/20200409154413.GK3818@techsingularity.net
>     Cc: v4.18+
>     Signed-off-by: Sasha Levin <sashal@kernel.org>
> 
> diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
> index 6f87352f8219..41ca996568df 100644
> --- a/kernel/cgroup/rstat.c
> +++ b/kernel/cgroup/rstat.c
> @@ -33,12 +33,9 @@ void cgroup_rstat_updated(struct cgroup *cgrp, int cpu)
>  		return;
>  
>  	/*
> -	 * Paired with the one in cgroup_rstat_cpu_pop_updated().  Either we
> -	 * see NULL updated_next or they see our updated stat.
> -	 */
> -	smp_mb();
> -
> -	/*
> +	 * Speculative already-on-list test. This may race leading to
> +	 * temporary inaccuracies, which is fine.
> +	 *
>  	 * Because @parent's updated_children is terminated with @parent
>  	 * instead of NULL, we can tell whether @cgrp is on the list by
>  	 * testing the next pointer for NULL.
> @@ -134,13 +131,6 @@ static struct cgroup *cgroup_rstat_cpu_pop_updated(struct cgroup *pos,
>  		*nextp = rstatc->updated_next;
>  		rstatc->updated_next = NULL;
>  
> -		/*
> -		 * Paired with the one in cgroup_rstat_cpu_updated().
> -		 * Either they see NULL updated_next or we see their
> -		 * updated stat.
> -		 */
> -		smp_mb();
> -
>  		return pos;
>  	}
> -------------------------------------------------------------------------------
> 
> 
> Git bisection log:
> 
> -------------------------------------------------------------------------------
> git bisect start
> # good: [7111951b8d4973bda27ff663f2cf18b663d15b48] Linux 5.6
> git bisect good 7111951b8d4973bda27ff663f2cf18b663d15b48
> # bad: [61aba373f5708f2aebc3f72078e51949a068aa6f] Linux 5.6.19
> git bisect bad 61aba373f5708f2aebc3f72078e51949a068aa6f
> # good: [0bc4d081e87ef8fd43baeed7141267e558896f86] RDMA/cm: Fix an error check in cm_alloc_id_priv()
> git bisect good 0bc4d081e87ef8fd43baeed7141267e558896f86
> # good: [0deceb85de592b89f6476e2bfcb5321a95d23126] dmaengine: owl: Use correct lock in owl_dma_get_pchan()
> git bisect good 0deceb85de592b89f6476e2bfcb5321a95d23126
> # bad: [566f16e2d62fb3817529a05147ef46046ecffe6a] io_uring: initialize ctx->sqo_wait earlier
> git bisect bad 566f16e2d62fb3817529a05147ef46046ecffe6a
> # good: [23952755bc2f839844b5a60c29620cce98c9b09d] riscv: stacktrace: Fix undefined reference to `walk_stackframe'
> git bisect good 23952755bc2f839844b5a60c29620cce98c9b09d
> # good: [be31933db49c8e3d81b96ee6bc9933d2adafe13d] drm/amd/display: Fix potential integer wraparound resulting in a hang
> git bisect good be31933db49c8e3d81b96ee6bc9933d2adafe13d
> # good: [ee899046f79db390650e546446a7a0196788d39d] esp6: get the right proto for transport mode in esp6_gso_encap
> git bisect good ee899046f79db390650e546446a7a0196788d39d
> # good: [bdd07c33a3be1896ec1982fab1a2baf986fbe4dc] netfilter: conntrack: comparison of unsigned in cthelper confirmation
> git bisect good bdd07c33a3be1896ec1982fab1a2baf986fbe4dc
> # bad: [46f4a75fb5034c8d80a5388fe3bb46a8ed05df5b] ARC: [plat-eznps]: Restrict to CONFIG_ISA_ARCOMPACT
> git bisect bad 46f4a75fb5034c8d80a5388fe3bb46a8ed05df5b
> # good: [960a4cc3ec49f8292d0f837f0a6b28b03c54f042] Linux 5.6.16
> git bisect good 960a4cc3ec49f8292d0f837f0a6b28b03c54f042
> # bad: [9cf5d5444c78c14bb9f90dd21cef361ee14ba64b] Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
> git bisect bad 9cf5d5444c78c14bb9f90dd21cef361ee14ba64b
> # good: [2df9b670ad1a2120f54982e1b1dfd17a6ce3cc0c] x86/syscalls: Revert "x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long"
> git bisect good 2df9b670ad1a2120f54982e1b1dfd17a6ce3cc0c
> # first bad commit: [9cf5d5444c78c14bb9f90dd21cef361ee14ba64b] Revert "cgroup: Add memory barriers to plug cgroup_rstat_updated() race window"
> -------------------------------------------------------------------------------
> 


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-06-25  6:50 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-25  5:24 stable-rc/linux-5.6.y bisection: baseline.dmesg.crit on bcm2837-rpi-3-b kernelci.org bot
2020-06-25  6:50 ` Guillaume Tucker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).