linux-arm-kernel.lists.infradead.org archive mirror
* Re: next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
       [not found] <63426bce.170a0220.b8179.75b2@mx.google.com>
@ 2022-10-10 11:59 ` Mark Brown
  2022-10-10 12:50   ` Jason A. Donenfeld
  0 siblings, 1 reply; 9+ messages in thread
From: Mark Brown @ 2022-10-10 11:59 UTC (permalink / raw)
  To: Florian Fainelli, Jason A. Donenfeld, Dominik Brodowski, Herbert Xu
  Cc: kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list


The KernelCI bisection bot found a boot failure on Raspberry Pi 2B with
multi_v7_defconfig+CONFIG_THUMB2_KERNEL on next/pending-fixes triggered
by b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted
sources").  An RCU stall is detected towards the end of boot while reading
from the bcm2835 hwrng:

<6>[    3.362859] Freeing initrd memory: 16196K
<3>[   23.160131] rcu: INFO: rcu_sched self-detected stall on CPU
<3>[   23.166057] rcu: 	0-....: (2099 ticks this GP) idle=03b4/1/0x40000002 softirq=28/28 fqs=1050
<4>[   23.174895] 	(t=2101 jiffies g=-1147 q=2353 ncpus=4)
<4>[   23.180203] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1
<4>[   23.186125] Hardware name: BCM2835
<4>[   23.189837] PC is at bcm2835_rng_read+0x30/0x6c
<4>[   23.194709] LR is at hwrng_fillfn+0x71/0xf4
<4>[   23.199218] pc : [<c07ccdc8>]    lr : [<c07cb841>]    psr: 40000033
<4>[   23.205840] sp : f093df70  ip : 00000000  fp : 00000000
<4>[   23.211404] r10: c3c7e800  r9 : 00000000  r8 : c17e6b20
<4>[   23.216968] r7 : c17e6b64  r6 : c18b0a74  r5 : c07ccd99  r4 : c3f171c0
<4>[   23.223855] r3 : 000fffff  r2 : 00000040  r1 : c3c7e800  r0 : c3f171c0
<4>[   23.230743] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
<4>[   23.238426] Control: 50c5387d  Table: 0020406a  DAC: 00000051
<4>[   23.244519] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1
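
As a rough sanity check on the numbers in the log: assuming the kernel
was built with HZ=100 (an assumption here, the config value is not shown
in the log), the t=2101 jiffies in the stall report works out to about
21 seconds, in line with the roughly 20 second silence between the
initrd message and the stall report. The helper name below is made up
for illustration:

```c
#include <assert.h>

/* Hypothetical helper: convert a jiffies count to milliseconds for a
 * given tick rate.  HZ=100 is an assumption, not taken from the log. */
static unsigned long jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
	assert(hz != 0);	/* a zero tick rate would divide by zero */
	return jiffies * 1000UL / hz;
}
```

With those assumed values, jiffies_to_ms(2101, 100) gives 21010, i.e.
roughly 21 seconds during which the hwrng thread held the CPU.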

Similar issues appear to be present with other configurations, for
example a plain multi_v7_defconfig:

https://storage.kernelci.org/next/pending-fixes/v6.0-9666-g02c05e0b8d5c/arm/multi_v7_defconfig/gcc-10/lab-collabora/baseline-bcm2836-rpi-2-b.html

I've left the full report below with links to more information including
full logs, plus a reported-by tag for the bot:

On Sat, Oct 08, 2022 at 11:35:58PM -0700, KernelCI bot wrote:
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> * This automated bisection report was sent to you on the basis  *
> * that you may be involved with the breaking commit it has      *
> * found.  No manual investigation has been done to verify it,   *
> * and the root cause of the problem may be somewhere else.      *
> *                                                               *
> * If you do send a fix, please include this trailer:            *
> *   Reported-by: "kernelci.org bot" <bot@kernelci.org>          *
> *                                                               *
> * Hope this helps!                                              *
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
> 
> Summary:
>   Start:      7871897dadfa Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply.git
>   Plain log:  https://storage.kernelci.org/next/pending-fixes/v6.0-5324-g7871897dadfa9/arm/multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y/gcc-10/lab-collabora/baseline-bcm2836-rpi-2-b.txt
>   HTML log:   https://storage.kernelci.org/next/pending-fixes/v6.0-5324-g7871897dadfa9/arm/multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y/gcc-10/lab-collabora/baseline-bcm2836-rpi-2-b.html
>   Result:     b006c439d58d hwrng: core - start hwrng kthread also for untrusted sources
> 
> Checks:
>   revert:     PASS
>   verify:     PASS
> 
> Parameters:
>   Tree:       next
>   URL:        https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
>   Branch:     pending-fixes
>   Target:     bcm2836-rpi-2-b
>   CPU arch:   arm
>   Lab:        lab-collabora
>   Compiler:   gcc-10
>   Config:     multi_v7_defconfig+CONFIG_THUMB2_KERNEL=y
>   Test case:  baseline.login
> 
> Breaking commit found:
> 
> -------------------------------------------------------------------------------
> commit b006c439d58db625318bf2207feabf847510a8a6
> Author: Dominik Brodowski <linux@dominikbrodowski.net>
> Date:   Thu Sep 22 15:59:31 2022 +0200
> 
>     hwrng: core - start hwrng kthread also for untrusted sources
>     
>     Start the hwrng kthread even if the hwrng source has a quality setting
>     of zero. Then, every crng reseed interval, one batch of data from this
>     zero-quality hwrng source will be mixed into the CRNG pool.
>     
>     This patch is based on the assumption that data from a hwrng source
>     will not actively harm the CRNG state. Instead, many hwrng sources
>     (such as TPM devices), even though they are assigned a quality level of
>     zero, actually provide some entropy, which is good enough to mix into
>     the CRNG pool every once in a while.
>     
>     Cc: Herbert Xu <herbert@gondor.apana.org.au>
>     Cc: Jason A. Donenfeld <Jason@zx2c4.com>
>     Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
>     Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
> 
> diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
> index d7045dfaf16c..cc002b0c2f0c 100644
> --- a/drivers/char/hw_random/core.c
> +++ b/drivers/char/hw_random/core.c
> @@ -52,7 +52,7 @@ MODULE_PARM_DESC(default_quality,
>  
>  static void drop_current_rng(void);
>  static int hwrng_init(struct hwrng *rng);
> -static void hwrng_manage_rngd(struct hwrng *rng);
> +static int hwrng_fillfn(void *unused);
>  
>  static inline int rng_get_data(struct hwrng *rng, u8 *buffer, size_t size,
>  			       int wait);
> @@ -96,6 +96,15 @@ static int set_current_rng(struct hwrng *rng)
>  	drop_current_rng();
>  	current_rng = rng;
>  
> +	/* if necessary, start hwrng thread */
> +	if (!hwrng_fill) {
> +		hwrng_fill = kthread_run(hwrng_fillfn, NULL, "hwrng");
> +		if (IS_ERR(hwrng_fill)) {
> +			pr_err("hwrng_fill thread creation failed\n");
> +			hwrng_fill = NULL;
> +		}
> +	}
> +
>  	return 0;
>  }
>  
> @@ -167,8 +176,6 @@ static int hwrng_init(struct hwrng *rng)
>  		rng->quality = 1024;
>  	current_quality = rng->quality; /* obsolete */
>  
> -	hwrng_manage_rngd(rng);
> -
>  	return 0;
>  }
>  
> @@ -454,10 +461,6 @@ static ssize_t rng_quality_store(struct device *dev,
>  	/* the best available RNG may have changed */
>  	ret = enable_best_rng();
>  
> -	/* start/stop rngd if necessary */
> -	if (current_rng)
> -		hwrng_manage_rngd(current_rng);
> -
>  out:
>  	mutex_unlock(&rng_mutex);
>  	return ret ? ret : len;
> @@ -513,9 +516,6 @@ static int hwrng_fillfn(void *unused)
>  
>  		put_rng(rng);
>  
> -		if (!quality)
> -			break;
> -
>  		if (rc <= 0)
>  			continue;
>  
> @@ -534,22 +534,6 @@ static int hwrng_fillfn(void *unused)
>  	return 0;
>  }
>  
> -static void hwrng_manage_rngd(struct hwrng *rng)
> -{
> -	if (WARN_ON(!mutex_is_locked(&rng_mutex)))
> -		return;
> -
> -	if (rng->quality == 0 && hwrng_fill)
> -		kthread_stop(hwrng_fill);
> -	if (rng->quality > 0 && !hwrng_fill) {
> -		hwrng_fill = kthread_run(hwrng_fillfn, NULL, "hwrng");
> -		if (IS_ERR(hwrng_fill)) {
> -			pr_err("hwrng_fill thread creation failed\n");
> -			hwrng_fill = NULL;
> -		}
> -	}
> -}
> -
>  int hwrng_register(struct hwrng *rng)
>  {
>  	int err = -EINVAL;
> -------------------------------------------------------------------------------
> 
> 
> Git bisection log:
> 
> -------------------------------------------------------------------------------
> git bisect start
> # good: [833477fce7a14d43ae4c07f8ddc32fa5119471a2] Merge tag 'sound-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
> git bisect good 833477fce7a14d43ae4c07f8ddc32fa5119471a2
> # bad: [7871897dadfa90816daf4963be075236587ada9d] Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/sre/linux-power-supply.git
> git bisect bad 7871897dadfa90816daf4963be075236587ada9d
> # good: [1b79573de717cfabe28221a98afaa6a3ff0e7458] crypto: blake2s - revert unintended config addition of CRYPTO_BLAKE2S
> git bisect good 1b79573de717cfabe28221a98afaa6a3ff0e7458
> # good: [1db031a4f6c36c02dbdd20d6c8e3f9771cdd4b78] Merge branch 'counter-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wbg/counter.git
> git bisect good 1db031a4f6c36c02dbdd20d6c8e3f9771cdd4b78
> # bad: [2bf13565c357c2cb9fef5d929fbf4fa2541d92de] Merge branch 'master' of git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> git bisect bad 2bf13565c357c2cb9fef5d929fbf4fa2541d92de
> # bad: [b006c439d58db625318bf2207feabf847510a8a6] hwrng: core - start hwrng kthread also for untrusted sources
> git bisect bad b006c439d58db625318bf2207feabf847510a8a6
> # good: [f78f6f0bf34fd85c17ebcb31d645536112aa25d3] crypto: aspeed - fix build error when only CRYPTO_DEV_ASPEED is enabled
> git bisect good f78f6f0bf34fd85c17ebcb31d645536112aa25d3
> # good: [70513e1d65599f39aba4fa6594546f7c81fa59f4] crypto: aspeed - Fix check for platform_get_irq() errors
> git bisect good 70513e1d65599f39aba4fa6594546f7c81fa59f4
> # good: [0cb3c9cdf7fcc2ef75a6008223d2e3ee58ea00e1] crypto: octeontx2 - Remove the unneeded result variable
> git bisect good 0cb3c9cdf7fcc2ef75a6008223d2e3ee58ea00e1
> # good: [4edff849f7a0abca962374512907b3e2151091f4] crypto: zip - remove the unneeded result variable
> git bisect good 4edff849f7a0abca962374512907b3e2151091f4
> # first bad commit: [b006c439d58db625318bf2207feabf847510a8a6] hwrng: core - start hwrng kthread also for untrusted sources
> -------------------------------------------------------------------------------

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
  2022-10-10 11:59 ` next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b Mark Brown
@ 2022-10-10 12:50   ` Jason A. Donenfeld
  2022-10-10 13:52     ` Mark Brown
  0 siblings, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-10-10 12:50 UTC (permalink / raw)
  To: Mark Brown
  Cc: Florian Fainelli, Dominik Brodowski, Herbert Xu,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list

On Mon, Oct 10, 2022 at 12:59:41PM +0100, Mark Brown wrote:
> The KernelCI bisection bot found a boot failure on Raspberry Pi 2B with
> multi_v7_defconfig+CONFIG_THUMB2_KERNEL on next/pending-fixes triggered
> by b006c439d58d ("hwrng: core - start hwrng kthread also for untrusted
> sources").  A RCU stall is detected towards the end of boot reading from
> the bcm2835 hwrng:
> 
> <6>[    3.362859] Freeing initrd memory: 16196K
> <3>[   23.160131] rcu: INFO: rcu_sched self-detected stall on CPU
> <3>[   23.166057] rcu: 	0-....: (2099 ticks this GP) idle=03b4/1/0x40000002 softirq=28/28 fqs=1050
> <4>[   23.174895] 	(t=2101 jiffies g=-1147 q=2353 ncpus=4)
> <4>[   23.180203] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1
> <4>[   23.186125] Hardware name: BCM2835
> <4>[   23.189837] PC is at bcm2835_rng_read+0x30/0x6c
> <4>[   23.194709] LR is at hwrng_fillfn+0x71/0xf4
> <4>[   23.199218] pc : [<c07ccdc8>]    lr : [<c07cb841>]    psr: 40000033
> <4>[   23.205840] sp : f093df70  ip : 00000000  fp : 00000000
> <4>[   23.211404] r10: c3c7e800  r9 : 00000000  r8 : c17e6b20
> <4>[   23.216968] r7 : c17e6b64  r6 : c18b0a74  r5 : c07ccd99  r4 : c3f171c0
> <4>[   23.223855] r3 : 000fffff  r2 : 00000040  r1 : c3c7e800  r0 : c3f171c0
> <4>[   23.230743] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
> <4>[   23.238426] Control: 50c5387d  Table: 0020406a  DAC: 00000051
> <4>[   23.244519] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1

In drivers/char/hw_random/bcm2835-rng.c, if you replace the cpu_relax()
with hwrng_msleep(rng, 1000), does it fix the problem?
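
To make the question concrete, here is a rough userspace sketch (not
kernel code) of a read loop that sleeps between status polls instead of
spinning. The fake_rng structure, its fields and the helper names are all
hypothetical stand-ins; only the shape of the loop mirrors the driver's
wait loop, and nanosleep() merely plays the role that hwrng_msleep()
would play in the kernel:

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the device: reports "no data" for the first
 * polls_until_ready status reads, then reports data available. */
struct fake_rng {
	int polls_until_ready;
	int sleeps;		/* number of times the reader slept */
};

static bool fake_rng_has_data(struct fake_rng *rng)
{
	if (rng->polls_until_ready > 0) {
		rng->polls_until_ready--;
		return false;
	}
	return true;
}

/* Userspace stand-in for sleeping between polls. */
static void sleep_ms(long ms)
{
	struct timespec ts = {
		.tv_sec = ms / 1000,
		.tv_nsec = (ms % 1000) * 1000000L,
	};

	nanosleep(&ts, NULL);
}

/* Returns the number of bytes "read"; 0 if there is no data and !wait. */
static int fake_rng_read(struct fake_rng *rng, bool wait, long poll_ms)
{
	assert(rng != NULL);

	while (!fake_rng_has_data(rng)) {
		if (!wait)
			return 0;
		rng->sleeps++;
		sleep_ms(poll_ms);	/* give up the CPU instead of spinning */
	}
	return 4;	/* pretend one 32-bit word was available */
}
```

In this sketch a caller that passes wait=false gets 0 back immediately,
while a waiting caller sleeps once per unsuccessful poll, so other work
can run in between.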

Jason


* Re: next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
  2022-10-10 12:50   ` Jason A. Donenfeld
@ 2022-10-10 13:52     ` Mark Brown
  2022-10-10 15:06       ` [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax() Jason A. Donenfeld
  2022-10-17 20:31       ` next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b Jason A. Donenfeld
  0 siblings, 2 replies; 9+ messages in thread
From: Mark Brown @ 2022-10-10 13:52 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Florian Fainelli, Dominik Brodowski, Herbert Xu,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list


On Mon, Oct 10, 2022 at 02:50:54PM +0200, Jason A. Donenfeld wrote:
> On Mon, Oct 10, 2022 at 12:59:41PM +0100, Mark Brown wrote:

> > <4>[   23.186125] Hardware name: BCM2835
> > <4>[   23.189837] PC is at bcm2835_rng_read+0x30/0x6c
> > <4>[   23.194709] LR is at hwrng_fillfn+0x71/0xf4

> In drivers/char/hw_random/bcm2835-rng.c, if you replace the cpu_relax()
> with hwrng_msleep(rng, 1000), does it fix the problem?

I can't schedule tests myself; hopefully someone with access to the
hardware can do so.


* [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax()
  2022-10-10 13:52     ` Mark Brown
@ 2022-10-10 15:06       ` Jason A. Donenfeld
  2022-10-10 15:37         ` Jason A. Donenfeld
                           ` (2 more replies)
  2022-10-17 20:31       ` next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b Jason A. Donenfeld
  1 sibling, 3 replies; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-10-10 15:06 UTC (permalink / raw)
  To: Mark Brown, Florian Fainelli, Jason A. Donenfeld,
	Dominik Brodowski, Herbert Xu, kernelci-results, bot, gtucker,
	linux-arm-kernel, linux-rpi-kernel, bcm-kernel-feedback-list,
	linux-crypto, linux-kernel

Rather than busy looping, yield back to the scheduler and sleep for a
bit in the event that there's no data. This should hopefully prevent the
stalls that Mark reported:

<6>[    3.362859] Freeing initrd memory: 16196K
<3>[   23.160131] rcu: INFO: rcu_sched self-detected stall on CPU
<3>[   23.166057] rcu:  0-....: (2099 ticks this GP) idle=03b4/1/0x40000002 softirq=28/28 fqs=1050
<4>[   23.174895]       (t=2101 jiffies g=-1147 q=2353 ncpus=4)
<4>[   23.180203] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1
<4>[   23.186125] Hardware name: BCM2835
<4>[   23.189837] PC is at bcm2835_rng_read+0x30/0x6c
<4>[   23.194709] LR is at hwrng_fillfn+0x71/0xf4
<4>[   23.199218] pc : [<c07ccdc8>]    lr : [<c07cb841>]    psr: 40000033
<4>[   23.205840] sp : f093df70  ip : 00000000  fp : 00000000
<4>[   23.211404] r10: c3c7e800  r9 : 00000000  r8 : c17e6b20
<4>[   23.216968] r7 : c17e6b64  r6 : c18b0a74  r5 : c07ccd99  r4 : c3f171c0
<4>[   23.223855] r3 : 000fffff  r2 : 00000040  r1 : c3c7e800  r0 : c3f171c0
<4>[   23.230743] Flags: nZcv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment none
<4>[   23.238426] Control: 50c5387d  Table: 0020406a  DAC: 00000051
<4>[   23.244519] CPU: 0 PID: 49 Comm: hwrng Not tainted 6.0.0 #1

Link: https://lore.kernel.org/all/Y0QJLauamRnCDUef@sirena.org.uk/
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
I haven't tested this. Somebody with access to that kernel CI infra that
triggered this will need to test.

 drivers/char/hw_random/bcm2835-rng.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/bcm2835-rng.c b/drivers/char/hw_random/bcm2835-rng.c
index e7dd457e9b22..e98fcac578d6 100644
--- a/drivers/char/hw_random/bcm2835-rng.c
+++ b/drivers/char/hw_random/bcm2835-rng.c
@@ -71,7 +71,7 @@ static int bcm2835_rng_read(struct hwrng *rng, void *buf, size_t max,
 	while ((rng_readl(priv, RNG_STATUS) >> 24) == 0) {
 		if (!wait)
 			return 0;
-		cpu_relax();
+		hwrng_msleep(rng, 1000);
 	}
 
 	num_words = rng_readl(priv, RNG_STATUS) >> 24;
-- 
2.37.3



* Re: [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax()
  2022-10-10 15:06       ` [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax() Jason A. Donenfeld
@ 2022-10-10 15:37         ` Jason A. Donenfeld
  2022-10-10 22:05         ` Florian Fainelli
  2022-10-14 11:06         ` Herbert Xu
  2 siblings, 0 replies; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-10-10 15:37 UTC (permalink / raw)
  To: Mark Brown, Florian Fainelli, Dominik Brodowski, Herbert Xu,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list, linux-crypto,
	linux-kernel

On Mon, Oct 10, 2022 at 09:06:07AM -0600, Jason A. Donenfeld wrote:
> Rather than busy looping, yield back to the scheduler and sleep for a
> bit in the event that there's no data. This should hopefully prevent the
> stalls that Mark reported:
> 
> [...]
> 
> Link: https://lore.kernel.org/all/Y0QJLauamRnCDUef@sirena.org.uk/
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> I haven't tested this. Somebody with access to that kernel CI infra that
> triggered this will need to test.

So I succeeded at testing this, sort of. I was able to reproduce the
hang on a CONFIG_PREEMPT=n kernel with this diff:

diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c
index ee4da9ab8013..19e1186f0db0 100644
--- a/drivers/net/wireguard/main.c
+++ b/drivers/net/wireguard/main.c
@@ -15,12 +15,29 @@
 #include <linux/init.h>
 #include <linux/module.h>
 #include <linux/genetlink.h>
+#include <linux/hw_random.h>
 #include <net/rtnetlink.h>

+static int derp_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
+{
+	if (wait) {
+		for (;;)
+			cpu_relax();
+	}
+	return 0;
+}
+
+static struct hwrng derp_ops = {
+	.name = "flurpderp",
+	.read = derp_rng_read,
+};
+
 static int __init wg_mod_init(void)
 {
 	int ret;

+	hwrng_register(&derp_ops);
+
 	ret = wg_allowedips_slab_init();
 	if (ret < 0)
 		goto err_allowedips;


Next, I changed the cpu_relax() into hwrng_msleep(), as this patch does:

diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c
index ee4da9ab8013..19e1186f0db0 100644
--- a/drivers/net/wireguard/main.c
+++ b/drivers/net/wireguard/main.c
@@ -15,12 +15,29 @@
 #include <linux/init.h>
 #include <linux/module.h>
 #include <linux/genetlink.h>
+#include <linux/hw_random.h>
 #include <net/rtnetlink.h>

+static int derp_rng_read(struct hwrng *rng, void *buf, size_t max, bool wait)
+{
+	if (wait) {
+		for (;;)
+			hwrng_msleep(rng, 1000);
+	}
+	return 0;
+}
+
+static struct hwrng derp_ops = {
+	.name = "flurpderp",
+	.read = derp_rng_read,
+};
+
 static int __init wg_mod_init(void)
 {
 	int ret;

+	hwrng_register(&derp_ops);
+
 	ret = wg_allowedips_slab_init();
 	if (ret < 0)
 		goto err_allowedips;

And then the problem went away.

So I think this patch is a good one.
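
For readers who want the intuition behind the CONFIG_PREEMPT=n result,
the following is a deliberately simplified, hypothetical userspace model
and not how the scheduler is implemented: on a single CPU without
preemption, the running task keeps the CPU until it gives it up
voluntarily, so a waiter that only spins leaves no ticks for anything
else, while a waiter that sleeps does. All names below are invented for
the illustration:

```c
#include <assert.h>

enum waiter_kind { WAITER_SPINS, WAITER_SLEEPS };

struct sim_result {
	unsigned int waiter_ticks;	/* ticks spent in the waiting task */
	unsigned int other_ticks;	/* ticks left for everything else */
};

/* Toy single-CPU model without preemption: the running task keeps the
 * CPU until it gives it up voluntarily. */
static struct sim_result simulate_no_preempt(enum waiter_kind kind,
					     unsigned int total_ticks)
{
	struct sim_result res = { 0, 0 };
	int waiter_running = 1;	/* the waiter is scheduled first */
	unsigned int t;

	assert(kind == WAITER_SPINS || kind == WAITER_SLEEPS);

	for (t = 0; t < total_ticks; t++) {
		if (waiter_running) {
			res.waiter_ticks++;
			/* Only a sleeping waiter hands the CPU back. */
			if (kind == WAITER_SLEEPS)
				waiter_running = 0;
		} else {
			res.other_ticks++;
			waiter_running = 1;	/* other work yields after a tick */
		}
	}
	return res;
}
```

In this toy model a spinning waiter takes every tick, whereas a sleeping
waiter alternates with the other work, which is the behaviour difference
the two test diffs above exercise on a real kernel.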

Jason


* Re: [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax()
  2022-10-10 15:06       ` [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax() Jason A. Donenfeld
  2022-10-10 15:37         ` Jason A. Donenfeld
@ 2022-10-10 22:05         ` Florian Fainelli
  2022-10-14 11:06         ` Herbert Xu
  2 siblings, 0 replies; 9+ messages in thread
From: Florian Fainelli @ 2022-10-10 22:05 UTC (permalink / raw)
  To: Jason A. Donenfeld, Mark Brown, Florian Fainelli,
	Dominik Brodowski, Herbert Xu, kernelci-results, bot, gtucker,
	linux-arm-kernel, linux-rpi-kernel, bcm-kernel-feedback-list,
	linux-crypto, linux-kernel

On 10/10/22 08:06, 'Jason A. Donenfeld' via BCM-KERNEL-FEEDBACK-LIST,PDL wrote:
> Rather than busy looping, yield back to the scheduler and sleep for a
> bit in the event that there's no data. This should hopefully prevent the
> stalls that Mark reported:
> 
> [...]
> 
> Link: https://lore.kernel.org/all/Y0QJLauamRnCDUef@sirena.org.uk/
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>

Acked-by: Florian Fainelli <f.fainelli@gmail.com>
-- 
Florian


* Re: [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax()
  2022-10-10 15:06       ` [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax() Jason A. Donenfeld
  2022-10-10 15:37         ` Jason A. Donenfeld
  2022-10-10 22:05         ` Florian Fainelli
@ 2022-10-14 11:06         ` Herbert Xu
  2 siblings, 0 replies; 9+ messages in thread
From: Herbert Xu @ 2022-10-14 11:06 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Mark Brown, Florian Fainelli, Dominik Brodowski,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list, linux-crypto,
	linux-kernel

On Mon, Oct 10, 2022 at 09:06:07AM -0600, Jason A. Donenfeld wrote:
> Rather than busy looping, yield back to the scheduler and sleep for a
> bit in the event that there's no data. This should hopefully prevent the
> stalls that Mark reported:
> 
> [...]
> 
> Link: https://lore.kernel.org/all/Y0QJLauamRnCDUef@sirena.org.uk/
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
> I haven't tested this. Somebody with access to that kernel CI infra that
> triggered this will need to test.
> 
>  drivers/char/hw_random/bcm2835-rng.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
  2022-10-10 13:52     ` Mark Brown
  2022-10-10 15:06       ` [PATCH] hw_random: bcm2835: use hwrng_msleep() instead of cpu_relax() Jason A. Donenfeld
@ 2022-10-17 20:31       ` Jason A. Donenfeld
  2022-10-18 11:46         ` Mark Brown
  1 sibling, 1 reply; 9+ messages in thread
From: Jason A. Donenfeld @ 2022-10-17 20:31 UTC (permalink / raw)
  To: Mark Brown
  Cc: Florian Fainelli, Dominik Brodowski, Herbert Xu,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list

Hey Mark,

Linus just merged the fix for this. Can you kick your CI and verify
that the latest tree no longer exhibits the issue?

Thanks,
Jason


* Re: next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b
  2022-10-17 20:31       ` next/pending-fixes bisection: baseline.login on bcm2836-rpi-2-b Jason A. Donenfeld
@ 2022-10-18 11:46         ` Mark Brown
  0 siblings, 0 replies; 9+ messages in thread
From: Mark Brown @ 2022-10-18 11:46 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Florian Fainelli, Dominik Brodowski, Herbert Xu,
	kernelci-results, bot, gtucker, linux-arm-kernel,
	linux-rpi-kernel, bcm-kernel-feedback-list


On Mon, Oct 17, 2022 at 02:31:39PM -0600, Jason A. Donenfeld wrote:

> Linus just merged the fix for this. Can you kick your CI and verify
> that the latest tree no longer exhibits the issue?

This isn't "my" CI, it's kernelci.org.  As I said, I don't have access
to run specific tests.  Right now Linus' tree is still being built and
run, so there are no results yet, but if any appear for the relevant
board they should be at:

   https://linux.kernelci.org/test/job/mainline/branch/master/kernel/v6.1-rc1-10-gbb1a1146467a/

(or you can drill down from https://linux.kernelci.org/ if a new
revision appears before you get round to it, or if the fix ended up
in an older version or a different tree).

