Linux-Crypto Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
@ 2019-08-05 23:32 Stephen Boyd
  2019-08-06 15:01 ` Stephen Boyd
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Stephen Boyd @ 2019-08-05 23:32 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-kernel, linux-crypto, Andrey Pronin, Duncan Laurie,
	Jason Gunthorpe, Arnd Bergmann, Greg Kroah-Hartman,
	Guenter Roeck, Alexander Steffen

The hwrng_fill() function can run while devices are suspending and
resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
is suspended, the hwrng may hang the bus while attempting to add some
randomness. It's been observed on ChromeOS devices with suspend-to-idle
(s2idle) and an i2c based hwrng that this kthread may run and ask the
hwrng device for randomness before the i2c bus has been resumed.

Let's make this kthread freezable so that we don't try to touch the
hwrng during suspend/resume. This ensures that we can't cause the hwrng
backing driver to get into a bad state because the device is guaranteed
to be resumed before the hwrng kthread is thawed.

Cc: Andrey Pronin <apronin@chromium.org>
Cc: Duncan Laurie <dlaurie@chromium.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Guenter Roeck <groeck@chromium.org>
Cc: Alexander Steffen <Alexander.Steffen@infineon.com>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
---

I'm splitting this patch off of the larger series so it can
go through the crypto tree. See [1] for the prevoius round.
Nothing has changed in this patch since then.

[1] https://lkml.kernel.org/r/20190716224518.62556-2-swboyd@chromium.org

 drivers/char/hw_random/core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/char/hw_random/core.c b/drivers/char/hw_random/core.c
index 95be7228f327..3b88af3149a7 100644
--- a/drivers/char/hw_random/core.c
+++ b/drivers/char/hw_random/core.c
@@ -13,6 +13,7 @@
 #include <linux/delay.h>
 #include <linux/device.h>
 #include <linux/err.h>
+#include <linux/freezer.h>
 #include <linux/fs.h>
 #include <linux/hw_random.h>
 #include <linux/kernel.h>
@@ -421,7 +422,9 @@ static int hwrng_fillfn(void *unused)
 {
 	long rc;
 
-	while (!kthread_should_stop()) {
+	set_freezable();
+
+	while (!kthread_freezable_should_stop(NULL)) {
 		struct hwrng *rng;
 
 		rng = get_current_rng();
-- 
Sent by a computer through tubes


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
  2019-08-05 23:32 [PATCH v3] hwrng: core: Freeze khwrng thread during suspend Stephen Boyd
@ 2019-08-06 15:01 ` Stephen Boyd
  2019-08-15 12:06 ` Herbert Xu
  2019-10-28 23:45 ` Maciej S. Szmigiero
  2 siblings, 0 replies; 6+ messages in thread
From: Stephen Boyd @ 2019-08-06 15:01 UTC (permalink / raw)
  To: Herbert Xu
  Cc: linux-kernel, linux-crypto, Andrey Pronin, Duncan Laurie,
	Jason Gunthorpe, Arnd Bergmann, Greg Kroah-Hartman,
	Guenter Roeck, Alexander Steffen

Quoting Stephen Boyd (2019-08-05 16:32:41)
> The hwrng_fill() function can run while devices are suspending and
> resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
> is suspended, the hwrng may hang the bus while attempting to add some
> randomness. It's been observed on ChromeOS devices with suspend-to-idle
> (s2idle) and an i2c based hwrng that this kthread may run and ask the
> hwrng device for randomness before the i2c bus has been resumed.
> 
> Let's make this kthread freezable so that we don't try to touch the
> hwrng during suspend/resume. This ensures that we can't cause the hwrng
> backing driver to get into a bad state because the device is guaranteed
> to be resumed before the hwrng kthread is thawed.
> 
> Cc: Andrey Pronin <apronin@chromium.org>
> Cc: Duncan Laurie <dlaurie@chromium.org>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <groeck@chromium.org>
> Cc: Alexander Steffen <Alexander.Steffen@infineon.com>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---

Sorry, forgot to add

Fixes: be4000bc4644 ("hwrng: create filler thread")


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
  2019-08-05 23:32 [PATCH v3] hwrng: core: Freeze khwrng thread during suspend Stephen Boyd
  2019-08-06 15:01 ` Stephen Boyd
@ 2019-08-15 12:06 ` Herbert Xu
  2019-10-28 23:45 ` Maciej S. Szmigiero
  2 siblings, 0 replies; 6+ messages in thread
From: Herbert Xu @ 2019-08-15 12:06 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: linux-kernel, linux-crypto, Andrey Pronin, Duncan Laurie,
	Jason Gunthorpe, Arnd Bergmann, Greg Kroah-Hartman,
	Guenter Roeck, Alexander Steffen

On Mon, Aug 05, 2019 at 04:32:41PM -0700, Stephen Boyd wrote:
> The hwrng_fill() function can run while devices are suspending and
> resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
> is suspended, the hwrng may hang the bus while attempting to add some
> randomness. It's been observed on ChromeOS devices with suspend-to-idle
> (s2idle) and an i2c based hwrng that this kthread may run and ask the
> hwrng device for randomness before the i2c bus has been resumed.
> 
> Let's make this kthread freezable so that we don't try to touch the
> hwrng during suspend/resume. This ensures that we can't cause the hwrng
> backing driver to get into a bad state because the device is guaranteed
> to be resumed before the hwrng kthread is thawed.
> 
> Cc: Andrey Pronin <apronin@chromium.org>
> Cc: Duncan Laurie <dlaurie@chromium.org>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Guenter Roeck <groeck@chromium.org>
> Cc: Alexander Steffen <Alexander.Steffen@infineon.com>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> ---
> 
> I'm splitting this patch off of the larger series so it can
> go through the crypto tree. See [1] for the prevoius round.
> Nothing has changed in this patch since then.
> 
> [1] https://lkml.kernel.org/r/20190716224518.62556-2-swboyd@chromium.org
> 
>  drivers/char/hw_random/core.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Patch applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
  2019-08-05 23:32 [PATCH v3] hwrng: core: Freeze khwrng thread during suspend Stephen Boyd
  2019-08-06 15:01 ` Stephen Boyd
  2019-08-15 12:06 ` Herbert Xu
@ 2019-10-28 23:45 ` Maciej S. Szmigiero
       [not found]   ` <5db85058.1c69fb81.202d7.e3d0@mx.google.com>
  2 siblings, 1 reply; 6+ messages in thread
From: Maciej S. Szmigiero @ 2019-10-28 23:45 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Herbert Xu, linux-kernel, linux-crypto, Andrey Pronin,
	Duncan Laurie, Jason Gunthorpe, Arnd Bergmann,
	Greg Kroah-Hartman, Guenter Roeck, Alexander Steffen

Hi Stephen,

On 06.08.2019 01:32, Stephen Boyd wrote:
> The hwrng_fill() function can run while devices are suspending and
> resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
> is suspended, the hwrng may hang the bus while attempting to add some
> randomness. It's been observed on ChromeOS devices with suspend-to-idle
> (s2idle) and an i2c based hwrng that this kthread may run and ask the
> hwrng device for randomness before the i2c bus has been resumed.
> 
> Let's make this kthread freezable so that we don't try to touch the
> hwrng during suspend/resume. This ensures that we can't cause the hwrng
> backing driver to get into a bad state because the device is guaranteed
> to be resumed before the hwrng kthread is thawed.

This patch broke suspend with virtio-rng loaded (it hangs).

The problematic call chain is:
virtrng_freeze() -> remove_common() -> hwrng_unregister() ->
kthread_stop().

It looks like kthread_stop() can't finish on a frozen khwrng thread.

Reverting this commit makes a VM with virtio-rng driver loaded
suspend and resume correctly again.

Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
       [not found]   ` <5db85058.1c69fb81.202d7.e3d0@mx.google.com>
@ 2019-10-29 15:50     ` Maciej S. Szmigiero
       [not found]       ` <5dc0c12e.1c69fb81.2ce1c.01ee@mx.google.com>
  0 siblings, 1 reply; 6+ messages in thread
From: Maciej S. Szmigiero @ 2019-10-29 15:50 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Herbert Xu, linux-kernel, linux-crypto, Andrey Pronin,
	Duncan Laurie, Jason Gunthorpe, Arnd Bergmann,
	Greg Kroah-Hartman, Guenter Roeck, Alexander Steffen

On 29.10.2019 15:44, Stephen Boyd wrote:
> Quoting Maciej S. Szmigiero (2019-10-28 16:45:31)
>> Hi Stephen,
>>
>> On 06.08.2019 01:32, Stephen Boyd wrote:
>>> The hwrng_fill() function can run while devices are suspending and
>>> resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
>>> is suspended, the hwrng may hang the bus while attempting to add some
>>> randomness. It's been observed on ChromeOS devices with suspend-to-idle
>>> (s2idle) and an i2c based hwrng that this kthread may run and ask the
>>> hwrng device for randomness before the i2c bus has been resumed.
>>>
>>> Let's make this kthread freezable so that we don't try to touch the
>>> hwrng during suspend/resume. This ensures that we can't cause the hwrng
>>> backing driver to get into a bad state because the device is guaranteed
>>> to be resumed before the hwrng kthread is thawed.
>>
>> This patch broke suspend with virtio-rng loaded (it hangs).
>>
>> The problematic call chain is:
>> virtrng_freeze() -> remove_common() -> hwrng_unregister() ->
>> kthread_stop().
>>
>> It looks like kthread_stop() can't finish on a frozen khwrng thread.
> 
> Can you provide the suspend/resume logs?

There isn't much in the kernel log, the closest thing I can get is
with dyndbg="file drivers/base/power/main.c +p":
[   58.441073][ T3511] virtio-pci 0000:00:06.0: bus freeze
[   58.448744][ T3511] virtio-pci 0000:00:05.0: bus freeze
[   58.454500][ T3511] virtio-pci 0000:00:04.0: bus freeze
[   58.456873][ T3511] virtio-pci 0000:00:03.0: bus freeze

And then the VM hangs.

The 0000:00:03.0 pci device is virtio-rng.

If I add printks around that kthread_stop() in hwrng_unregister()
only the first one gets printed.

>>
>> Reverting this commit makes a VM with virtio-rng driver loaded
>> suspend and resume correctly again.
> 
> Which kernel are you testing on? There was a fix to this commit, i.e.
> ff296293b353 ("random: Support freezable kthreads in
> add_hwgenerator_randomness()"), which was fixed again by 59b569480dc8
> ("random: Use wait_event_freezable() in add_hwgenerator_randomness()").
> There was a problem with suspend/resume that I tried to fix with the
> first patch and then the second patch fixed the first one. See this
> thread[1] for some more background. You'll want all three.
> 
> [1] https://lkml.kernel.org/r/49fc7c64-88c0-74d0-2cb3-07986490941d@ti.com

The kernel under test is current torvalds/master, I can see that it
contains both commit ff296293b353 and commit 59b569480dc8.

I assume that the third commit you mention is the original one that this
e-mail message Subject line refers to (03a3bb7ae63150).

Thanks,
Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
       [not found]       ` <5dc0c12e.1c69fb81.2ce1c.01ee@mx.google.com>
@ 2019-11-10  0:30         ` Maciej S. Szmigiero
  0 siblings, 0 replies; 6+ messages in thread
From: Maciej S. Szmigiero @ 2019-11-10  0:30 UTC (permalink / raw)
  To: Stephen Boyd
  Cc: Herbert Xu, linux-kernel, linux-crypto, Andrey Pronin,
	Duncan Laurie, Jason Gunthorpe, Arnd Bergmann,
	Greg Kroah-Hartman, Guenter Roeck, Alexander Steffen, Amit Shah

On 05.11.2019 01:24, Stephen Boyd wrote:
> Quoting Maciej S. Szmigiero (2019-10-29 08:50:52)
>> On 29.10.2019 15:44, Stephen Boyd wrote:
>>> Quoting Maciej S. Szmigiero (2019-10-28 16:45:31)
>>>> Hi Stephen,
>>>>
>>>> On 06.08.2019 01:32, Stephen Boyd wrote:
>>>>> The hwrng_fill() function can run while devices are suspending and
>>>>> resuming. If the hwrng is behind a bus such as i2c or SPI and that bus
>>>>> is suspended, the hwrng may hang the bus while attempting to add some
>>>>> randomness. It's been observed on ChromeOS devices with suspend-to-idle
>>>>> (s2idle) and an i2c based hwrng that this kthread may run and ask the
>>>>> hwrng device for randomness before the i2c bus has been resumed.
>>>>>
>>>>> Let's make this kthread freezable so that we don't try to touch the
>>>>> hwrng during suspend/resume. This ensures that we can't cause the hwrng
>>>>> backing driver to get into a bad state because the device is guaranteed
>>>>> to be resumed before the hwrng kthread is thawed.
>>>>
>>>> This patch broke suspend with virtio-rng loaded (it hangs).
>>>>
>>>> The problematic call chain is:
>>>> virtrng_freeze() -> remove_common() -> hwrng_unregister() ->
>>>> kthread_stop().
>>>>
>>>> It looks like kthread_stop() can't finish on a frozen khwrng thread.
>>>
>>> Can you provide the suspend/resume logs?
>>
>> There isn't much in the kernel log, the closest thing I can get is
>> with dyndbg="file drivers/base/power/main.c +p":
>> [   58.441073][ T3511] virtio-pci 0000:00:06.0: bus freeze
>> [   58.448744][ T3511] virtio-pci 0000:00:05.0: bus freeze
>> [   58.454500][ T3511] virtio-pci 0000:00:04.0: bus freeze
>> [   58.456873][ T3511] virtio-pci 0000:00:03.0: bus freeze
>>
>> And then the VM hangs.
>>
>> The 0000:00:03.0 pci device is virtio-rng.
>>
>> If I add printks around that kthread_stop() in hwrng_unregister()
>> only the first one gets printed.
> 
> Ok. I don't know why virtio rng wants to remove itself and then reprobe
> across suspend/resume. Do you know the history there?

Hard to tell, I have added Amit, who had implemented these PM callbacks
back in 2012, to CC now.

> Can you enable the dynamic debug printk in __refrigerator()?
> 
> 	file kernel/freezer.c +p
> 
> That will let us know when the kthread has been frozen/thawed.
> 
> Either way, it sounds like maybe it's what you say, virtio rng wants to
> call kthread_stop() on a kthread that's been frozen already and then it
> just hangs waiting for the thread to wake up, which it never does. I
> can't convince myself that the schedule() inside __refrigerator() won't
> wake up though. I would think it leaves the refrigerator when
> kthread_stop() is called because the kthread will wakeup from
> wake_up_process() in kthread_stop(), see it should stop in
> __refrigerator() and eventually exit. Maybe the hwrng thread is stuck
> somewhere else?
> 

Yes, it turns out the hwrng kthread is actually stuck inside
add_hwgenerator_randomness() in wait_event_freezable() call introduced
by commit 59b569480dc8
("random: Use wait_event_freezable() in add_hwgenerator_randomness()").

wait_event_freezable() ultimately calls __refrigerator() with its
check_kthr_stop argument set to false, which causes it to keep the
kthread frozen even if somebody calls kthread_stop() on it.

Calling wait_event_freezable() with kthread_should_stop() as a condition
seems racy because it doesn't take into account the situation where this
condition becomes true on a kthread marked for freezing only after it
has been checked.

I was able to make the VM write a s2disk image with the following change:
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -2500,8 +2500,8 @@ void add_hwgenerator_randomness(const char *buffer, size_t count,
 	 * We'll be woken up again once below random_write_wakeup_thresh,
 	 * or when the calling thread is about to terminate.
 	 */
-	wait_event_freezable(random_write_wait,
-			kthread_should_stop() ||
+	wait_event_interruptible(random_write_wait,
+			kthread_should_stop() || freezing(current) ||
 			ENTROPY_BITS(&input_pool) <= random_write_wakeup_bits);
 	mix_pool_bytes(poolp, buffer, count);
 	credit_entropy_bits(poolp, entropy);

Calling freezing() should avoid the issue that the commit 59b569480dc8
has fixed, as it is only a checking function, it doesn't actually do
the freezing.

However, while the written image seems valid (the machine will resume
successfully from it) the suspend process still hangs, only now a bit
later.

It turns out there is a second issue where the set_freezable() call
at the beginning of hwrng_fillfn() will freeze this kthread with
(again) check_kthr_stop argument set to false when this kthread gets
relaunched when devices are resumed to write the hibernation image
at the suspend time.

That makes the frozen kthread impossible to stop on shutdown, so the
VM hangs there.

If I only clear the PF_NOFREEZE flag instead of calling set_freezable()
at the beginning of hwrng_fillfn() the suspend process will complete
successfully.

However, it seems to me that the real solution here would be to
change the virtio-rng driver to not unregister / reregister itself
on PM events rather than fight the freezer / kthread_stop()
interactions.

Maciej

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-05 23:32 [PATCH v3] hwrng: core: Freeze khwrng thread during suspend Stephen Boyd
2019-08-06 15:01 ` Stephen Boyd
2019-08-15 12:06 ` Herbert Xu
2019-10-28 23:45 ` Maciej S. Szmigiero
     [not found]   ` <5db85058.1c69fb81.202d7.e3d0@mx.google.com>
2019-10-29 15:50     ` Maciej S. Szmigiero
     [not found]       ` <5dc0c12e.1c69fb81.2ce1c.01ee@mx.google.com>
2019-11-10  0:30         ` Maciej S. Szmigiero

Linux-Crypto Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-crypto/0 linux-crypto/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-crypto linux-crypto/ https://lore.kernel.org/linux-crypto \
		linux-crypto@vger.kernel.org
	public-inbox-index linux-crypto

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-crypto


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git