Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend

From: Amit Shah <amit@kernel.org>
To: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>,
	Stephen Boyd <swboyd@chromium.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>,
	linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
	Andrey Pronin <apronin@chromium.org>,
	Duncan Laurie <dlaurie@chromium.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Guenter Roeck <groeck@chromium.org>,
	Alexander Steffen <Alexander.Steffen@infineon.com>,
	Amit Shah <amit@kernel.org>
Subject: Re: [PATCH v3] hwrng: core: Freeze khwrng thread during suspend
Date: Tue, 17 Dec 2019 10:22:54 +0100	[thread overview]
Message-ID: <bcf52e98b1743410a87107b8430da49aec81fddd.camel@kernel.org> (raw)
In-Reply-To: <fe7e608e-d94d-a15c-bdc0-8431e9fce5c1@maciej.szmigiero.name>

(Apologies for the really late response.)

On Sun, 2019-11-10 at 01:30 +0100, Maciej S. Szmigiero wrote:
> On 05.11.2019 01:24, Stephen Boyd wrote:
> > Quoting Maciej S. Szmigiero (2019-10-29 08:50:52)
> > > On 29.10.2019 15:44, Stephen Boyd wrote:
> > > > Quoting Maciej S. Szmigiero (2019-10-28 16:45:31)
> > > > > Hi Stephen,
> > > > > 
> > > > > On 06.08.2019 01:32, Stephen Boyd wrote:
> > > > > > The hwrng_fill() function can run while devices are
> > > > > > suspending and
> > > > > > resuming. If the hwrng is behind a bus such as i2c or SPI
> > > > > > and that bus
> > > > > > is suspended, the hwrng may hang the bus while attempting
> > > > > > to add some
> > > > > > randomness. It's been observed on ChromeOS devices with
> > > > > > suspend-to-idle
> > > > > > (s2idle) and an i2c based hwrng that this kthread may run
> > > > > > and ask the
> > > > > > hwrng device for randomness before the i2c bus has been
> > > > > > resumed.
> > > > > > 
> > > > > > Let's make this kthread freezable so that we don't try to
> > > > > > touch the
> > > > > > hwrng during suspend/resume. This ensures that we can't
> > > > > > cause the hwrng
> > > > > > backing driver to get into a bad state because the device
> > > > > > is guaranteed
> > > > > > to be resumed before the hwrng kthread is thawed.
> > > > > 
> > > > > This patch broke suspend with virtio-rng loaded (it hangs).
> > > > > 
> > > > > The problematic call chain is:
> > > > > virtrng_freeze() -> remove_common() -> hwrng_unregister() ->
> > > > > kthread_stop().
> > > > > 
> > > > > 
> > > > > It looks like kthread_stop() can't finish on a frozen khwrng
> > > > > thread.
> > > > 
> > > > Can you provide the suspend/resume logs?
> > > 
> > > There isn't much in the kernel log, the closest thing I can get
> > > is
> > > with dyndbg="file drivers/base/power/main.c +p":
> > > [   58.441073][ T3511] virtio-pci 0000:00:06.0: bus freeze
> > > [   58.448744][ T3511] virtio-pci 0000:00:05.0: bus freeze
> > > [   58.454500][ T3511] virtio-pci 0000:00:04.0: bus freeze
> > > [   58.456873][ T3511] virtio-pci 0000:00:03.0: bus freeze
> > > 
> > > And then the VM hangs.
> > > 
> > > The 0000:00:03.0 pci device is virtio-rng.
> > > 
> > > If I add printks around that kthread_stop() in hwrng_unregister()
> > > only the first one gets printed.
> > 
> > Ok. I don't know why virtio rng wants to remove itself and then
> > reprobe
> > across suspend/resume. Do you know the history there?
> 
> Hard to tell, I have added Amit, who had implemented these PM
> callbacks
> back in 2012, to CC now.

The virtio queues need to be in sync with the host across suspend-
resume.  The usual workflow is qemu creates a device, Linux registers
the driver and maybe requests entropy at boot-up.  Later, when the
hibernation image is restored, the previously-running Linux's queue
indexes are used, which don't match with qemu's idea of where the
indexes will be.  This makes qemu terminate the guest.

To overcome this problem, the queues are removed before hibernate, and
recreated after resume -- which ensures both are in sync again.

I can imagine an addition to the virtio spec to re-initialize queues
for the hibernation use-case; but the current code path will have to be
around for the old-qemu usecase.

> 
> > Can you enable the dynamic debug printk in __refrigerator()?
> > 
> > 	file kernel/freezer.c +p
> > 
> > That will let us know when the kthread has been frozen/thawed.
> > 
> > Either way, it sounds like maybe it's what you say, virtio rng
> > wants to
> > call kthread_stop() on a kthread that's been frozen already and
> > then it
> > just hangs waiting for the thread to wake up, which it never does.
> > I
> > can't convince myself that the schedule() inside __refrigerator()
> > won't
> > wake up though. I would think it leaves the refrigerator when
> > kthread_stop() is called because the kthread will wakeup from
> > wake_up_process() in kthread_stop(), see it should stop in
> > __refrigerator() and eventually exit. Maybe the hwrng thread is
> > stuck
> > somewhere else?
> > 
> 
> Yes, it turns out the hwrng kthread is actually stuck inside
> add_hwgenerator_randomness() in wait_event_freezable() call
> introduced
> by commit 59b569480dc8
> ("random: Use wait_event_freezable() in
> add_hwgenerator_randomness()").
> 
> wait_event_freezable() ultimately calls __refrigerator() with its
> check_kthr_stop argument set to false, which causes it to keep the
> kthread frozen even if somebody calls kthread_stop() on it.
> 
> Calling wait_event_freezable() with kthread_should_stop() as a
> condition
> seems racy because it doesn't take into account the situation where
> this
> condition becomes true on a kthread marked for freezing only after it
> has been checked.
> 
> I was able to make the VM write a s2disk image with the following
> change:
> --- a/drivers/char/random.c
> +++ b/drivers/char/random.c
> @@ -2500,8 +2500,8 @@ void add_hwgenerator_randomness(const char
> *buffer, size_t count,
>  	 * We'll be woken up again once below
> random_write_wakeup_thresh,
>  	 * or when the calling thread is about to terminate.
>  	 */
> -	wait_event_freezable(random_write_wait,
> -			kthread_should_stop() ||
> +	wait_event_interruptible(random_write_wait,
> +			kthread_should_stop() || freezing(current) ||
>  			ENTROPY_BITS(&input_pool) <=
> random_write_wakeup_bits);
>  	mix_pool_bytes(poolp, buffer, count);
>  	credit_entropy_bits(poolp, entropy);
> 
> Calling freezing() should avoid the issue that the commit
> 59b569480dc8
> has fixed, as it is only a checking function, it doesn't actually do
> the freezing.
> 
> However, while the written image seems valid (the machine will resume
> successfully from it) the suspend process still hangs, only now a bit
> later.
> 
> It turns out there is a second issue where the set_freezable() call
> at the beginning of hwrng_fillfn() will freeze this kthread with
> (again) check_kthr_stop argument set to false when this kthread gets
> relaunched when devices are resumed to write the hibernation image
> at the suspend time.
> 
> That makes the frozen kthread impossible to stop on shutdown, so the
> VM hangs there.
> 
> If I only clear the PF_NOFREEZE flag instead of calling
> set_freezable()
> at the beginning of hwrng_fillfn() the suspend process will complete
> successfully.
> 
> However, it seems to me that the real solution here would be to
> change the virtio-rng driver to not unregister / reregister itself
> on PM events rather than fight the freezer / kthread_stop()
> interactions.
> 
> Maciej