* RFH with debugging suspend issues
@ 2020-03-09 10:57 Jonas Meurer
  2020-03-09 12:08 ` Jonas Meurer
  2020-03-18 21:37 ` Rafael J. Wysocki
  0 siblings, 2 replies; 6+ messages in thread
From: Jonas Meurer @ 2020-03-09 10:57 UTC (permalink / raw)
  To: Linux PM; +Cc: Tim Dittler



Hello,

I'm searching for help with debugging a suspend issue:

Apparently, on some devices (Lenovo laptops in particular), the kernel
causes an I/O operation on the root filesystem when suspending the system
- even though the final sync[1] is disabled thanks to setting
`/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
accepted in Linux 5.6[2].

My current guess is that some hardware-specific firmware is loaded
during system suspend. But unfortunately, so far I failed to find what
exactly it is despite following the 01.org debugging documentation[3].
Maybe you can help me shed some light on it?

The problem I'm facing is the following:

When luksSuspending my LUKS encrypted root file system just before
running `echo mem >/sys/power/state` on a Lenovo Thinkpad, the kernel
suspend function freezes my system.
*BUT*: If I first run `echo mem >/sys/power/state` without luksSuspend,
and do the luksSuspend + system suspend thing after a successful first
system suspend, then my luksSuspend + system suspend succeeds reproducibly.

So apparently, at the first system suspend, something triggers a disk
I/O operation that isn't triggered at subsequent system suspends. Do you
have an idea what that could be and how to further debug it?

To give some context: together with Tim (Cc'ed), I'm working on
automatically suspending encrypted LUKS devices during system suspend in
Debian. Our code basically uncompresses the initramfs image into a
ramfs, chroots into it, luks-suspends all encrypted LUKS devices and
sends the system to suspend mode. After resume, commands to unlock/
resume the suspended LUKS devices are executed.

You can find the wrapper script that prepares the ramfs chroot at [4]
and the C program that does the actual luksSuspend + system suspend at [5].
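
For illustration only, the sequence described above (disable the final
sync, luksSuspend, then enter suspend) could be sketched in shell roughly
as follows. This is not the code at [4] or [5]: the function name
`suspend_with_luks`, its directory argument and the `CRYPTSETUP` override
variable are assumptions made so the sketch can be dry-run against a
scratch directory, and the chroot and resume steps are left out.

```shell
# Hypothetical helper, not part of cryptsetup-suspend.
# Usage: suspend_with_luks POWER_DIR DEVICE...
#   POWER_DIR  directory holding sync_on_suspend and state
#              (/sys/power on a real system)
#   DEVICE     name of a mapped LUKS device to suspend
suspend_with_luks() {
    power=$1
    shift

    # Skip the final sync that the kernel would otherwise do on suspend.
    echo 0 > "$power/sync_on_suspend" || return 1

    # Suspend every LUKS device; from here on, I/O to them blocks.
    # CRYPTSETUP is an assumed override so the call can be stubbed out.
    for dev in "$@"; do
        ${CRYPTSETUP:-cryptsetup} luksSuspend "$dev" || return 1
    done

    # Enter system suspend.
    echo mem > "$power/state"
}
```

On a real system this would be run as root, along the lines of
`suspend_with_luks /sys/power <mapped-device-name>`.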

Kind regards
 jonas

[1]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/power/suspend.c#n570
[2]
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c052bf82c6b00ca27aab0859addc4b3159dfd3a4
[3]
https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues
[4]
https://salsa.debian.org/mejo/cryptsetup-suspend/-/blob/suspend/debian/scripts/suspend/cryptsetup-suspend-wrapper
[5]
https://salsa.debian.org/mejo/cryptsetup-suspend/-/blob/suspend/debian/scripts/suspend/cryptsetup-suspend.c


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]


* Re: RFH with debugging suspend issues
  2020-03-09 10:57 RFH with debugging suspend issues Jonas Meurer
@ 2020-03-09 12:08 ` Jonas Meurer
  2020-03-17 15:26   ` Jonas Meurer
  2020-03-18 21:37 ` Rafael J. Wysocki
  1 sibling, 1 reply; 6+ messages in thread
From: Jonas Meurer @ 2020-03-09 12:08 UTC (permalink / raw)
  To: Linux PM; +Cc: Tim Dittler



Hello again,

Jonas Meurer:
> I'm searching for help with debugging a suspend issue:
> 
> Apparently, on some devices (Lenovo laptops in particular), the kernel
> causes an I/O operation on the root filesystem when suspending the system
> - even though the final sync[1] is disabled thanks to setting
> `/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
> accepted in Linux 5.6[2].
> 
> My current guess is that some hardware-specific firmware is loaded
> during system suspend. But unfortunately, so far I failed to find what
> exactly it is despite following the 01.org debugging documentation[3].
> Maybe you can help me shed some light on it?

I finally succeeded in reliably tracking this down to firmware loading.
With kernel boot parameters `initcall_debug ignore_loglevel`, the last
logs before my system freezes are:

PM: suspend entry (deep)
(NULL device *): firmware: direct-loading firmware regulatory.db
(NULL device *): firmware: direct-loading firmware regulatory.db.p7s
(NULL device *): firmware: direct-loading firmware iwlwifi-8000C-36.ucode

If I blacklist all modules that cause the kernel to load firmware (for
me, that's cfg80211, iwlwifi and some bluetooth modules), then the issue
is gone.
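
A small sketch of how the firmware names could be extracted from such a
log, assuming the lines have exactly the
`firmware: direct-loading firmware <name>` format shown above; the
function name `list_direct_loaded_firmware` is made up for illustration.

```shell
# Hypothetical helper: read kernel log text on stdin and print each
# direct-loaded firmware file name once, sorted.
list_direct_loaded_firmware() {
    # Keep only the matching lines and strip everything up to the name.
    sed -n 's/^.*firmware: direct-loading firmware \(.*\)$/\1/p' |
        LC_ALL=C sort -u
}
```

It could be fed from the kernel log, for example
`dmesg | list_direct_loaded_firmware`.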

So without further investigation I could imagine three possible solutions:

1. Provide all firmware files to the kernel from the chroot. That
   probably means to copy the firmware files to initramfs and to make
   the kernel aware of the new firmware path at
   `/sys/module/firmware_class/parameters/path`.
2. Find a way to manually trigger the firmware loading operation before
   we luksSuspend the LUKS devices.
3. Find a way to disable direct-loading of firmware by the kernel during
   suspend.

Maybe someone with more inside knowledge could comment on whether
options 2 or 3 are possible at all and if not, whether kernel patches to
implement them would be acceptable.
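
Option 1 could be sketched roughly as below. This is untested against a
real suspend: the function name `stage_firmware` and its arguments are
assumptions for illustration, and whether the kernel resolves the
configured path inside the chroot or against the real root is not
verified here.

```shell
# Hypothetical helper for option 1.
# Usage: stage_firmware SRC_DIR DEST_DIR PATH_PARAM NAME...
#   SRC_DIR     where the firmware files live now (e.g. /lib/firmware)
#   DEST_DIR    directory that stays readable after luksSuspend
#   PATH_PARAM  file to write the new search path to
#               (/sys/module/firmware_class/parameters/path on a real system)
#   NAME        firmware file name relative to SRC_DIR
stage_firmware() {
    src=$1 dest=$2 param=$3
    shift 3

    for name in "$@"; do
        # Keep subdirectories so relative firmware names still resolve.
        mkdir -p "$dest/$(dirname "$name")" || return 1
        cp "$src/$name" "$dest/$name" || return 1
    done

    # Write the path without a trailing newline.
    printf '%s' "$dest" > "$param"
}
```

The firmware names would be the ones from the log above, for example
`regulatory.db`, `regulatory.db.p7s` and `iwlwifi-8000C-36.ucode`.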

Cheers
 jonas

PS: Inside our Debian cryptsetup-suspend project, we track this issue at
    https://salsa.debian.org/mejo/cryptsetup-suspend/issues/38




* Re: RFH with debugging suspend issues
  2020-03-09 12:08 ` Jonas Meurer
@ 2020-03-17 15:26   ` Jonas Meurer
  2020-03-17 15:28     ` Rafael J. Wysocki
  0 siblings, 1 reply; 6+ messages in thread
From: Jonas Meurer @ 2020-03-17 15:26 UTC (permalink / raw)
  To: Linux PM; +Cc: Tim Dittler, Rafael J. Wysocki



Hello again,

sorry for Cc'ing you directly, Rafael. But I'm not sure whether anybody
reads the linux-pm mailing list without being addressed directly ;)
Maybe you can point us to a better place to ask for help here?

Jonas Meurer:
> Jonas Meurer:
>> I'm searching for help with debugging a suspend issue:
>>
>> Apparently, on some devices (Lenovo laptops in particular), the kernel
>> causes an I/O operation on the root filesystem when suspending the system
>> - even though the final sync[1] is disabled thanks to setting
>> `/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
>> accepted in Linux 5.6[2].
>>
>> My current guess is that some hardware-specific firmware is loaded
>> during system suspend. But unfortunately, so far I failed to find what
>> exactly it is despite following the 01.org debugging documentation[3].
>> Maybe you can help me shed some light on it?
> 
> I finally succeeded in reliably tracking this down to firmware loading.
> With kernel boot parameters `initcall_debug ignore_loglevel`, the last
> logs before my system freezes are:
> 
> PM: suspend entry (deep)
> (NULL device *): firmware: direct-loading firmware regulatory.db
> (NULL device *): firmware: direct-loading firmware regulatory.db.p7s
> (NULL device *): firmware: direct-loading firmware iwlwifi-8000C-36.ucode
> 
> If I blacklist all modules that cause the kernel to load firmware (for
> me, that's cfg80211, iwlwifi and some bluetooth modules), then the issue
> is gone.
> 
> So without further investigation I could imagine three possible solutions:
> 
> 1. Provide all firmware files to the kernel from the chroot. That
>    probably means to copy the firmware files to initramfs and to make
>    the kernel aware of the new firmware path at
>    `/sys/module/firmware_class/parameters/path`.
> 2. Find a way to manually trigger the firmware loading operation before
>    we luksSuspend the LUKS devices.
> 3. Find a way to disable direct-loading of firmware by the kernel during
>    suspend.
> 
> Maybe someone with more inside knowledge could comment on whether
> options 2 or 3 are possible at all and if not, whether kernel patches to
> implement them would be acceptable.

Any chance to get a comment on this?

Cheers,
 jonas




* Re: RFH with debugging suspend issues
  2020-03-17 15:26   ` Jonas Meurer
@ 2020-03-17 15:28     ` Rafael J. Wysocki
  2020-03-18 13:03       ` Jonas Meurer
  0 siblings, 1 reply; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-03-17 15:28 UTC (permalink / raw)
  To: Jonas Meurer; +Cc: Linux PM, Tim Dittler, Rafael J. Wysocki

On Tue, Mar 17, 2020 at 4:26 PM Jonas Meurer <jonas@freesources.org> wrote:
>
> Hello again,
>
> sorry for Cc'ing you directly, Rafael. But I'm not sure whether anybody
> reads the linux-pm mailing list without being addressed directly ;)

It's better to keep a public record of the conversations that happen at least.

> Maybe you can point us to a better place to ask for help here?

This is the right place, but I've been somewhat distracted lately.
I'll get back to you later today or tomorrow, thanks!


> >> I'm searching for help with debugging a suspend issue:
> >>
> >> Apparently, on some devices (Lenovo laptops in particular), the kernel
> >> causes an I/O operation on the root filesystem when suspending the system
> >> - even though the final sync[1] is disabled thanks to setting
> >> `/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
> >> accepted in Linux 5.6[2].
> >>
> >> My current guess is that some hardware-specific firmware is loaded
> >> during system suspend. But unfortunately, so far I failed to find what
> >> exactly it is despite following the 01.org debugging documentation[3].
> >> Maybe you can help me shed some light on it?
> >
> > I finally succeeded in reliably tracking this down to firmware loading.
> > With kernel boot parameters `initcall_debug ignore_loglevel`, the last
> > logs before my system freezes are:
> >
> > PM: suspend entry (deep)
> > (NULL device *): firmware: direct-loading firmware regulatory.db
> > (NULL device *): firmware: direct-loading firmware regulatory.db.p7s
> > (NULL device *): firmware: direct-loading firmware iwlwifi-8000C-36.ucode
> >
> > If I blacklist all modules that cause the kernel to load firmware (for
> > me, that's cfg80211, iwlwifi and some bluetooth modules), then the issue
> > is gone.
> >
> > So without further investigation I could imagine three possible solutions:
> >
> > 1. Provide all firmware files to the kernel from the chroot. That
> >    probably means to copy the firmware files to initramfs and to make
> >    the kernel aware of the new firmware path at
> >    `/sys/module/firmware_class/parameters/path`.
> > 2. Find a way to manually trigger the firmware loading operation before
> >    we luksSuspend the LUKS devices.
> > 3. Find a way to disable direct-loading of firmware by the kernel during
> >    suspend.
> >
> > Maybe someone with more inside knowledge could comment on whether
> > options 2 or 3 are possible at all and if not, whether kernel patches to
> > implement them would be acceptable.
>
> Any chance to get a comment on this?


* Re: RFH with debugging suspend issues
  2020-03-17 15:28     ` Rafael J. Wysocki
@ 2020-03-18 13:03       ` Jonas Meurer
  0 siblings, 0 replies; 6+ messages in thread
From: Jonas Meurer @ 2020-03-18 13:03 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Linux PM, Tim Dittler, Rafael J. Wysocki



Hey Rafael,

Rafael J. Wysocki:
>> sorry for Cc'ing you directly, Rafael. But I'm not sure whether anybody
>> reads the linux-pm mailing list without being addressed directly ;)
> 
> It's better to keep a public record of the conversations that happen at least.
> 
>> Maybe you can point us to a better place to ask for help here?
> 
> This is the right place, but I've been somewhat distracted lately.
> I'll get back to you later today or tomorrow, thanks!

Thanks a lot! Looking forward to your response :)

Cheers
 jonas


>>>> I'm searching for help with debugging a suspend issue:
>>>>
>>>> Apparently, on some devices (Lenovo laptops in particular), the kernel
>>>> causes an I/O operation on the root filesystem when suspending the system
>>>> - even though the final sync[1] is disabled thanks to setting
>>>> `/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
>>>> accepted in Linux 5.6[2].
>>>>
>>>> My current guess is that some hardware-specific firmware is loaded
>>>> during system suspend. But unfortunately, so far I failed to find what
>>>> exactly it is despite following the 01.org debugging documentation[3].
>>>> Maybe you can help me shed some light on it?
>>>
>>> I finally succeeded in reliably tracking this down to firmware loading.
>>> With kernel boot parameters `initcall_debug ignore_loglevel`, the last
>>> logs before my system freezes are:
>>>
>>> PM: suspend entry (deep)
>>> (NULL device *): firmware: direct-loading firmware regulatory.db
>>> (NULL device *): firmware: direct-loading firmware regulatory.db.p7s
>>> (NULL device *): firmware: direct-loading firmware iwlwifi-8000C-36.ucode
>>>
>>> If I blacklist all modules that cause the kernel to load firmware (for
>>> me, that's cfg80211, iwlwifi and some bluetooth modules), then the issue
>>> is gone.
>>>
>>> So without further investigation I could imagine three possible solutions:
>>>
>>> 1. Provide all firmware files to the kernel from the chroot. That
>>>    probably means to copy the firmware files to initramfs and to make
>>>    the kernel aware of the new firmware path at
>>>    `/sys/module/firmware_class/parameters/path`.
>>> 2. Find a way to manually trigger the firmware loading operation before
>>>    we luksSuspend the LUKS devices.
>>> 3. Find a way do disable direct-loading of firmware by the kernel during
>>>    suspend.
>>>
>>> Maybe someone with more inside knowledge could comment on whether
>>> options 2 or 3 are possible at all and if not, whether kernel patches to
>>> implement them would be acceptable.
>>
>> Any chance to get a comment on this?




* Re: RFH with debugging suspend issues
  2020-03-09 10:57 RFH with debugging suspend issues Jonas Meurer
  2020-03-09 12:08 ` Jonas Meurer
@ 2020-03-18 21:37 ` Rafael J. Wysocki
  1 sibling, 0 replies; 6+ messages in thread
From: Rafael J. Wysocki @ 2020-03-18 21:37 UTC (permalink / raw)
  To: Jonas Meurer; +Cc: Linux PM, Tim Dittler

On Mon, Mar 9, 2020 at 12:14 PM Jonas Meurer <jonas@freesources.org> wrote:
>
> Hello,
>
> I'm searching for help with debugging a suspend issue:
>
> Apparently, on some devices (Lenovo laptops in particular), the kernel
> causes an I/O operation on the root filesystem when suspending the system
> - even though the final sync[1] is disabled thanks to setting
> `/sys/power/sync_on_suspend` to 0, see my corresponding patch that got
> accepted in Linux 5.6[2].

And that's how it goes.  There is no guarantee whatever that there
will be no I/O carried out during system-wide suspend without the
sync.

> My current guess is that some hardware-specific firmware is loaded
> during system suspend. But unfortunately, so far I failed to find what
> exactly it is despite following the 01.org debugging documentation[3].
> Maybe you can help me shed some light on it?
>
> The problem I'm facing is the following:
>
> When luksSuspending my LUKS encrypted root file system just before
> running `echo mem >/sys/power/state` on a Lenovo Thinkpad, the kernel
> suspend function freezes my system.
> *BUT*: If I first run `echo mem >/sys/power/state` without luksSuspend,
> and do the luksSuspend + system suspend thing after a successful first
> system suspend, then my luksSuspend + system suspend succeeds reproducibly.
>
> So apparently, at the first system suspend, something triggers a disk
> I/O operation that isn't triggered at subsequent system suspends.

I'm not quite sure how you arrived at this conclusion.  Can you
elaborate, please?

> Do you have an idea what that could be and how to further debug it?

Not really.

> To give some context: together with Tim (Cc'ed), I'm working on
> automatically suspending encrypted LUKS devices during system suspend in
> Debian. Our code basically uncompresses the initramfs image into a
> ramfs, chroots into it, luks-suspends all encrypted LUKS devices and
> sends the system to suspend mode. After resume, commands to unlock/
> resume the suspended LUKS devices are executed.

Sounds interesting. :-)

Unfortunately, I'm not familiar with LUKS at all, so I cannot give you
any useful advice on that particular topic.


