Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23

All of lore.kernel.org
 help / color / mirror / Atom feed

* Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
@ 2024-04-02  8:08 Linux regression tracking (Thorsten Leemhuis)
  2024-04-03  0:38 ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-02  8:08 UTC (permalink / raw)
  To: Greg KH, Sasha Levin
  Cc: stable, LKML, Linux kernel regressions list, Tejun Heo

Hi stable team, there is a report that the recent backport of
5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
for unbound workqueues") [from Tejun] to 6.6.y (as 5a70baec2294) broke
hibernate for a user. 6.6.24-rc1 did not fix this problem; reverting the
culprit does.

> With kernel 6.6.23 hibernating usually hangs here: the display stays
> on but the mouse pointer does not move and the keyboard does not work.
> But SysRq REISUB does reboot. Sometimes it seems to hibernate: the
> computer powers down and can be waked up and the previous display comes
> visible, but it is stuck there.

See https://bugzilla.kernel.org/show_bug.cgi?id=218658 for details.
Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
not CCed them in mails like this.

Side note: there is a mainline report about problems due to
5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
for unbound workqueues") as well, but it's about "nohz_full=0 prevents
kernel from booting":
https://bugzilla.kernel.org/show_bug.cgi?id=218665; will forward that
separately to Tejun.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

[1] because bugzilla.kernel.org tells users upon registration their
"email address will never be displayed to logged out users"

#regzbot introduced: 5a70baec2294e8a7d0fcc4558741c23e752dad
#regzbot from: Petri Kaukasoina
#regzbot duplicate: https://bugzilla.kernel.org/show_bug.cgi?id=218658
#regzbot title: workqueue: hubernate usually hangs when going to sleep
#regzbot ignore-activity

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
  2024-04-02  8:08 Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23 Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-03  0:38 ` Tejun Heo
  2024-04-03  4:26   ` Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2024-04-03  0:38 UTC (permalink / raw)
  To: Linux regressions mailing list; +Cc: Greg KH, Sasha Levin, stable, LKML

Hello,

On Tue, Apr 02, 2024 at 10:08:11AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi stable team, there is a report that the recent backport of
> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
> for unbound workqueues") [from Tejun] to 6.6.y (as 5a70baec2294) broke
> hibernate for a user. 6.6.24-rc1 did not fix this problem; reverting the
> culprit does.
> 
> > With kernel 6.6.23 hibernating usually hangs here: the display stays
> > on but the mouse pointer does not move and the keyboard does not work.
> > But SysRq REISUB does reboot. Sometimes it seems to hibernate: the
> > computer powers down and can be waked up and the previous display comes
> > visible, but it is stuck there.
> 
> See https://bugzilla.kernel.org/show_bug.cgi?id=218658 for details.
> Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
> not CCed them in mails like this.
> 
> Side note: there is a mainline report about problems due to
> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
> for unbound workqueues") as well, but it's about "nohz_full=0 prevents
> kernel from booting":
> https://bugzilla.kernel.org/show_bug.cgi?id=218665; will forward that
> separately to Tejun.
> 
> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

Sorry about late reply but that commit is not for -stable. It does fix an
undesirable behavior from an earlier commit, so it has the Fixes tag but it
has a series of dependencies that need to be backported together which would
be far too invasive for -stable, thus no Cc: stable. I didn't know a Fixes
tag automatically triggers backport to -stable. I will keep that in mind for
future.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
  2024-04-03  0:38 ` Tejun Heo
@ 2024-04-03  4:26   ` Linux regression tracking (Thorsten Leemhuis)
  2024-04-03  5:11     ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-03  4:26 UTC (permalink / raw)
  To: Tejun Heo, Linux regressions mailing list
  Cc: Greg KH, Sasha Levin, stable, LKML

On 03.04.24 02:38, Tejun Heo wrote:
> On Tue, Apr 02, 2024 at 10:08:11AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> Hi stable team, there is a report that the recent backport of
>> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
>> for unbound workqueues") [from Tejun] to 6.6.y (as 5a70baec2294) broke
>> hibernate for a user. 6.6.24-rc1 did not fix this problem; reverting the
>> culprit does.
>>
>>> With kernel 6.6.23 hibernating usually hangs here: the display stays
>>> on but the mouse pointer does not move and the keyboard does not work.
>>> But SysRq REISUB does reboot. Sometimes it seems to hibernate: the
>>> computer powers down and can be waked up and the previous display comes
>>> visible, but it is stuck there.
>>
>> See https://bugzilla.kernel.org/show_bug.cgi?id=218658 for details.
>> Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
>> not CCed them in mails like this.
>>
>> Side note: there is a mainline report about problems due to
>> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
>> for unbound workqueues") as well, but it's about "nohz_full=0 prevents
>> kernel from booting":
>> https://bugzilla.kernel.org/show_bug.cgi?id=218665; will forward that
>> separately to Tejun.
> 
> Sorry about late reply

No problem at all, really, thx for your reply!

> but that commit is not for -stable. It does fix an
> undesirable behavior from an earlier commit, so it has the Fixes tag but it
> has a series of dependencies that need to be backported together

Not sure if you know, but the stable team apparently recently backported a
bunch of other workqueue commits as well; I see 10 from a quick look here:
https://lore.kernel.org/all/20240326223803.2647796-1-sashal@kernel.org/

$ git log --grep='workqueue:' --oneline v6.6.22..stable/linux-6.6.y -- include/linux/workqueue.h kernel/workqueue*
7df62b8cca38aa workqueue: Don't call cpumask_test_cpu() with -1 CPU in wq_update_node_max_active()
5a70baec2294e8 workqueue: Implement system-wide nr_active enforcement for unbound workqueues
b522229a56941a workqueue: Introduce struct wq_node_nr_active
bd31fb926dfa02 workqueue: RCU protect wq->dfl_pwq and implement accessors for it
5f99fee6f2dea1 workqueue: Make wq_adjust_max_active() round-robin pwqs while activating
4023a2d9507691 workqueue: Move nr_active handling into helpers
6c592f0bb96815 workqueue: Replace pwq_activate_inactive_work() with [__]pwq_activate_work()
bad184d26a4f68 workqueue: Factor out pwq_is_empty()
82e098f5bed1ff workqueue: Move pwq->max_active to wq->max_active
43a181f8f41aca workqueue.c: Increase workqueue name length

And there is also "workqueue: Shorten events_freezable_power_efficient name"
in the queue for the next 6.6.y release:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.6

That leads to the question: what is the better patch forward here: pick
up what's is missing or back out?

Side note: I have no idea why the stable team backported those patches
and no option on whether that was wise, just trying to help finding the best
solution forward from the current state of things.

> which would
> be far too invasive for -stable, thus no Cc: stable.
>
> I didn't know a Fixes
> tag automatically triggers backport to -stable. I will keep that in mind for
> future.

/me fears that more and more developers due to situations like this will
avoid Fixes: tags and wonders what consequences that might have for the
kernel as a whole

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
  2024-04-03  4:26   ` Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-03  5:11     ` Greg KH
  2024-04-03 15:22       ` Tejun Heo
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2024-04-03  5:11 UTC (permalink / raw)
  To: Linux regressions mailing list; +Cc: Tejun Heo, Sasha Levin, stable, LKML

On Wed, Apr 03, 2024 at 06:26:24AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 03.04.24 02:38, Tejun Heo wrote:
> > On Tue, Apr 02, 2024 at 10:08:11AM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> Hi stable team, there is a report that the recent backport of
> >> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
> >> for unbound workqueues") [from Tejun] to 6.6.y (as 5a70baec2294) broke
> >> hibernate for a user. 6.6.24-rc1 did not fix this problem; reverting the
> >> culprit does.
> >>
> >>> With kernel 6.6.23 hibernating usually hangs here: the display stays
> >>> on but the mouse pointer does not move and the keyboard does not work.
> >>> But SysRq REISUB does reboot. Sometimes it seems to hibernate: the
> >>> computer powers down and can be waked up and the previous display comes
> >>> visible, but it is stuck there.
> >>
> >> See https://bugzilla.kernel.org/show_bug.cgi?id=218658 for details.
> >> Note, you have to use bugzilla to reach the reporter, as I sadly[1] can
> >> not CCed them in mails like this.
> >>
> >> Side note: there is a mainline report about problems due to
> >> 5797b1c18919cd ("workqueue: Implement system-wide nr_active enforcement
> >> for unbound workqueues") as well, but it's about "nohz_full=0 prevents
> >> kernel from booting":
> >> https://bugzilla.kernel.org/show_bug.cgi?id=218665; will forward that
> >> separately to Tejun.
> > 
> > Sorry about late reply
> 
> No problem at all, really, thx for your reply!
> 
> > but that commit is not for -stable. It does fix an
> > undesirable behavior from an earlier commit, so it has the Fixes tag but it
> > has a series of dependencies that need to be backported together
> 
> Not sure if you know, but the stable team apparently recently backported a
> bunch of other workqueue commits as well; I see 10 from a quick look here:
> https://lore.kernel.org/all/20240326223803.2647796-1-sashal@kernel.org/
> 
> $ git log --grep='workqueue:' --oneline v6.6.22..stable/linux-6.6.y -- include/linux/workqueue.h kernel/workqueue*
> 7df62b8cca38aa workqueue: Don't call cpumask_test_cpu() with -1 CPU in wq_update_node_max_active()
> 5a70baec2294e8 workqueue: Implement system-wide nr_active enforcement for unbound workqueues
> b522229a56941a workqueue: Introduce struct wq_node_nr_active
> bd31fb926dfa02 workqueue: RCU protect wq->dfl_pwq and implement accessors for it
> 5f99fee6f2dea1 workqueue: Make wq_adjust_max_active() round-robin pwqs while activating
> 4023a2d9507691 workqueue: Move nr_active handling into helpers
> 6c592f0bb96815 workqueue: Replace pwq_activate_inactive_work() with [__]pwq_activate_work()
> bad184d26a4f68 workqueue: Factor out pwq_is_empty()
> 82e098f5bed1ff workqueue: Move pwq->max_active to wq->max_active
> 43a181f8f41aca workqueue.c: Increase workqueue name length
> 
> And there is also "workqueue: Shorten events_freezable_power_efficient name"
> in the queue for the next 6.6.y release:
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-6.6
> 
> That leads to the question: what is the better patch forward here: pick
> up what's is missing or back out?
> 
> Side note: I have no idea why the stable team backported those patches
> and no option on whether that was wise, just trying to help finding the best
> solution forward from the current state of things.

The Fixes: tag triggered it, that's why they were backported.

> > which would
> > be far too invasive for -stable, thus no Cc: stable.
> >
> > I didn't know a Fixes
> > tag automatically triggers backport to -stable. I will keep that in mind for
> > future.
> 
> /me fears that more and more developers due to situations like this will
> avoid Fixes: tags and wonders what consequences that might have for the
> kernel as a whole

The problem is that we have subsystems that only use Fixes: and not cc:
stable which is why we need to pick these up as well.  Fixes: is great,
but if everyone were to do this "properly" then we wouldn't need to pick
these other ones up, but instead, it's about 1/3 of our volume :(

I'll gladly revert the above series if they shouldn't have been
backported to stable, but from reading them, it seemed like they were
fixing an issue that was serious and should have been added to stable,
which is why they were.

This is also why we have review cycles, so maintainers can let us know
if they want us to drop them :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
  2024-04-03  5:11     ` Greg KH
@ 2024-04-03 15:22       ` Tejun Heo
  2024-04-03 16:10         ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Tejun Heo @ 2024-04-03 15:22 UTC (permalink / raw)
  To: Greg KH; +Cc: Linux regressions mailing list, Sasha Levin, stable, LKML

Hello,

On Wed, Apr 03, 2024 at 07:11:04AM +0200, Greg KH wrote:
> > Side note: I have no idea why the stable team backported those patches
> > and no option on whether that was wise, just trying to help finding the best
> > solution forward from the current state of things.
> 
> The Fixes: tag triggered it, that's why they were backported.
> 
> > > which would
> > > be far too invasive for -stable, thus no Cc: stable.
> > >
> > > I didn't know a Fixes
> > > tag automatically triggers backport to -stable. I will keep that in mind for
> > > future.
> > 
> > /me fears that more and more developers due to situations like this will
> > avoid Fixes: tags and wonders what consequences that might have for the
> > kernel as a whole
> 
> The problem is that we have subsystems that only use Fixes: and not cc:
> stable which is why we need to pick these up as well.  Fixes: is great,
> but if everyone were to do this "properly" then we wouldn't need to pick
> these other ones up, but instead, it's about 1/3 of our volume :(
> 
> I'll gladly revert the above series if they shouldn't have been
> backported to stable, but from reading them, it seemed like they were
> fixing an issue that was serious and should have been added to stable,
> which is why they were.

Oh, yeah, they're fixing an issue. It's just that the issue is relatively
confined peformance degradation and the fix is really invasive, so not a
great -stable candidate. At the very least, they'd need a log longer cooking
time in mainline before being considered for -stable backport.

My intention w/ Fixes: wasn't triggering -stable backport at all, so it's a
miscommunication. From now on, I'll keep in mind that Fixes: does trigger
backports. I'm not too worried about not using it as the fixee commit can be
mentioned in the commit message.

> This is also why we have review cycles, so maintainers can let us know
> if they want us to drop them :)

Heh, sorry about that. This never caused any issues, so I just glide over
the stable mails without thinking.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23
  2024-04-03 15:22       ` Tejun Heo
@ 2024-04-03 16:10         ` Greg KH
  2024-04-04 15:36           ` Do we need a "DoNotBackPort" tag? (was: Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23) Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2024-04-03 16:10 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Linux regressions mailing list, Sasha Levin, stable, LKML

On Wed, Apr 03, 2024 at 05:22:17AM -1000, Tejun Heo wrote:
> Hello,
> 
> On Wed, Apr 03, 2024 at 07:11:04AM +0200, Greg KH wrote:
> > > Side note: I have no idea why the stable team backported those patches
> > > and no option on whether that was wise, just trying to help finding the best
> > > solution forward from the current state of things.
> > 
> > The Fixes: tag triggered it, that's why they were backported.
> > 
> > > > which would
> > > > be far too invasive for -stable, thus no Cc: stable.
> > > >
> > > > I didn't know a Fixes
> > > > tag automatically triggers backport to -stable. I will keep that in mind for
> > > > future.
> > > 
> > > /me fears that more and more developers due to situations like this will
> > > avoid Fixes: tags and wonders what consequences that might have for the
> > > kernel as a whole
> > 
> > The problem is that we have subsystems that only use Fixes: and not cc:
> > stable which is why we need to pick these up as well.  Fixes: is great,
> > but if everyone were to do this "properly" then we wouldn't need to pick
> > these other ones up, but instead, it's about 1/3 of our volume :(
> > 
> > I'll gladly revert the above series if they shouldn't have been
> > backported to stable, but from reading them, it seemed like they were
> > fixing an issue that was serious and should have been added to stable,
> > which is why they were.
> 
> Oh, yeah, they're fixing an issue. It's just that the issue is relatively
> confined peformance degradation and the fix is really invasive, so not a
> great -stable candidate. At the very least, they'd need a log longer cooking
> time in mainline before being considered for -stable backport.

Ok, I'll go revert them all now.  I did some test builds here with them
reverted and they seem sane.  I'll push out some -rcs with just the
reverts to at least fix the regressions found in the 6.8.y tree now.

> My intention w/ Fixes: wasn't triggering -stable backport at all, so it's a
> miscommunication. From now on, I'll keep in mind that Fixes: does trigger
> backports. I'm not too worried about not using it as the fixee commit can be
> mentioned in the commit message.

No worries, if you want, we can add any files/paths to our "ignore
Fixes: tags, only take cc: stable ones" that we have for many parts of
the kernel already, where maintainers are good and properly tag stuff.
Just let me know.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Do we need a "DoNotBackPort" tag? (was: Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23)
  2024-04-03 16:10         ` Greg KH
@ 2024-04-04 15:36           ` Linux regression tracking (Thorsten Leemhuis)
  2024-04-04 15:44             ` Greg KH
  0 siblings, 1 reply; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-04 15:36 UTC (permalink / raw)
  To: Greg KH, Tejun Heo
  Cc: Linux regressions mailing list, Sasha Levin, stable, LKML

On 03.04.24 18:10, Greg KH wrote:
> On Wed, Apr 03, 2024 at 05:22:17AM -1000, Tejun Heo wrote:
>> On Wed, Apr 03, 2024 at 07:11:04AM +0200, Greg KH wrote:
>>>> Side note: I have no idea why the stable team backported those patches
>>>> and no option on whether that was wise, just trying to help finding the best
>>>> solution forward from the current state of things.
>>>
>>> The Fixes: tag triggered it, that's why they were backported.

Yeah, this is what I assumed.

>>>>> which would
>>>>> be far too invasive for -stable, thus no Cc: stable.
>>>>>
>>>>> I didn't know a Fixes
>>>>> tag automatically triggers backport to -stable. I will keep that in mind for
>>>>> future.
>>>>
>>>> /me fears that more and more developers due to situations like this will
>>>> avoid Fixes: tags and wonders what consequences that might have for the
>>>> kernel as a whole
>>>
>>> The problem is that we have subsystems that only use Fixes: and not cc:
>>> stable which is why we need to pick these up as well.  Fixes: is great,
>>> but if everyone were to do this "properly" then we wouldn't need to pick
>>> these other ones up, but instead, it's about 1/3 of our volume :(

I'm also well aware of that and do not want to complain about it, as I
think I grasped why the stable team works like that -- and even think
given the circumstances it is round about the right approach. I also
understand that mistakes will always happen.

Nevertheless this thread and the Bluetooth thing we had earlier this
week[1] makes me fear that this approach could lead to developer
avoiding Fixes: tags. And funny thing, that's something that is already
happening, as I noticed by chance today: "'"A "Fixes" tag has been
deliberately omitted to avoid potential test failures and subsequent
regression issues that could arise from backporting."'"[2].

I wonder if that in the long term might be bad. But well, maybe it won't
matter much. Still made me wonder if we should have a different solution
for this kind of problem. Something like this?

  Cc: <stable@vger.kernel.org> # DoNotBackport

Or something like this?

  Cc: <stable@vger.kernel.org> # DoNotBackport - or only after 16 weeks
in mainline [but I don't care]

Whatever, mainly thinking aloud and do no need a reply to this. :-D

[1]
https://lore.kernel.org/all/84da1f26-0457-451c-b4fd-128cb9bd860d@leemhuis.info/

[2] saw that today here:
https://lore.kernel.org/all/cover.1712226175.git.antony.antony@secunet.com/

>>> I'll gladly revert the above series if they shouldn't have been
>>> backported to stable, but from reading them, it seemed like they were
>>> fixing an issue that was serious and should have been added to stable,
>>> which is why they were.
>> Oh, yeah, they're fixing an issue. It's just that the issue is relatively
>> confined peformance degradation and the fix is really invasive, so not a
>> great -stable candidate. At the very least, they'd need a log longer cooking
>> time in mainline before being considered for -stable backport.
> Ok, I'll go revert them all now.  I did some test builds here with them
> reverted and they seem sane.  I'll push out some -rcs with just the
> reverts to at least fix the regressions found in the 6.8.y tree now.

Great, thx for taking care of this!

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Do we need a "DoNotBackPort" tag? (was: Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23)
  2024-04-04 15:36           ` Do we need a "DoNotBackPort" tag? (was: Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23) Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-04 15:44             ` Greg KH
  2024-04-04 15:56               ` Do we need a "DoNotBackPort" tag? Linux regression tracking (Thorsten Leemhuis)
  0 siblings, 1 reply; 12+ messages in thread
From: Greg KH @ 2024-04-04 15:44 UTC (permalink / raw)
  To: Linux regressions mailing list; +Cc: Tejun Heo, Sasha Levin, stable, LKML

On Thu, Apr 04, 2024 at 05:36:42PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 03.04.24 18:10, Greg KH wrote:
> > On Wed, Apr 03, 2024 at 05:22:17AM -1000, Tejun Heo wrote:
> >> On Wed, Apr 03, 2024 at 07:11:04AM +0200, Greg KH wrote:
> >>>> Side note: I have no idea why the stable team backported those patches
> >>>> and no option on whether that was wise, just trying to help finding the best
> >>>> solution forward from the current state of things.
> >>>
> >>> The Fixes: tag triggered it, that's why they were backported.
> 
> Yeah, this is what I assumed.
> 
> >>>>> which would
> >>>>> be far too invasive for -stable, thus no Cc: stable.
> >>>>>
> >>>>> I didn't know a Fixes
> >>>>> tag automatically triggers backport to -stable. I will keep that in mind for
> >>>>> future.
> >>>>
> >>>> /me fears that more and more developers due to situations like this will
> >>>> avoid Fixes: tags and wonders what consequences that might have for the
> >>>> kernel as a whole
> >>>
> >>> The problem is that we have subsystems that only use Fixes: and not cc:
> >>> stable which is why we need to pick these up as well.  Fixes: is great,
> >>> but if everyone were to do this "properly" then we wouldn't need to pick
> >>> these other ones up, but instead, it's about 1/3 of our volume :(
> 
> I'm also well aware of that and do not want to complain about it, as I
> think I grasped why the stable team works like that -- and even think
> given the circumstances it is round about the right approach. I also
> understand that mistakes will always happen.
> 
> Nevertheless this thread and the Bluetooth thing we had earlier this
> week[1] makes me fear that this approach could lead to developer
> avoiding Fixes: tags. And funny thing, that's something that is already
> happening, as I noticed by chance today: "'"A "Fixes" tag has been
> deliberately omitted to avoid potential test failures and subsequent
> regression issues that could arise from backporting."'"[2].
> 
> I wonder if that in the long term might be bad. But well, maybe it won't
> matter much. Still made me wonder if we should have a different solution
> for this kind of problem. Something like this?
> 
>   Cc: <stable@vger.kernel.org> # DoNotBackport
> 
> Or something like this?
> 
>   Cc: <stable@vger.kernel.org> # DoNotBackport - or only after 16 weeks
> in mainline [but I don't care]

We do this today, with stuff like:
	Cc: stable <stable@kernel.org>	# wait for -rc3 to be out

So if people want to do that, they can, the documentation says that you
can give us hints and the like in the # area, and usually we notice them :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Do we need a "DoNotBackPort" tag?
  2024-04-04 15:44             ` Greg KH
@ 2024-04-04 15:56               ` Linux regression tracking (Thorsten Leemhuis)
  2024-04-05  2:54                 ` Theodore Ts'o
  0 siblings, 1 reply; 12+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2024-04-04 15:56 UTC (permalink / raw)
  To: Greg KH, Linux regressions mailing list
  Cc: Tejun Heo, Sasha Levin, stable, LKML

On 04.04.24 17:44, Greg KH wrote:
> On Thu, Apr 04, 2024 at 05:36:42PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> On 03.04.24 18:10, Greg KH wrote:
>>> On Wed, Apr 03, 2024 at 05:22:17AM -1000, Tejun Heo wrote:
>>>> On Wed, Apr 03, 2024 at 07:11:04AM +0200, Greg KH wrote:
>>>>>> Side note: I have no idea why the stable team backported those patches
>>>>>> and no option on whether that was wise, just trying to help finding the best
>>>>>> solution forward from the current state of things.
>>>>>
>>>>> The Fixes: tag triggered it, that's why they were backported.
>>
>> Yeah, this is what I assumed.
>>
>>>>>>> which would
>>>>>>> be far too invasive for -stable, thus no Cc: stable.
>>>>>>>
>>>>>>> I didn't know a Fixes
>>>>>>> tag automatically triggers backport to -stable. I will keep that in mind for
>>>>>>> future.
>>>>>>
>>>>>> /me fears that more and more developers due to situations like this will
>>>>>> avoid Fixes: tags and wonders what consequences that might have for the
>>>>>> kernel as a whole
>>>>>
>>>>> The problem is that we have subsystems that only use Fixes: and not cc:
>>>>> stable which is why we need to pick these up as well.  Fixes: is great,
>>>>> but if everyone were to do this "properly" then we wouldn't need to pick
>>>>> these other ones up, but instead, it's about 1/3 of our volume :(
>>
>> I'm also well aware of that and do not want to complain about it, as I
>> think I grasped why the stable team works like that -- and even think
>> given the circumstances it is round about the right approach. I also
>> understand that mistakes will always happen.
>>
>> Nevertheless this thread and the Bluetooth thing we had earlier this
>> week[1] makes me fear that this approach could lead to developer
>> avoiding Fixes: tags. And funny thing, that's something that is already
>> happening, as I noticed by chance today: "'"A "Fixes" tag has been
>> deliberately omitted to avoid potential test failures and subsequent
>> regression issues that could arise from backporting."'"[2].
>>
>> I wonder if that in the long term might be bad. But well, maybe it won't
>> matter much. Still made me wonder if we should have a different solution
>> for this kind of problem. Something like this?
>>
>>   Cc: <stable@vger.kernel.org> # DoNotBackport
>>
>> Or something like this?
>>
>>   Cc: <stable@vger.kernel.org> # DoNotBackport - or only after 16 weeks
>> in mainline [but I don't care]
> 
> We do this today, with stuff like:
> 	Cc: stable <stable@kernel.org>	# wait for -rc3 to be out
> 
> So if people want to do that, they can, the documentation says that you
> can give us hints and the like in the # area, and usually we notice them :)

I know, as I wrote that (as you likely remember). ;-) But it seems it's
not well known; and maybe making it explicit that this can be used to
convey a "DoNotBackport" message is supported as well.

Guess I'll prepare a patch to do that then and we'll see how it goes
from there.

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Do we need a "DoNotBackPort" tag?
  2024-04-04 15:56               ` Do we need a "DoNotBackPort" tag? Linux regression tracking (Thorsten Leemhuis)
@ 2024-04-05  2:54                 ` Theodore Ts'o
  2024-04-05  4:24                   ` Greg KH
  2024-04-11  5:30                   ` Thorsten Leemhuis
  0 siblings, 2 replies; 12+ messages in thread
From: Theodore Ts'o @ 2024-04-05  2:54 UTC (permalink / raw)
  To: Linux regressions mailing list
  Cc: Greg KH, Tejun Heo, Sasha Levin, stable, LKML

On Thu, Apr 04, 2024 at 05:56:58PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> 
> I know, as I wrote that (as you likely remember). ;-) But it seems it's
> not well known; and maybe making it explicit that this can be used to
> convey a "DoNotBackport" message is supported as well.
> 
> Guess I'll prepare a patch to do that then and we'll see how it goes
> from there.

Maybe something like "ManualBackportOnly"instead?   The basic idea is
that it's not that the commit should *never* be backported, but only
with human intervention where someone has specifically requested the
backport, perhaps with qualification test.

(For example, the XFS file system has an implicit ManualBackportOnly
on all commits, and the XFS stable maintainers are responsible for
backporting identifying commits with Fixes: tags, and running QA
before passing on a request to having them be backported.)

              	       	    	       - Ted

P.S.  I'd love to be able do something equivalent for ext4, if we
could scare up the resources to do the same, in terms of having some
company fund the necessary developer resources, or someone turn up to
volunteer to do that for ext4.  Gentle reader, if that's you or your
company --- let's talk.  :-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Do we need a "DoNotBackPort" tag?
  2024-04-05  2:54                 ` Theodore Ts'o
@ 2024-04-05  4:24                   ` Greg KH
  2024-04-11  5:30                   ` Thorsten Leemhuis
  1 sibling, 0 replies; 12+ messages in thread
From: Greg KH @ 2024-04-05  4:24 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Linux regressions mailing list, Tejun Heo, Sasha Levin, stable, LKML

On Thu, Apr 04, 2024 at 10:54:39PM -0400, Theodore Ts'o wrote:
> On Thu, Apr 04, 2024 at 05:56:58PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> > 
> > I know, as I wrote that (as you likely remember). ;-) But it seems it's
> > not well known; and maybe making it explicit that this can be used to
> > convey a "DoNotBackport" message is supported as well.
> > 
> > Guess I'll prepare a patch to do that then and we'll see how it goes
> > from there.
> 
> Maybe something like "ManualBackportOnly"instead?   The basic idea is
> that it's not that the commit should *never* be backported, but only
> with human intervention where someone has specifically requested the
> backport, perhaps with qualification test.
> 
> (For example, the XFS file system has an implicit ManualBackportOnly
> on all commits, and the XFS stable maintainers are responsible for
> backporting identifying commits with Fixes: tags, and running QA
> before passing on a request to having them be backported.)

For drivers/subsystems/files that no one wants to have backported at all
UNLESS there is an explicit cc: stable tag, just email us at stable@vger
and let us know and we will add you to the list of files that we ignore
for this.  We keep that list in the stable-queue repo for anyone to see
if they are curious.

That's what xfs and bcachefs and kvm and mm and other subsystems have
done, that way they can tag things with "Fixes:" to their hearts content
and we just ignore them.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: Do we need a "DoNotBackPort" tag?
  2024-04-05  2:54                 ` Theodore Ts'o
  2024-04-05  4:24                   ` Greg KH
@ 2024-04-11  5:30                   ` Thorsten Leemhuis
  1 sibling, 0 replies; 12+ messages in thread
From: Thorsten Leemhuis @ 2024-04-11  5:30 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Greg KH, Tejun Heo, Sasha Levin, stable, LKML,
	Linux regressions mailing list

On 05.04.24 04:54, Theodore Ts'o wrote:
> On Thu, Apr 04, 2024 at 05:56:58PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>>
>> I know, as I wrote that (as you likely remember). ;-) But it seems it's
>> not well known; and maybe making it explicit that this can be used to
>> convey a "DoNotBackport" message is supported as well.
>>
>> Guess I'll prepare a patch to do that then and we'll see how it goes
>> from there.
> 
> Maybe something like "ManualBackportOnly"instead?   The basic idea is
> that it's not that the commit should *never* be backported, but only
> with human intervention where someone has specifically requested the
> backport, perhaps with qualification test.

I liked the idea at first, as it was more from the positive side of
things. But a CC stable with a "# ManualBackportOnly" might sound like
"I want this backported and handle that" to some developers and not what
they want to express.

After thinking about it for some time I went with "# no semi-automatic
backport" for now. For details see:
https://lore.kernel.org/all/c0a08b160b286e8c98549eedb37404c6e784cf8a.1712812895.git.linux@leemhuis.info/

Ciao, Thorsten

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2024-04-11  5:30 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-04-02  8:08 Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23 Linux regression tracking (Thorsten Leemhuis)
2024-04-03  0:38 ` Tejun Heo
2024-04-03  4:26   ` Linux regression tracking (Thorsten Leemhuis)
2024-04-03  5:11     ` Greg KH
2024-04-03 15:22       ` Tejun Heo
2024-04-03 16:10         ` Greg KH
2024-04-04 15:36           ` Do we need a "DoNotBackPort" tag? (was: Re: Hibernate stuck after recent kernel/workqueue.c changes in Stable 6.6.23) Linux regression tracking (Thorsten Leemhuis)
2024-04-04 15:44             ` Greg KH
2024-04-04 15:56               ` Do we need a "DoNotBackPort" tag? Linux regression tracking (Thorsten Leemhuis)
2024-04-05  2:54                 ` Theodore Ts'o
2024-04-05  4:24                   ` Greg KH
2024-04-11  5:30                   ` Thorsten Leemhuis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.