* [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
@ 2018-09-04 20:58 Laura Abbott
  2018-09-04 21:12 ` Jiri Kosina
                   ` (4 more replies)
  0 siblings, 5 replies; 74+ messages in thread
From: Laura Abbott @ 2018-09-04 20:58 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: Greg KH

I'd like to start a discussion about the stable release cycle.

Fedora is a heavy user of the most recent stable trees and we
generally do a pretty good job of keeping up to date. As we
try and increase testing though, the stable release process
gets to be a bit difficult. We often run into the problem where
release .Z is officially released and then .Z+1 comes
out as an -rc immediately after. Given Fedora release processes,
we haven't always finished testing .Z by the time .Z+1 comes
out. What to do in this situation really depends on what's in
.Z and .Z+1 and how stable we think things are. This usually
works out fine but a) sometimes we guess wrong and should have
tested .Z more b) we're only looking to increase testing.

What I'd like to see is stable updates that come on a regular
schedule with a longer -rc interval, say Sunday with
a one week -rc period. I understand that much of the current
stable schedule is based on Greg's schedule. As a distro
maintainer though, a regular release schedule with a longer
testing window makes it much easier to plan and deliver something
useful to our users. It's also a much easier sell for encouraging
everyone to pick up every stable update if there's a known
schedule. I also realize Greg is probably reading this with a very
skeptical look on his face so I'd be interested to hear from
other distro maintainers as well.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
@ 2018-09-04 21:12 ` Jiri Kosina
  2018-09-05 14:31   ` Greg KH
  2018-09-04 21:22 ` Justin Forbes
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 74+ messages in thread
From: Jiri Kosina @ 2018-09-04 21:12 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit-discuss

On Tue, 4 Sep 2018, Laura Abbott wrote:

> I also realize Greg is probably reading this with a very skeptical look 
> on his face so I'd be interested to hear from other distro maintainers 
> as well.

As a SUSE distro kernel maintainer, I'd really like to participate if any 
such discussion is happening.

Namely:

- we're having a lot of internal discussions about how to adjust our 
  processes to the changes happening in the -stable tree process and patch 
  acceptance criteria

- it's becoming more and more apparent (and even Greg stated it in the 
  thread from a few months ago that Sasha referred to) that the stable tree 
  is not really intended for distros in the first place; it might be useful 
  to have this clarified a bit more.

  Namely: it's sort of evident that most of the major distros are running 
  their own variation of the stable tree. Would it be beneficial to 
  somehow close the feedback loop back from the distros to the stable 
  tree? Or is a total disconnect between the two worlds inevitable and 
  desired?

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
  2018-09-04 21:12 ` Jiri Kosina
@ 2018-09-04 21:22 ` Justin Forbes
  2018-09-05 14:42   ` Greg KH
  2018-09-04 21:33 ` Sasha Levin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 74+ messages in thread
From: Justin Forbes @ 2018-09-04 21:22 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit

On Tue, Sep 4, 2018 at 3:58 PM, Laura Abbott <labbott@redhat.com> wrote:
> I'd like to start a discussion about the stable release cycle.
>
> Fedora is a heavy user of the most recent stable trees and we
> generally do a pretty good job of keeping up to date. As we
> try and increase testing though, the stable release process
> gets to be a bit difficult. We often run into the problem where
> release .Z is officially released and then .Z+1 comes
> out as an -rc immediately after. Given Fedora release processes,
> we haven't always finished testing .Z by the time .Z+1 comes
> out. What to do in this situation really depends on what's in
> .Z and .Z+1 and how stable we think things are. This usually
> works out fine but a) sometimes we guess wrong and should have
> tested .Z more b) we're only looking to increase testing.
>
> What I'd like to see is stable updates that come on a regular
> schedule with a longer -rc interval, say Sunday with
> a one week -rc period. I understand that much of the current
> stable schedule is based on Greg's schedule. As a distro
> maintainer though, a regular release schedule with a longer
> testing window makes it much easier to plan and deliver something
> useful to our users. It's also a much easier sell for encouraging
> everyone to pick up every stable update if there's a known
> schedule. I also realize Greg is probably reading this with a very
> skeptical look on his face so I'd be interested to hear from
> other distro maintainers as well.
>

This has been a fairly recent problem. There was a roughly weekly
cadence for a very long time, and that was pretty easy to work with.  I
know that some of these updates do fix embargoed security issues that
we don't find out are actual fixes until later, but frequently in
those cases the fixes are pushed well before the embargo lifts, and they
could fit into a weekly cadence.  Personally I don't have a problem
with the 3-day -rc period, but pushing 2 kernels a week can be a
problem for users (skipping a stable update is also a problem for
users).  What I would prefer is 1 stable update per week, with an
exception for *serious* security issues, where serious would mean
either real end-user impact or a high-profile issue with lots of press,
where users are going to be wondering where the fix is.

Justin

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
  2018-09-04 21:12 ` Jiri Kosina
  2018-09-04 21:22 ` Justin Forbes
@ 2018-09-04 21:33 ` Sasha Levin
  2018-09-04 21:55   ` Guenter Roeck
                     ` (2 more replies)
  2018-09-04 21:49 ` Guenter Roeck
  2018-09-05  3:44 ` Eduardo Valentin
  4 siblings, 3 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-04 21:33 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit-discuss

On Tue, Sep 04, 2018 at 01:58:42PM -0700, Laura Abbott wrote:
>I'd like to start a discussion about the stable release cycle.
>
>Fedora is a heavy user of the most recent stable trees and we
>generally do a pretty good job of keeping up to date. As we
>try and increase testing though, the stable release process
>gets to be a bit difficult. We often run into the problem where
>release .Z is officially released and then .Z+1 comes
>out as an -rc immediately after. Given Fedora release processes,
>we haven't always finished testing .Z by the time .Z+1 comes
>out. What to do in this situation really depends on what's in
>.Z and .Z+1 and how stable we think things are. This usually
>works out fine but a) sometimes we guess wrong and should have
>tested .Z more b) we're only looking to increase testing.
>
>What I'd like to see is stable updates that come on a regular
>schedule with a longer -rc interval, say Sunday with
>a one week -rc period. I understand that much of the current
>stable schedule is based on Greg's schedule. As a distro
>maintainer though, a regular release schedule with a longer
>testing window makes it much easier to plan and deliver something
>useful to our users. It's also a much easier sell for encouraging
>everyone to pick up every stable update if there's a known
>schedule. I also realize Greg is probably reading this with a very
>skeptical look on his face so I'd be interested to hear from
>other distro maintainers as well.

OTOH, what I like with the current process is that I don't have to align
any of the various (internal) release schedules we have with some
standard stable kernel release schedule. I just pick the latest stable
kernel (.Z) and we go through our build/testing pipeline on it. If
another stable kernel (.Z+1) is released a day later it will just wait
until the next release based on our schedule.

Why not set your own release schedule and just take the latest stable
kernel at that point? So what if the .Z+1 kernel is out a day later? You
could just queue it up for your next release.

This is exactly what would happen if you ask Greg to go on some sort of
a schedule - he'll just defer the .Z+1 commits to what would have been
the .Z+2 release, so you don't really win anything by moving to a
stricter schedule.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
                   ` (2 preceding siblings ...)
  2018-09-04 21:33 ` Sasha Levin
@ 2018-09-04 21:49 ` Guenter Roeck
  2018-09-04 22:06   ` Laura Abbott
  2018-09-05  3:44 ` Eduardo Valentin
  4 siblings, 1 reply; 74+ messages in thread
From: Guenter Roeck @ 2018-09-04 21:49 UTC (permalink / raw)
  To: Laura Abbott, ksummit-discuss; +Cc: Greg KH

On 09/04/2018 01:58 PM, Laura Abbott wrote:
> I'd like to start a discussion about the stable release cycle.
> 
> Fedora is a heavy user of the most recent stable trees and we
> generally do a pretty good job of keeping up to date. As we
> try and increase testing though, the stable release process
> gets to be a bit difficult. We often run into the problem where
> release .Z is officially released and then .Z+1 comes
> out as an -rc immediately after. Given Fedora release processes,
> we haven't always finished testing .Z by the time .Z+1 comes
> out. What to do in this situation really depends on what's in
> .Z and .Z+1 and how stable we think things are. This usually
> works out fine but a) sometimes we guess wrong and should have
> tested .Z more b) we're only looking to increase testing.
> 
> What I'd like to see is stable updates that come on a regular
> schedule with a longer -rc interval, say Sunday with
> a one week -rc period. I understand that much of the current
> stable schedule is based on Greg's schedule. As a distro
> maintainer though, a regular release schedule with a longer
> testing window makes it much easier to plan and deliver something
> useful to our users. It's also a much easier sell for encouraging
> everyone to pick up every stable update if there's a known
> schedule. I also realize Greg is probably reading this with a very
> skeptical look on his face so I'd be interested to hear from
> other distro maintainers as well.
> 

For my part, a longer -rc interval would not help or improve the
situation. Given the large number of security fixes, it would
actually make the situation worse: In many cases I could no longer
wait for a fix to be available in a release. Instead, I would have
to pick and pre-apply individual patches from a pending release.

I like the idea of having (no more than) one release per week with
the exception of security fixes, but longer -rc intervals would be
problematic.

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:33 ` Sasha Levin
@ 2018-09-04 21:55   ` Guenter Roeck
  2018-09-04 22:03     ` Laura Abbott
  2018-09-04 21:58   ` Laura Abbott
  2018-09-05  6:48   ` Jiri Kosina
  2 siblings, 1 reply; 74+ messages in thread
From: Guenter Roeck @ 2018-09-04 21:55 UTC (permalink / raw)
  To: Sasha Levin, Laura Abbott; +Cc: Greg KH, ksummit-discuss

On 09/04/2018 02:33 PM, Sasha Levin via Ksummit-discuss wrote:
> On Tue, Sep 04, 2018 at 01:58:42PM -0700, Laura Abbott wrote:
>> I'd like to start a discussion about the stable release cycle.
>>
>> Fedora is a heavy user of the most recent stable trees and we
>> generally do a pretty good job of keeping up to date. As we
>> try and increase testing though, the stable release process
>> gets to be a bit difficult. We often run into the problem where
>> release .Z is officially released and then .Z+1 comes
>> out as an -rc immediately after. Given Fedora release processes,
>> we haven't always finished testing .Z by the time .Z+1 comes
>> out. What to do in this situation really depends on what's in
>> .Z and .Z+1 and how stable we think things are. This usually
>> works out fine but a) sometimes we guess wrong and should have
>> tested .Z more b) we're only looking to increase testing.
>>
>> What I'd like to see is stable updates that come on a regular
>> schedule with a longer -rc interval, say Sunday with
>> a one week -rc period. I understand that much of the current
>> stable schedule is based on Greg's schedule. As a distro
>> maintainer though, a regular release schedule with a longer
>> testing window makes it much easier to plan and deliver something
>> useful to our users. It's also a much easier sell for encouraging
>> everyone to pick up every stable update if there's a known
>> schedule. I also realize Greg is probably reading this with a very
>> skeptical look on his face so I'd be interested to hear from
>> other distro maintainers as well.
> 
> OTOH, what I like with the current process is that I don't have to align
> any of the various (internal) release schedules we have with some
> standard stable kernel release schedule. I just pick the latest stable
> kernel (.Z) and we go through our build/testing pipeline on it. If
> another stable kernel (.Z+1) is released a day later it will just wait
> until the next release based on our schedule.
> 
> Why not set your own release schedule and just take the latest stable
> kernel at that point? So what if the .Z+1 kernel is out a day later? You
> could just queue it up for your next release.
> 
> This is exactly what would happen if you ask Greg to go on some sort of
> a schedule - he'll just defer the .Z+1 commits to what would have been
> the .Z+2 release, so you don't really win anything by moving to a
> stricter schedule.
> 

Good point. There would actually be a downside of having a longer
release cycle: Fewer releases means more patches per release.
More patches per release results in more regressions per release
(if we assume a constant percentage of regressions, which seems
reasonable).

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:33 ` Sasha Levin
  2018-09-04 21:55   ` Guenter Roeck
@ 2018-09-04 21:58   ` Laura Abbott
  2018-09-05  4:53     ` Sasha Levin
  2018-09-05  6:48   ` Jiri Kosina
  2 siblings, 1 reply; 74+ messages in thread
From: Laura Abbott @ 2018-09-04 21:58 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Greg KH, ksummit-discuss

On 09/04/2018 02:33 PM, Sasha Levin wrote:
> Why not set your own release schedule and just take the latest stable
> kernel at that point? So what if the .Z+1 kernel is out a day later? You
> could just queue it up for your next release.
> 

It's really rough on users to update that frequently. Fedora relies
on users to give feedback to let us know to push an update, and on
older releases it can be hard to get feedback. We've sometimes had
cases where multiple stable updates got delayed because they hadn't
gotten enough testing to be pushed. Admittedly that's a Fedora quirk,
but it's still an issue that the short -rc and release window is not
enough for many users to test and give feedback. When stable regressions
are introduced it's very difficult to guide users to bisect.

> This is exactly what would happen if you ask Greg to go on some sort of
> a schedule - he'll just defer the .Z+1 commits to what would have been
> the .Z+2 release, so you don't really win anything by moving to a
> stricter schedule.

I'd actually be okay with that. I'd rather focus on testing a known
set of commits and getting those stable before pushing out.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:55   ` Guenter Roeck
@ 2018-09-04 22:03     ` Laura Abbott
  2018-09-04 23:14       ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Laura Abbott @ 2018-09-04 22:03 UTC (permalink / raw)
  To: Guenter Roeck, Sasha Levin; +Cc: Greg KH, ksummit-discuss

On 09/04/2018 02:55 PM, Guenter Roeck wrote:
> On 09/04/2018 02:33 PM, Sasha Levin via Ksummit-discuss wrote:
>> On Tue, Sep 04, 2018 at 01:58:42PM -0700, Laura Abbott wrote:
>>> I'd like to start a discussion about the stable release cycle.
>>>
>>> Fedora is a heavy user of the most recent stable trees and we
>>> generally do a pretty good job of keeping up to date. As we
>>> try and increase testing though, the stable release process
>>> gets to be a bit difficult. We often run into the problem where
>>> release .Z is officially released and then .Z+1 comes
>>> out as an -rc immediately after. Given Fedora release processes,
>>> we haven't always finished testing .Z by the time .Z+1 comes
>>> out. What to do in this situation really depends on what's in
>>> .Z and .Z+1 and how stable we think things are. This usually
>>> works out fine but a) sometimes we guess wrong and should have
>>> tested .Z more b) we're only looking to increase testing.
>>>
>>> What I'd like to see is stable updates that come on a regular
>>> schedule with a longer -rc interval, say Sunday with
>>> a one week -rc period. I understand that much of the current
>>> stable schedule is based on Greg's schedule. As a distro
>>> maintainer though, a regular release schedule with a longer
>>> testing window makes it much easier to plan and deliver something
>>> useful to our users. It's also a much easier sell for encouraging
>>> everyone to pick up every stable update if there's a known
>>> schedule. I also realize Greg is probably reading this with a very
>>> skeptical look on his face so I'd be interested to hear from
>>> other distro maintainers as well.
>>
>> OTOH, what I like with the current process is that I don't have to align
>> any of the various (internal) release schedules we have with some
>> standard stable kernel release schedule. I just pick the latest stable
>> kernel (.Z) and we go through our build/testing pipeline on it. If
>> another stable kernel (.Z+1) is released a day later it will just wait
>> until the next release based on our schedule.
>>
>> Why not set your own release schedule and just take the latest stable
>> kernel at that point? So what if the .Z+1 kernel is out a day later? You
>> could just queue it up for your next release.
>>
>> This is exactly what would happen if you ask Greg to go on some sort of
>> a schedule - he'll just defer the .Z+1 commits to what would have been
>> the .Z+2 release, so you don't really win anything by moving to a
>> stricter schedule.
>>
> 
> Good point. There would actually be a downside of having a longer
> release cycle: Fewer releases means more patches per release.
> More patches per release results in more regressions per release
> (if we assume a constant percentage of regressions, which seems
> reasonable).
> 

Yes but with a longer -rc cycle we could have more time to actually
find those bugs before they get released and we could get more focused
testing.

> Guenter

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:49 ` Guenter Roeck
@ 2018-09-04 22:06   ` Laura Abbott
  2018-09-04 23:35     ` Guenter Roeck
  0 siblings, 1 reply; 74+ messages in thread
From: Laura Abbott @ 2018-09-04 22:06 UTC (permalink / raw)
  To: Guenter Roeck, ksummit-discuss; +Cc: Greg KH

On 09/04/2018 02:49 PM, Guenter Roeck wrote:
> On 09/04/2018 01:58 PM, Laura Abbott wrote:
>> I'd like to start a discussion about the stable release cycle.
>>
>> Fedora is a heavy user of the most recent stable trees and we
>> generally do a pretty good job of keeping up to date. As we
>> try and increase testing though, the stable release process
>> gets to be a bit difficult. We often run into the problem where
>> release .Z is officially released and then .Z+1 comes
>> out as an -rc immediately after. Given Fedora release processes,
>> we haven't always finished testing .Z by the time .Z+1 comes
>> out. What to do in this situation really depends on what's in
>> .Z and .Z+1 and how stable we think things are. This usually
>> works out fine but a) sometimes we guess wrong and should have
>> tested .Z more b) we're only looking to increase testing.
>>
>> What I'd like to see is stable updates that come on a regular
>> schedule with a longer -rc interval, say Sunday with
>> a one week -rc period. I understand that much of the current
>> stable schedule is based on Greg's schedule. As a distro
>> maintainer though, a regular release schedule with a longer
>> testing window makes it much easier to plan and deliver something
>> useful to our users. It's also a much easier sell for encouraging
>> everyone to pick up every stable update if there's a known
>> schedule. I also realize Greg is probably reading this with a very
>> skeptical look on his face so I'd be interested to hear from
>> other distro maintainers as well.
>>
> 
> For my part, a longer -rc interval would not help or improve the
> situation. Given the large number of security fixes, it would
> actually make the situation worse: In many cases I could no longer
> wait for a fix to be available in a release. Instead, I would have
> to pick and pre-apply individual patches from a pending release.
> 

Fedora does this already. We frequently carry patches which have
not yet made it into a stable release. Sometimes they only stay
around for one release but we've had ones that stayed around for
multiple releases.

> I like the idea of having (no more than) one release per week with
> the exception of security fixes, but longer -rc intervals would be
> problematic.
> 

Security fixes are an interesting question. The problem is that
not every security issue is actually equal and even patches
that fix CVEs can cause regressions.

> Guenter

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 22:03     ` Laura Abbott
@ 2018-09-04 23:14       ` Sasha Levin
  2018-09-04 23:43         ` Guenter Roeck
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-04 23:14 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit-discuss

On Tue, Sep 04, 2018 at 03:03:05PM -0700, Laura Abbott wrote:
>On 09/04/2018 02:55 PM, Guenter Roeck wrote:
>>On 09/04/2018 02:33 PM, Sasha Levin via Ksummit-discuss wrote:
>>>On Tue, Sep 04, 2018 at 01:58:42PM -0700, Laura Abbott wrote:
>>>>I'd like to start a discussion about the stable release cycle.
>>>>
>>>>Fedora is a heavy user of the most recent stable trees and we
>>>>generally do a pretty good job of keeping up to date. As we
>>>>try and increase testing though, the stable release process
>>>>gets to be a bit difficult. We often run into the problem where
>>>>release .Z is officially released and then .Z+1 comes
>>>>out as an -rc immediately after. Given Fedora release processes,
>>>>we haven't always finished testing .Z by the time .Z+1 comes
>>>>out. What to do in this situation really depends on what's in
>>>>.Z and .Z+1 and how stable we think things are. This usually
>>>>works out fine but a) sometimes we guess wrong and should have
>>>>tested .Z more b) we're only looking to increase testing.
>>>>
>>>>What I'd like to see is stable updates that come on a regular
>>>>schedule with a longer -rc interval, say Sunday with
>>>>a one week -rc period. I understand that much of the current
>>>>stable schedule is based on Greg's schedule. As a distro
>>>>maintainer though, a regular release schedule with a longer
>>>>testing window makes it much easier to plan and deliver something
>>>>useful to our users. It's also a much easier sell for encouraging
>>>>everyone to pick up every stable update if there's a known
>>>>schedule. I also realize Greg is probably reading this with a very
>>>>skeptical look on his face so I'd be interested to hear from
>>>>other distro maintainers as well.
>>>
>>>OTOH, what I like with the current process is that I don't have to align
>>>any of the various (internal) release schedules we have with some
>>>standard stable kernel release schedule. I just pick the latest stable
>>>kernel (.Z) and we go through our build/testing pipeline on it. If
>>>another stable kernel (.Z+1) is released a day later it will just wait
>>>until the next release based on our schedule.
>>>
>>>Why not set your own release schedule and just take the latest stable
>>>kernel at that point? So what if the .Z+1 kernel is out a day later? You
>>>could just queue it up for your next release.
>>>
>>>This is exactly what would happen if you ask Greg to go on some sort of
>>>a schedule - he'll just defer the .Z+1 commits to what would have been
>>>the .Z+2 release, so you don't really win anything by moving to a
>>>stricter schedule.
>>>
>>
>>Good point. There would actually be a downside of having a longer
>>release cycle: Fewer releases means more patches per release.
>>More patches per release results in more regressions per release
>>(if we assume a constant percentage of regressions, which seems
>>reasonable).
>>
>
>Yes but with a longer -rc cycle we could have more time to actually
>find those bugs before they get released and we could get more focused
>testing.

Indeed, but what's long enough? I'm sure that if we extend it to a month
we'll find even more bugs; there's never "enough" testing.

Maybe some concrete numbers will help here. Do you maybe know how many
commits in the past year snuck past the -rc cycle into a stable release
and were found to be buggy by Fedora's testing pipeline?

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 22:06   ` Laura Abbott
@ 2018-09-04 23:35     ` Guenter Roeck
  2018-09-05  1:45       ` Laura Abbott
  0 siblings, 1 reply; 74+ messages in thread
From: Guenter Roeck @ 2018-09-04 23:35 UTC (permalink / raw)
  To: Laura Abbott, ksummit-discuss; +Cc: Greg KH

On 09/04/2018 03:06 PM, Laura Abbott wrote:
> On 09/04/2018 02:49 PM, Guenter Roeck wrote:
>> On 09/04/2018 01:58 PM, Laura Abbott wrote:
>>> I'd like to start a discussion about the stable release cycle.
>>>
>>> Fedora is a heavy user of the most recent stable trees and we
>>> generally do a pretty good job of keeping up to date. As we
>>> try and increase testing though, the stable release process
>>> gets to be a bit difficult. We often run into the problem where
>>> release .Z is officially released and then .Z+1 comes
>>> out as an -rc immediately after. Given Fedora release processes,
>>> we haven't always finished testing .Z by the time .Z+1 comes
>>> out. What to do in this situation really depends on what's in
>>> .Z and .Z+1 and how stable we think things are. This usually
>>> works out fine but a) sometimes we guess wrong and should have
>>> tested .Z more b) we're only looking to increase testing.
>>>
>>> What I'd like to see is stable updates that come on a regular
>>> schedule with a longer -rc interval, say Sunday with
>>> a one week -rc period. I understand that much of the current
>>> stable schedule is based on Greg's schedule. As a distro
>>> maintainer though, a regular release schedule with a longer
>>> testing window makes it much easier to plan and deliver something
>>> useful to our users. It's also a much easier sell for encouraging
>>> everyone to pick up every stable update if there's a known
>>> schedule. I also realize Greg is probably reading this with a very
>>> skeptical look on his face so I'd be interested to hear from
>>> other distro maintainers as well.
>>>
>>
>> For my part, a longer -rc interval would not help or improve the
>> situation. Given the large number of security fixes, it would
>> actually make the situation worse: In many cases I could no longer
>> wait for a fix to be available in a release. Instead, I would have
>> to pick and pre-apply individual patches from a pending release.
>>
> 
> Fedora does this already. We frequently carry patches which have
> not yet made it into a stable release. Sometimes they only stay
> around for one release but we've had ones that stayed around for
> multiple releases.
> 
Sure, but having to pull them from release candidates adds additional
work and increases risk.

>> I like the idea of having (no more than) one release per week with
>> the exception of security fixes, but longer -rc intervals would be
>> problematic.
>>
> 
> Security fixes are an interesting question. The problem is that
> not every security issue is actually equal and even patches
> that fix CVEs can cause regressions.
> 

We do have a pretty well defined process for handling CVEs depending
on their severity. The preferred handling for all CVEs is to get the
fixes through stable releases.

As for regressions, only a system with no patches applied is safe from
regressions. Otherwise regressions are unavoidable. The key is to improve
testing to a point where the pain from regressions is acceptable.

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 23:14       ` Sasha Levin
@ 2018-09-04 23:43         ` Guenter Roeck
  2018-09-05  1:17           ` Laura Abbott
  0 siblings, 1 reply; 74+ messages in thread
From: Guenter Roeck @ 2018-09-04 23:43 UTC (permalink / raw)
  To: Sasha Levin, Laura Abbott; +Cc: Greg KH, ksummit-discuss

On 09/04/2018 04:14 PM, Sasha Levin wrote:
[ ... ]
>>
>> Yes but with a longer -rc cycle we could have more time to actually
>> find those bugs before they get released and we could get more focused
>> testing.
> 
> Indeed, but what's long enough? I'm sure that if we extend it to a month
> we'll find even more bugs; there's never "enough" testing.
> 
> Maybe some concrete numbers will help here. Do you maybe know how many
> commits in the past year snuck past the -rc cycle into a stable release
> and found as buggy by Fedora's testing pipeline?
> 

... and how many bugs were found during the existing test cycle ?

The next question would be how many regressions were reported by users
after a release was published.

The statistics I carried until early this year suggested a regression rate
of around 0.15% for stable releases, where regression means that a bug was
found post-release and had to be fixed later. It would indeed be interesting
to know how many of those were found by (automated ?) testing and how many
were found by users.

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 23:43         ` Guenter Roeck
@ 2018-09-05  1:17           ` Laura Abbott
  2018-09-06  3:56             ` Benjamin Gilbert
  0 siblings, 1 reply; 74+ messages in thread
From: Laura Abbott @ 2018-09-05  1:17 UTC (permalink / raw)
  To: Guenter Roeck, Sasha Levin; +Cc: Greg KH, ksummit-discuss

On 09/04/2018 04:43 PM, Guenter Roeck wrote:
> On 09/04/2018 04:14 PM, Sasha Levin wrote:
> [ ... ]
>>>
>>> Yes but with a longer -rc cycle we could have more time to actually
>>> find those bugs before they get released and we could get more focused
>>> testing.
>>
>> Indeed, but what's long enough? I'm sure that if we extend it to a month
>> we'll find even more bugs; there's never "enough" testing.
>>
>> Maybe some concrete numbers will help here. Do you maybe know how many
>> commits in the past year snuck past the -rc cycle into a stable release
>> and found as buggy by Fedora's testing pipeline?
>>
> 
> ... and how many bugs were found during the existing test cycle ?
> 
> The next question would be how many regressions were reported by users
> after a release was published.
> 
> The statistics I carried until early this year suggested a regression rate
> of around 0.15% for stable releases, where regression means that a bug was
> found post-release and had to be fixed later. It would indeed be interesting
> to know how many of those were found by (automated ?) testing and how many
> were found by users.
> 
> Guenter

I'd have to do some digging through bugzilla to get numbers. Some
of this is also motivated by discussions with the CoreOS team, who
have also tried to use the stable kernels and have run into problems.
I'll see if I can get some numbers.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 23:35     ` Guenter Roeck
@ 2018-09-05  1:45       ` Laura Abbott
  2018-09-05  2:54         ` Guenter Roeck
  0 siblings, 1 reply; 74+ messages in thread
From: Laura Abbott @ 2018-09-05  1:45 UTC (permalink / raw)
  To: Guenter Roeck, ksummit-discuss; +Cc: Greg KH

On 09/04/2018 04:35 PM, Guenter Roeck wrote:
> On 09/04/2018 03:06 PM, Laura Abbott wrote:
>> On 09/04/2018 02:49 PM, Guenter Roeck wrote:
>>> On 09/04/2018 01:58 PM, Laura Abbott wrote:
>>>> I'd like to start a discussion about the stable release cycle.
>>>>
>>>> Fedora is a heavy user of the most recent stable trees and we
>>>> generally do a pretty good job of keeping up to date. As we
>>>> try and increase testing though, the stable release process
>>>> gets to be a bit difficult. We often run into the problem where
>>>> release .Z is officially released and then .Z+1 comes
>>>> out as an -rc immediately after. Given Fedora release processes,
>>>> we haven't always finished testing .Z by the time .Z+1 comes
>>>> out. What to do in this situation really depends on what's in
>>>> .Z and .Z+1 and how stable we think things are. This usually
>>>> works out fine but a) sometimes we guess wrong and should have
>>>> tested .Z more b) we're only looking to increase testing.
>>>>
>>>> What I'd like to see is stable updates that come on a regular
>>>> schedule with a longer -rc interval, say Sunday with
>>>> a one week -rc period. I understand that much of the current
>>>> stable schedule is based on Greg's schedule. As a distro
>>>> maintainer though, a regular release schedule with a longer
>>>> testing window makes it much easier to plan and deliver something
>>>> useful to our users. It's also a much easier sell for encouraging
>>>> everyone to pick up every stable update if there's a known
>>>> schedule. I also realize Greg is probably reading this with a very
>>>> skeptical look on his face so I'd be interested to hear from
>>>> other distro maintainers as well.
>>>>
>>>
>>> For my part, a longer -rc interval would not help or improve the
>>> situation. Given the large number of security fixes, it would
>>> actually make the situation worse: In many cases I could no longer
>>> wait for a fix to be available in a release. Instead, I would have
>>> to pick and pre-apply individual patches from a pending release.
>>>
>>
>> Fedora does this already. We frequently carry patches which have
>> not yet made it into a stable release. Sometimes they only stay
>> around for one release but we've had ones that stayed around for
>> multiple releases.
>>
> Sure, but having to pull them from release candidates adds additional
> work and increases risk.
> 
>>> I like the idea of having (no more than) one release per week with
>>> the exception of security fixes, but longer -rc intervals would be
>>> problematic.
>>>
>>
>> Security fixes are an interesting question. The problem is that
>> not every security issue is actually equal and even patches
>> that fix CVEs can cause regressions.
>>
> 
> We do have a pretty well defined process for handling CVEs depending
> on their severity. The preferred handling for all CVEs is to get the
> fixes through stable releases.
> 

Yes, I agree CVEs should eventually go through a stable release
for the same reason all fixes are security fixes. There's also a
difference between a CVE that should be picked up urgently and one that
can be applied as part of a regular update cycle.

> As for regressions, only a system with no patches applied is safe from
> regressions. Otherwise regressions are unavoidable. The key is to improve
> testing to a point where the pain from regressions is acceptable.

This may just be kernel tree philosophy, but I'm not sure any regression
in the stable tree should be acceptable. In Greg's blog post
http://www.kroah.com/log/blog/2018/08/24/what-stable-kernel-should-i-use/
he suggested "Server: Latest stable release or latest LTS release".
I don't think anyone wants their server regressing. I've talked
with the CoreOS team about their experience using stable kernels
and it gets tricky to convince users to update when there are
regressions.

Maybe this goes to what Jiri Kosina suggested about having a
discussion about the target audience of the stable trees since
it seems like different people want different things from the trees.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  1:45       ` Laura Abbott
@ 2018-09-05  2:54         ` Guenter Roeck
  2018-09-05  8:31           ` Jan Kara
  0 siblings, 1 reply; 74+ messages in thread
From: Guenter Roeck @ 2018-09-05  2:54 UTC (permalink / raw)
  To: Laura Abbott, ksummit-discuss; +Cc: Greg KH

On 09/04/2018 06:45 PM, Laura Abbott wrote:
> On 09/04/2018 04:35 PM, Guenter Roeck wrote:
>> On 09/04/2018 03:06 PM, Laura Abbott wrote:
>>> On 09/04/2018 02:49 PM, Guenter Roeck wrote:
>>>> On 09/04/2018 01:58 PM, Laura Abbott wrote:
>>>>> I'd like to start a discussion about the stable release cycle.
>>>>>
>>>>> Fedora is a heavy user of the most recent stable trees and we
>>>>> generally do a pretty good job of keeping up to date. As we
>>>>> try and increase testing though, the stable release process
>>>>> gets to be a bit difficult. We often run into the problem where
>>>>> release .Z is officially released and then .Z+1 comes
>>>>> out as an -rc immediately after. Given Fedora release processes,
>>>>> we haven't always finished testing .Z by the time .Z+1 comes
>>>>> out. What to do in this situation really depends on what's in
>>>>> .Z and .Z+1 and how stable we think things are. This usually
>>>>> works out fine but a) sometimes we guess wrong and should have
>>>>> tested .Z more b) we're only looking to increase testing.
>>>>>
>>>>> What I'd like to see is stable updates that come on a regular
>>>>> schedule with a longer -rc interval, say Sunday with
>>>>> a one week -rc period. I understand that much of the current
>>>>> stable schedule is based on Greg's schedule. As a distro
>>>>> maintainer though, a regular release schedule with a longer
>>>>> testing window makes it much easier to plan and deliver something
>>>>> useful to our users. It's also a much easier sell for encouraging
>>>>> everyone to pick up every stable update if there's a known
>>>>> schedule. I also realize Greg is probably reading this with a very
>>>>> skeptical look on his face so I'd be interested to hear from
>>>>> other distro maintainers as well.
>>>>>
>>>>
>>>> For my part, a longer -rc interval would not help or improve the
>>>> situation. Given the large number of security fixes, it would
>>>> actually make the situation worse: In many cases I could no longer
>>>> wait for a fix to be available in a release. Instead, I would have
>>>> to pick and pre-apply individual patches from a pending release.
>>>>
>>>
>>> Fedora does this already. We frequently carry patches which have
>>> not yet made it into a stable release. Sometimes they only stay
>>> around for one release but we've had ones that stayed around for
>>> multiple releases.
>>>
>> Sure, but having to pull them from release candidates adds additional
>> work and increases risk.
>>
>>>> I like the idea of having (no more than) one release per week with
>>>> the exception of security fixes, but longer -rc intervals would be
>>>> problematic.
>>>>
>>>
>>> Security fixes are an interesting question. The problem is that
>>> not every security issue is actually equal and even patches
>>> that fix CVEs can cause regressions.
>>>
>>
>> We do have a pretty well defined process for handling CVEs depending
>> on their severity. The preferred handling for all CVEs is to get the
>> fixes through stable releases.
>>
> 
> Yes, I agree CVEs should eventually go through a stable release
> for the same reason all fixes are security fixes. There's also a
> difference between a CVE that should be picked up urgently and one that
> can be applied as part of a regular update cycle.
> 
>> As for regressions, only a system with no patches applied is safe from
>> regressions. Otherwise regressions are unavoidable. The key is to improve
>> testing to a point where the pain from regressions is acceptable.
> 
> This may just be kernel tree philosophy but I'm not sure any regression
> in the stable tree should be acceptable. In Greg's blog post
> http://www.kroah.com/log/blog/2018/08/24/what-stable-kernel-should-i-use/
> he suggested "Server: Latest stable release or latest LTS release"
> I don't think anyone wants their server regressing. I've talked
> with the CoreOS team about their experience using stable kernels
> and it gets tricky to convince users to update when there are
> regressions.
>

I understand that philosophy very well. Each and every regression is an argument
to not use stable releases in the first place. We have been there. My solution
is to do everything I can to improve testing to the point where regressions
are all but non-existent.

However, realistically, there will _always_ be regressions, and some of them
_will_ be found post-release. If zero regressions is your absolute must-have
condition for a release, the only guaranteed means to accomplish that is to
make zero changes in that release.

I am not saying that worrying about regressions is futile. Yes, we have
regressions, and, yes, we have to get better at catching them. However, I
don't think that changing the process will provide a solution. We will have
to further improve test coverage instead.

At the same time, I do realize that even a regression rate of 0.01% will
be used as an argument against stable releases. I don't think there is
anything we can do about that.

Only looking at regressions provides an extremely lop-sided view of stable
release quality. The "There Shall Be No Regressions" crowd tends to ignore
the benefit of getting lots of bug fixes. Look at it this way: for each
regression, today, one gets close to a thousand bug fixes. Realistically,
I think that is pretty good. Not good enough, maybe, but pretty good.

Which leads to another point: people complaining about regressions tend to
forget what things looked like just a few years ago, when stable releases
were all but untested. Five years ago, today's regression rate would have
been a dream.

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
                   ` (3 preceding siblings ...)
  2018-09-04 21:49 ` Guenter Roeck
@ 2018-09-05  3:44 ` Eduardo Valentin
  4 siblings, 0 replies; 74+ messages in thread
From: Eduardo Valentin @ 2018-09-05  3:44 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit-discuss

Hello,

On Tue, Sep 04, 2018 at 01:58:42PM -0700, Laura Abbott wrote:
> I'd like to start a discussion about the stable release cycle.
> 
> Fedora is a heavy user of the most recent stable trees and we
> generally do a pretty good job of keeping up to date. As we
> try and increase testing though, the stable release process
> gets to be a bit difficult. We often run into the problem where
> release .Z is officially released and then .Z+1 comes
> out as an -rc immediately after. Given Fedora release processes,
> we haven't always finished testing .Z by the time .Z+1 comes
> out. What to do in this situation really depends on what's in
> .Z and .Z+1 and how stable we think things are. This usually
> works out fine but a) sometimes we guess wrong and should have
> tested .Z more b) we're only looking to increase testing.
> 
> What I'd like to see is stable updates that come on a regular
> schedule with a longer -rc interval, say Sunday with
> a one week -rc period. I understand that much of the current
> stable schedule is based on Greg's schedule. As a distro
> maintainer though, a regular release schedule with a longer
> testing window makes it much easier to plan and deliver something
> useful to our users. It's also a much easier sell for encouraging
> everyone to pick up every stable update if there's a known
> schedule. I also realize Greg is probably reading this with a very
> skeptical look on his face so I'd be interested to hear from
> other distro maintainers as well.


If this discussion is happening, I would like to be part of it. As an
Amazon Linux kernel maintainer, I can certainly relate to the issue of
introduced regressions. And as of today, Amazon Linux does rely
on stable kernels.

Now, I am not sure if making the stable -rc cycle longer would actually
improve the regression issue. As already mentioned in other emails
in this thread, longer rcs == more patches == more regressions. Given
that we set our own internal cadence, and we do not commit to releasing
every single stable rc, picking whatever is the latest rc is what we
typically do.

Also, as for the cadence of the stable branches, what I have noticed
is that one kernel per week should be enough. However, I do
also see that there will always be cases of more than one release
per week for the embargo cases (at least based on my last observations
of the LTS branches), and those usually also mean we need to carry or
cherry-pick patches.

With that said, what I would like to see come out of the discussion is:
- What can be done to reduce the introduction of regressions? I also
agree that being regression-free is a target, not necessarily achievable,
but sharing testing strategies during the -rc cycles may be one thing
to consider.
- What can be done to improve the embargo process and CVE management.
- Should distros be using stable / LTS kernels?

BR, 

Eduardo

> 
> Thanks,
> Laura
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:58   ` Laura Abbott
@ 2018-09-05  4:53     ` Sasha Levin
  0 siblings, 0 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-05  4:53 UTC (permalink / raw)
  To: Laura Abbott; +Cc: Greg KH, ksummit-discuss

On Tue, Sep 04, 2018 at 02:58:42PM -0700, Laura Abbott wrote:
>On 09/04/2018 02:33 PM, Sasha Levin wrote:
>>Why not set your own release schedule and just take the latest stable
>>kernel at that point? So what if the .Z+1 kernel is out a day later? You
>>could just queue it up for your next release.
>>
>
>It's really rough on users to update that frequently. Fedora relies
>on users to give feedback to let us know to push an update and on
>older releases it can be hard to get feedback. We've sometimes had
>issues where multiple stable issues get delayed because they haven't
>gotten enough testing to be pushed. Admittedly that's a Fedora quirk
>but it's still an issue that the short rc and release window is not
>enough for many users to test and give feedback. When stable regressions
>are introduced it's very difficult to guide users to bisect.
>
>>This is exactly what would happen if you ask Greg to go on some sort of
>>a schedule - he'll just defer the .Z+1 commits to what would have been
>>the .Z+2 release, so you don't really win anything by moving to a
>>stricter schedule.
>
>I'd actually be okay with that. I'd rather focus on testing a known
>set of commits and getting those stable before pushing out.

My point is that Fedora can just skip a certain stable release if the
timing is off. If Greg releases .Z+1 but you've just started testing .Z,
just proceed with .Z and go with .Z+2 for your next release.

The set of commits in .Z and .Z+2 would be exactly the same as if Greg
were asked to wait longer between releases.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:33 ` Sasha Levin
  2018-09-04 21:55   ` Guenter Roeck
  2018-09-04 21:58   ` Laura Abbott
@ 2018-09-05  6:48   ` Jiri Kosina
  2018-09-05  8:16     ` Jan Kara
  2 siblings, 1 reply; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05  6:48 UTC (permalink / raw)
  To: Sasha Levin; +Cc: Greg KH, ksummit-discuss

On Tue, 4 Sep 2018, Sasha Levin via Ksummit-discuss wrote:

> This is exactly what would happen if you ask Greg to go on some sort of 
> a schedule - he'll just defer the .Z+1 commits to what would have been 
> the .Z+2 release, so you don't really win anything by moving to a 
> stricter schedule.

You potentially do win one thing, and that's review (or at least the 
possibility of review).

With the current cadence, I'd put all my bets on the fact that everybody is 
just completely ignoring the e-mails about patches being queued for stable 
inclusion. There are just way, way too many of them; it's a never-ending, 
continuous, overwhelming stream.

If this were happening at a lower cadence, the chances of people (the 
original patch author, reviewers and maintainer) actually investing brain 
energy into evaluating whether a particular patch is suitable for a 
particular stable tree without introducing a backporting regression would 
be much higher.

Also, it's not only the cadence but also the patch selection criteria that 
contribute to killing the review of stable patches; the bar for stable 
tree acceptance is much lower than it used to be (really, just look at the 
criteria formulated in stable-kernel-rules.rst and then match them against 
the patches that actually land in the tree). So we'd need both; otherwise 
I think the trend of distros moving away from stable is inevitable (as "no 
review" basically equals "we're not obsessed about regressions", and 
distros don't want that, I think).

But then, yes, it might be that that's actually not a problem :)

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  6:48   ` Jiri Kosina
@ 2018-09-05  8:16     ` Jan Kara
  2018-09-05  8:32       ` Jiri Kosina
  2018-09-05 10:28       ` Thomas Gleixner
  0 siblings, 2 replies; 74+ messages in thread
From: Jan Kara @ 2018-09-05  8:16 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Greg KH, ksummit-discuss

On Wed 05-09-18 08:48:03, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Sasha Levin via Ksummit-discuss wrote:
> 
> > This is exactly what would happen if you ask Greg to go on some sort of 
> > a schedule - he'll just defer the .Z+1 commits to what would have been 
> > the .Z+2 release, so you don't really win anything by moving to a 
> > stricter schedule.
> 
> You potentially do win one thing, and that's review (or at least 
> possibility of review).
> 
> With current cadence, I'd put all my bets on the fact that everybody is 
> just completely ignoring the e-mails about patches being queued for stable 
> inclusion. It's just way, way too many of them, it's a neverending, 
> contignuous, overwhelming stream.
> 
> If this would be happening at smaller cadence, chances of people (original 
> patch author, reviewers and maintainer) actually investing brain energy 
> into evaluating whether particular patch is suitable for particular stable 
> without introducing backporting regression would be much higher.

So I agree that with the current amount of patches in the stable tree, the
review is cursory at best. However, that does not really seem to be related
to the frequency of stable releases (which is what I believe Laura
complains about in this thread) but rather to the number of patches going
into stable.

> Also, it's not only the cadence, but the patch selection criteria that 
> contributes to killing the review of stable patches; the bar for stable 
> tree acceptance is much lower than it used to be (really, just look at the 
> criteria formulated in stable-kernel-rules.rst and then match them against 
> the patches that actually land in the tree); so we'd need both, otherwise 
> I think the trend of distros moving away from stable is inevitable (as "no 
> review" basically equals "we're not obsessed about regressions", and 
> distros don't want that, I think).

I think distros usually have an established feedback loop (through bugzilla
etc.), so they rely on an "if there's a bug, users will report it and we'll
fix it" strategy. So they need to proactively fix only the really nasty
bugs which you don't ever want to happen. OTOH, for products like embedded
devices, that feedback loop is much weaker (it's difficult for the average
user to extract info about the problem from the device), so I can
understand that they need to be much more aggressive in picking fixes
(since even with the current stable cadence it is a net win in the number
of bugs your wide user base is going to hit).

So I think the bar for patch acceptance for these two kinds of deployments
is rather different, and Greg decided to accommodate the second one more.
Distros now have to decide whether they relax their rules as well or
whether they do their own stricter selection. At least that was the outcome
of the last stable-tree discussion for me...

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  2:54         ` Guenter Roeck
@ 2018-09-05  8:31           ` Jan Kara
  0 siblings, 0 replies; 74+ messages in thread
From: Jan Kara @ 2018-09-05  8:31 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Greg KH, ksummit-discuss

On Tue 04-09-18 19:54:34, Guenter Roeck wrote:
> On 09/04/2018 06:45 PM, Laura Abbott wrote:
> > On 09/04/2018 04:35 PM, Guenter Roeck wrote:
> > > On 09/04/2018 03:06 PM, Laura Abbott wrote:
> > > > On 09/04/2018 02:49 PM, Guenter Roeck wrote:
> > > > > On 09/04/2018 01:58 PM, Laura Abbott wrote:
> > > > > > I'd like to start a discussion about the stable release cycle.
> > > > > > 
> > > > > > Fedora is a heavy user of the most recent stable trees and we
> > > > > > generally do a pretty good job of keeping up to date. As we
> > > > > > try and increase testing though, the stable release process
> > > > > > gets to be a bit difficult. We often run into the problem where
> > > > > > release .Z is officially released and then .Z+1 comes
> > > > > > out as an -rc immediately after. Given Fedora release processes,
> > > > > > we haven't always finished testing .Z by the time .Z+1 comes
> > > > > > out. What to do in this situation really depends on what's in
> > > > > > .Z and .Z+1 and how stable we think things are. This usually
> > > > > > works out fine but a) sometimes we guess wrong and should have
> > > > > > tested .Z more b) we're only looking to increase testing.
> > > > > > 
> > > > > > What I'd like to see is stable updates that come on a regular
> > > > > > schedule with a longer -rc interval, say Sunday with
> > > > > > a one week -rc period. I understand that much of the current
> > > > > > stable schedule is based on Greg's schedule. As a distro
> > > > > > maintainer though, a regular release schedule with a longer
> > > > > > testing window makes it much easier to plan and deliver something
> > > > > > useful to our users. It's also a much easier sell for encouraging
> > > > > > everyone to pick up every stable update if there's a known
> > > > > > schedule. I also realize Greg is probably reading this with a very
> > > > > > skeptical look on his face so I'd be interested to hear from
> > > > > > other distro maintainers as well.
> > > > > > 
> > > > > 
> > > > > For my part, a longer -rc interval would not help or improve the
> > > > > situation. Given the large number of security fixes, it would
> > > > > actually make the situation worse: In many cases I could no longer
> > > > > wait for a fix to be available in a release. Instead, I would have
> > > > > to pick and pre-apply individual patches from a pending release.
> > > > > 
> > > > 
> > > > Fedora does this already. We frequently carry patches which have
> > > > not yet made it into a stable release. Sometimes they only stay
> > > > around for one release but we've had ones that stayed around for
> > > > multiple releases.
> > > > 
> > > Sure, but having to pull them from release candidates adds additional
> > > work and increases risk.
> > > 
> > > > > I like the idea of having (no more than) one release per week with
> > > > > the exception of security fixes, but longer -rc intervals would be
> > > > > problematic.
> > > > > 
> > > > 
> > > > Security fixes are an interesting question. The problem is that
> > > > not every security issue is actually equal and even patches
> > > > that fix CVEs can cause regressions.
> > > > 
> > > 
> > > We do have a pretty well defined process for handling CVEs depending
> > > on their severity. The preferred handling for all CVEs is to get the
> > > fixes through stable releases.
> > > 
> > 
> > Yes, I agree CVEs should eventually go through a stable release
> > for the same reason all fixes are security fixes. There's also a
> > difference between a CVE that should be picked up urgently and one that
> > can be applied as part of a regular update cycle.
> > 
> > > As for regressions, only a system with no patches applied is safe from
> > > regressions. Otherwise regressions are unavoidable. The key is to improve
> > > testing to a point where the pain from regressions is acceptable.
> > 
> > This may just be kernel tree philosophy but I'm not sure any regression
> > in the stable tree should be acceptable. In Greg's blog post
> > http://www.kroah.com/log/blog/2018/08/24/what-stable-kernel-should-i-use/
> > he suggested "Server: Latest stable release or latest LTS release"
> > I don't think anyone wants their server regressing. I've talked
> > with the CoreOS team about their experience using stable kernels
> > and it gets tricky to convince users to update when there are
> > regressions.
> > 
> 
> I understand that philosophy very well. Each and every regression is an argument
> to not use stable releases in the first place. We have been there. My solution
> is to do everything I can to improve testing to the point where regressions
> are all but non-existent.
> 
> However, realistically, there will _always_ be regressions, and some of them
> _will_ be found post-release. If zero regressions is your absolute must-have
> condition for a release, the only guaranteed means to accomplish that is to
> make zero changes in that release.

Agreed. Even enterprise distro kernels with all the QA and review and
experienced people doing the work regress occasionally. In the end what
matters to users is: "If I update, how big is the probability that my
current workload stops working?" And you can make that 1%, 0.1% or 0.01%
(with correspondingly increasing costs to achieve that) but never 0.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:16     ` Jan Kara
@ 2018-09-05  8:32       ` Jiri Kosina
  2018-09-05  8:56         ` Greg KH
  2018-09-05  9:58         ` James Bottomley
  2018-09-05 10:28       ` Thomas Gleixner
  1 sibling, 2 replies; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05  8:32 UTC (permalink / raw)
  To: Jan Kara; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Jan Kara wrote:

> So I agree that with current amount of patches in stable tree, the 
> review is cursory at best. However that does not really seem to be 
> related to the frequency of stable releases (which is what I believe 
> Laura complains about in this thread) but rather to the amount of 
> patches going into stable.

I think the psychological aspect shouldn't be ignored in this particular 
case.

Maintainers and patch authors who are constantly flooded by stable queues
never really get back to reviewing them, as there's always more already in
flight and the previous pile is still unreviewed.

If it comes at a regular pace though, it's a bit easier to align with it and
establish, for example, a "Friday afternoon, two hours of stable review"
slot in the maintainer's workflow :)

But yeah, it's weak and doesn't solve the primary thing, which is just the 
size of stable itself.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:32       ` Jiri Kosina
@ 2018-09-05  8:56         ` Greg KH
  2018-09-05  9:13           ` Geert Uytterhoeven
  2018-09-05 10:11           ` Mark Brown
  2018-09-05  9:58         ` James Bottomley
  1 sibling, 2 replies; 74+ messages in thread
From: Greg KH @ 2018-09-05  8:56 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Wed, Sep 05, 2018 at 10:32:45AM +0200, Jiri Kosina wrote:
> But yeah, it's weak and doesn't solve the primary thing, which is just the 
> size of stable itself.

I've held off on responding so far, but I think this is the big point of
"fear" that many people seem to have with the stable updates over the
past years.

Yes, the "size" is bigger than before, but that is because of the
following things:
	- our development process is going faster (9 patches an hour)
	  than it used to be a few years ago.
	- We have finally woken some subsystem maintainers up so that they
	  actually properly tag patches for stable.  We used to have
	  a horrid rate of this happening, and it is getting better.
	  However, we still have whole major subsystems that _never_ tag
	  anything, which is a problem, so things will get larger.
	- I have more time to work on this than when I used to work for
	  a distro.
	- The "Fixes:" tag has helped out a lot in finding patches that
	  people forgot to tag with stable@ lines.
	- We have more people using and caring about stable kernels, so
	  they submit more patches for them, allowing them to replace
	  their internal trees.
	- Sasha's work in finding the patches that maintainers/developers
	  fail to tag is paying off really well, which also increases
	  the size.
	- fuzzing tools are finding loads of stuff that have always been
	  there.  syzbot is wonderful in this, and still has many
	  hundreds of open bugs left to be fixed.  When they are fixed,
	  those patches will be backported.  This means we are getting
	  better at finding and fixing things, not that these bugs weren't
	  there all along.
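
(For illustration, with placeholder hashes and names rather than a real
patch: the tagging mentioned above takes roughly this form in an upstream
commit message, per stable-kernel-rules.rst / submitting-patches.rst:)

	Fixes: 1234567890ab ("subsys: change that introduced the bug")
	Cc: stable@vger.kernel.org # 4.14.x
	Signed-off-by: Some Developer <dev@example.org>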

So yes, things are "bigger" than before, but still, overall, we are only
accepting a small percentage of patches that hit Linus's tree (12
patches a day for stable vs. 216 a day for Linus).

Are you also worried that Linus's tree is getting "bigger"?  :)

So we are larger than before, but this is a good thing, because we are
actually catching more problems than we were before.  Which means your
older kernels had more bugs...

Anyway, just a comment that you should not "fear" the increased size,
it is to be expected as more people pay more attention to Linux,
combined with the fact that we are still growing.  If the stable patches
were shrinking, then I would get worried as that would imply that people
don't care anymore.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:56         ` Greg KH
@ 2018-09-05  9:13           ` Geert Uytterhoeven
  2018-09-05  9:33             ` Greg KH
  2018-09-05 10:11           ` Mark Brown
  1 sibling, 1 reply; 74+ messages in thread
From: Geert Uytterhoeven @ 2018-09-05  9:13 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

Hi Greg,

On Wed, Sep 5, 2018 at 10:56 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Sep 05, 2018 at 10:32:45AM +0200, Jiri Kosina wrote:
> > But yeah, it's weak and doesn't solve the primary thing, which is just the
> > size of stable itself.
>
> I've held off on responding so far, but I think this is the big point of
> "fear" that many people seem to have with the stable updates over the
> past years.
>
> Yes, the "size" is bigger than before, but that is because of the
> following things:

[...]

> So yes, things are "bigger" than before, but still, overall, we are only
> accepting a small percentage of patches that hit Linus's tree (12
> patches a day for stable vs. 216 a day for Linus).
>
> Are you also worried that Linus's tree is getting "bigger"?  :)
>
> So we are larger than before, but this is a good thing, because we are
> actually catching more problems than we were before.  Which means your
> older kernels had more bugs...
>
> Anyway, just a comment in that you should not "fear" the increased size,
> it is to be expected as more people pay more attention to Linux,
> combined with the fact that we are still growing.  If the stable patches
> were shrinking, then I would get worried as that would imply that people
> don't care anymore.

So we're slowly evolving to the point where distros/companies/... can just
follow Linus' tree instead, and stable trees are no longer needed, except for
the last released point version (no-one wants to run plain rc1 or rc2) ;-)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  9:13           ` Geert Uytterhoeven
@ 2018-09-05  9:33             ` Greg KH
  0 siblings, 0 replies; 74+ messages in thread
From: Greg KH @ 2018-09-05  9:33 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: ksummit-discuss

On Wed, Sep 05, 2018 at 11:13:01AM +0200, Geert Uytterhoeven wrote:
> Hi Greg,
> 
> On Wed, Sep 5, 2018 at 10:56 AM Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Wed, Sep 05, 2018 at 10:32:45AM +0200, Jiri Kosina wrote:
> > > But yeah, it's weak and doesn't solve the primary thing, which is just the
> > > size of stable itself.
> >
> > I've held off on responding so far, but I think this is the big point of
> > "fear" that many people seem to have with the stable updates over the
> > past years.
> >
> > Yes, the "size" is bigger than before, but that is because of the
> > following things:
> 
> [...]
> 
> > So yes, things are "bigger" than before, but still, overall, we are only
> > accepting a small percentage of patches that hit Linus's tree (12
> > patches a day for stable vs. 216 a day for Linus).
> >
> > Are you also worried that Linus's tree is getting "bigger"?  :)
> >
> > So we are larger than before, but this is a good thing, because we are
> > actually catching more problems than we were before.  Which means your
> > older kernels had more bugs...
> >
> > Anyway, just a comment in that you should not "fear" the increased size,
> > it is to be expected as more people pay more attention to Linux,
> > combined with the fact that we are still growing.  If the stable patches
> > were shrinking, then I would get worried as that would imply that people
> > don't care anymore.
> 
> So we're slowly evolving to the point where distros/companies/... can just
> follow Linus' tree instead, and stable trees are no longer needed, except for
> the last released point version (no-one wants to run plain rc1 or rc2) ;-)

I would _love_ it if that were to start happening!

Seriously, I would...

greg "please put me out of a job" k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:32       ` Jiri Kosina
  2018-09-05  8:56         ` Greg KH
@ 2018-09-05  9:58         ` James Bottomley
  2018-09-05 10:47           ` Mark Brown
  1 sibling, 1 reply; 74+ messages in thread
From: James Bottomley @ 2018-09-05  9:58 UTC (permalink / raw)
  To: Jiri Kosina, Jan Kara; +Cc: Greg KH, ksummit-discuss

On Wed, 2018-09-05 at 10:32 +0200, Jiri Kosina wrote:
> On Wed, 5 Sep 2018, Jan Kara wrote:
> 
> > So I agree that with current amount of patches in stable tree, the 
> > review is cursory at best. However that does not really seem to be 
> > related to the frequency of stable releases (which is what I
> > believe  Laura complains about in this thread) but rather to the
> > amount of 
> > patches going into stable.
> 
> I think the psychological aspect shouldn't be ignored in this
> particular  case.

This really shouldn't be an issue: stable trees are backported from
upstream.  The patch (should) work in upstream, so it should work in
stable.  There are only a few real cases you need to worry about:

   1. Buggy patch in upstream backported to stable. (will be caught and
      the fix backported soon)
   2. Missing precursor causing issues in stable alone.
   3. Bug introduced when hand applying.

The chances of one of these happening are non-zero, but the criteria for
stable should mean it's still better odds than the odds of hitting the
bug it was fixing.

This is the thing: backporting is an expediency process, not a perfect
process.  We are going to get bugs with backports, we just make sure
the backport is for an issue serious enough that on balance we reduce
the bugginess of the stable kernels.

> Maintainers and patch authors being constantly flooded by stable
> queues would never really get back to reviewing it, as it's always
> there, more is  already in flight, and the previous pile is still
> unreviewed.
> 
> If it comes at regular pace though, it's a bit easier to align with
> it and establish for example "friday afternoon stable review 2 hours"
> into maintainer's workflow :)
> 
> But yeah, it's weak and doesn't solve the primary thing, which is
> just the  size of stable itself.

Look, this just isn't going to happen.  As a maintainer I'm going to
see the backport, think it fixed a bug in upstream and stop there. 
That's no better review than you get by insisting that the patch be
upstream first.  Can we accept the upstream review as the actual review
process and move on?

As I said there's a small but non-zero chance of bugs because of the
issues listed above, but I'm not going to spot them even if I performed
a full review for a kernel I forgot about months ago.

James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:56         ` Greg KH
  2018-09-05  9:13           ` Geert Uytterhoeven
@ 2018-09-05 10:11           ` Mark Brown
  2018-09-05 14:44             ` Steven Rostedt
  1 sibling, 1 reply; 74+ messages in thread
From: Mark Brown @ 2018-09-05 10:11 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

On Wed, Sep 05, 2018 at 10:56:42AM +0200, Greg KH wrote:

> 	- We have finally woken some subsystem maintainers up into
> 	  actually properly tagging patches for stable.  We used to have
> 	  a horrid rate of this happening, and it is getting better.
> 	  However, we still have whole major subsystems that _never_ tag
> 	  anything, which is a problem, so things will get larger.

Some of us have been doing this for 5+ years :/

> 	- Sasha's work in finding the patches that maintainers/developer
> 	  fail to tag is paying off really well, which also increases
> 	  the size.

These, and the few other patches that I didn't tag for stable myself, are
the only ones I try to review reliably; the others I already looked at
for stable at least once, and especially where things are automated it's
better to have some manual checking.  It's good that Sasha's stuff is
flagged now, and most other submissions are obvious as well, so that's
fairly easy to do and means that the review burden is relatively light.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  8:16     ` Jan Kara
  2018-09-05  8:32       ` Jiri Kosina
@ 2018-09-05 10:28       ` Thomas Gleixner
  2018-09-05 11:20         ` Jiri Kosina
  1 sibling, 1 reply; 74+ messages in thread
From: Thomas Gleixner @ 2018-09-05 10:28 UTC (permalink / raw)
  To: Jan Kara; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Jan Kara wrote:
> On Wed 05-09-18 08:48:03, Jiri Kosina wrote:
> > If this would be happening at smaller cadence, chances of people (original 
> > patch author, reviewers and maintainer) actually investing brain energy 
> > into evaluating whether particular patch is suitable for particular stable 
> > without introducing backporting regression would be much higher.
> 
> So I agree that with current amount of patches in stable tree, the review
> is cursory at best. However that does not really seem to be related to
> the frequency of stable releases (which is what I believe Laura complains
> about in this thread) but rather to the amount of patches going into
> stable.

Having a fixed schedule is not solving anything IMO. I won't have more time
to review stable patches than I have now.

And to be blunt, I actually do not look at anything which goes into dead
kernels at all. Right now I skim through the 4.14 stable patches, but I
really can't be bothered to look at something like 4.4 or even 3.16.

The whole speculation mess has shown that backporting anything complex to
old kernels is a complete fail.

It's not only the meltdown/spectre mess, which was mostly caused by the
irresponsible secrecy that resulted in different distros getting different
patch sets for the same dead kernel from Intel.

The same problem persisted with L1TF. I've done the L1TF backports for 4.14
and spent quite some time on doing that, but my first attempt of doing so
for 4.9 made me run away screaming.

If you look at the whole picture then you have to take into account that:

   1) The number of changes in Linus tree increases steadily and also the
      complexity of those changes increases. Substantial refactoring of
      subsystems is not an exceptional event. It happens all the time.

      As a consequence backporting becomes more complex as well and the
      farther you go back, the probability of introducing subtle bugs and
      regressions increases.

   2) The test coverage including fuzzers has increased enormously over the
      last couple of years and given the fact that even the increased
      coverage does not catch all regressions and new bugs between -rc1 and
      final, I seriously doubt that having a fixed weekly stable schedule
      will make a substantial difference.

   3) You'll never catch the weird corner cases of the oddball hardware
      people are using before a release. Even if you have that piece of
      hardware in your test rig, the user will trigger issues which you
      never can reproduce.

Aside from that, we have limited resources in upstream review already, so no
matter whether you change the release frequency of stable or not, the
situation won't improve.

I totally agree that we want backports and stable kernels, but I really
have to ask whether backporting all the way back to the beginning of the
universe makes any sense at all. I know that the enterprise folks still
believe that their frankenkernels are valuable and make sense, but given
the shit they rolled out this year, there is enough factual evidence that
this model is broken beyond repair.

We really have to sit down and ask ourselves the question whether
backporting complex changes all the way back is the right thing to do. I
would rather have a very stable and well tested/reviewed 4.14 LTS today
than a gazillion of half-baked LTS variants.

So given the above I think it makes sense to think about a strict rolling
model, limit the LTS support to two years, and even in the event of
something like meltdown/spectre/l1tf think hard about whether backporting
makes sense in the first place or introduces more risks than what it fixes.

From a product perspective, people really have to rethink what they are
doing. This technology is changing and evolving too fast for models which
were invented when semiconductors were guaranteed to be available for 20
years and the change rate in technology was comparable to snail mail. The
illusion that you can support a product for 20 years with software from 20
years ago was destroyed long ago, but people still cling to it at any price.

The whole 'we can't change the version number' QA argument is complete and
utter bullshit, especially given the fact that after 2 years these so-called
stable versions have absolutely nothing to do with the version they are
allegedly based on.

In fact, from a maintainability and also quality POV, lots of effort should
be put into stabilizing an LTS from the day it is selected. So if massive
changes need to be made after a year, then switching over to a well tested
and QA'ed code base in order to avoid backport complexity hell becomes a
no-brainer decision. IOW, in the light of meltdown/spectre all effort
should have been put into getting 4.14 and 4.9 fixed instead of diverting
our very limited capacity to creating monstrosities back to 2.6 variants.

The upstream development model was changed to rolling releases over a
decade ago, after the 2.5 disaster, to accommodate the technology
churn. It has been the right decision. Now we really should make the next
step and switch upstream to a strict rolling two-year LTS model instead of
supporting the kernel necrophilia cult forever.

I surely know that this will hurt the out-of-tree mess of vendors who fail
to get their act together, but I've seen a lot of these out-of-tree kernels
and getting security updates is the least of their worries. In fact many of
them do not even follow the LTS releases in a timely and responsible way.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  9:58         ` James Bottomley
@ 2018-09-05 10:47           ` Mark Brown
  2018-09-05 12:24             ` James Bottomley
  0 siblings, 1 reply; 74+ messages in thread
From: Mark Brown @ 2018-09-05 10:47 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:

> This really shouldn't be an issue: stable trees are backported from
> upstream.  The patch (should) work in upstream, so it should work in
> stable.  There are only a few real cases you need to worry about:

>    1. Buggy patch in upstream backported to stable. (will be caught and
>       the fix backported soon)
>    2. Missing precursor causing issues in stable alone.
>    3. Bug introduced when hand applying.

> The chances of one of these happening is non-zero, but the criteria for
> stable should mean its still better odds than the odds of hitting the
> bug it was fixing.

Some of those are substantial enough to be worth worrying about,
especially the missing precursor issues.  It's rarely an issue with the
human-generated backports, but the automated ones don't have a sense of
context in the selection.

There's also a risk/reward tradeoff to consider with more minor issues,
especially performance-related ones.  We want people to be enthusiastic
about taking stable updates, and every time they find a problem with a
backport, that works against them doing so.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 10:28       ` Thomas Gleixner
@ 2018-09-05 11:20         ` Jiri Kosina
  2018-09-05 14:41           ` Thomas Gleixner
  0 siblings, 1 reply; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05 11:20 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Thomas Gleixner wrote:

> I totally agree that we want backports and stable kernels, but I really
> have to ask whether backporting all the way back to the begin of the
> universe makes any sense at all. I know that the enterprise folks still
> believe that their frankenkernels are valuable and make sense, but given
> the shit they rolled out this year, there is enough factual evidence that
> this model is broken beyond repair.

I don't think any enterprise distro vendor is asking for stable LTS for 
super-historical kernels. Major RHEL is (afaik) 2.6.32 and 3.10-based, 
major SLE is 3.0, 4.4 and 4.12 based.

So there is one intersection there, and that's 4.4.

Supporting such old monsters is a business decision that was made by said 
vendors, so it's perfectly fine they (actually "we" :) ) are suffering on 
their (our) own.

If enterprise vendors were one day able to create a working business
relationship with partners and customers around 'rolling' kernel versions
in enterprise distributions, that'd of course be awesome.

We're not there yet, but things are definitely changing on this front as 
well. For example we (as in "SUSE") are now more pro-active about updating
the kernel version between enterprise distro service packs than we've
historically been. It can be seen as one of the steps towards more
'rolling' flexibility, but it's sometimes a rather hard sell to the 
enterprise.

> IOW, in the light of meltdown/spectre all effort should have been put 
> into getting 4.14 and 4.9 fixed instead of diverting our very limited 
> capcity to create monstrosities back to 2.6 variants.

I agree that it'd be an ideal world, but it's guaranteed that if we just
told the people running some of our 2.6 kernels under a very special
contract that they have to all of a sudden move to 4.14, we'd just
immediately lose that contract (and someone else would immediately plug the
hole in the market and create a perhaps even worse backport for them), and
for various reasons we don't want that to happen :)

Such contracts are usually set up in a way that only very specific fixes 
can be requested for said kernels. We've historically put our bets on the 
fact that we'll be able to provide those rare fixes even for 2.6, and it 
worked well.
Now we're paying for that a bit, of course (because spectre/meltdown
certainly qualifies), but upstream can completely and happily ignore that.

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 10:47           ` Mark Brown
@ 2018-09-05 12:24             ` James Bottomley
  2018-09-05 12:53               ` Jiri Kosina
                                 ` (2 more replies)
  0 siblings, 3 replies; 74+ messages in thread
From: James Bottomley @ 2018-09-05 12:24 UTC (permalink / raw)
  To: Mark Brown; +Cc: Greg KH, ksummit-discuss

On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>
>> This really shouldn't be an issue: stable trees are backported from
>> upstream.  The patch (should) work in upstream, so it should work in
>> stable.  There are only a few real cases you need to worry about:
>
>>    1. Buggy patch in upstream backported to stable. (will be caught
>and
>>       the fix backported soon)
>>    2. Missing precursor causing issues in stable alone.
>>    3. Bug introduced when hand applying.
>
>> The chances of one of these happening is non-zero, but the criteria
>for
>> stable should mean its still better odds than the odds of hitting the
>> bug it was fixing.
>
>Some of those are substantial enough to be worth worrying about,
>especially the missing precursor issues.  It's rarely an issue with the
>human generated backports but the automated ones don't have a sense of
>context in the selection.
>
>There's also a risk/reward tradeoff to consider with more minor issues,
>especially performance related ones.  We want people to be enthusiastic
>about taking stable updates and every time they find a problem with a
>backport that works against them doing that.

I absolutely agree.  That's why I said our process is expediency based:
you have to trade off the value of applying the patch vs the probability
of introducing bugs.  However the maintainers are mostly considering this
which is why stable is largely free from trivial but pointless patches.
The rule should be: if it doesn't fix a user visible bug, it doesn't go
into stable.

James

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 12:24             ` James Bottomley
@ 2018-09-05 12:53               ` Jiri Kosina
  2018-09-05 13:05                 ` Greg KH
  2018-09-05 16:39                 ` James Bottomley
  2018-09-05 13:03               ` Takashi Iwai
  2018-09-05 13:16               ` Mark Brown
  2 siblings, 2 replies; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05 12:53 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, James Bottomley wrote:

> The rule should be: if it doesn't fix a user visible bug, it doesn't go 
> into stable.

So I just looked at the latest (and newest) stable 4.18.5. It contains 22
patches:

	$ grep "commit [a-f0-9]\+ upstream" ChangeLog-4.18.5
	    commit a13f085d111e90469faf2d9965eb39b11c114d7e upstream.
	    commit bed4ff1ed4d8f2ef5007c5c6ae1b29c5677a3632 upstream.
	    commit c463a158cb6c5d9a85b7d894cd4f8116e8bd6be0 upstream.
	    commit 1204e35bedf4e5015cda559ed8c84789a6dae24e upstream.
	    commit 281e878eab191cce4259abbbf1a0322e3adae02c upstream.
	    commit 3dbe97efe8bf450b183d6dee2305cbc032e6b8a4 upstream.
	    commit 91a2968e245d6ba616db37001fa1a043078b1a65 upstream.
	    commit 4ce6435820d1f1cc2c2788e232735eb244bcc8a3 upstream.
	    commit 9d64b539b738fc181442caab95f1f76d9bd58539 upstream.
	    commit d3252ace0bc652a1a244455556b6a549f969bf99 upstream.
	    commit 7797167ffde1f00446301cb22b37b7c03194cfaf upstream.
	    commit 3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8 upstream.
	    commit ddf74e79a54070f277ae520722d3bab7f7a6c67a upstream.
	    commit de5372da605d3bca46e3102bab51b7e1c0e0a6f6 upstream.
	    commit 1a5d5e5d51e75a5bca67dadbcea8c841934b7b85 upstream.
	    commit 6d44acae1937b81cf8115ada8958e04f601f3f2e upstream.
	    commit c40a56a7818cfe735fc93a69e1875f8bba834483 upstream.
	    commit 6ea2738e0ca0e626c75202fb051c1e88d7a950fa upstream.
	    commit 9f515cdb411ef34f1aaf4c40bb0c932cf6db5de1 upstream.
	    commit 0d83432811f26871295a9bc24d3c387924da6071 upstream.
	    commit 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 upstream.
	    commit b748f2de4b2f578599f46c6000683a8da755bf68 upstream.

Just randomly scrolling through those, I am wondering how at least

	7797167ffde1f00446301cb22b37b7c03194cfaf
	3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8

made it past any stable tree acceptance criteria.

They are memory ordering changes (so exactly the kind of area which is
generally fragile by itself, and where the risk of regressions simply
can't be completely ignored), yet they fix absolutely no functional issue.

In addition to that, they have existed upstream for only one single -rc,
so the public testing exposure is also currently minimal.

Yeah, I know, I know, those are parisc, so no one cares anyway :P but
that's really just the first randomly chosen kernel, with a small number
of patches, and still 10% of them are something we'd not want to put into
an enterprise distro kernel without a lot of justification and regression
testing.
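
(As an aside, a minimal sketch, assuming a checkout where both the v4.18.4
and v4.18.5 tags are present, of how to see at a glance which subsystems a
stable release touches:)

	$ git log --format= --name-only v4.18.4..v4.18.5 | \
	  awk -F/ 'NF {n[$1]++} END {for (d in n) print n[d], d}' | sort -rn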

Thanks,

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 12:24             ` James Bottomley
  2018-09-05 12:53               ` Jiri Kosina
@ 2018-09-05 13:03               ` Takashi Iwai
  2018-09-05 13:27                 ` Daniel Vetter
  2018-09-05 14:20                 ` Sasha Levin
  2018-09-05 13:16               ` Mark Brown
  2 siblings, 2 replies; 74+ messages in thread
From: Takashi Iwai @ 2018-09-05 13:03 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg KH, ksummit-discuss

On Wed, 05 Sep 2018 14:24:18 +0200,
James Bottomley wrote:
> 
> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
> >
> >> This really shouldn't be an issue: stable trees are backported from
> >> upstream.  The patch (should) work in upstream, so it should work in
> >> stable.  There are only a few real cases you need to worry about:
> >
> >>    1. Buggy patch in upstream backported to stable. (will be caught
> >and
> >>       the fix backported soon)
> >>    2. Missing precursor causing issues in stable alone.
> >>    3. Bug introduced when hand applying.
> >
> >> The chances of one of these happening is non-zero, but the criteria
> >for
> >> stable should mean its still better odds than the odds of hitting the
> >> bug it was fixing.
> >
> >Some of those are substantial enough to be worth worrying about,
> >especially the missing precursor issues.  It's rarely an issue with the
> >human generated backports but the automated ones don't have a sense of
> >context in the selection.
> >
> >There's also a risk/reward tradeoff to consider with more minor issues,
> >especially performance related ones.  We want people to be enthusiastic
> >about taking stable updates and every time they find a problem with a
> >backport that works against them doing that.
> 
> I absolutely agree.  That's why I said our process is expediency
> based:  you have to trade off the value of applying the patch vs the
> probability of introducing bugs.  However the maintainers are mostly
> considering this which is why stable is largely free from trivial
> but pointless patches.  The rule should be: if it doesn't fix a user
> visible bug, it doesn't go into stable.

Right, and here the current AUTOSEL (and some other not-stable-marked)
patches come into a gray zone.  The picked-up patches are often right
as "some" fixes, but they don't necessarily qualify as "stable fixes".

How about allowing the AUTOSEL choice to be either opt-in or opt-out,
depending on the tree?  In my case, the patches caught by AUTOSEL usually
aren't patches with a forgotten stable marker, but rather ones left out
intentionally for various reasons.  Most of them are fine to apply anyway,
but it was uncertain whether they are really needed / qualify as stable
fixes.  So I'd be happy to see them as opt-in, i.e. applied only via
manual approval.

Meanwhile, some trees have no stable maintenance, and AUTOSEL would help
them.  Those can be opt-out, i.e. patches are kept unless someone rejects
them.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 12:53               ` Jiri Kosina
@ 2018-09-05 13:05                 ` Greg KH
  2018-09-05 13:15                   ` Jiri Kosina
  2018-09-05 16:39                 ` James Bottomley
  1 sibling, 1 reply; 74+ messages in thread
From: Greg KH @ 2018-09-05 13:05 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: James Bottomley, ksummit-discuss

On Wed, Sep 05, 2018 at 02:53:34PM +0200, Jiri Kosina wrote:
> Just randomly scrolling through those, I am wondering how at least
> 
> 	7797167ffde1f00446301cb22b37b7c03194cfaf
> 	3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8
> 
> made it past any stable tree acceptance criteria.
> 
> They are memory ordering changes (so exactly area which is generally 
> fragile by itself and the risk of regressions simply can't be completely 
> ignored), yet they fix absolutely no functional issue.
> 
> In addition to that, they all exist upstream only for one single -rc, so 
> the public testing exposure is also currently minimal.
> 
> Yeah, I know I know, those are parisc, so noone cares anyway :P but that's 
> really just the first randomly chosen kernel, with a small number of 
> patches, and still 10% of them are something we'd not want to put into an 
> enterprise distro kernel without a lot of justification and regression 
> testing.

For these specific ones, I trusted that the maintainer of the subsystem
knew what they were doing when they marked them for the stable tree.

Which is what we do in kernel development: we trust others that their
stewardship of their code subsystems is in the best interest of their
users.  To not do so would be to force me to know much more about _ALL_
parts of the kernel, and that will just not happen to anyone.

And yes, you can probably always find one-off patches that you might not
feel comfortable with, but as they are in Linus's tree, you better feel
comfortable with them when you update to the next major version :)

There's a cool script floating around somewhere that shows you, when you
merge a stable kernel release into your tree, exactly which files and
commits actually affect you based on your configuration for that
specific kernel tree.  It's used in some Android devices and it lets
people feel much more comfortable about taking stable release updates
because they realize two major things:

	- stuff like parisc changes has no effect on them at all so they
	  get comfortable with "large numbers" of patches in stable
	  releases.
	- they get bugfixes for their platform that they didn't realize
	  they needed just yet.

Maybe you should start using it for your kernels so that parisc patches
don't bother you :)
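
(I don't have that script at hand, so purely as a sketch of the idea,
assuming a stable release was just merged and the tree was built with your
.config in place, i.e. a .o sits next to every compiled .c file, something
like this lists the commits that touched code you actually build:)

	for c in $(git rev-list v4.18.4..v4.18.5); do
		for f in $(git diff-tree --no-commit-id --name-only -r "$c"); do
			case "$f" in
			*.c) [ -e "${f%.c}.o" ] && echo "$c touches $f" ;;
			esac
		done
	done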

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:05                 ` Greg KH
@ 2018-09-05 13:15                   ` Jiri Kosina
  2018-09-05 14:00                     ` Greg KH
  2018-09-05 14:06                     ` Sasha Levin
  0 siblings, 2 replies; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05 13:15 UTC (permalink / raw)
  To: Greg KH; +Cc: James Bottomley, ksummit-discuss

On Wed, 5 Sep 2018, Greg KH wrote:

> For these specific ones, I trusted that the maintainer of the subsystem
> knew what they were doing when they marked them for the stable tree.

And do you honestly think they should be marked for the stable tree in the
first place?

> Which is what we do in kernel development, we trust others that their 
> stewardship of their code subsystems is in the best interest of their 
> users.  

Sure, I wholeheartedly agree. For Linus' tree, all the web of trust is 
there so that changes can be propagated up the maintainership structure, 
and we trust the maintainers and developers that they did all the 
development and testing as well as they possibly could, and that eventual 
bugs in the code will be responsibly fixed.

For stable, there is another aspect that needs to be trusted -- that the 
relevance for stable has been properly considered, so that we ideally 
avoid the need for "eventual bugs will be fixed" much more pro-actively 
than in Linus' tree (that's "stable", right?).

And I think we simply could improve there (well, again, this all very much 
depends on the target audience I guess).

*Especially* with the automatic selection thing -- who exactly is the 
entity you trust there?

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 12:24             ` James Bottomley
  2018-09-05 12:53               ` Jiri Kosina
  2018-09-05 13:03               ` Takashi Iwai
@ 2018-09-05 13:16               ` Mark Brown
  2018-09-05 14:27                 ` Sasha Levin
  2 siblings, 1 reply; 74+ messages in thread
From: Mark Brown @ 2018-09-05 13:16 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 01:24:18PM +0100, James Bottomley wrote:

> I absolutely agree.  That's why I said our process is expediency
> based:  you have to trade off the value of applying the patch vs the
> probability of introducing bugs.  However the maintainers are mostly
> considering this which is why stable is largely free from trivial but
> pointless patches.  The rule should be: if it doesn't fix a user
> visible bug, it doesn't go into stable.

It's not just maintainers any more - in particular we've got Sasha's
neural net thing picking patches as well, and it's substantially more
trigger-happy than at least I am.  People do get a chance to review what
it's picking, but that's different from maintainers picking things.


^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:03               ` Takashi Iwai
@ 2018-09-05 13:27                 ` Daniel Vetter
  2018-09-05 14:05                   ` Greg KH
  2018-09-05 14:20                 ` Sasha Levin
  1 sibling, 1 reply; 74+ messages in thread
From: Daniel Vetter @ 2018-09-05 13:27 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
> On Wed, 05 Sep 2018 14:24:18 +0200,
> James Bottomley wrote:
>>
>> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>> >
>> >> This really shouldn't be an issue: stable trees are backported from
>> >> upstream.  The patch (should) work in upstream, so it should work in
>> >> stable.  There are only a few real cases you need to worry about:
>> >
>> >>    1. Buggy patch in upstream backported to stable. (will be caught
>> >and
>> >>       the fix backported soon)
>> >>    2. Missing precursor causing issues in stable alone.
>> >>    3. Bug introduced when hand applying.
>> >
>> >> The chances of one of these happening is non-zero, but the criteria
>> >for
>> >> stable should mean its still better odds than the odds of hitting the
>> >> bug it was fixing.
>> >
>> >Some of those are substantial enough to be worth worrying about,
>> >especially the missing precursor issues.  It's rarely an issue with the
>> >human generated backports but the automated ones don't have a sense of
>> >context in the selection.
>> >
>> >There's also a risk/reward tradeoff to consider with more minor issues,
>> >especially performance related ones.  We want people to be enthusiastic
>> >about taking stable updates and every time they find a problem with a
>> >backport that works against them doing that.
>>
>> I absolutely agree.  That's why I said our process is expediency
>> based:  you have to trade off the value of applying the patch vs the
>> probability of introducing bugs.  However the maintainers are mostly
>> considering this which is why stable is largely free from trivial
>> but pointless patches.  The rule should be: if it doesn't fix a user
>> visible bug, it doesn't go into stable.
>
> Right, and here the current AUTOSEL (and some other not-stable-marked)
> patches coming to a gray zone.  The picked-up patches are often right
> as "some" fixes, but they are not necessarily qualified as "stable
> fixes".
>
> How about allowing to change the choice of AUTOSEL to be opt-in and
> opt-out, depending on the tree?  In my case, usually the patches
> caught by AUTOSEL aren't really the patches with forgotten stable
> marker, but rather left intentionally by various reasons.  Most of
> them are fine to apply in anyway, but it was uncertain whether they
> are really needed / qualifying as stable fixes.  So, I'd be happy to
> see them as opt-in, i.e. applied only via manual approval.
>
> Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
> help for them.  They can be opt-out, i.e. kept until someone rejects.

+1 on AUTOSEL opt-in. It's annoying at best when it backports cleanup
patches (because somehow those look like stealthy security fixes
sometimes) and breaks a bunch of people's boxes for no good reason.

In general it'd be really good if -stable had a clearer audit path.
If every patch had a recorded reason why it's being applied (e.g. Cc:
stable in upstream, a Link to the lkml thread/bug report, the AUTOSEL
mail, whatever), so that after the fact I can figure out why a -stable
patch happened, that would be really great. Atm -stable occasionally
blows up with a patch we didn't mark as cc: stable, and we have no idea
why it showed up in -stable at all. That makes it really hard to do
better next time around.
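
(As a rough sketch of the kind of lookup that is already possible with the
"commit <sha> upstream." convention in stable commit messages, assuming
Linus' tree is also fetched locally; <stable-commit> is a placeholder for
whichever stable-tree commit you are wondering about:)

	up=$(git show -s --format=%b <stable-commit> | \
	     sed -n 's/^ *commit \([0-9a-f]\{40\}\) upstream\.$/\1/p')
	git show -s --format=%b "$up" | grep -iE '^(Cc: *stable|Fixes:|Link:)'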

Thanks, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:15                   ` Jiri Kosina
@ 2018-09-05 14:00                     ` Greg KH
  2018-09-05 14:06                     ` Sasha Levin
  1 sibling, 0 replies; 74+ messages in thread
From: Greg KH @ 2018-09-05 14:00 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: James Bottomley, ksummit-discuss

On Wed, Sep 05, 2018 at 03:15:25PM +0200, Jiri Kosina wrote:
> On Wed, 5 Sep 2018, Greg KH wrote:
> 
> > For these specific ones, I trusted that the maintainer of the subsystem
> > knew what they were doing when they marked them for the stable tree.
> 
> And do you honestly think they should be marked for stable tree in the 
> first place?

For these, they passed my smell test; it seemed a simple win for a major
performance increase.  I would have pushed back if I didn't think so.

Now arguably, maybe I don't push back hard enough, but I do complain
when I see things marked for stable that are not "obvious".

As I have said every year when this same comment comes up, if you, or
anyone else, wants to help and review and push back on patches that have
been tagged like this, please do so, I can ALWAYS use the help.

> > Which is what we do in kernel development, we trust others that their 
> > stewardship of their code subsystems is in the best interest of their 
> > users.  
> 
> Sure, I wholeheartedly agree. For Linus' tree, all the web of trust is 
> there so that changes can be propagated up the maintainership structure, 
> and we trust the maintainers and developers that they did all the 
> development and testing as well as they possibly could, and that eventual 
> bugs in the code will be responsibly fixed.
> 
> For stable, there is another aspect that needs to be trusted -- that the 
> relevance for stable has been properly considered, so that we ideally 
> avoid the need for "eventual bugs will be fixed" much more pro-actively 
> than in Linus' tree (that's "stable", right?).
> 
> And I think we simply could improve there (well, again, this all very much 
> depends on the target audience I guess).
> 
> *Especially* with the automatic selection thing -- who exactly is the 
> entity you trust there?

I trust Sasha for doing the first cut at scanning the output and picking
the patches that "look correct".  I have seen previous outputs of the
"raw" tool, and they still need human judgement, which he uses.

I also then review them all myself, and sometimes do find things that
should not be merged, and I say so and drop them.

And of course, having the maintainers/developers of the patches asked if
they should be applied also helps; I rely on them to say "NO!", which,
again, also happens.

So that's three levels of "entities" that I trust; is that not
sufficient?  If not, what would make you "feel better" about it?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:27                 ` Daniel Vetter
@ 2018-09-05 14:05                   ` Greg KH
  2018-09-05 15:54                     ` Daniel Vetter
  0 siblings, 1 reply; 74+ messages in thread
From: Greg KH @ 2018-09-05 14:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: James Bottomley, ksummit-discuss

On Wed, Sep 05, 2018 at 03:27:58PM +0200, Daniel Vetter wrote:
> On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
> > On Wed, 05 Sep 2018 14:24:18 +0200,
> > James Bottomley wrote:
> >>
> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
> >> >
> >> >> This really shouldn't be an issue: stable trees are backported from
> >> >> upstream.  The patch (should) work in upstream, so it should work in
> >> >> stable.  There are only a few real cases you need to worry about:
> >> >
> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
> >> >and
> >> >>       the fix backported soon)
> >> >>    2. Missing precursor causing issues in stable alone.
> >> >>    3. Bug introduced when hand applying.
> >> >
> >> >> The chances of one of these happening is non-zero, but the criteria
> >> >for
> >> >> stable should mean its still better odds than the odds of hitting the
> >> >> bug it was fixing.
> >> >
> >> >Some of those are substantial enough to be worth worrying about,
> >> >especially the missing precursor issues.  It's rarely an issue with the
> >> >human generated backports but the automated ones don't have a sense of
> >> >context in the selection.
> >> >
> >> >There's also a risk/reward tradeoff to consider with more minor issues,
> >> >especially performance related ones.  We want people to be enthusiastic
> >> >about taking stable updates and every time they find a problem with a
> >> >backport that works against them doing that.
> >>
> >> I absolutely agree.  That's why I said our process is expediency
> >> based:  you have to trade off the value of applying the patch vs the
> >> probability of introducing bugs.  However the maintainers are mostly
> >> considering this which is why stable is largely free from trivial
> >> but pointless patches.  The rule should be: if it doesn't fix a user
> >> visible bug, it doesn't go into stable.
> >
> > Right, and here the current AUTOSEL (and some other not-stable-marked)
> > patches coming to a gray zone.  The picked-up patches are often right
> > as "some" fixes, but they are not necessarily qualified as "stable
> > fixes".
> >
> > How about allowing to change the choice of AUTOSEL to be opt-in and
> > opt-out, depending on the tree?  In my case, usually the patches
> > caught by AUTOSEL aren't really the patches with forgotten stable
> > marker, but rather left intentionally by various reasons.  Most of
> > them are fine to apply in anyway, but it was uncertain whether they
> > are really needed / qualifying as stable fixes.  So, I'd be happy to
> > see them as opt-in, i.e. applied only via manual approval.
> >
> > Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
> > help for them.  They can be opt-out, i.e. kept until someone rejects.
> 
> +1 on AUTOSEL opt-in. It's annyoing at best, when it backports cleanup
> patches (because somehow those look like stealthy security fixes
> sometimes) and breaks a bunch of people's boxes for no good reason.
> 
> In general it'd be really good if -stable had a clearer audit path.
> Every patch have a recorded reason why it's being applied (e.g. Cc:
> stable in upstream, Link to the lkml thread/bug report, AUTOSEL mail,
> whatever), so that after the fact I can figure out why a -stable patch
> happend, that would be really great. Atm -stable occasionally blows
> up, with a patch we didn't mark as cc: stable, and we have no idea
> whyiit showed up in -stable even. That makes it really hard to do
> better next time around.

I try to keep the audit trail here, as I get asked all the time why
stuff got added.

Here's what I do, it's not exactly obvious, sorry:
	- if it came from a stable@ tag, just leave it alone and add my
	  signed-off-by
	- if it was manually requested by someone, I add a "cc:
	  requestor" to the signed-off-by area and add my s-o-b
	- if it came from Sasha's tree, Sasha's s-o-b is on it
	- if it came from David Miller's patchset, his s-o-b is on it.

That should cover all types of patches currently going into the trees,
right?

So you can always cc: everyone in the s-o-b area and reach both the people
involved in the patch and someone involved in reviewing it for stable
inclusion.
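
(As a sketch, pulling that set of addresses out of any given stable-tree
commit is a one-liner; <commit> below is a placeholder:)

	$ git show -s --format=%b <commit> | grep -iE '^(Signed-off-by|Cc):'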

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:15                   ` Jiri Kosina
  2018-09-05 14:00                     ` Greg KH
@ 2018-09-05 14:06                     ` Sasha Levin
  2018-09-05 21:02                       ` Jiri Kosina
  1 sibling, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 14:06 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 03:15:25PM +0200, Jiri Kosina wrote:
>On Wed, 5 Sep 2018, Greg KH wrote:
>
>> For these specific ones, I trusted that the maintainer of the subsystem
>> knew what they were doing when they marked them for the stable tree.
>
>And do you honestly think they should be marked for stable tree in the
>first place?

If you can't trust a maintainer's judgement about his very own subsystem
then you're shit out of luck. In this scenario Greg's opinion weighs
less (IMO) than the maintainer's, so if Greg was asked to include them
then there had better be a solid reason to challenge that request.

>> Which is what we do in kernel development, we trust others that their
>> stewardship of their code subsystems is in the best interest of their
>> users.
>
>Sure, I wholeheartedly agree. For Linus' tree, all the web of trust is
>there so that changes can be propagated up the maintainership structure,
>and we trust the maintainers and developers that they did all the
>development and testing as well as they possibly could, and that eventual
>bugs in the code will be responsibly fixed.
>
>For stable, there is another aspect that needs to be trusted -- that the
>relevance for stable has been properly considered, so that we ideally
>avoid the need for "eventual bugs will be fixed" much more pro-actively
>than in Linus' tree (that's "stable", right?).
>
>And I think we simply could improve there (well, again, this all very much
>depends on the target audience I guess).
>
>*Especially* with the automatic selection thing -- who exactly is the
>entity you trust there?

Me!

Greg can (and does) criticize/yell/flame me and I will address his and
any other reviewer's concerns. I go through every patch that gets
selected by the engine and sent upstream.


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:03               ` Takashi Iwai
  2018-09-05 13:27                 ` Daniel Vetter
@ 2018-09-05 14:20                 ` Sasha Levin
  2018-09-05 14:30                   ` Takashi Iwai
  1 sibling, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 14:20 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 03:03:13PM +0200, Takashi Iwai wrote:
>On Wed, 05 Sep 2018 14:24:18 +0200,
>James Bottomley wrote:
>>
>> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>> >
>> >> This really shouldn't be an issue: stable trees are backported from
>> >> upstream.  The patch (should) work in upstream, so it should work in
>> >> stable.  There are only a few real cases you need to worry about:
>> >
>> >>    1. Buggy patch in upstream backported to stable. (will be caught
>> >and
>> >>       the fix backported soon)
>> >>    2. Missing precursor causing issues in stable alone.
>> >>    3. Bug introduced when hand applying.
>> >
>> >> The chances of one of these happening is non-zero, but the criteria
>> >for
>> >> stable should mean its still better odds than the odds of hitting the
>> >> bug it was fixing.
>> >
>> >Some of those are substantial enough to be worth worrying about,
>> >especially the missing precursor issues.  It's rarely an issue with the
>> >human generated backports but the automated ones don't have a sense of
>> >context in the selection.
>> >
>> >There's also a risk/reward tradeoff to consider with more minor issues,
>> >especially performance related ones.  We want people to be enthusiastic
>> >about taking stable updates and every time they find a problem with a
>> >backport that works against them doing that.
>>
>> I absolutely agree.  That's why I said our process is expediency
>> based:  you have to trade off the value of applying the patch vs the
>> probability of introducing bugs.  However the maintainers are mostly
>> considering this which is why stable is largely free from trivial
>> but pointless patches.  The rule should be: if it doesn't fix a user
>> visible bug, it doesn't go into stable.
>
>Right, and here the current AUTOSEL (and some other not-stable-marked)
>patches coming to a gray zone.  The picked-up patches are often right
>as "some" fixes, but they are not necessarily qualified as "stable
>fixes".
>
>How about allowing to change the choice of AUTOSEL to be opt-in and
>opt-out, depending on the tree?  In my case, usually the patches
>caught by AUTOSEL aren't really the patches with forgotten stable
>marker, but rather left intentionally by various reasons.  Most of
>them are fine to apply in anyway, but it was uncertain whether they
>are really needed / qualifying as stable fixes.  So, I'd be happy to
>see them as opt-in, i.e. applied only via manual approval.

So right now you can opt-out your tree if you'd like. I'm not trying to
force it on any particular maintainer. If you'd like to ack each patch I
send before it goes in a tree this is something we can definitely do.

FWIW, it looks like your tree is in a very good shape compared to most
other trees I encounter, so I end up sending fewer proposed stable
commits your way.

I tried picking a random commit that went through my selection process
and chose https://lore.kernel.org/patchwork/patch/909923/ . Is this the type
of patch that should not belong in stable?


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 13:16               ` Mark Brown
@ 2018-09-05 14:27                 ` Sasha Levin
  2018-09-05 14:50                   ` Mark Brown
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 14:27 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 02:16:43PM +0100, Mark Brown wrote:
>On Wed, Sep 05, 2018 at 01:24:18PM +0100, James Bottomley wrote:
>
>> I absolutely agree.  That's why I said our process is expediency
>> based:  you have to trade off the value of applying the patch vs the
>> probability of introducing bugs.  However the maintainers are mostly
>> considering this which is why stable is largely free from trivial but
>> pointless patches.  The rule should be: if it doesn't fix a user
>> visible bug, it doesn't go into stable.
>
>It's not just maintainers any more - in particular we've got Sasha's
>neural net thing picking patches as well and it's substantially more
>trigger happy than at least I am.  People do get a chance to review what
>it's picking but that's different to maintainers picking things.

What can I do to make the process better?

I tried giving longer review periods, I tried sending emails right when
the patch is merged upstream instead of weeks later and I tried actively
pursuing some maintainers for explicit Acks. None of which seemed to
make anyone happier.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:20                 ` Sasha Levin
@ 2018-09-05 14:30                   ` Takashi Iwai
  2018-09-05 14:41                     ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Takashi Iwai @ 2018-09-05 14:30 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, 05 Sep 2018 16:20:40 +0200,
Sasha Levin wrote:
> 
> On Wed, Sep 05, 2018 at 03:03:13PM +0200, Takashi Iwai wrote:
> >On Wed, 05 Sep 2018 14:24:18 +0200,
> >James Bottomley wrote:
> >>
> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
> >> >
> >> >> This really shouldn't be an issue: stable trees are backported from
> >> >> upstream.  The patch (should) work in upstream, so it should work in
> >> >> stable.  There are only a few real cases you need to worry about:
> >> >
> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
> >> >and
> >> >>       the fix backported soon)
> >> >>    2. Missing precursor causing issues in stable alone.
> >> >>    3. Bug introduced when hand applying.
> >> >
> >> >> The chances of one of these happening is non-zero, but the criteria
> >> >for
> >> >> stable should mean its still better odds than the odds of hitting the
> >> >> bug it was fixing.
> >> >
> >> >Some of those are substantial enough to be worth worrying about,
> >> >especially the missing precursor issues.  It's rarely an issue with the
> >> >human generated backports but the automated ones don't have a sense of
> >> >context in the selection.
> >> >
> >> >There's also a risk/reward tradeoff to consider with more minor issues,
> >> >especially performance related ones.  We want people to be enthusiastic
> >> >about taking stable updates and every time they find a problem with a
> >> >backport that works against them doing that.
> >>
> >> I absolutely agree.  That's why I said our process is expediency
> >> based:  you have to trade off the value of applying the patch vs the
> >> probability of introducing bugs.  However the maintainers are mostly
> >> considering this which is why stable is largely free from trivial
> >> but pointless patches.  The rule should be: if it doesn't fix a user
> >> visible bug, it doesn't go into stable.
> >
> >Right, and here the current AUTOSEL (and some other not-stable-marked)
> >patches coming to a gray zone.  The picked-up patches are often right
> >as "some" fixes, but they are not necessarily qualified as "stable
> >fixes".
> >
> >How about allowing to change the choice of AUTOSEL to be opt-in and
> >opt-out, depending on the tree?  In my case, usually the patches
> >caught by AUTOSEL aren't really the patches with forgotten stable
> >marker, but rather left intentionally by various reasons.  Most of
> >them are fine to apply in anyway, but it was uncertain whether they
> >are really needed / qualifying as stable fixes.  So, I'd be happy to
> >see them as opt-in, i.e. applied only via manual approval.
> 
> So right now you can opt-out your tree if you'd like. I'm not trying to
> force it on any particular maintainer. If you'd like to ack each patch I
> send before it goes in a tree this is something we can definitely do.

Yeah, that would help in my case.

Particularly, I'd like to have an option to defer the patch merge.
For example...

> FWIW, it looks like your tree is in a very good shape compared to most
> other trees I encounter, so I end up sending fewer proposed stable
> commits your way.
> 
> I tried picking a random commit that went through my selection process
> and chose https://lore.kernel.org/patchwork/patch/909923/ . Is this type
> of patch that should not belong in stable?

... this is an example I'd hold for a while until a bit more testing
has been done after the release of Linus' tree.  This is clearly a fix,
but it's not a regression fix as such; it just catches some logically
possible error case.  Hence there hasn't been any test coverage or
explicit unit testing.  So this kind of change might have a slightly
higher risk of regression than an obvious fix (which usually comes with
a cc-to-stable).

Note that this particular patch might have been picked up late
enough, but you get the idea.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:12 ` Jiri Kosina
@ 2018-09-05 14:31   ` Greg KH
  0 siblings, 0 replies; 74+ messages in thread
From: Greg KH @ 2018-09-05 14:31 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Tue, Sep 04, 2018 at 11:12:38PM +0200, Jiri Kosina wrote:
> On Tue, 4 Sep 2018, Laura Abbott wrote:
> 
> > I also realize Greg is probably reading this with a very skeptical look 
> > on his face so I'd be interested to hear from other distro maintainers 
> > as well.
> 
> As a SUSE distro kernel maintainer, I'd really like to participate if any 
> such discussion is happening.
> 
> Namely:
> 
> - we're having a lot of internal discussions about how to adjust our 
>   processess to the changes happening in -stable tree process and patch 
>   acceptance criteria
> 
> - it's becoming more and more apparent (and even Greg stated it in the few 
>   months old thread Sasha referred to) that stable tree is not really 
>   intended for distros in the first place; it might be useful to have this 
>   clarified a bit more.
> 
>   Namely: it's sort of evident that most of the major distros are running 
>   their own variation of the stable tree. Would it be beneficial to 
>   somehow close the feedback loop back from the distros to the stable 
>   tree? Or is total disconnect between the two worlds inevitable and 
>   desired?

I don't recall ever saying that the stable tree is "not for distros",
given that I know many distros rely on it (Arch, Fedora, openSUSE,
Gentoo, CoreOS, Debian, Android, etc.)  Perhaps I said "not for an
'enterprise' distro", given the 'constraints' that you all put on your
trees that I do not have?

I strongly recommend any user use a distro tree over the stable tree,
but if you are a distro, you need to pick what you want to base on
depending on your requirements (new hardware, stable internal ABI, SoC
madness, etc.)

I do take "this is not working for us" complaints from distros really
seriously, but so far all I hear from SuSE is the usual "you are taking
too many patches" which we have already covered elsewhere in this thread :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 11:20         ` Jiri Kosina
@ 2018-09-05 14:41           ` Thomas Gleixner
  2018-09-05 15:18             ` Steven Rostedt
  0 siblings, 1 reply; 74+ messages in thread
From: Thomas Gleixner @ 2018-09-05 14:41 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Jiri Kosina wrote:
> On Wed, 5 Sep 2018, Thomas Gleixner wrote:
> If enterprise vendors would be able to create a working business 
> relationship with partners and customers around 'rolling' kernel versions 
> in enterprise distributions one day, that'd of course be awesome.

It would be a good thing if _all_ of them started to think about it
seriously, and even more so if they agreed on and pushed that model
together.

> > IOW, in the light of meltdown/spectre all effort should have been put 
> > into getting 4.14 and 4.9 fixed instead of diverting our very limited 
> > capcity to create monstrosities back to 2.6 variants.
> 
> I agree that it'd be an ideal world, but it's guaranteed that if we just 
> say to the people running some of our 2.6 kernel under a very special 
> contract that they have to all of a sudden move to 4.14, we'll just 
> immediately lose that contract (and someone else will immediately plug the 
> hole on the market and create perhaps even worse backport for them), and 
> for various reasons we don't want that to happen :)

Yeah, I've heard that song over and over. Of course you can't undo the
mistakes of the past, but the shades of meltdown & co. should give all
vendors enough ammunition to start serious negotiations with their
customers.

> Such contracts are usually set up in a way that only very specific fixes 
> can be requested for said kernels. We've historically put our bets on the 
> fact that we'll be able to provide those rare fixes even for 2.6, and it 
> worked well.
> Now we're paying back a bit of course (because spectre/meltdown of course 
> qualifies), but upstream can completely and happily ignore that.

Hell, no. It affects upstream very much, because the whole dead-kernel
ritual consumes a massive amount of brain power.

These backports are not done by random code monkeys; they waste the scarce
time of top-notch developers and maintainers. This time is not available
for concentrating on upstream and a very restricted set of LTS kernels,
which would benefit everybody, including distros and their customers.

I know very well how many developers and maintainers are trainwrecked and
frustrated by that. Not to mention the massive backlog this creates,
which hurts everyone again.

So no, we cannot shrug it off and happily ignore it. We have to tell
distros over and over that they are doing massive damage.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:30                   ` Takashi Iwai
@ 2018-09-05 14:41                     ` Sasha Levin
  2018-09-05 14:46                       ` Takashi Iwai
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 14:41 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 04:30:36PM +0200, Takashi Iwai wrote:
>On Wed, 05 Sep 2018 16:20:40 +0200,
>Sasha Levin wrote:
>>
>> On Wed, Sep 05, 2018 at 03:03:13PM +0200, Takashi Iwai wrote:
>> >On Wed, 05 Sep 2018 14:24:18 +0200,
>> >James Bottomley wrote:
>> >>
>> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>> >> >
>> >> >> This really shouldn't be an issue: stable trees are backported from
>> >> >> upstream.  The patch (should) work in upstream, so it should work in
>> >> >> stable.  There are only a few real cases you need to worry about:
>> >> >
>> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
>> >> >and
>> >> >>       the fix backported soon)
>> >> >>    2. Missing precursor causing issues in stable alone.
>> >> >>    3. Bug introduced when hand applying.
>> >> >
>> >> >> The chances of one of these happening is non-zero, but the criteria
>> >> >for
>> >> >> stable should mean its still better odds than the odds of hitting the
>> >> >> bug it was fixing.
>> >> >
>> >> >Some of those are substantial enough to be worth worrying about,
>> >> >especially the missing precursor issues.  It's rarely an issue with the
>> >> >human generated backports but the automated ones don't have a sense of
>> >> >context in the selection.
>> >> >
>> >> >There's also a risk/reward tradeoff to consider with more minor issues,
>> >> >especially performance related ones.  We want people to be enthusiastic
>> >> >about taking stable updates and every time they find a problem with a
>> >> >backport that works against them doing that.
>> >>
>> >> I absolutely agree.  That's why I said our process is expediency
>> >> based:  you have to trade off the value of applying the patch vs the
>> >> probability of introducing bugs.  However the maintainers are mostly
>> >> considering this which is why stable is largely free from trivial
>> >> but pointless patches.  The rule should be: if it doesn't fix a user
>> >> visible bug, it doesn't go into stable.
>> >
>> >Right, and here the current AUTOSEL (and some other not-stable-marked)
>> >patches coming to a gray zone.  The picked-up patches are often right
>> >as "some" fixes, but they are not necessarily qualified as "stable
>> >fixes".
>> >
>> >How about allowing to change the choice of AUTOSEL to be opt-in and
>> >opt-out, depending on the tree?  In my case, usually the patches
>> >caught by AUTOSEL aren't really the patches with forgotten stable
>> >marker, but rather left intentionally by various reasons.  Most of
>> >them are fine to apply in anyway, but it was uncertain whether they
>> >are really needed / qualifying as stable fixes.  So, I'd be happy to
>> >see them as opt-in, i.e. applied only via manual approval.
>>
>> So right now you can opt-out your tree if you'd like. I'm not trying to
>> force it on any particular maintainer. If you'd like to ack each patch I
>> send before it goes in a tree this is something we can definitely do.
>
>Yeah, that would help in my case.
>
>Particularly, I'd like to have an option to defer the patch merge.
>For example...

You can always do that by pointing it out on the review request mail.

>> FWIW, it looks like your tree is in a very good shape compared to most
>> other trees I encounter, so I end up sending fewer proposed stable
>> commits your way.
>>
>> I tried picking a random commit that went through my selection process
> >> and chose https://lore.kernel.org/patchwork/patch/909923/ . Is this type
>> of patch that should not belong in stable?
>
>... this is an example I'd hold for a while until a bit more testing
>has been done after the release of Linus tree.  This is clearly a fix,
>but it's no regression fix or such but just catching some logically
>possible error case.  Hence there hasn't been any test coverage or
>explicit unit testing.  So, this kind of change might have a slightly
>higher risk of regression than the obvious fix (which is usually with
>cc-to-stable).
>
>Note that this particular patch might have been picked up lately
>enough, but you get an idea.

So right now I'm lagging a few weeks behind upstream. If I limit it to
patches that are at least 1 month old will that help with your concerns?
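
(Purely as an illustration of what such a filter would look like -- the
30-day threshold and the script itself are made up, not what the selection
engine actually does:)

#!/usr/bin/env python3
# Sketch of an age filter for proposed stable commits: read candidate
# commit ids on stdin and print only those whose committer date is at
# least MIN_AGE_DAYS old.  Assumes a local clone of Linus' tree.
import subprocess
import sys
import time

MIN_AGE_DAYS = 30  # illustrative threshold only

def age_in_days(commit):
    # %ct is the committer date as a Unix timestamp.
    ts = subprocess.run(
        ["git", "show", "-s", "--format=%ct", commit],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return (time.time() - int(ts)) / 86400

if __name__ == "__main__":
    for line in sys.stdin:
        commit = line.strip()
        if commit and age_in_days(commit) >= MIN_AGE_DAYS:
            print(commit)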


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-04 21:22 ` Justin Forbes
@ 2018-09-05 14:42   ` Greg KH
  2018-09-05 15:10     ` Mark Brown
                       ` (5 more replies)
  0 siblings, 6 replies; 74+ messages in thread
From: Greg KH @ 2018-09-05 14:42 UTC (permalink / raw)
  To: Justin Forbes; +Cc: ksummit

On Tue, Sep 04, 2018 at 04:22:59PM -0500, Justin Forbes wrote:
> On Tue, Sep 4, 2018 at 3:58 PM, Laura Abbott <labbott@redhat.com> wrote:
> > I'd like to start a discussion about the stable release cycle.
> >
> > Fedora is a heavy user of the most recent stable trees and we
> > generally do a pretty good job of keeping up to date. As we
> > try and increase testing though, the stable release process
> > gets to be a bit difficult. We often run into the problem where
> > release .Z is officially released and then .Z+1 comes
> > out as an -rc immediately after. Given Fedora release processes,
> > we haven't always finished testing .Z by the time .Z+1 comes
> > out. What to do in this situation really depends on what's in
> > .Z and .Z+1 and how stable we think things are. This usually
> > works out fine but a) sometimes we guess wrong and should have
> > tested .Z more b) we're only looking to increase testing.
> >
> > What I'd like to see is stable updates that come on a regular
> > schedule with a longer -rc interval, say Sunday with
> > a one week -rc period. I understand that much of the current
> > stable schedule is based on Greg's schedule. As a distro
> > maintainer though, a regular release schedule with a longer
> > testing window makes it much easier to plan and deliver something
> > useful to our users. It's also a much easier sell for encouraging
> > everyone to pick up every stable update if there's a known
> > schedule. I also realize Greg is probably reading this with a very
> > skeptical look on his face so I'd be interested to hear from
> > other distro maintainers as well.
> >
> 
> This has been a fairly recent problem. There was a roughly weekly
> cadence for a very long time and that was pretty easy to work with.  I
> know that some of these updates do fix embargoed security issues that
> we don't find out are actual fixes until later, but frequently in
> those cases, the fixes are pushed well before embargo lifts, and they
> could be fit into a weekly cadence.  Personally I don't have a problem
> with the 3 day rc period, but pushing 2 kernels a week can be a
> problem for users. (skipping a stable update is also a problem for
> users.)  What I would prefer is 1 stable update per week with an
> exception for *serious* security issues, where serious would mean
> either real end user impact or high profile lots of press users are
> going to be wondering where a fix is.

Laura, thanks for bringing this up.  I'll try to respond here given that
Justin agrees with the issue of timing.

Honestly, this year has been a total shit-storm for stable due to the
whole security mess we have been dealing with.  The number of
totally-crazy-intrusive patches I have had to take is insane.  Combine
that with a total lack of regard for the security issues for some arches
(arm32 comes to mind), it's been a very rough year and I have been just
trying to keep on top of everything.

Because of these issues (and it wasn't just spectre/meltdown, we have
had other major fire drills in some subsystems), the release cycles have
been quick and contain a lot of patches, sorry about that.  But that is
reflected in Linus's tree as well, so maybe this is just the "new
normal" that we all need to get used to.

I could do a "one release a week" cycle, which I would _love_ but that
is not going to decrease the number of patches per release, it is only
going to make them large (patch rate stays the same, and increases, no
matter when I release)  So I had been thinking that to break the
releases up into a "here's a hundred or so patches" per release, was a
helpful thing to the reviewers.

If this assumption is incorrect, yes, I can go to one-per-week, if
people agree that they can handle the large increase per release
properly.  Can you all do that?

Are we going to do a "patch tuesday" like our friends in Redmond now? :)

Note, if we do pick a specific day-per-week, then anything outside of
that cycle will cause people to look _very_ closely at the release.  I
don't know if that's a good thing or not, but be aware that it could
cause unintended side-effects.  Personally I think the fact that we are
_not_ regular is a good thing; no out-of-band information leakage
happens that way.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 10:11           ` Mark Brown
@ 2018-09-05 14:44             ` Steven Rostedt
  0 siblings, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2018-09-05 14:44 UTC (permalink / raw)
  To: Mark Brown; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018 11:11:59 +0100
Mark Brown <broonie@kernel.org> wrote:

> On Wed, Sep 05, 2018 at 10:56:42AM +0200, Greg KH wrote:
> 
> > 	- We have finally woken some subsystem maintainers up into
> > 	  actually properly tagging patches for stable.  We used to have
> > 	  a horrid rate of this happening, and it is getting better.
> > 	  However, we still have whole major subsystems that _never_ tag
> > 	  anything, which is a problem, so things will get larger.  
> 
> Some of us have been doing this for 5+ years :/

Yep.

> 
> > 	- Sasha's work in finding the patches that maintainers/developer
> > 	  fail to tag is paying off really well, which also increases
> > 	  the size.  
> 
> These and the few other patches that I didn't tag for stable myself are
> the only ones I try to review reliably, the others I already looked at
> for stable at least once and especially where things are automated it's
> better to have some manual checking.  It's good that Sasha's stuff is
> flagged now, and most other submissions are obvious as well, so that's
> fairly easy to do and means that the review burden is relatively light.

I tend to as well, as there are some fixes I don't tag for stable because
they don't "break" things too badly (things that were always broken for
years, but nobody noticed, like a bad format of the trace output), or
because I have another fix for the problem. For instance, I just replied to an
AUTOSEL patch that fixed a symptom of a bug, but the bug fix itself was
tagged for stable. The symptom fix was just to make the code more
"robust" and shouldn't hurt anything, but it really wasn't a bug fix.
I don't think it was necessary to backport that, but if someone else
thinks it is, I'll just let it be.

-- Steve

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:41                     ` Sasha Levin
@ 2018-09-05 14:46                       ` Takashi Iwai
  2018-09-05 14:54                         ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Takashi Iwai @ 2018-09-05 14:46 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, 05 Sep 2018 16:41:56 +0200,
Sasha Levin wrote:
> 
> On Wed, Sep 05, 2018 at 04:30:36PM +0200, Takashi Iwai wrote:
> >On Wed, 05 Sep 2018 16:20:40 +0200,
> >Sasha Levin wrote:
> >>
> >> On Wed, Sep 05, 2018 at 03:03:13PM +0200, Takashi Iwai wrote:
> >> >On Wed, 05 Sep 2018 14:24:18 +0200,
> >> >James Bottomley wrote:
> >> >>
> >> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
> >> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
> >> >> >
> >> >> >> This really shouldn't be an issue: stable trees are backported from
> >> >> >> upstream.  The patch (should) work in upstream, so it should work in
> >> >> >> stable.  There are only a few real cases you need to worry about:
> >> >> >
> >> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
> >> >> >and
> >> >> >>       the fix backported soon)
> >> >> >>    2. Missing precursor causing issues in stable alone.
> >> >> >>    3. Bug introduced when hand applying.
> >> >> >
> >> >> >> The chances of one of these happening is non-zero, but the criteria
> >> >> >for
> >> >> >> stable should mean its still better odds than the odds of hitting the
> >> >> >> bug it was fixing.
> >> >> >
> >> >> >Some of those are substantial enough to be worth worrying about,
> >> >> >especially the missing precursor issues.  It's rarely an issue with the
> >> >> >human generated backports but the automated ones don't have a sense of
> >> >> >context in the selection.
> >> >> >
> >> >> >There's also a risk/reward tradeoff to consider with more minor issues,
> >> >> >especially performance related ones.  We want people to be enthusiastic
> >> >> >about taking stable updates and every time they find a problem with a
> >> >> >backport that works against them doing that.
> >> >>
> >> >> I absolutely agree.  That's why I said our process is expediency
> >> >> based:  you have to trade off the value of applying the patch vs the
> >> >> probability of introducing bugs.  However the maintainers are mostly
> >> >> considering this which is why stable is largely free from trivial
> >> >> but pointless patches.  The rule should be: if it doesn't fix a user
> >> >> visible bug, it doesn't go into stable.
> >> >
> >> >Right, and here the current AUTOSEL (and some other not-stable-marked)
> >> >patches coming to a gray zone.  The picked-up patches are often right
> >> >as "some" fixes, but they are not necessarily qualified as "stable
> >> >fixes".
> >> >
> >> >How about allowing to change the choice of AUTOSEL to be opt-in and
> >> >opt-out, depending on the tree?  In my case, usually the patches
> >> >caught by AUTOSEL aren't really the patches with forgotten stable
> >> >marker, but rather left intentionally by various reasons.  Most of
> >> >them are fine to apply in anyway, but it was uncertain whether they
> >> >are really needed / qualifying as stable fixes.  So, I'd be happy to
> >> >see them as opt-in, i.e. applied only via manual approval.
> >>
> >> So right now you can opt-out your tree if you'd like. I'm not trying to
> >> force it on any particular maintainer. If you'd like to ack each patch I
> >> send before it goes in a tree this is something we can definitely do.
> >
> >Yeah, that would help in my case.
> >
> >Particularly, I'd like to have an option to defer the patch merge.
> >For example...
> 
> You can always do that by pointing it out on the review request mail.

OK, that should work, then.


> >> FWIW, it looks like your tree is in a very good shape compared to most
> >> other trees I encounter, so I end up sending fewer proposed stable
> >> commits your way.
> >>
> >> I tried picking a random commit that went through my selection process
> >> and chose https://lore.kernel.org/patchwork/patch/909923/ . Is this type
> >> of patch that should not belong in stable?
> >
> >... this is an example I'd hold for a while until a bit more testing
> >has been done after the release of Linus tree.  This is clearly a fix,
> >but it's no regression fix or such but just catching some logically
> >possible error case.  Hence there hasn't been any test coverage or
> >explicit unit testing.  So, this kind of change might have a slightly
> >higher risk of regression than the obvious fix (which is usually with
> >cc-to-stable).
> >
> >Note that this particular patch might have been picked up lately
> >enough, but you get an idea.
> 
> So right now I'm lagging a few weeks behind upstream. If I limit it to
> patches that are at least 1 month old will that help with your concerns?

A few weeks after rc-release or the final release?
If it's the latter, that should be fine.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:27                 ` Sasha Levin
@ 2018-09-05 14:50                   ` Mark Brown
  2018-09-05 15:00                     ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Mark Brown @ 2018-09-05 14:50 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1166 bytes --]

On Wed, Sep 05, 2018 at 02:27:29PM +0000, Sasha Levin wrote:
> On Wed, Sep 05, 2018 at 02:16:43PM +0100, Mark Brown wrote:

> >It's not just maintainers any more - in particular we've got Sasha's
> >neural net thing picking patches as well and it's substantially more
> >trigger happy than at least I am.  People do get a chance to review what
> >it's picking but that's different to maintainers picking things.

> What can I do to make the process better?

> I tried giving longer review periods, I tried sending emails right when
> the patch is merged upstream instead of weeks later and I tried actively
> pursuing some maintainers for explicit Acks. None of which seemed to
> make anyone happier.

Honestly the whole thing just gives me a bit of anxiety.  For most of
the patches my thinking is "it's *probably* fine", and it's rare for
me to think "Oh, I missed that important patch!".  As you know I do
sometimes actively push back on things.  On balance it mostly doesn't
break things, though, and I'm not sure what can be improved process-wise
here; a lot of it is fundamentally that if I thought a patch was clearly
good for stable I'd have flagged it already.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:46                       ` Takashi Iwai
@ 2018-09-05 14:54                         ` Sasha Levin
  2018-09-05 15:12                           ` Takashi Iwai
  2018-09-05 15:19                           ` Thomas Gleixner
  0 siblings, 2 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 14:54 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 04:46:26PM +0200, Takashi Iwai wrote:
>On Wed, 05 Sep 2018 16:41:56 +0200,
>Sasha Levin wrote:
>> So right now I'm lagging a few weeks behind upstream. If I limit it to
>> patches that are at least 1 month old will that help with your concerns?
>
>A few weeks after rc-release or the final release?
>If it's the latter, that should be fine.

A few weeks after a given patch was merged upstream. The tricky part is
if I wait a month from final release then the relevant stable tree is
already EOL.

Greg has an aggressive schedule for Stable, I'm just trying to keep up
most of the time...


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:50                   ` Mark Brown
@ 2018-09-05 15:00                     ` Sasha Levin
  0 siblings, 0 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 15:00 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 03:50:29PM +0100, Mark Brown wrote:
>On Wed, Sep 05, 2018 at 02:27:29PM +0000, Sasha Levin wrote:
>> On Wed, Sep 05, 2018 at 02:16:43PM +0100, Mark Brown wrote:
>
>> >It's not just maintainers any more - in particular we've got Sasha's
>> >neural net thing picking patches as well and it's substantially more
>> >trigger happy than at least I am.  People do get a chance to review what
>> >it's picking but that's different to maintainers picking things.
>
>> What can I do to make the process better?
>
>> I tried giving longer review periods, I tried sending emails right when
>> the patch is merged upstream instead of weeks later and I tried actively
>> pursuing some maintainers for explicit Acks. None of which seemed to
>> make anyone happier.
>
>Honestly the whole thing just gives me a bit of anxiety.  I'd say most
>of the patches my thinking is "it's *probably* fine" and it's rare for
>me to think "Oh, I missed that important patch!".  As you know I do
>sometimes actively push back on things.  On balance it mostly doesn't
>break things though and I'm not sure what can be improved process wise
>here, a lot of it is fundamentally that if I thought it was clearly good
>for stable I'd have flagged it already.

Indeed. Some maintainers do a better job than others of tagging things
for Stable, so to them the AUTOSEL patches seem like just noise, but for
other maintainers AUTOSEL basically does all the work for them.


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
@ 2018-09-05 15:10     ` Mark Brown
  2018-09-05 15:10     ` Sasha Levin
                       ` (4 subsequent siblings)
  5 siblings, 0 replies; 74+ messages in thread
From: Mark Brown @ 2018-09-05 15:10 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit, Justin Forbes

[-- Attachment #1: Type: text/plain, Size: 658 bytes --]

On Wed, Sep 05, 2018 at 04:42:33PM +0200, Greg KH wrote:

> totally-crazy-intrusive patches I have had to take is insane.  Combine
> that with a total lack of regard for the security issues for some arches
> (arm32 comes to mind), it's been a very rough year and I have been just
> trying to keep on top of everything.

There is regard for the arm32 stuff; it's just a question of having
enough people to do the backports to the excitingly old kernels that
people want for both arm32 and especially arm64 (which is a lot more
fun here as it's been more actively developed in the relevant time
period for current stable kernels).  Watch this space, basically.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
  2018-09-05 15:10     ` Mark Brown
@ 2018-09-05 15:10     ` Sasha Levin
  2018-09-05 16:19     ` Guenter Roeck
                       ` (3 subsequent siblings)
  5 siblings, 0 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 15:10 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit, Justin Forbes

On Wed, Sep 05, 2018 at 04:42:33PM +0200, Greg KH wrote:
>On Tue, Sep 04, 2018 at 04:22:59PM -0500, Justin Forbes wrote:
>> On Tue, Sep 4, 2018 at 3:58 PM, Laura Abbott <labbott@redhat.com> wrote:
>> > I'd like to start a discussion about the stable release cycle.
>> >
>> > Fedora is a heavy user of the most recent stable trees and we
>> > generally do a pretty good job of keeping up to date. As we
>> > try and increase testing though, the stable release process
>> > gets to be a bit difficult. We often run into the problem where
>> > release .Z is officially released and then .Z+1 comes
>> > out as an -rc immediately after. Given Fedora release processes,
>> > we haven't always finished testing .Z by the time .Z+1 comes
>> > out. What to do in this situation really depends on what's in
>> > .Z and .Z+1 and how stable we think things are. This usually
>> > works out fine but a) sometimes we guess wrong and should have
>> > tested .Z more b) we're only looking to increase testing.
>> >
>> > What I'd like to see is stable updates that come on a regular
>> > schedule with a longer -rc interval, say Sunday with
>> > a one week -rc period. I understand that much of the current
>> > stable schedule is based on Greg's schedule. As a distro
>> > maintainer though, a regular release schedule with a longer
>> > testing window makes it much easier to plan and deliver something
>> > useful to our users. It's also a much easier sell for encouraging
>> > everyone to pick up every stable update if there's a known
>> > schedule. I also realize Greg is probably reading this with a very
>> > skeptical look on his face so I'd be interested to hear from
>> > other distro maintainers as well.
>> >
>>
>> This has been a fairly recent problem. There was a roughly weekly
>> cadence for a very long time and that was pretty easy to work with.  I
>> know that some of these updates do fix embargoed security issues that
>> we don't find out are actual fixes until later, but frequently in
>> those cases, the fixes are pushed well before embargo lifts, and they
>> could be fit into a weekly cadence.  Personally I don't have a problem
>> with the 3 day rc period, but pushing 2 kernels a week can be a
>> problem for users. (skipping a stable update is also a problem for
>> users.)  What I would prefer is 1 stable update per week with an
>> exception for *serious* security issues, where serious would mean
>> either real end user impact or high profile lots of press users are
>> going to be wondering where a fix is.
>
>Laura, thanks for bringing this up.  I'll try to respond here given that
>Justin agrees with the issue of timing.
>
>Honestly, this year has been a total shit-storm for stable due to the
>whole security mess we have been dealing with.  The number of
>totally-crazy-intrusive patches I have had to take is insane.  Combine
>that with a total lack of regard for the security issues for some arches
>(arm32 comes to mind), it's been a very rough year and I have been just
>trying to keep on top of everything.
>
>Because of these issues (and it wasn't just spectre/meltdown, we have
>had other major fire drills in some subsystems), the release cycles have
>been quick and contain a lot of patches, sorry about that.  But that is
>reflected in Linus's tree as well, so maybe this is just the "new
>normal" that we all need to get used to.
>
>I could do a "one release a week" cycle, which I would _love_ but that
>is not going to decrease the number of patches per release, it is only
>going to make them large (patch rate stays the same, and increases, no
>matter when I release)  So I had been thinking that to break the
>releases up into a "here's a hundred or so patches" per release, was a
>helpful thing to the reviewers.

Maybe something like stable-next would help? I know that right now you
lag a few weeks behind Linus. What if, instead of lagging, we just put all
the stable-tagged commits into a stable-next branch right away and let
adventurous humans/distros test it out?

By the time you queue these commits up for your stable branches
they would already have had a few weeks' worth of eyeballs and some
amount of testing.
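
(Very roughly, and only to illustrate the idea -- the branch name, base tag
and remote layout here are made up:)

#!/usr/bin/env python3
# Sketch of building a 'stable-next' preview branch: start from a stable
# base tag and cherry-pick every upstream commit since then that carries a
# stable@ tag in its message.  Conflicting cherry-picks are skipped and
# reported instead of stopping the run.
import subprocess

BASE = "v4.18"              # illustrative stable base
UPSTREAM = "origin/master"  # Linus' tree

def git(*args, check=True):
    return subprocess.run(["git", *args], capture_output=True, text=True, check=check)

def stable_tagged_commits():
    # Oldest first, so the cherry-picks apply in order.
    out = git("log", "--reverse", "--format=%H",
              "--grep=stable@vger.kernel.org", BASE + ".." + UPSTREAM).stdout
    return out.split()

if __name__ == "__main__":
    git("checkout", "-B", "stable-next", BASE)
    for sha in stable_tagged_commits():
        if git("cherry-pick", "-x", sha, check=False).returncode != 0:
            git("cherry-pick", "--abort", check=False)
            print("skipped (conflict):", sha)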

>If this assumption is incorrect, yes, I can go to one-per-week, if
>people agree that they can handle the large increase per release
>properly.  Can you all do that?
>
>Are we going to do a "patch tuesday" like our friends in Redmond now? :)

diff --git a/Makefile b/Makefile
index 2b458801ba74..9a7e83c658cc 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,7 @@ VERSION = 4
 PATCHLEVEL = 19
 SUBLEVEL = 0
 EXTRAVERSION = -rc1
-NAME = Merciless Moray
+NAME = Microsoft Linux

 # *DOCUMENTATION*
 # To see a list of typical targets execute "make help"

--
Thanks,
Sasha

^ permalink raw reply related	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:54                         ` Sasha Levin
@ 2018-09-05 15:12                           ` Takashi Iwai
  2018-09-05 15:19                           ` Thomas Gleixner
  1 sibling, 0 replies; 74+ messages in thread
From: Takashi Iwai @ 2018-09-05 15:12 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, 05 Sep 2018 16:54:14 +0200,
Sasha Levin wrote:
> 
> On Wed, Sep 05, 2018 at 04:46:26PM +0200, Takashi Iwai wrote:
> >On Wed, 05 Sep 2018 16:41:56 +0200,
> >Sasha Levin wrote:
> >> So right now I'm lagging a few weeks behind upstream. If I limit it to
> >> patches that are at least 1 month old will that help with your concerns?
> >
> >A few weeks after rc-release or the final release?
> >If it's the latter, that should be fine.
> 
> A few weeks after a given patch was merged upstream. The tricky part is
> if I wait a month from final release then the relevant stable tree is
> already EOL.

But it's still good for LTS, right?
For the normal stable trees, we can live without such non-urgent fixes.
And for LTS, we can push the fix once it's confirmed to be
really stable enough.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:41           ` Thomas Gleixner
@ 2018-09-05 15:18             ` Steven Rostedt
  2018-09-06  8:48               ` Thomas Gleixner
  0 siblings, 1 reply; 74+ messages in thread
From: Steven Rostedt @ 2018-09-05 15:18 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018 16:41:50 +0200 (CEST)
Thomas Gleixner <tglx@linutronix.de> wrote:

> Yeah, I've heard that song over and over. Of course you can't undo the
> mistakes of the past, but the shades of meltdown & co. should give all
> vendors enough ammunition to start serious negotiations with their
> customers.

Perhaps that's the most distros can do.

> 
> > Such contracts are usually set up in a way that only very specific fixes 
> > can be requested for said kernels. We've historically put our bets on the 
> > fact that we'll be able to provide those rare fixes even for 2.6, and it 
> > worked well.
> > Now we're paying back a bit of course (because spectre/meltdown of course 
> > qualifies), but upstream can completely and happily ignore that.  
> 
> Hell, no. It affects upstream very much, because the whole dead-kernel
> ritual consumes a massive amount of brain power.
> 
> These backports are not done by random code monkeys; they waste the scarce
> time of top-notch developers and maintainers. This time is not available
> for concentrating on upstream and a very restricted set of LTS kernels,
> which would benefit everybody, including distros and their customers.
> 
> I know very well how many developers and maintainers are trainwrecked and
> frustrated by that. Not to mention the massive backlog this creates,
> which hurts everyone again.
> 
> So no, we cannot shrug it off and happily ignore it. We have to tell
> distros over and over that they are doing massive damage.

It's not the distros that need convincing, it's the vendors that pay to
have it done. When I first started at Red Hat and was told about the
"Stable Kernel ABI", the person telling me about this (a very
established kernel developer) also said "Yeah it really sucks, but
companies are willing to pay a shite load of money to have it done". And
it's in the distros' best interest to get that shite load of money. It
also funds the same developers to do this work, and hopefully to continue
helping upstream as well.

If we remove that nasty work, these companies won't need to continue
paying that shite load anymore, and the distros may not be able to afford
to keep paying these talented developers.

-- Steve

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:54                         ` Sasha Levin
  2018-09-05 15:12                           ` Takashi Iwai
@ 2018-09-05 15:19                           ` Thomas Gleixner
  2018-09-05 15:29                             ` Sasha Levin
  1 sibling, 1 reply; 74+ messages in thread
From: Thomas Gleixner @ 2018-09-05 15:19 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, ksummit-discuss, Greg KH

On Wed, 5 Sep 2018, Sasha Levin via Ksummit-discuss wrote:
> On Wed, Sep 05, 2018 at 04:46:26PM +0200, Takashi Iwai wrote:
> >On Wed, 05 Sep 2018 16:41:56 +0200,
> >Sasha Levin wrote:
> >> So right now I'm lagging a few weeks behind upstream. If I limit it to
> >> patches that are at least 1 month old will that help with your concerns?
> >
> >A few weeks after rc-release or the final release?
> >If it's the latter, that should be fine.
> 
> A few weeks after a given patch was merged upstream. The tricky part is
> if I wait a month from final release then the relevant stable tree is
> already EOL.

If the thing is close to EOL then it won't get the next fix or, worse, the fix
for the subtly broken one either. It really does not matter at all.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 15:19                           ` Thomas Gleixner
@ 2018-09-05 15:29                             ` Sasha Levin
  0 siblings, 0 replies; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 15:29 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: James Bottomley, ksummit-discuss, Greg KH

On Wed, Sep 05, 2018 at 05:19:59PM +0200, Thomas Gleixner wrote:
>On Wed, 5 Sep 2018, Sasha Levin via Ksummit-discuss wrote:
>> On Wed, Sep 05, 2018 at 04:46:26PM +0200, Takashi Iwai wrote:
>> >On Wed, 05 Sep 2018 16:41:56 +0200,
>> >Sasha Levin wrote:
>> >> So right now I'm lagging a few weeks behind upstream. If I limit it to
>> >> patches that are at least 1 month old will that help with your concerns?
>> >
>> >A few weeks after rc-release or the final release?
>> >If it's the latter, that should be fine.
>>
>> A few weeks after a given patch was merged upstream. The tricky part is
>> if I wait a month from final release then the relevant stable tree is
>> already EOL.
>
>If the thing is close to EOL then it won't get the next fix or, worse, the fix
>for the subtly broken one either. It really does not matter at all.

Given that we release a kernel every ~2 months and that the Stable tree
goes EOL a week or two after the next kernel is released, we would
essentially ignore half of the release cycle (in particular the -rc
releases, when more fixes come in).


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:05                   ` Greg KH
@ 2018-09-05 15:54                     ` Daniel Vetter
  2018-09-05 16:19                       ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Daniel Vetter @ 2018-09-05 15:54 UTC (permalink / raw)
  To: Greg KH; +Cc: James Bottomley, ksummit-discuss

On Wed, Sep 5, 2018 at 4:05 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Wed, Sep 05, 2018 at 03:27:58PM +0200, Daniel Vetter wrote:
>> On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
>> > On Wed, 05 Sep 2018 14:24:18 +0200,
>> > James Bottomley wrote:
>> >>
>> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>> >> >
>> >> >> This really shouldn't be an issue: stable trees are backported from
>> >> >> upstream.  The patch (should) work in upstream, so it should work in
>> >> >> stable.  There are only a few real cases you need to worry about:
>> >> >
>> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
>> >> >and
>> >> >>       the fix backported soon)
>> >> >>    2. Missing precursor causing issues in stable alone.
>> >> >>    3. Bug introduced when hand applying.
>> >> >
>> >> >> The chances of one of these happening is non-zero, but the criteria
>> >> >for
>> >> >> stable should mean its still better odds than the odds of hitting the
>> >> >> bug it was fixing.
>> >> >
>> >> >Some of those are substantial enough to be worth worrying about,
>> >> >especially the missing precursor issues.  It's rarely an issue with the
>> >> >human generated backports but the automated ones don't have a sense of
>> >> >context in the selection.
>> >> >
>> >> >There's also a risk/reward tradeoff to consider with more minor issues,
>> >> >especially performance related ones.  We want people to be enthusiastic
>> >> >about taking stable updates and every time they find a problem with a
>> >> >backport that works against them doing that.
>> >>
>> >> I absolutely agree.  That's why I said our process is expediency
>> >> based:  you have to trade off the value of applying the patch vs the
>> >> probability of introducing bugs.  However the maintainers are mostly
>> >> considering this which is why stable is largely free from trivial
>> >> but pointless patches.  The rule should be: if it doesn't fix a user
>> >> visible bug, it doesn't go into stable.
>> >
>> > Right, and here the current AUTOSEL (and some other not-stable-marked)
>> > patches coming to a gray zone.  The picked-up patches are often right
>> > as "some" fixes, but they are not necessarily qualified as "stable
>> > fixes".
>> >
>> > How about allowing to change the choice of AUTOSEL to be opt-in and
>> > opt-out, depending on the tree?  In my case, usually the patches
>> > caught by AUTOSEL aren't really the patches with forgotten stable
>> > marker, but rather left intentionally by various reasons.  Most of
>> > them are fine to apply in anyway, but it was uncertain whether they
>> > are really needed / qualifying as stable fixes.  So, I'd be happy to
>> > see them as opt-in, i.e. applied only via manual approval.
>> >
>> > Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
>> > help for them.  They can be opt-out, i.e. kept until someone rejects.
>>
>> +1 on AUTOSEL opt-in. It's annyoing at best, when it backports cleanup
>> patches (because somehow those look like stealthy security fixes
>> sometimes) and breaks a bunch of people's boxes for no good reason.
>>
>> In general it'd be really good if -stable had a clearer audit path.
>> Every patch have a recorded reason why it's being applied (e.g. Cc:
>> stable in upstream, Link to the lkml thread/bug report, AUTOSEL mail,
>> whatever), so that after the fact I can figure out why a -stable patch
>> happend, that would be really great. Atm -stable occasionally blows
>> up, with a patch we didn't mark as cc: stable, and we have no idea
>> whyiit showed up in -stable even. That makes it really hard to do
>> better next time around.
>
> I try to keep the audit thread here, as I get asked all the time why
> stuff got added.
>
> Here's what I do, it's not exactly obvious, sorry:
>         - if it came from a stable@ tag, just leave it alone and add my
>           signed-off-by
>         - if it was manually requested by someone, I add a "cc:
>           requestor" to the signed-off-by area and add my s-o-b

Cc-stable-requested-by: would be more obvious. If you have it, an lkml
archive link with the bug report is even better.

An additional quirk in drm is that we have committers, so the normal Cc:
rules (author + committer + anyone already on Cc:) have a good chance
of leaving out the maintainers. And generally committers don't care one
bit about some multi-year-old LTS kernel; it's not their job ... You'll
never get any review from them.

>         - if it came from Sasha's tree, Sasha's s-o-b is on it

How do things end up in Sasha's tree? Is that just AUTOSEL, or also
other patches?

>         - if it came from David Miller's patchset, his s-o-b is on it.

Ok, that's netdev and Dave knows what's he doing :-)

> That should cover all types of patches currently going into the trees,
> right?
>
> So always, you can cc: everyone on the s-o-b area and get the people
> involved in the patch and someone involved in reviewing it for stable
> inclusion.

Let's pick a concrete example:

commit c81350c31d0d20661a0aa839b79182bcb0e7a45d
Author: Satendra Singh Thakur <satendra.t@samsung.com>
Date:   Thu May 3 11:19:32 2018 +0530

    drm/atomic: Handling the case when setting old crtc for plane

    [ Upstream commit fc2a69f3903dfd97cd47f593e642b47918c949df ]

    In the func drm_atomic_set_crtc_for_plane, with the current code,
    if crtc of the plane_state and crtc passed as argument to the func
    are same, entire func will executed in vein.
    It will get state of crtc and clear and set the bits in plane_mask.
    All these steps are not required for same old crtc.
    Ideally, we should do nothing in this case, this patch handles the same,
    and causes the program to return without doing anything in such scenario.

    Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
    Cc: Madhur Verma <madhur.verma@samsung.com>
    Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
    Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com
    Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

The upstream patch doesn't have a cc: stable. I tried looking for it in my
mail archives (and it's a patch committed by myself, so I guess I would
have been cc'ed?), but didn't find anything.

I have no idea why this got added at all. Looking at the discussion on
dri-devel, it's purely a cleanup for consistency with another
function. And it blew up :-/
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 15:54                     ` Daniel Vetter
@ 2018-09-05 16:19                       ` Sasha Levin
  2018-09-05 16:26                         ` Daniel Vetter
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 16:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 05:54:47PM +0200, Daniel Vetter wrote:
>On Wed, Sep 5, 2018 at 4:05 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> On Wed, Sep 05, 2018 at 03:27:58PM +0200, Daniel Vetter wrote:
>>> On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
>>> > On Wed, 05 Sep 2018 14:24:18 +0200,
>>> > James Bottomley wrote:
>>> >>
>>> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>>> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>>> >> >
>>> >> >> This really shouldn't be an issue: stable trees are backported from
>>> >> >> upstream.  The patch (should) work in upstream, so it should work in
>>> >> >> stable.  There are only a few real cases you need to worry about:
>>> >> >
>>> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
>>> >> >and
>>> >> >>       the fix backported soon)
>>> >> >>    2. Missing precursor causing issues in stable alone.
>>> >> >>    3. Bug introduced when hand applying.
>>> >> >
>>> >> >> The chances of one of these happening is non-zero, but the criteria
>>> >> >for
>>> >> >> stable should mean its still better odds than the odds of hitting the
>>> >> >> bug it was fixing.
>>> >> >
>>> >> >Some of those are substantial enough to be worth worrying about,
>>> >> >especially the missing precursor issues.  It's rarely an issue with the
>>> >> >human generated backports but the automated ones don't have a sense of
>>> >> >context in the selection.
>>> >> >
>>> >> >There's also a risk/reward tradeoff to consider with more minor issues,
>>> >> >especially performance related ones.  We want people to be enthusiastic
>>> >> >about taking stable updates and every time they find a problem with a
>>> >> >backport that works against them doing that.
>>> >>
>>> >> I absolutely agree.  That's why I said our process is expediency
>>> >> based:  you have to trade off the value of applying the patch vs the
>>> >> probability of introducing bugs.  However the maintainers are mostly
>>> >> considering this which is why stable is largely free from trivial
>>> >> but pointless patches.  The rule should be: if it doesn't fix a user
>>> >> visible bug, it doesn't go into stable.
>>> >
>>> > Right, and here the current AUTOSEL (and some other not-stable-marked)
>>> > patches coming to a gray zone.  The picked-up patches are often right
>>> > as "some" fixes, but they are not necessarily qualified as "stable
>>> > fixes".
>>> >
>>> > How about allowing to change the choice of AUTOSEL to be opt-in and
>>> > opt-out, depending on the tree?  In my case, usually the patches
>>> > caught by AUTOSEL aren't really the patches with forgotten stable
>>> > marker, but rather left intentionally by various reasons.  Most of
>>> > them are fine to apply in anyway, but it was uncertain whether they
>>> > are really needed / qualifying as stable fixes.  So, I'd be happy to
>>> > see them as opt-in, i.e. applied only via manual approval.
>>> >
>>> > Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
>>> > help for them.  They can be opt-out, i.e. kept until someone rejects.
>>>
>>> +1 on AUTOSEL opt-in. It's annyoing at best, when it backports cleanup
>>> patches (because somehow those look like stealthy security fixes
>>> sometimes) and breaks a bunch of people's boxes for no good reason.
>>>
>>> In general it'd be really good if -stable had a clearer audit path.
>>> Every patch have a recorded reason why it's being applied (e.g. Cc:
>>> stable in upstream, Link to the lkml thread/bug report, AUTOSEL mail,
>>> whatever), so that after the fact I can figure out why a -stable patch
>>> happend, that would be really great. Atm -stable occasionally blows
>>> up, with a patch we didn't mark as cc: stable, and we have no idea
>>> whyiit showed up in -stable even. That makes it really hard to do
>>> better next time around.
>>
>> I try to keep the audit thread here, as I get asked all the time why
>> stuff got added.
>>
>> Here's what I do, it's not exactly obvious, sorry:
>>         - if it came from a stable@ tag, just leave it alone and add my
>>           signed-off-by
>>         - if it was manually requested by someone, I add a "cc:
>>           requestor" to the signed-off-by area and add my s-o-b
>
>Cc-stable-requested-by: would be more obvious. If you have, lkml
>archive link with the bug report is even better.
>
>An additional quirk in drm is that we have committers, so normal Cc:
>rules (author + committer + anyone already on Cc:) has a good chance
>of leaving out maintainers. And generally committers don't care one
>bit about some multi-year old LTS kernel, not their job ... You'll
>never get any review from them.
>
>>         - if it came from Sasha's tree, Sasha's s-o-b is on it
>
>How do things end up in Sasha's tree? Is that just AUTOSEL, or also
>other patches?

Just AUTOSEL. Other patches take the regular way into stable.

>>         - if it came from David Miller's patchset, his s-o-b is on it.
>
>Ok, that's netdev and Dave knows what's he doing :-)
>
>> That should cover all types of patches currently going into the trees,
>> right?
>>
>> So always, you can cc: everyone on the s-o-b area and get the people
>> involved in the patch and someone involved in reviewing it for stable
>> inclusion.
>
>Let's pick a concrete example:
>
>commit c81350c31d0d20661a0aa839b79182bcb0e7a45d
>Author: Satendra Singh Thakur <satendra.t@samsung.com>
>Date:   Thu May 3 11:19:32 2018 +0530
>
>    drm/atomic: Handling the case when setting old crtc for plane
>
>    [ Upstream commit fc2a69f3903dfd97cd47f593e642b47918c949df ]
>
>    In the func drm_atomic_set_crtc_for_plane, with the current code,
>    if crtc of the plane_state and crtc passed as argument to the func
>    are same, entire func will executed in vein.
>    It will get state of crtc and clear and set the bits in plane_mask.
>    All these steps are not required for same old crtc.
>    Ideally, we should do nothing in this case, this patch handles the same,
>    and causes the program to return without doing anything in such scenario.
>
>    Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
>    Cc: Madhur Verma <madhur.verma@samsung.com>
>    Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
>    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>    Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com
>    Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
>Upstream patch doesn't have a cc: stable. I tried looking for it in my
>mail archives (and it's a patch committed by myself, so I guess I'll
>get cc'ed?), didn't find anything.

I'm really not sure why you don't see the mail. Can you maybe see if it
got filtered as spam?

>I have no idea why this got added at all. Looking at the discussion on
>dri-devel, it's purely a cleanup for consistency with another
>function. And it blew up :-/

On the flip side, what about:

commit 3fd34ac02ae8cc20d78e3aed2cf6e67f0ae109ea
Author: Hang Yuan <hang.yuan@linux.intel.com>
Date:   Mon Jul 23 20:15:46 2018 +0800

    drm/i915/gvt: fix cleanup sequence in intel_gvt_clean_device

    Create one vGPU and then unbind IGD device from i915 driver. The following
    oops will happen. This patch will free vgpu resource first and then gvt
    resource to remove these oops.

    BUG: unable to handle kernel NULL pointer dereference at       00000000000000a8
      PGD 80000003c9d2c067 P4D 80000003c9d2c067 PUD 3c817c067 P      MD 0
      Oops: 0002 [#1] SMP PTI
      RIP: 0010:down_write+0x1b/0x40
    Call Trace:
      debugfs_remove_recursive+0x46/0x1a0
      intel_gvt_debugfs_remove_vgpu+0x15/0x30 [i915]
      intel_gvt_destroy_vgpu+0x2d/0xf0 [i915]
      intel_vgpu_remove+0x2c/0x30 [kvmgt]
      mdev_device_remove_ops+0x23/0x50 [mdev]
      mdev_device_remove+0xdb/0x190 [mdev]
      mdev_device_remove+0x190/0x190 [mdev]
      device_for_each_child+0x47/0x90
      mdev_unregister_device+0xd5/0x120 [mdev]
      intel_gvt_clean_device+0x91/0x120 [i915]
      i915_driver_unload+0x9d/0x120 [i915]
      i915_pci_remove+0x15/0x20 [i915]
      pci_device_remove+0x3b/0xc0
      device_release_driver_internal+0x157/0x230
      unbind_store+0xfc/0x150
      kernfs_fop_write+0x10f/0x180
      __vfs_write+0x36/0x180
      ? common_file_perm+0x41/0x130
      ? _cond_resched+0x16/0x40
      vfs_write+0xb3/0x1a0
      ksys_write+0x52/0xc0
      do_syscall_64+0x55/0x100
      entry_SYSCALL_64_after_hwframe+0x44/0xa9

    BUG: unable to handle kernel NULL pointer dereference at 0      000000000000038
      PGD 8000000405bce067 P4D 8000000405bce067 PUD 405bcd067 PM      D 0
      Oops: 0000 [#1] SMP PTI
      RIP: 0010:hrtimer_active+0x5/0x40
    Call Trace:
      hrtimer_try_to_cancel+0x25/0x120
      ? tbs_sched_clean_vgpu+0x1f/0x50 [i915]
      hrtimer_cancel+0x15/0x20
      intel_gvt_destroy_vgpu+0x4c/0xf0 [i915]
      intel_vgpu_remove+0x2c/0x30 [kvmgt]
      mdev_device_remove_ops+0x23/0x50 [mdev]
      mdev_device_remove+0xdb/0x190 [mdev]
      ? mdev_device_remove+0x190/0x190 [mdev]
      device_for_each_child+0x47/0x90
      mdev_unregister_device+0xd5/0x120 [mdev]
      intel_gvt_clean_device+0x89/0x120 [i915]
      i915_driver_unload+0x9d/0x120 [i915]
      i915_pci_remove+0x15/0x20 [i915]
      pci_device_remove+0x3b/0xc0
      device_release_driver_internal+0x157/0x230
      unbind_store+0xfc/0x150
      kernfs_fop_write+0x10f/0x180
      __vfs_write+0x36/0x180
      ? common_file_perm+0x41/0x130
      ? _cond_resched+0x16/0x40
      vfs_write+0xb3/0x1a0
      ksys_write+0x52/0xc0
      do_syscall_64+0x55/0x100
      entry_SYSCALL_64_after_hwframe+0x44/0xa9
    
    Fixes: bc7b0be316ae("drm/i915/gvt: Add basic debugfs infrastructure")
    Fixes: afe04fbe6c52("drm/i915/gvt: create an idle vGPU")
    Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
    Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>

Which wasn't tagged for stable (and is not in any stable tree)?
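
(Backports get new SHAs, so the quick way to check is by subject against
the stable branches; assuming a "stable" remote pointing at the stable
tree, something like

    $ git log --oneline stable/linux-4.18.y --grep='fix cleanup sequence in intel_gvt_clean_device'

comes back empty.)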


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
  2018-09-05 15:10     ` Mark Brown
  2018-09-05 15:10     ` Sasha Levin
@ 2018-09-05 16:19     ` Guenter Roeck
  2018-09-05 18:31     ` Laura Abbott
                       ` (2 subsequent siblings)
  5 siblings, 0 replies; 74+ messages in thread
From: Guenter Roeck @ 2018-09-05 16:19 UTC (permalink / raw)
  To: Greg KH, Justin Forbes; +Cc: ksummit

On 09/05/2018 07:42 AM, Greg KH wrote:
[ ... ]
> I could do a "one release a week" cycle, which I would _love_ but that
> is not going to decrease the number of patches per release, it is only
> going to make them large (patch rate stays the same, and increases, no
> matter when I release)  So I had been thinking that to break the
> releases up into a "here's a hundred or so patches" per release, was a
> helpful thing to the reviewers.
> 
> If this assumption is incorrect, yes, I can go to one-per-week, if
> people agree that they can handle the large increase per release
> properly.  Can you all do that?
> 

Tough call. More patches per release means a higher likelihood of conflicts,
and thus increases the per-release risk (I just checked, chromeos-4.4
carries a whopping 15,000+ patches on top of v4.4.y. Ouch). On the other
hand, I do miss releases lately, but that is because it sometimes takes
several days for a merge to be accepted by the Chrome OS QC, on top of
the internal merge review process, which also takes 2-3 days. Thinking
about it, I'd rather miss a release once in a while than have a rigid
release schedule with more patches per release.

Ultimately, my conclusion is that the current process works fine for us,
but I can also live with a fixed schedule. However, I would really dislike
a longer -rc cycle since, for me, it would be zero gain and added pain.

> Are we going to do a "patch tuesday" like our friends in Redmond now? :)
> 
> Note, if we do pick a specific day-per-week, then anything outside of
> that cycle will cause people to look _very_ close at the release.  I
> don't know if that's a good thing or not, but be aware that it could
> cause unintended side-affects.  Personally I think the fact that we are
> _not_ regular is a good thing, no out-of-band information leakage
> happens that way.
> 

Agreed.

Guenter

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 16:19                       ` Sasha Levin
@ 2018-09-05 16:26                         ` Daniel Vetter
  2018-09-05 19:09                           ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Daniel Vetter @ 2018-09-05 16:26 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 5, 2018 at 6:19 PM, Sasha Levin
<Alexander.Levin@microsoft.com> wrote:
> On Wed, Sep 05, 2018 at 05:54:47PM +0200, Daniel Vetter wrote:
>>On Wed, Sep 5, 2018 at 4:05 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
>>> On Wed, Sep 05, 2018 at 03:27:58PM +0200, Daniel Vetter wrote:
>>>> On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
>>>> > On Wed, 05 Sep 2018 14:24:18 +0200,
>>>> > James Bottomley wrote:
>>>> >>
>>>> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>>>> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>>>> >> >
>>>> >> >> This really shouldn't be an issue: stable trees are backported from
>>>> >> >> upstream.  The patch (should) work in upstream, so it should work in
>>>> >> >> stable.  There are only a few real cases you need to worry about:
>>>> >> >
>>>> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
>>>> >> >and
>>>> >> >>       the fix backported soon)
>>>> >> >>    2. Missing precursor causing issues in stable alone.
>>>> >> >>    3. Bug introduced when hand applying.
>>>> >> >
>>>> >> >> The chances of one of these happening is non-zero, but the criteria
>>>> >> >for
>>>> >> >> stable should mean its still better odds than the odds of hitting the
>>>> >> >> bug it was fixing.
>>>> >> >
>>>> >> >Some of those are substantial enough to be worth worrying about,
>>>> >> >especially the missing precursor issues.  It's rarely an issue with the
>>>> >> >human generated backports but the automated ones don't have a sense of
>>>> >> >context in the selection.
>>>> >> >
>>>> >> >There's also a risk/reward tradeoff to consider with more minor issues,
>>>> >> >especially performance related ones.  We want people to be enthusiastic
>>>> >> >about taking stable updates and every time they find a problem with a
>>>> >> >backport that works against them doing that.
>>>> >>
>>>> >> I absolutely agree.  That's why I said our process is expediency
>>>> >> based:  you have to trade off the value of applying the patch vs the
>>>> >> probability of introducing bugs.  However the maintainers are mostly
>>>> >> considering this which is why stable is largely free from trivial
>>>> >> but pointless patches.  The rule should be: if it doesn't fix a user
>>>> >> visible bug, it doesn't go into stable.
>>>> >
>>>> > Right, and here the current AUTOSEL (and some other not-stable-marked)
>>>> > patches coming to a gray zone.  The picked-up patches are often right
>>>> > as "some" fixes, but they are not necessarily qualified as "stable
>>>> > fixes".
>>>> >
>>>> > How about allowing to change the choice of AUTOSEL to be opt-in and
>>>> > opt-out, depending on the tree?  In my case, usually the patches
>>>> > caught by AUTOSEL aren't really the patches with forgotten stable
>>>> > marker, but rather left intentionally by various reasons.  Most of
>>>> > them are fine to apply in anyway, but it was uncertain whether they
>>>> > are really needed / qualifying as stable fixes.  So, I'd be happy to
>>>> > see them as opt-in, i.e. applied only via manual approval.
>>>> >
>>>> > Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
>>>> > help for them.  They can be opt-out, i.e. kept until someone rejects.
>>>>
>>>> +1 on AUTOSEL opt-in. It's annyoing at best, when it backports cleanup
>>>> patches (because somehow those look like stealthy security fixes
>>>> sometimes) and breaks a bunch of people's boxes for no good reason.
>>>>
>>>> In general it'd be really good if -stable had a clearer audit path.
>>>> Every patch have a recorded reason why it's being applied (e.g. Cc:
>>>> stable in upstream, Link to the lkml thread/bug report, AUTOSEL mail,
>>>> whatever), so that after the fact I can figure out why a -stable patch
>>>> happend, that would be really great. Atm -stable occasionally blows
>>>> up, with a patch we didn't mark as cc: stable, and we have no idea
>>>> whyiit showed up in -stable even. That makes it really hard to do
>>>> better next time around.
>>>
>>> I try to keep the audit thread here, as I get asked all the time why
>>> stuff got added.
>>>
>>> Here's what I do, it's not exactly obvious, sorry:
>>>         - if it came from a stable@ tag, just leave it alone and add my
>>>           signed-off-by
>>>         - if it was manually requested by someone, I add a "cc:
>>>           requestor" to the signed-off-by area and add my s-o-b
>>
>>Cc-stable-requested-by: would be more obvious. If you have, lkml
>>archive link with the bug report is even better.
>>
>>An additional quirk in drm is that we have committers, so normal Cc:
>>rules (author + committer + anyone already on Cc:) has a good chance
>>of leaving out maintainers. And generally committers don't care one
>>bit about some multi-year old LTS kernel, not their job ... You'll
>>never get any review from them.
>>
>>>         - if it came from Sasha's tree, Sasha's s-o-b is on it
>>
>>How do things end up in Sasha's tree? Is that just AUTOSEL, or also
>>other patches?
>
> Just autosel. Other patches take the regular way into Stable.
>
>>>         - if it came from David Miller's patchset, his s-o-b is on it.
>>
>>Ok, that's netdev and Dave knows what's he doing :-)
>>
>>> That should cover all types of patches currently going into the trees,
>>> right?
>>>
>>> So always, you can cc: everyone on the s-o-b area and get the people
>>> involved in the patch and someone involved in reviewing it for stable
>>> inclusion.
>>
>>Let's pick a concrete example:
>>
>>commit c81350c31d0d20661a0aa839b79182bcb0e7a45d
>>Author: Satendra Singh Thakur <satendra.t@samsung.com>
>>Date:   Thu May 3 11:19:32 2018 +0530
>>
>>    drm/atomic: Handling the case when setting old crtc for plane
>>
>>    [ Upstream commit fc2a69f3903dfd97cd47f593e642b47918c949df ]
>>
>>    In the func drm_atomic_set_crtc_for_plane, with the current code,
>>    if crtc of the plane_state and crtc passed as argument to the func
>>    are same, entire func will executed in vein.
>>    It will get state of crtc and clear and set the bits in plane_mask.
>>    All these steps are not required for same old crtc.
>>    Ideally, we should do nothing in this case, this patch handles the same,
>>    and causes the program to return without doing anything in such scenario.
>>
>>    Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
>>    Cc: Madhur Verma <madhur.verma@samsung.com>
>>    Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
>>    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>    Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com
>>    Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
>>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>
>>Upstream patch doesn't have a cc: stable. I tried looking for it in my
>>mail archives (and it's a patch committed by myself, so I guess I'll
>>get cc'ed?), didn't find anything.
>
> I'm really not sure why you don't see the mail. Can you maybe see if it
> got filtered as spam?

Nothing in spam either. Maybe gmail cleaned it out already.

>>I have no idea why this got added at all. Looking at the discussion on
>>dri-devel, it's purely a cleanup for consistency with another
>>function. And it blew up :-/
>
> On the flip side, what about:
>
> commit 3fd34ac02ae8cc20d78e3aed2cf6e67f0ae109ea
> Author: Hang Yuan <hang.yuan@linux.intel.com>
> Date:   Mon Jul 23 20:15:46 2018 +0800
>
>     drm/i915/gvt: fix cleanup sequence in intel_gvt_clean_device
>
>     Create one vGPU and then unbind IGD device from i915 driver. The following
>     oops will happen. This patch will free vgpu resource first and then gvt
>     resource to remove these oops.
>
>     BUG: unable to handle kernel NULL pointer dereference at       00000000000000a8
>       PGD 80000003c9d2c067 P4D 80000003c9d2c067 PUD 3c817c067 P      MD 0
>       Oops: 0002 [#1] SMP PTI
>       RIP: 0010:down_write+0x1b/0x40
>     Call Trace:
>       debugfs_remove_recursive+0x46/0x1a0
>       intel_gvt_debugfs_remove_vgpu+0x15/0x30 [i915]
>       intel_gvt_destroy_vgpu+0x2d/0xf0 [i915]
>       intel_vgpu_remove+0x2c/0x30 [kvmgt]
>       mdev_device_remove_ops+0x23/0x50 [mdev]
>       mdev_device_remove+0xdb/0x190 [mdev]
>       mdev_device_remove+0x190/0x190 [mdev]
>       device_for_each_child+0x47/0x90
>       mdev_unregister_device+0xd5/0x120 [mdev]
>       intel_gvt_clean_device+0x91/0x120 [i915]
>       i915_driver_unload+0x9d/0x120 [i915]
>       i915_pci_remove+0x15/0x20 [i915]
>       pci_device_remove+0x3b/0xc0
>       device_release_driver_internal+0x157/0x230
>       unbind_store+0xfc/0x150
>       kernfs_fop_write+0x10f/0x180
>       __vfs_write+0x36/0x180
>       ? common_file_perm+0x41/0x130
>       ? _cond_resched+0x16/0x40
>       vfs_write+0xb3/0x1a0
>       ksys_write+0x52/0xc0
>       do_syscall_64+0x55/0x100
>       entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>     BUG: unable to handle kernel NULL pointer dereference at 0      000000000000038
>       PGD 8000000405bce067 P4D 8000000405bce067 PUD 405bcd067 PM      D 0
>       Oops: 0000 [#1] SMP PTI
>       RIP: 0010:hrtimer_active+0x5/0x40
>     Call Trace:
>       hrtimer_try_to_cancel+0x25/0x120
>       ? tbs_sched_clean_vgpu+0x1f/0x50 [i915]
>       hrtimer_cancel+0x15/0x20
>       intel_gvt_destroy_vgpu+0x4c/0xf0 [i915]
>       intel_vgpu_remove+0x2c/0x30 [kvmgt]
>       mdev_device_remove_ops+0x23/0x50 [mdev]
>       mdev_device_remove+0xdb/0x190 [mdev]
>       ? mdev_device_remove+0x190/0x190 [mdev]
>       device_for_each_child+0x47/0x90
>       mdev_unregister_device+0xd5/0x120 [mdev]
>       intel_gvt_clean_device+0x89/0x120 [i915]
>       i915_driver_unload+0x9d/0x120 [i915]
>       i915_pci_remove+0x15/0x20 [i915]
>       pci_device_remove+0x3b/0xc0
>       device_release_driver_internal+0x157/0x230
>       unbind_store+0xfc/0x150
>       kernfs_fop_write+0x10f/0x180
>       __vfs_write+0x36/0x180
>       ? common_file_perm+0x41/0x130
>       ? _cond_resched+0x16/0x40
>       vfs_write+0xb3/0x1a0
>       ksys_write+0x52/0xc0
>       do_syscall_64+0x55/0x100
>       entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>     Fixes: bc7b0be316ae("drm/i915/gvt: Add basic debugfs infrastructure")
>     Fixes: afe04fbe6c52("drm/i915/gvt: create an idle vGPU")
>     Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
>     Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
>
> Which wasn't tagged for (and is not in any) stable trees?

Not stable material; it just fixes a driver unload bug. That's for
developers only. Worst case you break some user's box for this, which
I don't think is cool. Since we're a 100% upstream driver team, this
won't harm developers if it's not backported.

Note that because of fbcon and other reasons, an rmmod i915 will fail.
You need to enable a bunch of CONFIG_EXPERT options (with scary texts
and stuff) and have a script from our test suite to be able to even
make this happen.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 12:53               ` Jiri Kosina
  2018-09-05 13:05                 ` Greg KH
@ 2018-09-05 16:39                 ` James Bottomley
  2018-09-05 17:06                   ` Dmitry Torokhov
  2018-09-05 17:33                   ` Steven Rostedt
  1 sibling, 2 replies; 74+ messages in thread
From: James Bottomley @ 2018-09-05 16:39 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: Greg KH, ksummit-discuss

On Wed, 2018-09-05 at 14:53 +0200, Jiri Kosina wrote:
> On Wed, 5 Sep 2018, James Bottomley wrote:
> 
> > The rule should be: if it doesn't fix a user visible bug, it
> > doesn't go 
> > into stable.
> 
> So I just looked at the latest (and newest) stable 4.18.5. It
> contains 22
> patches:
> 
> 	$ grep "commit [a-f0-9]\+ upstream" ChangeLog-4.18.5
> 	    commit a13f085d111e90469faf2d9965eb39b11c114d7e upstream.
> 	    commit bed4ff1ed4d8f2ef5007c5c6ae1b29c5677a3632 upstream.
> 	    commit c463a158cb6c5d9a85b7d894cd4f8116e8bd6be0 upstream.
> 	    commit 1204e35bedf4e5015cda559ed8c84789a6dae24e upstream.
> 	    commit 281e878eab191cce4259abbbf1a0322e3adae02c upstream.
> 	    commit 3dbe97efe8bf450b183d6dee2305cbc032e6b8a4 upstream.
> 	    commit 91a2968e245d6ba616db37001fa1a043078b1a65 upstream.
> 	    commit 4ce6435820d1f1cc2c2788e232735eb244bcc8a3 upstream.
> 	    commit 9d64b539b738fc181442caab95f1f76d9bd58539 upstream.
> 	    commit d3252ace0bc652a1a244455556b6a549f969bf99 upstream.
> 	    commit 7797167ffde1f00446301cb22b37b7c03194cfaf upstream.
> 	    commit 3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8 upstream.
> 	    commit ddf74e79a54070f277ae520722d3bab7f7a6c67a upstream.
> 	    commit de5372da605d3bca46e3102bab51b7e1c0e0a6f6 upstream.
> 	    commit 1a5d5e5d51e75a5bca67dadbcea8c841934b7b85 upstream.
> 	    commit 6d44acae1937b81cf8115ada8958e04f601f3f2e upstream.
> 	    commit c40a56a7818cfe735fc93a69e1875f8bba834483 upstream.
> 	    commit 6ea2738e0ca0e626c75202fb051c1e88d7a950fa upstream.
> 	    commit 9f515cdb411ef34f1aaf4c40bb0c932cf6db5de1 upstream.
> 	    commit 0d83432811f26871295a9bc24d3c387924da6071 upstream.
> 	    commit 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 upstream.
> 	    commit b748f2de4b2f578599f46c6000683a8da755bf68 upstream.
> 
> Just randomly scrolling through those, I am wondering how at least
> 
> 	7797167ffde1f00446301cb22b37b7c03194cfaf
> 	3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8
> 
> made it past any stable tree acceptance criteria.
> 
> They are memory ordering changes (so exactly area which is generally 
> fragile by itself and the risk of regressions simply can't be
> completely ignored), yet they fix absolutely no functional issue.
> 
> In addition to that, they all exist upstream only for one single -rc,
> so the public testing exposure is also currently minimal.
> 
> Yeah, I know I know, those are parisc, so noone cares anyway :P but
> that's really just the first randomly chosen kernel, with a small
> number of patches, and still 10% of them are something we'd not want
> to put into an enterprise distro kernel without a lot of
> justification and regression testing.

[puts PA-RISC hat on]

The maintainers believe these two patches will fix a persistent
segmentation fault problem that's blocking forward progress on the
debian parisc port.  So, given the parisc specificity of the patches
and the maintainer input, I very much think they fit the criteria.

James

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 16:39                 ` James Bottomley
@ 2018-09-05 17:06                   ` Dmitry Torokhov
  2018-09-05 17:33                   ` Steven Rostedt
  1 sibling, 0 replies; 74+ messages in thread
From: Dmitry Torokhov @ 2018-09-05 17:06 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg Kroah-Hartman, ksummit-discuss

On Wed, Sep 5, 2018 at 9:39 AM James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
>
> On Wed, 2018-09-05 at 14:53 +0200, Jiri Kosina wrote:
> > On Wed, 5 Sep 2018, James Bottomley wrote:
> >
> > > The rule should be: if it doesn't fix a user visible bug, it
> > > doesn't go
> > > into stable.
> >
> > So I just looked at the latest (and newest) stable 4.18.5. It
> > contains 22
> > patches:
> >
> >       $ grep "commit [a-f0-9]\+ upstream" ChangeLog-4.18.5
> >           commit a13f085d111e90469faf2d9965eb39b11c114d7e upstream.
> >           commit bed4ff1ed4d8f2ef5007c5c6ae1b29c5677a3632 upstream.
> >           commit c463a158cb6c5d9a85b7d894cd4f8116e8bd6be0 upstream.
> >           commit 1204e35bedf4e5015cda559ed8c84789a6dae24e upstream.
> >           commit 281e878eab191cce4259abbbf1a0322e3adae02c upstream.
> >           commit 3dbe97efe8bf450b183d6dee2305cbc032e6b8a4 upstream.
> >           commit 91a2968e245d6ba616db37001fa1a043078b1a65 upstream.
> >           commit 4ce6435820d1f1cc2c2788e232735eb244bcc8a3 upstream.
> >           commit 9d64b539b738fc181442caab95f1f76d9bd58539 upstream.
> >           commit d3252ace0bc652a1a244455556b6a549f969bf99 upstream.
> >           commit 7797167ffde1f00446301cb22b37b7c03194cfaf upstream.
> >           commit 3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8 upstream.
> >           commit ddf74e79a54070f277ae520722d3bab7f7a6c67a upstream.
> >           commit de5372da605d3bca46e3102bab51b7e1c0e0a6f6 upstream.
> >           commit 1a5d5e5d51e75a5bca67dadbcea8c841934b7b85 upstream.
> >           commit 6d44acae1937b81cf8115ada8958e04f601f3f2e upstream.
> >           commit c40a56a7818cfe735fc93a69e1875f8bba834483 upstream.
> >           commit 6ea2738e0ca0e626c75202fb051c1e88d7a950fa upstream.
> >           commit 9f515cdb411ef34f1aaf4c40bb0c932cf6db5de1 upstream.
> >           commit 0d83432811f26871295a9bc24d3c387924da6071 upstream.
> >           commit 36ecc1481dc8d8c52d43ba18c6b642c1d2fde789 upstream.
> >           commit b748f2de4b2f578599f46c6000683a8da755bf68 upstream.
> >
> > Just randomly scrolling through those, I am wondering how at least
> >
> >       7797167ffde1f00446301cb22b37b7c03194cfaf
> >       3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8
> >
> > made it past any stable tree acceptance criteria.
> >
> > They are memory ordering changes (so exactly area which is generally
> > fragile by itself and the risk of regressions simply can't be
> > completely ignored), yet they fix absolutely no functional issue.
> >
> > In addition to that, they all exist upstream only for one single -rc,
> > so the public testing exposure is also currently minimal.
> >
> > Yeah, I know I know, those are parisc, so noone cares anyway :P but
> > that's really just the first randomly chosen kernel, with a small
> > number of patches, and still 10% of them are something we'd not want
> > to put into an enterprise distro kernel without a lot of
> > justification and regression testing.
>
> [puts PA-RISC hat on]
>
> The maintainers believe these two patches will fix a persistent
> segmentation fault problem that's blocking forward progress on the
> debian parisc port.  So, given the parisc specificity of the patches
> and the maintainer input, I very much think they fit the criteria.

Should we try to add a blurb justifying why a patch was marked for stable
in non-obvious cases? I do believe that we are often trigger-happy about
marking changes for stable, especially changes that add support for new
hardware even though it might not be that well tested (guilty myself),
or changes that introduce error handling that may be technically
correct but will not actually fix any real user issues because the
errors never trigger in real life.

Some of it comes from how easy it is for a maintainer to tag a patch for
stable as we apply it to our tree, compared to the effort of sending it
to stable at a later time.
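
For instance, something along these lines in the tag itself (made-up
example; it just reuses the existing '#' comment spot for a short
justification):

    Cc: stable@vger.kernel.org # 4.18.x: fixes user-visible oops on device unbind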

Thanks.

-- 
Dmitry

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 16:39                 ` James Bottomley
  2018-09-05 17:06                   ` Dmitry Torokhov
@ 2018-09-05 17:33                   ` Steven Rostedt
  1 sibling, 0 replies; 74+ messages in thread
From: Steven Rostedt @ 2018-09-05 17:33 UTC (permalink / raw)
  To: James Bottomley; +Cc: Greg KH, ksummit-discuss

On Wed, 05 Sep 2018 17:39:21 +0100
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> > Just randomly scrolling through those, I am wondering how at least
> > 
> > 	7797167ffde1f00446301cb22b37b7c03194cfaf
> > 	3b885ac1dc35b87a39ee176a6c7e2af9c789d8b8
> > 
> > made it past any stable tree acceptance criteria.
> > 
> > They are memory ordering changes (so exactly area which is generally 
> > fragile by itself and the risk of regressions simply can't be
> > completely ignored), yet they fix absolutely no functional issue.
> > 
> > In addition to that, they all exist upstream only for one single -rc,
> > so the public testing exposure is also currently minimal.
> > 
> > Yeah, I know I know, those are parisc, so noone cares anyway :P but
> > that's really just the first randomly chosen kernel, with a small
> > number of patches, and still 10% of them are something we'd not want
> > to put into an enterprise distro kernel without a lot of
> > justification and regression testing.  
> 
> [puts PA-RISC hat on]
> 
> The maintainers believe these two patches will fix a persistent
> segmentation fault problem that's blocking forward progress on the
> debian parisc port.  So, given the parisc specificity of the patches
> and the maintainer input, I very much think they fit the criteria.

But the change logs don't mention anything about that. The only thing
the change logs talk about is performance improvements. Was this info
embargoed for some reason?

-- Steve

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
                       ` (2 preceding siblings ...)
  2018-09-05 16:19     ` Guenter Roeck
@ 2018-09-05 18:31     ` Laura Abbott
  2018-09-05 21:23     ` Justin Forbes
  2018-09-06  2:17     ` Eduardo Valentin
  5 siblings, 0 replies; 74+ messages in thread
From: Laura Abbott @ 2018-09-05 18:31 UTC (permalink / raw)
  To: Greg KH, Justin Forbes; +Cc: ksummit

On 09/05/2018 07:42 AM, Greg KH wrote:
> On Tue, Sep 04, 2018 at 04:22:59PM -0500, Justin Forbes wrote:
>> On Tue, Sep 4, 2018 at 3:58 PM, Laura Abbott <labbott@redhat.com> wrote:
>>> I'd like to start a discussion about the stable release cycle.
>>>
>>> Fedora is a heavy user of the most recent stable trees and we
>>> generally do a pretty good job of keeping up to date. As we
>>> try and increase testing though, the stable release process
>>> gets to be a bit difficult. We often run into the problem where
>>> release .Z is officially released and then .Z+1 comes
>>> out as an -rc immediately after. Given Fedora release processes,
>>> we haven't always finished testing .Z by the time .Z+1 comes
>>> out. What to do in this situation really depends on what's in
>>> .Z and .Z+1 and how stable we think things are. This usually
>>> works out fine but a) sometimes we guess wrong and should have
>>> tested .Z more b) we're only looking to increase testing.
>>>
>>> What I'd like to see is stable updates that come on a regular
>>> schedule with a longer -rc interval, say Sunday with
>>> a one week -rc period. I understand that much of the current
>>> stable schedule is based on Greg's schedule. As a distro
>>> maintainer though, a regular release schedule with a longer
>>> testing window makes it much easier to plan and deliver something
>>> useful to our users. It's also a much easier sell for encouraging
>>> everyone to pick up every stable update if there's a known
>>> schedule. I also realize Greg is probably reading this with a very
>>> skeptical look on his face so I'd be interested to hear from
>>> other distro maintainers as well.
>>>
>>
>> This has been a fairly recent problem. There was a roughly weekly
>> cadence for a very long time and that was pretty easy to work with.  I
>> know that some of these updates do fix embargoed security issues that
>> we don't find out are actual fixes until later, but frequently in
>> those cases, the fixes are pushed well before embargo lifts, and they
>> could be fit into a weekly cadence.  Personally I don't have a problem
>> with the 3 day rc period, but pushing 2 kernels a week can be a
>> problem for users. (skipping a stable update is also a problem for
>> users.)  What I would prefer is 1 stable update per week with an
>> exception for *serious* security issues, where serious would mean
>> either real end user impact or high profile lots of press users are
>> going to be wondering where a fix is.
> 
> Laura, thanks for bringing this up.  I'll try to respond here given that
> Justin agrees with the issue of timing.
> 
> Honestly, this year has been a total shit-storm for stable due to the
> whole security mess we have been dealing with.  The number of
> totally-crazy-intrusive patches I have had to take is insane.  Combine
> that with a total lack of regard for the security issues for some arches
> (arm32 comes to mind), it's been a very rough year and I have been just
> trying to keep on top of everything.
> 
> Because of these issues (and it wasn't just spectre/meltdown, we have
> had other major fire drills in some subsystems), the release cycles have
> been quick and contain a lot of patches, sorry about that.  But that is
> reflected in Linus's tree as well, so maybe this is just the "new
> normal" that we all need to get used to.
> 

While the Spectre/Meltdown stuff was bad, I was seeing this pattern well
before all that happened as well. I do agree this may be a new normal,
which is why I brought up the discussion topic.

> I could do a "one release a week" cycle, which I would _love_ but that
> is not going to decrease the number of patches per release, it is only
> going to make them large (patch rate stays the same, and increases, no
> matter when I release)  So I had been thinking that to break the
> releases up into a "here's a hundred or so patches" per release, was a
> helpful thing to the reviewers.

I'm really not that concerned with the number of patches going in.
We'll be testing whether there's 1 patch or 300, and trying to pick
and choose tests also doesn't work. Stable updates that contain
a headline-making bug can be handled differently.

> If this assumption is incorrect, yes, I can go to one-per-week, if
> people agree that they can handle the large increase per release
> properly.  Can you all do that?
> 
> Are we going to do a "patch tuesday" like our friends in Redmond now? :)
> 
> Note, if we do pick a specific day-per-week, then anything outside of
> that cycle will cause people to look _very_ close at the release.  I
> don't know if that's a good thing or not, but be aware that it could
> cause unintended side-affects.  Personally I think the fact that we are
> _not_ regular is a good thing, no out-of-band information leakage
> happens that way.
> 

There are certainly trade-offs to be made. A side-channel for our
side-channel patches could be bad, but most people who are seriously
interested are looking already.

Thanks,
Laura

> thanks,
> 
> greg k-h
> 

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 16:26                         ` Daniel Vetter
@ 2018-09-05 19:09                           ` Sasha Levin
  2018-09-05 20:18                             ` Sasha Levin
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 19:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 06:26:17PM +0200, Daniel Vetter wrote:
>On Wed, Sep 5, 2018 at 6:19 PM, Sasha Levin
><Alexander.Levin@microsoft.com> wrote:
>> On Wed, Sep 05, 2018 at 05:54:47PM +0200, Daniel Vetter wrote:
>>>On Wed, Sep 5, 2018 at 4:05 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
>>>> On Wed, Sep 05, 2018 at 03:27:58PM +0200, Daniel Vetter wrote:
>>>>> On Wed, Sep 5, 2018 at 3:03 PM, Takashi Iwai <tiwai@suse.de> wrote:
>>>>> > On Wed, 05 Sep 2018 14:24:18 +0200,
>>>>> > James Bottomley wrote:
>>>>> >>
>>>>> >> On September 5, 2018 11:47:00 AM GMT+01:00, Mark Brown <broonie@kernel.org> wrote:
>>>>> >> >On Wed, Sep 05, 2018 at 10:58:45AM +0100, James Bottomley wrote:
>>>>> >> >
>>>>> >> >> This really shouldn't be an issue: stable trees are backported from
>>>>> >> >> upstream.  The patch (should) work in upstream, so it should work in
>>>>> >> >> stable.  There are only a few real cases you need to worry about:
>>>>> >> >
>>>>> >> >>    1. Buggy patch in upstream backported to stable. (will be caught
>>>>> >> >and
>>>>> >> >>       the fix backported soon)
>>>>> >> >>    2. Missing precursor causing issues in stable alone.
>>>>> >> >>    3. Bug introduced when hand applying.
>>>>> >> >
>>>>> >> >> The chances of one of these happening is non-zero, but the criteria
>>>>> >> >for
>>>>> >> >> stable should mean its still better odds than the odds of hitting the
>>>>> >> >> bug it was fixing.
>>>>> >> >
>>>>> >> >Some of those are substantial enough to be worth worrying about,
>>>>> >> >especially the missing precursor issues.  It's rarely an issue with the
>>>>> >> >human generated backports but the automated ones don't have a sense of
>>>>> >> >context in the selection.
>>>>> >> >
>>>>> >> >There's also a risk/reward tradeoff to consider with more minor issues,
>>>>> >> >especially performance related ones.  We want people to be enthusiastic
>>>>> >> >about taking stable updates and every time they find a problem with a
>>>>> >> >backport that works against them doing that.
>>>>> >>
>>>>> >> I absolutely agree.  That's why I said our process is expediency
>>>>> >> based:  you have to trade off the value of applying the patch vs the
>>>>> >> probability of introducing bugs.  However the maintainers are mostly
>>>>> >> considering this which is why stable is largely free from trivial
>>>>> >> but pointless patches.  The rule should be: if it doesn't fix a user
>>>>> >> visible bug, it doesn't go into stable.
>>>>> >
>>>>> > Right, and here the current AUTOSEL (and some other not-stable-marked)
>>>>> > patches coming to a gray zone.  The picked-up patches are often right
>>>>> > as "some" fixes, but they are not necessarily qualified as "stable
>>>>> > fixes".
>>>>> >
>>>>> > How about allowing to change the choice of AUTOSEL to be opt-in and
>>>>> > opt-out, depending on the tree?  In my case, usually the patches
>>>>> > caught by AUTOSEL aren't really the patches with forgotten stable
>>>>> > marker, but rather left intentionally by various reasons.  Most of
>>>>> > them are fine to apply in anyway, but it was uncertain whether they
>>>>> > are really needed / qualifying as stable fixes.  So, I'd be happy to
>>>>> > see them as opt-in, i.e. applied only via manual approval.
>>>>> >
>>>>> > Meanwhile, some trees have no stable-maintenance, and AUTOSEL would
>>>>> > help for them.  They can be opt-out, i.e. kept until someone rejects.
>>>>>
>>>>> +1 on AUTOSEL opt-in. It's annyoing at best, when it backports cleanup
>>>>> patches (because somehow those look like stealthy security fixes
>>>>> sometimes) and breaks a bunch of people's boxes for no good reason.
>>>>>
>>>>> In general it'd be really good if -stable had a clearer audit path.
>>>>> Every patch have a recorded reason why it's being applied (e.g. Cc:
>>>>> stable in upstream, Link to the lkml thread/bug report, AUTOSEL mail,
>>>>> whatever), so that after the fact I can figure out why a -stable patch
>>>>> happend, that would be really great. Atm -stable occasionally blows
>>>>> up, with a patch we didn't mark as cc: stable, and we have no idea
>>>>> whyiit showed up in -stable even. That makes it really hard to do
>>>>> better next time around.
>>>>
>>>> I try to keep the audit thread here, as I get asked all the time why
>>>> stuff got added.
>>>>
>>>> Here's what I do, it's not exactly obvious, sorry:
>>>>         - if it came from a stable@ tag, just leave it alone and add my
>>>>           signed-off-by
>>>>         - if it was manually requested by someone, I add a "cc:
>>>>           requestor" to the signed-off-by area and add my s-o-b
>>>
>>>Cc-stable-requested-by: would be more obvious. If you have, lkml
>>>archive link with the bug report is even better.
>>>
>>>An additional quirk in drm is that we have committers, so normal Cc:
>>>rules (author + committer + anyone already on Cc:) has a good chance
>>>of leaving out maintainers. And generally committers don't care one
>>>bit about some multi-year old LTS kernel, not their job ... You'll
>>>never get any review from them.
>>>
>>>>         - if it came from Sasha's tree, Sasha's s-o-b is on it
>>>
>>>How do things end up in Sasha's tree? Is that just AUTOSEL, or also
>>>other patches?
>>
>> Just autosel. Other patches take the regular way into Stable.
>>
>>>>         - if it came from David Miller's patchset, his s-o-b is on it.
>>>
>>>Ok, that's netdev and Dave knows what's he doing :-)
>>>
>>>> That should cover all types of patches currently going into the trees,
>>>> right?
>>>>
>>>> So always, you can cc: everyone on the s-o-b area and get the people
>>>> involved in the patch and someone involved in reviewing it for stable
>>>> inclusion.
>>>
>>>Let's pick a concrete example:
>>>
>>>commit c81350c31d0d20661a0aa839b79182bcb0e7a45d
>>>Author: Satendra Singh Thakur <satendra.t@samsung.com>
>>>Date:   Thu May 3 11:19:32 2018 +0530
>>>
>>>    drm/atomic: Handling the case when setting old crtc for plane
>>>
>>>    [ Upstream commit fc2a69f3903dfd97cd47f593e642b47918c949df ]
>>>
>>>    In the func drm_atomic_set_crtc_for_plane, with the current code,
>>>    if crtc of the plane_state and crtc passed as argument to the func
>>>    are same, entire func will executed in vein.
>>>    It will get state of crtc and clear and set the bits in plane_mask.
>>>    All these steps are not required for same old crtc.
>>>    Ideally, we should do nothing in this case, this patch handles the same,
>>>    and causes the program to return without doing anything in such scenario.
>>>
>>>    Signed-off-by: Satendra Singh Thakur <satendra.t@samsung.com>
>>>    Cc: Madhur Verma <madhur.verma@samsung.com>
>>>    Cc: Hemanshu Srivastava <hemanshu.s@samsung.com>
>>>    Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>    Link: https://patchwork.freedesktop.org/patch/msgid/1525326572-25854-1-git-send-email-satendra.t@samsung.com
>>>    Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
>>>    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>
>>>Upstream patch doesn't have a cc: stable. I tried looking for it in my
>>>mail archives (and it's a patch committed by myself, so I guess I'll
>>>get cc'ed?), didn't find anything.
>>
>> I'm really not sure why you don't see the mail. Can you maybe see if it
>> got filtered as spam?
>
>Nothing in spam either. Maybe gmail cleaned it out already.
>
>>>I have no idea why this got added at all. Looking at the discussion on
>>>dri-devel, it's purely a cleanup for consistency with another
>>>function. And it blew up :-/
>>
>> On the flip side, what about:
>>
>> commit 3fd34ac02ae8cc20d78e3aed2cf6e67f0ae109ea
>> Author: Hang Yuan <hang.yuan@linux.intel.com>
>> Date:   Mon Jul 23 20:15:46 2018 +0800
>>
>>     drm/i915/gvt: fix cleanup sequence in intel_gvt_clean_device
>>
>>     Create one vGPU and then unbind IGD device from i915 driver. The following
>>     oops will happen. This patch will free vgpu resource first and then gvt
>>     resource to remove these oops.
>>
>>     BUG: unable to handle kernel NULL pointer dereference at       00000000000000a8
>>       PGD 80000003c9d2c067 P4D 80000003c9d2c067 PUD 3c817c067 P      MD 0
>>       Oops: 0002 [#1] SMP PTI
>>       RIP: 0010:down_write+0x1b/0x40
>>     Call Trace:
>>       debugfs_remove_recursive+0x46/0x1a0
>>       intel_gvt_debugfs_remove_vgpu+0x15/0x30 [i915]
>>       intel_gvt_destroy_vgpu+0x2d/0xf0 [i915]
>>       intel_vgpu_remove+0x2c/0x30 [kvmgt]
>>       mdev_device_remove_ops+0x23/0x50 [mdev]
>>       mdev_device_remove+0xdb/0x190 [mdev]
>>       mdev_device_remove+0x190/0x190 [mdev]
>>       device_for_each_child+0x47/0x90
>>       mdev_unregister_device+0xd5/0x120 [mdev]
>>       intel_gvt_clean_device+0x91/0x120 [i915]
>>       i915_driver_unload+0x9d/0x120 [i915]
>>       i915_pci_remove+0x15/0x20 [i915]
>>       pci_device_remove+0x3b/0xc0
>>       device_release_driver_internal+0x157/0x230
>>       unbind_store+0xfc/0x150
>>       kernfs_fop_write+0x10f/0x180
>>       __vfs_write+0x36/0x180
>>       ? common_file_perm+0x41/0x130
>>       ? _cond_resched+0x16/0x40
>>       vfs_write+0xb3/0x1a0
>>       ksys_write+0x52/0xc0
>>       do_syscall_64+0x55/0x100
>>       entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>>     BUG: unable to handle kernel NULL pointer dereference at 0      000000000000038
>>       PGD 8000000405bce067 P4D 8000000405bce067 PUD 405bcd067 PM      D 0
>>       Oops: 0000 [#1] SMP PTI
>>       RIP: 0010:hrtimer_active+0x5/0x40
>>     Call Trace:
>>       hrtimer_try_to_cancel+0x25/0x120
>>       ? tbs_sched_clean_vgpu+0x1f/0x50 [i915]
>>       hrtimer_cancel+0x15/0x20
>>       intel_gvt_destroy_vgpu+0x4c/0xf0 [i915]
>>       intel_vgpu_remove+0x2c/0x30 [kvmgt]
>>       mdev_device_remove_ops+0x23/0x50 [mdev]
>>       mdev_device_remove+0xdb/0x190 [mdev]
>>       ? mdev_device_remove+0x190/0x190 [mdev]
>>       device_for_each_child+0x47/0x90
>>       mdev_unregister_device+0xd5/0x120 [mdev]
>>       intel_gvt_clean_device+0x89/0x120 [i915]
>>       i915_driver_unload+0x9d/0x120 [i915]
>>       i915_pci_remove+0x15/0x20 [i915]
>>       pci_device_remove+0x3b/0xc0
>>       device_release_driver_internal+0x157/0x230
>>       unbind_store+0xfc/0x150
>>       kernfs_fop_write+0x10f/0x180
>>       __vfs_write+0x36/0x180
>>       ? common_file_perm+0x41/0x130
>>       ? _cond_resched+0x16/0x40
>>       vfs_write+0xb3/0x1a0
>>       ksys_write+0x52/0xc0
>>       do_syscall_64+0x55/0x100
>>       entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>>     Fixes: bc7b0be316ae("drm/i915/gvt: Add basic debugfs infrastructure")
>>     Fixes: afe04fbe6c52("drm/i915/gvt: create an idle vGPU")
>>     Signed-off-by: Hang Yuan <hang.yuan@linux.intel.com>
>>     Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
>>
>> Which wasn't tagged for (and is not in any) stable trees?
>
>Not stable material, it fixes just a driver unload bug. That's for
>developers only. Worst case you break some user's box for this, which
>I don't think is cool. Since we're a 100% upstream driver team this
>won't harm developers if it's not backported.
>
>Note that because of fbcon and other reasons, an rmmod i915 will fail.
>You need to enable a bunch of CONFIG_EXPERT options (with scary texts
>and stuff) and have a script from our test suite to be able to even
>make this happen.

Hm, how does that work?

On an Ubuntu 4.18 kernel I can remove i915 just by:

root@jumpy:~# echo 1 > /sys/devices/pci0000\:00/0000\:00\:02.0/remove
root@jumpy:~# sudo rmmod i915


--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 19:09                           ` Sasha Levin
@ 2018-09-05 20:18                             ` Sasha Levin
  2018-09-05 20:33                               ` Daniel Vetter
  0 siblings, 1 reply; 74+ messages in thread
From: Sasha Levin @ 2018-09-05 20:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 05, 2018 at 03:09:02PM -0400, Sasha Levin wrote:
>On Wed, Sep 05, 2018 at 06:26:17PM +0200, Daniel Vetter wrote:
>>Note that because of fbcon and other reasons, an rmmod i915 will fail.
>>You need to enable a bunch of CONFIG_EXPERT options (with scary texts
>>and stuff) and have a script from our test suite to be able to even
>>make this happen.
>
>Hm, how does that work?
>
>On an Ubuntu 4.18 kernel I can remove i915 just by:
>
>root@jumpy:~# echo 1 > /sys/devices/pci0000\:00/0000\:00\:02.0/remove
>root@jumpy:~# sudo rmmod i915

... which apparently spews a warning and hangs the system:

[  802.065412] PCH DPLL A assertion failure (expected on, current off)
[  802.065574] WARNING: CPU: 1 PID: 804 at drivers/gpu/drm/i915/intel_dpll_mgr.c:124 assert_shared_dpll+0x106/0x120 [i915]
[  802.065576] Modules linked in: intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass arc4 snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc iwldvm snd_hda_codec_conexant snd_hda_codec_generic mac80211 gpio_ich snd_hda_intel iwlwifi snd_hda_codec aesni_intel aes_x86_64 crypto_simd snd_hda_core cryptd glue_helper snd_hwdep input_leds intel_cstate intel_rapl_perf joydev snd_pcm serio_raw cfg80211 wmi_bmof snd_timer lpc_ich mei_me mei thinkpad_acpi nvram snd soundcore mac_hid binfmt_misc sch_fq_codel coretemp ip_tables x_tables autofs4 i915 i2c_algo_bit drm_kms_helper psmouse syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops sdhci_pci ahci drm cqhci libahci sdhci wmi video
[  802.065678] CPU: 1 PID: 804 Comm: bash Not tainted 4.18.6-041806-generic #201809050847
[  802.065681] Hardware name: LENOVO 4236AT9/4236AT9, BIOS 83ET66WW (1.36 ) 10/31/2011
[  802.065769] RIP: 0010:assert_shared_dpll+0x106/0x120 [i915]
[  802.065770] Code: c7 c2 42 cc 36 c0 48 8b 83 88 00 00 00 48 89 f1 48 c7 c7 40 49 38 c0 48 0f 44 ca 45 84 e4 48 0f 45 d6 48 8b 30 e8 5c 60 7b dc <0f> 0b e9 40 ff ff ff e8 ae 5d 7b dc 66 66 2e 0f 1f 84 00 00 00 00
[  802.065883] RSP: 0018:ffffba82014d3a28 EFLAGS: 00010282
[  802.065887] RAX: 0000000000000000 RBX: ffff9a3e8977e350 RCX: 0000000000000006
[  802.065890] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9a3e9e2564b0
[  802.065893] RBP: ffffba82014d3ac0 R08: 0000000000000001 R09: 0000000000000341
[  802.065895] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000001
[  802.065898] R13: 0000000000000001 R14: ffff9a3e89778000 R15: ffff9a3e8977e6b8
[  802.065902] FS:  00007fcd73414740(0000) GS:ffff9a3e9e240000(0000) knlGS:0000000000000000
[  802.065905] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  802.065908] CR2: 000055cf3a21ae00 CR3: 00000002137b4002 CR4: 00000000000606e0
[  802.065911] Call Trace:
[  802.065926]  ? __cancel_work_timer+0x110/0x190
[  802.066003]  ? intel_set_pch_fifo_underrun_reporting+0xec/0x180 [i915]
[  802.066082]  intel_disable_shared_dpll+0xa2/0x140 [i915]
[  802.066162]  intel_atomic_commit_tail+0x210/0xd10 [i915]
[  802.066244]  intel_atomic_commit+0x2b8/0x2f0 [i915]
[  802.066290]  drm_atomic_commit+0x4a/0x50 [drm]
[  802.066315]  __drm_atomic_helper_disable_all+0x190/0x1d0 [drm_kms_helper]
[  802.066335]  drm_atomic_helper_shutdown+0x5c/0xb0 [drm_kms_helper]
[  802.066397]  i915_driver_unload+0x8a/0x120 [i915]
[  802.066462]  i915_pci_remove+0x19/0x30 [i915]
[  802.066470]  pci_device_remove+0x3e/0xc0
[  802.066479]  device_release_driver_internal+0x18c/0x250
[  802.066486]  device_release_driver+0x12/0x20
[  802.066492]  pci_stop_bus_device+0x82/0xa0
[  802.066498]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
[  802.066504]  remove_store+0x7c/0x90
[  802.066509]  dev_attr_store+0x1b/0x30
[  802.066515]  sysfs_kf_write+0x3b/0x50
[  802.066520]  kernfs_fop_write+0x12e/0x1b0
[  802.066526]  __vfs_write+0x3a/0x190
[  802.066533]  ? common_file_perm+0x4d/0x140
[  802.066538]  ? apparmor_file_permission+0x1a/0x20
[  802.066546]  ? security_file_permission+0x2f/0xb0
[  802.066554]  ? _cond_resched+0x19/0x30
[  802.066559]  vfs_write+0xab/0x1b0
[  802.066564]  ksys_write+0x55/0xc0
[  802.066569]  __x64_sys_write+0x1a/0x20
[  802.066576]  do_syscall_64+0x5a/0x110
[  802.066583]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  802.066588] RIP: 0033:0x7fcd72ae9154
[  802.066589] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 b1 07 2e 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 49 89 d4 53 48 89 f5
[  802.066668] RSP: 002b:00007ffdc88ea2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  802.066673] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fcd72ae9154
[  802.066676] RDX: 0000000000000002 RSI: 0000561e2995ec20 RDI: 0000000000000001
[  802.066678] RBP: 0000561e2995ec20 R08: 000000000000000a R09: 0000000000000001
[  802.066681] R10: 000000000000000a R11: 0000000000000246 R12: 00007fcd72dc5760
[  802.066683] R13: 0000000000000002 R14: 00007fcd72dc12a0 R15: 00007fcd72dc0760
[  802.066689] ---[ end trace 45b9f0b00b7e9277 ]---

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 20:18                             ` Sasha Levin
@ 2018-09-05 20:33                               ` Daniel Vetter
  0 siblings, 0 replies; 74+ messages in thread
From: Daniel Vetter @ 2018-09-05 20:33 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, Sep 5, 2018 at 10:18 PM, Sasha Levin
<Alexander.Levin@microsoft.com> wrote:
> On Wed, Sep 05, 2018 at 03:09:02PM -0400, Sasha Levin wrote:
>>On Wed, Sep 05, 2018 at 06:26:17PM +0200, Daniel Vetter wrote:
>>>Note that because of fbcon and other reasons, an rmmod i915 will fail.
>>>You need to enable a bunch of CONFIG_EXPERT options (with scary texts
>>>and stuff) and have a script from our test suite to be able to even
>>>make this happen.
>>
>>Hm, how does that work?
>>
>>On an Ubuntu 4.18 kernel I can remove i915 just by:
>>
>>root@jumpy:~# echo 1 > /sys/devices/pci0000\:00/0000\:00\:02.0/remove
>>root@jumpy:~# sudo rmmod i915
>
> ... which apparently spews a warning and hangs the system:

Yeah, don't do that, we know about it. The force-unplug is what kills
your box; without that you can't even rmmod. Fixing i915+drm core to
properly refcount everything so this doesn't blow up anymore is
probably a few man-years of effort. You can unload the entire thing,
but only if you first make sure everyone stops using the driver.
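
To give an idea of what "make sure everyone stops using the driver"
means in practice, a rough sketch only -- the display manager unit and
which vtconN is the framebuffer console vary per system, and you still
need the CONFIG_EXPERT bits mentioned above:

  # stop whatever holds a DRM fd open (display manager, compositor, ...)
  systemctl stop gdm.service
  # detach fbcon from the framebuffer so the console drops its reference
  echo 0 > /sys/class/vtconsole/vtcon1/bind
  # only now does rmmod stand a chance
  rmmod i915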

Like I said, not stable material.
-Daniel

>
> [  802.065412] PCH DPLL A assertion failure (expected on, current off)
> [  802.065574] WARNING: CPU: 1 PID: 804 at drivers/gpu/drm/i915/intel_dpll_mgr.c:124 assert_shared_dpll+0x106/0x120 [i915]
> [  802.065576] Modules linked in: intel_rapl x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm irqbypass arc4 snd_hda_codec_hdmi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc iwldvm snd_hda_codec_conexant snd_hda_codec_generic mac80211 gpio_ich snd_hda_intel iwlwifi snd_hda_codec aesni_intel aes_x86_64 crypto_simd snd_hda_core cryptd glue_helper snd_hwdep input_leds intel_cstate intel_rapl_perf joydev snd_pcm serio_raw cfg80211 wmi_bmof snd_timer lpc_ich mei_me mei thinkpad_acpi nvram snd soundcore mac_hid binfmt_misc sch_fq_codel coretemp ip_tables x_tables autofs4 i915 i2c_algo_bit drm_kms_helper psmouse syscopyarea e1000e sysfillrect sysimgblt fb_sys_fops sdhci_pci ahci drm cqhci libahci sdhci wmi video
> [  802.065678] CPU: 1 PID: 804 Comm: bash Not tainted 4.18.6-041806-generic #201809050847
> [  802.065681] Hardware name: LENOVO 4236AT9/4236AT9, BIOS 83ET66WW (1.36 ) 10/31/2011
> [  802.065769] RIP: 0010:assert_shared_dpll+0x106/0x120 [i915]
> [  802.065770] Code: c7 c2 42 cc 36 c0 48 8b 83 88 00 00 00 48 89 f1 48 c7 c7 40 49 38 c0 48 0f 44 ca 45 84 e4 48 0f 45 d6 48 8b 30 e8 5c 60 7b dc <0f> 0b e9 40 ff ff ff e8 ae 5d 7b dc 66 66 2e 0f 1f 84 00 00 00 00
> [  802.065883] RSP: 0018:ffffba82014d3a28 EFLAGS: 00010282
> [  802.065887] RAX: 0000000000000000 RBX: ffff9a3e8977e350 RCX: 0000000000000006
> [  802.065890] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9a3e9e2564b0
> [  802.065893] RBP: ffffba82014d3ac0 R08: 0000000000000001 R09: 0000000000000341
> [  802.065895] R10: 0000000000000004 R11: 0000000000000000 R12: 0000000000000001
> [  802.065898] R13: 0000000000000001 R14: ffff9a3e89778000 R15: ffff9a3e8977e6b8
> [  802.065902] FS:  00007fcd73414740(0000) GS:ffff9a3e9e240000(0000) knlGS:0000000000000000
> [  802.065905] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  802.065908] CR2: 000055cf3a21ae00 CR3: 00000002137b4002 CR4: 00000000000606e0
> [  802.065911] Call Trace:
> [  802.065926]  ? __cancel_work_timer+0x110/0x190
> [  802.066003]  ? intel_set_pch_fifo_underrun_reporting+0xec/0x180 [i915]
> [  802.066082]  intel_disable_shared_dpll+0xa2/0x140 [i915]
> [  802.066162]  intel_atomic_commit_tail+0x210/0xd10 [i915]
> [  802.066244]  intel_atomic_commit+0x2b8/0x2f0 [i915]
> [  802.066290]  drm_atomic_commit+0x4a/0x50 [drm]
> [  802.066315]  __drm_atomic_helper_disable_all+0x190/0x1d0 [drm_kms_helper]
> [  802.066335]  drm_atomic_helper_shutdown+0x5c/0xb0 [drm_kms_helper]
> [  802.066397]  i915_driver_unload+0x8a/0x120 [i915]
> [  802.066462]  i915_pci_remove+0x19/0x30 [i915]
> [  802.066470]  pci_device_remove+0x3e/0xc0
> [  802.066479]  device_release_driver_internal+0x18c/0x250
> [  802.066486]  device_release_driver+0x12/0x20
> [  802.066492]  pci_stop_bus_device+0x82/0xa0
> [  802.066498]  pci_stop_and_remove_bus_device_locked+0x1a/0x30
> [  802.066504]  remove_store+0x7c/0x90
> [  802.066509]  dev_attr_store+0x1b/0x30
> [  802.066515]  sysfs_kf_write+0x3b/0x50
> [  802.066520]  kernfs_fop_write+0x12e/0x1b0
> [  802.066526]  __vfs_write+0x3a/0x190
> [  802.066533]  ? common_file_perm+0x4d/0x140
> [  802.066538]  ? apparmor_file_permission+0x1a/0x20
> [  802.066546]  ? security_file_permission+0x2f/0xb0
> [  802.066554]  ? _cond_resched+0x19/0x30
> [  802.066559]  vfs_write+0xab/0x1b0
> [  802.066564]  ksys_write+0x55/0xc0
> [  802.066569]  __x64_sys_write+0x1a/0x20
> [  802.066576]  do_syscall_64+0x5a/0x110
> [  802.066583]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [  802.066588] RIP: 0033:0x7fcd72ae9154
> [  802.066589] Code: 89 02 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 8d 05 b1 07 2e 00 8b 00 85 c0 75 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 f3 c3 66 90 41 54 55 49 89 d4 53 48 89 f5
> [  802.066668] RSP: 002b:00007ffdc88ea2c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
> [  802.066673] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fcd72ae9154
> [  802.066676] RDX: 0000000000000002 RSI: 0000561e2995ec20 RDI: 0000000000000001
> [  802.066678] RBP: 0000561e2995ec20 R08: 000000000000000a R09: 0000000000000001
> [  802.066681] R10: 000000000000000a R11: 0000000000000246 R12: 00007fcd72dc5760
> [  802.066683] R13: 0000000000000002 R14: 00007fcd72dc12a0 R15: 00007fcd72dc0760
> [  802.066689] ---[ end trace 45b9f0b00b7e9277 ]---
>
> --
> Thanks,
> Sasha



-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:06                     ` Sasha Levin
@ 2018-09-05 21:02                       ` Jiri Kosina
  0 siblings, 0 replies; 74+ messages in thread
From: Jiri Kosina @ 2018-09-05 21:02 UTC (permalink / raw)
  To: Sasha Levin; +Cc: James Bottomley, Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Sasha Levin wrote:

> >And do you honestly think they should be marked for stable tree in the
> >first place?
> 
> If you can't trust a maintainer's judgement about his very own subsystem
> then you're shit out of luck. 

I said this already in this thread, but let me state it explicitly --
I see the trust required for mainlining (total focus on development)
and the trust required for inclusion into stable (total focus on
stability) as two distinct, non-equal qualities of a maintainer.

-- 
Jiri Kosina
SUSE Labs

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
                       ` (3 preceding siblings ...)
  2018-09-05 18:31     ` Laura Abbott
@ 2018-09-05 21:23     ` Justin Forbes
  2018-09-06  2:17     ` Eduardo Valentin
  5 siblings, 0 replies; 74+ messages in thread
From: Justin Forbes @ 2018-09-05 21:23 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit

On Wed, Sep 5, 2018 at 9:42 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Tue, Sep 04, 2018 at 04:22:59PM -0500, Justin Forbes wrote:
>> On Tue, Sep 4, 2018 at 3:58 PM, Laura Abbott <labbott@redhat.com> wrote:
>> > I'd like to start a discussion about the stable release cycle.
>> >
>> > Fedora is a heavy user of the most recent stable trees and we
>> > generally do a pretty good job of keeping up to date. As we
>> > try and increase testing though, the stable release process
>> > gets to be a bit difficult. We often run into the problem where
>> > release .Z is officially released and then .Z+1 comes
>> > out as an -rc immediately after. Given Fedora release processes,
>> > we haven't always finished testing .Z by the time .Z+1 comes
>> > out. What to do in this situation really depends on what's in
>> > .Z and .Z+1 and how stable we think things are. This usually
>> > works out fine but a) sometimes we guess wrong and should have
>> > tested .Z more b) we're only looking to increase testing.
>> >
>> > What I'd like to see is stable updates that come on a regular
>> > schedule with a longer -rc interval, say Sunday with
>> > a one week -rc period. I understand that much of the current
>> > stable schedule is based on Greg's schedule. As a distro
>> > maintainer though, a regular release schedule with a longer
>> > testing window makes it much easier to plan and deliver something
>> > useful to our users. It's also a much easier sell for encouraging
>> > everyone to pick up every stable update if there's a known
>> > schedule. I also realize Greg is probably reading this with a very
>> > skeptical look on his face so I'd be interested to hear from
>> > other distro maintainers as well.
>> >
>>
>> This has been a fairly recent problem. There was a roughly weekly
>> cadence for a very long time and that was pretty easy to work with.  I
>> know that some of these updates do fix embargoed security issues that
>> we don't find out are actual fixes until later, but frequently in
>> those cases, the fixes are pushed well before embargo lifts, and they
>> could be fit into a weekly cadence.  Personally I don't have a problem
>> with the 3 day rc period, but pushing 2 kernels a week can be a
>> problem for users. (skipping a stable update is also a problem for
>> users.)  What I would prefer is 1 stable update per week with an
>> exception for *serious* security issues, where serious would mean
>> either real end user impact or high profile lots of press users are
>> going to be wondering where a fix is.
>
> Laura, thanks for bringing this up.  I'll try to respond here given that
> Justin agrees with the issue of timing.
>
> Honestly, this year has been a total shit-storm for stable due to the
> whole security mess we have been dealing with.  The number of
> totally-crazy-intrusive patches I have had to take is insane.  Combine
> that with a total lack of regard for the security issues for some arches
> (arm32 comes to mind), it's been a very rough year and I have been just
> trying to keep on top of everything.
>
> Because of these issues (and it wasn't just spectre/meltdown, we have
> had other major fire drills in some subsystems), the release cycles have
> been quick and contain a lot of patches, sorry about that.  But that is
> reflected in Linus's tree as well, so maybe this is just the "new
> normal" that we all need to get used to.
>
Yeah, this year has been tough; I completely understand that. Though,
with the exception of the spectre/meltdown bits, we tend to get the
patches out well before embargoes are lifted, because the patches
themselves do not point out the issue. I don't think changing to a
weekly cadence would be a problem here.  And of course there can be
exceptions. It just seems that this year the overall cadence has
doubled: 1 release a week is more of an exception, and 2 is the new
normal.

> I could do a "one release a week" cycle, which I would _love_ but that
> is not going to decrease the number of patches per release, it is only
> going to make them large (patch rate stays the same, and increases, no
> matter when I release)  So I had been thinking that to break the
> releases up into a "here's a hundred or so patches" per release, was a
> helpful thing to the reviewers.
>
> If this assumption is incorrect, yes, I can go to one-per-week, if
> people agree that they can handle the large increase per release
> properly.  Can you all do that?

I would be happy with this (the exception being serious security
issues, as noted before).  The number of patches going in doesn't
matter as much; I review them when they hit queue-4.xx, not when they
are sent out for rc.  The issue for us is twofold: pushing 2 kernel
updates per week to users is unwieldy, but skipping releases is also
problematic, and serving a community is a balance between the two.  I
honestly don't think there is much to gain from extending the rc phase
either -- 3 days is fine -- because I look at the actual patches when
they hit the queue, so the rc is just build/test.  Of course we aren't
the only distro, and I am not the only Fedora maintainer, so take this
as one voice.
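
For reference, "when they hit queue-4.xx" just means watching Greg's
stable-queue git tree, where pending patches sit as plain files before
any -rc is mailed out.  A rough sketch, with the queue directory
obviously depending on the branch you care about:

  git clone https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
  cd stable-queue/queue-4.18
  # each pending patch is a plain file here, applied in 'series' order
  less series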

>
> Are we going to do a "patch tuesday" like our friends in Redmond now? :)
>
> Note, if we do pick a specific day-per-week, then anything outside of
> that cycle will cause people to look _very_ close at the release.  I
> don't know if that's a good thing or not, but be aware that it could
> cause unintended side-effects.  Personally I think the fact that we are
> _not_ regular is a good thing, no out-of-band information leakage
> happens that way.

I don't see any real value in having a specific day of the week in
this regard.  A lot of things work around your travel schedule and
such, and when an embargoed issue is set to drop, it might be easier
to move the release day of the week to coincide with that.  I see more
downside to a specific day than I do upside.

> thanks,
>
> greg k-h

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 14:42   ` Greg KH
                       ` (4 preceding siblings ...)
  2018-09-05 21:23     ` Justin Forbes
@ 2018-09-06  2:17     ` Eduardo Valentin
  5 siblings, 0 replies; 74+ messages in thread
From: Eduardo Valentin @ 2018-09-06  2:17 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit, Justin Forbes

Hey,

On Wed, Sep 05, 2018 at 04:42:33PM +0200, Greg KH wrote:
> On Tue, Sep 04, 2018 at 04:22:59PM -0500, Justin Forbes wrote:

<cut>

> 
> Laura, thanks for bringing this up.  I'll try to respond here given that
> Justin agrees with the issue of timing.
> 
> Honestly, this year has been a total shit-storm for stable due to the
> whole security mess we have been dealing with.  The number of
> totally-crazy-intrusive patches I have had to take is insane.  Combine
> that with a total lack of regard for the security issues for some arches
> (arm32 comes to mind), it's been a very rough year and I have been just
> trying to keep on top of everything.
>

Cannot agree more :-)

> Because of these issues (and it wasn't just spectre/meltdown, we have
> had other major fire drills in some subsystems), the release cycles have
> been quick and contain a lot of patches, sorry about that.  But that is
> reflected in Linus's tree as well, so maybe this is just the "new
> normal" that we all need to get used to.
> 
> I could do a "one release a week" cycle, which I would _love_ but that
> is not going to decrease the number of patches per release, it is only
> going to make them large (patch rate stays the same, and increases, no
> matter when I release)  So I had been thinking that to break the
> releases up into a "here's a hundred or so patches" per release, was a
> helpful thing to the reviewers.
> 
> If this assumption is incorrect, yes, I can go to one-per-week, if
> people agree that they can handle the large increase per release
> properly.  Can you all do that?
> 

I think the fixed schedule is fine for us. But as any good rule has
exceptions, I am assuming there will still be exceptions to the proposed
1 release per week, right? For example, will the insane security backports,
like meltdown / spectre, still follow the cadence, or are they gonna be
out of order, as their nature seems to be :-) ?


> Are we going to do a "patch tuesday" like our friends in Redmond now? :)
> 
> Note, if we do pick a specific day-per-week, then anything outside of
> that cycle will cause people to look _very_ close at the release.  I
> don't know if that's a good thing or not, but be aware that it could
> cause unintended side-effects.  Personally I think the fact that we are
> _not_ regular is a good thing, no out-of-band information leakage
> happens that way.

Well, the ability to do a release per major change/backport (L1TF/Meltdown)
may be one thing to consider, yes. I agree that for these cases, having
out-of-order releases may be a good thing.

Now, with that said, the testing effort will not change with or without
a cadence, as the amount and rate of patches won't really change.

How about having a couple of stable -rcs per week and then finishing the
week with one release? That would give an opportunity to people who want
to spread the testing effort over smaller chunks by testing every rc,
and also to those who want a longer release cycle of one release per
week. That would follow what is done by Linus.
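
Concretely, the stable -rc candidates are already published as git
branches, so a distro or CI pipeline could poll and build them on
whatever cadence it likes.  A rough sketch -- branch name assumed for
the 4.18 series, and the config step is just one way to get something
bootable:

  git clone --branch linux-4.18.y \
      https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
  cd linux-stable-rc
  cp /boot/config-$(uname -r) .config && make olddefconfig
  make -j$(nproc)
  # boot it, run your test suite, report regressions to stable@vger.kernel.org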

BR, Eduardo

> 
> thanks,
> 
> greg k-h
> _______________________________________________
> Ksummit-discuss mailing list
> Ksummit-discuss@lists.linuxfoundation.org
> https://lists.linuxfoundation.org/mailman/listinfo/ksummit-discuss

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05  1:17           ` Laura Abbott
@ 2018-09-06  3:56             ` Benjamin Gilbert
  0 siblings, 0 replies; 74+ messages in thread
From: Benjamin Gilbert @ 2018-09-06  3:56 UTC (permalink / raw)
  To: ksummit-discuss

On Tue, Sep 04, 2018 at 06:17:46PM -0700, Laura Abbott wrote:
> On 09/04/2018 04:43 PM, Guenter Roeck wrote:
> > On 09/04/2018 04:14 PM, Sasha Levin wrote:
> >> Maybe some concrete numbers will help here. Do you maybe know how many
> >> commits in the past year snuck past the -rc cycle into a stable release
> >> and found as buggy by Fedora's testing pipeline?
> > 
> > ... and how many bugs were found during the existing test cycle ?
> > 
> > The next question would be how many regressions were reported by users
> > after a release was published.
> > 
> > The statistics I carried until early this year suggested a regression rate
> > of around 0.15% for stable releases, where regression means that a bug was
> > found post-release and had to be fixed later. It would indeed be interesting
> > to know how many of those were found by (automated ?) testing and how many
> > were found by users.
> 
> I'd have to do some digging through bugzilla to get numbers. Some
> of this is also motivated by discussions with the CoreOS team who
> have also tried to use the stable kernels and ran into problems.
> I'll see if I can get some numbers.

We've shipped 11 different 4.14.x kernels on the CoreOS Container Linux
stable channel, all in this calendar year.  Five of them had user-impacting
regressions that had to be fixed via OS updates:

4.14.30 - vxlan panic
          https://github.com/coreos/bugs/issues/2382
4.14.42 - Failure to set MTU in xen-netfront
          https://github.com/coreos/bugs/issues/2443
4.14.44 - Failure to bring up hv_netvsc interface after it was brought down
          https://github.com/coreos/bugs/issues/2454
4.14.48 - Integer overflow causing tiny TCP receive windows
          https://github.com/coreos/bugs/issues/2457
4.14.55 - Broken CIFS client
          https://github.com/coreos/bugs/issues/2480
4.14.55 - Failure to mount ext4 filesystems 3 TB or larger
          https://github.com/coreos/bugs/issues/2485

The TCP window bug was particularly exciting, since affected machines would
have downloaded the fixed OS image at ~300 bytes/sec.  To avoid that, we had
to implement several workarounds in our update infrastructure, only the
second time we've had to do that in the history of Container Linux.

--Benjamin Gilbert

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-05 15:18             ` Steven Rostedt
@ 2018-09-06  8:48               ` Thomas Gleixner
  2018-09-06 12:47                 ` Thomas Gleixner
  0 siblings, 1 reply; 74+ messages in thread
From: Thomas Gleixner @ 2018-09-06  8:48 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Greg KH, ksummit-discuss

On Wed, 5 Sep 2018, Steven Rostedt wrote:
> It's not the distros that need convincing, it's the vendors that pay to
> have it done. When I first started at Red Hat and was told about the
> "Stable Kernel ABI", the person telling me about this (a very
> established kernel developer) also said "Yeah it really sucks, but
> companies are willing to pay a shite load of money to have it done". And
> it's in the distros best interest to get that shite load of money. It
> also funds the same developers to do this work, and hopefully continue
> to help upstream as well.
> 
> If we remove that nasty work, these companies won't need to continue
> paying that shite load anymore, and they may not be able to afford
> paying these talented developers.

Come on. Unless you have hard evidence for this, you are merely
proliferating a decades-old distro fairy tale.

Seriously, those old kernels are part of the revenue stream, but if any
vendor's engineering investment, which is securing the future, depends
on this, then the company is close to the state of the dead kernels.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time
  2018-09-06  8:48               ` Thomas Gleixner
@ 2018-09-06 12:47                 ` Thomas Gleixner
  0 siblings, 0 replies; 74+ messages in thread
From: Thomas Gleixner @ 2018-09-06 12:47 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Greg KH, ksummit-discuss

On Thu, 6 Sep 2018, Thomas Gleixner wrote:

> On Wed, 5 Sep 2018, Steven Rostedt wrote:
> > It's not the distros that need convincing, it's the vendors that pay to
> > have it done. When I first started at Red Hat and was told about the
> > "Stable Kernel ABI", the person telling me about this (a very
> > established kernel developer) also said "Yeah it really sucks, but
> > companies are willing to pay a shite load of money to have it done". And
> > it's in the distros best interest to get that shite load of money. It
> > also funds the same developers to do this work, and hopefully continue
> > to help upstream as well.
> > 
> > If we remove that nasty work, these companies won't need to continue
> > paying that shite load anymore, and they may not be able to afford
> > paying these talented developers.
> 
> Come on. Unless you have hard evidence for this, you are merely
> proliferating a decades-old distro fairy tale.
> 
> Seriously, those old kernels are part of the revenue stream, but if any
> vendor's engineering investment, which is securing the future, depends
> on this, then the company is close to the state of the dead kernels.

Clarification: I meant that mostly vs. the 2.6 myth. But for the
not-that-old kernel crap, including the KABI mess, which indeed makes up
a large part of the revenue, the problem was introduced by the distros
in the first place.

The people who warned about the issues and predicted the horrors (on a
far too small scale) were ignored, and I have no indication that a
serious change is happening.

I wouldn't care at all if this did not affect upstream and did not
totally trainwreck engineers and maintainers. No money in the world can
compensate for people being pushed to the edge of burnout.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 74+ messages in thread

end of thread, other threads:[~2018-09-06 12:47 UTC | newest]

Thread overview: 74+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-04 20:58 [Ksummit-discuss] [MAINTAINER SUMMIT] Stable trees and release time Laura Abbott
2018-09-04 21:12 ` Jiri Kosina
2018-09-05 14:31   ` Greg KH
2018-09-04 21:22 ` Justin Forbes
2018-09-05 14:42   ` Greg KH
2018-09-05 15:10     ` Mark Brown
2018-09-05 15:10     ` Sasha Levin
2018-09-05 16:19     ` Guenter Roeck
2018-09-05 18:31     ` Laura Abbott
2018-09-05 21:23     ` Justin Forbes
2018-09-06  2:17     ` Eduardo Valentin
2018-09-04 21:33 ` Sasha Levin
2018-09-04 21:55   ` Guenter Roeck
2018-09-04 22:03     ` Laura Abbott
2018-09-04 23:14       ` Sasha Levin
2018-09-04 23:43         ` Guenter Roeck
2018-09-05  1:17           ` Laura Abbott
2018-09-06  3:56             ` Benjamin Gilbert
2018-09-04 21:58   ` Laura Abbott
2018-09-05  4:53     ` Sasha Levin
2018-09-05  6:48   ` Jiri Kosina
2018-09-05  8:16     ` Jan Kara
2018-09-05  8:32       ` Jiri Kosina
2018-09-05  8:56         ` Greg KH
2018-09-05  9:13           ` Geert Uytterhoeven
2018-09-05  9:33             ` Greg KH
2018-09-05 10:11           ` Mark Brown
2018-09-05 14:44             ` Steven Rostedt
2018-09-05  9:58         ` James Bottomley
2018-09-05 10:47           ` Mark Brown
2018-09-05 12:24             ` James Bottomley
2018-09-05 12:53               ` Jiri Kosina
2018-09-05 13:05                 ` Greg KH
2018-09-05 13:15                   ` Jiri Kosina
2018-09-05 14:00                     ` Greg KH
2018-09-05 14:06                     ` Sasha Levin
2018-09-05 21:02                       ` Jiri Kosina
2018-09-05 16:39                 ` James Bottomley
2018-09-05 17:06                   ` Dmitry Torokhov
2018-09-05 17:33                   ` Steven Rostedt
2018-09-05 13:03               ` Takashi Iwai
2018-09-05 13:27                 ` Daniel Vetter
2018-09-05 14:05                   ` Greg KH
2018-09-05 15:54                     ` Daniel Vetter
2018-09-05 16:19                       ` Sasha Levin
2018-09-05 16:26                         ` Daniel Vetter
2018-09-05 19:09                           ` Sasha Levin
2018-09-05 20:18                             ` Sasha Levin
2018-09-05 20:33                               ` Daniel Vetter
2018-09-05 14:20                 ` Sasha Levin
2018-09-05 14:30                   ` Takashi Iwai
2018-09-05 14:41                     ` Sasha Levin
2018-09-05 14:46                       ` Takashi Iwai
2018-09-05 14:54                         ` Sasha Levin
2018-09-05 15:12                           ` Takashi Iwai
2018-09-05 15:19                           ` Thomas Gleixner
2018-09-05 15:29                             ` Sasha Levin
2018-09-05 13:16               ` Mark Brown
2018-09-05 14:27                 ` Sasha Levin
2018-09-05 14:50                   ` Mark Brown
2018-09-05 15:00                     ` Sasha Levin
2018-09-05 10:28       ` Thomas Gleixner
2018-09-05 11:20         ` Jiri Kosina
2018-09-05 14:41           ` Thomas Gleixner
2018-09-05 15:18             ` Steven Rostedt
2018-09-06  8:48               ` Thomas Gleixner
2018-09-06 12:47                 ` Thomas Gleixner
2018-09-04 21:49 ` Guenter Roeck
2018-09-04 22:06   ` Laura Abbott
2018-09-04 23:35     ` Guenter Roeck
2018-09-05  1:45       ` Laura Abbott
2018-09-05  2:54         ` Guenter Roeck
2018-09-05  8:31           ` Jan Kara
2018-09-05  3:44 ` Eduardo Valentin
