[Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement

* [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
@ 2019-07-03  1:35 Sasha Levin
  2019-07-03 14:57 ` Laura Abbott
                   ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-03  1:35 UTC (permalink / raw)
  To: ksummit-discuss

Hi folks,

If there is interest, I'd like to go over the (minor) changes that went
into the -stable kernel process since last year's MS, the various
automations we now have, and how we have addressed some of the pain
points that came up last year. I'd also love to hear from folks about
the issues they're seeing with the process, and if there's anything we
can do to make it better.

Some of the concerns that were raised during last year's MS (both in the
group session as well as in the hallway track) which we've tried to
address are:

 - Commits missing because authors did not respond to Greg's "FAILED:"
   mails.
 - Concerns about how well -stable kernels are tested.
 - "Fixes for fixes" end up being missed.
 - Saner AUTOSEL process.
 - Tracking of dropped commits.

I found last year's feedback very valuable and hopefully have addressed
some of it, hoping for the same this year as well.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-03  1:35 [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement Sasha Levin
@ 2019-07-03 14:57 ` Laura Abbott
  2019-07-05 13:54 ` Michael Ellerman
  2019-07-05 16:41 ` Mark Brown
  2 siblings, 0 replies; 27+ messages in thread
From: Laura Abbott @ 2019-07-03 14:57 UTC (permalink / raw)
  To: Sasha Levin, ksummit-discuss

On 7/2/19 9:35 PM, Sasha Levin wrote:
> Hi folks,
> 
> If there is interest, I'd like to go over the (minor) changes that went
> into the -stable kernel process since last year's MS, the various
> automations we now have, and how we have addressed some of the pain
> points that came up last year. I'd also love to hear from folks about
> the issues they're seeing with the process, and if there's anything we
> can do to make it better.
> 
> Some of the concerns that were raised during last year's MS (both in the
> group session as well as in the hallway track) which we've tried to
> address are:
> 
> - Commits missing because authors did not respond to Greg's "FAILED:"
>    mails.
> - Concerns about how well -stable kernels are tested.
> - "Fixes for fixes" end up being missed.
> - Saner AUTOSEL process.
> - Tracking of dropped commits.
> 
> I found last year's feedback very valuable and hopefully have addressed
> some of it, hoping for the same this year as well.

I'm certainly interested in this.

Thanks,
Laura


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-03  1:35 [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement Sasha Levin
  2019-07-03 14:57 ` Laura Abbott
@ 2019-07-05 13:54 ` Michael Ellerman
  2019-07-05 14:13   ` Takashi Iwai
  2019-07-05 16:41 ` Mark Brown
  2 siblings, 1 reply; 27+ messages in thread
From: Michael Ellerman @ 2019-07-05 13:54 UTC (permalink / raw)
  To: Sasha Levin, ksummit-discuss

Sasha Levin <sashal@kernel.org> writes:
> Hi folks,
>
> If there is interest, I'd like to go over the (minor) changes that went
> into the -stable kernel process since last year's MS, the various
> automations we now have, and how we have addressed some of the pain
> points that came up last year. I'd also love to hear from folks about
> the issues they're seeing with the process, and if there's anything we
> can do to make it better.
>
> Some of the concerns that were raised during last year's MS (both in the
> group session as well as in the hallway track) which we've tried to
> address are:
>
>  - Commits missing because authors did not respond to Greg's "FAILED:"
>    mails.
>  - Concerns about how well -stable kernels are tested.
>  - "Fixes for fixes" end up being missed.
>  - Saner AUTOSEL process.
>  - Tracking of dropped commits.

Yeah definitely interested in this.

Especially the tracking part. I have been trying to keep track of
powerpc commits that need backporting, but haven't really come up with a
good system. So would be interested in what you and/or others are doing.

Something I've been experimenting with is using git notes to mark
commits that have been fixed by a subsequent commit. This gives you a
two way link between the fix and the fixed commit, and you can get the
notes to show up in git log, like:

  commit 1846193b178dcc58435fdc57352db7b74826ef37
  Author: Michael Ellerman <mpe@ellerman.id.au>
  Date:   Thu Jul 7 22:54:29 2016 +1000
  
      powerpc/xmon: Dump ISA 2.06 SPRs
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  
  Notes (fixed):
      Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")


I'd like to extend this to the stable trees, so you could have output
something like:

  commit 1846193b178dcc58435fdc57352db7b74826ef37
  Author: Michael Ellerman <mpe@ellerman.id.au>
  Date:   Thu Jul 7 22:54:29 2016 +1000
  
      powerpc/xmon: Dump ISA 2.06 SPRs
      
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  
  Notes (fixed):
      Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")
        v4.9.y: deadbeef0000 ("powerpc/xmon: Fix display of SPRs")
       v4.10.y: not found


Git notes are also just blobs, so in theory the processing to generate
those notes could be done once and pushed to a repo where everyone could
pull them.
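
A rough Python sketch of how the note text shown above could be generated,
assuming the per-branch lookup has already been done elsewhere. The function
name `format_fixed_note` and the shape of the `stable_backports` dict are
hypothetical, not part of any existing tool. The resulting text could then be
attached with `git notes --ref=fixed add -m <text> <commit>` and displayed with
`git log --notes=fixed`.

```python
def format_fixed_note(fix_sha, fix_subject, stable_backports):
    """Render a 'Fixed-by' note body in the layout shown in the example above.

    stable_backports maps a stable branch name (e.g. "v4.9.y") to the hash of
    the backported fix on that branch, or None if no backport was found.
    """
    lines = [f'Fixed-by: {fix_sha[:12]} ("{fix_subject}")']
    width = len("Fixed-by:")  # right-align branch labels under the first label
    for branch, backport_sha in stable_backports.items():
        label = f"{branch}:".rjust(width)
        if backport_sha is None:
            lines.append(f"{label} not found")
        else:
            lines.append(f'{label} {backport_sha[:12]} ("{fix_subject}")')
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_fixed_note(
        "c47a94031e81",
        "powerpc/xmon: Fix display of SPRs",
        {"v4.9.y": "deadbeef0000", "v4.10.y": None},
    ))
```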

cheers


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-05 13:54 ` Michael Ellerman
@ 2019-07-05 14:13   ` Takashi Iwai
  2019-07-05 16:17     ` Greg KH
  2019-07-05 16:52     ` Sasha Levin
  0 siblings, 2 replies; 27+ messages in thread
From: Takashi Iwai @ 2019-07-05 14:13 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: ksummit-discuss

On Fri, 05 Jul 2019 15:54:11 +0200,
Michael Ellerman wrote:
> 
> Sasha Levin <sashal@kernel.org> writes:
> > Hi folks,
> >
> > If there is interest, I'd like to go over the (minor) changes that went
> > into the -stable kernel process since last year's MS, the various
> > automations we now have, and how we have addressed some of the pain
> > points that came up last year. I'd also love to hear from folks about
> > the issues they're seeing with the process, and if there's anything we
> > can do to make it better.
> >
> > Some of the concerns that were raised during last year's MS (both in the
> > group session as well as in the hallway track) which we've tried to
> > address are:
> >
> >  - Commits missing because authors did not respond to Greg's "FAILED:"
> >    mails.
> >  - Concerns about how well -stable kernels are tested.
> >  - "Fixes for fixes" end up being missed.
> >  - Saner AUTOSEL process.
> >  - Tracking of dropped commits.
> 
> Yeah definitely interested in this.
> 
> Especially the tracking part. I have been trying to keep track of
> powerpc commits that need backporting, but haven't really come up with a
> good system. So would be interested in what you and/or others are doing.
> 
> Something I've been experimenting with is using git notes to mark
> commits that have been fixed by a subsequent commit. This gives you a
> two way link between the fix and the fixed commit, and you can get the
> notes to show up in git log, like:
> 
>   commit 1846193b178dcc58435fdc57352db7b74826ef37
>   Author: Michael Ellerman <mpe@ellerman.id.au>
>   Date:   Thu Jul 7 22:54:29 2016 +1000
>   
>       powerpc/xmon: Dump ISA 2.06 SPRs
>       
>       Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>   
>   Notes (fixed):
>       Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")
> 
> 
> I'd like to extend this to the stable trees, so you could have output
> something like:
> 
>   commit 1846193b178dcc58435fdc57352db7b74826ef37
>   Author: Michael Ellerman <mpe@ellerman.id.au>
>   Date:   Thu Jul 7 22:54:29 2016 +1000
>   
>       powerpc/xmon: Dump ISA 2.06 SPRs
>       
>       Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>   
>   Notes (fixed):
>       Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")
>         v4.9.y: deadbeef0000 ("powerpc/xmon: Fix display of SPRs")
>        v4.10.y: not found
> 
> 
> Git notes are also just blobs, so in theory the processing to generate
> those notes could be done once and pushed to a repo where everyone could
> pull them.

Yes, I'd love to have (and share) this kind of reverse mapping
information.  But somehow using git-notes for such a purpose wasn't
accepted widely.  IIRC, Linus mentioned that git-notes is a hack, and
indeed it is.  But if the entries aren't too big, it would work well
enough, I guess.  Once the size matters, we can reconsider and
switch to a better infrastructure...

FWIW, SUSE tracks the possible upstream fixes by parsing Fixes tag
regularly, so it's proven to be useful.
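
As an illustration only (this is not SUSE's actual tooling), Fixes tag parsing
can be sketched in a few lines of Python. The sketch assumes commit messages
are already available as (hash, message) pairs, for example collected from
`git log` output; the names `fixes_targets` and `build_fixed_by` are made up
for this example.

```python
import re
from collections import defaultdict

# Conventional trailer form: Fixes: <abbreviated sha> ("subject")
FIXES_RE = re.compile(r"^Fixes:\s*([0-9a-f]{8,40})\b", re.MULTILINE | re.IGNORECASE)


def fixes_targets(message):
    """Return the hashes named in the Fixes: trailers of one commit message."""
    return [m.group(1).lower() for m in FIXES_RE.finditer(message)]


def build_fixed_by(commits):
    """Build the reverse mapping: fixed hash -> list of commits that fix it.

    commits is an iterable of (sha, message) pairs.
    """
    fixed_by = defaultdict(list)
    for sha, message in commits:
        for target in fixes_targets(message):
            fixed_by[target].append(sha)
    return dict(fixed_by)


if __name__ == "__main__":
    # Illustrative commit message; the Fixes: line is constructed for the example.
    example = [(
        "c47a94031e81",
        "powerpc/xmon: Fix display of SPRs\n\n"
        'Fixes: 1846193b178d ("powerpc/xmon: Dump ISA 2.06 SPRs")\n',
    )]
    print(build_fixed_by(example))
```

A commit with no Fixes: trailer simply contributes nothing to the mapping, so
the approach only covers fixes whose authors added the tag.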


thanks,

Takashi


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-05 14:13   ` Takashi Iwai
@ 2019-07-05 16:17     ` Greg KH
  2019-07-05 16:52     ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Greg KH @ 2019-07-05 16:17 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: ksummit-discuss

On Fri, Jul 05, 2019 at 04:13:35PM +0200, Takashi Iwai wrote:
> 
> FWIW, SUSE tracks the possible upstream fixes by parsing Fixes tag
> regularly, so it's proven to be useful.

Yeah, it's the fixes tag parsing that I know I use (well, should use
more often than I do).  I think Sasha runs that type of script more
often than I do.

thanks,

greg k-h


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-03  1:35 [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement Sasha Levin
  2019-07-03 14:57 ` Laura Abbott
  2019-07-05 13:54 ` Michael Ellerman
@ 2019-07-05 16:41 ` Mark Brown
  2019-07-05 20:12   ` Sasha Levin
  2 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2019-07-05 16:41 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit-discuss

On Tue, Jul 02, 2019 at 09:35:57PM -0400, Sasha Levin wrote:

> - Concerns about how well -stable kernels are tested.
> - "Fixes for fixes" end up being missed.
> - Saner AUTOSEL process.

I'm a bit worried about these, especially pushed together - one
of the things the AUTOSEL stuff does quite often is pull in
driver changes and our coverage of drivers is especially weak.
When a person has explicitly flagged something for stable it's
still risky but the automation adds that extra level of
uncertainty.


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-05 14:13   ` Takashi Iwai
  2019-07-05 16:17     ` Greg KH
@ 2019-07-05 16:52     ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-05 16:52 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: ksummit-discuss

On Fri, Jul 05, 2019 at 04:13:35PM +0200, Takashi Iwai wrote:
>On Fri, 05 Jul 2019 15:54:11 +0200,
>Michael Ellerman wrote:
>>
>> Sasha Levin <sashal@kernel.org> writes:
>> > Hi folks,
>> >
>> > If there is interest, I'd like to go over the (minor) changes that went
>> > into the -stable kernel process since last year's MS, the various
>> > automations we now have, and how we have addressed some of the pain
>> > points that came up last year. I'd also love to hear from folks about
>> > the issues they're seeing with the process, and if there's anything we
>> > can do to make it better.
>> >
>> > Some of the concerns that were raised during last year's MS (both in the
>> > group session as well as in the hallway track) which we've tried to
>> > address are:
>> >
>> >  - Commits missing because authors did not respond to Greg's "FAILED:"
>> >    mails.
>> >  - Concerns about how well -stable kernels are tested.
>> >  - "Fixes for fixes" end up being missed.
>> >  - Saner AUTOSEL process.
>> >  - Tracking of dropped commits.
>>
>> Yeah definitely interested in this.
>>
>> Especially the tracking part. I have been trying to keep track of
>> powerpc commits that need backporting, but haven't really come up with a
>> good system. So would be interested in what you and/or others are doing.
>>
>> Something I've been experimenting with is using git notes to mark
>> commits that have been fixed by a subsequent commit. This gives you a
>> two way link between the fix and the fixed commit, and you can get the
>> notes to show up in git log, like:
>>
>>   commit 1846193b178dcc58435fdc57352db7b74826ef37
>>   Author: Michael Ellerman <mpe@ellerman.id.au>
>>   Date:   Thu Jul 7 22:54:29 2016 +1000
>>
>>       powerpc/xmon: Dump ISA 2.06 SPRs
>>
>>       Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>>
>>   Notes (fixed):
>>       Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")
>>
>>
>> I'd like to extend this to the stable trees, so you could have output
>> something like:
>>
>>   commit 1846193b178dcc58435fdc57352db7b74826ef37
>>   Author: Michael Ellerman <mpe@ellerman.id.au>
>>   Date:   Thu Jul 7 22:54:29 2016 +1000
>>
>>       powerpc/xmon: Dump ISA 2.06 SPRs
>>
>>       Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
>>
>>   Notes (fixed):
>>       Fixed-by: c47a94031e81 ("powerpc/xmon: Fix display of SPRs")
>>         v4.9.y: deadbeef0000 ("powerpc/xmon: Fix display of SPRs")
>>        v4.10.y: not found
>>
>>
>> Git notes are also just blobs, so in theory the processing to generate
>> those notes could be done once and pushed to a repo where everyone could
>> pull them.
>
>Yes, I'd love to have (and share) this kind of reverse mapping
>information.  But somehow using git-notes for such a purpose wasn't
>accepted widely.  IIRC, Linus mentioned that git-notes is a hack, and
>indeed it is.  But if the entries aren't too big, it would work well
>enough, I guess.  Once the size matters, we can reconsider and
>switch to a better infrastructure...
>
>FWIW, SUSE tracks the possible upstream fixes by parsing Fixes tag
>regularly, so it's proven to be useful.

Indeed, I also have quite a few scripts that do interesting things with
the fixes tag (such as the "fixes for fixes" script which tries to
understand if a certain fix was backported, and the new fix would apply
to older LTS trees).
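
A minimal sketch of the "fixes for fixes" idea, not the actual script. It
assumes stable commits record their upstream origin with a line of the form
`commit <sha> upstream.` or `[ Upstream commit <sha> ]`, and it only checks
presence, not whether the new fix would apply cleanly. All function names here
are hypothetical.

```python
import re

# Lines that stable backports use to point at the upstream commit (assumed forms).
UPSTREAM_RE = re.compile(
    r"^\s*(?:commit ([0-9a-f]{40}) upstream\.?|\[ Upstream commit ([0-9a-f]{40}) \])\s*$",
    re.MULTILINE,
)


def upstream_ids(stable_messages):
    """Collect the upstream hashes that a stable branch says it has backported."""
    found = set()
    for message in stable_messages:
        for m in UPSTREAM_RE.finditer(message):
            found.add(m.group(1) or m.group(2))
    return found


def missing_fixes(fixes, backported):
    """Return fixes whose target was backported but which are themselves absent.

    fixes is an iterable of (fix_sha, fixed_sha_prefix) pairs taken from
    upstream Fixes: tags; backported is the set returned by upstream_ids().
    """
    missing = []
    for fix_sha, target in fixes:
        target_present = any(sha.startswith(target) for sha in backported)
        if target_present and fix_sha not in backported:
            missing.append(fix_sha)
    return missing
```

Each hash reported by `missing_fixes` is a candidate for review rather than an
automatic backport.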

I'm toying with a similar idea for git notes, but my approach was to
extract mailing list conversations that are related to the patch in
question and add them as git notes to the commit they're discussing.

This means that when I do 'git log' to see a commit I'm about to
backport, I also get all the mailing list context related to it which
often tends to be more valuable than the commit message itself.

This is the sort of thing I feel would be useful beyond just -stable
work; I'm sure that everyone has spent hours sifting through the mailing
list to understand some of the logic of a given patch. I'd love to have
better integration between our git tree and the mailing list.

--
Thanks,
Sasha


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-05 16:41 ` Mark Brown
@ 2019-07-05 20:12   ` Sasha Levin
  2019-07-06  0:32     ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Sasha Levin @ 2019-07-05 20:12 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Fri, Jul 05, 2019 at 05:41:42PM +0100, Mark Brown wrote:
>On Tue, Jul 02, 2019 at 09:35:57PM -0400, Sasha Levin wrote:
>
>> - Concerns about how well -stable kernels are tested.
>> - "Fixes for fixes" end up being missed.
>> - Saner AUTOSEL process.
>
>I'm a bit worried about these, especially pushed together - one
>of the things the AUTOSEL stuff does quite often is pull in
>driver changes and our coverage of drivers is especially weak.

Our driver coverage is indeed weak, but I don't think that the solution
is to leave drivers/ alone. On the contrary, I think that making
drivers/ move quickly together with the rest of the kernel will
encourage vendors to up their testing game.

This came up in the last MS, and the agreement there was that we expect
stable kernel users to test their workloads before throwing it into
production.

If we were to start avoiding driver updates, it would act as an
incentive for people not to upgrade their kernel.

Right now I'm working with a certain hardware vendor who does a crappy
job at tagging fixes for stable, and it's horribly painful. I end up
spending time triaging a bug, reporting it to the vendor, only to be
told "oh grab this fix from upstream".

This user experience is just bad, and I can't imagine how difficult it
is for users who are less familiar with the kernel.

--
Thanks,
Sasha


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-05 20:12   ` Sasha Levin
@ 2019-07-06  0:32     ` Mark Brown
  2019-07-08 11:02       ` Sasha Levin
  0 siblings, 1 reply; 27+ messages in thread
From: Mark Brown @ 2019-07-06  0:32 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit-discuss

On Fri, Jul 05, 2019 at 04:12:31PM -0400, Sasha Levin wrote:
> On Fri, Jul 05, 2019 at 05:41:42PM +0100, Mark Brown wrote:

> > I'm a bit worried about these, especially pushed together - one
> > of the things the AUTOSEL stuff does quite often is pull in
> > driver changes and our coverage of drivers is especially weak.

> Our driver coverage is indeed weak, but I don't think that the solution
> is to leave drivers/ alone. On the contrary, I think that making
> drivers/ move quickly together with the rest of the kernel will
> encourage vendors to up their testing game.

I'm not saying leave it alone, it's more a question of how
aggressive we are about picking up things we think might be
relevant fixes but haven't had some sort of domain specific
analysis of.  Testing is a good way to mitigate the potential
risks here.

> This came up in the last MS, and the agreement there was that we expect
> stable kernel users to test their workloads before throwing it into
> production.

That's kind of the problem - if people are doing testing and end
up finding problems coming back in the stable kernel that's the
sort of thing that encourages them to not just take stable en
masse as we say they should.  Part of the deal with stable is
that it is conservative, people can trust it to be a low risk
update.  That's not happening now as far as I'm aware but it does
worry me that it might happen.

> If we were to start avoiding driver updates, it would act as an
> incentive for people not to upgrade their kernel.

I'm not sure I follow the logic here?

> Right now I'm working with a certain hardware vendor who does a crappy
> job at tagging fixes for stable, and it's horribly painful. I end up
> spending time triaging a bug, reporting it to the vendor, only to be
> told "oh grab this fix from upstream".

> This user experience is just bad, and I can't imagine how difficult it
> is for users who are less familiar with the kernel.

Well, the advice from the upstream community has always been that
you should track upstream and I'm sure people will be praising
this vendor's upstream focus but obviously that's not always
terribly helpful or realistic for production systems.  In my
(mostly embedded and consumer electronics based) experience
support for older kernel versions is generally part of the
commercial discussion with the hardware vendor, there's an
understanding that the hardware will only get bought if it works
on kernel versions that are useful to the customer or (depending
on the power relationships) that the customer will use kernel
versions that the vendor supports.  Sometimes, especially for
smaller customers, that doesn't work out but those are usually
the people who are more likely to track upstream and/or do
considerable testing before fixing a version and generally are on
their own.

This is where the out of tree patch stacks from vendors come from
- everyone agrees that they'll use one or more given kernel
versions, enterprise distros or whatever and then the vendor
commits to supporting what's agreed but often that doesn't just
include bug fixing but also new features (or entirely new bits of
hardware).  As a result those vendors are shipping their patch
stacks out of tree, users are getting their bug fixes from there
and those vendors are not finding much user demand for vanilla
LTS as a separate thing.  They may even find conflicts with it an
annoying hassle.  Frankly for them upstream support is often a
bit of an investment in reducing the cost of future out of tree
patch stacks and giving a longer general market life to products
rather than something customers directly demand.  None of this is
ideal from an upstream point of view of course but it does
function for people.

It sounds like somewhere along the line this process has come
unstuck for you and you have a vendor that's not aligned with
what you need but I don't think that's quite the same question
as the issues with pulling patches into stable without either
testing coverage or direct identification of an issue by someone
with domain knowledge which is what I'm worrying about.


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-06  0:32     ` Mark Brown
@ 2019-07-08 11:02       ` Sasha Levin
  2019-07-08 11:35         ` Jiri Kosina
  2019-07-08 12:37         ` Mark Brown
  0 siblings, 2 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-08 11:02 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
>On Fri, Jul 05, 2019 at 04:12:31PM -0400, Sasha Levin wrote:
>> On Fri, Jul 05, 2019 at 05:41:42PM +0100, Mark Brown wrote:
>
>> > I'm a bit worried about these, especially pushed together - one
>> > of the things the AUTOSEL stuff does quite often is pull in
>> > driver changes and our coverage of drivers is especially weak.
>
>> Our driver coverage is indeed weak, but I don't think that the solution
>> is to leave drivers/ alone. On the contrary, I think that making
>> drivers/ move quickly together with the rest of the kernel will
>> encourage vendors to up their testing game.
>
>I'm not saying leave it alone, it's more a question of how
>aggressive we are about picking up things we think might be
>relevant fixes but haven't had some sort of domain specific
>analysis of.  Testing is a good way to mitigate the potential
>risks here.

I agree, and for various subsystems and drivers where the maintainers
volunteer their domain specific expertise to send backports to stable, I
have "blacklisted" them from AUTOSEL since indeed it's a much better
option.

>> This came up in the last MS, and the agreement there was that we expect
>> stable kernel users to test their workloads before throwing it into
>> production.
>
>That's kind of the problem - if people are doing testing and end
>up finding problems coming back in the stable kernel that's the
>sort of thing that encourages them to not just take stable en
>masse as we say they should.  Part of the deal with stable is
>that it is conservative, people can trust it to be a low risk
>update.  That's not happening now as far as I'm aware but it does
>worry me that it might happen.

Right, and the rate at which AUTOSEL commits are reverted is lower than
commits that are actually tagged for stable. If AUTOSEL commits on their
own were being reverted left and right I'd agree we need to tone it
down, but I don't see it happening now.

>> If we were to start avoiding driver updates, it would act as an
>> incentive for people not to upgrade their kernel.
>
>I'm not sure I follow the logic here?

The way I see it, the lower your "effective delta" is between two
kernels, the easier it is to move forward. For example, if I have a
product that runs on 4.19 and uses all our core kernel code + 10
drivers, and I know that those drivers had most of the fixes backported
to my LTS tree, I'd feel much more confident going to 5.4 knowing that
I already have most of the patches that come with 5.4.

For me it's a matter of how one would budget a move from a kernel X LTS
to kernel Y LTS, and I think that as that budget requirement grows it's
actually harder to actually do it (and convince management), acting as a
negative incentive to stay with whatever works now.

>> Right now I'm working with a certain hardware vendor who does a crappy
>> job at tagging fixes for stable, and it's horribly painful. I end up
>> spending time triaging a bug, reporting it to the vendor, only to be
>> told "oh grab this fix from upstream".
>
>> This user experience is just bad, and I can't imagine how difficult it
>> is for users who are less familiar with the kernel.
>
>Well, the advice from the upstream community has always been that
>you should track upstream and I'm sure people will be praising
>this vendor's upstream focus but obviously that's not always
>terribly helpful or realistic for production systems.  In my
>(mostly embedded and consumer electronics based) experience
>support for older kernel versions is generally part of the
>commercial discussion with the hardware vendor, there's an
>understanding that the hardware will only get bought if it works
>on kernel versions that are useful to the customer or (depending
>on the power relationships) that the customer will use kernel
>versions that the vendor supports.  Sometimes, especially for
>smaller customers, that doesn't work out but those are usually
>the people who are more likely to track upstream and/or do
>considerable testing before fixing a version and generally are on
>their own.

I have a different experience with this. I'd like to think that we're a
bigger customer and this process wasn't working too well for us. My
thinking was that if it's broken for us I can only imagine how bad it is
for the smaller customers.

>This is where the out of tree patch stacks from vendors come from
>- everyone agrees that they'll use one or more given kernel
>versions, enterprise distros or whatever and then the vendor
>commits to supporting what's agreed but often that doesn't just
>include bug fixing but also new features (or entirely new bits of
>hardware).  As a result those vendors are shipping their patch
>stacks out of tree, users are getting their bug fixes from there
>and those vendors are not finding much user demand for vanilla
>LTS as a separate thing.  They may even find conflicts with it an
>annoying hassle.  Frankly for them upstream support is often a
>bit of an investment in reducing the cost of future out of tree
>patch stacks and giving a longer general market life to products
>rather than something customers directly demand.  None of this is
>ideal from an upstream point of view of course but it does
>function for people.

This is where our story is different, which might explain my experience
being different: we usually require vendors to upstream everything, and
so they do. This means we don't have many out-of-tree patch
stacks/fixes from the vendor directly, and we expect to pick up patches
via the regular stable process, and that didn't happen all too well so
far.

--
Thanks,
Sasha


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 11:02       ` Sasha Levin
@ 2019-07-08 11:35         ` Jiri Kosina
  2019-07-08 12:34           ` Greg KH
  2019-07-08 17:56           ` Sasha Levin
  2019-07-08 12:37         ` Mark Brown
  1 sibling, 2 replies; 27+ messages in thread
From: Jiri Kosina @ 2019-07-08 11:35 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit-discuss

On Mon, 8 Jul 2019, Sasha Levin wrote:
> 
> >> If we were to start avoiding driver updates, it would act as an
> >> incentive for people not to upgrade their kernel.
> >
> >I'm not sure I follow the logic here?
> 
> The way I see it, the lower your "effective delta" is between two
> kernels, the easier it is to move forward. For example, if I have a
> product that runs on 4.19 and uses all our core kernel code + 10
> drivers, and I know that those drivers had most of the fixes backported
> to my LTS tree, I'd feel much more confident going to 5.4 knowing that
> I already have most of the patches that come with 5.4.
> 
> For me it's a matter of how one would budget a move from a kernel X LTS
> to kernel Y LTS, and I think that as that budget requirement grows it's
> actually harder to actually do it (and convince management), acting as a
> negative incentive to stay with whatever works now.

But where does the 'stable' aspect appear here?

I think it's reasonable to expect 'stable' to mean 'minimal number of 
changes needed to maintain stability of the kernel', and that I believe 
was the original purpose of stable tree.

Now you seem to be repurposing 'stable' as 'as close to upstream as 
possible in order to minimize cost of version updates'.

I guess that's one of the reasons why distros are gradually turning away 
from the stable tree: the main purpose of distros is to provide stability,
while it clearly is not minimizing accumulation of cost for future version
updates.

Thanks,

-- 
Jiri Kosina
SUSE Labs


* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 11:35         ` Jiri Kosina
@ 2019-07-08 12:34           ` Greg KH
  2019-07-08 17:56           ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Greg KH @ 2019-07-08 12:34 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 01:35:15PM +0200, Jiri Kosina wrote:
> On Mon, 8 Jul 2019, Sasha Levin wrote:
> > 
> > >> If we were to start avoiding driver updates, it would act as an
> > >> incentive for people not to upgrade their kernel.
> > >
> > >I'm not sure I follow the logic here?
> > 
> > The way I see it, the lower your "effective delta" is between two
> > kernels, the easier it is to move forward. For example, if I have a
> > product that runs on 4.19 and uses all our core kernel code + 10
> > drivers, and I know that those drivers had most of the fixes backported
> > to my LTS tree, I'd feel much more confident going to 5.4 knowing that
> > I already have most of the patches that come with 5.4.
> > 
> > For me it's a matter of how one would budget a move from a kernel X LTS
> > to kernel Y LTS, and I think that as that budget requirement grows it's
> > actually harder to actually do it (and convince management), acting as a
> > negative incentive to stay with whatever works now.
> 
> But where does the 'stable' aspect appear here?
> 
> I think it's reasonable to expect 'stable' to mean 'minimal number of 
> changes needed to maintain stability of the kernel', and that I believe 
> was the original purpose of stable tree.
> 
> Now you seem to be repurposing 'stable' as 'as close to upstream as 
> possible in order to minimize cost of version updates'.

"stable" means "All the bugfixes that we have in Linus's tree,
backported to this one as well to resolve known issues".  That's all
that is happening here with the autosel stuff.  There are a load of
subsystems that still do not tag stuff for stable backporting, and
sometimes even the maintainers miss them (I am guilty of that as
well).  So autosel finds those fixes and backports them.  It's no
different from a distro doing the exact same thing when a bug report
comes into it, but it happens _BEFORE_ the bug report happens.

> I guess that's one of the reasons why distros are gradually turning away 
> from the stable tree: the main purpose of distros is to provide stability, 
> while it clearly is not minimizing accumulation of cost for future version 
> updates.

That's directly opposite of what I see happening with loads of
real-world devices.

As proof of this, and as part of a talk I gave a few weeks ago, I can
quote the Android security team.  They kept track of all requests that
they made to be backported to their device trees for 2018.  Out of 218
requests, 201 of them were _ALREADY_ in the LTS release tree.  The
remaining 17 were due to out-of-tree code being in the devices, or due
to bugs in backports that were not upstream.

So again, bugs are being fixed _before_ people report them, which sounds
exactly like what a distro needs to have happen for them :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 11:02       ` Sasha Levin
  2019-07-08 11:35         ` Jiri Kosina
@ 2019-07-08 12:37         ` Mark Brown
  2019-07-08 14:05           ` Guenter Roeck
  2019-07-08 18:01           ` Sasha Levin
  1 sibling, 2 replies; 27+ messages in thread
From: Mark Brown @ 2019-07-08 12:37 UTC (permalink / raw)
  To: Sasha Levin; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 4321 bytes --]

On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:

> > I'm not saying leave it alone, it's more a question of how
> > aggressive we are about picking up things we think might be
> > relevant fixes but haven't had some sort of domain specific
> > analysis of.  Testing is a good way to mitigate the potential
> > risks here.

> I agree, and for various subsystems and drivers where the maintainers
> volunteer their domain specific expertise to send backports to stable, I
> have "blacklisted" it from AUTOSEL since indeed it's a much better
> option.

Hrm, it's definitely getting a bunch of stuff for my subsystems
where I do tag things for stable...

> > > This came up in the last MS, and the agreement there was that we expect
> > > stable kernel users to test their workloads before throwing it into
> > > production.

> > That's kind of the problem - if people are doing testing and end
> > up finding problems coming back in the stable kernel that's the
> > sort of thing that encourages them to not just take stable en
> > masse as we say they should.  Part of the deal with stable is
> > that it is conservative, people can trust it to be a low risk
> > update.  That's not happening now as far as I'm aware but it does
> > worry me that it might happen.

> Right, and the rate at which AUTOSEL commits are reverted is lower than
> commits that are actually tagged for stable. If AUTOSEL commits on their
> own were being reverted left and right I'd agree we need to tone it
> down, but I don't see it happening now.

I'm not sure how many people will actually report problems they
experience upstream rather than just fixing things locally and
just moving on.  The more code is the more likely it is that one
of the users will report things.

> > > If we were to start avoiding driver updates, it would act as an
> > > incentive for people not to upgrade their kernel.

> > I'm not sure I follow the logic here?

> The way I see it, the lower your "effective delta" is between two
> kernels, the easier it is to move forward. For example, if I have a
> product that runs on 4.19 and uses all our core kernel code + 10
> drivers, and I know that those drivers had most of the fixes backported
> to my LTS tree, I'd feel much more confident going to 5.4 knowing that
> I already have most of the patches that come with 5.4.

I see, that's definitely a new one to me.  The concerns people
usually have about upgrading are more around the core kernel
changing performance characteristics or something in a way that
disrupts important workloads.  I'm not quite sure I follow the
logic there TBH, it seems to be discounting new development
rather too much - even if the drivers have been very static
there's all the integration with the rest of the kernel to think
about.

> For me it's a matter of how one would budget a move from a kernel X LTS
> to kernel Y LTS, and I think that as that budget requirement grows it's
> actually harder to actually do it (and convince management), acting as a
> negative incentive to stay with whatever works now.

If the drivers are static enough to only be getting bug fixes
surely the rest of the kernel is a massively more substantial
concern?

> I have a different experience with this. I'd like to think that we're a
> bigger customer and this process wasn't working too well for us. My
> thinking was that if it's broken for us I can only imagine how bad it is
> for the smaller customers.

...

> This is where our story is different, which might explain my experience
> being different: we usually require vendors to upstream everything, and
> so they do. This means we don't have much of an out-of-tree patch
> stack/fixes from the vendor directly, and we expect to pick up patches
> via the regular stable process, and that didn't happen all too well so
> far.

That sounds like they didn't pick up on the bit about getting
things through LTS.  This sounds like a pretty unusual request
for a vendor to be getting, it doesn't 100% surprise me that
it might take a few goes for them to understand what you're
looking for, or that you're having a worse time than most users.
For enterprise type stuff AFAICT people are expecting people to
get their stable versions from distros rather than raw LTS.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 12:37         ` Mark Brown
@ 2019-07-08 14:05           ` Guenter Roeck
  2019-07-08 14:33             ` Takashi Iwai
  2019-07-08 14:50             ` Mark Brown
  2019-07-08 18:01           ` Sasha Levin
  1 sibling, 2 replies; 27+ messages in thread
From: Guenter Roeck @ 2019-07-08 14:05 UTC (permalink / raw)
  To: Mark Brown, Sasha Levin; +Cc: ksummit-discuss

On 7/8/19 5:37 AM, Mark Brown wrote:
> On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
>> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
> 
>>> I'm not saying leave it alone, it's more a question of how
>>> aggressive we are about picking up things we think might be
>>> relevant fixes but haven't had some sort of domain specific
>>> analysis of.  Testing is a good way to mitigate the potential
>>> risks here.
> 
>> I agree, and for various subsystems and drivers where the maintainers
>> volunteer their domain specific expertise to send backports to stable, I
>> have "blacklisted" it from AUTOSEL since indeed it's a much better
>> option.
> 
> Hrm, it's definitely getting a bunch of stuff for my subsystems
> where I do tag things for stable...
> 
>>>> This came up in the last MS, and the agreement there was that we expect
>>>> stable kernel users to test their workloads before throwing it into
>>>> production.
> 
>>> That's kind of the problem - if people are doing testing and end
>>> up finding problems coming back in the stable kernel that's the
>>> sort of thing that encourages them to not just take stable en
>>> masse as we say they should.  Part of the deal with stable is
>>> that it is conservative, people can trust it to be a low risk
>>> update.  That's not happening now as far as I'm aware but it does
>>> worry me that it might happen.
> 
>> Right, and the rate at which AUTOSEL commits are reverted is lower than
>> commits that are actually tagged for stable. If AUTOSEL commits on their
>> own were being reverted left and right I'd agree we need to tone it
>> down, but I don't see it happening now.
> 
> I'm not sure how many people will actually report problems they
> experience upstream rather than just fixing things locally and
> just moving on.  The more code is the more likely it is that one
> of the users will report things.
> 

I for my part will most definitely report any such problems, since each
regression in stable releases is used as argument against merging
stable releases (even if the regression rate is negligible), and I am
very interested in getting that regression rate as close to zero as
possible. Reporting each and every regression is an essential part
of that.

Guenter

>>>> If we were to start avoiding driver updates, it would act as an
>>>> incentive for people not to upgrade their kernel.
> 
>>> I'm not sure I follow the logic here?
> 
>> The way I see it, the lower your "effective delta" is between two
>> kernels, the easier it is to move forward. For example, if I have a
>> product that runs on 4.19 and uses all our core kernel code + 10
>> drivers, and I know that those drivers had most of the fixes backported
>> to my LTS tree, I'd feel much more confident going to 5.4 knowing that
>> I already have most of the patches that come with 5.4.
> 
> I see, that's definitely a new one to me.  The concerns people
> usually have about upgrading are more around the core kernel
> changing performance characteristics or something in a way that
> disrupts important workloads.  I'm not quite sure I follow the
> logic there TBH, it seems to be discounting new development
> rather too much - even if the drivers have been very static
> there's all the integration with the rest of the kernel to think
> about.
> 
>> For me it's a matter of how one would budget a move from a kernel X LTS
>> to kernel Y LTS, and I think that as that budget requirement grows it's
>> actually harder to actually do it (and convince management), acting as a
>> negative incentive to stay with whatever works now.
> 
> If the drivers are static enough to only be getting bug fixes
> surely the rest of the kernel is a massively more substantial
> concern?
> 
>> I have a different experience with this. I'd like to think that we're a
>> bigger customer and this process wasn't working too well for us. My
>> thinking was that if it's broken for us I can only imagine how bad it is
>> for the smaller customers.
> 
> ...
> 
>> This is where our story is different, which might explain my experience
>> being different: we usually require vendors to upstream everything, and
>> so they do. This means we don't have much of an out-of-tree patch
>> stack/fixes from the vendor directly, and we expect to pick up patches
>> via the regular stable process, and that didn't happen all too well so
>> far.
> 
> That sounds like they didn't pick up on the bit about getting
> things through LTS.  This sounds like a pretty unusual request
> for a vendor to be getting, it doesn't 100% surprise me that
> it might take a few goes for them to understand what you're
> looking for, or that you're having a worse time than most users.
> For enterprise type stuff AFAICT people are expecting people to
> get their stable versions from distros rather than raw LTS.
> 
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 14:05           ` Guenter Roeck
@ 2019-07-08 14:33             ` Takashi Iwai
  2019-07-08 15:10               ` Greg KH
  2019-07-08 14:50             ` Mark Brown
  1 sibling, 1 reply; 27+ messages in thread
From: Takashi Iwai @ 2019-07-08 14:33 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: ksummit-discuss

On Mon, 08 Jul 2019 16:05:44 +0200,
Guenter Roeck wrote:
> 
> On 7/8/19 5:37 AM, Mark Brown wrote:
> > On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
> >> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
> >
> >>> I'm not saying leave it alone, it's more a question of how
> >>> aggressive we are about picking up things we think might be
> >>> relevant fixes but haven't had some sort of domain specific
> >>> analysis of.  Testing is a good way to mitigate the potential
> >>> risks here.
> >
> >> I agree, and for various subsystems and drivers where the maintainers
> >> volunteer their domain specific expertise to send backports to stable, I
> >> have "blacklisted" it from AUTOSEL since indeed it's a much better
> >> option.
> >
> > Hrm, it's definitely getting a bunch of stuff for my subsystems
> > where I do tag things for stable...
> >
> >>>> This came up in the last MS, and the agreement there was that we expect
> >>>> stable kernel users to test their workloads before throwing it into
> >>>> production.
> >
> >>> That's kind of the problem - if people are doing testing and end
> >>> up finding problems coming back in the stable kernel that's the
> >>> sort of thing that encourages them to not just take stable en
> >>> masse as we say they should.  Part of the deal with stable is
> >>> that it is conservative, people can trust it to be a low risk
> >>> update.  That's not happening now as far as I'm aware but it does
> >>> worry me that it might happen.
> >
> >> Right, and the rate at which AUTOSEL commits are reverted is lower than
> >> commits that are actually tagged for stable. If AUTOSEL commits on their
> >> own were being reverted left and right I'd agree we need to tone it
> >> down, but I don't see it happening now.
> >
> > I'm not sure how many people will actually report problems they
> > experience upstream rather than just fixing things locally and
> > just moving on.  The more code is the more likely it is that one
> > of the users will report things.
> >
> 
> I for my part will most definitely report any such problems, since each
> regression in stable releases is used as argument against merging
> stable releases (even if the regression rate is negligible), and I am
> very interested in getting that regression rate as close to zero as
> possible. Reporting each and every regression is an essential part
> of that.

BTW, regarding regression: currently we have no central regression
tracking.  This is another big missing piece, and a thing to be
discussed in KS, IMO.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 14:05           ` Guenter Roeck
  2019-07-08 14:33             ` Takashi Iwai
@ 2019-07-08 14:50             ` Mark Brown
  2019-07-08 15:06               ` Greg KH
  1 sibling, 1 reply; 27+ messages in thread
From: Mark Brown @ 2019-07-08 14:50 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 858 bytes --]

On Mon, Jul 08, 2019 at 07:05:44AM -0700, Guenter Roeck wrote:
> On 7/8/19 5:37 AM, Mark Brown wrote:

> > I'm not sure how many people will actually report problems they
> > experience upstream rather than just fixing things locally and
> > just moving on.  The more code is the more likely it is that one
> > of the users will report things.

> I for my part will most definitely report any such problems, since each
> regression in stable releases is used as argument against merging
> stable releases (even if the regression rate is negligible), and I am
> very interested in getting that regression rate as close to zero as
> possible. Reporting each and every regression is an essential part
> of that.

Me too - but I'm pretty sure for example most of the product
teams I've worked with at consumer electronics companies would
never even consider it.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 14:50             ` Mark Brown
@ 2019-07-08 15:06               ` Greg KH
  2019-07-08 15:27                 ` Mark Brown
  0 siblings, 1 reply; 27+ messages in thread
From: Greg KH @ 2019-07-08 15:06 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 03:50:45PM +0100, Mark Brown wrote:
> On Mon, Jul 08, 2019 at 07:05:44AM -0700, Guenter Roeck wrote:
> > On 7/8/19 5:37 AM, Mark Brown wrote:
> 
> > > I'm not sure how many people will actually report problems they
> > > experience upstream rather than just fixing things locally and
> > > just moving on.  The more code is the more likely it is that one
> > > of the users will report things.
> 
> > I for my part will most definitely report any such problems, since each
> > regression in stable releases is used as argument against merging
> > stable releases (even if the regression rate is negligible), and I am
> > very interested in getting that regression rate as close to zero as
> > possible. Reporting each and every regression is an essential part
> > of that.
> 
> Me too - but I'm pretty sure for example most of the product
> teams I've worked with at consumer electronics companies would
> never even consider it.

Sweet, want me to come into those teams and give a presentation like I
did a few months ago for one major company entitled "all the ways your
kernel is insecure and trivial to break"?

I'll be glad to do so :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 14:33             ` Takashi Iwai
@ 2019-07-08 15:10               ` Greg KH
  2019-07-08 15:18                 ` Takashi Iwai
                                   ` (3 more replies)
  0 siblings, 4 replies; 27+ messages in thread
From: Greg KH @ 2019-07-08 15:10 UTC (permalink / raw)
  To: Takashi Iwai; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 04:33:28PM +0200, Takashi Iwai wrote:
> On Mon, 08 Jul 2019 16:05:44 +0200,
> Guenter Roeck wrote:
> > 
> > On 7/8/19 5:37 AM, Mark Brown wrote:
> > > On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
> > >> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
> > >
> > >>> I'm not saying leave it alone, it's more a question of how
> > >>> aggressive we are about picking up things we think might be
> > >>> relevant fixes but haven't had some sort of domain specific
> > >>> analysis of.  Testing is a good way to mitigate the potential
> > >>> risks here.
> > >
> > >> I agree, and for various subsystems and drivers where the maintainers
> > >> volunteer their domain specific expertise to send backports to stable, I
> > >> have "blacklisted" it from AUTOSEL since indeed it's a much better
> > >> option.
> > >
> > > Hrm, it's definitely getting a bunch of stuff for my subsystems
> > > where I do tag things for stable...
> > >
> > >>>> This came up in the last MS, and the agreement there was that we expect
> > >>>> stable kernel users to test their workloads before throwing it into
> > >>>> production.
> > >
> > >>> That's kind of the problem - if people are doing testing and end
> > >>> up finding problems coming back in the stable kernel that's the
> > >>> sort of thing that encourages them to not just take stable en
> > >>> masse as we say they should.  Part of the deal with stable is
> > >>> that it is conservative, people can trust it to be a low risk
> > >>> update.  That's not happening now as far as I'm aware but it does
> > >>> worry me that it might happen.
> > >
> > >> Right, and the rate at which AUTOSEL commits are reverted is lower than
> > >> commits that are actually tagged for stable. If AUTOSEL commits on their
> > >> own were being reverted left and right I'd agree we need to tone it
> > >> down, but I don't see it happening now.
> > >
> > > I'm not sure how many people will actually report problems they
> > > experience upstream rather than just fixing things locally and
> > > just moving on.  The more code is the more likely it is that one
> > > of the users will report things.
> > >
> > 
> > I for my part will most definitely report any such problems, since each
> > regression in stable releases is used as argument against merging
> > stable releases (even if the regression rate is negligible), and I am
> > very interested in getting that regression rate as close to zero as
> > possible. Reporting each and every regression is an essential part
> > of that.
> 
> BTW, regarding regression: currently we have no central regression
> tracking.  This is another big missing piece, and a thing to be
> discussed in KS, IMO.

Well, I think the conversation will go just like it has in the past for
this issue:
	"We need to have someone track regressions!"
	"X said they would do it but they need to be paid, any company
	willing to sponsor this?"
	{crickets}

We know we need this, we have at least one talented and capable person
to do the work, but no company is willing to step up and fund it :(

It's like where we were 5 years ago with testing, everyone knew there
was a problem, but no one was willing to do anything about it.  That
time I convinced some LF member companies to start doing work within
their companies toward this, but that really doesn't solve this type of
problem as being "distributed" isn't the issue here...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 15:10               ` Greg KH
@ 2019-07-08 15:18                 ` Takashi Iwai
  2019-07-08 18:08                 ` Sasha Levin
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 27+ messages in thread
From: Takashi Iwai @ 2019-07-08 15:18 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

On Mon, 08 Jul 2019 17:10:40 +0200,
Greg KH wrote:
> 
> On Mon, Jul 08, 2019 at 04:33:28PM +0200, Takashi Iwai wrote:
> > On Mon, 08 Jul 2019 16:05:44 +0200,
> > Guenter Roeck wrote:
> > > 
> > > On 7/8/19 5:37 AM, Mark Brown wrote:
> > > > On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
> > > >> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
> > > >
> > > >>> I'm not saying leave it alone, it's more a question of how
> > > >>> aggressive we are about picking up things we think might be
> > > >>> relevant fixes but haven't had some sort of domain specific
> > > >>> analysis of.  Testing is a good way to mitigate the potential
> > > >>> risks here.
> > > >
> > > >> I agree, and for various subsystems and drivers where the maintainers
> > > >> volunteer their domain specific expertise to send backports to stable, I
> > > >> have "blacklisted" it from AUTOSEL since indeed it's a much better
> > > >> option.
> > > >
> > > > Hrm, it's definitely getting a bunch of stuff for my subsystems
> > > > where I do tag things for stable...
> > > >
> > > >>>> This came up in the last MS, and the agreement there was that we expect
> > > >>>> stable kernel users to test their workloads before throwing it into
> > > >>>> production.
> > > >
> > > >>> That's kind of the problem - if people are doing testing and end
> > > >>> up finding problems coming back in the stable kernel that's the
> > > >>> sort of thing that encourages them to not just take stable en
> > > >>> masse as we say they should.  Part of the deal with stable is
> > > >>> that it is conservative, people can trust it to be a low risk
> > > >>> update.  That's not happening now as far as I'm aware but it does
> > > >>> worry me that it might happen.
> > > >
> > > >> Right, and the rate at which AUTOSEL commits are reverted is lower than
> > > >> commits that are actually tagged for stable. If AUTOSEL commits on their
> > > >> own were being reverted left and right I'd agree we need to tone it
> > > >> down, but I don't see it happening now.
> > > >
> > > > I'm not sure how many people will actually report problems they
> > > > experience upstream rather than just fixing things locally and
> > > > just moving on.  The more code is the more likely it is that one
> > > > of the users will report things.
> > > >
> > > 
> > > I for my part will most definitely report any such problems, since each
> > > regression in stable releases is used as argument against merging
> > > stable releases (even if the regression rate is negligible), and I am
> > > very interested in getting that regression rate as close to zero as
> > > possible. Reporting each and every regression is an essential part
> > > of that.
> > 
> > BTW, regarding regression: currently we have no central regression
> > tracking.  This is another big missing piece, and a thing to be
> > discussed in KS, IMO.
> 
> Well, I think the conversation will go just like it has in the past for
> this issue:
> 	"We need to have someone track regressions!"
> 	"X said they would do it but they need to be paid, any company
> 	willing to sponsor this?"
> 	{crickets}
> 
> We know we need this, we have at least one talented and capable person
> to do the work, but no company is willing to step up and fund it :(

Yeah, it's a sad deja vu...

> It's like where we were 5 years ago with testing, everyone knew there
> was a problem, but no one was willing to do anything about it.  That
> time I convinced some LF member companies to start doing work within
> their companies toward this, but that really doesn't solve this type of
> problem as being "distributed" isn't the issue here...

The past attempts and their failure patterns look like a SPOF: it has
always been a load on a single person, who eventually gave up
maintaining it.  More automated and distributed work would help in this
regard, I sincerely hope.


thanks,

Takashi

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 15:06               ` Greg KH
@ 2019-07-08 15:27                 ` Mark Brown
  0 siblings, 0 replies; 27+ messages in thread
From: Mark Brown @ 2019-07-08 15:27 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1276 bytes --]

On Mon, Jul 08, 2019 at 05:06:41PM +0200, Greg KH wrote:
> On Mon, Jul 08, 2019 at 03:50:45PM +0100, Mark Brown wrote:
> > On Mon, Jul 08, 2019 at 07:05:44AM -0700, Guenter Roeck wrote:

> > > I for my part will most definitely report any such problems, since each
> > > regression in stable releases is used as argument against merging
> > > stable releases (even if the regression rate is negligible), and I am
> > > very interested in getting that regression rate as close to zero as
> > > possible. Reporting each and every regression is an essential part
> > > of that.

> > Me too - but I'm pretty sure for example most of the product
> > teams I've worked with at consumer electronics companies would
> > never even consider it.

> Sweet, want me to come into those teams and give a presentation like I
> did a few months ago for one major company entitled "all the ways your
> kernel is insecure and trivial to break"?

Go wild!  Note that this isn't a case of people not taking
updates, it's often a combination of a general confidentiality
mindset and the fact that if you're taking updates from multiple
sources (eg, LTS and one or more chip vendors) as well as making
your own changes it can be more trouble than it's worth to figure
out where to report anything.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 11:35         ` Jiri Kosina
  2019-07-08 12:34           ` Greg KH
@ 2019-07-08 17:56           ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-08 17:56 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 01:35:15PM +0200, Jiri Kosina wrote:
>On Mon, 8 Jul 2019, Sasha Levin wrote:
>>
>> >> If we were to start avoiding driver updates, it would act as an
>> >> incentive for people not to upgrade their kernel.
>> >
>> >I'm not sure I follow the logic here?
>>
>> The way I see it, the lower your "effective delta" is between two
>> kernels, the easier it is to move forward. For example, if I have a
>> product that runs on 4.19 and uses all our core kernel code + 10
>> drivers, and I know that those drivers had most of the fixes backported
>> to my LTS tree, I'd feel much more confident going to 5.4 knowing that
>> I already have most of the patches that come with 5.4.
>>
>> For me it's a matter of how one would budget a move from a kernel X LTS
>> to kernel Y LTS, and I think that as that budget requirement grows it's
>> actually harder to actually do it (and convince management), acting as a
>> negative incentive to stay with whatever works now.
>
>But where does the 'stable' aspect appear here?
>
>I think it's reasonable to expect 'stable' to mean 'minimal number of
>changes needed to maintain stability of the kernel', and that I believe
>was the original purpose of stable tree.

I think that we're parsing the words "stable kernel" differently. You
see "stable kernel" as a kernel that remains mostly the same over time
and accepts a very small amount of critical fixes.

On the other hand, my expectation of a "stable kernel" is a kernel
without known bugs. I associate the word "stable" with stable runtime
rather than a stable codebase.

>Now you seem to be repurposing 'stable' as 'as close to upstream as
>possible in order to minimize cost of version updates'.

I don't think that the stable kernel was meant to lag behind upstream
too much. Even the rules suggest that a commit just has to be upstream,
without regard to how long (as long as it made one release, so ~1 week
tops).

I'm not suggesting that we should be in sync with Linus, all I'm saying
is that users who stay close to upstream have an easier time moving to
newer kernels, and we want to provide that ability to users of the
stable kernel.

>I guess that's one of the reasons why distros are gradually turning away
>from the stable tree: the main purpose of distros is to provide stability,
>while it clearly is not minimizing accumulation of cost for future version
>updates.

I'm not sure about statistics, but I think that the stable tree is
gaining more "distro" users than losing them. I think it's also
important to note here that the stable tree doesn't work for everyone,
and that's perfectly fine.

Even with all the AUTOSEL stuff that goes in, a quick look at my mailbox
suggests that I spent more time finding missing patches from various
distro trees than reverting patches from the stable trees.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 12:37         ` Mark Brown
  2019-07-08 14:05           ` Guenter Roeck
@ 2019-07-08 18:01           ` Sasha Levin
  1 sibling, 0 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-08 18:01 UTC (permalink / raw)
  To: Mark Brown; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 01:37:33PM +0100, Mark Brown wrote:
>On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
>> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
>
>> > I'm not saying leave it alone, it's more a question of how
>> > aggressive we are about picking up things we think might be
>> > relevant fixes but haven't had some sort of domain specific
>> > analysis of.  Testing is a good way to mitigate the potential
>> > risks here.
>
>> I agree, and for various subsystems and drivers where the maintainers
>> volunteer their domain specific expertise to send backports to stable, I
>> have "blacklisted" it from AUTOSEL since indeed it's a much better
>> option.
>
>Hrm, it's definitely getting a bunch of stuff for my subsystems
>where I do tag things for stable...

You still need to explicitly ask me to blacklist it, but I'm more than
happy to if you feel the AUTOSEL process doesn't add value. Some
maintainers choose to keep AUTOSEL but just respond with "NAK" on
patches they don't want in.

>> > > This came up in the last MS, and the agreement there was that we expect
>> > > stable kernel users to test their workloads before throwing it into
>> > > production.
>
>> > That's kind of the problem - if people are doing testing and end
>> > up finding problems coming back in the stable kernel that's the
>> > sort of thing that encourages them to not just take stable en
>> > masse as we say they should.  Part of the deal with stable is
>> > that it is conservative, people can trust it to be a low risk
>> > update.  That's not happening now as far as I'm aware but it does
>> > worry me that it might happen.
>
>> Right, and the rate at which AUTOSEL commits are reverted is lower than
>> commits that are actually tagged for stable. If AUTOSEL commits on their
>> own were being reverted left and right I'd agree we need to tone it
>> down, but I don't see it happening now.
>
>I'm not sure how many people will actually report problems they
>experience upstream rather than just fixing things locally and
>just moving on.  The more code there is, the more likely it is that one
>of the users will report things.
>
>> > > If we were to start avoiding driver updates, it would act as an
>> > > incentive for people not to upgrade their kernel.
>
>> > I'm not sure I follow the logic here?
>
>> The way I see it, the lower your "effective delta" is between two
>> kernels, the easier it is to move forward. For example, if I have a
>> product that runs on 4.19 and uses all our core kernel code + 10
>> drivers, and I know that those drivers had most of the fixes backported
>> to my LTS tree, I'd feel much more confident going to 5.4 knowing that
>> I already have most of the patches that come with 5.4.
>
>I see, that's definitely a new one to me.  The concerns people
>usually have about upgrading are more around the core kernel
>changing performance characteristics or something in a way that
>disrupts important workloads.  I'm not quite sure I follow the
>logic there TBH, it seems to be discounting new development
>rather too much - even if the drivers have been very static
>there's all the integration with the rest of the kernel to think
>about.

My thinking is that we will need to address new core kernel developments
either way, which is why I haven't mentioned them here.

The variable cost here is how much effort will go into validating my
hardware devices and the code that runs them.

>> For me it's a matter of how one would budget a move from a kernel X LTS
>> to kernel Y LTS, and I think that as that budget requirement grows it's
>> actually harder to actually do it (and convince management), acting as a
>> negative incentive to stay with whatever works now.
>
>If the drivers are static enough to only be getting bug fixes
>surely the rest of the kernel is a massively more substantial
>concern?

They're not too static, and sadly them being less tested means I'm more
worried about drivers than core kernel code.

Sure, the core kernel is also a concern but as I've mentioned above, you
will pay the price for re-testing core kernel stuff anyway.

--
Thanks,
Sasha

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 15:10               ` Greg KH
  2019-07-08 15:18                 ` Takashi Iwai
@ 2019-07-08 18:08                 ` Sasha Levin
  2019-07-08 21:31                 ` Jiri Kosina
  2019-07-09 15:21                 ` Laura Abbott
  3 siblings, 0 replies; 27+ messages in thread
From: Sasha Levin @ 2019-07-08 18:08 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

On Mon, Jul 08, 2019 at 05:10:40PM +0200, Greg KH wrote:
>Well, I think the conversation will go just like it has in the past for
>this issue:
>	"We need to have someone track regressions!"
>	"X said they would do it but they need to be paid, any company
>	willing to sponsor this?"
>	{crickets}
>
>We know we need this, we have at least one talented and capable person
>to do the work, but no company is willing to step up and fund it :(

Maybe I am not clear on the role of the LF here, but why can't we get
the LF to self-fund a regression tracking project for the kernel?

Getting funding for something like this from companies is difficult.
It's hard to sell the value of something like this to managers even
though to us it's obviously *critical* (see the KernelCI case for
example), and even if a certain company secures funding, LF's method of
spinning up projects and trying to get them funded individually just
doesn't work.

--
Thanks,
Sasha

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 15:10               ` Greg KH
  2019-07-08 15:18                 ` Takashi Iwai
  2019-07-08 18:08                 ` Sasha Levin
@ 2019-07-08 21:31                 ` Jiri Kosina
  2019-07-09 15:44                   ` Rafael J. Wysocki
  2019-07-09 15:21                 ` Laura Abbott
  3 siblings, 1 reply; 27+ messages in thread
From: Jiri Kosina @ 2019-07-08 21:31 UTC (permalink / raw)
  To: Greg KH; +Cc: ksummit-discuss

On Mon, 8 Jul 2019, Greg KH wrote:

> Well, I think the conversation will go just like it has in the past for
> this issue:
> 	"We need to have someone track regressions!"
> 	"X said they would do it but they need to be paid, any company
> 	willing to sponsor this?"
> 	{crickets}

SUSE has actually been funding this for quite some time (back when Rafael 
was doing it), but it's really tricky.

We of course realize it's a very important long-term activity, from which
everybody profits.

At the same time, you need somebody who *really deeply* understands
everything inside and around the kernel development, otherwise you get
more harm and chaos than added value out of the whole exercise.

And if you have such a person (like we had Rafael), it's unlikely that 
person would want to do that work forever, and the funding company is also 
losing brainpower in other, more development-related areas (like PM in 
Rafael's case) at the same time.

So it's not as simple as "hey, you, company making money on linux, go pay 
someone to do this".

If I remember correctly (Rafael for sure would remember better), there 
were some attempts to have the regression tracking made by someone much 
more juniorish, but that person got of course immediately overwhelmed.

Thanks,

-- 
Jiri Kosina
SUSE Labs

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 15:10               ` Greg KH
                                   ` (2 preceding siblings ...)
  2019-07-08 21:31                 ` Jiri Kosina
@ 2019-07-09 15:21                 ` Laura Abbott
  3 siblings, 0 replies; 27+ messages in thread
From: Laura Abbott @ 2019-07-09 15:21 UTC (permalink / raw)
  To: Greg KH, Takashi Iwai; +Cc: ksummit-discuss

On 7/8/19 11:10 AM, Greg KH wrote:
> On Mon, Jul 08, 2019 at 04:33:28PM +0200, Takashi Iwai wrote:
>> On Mon, 08 Jul 2019 16:05:44 +0200,
>> Guenter Roeck wrote:
>>>
>>> On 7/8/19 5:37 AM, Mark Brown wrote:
>>>> On Mon, Jul 08, 2019 at 07:02:08AM -0400, Sasha Levin wrote:
>>>>> On Sat, Jul 06, 2019 at 01:32:14AM +0100, Mark Brown wrote:
>>>>
>>>>>> I'm not saying leave it alone, it's more a question of how
>>>>>> aggressive we are about picking up things we think might be
>>>>>> relevant fixes but haven't had some sort of domain specific
>>>>>> analysis of.  Testing is a good way to mitigate the potential
>>>>>> risks here.
>>>>
>>>>> I agree, and for various subsystems and drivers where the maintainers
>>>>> volunteer their domain specific expertise to send backports to stable, I
>>>>> have "blacklisted" it from AUTOSEL since indeed it's a much better
>>>>> option.
>>>>
>>>> Hrm, it's definitely getting a bunch of stuff for my subsystems
>>>> where I do tag things for stable...
>>>>
>>>>>>> This came up in the last MS, and the agreement there was that we expect
>>>>>>> stable kernel users to test their workloads before throwing it into
>>>>>>> production.
>>>>
>>>>>> That's kind of the problem - if people are doing testing and end
>>>>>> up finding problems coming back in the stable kernel that's the
>>>>>> sort of thing that encourages them to not just take stable en
>>>>>> masse as we say they should.  Part of the deal with stable is
>>>>>> that it is conservative, people can trust it to be a low risk
>>>>>> update.  That's not happening now as far as I'm aware but it does
>>>>>> worry me that it might happen.
>>>>
>>>>> Right, and the rate at which AUTOSEL commits are reverted is lower than
>>>>> commits that are actually tagged for stable. If AUTOSEL commits on their
>>>>> own were being reverted left and right I'd agree we need to tone it
>>>>> down, but I don't see it happening now.
>>>>
>>>> I'm not sure how many people will actually report problems they
>>>> experience upstream rather than just fixing things locally and
>>>> just moving on.  The more code there is, the more likely it is that one
>>>> of the users will report things.
>>>>
>>>
>>> I for my part will most definitely report any such problems, since each
>>> regression in stable releases is used as argument against merging
>>> stable releases (even if the regression rate is negligible), and I am
>>> very interested in getting that regression rate as close to zero as
>>> possible. Reporting each and every regression is an essential part
>>> of that.
>>
>> BTW, regarding regression: currently we have no central regression
>> tracking.  This is another big missing piece, and a thing to be
>> discussed in KS, IMO.
> 
> Well, I think the conversation will go just like it has in the past for
> this issue:
> 	"We need to have someone track regressions!"
> 	"X said they would do it but they need to be paid, any company
> 	willing to sponsor this?"
> 	{crickets}
> 
> We know we need this, we have at least one talented and capable person
> to do the work, but no company is willing to step up and fund it :(
> 
> It's like where we were 5 years ago with testing, everyone knew there
> was a problem, but no one was willing to do anything about it.  That
> time I convinced some LF member companies to start doing work within
> their companies toward this, but that really doesn't solve this type of
> problem as being "distributed" isn't the issue here...
> 
> thanks,
> 
> greg k-h

There are two parts here: a centralized place to track bugs and regressions
and a person to help manage those. While having a person to manage everything
would be good, getting the central tracking going without relying on a
single person is important.

Thanks,
Laura

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-08 21:31                 ` Jiri Kosina
@ 2019-07-09 15:44                   ` Rafael J. Wysocki
  2019-07-09 21:05                     ` Takashi Iwai
  0 siblings, 1 reply; 27+ messages in thread
From: Rafael J. Wysocki @ 2019-07-09 15:44 UTC (permalink / raw)
  To: ksummit-discuss

On Monday, July 8, 2019 11:31:45 PM CEST Jiri Kosina wrote:
> On Mon, 8 Jul 2019, Greg KH wrote:
> 
> > Well, I think the conversation will go just like it has in the past for
> > this issue:
> > 	"We need to have someone track regressions!"
> > 	"X said they would do it but they need to be paid, any company
> > 	willing to sponsor this?"
> > 	{crickets}
> 
> SUSE has actually been funding this for quite some time (back when Rafael 
> was doing it), but it's really tricky.
> 
> We of course realize it's a very important long-term activity, from which
> everybody profits.
> 
> At the same time, you need somebody who *really deeply* understands
> everything inside and around the kernel development, otherwise you get
> more harm and chaos than added value out of the whole exercise.
> 
> And if you have such a person (like we had Rafael), it's unlikely that 
> person would want to do that work forever, and the funding company is also 
> losing brainpower in other, more development-related areas (like PM in 
> Rafael's case) at the same time.
> 
> So it's not as simple as "hey, you, company making money on linux, go pay 
> someone to do this".
> 
> If I remember correctly (Rafael for sure would remember better), there 
> were some attempts to have the regression tracking made by someone much 
> more juniorish, but that person got of course immediately overwhelmed.

There were such attempts and yes, the people dropped the ball eventually.

Honestly, I don't agree with the idea that one person can practically track
regressions on a whole-kernel basis today, because there are too many potential
sources of information to follow.  You'd need to track all of the mailing lists used for
development, bug tracking systems in many places and so on.

When I was tracking regressions, it was more or less sufficient to follow the LKML,
and that was hard enough already at that time, but it is not sufficient any more (and
even the LKML itself has become much more of a fire hose since then).

The tracking of regressions, to be effective, would need to scale at least in the
same way as the development process does IMO.

* Re: [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement
  2019-07-09 15:44                   ` Rafael J. Wysocki
@ 2019-07-09 21:05                     ` Takashi Iwai
  0 siblings, 0 replies; 27+ messages in thread
From: Takashi Iwai @ 2019-07-09 21:05 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: ksummit-discuss

On Tue, 09 Jul 2019 17:44:13 +0200,
Rafael J. Wysocki wrote:
> 
> On Monday, July 8, 2019 11:31:45 PM CEST Jiri Kosina wrote:
> > On Mon, 8 Jul 2019, Greg KH wrote:
> > 
> > > Well, I think the conversation will go just like it has in the past for
> > > this issue:
> > > 	"We need to have someone track regressions!"
> > > 	"X said they would do it but they need to be paid, any company
> > > 	willing to sponsor this?"
> > > 	{crickets}
> > 
> > SUSE has actually been funding this for quite some time (back when Rafael 
> > was doing it), but it's really tricky.
> > 
> > We of course realize it's a very important long-term activity, from which
> > everybody profits.
> > 
> > At the same time, you need somebody who *really deeply* understands
> > everything inside and around the kernel development, otherwise you get
> > more harm and chaos than added value out of the whole exercise.
> > 
> > And if you have such a person (like we had Rafael), it's unlikely that 
> > person would want to do that work forever, and the funding company is also 
> > losing brainpower in other, more development-related areas (like PM in 
> > Rafael's case) at the same time.
> > 
> > So it's not as simple as "hey, you, company making money on linux, go pay 
> > someone to do this".
> > 
> > If I remember correctly (Rafael for sure would remember better), there 
> > were some attempts to have the regression tracking made by someone much 
> > more juniorish, but that person got of course immediately overwhelmed.
> 
> There were such attempts and yes, the people dropped the ball eventually.
> 
> Honestly, I don't agree with the idea that one person can practically track
> regressions on a whole-kernel basis today, because there are too many potential
> sources of information to follow.  You'd need to track all of the mailing lists used for
> development, bug tracking systems in many places and so on.
> 
> When I was tracking regressions, it was more or less sufficient to follow the LKML,
> and that was hard enough already at that time, but it is not sufficient any more (and
> even the LKML itself has become much more of a fire hose since then).
> 
> The tracking of regressions, to be effective, would need to scale at least in the
> same way as the development process does IMO.

Agreed.  And I believe the key is to establish a standard way to
report a regression from each subsystem maintainer's side.  That is,
instead of a "regression manager" gathering the regression reports
alone by him/herself, we let each maintainer report regressions
more easily to a central place.  And there we simply gather and
provide the link to each regression report on a dashboard.

Of course, this would need education for each maintainer and
developer, but it should be more scalable and sustainable than a
top-down model.
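
As a rough illustration of the bottom-up model described above, here is a
minimal sketch. It is not an existing kernel tool: the names
RegressionReport and Dashboard, the fields, and the example.org links are
all hypothetical, and the "central place" is assumed to do nothing but
collect links grouped by subsystem.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RegressionReport:
    """One report filed by a maintainer (hypothetical record layout)."""
    subsystem: str  # free-form label chosen by the maintainer, e.g. "sound"
    summary: str    # one-line description of the regression
    link: str       # URL of the original report (list archive, bug tracker)


class Dashboard:
    """Central place that only gathers links; it performs no triage."""

    def __init__(self):
        self._by_subsystem = defaultdict(list)

    def submit(self, report):
        # Each maintainer pushes reports; nobody has to go and hunt for them.
        self._by_subsystem[report.subsystem].append(report)

    def render(self):
        # Plain-text listing, subsystems in alphabetical order,
        # reports in the order they were submitted.
        lines = []
        for subsystem in sorted(self._by_subsystem):
            lines.append(f"{subsystem}:")
            for report in self._by_subsystem[subsystem]:
                lines.append(f"  - {report.summary} <{report.link}>")
        return "\n".join(lines)


if __name__ == "__main__":
    board = Dashboard()
    board.submit(RegressionReport("sound", "no audio after resume",
                                  "https://example.org/r/1"))
    board.submit(RegressionReport("pm", "suspend hangs",
                                  "https://example.org/r/2"))
    print(board.render())
```

The point of the sketch is only that the work of finding regressions stays
with the people who already see them; the central side is cheap to run.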


thanks,

Takashi

end of thread, other threads:[~2019-07-09 21:05 UTC | newest]

Thread overview: 27+ messages
2019-07-03  1:35 [Ksummit-discuss] [MAINTAINERS SUMMIT] stable kernel process automation and improvement Sasha Levin
2019-07-03 14:57 ` Laura Abbott
2019-07-05 13:54 ` Michael Ellerman
2019-07-05 14:13   ` Takashi Iwai
2019-07-05 16:17     ` Greg KH
2019-07-05 16:52     ` Sasha Levin
2019-07-05 16:41 ` Mark Brown
2019-07-05 20:12   ` Sasha Levin
2019-07-06  0:32     ` Mark Brown
2019-07-08 11:02       ` Sasha Levin
2019-07-08 11:35         ` Jiri Kosina
2019-07-08 12:34           ` Greg KH
2019-07-08 17:56           ` Sasha Levin
2019-07-08 12:37         ` Mark Brown
2019-07-08 14:05           ` Guenter Roeck
2019-07-08 14:33             ` Takashi Iwai
2019-07-08 15:10               ` Greg KH
2019-07-08 15:18                 ` Takashi Iwai
2019-07-08 18:08                 ` Sasha Levin
2019-07-08 21:31                 ` Jiri Kosina
2019-07-09 15:44                   ` Rafael J. Wysocki
2019-07-09 21:05                     ` Takashi Iwai
2019-07-09 15:21                 ` Laura Abbott
2019-07-08 14:50             ` Mark Brown
2019-07-08 15:06               ` Greg KH
2019-07-08 15:27                 ` Mark Brown
2019-07-08 18:01           ` Sasha Levin
