[Ksummit-discuss] [Stable kernel] feature backporting collaboration

* [Ksummit-discuss] [Stable kernel] feature backporting collaboration
@ 2016-09-01  2:01 Alex Shi
  2016-09-02  1:25 ` Levin, Alexander
  2016-09-02 13:47 ` Theodore Ts'o
  0 siblings, 2 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-01  2:01 UTC (permalink / raw)
  To: ksummit-discuss, Mark Brown; +Cc: ltsi-dev, gregkh

Hi All,

I am a Linaro stable kernel maintainer. Our stable kernel is based on LTS
plus backports of many upstream features. Detailed information about
LSK is here: https://wiki.linaro.org/LSK
https://git.linaro.org/?p=kernel/linux-linaro-stable.git

These feature backports are requested by many LSK members, most of
which are leading ARM product vendors. LSK aims at collaboration on
feature backporting, to reduce duplicated work. The current
LTSI: https://ltsi.linuxfoundation.org/what-is-ltsi has a
similar goal of backporting collaboration, but there are still a few
problems.

1. LTSI focuses on board support more than on feature backporting.

2. The LTSI kernel versions 3.10/3.14/4.1 are older than the LTS and LSK
versions 3.18/4.1/4.4.

3. Merging everything together isn't good for some users, and it cannot
give users the option to select their preferred kernel features. In
contrast, each feature in LSK is backported separately onto the latest
LTS, so users can pick just the features they want and merge them into
their own kernel.

4. Putting all vendor-specific drivers in one branch draws complaints, and
their in-development status makes it hard to handle changes in a
fast-forward stable kernel.

As for LSK, although most features are ARM related, LSK also provides
some common features that work on other architectures, such as cgroupv2,
RO-vDSO, KASAN, PAX_USERCOPY, etc. I believe these common backports are
also useful to the wider industry.
If so, could we work out a better way to collaborate on feature backporting?

Regards.
Alex

=== LSK backported features ===
Coresight
Coresight-TMC/ETM
IPA
dm-crypt performance
OPP v2
PCIe for arm64/Juno R1
PAN
OF-overlay
PSCI
Cgroup-writeback
RO-vDSO
KASan
KASLR
IOMMU DMA
Hibernate on arm64
Devfreq cooling
PAX_USERCOPY

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-01  2:01 [Ksummit-discuss] [Stable kernel] feature backporting collaboration Alex Shi
@ 2016-09-02  1:25 ` Levin, Alexander
  2016-09-02  2:43   ` Stephen Hemminger
  2016-09-02  9:54   ` Mark Brown
  2016-09-02 13:47 ` Theodore Ts'o
  1 sibling, 2 replies; 122+ messages in thread
From: Levin, Alexander @ 2016-09-02  1:25 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, ksummit-discuss, Mark Brown, gregkh

On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> Hi All,
> 
> I am a Linaro stable kernel maintainer. Our stable kernel is base on LTS
> plus much of upstream features backporting on them. Here is the detailed
> info of LSK: https://wiki.linaro.org/LSK
> https://git.linaro.org/?p=kernel/linux-linaro-stable.git
> 
> These kind of backporting features are requested by many LSK members
> which most are leading ARM product vendors. LSK target on the feature
> backporting collaboration, to reduce the duplicate work on that. Current
> LTSI: https://ltsi.linuxfoundation.org/what-is-ltsi, has
> similar target for backporting collocation. but there are still couples
> problems.
> 
> 1, LTSI is focus on board support more than feature backporting
> 
> 2, ltsi kernel version 3.10/3.14/4.1 is older than LTS and LSK 3.18/4.1/4.4.
> 
> 3, merge everything together isn't good for some users and can not give
> user option to select preferred kernel feature. On the contrary, each of
> feature backported separately on latest LTS in LSK, user can just pick
> their wanted features and merge them for their own kernel.
> 
> 4, all vendor specific driver in one branch get complains and developing
> status make it hard to handle changes in a fast-forward stable kernel.
> 
> As to LSK, although most feature are ARM related, but LSK also provide
> some common feature which works on other archs, like cgroupv2, RO-vDSO,
> KASAN, PAX_USERCOPY, etc. I believe this common backporting is also
> useful for common industries.
> If so, could we call a better way for feature backporting collaboration?

I really disagree with this approach. I think that backporting board support
like what LTSI does might make sense since it's self contained, but what LSK
does is just crazy.

Stable kernels have very strict restrictions that are focused on not taking
commits that have high potential to cause unintended side effects, incorrect
interactions with the rest of the kernel or just introduce new bugs.

Mixing in new features that interact with multiple subsystems is a recipe for
disaster. We barely pull off backporting what look like trivial fixes; trying
to do the same for more than that is bound to be broken.

As an alternative, why not use more recent stable kernels and customize the
config specifically for each user to enable only the features that that
specific user wants to have?

The benefit here is that if used correctly you'll get to use all the new shiny
features you want on a more recent kernel, and none of the things you don't
want. So yes, you're upgrading to a newer kernel all the time, but if I
understand your use-case right it shouldn't matter too much, more so if
you're already taking chances on backporting major features yourself.

-- 

Thanks,
Sasha

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02  1:25 ` Levin, Alexander
@ 2016-09-02  2:43   ` Stephen Hemminger
  2016-09-02  9:59     ` Mark Brown
  2016-09-02  9:54   ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Stephen Hemminger @ 2016-09-02  2:43 UTC (permalink / raw)
  To: Levin, Alexander via Ksummit-discuss; +Cc: ltsi-dev, Mark Brown, gregkh

On Thu, 1 Sep 2016 21:25:31 -0400
"Levin, Alexander via Ksummit-discuss" <ksummit-discuss@lists.linuxfoundation.org> wrote:

> On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> > Hi All,
> > 
> > I am a Linaro stable kernel maintainer. Our stable kernel is base on LTS
> > plus much of upstream features backporting on them. Here is the detailed
> > info of LSK: https://wiki.linaro.org/LSK
> > https://git.linaro.org/?p=kernel/linux-linaro-stable.git
> > 
> > These kind of backporting features are requested by many LSK members
> > which most are leading ARM product vendors. LSK target on the feature
> > backporting collaboration, to reduce the duplicate work on that. Current
> > LTSI: https://ltsi.linuxfoundation.org/what-is-ltsi, has
> > similar target for backporting collocation. but there are still couples
> > problems.
> > 
> > 1, LTSI is focus on board support more than feature backporting
> > 
> > 2, ltsi kernel version 3.10/3.14/4.1 is older than LTS and LSK 3.18/4.1/4.4.
> > 
> > 3, merge everything together isn't good for some users and can not give
> > user option to select preferred kernel feature. On the contrary, each of
> > feature backported separately on latest LTS in LSK, user can just pick
> > their wanted features and merge them for their own kernel.
> > 
> > 4, all vendor specific driver in one branch get complains and developing
> > status make it hard to handle changes in a fast-forward stable kernel.
> > 
> > As to LSK, although most feature are ARM related, but LSK also provide
> > some common feature which works on other archs, like cgroupv2, RO-vDSO,
> > KASAN, PAX_USERCOPY, etc. I believe this common backporting is also
> > useful for common industries.
> > If so, could we call a better way for feature backporting collaboration?  
> 
> I really disagree with this approach. I think that backporting board support
> like what LTSI does might make sense since it's self contained, but what LSK
> does is just crazy.
> 
> Stable kernels have very strict restrictions that are focused on not taking
> commits that have high potential to cause unintended side effects, incorrect
> interactions with the rest of the kernel or just introduce new bugs.
> 
> Mixing in new features that interact with multiple subsystems is a recipe for
> disaster. We barely pull off backporting what looks like trivial fixes, trying
> to do the same for more than that is bound be broken.
> 
> As an alternative, why not use more recent stable kernels and customize the
> config specifically for each user to enable on features that that specific
> user wants to have.
> 
> The benefit here is that if used correctly you'll get to use all the new shiny
> features you want on a more recent kernel, and none of the things you don't
> want. So yes, you're upgrading to a newer kernel all the time, but if I
> understant your use-case right it shouldn't matter too much, more so if
> you're already taking chances on backporting major features yourself.
> 

Agree, calling it a stable kernel is a misnomer; it looks like a fork. 
It really should be:
 Linaro Arm Home for Features not Upstream Kernel (LAHFUK)
or maybe linaro-next?

How come the feature matrix doesn't list what upstream kernel the feature
was merged into?

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02  1:25 ` Levin, Alexander
  2016-09-02  2:43   ` Stephen Hemminger
@ 2016-09-02  9:54   ` Mark Brown
  2016-09-02 10:16     ` [Ksummit-discuss] [LTSI-dev] " Geert Uytterhoeven
                       ` (2 more replies)
  1 sibling, 3 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-02  9:54 UTC (permalink / raw)
  To: Levin, Alexander; +Cc: ltsi-dev, ksummit-discuss, gregkh

On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via Ksummit-discuss wrote:
> On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:

> > I am a Linaro stable kernel maintainer. Our stable kernel is base on LTS
> > plus much of upstream features backporting on them. Here is the detailed

> I really disagree with this approach. I think that backporting board support
> like what LTSI does might make sense since it's self contained, but what LSK
> does is just crazy.

The bulk of these features are exactly that - they're isolated
driver-specific code or new subsystems.  There are also some things with wider
impact but it's nowhere near all of them.

> Stable kernels have very strict restrictions that are focused on not taking
> commits that have high potential to cause unintended side effects, incorrect
> interactions with the rest of the kernel or just introduce new bugs.

> Mixing in new features that interact with multiple subsystems is a recipe for
> disaster. We barely pull off backporting what looks like trivial fixes, trying
> to do the same for more than that is bound be broken.

It's what people are doing for products, they want newer features but
they also don't want to rebase their product kernel onto mainline as
that's an even bigger integration risk.  People aren't using this kernel
raw, they're using it as the basis for product kernels.  What this is
doing is getting a bunch of people using the same backports which shares
effort and hopefully makes it more likely that some of the security
relevant features will get deployed in products.  Ideally some of the
saved time can be spent on upstreaming things though I fear that's a
little optimistic.

> As an alternative, why not use more recent stable kernels and customize the
> config specifically for each user to enable on features that that specific
> user wants to have.

That's just shipping a kernel - I don't think anyone is silly enough to
ship an allmodconfig or similar in production (though I'm sure someone
can come up with an example).

> The benefit here is that if used correctly you'll get to use all the new shiny
> features you want on a more recent kernel, and none of the things you don't
> want. So yes, you're upgrading to a newer kernel all the time, but if I
> understant your use-case right it shouldn't matter too much, more so if
> you're already taking chances on backporting major features yourself.

Like I say, in this case updating to a newer kernel also means rebasing
the out of tree patch stack and taking a bunch of test risk from that -
in product development for the sorts of products that end up including
the LSK, the churn and risk from targeted backports is seen as much safer
than updating to an entire new upstream kernel.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02  2:43   ` Stephen Hemminger
@ 2016-09-02  9:59     ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-02  9:59 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: ltsi-dev, gregkh, Levin, Alexander via Ksummit-discuss

On Thu, Sep 01, 2016 at 07:43:53PM -0700, Stephen Hemminger wrote:

> Agree, calling it a stable kernel is a misnomer; it looks like a fork. 

The name predates me but I believe it comes from the fact that it's
shared development on Greg's LTSs (plus Android versions as those don't
always line up).  It's definitely *not* -next which people find implies
rather more of a bleeding edge than we're aiming for.  Naming is hard.

> It really should be:
>  Linaro Arm Home for Features not Upstream Kernel (LAHFUK)
> or maybe linaro-next?

The goal is that anything in the actual LSK is actually upstream and
backported, we try to avoid out of tree things (this is why OP-TEE isn't
in the LSK itself for example).

> How come the feature matrix doesn't list what upstream kernel the feature
> was merged into?

Apathy mostly (I guess there's at least a range indicated in the point
at which the feature gets flagged as being in the base kernel), plus a
desire to not end up documenting all upstream features but rather just
things that are backported (some of the things that are pure upstream
features there were previously backported in older LSKs).

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-02  9:54   ` Mark Brown
@ 2016-09-02 10:16     ` Geert Uytterhoeven
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
  2016-09-02 19:16     ` Levin, Alexander
  2 siblings, 0 replies; 122+ messages in thread
From: Geert Uytterhoeven @ 2016-09-02 10:16 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss

On Fri, Sep 2, 2016 at 11:54 AM, Mark Brown <broonie@kernel.org> wrote:
>> Stable kernels have very strict restrictions that are focused on not taking
>> commits that have high potential to cause unintended side effects, incorrect
>> interactions with the rest of the kernel or just introduce new bugs.
>
>> Mixing in new features that interact with multiple subsystems is a recipe for
>> disaster. We barely pull off backporting what looks like trivial fixes, trying
>> to do the same for more than that is bound be broken.
>
> It's what people are doing for products, they want newer features but
> they also don't want to rebase their product kernel onto mainline as
> that's an even bigger integration risk.  People aren't using this kernel
> raw, they're using it as the basis for product kernels.  What this is
> doing is getting a bunch of people using the same backports which shares
> effort and hopefully makes it more likely that some of the security
> relevant features will get deployed in products.  Ideally some of the
> saved time can be spent on upstreaming things though I fear that's a
> little optimistic.
>
>> As an alternative, why not use more recent stable kernels and customize the
>> config specifically for each user to enable on features that that specific
>> user wants to have.
>
> That's just shipping a kernel - I don't think anyone is silly enough to
> ship an allmodconfig or similar in production (though I'm sure someone
> can come up with an example).
>
>> The benefit here is that if used correctly you'll get to use all the new shiny
>> features you want on a more recent kernel, and none of the things you don't
>> want. So yes, you're upgrading to a newer kernel all the time, but if I
>> understant your use-case right it shouldn't matter too much, more so if
>> you're already taking chances on backporting major features yourself.
>
> Like I say in this case updating to a newer kernel also means rebasing
> the out of tree patch stack and taking a bunch of test risk from that -
> in product development for the sorts of products that end up including
> the LSK the churn and risk from targeted backports is seen as much safer
> than updating to an entire new upstream kernel.

So it's about finding a good balance between backporting new features into
your in-house tested kernel vs. forwardporting your in-house patches to a
community-tested kernel. Mistakes can be made during both back- and
forwardporting.

Nothing new to see here...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-01  2:01 [Ksummit-discuss] [Stable kernel] feature backporting collaboration Alex Shi
  2016-09-02  1:25 ` Levin, Alexander
@ 2016-09-02 13:47 ` Theodore Ts'o
  2016-09-02 19:31   ` Levin, Alexander
                     ` (2 more replies)
  1 sibling, 3 replies; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-02 13:47 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, gregkh, ksummit-discuss, Mark Brown

On Thu, Sep 01, 2016 at 10:01:13AM +0800, Alex Shi wrote:
> 
> I am a Linaro stable kernel maintainer. Our stable kernel is base on LTS
> plus much of upstream features backporting on them. Here is the detailed
> info of LSK: https://wiki.linaro.org/LSK
> https://git.linaro.org/?p=kernel/linux-linaro-stable.git

I'm really not sure what problem you are trying to solve here.

As near as I can tell, the kernels provided by a SOC vendor are a
snapshot in time of some LTS kernel, and after that, they don't bother
merging any bug fixes or security fixes from the upstream kernel.
They might take individual patches if they notice there's a problem
(e.g., it gets written about in the national press), but otherwise,
they'll be stuck on some nonsense such as 3.10.23.

Then the product vendors take the SOC kernel, and further hack it up,
and then once they take a snapshot, as far as I can tell, they don't
take any rolling updates from the SOC vendor either.  I'm not sure how
much of this is due to lack of engineering bandwidth, and how much of
this is due to being worried that a newer SOC kernel will introduce
regressions, but either way, they'll lock onto an old SOC kernel, and
apparently only take bug fixes when they notice there is a problem.
(And in multiple cases I've gotten calls from SOC vendors
asking for help in figuring out a problem, and more often than
not, the fix is in the latest LTS kernel, but that doesn't help them.)
And of course, in some cases, "never", unless it's written about in
the aforementioned national press, and even then, I'm not convinced
the product vendors will have the engineering staff to turn out
firmware upgrades for an older product.

So what's the point of moving features into some ancient kernel?
Who's going to take it?  Certainly not the product vendors, who
consume the SOC kernel.  The SOC vendors?  Why not just encourage them
to get their device drivers into staging, and just go to a newer LTS
kernel?  Because I guarantee that's going to be less risky than taking
a random collection of features, and backporting them into some
ancient kernel.

Or for that matter, why not simply go to the latest mainline
kernel?  If the SOC vendors aren't taking updates from the LTS
kernel anyway, and the LTS kernel exists only as a patch repository
where people can look for security fixes and bug fixes (sometimes
after the upstream maintainer has to point out it's in the LTS
kernel), then if they take, say, 4.7, in the future they might need to
take a look at 4.8.x, 4.9.x, etc., until the next LTS kernel is declared.
So that means that an SOC vendor or a downstream product vendor might
have to look at 3 or 4 patch releases instead of one.  Is that really
that hard?

     	  		 	    	  - Ted

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02  9:54   ` Mark Brown
  2016-09-02 10:16     ` [Ksummit-discuss] [LTSI-dev] " Geert Uytterhoeven
@ 2016-09-02 14:42     ` James Bottomley
  2016-09-02 14:55       ` Rik van Riel
                         ` (3 more replies)
  2016-09-02 19:16     ` Levin, Alexander
  2 siblings, 4 replies; 122+ messages in thread
From: James Bottomley @ 2016-09-02 14:42 UTC (permalink / raw)
  To: Mark Brown, Levin, Alexander; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via
> Ksummit-discuss wrote:
> > On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> 
> > > I am a Linaro stable kernel maintainer. Our stable kernel is base 
> > > on LTS plus much of upstream features backporting on them. Here 
> > > is the detailed
> 
> > I really disagree with this approach. I think that backporting 
> > board support like what LTSI does might make sense since it's self 
> > contained, but what LSK does is just crazy.
> 
> The bulk of these features are exactly that - they're isolated driver
> specific code or new subsystems.  There are also some things with 
> wider impact but it's nowhere near all of them.

It's crazy because it encourages precisely the wrong behaviour: vendors
target this tree not upstream.

> > Stable kernels have very strict restrictions that are focused on 
> > not taking commits that have high potential to cause unintended 
> > side effects, incorrect interactions with the rest of the kernel or 
> > just introduce new bugs.
> 
> > Mixing in new features that interact with multiple subsystems is a 
> > recipe for disaster. We barely pull off backporting what looks like 
> > trivial fixes, trying to do the same for more than that is bound be
> > broken.
> 
> It's what people are doing for products, they want newer features but
> they also don't want to rebase their product kernel onto mainline as
> that's an even bigger integration risk.  People aren't using this 
> kernel raw, they're using it as the basis for product kernels.  What 
> this is doing is getting a bunch of people using the same backports 
> which shares effort and hopefully makes it more likely that some of 
> the security relevant features will get deployed in products. 


And history repeats itself: this is almost the precise rationale the
distros used for all their out of tree patches in their 2.4 enterprise
kernels.  The disaster that ended up with (patch sets bigger than the
kernel itself with no way of getting them all upstream) is what led
directly to their upstream first policy.

The fact that all the distros track upstream more closely also means
it's better tested: the farther away from upstream you move, the more
problems you'll have.

>  Ideally some of the saved time can be spent on upstreaming things
> though I fear that's a little optimistic.

Such as a diff to mainline that grows without bound ...

> > As an alternative, why not use more recent stable kernels and 
> > customize the config specifically for each user to enable on 
> > features that that specific user wants to have.
> 
> That's just shipping a kernel - I don't think anyone is silly enough 
> to ship an allmodconfig or similar in production (though I'm sure
> someone can come up with an example).
> 
> > The benefit here is that if used correctly you'll get to use all 
> > the new shiny features you want on a more recent kernel, and none 
> > of the things you don't want. So yes, you're upgrading to a newer 
> > kernel all the time, but if I understant your use-case right it 
> > shouldn't matter too much, more so if you're already taking chances 
> > on backporting major features yourself.
> 
> Like I say in this case updating to a newer kernel also means 
> rebasing the out of tree patch stack and taking a bunch of test risk 
> from that

Risk you wouldn't have if you just followed upstream first.  You can
add this to the list of problems you created by not upstreaming the
patches.

>  - in product development for the sorts of products that end up
> including the LSK the churn and risk from targeted backports is seen 
> as much safer than updating to an entire new upstream kernel.

This is the attitude that needs to change.  If enterprises can finally
realise that tracking upstream more closely is a good strategy: shared
testing on the trunk, why can't embedded?  What is this huge risk they
see with the upstream kernel?  Granted, they have this vicious circle
where they need stuff that's not upstream because they targeted a
non-upstream kernel, which leads to them not wanting to upport it, but
surely it's Linaro's job to break this circle?

James


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
@ 2016-09-02 14:55       ` Rik van Riel
  2016-09-02 15:04         ` James Bottomley
  2016-09-02 17:06       ` Bird, Timothy
                         ` (2 subsequent siblings)
  3 siblings, 1 reply; 122+ messages in thread
From: Rik van Riel @ 2016-09-02 14:55 UTC (permalink / raw)
  To: James Bottomley, Mark Brown, Levin, Alexander
  Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, 2016-09-02 at 07:42 -0700, James Bottomley wrote:
> On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> > 
> > It's what people are doing for products, they want newer features
> > but
> > they also don't want to rebase their product kernel onto mainline
> > as
> > that's an even bigger integration risk.  People aren't using this 
> > kernel raw, they're using it as the basis for product
> > kernels.  What 
> > this is doing is getting a bunch of people using the same
> > backports 
> > which shares effort and hopefully makes it more likely that some
> > of 
> > the security relevant features will get deployed in products. 
> 
> 
> And history repeats itself: this is almost the precise rationale the
> distros used for all their out of tree patches in their 2.4
> enterprise
> kernels.  The disaster that ended up with (patch sets bigger than the
> kernel itself with no way of getting them all upstream) is what led
> directly to their upstream first policy.
> 
> The fact that all the distros track upstream more closely also means
> it's better tested: the farther away from upstream you move, the more
> problems you'll have.
> 
What exactly is the business case for re-learning the same
lesson the hard way, anyway?

The embedded people can either learn from the mistakes the
distro vendors made in the 2.4 era, which were repeated by
the Android kernel team later on, or they can choose to
repeat that mistake and learn things the hard way.

With 6-9 month time to market on products, do you really
have time for a 12 month rebase of a gigantic pile of
patches?

-- 

All Rights Reversed.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 14:55       ` Rik van Riel
@ 2016-09-02 15:04         ` James Bottomley
  2016-09-02 15:39           ` Rik van Riel
  0 siblings, 1 reply; 122+ messages in thread
From: James Bottomley @ 2016-09-02 15:04 UTC (permalink / raw)
  To: Rik van Riel, Mark Brown, Levin, Alexander
  Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, 2016-09-02 at 10:55 -0400, Rik van Riel wrote:
> On Fri, 2016-09-02 at 07:42 -0700, James Bottomley wrote:
> > On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> > >  
> > > It's what people are doing for products, they want newer features
> > > but they also don't want to rebase their product kernel onto 
> > > mainline as that's an even bigger integration risk.  People 
> > > aren't using this kernel raw, they're using it as the basis for 
> > > product kernels.  What this is doing is getting a bunch of people 
> > > using the same backports which shares effort and hopefully makes 
> > > it more likely that some of the security relevant features will
> > > get deployed in products. 
> > 
> > 
> > And history repeats itself: this is almost the precise rationale 
> > the distros used for all their out of tree patches in their 2.4
> > enterprise kernels.  The disaster that ended up with (patch sets 
> > bigger than the kernel itself with no way of getting them all 
> > upstream) is what led directly to their upstream first policy.
> > 
> > The fact that all the distros track upstream more closely also 
> > means it's better tested: the farther away from upstream you move, 
> > the more problems you'll have.
> > 
> What exactly is the business case for re-learning the same
> lesson the hard way, anyway?

It costs a lot less to learn from history instead of repeating it.
But, I suppose, it's not my money being wasted.

> The embedded people can either learn from the mistakes the
> distro vendors made in the 2.4 era, which was repeated by
> the Android kernel team later on, or they can choose to
> repeat that mistake and learn things the hard way.
> 
> With 6-9 month time to market on products, do you really
> have time for a 12 month rebase of a gigantic pile of
> patches?

You mean keep feeding them crack until they either die in an alley or
seek rehab?  It's a view, I suppose ... I just hate the idea that we
know why the behaviour is counterproductive and we have examples to
prove it, we just can't convince the addicts.  It seems socially inept
somehow.

James

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 15:04         ` James Bottomley
@ 2016-09-02 15:39           ` Rik van Riel
  0 siblings, 0 replies; 122+ messages in thread
From: Rik van Riel @ 2016-09-02 15:39 UTC (permalink / raw)
  To: James Bottomley, Mark Brown, Levin, Alexander
  Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, 2016-09-02 at 08:04 -0700, James Bottomley wrote:
> On Fri, 2016-09-02 at 10:55 -0400, Rik van Riel wrote:
> > On Fri, 2016-09-02 at 07:42 -0700, James Bottomley wrote:
> > > On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> > > >  
> > > > It's what people are doing for products, they want newer
> > > > features
> > > > but they also don't want to rebase their product kernel onto 
> > > > mainline as that's an even bigger integration risk.  People 
> > > > aren't using this kernel raw, they're using it as the basis
> > > > for 
> > > > product kernels.  What this is doing is getting a bunch of
> > > > people 
> > > > using the same backports which shares effort and hopefully
> > > > makes 
> > > > it more likely that some of the security relevant features will
> > > > get deployed in products. 
> > > 
> > > 
> > > And history repeats itself: this is almost the precise rationale 
> > > the distros used for all their out of tree patches in their 2.4
> > > enterprise kernels.  The disaster that ended up with (patch sets 
> > > bigger than the kernel itself with no way of getting them all 
> > > upstream) is what led directly to their upstream first policy.
> > > 
> > > The fact that all the distros track upstream more closely also 
> > > means it's better tested: the farther away from upstream you
> > > move, 
> > > the more problems you'll have.
> > > 
> > What exactly is the business case for re-learning the same
> > lesson the hard way, anyway?
> 
> It costs a lot less to learn from history instead of repeating it. 
>  But, I suppose, it's not my money being wasted.

The fact that there is demand for a collaborative
project on a common kernel tree to carry features
for the embedded folks suggests they are already
feeling the pain themselves.

What is missing is the realization that we already
have such a tree, where everybody (not just the
embedded folks) is collaborating on features.

The upstream kernel.

The embedded companies have the choice between
paying to duplicate a lot of the upstream kernel
work themselves, or participating in a larger
community, with less duplication of effort, and
more work done by developers who are not on their
payrolls.

-- 

All Rights Reversed.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 473 bytes --]


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
  2016-09-02 14:55       ` Rik van Riel
@ 2016-09-02 17:06       ` Bird, Timothy
  2016-09-05  1:45         ` NeilBrown
  2016-09-02 18:21       ` [Ksummit-discuss] " Olof Johansson
  2016-09-02 23:29       ` Mark Brown
  3 siblings, 1 reply; 122+ messages in thread
From: Bird, Timothy @ 2016-09-02 17:06 UTC (permalink / raw)
  To: James Bottomley, Mark Brown, Levin, Alexander
  Cc: ltsi-dev, gregkh, ksummit-discuss

> -----Original Message-----
>  James Bottomley wrote:
> On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> > On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via
> > Ksummit-discuss wrote:
> > > On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> >
> > > > I am a Linaro stable kernel maintainer. Our stable kernel is base
> > > > on LTS plus much of upstream features backporting on them. Here
> > > > is the detailed
> >
> > > I really disagree with this approach. I think that backporting
> > > board support like what LTSI does might make sense since it's self
> > > contained, but what LSK does is just crazy.
> >
> > The bulk of these features are exactly that - they're isolated driver
> > specific code or new subsystems.  There are also some things with
> > wider impact but it's nowhere near all of them.
> 
> It's crazy because it encourages precisely the wrong behaviour: vendors
> target this tree not upstream.
> 
> > > Stable kernels have very strict restrictions that are focused on
> > > not taking commits that have high potential to cause unintended
> > > side effects, incorrect interactions with the rest of the kernel or
> > > just introduce new bugs.
> >
> > > Mixing in new features that interact with multiple subsystems is a
> > > recipe for disaster. We barely pull off backporting what looks like
> > > trivial fixes, trying to do the same for more than that is bound to be
> > > broken.
> >
> > It's what people are doing for products, they want newer features but
> > they also don't want to rebase their product kernel onto mainline as
> > that's an even bigger integration risk.  People aren't using this
> > kernel raw, they're using it as the basis for product kernels.  What
> > this is doing is getting a bunch of people using the same backports
> > which shares effort and hopefully makes it more likely that some of
> > the security relevant features will get deployed in products.
> 
> 
> And history repeats itself: this is almost the precise rationale the
> distros used for all their out of tree patches in their 2.4 enterprise
> kernels.  The disaster that ended up with (patch sets bigger than the
> kernel itself with no way of getting them all upstream) is what led
> directly to their upstream first policy.
> 
> The fact that all the distros track upstream more closely also means it's better
> tested: the farther away from upstream you move, the more problems you'll
> have.
> 
> >  Ideally some of the saved time can be spent on upstreaming things
> > though I fear that's a little optimistic.
> 
> Such as a diff to mainline that grows without bound ...
> 
> > > As an alternative, why not use more recent stable kernels and
> > > customize the config specifically for each user to enable on
> > > features that that specific user wants to have.
> >
> > That's just shipping a kernel - I don't think anyone is silly enough
> > to ship an allmodconfig or similar in production (though I'm sure
> > someone can come up with an example).
> >
> > > The benefit here is that if used correctly you'll get to use all
> > > the new shiny features you want on a more recent kernel, and none
> > > of the things you don't want. So yes, you're upgrading to a newer
> > > kernel all the time, but if I understand your use-case right it
> > > shouldn't matter too much, more so if you're already taking chances
> > > on backporting major features yourself.
> >
> > Like I say in this case updating to a newer kernel also means
> > rebasing the out of tree patch stack and taking a bunch of test risk
> > from that
> 
> Risk you wouldn't have if you just followed upstream first.  You can
> add this to the list of problems you created by not upstreaming the
> patches.
> 
> >  - in product development for the sorts of products that end up
> > including the LSK the churn and risk from targeted backports is seen
> > as much safer than updating to an entire new upstream kernel.
> 
> This is the attitude that needs to change.  If enterprises can finally
> realise that tracking upstream more closely is a good strategy: shared
> testing on the trunk, why can't embedded?  What is this huge risk they
> see with the upstream kernel?  Granted, they have this vicious circle
> where they need stuff that's not upstream because they targeted a
> non-upstream kernel, which leads to them not wanting to upport it, but
> surely it's Linaro's job to break this circle?

I'll take a crack at this (the "why can't embedded?" question).
In many cases, when many of these embedded
SoC vendors first started with Linux there was a combination of
1) kernel not being sufficient for their needs (particularly in the mobile space)
2) inexperience with mainlining and the community, and
3) a rush to market (there's always a rush to market)

The solution that Android came up with, to ship first and worry about
mainlining either later or not at all, worked tremendously for them.
It caused pain throughout the supply chain (that I've felt personally),
but it got the job done.  My argument is that they could not possibly
have followed any kind of "upstream-first" strategy, even if they had
had the skills or inclination. As one example, it took 3 years before
the kernel community accepted their strategy for power management
in the mobile space.

Where we are now with some of these SoCs is at millions of lines of
code out-of-tree.  It's being reduced, slowly, but there are still
significant areas where the mainline kernel just doesn't have the
support needed for shipping product. My pet peeve is support for
charging over USB, where Linaro has had a patch set
stalled and/or ignored by the USB maintainer for 2 years!!

This discussion trivializes the difficulty of making progress mainlining
some of these pieces. At least the enterprise guys were working on
the same chip architecture. The problem for some of these
SoC vendors is that they have completely different approaches
to some issues, already shipping, and there's very little by
way of synergy with other vendors' patches.  That's an issue
of fragmentation (in the embedded space) that the enterprise
distros didn't have (well, not from my perspective - likely I'm trivializing
their issues :-).

Anyway, add to that the enormous churn caused by device tree, and it's
a miracle any of them have made significant progress upstreaming stuff
in the last few years.

The stark reality is that in the near term, mainline just won't be feasible to
ship in products on some SoCs.  And no SoC vendor is going to stop and
wait for it to mature before shipping product.

Now - I'm not holding the SoC vendors blameless.  Many of them have
done their upstreaming work in about the worst way possible, when
they've made attempts at all.  I sympathize with the notion that somehow
we've failed them by not convincing them how to do things right.

Sorry to rant... This discussion just hit an area I've felt deeply about, and
tried to do something about.
 -- Tim


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
  2016-09-02 14:55       ` Rik van Riel
  2016-09-02 17:06       ` Bird, Timothy
@ 2016-09-02 18:21       ` Olof Johansson
  2016-09-02 23:35         ` Mark Brown
                           ` (2 more replies)
  2016-09-02 23:29       ` Mark Brown
  3 siblings, 3 replies; 122+ messages in thread
From: Olof Johansson @ 2016-09-02 18:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, Sep 2, 2016 at 7:42 AM, James Bottomley
<James.Bottomley@hansenpartnership.com> wrote:
> On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
>> On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via
>> Ksummit-discuss wrote:
>> > On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
>>
>> > > I am a Linaro stable kernel maintainer. Our stable kernel is base
>> > > on LTS plus much of upstream features backporting on them. Here
>> > > is the detailed
>>
>> > I really disagree with this approach. I think that backporting
>> > board support like what LTSI does might make sense since it's self
>> > contained, but what LSK does is just crazy.
>>
>> The bulk of these features are exactly that - they're isolated driver
>> specific code or new subsystems.  There are also some things with
>> wider impact but it's nowhere near all of them.
>
> It's crazy because it encourages precisely the wrong behaviour: vendors
> target this tree not upstream.

The one case where it is warranted is for features that went in since
the last LTS release.

Pushing vendors all the way to target a non-LTS release is a bit more
aggressive than needed, in my opinion.


-Olof


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02  9:54   ` Mark Brown
  2016-09-02 10:16     ` [Ksummit-discuss] [LTSI-dev] " Geert Uytterhoeven
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
@ 2016-09-02 19:16     ` Levin, Alexander
  2016-09-03  0:05       ` Mark Brown
  2 siblings, 1 reply; 122+ messages in thread
From: Levin, Alexander @ 2016-09-02 19:16 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Fri, Sep 02, 2016 at 05:54:17AM -0400, Mark Brown wrote:
> On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via Ksummit-discuss wrote:
> > On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> 
> > > I am a Linaro stable kernel maintainer. Our stable kernel is base on LTS
> > > plus much of upstream features backporting on them. Here is the detailed
> 
> > I really disagree with this approach. I think that backporting board support
> > like what LTSI does might make sense since it's self contained, but what LSK
> > does is just crazy.
> 
> The bulk of these features are exactly that - they're isolated driver
> specific code or new subsystems.  There are also some things with wider
> impact but it's nowhere near all of them.

It's nowhere near all of them, but all it takes is one.

Look at KASLR and KASAN: they have complex interactions with pretty much the rest
of the kernel. Quite a few things not directly related to either of those had
to be fixed just because they were found to not integrate right (for example,
KASLR uncovered a bunch of bugs before it was actually merged in). Who says
that there aren't any similar interactions with the older kernels that no one
looked into?

> > Stable kernels have very strict restrictions that are focused on not taking
> > commits that have high potential to cause unintended side effects, incorrect
> > interactions with the rest of the kernel or just introduce new bugs.
> 
> > Mixing in new features that interact with multiple subsystems is a recipe for
> > disaster. We barely pull off backporting what looks like trivial fixes, trying
> > to do the same for more than that is bound to be broken.
> 
> It's what people are doing for products, they want newer features but
> they also don't want to rebase their product kernel onto mainline as
> that's an even bigger integration risk.  People aren't using this kernel
> raw, they're using it as the basis for product kernels.  What this is
> doing is getting a bunch of people using the same backports which shares
> effort and hopefully makes it more likely that some of the security
> relevant features will get deployed in products.  Ideally some of the
> saved time can be spent on upstreaming things though I fear that's a
> little optimistic.

I'm sorry but just calling a kernel "stable" doesn't mean that suddenly it
acquires the qualities of a stable kernel that follows the very strict rules
we have for those.

Given that you're backporting features into a stable kernel it really inherits
the code quality of a release candidate kernel; nowhere close to a stable
kernel.

The following is just my opinion as an LTS kernel maintainer: if you think
that the integration risk of a newer stable/LTS is bigger than that of using
these Frankenstein kernels, you are very much mistaken.

In your case it's nice if you could share backports between multiple users
(just like we try doing for all the stable/LTS trees), but the coverage and
testing you're going to get for that isn't anywhere close to what you'll have
for a more recent stable kernel that already has those features baked into
it.

> > As an alternative, why not use more recent stable kernels and customize the
> > config specifically for each user to enable on features that that specific
> > user wants to have.
> 
> That's just shipping a kernel - I don't think anyone is silly enough to
> ship an allmodconfig or similar in production (though I'm sure someone
> can come up with an example).

I highly doubt that most shipped kernels actually go through the process of
auditing every single config option and figuring out if they actually need it
or not (in part because the kernel's config is quite a mess). I really doubt
that the kernel is fine-tuned for the majority of the released products that
run Linux.

I think that time invested in improving the config code is much more important
than investing time in attempting to backport features.

> > The benefit here is that if used correctly you'll get to use all the new shiny
> > features you want on a more recent kernel, and none of the things you don't
> > want. So yes, you're upgrading to a newer kernel all the time, but if I
> > understand your use-case right it shouldn't matter too much, more so if
> > you're already taking chances on backporting major features yourself.
> 
> Like I say in this case updating to a newer kernel also means rebasing
> the out of tree patch stack and taking a bunch of test risk from that -
> in product development for the sorts of products that end up including
> the LSK the churn and risk from targeted backports is seen as much safer
> than updating to an entire new upstream kernel.

Same as I said before, the risk LSK introduces, IMO, is much greater than
rebasing an out-of-tree driver stack.

-- 

Thanks,
Sasha


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 13:47 ` Theodore Ts'o
@ 2016-09-02 19:31   ` Levin, Alexander
  2016-09-02 19:42     ` gregkh
  2016-09-03  2:04   ` Mark Brown
  2016-09-06  7:20   ` [Ksummit-discuss] [LTSI-dev] " Tsugikazu Shibata
  2 siblings, 1 reply; 122+ messages in thread
From: Levin, Alexander @ 2016-09-02 19:31 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, Mark Brown, ksummit-discuss, gregkh

Hi Ted,

On Fri, Sep 02, 2016 at 09:47:11AM -0400, Theodore Ts'o wrote:
> Or for that matter, why not simply going to the latest mainline
> kernel.  Since the SOC vendors aren't taking updates from the LTS
> kernel anyway, if the LTS kernel exists only as a patch repository
> where people can look for security fixes and bug fixes (sometimes
> after the upstream maintainer has to point out it's in the LTS
> kernel), if they take, say, 4.7, in the future they might need to take
> a look at 4.8.x, 4.9.x, etc., until the next LTS kernel is declared.
> So that means that an SOC vendor or a downstream product vendors might
> have to look at 3 or 4 patch releases instead of one.  Is that really
> that hard?

I agree with everything you said besides this last paragraph, and it's
our fault.

In theory, the flow of commits that need to go into the stable tree should
have uniform distribution: Linus takes fixes at any point in time, so unlike
new features, which come in only during the merge window, fixes should be
constantly flowing in.

However, this is not the case; looking at LTS kernel releases during merge
windows we can see that the volume of commits that go into the LTS kernels is
much higher than during release candidate cycles. Why? People still hold off on
sending fixes for a variety of reasons, which isn't the way it's supposed to
happen.

As a result, I'd never want to use mainline for production. The first kernel
I'd consider is a stable kernel that has taken in everything that was sent
during a merge window of the next release.

-- 

Thanks,
Sasha


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 19:31   ` Levin, Alexander
@ 2016-09-02 19:42     ` gregkh
  2016-09-02 20:06       ` Levin, Alexander
  0 siblings, 1 reply; 122+ messages in thread
From: gregkh @ 2016-09-02 19:42 UTC (permalink / raw)
  To: Levin, Alexander; +Cc: ltsi-dev, ksummit-discuss, Mark Brown

On Fri, Sep 02, 2016 at 03:31:03PM -0400, Levin, Alexander wrote:
> Hi Ted,
> 
> On Fri, Sep 02, 2016 at 09:47:11AM -0400, Theodore Ts'o wrote:
> > Or for that matter, why not simply going to the latest mainline
> > kernel.  Since the SOC vendors aren't taking updates from the LTS
> > kernel anyway, if the LTS kernel exists only as a patch repository
> > where people can look for security fixes and bug fixes (sometimes
> > after the upstream maintainer has to point out it's in the LTS
> > kernel), if they take, say, 4.7, in the future they might need to take
> > a look at 4.8.x, 4.9.x, etc., until the next LTS kernel is declared.
> > So that means that an SOC vendor or a downstream product vendors might
> > have to look at 3 or 4 patch releases instead of one.  Is that really
> > that hard?
> 
> I agree with everything you said besides this last paragraph, and it's
> our fault.
> 
> In theory, the flow of commits that need to go into the stable tree should
> have uniform distribution: Linus takes fixes at any point in time, so unlike
> new features that come in only during the merge window fixes should be
> constantly flowing in.
> 
> However, this is not the case; looking at LTS kernel releases during merge
> windows we can see that the volume of commits that go into LTS kernel is much
> higher than during release candidate cycles. Why? people still hold off on
> sending fixes for a variety of reasons, which isn't the way it's supposed to
> happen.

I disagree.  I tag things for stable and then hold off on sending them in
until -rc1 for a variety of good reasons:
	- it needs more testing
	- it really isn't that big of a deal, and can wait a few weeks

Only the "really big" things usually get sent from me to Linus after
-rc1 is out, stuff that affects a number of people (and not just one odd
device/platform), or fixes a regression.

I imagine other maintainers do the same thing, so I wouldn't read all
that much into this.

> As a result, I'd never want to use mainline for production. The first kernel
> I'd consider is a stable kernel that has taken in everything that was sent
> during a merge window of the next release.

That usually feels like the most "unstable" stable release for some
reason, maybe because of the size.  I don't have any real numbers to back
it up, as they all are obviously "good and stable" releases :)

thanks,

greg k-h


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 19:42     ` gregkh
@ 2016-09-02 20:06       ` Levin, Alexander
  0 siblings, 0 replies; 122+ messages in thread
From: Levin, Alexander @ 2016-09-02 20:06 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss, Mark Brown

On Fri, Sep 02, 2016 at 03:42:15PM -0400, gregkh@linuxfoundation.org wrote:
> On Fri, Sep 02, 2016 at 03:31:03PM -0400, Levin, Alexander wrote:
> > Hi Ted,
> > 
> > On Fri, Sep 02, 2016 at 09:47:11AM -0400, Theodore Ts'o wrote:
> > > Or for that matter, why not simply going to the latest mainline
> > > kernel.  Since the SOC vendors aren't taking updates from the LTS
> > > kernel anyway, if the LTS kernel exists only as a patch repository
> > > where people can look for security fixes and bug fixes (sometimes
> > > after the upstream maintainer has to point out it's in the LTS
> > > kernel), if they take, say, 4.7, in the future they might need to take
> > > a look at 4.8.x, 4.9.x, etc., until the next LTS kernel is declared.
> > > So that means that an SOC vendor or a downstream product vendors might
> > > have to look at 3 or 4 patch releases instead of one.  Is that really
> > > that hard?
> > 
> > I agree with everything you said besides this last paragraph, and it's
> > our fault.
> > 
> > In theory, the flow of commits that need to go into the stable tree should
> > have uniform distribution: Linus takes fixes at any point in time, so unlike
> > new features that come in only during the merge window fixes should be
> > constantly flowing in.
> > 
> > However, this is not the case; looking at LTS kernel releases during merge
> > windows we can see that the volume of commits that go into LTS kernel is much
> > higher than during release candidate cycles. Why? people still hold off on
> > sending fixes for a variety of reasons, which isn't the way it's supposed to
> > happen.
> 
> I disagree.  I tag things for stable and then hold off to send them in
> until -rc1 because of a variety of good reasons:
> 	- it needs more testing
> 	- it really isn't that big of a deal, and can wait a few weeks
> 
> Only the "really big" things usually get sent from me to Linus after
> -rc1 is out, stuff that affects a number of people (and not just one odd
> device/platform), or fixes a regression.
> 
> I imagine other maintainers do the same thing, so I wouldn't read all
> that much into this.

My point was that it seems like most maintainers actually work the other way
around: they may not send things for -rc5/6/7 and would rather hold off on
them until the next merge window. I'm just basing it on an objective count of
stable tagged commits during the merge window vs during release cycles.

What you described would lead to more stable tagged commits during release
cycles, but if you look at the numbers it's very much the other way around.

> > As a result, I'd never want to use mainline for production. The first kernel
> > I'd consider is a stable kernel that has taken in everything that was sent
> > during a merge window of the next release.
> 
> That usually feels like the most "unstable" stable release for some
> reason, maybe because of the size, I don't have any real numbers to back
> it up, as they all are obviously "good and stable" releases :)

Yes, I very much agree with that. I'm just saying that that very first stable
release usually has the highest number of fixes to the corresponding mainline
kernel, so I'd never use the mainline kernel until that first "unstable"
release.

-- 

Thanks,
Sasha


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 14:42     ` [Ksummit-discuss] " James Bottomley
                         ` (2 preceding siblings ...)
  2016-09-02 18:21       ` [Ksummit-discuss] " Olof Johansson
@ 2016-09-02 23:29       ` Mark Brown
  3 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-02 23:29 UTC (permalink / raw)
  To: James Bottomley; +Cc: ltsi-dev, gregkh, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 5376 bytes --]

On Fri, Sep 02, 2016 at 07:42:06AM -0700, James Bottomley wrote:
> On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:

> > The bulk of these features are exactly that - they're isolated driver
> > specific code or new subsystems.  There are also some things with 
> > wider impact but it's nowhere near all of them.

> It's crazy because it encourages precisely the wrong behaviour: vendors
> target this tree not upstream.

By "vendors" you mean "the people paying for the work"...

> And history repeats itself: this is almost the precise rationale the
> distros used for all their out of tree patches in their 2.4 enterprise
> kernels.  The disaster that ended up with (patch sets bigger than the
> kernel itself with no way of getting them all upstream) is what led
> directly to their upstream first policy.

Right, this is nothing new - but as we discussed earlier this year we
can't just wish away the product needs people have or the time taken to
implement change.

> >  Ideally some of the saved time can be spent on upstreaming things
> > though I fear that's a little optimistic.

> Such as a diff to mainline that grows without bound ...

Or gets smaller - some of the vendors are actually making some progress
here, and one of the things that people have worked out is that it's
really helpful if you want to try to push forwards to make sure that if
a problem is solved upstream you work with that solution rather than
rolling your own (and if one isn't there you get it upstream at the same
time as you develop it).

There's a continuum of behaviour here, obviously there are some vendors
who just don't care at all but you also have vendors working with
varying degrees of engagement in the community all the way up to those
who have real live customers running mainline on their systems.  

It's also interesting to look at where the diffs are - for example are
people rewriting subsystems or are they adding new drivers?  The more
it's just simple driver work the closer we're getting.

> > Like I say in this case updating to a newer kernel also means 
> > rebasing the out of tree patch stack and taking a bunch of test risk 
> > from that

> Risk you wouldn't have if you just followed upstream first.  You can
> add this to the list of problems you created by not upstreaming the
> patches.

You're not really engaging with the problem here.  People aren't
rewriting their entire stacks every time they move forwards.  Telling
them to just bin their entire code bases or not release any new products
for a year isn't helpful here; running mainline isn't a change people
can make overnight.  It's good to know where we want to get to but we
need to recognise that it's a journey.

It's not like things like laptops are doing a perfect job here - as was
covered in the previous stable discussion this year the experience using
a newly released x86 laptop chipset on Linux is usually not a fun one
initially.  Embedded vendors need to ship a fully working product, they
have to get things together before market rather than after.

> >  - in product development for the sorts of products that end up
> > including the LSK the churn and risk from targeted backports is seen 
> > as much safer than updating to an entire new upstream kernel.

> This is the attitude that needs to change.  If enterprises can finally
> realise that tracking upstream more closely is a good strategy: shared
> testing on the trunk, why can't embedded?

Note that what the LSK itself is doing is an example of the enterprise
approach - we're backporting things that are already upstream.  

>  What is this huge risk they
> see with the upstream kernel?  Granted, they have this vicious circle

As one progresses in system integration and QA the amount of change you
can tolerate in the product decreases - change control tightens up
considerably, by the time you get to shipping an end product you'll
often be at the point where any change is individually reviewed at the
top level of the engineering teams before it goes in.  There's a window
in people's development cycles where it is viable for them to move to a
newer kernel version (which is when they do do this) but after that the
need for retesting and the risk that something will break from a rebase
just becomes too great.

Now, one can work upstream and then do backports but that's not trivial
(especially when you're relying on other in flight things) and at some
point you are always going to end up shipping things downstream first
simply because upstream's timescales don't match up with the product
needs.

This is very much what the enterprise distributions are doing (although
on a much longer product cycle compared to the consumer electronics
world) - for example RHEL 7 shipped in June 2014 using a v3.10 based
kernel, v3.10 having been released in June 2013, and still has a v3.10
based kernel and it's those vendor kernels where their QA efforts are
focused.

> where they need stuff that's not upstream because they targeted a
> non-upstream kernel, which leads to them not wanting to upport it, but
> surely it's Linaro's job to break this circle?

The issue isn't not wanting to work upstream, the issue is wanting to
continue to get products out the door with a reasonable degree of QA.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 18:21       ` [Ksummit-discuss] " Olof Johansson
@ 2016-09-02 23:35         ` Mark Brown
  2016-09-03  5:29         ` Guenter Roeck
  2016-09-04  0:10         ` Theodore Ts'o
  2 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-02 23:35 UTC (permalink / raw)
  To: Olof Johansson; +Cc: James Bottomley, ltsi-dev, gregkh, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 720 bytes --]

On Fri, Sep 02, 2016 at 11:21:38AM -0700, Olof Johansson wrote:
> On Fri, Sep 2, 2016 at 7:42 AM, James Bottomley

> > It's crazy because it encourages precisely the wrong behaviour: vendors
> > target this tree not upstream.

> The one case where it is warranted is for features that went in since
> the last LTS release.

> Pushing vendors all the way to target a non-LTS release is a bit more
> aggressive than needed, in my opinion.

Which is basically what people are doing for the most part - the older
kernels are either the latest Android baseline (which is effectively the
latest LTS to people in that part of the game) or products already quite
far down the development cycle and just getting a bit of polish.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 19:16     ` Levin, Alexander
@ 2016-09-03  0:05       ` Mark Brown
  2016-09-05  9:28         ` Laurent Pinchart
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-03  0:05 UTC (permalink / raw)
  To: Levin, Alexander; +Cc: ltsi-dev, ksummit-discuss, gregkh

On Fri, Sep 02, 2016 at 03:16:37PM -0400, Levin, Alexander wrote:

> Look at KASLR and KASan, it has complex interactions with pretty much the rest
> of the kernel. Quite a few things not directly related to either of those had
> to be fixed just because they were found to not integrate right (For example,
> KASLR uncovered a bunch of bugs before it was actually merged in), who says
> that there aren't any similar interactions with the older kernels that no one
> looked into?

Sure, and this sort of thing is one of the reasons we have the ability
to disable things in Kconfig.  It's not risk free but it's very much
mitigated compared to tracking mainline.
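
A minimal sketch of that kind of check, assuming a hypothetical device
.config in which the riskier backported features (KASAN and KASLR are used
here purely as examples) have been left disabled.  The helper name and the
sample fragment are illustrative and not taken from any LSK tooling:

```python
def enabled_options(config_text):
    """Return the set of CONFIG_* symbols set to y or m in a .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled symbols appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


# Hypothetical fragment of a product kernel configuration.
SAMPLE_CONFIG = """\
CONFIG_ARM64=y
# CONFIG_KASAN is not set
# CONFIG_RANDOMIZE_BASE is not set
CONFIG_CGROUPS=y
"""

# Assumed list of backported features the product has not validated.
UNVALIDATED = {"CONFIG_KASAN", "CONFIG_RANDOMIZE_BASE"}

still_enabled = UNVALIDATED & enabled_options(SAMPLE_CONFIG)
print(sorted(still_enabled))
```

Something along these lines could be run against a shipped config to confirm
that a backported feature which has not been validated is compiled out.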

> > It's what people are doing for products, they want newer features but
> > they also don't want to rebase their product kernel onto mainline as
> > that's an even bigger integration risk.  People aren't using this kernel

> I'm sorry but just calling a kernel "stable" doesn't mean that suddenly it
> acquires the qualities of a stable kernel that follows the very strict rules
> we have for those.

> Given that you're backporting features into a stable kernel it really inherits
> the code quality of a release candidate kernel; nowhere close to a stable
> kernel.

> This following is just my opinion as an LTS kernel maintainer: if you think
> that the integration risk of a newer stable/LTS is bigger than using these
> frankenstein kernels you are very much mistaken.

I really don't think you understand the environment that this work is
done in.  You may have heard people mention the large amount of out of
tree code that vendors tend to be sitting on.  That interacts with a
*very* large chunk of the kernel, and of course there's also a bunch of
performance stuff that's being looked at beyond pure correctness issues.
Taking a new upstream requires a bunch of work to update the out of tree
code to any new kernel APIs and realistically it's going to trash a huge
chunk of the testing that's been done on the product and require at
least revalidation.  Taking a targeted update, especially one where the
riskier changes are configuration options, isn't free either but the
surface that needs to be looked at is much more known and controlled.

> In your case it's nice if you could share backports between multiple users
> (just like we try doing for all the stable/LTS trees), but the coverage and
> testing you're going to get for that isn't anywhere close to what you'll have
> for a more recent stable kernel that already has those features baked into
> that.

If everything were upstream, everyone was working directly upstream and
everyone had their QA focused on upstream what you're saying would be
more true but as everyone is so keen to point out that's just not what's
happening.  There's a bunch of other code in play on the relevant
systems which makes things that little bit more involved.

> > > As an alternative, why not use more recent stable kernels and customize the
> > > config specifically for each user to enable only features that that specific
> > > user wants to have.

> > That's just shipping a kernel - I don't think anyone is silly enough to
> > ship an allmodconfig or similar in production (though I'm sure someone
> > can come up with an example).

> I highly doubt that most shipped kernels actually go through the process of
> auditing every single config option and figuring out if they actually need it
> or not (in part because the kernel's config is quite a mess). I really doubt
> that the kernel is fine-tuned for the majority of the released products that run
> linux.

I'm sorry but I really don't follow what you're saying here - I'm not
sure anyone's out of tree code is the result of a failure to understand
Kconfig and I don't really understand the relevance of a detailed study
of configuration to the issues around rebasing.

> > Like I say in this case updating to a newer kernel also means rebasing
> > the out of tree patch stack and taking a bunch of test risk from that -
> > in product development for the sorts of products that end up including
> > the LSK the churn and risk from targeted backports is seen as much safer
> > than updating to an entire new upstream kernel.

> Same as I said before, the risk LSK introduces, IMO, is much greater than
> rebasing an out-of-tree driver stack.

I'm afraid you're very much mistaken if you believe that people are only
working on leaf drivers, or that nothing we do upstream has a meaningful
impact at the system level.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 13:47 ` Theodore Ts'o
  2016-09-02 19:31   ` Levin, Alexander
@ 2016-09-03  2:04   ` Mark Brown
  2016-09-06  7:20   ` [Ksummit-discuss] [LTSI-dev] " Tsugikazu Shibata
  2 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-03  2:04 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, ksummit-discuss, gregkh

On Fri, Sep 02, 2016 at 09:47:11AM -0400, Theodore Ts'o wrote:

> As near as I can tell, the kernels provided by a SOC vendor are a
> snapshot in time of some LTS kernel, and after that, they don't bother
> merging any bug fixes or security fixes from the upstream kernel.
> They might take individual patches if they notice there's a problem
> (e.g., it gets written about in the national press), but otherwise,
> they'll be stuck on some nonsense such as 3.10.23.

That's really not the case universally and it's not helpful to just
dismiss all SoC vendors out of hand like this.  There are some vendors
who do a bad job, there are other vendors who pride themselves on things
like keeping current and use it as a selling point with their customers.
Obviously the bad vendors are going to be more visible and more
memorable, and given your mention of v3.10 I'm guessing your experiences
here may have been quite a while ago and those vendors you were looking
at then may have changed the way they work.

> Then the product vendors take the SOC kernel, and further hack it up,

I think many of the companies and people involved would characterise at
least some of what they're doing as being substantial and useful
engineering.  I'm saying this not just from the point of view of
politeness but also because one of the fears that people often have is
that they're just going to get ignored and dismissed upstream by people
who don't understand anything except PCs.  Obviously that's not
generally the case, but as we were reminded in the other thread there
are times when a diplomatic approach can be helpful in changing the
behaviour of companies.

> and then once they take a snapshot, as far as I can tell, they don't
> take any rolling updates from the SOC vendor either.  I'm not sure how
> much of this is due to lack of engineering bandwidth, and how much of
> this is due to being worried that a newer SOC kernel will introduce
> regressions, but either way, they'll lock onto an old SOC kernel, and

Assuming the product vendor is doing product updates at all it's much
more the fear of regressions than anything else - the SoC vendor is
likely to be doing all kinds of work to make their SoC more appealing to
customers which often won't be compatible with something that people
want to be more stable and people can get *very* conservative about what
they'll touch.

> apparently only take bug fixes when they notice there is a problem.
> (And in multiple cases I've gotten calls for help from SOC vendors
> asking for assistance in figuring out a problem, and more often than
> not, the fix is in the latest LTS kernel, but that doesn't help them.)
> And of course, in some cases, "never", unless it's written about in
> the aforementioned national press, and even then, I'm not convinced
> the product vendors will have the engineering staff to turn out
> firmware upgrades for an older product.

As with so much here this varies a lot between vendors and the markets
they're in - some produce a fairly constant stream of firmware updates
over a long period of time and are geared up to do that, others never
think about the product once it's out the door.  In the case of phones
you've also got the whole carrier mess to deal with even when updates do
get provided by the system integrator.

> So what's the point of moving features into some ancient kernel?

For values of "ancient" that are mostly "current LTS" and "current
Android kernel" (there is usually the prior LTS too but the focus is on
the current one, that's generally getting wound down).  The Android
kernel thing is a separate issue which is being worked on, one of the
goals with the decision at last year's KS to move the time LTS kernels
are announced was to help move towards keeping the two in sync.

> Who's going to take it?  Certainly not the product vendors, who
> consume the SOC kernel.  The SOC vendors?  Why not just encourage them

This is for SoC vendors.

> to get their device drivers into staging, and just go to a newer LTS

People in the embedded space are doing a *little* bit more than just
churning out drivers.

> kernel?  Because I guarantee that's going to be less risky than taking
> a random collection of features, and backporting them into some
> ancient kernel.

People mostly have been persuaded to use a current LTS kernel for new
development, moving to a newer LTS than the current one has its
challenges.

> Or for that matter, why not simply going to the latest mainline
> kernel.  Since the SOC vendors aren't taking updates from the LTS
> kernel anyway, if the LTS kernel exists only as a patch repository

Like I say that's not universally the case so your assumptions here are
a bit flawed...

> where people can look for security fixes and bug fixes (sometimes
> after the upstream maintainer has to point out it's in the LTS
> kernel), if they take, say, 4.7, in the future they might need to take
> a look at 4.8.x, 4.9.x, etc., until the next LTS kernel is declared.
> So that means that an SOC vendor or a downstream product vendors might
> have to look at 3 or 4 patch releases instead of one.  Is that really
> that hard?

I'm not entirely sure I understand what you are saying here or how it
differs from what people are already doing with LTSs here?

If you're saying people at the start of development should track
mainline then to an extent until an LTS people already do that if
they've got the capacity to track upstream and have a reasonable belief
that there's going to be a LTS in a window that's useful to them (the
new timing does make that much more likely).  That does not, however,
mean that there isn't a substantial period after the kernel version is
locked in where development is still going on.

If you're saying people should just ship on mainline then I think that's
predicated on nobody ever doing LTS updates and even if that were the
case I'd have thought we want to encourage this.

You also have the problem that if everyone decides to pick different
baseline kernel versions (at the SoC vendor or product level) then any
third party vendor who wants to integrate with them ends up having to
target multiple kernel versions which hurts the ecosystem - it's more
work for them, and it makes it harder for people who want to buy these
devices to get software support if they happen to be on a different
version.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 18:21       ` [Ksummit-discuss] " Olof Johansson
  2016-09-02 23:35         ` Mark Brown
@ 2016-09-03  5:29         ` Guenter Roeck
  2016-09-03 10:40           ` Mark Brown
  2016-09-04  0:10         ` Theodore Ts'o
  2 siblings, 1 reply; 122+ messages in thread
From: Guenter Roeck @ 2016-09-03  5:29 UTC (permalink / raw)
  To: Olof Johansson; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Fri, Sep 02, 2016 at 11:21:38AM -0700, Olof Johansson wrote:
> On Fri, Sep 2, 2016 at 7:42 AM, James Bottomley
> <James.Bottomley@hansenpartnership.com> wrote:
> > On Fri, 2016-09-02 at 10:54 +0100, Mark Brown wrote:
> >> On Thu, Sep 01, 2016 at 09:25:31PM -0400, Levin, Alexander via
> >> Ksummit-discuss wrote:
> >> > On Wed, Aug 31, 2016 at 10:01:13PM -0400, Alex Shi wrote:
> >>
> >> > > I am a Linaro stable kernel maintainer. Our stable kernel is based
> >> > > on LTS plus many upstream features backported onto it. Here
> >> > > is the detailed
> >>
> >> > I really disagree with this approach. I think that backporting
> >> > board support like what LTSI does might make sense since it's self
> >> > contained, but what LSK does is just crazy.
> >>
> >> The bulk of these features are exactly that - they're isolated driver
> >> specific code or new subsystems.  There are also some things with
> >> wider impact but it's nowhere near all of them.
> >
> > It's crazy because it encourages precisely the wrong behaviour: vendors
> > target this tree not upstream.
> 
> The one case where it is warranted is for features that went in since
> the last LTS release.
> 
> Pushing vendors all the way to target a non-LTS release is a bit more
> aggressive than needed, in my opinion.
> 
Getting a bit off track here, but at least some vendors don't even track
stable kernels in the first place because they think it is too risky/unstable.
We just had a discussion about that, after all. Personally I would prefer
to get stable kernels to the point where people feel comfortable using them,
and then consider adding functionality on top of that if needed.

Having said that, an effort like this may still be helpful, I just wonder what
the intended use case is. Is it for vendors to actually use the new LTS+feature
branch, or for vendors to cherry-pick features from it?

Guenter

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-03  5:29         ` Guenter Roeck
@ 2016-09-03 10:40           ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-03 10:40 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Fri, Sep 02, 2016 at 10:29:14PM -0700, Guenter Roeck wrote:

> Having said that, an effort like this may still be helpful, I just wonder what
> the intended use case is. Is it for vendors to actually use the new LTS+feature
> branch, or for vendors to cherry-pick features from it?

In our case it's both, though we provide less support for cherry picking
features.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 18:21       ` [Ksummit-discuss] " Olof Johansson
  2016-09-02 23:35         ` Mark Brown
  2016-09-03  5:29         ` Guenter Roeck
@ 2016-09-04  0:10         ` Theodore Ts'o
  2016-09-04  8:34           ` gregkh
  2016-09-04 22:58           ` Amit Kucheria
  2 siblings, 2 replies; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-04  0:10 UTC (permalink / raw)
  To: Olof Johansson; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Fri, Sep 02, 2016 at 11:21:38AM -0700, Olof Johansson wrote:
> 
> The one case where it is warranted is for features that went in since
> the last LTS release.
> 
> Pushing vendors all the way to target a non-LTS release is a bit more
> aggressive than needed, in my opinion.

So maybe they shouldn't target 4.7 before 4.8 has been released, but
once 4.8 is released, what's the problem with having vendors use 4.7.x
at that point (where x would probably be 2 or 3)?

We've already established that they don't track the stable kernel bug
fixes, but are instead cherry picking fixes.  So let's suppose 4.9
turns out to be next LTS kernel.  The only downside of using 4.7.x for
an SOC kernel is people will have to search the 4.7.x, 4.8.x, and
4.9.x stable kernels to find commits to cherry pick.  Is that really
that much harder?
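
To make the cost concrete, the search could be sketched roughly as below.
This assumes, purely for illustration, that fixes are matched by commit
subject and that the per-branch lists have already been extracted (for
example from git log); the subjects and the helper name are made up:

```python
def cherry_pick_candidates(stable_branches, device_subjects):
    """List (branch, subject) pairs for fixes missing from the device kernel.

    stable_branches maps a branch name to its commit subjects, oldest
    branch first.  A fix that appears in several branches is reported
    once, for the oldest branch that carries it.
    """
    seen = set(device_subjects)
    candidates = []
    for branch, subjects in stable_branches.items():
        for subject in subjects:
            if subject not in seen:
                candidates.append((branch, subject))
                seen.add(subject)
    return candidates


# Hypothetical data: three stable branches and one device kernel.
stable = {
    "linux-4.7.y": ["fix A", "fix B"],
    "linux-4.8.y": ["fix B", "fix C"],
    "linux-4.9.y": ["fix C", "fix D"],
}
device = ["fix A"]

for branch, subject in cherry_pick_candidates(stable, device):
    print(branch, subject)
```

Walking three branches instead of one is only a longer loop; the sketch says
nothing about the harder part, which is knowing that the fixes exist and
applying them cleanly to a diverged tree.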

The big problem is knowing that there are patches to cherry pick, and
hoping that all of the mobile handset vendors know to cherry pick all
of the patches for their diverged, forked kernel.  Is needing to
search multiple stable kernels really a significant part of the cost of
updating the device kernel?

At this point, the only thing the LTS seems to provide is that
it limits the number of kernels that patches need to be backported to,
and it limits the number of sets of stable kernel patches that device
vendor maintainers need to search to find patches to backport from.

If it reduces the number of massive features that have to be
backported to 4.4, it might be that pushing them to use 4.7 (or
waiting for the next LTS kernel, which should hopefully be coming soon
anyway) might be less risky and less work than trying to back port
feature patchsets to 4.4 or 3.18.

					- Ted

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-04  0:10         ` Theodore Ts'o
@ 2016-09-04  8:34           ` gregkh
  2016-09-04 22:58           ` Amit Kucheria
  1 sibling, 0 replies; 122+ messages in thread
From: gregkh @ 2016-09-04  8:34 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: James Bottomley, ltsi-dev, ksummit-discuss

On Sat, Sep 03, 2016 at 08:10:31PM -0400, Theodore Ts'o wrote:
> If it reduces the number of massive features that have to be
> backported to 4.4, it might be that pushing them to use 4.7 (or
> waiting for the next LTS kernel, which should hopefully be coming soon
> anyway)

The next "LTS" kernel will be 4.9.

thanks,

greg k-h

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-04  0:10         ` Theodore Ts'o
  2016-09-04  8:34           ` gregkh
@ 2016-09-04 22:58           ` Amit Kucheria
  2016-09-04 23:51             ` Theodore Ts'o
  2016-09-05 11:11             ` Mark Brown
  1 sibling, 2 replies; 122+ messages in thread
From: Amit Kucheria @ 2016-09-04 22:58 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Sun, Sep 4, 2016 at 5:40 AM, Theodore Ts'o <tytso@mit.edu> wrote:
>
> If it reduces the number of massive features that have to be
> backported to 4.4, it might be that pushing them to use 4.7 (or
> waiting for the next LTS kernel, which should hopefully be coming soon
> anyway) might be less risky and less work than trying to back port
> feature patchsets to 4.4 or 3.18.

OK, I'll bite.

The vendors depend on Google providing an Android common tree[1] to
build their BSP on top of. Currently, there isn't anything newer than
a 4.4-based common tree from Google. It'll be early 2017 by the time
4.9 LTS is released and the Android common tree is available on it
before vendors can even start porting their BSPs to it. Some are
quicker than others, but safe to say that it'll be summer 2017 by the
time the vendor BSPs can run on 4.9 LTS. That leaves very little time
for product testing[2] and some slack, in order to ready devices for
the Christmas market[3]. The project managers will flag that as a
'risk' and the vendor will stick to a 4.4 LTS.

But since 4.4 LTS misses some key features that have landed in 4.5,
4.6, 4.7 and 4.8, something like LSK will fill that gap by backporting
those features.

So the problem is that it might be 2018 before the first 4.9-based CE
devices hit the market.

Now, if everything was on its way upstream, the BSP deltas would be
smaller and the whole porting and validation exercise on top of the
Android common tree would take much less time. But we aren't there yet
for all vendors, some are doing better than others.

Regards,
Amit

[1] https://android.googlesource.com/kernel/common/
[2] Across several SoC variants, I might add
[3] This may not be uniformly important across the globe, but it is an
empirical observation that flagship devices with the latest software
tend to appear in the 2nd half of the year.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-04 22:58           ` Amit Kucheria
@ 2016-09-04 23:51             ` Theodore Ts'o
  2016-09-05 12:58               ` Mark Brown
  2016-09-05 11:11             ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-04 23:51 UTC (permalink / raw)
  To: Amit Kucheria; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:
> 
> The vendors depend on Google providing an Android common tree[1] to
> build their BSP on top of. Currently, there isn't anything newer than
> a 4.4-based common tree from Google. It'll be early 2017 by the time
> 4.9 LTS is released and the Android common tree is available on it
> before vendors can even start porting their BSPs to it. Some are
> quicker than others, but safe to say that it'll be summer 2017 by the
> time the vendor BSPs can run on 4.9 LTS. That leaves very little time
> for product testing[2] and some slack, in order to ready devices for
> the Christmas market[3]. The project managers will flag that as a
> 'risk' and the vendor will stick to a 4.4 LTS.
> 
> But since 4.4 LTS misses some key features that have landed in 4.5,
> 4.6, 4.7 and 4.8, something like LSK will fill that gap by backporting
> those features.

But the BSP kernel already has thousands of commits and tens of
thousands of lines of code (some of which guarantee that the kernel
won't build on anything other than ARM, which makes testing file
systems using KVM Very Hard).  So if you backport to LSK, that's not
going to help the BSP kernel unless someone then cherry picks patches
from LSK to the BSP kernel.  So you are doing two very risky things;
one is backporting a feature to 4.4 LTS, and then cherry picking the
feature from the 4.4 LTS upstream kernel to the BSP kernel.

I've done this sort of thing before, with ext4 encryption, where I was
very familiar with the code and where I had a comprehensive test suite
so I could at least test the first part of it using kvm/x86.  And I
can tell you it's an *extremely* fraught and tricky thing.  What works
for device A won't work for device B, and just because you've
backported the feature to a 3.10 or 3.18 upstream kernel, and tested it
extensively, doesn't mean that it is easy and risk free to cherry pick
the patch to device kernel A and device kernel B, because the changes
made by SOC vendor A and SOC vendor B may be quite different (and may
not even be based on the same upstream kernel version).  And although
we finally have xfstests (sorta[1]) working under Android today, we
didn't have it when we started, and there were bugs introduced just
doing the cherry pick.

[1] https://github.com/tytso/xfstests-bld/blob/master/Documentation/android-xfstests.md

And this was with the subsystem expert personally doing the backport
and the cherry pick, and where each cherry pick for each different
device kernel had to be done separately and tested separately.

Doing this generally for a large number of device kernels, and for
features where you weren't the original developer and for which you
might not have a deep set of regression tests?  Good luck with that....

> Now, if everything was on its way upstream, the BSP deltas would be
> smaller and the whole porting and validation exercise on top of the
> Android common tree would take much less time. But we aren't there yet
> for all vendors, some are doing better than others.

Has there been *any* forward progress since last year in terms of
reducing the BSP deltas?

Even if we can't get everyone working from upstream, if multiple BSP
deltas for different SOC's could be integrated into a single common
4.4 tree, and if that 4.4 tree could still generate bootable x86
kernels, so that could be the basis of an Android common branch, that
would be a good start.  But I suspect we are a looooooong way away
from that.

						- Ted

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-02 17:06       ` Bird, Timothy
@ 2016-09-05  1:45         ` NeilBrown
  2016-09-05 11:04           ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: NeilBrown @ 2016-09-05  1:45 UTC (permalink / raw)
  To: Bird, Timothy, James Bottomley, Mark Brown, Levin, Alexander
  Cc: ltsi-dev, gregkh, ksummit-discuss

On Sat, Sep 03 2016, Bird, Timothy wrote:

> Where we are now with some of these SoCs is at millions of lines of
> code out-of-tree.  It's being reduced, slowly, but there are still
> significant areas where the mainline kernel just doesn't have the
> support needed for shipping product. My pet peeve is support for
> charging over USB, where Linaro has had a patch set
> being stalled and/or ignored by the USB maintainer for 2 years!!

Do you have a link to that?  I have an interest in charging over USB.

Thanks,
NeilBrown

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-03  0:05       ` Mark Brown
@ 2016-09-05  9:28         ` Laurent Pinchart
  2016-09-21  6:58           ` Alex Shi
  0 siblings, 1 reply; 122+ messages in thread
From: Laurent Pinchart @ 2016-09-05  9:28 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: ltsi-dev, gregkh

On Saturday 03 Sep 2016 01:05:18 Mark Brown wrote:
> On Fri, Sep 02, 2016 at 03:16:37PM -0400, Levin, Alexander wrote:
> > Look at KASLR and KASan, it has complex interactions with pretty much the
> > rest of the kernel. Quite a few things not directly related to either of
> > those had to be fixed just because they were found to not integrate right
> > (For example, KASLR uncovered a bunch of bugs before it was actually
> > merged in), who says that there aren't any similar interactions with the
> > older kernels that no one looked into?
> 
> Sure, and this sort of thing is one of the reasons we have the ability
> to disable things in Kconfig.  It's not risk free but it's very much
> mitigated compared to tracking mainline.
> 
> > > It's what people are doing for products, they want newer features but
> > > they also don't want to rebase their product kernel onto mainline as
> > > that's an even bigger integration risk.  People aren't using this kernel
> > 
> > I'm sorry but just calling a kernel "stable" doesn't mean that suddenly it
> > acquires the qualities of a stable kernel that follows the very strict
> > rules we have for those.
> > 
> > Given that you're backporting features into a stable kernel it really
> > inherits the code quality of a release candidate kernel; nowhere close to
> > a stable kernel.
> > 
> > This following is just my opinion as an LTS kernel maintainer: if you
> > think
> > that the integration risk of a newer stable/LTS is bigger than using these
> > frankenstein kernels you are very much mistaken.
> 
> I really don't think you understand the environment that this work is
> done in.  You may have heard people mention the large amount of out of
> tree code that vendors tend to be sitting on.  That interacts with a
> *very* large chunk of the kernel, and of course there's also a bunch of
> performance stuff that's being looked at beyond pure correctness issues.
> Taking a new upstream requires a bunch of work to update the out of tree
> code to any new kernel APIs and realistically it's going to trash a huge
> chunk of the testing that's been done on the product and require at
> least revalidation.  Taking a targeted update, especially one where the
> riskier changes are configuration options, isn't free either but the
> surface that needs to be looked at is much more known and controlled.
> 
> > In your case it's nice if you could share backports between multiple users
> > (just like we try doing for all the stable/LTS trees), but the coverage
> > and
> > testing you're going to get for that isn't anywhere close to what you'll
> > have for a more recent stable kernel that already has those features
> > baked into that.
> 
> If everything were upstream, everyone was working directly upstream and
> everyone had their QA focused on upstream what you're saying would be
> more true but as everyone is so keen to point out that's just not what's
> happening.  There's a bunch of other code in play on the relevant
> systems which makes things that little bit more involved.
> 
> > > > As an alternative, why not use more recent stable kernels and
> > > > customize the
> > > > config specifically for each user to enable only features that that
> > > > specific
> > > > user wants to have.
> > > 
> > > That's just shipping a kernel - I don't think anyone is silly enough to
> > > ship an allmodconfig or similar in production (though I'm sure someone
> > > can come up with an example).
> > 
> > I highly doubt that most shipped kernels actually go through the process
> > of auditing every single config option and figuring out if they actually
> > need it or not (in part because the kernel's config is quite a mess). I
> > > really doubt that the kernel is fine-tuned for the majority of the released
> > products that run linux.
> 
> I'm sorry but I really don't follow what you're saying here - I'm not
> sure anyone's out of tree code is the result of a failure to understand
> Kconfig and I don't really understand the relevance of a detailed study
> of configuration to the issues around rebasing.
> 
> > > Like I say in this case updating to a newer kernel also means rebasing
> > > the out of tree patch stack and taking a bunch of test risk from that -
> > > in product development for the sorts of products that end up including
> > > the LSK the churn and risk from targeted backports is seen as much safer
> > > than updating to an entire new upstream kernel.
> > 
> > Same as I said before, the risk LSK introduces, IMO, is much greater than
> > rebasing an out-of-tree driver stack.
> 
> I'm afraid you're very much mistaken if you believe that people are only
> working on leaf drivers, or that nothing we do upstream has a meaningful
> impact at the system level.

To provide a real-life example, we recently ran into a scheduler issue in a 
project I'm working on. The device is a phone running a Qualcomm kernel, and 
the scheduler is so hacked by the vendor to cover the phone use cases that 
creating a spinning high priority SCHED_FIFO thread in userspace kills the 
system instantly. That's the kind of crap vendors tend to ship, and moving to 
a newer kernel version pretty much means they have to revalidate all the 
scheduler-related use cases (and add more awful hacks to "fix issues 
introduced in mainline").
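
For anyone who wants to try this kind of test on their own device, here is a
minimal sketch of a spinning SCHED_FIFO task in userspace. It is not the
program used on the phone mentioned above; the function name, the priority
and the duration are made up for illustration, and the spin is bounded so the
task always gives the CPU back. On a mainline kernel the RT throttling limit
(kernel.sched_rt_runtime_us) normally leaves some CPU time for other tasks,
so a healthy system should stay responsive.

```python
import os
import time


def spin_fifo(priority=1, seconds=0.05):
    """Busy-loop under SCHED_FIFO for a bounded time (hypothetical helper).

    Returns True if the real-time policy was applied, False if it is not
    available or the kernel refused it (for example, missing CAP_SYS_NICE).
    """
    if not hasattr(os, "SCHED_FIFO"):
        return False  # platform without the Linux scheduler API
    try:
        # pid 0 means the caller; SCHED_FIFO priorities run from 1 to 99.
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError:
        return False

    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            pass  # never sleeps or yields while the deadline has not passed
    finally:
        # Drop back to the default policy so the caller is not left real-time.
        os.sched_setscheduler(0, os.SCHED_OTHER, os.sched_param(0))
    return True
```

Calling spin_fifo(priority=50, seconds=2) as root on a test board and watching
whether the UI or a shell stays responsive gives a rough idea of how the
scheduler copes; without the right privileges the call just returns False.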

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05  1:45         ` NeilBrown
@ 2016-09-05 11:04           ` Mark Brown
  2016-09-05 22:44             ` NeilBrown
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-05 11:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: ltsi-dev, ksummit-discuss, gregkh, James Bottomley

[-- Attachment #1: Type: text/plain, Size: 769 bytes --]

On Mon, Sep 05, 2016 at 11:45:52AM +1000, NeilBrown wrote:
> On Sat, Sep 03 2016, Bird, Timothy wrote:

> > Where we are now with some of these SoCs is at millions of lines of
> > code out-of-tree.  It's being reduced, slowly, but there are still
> > significant areas where the mainline kernel just doesn't have the
> > support needed for shipping product. My pet peeve is support for
> > charging over USB, where Linaro has had a patch set
> > being stalled and/or ignored by the USB maintainer for 2 years!!

> Do you have a link to that?  I have an interest in charging over USB.

This is it:

    https://lkml.org/lkml/2016/7/1/35

it's been more like one year than two and there has been progress but
there's also been an awful lot of latency in the process too.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-04 22:58           ` Amit Kucheria
  2016-09-04 23:51             ` Theodore Ts'o
@ 2016-09-05 11:11             ` Mark Brown
  2016-09-05 14:03               ` Theodore Ts'o
  1 sibling, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-05 11:11 UTC (permalink / raw)
  To: Amit Kucheria; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

[-- Attachment #1: Type: text/plain, Size: 562 bytes --]

On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:

> The vendors depend on Google providing an Android common tree[1] to
> build their BSP on top of. Currently, there isn't anything newer than
> a 4.4-based common tree from Google. It'll be early 2017 by the time
> 4.9 LTS is released and the Android common tree is available on it

It's not just having an Android tree either, it's having an Android tree
that they're confident is actively used and tested by Google.  Simply
making a tree available wouldn't be enough, that was tried in the past.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-04 23:51             ` Theodore Ts'o
@ 2016-09-05 12:58               ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-05 12:58 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

[-- Attachment #1: Type: text/plain, Size: 5587 bytes --]

On Sun, Sep 04, 2016 at 07:51:57PM -0400, Theodore Ts'o wrote:
> On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:

> > But since 4.4 LTS misses some key features that have landed in 4.5,
> > 4.6, 4.7 and 4.8, something like LSK will fill that gap by backporting
> > those features.

> But the BSP kernel already has thousands of commits and tens of
> thousands of lines of code (some of which guarantee that the kernel
> won't build on anything other than ARM, which makes testing file
> systems using KVM Very Hard).  So if you backport to LSK, that's not
> going to help the BSP kernel unless someone then cherry picks patches
> from LSK to the BSP kernel.  So you are doing two very risky things;
> one is backporting a feature to 4.4 LTS, and then cherry picking the
> feature from the 4.4 LTS upstream kernel to the BSP kernel.

It's generally a merge rather than a cherry pick but whatever.

> I've done this sort of thing before, with ext4 encryption, where I was
> very familiar with the code and where I had a comprehensive test suite
> so I could at least test the first part of it using kvm/x86.  And I
> can tell you it's an *extremely* fraught and tricky thing.  What works

The difficulty level is all really dependent on what the feature is,
where it hooks in and how many implicit dependencies it has on things.
Filesystem code is some of the most fraught but other places it's a lot
more straightforward and commonly done - for my subsystems I know that
huge chunks of the code have been developed and then immediately
backported (and probably also developed downstream and then forward
ported a lot) and consider this a totally reasonable thing to do for a
lot of features. 

> for device A won't work for device B, and just because you've
> backported the feature a 3.10 or 3.18 upstream kernel, and tested it
> extensively, doesn't mean that it is easy and risk free to cherry pick
> the patch to device kernel A and device kernel B, because the changes
> made by SOC vendor A and SOC vendor B may be quite different (and may
> not even be based on the same upstream kernel version).  And although
> we finally have xfstests (sorta[1]) working under Android today, we
> didn't have it when we started, and there were bugs introduced just
> doing the cherry pick.

One of the secondary goals here is to get more people using the same
backports for the common code so that the variance between the vendor
kernels is mitigated and where there are problems we can at least deal
with them centrally.

In terms of consistent versions one of the really positive things that
Android has achieved is that the Android kernel versions have managed to
get most of the vendors onto the same set of kernel versions (which are
normally LTSs too) - product schedules being what they are this doesn't
work out 100% but normally current products will all be on the same
version.  Things could get a little better but we're so far forward
compared to where we were before Android.

> > Now, if everything was on its way upstream, the BSP deltas would be
> > smaller and the whole porting and validation exercise on top of the
> > Android common tree would take much less time. But we aren't there yet
> > for all vendors, some are doing better than others.

> Has there been *any* forward progress since last year in terms of
> reducing the BSP deltas?

Things could be a lot better than they are but there's progress.  For me
the most encouraging thing is that we're seeing changes going into
common code which make it more suitable for embedded use cases, the
drivers are annoying but ultimately relatively minor in impact and easy
to address but once people start going off on their own with generic
code it is a much bigger problem.

Off the top of my head the EAS work is a nice example of this happening -
it's still in progress but that's a nice big chunk of non-trivial
scheduler changes which are being actively worked on upstream and seem
to be getting somewhere (there are some changes in already).  There's
been different bits of out of tree code around for this for years which
are now on their way out.  Things like power domains have been quite
active too, and I know a good chunk of the framework changes I'm
personally merging are coming from vendors.  Indeed I've recently had to
push Intel to finally start using the common clock framework because x86
has been holding back progress on integrating things.

> Even if we can't get everyone working from upstream, if multiple BSP
> deltas for different SOC's could be integrated into a single common
> 4.4 tree, and if that 4.4 tree could still generate bootable x86

That's impractical - nobody wants to be sitting between a vendor and
their customer when things are getting hairy, and if people are paying
the integration costs of getting everything into a common tree then we
want them to be doing that upstream rather than in some downstream tree.
You can get a long way in terms of lines of code relatively easily but
less so in terms of usefulness and if you can solve the problems there
then you've probably solved the upstreaming problem.

> kernels, so that could be the basis of an Android common branch, that
> would be a good start.  But I suspect we are a looooooong way away
> from that.

The Android tree itself is minimal because there's a desire to get
people taking updates to it as far into the product development cycle as
possible, the less code there is in there the easier that is.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 11:11             ` Mark Brown
@ 2016-09-05 14:03               ` Theodore Ts'o
  2016-09-05 14:22                 ` Laurent Pinchart
  0 siblings, 1 reply; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-05 14:03 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Mon, Sep 05, 2016 at 12:11:05PM +0100, Mark Brown wrote:
> On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:
> 
> > The vendors depend on Google providing an Android common tree[1] to
> > build their BSP on top of. Currently, there isn't anything newer than
> > a 4.4-based common tree from Google. It'll be early 2017 by the time
> > 4.9 LTS is released and the Android common tree is available on it
> 
> It's not just having an Android tree either, it's having an Android tree
> that they're confident is actively used and tested by Google.  Simply
> making a tree available wouldn't be enough, that was tried in the past.

One of the problems is I can't test the Android common tree, because I
don't have access to hardware that I can boot on that tree.  (To be
honest, I'm not even sure what hardware would boot on it.)  And thanks
to the tender loving care the SOC vendors have lavished on their
BSP kernels, if there has been BSP "value added patches" from ARM SOC
vendors applied to a kernel tree, chances are extremely high that you
can no longer do testing using kvm-xfstests.  (In some cases I was
able to bash the tree enough that it would boot under kvm/x86, but in
many cases, the ARM SOC changes were so horrible that it was
hopeless.)

This is why upstream-first is so darned important.  And why sloppy
patches that break other architectures are a really bad idea, even if
they are for a vendor-only BSP kernel....

Maybe there will be some hope if some of the features from ARM64
server can infect the SOC community --- Jon Masters really had the
right idea when he insisted on one kernel to boot all ARM64 kernels,
with all changes pushed upstream, and not hacky, out-of-tree patches
which only work for one SOC vendor.

						- Ted

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 14:03               ` Theodore Ts'o
@ 2016-09-05 14:22                 ` Laurent Pinchart
  2016-09-06  0:35                   ` Mark Brown
  2016-09-06 13:34                   ` Catalin Marinas
  0 siblings, 2 replies; 122+ messages in thread
From: Laurent Pinchart @ 2016-09-05 14:22 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: James Bottomley, ltsi-dev, gregkh

Hi Ted,

On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> On Mon, Sep 05, 2016 at 12:11:05PM +0100, Mark Brown wrote:
> > On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:
> >> The vendors depend on Google providing an Android common tree[1] to
> >> build their BSP on top of. Currently, there isn't anything newer than
> >> a 4.4-based common tree from Google. It'll be early 2017 by the time
> >> 4.9 LTS is released and the Android common tree is available on it
> > 
> > It's not just having an Android tree either, it's having an Android tree
> > that they're confident is actively used and tested by Google.  Simply
> > making a tree available wouldn't be enough, that was tried in the past.
> 
> One of the problems is I can't test the Android common tree, because I
> don't have access to hardware that I can boot on that tree.  (To be
> honest, I'm not even sure what hardware would boot on it.)  And thanks
> to the tender loving care the SOC vendors have lavished on on their
> BSP kernels, if there has been BSP "value added patches" from ARM SOC
> vendors applied to a kernel tree, chances are extremely high that you
> can no longer do testing using kvm-xfstests.  (In some cases I was
> able to bash the tree enough that it would boot under kvm/x86, but in
> many cases, the ARM SOC changes were so horrible that it was
> hopeless.)
> 
> This is why upstream-first is so darned important.  And why sloppy
> patches that break other architectures are a really bad idea, even if
> they are for a vendor-only BSP kernel....
> 
> Maybe there will be some hope if some of the features from ARM64
> server can infect the SOC community --- Jon Masters really had the
> right idea when he insisted on one kernel to boot all ARM64 kernels,
> with all changes pushed upstream, and not hacky, out-of-tree patches
> which only work for one SOC vendor.

I don't think that's a fair comparison. For server platforms end-users of the 
hardware will pick a distribution and roll it out on the machines, so hardware 
vendors have a strong incentive to play by our rules. Phones are completely 
different in that the device vendor doesn't care about end-users being able to 
pick what software in general and kernel in particular they want to install on 
the device. Worse than that, many vendors jump through hoops to make 
that impossible. Unless customers start boycotting devices that are not 
upstream-friendly - and I don't think anyone expects this to happen - we'll 
need to give SoC vendors a different incentive.

-- 
Regards,

Laurent Pinchart

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 11:04           ` Mark Brown
@ 2016-09-05 22:44             ` NeilBrown
  2016-09-06  0:57               ` Mark Brown
  2016-09-08 18:33               ` [Ksummit-discuss] [LTSI-dev] " Bird, Timothy
  0 siblings, 2 replies; 122+ messages in thread
From: NeilBrown @ 2016-09-05 22:44 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss, gregkh, James Bottomley

[-- Attachment #1: Type: text/plain, Size: 1154 bytes --]

On Mon, Sep 05 2016, Mark Brown wrote:

> [ Unknown signature status ]
> On Mon, Sep 05, 2016 at 11:45:52AM +1000, NeilBrown wrote:
>> On Sat, Sep 03 2016, Bird, Timothy wrote:
>
>> > Where we are now with some of these SoCs is at millions of lines of
>> > code out-of-tree.  It's being reduced, slowly, but there are still
>> > significant areas where the mainline kernel just doesn't have the
>> > support needed for shipping product. My pet peeve is support for
>> > charging over USB, where Linaro has had a patch set
>> > being stalled and/or ignored by the USB maintainer for 2 years!!
>
>> Do you have a link to that?  I have an interest in charging over USB.
>
> This is it:
>
>     https://lkml.org/lkml/2016/7/1/35
>
> it's been more like one year than two and there has been progress but
> there's also been an awful lot of latency in the process too.

Really?  That is worthy of a "pet peeve"?
The patch set does highlight an important area of missing functionality,
but doesn't (IMO) display much understanding of the problem space.  I'm
not surprised it hasn't made progress.

Maybe I should reply to the patch directly.

Thanks,
NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 14:22                 ` Laurent Pinchart
@ 2016-09-06  0:35                   ` Mark Brown
  2016-09-06 15:30                     ` James Bottomley
  2016-09-06 13:34                   ` Catalin Marinas
  1 sibling, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-06  0:35 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

[-- Attachment #1: Type: text/plain, Size: 3622 bytes --]

On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > On Mon, Sep 05, 2016 at 12:11:05PM +0100, Mark Brown wrote:

> > One of the problems is I can't test the Android common tree, because I
> > don't have access to hardware that I can boot on that tree.  (To be
> > honest, I'm not even sure what hardware would boot on it.)  And thanks

I'd be rather surprised if it doesn't boot on x86 TBH - if it doesn't I'd
imagine Google would fix it.  If you have a burning desire to run it on
an ARM system (and who wouldn't!) then one of the qemu platforms or
something like the BeagleBone Black or Cubietruck will work well
upstream (Cubietruck should be especially good for filesystem stuff due
to the SATA interface).  It's just a normal tree, it should work on
anything you're already working with.

> > This is why upstream-first is so darned important.  And why sloppy
> > patches that break other architectures are a really bad idea, even if
> > they are for a vendor-only BSP kernel....

You're preaching to the converted here but just telling everyone over
and over again that they're doing a terrible job isn't helping anything.
It's not offering any sort of constructive suggestion for how to improve
the situation that engages with the reality on the ground.

> > Maybe there will be some hope if some of the features from ARM64
> > server can infect the SOC community --- Jon Masters really had the
> > right idea when he insisted on one kernel to boot all ARM64 kernels,
> > with all changes pushed upstream, and not hacky, out-of-tree patches
> > which only work for one SOC vendor.

> I don't think that's a fair comparison. For server platforms end-users of the 
> hardware will pick a distribution and roll it out on the machines, so hardware 
> vendors have a strong incentive to play by our rules. Phones are completely 

The other big difference with server on ARM is that it's starting from
scratch with no real world to worry about on this architecture - the
server vendors only really care about the distros who happen to have
policies we like.  In turn the reason the distros are able to do that is
that there is no real deployed base of hardware and software for them to
target.

There's also the fact that for platforms that are upstream (and anyone
doing a good job downstream) we've been doing the single kernel image
thing for literally years - the way arm64 did single system image was
just to inherit all the work that had already been done on 32 bit ARM.
This is mainly about upstreaming especially in the mobile space where a
combination of technical issues and the construction of the market make
things challenging for everyone (the x86 mobile systems I've seen have
had just the same issues here).

> different in that the device vendor doesn't care about end-users being able to 
> pick what software in general and kernel in particular they want to install on 
> the device. Worse than that, many vendors go through hoops and loops to make 
> that impossible. Unless customers start boycotting devices that are not 
> upstream-friendly - and I don't think anyone expects this to happen - we'll 
> need to give SoC vendors a different incentive.

You can see some of those incentives operating if you look at the
systems that do do better upstream - for the most part the platforms
doing best upstream are those that address markets where the customers
care about it like the general market where there's no expectation of
strong support from the chip vendor.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 22:44             ` NeilBrown
@ 2016-09-06  0:57               ` Mark Brown
  2016-09-06  5:41                 ` NeilBrown
  2016-09-08 18:33               ` [Ksummit-discuss] [LTSI-dev] " Bird, Timothy
  1 sibling, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-06  0:57 UTC (permalink / raw)
  To: NeilBrown; +Cc: ltsi-dev, ksummit-discuss, gregkh, James Bottomley

[-- Attachment #1: Type: text/plain, Size: 1625 bytes --]

On Tue, Sep 06, 2016 at 08:44:01AM +1000, NeilBrown wrote:
> On Mon, Sep 05 2016, Mark Brown wrote:

> >> > support needed for shipping product. My pet peeve is support for
> >> > charging over USB, where Linaro has had a patch set
> >> > being stalled and/or ignored by the USB maintainer for 2 years!!

> >> Do you have a link to that?  I have an interest in charging over USB.

> > This is it:

> >     https://lkml.org/lkml/2016/7/1/35

> > it's been more like one year than two and there has been progress but
> > there's also been an awful lot of latency in the process too.

> Really?  That is worthy of a "pet peeve"?
> The patch set does highlight an important area of missing functionality,
> but doesn't (IMO) display much understanding of the problem space.  I'm
> not surprised it hasn't made progress.

> Maybe I should reply to the patch directly.

Yes, please - if we're just not providing feedback that's very
unhelpful, if there's problems then we should be saying so.  Not only
does it mean that there's no feedback that the submitters can use to
improve things, the existence of an actively worked on patch series has a
chilling effect on other people who might want to work in the same area
but don't want to duplicate effort.  People do just not work on things
they see other people actively working on and pick some other task
instead, assuming the first thing they thought of will get taken care of
in due course.

Things do of course fall through the cracks or get delayed from time to
time so submitters should be prepared to handle that but equally it's
hard for people to improve without feedback.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06  0:57               ` Mark Brown
@ 2016-09-06  5:41                 ` NeilBrown
  0 siblings, 0 replies; 122+ messages in thread
From: NeilBrown @ 2016-09-06  5:41 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss, gregkh, James Bottomley

[-- Attachment #1: Type: text/plain, Size: 994 bytes --]

On Tue, Sep 06 2016, Mark Brown wrote:

> [ Unknown signature status ]
> On Tue, Sep 06, 2016 at 08:44:01AM +1000, NeilBrown wrote:
>> On Mon, Sep 05 2016, Mark Brown wrote:
>
>> >> > support needed for shipping product. My pet peeve is support for
>> >> > charging over USB, where Linaro has had a patch set
>> >> > being stalled and/or ignored by the USB maintainer for 2 years!!
>
>> >> Do you have a link to that?  I have an interest in charging over USB.
>
>> > This is it:
>
>> >     https://lkml.org/lkml/2016/7/1/35
>
>> > it's been more like one year than two and there has been progress but
>> > there's also been an awful lot of latency in the process too.
>
>> Really?  That is worthy of a "pet peeve"?
>> The patch set does highlight an important area of missing functionality,
>> but doesn't (IMO) display much understanding of the problem space.  I'm
>> not surprised it hasn't made progress.
>
>> Maybe I should reply to the patch directly.
>
> Yes, please

Done :-)

NeilBrown

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 800 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-02 13:47 ` Theodore Ts'o
  2016-09-02 19:31   ` Levin, Alexander
  2016-09-03  2:04   ` Mark Brown
@ 2016-09-06  7:20   ` Tsugikazu Shibata
  2016-09-10 12:00     ` Theodore Ts'o
  2016-09-12  4:12     ` Alex Shi
  2 siblings, 2 replies; 122+ messages in thread
From: Tsugikazu Shibata @ 2016-09-06  7:20 UTC (permalink / raw)
  To: Theodore Ts'o, Alex Shi
  Cc: ltsi-dev, Tsugikazu Shibata, Greg KH, ksummit-discuss, Mark Brown

On Fri, Sep 02, 2016 at 10:47PM Theodore Ts'o wrote:
>On Thu, Sep 01, 2016 at 10:01:13AM +0800, Alex Shi wrote:
>>
>> I am a Linaro stable kernel maintainer. Our stable kernel is base on
>> LTS plus much of upstream features backporting on them. Here is the
>> detailed info of LSK: https://wiki.linaro.org/LSK
>> https://git.linaro.org/?p=kernel/linux-linaro-stable.git
>
>I'm really not sure what problem you are trying to solve here.
>
>As near as I can tell, the kernels provided by a SOC vendor are a snapshot in time of
>some LTS kernel, and after that, they don't bother merging any bug fixes or security fixes
>from the upstream kernel.
>They might take individual patches if they notice there's a problem (e.g., it gets written
>about in the national press), but otherwise, they'll be stuck on some nonsense such as
>3.10.23.
>
>Then the product vendors take the SOC kernel, and further hack it up, and then once
>they take a snapshot, as far as I can tell, they don't take any rolling updates from the
>SOC vendor either.  I'm not sure how much of this is due to lack of engineering
>bandwidth, and how much of this is due to being worried that a newer SOC kernel will
>introduce regressions, but either way, they'll lock onto an old SOC kernel, and
>apparently only take bug fixes when they notice there is a problem.
>(And in multiple cases I've gotten calls from help of SOC vendors asking for assistance in
>figuring out a problem, and more often than not, the fix is in the latest LTS kernel, but
>that doesn't help them.) And of course, in some cases, "never", unless it's written about
>in the aforementioned national press, and even then, I'm not convinced the product
>vendors will have the engineering staff to turn out firmware upgrades for an older
>product.
>
>So what's the point of moving features into some ancient kernel?
>Who's going to take it?  Certainly not the product vendors, who consume the SOC
>kernel.  The SOC vendors?  Why not just encourage them to get their device drivers
>into staging, and just go to a newer LTS kernel?  Because I guarantee that's going to be
>less risky than taking a random collection of features, and backporting them into some
>ancient kernel.
>
>Or for that matter, why not simply going to the latest mainline kernel.  Since the SOC
>vendors aren't taking updates from the LTS kernel anyway, if the LTS kernel exists only
>as a patch repository where people can look for security fixes and bug fixes (sometimes
>after the upstream maintainer has to point out it's in the LTS kernel), if they take, say,
>4.7, in the future they might need to take a look at 4.8.x, 4.9.x, etc., until the next LTS
>kernel is declared.
>So that means that an SOC vendor or a downstream product vendors might have to look
>at 3 or 4 patch releases instead of one.  Is that really that hard?

This may be a late reply, but I would like to comment from the LTSI
project's standpoint. We started the LTSI project with a view similar to
the above, that is:
- LTSI is based on an upstream-first policy. We provide a chance to
merge vendor-required patches on top of LTS, but the patches need to be
in -next or merged in a later upstream release, and then backported to LTS.
The majority of LTSI's additional patches are drivers, but sometimes tools
or new features are proposed. Such patches are reviewed on a case-by-case
basis but should be self-contained.
- We are approaching SoC vendors and device manufacturers to persuade
them to upstream their in-house patches. SoC vendors can of course create
their own forks for their customers, but giving them a chance to merge
their patches on top of LTS through LTSI should provide some motivation
to get such patches upstream in the long term. We hope our activities
will reduce the problem discussed in this thread in the longer term.
In fact, some SoC vendors do send us their patches backported from
upstream. We will continue to talk with SoC vendors and device
manufacturers to solve the problem.
- Also, some LTSI members are employees of companies, and they test the
LTSI kernel hard before each release. The resulting fixes go upstream,
not just to LTSI. We see LTSI as one of the activities filling the gap
between companies and the community, to create the kernel used by
companies and industry.

Finally, a request to the community from LTSI's standpoint:
We would like a predictable process for how, and roughly when, an LTS
will be released. That would make it easier for companies to plan their
use of LTS, and would let more users run a stable and secure kernel.

Thanks,
Tsugikazu Shibata

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05 14:22                 ` Laurent Pinchart
  2016-09-06  0:35                   ` Mark Brown
@ 2016-09-06 13:34                   ` Catalin Marinas
  2016-09-06 16:24                     ` Bartlomiej Zolnierkiewicz
                                       ` (3 more replies)
  1 sibling, 4 replies; 122+ messages in thread
From: Catalin Marinas @ 2016-09-06 13:34 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > On Mon, Sep 05, 2016 at 12:11:05PM +0100, Mark Brown wrote:
> > > On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:
> > >> The vendors depend on Google providing an Android common tree[1] to
> > >> build their BSP on top of. Currently, there isn't anything newer than
> > >> a 4.4-based common tree from Google. It'll be early 2017 by the time
> > >> 4.9 LTS is released and the Android common tree is available on it
> > > 
> > > It's not just having an Android tree either, it's having an Android tree
> > > that they're confident is actively used and tested by Google.  Simply
> > > making a tree available wouldn't be enough, that was tried in the past.
> > 
> > One of the problems is I can't test the Android common tree, because I
> > don't have access to hardware that I can boot on that tree.  (To be
> > honest, I'm not even sure what hardware would boot on it.)  And thanks
> > to the tender loving care the SOC vendors have lavished on on their
> > BSP kernels, if there has been BSP "value added patches" from ARM SOC
> > vendors applied to a kernel tree, chances are extremely high that you
> > can no longer do testing using kvm-xfstests.  (In some cases I was
> > able to bash the tree enough that it would boot under kvm/x86, but in
> > many cases, the ARM SOC changes were so horrible that it was
> > hopeless.)
> > 
> > This is why upstream-first is so darned important.  And why sloppy
> > patches that break other architectures are a really bad idea, even if
> > they are for a vendor-only BSP kernel....
> > 
> > Maybe there will be some hope if some of the features from ARM64
> > server can infect the SOC community --- Jon Masters really had the
> > right idea when he insisted on one kernel to boot all ARM64 kernels,
> > with all changes pushed upstream, and not hacky, out-of-tree patches
> > which only work for one SOC vendor.

From an arm64 kernel perspective, we have similar requirements w.r.t.
single Image. However, such requirements have implications beyond just
the kernel as we insist on standardisation of the firmware interfaces,
use of the Device Tree, no mach-* directories for SoC quirks. Some SoC
vendors find such requirements difficult to comply with and won't even
bother with mainline (claiming time to market is more important than
long term maintenance).

> I don't think that's a fair comparison. For server platforms end-users of the 
> hardware will pick a distribution and roll it out on the machines, so hardware 
> vendors have a strong incentive to play by our rules. Phones are completely 
> different in that the device vendor doesn't care about end-users being able to 
> pick what software in general and kernel in particular they want to install on 
> the device.

Things could be different if fewer entities control the software that
gets installed/updated on such hardware. E.g. Google controlling the OTA
updates of the Chromebook kernels, they will at some point take a
similar hard stance to Red Hat on upstream first, single kernel Image.
For phones, however, that's unlikely to happen given the multitude and
short life-time of new products.

> Unless customers start boycotting devices that are not 
> upstream-friendly - and I don't think anyone expects this to happen - we'll 
> need to give SoC vendors a different incentive.

One way to make SoC vendors understand the benefits of upstream is for
them to first feel the pain of rebasing their SoC patches to newer
kernel versions regularly. But forcing them to do such rebasing means
to stop helping them back-port the features they need to older kernel
versions like LSK ;) (this may be difficult from a corporate perspective
where significant support contracts are involved; that's where kernel
maintainer goals don't always match the business ones).

-- 
Catalin


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06  0:35                   ` Mark Brown
@ 2016-09-06 15:30                     ` James Bottomley
  2016-09-06 19:44                       ` gregkh
  2016-09-06 23:23                       ` Mark Brown
  0 siblings, 2 replies; 122+ messages in thread
From: James Bottomley @ 2016-09-06 15:30 UTC (permalink / raw)
  To: Mark Brown, Laurent Pinchart; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Tue, 2016-09-06 at 01:35 +0100, Mark Brown wrote:
> On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> > On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > > This is why upstream-first is so darned important.  And why 
> > > sloppy patches that break other architectures are a really bad 
> > > idea, even if they are for a vendor-only BSP kernel....
> 
> You're preaching to the converted here but just telling everyone over
> and over again that they're doing a terrible job isn't helping 
> anything. It's not offering any sort of constructive suggestion for 
> how to improve the situation that engages with the reality on the
> ground.

OK, so how do we move forwards?  Everyone who remembers the 2.4->2.6
transition is convinced of upstream first because it was impossible to
forward port the patch sets.

However, with the 2.6 release methodology, you have much smaller patch
sets and we have a much more incremental release strategy, meaning you
don't have this massive (2 year) gap between upports.  I think it's
arguable that our change in release strategy, coupled with you getting
enough stuff upstream, supports vendors who want to target a stable tree
because the patch difference is big but not too big to upport.

So there's probably a couple of things we could do about this:

   1. Nothing.  Say it's working about as well as can be expected for
      embedded and stop worrying.
   2. Accept that the pain will never be great enough and approach from a
      process view point instead.  Perhaps changing the acceptance
      criteria for the base tree to make it harder to get features in,
      making it less painful to get them upstream and then backported?
   3. Increase the pain.  Not sure I like this, but in theory, we could
      churn the upstream API to increase the pain of upports, but it would
      also cause a lot of issues with backports.

James





* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 13:34                   ` Catalin Marinas
@ 2016-09-06 16:24                     ` Bartlomiej Zolnierkiewicz
  2016-09-06 16:25                     ` Guenter Roeck
                                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 122+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2016-09-06 16:24 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh


[ sorry if you got this mail twice ]

Hi,

On Tuesday, September 06, 2016 02:34:30 PM Catalin Marinas wrote:
> On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> > On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > > On Mon, Sep 05, 2016 at 12:11:05PM +0100, Mark Brown wrote:
> > > > On Mon, Sep 05, 2016 at 04:28:44AM +0530, Amit Kucheria wrote:
> > > >> The vendors depend on Google providing an Android common tree[1] to
> > > >> build their BSP on top of. Currently, there isn't anything newer than
> > > >> a 4.4-based common tree from Google. It'll be early 2017 by the time
> > > >> 4.9 LTS is released and the Android common tree is available on it
> > > > 
> > > > It's not just having an Android tree either, it's having an Android tree
> > > > that they're confident is actively used and tested by Google.  Simply
> > > > making a tree available wouldn't be enough, that was tried in the past.
> > > 
> > > One of the problems is I can't test the Android common tree, because I
> > > don't have access to hardware that I can boot on that tree.  (To be
> > > honest, I'm not even sure what hardware would boot on it.)  And thanks
> > > to the tender loving care the SOC vendors have lavished on their
> > > BSP kernels, if there has been BSP "value added patches" from ARM SOC
> > > vendors applied to a kernel tree, chances are extremely high that you
> > > can no longer do testing using kvm-xfstests.  (In some cases I was
> > > able to bash the tree enough that it would boot under kvm/x86, but in
> > > many cases, the ARM SOC changes were so horrible that it was
> > > hopeless.)
> > > 
> > > This is why upstream-first is so darned important.  And why sloppy
> > > patches that break other architectures are a really bad idea, even if
> > > they are for a vendor-only BSP kernel....
> > > 
> > > Maybe there will be some hope if some of the features from ARM64
> > > server can infect the SOC community --- Jon Masters really had the
> > > right idea when he insisted on one kernel to boot all ARM64 kernels,
> > > with all changes pushed upstream, and not hacky, out-of-tree patches
> > > which only work for one SOC vendor.
> 
> From an arm64 kernel perspective, we have similar requirements w.r.t.
> single Image. However, such requirements have implications beyond just
> the kernel as we insist on standardisation of the firmware interfaces,
> use of the Device Tree, no mach-* directories for SoC quirks. Some SoC
> vendors find such requirements difficult to comply with and won't even
> bother with mainline (claiming time to market is more important than
> long term maintenance).
> 
> > I don't think that's a fair comparison. For server platforms end-users of the 
> > hardware will pick a distribution and roll it out on the machines, so hardware 
> > vendors have a strong incentive to play by our rules. Phones are completely 
> > different in that the device vendor doesn't care about end-users being able to 
> > pick what software in general and kernel in particular they want to install on 
> > the device.
> 
> Things could be different if fewer entities control the software that
> gets installed/updated on such hardware. E.g. Google controlling the OTA
> updates of the Chromebook kernels, they will at some point take a
> similar hard stance to Red Hat on upstream first, single kernel Image.
> For phones, however, that's unlikely to happen given the multitude and
> short life-time of new products.

[ Disclaimer: below are my personal observations and not Samsung's
  official stance on upstream first policy; also, while I work for
  Samsung, I don't work for its silicon vendor division ]

For phones it feels that upstream itself is moving too slowly for
vendors to benefit from an upstream first policy.  Each year there
is a new flagship device and a new SoC.  With the current upstream model
(a new kernel release every 2-3 months, and discussions on upstreaming
some core subsystem enhancements easily taking weeks/months) an upstream
first policy just doesn't give enough business benefit.  Before you
have everything upstream and the changes propagate to LTS and Android
kernels (which should also move faster BTW) it can easily be 2 years
since the start of the effort, and your SoC is now obsolete.  Sure, there
are some benefits from core subsystem changes done for an old SoC etc.
which can be re-used by a new SoC.  However, in the short term it is
still usually easier to just backport a few selected upstream kernel
features that you care about to your vendor kernel (which supports all
your SoCs, not only your flagship one) than to move everything upstream.

> > Unless customers start boycotting devices that are not 
> > upstream-friendly - and I don't think anyone expects this to happen - we'll 
> > need to give SoC vendors a different incentive.
> 
> One way to make SoC vendors understand the benefits of upstream is for
> them to first feel the pain of rebasing their SoC patches to newer
> kernel versions regularly. But forcing them to do such rebasing means

Some vendors are still stuck on v3.4 (or v3.10) and they don't have enough
incentive to re-base to something newer.

> to stop helping them back-port the features they need to older kernel
> versions like LSK ;) (this may be difficult from a corporate perspective
> where significant support contracts are involved; that's where kernel
> maintainer goals don't always match the business ones).

When talking about more incentives, it would help to get embedded/mobile
oriented features upstream quicker and to do more upstream validation &
testing on embedded/mobile devices.  This would require more upstream
involvement from embedded/mobile companies (independently or through
Linaro) and I think that the ones to get involved first will be the ones
to reap the biggest benefits from leading the efforts later.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 13:34                   ` Catalin Marinas
  2016-09-06 16:24                     ` Bartlomiej Zolnierkiewicz
@ 2016-09-06 16:25                     ` Guenter Roeck
  2016-09-06 22:39                       ` Mark Brown
  2016-09-07  8:33                       ` Jan Kara
  2016-09-06 16:46                     ` Olof Johansson
       [not found]                     ` <2181684.5VzIQ6DWv4@amdc1976>
  3 siblings, 2 replies; 122+ messages in thread
From: Guenter Roeck @ 2016-09-06 16:25 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Tue, Sep 06, 2016 at 02:34:30PM +0100, Catalin Marinas wrote:
> 
> Things could be different if fewer entities control the software that
> gets installed/updated on such hardware. E.g. Google controlling the OTA
> updates of the Chromebook kernels, they will at some point take a
> similar hard stance to Red Hat on upstream first, single kernel Image.

Seems to me that Red Hat and Google are in different boats. Chromebooks,
unlike "standard" PCs, have lots of "custom" hardware, where "custom" means
hardware for which upstream support is not available. chromeos-4.4 currently
(as of this morning) carries 5,594 patches on top of v4.4.14. Out of those,
roughly 2,700 are tagged as backports, ~200 are tagged as from an upstream
submission which was not accepted by the time the patch was added, and ~2,000
are tagged as chromeos specific. And that is with (as far as I know) no
products shipping yet with the 4.4 kernel. We are trying to upstream as much
as we can, but it will take a while. Given time constraints, I don't think
"upstream first" will ever work. Products have to ship and simply can not
wait for upstream patches to be accepted.

> For phones, however, that's unlikely to happen given the multitude and
> short life-time of new products.
> 
> > Unless customers start boycotting devices that are not 
> > upstream-friendly - and I don't think anyone expects this to happen - we'll 
> > need to give SoC vendors a different incentive.
> 
> One way to make SoC vendors understand the benefits of upstream is for
> them to first feel the pain of rebasing their SoC patches to newer
> kernel versions regularly. But forcing them to do such rebasing means
> to stop helping them back-port the features they need to older kernel
> versions like LSK ;) (this may be difficult from a corporate perspective
> where significant support contracts are involved; that's where kernel
> maintainer goals don't always match the business ones).
> 

This is a two-edged sword. Make rebasing too hard (e.g. by deliberately
changing the in-kernel APIs constantly, as I think was suggested elsewhere)
and they will simply never switch to a newer kernel.

Ted was making an excellent point about the complexity of backporting features.
Out of personal experience, I fully agree. Instead of reducing risk by avoiding
a newer kernel version, backporting actually adds risk. Maybe it would help to
educate people about the risks of backporting, and do a better job explaining
why a new kernel may be a better choice.

Elsewhere it was also mentioned that companies just can not wait for the next
LTS release to incorporate new features, while at the same time suggesting that
4.9 may be too late. But this also suggests that those devices would _never_
ship with a 4.9 kernel in the first place. From my perspective, I think it would
make more sense to add the new features to those devices after the features
matured, or in other words plan for an upgrade to 4.9 after the device shipped.
This would both ensure that the devices get the feature(s) and that the features
get some test coverage before being used in (supposedly) high-stability devices.

Thanks,
Guenter


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 13:34                   ` Catalin Marinas
  2016-09-06 16:24                     ` Bartlomiej Zolnierkiewicz
  2016-09-06 16:25                     ` Guenter Roeck
@ 2016-09-06 16:46                     ` Olof Johansson
  2016-09-08  8:34                       ` Linus Walleij
       [not found]                     ` <2181684.5VzIQ6DWv4@amdc1976>
  3 siblings, 1 reply; 122+ messages in thread
From: Olof Johansson @ 2016-09-06 16:46 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

Hi,

On Tue, Sep 6, 2016 at 6:34 AM, Catalin Marinas <catalin.marinas@arm.com> wrote:
> On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
>> Unless customers start boycotting devices that are not
>> upstream-friendly - and I don't think anyone expects this to happen - we'll
>> need to give SoC vendors a different incentive.

As a customer, this is hard: you're a few steps removed from the
decision making, and they'll never attribute a failed product to lack
of upstreaming.

[This is going offtopic for this forum]

Instead, the point where there is good leverage is when you work for
the company making the widgets. Preferring to work with vendors who
have sorted out their upstream story, or pressuring, helping,
mentoring or connecting the others to contracting shops that can help
them get going are all good ways in providing appropriate business
pressure back to the vendors.

Chrome OS was successful in this (if I might say so myself), getting
several vendors who earlier had very thin upstream presence to
significantly improve. I haven't seen all that many other projects
being able to do it, but for those of you who are in positions to help
steer SoC choices, do keep this in mind, work with your internal
development teams to make them understand the importance of this, and
make it a priority.

> One way to make SoC vendors understand the benefits of upstream is for
> them to first feel the pain of rebasing their SoC patches to newer
> kernel versions regularly.

You're talking about this as if they haven't already done this for
years. As a matter of fact, they've gotten quite good at it. Masters
even -- it's a competitive advantage for them to be good at it.

Rebasing is also the point in time where they can shed old hardware
support. As a matter of fact, several _look forward to_ rebasing since
it means they can shed a lot of software burden and start (somewhat)
fresh again.

And the other elephant in the room is the fact that right when you
start upstreaming, and only have SOME of your code upstream, rebasing
can be _incredibly_ painful, and it's not uncommon that you get a huge
amount of pushback from upstreaming at all during this time. The best
way to avoid that as far as I've seen, is to do the upstreaming under
different names/locations, so that you can keep the parallel
implementations until you're ready to cut over. If you end up stepping
on the same file/driver names, you're in for a lot of conflicts and
pain.

-Olof


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 15:30                     ` James Bottomley
@ 2016-09-06 19:44                       ` gregkh
  2016-09-06 22:20                         ` Mark Brown
  2016-09-06 23:23                       ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: gregkh @ 2016-09-06 19:44 UTC (permalink / raw)
  To: James Bottomley; +Cc: ltsi-dev, ksummit-discuss

On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
>    3. Increase the pain.  Not sure I like this, but in theory, we could
>       churn the upstream API to increase the pain of upports, but it would
>       also cause a lot of issues with backports.

I tried doing this in the past.  It did cause pain for out-of-tree
modules, but then they got really good and abstracted things away so
that it made their future kernel porting efforts even easier than
before, making their need to upstream code even less.  And then when
they did want to upstream stuff, it took more work unwinding the
abstraction layer.

So watch out for unintended consequences here :)

thanks,

greg k-h


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 19:44                       ` gregkh
@ 2016-09-06 22:20                         ` Mark Brown
  2016-09-06 22:34                           ` James Bottomley
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-06 22:20 UTC (permalink / raw)
  To: gregkh; +Cc: James Bottomley, ltsi-dev, ksummit-discuss

On Tue, Sep 06, 2016 at 09:44:04PM +0200, gregkh@linuxfoundation.org wrote:
> On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
> >    3. Increase the pain.  Not sure I like this, but in theory, we could
> >       churn the upstream API to increase the pain of upports, but it would
> >       also cause a lot of issues with backports.

> I tried doing this in the past.  It did cause pain for out-of-tree
> modules, but then they got really good and abstracted things away so
> that it made their future kernel porting efforts even easier than
> before, making their need to upstream code even less.  And then when
> they did want to upstream stuff, it took more work unwinding the
> abstraction layer.

> So watch out for unintended consequences here :)

The other big unintended consequence I'd worry about here is that it
will present an obstacle to someone who wants to try to upstream
something while working in a downstream environment - if someone is
looking at some code but the changes for upstream are too great then it
might make it too much work for them to try if it's not their primary
job.

I'd also worry about annoying people who are working upstream as well,
it's annoying having things break randomly due to API changes (both as
the submitter and as a maintainer or reviewer).


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 22:20                         ` Mark Brown
@ 2016-09-06 22:34                           ` James Bottomley
  2016-09-08 18:55                             ` Bird, Timothy
  0 siblings, 1 reply; 122+ messages in thread
From: James Bottomley @ 2016-09-06 22:34 UTC (permalink / raw)
  To: Mark Brown, gregkh; +Cc: ltsi-dev, ksummit-discuss

On September 6, 2016 6:20:58 PM EDT, Mark Brown <broonie@kernel.org> wrote:
>On Tue, Sep 06, 2016 at 09:44:04PM +0200, gregkh@linuxfoundation.org
>wrote:
>> On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
>> >    3. Increase the pain.  Not sure I like this, but in theory, we
>could
>> >       churn the upstream API to increase the pain of upports, but
>it would
>> >       also cause a lot of issues with backports.
>
>> I tried doing this in the past.  It did cause pain for out-of-tree
>> modules, but then they got really good and abstracted things away so
>> that it made their future kernel porting efforts even easier than
>> before, making their need to upstream code even less.  And then when
>> they did want to upstream stuff, it took more work unwinding the
>> abstraction layer.
>
>> So watch out for unintended consequences here :)
>
>The other big unintended consequence I'd worry about here is that it
>will present an obstacle to someone who wants to try to upstream
>something while working in a downstream environment - if someone is
>looking at some code but the changes for upstream are too great then it
>might make it too much work for them to try if it's not their primary
>job.
>
>I'd also worry about annoying people who are working upstream as well,
>it's annoying having things break randomly due to API changes (both as
>the submitter and as a maintainer or reviewer).

Ok, so everyone went straight for the option I didn't like.  I knew I
shouldn't have included it.  So what about options 1 or 2, or even
something I hadn't thought of?

James


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 16:25                     ` Guenter Roeck
@ 2016-09-06 22:39                       ` Mark Brown
  2016-09-07  8:33                       ` Jan Kara
  1 sibling, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-06 22:39 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Tue, Sep 06, 2016 at 09:25:02AM -0700, Guenter Roeck wrote:
> On Tue, Sep 06, 2016 at 02:34:30PM +0100, Catalin Marinas wrote:

> > Things could be different if fewer entities control the software that
> > gets installed/updated on such hardware. E.g. Google controlling the OTA
> > updates of the Chromebook kernels, they will at some point take a
> > similar hard stance to Red Hat on upstream first, single kernel Image.

> Seems to me that Red Hat and Google are in different boats. Chromebooks,
> unlike "standard" PCs, have lots of "custom" hardware, where "custom" means
> hardware for which upstream support is not available. chromeos-4.4 currently

Right, there's a chicken and egg problem - it'd be a lot easier to
insist on this if it were possible to produce a viable product that
way...

> as we can, but it will take a while. Given time constraints, I don't think
> "upstream first" will ever work. Products have to ship and simply can not
> wait for upstream patches to be accepted.

...and you'd be looking at pretty major changes in the market, at least
for the devices shipping bleeding edge silicon.

> Ted was making an excellent point about the complexity of backporting features.
> Out of personal experience, I fully agree. Instead of reducing risk by avoiding
> a newer kernel version, backporting actually adds risk. Maybe it would help to
> educate people about the risks of backporting, and do a better job explaining
> why a new kernel may be a better choice.

People are aware of the risks here - this is where a lot of the pressure
to pull forward onto the newer LTSs comes from.


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 15:30                     ` James Bottomley
  2016-09-06 19:44                       ` gregkh
@ 2016-09-06 23:23                       ` Mark Brown
  1 sibling, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-06 23:23 UTC (permalink / raw)
  To: James Bottomley; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
> On Tue, 2016-09-06 at 01:35 +0100, Mark Brown wrote:

> > You're preaching to the converted here but just telling everyone over
> > and over again that they're doing a terrible job isn't helping 
> > anything. It's not offering any sort of constructive suggestion for 
> > how to improve the situation that engages with the reality on the
> > ground.

> OK, so how do we move forwards?  Everyone who remembers the 2.4->2.6
> transition is convinced of upstream first because it was impossible to
> forward port the patch sets.

> So there's probably a couple of things we could do about this

>    1. Nothing.  Say it's working about as well as can be expected for
>       embedded and stop worrying.

I think I'd put it more as saying that there are relatively few special
snowflake things that we can do upstream to help here - most of the
things I can think of that are useful are just generally useful for
everyone rather than embedded specific.


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 16:25                     ` Guenter Roeck
  2016-09-06 22:39                       ` Mark Brown
@ 2016-09-07  8:33                       ` Jan Kara
  2016-09-07  8:41                         ` Jiri Kosina
                                           ` (2 more replies)
  1 sibling, 3 replies; 122+ messages in thread
From: Jan Kara @ 2016-09-07  8:33 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Tue 06-09-16 09:25:02, Guenter Roeck wrote:
> On Tue, Sep 06, 2016 at 02:34:30PM +0100, Catalin Marinas wrote:
> > 
> > Things could be different if fewer entities control the software that
> > gets installed/updated on such hardware. E.g. Google controlling the OTA
> > updates of the Chromebook kernels, they will at some point take a
> > similar hard stance to Red Hat on upstream first, single kernel Image.
> 
> Seems to me that Red Hat and Google are in different boats. Chromebooks,
> unlike "standard" PCs, have lots of "custom" hardware, where "custom" means
> hardware for which upstream support is not available. chromeos-4.4 currently
> (as of this morning) carries 5,594 patches on top of v4.4.14. Out of those,
> roughly 2,700 are tagged as backports, ~200 are tagged as from an upstream
> submission which was not accepted by the time the patch was added, and ~2,000
> are tagged as chromeos specific. And that is with (as far as I know) no
> products shipping yet with the 4.4 kernel.

Just to give comparable numbers for SUSE: the coming SLE12 SP2 release is
based on 4.4 as well. On top of 4.4.19 we currently carry some 4700 patches
which together add/delete some 390k lines. Out of these, some 280 patches
are not backports of upstream patches (or at least in the process of going
upstream), adding / deleting some 35k lines. So indeed we do have much less
non-upstream stuff in the distro kernel. OTOH I'd note we still do have a
considerable amount of backported stuff in a product that hasn't even
shipped yet...

> We are trying to upstream as much as we can, but it will take a while.
> Given time constraints, I don't think "upstream first" will ever work.
> Products have to ship and simply can not wait for upstream patches to be
> accepted.

Well, it works reasonably for us... Actually most of the backports are HW
support, so server HW vendors seem to be better at getting their drivers
upstream before actually shipping the HW. But I understand this is easier
for servers as that is a more mature / slower market than phones / tablets.

> > For phones, however, that's unlikely to happen given the multitude and
> > short life-time of new products.
> > 
> > > Unless customers start boycotting devices that are not 
> > > upstream-friendly - and I don't think anyone expects this to happen - we'll 
> > > need to give SoC vendors a different incentive.
> > 
> > One way to make SoC vendors understand the benefits of upstream is for
> > them to first feel the pain of rebasing their SoC patches to newer
> > kernel versions regularly. But forcing them to do such rebasing means
> > to stop helping them back-port the features they need to older kernel
> > versions like LSK ;) (this may be difficult from a corporate perspective
> > where significant support contracts are involved; that's where kernel
> > maintainer goals don't always match the business ones).
> 
> This is a two-edged sword. Make rebasing too hard (eg by on purpose
> changing the in-kernel APIs constantly, as I think was suggested
> elsewhere) and they will simply never switch to a newer kernel.
> 
> Ted was making an excellent point about the complexity of backporting
> features.  Out of personal experience, I fully agree. Instead of reducing
> risk by avoiding a newer kernel version, backporting actually adds risk.
> Maybe it would help to educate people about the risks of backporting, and
> do a better job explaining why a new kernel may be a better choice.

Well, there are risks both ways - updating to a newer kernel certainly has
risks (otherwise our kernel team & QA wouldn't have to spend several months
working on testing & tweaking the distro when creating a new release based on
the new kernel) and backporting has risks as well. You want to find a
kernel version where the added risk from all the backports does not
outweigh the additional time for testing the kernel and generally
stabilizing the product.
 
								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  8:33                       ` Jan Kara
@ 2016-09-07  8:41                         ` Jiri Kosina
  2016-09-07 18:44                           ` Mark Brown
  2016-09-09 15:21                         ` Alex Shi
  2016-09-12 15:34                         ` Christoph Hellwig
  2 siblings, 1 reply; 122+ messages in thread
From: Jiri Kosina @ 2016-09-07  8:41 UTC (permalink / raw)
  To: Jan Kara; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Wed, 7 Sep 2016, Jan Kara wrote:

> > We are trying to upstream as much as we can, but it will take a while.
> > Given time constraints, I don't think "upstream first" will ever work.
> > Products have to ship and simply can not wait for upstream patches to be
> > accepted.
> 
> Well, it works reasonably for us... Actually most of the backports are HW
> support so server HW vendors seem to be better in getting their drivers
> upstream before actually shipping the HW. But I understand this is easier
> for servers as that is a more mature / slower market than phones / tablets.

Yeah, unfortunately there really is a difference between server space and
consumer space in this area. Consumer space folks seem to be in the
situation that server space people were in during the 2.0 / 2.2 era, when we
had exactly the same problem with server hardware (i.e. lack of timely
focus on upstream in the process).

Also, it feels like consumer HW is more in a "fire and forget" mode of 
operation, while server hardware usually sticks around for a rather long 
time.

-- 
Jiri Kosina
SUSE Labs

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
       [not found]                     ` <2181684.5VzIQ6DWv4@amdc1976>
@ 2016-09-07  9:32                       ` Catalin Marinas
  2016-09-07 13:07                         ` Bartlomiej Zolnierkiewicz
                                           ` (2 more replies)
  0 siblings, 3 replies; 122+ messages in thread
From: Catalin Marinas @ 2016-09-07  9:32 UTC (permalink / raw)
  To: Bartlomiej Zolnierkiewicz
  Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Tue, Sep 06, 2016 at 06:06:48PM +0200, Bartlomiej Zolnierkiewicz wrote:
> On Tuesday, September 06, 2016 02:34:30 PM Catalin Marinas wrote:
> > On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> > > On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > > > Maybe there will be some hope if some of the features from ARM64
> > > > server can infect the SOC community --- Jon Masters really had the
> > > > right idea when he insisted on one kernel to boot all ARM64 kernels,
> > > > with all changes pushed upstream, and not hacky, out-of-tree patches
> > > > which only work for one SOC vendor.
[...]
> > > I don't think that's a fair comparison. For server platforms end-users of the 
> > > hardware will pick a distribution and roll it out on the machines, so hardware 
> > > vendors have a strong incentive to play by our rules. Phones are completely 
> > > different in that the device vendor doesn't care about end-users being able to 
> > > pick what software in general and kernel in particular they want to install on 
> > > the device.
> > 
> > Things could be different if fewer entities control the software that
> > gets installed/updated on such hardware. E.g. Google controlling the OTA
> > updates of the Chromebook kernels, they will at some point take a
> > similar hard stance to Red Hat on upstream first, single kernel Image.
> > For phones, however, that's unlikely to happen given the multitude and
> > short life-time of new products.
[...]
> For phones it feels that upstream itself is moving too slow for
> vendors to get benefits from upstream first policy.  Each year there
> is a new flagship device and new SoC.  With the current upstream model
> (new kernel release every 2-3 months and discussions on upstreaming
> some core subsystems enhancements easily taking weeks/months) upstream
> first policy just doesn't give enough business benefits.  Before you
> have everything upstream and changes propagate itself to LTS and Android
> kernels (which should also move faster BTW) it can be easily 2 years
> since start of the effort and your SoC is now obsolete.

If there is a one-off, completely independent SoC that no-one (not even
the vendor) cares about a year after the device goes on sale, I don't
think we want it upstream either. However, SoC vendors tend to work on
SoC families with some variations within a family (like CPU upgrades,
maybe a new interconnect etc.) but in general a lot of code that can be
reused. That's where upstreaming is highly beneficial to the vendor in
the long run, since such a SoC family has a longer life span than the
individual device derived from a specific SoC.

I'm also aware that vendors don't always want to disclose their SoC
details until the device goes public, so that's another business
argument against upstreaming first, especially in the mobile world.

One impediment to upstreaming in my experience is that vendors tend to
develop the initial SoC port against an old kernel version (e.g. based
on the Android version they target). Forward-porting to latest mainline
all of a sudden becomes a much larger task that companies are not always
willing to (sufficiently) invest in. So if an "upstream first" policy is
not always feasible from a (mobile) business perspective, "develop
against upstream" is a better second option. An initial SoC port doesn't
need all the additional features that Android kernels provide, so it's
usually doable with what is available upstream. There is more effort
initially since targeting certain Android versions requires back-porting,
however it pays off in the long run for the SoC *family*.

Trying to get on-topic: where organisations providing kernels like LSK
(Linaro) can help is offering to integrate/maintain the SoC back-port
while encouraging the SoC vendors to focus on developing against the
latest upstream. It looks to me that on (too) many occasions SoC vendors
take LSK as their development base for new SoC ports, making the
forward-porting effort significantly larger (and potentially ignored).

-- 
Catalin

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  9:32                       ` Catalin Marinas
@ 2016-09-07 13:07                         ` Bartlomiej Zolnierkiewicz
  2016-09-07 18:49                         ` Mark Brown
  2016-09-09 15:06                         ` Alex Shi
  2 siblings, 0 replies; 122+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2016-09-07 13:07 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Wednesday, September 07, 2016 10:32:50 AM Catalin Marinas wrote:
> On Tue, Sep 06, 2016 at 06:06:48PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > On Tuesday, September 06, 2016 02:34:30 PM Catalin Marinas wrote:
> > > On Mon, Sep 05, 2016 at 05:22:59PM +0300, Laurent Pinchart wrote:
> > > > On Monday 05 Sep 2016 10:03:27 Theodore Ts'o wrote:
> > > > > Maybe there will be some hope if some of the features from ARM64
> > > > > server can infect the SOC community --- Jon Masters really had the
> > > > > right idea when he insisted on one kernel to boot all ARM64 kernels,
> > > > > with all changes pushed upstream, and not hacky, out-of-tree patches
> > > > > which only work for one SOC vendor.
> [...]
> > > > I don't think that's a fair comparison. For server platforms end-users of the 
> > > > hardware will pick a distribution and roll it out on the machines, so hardware 
> > > > vendors have a strong incentive to play by our rules. Phones are completely 
> > > > different in that the device vendor doesn't care about end-users being able to 
> > > > pick what software in general and kernel in particular they want to install on 
> > > > the device.
> > > 
> > > Things could be different if fewer entities control the software that
> > > gets installed/updated on such hardware. E.g. Google controlling the OTA
> > > updates of the Chromebook kernels, they will at some point take a
> > > similar hard stance to Red Hat on upstream first, single kernel Image.
> > > For phones, however, that's unlikely to happen given the multitude and
> > > short life-time of new products.
> [...]
> > For phones it feels that upstream itself is moving too slow for
> > vendors to get benefits from upstream first policy.  Each year there
> > is a new flagship device and new SoC.  With the current upstream model
> > (new kernel release every 2-3 months and discussions on upstreaming
> > some core subsystems enhancements easily taking weeks/months) upstream
> > first policy just doesn't give enough business benefits.  Before you
> > have everything upstream and changes propagate itself to LTS and Android
> > kernels (which should also move faster BTW) it can be easily 2 years
> > since start of the effort and your SoC is now obsolete.
> 
> If there is a one-off, completely independent SoC that no-one (not even
> the vendor) cares about a year after the device goes on sale, I don't
> think we want it upstream either. However, SoC vendors tend to work on
> SoC families with some variations within a family (like CPU upgrades,
> maybe a new interconnect etc.) but in general a lot of code that can be
> reused. That's where upstreaming is highly beneficial to the vendor in
> the long run, since such a SoC family has a longer life span than the
> individual device derived from a specific SoC.

I agree that for SoC families upstreaming is highly beneficial.
Unfortunately, for whole products this doesn't seem to be
the case with current upstream policies.

> I'm also aware that vendors don't always want to disclose their SoC
> details until the device goes public, so that's another business
> argument against upstreaming first, especially in the mobile world.
> 
> One impediment to upstreaming in my experience is that vendors tend to
> develop the initial SoC port against an old kernel version (e.g. based
> on the Android version they target). Forward-porting to latest mainline
> all of a sudden becomes a much larger task that companies are not always
> willing to (sufficiently) invest in. So if an "upstream first" policy is
> not always feasible from a (mobile) business perspective, "develop
> against upstream" is a better second option. An initial SoC port doesn't
> need all the additional features that Android kernels provide, so it's
> usually doable with what is available upstream. There is more effort
> initially since targeting certain Android versions requires back-porting,
> however it pays off in the long run for the SoC *family*.
> 
> Trying to get on-topic: where organisations providing kernels like LSK
> (Linaro) can help is offering to integrate/maintain the SoC back-port
> while encouraging the SoC vendors to focus on developing against the
> latest upstream. It looks to me that on (too) many occasions SoC vendors
> take LSK as their development base for new SoC ports, making the
> forward-porting effort significantly larger (and potentially ignored).

Fully agreed.

Best regards,
--
Bartlomiej Zolnierkiewicz
Samsung R&D Institute Poland
Samsung Electronics

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  8:41                         ` Jiri Kosina
@ 2016-09-07 18:44                           ` Mark Brown
  2016-09-08 17:06                             ` Frank Rowand
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-07 18:44 UTC (permalink / raw)
  To: Jiri Kosina; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Wed, Sep 07, 2016 at 10:41:08AM +0200, Jiri Kosina wrote:

> consumer space in this area. Consumer space folks seem to be in a 
> situation that server space people have been in 2.0 / 2.2 era, where we 
> had exactly the same problem with server hardware (i.e lack of timely 
> focus on upstream in the process).

As people keep saying, that's not really entirely it - the timescales on
consumer hardware are *much* tighter than those on server hardware, which
does make a big difference to what you can do here.

> Also, it feels like consumer HW is more in a "fire and forget" mode of 
> operation, while server hardware usually sticks around for a rather long 
> time.

This is part of it.  There *is* a long-tail lifespan for a lot of
consumer hardware, but it is mostly among the lower-tier vendors and
products, which for the most part reuse earlier work; we do see a lot
of IP reuse, though.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  9:32                       ` Catalin Marinas
  2016-09-07 13:07                         ` Bartlomiej Zolnierkiewicz
@ 2016-09-07 18:49                         ` Mark Brown
  2016-09-09 15:06                         ` Alex Shi
  2 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-07 18:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: James Bottomley, ltsi-dev, gregkh, ksummit-discuss,
	Bartlomiej Zolnierkiewicz

On Wed, Sep 07, 2016 at 10:32:50AM +0100, Catalin Marinas wrote:

> Trying to get on-topic: where organisations providing kernels like LSK
> (Linaro) can help is offering to integrate/maintain the SoC back-port
> while encouraging the SoC vendors to focus on developing against the
> latest upstream. It looks to me that on (too) many occasions SoC vendors

You can sit between a vendor and their customer during bringup and
integration if you want to but I'm not sure you'll find many other
people willing to do that, and I'm not sure that anyone else would
want a third party in that path anyway.

> take LSK as their development base for new SoC ports, making the
> forward-porting effort significantly larger (and potentially ignored).

I'm not sure that one direction is that much harder than the other TBH,
it's more about the motivation being there in the first place.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 16:46                     ` Olof Johansson
@ 2016-09-08  8:34                       ` Linus Walleij
  2016-09-08  8:55                         ` Vinod Koul
  2016-09-09 14:23                         ` Rob Herring
  0 siblings, 2 replies; 122+ messages in thread
From: Linus Walleij @ 2016-09-08  8:34 UTC (permalink / raw)
  To: Olof Johansson; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Tue, Sep 6, 2016 at 6:46 PM, Olof Johansson <olof@lixom.net> wrote:

> Chrome OS was successful in this (if I might say so myself), getting
> several vendors who earlier had very thin upstream presence to
> significantly improve. I haven't seen all that many other projects
> being able to do it, but for those of you who are in positions to help
> steer SoC choices, do keep this in mind, work with your internal
> development teams to make them understand the importance of this, and
> make it a priority.

Actually what you did with SoC vendors from Chrome OS and stating
clearly that upstream presence is a factor in procurement was the
*only* thing I have ever seen that actually works to change the
behaviour of an entire company, apart from dedicated individuals on
the inside of the companies.

It got one major SoC vendor "hooked" on upstreaming to the point
that they have now come around to internalize that way of working,
at least partly.

So Chrome OS SoC procurement did good. You should be proud.

When it comes to Android, as I think I remarked in the past, the
problem since its inception is that the Android people making Nexus
devices (or whatever they will call it now) have traditionally thought of
themselves as inferior by being tied to someone actually doing
the hardware such as HTC, Samsung, LG etc, and they see it
as those companies are doing the actual procurement of
components and SoC, where BSP software is just another
"component".

The day the Android people say that for a Nexus(-ish) device it's
gonna be all upstream kernel and they will pick the SoC that
delivers that, then things will happen. But as it seems, they are
not doing the SoC pick, it is done by someone else. But I guess
they *do* pick which company will make the Nexus-ish device and
they could communicate this along to them.

They can also say "upstream strategy document or no playstore
for you" to all handset and tablet vendors any day, but I guess it
would be perceived as too aggressive. But I would personally have
used that hammer immediately.

Yours,
Linus Walleij

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08  8:34                       ` Linus Walleij
@ 2016-09-08  8:55                         ` Vinod Koul
  2016-09-09 14:32                           ` Rob Herring
  2016-09-09 14:23                         ` Rob Herring
  1 sibling, 1 reply; 122+ messages in thread
From: Vinod Koul @ 2016-09-08  8:55 UTC (permalink / raw)
  To: Linus Walleij; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Thu, Sep 08, 2016 at 10:34:48AM +0200, Linus Walleij wrote:
> On Tue, Sep 6, 2016 at 6:46 PM, Olof Johansson <olof@lixom.net> wrote:
> 
> > Chrome OS was successful in this (if I might say so myself), getting
> > several vendors who earlier had very thin upstream presence to
> > significantly improve. I haven't seen all that many other projects
> > being able to do it, but for those of you who are in positions to help
> > steer SoC choices, do keep this in mind, work with your internal
> > development teams to make them understand the importance of this, and
> > make it a priority.
> 
> Actually what you did with SoC vendors from Chrome OS and stating
> clearly that upstream presence is a factor in procurement was the
> *only* thing I have ever seen that actually works to change the
> behaviour of an entire company, apart from dedicated individuals on
> the inside of the companies.
> 
> It got one major SoC vendor "hooked" on upstreaming to the point
> that they have now come around to internalize that way of working,
> at least partly.
> 
> So Chrome OS SoC procurement did good. You should be proud.
> 
> When it comes to Android, as I think I remarked in the past, the
> problem since its inception is that the Android people making Nexus
> devices (or whatever they will call it now) have traditionally thought of
> themselves as inferior by being tied to someone actually doing
> the hardware such as HTC, Samsung, LG etc, and they see it
> as those companies are doing the actual procurement of
> components and SoC, where BSP software is just another
> "component".
> 
> The day the Android people say that for a Nexus(-ish) device it's
> gonna be all upstream kernel and they will pick the SoC that
> delivers that, then things will happen. But as it seems, they are
> not doing the SoC pick, it is done by someone else. But I guess
> they *do* pick which company will make the Nexus-ish device and
> they could communicate this along to them.
> 
> They can also say "upstream strategy document or no playstore
> for you" to all handset and tablet vendors any day, but I guess it
> would be perceived as too aggressive. But I would personally have
> used that hammer immediately.

And I think things might be better in the future. Brillo (though Android)
seems to have a ChromeOS kind of upstream strategy in place [1], though it is
not clear to me yet if that is mandatory.

Personally, although we wanted to upstream, we never got the support and time
for upstreaming when we were on Android only. ChromeOS helped us to get a
business requirement and support for upstreaming activities.

[1]:
https://android.googlesource.com/device/generic/brillo/+/master/docs/KernelDevelopmentGuide.md


-- 
~Vinod

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07 18:44                           ` Mark Brown
@ 2016-09-08 17:06                             ` Frank Rowand
  2016-09-09 10:32                               ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: Frank Rowand @ 2016-09-08 17:06 UTC (permalink / raw)
  To: Mark Brown, Jiri Kosina
  Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On 09/07/16 11:44, Mark Brown wrote:
> On Wed, Sep 07, 2016 at 10:41:08AM +0200, Jiri Kosina wrote:
> 
>> consumer space in this area. Consumer space folks seem to be in a 
>> situation that server space people have been in 2.0 / 2.2 era, where we 
>> had exactly the same problem with server hardware (i.e lack of timely 
>> focus on upstream in the process).
> 
> As people keep saying that's not really entirely it - the timescales on
> consumer hardware are *much* tighter than those on server hardware which
> does make a big difference to what you can do here. 

You might be surprised about server hardware. I had a project where I
had to add kernel support (in a proprietary unix) to a release that
shipped before there was any hardware to test on.
 

>> Also, it feels like consumer HW is more in a "fire and forget" mode of 
>> operation, while server hardware usually sticks around for a rather long 
>> time.
> 
> This is part of it.  There *is* a long tail lifespan for a lot of
> consumer hardware but it is in the lower tier vendors and products who
> are for the most part reusing earlier work and we do see a lot of IP
> reuse though.

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-05 22:44             ` NeilBrown
  2016-09-06  0:57               ` Mark Brown
@ 2016-09-08 18:33               ` Bird, Timothy
  2016-09-08 22:38                 ` NeilBrown
  1 sibling, 1 reply; 122+ messages in thread
From: Bird, Timothy @ 2016-09-08 18:33 UTC (permalink / raw)
  To: NeilBrown, Mark Brown; +Cc: ltsi-dev, James Bottomley, ksummit-discuss

> -----Original Message-----
> From: ltsi-dev-bounces@lists.linuxfoundation.org [mailto:ltsi-dev-
> bounces@lists.linuxfoundation.org] On Behalf Of NeilBrown
> Sent: Monday, September 05, 2016 3:44 PM
> To: Mark Brown <broonie@kernel.org>
> Cc: ltsi-dev@lists.linuxfoundation.org; ksummit-
> discuss@lists.linuxfoundation.org; Levin, Alexander
> <alexander.levin@verizon.com>; James Bottomley
> <James.Bottomley@HansenPartnership.com>
> Subject: Re: [LTSI-dev] [Ksummit-discuss] [Stable kernel] feature backporting
> collaboration
> 
> On Mon, Sep 05 2016, Mark Brown wrote:
> 
> > [ Unknown signature status ]
> > On Mon, Sep 05, 2016 at 11:45:52AM +1000, NeilBrown wrote:
> >> On Sat, Sep 03 2016, Bird, Timothy wrote:
> >
> >> > Where we are now with some of these SoCs is at millions of lines of
> >> > code out-of-tree.  It's being reduced, slowly, but there are still
> >> > significant areas where the mainline kernel just doesn't have the
> >> > support needed for shipping product. My pet peeve is support for
> >> > charging over USB, where Linaro has had a patch set
> >> > being stalled and/or ignored by the USB maintainer for 2 years!!
> >
> >> Do you have a link to that?  I have an interest in charging over USB.
> >
> > This is it:
> >
> >     https://lkml.org/lkml/2016/7/1/35
> >
> > it's been more like one year than two and there has been progress but
> > there's also been an awful lot of latency in the process too.
> 
> Really?  That is worthy of a "pet peeve"?
> The patch set does highlight an important area of missing functionality,
> but doesn't (IMO) display much understanding of the problem space.  I'm
> not surprised it hasn't made progress.

First - thanks for responding to the patch.  I'm hopeful it will see movement.
My "pet peeve" (which *is* a bit histrionic, I suppose) is not so much about this
individual patch (though IMHO it's taken longer to get worked out than it should have).
Rather, I find it shocking that the mainline kernel is missing such a fundamental
feature that is critical to mobile devices.  I brought this up at the kernel summit
last year (so it's not a new rant for me).  But basically, one can not run a mainline
kernel on any phone that I'm aware of, and charge the device.

This means that hobbyists are locked out of experimenting with mainline on their
devices (unless they have the fortitude to manually swap batteries, or switch to
a vendor kernel when they need to charge).  IMHO this is just too much of a burden.
I mean, you might expect that some functionality of the phone might be missing if
you went all-mainline, or pure open source (like proprietary camera modules, or NFC
support).  You'll probably not have access to trivial things like the touch screen or 
display ;-).  But sheesh - to not have your device last longer than a few hours before
you have to do some kludgy work-around, if you want to use mainline on it?
That's retarded.

IMHO this is blocking SoC mainlining efforts by non-vendor people more than any other
single issue.

</rant>
 -- Tim

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-06 22:34                           ` James Bottomley
@ 2016-09-08 18:55                             ` Bird, Timothy
  2016-09-08 19:19                               ` gregkh
  0 siblings, 1 reply; 122+ messages in thread
From: Bird, Timothy @ 2016-09-08 18:55 UTC (permalink / raw)
  To: James Bottomley, Mark Brown, gregkh; +Cc: ltsi-dev, ksummit-discuss

> -----Original Message-----
> From: ksummit-discuss-bounces@lists.linuxfoundation.org [mailto:ksummit-
> discuss-bounces@lists.linuxfoundation.org] On Behalf Of James Bottomley
> On September 6, 2016 6:20:58 PM EDT, Mark Brown <broonie@kernel.org>
> wrote:
> >On Tue, Sep 06, 2016 at 09:44:04PM +0200, gregkh@linuxfoundation.org
> >wrote:
> >> On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
> >> >    3. Increase the pain.  Not sure I like this, but in theory, we
> >could
> >> >       churn the upstream API to increase the pain of upports, but
> >it would
> >> >       also cause a lot of issues with backports.
> >
> >> I tried doing this in the past.  It did cause pain for out-of-tree
> >> modules, but then they got really good and abstracted things away so
> >> that it made their future kernel porting efforts even easier than
> >> before, making their need to upstream code even less.  And then when
> >> they did want to upstream stuff, it took more work unwinding the
> >> abstraction layer.
> >
> >> So watch out for unintended consequences here :)
> >
> >The other big unintended consequence I'd worry about here is that it
> >will present an obstacle to someone who wants to try to upstream
> >something while working in a downstream environment - if someone is
> >looking at some code but the changes for upstream are too great then it
> >might make it too much work for them to try if it's not their primary
> >job.
> >
> >I'd also worry about annoying people who are working upstream as well,
> >it's annoying having things break randomly due to API changes (both as
> >the submitter and as a maintainer or reviewer).
> 
> Ok, so everyone went straight for the option I didn't like. I knew I shouldn't have
> included it.   So what about options 1 or 2, or even something I hadn't thought
> of?

I think that it's important to identify key features that inhibit non-vendor developers
from working on stuff in mainline, and fix those.  That's why I've got a bee in my bonnet
about USB charging.

We've got literally over a billion devices in the field that run Linux, but can't run mainline.
This is very different from the enterprise space, where you can run a mainline kernel but
you're just missing a few thousand vendor patches.  Basic functionality will still work.

There are thousands of non-vendor kernel developers working on these devices. Some of
them, I believe, would play around with this stuff and contribute to mainline, if they could
use it on their devices.  Option 3, by the way, makes it harder for that category of developer
to contribute.
 -- Tim

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08 18:55                             ` Bird, Timothy
@ 2016-09-08 19:19                               ` gregkh
  2016-09-09 10:45                                 ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: gregkh @ 2016-09-08 19:19 UTC (permalink / raw)
  To: Bird, Timothy; +Cc: James Bottomley, ltsi-dev, ksummit-discuss

On Thu, Sep 08, 2016 at 06:55:12PM +0000, Bird, Timothy wrote:
> > -----Original Message-----
> > From: ksummit-discuss-bounces@lists.linuxfoundation.org [mailto:ksummit-
> > discuss-bounces@lists.linuxfoundation.org] On Behalf Of James Bottomley
> > On September 6, 2016 6:20:58 PM EDT, Mark Brown <broonie@kernel.org>
> > wrote:
> > >On Tue, Sep 06, 2016 at 09:44:04PM +0200, gregkh@linuxfoundation.org
> > >wrote:
> > >> On Tue, Sep 06, 2016 at 11:30:31AM -0400, James Bottomley wrote:
> > >> >    3. Increase the pain.  Not sure I like this, but in theory, we
> > >could
> > >> >       churn the upstream API to increase the pain of upports, but
> > >it would
> > >> >       also cause a lot of issues with backports.
> > >
> > >> I tried doing this in the past.  It did cause pain for out-of-tree
> > >> modules, but then they got really good and abstracted things away so
> > >> that it made their future kernel porting efforts even easier than
> > >> before, making their need to upstream code even less.  And then when
> > >> they did want to upstream stuff, it took more work unwinding the
> > >> abstraction layer.
> > >
> > >> So watch out for unintended consequences here :)
> > >
> > >The other big unintended consequence I'd worry about here is that it
> > >will present an obstacle to someone who wants to try to upstream
> > >something while working in a downstream environment - if someone is
> > >looking at some code but the changes for upstream are too great then it
> > >might make it too much work for them to try if it's not their primary
> > >job.
> > >
> > >I'd also worry about annoying people who are working upstream as well,
> > >it's annoying having things break randomly due to API changes (both as
> > >the submitter and as a maintainer or reviewer).
> > 
> > Ok, so everyone went straight for the option I didn't like. I knew I shouldn't have
> > included it.   So what about options 1 or 2, or even something I hadn't thought
> > of?
> 
> I think that it's important to identify key features that inhibit non-vendor developers
> from working on stuff in mainline, and fix those.  That's why I've got a bee in my bonnet
> about USB charging.

Why about that topic?  NO ONE SUBMITTED PATCHES FOR IT!  Until recently,
and that's only because Linaro decided to pick one of the random vendor
tree solutions and tried to upstream it.

That's kind of proof that upstream is being flat-out ignored...

> We've got literally over a billion devices in the field that run Linux, but can't run mainline.
> This is very different from the enterprise space, where you can run a mainline kernel but
> you're just missing a few thousand vendor patches.  Basic functionality will still work.
> 
> There are thousands of non-vendor kernel developers working on these devices. Some of
> them, I believe, would play around with this stuff and contribute to mainline, if they could
> use it on their devices.  Option 3, by the way, makes it harder for that category of developer
> to contribute.

Well for this topic, upstream was ignored, so I don't know what else we
could really do here.  Except congratulate Linaro for doing the dirty
work, they are doing good stuff here in trying to reduce the delta.

thanks,

greg k-h

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-08 18:33               ` [Ksummit-discuss] [LTSI-dev] " Bird, Timothy
@ 2016-09-08 22:38                 ` NeilBrown
  2016-09-09 11:01                   ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: NeilBrown @ 2016-09-08 22:38 UTC (permalink / raw)
  To: Bird, Timothy, Mark Brown; +Cc: ltsi-dev, James Bottomley, ksummit-discuss

On Fri, Sep 09 2016, Bird, Timothy wrote:

>
> First - thanks for responding to the patch.  I'm hopeful it will see movement.
> My "pet peeve" (which *is* a bit histrionic, I suppose) is not so much about this
> individual patch (though IMHO it's taken longer to get worked out than it should have).
> Rather, I find it shocking that the mainline kernel is missing such a fundamental
> feature that is critical to mobile devices.  I brought this up at the kernel summit
> last year (so it's not a new rant for me).  But basically, one can not run a mainline
> kernel on any phone that I'm aware of, and charge the device.

I think this is mischaracterizing the problem.
The problem isn't that the kernel is missing this fundamental feature.  The
problem is that it has multiple half-way attempts to support this
functionality, but they are inconsistently implemented and bordering on
incoherent.

There is a usb notifier which allows other drivers, such as power
managers, to be told when a cable is attached to a USB port.  That is
enough to get basic charger functionality working.  But not all phys
send a notification and those that do, don't do it in a consistent way.
There is also extcon, which some phys use to report a cable, but not
all.  It provides more details of the cable, but cannot report anything
about a current limit negotiated with a host.

So you certainly can make this work with mainline (I've done it)
without any extra infrastructure.

If it were me doing this work, I'd clean up the current infrastructure
first and make sure it was used consistently, and documented coherently.

The current patchset isn't exactly adding a third way to do things, but
it is adding stuff without cleaning up what is currently there.  This
won't make the code less messy.

NeilBrown

>
> This means that hobbyists are locked out of experimenting with mainline on their
> devices (unless they have the fortitude to manually swap batteries, or switch to
> a vendor kernel when they need to charge).  IMHO this is just too much of a burden.
> I mean, you might expect that some functionality of the phone might be missing if
> you went all-mainline, or pure open source (like proprietary camera modules, or NFC
> support).  You'll probably not have access to trivial things like the touch screen or 
> display ;-).  But sheesh - to not have your device last longer than a few hours before
> you have to do some kludgy work-around, if you want to use mainline on it?
> That's retarded.
>
> IMHO this is blocking SoC mainlining efforts by non-vendor people more than any other
> single issue.
>
> </rant>
>  -- Tim

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08 17:06                             ` Frank Rowand
@ 2016-09-09 10:32                               ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-09 10:32 UTC (permalink / raw)
  To: Frank Rowand; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Thu, Sep 08, 2016 at 10:06:22AM -0700, Frank Rowand wrote:
> On 09/07/16 11:44, Mark Brown wrote:

> > As people keep saying that's not really entirely it - the timescales on
> > consumer hardware are *much* tighter than those on server hardware which
> > does make a big difference to what you can do here. 

> You might be surprised about server hardware. I had a project where I
> had to add kernel support (in a proprietary unix) to a release that
> shipped before there was any hardware to test on.

Right, and there's some consumer hardware that turns much slower for
various reasons but on average...

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08 19:19                               ` gregkh
@ 2016-09-09 10:45                                 ` Mark Brown
  2016-09-09 11:03                                   ` gregkh
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-09 10:45 UTC (permalink / raw)
  To: gregkh; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, Baolin.Wang

On Thu, Sep 08, 2016 at 09:19:30PM +0200, gregkh@linuxfoundation.org wrote:
> On Thu, Sep 08, 2016 at 06:55:12PM +0000, Bird, Timothy wrote:

> > I think that it's important to identify key features that inhibit non-vendor developers
> > from working on stuff in mainline, and fix those.  That's why I've got a bee in my bonnet
> > about USB charging.

> Why about that topic?  NO ONE SUBMITTED PATCHES FOR IT!  Until recently,
> and that's only because Linaro decided to pick one of the random vendor
> tree solutions and tried to upstream it.

Baolin wrote all that code from scratch - it's not based off a vendor
tree.

> That's kind of proof that upstream is being flat-out ignored...

Or that people were working on other things, it's not like this is the
only thing that people are missing in mainline.

> could really do here.  Except congratulate Linaro for doing the dirty
> work, they are doing good stuff here in trying to reduce the delta.

Thanks, especially to Baolin for his persistence.

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-08 22:38                 ` NeilBrown
@ 2016-09-09 11:01                   ` Mark Brown
  2016-09-09 22:17                     ` NeilBrown
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-09 11:01 UTC (permalink / raw)
  To: NeilBrown; +Cc: ltsi-dev, ksummit-discuss, Baolin.Wang, James Bottomley

On Fri, Sep 09, 2016 at 08:38:07AM +1000, NeilBrown wrote:

> There is a usb notifier which allows other drivers, such as power
> managers, to be told when a cable is attached to a USB port.  That is
> enough to get basic charger functionality working.  But not all phys
> send a notification and those that do, don't do it in a consistent way.
> There is also extcon, which some phys use to report a cable, but not
> all.  It provides more details of the cable, but cannot report anything
> about a current limit negotiated with a host.

Right, we did look at these early on but they're all working with
subsets of the functionality - for example physical presence isn't
enough to kick off high current charging.  Part of the thinking was to
create something which could aggregate the various bits of information
that individual subsystems are detecting to produce a coherent view.

> So you certainly can make this work with mainline (I've done it)
> without any extra infrastructure.

You can support low rate charging (and some systems will just do chunks
of this autonomously in the hardware with no OS intervention) but I'm
not clear how high rate is going to work.

> The current patchset isn't exactly adding a third way to do things, but
> it is adding stuff without cleaning up what is currently there.  This
> won't make the code less messy.

It wasn't clear that the messiness wasn't just because nothing is taking
a top level view of what's going on.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-09 10:45                                 ` Mark Brown
@ 2016-09-09 11:03                                   ` gregkh
  2016-09-09 11:48                                     ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: gregkh @ 2016-09-09 11:03 UTC (permalink / raw)
  To: Mark Brown; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, Baolin.Wang

On Fri, Sep 09, 2016 at 11:45:45AM +0100, Mark Brown wrote:
> On Thu, Sep 08, 2016 at 09:19:30PM +0200, gregkh@linuxfoundation.org wrote:
> > On Thu, Sep 08, 2016 at 06:55:12PM +0000, Bird, Timothy wrote:
> 
> > > I think that it's important to identify key features that inhibit non-vendor developers
> > > from working on stuff in mainline, and fix those.  That's why I've got a bee in my bonnet
> > > about USB charging.
> 
> > Why about that topic?  NO ONE SUBMITTED PATCHES FOR IT!  Until recently,
> > and that's only because Linaro decided to pick one of the random vendor
> > tree solutions and tried to upstream it.
> 
> Baolin wrote all that code from scratch - it's not based off a vendor
> tree.

Oh that's even worse!  Well, worse from the case of "the vendors
shipping this type of functionality really do not care".

> > That's kind of proof that upstream is being flat-out ignored...
> 
> Or that people were working on other things, it's not like this is the
> only thing that people are missing in mainline.

What is the list of things that you know of that we are currently
missing?

> > could really do here.  Except congratulate Linaro for doing the dirty
> > work, they are doing good stuff here in trying to reduce the delta.
> 
> Thanks, especially to Baolin for his persistence.

Yes, that's a great job, especially given that it was started "from
scratch".

greg k-h

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-09 11:03                                   ` gregkh
@ 2016-09-09 11:48                                     ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-09 11:48 UTC (permalink / raw)
  To: gregkh; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, Baolin.Wang

On Fri, Sep 09, 2016 at 01:03:28PM +0200, gregkh@linuxfoundation.org wrote:
> On Fri, Sep 09, 2016 at 11:45:45AM +0100, Mark Brown wrote:

> > Baolin wrote all that code from scratch - it's not based off a vendor
> > tree.

> Oh that's even worse!  Well, worse from the case of "the vendors
> shipping this type of functionality really do not care".

Well, another way of looking at it is that one of the big goals people
have in funding Linaro is to get this sort of shared problem worked on.

> > > That's kind of proof that upstream is being flat-out ignored...

> > Or that people were working on other things, it's not like this is the
> > only thing that people are missing in mainline.

> What is the list of things that you know of that we are currently
> missing?

There's a page in the elinux wiki that people have been using to track
some of this (intermittently it must be said):

   http://elinux.org/Kernel_areas_of_focus_for_mainlining

which focuses a lot on specific functionality rather than higher level
things and obviously internally people will look at their own pain
points.  There's things like big.LITTLE (the EAS work) and system wide
voltage and frequency scaling which are needed for competitive products
but aren't on there.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08  8:34                       ` Linus Walleij
  2016-09-08  8:55                         ` Vinod Koul
@ 2016-09-09 14:23                         ` Rob Herring
  1 sibling, 0 replies; 122+ messages in thread
From: Rob Herring @ 2016-09-09 14:23 UTC (permalink / raw)
  To: Linus Walleij; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Thu, Sep 8, 2016 at 3:34 AM, Linus Walleij <linus.walleij@linaro.org> wrote:
> On Tue, Sep 6, 2016 at 6:46 PM, Olof Johansson <olof@lixom.net> wrote:
>
>> Chrome OS was successful in this (if I might say so myself), getting
>> several vendors who earlier had very thin upstream presence to
>> significantly improve. I haven't seen all that many other projects
>> being able to do it, but for those of you who are in positions to help
>> steer SoC choices, do keep this in mind, work with your internal
>> development teams to make them understand the importance of this, and
>> make it a priority.
>
> Actually what you did with SoC vendors from Chrome OS and stating
> clearly that upstream presence is a factor in procurement was the
> *only* thing I have ever seen that actually works to change the
> behaviour of an entire company, apart from dedicated individuals on
> the inside of the companies.
>
> It got one major SoC vendor "hooked" on upstreaming to the point
> that they have now come around to internalize that way of working,
> at least partly.

I think a large part of that is being hungry for the business to do
whatever it takes to win it. If QCom was in a ChromeOS device, I don't
think their tree would be much better than Android other than pieces
not needed can be dropped. For example, ChromeOS bypasses the whole
SoC camera subsystem mess by using USB cameras.

> So Chrome OS SoC procurement did good. You should be proud.
>
> When it comes to Android, as I think I remarked in the past, the
> problem since its inception is that the Android people making Nexus
> devices (or whatever they will call it now) have traditionally thought of
> themselves as inferior by being tied to someone actually doing
> the hardware such as HTC, Samsung, LG etc, and they see it
> as those companies are doing the actual procurement of
> components and SoC, where BSP software is just another
> "component".
>
> The day the Android people say that for a Nexus(-ish) device it's
> gonna be all upstream kernel and they will pick the SoC that
> delivers that, then things will happen. But as it seems, they are
> not doing the SoC pick, it is done by someone else. But I guess
> they *do* pick which company will make the Nexus-ish device and
> they could communicate this along to them.
>
> They can also say "upstream strategy document or no playstore
> for you" to all handset and tablet vendors any day, but I guess it
> would be perceived as too aggressive. But I would personally have
> used that hammer immediately.

It's not that easy. After it's mandated, what's step 2? It would be a
multi year process even if vendors stopped everything else to only get
things upstream.

Rob

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-08  8:55                         ` Vinod Koul
@ 2016-09-09 14:32                           ` Rob Herring
  0 siblings, 0 replies; 122+ messages in thread
From: Rob Herring @ 2016-09-09 14:32 UTC (permalink / raw)
  To: Vinod Koul; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Thu, Sep 8, 2016 at 3:55 AM, Vinod Koul <vinod.koul@intel.com> wrote:
> On Thu, Sep 08, 2016 at 10:34:48AM +0200, Linus Walleij wrote:
>> On Tue, Sep 6, 2016 at 6:46 PM, Olof Johansson <olof@lixom.net> wrote:
>>
>> > Chrome OS was successful in this (if I might say so myself), getting
>> > several vendors who earlier had very thin upstream presence to
>> > significantly improve. I haven't seen all that many other projects
>> > being able to do it, but for those of you who are in positions to help
>> > steer SoC choices, do keep this in mind, work with your internal
>> > development teams to make them understand the importance of this, and
>> > make it a priority.
>>
>> Actually what you did with SoC vendors from Chrome OS and stating
>> clearly that upstream presence is a factor in procurement was the
>> *only* thing I have ever seen that actually works to change the
>> behaviour of an entire company, apart from dedicated individuals on
>> the inside of the companies.
>>
>> It got one major SoC vendor "hooked" on upstreaming to the point
>> that they have now come around to internalize that way of working,
>> at least partly.
>>
>> So Chrome OS SoC procurement did good. You should be proud.
>>
>> When it comes to Android, as I think I remarked in the past, the
>> problem since its inception is that the Android people making Nexus
>> devices (or whatever they will call it now) have traditionally thought of
>> themselves as inferior by being tied to someone actually doing
>> the hardware such as HTC, Samsung, LG etc, and they see it
>> as those companies are doing the actual procurement of
>> components and SoC, where BSP software is just another
>> "component".
>>
>> The day the Android people say that for a Nexus(-ish) device it's
>> gonna be all upstream kernel and they will pick the SoC that
>> delivers that, then things will happen. But as it seems, they are
>> not doing the SoC pick, it is done by someone else. But I guess
>> they *do* pick which company will make the Nexus-ish device and
>> they could communicate this along to them.
>>
>> They can also say "upstream strategy document or no playstore
>> for you" to all handset and tablet vendors any day, but I guess it
>> would be perceived as too aggressive. But I would personally have
>> used that hammer immediately.
>
> And I think things might be better in future. Brillo (though Android)
> seems to have ChromeOS kind of upstream strategy in place, though it is
> not clear to me yet if that is mandatory.

You are the first person I've heard say Brillo is doing things right.
IMO, they are reusing everything that is wrong with Android: the build
system, HALs and vendor kernels. The initial announcement talked about
leveraging the broad h/w support of Android. That's vendor kernels and
HALs (aka BSPs) for which many are not open source. As the saying
goes, if you have a BSP, you're doing it wrong.

Rob

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  9:32                       ` Catalin Marinas
  2016-09-07 13:07                         ` Bartlomiej Zolnierkiewicz
  2016-09-07 18:49                         ` Mark Brown
@ 2016-09-09 15:06                         ` Alex Shi
  2 siblings, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-09 15:06 UTC (permalink / raw)
  To: Catalin Marinas, Bartlomiej Zolnierkiewicz
  Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh



On 09/07/2016 05:32 PM, Catalin Marinas wrote:
> Trying to get on-topic: where organisations providing kernels like LSK
> (Linaro) can help is offering to integrate/maintain the SoC back-port
> while encouraging the SoC vendors to focus on developing against the
> latest upstream. It looks to me that on (too) many occasions SoC vendors
> take LSK as their development base for new SoC ports, making the
> forward-porting effort significantly larger (and potentially ignored).

With or without LSK, vendors are going to stay on the Android kernel, which
is also based on LTS. On the other hand, this collaboration saves a lot of
duplicated work, which gives them more time for long-term upstream work. I
have seen this happen among Linaro members.

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  8:33                       ` Jan Kara
  2016-09-07  8:41                         ` Jiri Kosina
@ 2016-09-09 15:21                         ` Alex Shi
  2016-09-12 15:34                         ` Christoph Hellwig
  2 siblings, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-09 15:21 UTC (permalink / raw)
  To: Jan Kara, Guenter Roeck
  Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh



On 09/07/2016 04:33 PM, Jan Kara wrote:
> Well there are risks both way - updating to a newer kernel certainly has
> risks (otherwise our kernel team & QA wouldn't have to spend several months
> working on testing & tweaking the distro when creating new release based on
> the new kernel) and backporting has risks as well. You want to find a
> kernel version where the added risk from all the backports does not
> outweigh the additional time for testing the kernel and generally
> stabilizing the product.
>  
> 								Honza

That's right. Feature backporting to LSK often involves the same kind of
judgment and considerations. That's also the reason why LSK is more
aggressive about moving to (and pushing) newer LTS releases. We do prefer
upstream, but like many distros we have to use a relatively new LTS...

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-09 11:01                   ` Mark Brown
@ 2016-09-09 22:17                     ` NeilBrown
  2016-09-12 17:37                       ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: NeilBrown @ 2016-09-09 22:17 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss, Baolin.Wang, James Bottomley

On Fri, Sep 09 2016, Mark Brown wrote:

> On Fri, Sep 09, 2016 at 08:38:07AM +1000, NeilBrown wrote:
>
>> There is a usb notifier which allows other drivers, such as power
>> managers, to be told when a cable is attached to a USB port.  That is
>> enough to get basic charger functionality working.  But not all phys
>> send a notification and those that do, don't do it in a consistent way.
>> There is also extcon, which some phys use to report a cable, but not
>> all.  It provides more details of the cable, but cannot report anything
>> about a current limit negotiated with a host.
>
> Right, we did look at these early on but they're all working with
> subsets of the functionality - for example physical presence isn't
> enough to kick off high current charging.  Part of the thinking was to
> create something which could aggregate the various bits of information
> that individual subsystems are detecting to produce a coherent view.
>
>> So you certainly can make this work with mainline (I've done it)
>> without any extra infrastructure.
>
> You can support low rate charging (and some systems will just do chunks
> of this autonomously in the hardware with no OS intervention) but I'm
> not clear how high rate is going to work.

Get the phy to register an extcon, and report the charger type.
Get the power supply to register with that extcon and, when an
EXTCON_CHG_USB_* cable is reported, start using more current.
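
To make that concrete, here is a rough userspace model in Python (not kernel
code; the kernel's extcon API is C and is not reproduced here). The Extcon
and PowerSupply classes and the per-type current limits are assumptions made
up for this illustration; only the cable names echo the EXTCON_CHG_USB_*
identifiers. The phy side reports a cable type, and the power supply, having
registered with the extcon, raises its input limit when a charger cable
shows up:

```python
from enum import Enum


class Cable(Enum):
    """Cable types a phy might report (names mirror EXTCON_CHG_USB_*)."""
    NONE = "none"
    CHG_USB_SDP = "EXTCON_CHG_USB_SDP"  # standard downstream port (a plain host)
    CHG_USB_CDP = "EXTCON_CHG_USB_CDP"  # charging downstream port
    CHG_USB_DCP = "EXTCON_CHG_USB_DCP"  # dedicated charger


# Assumed, illustrative input limits in mA. A real driver would take these
# from the relevant specification and the board design.
LIMIT_MA = {
    Cable.NONE: 0,
    Cable.CHG_USB_SDP: 100,   # before the host has configured the device
    Cable.CHG_USB_CDP: 1500,
    Cable.CHG_USB_DCP: 1500,
}


class Extcon:
    """Hypothetical stand-in for an extcon device owned by the phy."""

    def __init__(self):
        self.cable = Cable.NONE
        self._listeners = []

    def register_notifier(self, callback):
        self._listeners.append(callback)

    def set_cable(self, cable):
        # Only notify on an actual change of state.
        if cable is self.cable:
            return
        self.cable = cable
        for callback in self._listeners:
            callback(cable)


class PowerSupply:
    """Hypothetical charger driver that listens to the extcon."""

    def __init__(self, extcon):
        self.input_limit_ma = 0
        extcon.register_notifier(self.on_cable)

    def on_cable(self, cable):
        self.input_limit_ma = LIMIT_MA[cable]


extcon = Extcon()
charger = PowerSupply(extcon)
extcon.set_cable(Cable.CHG_USB_DCP)   # phy detected a dedicated charger
print(charger.input_limit_ma)         # prints 1500
```

The point of the sketch is only the shape of the data flow: one reporter, one
subscription, and the charger type as the payload.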

>
>> The current patchset isn't exactly adding a third way to do things, but

Actually I got that wrong.  It *does* exactly add a third way of doing
things.
->uchgr_nh  is a third, separate, notifier chain.

>> it is adding stuff without cleaning up what is currently there.  This
>> won't make the code less messy.
>
> It wasn't clear that the messiness wasn't just because nothing is taking
> a top level view of what's going on.

That is no excuse for leaving it there.  If you want to fix something,
take the top level view, tidy up the mess, and fix it.

The first step should be to do an audit of the current code.  See how
things are used, ask how they could be used more consistently and maybe
could be modified to meet your needs.  Part of this is making sure you
have a clear understanding of the need.  The current patchset doesn't show
that clear understanding at all.  This is particularly obvious in the way it
has incorrect defaults for the various charger types.

With a little bit of fixing, extcon is perfectly placed to report
charger types to the power supply.  I think that should be fixed up and
used.  No extra framework needed, just fix what we have.

For communicating the current negotiated in the USB config I see two
options.  One is to fix up the current usb_register_notifier() framework
so that it is used consistently, and have it communicate the allowed
current level.
The other is to remove usb_register_notifier() completely and replace it
with something like the uchgr_nh in the current patchset.
I think usb_register_notifier() is used for two different things.
One is to report a USB or a USB-OTG connection to the usb driver, so it
can configure as 'host' or 'gadget'.
The other is to report the current limit.
Some drivers use extcon for the USB-OTG notification.
Deciding on, and standardizing, a single way to notify which sort of USB
data cable has been plugged, would be very valuable.
The charger-type and negotiated-current notification could use the same
mechanism, or could use a separate mechanism, or could use two different
mechanisms.
I have no strong opinion on what mechanism should be used for each
(well ... I do, but I'll probably have a different opinion tomorrow).
But I *do* strongly believe that each particular type of communication
should be done just one way, and part of creating a new "framework" is
to make sure that all current use-cases fit neatly into the framework.
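
As a rough sketch of what "one way per type of communication" could look
like, here is a small userspace model in Python (not kernel code). The
ChargerView class, its method names and the per-type limits are all
hypothetical and exist only for this illustration. Charger type arrives
through one entry point, the current negotiated in the USB configuration
through another, and consumers subscribe to a single combined limit. The
bMaxPower conversion assumes the USB 2.0 encoding in units of 2 mA:

```python
from typing import Callable, List, Optional

# Assumed, illustrative per-type limits in mA.
TYPE_LIMIT_MA = {"SDP": 100, "CDP": 1500, "DCP": 1500}


def b_max_power_to_ma(b_max_power: int) -> int:
    """Convert a USB 2.0 bMaxPower field (units of 2 mA) to mA."""
    return b_max_power * 2


class ChargerView:
    """Hypothetical aggregator producing one coherent current limit."""

    def __init__(self) -> None:
        self.charger_type: Optional[str] = None  # None means nothing plugged
        self.configured_ma = 0                   # granted by the host, if any
        self.limit_ma = 0
        self._listeners: List[Callable[[int], None]] = []

    def subscribe(self, callback: Callable[[int], None]) -> None:
        self._listeners.append(callback)

    def report_cable(self, charger_type: Optional[str]) -> None:
        """Single entry point for charger-type detection."""
        self.charger_type = charger_type
        if charger_type is None:
            self.configured_ma = 0  # an unplug cancels any negotiated current
        self._update()

    def report_configured(self, milliamps: int) -> None:
        """Single entry point for the current negotiated with a host."""
        self.configured_ma = milliamps
        self._update()

    def _update(self) -> None:
        if self.charger_type is None:
            new_limit = 0
        elif self.charger_type == "SDP":
            # A plain host port: the negotiated value can raise the limit.
            new_limit = max(TYPE_LIMIT_MA["SDP"], self.configured_ma)
        else:
            new_limit = TYPE_LIMIT_MA[self.charger_type]
        if new_limit != self.limit_ma:
            self.limit_ma = new_limit
            for callback in self._listeners:
                callback(new_limit)


view = ChargerView()
seen: List[int] = []
view.subscribe(seen.append)
view.report_cable("SDP")                         # plugged into a host
view.report_configured(b_max_power_to_ma(250))   # host granted 500 mA
view.report_cable(None)                          # unplugged
print(seen)                                      # prints [100, 500, 0]
```

Whether the two inputs share a transport or not, the consumer only ever sees
one notification path, which is the property argued for above.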

NeilBrown

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-06  7:20   ` [Ksummit-discuss] [LTSI-dev] " Tsugikazu Shibata
@ 2016-09-10 12:00     ` Theodore Ts'o
  2016-09-12 16:27       ` Mark Brown
  2016-09-12  4:12     ` Alex Shi
  1 sibling, 1 reply; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-10 12:00 UTC (permalink / raw)
  To: Tsugikazu Shibata; +Cc: ltsi-dev, Greg KH, ksummit-discuss, Mark Brown

On Tue, Sep 06, 2016 at 07:20:37AM +0000, Tsugikazu Shibata wrote:
> 
> Finally, a request to the community from LTSI's standpoint:
> We would like a predictable process for how, and roughly when, an LTS
> will be released, so that companies can more easily plan around LTS,
> which in turn lets more users run a stable and secure kernel.

LTS is released in the Fall.  This year Greg K-H announced that the
next LTS would be 4.9 a month or so ago.  He said a week or two ago
that he reserved the right to not pick 4.9 and either fall back to 4.8
or wait until 4.10 if people abused the preannouncement (e.g., by
trying to squish into Linus's tree a lot of patches that were
extremely buggy and not ready for prime time).

Historically this has been a problem when the enterprise distributions
were extremely strict about the "upstream first" policy, and so
vendors who wanted specific features would try cram stuff into the LTS
kernel before they were ready, resulting in an *extremely* unstable
LTS candidate.  I'm going to guess that since upstream first is so
loosely followed by the mobile handset community that this is going to
be much less of an issue --- or the really abusive stuff (e.g., that
spreads sh*t all over the core scheduler code changes which are
extremely specific to a particular ARM SOC core, so badly that the
scheduler won't compile for any other architecture) would get rejected
by Linus with extreme prejudice anyway.

So the "about when" has been pretty easy to predict for quite a while
now.  And the process has also been roughly the same for a while; IIRC
the announcement came at the kernel summit, again with the caveat that
it might be subject to change if people abused the preannouncement.
And even when it wasn't preannounced because of the historic abuse
patterns, people who had observed past practices could generally
guesstimate the LTS candidate to within +/- a release.

If that is not enough of a process, could you please state what you
think would be more helpful?

Thanks,

					- Ted

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-06  7:20   ` [Ksummit-discuss] [LTSI-dev] " Tsugikazu Shibata
  2016-09-10 12:00     ` Theodore Ts'o
@ 2016-09-12  4:12     ` Alex Shi
  2016-09-12 16:09       ` Masami Hiramatsu
  1 sibling, 1 reply; 122+ messages in thread
From: Alex Shi @ 2016-09-12  4:12 UTC (permalink / raw)
  To: Tsugikazu Shibata, Theodore Ts'o
  Cc: ltsi-dev, Greg KH, ksummit-discuss, Mark Brown



On 09/06/2016 03:20 PM, Tsugikazu Shibata wrote:
> It may be too late, but I would like to comment from the LTSI project's standpoint.
> We started the LTSI project with a view similar to the above:
> - LTSI is based on an upstream-first policy. We provide a chance to
> merge vendor-required patches on top of LTS, but patches need to be
> in -next, or merged in a later upstream release and backported to LTS.
> The majority of LTSI's additional patches are drivers, but sometimes tools
> or new features are proposed. Such patches are reviewed on a case-by-case
> basis but should be self-contained.
> - We are approaching SoC vendors and device manufacturers to
> convince them to upstream their in-house patches. SoC vendors can
> create their own forks for their customers, but giving them a chance
> to merge their patches on top of LTS through LTSI provides some
> motivation to get such patches upstream in the long term. Yes, we hope
> our activities will reduce the problem discussed in this thread in the
> longer term.
> In fact, some SoC vendors already send us patches
> backported from upstream. We will continue to talk with SoC vendors and
> device manufacturers to solve the problem.
> - Also, some LTSI members are employees of companies, and they
> test the LTSI kernel hard before each release. Their fixes go
> upstream, not just to LTSI. We see LTSI as one of the activities
> filling the gap between companies and the community, creating the kernel
> used by companies and industry.

Thanks for Tsugikazu's explanation of LTSI!
LSK follows very similar rules to LTSI. If anyone would like to look into
LSK's policies, see the 'LSK INCLUSION CONSIDERATIONS' chapter of
https://wiki.linaro.org/LSK

Compared to LTSI, LSK is based on a newer LTS and carries more upstream
feature backports, which are requested by many SoC vendors. Apparently these
are widely needed in industry. And the separate feature branches give users
more flexibility in feature selection. If you don't want any of the feature
branches, what is left is pure LTS.

BTW, someone was worried about non-upstream features getting into the LSK
mainline. We set up rules to isolate them.

=====
The LSK is meant for stability, not early access, thus normally requested
features should be present in an upstream kernel (committed in the
mainline kernel, or on a clear path to upstream, i.e. merged into
linux-next or applied to a maintainer tree) prior to the request.
Exception rules for the few extra features which are necessary but not
merged upstream yet:
- they stay out of the LSK mainline in a separate branch, which is easy to
merge into LSK
- the sponsoring Linaro group or member company needs to provide the source
code and take responsibility for its quality
=====

> 
> Finally, A request to the community from LTSI's stand point is:
> We want to have some process to be expected; How or about when 
> LTS would be released. So that companies can easier to create their plan 
> to use LTS and that will cause more user can use stable and secure
> kernel.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-07  8:33                       ` Jan Kara
  2016-09-07  8:41                         ` Jiri Kosina
  2016-09-09 15:21                         ` Alex Shi
@ 2016-09-12 15:34                         ` Christoph Hellwig
  2 siblings, 0 replies; 122+ messages in thread
From: Christoph Hellwig @ 2016-09-12 15:34 UTC (permalink / raw)
  To: Jan Kara; +Cc: James Bottomley, ltsi-dev, ksummit-discuss, gregkh

On Wed, Sep 07, 2016 at 10:33:12AM +0200, Jan Kara wrote:
> Just to give a comparable numbers for SUSE. The coming SLE12 SP2 release is
> based on 4.4 as well. On top of 4.4.19 we currently carry some 4700 patches
> which together add/delete some 390k lines. Out of these some 280 patches
> are not backports of upstream patches (or at least in the process of going
> upstream), adding / deleting some 35k lines. So indeed we do have much less
> non-upstream stuff in the distro kernel. OTOH I'd note we still do have a
> considerable amount of backported stuff in the product that haven't even
> shipped yet...

And a not insignificant part of that has been carried around by SuSE
for years.  Often not submitted upstream or only half-heartedly, and
sometimes even rejected.  Leading to fun like ioctl numbers added in
SuSE and reused differently in mainline.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12  4:12     ` Alex Shi
@ 2016-09-12 16:09       ` Masami Hiramatsu
  2016-09-13  2:39         ` Alex Shi
  0 siblings, 1 reply; 122+ messages in thread
From: Masami Hiramatsu @ 2016-09-12 16:09 UTC (permalink / raw)
  To: Alex Shi
  Cc: ltsi-dev, ksummit-discuss, Mark Brown, Tsugikazu Shibata, Greg KH

Hi Alex,

On Mon, 12 Sep 2016 12:12:47 +0800
Alex Shi <alex.shi@linaro.org> wrote:

> On 09/06/2016 03:20 PM, Tsugikazu Shibata wrote:
> > It seems too late but Here I would like to comment from LTSI project's stand point:
> > We were started LTSI project similar view as above that is:
> > - LTSI is based on upstream first policy. We are providing a chance to
> > merge vendor-required patches on top of LTS but patches need to be
> > in -next or merged in later upstream and back ported to LTS.
> > Majority of LTSI's additional patches are drivers but sometimes tools or 
> > new features may proposed. Such patches are reviewed in case by case 
> > basis but should be self-contained.
> > - We are approaching to SoC vendors and device manufactures to 
> > convince their in-house patches to be upstream. It seems that
> > SoC vendors can create their own fork for their customers but
> > providing a chance to merge their patches on top of LTS as LTSI will
> > make some motivation to merge such patches into upstream as long 
> > term view. Yes, We hope our activities may reduce the problem 
> > discussed in this thread in longer term. 
> > Actually, some of SoC vendors would send us their patches 
> > backported from upstream. We will continue to discuss with SoCs and 
> > device manufactures to solve the problem.
> > - Also, some member of LTSI is employee of companies and they are
> > testing hard for LTSI kernel before the release. Those fixes are providing
> > to upstream not just LTSI actually. We see LTSI is being one of activity 
> > of filling the gap between companies and community to create the kernel 
> > used for the companies and industry.
> 
> Thanks to Tsugikazu for the explanation of LTSI!
> LSK follows rules very similar to LTSI's. If anyone would like to look
> into LSK's policies, see the 'LSK INCLUSION CONSIDERATIONS' chapter of
> https://wiki.linaro.org/LSK
> 
> Compared to LTSI, LSK is based on a newer LTS and carries more upstream
> feature backports, which are requested by many SoC vendors. Apparently
> these are needed widely in industry. And the separate feature branches
> give users more flexibility in feature selection. If you don't want any
> of the feature branches, what is left is pure LTS.

If I understand correctly, the major difference between LSK and LTSI is
whether they are SoC-neutral or not. LSK focuses on backporting
"features", not "SoC/board support", because many vendors may port their
kernel onto LSK to make a stable BSP/AOSP kernel for their devices. LTSI,
by contrast, aims to help vendors push their patches upstream by giving
them a motivation to get their patches merged within the LTSI merge window.

If we merge the effort of LTSI and LSK, we'll also give vendors a chance
to *backport* their SoC support from upstream.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-10 12:00     ` Theodore Ts'o
@ 2016-09-12 16:27       ` Mark Brown
  2016-09-12 17:14         ` Greg KH
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-12 16:27 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss, Greg KH

On Sat, Sep 10, 2016 at 08:00:55AM -0400, Theodore Ts'o wrote:
> On Tue, Sep 06, 2016 at 07:20:37AM +0000, Tsugikazu Shibata wrote:

> > Finally, A request to the community from LTSI's stand point is:
> > We want to have some process to be expected; How or about when 
> > LTS would be released. So that companies can easier to create their plan 
> > to use LTS and that will cause more user can use stable and secure
> > kernel.

> So the "about when" has been pretty easy to predict for quite a while
> now.  And the process has also been roughly the same for a while; IIRC
> the announcement came at the kernel summit, again with the caveat that

No, it was more usually sometime in late spring or summer (IIRC LinuxCon
Japan had some announcements).

> it might be subject to change if people abused the preannouncement.
> And even when it wasn't preannounced because of the historic abuse
> patterns, people who had observed past practices could generally
> guesstimate the LTS candidate to within +/- a release.

> If that is not enough of a process, could you please state what you
> think would be more helpful?

Personally I think what we've got at the minute with the
preannouncements is fairly good, though it might help if it were a bit
more explicit when announcements would be made - the change to
preannouncements at a less awkward point in the release cycle has helped
for the users I talk to, but there was always a very awkward period where
people were looking for an announcement but had no information to base
things on.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 16:27       ` Mark Brown
@ 2016-09-12 17:14         ` Greg KH
  2016-09-12 23:45           ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: Greg KH @ 2016-09-12 17:14 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Mon, Sep 12, 2016 at 05:27:14PM +0100, Mark Brown wrote:
> 
> Personally I think what we've got at the minute with the
> preannouncements is fairly good, though it might help if it were a bit
> more explicit when announcements would be made - the change to
> preannouncements at a less awkward point in the release cycle has helped
> for the users I talk to but there was always a very awkward period where
> people were looking for an announcement but no information to base
> things on.

So, where should I do this "announcement" that will be seen by everyone?

And remember, we are just trying out the "preannounce" thing this time,
let's see what happens, I'm not guaranteeing I will keep doing it...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-09 22:17                     ` NeilBrown
@ 2016-09-12 17:37                       ` Mark Brown
  2016-09-13  7:46                         ` NeilBrown
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-12 17:37 UTC (permalink / raw)
  To: NeilBrown; +Cc: ltsi-dev, ksummit-discuss, Baolin.Wang, James Bottomley

On Sat, Sep 10, 2016 at 08:17:49AM +1000, NeilBrown wrote:
> On Fri, Sep 09 2016, Mark Brown wrote:

Baolin, I may be misremembering some of our discussions here - please
jump in!

> > It wasn't clear that the messiness wasn't just because nothing is taking
> > a top level view of what's going on.

> That is no excuse for leaving it there.  If you want to fix something,
> take the top level view, tidy up the mess, and fix it.

> The first step should be to do an audit of the current code.  See how
> things are used, ask how they could be used more consistently and maybe
> could be modified to meet your needs.  Part of this is making sure you

We did that, bear in mind that this was started quite a while ago.  The
main thing I was discussing with Baolin at the time was handling of the
USB level negotiations rather than actual chargers.  

> have a clear understanding of the need.  The current patchset doesn't show
> that clear understanding at all.  This is particularly obvious in the way it
> has incorrect defaults for the various charger types.

Right.

> With a little bit of fixing, extcon is perfectly placed to report
> charger types to the power supply.  I think that should be fixed up and
> used.  No extra framework needed, just fix what we have.

But chargers aren't the world...

> For communicating the current negotiated in the USB config I see two
> options.  One is to fix up the current usb_register_notifier() framework
> so that it is used consistently, and have it communicate the allowed
> current level.
> The other is to remove usb_register_notifier() completely and replace it
> with something like the uchgr_nh in the current patchset.

...and then the power supply drivers need to also do this.  Since we
know we're likely to have a PHY and a gadget working together it seems
sensible to have a single bit of code that does the joining up here.

> I think usb_register_notifier() is used for two different things.
> One is to report a USB or a USB-OTG connection to the usb driver, so it
> can configure as 'host' or 'gadget'.
> The other is to report the current limit.
> Some drivers use extcon for the USB-OTG notification.
> Deciding on, and standardizing, a single way to notify which sort of USB
> data cable has been plugged, would be very valuable.
> The charger-type and negotiated-current notification could use the same
> mechanism, or could use a separate mechanism, or could use two different
> mechanisms.
> I have no strong opinion on what mechanism should be used for each
> (well ... I do, but I'll probably have a different opinion tomorrow).
> But I *do* strongly believe that each particular type of communication
> should be done just one way, and part of creating a new "framework" is
> to make sure that all current use-cases fit neatly into the framework.

Right, so a certain part of the thinking here was to hide these
decisions from the power supply drivers so they just see the combined
result of charger type and negotiated current and we could just start
off by reporting the negotiated current.
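
The idea above can be illustrated with a minimal userspace sketch. This
is not code from the patchset or from the kernel: every name in it
(usb_limit_hub, hub_subscribe, hub_set_type, hub_set_negotiated,
record_limit) is hypothetical, the per-type ceilings are placeholder
numbers, and the rule for combining charger type with negotiated current
is an assumption made only for illustration. The point is just that a
subscriber registers once and only ever sees a single combined limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical charger types; the names are illustrative only. */
enum chg_type { CHG_UNKNOWN, CHG_SDP, CHG_CDP, CHG_DCP };

/* A power supply driver registers one callback and sees one number. */
typedef void (*limit_cb)(unsigned int limit_ma, void *data);

#define MAX_SUBSCRIBERS 4

struct usb_limit_hub {
	enum chg_type type;          /* last type reported by the detection side */
	unsigned int negotiated_ma;  /* last current negotiated by the gadget */
	limit_cb cb[MAX_SUBSCRIBERS];
	void *cb_data[MAX_SUBSCRIBERS];
	size_t ncb;
};

/* Placeholder ceilings, NOT taken from the thread or from any spec. */
static unsigned int type_ceiling_ma(enum chg_type type)
{
	switch (type) {
	case CHG_SDP: return 500;
	case CHG_CDP: return 1500;
	case CHG_DCP: return 1500;
	default:      return 0;	/* 0 means "no ceiling known" */
	}
}

/*
 * Assumed policy: a dedicated charger needs no negotiation; otherwise
 * report the negotiated current, capped when a ceiling is known.
 */
static unsigned int combined_limit_ma(const struct usb_limit_hub *hub)
{
	unsigned int ceiling = type_ceiling_ma(hub->type);

	if (hub->type == CHG_DCP)
		return ceiling;
	if (ceiling && hub->negotiated_ma > ceiling)
		return ceiling;
	return hub->negotiated_ma;
}

/* Push the current combined limit to every subscriber. */
static void hub_notify(const struct usb_limit_hub *hub)
{
	unsigned int limit = combined_limit_ma(hub);
	size_t i;

	for (i = 0; i < hub->ncb; i++)
		hub->cb[i](limit, hub->cb_data[i]);
}

void hub_subscribe(struct usb_limit_hub *hub, limit_cb cb, void *data)
{
	assert(hub->ncb < MAX_SUBSCRIBERS);
	hub->cb[hub->ncb] = cb;
	hub->cb_data[hub->ncb] = data;
	hub->ncb++;
}

/* Called by whatever detects the charger type. */
void hub_set_type(struct usb_limit_hub *hub, enum chg_type type)
{
	hub->type = type;
	hub_notify(hub);
}

/* Called by whatever knows the negotiated USB current. */
void hub_set_negotiated(struct usb_limit_hub *hub, unsigned int ma)
{
	hub->negotiated_ma = ma;
	hub_notify(hub);
}

/* Example subscriber: stores the latest limit where 'data' points. */
void record_limit(unsigned int limit_ma, void *data)
{
	*(unsigned int *)data = limit_ma;
}
```

In this sketch a driver would call hub_subscribe() once with its own
callback, and the two producers would call hub_set_type() and
hub_set_negotiated() independently; starting off with only the
negotiated current simply means never calling hub_set_type().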

Going back to the more general point that spawned this subthread this is
all really useful feedback which could've been provided much earlier on
- we've not done a great job of ensuring that this review happened.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 17:14         ` Greg KH
@ 2016-09-12 23:45           ` Mark Brown
  2016-09-13  3:14             ` Theodore Ts'o
  2016-09-13  6:19             ` Greg KH
  0 siblings, 2 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-12 23:45 UTC (permalink / raw)
  To: Greg KH; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Mon, Sep 12, 2016 at 07:14:50PM +0200, Greg KH wrote:
> On Mon, Sep 12, 2016 at 05:27:14PM +0100, Mark Brown wrote:

> > for the users I talk to but there was always a very awkward period where
> > people were looking for an announcement but no information to base
> > things on.

> So, where should I do this "announcement" that will be seen by everyone?

I don't think it's a problem with location at the minute since you're
fairly quick at posting on your G+ and blog (though I didn't spot
anything on the stable list, it's possible I just can't see it due to
the volume there though), that was more about the old pre v4.4 days.

> And remember, we are just trying out the "preannounce" thing this time,
> let's see what happens, I'm not guaranteeing I will keep doing it...

Sure, and I think the main thing I'm saying here is that this lack of
predictability in your future plans is something the people I talk to 
mention and may be what Shibata-san was getting at also.

Part of the problem when the announcement was done after the release was
that there was a lot of trying to read the runes and guess what'll
happen and when going on behind the scenes, if people would like to
follow the LTS but also have product deadlines they can end up walking a
tightrope with their schedule and trying to predict LTS.  What you did
with v4.4 (announcing during the -rc cycle) addresses this, as does what
you've done this year pulling that even further forwards.  

If things don't work out with what you're doing with the preannouncement
it might be good to comment on that and say you intend to do
something different next year, but hopefully everything will be fine of
course.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 16:09       ` Masami Hiramatsu
@ 2016-09-13  2:39         ` Alex Shi
  0 siblings, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-13  2:39 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: ltsi-dev, ksummit-discuss, Mark Brown, Tsugikazu Shibata, Greg KH



On 09/13/2016 12:09 AM, Masami Hiramatsu wrote:
>> Thanks to Tsugikazu for the explanation of LTSI!
>> LSK follows rules very similar to LTSI's. If anyone would like to look
>> into LSK's policies, see the 'LSK INCLUSION CONSIDERATIONS' chapter of
>> https://wiki.linaro.org/LSK
>>
>> Compared to LTSI, LSK is based on a newer LTS and carries more upstream
>> feature backports, which are requested by many SoC vendors. Apparently
>> these are needed widely in industry. And the separate feature branches
>> give users more flexibility in feature selection. If you don't want any
>> of the feature branches, what is left is pure LTS.
> If I understand correctly, the major difference between LSK and LTSI is
> whether they are SoC-neutral or not. LSK focuses on backporting
> "features", not "SoC/board support", because many vendors may port their
> kernel onto LSK to make a stable BSP/AOSP kernel for their devices. LTSI,
> by contrast, aims to help vendors push their patches upstream by giving
> them a motivation to get their patches merged within the LTSI merge window.

Yes, that's right. Since each vendor often has different
drivers/devices for its hardware, it's hard for LSK to adopt and test
them, especially when the SoC platform is still held in the lab.

> 
> If we merge the effort of LTSI and LSK, we'll also give vendors a chance
> to *backport* their SoC support from upstream.
> 

Yes, that would give us an extra chance to share our work. In return,
the usage/test feedback will make LSK better.

Thanks
Alex

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 23:45           ` Mark Brown
@ 2016-09-13  3:14             ` Theodore Ts'o
  2016-09-13 10:14               ` Mark Brown
  2016-09-13 13:19               ` Levin, Alexander
  2016-09-13  6:19             ` Greg KH
  1 sibling, 2 replies; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-13  3:14 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, Greg KH, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:
> Part of the problem when the announcement was done after the release was
> that there was a lot of trying to read the runes and guess what'll
> happen and when going on behind the scenes, if people would like to
> follow the LTS but also have product deadlines they can end up walking a
> tightrope with their schedule and trying to predict LTS.  What you did
> with v4.4 (announcing during the -rc cycle) addresses this, as does what
> you've done this year pulling that even further forwards.  
> 
> If things don't work out with what you're doing with the preannouncement
> it might be good to comment on that and say you intend to do
> something different next year, but hopefully everything will be fine of
> course.

So Greg has already said that if people abuse the preannouncement by
trying to push obviously unready code into 4.9 to comply with
enterprise distributions' requirements that features have to be
upstream first (although obviously the distributions would accept "bug
fix" patches), he reserved the right to retroactively declare that 4.8
or 4.10 would be the LTS kernel.

So even this year, if people behave badly (which is the reason why the
announcement was done after the release -- people were trying to game
the system) it is not guaranteed that 4.9 will be the LTS kernel.

					- Ted

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 23:45           ` Mark Brown
  2016-09-13  3:14             ` Theodore Ts'o
@ 2016-09-13  6:19             ` Greg KH
  2016-09-13 10:38               ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Greg KH @ 2016-09-13  6:19 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:
> On Mon, Sep 12, 2016 at 07:14:50PM +0200, Greg KH wrote:
> > On Mon, Sep 12, 2016 at 05:27:14PM +0100, Mark Brown wrote:
> 
> > > for the users I talk to but there was always a very awkward period where
> > > people were looking for an announcement but no information to base
> > > things on.
> 
> > So, where should I do this "announcement" that will be seen by everyone?
> 
> I don't think it's a problem with location at the minute since you're
> fairly quick at posting on your G+ and blog (though I didn't spot
> anything on the stable list, it's possible I just can't see it due to
> the volume there though), that was more about the old pre v4.4 days.
> 
> > And remember, we are just trying out the "preannounce" thing this time,
> > let's see what happens, I'm not guaranteeing I will keep doing it...
> 
> Sure, and I think the main thing I'm saying here is that this lack of
> predictability in your future plans is something the people I talk to 
> mention and may be what Shibata-san was getting at also.
> 
> Part of the problem when the announcement was done after the release was
> that there was a lot of trying to read the runes and guess what'll
> happen and when going on behind the scenes, if people would like to
> follow the LTS but also have product deadlines they can end up walking a
> tightrope with their schedule and trying to predict LTS.  What you did
> with v4.4 (announcing during the -rc cycle) addresses this, as does what
> you've done this year pulling that even further forwards.  

4.4 was a total surprise to everyone, including me, it came out during
the kernel summit discussions, and we decided right there in the middle
of my talk to do it.  So there was no "preannouncement" possible there.

> If things don't work out with what you're doing with the preannouncement
> it might be good to comment on that and say you intend to do
> something different next year, but hopefully everything will be fine of
> course.

I did that here as well, 4.9 was announced _way_ in advance.

I really don't know what else I can do here to make it easier.
Companies can always talk to me, and lots do, in trying to figure out
what the next kernel will be.  I work with them to show that it doesn't
really matter _what_ kernel is picked, if their code is merged upstream,
or ready to be merged, the specific kernel number doesn't matter.

And that's the key here, and is what we are all trying to work toward.
The ability for companies to easily update their kernels, getting the
latest security and bug fixes, and not have huge outstanding patches
that are impossible to rebase.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-12 17:37                       ` Mark Brown
@ 2016-09-13  7:46                         ` NeilBrown
  2016-09-13 17:53                           ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: NeilBrown @ 2016-09-13  7:46 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss, Baolin.Wang, James Bottomley

On Mon, Sep 12 2016, Mark Brown wrote:

> On Sat, Sep 10, 2016 at 08:17:49AM +1000, NeilBrown wrote:
>> On Fri, Sep 09 2016, Mark Brown wrote:
>
> Baolin, I may be misremembering some of our discussions here - please
> jump in!
>
>> > It wasn't clear that the messiness wasn't just because nothing is taking
>> > a top level view of what's going on.
>
>> That is no excuse for leaving it there.  If you want to fix something,
>> take the top level view, tidy up the mess, and fix it.
>
>> The first step should be to do an audit of the current code.  See how
>> things are used, ask how they could be used more consistently and maybe
>> could be modified to meet your needs.  Part of this is making sure you
>
> We did that, bear in mind that this was started quite a while ago.  The
> main thing I was discussing with Baolin at the time was handling of the
> USB level negotiations rather than actual chargers.  

It would have been so helpful if the introduction to the patch series
had provided the results of that review:
  "Notification of negotiated USB current allowance is currently
  handled in a very ad hoc way.  Some drivers (listed here) use the usb
  notifier mechanism, others just ignore the information.  The usb
  notification mechanism is also used to notify the USB controller when
  the phy detects a cable being plugged in, though some phys send the
  notification with extcon.
  This is rather a mess, but instead of fixing this up, we have chosen
  to create a third, incompatible, notification mechanism because ...."

The background thinking behind a patch can be just as useful as the code
in the patch.

>
>> have a clear understanding of the need.  The current patchset doesn't show
>> that clear understanding at all.  This is particularly obvious in the way it
>> has incorrect defaults for the various charger types.
>
> Right.
>
>> With a little bit of fixing, extcon is perfectly placed to report
>> charger types to the power supply.  I think that should be fixed up and
>> used.  No extra framework needed, just fix what we have.
>
> But chargers aren't the world...

True, but they are the most problematic part of the proposed patch
series.

>
>> For communicating the current negotiated in the USB config I see two
>> options.  One is to fix up the current usb_register_notifier() framework
>> so that it is used consistently, and have it communicate the allowed
>> current level.
>> The other is to remove usb_register_notifier() completely and replace it
>> with something like the uchgr_nh in the current patchset.
>
> ...and then the power supply drivers need to also do this.  Since we
> know we're likely to have a PHY and a gadget working together it seems
> sensible to have a single bit of code that does the joining up here.

Sorry, I cannot make out what it is that power supply drivers will also
need to do.

>
>> I think usb_register_notifier() is used for two different things.
>> One is to report a USB or a USB-OTG connection to the usb driver, so it
>> can configure as 'host' or 'gadget'.
>> The other is to report the current limit.
>> Some drivers use extcon for the USB-OTG notification.
>> Deciding on, and standardizing, a single way to notify which sort of USB
>> data cable has been plugged, would be very valuable.
>> The charger-type and negotiated-current notification could use the same
>> mechanism, or could use a separate mechanism, or could use two different
>> mechanisms.
>> I have no strong opinion on what mechanism should be used for each
>> (well ... I do, but I'll probably have a different opinion tomorrow).
>> But I *do* strongly believe that each particular type of communication
>> should be done just one way, and part of creating a new "framework" is
>> to make sure that all current use-cases fit neatly into the framework.
>
> Right, so a certain part of the thinking here was to hide these
> decisions from the power supply drivers so they just see the combined
> result of charger type and negotiated current and we could just start
> off by reporting the negotiated current.
>
> Going back to the more general point that spawned this subthread this is
> all really useful feedback which could've been provided much earlier on
> - we've not done a great job of ensuring that this review happened.

This sounds like the topic that seems to come up every kernel-summit and
never really makes any progress:  How do we motivate people to provide
good quality independent reviews of code?
My perspective has always been that someone has to pay for it.  Good
review takes a serious amount of time, and unless they are being paid,
people are going to scratch their own itch, not someone else's.

To an extent, lwn.net does pay me to review kernel patches, and they
publish my reviews.  Nearly every comment I've made about the patches in
recent emails were in those published reviews.  Yet it seems I had to
make them all over again, then argue my case when it seemed that I
wasn't being heard.  So in this case at least, it could also be said
that we've not done a great job of listening to the reviews that did
happen.

NeilBrown

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13  3:14             ` Theodore Ts'o
@ 2016-09-13 10:14               ` Mark Brown
  2016-09-13 13:19               ` Levin, Alexander
  1 sibling, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-13 10:14 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Tsugikazu Shibata, Greg KH, ltsi-dev, ksummit-discuss

On Mon, Sep 12, 2016 at 11:14:37PM -0400, Theodore Ts'o wrote:
> On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:

> > If things don't work out with what you're doing with the preannouncement
> > it might be good to comment on that and say you intend to do
> > something different next year, but hopefully everything will be fine of
> > course.

> So Greg has already said that if people abuse the preannouncement by
> trying to push obviously unready code into 4.9 to comply with
> enterprise distributions' requirements that features have to be

Sure, and what I'm saying is that if that happens, or it otherwise
doesn't seem to have worked well, then talking about it and how that
affects the plans for next year would be good.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13  6:19             ` Greg KH
@ 2016-09-13 10:38               ` Mark Brown
  2016-09-13 12:09                 ` Greg KH
  2016-09-13 12:25                 ` Geert Uytterhoeven
  0 siblings, 2 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-13 10:38 UTC (permalink / raw)
  To: Greg KH; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
> On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:

> > tightrope with their schedule and trying to predict LTS.  What you did
> > with v4.4 (announcing during the -rc cycle) addresses this, as does what
> > you've done this year pulling that even further forwards.  

> 4.4 was a total surprise to everyone, including me, it came out during
> the kernel summit discussions, and we decided right there in the middle
> of my talk to do it.  So there was no "preannouncement" possible there.

It was still a preannouncement - you announced before v4.4 came out as
opposed to after.

> > If things don't work out with what you're doing with the preannouncement
> > it might be good to comment on that and say you intend to do
> > something different next year, but hopefully everything will be fine of
> > course.

> I did that here as well, 4.9 was announced _way_ in advance.

Sure.

> I really don't know what else I can do here to make it easier.
> Companies can always talk to me, and lots do, in trying to figure out

Some indication of your plans for handling this next year would be good
(probably after the next LTS has actually appeared and you can evaluate
how that's gone).  Moving things earlier is unlikely to ever be a
problem but if you decide to move things substantially later then that
might get people nervous.

> what the next kernel will be.  I work with them to show that it doesn't
> really matter _what_ kernel is picked, if their code is merged upstream,
> or ready to be merged, the specific kernel number doesn't matter.

In the cases I'm aware of it's more about knowing when the kernel will
appear so people can commit to integration activities than the version
number itself - I've never really heard "I need version X", it's always
been "when will we know which version Greg has chosen?".

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 10:38               ` Mark Brown
@ 2016-09-13 12:09                 ` Greg KH
  2016-09-13 12:20                   ` Josh Boyer
  2016-09-13 12:25                 ` Geert Uytterhoeven
  1 sibling, 1 reply; 122+ messages in thread
From: Greg KH @ 2016-09-13 12:09 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 11:38:14AM +0100, Mark Brown wrote:
> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
> > what the next kernel will be.  I work with them to show that it doesn't
> > really matter _what_ kernel is picked, if their code is merged upstream,
> > or ready to be merged, the specific kernel number doesn't matter.
> 
> In the cases I'm aware of it's more about knowing when the kernel will
> appear so people can commit to integration activities than the version
> number itself - I've never really heard "I need version X", it's always
> been "when will we know which version Greg has chosen?".

Yes, I hear that a lot, so you need to follow up with, "why does it
matter what version Greg picks?", and then their response to me always
is, "so we know what kernel to start to rebase our huge patchsets to
earlier", which again, is the thing we want to keep them from doing!

I got a few emails when I stopped 3.14.y this week along the lines of,
"oops, we were using that kernel, what are we supposed to do now!"  Each
time I asked if 4.4 or 4.8 worked for them.  And each time I got back a
response a day later along the lines of, "oh wow, yes, it does, we'll
use that."

I have half a mind to just skip an LTS kernel for a whole year and see if
anyone even notices; I feel it is being used for all the wrong
reasons...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 12:09                 ` Greg KH
@ 2016-09-13 12:20                   ` Josh Boyer
  2016-09-13 13:12                     ` Greg KH
  0 siblings, 1 reply; 122+ messages in thread
From: Josh Boyer @ 2016-09-13 12:20 UTC (permalink / raw)
  To: Greg KH; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 8:09 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Tue, Sep 13, 2016 at 11:38:14AM +0100, Mark Brown wrote:
>> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
>> > what the next kernel will be.  I work with them to show that it doesn't
>> > really matter _what_ kernel is picked, if their code is merged upstream,
>> > or ready to be merged, the specific kernel number doesn't matter.
>>
>> In the cases I'm aware of it's more about knowing when the kernel will
>> appear so people can commit to integration activities than the version
>> number itself - I've never really heard "I need version X", it's always
>> been "when will we know which version Greg has chosen?".
>
> Yes, I hear that a lot, so you need to follow up with, "why does it
> matter what version Greg picks?", and then their response to me always
> is, "so we know what kernel to start to rebase our huge patchsets to
> earlier", which again, is the thing we want to keep them from doing!
>
> I got a few emails when I stopped 3.14.y this week along the lines of,
> "oops, we were using that kernel, what are we supposed to do now!"  Each
> time I asked if 4.4 or 4.8 worked for them.  And each time I got back a
> response a day later along the lines of, "oh wow, yes, it does, we'll
> use that."
>
> I have half-a-mind to just skip a LTS kernel for a whole year and see if
> anyone even notices, I feel it is being used for all the wrong
> reasons...

What are the right reasons?  I've always found the LTS kernels to be a
weird thing.  They don't serve the same purpose the normal stable
kernels do for users (fixes after initial release, before the next
release).  They don't serve the same purpose as a vendor kernel
(hardened, stable, tuned, supported).  Developers tend to abuse them
as you've described.  So I guess I've forgotten the original intention
of the LTS kernels to begin with.  Could you remind me (us)?

josh

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 10:38               ` Mark Brown
  2016-09-13 12:09                 ` Greg KH
@ 2016-09-13 12:25                 ` Geert Uytterhoeven
  2016-09-13 19:21                   ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Geert Uytterhoeven @ 2016-09-13 12:25 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, Greg KH, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 12:38 PM, Mark Brown <broonie@kernel.org> wrote:
> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
>> On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:
>
>> > tightrope with their schedule and trying to predict LTS.  What you did
>> > with v4.4 (announcing during the -rc cycle) addresses this, as does what
>> > you've done this year pulling that even further forwards.
>
>> 4.4 was a total surprise to everyone, including me, it came out during
>> the kernel summit discussions, and we decided right there in the middle
>> of my talk to do it.  So there was no "preannouncement" possible there.
>
> It was still a preannouncement - you announced before v4.4 came out as
> opposed to after.

Yes, but still long after the closing of the v4.4 merge window, making it
almost impossible to sneak in new features.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 12:20                   ` Josh Boyer
@ 2016-09-13 13:12                     ` Greg KH
  2016-09-13 16:23                       ` Bird, Timothy
                                         ` (3 more replies)
  0 siblings, 4 replies; 122+ messages in thread
From: Greg KH @ 2016-09-13 13:12 UTC (permalink / raw)
  To: Josh Boyer; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 08:20:59AM -0400, Josh Boyer wrote:
> On Tue, Sep 13, 2016 at 8:09 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Tue, Sep 13, 2016 at 11:38:14AM +0100, Mark Brown wrote:
> >> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
> >> > what the next kernel will be.  I work with them to show that it doesn't
> >> > really matter _what_ kernel is picked, if their code is merged upstream,
> >> > or ready to be merged, the specific kernel number doesn't matter.
> >>
> >> In the cases I'm aware of it's more about knowing when the kernel will
> >> appear so people can commit to integration activities than the version
> >> number itself - I've never really heard "I need version X", it's always
> >> been "when will we know which version Greg has chosen?".
> >
> > Yes, I hear that a lot, so you need to follow up with, "why does it
> > matter what version Greg picks?", and then their response to me always
> > is, "so we know what kernel to start to rebase our huge patchsets to
> > earlier", which again, is the thing we want to keep them from doing!
> >
> > I got a few emails when I stopped 3.14.y this week along the lines of,
> > "oops, we were using that kernel, what are we supposed to do now!"  Each
> > time I asked if 4.4 or 4.8 worked for them.  And each time I got back a
> > response a day later along the lines of, "oh wow, yes, it does, we'll
> > use that."
> >
> > I have half-a-mind to just skip a LTS kernel for a whole year and see if
> > anyone even notices, I feel it is being used for all the wrong
> > reasons...
> 
> What are the right reasons?  I've always found the LTS kernels to be a
> weird thing.  They don't serve the same purpose the normal stable
> kernels do for users (fixes after initial release, before the next
> release).  They don't serve the same purpose as a vendor kernel
> (hardened, stable, tuned, supported).  Developers tend to abuse them
> as you've described.  So I guess I've forgotten the original intention
> of the LTS kernels to begin with.  Could you remind me (us)?

Originally it was to make my "day job" of maintaining an enterprise
kernel (SLES) easier (didn't have to have a bunch of bugfixes listed in
the spec file, took advantage of more testers, got to do stable kernel
work on company time, etc.)

But then, others used it for their "products", especially the embedded
space, as they wanted to support a single kernel for the lifetime of the
product.  So they asked for a kernel a year for this.

But, it turned out that they would only use the kernel series for a
while during the development phase, and then stop after they "shipped"
the device.  Look at all of the Android phones sitting on old obsolete
versions of 3.4 and 3.10 stable kernels.  They aren't even updated to
newer ones, and so, it didn't really help all that much.  Even though I
am fixing security bugs for these kernels, no one pushes them to the
users.  I have an example of a security bug that a Google researcher
found in a 3.10 kernel (but not mainline): I fixed it and pushed out an
update, but the fix didn't get picked up in Nexus phones until 6 months
later, when I found the right person/group to poke within Google.

That was a 6 month window where anyone could have gotten root on your
phone, easily.

People say "look, we are using an LTS kernel in our product, all must be
good!" but if they don't update it, it's broken and insecure, and really
no better than if they were using 3.10.0 in a way.

But if we didn't provide an LTS, would companies constantly update their
kernels to newer releases to keep up with the security and bugfixes?
That goes against everything those managers/PMs have ever been used to
in the past, yet it's actually the best thing they could do.  It's a
long road of education and doing work on their part to get test
frameworks set up to be able to qualify "larger" upgrades.  It also
requires that their chip vendor not add 1.5 million lines of code to
their kernel, rewrite the scheduler, and duplicate all existing drivers
with a "-2" suffix.

Ok, this is rambling, and something I've been mulling over for a while
now.  I'm working with some people at some of the chip companies to see
about how we can do this better, hopefully the work of education,
testing, and other assurances can help everyone out in the end, and
start to resolve these issues, but it's going to be slow going.

Yes, I'll have an LTS this year, but next year?  Maybe not, my dream
would be that it wouldn't be needed.  And one has to dream :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13  3:14             ` Theodore Ts'o
  2016-09-13 10:14               ` Mark Brown
@ 2016-09-13 13:19               ` Levin, Alexander
  1 sibling, 0 replies; 122+ messages in thread
From: Levin, Alexander @ 2016-09-13 13:19 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Tsugikazu Shibata, Greg KH, ltsi-dev, ksummit-discuss

On Mon, Sep 12, 2016 at 11:14:37PM -0400, Theodore Ts'o wrote:
> On Tue, Sep 13, 2016 at 12:45:48AM +0100, Mark Brown wrote:
> > Part of the problem when the announcement was done after the release was
> > that there was a lot of trying to read the runes and guess what'll
> > happen and when going on behind the scenes, if people would like to
> > follow the LTS but also have product deadlines they can end up walking a
> > tightrope with their schedule and trying to predict LTS.  What you did
> > with v4.4 (announcing during the -rc cycle) addresses this, as does what
> > you've done this year pulling that even further forwards.  
> > 
> > If things don't work out with what you're doing with the preannouncement
> > it might be good to comment on that and say you intend to do
> > something different next year, but hopefully everything will be fine of
> > course.
> 
> So Greg has already said that if people abuse the preannouncement by
> trying to push obviously unready code into 4.9 to comply with
> enterprise distributions requirements that features have to be
> upstream first (although obviously the distributions would accept "bug
> fix" patches), he reserved the right to retroactively declare that 4.8
> or 4.10 would be the LTS kernel.
> 
> So even this year, if people behave badly (which is the reason why the
> announcement was done after the release -- people were trying to game
> the system) it is not guaranteed that 4.9 will be the LTS kernel.

If the problem is that people tried to push half-baked features, maybe we should be addressing the lack of trust in those people rather than trying to dance around it.

If we can't trust maintainers to look out for the good of the kernel rather than for the corporation they work for, maybe they should be going through another layer of filtering before their stuff gets to Linus?

-- 

Thanks,
Sasha

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 13:12                     ` Greg KH
@ 2016-09-13 16:23                       ` Bird, Timothy
  2016-09-13 19:02                       ` Mark Brown
                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 122+ messages in thread
From: Bird, Timothy @ 2016-09-13 16:23 UTC (permalink / raw)
  To: Greg KH, Josh Boyer; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

> -----Original Message-----
> From: ksummit-discuss-bounces@lists.linuxfoundation.org [mailto:ksummit-
> discuss-bounces@lists.linuxfoundation.org] On Behalf Of Greg KH
> Sent: Tuesday, September 13, 2016 6:13 AM
> To: Josh Boyer <jwboyer@fedoraproject.org>
> Cc: Tsugikazu Shibata <tshibata@ab.jp.nec.com>; ltsi-
> dev@lists.linuxfoundation.org; ksummit-discuss@lists.linuxfoundation.org
> Subject: Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting
> collaboration
> 
> On Tue, Sep 13, 2016 at 08:20:59AM -0400, Josh Boyer wrote:
> > On Tue, Sep 13, 2016 at 8:09 AM, Greg KH <gregkh@linuxfoundation.org>
> wrote:
> > > On Tue, Sep 13, 2016 at 11:38:14AM +0100, Mark Brown wrote:
> > >> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
> > >> > what the next kernel will be.  I work with them to show that it doesn't
> > >> > really matter _what_ kernel is picked, if their code is merged upstream,
> > >> > or ready to be merged, the specific kernel number doesn't matter.
> > >>
> > >> In the cases I'm aware of it's more about knowing when the kernel will
> > >> appear so people can commit to integration activities than the version
> > >> number itself - I've never really heard "I need version X", it's always
> > >> been "when will we know which version Greg has chosen?".
> > >
> > > Yes, I hear that a lot, so you need to follow up with, "why does it
> > > matter what version Greg picks?", and then their response to me always
> > > is, "so we know what kernel to start to rebase our huge patchsets to
> > > earlier", which again, is the thing we want to keep them from doing!
> > >
> > > I got a few emails when I stopped 3.14.y this week along the lines of,
> > > "oops, we were using that kernel, what are we supposed to do now!"  Each
> > > time I asked if 4.4 or 4.8 worked for them.  And each time I got back a
> > > response a day later along the lines of, "oh wow, yes, it does, we'll
> > > use that."
> > >
> > > I have half-a-mind to just skip a LTS kernel for a whole year and see if
> > > anyone even notices, I feel it is being used for all the wrong
> > > reasons...
> >
> > What are the right reasons?  I've always found the LTS kernels to be a
> > weird thing.  They don't serve the same purpose the normal stable
> > kernels do for users (fixes after initial release, before the next
> > release).  They don't serve the same purpose as a vendor kernel
> > (hardened, stable, tuned, supported).  Developers tend to abuse them
> > as you've described.  So I guess I've forgotten the original intention
> > of the LTS kernels to begin with.  Could you remind me (us)?

I can give my perspective.  The CE Workgroup in the Linux Foundation
has been trying to reduce fragmentation in the embedded space for many
years.  (You can argue about how successful we've been :-)

One thing that plagued the embedded industry (particularly product vendors,
like Sony), was receiving vendor kernels on a number of different kernel
versions, for all the different vendors we worked with.  We had
patches out of tree (some things were external,
like LTTng, or the Linux-tiny patch set), and some things were internal
(like 4k stacks, or our own crash handling/debugging code).  Integrating
this into multiple kernel versions was a huge pain.  You might ask,
"why not just send these upstream".  Well, for some of these there were
multi-year efforts to do so, resulting in failure (or only partial success). 
Others, we were told pretty quickly would never be accepted upstream.
But we still wanted them in our kernels.  So we resigned ourselves to
managing these out-of-tree, indefinitely.

And then you have the backported features issue that's been discussed
on this thread.  As long as SoC vendors are shipping old kernels, product
companies are going to want some amount of backported features.

LTS is a way to get vendors synced up so we see less variation at the 
product-company level in the kernel versions we see, which is actually
a big help.  It's taken us years to educate the vendors that they should be
using the same kernel version as everyone else (within a particular timeframe),
so it would be a step backwards if LTS went away.

Now, from a community standpoint, I don't think any of this is very visible.
In the Android space, whatever Google picks becomes the rallying point
for a particular generation of phones.  But Google is constrained by what
they can reasonably get agreement on from their partners (e.g. Samsung,
MediaTek and of course Qualcomm).  And the patch mountains that each
of those companies has end up being a hurdle to moving to more recent
kernel versions.  (It's a bit of a vicious cycle, that honestly is very difficult to
break.)  But ultimately, I think it's helpful for everyone involved
that one kernel version gets chosen.  And like others, I don't believe it really
matters which one.

LTS lets us rally non-Android and Android-based products around consistent
kernel versions.

Maybe this does lessen the pain of maintaining out-of-tree patches, which
means there's less incentive to upstream stuff, but there's
a certain amount of patches where embedded people have been told to
just manage them out of tree.
 -- Tim

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13  7:46                         ` NeilBrown
@ 2016-09-13 17:53                           ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-13 17:53 UTC (permalink / raw)
  To: NeilBrown; +Cc: ltsi-dev, ksummit-discuss, Baolin.Wang, James Bottomley

[-- Attachment #1: Type: text/plain, Size: 3489 bytes --]

On Tue, Sep 13, 2016 at 09:46:48AM +0200, NeilBrown wrote:
> On Mon, Sep 12 2016, Mark Brown wrote:
> > On Sat, Sep 10, 2016 at 08:17:49AM +1000, NeilBrown wrote:

> >> For communicating the current negotiated in the USB config I see two
> >> options.  One is to fix up the current usb_register_notifier() framework
> >> so that it is used consistently, and have it communicate the allowed
> >> current level.
> >> The other is to remove usb_register_notifier() completely and replace it
> >> with something like the uchgr_nh in the current patchset.

> > ...and then the power supply drivers need to also do this.  Since we
> > know we're likely to have a PHY and a gadget working together it seems
> > sensible to have a single bit of code that does the joining up here.

> Sorry, I cannot make out what it is that power supply drivers will also
> need to do.

Handle this separate set of notifications to the extcon ones.

> > Going back to the more general point that spawned this subthread this is
> > all really useful feedback which could've been provided much earlier on
> > - we've not done a great job of ensuring that this review happened.

> This sounds like the topic that seems to come up every kernel-summit and
> never really makes any progress:  How do we motivate people to provide
> good quality independent reviews of code?

A bit, yes.

> My perspective has always been that someone has to pay for it.  Good
> review takes a serious amount of time, and unless they are being paid,
> people are going to scratch their own itch, not someone else's.

A bunch of companies do contribute review, of course, but whoever's
paying there's always going to be particular itches to be scratched that
don't always line up with submitters.

> To an extent, lwn.net does pay me to review kernel patches, and they
> publish my reviews.  Nearly every comment I've made about the patches in
> recent emails were in those published reviews.  Yet it seems I had to
> make them all over again, then argue my case when it seemed that I
> wasn't being heard.  So in this case at least, it could also be said
> that we've not done a great job of listening to the reviews that did
> happen.

If people are missing things from reviews then we need to call them on
it - this can be a genuine oversight and just as we ask people to remind
maintainers when they miss things we should do the same for submitters.
Of course some people do just ignore repeated reminders which needs to
be handled differently but we should be able to keep things moving in
more normal cases.

At a very high level you can think about there being two axes for review
- there's speed and there's quality.  Clearly few are going to have a
problem with prompt and high quality reviews (though if they're *too*
fast that does make it difficult for new reviewers to get involved).
Equally clearly slow and low quality reviews are unhelpful so we want to
avoid those - we want to be somewhere on the spectrum of fast but low
quality or high quality but slow instead.  Low quality reviews don't
need to be a problem, it can mean saying things like "can you explain
this, I'm not entirely sure" or doing a first pass review that misses
some details if things need resubmitting for high level issues since
that gives the submitter something to work with.  Exactly where we end
on this spectrum is going to depend on the situation of course but we
really do want to avoid the situation where we're being slow and
unresponsive.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 13:12                     ` Greg KH
  2016-09-13 16:23                       ` Bird, Timothy
@ 2016-09-13 19:02                       ` Mark Brown
  2016-09-14 14:47                       ` Alex Shi
  2016-09-20  5:15                       ` Tsugikazu Shibata
  3 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-13 19:02 UTC (permalink / raw)
  To: Greg KH; +Cc: Tsugikazu Shibata, ltsi-dev, Josh Boyer, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 2209 bytes --]

On Tue, Sep 13, 2016 at 03:12:49PM +0200, Greg KH wrote:

> People say "look, we are using an LTS kernel in our product, all must be
> good!" but if they don't update it, it's broken and insecure, and really
> no better than if they were using 3.10.0 in a way.

Do they actually say that?  I can't recall that being a selling point
for any of the devices I've bought...  For Android Google are now
talking about delivering security updates and advertising their
frequency a lot more but I don't recall LTS being part of that sell.

> But if we didn't provide an LTS, would companies constantly update their
> kernels to newer releases to keep up with the security and bugfixes?
> That goes against everything those managers/PMs have ever been used to
> in the past, yet it's actually the best thing they could do.  It's a
> long road of education and doing work on their part to get test
> frameworks set up to be able to qualify "larger" upgrades.  It also
> requires that their chip vendor not add 1.5 million lines of code to
> their kernel, rewrite the scheduler, and duplicate all existing drivers
> with a "-2" suffix.

I'm not sure I'd go so far as saying everyone should be tracking
mainline in production - the enterprise distros and their users haven't
been persuaded yet either.  It's definitely a useful goal to get people
doing that though, especially for longer lived devices it just makes so
much more sense.  In the Android world we already see some vendors
shipping an entirely new userspace version which ought to be at least as
risky as a new kernel, though there's also going to be rather more
direct user demand pushing it.

> Ok, this is rambling, and something I've been mulling over for a while
> now.  I'm working with some people at some of the chip companies to see
> about how we can do this better, hopefully the work of education,
> testing, and other assurances can help everyone out in the end, and
> start to resolve these issues, but it's going to be slow going.

The issues around deploying updates into the field are at least as much
if not more of an issue at the system integrator level and potentially
more directly marketable for them - are you talking to them as well?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 12:25                 ` Geert Uytterhoeven
@ 2016-09-13 19:21                   ` Mark Brown
  2016-09-14  1:49                     ` Greg KH
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-13 19:21 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Tsugikazu Shibata, Greg KH, ltsi-dev, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

On Tue, Sep 13, 2016 at 02:25:10PM +0200, Geert Uytterhoeven wrote:
> On Tue, Sep 13, 2016 at 12:38 PM, Mark Brown <broonie@kernel.org> wrote:
> > On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:

> >> 4.4 was a total surprise to everyone, including me, it came out during
> >> the kernel summit discussions, and we decided right there in the middle
> >> of my talk to do it.  So there was no "preannouncement" possible there.

> > It was still a preannouncement - you announced before v4.4 came out as
> > opposed to after.

> Yes, but still long after the closing of the v4.4 merge window, making it
> almost impossible to sneak in new features.

That's true but it's not something that makes a difference from the point
of view of the people I've seen caring most about getting advance notice
- for them just having some notice before the actual release makes a big
difference.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 19:21                   ` Mark Brown
@ 2016-09-14  1:49                     ` Greg KH
  2016-09-14  3:00                       ` Guenter Roeck
  0 siblings, 1 reply; 122+ messages in thread
From: Greg KH @ 2016-09-14  1:49 UTC (permalink / raw)
  To: Mark Brown; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On Tue, Sep 13, 2016 at 08:21:53PM +0100, Mark Brown wrote:
> On Tue, Sep 13, 2016 at 02:25:10PM +0200, Geert Uytterhoeven wrote:
> > On Tue, Sep 13, 2016 at 12:38 PM, Mark Brown <broonie@kernel.org> wrote:
> > > On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
> 
> > >> 4.4 was a total surprise to everyone, including me, it came out during
> > >> the kernel summit discussions, and we decided right there in the middle
> > >> of my talk to do it.  So there was no "preannouncement" possible there.
> 
> > > It was still a preannouncement - you announced before v4.4 came out as
> > > opposed to after.
> 
> > Yes, but still long after the closing of the v4.4 merge window, making it
> > almost impossible to sneak in new features.
> 
> That's true but it's not something that makes a difference from the point
> of view of the people I've seen caring most about getting advance notice
> - for them just having some notice before the actual release makes a big
> difference.

Why?  What do they want to do with this notice?

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-14  1:49                     ` Greg KH
@ 2016-09-14  3:00                       ` Guenter Roeck
  0 siblings, 0 replies; 122+ messages in thread
From: Guenter Roeck @ 2016-09-14  3:00 UTC (permalink / raw)
  To: Greg KH, Mark Brown; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss

On 09/13/2016 06:49 PM, Greg KH wrote:
> On Tue, Sep 13, 2016 at 08:21:53PM +0100, Mark Brown wrote:
>> On Tue, Sep 13, 2016 at 02:25:10PM +0200, Geert Uytterhoeven wrote:
>>> On Tue, Sep 13, 2016 at 12:38 PM, Mark Brown <broonie@kernel.org> wrote:
>>>> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
>>
>>>>> 4.4 was a total surprise to everyone, including me, it came out during
>>>>> the kernel summit discussions, and we decided right there in the middle
>>>>> of my talk to do it.  So there was no "preannouncement" possible there.
>>
>>>> It was still a preannouncement - you announced before v4.4 came out as
>>>> opposed to after.
>>
>>> Yes, but still long after the closing of the v4.4 merge window, making it
>>> almost impossible to sneak in new features.
>>
>> That's true but it's not something that makes a difference from the point
>> of view of the people I've seen caring most about getting advance notice
>> - for them just having some notice before the actual release makes a big
>> difference.
>
> Why?  What do they want to do with this notice?
>

Planning. Those two months (or more) do make a substantial difference for
companies with a policy to use the next LTS once it is available. From
the engineering side, the first rebase attempts can start early, meaning
engineering can get an early idea how difficult the ultimate task is going
to be, and can prepare management accordingly. Also there is more time
to prepare sales and marketing (and hesitant engineering managers ;-)
for the new release.

Guenter

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 13:12                     ` Greg KH
  2016-09-13 16:23                       ` Bird, Timothy
  2016-09-13 19:02                       ` Mark Brown
@ 2016-09-14 14:47                       ` Alex Shi
  2016-09-20  5:15                       ` Tsugikazu Shibata
  3 siblings, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-14 14:47 UTC (permalink / raw)
  To: Greg KH, Josh Boyer; +Cc: Tsugikazu Shibata, ltsi-dev, ksummit-discuss



On 09/13/2016 09:12 PM, Greg KH wrote:
> People say "look, we are using an LTS kernel in our product, all must be
> good!" but if they don't update it, it's broken and insecure, and really
> no better than if they were using 3.10.0 in a way.
> 
> But if we didn't provide an LTS, would companies constantly update their
> kernels to newer releases to keep up with the security and bugfixes?
> That goes against everything those managers/PMs have ever been used to
> in the past, yet it's actually the best thing they could do.  It's a
> long road of education and doing work on their part to get test
> frameworks set up to be able to qualify "larger" upgrades.  It also
> requires that their chip vendor not add 1.5 million lines of code to
> their kernel, rewrite the scheduler, and duplicate all existing drivers
> with a "-2" suffix.
> 
> Ok, this is rambling, and something I've been mulling over for a while
> now.  I'm working with some people at some of the chip companies to see
> about how we can do this better, hopefully the work of education,
> testing, and other assurances can help everyone out in the end, and
> start to resolve these issues, but it's going to be slow going.

Yes, SoC/product vendors are usually under huge 'time to market'
pressure, and they do not have enough strong resources to follow an
upstream-first policy, though they also know that it is better for
themselves in the long term.

So collaborations like LTSI and LSK would relieve these guys, save
duplicated work in this industry, and give them more time to work
upstream for long-term purposes. That is happening among Linaro members.

> 
> Yes, I'll have an LTS this year, but next year?  Maybe not, my dream
> would be that it wouldn't be needed.  And one has to dream :)
> 

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-13 13:12                     ` Greg KH
                                         ` (2 preceding siblings ...)
  2016-09-14 14:47                       ` Alex Shi
@ 2016-09-20  5:15                       ` Tsugikazu Shibata
  2016-09-21  8:46                         ` Alex Shi
  3 siblings, 1 reply; 122+ messages in thread
From: Tsugikazu Shibata @ 2016-09-20  5:15 UTC (permalink / raw)
  To: Greg KH, Josh Boyer; +Cc: ltsi-dev, Tsugikazu Shibata, ksummit-discuss


On Tue, Sep 13, 2016 at 10:13PM, Greg KH wrote:
>On Tue, Sep 13, 2016 at 08:20:59AM -0400, Josh Boyer wrote:
>> On Tue, Sep 13, 2016 at 8:09 AM, Greg KH <gregkh@linuxfoundation.org> wrote:
>> > On Tue, Sep 13, 2016 at 11:38:14AM +0100, Mark Brown wrote:
>> >> On Tue, Sep 13, 2016 at 08:19:31AM +0200, Greg KH wrote:
>> >> > what the next kernel will be.  I work with them to show that it
>> >> > doesn't really matter _what_ kernel is picked, if their code is
>> >> > merged upstream, or ready to be merged, the specific kernel number doesn't
>> >> > matter.
>> >>
>> >> In the cases I'm aware of it's more about knowing when the kernel
>> >> will appear so people can commit to integration activities than the
>> >> version number itself - I've never really heard "I need version X",
>> >> it's always been "when will we know which version Greg has chosen?".
>> >
>> > Yes, I hear that a lot, so you need to follow up with, "why does it
>> > matter what version Greg picks?", and then their response to me
>> > always is, "so we know what kernel to start to rebase our huge
>> > patchsets to earlier", which again, is the thing we want to keep them from doing!
>> >
>> > I got a few emails when I stopped 3.14.y this week along the lines
>> > of, "oops, we were using that kernel, what are we supposed to do
>> > now!"  Each time I asked if 4.4 or 4.8 worked for them.  And each
>> > time I got back a response a day later along the lines of, "oh wow,
>> > yes, it does, we'll use that."
>> >
>> > I have half-a-mind to just skip a LTS kernel for a whole year and
>> > see if anyone even notices, I feel it is being used for all the
>> > wrong reasons...
>>
>> What are the right reasons?  I've always found the LTS kernels to be a
>> weird thing.  They don't serve the same purpose the normal stable
>> kernels do for users (fixes after initial release, before the next
>> release).  They don't serve the same purpose as a vendor kernel
>> (hardened, stable, tuned, supported).  Developers tend to abuse them
>> as you've described.  So I guess I've forgotten the original intention
>> of the LTS kernels to begin with.  Could you remind me (us)?
>
>Originally it was to make my "day job" of maintaining an enterprise kernel (SLES)
>easier (didn't have to have a bunch of bugfixes listed in the spec file, took advantage
>of more testers, got to do stable kernel work on company time, etc.)
>
>But then, others used it for their "products", especially the embedded space, as they
>wanted to support a single kernel for the lifetime of the product.  So they asked for
>a kernel a year for this.
>
>But, it turned out that they would only use the kernel series for a while during the
>development phase, and then stop after they "shipped"
>the device.  Look at all of the Android phones sitting on old obsolete versions of 3.4
>and 3.10 stable kernels.  They aren't even updated to newer ones, and so, it didn't
>really help all that much.  Even though I am fixing security bugs for these kernels, no
>one pushes them to the users.  I have an example of a security bug that a Google
>researcher found in a 3.10 kernel (but not mainline) I fixed and pushed out an update,
>but never got picked up in Nexus phones until 6 months later when I found the right
>person/group to poke within Google.
>
>That was a 6 month window where anyone could have gotten root on your phone,
>easily.
>
>People say "look, we are using an LTS kernel in our product, all must be good!" but if
>they don't update it, it's broken and insecure, and really no better than if they were
>using 3.10.0 in a way.
>
>But if we didn't provide an LTS, would companies constantly update their kernels to
>newer releases to keep up with the security and bugfixes?
>That goes against everything those managers/PMs have ever been used to in the past,
>yet it's actually the best thing they could do.  It's a long road of education and doing
>work on their part to get test frameworks set up to be able to qualify "larger" upgrades.
>It also requires that their chip vendor not add 1.5 million lines of code to their kernel,
>rewrite the scheduler, and duplicate all existing drivers with a "-2" suffix.
>
>Ok, this is rambling, and something I've been mulling over for a while now.  I'm
>working with some people at some of the chip companies to see about how we can do
>this better, hopefully the work of education, testing, and other assurances can help
>everyone out in the end, and start to resolve these issues, but it's going to be slow
>going.
>
>Yes, I'll have an LTS this year, but next year?  Maybe not, my dream would be that it
>wouldn't be needed.  And one has to dream :)

In this thread, we on the LTSI team could recognize that there is still a
huge gap between the community and industry (especially the parts of
industry that have no enterprise distribution).
Some years ago, I think enterprise distros got a number of requests from
companies to include specific in-house patches, but the distros never
accepted them because of their upstream-first policy, and as a result
companies started sending patches upstream.
Since then, things seem to have been getting better.
OTOH, in industries that have no such distros, companies have to build
their own distribution by themselves, but they do not have enough
resources to maintain the kernel long term. So LTS is the only choice
for such industries.
In that case, companies should understand that LTS is community work, so
they should not only consume LTS but also work with upstream.
From the discussion here, there are large outstanding patch sets waiting
for the next LTS without considering upstream. Also, others are trying
to push unready patches for the next LTS candidates. These are examples
of the gap.
We at LTSI have been trying to do some activities to reduce the gap, but
it still seems not to be enough.
As a next step, I think industry should discuss this problem to find a
better way; otherwise we will lose our only choice. We will try to set up
an opportunity for this discussion at the coming Embedded Linux
Conference Europe. I hope the relevant people will join there and
exchange their opinions.

Thanks,
Tsugikazu Shibata

>thanks,
>
>greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-05  9:28         ` Laurent Pinchart
@ 2016-09-21  6:58           ` Alex Shi
  2016-09-21  9:23             ` gregkh
  2016-09-21 13:56             ` Theodore Ts'o
  0 siblings, 2 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-21  6:58 UTC (permalink / raw)
  To: Laurent Pinchart, ksummit-discuss; +Cc: ltsi-dev, gregkh

On 09/05/2016 05:28 PM, Laurent Pinchart wrote:
>>> Same as I said before, the risk LSK introduces, IMO, is much greater than
>>> rebasing and out-of-tree driver stack.

During the 3 years of LSK work, I did get a few bug reports on LSK from
users. But they were bugs in the common LTS. None of them were in the
backported part.

>> > 
>> > I'm afraid you're very much mistaken if you believe that people are only
>> > working on leaf drivers, or that nothing we do upstream has a meaningful
>> > impact at the system level.
> To provide a real-life example, we recently ran into a scheduler issue in a 
> project I'm working on. The device is a phone running a Qualcomm kernel, and 
> the scheduler is so hacked by the vendor to cover the phone use cases that 
> creating a spinning high priority SCHED_FIFO thread in userspace kills the 
> system instantly. That's the kind of crap vendors tend to ship, and moving to 
> a newer kernel version pretty much means they have to revalidate all the 
> scheduler-related use cases (and add more awful hacks to "fix issues 
> introduced in mainline").

I am not a fan of some scheduler solutions. But focusing on this cannot
explain why many distributions are using an 'old' stable kernel. Looking
at the product world, can you find real products that are using an
'upstream' kernel?

'Upstream first' is good for feature development, but isn't good for
products. Many product guys have told me that non-upstream porting did
not cost much and was not the reason to pin to a stable kernel. All of
them said that testing and stability were the most costly part: not only
the regular test cases and benchmarks, but also the long-term use needed
to find tricky/corner-case bugs in the whole system.

I doubt that the 'keep rebasing on upstream' guys have really worked on
a product.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-20  5:15                       ` Tsugikazu Shibata
@ 2016-09-21  8:46                         ` Alex Shi
  0 siblings, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-21  8:46 UTC (permalink / raw)
  To: Tsugikazu Shibata, Greg KH, Josh Boyer; +Cc: ltsi-dev, ksummit-discuss



On 09/20/2016 01:15 PM, Tsugikazu Shibata wrote:
>> Yes, I'll have an LTS this year, but next year?  Maybe not, my dream would be that it
>> wouldn't be needed.  And one has to dream :)
> In this thread, we on the LTSI team could recognize that there is still a
> huge gap between the community and industry (especially the parts of
> industry that have no enterprise distribution).
> Some years ago, I think enterprise distros got a number of requests from
> companies to include specific in-house patches, but the distros never
> accepted them because of their upstream-first policy, and as a result
> companies started sending patches upstream.
> Since then, things seem to have been getting better.
> OTOH, in industries that have no such distros, companies have to build
> their own distribution by themselves, but they do not have enough
> resources to maintain the kernel long term. So LTS is the only choice
> for such industries.
> In that case, companies should understand that LTS is community work, so
> they should not only consume LTS but also work with upstream.
> From the discussion here, there are large outstanding patch sets waiting
> for the next LTS without considering upstream. Also, others are trying
> to push unready patches for the next LTS candidates. These are examples
> of the gap.
> We at LTSI have been trying to do some activities to reduce the gap, but
> it still seems not to be enough.
> As a next step, I think industry should discuss this problem to find a
> better way; otherwise we will lose our only choice. We will try to set up
> an opportunity for this discussion at the coming Embedded Linux
> Conference Europe. I hope the relevant people will join there and
> exchange their opinions.

I will be at ELC EU, and would like to share some comments on this
issue with you.

But as to the value of LTS existing, no doubt industry guys like it,
even if it is just a symbol as a shared target to work on. Industry
guys do what they do now because of the balance between product
pressure and long-term maintenance pressure, not because they are blind
to or untrained in the benefits of upstream. Maybe there should be a
better way to help them out....

Regards
Alex

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21  6:58           ` Alex Shi
@ 2016-09-21  9:23             ` gregkh
  2016-09-21 14:52               ` Alex Shi
  2016-09-21 18:22               ` Mark Brown
  2016-09-21 13:56             ` Theodore Ts'o
  1 sibling, 2 replies; 122+ messages in thread
From: gregkh @ 2016-09-21  9:23 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, ksummit-discuss

On Wed, Sep 21, 2016 at 02:58:22PM +0800, Alex Shi wrote:
> 
> 
> On 09/05/2016 05:28 PM, Laurent Pinchart wrote:
> >>> Same as I said before, the risk LSK introduces, IMO, is much greater than
> >>> > > rebasing and out-of-tree driver stack.
> 
> During the 3 years LSK work, I did get few bug report on LSK by users.
> But they are some track bugs in common LTS. None of them in backporting
> part.
> 
> >> > 
> >> > I'm afraid you're very much mistaken if you believe that people are only
> >> > working on leaf drivers, or that nothing we do upstream has a meaningful
> >> > impact at the system level.
> > To provide a real-life example, we recently ran into a scheduler issue in a 
> > project I'm working on. The device is a phone running a Qualcomm kernel, and 
> > the scheduler is so hacked by the vendor to cover the phone use cases that 
> > creating a spinning high priority SCHED_FIFO thread in userspace kills the 
> > system instantly. That's the kind of crap vendors tend to ship, and moving to 
> > a newer kernel version pretty much means they have to revalidate all the 
> > scheduler-related use cases (and add more awful hacks to "fix issues 
> > introduced in mainline").
> 
> I am not a fan of some scheduler solution. But focus on this can not
> explain why many distributions are using 'old' stable kernel. Looking
> into product world, could you find some real product are using
> 'upstream' kernel?
> 
> 'upstream first' is good for feature development, but isn't good for
> product.

Not true, IBM and Intel have shown that it saves you time and money to
do "upstream first", why do people claim that their reports of this are
somehow false?  Other companies also agree, but just don't want to take
the initial "hit" of time to do it correctly as it will affect the
device today to save time and money for the device tomorrow.

> Many product guys talked to me that the non-upstream porting didn't
> cost much and not the reason to pin on some stable kernel.

You must be talking to product people who only have to make one device,
not a family of devices :)

> All of them said that testing and stability was the most cost part.

Sure, software is always free, it's that pesky testing and fixing all of
the bugs found that costs money :)
(hint, all of those backports and non-upstream stuff is what is causing
lots of those bugs...)

> Not only the regular test case, benchmarks, but also the long time
> using for some trick/corner case bugs in whole system.

What do you mean by this?

> I doubt the 'keep rebasing on upstream' guys have been really worked on
> product?

I doubt those "let's not work upstream" have been in this business for
as long as those of us who say "work upstream first" have :)

Fine, you can ignore us, but realize that it will cost you time and
money to _not_ work upstream.  We are just trying to help you out...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21  6:58           ` Alex Shi
  2016-09-21  9:23             ` gregkh
@ 2016-09-21 13:56             ` Theodore Ts'o
  2016-09-21 15:23               ` Alex Shi
  1 sibling, 1 reply; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-21 13:56 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Wed, Sep 21, 2016 at 02:58:22PM +0800, Alex Shi wrote:
> 'upstream first' is good for feature development, but isn't good for
> product. Many product guys talked to me that the non-upstream porting
> didn't cost much and not the reason to pin on some stable kernel. All of
> them said that testing and stability was the most cost part. Not only
> the regular test case, benchmarks, but also the long time using for some
> trick/corner case bugs in whole system.
> 
> I doubt the 'keep rebasing on upstream' guys have been really worked on
> product?

I've worked on product kernels for consumer, and I'm horrified by what
the cr*p drivers do to stability and testing.  It also means that I've
had to port the same feature to N different product kernels, and seen
how many bug fixes (including some security-relevant fixes) aren't
getting applied to the product kernel because the SOC vendor isn't
tracking the LTS kernel.  And I've also seen engineers from other
companies ask me about bug fixes that were already fixed in the LTS
kernel.

So yes, I've seen it all, from running upstream leading edge kernels
on my laptop, to product kernels for consumer products, to enterprise
distro kernels, to data center kernels where we rebase once a year or
so --- and I certainly know what I prefer, and what results in the
highest quality kernels while still allowing for bleeding edge kernel
features that allow for competitive advantages.  (Hint: it's not the
current consumer product kernel approach.)

   	   	     	     	      	     - Ted

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21  9:23             ` gregkh
@ 2016-09-21 14:52               ` Alex Shi
  2016-09-21 15:28                 ` gregkh
  2016-09-21 18:22               ` Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Alex Shi @ 2016-09-21 14:52 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss

On 09/21/2016 05:23 PM, gregkh@linuxfoundation.org wrote:
>> > 
>> > I am not a fan of some scheduler solution. But focus on this can not
>> > explain why many distributions are using 'old' stable kernel. Looking
>> > into product world, could you find some real product are using
>> > 'upstream' kernel?
>> > 
>> > 'upstream first' is good for feature development, but isn't good for
>> > product.
> Not true, IBM and Intel have shown that it saves you time and money to
> do "upstream first", why do people claim that their reports of this is
> somehow false?  Other companies also agree, but just don't want to take
> the initial "hit" of time to do it correctly as it will affect the
> device today to save time and money for the device tomorrow.

Thanks for the quick response!

I left the Intel open source center 3 years ago, so I may have missed
some changes at Intel. As far as I remember, Intel had no software
product built on its leading open source projects, like virtualization
or others. On Android or the earlier MeeGo project, it still used an
'old' kernel with many downstream patches.

So could you give me more detailed information on the IBM and Intel
cases?

> 
>> > Many product guys talked to me that the non-upstream porting didn't
>> > cost much and not the reason to pin on some stable kernel.
> You must be talking to product people who only have to make one device,
> not a family of devices :)

No, the one I asked is a Linaro core member, and also a leading
company in mobile phones.

> 
>> > All of them said that testing and stability was the most cost part.
> Sure, software is always free, it's that pesky testing and fixing all of
> the bugs found that costs money :)
> (hint, all of those backports and non-upstream stuff is what is causing
> lots of those bugs...)
> 
>> > Not only the regular test case, benchmarks, but also the long time
>> > using for some trick/corner case bugs in whole system.
> What do you mean by this?

Uh, they are not so confident in the stability of the whole system; bugs
may come from the middle layer, or from the compatibility of user apps
with the kernel. Regular testing cannot cover everything; some bug
reports also come from consumers.

> 
>> > I doubt the 'keep rebasing on upstream' guys have been really worked on
>> > product?
> I doubt those "let's not work upstream" have been in this business for
> as long as those of us who say "work upstream first" have :)

There are indeed many guys who 'ignore' upstream work under huge 'time
to market' pressure. But it is not only their fault; the community may
need some better ways to help them out.

BTW, I take back those words. There may be some industries outside my
experience that are doing so. But please let me know of such cases.

> 
> Fine, you can ignore us, but realize that it will cost you time and
> money to _not_ work upstream.  We are just trying to help you out...

Sorry to have given you this impression, but that's not what I meant.
Saving the mobile industry guys' time and giving them more help should
be better than putting more pressure on them.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 13:56             ` Theodore Ts'o
@ 2016-09-21 15:23               ` Alex Shi
  2016-09-21 15:33                 ` gregkh
  0 siblings, 1 reply; 122+ messages in thread
From: Alex Shi @ 2016-09-21 15:23 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, gregkh, ksummit-discuss



On 09/21/2016 09:56 PM, Theodore Ts'o wrote:
> On Wed, Sep 21, 2016 at 02:58:22PM +0800, Alex Shi wrote:
>> 'upstream first' is good for feature development, but isn't good for
>> product. Many product guys talked to me that the non-upstream porting
>> didn't cost much and not the reason to pin on some stable kernel. All of
>> them said that testing and stability was the most cost part. Not only
>> the regular test case, benchmarks, but also the long time using for some
>> trick/corner case bugs in whole system.
>>
>> I doubt the 'keep rebasing on upstream' guys have been really worked on
>> product?
> 
> I've worked on product kernels for consumer, and I'm horrified by what
> the cr*p drivers do to stability and testing.  It also means that I've
> had to port the same feature to N different product kernels, and seen
> how many bug fixes (including some security-relevant fixes) aren't
> getting applied to the product kernel because the SOC vendor isn't
> tracking the LTS kernel.  And I've also seen engineers from other
> companies ask me about bug fixes that were already fixed in the LTS
> kernel.

Thanks Ted!

The mobile industry is really a mess. :)
But there are still companies that like to update their releases for
bugs reported by consumers.

> 
> So yes, I've seen it all, from running upstream leading edge kernels
> on my laptop, to product kernels for consumer products, to enterprise
> distro kernels, to data center kernels where we rebase once a year or
> so --- and I certainly know what I prefer, and what results in the
> highest quality kernels while still allowing for bleeding edge kernel
> features that allow for competitive advantages.  (Hint: it's not the
> current consumer product kernel approach.)
> 

Yes, personally, I also keep using the latest kernel on my laptop. But
as to released products, could you tell me which data center kernel is
shipped to others as a product? Or which enterprise distro kernel keeps
rebasing to upstream? Most products need teamwork; a mobile phone, for
example, includes the kernel, libraries, and user apps, and the amount
of software is tremendous. Thus a base-layer software update needs a
long time of testing for compatibility and stability. So rebasing to
upstream is too much of a luxury for them.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 14:52               ` Alex Shi
@ 2016-09-21 15:28                 ` gregkh
  2016-09-21 18:50                   ` Mark Brown
  2016-09-22  3:15                   ` Alex Shi
  0 siblings, 2 replies; 122+ messages in thread
From: gregkh @ 2016-09-21 15:28 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, ksummit-discuss

On Wed, Sep 21, 2016 at 10:52:19PM +0800, Alex Shi wrote:
> 
> 
> On 09/21/2016 05:23 PM, gregkh@linuxfoundation.org wrote:
> >> > 
> >> > I am not a fan of some scheduler solution. But focus on this can not
> >> > explain why many distributions are using 'old' stable kernel. Looking
> >> > into product world, could you find some real product are using
> >> > 'upstream' kernel?
> >> > 
> >> > 'upstream first' is good for feature development, but isn't good for
> >> > product.
> > Not true, IBM and Intel have shown that it saves you time and money to
> > do "upstream first", why do people claim that their reports of this is
> > somehow false?  Other companies also agree, but just don't want to take
> > the initial "hit" of time to do it correctly as it will affect the
> > device today to save time and money for the device tomorrow.
> 
> Thanks for quick response!
> 
> I have left Intel Open source center for 3 years, may I miss some
> changes in Intel. In my memory, Intel has no soft product on its leading
> open source project, like virtualization or others. On android or
> previous meego project, it's still use 'old' kernel with much of
> down-stream patches.

Being one of the previous Meego kernel maintainers, and in charge of a
number of laptops that shipped Meego images (we made money on
preinstalled Linux!) I strongly disagree with that statement.  We spent
a lot of time getting all of the work we did for Meego upstream when we
added it to our kernels, there was no deviation there at all.

> So would you like to tell me more detailed info of IBM, Intel case?

It's online somewhere, and has been described in many presentations from
executives of both companies.  I think there was a business "whitepaper"
written somewhere as well that went into the details.

> >> > Many product guys talked to me that the non-upstream porting didn't
> >> > cost much and not the reason to pin on some stable kernel.
> > You must be talking to product people who only have to make one device,
> > not a family of devices :)
> 
> No, what I asked is one of Linaro core member, they are also leading
> company in mobile phone.

Lots of companies ship mobile phones, none of them do it well :)

> >> > All of them said that testing and stability was the most cost part.
> > Sure, software is always free, it's that pesky testing and fixing all of
> > the bugs found that costs money :)
> > (hint, all of those backports and non-upstream stuff is what is causing
> > lots of those bugs...)
> > 
> >> > Not only the regular test case, benchmarks, but also the long time
> >> > using for some trick/corner case bugs in whole system.
> > What do you mean by this?
> 
> Uh, they are not so confident on the whole system stability, bug maybe
> come from middle layer, or user APP the compatibility with kernel.

Really?  That's news to me, what are we breaking at that layer?  We
ALWAYS want to know information like that as we do not accept that.

> Regular testing cannot cover everything, some bug report also come from
> consumers.

Sure, you do realize you are talking to lots of people here who
individually have decades of shipping Linux products on lots of
different platforms?  :)

We know bug reports come from everyone, there is no such thing as "bug
free software", and none of us are claiming it.  What we are claiming is
that you should stick as closely as possible to the tree that is tested
by the most people (i.e. mainline), as that gets you the most bug
fixes, as well as the ability to use the kernel community to help you
out when you have problems.  Otherwise you are on your own with your
2.5 million lines added franken-kernel that no one will touch if they
have a choice not to.

> >> > I doubt the 'keep rebasing on upstream' guys have been really worked on
> >> > product?
> > I doubt those "let's not work upstream" have been in this business for
> > as long as those of us who say "work upstream first" have :)
> 
> There are do many guys 'ignore' the upstream work with a huge 'time to
> market' pressure. But there are not only their fault, community may need
> some better ways to help them out.

Ok, why are they not talking to us?  We are easy to find, just look at
our inboxes :)

What do you think we could do to help them out?  That's what I have been
doing for the past 10 years in going around and working with companies.
But it's a two-way street, we aren't going to suddenly stop development
on new kernels and just focus on one specific one for a full year, you
have to be realistic.

> BTW, I take back this word. There are may some industry out of my
> experience which is doing so. But let me know the case.

Lots of them are, look at the customers of Renesas as one such example
of an SoC company that knows how to do this well, and is doing a great
job.  And their customers seem to appreciate it from what I can tell.

> > Fine, you can ignore us, but realize that it will cost you time and
> > money to _not_ work upstream.  We are just trying to help you out...
> 
> Sorry to give you this impression, but that's not what I mean. To save
> mobile industry guys' time and give more help should be better than give
> more pressure on them.

We have given them lots of help, we gave them a whole kernel, and
another company gave them a whole operating system for free.  What more
do they want? :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 15:23               ` Alex Shi
@ 2016-09-21 15:33                 ` gregkh
  2016-09-21 19:16                   ` Mark Brown
  0 siblings, 1 reply; 122+ messages in thread
From: gregkh @ 2016-09-21 15:33 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, ksummit-discuss

On Wed, Sep 21, 2016 at 11:23:15PM +0800, Alex Shi wrote:
> On 09/21/2016 09:56 PM, Theodore Ts'o wrote:
> Yes, personally, I also keep using latest kernel on my laptop. But as to
> a released product, could you let me know data center will ship to other
> as product?

My cloud provider keeps updating the kernel of my virtual machines to
the latest upstream stable releases all the time.  If yours doesn't, I
suggest you get a better provider.

> Or which enterprise distro kernel keep rebasing to upstream?

Both SuSE and Oracle do this for their releases.

> Most of product need team work, like mobile phone, include kernel,
> libraries, user APP, the software number is tremendous.

Why would updating a new kernel require userspace changes in these
things?  Kernels are always forward compatible, minus random driver bugs
for crazy subsystems (drm does have some problems at times as people do
point out.)

> Thus base layer software update need long time testing for
> compatibility and stability.

Why isn't that happening constantly?  Why wait a few years to do this
for new versions?  Why are the test frameworks that people have not
always churning away at this type of thing to find our bugs as soon as
possible?

> Then rebasing to upstream is too luxury to them.

Then they end up with a kernel, and a product, that is insecure and
vulnerable when it ships.  Not a good thing for something people trust
personal data to...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21  9:23             ` gregkh
  2016-09-21 14:52               ` Alex Shi
@ 2016-09-21 18:22               ` Mark Brown
  2016-09-21 18:54                 ` Linus Walleij
                                   ` (3 more replies)
  1 sibling, 4 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-21 18:22 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 5290 bytes --]

On Wed, Sep 21, 2016 at 11:23:41AM +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Sep 21, 2016 at 02:58:22PM +0800, Alex Shi wrote:

> > I am not a fan of some scheduler solution. But focus on this can not
> > explain why many distributions are using 'old' stable kernel. Looking
> > into product world, could you find some real product are using
> > 'upstream' kernel?

> > 'upstream first' is good for feature development, but isn't good for
> > product.

> Not true, IBM and Intel have shown that it saves you time and money to
> do "upstream first", why do people claim that their reports of this is
> somehow false?  Other companies also agree, but just don't want to take
> the initial "hit" of time to do it correctly as it will affect the
> device today to save time and money for the device tomorrow.

One problem that bites people fairly hard trying to do upstream first is
that upstream turns too slowly to do things for the current product in
some markets.  When I was working on phones it'd off the top of my head
typically be 3-4 months before anything ended up in a release (depending
on where we were in the release cycle) and another year or so before
that filtered back out into product (and this is mostly with very low
resistance subsystems where I'm the maintainer).  This doesn't play very
nicely when your total product development lifecycle is on the order of
six months.  Even when you scale down to individual reviews of patches
the latencies involved just don't fit that well.  Upstream is
simultaneously very fast moving (when you look at the overall change
volume flowing in) and very slow moving (when you look at some
individual changes).

When I talk to people about this I tend to talk about doing upstream
simultaneously rather than first, that is a lot more tractable.  Work on
things, try to push your latest work upstream as you write it and
incorporate review feedback from upstream into your product as you go
but don't gate things on upstream.  Try to view anything that's still in
review as a problem that needs to be fixed but acknowledge that there's
also a need to actually get product out the door.

> > Many product guys talked to me that the non-upstream porting didn't
> > cost much and not the reason to pin on some stable kernel.

> You must be talking to product people who only have to make one device,
> not a family of devices :)

What Alex is seeing reflects my experience talking to people as well.
It's not like anyone is saying this is free but it's a thing that people
have been doing for a while, figured out and have incorporated into
their planning - it's manageable and reasonably well understood even if
not super productive.

> > All of them said that testing and stability was the most cost part.

> Sure, software is always free, it's that pesky testing and fixing all of
> the bugs found that costs money :)
> (hint, all of those backports and non-upstream stuff is what is causing
> lots of those bugs...)

All code has problems, it's not like we've got a silver bullet here -
let's not pretend that we're in a position where upstream is shippable
on all systems or where we never have any performance regressions.

> > Not only the regular test case, benchmarks, but also the long time
> > using for some trick/corner case bugs in whole system.

> What do you mean by this?

I think Alex is referring to the detailed full system testing that
people do, I could be wrong though.

> > I doubt the 'keep rebasing on upstream' guys have been really worked on
> > product?

> I doubt those "let's not work upstream" have been in this business for
> as long as those of us who say "work upstream first" have :)

I have worked in this environment at various levels all the way up to
being sat on site with major consumer electronics companies pushing
patches into market leading products (and onto the lists).  I have
talked and continue to talk to colleagues in this space who have varying
degrees of engagement with upstream.

> Fine, you can ignore us, but realize that it will cost you time and
> money to _not_ work upstream.  We are just trying to help you out...

People aren't just ignoring the idea of working upstream entirely.
People are doing work here, it's producing results (or at least a pile
of mainline code).  Some are doing more than others, some are more
successful than others but there is a lot happening here.  There's a lot
of things going on that I'm critical of but there's also a lot of
perfectly rational, practical decisions being made.  Getting the more
complex consumer electronics devices to the point where they work as
well upstream as servers do is not trivial and will take some time.

Simply repeating "upstream first" over and over again and telling people
that doing anything else is just silly isn't really helping move things
forward.  People have heard this but for a good chunk of the industry
there's a big gap between that simple statement and something that can
be practically acted on in any sort of direct fashion, it can easily
just come over as dismissive and hostile.  It's going to be much more
productive to acknowledge the realities people are dealing with and talk
about how people can improve their engagement with upstream, make the
situation better and close the gaps.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 15:28                 ` gregkh
@ 2016-09-21 18:50                   ` Mark Brown
  2016-09-22  3:15                   ` Alex Shi
  1 sibling, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-21 18:50 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 3395 bytes --]

On Wed, Sep 21, 2016 at 05:28:19PM +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Sep 21, 2016 at 10:52:19PM +0800, Alex Shi wrote:

> > I have left Intel Open source center for 3 years, may I miss some
> > changes in Intel. In my memory, Intel has no soft product on its leading
> > open source project, like virtualization or others. On android or
> > previous meego project, it's still use 'old' kernel with much of
> > down-stream patches.

> Being one of the previous Meego kernel maintainers, and in charge of a
> number of laptops that shipped Meego images (we made money on
> preinstalled Linux!) I strongly disagree with that statement.  We spent
> a lot of time getting all of the work we did for Meego upstream when we
> added it to our kernels, there was no deviation there at all.

Sadly the x86 Android phones I've seen were no different to any other
Android phone in this regard.

> We know bug reports come from everyone, there is no such thing as "bug
> free software", and none of us are claiming it.  What we are claiming is
> that you should stick to the tree that is tested by as many people as
> possible the closest (i.e. mainline) as that gets you the most bug
> fixes, as well as the ability to use the kernel community to help you
> out when you have problems.  Otherwise you are on your own with your
> 2.5million lines added franken-kernel that no one will touch if they
> have a choice not to.

That diff isn't going to go away overnight of course...

> > There are do many guys 'ignore' the upstream work with a huge 'time to
> > market' pressure. But there are not only their fault, community may need
> > some better ways to help them out.

> Ok, why are they not talking to us?  We are easy to find, just look at
> our inboxes :)

> What do you think we could do to help them out?  That's what I have been
> doing for the past 10 years in going around and working with companies.
> But it's a two-way street, we aren't going to suddenly stop development
> on new kernels and just focus on one specific one for a full year, you
> have to be realistic.

Right, as do we.  People's code isn't magically going to get upstreamed
overnight, it takes time especially if the hardware isn't designed in a
way that Linux expects and needs some substantial framework adjustments
(which is a serious problem for some of the SoCs).

> > BTW, I take back this word. There are may some industry out of my
> > experience which is doing so. But let me know the case.

> Lots of them are, look at the customers of Renesas as one such example
> of an SoC company that knows how to do this well, and is doing a great
> job.  And their customers seem to appreciate it from what I can tell.

They are one of the better engaged vendors, as are companies like Atmel,
TI and nVidia (pulling names off the top of my head here), but talking
to their downstreams they do still have product kernels with a bunch of
things in them which have not all seen the light of upstream.  It's not
black or white, it's shades of grey.

You're asking what we can do to help - one of the biggest things I wish
we did more of was spend time calling out good practice rather than
complaining about bad practice.  Help people see changes they can make
today rather than talking only about a far off, distant goal.  Encourage
people doing these things to talk about the day to day benefits they've
seen.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 18:22               ` Mark Brown
@ 2016-09-21 18:54                 ` Linus Walleij
  2016-09-21 19:52                 ` Theodore Ts'o
                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 122+ messages in thread
From: Linus Walleij @ 2016-09-21 18:54 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Wed, Sep 21, 2016 at 8:22 PM, Mark Brown <broonie@kernel.org> wrote:

> When I talk to people about this I tend to talk about doing upstream
> simultaneously rather than first, that is a lot more tractable.  Work on
> things, try to push your latest work upstream as you write it and
> incorporate review feedback from upstream into your product as you go
> but don't gate things on upstream.  Try to view anything that's still in
> review as a problem that needs to be fixed but acknowledge that there's
> also a need to actually get product out the door.

This is very similar to my experience as the person mainly responsible for
the kernel at ST-Ericsson (until their sad demise).

The strategy I had was that of trying to minimize hamming distance on
anything already upstream, incorporate patches from upstream, and also
emphasize that core infrastructure such as irqchips, clocks, regulators,
pins, GPIOs, dma, timers, UARTs etc need to go upstream first.

This way, whenever one of our developers patched any of these internal
drivers, that was likely on some lines that were identical to upstream, so
I could rebase the patch onto upstream, test it, sign it off and
mail off to the respective subsystem maintainer. Win-win!

Leaf drivers could then (in theory) be worked on in isolation.

The storage team (MMC/SD) where the current MMC/SD subsystem
maintainer Ulf Hansson was working, actually embraced an upstream
first strategy. This was achieved by backporting the whole set of
changes in drivers/mmc/* from upstream until all the files in this
part of the kernel were identical to a much later kernel version. Luckily
that part of the kernel was so self-contained behind the block layer
API that this worked out pretty well. I think it was done solely for that
part of the kernel because it was the path of least resistance as
many advanced MMC/SD protocol features had to be squeezed in,
and upstream actually moved ahead much faster with this.

The big contention points to working upstream on Ux500 were
(for everyone's amusement):

- Missing frameworks: system PM and runtime PM especially.
  And even more the intersection between those two.
  The situation is still not very good when it comes to this. It
  is getting better.

- No working generic power domain. This is getting better.

- Huge out-of-tree drivers for on-chip power management: the PRCMU.
  (Power-reset control management unit). We now have a better
  chance of handling that with syscon and also rpmsg and splitting
  it up per-subsystem after Björn Andersson's latest patches but it
  wasn't there some years ago.

- Missing UART bus for attaching Bluetooth: the thing that Rob
  Herring is now working on.

- Missing HCI transport for subdevices piggybacking HCI:
  FM radio and GPS were using this. Still there is no solution for
  this.

- Huge rewrites needed to move a former framebuffer driver to
  KMS/DRM. The organization was strongly unwilling to rewrite
  a working driver of that size.

- No proper sensor subsystem. We now have IIO.

- Third party GPUs (Mali). Even Intel had this problem when using
  PowerVR IIRC. More of a political problem than a technical one
  as it seems.

Yours,
Linus Walleij

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 15:33                 ` gregkh
@ 2016-09-21 19:16                   ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-21 19:16 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 1014 bytes --]

On Wed, Sep 21, 2016 at 05:33:41PM +0200, gregkh@linuxfoundation.org wrote:
> On Wed, Sep 21, 2016 at 11:23:15PM +0800, Alex Shi wrote:

> > Or which enterprise distro kernel keep rebasing to upstream?

> Both SuSE and Oracle do this for their releases.

There's some other distros out there...

> > Thus base layer software update need long time testing for
> > compatibility and stability.

> Why isn't that happening constantly?  Why wait a few years to do this
> for new versions?  Why is the test frameworks that people have not
> always churning away at this type of thing to find our bugs as soon as
> possible?

We've had plenty of talks at KS from the enterprise vendors about the
challenges with doing detailed performance testing on upstream on a
constant basis, the same issues apply.  It's not a case of waiting years
to start, it's a case of tests that take a very long time or lots of
other resources to perform and often some skill in knowing how to
interpret the results.  

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 18:22               ` Mark Brown
  2016-09-21 18:54                 ` Linus Walleij
@ 2016-09-21 19:52                 ` Theodore Ts'o
  2016-09-22  0:43                   ` Mark Brown
  2016-09-22  5:20                 ` gregkh
  2016-09-22 12:56                 ` Laurent Pinchart
  3 siblings, 1 reply; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-21 19:52 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Wed, Sep 21, 2016 at 07:22:04PM +0100, Mark Brown wrote:
> One problem that bites people fairly hard trying to do upstream first is
> that upstream turns too slowly to do things for the current product in
> some markets, when I was working on phones it'd off the top of my head
> typically be 3-4 months before anything ended up in a release (depending
> on where we were in the release cycle) and another year or so before
> that filtered back out into product (and this is mostly with very low
> resistance subsystems where I'm the maintainer).

To me, "upstream first" doesn't mean wait for things to cycle out to a
release.  It means "develop first on upstream" and then backport to
your product kernel.  And I tend to think of this as something that
should be done on a per-feature or per-device driver basis.  So yes,
your product won't run on upstream, at least certainly not at launch time.

But this tends to avoid a huge build-up of horrific technical debt
where you have a completely horrific scheduler change that is
completely invasive to the core kernel structures, and guarantees that
your changes will break the kernel building on any other architecture.
Most of the time, the comments that you will get back from the
community are actually good ideas; not just nit-picking to slow you
down.  (And, if you know that your changes are going upstream, this
tends to invoke a bit more professional pride with the result that
what you put forward for external review, and what shows up in the
product kernel, isn't a terrible hack filled with technical debt.)

> When I talk to people about this I tend to talk about doing upstream
> simultaneously rather than first, that is a lot more tractable.  Work on
> things, try to push your latest work upstream as you write it and
> incorporate review feedback from upstream into your product as you go
> but don't gate things on upstream.  Try to view anything that's still in
> review as a problem that needs to be fixed but acknowledge that there's
> also a need to actually get product out the door.

Sure.  I tend to think of that as upstream first, although if people
are happier calling it "upstream simultaneously", the terminology
doesn't really matter all that much.

What *does* matter is an attitude that the goal should be to have
stuff in your product kernel which is upstreamable as the default
goal, and having things that are out-of-tree should be the exception
and not the rule.  And if it does happen, there ought to be good
reasons for each of those changes, and an acknowledgement that
out-of-tree changes represent technical debt.

> People aren't just ignoring the idea of working upstream entirely.
> People are doing work here, it's producing results (or at least a pile
> of mainline code).  Some are doing more than others, some are more
> successful than others but there is a lot happening here.  There's a lot
> of things going on that I'm critical of but there's also a lot of
> perfectly rational, practical decisions being made.  Getting the more
> complex consumer electronics devices to the point where they work as
> well upstream as servers do is not trivial and will take some time.

Perhaps those that are doing good work here should be called out and
given praise?  While there are some negative consequences with calling
out those folks who aren't willing to do more, highlighting the people
who are doing a good job should be all upside.

    	      	     	 	       - Ted

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 19:52                 ` Theodore Ts'o
@ 2016-09-22  0:43                   ` Mark Brown
  0 siblings, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-22  0:43 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, gregkh, ksummit-discuss

[-- Attachment #1: Type: text/plain, Size: 5505 bytes --]

On Wed, Sep 21, 2016 at 03:52:41PM -0400, Theodore Ts'o wrote:
> On Wed, Sep 21, 2016 at 07:22:04PM +0100, Mark Brown wrote:
> > One problem that bites people fairly hard trying to do upstream first is
> > that upstream turns too slowly to do things for the current product in
> > some markets, when I was working on phones it'd off the top of my head

> To me, "upstream first" doesn't mean wait for things to cycle out to a
> release.  It means "develop first on upstream" and then backport to
> your product kernel.  And I tend to think of this as something that

I'm not sure that's what everyone understands when they are just told to
"upstream first" - I'd say a good proportion of people understand upstream
first as meaning getting things merged first, and we certainly don't
want people to just post patches once and then declare themselves happy
that they've done their bit.

> But this tends to avoid a huge build-up of horrific technical debt
> where you have a completely horrific scheduler change that is
> completely invasive to the core kernel structures, and guarantees that

It certainly mitigates it but it doesn't avoid it - you will get a
backlog, and probably not in the easy bits either.  This is already how
a lot of things are developed today, especially when they fiddle around
in core.

What's a bit less clear to me is that it's worth everyone assembling
enough of people's out of tree stuff to actually run things usefully on
the systems people are developing on day to day which is what you'd need
to do to make things practical for most end developers to actually do
anything with.  The mechanics for actually pushing things found in
production out from those production trees as they stand are not trivial
(especially when working through combinations of large companies many of
which don't want to talk to the world in general about what they're
doing during a large portion of development), until we're in a better
place upstreaming wise I'm not sure that *everyone* trying to focus on
in flight upstream work directly wouldn't generate more heat than light.  

At some point the balance will tip, I think it already has tipped in
some market segments that are less concerned about some of the more fun
features like power, but there are places where we need to get enough
building blocks in place for upstream to be a viable basis for
development - Tim Bird has some interesting presentations based on his
experience trying to push this at Sony Mobile.

> your changes will break the kernel building on any other architecture.

This is basically unrelated to upstreaming - it's just a quality thing
people can do if they like.  I know there are vendors that do actually
keep track of x86 already for their product kernels but honestly for a
lot of environments the practical use cases are fairly marginal so I can
totally understand why people would just fix it if they needed to.

I know you had really bad experiences with some product trees in the
past but that's not the entire world today.

> Most of the time, the comments that you will get back from the
> community are actually good ideas; not just nit-picking to slow you
> down.  (And, if you know that your changes are going upstream, this
> tends to invoke a bit more professional pride with the result that
> what you put forward for external review, and what shows up in the
> product kernel, isn't a terrible hack filled with technical debt.)

I think at this point there has been enough repetition to ensure that
everyone who might care has heard these basic arguments for working
upstream.  They just aren't flying well enough to solve the problem by
themselves, sorry.  If that's all we're saying then at this point it's
probably doing more harm than good to keep on repeating them over and
over again.

> > When I talk to people about this I tend to talk about doing upstream
> > simultaneously rather than first, that is a lot more tractable.  Work on

> Sure.  I tend to think of that as upstream first, although if people
> are happier calling it "upstream simultaneously", the terminology
> doesn't really matter all that much.

Like I say I think a lot of people tend to take the natural English
meaning of "first" when they hear the term.

> What *does* matter is an attitude that the goal should be to have
> stuff in your product kernel which is upstreamable as the default
> goal, and having things that are out-of-tree should be the exception
> and not the rule.  And if it does happen, there ought to be good
> reasons for each of those changes, and an acknowledgement that
> out-of-tree changes represent technical debt.

I'm not sure there's any meaningful set of people who are actively
enthusiastic about the current situation.  Thing is that we're already
in that situation, it is more useful to focus on improving it and we
need to be aware that this isn't something that can change overnight.
It's true that it's a huge pile of technical debt and we want to deal
with that but we also don't want people to view this as such a large
problem that it's insurmountable so they just give up.

> Perhaps those that are doing good work here should be called out and
> given praise?  While there are some negative consequences with calling
> out those folks who aren't willing to do more, highlighting the people
> who are doing a good job should be all upside.

Yes, we should do more of that.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 473 bytes --]

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 15:28                 ` gregkh
  2016-09-21 18:50                   ` Mark Brown
@ 2016-09-22  3:15                   ` Alex Shi
  1 sibling, 0 replies; 122+ messages in thread
From: Alex Shi @ 2016-09-22  3:15 UTC (permalink / raw)
  To: gregkh; +Cc: ltsi-dev, ksummit-discuss


On 09/21/2016 11:28 PM, gregkh@linuxfoundation.org wrote:
> Being one of the previous Meego kernel maintainers, and in charge of a
> number of laptops that shipped Meego images (we made money on
> preinstalled Linux!) I strongly disagree with that statement.  We spent
> a lot of time getting all of the work we did for Meego upstream when we
> added it to our kernels, there was no deviation there at all.

Thanks for the info, Greg!

I wish every company could have engineers as good as you. Then there
would be no need for things like LTSI or LSK.

But the fact is that many companies still need some help with the kernel.
Even in the Meego project, most vendors could not do as well as your
team; they still needed a 'reference kernel' and kept a lot of
out-of-tree code.

http://www.allaboutmeego.com/news/item/12477_MeeGo_kernel_policy_announceme.php
--
Correspondingly, maintainers of "kernel adaptations" will be expected to
backport features from the main MeeGo reference kernel to their own
kernels, to protect the reputation of MeeGo and maintain functionality
for end-users.
--
BTW, meego.com has disappeared, so I cannot find the full kernel policy
of MeeGo now.

> 
>> So would you like to tell me more detailed info of IBM, Intel case?
> 
> It's online somewhere, and has been described in many presentations from
> executives of both companies.  I think there was a business "whitepaper"
> written somewhere as well that went into the details.
> 
>>>>> Many product guys talked to me that the non-upstream porting didn't
>>>>> cost much and not the reason to pin on some stable kernel.
>>> You must be talking to product people who only have to make one device,
>>> not a family of devices :)
>>
>> No, what I asked is one of Linaro core member, they are also leading
>> company in mobile phone.
> 
> Lots of companies ship mobile phones, none of them do it well :)

That depends on how you define 'well'. :)
Yes, they do not have the capability to polish the software system to an
ideal state, but they give people more choices in smartphones, not only
the iPhone.

> 
>>>>> All of them said that testing and stability was the most cost part.
>>> Sure, software is always free, it's that pesky testing and fixing all of
>>> the bugs found that costs money :)
>>> (hint, all of those backports and non-upstream stuff is what is causing
>>> lots of those bugs...)
>>>
>>>>> Not only the regular test case, benchmarks, but also the long time
>>>>> using for some trick/corner case bugs in whole system.
>>> What do you mean by this?
>>
>> Uh, they are not so confident on the whole system stability, bug maybe
>> come from middle layer, or user APP the compatibility with kernel.
> 
> Really?  That's news to me, what are we breaking at that layer?  We
> ALWAYS want to know information like that as we do not accept that.

Sorry. I can not get more detailed info from them.

> 
>> Regular testing cannot cover everything, some bug report also come from
>> consumers.
> 
> Sure, you do realize you are talking to lots of people here who
> individually have decades of shipping Linux products on lots of
> different platforms?  :)
> 
> We know bug reports come from everyone, there is no such thing as "bug
> free software", and none of us are claiming it.  What we are claiming is
> that you should stick to the tree that is tested by as many people as
> possible the closest (i.e. mainline) as that gets you the most bug
> fixes, as well as the ability to use the kernel community to help you
> out when you have problems.  Otherwise you are on your own with your
> 2.5million lines added franken-kernel that no one will touch if they
> have a choice not to.
> 
>>>>> I doubt the 'keep rebasing on upstream' guys have been really worked on
>>>>> product?
>>> I doubt those "let's not work upstream" have been in this business for
>>> as long as those of us who say "work upstream first" have :)
>>
>> There are do many guys 'ignore' the upstream work with a huge 'time to
>> market' pressure. But there are not only their fault, community may need
>> some better ways to help them out.
> 
> Ok, why are they not talking to us?  We are easy to find, just look at
> our inboxes :)
> 
> What do you think we could do to help them out?  That's what I have been
> doing for the past 10 years in going around and working with companies.
> But it's a two-way street, we aren't going to suddenly stop development
> on new kernels and just focus on one specific one for a full year, you
> have to be realistic.

Actually, I don't have many good ideas for helping them. But it seems
LTSI/LSK is the kind of thing that would give some help.

Open source is a large world, and many of its players aren't good enough yet.

The interesting thing is that we always play the same role toward our
members as you do here: encouraging them to get more involved in the
community and to use the latest kernel/LTS where they can. But they need
time and more practice.

> 
>> BTW, I take back this word. There are may some industry out of my
>> experience which is doing so. But let me know the case.
> 
> Lots of them are, look at the customers of Renesas as one such example
> of an SoC company that knows how to do this well, and is doing a great
> job.  And their customers seem to appreciate it from what I can tell.

Good to know. Thanks!

> 
>>> Fine, you can ignore us, but realize that it will cost you time and
>>> money to _not_ work upstream.  We are just trying to help you out...
>>
>> Sorry to give you this impression, but that's not what I mean. To save
>> mobile industry guys' time and give more help should be better than give
>> more pressure on them.
> 
> We have given them lots of help, we gave them a whole kernel, and
> another company gave them a whole operating system for free.  What more
> do they want? :)
> 

In the short term, LTSI/LSK. In the long term, the capability to do
upstream work.

Well, LTSI/LSK does save them time and frees up more engineers for
upstream work.

^ permalink raw reply	[flat|nested] 122+ messages in thread

* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 18:22               ` Mark Brown
  2016-09-21 18:54                 ` Linus Walleij
  2016-09-21 19:52                 ` Theodore Ts'o
@ 2016-09-22  5:20                 ` gregkh
  2016-09-22 12:56                 ` Laurent Pinchart
  3 siblings, 0 replies; 122+ messages in thread
From: gregkh @ 2016-09-22  5:20 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, ksummit-discuss

On Wed, Sep 21, 2016 at 07:22:04PM +0100, Mark Brown wrote:
> 
> Simply repeating "upstream first" over and over again and telling people
> that doing anything else is just silly isn't really helping move things
> forward.  People have heard this but for a good chunk of the industry
> there's a big gap between that simple statement and something that can
> be practically acted on in any sort of direct fashion, it can easily
> just come over as dismissive and hostile.  It's going to be much more
> productive to acknowledge the realities people are dealing with and talk
> about how people can improve their engagement with upstream, make the
> situation better and close the gaps.

I think we are in violent agreement here :)

thanks,

greg k-h


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-21 18:22               ` Mark Brown
                                   ` (2 preceding siblings ...)
  2016-09-22  5:20                 ` gregkh
@ 2016-09-22 12:56                 ` Laurent Pinchart
  2016-09-22 16:22                   ` Mark Brown
  3 siblings, 1 reply; 122+ messages in thread
From: Laurent Pinchart @ 2016-09-22 12:56 UTC (permalink / raw)
  To: ksummit-discuss; +Cc: ltsi-dev, gregkh

On Wednesday 21 Sep 2016 19:22:04 Mark Brown wrote:
> On Wed, Sep 21, 2016 at 11:23:41AM +0200, gregkh@linuxfoundation.org wrote:
> > On Wed, Sep 21, 2016 at 02:58:22PM +0800, Alex Shi wrote:
> >> I am not a fan of some scheduler solutions. But focusing on this cannot
> >> explain why many distributions are using 'old' stable kernels. Looking
> >> at the product world, can you find any real products that are using an
> >> 'upstream' kernel?
> >> 
> >> 'upstream first' is good for feature development, but isn't good for
> >> products.
> > 
> > Not true, IBM and Intel have shown that it saves you time and money to
> > > do "upstream first", why do people claim that their reports of this are
> > somehow false?  Other companies also agree, but just don't want to take
> > the initial "hit" of time to do it correctly as it will affect the
> > device today to save time and money for the device tomorrow.
> 
> One problem that bites people fairly hard trying to do upstream first is
> that upstream turns too slowly to do things for the current product in
> some markets, when I was working on phones it'd off the top of my head
> typically be 3-4 months before anything ended up in a release (depending
> on where we were in the release cycle) and another year or so before
> that filtered back out into product (and this is mostly with very low
> resistance subsystems where I'm the maintainer).

As a real world example, I'm currently sitting in a room at XDC discussing 
memory allocation for video devices (V4L2 & DRM/KMS). The exact same 
discussion started in 2010, and we're still not close to agreeing on an API.

> This doesn't play very nicely when your total product development lifecycle
> is on the order of six months, even when you scale down to individual
> reviews of patches the latencies involved just don't fit that well. 
> Upstream is simultaneously very fast moving (when you look at the overall
> change volume flowing in) and very slow moving (when you look at some
> individual changes).
> 
> When I talk to people about this I tend to talk about doing upstream
> simultaneously rather than first, that is a lot more tractable.

That's the strategy we used at Nokia when developing the N900 and N9. It was 
quite painful though, it was largely believed among the developers that having 
separate upstream and product teams would have been better.

> Work on things, try to push your latest work upstream as you write it and
> incorporate review feedback from upstream into your product as you go
> but don't gate things on upstream.  Try to view anything that's still in
> review as a problem that needs to be fixed but acknowledge that there's
> also a need to actually get product out the door.
> 
> > > > Many product guys have told me that non-upstream porting didn't
> > > > cost much and was not the reason to pin to a particular stable kernel.
> > 
> > You must be talking to product people who only have to make one device,
> > not a family of devices :)
> 
> What Alex is seeing reflects my experience talking to people as well.
> It's not like anyone is saying this is free but it's a thing that people
> have been doing for a while, figured out and have incorporated into
> their planning - it's manageable and reasonably well understood even if
> not super productive.
> 
> > > > All of them said that testing and stability was the most costly part.
> > > 
> > > Sure, software is always free, it's that pesky testing and fixing all of
> > > the bugs found that costs money :)
> > > (hint, all of those backports and non-upstream stuff is what is causing
> > > lots of those bugs...)
> 
> All code has problems, it's not like we've got a silver bullet here -
> let's not pretend that we're in a position where upstream is shippable
> on all systems or where we never have any performance regressions.
> 
> > > > Not only the regular test cases and benchmarks, but also the long-term
> > > > use needed to find tricky/corner-case bugs in the whole system.
> > 
> > What do you mean by this?
> 
> I think Alex is referring to the detailed full system testing that
> people do, I could be wrong though.
> 
> > > > I doubt the 'keep rebasing on upstream' guys have really worked on
> > > > products?
> > > 
> > > I doubt those "let's not work upstream" have been in this business for
> > > as long as those of us who say "work upstream first" have :)
> 
> I have worked in this environment at various levels all the way up to
> being sat on site with major consumer electronics companies pushing
> patches into market leading products (and onto the lists).  I have
> talked and continue to talk to colleagues in this space who have varying
> degrees of engagement with upstream.
> 
> > Fine, you can ignore us, but realize that it will cost you time and
> > money to _not_ work upstream.  We are just trying to help you out...
> 
> People aren't just ignoring the idea of working upstream entirely.
> People are doing work here, it's producing results (or at least a pile
> of mainline code).  Some are doing more than others, some are more
> successful than others but there is a lot happening here.  There's a lot
> of things going on that I'm critical of but there's also a lot of
> perfectly rational, practical decisions being made.  Getting the more
> complex consumer electronics devices to the point where they work as
> well upstream as servers do is not trivial and will take some time.
> 
> Simply repeating "upstream first" over and over again and telling people
> that doing anything else is just silly isn't really helping move things
> forward.  People have heard this but for a good chunk of the industry
> there's a big gap between that simple statement and something that can
> be practically acted on in any sort of direct fashion, it can easily
> just come over as dismissive and hostile.  It's going to be much more
> productive to acknowledge the realities people are dealing with and talk
> about how people can improve their engagement with upstream, make the
> situation better and close the gaps.

-- 
Regards,

Laurent Pinchart


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-22 12:56                 ` Laurent Pinchart
@ 2016-09-22 16:22                   ` Mark Brown
  2016-09-22 22:14                     ` Theodore Ts'o
  0 siblings, 1 reply; 122+ messages in thread
From: Mark Brown @ 2016-09-22 16:22 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: ltsi-dev, gregkh, ksummit-discuss


On Thu, Sep 22, 2016 at 03:56:28PM +0300, Laurent Pinchart wrote:
> On Wednesday 21 Sep 2016 19:22:04 Mark Brown wrote:

> > When I talk to people about this I tend to talk about doing upstream
> > simultaneously rather than first, that is a lot more tractable.

> That's the strategy we used at Nokia when developing the N900 and N9. It was 
> quite painful though, it was largely believed among the developers that having 
> separate upstream and product teams would have been better.

Yeah, this is partly a preference thing and also a size thing - if the
organization is too small it's not really sustainable to have a dedicated
team, and if you do split them then there's a big risk that information
will stop flowing effectively between the two teams which defeats a lot
of the point.



* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-22 16:22                   ` Mark Brown
@ 2016-09-22 22:14                     ` Theodore Ts'o
  2016-09-23 12:28                       ` Laurent Pinchart
  2016-09-23 14:40                       ` [Ksummit-discuss] " Mark Brown
  0 siblings, 2 replies; 122+ messages in thread
From: Theodore Ts'o @ 2016-09-22 22:14 UTC (permalink / raw)
  To: Mark Brown; +Cc: ltsi-dev, gregkh, ksummit-discuss

On Thu, Sep 22, 2016 at 05:22:18PM +0100, Mark Brown wrote:
> > That's the strategy we used at Nokia when developing the N900 and N9. It was 
> > quite painful though, it was largely believed among the developers that having 
> > separate upstream and product teams would have been better.
> 
> Yeah, this is partly a preference thing and also a size thing - if the
> organization is too small it's not really sustainable to have a dedicated
> team, and if you do split them then there's a big risk that information
> will stop flowing effectively between the two teams which defeats a lot
> of the point.

Having a single team is also helpful if you require that the engineers
who are responsible for forwarding the patch to newer kernels are also
responsible for getting the patches upstream.  This way, the pain of
having outstanding technical debt is felt by those who are also
responsible for doing something to reduce the technical debt.

One of the problems with having one-time, special purpose teams for a
product bringup, and then breaking them up right after product launch
and reassigning them to another product/team is that there is
absolutely no incentive to pay down technical debt, and all the
incentive in the world to leave a cr*p job for someone else to clean
up, especially if they are given unrealistically tight deadlines.

It's "better" only to have separate product and upstream teams if the
product team are only measured on product launch, and if no one cares
if the upstream team is outnumbered by a factor of three compared to
the product teams throwing cr*p over the wall.  So if it's considered
a victory condition for the company to want to pay lip service to
getting things upstream, but doesn't care about whether or not the
"upstreaming" team is getting enough resources to have any chance of
success, sure, it's "better".  But to the extent that it can hide
problems, I suspect it can be a recipe towards building up lots of
technical debt and sweeping it under the rug....

   	       	       	   	 		       - Ted


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-22 22:14                     ` Theodore Ts'o
@ 2016-09-23 12:28                       ` Laurent Pinchart
  2016-09-23 13:27                         ` [Ksummit-discuss] [LTSI-dev] " Alex Shi
  2016-09-23 14:40                       ` [Ksummit-discuss] " Mark Brown
  1 sibling, 1 reply; 122+ messages in thread
From: Laurent Pinchart @ 2016-09-23 12:28 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, gregkh, ksummit-discuss

Hi Ted,

On Thursday 22 Sep 2016 18:14:51 Theodore Ts'o wrote:
> On Thu, Sep 22, 2016 at 05:22:18PM +0100, Mark Brown wrote:
> > > That's the strategy we used at Nokia when developing the N900 and N9. It
> > > was quite painful though, it was largely believed among the developers
> > > that having separate upstream and product teams would have been better.
> > 
> > Yeah, this is partly a preference thing and also a size thing - if the
> > > organization is too small it's not really sustainable to have a dedicated
> > team, and if you do split them then there's a big risk that information
> > will stop flowing effectively between the two teams which defeats a lot
> > of the point.
> 
> Having a single team is also helpful if you require that the engineers
> who are responsible for forwarding the patch to newer kernels are also
> responsible for getting the patches upstream.  This way, the pain of
> having outstanding technical debt is felt by those who are also
> responsible for doing something to reduce the technical debt.

It all depends on how upstream-oriented the developers are. In our case, the 
wish to have two separate teams was mostly due to the desire to have more time 
to spend on upstream work. Part of the challenge was to get enough budget (as 
in developer time) to work on upstream.

> One of the problems with having one-time, special purpose teams for a
> product bringup, and then breaking them up right after product launch
> and reassigning them to another product/team is that there is
> absolutely no incentive to pay down technical debt, and all the
> incentive in the world to leave a cr*p job for someone else to clean
> up, especially if they are given unrealistically tight deadlines.
> 
> It's "better" only to have separate product and upstream teams if the
> product team are only measured on product launch, and if no one cares
> if the upstream team is outnumbered by a factor of three compared to
> the product teams throwing cr*p over the wall.  So if it's considered
> a victory condition for the company to want to pay lip service to
> getting things upstream, but doesn't care about whether or not the
> "upstreaming" team is getting enough resources to have any chance of
> success, sure, it's "better".  But to the extent that it can hide
> problems, I suspect it can be a recipe towards building up lots of
> technical debt and sweeping it under the rug....

Yes, it's also a budget issue, which in the end requires convincing the right 
level of management that investing in upstream will pay back. Once you're at 
that point the rest becomes a matter of "just doing it" (with all the problems 
associated with performing the real upstreaming work).

As another example, the strategy we're using at Renesas (who has perfectly 
understood the value of upstreaming kernel code) is to develop the BSP and 
upstream support in parallel in different teams. The BSP team is bound to 
customer deadlines, while the upstream team has more freedom. The upstreaming 
priorities are negotiated between the two teams, and when the BSP team rebases 
its tree on upstream the delta shrinks. It will never get down to zero (as BSPs 
contain quite a few product-specific hacks that have a really bad effort/gain 
upstreaming ratio), but if the upstream team is correctly staffed it usually 
stays reasonable.

The BSP delta is also a good indicator of how well the upstreaming process is 
going, and can be regularly reviewed to identify issues that need to be fixed 
upstream.

-- 
Regards,

Laurent Pinchart


* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-23 12:28                       ` Laurent Pinchart
@ 2016-09-23 13:27                         ` Alex Shi
  2016-09-23 13:40                           ` Laurent Pinchart
  0 siblings, 1 reply; 122+ messages in thread
From: Alex Shi @ 2016-09-23 13:27 UTC (permalink / raw)
  To: Laurent Pinchart, Theodore Ts'o; +Cc: ltsi-dev, ksummit-discuss



On 09/23/2016 08:28 PM, Laurent Pinchart wrote:
> Yes, it's also a budget issue, which in the end requires convincing the right 
> level of management that investing in upstream will pay back. Once you're at 
> that point the rest becomes a matter of "just doing it" (with all the problems 
> associated with performing the real upstreaming work).
> 
> As another example, the strategy we're using at Renesas (who has perfectly 
> understood the value of upstreaming kernel code) is to develop the BSP and 
> upstream support in parallel in different teams. The BSP team is bound to 
> customer deadlines, while the upstream team has more freedom. The upstreaming 
> priorities are negotiated between the two teams, and when the BSP team rebases 
> its tree on upstream the delta shrinks. It will never get down to zero (as BSPs 
> contain quite a few product-specific hacks that have a really bad effort/gain 
> upstreaming ratio), but if the upstream team is correctly staffed it usually 
> stays reasonable.
> 
> The BSP delta is also a good indicator of how well the upstreaming process is 
> going, and can be regularly reviewed to identify issues that need to be fixed 
> upstream.

Oh, glad to learn the details of the 'upstream first' model at Renesas,
how it runs, and what effect it has. Thanks for sharing.

Just a question, how do you deal with user software compatibility
issues when a kernel API changes while rebasing on upstream? For example,
the libcgroup compatibility issue with cgroupv2.

Regards
Alex


* Re: [Ksummit-discuss] [LTSI-dev] [Stable kernel] feature backporting collaboration
  2016-09-23 13:27                         ` [Ksummit-discuss] [LTSI-dev] " Alex Shi
@ 2016-09-23 13:40                           ` Laurent Pinchart
  0 siblings, 0 replies; 122+ messages in thread
From: Laurent Pinchart @ 2016-09-23 13:40 UTC (permalink / raw)
  To: Alex Shi; +Cc: ltsi-dev, ksummit-discuss

Hi Alex,

On Friday 23 Sep 2016 21:27:50 Alex Shi wrote:
> On 09/23/2016 08:28 PM, Laurent Pinchart wrote:
> > Yes, it's also a budget issue, which in the end requires convincing the
> > right level of management that investing in upstream will pay back. Once
> > you're at that point the rest becomes a matter of "just doing it" (with
> > all the problems associated with performing the real upstreaming work).
> > 
> > As another example, the strategy we're using at Renesas (who has perfectly
> > understood the value of upstreaming kernel code) is to develop the BSP and
> > upstream support in parallel in different teams. The BSP team is bound to
> > customer deadlines, while the upstream team has more freedom. The
> > upstreaming priorities are negotiated between the two teams, and when the
> > BSP team rebases its tree on upstream the delta shrinks. It will never
> > > get down to zero (as BSPs contain quite a few product-specific hacks that
> > > have a really bad effort/gain upstreaming ratio), but if the upstream
> > > team is correctly staffed it usually stays reasonable.
> > > 
> > > The BSP delta is also a good indicator of how well the upstreaming process
> > is going, and can be regularly reviewed to identify issues that need to
> > be fixed upstream.
> 
> Oh, glad to learn the details of the 'upstream first' model at Renesas,
> how it runs, and what effect it has. Thanks for sharing.
> 
> Just a question, how do you deal with user software compatibility
> issues when a kernel API changes while rebasing on upstream? For example,
> the libcgroup compatibility issue with cgroupv2.

There's no hard rule there, it's decided on a case-by-case basis. Working on 
products makes it easier in the sense that you can usually update the userspace 
code along with the kernel. It all depends on what stability guarantee you want 
to give to the consumers of the kernel.

-- 
Regards,

Laurent Pinchart


* Re: [Ksummit-discuss] [Stable kernel] feature backporting collaboration
  2016-09-22 22:14                     ` Theodore Ts'o
  2016-09-23 12:28                       ` Laurent Pinchart
@ 2016-09-23 14:40                       ` Mark Brown
  1 sibling, 0 replies; 122+ messages in thread
From: Mark Brown @ 2016-09-23 14:40 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: ltsi-dev, gregkh, ksummit-discuss


On Thu, Sep 22, 2016 at 06:14:51PM -0400, Theodore Ts'o wrote:

> One of the problems with having one-time, special purpose teams for a
> product bringup, and then breaking them up right after product launch
> and reassigning them to another product/team is that there is
> absolutely no incentive to pay down technical debt, and all the
> incentive in the world to leave a cr*p job for someone else to clean
> up, especially if they are given unrealistically tight deadlines.

This varies a lot depending on where in the industry the team is - the
small temporary teams are normally not the ones doing the substantial
development but are more usually taking something that's essentially a
finished product and customizing it in some way, for example adapting
for a cost down or doing a reference design copy.  Since they're mostly
inheriting preexisting work there is less to be gained from consistency,
the technical debt is there when the teams start and they're really at
worst just maintaining the status quo.

My experience has been that where there's more substantial development
(eg, on the flagship products) being done companies will usually have a
fairly consistent team though they will also typically be inheriting
substantial technical debt from somewhere (eg, previous generations of
products, or SoC vendors if they're a system integrator).


