workflows.vger.kernel.org archive mirror
* thoughts on a Merge Request based development workflow
@ 2019-09-24 18:25 Neil Horman
  2019-09-24 18:37 ` Drew DeVault
  2019-09-24 23:15 ` David Rientjes
  0 siblings, 2 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-24 18:25 UTC (permalink / raw)
  To: workflows

Hey all-
	After hearing at LPC that there was a group investigating moving
some upstream development to a less email-centric workflow, I wanted to share
this with the group:

https://gitlab.com/nhorman/git-lab-porcelain

It's still very rough, and is currently focused on RH-based workflows,
but it could fairly easily be adapted to generic projects, if there's
interest, as well as to services other than GitLab (GitHub, etc.).

The principle is pretty straightforward (at least currently): it's a git
porcelain that couples creating a merge request with sending patch
emails.  It uses the GitLab REST API to fork projects and manipulate MRs
in sync with email patch posting.  It also contains an email listener
daemon that monitors the requisite lists for ACK/NACK responses, which
can then be translated into MR metadata for true MR approvals and
notifications to the maintainer that a branch is good to merge.
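The ACK/NACK scanning such a listener daemon performs can be sketched
roughly as below. This is a minimal shell sketch, not the porcelain's
actual implementation; the `scan_acks` helper and the mbox-style input
files it reads are hypothetical:

```shell
# Hypothetical sketch of the trailer scanning an email listener daemon
# might do: pull Acked-by/Nacked-by trailers out of saved list mail so
# they can later be turned into MR approval metadata.
scan_acks() {
    # Usage: scan_acks mail1 [mail2 ...]
    # Prints one "ACK <who>" or "NACK <who>" line per trailer found.
    grep -hE '^(Acked|Nacked)-by:' "$@" |
        sed -e 's/^Acked-by: */ACK /' -e 's/^Nacked-by: */NACK /'
}
```

A real daemon would also need to match each trailer to the patch series
it replies to (e.g. via In-Reply-To headers), which is not shown here.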

Ostensibly, if this has any sort of legs, the long-term idea is to add
the ability to use the porcelain to do reviews on the command line and
eventually phase out email entirely, but I think that's a significant
way off.

Anywho, food for thought.

Best
Neil


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-24 18:25 thoughts on a Merge Request based development workflow Neil Horman
@ 2019-09-24 18:37 ` Drew DeVault
  2019-09-24 18:53   ` Neil Horman
  2019-10-07 15:33   ` David Miller
  2019-09-24 23:15 ` David Rientjes
  1 sibling, 2 replies; 102+ messages in thread
From: Drew DeVault @ 2019-09-24 18:37 UTC (permalink / raw)
  To: Neil Horman, workflows

On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> 	After hearing at LPC that there was a group investigating moving
> some upstream development to a less email-centric workflow, I wanted to share
> this with the group:
> 
> https://gitlab.com/nhorman/git-lab-porcelain
> 
> It's still very rough, and is currently focused on RH-based workflows,
> but it could fairly easily be adapted to generic projects, if there's
> interest, as well as to services other than GitLab (GitHub, etc.).
> 
> The principle is pretty straightforward (at least currently): it's a git
> porcelain that couples creating a merge request with sending patch
> emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> in sync with email patch posting.  It also contains an email listener
> daemon that monitors the requisite lists for ACK/NACK responses, which
> can then be translated into MR metadata for true MR approvals and
> notifications to the maintainer that a branch is good to merge.

This is a great idea.

> Ostensibly, if this has any sort of legs, the long-term idea is to add
> the ability to use the porcelain to do reviews on the command line and
> eventually phase out email entirely, but I think that's a significant
> way off.

Until this part. Phasing out email in favor of a centralized solution
like Gitlab would be a stark regression.


* Re: thoughts on a Merge Request based development workflow
  2019-09-24 18:37 ` Drew DeVault
@ 2019-09-24 18:53   ` Neil Horman
  2019-09-24 20:24     ` Laurent Pinchart
  2019-10-07 15:33   ` David Miller
  1 sibling, 1 reply; 102+ messages in thread
From: Neil Horman @ 2019-09-24 18:53 UTC (permalink / raw)
  To: Drew DeVault; +Cc: workflows

On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> > 	After hearing at LPC that there was a group investigating moving
> > some upstream development to a less email-centric workflow, I wanted to share
> > this with the group:
> > 
> > https://gitlab.com/nhorman/git-lab-porcelain
> > 
> > It's still very rough, and is currently focused on RH-based workflows,
> > but it could fairly easily be adapted to generic projects, if there's
> > interest, as well as to services other than GitLab (GitHub, etc.).
> > 
> > The principle is pretty straightforward (at least currently): it's a git
> > porcelain that couples creating a merge request with sending patch
> > emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> > in sync with email patch posting.  It also contains an email listener
> > daemon that monitors the requisite lists for ACK/NACK responses, which
> > can then be translated into MR metadata for true MR approvals and
> > notifications to the maintainer that a branch is good to merge.
> 
> This is a great idea.
> 
> > Ostensibly, if this has any sort of legs, the long-term idea is to add
> > the ability to use the porcelain to do reviews on the command line and
> > eventually phase out email entirely, but I think that's a significant
> > way off.
> 
> Until this part. Phasing out email in favor of a centralized solution
> like Gitlab would be a stark regression.
> 
Well, that by no means has to happen (at least not in my mind).  I
wouldn't have an issue with maintaining a mailing list in perpetuity.  I
only mean to say that if it becomes common practice to use the git
interface to perform reviews, and email usage becomes less necessary, a
given project could choose to phase it out.

Neil



* Re: thoughts on a Merge Request based development workflow
  2019-09-24 18:53   ` Neil Horman
@ 2019-09-24 20:24     ` Laurent Pinchart
  2019-09-24 22:25       ` Neil Horman
  0 siblings, 1 reply; 102+ messages in thread
From: Laurent Pinchart @ 2019-09-24 20:24 UTC (permalink / raw)
  To: Neil Horman; +Cc: Drew DeVault, workflows

Hi Neil,

On Tue, Sep 24, 2019 at 02:53:12PM -0400, Neil Horman wrote:
> On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> > On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> > > 	After hearing at LPC that there was a group investigating moving
> > > some upstream development to a less email-centric workflow, I wanted to share
> > > this with the group:
> > > 
> > > https://gitlab.com/nhorman/git-lab-porcelain
> > > 
> > > It's still very rough, and is currently focused on RH-based workflows,
> > > but it could fairly easily be adapted to generic projects, if there's
> > > interest, as well as to services other than GitLab (GitHub, etc.).
> > > 
> > > The principle is pretty straightforward (at least currently): it's a git
> > > porcelain that couples creating a merge request with sending patch
> > > emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> > > in sync with email patch posting.  It also contains an email listener
> > > daemon that monitors the requisite lists for ACK/NACK responses, which
> > > can then be translated into MR metadata for true MR approvals and
> > > notifications to the maintainer that a branch is good to merge.
> > 
> > This is a great idea.
> > 
> > > Ostensibly, if this has any sort of legs, the long-term idea is to add
> > > the ability to use the porcelain to do reviews on the command line and
> > > eventually phase out email entirely, but I think that's a significant
> > > way off.
> > 
> > Until this part. Phasing out email in favor of a centralized solution
> > like Gitlab would be a stark regression.
>
> Well, that by no means has to happen (at least not in my mind).  I
> wouldn't have an issue with maintaining a mailing list in perpetuity.  I
> only mean to say that if it becomes common practice to use the git
> interface to perform reviews, and email usage becomes less necessary, a
> given project could choose to phase it out.

My opinion on this is that if anyone wants to move towards a more
git-centric workflow, be it for review, pull/merge requests, or anything
else, we will have to figure out a way to make this decentralised and
not bound to a single server instance. Without interoperability between
servers and decentralisation, the result will be vendor lock-in, and
that's a no-go for a large part of the community.

How this could be achieved remains to be discussed, and should be an
interesting exercise.

-- 
Regards,

Laurent Pinchart


* Re: thoughts on a Merge Request based development workflow
  2019-09-24 20:24     ` Laurent Pinchart
@ 2019-09-24 22:25       ` Neil Horman
  2019-09-25 20:50         ` Laurent Pinchart
  0 siblings, 1 reply; 102+ messages in thread
From: Neil Horman @ 2019-09-24 22:25 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Drew DeVault, workflows

On Tue, Sep 24, 2019 at 11:24:23PM +0300, Laurent Pinchart wrote:
> Hi Neil,
> 
> On Tue, Sep 24, 2019 at 02:53:12PM -0400, Neil Horman wrote:
> > On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> > > On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> > > > 	After hearing at LPC that there was a group investigating moving
> > > > some upstream development to a less email-centric workflow, I wanted to share
> > > > this with the group:
> > > > 
> > > > https://gitlab.com/nhorman/git-lab-porcelain
> > > > 
> > > > It's still very rough, and is currently focused on RH-based workflows,
> > > > but it could fairly easily be adapted to generic projects, if there's
> > > > interest, as well as to services other than GitLab (GitHub, etc.).
> > > > 
> > > > The principle is pretty straightforward (at least currently): it's a git
> > > > porcelain that couples creating a merge request with sending patch
> > > > emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> > > > in sync with email patch posting.  It also contains an email listener
> > > > daemon that monitors the requisite lists for ACK/NACK responses, which
> > > > can then be translated into MR metadata for true MR approvals and
> > > > notifications to the maintainer that a branch is good to merge.
> > > 
> > > This is a great idea.
> > > 
> > > > Ostensibly, if this has any sort of legs, the long-term idea is to add
> > > > the ability to use the porcelain to do reviews on the command line and
> > > > eventually phase out email entirely, but I think that's a significant
> > > > way off.
> > > 
> > > Until this part. Phasing out email in favor of a centralized solution
> > > like Gitlab would be a stark regression.
> >
> > Well, that by no means has to happen (at least not in my mind).  I
> > wouldn't have an issue with maintaining a mailing list in perpetuity.  I
> > only mean to say that if it becomes common practice to use the git
> > interface to perform reviews, and email usage becomes less necessary, a
> > given project could choose to phase it out.
> 
> My opinion on this is that if anyone wants to move towards a more
> git-centric workflow, be it for review, pull/merge requests, or anything
> else, we will have to figure out a way to make this decentralised and
> not bound to a single server instance. Without interoperability between
> servers and decentralisation, the result will be vendor lock-in, and
> that's a no-go for a large part of the community.
> 
I think that's a bit of an overstatement.

Yes, we definitely want to avoid vendor lock-in, and decentralization is
definitely a needed aspect of any highly parallelized workflow.  That said:

1) Regarding vendor lock-in: if you want to work with any
workflow-as-a-service provider, you're going to want tooling that talks
to it (via the web UI, the command line, etc.).  But in the end, you're
going to have to talk to that service.  That's all the tooling I
presented does.  And if you look at the REST APIs for the major
available services (GitLab/GitHub), while they differ, their general
objects and operations are sufficiently similar that they can be
abstracted through tooling, such that the same tool can be adapted to
either.  One would imagine that any to-be-created service would have
sufficiently similar operations to also be adaptable to a generic set of
operations.

2) Regarding decentralization: the advantage of git's decentralization
lies in the ability of users to house their own local copies of a git
tree, not so much in the ability to have multiple git servers (though
the latter is important too).  In either case, using GitLab or GitHub
doesn't prevent you from doing that.  You can still move your git tree
around between clients and servers however you see fit.

The thing to consider beyond that is the exportability of the data that
resides outside of git.  That is to say: if you want to move from GitLab
to GitHub, from either to some third service, or to a home-built
service, is there a way to export/import all that merge request and
issue data?  Honestly, I don't know the answer to that yet.  I can see
ways that it could be done, but am completely unsure as to how it should
be done.
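The claim in point 1, that the GitLab and GitHub REST APIs are similar
enough to abstract behind one tool, can be sketched as a thin dispatch
layer. The endpoints below are taken from the public GitLab v4 and
GitHub v3 API documentation; the `open_mr` helper itself is hypothetical
and only prints the request it would make, a dry run needing no token or
network:

```shell
# Sketch of a provider-neutral "open a merge request" operation.
# Prints the HTTP request it would issue rather than sending it.
open_mr() {
    # Usage: open_mr <gitlab|github> <project> <source> <target>
    case "$1" in
    gitlab) # GitLab v4: POST /projects/:id/merge_requests
        echo "POST https://gitlab.com/api/v4/projects/$2/merge_requests" \
             "source_branch=$3 target_branch=$4"
        ;;
    github) # GitHub v3: POST /repos/:owner/:repo/pulls
        echo "POST https://api.github.com/repos/$2/pulls head=$3 base=$4"
        ;;
    *)  echo "unknown provider: $1" >&2; return 1 ;;
    esac
}
```

A real tool would add authentication and response handling per provider,
but the generic objects (project, source branch, target branch) carry
over between the two services as described above.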

Neil

> How this could be achieved remains to be discussed, and should be an
> interesting exercise.
> 
> -- 
> Regards,
> 
> Laurent Pinchart
> 


* Re: thoughts on a Merge Request based development workflow
  2019-09-24 18:25 thoughts on a Merge Request based development workflow Neil Horman
  2019-09-24 18:37 ` Drew DeVault
@ 2019-09-24 23:15 ` David Rientjes
  2019-09-25  6:35   ` Toke Høiland-Jørgensen
  2019-09-25 10:49   ` Neil Horman
  1 sibling, 2 replies; 102+ messages in thread
From: David Rientjes @ 2019-09-24 23:15 UTC (permalink / raw)
  To: Neil Horman; +Cc: workflows

On Tue, 24 Sep 2019, Neil Horman wrote:

> Hey all-
> 	After hearing at LPC that there was a group investigating moving
> some upstream development to a less email-centric workflow, I wanted to share
> this with the group:
> 
> https://gitlab.com/nhorman/git-lab-porcelain
> 
> It's still very rough, and is currently focused on RH-based workflows,
> but it could fairly easily be adapted to generic projects, if there's
> interest, as well as to services other than GitLab (GitHub, etc.).
> 
> The principle is pretty straightforward (at least currently): it's a git
> porcelain that couples creating a merge request with sending patch
> emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> in sync with email patch posting.  It also contains an email listener
> daemon that monitors the requisite lists for ACK/NACK responses, which
> can then be translated into MR metadata for true MR approvals and
> notifications to the maintainer that a branch is good to merge.
> 
> Ostensibly, if this has any sort of legs, the long-term idea is to add
> the ability to use the porcelain to do reviews on the command line and
> eventually phase out email entirely, but I think that's a significant
> way off.
> 
> Anywho, food for thought.
> 

This is very interesting.

It may be off-topic but this email raised my curiosity about how features 
are maintained internally before they are ready to propose to upstream, 
especially when those features are developed over multiple upstream base 
releases.

We have features that we maintain in-house until they are ready to push 
upstream and while they are still under active development or we are 
collecting data to use as motivation for asking that feature to be merged.

For this, we have historically always rebased these features on top of new 
kernel releases (unfortunately not 4.20 -> 5.0 -> 5.1, but somewhere in 
between, like 4.20 -> 5.2), which creates a lot of churn, consumes developer 
resources, and rewrites the git history.

We have also explored maintaining these features by merging rather than 
rebasing: instead of rebasing from 4.20 to 5.2, for example, as a clean 
series on top of 5.2, we fork the feature branch based on 4.20 off and 
merge it with 5.2, fix it up, run tests, and publish.  The thought 
process here was that we can always git rebase --onto linus to create a 
nice clean patch series for posting upstream or asking for a git pull 
from upstream.
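The two strategies described above (merge the new release into the
feature branch to preserve history, then distill a clean series with
git rebase --onto when posting) can be sketched with a throwaway
repository. The tags and branch names below are illustrative stand-ins
for the real releases, and the sketch assumes git 2.28+ for `git init -b`:

```shell
# Throwaway repository standing in for the kernel tree; v4.20, v5.2 and
# "feature" are hypothetical stand-ins, not the real process.
set -eu
repo=$(mktemp -d) && cd "$repo"
git init -q -b main .
git config user.email dev@example.org
git config user.name Dev

echo base > file.c
git add file.c && git commit -q -m 'upstream 4.20' && git tag v4.20

git checkout -q -b feature v4.20        # feature developed against 4.20
echo feat > feature.c
git add feature.c && git commit -q -m 'feature work'

git checkout -q main                    # upstream moves on to 5.2
echo more >> file.c
git commit -q -am 'upstream 5.2' && git tag v5.2

# Merge strategy: carry the feature forward by merging the new release,
# preserving history (fix-ups and testing would happen at this point).
git checkout -q -b feature-5.2 feature
git merge -q --no-edit v5.2

# When it is time to post upstream, distill a clean series instead:
git rebase -q --onto v5.2 v4.20 feature
git log --oneline v5.2..feature         # a clean series on top of 5.2
```

After the rebase, `feature` contains only the feature commits replayed
on top of v5.2, ready for `git format-patch` or a pull request.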

I'd be very interested to know how others maintain patch series across 
multiple base kernel versions, especially when they need to maintain the 
feature for those kernel versions separately, how RH handles their patches 
before they are ready to be officially posted, etc.


* Re: thoughts on a Merge Request based development workflow
  2019-09-24 23:15 ` David Rientjes
@ 2019-09-25  6:35   ` Toke Høiland-Jørgensen
  2019-09-25 10:49   ` Neil Horman
  1 sibling, 0 replies; 102+ messages in thread
From: Toke Høiland-Jørgensen @ 2019-09-25  6:35 UTC (permalink / raw)
  To: David Rientjes, Neil Horman; +Cc: workflows

David Rientjes <rientjes@google.com> writes:

> I'd be very interested to know how others maintain patch series across
> multiple base kernel version especially when they need to maintain the
> feature for those kernel versions separately, how RH handles their
> patches before they are ready to be officially posted, etc.

The short answer is "we don't". Features are developed upstream and not
shipped in RHEL until they have landed upstream. The internal branches
contain plenty of backports and bug fixes, but the flow of patches is
one way: from upstream to internal.

There's a longish blog post here describing this in the context of
OpenStack, though it really applies quite generally:

https://community.redhat.com/blog/2015/03/upstream-first-turning-openstack-into-an-nfv-platform/

-Toke



* Re: thoughts on a Merge Request based development workflow
  2019-09-24 23:15 ` David Rientjes
  2019-09-25  6:35   ` Toke Høiland-Jørgensen
@ 2019-09-25 10:49   ` Neil Horman
  1 sibling, 0 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-25 10:49 UTC (permalink / raw)
  To: David Rientjes; +Cc: workflows

On Tue, Sep 24, 2019 at 04:15:18PM -0700, David Rientjes wrote:
> On Tue, 24 Sep 2019, Neil Horman wrote:
> 
> > Hey all-
> > 	After hearing at LPC that there was a group investigating moving
> > some upstream development to a less email-centric workflow, I wanted to share
> > this with the group:
> > 
> > https://gitlab.com/nhorman/git-lab-porcelain
> > 
> > It's still very rough, and is currently focused on RH-based workflows,
> > but it could fairly easily be adapted to generic projects, if there's
> > interest, as well as to services other than GitLab (GitHub, etc.).
> > 
> > The principle is pretty straightforward (at least currently): it's a git
> > porcelain that couples creating a merge request with sending patch
> > emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> > in sync with email patch posting.  It also contains an email listener
> > daemon that monitors the requisite lists for ACK/NACK responses, which
> > can then be translated into MR metadata for true MR approvals and
> > notifications to the maintainer that a branch is good to merge.
> > 
> > Ostensibly, if this has any sort of legs, the long-term idea is to add
> > the ability to use the porcelain to do reviews on the command line and
> > eventually phase out email entirely, but I think that's a significant
> > way off.
> > 
> > Anywho, food for thought.
> > 
> 
> This is very interesting.
> 
> It may be off-topic but this email raised my curiosity about how features 
> are maintained internally before they are ready to propose to upstream, 
> especially when those features are developed over multiple upstream base 
> releases.
> 
> We have features that we maintain in-house until they are ready to push 
> upstream and while they are still under active development or we are 
> collecting data to use as motivation for asking that feature to be merged.
> 
> For this, we have historically always rebased these features on top of new 
> kernel releases (unfortunately not 4.20 -> 5.0 -> 5.1, but somewhere in 
> between, like 4.20 -> 5.2), which creates a lot of churn, consumes developer 
> resources, and rewrites the git history.
> 
> We have also explored maintaining these features by merging rather than 
> rebasing: instead of rebasing from 4.20 to 5.2, for example, as a clean 
> series on top of 5.2, we fork the feature branch based on 4.20 off and 
> merge it with 5.2, fix it up, run tests, and publish.  The thought 
> process here was that we can always git rebase --onto linus to create a 
> nice clean patch series for posting upstream or asking for a git pull 
> from upstream.
> 
> I'd be very interested to know how others maintain patch series across 
> multiple base kernel versions, especially when they need to maintain the 
> feature for those kernel versions separately, how RH handles their patches 
> before they are ready to be officially posted, etc.
> 
As far as kernel features are concerned, RH follows an upstream-first policy
(and does so, generally speaking, for all of our products).  If a new feature
is to be developed, we do so against the latest upstream, always, and then the
effort becomes one of adapting those features to the RHEL stabilized kernels.
That shifts the effort from constantly needing to adapt an existing feature to
a newer kernel, to taking a newly developed feature and backporting it to an
older stabilized kernel.  It's not necessarily any less work per se, but we
find it accelerates the development of the actual feature.

HTH
Neil



* Re: thoughts on a Merge Request based development workflow
  2019-09-24 22:25       ` Neil Horman
@ 2019-09-25 20:50         ` Laurent Pinchart
  2019-09-25 21:54           ` Neil Horman
                             ` (2 more replies)
  0 siblings, 3 replies; 102+ messages in thread
From: Laurent Pinchart @ 2019-09-25 20:50 UTC (permalink / raw)
  To: Neil Horman; +Cc: Drew DeVault, workflows

Hi Neil,

On Tue, Sep 24, 2019 at 06:25:02PM -0400, Neil Horman wrote:
> On Tue, Sep 24, 2019 at 11:24:23PM +0300, Laurent Pinchart wrote:
> > On Tue, Sep 24, 2019 at 02:53:12PM -0400, Neil Horman wrote:
> >> On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> >>> On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> >>>> 	After hearing at LPC that there was a group investigating moving
> >>>> some upstream development to a less email-centric workflow, I wanted to share
> >>>> this with the group:
> >>>> 
> >>>> https://gitlab.com/nhorman/git-lab-porcelain
> >>>> 
> >>>> It's still very rough, and is currently focused on RH-based workflows,
> >>>> but it could fairly easily be adapted to generic projects, if there's
> >>>> interest, as well as to services other than GitLab (GitHub, etc.).
> >>>> 
> >>>> The principle is pretty straightforward (at least currently): it's a git
> >>>> porcelain that couples creating a merge request with sending patch
> >>>> emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> >>>> in sync with email patch posting.  It also contains an email listener
> >>>> daemon that monitors the requisite lists for ACK/NACK responses, which
> >>>> can then be translated into MR metadata for true MR approvals and
> >>>> notifications to the maintainer that a branch is good to merge.
> >>> 
> >>> This is a great idea.
> >>> 
> >>>> Ostensibly, if this has any sort of legs, the long-term idea is to add
> >>>> the ability to use the porcelain to do reviews on the command line and
> >>>> eventually phase out email entirely, but I think that's a significant
> >>>> way off.
> >>> 
> >>> Until this part. Phasing out email in favor of a centralized solution
> >>> like Gitlab would be a stark regression.
> >>
> >> Well, that by no means has to happen (at least not in my mind).  I
> >> wouldn't have an issue with maintaining a mailing list in perpetuity.  I
> >> only mean to say that if it becomes common practice to use the git
> >> interface to perform reviews, and email usage becomes less necessary, a
> >> given project could choose to phase it out.
> > 
> > My opinion on this is that if anyone wants to move towards a more
> > git-centric workflow, be it for review, pull/merge requests, or anything
> > else, we will have to figure out a way to make this decentralised and
> > not bound to a single server instance. Without interoperability between
> > servers and decentralisation, the result will be vendor lock-in, and
> > that's a no-go for a large part of the community.
>
> I think that's a bit of an overstatement.
> 
> Yes, we definitely want to avoid vendor lock-in, and decentralization is
> definitely a needed aspect of any highly parallelized workflow.  That said:
> 
> 1) Regarding vendor lock-in: if you want to work with any
> workflow-as-a-service provider, you're going to want tooling that talks to
> it (via the web UI, the command line, etc.).  But in the end, you're going
> to have to talk to that service.  That's all the tooling I presented does.
> And if you look at the REST APIs for the major available services
> (GitLab/GitHub), while they differ, their general objects and operations
> are sufficiently similar that they can be abstracted through tooling, such
> that the same tool can be adapted to either.  One would imagine that any
> to-be-created service would have sufficiently similar operations to also be
> adaptable to a generic set of operations.

Then let's create such a tool that also supports e-mail workflows, to
really offer a choice.

> 2) Regarding decentralization: the advantage of git's decentralization
> lies in the ability of users to house their own local copies of a git
> tree, not so much in the ability to have multiple git servers (though the
> latter is important too).  In either case, using GitLab or GitHub doesn't
> prevent you from doing that.  You can still move your git tree around
> between clients and servers however you see fit.

But it forces me to have one account with every git hosting service that
the projects I want to contribute to happen to select. Sure, we could
fix that by deciding to move the entire free software development to a
single git hosting provider, but that creates a vendor lock in issue
that few people would be comfortable with (including myself).

> The thing to consider beyond that is the exportability of the data that
> resides outside of git.  That is to say: if you want to move from GitLab
> to GitHub, from either to some third service, or to a home-built service,
> is there a way to export/import all that merge request and issue data?
> Honestly, I don't know the answer to that yet.  I can see ways that it
> could be done, but am completely unsure as to how it should be done.

There's one way, and we have it today; it's called e-mail :-) Jokes
aside, a bi-directional e-mail gateway would allow interoperability
between services, and wouldn't require me to create an account with the
hosting provider you happen to favour, or the other way around. The
issue with e-mail gateways is that e-mail clients (and users) are
notoriously bad at keeping formatting intact, so the e-mail to hosted
service direction is pretty unreliable. I would prefer investigating how
that could be improved instead of picking which vendor we'll get our
handcuffs from, as it would then benefit everybody.

> > How this could be achieved remains to be discussed, and should be an
> > interesting exercise.

-- 
Regards,

Laurent Pinchart


* Re: thoughts on a Merge Request based development workflow
  2019-09-25 20:50         ` Laurent Pinchart
@ 2019-09-25 21:54           ` Neil Horman
  2019-09-26  0:40           ` Neil Horman
  2019-09-26 10:23           ` Geert Uytterhoeven
  2 siblings, 0 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-25 21:54 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Drew DeVault, workflows

On Wed, Sep 25, 2019 at 11:50:36PM +0300, Laurent Pinchart wrote:
> Hi Neil,
> 
> On Tue, Sep 24, 2019 at 06:25:02PM -0400, Neil Horman wrote:
> > On Tue, Sep 24, 2019 at 11:24:23PM +0300, Laurent Pinchart wrote:
> > > On Tue, Sep 24, 2019 at 02:53:12PM -0400, Neil Horman wrote:
> > >> On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> > >>> On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> > >>>> 	After hearing at LPC that there was a group investigating moving
> > >>>> some upstream development to a less email-centric workflow, I wanted to share
> > >>>> this with the group:
> > >>>> 
> > >>>> https://gitlab.com/nhorman/git-lab-porcelain
> > >>>> 
> > >>>> It's still very rough, and is currently focused on RH-based workflows,
> > >>>> but it could fairly easily be adapted to generic projects, if there's
> > >>>> interest, as well as to services other than GitLab (GitHub, etc.).
> > >>>> 
> > >>>> The principle is pretty straightforward (at least currently): it's a git
> > >>>> porcelain that couples creating a merge request with sending patch
> > >>>> emails.  It uses the GitLab REST API to fork projects and manipulate MRs
> > >>>> in sync with email patch posting.  It also contains an email listener
> > >>>> daemon that monitors the requisite lists for ACK/NACK responses, which
> > >>>> can then be translated into MR metadata for true MR approvals and
> > >>>> notifications to the maintainer that a branch is good to merge.
> > >>> 
> > >>> This is a great idea.
> > >>> 
> > >>>> Ostensibly, if this has any sort of legs, the long-term idea is to add
> > >>>> the ability to use the porcelain to do reviews on the command line and
> > >>>> eventually phase out email entirely, but I think that's a significant
> > >>>> way off.
> > >>> 
> > >>> Until this part. Phasing out email in favor of a centralized solution
> > >>> like Gitlab would be a stark regression.
> > >>
> > >> Well, that by no means has to happen (at least not in my mind).  I
> > >> wouldn't have an issue with maintaining a mailing list in perpetuity.
> > >> I only mean to say that if it becomes common practice to use the git
> > >> interface to perform reviews, and email usage becomes less necessary,
> > >> a given project could choose to phase it out.
> > > 
> > > My opinion on this is that if anyone wants to move towards a more
> > > git-centric workflow, be it for review, pull/merge requests, or anything
> > > else, we will have to figure out a way to make this decentralised and
> > > not bound to a single server instance. Without interoperability between
> > > servers and decentralisation, the result will be vendor lock-in, and
> > > that's a no-go for a large part of the community.
> >
> > I think that's a bit of an overstatement.
> > 
> > Yes, we definitely want to avoid vendor lock-in, and decentralization is
> > definitely a needed aspect of any highly parallelized workflow.  That said:
> > 
> > 1) Regarding vendor lock-in: if you want to work with any
> > workflow-as-a-service provider, you're going to want tooling that talks
> > to it (via the web UI, the command line, etc.).  But in the end, you're
> > going to have to talk to that service.  That's all the tooling I
> > presented does.  And if you look at the REST APIs for the major available
> > services (GitLab/GitHub), while they differ, their general objects and
> > operations are sufficiently similar that they can be abstracted through
> > tooling, such that the same tool can be adapted to either.  One would
> > imagine that any to-be-created service would have sufficiently similar
> > operations to also be adaptable to a generic set of operations.
> 
> Then let's create such a tool that also supports e-mail workflows, to
> really offer a choice.
> 

Well, this tool does that, by wrapping up merge request creation with
git-send-email, so that you get both a traditional patch posting and a
merge request.  It also has an email monitor daemon that listens for
responses and formatted text, and takes action in gitlab against
responses to those postings.  I'm not sure what more you're looking for.
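
To make that concrete, the heart of such a listener is just classifying
list replies by their review trailers.  A minimal sketch (the trailer
names and the ack-vs-nack policy here are my assumptions, not
necessarily what emaillistener.py actually implements):

```python
import re

# Trailers conventionally used on kernel lists; the ack-vs-nack policy
# below is an assumed simplification, not the project's actual logic.
ACK_RE = re.compile(r"^(Acked-by|Reviewed-by):\s*(.+)$", re.M)
NACK_RE = re.compile(r"^Nacked-by:\s*(.+)$", re.M)

def classify_reply(body: str) -> str:
    """Return 'nack', 'ack', or 'none' for one list reply."""
    if NACK_RE.search(body):
        return "nack"  # any NACK outweighs collected ACKs
    if ACK_RE.search(body):
        return "ack"
    return "none"
```

A daemon built on this could then translate an "ack" result into MR
metadata, for example via GitLab's merge request approvals endpoint;
the exact update is service-specific.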

Though if the concern is choice... I'm not sure how the choice to
continue using just email isn't still available.  Just keep using it for
your project.

> > 2) Regarding decentrallization, the advantage of the decentrailzation of git
> > lies in its ability for users to house their own local copies of a git tree, not
> > so much in the ability to have multiple git servers (though the latter is
> > important too).  In either case, the use of gitlab or github doesn't enjoin you
> > from doing that.  You can still move your git tree around between clients and
> > servers however you see fit.
> 
> But it forces me to have one account with every git hosting service that
> the projects I want to contribute to happen to select. Sure, we could
> fix that by deciding to move the entire free software development to a
> single git hosting provider, but that creates a vendor lock in issue
> that few people would be comfortable with (including myself).
> 

Well, yes, but that's not the fault of the tooling; that's a choice on
the part of the project leadership.  I'm just suggesting that, if you
want to use a merge request based workflow, this tool can make the
transition to that workflow easier.  That's orthogonal, though, to the
access requirements for any given service.  I can't imagine a merge
request based workflow (short of simple pull requests via git) that
wouldn't require some additional authentication.  Even with a tool like
git-appraise (which stores review metadata in git notes), you need to
write your notes to a publicly accessible server, and having write
access comes with some level of authentication and permission (i.e.
git+ssh or some such).  I don't think anyone is going to go for a
solution in which the SCM is world writable.
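
For reference, the git-notes mechanism that git-appraise builds on looks
roughly like this.  The notes ref name echoes git-appraise's convention,
but the JSON fields are invented for illustration, and publishing the
note still means pushing refs/notes/* to a server you have write access
to:

```python
import json
import subprocess
import tempfile

GIT = ("git", "-c", "user.name=Demo", "-c", "user.email=demo@example.com")

def sh(*args, cwd):
    """Run a git subcommand in cwd and return its stdout."""
    return subprocess.run(GIT + args, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

# Throwaway repo with a single empty commit to annotate.
repo = tempfile.mkdtemp()
sh("init", "-q", cwd=repo)
sh("commit", "--allow-empty", "-q", "-m", "demo commit", cwd=repo)

# Attach review metadata to the commit as a JSON git note; the notes
# ref mirrors git-appraise's naming, but the JSON fields are invented.
meta = {"reviewer": "demo@example.com", "verdict": "lgtm"}
sh("notes", "--ref=devtools/reviews", "add", "-m", json.dumps(meta),
   "HEAD", cwd=repo)

# Anyone who can fetch refs/notes/devtools/reviews can read it back;
# writing it to a shared server still requires push access there.
stored = json.loads(sh("notes", "--ref=devtools/reviews", "show",
                       "HEAD", cwd=repo))
```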

> > The thing to consider outside of that is the exportability of the data that
> > resides outside of git - that is to say, if you want to move from gitlab to
> > github, or from either to some 3rd service, or to a home built service, is there
> > a way to export/import all that merge request and issue data, and honestly I
> > don't know the answer to that, yet.  I can see ways that it could be done, but
> > am completely unsure as to how it should be done.
> 
> There's one way, and we have it today, it's called e-mail :-) Jokes
> aside, a bi-directional e-mail gateway would allow interoperability
> between services, and wouldn't require me to create an account with the
> hosting provider you happen to favour, or the other way around. The
> issue with e-mail gateways is that e-mail clients (and users) are
> notorously bad at keeping formatting intact, so the e-mail to hosted
> service direction is pretty unreliable. I would prefer investigating how
> that could be improved instead of picking which vendor we'll get our
> handcuffs from, as it would then benefit everybody.
> 
Check the emaillistener.py utility in the project I shared; it does
exactly this, and I suppose it could be extended to detect new patch
sets and open merge requests on behalf of users running git-send-email,
though that seems like a lot of effort just to avoid participants
creating accounts on services.

> > > How this could be achieved remains to be discussed, and should be an
> > > interesting exercise.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-25 20:50         ` Laurent Pinchart
  2019-09-25 21:54           ` Neil Horman
@ 2019-09-26  0:40           ` Neil Horman
  2019-09-28 22:58             ` Steven Rostedt
  2019-09-26 10:23           ` Geert Uytterhoeven
  2 siblings, 1 reply; 102+ messages in thread
From: Neil Horman @ 2019-09-26  0:40 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Drew DeVault, workflows

On Wed, Sep 25, 2019 at 11:50:36PM +0300, Laurent Pinchart wrote:
> Hi Neil,
> 
> On Tue, Sep 24, 2019 at 06:25:02PM -0400, Neil Horman wrote:
> > On Tue, Sep 24, 2019 at 11:24:23PM +0300, Laurent Pinchart wrote:
> > > On Tue, Sep 24, 2019 at 02:53:12PM -0400, Neil Horman wrote:
> > >> On Tue, Sep 24, 2019 at 02:37:28PM -0400, Drew DeVault wrote:
> > >>> On Tue Sep 24, 2019 at 2:25 PM Neil Horman wrote:
> > >>>> 	After hearing at LPC that that there was a group investigating moving
> > >>>> some upstream development to a less email-centric workfow, I wanted to share
> > >>>> this with the group:
> > >>>> 
> > >>>> https://gitlab.com/nhorman/git-lab-porcelain
> > >>>> 
> > >>>> Its still very rough, and is focused on working with RH based workflows
> > >>>> currently, but it can pretty easily be adapted to generic projects, if theres
> > >>>> interest, as well as to other services besides gitlab (github/etc).
> > >>>> 
> > >>>> The principle is pretty straightforward (at least currently), its a git
> > >>>> porcelain that wraps up the notion of creating a merge request with sending
> > >>>> patch emails.  It uses the gitlab rest api to fork projects, and manipulate MR's
> > >>>> in sync with email patch posting.  It also contains an email listener daemon to
> > >>>> monitor reqisite lists for ACK/NACK responses which can then be translated into
> > >>>> MR metadata for true MR approvals/notifications to the maintainer that a branch
> > >>>> is good to merge.
> > >>> 
> > >>> This is a great idea.
> > >>> 
> > >>>> Ostensibly, if this has any sort of legs, the idea in the long term is to add
> > >>>> the ability to use the porcelain to do reviews on the command line, and
> > >>>> eventually phase out email entirely, but I think thats a significant way off
> > >>>> here.
> > >>> 
> > >>> Until this part. Phasing out email in favor of a centralized solution
> > >>> like Gitlab would be a stark regression.
> > >>
> > >> Well, that by no rights has to happen (at least not in my mind).  I wouldn't
> > >> have an issue with maintaining a mailing list in perpituity.  I only mean to say
> > >> that if common practice becomes to us the git interface to preform reviews, and
> > >> email usage becomes less needed, a given project could choose to phase it out.
> > > 
> > > My opinion on this is that if anyone wants to move towards a more
> > > git-centric workflow, be it for review, pull/merge requests, or anything
> > > else, we will have to figure out a way to make this decentralised and
> > > not bound to a single server instance. Without interoperability between
> > > servers and decentralisation, the result will be vendor lock-in, and
> > > that's a no-go for a large part of the community.
> >
> > I think thats a bit of an overstatement.
> > 
> > Yes, we definately want to avoid vendor lock in, and decentralization is
> > definately a needed aspect of any highly parallelized workflow.  That said:
> > 
> > 1) Regarding vendor lock in, if you want to work with any workflow-as-a-service
> > provider, your going to want tooling that talks to it (via the web ui, the
> > command line, etc).  But in the end, you're going to have to talk to that
> > service.  Thats all this tooling I presented does.  And if you look at the REST
> > apis for the major available services (gitlab/github), while they differ, their
> > general objects and operations are sufficiently simmilar that they can be
> > abstracted through tooling such that the same tool can be adapted to either.
> > One would imagine that any to-be-created service would have sufficiently
> > simmilar operations to also be adaptable to a generic set of operations
> 
> Then let's create such a tool that also supports e-mail workflows, to
> really offer a choice.
> 
> > 2) Regarding decentrallization, the advantage of the decentrailzation of git
> > lies in its ability for users to house their own local copies of a git tree, not
> > so much in the ability to have multiple git servers (though the latter is
> > important too).  In either case, the use of gitlab or github doesn't enjoin you
> > from doing that.  You can still move your git tree around between clients and
> > servers however you see fit.
> 
> But it forces me to have one account with every git hosting service that
> the projects I want to contribute to happen to select. Sure, we could
> fix that by deciding to move the entire free software development to a
> single git hosting provider, but that creates a vendor lock in issue
> that few people would be comfortable with (including myself).
> 
> > The thing to consider outside of that is the exportability of the data that
> > resides outside of git - that is to say, if you want to move from gitlab to
> > github, or from either to some 3rd service, or to a home built service, is there
> > a way to export/import all that merge request and issue data, and honestly I
> > don't know the answer to that, yet.  I can see ways that it could be done, but
> > am completely unsure as to how it should be done.
> 
> There's one way, and we have it today, it's called e-mail :-) Jokes
> aside, a bi-directional e-mail gateway would allow interoperability
> between services, and wouldn't require me to create an account with the
> hosting provider you happen to favour, or the other way around. The
> issue with e-mail gateways is that e-mail clients (and users) are
> notorously bad at keeping formatting intact, so the e-mail to hosted
> service direction is pretty unreliable. I would prefer investigating how
> that could be improved instead of picking which vendor we'll get our
> handcuffs from, as it would then benefit everybody.
> 
> > > How this could be achieved remains to be discussed, and should be an
> > > interesting exercise.
> 
I'm sorry for responding to this again, but I've been thinking about
it, and the more I think about it, the more I believe that an
email-to-forge gateway, while an interesting idea and probably a
generally useful tool, isn't going to have real long-term legs.  And
the problem isn't technical, it's social.

By way of example, let's assume we have such an email-to-forge gateway,
and it works well.

Consider DaveM (using him here as an example, as he's been socializing
the idea of moving net and net-next to github lately, but substitute
the name of any maintainer or project here).

If there is a desire to move to a forge based solution, eventually the
maintainer(s) will do so based on the best interest of the project,
possibly with input from the general community, but mostly based on
what makes the project more efficient.  In the case of Dave, it's my
understanding that he wants to make the move because it will let him
track and merge patches more easily, and is less likely to lose them.
It also alleviates the pressure of having to maintain such a high
volume mailing list as netdev.

Eventually, barring any really significant objection, he's going to
make the switch, and users will either have to get github accounts or
stop participating in netdev development.  The use of an email gateway
here would arguably smooth the transition, but the cost tradeoff of
doing so isn't really in Dave's favor.  In order to use such a gateway,
he would have to stand up an additional server, run the gateway, and
create an account to interact with the forge in response to received
emails.  All in pursuit of letting contributors avoid the tool that he
as maintainer felt would improve the project's development process.  It
doesn't seem like a reasonable tradeoff to me.

I wrote parts of such an email gateway for my porcelain project above
because, internally to RH, there is value in smoothing that transition
for the rest of our business processes, but I don't think that's
particularly applicable to many upstream projects.

That's not to say it's not a worthwhile tool to try to write.  Some
projects may find it useful.  But it's going to be an opt-in solution,
and not one likely to be adopted unilaterally.  What we need is a
method to work with newer forge-like workflows, not something to make
those workflows look like our existing email workflow.

Best
Neil



* Re: thoughts on a Merge Request based development workflow
  2019-09-25 20:50         ` Laurent Pinchart
  2019-09-25 21:54           ` Neil Horman
  2019-09-26  0:40           ` Neil Horman
@ 2019-09-26 10:23           ` Geert Uytterhoeven
  2019-09-26 13:43             ` Neil Horman
  2 siblings, 1 reply; 102+ messages in thread
From: Geert Uytterhoeven @ 2019-09-26 10:23 UTC (permalink / raw)
  To: Laurent Pinchart; +Cc: Neil Horman, Drew DeVault, workflows

Hi Laurent,

On Thu, Sep 26, 2019 at 11:58 AM Laurent Pinchart
<laurent.pinchart@ideasonboard.com> wrote:
> [...]
>
> There's one way, and we have it today, it's called e-mail :-) Jokes
> aside, a bi-directional e-mail gateway would allow interoperability
> between services, and wouldn't require me to create an account with the
> hosting provider you happen to favour, or the other way around. The
> issue with e-mail gateways is that e-mail clients (and users) are
> notorously bad at keeping formatting intact, so the e-mail to hosted
> service direction is pretty unreliable. I would prefer investigating how
> that could be improved instead of picking which vendor we'll get our
> handcuffs from, as it would then benefit everybody.

Would it make sense to allow the user to run the web based merge
request tool locally (think git instaweb), and send it out as an emailed
patch series?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: thoughts on a Merge Request based development workflow
  2019-09-26 10:23           ` Geert Uytterhoeven
@ 2019-09-26 13:43             ` Neil Horman
  0 siblings, 0 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-26 13:43 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Laurent Pinchart, Drew DeVault, workflows

On Thu, Sep 26, 2019 at 12:23:52PM +0200, Geert Uytterhoeven wrote:
> Hi Laurent,
> 
> On Thu, Sep 26, 2019 at 11:58 AM Laurent Pinchart
> <laurent.pinchart@ideasonboard.com> wrote:
> > [...]
>
> Would it make sense to allow the user to run the web based merge
> request tool locally (think git instaweb), and send it out as an emailed
> patch series?
> 
> Gr{oetje,eeting}s,
> 
If I understand your suggestion correctly, that's what the tooling
project I posted earlier does.  It creates a git porcelain command (git
lab) with, among other subcommands, a git lab createmr command that:
a) uses the gitlab rest api to create a merge request from your local
git tree
and
b) emails the same patch series to a pre-configured list
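
As a sketch of those two steps (the endpoint shape follows GitLab's
documented merge request API, but the project id, token handling, and
branch names here are placeholders, not the porcelain's real code):

```python
import json
import subprocess
import urllib.request

def mr_payload(source_branch, target_branch, title):
    """Body for GitLab's POST /projects/:id/merge_requests call."""
    return {"source_branch": source_branch,
            "target_branch": target_branch,
            "title": title}

def create_mr_and_mail(api_url, project_id, token, branch, title, to_addr):
    # (a) open the merge request through the REST API
    req = urllib.request.Request(
        f"{api_url}/projects/{project_id}/merge_requests",
        data=json.dumps(mr_payload(branch, "master", title)).encode(),
        headers={"PRIVATE-TOKEN": token,
                 "Content-Type": "application/json"})
    urllib.request.urlopen(req).close()
    # (b) mail the same series to the list with stock git tooling
    subprocess.run(["git", "send-email", f"--to={to_addr}",
                    f"master..{branch}"], check=True)
```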

Neil



* Re: thoughts on a Merge Request based development workflow
  2019-09-26  0:40           ` Neil Horman
@ 2019-09-28 22:58             ` Steven Rostedt
  2019-09-28 23:16               ` Dave Airlie
  2019-09-29 11:57               ` Neil Horman
  0 siblings, 2 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-09-28 22:58 UTC (permalink / raw)
  To: Neil Horman; +Cc: Laurent Pinchart, Drew DeVault, workflows

On Wed, 25 Sep 2019 20:40:45 -0400
Neil Horman <nhorman@tuxdriver.com> wrote:

> Eventually, barring any really significant objection, hes going to make
> the switch, and users will either have to get github accounts, or stop
> participating in netdev development.  

That would be a very sad day if that happened.

Whatever service is chosen should have an email interface.  For
example, if I get a message from bugzilla.kernel.org, I can reply back
via email and it is inserted into the tool (as I see my out-of-office
messages going into it; I need to fix my scripts not to reply to
bugzilla).

I set up patchwork on my INBOX, as I'm having a hard time separating
patches from the noise, and it works really well.  I would love to be
able to push my patchwork list to a public place so that others can see
it too.  As mentioned at the Maintainers Summit, it would be great to
be able to pull patchwork down to my laptop, get on a plane, process a
bunch of patches while flying, and then, when I land, push the updates
to the public server.

That's pretty much all I'm looking for.
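
Something close to this is already scriptable against Patchwork's REST
API.  A rough sketch of the snapshot-then-triage-offline idea (the
project name and state values are illustrative, and the endpoint shape
follows Patchwork 2.x's API):

```python
import json
import urllib.request

def pending(patches):
    """Offline step: from a cached JSON dump, list patches still
    needing review (the state names here are illustrative)."""
    return [p["name"] for p in patches
            if p["state"] in ("new", "under-review")]

def snapshot(project="netdev",
             base="https://patchwork.kernel.org/api/1.1/patches/"):
    """Online step: grab a project's queue before getting on the plane."""
    with urllib.request.urlopen(f"{base}?project={project}&state=new") as r:
        return json.load(r)
```

The missing piece, as noted above, is pushing the triage decisions back
to the public instance afterwards.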

-- Steve


* Re: thoughts on a Merge Request based development workflow
  2019-09-28 22:58             ` Steven Rostedt
@ 2019-09-28 23:16               ` Dave Airlie
  2019-09-28 23:52                 ` Steven Rostedt
  2019-10-01  3:22                 ` Daniel Axtens
  2019-09-29 11:57               ` Neil Horman
  1 sibling, 2 replies; 102+ messages in thread
From: Dave Airlie @ 2019-09-28 23:16 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Neil Horman, Laurent Pinchart, Drew DeVault, workflows

On Sun, 29 Sep 2019 at 09:10, Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 25 Sep 2019 20:40:45 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
>
> > Eventually, barring any really significant objection, hes going to make
> > the switch, and users will either have to get github accounts, or stop
> > participating in netdev development.
>
> That will be a very sad day if that happened.
>
> Whatever service should have an email interface. For example, if I get
> a message from bugzilla.kernel.org, I can reply back via email and it
> is inserted into the tool (as I see my Out of office messages going
> into it. I need to fix my scripts not to reply to bugzilla).
>
> I set up patchwork on my INBOX, as I'm having a hard time of separating
> patches from the noise. And it works really well. I would love to be
> able to push my patchwork list to a public place so that others can see
> it too. As mentioned in the Maintainers Summit, it would be great to be
> able to pull patchwork down to my laptop, get on the plane, process a
> bunch of patches while flying, and then when I land, I could push the
> updates to the public server.
>
> That's pretty much all I'm looking for.

How many patches does your workflow handle, btw? 20 a month? 50?

I think the reason davem and my group have an interest in using
git(hub/lab) is that our patch counts are way higher.  You guys are
inventing solutions for your problems, which is great, but they don't
scale.

Patchwork as currently sold still requires someone to spend a lot of
time cleaning it up, which is fine if you get 20-30 mails; when you get
1-2k mails, patchwork's manual interactions end up taking a large chunk
of time.  And then there is the fact that, while there is nominally one
patchwork, everyone has forked it to add their favourite features.
Fixing that requires someone to spend time on a reboot and go around
bringing all the forks back to a central line, which is a significantly
larger task than if it had been maintained centrally in the first
place, because now everyone has their own niche hacks and cool features
they can't do without, all different from everyone else's.

Dave.


* Re: thoughts on a Merge Request based development workflow
  2019-09-28 23:16               ` Dave Airlie
@ 2019-09-28 23:52                 ` Steven Rostedt
  2019-10-01  3:22                 ` Daniel Axtens
  1 sibling, 0 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-09-28 23:52 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Neil Horman, Laurent Pinchart, Drew DeVault, workflows

On Sun, 29 Sep 2019 09:16:53 +1000
Dave Airlie <airlied@gmail.com> wrote:


> How many patches is your workflow btw? 20 a month? 50?

For ftrace only, yeah. But for all the patches I'm Cc'd on (and would
like to review), it's more like several hundred.

> 
> I think the reason davem and my group have in using git(hub/lab) is
> our patch counts are way higher. You guys are inventing solutions for
> your problems that's great, but they don't scale.
> 
> Patchwork as currently sold still requires someone to spend time
> cleaning it up a lot, which is fine if you get 20-30 mails, when you
> 1-2k mails patchwork manual interactions end up taking a large chunk
> of time. The and the fact that there is one patchwork, everyone has
> forked it to add their favourite features. Unless someone spends time
> on a reboot and goes around bringing all the forks back to a central
> line, which is is a significantly larger task than if it has been
> maintained in the first place, because now everyone has their own
> niche hacks and cool features they can't do without, but are all
> different than everyone elses.

I think you misunderstood what I was saying.  It's not that "I want
patchwork", but that I want something that can do things offline.  I
would love a reboot of patchwork, because it was an extreme pain to get
working.  The focus of my email was on getting something that works
locally and can be pushed public.  The only reason I mentioned
patchwork was that I was able to get it working locally (with a bunch
of hacks!).

Great, let's get something that works for you and Dave that handles 1000
patches a month, but most maintainers do not have that big a queue.  I
only ask for something that lets me manage patches offline, without
depending on a service for it.

-- Steve


* Re: thoughts on a Merge Request based development workflow
  2019-09-28 22:58             ` Steven Rostedt
  2019-09-28 23:16               ` Dave Airlie
@ 2019-09-29 11:57               ` Neil Horman
  2019-09-29 12:55                 ` Dmitry Vyukov
  1 sibling, 1 reply; 102+ messages in thread
From: Neil Horman @ 2019-09-29 11:57 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Laurent Pinchart, Drew DeVault, workflows

On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> On Wed, 25 Sep 2019 20:40:45 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
> 
> > Eventually, barring any really significant objection, hes going to make
> > the switch, and users will either have to get github accounts, or stop
> > participating in netdev development.  
> 
> That will be a very sad day if that happened.
> 
> Whatever service should have an email interface. For example, if I get
> a message from bugzilla.kernel.org, I can reply back via email and it
> is inserted into the tool (as I see my Out of office messages going
> into it. I need to fix my scripts not to reply to bugzilla).
> 
Forge solutions do have the ability to use email as an interface to
issue tracking; that's not a problem.  What they don't currently seem to
have is the ability to emulate patch review workflows.  And that's not
to say they couldn't, but it seems to me that they haven't prioritized
it because they offer several different types of comment options
(commenting in the pull request discussion(s) themselves vs. commenting
on code, etc.).  If they were to implement that, I think a lot of this
would become a lot easier.

> I set up patchwork on my INBOX, as I'm having a hard time of separating
> patches from the noise. And it works really well. I would love to be
> able to push my patchwork list to a public place so that others can see
> it too. As mentioned in the Maintainers Summit, it would be great to be
> able to pull patchwork down to my laptop, get on the plane, process a
> bunch of patches while flying, and then when I land, I could push the
> updates to the public server.
> 
> That's pretty much all I'm looking for.
> 
I think what you are looking for here is a way to pull down a set of
merge requests, review and merge those you approve, and push them back
when you are back online?  I think you can do at least some of that.
Forge solutions (definitely gitlab, likely github) allow you to pull
a merge request reference namespace (on gitlab it's
refs/merge-requests/<merge_request_id>/head).  You can merge whatever
head there you like into its intended target branch, and when you push,
the corresponding MR will be updated to the MERGED state.  What you
can't currently do is make a comment on an MR, store that comment in
git, and then have the MR updated with those comments.  That would be a
great item to make that feature more complete.
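The ref-namespace round trip described above can be sketched with plain
git refspecs.  The path refs/merge-requests/<id>/head is GitLab's
documented convention (GitHub uses refs/pull/<id>/head); the bare
repository below is only a local stand-in for the forge, so the whole
thing runs offline:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# A bare repository standing in for the forge-hosted project.
git init -q --bare "$tmp/forge.git"
git -C "$tmp/forge.git" symbolic-ref HEAD refs/heads/master

# Seed it with a master branch and a simulated merge-request ref.
git init -q "$tmp/work" && cd "$tmp/work"
git -c user.email=dev@example.org -c user.name=dev \
    commit -q --allow-empty -m base
git push -q "$tmp/forge.git" HEAD:refs/heads/master
git -c user.email=dev@example.org -c user.name=dev \
    commit -q --allow-empty -m feature
git push -q "$tmp/forge.git" HEAD:refs/merge-requests/1/head

# The maintainer's clone: map every MR head onto a local
# remote-tracking ref, so all open MRs can be reviewed offline.
git clone -q "$tmp/forge.git" "$tmp/maint" && cd "$tmp/maint"
git config --add remote.origin.fetch \
    '+refs/merge-requests/*/head:refs/remotes/origin/mr/*'
git fetch -q origin
git log --oneline origin/mr/1
```

From here the maintainer can `git merge origin/mr/1` while offline and
push the result once back on the network; the forge sees the MR commits
become reachable from the target branch and marks the MR merged.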

Neil

> -- Steve
> 


* Re: thoughts on a Merge Request based development workflow
  2019-09-29 11:57               ` Neil Horman
@ 2019-09-29 12:55                 ` Dmitry Vyukov
  2019-09-30  1:00                   ` Neil Horman
  2019-09-30 14:51                   ` Theodore Y. Ts'o
  0 siblings, 2 replies; 102+ messages in thread
From: Dmitry Vyukov @ 2019-09-29 12:55 UTC (permalink / raw)
  To: Neil Horman; +Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
>
> On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> > On Wed, 25 Sep 2019 20:40:45 -0400
> > Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > > Eventually, barring any really significant objection, hes going to make
> > > the switch, and users will either have to get github accounts, or stop
> > > participating in netdev development.
> >
> > That will be a very sad day if that happened.
> >
> > Whatever service should have an email interface. For example, if I get
> > a message from bugzilla.kernel.org, I can reply back via email and it
> > is inserted into the tool (as I see my Out of office messages going
> > into it. I need to fix my scripts not to reply to bugzilla).
> >
> Forge solutions do have the ability to use email as an interface to
> issue tracking, thats not a problem.  What they don't currently seem to
> have is the ability to emulate patch review workflows.  And thats not to
> say they couldn't, but it seems to me that they haven't prioritized that
> because they offer several different types of comment options
> (commenting in the pull request discussion(s) themselves vs commenting
> on code, etc.  If they sould implement that, I think alot of this would
> become alot easier.
>
> > I set up patchwork on my INBOX, as I'm having a hard time of separating
> > patches from the noise. And it works really well. I would love to be
> > able to push my patchwork list to a public place so that others can see
> > it too. As mentioned in the Maintainers Summit, it would be great to be
> > able to pull patchwork down to my laptop, get on the plane, process a
> > bunch of patches while flying, and then when I land, I could push the
> > updates to the public server.
> >
> > That's pretty much all I'm looking for.
> >
> I think what you are looking for here is a way to pull down a set of
> merge requests, review and merge those you approve, and push them back
> when you are back online?  I think you can do at least some of that.
> Forge solutions (definately gitlab, likely github), allow you to pull
> a merge request reference namespace (on gitlab its
> heads/merge_requests/<merge_request_id>).  You can merge whatever head
> there you like to its intended target branch, and when you push, it will
> update the corresponding MR to the MERGED state.  What you can't
> currently do is make a comment on an MR, store that comment in git and
> then have the MR updated with those comments.  That would be a great
> item to make that feature more complete.

One mismatch with the kernel dev process that seems to exist in lots of
existing solutions (gerrit, git-appraise, github, gitlab) is that they
are centered around a single "main" git tree (in particular,
gerrit/git-appraise check their metainfo right into that repo), whereas
the kernel has lots of trees.  Now if Steve is CCed on lots of changes,
which git tree should he pull before boarding a plane?  For some changes
it may be unclear what tree they should go into initially, or that may
change over time.  Then there are additional relations with the stable
trees.
I suspect that kernel tooling should account for that and separate the
changes layer from the exact git trees, much like mailing lists do:
usually there is one mailing list and one git tree per subsystem, but
that relation is not fixed, and one can always CC another mailing list,
retarget the change, etc.
What do you think?


* Re: thoughts on a Merge Request based development workflow
  2019-09-29 12:55                 ` Dmitry Vyukov
@ 2019-09-30  1:00                   ` Neil Horman
  2019-09-30  6:05                     ` Dmitry Vyukov
  2019-09-30 21:02                     ` Konstantin Ryabitsev
  2019-09-30 14:51                   ` Theodore Y. Ts'o
  1 sibling, 2 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-30  1:00 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
> On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> > > On Wed, 25 Sep 2019 20:40:45 -0400
> > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > > Eventually, barring any really significant objection, hes going to make
> > > > the switch, and users will either have to get github accounts, or stop
> > > > participating in netdev development.
> > >
> > > That will be a very sad day if that happened.
> > >
> > > Whatever service should have an email interface. For example, if I get
> > > a message from bugzilla.kernel.org, I can reply back via email and it
> > > is inserted into the tool (as I see my Out of office messages going
> > > into it. I need to fix my scripts not to reply to bugzilla).
> > >
> > Forge solutions do have the ability to use email as an interface to
> > issue tracking, thats not a problem.  What they don't currently seem to
> > have is the ability to emulate patch review workflows.  And thats not to
> > say they couldn't, but it seems to me that they haven't prioritized that
> > because they offer several different types of comment options
> > (commenting in the pull request discussion(s) themselves vs commenting
> > on code, etc.  If they sould implement that, I think alot of this would
> > become alot easier.
> >
> > > I set up patchwork on my INBOX, as I'm having a hard time of separating
> > > patches from the noise. And it works really well. I would love to be
> > > able to push my patchwork list to a public place so that others can see
> > > it too. As mentioned in the Maintainers Summit, it would be great to be
> > > able to pull patchwork down to my laptop, get on the plane, process a
> > > bunch of patches while flying, and then when I land, I could push the
> > > updates to the public server.
> > >
> > > That's pretty much all I'm looking for.
> > >
> > I think what you are looking for here is a way to pull down a set of
> > merge requests, review and merge those you approve, and push them back
> > when you are back online?  I think you can do at least some of that.
> > Forge solutions (definately gitlab, likely github), allow you to pull
> > a merge request reference namespace (on gitlab its
> > heads/merge_requests/<merge_request_id>).  You can merge whatever head
> > there you like to its intended target branch, and when you push, it will
> > update the corresponding MR to the MERGED state.  What you can't
> > currently do is make a comment on an MR, store that comment in git and
> > then have the MR updated with those comments.  That would be a great
> > item to make that feature more complete.
> 
> One mismatch with kernel dev process that seem to be there for lots of
> existing solutions (gerrit, git-appraise, github, gitlab) is that they
> are centered around a single "mail" git tree (in particular,
> gerrit/git-appraise check in metainfo right into that repo). Whereas
> kernel has lots of kernels. Now if Steve is CCed on lots of changes
> what git tree should he pull before boarding a place? For some changes
> it may be unclear what tree they should go into initially, or that may
> change over time. Then, there are some additional relations with
> stable trees.
> I suspect that kernel tooling should account for that and separate
> changes layer from exact git trees. Like mailing lists. Usually there
> is 1 mailing list and 1 git tree per subsystem, but still this
> relation is not fixed and one can always CC another mailing list, or
> retarget the change, etc.
> What do you think?
> 
I agree that newer review solutions (of the type you enumerated) rely
on centralization of information, which is undesirable in many cases,
but I'm not sure how to avoid that.

Just thinking off the top of my head, I wonder if a tool that converted
all forge-type conversations to git notes would be useful here.  Those
could then be pulled by individuals for review and update?
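Git notes may indeed fit that: `git notes` attaches free-form text to a
commit under a separate ref (refs/notes/commits by default) without
rewriting the commit, and that ref can be fetched and pushed like any
other.  A minimal, self-contained sketch of recording a review comment
this way (the names and comment text are made up for illustration):

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo" && cd "$tmp/repo"
git -c user.email=rev@example.org -c user.name=rev \
    commit -q --allow-empty -m "add frobnicator"

# Attach a review comment to the commit; the commit SHA is unchanged.
git -c user.email=rev@example.org -c user.name=rev \
    notes add -m "Reviewed-by: Some Reviewer; loop bound looks off" HEAD

# Notes live under their own ref and can be synced explicitly, e.g.:
#   git push <remote> refs/notes/commits
git notes show HEAD
```

Concurrent note updates can be reconciled with `git notes merge`,
though sharing a writable notes ref across many contributors raises the
same authentication questions as any other writable ref.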

Neil



* Re: thoughts on a Merge Request based development workflow
  2019-09-30  1:00                   ` Neil Horman
@ 2019-09-30  6:05                     ` Dmitry Vyukov
  2019-09-30 12:55                       ` Neil Horman
  2019-09-30 21:02                     ` Konstantin Ryabitsev
  1 sibling, 1 reply; 102+ messages in thread
From: Dmitry Vyukov @ 2019-09-30  6:05 UTC (permalink / raw)
  To: Neil Horman; +Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Mon, Sep 30, 2019 at 3:01 AM Neil Horman <nhorman@tuxdriver.com> wrote:
>
> On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
> > On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> > > > On Wed, 25 Sep 2019 20:40:45 -0400
> > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > >
> > > > > Eventually, barring any really significant objection, hes going to make
> > > > > the switch, and users will either have to get github accounts, or stop
> > > > > participating in netdev development.
> > > >
> > > > That will be a very sad day if that happened.
> > > >
> > > > Whatever service should have an email interface. For example, if I get
> > > > a message from bugzilla.kernel.org, I can reply back via email and it
> > > > is inserted into the tool (as I see my Out of office messages going
> > > > into it. I need to fix my scripts not to reply to bugzilla).
> > > >
> > > Forge solutions do have the ability to use email as an interface to
> > > issue tracking, thats not a problem.  What they don't currently seem to
> > > have is the ability to emulate patch review workflows.  And thats not to
> > > say they couldn't, but it seems to me that they haven't prioritized that
> > > because they offer several different types of comment options
> > > (commenting in the pull request discussion(s) themselves vs commenting
> > > on code, etc.  If they sould implement that, I think alot of this would
> > > become alot easier.
> > >
> > > > I set up patchwork on my INBOX, as I'm having a hard time of separating
> > > > patches from the noise. And it works really well. I would love to be
> > > > able to push my patchwork list to a public place so that others can see
> > > > it too. As mentioned in the Maintainers Summit, it would be great to be
> > > > able to pull patchwork down to my laptop, get on the plane, process a
> > > > bunch of patches while flying, and then when I land, I could push the
> > > > updates to the public server.
> > > >
> > > > That's pretty much all I'm looking for.
> > > >
> > > I think what you are looking for here is a way to pull down a set of
> > > merge requests, review and merge those you approve, and push them back
> > > when you are back online?  I think you can do at least some of that.
> > > Forge solutions (definately gitlab, likely github), allow you to pull
> > > a merge request reference namespace (on gitlab its
> > > heads/merge_requests/<merge_request_id>).  You can merge whatever head
> > > there you like to its intended target branch, and when you push, it will
> > > update the corresponding MR to the MERGED state.  What you can't
> > > currently do is make a comment on an MR, store that comment in git and
> > > then have the MR updated with those comments.  That would be a great
> > > item to make that feature more complete.
> >
> > One mismatch with kernel dev process that seem to be there for lots of
> > existing solutions (gerrit, git-appraise, github, gitlab) is that they
> > are centered around a single "mail" git tree (in particular,
> > gerrit/git-appraise check in metainfo right into that repo). Whereas
> > kernel has lots of kernels. Now if Steve is CCed on lots of changes
> > what git tree should he pull before boarding a place? For some changes
> > it may be unclear what tree they should go into initially, or that may
> > change over time. Then, there are some additional relations with
> > stable trees.
> > I suspect that kernel tooling should account for that and separate
> > changes layer from exact git trees. Like mailing lists. Usually there
> > is 1 mailing list and 1 git tree per subsystem, but still this
> > relation is not fixed and one can always CC another mailing list, or
> > retarget the change, etc.
> > What do you think?
> >
> I agree that newer review solutions (of the type you ennummerated) rely
> on centralization of information, which is undesireable in many cases,
> but I'm not sure how to avoid that.

Well, FWIW the SSB protocol and similar would avoid that:
https://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains

> Just thinking off the top of my head, I wonder if a tool that converted
> all forge type conversations to git notes would be useful here.  Those
> could then be pulled by individuals for review and update?

git-appraise does something similar, but the other way around:
https://github.com/dvyukov/kit/blob/master/doc/references.md#git-appraise
git notes are the _main_ storage, but then it has a bridge to github.

However, it's unclear how to use such a solution for the kernel:
https://lore.kernel.org/ksummit-discuss/9fee1356-cf48-6198-4001-5d9d886fbf88@iogearbox.net/T/#m869c5253d10931823bba74942df7da062a7bbb13
(world writable, force pushes)


* Re: thoughts on a Merge Request based development workflow
  2019-09-30  6:05                     ` Dmitry Vyukov
@ 2019-09-30 12:55                       ` Neil Horman
  2019-09-30 13:20                         ` Nicolas Belouin
  2019-09-30 13:40                         ` Dmitry Vyukov
  0 siblings, 2 replies; 102+ messages in thread
From: Neil Horman @ 2019-09-30 12:55 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Mon, Sep 30, 2019 at 08:05:04AM +0200, Dmitry Vyukov wrote:
> On Mon, Sep 30, 2019 at 3:01 AM Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
> > > On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > > >
> > > > On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> > > > > On Wed, 25 Sep 2019 20:40:45 -0400
> > > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > >
> > > > > > Eventually, barring any really significant objection, hes going to make
> > > > > > the switch, and users will either have to get github accounts, or stop
> > > > > > participating in netdev development.
> > > > >
> > > > > That will be a very sad day if that happened.
> > > > >
> > > > > Whatever service should have an email interface. For example, if I get
> > > > > a message from bugzilla.kernel.org, I can reply back via email and it
> > > > > is inserted into the tool (as I see my Out of office messages going
> > > > > into it. I need to fix my scripts not to reply to bugzilla).
> > > > >
> > > > Forge solutions do have the ability to use email as an interface to
> > > > issue tracking, thats not a problem.  What they don't currently seem to
> > > > have is the ability to emulate patch review workflows.  And thats not to
> > > > say they couldn't, but it seems to me that they haven't prioritized that
> > > > because they offer several different types of comment options
> > > > (commenting in the pull request discussion(s) themselves vs commenting
> > > > on code, etc.  If they sould implement that, I think alot of this would
> > > > become alot easier.
> > > >
> > > > > I set up patchwork on my INBOX, as I'm having a hard time of separating
> > > > > patches from the noise. And it works really well. I would love to be
> > > > > able to push my patchwork list to a public place so that others can see
> > > > > it too. As mentioned in the Maintainers Summit, it would be great to be
> > > > > able to pull patchwork down to my laptop, get on the plane, process a
> > > > > bunch of patches while flying, and then when I land, I could push the
> > > > > updates to the public server.
> > > > >
> > > > > That's pretty much all I'm looking for.
> > > > >
> > > > I think what you are looking for here is a way to pull down a set of
> > > > merge requests, review and merge those you approve, and push them back
> > > > when you are back online?  I think you can do at least some of that.
> > > > Forge solutions (definately gitlab, likely github), allow you to pull
> > > > a merge request reference namespace (on gitlab its
> > > > heads/merge_requests/<merge_request_id>).  You can merge whatever head
> > > > there you like to its intended target branch, and when you push, it will
> > > > update the corresponding MR to the MERGED state.  What you can't
> > > > currently do is make a comment on an MR, store that comment in git and
> > > > then have the MR updated with those comments.  That would be a great
> > > > item to make that feature more complete.
> > >
> > > One mismatch with kernel dev process that seem to be there for lots of
> > > existing solutions (gerrit, git-appraise, github, gitlab) is that they
> > > are centered around a single "mail" git tree (in particular,
> > > gerrit/git-appraise check in metainfo right into that repo). Whereas
> > > kernel has lots of kernels. Now if Steve is CCed on lots of changes
> > > what git tree should he pull before boarding a place? For some changes
> > > it may be unclear what tree they should go into initially, or that may
> > > change over time. Then, there are some additional relations with
> > > stable trees.
> > > I suspect that kernel tooling should account for that and separate
> > > changes layer from exact git trees. Like mailing lists. Usually there
> > > is 1 mailing list and 1 git tree per subsystem, but still this
> > > relation is not fixed and one can always CC another mailing list, or
> > > retarget the change, etc.
> > > What do you think?
> > >
> > I agree that newer review solutions (of the type you ennummerated) rely
> > on centralization of information, which is undesireable in many cases,
> > but I'm not sure how to avoid that.
> 
> Well, FWIW the SSB protocol and similar would avoid that:
> https://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains
> 
I wasn't aware of this framework; that's pretty cool and could be a
potential solution, yes.

That said, it doesn't solve the casual-contributor issue.  From the
section on raising the entry barrier:

"We would need full-featured web clients that would allow someone to browse projects 
in a similar fashion as they would browse them on Git..b, including viewing issues, 
submitting bug reports, and sending patches and pull requests."

Once we start talking about having to maintain web interfaces, and
especially if we are also considering features like CI, it seems to me
we are effectively talking about falling back to a forge solution that
allows data interchange via SSB.  While that's a good feature add, it's
really an addition to a forge solution, not a standalone mechanism.


> > Just thinking off the top of my head, I wonder if a tool that converted
> > all forge type conversations to git notes would be useful here.  Those
> > could then be pulled by individuals for review and update?
> 
> git-appriaise does something similar, but the other way around:
> https://github.com/dvyukov/kit/blob/master/doc/references.md#git-appraise
> git notes is the _main_ storage, but then it has bridge to github.
> 
> However, it's unclear how use such solution for kernel:
> https://lore.kernel.org/ksummit-discuss/9fee1356-cf48-6198-4001-5d9d886fbf88@iogearbox.net/T/#m869c5253d10931823bba74942df7da062a7bbb13
> (world writable, force pushes)
> 
Yeah, I've messed with git-appraise, and that's the issue I've run into:
it forgoes the need for account creation on a forge, but at the expense
of...well, not having any significant meaningful authentication.


* Re: thoughts on a Merge Request based development workflow
  2019-09-30 12:55                       ` Neil Horman
@ 2019-09-30 13:20                         ` Nicolas Belouin
  2019-09-30 13:40                         ` Dmitry Vyukov
  1 sibling, 0 replies; 102+ messages in thread
From: Nicolas Belouin @ 2019-09-30 13:20 UTC (permalink / raw)
  To: Neil Horman, Dmitry Vyukov
  Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows



On 9/30/19 2:55 PM, Neil Horman wrote:
> On Mon, Sep 30, 2019 at 08:05:04AM +0200, Dmitry Vyukov wrote:
>> On Mon, Sep 30, 2019 at 3:01 AM Neil Horman <nhorman@tuxdriver.com> wrote:
>>> On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
>>>> On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
>>>>> On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
>>>>>> On Wed, 25 Sep 2019 20:40:45 -0400
>>>>>> Neil Horman <nhorman@tuxdriver.com> wrote:
>>>>>>
>>>>>>> Eventually, barring any really significant objection, hes going to make
>>>>>>> the switch, and users will either have to get github accounts, or stop
>>>>>>> participating in netdev development.
>>>>>> That will be a very sad day if that happened.
>>>>>>
>>>>>> Whatever service should have an email interface. For example, if I get
>>>>>> a message from bugzilla.kernel.org, I can reply back via email and it
>>>>>> is inserted into the tool (as I see my Out of office messages going
>>>>>> into it. I need to fix my scripts not to reply to bugzilla).
>>>>>>
>>>>> Forge solutions do have the ability to use email as an interface to
>>>>> issue tracking, thats not a problem.  What they don't currently seem to
>>>>> have is the ability to emulate patch review workflows.  And thats not to
>>>>> say they couldn't, but it seems to me that they haven't prioritized that
>>>>> because they offer several different types of comment options
>>>>> (commenting in the pull request discussion(s) themselves vs commenting
>>>>> on code, etc.  If they sould implement that, I think alot of this would
>>>>> become alot easier.
>>>>>
>>>>>> I set up patchwork on my INBOX, as I'm having a hard time of separating
>>>>>> patches from the noise. And it works really well. I would love to be
>>>>>> able to push my patchwork list to a public place so that others can see
>>>>>> it too. As mentioned in the Maintainers Summit, it would be great to be
>>>>>> able to pull patchwork down to my laptop, get on the plane, process a
>>>>>> bunch of patches while flying, and then when I land, I could push the
>>>>>> updates to the public server.
>>>>>>
>>>>>> That's pretty much all I'm looking for.
>>>>>>
>>>>> I think what you are looking for here is a way to pull down a set of
>>>>> merge requests, review and merge those you approve, and push them back
>>>>> when you are back online?  I think you can do at least some of that.
>>>>> Forge solutions (definately gitlab, likely github), allow you to pull
>>>>> a merge request reference namespace (on gitlab its
>>>>> heads/merge_requests/<merge_request_id>).  You can merge whatever head
>>>>> there you like to its intended target branch, and when you push, it will
>>>>> update the corresponding MR to the MERGED state.  What you can't
>>>>> currently do is make a comment on an MR, store that comment in git and
>>>>> then have the MR updated with those comments.  That would be a great
>>>>> item to make that feature more complete.
>>>> One mismatch with kernel dev process that seem to be there for lots of
>>>> existing solutions (gerrit, git-appraise, github, gitlab) is that they
>>>> are centered around a single "mail" git tree (in particular,
>>>> gerrit/git-appraise check in metainfo right into that repo). Whereas
>>>> kernel has lots of kernels. Now if Steve is CCed on lots of changes
>>>> what git tree should he pull before boarding a place? For some changes
>>>> it may be unclear what tree they should go into initially, or that may
>>>> change over time. Then, there are some additional relations with
>>>> stable trees.
>>>> I suspect that kernel tooling should account for that and separate
>>>> changes layer from exact git trees. Like mailing lists. Usually there
>>>> is 1 mailing list and 1 git tree per subsystem, but still this
>>>> relation is not fixed and one can always CC another mailing list, or
>>>> retarget the change, etc.
>>>> What do you think?
>>>>
>>> I agree that newer review solutions (of the type you ennummerated) rely
>>> on centralization of information, which is undesireable in many cases,
>>> but I'm not sure how to avoid that.
>> Well, FWIW the SSB protocol and similar would avoid that:
>> https://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains
>>
> I wasn't aware of this framework, thats pretty cool and could be a
> potential solution, yes.  
>
> That said, it doesn't solve the casual contributor issue.  From the
> section on raising the entry barrier:
>
> "We would need full-featured web clients that would allow someone to browse projects 
> in a similar fashion as they would browse them on Git..b, including viewing issues, 
> submitting bug reports, and sending patches and pull requests."
>
> Once we start talking about having to maintain web interfaces, and
> especially if we are also considering features like CI, it seems to me
> we are talking about falling back effectively to a forge solution that
> allows data interchange via SSB.  While thats a good feature add, its
> really an addition to a forge solution, not a standalone mechanism.
>
From my point of view it's quite the opposite: the forge-like interfaces
are just clients to the SSB; you can take whatever client you want and
can easily work with others even if they use another client or don't
agree with the terms of use of a specific client.  The main point here
is not to have a single point of failure: right now, if kernel.org goes
down for any reason (or gets compromised), you still have many fully
functional "forks" you can base yourself on that have the full history
of the project.  The idea would be to have the same mechanism for code
review and issue tracking.

Nicolas


* Re: thoughts on a Merge Request based development workflow
  2019-09-30 12:55                       ` Neil Horman
  2019-09-30 13:20                         ` Nicolas Belouin
@ 2019-09-30 13:40                         ` Dmitry Vyukov
  1 sibling, 0 replies; 102+ messages in thread
From: Dmitry Vyukov @ 2019-09-30 13:40 UTC (permalink / raw)
  To: Neil Horman; +Cc: Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Mon, Sep 30, 2019 at 2:56 PM Neil Horman <nhorman@tuxdriver.com> wrote:
>
> On Mon, Sep 30, 2019 at 08:05:04AM +0200, Dmitry Vyukov wrote:
> > On Mon, Sep 30, 2019 at 3:01 AM Neil Horman <nhorman@tuxdriver.com> wrote:
> > >
> > > On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
> > > > On Sun, Sep 29, 2019 at 1:57 PM Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > >
> > > > > On Sat, Sep 28, 2019 at 06:58:48PM -0400, Steven Rostedt wrote:
> > > > > > On Wed, 25 Sep 2019 20:40:45 -0400
> > > > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > > >
> > > > > > > Eventually, barring any really significant objection, hes going to make
> > > > > > > the switch, and users will either have to get github accounts, or stop
> > > > > > > participating in netdev development.
> > > > > >
> > > > > > That will be a very sad day if that happened.
> > > > > >
> > > > > > Whatever service should have an email interface. For example, if I get
> > > > > > a message from bugzilla.kernel.org, I can reply back via email and it
> > > > > > is inserted into the tool (as I see my Out of office messages going
> > > > > > into it. I need to fix my scripts not to reply to bugzilla).
> > > > > >
> > > > > Forge solutions do have the ability to use email as an interface to
> > > > > issue tracking; that's not a problem.  What they don't currently seem to
> > > > > have is the ability to emulate patch review workflows.  And that's not to
> > > > > say they couldn't, but it seems to me that they haven't prioritized that,
> > > > > because they offer several different types of comment options
> > > > > (commenting in the pull request discussions themselves vs. commenting
> > > > > on code, etc.).  If they were to implement that, I think a lot of this
> > > > > would become a lot easier.
> > > > >
> > > > > > I set up patchwork on my INBOX, as I'm having a hard time separating
> > > > > > patches from the noise. And it works really well. I would love to be
> > > > > > able to push my patchwork list to a public place so that others can see
> > > > > > it too. As mentioned in the Maintainers Summit, it would be great to be
> > > > > > able to pull patchwork down to my laptop, get on the plane, process a
> > > > > > bunch of patches while flying, and then when I land, I could push the
> > > > > > updates to the public server.
> > > > > >
> > > > > > That's pretty much all I'm looking for.
> > > > > >
> > > > > I think what you are looking for here is a way to pull down a set of
> > > > > merge requests, review and merge those you approve, and push them back
> > > > > when you are back online?  I think you can do at least some of that.
> > > > > Forge solutions (definitely gitlab, likely github) allow you to pull
> > > > > a merge request reference namespace (on gitlab its
> > > > > heads/merge_requests/<merge_request_id>).  You can merge whatever head
> > > > > there you like to its intended target branch, and when you push, it will
> > > > > update the corresponding MR to the MERGED state.  What you can't
> > > > > currently do is make a comment on an MR, store that comment in git and
> > > > > then have the MR updated with those comments.  That would be a great
> > > > > addition to make the feature more complete.
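
For concreteness, the offline flow described above can be exercised
entirely locally, with a bare repository standing in for the forge.
The refs/merge-requests/<iid>/head layout below is GitLab's advertised
namespace (GitHub's equivalent is refs/pull/<id>/head); the exact ref
names vary by forge and version, so treat this as a sketch:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"

# Stand-in for the forge-side repository.
git -c init.defaultBranch=main init -q --bare forge.git

# Contributor side: a change lands in the MR namespace (on a real
# forge, the server creates this ref when the MR is opened).
git -c init.defaultBranch=main init -q work
cd work
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --allow-empty -m 'MR #1: some change'
git push -q ../forge.git HEAD:refs/merge-requests/1/head

# Maintainer side: map the whole MR namespace onto local tracking
# refs, so every open MR can be fetched in one go before going
# offline, then reviewed and merged locally.
cd "$tmp"
git -c init.defaultBranch=main init -q maintainer
cd maintainer
git remote add origin "$tmp/forge.git"
git config --add remote.origin.fetch \
    '+refs/merge-requests/*/head:refs/remotes/origin/mr/*'
git fetch -q origin

# The MR is now available locally under refs/remotes/origin/mr/1.
git show --no-patch --format=%s refs/remotes/origin/mr/1
```

Pushing the merged result back is then an ordinary `git push` of the
target branch; what this sketch cannot show is the forge-side state
change to MERGED, which happens on the server.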
> > > >
> > > > One mismatch with the kernel dev process that seems to be there for lots of
> > > > existing solutions (gerrit, git-appraise, github, gitlab) is that they
> > > > are centered around a single "main" git tree (in particular,
> > > > gerrit/git-appraise check in metainfo right into that repo). Whereas
> > > > the kernel has lots of trees. Now if Steve is CCed on lots of changes,
> > > > what git tree should he pull before boarding a plane? For some changes
> > > > it may be unclear what tree they should go into initially, or that may
> > > > change over time. Then, there are some additional relations with
> > > > stable trees.
> > > > I suspect that kernel tooling should account for that and separate
> > > > changes layer from exact git trees. Like mailing lists. Usually there
> > > > is 1 mailing list and 1 git tree per subsystem, but still this
> > > > relation is not fixed and one can always CC another mailing list, or
> > > > retarget the change, etc.
> > > > What do you think?
> > > >
> > > I agree that newer review solutions (of the type you enumerated) rely
> > > on centralization of information, which is undesirable in many cases,
> > > but I'm not sure how to avoid that.
> >
> > Well, FWIW the SSB protocol and similar would avoid that:
> > https://people.kernel.org/monsieuricon/patches-carved-into-developer-sigchains
> >
> I wasn't aware of this framework; that's pretty cool and could be a
> potential solution, yes.
>
> That said, it doesn't solve the casual contributor issue.  From the
> section on raising the entry barrier:
>
> "We would need full-featured web clients that would allow someone to browse projects
> in a similar fashion as they would browse them on Git..b, including viewing issues,
> submitting bug reports, and sending patches and pull requests."

Well, it does not have to be complex. See a potential new developer workflow:
https://lore.kernel.org/workflows/d6e8f49e93ece6f208e806ece2aa85b4971f3d17.1569152718.git.dvyukov@google.com/

> Once we start talking about having to maintain web interfaces, and
> especially if we are also considering features like CI, it seems to me
> we are talking about falling back effectively to a forge solution that
> allows data interchange via SSB.  While that's a good feature to add, it's
> really an addition to a forge solution, not a standalone mechanism.

Re web interfaces, Gerrit allows running locally (with your local
checkout as data source), or there will always be an option to apply
the change locally and review in your favorite editor. While a hosted
web interface is nice, it may be only one of the available options.
E.g. git-appraise has a bridge to github, but that's only a bridge.

Re CI: there does not seem to be much to reuse for kernel testing here,
besides push notifications about new changes and a UI for
"passed"/"failed" results. It would be nice for KernelCI, LKFT, CKI,
0-day to talk to the rest of the kernel dev process using some
standardized API, but whether it's a github API or some other similar
API probably does not matter much.


> > > Just thinking off the top of my head, I wonder if a tool that converted
> > > all forge type conversations to git notes would be useful here.  Those
> > > could then be pulled by individuals for review and update?
> >
> > git-appraise does something similar, but the other way around:
> > https://github.com/dvyukov/kit/blob/master/doc/references.md#git-appraise
> > git notes is the _main_ storage, but then it has bridge to github.
> >
> > However, it's unclear how use such solution for kernel:
> > https://lore.kernel.org/ksummit-discuss/9fee1356-cf48-6198-4001-5d9d886fbf88@iogearbox.net/T/#m869c5253d10931823bba74942df7da062a7bbb13
> > (world writable, force pushes)
> >
> Yeah, I've messed with git-appraise, and that's the issue I've run into -
> it forgoes the need for account creation on a forge, but at the expense
> of...well, not having any significant meaningful authentication.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-29 12:55                 ` Dmitry Vyukov
  2019-09-30  1:00                   ` Neil Horman
@ 2019-09-30 14:51                   ` Theodore Y. Ts'o
  2019-09-30 15:15                     ` Steven Rostedt
  2019-10-08  1:00                     ` Stephen Rothwell
  1 sibling, 2 replies; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-09-30 14:51 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Neil Horman, Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Sun, Sep 29, 2019 at 02:55:25PM +0200, Dmitry Vyukov wrote:
> One mismatch with the kernel dev process that seems to be there for lots of
> existing solutions (gerrit, git-appraise, github, gitlab) is that they
> are centered around a single "main" git tree (in particular,
> gerrit/git-appraise check in metainfo right into that repo). Whereas
> the kernel has lots of trees. Now if Steve is CCed on lots of changes,
> what git tree should he pull before boarding a plane? For some changes
> it may be unclear what tree they should go into initially, or that may
> change over time. Then, there are some additional relations with
> stable trees.

It might be worth unpacking the various ways in which a patch series
gets reviewed, and thus cc'ed, on multiple mailing lists.

(a) It's meant for a tree that has its own mailing list, but some
people cc LKML anyway, on general principles.  Since very few people
read LKML, it's kinda pointless, but people do it anyway.

(b) The patch series is meant for one tree, but it will have an impact
on other trees, so it's cc'ed as an FYI and hoping to get comments
from those other trees.  A good example of this might be changes to
fs/iomap, for which linux-xfs is its primary mailing list (and git
tree), but other trees use various bits of fs/iomap (and will be using
more, in the future), so it's cc'ed to linux-fsdevel.

(b1) The patch series impacts some function signature used by multiple
subsystems, so it actually requires changes in multiple git trees ---
but since the changes need to go upstream in atomic
commits, the patch needs to be reviewed by multiple file system
developers.  There may be some negotiation over which tree the change
should get pushed upstream, but _usually_ it's actually pretty clear.

(b2) The patch series adds new functionality which multiple subsystems
are interested in using and pushing upstream in the next merge window.
Since the change to enable the use of that feature is often
non-trivial, what typically happens is that the commits which
enable the functionality go into one tree, with a branch pointer
which other trees will pull into their trees, and then each subsystem
will land additional commits to utilize the feature in their tree.

Usually the patch series with the enabling commits will also come with
a sample set of commits using that new functionality for one subsystem
(usually the subsystem which the primary author of that functionality
is associated with), and that's the git tree for which the initial
patchset will be sent.

(c) People who are making wide-scale cleanups to the tree (example:
adding /* FALLTHROUGH */ tags) where it is sent as a single patch
series, but in fact very often each subsystem maintainer will take the
commit through their own tree.  (This is often to avoid later merge
conflicts.)  The rest of the changes may then go up through some
default tree (e.g., the security tree, or the trivial tree, etc.)

It is this last case where the tree a patch might get sent through is
uncertain, but it's also a small percentage of the overall patch flow.

> I suspect that kernel tooling should account for that and separate
> changes layer from exact git trees. Like mailing lists. Usually there
> is 1 mailing list and 1 git tree per subsystem, but still this
> relation is not fixed and one can always CC another mailing list, or
> retarget the change, etc.
> What do you think?

That would be nice, but I worry that if we make that a requirement for
the 1.0 version, it may make the whole system too complicated and
delay its rollout.

So what I would say is that as a Minimum Viable Product, it's
important for the author to be able to designate which git tree the
entire patch series is meant for, and also specify other mailing lists
that should be cc'ed.  That's something that pretty much all of the
centralized "forge" systems (which I assume include solutions like
Gerrit as well as github) can handle today.
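
As a rough sketch of what the author-side designation could look like
with stock git tooling: the "Target-Tree:" trailer below is invented
here for illustration (it is not an existing kernel convention), but
`git interpret-trailers` can already attach such structured routing
hints to a cover letter or commit message:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git -c init.defaultBranch=main init -q   # so git config is available

# A cover letter as the author might draft it.
cat > cover.txt <<'EOF'
[PATCH 0/3] iomap: rework the widget path

This series reworks the widget path.

Signed-off-by: A Developer <dev@example.com>
EOF

# Append machine-readable routing hints to the existing trailer block.
# "Target-Tree:" is hypothetical; the Cc: list name is just an example.
git interpret-trailers \
    --trailer 'Target-Tree: git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git' \
    --trailer 'Cc: linux-fsdevel@vger.kernel.org' \
    cover.txt > routed.txt

tail -n 2 routed.txt
```

Tooling on the receiving end (or a forge bridge) could then parse the
trailer block instead of guessing the target tree from the diffstat.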

It would be _desirable_ if commits could be annotated with the set of
subsystem maintainers for which approval is requested.  It would also
be desirable if there was a way for a subsystem maintainer to indicate
that they will take the patch through their tree, and the right thing
should happen in terms of removing the patch from needing further
review once that patch is in some other subsystem's tree.

Also, even if we believe that it is (a) possible and (b) desirable to
have a single system which everyone is mandated to use, there will be
a long transition period where not everyone will be using the new
centralized system.  So the system *has* to smoothly handle subsystems
which haven't yet converted to using the new centralized system.

Look at git as an example; everyone uses git because it is clearly the
superior solution, and it made maintainers' lives easier.  But for a
long time, Andrew Morton's -mm patch queue was maintained outside of
git, and in fact, was accessible via ftp.  We don't have a senior vice
president which can force everyone to use the new system --- or for
example, stand down all development work for two months and force
everyone to work on a single focus area, like reliability or security (as
Microsoft did a few years ago).

Can we focus most development resources on a single solution which
appears to meet the vast majority of everyone's workflow?  Perhaps,
especially if we can get corporate funding of the solution which we
expect to meet most developer's needs.  But can we force everyone to
use it?  No, and the attempt to force everyone to use it may actually
make wide adoption harder, not easier.

				- Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-30 14:51                   ` Theodore Y. Ts'o
@ 2019-09-30 15:15                     ` Steven Rostedt
  2019-09-30 16:09                       ` Geert Uytterhoeven
  2019-09-30 20:56                       ` Konstantin Ryabitsev
  2019-10-08  1:00                     ` Stephen Rothwell
  1 sibling, 2 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-09-30 15:15 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Dmitry Vyukov, Neil Horman, Laurent Pinchart, Drew DeVault, workflows

On Mon, 30 Sep 2019 10:51:23 -0400
"Theodore Y. Ts'o" <tytso@mit.edu> wrote:

> 
> (a) It's meant for a tree that has its own mailing list, but some
> people cc LKML anyway, on general principles.  Since very few people
> read LKML, it's kinda pointless, but people do it anyway.

I prefer this method, as I'm not subscribed to every mailing list, but
am to LKML. It makes it a lot easier for me to see threads when I get
Cc'd on a patch thread that's from another mailing list. It's much
easier for me to find the thread in my LKML folder, than to have to
search archives someplace else. Although, lore.kernel.org is making
this better.

-- Steve

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-30 15:15                     ` Steven Rostedt
@ 2019-09-30 16:09                       ` Geert Uytterhoeven
  2019-09-30 20:56                       ` Konstantin Ryabitsev
  1 sibling, 0 replies; 102+ messages in thread
From: Geert Uytterhoeven @ 2019-09-30 16:09 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Neil Horman,
	Laurent Pinchart, Drew DeVault, workflows

On Mon, Sep 30, 2019 at 5:17 PM Steven Rostedt <rostedt@goodmis.org> wrote:
> On Mon, 30 Sep 2019 10:51:23 -0400
> "Theodore Y. Ts'o" <tytso@mit.edu> wrote:
> > (a) It's meant for a tree that has its own mailing list, but some
> > people cc LKML anyway, on general principles.  Since very few people
> > read LKML, it's kinda pointless, but people do it anyway.
>
> I prefer this method, as I'm not subscribed on every mailing list, but
> am with LKML. It makes it a lot easier for me to see threads when I get
> Cc'd on a patch thread that's from another mailing list. It's much
> easier for me to find the thread in my LKML folder, than to have to
> search archives someplace else. Although, lore.kernel.org is making
> this better.

+1

That's also why I sometimes report regressions as replies to emails from
git-commits-head, as those may be the only emails about the subject I
have in my mbox.

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-30 15:15                     ` Steven Rostedt
  2019-09-30 16:09                       ` Geert Uytterhoeven
@ 2019-09-30 20:56                       ` Konstantin Ryabitsev
  1 sibling, 0 replies; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-09-30 20:56 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Neil Horman,
	Laurent Pinchart, Drew DeVault, workflows

On Mon, Sep 30, 2019 at 11:15:32AM -0400, Steven Rostedt wrote:
> > (a) It's meant for a tree that has its own mailing list, but some
> > people cc LKML anyway, on general principles.  Since very few people
> > read LKML, it's kinda pointless, but people do it anyway.
> 
> I prefer this method, as I'm not subscribed on every mailing list, but
> am with LKML. It makes it a lot easier for me to see threads when I get
> Cc'd on a patch thread that's from another mailing list. It's much
> easier for me to find the thread in my LKML folder, than to have to
> search archives someplace else. Although, lore.kernel.org is making
> this better.

As a side-note, we are working to make it easier to run lore.kernel.org
mirrors (either full, or a subset). So, if you wanted to take work
offline while being able to search full LKML archives, you should soon
be able to do so.

-K

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-30  1:00                   ` Neil Horman
  2019-09-30  6:05                     ` Dmitry Vyukov
@ 2019-09-30 21:02                     ` Konstantin Ryabitsev
  1 sibling, 0 replies; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-09-30 21:02 UTC (permalink / raw)
  To: Neil Horman
  Cc: Dmitry Vyukov, Steven Rostedt, Laurent Pinchart, Drew DeVault, workflows

On Sun, Sep 29, 2019 at 09:00:54PM -0400, Neil Horman wrote:
> I agree that newer review solutions (of the type you enumerated) rely
> on centralization of information, which is undesirable in many cases,
> but I'm not sure how to avoid that.
> 
> Just thinking off the top of my head, I wonder if a tool that converted
> all forge type conversations to git notes would be useful here.  Those
> could then be pulled by individuals for review and update?

Git notes have important scaling problems -- they create a file per
note, so git performance will degrade linearly. With only a few notes
this isn't such a big deal, but if we imagine scaling this up to kernel
workflows, this will quickly start to cause problems. Git notes also
cannot be cryptographically signed (in a way that's supported natively
by git), so it's another reason not to consider them for anything
important.
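
For concreteness, a minimal local sketch of the mechanism in question
(plain git notes in a dedicated review ref -- not git-appraise's actual
schema) shows both the idea and why the notes tree grows with review
activity:

```shell
set -e
tmp=$(mktemp -d); cd "$tmp"
git -c init.defaultBranch=main init -q repo && cd repo
export GIT_AUTHOR_NAME=dev GIT_AUTHOR_EMAIL=dev@example.com \
       GIT_COMMITTER_NAME=dev GIT_COMMITTER_EMAIL=dev@example.com

for i in 1 2 3; do
    git commit -q --allow-empty -m "patch $i"
    # Attach a review comment to the commit in a dedicated notes ref,
    # roughly how review metadata could be carried in-tree.
    git notes --ref=refs/notes/review add -m 'Acked-by: reviewer@example.com' HEAD
done

# The notes ref is itself a tree with one entry per annotated commit,
# which is the linear-growth behaviour described above.
git ls-tree refs/notes/review
```

(For large numbers of notes, git does fan the entries out into
subdirectories, but the entry count still grows with every annotated
commit, and none of it is natively signable.)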


-K

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-28 23:16               ` Dave Airlie
  2019-09-28 23:52                 ` Steven Rostedt
@ 2019-10-01  3:22                 ` Daniel Axtens
  2019-10-01 21:14                   ` Bjorn Helgaas
  1 sibling, 1 reply; 102+ messages in thread
From: Daniel Axtens @ 2019-10-01  3:22 UTC (permalink / raw)
  To: Dave Airlie, Steven Rostedt
  Cc: Neil Horman, Laurent Pinchart, Drew DeVault, workflows

Dave Airlie <airlied@gmail.com> writes:

> On Sun, 29 Sep 2019 at 09:10, Steven Rostedt <rostedt@goodmis.org> wrote:
>>
>> On Wed, 25 Sep 2019 20:40:45 -0400
>> Neil Horman <nhorman@tuxdriver.com> wrote:
>>
>> > Eventually, barring any really significant objection, hes going to make
>> > the switch, and users will either have to get github accounts, or stop
>> > participating in netdev development.
>>
>> That will be a very sad day if that happened.
>>
>> Whatever service should have an email interface. For example, if I get
>> a message from bugzilla.kernel.org, I can reply back via email and it
>> is inserted into the tool (as I see my Out of office messages going
>> into it. I need to fix my scripts not to reply to bugzilla).
>>
>> I set up patchwork on my INBOX, as I'm having a hard time separating
>> patches from the noise. And it works really well. I would love to be
>> able to push my patchwork list to a public place so that others can see
>> it too. As mentioned in the Maintainers Summit, it would be great to be
>> able to pull patchwork down to my laptop, get on the plane, process a
>> bunch of patches while flying, and then when I land, I could push the
>> updates to the public server.
>>
>> That's pretty much all I'm looking for.
>
> How many patches does your workflow handle, btw? 20 a month? 50?
>
> I think the reason davem and my group have an interest in using
> git(hub/lab) is that our patch counts are way higher. You guys are
> inventing solutions for your problems, and that's great, but they
> don't scale.
>
> Patchwork as currently sold still requires someone to spend a lot of
> time cleaning it up, which is fine if you get 20-30 mails; when you
> get 1-2k mails, the manual patchwork interactions end up taking a
> large chunk of time.  And the fact that there isn't just one
> patchwork -- everyone has forked it to add their favourite features --
> means someone would have to spend time on a reboot and go around
> bringing all the forks back to a central line, which is a
> significantly larger task than if it had been maintained in the first
> place, because now everyone has their own niche hacks and cool
> features they can't do without, but they are all different from
> everyone else's.

/me puts on upstream patchwork maintainer hat

Hi!

What sort of manual interactions are you doing with patchwork and what
would you like to see changed?

You're probably using the FDO fork, so I can't help with that, but I do
try to do some work on patchwork as I get time. It's not funded so it
has to fit around my actual kernel development job, but currently we're
working on sorting out a big chunk of technical debt that should make
things a lot easier in the future, so now would be a good time to get
your requests in.

Regards,
Daniel

>
> Dave.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-01  3:22                 ` Daniel Axtens
@ 2019-10-01 21:14                   ` Bjorn Helgaas
  0 siblings, 0 replies; 102+ messages in thread
From: Bjorn Helgaas @ 2019-10-01 21:14 UTC (permalink / raw)
  To: Daniel Axtens
  Cc: Dave Airlie, Steven Rostedt, Neil Horman, Laurent Pinchart,
	Drew DeVault, workflows

On Mon, Sep 30, 2019 at 10:22 PM Daniel Axtens <dja@axtens.net> wrote:

> What sort of manual interactions are you doing with patchwork and what
> would you like to see changed?

I delegate patches and change their state.  This isn't a lot of time
in absolute terms (10-20 patches/day) but it's frustratingly fiddly
because of mousing/clicking and web latency.  git-pw seems like a way
to get rid of this, but delegation is currently broken
(https://lists.ozlabs.org/pipermail/patchwork/2019-September/006036.html)

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-09-24 18:37 ` Drew DeVault
  2019-09-24 18:53   ` Neil Horman
@ 2019-10-07 15:33   ` David Miller
  2019-10-07 15:35     ` Drew DeVault
                       ` (3 more replies)
  1 sibling, 4 replies; 102+ messages in thread
From: David Miller @ 2019-10-07 15:33 UTC (permalink / raw)
  To: sir; +Cc: nhorman, workflows

From: "Drew DeVault" <sir@cmpwn.com>
Date: Tue, 24 Sep 2019 14:37:28 -0400

> Until this part. Phasing out email in favor of a centralized
> solution like Gitlab would be a stark regression.

I have to make a statement about this because it's really the elephant
in the room.

Email is on the slow and steady decline to an almost certain death.

And I say this as someone who is maintaining email lists for more than
25 years, and has to sift through several hundred emails every day.

Somewhere down the road, in the not too distant future, email will
simply not be an option.  You can "use" it, but I can guarantee it
will not be in a state where you will want to.

So we can stay in denial about this, or we can do something proactive
to prepare ourselves for this inevitable result.

And when we have these conversations about how important it is to
retain email based workflows, is that really to make sure we have a
backup plan in case new infrastructure fails, or is it to appease
"senior" maintainers like myself and others who simply don't want to
change and move on?

Personally, I seriously want to change and move on from email, it's
terrible.

I just want tools and pretty web pages, in fact I'll use just about
anything in order to move on from email based workflows entirely.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:33   ` David Miller
@ 2019-10-07 15:35     ` Drew DeVault
  2019-10-07 16:20       ` Neil Horman
  2019-10-07 15:47     ` Steven Rostedt
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 102+ messages in thread
From: Drew DeVault @ 2019-10-07 15:35 UTC (permalink / raw)
  To: David Miller; +Cc: nhorman, workflows

There's no substance here to back up your arguments. You just say email
is dying and hope we take it for granted that you're right. Email is
_not_ terrible, and there are serious flaws with any of the proposed
alternatives to it.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:33   ` David Miller
  2019-10-07 15:35     ` Drew DeVault
@ 2019-10-07 15:47     ` Steven Rostedt
  2019-10-07 18:40       ` David Miller
  2019-10-07 18:45       ` David Miller
  2019-10-07 21:49     ` Theodore Y. Ts'o
  2019-10-07 23:00     ` Daniel Axtens
  3 siblings, 2 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-10-07 15:47 UTC (permalink / raw)
  To: David Miller; +Cc: sir, nhorman, workflows

On Mon, 07 Oct 2019 17:33:29 +0200 (CEST)
David Miller <davem@davemloft.net> wrote:

> From: "Drew DeVault" <sir@cmpwn.com>
> Date: Tue, 24 Sep 2019 14:37:28 -0400
> 
> > Until this part. Phasing out email in favor of a centralized
> > solution like Gitlab would be a stark regression.  
> 
> I have to make a statement about this because it's really the elephant
> in the room.
> 
> Email is on the slow and steady decline to an almost certain death.

And so has IRC. I would hope that email doesn't face the same fate.
What replaced IRC? Slack! The most useless interface for having
anything more than watercooler conversations.

Whatever "replaces" email, please keep those stupid emojis and
especially the animated ones out of it. It does nothing but distract
from the conversation.

I blame Outlook as the death of email. It's probably the most used
email client but also the most useless one. It doesn't support proper
tree threading, and it's impossible to follow a long thread with it.
Not to mention, it can't do inlined conversations to save itself.

> 
> And I say this as someone who is maintaining email lists for more than
> 25 years, and has to sift through several hundred emails every day.
> 
> Somewhere down the road, in the not too distant future, email will
> simply not be an option.  You can "use" it, but I can guarantee it
> will not be in a state where you will want to.
> 
> So we can stay in denial about this, or we can do something proactive
> to prepare ourselves for this inevitable result.
> 
> And when we have these conversations about how important it is to
> retain email based workflows, is that really to make sure we have a
> backup plan in case new infrastructure fails, or is it to appease
> "senior" maintainers like myself and others who simply don't want to
> change and move on?
> 
> Personally, I seriously want to change and move on from email, it's
> terrible.
> 
> I just want tools and pretty web pages, in fact I'll use just about
> anything in order to move on from email based workflows entirely.

I want tools that work, and are versatile. If you have a replacement
that's not a "one size fits all", where you can build any client
against it (like we have with email), then I will be very happy. But
forcing everyone to a single workflow is going to be a huge step
backwards.

-- Steve

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:35     ` Drew DeVault
@ 2019-10-07 16:20       ` Neil Horman
  2019-10-07 16:24         ` Drew DeVault
  0 siblings, 1 reply; 102+ messages in thread
From: Neil Horman @ 2019-10-07 16:20 UTC (permalink / raw)
  To: Drew DeVault; +Cc: David Miller, nhorman, workflows

On Mon, Oct 07, 2019 at 11:35:43AM -0400, Drew DeVault wrote:
> There's no substance here to back up your arguments. You just say email
> is dying and hope we take it for granted that you're right. Email is
> _not_ terrible, and there are serious flaws with any of the proposed
> alternatives to it.

To offer some clarification here: no, email isn't universally terrible; it's
a pretty handy interface for any user conducting a single in-depth
conversation around a specific topic.

For someone, on the other hand, who has to manage those conversations
writ large (on the order of several hundred a day), where management
entails extracting structured text (which, as often as not, is not as
structured as it should be) and applying that data to a git tree, it's a
huge drain on that person's time.  Between that and dealing with general
maintenance tasks (other email servers that stop responding, spam on the
list, etc.), it becomes a full-time job just to keep the system
running before any real work gets done.

Neil


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 16:20       ` Neil Horman
@ 2019-10-07 16:24         ` Drew DeVault
  2019-10-07 18:43           ` David Miller
  0 siblings, 1 reply; 102+ messages in thread
From: Drew DeVault @ 2019-10-07 16:24 UTC (permalink / raw)
  To: Neil Horman; +Cc: David Miller, nhorman, workflows

On Mon Oct 7, 2019 at 12:20 PM Neil Horman wrote:
> For someone on the other hand, thats got to manage those conversations,
> writ large (on the order of several hundred a day), where management
> entails extracting structured text (that as often as not, is not as
> structured as it should be), and applying that data to a git tree, its a
> huge drain on that person's time.  Between that and dealing with general
> maintenance tasks (other email servers that stop responding, or spam the
> list, etc), it becomes a full time job just to keep the system
> running before any real work gets done

I'm one of these people. My mail system's volume is on the order of
10,000 emails processed per month, forwarded to ~3,000 subscriptions. I
hardly ever have to touch it. The main issue is that many mail systems
are ancient and overcomplex, but nothing is stopping anyone from working
with simpler and more modern mail systems. And if we're going to be
investing development time in improvements, things will get even better
very quickly.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:47     ` Steven Rostedt
@ 2019-10-07 18:40       ` David Miller
  2019-10-07 18:45       ` David Miller
  1 sibling, 0 replies; 102+ messages in thread
From: David Miller @ 2019-10-07 18:40 UTC (permalink / raw)
  To: rostedt; +Cc: sir, nhorman, workflows

From: Steven Rostedt <rostedt@goodmis.org>
Date: Mon, 7 Oct 2019 11:47:52 -0400

> I blame Outlook as the death of email.

I personally think Google harbors a lot of the blame, but there
are other culprits indeed.


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-07 16:24         ` Drew DeVault
@ 2019-10-07 18:43           ` David Miller
  2019-10-07 19:24             ` Eric Wong
  0 siblings, 1 reply; 102+ messages in thread
From: David Miller @ 2019-10-07 18:43 UTC (permalink / raw)
  To: sir; +Cc: nhorman, nhorman, workflows

From: "Drew DeVault" <sir@cmpwn.com>
Date: Mon, 07 Oct 2019 12:24:47 -0400

> The main issue is that many mail systems are ancient and
> overcomplex, but nothing is stopping anyone from working with
> simpler and more modern mail systems.

It's the modern ones, like gmail, that are the main problem.

Google controls such a huge chunk of the present day internet
email traffic that they can effectively steer the medium in any
direction they want, indiscriminately, and even eradicate it
slowly over time.


* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:47     ` Steven Rostedt
  2019-10-07 18:40       ` David Miller
@ 2019-10-07 18:45       ` David Miller
  2019-10-07 19:21         ` Steven Rostedt
  1 sibling, 1 reply; 102+ messages in thread
From: David Miller @ 2019-10-07 18:45 UTC (permalink / raw)
  To: rostedt; +Cc: sir, nhorman, workflows

From: Steven Rostedt <rostedt@goodmis.org>
Date: Mon, 7 Oct 2019 11:47:52 -0400

> I want tools that work, and are versatile. If you have a replacement
> that's not a "one size fits all", where you can build any client
> against it (like we have with email), then I will be very happy. But
> forcing everyone to a single workflow is going to be a huge step
> backwards.

I want good infrastructure upon which arbitrary tooling can be built on
top too, so I'm glad that our goals align :-)

But building it on top of email is very unwise in my opinion.


* Re: thoughts on a Merge Request based development workflow
  2019-10-07 18:45       ` David Miller
@ 2019-10-07 19:21         ` Steven Rostedt
  0 siblings, 0 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-10-07 19:21 UTC (permalink / raw)
  To: David Miller; +Cc: sir, nhorman, workflows

On Mon, 07 Oct 2019 20:45:12 +0200 (CEST)
David Miller <davem@davemloft.net> wrote:

> From: Steven Rostedt <rostedt@goodmis.org>
> Date: Mon, 7 Oct 2019 11:47:52 -0400
> 
> > I want tools that work, and are versatile. If you have a replacement
> > that's not a "one size fits all", where you can build any client
> > against it (like we have with email), then I will be very happy. But
> > forcing everyone to a single workflow is going to be a huge step
> > backwards.  
> 
> I want good infrastructure upon which arbitrary tooling can be built on
> top too, so I'm glad that our goals align :-)
> 
> But building it on top of email is very unwise in my opinion.

I agree that it should not be built on top of email. I believe that the
requirement people are asking for, is that email is integrated in it.

For example, I love bugzilla's that are set up where I can be emailed
when someone submits a report. And I can reply via email to that email
I received, and my reply goes into bugzilla.

The one that submitted the report doesn't need to have email for it.
They can submit via the bugzilla interface, and even see my reply. But
bugzilla is not built on top of email in this regard, but it can nicely
interact with it.

I believe that's what people are asking about. That email is one of the
interfaces to whatever tool we come up with.

-- Steve


* Re: thoughts on a Merge Request based development workflow
  2019-10-07 18:43           ` David Miller
@ 2019-10-07 19:24             ` Eric Wong
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Wong @ 2019-10-07 19:24 UTC (permalink / raw)
  To: David Miller; +Cc: sir, nhorman, nhorman, workflows

David Miller <davem@davemloft.net> wrote:
> From: "Drew DeVault" <sir@cmpwn.com>
> Date: Mon, 07 Oct 2019 12:24:47 -0400
> 
> > The main issue is that many mail systems are ancient and
> > overcomplex, but nothing is stopping anyone from working with
> > simpler and more modern mail systems.
> 
> It's the modern ones, like gmail, that are the main problem.
> 
> Google controls such a huge chunk of the present day internet
> email traffic that they can effectively steer the medium in any
> direction they want, indiscriminately, and even eradicate it
> slowly over time.

The megacorps will try to do that with any modern replacement
for SMTP, too :<

Maybe something pull-based such as IM2000
<https://cr.yp.to/im2000.html> can eventually work.  *shrug*


* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:33   ` David Miller
  2019-10-07 15:35     ` Drew DeVault
  2019-10-07 15:47     ` Steven Rostedt
@ 2019-10-07 21:49     ` Theodore Y. Ts'o
  2019-10-07 23:00     ` Daniel Axtens
  3 siblings, 0 replies; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-10-07 21:49 UTC (permalink / raw)
  To: David Miller; +Cc: sir, nhorman, workflows

On Mon, Oct 07, 2019 at 05:33:29PM +0200, David Miller wrote:
> From: "Drew DeVault" <sir@cmpwn.com>
> Date: Tue, 24 Sep 2019 14:37:28 -0400
> 
> > Until this part. Phasing out email in favor of a centralized
> > solution like Gitlab would be a stark regression.
> 
> I have to make a statement about this because it's really the elephant
> in the room.
> 
> Email is on the slow and steady decline to an almost certain death.

It might be useful to consider why people have been saying, "you can
take e-mail away when you pry my MUA from my old, dead, fingers".  In
other words, why do people want to keep e-mail based process?

Here are some potential reasons:

* People like having a variety of clients, which they can customize
   * To provide off-line access
   * Advanced filtering and indexing options
   * Ability to create pipelines so the contents can be easily piped
     to scripts (e.g., checkpatch, git am, etc.)
* Support for handling large numbers of e-mails (at least some clients / backends)
* The same system is used to discuss concepts as well as patches
* Easy for people to subscribe to patches, reviews, and discussions for a particular
  subsystem

> And when we have these conversations about how important it is to
> retain email based workflows, is that really to make sure we have a
> backup plan in case new infrastructure fails, or is it to appease
> "senior" maintainers like myself and others who simply don't want to
> change and move on?

The reality is we do need to be compatible with e-mail based workflows
for a transition period that will almost certainly last at least a
year.  Part of this is because the initial version will almost
certainly not have 100% of all of the necessary features, but rather
will be a "minimum viable product" which can then be iterated on to
add additional functionality to support additional workflows.  The
other reason is that people will need time to transition at their own
pace.

Nevertheless, it is important that the design try to subsume all
e-mail based workflows, since in the absence of some managerial diktat
forcing everyone to switch, this new solution needs to be strictly
superior to e-mail, which means it has to support all of the reasons
why people are so wedded to e-mail today.

					- Ted


* Re: thoughts on a Merge Request based development workflow
  2019-10-07 15:33   ` David Miller
                       ` (2 preceding siblings ...)
  2019-10-07 21:49     ` Theodore Y. Ts'o
@ 2019-10-07 23:00     ` Daniel Axtens
  2019-10-08  0:39       ` Eric Wong
  2019-10-08  1:17       ` Steven Rostedt
  3 siblings, 2 replies; 102+ messages in thread
From: Daniel Axtens @ 2019-10-07 23:00 UTC (permalink / raw)
  To: David Miller, sir; +Cc: nhorman, workflows

> Personally, I seriously want to change and move on from email, it's
> terrible.

FWIW, I maintain patchwork and I agree with this. Email suffers from a
gap between what can be comprehended by humans and what can be reliably
comprehended by machines.

For example:

 - is a given series a revision of a previous series? Humans can change
   the name of the cover letter, they can re-order or drop patches,
   split and merge series, even change sender, and other humans just
   figure it out. But if I try to crystalise that logic into patchwork,
   things get very tricky. This makes it hard to build powerful APIs
   into patchwork, which makes it harder to build really cool tools on
   top of patchwork.

 - what are the dependencies of a patch series? Does it need another
   series first? Does it apply to a particular tree? (maintainer/next,
   maintainer/fixes, stable?) This affects every CI system that I'm
   aware of (some of which build on patchwork). Humans can understand
   this pretty easily, computers not so much.

Non-email systems have an easier time of this: with gerrit (which I'm
not a big fan of, but just take it as an example) you push things up to
a git repository, and it requires a change-id. So you can track the base
tree, dependencies, and patch revisions easily, because you build on a
richer, more structured data source.

Kind regards,
Daniel

> I just want tools and pretty web pages, in fact I'll use just about
> anything in order to move on from email based workflows entirely.



* Re: thoughts on a Merge Request based development workflow
  2019-10-07 23:00     ` Daniel Axtens
@ 2019-10-08  0:39       ` Eric Wong
  2019-10-08  1:26         ` Daniel Axtens
  2019-10-08  1:17       ` Steven Rostedt
  1 sibling, 1 reply; 102+ messages in thread
From: Eric Wong @ 2019-10-08  0:39 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: David Miller, sir, nhorman, workflows

Daniel Axtens <dja@axtens.net> wrote:
> David Miller wrote:
> > Personally, I seriously want to change and move on from email, it's
> > terrible.
> 
> FWIW, I maintain patchwork and I agree with this. Email suffers from a
> gap between what can be comprehended by humans and what can be reliably
> comprehended by machines.
> 
> For example:
> 
>  - is a given series a revision of a previous series? Humans can change
>    the name of the cover letter, they can re-order or drop patches,
>    split and merge series, even change sender, and other humans just
>    figure it out. But if I try to crystalise that logic into patchwork,
>    things get very tricky. This makes it hard to build powerful APIs
>    into patchwork, which makes it harder to build really cool tools on
>    top of patchwork.

I'm confident that we can build much of that logic off search
and do similar things to what git does with rename detection.

>  - what are the dependencies of a patch series? Does it need another
>    series first? Does it apply to a particular tree? (maintainer/next,
>    maintainer/fixes, stable?) This affects every CI system that I'm
>    aware of (some of which build on patchwork). Humans can understand
>    this pretty easily, computers not so much.

I think we can do all these things off existing data in archives.
We already have pre/post-image blob IDs in git patches.
To get there, I think we'll need:

1) efficient way to map blobs -> trees -> commits -> refs
   (a reverse-mapping for git's normal DAG)

2) automatic scanning of known repos (searching what appear to
   be pull-requests, similar to what pr-tracker-bot does).

None of which requires patch senders to do anything differently.

git format-patch features such as --base and --range-diff can
certainly help with this, and it's probably easier to train people
to use newer options in existing tools than entirely new tools.
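As a rough illustration of how little extra data is needed (a sketch of my own, not public-inbox code; the function name is made up):

```python
import re

# Every mail from "git format-patch" carries "index <pre>..<post> [mode]"
# lines; the abbreviated pre/post-image blob IDs fall out of a single
# regular expression, with no cooperation from the sender required.
INDEX_RE = re.compile(r"^index ([0-9a-f]{4,40})\.\.([0-9a-f]{4,40})(?: \d{6})?$")

def blob_ids(patch_text):
    """Return (pre_image, post_image) abbreviated blob ID pairs."""
    return [m.groups() for m in map(INDEX_RE.match, patch_text.splitlines()) if m]
```

Those IDs are exactly what the reverse mapping in (1) above would be keyed on.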

> Non-email systems have an easier time of this: with gerrit (which I'm
> not a big fan of, but just take it as an example) you push things up to
> a git repository, and it requires a change-id. So you can track the base
> tree, dependencies, and patch revisions easily, because you build on a
> richer, more structured data source.

Right, decentralization is a HARD problem; but it starts off
with centralization-resistance, a slightly easier problem to
solve :)

The key is: don't introduce things which mirrors can't reproduce

Unlike in 2005 when git started, things like Xapian and SQLite
are much more mature and I'm comfortable leaning on them to
solve harder problems.


* Re: thoughts on a Merge Request based development workflow
  2019-09-30 14:51                   ` Theodore Y. Ts'o
  2019-09-30 15:15                     ` Steven Rostedt
@ 2019-10-08  1:00                     ` Stephen Rothwell
  1 sibling, 0 replies; 102+ messages in thread
From: Stephen Rothwell @ 2019-10-08  1:00 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Dmitry Vyukov, Neil Horman, Steven Rostedt, Laurent Pinchart,
	Drew DeVault, workflows


Hi Theodore,

On Mon, 30 Sep 2019 10:51:23 -0400 "Theodore Y. Ts'o" <tytso@mit.edu> wrote:
>
> Look at git as an example; everyone uses git because it is clearly the
> superior solution, and it made maintainers' lives easier.  But for a
> long time, Andrew Morton's -mm patch queue was maintained outside of
> git, and in fact, was accessible via ftp.

It still is maintained with quilt and accessible via http (no ftp any
more).  I have to import it into git branches in order to include it
in linux-next.
-- 
Cheers,
Stephen Rothwell



* Re: thoughts on a Merge Request based development workflow
  2019-10-07 23:00     ` Daniel Axtens
  2019-10-08  0:39       ` Eric Wong
@ 2019-10-08  1:17       ` Steven Rostedt
  2019-10-08 16:43         ` Don Zickus
  1 sibling, 1 reply; 102+ messages in thread
From: Steven Rostedt @ 2019-10-08  1:17 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: David Miller, sir, nhorman, workflows

On Tue, 08 Oct 2019 10:00:03 +1100
Daniel Axtens <dja@axtens.net> wrote:

> Non-email systems have an easier time of this: with gerrit (which I'm
> not a big fan of, but just take it as an example) you push things up to
> a git repository, and it requires a change-id. So you can track the base
> tree, dependencies, and patch revisions easily, because you build on a
> richer, more structured data source.

I believe we all want a new system that can handle this, but still be
able to work with email. Patchwork has to read every email and
figure out what to do with it. This workflow doesn't need to do that.
But it should be able to send out emails on comments, and a reply to
one of those should easily be put back into the system.

As for adding patches, we can push to a git tree or something that the
tool could read. It's much easier to know what to do with a branch than
with email. A rebase could be a new version of the series (and we should
probably archive the original version).

-- Steve


* Re: thoughts on a Merge Request based development workflow
  2019-10-08  0:39       ` Eric Wong
@ 2019-10-08  1:26         ` Daniel Axtens
  2019-10-08  2:11           ` Eric Wong
  0 siblings, 1 reply; 102+ messages in thread
From: Daniel Axtens @ 2019-10-08  1:26 UTC (permalink / raw)
  To: Eric Wong; +Cc: David Miller, sir, nhorman, workflows

>> For example:
>> 
>>  - is a given series a revision of a previous series? Humans can change
>>    the name of the cover letter, they can re-order or drop patches,
>>    split and merge series, even change sender, and other humans just
>>    figure it out. But if I try to crystalise that logic into patchwork,
>>    things get very tricky. This makes it hard to build powerful APIs
>>    into patchwork, which makes it harder to build really cool tools on
>>    top of patchwork.
>
> I'm confident that we can build much of that logic off search
> and do similar things to what git does with rename detection.

A lot of people on this list are confident of a great many things :)

There should be an API in the next minor version of Patchwork that
allows you to set patch relations. I would encourage you to try to build
this - if it works, we can plug it in to the core.

>>  - what are the dependencies of a patch series? Does it need another
>>    series first? Does it apply to a particular tree? (maintainer/next,
>>    maintainer/fixes, stable?) This affects every CI system that I'm
>>    aware of (some of which build on patchwork). Humans can understand
>>    this pretty easily, computers not so much.
>
> I think we can do all these things off existing data in archives.
> We already have pre/post-image blob IDs in git patches.
> To get there, I think we'll need:

> 1) efficient way to map blobs -> trees -> commits -> refs
>    (a reverse-mapping for git's normal DAG)
>
> 2) automatic scanning of known repos (searching what appear to
>    be pull-requests, similar to what pr-tracker-bot does).
>
> None of which requires patch senders to do anything differently.
>
> git format-patch features such as --base and --range-diff can
> certainly help with this, and it's probably easier to train people
> to use newer options in existing tools than entirely new tools.

I don't understand any of what you're proposing, unfortunately.

AIUI snowpatch (to pick an open source patch CI example) tries applying
patches to a set of different (instance-configured) trees until it finds
one that works. I'm sure they'd be interested in seeing patches to make
this more efficient.

>> Non-email systems have an easier time of this: with gerrit (which I'm
>> not a big fan of, but just take it as an example) you push things up to
>> a git repository, and it requires a change-id. So you can track the base
>> tree, dependencies, and patch revisions easily, because you build on a
>> richer, more structured data source.
>
> Right, decentralization is a HARD problem; but it starts off
> with centralization-resistance, a slightly easier problem to
> solve :)
>
> The key is: don't introduce things which mirrors can't reproduce

Mirrors already can't meaningfully reproduce patchwork. They can only
make a read-only copy of some of the data, but it's not enough to spin
up a new identical instance.

> Unlike in 2005 when git started; things like Xapian and SQLite
> are much more mature and I'm comfortable leaning on them to
> solve harder problems.

Regards,
Daniel


* Re: thoughts on a Merge Request based development workflow
  2019-10-08  1:26         ` Daniel Axtens
@ 2019-10-08  2:11           ` Eric Wong
  2019-10-08  3:24             ` Daniel Axtens
  0 siblings, 1 reply; 102+ messages in thread
From: Eric Wong @ 2019-10-08  2:11 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: David Miller, sir, nhorman, workflows

Daniel Axtens <dja@axtens.net> wrote:
> >> For example:
> >> 
> >>  - is a given series a revision of a previous series? Humans can change
> >>    the name of the cover letter, they can re-order or drop patches,
> >>    split and merge series, even change sender, and other humans just
> >>    figure it out. But if I try to crystalise that logic into patchwork,
> >>    things get very tricky. This makes it hard to build powerful APIs
> >>    into patchwork, which makes it harder to build really cool tools on
> >>    top of patchwork.
> >
> > I'm confident that we can build much of that logic off search
> > and do similar things to what git does with rename detection.
> 
> A lot of people on this list are confident of a great many things :)
> 
> There should be an API in the next minor version of Patchwork that
> allows you to set patch relations. I would encourage you to try to build
> this - if it works, we can plug it in to the core.

Manually set relations should not be needed if people use
format-patch with --interdiff or --range-diff.

A well-tuned search engine will be able to figure out the
preceding series using the git blob IDs from interdiff or commit
IDs from range-diff.

No need to introduce extra metadata into the system, especially
not in a way that can't be reproduced.  Reuse what we have.

Even without interdiff or range-diff, it should be possible to
determine relationships based on common pre-image blob IDs
if the sender used the same base.
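For concreteness, here is what the sender side could look like (a sketch; branch names and the helper name are illustrative, but both options are in stock git, --range-diff since 2.19):

```shell
# Post v2 of a series with machine-readable hints baked in:
# --base adds a "base-commit:" trailer naming the exact base, and
# --range-diff embeds a comparison against the previous revision,
# tying v2 back to v1 without any manually-set relation.
send_v2() {
    base=$1   # branch the series applies to
    prev=$2   # branch (or tag) holding the v1 series
    git format-patch -v2 -o outgoing/ --base="$base" \
        --range-diff="$prev" "$base"..HEAD
}
```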

> >>  - what are the dependencies of a patch series? Does it need another
> >>    series first? Does it apply to a particular tree? (maintainer/next,
> >>    maintainer/fixes, stable?) This affects every CI system that I'm
> >>    aware of (some of which build on patchwork). Humans can understand
> >>    this pretty easily, computers not so much.
> >
> > I think we can do all these things off existing data in archives.
> > We already have pre/post-image blob IDs in git patches.
> > To get there, I think we'll need:
> 
> > 1) efficient way to map blobs -> trees -> commits -> refs
> >    (a reverse-mapping for git's normal DAG)
> >
> > 2) automatic scanning of known repos (searching what appear to
> >    be pull-requests, similar to what pr-tracker-bot does).
> >
> > None of which requires patch senders to do anything differently.
> >
> > git format-patch features such as --base and --range-diff can
> > certainly help with this, and it's probably easier to train people
> > to use newer options in existing tools than entirely new tools.
> 
> I don't understand any of what you're proposing, unfortunately.
> 
> AIUI snowpatch (to pick an open source patch CI example) tries applying
> patches to a set of different (instance-configured) trees until it finds
> one that works. I'm sure they'd be interested in seeing patches to make
> this more efficient.

Every patch from git format-patch has abbreviated pre/post-image
SHA-1 blob IDs.  If we had an efficient reverse mapping of those
blob IDs to trees, we could quickly figure out which trees those
patches can apply to.

I've already been using pre/post-image blob IDs to recreate blobs
efficiently:

  https://lore.kernel.org/workflows/20190924013920.GA22698@dcvr/

But it doesn't yet find which trees the patch can apply to;
since it cannot (yet) tell you which trees those blobs exist in.
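A brute-force version of that lookup already exists in stock git for a single repository (a slow sketch of my own, not the efficient reverse index; --find-object needs git >= 2.16):

```shell
# Given an abbreviated blob ID from a patch's "index" line, list the
# commits on any ref whose diffs add or remove that blob.
commits_touching_blob() {
    full=$(git rev-parse --verify "$1^{blob}") || return 1
    git log --all --oneline --find-object="$full"
}
```

The missing piece is doing this across many known repos without walking every commit each time.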

> >> Non-email systems have an easier time of this: with gerrit (which I'm
> >> not a big fan of, but just take it as an example) you push things up to
> >> a git repository, and it requires a change-id. So you can track the base
> >> tree, dependencies, and patch revisions easily, because you build on a
> >> richer, more structured data source.
> >
> > Right, decentralization is a HARD problem; but it starts off
> > with centralization-resistance, a slightly easier problem to
> > solve :)
> >
> > The key is: don't introduce things which mirrors can't reproduce
> 
> Mirrors already can't meaningfully reproduce patchwork. They can only
> make a read-only copy of some of the data, but it's not enough to spin
> up a new identical instance.

Right, that seems to be a consequence of not having the
prerequisite storage or search that public-inbox does:

> > Unlike in 2005 when git started; things like Xapian and SQLite
> > are much more mature and I'm comfortable leaning on them to
> > solve harder problems.


* Re: thoughts on a Merge Request based development workflow
  2019-10-08  2:11           ` Eric Wong
@ 2019-10-08  3:24             ` Daniel Axtens
  2019-10-08  6:03               ` Eric Wong
  0 siblings, 1 reply; 102+ messages in thread
From: Daniel Axtens @ 2019-10-08  3:24 UTC (permalink / raw)
  To: Eric Wong; +Cc: David Miller, sir, nhorman, workflows

>> >> For example:
>> >> 
>> >>  - is a given series a revision of a previous series? Humans can change
>> >>    the name of the cover letter, they can re-order or drop patches,
>> >>    split and merge series, even change sender, and other humans just
>> >>    figure it out. But if I try to crystalise that logic into patchwork,
>> >>    things get very tricky. This makes it hard to build powerful APIs
>> >>    into patchwork, which makes it harder to build really cool tools on
>> >>    top of patchwork.
>> >
>> > I'm confident that we can build much of that logic off search
>> > and do similar things to what git does with rename detection.
>> 
>> A lot of people on this list are confident of a great many things :)
>> 
>> There should be an API in the next minor version of Patchwork that
>> allows you to set patch relations. I would encourage you to try to build
>> this - if it works, we can plug it in to the core.
>
> Manually set relations should not be needed if people use
> format-patch with --interdiff or --range-diff.
>
> A well-tuned search engine will be able to figure out the
> preceding series using the git blob IDs from interdiff or commit
> IDs from range-diff.
>
> No need to introduce extra metadata into the system, especially
> not in a way that can't be reproduced.  Reuse what we have.
>
> Even without interdiff or range-diff, it should be possible to
> determine relationships based on common pre-image blob IDs
> if the sender used the same base.

As I said, I'd be really happy to see this piggy-back on the API once it
lands.

>> >>  - what are the dependencies of a patch series? Does it need another
>> >>    series first? Does it apply to a particular tree? (maintainer/next,
>> >>    maintainer/fixes, stable?) This affects every CI system that I'm
>> >>    aware of (some of which build on patchwork). Humans can understand
>> >>    this pretty easily, computers not so much.
>> >
>> > I think we can do all these things off existing data in archives.
>> > We already have pre/post-image blob IDs in git patches.
>> > To get there, I think we'll need:
>> 
>> > 1) efficient way to map blobs -> trees -> commits -> refs
>> >    (a reverse-mapping for git's normal DAG)
>> >
>> > 2) automatic scanning of known repos (searching what appear to
>> >    be pull-requests, similar to what pr-tracker-bot does).
>> >
>> > None of which requires patch senders to do anything differently.
>> >
>> > git format-patch features such as --base and --range-diff can
>> > certainly help with this, and it's probably easier to train people
>> > to use newer options in existing tools than entirely new tools.
>> 
>> I don't understand any of what you're proposing, unfortunately.
>> 
>> AIUI snowpatch (to pick an open source patch CI example) tries applying
>> patches to a set of different (instance-configured) trees until it finds
>> one that works. I'm sure they'd be interested in seeing patches to make
>> this more efficient.
>
> Every patch from git format-patch has abbreviated pre/post-image
> SHA-1 blob IDs.  If we had an efficient reverse mapping of those
> blob IDs to trees, we could quickly figure out which trees those
> patches can apply to.
>
> I've already been using pre/post-image blob IDs to recreate blobs
> efficiently:
>
>   https://lore.kernel.org/workflows/20190924013920.GA22698@dcvr/
>
> But it doesn't yet find which trees the patch can apply to;
> since it cannot (yet) tell you which trees those blobs exist in.

Cool.

>> >> Non-email systems have an easier time of this: with gerrit (which I'm
>> >> not a big fan of, but just take it as an example) you push things up to
>> >> a git repository, and it requires a change-id. So you can track the base
>> >> tree, dependencies, and patch revisions easily, because you build on a
>> >> richer, more structured data source.
>> >
>> > Right, decentralization is a HARD problem; but it starts off
>> > with centralization-resistance, a slightly easier problem to
>> > solve :)
>> >
>> > The key is: don't introduce things which mirrors can't reproduce
>> 
>> Mirrors already can't meaningfully reproduce patchwork. They can only
>> make a read-only copy of some of the data, but it's not enough to spin
>> up a new identical instance.
>
> Right, that seems to be a consequence of not having the
> prerequisite storage or search that public-inbox does:

I don't think so. I think it's because patchwork allows you to log in
and perform actions like change state and delegate patches. That's not a
thing that public-inbox has in scope.

>> > Unlike in 2005 when git started; things like Xapian and SQLite
>> > are much more mature and I'm comfortable leaning on them to
>> > solve harder problems.

Regards,
Daniel


* Re: thoughts on a Merge Request based development workflow
  2019-10-08  3:24             ` Daniel Axtens
@ 2019-10-08  6:03               ` Eric Wong
  2019-10-08 10:06                 ` Daniel Axtens
  2019-10-08 18:46                 ` Rob Herring
  0 siblings, 2 replies; 102+ messages in thread
From: Eric Wong @ 2019-10-08  6:03 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: David Miller, sir, nhorman, workflows

Daniel Axtens <dja@axtens.net> wrote:
> >> >> For example:
> >> >> 
> >> >>  - is a given series a revision of a previous series? Humans can change
> >> >>    the name of the cover letter, they can re-order or drop patches,
> >> >>    split and merge series, even change sender, and other humans just
> >> >>    figure it out. But if I try to crystalise that logic into patchwork,
> >> >>    things get very tricky. This makes it hard to build powerful APIs
> >> >>    into patchwork, which makes it harder to build really cool tools on
> >> >>    top of patchwork.
> >> >
> >> > I'm confident that we can build much of that logic off search
> >> > and do similar things to what git does with rename detection.
> >> 
> >> A lot of people on this list are confident of a great many things :)
> >> 
> >> There should be an API in the next minor version of Patchwork that
> >> allows you to set patch relations. I would encourage you to try to build
> >> this - if it works, we can plug it in to the core.
> >
> > Manually set relations should not be needed if people use
> > format-patch with --interdiff or --range-diff.
> >
> > A well-tuned search engine will be able to figure out the
> > preceding series using the git blob IDs from interdiff or commit
> > IDs from range-diff.
> >
> > No need to introduce extra metadata into the system, especially
> > not in a way that can't be reproduced.  Reuse what we have.
> >
> > Even without interdiff or range-diff, it should be possible to
> > determine relationships based on common pre-image blob IDs
> > if the sender used the same base.
> 
> As I said, I'd be really happy to see this piggy-back on the API once it
> lands.

Sorry, I'm not sure who's piggy-backing off who :)

I intend to keep the raw/gzipped-text URLs in public-inbox
stable so anything can query it.  I'm not sure if there's
anything for public-inbox to query from patchwork's API for
this, since all the info public-inbox needs is in the archives.

The interdiff stuff is easier and can be done sooner in public-inbox
since it won't require indexing changes.  So maybe by early Nov.

range-diff will require the ability to scan repos, so more work
to get that mapping into place.

> >> >> Non-email systems have an easier time of this: with gerrit (which I'm
> >> >> not a big fan of, but just take it as an example) you push things up to
> >> >> a git repository, and it requires a change-id. So you can track the base
> >> >> tree, dependencies, and patch revisions easily, because you build on a
> >> >> richer, more structured data source.
> >> >
> >> > Right, decentralization is a HARD problem; but it starts off
> >> > with centralization-resistance, a slightly easier problem to
> >> > solve :)
> >> >
> >> > The key is: don't introduce things which mirrors can't reproduce
> >> 
> >> Mirrors already can't meaningfully reproduce patchwork. They can only
> >> make a read-only copy of some of the data, but it's not enough to spin
> >> up a new identical instance.
> >
> > Right, that seems to be a consequence of not having the
> > prerequisite storage or search that public-inbox does:
> 
> I don't think so. I think it's because patchwork allows you to log in
> and perform actions like change state and delegate patches. That's not a
> thing that public-inbox has in scope.

It seems like those things are done to appease managerial types
rather than people who actually do work :>

I prefer actual communication of delegation/state to be done in
normal English.  Relying on states/tickets/severities/etc. is
unnatural and often leads to confusion.

And maybe NLP (natural language processing) can go far enough
that we can build states to show managers using sentences like:

Alice: "Bob, can you review these patches for foo?"
Bob: "Sorry Alice, busy on the refactoring bar, maybe Eve can do it"
Eve: "Alice, sure I can review those patches"

I have no experience with NLP, though...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-08  6:03               ` Eric Wong
@ 2019-10-08 10:06                 ` Daniel Axtens
  2019-10-08 13:19                   ` Steven Rostedt
  2019-10-08 18:46                 ` Rob Herring
  1 sibling, 1 reply; 102+ messages in thread
From: Daniel Axtens @ 2019-10-08 10:06 UTC (permalink / raw)
  To: Eric Wong; +Cc: David Miller, sir, nhorman, workflows

Eric Wong <e@80x24.org> writes:

> Daniel Axtens <dja@axtens.net> wrote:
>> >> >> For example:
>> >> >> 
>> >> >>  - is a given series a revision of a previous series? Humans can change
>> >> >>    the name of the cover letter, they can re-order or drop patches,
>> >> >>    split and merge series, even change sender, and other humans just
>> >> >>    figure it out. But if I try to crystalise that logic into patchwork,
>> >> >>    things get very tricky. This makes it hard to build powerful APIs
>> >> >>    into patchwork, which makes it harder to build really cool tools on
>> >> >>    top of patchwork.
>> >> >
>> >> > I'm confident that we can build much of that logic off search
>> >> > and do similar things to what git does with rename detection.
>> >> 
>> >> A lot of people on this list are confident of a great many things :)
>> >> 
>> >> There should be an API in the next minor version of Patchwork that
>> >> allows you to set patch relations. I would encourage you to try to build
>> >> this - if it works, we can plug it in to the core.
>> >
>> > Manually set relations should not be needed if people use
>> > format-patch with --interdiff or --range-diff.
>> >
>> > A well-tuned search engine will be able to figure out the
>> > preceding series using the git blob IDs from interdiff or commit
>> > IDs from range-diff.
>> >
>> > No need to introduce extra metadata into the system, especially
>> > not in a way that can't be reproduced.  Reuse what we have.
>> >
>> > Even without interdiff or range-diff, it should be possible to
>> > determine relationships based on common pre-image blob IDs
>> > if the sender used the same base.
>> 
>> As I said, I'd be really happy to see this piggy-back on the API once it
>> lands.
>
> Sorry, I'm not sure who's piggy-backing off who :)
>
> I intend to keep the raw/gzipped-text URLs in public-inbox
> stable so anything can query it.  I'm not sure if there's
> anything for public-inbox to query from patchwork's API for
> this, since all the info public-inbox needs is in the archives.
>
> The interdiff stuff is easier and can be done sooner in public-inbox
> since it won't require indexing changes.  So maybe by early Nov.
>
> range-diff will require the ability to scan repos, so more work
> to get that mapping into place.
>

OK, you're going in a completely different direction then.

>> >> >> Non-email systems have an easier time of this: with gerrit (which I'm
>> >> >> not a big fan of, but just take it as an example) you push things up to
>> >> >> a git repository, and it requires a change-id. So you can track the base
>> >> >> tree, dependencies, and patch revisions easily, because you build on a
>> >> >> richer, more structured data source.
>> >> >
>> >> > Right, decentralization is a HARD problem; but it starts off
>> >> > with centralization-resistance, a slightly easier problem to
>> >> > solve :)
>> >> >
>> >> > The key is: don't introduce things which mirrors can't reproduce
>> >> 
>> >> Mirrors already can't meaningfully reproduce patchwork. They can only
>> >> make a read-only copy of some of the data, but it's not enough to spin
>> >> up a new identical instance.
>> >
>> > Right, that seems to be a consequence of not having the
>> > prerequisite storage or search that public-inbox does:
>> 
>> I don't think so. I think it's because patchwork allows you to log in
>> and perform actions like change state and delegate patches. That's not a
>> thing that public-inbox has in scope.
>
> It seems like those things are done to appease managerial types
> rather than people who actually do work :>
>
> I prefer actual communication of delegation/state be done via
> normal English.  Relying on states/tickets/severities/etc. is
> unnatural and often leads to confusion.

This is not a widely held view amongst lists and maintainers that use
patchwork.

> And maybe NLP (natural language processing) can go far enough
> where we can build states to show managers using sentences like:
>
> Alice: "Bob, can you review these patches for foo?"
> Bob: "Sorry Alice, busy on the refactoring bar, maybe Eve can do it"
> Eve: "Alice, sure I can review those patches"
>
> I have no experience with NLP, though...

I suspect from our interactions that continuing with this conversation
isn't going to be of much value to either of us. Good luck with your
approach.

Regards,
Daniel

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 10:06                 ` Daniel Axtens
@ 2019-10-08 13:19                   ` Steven Rostedt
  0 siblings, 0 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-10-08 13:19 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Eric Wong, David Miller, sir, nhorman, workflows

On Tue, 08 Oct 2019 21:06:23 +1100
Daniel Axtens <dja@axtens.net> wrote:

> >> I don't think so. I think it's because patchwork allows you to log in
> >> and perform actions like change state and delegate patches. That's not a
> >> thing that public-inbox has in scope.  
> >
> > It seems like those things are done to appease managerial types
> > rather than people who actually do work :>
> >
> > I prefer actual communication of delegation/state be done via
> > normal English.  Relying on states/tickets/severities/etc. is
> > unnatural and often leads to confusion.
> 
> This is not a widely held view amongst lists and maintainers that use
> patchwork.

I agree with Daniel here. I prefer the state and delegation of patches.
This way I can easily see what I need to work on. People send me
various patches, which other people review. When they think
it's at a point for me to include it, they delegate the patch to me.

An email saying "Hey Steve, this patch is ready" will get lost
in my Inbox (which currently has over 25,000 emails!). Ever since I
started using Patchwork, the number of "lost patches" has dropped
tremendously, because I no longer need to manage them in my Inbox.

-- Steve

* Re: thoughts on a Merge Request based development workflow
  2019-10-08  1:17       ` Steven Rostedt
@ 2019-10-08 16:43         ` Don Zickus
  2019-10-08 17:17           ` Steven Rostedt
  2019-10-09  2:02           ` Daniel Axtens
  0 siblings, 2 replies; 102+ messages in thread
From: Don Zickus @ 2019-10-08 16:43 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Daniel Axtens, David Miller, sir, nhorman, workflows

On Mon, Oct 07, 2019 at 09:17:04PM -0400, Steven Rostedt wrote:
> On Tue, 08 Oct 2019 10:00:03 +1100
> Daniel Axtens <dja@axtens.net> wrote:
> 
> > Non-email systems have an easier time of this: with gerrit (which I'm
> > not a big fan of, but just take it as an example) you push things up to
> > a git repository, and it requires a change-id. So you can track the base
> > tree, dependencies, and patch revisions easily, because you build on a
> > richer, more structured data source.
> 
> I believe we all want a new system that can handle this, but still be
> able to work with email. Patchwork requires reading all emails and
> figuring out what to do with them. This workflow doesn't need to do that.
> But it should be able to send out emails on comments, and a reply to
> one of those should easily be put back into the system.
> 
> As for adding patches, we can push to a git tree or something that the
> tool could read. It's much easier to know what to do with a branch than
> with email. A rebase could be a new version of the series (and we should
> probably archive the original version).

Thanks for the thoughts.  Your thoughts here and your bugzilla example make
sense and tie into some of the work we are experimenting with in a small
group at Red Hat.  My only sticky point is the initial patch submission.
Pushing to a forge like github, gerrit, or gitlab makes a ton of sense like
you said above.  But what do we do with folks who still email patches
initially (instead of the git-push)?

What should be the expectations there?  Leverage patchwork to create a 'git
push'?  Reject the patchset?  Something else?  Curious.

Thanks!

Cheers,
Don

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 16:43         ` Don Zickus
@ 2019-10-08 17:17           ` Steven Rostedt
  2019-10-08 17:39             ` Don Zickus
  2019-10-09  2:02           ` Daniel Axtens
  1 sibling, 1 reply; 102+ messages in thread
From: Steven Rostedt @ 2019-10-08 17:17 UTC (permalink / raw)
  To: Don Zickus; +Cc: Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, 8 Oct 2019 12:43:09 -0400
Don Zickus <dzickus@redhat.com> wrote:

> Thanks for the thoughts.  Your thoughts here and your bugzilla example make
> sense and tie into some of the work we are experimenting with in a small
> group at Red Hat.  My only sticky point is the initial patch submission.
> Pushing to a forge like github, gerrit, or gitlab makes a ton of sense like
> you said above.  But what do we do with folks who still email patches
> initially (instead of the git-push)?
> 
> What should be the expectations there?  Leverage patchwork to create a 'git
> push'?  Reject the patchset?  Something else?  Curious.

I think we are not just talking about reusing patchwork (unless that
becomes the starting point). But let's use patchwork as a starting
point for my thoughts about this. One would email the mailing list, and
also Cc a "listener" (like patchwork). Note, the one thing I dislike
about patchwork is that it requires reading a list. I'd rather have it
just be something that gets Cc'd to trigger it.

Anyway, when this "listener" gets an email, it goes into the system.
Now the maintainer can get an email from this system, or read the
system directly from a web browser or whatever client they choose. They
reply to the system, and this goes to the original submitter via email
(with links to how to use the system directly). The submitter can then
use the system to send a v2, and even perhaps reply to it via email with
some keyword that will tell the system it is a v2, or a comment.

I think we need to standardize on keywords to trigger the system
properly, if we are to use email (it's up to the email user to get those
keywords right), or they can go directly to the system interface (be it
a web browser, or whatever), and then they don't need to worry about
keywords as the system would handle that directly.
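
A listener could at least lean on the existing subject-tag convention
([PATCH vN n/m]) for the versioning part; a minimal sketch (the in-body
keywords themselves would still need to be standardized):

```python
import re

# Kernel-style patch subject tags: "[PATCH v2 3/5] subsystem: summary".
SUBJECT_RE = re.compile(r"\[PATCH(?:\s+v(?P<ver>\d+))?(?:\s+(?P<n>\d+)/(?P<m>\d+))?\]")

def parse_subject(subject):
    """Extract (version, position, total) from a patch subject; v1 if unmarked."""
    m = SUBJECT_RE.search(subject)
    if not m:
        return None
    ver = int(m.group("ver") or 1)
    pos = int(m.group("n")) if m.group("n") else None
    total = int(m.group("m")) if m.group("m") else None
    return ver, pos, total
```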

Does this make sense?

-- Steve

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 17:17           ` Steven Rostedt
@ 2019-10-08 17:39             ` Don Zickus
  2019-10-08 19:05               ` Konstantin Ryabitsev
  0 siblings, 1 reply; 102+ messages in thread
From: Don Zickus @ 2019-10-08 17:39 UTC (permalink / raw)
  To: Steven Rostedt; +Cc: Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, Oct 08, 2019 at 01:17:30PM -0400, Steven Rostedt wrote:
> On Tue, 8 Oct 2019 12:43:09 -0400
> Don Zickus <dzickus@redhat.com> wrote:
> 
> > Thanks for the thoughts.  Your thoughts here and your bugzilla example make
> > sense and tie into some of the work we are experimenting with in a small
> > group at Red Hat.  My only sticky point is the initial patch submission.
> > Pushing to a forge like github, gerrit, or gitlab makes a ton of sense like
> > you said above.  But what do we do with folks who still email patches
> > initially (instead of the git-push)?
> > 
> > What should be the expectations there?  Leverage patchwork to create a 'git
> > push'?  Reject the patchset?  Something else?  Curious.
> 
> I think we are not just talking about reusing patchwork (unless that
> becomes the starting point). But let's use patchwork as a starting
> point for my thoughts about this. One would email the mailing list, and
> also Cc a "listener" (like patchwork). Note, the one thing I dislike
> about patchwork is that it requires reading a list. I'd rather have it
> just be something that gets Cc'd to trigger it.
> 
> Anyway, when this "listener" gets an email, it goes into the system.
> Now the maintainer can get an email from this system, or read the
> system directly from a web browser or whatever client they choose. They
> reply to the system, and this goes to the original submitter via email
> (with links to how to use the system directly). The submitter can then
> use the system to send a v2, and even perhaps reply to it via email with
> some keyword that will tell the system it is a v2, or a comment.
> 
> I think we need to standardize on keywords to trigger the system
> properly, if we are to use email (it's up to the email user to get those
> keywords right), or they can go directly to the system interface (be it
> a web browser, or whatever), and then they don't need to worry about
> keywords as the system would handle that directly.
> 
> Does this make sense?

It mostly does.  Though I don't think it describes how a patch series would
work, as that would be a collection of Cc's to the listener.  Now the
listener has to organize that into a proper thread order (after determining
that the whole thread has arrived).  Unless I misunderstood the idea.

Regardless, I think what you wrote reinforces the idea that emailing a
patch series (and its vX followups) is messy for the maintainer and that a
more evolved idea is to let a forge take git-push as input.

But it seems a stop-gap like patchwork (which has logic to handle a
patch series) would still be needed until something like git-push became
the norm (if possible)?

Cheers,
Don

* Re: thoughts on a Merge Request based development workflow
  2019-10-08  6:03               ` Eric Wong
  2019-10-08 10:06                 ` Daniel Axtens
@ 2019-10-08 18:46                 ` Rob Herring
  2019-10-08 21:36                   ` Eric Wong
  1 sibling, 1 reply; 102+ messages in thread
From: Rob Herring @ 2019-10-08 18:46 UTC (permalink / raw)
  To: Eric Wong; +Cc: Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, Oct 8, 2019 at 1:03 AM Eric Wong <e@80x24.org> wrote:
>
> Daniel Axtens <dja@axtens.net> wrote:
> > >> >> For example:
> > >> >>
> > >> >>  - is a given series a revision of a previous series? Humans can change
> > >> >>    the name of the cover letter, they can re-order or drop patches,
> > >> >>    split and merge series, even change sender, and other humans just
> > >> >>    figure it out. But if I try to crystalise that logic into patchwork,
> > >> >>    things get very tricky. This makes it hard to build powerful APIs
> > >> >>    into patchwork, which makes it harder to build really cool tools on
> > >> >>    top of patchwork.
> > >> >
> > >> > I'm confident that we can build much of that logic off search
> > >> > and do similar things to what git does with rename detection.
> > >>
> > >> A lot of people on this list are confident of a great many things :)
> > >>
> > >> There should be an API in the next minor version of Patchwork that
> > >> allows you to set patch relations. I would encourage you to try to build
> > >> this - if it works, we can plug it in to the core.
> > >
> > > Manually set relations should not be needed if people use
> > > format-patch with --interdiff or --range-diff.
> > >
> > > A well-tuned search engine will be able to figure out the
> > > preceding series using the git blob IDs from interdiff or commit
> > > IDs from range-diff.
> > >
> > > No need to introduce extra metadata into the system, especially
> > > not in a way that can't be reproduced.  Reuse what we have.
> > >
> > > Even without interdiff or range-diff, it should be possible to
> > > determine relationships based on common pre-image blob IDs
> > > if the sender used the same base.
> >
> > As I said, I'd be really happy to see this piggy-back on the API once it
> > lands.
>
> Sorry, I'm not sure who's piggy-backing off who :)
>
> I intend to keep the raw/gzipped-text URLs in public-inbox
> stable so anything can query it.  I'm not sure if there's
> anything for public-inbox to query from patchwork's API for
> this, since all the info public-inbox needs is in the archives.
>
> The interdiff stuff is easier and can be done sooner in public-inbox
> since it won't require indexing changes.  So maybe by early Nov.
>
> range-diff will require the ability to scan repos, so more work
> to get that mapping into place.
>
> > >> >> Non-email systems have an easier time of this: with gerrit (which I'm
> > >> >> not a big fan of, but just take it as an example) you push things up to
> > >> >> a git repository, and it requires a change-id. So you can track the base
> > >> >> tree, dependencies, and patch revisions easily, because you build on a
> > >> >> richer, more structured data source.
> > >> >
> > >> > Right, decentralization is a HARD problem; but it starts off
> > >> > with centralization-resistance, a slightly easier problem to
> > >> > solve :)
> > >> >
> > >> > The key is: don't introduce things which mirrors can't reproduce
> > >>
> > >> Mirrors already can't meaningfully reproduce patchwork. They can only
> > >> make a read-only copy of some of the data, but it's not enough to spin
> > >> up a new identical instance.
> > >
> > > Right, that seems to be a consequence of not having the
> > > prerequisite storage or search that public-inbox does:
> >
> > I don't think so. I think it's because patchwork allows you to log in
> > and perform actions like change state and delegate patches. That's not a
> > thing that public-inbox has in scope.
>
> It seems like those things are done to appease managerial types
> rather than people who actually do work :>

+1 to what Steven said. PW is my todo list; keeping one in my mail
client has never worked well.

>
> I prefer actual communication of delegation/state be done via
> normal English.  Relying on states/tickets/severities/etc. is
> unnatural and often leads to confusion.
>
>
> And maybe NLP (natural language processing) can go far enough
> where we can build states to show managers using sentences like:
>
> Alice: "Bob, can you review these patches for foo?"
> Bob: "Sorry Alice, busy on the refactoring bar, maybe Eve can do it"
> Eve: "Alice, sure I can review those patches"
>
> I have no experience with NLP, though...

I'm pretty sure solving managers workflows is a non-goal of this list.

Rob

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 17:39             ` Don Zickus
@ 2019-10-08 19:05               ` Konstantin Ryabitsev
  2019-10-08 20:32                 ` Don Zickus
  2019-10-09 21:35                 ` Laura Abbott
  0 siblings, 2 replies; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-08 19:05 UTC (permalink / raw)
  To: Don Zickus
  Cc: Steven Rostedt, Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, Oct 08, 2019 at 01:39:02PM -0400, Don Zickus wrote:
>Regardless, I think what you wrote reinforces the idea that emailing a
>patch series (and its vX followups) is messy for the maintainer and that a
>more evolved idea is to let a forge take git-push as input.

I'm pretty opposed to the idea of forges, because this approach makes it 
very easy to knock out infrastructure critical to the project's ability 
to quickly roll out fixes. Imagine a situation where there's a zero-day 
remote root kernel exploit -- the attackers would be interested in 
ensuring that it remains unpatched for as long as possible, so we can 
imagine that they will target any central infrastructure where a fix can 
be developed and posted.

Currently, such an attack would be ineffective because even if 
kernel.org is knocked out entirely, collaboration will still happen 
directly over email between maintainers and Linus, and a fix can be 
posted on any number of worldwide resources -- as long as it carries 
Linus's signature, it will be trusted. If we switch to require a central 
forge, then knocking out that resource will require that maintainers and 
developers scramble to find some kind of backup channel (like falling 
back to email). And if we're still falling back to email, then we're not 
really solving the larger underlying problem of "what should we use 
instead of email."

We also shouldn't forget trigger-happy governments that like to ban 
troves of IP addresses in their chase after "the safe internet." Github 
has already been banned a couple of times in China and Russia, and 
chances are that this will continue.

This doesn't mean that forges are entirely out -- but they must remain 
mere tools that participate in a globally decentralized, 
developer-attestable, self-archiving messaging service. Maybe let's call 
that "kernel developer bus" or "kdbus" -- pretty sure that name hasn't 
been used before.

-K

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 19:05               ` Konstantin Ryabitsev
@ 2019-10-08 20:32                 ` Don Zickus
  2019-10-08 21:35                   ` Konstantin Ryabitsev
  2019-10-09 21:35                 ` Laura Abbott
  1 sibling, 1 reply; 102+ messages in thread
From: Don Zickus @ 2019-10-08 20:32 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Steven Rostedt, Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, Oct 08, 2019 at 03:05:27PM -0400, Konstantin Ryabitsev wrote:
> On Tue, Oct 08, 2019 at 01:39:02PM -0400, Don Zickus wrote:
> > Regardless, I think what you wrote reinforces the idea that emailing a
> > patch series (and its vX followups) is messy for the maintainer and that a
> > more evolved idea is to let a forge take git-push as input.
> 
> I'm pretty opposed to the idea of forges, because this approach makes it
> very easy to knock out infrastructure critical to the project's ability to
> quickly roll out fixes. Imagine a situation where there's a zero-day remote
> root kernel exploit -- the attackers would be interested in ensuring that it
> remains unpatched for as long as possible, so we can imagine that they will
> target any central infrastructure where a fix can be developed and posted.
> 
> Currently, such an attack would be ineffective because even if kernel.org is
> knocked out entirely, collaboration will still happen directly over email
> between maintainers and Linus, and a fix can be posted on any number of
> worldwide resources -- as long as it carries Linus's signature, it will be
> trusted. If we switch to require a central forge, then knocking out that
> resource will require that maintainers and developers scramble to find some
> kind of backup channel (like falling back to email). And if we're still
> falling back to email, then we're not really solving the larger underlying
> problem of "what should we use instead of email."
> 
> We also shouldn't forget trigger-happy governments that like to ban troves
> of IP addresses in their chase after "the safe internet." Github has already
> been banned a couple of times in China and Russia, and chances are that this
> will continue.
> 
> This doesn't mean that forges are entirely out -- but they must remain mere
> tools that participate in a globally decentralized, developer-attestable,
> self-archiving messaging service. Maybe let's call that "kernel developer
> bus" or "kdbus" -- pretty sure that name hasn't been used before.

If we flipped it around and used today's git-pull-request emails as triggers
for forges to run their services, would that cover some of your concerns?

git-pull-request emails can skip the patch-translation layer (like
patchwork), run the automated tests, and utilize the current maintainer
workflows.
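
For reference, the output of `git request-pull` already carries the
machine-readable bits such a trigger would need. An illustrative sketch of
extracting them (the repository URL in the test is a made-up example):

```python
import re

# The standard `git request-pull` body contains:
#   "The following changes since commit <sha>:" and, later,
#   "are available in the Git repository at:\n\n  <url> <ref>"
BASE_RE = re.compile(r"The following changes since commit ([0-9a-f]{7,40}):")
PULL_RE = re.compile(
    r"are available in the Git repository at:\s*\n\s*\n\s+(\S+)(?:\s+(\S+))?"
)

def parse_request_pull(body):
    """Return (base_commit, repo_url, ref) from a request-pull email body."""
    base = BASE_RE.search(body)
    repo = PULL_RE.search(body)
    if not base or not repo:
        return None
    return base.group(1), repo.group(1), repo.group(2)
```

A forge could fetch the named ref on receipt and run CI against it, while
the email itself stays in the archives as the durable record.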

The email issue is hard to resolve as some folks feel like it should be a
primary vehicle while others feel it should be a secondary vehicle.

Cheers,
Don

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 20:32                 ` Don Zickus
@ 2019-10-08 21:35                   ` Konstantin Ryabitsev
  2019-10-09 21:50                     ` Laura Abbott
  0 siblings, 1 reply; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-08 21:35 UTC (permalink / raw)
  To: Don Zickus
  Cc: Steven Rostedt, Daniel Axtens, David Miller, sir, nhorman, workflows

On Tue, Oct 08, 2019 at 04:32:49PM -0400, Don Zickus wrote:
> > This doesn't mean that forges are entirely out -- but they must remain mere
> > tools that participate in a globally decentralized, developer-attestable,
> > self-archiving messaging service. Maybe let's call that "kernel developer
> > bus" or "kdbus" -- pretty sure that name hasn't been used before.
> 
> If we flipped it around and used today's git-pull-request emails as triggers
> for forges to run their services, would that cover some of your concerns?

Not really, because it doesn't preserve any records. A pull request is a
pointer to some code in a git repository somewhere. Like any pointer, it
is neither self-contained nor long-lived:

- the repo can be gone a month later (or that branch deleted)
- the PR does not preserve any discussions that happened around that
  code such as bug reports, test success/fail matrices, or peer reviews

It's also pretty inefficient, because it requires that the pull-request
submitter hosts a 1.5GB repository on a fast permanently-available
connection just so they can share a few lines of changes.

> git-pull-request emails can skip the patch-translation layer (like
> patchwork), run the automated tests, and utilize the current maintainer
> workflows.

Where do the test reports go after they are completed? How does the
maintainer find out that the tests succeeded? How does the next
developer after them -- say, 3 years later -- find out what tests ran
against that changeset before it was merged?

Instead of consolidating the fragmented landscape of Linux development,
we are further fracturing it and making it lossier. Currently, archival
efforts like lore.kernel.org at least preserve all discussions/reports
cc'd to the LKML, but I'm afraid that forges will render large parts of
the development process completely opaque.

I suggest that we stick to patch-based workflows and develop better
tooling and distribution fabric to replace SMTP -- redecentralizing
things in the process.

> The email issue is hard to resolve as some folks feel like it should be a
> primary vehicle while others feel it should be a secondary vehicle.

If forges are merely participants in the communication fabric, then it
doesn't matter which one is primary. In fact, email then becomes just
another bridge, so those who are not interested in switching away from
their current email-based workflow can continue using email.

-K

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 18:46                 ` Rob Herring
@ 2019-10-08 21:36                   ` Eric Wong
  0 siblings, 0 replies; 102+ messages in thread
From: Eric Wong @ 2019-10-08 21:36 UTC (permalink / raw)
  To: Rob Herring; +Cc: Daniel Axtens, David Miller, sir, nhorman, workflows

Rob Herring <robh@kernel.org> wrote:
> On Tue, Oct 8, 2019 at 1:03 AM Eric Wong <e@80x24.org> wrote:
> >
> > Daniel Axtens <dja@axtens.net> wrote:
> > > >> >> For example:
> > > >> >>
> > > >> >>  - is a given series a revision of a previous series? Humans can change
> > > >> >>    the name of the cover letter, they can re-order or drop patches,
> > > >> >>    split and merge series, even change sender, and other humans just
> > > >> >>    figure it out. But if I try to crystalise that logic into patchwork,
> > > >> >>    things get very tricky. This makes it hard to build powerful APIs
> > > >> >>    into patchwork, which makes it harder to build really cool tools on
> > > >> >>    top of patchwork.
> > > >> >
> > > >> > I'm confident that we can build much of that logic off search
> > > >> > and do similar things to what git does with rename detection.
> > > >>
> > > >> A lot of people on this list are confident of a great many things :)
> > > >>
> > > >> There should be an API in the next minor version of Patchwork that
> > > >> allows you to set patch relations. I would encourage you to try to build
> > > >> this - if it works, we can plug it in to the core.
> > > >
> > > > Manually set relations should not be needed if people use
> > > > format-patch with --interdiff or --range-diff.
> > > >
> > > > A well-tuned search engine will be able to figure out the
> > > > preceding series using the git blob IDs from interdiff or commit
> > > > IDs from range-diff.
> > > >
> > > > No need to introduce extra metadata into the system, especially
> > > > not in a way that can't be reproduced.  Reuse what we have.
> > > >
> > > > Even without interdiff or range-diff, it should be possible to
> > > > determine relationships based on common pre-image blob IDs
> > > > if the sender used the same base.
> > >
> > > As I said, I'd be really happy to see this piggy-back on the API once it
> > > lands.
> >
> > Sorry, I'm not sure who's piggy-backing off who :)
> >
> > I intend to keep the raw/gzipped-text URLs in public-inbox
> > stable so anything can query it.  I'm not sure if there's
> > anything for public-inbox to query from patchwork's API for
> > this, since all the info public-inbox needs is in the archives.
> >
> > The interdiff stuff is easier and can be done sooner in public-inbox
> > since it won't require indexing changes.  So maybe by early Nov.
> >
> > range-diff will require the ability to scan repos, so more work
> > to get that mapping into place.
> >
> > > >> >> Non-email systems have an easier time of this: with gerrit (which I'm
> > > >> >> not a big fan of, but just take it as an example) you push things up to
> > > >> >> a git repository, and it requires a change-id. So you can track the base
> > > >> >> tree, dependencies, and patch revisions easily, because you build on a
> > > >> >> richer, more structured data source.
> > > >> >
> > > >> > Right, decentralization is a HARD problem; but it starts off
> > > >> > with centralization-resistance, a slightly easier problem to
> > > >> > solve :)
> > > >> >
> > > >> > The key is: don't introduce things which mirrors can't reproduce
> > > >>
> > > >> Mirrors already can't meaningfully reproduce patchwork. They can only
> > > >> make a read-only copy of some of the data, but it's not enough to spin
> > > >> up a new identical instance.
> > > >
> > > > Right, that seems to be a consequence of not having the
> > > > prerequisite storage or search that public-inbox does:
> > >
> > > I don't think so. I think it's because patchwork allows you to log in
> > > and perform actions like change state and delegate patches. That's not a
> > > thing that public-inbox has in scope.
> >
> > It seems like those things are done to appease managerial types
> > rather than people who actually do work :>
> 
> +1 to what Steven said. PW is my todo list. That's never worked well
> within my mail client.

OK, I admit my assumptions around delegation/state were way off :>

What about a control language similar to debbugs?

  https://debbugs.gnu.org/server-refcard.html

The goal is to have something that can be reproduced, replayed
and regenerated by any client without centralized dependencies.

Integer IDs used by debbugs would be replaced by Message-IDs,
at least.
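To make that concrete, here's a toy replayer for such a control
stream (the verbs are hypothetical, just debbugs-flavoured; the
point is that any client holding the same command log regenerates
the same state, with no central database):

```python
# Sketch of a replayable, debbugs-like control language keyed by
# Message-ID instead of integer bug numbers.  The command verbs are
# hypothetical, not an actual debbugs or public-inbox feature.

def replay(commands):
    """Replay control lines into a per-Message-ID state dict."""
    state = {}
    for line in commands:
        parts = line.split()
        if len(parts) < 2:
            continue  # ignore malformed lines
        verb, msgid = parts[0], parts[1]
        entry = state.setdefault(msgid, {"tags": set(), "open": True})
        if verb == "severity" and len(parts) == 3:
            entry["severity"] = parts[2]
        elif verb == "tags" and len(parts) >= 3:
            entry["tags"].update(parts[2:])
        elif verb == "close":
            entry["open"] = False
    return state

log = [
    "severity <20191009215427.GA1@example.org> important",
    "tags <20191009215427.GA1@example.org> patch moreinfo",
    "close <20191009215427.GA1@example.org>",
]
result = replay(log)
```

Any mirror that has the mail (and thus the log) can rebuild `result`
identically, which is the regeneratable property I'm after.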

> > I prefer actual communication of delegation/state be done via
> > normal English.  Relying on states/tickets/severities/etc is
> > unnatural and often leads to confusion.

Fwiw, I often get "critical" vs "grave" and "serious" vs
"important" severities mixed up in debbugs.  That would
be confusing to non-native speakers.

> > And maybe NLP (natural language processing) can go far enough
> > where we can build states to show managers using sentences like:
> >
> > Alice: "Bob, can you review these patches for foo?"
> > Bob: "Sorry Alice, busy on the refactoring bar, maybe Eve can do it"
> > Eve: "Alice, sure I can review those patches"
> >
> > I have no experience with NLP, though...
> 
> I'm pretty sure solving managers workflows is a non-goal of this list.

OK :)  And yeah, on second thought, NLP would probably be
too fuzzy, especially for non-native speakers.

But I think something along the lines of debbugs control
commands could be a good compromise if it's regeneratable.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 16:43         ` Don Zickus
  2019-10-08 17:17           ` Steven Rostedt
@ 2019-10-09  2:02           ` Daniel Axtens
  1 sibling, 0 replies; 102+ messages in thread
From: Daniel Axtens @ 2019-10-09  2:02 UTC (permalink / raw)
  To: Don Zickus, Steven Rostedt; +Cc: David Miller, sir, nhorman, workflows

Don Zickus <dzickus@redhat.com> writes:

> On Mon, Oct 07, 2019 at 09:17:04PM -0400, Steven Rostedt wrote:
>> On Tue, 08 Oct 2019 10:00:03 +1100
>> Daniel Axtens <dja@axtens.net> wrote:
>> 
>> > Non-email systems have an easier time of this: with gerrit (which I'm
>> > not a big fan of, but just take it as an example) you push things up to
>> > a git repository, and it requires a change-id. So you can track the base
>> > tree, dependencies, and patch revisions easily, because you build on a
>> > richer, more structured data source.
>> 
>> I believe we all want a new system that can handle this, but still be
>> able to work with email. Patchwork requires reading all emails and
>> figuring out what to do with them. This workflow doesn't need to do that.
>> But it should be able to send out emails on comments, and a reply to
>> one of those should easily be put back into the system.
>> 
>> As for adding patches, we can push to a git tree or something that the
>> tool could read. It's much easier to know what to do with a branch than
>> email. A rebase could be a new version of the series (and we should
>> probably archive the original version).
>
> Thanks for the thoughts.  Your thoughts here and your bugzilla example make
> sense and ties into some of the work we are experimenting with, with a small
> group at Red Hat.  My only sticky point is the initial patch submission.
> Pushing to a forge like github, gerrit, or gitlab makes a ton of sense like
> you said above.  But what do we do with folks who still email patches
> initially (instead of the git-push)?
>
> What should be the expectations there?  Leverage patchwork to create a 'git
> push'?  Reject the patchset?  Something else?  Curious.

Normally I would say "the patchwork API makes this really easy" but I
got sick of saying it, so I thought I'd just demonstrate instead.

It takes less than 100 lines of python.

https://github.com/daxtens/forge-bridge

Currently running against the powerpc list and pushing all new series to
my github: https://github.com/daxtens/linux/branches
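For those who don't want to read the repo, the core of such a bridge
looks roughly like this (a sketch, not the actual forge-bridge code;
the API base URL, the Patchwork field names, and the remote/branch
layout are all assumptions to be checked against a live instance):

```python
# Rough sketch of a patchwork-to-forge bridge (NOT the forge-bridge
# code itself).  Endpoint path and field names ("id", "name", "mbox")
# are assumptions based on the Patchwork REST API.
import re
import subprocess
import urllib.request

API = "https://patchwork.ozlabs.org/api/1.2"  # assumed instance/version

def branch_name(series_id, series_name):
    """Derive a git-safe branch name for a patch series."""
    slug = re.sub(r"[^a-zA-Z0-9]+", "-", series_name).strip("-").lower()
    return "series/%d-%s" % (series_id, slug)

def push_series(series, repo="."):
    """Apply a completed series from its mbox and push it out."""
    mbox = urllib.request.urlopen(series["mbox"]).read()
    branch = branch_name(series["id"], series["name"])
    subprocess.run(["git", "-C", repo, "checkout", "-b", branch, "master"],
                   check=True)
    subprocess.run(["git", "-C", repo, "am"], input=mbox, check=True)
    subprocess.run(["git", "-C", repo, "push", "origin", branch], check=True)
```

The real work is polling for new series and handling `git am` failures;
the happy path really is this small.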

The really tricky bit is the comment bridge - syncing comments between
the forge and the ML is legitimately tricky.

For the reasons Konstantin identifies, I'm not sure that this is the
path we want to go down, but I think there's a lot of value in keeping
these discussions practical, at least in part.

>
> Thanks!
>
> Cheers,
> Don

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 19:05               ` Konstantin Ryabitsev
  2019-10-08 20:32                 ` Don Zickus
@ 2019-10-09 21:35                 ` Laura Abbott
  2019-10-09 21:54                   ` Konstantin Ryabitsev
  1 sibling, 1 reply; 102+ messages in thread
From: Laura Abbott @ 2019-10-09 21:35 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Don Zickus
  Cc: Steven Rostedt, Daniel Axtens, David Miller, sir, nhorman, workflows

On 10/8/19 3:05 PM, Konstantin Ryabitsev wrote:
> On Tue, Oct 08, 2019 at 01:39:02PM -0400, Don Zickus wrote:
>> Regardless, I think what you wrote reinforces the idea that emailing a
>> patch series (and their vX followups) is messy for the maintainer and a
>> more evolved idea is to let a forge take git-push as input.
> 
> I'm pretty opposed to the idea of forges, because this approach makes it very easy to knock out infrastructure critical to the project's ability to quickly roll out fixes. Imagine a situation where there's a zero-day remote root kernel exploit -- the attackers would be interested in ensuring that it remains unpatched for as long as possible, so we can imagine that they will target any central infrastructure where a fix can be developed and posted.
> 
> Currently, such an attack would be ineffective because even if kernel.org is knocked out entirely, collaboration will still happen directly over email between maintainers and Linus, and a fix can be posted on any number of worldwide resources -- as long as it carries Linus's signature, it will be trusted. If we switch to require a central forge, then knocking out that resource will require that maintainers and developers scramble to find some kind of backup channel (like falling back to email). And if we're still falling back to email, then we're not really solving the larger underlying problem of "what should we use instead of email."
> 

I'd argue that e-mail as a backup solution is perfectly fine. The issue
with e-mail is that it's not scaling. If something did happen and we
needed to temporarily move back to e-mail, chances are it would be
at a smaller scale.

> We also shouldn't forget trigger-happy governments that like to ban troves of IP addresses in their chase after "the safe internet." Github has already been banned a couple of times in China and Russia, and chances are that this will continue.
> 

We've had this problem with e-mail spam filtering too. Any sort
of system seems likely to have this problem of needing to block
unwanted content while, purposely or not, also blocking non-malicious
content.

> This doesn't mean that forges are entirely out -- but they must remain mere tools that participate in a globally decentralized, developer-attestable, self-archiving messaging service. Maybe let's call that "kernel developer bus" or "kdbus" -- pretty sure that name hasn't been used before.
> 

The big issue I see with anything decentralized is that as things
grow people don't actually want to host their own infrastructure.
Think about the decline in the number of people who host their own
e-mail server. Anything decentralized would still presumably require
a server somewhere, so you're either going to raise the bar to entry
by requiring people to set up their own server or end up with people
still relying on a service somewhere. This feels like it ends up with
the situation we have today where most things are locally optimized
but on average the situation is still lousy.

You've articulated the reasons against centralization
very well from an admin point of view (which I won't dispute) but at
least from a user point of view a centralized forge infrastructure is
great because I don't have to worry about it. My university/company
doesn't have to set anything up for me to contribute. I get we are
probably going to end up optimizing more for the maintainer here but
it's worth thinking about how we could get forge-like benefits where
most users don't have to run infrastructure.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-08 21:35                   ` Konstantin Ryabitsev
@ 2019-10-09 21:50                     ` Laura Abbott
  2019-10-10 12:48                       ` Neil Horman
  0 siblings, 1 reply; 102+ messages in thread
From: Laura Abbott @ 2019-10-09 21:50 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Don Zickus
  Cc: Steven Rostedt, Daniel Axtens, David Miller, sir, nhorman, workflows

On 10/8/19 5:35 PM, Konstantin Ryabitsev wrote:
> On Tue, Oct 08, 2019 at 04:32:49PM -0400, Don Zickus wrote:
>>> This doesn't mean that forges are entirely out -- but they must remain mere
>>> tools that participate in a globally decentralized, developer-attestable,
>>> self-archiving messaging service. Maybe let's call that "kernel developer
>>> bus" or "kdbus" -- pretty sure that name hasn't been used before.
>>
>> If we flipped it around and used today's git-pull-request emails as a trigger
>> for forges to run their services, does that cover some of your concerns?
> 
> Not really, because it doesn't preserve any records. A pull request is a
> pointer to some code in a git repository somewhere. Like any pointer, it
> is neither self-contained nor long-lived:
> 
> - the repo can be gone a month later (or that branch deleted)
> - the PR does not preserve any discussions that happened around that
>    code such as bug reports, test success/fail matrices, or peer reviews
> 
> It's also pretty inefficient, because it requires that the pull-request
> submitter hosts a 1.5GB repository on a fast permanently-available
> connection just so they can share a few lines of changes.
> 

I know this is an issue that Outreachy interns across projects have had in
the past with trying to sync large git repos across laggy connections.

The advantage of the pull request though is that it's atomic. It has
the full history for testing. When you test a commit, you always have the
correct base. With a patch, you aren't guaranteed a base for testing.

I know there have been various attempts to provide a base for
testing, but it doesn't seem like anything has caught on enough (or maybe
it has and I've missed it completely).
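One mechanism that does exist is `git format-patch --base`, which
records the base tree as a `base-commit:` trailer inside the patch
itself, so a test system can recover the base mechanically:

```python
# git format-patch --base appends a trailer of the form
#     base-commit: <sha>
# to the patch (plus prerequisite-patch-id lines for dependencies).
# A CI system can recover the base from the raw patch text:
import re

def find_base(patch_text):
    """Return the base-commit sha recorded in a patch, or None."""
    m = re.search(r"^base-commit: ([0-9a-f]{7,40})$", patch_text, re.M)
    return m.group(1) if m else None

patch = """\
Subject: [PATCH] foo: fix bar

---
 foo.c | 2 +-

base-commit: 4f5e1b3d0f1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d
"""
base = find_base(patch)
```

Of course, recovering the base only helps if submitters actually pass
`--base` -- which is exactly the "hasn't caught on" problem.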

>> git-pull-request emails can skip the patch-translation layer (like
>> patchwork), run the automated tests, and utilize the current maintainer
>> workflows.
> 
> Where do the test reports go after they are completed? How does the
> maintainer find out that the tests succeeded? How does the next
> developer after them -- say, 3 years later -- find out what tests ran
> against that changeset before it was merged?
> 
> Instead of consolidating the fragmented landscape of Linux development,
> we are further fracturing it and making it lossier. Currently, archival
> efforts like lore.kernel.org at least preserve all discussions/reports
> cc'd to the LKML, but I'm afraid that forges will render large parts of
> the development process completely opaque.
> 

I'd argue a single forge would make things _more_ transparent. Sure if
things get cc'd to LKML they get archived but there's still plenty
of submissions that never get sent to LKML and only to a particular
list. There have been many times I've had to go searching around for
a particular patchwork for a particular list to get a patch because
the submitter forgot/chose not to cc LKML. The weakness of a single
point of failure is also an advantage: There is a single location for
all discussion and bugs and merge requests. Of course this only works
if the forge is considered the source of truth.

Thanks,
Laura

> I suggest that we stick to patch-based workflows and develop better
> tooling and distribution fabric to replace SMTP -- redecentralizing
> things in the process.
> 
>> The email issue is hard to resolve as some folks feel like it should be a
>> primary vehicle while others feel it should be a secondary vehicle.
> 
> If forges are merely participants in the communication fabric, then it
> doesn't matter which one is primary. In fact, email then becomes just
> another bridge, so those who are not interested in switching away from
> their current email-based workflow can continue using email.
> 
> -K
> 


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 21:35                 ` Laura Abbott
@ 2019-10-09 21:54                   ` Konstantin Ryabitsev
  2019-10-09 22:09                     ` Laura Abbott
                                       ` (2 more replies)
  0 siblings, 3 replies; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-09 21:54 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Don Zickus, Steven Rostedt, Daniel Axtens, David Miller, sir,
	nhorman, workflows

On Wed, Oct 09, 2019 at 05:35:39PM -0400, Laura Abbott wrote:
>>This doesn't mean that forges are entirely out -- but they must remain 
>>mere tools that participate in a globally decentralized, 
>>developer-attestable, self-archiving messaging service. Maybe let's 
>>call that "kernel developer bus" or "kdbus" -- pretty sure that name 
>>hasn't been used before.
>>
>
>The big issue I see with anything decentralized is that as things
>grow people don't actually want to host their own infrastructure.
>Think about the decline in the number of people who host their own
>e-mail server. Anything decentralized would still presumably require
>a server somewhere, so you're either going to raise the bar to entry
>by requiring people to set up their own server or end up with people
>still relying on a service somewhere. This feels like it ends up with
>the situation we have today where most things are locally optimized
>but on average the situation is still lousy.
>
>You've articulated the reasons against centralization
>very well from an admin point of view (which I won't dispute) but at
>least from a user point of view a centralized forge infrastructure is
>great because I don't have to worry about it. My university/company
>doesn't have to set anything up for me to contribute. I get we are
>probably going to end up optimizing more for the maintainer here but
>it's worth thinking about how we could get forge-like benefits where
>most users don't have to run infrastructure.

We're actually not in opposition to each other -- I expect kernel.org
(via Linux Foundation) would provide convenient bridge tools to cover 
the precise concern you mention. Think kind of like 
patchwork.kernel.org, but instead of exclusively using some local 
database that only admins at kernel.org have access to, it would provide 
a set of feeds allowing anyone else to set up a fully functioning 
replica -- or participate in the process using their own compatible 
tools.

So, in other words, the forge is still there and is still providing a 
valuable service, but it is not the single point of truth that can 
vanish and take invaluable data with it. That's my vision, and I think 
we have all we need to achieve it short of resolve, buy-in, and proper 
tooling.
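For the archival half this is already largely true: public-inbox
archives are plain git repositories, so standing up a replica is a
clone away. A sketch (the list-name/epoch URL layout here is an
assumption based on how lore.kernel.org exposes its mirrors; verify
against the instance you want to replicate):

```python
# public-inbox archives are bare git repositories, so a full local
# replica of a list archive is just a mirror clone per epoch.
import subprocess

def mirror_cmd(base_url, epoch, dest):
    """Build the git command that mirrors one epoch of an archive."""
    return ["git", "clone", "--mirror", "%s/%d" % (base_url, epoch), dest]

cmd = mirror_cmd("https://lore.kernel.org/workflows", 0, "workflows-0.git")
# subprocess.run(cmd, check=True)  # uncomment to actually fetch
```

The missing pieces are the write path and the non-mail metadata feeds,
not the replication itself.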

-K

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 21:54                   ` Konstantin Ryabitsev
@ 2019-10-09 22:09                     ` Laura Abbott
  2019-10-09 22:19                       ` Dave Airlie
  2019-10-09 22:21                     ` Eric Wong
  2019-10-10 17:52                     ` Dmitry Vyukov
  2 siblings, 1 reply; 102+ messages in thread
From: Laura Abbott @ 2019-10-09 22:09 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Don Zickus, Steven Rostedt, Daniel Axtens, David Miller, sir,
	nhorman, workflows

On 10/9/19 5:54 PM, Konstantin Ryabitsev wrote:
> On Wed, Oct 09, 2019 at 05:35:39PM -0400, Laura Abbott wrote:
>>> This doesn't mean that forges are entirely out -- but they must remain mere tools that participate in a globally decentralized, developer-attestable, self-archiving messaging service. Maybe let's call that "kernel developer bus" or "kdbus" -- pretty sure that name hasn't been used before.
>>>
>>
>> The big issue I see with anything decentralized is that as things
>> grow people don't actually want to host their own infrastructure.
>> Think about the decline in the number of people who host their own
>> e-mail server. Anything decentralized would still presumably require
>> a server somewhere, so you're either going to raise the bar to entry
>> by requiring people to set up their own server or end up with people
>> still relying on a service somewhere. This feels like it ends up with
>> the situation we have today where most things are locally optimized
>> but on average the situation is still lousy.
>>
>> You've articulated the reasons against centralization
>> very well from an admin point of view (which I won't dispute) but at
>> least from a user point of view a centralized forge infrastructure is
>> great because I don't have to worry about it. My university/company
>> doesn't have to set anything up for me to contribute. I get we are
>> probably going to end up optimizing more for the maintainer here but
>> it's worth thinking about how we could get forge-like benefits where
>> most users don't have to run infrastructure.
> 
> We're actually not in opposition to each other -- I expect kernel.org
> (via Linux Foundation) would provide convenient bridge tools to cover the precise concern you mention. Think kind of like patchwork.kernel.org, but instead of exclusively using some local database that only admins at kernel.org have access to, it would provide a set of feeds allowing anyone else to set up a fully functioning replica -- or participate in the process using their own compatible tools.
> 
> So, in other words, the forge is still there and is still providing a valuable service, but it is not the single point of truth that can vanish and take invaluable data with it. That's my vision, and I think we have all we need to achieve it short of resolve, buy-in, and proper tooling.
> 

I'll admit I'm skeptical about the "participate with their own tools"
bit, simply because you end up with too many sides arguing about
standards and either n buggy implementations or effectively a single
implementation anyway.

I'd also love to be proven wrong and would be interested to see a
proof of concept.

Thanks,
Laura

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 22:09                     ` Laura Abbott
@ 2019-10-09 22:19                       ` Dave Airlie
  0 siblings, 0 replies; 102+ messages in thread
From: Dave Airlie @ 2019-10-09 22:19 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Konstantin Ryabitsev, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Thu, 10 Oct 2019 at 08:09, Laura Abbott <labbott@redhat.com> wrote:
>
> On 10/9/19 5:54 PM, Konstantin Ryabitsev wrote:
> > On Wed, Oct 09, 2019 at 05:35:39PM -0400, Laura Abbott wrote:
> >>> This doesn't mean that forges are entirely out -- but they must remain mere tools that participate in a globally decentralized, developer-attestable, self-archiving messaging service. Maybe let's call that "kernel developer bus" or "kdbus" -- pretty sure that name hasn't been used before.
> >>>
> >>
> >> The big issue I see with anything decentralized is that as things
> >> grow people don't actually want to host their own infrastructure.
> >> Think about the decline in the number of people who host their own
> >> e-mail server. Anything decentralized would still presumably require
> >> a server somewhere, so you're either going to raise the bar to entry
> >> by requiring people to set up their own server or end up with people
> >> still relying on a service somewhere. This feels like it ends up with
> >> the situation we have today where most things are locally optimized
> >> but on average the situation is still lousy.
> >>
> >> You've articulated the reasons against centralization
> >> very well from an admin point of view (which I won't dispute) but at
> >> least from a user point of view a centralized forge infrastructure is
> >> great because I don't have to worry about it. My university/company
> >> doesn't have to set anything up for me to contribute. I get we are
> >> probably going to end up optimizing more for the maintainer here but
> >> it's worth thinking about how we could get forge-like benefits where
> >> most users don't have to run infrastructure.
> >
> > We're actually not in opposition to each other -- I expect kernel.org
> > (via Linux Foundation) would provide convenient bridge tools to cover the precise concern you mention. Think kind of like patchwork.kernel.org, but instead of exclusively using some local database that only admins at kernel.org have access to, it would provide a set of feeds allowing anyone else to set up a fully functioning replica -- or participate in the process using their own compatible tools.
> >
> > So, in other words, the forge is still there and is still providing a valuable service, but it is not the single point of truth that can vanish and take invaluable data with it. That's my vision, and I think we have all we need to achieve it short of resolve, buy-in, and proper tooling.
> >
>
> I'll admit I'm skeptical about the "participate with their own tools"
> bit, simply because you end up with too many sides arguing about
> standards and either n buggy implementations or effectively a single
> implementation anyway.

I'm with Laura, at the point where you have to write a fully qualified
spec just so few people can keep their own workflows I feel you've
taken things too far.

Dave.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 21:54                   ` Konstantin Ryabitsev
  2019-10-09 22:09                     ` Laura Abbott
@ 2019-10-09 22:21                     ` Eric Wong
  2019-10-09 23:56                       ` Konstantin Ryabitsev
  2019-10-10 17:52                     ` Dmitry Vyukov
  2 siblings, 1 reply; 102+ messages in thread
From: Eric Wong @ 2019-10-09 22:21 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Laura Abbott
  Cc: Don Zickus, Steven Rostedt, Daniel Axtens, David Miller, sir,
	nhorman, workflows

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Wed, Oct 09, 2019 at 05:35:39PM -0400, Laura Abbott wrote:
> > > This doesn't mean that forges are entirely out -- but they must
> > > remain mere tools that participate in a globally decentralized,
> > > developer-attestable, self-archiving messaging service. Maybe let's
> > > call that "kernel developer bus" or "kdbus" -- pretty sure that name
> > > hasn't been used before.
> > > 
> > 
> > The big issue I see with anything decentralized is that as things
> > grow people don't actually want to host their own infrastructure.
> > Think about the decline in the number of people who host their own
> > e-mail server. Anything decentralized would still presumably require
> > a server somewhere, so you're either going to raise the bar to entry
> > by requiring people to set up their own server or end up with people
> > still relying on a service somewhere. This feels like it ends up with
> > the situation we have today where most things are locally optimized
> > but on average the situation is still lousy.

Laura: agreed, I think the important thing is designing for
"centralization resistance"; so that anybody can replicate it.

The dangerous thing is when forges become centralized identity
providers and users get siloed into communicating through them.

I consider email a viable alternative to OpenID.  And I'd be
100% in favor of forges which give outgoing mail access to their
users so they can interop with any other SMTP servers.  If
anything it'd put more pressure on the big email providers
to preserve interoperability.

> > You've articulated the reasons against centralization
> > very well from an admin point of view (which I won't dispute) but at
> > least from a user point of view a centralized forge infrastructure is
> > great because I don't have to worry about it. My university/company
> > doesn't have to set anything up for me to contribute. I get we are
> > probably going to end up optimizing more for the maintainer here but
> > it's worth thinking about how we could get forge-like benefits where
> > most users don't have to run infrastructure.
> 
> We're actually not in opposition to each other -- I expect kernel.org
> (via Linux Foundation) would provide convenient bridge tools to cover the
> precise concern you mention. Think kind of like patchwork.kernel.org, but
> instead of exclusively using some local database that only admins at
> kernel.org have access to, it would provide a set of feeds allowing anyone
> else to set up a fully functioning replica -- or participate in the process
> using their own compatible tools.
> 
> So, in other words, the forge is still there and is still providing a
> valuable service, but it is not the single point of truth that can vanish
> and take invaluable data with it. That's my vision, and I think we have all
> we need to achieve it short of resolve, buy-in, and proper tooling.

Konstantin: 100% agreed.

I actually hope fewer people subscribe to mailing lists and instead read
via NNTP[*], because the subscriber list is centralized and
distributing it would also be a privacy violation.

One of my long-term visions is to have an agreed upon way to
have entirely forkable development communities.  Git already
gave us forkable code repositories.


[*] I'm planning on better client-side tooling around NNTP for
    reading email (still relying on SMTP to send).
    And maybe an NNTP -> POP3 server for webmail users since
    every webmail service can import POP3...

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 22:21                     ` Eric Wong
@ 2019-10-09 23:56                       ` Konstantin Ryabitsev
  2019-10-10  0:07                         ` Eric Wong
  2019-10-10  7:35                         ` Nicolas Belouin
  0 siblings, 2 replies; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-09 23:56 UTC (permalink / raw)
  To: Eric Wong
  Cc: Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, sir, nhorman, workflows

On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
> One of my long-term visions is to have an agreed upon way to
> have entirely forkable development communities.  Git already
> gave us forkable code repositories.

FYI, this does already exist in the form of Fossil:
https://www.fossil-scm.org/home/doc/trunk/www/index.wiki

The main reason why we can't really consider it is because it requires
moving away from git, which is a non-starter.

-K

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 23:56                       ` Konstantin Ryabitsev
@ 2019-10-10  0:07                         ` Eric Wong
  2019-10-10  7:35                         ` Nicolas Belouin
  1 sibling, 0 replies; 102+ messages in thread
From: Eric Wong @ 2019-10-10  0:07 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, sir, nhorman, workflows

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
> > One of my long-term visions is to have an agreed upon way to
> > have entirely forkable development communities.  Git already
> > gave us forkable code repositories.
> 
> FYI, this does already exist in the form of Fossil:
> https://www.fossil-scm.org/home/doc/trunk/www/index.wiki

Not quite, they still have active sqlite-dev and sqlite-users
mailing lists and I don't think they intend to replace/augment
them with Fossil.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 23:56                       ` Konstantin Ryabitsev
  2019-10-10  0:07                         ` Eric Wong
@ 2019-10-10  7:35                         ` Nicolas Belouin
  2019-10-10 12:53                           ` Steven Rostedt
  2019-10-10 14:21                           ` Dmitry Vyukov
  1 sibling, 2 replies; 102+ messages in thread
From: Nicolas Belouin @ 2019-10-10  7:35 UTC (permalink / raw)
  To: Konstantin Ryabitsev, Eric Wong
  Cc: Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, sir, nhorman, workflows

On 10/10/19 1:56 AM, Konstantin Ryabitsev wrote:
> On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
>> One of my long-term visions is to have an agreed upon way to
>> have entirely forkable development communities.  Git already
>> gave us forkable code repositories.
> FYI, this does already exist in the form of Fossil:
> https://www.fossil-scm.org/home/doc/trunk/www/index.wiki
>
> The main reason why we can't really consider it is because it requires
> moving away from git, which is a non-starter for anyone.
>
> -K
Maybe the solution is to build this kind of feature into git itself;
many of the proposed solutions are tools layered over git or using git.
A tool over git has the issue of conveying the data while keeping
it decentralized, whereas a tool using git doesn't scale because
of the way git *currently* stores things.
We could improve git so it can store many more objects (be they in-review
commits or previous versions of patches) and then use that to get the kind
of features we envy from fossil.

Nicolas

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 21:50                     ` Laura Abbott
@ 2019-10-10 12:48                       ` Neil Horman
  0 siblings, 0 replies; 102+ messages in thread
From: Neil Horman @ 2019-10-10 12:48 UTC (permalink / raw)
  To: Laura Abbott
  Cc: Konstantin Ryabitsev, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, sir, nhorman, workflows

On Wed, Oct 09, 2019 at 05:50:36PM -0400, Laura Abbott wrote:
> On 10/8/19 5:35 PM, Konstantin Ryabitsev wrote:
> > On Tue, Oct 08, 2019 at 04:32:49PM -0400, Don Zickus wrote:
> > > > This doesn't mean that forges are entirely out -- but they must remain mere
> > > > tools that participate in a globally decentralized, developer-attestable,
> > > > self-archiving messaging service. Maybe let's call that "kernel developer
> > > > bus" or "kdbus" -- pretty sure that name hasn't been used before.
> > > 
> > > If we flipped it around and used today's git-pull-request emails as a trigger
> > > for forges to run their services, does that cover some of your concerns?
> > 
> > Not really, because it doesn't preserve any records. A pull request is a
> > pointer to some code in a git repository somewhere. Like any pointer, it
> > is neither self-contained nor long-lived:
> > 
> > - the repo can be gone a month later (or that branch deleted)
> > - the PR does not preserve any discussions that happened around that
> >    code such as bug reports, test success/fail matrices, or peer reviews
> > 
> > It's also pretty inefficient, because it requires that the pull-request
> > submitter hosts a 1.5GB repository on a fast permanently-available
> > connection just so they can share a few lines of changes.
> > 
> 
> I know this is an issue that Outreachy interns across projects have had in
> the past with trying to sync large git repos across laggy connections.
> 
I wonder if git's alternates feature could be expanded here to support
this kind of workflow, i.e. if a child tree forked from a parent
project could just point at the master tree's object database, and
store only the new objects pushed to it that don't exist in the master
tree.  That would greatly reduce the storage space needed for forks.
I expect this is what forge solutions do under the covers anyway; it
might be nice to codify that in direct git usage.
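For reference, git's existing "alternates" mechanism already comes
close to this; a minimal local sketch (all paths here are stand-ins
for a real hosting setup):

```shell
set -e
# A "fork" whose object database borrows from the parent repository via
# alternates, so only new objects need local storage.
git init -q parent
git -C parent -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m base
# --reference records the parent's object store instead of copying it:
git clone -q --bare --reference "$PWD/parent" "$PWD/parent" fork.git
# The borrowing is recorded in this file rather than as duplicate objects:
cat fork.git/objects/info/alternates
```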

Neil


^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-10  7:35                         ` Nicolas Belouin
@ 2019-10-10 12:53                           ` Steven Rostedt
  2019-10-10 14:21                           ` Dmitry Vyukov
  1 sibling, 0 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-10-10 12:53 UTC (permalink / raw)
  To: Nicolas Belouin
  Cc: Konstantin Ryabitsev, Eric Wong, Laura Abbott, Don Zickus,
	Daniel Axtens, David Miller, sir, nhorman, workflows

On Thu, 10 Oct 2019 09:35:12 +0200
Nicolas Belouin <nicolas.belouin@gandi.net> wrote:

> Maybe the solution is to build such kind of features within git,
> many proposed solutions with tools over git or using git.
> A tool over git has the issue of conveying the data and making
> it non-centralized, whereas a tool using git is non-scalable because
> of the way git is *currently* storing things.
> Improving git to make it able to store many more objects (be it in-review
> commits or previous versions of patches) and then use it to get the kind
> of features we can envy from fossil.

To do this, we would have to make it so a user could pull just what
they want: the history of the code without the comments, or optionally
the reviews, etc. We don't want to bloat git so much that the
repository becomes too big just to compile the kernel.
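git's partial clone support already allows pulling roughly "just what
you want"; a minimal local sketch (the repo names are invented):

```shell
set -e
# Blobless partial clone: history and trees come down up front, file
# contents are fetched lazily only when actually needed.
git init -q src
echo data > src/f
git -C src add f
git -C src -c user.email=a@b -c user.name=a commit -q -m c1
# The server side must allow filtered fetches:
git -C src config uploadpack.allowFilter true
git clone -q --filter=blob:none "file://$PWD/src" dst
git -C dst log --oneline    # full history present; blobs on demand
```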

-- Steve

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-10  7:35                         ` Nicolas Belouin
  2019-10-10 12:53                           ` Steven Rostedt
@ 2019-10-10 14:21                           ` Dmitry Vyukov
  2019-10-11  7:12                             ` Nicolas Belouin
  1 sibling, 1 reply; 102+ messages in thread
From: Dmitry Vyukov @ 2019-10-10 14:21 UTC (permalink / raw)
  To: Nicolas Belouin
  Cc: Konstantin Ryabitsev, Eric Wong, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Thu, Oct 10, 2019 at 9:39 AM Nicolas Belouin
<nicolas.belouin@gandi.net> wrote:
>
> On 10/10/19 1:56 AM, Konstantin Ryabitsev wrote:
> > On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
> >> One of my long-term visions is to have an agreed upon way to
> >> have entirely forkable development communities.  Git already
> >> gave us forkable code repositories.
> > FYI, this does already exist in the form of Fossil:
> > https://www.fossil-scm.org/home/doc/trunk/www/index.wiki
> >
> > The main reason why we can't really consider it is because it requires
> > moving away from git, which is a non-starter for anyone.
> >
> > -K
> Maybe the solution is to build such kind of features within git,
> many proposed solutions with tools over git or using git.
> A tool over git has the issue of conveying the data and making
> it non-centralized, whereas a tool using git is non-scalable because
> of the way git is *currently* storing things.
> Improving git to make it able to store many more objects (be it in-review
> commits or previous versions of patches) and then use it to get the kind
> of features we can envy from fossil.

Hi Nicolas,

I am trying to imagine what a complete solution based on git could
look like, but I am failing. It seems that there are a number of
problems beyond scalability/storage.
First, that git should probably be orthogonal to the kernel source git
trees, right? People work with multiple kernel trees, sometimes
changes migrate, parts of a series can be merged into different trees,
etc.
Then, do we do a single git where everybody has write access, or
per-developer git? If we do a global git and everybody has write
access, this looks somewhat problematic. If we have multiple
per-developer gits, then it does not solve the whole problem of
synchronization and data exchange; one can't pull from thousands of
unknown gits every time.
Where are these git[s] hosted?
What about force pushes?
What about authorization/user identification?

It would be nice to reuse git's persistent storage format and ability
to push/fetch incremental changes, but we would need to figure out
answers to these questions. Maybe I am missing something obvious.
Could you outline what a solution based on git would look like?
Also in this case git is only a transport layer (like email/SSB), it
won't solve the application layer (how patches/comments/issues/CI
results are described, systems that consume, act and present that,
etc.). So building a solid transport, even if we need to
reimplement some git functionality, will be a smaller part of the
overall effort. And building a solid transport layer that will solve
fundamental infrastructure problems well may be worth it.
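To make the transport-layer idea concrete, review metadata could
travel in a dedicated ref namespace that is fetched incrementally like
any other ref; a minimal local sketch (the refs/reviews/ naming is
invented here):

```shell
set -e
# Review metadata stored as commits under an invented refs/reviews/
# namespace, exchanged over ordinary git transport.
git init -q upstream
git -C upstream -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m base
tip=$(git -C upstream rev-parse HEAD)
tree=$(git -C upstream rev-parse 'HEAD^{tree}')
# A review "comment" commit hanging off the reviewed tip, without
# touching any branch:
rev=$(git -C upstream -c user.email=r@b -c user.name=r \
      commit-tree "$tree" -p "$tip" -m 'review: Acked-by: R')
git -C upstream update-ref refs/reviews/1/v1 "$rev"
# Any replica can sync just the review namespace, incrementally:
git clone -q "$PWD/upstream" replica
git -C replica fetch -q origin '+refs/reviews/*:refs/reviews/*'
git -C replica log -1 --format=%s refs/reviews/1/v1
```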

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-09 21:54                   ` Konstantin Ryabitsev
  2019-10-09 22:09                     ` Laura Abbott
  2019-10-09 22:21                     ` Eric Wong
@ 2019-10-10 17:52                     ` Dmitry Vyukov
  2019-10-10 20:57                       ` Theodore Y. Ts'o
  2 siblings, 1 reply; 102+ messages in thread
From: Dmitry Vyukov @ 2019-10-10 17:52 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Wed, Oct 9, 2019 at 11:54 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> On Wed, Oct 09, 2019 at 05:35:39PM -0400, Laura Abbott wrote:
> >>This doesn't mean that forges are entirely out -- but they must remain
> >>mere tools that participate in a globally decentralized,
> >>developer-attestable, self-archiving messaging service. Maybe let's
> >>call that "kernel developer bus" or "kdbus" -- pretty sure that name
> >>hasn't been used before.
> >>
> >
> >The big issue I see with anything decentralized is that as things
> >grow people don't actually want to host their own infrastructure.
> >Think about the decline in the number of people who host their own
> >e-mail server. Anything decentralized would still presumably require
> >a server somewhere, so you're going to either raising the bar to entry
> >by requiring people to set up their own server or end up with people
> >still relying on a service somewhere. This feels like it ends up with
> >the situation we have today where most things are locally optimized
> >but on average the situation is still lousy.
> >
> >You've articulated the reasons against centralization
> >very well from an admin point of view (which I won't dispute) but at
> >least from a user point of view a centralized forge infrastructure is
> >great because I don't have to worry about it. My university/company
> >doesn't have to set anything up for me to contribute. I get we are
> >probably going to end up optimizing more for the maintainer here but
> >it's worth thinking about how we could get forge-like benefits where
> >most users don't have to run infrastructure.
>
> We're actually not in opposition to each other -- I expect kernel.org
> (via Linux Foundation) would provide convenient bridge tools to cover
> the precise concern you mention. Think kind of like
> patchwork.kernel.org, but instead of exclusively using some local
> database that only admins at kernel.org have access to, it would provide
> a set of feeds allowing anyone else to set up a fully functioning
> replica -- or participate in the process using their own compatible
> tools.
>
> So, in other words, the forge is still there and is still providing a
> valuable service, but it is not the single point of truth that can
> vanish and take invaluable data with it. That's my vision, and I think
> we have all we need to achieve it short of resolve, buy-in, and proper
> tooling.

I know you all love Gerrit but just to clarify :)
Gerrit stores all metadata in a git repo; all users can have a replica,
and you can always have, say, a "backup" replica maintained
automatically on the side. Patches and their versions are committed
into git under special branches (e.g. change/XXX/version/YYY); comments
and metadata are fairly straightforward JSON ("this is comment text for
line X", etc.), also committed into git, so one can always read that in
and transform it into any other format. And you can also run Gerrit
locally over your replica.
So it seems that such a solution would satisfy most of your
requirements, right? Namely, it does expose all of the underlying raw
data in a reasonable format.
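From memory, Gerrit's NoteDb ref layout (sharded by the last two
digits of the change number) looks roughly like this:

```
refs/changes/34/1234/1     # patchset 1 of change 1234
refs/changes/34/1234/2     # patchset 2
refs/changes/34/1234/meta  # NoteDb commits carrying comments and votes
```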

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-10 17:52                     ` Dmitry Vyukov
@ 2019-10-10 20:57                       ` Theodore Y. Ts'o
  2019-10-11 11:01                         ` Dmitry Vyukov
  2019-10-14 19:08                         ` Han-Wen Nienhuys
  0 siblings, 2 replies; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-10-10 20:57 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows

On Thu, Oct 10, 2019 at 07:52:50PM +0200, Dmitry Vyukov wrote:
> I know you all love Gerrit but just to clarify :)
> Gerrit stores all metadata in a git repo, all users can have a replica
> and you can always have, say, a "backup" replica on a side done
> automatically. Patches and versions of the patches are committed into
> git into special branches (e.g. change/XXX/version/YYY), comments and
> metadata are in a pretty straightforward json (this is comment text
> for line X, etc) also committed into git, so one can always read that
> in and transform into any other format. And you can also run Gerrit
> locally over your replica.

Konstantin has spoken about some of his concerns about git's scalability,
and it's important to remember that just because Gerrit has shown to
work well on some very large repositories, it doesn't necessarily mean
that it will work well on git repositories using the open source C
implementation of git.

That's because Gerrit as used by Google (and made available in various
public-facing Gerrit servers) uses a Git-on-Borg implementation[1],
where the storage is done using Google's internal storage
infrastructure.  This is implemented on top of Jgit (which is git
implemented in Java)[2].

[1] https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/xM9THFr55L8
[2] https://groups.google.com/a/chromium.org/d/msg/blink-dev/GZOeMUPE7Bc/LmxSj_ezcQ8J

That doesn't necessarily mean that git can't be made to work well
enough as a transport layer.  I'm just pointing out this may be the
explanation for why some say, "See, Gerrit works really well on
super-large repos storing huge numbers of review comments" and others
are saying, "it would be really scary to run git as a transport layer
on kernel.org servers because git won't scale well to that kind of
load."

Both may be correct.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-10 14:21                           ` Dmitry Vyukov
@ 2019-10-11  7:12                             ` Nicolas Belouin
  2019-10-11 13:56                               ` Dmitry Vyukov
  0 siblings, 1 reply; 102+ messages in thread
From: Nicolas Belouin @ 2019-10-11  7:12 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Konstantin Ryabitsev, Eric Wong, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows



On 10/10/19 4:21 PM, Dmitry Vyukov wrote:
> On Thu, Oct 10, 2019 at 9:39 AM Nicolas Belouin
> <nicolas.belouin@gandi.net> wrote:
>> On 10/10/19 1:56 AM, Konstantin Ryabitsev wrote:
>>> On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
>>>> One of my long-term visions is to have an agreed upon way to
>>>> have entirely forkable development communities.  Git already
>>>> gave us forkable code repositories.
>>> FYI, this does already exist in the form of Fossil:
>>> https://www.fossil-scm.org/home/doc/trunk/www/index.wiki
>>>
>>> The main reason why we can't really consider it is because it requires
>>> moving away from git, which is a non-starter for anyone.
>>>
>>> -K
>> Maybe the solution is to build such kind of features within git,
>> many proposed solutions with tools over git or using git.
>> A tool over git has the issue of conveying the data and making
>> it non-centralized, whereas a tool using git is non-scalable because
>> of the way git is *currently* storing things.
>> Improving git to make it able to store many more objects (be it in-review
>> commits or previous versions of patches) and then use it to get the kind
>> of features we can envy from fossil.
> Hi Nicolas,
>
> I am trying to imagine how a complete solution based on git could look
> like, but I am failing. It seems that there is a number of problems
> beyond scalability/storage.
> First, that git probably should be orthogonal to the kernel source git
> trees, right? People work with multiple kernel trees, sometimes
> changes migrate, parts of a series can be merged into different trees,
> etc.
This I think can't be helped when going through a Merge/Pull request
workflow, as a request will always be seen as a whole applied to a
specific tree. I agree, though the external tool solution might help
for tree migration. I don't really have a solution for these issues.
> Then, do we do a single git where everybody has write access, or
> per-developer git? If we do a global git and everybody has write
> access, this looks somewhat problematic. If we have multiple
> per-developer gits, then it does not solve the whole problem of
> synchronization and data exchange, one can't pull from thousands of
> unknown gits every time.
For this one I thought of a per-developer git with a branch per
series, with the MR discussion attached to the branch on the developer
side and, on the maintainer/reference tree side, a reference
indicating a remote MR is in progress, with only those kinds of
special refs being writable.

On MR acceptance, the discussion/history of the branch gets into the
maintainer/reference tree through the merge commit object and gets
included whenever one pulls from this tree.
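A stock git server can already enforce "only those special refs are
writable" with an update hook; a minimal sketch (the refs/mr/
namespace is invented):

```shell
set -e
git init -q --bare server.git
# Install an update hook: receive-pack runs it once per pushed ref as
#   update <refname> <old-sha1> <new-sha1>
cat > server.git/hooks/update <<'EOF'
#!/bin/sh
case "$1" in
  refs/mr/*) exit 0 ;;                          # MR refs are writable
  *) echo "ref $1 is read-only" >&2; exit 1 ;;  # everything else rejected
esac
EOF
chmod +x server.git/hooks/update
# Exercise it: a push to refs/mr/ succeeds, one to a branch is refused.
git init -q work
git -C work -c user.email=a@b -c user.name=a \
    commit -q --allow-empty -m mr1
git -C work push -q ../server.git HEAD:refs/mr/1/v1
git -C work push -q ../server.git HEAD:refs/heads/master 2>/dev/null \
  || echo "master push rejected"
```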

> Where is this git[s] are hosted?
Wherever the developer/maintainer wants it to be hosted.
> What about force pushes?
Well, per the rule of not changing the commits of a public branch, a
force-push ends up as a new reference linking to the old one, for
traceability.
> What about authorization/user identification?
For this one I thought of an X.509-like tree where you need to get
your key/identity signed by some kind of CA, so that you create an
account through that CA and can push to whatever tree trusts that CA
or one of its parents. I thought about a root with two main branches:
one for "Anonymous"-style registration, where you can use something
like OpenID to get identified and your matching certificate signed,
and one for "Higher trust", given to e.g. Linus so he can sign his
maintainers, who can in turn sign sub-maintainers, and so on. You can
then set up your repository to trust the root if you want anyone to be
able to push, or some sub-CA to restrict it, and maybe even add a
depth restriction to allow hierarchical work as done today.
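Such a two-level chain can be sketched with openssl (all names are
invented; this only illustrates the trust model, not a real
deployment):

```shell
set -e
# Hypothetical chain: root -> maintainer (intermediate CA) -> developer.
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=root" -days 2 \
  -keyout root.key -out root.pem 2>/dev/null
printf 'basicConstraints=CA:TRUE\n' > ca.ext
openssl req -newkey rsa:2048 -nodes -subj "/CN=maintainer" \
  -keyout maint.key -out maint.csr 2>/dev/null
openssl x509 -req -in maint.csr -CA root.pem -CAkey root.key \
  -CAcreateserial -extfile ca.ext -days 1 -out maint.pem 2>/dev/null
openssl req -newkey rsa:2048 -nodes -subj "/CN=developer" \
  -keyout dev.key -out dev.csr 2>/dev/null
openssl x509 -req -in dev.csr -CA maint.pem -CAkey maint.key \
  -CAcreateserial -days 1 -out dev.pem 2>/dev/null
# A tree trusting only the root can still verify the developer:
openssl verify -CAfile root.pem -untrusted maint.pem dev.pem
```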
>
> It would be nice to reuse git's persistent storage format and ability
> to push/fetch incremental changes, but we would need to figure out
> answers to these questions. Maybe I am missing something obvious.
> Could you outline how a solution based on git could look like?
> Also in this case git is only a transport layer (like email/SSB), it
> won't solve the application layer (how patches/comments/issues/CI
> results are described, systems that consume, act and present that,
> etc). So building a solid transport, even if we will need to
> reimplement some git functionality, will be a smaller part of the
> overall effort. And building a solid transport layer that will solve
> fundamental infrastructure problems well may be worth it.

Nicolas

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-10 20:57                       ` Theodore Y. Ts'o
@ 2019-10-11 11:01                         ` Dmitry Vyukov
  2019-10-11 12:54                           ` Theodore Y. Ts'o
  2019-10-14 19:08                         ` Han-Wen Nienhuys
  1 sibling, 1 reply; 102+ messages in thread
From: Dmitry Vyukov @ 2019-10-11 11:01 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows

On Thu, Oct 10, 2019 at 10:58 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> On Thu, Oct 10, 2019 at 07:52:50PM +0200, Dmitry Vyukov wrote:
> > I know you all love Gerrit but just to clarify :)
> > Gerrit stores all metadata in a git repo, all users can have a replica
> > and you can always have, say, a "backup" replica on a side done
> > automatically. Patches and versions of the patches are committed into
> > git into special branches (e.g. change/XXX/version/YYY), comments and
> > metadata are in a pretty straightforward json (this is comment text
> > for line X, etc) also committed into git, so one can always read that
> > in and transform into any other format. And you can also run Gerrit
> > locally over your replica.
>
> Konstantin has spoken about some his concerns about git's scalability,
> and it's important to remember that just because Gerrit has shown to
> work well on some very large repositories, it doesn't necessarily mean
> that it will work well on git repositories using the open source C
> implementation of git.
>
> That's because Gerrit as used by Google (and made available in various
> public-facing Gerrit servers) uses a Git-on-Borg implementation[1],
> where the storage is done using Google's internal storage
> infrastructure.  This is implemented on top of Jgit (which is git
> implemented in Java)[2].
>
> [1] https://groups.google.com/a/chromium.org/forum/#!topic/chromium-dev/xM9THFr55L8
> [2] https://groups.google.com/a/chromium.org/d/msg/blink-dev/GZOeMUPE7Bc/LmxSj_ezcQ8J
>
> That doesn't necessarily mean that git can't be made to work well
> enough as a transport layer.  I'm just pointing out this may be the
> explanation for why some say, "See, Gerrit works really well on
> super-large repos storing huge numbers of review comments" and others
> are saying, "it would be really scary to run git as a transport layer
> on kernel.org servers because git won't scale well to that kind of
> load."
>
> Both may be correct.
>
> Cheers,
>
>                                         - Ted

Good point. I wonder if it's possible to choose a more git-friendly
storage scheme and to optimize the OSS git to the necessary
scalability level. I am asking because "optimizing some piece of
software" looks like a smaller part of the overall problem in the
grand scheme of things (unless, of course, there are some fundamental
conflicts between git and efficient storage for this type of data).

However, I mainly wanted to point out a higher-level consideration.
Total doomsday resistance, assuming every party in the world is an
adversary, and total decentralization are nice properties, but each of
them makes project design and implementation an order of magnitude
harder. So the question is: would the following requirements be
enough:
 - open-source implementation
 - transparent raw data format
 - ability to export and backup all data natively (each user may even
have a complete replica of whole raw data for the "local patchwork"
thing)
 - ability to do most of the work locally
 - not owning user identities/be able to export user identities
 - maybe something else, but you get the idea
?

We already have patchwork and public inbox deployed on kernel.org, so
could this whole development support system also be something simply
deployed on kernel.org? That would make lots of things _much_ easier.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-11 11:01                         ` Dmitry Vyukov
@ 2019-10-11 12:54                           ` Theodore Y. Ts'o
  0 siblings, 0 replies; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-10-11 12:54 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows

On Fri, Oct 11, 2019 at 01:01:04PM +0200, Dmitry Vyukov wrote:
> Good point. I wonder if it's possible to choose a more-git-friendly
> storage scheme and to optimize the OSS git to get to the necessary
> scalability level. I am asking because "optimizing some piece of
> software" looks like a smaller part in the grand scheme of things of
> the overall problem (unless of course there are some fundamental
> conflicts between git and efficient storage for this type of data).

It's certainly possible; Jgit is open source (licensed under the
Eclipse Public License), and while G-o-B is not open sourced (since it
needs to interface with internal Google infrastructure, such as Big
Table, it wouldn't help if it was open sourced), I suspect it wouldn't
be hard to create new Java classes for jgit which used some other
backend storage, such as Postgres.  That being said, running a
Jgit/Postgres server is probably going to be quite a bit more
complicated than the OSS git.

> However, I mainly wanted to point out a higher-level consideration.
> Total doomsday resistance, assuming every party in the world is an
> adversary and total decentralization are nice properties, but each of
> them makes project design and implementation an order or magnitude
> harder. So the question is: are the following requirements would be
> enough:
>  - open-source implementation
>  - transparent raw data format
>  - ability to export and backup all data natively (each user may even
> have a complete replica of whole raw data for the "local patchwork"
> thing)
>  - ability to do most of the work locally
>  - not owning user identities/be able to export user identities
>  - maybe something else, but you get the idea
> ?

This is a good point, and I agree.  And whether we use something like
Jgit, or OSS git, or Google's G-o-B, or Microsoft's highly scalable
git reimplementation, really is an implementation detail.  So long
as we know that our requirements have viable solutions, we don't need
to rathole on which solution is best.

And if we have an independent implementation which _could_ scale, and
which could be stood up and demonstrated to work for a small set of
trees, then if some large company were to contribute the
infrastructure so that it could run at scale for most of the kernel
subsystems' git trees, perhaps we would be OK with a solution which is
centralized-at-least-in-practice?

After all, lore.kernel.org is centralized at least in practice, and
it's convenient that it archives many of the kernel's mailing lists
(on vger as well as others) because when a message is CC'ed to many
lists, it can be more efficient when it is run centrally.  But the
fact that public inbox *can* be run by others is enough to keep people
from being worried about things being too centralized.

						- Ted

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-11  7:12                             ` Nicolas Belouin
@ 2019-10-11 13:56                               ` Dmitry Vyukov
  2019-10-14  7:31                                 ` Nicolas Belouin
  0 siblings, 1 reply; 102+ messages in thread
From: Dmitry Vyukov @ 2019-10-11 13:56 UTC (permalink / raw)
  To: Nicolas Belouin
  Cc: Konstantin Ryabitsev, Eric Wong, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Fri, Oct 11, 2019 at 9:12 AM Nicolas Belouin
<nicolas.belouin@gandi.net> wrote:
> > <nicolas.belouin@gandi.net> wrote:
> >> On 10/10/19 1:56 AM, Konstantin Ryabitsev wrote:
> >>> On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
> >>>> One of my long-term visions is to have an agreed upon way to
> >>>> have entirely forkable development communities.  Git already
> >>>> gave us forkable code repositories.
> >>> FYI, this does already exist in the form of Fossil:
> >>> https://www.fossil-scm.org/home/doc/trunk/www/index.wiki
> >>>
> >>> The main reason why we can't really consider it is because it requires
> >>> moving away from git, which is a non-starter for anyone.
> >>>
> >>> -K
> >> Maybe the solution is to build such kind of features within git,
> >> many proposed solutions with tools over git or using git.
> >> A tool over git has the issue of conveying the data and making
> >> it non-centralized, whereas a tool using git is non-scalable because
> >> of the way git is *currently* storing things.
> >> Improving git to make it able to store many more objects (be it in-review
> >> commits or previous versions of patches) and then use it to get the kind
> >> of features we can envy from fossil.
> > Hi Nicolas,
> >
> > I am trying to imagine how a complete solution based on git could look
> > like, but I am failing. It seems that there is a number of problems
> > beyond scalability/storage.
> > First, that git probably should be orthogonal to the kernel source git
> > trees, right? People work with multiple kernel trees, sometimes
> > changes migrate, parts of a series can be merged into different trees,
> > etc.
> This I think can't be helped if going through a Merge/Pull request workflow
> as a request will always be seen as a whole applied to a specific tree,
> I agree
> though the external tool solution might help for tree migration though.
> And I
> don't really have a solution for these issues.
> > Then, do we do a single git where everybody has write access, or
> > per-developer git? If we do a global git and everybody has write
> > access, this looks somewhat problematic. If we have multiple
> > per-developer gits, then it does not solve the whole problem of
> > synchronization and data exchange, one can't pull from thousands of
> > unknown gits every time.
> For this one I thought of per-developer git with a branch per series
> with MR discussion attached to the branch on the developer side and on
> maintainer/reference tree side a reference indicating a remote MR is in
> progress, with only those kind of special refs being writable.
>
> On MR acceptance the discussion/history of branch gets into the maintainer/
> reference tree through the merge commit object and gets included
> whenever one
> pulls from this tree
>
> > Where is this git[s] are hosted?
> Wherever the developper/maintainer wants it to be hosted
> > What about force pushes?
> Well a force-push as per the rule of no changing the commits of a public
> branch
> end-up as new reference linking to the old one for traceability.
> > What about authorization/user identification?
> For this one I thought of a X.509 like tree where you need to get your
> key/identity signed
> by some kind of AC so that you create an account through that AC and can
> push on whatever tree
> that trust that AC or one of its parent. I thought about a root with two
> main branches one for
> "Anonymous" kind of registration where you can use something like OpenId
> to get identified
> and get your matching certificate signed and another one for "Higher
> trust" given to e.g Linus
> so he can sign his maintainers and so they can also sign sub-maintainers
> and so on. Then you can
> set-up your repository to trust the root if you want anyone to be able
> to push or some sub-AC to
> restrict and maybe even a depth restriction to allow a hierarchical work
> as done today.
> >
> > It would be nice to reuse git's persistent storage format and ability
> > to push/fetch incremental changes, but we would need to figure out
> > answers to these questions. Maybe I am missing something obvious.
> > Could you outline how a solution based on git could look like?
> > Also in this case git is only a transport layer (like email/SSB), it
> > won't solve the application layer (how patches/comments/issues/CI
> > results are described, systems that consume, act and present that,
> > etc). So building a solid transport, even if we will need to
> > reimplement some git functionality, will be a smaller part of the
> > overall effort. And building a solid transport layer that will solve
> > fundamental infrastructure problems well may be worth it.

Thanks! This clears some of my doubts.
However, I still don't see the complete picture:
1. If I set up a git on github (as a new contributor), is it possible
to configure the github repo to trust a CA? I have not seen such an
option. I assume you need to configure your own git server manually
for this type of thing?
2. Restricting write access to some refs. Again, this does not seem to
be possible with github. Is it possible with one's own stock server?
3. If the "comments" ref is writable, can user A delete user B's
comments? Can the change/repo owner delete "unwanted" comments?
4. git transport will also need DoS/spam protection, right? Because
it's easy to create 1000 new users, each sending 1000 new merge
requests.
5. To sync my local view of the world, I need to fetch, say, the net
tree git, then from that find the set of other git trees with pending
merge requests and pull those trees, right? There may be a potential
problem with availability: if people set up gits on github, gitlab,
kernel.org, private hostings, etc., some of them may be down.

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-11 13:56                               ` Dmitry Vyukov
@ 2019-10-14  7:31                                 ` Nicolas Belouin
  0 siblings, 0 replies; 102+ messages in thread
From: Nicolas Belouin @ 2019-10-14  7:31 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Konstantin Ryabitsev, Eric Wong, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows



On 10/11/19 3:56 PM, Dmitry Vyukov wrote:
> On Fri, Oct 11, 2019 at 9:12 AM Nicolas Belouin
> <nicolas.belouin@gandi.net> wrote:
>>> <nicolas.belouin@gandi.net> wrote:
>>>> On 10/10/19 1:56 AM, Konstantin Ryabitsev wrote:
>>>>> On Wed, Oct 09, 2019 at 10:21:56PM +0000, Eric Wong wrote:
>>>>>> One of my long-term visions is to have an agreed upon way to
>>>>>> have entirely forkable development communities.  Git already
>>>>>> gave us forkable code repositories.
>>>>> FYI, this does already exist in the form of Fossil:
>>>>> https://www.fossil-scm.org/home/doc/trunk/www/index.wiki
>>>>>
>>>>> The main reason why we can't really consider it is because it requires
>>>>> moving away from git, which is a non-starter for anyone.
>>>>>
>>>>> -K
>>>> Maybe the solution is to build such kind of features within git,
>>>> many proposed solutions with tools over git or using git.
>>>> A tool over git has the issue of conveying the data and making
>>>> it non-centralized, whereas a tool using git is non-scalable because
>>>> of the way git is *currently* storing things.
>>>> Improving git to make it able to store many more objects (be it in-review
>>>> commits or previous versions of patches) and then use it to get the kind
>>>> of features we can envy from fossil.
>>> Hi Nicolas,
>>>
>>> I am trying to imagine how a complete solution based on git could look
>>> like, but I am failing. It seems that there is a number of problems
>>> beyond scalability/storage.
>>> First, that git probably should be orthogonal to the kernel source git
>>> trees, right? People work with multiple kernel trees, sometimes
>>> changes migrate, parts of a series can be merged into different trees,
>>> etc.
>> This I think can't be helped if going through a Merge/Pull request workflow
>> as a request will always be seen as a whole applied to a specific tree,
>> I agree
>> though the external tool solution might help for tree migration though.
>> And I
>> don't really have a solution for these issues.
>>> Then, do we do a single git where everybody has write access, or
>>> per-developer git? If we do a global git and everybody has write
>>> access, this looks somewhat problematic. If we have multiple
>>> per-developer gits, then it does not solve the whole problem of
>>> synchronization and data exchange, one can't pull from thousands of
>>> unknown gits every time.
>> For this one I thought of per-developer git with a branch per series
>> with MR discussion attached to the branch on the developer side and on
>> maintainer/reference tree side a reference indicating a remote MR is in
>> progress, with only those kind of special refs being writable.
>>
>> On MR acceptance the discussion/history of branch gets into the maintainer/
>> reference tree through the merge commit object and gets included
>> whenever one
>> pulls from this tree
>>
>>> Where is this git[s] are hosted?
>> Wherever the developper/maintainer wants it to be hosted
>>> What about force pushes?
>> Well a force-push as per the rule of no changing the commits of a public
>> branch
>> end-up as new reference linking to the old one for traceability.
>>> What about authorization/user identification?
>> For this one I thought of a X.509 like tree where you need to get your
>> key/identity signed
>> by some kind of AC so that you create an account through that AC and can
>> push on whatever tree
>> that trust that AC or one of its parent. I thought about a root with two
>> main branches one for
>> "Anonymous" kind of registration where you can use something like OpenId
>> to get identified
>> and get your matching certificate signed and another one for "Higher
>> trust" given to e.g Linus
>> so he can sign his maintainers and so they can also sign sub-maintainers
>> and so on. Then you can
>> set-up your repository to trust the root if you want anyone to be able
>> to push or some sub-AC to
>> restrict and maybe even a depth restriction to allow a hierarchical work
>> as done today.
>>> It would be nice to reuse git's persistent storage format and ability
>>> to push/fetch incremental changes, but we would need to figure out
>>> answers to these questions. Maybe I am missing something obvious.
>>> Could you outline how a solution based on git could look like?
>>> Also in this case git is only a transport layer (like email/SSB), it
>>> won't solve the application layer (how patches/comments/issues/CI
>>> results are described, systems that consume, act and present that,
>>> etc). So building a solid transport, even if we will need to
>>> reimplement some git functionality, will be a smaller part of the
>>> overall effort. And building a solid transport layer that will solve
>>> fundamental infrastructure problems well may be worth it.
> Thanks! This clears some of my doubts.
> However, I still don't see the complete picture:
> 1. If I setup a git on github (a new contributor), is it possible to
> setup github repo to trust an AC? I have not seen such option. I
> assume you need to be configuring you git server manually for this
> type of thing?
> 2. Restricting write access to some refs. Again, this does not seem to
> be possible with github. Is it possible with own stock server?
> 3. If the "comments" ref is writable, can user A delete user B's
> comments? Can change/repo owner delete "unwanted" comments?
> 4. git transport will also need DoS/spam protection, right? Because
> it's easy to create 1000 new users each sending 1000 new merge
> requests.
> 5. To sync my local view of the world, I need to fetch, say, net tree
> git, then from that find set of other git trees with pending merge
> requests and pull these trees, right? There may be a potential problem
> with availability: if people setup gits on github, gitlab, kernel.org,
> private hostings, etc, some of them may be down.
Well, the identity/authentication part is all new (I don't know of any
git server with that ability), but it feels to me like the only way to
ensure you don't have to create an account everywhere while still being
identified and given permissions. Moreover, if you build this into
"standard" git, you can more easily expect github and co. to implement
it and be interoperable.
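The hierarchy described here can be prototyped with plain X.509 tooling
today. A hedged sketch using openssl (the names are invented, and a real
deployment would also need CA basic constraints, revocation, depth
limits, etc.):

```shell
# Trust anchor, standing in for the root AC that repositories trust.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=Root AC" -keyout root.key -out root.crt

# A maintainer's identity, signed one level down the hierarchy.
openssl req -newkey rsa:2048 -nodes \
    -subj "/CN=Jane Maintainer" -keyout jane.key -out jane.csr
openssl x509 -req -days 365 -in jane.csr \
    -CA root.crt -CAkey root.key -CAcreateserial -out jane.crt

# Any server configured to trust the root can verify Jane offline,
# without her ever creating an account there.
openssl verify -CAfile root.crt jane.crt
```

The same chain walk extends to sub-maintainers by letting intermediate
certificates act as CAs, which is where the depth restriction mentioned
above would apply.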

Gitolite, for example, allows fine-grained permissions on refs, so I
think it should be possible to get this authorization granularity into
more git servers.

In my view, the comments ref should be append-only for everyone but the
owner: for legal reasons we must give the repo owner the ability to
discard comments (in the worst case they might be held responsible for
illegal content hosted this way).
As for changes, I see I forgot to say that I expected the "commits" to
be signed with a key/certificate that can be linked to the one used for
authentication.

Yes, the fourth point is needed whenever you allow any kind of
anonymous submission.

The fifth point is real, but I don't see a way to get a PR without an
available tree to pull from. The only alternative I can think of would
be even more prone to spam and DoS, as it would require PRs to be
pushed to the destination git, which means being able to push big
objects (a ref to an external tree is light and isn't going to be much
of a burden in the long run).
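As a concrete sketch of the "light ref in the maintainer tree" idea
(the ref layout is invented purely for illustration), the writable
namespace on the maintainer side could be nothing more than refs/mr/:

```shell
# Maintainer's reference tree: only refs/mr/* would be writable by others.
git init --bare maintainer.git

# A developer prepares a series in their own clone...
git init dev && cd dev
git config user.name Dev && git config user.email dev@example.com
git commit --allow-empty -m "series: fix foo"

# ...and registers the merge request as a lightweight ref in the
# maintainer tree.  In the real scheme the ref name would also identify
# the developer's public tree; the object pushed is just the branch tip.
git push ../maintainer.git HEAD:refs/mr/dev/fix-foo/v1
git -C ../maintainer.git for-each-ref refs/mr/
```

The ref itself is tiny, which is what keeps the spam/DoS surface
smaller than accepting full object pushes from anyone.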

Nicolas



* Re: thoughts on a Merge Request based development workflow
  2019-10-10 20:57                       ` Theodore Y. Ts'o
  2019-10-11 11:01                         ` Dmitry Vyukov
@ 2019-10-14 19:08                         ` Han-Wen Nienhuys
  2019-10-15  1:54                           ` Theodore Y. Ts'o
                                             ` (2 more replies)
  1 sibling, 3 replies; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-14 19:08 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Dmitry Vyukov, Konstantin Ryabitsev, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

(again, without html)

On Thu, Oct 10, 2019 at 11:00 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> On Thu, Oct 10, 2019 at 07:52:50PM +0200, Dmitry Vyukov wrote:
> > I know you all love Gerrit but just to clarify :)
> > Gerrit stores all metadata in a git repo, all users can have a replica
> > and you can always have, say, a "backup" replica on a side done
> > automatically. Patches and versions of the patches are committed into
> > git into special branches (e.g. change/XXX/version/YYY), comments and
> > metadata are in a pretty straightforward json (this is comment text
> > for line X, etc) also committed into git, so one can always read that
> > in and transform into any other format. And you can also run Gerrit
> > locally over your replica.
>
> Konstantin has spoken about some his concerns about git's scalability,
> and it's important to remember that just because Gerrit has shown to
> work well on some very large repositories, it doesn't necessarily mean
> that it will work well on git repositories using the open source C
> implementation of git.
>
> That's because Gerrit as used by Google (and made available in various
> public-facing Gerrit servers) uses a Git-on-Borg implementation[1],
> where the storage is done using Google's internal storage
> infrastructure.  This is implemented on top of Jgit (which is git
> implemented in Java)[2].
>



Hi there,

I manage the Gerrit backend team at Google.

It is true that Gerrit at Google runs on top of Borg (Gerrit-on-Borg,
aka GoB), but:

1) there is no real concern about Cgit's scalability, and
2) the Borg deployment has no relevant magic sauce here.

To 1): Konstantin was worried about the performance implications of git
notes.  The git-notes command stores data in a single
refs/notes/commits branch. Gerrit does use notes (the file format) as
well, but has a separate notes branch per review, so performance here
is not a concern when scaling up the number of reviews.
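To make the per-review storage concrete, here is a hedged sketch with
stock git; the ref name and JSON payload are invented for illustration:

```shell
# Toy repo with one commit standing in for a patch under review.
git init notes-demo && cd notes-demo
git config user.name Dev && git config user.email dev@example.com
git commit --allow-empty -m "patch: example change"

# One notes ref per review (here, review 42): the metadata is a note
# attached to the commit, and each review lives on its own ref, so no
# single refs/notes/commits branch becomes a scaling bottleneck.
git notes --ref=refs/notes/review-42 add -m '{"line": 10, "msg": "nit: typo"}' HEAD
git notes --ref=refs/notes/review-42 show HEAD
```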

To 2): Google needs special magic sauce because we service hundreds
of teams that work on thousands of repositories. However, here we're
talking about just the kernel itself; that is just a single
repository, and not an especially large one.  Chromium is our largest
repo, and it is about 10x larger than the Linux kernel.

Google runs Gerrit in tasks with (currently) 16G of memory each. There
are many large companies (e.g. SAP) that run much larger instances,
i.e. one can easily match GoB's performance level on a single machine.

I have been wanting to propose Gerrit as an alternative for the Linux
kernel workflow, so I might as well bring forth my arguments here.

Gerrit isn't a big favorite of many people, but some of that
perception may be outdated. Since 2016, Google has significantly
increased its investment in Gerrit. For example, we have rewritten the
web UI from scratch, and there have been many performance
improvements.

Git is a tool built to exchange code and diffs, so it seems natural to
build a review solution on top of it. Gerrit is built on top of git
and stores all its metadata in git, i.e. you can mirror review data
into other Gerrit instances losslessly.

Building a review tool is not all that easy to do well; by using
Gerrit, you get a tool that already exists, works, and has significant
corporate support. We at Google have ~11 SWEs working on Gerrit
full-time, for example, and we have support from UX research and UI
design. The amount of work to tweak Gerrit for Linux kernel
development surely is much less than building something from scratch.

Gerrit has a patchset oriented workflow (where changes are amended all
the time), which is a good fit to the kernel's development process.
Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
so it accepts URLs as IDs instead.

There is talk of building a distributed/federated tool, but if there
are policies ("Jane Doe is maintainer of the network subsystem, and
can merge changes that only touch files in net/"), then building
something decentralized is really hard. You have to build
infrastructure where Jane can prove to others who she is (PGP key
signing parties?), and some sort of distributed storage of the policy
rules.

By contrast, a centralized server can authenticate users reliably and
the server owner can define such rules.  There can still be multiple
gerrit servers, possibly sponsored by corporate entities (one from
RedHat, one from Google, etc.), and different servers can support
different authentication models (OpenID, OAuth, Google account, etc.)

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


* Re: thoughts on a Merge Request based development workflow
  2019-10-14 19:08                         ` Han-Wen Nienhuys
@ 2019-10-15  1:54                           ` Theodore Y. Ts'o
  2019-10-15 12:00                             ` Daniel Vetter
  2019-10-15 16:07                           ` Greg KH
  2019-10-15 18:37                           ` Konstantin Ryabitsev
  2 siblings, 1 reply; 102+ messages in thread
From: Theodore Y. Ts'o @ 2019-10-15  1:54 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Dmitry Vyukov, Konstantin Ryabitsev, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
> To 1) : Konstantin was worried about performance implication on git
> notes.  The git-notes command stores data in a single
> refs/notes/commits branch. Gerrit actually uses notes (the file
> format) as well, but has a single notes branch per review, so
> performance here is not a concern when scaling up the number of
> reviews.
> 
> To 2) : Google needs special magic sauce, because we service hundreds
> of teams that work on thousands of repositories. However, here we're
> talking about just the kernel itself; that is just a single
> repository, and not an especially large one.  Chromium is our largest
> repo, and it is about 10x larger than the linux kernel.

I'd be concerned about cgit here.  You mean a single notes branch per
review, but do we then need a separate file for each review?  A single
patch series can have dozens of revisions, with potentially dozens of
people commenting a large number of times, and e-mail threads that are
hundreds of messages long.  If all of these comments are being
squeezed into a single notes file, it would be quite large, and there
would also be a lot of serialization concerns.  If you mean that there
would be a single git note for each e-mail in a patch review
thread.... that would seem to be a real potential problem for cgit.

Be that as it may, that's an optimization problem, and it is solvable,
in the same way that most things are a Mere Matter of Programming.
And if you're right, and it's not actually going to be a problem, then
Huzzah!  But I suspect Konstantin's worries are ones we should at
least pay attention to.

> Gerrit isn't a big favorite of many people, but some of that
> perception may be outdated. Since 2016, Google has significantly
> increased its investment in Gerrit. For example, we have rewritten the
> web UI from scratch, and there have been many performance
> improvements.

I agree that Gerrit might be a good starting point, having used it to
review changes for Google's Data Center Kernels, as well as for
Android and ChromeOS/Cloud Optimized System kernels.  Indeed, if I'm
forced to use a non-threading mail user agent, it's far superior to
e-mail reviews.

Even if you have a threading mail agent, I'd argue that Gerrit is
better when everyone is using it, because it makes it really easy to
look at the various versions of a patch series, including "give me the
diff between the v3 and v7 version of the patch".  Having the
conversation about a particular hunk of code in-line with the code
itself is also very helpful.
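For reference, plain git can already produce that v3-vs-v7 comparison
via `git range-diff`.  A minimal sketch (the repo, tag, and branch
names are invented):

```shell
# Tiny history with two versions of the same one-patch series.
git init rd-demo && cd rd-demo
git config user.name Dev && git config user.email dev@example.com
git commit --allow-empty -m "common base" && git tag base

git checkout -b series-v3 && echo "frob v3" > frob.c \
    && git add frob.c && git commit -m "patch: add frobnicator"
git checkout -b series-v7 base && echo "frob v7" > frob.c \
    && git add frob.c && git commit -m "patch: add frobnicator (reworked)"

# Pair up the commits of the two ranges and show what changed between
# the v3 and v7 version of each patch.
git range-diff base..series-v3 base..series-v7
```

A review tool layered on git could present exactly this pairing in its
UI rather than reinventing the comparison.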

So let's talk about the sort of features that might need to be added
to allow Gerrit to work for upstream development.

> Gerrit has a patchset oriented workflow (where changes are amended all
> the time), which is a good fit to the kernel's development process.
> Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
> so it accepts URLs as IDs instead.

Yep, I don't think this is hard.

> There is talk of building a distributed/federated tool, but if there
> are policies ("Jane Doe is maintainer of the network subsystem, and
> can merge changes that only touch file in net/ "), then building
> something decentralized is really hard. You have to build
> infrastructure where Jane can prove to others who she is (PGP key
> signing parties?), and some sort of distributed storage of the policy
> rules.

So requiring centralized authentication is going to be.... hard.
There will certainly be some operations which will require
authentication, sure.  But for things like:

  * Submitting a patch for review
  * Making comments on a patch

a valid e-mail address should be all that's necessary, just as it is
for what people do using e-mail today.  Adding a formal +1 or +2 vote,
or actually approving that the patch be merged, will obviously require
authentication.

As far as a federated tool is concerned, I don't think we need to
encode strict rules, because so long as we have a human (e.g., Linus)
merging individual subsystem trees, I think we can let maintainers or
maintainer groups (who, after all, today have absolute control over
their git trees) work out those issues amongst themselves, with an
appeal to Linus to resolve conflicts and to make a final quality
control check.

Solving the problem of replacing how a maintainer or maintainer group
reviews patches for their subsystem, and doing the review for patches
that land in a particular subsystem's git tree, is a much simpler
problem.  And if we can solve this, I think that's sufficient.

But what this *does* mean is that, just as patches are sometimes
cc'ed to multiple mailing lists today, we need to map that into the
Gerrit world of a patch being cc'ed to multiple git trees.  The patch
series might end up landing in a single git tree, or it might be split
up, with some commits landing in the ext4.git tree, some in the
btrfs.git tree, and some in the xfs.git tree, with some prerequisite
patches landing on a separate branch of one of these trees, which the
maintainers will merge into their trees.

Today, this can be easily done by cc'ing the patch to multiple mailing
lists.  Exactly how this works may get tricky, especially in the
federated model where (for example) the btrfs tree might be
administered by Facebook, while the xfs tree might be administered by
Red Hat.  Given that we *also* have to support people who want to keep
using e-mail during the transition period, it may be that
unauthenticated e-mail messages, where comments are attached to quoted
patch hunks, can serve as the interchange format between servers that
aren't under a common administrative domain.

In *practice*, hopefully most of the git/Gerrit trees will be
administered by the Linux Foundation's kernel.org team.  But I think
it's important that we support a distributed/federated model, as an
insurance policy if nothing else.

					- Ted


* Re: thoughts on a Merge Request based development workflow
  2019-10-15  1:54                           ` Theodore Y. Ts'o
@ 2019-10-15 12:00                             ` Daniel Vetter
  2019-10-15 13:14                               ` Han-Wen Nienhuys
  0 siblings, 1 reply; 102+ messages in thread
From: Daniel Vetter @ 2019-10-15 12:00 UTC (permalink / raw)
  To: Theodore Y. Ts'o
  Cc: Han-Wen Nienhuys, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 3:56 AM Theodore Y. Ts'o <tytso@mit.edu> wrote:
>
> On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
> > To 1) : Konstantin was worried about performance implication on git
> > notes.  The git-notes command stores data in a single
> > refs/notes/commits branch. Gerrit actually uses notes (the file
> > format) as well, but has a single notes branch per review, so
> > performance here is not a concern when scaling up the number of
> > reviews.
> >
> > To 2) : Google needs special magic sauce, because we service hundreds
> > of teams that work on thousands of repositories. However, here we're
> > talking about just the kernel itself; that is just a single
> > repository, and not an especially large one.  Chromium is our largest
> > repo, and it is about 10x larger than the linux kernel.
>
> I'd be concerned about cgit, because we need to have a separate file
> for the reviews, and you mean a single notes branch per review; a
> single patch series can have dozens of revisions, with potentially
> dozens of people commenting a large number of times, with e-mail
> threads that are hundreds of messages long.  If all of these changes
> are being squeezed into a single notes file, it would be quite large,
> and there would also be a lot of serialization concerns.  If you mean
> that there would be a single git note for each e-mail in a patch
> review thread.... that would seem to be a real potential problem for
> cgit.
>
> Be that as may, that's an optimization problem, and it is solveable,
> in the same way that most things are a Mere Matter of Programming.
> And if you're right, and it's not actually going to be a problem, then
> Huzzah!  But I suspect Konstantin's worries are probably ones we
> should at least pay attention to.
>
> > Gerrit isn't a big favorite of many people, but some of that
> > perception may be outdated. Since 2016, Google has significantly
> > increased its investment in Gerrit. For example, we have rewritten the
> > web UI from scratch, and there have been many performance
> > improvements.
>
> I agree that Gerrit might be a good starting point, having used it to
> review changes for Google's Data Center Kernels, as well as for
> Android and ChromeOS/Cloud Optimized System kernels.  Indeed, if I'm
> forced to use a non-threading mail user agent, it's far superior to
> e-mail reviews.
>
> Even if you have a threading mail agent, if everyone is using it, I'd
> argue that Gerrit is better, because it makes it really easy to look
> at the various versions of the patch series, including "give me the
> diff between the v3 and v7 version of the patch".  Having the
> conversation about a particular hunk of code in-line with the code
> itself is also very helpful.
>
> So let's talk about the sort of features that might need to be added
> to allow Gerrit to work for upstream development.
>
> > Gerrit has a patchset oriented workflow (where changes are amended all
> > the time), which is a good fit to the kernel's development process.
> > Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
> > so it accepts URLs as IDs instead.
>
> Yep, I don't think this is hard.
>
> > There is talk of building a distributed/federated tool, but if there
> > are policies ("Jane Doe is maintainer of the network subsystem, and
> > can merge changes that only touch file in net/ "), then building
> > something decentralized is really hard. You have to build
> > infrastructure where Jane can prove to others who she is (PGP key
> > signing parties?), and some sort of distributed storage of the policy
> > rules.
>
> So requiring centralized authentication is going to be.... hard.
> There will certainly be some operations which will require
> authentication, sure.  But for things like:

What we do on gitlab.freedesktop.org is allow you to log in using
other hubs like github. As long as
patchwork/gerrit/whatever.kernel.org supports OAuth, we could add a
simple "log in using kernel.org" button and it'd be a one-click thing.
And everyone can still choose where they want to have their main
account. Imo that's good enough (but I don't have strong opinions
here).

>   * Submitting a patch for review
>   * Making comments on a patch
>
> Adding a formal +1 or +2 vote, or actually approving that the patch be
> merged will obviously require authentication.  But as much as
> possible, a valid e-mail address should be all that's necessary for
> what people currently do using e-mail today.
>
> As far as a federated tool is concerned, I don't think we need to
> encode strict rules, because so long as we have a human (e.g., Linus)
> merging individual subsystem trees, I think we can let maintainers or
> maintainer groups (who, after all, today have absolutely control over
> their git trees) work out those issues amongst themselves, with an
> appeal to Linus to resolve conflicts and to make a final quality
> control check.
>
> Solving the problem of replacing how a maintainer or maintainer group
> reviews patches for their subsystem, and doing the review for patches
> that land in an a particular subsystem's git tree is a much simpler
> problem.  And if we can solve this, I think that's sufficient.
>
> But what this *does* mean is that sometimes patches will be cc'ed to
> multiple mailing lists, we need to map that into the gerrit world of a
> patch being cc'ed to multiple git trees.  The patch series might only
> end up landing in a single git tree, or it might be split up and with
> some commits landing in the ext4.git tree, and some in the btrfs.git
> tree, and some in the xfs.git tree, with some prerequisite patches
> landing on a separate branch of one of these trees, which the
> maintainers will merge into their trees.
>
> Today, this can be easily done by cc'ing the patch to multiple mailing
> lists.  Exactly how this works may get tricky, especially in the
> federated model where (for example) perhaps the btrfs tree might be
> administered by Facebook, while the xfs tree might be administrated by
> Red Hat.  Given that we *also* have to support people who want to keep
> using e-mail during the transition period, it may be that using
> unauthenticated e-mail messages where comments are attached quoted
> patch hunks, perhaps that can be the interchange format between
> different servers that aren't under a common administrative domain.

Last time I looked, none of the common web UI tools (gerrit, gitlab,
github) had any reasonable support for topic branches/patch series
that target multiple different branches/repositories. They all assume
that a submission gets merged into one branch only and that's it. You
can of course submit the same stuff for inclusion into multiple
places, but that gives you separate discussion tracking for each one
(at least with the merge/pull request model; maybe gerrit is better
here), which is really bad.

> In *practice* hopefully most of the git/Gerrit trees will be
> administrated by Linux Foundation's kernel.org team.  But I think it's
> important that we support a distributed/federated model, as an
> insurance policy if nothing else.

If we go with any of the tooling changes discussed here, I expect it'll
be a per-subsystem choice as to what's preferred; e.g. I don't expect
drm to move away from freedesktop.org, because we have all our
userspace there. At that level, interop with email-based pull requests
should be good enough. Better still would be support for "foreign"
submissions, which just point at a merge/pull request on the other
server (maybe just using the git pull line) and internally
track/forward all comments and things like CI status. But I'm not
aware of any standard for this.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: thoughts on a Merge Request based development workflow
  2019-10-15 12:00                             ` Daniel Vetter
@ 2019-10-15 13:14                               ` Han-Wen Nienhuys
  2019-10-15 13:45                                 ` Daniel Vetter
  0 siblings, 1 reply; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-15 13:14 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 2:00 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > Today, this can be easily done by cc'ing the patch to multiple mailing
> > lists.  Exactly how this works may get tricky, especially in the
> > federated model where (for example) perhaps the btrfs tree might be
> > administered by Facebook, while the xfs tree might be administrated by
> > Red Hat.  Given that we *also* have to support people who want to keep
> > using e-mail during the transition period, it may be that using
> > unauthenticated e-mail messages where comments are attached quoted
> > patch hunks, perhaps that can be the interchange format between
> > different servers that aren't under a common administrative domain.
>
> Last time I looked none of the common web ui tools (gerrit, gitlab,
> github) had any reasonable support for topic branches/patch series
> that target multiple different branches/repositories. They all assume
> that a submission gets merged into one branch only and that's it. You
> can of course submit the same stuff for inclusion into multiple
> places, but that gives you separate discussion tracking for each one
> (at least with the merge/pull request model, maybe gerrit is better
> here), which is real bad.

Can you say a little more about what you expect when working with
multiple branches/repos?

In Gerrit, you can assign freeform tags ("topics") to changes to
group them. See, e.g.,

  https://gerrit-review.googlesource.com/q/topic:"rename-reviewdb-package"+(status:open%20OR%20status:merged)

This will let you group changes that can be in different repos and/or
different branches. See also
https://gerrit-review.googlesource.com/Documentation/intro-user.html#topics

Discussions are tied to a single commit, but you can easily navigate
between the different changes in a topic, and submission is
synchronized (submitting one change will submit all of the topic;
it's unfortunately not atomic).

This is how submissions to Android work, as Android is stitched
together from ~1000 repos. It is likely that this support will further
improve, as Android is one of our biggest internal customers.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado


* Re: thoughts on a Merge Request based development workflow
  2019-10-15 13:14                               ` Han-Wen Nienhuys
@ 2019-10-15 13:45                                 ` Daniel Vetter
  2019-10-16 18:56                                   ` Han-Wen Nienhuys
  0 siblings, 1 reply; 102+ messages in thread
From: Daniel Vetter @ 2019-10-15 13:45 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 3:14 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> On Tue, Oct 15, 2019 at 2:00 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > Today, this can be easily done by cc'ing the patch to multiple mailing
> > > lists.  Exactly how this works may get tricky, especially in the
> > > federated model where (for example) perhaps the btrfs tree might be
> > > administered by Facebook, while the xfs tree might be administrated by
> > > Red Hat.  Given that we *also* have to support people who want to keep
> > > using e-mail during the transition period, it may be that using
> > > unauthenticated e-mail messages where comments are attached quoted
> > > patch hunks, perhaps that can be the interchange format between
> > > different servers that aren't under a common administrative domain.
> >
> > Last time I looked none of the common web ui tools (gerrit, gitlab,
> > github) had any reasonable support for topic branches/patch series
> > that target multiple different branches/repositories. They all assume
> > that a submission gets merged into one branch only and that's it. You
> > can of course submit the same stuff for inclusion into multiple
> > places, but that gives you separate discussion tracking for each one
> > (at least with the merge/pull request model, maybe gerrit is better
> > here), which is real bad.
>
> Can you say a little more about what you expect when working with
> multiple branches/repos?
>
> In gerrit, you can assign freeform tags ("topics") to changes, to
> group them. See eg.
>
>   https://gerrit-review.googlesource.com/q/topic:"rename-reviewdb-package"+(status:open%20OR%20status:merged)
>
> this will let you group changes, that can be in different repos and/or
> different branches. See also
> https://gerrit-review.googlesource.com/Documentation/intro-user.html#topics
>
> Discussions are tied to a single commit, but you can easily navigate
> between different changes in topics, and submission is synchronized
> (submitting one change will submit all of the topic. it's
> unfortunately not atomic).
>
> This is how submissions to Android work, as Android is stitched
> together from ~1000 repos. It is likely that this support will further
> improve, as Android is one of our biggest internal key customers.

I think gitlab is working on this under the heading of "supermerge",
where you tie together a pile of changes for different repos under one
overall label to keep the discussion together.
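For reference, the gerrit topics Han-Wen mentions are queryable over
its REST API (GET /changes/?q=topic:...). A rough sketch of consuming
such a response follows; the two-change JSON here is canned rather
than fetched, and the project names are invented:

```shell
# Gerrit prepends a )]}' XSSI guard line to its JSON responses; strip
# it before parsing. In real use the response would come from e.g.
#   curl -s 'https://<server>/changes/?q=topic:some-topic'
response=$(cat <<'EOF'
)]}'
[{"project":"gerrit","branch":"master","_number":101},
 {"project":"jgit","branch":"master","_number":102}]
EOF
)
# List the repositories the topic spans.
projects=$(printf '%s\n' "$response" | tail -n +2 |
  python3 -c 'import json,sys; print(" ".join(c["project"] for c in json.load(sys.stdin)))')
echo "$projects"
```

So one topic query is enough to find every repo a cross-tree series
touches, which is the grouping being discussed here.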

For the kernel we need something slightly different:
- There's a large pile of forks of the same underlying repo (Linus'
upstream branch). So not a huge pile of independent histories and file
trees, but all the same common kernel history and file layout.
- The _same_ set of patches is submitted to multiple branches in that
fork network. E.g. a refactoring patch series which touches both
driver core and a few subsystems.

Afaiui Android has cross-tree pulls, but the patches heading to each
target repo are distinct, and they're all for disjoint history chains
(i.e. no common ancestor commit, no shared files between all the
different branches/patches). I'm not aware of any project which uses
topic branches and cross-tree submissions as extensively as the linux
kernel in this fashion. Everyone with multiple repos seems to use the
Android approach of splitting up the entire space in disjoint repos
(with disjoint histories and disjoint files). I've done a fairly
lengthy write-up of this problem:

https://blog.ffwll.ch/2017/08/github-why-cant-host-the-kernel.html

Android is a multi-repo, multi-tree approach, the linux kernel is a
monotree but multi-repo approach. Most people think that the only
other approach than multi-tree is the huge monolithic monotree w/
monorepo approach. That one just doesn't scale. If you'd do Android
like the linux kernel you'd throw out the repo tool, instead have
_all_ repos merged into one overall git history (placed at the same
directory like they're now placed by the repo tool). Still each
"project/subsystem" would retain their individual git repo to be able
to scale sufficiently well, through localizing of most
development/review work to their specific area.

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-14 19:08                         ` Han-Wen Nienhuys
  2019-10-15  1:54                           ` Theodore Y. Ts'o
@ 2019-10-15 16:07                           ` Greg KH
  2019-10-15 16:35                             ` Steven Rostedt
  2019-10-15 18:58                             ` Han-Wen Nienhuys
  2019-10-15 18:37                           ` Konstantin Ryabitsev
  2 siblings, 2 replies; 102+ messages in thread
From: Greg KH @ 2019-10-15 16:07 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
> (again, without html)
> 
> On Thu, Oct 10, 2019 at 11:00 PM Theodore Y. Ts'o <tytso@mit.edu> wrote:
> >
> > On Thu, Oct 10, 2019 at 07:52:50PM +0200, Dmitry Vyukov wrote:
> > > I know you all love Gerrit but just to clarify :)
> > > Gerrit stores all metadata in a git repo, all users can have a replica
> > > and you can always have, say, a "backup" replica on a side done
> > > automatically. Patches and versions of the patches are committed into
> > > git into special branches (e.g. change/XXX/version/YYY), comments and
> > > metadata are in a pretty straightforward json (this is comment text
> > > for line X, etc) also committed into git, so one can always read that
> > > in and transform into any other format. And you can also run Gerrit
> > > locally over your replica.
> >
> > Konstantin has spoken about some his concerns about git's scalability,
> > and it's important to remember that just because Gerrit has shown to
> > work well on some very large repositories, it doesn't necessarily mean
> > that it will work well on git repositories using the open source C
> > implementation of git.
> >
> > That's because Gerrit as used by Google (and made available in various
> > public-facing Gerrit servers) uses a Git-on-Borg implementation[1],
> > where the storage is done using Google's internal storage
> > infrastructure.  This is implemented on top of Jgit (which is git
> > implemented in Java)[2].
> >
> 
> 
> 
> Hi there,
> 
> I manage the Gerrit backend team at Google.
> 
> It is true that Gerrit at Google runs on top of Borg (Gerrit-on-Borg aka. GoB),
> 
> 1) there is no real concern about Cgit's scalability.
> 2) the borg deployment has no relevant magical sauce here.
> 
> To 1) : Konstantin was worried about performance implication on git
> notes.  The git-notes command stores data in a single
> refs/notes/commits branch. Gerrit actually uses notes (the file
> format) as well, but has a single notes branch per review, so
> performance here is not a concern when scaling up the number of
> reviews.
> 
> To 2) : Google needs special magic sauce, because we service hundreds
> of teams that work on thousands of repositories. However, here we're
> talking about just the kernel itself; that is just a single
> repository, and not an especially large one.  Chromium is our largest
> repo, and it is about 10x larger than the linux kernel.
> 
> Google runs gerrit in tasks with (currently) 16G memory each. There
> are many large companies (eg. SAP) that run much larger instances, ie.
> one can easily match GoB's performance level on a single machine.
> 
> I have been wanting to propose Gerrit as an alternative for the Linux
> kernel workflow, so I might as well bring forth my arguments here.
> 
> Gerrit isn't a big favorite of many people, but some of that
> perception may be outdated. Since 2016, Google has significantly
> increased its investment in Gerrit. For example, we have rewritten the
> web UI from scratch, and there have been many performance
> improvements.

As one of the many people who complain about Gerrit (I gave a whole talk
about it!), I guess I should comment here...

Yes, it's getting "better" for speed, but it's still way way too slow.

I don't think any of the complaints I gave many years ago have been
addressed, here's a few of them off the top of my head.

And note, I have a lot of experience using gerrit, I use it all the time
for Android kernel patches, running your latest experimental version of
gerrit, so I assume that is what you are referring to when you say it is
under heavy development.  And I do like the changes you all are doing,
it is getting better in some ways (and worse in others, but that's for a
different thread...)

Anyway, my objections:
	- slow.  Seriously, have you tried using it on a slower network
	  connection (e.g. a cellular tether, train/bus wifi, or cafe
	  wifi)?
	- Cannot see all changes made in a single commit across
	  multiple files on the same page.  You have to click through
	  each and every individual file to see the diffs.  That's
	  horrid for patches that touch multiple files and is my biggest
	  pet peeve about the tool.
	- patch series are a pain to apply.  I have to manually open each
	  patch, give it a +2, and then, when all of them are reviewed and
	  have passed testing, they will be merged.  It's a pain.  Does no
	  one else accept patch series of 10-30 patches at a time other
	  than me?  How do they do it without opening 30+ tabs?

And, coming back to the "slow" issue, I should not have to do multiple
round-trip queries of a web interface just to see a single patch.
There's the initial cover page; then a click on each individual file,
bringing up a new page for each and every file that was changed in
that commit; then, when finished, another click to go back to the
initial page; then a click to give a +2, another click to give a
verified, and another refresh of the whole thing.

In contrast, try reading/reviewing and then applying a simple 10-patch
series from an email client like I do all the time.  The patches are
local, I read the whole diff with one button press (click if you have a
graphical email client), then if it is good, one keypress to apply it,
or save it to an mbox to apply them all at once later.
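As a sketch of that flow (a toy repo standing in for a real series and
made-up names throughout; the point is that `git am` applies the whole
mbox in one step):

```shell
set -e
tmp=$(mktemp -d)
git init -q "$tmp/upstream"
cd "$tmp/upstream"
# Wrapper pinning a throwaway identity for the demo commits.
g() { git -c user.email=dev@example.org -c user.name=Dev "$@"; }
g commit -q --allow-empty -m "base"
git clone -q . "$tmp/maintainer"            # maintainer's local tree
echo "fix" > fix.txt
g add fix.txt
g commit -q -m "example: add fix"
# Export the series as one mbox, as if it had arrived by email...
git format-patch --stdout HEAD~1..HEAD > "$tmp/series.mbox"
# ...then apply the whole series with a single command on the other side.
cd "$tmp/maintainer"
g am -q --3way "$tmp/series.mbox"
git log -1 --pretty=%s
```

The review happens in the mail client; applying is one `git am` per
mbox, however many patches it contains.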

When you are reviewing thousands of patches a year, time matters.
Gerrit just does not cut it at all.  Remember, we only accept 1/3 of the
patches sent to us.  We are accepting 9 patches an hour, 24 hours a day.
That means we reject 18 patches an hour at that same time.

And then there's the issue of access when you do not have internet, like
I currently do not right now on a plane.  Or a very slow connection.  I
can still suck down patches in email, apply them, and push them out.
Using gerrit on this connection is impossible.

> Building a review tool is not all that easy to do well; by using
> Gerrit, you get a tool that already exists, works, and has significant
> corporate support. We at Google have ~11 SWEs working on Gerrit
> full-time, for example, and we have support from UX research and UI
> design. The amount of work to tweak Gerrit for Linux kernel
> development surely is much less than building something from scratch.

I would love to see a better working gerrit today, for google, for the
developers there as that would save them time and energy that they
currently waste using it.  But for a distributed development / review
environment, with multiple people having multiple trees all over the
place, I don't know how Gerrit would work, unless it is trivial to
host/manage locally.

> Gerrit has a patchset oriented workflow (where changes are amended all
> the time), which is a good fit to the kernel's development process.

Maybe for some tiny subsystem's workflows, but not for any with a real
amount of development.  Try dumping a subsystem's patches into gerrit
today.  Can it handle something like netdev?  linux-input?  linux-usb?
staging?  Where does it start to break down in just being able to handle
the large quantities of changes?  Patchwork has done a lot to help some
of those subsystems work better; try seeing if gerrit could even handle
netdev in a sane manner.  Try to emulate what Dave does there, on your
own, and that should give you a good idea of what you have to work on
already today, with gerrit, in order to make it better.

> Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
> so it accepts URLs as IDs instead.

change-ids are the least of the problems of gerrit today :)

> There is talk of building a distributed/federated tool, but if there
> are policies ("Jane Doe is maintainer of the network subsystem, and
> can merge changes that only touch file in net/ "), then building
> something decentralized is really hard. You have to build
> infrastructure where Jane can prove to others who she is (PGP key
> signing parties?), and some sort of distributed storage of the policy
> rules.
> 
> By contrast, a centralized server can authenticate users reliably and
> the server owner can define such rules.  There can still be multiple
> gerrit servers, possibly sponsored by corporate entities (one from
> RedHat, one from Google, etc.), and different servers can support
> different authentication models (OpenID, OAuth, Google account, etc.)

Like Daniel said, the kernel is multi-repos for a mono-tree.  I don't
think Gerrit is set up to handle that at all from what I can see.

How many people does it take to maintain a Gerrit instance and keep it
up and running well?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-15 16:07                           ` Greg KH
@ 2019-10-15 16:35                             ` Steven Rostedt
  2019-10-15 18:58                             ` Han-Wen Nienhuys
  1 sibling, 0 replies; 102+ messages in thread
From: Steven Rostedt @ 2019-10-15 16:35 UTC (permalink / raw)
  To: Greg KH
  Cc: Han-Wen Nienhuys, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, 15 Oct 2019 18:07:12 +0200
Greg KH <gregkh@linuxfoundation.org> wrote:

> 	- patch series are a pain to apply, I have to manually open each
> 	  patch, give it a +2 and then when all of them are reviewed and
> 	  past testing, then they will be merged.  Is a pain.  Does no
> 	  one else accept patch series of 10-30 patches at a time other
> 	  than me?  How do they do it without opening 30+ tabs?

Just a note. When I do get 10+ patch series, I do apply them one at a
time. I review each patch, and apply them as they stand on their own.
Of course if there's an issue with one of the patches, I just archive
the branch, and tell the submitter to submit another version. This is
also how I can test to see if things changed from patches I've already
reviewed, because I'll just apply the new ones and compare.

Hmm, I still do it one at a time, but in this case, I guess you have a
point ;-) It would be nice on a second review to say "I already looked
at 1-8 of a 20 patch series, let me just apply 1-8 and compare it with
what I already reviewed".
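For what it's worth, `git range-diff` (in git since 2.19) does exactly
that "compare the new version against what I already reviewed" step; a
toy sketch, with a synthetic history and made-up branch names:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Wrapper pinning a throwaway identity for the demo commits.
g() { git -c user.email=dev@example.org -c user.name=Dev "$@"; }
g commit -q --allow-empty -m "base"
base=$(git rev-parse HEAD)
git checkout -q -b series-v1                 # the version already reviewed
echo "v1" > f.txt; g add f.txt; g commit -q -m "patch 1"
git checkout -q -b series-v2 "$base"         # the resubmission
echo "v2" > f.txt; g add f.txt; g commit -q -m "patch 1"
# Show, patch by patch, what changed between the two versions.
git range-diff "$base" series-v1 series-v2
```

Patches that are unchanged show up as `=`, modified ones as `!` with an
interdiff, so re-review can focus on what actually moved.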

-- Steve

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-14 19:08                         ` Han-Wen Nienhuys
  2019-10-15  1:54                           ` Theodore Y. Ts'o
  2019-10-15 16:07                           ` Greg KH
@ 2019-10-15 18:37                           ` Konstantin Ryabitsev
  2019-10-15 19:15                             ` Han-Wen Nienhuys
  2 siblings, 1 reply; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-15 18:37 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
>It is true that Gerrit at Google runs on top of Borg (Gerrit-on-Borg 
>aka. GoB),
>
>1) there is no real concern about Cgit's scalability.
>2) the borg deployment has no relevant magical sauce here.
>
>To 1) : Konstantin was worried about performance implication on git
>notes.  The git-notes command stores data in a single
>refs/notes/commits branch. Gerrit actually uses notes (the file
>format) as well, but has a single notes branch per review, so
>performance here is not a concern when scaling up the number of
>reviews.

Well, it's true that notes use a single ref by default, but the actual 
file structure is similar to git/objects:

  A notes ref is usually a branch which contains "files" whose paths are 
  the object names for the objects they describe, with some directory 
  separators included for performance reasons.

So, if you are creating a note for commit abcdefg, a new file will be 
created in the refs/notes/commits tree, named ab/cd/efg or something similar.
That is why "performance reasons" are mentioned in the sentence above, 
because as more notes are added, more and more processing power will be 
required to generate tree hashes. Granted, you have to have tens of 
thousands of notes before this even approaches a concern, but past a 
certain point performance will start taking a hit.
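The layout is easy to inspect on a toy repo. With only a handful of
notes the path is simply the full commit hash; git introduces the
ab/cd/ directory fanout described above only as notes accumulate:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Wrapper pinning a throwaway identity for the demo commits.
g() { git -c user.email=dev@example.org -c user.name=Dev "$@"; }
g commit -q --allow-empty -m "base"
g notes add -m "Reviewed-by: someone"
# One "file" per annotated commit, named after the commit it describes.
git ls-tree -r --name-only refs/notes/commits
git rev-parse HEAD        # prints the same 40-hex name
```

Every added note is thus a new tree entry on the single notes ref,
which is where the rehashing cost comes from at scale.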

>To 2) : Google needs special magic sauce, because we service hundreds
>of teams that work on thousands of repositories. However, here we're
>talking about just the kernel itself; that is just a single
>repository, and not an especially large one.  Chromium is our largest
>repo, and it is about 10x larger than the linux kernel.

The kernel isn't a single repository -- most maintainers have their own fork
or several. Git.kernel.org now hosts over a thousand repositories (mostly
forks of the kernel).

>Git is a tool built to exchange code and diffs. It seems natural to
>build a review solution on top of Git too. Gerrit is also built on top
>of git, and stores all metadata in Git too, ie. you can mirror review
>data into other Gerrit instances losslessly.

As I see it, there are the following things that would make Gerrit a 
difficult proposition:

1. A gerrit instance would introduce a single source of failure, which 
   is something many see as undesirable. If there's a DoS attack, Google 
   can restrict access to their Gerrit server to limit the requests to 
   only come from their corporate IP ranges, but kernel.org cannot do 
   the same, so anyone relying on gerrit.kernel.org cannot do any work 
   while it is unavailable.
2. There is limited support for attestation with Gerrit. A change 
   request can contain a digital signature, but any comments surrounding 
   it do not. It would be easy for the administrator of the gerrit 
   instance to forge a +1 or +2 on a CR making it look like it came from 
   the maintainer or the CI service (in other words, we are back to 
   explicitly trusting the infrastructure and IT admins).
3. There is no email bridge, only notifications. Switching to gerrit 
   would require a flag-day when everyone must start using it (or stop 
   participating in kernel development).

I am not sure any of these can be fixed.

>Building a review tool is not all that easy to do well; by using
>Gerrit, you get a tool that already exists, works, and has significant
>corporate support. We at Google have ~11 SWEs working on Gerrit
>full-time, for example, and we have support from UX research and UI
>design. The amount of work to tweak Gerrit for Linux kernel
>development surely is much less than building something from scratch.
>
>Gerrit has a patchset oriented workflow (where changes are amended all
>the time), which is a good fit to the kernel's development process.
>Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
>so it accepts URLs as IDs instead.
>
>There is talk of building a distributed/federated tool, but if there
>are policies ("Jane Doe is maintainer of the network subsystem, and
>can merge changes that only touch file in net/ "), then building
>something decentralized is really hard. You have to build
>infrastructure where Jane can prove to others who she is (PGP key
>signing parties?), and some sort of distributed storage of the policy
>rules.
>
>By contrast, a centralized server can authenticate users reliably and
>the server owner can define such rules.  There can still be multiple
>gerrit servers, possibly sponsored by corporate entities (one from
>RedHat, one from Google, etc.), and different servers can support
>different authentication models (OpenID, OAuth, Google account, etc.)

How would multiple Gerrit servers operate if they are backed by 
different authentication models? Something like a replication plugin 
would require that each of these instances are fully trusted sources of 
truth. I am not sure Red Hat would be happy to fully trust a replication 
stream coming from its direct market competitors, especially if they are 
in a position to forge identities.

Or do you mean they are separate instances and a maintainer would pick 
where to host their subsystem? But then, if they pick Google's gerrit 
system, how would engineers from China be able to participate?

Generally, unless there is a way to run Gerrit without explicitly 
trusting the infrastructure and admins, I will be in strong opposition 
to choosing it as the solution.

-K

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-15 16:07                           ` Greg KH
  2019-10-15 16:35                             ` Steven Rostedt
@ 2019-10-15 18:58                             ` Han-Wen Nienhuys
  2019-10-15 19:33                               ` Greg KH
  2019-10-15 19:50                               ` Mark Brown
  1 sibling, 2 replies; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-15 18:58 UTC (permalink / raw)
  To: Greg KH
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 6:07 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> > Gerrit isn't a big favorite of many people, but some of that
> > perception may be outdated. Since 2016, Google has significantly
> > increased its investment in Gerrit. For example, we have rewritten the
> > web UI from scratch, and there have been many performance
> > improvements.
>
> As one of the many people who complain about Gerrit (I gave a whole talk
> about it!), I guess I should comment here...
>..
> Anyway, my objections:
>         - slow.  Seriously, have you tried using it on a slower network
>           connection (i.e. cellular teather, or train/bus wifi, or cafe
>           wifi?)

I did (train wifi), and found it to be workable, but it might be that
I have different expectations for speed.  If you're worried about
off-line behavior, you could run a local mirror of the Gerrit server,
post draft comments locally, and then use a script to push the draft
comments to the central Gerrit server once you're online.

>         - Can not see all changes made in a single commit across
>           multiple files on the same page.  You have to click through
>           each and every individual file to see the diffs.  That's
>           horrid for patches that touch multiple files and is my biggest
>           pet-peve about the tool.

Have you ever tried pressing Shift + i on the change screen?

>         - patch series are a pain to apply, I have to manually open each
>           patch, give it a +2 and then when all of them are reviewed and
>           past testing, then they will be merged.  Is a pain.  Does no
>           one else accept patch series of 10-30 patches at a time other
>           than me?  How do they do it without opening 30+ tabs?

How would you want it to work otherwise? I assume that you are
actually looking at each patch, and would want to record the fact
that they're OK with a +2.

If you're the boss of the project (like in your mbox scenario below)
and are in charge of the gerrit instance, you can remove all the
review requirements, but restrict merge privileges to yourself. You'd
have a "merge including parents" button on the tip of the patch series
without any voting requirements.

> And, by reference of the "slow" issue, I should not have to do multiple
> round-trip queries of a web interface just to see a single patch.
> There's the initial cover page, then there's a click on each individual
> file, bring up a new page for each and every file that was changed for
> that commit to read it, and then when finished, clicking again to go
> back to the initial page, and then a click to give a +2 and another
> click to give a verified and then another refresh of the whole thing.

The 'verified' label was meant for automation systems. If you click
the 'verified' label yourself, that suggests that there is no test
automation for this project, and maybe you should just ask the admin
to disable this label.

> When you are reviewing thousands of patches a year, time matters.
> Gerrit just does not cut it at all.  Remember, we only accept 1/3 of the
> patches sent to us.  We are accepting 9 patches an hour, 24 hours a day.
> That means we reject 18 patches an hour at that same time.


I'm curious how you arrive at this number. Looking at Linus' tree:

git show --no-merges \
    43b815c6a8e7dbccb5b8bd9c4b099c24bc22d135..8e0d0ad206f08506c893326ca7c9c3d9cc042cef \
    | grep ^Date | wc

This range covers two recent merge commits by Linus about 232 hours
apart. During that window, 386 non-merge changes were merged, i.e.
about 1.6 commits/hour.
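The same count can be had directly with `git rev-list`, without
grepping Date headers; a toy sketch (synthetic history, hypothetical
tag name standing in for the first merge commit):

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q
# Wrapper pinning a throwaway identity for the demo commits.
g() { git -c user.email=dev@example.org -c user.name=Dev "$@"; }
g commit -q --allow-empty -m "base"
git tag window-start
for i in 1 2 3; do g commit -q --allow-empty -m "change $i"; done
# Non-merge commits in the window, counted in one step.
git rev-list --count --no-merges window-start..HEAD
```

On a real tree, dividing that count by the hours between the two
endpoints gives the commits-per-hour figure above.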

> And then there's the issue of access when you do not have internet, like
> I currently do not right now on a plane.  Or a very slow connection.  I
> can still suck down patches in email apply them, and push them out.
> Using gerrit on this connection is impossible.

There is actually a TTY interface to Gerrit that has a local cache,
but I'll admit I have never tried it.

> > Building a review tool is not all that easy to do well; by using
> > Gerrit, you get a tool that already exists, works, and has significant
> > corporate support. We at Google have ~11 SWEs working on Gerrit
> > full-time, for example, and we have support from UX research and UI
> > design. The amount of work to tweak Gerrit for Linux kernel
> > development surely is much less than building something from scratch.
>
> I would love to see a better working gerrit today, for google, for the
> developers there as that would save them time and energy that they
> currently waste using it.  But for a distributed development / review
> environment, with multiple people having multiple trees all over the
> place, I don't know how Gerrit would work, unless it is trivial to
> host/manage locally.
>
> > Gerrit has a patchset oriented workflow (where changes are amended all
> > the time), which is a good fit to the kernel's development process.
>
> Maybe for some tiny subsystem's workflows, but not for any with a real
> amount of development.  Try dumping a subsystem's patches into gerrit
> today.  Can it handle something like netdev?  linux-input?  linux-usb?
> staging?  Where does it start to break down in just being able to handle
> the large quantities of changes?  Patchwork has done a lot to help some

I am not sure I understand the question. We merge about 300 commits
per day into https://chromium.googlesource.com/chromium/src/, which
looks to be much more than what the linux kernel receives.

What do you mean with 'break down' ?

> > There is talk of building a distributed/federated tool, but if there
> > are policies ("Jane Doe is maintainer of the network subsystem, and
> > can merge changes that only touch file in net/ "), then building
> > something decentralized is really hard. You have to build
> > infrastructure where Jane can prove to others who she is (PGP key
> > signing parties?), and some sort of distributed storage of the policy
> > rules.
> >
> > By contrast, a centralized server can authenticate users reliably and
> > the server owner can define such rules.  There can still be multiple
> > gerrit servers, possibly sponsored by corporate entities (one from
> > RedHat, one from Google, etc.), and different servers can support
> > different authentication models (OpenID, OAuth, Google account, etc.)
>
> Like Daniel said, the kernel is multi-repos for a mono-tree.  I don't
> think Gerrit is set up to handle that at all from what I can see.

What problem is solved by having multiple copies of the same tree?

The normal Gerrit approach would be to have one server, with one Linux
repository, with multiple branches. Gerrit has per-branch permissions,
so you can define custom permissions (eg. merge, vote, read, amend
other people's changes) for different branches. Using branches in this
way would allow you to retarget a change to another branch without
losing the review history, or cherry-picking a change to another
branch from within the UI.
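For concreteness, those per-branch rules live in a project.config file
(INI format) on the project's refs/meta/config branch, and such a
fragment can be built with plain git config. The branch and group
names below are invented for illustration:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
# Gerrit reads per-ref access rules from project.config; each
# [access "<ref>"] section grants capabilities to named groups.
git config -f project.config 'access.refs/heads/net-next.read' \
  'group net-maintainers'
git config -f project.config 'access.refs/heads/net-next.submit' \
  'group net-maintainers'
cat project.config
```

Pushing the edited file back to refs/meta/config is how a Gerrit admin
would delegate merge rights on a single branch to one subsystem's
maintainers.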

> How many people does it take to maintain an Gerrit instance and keep it
> up and running well?

To be honest, I don't know exactly, as the Google setup is very different.

Anecdotally, I hear that it is a set-up-and-forget type of server,
except for upgrades, which require an hour or so of downtime.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-15 18:37                           ` Konstantin Ryabitsev
@ 2019-10-15 19:15                             ` Han-Wen Nienhuys
  2019-10-15 19:35                               ` Greg KH
  2019-10-15 19:41                               ` Konstantin Ryabitsev
  0 siblings, 2 replies; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-15 19:15 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Tue, Oct 15, 2019 at 8:37 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
>
> >To 2) : Google needs special magic sauce, because we service hundreds
> >of teams that work on thousands of repositories. However, here we're
> >talking about just the kernel itself; that is just a single
> >repository, and not an especially large one.  Chromium is our largest
> >repo, and it is about 10x larger than the linux kernel.
>
> Kernel isn't a single repository -- most maintainers have their own fork
> or multiple. Git.kernel.org is now over a thousand repositories (mostly
> forks of the kernel).

I've heard this before, but it's not clear what problem you are
solving in this way. They're all based on the same commit graph, so it
could just be a single repo with 1000 branches.

> >Git is a tool built to exchange code and diffs. It seems natural to
> >build a review solution on top of Git too. Gerrit is also built on top
> >of git, and stores all metadata in Git too, ie. you can mirror review
> >data into other Gerrit instances losslessly.
>
> As I see it, there are the following things that would make Gerrit a
> difficult proposition:
>
> 1. A gerrit instance would introduce a single source of failure, which
>    is something many see as undesirable. If there's a DoS attack, Google
>    can restrict access to their Gerrit server to limit the requests to
>    only come from their corporate IP ranges, but kernel.org cannot do
>    the same, so anyone relying on gerrit.kernel.org cannot do any work
>    while it is unavailable.

I don't understand your scenario. Are you concerned that Google would
protect against DoS attacks by limiting traffic to the corp network?

Or are you concerned that kernel.org has no DoS protection?

How does DoS protection for kernel.org work today? If someone DoSes
git.kernel.org with its 1000s of git trees, how do people get work
done?

> 2. There is limited support for attestation with Gerrit. A change
>    request can contain a digital signature, but any comments surrounding
>    it do not. It would be easy for the administrator of the gerrit
>    instance to forge a +1 or +2 on a CR making it look like it came from
>    the maintainer or the CI service (in other words, we are back to
>    explicitly trusting the infrastructure and IT admins).

Does anyone use attestation/PGP signing for code reviews that are
conducted over email? How many people use PGP signing on their
commits?

How does email figure into this story? Email has no authentication at
all, so if this is a requirement, you should probably stop using email
for reviews.

> 3. There is no email bridge, only notifications. Switching to gerrit
>    would require a flag-day when everyone must start using it (or stop
>    participating in kernel development).

Gerrit sends out email for comments. If you reply to that email,
Gerrit ingests the comment and puts it in the review.

> >By contrast, a centralized server can authenticate users reliably and
> >the server owner can define such rules.  There can still be multiple
> >gerrit servers, possibly sponsored by corporate entities (one from
> >RedHat, one from Google, etc.), and different servers can support
> >different authentication models (OpenID, OAuth, Google account, etc.)
>
> How would multiple Gerrit servers operate if they are backed by
> different authentication models? Something like a replication plugin
> would require that each of these instances are fully trusted sources of
> truth. I am not sure Red Hat would be happy to fully trust a replication
> stream coming from its direct market competitors, especially if they are
> in a position to forge identities.
>
> Or do you mean they are separate instances and a maintainer would pick
> where to host their subsystem? But then, if they pick Google's gerrit
> system, how would engineers from China be able to participate?

I mean the latter.

The question about China is a good one. If access from China (and
Iran, North Korea, etc.) is a requirement, that would be useful to
document.

> Generally, unless there is a way to run Gerrit without explicitly
> trusting the infrastructure and admins, I will be in strong opposition
> to choosing it as the solution.
>
> -K



-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-15 18:58                             ` Han-Wen Nienhuys
@ 2019-10-15 19:33                               ` Greg KH
  2019-10-15 20:03                                 ` Mark Brown
  2019-10-15 19:50                               ` Mark Brown
  1 sibling, 1 reply; 102+ messages in thread
From: Greg KH @ 2019-10-15 19:33 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 08:58:14PM +0200, Han-Wen Nienhuys wrote:
> On Tue, Oct 15, 2019 at 6:07 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > > Gerrit isn't a big favorite of many people, but some of that
> > > perception may be outdated. Since 2016, Google has significantly
> > > increased its investment in Gerrit. For example, we have rewritten the
> > > web UI from scratch, and there have been many performance
> > > improvements.
> >
> > As one of the many people who complain about Gerrit (I gave a whole talk
> > about it!), I guess I should comment here...
> >..
> > Anyway, my objections:
> >         - slow.  Seriously, have you tried using it on a slower network
> >           connection (i.e. cellular tether, or train/bus wifi, or cafe
> >           wifi?)
> 
> I did (train wifi), and found it to be workable, but it might be that
> I have different expectations for speed.

I guess it depends on your train :)

> If you're worried about off-line behavior, you could run a local
> mirror of a Gerrit server, post draft comments locally, and then
> use a script to post the draft comments to the central gerrit
> server once you're online.

Can you do that?  Where are instructions on how that all works?

> >         - Can not see all changes made in a single commit across
> >           multiple files on the same page.  You have to click through
> >           each and every individual file to see the diffs.  That's
> >           horrid for patches that touch multiple files and is my biggest
> >           pet peeve about the tool.
> 
> Have you ever tried pressing Shift + i on the change screen?

Nope.  Is that documented somewhere?  I'm on a plane and can't connect
to google's gerrit at the moment to test it :(

> >         - patch series are a pain to apply, I have to manually open each
> >           patch, give it a +2 and then when all of them are reviewed and
> >           past testing, then they will be merged.  Is a pain.  Does no
> >           one else accept patch series of 10-30 patches at a time other
> >           than me?  How do they do it without opening 30+ tabs?
> 
> How would you want it to work otherwise? I assume that you are
> actually looking at each patch, and would want to record the fact
> that they're OK with a +2.

Sometimes yes.  But other times they were pushed there by me or by
someone else as they are imports from somewhere else.

As a very specific example of this, look at how the android-3.18 kernel
branch works in AOSP.  I add about 30 patches to that tree every other
week or so.  And every time I do it I dread it.  That's not a good sign
that the tool is nice to use :(

> If you're the boss of the project (like in your mbox scenario below)
> and are in charge of the gerrit instance, you can remove all the
> review requirements, but restrict merge privileges to yourself. You'd
> have a "merge including parents" button on the tip of the patch series
> without any voting requirements.

Who "controls" permissions like this?  You need an admin, and that's
another cycle.  And if you think we need a central gerrit for kernel
development, that's not going to work :(

> > And, by reference of the "slow" issue, I should not have to do multiple
> > round-trip queries of a web interface just to see a single patch.
> > There's the initial cover page, then there's a click on each individual
> > file, bring up a new page for each and every file that was changed for
> > that commit to read it, and then when finished, clicking again to go
> > back to the initial page, and then a click to give a +2 and another
> > click to give a verified and then another refresh of the whole thing.
> 
> The 'verified' label was meant for automation systems. If you click
> the 'verified' label yourself, that suggests that there is no test
> automation for this project, and maybe you should just ask the admin
> to disable this label.

Talk to the AOSP team :)

> > When you are reviewing thousands of patches a year, time matters.
> > Gerrit just does not cut it at all.  Remember, we only accept 1/3 of the
> > patches sent to us.  We are accepting 9 patches an hour, 24 hours a day.
> > That means we reject 18 patches an hour at that same time.
> 
> I'm curious how you come to this number. When I look at Linus' tree:
> 
> git show --no-merges
> 43b815c6a8e7dbccb5b8bd9c4b099c24bc22d135..8e0d0ad206f08506c893326ca7c9c3d9cc042cef
> | grep ^Date | wc
> 
> This range spans two recent merge commits by Linus about 232 hours apart.
> During that window, 386 non-merge changes were merged, i.e. about 1.6
> commits/hour.

You picked a small range.  Look at a whole release:
$ git show --no-merges v5.2..v5.3 | grep ^Date  | wc -l
14605

That was from July 7, 2019 - September 15, 2019.
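A quick back-of-envelope check of that rate (a sketch added for illustration; the commit count is the one quoted above, and `git rev-list --count` is a tidier way to obtain it than grepping `Date:` headers, which can overcount if a commit message happens to contain such a line):

```shell
#!/bin/sh
# Sketch: reproduce the ~9 patches/hour figure for the v5.2..v5.3 window.
# In an actual kernel tree the count itself would come from:
#   git rev-list --count --no-merges v5.2..v5.3
COMMITS=14605                 # non-merge commits between v5.2 and v5.3
HOURS=$((70 * 24))            # July 7 .. September 15, 2019 is ~70 days
echo "$((COMMITS / HOURS)) commits/hour"   # prints "8 commits/hour", i.e. roughly 9
```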

We do a lot of merges during the 2-week merge window; everything after
that for the next 8 weeks is bugfixes.  But during that time, patches
are still flowing into maintainers' branches of their trees in
anticipation of the new release window.

I suggest you look at how the development process works for the kernel,
it's a bit "different" from what you are probably thinking.  It's all in
our Documentation/process/ directory if you are curious.

> > And then there's the issue of access when you do not have internet, like
> > I currently do not right now on a plane.  Or a very slow connection.  I
> > can still suck down patches in email apply them, and push them out.
> > Using gerrit on this connection is impossible.
> 
> there is actually a TTY interface to Gerrit that has a local cache,
> but I'll admit I have never tried it.

Then I have to imagine it's not maintained or used very often :)

> > > Building a review tool is not all that easy to do well; by using
> > > Gerrit, you get a tool that already exists, works, and has significant
> > > corporate support. We at Google have ~11 SWEs working on Gerrit
> > > full-time, for example, and we have support from UX research and UI
> > > design. The amount of work to tweak Gerrit for Linux kernel
> > > development surely is much less than building something from scratch.
> >
> > I would love to see a better working gerrit today, for google, for the
> > developers there as that would save them time and energy that they
> > currently waste using it.  But for a distributed development / review
> > environment, with multiple people having multiple trees all over the
> > place, I don't know how Gerrit would work, unless it is trivial to
> > host/manage locally.
> >
> > > Gerrit has a patchset oriented workflow (where changes are amended all
> > > the time), which is a good fit to the kernel's development process.
> >
> > Maybe for some tiny subsystem's workflows, but not for any with a real
> > amount of development.  Try dumping a subsystem's patches into gerrit
> > today.  Can it handle something like netdev?  linux-input?  linux-usb?
> > staging?  Where does it start to break down in just being able to handle
> > the large quantities of changes?  Patchwork has done a lot to help some
> 
> I am not sure I understand the question. We merge about 300 commits
> per day into https://chromium.googlesource.com/chromium/src/, which
> looks to be much more than what the linux kernel receives.

With as many different people as we have?  That's great.  And everything
happens in gerrit?  Any pointers to which one?  How many "open" items are
there at any point in time?

> What do you mean with 'break down' ?

Just try importing the patches that flow through the above mailing lists
for a month and see the rate and process for what you have to do "better
than" something as simple as email.

> > > There is talk of building a distributed/federated tool, but if there
> > > are policies ("Jane Doe is maintainer of the network subsystem, and
> > > can merge changes that only touch file in net/ "), then building
> > > something decentralized is really hard. You have to build
> > > infrastructure where Jane can prove to others who she is (PGP key
> > > signing parties?), and some sort of distributed storage of the policy
> > > rules.
> > >
> > > By contrast, a centralized server can authenticate users reliably and
> > > the server owner can define such rules.  There can still be multiple
> > > gerrit servers, possibly sponsored by corporate entities (one from
> > > RedHat, one from Google, etc.), and different servers can support
> > > different authentication models (OpenID, OAuth, Google account, etc.)
> >
> > Like Daniel said, the kernel is multi-repos for a mono-tree.  I don't
> > think Gerrit is set up to handle that at all from what I can see.
> 
> What problem is solved by having multiple copies of the same tree?

That's how we work.

> The normal Gerrit approach would be to have one server, with one Linux
> repository, with multiple branches. Gerrit has per-branch permissions,
> so you can define custom permissions (eg. merge, vote, read, amend
> other people's changes) for different branches. Using branches in this
> way would allow you to retarget a change to another branch without
> losing the review history, or cherry-picking a change to another
> branch from within the UI.

We will not have "one server" for the reasons Konstantin points out.
That's just not going to happen, for loads of good reasons.

And we have very few branches, instead, we all have individual trees.
Works out much better for a huge distributed development process.

> > How many people does it take to maintain an Gerrit instance and keep it
> > up and running well?
> 
> To be honest, I don't know exactly, as the Google setup is very different.
> 
> Anecdotally, I hear that it is a setup and forget type of server,
> except for upgrades, which require an hour or so of downtime.

I've heard the exact opposite, so the reality is probably somewhere in
the middle :)

thanks,

greg k-h


* Re: thoughts on a Merge Request based development workflow
  2019-10-15 19:15                             ` Han-Wen Nienhuys
@ 2019-10-15 19:35                               ` Greg KH
  2019-10-15 19:41                               ` Konstantin Ryabitsev
  1 sibling, 0 replies; 102+ messages in thread
From: Greg KH @ 2019-10-15 19:35 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Konstantin Ryabitsev, Theodore Y. Ts'o, Dmitry Vyukov,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 09:15:27PM +0200, Han-Wen Nienhuys wrote:
> The question about China is a good one. If access from China (and
> Iran, North Korea, etc.) is a requirement, that would be useful to
> document.

China is NOT the same as those other two countries according to most
governments, so the "rules" are different.  And of course we accept
contributions from China, why wouldn't we?

thanks,

greg k-h


* Re: thoughts on a Merge Request based development workflow
  2019-10-15 19:15                             ` Han-Wen Nienhuys
  2019-10-15 19:35                               ` Greg KH
@ 2019-10-15 19:41                               ` Konstantin Ryabitsev
  2019-10-16 18:33                                 ` Han-Wen Nienhuys
  1 sibling, 1 reply; 102+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-15 19:41 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Tue, Oct 15, 2019 at 09:15:27PM +0200, Han-Wen Nienhuys wrote:
>> As I see it, there are the following things that would make Gerrit a
>> difficult proposition:
>>
>> 1. A gerrit instance would introduce a single source of failure, which
>>    is something many see as undesirable. If there's a DoS attack, Google
>>    can restrict access to their Gerrit server to limit the requests to
>>    only come from their corporate IP ranges, but kernel.org cannot do
>>    the same, so anyone relying on gerrit.kernel.org cannot do any work
>>    while it is unavailable.
>
>I don't understand your scenario. Are you concerned that Google would
>protect against DoS attacks by limiting traffic to the corp network?
>
>Or are you concerned that kernel.org has no DoS protection?

Kernel.org operates on a pretty small budget, largely using 
infrastructure donated by kind entities. If it suddenly becomes a 
liability, I'm sure those entities will kick us out. A large, sustained 
DoS attack would be one such liability, and due to the nature of git, we 
can't really hide behind cloudfront or any other DoS-mitigation
platforms.

If a DoS attack is waged against Google's gerrit server, they can drop 
the packets at the perimeter and still keep the service available to 
Google engineers coming from internally routed traffic. This is not an 
option for kernel.org, so a DoS attack against a central resource would 
be super effective in stopping all work on Linux.

>
>How does DoS protection for kernel.org work today? If someone DoSes
>git.kernel.org with its 1000s of git trees, how do people get work
>done?

The git trees on kernel.org are just convenient copies. Most kernel 
development is done via mailing patches, which is a distributed and 
DoS-resistant process. The only time git.kernel.org enters into the 
picture is when people need to send a pull request to Linus (via email).  
If kernel.org is out for an extended period of time, they would just 
push their repo elsewhere and send a pull request referencing the new 
URL. Since Linus checks PGP signatures when pulling remotes, the 
repository can be hosted anywhere at all without needing to trust the 
infrastructure.

With gerrit, if gerrit.kernel.org is down, then everything grinds to a 
halt. If it's down for an extended period of time, then in theory people 
can push their git trees elsewhere, but then they would have to 
force-push back to gerrit once it's back up.

>> 2. There is limited support for attestation with Gerrit. A change
>>    request can contain a digital signature, but any comments surrounding
>>    it do not. It would be easy for the administrator of the gerrit
>>    instance to forge a +1 or +2 on a CR making it look like it came from
>>    the maintainer or the CI service (in other words, we are back to
>>    explicitly trusting the infrastructure and IT admins).
>
>Does anyone use attestation/PGP signing for their code reviews that
>are conducted over Email? How many people use PGP signing on their
>commits?
>
>How does email figure into this story? Email has no authentication at
>all, so if this is a requirement, you should probably stop using email
>for reviews.

Well, it's one of the problems we are trying to solve! Linus verifies 
PGP signatures on all pull requests he receives that aren't coming from 
git.kernel.org (and I always complain to him that he shouldn't make an 
exception in our case either).

And we *are* trying to stop using email for reviews -- preferably 
without introducing single points of trust and single points of failure.  

>> 3. There is no email bridge, only notifications. Switching to gerrit
>>    would require a flag-day when everyone must start using it (or stop
>>    participating in kernel development).
>
>Gerrit sends out email for comments. If you reply to that email,
>Gerrit ingests the comment and puts it in the review.

Ah, okay, I guess we've never configured it that way. But you can't 
+1/+2/Merge anything using this mechanism, right? Otherwise that would 
be a backchannel around authentication. :)

-K


* Re: thoughts on a Merge Request based development workflow
  2019-10-15 18:58                             ` Han-Wen Nienhuys
  2019-10-15 19:33                               ` Greg KH
@ 2019-10-15 19:50                               ` Mark Brown
  1 sibling, 0 replies; 102+ messages in thread
From: Mark Brown @ 2019-10-15 19:50 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Greg KH, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows


On Tue, Oct 15, 2019 at 08:58:14PM +0200, Han-Wen Nienhuys wrote:
> On Tue, Oct 15, 2019 at 6:07 PM Greg KH <gregkh@linuxfoundation.org> wrote:

> > When you are reviewing thousands of patches a year, time matters.
> > Gerrit just does not cut it at all.  Remember, we only accept 1/3 of the
> > patches sent to us.  We are accepting 9 patches an hour, 24 hours a day.
> > That means we reject 18 patches an hour at that same time.

> I'm curious how you come to this number. When I look at Linus' tree:

> git show --no-merges
> 43b815c6a8e7dbccb5b8bd9c4b099c24bc22d135..8e0d0ad206f08506c893326ca7c9c3d9cc042cef
> | grep ^Date | wc

> This range spans two recent merge commits by Linus about 232 hours apart.
> During that window, 386 non-merge changes were merged, i.e. about 1.6
> commits/hour.

The overwhelming majority of commits are merged into Linus' tree during
the two-week merge window after each release; at other times only bug
fix commits get merged, so the volume is way down.  Between v5.2 (July
7th) and v5.3 (September 15th) there were 14605 commits, which comes out
at about Greg's number of 9 per hour.

> > Like Daniel said, the kernel is multi-repos for a mono-tree.  I don't
> > think Gerrit is set up to handle that at all from what I can see.

> What problem is solved by having multiple copies of the same tree?

A lot of what it gives is comprehensibility - people being able to see 
what's going on in all the different branches in a given area of the
kernel.  You could use namespaced branches to do this, but the UIs for
that aren't nearly so good, and all this workflow predates even git;
this was all very idiomatic for BitKeeper.

> The normal Gerrit approach would be to have one server, with one Linux
> repository, with multiple branches. Gerrit has per-branch permissions,
> so you can define custom permissions (eg. merge, vote, read, amend
> other people's changes) for different branches. Using branches in this
> way would allow you to retarget a change to another branch without
> losing the review history, or cherry-picking a change to another
> branch from within the UI.

Permissions are all socially enforced for the kernel; there are no
technical restrictions, since those carry admin overhead and at times
it makes sense to do things differently.
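(For concreteness, the per-branch permissions Han-Wen describes live in a `project.config` file on the Gerrit server's `refs/meta/config` ref; a sketch, with invented branch and group names:)

```ini
# refs/meta/config : project.config (sketch; names are hypothetical)
[access "refs/heads/subsystem-next"]
    push = group subsystem-maintainers
    submit = group subsystem-maintainers
    label-Code-Review = -2..+2 group subsystem-maintainers
    label-Code-Review = -1..+1 group Registered Users
```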



* Re: thoughts on a Merge Request based development workflow
  2019-10-15 19:33                               ` Greg KH
@ 2019-10-15 20:03                                 ` Mark Brown
  0 siblings, 0 replies; 102+ messages in thread
From: Mark Brown @ 2019-10-15 20:03 UTC (permalink / raw)
  To: Greg KH
  Cc: Han-Wen Nienhuys, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows


On Tue, Oct 15, 2019 at 09:33:51PM +0200, Greg KH wrote:
> On Tue, Oct 15, 2019 at 08:58:14PM +0200, Han-Wen Nienhuys wrote:
> > On Tue, Oct 15, 2019 at 6:07 PM Greg KH <gregkh@linuxfoundation.org> wrote:

> > > Anyway, my objections:
> > >         - slow.  Seriously, have you tried using it on a slower network
> > >           connection (i.e. cellular tether, or train/bus wifi, or cafe
> > >           wifi?)

> > I did (train wifi), and found it to be workable, but it might be that
> > I have different expectations for speed.

> I guess it depends on your train :)

I do the train thing quite a bit in different countries, and it does
depend a lot on where you are.  A lot of the time the issue isn't speed,
it's reliability - the most common failure mode I see is that the
connection speed is fine when it's there, but it routinely drops out for
periods in more remote areas or due to cell handover issues if you're
moving at speed, and if you're working interactively even brief drops
are very apparent.  If you can queue up network interaction so it can
happen without you having to sit and wait for it to complete (or
manually retry it if it times out), this doesn't really cause any
difficulties, but if it's stopping you from proceeding it becomes an
issue.



* Re: thoughts on a Merge Request based development workflow
  2019-10-15 19:41                               ` Konstantin Ryabitsev
@ 2019-10-16 18:33                                 ` Han-Wen Nienhuys
  0 siblings, 0 replies; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-16 18:33 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Laura Abbott, Don Zickus,
	Steven Rostedt, Daniel Axtens, David Miller, Drew DeVault,
	Neil Horman, workflows

On Tue, Oct 15, 2019 at 9:41 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> >> 1. A gerrit instance would introduce a single source of failure, which
> >>    is something many see as undesirable. If there's a DoS attack, Google
> >>    can restrict access to their Gerrit server to limit the requests to
> >>    only come from their corporate IP ranges, but kernel.org cannot do
> >>    the same, so anyone relying on gerrit.kernel.org cannot do any work
> >>    while it is unavailable.
> >
> >I don't understand your scenario. Are you concerned that Google would
> >protect against DoS attacks by limiting traffic to the corp network?
> >
> >Or are you concerned that kernel.org has no DoS protection?
>
> Kernel.org operates on a pretty small budget, largely using
> infrastructure donated by kind entities. If it suddenly becomes a
> liability, I'm sure those entities will kick us out. A large, sustained
> DoS attack would be one such liability, and due to the nature of git, we
> can't really hide behind cloudfront or any other DoS-mitigation
> platforms.
>
> If a DoS attack is waged against Google's gerrit server, they can drop
> the packets at the perimeter and still keep the service available to
> Google engineers coming from internally routed traffic. This is not an
> option for kernel.org, so a DoS attack against a central resource would
> be super effective in stopping all work on Linux.

This isn't how DoS mitigation works. Gerrit uses the same DoS
mitigation as www.google.com, and we don't drop external search
traffic when things get bad.

If we can be behind a DoS protection service, kernel.org could also
be, but I think it would require using the HTTPS protocol rather than
SSH or anonymous git.

> >How does DoS protection for kernel.org work today? If someone DoSes
> >git.kernel.org with its 1000s of git trees, how do people get work
> >done?
>
> The git trees on kernel.org are just convenient copies. Most kernel
> development is done via mailing patches, which is a distributed and
> DoS-resistant process. The only time git.kernel.org enters into the
> picture is when people need to send a pull request to Linus (via email).
> If kernel.org is out for an extended period of time, they would just
> push their repo elsewhere and send a pull request referencing the new
> URL. Since Linus checks PGP signatures when pulling remotes, the
> repository can be hosted anywhere at all without needing to trust the
> infrastructure.
>
> With gerrit, if gerrit.kernel.org is down, then everything grinds to a
> halt. If it's down for an extended period of time, then in theory people
> can push their git trees elsewhere, but then they would have to
> force-push back to gerrit once it's back up.

We ourselves are not worried about downtime because our deployment was
designed to be 24x7 from the start. Our corporate overlords (large
projects such as Chrome and Android) do not accept any downtime. I
would be happy and proud to host the Linux kernel development on the
same infrastructure, which would give you hosting without downtime.
However, if there is consensus that there can't be a centralized
infrastructure of any sort, then that isn't going to work, obviously.

> >> 3. There is no email bridge, only notifications. Switching to gerrit
> >>    would require a flag-day when everyone must start using it (or stop
> >>    participating in kernel development).
> >
> >Gerrit sends out email for comments. If you reply to that email,
> >Gerrit ingests the comment and puts it in the review.
>
> Ah, okay, I guess we've never configured it that way. But you can't
> +1/+2/Merge anything using this mechanism, right? Otherwise that would
> be a backchannel around authentication. :)

It doesn't allow voting, exactly for the reason you mention.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--



* Re: thoughts on a Merge Request based development workflow
  2019-10-15 13:45                                 ` Daniel Vetter
@ 2019-10-16 18:56                                   ` Han-Wen Nienhuys
  2019-10-16 19:08                                     ` Mark Brown
  2019-10-17 11:49                                     ` Daniel Vetter
  0 siblings, 2 replies; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-16 18:56 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Tue, Oct 15, 2019 at 3:45 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > Last time I looked none of the common web ui tools (gerrit, gitlab,
> > > github) had any reasonable support for topic branches/patch series
> > > that target multiple different branches/repositories. They all assume
> > > that a submission gets merged into one branch only and that's it. You
> > > can of course submit the same stuff for inclusion into multiple
> > > places, but that gives you separate discussion tracking for each one
> > > (at least with the merge/pull request model, maybe gerrit is better
> > > here), which is real bad.
> >
> > Can you say a little more about what you expect when working with
> > multiple branches/repos?
> >
> > In gerrit, you can assign freeform tags ("topics") to changes, to
> > group them. See eg.
> >
> >   https://gerrit-review.googlesource.com/q/topic:"rename-reviewdb-package"+(status:open%20OR%20status:merged)
> >
> > this will let you group changes, that can be in different repos and/or
> > different branches. See also
> > https://gerrit-review.googlesource.com/Documentation/intro-user.html#topics
> >
> > Discussions are tied to a single commit, but you can easily navigate
> > between different changes in topics, and submission is synchronized
> > (submitting one change will submit all of the topic. it's
> > unfortunately not atomic).
> >
> > This is how submissions to Android work, as Android is stitched
> > together from ~1000 repos. It is likely that this support will further
> > improve, as Android is one of our biggest internal key customers.
>
> I think gitlab is working on this under the heading of "supermerge",
> where you tie together a pile of changes for different repos under one
> overall label to keep the discussion together.
>
> For the kernel we need something slightly different:
> - There's a large pile of forks of the same underlying repo (Linus'
> upstream branch). So not a huge pile of independent histories and file
> trees, but all the same common kernel history and file layout.
> - The _same_ set of patches is submitted to multiple branches in that
> fork network. E.g. a refactoring patch series which touches both
> driver core and a few subsystems.

You're simplifying the Android situation, and it's actually more
similar to the Linux kernel than you think. There are several hosts
that have versions of Android (one for AOSP, one for Google, and a
couple of hosts for different partners).

Then, there are a large number of branches, which represent releases.
Changes (or sets of changes across repositories) that are submitted to
one branch must then be propagated to other release branches, if they
are relevant bug fixes.

With the email workflow, isn't it hard to keep track of which patch
went into which tree? Something that tracks the identity of a commit
(like Change-Id) as it travels across trees could help here.
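For readers who haven't used it: the Change-Id is a trailer that Gerrit's commit-msg hook appends to the commit message, and `git cherry-pick` preserves it, so the same logical change can be located on every branch. A self-contained sketch (the repository, commit, and Change-Id value below are all made up for demonstration):

```shell
#!/bin/sh
# Sketch: a Change-Id trailer rides along with a commit, so grepping for
# it finds the same logical change wherever it has been applied.
set -e
repo=$(mktemp -d)
cd "$repo" && git init -q .
git -c user.email=demo@example.com -c user.name=demo commit -q --allow-empty \
    -m "fix: example bug" -m "Change-Id: I0123456789abcdef0123456789abcdef01234567"
# In a real tree the commit would be cherry-picked to release branches here;
# the trailer is preserved, so the same query finds it on every branch.
git log --all --grep="Change-Id: I0123456789abcdef" --format='%h %D'
```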

> Afaiui Android has cross-tree pulls, but the patches heading to each
> target repo are distinct, and they're all for disjoint history chains
> (i.e. no common ancestor commit, no shared files between all the
> different branches/patches). I'm not aware of any project which uses
> topic branches and cross-tree submissions as extensively as the linux
> kernel in this fashion. Everyone with multiple repos seems to use the
> Android approach of splitting up the entire space in disjoint repos
> (with disjoint histories and disjoint files). I've done a fairly
> lengthy write-up of this problem:
>
> https://blog.ffwll.ch/2017/08/github-why-cant-host-the-kernel.html
>
> Android is a multi-repo, multi-tree approach, the linux kernel is a
> monotree but multi-repo approach. Most people think that the only
> other approach than multi-tree is the huge monolithic monotree w/
> monorepo approach. That one just doesn't scale. If you'd do Android
> like the linux kernel you'd throw out the repo tool, instead have
> _all_ repos merged into one overall git history (placed at the same
> directory like they're now placed by the repo tool). Still each
> "project/subsystem" would retain their individual git repo to be able
> to scale sufficiently well, through localizing of most
> development/review work to their specific area.

This is not really relevant to the Linux kernel discussion, but it's
more complicated:

* Android exists in several flavors (e.g. the vanilla Google flavor,
AOSP, the Samsung flavor, etc.). With different subrepositories,
partners in the ecosystem can swap out components of the Android
platform as needed, while still keeping up with some of the upstream
repositories.

* Android (AOSP) is more than 600k files, i.e. 10x larger than the
Linux kernel, and about as large as the Chromium repo. Working with a
repo that large is painful because the git client itself just doesn't
work that well with trees with that many files.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 102+ messages in thread

* Re: thoughts on a Merge Request based development workflow
  2019-10-16 18:56                                   ` Han-Wen Nienhuys
@ 2019-10-16 19:08                                     ` Mark Brown
  2019-10-17 10:22                                       ` Han-Wen Nienhuys
  2019-10-17 11:49                                     ` Daniel Vetter
  1 sibling, 1 reply; 102+ messages in thread
From: Mark Brown @ 2019-10-16 19:08 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Daniel Vetter, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows


On Wed, Oct 16, 2019 at 08:56:53PM +0200, Han-Wen Nienhuys wrote:

> With the email workflow, isn't it hard to keep track of which patch
> went into which tree? Something that tracks the identity of commit
> (like Change-Id) as it travels across trees could help here.

Once something's in git it flows into other trees via merges, so the
commit ID is stable; the exception is backports, where the ID from
mainline gets referenced in the backport.  Normal development doesn't
really use cherry-picks.
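
Concretely, the backport convention can be sketched with plain git
(repo and branch names invented); `git cherry-pick -x` is what records
the mainline ID in the backport:

```shell
# Sketch: the mainline SHA stays stable; a backport to a stable branch
# records which mainline commit it came from. All names invented.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name Dev
git commit -q --allow-empty -m "v1.0 release"
git branch stable                        # stable tree forks at the release
echo fix > mm.c && git add mm.c
git commit -qm "mm: fix leak"
mainline=$(git rev-parse HEAD)
git checkout -q stable
git cherry-pick -x "$mainline" >/dev/null
# The backport's message now ends with a reference to mainline:
git log -1 --format=%B | tail -n 1
```

The last line of the backport's message reads "(cherry picked from
commit <mainline SHA>)", which is exactly the back-reference above.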



* Re: thoughts on a Merge Request based development workflow
  2019-10-16 19:08                                     ` Mark Brown
@ 2019-10-17 10:22                                       ` Han-Wen Nienhuys
  2019-10-17 11:24                                         ` Mark Brown
  0 siblings, 1 reply; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-17 10:22 UTC (permalink / raw)
  To: Mark Brown
  Cc: Daniel Vetter, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows

On Wed, Oct 16, 2019 at 9:09 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Oct 16, 2019 at 08:56:53PM +0200, Han-Wen Nienhuys wrote:
>
> > With the email workflow, isn't it hard to keep track of which patch
> > went into which tree? Something that tracks the identity of commit
> > (like Change-Id) as it travels across trees could help here.
>
> Once something's in git it flows into other trees via merges so the
> commit ID is stable, the exception is backports where the ID from
> mainline gets referenced in the backport.  Normal development doesn't
> use cherry picks really.

Right, so once it's in Git, the review story is done, and everything
we've discussed so far is really only about what happens before the
merging?



* Re: thoughts on a Merge Request based development workflow
  2019-10-17 10:22                                       ` Han-Wen Nienhuys
@ 2019-10-17 11:24                                         ` Mark Brown
  0 siblings, 0 replies; 102+ messages in thread
From: Mark Brown @ 2019-10-17 11:24 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Daniel Vetter, Theodore Y. Ts'o, Dmitry Vyukov,
	Konstantin Ryabitsev, Laura Abbott, Don Zickus, Steven Rostedt,
	Daniel Axtens, David Miller, Drew DeVault, Neil Horman,
	workflows


On Thu, Oct 17, 2019 at 12:22:18PM +0200, Han-Wen Nienhuys wrote:
> On Wed, Oct 16, 2019 at 9:09 PM Mark Brown <broonie@kernel.org> wrote:
> > On Wed, Oct 16, 2019 at 08:56:53PM +0200, Han-Wen Nienhuys wrote:

> > > With the email workflow, isn't it hard to keep track of which patch
> > > went into which tree? Something that tracks the identity of commit
> > > (like Change-Id) as it travels across trees could help here.

> > Once something's in git it flows into other trees via merges so the
> > commit ID is stable, the exception is backports where the ID from
> > mainline gets referenced in the backport.  Normal development doesn't
> > use cherry picks really.

> Right, so once it's in Git, the review story is done, and everything
> we've discussed so far is really only about what happens before the
> merging?

*Mostly* - pull requests do get reviewed and may need to be rebuilt,
but pretty much, yes.  It's certainly outside the normal workflow.



* Re: thoughts on a Merge Request based development workflow
  2019-10-16 18:56                                   ` Han-Wen Nienhuys
  2019-10-16 19:08                                     ` Mark Brown
@ 2019-10-17 11:49                                     ` Daniel Vetter
  2019-10-17 12:09                                       ` Han-Wen Nienhuys
  1 sibling, 1 reply; 102+ messages in thread
From: Daniel Vetter @ 2019-10-17 11:49 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Wed, Oct 16, 2019 at 8:57 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> On Tue, Oct 15, 2019 at 3:45 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > Last time I looked none of the common web ui tools (gerrit, gitlab,
> > > > github) had any reasonable support for topic branches/patch series
> > > > that target multiple different branches/repositories. They all assume
> > > > that a submission gets merged into one branch only and that's it. You
> > > > can of course submit the same stuff for inclusion into multiple
> > > > places, but that gives you separate discussion tracking for each one
> > > > (at least with the merge/pull request model, maybe gerrit is better
> > > > here), which is real bad.
> > >
> > > Can you say a little more about what you expect when working with
> > > multiple branches/repos?
> > >
> > > In gerrit, you can assign freeform tags ("topics") to changes, to
> > > group them. See eg.
> > >
> > >   https://gerrit-review.googlesource.com/q/topic:"rename-reviewdb-package"+(status:open%20OR%20status:merged)
> > >
> > > this will let you group changes, that can be in different repos and/or
> > > different branches. See also
> > > https://gerrit-review.googlesource.com/Documentation/intro-user.html#topics
> > >
> > > Discussions are tied to a single commit, but you can easily navigate
> > > between different changes in topics, and submission is synchronized
> > > (submitting one change will submit all of the topic. it's
> > > unfortunately not atomic).
> > >
> > > This is how submissions to Android work, as Android is stitched
> > > together from ~1000 repos. It is likely that this support will further
> > > improve, as Android is one of our biggest internal key customers.
> >
> > I think gitlab is working on this under the heading of "supermerge",
> > where you tie together a pile of changes for different repos under one
> > overall label to keep the discussion together.
> >
> > For the kernel we need something slightly different:
> > - There's a large pile of forks of the same underlying repo (Linus'
> > upstream branch). So not a huge pile of independent histories and file
> > trees, but all the same common kernel history and file layout.
> > - The _same_ set of patches is submitted to multiple branches in that
> > fork network. E.g. a refactoring patch series which touches both
> > driver core and a few subsystems.
>
> You're simplifying the android situation, and it's actually more
> similar to the linux kernel than you think. There are several hosts
> that have versions of Android (one for AOSP, one for Google, and a
> couple hosts for different partners).
>
> Then, there are a large number of branches, which represent releases.
> Changes (or sets of changes across repositories) that are submitted to
> one branch must then be propagated to other release branches, if they
> are relevant bug fixes.

This isn't about branches in the sense you seem to mean here, i.e.
different releases, or the stuff various flavours/vendors cherry-pick
together, or long-term support stuff. The workflow here is:

- submit changes to subsystem A & B
- we want the discussion to be coherent across both
- the changes land either in A or B, or a mix of A and B, or in both
places (through a topic branch that gets merged into both A and B)
- both A and B send pull requests for the same merge window (big
integration fest to start each release cycle), and from then on it's
like they all landed in the same tree

This isn't about cherry-picking a set of changes or bugfixes between
flavours or release branches or anything like that. It's about making
sure that subject experts for vastly different and fairly unrelated
areas of the kernel can all work together and avoid code conflicts,
without having to work together in the same branch/repo for
everything. Since overlapping stuff is the exception, not the rule.
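
A sketch of that flow with a single local repo standing in for the
fork network (all branch and file names invented):

```shell
# Sketch: one topic branch merged into two subsystem trees; the commit
# keeps the same SHA everywhere, so history converges cleanly later.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git config user.email dev@example.com
git config user.name Dev
git commit -q --allow-empty -m "common base"
git branch subsys-a
git branch subsys-b
git checkout -q -b topic/refactor        # topic branch off the common base
echo refactor > core.c && git add core.c
git commit -qm "driver core: cross-subsystem refactor"
sha=$(git rev-parse HEAD)
# Both subsystems pull the same topic branch: one commit, one discussion.
git checkout -q subsys-a && git merge -q --no-edit topic/refactor
git checkout -q subsys-b && git merge -q --no-edit topic/refactor
# The identical commit is now reachable from both subsystem trees:
git branch --contains "$sha"
```

When both trees later get merged for the same merge window, git sees
the same commit on both sides, so there is nothing to deduplicate.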

> With the email workflow, isn't it hard to keep track of which patch
> went into which tree? Something that tracks the identity of commit
> (like Change-Id) as it travels across trees could help here.

Yup it's a mess occasionally, but it's at least possible to have one
discussion thread with everyone on Cc: to coordinate the mess
properly.

> > Afaiui Android has cross-tree pulls, but the patches heading to each
> > target repo are distinct, and they're all for disjoint history chains
> > (i.e. no common ancestor commit, no shared files between all the
> > different branches/patches). I'm not aware of any project which uses
> > topic branches and cross-tree submissions as extensively as the linux
> > kernel in this fashion. Everyone with multiple repos seems to use the
> > Android approach of splitting up the entire space in disjoint repos
> > (with disjoint histories and disjoint files). I've done a fairly
> > lengthy write-up of this problem:
> >
> > https://blog.ffwll.ch/2017/08/github-why-cant-host-the-kernel.html
> >
> > Android is a multi-repo, multi-tree approach, the linux kernel is a
> > monotree but multi-repo approach. Most people think that the only
> > other approach than multi-tree is the huge monolithic monotree w/
> > monorepo approach. That one just doesn't scale. If you'd do Android
> > like the linux kernel you'd throw out the repo tool, instead have
> > _all_ repos merged into one overall git history (placed at the same
> > directory like they're now placed by the repo tool). Still each
> > "project/subsystem" would retain their individual git repo to be able
> > to scale sufficiently well, through localizing of most
> > development/review work to their specific area.
>
> this is not really relevant to the Linux kernel discussion, but it's
> more complicated:
>
> * Android exists in several flavors (eg. the vanilla Google flavor,
> AOSP, the Samsung flavor, etc.). With different subrepositories,
> partners in the ecosystem can swap out components of the Android
> platform as needed, while still keeping up with some of the upstream
> repositories.
>
> * Android (AOSP) is more than 600k files, i.e. 10x larger than the
> Linux kernel, and about as large as the Chromium repo. Working with a
> repo that large is painful because the git client itself just doesn't
> work that well with trees with that many files.

I'm not saying you don't have reasons to do this. All I'm saying is
that outside of the kernel, there's not really anyone doing it like
the kernel, mostly because the popular tools just don't really support
this workflow.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: thoughts on a Merge Request based development workflow
  2019-10-17 11:49                                     ` Daniel Vetter
@ 2019-10-17 12:09                                       ` Han-Wen Nienhuys
  2019-10-17 12:53                                         ` Daniel Vetter
  0 siblings, 1 reply; 102+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-17 12:09 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Thu, Oct 17, 2019 at 1:50 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> I'm not saying you don't have reasons to do this. All I'm saying is
> that outside of the kernel, there's not really anyone doing it like
> the kernel, mostly because the popular tools just don't really support
> this workflow.

Fair enough. This has been a hugely elucidating discussion; thanks for that.

I had some thoughts about something that resembles what I think you
are looking for. I wrote them down here

https://docs.google.com/document/d/15OzXgmZ_yQ7UHnMUeiK3AB4tgQauJde4pzStCv-mIYM/edit#

I think this could be made to work, but it would be a considerable effort.



* Re: thoughts on a Merge Request based development workflow
  2019-10-17 12:09                                       ` Han-Wen Nienhuys
@ 2019-10-17 12:53                                         ` Daniel Vetter
  0 siblings, 0 replies; 102+ messages in thread
From: Daniel Vetter @ 2019-10-17 12:53 UTC (permalink / raw)
  To: Han-Wen Nienhuys
  Cc: Theodore Y. Ts'o, Dmitry Vyukov, Konstantin Ryabitsev,
	Laura Abbott, Don Zickus, Steven Rostedt, Daniel Axtens,
	David Miller, Drew DeVault, Neil Horman, workflows

On Thu, Oct 17, 2019 at 2:10 PM Han-Wen Nienhuys <hanwen@google.com> wrote:
> On Thu, Oct 17, 2019 at 1:50 PM Daniel Vetter <daniel@ffwll.ch> wrote:
> > I'm not saying you don't have reasons to do this. All I'm saying is
> > that outside of the kernel, there's not really anyone doing it like
> > the kernel, mostly because the popular tools just don't really support
> > this workflow.
>
> Fair enough. This has been a hugely elucidating discussion; thanks for that.
>
> I had some thoughts about something that resembles what I think you
> are looking for. I wrote them down here
>
> https://docs.google.com/document/d/15OzXgmZ_yQ7UHnMUeiK3AB4tgQauJde4pzStCv-mIYM/edit#
>
> I think this could be made to work, but it would be a considerable effort.

Scrolled through it, seems like a good summary of a lot of the things
discussed here. Where I'd disagree is on the categorical rejection of
existing solutions like gerrit/gitlab, but what's clear is that
existing solutions all have gaps somewhere. I also think that no tool
will reach full (or even wide) usage across the entire kernel; that's
not realistic. I mean, the kernel started this entire git thing, and
there are still kernel subsystems and maintainers using something else.
Personally I think trying to fill the gaps in existing tooling will be
more successful, and that's still the plan for running some
experiments in the gpu subsystem (just these days we've finally gotten
around to moving our issue tracking over to gitlab from bugzilla, as
one of the steps to have something better integrated).
-Daniel



Thread overview: 102+ messages
2019-09-24 18:25 thoughts on a Merge Request based development workflow Neil Horman
2019-09-24 18:37 ` Drew DeVault
2019-09-24 18:53   ` Neil Horman
2019-09-24 20:24     ` Laurent Pinchart
2019-09-24 22:25       ` Neil Horman
2019-09-25 20:50         ` Laurent Pinchart
2019-09-25 21:54           ` Neil Horman
2019-09-26  0:40           ` Neil Horman
2019-09-28 22:58             ` Steven Rostedt
2019-09-28 23:16               ` Dave Airlie
2019-09-28 23:52                 ` Steven Rostedt
2019-10-01  3:22                 ` Daniel Axtens
2019-10-01 21:14                   ` Bjorn Helgaas
2019-09-29 11:57               ` Neil Horman
2019-09-29 12:55                 ` Dmitry Vyukov
2019-09-30  1:00                   ` Neil Horman
2019-09-30  6:05                     ` Dmitry Vyukov
2019-09-30 12:55                       ` Neil Horman
2019-09-30 13:20                         ` Nicolas Belouin
2019-09-30 13:40                         ` Dmitry Vyukov
2019-09-30 21:02                     ` Konstantin Ryabitsev
2019-09-30 14:51                   ` Theodore Y. Ts'o
2019-09-30 15:15                     ` Steven Rostedt
2019-09-30 16:09                       ` Geert Uytterhoeven
2019-09-30 20:56                       ` Konstantin Ryabitsev
2019-10-08  1:00                     ` Stephen Rothwell
2019-09-26 10:23           ` Geert Uytterhoeven
2019-09-26 13:43             ` Neil Horman
2019-10-07 15:33   ` David Miller
2019-10-07 15:35     ` Drew DeVault
2019-10-07 16:20       ` Neil Horman
2019-10-07 16:24         ` Drew DeVault
2019-10-07 18:43           ` David Miller
2019-10-07 19:24             ` Eric Wong
2019-10-07 15:47     ` Steven Rostedt
2019-10-07 18:40       ` David Miller
2019-10-07 18:45       ` David Miller
2019-10-07 19:21         ` Steven Rostedt
2019-10-07 21:49     ` Theodore Y. Ts'o
2019-10-07 23:00     ` Daniel Axtens
2019-10-08  0:39       ` Eric Wong
2019-10-08  1:26         ` Daniel Axtens
2019-10-08  2:11           ` Eric Wong
2019-10-08  3:24             ` Daniel Axtens
2019-10-08  6:03               ` Eric Wong
2019-10-08 10:06                 ` Daniel Axtens
2019-10-08 13:19                   ` Steven Rostedt
2019-10-08 18:46                 ` Rob Herring
2019-10-08 21:36                   ` Eric Wong
2019-10-08  1:17       ` Steven Rostedt
2019-10-08 16:43         ` Don Zickus
2019-10-08 17:17           ` Steven Rostedt
2019-10-08 17:39             ` Don Zickus
2019-10-08 19:05               ` Konstantin Ryabitsev
2019-10-08 20:32                 ` Don Zickus
2019-10-08 21:35                   ` Konstantin Ryabitsev
2019-10-09 21:50                     ` Laura Abbott
2019-10-10 12:48                       ` Neil Horman
2019-10-09 21:35                 ` Laura Abbott
2019-10-09 21:54                   ` Konstantin Ryabitsev
2019-10-09 22:09                     ` Laura Abbott
2019-10-09 22:19                       ` Dave Airlie
2019-10-09 22:21                     ` Eric Wong
2019-10-09 23:56                       ` Konstantin Ryabitsev
2019-10-10  0:07                         ` Eric Wong
2019-10-10  7:35                         ` Nicolas Belouin
2019-10-10 12:53                           ` Steven Rostedt
2019-10-10 14:21                           ` Dmitry Vyukov
2019-10-11  7:12                             ` Nicolas Belouin
2019-10-11 13:56                               ` Dmitry Vyukov
2019-10-14  7:31                                 ` Nicolas Belouin
2019-10-10 17:52                     ` Dmitry Vyukov
2019-10-10 20:57                       ` Theodore Y. Ts'o
2019-10-11 11:01                         ` Dmitry Vyukov
2019-10-11 12:54                           ` Theodore Y. Ts'o
2019-10-14 19:08                         ` Han-Wen Nienhuys
2019-10-15  1:54                           ` Theodore Y. Ts'o
2019-10-15 12:00                             ` Daniel Vetter
2019-10-15 13:14                               ` Han-Wen Nienhuys
2019-10-15 13:45                                 ` Daniel Vetter
2019-10-16 18:56                                   ` Han-Wen Nienhuys
2019-10-16 19:08                                     ` Mark Brown
2019-10-17 10:22                                       ` Han-Wen Nienhuys
2019-10-17 11:24                                         ` Mark Brown
2019-10-17 11:49                                     ` Daniel Vetter
2019-10-17 12:09                                       ` Han-Wen Nienhuys
2019-10-17 12:53                                         ` Daniel Vetter
2019-10-15 16:07                           ` Greg KH
2019-10-15 16:35                             ` Steven Rostedt
2019-10-15 18:58                             ` Han-Wen Nienhuys
2019-10-15 19:33                               ` Greg KH
2019-10-15 20:03                                 ` Mark Brown
2019-10-15 19:50                               ` Mark Brown
2019-10-15 18:37                           ` Konstantin Ryabitsev
2019-10-15 19:15                             ` Han-Wen Nienhuys
2019-10-15 19:35                               ` Greg KH
2019-10-15 19:41                               ` Konstantin Ryabitsev
2019-10-16 18:33                                 ` Han-Wen Nienhuys
2019-10-09  2:02           ` Daniel Axtens
2019-09-24 23:15 ` David Rientjes
2019-09-25  6:35   ` Toke Høiland-Jørgensen
2019-09-25 10:49   ` Neil Horman
