From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Han-Wen Nienhuys <hanwen@google.com>
Cc: "Theodore Y. Ts'o" <tytso@mit.edu>,
Dmitry Vyukov <dvyukov@google.com>,
Laura Abbott <labbott@redhat.com>,
Don Zickus <dzickus@redhat.com>,
Steven Rostedt <rostedt@goodmis.org>,
Daniel Axtens <dja@axtens.net>,
David Miller <davem@davemloft.net>, Drew DeVault <sir@cmpwn.com>,
Neil Horman <nhorman@tuxdriver.com>,
workflows@vger.kernel.org
Subject: Re: thoughts on a Merge Request based development workflow
Date: Tue, 15 Oct 2019 14:37:52 -0400
Message-ID: <20191015183752.GB5473@chatter.i7.local>
In-Reply-To: <CAFQ2z_Pd2bSL+qpTNxwSNUOccvOt1QD9-XeCqqcdHtiNLKeJxA@mail.gmail.com>
On Mon, Oct 14, 2019 at 09:08:17PM +0200, Han-Wen Nienhuys wrote:
>It is true that Gerrit at Google runs on top of Borg (Gerrit-on-Borg
>aka. GoB),
>
>1) there is no real concern about Cgit's scalability.
>2) the borg deployment has no relevant magical sauce here.
>
>To 1) : Konstantin was worried about performance implication on git
>notes. The git-notes command stores data in a single
>refs/notes/commits branch. Gerrit actually uses notes (the file
>format) as well, but has a single notes branch per review, so
>performance here is not a concern when scaling up the number of
>reviews.
Well, it's true that notes use a single ref by default, but the actual
file structure is similar to .git/objects:

    A notes ref is usually a branch which contains "files" whose paths are
    the object names for the objects they describe, with some directory
    separators included for performance reasons.

So, if you are creating a note for commit abcdefg, a new file named
ab/cd/efg (or something similar) will be created in the
refs/notes/commits tree. That is why "performance reasons" are mentioned
in the sentence quoted above: as more notes are added, more and more
processing power is required to generate tree hashes. Granted, you need
tens of thousands of notes before this even approaches a concern, but
past a certain point performance will start taking a hit.
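
To make the fan-out concrete, here is a minimal Python sketch of how an
object name could map to a path inside the notes tree. The helper name
note_path and the explicit fanout argument are made up for illustration;
git picks the depth on its own as the notes tree grows, and this is not
meant to mirror git's actual implementation.

```python
def note_path(object_name: str, fanout: int) -> str:
    """Split a hex object name into `fanout` two-character directory
    levels followed by the remaining characters (hypothetical helper)."""
    if fanout < 0 or 2 * fanout >= len(object_name):
        raise ValueError("fanout does not fit this object name")
    # Each directory level consumes two hex characters of the name.
    parts = [object_name[2 * i:2 * i + 2] for i in range(fanout)]
    # Whatever is left becomes the file name inside the innermost directory.
    parts.append(object_name[2 * fanout:])
    return "/".join(parts)


if __name__ == "__main__":
    # The shortened commit name used in the paragraph above.
    print(note_path("abcdefg", 2))
    # With no fan-out, the note sits directly at the top of the tree.
    print(note_path("abcdefg", 0))
```

With a fan-out of 2, adding a note only touches the "ab" and "ab/cd"
subtrees plus the top-level tree, so git has to rehash a few small trees
rather than one very large flat one.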
>To 2) : Google needs special magic sauce, because we service hundreds
>of teams that work on thousands of repositories. However, here we're
>talking about just the kernel itself; that is just a single
>repository, and not an especially large one. Chromium is our largest
>repo, and it is about 10x larger than the linux kernel.
The kernel isn't a single repository -- most maintainers have their own
fork, or several. git.kernel.org now hosts over a thousand repositories
(mostly forks of the kernel).
>Git is a tool built to exchange code and diffs. It seems natural to
>build a review solution on top of Git too. Gerrit is also built on top
>of git, and stores all metadata in Git too, ie. you can mirror review
>data into other Gerrit instances losslessly.
As I see it, there are the following things that would make Gerrit a
difficult proposition:
1. A Gerrit instance would introduce a single point of failure, which
   is something many see as undesirable. If there's a DoS attack, Google
   can restrict access to their Gerrit server so that requests only
   come from their corporate IP ranges, but kernel.org cannot do the
   same, so anyone relying on gerrit.kernel.org cannot do any work
   while it is unavailable.
2. There is limited support for attestation with Gerrit. A change
   request can contain a digital signature, but any comments surrounding
   it do not. It would be easy for the administrator of the Gerrit
   instance to forge a +1 or +2 on a CR, making it look like it came from
   the maintainer or the CI service (in other words, we are back to
   explicitly trusting the infrastructure and IT admins).
3. There is no email bridge, only notifications. Switching to Gerrit
   would require a flag day when everyone must start using it (or stop
   participating in kernel development).
I am not sure any of these can be fixed.
>Building a review tool is not all that easy to do well; by using
>Gerrit, you get a tool that already exists, works, and has significant
>corporate support. We at Google have ~11 SWEs working on Gerrit
>full-time, for example, and we have support from UX research and UI
>design. The amount of work to tweak Gerrit for Linux kernel
>development surely is much less than building something from scratch.
>
>Gerrit has a patchset oriented workflow (where changes are amended all
>the time), which is a good fit to the kernel's development process.
>Linus doesn't like Change-Id lines, but I think we could adapt Gerrit
>so it accepts URLs as IDs instead.
>
>There is talk of building a distributed/federated tool, but if there
>are policies ("Jane Doe is maintainer of the network subsystem, and
>can merge changes that only touch file in net/ "), then building
>something decentralized is really hard. You have to build
>infrastructure where Jane can prove to others who she is (PGP key
>signing parties?), and some sort of distributed storage of the policy
>rules.
>
>By contrast, a centralized server can authenticate users reliably and
>the server owner can define such rules. There can still be multiple
>gerrit servers, possibly sponsored by corporate entities (one from
>RedHat, one from Google, etc.), and different servers can support
>different authentication models (OpenID, OAuth, Google account, etc.)
How would multiple Gerrit servers operate if they are backed by
different authentication models? Something like a replication plugin
would require that each of these instances be a fully trusted source of
truth. I am not sure Red Hat would be happy to fully trust a replication
stream coming from its direct market competitors, especially if they are
in a position to forge identities.
Or do you mean they would be separate instances, and a maintainer would
pick where to host their subsystem? But then, if they pick Google's
Gerrit instance, how would engineers from China be able to participate?
Generally, unless there is a way to run Gerrit without explicitly
trusting the infrastructure and admins, I will be in strong opposition
to choosing it as the solution.
-K