workflows.vger.kernel.org archive mirror
* Lyon meeting notes
@ 2019-10-29 15:41 Han-Wen Nienhuys
  2019-10-29 22:26 ` Eric Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Han-Wen Nienhuys @ 2019-10-29 15:41 UTC (permalink / raw)
  To: workflows

Hi folks,

I tried to take some notes at the session today. They're a bit rough,
but I hope they'll be useful to someone.

https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Lyon meeting notes
  2019-10-29 15:41 Lyon meeting notes Han-Wen Nienhuys
@ 2019-10-29 22:26 ` Eric Wong
  2019-10-29 23:13   ` Bjorn Helgaas
  2019-10-29 22:35 ` Daniel Axtens
  2019-10-30  9:21 ` Jonathan Corbet
  2 siblings, 1 reply; 13+ messages in thread
From: Eric Wong @ 2019-10-29 22:26 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: workflows

Han-Wen Nienhuys <hanwen@google.com> wrote:
> Hi folks,
> 
> I tried to take some notes at the session today. They're a bit rough,
> but I hope they'll be useful to someone.
> 
> https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing

Thanks for taking notes.  Is there a version accessible to users
without JavaScript?  Thanks.


* Re: Lyon meeting notes
  2019-10-29 15:41 Lyon meeting notes Han-Wen Nienhuys
  2019-10-29 22:26 ` Eric Wong
@ 2019-10-29 22:35 ` Daniel Axtens
  2019-11-01 17:29   ` Konstantin Ryabitsev
  2019-10-30  9:21 ` Jonathan Corbet
  2 siblings, 1 reply; 13+ messages in thread
From: Daniel Axtens @ 2019-10-29 22:35 UTC (permalink / raw)
  To: Han-Wen Nienhuys, workflows

Hi all,

> I tried to take some notes at the session today. They're a bit rough,
> but I hope they'll be useful to someone.
>
> https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing

I can't quite make sense of the notes for Patchwork, but if you have
specific asks for us, we can try to target them for Patchwork 3.0, which
I expect will come out some time next year. (Modulo most of us doing
this as a hobby, of course.)

> KR: write local command to work with patchwork.
We have a couple of local commands already, pwclient and git-pw. Is
there something new that would be helpful?

As an aside, I know offline stuff has come up a few times, and I think
it should be reasonably straightforward to make a caching API client
that can buffer your requests until you're next online. I don't have any
cycles to do this, but I'm happy to help anyone who does.
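
A minimal sketch of that buffering idea (the Request shape and the send
callback here are invented for illustration -- this is not the
pwclient/git-pw API):

```python
# Sketch of the caching client idea: queue API requests while offline,
# replay them in order when back online. The Request shape and the send
# callback are invented -- this is not pwclient/git-pw API.
from collections import deque
from dataclasses import dataclass
from typing import Optional

@dataclass
class Request:
    method: str
    path: str
    payload: Optional[dict] = None

class OfflineBuffer:
    def __init__(self, send):
        self.send = send      # callable performing the real API call
        self.queue = deque()  # requests held while offline
        self.online = False

    def submit(self, req):
        if self.online:
            return self.send(req)
        self.queue.append(req)  # buffer until we're next online

    def go_online(self):
        self.online = True
        while self.queue:  # replay in submission order
            self.send(self.queue.popleft())

sent = []
buf = OfflineBuffer(sent.append)
buf.submit(Request("POST", "/patches/1/checks", {"state": "success"}))
buf.submit(Request("GET", "/patches/1"))
buf.go_online()
print(len(sent))  # → 2
```

The real client would of course need conflict handling for state that
changed server-side while offline; the queue-and-replay core is the
easy part.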

Regards,
Daniel


* Re: Lyon meeting notes
  2019-10-29 22:26 ` Eric Wong
@ 2019-10-29 23:13   ` Bjorn Helgaas
  2019-11-01 20:07     ` Konstantin Ryabitsev
  2019-11-01 21:34     ` Dmitry Vyukov
  0 siblings, 2 replies; 13+ messages in thread
From: Bjorn Helgaas @ 2019-10-29 23:13 UTC (permalink / raw)
  To: Eric Wong; +Cc: Han-Wen Nienhuys, workflows

On Tue, Oct 29, 2019 at 10:26:29PM +0000, Eric Wong wrote:
> > https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing
> 
> Thanks for taking notes.  Is there a version accessible to users
> without JavaScript?  Thanks.

Here it is:

Present:
* Konstantin Ryabitsev - technical director LF
* Google: Han-Wen Nienhuys, Dmitri Vyukov, David Gow, Brendan Higgins
* Christian Brauner - Canonical
* Shuah Khan - LF
* Greg KH
* Johan Holvold
* Kevin Hilman, KernelCI
* Veronika Kabatova - Red Hat/CKI CI
* Rafael Wysocki - intel
* Sasha Levin
* Frank Rowand
* Daniel Diaz - Linaro LKFT CI
* Daniel Vetter - intel
* Wolfram Sang
* Anasse Astier - freebox (?)


Consensus:
* Current situation is suboptimal/problematic
* CI folks
* Patchwork streamlines workflow; lot of activity now. Dormant for years, but now improving.
* Konstantin: patches: no attestation; no security. Easy to slip in vulns
* Linus checks sigs, but subsystem maintainers don’t.
* Konstantin: proposes minisign signatures.
* How realistic is this? (Steven).
* How big is the key? Ed25519 are short keys.
* Identity tracking? PGP giving up on key signing. TOFU.
* (inaudible)
* KR: signify/minisign background.
* PGP
* KR: Want it to be part of git.
* PGP signatures are attachments. Attachments are easily stripped from message.
* KR: want to archive history
* Complex patch doesn’t get in immediately, because patches need comment rounds, then spoofing gets exposed.
* Greg: base tree information will be great.
* Konstantin wants to put it into Git.
* Base tree
   * Discuss base commit
   * Hanwen: SHA1 is opaque too
   * KR: Linus complains that Changeid is equivalent to messageid, not so much opaqueness.
   * Hanwen: suggest to add a public URL to the base tree
   * Base goes into email; --base option git-format-patch.
   * Must become a requirement
   * Put into check-patch
   * Similar to signed-off
   * Not mandatory, andrew morton not using git. RFC patches also don’t need it.
* Gateways:
   * Point to tree, send from system
   * Inside corporations, HTTPS.
   * Adopt Gitgitgadget from github; creates mail patches from a GH repo.
   * Command line tool
   * Figuring out who to send this to.
   * Automation defeats attestation goal.
* KR: should just build GitGitGadget for kernel.
* How to know whom to send patch to?
   * So much cruft in maintainers file.
* Interaction git-format-patch and config is tricky.
* Dmitrii Vyukov:
   * Can have a server to do this
   * KR: don’t want centralized infrastructure
   * Dmitrii: but gitgitgadget is the same?
* (14:35): feeds.
   * Human consumable information
   * Kernel.org can aggregate all the feeds, and can tell what CIs are still missing.
   * CI mail has logs, but the results are transient
   * Kernel.org can archive all these data.
   * Will be a lot of data, but want to start with feed.
   * Needs a common structured format to understand what all CI systems have done.
   * Attestation
   * Steven: could record the acks/reviewed-by.
* 2nd part of discussion: tooling.
   * Lore 200 Gb.
* [lost a lot of conversation here]
* Patchwork:
   * Has a web interface
   * Can run locally.
   * Inbox vs patchwork
   * Patchwork with approvals from different maintainers.
   * ...
   * KR: write local command to work with patchwork.
* KR: daniel uses gitlab, some people want to use gerrit
   * KR: wants to have a feed of data.
   * Mail from gerrit/gitlab, usually is noisy.
   * Tool can consume that feed.
   * Libc mailing list, still struggling
* Hanwen: Funding for tooling? Does Linux Foundation build the bridges, or do tool owners (gerrit, gitlab) have to do it?
   * Linux Foundation can go to companies to ask for funding
   * KR trying to get consensus so we can ask for resources & funding as a group.
   * Let people use tools, sourcehut, gitlab, gerrit
* KR: Lore.kernel.org:
   * Want to be able to search all over all data, gerrit, kernel etc. (like code search)
   * Find all the patches that touch XYZ
* Devs can miss reviews because people don’t know where reviews happen.
   * KR: have a bot that will respond on behalf if maintainer has no gerrit account.
   * KR: long time initiative: want to move to SSB.


* Re: Lyon meeting notes
  2019-10-29 15:41 Lyon meeting notes Han-Wen Nienhuys
  2019-10-29 22:26 ` Eric Wong
  2019-10-29 22:35 ` Daniel Axtens
@ 2019-10-30  9:21 ` Jonathan Corbet
  2 siblings, 0 replies; 13+ messages in thread
From: Jonathan Corbet @ 2019-10-30  9:21 UTC (permalink / raw)
  To: Han-Wen Nienhuys; +Cc: workflows

On Tue, 29 Oct 2019 16:41:37 +0100
Han-Wen Nienhuys <hanwen@google.com> wrote:

> I tried to take some notes at the session today. They're a bit rough,
> but I hope they'll be useful to someone.
> 
> https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing

Thanks for posting those.  I have a lot of notes too -- at least, for the
parts of the conversation I could actually hear.  They should go out via
the usual channel by the end of the week.

jon


* Re: Lyon meeting notes
  2019-10-29 22:35 ` Daniel Axtens
@ 2019-11-01 17:29   ` Konstantin Ryabitsev
  2019-11-01 17:35     ` Dmitry Vyukov
  2019-11-02 11:46     ` Steven Rostedt
  0 siblings, 2 replies; 13+ messages in thread
From: Konstantin Ryabitsev @ 2019-11-01 17:29 UTC (permalink / raw)
  To: Daniel Axtens; +Cc: Han-Wen Nienhuys, workflows

On Wed, Oct 30, 2019 at 09:35:08AM +1100, Daniel Axtens wrote:
> > KR: write local command to work with patchwork.
> We have a couple of local commands already, pwclient and git-pw. Is
> there something new that would be helpful?
> 
> As an aside, I know offline stuff has come up a few times, and I think
> it should be reasonably straightforward to make a caching API client
> that can buffer your requests until you're next online. I don't have any
> cycles to do this, but I'm happy to help anyone who does.

This is part of the tooling work that we discussed and I plan to have a
more elaborate proposal finalized next week. We are hoping to fund the
development of two closely related tools, one CLI and another using the
browser interface, using a web server available over a localhost (or a
local network) connection.

The overall set of features for the web-interface would be:

1. Run the web interface locally in a container, to be accessed over
   localhost
2. Use public-inbox feeds as the source of incoming patches, allowing
   pre-filtering by:
   - list-id
   - from/to/cc
   - keywords
   - patch file path
3. Integrate natively with the local repository to allow the following
   functionality:
   - apply series directly to the repo, from the proper base-commit parent
   - recognize patches that have already been applied and auto-archive
     them as necessary (current kernel.org's patchwork-bot functionality)
4. Offer code-review features using the web interface (something
   similar to gerrit), with outgoing email sent back out to the
   submitter (plus list, cc's, etc), using the reviewer's own From: and
   smtp server information

For lack of a better term, I'm calling it "local patchwork", though
it's more likely to be a closely related spin-off that would hopefully
share a lot of code with patchwork and be able to both benefit from
upstream and to commit code back up for any shared functionality.

Caveat: this is not finalized and I will be putting the proper
proposal up for discussion here. Since we're talking about both a CLI
and a web interface to largely the same functionality, it's possible
that instead of attempting to run patchwork in a container, it would
make a lot more sense to run a daemon exposing a REST API that both the
CLI tool and the web tool would consume. However, this would require
rewriting a whole lot of things from scratch and would end up a lot more
difficult to both fund and develop -- which is why I'm leaning towards
adding these features to Patchwork instead.
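
As a rough illustration of the daemon alternative (the endpoint name
and the data below are invented -- this is not the Patchwork API):

```python
# Hypothetical sketch of the daemon alternative: a single localhost REST
# API that both a CLI tool and a web UI could consume. The endpoint name
# and the data below are invented -- this is not the Patchwork API.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

PATCHES = [{"id": 1, "subject": "[PATCH] fix thing", "state": "new"}]

class Api(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/patches":  # hypothetical endpoint
            body = json.dumps(PATCHES).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, *args):  # keep the demo quiet
        pass

# Bind to an ephemeral localhost port; CLI and web tool would share it.
server = HTTPServer(("127.0.0.1", 0), Api)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = "http://127.0.0.1:%d/patches" % server.server_address[1]
with urllib.request.urlopen(url) as resp:
    patches = json.load(resp)
server.shutdown()
print(patches[0]["subject"])  # → [PATCH] fix thing
```

Both front-ends consuming one local API is what makes the
online/offline behavior consistent; the cost, as noted, is rewriting
a lot of what Patchwork already does.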

-K


* Re: Lyon meeting notes
  2019-11-01 17:29   ` Konstantin Ryabitsev
@ 2019-11-01 17:35     ` Dmitry Vyukov
  2019-11-02 11:46     ` Steven Rostedt
  1 sibling, 0 replies; 13+ messages in thread
From: Dmitry Vyukov @ 2019-11-01 17:35 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Daniel Axtens, Han-Wen Nienhuys, workflows

On Fri, Nov 1, 2019 at 6:29 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> On Wed, Oct 30, 2019 at 09:35:08AM +1100, Daniel Axtens wrote:
> > > KR: write local command to work with patchwork.
> > We have a couple of local commands already, pwclient and git-pw. Is
> > there something new that would be helpful?
> >
> > As an aside, I know offline stuff has come up a few times, and I think
> > it should be reasonably straightforward to make a caching API client
> > that can buffer your requests until you're next online. I don't have any
> > cycles to do this, but I'm happy to help anyone who does.
>
> This is part of the tooling work that we discussed and I plan to have a
> more elaborate proposal finalized next week. We are hoping to fund the
> development of two closely related tools, one CLI and another using the
> browser interface, using a web server available over a localhost (or a
> local network) connection.

Hi Konstantin,

I am not sure whether you are implying it here, but I would assume it
would also be useful to run an actual "web" copy of this server over a
federated feed. Because... why not? It should be easy to do and would
provide a unified experience for the online/local modes.

Otherwise looks good.

> The overall set of features for the web-interface would be:
>
> 1. Run the web interface locally in a container, to be accessed over
>    localhost
> 2. Use public-inbox feeds as the source of incoming patches, allowing
>    pre-filtering by:
>    - list-id
>    - from/to/cc
>    - keywords
>    - patch file path
> 3. Integrate natively with the local repository to allow the following
>    functionality:
>    - apply series directly to the repo, from the proper base-commit parent
>    - recognize patches that have already been applied and auto-archive
>      them as necessary (current kernel.org's patchwork-bot functionality)
> 4. Offer code-review features using the web interface (something
>    similar to gerrit), with outgoing email sent back out to the
>    submitter (plus list, cc's, etc), using the reviewer's own From: and
>    smtp server information
>
> For lack of a better term, I'm calling it "local patchwork", though
> it's more likely to be a closely related spin-off that would hopefully
> share a lot of code with patchwork and be able to both benefit from
> upstream and to commit code back up for any shared functionality.
>
> Caveat: this is not finalized and I will be putting the proper
> proposal up for discussion here. Since we're talking about both a CLI
> and a web interface to largely the same functionality, it's possible
> that instead of attempting to run patchwork in a container, it would
> make a lot more sense to run a daemon exposing a REST API that both the
> CLI tool and the web tool would consume. However, this would require
> rewriting a whole lot of things from scratch and would end up a lot more
> difficult to both fund and develop -- which is why I'm leaning towards
> adding these features to Patchwork instead.
>
> -K


* Re: Lyon meeting notes
  2019-10-29 23:13   ` Bjorn Helgaas
@ 2019-11-01 20:07     ` Konstantin Ryabitsev
  2019-11-01 20:46       ` Geert Uytterhoeven
  2019-11-01 21:30       ` Theodore Y. Ts'o
  2019-11-01 21:34     ` Dmitry Vyukov
  1 sibling, 2 replies; 13+ messages in thread
From: Konstantin Ryabitsev @ 2019-11-01 20:07 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Eric Wong, Han-Wen Nienhuys, workflows

On Tue, Oct 29, 2019 at 06:13:13PM -0500, Bjorn Helgaas wrote:
> On Tue, Oct 29, 2019 at 10:26:29PM +0000, Eric Wong wrote:
> > > https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing
> > 
> > Thanks for taking notes.  Is there a version accessible to users
> > without JavaScript?  Thanks.

I'll try to fill in the missing details below, to the best of my
recollection.

> Consensus:
> * Current situation is suboptimal/problematic
> * CI folks
> * Patchwork streamlines workflow; lot of activity now. Dormant for years, but now improving.
> * Konstantin: patches: no attestation; no security. Easy to slip in vulns

I must highlight that some of those present didn't see this as
inherently a bad thing -- code contributions come from untrusted brains,
if you will, so the fact that submissions traverse untrusted channels
does not make these contributions any more untrustworthy. All code must
be treated as potentially dangerous -- whether because it is
intentionally malicious or just buggy -- so adding cryptographic
signatures at this stage of the code review process would offer no
meaningful improvement. In fact, it can lull maintainers into a false
sense of security where, arguably, none should be.

While I don't disagree with this, I feel that in reality the
maintainers' attention span is already overtaxed, so adding end-to-end
verifiable developer attestation will bring more good than harm. Those
maintainers who consider it harmful to their process can simply choose
to ignore the whole thing.

> * Linus checks sigs, but subsystem maintainers don’t.

Rather, they can't, because there is no accepted or workable mechanism
for doing so.

> * Konstantin: proposes minisign signatures.

Specifically, "signify-compatible" signatures, not necessarily
signatures made with minisign (which implements signify via libsodium).
Minisign adds some things that may not be interesting to us anyway,
since we are not signing actual files.

The main (and the most significant) downside of minisign/signify is that
it doesn't integrate with hardware crypto devices the same way gnupg
offloads key storage and operation to a TPM or a cryptocard. If we
choose to go the way of signify-compatible signatures, we are opting to
store the key locally and do key processing in the main memory. I feel
very conflicted about this -- but it's not like any significant number
of people use hardware tokens for their PGP operations right now anyway.

> * How realistic is this? (Steven).
> * How big is the key? Ed25519 are short keys.

ECC cryptography is preferred over RSA because:

- private and public keys are dramatically shorter, but offer similar
  cryptographic strength
- ECC operations are much faster
- ECC signatures are dramatically smaller

(To dispel a common misconception, ECC is *not* quantum-proof.
However, we don't currently have any reasonably usable quantum-resistant
asymmetric crypto, so it's not useful to discard ECC for this reason.
Besides, it's not like we're putting billions of dollars into ECC the
way bitcoin is.)

> * Identity tracking? PGP giving up on key signing. TOFU.

Identity management is a different and very hard problem. I'm hoping we
can benefit from the work done by did:git folks.
https://github.com/dhuseby/did-git-spec/blob/master/did-git-spec.md

> * (inaudible)
> * KR: signify/minisign background.
> * PGP
> * KR: Want it to be part of git.

Indeed, I don't want this to be some kind of external wrapper tool,
because that would assure non-adoption. Attestation needs to be done
natively by git.

> * PGP signatures are attachments. Attachments are easily stripped from message.
> * KR: want to archive history

From my perspective, the main goal of introducing attestation at the
email protocol level is for archival/legal review purposes and to remove
any remaining trust in the infrastructure. Currently, we inherently
trust the following systems not to do anything malicious: vger, lore,
patchwork. We should work to make attestation end-to-end.

> * Complex patch doesn’t get in immediately, because patches need comment rounds, then spoofing gets exposed.

To clarify:

The argument was that attempts to sneak in malicious code while
pretending to be someone else would be quickly discovered, because any
significant code contribution requires back-and-forth and if the "From"
address is spoofed, then the real developer would quickly point out that
they are not the actual author of the code.

My counter-argument is that history proves that we can't trust humans to
recognize maliciously misspelled domains. If you receive a submission
like this:

From: Konstantin Ryabitsev <konstantin@linuxfoudnation.org>

you will need to pay very close attention to that "d" and "n" to realize
that it didn't actually come from me.
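
A machine, on the other hand, catches this trivially. A toy sketch,
using a hypothetical allowlist of known contributor domains:

```python
# Toy lookalike-domain check for From: headers, using an invented
# allowlist of known contributor domains. difflib flags domains that
# are *almost* (but not exactly) a known one.
import difflib
import email.utils

KNOWN_DOMAINS = {"linuxfoundation.org", "kernel.org"}  # invented allowlist

def suspicious_domain(from_header, threshold=0.85):
    """Return the known domain this one impersonates, or None if clean."""
    _, addr = email.utils.parseaddr(from_header)
    domain = addr.rpartition("@")[2].lower()
    if domain in KNOWN_DOMAINS:
        return None  # exact match: fine
    for known in KNOWN_DOMAINS:
        if difflib.SequenceMatcher(None, domain, known).ratio() >= threshold:
            return known  # close but not equal: likely spoofed
    return None

print(suspicious_domain("Konstantin Ryabitsev <konstantin@linuxfoudnation.org>"))
# → linuxfoundation.org
```

Cryptographic attestation makes even this check unnecessary: a spoofed
sender simply cannot produce a valid signature.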

> * Greg: base tree information will be great.
> * Konstantin wants to put it into Git.

It's already in git starting with version 2.9.0 (see `man
git-format-patch` for `--base` and `BASE TREE INFORMATION` sections). I
want it to be required.
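
For illustration, the trailer ends up below the "---" line of the patch
mail, where it is trivially machine-checkable (the patch text below is
invented):

```python
# Sketch: the base-tree trailer `git format-patch --base=<commit>` adds
# below the "---" line of a patch. The patch text here is invented.
import re

patch = """\
Subject: [PATCH] example: add a file

Signed-off-by: Dev <dev@example.org>
---
base-commit: 1234567890abcdef1234567890abcdef12345678
 file.c | 1 +
 1 file changed, 1 insertion(+)
"""

m = re.search(r"^base-commit: ([0-9a-f]{40})$", patch, re.MULTILINE)
print(m.group(1))  # → 1234567890abcdef1234567890abcdef12345678
```

This is what would let checkpatch (or a patchwork bot) reject series
that lack base-tree information, if it ever becomes required.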

> * Base tree
>    * Discuss base commit
>    * Hanwen: SHA1 is opaque too
>    * KR: Linus complains that Changeid is equivalent to messageid, not so much opaqueness.
>    * Hanwen: suggest to add a public URL to the base tree
>    * Base goes into email; --base option git-format-patch.
>    * Must become a requirement
>    * Put into check-patch
>    * Similar to signed-off
>    * Not mandatory, andrew morton not using git. RFC patches also don’t need it.
> * Gateways:

Specifically, we were talking about adding gateways that would translate
git-native operations like push or pull-request into mailing list
submissions -- a patch or a series of patches.

>    * Point to tree, send from system
>    * Inside corporations, HTTPS.

This is the protocol most likely to remain unhindered behind corporate
firewalls.

>    * Adopt Gitgitgadget from github; creates mail patches from a GH repo.

This was my action proposal to adopt GitGitGadget for Linux Kernel
purposes. Since it already exists, it requires the least amount of
effort to get going.

>    * Command line tool

To clarify, we talked about having a wrapper around "git format-patch"
or "git request-pull" that would translate the contributor's work from a
local git tree into a properly formatted mailing list submission (and
send it off via a limited SMTP gateway offered by kernel.org). It would
require a proposal for funded work.

>    * Figuring out who to send this to.

General comment that "get_maintainer.pl" often returns too many hits.

>    * Automation defeats attestation goal.

*Some* automation would be incompatible with our goal of developer
end-to-end attestation, since the private key would need to be stored on
the system used by said automation.

>    * KR: should just build GitGitGadget for kernel.
> * How to know whom to send patch to?
>    * So much cruft in maintainers file.
> * Interaction git-format-patch and config is tricky.
> * Dmitrii Vyukov:
>    * Can have a server to do this
>    * KR: don’t want centralized infrastructure

Rather, I don't want *exclusive* centralized infrastructure. I'm fine
with running a service that anyone else can run as well, one that
doesn't introduce a hard dependency on a kernel.org-managed resource.

>    * Dmitrii: but gitgitgadget is the same?
> * (14:35): feeds.
>    * Human consumable information

We've gone over the idea of feeds multiple times in the past, but
specifically we're talking about public-inbox repositories that are
continuously updated via chained commits overwriting previous commit
data. These feeds contain RFC-2822 ("email") messages consisting of
headers and bodies, where the latter can contain MIME-formatted
attachments of various content-types. Generally, messages of this format
are intended for communication with humans, as opposed to with other
automated processes. The format that seems to be most commonly used for
non-human communication is JSON.

>    * Kernel.org can aggregate all the feeds, and can tell what CIs are still missing.

As opposed to emerging systems (like SSB) that have feed auto-discovery
implemented as part of the protocol, public-inbox doesn't have this
capability, so feed discovery must be managed via some side channel.

>    * CI mail has logs, but the results are transient

CI systems can send out emails to developers that contain limited
human-readable information. Frequently, these emails include links where
developers can get more information about the results, such as logs,
tracebacks, object dumps, etc. This data tends to be transient in the
sense that it will be deleted after a period of time in order to free up
space. My hope is that CI systems can provide this data as a feed
allowing archival systems (like kernel.org) to replicate the feed data,
including all pertinent information, and archive them for future
reference. My preferred way of doing this would be using a public-inbox
feed containing multiple refs:

refs/heads/master -- RFC-2822 formatted messages intended for humans
refs/heads/json -- JSON formatted data intended for other automation

Entries in master and json refs would use the same unique message-id
allowing cross-referencing.
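
A sketch of that cross-referencing (both feed entries below are
invented CI results):

```python
# Sketch of the master/json cross-referencing by message-id. Both feed
# entries below are invented CI results.
import email
import json

human = email.message_from_string(
    "Message-Id: <ci-run-42@example.ci>\n"
    "Subject: BUILD SUCCESS: linux-next\n"
    "\n"
    "All 120 configs built fine.\n"
)

machine = json.loads(
    '{"message-id": "<ci-run-42@example.ci>", "result": "pass", "configs": 120}'
)

# The shared message-id lets automation join the two views of one event.
assert human["Message-Id"] == machine["message-id"]
print(machine["result"])  # → pass
```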

Large binary objects can be linked using git-lfs, allowing their
retrieval and mirroring via `git lfs fetch --all` (I've not yet fully
fleshed out this idea).

>    * Kernel.org can archive all these data.
>    * Will be a lot of data, but want to start with feed.

I will admit the folly of this. :) If we're talking about CI binary
objects, then we're talking about terabytes of data monthly -- but I'd
like to try. It's only expensive when it needs to be fast and the way I
see this happening, it doesn't need to be fast, it just needs to be
retrievable.

>    * Needs a common structured format to understand what all CI systems have done.
>    * Attestation

Git commits can be signed, so this gives us builtin attestation.

>    * Steven: could record the acks/reviewed-by.

We were talking about developer feeds that are basically public-inbox
repositories of the developer's sent mail. I will talk about these
separately in the near future.

> * 2nd part of discussion: tooling.
>    * Lore 200 Gb.

Most of the disk space on lore.kernel.org is taken up by Xapian
databases. The git repositories themselves -- of all lists currently
archived on lore.kernel.org -- are just over 20GB.

> * [lost a lot of conversation here]
> * Patchwork:
>    * Has a web interface
>    * Can run locally.
>    * Inbox vs patchwork
>    * Patchwork with approvals from different maintainers.
>    * ...
>    * KR: write local command to work with patchwork.

See my email about "local patchwork" to get more clarity around this.

> * KR: daniel uses gitlab, some people want to use gerrit

Minor correction -- I thought the DRM subsystem was already using
Gitlab for its work, but it isn't. Gitlab is used for a lot of other
graphics subsystem work, but the actual kernel DRM subsystem is not
using it yet.

>    * KR: wants to have a feed of data.
>    * Mail from gerrit/gitlab, usually is noisy.

My proposal is to have "forge liberation bots" that record and expose
all public activity happening inside forges like Gitlab, Github, Gerrit,
etc. While many of these offer a way to send email activity
notifications to mailing lists, such notifications are formatted in a
forge-specific way, don't cover all aspects of forge activity, and are
frequently a source of annoyance to mailing list subscribers who don't
care to see various "so-and-so added themselves to the CC on this issue"
messages.

Many of these forges offer a way to subscribe bots to the project's
event streams, so my proposal is to write forge-specific bots that would
connect to these event streams and record all pertinent information into
public-inbox feeds that can be mirrored and distributed. Developers can
then choose to subscribe to these feeds in the same way they can
subscribe to mailing list or developer feeds, plus they can be indexed
and made searchable via sites like lore.kernel.org.
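
A hypothetical sketch of what such a bot might do -- every field and
address below is invented for illustration:

```python
# Hypothetical sketch of a "forge liberation bot": turn a forge's JSON
# event into an RFC-2822 message suitable for a public-inbox feed.
# Every field and address here is invented for illustration.
import json
from email.message import EmailMessage

event = json.loads("""{
  "type": "comment",
  "project": "example/driver",
  "author": "Reviewer <reviewer@example.org>",
  "change_url": "https://forge.example.org/c/1234",
  "body": "Looks good to me."
}""")

msg = EmailMessage()
msg["From"] = event["author"]
msg["To"] = "driver-feed@example.org"
msg["Subject"] = "Re: [%s] new %s" % (event["project"], event["type"])
msg["X-Forge-Url"] = event["change_url"]  # link back to the forge
msg.set_content(event["body"])
print(msg["Subject"])  # → Re: [example/driver] new comment
```

Once activity is in this form, it can be committed to a public-inbox
repository and mirrored, indexed, and searched like any mailing list.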

Initially, these bots would be "read-only", but if we are successful in
keeping these feeds/bots useful (and stable), we can then offer
read-write integration so that developers can participate in forge
activities without needing to register an account on the forge or log
into the web interface. Functionality like this would be impossible
without working end-to-end developer attestation and feed discovery, so
anything like this is far, far in the mysterious future and requires a
lot of effort, perseverance, and luck before we get there.

>    * Tool can consume that feed.
>    * Libc mailing list, still struggling

To clarify -- the comment from one of the attendees was that the glibc
project is experimenting with using an email-based workflow that
backends into a gerrit instance. The web interface of the instance is
read-only and all activity must be performed via email.

> * Hanwen: Funding for tooling? Does Linux Foundation build the bridges, or do tool owners (gerrit, gitlab) have to do it?
>    * Linux Foundation can go to companies to ask for funding
>    * KR trying to get consensus so we can ask for resources & funding as a group.

It's my hope that I can get enough consensus from the developer
community that would allow me to put forth a proposal that is backed by
"all the important people in Linux" and get it funded via channels
available to the Linux Foundation. Linux Foundation itself does not have
operating funds for efforts like this, but it is able to work with its
member companies and other interested parties to solicit funding,
provided a clear goal and clear majority community support behind the
initiative.

>    * Let people use tools, sourcehut, gitlab, gerrit

If we are successful in building the "forge liberation bots," then we
make it possible for subsystems to choose their own preferred tools
without the fear that it will sequester that development effort inside a
walled garden.

If we are then able to teach these bots to bridge between forges, then
we'll find ourselves in the distributed development nirvana that I
described in my "patches carved into developer sigchains" blog post. :)

> * KR: Lore.kernel.org:
>    * Want to be able to search all over all data, gerrit, kernel etc. (like code search)
>    * Find all the patches that touch XYZ

Current limitation of lore.kernel.org is that the search is per-list --
you need to know where to look for data before you can find it. If we
start aggregating feeds from multiple sources (mailing lists, forges,
CI systems, individual developers), then we need a search box that works
across all of these feeds and presents the data in a useful format. This
is work that I hope we can fund.

> * Devs can miss reviews because people don’t know where reviews happen.
>    * KR: have a bot that will respond on behalf if maintainer has no gerrit account.

See "far, far in the future, if we are lucky" bit above.

>    * KR: long time initiative: want to move to SSB.

Rather, replace the smtp communication fabric with something else that
doesn't suffer from all the horrible downsides of using a protocol that
has been corrupted by MUAs, corporate mail servers, etc.

Eventually. If it makes sense.

-K


* Re: Lyon meeting notes
  2019-11-01 20:07     ` Konstantin Ryabitsev
@ 2019-11-01 20:46       ` Geert Uytterhoeven
  2019-11-01 21:30       ` Theodore Y. Ts'o
  1 sibling, 0 replies; 13+ messages in thread
From: Geert Uytterhoeven @ 2019-11-01 20:46 UTC (permalink / raw)
  To: Konstantin Ryabitsev
  Cc: Bjorn Helgaas, Eric Wong, Han-Wen Nienhuys, workflows

Hi Konstantin,

On Fri, Nov 1, 2019 at 9:10 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
> On Tue, Oct 29, 2019 at 06:13:13PM -0500, Bjorn Helgaas wrote:
> > On Tue, Oct 29, 2019 at 10:26:29PM +0000, Eric Wong wrote:
> > * Linus checks sigs, but subsystem maintainers don’t.
>
> Rather, they can't, because there is no accepted or workable mechanism
> for doing so.

That depends.  The pull requests I send to subsystem maintainers use
signed tags, just like the ones I send to Linus.

The difference lies in the transport medium: Linus receives most commits
through (signed) pull requests, while most (leaf) maintainers receive
(non-signed) emailed patches.

> > * Identity tracking? PGP giving up on key signing. TOFU.

I guess TOFU is good enough for patch submissions.
Does it really matter who I am? Yeah, the DCO says I must use my real name.
But would people stop trusting my patches if I suddenly announced my name
is not (and has never been) Geert Uytterhoeven?

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Lyon meeting notes
  2019-11-01 20:07     ` Konstantin Ryabitsev
  2019-11-01 20:46       ` Geert Uytterhoeven
@ 2019-11-01 21:30       ` Theodore Y. Ts'o
  2019-11-02  1:17         ` Eric Wong
  1 sibling, 1 reply; 13+ messages in thread
From: Theodore Y. Ts'o @ 2019-11-01 21:30 UTC (permalink / raw)
  To: Bjorn Helgaas, Eric Wong, Han-Wen Nienhuys, workflows

On Fri, Nov 01, 2019 at 04:07:55PM -0400, Konstantin Ryabitsev wrote:
> The argument was that attempts to sneak in malicious code while
> pretending to be someone else would be quickly discovered, because any
> significant code contribution requires back-and-forth and if the "From"
> address is spoofed, then the real developer would quickly point out that
> they are not the actual author of the code.
> 
> My counter-argument is that history proves that we can't trust humans to
> recognize maliciously misspelled domains. If you receive a submission
> like this:
> 
> From: Konstantin Ryabitsev <konstantin@linuxfoudnation.org>
> 
> you will need to pay very close attention to that "d" and "n" to realize
> that it didn't actually come from me.
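
That "d"/"n" swap is exactly the kind of thing a machine catches more
reliably than a tired human.  A minimal sketch of such a check, assuming
a hypothetical allowlist of trusted domains and plain edit distance
(nothing here is existing tooling):

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# Hypothetical allowlist; a real tool would build this from known senders.
TRUSTED = {"linuxfoundation.org", "kernel.org"}

def suspicious(domain: str) -> bool:
    """Flag domains that are near, but not equal to, a trusted one."""
    return any(0 < edit_distance(domain, t) <= 2 for t in TRUSTED)
```

Under this rule, "linuxfoudnation.org" (distance 2 from the real domain)
gets flagged, while the genuine domain and unrelated domains pass.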

The other caution I'd raise here about why signing individual commits
might not be the panacea we might hope it would be is that the vast
majority of kernel developers don't today have cryptographic
identities, and we are constantly welcoming new developers to kernel
development.

Even if we did have a way to get new ED25519 keys signed for all of
these new developers, knowing their identity says nothing about how
much they are (or should be) trusted.

Consider that any sufficiently well-resourced actor who really wants
to sneak in malicious code, especially when we consider how much
zero-day exploits are worth on the open market, will be quite willing
to establish a "legend" for a developer.  The "developer" might submit
a dozen cleanup patches, all of which are good, and genuinely improve
the kernel --- and it will be the 13th or the 31st submission that will
have the malicious change hidden in it.  The fact that it is signed by
a key that had previously signed 30 patches says nothing about how
good the 31st patch will be.

For bonus style points, the patch might have something which claims to
be the application of a Coccinelle semantic patch --- and maybe in the
V1 and V2 version of the patch series, it was in fact a Coccinelle
patch, but in the v3 patch, that's where malicious code was slipped
in, and since V2 had received a Reviewed-by, and it was supposedly an
automatically generated Coccinelle patch, no one took a close look and
noticed that the v3 version of the patch was different from the v2
version....

There are certainly ways we could try to make this sort of thing
harder; we can have tools that verify that the Coccinelle script
mentioned in the commit description actually matches with what the
commit changes.  And we could also have tools which flag deltas
between the Vn and Vn+1 version of a patch, especially after the Vn
version of the patch has gotten a reviewed-by.  It's just that none of
these fixes have anything to do with digitally signed commits.
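
That Vn-to-Vn+1 flagging is easy to prototype.  A minimal sketch using
Python's standard difflib, with made-up patch bodies for illustration:

```python
import difflib

def interdiff(v_old: str, v_new: str) -> list[str]:
    """Return unified-diff lines showing what changed between two
    versions of the same patch; an empty list means nothing changed."""
    return list(difflib.unified_diff(
        v_old.splitlines(keepends=True),
        v_new.splitlines(keepends=True),
        fromfile="v2", tofile="v3"))

# Illustrative patch bodies: v3 quietly adds one extra hunk line.
v2 = "-old_call(x)\n+new_call(x)\n"
v3 = "-old_call(x)\n+new_call(x)\n+backdoor(x)\n"

delta = interdiff(v2, v3)
# A non-empty delta after v2 already got its Reviewed-by is the signal
# that the later version needs a fresh, close look.
```

A real tool would of course diff the patches as mbox messages and
ignore noise like changed Message-IDs, but the core check is just this.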

	      	    	 	     	- Ted

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Lyon meeting notes
  2019-10-29 23:13   ` Bjorn Helgaas
  2019-11-01 20:07     ` Konstantin Ryabitsev
@ 2019-11-01 21:34     ` Dmitry Vyukov
  1 sibling, 0 replies; 13+ messages in thread
From: Dmitry Vyukov @ 2019-11-01 21:34 UTC (permalink / raw)
  To: Bjorn Helgaas; +Cc: Eric Wong, Han-Wen Nienhuys, workflows

On Wed, Oct 30, 2019 at 12:13 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Tue, Oct 29, 2019 at 10:26:29PM +0000, Eric Wong wrote:
> > > https://docs.google.com/document/d/1khLOBw5-HyaaNX7xregpHQLSfvGDUeHDY921bkI-_os/edit?usp=sharing
> >
> > Thanks for taking notes.  Is there a version accessible to users
> > without JavaScript?  Thanks.
>
> Here it is:

FTR, here is LWN write up:
https://lwn.net/Articles/803619/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Lyon meeting notes
  2019-11-01 21:30       ` Theodore Y. Ts'o
@ 2019-11-02  1:17         ` Eric Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Eric Wong @ 2019-11-02  1:17 UTC (permalink / raw)
  To: Theodore Y. Ts'o; +Cc: Bjorn Helgaas, Han-Wen Nienhuys, workflows

"Theodore Y. Ts'o" <tytso@mit.edu> wrote:
> On Fri, Nov 01, 2019 at 04:07:55PM -0400, Konstantin Ryabitsev wrote:
> > The argument was that attempts to sneak in malicious code while
> > pretending to be someone else would be quickly discovered, because any
> > significant code contribution requires back-and-forth and if the "From"
> > address is spoofed, then the real developer would quickly point out that
> > they are not the actual author of the code.
> > 
> > My counter-argument is that history proves that we can't trust humans to
> > recognize maliciously misspelled domains. If you receive a submission
> > like this:
> > 
> > From: Konstantin Ryabitsev <konstantin@linuxfoudnation.org>
> > 
> > you will need to pay very close attention to that "d" and "n" to realize
> > that it didn't actually come from me.
> 
> The other caution I'd raise here about why signing individual commits
> might not be the panacea we might hope it would be is that the vast
> majority of kernel developers don't today have cryptographic
> identities, and we are constantly welcoming new developers to kernel
> development.
> 
> Even if we did have a way to get new ED25519 keys signed for all of
> these new developers, knowing their identity says nothing about how
> much they are (or should be) trusted.
> 
> Consider that any sufficiently well-resourced actor who really wants
> to sneak in malicious code, especially when we consider how much
> zero-day exploits are worth on the open market, will be quite willing
> to establish a "legend" for a developer.  The "developer" might submit
> a dozen cleanup patches, all of which are good, and genuinely improve
> the kernel --- and it will be the 13th or the 31st submission that will
> have the malicious change hidden in it.  The fact that it is signed by
> a key that had previously signed 30 patches says nothing about how
> good the 31st patch will be.

Agreed.  A well-resourced adversary could also coerce a
well-meaning developer into signing a malicious change.  Perhaps
I'm paranoid, but that's a really scary thing if people rely on
identities and reputation too much.  I've always cautioned users
against trusting me for that reason (that, and I'm error-prone :x).

> For bonus style points, the patch might have something which claims to
> be the application of a Coccinelle semantic patch --- and maybe in the
> V1 and V2 version of the patch series, it was in fact a Coccinelle
> patch, but in the v3 patch, that's where malicious code was slipped
> in, and since V2 had received a Reviewed-by, and it was supposedly an
> automatically generated Coccinelle patch, no one took a close look and
> noticed that the v3 version of the patch was different from the v2
> version....
> 
> There are certainly ways we could try to make this sort of thing
> harder; we can have tools that verify that the Coccinelle script
> mentioned in the commit description actually matches with what the
> commit changes.  And we could also have tools which flag deltas
> between the Vn and Vn+1 version of a patch, especially after the Vn
> version of the patch has gotten a reviewed-by.  It's just that none of
> these fixes have anything to do with digitally signed commits.

Some of that could be automated, yes, but maintainers must still
remain vigilant.

A lot of it comes down to a culture that prioritizes feature
development over long-term maintenance and review, so improving the
eyes-to-code ratio is what's needed.  That's a deeper issue which
affects every project, unfortunately.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Lyon meeting notes
  2019-11-01 17:29   ` Konstantin Ryabitsev
  2019-11-01 17:35     ` Dmitry Vyukov
@ 2019-11-02 11:46     ` Steven Rostedt
  1 sibling, 0 replies; 13+ messages in thread
From: Steven Rostedt @ 2019-11-02 11:46 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Daniel Axtens, Han-Wen Nienhuys, workflows

On Fri, 1 Nov 2019 13:29:10 -0400
Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:

> For the lack of a better term, I'm calling it "local patchwork", though
> it's more likely to be a closely related spin-off that would hopefully
> share a lot of code with patchwork and be able to both benefit from
> upstream and to commit code back up for any shared functionality.

I know this was one of the "not within 6 months" parts, but I don't
want this to be lost.

One of the issues with a 'local patchwork' that was brought up is
multiple maintainers of a single subsystem: one maintainer might say
"accept" and another might say "reject", and if one of them was working
offline, uploading their verdict to the central system could conflict
with the changes there.

What I suggested was, for multi-maintainer systems, to allow for
individual accounts, where an "accept" of a patch would require an
accept from all the given maintainers; kind of like a hierarchy. This
way, even if you are offline, you can upload your own "accept" and it
won't affect the "reject" from the other maintainer, but you would
still be able to see everyone's judgment of a patch when you get back
online.
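
That merge rule can be sketched in a few lines (all names here are
hypothetical, not actual patchwork code): each maintainer owns one
verdict slot, any reject wins, and acceptance needs everyone's accept.

```python
from enum import Enum

class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    PENDING = "pending"

def patch_state(verdicts: dict[str, Verdict]) -> Verdict:
    """Overall state of a patch under multi-maintainer review:
    any reject wins, acceptance requires every maintainer's accept,
    otherwise the patch is still pending."""
    values = verdicts.values()
    if Verdict.REJECT in values:
        return Verdict.REJECT
    if values and all(v is Verdict.ACCEPT for v in values):
        return Verdict.ACCEPT
    return Verdict.PENDING

def merge_offline(central: dict[str, Verdict],
                  local: dict[str, Verdict]) -> dict[str, Verdict]:
    """Uploading an offline verdict only updates that maintainer's own
    entry, so it cannot clobber a co-maintainer's reject."""
    return {**central, **local}
```

Because each upload touches only the uploader's own slot, an offline
"accept" and an earlier "reject" coexist instead of conflicting.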

-- Steve

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-11-02 11:52 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-29 15:41 Lyon meeting notes Han-Wen Nienhuys
2019-10-29 22:26 ` Eric Wong
2019-10-29 23:13   ` Bjorn Helgaas
2019-11-01 20:07     ` Konstantin Ryabitsev
2019-11-01 20:46       ` Geert Uytterhoeven
2019-11-01 21:30       ` Theodore Y. Ts'o
2019-11-02  1:17         ` Eric Wong
2019-11-01 21:34     ` Dmitry Vyukov
2019-10-29 22:35 ` Daniel Axtens
2019-11-01 17:29   ` Konstantin Ryabitsev
2019-11-01 17:35     ` Dmitry Vyukov
2019-11-02 11:46     ` Steven Rostedt
2019-10-30  9:21 ` Jonathan Corbet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).