Workflows Archive on lore.kernel.org
* RFC: individual public-inbox/git activity feeds
@ 2019-10-10 19:28 Konstantin Ryabitsev
  2019-10-10 23:57 ` Eric Wong
                   ` (4 more replies)
  0 siblings, 5 replies; 12+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-10 19:28 UTC (permalink / raw)
  To: workflows

Hi, all:

The idea of using public-inbox repositories as individual feeds has been 
mentioned here a couple of times already, and I would like to propose a 
tentative approach that could work without needing to involve SSB or 
other protocols.

# What are public-inbox repos?

Public-inbox (v2) uses git to archive mail messages, with the following 
general structure:

topdir/
  0.git/
  1.git/
  ...

Each of these git repositories has a single ref, master, with a single 
file "m" containing the entire body of the message, e.g.:
  - https://erol.kernel.org/workflows/git/0/tree/m

Each incoming message overwrites this file and creates a new commit, 
e.g.:
  - https://erol.kernel.org/workflows/git/0/log/m

This has the following upsides:

  - with a single file, git commit operations are very fast
  - git performance remains pretty much unaffected as repository grows, 
    since there aren't more and more objects to hash (the main downside 
    of public-inbox v1).
  - it is easy to get the contents of any message by simply performing 
    `git show <commit hash>:m`, which is a very fast operation even for 
    very old messages in the archive
  - most language environments have decent git libraries, so writing 
    tooling around git repositories is easy
  - git is really good at replicating itself, especially with a single 
    ref
  - git supports commit signing, so all commits can have cryptographic 
    attestation if the tools are configured to do that
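
The single-file scheme and the fast retrieval can be reproduced with stock git; this is a toy sketch (the path and message contents are invented):

```shell
# Toy public-inbox-style archive: one ref, one file "m", one commit per
# message. Paths and message contents are invented for illustration.
set -e
rm -rf /tmp/toy-inbox && mkdir /tmp/toy-inbox && cd /tmp/toy-inbox
git init -q -b master
git config user.email toy@example.com && git config user.name toy

printf 'Subject: first message\n\nhello\n' > m
git add m && git commit -q -m 'first message'

printf 'Subject: second message\n\nworld\n' > m   # fully overwrites "m"
git add m && git commit -q -m 'second message'

# Any archived message is a single object lookup away, no checkout needed:
git show 'HEAD~1:m'
```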

There are a few downsides to this, too:

  - git maintenance tools like git-repack don't expect that repository 
    contents are going to be 90%-100% rewritten with every new commit, 
    so by default it will try to perform many rather useless 
    optimizations looking for non-existent deltas (but this can be 
    tweaked in config files)
  - most useful operations require maintaining auxiliary databases, e.g.  
    for message-id to commit-id mapping -- so repositories need to be 
    indexed using public-inbox-index in order to be useful for more than 
    just archival and replication. For huge repositories like LKML, the 
    initial indexing takes a long time, though subsequent 
    public-inbox-index calls after each `git remote update` are pretty 
    quick.
  - there is only rudimentary sharding into epochs, which makes partial 
    replication tricky (e.g. "replicate just the archives from last 
    October")
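
As an illustration of the "tweaked in config files" point -- the particular knobs and values below are my assumptions, not public-inbox's shipped settings -- delta search can be dialed down for a repository whose one file is rewritten on every commit:

```shell
# Illustrative (assumed, not canonical) settings for a repository where
# searching for deltas between message revisions is mostly wasted effort:
set -e
rm -rf /tmp/toy-inbox-cfg && git init -q /tmp/toy-inbox-cfg && cd /tmp/toy-inbox-cfg
git config pack.window 0   # don't search for delta candidates when packing
git config pack.depth 1    # keep any delta chains trivially short
git config gc.auto 0       # run repacks on an explicit schedule instead
git config --get pack.window
```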

# Public-inbox repositories are feeds

Each public-inbox repository is therefore a consecutive feed of messages 
in the same sense something like SSB or NNTP is (for this reason, 
there's robust NNTP support in public-inbox). Public-inbox feeds are:

  - distributed
  - immutable (or tamper-evident once replicated, which is effectively 
    the same as immutable if git is configured to reject non-ff updates)
  - cryptographically attestable, if commit signing is used

# Individual developer feeds

Individual developers can begin providing their own public-inbox feeds.
At the start, they can act as a sort of a "public sent-mail folder" -- a 
simple tool would monitor the local/IMAP "sent" folder and add any new 
mail it finds (sent to specific mailing lists) to the developer's local 
public-inbox instance. Every commit will be automatically signed and 
pushed out to a public remote. 

On the kernel.org side, we can collate these individual feeds and mirror 
them into an aggregated feeds repository, with a ref per individual 
developer, like so:

refs/feeds/gregkh/0/master
refs/feeds/davem/0/master
refs/feeds/davem/1/master
...
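
One plausible way to assemble such an aggregated repository is with a per-developer fetch refspec; the paths, remote name, and epoch below are invented for the sketch:

```shell
set -e
# Stand-in for one developer's public feed remote:
rm -rf /tmp/feed-gregkh /tmp/aggregated
git init -q -b master /tmp/feed-gregkh
cd /tmp/feed-gregkh
git config user.email toy@example.com && git config user.name toy
printf 'Subject: hi\n\nbody\n' > m && git add m && git commit -q -m 'hi'

# The aggregator mirrors it under refs/feeds/<developer>/<epoch>/master:
git init -q /tmp/aggregated && cd /tmp/aggregated
git remote add gregkh /tmp/feed-gregkh
git config remote.gregkh.fetch '+refs/heads/master:refs/feeds/gregkh/0/master'
git fetch -q gregkh
git for-each-ref refs/feeds
```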

Already, this gives us the following perks:

  - cryptographic attestation
  - patches that are guaranteed against mangling by MTA software
  - guaranteed spam-free message delivery from all the important people
  - permanent, attestable and distributable archive

(With time, we can teach kernel.org to act as an MTA bridge that sends 
actual mail to the mailing lists after we receive individual feed 
updates.)

# Using public-inbox with structured data

One of the problems we are trying to solve is how to deliver structured 
data like CI reports, bugs, issues, etc in a decentralized fashion.  
Instead of (or in addition to) sending mail to mailing lists and 
individual developers, bots and bug-tracking tools can provide their own 
feeds with structured data aimed at consumption by client-side and 
server-side tools.

I suggest we use public-inbox feeds with structured data in addition to 
human-readable data, using some universally adopted machine-parseable
format like JSON. In my mind, I see this working as a separate ref in 
each individual feed, e.g.:

refs/heads/master -- RFC-2822 (email) feed for human consumption
refs/heads/json -- json feed for machine-readable structured data
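
A possible sketch of how a bot could maintain the second ref (the record fields and file name are invented; nothing here is prescribed by public-inbox):

```shell
set -e
rm -rf /tmp/bot-feed && git init -q -b master /tmp/bot-feed && cd /tmp/bot-feed
git config user.email bot@example.com && git config user.name bot

# Human-readable message on master:
printf 'Subject: BUG: bad usercopy\n\ndetails for humans\n' > m
git add m && git commit -q -m 'BUG: bad usercopy'

# Parallel structured record on an unrelated, history-less "json" ref:
git switch -q --orphan json
printf '{"type": "bug", "subject": "BUG: bad usercopy"}\n' > r.json
git add r.json && git commit -q -m 'BUG: bad usercopy (structured)'

git show master:m      # human version
git show json:r.json   # machine version
```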

E.g. syzbot could publish a human-readable message in master:

----
From: syzbot
To: [list of addressees here]
Subject: BUG: bad usercopy in read_rio
Date: Wed, 09 Oct 2019 09:09:06 -0700

Hello,

syzbot found the following crash on:

HEAD commit:    58d5f26a usb-fuzzer: main usb gadget fuzzer driver
git tree:       https://github.com/google/kasan.git usb-fuzzer
console output: https://syzkaller.appspot.com/x/log.txt?x=149329b3600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=aa5dac3cda4ffd58
dashboard link: https://syzkaller.appspot.com/bug?extid=43e923a8937c203e9954
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

...
----

The same data, including all the relevant info provided via
syzkaller.appspot.com links would be included in the structured-section 
commit, allowing client-side tools to present it to the developer 
without requiring that they view it on the internet (or simply included 
for archival purposes).

The same approach can be used by bugzilla and any other bug-tracking 
software -- a human-readable commit in master, plus a corresponding 
machine-formatted commit in refs/heads/json. Minor record changes that 
aren't intended for humans can omit the commit in master (to avoid
the usual noise of "so-and-so started following this bug" messages). All 
commits would be cryptographically signed and fully attestable.

All these feeds can be aggregated centrally by entities like kernel.org 
for ease of discovery and replication, though this process would be 
human-administered and not automatic.

# Where this falls short

This is an archival solution first and foremost and not a true 
distributed, decentralized communication fabric. It solves the following 
problems:

  - it gets us cryptographically attestable feeds from important people 
    with little effort on their part (after initial setup)
  - it allows centralized tools (bots, forges, bug trackers, CI) to 
    export internal data so it can be preserved for future reference or 
    consumed directly by client-side tools -- though it obviously 
    requires that vendors jump on this bandwagon and don't simply ignore 
    it
  - it uses existing technologies that are known to work well together
    (public-inbox, git) and doesn't require that we adopt any nascent 
    technologies like SSB that are still in early stages of development 
    and haven't yet had time to mature

What this doesn't fix:

  - we still continue to largely rely on email and mailing lists, though 
    theoretically their use would become less important as more 
    developer feeds are aggregated and maintainer tools start to rely on 
    those as their primary source of truth. We can easily see a future 
    where vger.kernel.org just writes to public-inbox archives and 
    leaves mail delivery and subscription management up to someone else.
  - we still need aggregation authorities like kernel.org -- though we 
    can hedge this by having multiple mirrors and publishing a manifest 
    of feeds that can be pulled individually if needed
  - this doesn't really get us builtin encrypted communication between 
    developers, though we can think of some clever solutions, such as
    keypairs per incident that are initially only distributed to members 
    of security@kernel.org and then disclosed publicly after embargo is 
    lifted, allowing anyone interested to go back and read the encrypted 
    discussion for the purpose of full transparency.

The main upside of this approach is that it's evolutionary and not 
revolutionary and we can start implementing it right away, using it to 
augment and improve mailing lists instead of replacing them outright.

-K

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
@ 2019-10-10 23:57 ` Eric Wong
  2019-10-18  2:48   ` Eric Wong
  2019-10-11 17:15 ` Dmitry Vyukov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 12+ messages in thread
From: Eric Wong @ 2019-10-10 23:57 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: workflows

Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:

<snip a bunch of stuff I agree with>

> # Individual developer feeds

<snip>

> (With time, we can teach kernel.org to act as an MTA bridge that sends
> actual mail to the mailing lists after we receive individual feed updates.)

I'm skeptical and pessimistic about that bit happening (as I
usually am :>).  But the great thing is all that stuff can
happen without disrupting/changing existing workflows and is
totally optional.

> # Using public-inbox with structured data
> 
> One of the problems we are trying to solve is how to deliver structured data
> like CI reports, bugs, issues, etc in a decentralized fashion.  Instead of
> (or in addition to) sending mail to mailing lists and individual developers,
> bots and bug-tracking tools can provide their own feeds with structured data
> aimed at consumption by client-side and server-side tools.
> 
> I suggest we use public-inbox feeds with structured data in addition to
> human-readable data, using some universally adopted machine-parseable
> format like JSON. In my mind, I see this working as a separate ref in each
> individual feed, e.g.:
> 
> refs/heads/master -- RFC-2822 (email) feed for human consumption
> refs/heads/json -- json feed for machine-readable structured data

Having a side-channel in addition to email makes people learn and
use new tools (not good).  Furthermore, that data would likely end up
in commit messages and have to be translated from JSON...

Instead, the structured data should be RFC822-like so
"git interpret-trailers" can write it.  It'd probably be
similar to Debbugs:

  https://lore.kernel.org/workflows/20191008213626.GB8130@dcvr/
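
git-interpret-trailers can already both write and parse such trailers; a quick illustration ("Dashboard-Link" is an invented key, not anything syzbot emits today):

```shell
# Append a machine-readable trailer to a plain-text message, then parse
# the trailer block back out.
printf 'fix the thing\n\nLonger human-readable description.\n' > /tmp/msg.txt
git interpret-trailers --trailer 'Dashboard-Link: https://example.com/bug/1' /tmp/msg.txt

# --parse (shorthand for --only-trailers --only-input --unfold) extracts
# just the trailers from an existing message:
git interpret-trailers --trailer 'Dashboard-Link: https://example.com/bug/1' /tmp/msg.txt |
  git interpret-trailers --parse
```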

> E.g. syzbot could publish a human-readable message in master:
> 
> ----
> From: syzbot
> To: [list of addressees here]
> Subject: BUG: bad usercopy in read_rio
> Date: Wed, 09 Oct 2019 09:09:06 -0700
> 
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    58d5f26a usb-fuzzer: main usb gadget fuzzer driver
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> console output: https://syzkaller.appspot.com/x/log.txt?x=149329b3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=aa5dac3cda4ffd58
> dashboard link: https://syzkaller.appspot.com/bug?extid=43e923a8937c203e9954
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)

That's already close enough to git trailers (s/ /-/).
> ...
> ----
> 
> The same data, including all the relevant info provided via
> syzkaller.appspot.com links would be included in the structured-section
> commit, allowing client-side tools to present it to the developer without
> requiring that they view it on the internet (or simply included for archival
> purposes).

That seems redundant given the above.

> The same approach can be used by bugzilla and any other bug-tracking
> software -- a human-readable commit in master, plus a corresponding
> machine-formatted commit in refs/heads/json. Minor record changes that
> aren't intended for humans can omit the commit in master (to avoid
> the usual noise of "so-and-so started following this bug" messages). All
> commits would be cryptographically signed and fully attestable.

If those bug trackers can already interpret stuff like "Fixes:"
in the kernel commit messages, making them deal with JSON or
another channel is too much.  If they can't deal with "Fixes:",
then there's no expectation they'd deal with a new JSON
thing, either.

"so-and-so following messages" don't need to be public info.

> All these feeds can be aggregated centrally by entities like kernel.org for
> ease of discovery and replication, though this process would be
> human-administered and not automatic.
> 
> # Where this falls short
> 
> This is an archival solution first and foremost and not a true distributed,
> decentralized communication fabric. It solves the following problems:
> 
>   - it gets us cryptographically attestable feeds from important people
>     with little effort on their part (after initial setup)
>   - it allows centralized tools (bots, forges, bug trackers, CI) to
>     export internal data so it can be preserved for future reference or
>     consumed directly by client-side tools -- though it obviously
>     requires that vendors jump on this bandwagon and don't simply ignore
>     it
>   - it uses existing technologies that are known to work well together
>     (public-inbox, git) and doesn't require that we adopt any nascent
>     technologies like SSB that are still in early stages of development
>     and haven't yet had time to mature

Even the JSON feed is too much to ask people to adopt.

<snip>

> The main upside of this approach is that it's evolutionary and not
> revolutionary and we can start implementing it right away, using it to
> augment and improve mailing lists instead of replacing them outright.

That.  We should take this one small step at a time and see
where things take us.  The key is to remain harmonious with
existing workflows and be transparent to people who won't
change.

Same thing worked for git-svn obsoleting Subversion.  I just
don't want to end up with a proprietary/centralized InboxHub
this time around :P


* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
  2019-10-10 23:57 ` Eric Wong
@ 2019-10-11 17:15 ` Dmitry Vyukov
  2019-10-11 19:07   ` Geert Uytterhoeven
  2019-10-11 19:39   ` Konstantin Ryabitsev
  2019-10-11 22:57 ` Daniel Borkmann
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 12+ messages in thread
From: Dmitry Vyukov @ 2019-10-11 17:15 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: workflows

On Thu, Oct 10, 2019 at 9:29 PM Konstantin Ryabitsev
<konstantin@linuxfoundation.org> wrote:
>
> Hi, all:
>
> The idea of using public-inbox repositories as individual feeds has been
> mentioned here a couple of times already, and I would like to propose a
> tentative approach that could work without needing to involve SSB or
> other protocols.
>
> # What are public-inbox repos?
>
> Public-inbox (v2) uses git to archive mail messages, with the following
> general structure:
>
> topdir/
>   0.git/
>   1.git/
>   ...
>
> Each of these git repositories has a single ref, master, with a single
> file "m" containing the entire body of the message, e.g.:
>   - https://erol.kernel.org/workflows/git/0/tree/m
>
> Each incoming message overwrites this file and creates a new commit,
> e.g.:
>   - https://erol.kernel.org/workflows/git/0/log/m
>
> This has the following upsides:
>
>   - with a single file, git commit operations are very fast
>   - git performance remains pretty much unaffected as repository grows,
>     since there aren't more and more objects to hash (the main downside
>     of public-inbox v1).
>   - it is easy to get the contents of any message by simply performing
>     `git show <commit hash>:m`, which is a very fast operation even for
>     very old messages in the archive
>   - most language environments have decent git libraries, so writing
>     tooling around git repositories is easy
>   - git is really good at replicating itself, especially with a single
>     ref
>   - git supports commit signing, so all commits can have cryptographic
>     attestation if the tools are configured to do that
>
> There are a few downsides to this, too:
>
>   - git maintenance tools like git-repack don't expect that repository
>     contents are going to be 90%-100% rewritten with every new commit,
>     so by default it will try to perform many rather useless
>     optimizations looking for non-existent deltas (but this can be
>     tweaked in config files)
>   - most useful operations require maintaining auxiliary databases, e.g.
>     for message-id to commit-id mapping -- so repositories need to be
>     indexed using public-inbox-index in order to be useful for more than
>     just archival and replication. For huge repositories like LKML, the
>     initial indexing takes a long time, though subsequent
>     public-inbox-index calls after each `git remote update` are pretty
>     quick.
>   - there is only rudimentary sharding into epochs, which makes partial
>     replication tricky (e.g. "replicate just the archives from last
>     October")
>
> # Public-inbox repositories are feeds
>
> Each public-inbox repository is therefore a consecutive feed of messages
> in the same sense something like SSB or NNTP is (for this reason,
> there's robust NNTP support in public-inbox). Public-inbox feeds are:
>
>   - distributed
>   - immutable (or tamper-evident once replicated, which is effectively
>     the same as immutable if git is configured to reject non-ff updates)
>   - cryptographically attestable, if commit signing is used
>
> # Individual developer feeds
>
> Individual developers can begin providing their own public-inbox feeds.
> At the start, they can act as a sort of a "public sent-mail folder" -- a
> simple tool would monitor the local/IMAP "sent" folder and add any new
> mail it finds (sent to specific mailing lists) to the developer's local
> public-inbox instance. Every commit will be automatically signed and
> pushed out to a public remote.
>
> On the kernel.org side, we can collate these individual feeds and mirror
> them into an aggregated feeds repository, with a ref per individual
> developer, like so:
>
> refs/feeds/gregkh/0/master
> refs/feeds/davem/0/master
> refs/feeds/davem/1/master
> ...
>
> Already, this gives us the following perks:
>
>   - cryptographic attestation
>   - patches that are guaranteed against mangling by MTA software
>   - guaranteed spam-free message delivery from all the important people
>   - permanent, attestable and distributable archive
>
> (With time, we can teach kernel.org to act as an MTA bridge that sends
> actual mail to the mailing lists after we receive individual feed
> updates.)
>
> # Using public-inbox with structured data
>
> One of the problems we are trying to solve is how to deliver structured
> data like CI reports, bugs, issues, etc in a decentralized fashion.
> Instead of (or in addition to) sending mail to mailing lists and
> individual developers, bots and bug-tracking tools can provide their own
> feeds with structured data aimed at consumption by client-side and
> server-side tools.
>
> I suggest we use public-inbox feeds with structured data in addition to
> human-readable data, using some universally adopted machine-parseable
> format like JSON. In my mind, I see this working as a separate ref in
> each individual feed, e.g.:
>
> refs/heads/master -- RFC-2822 (email) feed for human consumption
> refs/heads/json -- json feed for machine-readable structured data
>
> E.g. syzbot could publish a human-readable message in master:
>
> ----
> From: syzbot
> To: [list of addressees here]
> Subject: BUG: bad usercopy in read_rio
> Date: Wed, 09 Oct 2019 09:09:06 -0700
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    58d5f26a usb-fuzzer: main usb gadget fuzzer driver
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> console output: https://syzkaller.appspot.com/x/log.txt?x=149329b3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=aa5dac3cda4ffd58
> dashboard link: https://syzkaller.appspot.com/bug?extid=43e923a8937c203e9954
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
>
> ...
> ----
>
> The same data, including all the relevant info provided via
> syzkaller.appspot.com links would be included in the structured-section
> commit, allowing client-side tools to present it to the developer
> without requiring that they view it on the internet (or simply included
> for archival purposes).
>
> The same approach can be used by bugzilla and any other bug-tracking
> software -- a human-readable commit in master, plus a corresponding
> machine-formatted commit in refs/heads/json. Minor record changes that
> aren't intended for humans can omit the commit in master (to avoid
> the usual noise of "so-and-so started following this bug" messages). All
> commits would be cryptographically signed and fully attestable.
>
> All these feeds can be aggregated centrally by entities like kernel.org
> for ease of discovery and replication, though this process would be
> human-administered and not automatic.
>
> # Where this falls short
>
> This is an archival solution first and foremost and not a true
> distributed, decentralized communication fabric. It solves the following
> problems:
>
>   - it gets us cryptographically attestable feeds from important people
>     with little effort on their part (after initial setup)
>   - it allows centralized tools (bots, forges, bug trackers, CI) to
>     export internal data so it can be preserved for future reference or
>     consumed directly by client-side tools -- though it obviously
>     requires that vendors jump on this bandwagon and don't simply ignore
>     it
>   - it uses existing technologies that are known to work well together
>     (public-inbox, git) and doesn't require that we adopt any nascent
>     technologies like SSB that are still in early stages of development
>     and haven't yet had time to mature
>
> What this doesn't fix:
>
>   - we still continue to largely rely on email and mailing lists, though
>     theoretically their use would become less important as more
>     developer feeds are aggregated and maintainer tools start to rely on
>     those as their primary source of truth. We can easily see a future
>     where vger.kernel.org just writes to public-inbox archives and
>     leaves mail delivery and subscription management up to someone else.
>   - we still need aggregation authorities like kernel.org -- though we
>     can hedge this by having multiple mirrors and publishing a manifest
>     of feeds that can be pulled individually if needed
>   - this doesn't really get us builtin encrypted communication between
>     developers, though we can think of some clever solutions, such as
>     keypairs per incident that are initially only distributed to members
>     of security@kernel.org and then disclosed publicly after embargo is
>     lifted, allowing anyone interested to go back and read the encrypted
>     discussion for the purpose of full transparency.
>
> The main upside of this approach is that it's evolutionary and not
> revolutionary and we can start implementing it right away, using it to
> augment and improve mailing lists instead of replacing them outright.


Interesting. This is similar to SSB on _some_ level, right? Because
it's just a different type of transport. I personally don't have any
horses in the transport race (as long as it is easy to setup and
provides a good foundation for transferring structured data).

What attracted my attention is this part:

refs/feeds/gregkh/0/master
refs/feeds/davem/0/master
refs/feeds/davem/1/master

Will this provide a total ordering over all messages by all
participants? That may be a significant advantage over SSB then (see
point 14 in [1]). But the "that can be pulled individually" part
breaks this (complete read-only mirrors for fault-tolerance are fine,
though).

This may also need some form of DoS protection (esp as we move further
from email).

I also tend to conclude that some actions should not be done offline
and then "synced" a week later. Ted provided an example of starting
tests in another thread. Or, say, if you close a bug and then push that
update a month later without any regard to the current bug state, that
may not be the right thing. Working with read-only data offline is
perfectly fine. Doing _some_ updates locally and then pushing a week
later is fine (e.g. queueing a new patch for review). But not
necessarily all updates should be doable in offline mode. And this
seems to be in inherent conflict with any scheme where one can "queue"
any updates locally and then "sync" them anytime later without any
regard to the current state of things, just telling the system and all
other participants to "deal with it". Also, if we have any kind of
permissions/quotas, when are these checks done: when one creates an
update or when it's synced?

This is interesting too:

refs/heads/master -- RFC-2822 (email) feed for human consumption
refs/heads/json -- json feed for machine-readable structured data

Playing devil's advocate, what about MIME? :)
It does not need to be completely arbitrary MIME, but, say, only two
alternative sections: the first has to be text/plain, the second
(optional) has to be kthul/json. Say, "kthul mail" creates that
properly formed email with plain text and all structured data. Or, a
CI system creates both the human-readable and the machine-readable
form. It seems reasonable to keep both versions together.
Though it's not that I've thought it all out and am strongly
advocating this. Just a potentially interesting option.

[1] https://lore.kernel.org/workflows/CACT4Y+YU78dQUeFob7NXaOU-gjnKHtxpceQj2c4=2aBV0_PSxg@mail.gmail.com/T/#t


* Re: RFC: individual public-inbox/git activity feeds
  2019-10-11 17:15 ` Dmitry Vyukov
@ 2019-10-11 19:07   ` Geert Uytterhoeven
  2019-10-11 19:12     ` Laurent Pinchart
  2019-10-14  6:43     ` Dmitry Vyukov
  2019-10-11 19:39   ` Konstantin Ryabitsev
  1 sibling, 2 replies; 12+ messages in thread
From: Geert Uytterhoeven @ 2019-10-11 19:07 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: Konstantin Ryabitsev, workflows

Hi Dmitry,

On Fri, Oct 11, 2019 at 7:15 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> I also tend to conclude that some actions should not be done offline
> and then "synced" a week later. Ted provided an example of starting
> tests in another thread. Or, say if you close a bug and then push than
> update a month later without any regard to the current bug state, that
> may not be the right thing. Working with read-only data offline is
> perfectly fine. Doing _some_ updates locally and then pushing a week
> later is fine (e.g. queue a new patch for review). But not necessary
> all updates should be doable in offline mode. And this seems to be
> inherent conflict with any scheme where one can "queue" any updates
> locally, and then "sync" them anytime later without any regard to the
> current state of things and just tell the system and all other
> participants "deal with it". Also, if we have any kind of
> permissions/quotas, when are these checks done: when one creates an
> update or when it's synced?

Not unlike "git push" accepting fast-forwards only, and rejecting
forced updates.
Hence you cannot push the close of a bug (each bug has its own
branch?) before merging the updated remote state first.
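
Geert's point can be demonstrated with throwaway repos and git's default push behavior (no special server configuration; names invented):

```shell
set -e
rm -rf /tmp/bugs.git /tmp/alice /tmp/bob
git init -q --bare -b master /tmp/bugs.git
git clone -q /tmp/bugs.git /tmp/alice 2>/dev/null
git clone -q /tmp/bugs.git /tmp/bob 2>/dev/null

# Alice records the current bug state and pushes first:
cd /tmp/alice
git config user.email a@example.com && git config user.name alice
echo 'Status: open' > bug && git add bug && git commit -q -m 'open bug'
git push -q origin master

# Bob, offline and unaware of Alice's update, "closes" the bug and syncs:
cd /tmp/bob
git config user.email b@example.com && git config user.name bob
echo 'Status: closed' > bug && git add bug && git commit -q -m 'close bug'
if git push -q origin master 2>/dev/null; then
  echo accepted
else
  echo 'rejected: fetch and merge the current state first'
fi
```

Bob's push is a non-fast-forward update, so git refuses it until he integrates Alice's state.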

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: RFC: individual public-inbox/git activity feeds
  2019-10-11 19:07   ` Geert Uytterhoeven
@ 2019-10-11 19:12     ` Laurent Pinchart
  2019-10-14  6:43     ` Dmitry Vyukov
  1 sibling, 0 replies; 12+ messages in thread
From: Laurent Pinchart @ 2019-10-11 19:12 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Dmitry Vyukov, Konstantin Ryabitsev, workflows

On Fri, Oct 11, 2019 at 09:07:20PM +0200, Geert Uytterhoeven wrote:
> Hi Dmitry,
> 
> On Fri, Oct 11, 2019 at 7:15 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > I also tend to conclude that some actions should not be done offline
> > and then "synced" a week later. Ted provided an example of starting
> > tests in another thread. Or, say if you close a bug and then push than
> > update a month later without any regard to the current bug state, that
> > may not be the right thing. Working with read-only data offline is
> > perfectly fine. Doing _some_ updates locally and then pushing a week
> > later is fine (e.g. queue a new patch for review). But not necessary
> > all updates should be doable in offline mode. And this seems to be
> > inherent conflict with any scheme where one can "queue" any updates
> > locally, and then "sync" them anytime later without any regard to the
> > current state of things and just tell the system and all other
> > participants "deal with it". Also, if we have any kind of
> > permissions/quotas, when are these checks done: when one creates an
> > update or when it's synced?
> 
> Not unlike "git push" accepting fast-forwards only, and rejecting
> forced updates.
> Hence you cannot push the close of a bug (each bug has its own
> branch?) before merging the updated remote state first.

That might work in small projects, but at a bigger scale you soon start
hitting races to get to the build bot before everybody else, and the CI
system gets trashed with cycles of lost races, rebase and retry. It's
not something we could enforce globally.

-- 
Regards,

Laurent Pinchart


* Re: RFC: individual public-inbox/git activity feeds
  2019-10-11 17:15 ` Dmitry Vyukov
  2019-10-11 19:07   ` Geert Uytterhoeven
@ 2019-10-11 19:39   ` Konstantin Ryabitsev
  2019-10-12 11:48     ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 12+ messages in thread
From: Konstantin Ryabitsev @ 2019-10-11 19:39 UTC (permalink / raw)
  To: Dmitry Vyukov; +Cc: workflows

On Fri, Oct 11, 2019 at 07:15:12PM +0200, Dmitry Vyukov wrote:
>> The main upside of this approach is that it's evolutionary and not
>> revolutionary and we can start implementing it right away, using it to
>> augment and improve mailing lists instead of replacing them outright.
>
>
>Interesting. This is similar to SSB on _some_ level, right? Because
>it's just a different type of transport. I personally don't have any
>horses in the transport race (as long as it is easy to setup and
>provides a good foundation for transferring structured data).

It's similar only in the sense that it's a chain of records that can be 
optionally cryptographically signed. Some of the problems that SSB (and 
especially v2) tries to solve are not anything git concerns itself 
about, such as discovery, feed cross-reference, verifiable partial 
clones, etc.

>What attracted my attention is this part:
>
>refs/feeds/gregkh/0/master
>refs/feeds/davem/0/master
>refs/feeds/davem/1/master
>
>Will this provide a total ordering over all messages by all
>participants? That may be a significant advantage over SSB then (see
>point 14 in [1]). But the "that can be pulled individually" part
>breaks this (complete read-only mirrors for fault-tolerance are fine,
>though).

No, these refs are entirely independent of each other. In a sense, it's 
the equivalent of cloning individual public-inbox repos together and 
then tar'ing them up. For ordering, we still have to go with commit 
timestamps and we'll still have conflicting resolutions, just as you 
mention (though this isn't any different from email).
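The "cloning individual public-inbox repos together" equivalent can be sketched with stock git: fetch each developer's feed under its own refs/feeds/ namespace and let git log interleave the refs by commit timestamp. Everything below (developer names, layout, throwaway local repos standing in for real remotes) is illustrative, not the actual kernel.org setup.

```shell
# Illustrative only: build two throwaway "developer feeds", aggregate
# them under refs/feeds/*, and list messages ordered by commit time.
set -e
top=$(mktemp -d)
for dev in alice bob; do
    git init -q "$top/$dev"
    git -C "$top/$dev" -c "user.email=$dev@example.org" -c "user.name=$dev" \
        commit -q --allow-empty -m "mail from $dev"
done
git init -q "$top/agg"
cd "$top/agg"
# One ref namespace per developer, mirroring the refs/feeds/ layout:
git fetch -q "$top/alice" "refs/heads/*:refs/feeds/alice/0/*"
git fetch -q "$top/bob"   "refs/heads/*:refs/feeds/bob/0/*"
# git log walks all the fetched feed refs together, newest commit
# first -- the commit-timestamp ordering, with all its caveats.
git log --glob='refs/feeds/*' --format='%ct %s'
```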

>This may also need some form of DoS protection (esp as we move further
>from email).

Well, amusingly, there are ways of distributing git via decentralized 
protocols (SSB, DAT, IPFS). They are all fairly immature, though, and 
some of them are truly terrible ideas.

For the moment, our best protection against DoS attacks on git repos is 
having many frontends, some powerful allies (e.g. see 
kernel.googlesource.com), and DoS-avoidance by obscurity ("I can't push 
to kernel.org right now, but you can pull my repo from my personal 
server over here").

>I also tend to conclude that some actions should not be done offline
>and then "synced" a week later. Ted provided an example of starting
>tests in another thread. Or, say if you close a bug and then push that
>update a month later without any regard to the current bug state, that
>may not be the right thing.

The same is true of email, though -- people queueing up mail in their 
outbox and losing connectivity before they can send it out is something 
that happens often. True, we aren't solving this, but it's not a 
net-new problem, and it will always be a hard one to solve in laggy 
decentralized environments.

>Working with read-only data offline is
>perfectly fine. Doing _some_ updates locally and then pushing a week
>later is fine (e.g. queue a new patch for review). But not necessarily
>all updates should be doable in offline mode. And this seems to be an
>inherent conflict with any scheme where one can "queue" any updates
>locally, and then "sync" them anytime later without any regard to the
>current state of things and just tell the system and all other
>participants "deal with it".

Well, in all honesty, "queueing things up for a week" is going to be an 
increasingly rare problem for anyone who works on the Linux kernel. I 
don't know about others, but I can recall every time I've actually been 
offline in the past year, and each case involved a transatlantic flight 
with totally broken wi-fi or a trip to a rare spot on the map without 
cell towers. Even long power outages simply mean I have to tether my 
laptop via my phone. Thanks to wireguard, I don't even lose ssh 
sessions when that happens. :)

Replicating a feed out is a very quick task that can be made quicker 
with tricks like ssh controlmaster connections that keep sessions going.
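For reference, the controlmaster trick is plain OpenSSH client configuration; a minimal sketch (the host name is illustrative):

```
Host git.kernel.org
    ControlMaster auto
    ControlPath ~/.ssh/cm-%r@%h-%p
    ControlPersist 10m
```

With ControlPersist, the first connection stays open in the background, so subsequent pushes reuse it instead of renegotiating a fresh session each time.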

>This is interesting too:
>
>refs/heads/master -- RFC-2822 (email) feed for human consumption
>refs/heads/json -- json feed for machine-readable structured data
>
>Playing devil's advocate, what about MIME? :)
>It does not need to be completely arbitrary MIME, but say only 2
>alternative sections; the first has to be text/plain, the second
>(optional) has to be kthul/json.

The main reason I wanted two different refs is so that entities like 
bots could pull only the json ref and ignore the one aimed at humans. 
So, while the data duplication makes the repository larger, it should 
make pulling and parsing less problematic for bots, and I expect bots 
to generate the most frequent hits and traffic.
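The pull-only-the-json-ref idea can be sketched with git's single-branch cloning: the bot's mirror never downloads the human-readable history at all. The repository below is a throwaway stand-in for a real feed, and the two refs share one commit purely for brevity.

```shell
# Illustrative only: a feed repo with a human "master" ref and a
# machine "json" ref; the bot mirrors just the json side.
set -e
top=$(mktemp -d)
git init -q "$top/feed"
git -C "$top/feed" -c user.email=f@example.org -c user.name=feed \
    commit -q --allow-empty -m "human-readable message"
git -C "$top/feed" branch -q json   # structured-data ref (same commit here)
git clone -q --single-branch --branch json "$top/feed" "$top/botcopy"
# The bot's copy carries only the json ref:
git -C "$top/botcopy" for-each-ref --format='%(refname)' refs/heads
```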

-K

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
  2019-10-10 23:57 ` Eric Wong
  2019-10-11 17:15 ` Dmitry Vyukov
@ 2019-10-11 22:57 ` Daniel Borkmann
  2019-10-12  7:50 ` Greg KH
  2019-10-12 11:20 ` Mauro Carvalho Chehab
  4 siblings, 0 replies; 12+ messages in thread
From: Daniel Borkmann @ 2019-10-11 22:57 UTC (permalink / raw)
  To: Konstantin Ryabitsev, workflows

On 10/10/19 9:28 PM, Konstantin Ryabitsev wrote:
[...]
> # Individual developer feeds
> 
> Individual developers can begin providing their own public-inbox feeds.
> At the start, they can act as a sort of a "public sent-mail folder" --
> a simple tool would monitor the local/IMAP "sent" folder and add any
> new mail it finds (sent to specific mailing lists) to the developer's
> local public-inbox instance. Every commit will be automatically signed
> and pushed out to a public remote.
> On the kernel.org side, we can collate these individual feeds and
> mirror them into an aggregated feeds repository, with a ref per
> individual developer, like so:
> 
> refs/feeds/gregkh/0/master
> refs/feeds/davem/0/master
> refs/feeds/davem/1/master
> ...
> 
> Already, this gives us the following perks:
> 
>   - cryptographic attestation
>   - patches that are guaranteed against mangling by MTA software
>   - guaranteed spam-free message delivery from all the important people
>   - permanent, attestable and distributable archive
> 
> (With time, we can teach kernel.org to act as an MTA bridge that sends actual mail to the mailing lists after we receive individual feed updates.)
[...]

[...]
>   - we still continue to largely rely on email and mailing lists,
>     though theoretically their use would become less important as
>     more developer feeds are aggregated and maintainer tools start to
>     rely on those as their primary source of truth. We can easily see
>     a future where vger.kernel.org just writes to public-inbox
>     archives and leaves mail delivery and subscription management up
>     to someone else.

[...]
> The main upside of this approach is that it's evolutionary and not revolutionary and we can start implementing it right away, using it to augment and improve mailing lists instead of replacing them outright.

I do like these aspects, and the receive side (git to mail client
integration) is already done, so the one missing piece is a sendmail
drop-in replacement acting as the public git sent-mail folder. It
doesn't have to be on kernel.org -- developers could also push to
github or elsewhere with such a tool. "Subscribing" to a mailing list
for sending would then need kernel.org infra that adds the repo to a
list of repos to pull from and extracts <commit hash>:m from that
developer's repo, from the point where it was last read up to git HEAD
(rejecting any forced pushes, and doing sanity checks on m). Each m
would then be committed conflict-free to the official public-inbox
repositories of the lists in m's Cc, and potentially sent from
kernel.org via an MTA bridge to old-style mail receivers. The nice
thing is that this would allow transparent testing/roll-out alongside
today's development workflow. It might be one component/(sub-)tool of
the bigger picture of having email slowly fade out (and new,
non-mail-based tools could be built around it, too).
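The extraction step Daniel describes can be sketched with plain git plumbing; the repo, the "lastread" marker, and the message contents below are all invented for illustration.

```shell
# Illustrative only: a fake feed repo with two messages; extract every
# "m" committed after the last-read point (marked by an invented tag).
set -e
tmp=$(mktemp -d)
git init -q "$tmp/feed"
cd "$tmp/feed"
git config user.email dev@example.org
git config user.name Dev
printf 'Subject: one\n\nfirst body\n' > m
git add m && git commit -qm one
git tag lastread                   # where kernel.org last read this feed
printf 'Subject: two\n\nsecond body\n' > m
git add m && git commit -qm two
# Commits after "lastread", oldest first; each commit's "m" is the full
# RFC-2822 message to re-inject into the list archives:
for c in $(git rev-list --reverse lastread..HEAD); do
    git show "$c:m"
done
```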

Thanks,
Daniel

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
                   ` (2 preceding siblings ...)
  2019-10-11 22:57 ` Daniel Borkmann
@ 2019-10-12  7:50 ` Greg KH
  2019-10-12 11:20 ` Mauro Carvalho Chehab
  4 siblings, 0 replies; 12+ messages in thread
From: Greg KH @ 2019-10-12  7:50 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: workflows

On Thu, Oct 10, 2019 at 03:28:52PM -0400, Konstantin Ryabitsev wrote:
> # Individual developer feeds
> 
> Individual developers can begin providing their own public-inbox feeds.
> At the start, they can act as a sort of a "public sent-mail folder" -- a
> simple tool would monitor the local/IMAP "sent" folder and add any new mail
> it finds (sent to specific mailing lists) to the developer's local
> public-inbox instance. Every commit will be automatically signed and pushed
> out to a public remote.
> 
> On the kernel.org side, we can collate these individual feeds and mirror
> them into an aggregated feeds repository, with a ref per individual
> developer, like so:
> 
> refs/feeds/gregkh/0/master

The stuff I send out is probably not all that interesting compared to
what is sent to me, given that I receive way more than I send.

> refs/feeds/davem/0/master
> refs/feeds/davem/1/master
> ...
> 
> Already, this gives us the following perks:
> 
>  - cryptographic attestation
>  - patches that are guaranteed against mangling by MTA software
>  - guaranteed spam-free message delivery from all the important people
>  - permanent, attestable and distributable archive
> 
> (With time, we can teach kernel.org to act as an MTA bridge that sends
> actual mail to the mailing lists after we receive individual feed updates.)

This would work well for developers that are "large producers" but that
doesn't help maintainers much, right?

I think I'm missing something, but what would a "feed that only comes
from gregkh" help out with?  Who wants to consume that?

> # Using public-inbox with structured data
> 
> One of the problems we are trying to solve is how to deliver structured data
> like CI reports, bugs, issues, etc in a decentralized fashion.  Instead of
> (or in addition to) sending mail to mailing lists and individual developers,
> bots and bug-tracking tools can provide their own feeds with structured data
> aimed at consumption by client-side and server-side tools.
> 
> I suggest we use public-inbox feeds with structured data in addition to
> human-readable data, using some universally adopted machine-parseable
> format like JSON. In my mind, I see this working as a separate ref in each
> individual feed, e.g.:
> 
> refs/heads/master -- RFC-2822 (email) feed for human consumption
> refs/heads/json -- json feed for machine-readable structured data
> 
> E.g. syzbot could publish a human-readable message in master:
> 
> ----
> From: syzbot
> To: [list of addressees here]
> Subject: BUG: bad usercopy in read_rio
> Date: Wed, 09 Oct 2019 09:09:06 -0700
> 
> Hello,
> 
> syzbot found the following crash on:
> 
> HEAD commit:    58d5f26a usb-fuzzer: main usb gadget fuzzer driver
> git tree:       https://github.com/google/kasan.git usb-fuzzer
> console output: https://syzkaller.appspot.com/x/log.txt?x=149329b3600000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=aa5dac3cda4ffd58
> dashboard link: https://syzkaller.appspot.com/bug?extid=43e923a8937c203e9954
> compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> 
> ...
> ----
> 
> The same data, including all the relevant info provided via
> syzkaller.appspot.com links would be included in the structured-section
> commit, allowing client-side tools to present it to the developer without
> requiring that they view it on the internet (or simply included for archival
> purposes).
> 
> The same approach can be used by bugzilla and any other bug-tracking
> software -- a human-readable commit in master, plus a corresponding
> machine-formatted commit in refs/heads/json. Minor record changes that
> aren't intended for humans can omit the commit in master (to avoid
> the usual noise of "so-and-so started following this bug" messages). All
> commits would be cryptographically signed and fully attestable.
> 
> All these feeds can be aggregated centrally by entities like kernel.org for
> ease of discovery and replication, though this process would be
> human-administered and not automatic.
> 
> # Where this falls short
> 
> This is an archival solution first and foremost and not a true distributed,
> decentralized communication fabric. It solves the following problems:
> 
>  - it gets us cryptographically attestable feeds from important
>    people with little effort on their part (after initial setup)
>  - it allows centralized tools (bots, forges, bug trackers, CI) to
>    export internal data so it can be preserved for future reference
>    or consumed directly by client-side tools -- though it obviously
>    requires that vendors jump on this bandwagon and don't simply
>    ignore it
>  - it uses existing technologies that are known to work well together
>    (public-inbox, git) and doesn't require that we adopt any nascent
>    technologies like SSB that are still in early stages of
>    development and haven't yet had time to mature
> 
> What this doesn't fix:
> 
>  - we still continue to largely rely on email and mailing lists,
>    though theoretically their use would become less important as more
>    developer feeds are aggregated and maintainer tools start to rely
>    on those as their primary source of truth. We can easily see a
>    future where vger.kernel.org just writes to public-inbox archives
>    and leaves mail delivery and subscription management up to someone
>    else.

That last one would make the vger.kernel.org admins happy :)

>  - we still need aggregation authorities like kernel.org -- though we
>    can hedge this by having multiple mirrors and publishing a
>    manifest of feeds that can be pulled individually if needed
>  - this doesn't really get us builtin encrypted communication between
>    developers, though we can think of some clever solutions, such as
>    keypairs per incident that are initially only distributed to
>    members of security@kernel.org and then disclosed publicly after
>    the embargo is lifted, allowing anyone interested to go back and
>    read the encrypted discussion for the purpose of full
>    transparency.

We have tools for that with Thomas's encrypted email server, don't know
if you want to roll that into this type of system or not.

> The main upside of this approach is that it's evolutionary and not
> revolutionary and we can start implementing it right away, using it to
> augment and improve mailing lists instead of replacing them outright.

evolution is good.  I think the slow migration of more people to using
public-inbox archives instead of directly subscribing to mailing lists
might help out a lot.  Already it seems that lore.kernel.org is updated
faster than my email server sees new messages :)

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
                   ` (3 preceding siblings ...)
  2019-10-12  7:50 ` Greg KH
@ 2019-10-12 11:20 ` Mauro Carvalho Chehab
  4 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2019-10-12 11:20 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: workflows

Em Thu, 10 Oct 2019 15:28:52 -0400
Konstantin Ryabitsev <konstantin@linuxfoundation.org> escreveu:

> # Using public-inbox with structured data
> 
> One of the problems we are trying to solve is how to deliver structured 
> data like CI reports, bugs, issues, etc in a decentralized fashion.  
> Instead of (or in addition to) sending mail to mailing lists and 
> individual developers, bots and bug-tracking tools can provide their own 
> feeds with structured data aimed at consumption by client-side and 
> server-side tools.
> 
> I suggest we use public-inbox feeds with structured data in addition to 
> human-readable data, using some universally adopted machine-parseable
> format like JSON. In my mind, I see this working as a separate ref in 
> each individual feed, e.g.:
> 
> refs/heads/master -- RFC-2822 (email) feed for human consumption
> refs/heads/json -- json feed for machine-readable structured data

That sounds scary. I mean, instead of looking at one inbox, we'd now
need to look at two that may carry the same message (one in RFC-2822
and the other in JSON). Worse, the contents of the human-readable one
could differ from the contents of the JSON one.

IMO, the best approach is to have just one format (whatever it is) and
some tool that converts from it into JSON and/or RFC-2822.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-11 19:39   ` Konstantin Ryabitsev
@ 2019-10-12 11:48     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 12+ messages in thread
From: Mauro Carvalho Chehab @ 2019-10-12 11:48 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: Dmitry Vyukov, workflows

Em Fri, 11 Oct 2019 15:39:00 -0400
Konstantin Ryabitsev <konstantin@linuxfoundation.org> escreveu:

> On Fri, Oct 11, 2019 at 07:15:12PM +0200, Dmitry Vyukov wrote:

> >This may also need some form of DoS protection (esp as we move further
> >from email).  
> 
> Well, amusingly, there are ways of distributing git via decentralized 
> protocols (SSB, DAT, IPFS). They are all fairly immature, though, and 
> some of them are truly terrible ideas.
> 
> For the moment, our best protection against DoS attacks on git repos is 
> having many frontends, some powerful allies (e.g. see 
> kernel.googlesource.com), and DoS-avoidance by obscurity ("I can't push 
> to kernel.org right now, but you can pull my repo from my personal 
> server over here").

The way I see it, a spammer could push thousands of spam messages to
the public-inbox, which would be stored in the repository forever.

So, if we're willing to implement this, we should have a solution for
that from the beginning.

The only solution that sounds viable to me is a pre-receive hook at
the git server that receives the commits.

Such a hook would be customizable via .git/config, enabling or
disabling the checks and setting the thresholds:

1) Prevent pushes with too many patches

If the push has more than, let's say, 20 patches, it would be rejected.
Doing that should be easy.

2) Prevent commits from the same person once a certain threshold of
patches per period of time is exceeded

For example, no single developer (except maybe the inbox owner) should
be allowed to send more than, let's say, 1000 messages per day.

3) Implement gray lists

That would be more complex, but I guess it would be possible to
implement a hook that, for example, checks whether the push comes from
a known person (with patches signed by known keys) and/or a known IP
address.

If not, it would push the contents to a separate gray-list repository,
rejecting the change at the main one, and add a notice for the
maintainer when a new person is added to the gray list.

If the owner of the public-inbox decides to accept the patch, they
would simply merge that committer's gray list into the main inbox, and
the developer would be accepted as someone to trust.

4) Implement a black list

If a previously trusted developer starts spamming or misbehaving, they
would be added to a black-list file. Anyone there would have any
pushes silently discarded.
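A rough sketch of check (1): the core of such a pre-receive hook is a rev-list count per pushed ref. The threshold, the throwaway repo, and the simulated push line below are all invented for illustration; a real hook would read the "old new ref" lines from git itself on stdin.

```shell
# Illustrative only: the commit-count check of a hypothetical
# pre-receive hook, run against a simulated push line. The threshold
# is artificially low here (Mauro suggests ~20).
set -e
tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
for i in 1 2 3; do
    git -c user.email=s@example.org -c user.name=spammer \
        commit -q --allow-empty -m "patch $i"
done
max_patches=1
check_push() {
    # git feeds pre-receive hooks "oldrev newrev refname" per line:
    while read -r old new ref; do
        count=$(git rev-list --count "$old..$new")
        if [ "$count" -gt "$max_patches" ]; then
            echo "rejected $ref: $count commits exceed limit $max_patches"
            return 1
        fi
    done
}
root=$(git rev-list --max-parents=0 HEAD)
# Simulate pushing everything after the root commit in one go:
echo "$root $(git rev-parse HEAD) refs/heads/master" | check_push \
    || echo "push blocked"
```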


Thanks,
Mauro

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-11 19:07   ` Geert Uytterhoeven
  2019-10-11 19:12     ` Laurent Pinchart
@ 2019-10-14  6:43     ` Dmitry Vyukov
  1 sibling, 0 replies; 12+ messages in thread
From: Dmitry Vyukov @ 2019-10-14  6:43 UTC (permalink / raw)
  To: Geert Uytterhoeven; +Cc: Konstantin Ryabitsev, workflows

On Fri, Oct 11, 2019 at 9:07 PM Geert Uytterhoeven <geert@linux-m68k.org> wrote:
>
> Hi Dmitry,
>
> On Fri, Oct 11, 2019 at 7:15 PM Dmitry Vyukov <dvyukov@google.com> wrote:
> > I also tend to conclude that some actions should not be done offline
> > and then "synced" a week later. Ted provided an example of starting
> > tests in another thread. Or, say if you close a bug and then push that
> > update a month later without any regard to the current bug state, that
> > may not be the right thing. Working with read-only data offline is
> > perfectly fine. Doing _some_ updates locally and then pushing a week
> > later is fine (e.g. queue a new patch for review). But not necessarily
> > all updates should be doable in offline mode. And this seems to be an
> > inherent conflict with any scheme where one can "queue" any updates
> > locally, and then "sync" them anytime later without any regard to the
> > current state of things and just tell the system and all other
> > participants "deal with it". Also, if we have any kind of
> > permissions/quotas, when are these checks done: when one creates an
> > update or when it's synced?
>
> Not unlike "git push" accepting fast-forwards only, and rejecting
> forced updates.
> Hence you cannot push the close of a bug (each bug has its own
> branch?) before merging the updated remote state first.

The update is in my private git. Nobody has touched it and there are
no conflicts. The logical conflicts are in other people's repos.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: RFC: individual public-inbox/git activity feeds
  2019-10-10 23:57 ` Eric Wong
@ 2019-10-18  2:48   ` Eric Wong
  0 siblings, 0 replies; 12+ messages in thread
From: Eric Wong @ 2019-10-18  2:48 UTC (permalink / raw)
  To: Konstantin Ryabitsev; +Cc: workflows

Eric Wong <e@80x24.org> wrote:
> Konstantin Ryabitsev <konstantin@linuxfoundation.org> wrote:
> 
> <snip a bunch of stuff I agree with>
> 
> > # Individual developer feeds
> 
> <snip>
> 
> > (With time, we can teach kernel.org to act as an MTA bridge that sends
> > actual mail to the mailing lists after we receive individual feed updates.)
> 
> I'm skeptical and pessimistic about that bit happening (as I
> usually am :>).  But the great thing is all that stuff can
> happen without disrupting/changing existing workflows and is
> totally optional.

Well, maybe less skeptical and pessimistic today...

Readers can look for messages intended for them on a DHT or some
other peer-to-peer system.  Or maybe various search engines can
spring into existence or existing ones can be optimized for this.

Readers can opt into this by using invalid/mangled addresses
(e.g. "user@i-pull-my-email.invalid") and rely on that to find
messages intended for them.

Senders sending to them will get a bounce, see the address, and
hopefully assume the reader will see the message eventually if any
publicly-archived address is also in the recipients list.

Or an alternate header (e.g. "Intended-To", "Intended-Cc") could also
be used to avoid bounces (but MUAs would lose those on "Reply-All"),
so maybe putting those pseudo-headers in the message body could work.
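On the reader's side, the pseudo-header idea reduces to a trivial filter over a pulled archive. Both the header name and the mangled address are Eric's hypotheticals, and the mbox below is fabricated for illustration.

```shell
# Illustrative only: scan a fabricated mbox for the proposed
# "Intended-To" pseudo-header carrying the reader's mangled address.
tmp=$(mktemp -d)
cat > "$tmp/archive.mbox" <<'EOF'
From sender Thu Oct 10 00:00:00 2019
Subject: for someone else
Intended-To: other@i-pull-my-email.invalid

hi
From sender Thu Oct 10 00:00:01 2019
Subject: for us
Intended-To: user@i-pull-my-email.invalid

hello
EOF
# -B1 also prints the Subject line preceding each matching header:
grep -B1 '^Intended-To: user@i-pull-my-email.invalid' "$tmp/archive.mbox"
```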


This will NOT solve the spam/flooding/malicious content problem.

However, the receiving end can still use SpamAssassin, rspamd,
or whatever pipe-friendly mail filters they want because it
still looks like mail.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, back to index

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
2019-10-10 23:57 ` Eric Wong
2019-10-18  2:48   ` Eric Wong
2019-10-11 17:15 ` Dmitry Vyukov
2019-10-11 19:07   ` Geert Uytterhoeven
2019-10-11 19:12     ` Laurent Pinchart
2019-10-14  6:43     ` Dmitry Vyukov
2019-10-11 19:39   ` Konstantin Ryabitsev
2019-10-12 11:48     ` Mauro Carvalho Chehab
2019-10-11 22:57 ` Daniel Borkmann
2019-10-12  7:50 ` Greg KH
2019-10-12 11:20 ` Mauro Carvalho Chehab

Workflows Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/workflows/0 workflows/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 workflows workflows/ https://lore.kernel.org/workflows \
		workflows@vger.kernel.org
	public-inbox-index workflows

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.workflows


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git