workflows.vger.kernel.org archive mirror
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Daniel Axtens <dja@axtens.net>
Cc: Dmitry Vyukov <dvyukov@google.com>,
	workflows@vger.kernel.org, automated-testing@yoctoproject.org,
	Brendan Higgins <brendanhiggins@google.com>,
	Han-Wen Nienhuys <hanwen@google.com>,
	Kevin Hilman <khilman@baylibre.com>,
	Veronika Kabatova <vkabatov@redhat.com>
Subject: Re: Structured feeds
Date: Wed, 6 Nov 2019 15:50:51 -0500
Message-ID: <20191106205051.56v25onrxkymrfjz@chatter.i7.local>
In-Reply-To: <8736f1hvbn.fsf@dja-thinkpad.axtens.net>

On Thu, Nov 07, 2019 at 02:35:08AM +1100, Daniel Axtens wrote:
>This is a non-trivial problem, fwiw. Patchwork's email parser clocks in
>at almost thirteen hundred lines, and that's with the benefit of the
>Python standard library. It also regularly gets patched to handle
>changes to email systems (e.g. DMARC), changes to git (git request-pull
>format changed subtly in 2.14.3), the bizarre ways people send email,
>and so on.

I'm actually very interested in seeing patchwork switch from being fed 
mail directly from postfix to using public-inbox repositories as its 
source of patches. I know it's easy enough to accomplish as-is, by 
piping things from public-inbox to parsemail.sh, but it would be even 
more awesome if patchwork learned to work with these repos natively.

The way I see it:

- site administrator configures upstream public-inbox feeds
- a backend process clones these repositories
   - if it doesn't find a refs/heads/json, then it does its own parsing 
     to generate a structured feed with patches/series/trailers/pull 
     requests, cross-referencing them by series as necessary. Something 
     like a subset of this, excluding patchwork-specific data:
     https://patchwork.kernel.org/api/1.1/patches/11177661/
   - if it does find an existing structured feed, it simply uses it (e.g.
     because another patchwork instance made it available)
- the same backend process updates the repositories from upstream using 
   proper manifest files (e.g. see 
   https://lore.kernel.org/workflows/manifest.js.gz)

- patchwork projects then consume one (or more) of these structured 
   feeds to generate the actionable list of patches that maintainers can 
   use, perhaps with optional filtering by specific headers (list-id, 
   from, cc), patch paths, keywords, etc.
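
A minimal sketch of the two backend decisions described above: which
repositories to refresh, and whether a clone needs local parsing. It
assumes the manifest is a gzip-compressed JSON object keyed by repository
path, with a per-repository "fingerprint" field; that layout is an
assumption, not something stated in this message. The function names are
hypothetical, and the list of refs is expected to come from the caller
(e.g. collected from the clone beforehand).

```python
import gzip
import json

# Branch name taken from the proposal above.
STRUCTURED_REF = "refs/heads/json"


def load_manifest(raw: bytes) -> dict:
    """Decode a gzip-compressed JSON manifest into a dict keyed by repo path."""
    return json.loads(gzip.decompress(raw).decode("utf-8"))


def repos_to_update(local: dict, remote: dict) -> list:
    """Return repo paths that are new upstream or whose fingerprint differs.

    Assumes each manifest entry carries a "fingerprint" value (hypothetical
    layout for this sketch).
    """
    stale = []
    for path, meta in remote.items():
        known = local.get(path)
        if known is None or known.get("fingerprint") != meta.get("fingerprint"):
            stale.append(path)
    return sorted(stale)


def needs_local_parsing(refs) -> bool:
    """True when the clone carries no ready-made structured feed branch."""
    return STRUCTURED_REF not in set(refs)
```

Keeping both checks as pure functions means the cloning and fetching
itself can stay in whatever tool already mirrors the repositories.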

Basically, parsemail.sh is split into two, where one part does feed 
cloning, pulling, and parsing into structured data (if not already 
done), and another populates the actual patchwork project with patches
matching the requested parameters.
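
To illustrate the split, here is a rough sketch of the two halves using
only the Python standard library. It is not patchwork's parser: the field
names, the regular expressions, and the parse_message/matches helpers are
all hypothetical, and real-world mail needs far more care than this
(MIME attachments, odd encodings, pull requests, and so on).

```python
import re
from email import message_from_bytes, policy

# Simplified patterns; assumptions made for this sketch only.
PREFIX_RE = re.compile(r"\[([^\]]*PATCH[^\]]*)\]", re.IGNORECASE)
VERSION_RE = re.compile(r"\bv(\d+)\b", re.IGNORECASE)
COUNTER_RE = re.compile(r"\b(\d+)/(\d+)\b")
TRAILER_RE = re.compile(r"^([A-Za-z][A-Za-z-]*-by): (.+)$", re.MULTILINE)
PATH_RE = re.compile(r"^diff --git a/(\S+) b/\S+$", re.MULTILINE)


def parse_message(raw: bytes) -> dict:
    """Stage one: turn a raw message into a structured entry."""
    msg = message_from_bytes(raw, policy=policy.default)
    subject = str(msg.get("Subject", ""))
    part = msg.get_body(preferencelist=("plain",))
    body = part.get_content() if part is not None else ""
    prefix = PREFIX_RE.search(subject)
    tags = prefix.group(1) if prefix else ""
    version = VERSION_RE.search(tags)
    counter = COUNTER_RE.search(tags)
    return {
        "message_id": str(msg.get("Message-ID", "")).strip(),
        "list_id": str(msg.get("List-Id", "")).strip(),
        "from": str(msg.get("From", "")),
        "cc": str(msg.get("Cc", "")),
        "subject": subject,
        "is_patch": prefix is not None,
        "version": int(version.group(1)) if version else 1,
        "index": int(counter.group(1)) if counter else None,
        "total": int(counter.group(2)) if counter else None,
        "trailers": TRAILER_RE.findall(body),  # only "*-by:" trailers
        "paths": PATH_RE.findall(body),
    }


def matches(entry, list_id=None, sender=None, path_prefix=None, keyword=None):
    """Stage two: True when a patch entry satisfies every filter given."""
    if not entry["is_patch"]:
        return False
    if list_id and list_id not in entry["list_id"]:
        return False
    if sender and sender.lower() not in entry["from"].lower():
        return False
    if path_prefix and not any(p.startswith(path_prefix) for p in entry["paths"]):
        return False
    if keyword and keyword.lower() not in entry["subject"].lower():
        return False
    return True
```

A project definition would then amount to a set of keyword arguments for
matches(), applied to every entry in the feeds it subscribes to.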

I see the following upsides to this:

- we consume public-inbox feeds directly, no longer losing patches due 
   to MTA problems, postfix burps, parse failures, etc.
- a project can have multiple sources for patches instead of being tied 
   to a single mailing list
- downstream patchwork instances (the "local patchwork" tool I mentioned 
   earlier) can benefit from structured feeds provided by 
   patchwork.kernel.org
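
One practical detail with multiple sources: a patch cross-posted to
several lists will show up in several feeds. A small sketch of how a
project could merge feeds while keeping only the first copy of each
message, assuming every structured entry is a dict with a "message_id"
key (a hypothetical layout, as is the merge_feeds name):

```python
def merge_feeds(*feeds):
    """Yield entries from several feeds, keeping the first copy per Message-ID."""
    seen = set()
    for feed in feeds:
        for entry in feed:
            key = entry["message_id"]
            if key in seen:
                continue  # already delivered via an earlier feed
            seen.add(key)
            yield entry
```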

>Patchwork does expose much of this as an API, for example for patches:
>https://patchwork.ozlabs.org/api/patches/?order=-id so if you want to
>build on that feel free. We can possibly add data to the API if that
>would be helpful. (Patches are always welcome too, if you don't want to
>wait an indeterminate amount of time.)

As I said previously, I may be able to fund development of various 
features, but I want to make sure that I properly work with upstream.  
That requires getting consensus on features to make sure that we don't 
spend funds and effort on a feature that gets rejected. :)

Would the above feature (using one or more public-inbox repositories as 
sources for a patchwork project) be a welcome addition to upstream?

-K
