workflows.vger.kernel.org archive mirror
From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: workflows@vger.kernel.org
Subject: Re: RFC: individual public-inbox/git activity feeds
Date: Fri, 11 Oct 2019 15:39:00 -0400
Message-ID: <20191011193900.cx6ov6abwelzz2ey@chatter.i7.local>
In-Reply-To: <CACT4Y+YddN06rOcE0jc8rVixamHSJALNvGLfzLk2moutL_9rTg@mail.gmail.com>

On Fri, Oct 11, 2019 at 07:15:12PM +0200, Dmitry Vyukov wrote:
>> The main upside of this approach is that it's evolutionary and not
>> revolutionary and we can start implementing it right away, using it to
>> augment and improve mailing lists instead of replacing them outright.
>
>
>Interesting. This is similar to SSB on _some_ level, right? Because
>it's just a different type of transport. I personally don't have any
>horses in the transport race (as long as it is easy to setup and
>provides a good foundation for transferring structured data).

It's similar only in the sense that it's a chain of records that can be 
optionally cryptographically signed. Some of the problems that SSB (and 
especially v2) tries to solve are not anything git concerns itself 
with, such as discovery, feed cross-reference, verifiable partial 
clones, etc.

>What attracted my attention is this part:
>
>refs/feeds/gregkh/0/master
>refs/feeds/davem/0/master
>refs/feeds/davem/1/master
>
>Will this provide a total ordering over all messages by all
>participants? That may be a significant advantage over SSB then (see
>point 14 in [1]). But the "that can be pulled individually" part
>breaks this (complete read-only mirrors for fault-tolerance are fine,
>though).

No, these refs are entirely independent of each other. In a sense, it's 
the equivalent of cloning individual public-inbox repos together and 
then tar'ing them up. For ordering, we still have to go with commit 
timestamps and we'll still have conflicting resolutions, just like you 
mention (though this isn't any different than with email).
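
To make the ordering point concrete, here is a minimal sketch (not part 
of the proposal) that interleaves two independent feeds by commit 
timestamp. The feed names come from the refs quoted above; the `Record` 
type, the timestamps, the subjects, and `merge_feeds` are all made up for 
illustration. Each feed is ordered internally, but records from different 
feeds that carry the same timestamp have no defined relative order, so 
the tie-break used here (feed name) is an arbitrary choice that another 
reader could make differently.

```python
import heapq
from typing import Iterable, List, NamedTuple


class Record(NamedTuple):
    timestamp: int  # commit timestamp in seconds (hypothetical values)
    feed: str       # ref the record came from
    subject: str


def merge_feeds(*feeds: Iterable[Record]) -> List[Record]:
    """Interleave per-feed records by (timestamp, feed name).

    Each input is assumed to be already sorted by timestamp.
    Sorting by feed name on ties is an arbitrary tie-break,
    not something git defines.
    """
    return list(heapq.merge(*feeds, key=lambda r: (r.timestamp, r.feed)))


gregkh = [
    Record(100, "refs/feeds/gregkh/0/master", "patch v1"),
    Record(300, "refs/feeds/gregkh/0/master", "patch v2"),
]
davem = [
    Record(100, "refs/feeds/davem/0/master", "review of v1"),
    Record(200, "refs/feeds/davem/0/master", "applied"),
]

merged = merge_feeds(gregkh, davem)
```

In this made-up data the review sorts ahead of the patch it reviews, 
simply because both carry timestamp 100 and "davem" sorts before 
"gregkh".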

>This may also need some form of DoS protection (esp as we move further
>from email).

Well, amusingly, there are ways of distributing git via decentralized 
protocols (SSB, DAT, IPFS). They are all fairly immature, though, and 
some of them are truly terrible ideas.

For the moment, our best protection against DoS attacks on git repos is 
having many frontends, some powerful allies (e.g. see 
kernel.googlesource.com), and DoS-avoidance by obscurity ("I can't push 
to kernel.org right now, but you can pull my repo from my personal 
server over here").

>I also tend to conclude that some actions should not be done offline
>and then "synced" a week later. Ted provided an example of starting
>tests in another thread. Or, say if you close a bug and then push that
>update a month later without any regard to the current bug state, that
>may not be the right thing.

The same is true with email, though -- people queue up email in their 
outbox and lose connectivity before they can send it out, and that 
happens often. True, we aren't solving this, but it's not a net-new 
problem and will always be a hard problem to solve for laggy 
decentralized environments.

>Working with read-only data offline is
>perfectly fine. Doing _some_ updates locally and then pushing a week
>later is fine (e.g. queue a new patch for review). But not necessarily
>all updates should be doable in offline mode. And this seems to be
>inherent conflict with any scheme where one can "queue" any updates
>locally, and then "sync" them anytime later without any regard to the
>current state of things and just tell the system and all other
>participants "deal with it".

Well, in all honesty, "queueing things up for a week" is going to be an 
increasingly rare problem for anyone who works on the Linux kernel. I 
don't know about others, but I can recall every time I've actually been 
offline in the past year and in each case it involved a transatlantic 
flight with totally broken wi-fi or a trip into a rare spot on the map 
without cell towers. Even long power outages simply mean I have to 
tether my laptop via my phone. Thanks to wireguard, I don't even lose 
ssh sessions when that happens. :)

Replicating a feed out is a very quick task that can be made quicker 
with tricks like ssh ControlMaster connections that keep sessions going.
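
As a rough sketch of that trick, the snippet below builds a `git push` 
invocation whose ssh transport reuses one multiplexed connection. 
`ControlMaster`, `ControlPath` and `ControlPersist` are standard OpenSSH 
client options and `GIT_SSH_COMMAND` is a standard git environment 
variable; the remote name, the ref, the socket path and the 10-minute 
persistence are assumptions picked for illustration.

```python
import os
import shlex
from typing import Dict, List, Tuple


def push_command(remote: str, ref: str,
                 persist: str = "10m") -> Tuple[List[str], Dict[str, str]]:
    """Return (argv, env) for a git push over a multiplexed ssh connection.

    Nothing is executed here; hand the result to
    subprocess.run(argv, env=env) to actually push.
    """
    ssh_opts = [
        "ssh",
        "-o", "ControlMaster=auto",              # reuse a master connection if one exists
        "-o", "ControlPath=~/.ssh/cm-%r@%h:%p",  # socket path (hypothetical location)
        "-o", f"ControlPersist={persist}",       # keep the master open after the push
    ]
    env = dict(os.environ)
    # git runs this string through the shell, so quote each part.
    env["GIT_SSH_COMMAND"] = " ".join(shlex.quote(part) for part in ssh_opts)
    return ["git", "push", remote, ref], env


argv, env = push_command("origin", "refs/heads/master")
```

With a master connection already open, later pushes skip the ssh 
handshake, which is where most of the latency on a short push tends to go.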

>This is interesting too:
>
>refs/heads/master -- RFC-2822 (email) feed for human consumption
>refs/heads/json -- json feed for machine-readable structured data
>
>Playing devil's advocate, what about MIME? :)
>It does not need to be completely arbitrary MIME, but say only 2
>alternative sections, first has to be text/plain, second (optional) has
>to be kthul/json. 

The main reason why I wanted two different refs is so entities like bots 
could pull only the json ref and ignore the one aimed at humans. So, 
while this makes the repository larger by having some data duplication, 
it should make pulling and parsing less problematic for bots, and I 
expect bots to be the ones generating the most frequent hits and traffic.
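
A bot that wants only the machine-readable side could fetch with an 
explicit refspec. The sketch below only constructs the command; 
`refs/heads/json` is the ref named in the quoted proposal, while the 
remote URL and the local tracking namespace `refs/remotes/feed` are 
placeholders invented for this example.

```python
from typing import List


def json_only_fetch(remote_url: str,
                    local_ns: str = "refs/remotes/feed") -> List[str]:
    """Build a git fetch argv that pulls only the json ref.

    The human-readable master ref is never requested, so its
    objects are not transferred unless the json history shares them.
    """
    refspec = f"refs/heads/json:{local_ns}/json"
    return ["git", "fetch", "--no-tags", remote_url, refspec]


# Placeholder URL, not a real feed.
cmd = json_only_fetch("https://example.org/feeds/dev.git")
```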

-K

Thread overview: 12+ messages
2019-10-10 19:28 RFC: individual public-inbox/git activity feeds Konstantin Ryabitsev
2019-10-10 23:57 ` Eric Wong
2019-10-18  2:48   ` Eric Wong
2019-10-11 17:15 ` Dmitry Vyukov
2019-10-11 19:07   ` Geert Uytterhoeven
2019-10-11 19:12     ` Laurent Pinchart
2019-10-14  6:43     ` Dmitry Vyukov
2019-10-11 19:39   ` Konstantin Ryabitsev [this message]
2019-10-12 11:48     ` Mauro Carvalho Chehab
2019-10-11 22:57 ` Daniel Borkmann
2019-10-12  7:50 ` Greg KH
2019-10-12 11:20 ` Mauro Carvalho Chehab
