All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tom Lord <lord@emf.net>
To: torvalds@osdl.org
Cc: mpm@selenic.com, seanlkml@sympatico.ca,
	linux-kernel@vger.kernel.org, git@vger.kernel.org
Subject: Re: Mercurial 0.4b vs git patchbomb benchmark
Date: Fri, 29 Apr 2005 10:34:53 -0700 (PDT)	[thread overview]
Message-ID: <200504291734.KAA25263@emf.net> (raw)
In-Reply-To: <Pine.LNX.4.58.0504290854270.18901@ppc970.osdl.org> (message from Linus Torvalds on Fri, 29 Apr 2005 08:58:37 -0700 (PDT))


   From: Linus Torvalds <torvalds@osdl.org>

   On Fri, 29 Apr 2005, Tom Lord wrote:
   > 
   > On the other hand, you're asking people to sign whole trees and not
   > just at first-import time but also for every change.

   I don't agree.

Let's be more precise, then.

   Sure, the commit determins the whole tree end result, but if you want to 
   sign the _tree_, you can do so: just tag the actual _tree_ object as "this 
   tree has been verified to be bug-free and non-baby-seal-clubbing".

   But that's not what people do with tags. They sign a _commit_ object. And
   yes, the commit object points to the tree, but it also points to the whole
   history of other commit objects (and thus all historical trees etc), and 
   together with just common sense it is very obvious that what you're really 
   signing is that "point in time".

   If you want to clarify it, you can always just say so in the tag. Instead 
   of saying "I tag this as something I have verified every byte of", you can 
   say "this was what I released as xxx", or "this commit contains my change" 
   or something.


A programmer publishing a change to the kernel has three pieces of 
data in play.  They are:
nn
	1) the ancestry of their modified tree

	2) the complete contents of their modified tree

	3) input data for a patching program (let's call it "PATCH")
	   which, at the very least, satisfies the equation:

		MOD_TREE = PATCH (this_diff, ORIG_TREE)


An upstream consumer, most often, is (should be) using (1) and (3).
In your system, the upstream consumer is given (1) and (2) and must
compute (3) for themselves.   The upstream consumer can also be provided
a signed version of (3) but if clients are mostly relying on the (2)
then there are multiple vulnerabilities there.

The set of pairs of type (1) and (2) is a dual space to the set of
pairs of type (1) and (3).   In that sense, it makes no mathematical
difference whether the programmer signs a {(1),(3)} pair or a {(1),(2)}
pair since either way, the other pair can be trivially derived.

On the other hand, signing documents which represent a {(1),(3)} pair
with robust accuracy is, in most cases, much much less expensive than
signing {(1),(2)} pairs with robust accuracy.  This is a bit like the
difference between "*I* didn't set loose any mice in the house" vs. "I
have searched every corner of the house and swear there are no mice
here."

Another way to say that is that if someone gives me a signed {(1),(3)} pair
I am likely to be much more confident that the signature represents
an in-depth endorsement of the content as opposed to "here is what the 
tools on my system happened to generate -- I hope it's what I meant".


   > If I've changed five files, I should be signing a statement of:
   > 
   > 	1) my belief about the identity of the immediate ancestor tree
   > 	2) a robust summary of my changes, sufficient to recreate my
   > 	   new tree given a faithful copy of the ancestor

   So _do_ exactly that. You can say that in the tag you're signing.

Which is pretty much exactly what Arch does except that, in the case
of Arch, the signed diff is actually used directly to produce the mod
tree.   There isn't a step in the process where a programmer reads
a purported diff, assumes that the signed tree accurately reflects
that diff, and then merges against the signed tree rather than the
diff.

The Arch approach also has the win that the signed diffs are useful
even in the absense of the ORIG_TREE.  In the Arch world, we use this
for the form of selective merging that we call "cherry-picking" (picking
the desired changes from somebody else's line of development while 
skipping those which are not desired -- yet still being able to proceed
with bidirectional merging in a simple way).

The Arch approach also has the win that it amounts to delta-compression,
simplifying the design of efficient transports and helping to tame
I/O bandwidth costs in the local case.   A lot of good features simply
"fall out" of this approach.

This is kind of a yin-yang thing.  It's also valuable, in my view, to
sign {(1),(2)} pairs.  It's also (in a more obscure way) valuable to
sign {(1),(2),(3)} triples, especially if clients are regularly validating
them by making sure the triple describes a true diff application.
So one ultimately wants both functionalities, really.

Signing trees which are really defined by a diff is a handy checksum
on the accuracy of tools which people are using but it isn't a substitute
for signing the diff itself and using it as the primary definition of 
the tree it generates when combined with the immediate ancestor(s).

Signing trees is also handy when that signature is then further linked
to a set of binaries.

Signing trees also is easier to implement and so gets git off the ground
faster.

But there's more to do, if a serious system is desired.

-t


  reply	other threads:[~2005-04-29 17:35 UTC|newest]

Thread overview: 126+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-04-26  0:41 Mercurial 0.3 vs git benchmarks Matt Mackall
2005-04-26  1:49 ` Daniel Phillips
2005-04-26  2:08 ` Linus Torvalds
2005-04-26  2:30   ` Mike Taht
2005-04-26  3:04     ` Linus Torvalds
2005-04-26  4:00       ` Linus Torvalds
2005-04-26 11:13         ` Chris Mason
2005-04-26 15:09           ` Magnus Damm
2005-04-26 15:38             ` Chris Mason
2005-04-26 16:23               ` Magnus Damm
2005-04-26 18:18                 ` Chris Mason
2005-04-26 20:56                 ` Andrew Morton
2005-04-26 21:07                   ` Linus Torvalds
2005-04-26 22:50                     ` H. Peter Anvin
2005-04-26 22:56                     ` Andrew Morton
2005-04-26 23:43                       ` H. Peter Anvin
2005-04-27 15:01                         ` Florian Weimer
2005-04-27 15:13                           ` Thomas Glanzmann
2005-04-27 18:54                             ` H. Peter Anvin
2005-04-27 19:01                               ` Thomas Glanzmann
2005-04-27 19:57                                 ` Theodore Ts'o
2005-04-27 20:06                                   ` Thomas Glanzmann
2005-04-27 20:35                                 ` H. Peter Anvin
2005-04-27 20:39                                   ` Thomas Glanzmann
2005-04-27 20:47                                   ` Florian Weimer
2005-04-27 20:55                                 ` Florian Weimer
2005-04-27 21:04                                   ` H. Peter Anvin
2005-04-27 21:06                                     ` Florian Weimer
2005-04-27 21:32                                       ` Theodore Ts'o
2005-04-27 19:55                       ` Theodore Ts'o
2005-04-27  6:34                   ` Ingo Molnar
2005-04-27 21:10                     ` Bill Davidsen
2005-04-27 21:39                       ` Linus Torvalds
2005-04-26 16:42           ` Linus Torvalds
2005-04-26 17:39             ` Chris Mason
2005-04-26 19:52               ` Chris Mason
2005-04-26 18:15         ` H. Peter Anvin
2005-04-26 20:30           ` Bill Davidsen
2005-04-26 16:11       ` Bill Davidsen
2005-04-26  4:01   ` Matt Mackall
2005-04-26  4:20     ` Linus Torvalds
2005-04-26  4:09   ` Chris Wedgwood
2005-04-26  4:22     ` Andreas Gal
2005-04-26  4:22     ` Linus Torvalds
2005-04-29  6:01   ` Mercurial 0.4b vs git patchbomb benchmark Matt Mackall
2005-04-29  6:40     ` Sean
2005-04-29  7:40       ` Matt Mackall
2005-04-29  8:40         ` Sean
2005-04-29 14:34         ` Linus Torvalds
2005-04-29 15:18           ` Morten Welinder
2005-04-29 16:52             ` Matt Mackall
2005-05-02 16:10               ` Bill Davidsen
2005-05-02 19:02                 ` Sean
2005-05-02 22:02                 ` Linus Torvalds
2005-05-02 22:30                   ` Matt Mackall
2005-05-02 22:49                     ` Linus Torvalds
2005-05-03  0:00                       ` Matt Mackall
2005-05-03  2:48                         ` Linus Torvalds
2005-05-03  3:29                           ` Matt Mackall
2005-05-03  4:18                             ` Linus Torvalds
2005-05-03  4:24                         ` Linus Torvalds
2005-05-03  4:27                           ` Matt Mackall
2005-05-03  8:45                           ` Chris Wedgwood
2005-04-29 15:44           ` Tom Lord
2005-04-29 15:58             ` Linus Torvalds
2005-04-29 17:34               ` Tom Lord [this message]
2005-04-29 17:56                 ` Linus Torvalds
2005-04-29 18:08                   ` Tom Lord
2005-04-29 18:33                     ` Sean
2005-04-29 18:54                       ` Tom Lord
2005-04-29 19:13                         ` Sean
2005-04-29 19:22                           ` Tom Lord
2005-04-29 19:28                           ` Tom Lord
2005-04-29 19:47                             ` Noel Maddy
2005-04-29 19:54                               ` Tom Lord
2005-04-29 20:13                                 ` Andrew Timberlake-Newell
2005-04-29 20:26                                   ` Tom Lord
2005-04-29 20:57                                     ` Andrew Timberlake-Newell
2005-04-29 20:16                                 ` Morgan Schweers
2005-04-29 20:21                                 ` Noel Maddy
2005-04-29 20:42                                   ` git network protocol David Lang
2005-04-29 21:15                                     ` Daniel Barkalow
2005-04-29 20:44                                   ` Mercurial 0.4b vs git patchbomb benchmark Tom Lord
2005-04-29 21:57                                     ` Denys Duchier
2005-04-29 20:29                                 ` Signed commit vulnerabilities? (was: Mercurial 0.4b vs git patchbomb benchmark) Kevin Smith
2005-04-29 21:45                             ` Mercurial 0.4b vs git patchbomb benchmark Horst von Brand
2005-05-02 21:06                               ` Tom Lord
2005-05-03  0:24                                 ` Kevin Smith
2005-05-02 16:15                           ` Bill Davidsen
2005-04-29 16:37           ` Matt Mackall
2005-04-29 17:09             ` Linus Torvalds
2005-04-29 19:12               ` Matt Mackall
2005-04-29 19:50                 ` Linus Torvalds
2005-04-29 20:23                   ` Matt Mackall
2005-04-29 20:49                     ` Linus Torvalds
2005-04-29 21:20                       ` Matt Mackall
2005-04-29 16:46           ` Bill Davidsen
2005-04-29 20:19       ` Andrea Arcangeli
2005-04-29 22:30         ` Olivier Galibert
2005-04-29 22:47           ` Andrea Arcangeli
2005-04-29 20:30     ` Andrea Arcangeli
2005-04-29 20:39       ` Matt Mackall
2005-04-30  2:52         ` Andrea Arcangeli
2005-04-30 15:20           ` Matt Mackall
2005-04-30 16:37             ` Andrea Arcangeli
2005-05-02 15:49           ` Bill Davidsen
2005-05-02 16:14             ` Valdis.Kletnieks
2005-05-03 17:40               ` Bill Davidsen
2005-05-04  2:10                 ` Mercurial 0.4b vs git patchbomb benchmark (/usr/bin/env again) David A. Wheeler
2005-05-02 16:17             ` Mercurial 0.4b vs git patchbomb benchmark Andrea Arcangeli
2005-05-02 16:31             ` Linus Torvalds
2005-05-02 17:18               ` Daniel Jacobowitz
2005-05-02 17:32                 ` Linus Torvalds
2005-05-02 18:17                 ` Edgar Toernig
2005-05-02 20:54                 ` Sam Ravnborg
2005-05-02 17:20               ` Ryan Anderson
2005-05-02 17:31                 ` Linus Torvalds
2005-05-02 21:17               ` Kyle Moffett
2005-05-03 17:43               ` Bill Davidsen
2005-04-30 14:44 Adam J. Richter
2005-04-30 16:06 ` Matt Mackall
     [not found] <3YQn9-8qX-5@gated-at.bofh.it>
     [not found] ` <3ZLEF-56n-1@gated-at.bofh.it>
     [not found]   ` <3ZM7L-5ot-13@gated-at.bofh.it>
     [not found]     ` <3ZN3P-69A-9@gated-at.bofh.it>
     [not found]       ` <3ZNdz-6gK-9@gated-at.bofh.it>
2005-05-03  1:16         ` Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
2005-05-03  1:29           ` Matt Mackall
2005-05-03 16:22             ` Bill Davidsen
2005-05-03 17:14               ` Rene Scharfe
2005-05-04 17:51                 ` Bill Davidsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=200504291734.KAA25263@emf.net \
    --to=lord@emf.net \
    --cc=git@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mpm@selenic.com \
    --cc=seanlkml@sympatico.ca \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.