Re: BitBucket: GPL-ed KitBeeper clone

* Re: BitBucket: GPL-ed KitBeeper clone
@ 2003-03-02  0:11 Adam J. Richter
  2003-03-02  0:20 ` Larry McVoy
                   ` (4 more replies)
  0 siblings, 5 replies; 155+ messages in thread
From: Adam J. Richter @ 2003-03-02  0:11 UTC (permalink / raw)
  To: andrea, linux-kernel, pavel, pavel; +Cc: hch

Pavel Machek wrote:
> I've created little project for read-only (for now ;-) kitbeeper
> clone. It is available at www.sf.net/projects/bitbucket (no tar balls,
> just get it fresh from CVS).

	Thank you for taking some initiative and improving this
situation by constructive means.  You are an example to us all,
as is Andrea Arcangeli with his openbkweb project, which you
will probably want to examine and perhaps integrate
(ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/openbkweb).

	bitbucket is about 350 lines of shell scripts, documentation
and diffs, the most interesting file of which is FORMAT, which
documents some reverse engineering efforts on bitkeeper internal file
formats.  bitbucket currently uses rsync to update data from the
repository.  openbkweb is 500+ lines of python that implements enough
of the bitkeeper network protocol to do downloads, although perhaps
inefficiently.  That sounds like some functionality that you might be
interested in integrating.

	I think the suggestion made by Pavel Janik that it would
be better to work on adding BitKeeper-like functionality to existing
free software packages is a bit misdirected.  BitKeeper uses SCCS
format, and we have a GPL'ed SCCS clone ("cssc"), so you are
adding functionality to existing free software version control
code anyhow.

	However, I would like to turn Pavel Janik's point in
what I think might be a more constructive direction.

	Aegis, BitKeeper and probably other configuration management
tools that use sccs or rcs basically share a common type of lower
layer.  This lower layer converts a file-based revision control system
such as sccs to an "uber-cvs", as someone called it in a slashdot
discussion, that can:

	    1. process a transaction against a group of files atomically,
	    2. associate a comment with such a transaction rather than
	       with just one file,
	    3. represent symbolic links and file protections,
	    4. represent file renames (and perhaps copies?)
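
To make the four capabilities above concrete, here is a minimal Python sketch of such a lower layer. It is only an illustration under stated assumptions: the names FileChange, Transaction and Repository are hypothetical, the repository lives in memory, and nothing here reflects how BitKeeper or Aegis actually store their data.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FileChange:
    """One file's part of a transaction (hypothetical model)."""
    path: str                             # path after the change
    content: Optional[str] = None         # new content, if any
    mode: int = 0o644                     # file protection bits
    symlink_target: Optional[str] = None  # set when the entry is a symlink
    renamed_from: Optional[str] = None    # set when the file was renamed


@dataclass
class Transaction:
    comment: str                          # one comment for the whole group
    changes: List[FileChange]


@dataclass
class Repository:
    files: Dict[str, FileChange] = field(default_factory=dict)
    log: List[Transaction] = field(default_factory=list)

    def commit(self, txn: Transaction) -> None:
        # Work on a copy so that a failure leaves self.files untouched.
        staged = dict(self.files)
        for change in txn.changes:
            if change.renamed_from is not None:
                if change.renamed_from not in staged:
                    raise KeyError(f"cannot rename missing file {change.renamed_from}")
                old = staged.pop(change.renamed_from)
                if change.content is None and change.symlink_target is None:
                    # Pure rename: carry content, mode and link target over.
                    change = replace(old, path=change.path,
                                     renamed_from=change.renamed_from)
            staged[change.path] = change
        # Publish everything in one step, together with the shared comment.
        self.files = staged
        self.log.append(txn)
```

Staging the changes in a copy and swapping it in at the end is the simplest way to get all-or-nothing behaviour in memory; an on-disk implementation would need locking or a journal instead.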

	You might want to keep in the back of your mind the
possibility of someday splitting off this lower level into a separate
software package that programs like your bitkeeper clone and aegis could
use in common.  If the interface to this lower level took cvs
commands, then it could probably replace cvs, although the repository
would probably be incompatible since the meaning of things like
checking in multiple files together with a single comment would be
different, and there would be other kinds of changes to represent
beyond what cvs currently does.  Using a repository format that is
compatible with another system (for example bitkeeper or aegis) would
make such a tool more useful, and if such a tool makes it easier for
people to migrate from a proprietary system to a free one, that's
even better, so your starting with bitkeeper's format seems like an
excellent choice to me.

	Thanks again for starting this project.  I will at least
try to be a user of it.

Adam J. Richter     __     ______________   575 Oroville Road
adam@yggdrasil.com     \ /                  Milpitas, California 95035
+1 408 309-6081         | g g d r a s i l   United States of America
                         "Free Software For The Rest Of Us."


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
@ 2003-03-02  0:20 ` Larry McVoy
  2003-03-02  0:20 ` David Lang
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-02  0:20 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: andrea, linux-kernel, pavel, pavel, hch

> 	Thanks again for starting this project.  I will at least
> try to be a user of it.

Enjoy yourself.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
  2003-03-02  0:20 ` Larry McVoy
@ 2003-03-02  0:20 ` David Lang
  2003-03-02  0:49 ` Arador
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: David Lang @ 2003-03-02  0:20 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: andrea, linux-kernel, pavel, pavel, hch

Adam, the openbkweb project didn't reverse engineer the BK network
protocol, it used the HTTP access that is provided on bkbits.net to
download the individual items and created a repository from that.

unfortunately the bandwidth requirements to support that are high enough
that Larry indicated that if people keep doing that he would have to
shut down the HTTP access.

bitbucket uses rsync as that is the most efficient way to get a copy of
the repository without trying to talk the bitkeeper protocol. it is FAR
more efficient and accurate than the openbkweb interface

David Lang



* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
  2003-03-02  0:20 ` Larry McVoy
  2003-03-02  0:20 ` David Lang
@ 2003-03-02  0:49 ` Arador
  2003-03-02  1:03   ` Jeff Garzik
  2003-03-02  2:15   ` Alan Cox
  2003-03-02  1:26 ` Olivier Galibert
  2003-03-02  1:37 ` Filip Van Raemdonck
  4 siblings, 2 replies; 155+ messages in thread
From: Arador @ 2003-03-02  0:49 UTC (permalink / raw)
  To: Adam J. Richter; +Cc: andrea, linux-kernel, pavel, pavel, hch

On Sat, 1 Mar 2003 16:11:55 -0800
"Adam J. Richter" <adam@yggdrasil.com> wrote:

(Just a very personal suggestion)
Why waste time trying to clone a
tool such as bitkeeper? Why not support things like subversion?



Diego Calleja


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:49 ` Arador
@ 2003-03-02  1:03   ` Jeff Garzik
  2003-03-02  2:15   ` Alan Cox
  1 sibling, 0 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02  1:03 UTC (permalink / raw)
  To: Arador; +Cc: Adam J. Richter, andrea, linux-kernel, pavel, pavel, hch

Arador wrote:
> On Sat, 1 Mar 2003 16:11:55 -0800
> "Adam J. Richter" <adam@yggdrasil.com> wrote:
> 
> (Just a very personal suggestion)
> Why waste time trying to clone a
> tool such as bitkeeper? Why not support things like subversion?


...because, clearly, Pavel is being paid by BitMover to dilute 
programmer resources and user mindshare, thus slowing all open source 
SCM efforts.

</sarcasm>

That's not Pavel's aim, obviously, but it's the net effect.

	Jeff





* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:49 ` Arador
  2003-03-02  1:03   ` Jeff Garzik
@ 2003-03-02  2:15   ` Alan Cox
  2003-03-02  1:19     ` Jeff Garzik
  1 sibling, 1 reply; 155+ messages in thread
From: Alan Cox @ 2003-03-02  2:15 UTC (permalink / raw)
  To: Arador
  Cc: Adam J. Richter, andrea, Linux Kernel Mailing List, pavel, pavel, hch

On Sun, 2003-03-02 at 00:49, Arador wrote:
> On Sat, 1 Mar 2003 16:11:55 -0800
> "Adam J. Richter" <adam@yggdrasil.com> wrote:
> 
> (Just a very personal suggestion)
> Why waste time trying to clone a
> tool such as bitkeeper? Why not support things like subversion?

Because the repositories people need to read are in BK format, for better
or worse. It doesn't ultimately matter if you use it as an input filter
for CVS, subversion or no VCS at all.




* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  2:15   ` Alan Cox
@ 2003-03-02  1:19     ` Jeff Garzik
  2003-03-02  1:40       ` BitBucket: GPL-ed *notrademarkhere* clone Andrea Arcangeli
  2003-03-03  0:10       ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
  0 siblings, 2 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02  1:19 UTC (permalink / raw)
  To: Alan Cox
  Cc: Arador, Adam J. Richter, andrea, Linux Kernel Mailing List,
	pavel, pavel, hch

Alan Cox wrote:
> On Sun, 2003-03-02 at 00:49, Arador wrote:
> 
>>On Sat, 1 Mar 2003 16:11:55 -0800
>>"Adam J. Richter" <adam@yggdrasil.com> wrote:
>>
>>(Just a very personal suggestion)
>>Why waste time trying to clone a
>>tool such as bitkeeper? Why not support things like subversion?
> 
> 
> Because the repositories people need to read are in BK format, for better
> or worse. It doesn't ultimately matter if you use it as an input filter
> for CVS, subversion or no VCS at all.

"BK format"?  Not really.  Patches have been posted (to lkml, even) to 
GNU CSSC which allow it to read SCCS files BK reads and writes.

Since that already exists, a full BitKeeper clone is IMO a bit silly, 
because it draws users and programmers away from projects that could 
potentially _replace_ BitKeeper.

	Jeff





* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  1:19     ` Jeff Garzik
@ 2003-03-02  1:40       ` Andrea Arcangeli
  2003-03-02  1:45         ` Jeff Garzik
  2003-03-03  0:10       ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
  1 sibling, 1 reply; 155+ messages in thread
From: Andrea Arcangeli @ 2003-03-02  1:40 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel, hch

On Sat, Mar 01, 2003 at 08:19:52PM -0500, Jeff Garzik wrote:
> Alan Cox wrote:
> >On Sun, 2003-03-02 at 00:49, Arador wrote:
> >
> >>On Sat, 1 Mar 2003 16:11:55 -0800
> >>"Adam J. Richter" <adam@yggdrasil.com> wrote:
> >>
> >>(Just a very personal suggestion)
> >>Why waste time trying to clone a
> >>tool such as *notrademarkhere*? Why not support things like subversion?
> >
> >
> >Because the repositories people need to read are in BK format, for better
> >or worse. It doesn't ultimately matter if you use it as an input filter
> >for CVS, subversion or no VCS at all.
> 
> "BK format"?  Not really.  Patches have been posted (to lkml, even) to 
> GNU CSSC which allow it to read SCCS files BK reads and writes.

you never tried what you're talking about.  there's no way to make any
use of the SCCS tree from Rik's website with only the patched CSSC. The
whole point of bitbucket is to find a way to use CSSC on that tree. And
the longer Larry takes to export the whole data in an open format (CVS,
subversion or whatever), the more progress will be accomplished in
getting the data out of the only service we have right now (Rik's
server). Sure, CSSC is a fundamental piece for extracting the data out of
the single files, but CSSC alone is useless. CSSC only allows you to
work on a single file; you lose the whole view of the tree, and in turn
it is completely unusable for doing anything useful like watching
changesets, checking out a branch, or anything else. As
Pavel found, _all_ the info we are interested in is in the
SCCS/s.ChangeSet file, and that has nothing to do with CSSC or SCCS.
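
The distinction drawn above, per-file history versus a tree-wide view, can be sketched in Python. This is a toy model under stated assumptions: the ChangeSet class and the tree_at function are hypothetical names, and the record layout is invented for illustration. It is not the real SCCS/s.ChangeSet format, which this sketch does not attempt to parse.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ChangeSet:
    """Hypothetical record tying per-file revisions together."""
    cset_id: str
    author: str
    comment: str
    deltas: List[Tuple[str, str]]  # (path, per-file revision) pairs


def tree_at(history: List[ChangeSet], cset_id: str) -> Dict[str, str]:
    """Return {path: revision} for the whole tree as of one changeset.

    A single-file tool can hand back any one revision of any one file,
    but only an index like `history` says which revisions belong together.
    """
    tree: Dict[str, str] = {}
    for cset in history:  # assumed to be in chronological order
        for path, revision in cset.deltas:
            tree[path] = revision
        if cset.cset_id == cset_id:
            return tree
    raise KeyError(cset_id)
```

With such an index in hand, each (path, revision) pair could then be handed to a single-file tool to extract the actual contents.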

> 
> Since that already exists, a full BitKeeper clone is IMO a bit silly, 
> because it draws users and programmers away from projects that could 
> potentially _replace_ BitKeeper.

Jeff, please uninstall *notrademarkhere* from your harddisk, install the
patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
harddisk (like I just did), and then send me via email the diff of the
last Changeset that Linus applied to his tree with author, date,
comments etc...  If you can do that, you're completely right and at
least personally I will agree 100% with you, again: iff you can.

Andrea


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  1:40       ` BitBucket: GPL-ed *notrademarkhere* clone Andrea Arcangeli
@ 2003-03-02  1:45         ` Jeff Garzik
  2003-03-02  2:09           ` Andrea Arcangeli
                             ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02  1:45 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel, hch

Andrea Arcangeli wrote:
> Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> harddisk (like I just did), and then send me via email the diff of the
> last Changeset that Linus applied to his tree with author, date,
> comments etc...  If you can do that, you're completely right and at
> least personally I will agree 100% with you, again: iff you can.


You're missing the point:

A BK exporter is useful.  A BK clone is not.

If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)

	Jeff





* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  1:45         ` Jeff Garzik
@ 2003-03-02  2:09           ` Andrea Arcangeli
  2003-03-02 17:28             ` Jeff Garzik
  2003-03-02  3:29           ` H. Peter Anvin
  2003-03-03  0:13           ` Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: Andrea Arcangeli @ 2003-03-02  2:09 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel, hch

On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
> Andrea Arcangeli wrote:
> >Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >harddisk (like I just did), and then send me via email the diff of the
> >last Changeset that Linus applied to his tree with author, date,
> >comments etc...  If you can do that, you're completely right and at
> >least personally I will agree 100% with you, again: iff you can.
> 
> 
> You're missing the point:
> 
> A BK exporter is useful.  A BK clone is not.
> 
> If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)

hey, in your previous email you claimed all we need is the patched CSSC,
you change topic quick! Glad you agree CSSC alone is useless and to make
anything useful with Rik's *notrademarkhere* tree we need a true
*notrademarkhere* exporter (of course the exporter will be backed by
CSSC to extract the single file changes, since they're in SCCS format
and it would be pointless to reinvent the wheel).

Now you say the bitbucket project (you read Pavel's announcement, he
said "read only for now", that means exporter in my vocabulary) is
useful, to me that sounds the opposite of your previous claims, but
again: glad we agree on this too now.

Andrea


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  2:09           ` Andrea Arcangeli
@ 2003-03-02 17:28             ` Jeff Garzik
  2003-03-02 18:16               ` Andrea Arcangeli
  0 siblings, 1 reply; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02 17:28 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel, hch

Andrea Arcangeli wrote:
> On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
> 
>>Andrea Arcangeli wrote:
>>
>>>Jeff, please uninstall *notrademarkhere* from your harddisk, install the
>>>patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
>>>harddisk (like I just did), and then send me via email the diff of the
>>>last Changeset that Linus applied to his tree with author, date,
>>>comments etc...  If you can do that, you're completely right and at
>>>least personally I will agree 100% with you, again: iff you can.
>>
>>
>>You're missing the point:
>>
>>A BK exporter is useful.  A BK clone is not.
>>
>>If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)
> 
> 
> hey, in your previous email you claimed all we need is the patched CSSC,
> you change topic quick! Glad you agree CSSC alone is useless and to make
> anything useful with Rik's *notrademarkhere* tree we need a true
> *notrademarkhere* exporter (of course the exporter will be backed by
> CSSC to extract the single file changes, since they're in SCCS format
> and it would be pointless to reinvent the wheel).

I have not changed the topic, you are still missing my point.

Let us get this small point out of the way:  I agree that GNU CSSC 
cannot read the BitKeeper ChangeSet file, which is a file critical for 
getting the "weave" correct.

But that point is not relevant to my thread of discussion.

Let us continue in the below paragraph...

> Now you say the bitbucket project (you read Pavel's announcement, he
> said "read only for now", that means exporter in my vocabulary) is
> useful, to me that sounds the opposite of your previous claims, but
> again: glad we agree on this too now.

I disagree with your translation.  Maybe this is the source of the
misunderstanding.

To me, a "BK clone, read only for now" is vastly different from a "BK 
exporter".  The "for now" clearly implies that it will eventually 
attempt to be a full SCM.

Why do we need Yet Another Open Source SCM?
Why does Pavel not work on an existing open source SCM, to enable it to 
read/write BitKeeper files?

These are the key questions which bother me.

Why do they bother me?

The open source world does not need yet another project that is "not 
quite as good as BitKeeper."  The open source world needs something that 
can do all that BitKeeper does, and more :)  A BK clone would be in a 
perpetual state of "not quite as good as BitKeeper".

	Jeff


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 17:28             ` Jeff Garzik
@ 2003-03-02 18:16               ` Andrea Arcangeli
  2003-03-02 20:12                 ` Jeff Garzik
  2003-03-03 18:37                 ` Larry McVoy
  0 siblings, 2 replies; 155+ messages in thread
From: Andrea Arcangeli @ 2003-03-02 18:16 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel, hch

On Sun, Mar 02, 2003 at 12:28:23PM -0500, Jeff Garzik wrote:
> Andrea Arcangeli wrote:
> >On Sat, Mar 01, 2003 at 08:45:08PM -0500, Jeff Garzik wrote:
> >
> >>Andrea Arcangeli wrote:
> >>
> >>>Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >>>patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >>>harddisk (like I just did), and then send me via email the diff of the
> >>>last Changeset that Linus applied to his tree with author, date,
> >>>comments etc...  If you can do that, you're completely right and at
> >>>least personally I will agree 100% with you, again: iff you can.
> >>
> >>
> >>You're missing the point:
> >>
> >>A BK exporter is useful.  A BK clone is not.
> >>
> >>If Pavel is _not_ attempting to clone BK, then I retract my arguments. :)
> >
> >
> >hey, in your previous email you claimed all we need is the patched CSSC,
> >you change topic quick! Glad you agree CSSC alone is useless and to make
> >anything useful with Rik's *notrademarkhere* tree we need a true
> >*notrademarkhere* exporter (of course the exporter will be backed by
> >CSSC to extract the single file changes, since they're in SCCS format
> >and it would be pointless to reinvent the wheel).
> 
> I have not changed the topic, you are still missing my point.

your point is purely theoretical at this point in time. bitbucket is so
far from being an efficient exporter that arguing right now about
stopping at the exporter or going ahead to clone it completely is a
totally pointless discussion.

Once it will be a fully functional exporter please raise your point
again, only then it will make sense to discuss your point.

I'm not even convinced it will become a full exporter if Larry finally
provides the kernel data via an open protocol stored in an open format
as he promised us some weeks ago, go figure how much I can care what it
will become after it has the readonly capability.

> Let us get this small point out of the way:  I agree that GNU CSSC 
> cannot read the BitKeeper ChangeSet file, which is a file critical for 
> getting the "weave" correct.

This is not what I understood from your previous email:

	"BK format"?  Not really.  Patches have been posted (to lkml, even) to
	GNU CSSC which allow it to read SCCS files BK reads and writes.

	Since that already exists, a full BitKeeper clone is IMO a bit silly,

now you're saying something completely different, you're saying, "yes the
CSSC obviously isn't enough and we _only_ _need_ the exporter but please
don't do more than the exporter or it will waste development
resources". This is why you changed topic as far as I'm concerned, but
no problem, I'm glad we agree the exporter is useful now.

> To me, a "BK clone, read only for now" is vastly different from a "BK 
> exporter".  The "for now" clearly implies that it will eventually 
> attempt to be a full SCM.

Why do you care that much now? I can't care less. Period. I need the
exporter and for me the exporter or the bk-clone-read-only is the same
thing, I don't mind if I've to run `bk` or `exportbk` or rsync or
whatever to get the data out.

Whether bitbucket will become much better than bitkeeper 100 years from now,
much better than a clone, is something I couldn't care less about at this
point in time, and it may be the best or worst thing that will happen to the
whole SCM open source arena, you can't know, I can't know, nobody can
know at this point in time.

You agreed the exporter is useful, so we agree, I don't mind what will
happen after the useful thing is available, it's the last of my worries,
and until we reach that point obviously there is no risk of reinventing the
wheel (unless the data becomes available in an open protocol first).

> Why do we need Yet Another Open Source SCM?
> Why does Pavel not work on an existing open source SCM, to enable it to 
> read/write BitKeeper files?

bitbucket could be merged into any SCM at any time, it is _the
exporter_ that the other SCM needs to import from the *notrademarkhere*
trees.

> These are the key questions which bother me.
> 
> Why do they bother me?
> 
> The open source world does not need yet another project that is "not 
> quite as good as BitKeeper."  The open source world needs something that 
>  can do all that BitKeeper does, and more :)  A BK clone would be in a 
> perpetual state of "not quite as good as BitKeeper".

Disagree, if it becomes more than a read-only thing, it will likely
become as good as and most probably better than bitkeeper (maybe not
graphical but still usable) because that would mean it has the critical mass
of development power _iff_ it can reach that point. But at this point in time
I doubt it will become more than an exporter, in fact I even doubt it
will become a full exporter if Larry saves us from wasting time. I
personally would have no interest in bitbucket if Linus provided
the data in an open protocol for efficient downloads and in an open format
for backup-archive downloads as we discussed some weeks ago.

But again, what bitbucket will become after it is a functional
exporter (i.e. your "point") is entirely pointless to argue about right
now IMHO. But feel free to keep discussing it with others if you think
it matters right now (now that I made my point clear, I probably won't
feel the need to answer since my interest in that matter is so low).

Andrea


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 18:16               ` Andrea Arcangeli
@ 2003-03-02 20:12                 ` Jeff Garzik
  2003-03-02 21:49                   ` Geert Uytterhoeven
  2003-03-03 18:37                 ` Larry McVoy
  1 sibling, 1 reply; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02 20:12 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Alan Cox, Arador, Adam J. Richter, Linux Kernel Mailing List,
	pavel, pavel

Andrea Arcangeli wrote:
> your point is purely theoretical at this point in time. bitbucket is so
> far from being an efficient exporter that arguing right now about
> stopping at the exporter or going ahead to clone it completely is a
> totally pointless discussion.
> 
> Once it will be a fully functional exporter please raise your point
> again, only then it will make sense to discuss your point.

Ok, fair enough ;)


> I'm not even convinced it will become a full exporter if Larry finally
> provides the kernel data via an open protocol stored in an open format
> as he promised us some weeks ago, go figure how much I can care what it
> will become after it has the readonly capability.

I think this is a fair request.

IMO a good start would be to get BK to export its metadata for each 
changeset in XML.  Once that is accomplished, (a) nobody gives a damn 
about BK file format, and (b) it is easy to set up an automated, public 
distribution of XML changesets that can be imported into OpenCM, cvs, or 
whatever.
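
As a rough illustration of what such an export might look like, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The function name changeset_to_xml, the element names (changeset, author, date, comment, files, file) and the attribute names are all assumptions made up for this example; no actual BitKeeper export format is implied.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def changeset_to_xml(cset_id: str, author: str, date: str,
                     comment: str, files: List[Dict[str, str]]) -> str:
    """Serialize one changeset's metadata using a hypothetical schema."""
    root = ET.Element("changeset", {"id": cset_id})
    ET.SubElement(root, "author").text = author
    ET.SubElement(root, "date").text = date
    ET.SubElement(root, "comment").text = comment
    files_el = ET.SubElement(root, "files")
    for entry in files:
        # One element per touched file, with its per-file revision.
        ET.SubElement(files_el, "file",
                      {"path": entry["path"], "revision": entry["revision"]})
    # encoding="unicode" makes tostring() return str instead of bytes.
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values only, not real kernel history.
    print(changeset_to_xml("example-1", "someone@example.org", "2003-03-02",
                           "example comment",
                           [{"path": "a.c", "revision": "1.2"}]))
```

An importer for another SCM would then only need an XML parser and the schema description, not any knowledge of the underlying file format.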


>>Let us get this small point out of the way:  I agree that GNU CSSC 
>>cannot read the BitKeeper ChangeSet file, which is a file critical for 
>>getting the "weave" correct.
> 
> 
> This is not what I understood from your previous email:
> 
> 	"BK format"?  Not really.  Patches have been posted (to lkml, even) to
> 	GNU CSSC which allow it to read SCCS files BK reads and writes.
> 	
> 	Since that already exists, a full BitKeeper clone is IMO a bit silly,
> 
> now you're saying something completely different, you're saying, "yes the
> CSSC obviously isn't enough and we _only_ _need_ the exporter but please
> don't do more than the exporter or it will waste development
> resources". This is why you changed topic as far as I'm concerned, but
> no problem, I'm glad we agree the exporter is useful now.

I am sorry for the misunderstanding then.  Let me quote from an email I 
sent to you yesterday:

	A BK exporter is useful.

So I think we do agree :)


>>To me, a "BK clone, read only for now" is vastly different from a "BK 
>>exporter".  The "for now" clearly implies that it will eventually 
>>attempt to be a full SCM.
> 
> 
> Why do you care that much now? I can't care less. Period. I need the
> exporter and for me the exporter or the bk-clone-read-only is the same
> thing, I don't mind if I've to run `bk` or `exportbk` or rsync or
> whatever to get the data out.
> 
> Whether bitbucket will become much better than bitkeeper 100 years from now,
> much better than a clone, is something I couldn't care less about at this
> point in time, and it may be the best or worst thing that will happen to the
> whole SCM open source arena, you can't know, I can't know, nobody can
> know at this point in time.
> 
> You agreed the exporter is useful, so we agree, I don't mind what will
> happen after the useful thing is available, it's the last of my worries,
> and until we reach that point obviously there is no risk of reinventing the
> wheel (unless the data becomes available in an open protocol first).


Yes.  As you see, I care about the future and not the present, in my 
arguments:  I believe that a BK clone may hurt the overall [future] 
effort of creating a good quality open source SCM.  So, in my mind I 
separate the two topics of "BK exporter" and "future BK clone."


To get back to the topic of "BK exporter", I think it is more productive 
to get Larry to export in an open file format.  I will work with him 
this week to do that.  Reading the BK format itself may be interesting 
to some, but I would rather have BitMover do the work and export in an 
open file format ;-)  Reading BK format directly is "chasing a moving 
target" in my opinion.

	Jeff





* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 20:12                 ` Jeff Garzik
@ 2003-03-02 21:49                   ` Geert Uytterhoeven
  0 siblings, 0 replies; 155+ messages in thread
From: Geert Uytterhoeven @ 2003-03-02 21:49 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

On Sun, 2 Mar 2003, Jeff Garzik wrote:
> Andrea Arcangeli wrote:
> > I'm not even convinced it will become a full exporter if Larry finally
> > provides the kernel data via an open protocol stored in an open format
> > as he promised us some week ago, go figure how much I can care what it
> > will become after it has the readonly capability.
> 
> I think this is a fair request.
> 
> IMO a good start would be to get BK to export its metadata for each 
> changeset in XML.  Once that is accomplished, (a) nobody gives a damn 
> about BK file format, and (b) it is easy to set up an automated, public 
> distribution of XML changesets that can be imported into OpenCM, cvs, or 
> whatever.

Read: an XML schema with a public, open specification?

Ask Microsoft how to `encrypt' documents using an `open' standard like XML...
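
For concreteness, per-changeset XML metadata of the kind Jeff proposes might look roughly like the output of the sketch below. The element and attribute names (`changeset`, `key`, `author`, `date`, `comment`, `files`) and all sample values are invented for illustration; nothing in this thread specifies a schema, and BitMover had not published one.

```python
import xml.etree.ElementTree as ET


def changeset_to_xml(key, author, date, comment, files):
    """Serialize one changeset's metadata to XML (hypothetical schema)."""
    cset = ET.Element("changeset", {"key": key})
    ET.SubElement(cset, "author").text = author
    ET.SubElement(cset, "date").text = date
    ET.SubElement(cset, "comment").text = comment
    files_el = ET.SubElement(cset, "files")
    for path in files:
        ET.SubElement(files_el, "file", {"path": path})
    return ET.tostring(cset, encoding="unicode")


# All values here are placeholders, not real changeset data.
xml_text = changeset_to_xml(
    key="example-key-0001",
    author="someone@example.com",
    date="2003-03-02T21:49:00+00:00",
    comment="[PATCH] example change",
    files=["fs/example.c"],
)
print(xml_text)
```

Geert's objection still applies to a sketch like this: the export is only "open" if the meaning of every element is publicly documented, not merely because the container is XML.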

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds



* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 18:16               ` Andrea Arcangeli
  2003-03-02 20:12                 ` Jeff Garzik
@ 2003-03-03 18:37                 ` Larry McVoy
  2003-03-03 18:46                   ` Larry McVoy
  2003-03-03 22:57                   ` Andrea Arcangeli
  1 sibling, 2 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-03 18:37 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel, hch

How close is http://www.bitmover.com/EXPORT to what you want (3MB file).

Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,
in practice the granularity would be at least as fine as each of Linus'
pushes and finer if possible.  We can't capture all the branching structure
in patches, there is too much parallelism, but what we can do is capture
each push that Linus does and if he did more than one merge in that push,
we can break it up into each merge.  

We can also provide this as a BK url on bkbits for any cset or range of
csets (we'll have to get another T1 line but I don't see a way around that).

This should give enough information that anyone could build their own 
BK 2 SVN gateway (or whatever, we're doing the CVS one).

Also, here's what Linus' recent pushes look like WITHOUT breaking it into
each merge, we're still working on that code:

     57 csets on 2003/03/03 08:49:44 
      5 csets on 2003/03/02 21:30:31 
     28 csets on 2003/03/02 21:04:02 
      1 csets on 2003/03/02 10:19:24 
     49 csets on 2003/03/01 19:03:58 
      2 csets on 2003/03/01 11:04:04 
      5 csets on 2003/03/01 09:19:24 
      1 csets on 2003/02/28 19:34:30 
     37 csets on 2003/02/28 15:30:29 
      8 csets on 2003/02/28 15:18:12 
     23 csets on 2003/02/28 15:05:08 
     31 csets on 2003/02/27 23:30:05 
     16 csets on 2003/02/27 09:15:07 
     11 csets on 2003/02/27 07:45:06 
     47 csets on 2003/02/26 23:09:53 
     32 csets on 2003/02/25 21:35:34 
     24 csets on 2003/02/25 18:34:41 
     22 csets on 2003/02/25 15:49:41 
     14 csets on 2003/02/24 21:23:34 
      3 csets on 2003/02/24 15:19:44 
      1 csets on 2003/02/24 11:16:14 
     15 csets on 2003/02/24 11:00:36 
      4 csets on 2003/02/24 10:48:49 
      1 csets on 2003/02/24 10:03:36 
     15 csets on 2003/02/24 09:49:34 
      1 csets on 2003/02/23 20:33:00 
      3 csets on 2003/02/23 11:15:28 
      8 csets on 2003/02/23 11:01:10 
      6 csets on 2003/02/23 10:49:14 
      2 csets on 2003/02/22 19:32:35 
      4 csets on 2003/02/22 16:17:27 
      1 csets on 2003/02/22 12:45:28 
     76 csets on 2003/02/22 12:34:13 
      1 csets on 2003/02/21 20:18:19 
      6 csets on 2003/02/21 19:49:32 
     86 csets on 2003/02/21 18:03:23 
      3 csets on 2003/02/21 16:18:24 
     30 csets on 2003/02/21 14:14:48 
      1 csets on 2003/02/21 10:18:19 
      1 csets on 2003/02/21 09:49:15 
etc.
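
A listing in this shape is easy to tally mechanically. The following is a minimal sketch, assuming every entry has exactly the form `N csets on YYYY/MM/DD HH:MM:SS`; the function name and the four-line sample are illustrative only.

```python
import re

# One push per line: "<count> csets on <date> <time>"
PUSH_LINE = re.compile(
    r"^\s*(\d+) csets on (\d{4}/\d{2}/\d{2}) (\d{2}:\d{2}:\d{2})\s*$"
)


def parse_pushes(text):
    """Return a list of (cset_count, timestamp_string) tuples."""
    pushes = []
    for line in text.splitlines():
        m = PUSH_LINE.match(line)
        if m:
            pushes.append((int(m.group(1)), m.group(2) + " " + m.group(3)))
    return pushes


# The first four entries of the listing above.
sample = """\
     57 csets on 2003/03/03 08:49:44
      5 csets on 2003/03/02 21:30:31
     28 csets on 2003/03/02 21:04:02
      1 csets on 2003/03/02 10:19:24
"""
pushes = parse_pushes(sample)
total = sum(count for count, _ in pushes)
print(len(pushes), "pushes,", total, "csets")
```

Run over all 40 entries quoted above, the same tally comes to 681 csets. Note that the listing only says how many csets each push contained, not how those csets got into the tree.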
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 18:37                 ` Larry McVoy
@ 2003-03-03 18:46                   ` Larry McVoy
  2003-03-03 22:57                   ` Andrea Arcangeli
  1 sibling, 0 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-03 18:46 UTC (permalink / raw)
  To: Andrea Arcangeli, Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel, hch

On Mon, Mar 03, 2003 at 10:37:34AM -0800, Larry McVoy wrote:
> How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
> 
> Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,

This was too big, I replaced it with the diffs + comments for the last push
Linus did.  Even this is pretty big, he pulled 57 csets from DaveM if I 
understand things properly.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 18:37                 ` Larry McVoy
  2003-03-03 18:46                   ` Larry McVoy
@ 2003-03-03 22:57                   ` Andrea Arcangeli
  2003-03-03 23:14                     ` Pavel Machek
  2003-03-03 23:56                     ` David Lang
  1 sibling, 2 replies; 155+ messages in thread
From: Andrea Arcangeli @ 2003-03-03 22:57 UTC (permalink / raw)
  To: Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

On Mon, Mar 03, 2003 at 10:37:34AM -0800, Larry McVoy wrote:
> How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
> 
> Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,

I'm probably missing something obvious but it's not clear to me how to
extract the changeset info from this format.

Let's assume I want to extract this changeset:

ChangeSet@1.1021, 2003-02-24 10:49:30-08:00, randy.dunlap@verizon.net
  [PATCH] convert /proc/io{mem,ports} to seq_file

  This converts /proc/io{mem,ports} to the seq_file interface
  (single_open).

How can I?

I mean, the above format is fine, as long as we have a file like that per
changeset (or alternatively per Linus's merge, even if not for every
single changeset, when he does the pulls). Clearly a file of that format
for a 2.5.62->63 diff is not fine-grained enough.

Correct me if I'm wrong, but if I understand correctly the changeset numbers
aren't fixed in the bitkeeper tree: a changeset number can change while
the merging happens across different cloned trees. So in short, the
changeset numbers are useless to the outside (but still providing them
won't hurt as long as nobody relies on them).

> in practice the granularity would be at least as fine as each of Linus'
> pushes and finer if possible.  We can't capture all the branching structure
> in patches, there is too much parallelism, but what we can do is capture
> each push that Linus does and if he did more than one merge in that push,
> we can break it up into each merge.  
> 
> We can also provide this as a BK url on bkbits for any cset or range of
> csets (we'll have to get another T1 line but I don't see a way around that).

If that hurts, you could simply upload them to kernel.org.  Even if it's
not a file, can't you simply check in to a remote cvs on kernel.org or
osdl.org or sourceforge, or wherever else, so you won't need to
pay for it? It's up to you of course, but I'm sure you're not forced to
pay for this service (besides the one-time setup of the exports,
which I hope won't generate any maintenance overhead for you).

> This should give enough information that anyone could build their own 
> BK 2 SVN gateway (or whatever, we're doing the CVS one).

Yes, as long as this file format is per-merge I think this is all we
need. This way it will be usable to check out, browse and regenerate the
tree, unlike the cset directory currently on kernel.org.

> Also, here's what Linus' recent pushes look like WITHOUT breaking it into
> each merge, we're still working on that code:
> 
>      57 csets on 2003/03/03 08:49:44 
>       5 csets on 2003/03/02 21:30:31 
>      28 csets on 2003/03/02 21:04:02 
>       1 csets on 2003/03/02 10:19:24 
>      49 csets on 2003/03/01 19:03:58 
>       2 csets on 2003/03/01 11:04:04 
>       5 csets on 2003/03/01 09:19:24 
>       1 csets on 2003/02/28 19:34:30 
>      37 csets on 2003/02/28 15:30:29 
>       8 csets on 2003/02/28 15:18:12 
>      23 csets on 2003/02/28 15:05:08 
>      31 csets on 2003/02/27 23:30:05 
>      16 csets on 2003/02/27 09:15:07 
>      11 csets on 2003/02/27 07:45:06 
>      47 csets on 2003/02/26 23:09:53 
>      32 csets on 2003/02/25 21:35:34 
>      24 csets on 2003/02/25 18:34:41 
>      22 csets on 2003/02/25 15:49:41 
>      14 csets on 2003/02/24 21:23:34 
>       3 csets on 2003/02/24 15:19:44 
>       1 csets on 2003/02/24 11:16:14 
>      15 csets on 2003/02/24 11:00:36 
>       4 csets on 2003/02/24 10:48:49 
>       1 csets on 2003/02/24 10:03:36 
>      15 csets on 2003/02/24 09:49:34 
>       1 csets on 2003/02/23 20:33:00 
>       3 csets on 2003/02/23 11:15:28 
>       8 csets on 2003/02/23 11:01:10 
>       6 csets on 2003/02/23 10:49:14 
>       2 csets on 2003/02/22 19:32:35 
>       4 csets on 2003/02/22 16:17:27 
>       1 csets on 2003/02/22 12:45:28 
>      76 csets on 2003/02/22 12:34:13 
>       1 csets on 2003/02/21 20:18:19 
>       6 csets on 2003/02/21 19:49:32 
>      86 csets on 2003/02/21 18:03:23 
>       3 csets on 2003/02/21 16:18:24 
>      30 csets on 2003/02/21 14:14:48 
>       1 csets on 2003/02/21 10:18:19 
>       1 csets on 2003/02/21 09:49:15 

Just curious, this also means that at least around the 80% of merges
in Linus's tree is submitted via a bitkeeper pull, right?

Andrea


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 22:57                   ` Andrea Arcangeli
@ 2003-03-03 23:14                     ` Pavel Machek
  2003-03-03 23:56                     ` David Lang
  1 sibling, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-03 23:14 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

Hi!

> > How close is http://www.bitmover.com/EXPORT to what you want (3MB file).
> > 
> > Note that this is very coarse granularity, it's very 2.5.62 up to 2.5.63,
> 
> I'm probably missing something obvious but it's not clear to me how to
> extract the changeset info from this format.

Is that format parsable at all? It looks like strange changeset
comments could confuse parsers...
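
Whether the format is parsable depends on details the thread does not settle. As a rough sketch, assume each changeset starts with an unindented header of the form `ChangeSet@rev, date, author` (the form `bk changes` prints elsewhere in this thread) and that every comment line is indented by two spaces. Both are assumptions about the export, not documented facts. Under those assumptions a comment cannot be mistaken for a header, because headers start in column 0.

```python
import re

# Unindented header line: "ChangeSet@<rev>, <date> <time+tz>, <author>"
HEADER = re.compile(
    r"^ChangeSet@(?P<rev>[\d.]+), (?P<date>\S+ \S+), (?P<author>\S+)$"
)


def parse_changesets(text):
    """Split text into changesets with rev, date, author and comment."""
    csets = []
    current = None
    for line in text.splitlines():
        m = HEADER.match(line)
        if m:
            current = {**m.groupdict(), "comment": []}
            csets.append(current)
        elif current is not None and (line.startswith("  ") or not line.strip()):
            # Indented or blank line: still part of the current comment.
            current["comment"].append(line[2:] if line.startswith("  ") else "")
        else:
            # Anything else (for example diff text) ends the comment.
            current = None
    for cset in csets:
        cset["comment"] = "\n".join(cset["comment"]).strip()
    return csets


sample = """\
ChangeSet@1.1021, 2003-02-24 10:49:30-08:00, randy.dunlap@verizon.net
  [PATCH] convert /proc/io{mem,ports} to seq_file

  This converts /proc/io{mem,ports} to the seq_file interface
  (single_open).
"""
csets = parse_changesets(sample)
print(csets[0]["rev"], csets[0]["author"])
```

If the real export does not indent comments, a comment line that itself begins with `ChangeSet@` would break this parser, which is the kind of confusion raised above.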

> Let's assume I want to extract this changeset:
> 
> ChangeSet@1.1021, 2003-02-24 10:49:30-08:00, randy.dunlap@verizon.net
>   [PATCH] convert /proc/io{mem,ports} to seq_file
> 
>   This converts /proc/io{mem,ports} to the seq_file interface
>   (single_open).
> 
> How can I?
> 
> I mean, the above format is fine, as long as we have a file like that per
> changeset (or alternatively per Linus's merge, even if not for every
> single changeset, when he does the pulls). Clearly a file of that format
> for a 2.5.62->63 diff is not fine-grained enough.

Ben's bitsubversion script is somewhat slow, but should be capable of
pulling any diff you want...
								Pavel

-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 22:57                   ` Andrea Arcangeli
  2003-03-03 23:14                     ` Pavel Machek
@ 2003-03-03 23:56                     ` David Lang
  2003-03-04  0:02                       ` Jeff Garzik
  2003-03-04  2:20                       ` Martin J. Bligh
  1 sibling, 2 replies; 155+ messages in thread
From: David Lang @ 2003-03-03 23:56 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

On Mon, 3 Mar 2003, Andrea Arcangeli wrote:

> Just curious, this also means that at least around the 80% of merges
> in Linus's tree is submitted via a bitkeeper pull, right?
>
> Andrea

remember how Linus works, all normal patches get copied into a single
large patch file as he reads his mail then he runs patch to apply them to
the tree. I think this would make the entire batch of messages look like
one cset.

David Lang


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 23:56                     ` David Lang
@ 2003-03-04  0:02                       ` Jeff Garzik
  2003-03-04  0:05                         ` Larry McVoy
  2003-03-04  0:15                         ` Andrea Arcangeli
  2003-03-04  2:20                       ` Martin J. Bligh
  1 sibling, 2 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-04  0:02 UTC (permalink / raw)
  To: David Lang
  Cc: Andrea Arcangeli, Larry McVoy, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

David Lang wrote:
> On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
> 
> 
>>Just curious, this also means that at least around the 80% of merges
>>in Linus's tree is submitted via a bitkeeper pull, right?
>>
>>Andrea
> 
> 
> remember how Linus works, all normal patches get copied into a single
> large patch file as he reads his mail then he runs patch to apply them to
> the tree. I think this would make the entire batch of messages look like
> one cset.


Not correct.  His commits properly separate the patches out into 
individual csets.

	Jeff




* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  0:02                       ` Jeff Garzik
@ 2003-03-04  0:05                         ` Larry McVoy
  2003-03-04  0:15                         ` Andrea Arcangeli
  1 sibling, 0 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-04  0:05 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: David Lang, Andrea Arcangeli, Larry McVoy, Alan Cox, Arador,
	Adam J. Richter, Linux Kernel Mailing List, pavel, pavel

On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:
> David Lang wrote:
> >On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
> >
> >
> >>Just curious, this also means that at least around the 80% of merges
> >>in Linus's tree is submitted via a bitkeeper pull, right?
> >>
> >>Andrea
> >
> >
> >remember how Linus works, all normal patches get copied into a single
> >large patch file as he reads his mail then he runs patch to apply them to
> >the tree. I think this would make the entire batch of messages look like
> >one cset.
> 
> 
> Not correct.  His commits properly separate the patches out into 
> individual csets.

And we've written code which finds the longest path through the graph
to get the finest granularity; when run on his tree we get 8138 nodes.
That is 43% of the 18837 nodes possible.  The trunk only includes
1068 nodes.  So we can do a very good job exporting to CVS.
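
The idea of picking the longest path through the changeset graph can be illustrated with a generic longest-path computation on a DAG. This is only a sketch of the textbook technique, not BitMover's code: the `parents` mapping, the node names and the tiny example graph are all made up, and it relies on `graphlib` from the Python 3.9+ standard library.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def longest_path(parents):
    """Return the longest chain of nodes in a DAG.

    `parents` maps each node to the list of its parent nodes
    (an empty list for a root).
    """
    best_len = {}   # length of the longest chain ending at each node
    best_prev = {}  # predecessor on that chain
    # static_order() yields every node after all of its parents.
    for node in TopologicalSorter(parents).static_order():
        best_len[node] = 1
        best_prev[node] = None
        for p in parents.get(node, ()):
            if best_len[p] + 1 > best_len[node]:
                best_len[node] = best_len[p] + 1
                best_prev[node] = p
    if not best_len:
        return []
    node = max(best_len, key=best_len.get)
    path = []
    while node is not None:
        path.append(node)
        node = best_prev[node]
    return path[::-1]


# Toy graph: "e" is a merge of two branches, a-b-d and a-c.
parents = {"a": [], "b": ["a"], "c": ["a"], "d": ["b"], "e": ["d", "c"]}
path = longest_path(parents)
print(path, f"{len(path)}/{len(parents)} nodes")

# The figure quoted above: 8138 of 18837 nodes.
print(round(100 * 8138 / 18837), "%")
```

A linear export such as CVS can only follow one chain, so the nodes left off the chosen path (here `c`) have to be folded into the merge that brings them in.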
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  0:02                       ` Jeff Garzik
  2003-03-04  0:05                         ` Larry McVoy
@ 2003-03-04  0:15                         ` Andrea Arcangeli
  2003-03-04  0:30                           ` Jeff Garzik
  1 sibling, 1 reply; 155+ messages in thread
From: Andrea Arcangeli @ 2003-03-04  0:15 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: David Lang, Larry McVoy, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:
> David Lang wrote:
> >On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
> >
> >
> >>Just curious, this also means that at least around the 80% of merges
> >>in Linus's tree is submitted via a bitkeeper pull, right?
> >>
> >>Andrea
> >
> >
> >remember how Linus works, all normal patches get copied into a single
> >large patch file as he reads his mail then he runs patch to apply them to
> >the tree. I think this would make the entire batch of messages look like
> >one cset.
> 
> 
> Not correct.  His commits properly separate the patches out into 
> individual csets.

and they're unusable as a source to regenerate a tree. I had similar
issues with the web too. To make use of the single csets you need to
implement the internal bitkeeper branching knowledge too. Not to mention
that the cset numbers apparently change all the time.

Andrea


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  0:15                         ` Andrea Arcangeli
@ 2003-03-04  0:30                           ` Jeff Garzik
  0 siblings, 0 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-04  0:30 UTC (permalink / raw)
  To: Andrea Arcangeli
  Cc: David Lang, Larry McVoy, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

Andrea Arcangeli wrote:
> On Mon, Mar 03, 2003 at 07:02:28PM -0500, Jeff Garzik wrote:
> 
>>David Lang wrote:
>>
>>>On Mon, 3 Mar 2003, Andrea Arcangeli wrote:
>>>
>>>
>>>
>>>>Just curious, this also means that at least around the 80% of merges
>>>>in Linus's tree is submitted via a bitkeeper pull, right?
>>>>
>>>>Andrea
>>>
>>>
>>>remember how Linus works, all normal patches get copied into a single
>>>large patch file as he reads his mail then he runs patch to apply them to
>>>the tree. I think this would make the entire batch of messages look like
>>>one cset.
>>
>>
>>Not correct.  His commits properly separate the patches out into 
>>individual csets.
> 
> 
> and they're unusable as a source to regenerate a tree. I had similar
> issues with the web too. To make use of the single csets you need to
> implement the internal bitkeeper branching knowledge too. Not to mention
> that the cset numbers apparently change all the time.


The "weave", or order of csets, certainly changes each time Linus does a 
'bk pull'.  I wonder if a 'cset_order' file would be useful -- an 
automated job uses BK to export the weave for a specific point in time. 
  One could use that to glue the csets together, perhaps?

WRT cset numbers, ignore them.  Each cset has a unique key.  When 
setting up the 2.5 snapshot cron job, Linus asked me to export this key 
so that the definitive top-of-tree may be identified, regardless of cset 
number.  Here is an example:
ftp://ftp.kernel.org/pub/linux/kernel/v2.5/snapshots/patch-2.5.63-bk6.key
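
To make the 'cset_order' idea concrete, here is a minimal sketch of gluing exported csets together by key rather than by cset number. Everything in it is hypothetical: no such file exists yet, the key strings are opaque placeholders (the real key format is not shown in this thread), and `order_changesets` is an invented helper.

```python
def order_changesets(cset_order, patches_by_key):
    """Return patches in the order given by a list of cset keys.

    cset_order:     keys as they would appear, one per line, in a
                    hypothetical 'cset_order' file for one point in time
    patches_by_key: mapping from cset key to that cset's patch text
    """
    missing = [key for key in cset_order if key not in patches_by_key]
    if missing:
        raise KeyError(f"no patch for keys: {missing}")
    return [patches_by_key[key] for key in cset_order]


# Placeholder keys; the lookup never consults a cset number.
order = ["key-A", "key-B", "key-C"]
patches = {"key-C": "patch 3", "key-A": "patch 1", "key-B": "patch 2"}
ordered = order_changesets(order, patches)
print(ordered)
```

Because the lookup is by key, it does not matter if a later 'bk pull' renumbers the csets; only the order file would need regenerating.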

	Jeff





* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 23:56                     ` David Lang
  2003-03-04  0:02                       ` Jeff Garzik
@ 2003-03-04  2:20                       ` Martin J. Bligh
  2003-03-04  5:29                         ` Linus Torvalds
  1 sibling, 1 reply; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-04  2:20 UTC (permalink / raw)
  To: David Lang, Andrea Arcangeli
  Cc: Larry McVoy, Jeff Garzik, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel

>> Just curious, this also means that at least around the 80% of merges
>> in Linus's tree is submitted via a bitkeeper pull, right?
>> 
>> Andrea
> 
> remember how Linus works, all normal patches get copied into a single
> large patch file as he reads his mail then he runs patch to apply them to
> the tree. I think this would make the entire batch of messages look like
> one cset.

I think he also creates subtrees, applies flat patches to those, then 
merges the subtrees back into his main tree as a bk-merge ... won't that 
distort the stats? 

M.



* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  2:20                       ` Martin J. Bligh
@ 2003-03-04  5:29                         ` Linus Torvalds
  2003-03-04  5:56                           ` Dimitrie O. Paun
  0 siblings, 1 reply; 155+ messages in thread
From: Linus Torvalds @ 2003-03-04  5:29 UTC (permalink / raw)
  To: linux-kernel

In article <592860000.1046744403@flay>,
Martin J. Bligh <mbligh@aracnet.com> wrote:
>>> Just curious, this also means that at least around the 80% of merges
>>> in Linus's tree is submitted via a bitkeeper pull, right?
>>> 
>>> Andrea
>> 
>> remember how Linus works, all normal patches get copied into a single
>> large patch file as he reads his mail then he runs patch to apply them to
>> the tree. I think this would make the entire batch of messages look like
>> one cset.

Nope. All my tools are very careful about making single csets from
single patches. Without that, you can't get good history and changelog
files, and you can't undo or test single patches.

What I _do_ do is to "batch up" patches, which you can see if you take a
look at the times for various changesets. I will save many emails to one
single "pending" file (I call it "doit"), and then my tools will apply
each of them in sequence as one "batch" of files. You can see the effect
of this by just doing

	bk changes | grep ChangeSet | less

and seeing how the changes are just a few seconds apart. For example,
here's the last batch I have from Andrew, and you can see that my
scripts applied 19 patches in sequence:

	ChangeSet@1.1058, 2003-03-02 20:38:36-08:00, akpm@digeo.com
	ChangeSet@1.1057, 2003-03-02 20:38:23-08:00, akpm@digeo.com
	ChangeSet@1.1056, 2003-03-02 20:38:15-08:00, akpm@digeo.com
	ChangeSet@1.1055, 2003-03-02 20:38:09-08:00, akpm@digeo.com
	ChangeSet@1.1054, 2003-03-02 20:38:02-08:00, akpm@digeo.com
	ChangeSet@1.1053, 2003-03-02 20:37:54-08:00, akpm@digeo.com
	ChangeSet@1.1052, 2003-03-02 20:37:48-08:00, akpm@digeo.com
	ChangeSet@1.1051, 2003-03-02 20:37:41-08:00, akpm@digeo.com
	ChangeSet@1.1050, 2003-03-02 20:37:34-08:00, akpm@digeo.com
	ChangeSet@1.1049, 2003-03-02 20:37:26-08:00, akpm@digeo.com
	ChangeSet@1.1048, 2003-03-02 20:37:19-08:00, akpm@digeo.com
	ChangeSet@1.1047, 2003-03-02 20:37:13-08:00, akpm@digeo.com
	ChangeSet@1.1046, 2003-03-02 20:37:07-08:00, akpm@digeo.com
	ChangeSet@1.1045, 2003-03-02 20:36:59-08:00, akpm@digeo.com
	ChangeSet@1.1044, 2003-03-02 20:36:51-08:00, akpm@digeo.com
	ChangeSet@1.1043, 2003-03-02 20:36:44-08:00, akpm@digeo.com
	ChangeSet@1.1042, 2003-03-02 20:36:38-08:00, akpm@digeo.com
	ChangeSet@1.1041, 2003-03-02 20:36:31-08:00, akpm@digeo.com
	ChangeSet@1.1040, 2003-03-02 20:36:23-08:00, akpm@digeo.com

roughly 8 seconds between patches (that's how long it takes the scripts to
commit between each change.  Imagine doing a commit in 8 seconds using
CVS..)
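
The spacing can be checked from the listing itself. A small sketch, assuming each line has the `ChangeSet@rev, date, author` form shown above; the function name is made up, and parsing the `-08:00` offset with `%z` needs Python 3.7 or later.

```python
from datetime import datetime


def commit_gaps(lines):
    """Seconds between consecutive changesets, oldest first."""
    stamps = []
    for line in lines:
        # e.g. "ChangeSet@1.1058, 2003-03-02 20:38:36-08:00, akpm@digeo.com"
        fields = line.strip().split(", ")
        stamps.append(datetime.strptime(fields[1], "%Y-%m-%d %H:%M:%S%z"))
    stamps.sort()
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


# The three newest entries of the batch above.
sample = [
    "ChangeSet@1.1058, 2003-03-02 20:38:36-08:00, akpm@digeo.com",
    "ChangeSet@1.1057, 2003-03-02 20:38:23-08:00, akpm@digeo.com",
    "ChangeSet@1.1056, 2003-03-02 20:38:15-08:00, akpm@digeo.com",
]
gaps = commit_gaps(sample)
print(gaps)
```

Over the whole batch the first and last timestamps are 133 seconds apart across 18 gaps, a bit over 7 seconds each, which matches the "roughly 8 seconds" estimate.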

But all 19 emails ended up as separate changesets, and the only thing
the "batching" does is to make _me_ work more efficiently (ie I don't go
back and forth between reading email and applying one patch: I save the
batch away, I then look through the patches individually and possibly
edit and clean up the email commentary on it, and then I apply them all
in one go). 

>I think he also creates subtrees, applies flat patches to those, then 
>merges the subtrees back into his main tree as a bk-merge ... won't that 
>distort the stats? 

Yes, it will. 

I try to generally avoid doing parallel development with myself, partly
because it ends up _looking_ really confusing in revtool and thus
sometimes hard to find stuff, but partly just because I'm lazy and I
consider my main tree to be the "merge tree", so by default everything
_should_ go into that one tree if I really do want to merge it.

However, sometimes I get a big series of patches that was generated
against some specific kernel version, and then I'll set up a parallel
tree with the top at that specific version, so that I re-create exactly
what the original developer was working on. That way I avoid patch
rejects, and can take advantage of the automatic BK merge features.

It's not that common, though - I do it mostly if I know or suspect that
something will clash with existing changes in my tree, or if it's
something so fundamental that I want a separate branch for it (this was
the case for a lot of the fundamental VFS stuff Al Viro did earlier in
2.5.x, for example).

			Linus


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  5:29                         ` Linus Torvalds
@ 2003-03-04  5:56                           ` Dimitrie O. Paun
  2003-03-04 14:51                             ` Jeff Garzik
  0 siblings, 1 reply; 155+ messages in thread
From: Dimitrie O. Paun @ 2003-03-04  5:56 UTC (permalink / raw)
  To: Linus Torvalds, linux-kernel

On March 4, 2003 12:29 am, Linus Torvalds wrote:
> the case for a lot of the fundamental VFS stuff Al Viro did earlier in

Whatever happened to Al BTW? I really miss his patches as well as his
comments -- for the longest time I would follow l-k just to read his 
posts! I really hope he's alright, and that we'll hear from him soon...

-- 
Dimi.



* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  5:56                           ` Dimitrie O. Paun
@ 2003-03-04 14:51                             ` Jeff Garzik
  0 siblings, 0 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-04 14:51 UTC (permalink / raw)
  To: dpaun; +Cc: Linus Torvalds, linux-kernel

Dimitrie O. Paun wrote:
> On March 4, 2003 12:29 am, Linus Torvalds wrote:
> 
>>the case for a lot of the fundamental VFS stuff Al Viro did earlier in
> 
> 
> Whatever happened to Al BTW? I really miss his patches as well as his
> comments -- for the longest time I would follow l-k just to read his 
> posts! I really hope he's alright, and that we'll hear from him soon...


He's still alive and still smoking cigarettes, at least... ;-)

	Jeff





* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  1:45         ` Jeff Garzik
  2003-03-02  2:09           ` Andrea Arcangeli
@ 2003-03-02  3:29           ` H. Peter Anvin
  2003-03-02 17:12             ` Jeff Garzik
  2003-03-03  0:13           ` Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: H. Peter Anvin @ 2003-03-02  3:29 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <3E616224.6040003@pobox.com>
By author:    Jeff Garzik <jgarzik@pobox.com>
In newsgroup: linux.dev.kernel
> 
> You're missing the point:
> 
> A BK exporter is useful.  A BK clone is not.
> 

I disagree.  A BK clone would almost certainly be highly useful.  The
fact that it would happen to be compatible with one particular
proprietary tool released by one particular company doesn't change
that fact one iota; in fact, some people might find value in using the
proprietary tool for whatever reason (snazzy GUI, keeping the suits
happy, who knows...)

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: cris ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  3:29           ` H. Peter Anvin
@ 2003-03-02 17:12             ` Jeff Garzik
  2003-03-02 18:39               ` H. Peter Anvin
                                 ` (3 more replies)
  0 siblings, 4 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02 17:12 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

H. Peter Anvin wrote:
> Followup to:  <3E616224.6040003@pobox.com>
> By author:    Jeff Garzik <jgarzik@pobox.com>
> In newsgroup: linux.dev.kernel
> 
>>You're missing the point:
>>
>>A BK exporter is useful.  A BK clone is not.
>>
> 
> 
> I disagree.  A BK clone would almost certainly be highly useful.  The
> fact that it would happen to be compatible with one particular
> proprietary tool released by one particular company doesn't change
> that fact one iota; in fact, some people might find value in using the
> proprietary tool for whatever reason (snazzy GUI, keeping the suits
> happy, who knows...)

While people would certainly use it, I can't help but think that a BK 
clone would damage other open source SCM efforts.  I call this the 
"SourceForge Syndrome":

	Q. I found a problem/bug/annoyance, how do I solve it?
	A. Clearly, a brand new sourceforge project is called for.

My counter-question is, why not improve an _existing_ open source SCM to 
read and write BitKeeper files?  Why do we need yet another brand new 
project?

AFAICS, a BK clone would just further divide resources and mindshare.  I 
personally _want_ an open source SCM that is as good as, or better, than 
BitKeeper.  The open source world needs that, and BitKeeper needs the 
competition.  A BK clone may work with BitKeeper files, but I don't see 
it ever being as good as BK, because it will always be playing catch-up.

	Jeff


* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 17:12             ` Jeff Garzik
@ 2003-03-02 18:39               ` H. Peter Anvin
  2003-03-02 20:01                 ` Jeff Garzik
  2003-03-03  0:47               ` nickn
                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 155+ messages in thread
From: H. Peter Anvin @ 2003-03-02 18:39 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel

Jeff Garzik wrote:
> 
> My counter-question is, why not improve an _existing_ open source SCM to 
> read and write BitKeeper files?  Why do we need yet another brand new 
> project?
> 

I don't disagree with that.  However, the question you posited was 
"would one be useful", and I think the answer is unequivocally yes. 
Furthermore, I don't agree with the "compatibility == bad" assumption I 
read into your message.

> AFAICS, a BK clone would just further divide resources and mindshare.  I 
> personally _want_ an open source SCM that is as good as, or better, than 
> BitKeeper.  The open source world needs that, and BitKeeper needs the 
> competition.  A BK clone may work with BitKeeper files, but I don't see 
> it ever being as good as BK, because it will always be playing catch-up.

Yes.  Personally, I've spent quite a bit of time with OpenCM after a 
suggestion from Ted Ts'o.  It's looking quite promising to me, although 
I haven't yet used it to maintain a large project.

	-hpa




* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 18:39               ` H. Peter Anvin
@ 2003-03-02 20:01                 ` Jeff Garzik
  0 siblings, 0 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-02 20:01 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: linux-kernel

H. Peter Anvin wrote:
> Jeff Garzik wrote:
> 
>>
>> My counter-question is, why not improve an _existing_ open source SCM 
>> to read and write BitKeeper files?  Why do we need yet another brand 
>> new project?
>>
> 
> I don't disagree with that.  However, the question you posited was 
> "would one be useful", and I think the answer is unequivocally yes. 

Ok, I'll grant that.  :)

My main point is that I think a BK clone is detrimental to the overall 
open source SCM world.  I was thinking more along the lines of "useful 
to 'the cause'" ;-)


> Furthermore, I don't agree with the "compatibility == bad" assumption I 
> read into your message.

Well, I disagree with that assumption too :)  My main objection is that 
a BK clone would divert attention from another effort (such as OpenCM), 
with the end result that neither the BK clone nor OpenCM is as good as 
(or better than) BitKeeper.


>> AFAICS, a BK clone would just further divide resources and mindshare.  
>> I personally _want_ an open source SCM that is as good as, or better, 
>> than BitKeeper.  The open source world needs that, and BitKeeper needs 
>> the competition.  A BK clone may work with BitKeeper files, but I 
>> don't see it ever being as good as BK, because it will always be 
>> playing catch-up.
> 
> 
> Yes.  Personally, I've spent quite a bit of time with OpenCM after a 
> suggestion from Ted Ts'o.  It's looking quite promising to me, although 
> I haven't yet used it to maintain a large project.

Interesting...  Here's the link, in case others want to check it out:

	http://www.opencm.org/



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 17:12             ` Jeff Garzik
  2003-03-02 18:39               ` H. Peter Anvin
@ 2003-03-03  0:47               ` nickn
  2003-03-03  0:55                 ` David Lang
  2003-03-03  2:32                 ` Jeff Garzik
  2003-03-03 21:53               ` Joel Becker
  2003-03-06 16:41               ` Pavel Machek
  3 siblings, 2 replies; 155+ messages in thread
From: nickn @ 2003-03-03  0:47 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: H. Peter Anvin, linux-kernel

On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> My counter-question is, why not improve an _existing_ open source SCM to 
> read and write BitKeeper files?  Why do we need yet another brand new 
> project?

Or improve BK to export and import on demand of an existing open source SCM.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03  0:47               ` nickn
@ 2003-03-03  0:55                 ` David Lang
  2003-03-03  2:31                   ` Jeff Garzik
  2003-03-03  2:32                 ` Jeff Garzik
  1 sibling, 1 reply; 155+ messages in thread
From: David Lang @ 2003-03-03  0:55 UTC (permalink / raw)
  To: nickn; +Cc: Jeff Garzik, H. Peter Anvin, linux-kernel

I'm a little confused about the on-disk format.

Is it SCCS, and the problem is that CSSC doesn't recognise everything that
the latest SCCS does, so a patch is needed for CSSC?  Or does it differ
slightly from SCCS?

Larry has mentioned that there were things they changed from the base SCCS
format that they started with, but he indicated that they had fed patches
to SCCS to use the new info.

I'm trying to figure out if the problem is CSSC not being as compatible as
it would like to be, or Larry not getting the changes he is proposing
into SCCS, or whether there are other problems.

David Lang


On Mon, 3 Mar 2003, nickn wrote:

> Date: Mon, 3 Mar 2003 00:47:28 +0000
> From: nickn <nickn@www0.org>
> To: Jeff Garzik <jgarzik@pobox.com>
> Cc: H. Peter Anvin <hpa@zytor.com>, linux-kernel@vger.kernel.org
> Subject: Re: BitBucket: GPL-ed *notrademarkhere* clone
>
> On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> > My counter-question is, why not improve an _existing_ open source SCM to
> > read and write BitKeeper files?  Why do we need yet another brand new
> > project?
>
> Or improve BK to export and import on demand of an existing open source SCM.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03  0:55                 ` David Lang
@ 2003-03-03  2:31                   ` Jeff Garzik
  0 siblings, 0 replies; 155+ messages in thread
From: Jeff Garzik @ 2003-03-03  2:31 UTC (permalink / raw)
  To: David Lang; +Cc: nickn, H. Peter Anvin, linux-kernel

David Lang wrote:
> I'm a little confused about the on-disk format.
> 
> Is it SCCS, and the problem is that CSSC doesn't recognise everything that
> the latest SCCS does, so a patch is needed for CSSC?  Or does it differ
> slightly from SCCS?


CSSC can read the sfiles with the patch posted to lkml, but it cannot 
read the BitKeeper-specific files such as the all-important ChangeSet 
file.  ChangeSet is required to build the DAG that weaves all the sfiles 
together into the proper order.

	Jeff




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03  0:47               ` nickn
  2003-03-03  0:55                 ` David Lang
@ 2003-03-03  2:32                 ` Jeff Garzik
  2003-03-04  1:07                   ` Horst von Brand
  1 sibling, 1 reply; 155+ messages in thread
From: Jeff Garzik @ 2003-03-03  2:32 UTC (permalink / raw)
  To: nickn; +Cc: H. Peter Anvin, linux-kernel

nickn wrote:
> On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> 
>>My counter-question is, why not improve an _existing_ open source SCM to 
>>read and write BitKeeper files?  Why do we need yet another brand new 
>>project?
> 
> 
> Or improve BK to export and import on demand of an existing open source SCM.


That may be possible with OpenCM, but it's a bit of a stretch for the 
other existing SCMs.  Regardless, if BK can export metadata to an open 
format (such as a defined XML spec), then the SCM interchange 
possibilities are only limited by a programmer's time and imagination.

	Jeff




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03  2:32                 ` Jeff Garzik
@ 2003-03-04  1:07                   ` Horst von Brand
  2003-03-04  1:10                     ` H. Peter Anvin
  0 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-04  1:07 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: nickn, H. Peter Anvin, linux-kernel

Jeff Garzik <jgarzik@pobox.com> said:

[...]

> That may be possible with OpenCM, but it's a bit of a stretch for the 
> other existing SCMs.  Regardless, if BK can export metadata to an open 
> format (such as a defined XML spec),

Like something quite as obscure as unidiff?

>                                      then the SCM interchange 
> possibilities are only limited by a programmer's time and imagination.

Then we are done ;-)
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-04  1:07                   ` Horst von Brand
@ 2003-03-04  1:10                     ` H. Peter Anvin
  0 siblings, 0 replies; 155+ messages in thread
From: H. Peter Anvin @ 2003-03-04  1:10 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Jeff Garzik, nickn, linux-kernel

Horst von Brand wrote:
> Jeff Garzik <jgarzik@pobox.com> said:
> 
> [...]
> 
> 
>>That may be possible with OpenCM, but it's a bit of a stretch for the 
>>other existing SCMs.  Regardless, if BK can export metadata to an open 
>>format (such as a defined XML spec),
> 
> Like something quite as obscure as unidiff?
> 

Unidiff isn't metadata.  Jeff is talking about the metadata which gives
context to the unidiffs.

	-hpa



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 17:12             ` Jeff Garzik
  2003-03-02 18:39               ` H. Peter Anvin
  2003-03-03  0:47               ` nickn
@ 2003-03-03 21:53               ` Joel Becker
  2003-03-04 23:37                 ` Olaf Hering
  2003-03-06 16:47                 ` Pavel Machek
  2003-03-06 16:41               ` Pavel Machek
  3 siblings, 2 replies; 155+ messages in thread
From: Joel Becker @ 2003-03-03 21:53 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: H. Peter Anvin, linux-kernel

On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> My counter-question is, why not improve an _existing_ open source SCM to 
> read and write BitKeeper files?  Why do we need yet another brand new 
> project?

	Normally, I'd agree with you Jeff.  However, none of the current
open source SCM systems are architected in a way that can operate like
BK.
	I've been using subversion for a while now.  It pretty much
fixes all the problems that CVS had, AS LONG AS you accept the CVS style
of version control.  That style doesn't work for non-central work like
the kernel.
	The one thing BK does that makes it worthwhile is the three-way
merge.  This (and the resulting DAG) make handling code from Alan, from
Linus, from Andrew, and from everyone else possible.  With CVS,
subversion, or any other SCM I've worked with, you have to hand merge
anything past the first patch.  Ugh.
	This requires architecture, and (AFAIK) BitBucket is the first
try at it.  Compatibility with the proprietary tool that does it already
is a good thing.

Joel


-- 

"Can any of you seriously say the Bill of Rights could get through
 Congress today?  It wouldn't even get out of committee."
	- F. Lee Bailey

Joel Becker
Senior Member of Technical Staff
Oracle Corporation
E-mail: joel.becker@oracle.com
Phone: (650) 506-8127

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 21:53               ` Joel Becker
@ 2003-03-04 23:37                 ` Olaf Hering
  2003-03-06 16:47                 ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: Olaf Hering @ 2003-03-04 23:37 UTC (permalink / raw)
  To: Joel Becker; +Cc: Jeff Garzik, H. Peter Anvin, linux-kernel

 On Mon, Mar 03, Joel Becker wrote:

> On Sun, Mar 02, 2003 at 12:12:58PM -0500, Jeff Garzik wrote:
> > My counter-question is, why not improve an _existing_ open source SCM to 
> > read and write BitKeeper files?  Why do we need yet another brand new 
> > project?
> 
> 	Normally, I'd agree with you Jeff.  However, none of the current
> open source SCM systems are architected in a way that can operate like
> BK.

Ah, finally we got to the root of the "problem".

-- 
A: No.
Q: Should I include quotations after my reply?

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-03 21:53               ` Joel Becker
  2003-03-04 23:37                 ` Olaf Hering
@ 2003-03-06 16:47                 ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-06 16:47 UTC (permalink / raw)
  To: Joel Becker; +Cc: Jeff Garzik, H. Peter Anvin, linux-kernel

Hi!

> 	The one thing BK does that makes it worthwhile is the three-way
> merge.  This (and the resulting DAG) make handling code from Alan, from
> Linus, from Andrew, and from everyone else possible.  With CVS,
> subversion, or any other SCM I've worked with, you have to hand merge
> anything past the first patch.  Ugh.

What's so magical about 3way merge?
I thought it would be easy to do even
in CVS...
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02 17:12             ` Jeff Garzik
                                 ` (2 preceding siblings ...)
  2003-03-03 21:53               ` Joel Becker
@ 2003-03-06 16:41               ` Pavel Machek
  2003-03-07 11:24                 ` Tupshin Harper
  2003-03-07 21:53                 ` H. Peter Anvin
  3 siblings, 2 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-06 16:41 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: H. Peter Anvin, linux-kernel

Hi!

> While people would certainly use it, I can't help but think that a BK 
> clone would damage other open source SCM efforts.  I call this the 
> "SourceForge Syndrome":

Where do I get pills for that one? :-)

> 	Q. I found a problem/bug/annoyance, how do I solve it?
> 	A. Clearly, a brand new sourceforge project is called for.
> 
> My counter-question is, why not improve an _existing_ open source SCM 
> to read and write BitKeeper files?

I of course thought about that (I'm not yet
hit *that* hard by sf syndrome :-), but:

a) I might extend cssc, but bitbucket is
naturally a layer *over* cssc, and cssc
is a GNU program (copyright assignment
needed) and is written in C++. I do not
feel like writing new code in C++, and I
do not like their coding style.

b) take something else, *merge cssc into
it*, then add my stuff. Ouch. svn is out
because of licensing, cvs is not powerful
enough, and I do not like arch. (I did
not know about opencm, sorry). Ouch,
and this would mean a fork anyway,
because developers of that project
would probably not be happy about
those copyrights (FSF!).

c) so a new sf project is indeed the way to go :-(.

I hope you understand now,

				Pavel

-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-06 16:41               ` Pavel Machek
@ 2003-03-07 11:24                 ` Tupshin Harper
  2003-03-07 11:28                   ` Pavel Machek
  2003-03-07 21:53                 ` H. Peter Anvin
  1 sibling, 1 reply; 155+ messages in thread
From: Tupshin Harper @ 2003-03-07 11:24 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

Pavel Machek wrote:

>it*, then add my stuff. Ouch. svn is out
>because of licensing, cvs is not powerful
>
Could you or somebody else explain this repeated claim that the 
Subversion licensing is problematic?

I don't have anything to do with the project, but a quick perusal of the 
license doesn't reveal any problems. It's basically an Apache/BSD 
license (pasted below for your reading pleasure).

-Tupshin

------------Subversion copyright---------------

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright
notice, this list of conditions and the following disclaimer in the
documentation and/or other materials provided with the distribution.

3. The end-user documentation included with the redistribution, if
any, must include the following acknowledgment: "This product includes
software developed by CollabNet (http://www.Collab.Net/)."
Alternately, this acknowledgment may appear in the software itself, if
and wherever such third-party acknowledgments normally appear.

4. The hosted project names must not be used to endorse or promote
products derived from this software without prior written
permission. For written permission, please contact info@collab.net.

5. Products derived from this software may not use the "Tigris" name
nor may "Tigris" appear in their names without prior written
permission of CollabNet.

THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
IN NO EVENT SHALL COLLABNET OR ITS CONTRIBUTORS BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE
GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER
IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF
ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This software consists of voluntary contributions made by many
individuals on behalf of CollabNet.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-07 11:24                 ` Tupshin Harper
@ 2003-03-07 11:28                   ` Pavel Machek
  0 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-07 11:28 UTC (permalink / raw)
  To: Tupshin Harper; +Cc: linux-kernel

Hi!

> >it*, then add my stuff. Ouch. svn is out
> >because of licensing, cvs is not powerful
> >
> Could you or somebody else explain this repeated claim that the 
> Subversion licensing is problematic?
> 
> I don't have anything to do with the project, but a quick perusal of the 
> license doesn't reveal any problems. It's basically an Apache/BSD 
> license (pasted below for your reading pleasure).

You snipped what you should not snip: I'd need to merge CSSC (GPL-ed)
and svn (BSD with advertising). I may not do that. svn license is
okay, but merging it with CSSC is not possible.

								Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-06 16:41               ` Pavel Machek
  2003-03-07 11:24                 ` Tupshin Harper
@ 2003-03-07 21:53                 ` H. Peter Anvin
  2003-03-08 23:18                   ` Daniel Phillips
  1 sibling, 1 reply; 155+ messages in thread
From: H. Peter Anvin @ 2003-03-07 21:53 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <20030306164146.GF2781@zaurus.ucw.cz>
By author:    Pavel Machek <pavel@suse.cz>
In newsgroup: linux.dev.kernel
> 
> b) take something else, *merge cssc to
> it*, then add my stuff. Ouch. svn is out
> because of licensing, cvs is not powerful
> enough, and I do not like arch. (I did
> not know about opencm, sorry).
> 

I would really, really take a look at opencm first then.  Really.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-07 21:53                 ` H. Peter Anvin
@ 2003-03-08 23:18                   ` Daniel Phillips
  0 siblings, 0 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-08 23:18 UTC (permalink / raw)
  To: H. Peter Anvin, linux-kernel, Pavel Machek; +Cc: opencm-dev

On Fri 07 Mar 03 22:53, H. Peter Anvin wrote:
> I would really, really take a look at opencm first then.  Really.

Untarring and building opencm-0.1.2alpha2-1-src.tgz generated an empty show.c 
file.  Not feeling too imaginative, I did:

-int show_c(SDR_stream *strm, const Buffer *input);
+int show_c(SDR_stream *strm, const Buffer *input) { return 0; }

in Browse.c, and was rewarded with a build.

Is this still on-topic for lkml?

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed *notrademarkhere* clone
  2003-03-02  1:45         ` Jeff Garzik
  2003-03-02  2:09           ` Andrea Arcangeli
  2003-03-02  3:29           ` H. Peter Anvin
@ 2003-03-03  0:13           ` Pavel Machek
  2 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-03  0:13 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Andrea Arcangeli, Alan Cox, Arador, Adam J. Richter,
	Linux Kernel Mailing List, pavel, pavel, hch

Hi!

> >Jeff, please uninstall *notrademarkhere* from your harddisk, install the
> >patched CSSC instead (like I just did), rsync Rik's SCCS tree on your
> >harddisk (like I just did), and then send me via email the diff of the
> >last Changeset that Linus applied to his tree with author, date,
> >comments etc...  If you can do that, you're completely right and at
> >least personally I will agree 100% with you, again: iff you can.
> 
> 
> You're missing the point:
> 
> A BK exporter is useful.  A BK clone is not.

I meant exporter.
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  1:19     ` Jeff Garzik
  2003-03-02  1:40       ` BitBucket: GPL-ed *notrademarkhere* clone Andrea Arcangeli
@ 2003-03-03  0:10       ` Pavel Machek
  2003-03-04 16:16         ` David Woodhouse
  1 sibling, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-03  0:10 UTC (permalink / raw)
  To: Jeff Garzik
  Cc: Alan Cox, Arador, Adam J. Richter, andrea,
	Linux Kernel Mailing List, pavel, pavel, hch

Hi!

> >>(Just a very personal suggestion)
> >>Why to waste time trying to clone a 
> >>tool such as bitkeeper? Why not to support things like subversion?
> >
> >
> >Because the repositories people need to read are in BK format, for better
> >or worse. It doesn't ultimately matter if you use it as an input filter
> >for CVS, subversion or no VCS at all.
> 
> "BK format"?  Not really.  Patches have been posted (to lkml, even) to 
> GNU CSSC which allow it to read SCCS files BK reads and writes.
> 
> Since that already exists, a full BitKeeper clone is IMO a bit silly, 
> because it draws users and programmers away from projects that could 
> potentially _replace_ BitKeeper.

Read-only access to the bk repositories is the first goal. Then, I'll
either add write support (unlikely) or feed it into some existing
version control system to work with that. I'm still not sure what's
the best.

[bk's on-disk format is quite reasonable; it might be okay to reuse
that.]

								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-03  0:10       ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
@ 2003-03-04 16:16         ` David Woodhouse
  2003-03-04 16:27           ` Pavel Machek
  0 siblings, 1 reply; 155+ messages in thread
From: David Woodhouse @ 2003-03-04 16:16 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Jeff Garzik, Alan Cox, Arador, Adam J. Richter, andrea,
	Linux Kernel Mailing List, pavel, pavel, hch

On Mon, 2003-03-03 at 00:10, Pavel Machek wrote:
> [bk's on-disk format is quite reasonable; it might be okay to reuse
> that.]

I disagree. Keeping the checked-out files _outside_ the repository, and
being able to have multiple checked-out trees from the same repository
with uncommitted changes outstanding while you pull from a remote
repository, etc, is useful.

cvs with cvsup does some of this but has obvious disadvantages, not
least of which being the one-way nature of change propagation. SVN and a
yet-to-be-invented SVNup (hopefully not in Modula-3 this time) may be a
lot closer to what we want.

-- 
dwmw2

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-04 16:16         ` David Woodhouse
@ 2003-03-04 16:27           ` Pavel Machek
  0 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-04 16:27 UTC (permalink / raw)
  To: David Woodhouse
  Cc: Jeff Garzik, Alan Cox, Arador, Adam J. Richter, andrea,
	Linux Kernel Mailing List, pavel, pavel, hch

Hi!

> > [bk's on-disk format is quite reasonable; it might be okay to reuse
> > that.]
>
> I disagree. Keeping the checked-out files _outside_ the repository, and
> being able to have multiple checked-out trees from the same repository
> with uncommitted changes outstanding while you pull from a remote
> repository, etc, is useful.

Agreed, but bk's SCCS-based format does not prevent you from keeping
checked-out files outside repository or from having multiple
checked-out trees. In fact I'm doing exactly that with bitbucket.

								Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
                   ` (2 preceding siblings ...)
  2003-03-02  0:49 ` Arador
@ 2003-03-02  1:26 ` Olivier Galibert
  2003-03-06 16:18   ` Pavel Machek
  2003-03-02  1:37 ` Filip Van Raemdonck
  4 siblings, 1 reply; 155+ messages in thread
From: Olivier Galibert @ 2003-03-02  1:26 UTC (permalink / raw)
  To: linux-kernel

On Sat, Mar 01, 2003 at 04:11:55PM -0800, Adam J. Richter wrote:
> 	Aegis, BitKeeper and probably other configuration management
> tools that use sccs or rcs basically share a common type of lower
> layer.  This lower layer converts a file-based revision control system
> such as sccs to an "uber-cvs", as someone called it in a slashdot
> discussion, that can:
> 
> 	    1. process a transaction against a group of files atomically,
> 	    2. associate a comment with such a transaction rather than
> 	       with just one file,
> 	    3. represent symbolic links, file protections
>             4. represent file renames (and perhaps copies?)

5. Represent merges.  That's what is making cvs branches unusable.

Frankly, if you want all of that you'd better design a repository
format that is actually adapted to it.  The RCS format is not very
good, the SCCS weave is a little better but not by much (it reminds me
of Hurd, looks cool but slow by design).  Larry did quite a feat
turning it into a distributed DAG of versions but I'm not convinced it
was that smart, technically.  In particular, everything suddenly looks
much nicer when you have one file per DAG node plus a cache zone for
full versions.

But anyway, what made[1] Bitkeeper suck less is the real DAG
structure.  Neither arch nor subversion seem to have understood that
and, as a result, don't and won't provide the same level of semantics.
Zero hope for Linus to use them, ever.  They're needed for any
decently distributed development process.

Hell, arch is still at the update-before-commit level.  I'd have hoped
PRCS would have cured that particular sickness in SCM design ages ago.

Atomicity, symbolic links, file renames, splits (copy) and merges (the
different files suddenly ending up being the same one) are somewhat
important, but not the interesting part.  A good distributed DAG
structure and a quality 3-point version "merge" is what you actually
need to build bk-level SCMs.

  OG.

[1] 2.1.6-pre5, I don't know about current versions

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  1:26 ` Olivier Galibert
@ 2003-03-06 16:18   ` Pavel Machek
  2003-03-07 12:12     ` Olivier Galibert
  0 siblings, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-06 16:18 UTC (permalink / raw)
  To: Olivier Galibert, linux-kernel

Hi!

> But anyway, what made[1] Bitkeeper suck less is the real DAG
> structure.  Neither arch nor subversion seem to have understood that
> and, as a result, don't and won't provide the same level of semantics.
> Zero hope for Linus to use them, ever.  They're needed for any
> decently distributed development process.

Can you elaborate? I thought that this
"real DAG" structure is more or less
equivalent to each developer having
his own CVS repository...

> Hell, arch is still at the update-before-commit level.  I'd have hoped
> PRCS would have cured that particular sickness in SCM design ages ago.
> 
> Atomicity, symbolic links, file renames, splits (copy) and merges (the
> different files suddenly ending up being the same one) are somewhat
> important, but not the interesting part.  A good distributed DAG
> structure and a quality 3-point version "merge" is what you actually
> need to build bk-level SCMs.

If I fixed CVS renames, added atomic
commits, splits and merges, and gave each
developer his own CVS repository,
would I be in same league as bk?
Ie 10 times slower but equivalent
functionality?

(3 point merge should be doable for CVS
too and would be a good thing anyway,
right?)
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-06 16:18   ` Pavel Machek
@ 2003-03-07 12:12     ` Olivier Galibert
  2003-03-07 12:32       ` Pavel Machek
  2003-03-08  0:18       ` Olaf Dietsche
  0 siblings, 2 replies; 155+ messages in thread
From: Olivier Galibert @ 2003-03-07 12:12 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Thu, Mar 06, 2003 at 05:18:53PM +0100, Pavel Machek wrote:
> Can you elaborate? I thought that this
> "real DAG" structure is more or less
> equivalent to each developer having
> his own CVS repository...

Nope.  CVS uses RCS, and RCS only knows about trees, not graphs.
Specifically, branch merges are not tagged as such, and as a result
CVS is unable to pick up the best grandparent when doing a merge.
That's the main reason why branching under CVS is so painful
(forgetting about the performance issues).

> If I fixed CVS renames, added atomic
> commits, splits and merges, and gave each
> developer his own CVS repository,
> would I be in same league as bk?
> Ie 10 times slower but equivalent
> functionality?

Nope.  You'll find out that this per-developer repository quickly
needs to become a per-branch repository, and even then you need to
write somewhere when the merges with other repositories happen, and
you end up with the DAG again.

Another way to see it is that CVS and friends use an
update-then-commit scheme, which is proven crap because you lose the
working version you had when you do the update to get a result that is
sometimes interesting.  Nice systems, like PRCS and bk, first commit
to a new branch (no update necessary obviously) then merge in the
mainline.  As a side effect, they are Good with branches.  Bk's main
quality over PRCS is the distribution.  This lack is what makes PRCS
essentially unusable for serious open source projects.  Otherwise
they're semantically the same.

> (3 point merge should be doable for CVS
> too and would be a good thing anyway,
> right?)

Technically, CVS does 3-point merge, it's just crap at finding the
third point, and diff3 -m (which is what is used under the hood) isn't
that spectacular either.

You can see the merge operation in a different way.  You take 3
versions of your complete repository A, B and R (reference).  You
compute the deltas dA and dB so that A=dA(R) and B=dB(R).  Then you
try to build M=dA(dB(R))=dB(dA(R)), when it makes sense (not only are
the deltas not necessarily commutative, they can't even always be
applied one after the other).  When it doesn't work there are conflicts
to be resolved by the user.  You can see that when it works,
M=dA(B)=dB(A).
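
A minimal sketch of that composition in Python, under a deliberately
simplified assumption: every version has the same number of lines and a
delta only replaces lines in place, so it can be stored as
{line index: new text}.  The names diff, apply_delta and merge are made
up for illustration; real merge tools also have to handle insertions
and deletions.

```python
def diff(ref, ver):
    """Delta turning ref into ver, as {line index: new text}."""
    return {i: v for i, (r, v) in enumerate(zip(ref, ver)) if r != v}


def apply_delta(delta, ver):
    """Return a copy of ver with the delta's replacements applied."""
    out = list(ver)
    for i, line in delta.items():
        out[i] = line
    return out


def merge(ref, a, b):
    """Build M = dA(dB(R)); raise if the two deltas disagree on a line."""
    d_a, d_b = diff(ref, a), diff(ref, b)
    conflicts = sorted(i for i in d_a.keys() & d_b.keys() if d_a[i] != d_b[i])
    if conflicts:
        raise ValueError(f"conflict on lines {conflicts}, resolve by hand")
    return apply_delta(d_a, apply_delta(d_b, ref))


R = ["a", "b", "c"]
A = ["a", "B", "c"]   # dA changes line 1
B = ["a", "b", "C"]   # dB changes line 2
M = merge(R, A, B)
```

With these inputs the deltas touch different lines, so they commute and
M is ["a", "B", "C"], which is also what you get from dA(B) or dB(A).
If A and B both changed line 1 to different text, merge() would raise
instead of guessing.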

You can do a lot of things with that, merging branches is just one of
them.  You can back out patches from within the history for instance
(D->E->F, merge D and F using E as reference removes the D->E patch
from F).
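
The back-out case can be sketched the same way.  This is only an
illustration under the assumption that all versions have the same
number of lines, so a three-way merge can be decided line by line;
merge3 is a hypothetical helper, not anything from bk or CVS.

```python
def merge3(ref, a, b):
    """Line-by-line three-way merge of equal-length versions."""
    out = []
    for i, (r, x, y) in enumerate(zip(ref, a, b)):
        if x == r:
            out.append(y)          # only b changed this line (or nobody did)
        elif y == r or x == y:
            out.append(x)          # only a changed it, or both agree
        else:
            raise ValueError(f"conflict on line {i}")
    return out


D = ["int x;", "int y;", "int z;"]
E = ["int x;", "long y;", "int z;"]    # D->E: changes line 1
F = ["int x;", "long y;", "long z;"]   # E->F: changes line 2

# Merge D and F with E as the reference: seen from E, D "undoes" the
# D->E patch and F adds the E->F patch, so the result keeps only E->F.
backed_out = merge3(E, D, F)
```

Here backed_out is ["int x;", "int y;", "long z;"]: F without the D->E
change.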

The trick is, the simpler your deltas are, the lower the conflict
probability is.  That's where the DAG kicks in.  For a branch merge,
the lowest probability of conflict tends to occur when the two deltas
are a linear combination of small user-made deltas, with no delta
common between the two chains.  I.e. the best reference to use is the
latest merge point.  The DAG allows you to find it.  CVS doesn't note
the merge points so it always goes all the way back to where the
branch is rooted, ensuring that the two delta chains have a large
common prefix.
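
To make the difference concrete, here is a rough Python sketch of
picking the reference version.  The history is a made-up example (R is
the branch root, m* the mainline, b* the branch, and m2 is a merge of
m1 and b2), and ancestors/merge_bases are illustrative names only.
Dropping the second parent of each merge imitates a system that does
not record merge points.

```python
def ancestors(parents, node):
    """All versions reachable from node via parent links, node included."""
    seen, stack = set(), [node]
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents.get(n, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {c for c in common
            if not any(c != d and c in ancestors(parents, d) for d in common)}


# version -> list of parents; m2 records both of its parents (a merge)
dag = {
    "R": [], "m1": ["R"], "b1": ["R"], "b2": ["b1"],
    "m2": ["m1", "b2"], "b3": ["b2"], "m3": ["m2"],
}
# Same history with merge parents forgotten, i.e. a plain tree
tree = {n: ps[:1] for n, ps in dag.items()}

with_dag = merge_bases(dag, "m3", "b3")
without_merge_info = merge_bases(tree, "m3", "b3")
```

With the DAG, merging b3 into m3 uses b2 (the latest merge point) as
the reference, so only the b2->b3 delta has to be applied.  With the
tree, the only common ancestor left is R, so b1 and b2 get replayed
against a mainline that already contains them.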

Sub-optimal reference point plus diff3's algorithm being what it is
makes the CVS branches plain unusable.  Multiple repositories won't
fix that, since you'll need to merge between repositories anyway.

  OG.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 12:12     ` Olivier Galibert
@ 2003-03-07 12:32       ` Pavel Machek
  2003-03-07 16:54         ` Olivier Galibert
  2003-03-08  0:18       ` Olaf Dietsche
  1 sibling, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-07 12:32 UTC (permalink / raw)
  To: Olivier Galibert, linux-kernel

Hi!

> > Can you elaborate? I thought that this
> > "real DAG" structure is more or less
> > equivalent to each developer having
> > his own CVS repository...
> 
> Nope.  CVS uses RCS, and RCS only knows about trees, not graphs.
> Specifically, branch merges are not tagged as such, and as a result
> CVS is unable to pick up the best grandparent when doing a merge.
> That's the main reason of why branching under CVS is so painful
> (forgetting about the performance issues).

I see. But I still somehow can not understand how merging is
possible. Merge possibly means work-by-hand, right? So it is not as
simple as noting that 1.8 and 1.7.1.1 were merged into 1.9, no? [And
what if developer did really crap job at merging that, like dropping
all changes from 1.7.1.1?]

> > If I fixed CVS renames, added atomic
> > commits, splits and merges, and gave each
> > developer his own CVS repository,
> > would I be in same league as bk?
> > Ie 10 times slower but equivalent
> > functionality?
> 
> Nope.  You'll find out that this per-developer repository quickly
> needs to become a per-branch repository, and even then you need to
> write somewhere when the merges with other repositories happen, and
> you end up with the DAG again.

Yep, that's what I wanted to know. [I see per-branch repository is
pain, but it helps me to understand that.]

Thanx for your explanations,
							Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 12:32       ` Pavel Machek
@ 2003-03-07 16:54         ` Olivier Galibert
  2003-03-07 17:14           ` Geert Uytterhoeven
  2003-03-07 19:08           ` Pavel Machek
  0 siblings, 2 replies; 155+ messages in thread
From: Olivier Galibert @ 2003-03-07 16:54 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

On Fri, Mar 07, 2003 at 01:32:37PM +0100, Pavel Machek wrote:
> > Nope.  CVS uses RCS, and RCS only knows about trees, not graphs.
> > Specifically, branch merges are not tagged as such, and as a result
> > CVS is unable to pick up the best grandparent when doing a merge.
> > That's the main reason of why branching under CVS is so painful
> > (forgetting about the performance issues).
> 
> I see. But I still somehow can not understand how merging is
> possible. Merge possibly means work-by-hand, right? So it is not as
> simple as noting that 1.8 and 1.7.1.1 were merged into 1.9, no? [And
> what if developer did really crap job at merging that, like dropping
> all changes from 1.7.1.1?]

Calling A and B the versions to merge and R the reference, diff3 uses
this algorithm (probably the simplest possible):
- Compute the diff between A and R, call it dA
- Compute the diff between B and R, call it dB
- Merge the two diffs into one (and conflict where you can't)
- Apply the merged diff to R

Better algorithms do the alignments per-character instead of per-line,
detect moved and changed functions, detect duplicate inserts, etc.
None, of course, is perfect, as Larry could tell you.

Now if the development went that way:

1.7  -> 1.7.1.1 (branching, i.e. copy)
 v         v
 v      1.7.1.2
1.8        v
 v   -> 1.7.1.3 (merge)
1.9        v
 v         v
1.10       v
 v   -> 1.7.1.4 (merge)
 v         v
 v      1.7.1.5
 v         v
1.11 <-         (merge)

Pretty much standard: a developer created a new branch, made some
changes in it, synced with mainline, synced with mainline again a
little later, made some new changes and finally folded the branch back
into the mainline.  Let's assume the developer's changes don't by
themselves conflict with the mainline changes.

CVS, for all the merges, is going to pick 1.7 as the reference.  The
first time, for 1.7.1.3, it's going to work correctly.  It will fuse
the 1.7->1.8 patch with the 1.7.1.1->1.7.1.2 patch and apply the
result to 1.7 to get 1.7.1.3.  The two patches have no reason to
overlap.  1.7.1.2->1.7.1.3 will essentially be identical to 1.7->1.8,
and 1.8->1.7.1.3 will essentially be identical to 1.7.1.2->1.7.1.3.

At the very next merge, i.e. 1.7.1.4, it breaks.  CVS is going to
try to fuse the 1.7->1.10 patch with the 1.7->1.7.1.3 patch.  But
1.7->1.10 = 1.7->1.8+1.8->1.10 and 1.7->1.7.1.3 ~= 1.7->1.7.1.2+1.7->1.8.
So they have components in common, hence they _will_ conflict.

If CVS had taken the latest common ancestor by keeping in the
repository the existence of the 1.8->1.7.1.3 link, it would have taken
the 1.8 version as the reference.  The patches to fuse would have been
1.8->1.10 and 1.8->1.7.1.3, which have no reason to conflict.

Same for the next merge, the optimal merge point is in that case 1.10,
and it ends up being a null merge, i.e. 1.11 is a copy of 1.7.1.5.

You can see the final structure is a DAG, with each node having a max
of 2 ancestors.  And that's what PRCS and bk are working with,
fundamentally.
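
The choice of reference can be sketched in Python on the history drawn
above.  This is only an illustration: the PARENTS table is a hand
transcription of the diagram, and the helper names (ancestors,
merge_bases, tree_only) are made up, not taken from CVS, PRCS or bk.

```python
# Parent links for the history drawn above.  The first parent is the
# predecessor on the same branch; the second, when present, is the
# revision that was merged in.
PARENTS = {
    "1.7": [],
    "1.8": ["1.7"],
    "1.9": ["1.8"],
    "1.10": ["1.9"],
    "1.11": ["1.10", "1.7.1.5"],
    "1.7.1.1": ["1.7"],
    "1.7.1.2": ["1.7.1.1"],
    "1.7.1.3": ["1.7.1.2", "1.8"],
    "1.7.1.4": ["1.7.1.3", "1.10"],
    "1.7.1.5": ["1.7.1.4"],
}


def ancestors(rev, parents):
    """Return rev plus everything reachable from it through parent links."""
    seen, stack = set(), [rev]
    while stack:
        r = stack.pop()
        if r not in seen:
            seen.add(r)
            stack.extend(parents[r])
    return seen


def merge_bases(a, b, parents):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return {c for c in common
            if not any(c != d and c in ancestors(d, parents) for d in common)}


def tree_only(parents):
    """What a tree-shaped history records: the first parent only."""
    return {rev: ps[:1] for rev, ps in parents.items()}


# Reference for the merge that produces 1.7.1.4 (1.10 into 1.7.1.3):
with_dag = merge_bases("1.10", "1.7.1.3", PARENTS)
without_dag = merge_bases("1.10", "1.7.1.3", tree_only(PARENTS))

# Reference for the final merge that produces 1.11 (1.7.1.5 into 1.10):
final_with_dag = merge_bases("1.10", "1.7.1.5", PARENTS)
```

With the merge links recorded, the reference comes out as 1.8 and then
1.10; with only the tree, both merges fall back to 1.7.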

  OG.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 16:54         ` Olivier Galibert
@ 2003-03-07 17:14           ` Geert Uytterhoeven
  2003-03-07 19:08           ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: Geert Uytterhoeven @ 2003-03-07 17:14 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Pavel Machek, Linux Kernel Development

On Fri, 7 Mar 2003, Olivier Galibert wrote:
> Now if the development went that way:
> 
> 1.7  -> 1.7.1.1 (branching, i.e. copy)
>  v         v
>  v      1.7.1.2
> 1.8        v
>  v   -> 1.7.1.3 (merge)
> 1.9        v
>  v         v
> 1.10       v
>  v   -> 1.7.1.4 (merge)
>  v         v
>  v      1.7.1.5
>  v         v
> 1.11 <-         (merge)
> 
> Pretty much standard: a developer created a new branch, made some
> changes in it, synced with mainline, synced with mainline again a
> little later, made some new changes and finally folded the branch back
> into the mainline.  Let's assume the developer's changes don't by
> themselves conflict with the mainline changes.
> 
> CVS, for all the merges, is going to pick 1.7 as the reference.  The
> first time, for 1.7.1.3, it's going to work correctly.  It will fuse
> the 1.7->1.8 patch with the 1.7.1.1->1.7.1.2 patch and apply the
> result to 1.7 to get 1.7.1.3.  The two patches have no reason to
> overlap.  1.7.1.2->1.7.1.3 will essentially be identical to 1.7->1.8,
> and 1.8->1.7.1.3 will essentially be identical to 1.7.1.2->1.7.1.3.
                                                    ^^^^^^^^^^^^^^^^
1.7.1.1->1.7.1.2, I assume?

> At the very next merge, i.e. 1.7.1.4, it breaks.  CVS is going to
> try to fuse the 1.7->1.10 patch with the 1.7->1.7.1.3 patch.  But
> 1.7->1.10 = 1.7->1.8+1.8->1.10 and 1.7->1.7.1.3 ~= 1.7->1.7.1.2+1.7->1.8.
> So they have components in common, hence they _will_ conflict.
> 
> If CVS had taken the latest common ancestor by keeping in the
> repository the existence of the 1.8->1.7.1.3 link, it would have taken
> the 1.8 version as the reference.  The patches to fuse would have been
> 1.8->1.10 and 1.8->1.7.1.3, which have no reason to conflict.
> 
> Same for the next merge, the optimal merge point is in that case 1.10,
> and it ends up being a null merge, i.e. 1.11 is a copy of 1.7.1.5.
> 
> You can see the final structure is a DAG, with each node having a max
> of 2 ancestors.  And that's what PRCS and bk are working with,
> fundamentally.

Aha, so that's why my `mergetree' script (which basically is some directory
recursion around plain RCS merge, with additional support for hardlinking
identical files) works better than CVS, when I merge e.g. linux-2.5.64 and
linux-m68k-2.5.63 into linux-m68k-2.5.64. It always uses the latest common
ancestor (linux-2.5.63)...

Gr{oetje,eeting}s,

						Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
							    -- Linus Torvalds


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 16:54         ` Olivier Galibert
  2003-03-07 17:14           ` Geert Uytterhoeven
@ 2003-03-07 19:08           ` Pavel Machek
  2003-03-07 19:25             ` Eli Carter
                               ` (3 more replies)
  1 sibling, 4 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-07 19:08 UTC (permalink / raw)
  To: Olivier Galibert, Pavel Machek, linux-kernel

Hi!

> Now if the development went that way:
> 
> 1.7  -> 1.7.1.1 (branching, i.e. copy)
>  v         v
>  v      1.7.1.2
> 1.8        v
>  v   -> 1.7.1.3 (merge)
> 1.9        v
>  v         v
> 1.10       v
>  v   -> 1.7.1.4 (merge)
>  v         v
>  v      1.7.1.5
>  v         v
> 1.11 <-         (merge)
> 
> Pretty much standard: a developer created a new branch, made some
> changes in it, synced with mainline, synced with mainline again a
> little later, made some new changes and finally folded the branch back
> into the mainline.  Let's assume the developer's changes don't by
> themselves conflict with the mainline changes.
> 
> CVS, for all the merges, is going to pick 1.7 as the reference.  The
> first time, for 1.7.1.3, it's going to work correctly.  It will fuse
> the 1.7->1.8 patch with the 1.7.1.1->1.7.1.2 patch and apply the
> result to 1.7 to get 1.7.1.3.  The two patches have no reason to
> overlap.  1.7.1.2->1.7.1.3 will essentially be identical to
> 1.7->1.8,

So, basically, if branch was killed and recreated after each merge
from mainline, problem would be solved, right?

							Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 19:08           ` Pavel Machek
@ 2003-03-07 19:25             ` Eli Carter
  2003-03-07 20:29               ` Pavel Machek
  2003-03-07 23:16             ` Linus Torvalds
                               ` (2 subsequent siblings)
  3 siblings, 1 reply; 155+ messages in thread
From: Eli Carter @ 2003-03-07 19:25 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Olivier Galibert, linux-kernel

Pavel Machek wrote:
> Hi!
> 
> 
>>Now if the development went that way:
>>
>>1.7  -> 1.7.1.1 (branching, i.e. copy)
>> v         v
>> v      1.7.1.2
>>1.8        v
>> v   -> 1.7.1.3 (merge)
>>1.9        v
>> v         v
>>1.10       v
>> v   -> 1.7.1.4 (merge)
>> v         v
>> v      1.7.1.5
>> v         v
>>1.11 <-         (merge)
>>
>>Pretty much standard: a developer created a new branch, made some
>>changes in it, synced with mainline, synced with mainline again a
>>little later, made some new changes and finally folded the branch back
>>into the mainline.  Let's assume the developer's changes don't by
>>themselves conflict with the mainline changes.
>>
>>CVS, for all the merges, is going to pick 1.7 as the reference.  The
>>first time, for 1.7.1.3, it's going to work correctly.  It will fuse
>>the 1.7->1.8 patch with the 1.7.1.1->1.7.1.2 patch and apply the
>>result to 1.7 to get 1.7.1.3.  The two patches have no reason to
>>overlap.  1.7.1.2->1.7.1.3 will essentially be identical to
>>1.7->1.8,
> 
> 
> So, basically, if branch was killed and recreated after each merge
> from mainline, problem would be solved, right?
> 
> 							Pavel

You would lose the history that branch gave you.
Or do you mean create a new branch (with a new name) at the point where 
the old branch was merged, and no longer use the old branch for commits?

Eli
--------------------. "If it ain't broke now,
Eli Carter           \                  it will be soon." -- crypto-gram
eli.carter(a)inet.com `-------------------------------------------------


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 19:25             ` Eli Carter
@ 2003-03-07 20:29               ` Pavel Machek
  0 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-07 20:29 UTC (permalink / raw)
  To: Eli Carter; +Cc: Pavel Machek, Olivier Galibert, linux-kernel

Hi!

> >So, basically, if branch was killed and recreated after each merge
> >from mainline, problem would be solved, right?
> >
> >							Pavel
> 
> You would lose the history that branch gave you.
> Or do you mean create a new branch (with a new name) at the point where 
> the old branch was merged, and no longer use the old branch for
> commits?

Yes, that's what I meant.

								Pavel
-- 
Horseback riding is like software...
...vgf orggre jura vgf serr.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 19:08           ` Pavel Machek
  2003-03-07 19:25             ` Eli Carter
@ 2003-03-07 23:16             ` Linus Torvalds
  2003-03-08 22:52               ` Zack Brown
  2003-03-09  2:06             ` Horst von Brand
       [not found]             ` <b4b98v_14m_1@penguin.transmeta.com>
  3 siblings, 1 reply; 155+ messages in thread
From: Linus Torvalds @ 2003-03-07 23:16 UTC (permalink / raw)
  To: linux-kernel

In article <20030307190848.GB21023@atrey.karlin.mff.cuni.cz>,
Pavel Machek  <pavel@suse.cz> wrote:
>
>So, basically, if branch was killed and recreated after each merge
>from mainline, problem would be solved, right?

Wrong.

Now think three trees.  Each merging back and forth between each other. 

Or, in the case of something like the Linux kernel tree, where you don't
have two or three trees.  You've got at least 20 actively developed
concurrent trees with branches at different points. 

Trust me. CVS simply CANNOT do this. You need the full information.

Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
way indefinitely since most people don't seem to even understand _why_
it is superior. 

		Linus

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 23:16             ` Linus Torvalds
@ 2003-03-08 22:52               ` Zack Brown
  2003-03-09  0:05                 ` Larry McVoy
  0 siblings, 1 reply; 155+ messages in thread
From: Zack Brown @ 2003-03-08 22:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Hi Linus,

On Fri, Mar 07, 2003 at 11:16:47PM +0000, Linus Torvalds wrote:
> In article <20030307190848.GB21023@atrey.karlin.mff.cuni.cz>,
> Pavel Machek  <pavel@suse.cz> wrote:
> >
> >So, basically, if branch was killed and recreated after each merge
> >from mainline, problem would be solved, right?
> 
> Wrong.
> 
> Now think three trees.  Each merging back and forth between each other. 
> 
> Or, in the case of something like the Linux kernel tree, where you don't
> have two or three trees.  You've got at least 20 actively developed
> concurrent trees with branches at different points. 
> 
> Trust me. CVS simply CANNOT do this. You need the full information.
> 
> Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> way indefinitely since most people don't seem to even understand _why_
> it is superior. 

You make it sound like no one is even interested ;-). But it's not true! A
lot of people currently working on alternative version control systems would
like very much to know what it would take to satisfy the needs of kernel
development. Maybe, being on the inside of the process and well aware of
your own needs, you don't realize how difficult it is to figure these things
out from the outside. I think only very few people (perhaps only one) really
understand this issue, and they aren't communicating with the horde of people
who really want to help, if only they knew how.

My impression is that Pavel is really smart and pretty close to the core of
kernel development. But you say even he doesn't get it? Come on! Throw
us a bone, willya!? ;-)

Be well,
Zack

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-08 22:52               ` Zack Brown
@ 2003-03-09  0:05                 ` Larry McVoy
  2003-03-09  1:21                   ` Davide Libenzi
  2003-03-09  2:45                   ` Zack Brown
  0 siblings, 2 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-09  0:05 UTC (permalink / raw)
  To: Zack Brown; +Cc: Linus Torvalds, linux-kernel

> > Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> > way indefinitely since most people don't seem to even understand _why_
> > it is superior. 
> 
> You make it sound like no one is even interested ;-). But it's not true! A
> lot of people currently working on alternative version control systems would
> like very much to know what it would take to satisfy the needs of kernel
> development. Maybe, being on the inside of the process and well aware of
> your own needs, you don't realize how difficult it is to figure these things
> out from the outside. I think only very few people (perhaps only one) really
> understand this issue, and they aren't communicating with the horde of people
> who really want to help, if only they knew how.

[Long rant, summary: it's harder than you think, read on for the details]

There are parts of BitKeeper which required multiple years of thought by
people a lot smarter than me.  You guys are under the mistaken impression
that BitKeeper is my doing; it's not.  There are a lot of people who
work here and they have some amazing brains.  To create something like
BK is actually more difficult than creating a kernel.

To understand why, think of BK as a distributed, replicated, version
controlled user level file system with no limits on any of the file system
events which may happen in parallel.  Now put the changes back together,
correctly, no matter how much parallelism there has been.  Pavel hasn't
understood anything but a tiny fraction of the problem space yet, he
just doesn't realize it.  Even Linus doesn't know how BitKeeper works,
we haven't told him and I can tell from his explanations that he gets
part of it but not most of it.  That's not a slam on Linus or Pavel or
anyone else.  I'm just trying to tell you guys that this stuff is a lot
harder than you think.  I've told people that before, like the SVN and
OpenCM guys, and the leaders of both those efforts showed up later and
said "yup, you're right, it is a hell of a lot harder than it looks".
And they are nowhere near being able to do what BK does.  Ask them if
you have doubts about what I am saying.

Merging is just one of the complex areas.  It gets all the attention
because it is hard enough but easy enough that people like to work on it.
It's actually fun to work on merging.  Ditto for the graph structure,
that's trivial.  The other parts aren't fun and they are more difficult
so they don't get talked about.  But they are more important because
the user has no idea how to deal with them and users do know how to deal
with merge problems, lots of you understand patch rejects.

Rename handling in a distributed system is actually much harder than
getting the merging done.  It doesn't seem like it is, but we've rewritten
how we do it 3 times and are working on a 4th all because we've been
forced to learn about all the different ways that people move things
around.  CVS doesn't have any of the rename problems because it doesn't
do them, and SVN doesn't have 1/1000th of the problems we do because it
is centralized.  Centralized means that there is never any confusion
about where something should go, you can only create one file in one
directory entry because there is only one directory entry available.
In BK's case, there can be an infinite number of different files which
all want to be src/foo.c.

Symbolic tags are really hard.  What?!?  What could be easier than adding
a symbolic label on a revision?  Well, in a centralized system it is
trivial but in a distributed system you have to handle the fact that
the same symbol can be put on multiple revs.  It's the same problem as
the file names, just a variation.  Add to that the fact that time can
march forward or backwards in a distributed system, even if all the
events were marching forward, and the fun really starts.  I personally
have redone the tags support about 6 times and it still isn't right.

Security semantics are hard in a distributed system.  Where do you
put them, how do you integrate them into the system, what happens when
people try and work around them?  In CVS or SVN you can simply lock down
the server and not worry about it, but in BK, the user has the revision
history and they are root, they can do whatever they want.

Time semantics are the hardest of all.  You simply can't depend on time
being correct.  It goes forwards, backwards, and sideways on you and
if you think you can use time you don't have the slightest idea of the
scope of the problem.  Again, not a problem for CVS/SVN/whatever, all the
deltas are made against the same clock.  Not true in a distributed system.
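
A toy example shows why wall-clock ordering cannot be trusted once
deltas come from different machines.  It says nothing about how BK
actually handles time: the delta names, parent links and timestamps
below are invented, and the helpers by_clock, by_graph and
respects_parents are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (delta id, parent ids, local wall-clock timestamp); all values made up.
DELTAS = [
    ("a1", [], 1000),
    ("b1", ["a1"], 400),        # made later, on a machine whose clock is behind
    ("a2", ["a1"], 1100),
    ("m1", ["a2", "b1"], 450),  # the merge, made on the slow-clock machine
]


def by_clock(deltas):
    """Order deltas by the timestamp each machine reported."""
    return [d[0] for d in sorted(deltas, key=lambda d: d[2])]


def by_graph(deltas):
    """Order deltas so that every parent comes before its children."""
    graph = {d[0]: d[1] for d in deltas}  # node -> its predecessors
    return list(TopologicalSorter(graph).static_order())


def respects_parents(order, deltas):
    """True if no delta appears before one of its parents."""
    pos = {rev: i for i, rev in enumerate(order)}
    return all(pos[p] < pos[d[0]] for d in deltas for p in d[1])


clock_order = by_clock(DELTAS)
graph_order = by_graph(DELTAS)
```

Sorting by the reported clocks puts the merge m1 ahead of a1, one of its
own ancestors, while the order derived from the parent links cannot do
that.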

That's a taste of what it is like.  You have to get all of those right
and the many other ones that I didn't tell you about or you might as
well not bother.  Why?  Because the problems are very subtle and there
isn't any hope of getting an end user to figure out a subtle problem,
they don't have the time or the inclination.  We've seen users throw away
weeks of work just because they didn't understand the merge conflict so
they start over on an updated tree.  And those people will understand
the rename corner cases?  Not a chance.

The main point here is that if you think that BK happened quickly,
by one guy, you are nuts.  It started in May of 1997, that's almost 6
years ago, not the 2 years that Pavel thinks, and I had already written
a complete version control system prior to that, so this was round two.
Even with that knowledge, I wasn't near enough to get BK to where it is
today, there is more than 40 man years of effort in BK so far.  A bunch
of people, working 60-90 hour weeks, for almost 6 years.  Not average
people, either, any one of these people would be a staff engineer or
better at Sun (salaries for those people are in the $115K - $140K range).

The disbelievers think that I'm out here waving the "it's too hard"
flag so you'll go away.  And the arrogant people think that they are
smarter than us and can do it quicker.  I doubt it but by all means go
for it and see what you can do.  Just file away a copy of this and let
me know what you think three or four years from now.  

Oh, by the way, you'll need a business model, I found that out 2 or 3
years into it when my savings ran out.  Oh, my, you might not be able
to GPL it!  Why it might even end up being just like BitKeeper with
an evil corporate dude named Pavel running the show.  Believe me, if
that happens, I'll be here to rake him over the coals on a daily basis
for being such an evil person who doesn't understand the point of free
software.  I can't wait.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  0:05                 ` Larry McVoy
@ 2003-03-09  1:21                   ` Davide Libenzi
  2003-03-09  2:45                   ` Zack Brown
  1 sibling, 0 replies; 155+ messages in thread
From: Davide Libenzi @ 2003-03-09  1:21 UTC (permalink / raw)
  To: Larry McVoy; +Cc: Zack Brown, Linus Torvalds, Linux Kernel Mailing List

On Sat, 8 Mar 2003, Larry McVoy wrote:

> > > Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> > > way indefinitely since most people don't seem to even understand _why_
> > > it is superior.
> >
> > You make it sound like no one is even interested ;-). But it's not true! A
> > lot of people currently working on alternative version control systems would
> > like very much to know what it would take to satisfy the needs of kernel
> > development. Maybe, being on the inside of the process and well aware of
> > your own needs, you don't realize how difficult it is to figure these things
> > out from the outside. I think only very few people (perhaps only one) really
> > understand this issue, and they aren't communicating with the horde of people
> > who really want to help, if only they knew how.
>
> [Long rant, summary: it's harder than you think, read on for the details]
>
> There are parts of BitKeeper which required multiple years of thought by
> people a lot smarter than me.  You guys are under the mistaken impression
> that BitKeeper is my doing; it's not.  There are a lot of people who
> work here and they have some amazing brains.  To create something like
> BK is actually more difficult than creating a kernel.

Larry, how many years have you been working as a developer and side by
side with developers? 15, maybe 20? Do you know what's the best way to
keep developers from doing something? Well, just say the task is
trivial, easy, for dummies. And you will see developers stay away from the
project like cats from water. Try, even remotely, to dress the project
up in complexity, and they'll come in droves...




- Davide


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  0:05                 ` Larry McVoy
  2003-03-09  1:21                   ` Davide Libenzi
@ 2003-03-09  2:45                   ` Zack Brown
  2003-03-09  3:19                     ` Roman Zippel
                                       ` (4 more replies)
  1 sibling, 5 replies; 155+ messages in thread
From: Zack Brown @ 2003-03-09  2:45 UTC (permalink / raw)
  To: Larry McVoy, Linus Torvalds, linux-kernel

On Sat, Mar 08, 2003 at 04:05:14PM -0800, Larry McVoy wrote:
> Zack Brown wrote:
> > Linus Torvalds wrote:
> > > Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> > > way indefinitely since most people don't seem to even understand _why_
> > > it is superior. 
> > 
> > You make it sound like no one is even interested ;-). But it's not true! A
> > lot of people currently working on alternative version control systems would
> > like very much to know what it would take to satisfy the needs of kernel
> > development.
> 
> [Long rant, summary: it's harder than you think, read on for the details]
[skipping long description]

OK, so here is my distillation of Larry's post.

  Basic summary: a distributed, replicated, version controlled user level file
  system with no limits on any of the file system events which may happen
  in parallel. All changes must be put correctly back together, no matter how
  much parallelism there has been.

  * Merging.

  * The graph structure.

  * Distributed rename handling. Centralized systems like Subversion don't
  have as many problems with this because you can only create one file in
  one directory entry because there is only one directory entry available.
  In distributed rename handling, there can be an infinite number of different
  files which all want to be src/foo.c. There are also many rename corner-cases.

  * Symbolic tags. This is adding a symbolic label on a revision. A distributed
  system must handle the fact that the same symbol can be put on multiple
  revisions. This is a variation of file renaming. One important thing to
  consider is that time can go forward or backward.

  * Security semantics. Where should they go? How can they be integrated
  into the system? How are hostile users handled when there is no central
  server to lock down?

  * Time semantics. A distributed system cannot depend on reported time
  being correct. It can go forward or backward at any rate.

I'd be willing to maintain this as the beginning of a feature list and
post it regularly to lkml if enough people feel it would be useful and not
annoying. The goal would be to identify the features/problems that would
need to be handled by a kernel-ready version control system.

Be well,
Zack

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  2:45                   ` Zack Brown
@ 2003-03-09  3:19                     ` Roman Zippel
  2003-03-09  3:42                       ` Linus Torvalds
  2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
                                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 155+ messages in thread
From: Roman Zippel @ 2003-03-09  3:19 UTC (permalink / raw)
  To: Zack Brown; +Cc: Larry McVoy, Linus Torvalds, linux-kernel

Hi,

On Sat, 8 Mar 2003, Zack Brown wrote:

>   * Distributed rename handling. Centralized systems like Subversion don't
>   have as many problems with this because you can only create one file in
>   one directory entry because there is only one directory entry available.
>   In distributed rename handling, there can be an infinite number of different
>   files which all want to be src/foo.c. There are also many rename corner-cases.

This is actually a very bk-specific problem, because the real problem is
that under bk there can be only one src/SCCS/s.foo.c. A separate
repository doesn't
have this problem, because it has control over the naming in the 
repository and the original naming is restored with an explicit checkout.
In this context it will be really interesting to see how Larry wants to 
implement "lines of development" (aka branches which don't suck) and 
also maintain SCCS compatibility.

bye, Roman

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  3:19                     ` Roman Zippel
@ 2003-03-09  3:42                       ` Linus Torvalds
  2003-03-09  4:32                         ` Roman Zippel
                                           ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Linus Torvalds @ 2003-03-09  3:42 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Zack Brown, Larry McVoy, linux-kernel

On Sun, 9 Mar 2003, Roman Zippel wrote:
> On Sat, 8 Mar 2003, Zack Brown wrote:
> 
> >   * Distributed rename handling.
> 
> This is actually a very bk-specific problem, because the real problem is
> that under bk there can be only one src/SCCS/s.foo.c.

I don't think that is the issue.

[ Well, yes, I agree that the SCCS format is bad, but for other reasons ]

> A separate repository doesn't have this problem

You're wrong.

The problem is _distribution_. In other words, two people rename the same 
file. Or two people rename two _different_ files to the same name. Or two 
people create two different files with the same name. What happens when 
you merge?
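
The three cases can be made concrete with a small Python sketch.  It
assumes each file carries a stable id that survives renames, which is an
assumption made for the illustration and not a description of any
particular tool; merge_names and the ids f1..f4 are made up.

```python
def merge_names(base, ours, theirs):
    """Merge two id -> path maps against their common base."""
    merged, conflicts = {}, []
    for fid in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(fid), ours.get(fid), theirs.get(fid)
        if o == t or t == b:    # same result, or only we moved it
            path = o
        elif o == b:            # only they moved it
            path = t
        else:                   # the same file renamed two different ways
            conflicts.append(("renamed two ways", fid))
            continue
        if path is not None:    # None means the file is gone
            merged[fid] = path
    # Second pass: different files that now claim the same path.
    claims = {}
    for fid, path in merged.items():
        claims.setdefault(path, []).append(fid)
    for path, fids in sorted(claims.items()):
        if len(fids) > 1:
            conflicts.append(("same name", path))
    return merged, conflicts


BASE = {"f1": "src/foo.c", "f2": "src/bar.c"}

# Two people rename the same file.
case1 = merge_names(BASE, dict(BASE, f1="src/foo_old.c"),
                    dict(BASE, f1="lib/foo.c"))

# Two people rename two different files to the same name.
case2 = merge_names(BASE, dict(BASE, f1="src/util.c"),
                    dict(BASE, f2="src/util.c"))

# Two people create two different files with the same name.
case3 = merge_names(BASE, dict(BASE, f3="src/new.c"),
                    dict(BASE, f4="src/new.c"))
```

Each side was consistent on its own; the conflicts only exist once the
two histories meet.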

None of these are issues for broken systems like CVS or SVN, since they 
have a central repository, so there _cannot_ be multiple concurrent 
renames that have to be merged much later (well, CVS cannot handle renames 
at all, but the "same name creation" issue you can see even with CVS). 

With a central repository, you avoid a lot of the problems, because the 
conflicts must have been resolved _before_ the commit ever happens - put 
another way, you can never have a conflict in the revision history.

Separate repositories and SCCS file formats have nothing to do with the 
real problem. Distribution is key, not the repository format.

		Linus

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  3:42                       ` Linus Torvalds
@ 2003-03-09  4:32                         ` Roman Zippel
  2003-03-09 13:34                           ` Eric W. Biederman
  2003-03-09 14:49                         ` Olivier Galibert
  2003-03-13  0:05                         ` Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: Roman Zippel @ 2003-03-09  4:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Zack Brown, Larry McVoy, linux-kernel

Hi,

On Sat, 8 Mar 2003, Linus Torvalds wrote:

> None of these are issues for broken systems like CVS or SVN, since they
> have a central repository, so there _cannot_ be multiple concurrent
> renames that have to be merged much later.

It is possible, you only have to remember that the file foo.c doesn't have 
to be called foo.c,v in the repository. SVN should be able to handle this, 
it's just lacking important merging mechanisms.
This is actually a key feature I want to see in a SCM system - the ability 
to keep multiple developments within the same repository. I want to pull 
other source trees into a branch and compare them with other branches and 
merge them into new branches.

> Separate repositories and SCCS file formats have nothing to do with the 
> real problem. Distribution is key, not the repository format.

I agree, what I was trying to say is that the SCCS format makes a few 
things more complex than they had to be.

bye, Roman

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  4:32                         ` Roman Zippel
@ 2003-03-09 13:34                           ` Eric W. Biederman
  2003-03-09 15:35                             ` Roman Zippel
  2003-03-13  0:13                             ` Pavel Machek
  0 siblings, 2 replies; 155+ messages in thread
From: Eric W. Biederman @ 2003-03-09 13:34 UTC (permalink / raw)
  To: Roman Zippel; +Cc: Linus Torvalds, Zack Brown, Larry McVoy, linux-kernel

Roman Zippel <zippel@linux-m68k.org> writes:

> Hi,
> 
> On Sat, 8 Mar 2003, Linus Torvalds wrote:
> 
> > None of these are issues for broken systems like CVS or SVN, since they
> > have a central repository, so there _cannot_ be multiple concurrent
> > renames that have to be merged much later.
> 
> It is possible, you only have to remember that the file foo.c doesn't have 
> to be called foo.c,v in the repository. SVN should be able to handle this, 
> it's just lacking important merging mechanisms.
> This is actually a key feature I want to see in a SCM system - the ability 
> to keep multiple developments within the same repository. I want to pull 
> other source trees into a branch and compare them with other branches and 
> merge them into new branches.

In a distributed system everything happens on a branch.

> > Separate repositories and SCCS file formats have nothing to do with the 
> > real problem. Distribution is key, not the repository format.
> 
> I agree, what I was trying to say is that the SCCS format makes a few 
> things more complex than they had to be.

I don't know, if the problem really changes that much.  How do
you pick a globally unique inode number for a file?  And then
how do you reconcile this when people on 2 different branches create
the same file and want to merge their versions together?

So as a very rough approximation.
- Distribution is the problem.
- Powerful branching is the only thing that helps this
- Non branch local data (labels/tags) is very difficult.

Eric

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 13:34                           ` Eric W. Biederman
@ 2003-03-09 15:35                             ` Roman Zippel
  2003-03-09 16:55                               ` Martin J. Bligh
  2003-03-13  0:13                             ` Pavel Machek
  1 sibling, 1 reply; 155+ messages in thread
From: Roman Zippel @ 2003-03-09 15:35 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Linus Torvalds, Zack Brown, Larry McVoy, linux-kernel

Hi,

On 9 Mar 2003, Eric W. Biederman wrote:

> > This is actually a key feature I want to see in a SCM system - the ability 
> > to keep multiple developments within the same repository. I want to pull 
> > other source trees into a branch and compare them with other branches and 
> > merge them into new branches.
> 
> In a distributed system everything happens on a branch.

That's true, but with bk you have to use separate directories for that, 
which makes cross references between branches more difficult.

> > I agree, what I was trying to say is that the SCCS format makes a few 
> > things more complex than they had to be.
> 
> I don't know, if the problem really changes that much.  How do
> you pick a globally unique inode number for a file?  And then
> how do you reconcile this when people on 2 different branches create
> the same file and want to merge their versions together?

Unique identifiers are needed for change sets anyway and if you decide 
during merge, that two files are identical, at least one branch has to 
carry the information that these identifiers point to the same file.
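
[ A rough sketch of that idea, under my own assumptions: random 128-bit
  ids stand in for the "globally unique inode number", and a merge
  records that two ids name the same file. The FileIds class and its
  method names are hypothetical, not taken from bk or any other tool. ]

```python
import uuid


class FileIds:
    """Tracks which file identifiers are known to name the same file."""

    def __init__(self):
        self._parent = {}

    def new_id(self):
        # A random id needs no coordination between repositories.
        fid = uuid.uuid4().hex
        self._parent[fid] = fid
        return fid

    def find(self, fid):
        # Ids first seen in another repository are registered lazily.
        self._parent.setdefault(fid, fid)
        while self._parent[fid] != fid:
            # Path halving keeps later lookups short.
            self._parent[fid] = self._parent[self._parent[fid]]
            fid = self._parent[fid]
        return fid

    def declare_same(self, a, b):
        """Record the merge-time decision that a and b are one file."""
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # Deterministic choice, so every repository that replays the
            # same declaration ends up with the same representative.
            keep, drop = sorted((ra, rb))
            self._parent[drop] = keep

    def same_file(self, a, b):
        return self.find(a) == self.find(b)
```

The alias table is exactly the "information that these identifiers point
to the same file"; it has to travel with the merge so that later pulls
see one file rather than two.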

> So as a very rough approximation.
> - Distribution is the problem.

I would rather say, that it's only one (although very important) problem.

bye, Roman


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 15:35                             ` Roman Zippel
@ 2003-03-09 16:55                               ` Martin J. Bligh
  2003-03-09 17:20                                 ` Zack Brown
  2003-03-09 17:39                                 ` Linus Torvalds
  0 siblings, 2 replies; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-09 16:55 UTC (permalink / raw)
  To: Roman Zippel, Eric W. Biederman
  Cc: Linus Torvalds, Zack Brown, Larry McVoy, linux-kernel

>> > This is actually a key feature I want to see in a SCM system - the ability 
>> > to keep multiple developments within the same repository. I want to pull 
>> > other source trees into a branch and compare them with other branches and 
>> > merge them into new branches.
>> 
>> In a distributed system everything happens on a branch.
> 
> That's true, but with bk you have to use separate directories for that, 
> which makes cross references between branches more difficult.
> 
>> > I agree, what I was trying to say is that the SCCS format makes a few 
>> > things more complex than they had to be.
>> 
>> I don't know, if the problem really changes that much.  How do
>> you pick a globally unique inode number for a file?  And then
>> how do you reconcile this when people on 2 different branches create
>> the same file and want to merge their versions together?
> 
> Unique identifiers are needed for change sets anyway and if you decide 
> during merge, that two files are identical, at least one branch has to 
> carry the information that these identifiers point to the same file.
> 
>> So as a very rough approximation.
>> - Distribution is the problem.
> 
> I would rather say, that it's only one (although very important) problem.

I think it's possible to get 90% of the functionality that most of us
(or at least I) want without the distributed stuff. If that's 10% of
the effort, would be really nice to have the auto-merging type of
functionality at least.

If the "maintainer" hierarchy was a strict tree structure, where you 
send patches to your parent, and receive them from your children, that
doesn't seem to need anything particularly fancy to me. 

Personally, I just collect together patches mainly from IBM people here, 
test them for functionality and performance, and sync up with Linus every 
new release by reapplying them on top of the new tree, and fix the conflicts 
by hand. Then I just email the patches as flat diffs to Linus. If I could 
get some really basic auto-merge functionality, that would get rid of 90% 
of the work, even if it only worked 95% of the time, and showed me what 
it had done that patch couldn't have done by itself. I don't see why that 
requires all this distributed stuff. If I resync with the latest -bk 
snapshot just before I send, the chances of Linus having to do much merge
work are pretty small.

I'm sure Bitkeeper is better than that, and has all sorts of fancy features,
and perhaps Linus even uses some of them. But if I can get 90% of that for
10% of the effort, I'd be happy. Some way to pass Linus some basic metadata
like changelog comments would be good (at the moment, I just slap those atop
the patch, and he edits them, but a basic perl script could hack off a 
"comment to Linus" section from a "changelog section", which might save
Linus some editing).
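
Something along these lines is what I have in mind, sketched in Python
rather than perl. The two section markers are made up for illustration;
nothing like them is an agreed convention, and untagged leading text is
simply assumed to be changelog.

```python
NOTE_MARK = "## note-to-maintainer"  # hypothetical marker
LOG_MARK = "## changelog"            # hypothetical marker


def split_patch_mail(text):
    """Split a patch mail into note, changelog and diff sections."""
    parts = {"note": [], "changelog": [], "diff": []}
    current = "changelog"  # untagged leading text counts as changelog
    for line in text.splitlines():
        if current != "diff":
            if line.startswith(("diff ", "--- ")):
                # Everything from the first diff header on is the patch.
                current = "diff"
            elif line.strip().lower() == NOTE_MARK:
                current = "note"
                continue  # drop the marker line itself
            elif line.strip().lower() == LOG_MARK:
                current = "changelog"
                continue
        parts[current].append(line)
    return {name: "\n".join(lines).strip("\n") for name, lines in parts.items()}
```

The "note" part would be read and thrown away, the "changelog" part
would go into the commit comment unedited, and the "diff" part is fed to
patch as usual.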

Andrew and Alan seem to work pretty well with flat patches too - Larry
seemed to imply that he thought the merge part of the problem was easy
enough in a non-distributed system ... if anything existing has or could 
have that without the distributed stuff and the complexity, would be cool.

If I'm missing something fundamental here, it wouldn't surprise me ;-)

M.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 16:55                               ` Martin J. Bligh
@ 2003-03-09 17:20                                 ` Zack Brown
  2003-03-09 17:48                                   ` Martin J. Bligh
  2003-03-09 19:58                                   ` Larry McVoy
  2003-03-09 17:39                                 ` Linus Torvalds
  1 sibling, 2 replies; 155+ messages in thread
From: Zack Brown @ 2003-03-09 17:20 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy,
	linux-kernel

On Sun, Mar 09, 2003 at 08:55:44AM -0800, Martin J. Bligh wrote:
> I think it's possible to get 90% of the functionality that most of us
> (or at least I) want without the distributed stuff. If that's 10% of
> the effort, would be really nice to have the auto-merging type of
> functionality at least.

> If I'm missing something fundamental here, it wouldn't surprise me ;-)

I think the fundamental thing you're missing is that Linus doesn't want it. ;-)

As long as people keep trying to avoid the hard problems that Linus and Larry
keep pointing out, I doubt any effort will get very far. I see a lot of cases
where someone says, "yeah, but we can side-step that problem if we do x,
y, or z." That doesn't help. The question is, what are the actual features
required for a version control system that could win support among the top
kernel developers?

People in the know hint at these features ("naming is really important"),
but the details are apparently complicated enough that no one wants to sit
down and actually describe them. They just hint at the *sort* of problems
they are, and then someone says, "but that's not really a problem because
of x, y, or z that can be done instead."

Then people get sidetracked on the features they personally would settle for,
and the real point gets lost in the fog. Or else they start dreaming
about what the perfect system would be like, describing features that
would not actually be required for a kernel-ready version control
system.

Unless the people in the know actually speak up, the rest of us just won't
be able to figure out what they need. A lot of projects are chasing their
tails right now, trying to do something, but lacking the direction they need
in order to do it.

Be well,
Zack

> 
> M.

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 17:20                                 ` Zack Brown
@ 2003-03-09 17:48                                   ` Martin J. Bligh
  2003-03-09 19:58                                   ` Larry McVoy
  1 sibling, 0 replies; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-09 17:48 UTC (permalink / raw)
  To: Zack Brown
  Cc: Roman Zippel, Eric W. Biederman, Linus Torvalds, Larry McVoy,
	linux-kernel

>> I think it's possible to get 90% of the functionality that most of us
>> (or at least I) want without the distributed stuff. If that's 10% of
>> the effort, would be really nice to have the auto-merging type of
>> functionality at least.
> 
>> If I'm missing something fundamental here, it wouldn't surprise me ;-)
> 
> I think the fundamental thing you're missing is that Linus doesn't want it. ;-)

Depends what your goal is ;-) I'm not on a holy quest to stop Linus using
Bitkeeper .... I'm just trying to make the non-Bitkeeper users' life a
little easier.

M.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 17:20                                 ` Zack Brown
  2003-03-09 17:48                                   ` Martin J. Bligh
@ 2003-03-09 19:58                                   ` Larry McVoy
  2003-03-09 21:32                                     ` Zack Brown
  2003-03-13 20:00                                     ` Pavel Machek
  1 sibling, 2 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-09 19:58 UTC (permalink / raw)
  To: Zack Brown
  Cc: Martin J. Bligh, Roman Zippel, Eric W. Biederman, Linus Torvalds,
	Larry McVoy, linux-kernel

On Sun, Mar 09, 2003 at 09:20:45AM -0800, Zack Brown wrote:
> People in the know hint at these features ("naming is really important"),
> but the details are apparently complicated enough that no one wants to sit
> down and actually describe them. 

What part of "40 man years" did you not understand?  Do you seriously
think that it is easy to "sit down and actually describe them"?  And if
you think I would do so just so you can go try to copy our solution
you have to be nuts, of course we aren't going to do that.  It took
us years to figure it out, we're still figuring things out every day,
if you want a free SCM you can bloody well go figure it out yourself.
The whole point of the non-compete clause in the well loved BK license
is to say "this stuff is hard.  If you want to create a similar product,
do it without the benefit of looking at our product".  That seems to be
lost on you and a lot of other people as well.

It's perfectly OK for you to go invent a new SCM system.  Go for it.
But stop asking for help from the BK crowd.  Not only will we not 
give you that help, we will do absolutely everything we can to make
sure that you can't copy BK.  Everything up to and including selling 
the company to the highest bidder and letting them chase after you.

Get it through your thick head that BK is something valuable to this
community, even if you don't use it you directly benefit from its use.
All you people trying to copy BK are just shooting yourselves in the foot
unless you can come up with a solution that Linus will use in the short
term.  And nobody but an idiot believes that is possible.  So play nice.
Playing nice means you can use it, you can't copy it.  You can also
go invent your own SCM system, go for it, it's a challenging problem,
just don't use BK's files, commands, or anything else in the process.
We didn't have the benefit of copying something that you wrote, you 
don't get the benefit of copying something we wrote.  

You don't have to agree with us, you can do whatever you want, but do
so realizing that if you become too annoying we'll simply decide that
supporting the kernel isn't worth the aggravation.  As for you armchair
CEOs who think we're raking in the bucks because of the kernel's usage
of BK, think again.  That is not how sales are made in this space, sales
are made at the VP of engineering, CTO, CIO, and/or CEO level.  If you
think those guys read this list or slashdot or care about the kernel 
using BK, think again, they don't.  All they care about is how much
it costs and how much effort it will save them.  And they all know that
their development model is dramatically different than that of the
kernel so any BK success here is of marginal interest at best.

BK is made available for free for one reason and one reason only: to
help Linus not burn out.  That's based on my personal belief that he is
critical to success of the Linux effort, he is a unique resource and has
to be protected.  I've paid a very heavy price for that belief and I'm
telling you that you are right on the edge of making that price too high.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 19:58                                   ` Larry McVoy
@ 2003-03-09 21:32                                     ` Zack Brown
  2003-03-09 21:54                                       ` Valdis.Kletnieks
  2003-03-13 20:00                                     ` Pavel Machek
  1 sibling, 1 reply; 155+ messages in thread
From: Zack Brown @ 2003-03-09 21:32 UTC (permalink / raw)
  To: Larry McVoy, Martin J. Bligh, Roman Zippel, Eric W. Biederman,
	Linus Torvalds, Larry McVoy, linux-kernel

On Sun, Mar 09, 2003 at 11:58:52AM -0800, Larry McVoy wrote:
> On Sun, Mar 09, 2003 at 09:20:45AM -0800, Zack Brown wrote:
> > People in the know hint at these features ("naming is really important"),
> > but the details are apparently complicated enough that no one wants to sit
> > down and actually describe them. 
> 
> It's perfectly OK for you to go invent a new SCM system.  Go for it.
> But stop asking for help from the BK crowd.

I haven't been asking you for help. I've been asking Linus and other
kernel developers to describe their needs. There seem to be three
camps in this discussion:

1) the people who feel that the hard problems solved by BitKeeper are
crucial

2) the people who feel that the hard problems are not that important,
and that a decent feature set could be designed to handle pretty much
everything anyone might normally need

3) the people who want features that are not really related to finding a
BitKeeper alternative.

My own opinion is that the people in camp (2) are falling into the trap which
has been described often enough, in which they will realize their design
mistakes too late to do anything about them. While the people in camp (3)
seem to be getting ahead of the game. The features they want are all great,
but the question of the basic structure still remains.

I think what needs to be done is to identify the hard problems, so that
any version control project that starts up can avoid mistakes that will
put a glass ceiling over their heads. Even if they choose not to implement
everything, or if they choose to implement features orthogonal to a real
BitKeeper alternative, they would still have the proper framework to raise
the project to the highest level later.

Of kernel developers, only Linus seems to have a clear idea of what the kernel
development process' needs are; but aside from insisting that distribution
is key (which people in camp (1) know already), he hasn't gone into the kind
of detail that folks would need in order to actually make a decent attempt.

Be well,
Zack

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 21:32                                     ` Zack Brown
@ 2003-03-09 21:54                                       ` Valdis.Kletnieks
  2003-03-09 23:28                                         ` Larry McVoy
  0 siblings, 1 reply; 155+ messages in thread
From: Valdis.Kletnieks @ 2003-03-09 21:54 UTC (permalink / raw)
  To: Zack Brown
  Cc: Larry McVoy, Martin J. Bligh, Roman Zippel, Eric W. Biederman,
	Linus Torvalds, Larry McVoy, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 897 bytes --]

On Sun, 09 Mar 2003 13:32:46 PST, Zack Brown <zbrown@tumblerings.org>  said:

> Of kernel developers, only Linus seems to have a clear idea of what the kernel
> development process' needs are; but aside from insisting that distribution
> is key (which people in camp (1) know already), he hasn't gone into the kind
> of detail that folks would need in order to actually make a decent attempt.

It's quite possible that even Linus doesn't have a clear cognitive grasp of
all the problems - Larry gave BK to Linus to prevent burn-out.  I'd not be
surprised if Linus was so busy dealing with the *first* order problems in
the pre-BK world (just getting patches to apply to his tree) that he never
encountered all the 'tough problems', and once he started using BK, he
also never hit any of the 'tough problems' because Larry's crew had already
spent 40 man-years making sure Linus *didn't* hit them.

[-- Attachment #2: Type: application/pgp-signature, Size: 226 bytes --]

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 21:54                                       ` Valdis.Kletnieks
@ 2003-03-09 23:28                                         ` Larry McVoy
  0 siblings, 0 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-09 23:28 UTC (permalink / raw)
  To: Valdis.Kletnieks
  Cc: Zack Brown, Larry McVoy, Martin J. Bligh, Roman Zippel,
	Eric W. Biederman, Linus Torvalds, linux-kernel

On Sun, Mar 09, 2003 at 04:54:02PM -0500, Valdis.Kletnieks@vt.edu wrote:
> On Sun, 09 Mar 2003 13:32:46 PST, Zack Brown <zbrown@tumblerings.org>  said:
> 
> > Of kernel developers, only Linus seems to have a clear idea of what the kernel
> > development process' needs are; but aside from insisting that distribution
> > is key (which people in camp (1) know already), he hasn't gone into the kind
> > of detail that folks would need in order to actually make a decent attempt.
> 
> It's quite possible that even Linus doesn't have a clear cognitive grasp of
> all the problems - Larry gave BK to Linus to prevent burn-out.  I'd not be
> surprised if Linus was so busy dealing with the *first* order problems in
> the pre-BK world (just getting patches to apply to his tree) that he never
> encountered all the 'tough problems', and once he started using BK, he
> also never hit any of the 'tough problems' because Larry's crew had already
> spent 40 man-years making sure Linus *didn't* hit them.

Bingo.  We work hard to make sure that we've thought of and solved the
problems *before* they are hit in the field.  We try to be proactive,
not reactive (at least in coding, mailing lists are another matter).
We're not that great at it, but we've definitely solved all sorts of
problems long before Linus did anything to hit them.  
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 19:58                                   ` Larry McVoy
  2003-03-09 21:32                                     ` Zack Brown
@ 2003-03-13 20:00                                     ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-13 20:00 UTC (permalink / raw)
  To: Larry McVoy, Zack Brown, Martin J. Bligh, Roman Zippel,
	Eric W. Biederman, Linus Torvalds, Larry McVoy, linux-kernel

Hi!

> Get it through your thick head that BK is something valuable to this
> community, even if you don't use it you directly benefit from its use.
> All you people trying to copy BK are just shooting yourselves in the foot
> unless you can come up with a solution that Linus will use in the short
> term.  And nobody but an idiot believes that is possible.  So play nice.
> Playing nice means you can use it, you can't copy it.  You can also
> go invent your own SCM system, go for it, it's a challenging problem,
> just don't use BK's files, commands, or anything else in the process.

Eh? It is perfectly okay to look at BK's
commands, ask people how BK works
and study its docs. (Heh, anyone still
has sources of BK from the time it was
available, preferably as hardcopy,
so no license needs to be agreed to
for looking at it?)

> BK is made available for free for one reason and one reason only: to
> help Linus not burn out.  That's based on my personal belief that he is
> critical to success of the Linux effort, he is a unique resource and has
> to be protected.  I've paid a very heavy price for that belief and I'm
> telling you that you are right on the edge of making that price too high.

So go ahead and disallow no-price use of
bitkeeper. It will reduce flamewars on
l-k quite a bit...
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 16:55                               ` Martin J. Bligh
  2003-03-09 17:20                                 ` Zack Brown
@ 2003-03-09 17:39                                 ` Linus Torvalds
  2003-03-09 17:58                                   ` Martin J. Bligh
                                                     ` (2 more replies)
  1 sibling, 3 replies; 155+ messages in thread
From: Linus Torvalds @ 2003-03-09 17:39 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-kernel

On Sun, 9 Mar 2003, Martin J. Bligh wrote:
> 
> I think it's possible to get 90% of the functionality that most of us
> (or at least I) want without the distributed stuff.

No, I really don't think so.

The distribution is absolutely fundamental, and _the_ reason why I use BK.

Now, it's true that in 90% of all cases (probably closer to 99%) you will
never see the really nasty cases Larry was talking about. People just
don't rename files that much, and more importantly: when they do, they
very very seldom have anybody else doing the same.

But what are you going to do when it happens? Because it _does_ happen:  
different people pick up the same patch or fix suggestion from the mailing
list, and do that as just a small part of normal development. Are the 
tools now going to break down?

BK doesn't. That's kind of the point. Larry 

> If the "maintainer" hierarchy was a strict tree structure, where you 
> send patches to your parent, and receive them from your children, that
> doesn't seem to need anything particularly fancy to me. 

But it's not, and the above would make BK much less than it is today.

One of the things I always hated about CVS is how it makes it IMPOSSIBLE 
to "work together" on something between two different random people. Take 
one person who's been working on something for a while, but is chasing 
that one final bug, and asks another person for help. It just DOES NOT 
WORK in the CVS mentality (or _any_ centralized setup).

You have to either share the same sandbox (without any source control
support AT ALL), or you have to go to the central repository and create a
branch (never mind that you may not have write permissions, or that you 
may not know whether it's going to ever be something worthwhile yet).

With BK, the receiver just "bk pull"s. And if he is smart, he does that 
from a cloned repository so that after he's done with it he will just do a 
"rm -rf" or something.

This is FUNDAMENTAL.

And yes, maybe the really hard cases are rare. But does that mean that you 
aren't going to do it?

		Linus

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 17:39                                 ` Linus Torvalds
@ 2003-03-09 17:58                                   ` Martin J. Bligh
  2003-03-09 18:20                                   ` Larry McVoy
  2003-03-09 20:01                                   ` Roman Zippel
  2 siblings, 0 replies; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-09 17:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Roman Zippel, Eric W. Biederman, Zack Brown, Larry McVoy, linux-kernel

>> I think it's possible to get 90% of the functionality that most of us
>> (or at least I) want without the distributed stuff.
> 
> No, I really don't think so.
> 
> The distribution is absolutely fundamental, and _the_ reason why I use BK.
> 
> Now, it's true that in 90% of all cases (probably closer to 99%) you will
> never see the really nasty cases Larry was talking about. People just
> don't rename files that much, and more importantly: when they do, they
> very very seldom have anybody else doing the same.
> 
> But what are you going to do when it happens? Because it _does_ happen:  
> different people pick up the same patch or fix suggestion from the mailing
> list, and do that as just a small part of normal development. Are the 
> tools now going to break down?

I'm going to fix it by hand ;-) As long as it stops at a sensible point,
and clearly barfs and says what the problem is, that's fine by me.

> BK doesn't. That's kind of the point. Larry 

Right ... I appreciate that. I'd just rather fix things up by hand 1% of
the time than use Bitkeeper myself. I'm not trying to stop *you* using
Bitkeeper by any stretch of the imagination ... you probably need the
heavyweight tools, but I'm OK without them.

> This is FUNDAMENTAL.
> 
> And yes, maybe the really hard cases are rare. But does that mean that you 
> aren't going to do it?

Yup, that's exactly what I'm saying. I'm not saying this is as good as bitkeeper,
I'm saying it's "good enough" for me and I suspect several others (not saying 
it's good enough for you), and significantly better than diff and patch.
(though cp -lR is *blindingly* fast, and diff understands hard links).
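
For anyone who hasn't used the hard-link trick, here is roughly what it
amounts to, sketched in Python so the moving parts are visible. The
helper names are made up, and it assumes your editor (or patch) writes a
new file and renames it into place rather than overwriting in place.

```python
import os


def link_tree(src, dst):
    """Recreate the directory tree of src under dst, hard-linking files."""
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target, exist_ok=True)
        for name in files:
            os.link(os.path.join(root, name), os.path.join(target, name))


def replace_file(path, data):
    """Write data to a fresh inode, leaving the other tree's copy alone."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
    os.replace(tmp, path)  # the rename breaks the hard link


def changed_files(old, new):
    """Relative paths under new that are not the same inode as in old."""
    changed = []
    for root, dirs, files in os.walk(new):
        for name in files:
            new_path = os.path.join(root, name)
            rel = os.path.relpath(new_path, new)
            old_path = os.path.join(old, rel)
            same = os.path.exists(old_path) and os.path.samefile(old_path, new_path)
            if not same:
                changed.append(rel)
    return sorted(changed)
```

Files that are still the same inode in both trees cannot differ, so only
the handful you actually touched need their contents compared.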

M.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 17:39                                 ` Linus Torvalds
  2003-03-09 17:58                                   ` Martin J. Bligh
@ 2003-03-09 18:20                                   ` Larry McVoy
  2003-03-09 23:19                                     ` fs
  2003-03-13  0:41                                     ` Pavel Machek
  2003-03-09 20:01                                   ` Roman Zippel
  2 siblings, 2 replies; 155+ messages in thread
From: Larry McVoy @ 2003-03-09 18:20 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Martin J. Bligh, Roman Zippel, Eric W. Biederman, Zack Brown,
	Larry McVoy, linux-kernel

> And yes, maybe the really hard cases are rare. But does that mean that you 
> aren't going to do it?

This is sort of the point I've been trying to make for years.  It is 
unlikely that an open source project is going to solve these problems.
It's possible, but unlikely because the problems are rare and the code to
solve them is incredibly difficult.  It isn't obvious at all, it wasn't
obvious to me the first time around, it's only after you've done it that
you can see how something that appeared really simple wasted 6 months.

In the open source model, the portion of the work which is relatively
easy gets done, but the remaining part only gets done if there is a
huge amount of pressure to do so.  If you take a problem which occurs
only rarely, is difficult to solve, and has only a small set of users,
that's a classic example of something that just isn't going to get fixed
in the open source environment.  

It's a lot different when you have a very small set of users and the
solutions are very expensive.  I'm not saying that people don't solve hard
problems in open source projects, they do, the kernel is a good example.
The kernel also has millions of users, gets all sorts of friendly press
every day, and is fun.  In the SCM space, there are hundreds of products
for a potential market that is about 4000 times smaller than the potential
market for the kernel.  

SVN is a good example.  They side stepped almost all of the problems
that BK solves and it was absolutely the right call.  It would have cost
them millions to solve them and their product is free, it would take 
decades to recoup the investment at the low rates they can charge for
support or bundling or hosting.

Going back to the engineering problems, those problems are not going to
get fixed by people working on them in their spare time, that's for sure,
it's just not fun enough nor are they important enough.  Who wants to
spend a year working on a problem which only 10 people see in the world
each year?  And commercial customers aren't going to pay for this either
if the model is the traditional open source support model.  If you hit a
problem and it costs us $200K to fix it and you only hit it a few times
a year, if that, then you are not going to be OK with us billing you
that $200K, there isn't a chance that will work.

I'm starting to think that the best thing I could do is encourage Pavel &
Co to work as hard as they can to solve these problems.  Telling them that
it is too hard is just not believable, they are convinced I'm trying to
make them go away.  The fastest way to make them go away is to get them
to start solving the problems.  Let's see how well Pavel likes it when
people bitch at him that BitBucket doesn't handle problem XYZ and he
realizes that he needs to take another year of 80 hour weeks to fix it.
Go for it, dude, here's hoping that we can make it as pleasant for you
as you have made it for us.  Looking forward to it.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 18:20                                   ` Larry McVoy
@ 2003-03-09 23:19                                     ` fs
  2003-03-13  0:41                                     ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: fs @ 2003-03-09 23:19 UTC (permalink / raw)
  To: Larry McVoy, Linus Torvalds, Martin J. Bligh, Roman Zippel,
	Eric W. Biederman, Zack Brown, Larry McVoy, linux-kernel

On Sun, Mar 09, 2003 at 10:20:09AM -0800, Larry McVoy wrote:
> In the open source model, the portion of the work which is relatively
> easy gets done, but the remaining part only gets done if there is a
> huge amount of pressure to do so.  If you take a problem which occurs
> only rarely, is difficult to solve, and has only a small set of users,
> that's a classic example of something that just isn't going to get fixed
> in the open source environment.  

You are wrong. The license that you and your team chose is well respected
here, both by the tree maintainer and by its users, but we don't need to go
further into pissing on open source projects just because your project wouldn't
make it if it were one. I (an almost anonymous reader) and most people here
respect both your work and your honesty in describing why you made it
commercial, but that is one thing, and generalizing is another.

The Linux kernel itself is a good example. It has code for things
that Microsoft will create when people need them to a great extent, like
IPv6, an encryption API and IA-64/x64 support. Well, the examples are
numerous and I'm sure some experienced hackers can enlighten you
better.

The GRUB bootloader is another example: an open source project that
supports almost any kernel in existence and offers a command line with
autocompletion on demand. These are features that *nobody asked for*, but
they exist.

People more experienced with open source projects will, I'm sure, say "wtf,
there are plenty of better examples".

And think of it the other way around. If a closed source project is more
advanced at something, that is a result of what *its* users want. If Microsoft
is better at GUIs, that is a result of what its users want. The open source
operating systems have traditionally (for the past 10 years) been better at
networking and multiuser capabilities because that's what their users want.

That of course fits with your words, but the fact that most closed source
projects do indeed follow what their users want doesn't make a
difference.

So, if your project is better, that's another thing. That you and your team
chose to make it commercial is well respected and understood. Even better
understood is the fact that you actually *spend money* on it. It is a
fundamental right of yours to do what you want with your code, especially when
it is a matter of personal economic health. But generalizing that to
say that every open source project is just a hobbyist thing that is
always inferior to closed source unless 2^64 people ask for a feature?

No sir, real examples show things differently.

-fs

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 18:20                                   ` Larry McVoy
  2003-03-09 23:19                                     ` fs
@ 2003-03-13  0:41                                     ` Pavel Machek
  2003-03-13 21:21                                       ` Horst von Brand
  1 sibling, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-13  0:41 UTC (permalink / raw)
  To: Larry McVoy, Linus Torvalds, Martin J. Bligh, Roman Zippel,
	Eric W. Biederman, Zack Brown, Larry McVoy, linux-kernel

Hi!

> Going back to the engineering problems, those problems are not going to
> get fixed by people working on them in their spare time, that's for sure,
> it's just not fun enough nor are they important enough.  Who wants to
> spend a year working on a problem which only 10 people see in the world
> each year?  And commercial 
Well, if it happens only to 10 people per
year, it is a non-problem.

> I'm starting to think that the best thing I could do is encourage Pavel &
> Co to work as hard as they can to solve these problems.  Telling them that
> it is too hard is just not believable, they are convinced I'm trying to
> make them go away.  The fastest way to make them go away is to get them
> to start solving the problems.  Let's see how well Pavel likes it when
> people bitch at him that BitBucket doesn't handle problem XYZ and he

If it only happens so rarely, people
are unlikely to complain too loudly.

Take a look at e2fsck. That's similar to
bk -- an awful lot of corner cases. And
guess what, if you mess up your disk
badly enough, it will just tell you to
fix it by hand (deallocate block free bitmap
in full group). And it's okay.
(Plus I believe chkdsk has *way* bigger
problems than that.)
I'm sure you are not going to throw away
ext2 just because it has 1-person-per-3-years
problem. 99% solution is going to be
good enough for me, Andrea and
Martin. Linus can keep using bk.
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13  0:41                                     ` Pavel Machek
@ 2003-03-13 21:21                                       ` Horst von Brand
  0 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-13 21:21 UTC (permalink / raw)
  To: Pavel Machek; +Cc: linux-kernel

[Cc: chopped _way_ down]
Pavel Machek <pavel@ucw.cz> said:

[...]

> Take a look at e2fsck. That's similar to
> bk -- an awful lot of corner cases. And
> guess what, if you mess up your disk
> badly enough, it will just tell you to
> fix it by hand (deallocate block free bitmap
> in full group). And it's okay.
> (Plus I believe chkdsk has *way* bigger
> problems than that.)
> I'm sure you are not going to throw away
> ext2 just because it has 1-person-per-3-years
> problem. 99% solution is going to be
> good enough for me, Andrea and
> Martin. Linus can keep using bk.

"Sorry, corner case encountered. Your repository is toast, get a fresh
copy" will make you an extremely popular sort of game...
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 17:39                                 ` Linus Torvalds
  2003-03-09 17:58                                   ` Martin J. Bligh
  2003-03-09 18:20                                   ` Larry McVoy
@ 2003-03-09 20:01                                   ` Roman Zippel
  2 siblings, 0 replies; 155+ messages in thread
From: Roman Zippel @ 2003-03-09 20:01 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Martin J. Bligh, Eric W. Biederman, Zack Brown, Larry McVoy,
	linux-kernel

Hi,

On Sun, 9 Mar 2003, Linus Torvalds wrote:

> The distribution is absolutely fundamental, and _the_ reason why I use BK.

I agree that this is an important aspect, and for your kind of work it's 
absolutely necessary.
But source management is more than just distributing and merging changes. 
E.g. if I want to develop a driver, I would start like most people from a 
stable base, that means 2.4. At a certain point the development splits 
into a development and stable branch, eventually I also want to merge my 
driver into the 2.5 tree.
This means I have to deal with 5 different source trees (branches), two 
branches track external trees and I want to know what has been merged from 
my development into my 2.4 and 2.5 stable branches, which I can use to 
make official releases. I want to be able to push multiple changes as a 
single change into the stable branches and it should be able to tell me 
which changes are still left.
If there were a free SCM system that could do this, I could easily 
do without a distributed option. Although I think that once it got 
this far, it should be relatively easy to add a distribution mechanism (by 
using a separate branch, which is only used for pulling changes). OTOH I 
suspect that it will be very hard to add the other capabilities to bk 
without a major redesign, as it's not a simple hierarchical structure 
anymore.

bye, Roman

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09 13:34                           ` Eric W. Biederman
  2003-03-09 15:35                             ` Roman Zippel
@ 2003-03-13  0:13                             ` Pavel Machek
  1 sibling, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-13  0:13 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Roman Zippel, Linus Torvalds, Zack Brown, Larry McVoy, linux-kernel

Hi!

> I don't know, if the problem really changes that much.  How do
> you pick a globally unique inode number for a file? 

Use <emailaddress>@locallyuniq. Every
developer should have an email, right? :-)

> And then
> how do you reconcile this when people on 2 different branches create
> the same file and want to merge their versions together?

That's a conflict, and user interaction is
necessary at this point.

				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  3:42                       ` Linus Torvalds
  2003-03-09  4:32                         ` Roman Zippel
@ 2003-03-09 14:49                         ` Olivier Galibert
  2003-03-13  0:05                         ` Pavel Machek
  2 siblings, 0 replies; 155+ messages in thread
From: Olivier Galibert @ 2003-03-09 14:49 UTC (permalink / raw)
  To: linux-kernel

On Sat, Mar 08, 2003 at 07:42:24PM -0800, Linus Torvalds wrote:
> 
> On Sun, 9 Mar 2003, Roman Zippel wrote:
> > On Sat, 8 Mar 2003, Zack Brown wrote:
> > 
> > >   * Distributed rename handling.
> > 
> > > This is actually a very bk-specific problem, because the real problem is
> > > that under bk there can be only one src/SCCS/s.foo.c.
> 
> I don't think that is the issue.
> 
> [ Well, yes, I agree that the SCCS format is bad, but for other reasons ]

It is a large part of the issue though.  If you don't have one
repository file per project file, with a name that resembles the
repository's one, you find out that the project file name is somewhat
unimportant, just another piece of metadata to track.

> The problem is _distribution_.

The only problem with distribution is sending as little as possible
over the network.  All the problems you're talking about exist with a
single repository as soon as you have decent branches.

> In other words, two people rename the same 
> file. Or two people rename two _different_ files to the same name. Or two 
> people create two different files with the same name. What happens when 
> you merge?

A conflict, what else?  The file name is only one of the
characteristics of a file.  And BTW, the interesting problem, which is
what to do when you find out that two different files are in fact
the same one, is not covered by bk (or wasn't).

  OG.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  3:42                       ` Linus Torvalds
  2003-03-09  4:32                         ` Roman Zippel
  2003-03-09 14:49                         ` Olivier Galibert
@ 2003-03-13  0:05                         ` Pavel Machek
  2 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-13  0:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Roman Zippel, Zack Brown, Larry McVoy, linux-kernel

Hi!

> > A separate repository doesn't have this problem
> 
> You're wrong.
> 
> The problem is _distribution_. In other words, two people rename the same 
> file. Or two people rename two _different_ files to the same name. Or two 
> people create two different files with the same name. What happens when 
> you merge?
> 
> None of these are issues for broken systems like CVS or SVN, since they 

Actually this does not have much to do
with a central repository. prcs has a central
repository, too, but it has branches
(=multiple repositories in bk); so
yes, you have the very same problem.

prcs does not have problems like trust
and non-synchronized time, though.
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Thoughts about ideal kernel SCM
  2003-03-09  2:45                   ` Zack Brown
  2003-03-09  3:19                     ` Roman Zippel
@ 2003-03-10  0:02                     ` Petr Baudis
  2003-03-10  0:32                       ` Larry McVoy
                                         ` (2 more replies)
  2003-03-10  3:41                     ` BitBucket: GPL-ed KitBeeper clone Horst von Brand
                                       ` (2 subsequent siblings)
  4 siblings, 3 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-10  0:02 UTC (permalink / raw)
  To: Zack Brown; +Cc: Larry McVoy, Linus Torvalds, linux-kernel

Dear diary, on Sun, Mar 09, 2003 at 03:45:22AM CET, I got a letter,
where Zack Brown <zbrown@tumblerings.org> told me, that...
> On Sat, Mar 08, 2003 at 04:05:14PM -0800, Larry McVoy wrote:
> > Zack Brown wrote:
> > > Linus Torvalds wrote:
> > > > Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> > > > way indefinitely since most people don't seem to even understand _why_
> > > > it is superior. 
> > > 
> > > You make it sound like no one is even interested ;-). But it's not true! A
> > > lot of people currently working on alternative version control systems would
> > > like very much to know what it would take to satisfy the needs of kernel
> > > development.
> > 
> > [Long rant, summary: it's harder than you think, read on for the details]
> [skipping long description]
> 
> OK, so here is my distillation of Larry's post.

I've decided to elaborate a little more on how BK in fact works, for those who
don't use it and don't want to read through all the documentation, and also to
share some thoughts on and possible solutions to the individual problems.

All this is derived from various LKML threads and BK.com's documentation, as
I'm not permitted to use BK myself; corrections are more than welcome.

>   Basic summary: a distributed, replicated, version controlled user level file
>   system with no limits on any of the file system events which may happen
>   in parallel. All changes must be put correctly back together, no matter how
>   much parallelism there has been.

[in the following text, "checkin" and "commit" are not interchangeable;
"checkin" means recording a single set of changes to one file, "commit" means
forming a changeset from several checked-in changes across several files; this
mirrors BK's semantics]

I'd add

  * ChangeSets.

at the top. Unlike in e.g. SVN, checkins of changes and commits of changesets
are separate in BK, and that sounds like a good thing to do --- it encourages
people to check in more frequently and to group the uncommitted changes into a
changeset once the changes are finished and good enough. See also
http://www.bitkeeper.com/UG/Getting.Working.Checking.html. Basically, you
check in files as you want, and the checkins to individual files are
independent. When you finish some logical change spanning several files, you
use bk commit and the checkins which aren't part of any changeset yet are
automagically grouped into one; you write a summary comment for the changeset,
the ChangeSet revision number increases, and somewhere it is written down
which checkins are part of this ChangeSet. A changeset is then the atomic unit
for sharing changes with others; that is, you must form one in order to make
the changes available.
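
[A minimal sketch of the checkin/commit split, in Python. This is only an
illustration of the semantics as described here, not of how BK stores anything;
the Repo class and its checkin/commit methods are hypothetical names.]

```python
class Repo:
    """Toy model: per-file checkins are grouped into changesets on commit."""

    def __init__(self):
        self.file_revs = {}    # path -> latest per-file revision number
        self.pending = []      # (path, rev) checkins not yet in any changeset
        self.changesets = []   # list of (comment, [(path, rev), ...])

    def checkin(self, path):
        """Record one change to one file; independent of all other files."""
        rev = self.file_revs.get(path, 0) + 1
        self.file_revs[path] = rev
        self.pending.append((path, rev))
        return rev

    def commit(self, comment):
        """Group all pending checkins into one changeset; return its number."""
        if not self.pending:
            raise ValueError("nothing to commit")
        self.changesets.append((comment, self.pending))
        self.pending = []
        return len(self.changesets)


repo = Repo()
repo.checkin("fs/inode.c")
repo.checkin("fs/inode.c")            # second checkin to the same file
repo.checkin("include/linux/fs.h")
cset = repo.commit("fix inode refcounting")
```

Here three checkins across two files end up in a single changeset, which is the
only unit that would be shared with other repositories.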

The more-or-less opposite concept is to have each checkin (or set of checkins,
when you check in multiple files at once) be a changeset (this is what SVN
does) --- then you don't need per-file revision numbers but just one
per-repository revision number which is increased by each checkin (which is
also a commit in SVN). This can seem more elegant and generic, but I personally
believe that it's better to keep checkins and changeset commits separate. Then
per-repository revision numbers should obviously increase with each commit,
not with each checkin.

In BK, you usually work with the changeset numbers, but for the internal
structure the revision numbers are also important. Since a changeset number
can be taken as the revision number of the ChangeSet metafile, I will operate
mostly with revision numbers below.

>   * Merging.
> 
>   * The graph structure.

About these two, it could be worth noting how BK works now, what branching and
merging look like, and how they could be done internally.

When you want to branch, you just clone the repository --- each clone
automatically creates a branch of the parent repository. Similarly, you merge
the branch by simply pulling the "branch repository" back. This way the
distributed repositories concept is tightly connected with the branch&merge
concept. When I talk about merging below, it doesn't matter whether it
happens from a cloned repository just one directory away or over the network
from a different machine.

[note that the following was figured out from various sources, not from the
documentation, where I couldn't find it; thus I may well be completely wrong
here; please substitute "is" with "looks to be", "I think" etc. in the
following text]

BK works with a DAG (Directed Acyclic Graph) formed from the versions; however,
the graph looks different from each repository (the diagrams show ChangeSet
numbers).

From the imaginary Linus' branch, it looks like:

linus  1.1 -> 1.2 -> 1.3 -----> 1.4 -> 1.5 -----> 1.6 -----> 1.7
                \               / \                          /
alan             \-> 1.2.1.1 --/---\-> 1.2.1.2 -> 1.2.1.3 --/

But from Alan's branch, it looks like:

linus  1.1 -> 1.2 -> 1.2.1.1 -> 1.2.1.2 -> 1.2.1.3 -> 1.2.1.4 -> 1.2.1.5
                \               / \                              /
alan             \-> 1.3 ------/---\-----> 1.4 -----> 1.5 ------/

But now, how does merging happen? One of the goals is to preserve history even
when merging. Thus you merge the individual per-file checkins of the changeset
one by one, each checkin receiving its own revision in the target tree as well
--- this means the revision numbers of individual checkins change during a
merge if there were other checkins to the file between fork and merge.

But it's a bit more complicated: the ChangeSet revision number is not globally
unique either, and it changes. You cannot make it globally unique at clone
time, because then you would have to increase the branch identifier after each
clone (most of which are probably just read-only). Thus in the cloned
repository you work as if you were continuing in the branch you just cloned,
and the branch number is assigned during merge.

A virtual branch (used only to track ChangeSets, not per-file revisions) is
created in the parent repository during merge, where the merged changesets
receive new numbers appropriate for the branch. However, the branch is really
only virtual and there is still only one line of development in the repository.
If you want to see the ChangeSets in the order in which they were applied and
are present in the files, you have to sort them not by revision but by merge
time. Thus the order in which they are applied to the files is (from Linus'
POV):

1.1 1.2 1.3 1.2.1.1 1.4 1.5 1.6 1.2.1.2 1.2.1.3 1.7
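
[A small Python sketch of the difference between the two orderings, using the
changeset numbers from Linus' diagram above. The applied_at field is a
hypothetical per-repository sequence counter standing in for "merge time"; it is
my assumption, not something taken from BK.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSet:
    rev: str          # ChangeSet number as seen from Linus' repository
    applied_at: int   # hypothetical local counter: when it landed in this repo


def rev_key(rev):
    """Turn "1.2.1.1" into (1, 2, 1, 1) so revisions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


# The order in which changesets landed in Linus' repository, per the text.
landed = ["1.1", "1.2", "1.3", "1.2.1.1", "1.4", "1.5", "1.6",
          "1.2.1.2", "1.2.1.3", "1.7"]
history = [ChangeSet(rev, i) for i, rev in enumerate(landed)]

by_revision = [c.rev for c in sorted(history, key=lambda c: rev_key(c.rev))]
by_merge = [c.rev for c in sorted(history, key=lambda c: c.applied_at)]
```

Sorting by revision puts the whole 1.2.1.x branch right after 1.2, which is not
the order in which the changes actually reached the files; sorting by the local
counter reproduces the sequence listed above.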

>   * Distributed rename handling. Centralized systems like Subversion don't
>   have as many problems with this because you can only create one file in
>   one directory entry because there is only one directory entry available.
>   In distributed rename handling, there can be an infinite number of different
>   files which all want to be src/foo.c. There are also many rename corner-cases.

One obvious solution occurs to me here. First, you virtualize files into inodes
and give them numbers (in practice this is not necessary, and in fact it could
be better not to do it, but it is much easier to think about it as if it were
done this way) --- the numbers don't have to be globally unique, they are just
a convenience abstraction; they are inherited upon clone, though. Then in the
repository each file is named just by that inode number, and for each inode you
keep a history of the names it has had and the revision in which each name was
assigned (thus you also know in which changeset it was assigned).

When you are merging an "inode", you just go back to the last common ChangeSet
revision in the name history and look at what the name is. If there's no name
for that changeset yet, it's a new file, and if there's a filename conflict,
you cannot do much about it. Otherwise you know that the inode number has to be
the same in both repositories. Then you just rename the inode in the target
repository to its current name in the source repository. If there is a
conflict, you check whether you can repeat this whole operation on the file
that is in the way in the target repository --- if not (or if you can but the
conflict is still not resolved), you again probably cannot do much about it
and you have to let the user decide.
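
[A rough Python sketch of the name-history idea above, under loud assumptions:
changesets are plain integers that both repositories share up to the last common
one, each inode maps to a list of (changeset, name) pairs, and name_at and
plan_merge are hypothetical helpers. It only plans the merge, skips the "repeat
the operation on the file in the way" step, and leaves every conflict to the
user.]

```python
def name_at(history, cset):
    """Name an inode had as of changeset `cset`, or None if it didn't exist."""
    name = None
    for assigned_in, assigned_name in history:   # ascending changeset order
        if assigned_in <= cset:
            name = assigned_name
    return name


def plan_merge(source, target, common):
    """Return (renames, creates, conflicts) for pulling source into target."""
    taken = {hist[-1][1]: ino for ino, hist in target.items()}
    renames, creates, conflicts = {}, [], []
    for ino, hist in source.items():
        base_name = name_at(hist, common)
        new_name = hist[-1][1]
        if base_name is None:
            # No name at the common changeset: a file created after the fork.
            if new_name in taken:
                conflicts.append((ino, new_name))
            else:
                creates.append(new_name)
        elif new_name != base_name:
            # Same inode in both repositories, renamed on the source side.
            owner = taken.get(new_name)
            if owner is None:
                renames[ino] = new_name
            elif owner != ino:
                conflicts.append((ino, new_name))
    return renames, creates, conflicts


source = {1: [(1, "src/foo.c"), (3, "src/bar.c")],
          2: [(1, "Makefile")],
          7: [(4, "README")],
          8: [(4, "Documentation/scm.txt")]}
target = {1: [(1, "src/foo.c")],
          2: [(1, "Makefile")],
          9: [(5, "README")]}
renames, creates, conflicts = plan_merge(source, target, common=2)
```

In this example inode 1 is simply renamed, the new documentation file is
created, and the two independently created README files are reported as a
conflict.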

What am I missing?

>   * Symbolic tags. This is adding a symbolic label on a revision. A distributed
>   system must handle the fact that the same symbol can be put on multiple
>   revisions. This is a variation of file renaming. One important thing to
>   consider is that time can go forward or backward.

You remap the tags when you remap the changeset numbers, and? BK seems to allow
one tag to be on multiple changesets and I presume that the latest one is then
normally used --- you can do something similar here: the latest tag of that
name is normally used, and the merged ones are just preserved in the history.
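
[A tiny Python sketch of "the latest such-named tag wins", assuming that
"latest" means latest in the repository's own merge order rather than by
wall-clock time. The latest_tagged helper and the tag placement are made up
for illustration.]

```python
def latest_tagged(tagged, applied):
    """Pick the tagged changeset that landed last in this repository."""
    position = {rev: i for i, rev in enumerate(applied)}
    return max(tagged, key=lambda rev: position[rev])


# Merge order from Linus' point of view, as listed earlier in this mail.
applied = ["1.1", "1.2", "1.3", "1.2.1.1", "1.4", "1.5", "1.6",
           "1.2.1.2", "1.2.1.3", "1.7"]

# The same symbolic tag was put on two changesets in different repositories.
tagged = ["1.3", "1.2.1.2"]
winner = latest_tagged(tagged, applied)
```

Note that 1.2.1.2 wins here even though 1.3 looks like the higher revision
number; the other tagged changeset stays in the history.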

>   * Security semantics. Where should they go? How can they be integrated
>   into the system? How are hostile users handled when there is no central
>   server to lock down?

I'm not sure exactly which points this attempts to bring up. Which particular
issues are open here? This is mostly a question of the configuration of
individual repositories (whether you allow pushes, and from whom) and of trust
(whether you pull, and from whom), isn't it?

>   * Time semantics. A distributed system cannot depend on reported time
>   being correct. It can go forward or backward at any rate.

Yes, then let's not depend on the time ;-).

Kind regards,

-- 

				Petr "Pasky" Baudis
.
When in doubt, use brute force.
		-- Ken Thompson
.
Crap: http://pasky.ji.cz/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Thoughts about ideal kernel SCM
  2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
@ 2003-03-10  0:32                       ` Larry McVoy
  2003-03-12 19:29                         ` Petr Baudis
  2003-03-13 10:36                       ` Pavel Machek
  2003-03-17 20:59                       ` Petr Baudis
  2 siblings, 1 reply; 155+ messages in thread
From: Larry McVoy @ 2003-03-10  0:32 UTC (permalink / raw)
  To: Zack Brown, Linus Torvalds, linux-kernel

> What am I missing?

Nothing that half a decade of work wouldn't fill in :)

More seriously, lots.  I keep saying this and people keep not hearing it,
but it's the corner cases that get you.  You seem to have a healthy grasp
of the problem space on the surface but in reading over your stuff, you 
aren't even where I was before BK was started.  That's not supposed to be
offensive, just an observation.  As you try out the ideas you described
you'll find that they don't work in all sorts of corner cases and the 
problem is that there are a zillion of them.  And the solutions have
this nasty habit of fighting with each other, you solve one problem
and that creates another.

The thing we've found is that this problem space is much bigger than one
person can handle, even an exceptionally talented person.  The number of
variables is such that you can't do it in your head, you need to have a
muse for each area and both of you have to be thinking about it full time.

This isn't a case of "oh, I get it, now I'll write the code".  It's a
case of "write the code, deploy the code, get taught that it didn't work,
get the insight from that, write new code, repeat".  And the problems are
such that if you aren't on them all the time then you work very slowly,
99% of the work is recreating the state you had in your brain the last
time you were here.  

I strongly urge you to wander off and talk to people who are actually
writing code for real users.  Arch, SVN, CVS, whatever.  Get deeply
involved and understand their choices.  Personally, I'd suggest the SVN
guys because I think they are the most serious, they have a team which
has been together for a long time and thought hard about it.  On the
other hand, Arch is probably closer to mimicking how development really
happens in the real world, in theory, at least, it can do better than BK,
it lets you take out of order changesets and BK does not.  But it is light
years behind SVN in terms of actually working and being a real product.
SVN is almost there, they are self hosting, they eat their own dog food,
Arch is more a collection of ideas and some shell scripts.  From SVN,
you're going to learn more of the hard problems that actually occur,
but Arch might be a better long term investment, hard to say.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Thoughts about ideal kernel SCM
  2003-03-10  0:32                       ` Larry McVoy
@ 2003-03-12 19:29                         ` Petr Baudis
  0 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-12 19:29 UTC (permalink / raw)
  To: Larry McVoy, Zack Brown, linux-kernel

Dear diary, on Mon, Mar 10, 2003 at 01:32:33AM CET, I got a letter,
where Larry McVoy <lm@bitmover.com> told me, that...
> > What am I missing?
> 
> Nothing that half a decade of work wouldn't fill in :)

Good then ;-).

> More seriously, lots.  I keep saying this and people keep not hearing it,
> but it's the corner cases that get you.  You seem to have a healthy grasp
> of the problem space on the surface but in reading over your stuff, you 
> aren't even where I was before BK was started.  That's not supposed to be
> offensive, just an observation.  As you try out the ideas you described
> you'll find that they don't work in all sorts of corner cases and the 
> problem is that there are a zillion of them.  And the solutions have
> this nasty habit of fighting with each other, you solve one problem
> and that creates another.

Sure, it's expected not to work perfectly. But we must start somewhere; IMHO
that's better than just sitting in one place saying "we won't manage to do it
perfectly anyway". That way we indeed won't; if we actually start doing
something and discussing the basic design ideas, we may.

I can already see notes of flaws^Wshadow areas ;-) in my ideas, but I believe
most of these can be pruned out. The rest will just have to be fixed later.

..snip..
> I strongly urge you to wander off and talk to people who are actually
> writing code for real users.  Arch, SVN, CVS, whatever.  Get deeply
> involved and understand their choices.

Certainly, I'm going to start digging into Arch very soon.

> Personally, I'd suggest the SVN guys because I think they are the most
> serious, they have a team which has been together for a long time and thought
> hard about it.  On the other hand, Arch is probably closer to mimicking how
> development really happens in the real world, in theory, at least, it can do
> better than BK, it lets you take out of order changesets and BK does not.
> But it is light years behind SVN in terms of actually working and being a
> real product.  SVN is almost there, they are self hosting, they eat their own
> dog food, Arch is more a collection of ideas and some shell scripts.
> From SVN, you're going to learn more of the hard problems that actually
> occur, but Arch might be a better long term investment, hard to say.

I would probably base my potential work on Arch (or maybe Arx, I have to
actually compare these, I didn't find any good summary of differences), but I
dislike some concepts so it would be Yet Another Fork anyway ;-).

Kind regards,

-- 
 
				Petr "Pasky" Baudis
.
When in doubt, use brute force.
		-- Ken Thompson
.
Crap: http://pasky.ji.cz/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Thoughts about ideal kernel SCM
  2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
  2003-03-10  0:32                       ` Larry McVoy
@ 2003-03-13 10:36                       ` Pavel Machek
  2003-03-14 22:56                         ` Petr Baudis
  2003-03-17 20:59                       ` Petr Baudis
  2 siblings, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-13 10:36 UTC (permalink / raw)
  To: Zack Brown, Larry McVoy, Linus Torvalds, linux-kernel

Hi!

> > OK, so here is my distillation of Larry's post.
> 
> I've decided to elaborate a little more how BK in fact works for those who
> don't use it and don't want to read over all the documentation, and also share
> some thoughts and possible solutions of the individual problems.
> 

What about committing this to bitbucket CVS?
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: Thoughts about ideal kernel SCM
  2003-03-13 10:36                       ` Pavel Machek
@ 2003-03-14 22:56                         ` Petr Baudis
  0 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-14 22:56 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Zack Brown, Larry McVoy, linux-kernel

Dear diary, on Thu, Mar 13, 2003 at 11:36:15AM CET, I got a letter,
where Pavel Machek <pavel@ucw.cz> told me, that...
> Hi!

Hello,

> > > OK, so here is my distillation of Larry's post.
> > 
> > I've decided to elaborate a little more how BK in fact works for those who
> > don't use it and don't want to read over all the documentation, and also share
> > some thoughts and possible solutions of the individual problems.
> > 
> 
> What about committing this to bitbucket CVS?

Feel free to do anything you want with this, just please keep some credit
there. Maybe you would prefer to use Zack's summary instead, though, dunno.

Kind regards,

-- 
 
				Petr "Pasky" Baudis
.
When in doubt, use brute force.
		-- Ken Thompson
.
Crap: http://pasky.ji.cz/


* Re: Thoughts about ideal kernel SCM
  2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
  2003-03-10  0:32                       ` Larry McVoy
  2003-03-13 10:36                       ` Pavel Machek
@ 2003-03-17 20:59                       ` Petr Baudis
  2 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-17 20:59 UTC (permalink / raw)
  To: Zack Brown, linux-kernel, arch-users

(Please strip lkml from the cc list when replying.)

Dear diary, on Mon, Mar 10, 2003 at 01:02:33AM CET, I got a letter,
where Petr Baudis <pasky@ucw.cz> told me, that...
> Dear diary, on Sun, Mar 09, 2003 at 03:45:22AM CET, I got a letter,
> where Zack Brown <zbrown@tumblerings.org> told me, that...
..snip..
> >   * Merging.
> > 
> >   * The graph structure.
> 
> About these two, it could be worth noting how BK works now, how branching
> and merging look, and how they could be done internally.
> 
> When you want to branch, you just clone the repository --- each clone
> automatically creates a branch of the parent repository. Similarly, you merge
> the branch by simply pulling the "branch repository" back. This way the
> distributed-repositories concept is tightly connected with the branch-and-merge
> concept. When I talk about merging below, it does not matter whether the merge
> happens from a cloned repository one directory away or over the network
> from a different machine.
> 
> [note that the following is pieced together from various resources, not from
> the documentation, where I didn't find it; thus I may well be completely wrong
> here; please substitute "is" by "looks to be", "I think" etc. in the following]
> 
> BK works with a DAG (Directed Acyclic Graph) formed from the versions; however,
> the graph looks different from each repository (diagrams show ChangeSet
> numbers).
> 
>  From the imaginary Linus' branch, it looks like:
> 
> linus  1.1 -> 1.2 -> 1.3 -----> 1.4 -> 1.5 -----> 1.6 -----> 1.7
>                 \               / \                          /
> alan             \-> 1.2.1.1 --/---\-> 1.2.1.2 -> 1.2.1.3 --/
> 
> But from Alan's branch, it looks like:
> 
> linus  1.1 -> 1.2 -> 1.2.1.1 -> 1.2.1.2 -> 1.2.1.3 -> 1.2.1.4 -> 1.2.1.5
>                 \               / \                              /
> alan             \-> 1.3 ------/---\-----> 1.4 -----> 1.5 ------/
> 
> But now, how does merging happen? One of the goals is to preserve history even
> when merging. Thus you merge the individual per-file checkins of the changeset
> one by one, each checkin receiving its own revision in the target tree as well ---
> this means the revision numbers of individual checkins change during merge if
> there were other checkins to the file between fork and merge.
> 
> But it's a bit more complicated: the ChangeSet revision number is not globally
> unique either, and it changes. You cannot make it globally unique
> at clone time, because then you would have to increase the branch identifier
> after each clone (most of which are probably just read-only). Thus in the cloned
> repository you work as if you were continuing on the branch you just cloned, and
> the branch number is assigned during merge.
> 
> A virtual branch (used only to track ChangeSets, not per-file revisions) is
> created in the parent repository during merge, where the merged changesets
> receive new numbers appropriate for the branch. However, the branch is really
> only virtual and there is still only one line of development in the repository.
> If you want to see the ChangeSets in the order they were applied and are present
> in the files, you have to sort them not by revision but by merge time. Thus the
> order in which they are applied to the files is (from Linus' POV):
> 
> 1.1 1.2 1.3 1.2.1.1 1.4 1.5 1.6 1.2.1.2 1.2.1.3 1.7
..snip..

I probably did not explain this well enough initially (nor fully understand it
myself, in fact), as several people asked me about it privately. So I decided
it would be worth elaborating on the "virtual branching" concept (which turns
out not to be *that* virtual after all) and on alternative solutions. While the
current operation may be quite obvious to regular bk users, it probably isn't
to others, and it could be worth documenting. It is also worth deciding (on
arch-list, please) how to actually do it ourselves ;-).

Let's sketch a really simple example of a DAG here, but with much more detail
about revisions. First, the basic situation:

          Linus +-----+     +-----+     +-----+
  BASE      ,-->| 1.2 |---->| 1.3 |---->| 1.4 |--.
 +-----+   /    +-----+     +-----+     +-----+   \   +-------+
 | 1.1 |--<                                        >--| MERGE |
 +-----+   \        +-----+         +-----+       /   +-------+
            `------>| 1.2 |-------->| 1.3 |------'
           Alan     +-----+         +-----+

At the merge point, Linus will merge Alan's changesets committed after the
fork.  However, do we want to do a flat merge, accumulating the changesets into
one big ball and placing it as 1.5? Or rather take the changesets and commit
them separately?

Let's see what BitKeeper appears to do. It will take the common ancestor of
the branches, that is 1.1 here. Then it will pull from the branch being
merged, fork an internal branch at 1.1 and stuff the pulled changesets there.
Thus the result will be the classical image of branches in RCS-like systems
(Linus' perspective):

          Linus +-----+     +-----+     +-----+
  BASE      ,-->| 1.2 |---->| 1.3 |---->| 1.4 |--.
 +-----+   /    +-----+     +-----+     +-----+   \   +-------+
 | 1.1 |--<                                        >--| MERGE |
 +-----+   \     +---------+       +---------+    /   +-------+
            `--->| 1.1.1.1 |------>| 1.1.1.2 |---'
           Alan  +---------+       +---------+

So BK *appears* to have a "classical branching" capability after all, despite
the impression to the contrary, although it seems to be available only for
internal purposes rather than for regular use.
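
The renumbering described above can be sketched roughly as follows. This is
only an illustration of the guessed behaviour, not BitKeeper code: the
function name, the (key, revision) history model and the keys themselves are
all made up here, and it assumes the two histories share a common prefix.

```python
def pull_onto_internal_branch(local, remote, branch_index=1):
    """Renumber changesets pulled from `remote` onto a branch forked at the
    common ancestor. Histories are lists of (key, revision) pairs, oldest
    first; `key` is a hypothetical repository-independent identifier."""
    local_rev = dict(local)          # key -> revision as numbered locally
    ancestor_rev = None
    pulled = []
    for key, _remote_rev in remote:
        if key in local_rev:
            ancestor_rev = local_rev[key]   # latest changeset both sides share
        else:
            pulled.append(key)              # only the remote side has this one
    if ancestor_rev is None:
        raise ValueError("histories share no common ancestor")
    return [(key, f"{ancestor_rev}.{branch_index}.{n}")
            for n, key in enumerate(pulled, start=1)]


linus = [("base", "1.1"), ("l1", "1.2"), ("l2", "1.3"), ("l3", "1.4")]
alan = [("base", "1.1"), ("a1", "1.2"), ("a2", "1.3")]
print(pull_onto_internal_branch(linus, alan))
# [('a1', '1.1.1.1'), ('a2', '1.1.1.2')]
```

Alan's local 1.2 and 1.3 come out as 1.1.1.1 and 1.1.1.2 in Linus' numbering,
matching the diagram.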

After this, the merge itself (done by "bk resolve", judging from the
documentation) performs some magical operation, which _probably_ looks like this:

* combine all these changesets into one big diff (note that in practice you
probably don't do such a silly thing, but just hijack the development line to
include the branched changesets at the right point and skip only the
conflicting delta fragments; however, it is easiest to illustrate as if it were
done this way ;)

* apply this big diff on the top of 1.4

* check it in and hide it from some eyes (it certainly doesn't appear in the
mails which bk emits to our popular mailing lists; I didn't check exactly how
these merges are presented on the web interface, and I have zero clue how
"bk log"-alikes present them; I'd say the log is sorted by date of merge, hence
the order "1.1 1.2 1.3 1.4 1.1.1.1 1.1.1.2", as presented in the previous
mail; also, the merge changeset certainly has to contain information about the
branch being merged, so the log can insert info about 1.1.1.1 and 1.1.1.2
between 1.4 and 1.5).

* note that under some conditions the file revision numbers get branch numbers
as well; it probably happens when merging branches, I didn't investigate this yet

* don't check in the parts which were conflicting, leave them and let the user
resolve them --- these changes won't be hidden and they will appear as the diff
being attached to the "merge changeset"

* bundle all these changes and present them as changeset 1.5, called "Merge
alan with linus" or so ;) :

          Linus +-----+     +-----+     +-----+
  BASE      ,-->| 1.2 |---->| 1.3 |---->| 1.4 |--.     MERGE
 +-----+   /    +-----+     +-----+     +-----+   \   +-----+
 | 1.1 |--<                          1.4 + 1.1.1.2 >--| 1.5 |
 +-----+   \     +---------+       +---------+    /   +-----+
            `--->| 1.1.1.1 |------>| 1.1.1.2 |---'
           Alan  +---------+       +---------+

Now this is certainly an interesting concept, and it is nice for users,
especially because they have to resolve conflicts only _once_, when applying
the combined delta.
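
As a rough sketch of that "bundle everything, resolve once" idea (a toy model,
not BitKeeper's actual algorithm): assume a tree is a dict mapping path to
content and a changeset is a dict mapping path to an (old, new) pair. The
names `combine` and `bundled_merge` are invented for this illustration.

```python
def combine(changesets):
    """Fold a list of changesets into one big delta per path."""
    combined = {}
    for cs in changesets:
        for path, (old, new) in cs.items():
            # keep the oldest "old" and the newest "new" for each path
            first_old = combined[path][0] if path in combined else old
            combined[path] = (first_old, new)
    return combined


def bundled_merge(tree, changesets):
    """Apply the combined delta on top of `tree` in a single step.

    Returns (merged_tree, conflicts); conflicting paths are left untouched
    so the user can resolve them once, inside the merge changeset."""
    merged, conflicts = dict(tree), []
    for path, (old, new) in combine(changesets).items():
        if tree.get(path) == old:
            merged[path] = new          # branch change applies cleanly
        else:
            conflicts.append(path)      # both sides touched this path
    return merged, sorted(conflicts)


tip = {"a.c": "linus", "b.c": "v0"}                     # state at 1.4
branch = [{"b.c": ("v0", "v1")},                        # 1.1.1.1
          {"b.c": ("v1", "v2"), "a.c": ("v0", "alan")}] # 1.1.1.2
print(bundled_merge(tip, branch))
# ({'a.c': 'linus', 'b.c': 'v2'}, ['a.c'])
```

Whatever ends up in the merged tree, plus the manual resolution of the listed
conflicts, would then be recorded as the single merge changeset 1.5.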

However, it is next to impossible to "cherrypick" changes, that is, to select
only some of the changesets to merge. The problem is that you have to spawn
the branch with just this one changeset, but what if you later want to import
a changeset that comes _before_ it? To preserve the order of things, you would
have to move the changeset already on the branch forward to the next revision
number and push the new one into its place. Aside from the weirdness of
revision numbers changing within a single repository, not only backward but
even _forward_ conflicts could happen. So the question is: is it mandatory to
preserve the order of changesets? If the changesets appeared on the branch in
the order in which they were merged, cherrypicking shouldn't be a problem,
IMHO.

Another thing is that the merging procedure is kind of weird: there has to be
some "hijacking" of the development line to get the order right. Otherwise,
however, the concept is not all that bad and it is full of nice ideas. Can we
do better?

There is an alternative approach, which picks the changesets from the
mergee and applies them one by one, stopping the moment there is a conflict.
The conflict is then left to the user to resolve, after which the step-by-step
merge can continue. This results in the merged changesets being committed
directly to the tree as regular changesets, only with some additional info
that they are merged:

          Linus +-----+     +-----+     +-----+
  BASE      ,-->| 1.2 |---->| 1.3 |---->| 1.4 |--.
 +-----+   /    +-----+     +-----+     +-----+   )
 | 1.1 |--' ,------------------------------------' .-- . . .
 +-----+   (        +-----+         +-----+       /
            `------>| 1.5 |-------->| 1.6 |------'
           Alan     +-----+         +-----+

Note that there is no special changeset dedicated to the merge; all the
conflicts are resolved in the individual changesets, thus all the diffs there
are "real". Also, you do no hijacking of the line and all the changesets are
ordinary general ones, with almost no special attributes. The underlying
versioning system doesn't even have to know about branches ;-). However, the
changesets land in the target tree already modified (they are no longer
identical to the originals), you may have to resolve conflicts multiple times
during one merge, and it is not clear from the revision number where the
originating branch was forked from.
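
A minimal sketch of this step-by-step variant, under the same toy assumptions
as one might use for any such illustration: a tree is a dict mapping path to
content, a changeset maps path to an (old, new) pair, and `resolve` is a
hypothetical callback standing in for the user fixing a conflict by hand. None
of these names come from an existing tool.

```python
def apply_changeset(tree, changeset):
    """Return a new tree with `changeset` applied, or None on conflict."""
    for path, (old, _new) in changeset.items():
        if tree.get(path) != old:
            return None                     # tree diverged from what cs expects
    result = dict(tree)
    for path, (_old, new) in changeset.items():
        result[path] = new
    return result


def stepwise_merge(tree, changesets, next_minor, resolve):
    """Commit each merged changeset as an ordinary revision 1.N.

    `resolve(tree, changeset)` is called whenever a changeset does not apply
    cleanly and must return the tree the user settled on."""
    revisions = []
    for cs in changesets:
        merged = apply_changeset(tree, cs)
        if merged is None:
            merged = resolve(tree, cs)      # stop here and let the user fix it
        tree = merged
        revisions.append(f"1.{next_minor}") # no branch number needed
        next_minor += 1
    return tree, revisions


tip = {"a.c": "linus", "b.c": "v0"}                       # state at 1.4
alan = [{"b.c": ("v0", "v1")}, {"a.c": ("v0", "alan")}]   # Alan's two changesets
tree, revs = stepwise_merge(tip, alan, 5,
                            lambda t, cs: {**t, "a.c": "linus+alan"})
print(tree, revs)
# {'a.c': 'linus+alan', 'b.c': 'v1'} ['1.5', '1.6']
```

The second changeset conflicts, so the stored 1.6 is the resolved version
rather than Alan's original, which is exactly the drawback noted above.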

Which model do you think is better? Or do you have yet another idea how to do
this? (given that we _do_ have to do this somehow)

Have fun,

-- 

				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  2:45                   ` Zack Brown
  2003-03-09  3:19                     ` Roman Zippel
  2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
@ 2003-03-10  3:41                     ` Horst von Brand
  2003-03-10 13:52                       ` Jamie Lokier
  2003-03-10 23:03                     ` Daniel Phillips
  2003-03-12 23:38                     ` Pavel Machek
  4 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-10  3:41 UTC (permalink / raw)
  To: Zack Brown; +Cc: Larry McVoy, linux-kernel

Zack Brown <zbrown@tumblerings.org> said:

[...]

> I'd be willing to maintain this as the beginning of a feature list and
> post it regularly to lkml if enough people feel it would be useful and not
> annoying. The goal would be to identify the features/problems that would
> need to be handled by a kernel-ready version control system.

I believe that has very little relevance to lkml, only perhaps to a mailing
list for a bk replacement. For the kernel this work has already been done
(by Larry and the head penguins).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-10  3:41                     ` BitBucket: GPL-ed KitBeeper clone Horst von Brand
@ 2003-03-10 13:52                       ` Jamie Lokier
  0 siblings, 0 replies; 155+ messages in thread
From: Jamie Lokier @ 2003-03-10 13:52 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Zack Brown, Larry McVoy, linux-kernel

Horst von Brand wrote:
> Zack Brown <zbrown@tumblerings.org> said:
> > I'd be willing to maintain this as the beginning of a feature list and
> > post it regularly to lkml if enough people feel it would be useful and not
> > annoying. The goal would be to identify the features/problems that would
> > need to be handled by a kernel-ready version control system.
> 
> I believe that has very little relevance to lkml, only perhaps to a mailing
> list for a bk replacement. For the kernel this work has already been done
> (by Larry and the head penguins).

I'd like to thank those kind souls who explained how branch _and_
merge history is used by the better merging utilities.  Now I see why
tracking merge history is so helpful.  (Tracking it for credit and
blame history was obvious, but tracking it to enable tools to be
better at resolving conflicts was not something I'd thought of).

Of course there will be times when two or more people apply a patch
without the history of that patch being tracked, and then try to merge
both changes - any version control system should handle that as
gracefully as it can.  However I now see how much actively tracking
the history of those operations can help tools to reduce the amount of
human effort required to combine changes from different places.

So thank you for illustrating that.

ps. Yes I know that CVS sucks at these things.  I've seen _awful_
software engineering disasters due to the difficulty of tracking
different lines of development through CVS, first hand :)

-- Jamie


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  2:45                   ` Zack Brown
                                       ` (2 preceding siblings ...)
  2003-03-10  3:41                     ` BitBucket: GPL-ed KitBeeper clone Horst von Brand
@ 2003-03-10 23:03                     ` Daniel Phillips
  2003-03-11 18:40                       ` Zack Brown
  2003-03-12 23:38                     ` Pavel Machek
  4 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-10 23:03 UTC (permalink / raw)
  To: Zack Brown, linux-kernel

On Sun 09 Mar 03 03:45, Zack Brown wrote:
> OK, so here is my distillation of Larry's post.
>
>   Basic summary: a distributed, replicated, version controlled user level
> file system with no limits on any of the file system events which may
> happened in parallel. All changes must be put correctly back together, no
> matter how much parallelism there has been.
>
>   * Merging.
>
>   * The graph structure.
>
>   * Distributed rename handling. Centralized systems like Subversion don't
>   have as many problems with this because you can only create one file in
>   one directory entry because there is only one directory entry available.
>   In distributed rename handling, there can be an infinite number of
> different files which all want to be src/foo.c. There are also many rename
> corner-cases.
>
>   * Symbolic tags. This is adding a symbolic label on a revision. A
> distributed system must handle the fact that the same symbol can be put on
> multiple revisions. This is a variation of file renaming. One important
> thing to consider is that time can go forward or backward.
>
>   * Security semantics. Where should they go? How can they be integrated
>   into the system? How are hostile users handled when there is no central
>   server to lock down?
>
>   * Time semantics. A distributed system cannot depend on reported time
>   being correct. It can go forward or backward at any rate.
>
> I'd be willing to maintain this as the beginning of a feature list and
> post it regularly to lkml if enough people feel it would be useful and not
> annoying. The goal would be to identify the features/problems that would
> need to be handled by a kernel-ready version control system.
>
> Be well,
> Zack

Hi Zack,

You might want to have a look here, there's lots of good stuff:

   http://arx.fifthvision.net/bin/view/Arx/LinuxKernel
   (Kernel Hackers SCM wish list)

   http://arx.fifthvision.net/bin/view/Arx/GccHackers
   (Gcc Hackers SCM wish list)

Arx is a fork of Tom Lord's Arch, now in version 1.0pre5.

Regards,

Daniel


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-10 23:03                     ` Daniel Phillips
@ 2003-03-11 18:40                       ` Zack Brown
  2003-03-11 18:46                         ` Martin J. Bligh
                                           ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Zack Brown @ 2003-03-11 18:40 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: linux-kernel

On Tue, Mar 11, 2003 at 12:03:18AM +0100, Daniel Phillips wrote:
> On Sun 09 Mar 03 03:45, Zack Brown wrote:
> > OK, so here is my distillation of Larry's post.
...
> > I'd be willing to maintain this as the beginning of a feature list and
> > post it regularly to lkml if enough people feel it would be useful and not
> > annoying. The goal would be to identify the features/problems that would
> > need to be handled by a kernel-ready version control system.
> >
> > Be well,
> > Zack
> 
> Hi Zack,
> 
> You might want to have a look here, there's lots of good stuff:
> 
>    http://arx.fifthvision.net/bin/view/Arx/LinuxKernel
>    (Kernel Hackers SCM wish list)

Hi,

I remember that discussion. It was pretty interesting, but there were some
conflicting ideas about what should be done, and not much organization
to it all.

I've taken a lot of stuff from that wish list, combined it with what I gathered
from Larry's earlier post, and from Petr Baudis' recent post, and elsewhere,
and organized it into something that might be interesting. If anyone would
like to host this document on the web, please let me know.

--------------------------------- cut here ---------------------------------

           Linux Kernel Requirements For A Version Control System    

Document version 0.0.1

This document describes the features required for a version control system
that would be acceptable to Linux kernel developers. A second section below
lists features that would also be good, but not required for adoption by the
kernel team.

Please help out by clarifying features; identifying which features are
really required and which would just be nice; and by listing corner cases
and other implementation issues.

                          * * * Basic summary * * *

A distributed, replicated, version controlled user level file system with no
limits on any of the file system events which may happen in parallel. All
changes must be put correctly back together, no matter how much parallelism
there has been.

                   * * * Requirements For The Kernel * * *

                            Distributed Branches

  1. Introduction

The idea of distributed branches is to allow developers to pull an entire,
full-featured repository onto their home system from the 'main' repository,
allow them to work off-line or with other groups of developers without
sacrificing the features of a full repository, then merge their work back to
the main repository or to other repositories.

A 'main' repository in this case is simply a repository used by the project
leader of a given project. It has no special features or privileges missing
from other branches. It is only considered the 'main' repository for social
reasons, not technical ones. Therefore, branches that have been cloned from
the main repository should not have to 'register' with the repository they
cloned from. i.e. one repository should be able to interact fully with
another, without either of them having prior knowledge of the other.

  2. Behavior

Creating one repository from another should produce a full clone; not just
the current state of the parent repository, but all data from the parent
should be included in the child.

When cloning a repository, committing changes back to the parent, or sharing
changes with any other repositories, no assumptions should be made about the
location of the repositories on the network. Repositories may be on the same
machine, or on entirely different machines anywhere in the world.

                                 Changesets

  1. Introduction

A changeset is a group of files in a repository that have been tagged by
the developer as logical parts of a patch dealing with a single
feature or fix. A developer working on multiple aspects of a repository may
create one changeset for each aspect, in which each changeset consists of
the files relevant to that aspect.

In the context of sharing changesets between repositories, a changeset
consists of a diff between the set of files in the local and remote
repositories.

  2. Behavior

    2.1 Tagging

It must be trivial for a developer to tag a file as part of a given
changeset.

It must be possible to reorganize changesets, so that a given changeset may
be split up into more manageable pieces.

    2.2 Versioning

Changesets are given their own local version number, incremented with each
checkin.

  3. Problems For Clarification

If a file is tagged as being part of two different changesets, with which
changeset should changes to that file be associated?

                                  Checkins

  1. Introduction

Checkins consist of making local modifications to a given repository. This
is distinct from merging changes from one repository into another. A
developer making local changes to their own repository is doing checkins. A
developer sharing their changes with a separate repository is doing merging.

  2. Behavior

Files that are not part of a changeset are treated individually. On checkin,
the developer may include a comment for each file. This is distinct from
version control systems that take a single comment for the whole checkin.

It must be possible to checkin a single changeset to a local repository, and
have that changeset be treated as an individual unit, just as plain files
are: on checkin, the developer includes a single comment for the entire
changeset.

                                   Merging

  1. Introduction

Merging consists of sending and receiving changes between two or more
repositories.

  2. Behavior

    2.1 Preserving Local Work

It must be possible to update a local repository to match changes that have
been made to a remote repository, while at the same time preserving changes
that have been made to the local repository. If conflicts arise because some
of the same files have changed on both the local and remote repositories,
conflict resolution tools should be automatically invoked for the local
developer (see below).

If a checkin is interrupted for some reason, it should be easy to clean up
the tree, bringing it back to a consistent, useful state.

It should be possible to mark a file as private to a local repository, so
that a merge will never try to commit that file's changes to a remote
repository.

    2.2 Preserving History

Checkin tags and version numbers are local to a given repository. Because
duplicates may exist across repositories, these historical details must be
remapped during checkin, to values that are unique within the remote
repository, but that can still be identified with their originals.

A merge between two repositories does not consist only of merging the
current state of a set of changesets, but their entire history, including
all their versions and the files that comprise them.

Even if no history is available for a given patch, it should be easy to
checkin and merge that patch.

The implementation must not depend on time being accurately reported by any
of the repositories.

  3. Graph Structure

To illustrate some of the above behaviors, see the following DAG (Directed
Acyclic Graph). This graph will look different when viewed from each
repository (diagrams show the ChangeSet numbers). From the imaginary Linus'
branch, it looks like:

linus  1.1 -> 1.2 -> 1.3 -----> 1.4 -> 1.5 -----> 1.6 -----> 1.7
                \               / \                          /
alan             \-> 1.2.1.1 --/---\-> 1.2.1.2 -> 1.2.1.3 --/

But from Alan's branch, it looks like:

linus  1.1 -> 1.2 -> 1.2.1.1 -> 1.2.1.2 -> 1.2.1.3 -> 1.2.1.4 -> 1.2.1.5
                \               / \                              /
alan             \-> 1.3 ------/---\-----> 1.4 -----> 1.5 ------/

A virtual branch, used to track changesets, not per-file revisions, is
created in the parent repository during merge. At this time the merged
changesets receive new numbers appropriate for that branch. But since the
branch is only virtual, there is still only one line of development in the
repository. To see the changesets in the order they were applied, they must
be sorted not by revision number but by merge time. Thus, with respect to
the above diagrams, the order in which the patches were applied, from Linus'
perspective, is:

1.1  1.2  1.3  1.2.1.1  1.4  1.5  1.6  1.2.1.2  1.2.1.3  1.7
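
To make the difference between the two orderings concrete, here is a small
sketch. It assumes each changeset record carries a local sequence counter
(the hypothetical field "seq") assigned when the changeset enters the
repository; a counter is used instead of a wall-clock timestamp only because
the requirements above say reported time cannot be trusted.

```python
def rev_key(rev):
    """Turn '1.2.1.3' into (1, 2, 1, 3) so revisions compare numerically."""
    return tuple(int(part) for part in rev.split("."))


# Order in which changesets entered Linus' repository (from the text above).
applied = ["1.1", "1.2", "1.3", "1.2.1.1", "1.4",
           "1.5", "1.6", "1.2.1.2", "1.2.1.3", "1.7"]
records = [{"rev": rev, "seq": seq} for seq, rev in enumerate(applied)]
records.reverse()  # pretend the storage order is arbitrary

by_revision = [r["rev"] for r in sorted(records, key=lambda r: rev_key(r["rev"]))]
by_merge = [r["rev"] for r in sorted(records, key=lambda r: r["seq"])]

print(by_revision)
# ['1.1', '1.2', '1.2.1.1', '1.2.1.2', '1.2.1.3', '1.3', '1.4', '1.5', '1.6', '1.7']
print(by_merge)
# ['1.1', '1.2', '1.3', '1.2.1.1', '1.4', '1.5', '1.6', '1.2.1.2', '1.2.1.3', '1.7']
```

Sorting by revision number groups the whole virtual branch right after 1.2,
which hides when each merged changeset actually reached the files.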

                         Distributed Rename Handling

  1. Introduction

This consists of allowing developers to rename files and directories, and
have all repository operations properly recognize and handle this.

  2. Behavior

    2.1 Local

Renaming files and directories locally should preserve all historical
information including changeset tags.

    2.2 Distributed

In the general case, a single local repository attempts to merge
name-changes with a remote repository. In this case, the remote repository
receives the name change, along with all history including changeset tags.

      2.2.1 Conflicts

An arbitrary number of repositories cloned from either a single remote
repository or from each other may attempt to change the name of a single
file to arbitrary other names and then merge that change back to a single
remote repository or to each other.

An arbitrary number of repositories cloned from either a single remote
repository or from each other may rename file A to something else, and then
other files to the name formerly used by File A, or create a new file with
the name formerly used by file A; and then merge those changes to the single
remote repository or to each other.

An arbitrary number of repositories cloned from either a single remote
repository or from each other may attempt to create a file with the same
name and merge that change back to the remote repository or to each other.
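
A minimal sketch of how the conflict cases above might be detected before a
merge. The input format is an assumption made for this example: each
repository reports a list of (old_name, new_name) pairs, with old_name set to
None for a newly created file. The function name is likewise made up.

```python
from collections import defaultdict


def find_rename_conflicts(renames_by_repo):
    """Return (divergent, collisions).

    divergent:  one file renamed to different names in different repositories
    collisions: one name claimed by several distinct files"""
    targets_of = defaultdict(set)   # old name -> new names it was given
    origins_of = defaultdict(set)   # new name -> distinct files claiming it
    for repo, renames in renames_by_repo.items():
        for old, new in renames:
            if old is not None:
                targets_of[old].add(new)
                origins_of[new].add(old)
            else:
                # files created independently are different files,
                # even when they share a name
                origins_of[new].add(f"<new in {repo}>")
    divergent = {old: sorted(news)
                 for old, news in targets_of.items() if len(news) > 1}
    collisions = {new: sorted(origins)
                  for new, origins in origins_of.items() if len(origins) > 1}
    return divergent, collisions


divergent, collisions = find_rename_conflicts({
    "linus": [("src/foo.c", "src/bar.c")],
    "alan": [("src/foo.c", "src/baz.c"), (None, "src/foo.c")],
    "dave": [(None, "src/foo.c")],
})
print(divergent)   # {'src/foo.c': ['src/bar.c', 'src/baz.c']}
print(collisions)  # {'src/foo.c': ['<new in alan>', '<new in dave>']}
```

This only flags the conflicts; deciding which name wins is still left to the
developer or to a policy that the requirements do not specify.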

                        Graphical 2- And 3-Way Merging Tool

  1. Introduction

Merge tools are tools used to resolve conflicts when merging files. See
tkdiff ( http://www.accurev.com/free/tkdiff/ )

  2. Behavior

The merge tools should identify precisely the areas of conflict, and enable
the user to quickly edit the files to resolve the conflicts and apply the
patch.

Merge tools must be able to handle patches as well as entire files.

A typical usage would be to pull all recent changes to a local tree from a
remote repository; then run the merge tools to resolve any conflicts between
the remote repository and changes that have been made locally; tag local
files to produce a changeset; and generate a diff for sharing.

               * * * Not Required For Kernel Development * * *

                                 Changesets

It should be possible to exchange changesets via email.

                                 File Types

The system should support symlinks, device special files, fifos, etc. (i.e.
inode metadata)

* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
This document is copyright Zack Brown and released under the terms of the
GNU General Public License, version 2.0.
* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

--------------------------------- cut here ---------------------------------

> 
> Regards,
> 
> Daniel

-- 
Zack Brown


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 18:40                       ` Zack Brown
@ 2003-03-11 18:46                         ` Martin J. Bligh
  2003-03-11 19:30                           ` Daniel Phillips
  2003-03-12  3:47                         ` Horst von Brand
  2003-03-14 11:34                         ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-11 18:46 UTC (permalink / raw)
  To: Zack Brown, Daniel Phillips; +Cc: linux-kernel

> I've taken a lot of stuff from that wish list, combined it with what I gathered
> from Larry's earlier post, and from Petr Baudis' recent post, and elsewhere,
> and organized it into something that might be interesting. If anyone would
> like to host this document on the web, please let me know.

Not sure if this was captured before (I don't see it explicitly in what
you sent), but one thing that I don't think current tools do well is to
keep changes separated out. We need to be able to put a stack of 200
patches on top of 2.5.10, then be able to break those out again easily
come 2.5.60, once we've merged forward. Treating things as one big blob
will work great for Linus, but badly for others.

At the moment, I slap the patches back on top of every new version 
separately, which works well, but is a PITA. I hear this is something
of a pain to do with Bitkeeper (don't know, I've never tried it). 
People muttered things about keeping 200 different views, which is
fine for hardlinked diff & patch (takes < 1s to clone normally), but
I'm not sure how long a merge would take in Bitkeeper this way? Perhaps
people who've done this in other SCMs could comment?

M.


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 18:46                         ` Martin J. Bligh
@ 2003-03-11 19:30                           ` Daniel Phillips
  2003-03-11 19:33                             ` Martin J. Bligh
                                               ` (2 more replies)
  0 siblings, 3 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-11 19:30 UTC (permalink / raw)
  To: Martin J. Bligh, Zack Brown; +Cc: linux-kernel

On Tue 11 Mar 03 19:46, Martin J. Bligh wrote:
> > I've taken a lot of stuff from that wish list, combined it with what I
> > gathered from Larry's earlier post, and from Petr Baudis' recent post,
> > and elsewhere, and organized it into something that might be interesting.
> > If anyone would like to host this document on the web, please let me
> > know.
>
> Not sure if this was captured before (I don't see it explicitly in what
> you sent), but one thing that I don't think current tools do well is to
> keep changes separated out. We need to be able to put a stack of 200
> patches on top of 2.5.10, then be able to break those out again easily
> come 2.5.60, once we've merged forward. Treating things as one big blob
> will work great for Linus, but badly for others.

Coincidentally, I was having a little think about that exact thing earlier
today.  Suppose we call the process of turning an exact delta into a 
delta-with-context, "softening".  So you select a set of deltas somehow 
(e.g., all deltas in wild-card set of files) then soften them by adding 
context, or in the deluxe version, convert to lists of tokens with whitespace 
markup.  The result is a first-class object in the database, called a, hmm, 
soft changeset?  (Surely there is a better name.)
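
A minimal sketch of what "softening" might look like, assuming an exact
delta is just a line position plus replacement lines.  The names ExactDelta,
SoftDelta and soften are made up for illustration; nothing here comes from
an existing tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExactDelta:
    """Line-number based change: replace old[start:start+removed] by `added`."""
    start: int            # 0-based index of the first replaced line
    removed: int          # number of old lines replaced
    added: tuple          # replacement lines


@dataclass(frozen=True)
class SoftDelta:
    """The same change, located by surrounding text instead of position."""
    before: tuple         # context lines preceding the change
    old: tuple            # lines being replaced
    new: tuple            # replacement lines
    after: tuple          # context lines following the change


def soften(old_lines, delta, context=3):
    """Attach up to `context` lines of surrounding text to an exact delta."""
    end = delta.start + delta.removed
    return SoftDelta(
        before=tuple(old_lines[max(0, delta.start - context):delta.start]),
        old=tuple(old_lines[delta.start:end]),
        new=tuple(delta.added),
        after=tuple(old_lines[end:end + context]),
    )
```

The "deluxe" token-based variant would swap the line tuples for token lists;
that part is left out here.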

A soft changeset can be carried forward in the database automatically as long 
as there are no conflicts (like patch with fuzz) and where there are 
conflicts, the soft changeset itself can be versioned.  To implement soft 
changeset versioning the lazy way, just merge the changeset with some version 
and generate a new soft changeset against some other version.  A name for the 
versioned soft changeset can be generated automatically, e.g.:

   changeset.name-from.version-to.version.
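
Carrying a soft changeset forward could be sketched roughly as below.  This
is an assumption-laden toy: it only tolerates a changed offset (no fuzz, so
it is stricter than patch), it treats missing or ambiguous context as a
conflict, and carry_forward and versioned_name are hypothetical names.

```python
def carry_forward(lines, before, old, new, after):
    """Re-apply a context-located change to a newer version of the file.

    Returns the patched list of lines, or None when the context is missing
    or ambiguous (a conflict that needs a human, or a new version of the
    soft changeset).
    """
    pattern = list(before) + list(old) + list(after)
    hits = [i for i in range(len(lines) - len(pattern) + 1)
            if lines[i:i + len(pattern)] == pattern]
    if len(hits) != 1:
        return None
    start = hits[0] + len(before)
    return lines[:start] + list(new) + lines[start + len(old):]


def versioned_name(name, from_version, to_version):
    """Build a name in the changeset.name-from.version-to.version style."""
    return f"{name}-{from_version}-{to_version}"
```

A None result is the point where the soft changeset itself would get a new
version, named by something like versioned_name("htree", "2.5.10", "2.5.60").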

You can wave your wand, and the soft changeset will turn into a unified
diff or a BK changeset.  But it's obviously a lot cleaner, extensible, 
flexible and easier to process automatically than a text diff.  It's an 
internal format, so it can be improved from time to time with little or no 
breakage.

Did that make sense?

> At the moment, I slap the patches back on top of every new version
> separately, which works well, but is a PITA.

Tell me about it.

> I hear this is something
> of a pain to do with Bitkeeper (don't know, I've never tried it).
> People muttered things about keeping 200 different views, which is
> fine for hardlinked diff & patch (takes < 1s to clone normally), but
> I'm not sure how long a merge would take in Bitkeeper this way? Perhaps
> people who've done this in other SCM's could comment?

I've never seriously used any commercial SCM, so nobody can accuse me of 
stealing their ideas.  On the other hand, it means I may have to take a few 
shots way wide of the target before hitting any bullseyes.

Regards,

Daniel


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 19:30                           ` Daniel Phillips
@ 2003-03-11 19:33                             ` Martin J. Bligh
  2003-03-11 20:08                               ` Andrew Morton
  2003-03-12  6:14                             ` Werner Almesberger
  2003-03-14 12:29                             ` Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-11 19:33 UTC (permalink / raw)
  To: Daniel Phillips, Zack Brown; +Cc: linux-kernel

>> Not sure if this was captured before (I don't see it explicitly in what
>> you sent), but one thing that I don't think current tools do well is to
>> keep changes separated out. We need to be able to put a stack of 200
>> patches on top of 2.5.10, then be able to break those out again easily
>> come 2.5.60, once we've merged forward. Treating things as one big blob
>> will work great for Linus, but badly for others.
> 
> Coincidentally, I was having a little think about that exact thing earlier
> today. 

Good, then either I'm not insane, or at least I have company in the 
madhouse ;-) 

> Suppose we call the process of turning an exact delta into a 
> delta-with-context, "softening".  So you select a set of deltas somehow 
> (e.g., all deltas in wild-card set of files) then soften them by adding 
> context, or in the deluxe version, convert to lists of tokens with whitespace 
> markup.  The result is a first-class object in the database, called a, hmm, 
> soft changeset?  (Surely there is a better name.)

a "patch" ? ;-) A context-diff is kind of a delta with context. 
I have some similar patch tools to akpm, and he uses the patch as the
base concept of what he does.

> A soft changeset can be carried forward in the database automatically as long 
> as there are no conflicts (like patch with fuzz) and where there are 
> conflicts, the soft changeset itself can be versioned.  To implement soft 
> changeset versioning the lazy way, just merge the changeset with some version 
> and generate a new soft changeset against some other version.  A name for the 
> versioned soft changeset can be generated automatically, e.g.:
> 
>    changeset.name-from.version-to.version.

Right ... what I do is basically have a script that does:
for i in *
<copy lastview to $i>
(cd $i; <apply $i>)

My patches all start with a sequence number (a bit like Andrea does), so
for i in * does really nicely. What it's *meant* to do is read $? back
from patch, and stop if patch failed to apply it properly (more than
just offsets), and barf for user intervention, but that bit's broken
at the moment ;-)
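
The stop-on-failure part could look something like this sketch.  It assumes
the patch names start with digits and that the caller supplies the function
that actually applies one patch and returns its exit status (GNU patch
reports hunks it could not apply with a non-zero status, while offsets alone
still count as success).  apply_stack and sequence_number are made-up names.

```python
import re


def sequence_number(patch_name):
    """Leading digits of a patch file name, e.g. '010-foo.patch' -> 10."""
    match = re.match(r"\d+", patch_name)
    if match is None:
        raise ValueError(f"no sequence number in {patch_name!r}")
    return int(match.group())


def apply_stack(patch_names, apply_one):
    """Apply patches in sequence order, stopping at the first failure.

    `apply_one(name)` is a caller-supplied callable that returns an exit
    status, 0 meaning success (for instance a wrapper around patch(1)).
    Returns (applied, failed) where `failed` is None if all succeeded.
    """
    applied = []
    for name in sorted(patch_names, key=sequence_number):
        if apply_one(name) != 0:
            return applied, name
        applied.append(name)
    return applied, None
```

Sorting on the parsed number rather than the raw name means the stack still
applies in order if the numbers are not zero-padded.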

> You can wave your wand, and the soft changeset will turn into a unified
> diff or a BK changeset.  But it's obviously a lot cleaner, extensible, 
> flexible and easier to process automatically than a text diff.  It's an 
> internal format, so it can be improved from time to time with little or no 
> breakage.
> 
> Did that make sense?

Yeah, the wand is called "creatediffs" in my case, and it takes all the
views in a dir, and diffs the first against the second, second against
third, etc. I always start with "000-virgin".
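
For illustration only, here is roughly what such a "creatediffs" pass does,
written against in-memory views instead of directories so it stays
self-contained.  This is not the actual script; create_diffs and the view
layout (name -> {path: text}) are assumptions, and it relies on view names
sorting in stack order.

```python
import difflib


def create_diffs(views):
    """Diff each view against the next one in sorted name order.

    `views` maps a view name (e.g. '000-virgin') to {path: file text}.
    Returns {later view name: unified diff text against its predecessor}.
    """
    names = sorted(views)
    diffs = {}
    for prev, cur in zip(names, names[1:]):
        chunks = []
        for path in sorted(set(views[prev]) | set(views[cur])):
            old = views[prev].get(path, "").splitlines(keepends=True)
            new = views[cur].get(path, "").splitlines(keepends=True)
            # unified_diff yields nothing when the two sides are identical
            chunks.extend(difflib.unified_diff(
                old, new, fromfile=f"{prev}/{path}", tofile=f"{cur}/{path}"))
        diffs[cur] = "".join(chunks)
    return diffs
```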

I might even clean up my tools, turn them into one perl script, and send
them out at the weekend. They're a fetid (but working) mess right now.

What we need is a "better context-diff", with something smarter to apply
it that understands C syntax (can fall back to cdiff for text / asm for
now). 

And whilst we're at it, would be nice to have something that tried to
produce the most human readable diffs, not the smallest ones. Renaming
the function at the top is frigging annoying.

>> At the moment, I slap the patches back on top of every new version
>> separately, which works well, but is a PITA.
> 
> Tell me about it.

Well, it normally only takes me an hour per release. But it's still a
waste of time. And yes, I have to do some things by hand. But the screams
of others around me when BK goes wrong tell me it's not much better for
all its fancy tricks (for *my* usage at least), in terms of applying
patches happily to deleted files, etc. so it still needs manual fix up.

>> I hear this is something
>> of a pain to do with Bitkeeper (don't know, I've never tried it).
>> People muttered things about keeping 200 different views, which is
>> fine for hardlinked diff & patch (takes < 1s to clone normally), but
>> I'm not sure how long a merge would take in Bitkeeper this way? Perhaps
>> people who've done this in other SCM's could comment?
> 
> I've never seriously used any commercial SCM, so nobody can accuse me of 
> stealing their ideas.  On the other hand, it means I may have to take a few 
> shots way wide of the target before hitting any bullseyes.

Yeah, neither have I. CVS I tried for a day, and it was just laughable.
BK I never looked at yet (have been tempted by the fancy looking merge
tool a few times). I tend to be slow to pick up new tools ... I prefer
to let others knock out the bugs first, and most of the time they don't
stick anyway ... so it was wasted time. 

BK seems to be sticking better than most, but from the feedback I get
about it from others, I think I like my scripts well enough ... and can 
change them to do what I want, and I understand what they're doing, 
which makes me happy (they're 10 lines of sh or perl ;-)). And that's
not an open-source license thing ... it's complex enough that it wouldn't
do me any good to be open source (for any non-trivial mod). I want 
something *simple* personally.

M.


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 19:33                             ` Martin J. Bligh
@ 2003-03-11 20:08                               ` Andrew Morton
  2003-03-11 20:29                                 ` Martin J. Bligh
  0 siblings, 1 reply; 155+ messages in thread
From: Andrew Morton @ 2003-03-11 20:08 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: phillips, zbrown, linux-kernel

"Martin J. Bligh" <mbligh@aracnet.com> wrote:
>
> >> At the moment, I slap the patches back on top of every new version
> >> separately, which works well, but is a PITA.
> > 
> > Tell me about it.
> 
> Well, it normally only takes me an hour per release.

Whoa.  You need better tools.

A bunch of fine people took patch-tools and turned them into a real project. 
They have .deb's and .rpm's, but it looks like they're a bit old and a `cvs co'
is needed.  I'm still using the old stuff, but I'm sure theirs is better.

See http://savannah.nongnu.org/projects/quilt/




* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 20:08                               ` Andrew Morton
@ 2003-03-11 20:29                                 ` Martin J. Bligh
  0 siblings, 0 replies; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-11 20:29 UTC (permalink / raw)
  To: Andrew Morton; +Cc: phillips, zbrown, linux-kernel

>> >> At the moment, I slap the patches back on top of every new version
>> >> separately, which works well, but is a PITA.
>> > 
>> > Tell me about it.
>> 
>> Well, it normally only takes me an hour per release.
> 
> Whoa.  You need better tools.
> 
> A bunch of fine people took patch-tools and turned them into a real project. 
> They have .deb's and .rpm's, but it looks like they're a bit old and a `cvs co'
> is needed.  I'm still using the old stuff, but I'm sure theirs is better.
> 
> See http://savannah.nongnu.org/projects/quilt/

I did take a look at your stuff in the past ... had a few minor objections
at the time, but have actually grown closer to what you do since then.
I *do* like the numbering of my patches though. I might try to merge them
together at some point soon.

So when I say 1 hour ... bear in mind I don't take Linus bk-drops normally,
on the full releases, so the delta is bigger (and I'm slower than you! ;-))
You still have to fix up the rejects from 'patch -p1' by hand though,
right? That's what normally takes most of the time, especially if it's
code I'm unfamiliar with, or I make a mistake (reboot takes 5-10 mins ;-))

M.


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 19:30                           ` Daniel Phillips
  2003-03-11 19:33                             ` Martin J. Bligh
@ 2003-03-12  6:14                             ` Werner Almesberger
  2003-03-13  2:48                               ` Daniel Phillips
  2003-03-14 12:29                             ` Pavel Machek
  2 siblings, 1 reply; 155+ messages in thread
From: Werner Almesberger @ 2003-03-12  6:14 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Martin J. Bligh, Zack Brown, linux-kernel

Daniel Phillips wrote:
> Coincidentally, I was having a little think about that exact thing earlier
> today.  Suppose we call the process of turning an exact delta into a 
> delta-with-context, "softening".

Why not just make all deltas "soft" and just ignore the context in
cases when you're absolutely sure you can ? (Provided that such
cases exist and aren't trivial.)

> A soft changeset can be carried forward in the database automatically as long 
> as there are no conflicts

You probably also want to be able to apply them to different
views, e.g. if I fix X, I may send it off to integration, and
also apply it independently to my projects Y and Z. When X gets
merged into whatever I consider my "mainstream" (again, that's a
local decision, e.g. it may be Linus' tree, plus net/* and anything
related to changes in net/* from David Miller), I may want to get
notified, e.g. if there's a conflict, but also such that I can drop
that part from my fix (which may contain elements that I didn't
push yet).

Not all of this needs to be known to the SCM if the right tagging
tools are available to users. In fact, limiting the number of work
flows inherently supported by the SCM would probably be a
feature :-)

> and generate a new soft changeset against some other version.  A name for the 
> versioned soft changeset can be generated automatically, e.g.:
> 
>    changeset.name-from.version-to.version.

Hmm, I'd distinguish three elements in a change set's name:

 - its history (i.e. all changesets applied to the file(s)
   when the change set was created)
 - a globally unique ID
 - a human-readable title that doesn't need to be perfectly
   unique

I think, for simplicity, changesets should just carry their history
with them. This can later be compressed, e.g. by omitting items
before major convergence points (releases), by using automatically
generated reference points, or simply by fetching additional
information from a repository if needed (hairy).
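
As a rough sketch of the three elements and of the simplest compression
(dropping everything before a convergence point), assuming history is just
an ordered list of changeset ids.  ChangesetName, new_name and
compress_history are illustrative names, and the random uuid merely stands
in for whatever unique id scheme is chosen.

```python
from dataclasses import dataclass
import uuid


@dataclass(frozen=True)
class ChangesetName:
    uid: str              # globally unique, machine-oriented
    title: str            # human-readable, may collide
    history: tuple        # uids of changesets already applied, oldest first


def new_name(title, history=()):
    """Create a name with a fresh random uid (an assumed id scheme)."""
    return ChangesetName(uid=uuid.uuid4().hex, title=title,
                         history=tuple(history))


def compress_history(name, convergence_uid):
    """Drop history items that precede a convergence point (e.g. a release).

    The convergence point itself is kept as the new reference point.  If it
    is not in the history, the name is returned unchanged.
    """
    if convergence_uid not in name.history:
        return name
    cut = name.history.index(convergence_uid)
    return ChangesetName(name.uid, name.title, name.history[cut:])
```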

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  6:14                             ` Werner Almesberger
@ 2003-03-13  2:48                               ` Daniel Phillips
  2003-03-13  3:11                                 ` Werner Almesberger
  0 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-13  2:48 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Martin J. Bligh, Zack Brown, linux-kernel

On Wed 12 Mar 03 07:14, Werner Almesberger wrote:
> Daniel Phillips wrote:
> > Coincidentally, I was having a little think about that exact thing earlier
> > today.  Suppose we call the process of turning an exact delta into a
> > delta-with-context, "softening".
>
> Why not just make all deltas "soft" and just ignore the context in
> cases when you're absolutely sure you can ? (Provided that such
> cases exist and aren't trivial.)

Just because there's no point in storing context that you don't have to, and 
when you get into more sophisticated operations on deltas, you'd just 
introduce a first step of discarding the context in many cases.

> > A soft changeset can be carried forward in the database automatically as
> > long as there are no conflicts
>
> You probably also want to be able to apply them to different
> views, e.g. if I fix X, I may send it off to integration, and
> also apply it independently to my projects Y and Z. When X gets
> merged into whatever I consider my "mainstream" (again, that's a
> local decision, e.g. it may be Linus' tree, plus net/* and anything
> related to changes in net/* from David Miller), I may want to get
> notified, e.g. if there's a conflict, but also such that I can drop
> that part from my fix (which may contain elements that I didn't
> push yet).

Yes, and if we have the concept of a versioned changeset, your system will 
notice automatically that Linus applied either exactly what you sent him or a 
descendant (i.e., he had to massage it, but his history still recorded the
fact that he started with your changeset) so your system will know to 
automatically reverse your original version during your next merge with 
Linus.  Um, if Linus is using this new spiffy system of course, you may want 
to substitute "Pavel" in the above.
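
A toy version of that check, assuming each changeset version records the id
of the version it was derived from.  The function names and the parent_of
mapping are hypothetical.

```python
def derives_from(changeset_id, ancestor_id, parent_of):
    """True if `changeset_id` is `ancestor_id` or a version derived from it.

    `parent_of` maps a changeset version id to the id it was derived from
    (None or absent for an original).
    """
    seen = set()
    current = changeset_id
    while current is not None and current not in seen:
        if current == ancestor_id:
            return True
        seen.add(current)          # guards against a corrupt, cyclic record
        current = parent_of.get(current)
    return False


def changesets_to_reverse(local_pending, upstream_applied, parent_of):
    """Local changesets that upstream already carries, exactly or massaged."""
    return [mine for mine in local_pending
            if any(derives_from(theirs, mine, parent_of)
                   for theirs in upstream_applied)]
```

Anything returned is a candidate for automatic reversal before the merge;
everything else stays on the local stack.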

> > and generate a new soft changeset against some other version.  A name for
> > the versioned soft changeset can be generated automatically, e.g.:
> >
> >    changeset.name-from.version-to.version.
>
> Hmm, I'd distinguish three elements in a change set's name:
>
>  - its history (i.e. all changesets applied to the file(s)
>    when the change set was created)
>  - a globally unique ID
>  - a human-readable title that doesn't need to be perfectly
>    unique

Such things as history (if you need it) and the globally unique id can be
tucked into the header of the changeset.  The unique id is good; it means you
can let names collide.  For the name itself, I personally am mostly interested
in the catchy moniker I thought up for the patch, um, I mean changeset, the
kernel version it applies to, and a sequence number in case I generate more
than one version against the same kernel, so that when I post the changesets
on the web, people can find the file they need.  Boring, huh?

Naming is a matter of taste, and you ought to be able to do it according to 
your own taste, including hooking in your own name-generating script.

> I think, for simplicity, changesets should just carry their history
> with them. This can later be compressed, e.g. by omitting items
> before major convergence points (releases), by using automatically
> generated reference points, or simply by fetching additional
> information from a repository if needed (hairy).

I would not call that hairy, it sounds more like fun.  The hairy part is 
getting the underlying framework to function properly.  Larry is entirely 
correct in pointing out that it's hard, though in my opinion, not nearly as 
hard as kernel development.  Your edit/compile/test cycle is a fraction as 
long for one thing.

Regards,

Daniel


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13  2:48                               ` Daniel Phillips
@ 2003-03-13  3:11                                 ` Werner Almesberger
  0 siblings, 0 replies; 155+ messages in thread
From: Werner Almesberger @ 2003-03-13  3:11 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Martin J. Bligh, Zack Brown, linux-kernel

Daniel Phillips wrote:
> Naming is a matter of taste, and you ought to be able to do it according to 
> your own taste, including hooking in your own name-generating script.

Yup, what I mean is that the system shouldn't have to depend on a
human-usable name. It's usually very hard to generate unique names
that are also human-friendly, so I think it's better not to try in
the first place. (Just look at e-mail message-ids for an example.)

> > I think, for simplicity, changesets should just carry their history
> > with them. This can later be compressed, e.g. by omitting items
> > before major convergence points (releases), by using automatically
> > generated reference points, or simply by fetching additional
> > information from a repository if needed (hairy).
> 
> I would not call that hairy, it sounds more like fun.

I called it hairy, because you need to retrieve something from a
machine that may not be available at that time. Waiting until it
comes back usually isn't a choice. Of course, this information
may be replicated on other machines that are available, and that
your repository/agent knows of, etc.

In any case, this would be an optimization. Bandwidth and disk
space are cheap, so it's not so bad to carry a few kB of history
around for each file.

> getting the underlying framework to function properly.  Larry is entirely 
> correct in pointing out that it's hard, though in my opinion, not nearly as 
> hard as kernel development.  Your edit/compile/test cycle is a fraction as 
> long for one thing.

Oh, I'd say it's an entirely different type of development. The
kernel has to deal with real-time concurrency and subtle
performance issues. An SCM can quite easily eliminate concurrency
to the point that all operations become nice, linear batch jobs
on a completely static data set. On the other hand, the SCM is
likely to work on more complex data structures, and will have a
closer interaction with what is user policy.

While performance is certainly an important issue for an SCM, I'd
expect this to be something that can be safely ignored for a good
while during development. (I'm a firm believer in the
prototype-burn-rewrite-burn_again-... type of software development.
Maybe this shows :-)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 19:30                           ` Daniel Phillips
  2003-03-11 19:33                             ` Martin J. Bligh
  2003-03-12  6:14                             ` Werner Almesberger
@ 2003-03-14 12:29                             ` Pavel Machek
  2003-03-15 20:53                               ` Martin J. Bligh
                                                 ` (4 more replies)
  2 siblings, 5 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-14 12:29 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Martin J. Bligh, Zack Brown, linux-kernel

Hi!

> You can wave your wand, and the soft changeset will turn into a unified
> diff or a BK changeset.  But it's obviously a lot cleaner, extensible, 
> flexible and easier to process automatically than a text diff.  It's an 
> internal format, so it can be improved from time to time with little or no 
> breakage.
> 
> Did that make sense?

Yes.

Some kind of better-patch is badly needed.

What kind of data would have to be in soft-changeset?
* unique id of changeset
* unique id of previous changeset
(two previous if it is merge)
? or would it be better to have here
whole path to first change?
* commit comment
* for each file:
** diff -u of change
** file's unique id
** in case of rename: new name (delete is rename to special dir)
** in case of chmod/chown: new permissions
** per-file comment

? How to handle directory moves?

Does it seem sane? Any comments?
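
To make the list concrete, here is one way it might map onto data types.
This is only a sketch: the field names, the choice of a tuple for parents
and the "DELETED/" directory name are all assumptions, and directory moves
are left open just as in the question above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileChange:
    file_id: str                      # file's unique id
    diff: str                         # diff -u text of the change
    new_name: Optional[str] = None    # set on rename
    new_mode: Optional[int] = None    # set on chmod
    comment: str = ""                 # per-file comment


@dataclass
class SoftChangeset:
    changeset_id: str
    parents: tuple                    # one id, or two for a merge
    comment: str                      # commit comment
    files: list = field(default_factory=list)

    def is_merge(self):
        return len(self.parents) == 2


DELETED_DIR = "DELETED/"              # hypothetical name for the special dir


def is_delete(change):
    """A delete is modelled as a rename into the special directory."""
    return (change.new_name is not None
            and change.new_name.startswith(DELETED_DIR))
```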

-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-14 12:29                             ` Pavel Machek
@ 2003-03-15 20:53                               ` Martin J. Bligh
  2003-03-15 21:26                               ` Daniel Phillips
                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: Martin J. Bligh @ 2003-03-15 20:53 UTC (permalink / raw)
  To: Pavel Machek, Daniel Phillips; +Cc: Zack Brown, linux-kernel

> Yes.
> 
> Some kind of better-patch is badly needed.
> 
> What kind of data would have to be in soft-changeset?
> * unique id of changeset
> * unique id of previous changeset
> (two previous if it is merge)
> ? or would it be better to have here
> whole path to first change?
> * commit comment
> * for each file:
> ** diff -u of change
> ** file's unique id
> ** in case of rename: new name (delete is rename to special dir)
> ** in case of chmod/chown: new permissions
> ** per-file comment
> 
> ? How to handle directory moves?
> 
> Does it seem sane? Any comments?

Looks good to me. 

If people keep changesets sanely, then there should be no need for 
per-file comments IMHO, but I'm sure that's a matter of debate.

M.



* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-14 12:29                             ` Pavel Machek
  2003-03-15 20:53                               ` Martin J. Bligh
@ 2003-03-15 21:26                               ` Daniel Phillips
  2003-03-15 21:32                               ` Petr Baudis
                                                 ` (2 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-15 21:26 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Martin J. Bligh, Zack Brown, linux-kernel

On Fri 14 Mar 03 13:29, Pavel Machek wrote:
> What kind of data would have to be in soft-changeset?
> * unique id of changeset
> * unique id of previous changeset
> (two previous if it is merge)
> ? or would it be better to have here
> whole path to first change?
> * commit comment
> * for each file:
> ** diff -u of change
> ** file's unique id
> ** in case of rename: new name (delete is rename to special dir)
> ** in case of chmod/chown: new permissions
> ** per-file comment

This *very* closely matches the schema I worked up some months ago, and 
dusted off again when I saw your original Bitbucket post.

> ? How to handle directory moves?
>
> Does it seem sane? Any comments?

Oh yes.  Comment: see response to Horst von Brand, on much the same subject.

Regards,

Daniel


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-14 12:29                             ` Pavel Machek
  2003-03-15 20:53                               ` Martin J. Bligh
  2003-03-15 21:26                               ` Daniel Phillips
@ 2003-03-15 21:32                               ` Petr Baudis
  2003-03-15 23:39                                 ` Petr Baudis
  2003-03-16  0:39                               ` Horst von Brand
  2003-04-07 21:22                               ` Petr Baudis
  4 siblings, 1 reply; 155+ messages in thread
From: Petr Baudis @ 2003-03-15 21:32 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Daniel Phillips, Martin J. Bligh, Zack Brown, linux-kernel

Dear diary, on Fri, Mar 14, 2003 at 01:29:03PM CET, I got a letter,
where Pavel Machek <pavel@ucw.cz> told me, that...
> Hi!
> 
> > You can wave your wand, and the soft changeset will turn into a unified
> > diff or a BK changeset.  But it's obviously a lot cleaner, extensible, 
> > flexible and easier to process automatically than a text diff.  It's an 
> > internal format, so it can be improved from time to time with little or no 
> > breakage.
> > 
> > Did that make sense?
> 
> Yes.
> 
> Some kind of better-patch is badly needed.
> 
> What kind of data would have to be in soft-changeset?
> * unique id of changeset
> * unique id of previous changeset
> (two previous if it is merge)
> ? or would it be better to have here
> whole path to first change?
> * commit comment
> * for each file:
> ** diff -u of change
> ** file's unique id
> ** in case of rename: new name (delete is rename to special dir)
> ** in case of chmod/chown: new permissions
> ** per-file comment
> 
> ? How to handle directory moves?
> 
> Does it seem sane? Any comments?

Sounds almost sane (except for the requirement for -u; it should probably be
possible to use the same range of diff types as now </nitpicking>). If it
does use -u, it should probably also record the original name of the file
in case of a move/rename, and especially the original permissions/ownership
in case of chmod/chown. As for chown, I'm not sure ownership should be
recorded/carried at all, given that normal users can't even chown, and the
usernames probably won't exist on the target system anyway. Maybe make that an
optional feature which the party applying the patch may ignore.

A whole separate issue is how to generate the unique ids. First, we need unique
ids for people, which shouldn't be that difficult. In fact, an email address
should do quite well, as it does for BitKeeper or arch. It gets interesting
when someone's email address changes and they want to use the new one.
Probably, when this information is changed in their repository, the old address
should be kept as an "alias" and sent along with any updates next to the new
one. I believe the backlog of aliases shouldn't ever reach a dangerous length;
for standard communication some sane upper threshold (like 5) could be set, and
more would be sent only in direct communication and only in case of conflicts.

The changeset unique id should probably include the author of the changeset and
the time (with one-second precision) at which it was committed to its original
repository. However, some insane script could create, check in and commit
several changesets in a row within the same second, so you want something else
in the id as well, to differentiate commits happening at the same time. A
checksum of the changeset's changes (in some suitable form) would do. Now, if
you want to annoy Larry, separate the fields by '|'s and you could get
something familiar.
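
Purely as an illustration of that recipe (this is not BitKeeper's actual key
format, which I am not reproducing here), assuming SHA-1 as the checksum and
a made-up field layout:

```python
import hashlib
from datetime import datetime, timezone


def changeset_id(author, committed_at, changes):
    """Join author, UTC timestamp (seconds) and a content checksum with '|'.

    `changes` is the changeset body as bytes; the truncated SHA-1 is only a
    tie-breaker for commits made by one author within the same second.
    """
    stamp = committed_at.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    checksum = hashlib.sha1(changes).hexdigest()[:8]
    return f"{author}|{stamp}|{checksum}"
```

Two commits by the same author in the same second then differ only in the
last field.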

The file's unique id is a little harder. The best thing to do is probably to
identify a file by its origin. The file first appeared in some changeset, and
we already have unique ids for changesets. It appeared there under some
original name, which has to be unique within that one changeset. So take the
changeset's unique id and add another field to it: the original file name under
which the file appeared in that changeset. The result should still be unique
and also cheap to look up. You only have to look at the changes in that one
changeset and find the particular file in the list of files appearing there,
and you should keep some file name -> "virtual inode" number mapping near the
files anyway.
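
A small sketch of origin-based file ids, with a path -> id table playing the
role of the "virtual inode" mapping.  FileTable and file_id are hypothetical
names and the '|' separator is just an assumption.

```python
def file_id(origin_changeset_id, original_path):
    """Identify a file by the changeset that created it and its name there."""
    return f"{origin_changeset_id}|{original_path}"


class FileTable:
    """Maps current path -> stable file id, so renames keep the identity."""

    def __init__(self):
        self._by_path = {}

    def create(self, changeset_id, path):
        if path in self._by_path:
            raise ValueError(f"{path} already exists")
        self._by_path[path] = file_id(changeset_id, path)
        return self._by_path[path]

    def rename(self, old_path, new_path):
        # The id travels with the file; only the lookup key changes.
        self._by_path[new_path] = self._by_path.pop(old_path)

    def lookup(self, path):
        return self._by_path[path]
```

A file later created under a previously used path gets a different id,
because it originates in a different changeset.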

What do you think?

Kind regards,

-- 
 
				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:32                               ` Petr Baudis
@ 2003-03-15 23:39                                 ` Petr Baudis
  0 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-15 23:39 UTC (permalink / raw)
  To: Pavel Machek, Daniel Phillips, Martin J. Bligh, Zack Brown, linux-kernel

Dear diary, on Sat, Mar 15, 2003 at 10:32:46PM CET, I got a letter,
where Petr Baudis <pasky@ucw.cz> told me, that...
..snip..
> The changeset unique id should probably include the author of the changeset and
> the time (with one-second precision) at which it was committed to its original
> repository. However, some insane script could create, check in and commit
> several changesets in a row within the same second, so you want something else
> in the id as well, to differentiate commits happening at the same time. A
> checksum of the changeset's changes (in some suitable form) would do. Now, if
> you want to annoy Larry, separate the fields by '|'s and you could get
> something familiar.
..snip..

Okay, you will also need to define a unique id for the project and include it
in the changeset id. (Let's define a project as a group of files with a
history; the instances of a project are called "repositories" and are nodes of
a DAG with a common root, which we will call the initial repository.) I think
the best project unique id would be some checksum (so that it isn't too
long..?) of the initial repository owner (project founder), the project name
(such as 'linux' or 'foobar' or "this isn't going to be unique, who cares") and
some roughly random number (be it a timestamp, a /dev/urandom output snippet or
a snapshot of the meteorological situation).
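
For what it's worth, a sketch of such a project id, assuming SHA-1 as the
checksum and os.urandom as the source of the random part.  The function name
and the 16-character truncation are arbitrary choices.

```python
import hashlib
import os


def project_id(founder, name, salt=None):
    """Checksum of founder, project name and a random salt."""
    if salt is None:
        salt = os.urandom(8)
    digest = hashlib.sha1()
    for part in (founder.encode(), name.encode(), salt):
        # Length prefix keeps ("ab", "c") distinct from ("a", "bc").
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.hexdigest()[:16]
```

The salt is what lets two people found a project called 'linux' without
their ids colliding.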

We could maybe raise the precision of the timestamp in changeset ids instead of
having the checksum there; is the checksum really necessary? I fear the
changeset id becoming annoyingly long and complicated. And yes, I'm looking at
BK heavily regarding these concepts. They seem to get them fairly right, so
why not.

Kind regards,

-- 

				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/


* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-14 12:29                             ` Pavel Machek
                                                 ` (2 preceding siblings ...)
  2003-03-15 21:32                               ` Petr Baudis
@ 2003-03-16  0:39                               ` Horst von Brand
  2003-04-07 21:22                               ` Petr Baudis
  4 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-16  0:39 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel Mailing List

Pavel Machek <pavel@ucw.cz> said:

[...]

> Some kind of better-patch is badly needed.
> 
> What kind of data would have to be in soft-changeset?
> * unique id of changeset

If you want...

> * unique id of previous changeset

What is "previous"?

> (two previous if it is merge)

And if they are merges themselves? Or if it is a 3-way merge? Etc? How do I
get the original patches (if wanted)?

> ? or would it be better to have here
> whole path to first change?
> * commit comment

Right.

> * for each file:
> ** diff -u of change
> ** file's unique id

What is that? If I moved the file away and created a new one? Other moving
around stuff?

> ** in case of rename: new name (delete is rename to special dir)
> ** in case of chmod/chown: new permissions
> ** per-file comment

Much more important: How to merge a conflicting patch in sanely? This is
perhaps the worst stumbling block on plain patches.
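
For concreteness, the fields listed in the quoted proposal could be written
down as a record like the one below. This is only a sketch: the class and
field names are illustrative assumptions, and it does not answer the
questions raised above (what "previous" means for nested merges, or how
conflicts get resolved).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FileChange:
    file_id: str                    # file's unique id, meant to survive renames
    diff: str                       # "diff -u" text of the change to this file
    new_name: Optional[str] = None  # set on rename; delete = rename to a special dir
    new_mode: Optional[int] = None  # set on chmod/chown
    comment: str = ""               # per-file comment


@dataclass
class SoftChangeset:
    changeset_id: str               # unique id of this changeset
    parents: List[str]              # one previous changeset, two for a merge
    comment: str                    # commit comment
    files: List[FileChange] = field(default_factory=list)

    def is_merge(self) -> bool:
        """A changeset with more than one parent records a merge."""
        return len(self.parents) > 1
```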
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-14 12:29                             ` Pavel Machek
                                                 ` (3 preceding siblings ...)
  2003-03-16  0:39                               ` Horst von Brand
@ 2003-04-07 21:22                               ` Petr Baudis
  4 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-04-07 21:22 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Martin J. Bligh, linux-kernel

Dear diary, on Fri, Mar 14, 2003 at 01:29:03PM CET, I got a letter,
where Pavel Machek <pavel@ucw.cz> told me, that...
> Hi!
> 
> > You can wave your wand, and the soft changeset will turn into a universal 
> > diff or a BK changeset.  But it's obviously a lot cleaner, extensible, 
> > flexible and easier to process automatically than a text diff.  It's an 
> > internal format, so it can be improved from time to time with little or no 
> > breakage.
> > 
> > Did that make sense?
> 
> Yes.
> 
> Some kind of better-patch is badly needed.
..snip..

FYI, the SVN and Arch folks have set up a mailing list for discussion about
generic "smarter patch" format, see
http://www.red-bean.com/mailman/listinfo/changesets for details/subscription.

Kind regards,

-- 
 
				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 18:40                       ` Zack Brown
  2003-03-11 18:46                         ` Martin J. Bligh
@ 2003-03-12  3:47                         ` Horst von Brand
  2003-03-12  4:03                           ` Larry McVoy
                                             ` (2 more replies)
  2003-03-14 11:34                         ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
  2 siblings, 3 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-12  3:47 UTC (permalink / raw)
  To: Zack Brown; +Cc: Daniel Phillips, linux-kernel

Zack Brown <zbrown@tumblerings.org> said:
> --------------------------------- cut here ---------------------------------
> 
>            Linux Kernel Requirements For A Version Control System    
> 
> Document version 0.0.1

[...]

>                                  Changesets
> 
>   1. Introduction
> 
> A changeset is a group of files in a repository, that have been tagged by
> the developer, as being logical parts of a patch dealing with a single
> feature or fix. A developer working on multiple aspects of a repository, may
> create one changeset for each aspect, in which each changeset consists of
> the files relevant to that aspect.

Nope. A changeset is (roughly) what was traded as a patch before. I.e., a
coordinated _change_ to a set of files. The RCS problem (inherited by lots
of systems) is that it handles only a diff to _one_ file at a time.

> In the context of sharing changesets between repositories, a changeset
> consists of a diff between the set of files in the local and remote
> repositories.

I don't think it is a good idea to handle differences _between_
repositories, as they could be arbitrary and change in time. A change
_within_ a repository is well defined.

>   2. Behavior
> 
>     2.1 Tagging
> 
> It must be trivial for a developer to tag a file as part of a given
> changeset.

An individual change, not a file. You need to focus on changes to files,
not files. I.e., file appeared/disappeared/changed name/was edited by
altering lines so and so. 

The bk method of accepting individual changes, and then bundling them up
should be enough, people tend to work at one problem at a time. It might be
possible to take a bunch of changes and slice&dice them into changesets
later, but that could create changesets that interdigitate and interdepend
(i.e., changeset 13 has edits that depend on changeset 14 having been
applied, and 14 similarly depends on 13 in other areas; also called
"deadlock" when talking about locking ;).

> It must be possible to reorganize changesets, so that a given changeset may
> be split up into more manageable pieces.

I don't see this as very useful. The user should take care to make changes
to foo.c and foo.h that touch one aspect into a changeset, and unrelated
changes (even touching the same files) into others.  Breaking a changeset up
might break dependencies between changes. It might make sense to group
changesets into larger changes, i.e., changesets 12-25 are move to new
driver model in /net, sets for /net, /block, /char are move to new driver
model, and so on upwards. Then 2.8.15 to 2.8.16 would be "just" a
(super)changeset. Such a (super)changeset would make sense to break up into
its parts, not individual ones.

[...]

>   3. Problems For Clarification
> 
> If a file is tagged as being part of two different changesets, then changes
> to that file should be associated with which changeset???

Individual changes to files can't belong to more than one changeset, AFAICS.

[...]

>                                    Merging

[...]

> It should be possible to mark a file as private to a local repository, so
> that a merge will never try to commit that file's changes to a remote
> repository.

Gets hairy... what if I create file foo as private, and later try to
integrate stuff that creates the same file? Better keep this out of the
repository in the first place.

>     2.2 Preserving History

[...]

> Even if no history is available for a given patch, it should be easy to
> checkin and merge that patch.

Just take that patch as a local edit, and make it a changeset.

> The implementation must not depend on time being accurately reported by any
> of the repositories.

It is more complicated than that. On a distributed system without some form
of shared clock it might be impossible (== nonsense, like in relativity
theory) to talk of a global "before" and "after".

[...]

>                          Distributed Rename Handling
> 
>   1. Introduction
> 
> This consists of allowing developers to rename files and directories, and
> have all repository operations properly recognize and handle this.

And create and destroy. Note "rename" must include moving directories
around, and moving stuff from one directory to another, etc.

[...]

>       2.2.1 Conflicts
> 
> An arbitrary number of repositories cloned from either a single remote
> repository or from each other may attempt to change the name of a single
> file to arbitrary other names and then merge that change back to a single
> remote repository or to each other.

Or several create the same file, or rename random files to the same name,
or even create and then destroy a file created somewhere else. Or create a
file in a directory that was just destroyed or moved locally, etc. I'm sure
this is one of the rat's nests of hairy special cases no one has thought
through that Larry is so fond of mentioning.

[...]

>                * * * Not Required For Kernel Development * * *
> 
>                                  Changesets
> 
> It should be possible to exchange changesets via email.

I'd say this is mandatory.

>                                  File Types
> 
> The system should support symlinks, device special files, fifos, etc. (i.e.
> inode metadata)

Urgh. If possible/convenient, yes. If not, leave it out. [I fail to see any
use for this, but that might just be lack of imagination on my side]

> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> This document is copyright Zack Brown and released under the terms of the
> GNU General Public License, version 2.0.
> * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Why not the documentation license? Just curious.
> 
> --------------------------------- cut here ---------------------------------
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  3:47                         ` Horst von Brand
@ 2003-03-12  4:03                           ` Larry McVoy
  2003-03-12  4:49                             ` [PATCH] ~/kernel/sys.c (2.5.64) (trivial) Jay Patrick Howard
  2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
  2003-03-12 13:22                           ` Daniel Phillips
  2 siblings, 1 reply; 155+ messages in thread
From: Larry McVoy @ 2003-03-12  4:03 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Zack Brown, Daniel Phillips, linux-kernel

> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> > This document is copyright Zack Brown and released under the terms of the
> > GNU General Public License, version 2.0.
> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *

Since a substantial amount of the information in there is what I said,
Zack has no right to impose any license on the information.  It's a bit
unethical if you ask me, it's my copyright, not his.  And I didn't impose
any silly license on it.
-- 
---
Larry McVoy            	 lm at bitmover.com           http://www.bitmover.com/lm 

^ permalink raw reply	[flat|nested] 155+ messages in thread

* [PATCH] ~/kernel/sys.c (2.5.64) (trivial)
  2003-03-12  4:03                           ` Larry McVoy
@ 2003-03-12  4:49                             ` Jay Patrick Howard
  0 siblings, 0 replies; 155+ messages in thread
From: Jay Patrick Howard @ 2003-03-12  4:49 UTC (permalink / raw)
  To: linux-kernel

Apologies in advance if this is so trivial as to be non-patch-worthy.

Was poking around and noticed a possible improvement to kernel/sys.c.
This change results in marginally better output using gcc 3.2 on x86.

As a test I constructed look-alike functions and a small driver.  There
appeared to be a ~40% speedup on the "true" branch and ~5% slowdown on the
"false" branch.  No effort was made to account for overhead when figuring
the percentages.

Unfortunately I don't know enough to say which side of the branch is more
commonly taken.

--- linux-2.5.64.orig/kernel/sys.c	Tue Mar  4 21:28:58 2003
+++ linux-2.5.64/kernel/sys.c	Tue Mar 11 22:06:12 2003
@@ -1096,18 +1096,12 @@
  */
 int in_group_p(gid_t grp)
 {
-	int retval = 1;
-	if (grp != current->fsgid)
-		retval = supplemental_group_member(grp);
-	return retval;
+	return (grp != current->fsgid) ? supplemental_group_member(grp) : 1;
 }

 int in_egroup_p(gid_t grp)
 {
-	int retval = 1;
-	if (grp != current->egid)
-		retval = supplemental_group_member(grp);
-	return retval;
+	return (grp != current->egid) ? supplemental_group_member(grp) : 1;
 }

 DECLARE_RWSEM(uts_sem);

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  3:47                         ` Horst von Brand
  2003-03-12  4:03                           ` Larry McVoy
@ 2003-03-12  5:22                           ` Zack Brown
  2003-03-12  5:44                             ` Horst von Brand
                                               ` (2 more replies)
  2003-03-12 13:22                           ` Daniel Phillips
  2 siblings, 3 replies; 155+ messages in thread
From: Zack Brown @ 2003-03-12  5:22 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Daniel Phillips, linux-kernel

On Tue, Mar 11, 2003 at 11:47:50PM -0400, Horst von Brand wrote:
> Zack Brown <zbrown@tumblerings.org> said:
> > --------------------------------- cut here ---------------------------------
> > 
> >            Linux Kernel Requirements For A Version Control System    
> > 
> > Document version 0.0.1
> 
> [...]
> 
> > In the context of sharing changesets between repositories, a changeset
> > consists of a diff between the set of files in the local and remote
> > repositories.
> 
> I don't think it is a good idea to handle differences _between_
> repositories, as they could be arbitrary and change in time. A change
> _within_ a repository is well defined.

But isn't it necessary to exchange changesets between repositories? How
else would a developer choose exactly what changes get merged with a
remote repository?

> 
> >   2. Behavior
> > 
> >     2.1 Tagging
> > 
> > It must be trivial for a developer to tag a file as part of a given
> > changeset.
> 
> An individual change, not a file. You need to focus on changes to files,
> not files. I.e., file appeared/disappeared/changed name/was edited by
> altering lines so and so. 
> 
> The bk method of accepting individual changes, and then bundling them up
> should be enough, people tend to work at one problem at a time.

I'm not so familiar with how BitKeeper operates. What do you mean by
"accepting individual changes, and then bundling them up"?

> > The implementation must not depend on time being accurately reported by any
> > of the repositories.
> 
> It is more complicated than that. On a distributed system without some form
> of shared clock it might be impossible (== nonsense, like in relativity
> theory) to talk of a global "before" and "after"

Maybe the system should simply ignore the whole concept of time as occurring
in discrete ticks, and just measure time as the relative history of
changesets. That might give it enough of a basis to make estimates on which
changes came 'before' and 'after' other changes in most cases. I imagine a
lot of subtle intelligence could be implemented. And for situations defying
that intelligence, the system could query the user.
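
As a rough illustration of measuring "time" purely by changeset history, the
sketch below treats history as a DAG of changeset ids and asks whether one
changeset is an ancestor of another. The `parents` mapping and the function
names are assumptions made for this example only.

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], cs: str) -> Set[str]:
    """Return every changeset reachable from cs by following parent links."""
    seen: Set[str] = set()
    stack = list(parents.get(cs, ()))
    while stack:
        p = stack.pop()
        if p not in seen:
            seen.add(p)
            stack.extend(parents.get(p, ()))
    return seen


def relation(parents: Dict[str, List[str]], a: str, b: str) -> str:
    """Classify a relative to b using history alone, with no clocks involved."""
    if a == b:
        return "same"
    if a in ancestors(parents, b):
        return "before"
    if b in ancestors(parents, a):
        return "after"
    # Neither is in the other's history: the pair has no defined order,
    # which is the case where the system would have to ask the user.
    return "concurrent"
```

Two changesets made independently on top of the same parent come out as
"concurrent"; only a later merge gives them a common descendant.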

> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> > This document is copyright Zack Brown and released under the terms of the
> > GNU General Public License, version 2.0.
> > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
> 
> Why not the documentation license? Just curious.

I read it when it first came out, and it just seemed to be trying to do
something that wasn't really feasible, and to do it in a fairly arbitrary
way. Besides, the protections it claimed to offer didn't interest me. The
GPL may have a soft spot or two, but I really like it, and I think it
applies just as well to text as to computer program code.

> > 
> > --------------------------------- cut here ---------------------------------
> -- 
> Dr. Horst H. von Brand                   User #22616 counter.li.org
> Departamento de Informatica                     Fono: +56 32 654431
> Universidad Tecnica Federico Santa Maria              +56 32 654239
> Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
@ 2003-03-12  5:44                             ` Horst von Brand
  2003-03-12 13:48                               ` Daniel Phillips
  2003-03-12  6:19                             ` Werner Almesberger
  2003-03-12 15:32                             ` Horst von Brand
  2 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-12  5:44 UTC (permalink / raw)
  To: Zack Brown; +Cc: Daniel Phillips, linux-kernel

Zack Brown <zbrown@tumblerings.org> said:
> On Tue, Mar 11, 2003 at 11:47:50PM -0400, Horst von Brand wrote:
> > Zack Brown <zbrown@tumblerings.org> said:
> > > --------------------------------- cut here ---------------------------------
> > > 
> > >            Linux Kernel Requirements For A Version Control System    
> > > 
> > > Document version 0.0.1
> > 
> > [...]
> > 
> > > In the context of sharing changesets between repositories, a changeset
> > > consists of a diff between the set of files in the local and remote
> > > repositories.
> > 
> > I don't think it is a good idea to handle differences _between_
> > repositories, as they could be arbitrary and change in time. A change
> > _within_ a repository is well defined.

> But isn't it necessary to exchange changesets between repositories? How
> else would a developer choose exactly what changes get merged with a
> remote repository?

_From_ a remote repository. I pull stuff, I can't push it. Once I got the
"patch" here, I start integrating it into my repository. The granularity
should be a changeset (i.e., changes between two well defined points in the
remote repository). If it patches in cleanly, great! If not, do merging (==
resolve problems, by hand if need be).

> > >   2. Behavior
> > > 
> > >     2.1 Tagging
> > > 
> > > It must be trivial for a developer to tag a file as part of a given
> > > changeset.
> > 
> > An individual change, not a file. You need to focus on changes to files,
> > not files. I.e., file appeared/disappeared/changed name/was edited by
> > altering lines so and so. 
> > 
> > The bk method of accepting individual changes, and then bundling them up
> > should be enough, people tend to work at one problem at a time.
> 
> I'm not so familiar with how BitKeeper operates. What do you mean by
> "accepting individual changes, and then bundling them up"?

In bk you edit a bunch of files, and commit the changes (individually or as
a set), and then you say "Now make all pending changes into a changeset".

> > > The implementation must not depend on time being accurately reported
> > > by any of the repositories.

> > It is more complicated than that. On a distributed system without some form
> > of shared clock it might be impossible (== nonsense, like in relativity
> > theory) to talk of a global "before" and "after"

> Maybe the system should simply ignore the whole concept of time as occurring
> in discrete ticks, and just measure time as the relative history of
> changesets.

Exactly. But this timeline makes sense for one repository only, and (in a
limited way, via merge points) it makes timelines (somewhat) comparable
between repositories. But note that A might take 13 and much later 5 from
B, as long as there is no conflict they will go in cleanly. But this is
time going backwards. Now factor in unrelated exchanging of changesets with
other actors...

>             That might give it enough of a basis to make estimates on which
> changes came 'before' and 'after' other changes in most cases. I imagine a
> lot of subtle intelligence could be implemented. And for situations defying
> that intelligence, the system could query the user.

There is no universal "before" and "after", even within one repository;
there might be changes that can't be ordered. I.e., changes to files foo
and bar are independent, and might have happened in any order for the same
result. Same for all non-overlapping changes.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  5:44                             ` Horst von Brand
@ 2003-03-12 13:48                               ` Daniel Phillips
  2003-03-13  1:03                                 ` Horst von Brand
  0 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-12 13:48 UTC (permalink / raw)
  To: Horst von Brand, Zack Brown; +Cc: linux-kernel

On Wed 12 Mar 03 06:44, Horst von Brand wrote:
> There is no universal "before" and "after", even within one repository;

Sure there is, e.g., by incrementing a master transaction number on the
repository database.

> there might be changes that can't be ordered. I.e., changes to files foo
> and bar are independent, and might have happened in any order for the same
> result. Same for all non-overlapping changes.

I think what you're saying is that the repository may be ordered in more than 
one way at the same time.  Transaction serial number is just one way.  
Whatever else is recorded in the repository, at least there ought to be a 
serial number on every transaction, a simple unstructured counter.  With just 
this serial number you already have a way to roll back the entire repository 
to any point in the past, provided all repository transactions are reversible.
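
A toy sketch of that idea, assuming every transaction records the old and
new content of a single file so that it can be reversed. The `Repository`
class and its methods are hypothetical names for illustration, not any
existing tool's API.

```python
from typing import Dict, List, Optional, Tuple


class Repository:
    def __init__(self) -> None:
        self.files: Dict[str, str] = {}
        # Each log entry: (serial, path, old content, new content).
        self.log: List[Tuple[int, str, Optional[str], Optional[str]]] = []

    def _set(self, path: str, content: Optional[str]) -> None:
        # None stands for "file does not exist".
        if content is None:
            self.files.pop(path, None)
        else:
            self.files[path] = content

    def commit(self, path: str, new: Optional[str]) -> int:
        """Apply one reversible transaction and return its serial number."""
        serial = len(self.log) + 1      # simple unstructured counter
        self.log.append((serial, path, self.files.get(path), new))
        self._set(path, new)
        return serial

    def rollback(self, serial: int) -> None:
        """Undo, newest first, every transaction numbered above serial."""
        while self.log and self.log[-1][0] > serial:
            _, path, old, _ = self.log.pop()
            self._set(path, old)
```

Rolling back to serial 0 returns the repository to its empty initial state.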

For dependencies between changes, rather than any fixed ordering, it's better 
to record the actual precedence information, i.e., "a before b", where a and 
b are id numbers of changes (I think everybody agrees changes are first class 
objects).  These precedence relations can be determined automatically: if two 
changes do not occur in the same file, there is certainly no precedence 
relation.  If two changes overlap the same text, then there is a precedence 
relation.  If two changes do not overlap, there may or may not be a 
precedence relation, depending on whether the changes are exact deltas or 
deltas-with-context, and if the latter, whether the context is unambiguous.
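
The three cases above can be sketched as a small classifier. It assumes each
change is reduced to a file path plus an inclusive line range, which is a
simplification; the `Change` record and the result labels are made up for
this example, and the "unknown" case is deliberately left undecided.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    id: int
    path: str
    start: int  # first line touched (inclusive)
    end: int    # last line touched (inclusive)


def precedence(a: Change, b: Change) -> str:
    """Classify the structural relation between two changes."""
    if a.path != b.path:
        return "none"      # different files: no precedence relation
    if a.start <= b.end and b.start <= a.end:
        return "ordered"   # overlapping text: one must come before the other
    # Same file, no overlap: depends on whether the deltas are exact or
    # carry context, and whether that context is unambiguous.
    return "unknown"
```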

Once you have the precedence relations, there are all kinds of useful things 
you can do with them.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 13:48                               ` Daniel Phillips
@ 2003-03-13  1:03                                 ` Horst von Brand
  2003-03-13 16:53                                   ` Daniel Phillips
  0 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-13  1:03 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Linux Kernel Mailing List

Daniel Phillips <phillips@arcor.de> said:

[...]

> For dependencies between changes, rather than any fixed ordering, it's better 
> to record the actual precedence information, i.e., "a before b", where a and 
> b are id numbers of changes (I think everybody agrees changes are first class 
> objects).  These precedence relations can be determined automatically: if two 
> changes do not occur in the same file, there is certainly no precedence 
> relation.

Wrong. Edit a header adding a new type T. Later change an existing file
that already includes said header to use T. Change a function, fix most
uses. Find a wrong usage later and fix it separately. Change something, fix
its Documentation/ later. Note how you can come up with dependent changes
that _can't_ be detected automatically.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13  1:03                                 ` Horst von Brand
@ 2003-03-13 16:53                                   ` Daniel Phillips
  2003-03-15 15:02                                     ` Horst von Brand
  0 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-13 16:53 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Linux Kernel Mailing List

On Thu 13 Mar 03 02:03, Horst von Brand wrote:
> Daniel Phillips <phillips@arcor.de> said:
>
> [...]
>
> > For dependencies between changes, rather than any fixed ordering, it's
> > better to record the actual precedence information, i.e., "a before b",
> > where a and b are id numbers of changes (I think everybody agrees changes
> > are first class objects).  These precedence relations can be determined
> > automatically: if two changes do not occur in the same file, there is
> > certainly no precedence relation.
>
> Wrong. Edit a header adding a new type T. Later change an existing file
> that already includes said header to use T. Change a function, fix most
> uses. Find a wrong usage later and fix it separately. Change something, fix
> its Documentation/ later. Note how you can come up with dependent changes
> that _can't_ be detected automatically.

You confused semantic dependencies with structural dependencies that
govern whether or not deltas conflict in the reject sense.  Detailed reply is 
off-list.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13 16:53                                   ` Daniel Phillips
@ 2003-03-15 15:02                                     ` Horst von Brand
  2003-03-15 21:25                                       ` Daniel Phillips
  0 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-15 15:02 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Linux Kernel Mailing List

Daniel Phillips <phillips@arcor.de> said:

[...]

> You confused semantic dependencies with structural dependencies that
> govern whether or not deltas conflict in the reject sense.  Detailed
> reply is off-list.

In both cases hand fixup is needed. The "overlapping patch" partial order
is a (small, or even very small) subset of the "depends on" partial order
which you really want. It would be nice to be able to get a much better
approximation than "conflicting patch" automatically, but I fail to see
how. Giving dependencies by hand is a possibility, but it will most of the
time give as bad an approximation as the above (Do you really know _all_
patches on which your latest and greatest depends? Some (or even most) of
them will be old patches, that by now will be just part of the general
landscape. And this can happen even with direct dependencies: Think of
"disabling IRQs doesn't ensure mutual exclusion" or some such pervasive
change that will affect a small part of any patch, and now move an old 
patch forward...).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 15:02                                     ` Horst von Brand
@ 2003-03-15 21:25                                       ` Daniel Phillips
  0 siblings, 0 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-15 21:25 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Linux Kernel Mailing List

On Sat 15 Mar 03 16:02, Horst von Brand wrote:
> Daniel Phillips <phillips@arcor.de> said:
>
> [...]
>
> > You confused semantic dependencies with structural dependencies that
> > govern whether or not deltas conflict in the reject sense.  Detailed
> > reply is off-list.
>
> In both cases hand fixup is needed. The "overlapping patch" partial order
> is a (small, or even very small) subset of the "depends on" partial order
> which you really want.

But it's a very irritating subset and much of the work involved can be 
handled automatically, so it should be.

> It would be nice to be able to get a much better
> approximation than "conflicting patch" automatically, but I fail to see
> how.

I suppose automatic syntactic analysis could be worked in there, or trial 
builds could be done automatically (check out how Visual Age aka Eclipse does 
it).  I'd put that in the "extra credit" category, and for starters I'd be 
entirely satisfied with:

  - Automatic handling of most structural conflicts, which would result
    in multiple possible deltas between two objects involved in a merge.
    These would be marked by the UI as "conflicts", and the system could
    helpfully point your editor at the relevant source texts.

  - Manual handling of semantic conflicts, but good support for navigating
    your editor to where the problems likely are (e.g., probably involves
    a changeset you recently merged).

> Giving dependencies by hand is a possibility,

Very useful, and not hard to do.

> but it will most of the
> time give as bad an approximation as the above (Do you really know _all_
> patches on which your latest and greatest depends?

You don't need to, you just provide a little help to the system.  When you 
don't provide enough help, you'll get extra compile/run errors, which isn't 
worse than what happens now.

Chances are, the same dependencies will carry over from version to version, 
so it's largely a one-time effort.  When you do put in a manual dependency, 
you can also put a notation on it, explaining why it's there in case that 
needs clarification.
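
A minimal sketch of recording dependencies by hand, each with an optional
note, and deriving an order in which changes can be applied. It uses
Python's standard `graphlib` (3.9+); the `Dependencies` class and its method
names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set, Tuple


class Dependencies:
    def __init__(self) -> None:
        # change id -> ids that must be applied before it
        self.before: Dict[int, Set[int]] = {}
        # (earlier, later) -> note explaining why the dependency exists
        self.notes: Dict[Tuple[int, int], str] = {}

    def add(self, earlier: int, later: int, note: str = "") -> None:
        """Record that `earlier` must be applied before `later`."""
        self.before.setdefault(earlier, set())
        self.before.setdefault(later, set()).add(earlier)
        if note:
            self.notes[(earlier, later)] = note

    def apply_order(self) -> List[int]:
        """Return one order that respects every recorded dependency.

        Raises CycleError if changes depend on each other in a loop.
        """
        return list(TopologicalSorter(self.before).static_order())
```

Any order returned is only one of possibly many valid ones; changes with no
recorded relation between them may come out in either order.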

> Some (or even most) of
> them will be old patches, that by now will be just part of the general
> landscape. And this can happen even with direct dependencies: Think of
> "disabling IRQs doesn't ensure mutual exclusion" or some such pervasive
> change that will affect a small part of any patch, and now move an old
> patch forward...).

Eventually, a changeset that the system is carrying forward could become 
moot, because it's unlikely ever to be backed out.  In that case, just merge 
it permanently and stop carrying it forward.  And if you happen to be wrong 
about needing to carry it forward, it just means you have to bring it forward 
from where you ended it.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
  2003-03-12  5:44                             ` Horst von Brand
@ 2003-03-12  6:19                             ` Werner Almesberger
  2003-03-13  1:31                               ` Horst von Brand
  2003-03-12 15:32                             ` Horst von Brand
  2 siblings, 1 reply; 155+ messages in thread
From: Werner Almesberger @ 2003-03-12  6:19 UTC (permalink / raw)
  To: Zack Brown; +Cc: Horst von Brand, Daniel Phillips, linux-kernel

Zack Brown wrote:
> Maybe the system should simply ignore the whole concept of time as occurring
> in discrete ticks, and just measure time as the relative history of
> changesets.

Real time is still useful, if only as a hint to users. E.g.
assume that you have dependencies the SCM doesn't know about.

Example: somebody posts on linux-kernel a one-line fix for a
remote root exploit. You'll instantly get dozens of people who
will apply that one to their local views, without waiting or
making a common unique change set.

Some of those views may have branched from a long time ago, and
not have touched any common change set for months. So the
partial order of applied change sets tells you very little.

Naturally, such one-line fixes will be slightly different, and
eventually, some of them will merge ...

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://www.almesberger.net/____________________________________________/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  6:19                             ` Werner Almesberger
@ 2003-03-13  1:31                               ` Horst von Brand
  0 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-13  1:31 UTC (permalink / raw)
  To: Werner Almesberger; +Cc: Linux Kernel Mailing List

Werner Almesberger <wa@almesberger.net> said:

[...]

> Real time is still useful, if only as a hint to users.

Lots of things would be useful to have, but you just can't get them.
There is no guarantee that the clocks of the machines are even remotely
near synchronized (don't get me started on that).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
  2003-03-12  5:44                             ` Horst von Brand
  2003-03-12  6:19                             ` Werner Almesberger
@ 2003-03-12 15:32                             ` Horst von Brand
  2003-03-12 16:13                               ` Daniel Phillips
  2 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-12 15:32 UTC (permalink / raw)
  To: Zack Brown; +Cc: Daniel Phillips, linux-kernel

Zack Brown <zbrown@tumblerings.org> said:
> On Tue, Mar 11, 2003 at 11:47:50PM -0400, Horst von Brand wrote:
> > Zack Brown <zbrown@tumblerings.org> said:
> > > --------------------------------- cut here ---------------------------------
> > > 
> > >            Linux Kernel Requirements For A Version Control System    
> > > 
> > > Document version 0.0.1
> > 
> > [...]
> > 
> > > In the context of sharing changesets between repositories, a changeset
> > > consists of a diff between the set of files in the local and remote
> > > repositories.
> > 
> > I don't think it is a good idea to handle differences _between_
> > repositories, as they could be arbitrary and change in time. A change
> > _within_ a repository is well defined.
> 
> But isn't it necessary to exchange changesets between repositories? How
> else would a developer choose exactly what changes get merged with a
> remote repository?

Again, _from_ a remote repository. I want control over the stuff I have
here.

The idea should be to be able to browse the changesets at the remote
repository and then pick changesets from there. Or just pull all
outstanding changesets (from the last synchronization point on). But that is
a bit hard... say I clone Linus' tree, and then want to synchronize with say
DaveM. But DaveM's tree is a few changesets behind Linus', and has extra
stuff. If I'm going promiscuous, I'll add some patches of my own, get some
random stuff from lkml (some of which are picked up later by Linus, others
aren't). I'd later try to get up to date with Andrea's tree, where we again
have the same scenario. And then go to Linus' next point release, who mixed
and matched, and sometimes mangled, changesets from the above in the
meantime... please tell me what the synchronization points for all those
transactions should be. Consider that DaveM might have applied changesets
to his tree in a certain order, and later Linus picked up some of the later
ones, and after some time finally integrated an earlier changeset of
DaveM's (perhaps had to merge it (i.e., adjust it) due to intervening
changes). So you don't even have a "standard order in which changesets are
applied" across the board, and "the same changeset" is different depending
on the tree on which it is applied.

So, a changeset is local, or something to be sent out and merged elsewhere
(where due to the merging it loses its former identity). Think traditional
patches: I can create a patch here, give it to you. But what you end up
applying is different due to changes at your place. You apply a different
patch.
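
A minimal sketch of why a single synchronization point breaks down,
assuming (hypothetically, and contrary to the caveat above) that a
changeset keeps a stable id when it moves between trees. The tree
contents and both function names are made up for illustration:

```python
def outstanding_by_sync_point(local, remote):
    """Naive rule: take everything in remote after the last shared id."""
    shared = set(local)
    last = max((i for i, cs in enumerate(remote) if cs in shared), default=-1)
    return remote[last + 1:]


def outstanding_by_identity(local, remote):
    """Compare by id: everything in remote that local does not have."""
    have = set(local)
    return [cs for cs in remote if cs not in have]


# Hypothetical trees: Linus picked up DaveM's later changeset d3 first,
# and has not yet integrated the earlier d1 and d2.
davem = ["base", "d1", "d2", "d3"]
linus = ["base", "l1", "d3"]
mine = list(linus)          # a fresh clone of Linus' tree

by_point = outstanding_by_sync_point(mine, davem)
by_identity = outstanding_by_identity(mine, davem)
```

The sync-point rule finds d3 as the last common changeset and reports
nothing outstanding, silently skipping d1 and d2; comparing by id finds
them, but only as long as "the same changeset" really is the same
object in both trees.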
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 15:32                             ` Horst von Brand
@ 2003-03-12 16:13                               ` Daniel Phillips
  2003-03-12 20:37                                 ` Horst von Brand
  0 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-12 16:13 UTC (permalink / raw)
  To: Horst von Brand, Zack Brown; +Cc: linux-kernel

On Wed 12 Mar 03 16:32, Horst von Brand wrote:
> ...a changeset is local, or something to be sent out and merged elsewhere
> (where due to the merging it loses its former identity). Think traditional
> patches: I can create a patch here, give it to you. But what you end up
> applying is different due to changes at your place. You apply a different
> patch.

This is why changesets need to be first-class objects in the repository,
that can be versioned, segmented and recombined.  I'd be able to pull 
slightly differing changesets from a variety of sources, *merge
the changesets* and carry the result forward in my repository.  This
way, no changeset needs to lose its identity until I explicitly want it
to.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 16:13                               ` Daniel Phillips
@ 2003-03-12 20:37                                 ` Horst von Brand
  2003-03-12 20:54                                   ` H. Peter Anvin
  2003-03-13  2:00                                   ` Daniel Phillips
  0 siblings, 2 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-12 20:37 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Zack Brown, linux-kernel

Daniel Phillips <phillips@arcor.de> said:
> On Wed 12 Mar 03 16:32, Horst von Brand wrote:
> > ...a changeset is local, or something to be sent out and merged elsewhere
> > (where due to the merging it loses its former identity). Think traditional
> > patches: I can create a patch here, give it to you. But what you end up
> > applying is different due to changes at your place. You apply a different
> > patch.

> This is why changesets need to be first-class objects in the repository,

Right.

> that can be versioned,

Versioned how? Have different versions of a changeset? Don't see the point.

>                        segmented

Here I disagree. The changeset should be seen as a (conceptually atomic)
change to the _local_ repository. The "conceptually atomic" part is from
Linus' style of "break up your megapatch into self-contained pieces, do one
step at a time". The changeset must make local sense if you want to be able
to undo, see what it changes, handle dependencies, ... Locally, what was
changed remotely (generating the changeset in the first place) is useless
(as the context isn't here).

>                                  and recombined.

Recombined how? Take changesets A and B, create C from half of each? Better
keep A, B, and create another one that gets rid of the junk. Or do a C from
scratch (perhaps by applying A and B as patches, fixing up the mess, and
declaring the resulting change a changeset).

It does make sense to group changesets, but not this way AFAICS.

>                                                  I'd be able to pull 
> slightly differing changesets from a variety of sources, *merge
> the changesets* and carry the result forward in my repository.  This
> way, no changeset needs to lose its identity until I explicitly want it
> to.

This is doing merging among changesets, not merging changesets into the
repository. I'd prefer the latter (as it reduces this special-purpose
operation to others that have to be there anyway).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 20:37                                 ` Horst von Brand
@ 2003-03-12 20:54                                   ` H. Peter Anvin
  2003-03-13  2:00                                   ` Daniel Phillips
  1 sibling, 0 replies; 155+ messages in thread
From: H. Peter Anvin @ 2003-03-12 20:54 UTC (permalink / raw)
  To: linux-kernel

Followup to:  <200303122037.h2CKboc7031958@pincoya.inf.utfsm.cl>
By author:    Horst von Brand <vonbrand@inf.utfsm.cl>
In newsgroup: linux.dev.kernel
>
> [lots of discussion about how to do a reasonable SCM]
> 

This seems a little offtopic for LKML.  Since there is already a
bitbucket project on Sourceforge I would like to suggest creating a
mailing list there, or if you for some reason don't want to use
Sourceforge I'd be happy to host one on kernel.org or on my own
server.

That way this discussion would be actually archived and possible to
find, too.

	-hpa
-- 
<hpa@transmeta.com> at work, <hpa@zytor.com> in private!
"Unix gives you enough rope to shoot yourself in the foot."
Architectures needed: ia64 m68k mips64 ppc ppc64 s390 s390x sh v850 x86-64

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 20:37                                 ` Horst von Brand
  2003-03-12 20:54                                   ` H. Peter Anvin
@ 2003-03-13  2:00                                   ` Daniel Phillips
  2003-03-15  1:03                                     ` Horst von Brand
  1 sibling, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-13  2:00 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Zack Brown, linux-kernel

On Wed 12 Mar 03 21:37, Horst von Brand wrote:
> Daniel Phillips <phillips@arcor.de> said:
> > This is why changesets need to be first-class objects in the repository,
>
> Right.
>
> > that can be versioned,
>
> Versioned how? Have different versions of a changeset? Don't see the point.

So that fix.foobar-2.5.64 that you got from davem applies to 2.5.64 that you 
got from Linus, and a later version of it, e.g., with some macro respelled by 
you (in line with Linus's recent changes) applies correctly to 2.5.69.

> ...The changeset should be seen as a (conceptually atomic)
> change to the _local_ repository.

No argument there, and no inconsistency.  Well, except that it's entirely ok 
to "soften" a changeset by adding context and send it off to somebody else.  
That somebody else may need to massage it to get it to apply, and so they end 
up with the exact changeset you sent them, which conflicts, and their own 
version of it, which works.  Then, when somebody asks them to forward the 
same changeset, they've got two chances to send one that works on the first 
try.  And so forth.

> The "conceptually atomic" part is from
> Linus' style of "break up your megapatch into self-contained pieces, do one
> step at a time".

Really essential.

> The changeset must make local sense if you want to be able
> to undo, see what it changes, handle dependencies, ... Locally, what was
> changed remotely (generating the changeset in the first place) is useless
> (as the context isn't here).

Yes, but I fail to see why that means you can't send the thing on to someone 
else to accomplish something useful.  This has always worked well with 
patches, why did it stop working with changesets?

> >                                  and recombined.
>
> Recombined how? Take changesets A and B, create C from half of each? Better
> keep A, B, and create another one that gets rid of the junk.

Yes of course.  Why would you throw any changeset away?  It's all good data. 
C would be a recombination of A and B, and all of them end up in the 
repository, though not necessarily all applied.

In order to recombine efficiently, you have to be able to explode A and B 
into their component parts (chunks make a convenient starting point) then 
reassemble them, hopefully with the help of regex matches and so on.  You 
also want to be able to do the equivalent of editing a patch, but in a way 
that isn't so error-prone.  A nice way to do that is to apply the patch, edit 
the result, then regenerate the patch.  The tedious bookkeeping aspect of 
this can be automated nicely.
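
A minimal sketch of the "apply the patch, edit the result, regenerate
the patch" loop, assuming the patch has already been applied so that
both the base and the patched file contents are at hand. The file name,
the source lines and the helper name regenerate are hypothetical; only
difflib.unified_diff from the Python standard library is relied on:

```python
import difflib


def regenerate(old_lines, new_lines, name):
    """Rebuild a unified diff from the base and the edited result."""
    return list(difflib.unified_diff(
        old_lines, new_lines,
        fromfile="a/" + name, tofile="b/" + name,
        lineterm=""))


# Hypothetical base file and the result of applying the original patch.
base = ["int x;", "cli();", "x++;", "sti();"]
patched = ["int x;", "spin_lock(&lock);", "x++;", "spin_unlock(&lock);"]

# Edit the applied result rather than the patch text itself ...
edited = [line.replace("&lock", "&x_lock") for line in patched]

# ... then let the tool do the bookkeeping of regenerating the patch.
new_patch = regenerate(base, edited, "foo.c")
print("\n".join(new_patch))
```

Hunk offsets and line counts are recomputed by the diff step, which is
the error-prone part of editing a patch by hand.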

> Or do a C from
> scratch (perhaps by applying A and B as patches, fixing up the mess, and
> declaring the resulting change a changeset).

Yes.

> It does make sense to group changesets, but not this way AFAICS.
>
> >                                                  I'd be able to pull
> > slightly differing changesets from a variety of sources, *merge
> > the changesets* and carry the result forward in my repository.  This
> > way, no changeset needs to lose its identity until I explicitly want it
> > to.
>
> This is doing merging among changesets, not merging changesets into the
> repository.

No, I meant merging the changesets into the repository.  Then the system 
regenerates the changeset, and understands it to be a descendant version of 
the original changeset.  The result is a "merged" changeset, i.e., one that 
applies correctly, whereas the original didn't.
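
A minimal sketch of a changeset that records which version it descends
from, assuming a simple parent link is enough to express "descendant
version". The class, its fields and the lineage method are hypothetical
and do not describe any existing tool:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Changeset:
    name: str                               # e.g. "fix.foobar"
    applies_to: str                         # tree version it applies to
    parent: Optional["Changeset"] = None    # version it was merged from

    def lineage(self):
        """Tree versions this changeset has been carried through, oldest first."""
        chain, cs = [], self
        while cs is not None:
            chain.append(cs.applies_to)
            cs = cs.parent
        return chain[::-1]


# The changeset as received, and the regenerated version after merging.
original = Changeset("fix.foobar", "2.5.64")
merged = Changeset("fix.foobar", "2.5.69", parent=original)
```

Both versions stay in the repository: merged.lineage() gives
["2.5.64", "2.5.69"], and the original is still reachable through
merged.parent.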

> I'd prefer the latter (as it reduces this special-purpose
> operation to others that have to be there anyway).

I suppose that we're violently agreeing, and just haggling over terminology.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13  2:00                                   ` Daniel Phillips
@ 2003-03-15  1:03                                     ` Horst von Brand
  0 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-15  1:03 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Zack Brown, linux-kernel

Daniel Phillips <phillips@arcor.de> said:
> On Wed 12 Mar 03 21:37, Horst von Brand wrote:
> > Daniel Phillips <phillips@arcor.de> said:

[...]

> > Versioned how? Have different versions of a changeset? Don't see the point.

> So that fix.foobar-2.5.64 that you got from davem applies to 2.5.64 that
> you got from Linus, and a later version of it, e.g., with some macro
> respelled by you (in line with Linus's recent changes) applies correctly
> to 2.5.69.

This is "keep the original change + the merged change". Sounds logical, but
OTOH if it is Linus' tree, you'll add _tons_ of irrelevant (rejected/fixed)
changes. Plus why should I send out changes that can be gotten directly?

At the very least, I'd need the ability to throw away original changes
deemed irrelevant.

> > ...The changeset should be seen as a (conceptually atomic)
> > change to the _local_ repository.

> No argument there, and no inconsistency.  Well, except that it's entirely
> ok to "soften" a changeset by adding context and send it off to somebody
> else.

I don't see the need to decide on "softening". A changeset sent around will
get applied to different trees (that is its idea, anyway), so it has to be
"soft" (like patches carry context, while e.g. RCS doesn't keep context
internally).

>       That somebody else may need to massage it to get it to apply,

Merging.

>                                                                     and
> so they end up with the exact changeset you sent them, which conflicts,
> and their own version of it, which works.  Then, when somebody asks them
> to forward the same changeset, they've got two chances to send one that
> works on the first try.  And so forth.

If they follow my tree (I've got no clue of why somebody would want to do
that ;-), my change is fine. If not, they'll be following another tree, and
my version of the change isn't relevant.

[...]

> Yes, but I fail to see why that means you can't send the thing on to someone 
> else to accomplish something useful.  This has always worked well with 
> patches, why did it stop working with changesets?

Where did you get patches? Either from the original author directly, or
from somebody who fixed it somehow. With the ubiquitous Internet, there
should be little need to pass changesets around.

> > >                                  and recombined.

> > Recombined how? Take changesets A and B, create C from half of each? Better
> > keep A, B, and create another one that gets rid of the junk.
> 
> Yes of course.  Why would you throw any changeset away?  It's all good data. 
> C would be a recombination of A and B, and all of them end up in the 
> repository, though not necessarily all applied.

I'd prefer applying A and B and then making D that fixes up the mess,
instead of creating an artificial C. That is the way people do the work,
plus A, B, D will make sense even for A's author or some third party who
picked A and B up.

[...]

> > > slightly differing changesets from a variety of sources, *merge
> > > the changesets* and carry the result forward in my repository.  This
> > > way, no changeset needs to lose its identity until I explicitly want it
> > > to.

> > This is doing merging among changesets, not merging changesets into the
> > repository.

> No, I meant merging the changesets into the repository.  Then the system 
> regenerates the changeset, and understands it to be a descendant version of 
> the original changeset.  The result is a "merged" changeset, i.e., one that 
> applies correctly, whereas the original didn't.

Why do this? Just merge A, B, C in (fixing "rejects"), then fix up the
result (create D). You can then ship (your versions of) A, B, C, plus D.

> > I'd prefer the latter (as it reduces this special-purpose
> > operation to others that have to be there anyway).
> 
> I suppose that we're violently agreeing, and just haggling over
> terminology.

Maybe; but it makes sense to get the set of basic operations and their
meaning crystal clear, then hash out what non-basic operations can be faked
in terms of them (without going too much against the natural flow of work).
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12  3:47                         ` Horst von Brand
  2003-03-12  4:03                           ` Larry McVoy
  2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
@ 2003-03-12 13:22                           ` Daniel Phillips
  2003-03-13  0:52                             ` Horst von Brand
  2 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-12 13:22 UTC (permalink / raw)
  To: Horst von Brand, Zack Brown; +Cc: linux-kernel

On Wed 12 Mar 03 04:47, Horst von Brand wrote:
> ...You need to focus on changes to files,
> not files. I.e., file appeared/disappeared/changed name/was edited by
> altering lines so and so.

It's useful to make the distinction that "file appeared/disappeared/changed 
name" are changes to a directory object, while "was edited by altering lines 
so and so" is a change to a file object...

[...]

> > This consists of allowing developers to rename files and directories, and
> > have all repository operations properly recognize and handle this.
>
> And create and destroy. Note "rename" must include moving directories
> around, and moving stuff from one directory to another, etc.

...then this part gets much easier.

Regards,

Daniel


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 13:22                           ` Daniel Phillips
@ 2003-03-13  0:52                             ` Horst von Brand
  2003-03-13 17:00                               ` Daniel Phillips
  0 siblings, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-13  0:52 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Zack Brown, linux-kernel

Daniel Phillips <phillips@arcor.de> said:
> On Wed 12 Mar 03 04:47, Horst von Brand wrote:
> > ...You need to focus on changes to files,
> > not files. I.e., file appeared/disappeared/changed name/was edited by
> > altering lines so and so.

> It's useful to make the distinction that "file appeared/disappeared/changed 
> name" are changes to a directory object, while "was edited by altering lines 
> so and so" is a change to a file object...

I don't think so. As the user sees it, a directory is mostly a convenient
labeled container for files. You think in terms of moving files around, not
destroying one and magically creating an exact copy elsewhere (even if
mv(1) does exactly this in some cases). Also, this breaks up the operation
"mv foo bar/baz" into _two_ changes, and this is wrong as the file loses
its revision history.

> [...]
> 
> > > This consists of allowing developers to rename files and directories, and
> > > have all repository operations properly recognize and handle this.
> >
> > And create and destroy. Note "rename" must include moving directories
> > around, and moving stuff from one directory to another, etc.
> 
> ...then this part gets much easier.

... by screwing it up. This is exactly one of the problems noted for CVS.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13  0:52                             ` Horst von Brand
@ 2003-03-13 17:00                               ` Daniel Phillips
  2003-03-13 21:48                                 ` Zack Brown
  2003-03-15 16:21                                 ` Horst von Brand
  0 siblings, 2 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-13 17:00 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Zack Brown, linux-kernel

On Thu 13 Mar 03 01:52, Horst von Brand wrote:
> Daniel Phillips <phillips@arcor.de> said:
> > On Wed 12 Mar 03 04:47, Horst von Brand wrote:
> > > ...You need to focus on changes to files,
> > > not files. I.e., file appeared/disappeared/changed name/was edited by
> > > altering lines so and so.
> >
> > It's useful to make the distinction that "file
> > appeared/disappeared/changed name" are changes to a directory object,
> > while "was edited by altering lines so and so" is a change to a file
> > object...
>
> I don't think so. As the user sees it, a directory is mostly a convenient
> labeled container for files. You think in terms of moving files around, not
> destroying one and magically creating an exact copy elsewhere (even if
> mv(1) does exactly this in some cases). Also, this breaks up the operation
> "mv foo bar/baz" into _two_ changes, and this is wrong as the file loses
> its revision history.

No, that's a single change to one directory object.

> > ...then this part gets much easier.
>
> ... by screwing it up. This is exactly one of the problems noted for CVS.

CVS doesn't have directory objects.

Does anybody have a convenient mailing list for this design discussion?

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13 17:00                               ` Daniel Phillips
@ 2003-03-13 21:48                                 ` Zack Brown
  2003-03-13 22:04                                   ` Daniel Phillips
  2003-03-15 16:21                                 ` Horst von Brand
  1 sibling, 1 reply; 155+ messages in thread
From: Zack Brown @ 2003-03-13 21:48 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Horst von Brand, linux-kernel

On Thu, Mar 13, 2003 at 06:00:48PM +0100, Daniel Phillips wrote:
> Does anybody have a convenient mailing list for this design discussion?

Keep in mind that one part of the discussion is to figure out what is
and is not required for adoption by the kernel team. For that, this is
probably the best place to discuss it. Otherwise, it's just the same
tail-chasing that has been going on with the various version control
projects up till now.

Later on, people can just be referred to an existing feature description,
which will cut down on future flamewars on lkml.

Be well,
Zack

> 
> Regards,
> 
> Daniel

-- 
Zack Brown

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13 21:48                                 ` Zack Brown
@ 2003-03-13 22:04                                   ` Daniel Phillips
  0 siblings, 0 replies; 155+ messages in thread
From: Daniel Phillips @ 2003-03-13 22:04 UTC (permalink / raw)
  To: Zack Brown; +Cc: Horst von Brand, linux-kernel

On Thu 13 Mar 03 22:48, Zack Brown wrote:
> On Thu, Mar 13, 2003 at 06:00:48PM +0100, Daniel Phillips wrote:
> > Does anybody have a convenient mailing list for this design discussion?
>
> Keep in mind that one part of the discussion is to figure out what is
> and is not required for adoption by the kernel team. For that, this is
> probably the best place to discuss it. Otherwise, it's just the same
> tail-chasing that has been going on with the various version control
> projects up till now.

Well, I know that, but HPA declared it offtopic and I wish to respect that.  

> Later on, people can just be referred to an existing feature description,
> which will cut down on future flamewars on lkml.

Right, but we went well beyond what the features should be and started into 
the implementation details.  I'm getting a lot out of it, personally, but 
others may not be.

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-13 17:00                               ` Daniel Phillips
  2003-03-13 21:48                                 ` Zack Brown
@ 2003-03-15 16:21                                 ` Horst von Brand
  2003-03-15 21:25                                   ` Daniel Phillips
  1 sibling, 1 reply; 155+ messages in thread
From: Horst von Brand @ 2003-03-15 16:21 UTC (permalink / raw)
  To: Daniel Phillips; +Cc: Linux Kernel Mailing List

Daniel Phillips <phillips@arcor.de> said:
> On Thu 13 Mar 03 01:52, Horst von Brand wrote:

[...]

> > I don't think so. As the user sees it, a directory is mostly a convenient
> > labeled container for files. You think in terms of moving files around, not
> > destroying one and magically creating an exact copy elsewhere (even if
> > mv(1) does exactly this in some cases). Also, this breaks up the operation
> > "mv foo bar/baz" into _two_ changes, and this is wrong as the file loses
> > its revision history.

> No, that's a single change to one directory object.

mv some/where/foo bar/baz

How is that _one_ change to _one_ directory object?

> > > ...then this part gets much easier.
> >
> > ... by screwing it up. This is exactly one of the problems noted for CVS.
> 
> CVS doesn't have directory objects.

And it doesn't keep history across moves, as the only way it knows to move
a file is destroying the original and creating a fresh copy.

> Does anybody have a convenient mailing list for this design discussion?

Good idea to move this off LKML
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 16:21                                 ` Horst von Brand
@ 2003-03-15 21:25                                   ` Daniel Phillips
  2003-03-15 21:53                                     ` Robert Anderson
  0 siblings, 1 reply; 155+ messages in thread
From: Daniel Phillips @ 2003-03-15 21:25 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Linux Kernel Mailing List

On Sat 15 Mar 03 17:21, Horst von Brand wrote:
> Daniel Phillips <phillips@arcor.de> said:
> > On Thu 13 Mar 03 01:52, Horst von Brand wrote:
>
> [...]
>
> > > I don't think so. As the user sees it, a directory is mostly a
> > > convenient labeled container for files. You think in terms of moving
> > > files around, not destroying one and magically creating an exact copy
> > > elsewhere (even if mv(1) does exactly this in some cases). Also, this
> > > breaks up the operation "mv foo bar/baz" into _two_ changes, and this
> > > is wrong as the file loses its revision history.
> >
> > No, that's a single change to one directory object.
>
> mv some/where/foo bar/baz
>
> How is that _one_ change to _one_ directory object?

Oops, sorry, I didn't read your bar/baz correctly.  Yes, it's two directory 
objects, but it's only one file object, and the history (not including the 
name changes) is attached to the file object, not the directory object.  This 
is implemented via an object id for each file object, something like an inode 
number.
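
A minimal sketch of the object id idea, assuming directories are plain
name-to-id maps and content history hangs off the id, much like an
inode number. The Repo class and its method names are hypothetical:

```python
import itertools


class Repo:
    def __init__(self):
        self._ids = itertools.count(1)
        self.dirs = {}      # directory path -> {name: object id}
        self.history = {}   # object id -> list of content revisions

    def create(self, directory, name, content):
        oid = next(self._ids)
        self.dirs.setdefault(directory, {})[name] = oid
        self.history[oid] = [content]
        return oid

    def edit(self, directory, name, content):
        # A content change touches only the file object.
        self.history[self.dirs[directory][name]].append(content)

    def move(self, src_dir, src_name, dst_dir, dst_name):
        # A move touches two directory objects; the file object and its
        # history are left alone.
        oid = self.dirs[src_dir].pop(src_name)
        self.dirs.setdefault(dst_dir, {})[dst_name] = oid


repo = Repo()
oid = repo.create("some/where", "foo", "v1")
repo.edit("some/where", "foo", "v2")
repo.move("some/where", "foo", "bar", "baz")   # mv some/where/foo bar/baz
```

After the move, bar/baz resolves to the same object id as before, so
its revision history is still ["v1", "v2"].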

> > > > ...then this part gets much easier.
> > >
> > > ... by screwing it up. This is exactly one of the problems noted for
> > > CVS.
> >
> > CVS doesn't have directory objects.
>
> And it doesn't keep history across moves, as the only way it knows to move
> a file is destroying the original and creating a fresh copy.

Ah, but it does.  Sorry for not explaining the object id thing earlier.

> > Does anybody have a convenient mailing list for this design discussion?
>
> Good idea to move this off LKML

Yup, but nobody has offered one yet, so...

Regards,

Daniel

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:25                                   ` Daniel Phillips
@ 2003-03-15 21:53                                     ` Robert Anderson
  2003-03-15 21:50                                       ` Randy.Dunlap
  2003-03-16  0:18                                       ` Petr Baudis
  0 siblings, 2 replies; 155+ messages in thread
From: Robert Anderson @ 2003-03-15 21:53 UTC (permalink / raw)
  To: Daniel Phillips, lkml; +Cc: arch

On Sat, 2003-03-15 at 13:25, Daniel Phillips wrote:
> On Sat 15 Mar 03 17:21, Horst von Brand wrote:
> > Daniel Phillips <phillips@arcor.de> said:
> > > On Thu 13 Mar 03 01:52, Horst von Brand wrote:
> >
> > [...]
> >
> > > > I don't think so. As the user sees it, a directory is mostly a
> > > > convenient labeled container for files. You think in terms of moving
> > > > files around, not destroying one and magically creating an exact copy
> > > > elsewhere (even if mv(1) does exactly this in some cases). Also, this
> > > > breaks up the operation "mv foo bar/baz" into _two_ changes, and this
> > > > is wrong as the file loses its revision history.
> > >
> > > No, that's a single change to one directory object.
> >
> > mv some/where/foo bar/baz
> >
> > How is that _one_ change to _one_ directory object?
> 
> Oops, sorry, I didn't read your bar/baz correctly.  Yes, it's two directory 
> objects, but it's only one file object, and the history (not including the 
> name changes) is attached to the file object, not the directory object.  This 
> is implemented via an object id for each file object, something like an inode 
> number.
> 
> > > > > ...then this part gets much easier.
> > > >
> > > > ... by screwing it up. This is exactly one of the problems noted for
> > > > CVS.
> > >
> > > CVS doesn't have directory objects.
> >
> > And it doesn't keep history across moves, as the only way it knows to move
> > a file is destroying the original and creating a fresh copy.
> 
> Ah, but it does.  Sorry for not explaining the object id thing earlier.
> 
> > > Does anybody have a convenient mailing list for this design discussion?
> >
> > Good idea to move this off LKML
> 
> Yup, but nobody has offered one yet, so...

I think the arch-users@lists.fifthvision.net list would be happy to host
continuing discussion in this vein.  Considering Larry's repeated
attempts to get people to look at arch as a "better fit," it seems
particularly appropriate.

Of course, you'd have to tolerate "arch community" views on a lot of
these issues, but I suspect that might help focus the discussion.

Bob



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:53                                     ` Robert Anderson
@ 2003-03-15 21:50                                       ` Randy.Dunlap
  2003-03-15 22:16                                         ` Robert Anderson
  2003-03-15 22:18                                         ` Robert Anderson
  2003-03-16  0:18                                       ` Petr Baudis
  1 sibling, 2 replies; 155+ messages in thread
From: Randy.Dunlap @ 2003-03-15 21:50 UTC (permalink / raw)
  To: rwa; +Cc: phillips, linux-kernel, arch-users

>> > > Does anybody have a convenient mailing list for this design
>> discussion?
>> >
>> > Good idea to move this off LKML
>>
>> Yup, but nobody has offered one yet, so...
>
> I think the arch-users@lists.fifthvision.net list would be happy to host
> continuing discussion in this vein.  Considering Larry's repeated
> attempts to get people to look at arch as a "better fit," it seems
> particularly appropriate.
>
> Of course, you'd have to tolerate "arch community" views on a lot of these
> issues, but I suspect that might help focus the discussion.
>
> Bob
> -

Yes, that sounds good to me too.
And they have already begun a list of gcc and Linux kernel SCM
requirements (see http://arch.fifthvision.net/bin/view/Arx/WebHome
for "Requirements").

~Randy




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:50                                       ` Randy.Dunlap
@ 2003-03-15 22:16                                         ` Robert Anderson
  2003-03-15 22:18                                         ` Robert Anderson
  1 sibling, 0 replies; 155+ messages in thread
From: Robert Anderson @ 2003-03-15 22:16 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: phillips, lkml, arch

On Sat, 2003-03-15 at 13:50, Randy.Dunlap wrote:
> >> > > Does anybody have a convenient mailing list for this design
> >> discussion?
> >> >
> >> > Good idea to move this off LKML
> >>
> >> Yup, but nobody has offered one yet, so...
> >
> > I think the arch-users@lists.fifthvision.net list would be happy to host
> > continuing discussion in this vein.  Considering Larry's repeated
> > attempts to get people to look at arch as a "better fit," it seems
> > particularly appropriate.
> >
> > Of course, you'd have to tolerate "arch community" views on a lot of these
> > issues, but I suspect that might help focus the discussion.
> >
> > Bob
> > -
> 
> Yes, that sounds good to me too.
> And they have already begun a list of what gcc and Linux kernel SCM
> requirements (see http://arch.fifthvision.net/bin/view/Arx/WebHome
> for "Requirements").
> 
> ~Randy

Sounds good.  Here's the mailing list page: 
http://lists.fifthvision.net/mailman/listinfo/arch-users/

You have to be registered, or your messages will be queued for
moderation.

Bob




^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:50                                       ` Randy.Dunlap
  2003-03-15 22:16                                         ` Robert Anderson
@ 2003-03-15 22:18                                         ` Robert Anderson
  1 sibling, 0 replies; 155+ messages in thread
From: Robert Anderson @ 2003-03-15 22:18 UTC (permalink / raw)
  To: Randy.Dunlap; +Cc: phillips, lkml, arch

On Sat, 2003-03-15 at 13:50, Randy.Dunlap wrote:
> >> > > Does anybody have a convenient mailing list for this design
> >> discussion?
> >> >
> >> > Good idea to move this off LKML
> >>
> >> Yup, but nobody has offered one yet, so...
> >
> > I think the arch-users@lists.fifthvision.net list would be happy to host
> > continuing discussion in this vein.  Considering Larry's repeated
> > attempts to get people to look at arch as a "better fit," it seems
> > particularly appropriate.
> >
> > Of course, you'd have to tolerate "arch community" views on a lot of these
> > issues, but I suspect that might help focus the discussion.
> >
> > Bob
> > -
> 
> Yes, that sounds good to me too.
> And they have already begun a list of what gcc and Linux kernel SCM
> requirements (see http://arch.fifthvision.net/bin/view/Arx/WebHome

I actually just moved this topic to:

http://arch.fifthvision.net/bin/view/Main/WebHome

since it doesn't properly belong exclusively to the "ArX" fork of the
arch project.

Bob



^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-15 21:53                                     ` Robert Anderson
  2003-03-15 21:50                                       ` Randy.Dunlap
@ 2003-03-16  0:18                                       ` Petr Baudis
  2003-03-16  0:53                                         ` Davide Libenzi
                                                           ` (4 more replies)
  1 sibling, 5 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-16  0:18 UTC (permalink / raw)
  To: Robert Anderson; +Cc: Daniel Phillips, lkml, arch

Dear diary, on Sat, Mar 15, 2003 at 10:53:34PM CET, I got a letter,
where Robert Anderson <rwa@alumni.princeton.edu> told me, that...
> On Sat, 2003-03-15 at 13:25, Daniel Phillips wrote:
> > On Sat 15 Mar 03 17:21, Horst von Brand wrote:
> > > Daniel Phillips <phillips@arcor.de> said:
> > > > On Thu 13 Mar 03 01:52, Horst von Brand wrote:
..snip..
> > > > Does anybody have a convenient mailing list for this design discussion?
> > >
> > > Good idea to move this off LKML
> > 
> > Yup, but nobody has offered one yet, so...
> 
> I think the arch-users@lists.fifthvision.net list would be happy to host
> continuing discussion in this vein.  Considering Larry's repeated
> attempts to get people to look at arch as a "better fit," it seems
> particularly appropriate.
> 
> Of course, you'd have to tolerate "arch community" views on a lot of
> these issues, but I suspect that might help focus the discussion.

I'm not sure if arch is the right thing to base on. Its concepts are surely
interesting, however there are several problems (some of them may be
subjective):

* Terrible interface. Work with arch involves much more typing out of long
commands (and sequences of these), subcommands and parameters to get
functionality equivalent to the one provided much simpler by other SCMs. I see
it is in sake of genericity and sometimes more sophisticated usage scheme, but
I fear it can be PITA in practice for daily work.

* Awful revision names (just unique ids format). Again, it involves much more
typing and after some hours of work, the dashes will start to dance around and
regroup at random places in front of your eyes. The concepts behind (like
seamless division to multiple archives; I can't say I see sense in categories)
are intriguing, but the result again doesn't seem very practical.

* Evil directory naming. {arch} seems much more visible than CVS/ and SCCS/,
particularly as it gets sorted as last in a directory, thus you see it at the
bottom of ls output. Also it's a PITA with bash, as the stuff starting by '='
(arch likes to spawn that as well) is. The files starting by '+' are problem
for vi, which is kind of flaw when they are probably the only arch files
dedicated for editing by user (they are supposed to contain log messages).

* Cloud of shell scripts. It poses a lot of limitations which are pain to work
around (including speed, two-fields version numbers [eek] and I can imagine
several others; I'm not sure about these though, so I won't name further; you
can possibly imagine something by yourself).

* Absence of sufficient merging ability, at least impression I got from the
documentation. Merging on the *.rej files level I cannot call sufficient ;-).
Also, history is not preserved during merging, which is quite fatal.  And it
looks to me at least from the documentation that arch is still in the
update-before-commit stage.

* Absence of checkin/commit distinction. File revisions and changesets seem to
be tied together, losing some of the cute flexibility BK has.

I must have missed terribly something in the documentation given how arch is
being recommended, please feel encouraged to correct me. But as I see it, most
of the juicy stuff is missing (although I really like the concept of
configurations and especially the concept of caching --- mainly that you do not
_have_ to pull all the stuff from the clonee repository, which can be a pain
with more poor internet connection; then also if you aren't doing any that big
changes and you're confident that the remote repository is going to stay there,
it is less expensive to talk with the repository over network) and the existing
stuff is mostly in the form of shell scripts, which it has to leave and be
rewritten sooner or later anyway. The backend history format doesn't appear to
be particularly great as well. Dunno. What's so special about arch then?

Kind regards,

-- 

				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  0:18                                       ` Petr Baudis
@ 2003-03-16  0:53                                         ` Davide Libenzi
  2003-03-16  0:55                                         ` [arch-users] " Stig Brautaset
                                                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: Davide Libenzi @ 2003-03-16  0:53 UTC (permalink / raw)
  To: Petr Baudis; +Cc: lkml

On Sun, 16 Mar 2003, Petr Baudis wrote:

> I'm not sure if arch is the right thing to base on. Its concepts are surely
> interesting, however there are several problems (some of them may be
> subjective):
>
> * Terrible interface. Work with arch involves much more typing out of long
> commands (and sequences of these), subcommands and parameters to get
> functionality equivalent to the one provided much simpler by other SCMs. I see
> it is in sake of genericity and sometimes more sophisticated usage scheme, but
> I fear it can be PITA in practice for daily work.
>
> * Awful revision names (just unique ids format). Again, it involves much more
> typing and after some hours of work, the dashes will start to dance around and
> regroup at random places in front of your eyes. The concepts behind (like
> seamless division to multiple archives; I can't say I see sense in categories)
> are intriguing, but the result again doesn't seem very practical.
>
> * Evil directory naming. {arch} seems much more visible than CVS/ and SCCS/,
> particularly as it gets sorted as last in a directory, thus you see it at the
> bottom of ls output. Also it's a PITA with bash, as the stuff starting by '='
> (arch likes to spawn that as well) is. The files starting by '+' are problem
> for vi, which is kind of flaw when they are probably the only arch files
> dedicated for editting by user (they are supposed to contain log messages).
>
> * Cloud of shell scripts. It poses a lot of limitations which are pain to work
> around (including speed, two-fields version numbers [eek] and I can imagine
> several others; I'm not sure about these though, so I won't name further; you
> can possibly imagine something by yourself).
>
> * Absence of sufficient merging ability, at least impression I got from the
> documentation. Merging on the *.rej files level I cannot call sufficient ;-).
> Also, history is not preserved during merging, which is quite fatal.  And it
> looks to me at least from the documentation that arch is still in the
> update-before-commit stage.
>
> * Absence of checkin/commit distinction. File revisions and changesets seem to
> be tied together, losing some of the cute flexibility BK has.
>
> I must have missed terribly something in the documentation given how arch is
> being recommended, please feel encouraged to correct me. But as I see it, most

I must have missed it too. Last time I checked I had the same impression. A
bunch of shell scripts ( speed and portability goodbye ) and even
diff&patch were run as external programs. Maybe it has the right concepts
but the architecture is *at least* weak. Subversion looked ( last time I
checked ) a better organized project, with *real* source code. Ok, it has
*insane* symbol names, but IMHO it's way better than the shell script
cloud.




- Davide


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [arch-users] Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  0:18                                       ` Petr Baudis
  2003-03-16  0:53                                         ` Davide Libenzi
@ 2003-03-16  0:55                                         ` Stig Brautaset
  2003-03-16  1:44                                         ` Tom Lord
                                                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 155+ messages in thread
From: Stig Brautaset @ 2003-03-16  0:55 UTC (permalink / raw)
  To: arch; +Cc: Robert Anderson, Daniel Phillips, lkml

On Mar 16 2003, Petr wrote:
> > I think the arch-users@lists.fifthvision.net list would be happy to host
> > continuing discussion in this vein.  Considering Larry's repeated
> > attempts to get people to look at arch as a "better fit," it seems
> > particularly appropriate.
> > 
> > Of course, you'd have to tolerate "arch community" views on a lot of
> > these issues, but I suspect that might help focus the discussion.
> 
> I'm not sure if arch is the right thing to base on. Its concepts are surely
> interesting, however there are several problems (some of them may be
> subjective):
> 
> * Terrible interface. Work with arch involves much more typing out of long
> commands (and sequences of these), subcommands and parameters to get
> functionality equivalent to the one provided much simpler by other SCMs. I see
> it is in sake of genericity and sometimes more sophisticated usage scheme, but
> I fear it can be PITA in practice for daily work.

Someone made a script not long ago to create four-letter aliases of all
arch commands. Instead of `larch star-merge' you type `lstm'. Does that
sound more like what you want?

> * Awful revision names (just unique ids format). Again, it involves much more
> typing and after some hours of work, the dashes will start to dance around and
> regroup at random places in front of your eyes. The concepts behind (like
> seamless division to multiple archives; I can't say I see sense in categories)
> are intriguing, but the result again doesn't seem very practical.

Choose shorter names ;p

> * Evil directory naming. {arch} seems much more visible than CVS/ and SCCS/,
> particularly as it gets sorted as last in a directory, thus you see it at the
> bottom of ls output.

echo "alias ls='ls --ignore {arch}'" >> .bashrc

Funnily enough, {arch} lists _first_ in ls output here. That was the
idea behind the curly braces in the first place too afaik.

> Also it's a PITA with bash, as the stuff starting by '=' (arch likes
> to spawn that as well) is. 

No it doesn't. Tom, the main author of arch, likes files starting with
`='. The rest of us are not so sure ;) Off the top of my head I cannot
think of any file users should have to touch which has a name starting
with `='. 


> The files starting by '+' are problem for vi, which is kind of flaw
> when they are probably the only arch files dedicated for editting by
> user (they are supposed to contain log messages).

This is a known issue and is being looked into afaik.
I for one agree completely with this point.

> * Cloud of shell scripts. It poses a lot of limitations which are pain to work
> around (including speed, two-fields version numbers [eek] and I can imagine
> several others; I'm not sure about these though, so I won't name further; you
> can possibly imagine something by yourself).

Arch being a bunch of shell scripts:

	http://arch.fifthvision.net/bin/view/Main/ArchMyths

Three-field version names are being worked on IIRC.

> Also, history is not preserved during merging, which is quite fatal.

Not true. Any merge will include patch logs for the merged-in patches.

> And it looks to me at least from the documentation that arch is still
> in the update-before-commit stage.

Have you looked at the --out-of-date-ok flag to commit? (not that I
understand why you would want to use that...)

> rewritten sooner or later anyway. The backend history format doesn't appear to
> be particularily great as well. Dunno. What's so special about arch then?

This says it so much better than I can:

	http://arch.fifthvision.net/bin/view/Main/WhyArch


Stig
-- 
brautaset.org

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [arch-users] Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  0:18                                       ` Petr Baudis
  2003-03-16  0:53                                         ` Davide Libenzi
  2003-03-16  0:55                                         ` [arch-users] " Stig Brautaset
@ 2003-03-16  1:44                                         ` Tom Lord
  2003-03-16  2:06                                           ` Adam Spiers
  2003-03-16  5:43                                         ` Robert Anderson
  2003-03-16 11:57                                         ` (Re: BitBucket: GPL-ed KitBeeper clone) Moving to arch-users Petr Baudis
  4 siblings, 1 reply; 155+ messages in thread
From: Tom Lord @ 2003-03-16  1:44 UTC (permalink / raw)
  To: arch-users; +Cc: linux-kernel



       I'm not sure if arch is the right thing to base on. Its
       concepts are surely interesting, however there are several
       problems (some of them may be subjective):

Let's see.

I'll say at the outset: you've named a bunch of things that do give a
bad first-impression to many users.  None of these issues go "deep"
into arch -- there's lots of room, and even some actual work, towards
changing some of what you're complaining about.  If the question is
"is arch a good starting point" -- the fact that all of these are
fairly minor issues reinforces the answer "yes", even if people insist
that there be changes related to these issues.




      * Terrible interface. Work with arch involves much more typing
        out of long commands (and sequences of these), subcommands and
        parameters to get functionality equivalent to the one provided
        much simpler by other SCMs. I see it is in sake of genericity
        and sometimes more sophisticated usage scheme, but I fear it
        can be PITA in practice for daily work.

Perhaps so.  But the question is "Is arch the right starting point
from which to build a system for Linux kernel developers."  If we
agree that what you describe is a problem, it seems to me that the
solution (at least to long command names and options) is _trivial_:
write some front-end scripts.  That would be easy to do, wouldn't take
much code, and if a winning convenience layer emerged from that, I'm
sure we'd be happy to add it to arch (possibly via a more general
"alias" mechanism for creating short-names for commands with default
option values).

But then there's revision names:

    * Awful revision names (just unique ids format). Again, it
      involves much more typing and after some hours of work, the
      dashes will start to dance around and regroup at random places
      in front of your eyes.

In practice, that hasn't been a problem.  Instead, what people
who use arch to do real work complain about is:

1) two-component version numbers, major.minor

   Several people want three-component.  Everyone agrees that
   n-component (user's choice) is best.   We have good practical 
   reasons for making the change to n-component versions slowly and
   carefully, but it is not a major change.   Again, I'm assuming that
   the question is "is arch the best starting point".



2) ordering of components

   Arch unique ids say:

		<category>--<branch>--<version>

   and some users would rather have:

		<category>--<version>--<branch>

   which better matches the naming scheme currently used for the Linux
   kernel.

   So far, there really aren't sufficiently vociferous+convincing 
   requests to make any changes in this area -- but again, in the big
   picture, no matter what happens in this area -- it's a minor point.
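
   To make the two orderings concrete, here is a minimal sketch in
   Python.  It assumes only the three `--'-separated fields shown
   above; the helper names and the sample name "hello--devo--1.0" are
   invented for illustration and are not part of arch itself.

```python
from typing import NamedTuple


class ArchName(NamedTuple):
    category: str
    branch: str
    version: str


def parse_name(name: str) -> ArchName:
    """Split <category>--<branch>--<version> into its three fields."""
    parts = name.split("--")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected <category>--<branch>--<version>: %r" % name)
    return ArchName(*parts)


def version_first(name: str) -> str:
    """Re-order to the <category>--<version>--<branch> form some users prefer."""
    n = parse_name(name)
    return "--".join((n.category, n.version, n.branch))


print(version_first("hello--devo--1.0"))
```

   Either ordering carries the same three fields, which is why the
   choice between them stays a surface-level question.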


      > The concepts behind (like seamless division to multiple
      > archives; I can't say I see sense in categories) are
      > intriguing, but the result again doesn't seem very practical.

   Don't have much to say about that.   It's been quite practical 
   for me, at least, in practice.



	* Evil directory naming. {arch} seems much more visible than
	CVS/ and SCCS/, particularly as it gets sorted as last in a
	directory, thus you see it at the bottom of ls output. Also
	it's a PITA with bash, as the stuff starting by '=' (arch
	likes to spawn that as well) is. The files starting by '+' are
	problem for vi, which is kind of flaw when they are probably
	the only arch files dedicated for editting by user (they are
	supposed to contain log messages).

  Yet again: these are minor complaints.

  `+'-named log message files _are_ going to change to something more
  vi-friendly.  My bad.  I'm both an emacs user and a
  unix-traditionalist.  I didn't initially notice the problem and my
  reaction on hearing about it was "Well, vi is broken" -- but as a
  practical matter, arch does need to change in that area.

  arch itself does not generate `=' files in source trees -- I use
  them in the arch source code; they do appear in archives and under
  {arch} where you'll nearly never need to interact with them via
  bash.  Incidentally, `bash' has recently been patched (not sure if
  it's released yet) to make it deal properly, or at least better,
  with `=' files.  ("Well, bash is broken." :-)

  I'm not sure why you think that "{arch}" is bad.  There's _one_ of
  those per controlled _tree_, while there's one CVS/ per _directory_.
  I'm not sure why you think the sort-order of {arch} is bad -- I
  think it's a feature because it puts that directory "out of sight;
  out of mind" when I use my outline-style directory editor (if you
  are an Emacs user, would you like a copy of my tree editor,
  "monkey"?).  I'm not sure why you think it's a PITA wrt bash -- I
  use bash interactively and have never had any problem with {arch}.
  But you know, again, this is a shallow issue.  Practically speaking,
  changing that name to something else is relatively low impact
  (though, to be sure, a tedious change that would take several entire
  _hours_ to make + a few days to figure out how to deal with existing
  archives).


	* Cloud of shell scripts. It poses a lot of limitations which
	are pain to work around (including speed, two-fields version
	numbers [eek] and I can imagine several others; I'm not sure
	about these though, so I won't name further; you can possibly
	imagine something by yourself).

You'll be happier with n-component versions when we have a variation
on sort(1) that handles them with ease -- and that's where we're
going.
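
As a rough illustration of what such a sort has to do (this is a
sketch in Python, not the planned arch tool, and `version_key' is a
name made up here), compare the components as integers rather than as
text:

```python
def version_key(version: str) -> tuple:
    """Turn "1.10.2" into (1, 10, 2) so components compare numerically."""
    return tuple(int(part) for part in version.split("."))


versions = ["1.10", "1.2", "1.2.1", "2.0", "1.9.3"]
print(sorted(versions, key=version_key))
```

A plain textual sort would put "1.10" before "1.2"; comparing integer
tuples avoids that and works for any number of components.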

Robert Anderson has posted on the wiki a tasty defense of the choice
of `sh' for the first implementation.   He also pointed out some great
URLs from the SCSH site in the discussion on kerneltrap.org.  (Sorry
for the meta-URLs here....)

Some people complain about `sh' as an implementation language.  Beyond
defending that choice, let me say this: arch is a design and a set of
file and directory formats to go with that design.  It _invites_
reimplementation.  The other day, I played around with some of those
horrible shell tools and figured out that the meat of the sh scripts
in arch are just a bit over 20K LOC -- think you can rewrite that in
another language without too much cost?  There's another 10-15K LOC
which is nothing but printf(1) statements for `--help' options,
copyright comments, and boilerplate loops that read command line
options and assign their values to variables.

In other words, look beyond just any one implementation -- arch is a
set of concepts;  a set of interop standards just ripe to be written;
and a revision control system design that is simultaneously extremely
powerful, yet utterly trivial to implement or reimplement.   Forget
about the problem of tying Linux kernel development to a proprietary
tool -- arch can help you avoid tying to a single implementation
of a free revision control system.

	* Absence of sufficient merging ability, at least impression I
	got from the documentation. Merging on the *.rej files level I
	cannot call sufficient ;-).  Also, history is not preserved
	during merging, which is quite fatal.  And it looks to me at
	least from the documentation that arch is still in the
	update-before-commit stage.

You are partially misinformed.

Merge history in arch is preserved in excruciating detail.  That
history is used smartly in some very common cases (like a tree of
trusted lieutenants) to eliminate some of the most common sources of
merge conflicts.

Yes, when conflicts occur, arch currently represents these via the
".rej" mechanism.  Yes, that's low-level and, at least arguably, icky.
Yet, again, that's not a "deep" issue in the sense that changing that
behavior leaves unaffected "99%" of arch.   So, again, is arch the
right _starting point_ for displacing BK?


      * Absence of checkin/commit distinction. File revisions and
        changesets seem to be tied together, losing some of the cute
        flexibility BK has.

Yes, I've noted that from the lkml discussion.  My impression so far
is that layering that functionality over the existing core of arch is
straightforward. 

	I must have missed terribly something in the documentation
	given how arch is being recommended,

Larry, and, increasingly, some of the arch-users members, are revision
control experts.  Your complaints about arch express, no offense, a
fairly superficial (yet valid, yet easy to deal with) end-user
perspective.   I think that the recommendations come from the
perspective of "Hey, this is a decent foundation and if you don't
appreciate that, you're probably not qualified to do revision control
design",  while the complaints come from a perspective of "Ick,
there's about 2-dozen tweaks to the UI on this thing that I can't
possibly live without and I won't use your system unless you start to
accommodate those."

The two perspectives are compatible and complementary.  Thank you for
your feedback.

	But as I see it, most of the juicy stuff is missing (altough I
	really like the concept of configurations and especially the
	concept of caching --- mainly that you do not _have_ to pull
	all the stuff from the clonee repository, which can be a pain
	with more poor internet connection; 

That's a deep point -- thank you for noticing.

	then also if you aren't doing any that big changes and you're
	confident that the remote repository is going to stay there,
	it is less expensive to talk with the repository over network)

which arch does perfectly well.

	and the existing stuff is mostly in the form of shell scripts,
	which it has to leave and be rewritten sooner or later anyway.

Parts of it, sure.  All of it?  Well, I hope there are multiple
reimplementations of this tiny-yet-powerful system -- but I think the
sh-based version is more viable than some people do.

	The backend history format doesn't appear to be particularily
	great as well.

I can't respond to such a vague statement.   Details, if you please.


	Dunno. What's so special about arch then?

Superb design.  Tiny yet powerful implementation.  Unprecedented
features.   Based, deeply, on "what changesets mean" -- thus handily
adapts to a very wide range of usage scenarios.   Software tools.

Also, mostly outside of the scope of Linux kernel foo -- the design
considers "programming in the large".  In other words, it takes into
account the problem of managing a complete system, not just the
kernel, in a commercial context with competing but related
distributions.   Its "scope of concern" is much larger than just the
lkml crowd.


-t


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [arch-users] Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  1:44                                         ` Tom Lord
@ 2003-03-16  2:06                                           ` Adam Spiers
  2003-03-16  3:28                                             ` David Lang
  0 siblings, 1 reply; 155+ messages in thread
From: Adam Spiers @ 2003-03-16  2:06 UTC (permalink / raw)
  To: arch-users, linux-kernel

Tom Lord (lord@emf.net) wrote:
>   `+'-named log message files _are_ going to change to something more
>   vi-friendly.  My bad.  I'm both an emacs user and a
>   unix-traditionalist.  I didn't initially notice the problem and my
>   reaction on hearing about it was "Well, vi is broken" -- but as a
>   practical matter, arch does need to change in that area.

Not that you need any more prodding on this direction, but it's worth
noting that both more(1) and less(1) suffer from this problem too.

  $ touch +foo
  $ more +foo
  usage: more [-dflpcsu] [+linenum | +/pattern] name1 name2 ...
  $ less +foo
  Missing filename ("less --help" for help)
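
The usual workaround, independent of arch, is to spell the name so
that it no longer begins with `+', for example `./+foo'.  A minimal
sketch in Python of a wrapper doing that before handing a name to a
pager (the helper name `pager_safe' is invented for illustration):

```python
def pager_safe(name: str) -> str:
    """Prefix "./" so a leading "+" or "-" is not taken as an option."""
    if name.startswith(("+", "-")):
        return "./" + name
    return name


print(pager_safe("+foo"))
```

Names that already start with a directory component are left alone,
since they cannot be mistaken for a `+linenum' or `-flag' argument.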

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: [arch-users] Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  2:06                                           ` Adam Spiers
@ 2003-03-16  3:28                                             ` David Lang
  0 siblings, 0 replies; 155+ messages in thread
From: David Lang @ 2003-03-16  3:28 UTC (permalink / raw)
  To: Adam Spiers; +Cc: arch-users, linux-kernel

hey guys, the suggestion to move to another list for this discussion was
to reduce traffic on the kernel list, not add a bunch of arch discussions
to the bitkeeper discussions.

pick one list and use it, don't use both.

David Lang

 On Sun, 16 Mar 2003, Adam Spiers wrote:

> Date: Sun, 16 Mar 2003 02:06:17 +0000
> From: Adam Spiers <arch-users@adamspiers.org>
> To: arch-users@lists.fifthvision.net, linux-kernel@vger.kernel.org
> Subject: Re: [arch-users] Re: BitBucket: GPL-ed KitBeeper clone
>
> Tom Lord (lord@emf.net) wrote:
> >   `+'-named log message files _are_ going to change to something more
> >   vi-friendly.  My bad.  I'm both an emacs user and a
> >   unix-traditionalist.  I didn't initially notice the problem and my
> >   reaction on hearing about it was "Well, vi is broken" -- but as a
> >   practical matter, arch does need to change in that area.
>
> Not that you need any more prodding on this direction, but it's worth
> noting that both more(1) and less(1) suffer from this problem too.
>
>   $ touch +foo
>   $ more +foo
>   usage: more [-dflpcsu] [+linenum | +/pattern] name1 name2 ...
>   $ less +foo
>   Missing filename ("less --help" for help)

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-16  0:18                                       ` Petr Baudis
                                                           ` (2 preceding siblings ...)
  2003-03-16  1:44                                         ` Tom Lord
@ 2003-03-16  5:43                                         ` Robert Anderson
  2003-03-16 11:57                                         ` (Re: BitBucket: GPL-ed KitBeeper clone) Moving to arch-users Petr Baudis
  4 siblings, 0 replies; 155+ messages in thread
From: Robert Anderson @ 2003-03-16  5:43 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Daniel Phillips, lkml, arch

On Sat, 2003-03-15 at 16:18, Petr Baudis wrote:
> Dear diary, on Sat, Mar 15, 2003 at 10:53:34PM CET, I got a letter,
> where Robert Anderson <rwa@alumni.princeton.edu> told me, that...
> > On Sat, 2003-03-15 at 13:25, Daniel Phillips wrote:
> > > On Sat 15 Mar 03 17:21, Horst von Brand wrote:
> > > > Daniel Phillips <phillips@arcor.de> said:
> > > > > On Thu 13 Mar 03 01:52, Horst von Brand wrote:
> ..snip..
> > > > > Does anybody have a convenient mailing list for this design discussion?
> > > >
> > > > Good idea to move this off LKML
> > > 
> > > Yup, but nobody has offered one yet, so...
> > 
> > I think the arch-users@lists.fifthvision.net list would be happy to host
> > continuing discussion in this vein.  Considering Larry's repeated
> > attempts to get people to look at arch as a "better fit," it seems
> > particularly appropriate.
> > 
> > Of course, you'd have to tolerate "arch community" views on a lot of
> > these issues, but I suspect that might help focus the discussion.
> 
> I'm not sure if arch is the right thing to base on. Its concepts are surely
> interesting, however there are several problems (some of them may be
> subjective):

I, for one, was not necessarily interested in "basing on" arch.  I
think what the arch crowd would like to see is what kernel developers
are asking for, first, and then potentially to relate those needs to
arch.

But, let me address some of your points anyway:

> * Terrible interface. Work with arch involves much more typing out of long
> commands (and sequences of these), subcommands and parameters to get
> functionality equivalent to the one provided much simpler by other SCMs. I see
> it is in sake of genericity and sometimes more sophisticated usage scheme, but
> I fear it can be PITA in practice for daily work.

The commands are verbose, but they are verbose for the simple reason
that the command set for arch is very rich, and verbosity is somewhat
necessary to avoid ambiguity.  I would certainly recommend the use of
completion facilities to use the command set as it exists natively.  If
you are a bash user, try the bash completion code here:

rwa@alumni.princeton.edu--rwa-2003
      http://rwa.homelinux.net/{public-archives}/rwa-2003

Certainly robust completion would mitigate much of the "typing problem."

But, there is also an alternate solution to this which consists of
having aliased "short forms" of the commands.  Some work has also been
done in this area to provide complete, unambiguous, and easy to type
short forms.  Search the arch mailing list archive for "short forms" for
the discussion and results so far.
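
As a rough illustration of what "complete, unambiguous" short forms can mean, here is a small Python sketch that derives the shortest prefix of each command that no other command shares. This is only one possible scheme and not necessarily the one worked out on the arch list; apart from make-log, the command names in the example are made up.

```python
def short_forms(commands):
    """Map each command to its shortest prefix that no other command shares."""
    result = {}
    for cmd in commands:
        others = [c for c in commands if c != cmd]
        for length in range(1, len(cmd) + 1):
            prefix = cmd[:length]
            if not any(other.startswith(prefix) for other in others):
                result[cmd] = prefix
                break
        else:
            # cmd is itself a prefix of another command, so no shorter
            # form is unambiguous; fall back to the full name.
            result[cmd] = cmd
    return result


if __name__ == "__main__":
    # "make-log" appears in this thread; the other names are hypothetical.
    for cmd, short in short_forms(
        ["make-log", "make-archive", "commit", "changes"]
    ).items():
        print(f"{short:8} {cmd}")
```

Prefix-based forms have the drawback that adding a new command can lengthen existing short forms, which is one reason hand-picked aliases may be preferable.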

> * Awful revision names (just unique ids format). Again, it involves much more
> typing

There will always be a tension between clarity and terseness of names in
general.  arch tends to the side of clarity.   You seem to favor
terseness for reasons of typing effort.  That tension can be mitigated
in any number of ways; completion probably being the most pragmatic.

> and after some hours of work, the dashes will start to dance around and
> regroup at random places in front of your eyes.

I don't think I've ever seen a complaint about "dashes dancing around in
front of people's eyes" on the mailing list since its inception.  In
fact, I've started using the double-dash separator in a number of other
contexts since growing accustomed to it as a "hard break" in a name.
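
To make the "hard break" point concrete: because single dashes may appear inside a field, splitting on the double dash alone recovers the fields. A minimal Python sketch, assuming nothing about arch beyond that convention; the second example name is hypothetical and only follows a category--branch--version pattern.

```python
def split_name(name: str):
    """Split a name on the double-dash hard break.

    Single dashes are ordinary characters and stay inside their field.
    """
    return name.split("--")


if __name__ == "__main__":
    # Archive name taken from this message.
    print(split_name("rwa@alumni.princeton.edu--rwa-2003"))
    # Hypothetical category--branch--version name, for illustration only.
    print(split_name("hello-world--main-line--1.0"))
```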

> The concepts behind (like
> seamless division to multiple archives; I can't say I see sense in categories)

You can't see a sense in categories?  That statement is hard to fathom. 
Possibly you mean you don't see a sense in separate branch and version
qualifiers, and that's a more legitimate question in my view.

> are intriguing, but the result again doesn't seem very practical.

> * Evil directory naming. {arch} seems much more visible than CVS/ and SCCS/,

Well, {arch} does not litter every directory like CVS/ does.  It marks
the root of a project tree, and therefore it's actually a _nice_ thing
to have visible.  That's the point of giving it a noticeable name. 
There's nothing "evil" about it from my perspective.

> particularly as it gets sorted as last in a directory, thus you see it at the
> bottom of ls output. Also it's a PITA with bash, as the stuff starting by '='
> (arch likes to spawn that as well)

No, it doesn't.

> is. The files starting by '+' are problem
> for vi, which is kind of flaw when they are probably the only arch files
> dedicated for editing by user (they are supposed to contain log messages).

The output of make-log is now prefixed with an absolute path; the ++
does not cause a problem in that context anymore, even with vi, i.e.:

vi `larch make-log`

is fine now.

While I think both the ++ and = convention should be reconsidered as
users almost uniformly resist them when first getting used to arch, I
don't think this is much of a substantive problem.  It's just a
character or two, after all.

> * Cloud of shell scripts. It poses a lot of limitations which are pain to work
> around (including speed, two-fields version numbers [eek] and I can imagine
> several others; I'm not sure about these though, so I won't name further; you
> can possibly imagine something by yourself).

Out of curiosity, what is your favorite language that you would like to
see arch implemented in?  Some of the usual concerns regarding shell
scripts have been addressed on the wiki under "ArchMyths."

> * Absence of sufficient merging ability,

With all due respect, I think this reveals your level of familiarity
with arch to be, umm, not high.  I'm not aware of any revision control
system that has the depth of capability that arch has with respect to
merging.  BitKeeper may; I'm not a BitKeeper expert, but I'm pretty sure
nothing else comes close.

> at least impression I got from the
> documentation. Merging on the *.rej files level I cannot call sufficient ;-).

I think you've misunderstood something fairly basic.

> Also, history is not preserved during merging, which is quite fatal.

Which documentation were you reading again?  I'm not being facetious;
there are several versions of various forms of documentation still around,
and I'd like to know which one gave you that impression.

>  And it
> looks to me at least from the documentation that arch is still in the
> update-before-commit stage.

I'm not sure what the "update-before-commit stage" is.  Can you clarify?

> * Absence of checkin/commit distinction. File revisions and changesets seem to
> be tied together, losing some of the cute flexibility BK has.

I'm not aware of any such thing as a "file revision" in arch.  Perhaps
you could expand on what that is and why you think you need such a
thing.

> I must have missed terribly something in the documentation

Yes, I believe you did.

> given how arch is
> being recommended, please feel encouraged to correct me. But as I see it, most
> of the juicy stuff is missing

Let's start with the above problems with your first reading of the docs,
then we'll move onto the "juicy stuff."

> (although I really like the concept of
> configurations and especially the concept of caching --- mainly that you do not
> _have_ to pull all the stuff from the clonee repository, which can be a pain
> with more poor internet connection; then also if you aren't doing any that big
> changes and you're confident that the remote repository is going to stay there,
> it is less expensive to talk with the repository over network)

Signs of life... :)

> and the existing
> stuff is mostly in the form of shell scripts, which it has to leave and be
> rewritten sooner or later anyway.

Most of us would probably like to see that, but I think "has to" is
debatable.

>  The backend history format doesn't appear to
> be particularly great as well. Dunno. What's so special about arch then?

Let's talk about what kernel developers think they need, then we can
frame "what is so special about arch" in terms of that.  I think that's
a reasonable way to frame the discussion.

Bob

^ permalink raw reply	[flat|nested] 155+ messages in thread

* (Re: BitBucket: GPL-ed KitBeeper clone) Moving to arch-users
  2003-03-16  0:18                                       ` Petr Baudis
                                                           ` (3 preceding siblings ...)
  2003-03-16  5:43                                         ` Robert Anderson
@ 2003-03-16 11:57                                         ` Petr Baudis
  4 siblings, 0 replies; 155+ messages in thread
From: Petr Baudis @ 2003-03-16 11:57 UTC (permalink / raw)
  To: Robert Anderson, Daniel Phillips, lkml, arch

Dear diary, on Sun, Mar 16, 2003 at 01:18:40AM CET, I got a letter,
where Petr Baudis <pasky@ucw.cz> told me, that...
> Dear diary, on Sat, Mar 15, 2003 at 10:53:34PM CET, I got a letter,
> where Robert Anderson <rwa@alumni.princeton.edu> told me, that...
> > On Sat, 2003-03-15 at 13:25, Daniel Phillips wrote:
> > > On Sat 15 Mar 03 17:21, Horst von Brand wrote:
> > > > Daniel Phillips <phillips@arcor.de> said:
> > > > > On Thu 13 Mar 03 01:52, Horst von Brand wrote:
> ..snip..
> > > > > Does anybody have a convenient mailing list for this design discussion?
> > > >
> > > > Good idea to move this off LKML
> > > 
> > > Yup, but nobody has offered one yet, so...
> > 
> > I think the arch-users@lists.fifthvision.net list would be happy to host
> > continuing discussion in this vein.  Considering Larry's repeated
> > attempts to get people to look at arch as a "better fit," it seems
> > particularly appropriate.
> > 
> > Of course, you'd have to tolerate "arch community" views on a lot of
> > these issues, but I suspect that might help focus the discussion.
> 
> I'm not sure if arch is the right thing to base on. Its concepts are surely
> interesting, however there are several problems (some of them may be
> subjective):
..rant..

Ok, with the perspective of a few hours I think it's a good idea to really move
this discussion to arch-users, although the resulting SCM may not
necessarily be arch. I will strip lkml from the recipients list in my further
mails replying to this sub-thread and I would like to ask the others to do the
same.

Kind regards,

-- 
 
				Petr "Pasky" Baudis
.
The pure and simple truth is rarely pure and never simple.
		-- Oscar Wilde
.
Stuff: http://pasky.ji.cz/

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-11 18:40                       ` Zack Brown
  2003-03-11 18:46                         ` Martin J. Bligh
  2003-03-12  3:47                         ` Horst von Brand
@ 2003-03-14 11:34                         ` Pavel Machek
  2 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-14 11:34 UTC (permalink / raw)
  To: Zack Brown; +Cc: Daniel Phillips, linux-kernel

Hi!

> I remember that discussion. It was pretty interesting, but some
> conflicting ideas about what should be done; and not much organization
> to it all.
> 
> I've taken a lot of stuff from that wish list, combined it with what I gathered
> from Larry's earlier post, and from Petr Baudis' recent post, and elsewhere,
> and organized it into something that might be interesting. If anyone would
> like to host this document on the web, please let me know.

I'd like to host it in bitbucket CVS. If you
have sf account, I'll just add you as a
developer.

>     2.1 Tagging
> 
> It must be trivial for a developer to tag a file as part of a given
> changeset.
> 
> It must be possible to reorganize changesets, so that a given changeset may
> be split up into more manageable pieces.
> 

What does this have to do with tagging?

>   3. Problems For Clarification
> 
> If a file is tagged as being part of two different changesets, then changes
> to that file should be associated with which changeset???
> 

Perhaps tagging should be explained?
I thought that tagging is assigning
symbolic name to some release?

			Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-09  2:45                   ` Zack Brown
                                       ` (3 preceding siblings ...)
  2003-03-10 23:03                     ` Daniel Phillips
@ 2003-03-12 23:38                     ` Pavel Machek
  4 siblings, 0 replies; 155+ messages in thread
From: Pavel Machek @ 2003-03-12 23:38 UTC (permalink / raw)
  To: Zack Brown; +Cc: Larry McVoy, Linus Torvalds, linux-kernel

Hi!

> > [Long rant, summary: it's harder than you think, read on for the details]
> [skipping long description]
> 
> OK, so here is my distillation of Larry's post.
> 
>   Basic summary: a distributed, replicated, version controlled user level file
>   system with no limits on any of the file system events which may happen
>   in parallel. All changes must be put correctly back together, no matter how
>   much parallelism there has been.
> 
>   * Merging.
> 
>   * The graph structure.
> 
>   * Distributed rename handling. Centralized systems like Subversion don't
>   have as many problems with this because you can only create one file in
>   one directory entry because there is only one directory entry available.
>   In distributed rename handling, there can be an infinite number of different
>   files which all want to be src/foo.c. There are also many rename corner-cases.
> 
>   * Symbolic tags. This is adding a symbolic label on a revision. A distributed
>   system must handle the fact that the same symbol can be put on multiple
>   revisions. This is a variation of file renaming. One important thing to
>   consider is that time can go forward or backward.
> 
>   * Security semantics. Where should they go? How can they be integrated
>   into the system? How are hostile users handled when there is no central
>   server to lock down?
> 
>   * Time semantics. A distributed system cannot depend on reported time
>   being correct. It can go forward or backward at any rate.
> 
> I'd be willing to maintain this as the beginning of a feature list and
> post it regularly to lkml if enough people feel it would be useful and not
> annoying. The goal would be to identify the features/problems that would

Actually, check it in bitbucket's repository
on sf.net; it should not be annoying
there.
(He he "send it to the bitbucket" :-)
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...
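
The time-semantics item in the list quoted above can be illustrated with a small Python sketch: ordering changesets by their parent links instead of by reported timestamps. The changeset ids, parents and timestamps are made up for the example, and this is only a sketch of the idea, not how BitKeeper or any other tool actually does it. It assumes Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter

# Hypothetical changesets: id -> (parent ids, reported timestamp).
changesets = {
    "c1": ((), 1000),
    "c2": (("c1",), 900),   # committed on a machine whose clock runs behind
    "c3": (("c2",), 950),
}


def order_by_timestamp(csets):
    """Naive ordering that trusts whatever time each machine reported."""
    return sorted(csets, key=lambda cid: csets[cid][1])


def order_by_ancestry(csets):
    """Ordering that puts every changeset after all of its parents."""
    graph = {cid: set(parents) for cid, (parents, _ts) in csets.items()}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print("by timestamp:", order_by_timestamp(changesets))
    print("by ancestry: ", order_by_ancestry(changesets))
```

Sorting by timestamp puts c2 and c3 before their own ancestor c1; following the parent links does not depend on any clock being right.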


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 19:08           ` Pavel Machek
  2003-03-07 19:25             ` Eli Carter
  2003-03-07 23:16             ` Linus Torvalds
@ 2003-03-09  2:06             ` Horst von Brand
       [not found]             ` <b4b98v_14m_1@penguin.transmeta.com>
  3 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-09  2:06 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Olivier Galibert, linux-kernel

Pavel Machek <pavel@suse.cz> said:

[...]

> So, basically, if branch was killed and recreated after each merge
> from mainline, problem would be solved, right?

Who is branch, who is mainline? The branch owner _will_ be pissed off if
his head version changes each time he synchronizes. What if mainline dies,
and the official line moves to one of the branches? What happens when there
aren't just two, but a dozen developers swizzling individual csets from
each other (not necessarily just resyncing with each other)? If said
developers also apply random patches from a common mailing list?

This is _much_ harder than it looks on the surface.
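
A small Python sketch of one reason why: as soon as two developers merge from each other, there is no longer a single common ancestor to base the next merge on. The history below is hypothetical, and the code is only an illustration of the problem, not any tool's actual algorithm.

```python
# Hypothetical history: changeset -> list of parent changesets.
parents = {
    "A": [],
    "B": ["A"],        # developer 1's work
    "C": ["A"],        # developer 2's work
    "D": ["B", "C"],   # developer 1 merges developer 2
    "E": ["C", "B"],   # developer 2 merges developer 1
}


def ancestors(node):
    """All changesets reachable from node via parent links, including node."""
    seen, stack = set(), [node]
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.add(cur)
            stack.extend(parents[cur])
    return seen


def merge_bases(x, y):
    """Common ancestors of x and y that are not ancestors of another one."""
    common = ancestors(x) & ancestors(y)
    return {
        c for c in common
        if not any(c != d and c in ancestors(d) for d in common)
    }


if __name__ == "__main__":
    print(sorted(merge_bases("B", "C")))  # simple fork: one base
    print(sorted(merge_bases("D", "E")))  # criss-cross: two bases
```

For the simple fork the only base is A; after the criss-cross both B and C qualify, and a tool that records only "the branch point" has nothing sensible to pick.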
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

[parent not found: <b4b98v_14m_1@penguin.transmeta.com>]

* Re: BitBucket: GPL-ed KitBeeper clone
       [not found]             ` <b4b98v_14m_1@penguin.transmeta.com>
@ 2003-03-12 23:23               ` Pavel Machek
  2003-03-13 21:15                 ` Horst von Brand
  0 siblings, 1 reply; 155+ messages in thread
From: Pavel Machek @ 2003-03-12 23:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

Hi!

> >So, basically, if branch was killed and recreated after each merge
> >from mainline, problem would be solved, right?
> 
> Wrong.
> 
> Now think three trees.  Each merging back and forth between each other. 

> Or, in the case of something like the Linux kernel tree, where you don't
> have two or three trees.  You've got at least 20 actively developed
> concurrent trees with branches at different points. 

Yep, but only three trees are interesting
for me (yours and Andi's) and most
developers only care about your tree,
so simplifications should be possible.

> Trust me. CVS simply CANNOT do this. You need the full information.
> 
> Give it up.  BitKeeper is simply superior to CVS/SVN, and will stay that
> way indefinitely since most people don't seem to even understand _why_
> it is superior. 

Actually by now I understand that prcs,
not cvs is the closest thing... Actually
its .prj file looks similar to bk's ChangeSet
file.
				Pavel
-- 
				Pavel
Written on sharp zaurus, because my Velo1 broke. If you have Velo you don't need...


^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-12 23:23               ` Pavel Machek
@ 2003-03-13 21:15                 ` Horst von Brand
  0 siblings, 0 replies; 155+ messages in thread
From: Horst von Brand @ 2003-03-13 21:15 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Linux Kernel Mailing List

Pavel Machek <pavel@ucw.cz> said:

[...]

> > Or, in the case of something like the Linux kernel tree, where you don't
> > have two or three trees.  You've got at least 20 actively developed
> > concurrent trees with branches at different points. 

> Yep, but only three trees are interesting
> for me (yours and Andi's) and most
> developers only care about your tree,
> so simplifications should be possible.

A purely local view isn't enough, I fear. Each of the 3 trees has its own 3
neighbors, ...
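
To put a rough picture on that, here is a Python sketch that walks a made-up "who merges with whom" graph and lists the trees whose changesets can show up in yours after a given number of merge hops. The tree names and links are purely hypothetical.

```python
from collections import deque

# Hypothetical merge relationships between trees; all names are made up.
neighbors = {
    "mine": ["t1", "t2"],
    "t1": ["mine", "t3", "t4"],
    "t2": ["mine", "t5"],
    "t3": ["t1"],
    "t4": ["t1"],
    "t5": ["t2"],
}


def trees_within(start, hops):
    """Trees reachable from start in at most `hops` merge steps (BFS)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if dist[cur] == hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return set(dist)


if __name__ == "__main__":
    for hops in (1, 2):
        print(hops, sorted(trees_within("mine", hops)))
```

Even though "mine" only merges with two trees directly, after two hops every tree in the example has contributed changesets that have to be merged correctly.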
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-07 12:12     ` Olivier Galibert
  2003-03-07 12:32       ` Pavel Machek
@ 2003-03-08  0:18       ` Olaf Dietsche
  1 sibling, 0 replies; 155+ messages in thread
From: Olaf Dietsche @ 2003-03-08  0:18 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Pavel Machek, linux-kernel

Olivier Galibert <galibert@pobox.com> writes:

> sometimes interesting.  Nice systems, like PRCS and bk, first commit
> to a new branch (no update necessary obviously) then merge in the
> mainline.  As a side effect, they are Good with branches.  Bk's main
> quality over PRCS is the distribution.  This lack is what makes PRCS
> essentially unusable for serious open source projects.  Otherwise
> they're semantically the same.

So, what you say is: add distribution to PRCS and you're done?

Regards, Olaf.

^ permalink raw reply	[flat|nested] 155+ messages in thread

* Re: BitBucket: GPL-ed KitBeeper clone
  2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
                   ` (3 preceding siblings ...)
  2003-03-02  1:26 ` Olivier Galibert
@ 2003-03-02  1:37 ` Filip Van Raemdonck
  4 siblings, 0 replies; 155+ messages in thread
From: Filip Van Raemdonck @ 2003-03-02  1:37 UTC (permalink / raw)
  To: linux-kernel

On Sat, Mar 01, 2003 at 04:11:55PM -0800, Adam J. Richter wrote:
> Pavel Machek wrote:
> > I've created little project for read-only (for now ;-) kitbeeper
> > clone. It is available at www.sf.net/projects/bitbucket (no tar balls,
> > just get it fresh from CVS).
> 
> 	Thank you for taking some initiative and improving this
> situation by constructive means.  You are an example to us all,
> as is Andrea Arcangeli with his openbkweb project, which you
> will probably want to examine and perhaps integrate
> (ftp://ftp.kernel.org/pub/linux/kernel/people/andrea/openbkweb).

I've said this (indirectly) before, and I'll say it again:
BitBucket, and you, are missing the point here. Openbkweb isn't.
Before one can use bitbucket there still has to be a bkbits mirror first,
which incidentally may be true for the main linux kernel trees but isn't
for other projects developed with the help of bitkeeper.

I've also said this before, and I'll also repeat this again:
While politics & philosophy are my main reasons not to use bitkeeper, I
also am not bothered enough by other issues to use it plain and simple.
Nor to use openbkweb instead. And I'm not going to tell other people what
they should do.

However, until we have a tool (as openbkweb tries to be, although very
inefficiently) which can extract patches from the "main" openlogging
bitkeeper repositories, the schism remains between developers who use BK
and those who cannot use it - be it for political or real legal (i.e.
license violation, because of involvement in another SCM) reasons.

> bitkbucket currently uses rsync to update data from the
> repository.
(...)
> 	I think the suggestion made by Pavel Janik that it would
> be better to work on adding BitKeeper-like functionality to existing
> free software packages is a bit misdirected.  BitKeeper uses SCCS
> format, and we have a GPL'ed SCCS clone ("cssc"), so you are
> adding functionality to existing free software version control
> code anyhow.

Not until you can use that functionality to access the main BK
repositories directly. When you're still accessing mirrors of it, as in
the rsync case, you are - pragmatically speaking - no better off than when
not accessing it at all.

Regards,

Filip

-- 
"To me it sounds like Cowpland just doesn't know what the hell he is talking
 about.  That's to be expected: he's CEO, isn't he?"
	-- John Hasler

^ permalink raw reply	[flat|nested] 155+ messages in thread

end of thread, other threads:[~2003-04-07 21:10 UTC | newest]

Thread overview: 155+ messages
2003-03-02  0:11 BitBucket: GPL-ed KitBeeper clone Adam J. Richter
2003-03-02  0:20 ` Larry McVoy
2003-03-02  0:20 ` David Lang
2003-03-02  0:49 ` Arador
2003-03-02  1:03   ` Jeff Garzik
2003-03-02  2:15   ` Alan Cox
2003-03-02  1:19     ` Jeff Garzik
2003-03-02  1:40       ` BitBucket: GPL-ed *notrademarkhere* clone Andrea Arcangeli
2003-03-02  1:45         ` Jeff Garzik
2003-03-02  2:09           ` Andrea Arcangeli
2003-03-02 17:28             ` Jeff Garzik
2003-03-02 18:16               ` Andrea Arcangeli
2003-03-02 20:12                 ` Jeff Garzik
2003-03-02 21:49                   ` Geert Uytterhoeven
2003-03-03 18:37                 ` Larry McVoy
2003-03-03 18:46                   ` Larry McVoy
2003-03-03 22:57                   ` Andrea Arcangeli
2003-03-03 23:14                     ` Pavel Machek
2003-03-03 23:56                     ` David Lang
2003-03-04  0:02                       ` Jeff Garzik
2003-03-04  0:05                         ` Larry McVoy
2003-03-04  0:15                         ` Andrea Arcangeli
2003-03-04  0:30                           ` Jeff Garzik
2003-03-04  2:20                       ` Martin J. Bligh
2003-03-04  5:29                         ` Linus Torvalds
2003-03-04  5:56                           ` Dimitrie O. Paun
2003-03-04 14:51                             ` Jeff Garzik
2003-03-02  3:29           ` H. Peter Anvin
2003-03-02 17:12             ` Jeff Garzik
2003-03-02 18:39               ` H. Peter Anvin
2003-03-02 20:01                 ` Jeff Garzik
2003-03-03  0:47               ` nickn
2003-03-03  0:55                 ` David Lang
2003-03-03  2:31                   ` Jeff Garzik
2003-03-03  2:32                 ` Jeff Garzik
2003-03-04  1:07                   ` Horst von Brand
2003-03-04  1:10                     ` H. Peter Anvin
2003-03-03 21:53               ` Joel Becker
2003-03-04 23:37                 ` Olaf Hering
2003-03-06 16:47                 ` Pavel Machek
2003-03-06 16:41               ` Pavel Machek
2003-03-07 11:24                 ` Tupshin Harper
2003-03-07 11:28                   ` Pavel Machek
2003-03-07 21:53                 ` H. Peter Anvin
2003-03-08 23:18                   ` Daniel Phillips
2003-03-03  0:13           ` Pavel Machek
2003-03-03  0:10       ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
2003-03-04 16:16         ` David Woodhouse
2003-03-04 16:27           ` Pavel Machek
2003-03-02  1:26 ` Olivier Galibert
2003-03-06 16:18   ` Pavel Machek
2003-03-07 12:12     ` Olivier Galibert
2003-03-07 12:32       ` Pavel Machek
2003-03-07 16:54         ` Olivier Galibert
2003-03-07 17:14           ` Geert Uytterhoeven
2003-03-07 19:08           ` Pavel Machek
2003-03-07 19:25             ` Eli Carter
2003-03-07 20:29               ` Pavel Machek
2003-03-07 23:16             ` Linus Torvalds
2003-03-08 22:52               ` Zack Brown
2003-03-09  0:05                 ` Larry McVoy
2003-03-09  1:21                   ` Davide Libenzi
2003-03-09  2:45                   ` Zack Brown
2003-03-09  3:19                     ` Roman Zippel
2003-03-09  3:42                       ` Linus Torvalds
2003-03-09  4:32                         ` Roman Zippel
2003-03-09 13:34                           ` Eric W. Biederman
2003-03-09 15:35                             ` Roman Zippel
2003-03-09 16:55                               ` Martin J. Bligh
2003-03-09 17:20                                 ` Zack Brown
2003-03-09 17:48                                   ` Martin J. Bligh
2003-03-09 19:58                                   ` Larry McVoy
2003-03-09 21:32                                     ` Zack Brown
2003-03-09 21:54                                       ` Valdis.Kletnieks
2003-03-09 23:28                                         ` Larry McVoy
2003-03-13 20:00                                     ` Pavel Machek
2003-03-09 17:39                                 ` Linus Torvalds
2003-03-09 17:58                                   ` Martin J. Bligh
2003-03-09 18:20                                   ` Larry McVoy
2003-03-09 23:19                                     ` fs
2003-03-13  0:41                                     ` Pavel Machek
2003-03-13 21:21                                       ` Horst von Brand
2003-03-09 20:01                                   ` Roman Zippel
2003-03-13  0:13                             ` Pavel Machek
2003-03-09 14:49                         ` Olivier Galibert
2003-03-13  0:05                         ` Pavel Machek
2003-03-10  0:02                     ` Thoughts about ideal kernel SCM Petr Baudis
2003-03-10  0:32                       ` Larry McVoy
2003-03-12 19:29                         ` Petr Baudis
2003-03-13 10:36                       ` Pavel Machek
2003-03-14 22:56                         ` Petr Baudis
2003-03-17 20:59                       ` Petr Baudis
2003-03-10  3:41                     ` BitBucket: GPL-ed KitBeeper clone Horst von Brand
2003-03-10 13:52                       ` Jamie Lokier
2003-03-10 23:03                     ` Daniel Phillips
2003-03-11 18:40                       ` Zack Brown
2003-03-11 18:46                         ` Martin J. Bligh
2003-03-11 19:30                           ` Daniel Phillips
2003-03-11 19:33                             ` Martin J. Bligh
2003-03-11 20:08                               ` Andrew Morton
2003-03-11 20:29                                 ` Martin J. Bligh
2003-03-12  6:14                             ` Werner Almesberger
2003-03-13  2:48                               ` Daniel Phillips
2003-03-13  3:11                                 ` Werner Almesberger
2003-03-14 12:29                             ` Pavel Machek
2003-03-15 20:53                               ` Martin J. Bligh
2003-03-15 21:26                               ` Daniel Phillips
2003-03-15 21:32                               ` Petr Baudis
2003-03-15 23:39                                 ` Petr Baudis
2003-03-16  0:39                               ` Horst von Brand
2003-04-07 21:22                               ` Petr Baudis
2003-03-12  3:47                         ` Horst von Brand
2003-03-12  4:03                           ` Larry McVoy
2003-03-12  4:49                             ` [PATCH] ~/kernel/sys.c (2.5.64) (trivial) Jay Patrick Howard
2003-03-12  5:22                           ` BitBucket: GPL-ed KitBeeper clone Zack Brown
2003-03-12  5:44                             ` Horst von Brand
2003-03-12 13:48                               ` Daniel Phillips
2003-03-13  1:03                                 ` Horst von Brand
2003-03-13 16:53                                   ` Daniel Phillips
2003-03-15 15:02                                     ` Horst von Brand
2003-03-15 21:25                                       ` Daniel Phillips
2003-03-12  6:19                             ` Werner Almesberger
2003-03-13  1:31                               ` Horst von Brand
2003-03-12 15:32                             ` Horst von Brand
2003-03-12 16:13                               ` Daniel Phillips
2003-03-12 20:37                                 ` Horst von Brand
2003-03-12 20:54                                   ` H. Peter Anvin
2003-03-13  2:00                                   ` Daniel Phillips
2003-03-15  1:03                                     ` Horst von Brand
2003-03-12 13:22                           ` Daniel Phillips
2003-03-13  0:52                             ` Horst von Brand
2003-03-13 17:00                               ` Daniel Phillips
2003-03-13 21:48                                 ` Zack Brown
2003-03-13 22:04                                   ` Daniel Phillips
2003-03-15 16:21                                 ` Horst von Brand
2003-03-15 21:25                                   ` Daniel Phillips
2003-03-15 21:53                                     ` Robert Anderson
2003-03-15 21:50                                       ` Randy.Dunlap
2003-03-15 22:16                                         ` Robert Anderson
2003-03-15 22:18                                         ` Robert Anderson
2003-03-16  0:18                                       ` Petr Baudis
2003-03-16  0:53                                         ` Davide Libenzi
2003-03-16  0:55                                         ` [arch-users] " Stig Brautaset
2003-03-16  1:44                                         ` Tom Lord
2003-03-16  2:06                                           ` Adam Spiers
2003-03-16  3:28                                             ` David Lang
2003-03-16  5:43                                         ` Robert Anderson
2003-03-16 11:57                                         ` (Re: BitBucket: GPL-ed KitBeeper clone) Moving to arch-users Petr Baudis
2003-03-14 11:34                         ` BitBucket: GPL-ed KitBeeper clone Pavel Machek
2003-03-12 23:38                     ` Pavel Machek
2003-03-09  2:06             ` Horst von Brand
     [not found]             ` <b4b98v_14m_1@penguin.transmeta.com>
2003-03-12 23:23               ` Pavel Machek
2003-03-13 21:15                 ` Horst von Brand
2003-03-08  0:18       ` Olaf Dietsche
2003-03-02  1:37 ` Filip Van Raemdonck
