git.vger.kernel.org archive mirror
* VCS comparison table
@ 2006-10-14 15:07 Jon Smirl
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 20:20 ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Jon Smirl @ 2006-10-14 15:07 UTC (permalink / raw)
  To: Git Mailing List

I was reading Brendan's blog post about Mozilla 2
http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html

It refers to this comparison chart between source control systems.
http://bazaar-vcs.org/RcsComparisons

Does it accurately reflect the current status of git? Is their
assessment of git's rename capability correct?

They want changes via IRC. "Please discuss changes to this table on
the freenode IRC network channel #bzr, or on the mailing list. The
terms used in the table have precise meanings, and not all VCS's use
the same term in the same way - which means that some translation is
needed to fill it in properly."

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: VCS comparison table
  2006-10-14 15:07 VCS comparison table Jon Smirl
@ 2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
                     ` (2 more replies)
  2006-10-14 20:20 ` Jakub Narebski
  1 sibling, 3 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-14 16:40 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jon Smirl wrote:

> It refers to this comparison chart between source control systems.
> http://bazaar-vcs.org/RcsComparisons

It is quite obvious that comparison of programs of given type (SCM)
on some program site (Bazaar-NG) is usually biased towards said program,
perhaps unconsciously: by emphasizing the features which were important
for developers of said program.
 
> Does it accurately reflect the current status of git? Is their
> assessment of git's rename capability correct?

For example simple namespace for git: you can use shortened sha1
(even to only 6 characters, although usually 8 are used), you can
use tags, you can use ref^m~n syntax.
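
As an illustration of the ref^m~n notation, here is a minimal Python
sketch that resolves such expressions against a toy history and expands
an abbreviated id. The commit names, the PARENTS table and the function
names are made up for this example; real git works on SHA-1 object names
and understands many more forms than the two handled here.

```python
import re

# Toy history (hypothetical names): commit -> parents, first parent first.
PARENTS = {
    "M": ["C", "X"],  # merge commit with two parents
    "C": ["B"],
    "B": ["A"],
    "X": ["W"],
    "W": ["A"],
    "A": [],          # root commit
}

STEP = re.compile(r"([\^~])(\d*)")


def resolve(expr, parents=PARENTS):
    """Resolve a 'name^m~n' style expression to a commit in the toy graph."""
    base = re.match(r"[^\^~]+", expr)
    if base is None:
        raise ValueError("missing commit name: %r" % expr)
    commit = base.group(0)
    pos = base.end()
    while pos < len(expr):
        step = STEP.match(expr, pos)
        if step is None:
            raise ValueError("cannot parse: %r" % expr[pos:])
        op, digits = step.groups()
        n = int(digits) if digits else 1
        if op == "^" and n > 0:
            commit = parents[commit][n - 1]   # ^m: the m-th parent
        elif op == "~":
            for _ in range(n):                # ~n: first parent, n times
                commit = parents[commit][0]
        pos = step.end()
    return commit


def expand(prefix, ids):
    """Expand an abbreviated id; it must match exactly one known id."""
    matches = [i for i in ids if i.startswith(prefix)]
    if len(matches) != 1:
        raise ValueError("%r matches %d ids" % (prefix, len(matches)))
    return matches[0]
```

With this table, resolve("M^2~1") goes to the second parent of the merge
M and then one first-parent step further back.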

I'm not sure about "No" in "Supports Repository". Git supports multiple
branches in one repository, and what's better supports development using
multiple branches, but cannot for example do a diff or a cherry-pick
between repositories (well, you can use git-format-patch/git-am to
cherry-pick changes between repositories...).

About "checkouts", i.e. working directories with repository elsewhere:
you can use GIT_DIR environmental variable or "git --git-dir" option,
or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
"symref"-like file to point to repository passes, we can use that.
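
The lookup order described above can be sketched roughly as follows.
This is a simplified model in Python, not git's actual code:
find_git_dir and its parameters are hypothetical names, and real git
performs additional checks when discovering a repository.

```python
import os


def find_git_dir(start, environ=None, git_dir_option=None):
    """Return the repository directory this model would use, or None."""
    environ = os.environ if environ is None else environ
    if git_dir_option:                 # "git --git-dir=<path>" wins
        return git_dir_option
    if environ.get("GIT_DIR"):         # then the GIT_DIR variable
        return environ["GIT_DIR"]
    path = os.path.abspath(start)
    while True:                        # otherwise walk up looking for .git
        candidate = os.path.join(path, ".git")
        if os.path.isdir(candidate):
            return candidate
        parent = os.path.dirname(path)
        if parent == path:             # reached the filesystem root
            return None
        path = parent
```

The point is only that the working directory and the repository need
not live in the same place once GIT_DIR or --git-dir is given.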

Partial checkouts are only partially supported as of now; it means
you have to do some low level stuff to do partial checkout, and be
careful when committing. BTW it depends what you mean by partial
checkout, but they are somewhat incompatible with atomic commits
to snapshot based repository.

Git supports renames in its own way; it doesn't use file ids, nor
remember renames (the new "note" header for use e.g. by porcelains 
didn't pass if I remember correctly). But it does *detect* moving
_contents_, and even *copying* _contents_ when requested. And of
course it detects renames in merges.
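
To show what "detecting" a rename from content alone can look like, here
is a rough Python sketch. It pairs deleted and added paths whose contents
are similar enough. It uses difflib from the Python standard library as a
stand-in for git's own similarity measure, and the 0.5 threshold and the
function name are assumptions of this example.

```python
from difflib import SequenceMatcher


def detect_renames(old_tree, new_tree, threshold=0.5):
    """Pair deleted and added paths whose contents look alike.

    Trees are dicts mapping path -> file content (as strings).
    Returns a list of (old_path, new_path, similarity) tuples.
    """
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}
    renames = []
    for old_path, old_content in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_content in sorted(added.items()):
            score = SequenceMatcher(None, old_content, new_content).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path, round(best_score, 2)))
            del added[best_path]       # each added file is used only once
    return renames
```

Nothing is recorded at commit time in this model; the pairing is computed
afterwards from two snapshots, which is the idea behind detecting rather
than remembering renames.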

Git doesn't have some "plugin framework", but because it has many
"plumbing" commands, it is easy to add new commands, and also new
merge strategies, using shell scripts, Perl, Python and of course C.
So the answer would be "Somewhat", as git has pluggable merge strategies,
or even "Yes" as it is easy to add new git command.
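
The reason adding a command is easy is that "git foo" ends up running a
program called "git-foo". Below is a simplified Python model of that
lookup, considering only the directories in PATH. The function name is
hypothetical, and git additionally looks in its own exec directory,
which this sketch leaves out.

```python
import os


def find_external_command(name, path=None):
    """Return the full path of the 'git-<name>' executable, or None."""
    path = os.environ.get("PATH", "") if path is None else path
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory or ".", "git-" + name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

So a shell, Perl or Python script saved as git-hello somewhere on PATH
and marked executable is all it takes for "git hello" to work.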

> They want changes via IRC. "Please discuss changes to this table on
> the freenode IRC network channel #bzr, or on the mailing list."

Gaah, subscribe-to-post mailing list!
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
@ 2006-10-14 17:18   ` Jon Smirl
  2006-10-14 17:42     ` Jakub Narebski
  2006-10-16  3:53   ` Martin Pool
  2006-10-16 22:26   ` Aaron Bentley
  2 siblings, 1 reply; 806+ messages in thread
From: Jon Smirl @ 2006-10-14 17:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
>
> > It refers to this comparison chart between source control systems.
> > http://bazaar-vcs.org/RcsComparisons
>
> It is quite obvious that comparison of programs of given type (SCM)
> on some program site (Bazaar-NG) is usually biased towards said program,
> perhaps unconsciously: by emphasizing the features which were important
> for developers of said program.
>
> > Does it accurately reflect the current status of git? Is their
> > assessment of git's rename capability correct?
>
> For example simple namespace for git: you can use shortened sha1
> (even to only 6 characters, although usually 8 are used), you can
> use tags, you can use ref^m~n syntax.
>
> I'm not sure about "No" in "Supports Repository". Git supports multiple
> branches in one repository, and what's better supports development using
> multiple branches, but cannot for example do a diff or a cherry-pick
> between repositories (well, you can use git-format-patch/git-am to
> cherry-pick changes between repositories...).
>
> About "checkouts", i.e. working directories with repository elsewhere:
> you can use GIT_DIR environmental variable or "git --git-dir" option,
> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> "symref"-like file to point to repository passes, we can use that.

I believe they mean checking out only the latest few revisions instead
of copying the whole repo. This issue is a problem for Mozilla. If you
want to change a line in the git version you have to download the
entire 500MB tree with full history.

>
> Partial checkouts are only partially supported as of now; it means
> you have to do some low level stuff to do partial checkout, and be
> careful when committing. BTW it depends what you mean by partial
> checkout, but they are somewhat incompatible with atomic commits
> to snapshot based repository.

I believe partial checkout means being able to check one directory
tree out of the repo and work on it while ignoring what is happening
in the rest of the repo. This is another issue for Mozilla which has
multiple dependent projects checked into a single repo.

>
> Git supports renames in its own way; it doesn't use file ids, nor
> remember renames (the new "note" header for use e.g. by porcelains
> didn't pass if I remember correctly). But it does *detect* moving
> _contents_, and even *copying* _contents_ when requested. And of
> course it detects renames in merges.
>
> Git doesn't have some "plugin framework", but because it has many
> "plumbing" commands, it is easy to add new commands, and also new
> merge strategies, using shell scripts, Perl, Python and of course C.
> So the answer would be "Somewhat", as git has pluggable merge strategies,
> or even "Yes" as it is easy to add new git command.
>
> > They want changes via IRC. "Please discuss changes to this table on
> > the freenode IRC network channel #bzr, or on the mailing list."
>
> Gaah, subscribe-to-post mailing list!

It is annoying, but subscribe with the no delivery option.

> --
> Jakub Narebski
> Warsaw, Poland
> ShadeHawk on #git
>
>
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: VCS comparison table
  2006-10-14 17:18   ` Jon Smirl
@ 2006-10-14 17:42     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-14 17:42 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

Jon Smirl wrote:
>> About "checkouts", i.e. working directories with repository
>> elsewhere: you can use GIT_DIR environmental variable or "git
>> --git-dir" option, or symlinks, and if Nguyen Thai Ngoc D proposal
>> to have .gitdir/.git "symref"-like file to point to repository
>> passes, we can use that.
>
> I believe they mean checking out only the latest few revisions
> instead of copying the whole repo. This issue is a problem for
> Mozilla. If you want to change a line in the git version you have to
> download the entire 500MB tree with full history.

From http://bazaar-vcs.org/RcsComparisons:
  A "Checkout" is a working tree that points elsewhere for its RCS data.

You can always do what the Linux kernel did: split the repository into
a current and a historical part (which would also contain dead branches),
and create and publish a current-historical graft file to join the
history if needed.

>> Partial checkouts are only partially supported as of now; it means
>> you have to do some low level stuff to do partial checkout, and be
>> careful when committing. BTW it depends what you mean by partial
>> checkout, but they are somewhat incompatible with atomic commits
>> to snapshot based repository.
> 
> I believe partial checkout means being able to check one directory
> tree out of the repo and work on it while ignoring what is happening
> in the rest of the repo. This is another issue for Mozilla which has
> multiple dependent projects checked into a single repo.

So split different projects into different repositories. There was a
helper program for that (git-splitrepo or something like that) posted
on the git mailing list. And use a "superrepository" to gather all projects
together (see the recent discussion about subprojects on the git mailing list).
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-14 15:07 VCS comparison table Jon Smirl
  2006-10-14 16:40 ` Jakub Narebski
@ 2006-10-14 20:20 ` Jakub Narebski
  2006-10-14 23:06   ` Jon Smirl
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-14 20:20 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> I was reading Brendan's blog post about Mozilla 2
> http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html

You mean:
 "Oh, and isn't it time that we get off of CVS? The best way to do that
  without throwing 1.9 into an uproar is to develop Mozilla 2 using a new
  Version Control System (VCS) that can merge with CVS (since we will want
  to track changes to files not being revamped at first, or at all; and
  we'll probably find bugs whose fixes should flow back into 1.9). The
  problem with VCSes is that there are too many to choose from now.
  Nevertheless, looking for mostly green columns in that chart should help
  us make a quick decision. We don't need "the best" or the "newest", but we
  do need better merging, branching, and renaming support."

There is work by Jon Smirl and Shawn Pearce on CVS to Git importer which can
manage large and complicated (read: f*cked-up) Mozilla CVS repository.
  http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git

By the way, I'd rather use SCM comparison table on neutral site, not on SCM
site.


I think that Mozilla project should come up with its own set of requirements
and weights for best SCM _for Mozilla project_.

1. Converting existing CVS repository. This should be without data loss...
well, beside data loss that stems from using CVS in first place. "Best" SCM
would have:
  * Tool to convert CVS repository, which can then incrementally import
    changes.
  * It would be nice to have tool to exchange commits between SCM and CVS,
    be it like Tailor/git-svn, or via incremental import and exporting
    commits to CVS like git-cvsexportcommit. This would ease changing SCM,
    as both new SCM and CVS could be deployed in parallel, for a short time
    of course.
  * It would be nice to have CVS emulation like git-cvsserver, so users
    accustomed to CVS could still use it.

2. Good support for system which most important developers use, and good
support for system which most contributors use. If MS Windows is included
in those, then Git perhaps wouldn't be the best choice.

3. Good support for the workflow used in the project. Is it exchanging
patches via email (hello, Git!), having ssh access to some central
repository with central repository to push changes to or net/mesh of
repositories exchanging information, posting patches on some bug tracking
software integrated with SCM. Is it using many branches (topic branches),
or is it using few branches and merging.

But it is equally important to realize what would be the best workflow to
use, not constraining itself to the workflow imposed by limitations of CVS.

4. Good support for _large_ project, with large history. Namely, that
developer wouldn't need to download many megabytes and/or wouldn't need
megabytes of working area. How that is solved, be it partial checkouts,
lazy/shallow/sparse clone, subprojects, splitting into
projects/repositories and having some superproject or build-time
superproject, splitting repository into current and historical... that of
course depends on SCM.

5. ....

and probably few more
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-14 20:20 ` Jakub Narebski
@ 2006-10-14 23:06   ` Jon Smirl
  2006-10-14 23:34     ` Jakub Narebski
                       ` (4 more replies)
  0 siblings, 5 replies; 806+ messages in thread
From: Jon Smirl @ 2006-10-14 23:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
>
> > I was reading Brendan's blog post about Mozilla 2
> > http://weblogs.mozillazine.org/roadmap/archives/2006/10/mozilla_2.html
>
> You mean:
>  "Oh, and isn't it time that we get off of CVS? The best way to do that
>   without throwing 1.9 into an uproar is to develop Mozilla 2 using a new
>   Version Control System (VCS) that can merge with CVS (since we will want
>   to track changes to files not being revamped at first, or at all; and
>   we'll probably find bugs whose fixes should flow back into 1.9). The
>   problem with VCSes is that there are too many to choose from now.
>   Nevertheless, looking for mostly green columns in that chart should help
>   us make a quick decision. We don't need "the best" or the "newest", but we
>   do need better merging, branching, and renaming support."
>
> There is work by Jon Smirl and Shawn Pearce on CVS to Git importer which can
> manage large and complicated (read: f*cked-up) Mozilla CVS repository.
>   http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git

I am still working with the developers of the cvs2svn import tool to
fix things so that Mozilla CVS can be correctly imported. There are
still outstanding bugs in cvs2svn preventing a correct import. MozCVS
can be imported, but the resulting repository is not entirely correct.

Once they get the base cvs2svn fixed I'll port my patches to turn it
into cvs2git again.

There is no existing CVS importer that will correctly import the
Mozilla CVS. I have tried them all.

> By the way, I'd rather use SCM comparison table on neutral site, not on SCM
> site.
>
>
> I think that Mozilla project should come up with its own set of requirements
> and weights for best SCM _for Mozilla project_.
>
> 1. Converting existing CVS repository. This should be without data loss...
> well, beside data loss that stems from using CVS in first place. "Best" SCM
> would have:
>   * Tool to convert CVS repository, which can then incrementally import
>     changes.
>   * It would be nice to have tool to exchange commits between SCM and CVS,
>     be it like Tailor/git-svn, or via incremental import and exporting
>     commits to CVS like git-cvsexportcommit. This would ease changing SCM,
>     as both new SCM and CVS could be deployed in parallel, for a short time
>     of course.

From what Brendan wrote they are looking to continue 1.9 in CVS and
start 2.0 in a new SCM. This pretty much mandates tracking CVS into
the new SCM for a long period of time. Possibly as much as two years.
There does not appear to be a need to push 2.0 back into CVS.


>   * It would be nice to have CVS emulation like git-cvsserver, so users
>     accustomed to CVS could still use it.

This can also solve some of the problems with Windows support.

>
> 2. Good support for system which most important developers use, and good
> support for system which most contributors use. If MS Windows is included
> in those, then Git perhaps wouldn't be the best choice.

Better Windows support is needed to make git the first choice among
the various SCMs.

>
> 3. Good support for the workflow used in the project. Is it exchanging
> patches via email (hello, Git!), having ssh access to some central
> repository with central repository to push changes to or net/mesh of
> repositories exchanging information, posting patches on some bug tracking
> software integrated with SCM. Is it using many branches (topic branches),
> or is it using few branches and merging.
>
> But it is equally important to realize what would be the best workflow to
> use, not constraining itself to the workflow imposed by limitations of CVS.

A big problem for Mozilla is outside companies doing major work in a
local CVS. Since CVS is not decentralized these local repos drift away
from the main one over time making things hard to merge. Any new SCM
will have to be distributed.

> 4. Good support for _large_ project, with large history. Namely, that
> developer wouldn't need to download many megabytes and/or wouldn't need
> megabytes of working area. How that is solved, be it partial checkouts,
> lazy/shallow/sparse clone, subprojects, splitting into
> projects/repositories and having some superproject or build-time
> superproject, splitting repository into current and historical... that of
> course depends on SCM.

git has issues here. The smallest Mozilla download we have built so
far is 450MB for the initial checkout.

>
> 5. ....
>
> and probably few more


The three most complex repositories are the kernel, gcc and Mozilla.
Gcc is in SVN now, Mozilla in CVS and the kernel in git.

There are much larger repositories around for some of the distros, but
they are doing things like checking ISO images in to the repo which
just makes it big, not complex.

Top two git issues affecting Mozilla choosing it
1) some way to avoid the initial 450MB download
2) better windows support


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
@ 2006-10-14 23:34     ` Jakub Narebski
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-14 23:34 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> Top two git issues affecting Mozilla choosing it
> 1) some way to avoid the initial 450MB download

Give out CDs with Mozilla's git repository (and use alternates) ;-)
Just kidding...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
@ 2006-10-15  0:03       ` Sean
  2006-10-15  0:34         ` Jon Smirl
  0 siblings, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-15  0:03 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

On Sat, 14 Oct 2006 19:06:10 -0400
"Jon Smirl" <jonsmirl@gmail.com> wrote:

> Top two git issues affecting Mozilla choosing it
> 1) some way to avoid the initial 450MB download

Why not split the repository up after you import it?  Break it into
two repositories, last year or two, and then everything else.

> 2) better windows support

Hard to imagine native windows support existing in time to be used by 
the Mozilla folks, maybe in time for 3.0 :o)

Sean


* Re: VCS comparison table
  2006-10-15  0:03       ` Sean
@ 2006-10-15  0:34         ` Jon Smirl
       [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 806+ messages in thread
From: Jon Smirl @ 2006-10-15  0:34 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, git

On 10/14/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 14 Oct 2006 19:06:10 -0400
> "Jon Smirl" <jonsmirl@gmail.com> wrote:
>
> > Top two git issues affecting Mozilla choosing it
> > 1) some way to avoid the initial 450MB download
>
> Why not split the repository up after you import it?  Break it into
> two repositories, last year or two, and then everything else.

That is possible but I wish git had tools supporting this. What do you
do about core developers that want the full repo syncing to other
developers that only have a partial copy?

>
> > 2) better windows support
>
> Hard to imagine native windows support existing in time to be used by
> the Mozilla folks, maybe in time for 3.0 :o)
>
> Sean
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
  2006-10-14 23:34     ` Jakub Narebski
       [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
@ 2006-10-15  0:53     ` Jakub Narebski
  2006-10-15 15:37     ` Jakub Narebski
  2006-10-15 18:23     ` Petr Baudis
  4 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-15  0:53 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git

Jon Smirl wrote:
> On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:

>>   * It would be nice to have tool to exchange commits between SCM and CVS,
>>     be it like Tailor/git-svn, or via incremental import and exporting
>>     commits to CVS like git-cvsexportcommit. This would ease changing SCM,
>>     as both new SCM and CVS could be deployed in parallel, for a short time
>>     of course.
> 
> From what Brendan wrote they are looking to continue 1.9 in CVS and
> start 2.0 in a new SCM. This pretty much mandates tracking CVS into
> the new SCM for a long period of time. Possibly as much as two years.
> There does not appear to be a need to push 2.0 back into CVS.

That of course limits what we can do in 1.9 to what CVS supports.

> >   * It would be nice to have CVS emulation like git-cvsserver, so users
> >     accustomed to CVS could still use it.
> 
> This can also solve some of the problems with Windows support.

Well, git-cvsserver (perhaps with some improvements) could also serve as
CVS server for 1.9.
 
> > 4. Good support for _large_ project, with large history. Namely, that
> > developer wouldn't need to download many megabytes and/or wouldn't need
> > megabytes of working area. How that is solved, be it partial checkouts,
> > lazy/shallow/sparse clone, subprojects, splitting into
> > projects/repositories and having some superproject or build-time
> > superproject, splitting repository into current and historical... that of
> > course depends on SCM.
> 
> git has issues here. The smallest Mozilla download we have built so
> far is 450MB for the initial checkout.

One way to reduce repository size would be to split fairly independent
subprojects (independent = independently testable) into separate repositories,
and perhaps use some kind of "super-repository" (common repository) to join
all the projects into one single entity. The split can be done using
git-splitrepo (or something like that) which was posted on the git mailing list
(most probably by some member of X.Org), or just cg-admin-rewritehist.
While at it we could split the repository into current work and historical repo,
and clean up the current work repository from the accumulated cruft (e.g. dead
branches, broken tags etc.).
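
What such a split does to the history can be sketched on a toy model. In
the Python below, a history is a list of (commit, snapshot) pairs; the
split keeps only the files under one directory and drops commits that
leave that directory unchanged. The directory and file names are
illustrative, and a real tool would also rewrite commit ids, which this
sketch does not model.

```python
def split_subdirectory(history, prefix):
    """Keep only paths under prefix; drop commits that do not change them.

    history is a list of (commit_id, {path: content}) snapshots, oldest
    first. Returned snapshots have the prefix stripped from their paths.
    """
    prefix = prefix.rstrip("/") + "/"
    result = []
    previous = {}                      # subtree of the last kept commit
    for commit_id, tree in history:
        subtree = {path[len(prefix):]: content
                   for path, content in tree.items()
                   if path.startswith(prefix)}
        if subtree != previous:        # this commit touched the subproject
            result.append((commit_id, subtree))
            previous = subtree
    return result
```

A subproject that is rarely touched ends up with a much shorter history
than the combined repository, which is where the size reduction comes from.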


Another way is to use grafts.

The Linux kernel has its current repository (starting somewhere in 2.6.x),
and its historical repository. I don't remember how they arrived at it
(and don't want to check KernelTrap articles), whether the seed for the current
work repository was simply a project import at some state, or a (very slow)
import of BitKeeper history. But if I remember correctly it was born split.
You can join both repositories into one (wrt. log and diff for example)
using grafts.

I'm not sure what happens if you pull from a repository which has a graft
file "cauterizing" history; would you get the graft file and history up to
the cutoff point? What would happen if the repository you pull into
has a cauterization graft file; would it get cut history? Of course
the problem (and the source of the proposals for, and the troubles with
implementing, shallow/sparse/lazy clone) arises if someone branches (in a public
repo) from below the cutoff point. But that is a matter of policy.

But it is true that the size of the Mozilla repository is a challenge.
BTW, do you perchance know how other SCMs deal with a repository
of that size?

-- 
Jakub Narebski
ShadeHawk on #git
Poland


* Re: VCS comparison table
       [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
@ 2006-10-15  1:44             ` Sean
  0 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-15  1:44 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

On Sat, 14 Oct 2006 20:34:22 -0400
"Jon Smirl" <jonsmirl@gmail.com> wrote:

> That is possible but I wish git had tools supporting this. What do you
> do about core developers that want the full repo syncing to other
> developers that only have a partial copy?

I don't think that will be an issue at all.

As an example, take the current Linux kernel repo maintained by Linus,
and one of the repos containing old historic kernel data imported into
Git.  Graft in the old historic data into your clone of Linus' repo,
and you're done. Anyone can pull from you even if they don't have the
historic data themselves.

With a little work you could do the same thing with the Mozilla data.
After you decide where to make the split, you'd have to rewrite the
commit history for the "current" repository, so that it terminates
at an initial commit rather than having a direct connection to the
historic data.  After that, the repos could be used just as described
above, separately or grafted together.

As far as I know though, there is still no way to use the git protocol
for the initial pull of such a combined repository.  You have to pull
both repos separately and graft them together locally.  This sounds
harder than it is though and can be scripted easily.
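
As a rough model of what the graft does: a grafts file lists, one commit
per line, the commit id followed by the parents it should appear to have.
The Python sketch below applies such lines to a toy parent table; the
commit names and function names are made up for illustration.

```python
def apply_grafts(parents, graft_text):
    """Return a copy of the parent map with graft lines overriding parents.

    Each non-empty line is: <commit> [<parent> ...]
    """
    grafted = dict(parents)
    for line in graft_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        grafted[fields[0]] = fields[1:]   # commit, then its pretended parents
    return grafted


def first_parent_history(tip, parents):
    """List commits reachable from tip, following first parents only."""
    out = []
    while tip is not None:
        out.append(tip)
        ps = parents.get(tip, [])
        tip = ps[0] if ps else None
    return out
```

Grafting the root of the "current" history onto the tip of the historic
one makes the two read as a single line of history, while a line naming
a commit with no parents cuts the history off at that commit.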

Sean


* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
                       ` (2 preceding siblings ...)
  2006-10-15  0:53     ` Jakub Narebski
@ 2006-10-15 15:37     ` Jakub Narebski
  2006-10-15 18:23     ` Petr Baudis
  4 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-15 15:37 UTC (permalink / raw)
  To: git

Jon Smirl wrote:

> The three most complex repositories are the kernel, gcc and Mozilla.
> Gcc is in SVN now, Mozilla in CVS and the kernel in git.
> 
> There are much larger repositories around for some of the distros, but
> they are doing things like checking ISO images in to the repo which
> just makes it big, not complex.

I guess that one of the important things is the _size_ of the repository;
for example 12GB (if I remember the value for Subversion/SVK correctly) vs 500MB
for git...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-14 23:06   ` Jon Smirl
                       ` (3 preceding siblings ...)
  2006-10-15 15:37     ` Jakub Narebski
@ 2006-10-15 18:23     ` Petr Baudis
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
  2006-10-15 19:49       ` Jon Smirl
  4 siblings, 2 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-15 18:23 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

Dear diary, on Sun, Oct 15, 2006 at 01:06:10AM CEST, I got a letter
where Jon Smirl <jonsmirl@gmail.com> said that...
> On 10/14/06, Jakub Narebski <jnareb@gmail.com> wrote:
> >There is work by Jon Smirl and Shawn Pearce on CVS to Git importer which can
> >manage large and complicated (read: f*cked-up) Mozilla CVS repository.
> >  http://git.or.cz/gitwiki/InterfacesFrontendsAndTools#cvs2git
> 
> I am still working with the developers of the cvs2svn import tool to
> fix things so that Mozilla CVS can be correctly imported. There are
> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> can be imported, but the resulting repository is not entirely correct.
> 
> Once they get the base cvs2svn fixed I'll port my patches to turn it
> into cvs2git again.

So what exactly is the cvs2git status now? AFAIU, there's a tool that
parses the CVS repository and that is then "piped" to git-fastimport?
git-fastimport is available somewhere (perhaps it would be interesting
to publish it at repo.or.cz or something), is the current cvs2git
version available as well?

> >2. Good support for system which most important developers use, and good
> >support for system which most contributors use. If MS Windows is included
> >in those, then Git perhaps wouldn't be the best choice.
> 
> Better Windows support is needed to make git the first choice among
> the various SCMs.

And this is probably not likely to happen soon.

Well, I'm enlisted in a "Programming in Windows" course at my university
now and I had this kind of thoughts, but I really can't promise
anything. :-)

> >4. Good support for _large_ project, with large history. Namely, that
> >developer wouldn't need to download many megabytes and/or wouldn't need
> >megabytes of working area. How that is solved, be it partial checkouts,
> >lazy/shallow/sparse clone, subprojects, splitting into
> >projects/repositories and having some superproject or build-time
> >superproject, splitting repository into current and historical... that of
> >course depends on SCM.
> 
> git has issues here. The smallest Mozilla download we have built so
> far is 450MB for the initial checkout.

(BTW, yes, grafting the old history could help this time, but it is a
hack and not a good long-term solution - it is just putting the real
solution off until the project history regrows. Periodic
regrafting is an even worse hack, since at that moment you break
fast-forwarding and this kind of "restarting the history" breaks deep
into the Git distributiveness.)

> >5. ....
> >
> >and probably few more
> 
> 
> The three most complex repositories are the kernel, gcc and Mozilla.
> Gcc is in SVN now, Mozilla in CVS and the kernel in git.

I believe OpenOffice CVS probably beats all three hands down very
easily. KDE is also very big, and I don't think NetBSD is just ISO
images either (if it contains any at all).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
@ 2006-10-15 18:39         ` Sean
  2006-10-15 19:24         ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-15 18:39 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jon Smirl, Jakub Narebski, git

On Sun, 15 Oct 2006 20:23:03 +0200
Petr Baudis <pasky@suse.cz> wrote:

> (BTW, yes, grafting the old history could help this time, but it is a
> hack and not a good long-term solution - it is just putting the real
> solution off until the project history regrows. Periodic
> regrafting is an even worse hack, since at that moment you break
> fast-forwarding and this kind of "restarting the history" breaks deep
> into the Git distributiveness.)

But is there a better practical solution he can use today?  I don't think
there is.  And the experience of the Linux kernel has shown that it's not
really all that big a problem.  You even made a nice script to help people
do it! ;o)

It's probably not the solution that should be used _next_ time the repository
grows too big, but it sure seems like the correct solution this time around.
Not many people will want all that old history anyway (10+ years as I recall?).

Sean

* Re: VCS comparison table
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
  2006-10-15 18:39         ` Sean
@ 2006-10-15 19:24         ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-15 19:24 UTC (permalink / raw)
  To: Sean; +Cc: Jon Smirl, Jakub Narebski, git

On Sun, Oct 15, 2006 at 08:39:56PM CEST, Sean wrote:
> On Sun, 15 Oct 2006 20:23:03 +0200
> Petr Baudis <pasky@suse.cz> wrote:
> 
> > (BTW, yes, grafting the old history could help this time, but it is a
> > hack and not a good long-term solution - it is just putting the real
> > solution away until the project history will re-grew. Periodical
> > regrafting is even worse hack, since at that moment you break
> > fast-forwarding and this kind of "restarting the history" breaks deep
> > into the Git distributiveness.)
> 
> But is there a better practical solution he can use today?  I don't think
> there is.  And the experience of the Linux kernel has shown that it's not
> really all that big a problem.  You even made a nice script to help people
> do it! ;o)
> 
> It's probably not the solution that should be used _next_ time the repository
> grows too big, but it sure seems like the correct solution this time around.
> Not many people will want all that old history anyway (10+ years as i recall?).

Well I'm not saying it's the incorrect solution today, only that we
won't get around the problem by suggesting grafting forever. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-15 18:23     ` Petr Baudis
       [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
@ 2006-10-15 19:49       ` Jon Smirl
  2006-10-16  3:23         ` Petr Baudis
  1 sibling, 1 reply; 806+ messages in thread
From: Jon Smirl @ 2006-10-15 19:49 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> > I am still working with the developers of the cvs2svn import tool to
> > fix things so that Mozilla CVS can be correctly imported. There are
> > still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> > can be imported, but the resulting repository is not entirely correct.
> >
> > Once they get the base cvs2svn fixed I'll port my patches to turn it
> > into cvs2git again.
>
> So what exactly is the cvs2git status now? AFAIU, there's a tool that
> parses the CVS repository and that is then "piped" to git-fastimport?
> git-fastimport is available somewhere (perhaps it would be interesting
> to publish it at repo.or.cz or something), is the current cvs2git
> version available as well?

cvs2git is a set of patches that get applied to cvs2svn. The patches
modify cvs2svn to output things in a format that git-fastimport can
consume.

The problem is that there are issues with cvs2svn and how it converts
CVS into change sets that are not getting fixed. These issues are
annoying for SVN users but they are fatal for git. The exact problem
is a bug in the way CVS symbol dependencies are dealt with in cvs2svn.
The bug results in most branches and symbols being based on 5-7
different change sets instead of a single change set. SVN then copies
from the 5-7 change sets to build the branch base or symbol base.
Copying from the 5-7 change sets addresses the symptoms of the bug
instead of fixing the underlying problem, which is incorrect ordering
of the base change sets.

-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-15 19:49       ` Jon Smirl
@ 2006-10-16  3:23         ` Petr Baudis
  2006-10-16  3:30           ` Jon Smirl
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-16  3:23 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Jakub Narebski, git

Dear diary, on Sun, Oct 15, 2006 at 09:49:08PM CEST, I got a letter
where Jon Smirl <jonsmirl@gmail.com> said that...
> On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> >> I am still working with the developers of the cvs2svn import tool to
> >> fix things so that Mozilla CVS can be correctly imported. There are
> >> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> >> can be imported, but the resulting repository is not entirely correct.
> >>
> >> Once they get the base cvs2svn fixed I'll port my patches to turn it
> >> into cvs2git again.
> >
> >So what exactly is the cvs2git status now? AFAIU, there's a tool that
> >parses the CVS repository and that is then "piped" to git-fastimport?
> >git-fastimport is available somewhere (perhaps it would be interesting
> >to publish it at repo.or.cz or something), is the current cvs2git
> >version available as well?
> 
> cvs2git is a set of patches that get applied to cvs2svn. The patches
> modify cvs2svn to output things in a format that git-fastimport can
> consume.

By the way, isn't what you want an incremental importer, because of the
1.9 branch? According to its homepage, cvs2svn is not designed for
incremental importing. Or are you fixing that as well?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-16  3:23         ` Petr Baudis
@ 2006-10-16  3:30           ` Jon Smirl
  2006-10-17  3:52             ` Sam Vilain
  0 siblings, 1 reply; 806+ messages in thread
From: Jon Smirl @ 2006-10-16  3:30 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Sun, Oct 15, 2006 at 09:49:08PM CEST, I got a letter
> where Jon Smirl <jonsmirl@gmail.com> said that...
> > On 10/15/06, Petr Baudis <pasky@suse.cz> wrote:
> > >> I am still working with the developers of the cvs2svn import tool to
> > >> fix things so that Mozilla CVS can be correctly imported. There are
> > >> still outstanding bugs in cvs2svn preventing a correct import. MozCVS
> > >> can be imported, but the resulting repository is not entirely correct.
> > >>
> > >> Once they get the base cvs2svn fixed I'll port my patches to turn it
> > >> into cvs2git again.
> > >
> > >So what exactly is the cvs2git status now? AFAIU, there's a tool that
> > >parses the CVS repository and that is then "piped" to git-fastimport?
> > >git-fastimport is available somewhere (perhaps it would be interesting
> > >to publish it at repo.or.cz or something), is the current cvs2git
> > >version available as well?
> >
> > cvs2git is a set of patches that get applied to cvs2svn. The patches
> > modify cvs2svn to output things in a format that git-fastimport can
> > consume.
>
> By the way, isn't what you want an incremental importer, because of the
> 1.9 branch? According to its homepage, cvs2svn is not designed for
> incremental importing. Or are you fixing that as well?

cvsps works ok on small amounts of data, but it can't handle the full
Mozilla repo. The current idea is to convert the full repo with
cvs2git and build the ini file needed by cvsps to support incremental
imports. After that use cvsps.




-- 
Jon Smirl
jonsmirl@gmail.com

* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
@ 2006-10-16  3:53   ` Martin Pool
  2006-10-22 15:50     ` Jakub Narebski
  2006-10-16 22:26   ` Aaron Bentley
  2 siblings, 1 reply; 806+ messages in thread
From: Martin Pool @ 2006-10-16  3:53 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On 14 Oct 2006, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
> 
> > It refers to this comparison chart between source control systems.
> > http://bazaar-vcs.org/RcsComparisons
> 
> It is quite obvious that comparison of programs of given type (SMC)
> on some program site (Bazaar-NG) is usually biased towards said program,
> perhaps unconsciously: by emphasizing the features which were important
> for developers of said program.

I don't think I saw the original post but thanks for the feedback, we'll
update it.

> Gaah, subscribe-to-post mailing list!

No, it's just moderated for first time posters to avoid spam.  Your
message got through.

* Re: VCS comparison table
  2006-10-14 16:40 ` Jakub Narebski
  2006-10-14 17:18   ` Jon Smirl
  2006-10-16  3:53   ` Martin Pool
@ 2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
                       ` (3 more replies)
  2 siblings, 4 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-16 22:26 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
>>Does it accurately reflect the current status of git? Is their
>>assessment of git's rename capability correct?
> 
> 
> For example simple namespace for git: you can use shortened sha1
> (even to only 6 characters, although usually 8 are used), you can
> use tags, you can use ref^m~n syntax.

Bazaar's namespace is "simple" because all branches can be named by a
URL, and all revisions can be named by a URL + a number.

If that's true of Git, then it certainly has a simple namespace.  Using
eight-digit hex values doesn't sound simple to me, though.

> I'm not sure about "No" in "Supports Repository". Git supports multiple
> branches in one repository, and what's better supports development using
> multiple branches, but cannot for example do a diff or a cherry-pick
> between repositories (well, you can use git-format-patch/git-am to
> cherry-pick changes between repositories...).

That sounds right.  So those branches are persistent, and can be worked
on independently?

> About "checkouts", i.e. working directories with repository elsewhere:
> you can use GIT_DIR environmental variable or "git --git-dir" option,
> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> "symref"-like file to point to repository passes, we can use that.

It sounds like the .gitdir/.git proposal would give Git "checkouts", by
our meaning of the term.

> Partial checkouts are only partially supported as of now; it means
> you have to do some lowe level stuff to do partial checkout, and be
> carefull when comitting. BTW it depends what you mean by partial
> checkout, but they are somewhat incompatibile with atomic commits
> to snapshot based repository.

Yes, I'm very much aware of that tension.  It will be fun when Bazaar
tries to support that... :-)

> Git supports renames in its own way; it doesn't use file ids, nor
> remember renames (the new "note" header for use e.g. by porcelains 
> didn't pass if I remember correctly). But it does *detect* moving
> _contents_, and even *copying* _contents_ when requested. And of
> course it detect renames in merges.

You'll note we referred to that behavior on the page.  We don't think
what Git does is the same as supporting renames.  AIUI, some Git users
feel the same way.

> Git doesn't have some "plugin framework", but because it has many
> "plumbing" commands, it is easy to add new commands, and also new
> merge strategies, using shell scripts, Perl, Python and of course C.
> So the answer would be "Somewhat", as git has plugable merge strategies,
> or even "Yes" at it is easy to add new git command.

It sounds like you're saying it's extensible, not that it supports
plugins.  Plugins have very simple installation requirements.  They can
provide merge strategies, repository types, internet protocols, new
commands, etc., all seamlessly integrated.

What you're describing actually sounds like the Arch approach to
extensibility: provide a whole bunch of basic commands and let users
build an RCS on top of that.

As the author of two different Arch front-ends, I can say I haven't
found that approach satisfactory.  Invoking multiple commands tends to
re-invoke the same validation routines over and over, killing
efficiency, and diagnostics tend to be pretty poorly integrated.

Aaron

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
@ 2006-10-16 22:35     ` Andy Whitcroft
  2006-10-16 22:53       ` Jakub Narebski
  2006-10-16 23:19     ` Jakub Narebski
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 806+ messages in thread
From: Andy Whitcroft @ 2006-10-16 22:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:

>>> Git supports renames in its own way; it doesn't use file ids, nor
>>> remember renames (the new "note" header for use e.g. by porcelains 
>>> didn't pass if I remember correctly). But it does *detect* moving
>>> _contents_, and even *copying* _contents_ when requested. And of
>>> course it detect renames in merges.
> 
> You'll note we referred to that bevhavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

In my experience there are two key features to rename support.  The
first is that files move about efficiently, i.e. we don't have to carry
a different copy of the same file for each name it has had; this git
handles nicely.  The second is the seamless following of history 'back';
this git does not do trivially (when limited to specific files).  git
log on a renamed file pretty much stops at the rename point and you have
to deal with it yourself.

I would love to see someone respond with a pickaxe-like command line
which would list each and every change and its origin through merges and
the like.

Hmmm.

-apw

* Re: VCS comparison table
  2006-10-16 22:35     ` Andy Whitcroft
@ 2006-10-16 22:53       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-16 22:53 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Andy Whitcroft wrote:

> Aaron Bentley wrote:
> 
>>>> Git supports renames in its own way; it doesn't use file ids, nor
>>>> remember renames (the new "note" header for use e.g. by porcelains 
>>>> didn't pass if I remember correctly). But it does *detect* moving
>>>> _contents_, and even *copying* _contents_ when requested. And of
>>>> course it detect renames in merges.
>> 
>> You'll note we referred to that bevhavior on the page.  We don't think
>> what Git does is the same as supporting renames.  AIUI, some Git users
>> feel the same way.
> 
> In my experience there are two key features to rename support.  The
> first that files move about efficiently ie. we don't have to carry a
> different copy of the same file for each name it has had, this git
> handles nicely.  The second is the seemless following of history 'back',
> this git does not do trivially (when limited to specific files).  git
> log on a renamed file pretty much stops at the rename point and you have
> deal with it yourself.

Both git log and git diff follow renames (with -M) and even copies
(with -C), but the path _limiter_ doesn't follow renames. There is a
proposal to add a --follow option to git rev-list to follow specified
paths. A patch adding this option was posted here on the git mailing
list (check the archives); it was not applied because it was fairly
intrusive and not a complete solution, IIRC.

I'd say that the second part is _partially_ supported, as we can follow
the history of a renamed file with a pathlimit, detect that the file was
renamed, and continue using the previous name as the pathlimit. For
example, if you know all the names the file had through history, you can
get the whole history by providing all those names as the pathlimit
(well, unless there is some conflict, like a new file being created with
the same name the file had before the rename; something that all file-id
based solutions have a problem with).
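
The manual procedure described above can be sketched roughly as follows.
This is a toy Python model, not anything git actually runs: commits are
plain snapshots, the `history` data and the `follow` helper are made up
for illustration, and only exact (content-identical) renames are
detected, whereas git's -M also accepts merely similar content.

```python
# Toy model: each commit is a full snapshot mapping path -> file contents,
# listed oldest first. This is NOT how git stores data internally.
history = [
    {"README": "v1"},        # commit 0: file created
    {"README": "v2"},        # commit 1: file edited
    {"README.txt": "v2"},    # commit 2: pure rename
    {"README.txt": "v3"},    # commit 3: edited under the new name
]


def follow(history, path):
    """Walk history newest to oldest and return (commit index, path) for
    every commit that touched the file, switching the path limit at each
    detected rename."""
    touched = []
    for i in range(len(history) - 1, -1, -1):
        snap = history[i]
        parent = history[i - 1] if i > 0 else {}
        if path not in snap:
            break  # the file does not exist this far back
        if parent.get(path) == snap[path]:
            continue  # this commit did not change the file
        touched.append((i, path))
        if path not in parent:
            # The path first appears here. Look for a path that vanished
            # from the parent with identical contents (exact rename only).
            gone = [p for p in parent
                    if p not in snap and parent[p] == snap[path]]
            if not gone:
                break  # genuinely new file, history ends here
            path = gone[0]  # keep following under the old name
    return touched


print(follow(history, "README.txt"))
# -> [(3, 'README.txt'), (2, 'README.txt'), (1, 'README'), (0, 'README')]
```

Limiting the walk to "README.txt" alone would stop at commit 2; switching
the path limit to the old name at the rename is what recovers commits 1
and 0.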
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
@ 2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
                         ` (2 more replies)
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:45     ` Johannes Schindelin
  3 siblings, 3 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-16 23:19 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
> >>Does it accurately reflect the current status of git? Is their
> >>assessment of git's rename capability correct?
> >
> >
> > For example simple namespace for git: you can use shortened sha1
> > (even to only 6 characters, although usually 8 are used), you can
> > use tags, you can use ref^m~n syntax.
> 
> Bazaar's namespace is "simple" because all branches can be named by a
> URL, and all revisions can be named by a URL + a number.

Well, all refs (branches and tags) are named by [relative] path. So for
example we can have 'master', 'next', 'jc/diff' branches, 'v1.4.0' and
'examples/tag' tags. Cogito for example uses <repository URL>#<branch>
syntax.

> If that's true of Git, then it certainly has a simple namespace.  Using
> eight-digit hex values doesn't sound simple to me, though.

Well, <ref>~<n> means the <n>-th generation ancestor of a given ref
(following first parents), which for branches (which constantly change)
is a moving target.
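
To make the "moving target" point concrete, here is a small sketch in
Python. The commit graph, the `refs` table and the `resolve` helper are
all hypothetical and only handle a single `~<n>` or `^<m>` suffix; real
git accepts arbitrary combinations such as ref^m~n.

```python
# Hypothetical commit graph: commit name -> list of parents, first parent first.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "m": ["b", "c"],  # merge commit with two parents
    "d": ["m"],
}
refs = {"master": "d"}  # a branch is just a movable pointer to a commit


def resolve(spec):
    """Resolve '<ref>', '<ref>~<n>' or '<ref>^<m>' in this toy graph."""
    if "~" in spec:
        name, n = spec.split("~")
        commit = refs.get(name, name)
        for _ in range(int(n)):
            commit = parents[commit][0]  # follow first parents only
        return commit
    if "^" in spec:
        name, m = spec.split("^")
        commit = refs.get(name, name)
        return parents[commit][int(m) - 1]  # m-th parent, counted from 1
    return refs.get(spec, spec)


print(resolve("master~1"))  # -> m
print(resolve("master~2"))  # -> b
print(resolve("m^2"))       # -> c

# A new commit lands on the branch: the same expression now names
# a different commit.
parents["e"] = ["d"]
refs["master"] = "e"
print(resolve("master~1"))  # -> d
```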

There was a proposal to add some kind of serial number to git (like
Subversion revision numbers) and even a way to do this...
but one must realize that any serial number must be _local_ to the
repository. One cannot have universally valid revision numbers (even
only per branch) in distributed development. Subversion can do that only
because it is a centralized SCM. Global numbering and a distributed
nature don't mix... hence content-based sha1 as commit identifiers.


But this doesn't matter much, because you can have really lightweight
tags in git (especially now with packed refs support). So you can have
the namespace you want.

>> I'm not sure about "No" in "Supports Repository". Git supports multiple
>> branches in one repository, and what's better supports development using
>> multiple branches, but cannot for example do a diff or a cherry-pick
>> between repositories (well, you can use git-format-patch/git-am to
>> cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Branches are persistent, have a _separate_ (!) namespace (they are not
incorporated in the repository URL according to some kind of convention
like in Subversion), can be worked on independently, and you can easily
switch between branches in one working directory. Branches are cheap
in git (hence the notion of topic branches).

I wonder if any SCM other than git has an easy way to "rebase" a branch,
i.e. cut a branch at its branching point and transplant it to the tip
of another branch. For example, you work on the 'xx/topic' topic branch
and want to have the changes from that branch applied to current work,
not to the version from some time ago when you started working on
said feature.

What your comparison matrix lacks, for example, is whether a given SCM
saves information about branching points and merges, so you can
find out where two branches diverged, and when one branch was merged
into another.
 
>> About "checkouts", i.e. working directories with repository elsewhere:
>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
>> "symref"-like file to point to repository passes, we can use that.
> 
> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> our meaning of the term.

Actually it is better to work with a clone of the repository, perhaps
either symlinking the object database or using the alternates mechanism
(with alternates, repositories would share old history but gather new
history independently, I think).

>> Git doesn't have some "plugin framework", but because it has many
>> "plumbing" commands, it is easy to add new commands, and also new
>> merge strategies, using shell scripts, Perl, Python and of course C.
>> So the answer would be "Somewhat", as git has plugable merge strategies,
>> or even "Yes" at it is easy to add new git command.
> 
> It sounds like you're saying it's extensible, not that it supports
> plugins.  Plugins have very simple installation requirements.  They can
> provide merge strategies, repository types, internet protocols, new
> commands, etc., all seamlessly integrated.

Plugins = API + detection infrastructure + loading on demand.
Git has an API, has a kind of detection infrastructure (for commands and
merge strategies only), and doesn't have loading on demand. You can
easily provide new commands (thanks to the git wrapper) and new merge
strategies.

Does git need a "plugin framework"? I'm not sure. Right now it is like
a Linux kernel without loadable module support...

> What you're describing actually sounds like the Arch approach to
> extensibility: provide a whole bunch of basic commands and let users
> build an RCS on top of that.
>
> As the author of two different Arch front-ends, I can say I haven't
> found that approach satisfactory.  Invoking multiple commands tends
> re-invoke the same validation routines over and over, killing
> efficiency, and diagnostics tend to be pretty poorly integrated.

Actually I think it is how git was made. First came low level stuff,
"plumbing" in git parlance. Then there were scripts which used those
low level commands. There is ongoing project to rewrite them as builtin
commands (written in C); many of them got rewritten.

When git had very few higher level commands, along came git-pasky,
later renamed to Cogito: a higher level SCM built on top of Git (in bash
shell). Now core git contains many high level commands, "porcelain"
in git jargon.

Well, there is also StGit and its alternative pg (Patchy Git), which
implement Quilt-like functionality (patch management) on top of Git.

-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
  2006-10-16 22:35     ` Andy Whitcroft
  2006-10-16 23:19     ` Jakub Narebski
@ 2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
                         ` (2 more replies)
  2006-10-16 23:45     ` Johannes Schindelin
  3 siblings, 3 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-16 23:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Mon, 16 Oct 2006, Aaron Bentley wrote:
> 
> Bazaar's namespace is "simple" because all branches can be named by a
> URL, and all revisions can be named by a URL + a number.
> 
> If that's true of Git, then it certainly has a simple namespace.  Using
> eight-digit hex values doesn't sound simple to me, though.

Hey, "simple" is in the eye of the beholder. You can always just define 
Bazaar's naming convention to be simple. 

I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
name a revision in a distributed environment, though. I bet the "number" 
really only names a revision in one _single_ repository, right?

Which means that it's actually not a "name" of the revision at all. It's 
just a local shorthand that has no meaning, and the exact same revision 
will be called something different when in somebody else's repository.

I wouldn't call that "simple". I'd call it "insane".

In contrast, in git, a revision is a revision is a revision. If you give 
the SHA1 name, it's well-defined even between different repositories, and 
you can tell somebody that "revision XYZ is when the problem started", and 
they'll know _exactly_ which revision it is, even if they don't have your 
particular repository.

Now _that_ is true simplicity. It does automatically mean that the names 
are a bit longer, but in this case, "longer" really _does_ mean "simpler".
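
As a small illustration of why such a name is the same in every
repository, here is a sketch in Python. It computes the id git assigns
to a file's contents (a "blob"); commit ids are derived in the same
style, from a type header plus the object's content. The `git_blob_id`
helper name is made up for this example.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA1 object name git gives to a blob with this content."""
    # git hashes a short header ("blob <size>\0") followed by the raw bytes,
    # so the name depends only on the content, never on where it is stored.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))
# -> e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Two people hashing the same bytes in unrelated repositories get the same
# name, and an 8-character prefix is usually enough to quote it.
print(git_blob_id(b"hello\n")[:8])
# -> ce013625
```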

If you want a short, human-readable name, you _tag_ it. It takes all of a 
hundredth of a second or so to do.

> > I'm not sure about "No" in "Supports Repository". Git supports multiple
> > branches in one repository, and what's better supports development using
> > multiple branches, but cannot for example do a diff or a cherry-pick
> > between repositories (well, you can use git-format-patch/git-am to
> > cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Yes.

> > About "checkouts", i.e. working directories with repository elsewhere:
> > you can use GIT_DIR environmental variable or "git --git-dir" option,
> > or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> > "symref"-like file to point to repository passes, we can use that.
> 
> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> our meaning of the term.

Well, in the git world, it's really just one shared repository that has 
separate branch-namespaces, and separate working trees (aka "checkouts"). 
So yes, it probably matches what bazaar would call a checkout.

Almost nobody seems to actually use it that way in git - it's mostly more 
efficient to just have five different branches in the same working tree, 
and switch between them. When you switch between branches in git, git only 
rewrites the part of your working tree that actually changed, so switching 
is extremely efficient even with a large repo. 
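
A rough sketch of why that is cheap, as a toy Python model rather than
git's actual checkout code: if each branch tip is viewed as a mapping
from path to content id, only the paths whose ids differ need touching.
The `paths_to_rewrite` helper and the sample trees are hypothetical.

```python
def paths_to_rewrite(current, target):
    """Given two trees as {path: content id}, return (write, delete):
    the paths that must be rewritten and the paths that must be removed
    when switching the working tree from `current` to `target`."""
    write = sorted(p for p, oid in target.items() if current.get(p) != oid)
    delete = sorted(p for p in current if p not in target)
    return write, delete


# Hypothetical branch tips sharing most of their files.
master = {"Makefile": "1a", "main.c": "2b", "util.c": "3c"}
topic = {"Makefile": "1a", "main.c": "9f", "new.c": "4d"}

print(paths_to_rewrite(master, topic))
# -> (['main.c', 'new.c'], ['util.c'])
```

Makefile has the same content id on both branches, so it is never
touched, however large the rest of the tree is.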

So there is seldom any real need or reason to actually have multiple 
checkouts. But it certainly _works_.

> You'll note we referred to that bevhavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

The fact is, git supports renames better than just about anybody else. It 
just does them technically differently. The fact that it happens to be the 
_right_ way, and everybody else is incompetent, is not my fault ;)

			Linus

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
@ 2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  9:37       ` Robert Collins
  2 siblings, 0 replies; 806+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-10-16 23:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On 10/17/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Aaron Bentley wrote:
> > Jakub Narebski wrote:
> >> About "checkouts", i.e. working directories with repository elsewhere:
> >> you can use GIT_DIR environmental variable or "git --git-dir" option,
> >> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
> >> "symref"-like file to point to repository passes, we can use that.
> >
> > It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> > our meaning of the term.
>
> Actually it is better to work with clone of repository, perhaps either
> symlinking object database, or by alternates mechanism (with alternates
> repositories would share old history, but gather new independetly
> I think).
I agree. Each Git repository is designed to work with one working
directory. With the .gitdir/.git proposal, you are likely to check out
two working directories from one repo.
-- 
Duy

* Re: VCS comparison table
  2006-10-16 22:26   ` Aaron Bentley
                       ` (2 preceding siblings ...)
  2006-10-16 23:35     ` Linus Torvalds
@ 2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
                         ` (2 more replies)
  3 siblings, 3 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-16 23:45 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Hi Aaron,

On Mon, 16 Oct 2006, Aaron Bentley wrote:

> Jakub Narebski wrote:
> >>Does it accurately reflect the current status of git? Is their
> >>assessment of git's rename capability correct?
> >
> >
> > For example simple namespace for git: you can use shortened sha1
> > (even to only 6 characters, although usually 8 are used), you can
> > use tags, you can use ref^m~n syntax.
> 
> Bazaar's namespace is "simple" because all branches can be named by a 
> URL, and all revisions can be named by a URL + a number.

How should this cope with a distributed project? IOW how does it deal with 
"this revision and that revision are exactly the same"?

If I understand you correctly, you are claiming that you are not really 
identifying a revision, but a revision _at a certain place with a 
place-dependent number_. This conflicts with my understanding of a 
revision.

> If that's true of Git, then it certainly has a simple namespace.  Using 
> eight-digit hex values doesn't sound simple to me, though.

It depends on your usage. If you want to do anything interesting, like 
assure that you have the correct version, or assure that two different 
people's tags actually tag the same revision, there is no simpler 
representation.

> > I'm not sure about "No" in "Supports Repository". Git supports multiple
> > branches in one repository, and what's better supports development using
> > multiple branches, but cannot for example do a diff or a cherry-pick
> > between repositories (well, you can use git-format-patch/git-am to
> > cherry-pick changes between repositories...).
> 
> That sounds right.  So those branches are persistent, and can be worked
> on independently?

Of course! Persistence (and reliability) are the number one goal of git. 
Performance is the next one.

As an example of completely independent branches, look at the "next" and 
the "todo" branch of git. They are _completely_ independent, i.e. not even 
sharing history, let alone files.

> > Git supports renames in its own way; it doesn't use file ids, nor
> > remember renames (the new "note" header for use e.g. by porcelains
> > didn't pass if I remember correctly). But it does *detect* moving
> > _contents_, and even *copying* _contents_ when requested. And of
> > course it detect renames in merges.
> 
> You'll note we referred to that bevhavior on the page.  We don't think
> what Git does is the same as supporting renames.  AIUI, some Git users
> feel the same way.

Oh, we start another flamewar again?

Honestly, if you want to record renames, why don't you also support (with 
a command for each of those purposes) code copying? And refactoring? And 
copyright year bumps? _put your favourite here_

If you really, really think about it: it makes much more sense to record 
your intention in the commit message. So, instead of recording for _every_ 
_single_ file in folder1/ that it was moved to folder2/, it is better to 
say that you moved folder1/ to folder2/ _because of some special reason_!

Same goes for all other thinkable examples.

If you want to track code, then let the tracker do its work, i.e. let 
git-pickaxe figure out where your code came from. It is likely to be more 
precise than any human ever can be.

> > Git doesn't have some "plugin framework", but because it has many
> > "plumbing" commands, it is easy to add new commands, and also new
> > merge strategies, using shell scripts, Perl, Python and of course C.
> > So the answer would be "Somewhat", as git has pluggable merge strategies,
> > or even "Yes" as it is easy to add a new git command.
> 
> It sounds like you're saying it's extensible, not that it supports
> plugins.  Plugins have very simple installation requirements.  They can
> provide merge strategies, repository types, internet protocols, new
> commands, etc., all seamlessly integrated.
> 
> What you're describing actually sounds like the Arch approach to
> extensibility: provide a whole bunch of basic commands and let users
> build an RCS on top of that.

It is more like the Unix way. Let each command do _one_ thing, but let it 
do it _perfectly_.

> As the author of two different Arch front-ends, I can say I haven't
> found that approach satisfactory.  Invoking multiple commands tends to
> re-invoke the same validation routines over and over, killing
> efficiency, and diagnostics tend to be pretty poorly integrated.

Welcome to git! Git's commands are very efficient, and you can even pipe 
them efficiently! And now that we have GIT_TRACE, diagnostics are no 
concern.

Ciao,
Dscho

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
@ 2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:04         ` Johannes Schindelin
  2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:29       ` Luben Tuikov
  2006-10-17  4:24       ` Aaron Bentley
  2 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-16 23:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Bentley, bazaar-ng, git

Linus Torvalds wrote:
>>> About "checkouts", i.e. working directories with repository elsewhere:
>>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have .gitdir/.git
>>> "symref"-like file to point to repository passes, we can use that.
>> 
>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>> our meaning of the term.
> 
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.
> 
> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo. 

Unless you have branch(es) with totally different contents, like git.git
'todo' branch.

> So there is seldom any real need or reason to actually have multiple 
> checkouts. But it certainly _works_.

But without .git being either symlink, or .git/.gitdir "symref"-link,
you have to remember what to set GIT_DIR to, or the parameter for the --git-dir
option.

I'd like to mention once again that in Git branches and tags live in a
namespace totally separate from the repository namespace.
-- 
Jakub Narebski
Poland

* Re: VCS comparison table
  2006-10-16 23:55       ` Jakub Narebski
@ 2006-10-17  0:04         ` Johannes Schindelin
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:08         ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-17  0:04 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 17 Oct 2006, Jakub Narebski wrote:

> Linus Torvalds wrote:
> >>> About "checkouts", i.e. working directories with repository elsewhere:
> >>> you can use GIT_DIR environmental variable or "git --git-dir" option,
> >>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have .gitdir/.git
> >>> "symref"-like file to point to repository passes, we can use that.
> >> 
> >> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
> >> our meaning of the term.
> > 
> > Well, in the git world, it's really just one shared repository that has 
> > separate branch-namespaces, and separate working trees (aka "checkouts"). 
> > So yes, it probably matches what bazaar would call a checkout.
> > 
> > Almost nobody seems to actually use it that way in git - it's mostly more 
> > efficient to just have five different branches in the same working tree, 
> > and switch between them. When you switch between branches in git, git only 
> > rewrites the part of your working tree that actually changed, so switching 
> > is extremely efficient even with a large repo. 
> 
> Unless you have branch(es) with totally different contents, like git.git
> 'todo' branch.

But I _do_ work with it! I just don't need to "checkout" it! Example:

git -p cat-file -p todo:TODO

(How about making git-cat be a shortcut to "git -p cat-file -p"?)

> > So there is seldom any real need or reason to actually have multiple 
> > checkouts. But it certainly _works_.
> 
> But without .git being either symlink, or .git/.gitdir "symref"-link,
> you have to remember what to set GIT_DIR to, or the parameter for the --git-dir
> option.

You'd just use alternates for that.

But as Linus mentioned in another email, you mostly can use the _same_ 
working directory. If you want to work on another branch, which is not all 
that different from the current branch (say, you have a bug fix branch on 
top of an upstream branch), you just _switch_ to it. Git recognizes those 
files which are changed, and updates only these. Therefore, if you have 
something like a Makefile system to build the project, you actually save 
(compile) time as compared to the multiple-checkout scenario.
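
A toy model of the point above, not git's actual implementation: if each
branch tip is viewed as a map from path to content ID, switching only
needs to touch paths whose ID differs. The function name plan_switch and
the short IDs are made up for illustration.

```python
def plan_switch(current: dict, target: dict):
    """Return (paths to write, paths to delete) when moving between trees.

    Both arguments map a file path to the ID of its content.
    """
    # A path is rewritten only if it is new or its content ID changed.
    write = sorted(p for p, blob in target.items() if current.get(p) != blob)
    # A path is removed only if the target tree no longer has it.
    delete = sorted(p for p in current if p not in target)
    return write, delete


upstream = {"Makefile": "a1", "main.c": "b2", "util.c": "c3"}
bugfix = {"Makefile": "a1", "main.c": "b9", "util.c": "c3"}

write, delete = plan_switch(upstream, bugfix)
print(write, delete)  # ['main.c'] []
```

Files left untouched keep their timestamps, which is why a make-style
build only recompiles what the switch actually changed.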

I use this system a lot, since I maintain a few bugfixes for a few 
projects until the bugfixes are applied upstream. BTW the 
multiple-branches-in-one-working-directory workflow was propagated by Jeff 
a long time ago, and it really changed my way of working. Thanks, Jeff!

Ciao,
Dscho

* Re: VCS comparison table
  2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:04         ` Johannes Schindelin
@ 2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:24           ` Jakub Narebski
  2006-10-17  4:31           ` Aaron Bentley
  1 sibling, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17  0:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Jakub Narebski wrote:
> > rewrites the part of your working tree that actually changed, so switching 
> > is extremely efficient even with a large repo. 
> 
> Unless you have branch(es) with totally different contents, like git.git
> 'todo' branch.

Yes. I have to say, that's likely a fairly odd case, and I wouldn't be 
surprised if other VCS's don't support that mode of operation at _all_.

The fact that git branches can be independent of each other is very 
natural in the git world, but 

> > So there is seldom any real need or reason to actually have multiple 
> > checkouts. But it certainly _works_.
> 
> But without .git being either symlink, or .git/.gitdir "symref"-link,
> you have to remember what to set GIT_DIR to, or the parameter for the --git-dir
> option.

I'd strongly suggest that people who do this should actually do

	git clone -l

instead of actually playing games with symlinking .git/ itself or using 
GIT_DIR. It means that the two checkouts get separate branch namespaces, 
but that's really what you'd want most of the time. 

You _can_ share the whole branch namespace and do the symlink of .git (or 
just set GIT_DIR - but that's pretty inconvenient), and it might end up 
being "closer" to what some other VCS would do. But the natural thing to 
do with git is to just share some of the objects through local "slaving" 
of the repositories, and consider them otherwise entirely independent.

		Linus

* Re: VCS comparison table
  2006-10-17  0:04         ` Johannes Schindelin
@ 2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
                               ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17  0:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Johannes Schindelin wrote:
> > 
> > Unless you have branch(es) with totally different contents, like git.git
> > 'todo' branch.
> 
> But I _do_ work with it! I just don't need to "checkout" it! Example:
> 
> git -p cat-file -p todo:TODO

Ok, if there ever was an example of a strange git command-line, that was 
it.

> (How about making git-cat be a shortcut to "git -p cat-file -p"?)

Well, you can just add

	[alias]
		cat=-p cat-file -p

to your ~/.gitconfig file, and you're there.

[ For all the non-git people here: the first "-p" is shorthand for 
  "--paginate", and means that git will automatically start a pager for 
  the output. The second "-p" is shorthand for "pretty" (there's no 
  long-format command line switch for it, though), and means that git 
  cat-file will show the result in a human-readable way, regardless of 
  whether it's just a text-file, or a git directory ]

So then you can do just

	git cat todo:TODO

and you're done.

[ So for the non-git people, what that will actually _do_ is to show the 
  TODO file in the "todo" branch - regardless of whether it is checked out 
  or not, and start a pager for you. ]

I actually do this sometimes, but I've never done it for branches (and I 
do it seldom enough that I haven't added the alias). I do it for things 
like

	git cat v2.6.16:Makefile

to see what a file looked like in a certain tagged release.

People sometimes find the git command line confusing, but I have to say, 
the thing is _damn_ expressive. I've never seen anybody else do things 
like the above that git does really naturally, with not that much 
confusion really.

Even that "alias" file is quite readable, although I'd suggest writing out 
the switches in full, ie

	[alias]
		cat=--paginate cat-file -p

instead. That kind of helps explain what the alias does and avoids the 
question of why there are two "-p" switches.

			Linus

* Re: VCS comparison table
  2006-10-17  0:08         ` Linus Torvalds
@ 2006-10-17  0:24           ` Jakub Narebski
  2006-10-17  4:31           ` Aaron Bentley
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17  0:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Linus Torvalds wrote:

>> > So there is seldom any real need or reason to actually have multiple 
>> > checkouts. But it certainly _works_.
>> 
>> But without .git being either symlink, or .git/.gitdir "symref"-link,
>> you have to remember what to set GIT_DIR to, or the parameter for the --git-dir
>> option.
> 
> I'd strongly suggest that people who do this should actually do
> 
>         git clone -l
> 
> instead of actually playing games with symlinking .git/ itself or using 
> GIT_DIR. It means that the two checkouts get separate branch namespaces, 
> but that's really what you'd want most of the time. 

Or symlinking .git/objects (and perhaps .git/remotes and .git/branches).
BTW, wouldn't it rather be git clone -l -s? What would happen on repack,
or on repack -a -d?

But it is true that there is no need to checkout different branches
to different working areas.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
@ 2006-10-17  0:29       ` Luben Tuikov
  2006-10-17  4:24       ` Aaron Bentley
  2 siblings, 0 replies; 806+ messages in thread
From: Luben Tuikov @ 2006-10-17  0:29 UTC (permalink / raw)
  To: Linus Torvalds, Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

--- Linus Torvalds <torvalds@osdl.org> wrote:
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.
> 
> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo. 
> 
> So there is seldom any real need or reason to actually have multiple 
> checkouts. But it certainly _works_.

It does work, very well at that.

I have a directory for each separate branch and simply use
cd(1) to change the current working directory to that branch.
So, instead of "git checkout <branch>", I do "cd ../<branch>".

One only needs to watch out when one updates the repository.
If there had been updates in those branches, then one needs
to git-reset the "branch" directory... (you know what I mean)
(For example when I come to work in the morning and sync up
 with home from my usb key...)

The script is called:
Usage: git-mkdir-of-branch <original-directory> <branch> <new-directory>
  where <branch> is the name of an existing branch in <original-directory>/.git/refs/heads

and uses simple symbolic links and some git plumbing to do the
job.  It can be found in my git trees.  I never bothered to send
it out to Junio, since it could be considered heretic. ;-)

     Luben

* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
@ 2006-10-17  0:36             ` Johannes Schindelin
  2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
  2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-17  0:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, git

Hi,

On Mon, 16 Oct 2006, Linus Torvalds wrote:

> On Tue, 17 Oct 2006, Johannes Schindelin wrote:
> 
> > (How about making git-cat be a shortcut to "git -p cat-file -p"?)
> 
> Well, you can just add
> 
> 	[alias]
> 		cat=-p cat-file -p
> 
> to your ~/.gitconfig file, and you're there.

Ha! I have had that for a long time! Although I named it "s", since "git s 
todo:TODO" is two letters shorter...

Ciao,
Dscho

P.S.: BTW a certain person complained about ~/.gitconfig not being 
documented, but evidently the itch was not big enough for that person to 
document it himself...

* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
@ 2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
  2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 806+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-10-17  1:17 UTC (permalink / raw)
  To: git

On 10/17/06, Linus Torvalds <torvalds@osdl.org> wrote:
> So then you can do just
>
>         git cat todo:TODO
>
> and you're done.
>
> [ So for the non-git people, what that will actually _do_ is to show the
>   TODO file in the "todo" branch - regardless of whether it is checked out
>   or not, and start a pager for you. ]
>
> I actually do this sometimes, but I've never done it for branches (and I
> do it seldom enough that I haven't added the alias). I do it for things
> like
>
>         git cat v2.6.16:Makefile
>
> to see what a file looked like in a certain tagged release.

This very useful syntax (<ent>:<path>) didn't get documented
"officially" anywhere. It was actually documented in commit log
v1.4.1^0~255^2. Maybe someone should copy and paste it to git
documentation? Maybe core-tutorial.txt or git-rev-parse.txt, is there
any better place?
-- 
Duy

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
@ 2006-10-17  2:40       ` Petr Baudis
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  9:33       ` Robert Collins
  2 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-17  2:40 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Aaron Bentley, Jakub Narebski, bazaar-ng, git

Hi!

Dear diary, on Tue, Oct 17, 2006 at 01:45:34AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Mon, 16 Oct 2006, Aaron Bentley wrote:
> > As the author of two different Arch front-ends, I can say I haven't
> > found that approach satisfactory.  Invoking multiple commands tends to
> > re-invoke the same validation routines over and over, killing
> > efficiency, and diagnostics tend to be pretty poorly integrated.
> 
> Welcome to git! Git's commands are very efficient, and you can even pipe 
> them efficiently! And now that we have GIT_TRACE, diagnostics are no 
> concern.

I think Aaron rather meant that in case of an error, the error messages
may seem incoherent from the perspective of a porcelain user if they've
been generated by the plumbing. And I had that problem in Cogito as well
a few times in the past, but I think most of those are reasonable now (I
can't think of a counter-example off the top of my head).

Calling multiple git commands _is_ a problem, especially in a loop, but
I think it's more the inherent fork()+execve() overhead than whatever
happens over and over when main() takes over. Many git commands got
adjusted so that you can call them just once and then feed from/to them
over longer time period.
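
A rough sketch of the two calling patterns, under the assumption that a
small Python child process can stand in for a plumbing command (it just
upper-cases each line it reads). The names CHILD, one_process_per_item
and one_process_for_all are made up; no real git command is involved.

```python
import subprocess
import sys

# Hypothetical stand-in for a plumbing command: upper-cases each stdin line.
CHILD = "import sys\nfor line in sys.stdin:\n    sys.stdout.write(line.upper())\n"


def one_process_per_item(items):
    """Spawn a fresh child for every item: one fork+exec per loop pass."""
    results = []
    for item in items:
        done = subprocess.run(
            [sys.executable, "-c", CHILD],
            input=item + "\n", capture_output=True, text=True, check=True,
        )
        results.append(done.stdout.rstrip("\n"))
    return results


def one_process_for_all(items):
    """Spawn the child once and feed it every item on stdin."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD],
        input="".join(item + "\n" for item in items),
        capture_output=True, text=True, check=True,
    )
    return done.stdout.splitlines()


names = ["makefile", "readme", "todo"]
print(one_process_per_item(names) == one_process_for_all(names))  # True
```

Both functions return the same answers; the second pays the process
start-up cost once instead of once per item.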

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: VCS comparison table
  2006-10-16  3:30           ` Jon Smirl
@ 2006-10-17  3:52             ` Sam Vilain
  2006-10-17 12:59               ` Jon Smirl
  0 siblings, 1 reply; 806+ messages in thread
From: Sam Vilain @ 2006-10-17  3:52 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Petr Baudis, Jakub Narebski, git

Jon Smirl wrote:
> cvsps works ok on small amounts of data, but it can't handle the full
> Mozilla repo. The current idea is to convert the full repo with
> cvs2git and build the ini file needed by cvsps to support incremental
> imports. After that use cvsps.
>   

Looking through the client.mk used to check out the sub-portions of the
CVS repository, I have to ask:

Why are you trying to import this big collection of projects into a
single git repository?

View git's repositories not as a container for an entire community's
code base, but more as object partitions.  Currently you are quite happy
to use per-file version control partitions inherent to CVS.  Now you are
looking at removing all of the partitions completely and hoping to end
up with something manageable.  That it has been possible at all to fit it
into the space less than the size of a CD is staggering, but surely a
piecemeal approach would be a pragmatic solution to this problem.

Sam.

* Re: VCS comparison table
  2006-10-16 23:35     ` Linus Torvalds
  2006-10-16 23:55       ` Jakub Narebski
  2006-10-17  0:29       ` Luben Tuikov
@ 2006-10-17  4:24       ` Aaron Bentley
  2006-10-17  7:50         ` Andreas Ericsson
                           ` (3 more replies)
  2 siblings, 4 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Mon, 16 Oct 2006, Aaron Bentley wrote:
>> Bazaar's namespace is "simple" because all branches can be named by a
>> URL, and all revisions can be named by a URL + a number.

> I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
> name a revision in a distributed environment, though. I bet the "number" 
> really only names a revision in one _single_ repository, right?

Right.  That's why I said all revisions can be named by a URL + a
number, because it's the combination of the URL + a number that is
unique.  (In bzr, each branch has a URL.)

> In contrast, in git, a revision is a revision is a revision. 

I agree that a revision is a revision, but I don't think that's a
property unique to git. :-)

> If you give 
> the SHA1 name, it's well-defined even between different repositories, and 
> you can tell somebody that "revision XYZ is when the problem started", and 
> they'll know _exactly_ which revision it is, even if they don't have your 
> particular repository.

When two people have copies of the same revision, it's usually because
they are each pulling from a common branch, and so the revision in that
branch can be named.  Bazaar does use unique ids internally, but it's
extremely rare that the user needs to use them.

> Now _that_ is true simplicity. It does automatically mean that the names 
> are a bit longer, but in this case, "longer" really _does_ mean "simpler".
> 
> If you want a short, human-readable name, you _tag_ it. It takes all of a 
> hundredth of a second to do or so.

But tags have local meaning only, unless someone has access to your
repository, right?

>>> About "checkouts", i.e. working directories with repository elsewhere:
>>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have .gitdir/.git
>>> "symref"-like file to point to repository passes, we can use that.
>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>> our meaning of the term.
> 
> Well, in the git world, it's really just one shared repository that has 
> separate branch-namespaces, and separate working trees (aka "checkouts"). 
> So yes, it probably matches what bazaar would call a checkout.

The key thing about a checkout is that it's stored in a different
location from its repository.  This provides a few benefits:

- - you can publish a repository without publishing its working tree,
  possibly using standard mirroring tools like rsync.

- - you can have working trees on local systems while having the
  repository on a remote system.  This makes it easy to work on one
  logical branch from multiple locations, without getting out of sync.

- - you can use a checkout to maintain a local mirror of a read-only
  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

> Almost nobody seems to actually use it that way in git - it's mostly more 
> efficient to just have five different branches in the same working tree, 
> and switch between them. When you switch between branches in git, git only 
> rewrites the part of your working tree that actually changed, so switching 
> is extremely efficient even with a large repo.

You can operate that way in bzr too, but I find it nicer to have one
checkout for each active branch, plus a checkout of bzr.dev.  Our switch
command also rewrites only the changed part of the working tree.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNFrv0F+nu1YWqI0RAgBHAJ9XpmdvuCNDysxFhnyeCmkEG/z0ggCggMsJ
WyW6lqGMokh0k0It1KOdgtk=
=L1SR
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-17  0:08         ` Linus Torvalds
  2006-10-17  0:24           ` Jakub Narebski
@ 2006-10-17  4:31           ` Aaron Bentley
  2006-10-19 19:01             ` Nathaniel Smith
  1 sibling, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Jakub Narebski wrote:
>> Unless you have branch(es) with totally different contents, like git.git
>> 'todo' branch.
> 
> Yes. I have to say, that's likely a fairly odd case, and I wouldn't be 
> surprised if other VCS's don't support that mode of operation at _all_.

Bazaar also supports multiple unrelated branches in a repository, as
does CVS, SVN (depending how you squint), Arch, and probably Monotone.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNFy90F+nu1YWqI0RAgMeAJ99OikxXspSg+efnN6j3ySoPuOovQCfaKA6
yPCRw5Kl/V+ThnU6fsPA8TQ=
=DYAN
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
@ 2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
                           ` (3 more replies)
  2006-10-17  9:37       ` Robert Collins
  2 siblings, 4 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17  4:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
> (which constantly change) is a moving target.

Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
positive numbers to refer to the number of commits that have been made
since the branch was initialized.

> One cannot have universally valid revision numbers (even
> only per branch) in distributed development. Subversion can do that only
> because it is a centralized SCM. Global numbering and a distributed nature
> don't mix... hence content-based sha1 as commit identifiers.

Sure.  Our UI approach is that unique identifiers can usefully be
abstracted away with a combination of URL + number, in the vast majority
of cases.

> But this doesn't matter much, because you can have really lightweight
> tags in git (especially now with packed refs support). So you can have
> the namespace you want.

The nice thing about revision numbers is that they're implicit-- no one
needs to take any action to update them, and so you can always use them.

> I wonder if any SCM other than git has easy way to "rebase" a branch,
> i.e. cut branch at branching point, and transplant it to the tip
> of other branch. For example you work on 'xx/topic' topic branch,
> and want to have changes in those branch but applied to current work,
> not to the version some time ago when you have started working on
> said feature.

If I understand correctly, in Bazaar, you'd just merge the current work
into 'xx/topic'.

> What your comparison matrix lacks for example is if given SCM
> saves information about branching point and merges, so you can
> get where two branches diverged, and when one branch was merged into
> another.

I'm not sure what you mean about divergence.  For example, Bazaar
records the complete ancestry of each branch, and determining the point
of divergence is as simple as finding the last common ancestor.  But are
you considering only the initial divergence?  Or if the branches merge
and then diverge again, would you consider that the point of divergence?
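
One way to make the question concrete, as a sketch only: treat history
as a map from each revision to its parents and take the common ancestors
that are not themselves ancestors of another common ancestor. This is a
generic graph illustration, not Bazaar's or git's code; the revision
names and both function names are made up.

```python
def ancestors(parents, rev):
    """All revisions reachable from rev through parent links, rev included."""
    seen, stack = set(), [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def divergence_points(parents, a, b):
    """Common ancestors of a and b that no other common ancestor descends from."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# trunk: A - B - C - M - E      topic: A - B - D - F
# M is a merge of C and D, so the branches met once and split again.
parents = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["B"],
    "M": ["C", "D"], "E": ["M"], "F": ["D"],
}

print(divergence_points(parents, "C", "D"))  # {'B'}  (initial divergence)
print(divergence_points(parents, "E", "F"))  # {'D'}  (after merge and re-split)
```

Under this definition the answer moves forward after each merge: the
initial split at B is replaced by D once D has been merged into trunk.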

merge-point tracking is a prerequisite for Smart Merge, which does
appear on our matrix.

> Plugins = API + detection infrastructure + loading on demand.
> Git has API, has a kind of detection infrastructure (for commands and
> merge strategies only), doesn't have loading on demand. You can
> easily provide new commands (thanks to git wrapper) and new merge
> strategies.

I'm not sure what you mean by API, unless you mean the commandline.  If
that's what you mean, surely all unix commands are extensible in that
regard.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNGKQ0F+nu1YWqI0RAsW+AJoDOsNRmBjo3raT43JL6qn7SuJNRwCfe9l5
oAZ9OyrxMQlHnwrruhcjz9Y=
=RNuG
-----END PGP SIGNATURE-----

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
@ 2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
                           ` (3 more replies)
  2006-10-17  9:33       ` Robert Collins
  2 siblings, 4 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17  5:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johannes Schindelin wrote:
> On Mon, 16 Oct 2006, Aaron Bentley wrote:

>> Bazaar's namespace is "simple" because all branches can be named by a 
>> URL, and all revisions can be named by a URL + a number.
> 
> How should this cope with a distributed project? IOW how does it deal with 
> "this revision and that revision are exactly the same"?

There are two answers here.  One is that the URL + number is UI, not
internals.  A unique ID is used internally, so that can be compared.

But to fully ensure that there are no differences, i.e. that no one has
reused an ID, you can generate a revision testament.

> If I understand you correctly, you are claiming that you are not really 
> identifying a revision, but a revision _at a certain place with a 
> place-dependent number_. This conflicts with my understanding of a 
> revision.

No, I am claiming that a revision at a certain place with a
place-dependent number is one name for a revision, but it may have other
names.
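
A toy model of the distinction, not Bazaar's data model: each branch URL
maps to an ordered list of unique revision IDs, and a number is just a
position in that list. The example.com URLs, the IDs, and the function
names are all hypothetical.

```python
def resolve(branches, url, number):
    """Turn a (URL, number) name into the unique revision ID it refers to."""
    return branches[url][number - 1]


def names_for(branches, revision_id):
    """List every (URL, number) name that refers to the given revision ID."""
    return sorted(
        (url, history.index(revision_id) + 1)
        for url, history in branches.items()
        if revision_id in history
    )


branches = {
    "http://example.com/trunk": ["id-1", "id-2", "id-3"],
    "http://example.com/topic": ["id-1", "id-9"],
}

# The same number means different revisions at different URLs ...
print(resolve(branches, "http://example.com/trunk", 2))  # id-2
print(resolve(branches, "http://example.com/topic", 2))  # id-9

# ... while one revision can be reached under more than one name.
print(names_for(branches, "id-1"))
```

In this model the unique ID is what gets compared; the URL plus number
is a convenient handle that only makes sense relative to one branch.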

>> If that's true of Git, then it certainly has a simple namespace.  Using 
>> eight-digit hex values doesn't sound simple to me, though.
> 
> It depends on your usage. If you want to do anything interesting, like 
> assure that you have the correct version, or assure that two different 
> people's tags actually tag the same revision, there is no simpler 
> representation.

I can use the 'bzr missing' command to check whether my branch is in
sync with a remote branch.  Or I can use the 'pull' command to update my
branch to a given revno in a remote branch.


>> That sounds right.  So those branches are persistent, and can be worked
>> on independently?
> 
> Of course! Persistence (and reliability) are the number one goal of git. 
> Performance is the next one.

You'd be surprised.  When we last spoke to the Mercurial team, Mercurial
didn't support multiple persistent branches in one repository.  Pulling
from a remote repository could join two branches into one.  I'm told
they're fixing that now.


>> You'll note we referred to that behavior on the page.  We don't think
>> what Git does is the same as supporting renames.  AIUI, some Git users
>> feel the same way.
> 
> Oh, we start another flamewar again?

I'd hope not.  It sounds as though you feel that supporting renames in
the data representation is *wrong*, and therefore it should be an insult
to you if we said that Git fully supported renames.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNGVq0F+nu1YWqI0RAsXiAJ9hjH2sQGG3E9oIYP2SxscXvVQsJACdHtkj
+r37JPSjbQCuchPo08P3px8=
=5MHE
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
@ 2006-10-17  5:20         ` Shawn Pearce
  2006-10-17  8:21           ` Martin Pool
  2006-10-17  8:15         ` Jakub Narebski
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-10-17  5:20 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Jakub Narebski wrote:
> > One cannot have universally valid revision numbers (even
> > only per branch) in distributed development. Subversion can do that only
> > because it is centralized SCM. Global numbering and distributed nature
> > doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.

But this only works when the URL is public.  In Git I can just look up
the unique SHA1 for a revision in my private repository and toss it
into an email with a quick copy and paste.  With Bazaar it sounds
like I'd have to do that relative to some known public repository,
which just sounds like more work to me.

But I don't want to see this otherwise interesting thread devolve into
a "we do X better!" match so I'm not going to say anything further here.
 
> > I wonder if any SCM other than git has easy way to "rebase" a branch,
> > i.e. cut branch at branching point, and transplant it to the tip
> > of other branch. For example you work on 'xx/topic' topic branch,
> > and want to have changes in those branch but applied to current work,
> > not to the version some time ago when you have started working on
> > said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.

Git has two approaches:

 - merge: The two independent lines of development are merged
   together under a new single graph node.  This is a merge commit
   and has two parent pointers, one for each independent line of
   development which was combined into one.  Up to 16 independent
   lines can be merged at once, though 12 is the record.

 - rebase: The commits from one line of development are replayed
   onto a totally different line of development.  This is often
   used to reapply your changes onto the upstream branch after the
   upstream has changed but before you send your changes upstream.
   It can often generate more readable commit history.

I believe what you are talking about in Bazaar is the former (merge)
while what Jakub was talking about was the latter (rebase).
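
The difference between the two approaches can be sketched with a toy
commit graph. This is only an illustration under stated assumptions: the
Commit class, merge() and rebase() below are hypothetical helpers, and the
toy id is not git's real commit object format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    message: str
    parents: tuple = ()

    @property
    def id(self) -> str:
        # Toy identifier: hash of the message plus the parent ids.
        # This is NOT git's real commit object format.
        data = self.message + "|" + ",".join(p.id for p in self.parents)
        return hashlib.sha1(data.encode("utf-8")).hexdigest()[:8]


def merge(ours: Commit, theirs: Commit) -> Commit:
    """Join two lines of development under one node with two parents."""
    return Commit("merge", (ours, theirs))


def rebase(commits: list, onto: Commit) -> Commit:
    """Replay commits (oldest first) on top of `onto`; return the new tip."""
    tip = onto
    for c in commits:
        tip = Commit(c.message, (tip,))
    return tip


# master: D---E---F---G        topic: A---B---C, forked from E
d = Commit("D")
e = Commit("E", (d,))
f = Commit("F", (e,))
g = Commit("G", (f,))
a = Commit("A", (e,))
b = Commit("B", (a,))
c = Commit("C", (b,))

merged = merge(g, c)                 # one new node, two parent pointers
rebased = rebase([a, b, c], onto=g)  # three new nodes on top of G

print(len(merged.parents))
print(rebased.id, c.id)
```

In this model the merge keeps A, B and C untouched and adds one node,
while the rebase creates new commits with the same messages but different
parents, which is why their identifiers change.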
 
> > What your comparison matrix lacks for example is if given SCM
> > saves information about branching point and merges, so you can
> > get where two branches diverged, and when one branch was merged into
> > another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?

I believe you nailed what Jakub was talking about.
And yes, I noticed it's in your matrix, but it's not very clear.
I think that some additional explanation there may help other
readers.
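
A minimal sketch of "finding the last common ancestor", assuming history
is given as a plain mapping from commit name to parent names (the names
and the helper functions are hypothetical, not any tool's real API):

```python
def ancestors(parents, commit):
    """Return the set of all commits reachable from `commit`, itself included."""
    seen = set()
    stack = [commit]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another common one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != d and c in ancestors(parents, d) for d in common)
    }


# master: D-E-F-G, topic: A-B-C forked from E
parents = {
    "E": ["D"], "F": ["E"], "G": ["F"],
    "A": ["E"], "B": ["A"], "C": ["B"],
}
print(merge_bases(parents, "G", "C"))

# Merge topic into master (M), then both branches continue (H and I).
parents.update({"M": ["G", "C"], "H": ["M"], "I": ["C"]})
print(merge_bases(parents, "H", "I"))
```

In this toy model the answer to the "initial divergence or latest
divergence" question falls out of the graph: before the merge the common
ancestor is E, and after merging and diverging again it is C.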
 
-- 
Shawn.


* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
@ 2006-10-17  5:25         ` Carl Worth
  2006-10-17  5:31         ` Shawn Pearce
                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-17  5:25 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git


On Tue, 17 Oct 2006 01:08:59 -0400, Aaron Bentley wrote:
> >> If that's true of Git, then it certainly has a simple namespace.  Using
> >> eight-digit hex values doesn't sound simple to me, though.
> >
> > It depends on your usage. If you want to do anything interesting, like
> > assure that you have the correct version, or assure that two different
> > person's tags actually tag the same revision, there is no simpler
> > representation.
>
> I can use the 'bzr missing' command to check whether my branch is in
> sync with a remote branch.  Or I can use the 'pull' command to update my
> branch to a given revno in a remote branch.

I think you missed the simplicity of the git naming here. With git, I
can receive a bug report that specifies a bug that appears in a
revision such as:

	71037f3612da9d11431567c05c17807499ab1746

And since I have a commit object in my repository with that same name
I have a strong assurance that I am testing the identical software as
the bug reporter without me ever needing any access to pull from the
reporter's repository.

And this works in an entirely distributed fashion. Any two users can
be certain they are working with identical software on both ends by
exchanging and comparing a few bytes, (in email, irc, bugzilla, what
have you), without any need to refer to a common repository which both
users have access to.
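
As a small illustration of why such names need no shared repository, here
is a sketch of how git derives the name of a blob object: the SHA-1 of a
short header ("blob", the size, a NUL byte) followed by the content. The
function name git_object_id is made up for this example.

```python
import hashlib


def git_object_id(obj_type: str, payload: bytes) -> str:
    """Return the SHA-1 name git gives an object with this type and content."""
    header = f"{obj_type} {len(payload)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + payload).hexdigest()


# Two users hashing the same bytes independently arrive at the same name.
alice = git_object_id("blob", b"hello\n")
bob = git_object_id("blob", b"hello\n")
print(alice == bob)
print(alice)
```

Commit names are computed the same way over the commit object, which
itself contains the names of its tree and parents, so an identical commit
name implies identical content and history.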

-Carl



* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
@ 2006-10-17  5:31         ` Shawn Pearce
  2006-10-17  6:23         ` Junio C Hamano
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  3 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-17  5:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Johannes Schindelin wrote:
> > On Mon, 16 Oct 2006, Aaron Bentley wrote:
> >> You'll note we referred to that behavior on the page.  We don't think
> >> what Git does is the same as supporting renames.  AIUI, some Git users
> >> feel the same way.
> > 
> > Oh, we start another flamewar again?
> 
> I'd hope not.  It sounds as though you feel that supporting renames in
> the data representation is *wrong*, and therefore it should be an insult
> to you if we said that Git fully supported renames.

It would seem that the majority of folks on the Git list feel that
way, myself among them.  I don't know that we'd find it an insult
to say Git fully supports renames but I do think we have had better
results from *not* recording them and looking for them after the
fact with smart tools.

Junio's recent work with git-pickaxe (or whatever its name finally
settles out to be) is a perfect example of this.  Despite not having
"recorded renames" git-pickaxe is able to fairly accurately detect
blocks of code moving between files, of which renaming files is just
a special case.  This provides some fairly accurate blame reporting
pointing to exactly which commit/author/datetime put a given line
of code into the project.

No additional metadata required.  All existing repositories can
immediately benefit from the new tool.  Rather slick if you ask me.

-- 
Shawn.


* Re: VCS comparison table
  2006-10-17  5:08       ` Aaron Bentley
  2006-10-17  5:25         ` Carl Worth
  2006-10-17  5:31         ` Shawn Pearce
@ 2006-10-17  6:23         ` Junio C Hamano
  2006-10-17 18:52           ` J. Bruce Fields
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  3 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-17  6:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: git

Aaron Bentley <aaron.bentley@utoronto.ca> writes:

> Johannes Schindelin wrote:
>
>>> You'll note we referred to that behavior on the page.  We don't think
>>> what Git does is the same as supporting renames.  AIUI, some Git users
>>> feel the same way.
>> 
>> Oh, we start another flamewar again?
>
> I'd hope not.  It sounds as though you feel that supporting renames in
> the data representation is *wrong*, and therefore it should be an insult
> to you if we said that Git fully supported renames.

Not recording and not supporting are quite different things.

What we don't do is to _record_ renames in the data structure.
I personally would not use a word as strong as _wrong_ (and
Linus may disagree), but (1) we can support renames without
recording them just fine, (2) recording renames would not help
to tell users about line movements across files which we would
want to do, and (3) we are getting closer to coming up with a way
to even do (2) without recording renames.  Given these, perhaps
I might say recording renames is _pointless_ when I am in a good
mood.


* Re: VCS comparison table
  2006-10-17  0:23           ` Linus Torvalds
  2006-10-17  0:36             ` Johannes Schindelin
  2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
@ 2006-10-17  7:26             ` Christian MICHON
  2 siblings, 0 replies; 806+ messages in thread
From: Christian MICHON @ 2006-10-17  7:26 UTC (permalink / raw)
  To: git

On 10/17/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Well, you can just add
>
>        [alias]
>                cat=-p cat-file -p
>
> to your ~/.gitconfig file, and you're there.

_WONDERFUL_. Really :)

-- 
Christian


* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
@ 2006-10-17  7:50         ` Andreas Ericsson
  2006-10-17 14:05           ` Aaron Bentley
  2006-10-17  8:30         ` Jakub Narebski
                           ` (2 subsequent siblings)
  3 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17  7:50 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> Linus Torvalds wrote:
>> On Mon, 16 Oct 2006, Aaron Bentley wrote:
>>> Bazaar's namespace is "simple" because all branches can be named by a
>>> URL, and all revisions can be named by a URL + a number.
> 
>> I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
>> name a revision in a distributed environment, though. I bet the "number" 
>> really only names a revision in one _single_ repository, right?
> 
> Right.  That's why I said all revisions can be named by a URL + a
> number, because it's the combination of the URL + a number that is
> unique.  (In bzr, each branch has a URL.)
> 

The revision will change between different repos though, so 
random-contributor A that doesn't have his repo publicised needs to send 
patches and can't log his exact problem revision somewhere, which makes 
it hard for random contributor B that runs into a similar problem but on 
a different project sometime later to find the offending code. I prefer 
the git way, but I'm a git user and probably biased.

That said, it shouldn't be impossible to add fixed, user-friendly 
bazaar-like revision numbers for git. We just have to reverse the
<committish>[^~]<number> syntax to also accept <committish>+<number>.

This would work marvelously with serial development but breaks horribly 
with merges unless the first (or last) commit on each new branch gets 
given a tag or some such.

Either way, I'm fairly certain both bazaar and git need to distribute 
information to the user in need of finding the revision (which url and 
which number vs which sha). I also imagine that the bazaar users, just 
like the git users, are sufficiently apt copy-paste people to never 
actually read the prerequisite information.

> 
>> If you give 
>> the SHA1 name, it's well-defined even between different repositories, and 
>> you can tell somebody that "revision XYZ is when the problem started", and 
>> they'll know _exactly_ which revision it is, even if they don't have your 
>> particular repository.
> 
> When two people have copies of the same revision, it's usually because
> they are each pulling from a common branch, and so the revision in that
> branch can be named.  Bazaar does use unique ids internally, but it's
> extremely rare that the user needs to use them.
> 

Well, if two people have the same revision in git, you *know* they have 
pulled from each other, because ALL objects are immutable. The point of 
"naming" the revision is moot, because it's something all SCMs can do.


>> Now _that_ is true simplicity. It does automatically mean that the names 
>> are a bit longer, but in this case, "longer" really _does_ mean "simpler".
>>
>> If you want a short, human-readable name, you _tag_ it. It takes all of a 
>> hundredth of a second to do or so.
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?
> 

I imagine the bazaar-names with url+number only have local meaning unless 
someone has access to your repository too. One of the great benefits of 
git is that each revision is *always exactly the same* no matter in 
which repository it appears. This includes file-content, filesystem 
layout and, last but also most important, history.


>>>> About "checkouts", i.e. working directories with repository elsewhere:
>>>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>>>> or symlinks, and if Nguyen Thai Ngoc Duy's proposal to have .gitdir/.git
>>>> "symref"-like file to point to repository passes, we can use that.
>>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>>> our meaning of the term.
>> Well, in the git world, it's really just one shared repository that has 
>> separate branch-namespaces, and separate working trees (aka "checkouts"). 
>> So yes, it probably matches what bazaar would call a checkout.
> 
> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.
> 

Can't all SCMs do this?

> - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.
> 

This I'm not so sure about. Anyone wanna fill out how shallow clones and 
all that jazz works?

> - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> 

Check. Well, actually, you just clone it as usual but with the --bare 
argument and it won't write out the working tree files.

>> Almost nobody seems to actually use it that way in git - it's mostly more 
>> efficient to just have five different branches in the same working tree, 
>> and switch between them. When you switch between branches in git, git only 
>> rewrites the part of your working tree that actually changed, so switching 
>> is extremely efficient even with a large repo.
> 
> You can operate that way in bzr too, but I find it nicer to have one
> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
> command also rewrites only the changed part of the working tree.
> 

Works in git as well, but each "checkout" (actually, locally referenced 
repository clone) gets a separate branch/tag namespace.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
@ 2006-10-17  8:15         ` Jakub Narebski
  2006-10-17  8:16         ` Andreas Ericsson
  2006-10-17  9:20         ` Jakub Narebski
  3 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17  8:15 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

On Tuesday 17 October 2006 06:56, Aaron Bentley wrote:
> Jakub Narebski wrote:
> > Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
> > (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.
> 
> > One cannot have universally valid revision numbers (even
> > only per branch) in distributed development. Subversion can do that only
> > because it is centralized SCM. Global numbering and distributed nature
> > doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.
> 
> > But this doesn't matter much, because you can have really lightweight
> > tags in git (especially now with packed refs support). So you can have
> > the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.
> 
> > I wonder if any SCM other than git has easy way to "rebase" a branch,
> > i.e. cut branch at branching point, and transplant it to the tip
> > of other branch. For example you work on 'xx/topic' topic branch,
> > and want to have changes in those branch but applied to current work,
> > not to the version some time ago when you have started working on
> > said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.
> 
> > What your comparison matrix lacks for example is if given SCM
> > saves information about branching point and merges, so you can
> > get where two branches diverged, and when one branch was merged into
> > another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.
> 
> > Plugins = API + detection infrastructure + loading on demand.
> > Git has API, has a kind of detection infrastructure (for commands and
> > merge strategies only), doesn't have loading on demand. You can
> > easily provide new commands (thanks to git wrapper) and new merge
> > strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.
> 
> Aaron
> 


* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
  2006-10-17  5:20         ` Shawn Pearce
  2006-10-17  8:15         ` Jakub Narebski
@ 2006-10-17  8:16         ` Andreas Ericsson
  2006-10-17 20:01           ` Aaron Bentley
  2006-10-17  9:20         ` Jakub Narebski
  3 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17  8:16 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
>> (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.
> 

What do you do once a branch has been thrown away, or has had 20 other 
branches merged into it? Does the offset-number change for the revision 
then, or do you track branch-points explicitly?

>> One cannot have universally valid revision numbers (even
>> only per branch) in distributed development. Subversion can do that only
>> because it is centralized SCM. Global numbering and distributed nature
>> doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast majority
> of cases.
> 
>> But this doesn't matter much, because you can have really lightweight
>> tags in git (especially now with packed refs support). So you can have
>> the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.
> 
>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.
> 

merge != rebase though, although they are indeed similar. Let's take the 
example of a 'master' branch and topic branch topicA. If you rebase 
topicA onto 'master', development will appear to have been serial. If 
you instead merge them, it will either register as a real merge or, if 
the branch tip of 'master' is the branch start-point of topicA, it will 
result in a "fast-forward" where 'master' is just updated to the 
branch-tip of 'topicA'.
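
The distinction between a real merge and a fast-forward can be sketched
as an ancestry check. This is a toy model, assuming history is a mapping
from commit name to parent names; merge_kind and is_ancestor are
hypothetical helper names.

```python
def is_ancestor(parents, old, new):
    """True if `old` is reachable from `new` by following parent pointers."""
    seen = set()
    stack = [new]
    while stack:
        c = stack.pop()
        if c == old:
            return True
        if c not in seen:
            seen.add(c)
            stack.extend(parents.get(c, ()))
    return False


def merge_kind(parents, target_tip, source_tip):
    """Classify what merging `source_tip` into `target_tip` would do."""
    if is_ancestor(parents, source_tip, target_tip):
        return "already up to date"
    if is_ancestor(parents, target_tip, source_tip):
        return "fast-forward"      # target just moves to the source tip
    return "merge commit"          # a new node with two parents is needed


# topicA (A, B) forked from E; master later gains F and G.
parents = {"E": ["D"], "A": ["E"], "B": ["A"], "F": ["E"], "G": ["F"]}

print(merge_kind(parents, "E", "B"))  # master still at E
print(merge_kind(parents, "G", "B"))  # master has moved on to G
```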

>> What your comparison matrix lacks for example is if given SCM
>> saves information about branching point and merges, so you can
>> get where two branches diverged, and when one branch was merged into
>> another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.
> 
>> Plugins = API + detection infrastructure + loading on demand.
>> Git has API, has a kind of detection infrastructure (for commands and
>> merge strategies only), doesn't have loading on demand. You can
>> easily provide new commands (thanks to git wrapper) and new merge
>> strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.
> 

I'm fairly certain he's talking about the API in the sense it's being 
talked about in every other application. Extensive work has been done to 
libify a lot of the git code, which means that most git commands are 
made up of less than 400 lines of C code, where roughly 80% of the code 
is command-specific (i.e., argument parsing and presentation).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


* Re: VCS comparison table
  2006-10-17  5:20         ` Shawn Pearce
@ 2006-10-17  8:21           ` Martin Pool
  0 siblings, 0 replies; 806+ messages in thread
From: Martin Pool @ 2006-10-17  8:21 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Aaron Bentley, bazaar-ng, git, Jakub Narebski

On 17 Oct 2006, Shawn Pearce <spearce@spearce.org> wrote:
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > Jakub Narebski wrote:
> > > One cannot have universally valid revision numbers (even
> > > only per branch) in distributed development. Subversion can do that only
> > > because it is centralized SCM. Global numbering and distributed nature
> > > doesn't mix... hence contents based sha1 as commit identifiers.
> > 
> > Sure.  Our UI approach is that unique identifiers can usefully be
> > abstracted away with a combination of URL + number, in the vast majority
> > of cases.
> 
> But this only works when the URL is public.  In Git I can just lookup
> the unique SHA1 for a revision in my private repository and toss it
> into an email with a quick copy and paste.  

Yes, but then people need to know how to get it out of your private
repository.  For stuff that goes into well-known repositories I suppose
it just propagates.

> With Bazaar it sounds like I'd have to do that relative to some known
> public repository, which just sounds like more work to me.

You can also name a revision using its UUID, in which case things will
work similarly to git.  We tend to often say "in r1234 of dev".

> But I don't want to see this otherwise interesting thread devolve into
> a "we do X better!" match so I'm not going to say anything further here.

Sure.

> > > I wonder if any SCM other than git has easy way to "rebase" a branch,
> > > i.e. cut branch at branching point, and transplant it to the tip
> > > of other branch. For example you work on 'xx/topic' topic branch,
> > > and want to have changes in those branch but applied to current work,
> > > not to the version some time ago when you have started working on
> > > said feature.
> > 
> > If I understand correctly, in Bazaar, you'd just merge the current work
> > into 'xx/topic'.
> 
> Git has two approaches:
> 
>  - merge: The two independent lines of development are merged
>    together under a new single graph node.  This is a merge commit
>    and has two parent pointers, one for each independent line of
>    development which was combined into one.  Up to 16 independent
>    lines can be merged at once, though 12 is the record.
> 
>  - rebase: The commits from one line of development are replayed
>    onto a totally different line of development.  This is often
>    used to reapply your changes onto the upstream branch after the
>    upstream has changed but before you send your changes upstream.
>    It can often generate more readable commit history.
> 
> I believe what you are talking about in Bazaar is the former (merge)
> while what Jakub was talking about was the latter (rebase).

For the 'rebase' operation in Bazaar you can use 'bzr graft':

  http://spacepants.org/src/bzrgraft/

-- 
Martin


* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
  2006-10-17  7:50         ` Andreas Ericsson
@ 2006-10-17  8:30         ` Jakub Narebski
  2006-10-17 11:19           ` Matthieu Moy
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
  2006-10-17 15:03         ` Linus Torvalds
  3 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17  8:30 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> Linus Torvalds wrote:

>> If you want a short, human-readable name, you _tag_ it. It takes all of a
>> hundredth of a second to do or so.
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?

Tags are propagated during clone, and during fetch/pull (getting changes
from repository). So in that sense they are global.

If you don't publish your repository, then neither tags nor <URL>+<rev no>
mean anything to anyone outside your local private repository.
 

>> Well, in the git world, it's really just one shared repository that has
>> separate branch-namespaces, and separate working trees (aka "checkouts").
>> So yes, it probably matches what bazaar would call a checkout.
> 
> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.

git clone --bare
 
> - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.

In git we usually use "git clone --local" (with repository database
hardlinked) or "git clone --shared"/"git clone --reference <repository>"
(which automatically sets alternates, i.e. file pointing to alternate
repository database) for that. This way one gets his/her own refs
namespace, so two people can work on different branches simultaneously.

An alternative solution would be to symlink .git, or .git/objects (i.e.
repository "database").

> - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

In git you can access contents _without_ checkout/working area.
For example gitweb (one of git's web interfaces) uses only repository
database and doesn't need checkout/working area.

>> Almost nobody seems to actually use it that way in git - it's mostly more
>> efficient to just have five different branches in the same working tree,
>> and switch between them. When you switch between branches in git, git only
>> rewrites the part of your working tree that actually changed, so switching
>> is extremely efficient even with a large repo.
> 
> You can operate that way in bzr too, but I find it nicer to have one
> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
> command also rewrites only the changed part of the working tree.

Luben (IIRC) works this way.


* Re: VCS comparison table
  2006-10-17  4:56       ` Aaron Bentley
                           ` (2 preceding siblings ...)
  2006-10-17  8:16         ` Andreas Ericsson
@ 2006-10-17  9:20         ` Jakub Narebski
  2006-10-17  9:40           ` Robert Collins
  2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
  3 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17  9:20 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Well, <ref>~<n> means <n>-th _parent_ of a given ref, which for branches
>> (which constantly change) is a moving target.
> 
> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> positive numbers to refer to the number of commits that have been made
> since the branch was initialized.

How does that work with branching points and with merges? For example,
in the case depicted below, how do you refer to the commit marked X?

          ---- time --->

    --*--*--*--*--*--*--*--*--*-- <branch>
          \            /
           \-*--X--*--/

The branch it used to be on is gone...


Besides, in git a commit object has pointers (in the form of sha1 ids)
to all its parents. So <ref>^ (parent of <ref>), or <ref>^<m> (m-th
parent of <ref>), or <ref>~<n> (n-th ancestor in the 1st-parent lineage
of <ref>) are natural, and fast. <ref>+<n> (which would make yet another
character forbidden in branch names) would need either a database mapping
serial numbers (per repository or per branch) to commit ids, or walking
the full history and looking the commit up there.

Branches in git are remembered not by their starting points, but by
their tips (ending points).
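
A toy sketch of why tip-relative names are cheap: each step only follows
a parent pointer. The history below loosely mirrors the diagram above;
the commit names (m1..m9, s1, X, s3) and the two helper functions are
hypothetical.

```python
def nth_parent(parents, commit, m=1):
    """<commit>^<m>: the m-th parent of a commit (1-based)."""
    return parents[commit][m - 1]


def first_parent_ancestor(parents, commit, n):
    """<commit>~<n>: follow the first parent n times."""
    for _ in range(n):
        commit = parents[commit][0]
    return commit


# Mainline m1..m9; a side branch s1, X, s3 forks from m2 and is merged in m7.
parents = {f"m{i}": [f"m{i - 1}"] for i in range(2, 10)}
parents.update({"s1": ["m2"], "X": ["s1"], "s3": ["X"]})
parents["m7"] = ["m6", "s3"]   # merge commit: first parent is the mainline

# Equivalent of <branch>~2^2~1 starting from the branch tip m9.
merge_commit = first_parent_ancestor(parents, "m9", 2)
side_tip = nth_parent(parents, merge_commit, 2)
print(first_parent_ancestor(parents, side_tip, 1))
```

The commit marked X stays reachable from the tip even though the branch
it was made on no longer has a name, but the expression changes whenever
the tip moves, which is the "moving target" problem mentioned earlier.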

>> One cannot have universally valid revision numbers (even
>> only per branch) in distributed development. Subversion can do that
>> only because it is centralized SCM. Global numbering and distributed
>> nature doesn't mix... hence contents based sha1 as commit identifiers.
> 
> Sure.  Our UI approach is that unique identifiers can usefully be
> abstracted away with a combination of URL + number, in the vast
> majority of cases.

Git could do that too, by having a file (or files) with a serial number
or branch/tag+serial number to commit id mapping. But this would
have to be a local matter. And it would take some disk space, and
would seriously affect fetch performance (now git just downloads
what it doesn't have and dumps it into the repository database).
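
A small Python sketch of why such a mapping can only be local. It assumes
(hypothetically) that each repository numbers commits in the order in
which it first saw them; the commit labels and the function name are made
up for the example.

```python
def number_commits(arrival_order):
    """Assign local serial numbers 1, 2, 3, ... in order of arrival."""
    return {serial: commit for serial, commit in enumerate(arrival_order, start=1)}


# Alice committed "a1" and then fetched Bob's "b1";
# Bob committed "b1" and then fetched Alice's "a1".
alice = number_commits(["root", "a1", "b1"])
bob = number_commits(["root", "b1", "a1"])

# Both repositories now hold the same commits, but serial number 2
# names a different commit in each of them.
same_commits = set(alice.values()) == set(bob.values())
same_number_two = alice[2] == bob[2]
```

Here same_commits is true while same_number_two is false, so a number
is only meaningful together with the repository it came from.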

BTW, what if a repository is moved from one URL to another, for example
to a different host? Do all "abstracted away" identifiers get
invalidated?

>> But this doesn't matter much, because you can have really lightweight
>> tags in git (especially now with packed refs support). So you can have
>> the namespace you want.
> 
> The nice thing about revision numbers is that they're implicit-- no one
> needs to take any action to update them, and so you can always use them.

Two words: post-commit hook. You can automate the action of adding tags
(especially now with packed refs, which means that we can have a huge number
of tags without affecting performance due to I/O, or repository size).

>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature.
> 
> If I understand correctly, in Bazaar, you'd just merge the current work
> into 'xx/topic'.

That is the alternative solution, but it means that the merge would be
recorded (unless you squash it). And for published branches (like 'next'
for example) it is the better solution, because rebase is in fact rewriting
history.

But rebase means that you had

                 A---B---C topic
                /
           D---E---F---G master

Rebasing 'topic' branch on top of master would mean that you would get

                         A'--B'--C' topic
                        /
           D---E---F---G master

where A', B', C' represent the same changesets as A, B, C, up to resolved
conflicts.
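
A minimal Python sketch of why the rebased commits are new commits. It
assumes a toy commit id that hashes only the parent id and the change
text (a real git commit id also covers the tree, author, committer and
message), and it ignores conflicts entirely. The function names are
invented for the example.

```python
import hashlib


def commit_id(parent_id, change):
    """Toy commit id: a hash over the parent id and the change text."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()


def build_chain(base_id, changes):
    """Apply changes one after another on top of base_id; return the new ids."""
    ids = []
    parent = base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


# master: D---E---F---G
master = build_chain("", ["D", "E", "F", "G"])
e_id, g_id = master[1], master[3]

# topic forked from E; rebasing replays the same changes on top of G.
topic = build_chain(e_id, ["A", "B", "C"])
rebased = build_chain(g_id, ["A", "B", "C"])

ids_changed = all(old != new for old, new in zip(topic, rebased))
```

The changes A, B, C are the same, but because each id depends on the
parent id, every commit in the rebased chain gets a different id.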

And yes, that is the equivalent of "bzr graft"
  http://spacepants.org/src/bzrgraft/
Do I understand correctly that this is a third-party
contribution?

>> What your comparison matrick lacks for example is if given SCM
>> saves information about branching point and merges, so you can
>> get where two branches diverged, and when one branch was merged into
>> another.
> 
> I'm not sure what you mean about divergence.  For example, Bazaar
> records the complete ancestry of each branch, and determining the point
> of divergence is as simple as finding the last common ancestor.  But are
> you considering only the initial divergence?  Or if the branches merge
> and then diverge again, would you consider that the point of divergence?
> 
> merge-point tracking is a prerequisite for Smart Merge, which does
> appear on our matrix.

I was talking about point-of-divergence (branching point, fork point)
tracking, and merge-point tracking (or saving merge information).
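
As a sketch of what "point of divergence" can mean when only parent
pointers are stored, here is a toy Python computation of the last common
ancestor(s) of two commits. The graph reuses the A..G labels from the
rebase diagram above and adds hypothetical commits M (a merge of master
into topic), N and H; the function names are made up, and this is not
the algorithm of any particular SCM.

```python
PARENTS = {
    "D": [], "E": ["D"], "F": ["E"], "G": ["F"],   # master
    "A": ["E"], "B": ["A"], "C": ["B"],            # topic, forked at E
    "M": ["C", "G"],                               # topic merges master
    "N": ["M"],                                    # more work on topic
    "H": ["G"],                                    # more work on master
}


def ancestors(parents, commit):
    """Return the set holding commit and everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def merge_bases(parents, a, b):
    """Common ancestors of a and b that are not ancestors of another one."""
    common = ancestors(parents, a) & ancestors(parents, b)
    return {
        c for c in common
        if not any(c != other and c in ancestors(parents, other) for other in common)
    }


initial_fork = merge_bases(PARENTS, "G", "C")
after_merge = merge_bases(PARENTS, "H", "N")
```

In this toy graph the first call finds E (the initial fork point), and
after the merge the answer moves to G, the last commit both sides share.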

>> Plugins = API + detection ifrastructure + loading on demand.
>> Git has API, has a kind of detection ifrastructure (for commands and
>> merge strategies only), doesn't have loading on demand. You can
>> easily provide new commands (thanks to git wrapper) and new merge
>> strategies.
> 
> I'm not sure what you mean by API, unless you mean the commandline.  If
> that's what you mean, surely all unix commands are extensible in that
> regard.

I mean API in the most common sense. 

For commands written in C it means "engine" (plumbing) functions and
data structures which do most of the work, so writing a new command means
some command-specific code plus calls to those functions.

For commands written in shell it means having versatile plumbing
commands (for example git-rev-parse, git-rev-list, git-merge-base,
git-cat-file, etc.) which can be joined together, including via pipes
(--stdin option, --revs option to some commands), and git-sh-setup,
the common git shell setup code. 

For commands written in Perl it means the same, with the Git.pm module
instead of git-sh-setup.


About new command detection: if you put a program named git-<command>
in the directory with the rest of the git commands, then you can call it
as "git <command>" using the git wrapper. I think.

About adding new merge strategies: no autodetection, you would
have to add the new merge strategy to git-merge.sh.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:45     ` Johannes Schindelin
  2006-10-17  2:40       ` Petr Baudis
  2006-10-17  5:08       ` Aaron Bentley
@ 2006-10-17  9:33       ` Robert Collins
  2006-10-17  9:45         ` Jakub Narebski
  2 siblings, 1 reply; 806+ messages in thread
From: Robert Collins @ 2006-10-17  9:33 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

[-- Attachment #1: Type: text/plain, Size: 1208 bytes --]

On Tue, 2006-10-17 at 01:45 +0200, Johannes Schindelin wrote:
> 
> If you really, really think about it: it makes much more sense to
> record 
> your intention in the commit message. So, instead of recording for
> _every_ 
> _single_ file in folder1/ that it was moved to folder2/, it is better
> to 
> say that you moved folder1/ to folder2/ _because of some special
> reason_!

Just a small nit here: bzr does /not/ record the move of every file: it
records the rename of folder1 to folder2. One piece of data is all that's
recorded - no new manifest for the subdirectory is needed.

Of course, a user can choose to move all the contents of a folder and
not the folder itself - it's up to the user.

By recording the folder rename rather than the contents rename, we get
merges of new files added to folder1 in other branches coming into folder2
automatically, without needing to do arbitrarily deep history processing
to determine that.
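
A toy Python sketch of that effect. It assumes (as an illustration only,
not as a description of bzr's actual storage format) that every file and
directory has a stable id and that each entry names its parent directory
by id. All ids and function names are made up.

```python
def path_of(tree, file_id):
    """Build the path of an entry by walking up through parent ids."""
    parent_id, name = tree[file_id]
    if parent_id is None:
        return name
    return path_of(tree, parent_id) + "/" + name


# Each entry: file_id -> (parent_id, name).
base = {
    "dir-1": (None, "folder1"),
    "file-a": ("dir-1", "a"),
    "file-b": ("dir-1", "b"),
}

# Mainline renames the folder: exactly one entry changes.
mainline = dict(base)
mainline["dir-1"] = (None, "folder2")

# Another branch adds file c inside what it still calls folder1.
other = dict(base)
other["file-c"] = ("dir-1", "c")


def merge_added(ours, theirs, ancestor):
    """Take entries that theirs added since ancestor (no conflict handling)."""
    merged = dict(ours)
    for file_id, entry in theirs.items():
        if file_id not in ancestor:
            merged[file_id] = entry
    return merged


merged = merge_added(mainline, other, base)
new_file_path = path_of(merged, "file-c")
```

Because the new file refers to its directory by id rather than by name,
it ends up under folder2 after the merge without any history analysis.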

This also does not prevent us doing history analysis as well, to
determine other interesting things - such as cross file 'blame' as has
been mentioned in this thread. 

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-16 23:19     ` Jakub Narebski
  2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
  2006-10-17  4:56       ` Aaron Bentley
@ 2006-10-17  9:37       ` Robert Collins
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
  2006-10-17 10:06         ` Jakub Narebski
  2 siblings, 2 replies; 806+ messages in thread
From: Robert Collins @ 2006-10-17  9:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 799 bytes --]

On Tue, 2006-10-17 at 01:19 +0200, Jakub Narebski wrote:
> 
> I wonder if any SCM other than git has easy way to "rebase" a branch,
> i.e. cut branch at branching point, and transplant it to the tip
> of other branch. For example you work on 'xx/topic' topic branch,
> and want to have changes in those branch but applied to current work,
> not to the version some time ago when you have started working on
> said feature. 

Precisely how does this rebase operate in git ? 
Does it preserve revision ids for the existing work, or do they all
change?


bzr has a graft plugin which walks one branch applying all its changes
to another, preserving the user's metadata but changing the uuids for
revisions. 

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:20         ` Jakub Narebski
@ 2006-10-17  9:40           ` Robert Collins
  2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 16:41             ` Linus Torvalds
  2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
  1 sibling, 2 replies; 806+ messages in thread
From: Robert Collins @ 2006-10-17  9:40 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 726 bytes --]

On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> 
>           ---- time --->
> 
>     --*--*--*--*--*--*--*--*--*-- <branch>
>           \            /
>            \-*--X--*--/
> 
> The branch it used to be on is gone...

In bzr 0.12 this is :
2.1.2

(assuming the first * is numbered '1'.)

These numbers are fairly stable; in particular, everything's number in
the mainline will be the same number in all the branches created from it
at that point in time, but a branch that initially creates a revision or
obtains it before the mainline will have a different number until it
synchronises with the mainline via pull.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:33       ` Robert Collins
@ 2006-10-17  9:45         ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17  9:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> On Tue, 2006-10-17 at 01:45 +0200, Johannes Schindelin wrote:
>> 
>> If you really, really think about it: it makes much more sense to record 
>> your intention in the commit message. So, instead of recording for _every_ 
>> _single_ file in folder1/ that it was moved to folder2/, it is better to 
>> say that you moved folder1/ to folder2/ _because of some special
>> reason_!
> 
> Just a small nit here: bzr does /not/ record the move of every file: it
> records the rename of folder1 to folder2. One piece of data is all thats
> recorded - no new manifest for the subdirectory is needed.
> 
> Of course, a user can choose to move all the contents of a folder and
> not the folder itself - its up to the user.
> 
> By recording the folder rename rather than the contents rename, we get
> merges of new files added to folder1 in other branches come into folder2
> automatically, without needing to do arbitrarily deep history processing
> to determine that.

Hmmm... I wonder how well git manages that (merge with renamed directory).

  folder1/a  -->  folder2/a  --------> folder2/a
  folder1/b  -->  folder2/b       /    folder2/b
      \                          /     folder2/c
       \------->  folder1/a  ---/
                  folder1/b
                  folder1/c


I wonder how bzr manages "separate some files into subdirectory" (and how
well git does that), i.e. we have

   sub-file1
   sub-file2
   filea
   fileb

In the 'main' branch we separated the "sub-*" files into a subdirectory

   sub/file1
   sub/file2
   filea
   fileb

How would that merge with a new sub-* file added on the branch to be merged?

   sub-file1
   sub-file2
   sub-file3
   filea
   fileb


Or how does bzr manage sub-file-level movement, such as splitting a file
into two, or joining two files into one?


P.S. Is anyone working on a --follow option for following renames under
path limiting?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:20         ` Jakub Narebski
  2006-10-17  9:40           ` Robert Collins
@ 2006-10-17  9:59           ` Andreas Ericsson
  1 sibling, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17  9:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
> 
> About new command detection: if you put program named git-<command>
> in directory with the rest of git commands, then you can call it
> as "git <command>" using git wrapper. I think.
> 

Yup. The new command will also automagically appear in the "git help -a" 
output. Those two functions have been available since the C wrapper was 
born, although "git help -a" was the only available output for "command 
not found" until someone introduced the more newbie-friendly list that 
pops up nowadays.
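
The detection described above can be sketched in a few lines of Python.
This is only an illustration of the naming convention, not the wrapper's
actual code; the function name and the temporary directory standing in
for the git command directory are assumptions made for the example.

```python
import os
import stat
import tempfile


def find_git_commands(exec_dir):
    """List <command> for every executable file named git-<command>."""
    commands = []
    for name in sorted(os.listdir(exec_dir)):
        path = os.path.join(exec_dir, name)
        if name.startswith("git-") and os.path.isfile(path) and os.access(path, os.X_OK):
            commands.append(name[len("git-"):])
    return commands


# Demo: a throwaway directory with two "commands" and one unrelated file.
with tempfile.TemporaryDirectory() as exec_dir:
    for name in ("git-log", "git-hello", "README"):
        path = os.path.join(exec_dir, name)
        with open(path, "w") as handle:
            handle.write("#!/bin/sh\n")
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    found = find_git_commands(exec_dir)
```

Dropping a new executable called git-hello into the directory is enough
for it to show up in the list, which is the whole "plugin" mechanism for
commands.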

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
  2006-10-17 10:01           ` Sean
@ 2006-10-17 10:01           ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 10:01 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git

On Tue, 17 Oct 2006 19:37:45 +1000
Robert Collins <robertc@robertcollins.net> wrote:

> Precisely how does this rebase operate in git ? 
> Does it preserve revision ids for the existing work, or do they all
> change?
> 
> bzr has a graft plugin which walks one branch applying all its changes
> to another preserving the users metadata but changing the uuids for
> revisions. 

git rebase does exactly the same as you describe, including changing
the sha1 for each commit it moves.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:37       ` Robert Collins
       [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
@ 2006-10-17 10:06         ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 10:06 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> On Tue, 2006-10-17 at 01:19 +0200, Jakub Narebski wrote:
>> 
>> I wonder if any SCM other than git has easy way to "rebase" a branch,
>> i.e. cut branch at branching point, and transplant it to the tip
>> of other branch. For example you work on 'xx/topic' topic branch,
>> and want to have changes in those branch but applied to current work,
>> not to the version some time ago when you have started working on
>> said feature. 
> 
> Precisely how does this rebase operate in git ? 
> Does it preserve revision ids for the existing work, or do they all
> change?

Revision ids (commit ids) change, of course. Therefore rebasing published
branches is not recommended, as it is in fact rewriting history.

It is however recommended, before sending a _series_ of patches (work on that
series should be done on a topic branch), to rebase the topic branch they sit
on so that the patches apply cleanly on top of current work. Or use StGit or
another Quilt (patch management) equivalent.

> bzr has a graft plugin which walks one branch applying all its changes
> to another preserving the users metadata but changing the uuids for
> revisions. 

It looks like "bzr graft" is the same as "git rebase". It can deal with
conflicts, can't it?


P.S. It looks like we have yet another terminology conflict. In git "graft"
means "history graft", i.e. a file which changes the parents of some commits.
For example, if we have a historical repository and a current repository, we
can join them together using grafts (otherwise we would need to rewrite
history, as the sha1 which serves as commit id includes parent information),
e.g.

   x--*--*--*--*....x--*--*--*--*

    historical         current

where 'x' is 'root' (parentless) commit, '--' denotes parentship, and '....'
denotes "history graft".      
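
A rough Python sketch of how a history graft behaves. It assumes the
grafts file (.git/info/grafts) holds one line per grafted commit: the
commit id followed by the ids of the parents it should appear to have.
The short ids below are made-up labels instead of full sha1 ids, and the
function names are invented for the example.

```python
def parse_grafts(text):
    """Parse graft lines of the form '<commit> <parent> [<parent> ...]'."""
    grafts = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        commit, *parents = fields
        grafts[commit] = parents
    return grafts


def effective_parents(commit, recorded, grafts):
    """Grafted parents override the parents recorded in the commit itself."""
    return grafts.get(commit, recorded[commit])


def history(tip, recorded, grafts):
    """Walk the first-parent history from tip, honouring grafts."""
    seen = []
    commit = tip
    while commit is not None:
        seen.append(commit)
        parents = effective_parents(commit, recorded, grafts)
        commit = parents[0] if parents else None
    return seen


# h1--h2 is the historical repository, c1--c2 the current one (c1 is a root).
recorded = {"h1": [], "h2": ["h1"], "c1": [], "c2": ["c1"]}
grafts = parse_grafts("c1 h2\n")

without_graft = history("c2", recorded, {})
with_graft = history("c2", recorded, grafts)
```

The commit objects are untouched (so their ids stay the same); only the
parent lookup is overridden, which is why no history rewrite is needed.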
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:40           ` Robert Collins
@ 2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 10:47               ` Matthieu Moy
  2006-10-18  4:55               ` Robert Collins
  2006-10-17 16:41             ` Linus Torvalds
  1 sibling, 2 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17 10:08 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git

Robert Collins wrote:
> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>           ---- time --->
>>
>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>           \            /
>>            \-*--X--*--/
>>
>> The branch it used to be on is gone...
> 
> In bzr 0.12 this is :
> 2.1.2
> 

Would it be a different number in a different version of bazaar?

> (assuming the first * is numbered '1'.)
> 
> These numbers are fairly stable, in particular everything's number in
> the mainline will be the same number in all the branches created from it
> at that point in time, but a branch that initially creates a revision or
> obtains it before the mainline will have a different number until they
> syncronise with the mainline via pull.
> 

So basically anyone can pull/push from/to each other but only so long as 
they decide upon a common master that handles synchronizing of the 
number part of the url+number revision short-hands?

One thing that's been nagging me is how you actually find out the 
url+number where the desired revision exists. That is, after you've 
synced with master, or merged the mothership's master-branch into one of 
your experimental branches where you've done some work that went before 
mothership's master's current tip, do you have to have access to the 
mothership's repo (as in, do you have to be online) to find out the 
number part of url+number shorthand, or can you determine it solely from 
what you have on your laptop?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
@ 2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
                               ` (3 more replies)
  2006-10-17 10:23           ` Sean
  1 sibling, 4 replies; 806+ messages in thread
From: Sean @ 2006-10-17 10:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 00:24:15 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.

Yeah, even in git you typically don't publish your working tree when
making it available for cloning.  In fact the native git network
protocol doesn't even have a way to transfer working trees.

> - - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.

That is a very nice feature.  Git would be improved if it could
support that mode of operation as well.

> - - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).

I'm not sure what you mean here.  A bzr checkout doesn't have any history
does it?  So it's not a mirror of a branch, but just a checkout of the
branch head?

If so, Git can export a tarball of a branch (actually a snapshot as of
any given commit) which can be mirrored out.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  2006-10-17 10:23           ` Sean
@ 2006-10-17 10:23           ` Sean
  2006-10-18  6:33           ` Jeff King
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 10:23 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Johannes Schindelin, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 01:08:59 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> I can use the 'bzr missing' command to check whether my branch is in
> sync with a remote branch.  Or I can use the 'pull' command to update my
> branch to a given revno in a remote branch.

The "bzr missing" command sounds like a handy one.  

Someone on the xorg mailing list was recently lamenting that git does not
have an easy way to compare a local branch to a remote one.  While this
turns out to not be a big problem in git, it might be nice to have such
a command.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
@ 2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
                                 ` (2 more replies)
  2006-10-17 19:51             ` Aaron Bentley
                               ` (2 subsequent siblings)
  3 siblings, 3 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-17 10:30 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 17 Oct 2006, Sean wrote:

> On Tue, 17 Oct 2006 00:24:15 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
> > - - you can have working trees on local systems while having the
> >   repository on a remote system.  This makes it easy to work on one
> >   logical branch from multiple locations, without getting out of sync.
> 
> That is a very nice feature.  Git would be improved if it could
> support that mode of operation as well.

It would also make things slow as hell. How do you deal with something 
like annotate in such a setup?

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
  2006-10-17 10:35                 ` Sean
@ 2006-10-17 10:35                 ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 10:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Aaron Bentley, bazaar-ng, git

On Tue, 17 Oct 2006 12:30:27 +0200 (CEST)
Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:

> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

Some commands like annotate might not make any sense in such a set up.

But one way to get the same (perhaps even better) feature into git 
would be to support shallow clones, in which case even annotate would
continue to work even if somewhat crippled by the lack of a complete
history.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
@ 2006-10-17 10:45               ` Matthias Kestenholz
  2006-10-17 13:48               ` Aaron Bentley
  2 siblings, 0 replies; 806+ messages in thread
From: Matthias Kestenholz @ 2006-10-17 10:45 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Sean, Aaron Bentley, bazaar-ng, git

Hi,

On Tue, 2006-10-17 at 12:30 +0200, Johannes Schindelin wrote:
> Hi,
> 
> On Tue, 17 Oct 2006, Sean wrote:
> 
> > On Tue, 17 Oct 2006 00:24:15 -0400
> > Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > 
> > > - - you can have working trees on local systems while having the
> > >   repository on a remote system.  This makes it easy to work on one
> > >   logical branch from multiple locations, without getting out of sync.
> > 
> > That is a very nice feature.  Git would be improved if it could
> > support that mode of operation as well.
> 
> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

You'd probably have to do all processing server-side (git log, blame,
merges... like in subversion, where you can merge and rename/move files
remotely, IIRC). Of course, all the things which make git really useful
for me (gitk, git log with all its arguments etc.) would not be
available. Cheap checkouts would be made possible easily that way at the
cost of higher server load and an abstraction layer over network for
object access.

I don't know if that sounds reasonable at all.

	Matthias

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:08             ` Andreas Ericsson
@ 2006-10-17 10:47               ` Matthieu Moy
  2006-10-18  4:55               ` Robert Collins
  1 sibling, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 10:47 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Robert Collins, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> Robert Collins wrote:
>> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>>           ---- time --->
>>>
>>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>>           \            /
>>>            \-*--X--*--/
>>>
>>> The branch it used to be on is gone...
>>
>> In bzr 0.12 this is :
>> 2.1.2
>>
>
> Would it be a different number in a different version of bazaar?

I can't say for bzr 0.>12 which do not exist ;-)

For previous versions, it didn't have that "simple" number, and you
had to use the rev-id.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  8:30         ` Jakub Narebski
@ 2006-10-17 11:19           ` Matthieu Moy
  2006-10-17 11:45             ` Jakub Narebski
                               ` (4 more replies)
  0 siblings, 5 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 11:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> - you can use a checkout to maintain a local mirror of a read-only
>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> In git you can access contents _without_ checkout/working area.

Bazaar can do this too. For example,
"bzr cat http://something -r some-revision" gets the content of a file
at a given revision. But that's not what Aaron was referring to.

In Bazaar, checkouts can be two things:

1) a working tree without any history information, pointing to some
   other location for the history itself (a la svn/CVS/...).
   (this is "light checkout")

2) a bound branch. It's not _very_ different from a normal branch, but
   mostly "commit" behaves differently:
   - it commits both on the local and the remote branch (equivalent to
     "commit" + "push", but in a transactional way).
   - it refuses to commit if you're out of date with the branch you're
     bound to.
   (this is "heavy checkout")

In both cases, this has the side effect that you can't commit if the
"upstream" branch is read-only. That's not fundamental, but handy.

I use it for example to have several "checkouts" of the same branch on
different machines. When I commit, bzr tells me "hey, boss, you're out
of date, why don't you update first" if I'm out of date. And if commit
succeeds, I'm sure it is already committed to the main branch. I'm sure
I won't pollute my history with merges which would only be the result
of forgetting to update.

Once more, that's not fundamental, but handy.
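
The commit behaviour of a bound branch described above can be modelled
roughly as follows. This is a toy Python sketch with invented class and
method names, not Bazaar's API, and it makes no attempt at real
transactional behaviour.

```python
class Branch:
    """A branch modelled as an ordered list of revision labels."""

    def __init__(self, revisions=None):
        self.revisions = list(revisions or [])


class BoundBranch(Branch):
    """A local branch bound to a master branch (a "heavy checkout")."""

    def __init__(self, master):
        super().__init__(master.revisions)
        self.master = master

    def update(self):
        """Bring the local branch up to date with the master."""
        self.revisions = list(self.master.revisions)

    def commit(self, revision):
        """Commit to master and local together, or refuse if out of date."""
        if self.revisions != self.master.revisions:
            raise RuntimeError("out of date with master; update first")
        self.master.revisions.append(revision)
        self.revisions.append(revision)


master = Branch(["r1"])
laptop = BoundBranch(master)
desktop = BoundBranch(master)

laptop.commit("r2")          # lands on master immediately
try:
    desktop.commit("r3")     # desktop has not seen r2 yet
    refused = False
except RuntimeError:
    refused = True

desktop.update()
desktop.commit("r3")         # succeeds once up to date
```

In this model the master's history stays linear (r1, r2, r3): the second
machine is told to update instead of creating a merge.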

The more fundamental thing I suppose is that it allows people to work
in a centralized way (checkout/commit/update/...), and Bazaar was
designed to allow several different workflows, including the
centralized one.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  2006-10-17 11:38               ` Sean
@ 2006-10-17 11:38               ` Sean
  2006-10-17 12:03                 ` Matthieu Moy
  2006-10-21 14:13               ` Jan Hudec
  2 siblings, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-17 11:38 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 13:19:08 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")

Git can do this from a local repository, it just can't do it from
a remote repo (at least over the git native protocol).  However,
over gitweb you can grab and unpack a tarball from a remote repo.
In practice this is probably enough support for such a feature.

> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

This doesn't sound right, at least in the spirit of git.  Git really
wants to have a local commit which you may or may not push to a
remote repo at a later time.  There is no upside to forcing it all to
happen in one step, and a lot of downsides.  Git's focus is to support
distributed offline development, not requiring a remote repo to be
available at commit time.
 
> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.

Again this seems really anti-git.  There is no reason for your local
branch to be marked read only just because some upstream branch is
so marked.

> I use it for example to have several "checkouts" of the same branch on
> different machines. When I commit, bzr tells me "hey, boss, you're out
> of date, why don't you update first" if I'm out of date. And if commit
> succeeds, I'm sure it is already committed to the main branch. I'm sure
> I won't pollute my history with merges which would only be the result
> of forgetting to update.

This is exactly the same in Git.  You really only ever push upstream
when your local changes fast forward the remote (i.e. you're up to date).
Git will warn you if your changes don't fast forward the remote.
 
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.

While Git really isn't meant to work in a centralized way there's nothing
preventing such a work flow.  It just requires the use of some surrounding
infrastructure.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
@ 2006-10-17 11:45             ` Jakub Narebski
  2006-10-17 12:02               ` Jakub Narebski
                                 ` (2 more replies)
  2006-10-17 12:00             ` Andreas Ericsson
                               ` (3 subsequent siblings)
  4 siblings, 3 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 11:45 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>
>> In git you can access contents _without_ checkout/working area.
> 
> Bazaar can do this too. For example,
> "bzr cat http://something -r some-revision" gets the content of a file
> at a given revision. But that's not what Aaron was referring to.

Git cannot do that remotely yet (with the exception of
git-tar-tree/git-archive, which has a --remote option). But you can get
the contents of a file (with
"git cat-file -p [<revision>:|:<stage>:]<filename>"), list a directory
(with "git ls-tree <tree-ish>") and compare files or directories (the
git diff family of commands) without needing a working directory.
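
A small Python sketch of how such commands could be driven against a
repository that has no working area. The helper names, the repository
path "project.git", the revision "HEAD" and the file name "README" are
assumptions chosen for illustration; only "git --git-dir=<path>",
"git cat-file -p <revision>:<path>" and "git ls-tree <tree-ish>" are
real git invocations.

```python
import subprocess


def cat_file_command(git_dir, revision, path):
    """Build the argv that prints one file as of a given revision."""
    return ["git", f"--git-dir={git_dir}", "cat-file", "-p",
            f"{revision}:{path}"]


def ls_tree_command(git_dir, tree_ish):
    """Build the argv that lists the entries of a tree."""
    return ["git", f"--git-dir={git_dir}", "ls-tree", tree_ish]


def read_file_at(git_dir, revision, path):
    """Run git and return the file contents as bytes.

    Needs git installed and an existing repository at git_dir.
    """
    result = subprocess.run(cat_file_command(git_dir, revision, path),
                            check=True, capture_output=True)
    return result.stdout


# Hypothetical repository path and file name, for illustration only.
print(" ".join(cat_file_command("project.git", "HEAD", "README")))
print(" ".join(ls_tree_command("project.git", "HEAD")))
```

Nothing here touches a checkout: both commands read straight from the
object database named by --git-dir.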
 
AFAICT working area is required _only_ to resolve conflicts during 
merge.

> In Bazaar, checkouts can be two things:
> 
> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")
> 
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

In git, by default, the top directory of the working area contains a
.git directory which holds the whole repository (object database, refs
(i.e. branches and tags), information about which branch is current,
the index aka. gitcache, configuration, etc.). You can share the object
database locally (which includes over a network filesystem).

You can have a .git directory (then usually named <project>.git)
without a working area.

And you can symlink (and in the future "symref"-link) the .git directory.

> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.

There was a proposal to allow tracking branches to be marked
read-only, but it has not been implemented yet.

But git has the reverse check: it refuses (unless forced by the user) to
fetch into a branch which has local changes (i.e. does not
fast-forward). This makes sure that no information is lost.

The idea is that you fetch changes into a tracking branch (e.g. the
'master' branch of some parent remote repository into the 'origin' or
'remotes/<repository name>/master' branch); you don't commit changes to
such a branch. You do your own work either on the 'master' branch, then
merge (typically using "git pull") the corresponding 'origin' tracking
branch, or use a separate private feature branch and rebase after fetch.

[...]
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.

Git is designed for distributed workflows, not for centralized one.
All repositories are created equal :-)

-- 
Jakub Narebski
ShadeHawk on #git and #revctl
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
  2006-10-17 11:45             ` Jakub Narebski
@ 2006-10-17 12:00             ` Andreas Ericsson
  2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 14:19             ` Olivier Galibert
                               ` (2 subsequent siblings)
  4 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17 12:00 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>> In git you can access contents _without_ checkout/working area.
> 
> Bazaar can do this too. For example,
> "bzr cat http://something -r some-revision" gets the content of a file
> at a given revision. But that's not what Aaron was referring to.
> 
> In Bazaar, checkouts can be two things:
> 
> 1) a working tree without any history information, pointing to some
>    other location for the history itself (a la svn/CVS/...).
>    (this is "light checkout")
> 
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")
> 

What about

3) getting the repo with all the history while still not having to be 
online to actually commit to *your* copy of the repo. When you later get 
online, you can send all your changes in a big hunk, or let bazaar email 
them to the maintainer as patches, or...

> In both cases, this has the side effect that you can't commit if the
> "upstream" branch is read-only. That's not fundamental, but handy.
> 

It appears we have different ideas of what's handy. Perhaps it's just a 
difference in workflow, or lack of "email-commits-as-patches" tools in 
bazaar, but the ability to commit to whatever branch I like in my local 
repo and then just send the diffs by email or please-pull requests to 
upstream authors is what makes git work so well for me. I can of course
also pull the changes to another branch, or cherrypick them one by one, 
or...

OTOH, if by "commit" you mean "send your changes back to central 
server", and bazaar'ish for "register my current set of changes in the 
local clone of the repo" is called something else, it sounds very 
similar to what git does.

> 
> The more fundamental thing I suppose is that it allows people to work
> in a centralized way (checkout/commit/update/...), and Bazaar was
> designed to allow several different workflows, including the
> centralized one.
> 

Centralized works in git too after a fashion. Most projects have a 
master repo hidden somewhere that frequently gets pushed out for 
publishing and which most (all?) contributors sync against from time to 
time, but it's by no means a certainty. What *is* a certainty is that 
the published branches are exactly identical to the ones in the master 
repo, and all the downstream authors will get a history where they can 
easily track master's development.

For git, I suppose Junio has the hidden master repo which he publishes 
at kernel.org. Linus does the same with the Linux repo.

On a side-note, it sounds as though the "bound branch" scenario 
encourages making a big change as one mega-diff, so long as it 
implements one feature, whereas the git workflow with topic-branches 
that eventually get merged to master allows changes to sort of
accumulate up to a feature in the steps one actually has to take to make 
the feature work.

Side-note 2: Three really great things that have made work a lot easier 
and more enjoyable since we changed from cvs to git and that aren't 
mentioned in the comparison table:
* Dependency/history graph display tools à la qgit/gitk
* Bisection tool for finding bug introduction revisions.
* Tools for sending commits as emails.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:45             ` Jakub Narebski
@ 2006-10-17 12:02               ` Jakub Narebski
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
  2006-10-17 13:33               ` Matthieu Moy
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 12:02 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git

Jakub Narebski wrote:
> In git by default in the top directory of working area you have .git 
> directory which contains whole repository (object database, refs (i.e. 
> branches and tags), information which branch is current, index aka. 
> gitcache, configuration, etc.). You can share object database locally 
> (which includes network filesystem).
> 
> You can have .git (usually <project>.git then) directory without working 
> area.

So called "bare" repository.
> 
> And you can symlink (and in the future "symref"-link) .git directory.

And you can use the GIT_DIR environment variable or the --git-dir
option to the git wrapper.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:38               ` Sean
@ 2006-10-17 12:03                 ` Matthieu Moy
  2006-10-17 12:56                   ` Jakub Narebski
                                     ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 12:03 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Sean <seanlkml@sympatico.ca> writes:

> On Tue, 17 Oct 2006 13:19:08 +0200
> Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
>
>> 1) a working tree without any history information, pointing to some
>>    other location for the history itself (a la svn/CVS/...).
>>    (this is "light checkout")
>
> Git can do this from a local repository, it just can't do it from
> a remote repo (at least over the git native protocol).  However,
> over gitweb you can grab and unpack a tarball from a remote repo.
> In practice this is probably enough support for such a feature.

Anyway, given the price of disk space today, this only makes sense if
you have a fast access to the repository (otherwise, you consider your
local repository as a cache, and you're ready to pay the disk space
price to save your bandwidth). In this case, it's often in your
filesystem (local or NFS).

>> 2) a bound branch. It's not _very_ different from a normal branch, but
>>    mostly "commit" behaves differently:
>>    - it commits both on the local and the remote branch (equivalent to
>>      "commit" + "push", but in a transactional way).
>>    - it refuses to commit if you're out of date with the branch you're
>>      bound to.
>>    (this is "heavy checkout")
>
> This doesn't sound right, at least in the spirit of git.  Git really
> wants to have a local commit which you may or may not push to a
> remote repo at a later time.  There is no upside to forcing it all to
> happen in one step, and a lot of downsides.  Gits focus is to support
> distributed offline development, not requiring a remote repo to be
> available at commit time.

I lied in my above description ;-).

I should have said "by default" ... but you have "commit --local" if
you want to have a local commit on a bound branch (at this point, I
should remind that not all branches are "bound branches". "bzr branch"
creates branches similar to git ones).

>> In both cases, this has the side effect that you can't commit if the
>> "upstream" branch is read-only. That's not fundamental, but handy.
>
> Again this seems really anti-git.  There is no reason for your local
> branch to be marked read only just because some upstream branch is
> so marked.

Well, take the example of my bzr setup.

I have one repository, say, $repo.

In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
http://bazaar-vcs.org's branch.

I also have branches for patches (occasional in my case) that I'll
send to upstream. Say $repo/feature1, $repo/feature2, ...

If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
commit time, create a branch, and commit in this new branch. I believe
git manages this in a different way, allowing you to commit in this
branch, and creating the branch next time you pull. But you know this
better than I ;-), I never got time to give a real try to git.

>> I use it for example to have several "checkouts" of the same branch on
>> different machines. When I commit, bzr tells me "hey, boss, you're out
>> of date, why don't you update first" if I'm out of date. And if commit
>> succeeds, I'm sure it is already committed to the main branch. I'm sure
>> I won't pollute my history with merges which would only be the result
>> of forgetting to update.
>
> This is exactly the same in Git.  You really only ever push upstream
> when your local changes fast forward the remote (i.e. you're up to date).
> Git will warn you if your changes don't fast forward the remote.

Yes, but you will have to do a merge at some point, right? While I'm
keeping a purely linear history (not that it is good in the general
case, but for "projects" on which I'm the only developer, I find it
good. For example, my ${HOME}/etc/).

But don't get me wrong, I also prefer the decentralized way in most
cases. And I'm happy that bzr and git work like this by default. Just
that at least *I* have cases where a centralized approach suits me
better, and then I'm happy with that particular feature of bzr.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
@ 2006-10-17 12:07                 ` Sean
  2006-10-17 12:07                 ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 12:07 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthieu Moy, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 13:45:31 +0200
Jakub Narebski <jnareb@gmail.com> wrote:

> Git cannot do that remotely (with exception of git-tar-tree/git-archive 
> which has --remote option), yet. But you can get contents of a file 
> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list 
> directory (with "git ls-tree <tree-ish>") and compare files or 
> directories (git diff family of commands) without need for working 
> directory.

Interesting, I didn't know about the --remote option.  So in fact as long
as the remote has enabled upload-tar then anyone can do a "light checkout".
However, it appears that kernel.org for instance doesn't enable this feature.

Sean
  

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
@ 2006-10-17 12:56                   ` Jakub Narebski
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 12:56 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Sean, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Matthieu Moy wrote:
>> This is exactly the same in Git.  You really only ever push upstream
>> when your local changes fast forward the remote (i.e. you're up to date).
>> Git will warn you if your changes don't fast forward the remote.
> 
> Yes, but you will have to do a merge at some point, right? While I'm
> keeping a purely linear history (not that it is good in the general
> case, but for "projects" on which I'm the only developer, I find it
> good. For example, my ${HOME}/etc/).

Fast-forward doesn't result in merge.

If you have

  1---2---3        <branch 1, or branch locally>
           \
            4---5  <branch 2, or branch at remote>

then this is fast-forward case. After pull (or push) you have

  1---2---3---4---5 <branch 1>

without merge.
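
The same check can be sketched in a few lines of Python. This is only a
toy model of the history graph above (a dict mapping each commit to its
parents), not git's implementation; the function names and the extra
commit "6" are made up for the example.

```python
def is_ancestor(parents, ancestor, descendant):
    """Return True if `ancestor` is reachable from `descendant`."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return False


def can_fast_forward(parents, current_tip, new_tip):
    """A branch can fast-forward when its tip is an ancestor of the new tip."""
    return is_ancestor(parents, current_tip, new_tip)


# The history drawn above: 1---2---3---4---5
parents = {"1": [], "2": ["1"], "3": ["2"], "4": ["3"], "5": ["4"]}

# The local branch is at 3, the remote at 5: moving the branch pointer is enough.
print(can_fast_forward(parents, "3", "5"))

# Hypothetical extra commit 6, made locally on top of 3: the histories
# have diverged, so a merge (or a rebase) is needed.
parents["6"] = ["3"]
print(can_fast_forward(parents, "6", "5"))
```

In the first case no new commit is created at all; the branch simply
ends up pointing at 5.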

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
@ 2006-10-17 12:57                     ` Sean
  2006-10-17 13:44                       ` Matthieu Moy
  2006-10-17 12:57                     ` VCS comparison table Sean
  1 sibling, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-17 12:57 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 14:03:21 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> Anyway, given the price of disk space today, this only makes sense if
> you have a fast access to the repository (otherwise, you consider your
> local repository as a cache, and you're ready to pay the disk space
> price to save your bandwidth). In this case, it's often in your
> filesystem (local or NFS).

This is most likely the reason that people using Git don't clamor
more for the ability to work without a local repository.  Disk is cheap
and it just makes sense the vast majority of the time to have a complete
copy of the repository yourself.  There are a lot of powerful things
you can do once you have all that information in your repo.  Not the least
of which is performing any and all operations while flying on a plane
or sitting on a park bench.

> I should have said "by default" ... but you have "commit --local" if
> you want to have a local commit on a bound branch (at this point, I
> should remind that not all branches are "bound branches". "bzr branch"
> creates branches similar to git ones).

Well, with Git the default is to only commit locally.  Of course, you
could set your post commit hook to always push it to a remote if
you wanted to.

> Well, take the example of my bzr setup.
> 
> I have one repository, say, $repo.
> 
> In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
> http://bazaar-vcs.org's branch.
> 
> I also have branches for patches (occasional in my case) that I'll
> send to upstream. Say $repo/feature1, $repo/feature2, ...
> 
> If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
> commit time, create a branch, and commit in this new branch. I believe
> git manages this in a different way, allowing you to commit in this
> branch, and creating the branch next time you pull. But you know this
> better than I ;-), I never got time to give a real try to git.

Well, it's just a slight difference in perspective rather than any
big issue here.  Git treats all repositories as peers, so it would never
assume that just because one other particular repo has a branch marked
as read only that it should be marked read only locally.  It lets you
commit to it, and then push to say a third and fourth repo that are
writable as well.  In practice this doesn't really cause any
insurmountable problems.

> Yes, but you will have to do a merge at some point, right? While I'm
> keeping a purely linear history (not that it is good in the general
> case, but for "projects" on which I'm the only developer, I find it
> good. For example, my ${HOME}/etc/).

Well if you're committing changes from multiple different machines,
how is that different from having say 3 different developers committing
changes to the central repo?  How does bzr avoid a merge when you're
pushing changes from 3 separate machines? 

You mentioned that if you try to push and you're not up to date you'll
be prompted to update (ie. pull from the upstream repo).  When you do such
a pull do your local changes get rebased on top or is there a merge?   By
your comments I guess you're saying they're rebased rather than merged, and
this is how you keep a linear history.  Git can do this easily, but it's
not done by default.
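
To make the rebase case concrete, here is a toy Python model in which a
branch is just a list of commit names. It is a sketch under that
assumption, not how git or bzr store history; the function names are
invented, and a trailing "'" marks a commit that was replayed.

```python
def common_prefix_length(first, second):
    """Count how many leading commits the two histories share."""
    length = 0
    for left, right in zip(first, second):
        if left != right:
            break
        length += 1
    return length


def rebase(local, upstream):
    """Replay the local-only commits on top of the upstream tip.

    The result is linear: all upstream commits first, then rewritten
    copies of the local commits.
    """
    base = common_prefix_length(local, upstream)
    replayed = [commit + "'" for commit in local[base:]]
    return list(upstream) + replayed


upstream = ["1", "2", "3", "4"]      # someone else pushed 4
local = ["1", "2", "3", "a", "b"]    # local work based on 3

print(rebase(local, upstream))
```

A merge would instead keep "a" and "b" as they are and add one commit
with two parents, which is exactly the non-linear history being
discussed.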

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  3:52             ` Sam Vilain
@ 2006-10-17 12:59               ` Jon Smirl
  0 siblings, 0 replies; 806+ messages in thread
From: Jon Smirl @ 2006-10-17 12:59 UTC (permalink / raw)
  To: Sam Vilain; +Cc: Petr Baudis, Jakub Narebski, git

On 10/16/06, Sam Vilain <sam@vilain.net> wrote:
> Jon Smirl wrote:
> > cvsps works ok on small amounts of data, but it can't handle the full
> > Mozilla repo. The current idea is to convert the full repo with
> > cvs2git and build the ini file needed by cvsps to support incremental
> > imports. After that use cvsps.
> >
>
> Looking through the client.mk used to check out the sub-portions of the
> CVS repository, I have to ask;
>
> Why are you trying to import this big collection of projects into a
> single git repository?

All of Mozilla is in a single CVS repo, client.mk is checking out
directories from the mozilla project. This is how it has been
historically for over ten years. It also allows commits that
simultaneously go to all subcomponents when interfaces are changed.
Even if it was split into different git repos you still need to
download about 70% of them to build the browser.

I've been trying to simply translate the existing repo without
changing its structure in any way. Changing structure is going to
require a lot of buy-in from all of the developers.

>
> View git's repositories not as a container for an entire community's
> code base, but more as object partitions.  Currently you are quite happy
> to use per-file version control partitions inherent to CVS.  Now you are
> looking at removing all of the partitions completely and hoping to end
> up with something manageable.  That it has been possible at all to fit it
> into the space less than the size of a CD is staggering, but surely a
> piecemeal approach would be a pragmatic solution to this problem.
>
> Sam.
>


-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:00             ` Andreas Ericsson
@ 2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:01                 ` Andreas Ericsson
  0 siblings, 2 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:27 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> What about
>
> 3) getting the repo with all the history while still not having to be
> online to actually commit to *your* copy of the repo. When you later
> get online, you can send all your changes in a big hunk, or let bazaar
> email them to the maintainer as patches, or...

Well, the discussion was about checkouts, so I was talking about
checkouts ;-).

What you mention is the default behavior of Bazaar when you use 
"bzr branch" or "bzr get". BTW, it's also possible to do this with a
heavy checkout, that's "commit --local".

> It appears we have different ideas of what's handy. Perhaps it's just
> a difference in workflow, or lack of "email-commits-as-patches" tools
> in bazaar,

You have "bzr bundle" in Bazaar, and there was work to have it
actually send the email ( http://bazaar-vcs.org/SubmitByMail ), but I
don't think it's finished yet.

And yes, this is a great feature, the first time I used it was with
Darcs, and I was impressed how easy I could submit a patch without any
setup and with a 5-lines tutorial. Even wiki seems complex after
that ;-).

> but the ability to commit to whatever branch I like in my local repo
> and then just send the diffs by email or please-pull requests to
> upstream authors is what makes git work so well for me.

Sure. Once again, Bazaar does it this way too. There's an _additional
feature_ called checkout which allows you to work in another way,
though. Like most "features", it's not useful to everybody.

And I repeat that I'm in no way arguing against the git model :-).

> Side-note 2: Three really great things that have made work a lot
> easier and more enjoyable since we changed from cvs to git and that
> aren't mentioned in the comparison table:

Sure. And regarding this, hopefully, most modern VCS go in the same
direction.

> * Dependency/history graph display tools á la qgit/gitk

http://bazaar-vcs.org/bzr-gtk
http://samba.org/~jelmer/bzr/bzrk.png

> * Bisection tool for finding bug introduction revisions.

This took time to come in bzr, but that's the bisect plugin:

http://bazaar-vcs.org/PluginRegistry

> * Tools for sending commits as emails.

(Surprisingly, I had added this in the table, but has been removed for
some obscure reasons)

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:45             ` Jakub Narebski
  2006-10-17 12:02               ` Jakub Narebski
       [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
@ 2006-10-17 13:33               ` Matthieu Moy
  2 siblings, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> But git has reverse check: it forbids (unless forced by user) to fetch 
> into branch which has local changes (does not fast-forward).

Same as bzr then I believe. "bzr pull" will suggest you to use "merge"
in this situation, unless you say "pull --overwrite".
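
The fast-forward check described above can be sketched as a toy model. This is not how git or bzr implement it; the `history` dictionary and the commit ids are made up for illustration. A pull may simply move the branch when the local tip is already part of the remote tip's history.

```python
# Toy model: a history maps each commit id to the list of its parent ids.
def ancestors(history, commit):
    """Return the set containing `commit` and everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current])
    return seen


def is_fast_forward(history, local_tip, remote_tip):
    """True if moving the branch to remote_tip loses no local commit."""
    return local_tip in ancestors(history, remote_tip)


# Hypothetical history: "c" is remote work on top of "b",
# "x" is a local commit that also sits on top of "b".
history = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"]}

print(is_fast_forward(history, "b", "c"))  # no local commits: plain update
print(is_fast_forward(history, "x", "c"))  # diverged: merge (or force) needed
```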

>> The more fundamental thing I suppose is that it allows people to work
>> in a centralized way (checkout/commit/update/...), and Bazaar was
>> designed to allow several different workflows, including the
>> centralized one.
>
> Git is designed for distributed workflows, not for centralized one.
> All repositories are created equal :-)

Note that "bound branches" and "other branches" in bzr are not so
different. The "master" (the one you make a checkout of) doesn't have
to know it has checkouts, and the "checkout" just has one file
pointing to the "master", and you can switch from one flow to the
other with "bzr bind/unbind".

So, in Bazaar, all repositories are /almost/ created equal ;-).

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:57                     ` Sean
@ 2006-10-17 13:44                       ` Matthieu Moy
       [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 13:44 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Sean <seanlkml@sympatico.ca> writes:

>> Yes, but you will have to do a merge at some point, right ? While I'm
>> keeping a purely linear history (not that it is good in the general
>> case, but for "projects" on which I'm the only developer, I find it
>> good. For example, my ${HOME}/etc/).
>
> Well if you're committing changes from multiple different machines,
> how is that different from having say 3 different developers committing
> changes to the central repo?

The workflow is different.

If I commit broken changes on a repository shared by multiple
developers, they'll insult me, and they'll be right. While I find
nothing wrong in committing broken changes to my ${HOME}/etc/ when
leaving the office, and fix it from home.

> How does bzr avoid a merge when you're pushing changes from 3
> separate machines?

Err, the same way people have been doing for years ;-). If you don't
have local commits, "bzr update" will work in the same way as "cvs
update", it keeps your local changes, without recording history. Like
"git pull" does if you have uncommitted changes I think.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:30             ` Johannes Schindelin
       [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
  2006-10-17 10:45               ` Matthias Kestenholz
@ 2006-10-17 13:48               ` Aaron Bentley
  2 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 13:48 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Sean, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Johannes Schindelin wrote:
> On Tue, 17 Oct 2006, Sean wrote:
>>Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

>>>- you can have working trees on local systems while having the
>>>  repository on a remote system.  This makes it easy to work on one
>>>  logical branch from multiple locations, without getting out of sync.
>>
>>That is a very nice feature.  Git would be improved if it could
>>support that mode of operation as well.
> 
> 
> It would also make things slow as hell. How do you deal with something 
> like annotate in such a setup?

For the particular case of annotate, bzr is designed to store
annotations at commit time.  So annotate should require remote access to
a small amount of data from two files-- not a great cost.

But our default form of checkout contains a local copy of all history
data, so that readonly operations happen at local speed.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNN8Y0F+nu1YWqI0RAqXtAJ4qKGQ5ZwlMF795kz3udeuRTcRy6wCghr53
tjw9cNVxzrQ0XSUO2v52ZIo=
=W6q7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:27               ` Matthieu Moy
@ 2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:08                   ` Matthieu Moy
  2006-10-18 18:03                   ` Jeff Licquia
  2006-10-17 14:01                 ` Andreas Ericsson
  1 sibling, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 13:55 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Matthieu Moy wrote:
>> Side-note 2: Three really great things that have made work a lot
>> easier and more enjoyable since we changed from cvs to git and that
>> aren't mentioned in the comparison table:
> 
> Sure. And regarding this, hopefully, most modern VCS go in the same
> direction.
> 
> > * Dependency/history graph display tools á la qgit/gitk
> 
> http://bazaar-vcs.org/bzr-gtk
> http://samba.org/~jelmer/bzr/bzrk.png

Hmmm... most of the tools look similar. Git has gitk (Tcl/Tk, now in 
git.git repository), QGit (Qt), GitView (GTK+, in contrib/), 
git-browser (JavaScript, uses High Performance JavaScript Graphics 
Library by Walter Zorn, http://www.walterzorn.com, for graphics).

Tig (Text-mode Interface for Git, ncurses) also in its git version has
a kind of history graph using ascii-art.


That is a very important tool to have for any SCM which allows (and
encourages) nonlinear history development.
 
>> * Bisection tool for finding bug introduction revisions.
> 
> This took time to come in bzr, but that's the bisect plugin:
> 
> http://bazaar-vcs.org/PluginRegistry

Hmmm... I wonder which SCM had it first.
 
>> * Tools for sending commits as emails.
> 
> (Surprisingly, I had added this in the table, but has been removed for
> some obscure reasons)

While email can be used to exchange patches (git-format-patch to 
generate patches, git-send-email to send patches if you don't want to
use ordinary email client, git-am to apply patches) it cannot be used 
to exchange all information (one cannot send for example tags, or merge 
commits).

It is a very useful tool to have for an "accidental" developer. You don't
have to have constant on-line presence in the form of web server or git 
server somewhere for sending pull requests (although http://repo.or.cz 
public git repo hosting can help with that), you don't have to have 
access (ssh perhaps limited, or WebDAV one) to do push to somebody else 
repository, you can just send email to some mailing list.

BTW. git can provide binary patch for binary files (e.g. adding favicon 
for gitweb in git.git).


Other often and not-so-often used tools include:
 * git-rerere - Reuse recorded resolve (of merge conflicts)
 * reflog - Records where was given branch at given time (no UI yet)
 * git-diff -S'text' aka. pickaxe - find commits which added or removed
   given 'text'; and other revision limiters

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:27               ` Matthieu Moy
  2006-10-17 13:55                 ` Jakub Narebski
@ 2006-10-17 14:01                 ` Andreas Ericsson
  2006-10-17 14:24                   ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17 14:01 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Matthieu Moy wrote:
> Andreas Ericsson <ae@op5.se> writes:
> 
>> What about
>>
>> 3) getting the repo with all the history while still not having to be
>> online to actually commit to *your* copy of the repo. When you later
>> get online, you can send all your changes in a big hunk, or let bazaar
>> email them to the maintainer as patches, or...
> 
> Well, the discussion was about checkouts, so I was talking about
> checkouts ;-).
> 

Differences in nomenclature are really messing this discussion up. In
git, a "checkout" is the act of pulling objects from the object database 
into the working tree. I.e., the act of "clothing" a "bare" repository.


>> but the ability to commit to whatever branch I like in my local repo
>> and then just send the diffs by email or please-pull requests to
>> upstream authors is what makes git work so well for me.
> 
> Sure. Once again, Bazaar does it this way too. There's an _additional
> feature_ called checkout which allows you to work in another way,
> though. As most "feature", it's not useful to everybody.
> 

Now I'm really confused. Does bazaar have both "clone" (git-style 
fetching a full repo and all the branches) and "checkout" (cvs-style 
fetching only the working tree)?

> 
>> Side-note 2: Three really great things that have made work a lot
>> easier and more enjoyable since we changed from cvs to git and that
>> aren't mentioned in the comparison table:
> 
> Sure. And regarding this, hopefully, most modern VCS go in the same
> direction.
> 
>> * Dependency/history graph display tools á la qgit/gitk
> 
> http://bazaar-vcs.org/bzr-gtk
> http://samba.org/~jelmer/bzr/bzrk.png
> 
>> * Bisection tool for finding bug introduction revisions.
> 
> This took time to come in bzr, but that's the bisect plugin:
> 
> http://bazaar-vcs.org/PluginRegistry
> 
>> * Tools for sending commits as emails.
> 
> (Surprisingly, I had added this in the table, but has been removed for
> some obscure reasons)
> 

Merge-conflict with the webpage? ;-)

However, I know that bazaar has many of these features. I was merely 
commenting on the absence of these killer-features in the table. It 
might help people pick the right scm for their project, which is always 
a Good Thing(tm).

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
  2006-10-17 14:01                           ` Sean
@ 2006-10-17 14:01                           ` Sean
  2006-10-17 14:19                             ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-17 14:01 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, 17 Oct 2006 15:44:36 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> > How does bzr avoid a merge when you're pushing changes from 3
> > separate machines?
> 
> Err, the same way people have been doing for years ;-). If you don't
> have local commits, "bzr update" will work in the same way as "cvs
> update", it keeps your local changes, without recording history. Like
> "git pull" does if you have uncommitted changes I think.

Ah, okay.  Well Git can definitely manage this.  Just means you have to
rebase any local changes before pushing.  This will keep the history
linear and make sure that no merges are needed in the case you were asking
about.
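
As a rough illustration of why rebasing keeps history linear where merging does not, here is a toy model. It is not git's implementation: commit ids are plain strings, and the `'` suffix stands in for the new ids that replayed commits would receive.

```python
# Toy model: a history maps each commit id to the list of its parent ids.
def merge(history, local_tip, upstream_tip, new_id="m"):
    """Join two tips with a commit that has two parents."""
    result = dict(history)
    result[new_id] = [local_tip, upstream_tip]
    return result, new_id


def rebase(history, local_commits, upstream_tip):
    """Replay local commits (oldest first) on top of upstream_tip.

    Each replayed commit gets a new id because its parent changed.
    """
    result = dict(history)
    parent = upstream_tip
    for commit in local_commits:
        new_id = commit + "'"
        result[new_id] = [parent]
        parent = new_id
    return result, parent


def is_linear(history, tip):
    """True if every commit from tip back to the root has at most one parent."""
    while history[tip]:
        if len(history[tip]) != 1:
            return False
        tip = history[tip][0]
    return True


# Hypothetical situation: upstream added "c" on top of "b",
# while "x" and "y" were committed locally on top of "b".
base = {"a": [], "b": ["a"], "c": ["b"], "x": ["b"], "y": ["x"]}

merged, merge_tip = merge(base, "y", "c")
rebased, rebase_tip = rebase(base, ["x", "y"], "c")

print(is_linear(merged, merge_tip))    # the merge commit has two parents
print(is_linear(rebased, rebase_tip))  # x' and y' now sit on top of c
```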

So far, it sounds to me like bazaar and git are more alike than they are
different.  Each have a few commands the other doesn't but all in all
they sound very similar.  But I'm a Git fanboy so I ain't switching
now ;o)

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  7:50         ` Andreas Ericsson
@ 2006-10-17 14:05           ` Aaron Bentley
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
  2006-10-17 15:05             ` Andreas Ericsson
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 14:05 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Aaron Bentley wrote:

>> When two people have copies of the same revision, it's usually because
>> they are each pulling from a common branch, and so the revision in that
>> branch can be named.  Bazaar does use unique ids internally, but it's
>> extremely rare that the user needs to use them.
>>
> 
> Well, if two people have the same revision in git, you *know* they have
> pulled from each other

No, you don't.  They may have each pulled from a different repository.

Take revision 00aabbcc, created by Linus.  Linus has it because he
committed it.  I have it because I pulled Linus' repository.  You have
it because Andrew Morton pulled Linus' repository, and you pulled Andrew
Morton's repository.

>> But tags have local meaning only, unless someone has access to your
>> repository, right?
>>
> 
> I imagine the bazaar-names with url+number only has local meaning unless
> someone has access to your repository too.

Yes.  That phrasing was from Linus' description of revnos.

> One of the great benefits of
> git is that each revision is *always exactly the same* no matter in
> which repository it appears. This includes file-content, filesystem
> layout and, last but also most important, history.

In Bazaar, a revision id always refers to the same logical entity, but
it may be stored in different formats in different repositories.

>> - you can publish a repository without publishing its working tree,
>>   possibly using standard mirroring tools like rsync.
>>
> 
> Can't all scm's do this?

With most SCMs that store the repository in the root of the tree,
disentangling the tree and repository requires care.  OTOH, this is just
as easy with Arch, CVS and SVN as it is with Bazaar.

>> - you can use a checkout to maintain a local mirror of a read-only
>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>
> 
> Check. Well, actually, you just clone it as usual but with the --bare
> argument and it won't write out the working tree files.

No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.

>> You can operate that way in bzr too, but I find it nicer to have one
>> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
>> command also rewrites only the changed part of the working tree.
>>
> 
> Works in git as well, but each "checkout" (actually, locally referenced
> repository clone) gets a separate branch/tag namespace.

In our terminology, if it can diverge from the original, it's a branch,
not a checkout.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNOM10F+nu1YWqI0RAvNUAJwN/QviOs+sUuN9ep4Otyrgax9SmwCfSH7t
XdxOxo7smshNlzU3qoxq6Nw=
=nxsM
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:55                 ` Jakub Narebski
@ 2006-10-17 14:08                   ` Matthieu Moy
  2006-10-17 14:41                     ` Jakub Narebski
  2006-10-18 18:03                   ` Jeff Licquia
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:08 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> While email can be used to exchange patches (git-format-patch to 
> generate patches, git-send-email to send patches if you don't want to
> use ordinary email client, git-am to apply patches) it cannot be used 
> to exchange all information (one cannot send for example tags, or merge 
> commits).

In bzr, the "bundle" appears like a patch, but it actually contains the
same information as the revision(s) it contains (I believe this
applies to hg and Darcs too). A bundle can be used almost like a
branch. That's a key point, since revision identity is not based on
content's hash, so applying a patch is very different from merging a
bundle.

> It is very useful tool to have for "accidental" developer.

That's the key point, but patch review for non-accidental developers
is also good :-).

> BTW. git can provide binary patch for binary files (e.g. adding favicon 
> for gitweb in git.git).

Bazaar's bundles use base64 encoding for binaries. I don't think that's
efficient binary diff (xdelta-like) though. Aaron has been fighting
quite a lot with MUA and MTA mixing up the patches (line ending in
particular) ...

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:01                           ` Sean
@ 2006-10-17 14:19                             ` Matthieu Moy
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:19 UTC (permalink / raw)
  To: bazaar-ng, git

Sean <seanlkml@sympatico.ca> writes:

> Ah, okay.  Well Git can definitely manage this.  Just means you have to
> rebase any local changes before pushing.  This will keep the history
> linear and make sure that no merges are needed in the case you were asking
> about.

Sure. As I said before, the little add-on of checkouts is that you say
once "I don't want to do local commit here", and bzr reminds you this
each time you commit. Well, where it can make a difference is that it
does it in a transactional way, that is, you don't have that little
window between the time you pull and the time you push your next
commit. But this would really be bad luck ;-).

> So far, it sounds to me like bazaar and git are more alike than they are
> different.  Each have a few commands the other doesn't but all in all
> they sound very similar.

Sure. And at least, if you want to prove that your decentralized SCM
is the best, you'd better look at features other than the ability to
commit on a local branch ;-). If you want a _real_ flamewar, better
talk about rename management or revision identity.

The thing is that most people migrated from CVS/svn, so they found
their new SCM to be incredibly better than the existing one. But it's generally
not _so_ much better than the other modern alternatives ;-). (and
don't forget to thank Darcs and Monotone who brought most of the good
ideas you and I are using)

> But I'm a Git fanboy so I ain't switching now ;o)

Probably not going to switch either, but that might happen.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
  2006-10-17 11:45             ` Jakub Narebski
  2006-10-17 12:00             ` Andreas Ericsson
@ 2006-10-17 14:19             ` Olivier Galibert
  2006-10-17 15:37               ` Matthieu Moy
  2006-10-18  1:46             ` Petr Baudis
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  4 siblings, 1 reply; 806+ messages in thread
From: Olivier Galibert @ 2006-10-17 14:19 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

On Tue, Oct 17, 2006 at 01:19:08PM +0200, Matthieu Moy wrote:
> I use it for example to have several "checkouts" of the same branch on
> different machines. When I commit, bzr tells me "hey, boss, you're out
> of date, why don't you update first" if I'm out of date.

You're not telling us bzr still follows the utterly stupid
update-before-commit model, right?  Right?

  OG.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:01                 ` Andreas Ericsson
@ 2006-10-17 14:24                   ` Matthieu Moy
  0 siblings, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 14:24 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> Now I'm really confused. Does bazaar have both "clone" (git-style
> fetching a full repo and all the branches) and "checkout" (cvs-style
> fetching only the working tree)?

Yes, it has both. That's "bzr branch" (git clone) and "bzr checkout"
(cvs checkout).

Difference between "bzr branch" and "git clone" is that bzr doesn't
fetch all the branches. It fetches one "branch" (succession of
revisions) with all the ancestors of the revisions of the branch.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
@ 2006-10-17 14:34               ` Sean
  0 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 14:34 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Linus Torvalds, Jakub Narebski, bazaar-ng, git

On Tue, 17 Oct 2006 10:05:41 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:


> No, you don't.  They may have each pulled from a different repository.
> 
> Take revision 00aabbcc, created by Linus.  Linus has it because he
> committed it.  I have it because I pulled Linus' repository.  You have
> it because Andrew Morton pulled Linus' repository, and you pulled Andrew
> Morton's repository.

Well his point was that they have pulled from each other directly or
indirectly.  You can safely say that rev 00aabbcc.. in _any_ repository
is the same rev.  This discussion started because of doubt expressed
by some here on the list that the "simple" numbering scheme used by
bzr can offer the same guarantee.  That is, rev 1.2.1 may be completely
different commits in different repos in bazaar.
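
The difference between the two naming schemes can be sketched as follows. This is a toy illustration only: git's real object format is different, the field layout in `commit_id` is invented, and `revnos` merely numbers commits by position in one branch, which is a simplification of what bzr does.

```python
import hashlib


def commit_id(parents, author, message, tree):
    """Toy content-derived id: hash everything that defines the commit."""
    payload = "\n".join([
        "parents " + " ".join(parents),
        "author " + author,
        "tree " + tree,
        "",
        message,
    ])
    return hashlib.sha1(payload.encode("utf-8")).hexdigest()


def revnos(branch):
    """Number the commits of one branch by position, oldest first, from 1."""
    return {cid: number for number, cid in enumerate(branch, start=1)}


# Hypothetical commits; authors, messages and tree names are made up.
root = commit_id([], "linus", "initial import", "tree-0")
fix = commit_id([root], "linus", "fix irq handling", "tree-1")
tweak = commit_id([root], "andrew", "local tweak", "tree-2")

branch_a = [root, fix]    # one repository's branch
branch_b = [root, tweak]  # another repository's branch

# The same inputs give the same id wherever they are hashed ...
print(commit_id([], "linus", "initial import", "tree-0") == root)
# ... while "revision 2" names different commits in the two branches.
print(revnos(branch_a)[fix], revnos(branch_b)[tweak], fix != tweak)
```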
 
> With most SCMs that store the repository in the root of the tree,
> disentangling the tree and repository requires care.  OTOH, this is just
> as easy with Arch, CVS and SVN as it is with Bazaar.

Just in case it wasn't clear, this is drop dead easy in Git too.

> No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.

Why?  Uncommitted changes shouldn't be propagated.  Once you have cloned
the repo, you can checkout your own copy of the working tree files.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:08                   ` Matthieu Moy
@ 2006-10-17 14:41                     ` Jakub Narebski
  2006-10-18  0:00                       ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 14:41 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> While email can be used to exchange patches (git-format-patch to 
>> generate patches, git-send-email to send patches if you don't want to
>> use ordinary email client, git-am to apply patches) it cannot be used 
>> to exchange all information (one cannot send for example tags, or
>> merge commits).
> 
> In bzr, the "bundle" appears like a patch, but it actually contains the
> same information as the revision(s) it contains (I believe this
> applies to hg and Darcs too). A bundle can be used almost like a
> branch. That's a key point, since revision identity is not based on
> content's hash, so applying a patch is very different from merging a
> bundle.

The patch generated by git-format-patch has author information (in 
"From:" header), original commit date (in "Date:" header), commit 
message (first line in "Subject:", rest in message body), place for 
comments which are not to be included in commit message, diffstat for 
easier patch review, and git extended diff (with information about 
rename detection, mode changes, 7-character abbreviations of file
content identifiers). It does not record parent information, the original
committer and committer date, which branch we are on, etc. You can quite
easily provide ordering of patches.

Sending patches via email prohibits first line of commit message to be 
enclosed in brackets (subject usually is "[PATCH] Commit description" 
or "[PATCH n/m] Commit description") and enforces git convention of 
commit message to consist of first line describing commit shortly, 
separated by empty line from the longer description and signoff lines.
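
To make the mapping from email to commit fields concrete, here is a small sketch using Python's standard `email` module. It is not what git-am does internally; the message, the address and the rule of stripping only a leading "[PATCH ...]" tag are assumptions made for the example.

```python
import re
from email import message_from_string

# Assumption for this sketch: only a leading "[PATCH ...]" tag is stripped.
PATCH_PREFIX = re.compile(r"^\s*\[PATCH[^\]]*\]\s*")


def commit_fields(raw_email):
    """Extract author, date and commit message from a patch email."""
    msg = message_from_string(raw_email)
    subject = PATCH_PREFIX.sub("", msg["Subject"])
    # Everything after the first "---" line is commentary, diffstat and diff.
    description, _, _ = msg.get_payload().partition("\n---\n")
    return {
        "author": msg["From"],
        "date": msg["Date"],
        "message": subject + "\n\n" + description.strip(),
    }


# Hypothetical patch email (headers and text are made up).
raw = (
    "From: A U Thor <author@example.com>\n"
    "Date: Tue, 17 Oct 2006 14:41:00 +0000\n"
    "Subject: [PATCH 2/3] Fix frobnication\n"
    "\n"
    "Longer description of the change.\n"
    "\n"
    "Signed-off-by: A U Thor <author@example.com>\n"
    "---\n"
    "Notes for reviewers, not part of the commit message.\n"
)

fields = commit_fields(raw)
print(fields["message"].splitlines()[0])
```

Because the bracketed tag is removed from the subject, a commit whose own first line started with a bracketed word could not be told apart from such a tag, which is the restriction mentioned above.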

"Bundle" equivalent, although binary in nature, would be thin pack.
 
>> It is very useful tool to have for "accidental" developer.
> 
> That's the key point, but patch review for non-accidental developers
> is also good :-).

How very true...
 
>> BTW. git can provide binary patch for binary files (e.g. adding
>> favicon for gitweb in git.git).
> 
> Bazaar's bundles use base64 encoding for binaries. I don't think that's
> efficient binary diff (xdelta-like) though. Aaron has been fighting
> quite a lot with MUA and MTA mixing up the patches (line ending in
> particular) ...

If I remember correctly git binary diff format is xdiff based, and uses 
kind of ascii85 encoding (PostScript).
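
For a feel of the size difference between the two text encodings, the following sketch compares them with Python's standard `base64` module. Note that `a85encode` is the Adobe-style Ascii85 variant and is not claimed to be byte-compatible with what git emits; the payload is arbitrary test data.

```python
import base64

# 1200 arbitrary bytes without zeros, so Ascii85's shortcut for
# all-zero groups does not affect the comparison.
payload = bytes(range(1, 241)) * 5

b64 = base64.b64encode(payload)  # 4 output characters per 3 input bytes
a85 = base64.a85encode(payload)  # 5 output characters per 4 input bytes

print(len(payload), len(b64), len(a85))

# Both encodings are lossless.
assert base64.b64decode(b64) == payload
assert base64.a85decode(a85) == payload
```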

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  4:24       ` Aaron Bentley
                           ` (2 preceding siblings ...)
       [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
@ 2006-10-17 15:03         ` Linus Torvalds
  3 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 15:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?

Ehh. Exactly like the bzr numbers? You have to have access to the original 
repo to name it.

So your point is?

If you do

	git log v2.6.17

in a kernel repository, you'll see exactly what I see - because you'll 
have gotten the tags, aka the "easy revision names".

Now, I'm obviously biased, but the thing is, git really does do this 
right. No meaningless numbers. You give _meaningful_ revision names, and 
they can be extremely powerful.

And no, it's not just tags or the raw SHA1 numbers. You can do 
relationships like

	git log HEAD~5..

which means "show the log for everything since five parents ago" (which is 
_not_ the same as "show the last five revisions", because one of them may 
have been a merge, and brought in a lot more of new commits).
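
The distinction can be shown on a toy history. This is only a model of the `HEAD~5..` idea, with made-up commit ids: "~N" follows first parents N times, and "base.." selects everything reachable from the tip but not from the base.

```python
# Toy model: a history maps each commit id to the list of its parent ids,
# first parent first.
def nth_first_parent(history, commit, n):
    """Follow the first parent n times (the "~n" suffix)."""
    for _ in range(n):
        commit = history[commit][0]
    return commit


def reachable(history, commit):
    """Return the set containing `commit` and all of its ancestors."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(history[current])
    return seen


def since(history, base, tip):
    """Commits in "base..tip": reachable from tip but not from base."""
    return reachable(history, tip) - reachable(history, base)


# Hypothetical history: mainline c0..c5, where c3 merges a side branch
# s1..s3 that forked from c0.
history = {
    "c0": [], "c1": ["c0"], "c2": ["c1"],
    "s1": ["c0"], "s2": ["s1"], "s3": ["s2"],
    "c3": ["c2", "s3"], "c4": ["c3"], "c5": ["c4"],
}

base = nth_first_parent(history, "c5", 5)
print(base)                                # c0
print(len(since(history, base, "c5")))     # 8, not 5: the merge brought in s1..s3
```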

Or, you can say

	git diff mybranch@{2.days.ago}..nextbranch

which says exactly what you'd read it as: show the diff between what 
"mybranch" looked like 2 days ago and what "nextbranch" looks like right 
now.
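
A "where was this branch at time T" lookup can be sketched as a search in a time-ordered log of branch positions. The structure below is an assumption for illustration (integer timestamps, made-up ids), not git's reflog format; such a log is a local record of one repository.

```python
import bisect


def branch_at(reflog, when):
    """Return the commit the branch pointed at at time `when`.

    `reflog` is a list of (timestamp, commit_id) pairs sorted by timestamp;
    each entry records where the branch pointed from that moment on.
    Returns None if `when` is earlier than the first entry.
    """
    times = [timestamp for timestamp, _ in reflog]
    index = bisect.bisect_right(times, when)
    return reflog[index - 1][1] if index else None


# Hypothetical log of branch updates.
reflog = [(100, "aaa"), (200, "bbb"), (300, "ccc")]

print(branch_at(reflog, 250))  # bbb
print(branch_at(reflog, 300))  # ccc
print(branch_at(reflog, 50))   # None
```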

Or, since the namespace is the same for commit history _and_ for actual 
file contents, and since some commands don't need commits, you can decide 
to name not a revision, but a specific file or subdirectory in a revision, 
and do things like

	git -p grep -1 request_irq v2.6.17~2:drivers/char

where the "revision" is not a commit revision at all, it's a _tree_ 
revision, because we've looked up the revision for "v2.6.17~2" (which 
means "the grandparent of the tag v2.6.17"), and then within that commit we 
looked up the tree "drivers/char", and then we grepped (recursively) for 
the string "request_irq" within that subtree (with one line of context), 
and then we paginated the output through "less" (or whatever your pager is 
set to).

In other words, yes, the above does _exactly_ what you'd expect it to do.
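
The lookup described above, from tag to commit to ancestor to subtree, can be sketched with a toy object store in Python. All object names, the tag v1.0 and the resolve helper are invented for illustration; real git names these objects by SHA-1 and parses many more revision forms.

```python
# Toy object store keyed by made-up names (hypothetical, not real git objects).
OBJECTS = {
    "commit-a": {"type": "commit", "tree": "tree-root-a", "parents": []},
    "commit-b": {"type": "commit", "tree": "tree-root-b", "parents": ["commit-a"]},
    "commit-c": {"type": "commit", "tree": "tree-root-b", "parents": ["commit-b"]},
    "tree-root-a": {"type": "tree", "entries": {"drivers": "tree-drivers-a"}},
    "tree-root-b": {"type": "tree", "entries": {"drivers": "tree-drivers-b"}},
    "tree-drivers-a": {"type": "tree", "entries": {"char": "tree-char-a"}},
    "tree-drivers-b": {"type": "tree", "entries": {"char": "tree-char-b"}},
    "tree-char-a": {"type": "tree", "entries": {"tty.c": "blob-1"}},
    "tree-char-b": {"type": "tree", "entries": {"tty.c": "blob-2"}},
}
TAGS = {"v1.0": "commit-c"}  # hypothetical tag pointing at the newest commit


def resolve(spec):
    """Resolve 'name', 'name~N' or 'name~N:path' to an object name in the toy store."""
    rev, has_path, path = spec.partition(":")
    name, _, count = rev.partition("~")
    obj = TAGS.get(name, name)
    for _ in range(int(count) if count else 0):
        obj = OBJECTS[obj]["parents"][0]     # follow the first parent
    if not has_path:
        return obj                           # a plain commit name
    obj = OBJECTS[obj]["tree"]               # the commit's top-level tree
    for part in filter(None, path.split("/")):
        obj = OBJECTS[obj]["entries"][part]  # descend one directory level
    return obj


subtree = resolve("v1.0~2:drivers/char")
```

The result names a tree, not a commit, which is why a command that only needs file contents can work on it directly.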

The fact is, nobody ever uses the SHA1 names directly in their normal 
work. You'd use the branch names, tag-names, or some relationship operator 
like "this long ago" or "the parent of" or similar.

The only time you use actual SHA1 names is when you tell somebody _else_ 
something. Or when you use "gitk" to look something up, and select a 
commit, and then paste that commit name into "git show" (which is 
obviously telling "somebody else" - it's communicating between two 
programs).

There's simply no reason to ever use the SHA1 names directly normally. But 
they are there, and they are the _real_ revision numbers, and they 
actually have real meaning between different repositories.

So that "git grep" example above is actually 100% equivalent to

	git -p grep -1 request_irq 3ff4e205e1

but why would I ever write that? That's just insane. But in case you care, 
the way I got that "3ff4e205e1" number, it was just by doing

	git rev-parse v2.6.17~2:drivers/char

and cutting-and-pasting the first ten hex-digits to make sure I had 
enough of a name to make it unique.
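
The reason an abbreviated name works in any repository is that the name is computed from the content alone. A minimal Python sketch, assuming the blob case (the object in the example above is a tree, which hashes a different payload in the same manner); shortest_unique_prefix is a hypothetical helper, not a git function:

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 name of a blob: hash of a 'blob <size>' header, a NUL byte, then the content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def shortest_unique_prefix(name, all_names, minimum=4):
    """Shortest prefix of name (at least `minimum` hex digits) that matches no other name."""
    others = [n for n in all_names if n != name]
    for length in range(minimum, len(name) + 1):
        prefix = name[:length]
        if not any(other.startswith(prefix) for other in others):
            return prefix
    return name


# The same bytes give the same name on every machine.
names = [git_blob_id(text) for text in (b"", b"one\n", b"two\n")]
short = shortest_unique_prefix(names[0], names)
```

Cutting ten hex digits by hand, as in the message, is simply a generous guess at a prefix long enough to be unique.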

So the SHA1 names always exist, and they are what git _internally_ uses, 
but you'd normally not use them that much in your daily life. 

They are great for explaining things, though. For example, when somebody 
reports a bug, and has used "git bisect" to figure out where the bug 
started happening, that's when the "real name" matters - since we normally 
didn't tag that commit as being buggy when we created it ;)

So that's when you'd say: "I bisected the problem, and it started 
happening in commit 0123456789abcdef". And now everybody with a git 
repository of the kernel can just look it up locally by 
cutting-and-pasting that one number.

> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:

Actually, git does something even better.

Git allows the repository to be split up.

You can get a git repository on a CD or DVD, and do

	git clone -l -s /mount/cdrom myrepo

and that "-s" means that the new "myrepo" actually is linked to the 
original CDROM repository, and you can now _commit_ stuff and make changes 
in myrepo, even though all the old history is on that CD-ROM. It won't add 
any unnecessary stuff at all to the new repo.
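
The "-s" behaviour can be pictured as an object lookup that falls back to a second, read-only store. This is a toy Python model; the class and object names are invented for illustration, and real git records the borrowed store in its objects/info/alternates file rather than in memory.

```python
class ObjectStore:
    """Toy object store that can borrow objects from read-only alternates."""

    def __init__(self, alternates=()):
        self.objects = {}                  # objects written to this store
        self.alternates = list(alternates)

    def write(self, name, data):
        # New objects always land in the local store, never in an alternate.
        self.objects[name] = data

    def read(self, name):
        if name in self.objects:
            return self.objects[name]
        for alternate in self.alternates:
            try:
                return alternate.read(name)
            except KeyError:
                continue
        raise KeyError(name)


cdrom = ObjectStore()
cdrom.write("old-commit", "history shipped on the CD")

myrepo = ObjectStore(alternates=[cdrom])
myrepo.write("new-commit", "my change, parent: old-commit")
```

The new repository can read the old history without holding a copy of it, and nothing is ever written back to the CD-ROM store.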

Or, you could do the "totally naked" checkout, so that the whole 
repository is somewhere else (if that "somewhere else" is the CD-ROM, you 
obviously cannot change anything ;)

Or you can have <n> different repositories that are all related, and all 
contain just the part that _they_ care about.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:05           ` Aaron Bentley
       [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
@ 2006-10-17 15:05             ` Andreas Ericsson
  2006-10-17 15:32               ` Matthieu Moy
  2006-10-17 19:44               ` Aaron Bentley
  1 sibling, 2 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-17 15:05 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Andreas Ericsson wrote:
>> Aaron Bentley wrote:
> 
>>> When two people have copies of the same revision, it's usually because
>>> they are each pulling from a common branch, and so the revision in that
>>> branch can be named.  Bazaar does use unique ids internally, but it's
>>> extremely rare that the user needs to use them.
>>>
>> Well, if two people have the same revision in git, you *know* they have
>> pulled from each other
> 
> No, you don't.  They may have each pulled from a different repository.
> 

I realized it as I read it now. What I meant was that you know you have 
the exact same revision as the original author once committed.

> 
>>> But tags have local meaning only, unless someone has access to your
>>> repository, right?
>>>
>> I imagine the bazaar-names with url+number only has local meaning unless
>> someone has access to your repository too.
> 
> Yes.  That phrasing was from Linus' description of revnos.
> 
>> One of the great benefits of
>> git is that each revision is *always exactly the same* no matter in
>> which repository it appears. This includes file-content, filesystem
>> layout and, last but also most important, history.
> 
> In Bazaar, a revision id always refers to the same logical entity, but
> it may be stored in different formats in different repositories.
> 

This I don't understand. Let's say Alice has revision-154 in her repo, 
located at alice.example.com. Let's say that commit is accessible with 
the url "alice.example.com:revision-154". Bob pulls from her repo into 
his own, which is located at bob.example.com.

Lots of questions here, so I'll split them up. Feel free to delete the 
non-applicable ones.

Will the commit in Bob's repo be accessible at 
"bob.example.com:revision-154"?

If it's not, how can you backtrack from old bugreports and find the 
error being discussed?

If it is, how does that work if Bob suddenly wants to commit things 
before Alice is done working with her changes?

Also, suppose they both push to a master-repo where Caesar has pushed 
his changes and nicked the slot for revision-154. Does the master repo 
re-organize everything and then invalidate Bob's and Alice's changes, or 
does it tell Alice and Bob that they need to update and then reorganize 
their repos before they're allowed to push?

I really can't get my head around the usefulness of revision-numbers 
hopping around, which is probably why I'm having such trouble grokking 
how it works.

> 
>>> - - you can use a checkout to maintain a local mirror of a read-only
>>>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>>>
>> Check. Well, actually, you just clone it as usual but with the --bare
>> argument and it won't write out the working tree files.
> 
> No, I *want* the working tree files.  I run bzr from a checkout of bzr.dev.
> 

You get the working tree files by default. Use --bare if you don't want 
them to be checked out (i.e. written to the working tree) after the 
clone is complete.

>>> You can operate that way in bzr too, but I find it nicer to have one
>>> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
>>> command also rewrites only the changed part of the working tree.
>>>
>> Works in git as well, but each "checkout" (actually, locally referenced
>> repository clone) gets a separate branch/tag namespace.
> 
> In our terminology, if it can diverge from the original, it's a branch,
> not a checkout.
> 

This clears things up immensely. bazaar checkout != git checkout.
I still fail to see how a local copy you can't commit to is useful, but 
it doesn't really matter to me as I've already found a tool that does 
everything I want wrt scm needs.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
@ 2006-10-17 15:06                                 ` Sean
  2006-10-17 15:06                                 ` Sean
  2006-10-18  0:14                                 ` Petr Baudis
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 15:06 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

On Tue, 17 Oct 2006 16:19:46 +0200
Matthieu Moy <Matthieu.Moy@imag.fr> wrote:

> Sure. As I said before, the little add-on of checkouts is that you say
> once "I don't want to do local commit here", and bzr reminds you this
> each time you commit. Well, where it can make a difference is that it
> does it in a transactional way, that is, you don't have that little
> window between the time you pull and the time you push your next
> commit. But this would really be bad luck ;-).

Yeah, it would be bad luck, but Git wouldn't actually let the push
succeed if someone had changed the upstream repo in that small window.
It would complain that your push wasn't a fast forward and ask you
to update before pushing.
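
The check behind that refusal can be sketched as an ancestry test: a push is a fast forward only when the commit the remote branch currently points at is an ancestor of the commit being pushed. A toy Python version, with made-up commit names and helper functions:

```python
def is_ancestor(parents, ancestor, descendant):
    """True if `ancestor` is reachable from `descendant` by following parent links."""
    seen, stack = set(), [descendant]
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False


def push_allowed(parents, remote_tip, new_tip):
    """A push is a fast forward only if the remote's tip is an ancestor of the new tip."""
    return is_ancestor(parents, remote_tip, new_tip)


history = {
    "A": [],
    "B": ["A"],        # what we pulled
    "C": ["B"],        # someone else pushed this in the meantime
    "D": ["B"],        # our local commit, made without seeing C
    "M": ["D", "C"],   # our merge after updating
}

rejected = push_allowed(history, remote_tip="C", new_tip="D")
accepted = push_allowed(history, remote_tip="C", new_tip="M")
```

Pushing D onto a remote that already moved to C is refused; after merging C locally, pushing M goes through.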

> Sure. And at least, if you want to prove that your decentralized SCM
> is the best, you'd better look at features other than the ability to
> commit on a local branch ;-). If you want a _real_ flamewar, better
> talk about rename management or revision identity.
> 
> The thing is that most people migrated from CVS/svn, so they found
> their new SCM to be incredibly better the existing. But it's generally
> not _so_ much better than the other modern alternatives ;-). (and
> don't forget to thank Darcs and Monotone who brought most of the good
> ideas you and I are using)

Heh, true enough.  And the fact is they're all "borrowing" the
best ideas from one another.  All of a sudden the others are all
getting git-like bisect and gitk guis.  And of course Linus has
said that he got quite a bit of inspiration from Monotone
originally.

Beyond the distributed offline nature of using Git, the killer
"feature" for me is its raw speed and flexibility[1].  It's
really nice to be able to branch in under a second and try
out a line of development etc.  Maybe this is just as easy
in Bazaar but it's not true of say Mercurial.  Honestly, I
just can't imagine any other SCM meeting my needs better than
Git.  So I have a hard time taking complaints about rename
management or revision identity seriously.

While they don't affect my usage, IMHO the two biggest failings
of Git are its lack of a shallow clone and its reliance on shell
and other scripting languages so there is no native Windows version.
I'm sure both of these areas are handled better by Bazaar and/or
some of the other new SCMs where they'd be a better choice than
Git.

Sean

[1] As an aside, I don't understand why bazaar pushes the idea
of "plugins".  For instance someone mentioned that bazaar has
a bisect "plugin".  Well Git was able to add a bisect "command"
without needing a plugin architecture, so I'm at a loss as 
to why plugins are seen as an advantage.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 15:05             ` Andreas Ericsson
@ 2006-10-17 15:32               ` Matthieu Moy
  2006-10-17 19:44               ` Aaron Bentley
  1 sibling, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 15:32 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> This I don't understand. Let's say Alice has revision-154 in her repo,
> located at alice.example.com. Let's say that commit is accessible with
> the url "alice.example.com:revision-154". Bob pulls from her repo into
> his own, which is located at bob.example.com.

Another equation can help.

Revision Identity != Revision Number.

$ bzr log --show-ids
------------------------------------------------------------
revno: 1
revision-id: Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d
committer: Matthieu Moy <Matthieu.Moy@imag.fr>
branch nick: foo
timestamp: Tue 2006-10-17 17:20:29 +0200
message:
  some message


See, bzr has this unique revision identifier (not based on a hashsum).
The design choice of bzr is to hide it as much as possible from the
user interface.
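
Going only by the example id shown above, such an identifier looks like the committer's email, a UTC timestamp and a random suffix. A hedged Python sketch of building one in that shape; make_revision_id is a hypothetical helper and bzr's real generator may differ in detail:

```python
import calendar
import secrets
import time


def make_revision_id(email, when=None):
    """Build 'email-YYYYMMDDHHMMSS-<16 hex digits>' from a UTC time in epoch seconds."""
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(when))
    return f"{email}-{stamp}-{secrets.token_hex(8)}"


# 2006-10-17 17:20:29 +0200 is 15:20:29 UTC, matching the id in the log output above.
commit_time = calendar.timegm((2006, 10, 17, 15, 20, 29))
rev_id = make_revision_id("Matthieu.Moy@imag.fr", commit_time)
```

Nothing in such an id depends on the content of the revision, which is the contrast with a hash-based name.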

Then, if I'm in the branch in which I typed this command, I can refer
to this revision with simply

  bzr whatever -r 1

In the general case, I can access it with

  bzr whatever -r revid:Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

(There's currently a lack in the UI to specify a remote revision-id,
but that's not a problem in the model itself)

bzr's internals use revision IDs almost exclusively (ancestry
information is all about revision IDs), and revnos are a UI layered on
top of them.

I don't have strong needs in revision control, but I actually never
encountered a case where I had to access a revision by providing its
ID. So, for people like me, revision numbers are sufficient, and they
are simple (for example, I can tell without running any command that
revision 42 is older than revision 56 in a particular branch).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:19             ` Olivier Galibert
@ 2006-10-17 15:37               ` Matthieu Moy
  0 siblings, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-17 15:37 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

Olivier Galibert <galibert@pobox.com> writes:

> You're not telling us bzr still follows the utterly stupid
> update-before-commit model, right?  Right?

One last time:

bzr _CAN_ follow the utterly stupid update-before-commit model.

It doesn't force you to do so, obviously.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  9:40           ` Robert Collins
  2006-10-17 10:08             ` Andreas Ericsson
@ 2006-10-17 16:41             ` Linus Torvalds
  2006-10-17 22:27               ` Robert Collins
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 16:41 UTC (permalink / raw)
  To: Robert Collins; +Cc: Jakub Narebski, Aaron Bentley, bazaar-ng, git



On Tue, 17 Oct 2006, Robert Collins wrote:

> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> > 
> >           ---- time --->
> > 
> >     --*--*--*--*--*--*--*--*--*-- <branch>
> >           \            /
> >            \-*--X--*--/
> > 
> > The branch it used to be on is gone...
> 
> In bzr 0.12 this is :
> 2.1.2
> 
> (assuming the first * is numbered '1'.)
> 
> These numbers are fairly stable

And here, by "fairly stable", you really mean "totally idiotic", don't 
you?

Guys, let's be blunt here, and just say you're wrong. The fact is, I've 
used a system that uses the same naming bzr does, and I've used it likely 
longer and with a bigger project than anybody has likely _ever_ used bzr 
for.

It sounds like bzr is doing _exactly_ what bitkeeper did. 

Those "simple" numbers are totally idiotic. And when I say "totally 
idiotic", please go back up a few sentences, and read those again. I know 
what I'm talking about. I know probably better than anybody in the bzr 
camp.

Those "simple" numbers are anything but. They may be short, most of the 
time, but when you bandy things like "-r 56" around, what you're ignoring 
is that for a _real_ project you actually get numbers like "1.517.3.57", 
which isn't really any simpler or shorter than saying "7786ce19". You 
still want to cut-and-paste it.

And the "simple" numbers have a real downside, which is that THEY CHANGE.

What happens is that somebody else started _another_ branch at revision 2, 
and did important work, and they also had a "2.1.2" revision, and then 
they merged your work, and you merged their merge back, that "simple" 
revision number changed, didn't it? Suddenly "2.1.2" means something 
different for one of the users.
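
The renumbering can be shown with a toy model in Python. It assumes, as a simplification, that a plain revision number is just a commit's position along the branch's first-parent line; dotted numbers such as 2.1.2 are left out, and the revision names are invented:

```python
def revnos(parents, tip):
    """Map revision number -> revision name by walking first parents back from tip."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    return {number: name for number, name in enumerate(reversed(line), start=1)}


history = {
    "base": [],
    "alice-1": ["base"],                  # Alice's second revision
    "bob-1": ["base"],                    # Bob's second revision
    "bob-merge": ["bob-1", "alice-1"],    # Bob merges Alice's work
}

alice_before = revnos(history, "alice-1")   # Alice's branch before the merge
bob_before = revnos(history, "bob-1")       # Bob's branch before the merge
alice_after = revnos(history, "bob-merge")  # Alice's branch after taking Bob's merge
```

Before the merge, "2" already names different commits for Alice and Bob; after Alice takes Bob's merge, her own "2" no longer names the commit she made.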

We had people in the bitkeeper world that _never_ actually understood that 
the numbers changed. The "simple" numbers were stable enough that a lot of 
people thought they were real revisions, and then they were really 
_really_ confused when a number like "1.517.3.57" suddenly went away after 
a merge, and became something else instead.

And yes, bitkeeper had a "real key" internally too. If you actually wanted 
to give a real revision, you had to give something that looked a lot like 
what the bzr internal revision numbers look like.

Of course, most users didn't even _know_ or understand those revision 
numbers, so as a result, you had tons of people who used the "simple" 
thing (which was what "bk log" and all other tools would show), and since 
it worked quite often, they thought it was ok. And then sometimes it 
didn't work at all, or it "worked" by giving the wrong commit, and it was 
just a total disaster.

Something that works "most of the time" is not simple to use. It's just a 
way to make people _believe_ it is simple, and then be really confused 
when it doesn't work.

So trust me, naming things so that the name depends on the local shape of 
the history is idiotic. I _know_. Been there, done that.

The thing is, when I designed git, I actually had years of experience 
working with a big project in a truly distributed manner. I _knew_ that 
handling renames specially is a bad idea (not that you should even need to 
have used BK to know that).

And I _knew_ that the simple revision numbers aren't real and just cause 
confusion.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  6:23         ` Junio C Hamano
@ 2006-10-17 18:52           ` J. Bruce Fields
  2006-10-17 19:12             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: J. Bruce Fields @ 2006-10-17 18:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Aaron Bentley, git

On Mon, Oct 16, 2006 at 11:23:53PM -0700, Junio C Hamano wrote:
> Aaron Bentley <aaron.bentley@utoronto.ca> writes:
> 
> > Johannes Schindelin wrote:
> >
> >>> You'll note we referred to that bevhavior on the page.  We don't think
> >>> what Git does is the same as supporting renames.  AIUI, some Git users
> >>> feel the same way.
> >> 
> >> Oh, we start another flamewar again?
> >
> > I'd hope not.  It sounds as though you feel that supporting renames in
> > the data representation is *wrong*, and therefore it should be an insult
> > to you if we said that Git fully supported renames.
> 
> Not recording and not supporting are quite different things.

Yes.  There's a risk of confusing a feature with an implementation
detail.  From http://bazaar-vcs.org/RcsComparisons:

	"If a user can rename a file in the RCS without loosing the RCS
	history for a file, then renames are considered supported. If
	the operation resultes in a delete/add (aka "DA pair"), then
	renames are not considered supported. If the operation results
	in a copy/delete pair, renames are considered "somewhat"
	supported. The problem with copy support is that it is hard to
	define sane merge semantics for copies."

The first sentence sounds like a description of a user-visible feature.
The rest of it sounds like implementation.

And git probably has some deficiencies here, but it'd be more useful to
identify them in terms of things a user can't do.

--b.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 18:52           ` J. Bruce Fields
@ 2006-10-17 19:12             ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 19:12 UTC (permalink / raw)
  To: git

J. Bruce Fields wrote:

> On Mon, Oct 16, 2006 at 11:23:53PM -0700, Junio C Hamano wrote:
>> Aaron Bentley <aaron.bentley@utoronto.ca> writes:
>> 
>> > Johannes Schindelin wrote:
>> >
>> >>> You'll note we referred to that bevhavior on the page.  We don't think
>> >>> what Git does is the same as supporting renames.  AIUI, some Git users
>> >>> feel the same way.
>> >> 
>> >> Oh, we start another flamewar again?
>> >
>> > I'd hope not.  It sounds as though you feel that supporting renames in
>> > the data representation is *wrong*, and therefore it should be an insult
>> > to you if we said that Git fully supported renames.
>> 
>> Not recording and not supporting are quite different things.
> 
> Yes.  There's a risk of confusing a feature with an implementation
> detail.  From http://bazaar-vcs.org/RcsComparisons:
> 
>       "If a user can rename a file in the RCS without loosing the RCS
>       history for a file, then renames are considered supported. If
>       the operation resultes in a delete/add (aka "DA pair"), then
>       renames are not considered supported. If the operation results
>       in a copy/delete pair, renames are considered "somewhat"
>       supported. The problem with copy support is that it is hard to
>       define sane merge semantics for copies."
> 
> The first sentence sounds like a description of a user-visible feature.
> The rest of it sounds like implementation.

The proper description would be: if the history of a file goes back only
as far as the rename, unrelated to the history of the file before the
rename ("DA pair"), where "unrelated" means that the SCM doesn't store this
relation (or equivalent information), renames are not considered supported.
If we get the full history of the file under its new name, plus an unrelated
history of the file up to the rename ("CD pair"), renames are not considered
supported ;-)
 
> And git probably has some deficiencies here, but it'd be more useful to
> identify them in terms of things a user can't do.

For example:
 * if we rename (or delete) a file on one branch, and then merge changes
   from another branch where that rename didn't take place, does the merge
   do the correct thing?
 * can we get the whole history of a file, before and after a rename? Can we
   do this automatically, in one go?
 * are renames (or can they be) marked as such in diff output?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 15:05             ` Andreas Ericsson
  2006-10-17 15:32               ` Matthieu Moy
@ 2006-10-17 19:44               ` Aaron Bentley
  2006-10-17 23:28                 ` Petr Baudis
  2006-10-17 23:39                 ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 19:44 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
>> In Bazaar, a revision id always refers to the same logical entity, but
>> it may be stored in different formats in different repositories.
>>
> 
> This I don't understand. Let's say Alice has revision-154 in her repo,
> located at alice.example.com. Let's say that commit is accessible with
> the url "alice.example.com:revision-154". Bob pulls from her repo into
> his own, which is located at bob.example.com.
> 
> Lots of questions here, so I'll split them up. Feel free to delete the
> non-applicable ones.
> 
> Will the commit in Bob's repo be accessible at
> "bob.example.com:revision-154"?

bzr differentiates between pull and merge.  Pull is a mirroring command.
So with pull, yes, revision-154 will be accessible at
bob.example.com:revision-154.

With merge, it won't.  Bob can refer to it as "154:alice.example.com",
though.

> If it's not, how can you backtrack from old bugreports and find the
> error being discussed?

Refer to it as 'alice.example.com revno 154' or by its revision-id.

> If it is, how does that work if Bob suddenly wants to commit things
> before Alice is done working with her changes?

I don't see how this applies.  You can always commit in a branch.  If
alice and bob both commit, then they are diverged and can't pull.  If
alice merges bob, then they converge and bob can pull alice.

> Also, suppose they both push to a master-repo where Caesar has pushed
> his changes and nicked the slot for revision-154. Does the master repo
> re-organize everything and then invalidate Bob's and Alice's changes, or
> does it tell Alice and Bob that they need to update and then reorganize
> their repos before they're allowed to push?

They must merge from the master-repo before they can push to it.

>> In our terminology, if it can diverge from the original, it's a branch,
>> not a checkout.
>>
> 
> This clears things up immensely. bazaar checkout != git checkout.
> I still fail to see how a local copy you can't commit to is useful

My bzr is run from a local copy I can't commit to.  To get the latest
changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
To merge the latest changes into my branch, I can run
"bzr merge ~/bzr/dev".  It's also convenient for applying other people's
patches to.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTKl0F+nu1YWqI0RAhRkAJ0d5KyRElEiFm/m5iRrTIk00RyqywCfe2IY
dhW46SYWm+FTQpN30VY5tPs=
=6SFm
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
@ 2006-10-17 19:51             ` Aaron Bentley
  2006-10-21 18:58               ` Jan Hudec
  2006-10-20  8:26             ` James Henstridge
  2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 19:51 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:
> On Tue, 17 Oct 2006 00:24:15 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>>- - you can use a checkout to maintain a local mirror of a read-only
>>  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> 
> 
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?

By default, they do.  You must use a flag to get a checkout with no history.

> So it's not a mirror of a branch, but just a checkout of the
> branch head?

It's a mirror of a branch, and a copy of the branch's working tree.

> If so, Git can export a tarball of a branch (actually a snapshot as at
> any given commit) which can be mirrored out.

Sure, and so can bzr.  But using a checkout of the branch head means:
- - No one has to do anything special to provide a working tree of a given
  revision
- - I can still run any readonly operations I desire
- - I can update to the latest version of bzr.dev with one command.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTRc0F+nu1YWqI0RAsL2AKCCG0bP8m01WVllfPMzCdFZjmgEgACfeToz
57HERFJ6ZkkS3VrxLRnVPAs=
=3CX7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17  8:16         ` Andreas Ericsson
@ 2006-10-17 20:01           ` Aaron Bentley
  2006-10-17 21:01             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 20:01 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> Aaron Bentley wrote:
>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
>> positive numbers to refer to the number of commits that have been made
>> since the branch was initialized.
>>
> 
> What do you do once a branch has been thrown away, or has had 20 other
> branches merged into it? Does the offset-number change for the revision
> then, or do you track branch-points explicitly?

We always track the number of parents since the initial commit in the
project.  Sorry, I don't think I said that clearly before.

>> If I understand correctly, in Bazaar, you'd just merge the current work
>> into 'xx/topic'.
>>
> 
> merge != rebase though, although they are indeed similar. Let's take the
> example of a 'master' branch and topic branch topicA. If you rebase
> topicA onto 'master', development will appear to have been serial.

Ah, now I see what you mean, and the "graft" plugin mentioned by others
fills that role.  I've never used it, though.

> If
> you instead merge them, it will either register as a real merge or, if
> the branch tip of 'master' is the branch start-point of topicA, it will
> result in a "fast-forward" where 'master' is just updated to the
> branch-tip of 'topicA'.

Interesting.  We don't do 'fast-forward' in that case.

>> I'm not sure what you mean by API, unless you mean the commandline.  If
>> that's what you mean, surely all unix commands are extensible in that
>> regard.
>>
> 
> I'm fairly certain he's talking about the API in the sense it's being
> talked about in every other application. Extensive work has been made to
> libify a lot of the git code, which means that most git commands are
> made up of less than 400 lines of C code, where roughly 80% of the code
> is command-specific (i.e., argument parsing and presentation).

Ah, okay.

So it sounds to me like git is extensible, though not as thoroughly as bzr.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNTat0F+nu1YWqI0RAn9aAJ9WzMrM72be+3SlwCpvJXQ/X2Y3nQCfeYk3
NTIJuZSze9URUaAsiO4Hu5o=
=9nvr
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 20:01           ` Aaron Bentley
@ 2006-10-17 21:01             ` Jakub Narebski
  2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 23:35               ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 21:01 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Andreas Ericsson wrote:
>> Aaron Bentley wrote:
>>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
>>> positive numbers to refer to the number of commits that have been made
>>> since the branch was initialized.
>>>
>>
>> What do you do once a branch has been thrown away, or has had 20 other
>> branches merged into it? Does the offset-number change for the revision
>> then, or do you track branch-points explicitly?
> 
> We always track the number of parents since the initial commit in the
> project.  Sorry, I don't think I said that clearly before.

I think this is quite reliable, and probably independent of the
repository: it is a global property of the commit history, and the
history is covered by a commit's sha1 through the sha1s of its parents.
(There was an idea to store a "generation number" with each commit,
e.g. in a not yet implemented "note" header or in a commit-id to
generation-number "database", as a better heuristic than the timestamp
for ordering revisions in git-rev-list output.) Numbering branching
points, however, is unreliable, as is relying on branch names.
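
To make the "generation number" idea concrete, here is a minimal Python
sketch. It assumes a toy history stored as a dict mapping each commit id
to the list of its parent ids, and it defines the generation of a root
commit as 1 and of any other commit as one more than the largest
generation among its parents. The names generation_numbers and history
are made up for illustration; neither git nor bzr exposes such a
function.

```python
def generation_numbers(history):
    """Return {commit: generation} for a dict {commit: [parent, ...]}."""
    generations = {}

    def visit(commit):
        # Memoised depth-first walk; fine for a toy history.
        if commit not in generations:
            parents = history[commit]
            generations[commit] = 1 + max((visit(p) for p in parents), default=0)
        return generations[commit]

    for commit in history:
        visit(commit)
    return generations


# Toy history: a <- b, then c and d both on top of b, and e merging d and c.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}
gens = generation_numbers(history)
```

In this sketch the number depends only on ancestry, so every repository
holding the same commit would compute the same value. It does not name a
commit uniquely, though: c and d both get generation 3.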
 
>>> If I understand correctly, in Bazaar, you'd just merge the current work
>>> into 'xx/topic'.
>>>
>>
>> merge != rebase though, although they are indeed similar. Let's take the
>> example of a 'master' branch and topic branch topicA. If you rebase
>> topicA onto 'master', development will appear to have been serial.
> 
> Ah, now I see what you mean, and the "graft" plugin mentioned by others
> fills that role.  I've never used it, though.

Rebase is very useful as a kind of poor-man's Quilt (or StGit). You
develop a feature step by step, commit by commit, cooking it in a topic
branch in your repository. Then, before sending it to the mailing list
or the maintainer as a series of patches (using git-format-patch and
git-send-email), you rebase it on top of the current upstream state to
ensure that it applies cleanly.
 
>> If
>> you instead merge them, it will either register as a real merge or, if
>> the branch tip of 'master' is the branch start-point of topicA, it will
>> result in a "fast-forward" where 'master' is just updated to the
>> branch-tip of 'topicA'.
> 
> Interesting.  We don't do 'fast-forward' in that case.

Fast-forward is a really good idea. Perhaps you could implement it,
if it is not already hidden under a different name?
 
>>> I'm not sure what you mean by API, unless you mean the commandline.  If
>>> that's what you mean, surely all unix commands are extensible in that
>>> regard.
>>>
>>
>> I'm fairly certain he's talking about the API in the sense it's being
>> talked about in every other application. Extensive work has been made to
>> libify a lot of the git code, which means that most git commands are
>> made up of less than 400 lines of C code, where roughly 80% of the code
>> is command-specific (i.e., argument parsing and presentation).
> 
> Ah, okay.
> 
> So it sounds to me like git is extensible, though not as thoroughly as bzr.

I think that having a good API for C, shell and Perl (and, to a lesser
extent, for any scripting language) makes git the more extensible of
the two. Git is not yet fully libified; once it is, we could think about
bindings for other programming languages (there is already a preliminary
Java binding/interface).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:01             ` Jakub Narebski
@ 2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 21:51                 ` Jakub Narebski
                                   ` (2 more replies)
  2006-10-17 23:35               ` Jakub Narebski
  1 sibling, 3 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 21:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
>>Ah, now I see what you mean, and the "graft" plugin mentioned by others
>>fills that role.  I've never used it, though.
> 
> 
> Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
> feature step by step, commit by commit in your repository cooking it
> in topic branch. Then before sending it to mailing list or maintainer
> as a series of patches (using git-format-patch and git-send-email)
> you rebase it on top of current work (current state), to ensure that
> it would apply cleanly.

What is the bad side of using merge in this situation?

>>Interesting.  We don't do 'fast-forward' in that case.
> 
> 
> Fast-forward is a really good idea. Perhaps you could implement it,
> if it is not hidden under different name?

We support it as 'pull', but merge doesn't do it automatically, because
we'd rather have merge behave the same all the time, and because 'pull'
throws away your local commit ordering.

>>So it sounds to me like git is extensible, though not as thoroughly as bzr.
> 
> 
> I think having good API for C, shell and Perl (and to lesser extent for any
> scripting language) means that it is extensible more.

I guess it's a value judgement on which is more important to extensibility:

Git has more language support.

Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
and more.  Because Python supports monkey-patching, a plugin can change
absolutely anything.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNUrP0F+nu1YWqI0RAizXAJ0Wnf2ZoIRpaba3mX2L4pN9XcWDPQCePtg/
G/W6Oxm+kd8SzhGEEfLAxL8=
=VqC7
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:27               ` Aaron Bentley
@ 2006-10-17 21:51                 ` Jakub Narebski
  2006-10-17 22:28                   ` Aaron Bentley
  2006-10-18  6:22                   ` Matthieu Moy
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
  2006-10-17 22:03                 ` Linus Torvalds
  2 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 21:51 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:

>>>Ah, now I see what you mean, and the "graft" plugin mentioned by others
>>>fills that role.  I've never used it, though.
>>
>> Very useful as a kind of poor-man's-Quilt (or StGit). You develop some
>> feature step by step, commit by commit in your repository cooking it
>> in topic branch. Then before sending it to mailing list or maintainer
>> as a series of patches (using git-format-patch and git-send-email)
>> you rebase it on top of current work (current state), to ensure that
>> it would apply cleanly.
> 
> What is the bad side of using merge in this situation?

We want linear history, not polluted by merges. For example, you cannot
send a merge commit via email. Another problem is that you want to send
a _series_ of patches, a string of commits (revisions) that builds the
feature part by part, with clean history. With merge, it is the _final
result_ that applies cleanly; with rebase, the whole series of patches
applies cleanly.
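
A rough Python sketch of what rebasing does to the shape of history,
under loud assumptions: history is a toy dict from commit id to parent
ids, the topic branch is strictly linear, and a rewritten commit is
named by appending a prime to the old id. It only models the graph; real
git-rebase also re-applies each change and may stop on conflicts. The
helper names (ancestors, rebase) are invented for this example.

```python
def ancestors(history, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(history[commit])
    return seen


def rebase(history, topic_tip, onto):
    """Replay the commits unique to topic_tip on top of onto.

    Returns (new_history, new_tip). Rewritten commits get a fresh id
    because their parent has changed.
    """
    upstream = ancestors(history, onto)
    # Walk the topic branch back to the point where it left upstream.
    series, commit = [], topic_tip
    while commit not in upstream:
        series.append(commit)
        (commit,) = history[commit]  # toy model: exactly one parent
    new_history, tip = dict(history), onto
    for old in reversed(series):  # oldest first
        new = old + "'"
        new_history[new] = [tip]
        tip = new
    return new_history, tip


# master: a-b-c-d; the topic branch left master at b: b-t1-t2
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["c"],
           "t1": ["b"], "t2": ["t1"]}
rebased, new_tip = rebase(history, "t2", "d")
```

After the call, the rewritten series sits directly on top of d, so
every commit reachable from the new tip has at most one parent: the
history reads as if the topic had been developed serially after master.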
 
>>>Interesting.  We don't do 'fast-forward' in that case.
>>
>> Fast-forward is a really good idea. Perhaps you could implement it,
>> if it is not hidden under different name?
> 
> We support it as 'pull', but merge doesn't do it automatically, because
> we'd rather have merge behave the same all the time, and because 'pull'
> throws away your local commit ordering.

I smell yet another terminology conflict (although this time the fault
is on the git side): in git terminology, "pull" is "fetch" (i.e. getting
the changes made in the remote repository since the last "fetch" or
since "clone") followed by a merge. pull = fetch + merge.

>>>So it sounds to me like git is extensible, though not as thoroughly as bzr.
>>
>>
>> I think having good API for C, shell and Perl (and to lesser extent for any
>> scripting language) means that it is extensible more.
> 
> I guess it's a value judgement on which is more important to extensibility:
> 
> Git has more language support.
> 
> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
> and more.  Because Python supports monkey-patching, a plugin can change
> absolutely anything.

Which is _not_ a good idea. Git is designed so that the repository
is abstracted away (the introduction of the pack format, and later
improvements to it, were done "behind the scenes" without changing any
porcelain (user) commands), and we don't want any change that would
break this abstraction. Changing the repository format is not a good
idea for "dumb" protocols. The native protocol is quite extensible (for
example, the multi-ack extension was introduced to download multiple
branches with fewer objects in the pack sent; even earlier, thin packs
were introduced), and it does a kind of feature detection between client
and server. Adding cURL-based read-only FTP support to the existing HTTP
support was a matter of a few lines, if I remember correctly.

Besides, if monkey-patching is something akin to advices, I guess that
performance might suffer.


To make a perhaps not very good analogy: in git, adding a new command is
like adding a new filesystem to the Linux kernel using the existing VFS
interface, or the existing FUSE/LUFS interface. In Bazaar, adding a new
command is like writing new filesystem support (a plugin) for a
microkernel like L4 or Mach. (And please take note of what project git
was created for :-))

-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
  2006-10-17 22:00                   ` Sean
@ 2006-10-17 22:00                   ` Sean
  2006-10-17 22:44                     ` Aaron Bentley
  2006-10-20  9:43                     ` Matthieu Moy
  1 sibling, 2 replies; 806+ messages in thread
From: Sean @ 2006-10-17 22:00 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Tue, 17 Oct 2006 17:27:44 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
> and more.  Because Python supports monkey-patching, a plugin can change
> absolutely anything.

But really why does any of that matter?  This is the open source world.
We don't need plugins to extend features, we just add the feature to
the source.  The example I asked about earlier is a case in point. 
Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
was implemented as a command without any issue at all, no plugins
needed, and it's compiled and runs at machine speed.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:27               ` Aaron Bentley
  2006-10-17 21:51                 ` Jakub Narebski
       [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
@ 2006-10-17 22:03                 ` Linus Torvalds
  2006-10-17 22:53                   ` Aaron Bentley
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 22:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> >>Interesting.  We don't do 'fast-forward' in that case.
> > 
> > Fast-forward is a really good idea. Perhaps you could implement it,
> > if it is not hidden under different name?
> 
> We support it as 'pull', but merge doesn't do it automatically, because
> we'd rather have merge behave the same all the time, and because 'pull'
> throws away your local commit ordering.

Excuse me? What does that "throws away your local commit ordering" mean?

A fast-forward does no such thing. It leaves the local commit ordering 
alone, it just appends other things on top of it. It's the only sane thing 
you can do, since the work you merged was already based on your top 
commit.

So generating an extra "merge" commit would be actively wrong, and adds 
"history" that is not history at all.

It also means that if people merge back and forth from each other, you get 
into an endless loop of useless merge commits. What's the point? They only 
clutter up the history, and they mean that you can never agree on a common 
state.

There's no reason _ever_ to not just fast-forward if one repository is a 
strict superset of the other.
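
As a rough illustration of the rule above (not git's actual code), the
decision can be sketched in Python on a toy history stored as a dict
from commit id to parent ids. The function names and the merge id "m"
are assumptions made up for this example.

```python
def ancestors(history, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(history[commit])
    return seen


def merge(history, ours, theirs, merge_id="m"):
    """Merge theirs into ours; return (history, new_tip, kind)."""
    if theirs in ancestors(history, ours):
        # We already contain everything they have.
        return history, ours, "up to date"
    if ours in ancestors(history, theirs):
        # Their history is a strict superset of ours: just move the tip.
        return history, theirs, "fast-forward"
    # Genuinely diverged: record a commit with two parents.
    merged = dict(history)
    merged[merge_id] = [ours, theirs]
    return merged, merge_id, "merge commit"


# a-b-c, then x-y built on c (topic) and d built on c (diverging master)
history = {"a": [], "b": ["a"], "c": ["b"],
           "x": ["c"], "y": ["x"], "d": ["c"]}
_, ff_tip, ff_kind = merge(history, "c", "y")
merged, merge_tip, merge_kind = merge(history, "d", "y")
```

In the first call no new commit is created and the existing order
a-b-c is left untouched, with x and y simply appended on top. Only the
second call, where both sides have commits the other lacks, produces a
merge commit.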

You must be doing something wrong. Is it just that people want to pee in 
the snow and leave their mark?

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 16:41             ` Linus Torvalds
@ 2006-10-17 22:27               ` Robert Collins
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 806+ messages in thread
From: Robert Collins @ 2006-10-17 22:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

On Tue, 2006-10-17 at 09:41 -0700, Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Robert Collins wrote:
> 
> > On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> > > 
> > >           ---- time --->
> > > 
> > >     --*--*--*--*--*--*--*--*--*-- <branch>
> > >           \            /
> > >            \-*--X--*--/
> > > 
> > > The branch it used to be on is gone...
> > 
> > In bzr 0.12 this is :
> > 2.1.2
> > 
> > (assuming the first * is numbered '1'.)
> > 
> > These numbers are fairly stable
> 
> And here, by "fairly stable", you really mean "totally idiotic", don't 
> you?
> 
> Guys, let's be blunt here, and just say you're wrong. The fact is, I've 
> used a system that uses the same naming bzr does, and I've used it likely 
> longer and with a bigger project than anybody has likely _ever_ used bzr 
> for.
> 
> It sounds like bzr is doing _exactly_ what bitkeeper did. 
> 
> Those "simple" numbers are totally idiotic. And when I say "totally 
> idiotic", please go back up a few sentences, and read those again. I know 
> what I'm talking about. I know probably better than anybody in the bzr 
> camp.

Be as blunt as you want. You're expressing an opinion, and that's fine. I
happen to think that we're right: users appear to really appreciate
this bit of the UI, and I've not yet seen any evidence of confusion
about it - though I will admit there is the possibility of that
occurring.

I think it's completely ok that git and bzr have made different choices
in this regard, but I *don't* think our choice is in any regard 'totally
idiotic'.

[snip examples that are clearly predicated on how bk worked, not on how
bzr works].

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:51                 ` Jakub Narebski
@ 2006-10-17 22:28                   ` Aaron Bentley
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-18  6:22                   ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:28 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:

>> What is the bad side of using merge in this situation?
> 
> We want linear history, not polluted by merges. For example you cannot
> send merge commit via email.

Oh.  Bazaar supports sending merge commits by email.

> Another problem is that you want to
> send _series_ of patches, string of commits (revisions), creating feature
> part by part, with clean history; with merge you get _final result_
> which will apply cleanly, with rebase you would get that series
> of patches will apply cleanly.

Yes, that's something that I'd heard about the kernel development
methodology-- that a series of small patches is preferred to one patch
that makes the whole change.

That's not the way we operate.  We like to review all the changes at
once.  But because bundles are applied with a 'merge' command, not a
'patch' command, an old bundle will tend to apply more cleanly than an
old patch would.

> I smell yet another terminology conflict (although this time fault is
> on the git side), namely that in git terminology "pull" is "fetch"
> (i.e. getting changes done in remote repository since laste "fetch"
> or since "clone") followed by merge. pull = fetch + merge.

I guess so, since git merge will do fast-forward after a fetch.

>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
> 
> Which is _not_ a good idea. Git is created in such way, that the repository
> is abstracted away (introduction of pack format, and improving pack format
> can and was done "behind the scenes", not changing any porcelanish (user)
> commands), but we don't want any chage that would change this abstraction.

I'm not sure what you think Bazaar does.  In Bazaar, a repository format
plugin  implements the same API that a native repository format does.

This is how bzr supports Subversion, Mercurial and Git repositories.

> Changing repository format is not a good idea for "dumb" protocols; 

I can't parse this.  Repository formats and protocols are different
things, right?

> native
> protocol is quite extensible

I was meaning dumb protocol extension.  I can't say how extensible the
bzr native protocol is.
> Adding
> cURL based FTP read-only support to existing HTTP support was a matter
> of few lines, if I remember correctly.

We support read and write over native, ftp and WebDAV (a plugin).  We
also have readonly http support.

> Besides, if monkey-patching is something akin to advices, I guess that
> performance might suffer.

No, monkey-patched code executes at the same speed as unpatched code.
There are arguments against monkey-patching, but speed is not one of them.
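
For readers who have not met the term, a minimal Python sketch of
monkey-patching follows. The class and function names are hypothetical
stand-ins and are not part of bzrlib; the point is only that a plugin
can rebind an attribute on an existing class at runtime.

```python
class Repository:
    """Stand-in for a class inside the host application (hypothetical)."""

    def describe(self):
        return "native format"


def plugin_describe(self):
    # Replacement behaviour supplied by a hypothetical plugin.
    return "plugin format"


repo = Repository()
before = repo.describe()

# The monkey patch: rebind the method on the class itself.
Repository.describe = plugin_describe
after = repo.describe()
```

Once the attribute is rebound, calling it is an ordinary method lookup
on the class, which is the basis for the claim that patched code runs
no slower than unpatched code. Note that the already existing instance
picks up the new behaviour too.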

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNVkM0F+nu1YWqI0RAjCaAJwOcWSUdVy7RpUZROJVxAC9aj/V/wCfUg0T
uHkdc9k6i+v0QnhEvTXdszM=
=YO8G
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:00                   ` Sean
@ 2006-10-17 22:44                     ` Aaron Bentley
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
  2006-10-20  9:43                     ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:44 UTC (permalink / raw)
  To: Sean; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:
> On Tue, 17 Oct 2006 17:27:44 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
>> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
> 
> But really why does any of that matter?  This is the open source world.
> We don't need plugins to extend features, we just add the feature to
> the source.

That can lead to feature bloat.  Some plugins are not useful to
everyone, e.g. Mercurial repository support.  Some plugins introduce
additional dependencies that we don't want to have in the core (e.g. the
rsync, baz-import and graph-ancestry commands).

Plugins also don't have Bazaar's rigid release cycle, testing
requirements and coding conventions, so they are a convenient way to try
out an idea, before committing to the effort of getting it merged into
the core.

> The example I asked about earlier is a case in point. 
> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
> was implemented as a command without any issue at all, no plugins
> needed, and its compiled and runs at machine speed.

The bisect plugin is just as performant as any other bzr command.  (The
whole VCS is in Python.)  Most people don't use it, so we don't ship it
as part of the base install, but anyone who wants it can have it.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNVy70F+nu1YWqI0RAnlxAJ9+ZXryG/KJxi6hjpz+U/gU3y06MQCdH2Ez
cFlnxwWksB+q2b1dXI3cfwo=
=HAy6
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:03                 ` Linus Torvalds
@ 2006-10-17 22:53                   ` Aaron Bentley
  2006-10-17 23:09                     ` Linus Torvalds
  2006-10-17 23:24                     ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 22:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>>> Interesting.  We don't do 'fast-forward' in that case.
>>> Fast-forward is a really good idea. Perhaps you could implement it,
>>> if it is not hidden under different name?
>> We support it as 'pull', but merge doesn't do it automatically, because
>> we'd rather have merge behave the same all the time, and because 'pull'
>> throws away your local commit ordering.
> 
> Excuse me? What does that "throws away your local commit ordering" mean?

Say this is the ordering in branch A:

a
|
b
|
c

Say this is the ordering in branch B:

a
|
b
|\
d c
|/
e

When A pulls B, it gets the same ordering as B has.  If B did not have e
and c, the pull would fail.
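
The example can be sketched in Python. This assumes that "ordering"
means the chain of first parents from the branch tip back to the root,
and that e lists d as its first parent, as the diagram suggests; both
are my reading, and the helper names are invented rather than taken
from bzrlib.

```python
def ancestors(history, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(history[commit])
    return seen


def mainline(history, tip):
    """Follow first parents from tip back to the root; oldest first."""
    line = []
    while tip is not None:
        line.append(tip)
        parents = history[tip]
        tip = parents[0] if parents else None
    return line[::-1]


def pull(history, local_tip, remote_tip):
    """Adopt the remote tip, refusing if local work would be lost."""
    if local_tip not in ancestors(history, remote_tip):
        raise ValueError("branches have diverged")
    return remote_tip


# e merges c into d, so e's first parent is d.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}
before = mainline(history, "c")                      # branch A
after = mainline(history, pull(history, "c", "e"))   # A after pulling B
```

Before the pull, A's first-parent chain is a, b, c; afterwards it is
a, b, d, e. Commit c is still in the history, but it is no longer on
A's own line of commits, which is what changes for A. Pulling a B whose
tip is d raises the error, because d does not contain c.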

> So generating an extra "merge" commit would be actively wrong, and adds 
> "history" that is not history at all.

It's not a tree change, but it records the fact that one branch merged
the other.

> It also means that if people merge back and forth from each other, you get 
> into an endless loop of useless merge commits.

You can pull if you don't want that.  We haven't found that people are
very fussed about it.

> There's no reason _ever_ to not just fast-forward if one repository is a 
> strict superset of the other.

Maybe not in Git.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNV7u0F+nu1YWqI0RAhGtAJwOlWpl088pbl63EHyF04qQCYlXBgCfW0Tm
cfXuE0vqeWelfFbpzffiCNI=
=McQ2
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
@ 2006-10-17 22:56                         ` Sean
  2006-10-17 23:11                           ` Jakub Narebski
  2006-10-18 21:04                           ` Charles Duffy
  2006-10-17 22:56                         ` Sean
  2006-10-18 21:51                         ` Petr Baudis
  2 siblings, 2 replies; 806+ messages in thread
From: Sean @ 2006-10-17 22:56 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Tue, 17 Oct 2006 18:44:11 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> That can lead to feature bloat.  Some plugins are not useful to
> everyone, e.g. Mercurial repository support.  Some plugins introduce
> additional dependencies that we don't want to have in the core (e.g. the
> rsync, baz-import and graph-ancestry commands).

Shrug, it's really not that tough to do in regular ole source code.
On Fedora for instance you have your choice of which rpms you want
to install to get the features of Git you want.

> Plugins also don't have a Bazaar's rigid release cycle, testing
> requirements and coding conventions, so they are a convenient way to try
> out an idea, before committing to the effort of getting it merged into
> the core.

Hmm.. It's pretty easy to test out Git ideas too.  People do it all
the time, and without plugins.  Junio maintains several such trees
for instance.  Dunno.. I just think plugins _sound_ good to developers
without much real benefit to users over regular ole source code.

> The bisect plugin is just as performant as any other bzr command.  (The
> whole VCS is in Python.)  Most people don't use it, so we don't ship it
> as part of the base install, but anyone who wants it can have it.

Sure, and anyone who wants to use StGit on top of Git can download and
use it as well.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:28                   ` Aaron Bentley
@ 2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
                                         ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 22:57 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Aaron Bentley wrote:
> 
>>> What is the bad side of using merge in this situation?
>>
>> We want linear history, not polluted by merges. For example you cannot
>> send merge commit via email.
> 
> Oh.  Bazaar supports sending merge commits by email.
> 
>> Another problem is that you want to
>> send _series_ of patches, string of commits (revisions), creating feature
>> part by part, with clean history; with merge you get _final result_
>> which will apply cleanly, with rebase you would get that series
>> of patches will apply cleanly.
> 
> Yes, that's something that I'd heard about the kernel development
> methodology-- that a series of small patches is preferred to one patch
> that makes the whole change.
> 
> That's not the way we operate.  We like to review all the changes at
> once.  But because bundles are applied with a 'merge' command, not a
> 'patch' command, an old bundle will tend to apply more cleanly than an
> old patch would.

Perhaps it would be nice to have "bundles" in git too. As of now
we can save an arbitrary part of history in a pack, but that is a
binary, not a textual, representation.

Some of git workflow stems from old, pre-SCM Linux kernel workflow
of sending _patches_ via email.


By the way, are bzr "bundles" compatible with ordinary patch?
git-format-patch patches are. They have additional metainfo,
but they are patches at heart.
  
>>> and more.  Because Python supports monkey-patching, a plugin can change
>>> absolutely anything.
>>
>> Which is _not_ a good idea. Git is created in such way, that the repository
>> is abstracted away (introduction of pack format, and improving pack format
>> can and was done "behind the scenes", not changing any porcelanish (user)
>> commands), but we don't want any chage that would change this abstraction.
> 
> I'm not sure what you think Bazaar does.  In Bazaar, a repository format
> plugin  implements the same API that a native repository format does.
> 
> This is how bzr supports Subversion, Mercurial and Git repositories.

But if I remember correctly, Subversion does not remember merge points
(merge commits), so how can you provide full Bazaar-NG compatibility
with a Subversion repository as the backend? Some repository formats
lack some features. Besides, as I said, in git the repository database
is already quite well abstracted away.

In git we have import tools (most of them capable of incremental import),
a few exchange tools like git-cvsexportcommit, git-cvsserver, and
Tailor-like git-svn.
 
>> Changing repository format is not a good idea for "dumb" protocols;
> 
> I can't parse this.  Repository formats and protocols are different
> things, right?

"Dumb" protocols in git are protocols for which server provides access
to the contents of the git repository plus some additional info (usually
generated using hooks). The client (be it git-fetch or git-push) works
out which files to download or upload, but it can only transfer the
repository "as is". So if the server repository was created with a
repository format plugin and the client doesn't have that plugin, you
are out of luck.
 
>> native protocol is quite extensible
> 
> I was meaning dumb protocol extension.  I can't say how extensible the
> bzr native protocol is.

The native git protocol (git:// and git+ssh://) does feature discovery, then
negotiates which contents have to be sent, and finally tries to send a minimal
number of objects.
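
A rough sketch of the "negotiate, then send a minimal set" idea, in Python.
This is a toy model and not git's actual wire protocol: the commit names, the
`server_parents` map and the `objects_to_send` helper are all made up for
illustration. The server sends only what is reachable from what the client
wants and not reachable from what the client says it already has.

```python
# Toy model of "negotiate, then send the minimal set of objects".
# NOT git's wire format; commit names and helpers are hypothetical.

def reachable(parents, tips):
    """Return every commit reachable from the given tips (tips included)."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, ()))
    return seen


def objects_to_send(parents, wants, haves):
    """Commits the server must send: reachable from wants, not from haves."""
    return reachable(parents, wants) - reachable(parents, haves)


# Server history: a <- b <- c, plus a side branch b <- x, merged at d.
server_parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["b"],
    "d": ["c", "x"],
}

# The client already has everything up to c and wants the tip d.
to_send = objects_to_send(server_parents, wants=["d"], haves=["c"])
print(sorted(to_send))
```

With a "dumb" protocol there is no such conversation: the client can only
fetch whatever files the server happens to expose.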

>> Adding
>> cURL based FTP read-only support to existing HTTP support was a matter
>> of few lines, if I remember correctly.
> 
> We support read and write over native, ftp and WebDAV (a plugin).  We
> also have readonly http support.

Git has read-only access over the git:// protocol (served by git-daemon on
port 9418), read-write access over the git+ssh:// protocol (you can limit
exposure using git-shell), read-only access via the HTTP, HTTPS and FTP
"dumb" protocols, and read-write access via the WebDAV "dumb" protocol.

Git is open-source, we don't need plugins ;-)
-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
@ 2006-10-17 22:59                       ` Jakub Narebski
  2006-10-17 23:16                       ` Linus Torvalds
  2006-10-17 23:33                       ` Aaron Bentley
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 22:59 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

Jakub Narebski wrote:

> Git has read-only access over git:// protocol (served by git-daemon on
> port 9418), read-write access over git+ssh:// protocol (you can limit
> exposition using git-shell), read-only access via HTTP, HTTPS, FTP "dumb"
> protocols, read-write access via WebDAV "dumb" protocol.

And the deprecated rsync:// "dumb" protocol, which is read-only (I think)
and suggested for use only when cloning.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:53                   ` Aaron Bentley
@ 2006-10-17 23:09                     ` Linus Torvalds
  2006-10-18  0:23                       ` Aaron Bentley
  2006-10-17 23:24                     ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:09 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> > 
> > Excuse me? What does that "throws away your local commit ordering" mean?
> 
> Say this is the ordering in branch A:
> 
> a
> |
> b
> |
> c
> 
> Say this is the ordering in branch B:
> 
> a
> |
> b
> |\
> d c
> |/
> e
> 
> When A pulls B, it gets the same ordering as B has.  If B did not have e
> and c, the pull would fail.

Sure. But that doesn't throw away any local commit ordering. The original 
order (a->b->c) is still very much there. The fact that there was a branch 
off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't 
take away anything from the original local commit ordering. 
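
The point can be checked on a toy model of the two graphs quoted above. This
is only a sketch in Python: the dictionaries and the `is_ancestor` helper are
illustrative names, not git internals. A pull can fast-forward when the local
head is an ancestor of the remote head, and the ancestry of 'c' is the same
before and after.

```python
# Toy commit graphs from the quoted example; each commit maps to its parents.
# The helper names are hypothetical, chosen for this sketch only.

def ancestors(parents, tip):
    """Return the set of commits reachable from tip, including tip."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def is_ancestor(parents, old, new):
    """True if 'old' is reachable from 'new', i.e. a fast-forward is possible."""
    return old in ancestors(parents, new)


branch_a = {"a": [], "b": ["a"], "c": ["b"]}
branch_b = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

# A's head is c, B's head is e: c is an ancestor of e, so A can fast-forward.
can_fast_forward = is_ancestor(branch_b, "c", "e")

# The history leading to c is identical in both graphs.
same_history_for_c = ancestors(branch_a, "c") == ancestors(branch_b, "c")

print(can_fast_forward, same_history_for_c)
```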

> > So generating an extra "merge" commit would be actively wrong, and adds 
> > "history" that is not history at all.
> 
> It's not a tree change, but it records the fact that one branch merged
> the other.

But that's a totally specious "record". It has no meaning in a distributed 
SCM. There is absolutely zero semantic information in it.

The fact that you _locally_ want to remember where you were is a total 
non-issue for a true distributed system. You shouldn't force everybody 
else to see your local view - since it has no relevance to them, and 
doesn't add any information.

> Maybe not in Git.

I don't think there is any in bzr either. Can you explain?

In other words, the empty merge is totally semantically empty even in the 
bazaar world. Why does it exist?

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:56                         ` Sean
@ 2006-10-17 23:11                           ` Jakub Narebski
  2006-10-18 21:04                           ` Charles Duffy
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:11 UTC (permalink / raw)
  To: Sean; +Cc: Andreas Ericsson, bazaar-ng, git

/me too post ;-)

Sean wrote:
> On Tue, 17 Oct 2006 18:44:11 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> 
> > That can lead to feature bloat.  Some plugins are not useful to
> > everyone, e.g. Mercurial repository support.  Some plugins introduce
> > additional dependencies that we don't want to have in the core (e.g. the
> > rsync, baz-import and graph-ancestry commands).
> 
> Shrug, it's really not that tough to do in regular ole source code.
> On Fedora for instance you have your choice of which rpms you want
> to install to get the features of Git you want.

git-core, git-email, git-arch, git-cvs, git-svn, gitk
(and git-debuginfo).

gitk and gitweb were developed in their own repositories, but some time
ago got incorporated into the git repository. We have a contrib/ area.
QGit, Cogito, StGit are developed separately.

> > Plugins also don't have a Bazaar's rigid release cycle, testing
> > requirements and coding conventions, so they are a convenient way to try
> > out an idea, before committing to the effort of getting it merged into
> > the core.
> 
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugs _sounds_ good to developers
> without much real benefit to users over regular ole source code.

Thanks to the many low-level (plumbing in git-speak) commands it is very
easy to prototype (actually, write) a new command in a language suitable
for fast prototyping, i.e. shell or Perl (or Python, too). Then if it is
performance critical, or if the shell script version gets troublesome to
manage, it gets rewritten in C as a builtin command.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
@ 2006-10-17 23:16                       ` Linus Torvalds
  2006-10-18  5:36                         ` Jeff King
  2006-10-17 23:33                       ` Aaron Bentley
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> 
> Perhaps it would be nice to have "bundles" in git too. As of now
> we can save arbitrary part of history in a pack, but it is binary
> not textual representation.
> 
> Some of git workflow stems from old, pre-SCM Linux kernel workflow
> of sending _patches_ via email.

Actually, the reason to _not_ have bundles very much stems from the fact 
that BK did have bundles, and they were pretty horrid.

It would be easy to send the exact same data as the native git protocol 
sends over ssh (or the git port) as an email encoding. We did that a few 
times with BK (there it's called "bk send" and "bk receive" to pack and 
unpack those things), and after doing it about five times, I absolutely 
refused to ever do it again. There's just no point, except to make your 
mailbox grow without bounds, and it was really annoying. 

So sending things as patches is just a lot more convenient if you want 
emails.  And if you want to sync two repos directly, I think we've gotten 
sufficiently past the old UUCP days when you want to use email as a 
packetization medium.

That said, "bundles" certainly wouldn't be _hard_ to do. And as long as 
nobody tries to send _me_ any of them, I won't mind ;)

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
@ 2006-10-17 23:18                   ` Sean
  2006-10-17 23:18                   ` Sean
  2006-10-17 23:33                   ` Petr Baudis
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-17 23:18 UTC (permalink / raw)
  To: Robert Collins; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Wed, 18 Oct 2006 08:27:58 +1000
Robert Collins <robertc@robertcollins.net> wrote:

> Be as blunt as you want. You're expressing an opinion, and thats fine. I
> happen to think that we're right : users appear to really appreciate
> this bit of the UI, and I've not yet seen any evidence of confusion
> about it - though I will admit there is the possibility of that
> occurring.

Yeah, but it's an opinion that is based on a huge real world project with
hundreds of developers.  If Bazaar is ever used in a project of that
size it may just see the same type of issues as Bk.  As has been mentioned
elsewhere, Git users really appreciate the short forms it provides for
referencing commits, so much so that there is no reason to invent a
new (unstable) numbering system or attempt to hide the true underlying
commit identities.

Just out of curiosity is there a Bazaar repo of the Linux kernel available
somewhere?

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:53                   ` Aaron Bentley
  2006-10-17 23:09                     ` Linus Torvalds
@ 2006-10-17 23:24                     ` Jakub Narebski
  2006-10-17 23:50                       ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:24 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:

[...]

>> So generating an extra "merge" commit would be actively wrong, and adds
>> "history" that is not history at all.
> 
> It's not a tree change, but it records the fact that one branch merged
> the other.
> 
>> It also means that if people merge back and forth from each other, you get
>> into an endless loop of useless merge commits.
> 
> You can pull if you don't want that.  We haven't found that people are
> very fussed about it.
> 
>> There's no reason _ever_ to not just fast-forward if one repository is a
>> strict superset of the other.
> 
> Maybe not in Git.

Think about what a merge commit is for. It is a place where
we can record how we resolved conflicts. It means: we _merged_ (joined)
two (or more: does bzr support octopus merge?) lines of development.

A merge commit in the fast-forward case only marks "here we did a pull"
(here we downloaded from another repository). It is just a marker whose
place is in the reflog, not in history. It only clutters history.


Besides, one of the canonical workflows used and encouraged by git is:

 * repository A does its own work on branch 'master',
   and fetches changes from the 'master' branch of repository B
   into branch 'origin'. "git pull origin" when on branch 'master'
   fetches changes from the 'master' branch of repository B (usually
   requiring that it fast-forwards) into branch 'origin', then
   merges branch 'origin' into branch 'master', automatically
   creating the merge commit message.

 * repository B does its own work on branch 'master',
   and fetches changes from the 'master' branch of repository A
   into [tracking] branch 'origin'. (...)

Instead of pull/fetch, we could use push.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 19:44               ` Aaron Bentley
@ 2006-10-17 23:28                 ` Petr Baudis
  2006-10-17 23:39                 ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-17 23:28 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Dear diary, on Tue, Oct 17, 2006 at 09:44:37PM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Andreas Ericsson wrote:
> >> In our terminology, if it can diverge from the original, it's a branch,
> >> not a checkout.
> >>
> > 
> > This clears things up immensely. bazaar checkout != git checkout.
> > I still fail to see how a local copy you can't commit to is useful
> 
> My bzr is run from a local copy I can't commit to.  To get the latest
> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
> To merge the latest changes into my branch, I can run
> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
> patches to.

The question is, why is it useful to enforce the "no commit" rule? Git
can work exactly the same, it just doesn't _enforce_ the rule. And is
the capability of enforcing such a rule important enough to warrant its
own column in the comparison table?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
  2006-10-17 23:18                   ` Sean
  2006-10-17 23:18                   ` Sean
@ 2006-10-17 23:33                   ` Petr Baudis
  2006-10-18  5:26                     ` Robert Collins
  2 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-17 23:33 UTC (permalink / raw)
  To: Sean; +Cc: Robert Collins, Linus Torvalds, bazaar-ng, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 01:18:38AM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 08:27:58 +1000
> Robert Collins <robertc@robertcollins.net> wrote:
> 
> > Be as blunt as you want. You're expressing an opinion, and thats fine. I
> > happen to think that we're right : users appear to really appreciate
> > this bit of the UI, and I've not yet seen any evidence of confusion
> > about it - though I will admit there is the possibility of that
> > occurring.
> 
> Yeah, but it's an opinion that is based on a huge real world project with
> hundreds of developers.  If Bazaar is ever used in a project of that
> size it may just see the same type of issues as Bk.  As has been mentioned
> elsewhere, Git users really appreciate the short forms it provides for
> referencing commits, so much so that there is no reason to invent a
> new (unstable) numbering system or attempt to hide the true underlying
> commit identities.

BTW, I think it's fine to build a system optimized for small-scale
projects (if that's the intent), simplifying some things in favour of
mostly straight histories instead of more complicated merge situations
(although I tend to agree with Linus that if you don't behave in the way
the users are used to in 100% of cases, the more frequently you behave so
the worse it comes back to bite in the rare cases you do). Just as RCS
is fine when maintaining individual files for personal usage (I still
actually occasionally use it for a few files).

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:57                     ` Jakub Narebski
  2006-10-17 22:59                       ` Jakub Narebski
  2006-10-17 23:16                       ` Linus Torvalds
@ 2006-10-17 23:33                       ` Aaron Bentley
  2006-10-18  8:13                         ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-17 23:33 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> By the way, are bzr "bundles" compatibile with ordinary patch?
> git-format-patch patches are. They have additional metainfo,
> but they are patches in heart.

Yes, they are.

>> I'm not sure what you think Bazaar does.  In Bazaar, a repository format
>> plugin  implements the same API that a native repository format does.
>>
>> This is how bzr supports Subversion, Mercurial and Git repositories.
> 
> But if I remember correctly Subversion does not remember merge points
> (merge commits), so how can you provide full Bazaar-NG compatibility
> with Subversion repository as backend? Some repository formats lack
> some features.

That's true.  We support merge points in a way that's compatible with
svk.  Subversion allows revisions to have arbitrary properties, and svk
sets a property to indicate merges.

> In git we have import tools (most of them capable of incremental import),
> a few exchange tools like git-cvsexportcommit, git-cvsserver, and
> Tailor-like git-svn.

Bzr's subversion support is quite nice.  You can commit, merge, run
history viewers.

There are screenshots and stuff here:
http://bazaar-vcs.org/BzrForeignBranches/Subversion

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNWhc0F+nu1YWqI0RAkH7AJ4/S648shA8IKg42xcGWdjnjmA+PgCdEDhg
Af/mcG+XTy3Tsb9b1x3rYcg=
=xnjF
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:01             ` Jakub Narebski
  2006-10-17 21:27               ` Aaron Bentley
@ 2006-10-17 23:35               ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, bazaar-ng, git

On Tuesday, 17 October 2006 23:01, Jakub Narebski wrote:
> Aaron Bentley wrote:
> > Andreas Ericsson wrote:
> >> Aaron Bentley wrote:
> >>> Ah.  Bazaar uses negative numbers to refer to <n>th parents, and
> >>> positive numbers to refer to the number of commits that have been made
> >>> since the branch was initialized.
> >>>
> >>
> >> What do you do once a branch has been thrown away, or has had 20 other
> >> branches merged into it? Does the offset-number change for the revision
> >> then, or do you track branch-points explicitly?
> > 
> > We always track the number of parents since the initial commit in the
> > project.  Sorry, I don't think I said that clearly before.
> 
> While this I think is quite reliable (there was idea to store "generation
> number" with each commit, e.g. using not implemented "note" header, or
> commit-id to generation number "database" as a better heuristic than
> timestamp for revision ordering in git-rev-list output), and probably
> independent on repository (it is global property of commit history,
> and commit history is included in sha1 of its parents), numbering branching
> points is unreliable, as is relying on branch names.

Take for example the following situation:


Initially we had

  A--B--C--D  - repository A

then we cloned the repository

  A--B--C--D  - repository B

Then, in parallel/independently we branched off C in repository A, and
branched off B in repository B

          -x
         /
  A--B--C--D  - repository A


  A--B--C--D  - repository B
      \
       -y

If we then fetch changes from B into A, and fetch changes from A into B,
then in repository A the branch off C appeared earlier than the branch
off B, while in repository B the branch off C appeared later.
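
The situation above can be sketched in Python. This is a toy model under
stated assumptions: "generation number" is taken here to mean the length of
the longest path from the root commit, and the order in which a repository
first saw a branch is modelled by dictionary insertion order. All names
(`generation`, `branch_order`, `repo_a`, `repo_b`) are made up for the sketch.

```python
# Toy model: both repositories end up with the same commit graph, but each
# one saw the side branches x and y arrive in a different order.

def generation(parents, commit, cache):
    """Length of the longest path from a root commit to this commit."""
    if commit not in cache:
        cache[commit] = 1 + max(
            (generation(parents, p, cache) for p in parents[commit]),
            default=0,
        )
    return cache[commit]


def all_generations(parents):
    """Generation number of every commit in the graph."""
    cache = {}
    for commit in parents:
        generation(parents, commit, cache)
    return cache


def branch_order(repo):
    """Order in which this repository first saw the side-branch commits."""
    return [commit for commit in repo if commit in ("x", "y")]


# Insertion order stands in for local arrival order.
repo_a = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "x": ["C"], "y": ["B"]}
repo_b = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "y": ["B"], "x": ["C"]}

print(all_generations(repo_a) == all_generations(repo_b))
print(branch_order(repo_a), branch_order(repo_b))
```

The generation numbers depend only on the commit graph, so they agree in both
repositories; any numbering based on when a branch showed up locally does not.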
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 19:44               ` Aaron Bentley
  2006-10-17 23:28                 ` Petr Baudis
@ 2006-10-17 23:39                 ` Jakub Narebski
  2006-10-18  0:24                   ` Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-17 23:39 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Andreas Ericsson, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
>> This clears things up immensely. bazaar checkout != git checkout.
>> I still fail to see how a local copy you can't commit to is useful
> 
> My bzr is run from a local copy I can't commit to.  To get the latest
> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
> To merge the latest changes into my branch, I can run
> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
> patches to.

Can you do "bzr log" in 'checkout', without needing to specify "~/bzr/dev"?
If not, how does this differ from checking out (in git terminology) outside
the default working area, and having to provide GIT_DIR or --git-dir for
everything?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:24                     ` Jakub Narebski
@ 2006-10-17 23:50                       ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-17 23:50 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> 
> Merge commit in fast-forward case is only marking "here we did a pull"
> (here we downloaded from other repository). It is just a marker which
> place is in reflog, not in history. It is only cluttering history.

For non-git people (and maybe even git people who didn't follow some of 
the "reflog" work):

 - git does actually have "local view" support, but it is very much 
   _defined_ to be local. It does not pollute any history as seen by 
   anybody else. It's called "reflog" (where "ref" is just the git name 
   for any reference into a tree, and the "log" part is hopefully obvious)

So each git repository can have (if you enable it) a full log of all the 
changes to each branch. But it's not in the core git datastructures that 
get replicated - because the local view of how the branches have changed 
really _is_ just a local view. It's just a local log to each repository 
(actually, one per branch).

It's what allows a git person to say

	git diff "master@{5.hours.ago}"

because while "5 hours ago" is _not_ well-defined in a distributed 
environment (five hours ago for _whom_?) it's perfectly well-defined in a 
purely _local_ sense of one particular branch.

So there's no need for a fakey "merge" that isn't a real merge and that 
doesn't make sense for anybody else because it doesn't actually add any 
real knowledge about the _history_ of the tree (only about a single 
repository). If you want to see how the history of a particular repository 
has evolved, you can just look at the reflog (although admittedly, common
tools like "gitk" don't even show it - the data is there if they wanted
to, but the most common usage is the above kind of "show me what
happened in the last five hours in my current branch").
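
As a rough illustration of why such a log is purely local, here is a toy
per-branch log in Python. It is a sketch only and does not reproduce git's
real reflog storage: the `Reflog` class, its methods and the commit names are
hypothetical. Each entry records when the branch was moved in *this*
repository, so "five hours ago" has a well-defined answer here and nowhere
else.

```python
import bisect

# Toy per-branch log of (time, commit) entries, kept only in one repository.
# Hypothetical structure for illustration; not git's on-disk reflog format.

class Reflog:
    def __init__(self):
        self._times = []    # timestamps, assumed to be appended in order
        self._commits = []  # value of the branch from that time onwards

    def record(self, timestamp, commit):
        """Note that the branch pointed at 'commit' from 'timestamp' on."""
        self._times.append(timestamp)
        self._commits.append(commit)

    def at(self, timestamp):
        """Return what the branch pointed at, locally, at 'timestamp'."""
        index = bisect.bisect_right(self._times, timestamp)
        if index == 0:
            return None  # the branch did not exist here yet
        return self._commits[index - 1]


HOUR = 3600
now = 100 * HOUR

master_log = Reflog()
master_log.record(now - 8 * HOUR, "c1")
master_log.record(now - 6 * HOUR, "c2")
master_log.record(now - 2 * HOUR, "c3")

# Rough analogue of asking for the branch as it was five hours ago.
five_hours_ago = master_log.at(now - 5 * HOUR)
print(five_hours_ago)
```

Another clone of the same project would have a different log, even though
the commit history it holds is identical.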

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 14:41                     ` Jakub Narebski
@ 2006-10-18  0:00                       ` Petr Baudis
  2006-10-18  0:30                         ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:00 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthieu Moy, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 04:41:02PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> "Bundle" equivalent, although binary in nature, would be thin pack.

It should be noted that there's no user interface for sending/receiving
that and I suspect no reasonably usable user interface for creating it.

How frequently are the bundles used in practice?

It's a cultural difference, I suspect. Git comes from an environment
based on intensive exchanges of patches and patch series and an
environment not mandating developers to use any tool besides diff/patch,
so Git is very focused on good support for applying patches, and there
simply has been no big conscious demand for bundle support given this.

Another aspect of this is that Git (Linus ;) is very focused on getting
the history right, nice and clean (though it does not _mandate_ it and
you can just wildly do one commit after another; it just provides tools
to easily do it). This means that the downstream maintainers have to
rebase patches, possibly reorder them, and update the changesets with
bugfixes instead of stacking the bugfixes upon them in separate changes
- then Linus merges the patches and only at that point they are "etched"
forever. This means that the history will contain a neatly laid out
account of how $FEATURE was achieved, but of course it also means more
work for downstream maintainers.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
  2006-10-17 15:06                                 ` Sean
  2006-10-17 15:06                                 ` Sean
@ 2006-10-18  0:14                                 ` Petr Baudis
  2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
  2 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:14 UTC (permalink / raw)
  To: Sean; +Cc: Matthieu Moy, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 05:06:55PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> [1] As an aside, I don't understand why bazaar pushes the idea
> of "plugins".  For instance someone mentioned that bazaar has
> a bisect "plugin".  Well Git was able to add a bisect "command"
> without needing a plugin architecture.. so i'm at a loss as 
> to why plugins are seen as an advantage.

Greater flexibility, you can "provide this great Git addon that will
let you push over FTP" without requiring users to patch their Git
installations or wait for new Git version that might include it.
This is especially important if you want a lot of users to test out your
experimental feature, or if it's something project-specific, etc.

BTW, I'm thinking about implementing some plugin functionality for
gitweb so that you can add your own views, so that git-browser can
integrate to it more reasonably. (Currently it has completely different
UI and you have to patch gitweb in order to get the proper links at
proper places.) Sure, git-browser might get fully integrated to gitweb
later but that needs to be done sensitively so that people are not
scared by the horrible javascript blobs, etc.; currently git-browser is
very experimental, and adding it would be quite intrusive.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:09                     ` Linus Torvalds
@ 2006-10-18  0:23                       ` Aaron Bentley
  2006-10-18  0:46                         ` Jakub Narebski
                                           ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>> Excuse me? What does that "throws away your local commit ordering" mean?
>> Say this is the ordering in branch A:
>>
>> a
>> |
>> b
>> |
>> c
>>
>> Say this is the ordering in branch B:
>>
>> a
>> |
>> b
>> |\
>> d c
>> |/
>> e
>>
>> When A pulls B, it gets the same ordering as B has.  If B did not have e
>> and c, the pull would fail.
> 
> Sure. But that doesn't throw away any local commit ordering. The original 
> order (a->b->c) is still very much there.

After the pull, it's no longer the mainline ordering for the branch.  c
is represented as a revision that was merged into the branch, while d is
represented as a commit on the mainline of the branch.

> The fact that there was a branch 
> off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't 
> take away anything from the original local commit ordering.

It means that the order in which revisions are shown in log commands changes,
and the revision numbers can change.
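
A small Python sketch of that effect, under two assumptions that are a
reading of the diagram rather than a statement about bzr internals: the
mainline is the chain of first parents, and 'e' lists 'd' as its first
parent. The `mainline` and `revnos` helpers are hypothetical names.

```python
# Toy model: number revisions along the first-parent chain ("mainline").
# Assumes e's first parent is d, as the diagram suggests.

def mainline(parents, tip):
    """Follow first parents from tip back to the root, oldest first."""
    chain = [tip]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    chain.reverse()
    return chain


def revnos(parents, tip):
    """Map each mainline commit to its position, counting from 1."""
    return {commit: n for n, commit in enumerate(mainline(parents, tip), start=1)}


branch_a = {"a": [], "b": ["a"], "c": ["b"]}
branch_b = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

before = revnos(branch_a, "c")  # numbering in A before the pull
after = revnos(branch_b, "e")   # numbering in A after taking B's ordering

print(before)
print(after)
```

Before the pull, revision 3 is 'c'; afterwards revision 3 is 'd', and 'c' no
longer has a mainline number at all.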

> But that's a totally specious "record". It has no meaning in a distributed 
> SCM. There is absolutely zero semantic information in it.

It records the committer, the date, the commit message, the parent
revisions.

> The fact that you _locally_ want to remember where you were is a total 
> non-issue for a true distributed system. You shouldn't force everybody 
> else to see your local view - since it has no relevance to them, and 
> doesn't add any information.

Nobody is forced to use your local view.

> In other words, the empty merge is totally semantically empty even in the 
> bazaar world. Why does it exist?

It exists because it is useful.  Because it makes the behavior of bzr
merge uniform.  Because in some workflows, commits show that a person
has signed off on a change.

It's not something special-- it's just another commit, like regular
commits, and merge commits.  It would be harder to forbid than it is to
permit.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXQQ0F+nu1YWqI0RAnxDAJ4hbuLkEK1eBlyoEOz7NAlqLVth9gCfed4w
nfeiR2KVvN+N9zdSrC8MKcY=
=et73
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:39                 ` Jakub Narebski
@ 2006-10-18  0:24                   ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, Linus Torvalds, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>> This clears things up immensely. bazaar checkout != git checkout.
>>> I still fail to see how a local copy you can't commit to is useful
>> My bzr is run from a local copy I can't commit to.  To get the latest
>> changes from http://bazaar-vcs.org, I can run "bzr update ~/bzr/dev".
>> To merge the latest changes into my branch, I can run
>> "bzr merge ~/bzr/dev".  It's also convenient for applying other peoples'
>> patches to.
> 
> Can you do "bzr log" in 'checkout', without need to specify "~/bzr/dev"?

Sure.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXRU0F+nu1YWqI0RAptIAJ0btflKFEjF9a7Kt/qVZufK003DpACeK7Dc
leW4ICG1LbOC9DGrAd5ztlY=
=JGvL
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
  2006-10-17 12:56                   ` Jakub Narebski
       [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
@ 2006-10-18  0:25                   ` Petr Baudis
  2006-10-18  0:38                     ` Aaron Bentley
       [not found]                     ` <4535778D.40006@utoronto.ca>
  2006-10-18  1:11                   ` Petr Baudis
  3 siblings, 2 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:25 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Sean, Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> Sean <seanlkml@sympatico.ca> writes:
> 
> > On Tue, 17 Oct 2006 13:19:08 +0200
> > Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> >
> >> 1) a working tree without any history information, pointing to some
> >>    other location for the history itself (a la svn/CVS/...).
> >>    (this is "light checkout")
> >
> > Git can do this from a local repository, it just can't do it from
> > a remote repo (at least over the git native protocol).  However,
> > over gitweb you can grab and unpack a tarball from a remote repo.
> > In practice this is probably enough support for such a feature.
> 
> Anyway, given the price of disk space today,

(In rich countries. This may still be very different in poorer
countries.  E.g. some actual mplayer developer(s) from Turkey opposed
transition to a distributed version control system simply because they
have trouble affording the required additional diskspace for the full
history.  SVN is already very space-hungry for them.  (It stores
basically two complete checkouts in parallel.))

But the much bigger practical problem is bandwidth: plenty of people
still have internet connections where downloading several tens/hundreds
of megabytes of the complete history is quite a big thing, and the
servers ain't gonna be happy from that either, nor those paying the
bandwidth bills. ;-) And this is one of the big problems the Mozilla
guys have - having everyone download 450M worth of the full CVS-imported
history (and I'll bet no other VCS will beat that size) seems to be not
an option at all.

> this only makes sense if
> you have a fast access to the repository (otherwise, you consider your
> local repository as a cache, and you're ready to pay the disk space
> price to save your bandwidth). In this case, it's often in your
> filesystem (local or NFS).

So how is the light checkout actually implemented? Do you grab the
complete new snapshot each time the remote repository is updated? Do all
the (at least read-only, like "log" and "diff", perhaps "status")
commands work on such a light checkout?

This is something sorely missing in Git but if it's really only "we just
provide bandwidth-expensive way to keep your tree up-to-date and that's
all," that would not be hard at all to implement in Git too, using
git-archive --remote.
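
A minimal sketch of that bandwidth-expensive approach, in Python. The
helper names (archive_command, refresh_tree) are made up for
illustration, and it assumes the remote side permits archive requests.
Each call downloads a complete snapshot and unpacks it over the
destination directory, so files deleted upstream are not removed
locally.

```python
import subprocess


def archive_command(remote, treeish="HEAD"):
    """Build the 'git archive' call that asks `remote` for a tar snapshot."""
    return ["git", "archive", "--format=tar", "--remote=" + remote, treeish]


def refresh_tree(remote, dest, treeish="HEAD"):
    """Stream a full snapshot of `treeish` from `remote` and unpack it in `dest`.

    No history is stored locally; every refresh transfers the whole tree.
    """
    producer = subprocess.Popen(archive_command(remote, treeish),
                                stdout=subprocess.PIPE)
    # Pipe the tar stream straight into 'tar' so nothing is buffered on disk.
    consumer = subprocess.run(["tar", "-x", "-C", dest],
                              stdin=producer.stdout)
    producer.stdout.close()
    if producer.wait() != 0 or consumer.returncode != 0:
        raise RuntimeError("snapshot download failed")
```

Read-only commands such as "log" or "diff" against history would still
need the remote, which is exactly the part this sketch does not cover.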

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:00                       ` Petr Baudis
@ 2006-10-18  0:30                         ` Aaron Bentley
  2006-10-18  0:39                           ` Petr Baudis
                                             ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:30 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> How frequently are the bundles used in practice?

Many times each day.  Most submissions to the bzr mainline are done with
bundles.

> Another aspect of this is that Git (Linus ;) is very focused on getting
> the history right, nice and clean (though it does not _mandate_ it and
> you can just wildly do one commit after another; it just provides tools
> to easily do it).

Yes, rebasing is very uncommon in the bzr community.  We would rather
evaluate the complete change than walk through its history.  (Bundles
only show the changes you made, not the changes you merged from the
mainline.)

In an earlier form, bundles contained a patch for every revision, and
people *hated* reading them.  So there's definitely a cultural
difference there.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXWW0F+nu1YWqI0RAuRnAJ9aZVLo4T1sfmyGC2t364UyHX+6wACff7sM
peal5rAdk/T515RGeKXkWlo=
=O61J
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:25                   ` Petr Baudis
@ 2006-10-18  0:38                     ` Aaron Bentley
       [not found]                     ` <4535778D.40006@utoronto.ca>
  1 sibling, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:38 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
>> this only makes sense if
>> you have a fast access to the repository (otherwise, you consider your
>> local repository as a cache, and you're ready to pay the disk space
>> price to save your bandwidth). In this case, it's often in your
>> filesystem (local or NFS).
> 
> So how is the light checkout actually implemented? Do you grab the
> complete new snapshot each time the remote repository is updated?

No, the lightweight checkouts store very little.  They have
- - a copy of tree shape (filenames, paths, sha1 sums) from the last
  commit.
- - a copy of tree shape for the current working directory
- - a map from stat values to sha-1 hashes
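
As a rough illustration of the third item (not bzr's actual code), a
map from stat values to sha-1 hashes could look like the Python sketch
below.  The function names and the choice of stat fields (size, mtime,
inode) are assumptions made for the example; the point is only that a
file is re-read and re-hashed when its stat data no longer matches a
cached entry.

```python
import hashlib
import os


def stat_key(path):
    """Cheap fingerprint of a file taken from its stat data alone."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns, st.st_ino)


def cached_sha1(path, cache):
    """Return the sha-1 of `path`, hashing the content only on a cache miss."""
    key = (path, stat_key(path))
    if key not in cache:
        # Stat data changed (or the file is new): read and hash the content.
        with open(path, "rb") as handle:
            cache[key] = hashlib.sha1(handle.read()).hexdigest()
    return cache[key]
```

With such a cache, a "status"-like scan only needs one stat call per
unchanged file rather than a full read.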


> Do all
> the (at least read-only, like "log" and "diff", perhaps "status")
> commands work on such a light checkout?

Yes.  And if you check out from a read-write branch, all write commands
work, too.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXeN0F+nu1YWqI0RAsdrAJ0bUj4swxm5sod9WnsbPZ9yIQ7FVQCdE4UB
8x0ddFkbr5cPISTihw96d8c=
=/XAr
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:30                         ` Aaron Bentley
@ 2006-10-18  0:39                           ` Petr Baudis
  2006-10-18  1:28                           ` Jakub Narebski
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
  2 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:39 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> > Another aspect of this is that Git (Linus ;) is very focused on getting
> > the history right, nice and clean (though it does not _mandate_ it and
> > you can just wildly do one commit after another; it just provides tools
> > to easily do it).
> 
> Yes, rebasing is very uncommon in the bzr community.  We would rather
> evaluate the complete change than walk through its history.  (Bundles
> only show the changes you made, not the changes you merged from the
> mainline.)
> 
> In an earlier form, bundles contained a patch for every revision, and
> people *hated* reading them.  So there's definitely a cultural
> difference there.

BTW, I think what describes the Git's (kernel's) stance very nicely is
what I call the Al Viro's "homework problem":

	http://lkml.org/lkml/2005/4/7/176

If I understand you right, the bzr approach is what's described as "the
dumbest kind" there? (No offense meant!)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                     ` <4535778D.40006@utoronto.ca>
@ 2006-10-18  0:42                       ` Petr Baudis
  2006-10-18  0:48                       ` Jakub Narebski
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
  2 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:42 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 02:38:37AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> >> this only makes sense if
> >> you have a fast access to the repository (otherwise, you consider your
> >> local repository as a cache, and you're ready to pay the disk space
> >> price to save your bandwidth). In this case, it's often in your
> >> filesystem (local or NFS).
> > 
> > So how is the light checkout actually implemented? Do you grab the
> > complete new snapshot each time the remote repository is updated?
> 
> No, the lightweight checkouts store very little.  They have
> - a copy of tree shape (filenames, paths, sha1 sums) from the last
>   commit.
> - a copy of tree shape for the current working directory
> - a map from stat values to sha-1 hashes

I see, I guess that means "the index file and tree objects for the last
commit" in git-speak. Thanks.

> > Do all
> > the (at least read-only, like "log" and "diff", perhaps "status")
> > commands work on such a light checkout?
> 
> > Yes.  And if you check out from a read-write branch, all write commands
> > work, too.

Ok, one last question - do you do most of the work locally, fetching
bits of data as you need, or remotely, only taking input/producing
output over the network (the pserver model)?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:23                       ` Aaron Bentley
@ 2006-10-18  0:46                         ` Jakub Narebski
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
  2006-10-18  3:25                         ` Ryan Anderson
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  0:46 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Linus Torvalds wrote:
>>
>> On Tue, 17 Oct 2006, Aaron Bentley wrote:
> >>> Excuse me? What does that "throws away your local commit ordering" mean?
> >> Say this is the ordering in branch A:
> >>
> >> a
> >> |
> >> b
> >> |
> >> c
> >>
> >> Say this is the ordering in branch B:
> >>
> >> a
> >> |
> >> b
> >> |\
> >> d c
> >> |/
> >> e
> >>
> >> When A pulls B, it gets the same ordering as B has.  If B did not have e
> >> and c, the pull would fail.
> >
> > Sure. But that doesn't throw away any local commit ordering. The original
> > order (a->b->c) is still very much there.
> 
> After the pull, it's no longer the mainline ordering for the branch.  c
> is represented as a revision that was merged into the branch, while d is
> represented as a commit on the mainline of the branch.

Well, that is another example of why, while generation number is/can be global,
any numbering of branches must be local-only.

> > The fact that there was a branch
> > off 'b' and there is also (a->b->d) and a merge of the two at 'e' doesn't
> > take away anything from the original local commit ordering.
> 
> > It means that the order that revisions are shown in log commands changes,

That doesn't matter...

> and the revision numbers can change.

...but that means that revision numbers are totally, absolutely useless.
Unless by some miracle of engineering, or adding namespace, they can be
made unchangeable.

> > But that's a totally specious "record". It has no meaning in a distributed
> > SCM. There is absolutely zero semantic information in it.
> 
> It records the committer, the date, the commit message, the parent
> revisions.

All totally empty information. What should the commit message be? "I have
fetched changes from a remote repository"? You can remove one of the parents
(the one pointing to the commit before the fast-forward "merge") without
changing reachability.

              ---------
             /         \
     *--*---x---*---*---y---*
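
The reachability claim can be checked mechanically.  The Python sketch
below is an illustration only; the commit names (r, x, m1, m2, y) and
both helper functions are hypothetical.  A parent is redundant when it
is already an ancestor of another parent of the same commit, so
dropping it leaves the set of reachable commits unchanged.

```python
def reachable(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def redundant_parents(parents, commit):
    """Parents of `commit` that are already ancestors of another parent."""
    result = []
    for parent in parents[commit]:
        others = [p for p in parents[commit] if p != parent]
        if any(parent in reachable(parents, other) for other in others):
            result.append(parent)
    return result


# The diagram above: y has the line x -> m1 -> m2 as one parent and x
# itself as the other, i.e. a recorded "fast-forward merge".
parents = {"r": (), "x": ("r",), "m1": ("x",), "m2": ("m1",),
           "y": ("m2", "x")}
pruned = dict(parents, y=("m2",))   # same history with the extra edge dropped
```

Here redundant_parents(parents, "y") yields ["x"], and the reachable
sets of y in `parents` and `pruned` are identical.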

> > The fact that you _locally_ want to remember where you were is a total
> > non-issue for a true distributed system. You shouldn't force everybody
> > else to see your local view - since it has no relevance to them, and
> > doesn't add any information.
> 
> Nobody is forced to use your local view.

But if you record a "fast-forward merge", you force everyone pulling
from your repository to carry this purely local "I have fetched then"
marker, which holds no significant information.

> > In other words, the empty merge is totally semantically empty even in the
> > bazaar world. Why does it exist?
> 
> It exists because it is useful.  Because it makes the behavior of bzr
> merge uniform.  Because in some workflows, commits show that a person
> has signed off on a change.

Signing off on the fact of fetching changes? For a true merge you are signing
off on the fact that there were no conflicts, or you sign off on your conflict
resolution.

> It's not something special-- it's just another commit, like regular
> commits, and merge commits.  It would be harder to forbid than it is to
> permit.

Actually the check is very easy. And you have to do a similar check when
fetching/pushing to ensure that you don't clobber your changes.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                     ` <4535778D.40006@utoronto.ca>
  2006-10-18  0:42                       ` Petr Baudis
@ 2006-10-18  0:48                       ` Jakub Narebski
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  0:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Petr Baudis, Matthieu Moy, Sean, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> Petr Baudis wrote:
>>> this only makes sense if
>>> you have a fast access to the repository (otherwise, you consider your
>>> local repository as a cache, and you're ready to pay the disk space
>>> price to save your bandwidth). In this case, it's often in your
>>> filesystem (local or NFS).
>>
>> So how is the light checkout actually implemented? Do you grab the
>> complete new snapshot each time the remote repository is updated?
> 
> No, the lightweight checkouts store very little.  They have
> - a copy of tree shape (filenames, paths, sha1 sums) from the last
>   commit.
> - a copy of tree shape for the current working directory
> - a map from stat values to sha-1 hashes

Ah. So in git terminology it stores the index and the working directory
(and perhaps the name of the branch).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
@ 2006-10-18  0:50                         ` Aaron Bentley
       [not found]                         ` <45357A6E.3050603@utoronto.ca>
  1 sibling, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  0:50 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:

> Ok, one last question - do you do most of the work locally, fetching
> bits of data as you need, or remotely, only taking input/producing
> output over the network (the pserver model)?

Personally, I do not do remote commits over slow links.  At home, I use
a single machine, and mirror my repository to a public machine using
rsync.  At work, I store my repository on an NFS server, and push my
repository to a public machine using rsync.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXpu0F+nu1YWqI0RAjPTAJ4w9YOM5XLpnIP9jYywtfMr+LZLvACfdycA
/TYAGUVGweR5+cPtDVAIBq4=
=rsNR
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                         ` <45357A6E.3050603@utoronto.ca>
@ 2006-10-18  0:57                           ` Petr Baudis
  2006-10-18  1:05                             ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  0:57 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 02:50:54AM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> Petr Baudis wrote:
> 
> > Ok, one last question - do you do most of the work locally, fetching
> > bits of data as you need, or remotely, only taking input/producing
> > output over the network (the pserver model)?
> 
> Personally, I do not do remote commits over slow links.  At home, I use
> a single machine, and mirror my repository to a public machine using
> rsync.  At work, I store my repository on an NFS server, and push my
> repository to a public machine using rsync.

I meant the work of the commands (bzr log and such), not your personal
workflow. :-) Sorry for being unclear.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
@ 2006-10-18  1:00                           ` Aaron Bentley
  2006-10-18  1:25                             ` Carl Worth
  2006-10-18  3:35                             ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  1:00 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>> Linus Torvalds wrote:
>>> On Tue, 17 Oct 2006, Aaron Bentley wrote:
>>>>> Excuse me? What does that "throws away your local commit ordering" mean?
>>>> Say this is the ordering in branch A:
>>>>
>>>> a
>>>> |
>>>> b
>>>> |
>>>> c
>>>>
>>>> Say this is the ordering in branch B:
>>>>
>>>> a
>>>> |
>>>> b
>>>> |\
>>>> d c
>>>> |/
>>>> e
>>>>
>>>> When A pulls B, it gets the same ordering as B has.  If B did not have e
>>>> and c, the pull would fail.
>>> Sure. But that doesn't throw away any local commit ordering. The original
>>> order (a->b->c) is still very much there.
>> After the pull, it's no longer the mainline ordering for the branch.  c
>> is represented as a revision that was merged into the branch, while d is
>> represented as a commit on the mainline of the branch.
> 
> Well, that is another example of why, while generation number is/can be global,
> any numbering of branches must be local-only.

No.  The numbering always follows the leftmost parent.  So each revision
has a permanent (but non-unique) number.

> That doesn't matter...

It has significant UI impact.

>> and the revision numbers can change.
> 
> ...but that means that revision numbers are totally, absolutely useless.
> Unless by some miracle of engineering, or adding namespace, they can be
> made unchangeable.

No, because no one pulls unless they're trying to maintain a mirror of
the other branch, or else they decide to throw their local history away.

>> Nobody is forced to use your local view.
> 
> But if you record "fast-forward merge", you force all people pulling
> from your repository to have this purely local and without any significant
> information "I have fetched then" marker.

Even if I agreed that the revision was meaningless, the cost of such a
revision is minuscule.

>>> In other words, the empty merge is totally semantically empty even in the
>>> bazaar world. Why does it exist?
>> It exists because it is useful.  Because it makes the behavior of bzr
>> merge uniform.  Because in some workflows, commits show that a person
>> has signed off on a change.
> 
> Signing off the fact of fetching changes? For true merge you are signing
> off the fact that there were no conflicts, or you sign off your conflict
> resolution.

You sign off on the contents of the revision you fetched.  You say "I
have reviewed this revision, and approved it."

>> It's not something special-- it's just another commit, like regular
>> commits, and merge commits.  It would be harder to forbid than it is to
>> permit.
> 
> Actually the check is very easy.

Agreed.  It's just that not checking is easier still.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNXzD0F+nu1YWqI0RAiGvAJsEbPNNlqZ7QCH7EE39YABqEm/BtwCaAxIo
NHqG4NVZpvymTUlCLYyCqKM=
=YUdC
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:57                           ` Petr Baudis
@ 2006-10-18  1:05                             ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  1:05 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Sean, Jakub Narebski, Linus Torvalds, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> Dear diary, on Wed, Oct 18, 2006 at 02:50:54AM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
>> Petr Baudis wrote:
>>
>>> Ok, one last question - do you do most of the work locally, fetching
>>> bits of data as you need, or remotely, only taking input/producing
>>> output over the network (the pserver model)?
>> Personally, I do not do remote commits over slow links.  At home, I use
>> a single machine, and mirror my repository to a public machine using
>> rsync.  At work, I store my repository on an NFS server, and push my
>> repository to a public machine using rsync.
> 
> I meant the work of the commands (bzr log and such), not your personal
> workflow. :-) Sorry for being unclear.

When using the native network protocol, work can happen remotely.  (But
the native protocol is quite new, and support for "smart" operations is
currently limited.)  When using the dumb protocols, data is fetched from
the remote system and processed locally.  Light checkouts are not
recommended when the server is on a slow link, but heavyweight checkouts
are quite suitable in that situation.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNX3j0F+nu1YWqI0RAtRcAJ0fEZam6H3hs3YHY/dEYEhk3A73BQCdENHY
s9+KZTfqnDJg8mHNmC2C/Ok=
=Nqcn
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 12:03                 ` Matthieu Moy
                                     ` (2 preceding siblings ...)
  2006-10-18  0:25                   ` Petr Baudis
@ 2006-10-18  1:11                   ` Petr Baudis
  2006-10-18  6:44                     ` Matthieu Moy
  3 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  1:11 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Sean, Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 02:03:21PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> I have one repository, say, $repo.
> 
> In it, I have one branch "$repo/bzr.dev" which is an exact mirror of
> http://bazaar-vcs.org's branch.
> 
> I also have branches for patches (occasional in my case) that I'll
> send to upstream. Say $repo/feature1, $repo/feature2, ...
> 
> If, by mistake, I start hacking on bzr.dev itself, I'll be warned at
> commit time, create a branch, and commit in this new branch. I believe
> git manages this in a different way, allowing you to commit in this
> branch, and creating the branch next time you pull. But you know this
> better than I ;-), I never got time to give a real try to git.

In fact, in Git the branch is actually created at the moment you clone.

For simplicity's sake, let's say you cloned just a single branch, not the
whole repository (or imagine a repository with a single branch). Then,
in your local repository, two branches will be created: 'origin' and
'master'. The origin branch is considered readonly (though Git does
not enforce it) and only mirrors the branch in the remote repository.
The master branch is the branch you do your work on, and it corresponds
to the contents of your working tree.

Thus, when you are "updating" your repository (we also call that
"pull"), what happens is that new commits are _fetched_ from the remote
repository to your 'origin' branch and then the 'origin' branch is
_merged_ to the 'master' branch. (You can even separate those two steps
and do them manually. So you can e.g. periodically fetch but just check
diffs with your master branch and never actually merge, or whatever.)

If you never do any local commits on the repository, every time you
merge, the 'master' branch is an ancestor of the 'origin' branch and only
a so-called fast-forward merge happens - the 'master' branch is updated to
point at the same commit as the 'origin' branch.

If you _did_ do some local commits, a real merge of the two branches
happens and a new merge commit tying the current master and origin
history together is recorded on the merge branch.
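
The decision described above can be sketched in a few lines of Python.
This is a toy model, not Git's implementation: the history is a plain
dict from commit name to parent names, and the pull() helper and the
"merge(...)" naming are invented for the example.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents.get(commit, ()))
    return seen


def pull(parents, refs, fetched_tip):
    """Model 'fetch into origin, then merge origin into master'."""
    refs["origin"] = fetched_tip                  # the fetch step
    master, origin = refs["master"], refs["origin"]
    if origin in ancestors(parents, master):
        return "up-to-date"                       # nothing new to merge
    if master in ancestors(parents, origin):
        refs["master"] = origin                   # fast-forward: move the pointer
        return "fast-forward"
    merge = "merge(%s,%s)" % (master, origin)
    parents[merge] = (master, origin)             # real merge: two parents
    refs["master"] = merge
    return "merge"


# c is the upstream commit on top of b; d is a local commit on top of b.
parents = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",)}
no_local_work = {"master": "b", "origin": "b"}
local_work = {"master": "d", "origin": "b"}
```

Calling pull(parents, no_local_work, "c") takes the fast-forward path,
while pull(parents, local_work, "c") has to record a merge commit with
both d and c as parents.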

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:00                           ` Aaron Bentley
@ 2006-10-18  1:25                             ` Carl Worth
  2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  3:35                             ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-18  1:25 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2002 bytes --]

On Tue, 17 Oct 2006 21:00:51 -0400, Aaron Bentley wrote:
> Jakub Narebski wrote:
> > Well, that is another example of why, while generation number is/can be global,
> > any numbering of branches must be local-only.
>
> No.  The numbering always follows the leftmost parent.  So each revision
> has a permanent (but non-unique) number.

Aaron, thanks for carrying this thread along and helping to bridge
some communication gaps. For example, when I saw your original two
diagrams I was totally mystified how you were claiming that appending
a couple of nodes and edges to a DAG could change the "order" of the
DAG.

I think I understand what you're describing with the leftmost-parent
ordering now. But it's definitely an ordering that I would describe as
local-only. That is, the ordering has meaning only with respect to a
particular linearization of the DAG and that linearization is
different from one repository to the next.

> > ...but that means that revision numbers are totally, absolutely useless.
> > Unless by some miracle of engineering, or adding namespace, they can be
> > made unchangeable.
>
> No, because no one pulls unless they're trying to maintain a mirror of
> the other branch, or else they decide to throw their local history away.

If in practice, nobody does the mirroring "pull" operation then how
are the numbers useful? For example, given your examples above, if
I'm understanding the concepts and terminology correctly, then if A
and B both "merge" from each other (and don't "pull") then they will
each end up with identical DAGs for the revision history but totally
distinct numbers. Correct?

So in that situation the numbers will not help A and B determine that
they have identical history or even identical working trees. So what
good are the numbers?

I can see that the numbers would have applicability with reference to
a single repository, (or equivalently a mirror of that repository),
but no utility as soon as there is any distributed development
happening.
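
To make the point concrete, here is a small Python sketch of
leftmost-parent numbering as I understand it from this thread.  It is
a model for discussion, not bzr code; the helper names are
hypothetical, and the graph is the a/b/c/d/e example quoted earlier,
with d as the leftmost parent of e.

```python
def mainline(parents, tip):
    """Follow the leftmost (first) parent from `tip` back to the root."""
    chain = [tip]
    while parents[chain[-1]]:
        chain.append(parents[chain[-1]][0])
    chain.reverse()
    return chain


def revnos(parents, tip):
    """Map each mainline revision to its 1-based number as seen from `tip`."""
    return {rev: n for n, rev in enumerate(mainline(parents, tip), start=1)}


parents = {"a": (), "b": ("a",), "c": ("b",), "d": ("b",), "e": ("d", "c")}
branch_a = revnos(parents, "c")   # branch A before the pull, tip at c
branch_b = revnos(parents, "e")   # branch B, tip at the merge e
```

In this model number 3 names c when counted from A's tip and d when
counted from B's tip, and c has no mainline number at all in B.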

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:30                         ` Aaron Bentley
  2006-10-18  0:39                           ` Petr Baudis
@ 2006-10-18  1:28                           ` Jakub Narebski
  2006-10-18  1:44                             ` Carl Worth
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
  2 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:28 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	Petr Baudis, git

Aaron Bentley wrote:
> Petr Baudis wrote:
>>
>> Another aspect of this is that Git (Linus ;) is very focused on getting
>> the history right, nice and clean (though it does not _mandate_ it and
>> you can just wildly do one commit after another; it just provides tools
>> to easily do it).
> 
> Yes, rebasing is very uncommon in the bzr community.  We would rather
> evaluate the complete change than walk through its history.  (Bundles
> only show the changes you made, not the changes you merged from the
> mainline.)
> 
> In an earlier form, bundles contained a patch for every revision, and
> people *hated* reading them.  So there's definitely a cultural
> difference there.

Take for example 
 "[PATCH 0/6] ref deletion and D/F conflict avoidance with packed-refs."
 http://thread.gmane.org/gmane.comp.version-control.git/28150/focus=28154

> This series cleans up the area that was affected by the recent
> addition of "packed-refs".  Christian Couder and Jeff King CC'ed
> since they seem to be touching in the general vicinity of the
> code these patches touch.
> 
> [1/6] ref locking: allow 'foo' when 'foo/bar' used to exist but not anymore.
> [2/6] refs: minor restructuring of cached refs data.
> [3/6] lock_ref_sha1(): do not sometimes error() and sometimes die().
> [4/6] lock_ref_sha1(): check D/F conflict with packed ref when creating.
> [5/6] delete_ref(): delete packed ref
> [6/6] git-branch: remove D/F check done by hand.
> 
> I opted for removing from the packed-ref file when a ref that is
> packed is deleted.

Isn't it easier to review than "bundle", aka. mega-patch?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  0:14                                 ` Petr Baudis
@ 2006-10-18  1:36                                   ` Jakub Narebski
  2006-10-18  1:52                                     ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:36 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> BTW, I'm thinking about implementing some plugin functionality for
> gitweb 

Feature support is a kind of plugin system for gitweb. But we could
certainly split gitweb into modules.

> so that you can add your own views, so that git-browser can 
> integrate to it more reasonably. (Currently it has completely different
> UI and you have to patch gitweb in order to get the proper links at
> proper places.) Sure, git-browser might get fully integrated to gitweb
> later but that needs to be done sensitively so that people are not
> scared by the horrible javascript blobs, etc.; currently git-browser is
> very experimental, and adding it would be quite intrusive.

I was thinking about using JavaScript to add one extra "diagram" column
to the shortlog (and perhaps shortlog-extended, i.e. with date and author)
views, with its width set by a JavaScript-generated embedded style, and
using only the part of git-browser that generates the diagram to draw it there.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:28                           ` Jakub Narebski
@ 2006-10-18  1:44                             ` Carl Worth
  2006-10-18  3:27                               ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-18  1:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

[-- Attachment #1: Type: text/plain, Size: 1950 bytes --]

On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>
> Isn't it easier to review than "bundle", aka. mega-patch?

There are even more important reasons to prefer a series of
micro-commits over a mega-patch than just ease of merging.

In the cairo project, I've often reviewed a single patch and said:

	"This all looks like perfectly good code and I'd be happy to
	have it all in the tree. But please rebuild this as a series
	of independent patches (perhaps along the lines of a, b, c,
	...)"

I do that not just to make the history "look nice" but because code
history is something we _use_ a lot and separate commits for separate
actions just make the history so much more usable.

We have great tools like bisect to identify commits that introduce
bugs. I know that I'd be delighted to see bisect come back pointing
at some minimal commit as causing a bug, (which would make finding the
bug so much easier).
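
The bisect idea above can be sketched in a few lines of Python. This is
only an illustration on a linear history, not git's implementation: the
function name, the commit labels, and the is_bad predicate are all
hypothetical, and it assumes the first commit is good, the last is bad,
and every commit after the first bad one is bad too.

```python
def bisect(commits, is_bad):
    """Return the index of the first bad commit in a linear history.

    Assumes commits[0] is good, commits[-1] is bad, and that badness
    persists once introduced (hypothetical simplification).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the bug was introduced at mid or earlier
        else:
            lo = mid  # the bug was introduced after mid
    return hi


# Hypothetical history of eight small commits; the bug arrives in "c5".
history = ["c0", "c1", "c2", "c3", "c4", "c5", "c6", "c7"]
first_bad = bisect(history, lambda c: int(c[1:]) >= 5)
print(history[first_bad])
```

The search needs about log2(N) tests for N commits, and the commit it
lands on is the whole answer, so the smaller that commit is, the less
code is left to inspect by hand.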

But it's also been my experience that the largest commits are also the
most likely to be the things returned by bisect. Big commits really do
introduce bugs more frequently than small commits.

Finally, if someone had gone through the useful work to create small,
independent changes, (and likely finding and fixing bugs in the
process), what a horrible shame it would be to throw away that work
and merge it as a single patch, (welcome to the pain of CVS branch
merging).

Now, I do admit that it is often useful to take the overall view of a
patch series being submitted. This is often the case when a patch
series is in some sub-module of the code for which I don't have as
much direct involvement. In cases like that I will often do review
only of the diff between the tips of the mainline and the branch of
interest, (or if I trust the maintainer enough, perhaps just the
diffstat between the two). But I'm still very glad that what lands in
the history is the series of independent changes, and not one mega
commit.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 11:19           ` Matthieu Moy
                               ` (2 preceding siblings ...)
  2006-10-17 14:19             ` Olivier Galibert
@ 2006-10-18  1:46             ` Petr Baudis
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  4 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  1:46 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Linus Torvalds, bazaar-ng, git

Dear diary, on Tue, Oct 17, 2006 at 01:19:08PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> 2) a bound branch. It's not _very_ different from a normal branch, but
>    mostly "commit" behaves differently:
>    - it commits both on the local and the remote branch (equivalent to
>      "commit" + "push", but in a transactional way).
>    - it refuses to commit if you're out of date with the branch you're
>      bound to.
>    (this is "heavy checkout")

It isn't very nice because it enforces the update-before-commit
workflow, which was a complaint of many CVS users, and I remember that
avoiding it was one of the selling points of the distributed VCSes in
2001 or so, although it is not so emphasized lately. (I understand that
this is something optional in Bazaar.)

BTW, merge commits aren't bad. They reflect what really happened,
explicitly record the merge resolution taken, if there was any, and
protect you from accidentally losing or damaging [any portion of] your
changes. And they aren't cluttery either since we hide them from
non-graphical history listings by default.

Still, I can recognize that in some scenarios, people might find it
useful, and I can remember some people asking for it in the past. So I
couldn't resist and implemented it in Cogito as cg-commit --push. Pushed
out now. Took me about 5 minutes implementing it and 10 minutes documenting
it.  ;-)


P.S.: A general note for bleeding-edge Cogito users, I've rewritten the
local changes handling so that we always do three-way merge now instead
of that braindead patches diffing/applying, but it's not completely
stable yet, some testcases still fail. So be a bit careful when
updating/uncommitting/switching/... with uncommitted changes in the
working tree.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
@ 2006-10-18  1:52                                     ` Petr Baudis
  2006-10-18  1:58                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  1:52 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 03:36:36AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Petr Baudis wrote:
> 
> > BTW, I'm thinking about implementing some plugin functionality for
> > gitweb 
> 
> Feature support is a kind of plugin system for gitweb. But we could
> certainly split gitweb into modules.
> 
> > so that you can add your own views, so that git-browser can 
> > integrate to it more reasonably. (Currently it has completely different
> > UI and you have to patch gitweb in order to get the proper links at
> > proper places.) Sure, git-browser might get fully integrated to gitweb
> > later but that needs to be done sensitively so that people are not
> > scared by the horrible javascript blobs, etc.; currently git-browser is
> > very experimental, and adding it would be quite intrusive.
> 
> I was thinking about adding using JavaScript, in shortlog (and perhaps
> shortlog-extended, i.e. with date and author) views one extra "diagram"
> column, with width set using JavaScript generated embedded style, and use
> only part of git-browser that generates diagram to draw it there.

Shortlog is paginated and that's not very practical for diagrams, I
think - you need to gradually extend it instead in that case. But yes,
keeping the _visual_ difference of git-browser and gitweb as small as
possible has been the main reason for me to think about integrating it
more tightly.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:52                                     ` Petr Baudis
@ 2006-10-18  1:58                                       ` Jakub Narebski
  2006-10-18  2:02                                         ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  1:58 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

Petr Baudis wrote:
> Dear diary, on Wed, Oct 18, 2006 at 03:36:36AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
>> Petr Baudis wrote:
>>
>>> so that you can add your own views, so that git-browser can 
>>> integrate to it more reasonably. (Currently it has completely different
>>> UI and you have to patch gitweb in order to get the proper links at
>>> proper places.) Sure, git-browser might get fully integrated to gitweb
>>> later but that needs to be done sensitively so that people are not
>>> scared by the horrible javascript blobs, etc.; currently git-browser is
>>> very experimental, and adding it would be quite intrusive.
>> 
>> I was thinking about adding using JavaScript, in shortlog (and perhaps
>> shortlog-extended, i.e. with date and author) views one extra "diagram"
>> column, with width set using JavaScript generated embedded style, and use
>> only part of git-browser that generates diagram to draw it there.
> 
> Shortlog is paginated and that's not very practical for diagrams, I
> think - you need to gradually extend it instead in that case. But yes,
> keeping the _visual_ difference of git-browser and gitweb as small as
> possible has been the main reason for me to think about integrating it
> more tightly.

You can have a paginated graph (diagram), although it is more natural
to have the diagram on the first page only, just like gitk --max-count=100.

The idea is for gitweb to generate the (short)log, perhaps with pagination
turned off (CSS overflow: scroll), and for the git-browser part to generate
the diagram and add it to the log.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Integrating gitweb and git-browser (was: Re: VCS comparison table)
  2006-10-18  1:58                                       ` Jakub Narebski
@ 2006-10-18  2:02                                         ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18  2:02 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 03:58:03AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> You can have paginated graph (diagram). Although it is more natural
> to have diagram on the first page only, just like gitk --max-count=100.

Of course you _can_ have it, but you're going to have a lot of trouble
following the threads over page boundaries, especially if some branch
has no commits whatsoever at some page(s).

> The idea is for gitweb to generate (short)log, perhaps with pagination
> turned off (CSS overflow: scroll), and git-browser part to generate
> diagram and add it to log.

What's missing there is the scary AJAXish thing for fetching more
commits. You do not want to load the whole kernel history at once, but
instead on demand fetch more revisions.

BTW, I'm most probably not the one going to hack git-browser to fit in
this. My javascript knowledge is barely enough to implement a web
browser support for it. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:25                             ` Carl Worth
@ 2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18 15:38                                 ` Carl Worth
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  3:10 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> Aaron, thanks for carrying this thread along and helping to bridge
> some communication gaps. For example, when I saw your original two
> diagrams I was totally mystified how you were claiming that appending
> a couple of nodes and edges to a DAG could change the "order" of the
> DAG.
> 
> I think I understand what you're describing with the leftmost-parent
> ordering now. But it's definitely an ordering that I would describe as
> local-only. That is, the ordering has meaning only with respect to a
> particular linearization of the DAG and that linearization is
> different from one repository to the next.

Well, the linearization for any particular head is well-defined, but
since different branches have different heads...

> If in practice, nobody does the mirroring "pull" operation then how
> are the numbers useful? For example, given your examples above, if
> I'm understanding the concepts and terminology correctly, then if A
> and B both "merge" from each other (and don't "pull") then they will
> each end up with identical DAGs for the revision history but totally
> distinct numbers. Correct?

The DAGs will be different.  If A merges B, we get:

a
|
b
|\
c d
|\|
| e
|/
f

If B merges A before this, nothing happens, because B is already a
superset of A.

If B merges afterward, we get this:
a
|
b
|\
d c
|/|
e |
|\|
| f
|/
g

> So in that situation the numbers will not help A and B determine that
> they have identical history or even identical working trees.

They don't really have identical history.

> So what good are the numbers?

They are good for naming mainline revisions that introduced particular
changes.
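
A minimal Python sketch of the leftmost-parent numbering discussed
above. It assumes each revision lists its parents with the committer's
own previous head first; the parent orders are one reading of the two
diagrams, and mainline_revnos is a hypothetical helper, not a bzr API.

```python
def mainline_revnos(parents, head):
    """Number the revisions found by following the first (leftmost)
    parent from head back to the root; the root gets revno 1."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None  # follow only the leftmost parent
    chain.reverse()
    return {r: i + 1 for i, r in enumerate(chain)}


# Assumed parent lists for the diagrams: e is B's merge of c,
# f is A's merge of e, g is B's merge of f.
parents = {
    "a": [], "b": ["a"], "c": ["b"], "d": ["b"],
    "e": ["d", "c"], "f": ["c", "e"], "g": ["e", "f"],
}
revnos_a = mainline_revnos(parents, "f")  # numbering seen from A's head
revnos_b = mainline_revnos(parents, "g")  # numbering seen from B's head
print(revnos_a)  # {'a': 1, 'b': 2, 'c': 3, 'f': 4}
print(revnos_b)  # {'a': 1, 'b': 2, 'd': 3, 'e': 4, 'g': 5}
```

Under these assumptions revision c is number 3 on A's mainline but has
no mainline number from B's head, which is why a revno only means
something together with the branch it was read from.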

> I can see that the numbers would have applicability with reference to
> a single repository, (or equivalently a mirror of that repository),
> but no utility as soon as there is any distributed development
> happening.

Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
quasi-randomly merge each others' branches.  We have a star topology
around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNZsp0F+nu1YWqI0RAkmWAJ9PkrkubIHVgAn5Wbdkg9IBAHCviACdFx2x
6ClmK4GmC1pRuRQACcSijNM=
=SM1Y
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  0:23                       ` Aaron Bentley
  2006-10-18  0:46                         ` Jakub Narebski
       [not found]                         ` <200610180246.18758.jnareb@gmail.com>
@ 2006-10-18  3:25                         ` Ryan Anderson
  2 siblings, 0 replies; 806+ messages in thread
From: Ryan Anderson @ 2006-10-18  3:25 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On 10/17/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > In other words, the empty merge is totally semantically empty even in the
> > bazaar world. Why does it exist?
>
> It exists because it is useful.  Because it makes the behavior of bzr
> merge uniform.  Because in some workflows, commits show that a person
> has signed off on a change.

In the Git world that happens via "git tag -s", i.e., a
cryptographically strong "signoff".
(There's also the secondary convention of appending Signed-off-by: to
email-applied patches, but that's something that would translate
effectively to any other system, since it's outside the SCM.)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:44                             ` Carl Worth
@ 2006-10-18  3:27                               ` Aaron Bentley
  2006-10-18  9:20                                 ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18  3:27 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>> Isn't it easier to review than "bundle", aka. mega-patch?
> 
> There are even more important reasons to prefer a series of
> micro-commits over a mega-patch than just ease of merging.

A bundle isn't a mega-patch.  It contains all the source revisions.  So
when you merge or pull it, you get all the original revisions in your
repository.


> We have great tools like bisect to identify commits that introduce
> bugs. I know that I'd be delighted to see bisect come back pointing
> at some minimal commit as causing a bug, (which would make finding the
> bug so much easier).

Bisect should work equally well with revisions pulled or merged from a
bundle as revisions re-committed from patches.

> But it's also been my experience that the largest commits are also the
> most likely to be the things returned by bisect. Big commits really do
> introduce bugs more frequently than small commits.

The number of changes shown in the diff has nothing to do with the
number of changes made per commit.

> Now, I do admit that it is often useful to take the overall view of a
> patch series being submitted. This is often the case when a patch
> series is in some sub-module of the code for which I don't have as
> much direct involvement. In cases like that I will often do review
> only of the diff between the tips of the mainline and the branch of
> interest, (or if I trust the maintainer enough, perhaps just the
> diffstat between the two). But I'm still very glad that what lands in
> the history is the series of independent changes, and not one mega
> commit.

So the difference here is that bundles preserve the original commits the
changes came from, so even though it's presented as an overview, you
still have a series of independent changes in your history.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNZ820F+nu1YWqI0RAjNyAJ90HMCAiopuAMvkKlcCEdc4F6QKLwCdGEWI
VOZThAQrvqybe5z93eC44BY=
=xBZM
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:00                           ` Aaron Bentley
  2006-10-18  1:25                             ` Carl Worth
@ 2006-10-18  3:35                             ` Linus Torvalds
  2006-10-19  3:10                               ` Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18  3:35 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Tue, 17 Oct 2006, Aaron Bentley wrote:
> 
> > That doesn't matter...
> 
> It has significant UI impact.

Right. You have to do it your way, because of the "simple revision 
numbers".

Which gets us back to where we started: "simple" is in the eye of the 
beholder. I personally think that git revision naming is a lot simpler, 
exactly because it doesn't impose arbitrary rules on users.

For example, what happens is that:
 - you like the simple revision numbers
 - that in turn means that you can never allow a mainline-merge to be done 
   by anybody else than the main maintainer
 - that in turn means that the whole situation is no longer distributed, 
   it's more like a "disconnected access to a central repository"

The "main trunk matters" mentality (which has deep roots in CVS - don't 
get me wrong, I don't think you're the first one to do this) is 
fundamentally antithetical to a truly distributed system, because it 
basically assumes that some maintainer is "more important" than others. 

That special maintainer is the maintainer whose merge-trunk is followed, 
and whose revision numbers don't change when they are merged back. 

That may even be _true_ in many cases. But please do realize that it's a 
real issue, and that it has real impact - it does two things:

 - it impacts the technology and workflow directly itself: "pull" and 
   "merge" are different: a central maintainer would tend to do a "merge", 
   and one more in the outskirts would tend to do more of a "pull", 
   expecting his work to then be merged back to the "trunk" at some later 
   point)

 - it will result in _psychological_ damage, in the sense that there's 
   always one group that is the "trunk" group, and while you can pass the 
   baton around (like the perl people do), it's always clear who sits 
   centrally.

Maybe this is fine. It's certainly how most projects tend to work. 

I'll just point out that one of my design goals for git was to make every 
single repository 100% equal. That means that there MUST NOT be a "trunk", 
or a special line of development. There is no "vendor branch". It's 
something that a lot of people on the git lists understand now, but it 
took a while for it to sink in - people used to believe that the "first 
parent" of a merge was somehow special, and I had to point out several 
times on the git list that no, that's not how it works - because the merge 
might have been done by somebody _else_ than the person who you think of 
as being "on the trunk".

So when I say that your "simple" revision numbers are totally broken and 
horrible, I say that not because I think a number like "1.45.3.17" is 
ugly, but because I think that the deeper _implications_ of using a number 
like that are ugly. It implies one of two things:

 - the numbers change all the time as things get merged both ways

OR

 - people try to maintain a "trunk" mentality

and I think both of those situations are simply not good situations.

In git, the fact that everybody is on an equal footing is something that I 
think is really good. For example, when I was away for effectively three 
weeks during August, all the git-level merging for the kernel was done by 
Greg KH.

And realize that he didn't use "my tree". No baton was passed. I emailed 
with him (and some others) before-hand, so that everybody knew that I 
expected to just pull from Greg when I came back, but it was _his_ tree 
that he merged in, and he just worked the same way I did.

And when I did come back, I did a "pull" from his tree. At no point is 
there a big merge-commit with a sign saying

	"I now merged all the work that Greg did while I was away"

No. Because the way git works, my pull just fast-forwarded my tree, 
because while I was away, Greg's tree _was_ the main tree, thanks to the 
fact that git believes that everybody is 100% equal.
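
The fast-forward described above can be sketched as follows. This is an
illustrative Python model, not git's implementation: the function names,
the returned labels, and the commit names are hypothetical. The rule it
shows is that a pull needs no merge commit when the local head is
already an ancestor of the remote head.

```python
def ancestors(parents, rev):
    """Return rev plus every revision reachable through parent links."""
    seen = set()
    stack = [rev]
    while stack:
        r = stack.pop()
        if r in seen:
            continue
        seen.add(r)
        stack.extend(parents[r])
    return seen


def pull(parents, local_head, remote_head):
    """Return (new_head, action) for pulling remote_head into local_head."""
    if remote_head in ancestors(parents, local_head):
        return local_head, "already up to date"
    if local_head in ancestors(parents, remote_head):
        # Nothing local to preserve: just move the head forward.
        return remote_head, "fast-forward"
    return None, "merge needed"  # histories diverged; a merge commit is required


# Hypothetical history: one tree stayed at "x" while another added "y", "z".
parents = {"x": [], "y": ["x"], "z": ["y"]}
result = pull(parents, "x", "z")
print(result)
```

Because no new commit is created, both repositories end up with the
same head, and nothing in the history records which tree pulled from
which.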

So it's actually a big conceptual thing. 

I'm actually very happy with the design of git, and a large part of that 
is that I think the data structures and the basic design was really good. 
Now, I know I'm smarter than anybody else ("Bow down before me, you 
worthless scum"), but the thing is, the way to do good basic design isn't 
actually to be really smart about it, but to try to have a few basic 
concepts.

And the "every repository is equal" is one such concept. The naming 
follows from that - you simply _cannot_ use numbers if everybody is on the 
same footing (at least not _stable_ numbers). 

Btw, BK did get this right. I didn't _like_ the naming in BK, and it was 
numbers, but it worked. But it only worked when people understood that the 
numbers were ephemeral, and it _did_ cause confusion. But hey, the 
confusion wasn't _that_ big of a problem.

> Even if I agreed that the revision was meaningless, the cost of such a
> revision is minuscule.

No. The _cost_ of the revision is the "trunk mentality". THAT is the true 
cost.  The belief that there is one "main line of development".

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:08             ` Andreas Ericsson
  2006-10-17 10:47               ` Matthieu Moy
@ 2006-10-18  4:55               ` Robert Collins
  2006-10-18  8:53                 ` Andreas Ericsson
  2006-10-18 15:31                 ` Linus Torvalds
  1 sibling, 2 replies; 806+ messages in thread
From: Robert Collins @ 2006-10-18  4:55 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 3199 bytes --]

On Tue, 2006-10-17 at 12:08 +0200, Andreas Ericsson wrote:
> Robert Collins wrote:
> > On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
> >>           ---- time --->
> >>
> >>     --*--*--*--*--*--*--*--*--*-- <branch>
> >>           \            /
> >>            \-*--X--*--/
> >>
> >> The branch it used to be on is gone...
> > 
> > In bzr 0.12 this is :
> > 2.1.2
> > 
> 
> Would it be a different number in a different version of bazaar?

The dotted decimal display has only been introduced in bzr 0.12

> > (assuming the first * is numbered '1'.)
> > 
> > These numbers are fairly stable, in particular everything's number in
> > the mainline will be the same number in all the branches created from it
> > at that point in time, but a branch that initially creates a revision or
> > obtains it before the mainline will have a different number until they
> > synchronise with the mainline via pull.
> > 
> 
> So basically anyone can pull/push from/to each other but only so long as 
> they decide upon a common master that handles synchronizing of the 
> number part of the url+number revision short-hands?

Anyone can push and pull from each other - full stop. Whenever they
'pull' in bzr terms, they get fast-forward happening (if I understand
the git fast-forward behaviour correctly). After a fast-forward, the
dotted decimal revision numbers in the two branches are identical - and
they remain immutable until another fast forward occurs. Push always
fast forwards, so the public copy of ones own repository that others
pull or merge from is identical to your own. In a 'collection of
branches with no mainline' scenario, people usually have fast forward
occur from time to time, keeping the numbers consistent from the point
your branch was last pulled by someone else, or you pulled them.

> One thing that's been nagging me is how you actually find out the 
> url+number where the desired revision exists. That is, after you've 
> synced with master, or merged the mothership's master-branch into one of 
> your experimental branches where you've done some work that went before 
> mothership's master's current tip, do you have to have access to the 
> mothership's repo (as in, do you have to be online) to find out the 
> number part of url+number shorthand, or can you determine it solely from 
> what you have on your laptop?

You can determine it locally - if you know any of the mothership's
revisions locally, we can generate the dotted-revnos that the
mothership's master-branch would have from the local data - and the last
merge of mothership you did will have given you those details. I don't
think we have a ui command to spit this out just yet, but it will be
trivial to whip one up.

More commonly though, like git users have 'origin' and 'master'
branches, bzr users tend to have a branch that is the 'origin' (for bzr
itself this is usually called bzr.dev), as well as N other branches for
their own work, which is probably why we haven't seen the need to have a
ui command to spit out the revnos for an arbitrary branch.

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:33                   ` Petr Baudis
@ 2006-10-18  5:26                     ` Robert Collins
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  0 siblings, 1 reply; 806+ messages in thread
From: Robert Collins @ 2006-10-18  5:26 UTC (permalink / raw)
  To: Petr Baudis; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 1495 bytes --]

On Wed, 2006-10-18 at 01:33 +0200, Petr Baudis wrote:
> 
> BTW, I think it's fine to build a system optimized for small-scale
> projects (if that's the intent), simplifying some things in favour of
> mostly straight histories instead of more complicated merge situations
> (although I tend to agree with Linus that if you don't behave in the
> way the users are used to in 100% cases, the more frequently you
> behave so the worse it comes back to bite in the rare cases you do).
> Just as RCS is fine when maintaining individual files for personal
> usage (I still actually occasionally use it for a few files).

revnos visibly change as your work is merged into the mainline - we've
been doing this for years without trouble: ones own commits to a branch
get '3', '4', '5' etc as revnos, and when they are merged to the
mainline they used to stop having revnos at all, but now they will be
given this dotted decimal revno. If you pull from the mainline after the
merge, you see the new numbers, and when you look at mainline you can
see the difference. So while I agree that the surprise the user gets is
inversely related to the frequency with which they see the behaviour, I
think our users see it a lot, so are not surprised much.

FWIW, we're not optimising for mostly straight histories as I understand
such things : our own history has 3 commits on branches to every one on
the mainline.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:16                       ` Linus Torvalds
@ 2006-10-18  5:36                         ` Jeff King
  2006-10-18  5:57                           ` Junio C Hamano
  2006-10-18 14:52                           ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Jeff King @ 2006-10-18  5:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, bazaar-ng, git

On Tue, Oct 17, 2006 at 04:16:15PM -0700, Linus Torvalds wrote:

> It would be easy to send the exact same data as the native git protocol 
> sends over ssh (or the git port) as an email encoding. We did that a few 
> times with BK (there it's called "bk send" and "bk receive" to pack and 
[...]
> That said, "bundles" certainly wouldn't be _hard_ to do. And as long as 
> nobody tries to send _me_ any of them, I won't mind ;)

I never used BK, but my understanding is that it was based on
changesets, so a bundle was a group of changesets. Because a git commit
represents the entire tree state, how can we avoid sending the entire
tree in each bundle? The interactive protocols can ask "what do you
have?" but an email bundle is presumably meant to work without a round
trip.

We could always make a guess ("git send --remote-has master~10") but
that seems awfully error-prone. I assume a changeset-oriented system
would implicitly keep some concept of "I think Linus is at master~10"
and do it automatically.

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  5:36                         ` Jeff King
@ 2006-10-18  5:57                           ` Junio C Hamano
  2006-10-18 14:52                           ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18  5:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, bazaar-ng, git,
	Linus Torvalds

Jeff King <peff@peff.net> writes:

> We could always make a guess ("git send --remote-has master~10") but
> that seems awfully error-prone. I assume a changeset-oriented system
> would implicitly keep some concept of "I think Linus is at master~10"
> and do it automatically.

We could always anchor at a well known point ("git send v2.6.18..").
If you as the recipient do not have the preimage, the "bundle" would
identify what the assumed common ancestor is and you can fetch
it before proceeding.
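
A rough sketch of that idea in Python, using a made-up four-commit
history. The helper names bundle_contents and can_apply are hypothetical
and are not git commands; the point is only that a "bundle" anchored at a
known commit carries the range base..tip and records the base it assumes.

```python
# Toy commit graph: each commit maps to its list of parents.
# The commit names are invented for illustration.
PARENTS = {
    "v1": [],
    "a": ["v1"],
    "b": ["a"],
    "c": ["b"],
}

def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def bundle_contents(base, tip, parents):
    """Commits reachable from tip but not from base, i.e. the range base..tip."""
    return ancestors(tip, parents) - ancestors(base, parents)

def can_apply(bundle_base, recipient_commits):
    """The recipient can use the bundle only if it already has the assumed base."""
    return bundle_base in recipient_commits

to_send = bundle_contents("v1", "c", PARENTS)
print(sorted(to_send))
```

If can_apply is false, the recipient knows from the recorded base which
commit to fetch first.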

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 21:51                 ` Jakub Narebski
  2006-10-17 22:28                   ` Aaron Bentley
@ 2006-10-18  6:22                   ` Matthieu Moy
  1 sibling, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-18  6:22 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>>> Fast-forward is a really good idea. Perhaps you could implement it,
>>> if it is not hidden under different name?
>> 
>> We support it as 'pull', but merge doesn't do it automatically, because
>> we'd rather have merge behave the same all the time, and because 'pull'
>> throws away your local commit ordering.
>
> I smell yet another terminology conflict (although this time fault is
> on the git side), namely that in git terminology "pull" is "fetch"
> (i.e. getting changes done in remote repository since last "fetch"
> or since "clone") followed by merge. pull = fetch + merge.

AIUI, the initial claim was that after a rebase, git can do a
fast-forward, but Aaron has missed the /after a rebase/ part.

And yes, in the bzr terminology, bzr can do a "pull" after a "graft".
I don't think there's a fundamental difference here.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
  2006-10-17 10:23           ` Sean
  2006-10-17 10:23           ` Sean
@ 2006-10-18  6:33           ` Jeff King
  2 siblings, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-18  6:33 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Johannes Schindelin, Jakub Narebski, bazaar-ng, git

On Tue, Oct 17, 2006 at 06:23:41AM -0400, Sean wrote:

> The "bzr missing" command sounds like a handy one.  
> 
> Someone on the xorg mailing list was recently lamenting that git does not
> have an easy way to compare a local branch to a remote one.  While this
> turns out to not be a big problem in git, it might be nice to have such
> a command.

What's wrong with:

  git-fetch
  gitk master...origin

The git model is to do operations on local refs and objects, so the
fetch is a natural part of that. The only downside I see is that you
actually end up fetching the data rather than simply peeking at where
the remote is. But a useful comparison will include at least grabbing
the commit objects, and probably the tree objects (to do diffs) anyway.
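
The three-dot form master...origin selects the commits reachable from
either side but not from both. A minimal Python model of that set
operation, on an invented history (the commit names and the helper
symmetric_difference are assumptions for illustration only):

```python
# Toy history: local master has one commit the remote lacks,
# and origin has two commits that master lacks.
PARENTS = {
    "base": [],
    "local1": ["base"],
    "remote1": ["base"],
    "remote2": ["remote1"],
}

def ancestors(commit, parents):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen

def symmetric_difference(left, right, parents):
    """Split the commits unique to each side, as a left...right range would."""
    left_set = ancestors(left, parents)
    right_set = ancestors(right, parents)
    return left_set - right_set, right_set - left_set

only_master, only_origin = symmetric_difference("local1", "remote2", PARENTS)
print(sorted(only_master), sorted(only_origin))
```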

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  1:11                   ` Petr Baudis
@ 2006-10-18  6:44                     ` Matthieu Moy
  2006-10-18  7:16                       ` Shawn Pearce
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-18  6:44 UTC (permalink / raw)
  To: bazaar-ng, git

Petr Baudis <pasky@suse.cz> writes:

> The origin branch is considered readonly (though Git does
> not enforce it) and only mirrors the branch in the remote repository.

Out of curiosity, what happens if you accidentally commit to it?

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  6:44                     ` Matthieu Moy
@ 2006-10-18  7:16                       ` Shawn Pearce
  0 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18  7:16 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Petr Baudis <pasky@suse.cz> writes:
> 
> > The origin branch is considered readonly (though Git does
> > not enforce it) and only mirrors the branch in the remote repository.
> 
> Out of curiosity, what happens if you accidentally commit to it?

It will quietly accept the commit.

Later when you attempt to run `git fetch` to download any changes
from the remote repository to your local origin branch the fetch
command will fail as it won't be a strict fast-forward due to
there being changes in origin which aren't in the remote repository
being downloaded.

The user can force those changes to be thrown away with `git fetch
--force`, though they probably would want to first examine the
branch with `git log origin` to see what commits (if any) should
be saved, and either extract them to patches for reapplication or
create a holder branch via `git branch holder origin` to allow them
to later merge the holder branch (or parts thereof) after the fetch
has forced origin to match the remote repository.

So in short by default Git stops and tells the user something fishy
is going on, but the error message isn't obvious about what that
is and how they can resolve it easily.

There has been discussion about marking these branches that we
know the user fetches into as read-only, to prevent `git commit`
from actually committing to such a branch (we also have the same
case with the special bisect branch), but I don't think anyone has
stepped forward with the complete implementation of that yet.

Like anything I think people get used to the idea that those branches
are strictly for fetching and shouldn't be used for anything else.
There's really no reason to checkout a fetched into branch anyway;
temporary branches are less than 1 second away with
`git checkout -b tmp origin` (for example).
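
The check described above can be modelled in a few lines of Python. This
is only a sketch on an invented history, not how git implements fetch;
fetch_update is a hypothetical name. The local tracking branch may move
to the remote tip only if its current tip is an ancestor of that tip,
unless the update is forced.

```python
# "oops" is a commit made by accident on the local copy of origin.
PARENTS = {
    "base": [],
    "r1": ["base"],
    "r2": ["r1"],
    "oops": ["base"],
}

def is_ancestor(candidate, commit, parents):
    """True if candidate equals commit or is reachable from it via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current == candidate:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False

def fetch_update(local_tip, remote_tip, parents, force=False):
    """Return the new local tip, refusing non-fast-forward updates unless forced."""
    if force or is_ancestor(local_tip, remote_tip, parents):
        return remote_tip
    raise ValueError(f"{local_tip} -> {remote_tip} is not a fast-forward")

print(fetch_update("base", "r2", PARENTS))
```

With local_tip set to "oops" the call raises, which corresponds to the
failed fetch; passing force=True discards the stray commit.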

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 23:33                       ` Aaron Bentley
@ 2006-10-18  8:13                         ` Andreas Ericsson
  0 siblings, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:13 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
>> Aaron Bentley wrote:
>> By the way, are bzr "bundles" compatible with ordinary patch?
>> git-format-patch patches are. They have additional metainfo,
>> but they are patches at heart.
> 
> Yes, they are.
> 

Sounds a bit like [PATCH 0/8] would have the output of

	git diff $(git merge-base master topic-branch)..topic-branch

for any given patch-series. It might be easier to review the whole 
patch-series in some cases. Especially with patch-series where more than 
one patch touches the same part of the code.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  3:10                               ` Aaron Bentley
@ 2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
                                                     ` (2 more replies)
  2006-10-18 15:38                                 ` Carl Worth
  1 sibling, 3 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:39 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Carl Worth, Jakub Narebski, Linus Torvalds, bazaar-ng, git

Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Carl Worth wrote:
>> Aaron, thanks for carrying this thread along and helping to bridge
>> some communication gaps. For example, when I saw your original two
>> diagrams I was totally mystified how you were claiming that appending
>> a couple of nodes and edges to a DAG could change the "order" of the
>> DAG.
>>
>> I think I understand what you're describing with the leftmost-parent
>> ordering now. But it's definitely an ordering that I would describe as
>> local-only. That is, the ordering has meaning only with respect to a
>> particular linearization of the DAG and that linearization is
>> different from one repository to the next.
> 
> Well, the linearization for any particular head is well-defined, but
> since different branches have different heads...
> 
>> If in practice, nobody does the mirroring "pull" operation then how
>> are the numbers useful? For example, given your examples above, if
>> I'm understanding the concepts and terminology correctly, then if A
>> and B both "merge" from each other (and don't "pull") then they will
>> each end up with identical DAGs for the revision history but totally
>> distinct numbers. Correct?
> 
> The DAGs will be different.  If A merges B, we get:
> 
> a
> |
> b
> |\
> c d
> |\|
> | e
> |/
> f
> 
> If B merges A before this, nothing happens, because B is already a
> superset of A.
> 
> If B merges afterward, we get this:
> a
> |
> b
> |\
> d c
> |/|
> e |
> |\|
> | f
> |/
> g
> 

Seems like an awful lot of merge commits. In git, I think these trees 
would be identical (actually both to bazaar and to each other), with the 
exception that the 'g' commit wouldn't exist, since git does 
fast-forward and relies on dependency-chain only to present the graph 
instead of mucking around with info in external files (recording of 
fetches).

>> So in that situation the numbers will not help A and B determine that
>> they have identical history or even identical working trees.
> 
> They don't really have identical history.
> 

As explained above, they would be identical in git. The fact that you 
register a fast-forward as a merge makes them not so, but this is 
something most gitizens are against, as it can quickly clutter up the DAG.

>> So what good are the numbers?
> 
> They are good for naming mainline revisions that introduced particular
> changes.
> 
>> I can see that the numbers would have applicability with reference to
>> a single repository, (or equivalently a mirror of that repository),
>> but no utility as soon as there is any distributed development
>> happening.
> 
> Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
> quasi-randomly merge each others' branches.  We have a star topology
> around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.
> 

So in essence, the revnos work wonderfully so long as there is a central 
server to make them immutable?

Doesn't this mean that one of your key features doesn't actually work in 
a completely distributed setup (i.e., each dev has his own repo, there 
is no mother-ship, everyone pulls from each other)?

I can see the six-line hook that lays the groundwork for this in git 
before me right now. I'll happily refuse to write it down anywhere. I 
get the feeling that sha's are easier to handle in the long run, while 
revno's might be good to use in development work. In git, we have 
<branch/tag/"committish">~<number> syntax for this.
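
For readers unfamiliar with that syntax: name~N means "start at name and
follow the first parent N times". A small Python sketch on an invented
history; resolve is a hypothetical helper and handles only a single ~N
suffix, not chained suffixes or the ^ operator.

```python
# Toy history; "e" is a merge whose first parent is "d".
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}
REFS = {"master": "e"}

def resolve(spec, refs, parents):
    """Resolve 'name' or 'name~N' by walking first parents N times."""
    name, sep, count = spec.partition("~")
    # A bare '~' with no number means one step back.
    steps = int(count) if count else (1 if sep else 0)
    commit = refs[name]
    for _ in range(steps):
        commit = parents[commit][0]
    return commit

print(resolve("master~2", REFS, PARENTS))
```

Like a revno, such a name is relative to where the branch currently
points, so it changes meaning as soon as master moves.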

In my experience, finding the revision sha of an old bug is what takes 
time. Copy-paste is just as fast with 20 bytes as with 4 bytes. Honestly 
now, do you actually remember the revno for a bug that you stopped 
working on three weeks ago, or do you have to go look it up? If someone 
wants to notify you about the revision a bug was introduced, do they not 
communicate the revno to you by email/irc/somesuch?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  4:55               ` Robert Collins
@ 2006-10-18  8:53                 ` Andreas Ericsson
  2006-10-18 11:15                   ` Petr Baudis
  2006-10-18 15:31                 ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-18  8:53 UTC (permalink / raw)
  To: Robert Collins; +Cc: bazaar-ng, git, Jakub Narebski

Robert Collins wrote:
> On Tue, 2006-10-17 at 12:08 +0200, Andreas Ericsson wrote:
>> Robert Collins wrote:
>>> On Tue, 2006-10-17 at 11:20 +0200, Jakub Narebski wrote:
>>>>           ---- time --->
>>>>
>>>>     --*--*--*--*--*--*--*--*--*-- <branch>
>>>>           \            /
>>>>            \-*--X--*--/
>>>>
>>>> The branch it used to be on is gone...
>>> In bzr 0.12 this is :
>>> 2.1.2
>>>
>> Would it be a different number in a different version of bazaar?
> 
> The dotted decimal display has only been introduced in bzr 0.12
> 
>>> (assuming the first * is numbered '1'.)
>>>
>>> These numbers are fairly stable, in particular everything's number in
>>> the mainline will be the same number in all the branches created from it
>>> at that point in time, but a branch that initially creates a revision or
>>> obtains it before the mainline will have a different number until they
>>> synchronise with the mainline via pull.
>>>
>> So basically anyone can pull/push from/to each other but only so long as 
>> they decide upon a common master that handles synchronizing of the 
>> number part of the url+number revision short-hands?
> 
> Anyone can push and pull from each other - full stop. Whenever they
> 'pull' in bzr terms, they get fast-forward happening (if I understand
> the git fast-forward behaviour correctly). After a fast-forward, the
> dotted decimal revision numbers in the two branches are identical - and
> they remain immutable until another fast forward occurs.


This is where it breaks down for me. "until another fast forward occurs" 
is just not good enough, imo.
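
To make the objection concrete, here is a simplified Python model of
mainline numbering as I understand it from this thread: walk the
leftmost parents from the branch head and count from the root. It is an
assumption-laden sketch, not bzr's implementation, and it ignores the
dotted numbers given to merged revisions. The commit names are invented.

```python
# Two branches diverge after "b"; each then merges the other.
# m_a is the merge made in branch A, m_b the merge made in branch B.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["b"],
    "m_a": ["c", "d"],
    "m_b": ["d", "c"],
}

def mainline(head, parents):
    """Commits on the leftmost-parent chain, oldest first."""
    chain = []
    commit = head
    while commit is not None:
        chain.append(commit)
        commit = parents[commit][0] if parents[commit] else None
    chain.reverse()
    return chain

def revnos(head, parents):
    """Number the mainline commits 1, 2, 3, ... starting at the root."""
    return {c: n for n, c in enumerate(mainline(head, parents), start=1)}

print(revnos("m_a", PARENTS))
print(revnos("m_b", PARENTS))
```

In this model number 3 names "c" in one branch and "d" in the other,
although both branches contain both commits.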

> 
>> One thing that's been nagging me is how you actually find out the 
>> url+number where the desired revision exists. That is, after you've 
>> synced with master, or merged the mothership's master-branch into one of 
>> your experimental branches where you've done some work that went before 
>> mothership's master's current tip, do you have to have access to the 
>> mothership's repo (as in, do you have to be online) to find out the 
>> number part of url+number shorthand, or can you determine it solely from 
>> what you have on your laptop?
> 
> You can determine it locally - if you know any of the motherships
> revisions locally, we can generate the dotted-revnos that the
> motherships master-branch would have from the local data - and the last
> merge of mothership you did will have given you that details.


To me, this means bazaar isn't distributed at all and I could achieve 
much the same distributedness(?) by rsyncing an SVN repo, working 
against that and then rsyncing it back with some fancy merging. In other 
words, bazaar requires there to be one Lord of the Code, or some of the 
key features break down.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
@ 2006-10-18  9:04                                   ` Peter Baumann
  2006-10-18  9:07                                   ` Jakub Narebski
  2006-10-18 10:32                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Peter Baumann @ 2006-10-18  9:04 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Carl Worth, Jakub Narebski, Linus Torvalds,
	bazaar-ng, git

2006/10/18, Andreas Ericsson <ae@op5.se>:
> Aaron Bentley wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > Carl Worth wrote:
> >> Aaron, thanks for carrying this thread along and helping to bridge
> >> some communication gaps. For example, when I saw your original two
> >> diagrams I was totally mystified how you were claiming that appending
> >> a couple of nodes and edges to a DAG could change the "order" of the
> >> DAG.
> >>
> >> I think I understand what you're describing with the leftmost-parent
> >> ordering now. But it's definitely an ordering that I would describe as
> >> local-only. That is, the ordering has meaning only with respect to a
> >> particular linearization of the DAG and that linearization is
> >> different from one repository to the next.
> >
> > Well, the linearization for any particular head is well-defined, but
> > since different branches have different heads...
> >
> >> If in practice, nobody does the mirroring "pull" operation then how
> >> are the numbers useful? For example, given your examples above, if
> >> I'm understanding the concepts and terminology correctly, then if A
> >> and B both "merge" from each other (and don't "pull") then they will
> >> each end up with identical DAGs for the revision history but totally
> >> distinct numbers. Correct?
> >
> > The DAGs will be different.  If A merges B, we get:
> >
> > a
> > |
> > b
> > |\
> > c d
> > |\|
> > | e
> > |/
> > f
> >
> > If B merges A before this, nothing happens, because B is already a
> > superset of A.
> >
> > If B merges afterward, we get this:
> > a
> > |
> > b
> > |\
> > d c
> > |/|
> > e |
> > |\|
> > | f
> > |/
> > g
> >
>
> Seems like an awful lot of merge commits. In git, I think these trees
> would be identical (actually both to bazaar and to each other), with the
> exception that the 'g' commit wouldn't exist, since git does
> fast-forward and relies on dependency-chain only to present the graph
> instead of mucking around with info in external files (recording of
> fetches).
>

Ok. This I don't get. Let me recapitulate:

Branch A
a
|
b
|
c

Branch B
a
|
b
| \
d c
| /
e

In branch A, merging branch B (git pull B) gives you branch B as the result,
because A fast-forwards to B and no merge commit f is created.

In branch B, merging branch A (git pull A) leaves branch B unchanged,
because it is already up to date.

You _never_ have a commit f or g.
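
The same reasoning as a small Python sketch, using the two branches
drawn above (tip c for branch A, tip e for branch B). merge_outcome is a
hypothetical helper that only classifies the three cases; it does not
perform a real merge.

```python
# Branch A ends at "c"; branch B ends at "e", which merged "d" and "c".
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

def is_ancestor(candidate, commit, parents):
    """True if candidate equals commit or is reachable from it via parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current == candidate:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return False

def merge_outcome(current, other, parents):
    """Classify merging 'other' into 'current' and return (case, resulting tip)."""
    if is_ancestor(other, current, parents):
        return "already up to date", current
    if is_ancestor(current, other, parents):
        return "fast-forward", other
    return "merge commit needed", None

print(merge_outcome("c", "e", PARENTS))  # branch A pulls B
print(merge_outcome("e", "c", PARENTS))  # branch B pulls A
```

Only the third case, where neither tip contains the other, creates a
new commit.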

-Peter

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
@ 2006-10-18  9:07                                   ` Jakub Narebski
  2006-10-18 10:32                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  9:07 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, bazaar-ng, git

Andreas Ericsson wrote:
> Aaron Bentley wrote:
>> Well, there's distributed, and then there's *DISTRIBUTED*.  We don't
>> quasi-randomly merge each others' branches.  We have a star topology
>> around bzr.dev.  So when we refer to revnos, they're usually in bzr.dev.
>> 
> 
> So in essence, the revnos work wonderfully so long as there is a central 
> server to make them immutable?
> 
> Doesn't this mean that one of your key features doesn't actually work in 
> a completely distributed setup (i.e., each dev has his own repo, there 
> is no mother-ship, everyone pulls from each other)?
> 
> I can see the six-line hook that lays the groundwork for this in git 
> before me right now. I'll happily refuse to write it down anywhere. I 
> get the feeling that sha's are easier to handle in the long run, while 
> revno's might be good to use in development work. In git, we have 
> <branch/tag/"committish">~<number> syntax for this.
> 
> In my experience, finding the revision sha of an old bug is what takes 
> time. Copy-paste is just as fast with 20 bytes as with 4 bytes. Honestly 
> now, do you actually remember the revno for a bug that you stopped 
> working on three weeks ago, or do you have to go look it up? If someone 
> wants to notify you about the revision a bug was introduced, do they not 
> communicate the revno to you by email/irc/somesuch?

Revnos were supposed to be superior to using sha1 (or shortened sha1)
as commit identifiers because of two key features:
 1. They were simpler than sha1, therefore easier to use
 2. Given two revisions related by lineage (i.e. one is ancestor of
    the other) you can tell at a glance which revision was earlier

But the details invalidate 1.: for a large project with complicated
history, many contributors and nonlinear development, an immutable
revno looks like www.repository.com:127.2.31.57, compared with 988859a
(a 7-character abbreviation of the sha1). And we have to use revision
identifiers that stay _immutable_ (for up to a few years), unless we
want our "simple ids" scheme to make a mess...

And I'm not sure 2. holds either, since even for revisions with direct
lineage we may have to compare, for example, 127.15.2.16 with 210.2.20.3.
Having a generation number would solve 2.; as of now git checks for the
fast-forward case by checking whether the merge-base of the two
revisions is one of the revisions.
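
A generation number could be defined as one more than the largest
generation among a commit's parents. The Python sketch below illustrates
that definition on an invented history; it is not an existing git
feature as of this thread, and the helper names are hypothetical. The
comparison is a necessary condition only: a smaller generation does not
prove ancestry, but an equal or larger one rules it out.

```python
PARENTS = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"], "e": ["d", "c"]}

def generations(parents):
    """Map each commit to 1 + the largest generation of its parents (roots get 1)."""
    result = {}

    def visit(commit):
        # Recursion is fine for a toy graph; a real history needs an iterative walk.
        if commit not in result:
            result[commit] = 1 + max((visit(p) for p in parents[commit]), default=0)
        return result[commit]

    for commit in parents:
        visit(commit)
    return result

def may_be_ancestor(older, newer, gen):
    """A proper ancestor always has a strictly smaller generation number."""
    return gen[older] < gen[newer]

gen = generations(PARENTS)
print(gen)
```

For two revisions known to be related by lineage, the one with the
smaller generation number is the earlier one.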
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  3:27                               ` Aaron Bentley
@ 2006-10-18  9:20                                 ` Jakub Narebski
  2006-10-18 16:31                                   ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18  9:20 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

Aaron Bentley wrote:
> Carl Worth wrote:
>> On Wed, 18 Oct 2006 03:28:30 +0200, Jakub Narebski wrote:
>>> Isn't it easier to review than "bundle", aka. mega-patch?
>>
>> There are even more important reasons to prefer a series of
>> micro-commits over a mega-patch than just ease of merging.
> 
> A bundle isn't a mega-patch.  It contains all the source revisions.  So
> when you merge or pull it, you get all the original revisions in your
> repository.

But what the patch reviewer sees is a mega-patch showing the changeset
of the whole "bundle", isn't it?
[...]
>> Now, I do admit that it is often useful to take the overall view of a
>> patch series being submitted. This is often the case when a patch
>> series is in some sub-module of the code for which I don't have as
>> much direct involvement. In cases like that I will often do review
>> only of the diff between the tips of the mainline and the branch of
>> interest, (or if I trust the maintainer enough, perhaps just the
>> diffstat between the two). But I'm still very glad that what lands in
>> the history is the series of independent changes, and not one mega
>> commit.
> 
> So the difference here is that bundles preserve the original commits the
> changes came from, so even though it's presented as an overview, you
> still have a series of independent changes in your history.

I think it is much better to review a series of patches commit by commit;
besides, it allows correcting some inner patches before applying the whole
series, or dropping one of the patches in the series (which has happened
from time to time on the git mailing list).

So if git introduces bundles, I think they would take the form of a series
of "patch" mails plus an introductory email with the series description
(currently not saved anywhere), shortlog, diffstat and perhaps more metainfo
like the bundle parent (which I think should really be an email form of a
branch), tags introduced etc.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
@ 2006-10-18  9:28                             ` Erik Bågfors
  2006-10-18 11:08                               ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-18  9:28 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> > Petr Baudis wrote:
> > > Another aspect of this is that Git (Linus ;) is very focused on getting
> > > the history right, nice and clean (though it does not _mandate_ it and
> > > you can just wildly do one commit after another; it just provides tools
> > > to easily do it).
> >
> > Yes, rebasing is very uncommon in the bzr community.  We would rather
> > evaluate the complete change than walk through its history.  (Bundles
> > only show the changes you made, not the changes you merged from the
> > mainline.)
> >
> > In an earlier form, bundles contained a patch for every revision, and
> > people *hated* reading them.  So there's definitely a cultural
> > difference there.
>
> BTW, I think what describes the Git's (kernel's) stance very nicely is
> what I call the Al Viro's "homework problem":
>
>         http://lkml.org/lkml/2005/4/7/176
>
> If I understand you right, the bzr approach is what's described as "the
> dumbest kind" there? (No offense meant!)

Yes and no. The bundle includes both the full final thing, and each
step along the way. Each step along the way is something you'll get
when you merge it.

Once merged, it will be "next one" in the description above. It would
typically look something like this in "bzr log" (shortened). In this
example, doing C requires doing A and B as well...

committer: foobar@foobar.com
message: merged in C
      -------
      committer: bar@bar.com
      message: opps, fix bug in A
      -------
      committer: bar@bar.com
      message: implement B
      -------
      committer: bar@bar.com
      message: implement A

So, you'll get full history, including errors made :)  You can also
see who approved it to this branch (foobar) and who did the actual
work (bar)

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  8:39                                 ` Andreas Ericsson
  2006-10-18  9:04                                   ` Peter Baumann
  2006-10-18  9:07                                   ` Jakub Narebski
@ 2006-10-18 10:32                                   ` Matthew D. Fuller
  2006-10-18 11:19                                     ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-18 10:32 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 10:39:32AM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
>
> So in essence, the revnos work wonderfully so long as there is a
> central server to make them immutable?

It seems from my somewhat detached perspective that there's a lot of
conflation of 'conventions' with 'capabilities' around this thread...


With a single linear branch, revnos work wonderfully, and are probably
much more useful than any sort of UUID.  It would be silly in this day
and age to design a VCS aimed specifically for this use case, of
course.  That doesn't mean a VCS shouldn't make it easy, though.


With a star config, revnos are useful locally and with reference to
the "main" branch[es].  And, most of the world is star configs of one
sort or another.  Actually, one might say that practically ALL the
world outside of linux-kernel is star-configs   ;)

In many cases in the star setup, a revno (particularly along the
'trunk') is more directly useful than a UUID; consider particularly
the case of somebody who's just mirroring/following, not actively
developing.  In some cases, the UUID is more useful.  Certainly, using
a revno in a case where the UUID is more appropriate is Bad, but
that's just a matter of using the right tool.


With an uber-distributed full-mesh setup, revnos may be basically
useless for anything except local lookups (which boils down to
"useless for most anything you'd identify a revision for").  For that
case, you'd practically always use the UUID, and pretend revnos don't
exist.


The merge revno forms (123.5.2.17 and the like), I'm somewhat
ambivalent about in many ways.  But, you don't have to use them any
more than you have to use "top-level" revnos.  If either form of revno
is Wrong for your case (whether it be because "I hate numbers
wholesale", or because "Numbers don't cover this case usefully"), then
you just use the UUID and pretend the number isn't there.  If you
wanted them completely out of sight, I wouldn't expect it to be very
hard to talk bzr into never showing the revnos and just showing the
UUID ("revid").



[ I don't speak for bzr, despite the fact that I'm about to appear to ]

From where I sit, revnos are quite useful in the first 1.5 or 2 cases.
Some would argue that they're not useless in the third case as well,
but that's no necessary point to hash out; it certainly does no
technical harm to have them there, since you can just ignore them if
they don't help you.  I think a good case could be made that the vast
majority of VCS use in the world is a form of case 2.

Git comes out of a world where case 3 is All, and the other cases are,
if not actively ignored, at least far secondary considerations, so it
can hardly be surprising that it doesn't have or want something that
adds practically nothing to its case.

bzr, both in its own development schema, and in the expected audience,
is overwhelmingly case 2 (of which case 1 is really just a degenerate
version), but that doesn't mean case 3 is ignored or impossible.  The
UUID's are there for when you need them, and can be used anywhere you
might use a number, and just as easily.  It's a community convention
to organize development in such a way that the number is "usually"
useful, and when it is, it's certainly easier.  That doesn't mean you
HAVE to use it in cases where it doesn't fit, though.  "bzr people
like to avoid using UUID's" doesn't lead to "bzr can't handle the
cases where UUID's are necessary".


> Doesn't this mean that one of your key features doesn't actually
> work in a completely distributed setup

That's one way of phrasing it, I guess.  I'd say rather "a particular
feature isn't applicable to a completely distributed setup".  I'm sure
git has a lot of features that are key for somebody that "don't work"
for someone else, just because they're doing something that person
doesn't want done.  Just because somebody else thinks their toaster
oven is a great way to solder, doesn't mean you have to sell yours.
You can just leave it in the cupboard and use an iron instead.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  9:28                             ` Erik Bågfors
@ 2006-10-18 11:08                               ` Petr Baudis
  2006-10-18 11:17                                 ` Jakub Narebski
  2006-10-18 13:09                                 ` Erik Bågfors
  0 siblings, 2 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 11:08 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 11:28:32AM CEST, I got a letter
where Erik Bågfors <zindar@gmail.com> said that...
> On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> >Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> >where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> >> Petr Baudis wrote:
> >> > Another aspect of this is that Git (Linus ;) is very focused on getting
> >> > the history right, nice and clean (though it does not _mandate_ it and
> >> > you can just wildly do one commit after another; it just provides tools
> >> > to easily do it).
> >>
> >> Yes, rebasing is very uncommon in the bzr community.  We would rather
> >> evaluate the complete change than walk through its history.  (Bundles
> >> only show the changes you made, not the changes you merged from the
> >> mainline.)
> >>
> >> In an earlier form, bundles contained a patch for every revision, and
> >> people *hated* reading them.  So there's definitely a cultural
> >> difference there.
> >
> >BTW, I think what describes the Git's (kernel's) stance very nicely is
> >what I call the Al Viro's "homework problem":
> >
> >        http://lkml.org/lkml/2005/4/7/176
> >
> >If I understand you right, the bzr approach is what's described as "the
> >dumbest kind" there? (No offense meant!)
> 
> Yes and no, The bundle includes both the full final thing, and each
> step along the way. Each step along the way is something you'll get
> when you merge it.
> 
> Once merged, it will be "next one" in the description above. It would
> typically look something like this in "bzr log"(shortened)  In this
> example, doing C requires doing A and B as well...
> 
> committer: foobar@foobar.com
> message: merged in C
>      -------
>      committer: bar@bar.com
>      message: opps, fix bug in A
>      -------
>      committer: bar@bar.com
>      message: implement B
>      -------
>      committer: bar@bar.com
>      message: implement A
> 
> So, you'll get full history, including errors made :)  You can also
> see who approved it to this branch (foobar) and who did the actual
> work (bar)

I see, that's what I've been missing, thanks. So it's the middle path
(as with any other commonly used VCS for that matter, except maybe
darcs? Patch queues and rebasing count, but that's a hack, not something
properly supported by the design of Git, since at that point the
development cannot be fully distributed).

I also assume that, given this is the case, the big diff does not really
serve any purpose besides human review?

But somewhere else in the thread it's been said that bundles can also
contain merges. Does that mean that bundles can look like:

   1
  / \
 2   4
 |   | _
 3   5  |
  \ /   | a bundle
   6    |
       ~

In that case, what is the big diff from 6 done against? 2? 4? Or even 1?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  8:53                 ` Andreas Ericsson
@ 2006-10-18 11:15                   ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 11:15 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Robert Collins, bazaar-ng, git, Jakub Narebski

Dear diary, on Wed, Oct 18, 2006 at 10:53:16AM CEST, I got a letter
where Andreas Ericsson <ae@op5.se> said that...
> Robert Collins wrote:
> >Anyone can push and pull from each other - full stop. Whenever they
> >'pull' in bzr terms, they get fast-forward happening (if I understand
> >the git fast-forward behaviour correctly). After a fast-forward, the
> >dotted decimal revision numbers in the two branches are identical - and
> >they remain immutable until another fast forward occurs.
..snip..
> >You can determine it locally - if you know any of the motherships
> >revisions locally, we can generate the dotted-revnos that the
> >motherships master-branch would have from the local data - and the last
> >merge of mothership you did will have given you that details.
> 
> 
> To me, this means bazaar isn't distributed at all and I could achieve 
> much the same distributedness(?) by rsyncing an SVN repo, working 
> against that and then rsyncing it back with some fancy merging. In other 
> words, bazaar requires there to be one Lord of the Code, or some of the 
> key features break down.

Well as far as I understand, the Lord of the Code is whoever you pulled
from the last time.

It's just a different focus here. If I understood everything in this
thread correctly, both Git and Bazaar have persistent (SHA1, UUID) and
volatile (revspec, revision number) revision ids. The only difference is
that Git primarily presents the user with the SHA1 ids while Bazaar
primarily presents the user with a revision number (and that revspecs
change after every commit while revision numbers change only after a
merge).
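
A minimal sketch of that distinction, assuming a branch's mainline can
be modeled as an ordered list of persistent ids. The names
`revno_to_revid`, `alice` and `bob` are made up for illustration and are
not part of bzr or Git:

```python
# Hypothetical model: a branch's mainline is an ordered list of
# persistent revision ids; a revision number is just a 1-based position.
def revno_to_revid(mainline, revno):
    """Resolve a volatile revision number to a persistent id."""
    return mainline[revno - 1]


# Two branches that share their first revision and then diverge.
alice = ["rev-a", "rev-b", "rev-c"]
bob = ["rev-a", "rev-x", "rev-y"]

# Revision number 1 names the same revision in both branches...
print(revno_to_revid(alice, 1) == revno_to_revid(bob, 1))
# ...but revision number 2 does not, although each id stays unique.
print(revno_to_revid(alice, 2), revno_to_revid(bob, 2))
```

The number only means something relative to one branch; the id means
the same thing everywhere.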

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 11:08                               ` Petr Baudis
@ 2006-10-18 11:17                                 ` Jakub Narebski
  2006-10-18 13:09                                 ` Erik Bågfors
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18 11:17 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Erik Bågfors, Aaron Bentley, Matthieu Moy, bazaar-ng,
	Linus Torvalds, Andreas Ericsson, git

Petr Baudis wrote:
> But somewhere else in the thread it's been said that bundles can also
> contain merges. Does that means that bundles can look like:
>
>    1
>   / \
>  2   4
>  |   | _
>  3   5  |
>   \ /   | a bundle
>    6    |
>        ~
>
> In that case [merge bundle], against what the big diff from 6 is done?
> 2? 4? Or even 1? 

Or do you use an equivalent of git's combined diff format?
http://www.kernel.org/pub/software/scm/git/docs/git-diff-tree.html
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 10:32                                   ` Matthew D. Fuller
@ 2006-10-18 11:19                                     ` Andreas Ericsson
  2006-10-18 12:43                                       ` Matthew D. Fuller
  0 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-18 11:19 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 10:39:32AM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
>> So in essence, the revnos work wonderfully so long as there is a
>> central server to make them immutable?
> 
> 
> With a star config, revnos are useful locally and with reference to
> the "main" branch[es].  And, most of the world is star configs of one
> sort or another.  Actually, one might say that practically ALL the
> world outside of linux-kernel is star-configs   ;)
> 

That might be the case today. However, since we introduced git at the 
office, mini-projects are cropping up like mad, and pieces of toy-code 
are being pushed around among the employees. When something is found to 
be useful enough to attract management attention, it's given a spot at 
the "master site". It doesn't need one. It's just that we have this one 
place where gitweb is installed, which management likes whereas devs 
don't have that on their laptop. It's also convenient to have one place 
to find all changes rather than pulling from 1-to-N different people 
just to have a look at what they've done.

The point I'm trying to make here is that the star config might be the 
most common case today because
a) old SCMs enforced this use case and it is therefore the most common 
way just out of habit.
b) projects you actually *see* have gotten past the "Joe made some cool 
changes, pull his 'jukebox-ui' branch".


> In many cases in the star setup, a revno (particularly along the
> 'trunk') is more directly useful than a UUID; consider particularly
> the case of somebody who's just mirroring/following, not actively
> developing.  In some cases, the UUID is more useful.  Certainly, using
> a revno in a case where the UUID is more appropriate is Bad, but
> that's just a matter of using the right tool.
> 

I can easily imagine the use case Linus pointed out with BK. Because 
revnos work wonderfully 80% of the time, people get confused, frustrated 
and downright pissed off when they don't.

> 
> With a uber-distributed full-mesh setup, revnos may be basically
> useless for anything except local lookups (which boils down to
> "useless for most anything you'd identify a revision for").  For that
> case, you'd practically always use the UUID, and pretend revnos don't
> exist.
> 

But they *do* exist, and they *usually* work, so people are bound to try 
them first. Teaching them when they work and when they don't (or rather, 
when they should and when they shouldn't, cause they will work by 
accident sometimes too) is bound to be a lot harder than sending them a 
10 char irc message.

> 
> The merge revno forms (123.5.2.17 and the like), I'm somewhat
> ambivalent about in many ways.  But, you don't have to use them any
> more than you have to use "top-level" revnos.  If either form of revno
> is Wrong for your case (whether it be because "I hate numbers
> wholesale", or because "Numbers don't cover this case usefully"), then
> you just use the UUID and pretend the number isn't there.  If you
> wanted them completely out of sight, I wouldn't expect it to be very
> hard to talk bzr into never showing the revnos and just showing the
> UUID ("revid").
> 

So what's the point in having them? You can't seriously tell me that you 
think of 123.5.2.17 as something you can easily remember, do you? Count 
the times, during one day, where you use the revnos and type them manually.

> 
> 
> [ I don't speak for bzr, despite the fact that I'm about to appear to ]
> 
>>From where I sit, revnos are quite useful in the first 1.5 or 2 cases.
> Some would argue that they're not useless in the third case as well,
> but that's no necessary point to hash out; it certainly does no
> technical harm to have them there, since you can just ignore them if
> they don't help you.  I think a good case could be made that the vast
> majority of VCS use in the world is a form of case 2.
> 
> Git comes out of a world where case 3 is All, and the other cases are,
> if not actively ignored, at least far secondary considerations, so it
> can hardly be surprising that it doesn't have or want something that
> adds practically nothing to its case.
> 

Not really. It's just that case 3 is the most flexible of them all. It's 
trivial to enforce linear development in git. Just add a hook that 
forbids merge commits. Set up a "master repo" and put the hook there and 
you've turned it into CVS with off-line log-browsing (more or less).
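
The rule such a hook would enforce can be sketched roughly as below.
This only models the check on a made-up parent map (`merge_commits` and
`accept_push` are hypothetical names); a real hook, for example git's
`update` hook, would have to read the pushed commits from the
repository, e.g. with `git rev-list --merges`:

```python
def merge_commits(parents, pushed):
    """Return the pushed commits that have more than one parent."""
    return [c for c in pushed if len(parents[c]) > 1]


def accept_push(parents, pushed):
    """Accept a push only if it adds no merge commits."""
    return not merge_commits(parents, pushed)


# Hypothetical history: "m" merges "b" and "c".
parents = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}

print(accept_push(parents, ["b"]))       # linear push
print(accept_push(parents, ["c", "m"]))  # push containing a merge
```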

Set up a master-server and enable the reflog there and you've turned it 
into bazaar, more or less.

In git, the mothership repo is there for convenience, because it's nice 
to have one place to set up mailing-list hooks, gitweb, git-daemon and 
the like. Everything works *exactly* as it would have done without it 
in all repos around the world.


> bzr, both in its own development schema, and in the expected audience,
> is overwhelmingly case 2 (of which case 1 is really just a degenerate
> version), but that doesn't mean case 3 is ignored or impossible.  The
> UUID's are there for when you need them, and can be used anywhere you
> might use a number, and just as easily.  It's a community convention
> to organize development in such a way that the number is "usually"
> useful, and when it is, it's certainly easier.  That doesn't mean you
> HAVE to use it in cases where it doesn't fit, though.  "bzr people
> like to avoid using UUID's" doesn't lead to "bzr can't handle the
> cases where UUID's are necessary".
> 

Have a look at the list of things that CVS "can handle" and compare it 
mentally to the things CVS "handles gracefully" and you'll see why 
people have stopped using it.

> 
>> Doesn't this mean that one of your key features doesn't actually
>> work in a completely distributed setup
> 
> That's one way of phrasing it, I guess.  I'd say rather "a particular
> feature isn't applicable to a completely distributed setup".

So how come it's in the same list of features as the "distributed 
repository model", and both are marked as supported when they're 
apparently mutually exclusive?


>  I'm sure
> git has a lot of features that are key for somebody that "don't work"
> for someone else, just because they're doing something that person
> doesn't want done.

The main point, the *important* point about git is that everything it 
shows always makes sense and works in exactly the same way no matter 
which setup you use. There are no features in git that are mutually 
exclusive, or only sane in one particular setup but not in others. You 
can use them all or pick which ones you like. Whatever you choose, it 
never comes at the expense of losing something else.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 11:19                                     ` Andreas Ericsson
@ 2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
                                                           ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-18 12:43 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Aaron Bentley, Linus Torvalds, Carl Worth, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
> 
> It's just that we have this one place where gitweb is installed,
> which management likes whereas devs don't have that on their laptop.
> It's also convenient to have one place to find all changes rather
> than pulling from 1-to-N different people just to have a look at
> what they've done.

I think this just by itself lends support to:

> The point I'm trying to make here is that the star config might be
> the most common case today because

c) Stars work well as a mental model for humans.

Heck, in large, Linux is star-ish.  There's "2.6.1", "2.6.2", etc;
that's a trunk.  Any time you have releases, you're establishing a
"master" branch.  For most people using Linux, there's a trunk,
whether it's the kernel.org trunk, or the "What Redhat ships" trunk,
etc.  The closer you drill to the day-to-day work on the kernel, the
farther it gets from trunks, but if it were full-mesh at all levels I
don't think it would be nearly as usable for regular computing tasks
as it is.


Perhaps someday a heavy full-mesh setup will be the common case for
VCS usage.  I find that very difficult to buy for various reasons, but
it could happen.  If it does, bzr may well revisit the choice and
decide revnos contribute little enough marginal value as to be a loss,
and discard them.  But that's not today.


> But they *do* exist, and they *usually* work, so people are bound to
> try them first. Teaching them when they work and when they don't (or
> rather, when they should and when they shouldn't, cause they will
> work by accident sometimes too) is bound to be a lot harder than
> sending them a 10 char irc message.

Perhaps, for some projects.  And in those cases, perhaps you'd want to
flip a hypothetical "dump those numbers in the bin" switch.  That
doesn't mean every project wants to, or that those projects who don't
and have no trouble and discernible gain from revno usage are
hypothetical.


> So what's the point in having them? You can't seriously tell me that
> you think of 123.5.2.17 as something you can easily remember, do
> you? Count the times, during one day, where you use the revnos and
> type them manually.

No, I don't.  But I don't use merge revnos for various reasons, one of
the primary ones being that they don't currently follow intuitively
for me (and that intuitiveness is the major attraction of revnos in
the first place).

I rarely refer to non-mainline revisions at all, in fact.  And I use
revnos for mainline revisions regularly.  Heck, I communicate revnos
_verbally_; people handle that easily with numbers, not so easily with
hex strings.  The vast majority of my branches are simple cases, and I
like simple tools that match simple mental models for them.  For the
more intricate cases, revids provide a more rigorous tool, and I WANT
a VCS that lets me choose which is appropriate.  If I wanted a
computer to tell me how to work, I'd run Windows    ;)


> Not really. It's just that case 3 is the most flexible of them all.

Yes, but this doesn't necessarily mean everything you seem to try and
cover with it.  The more rigorous tool will cover the simplest cases
(those being just a degenerate form of the more complex after all),
but that doesn't mean it's the EASIEST way of handling that case.


> Everything works *exactly* as it would have done without it in all
> repos around the world.

And if you use the UUID's, the same applies to bzr.

That is, if you use git like you use git, the above is true.  If you
use bzr like you use git, the above is ALSO true.

The difference is that bzr ALSO chooses to support and optimize for a
different case in the default UI presentation, because We[0] consider
that far and away the common case on the one hand, and that people
trying to use the more complex case are ipso facto more able to use a
behavior differing from the norm on the other.


[0] Note how adroitly I again speak for other people.  Practice,
    practice!


> >That's one way of phrasing it, I guess.  I'd say rather "a
> >particular feature isn't applicable to a completely distributed
> >setup".
> 
> So how come it's in the same list of features as the "distributed
> repository model", and both are marked as supported when they're
> apparently mutually exclusive?

I assume in this you're referring to the RcsComparisons page that
started the thread.  First off, I don't agree with all the
characterizations on the page, so don't expect me to support it as
gospel.  That said, they're not "mutually exclusive"; one is just
inapplicable in extreme cases of the other.  "Plugins" is on the same
list as "distributed repository model" too.  And you can't count on
other people having the same plugins as you, so it's just as "mutually
exclusive" with distributed.


> The main point, the *important* point about git is that everything
> it shows always makes sense and works in exactly the same way no
> matter which setup you use.  There are no features in git that are
> mutually exclusive, or only sane in one particular setup but not in
> others.

I find it really hard to believe that that's strictly true, just as a
general rule.  For that matter, I think it's demonstrably false: using
SHA1 hashes as revision identifiers in a simple linear tree with 5
revs doesn't strike me as "sane".  But that aside...

I don't think of that as a positive thing.  There are lots of things
that make sense in certain setups that don't in others.  We have two
techniques, A and B, and two general cases, X and Y.  A works really
well for X, and is useless with Y.  B works ok for X, and handles Y
well.  "Use A for X and B for Y" seems like a heck of a lot better
answer than "Only support B".  You certainly CAN shape wood joints
with just a claw hammer, but I wouldn't want to.  A jigsaw makes it
much easier, no matter how useless it may be for forging iron.


Your position seems to be, in essence, "This feature can be misused,
therefore it should be eliminated".  And you should certainly use a
tool that provides the behavior you want.  So, too, should other
people.

I don't want to use git for any number of reasons, which sum up
concisely if undescriptively as "It doesn't work for me", but it seems
to work great for the community it was built for, and that's
excellent.  Not all aspects of that design work well for other people,
though; no matter how poorly some capability "fits" you
(non-specific), it can still fit others very well.  This particular
item certainly seems one of those significant divides.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
@ 2006-10-18 13:02                                           ` Sean
  2006-10-18 13:02                                           ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 13:02 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Aaron Bentley, Linus Torvalds, Carl Worth,
	bazaar-ng, git, Jakub Narebski

On Wed, 18 Oct 2006 07:43:20 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> The difference is that bzr ALSO chooses to support and optimize for a
> different case in the default UI presentation, because We[0] consider
> that far and away the common case on the one hand, and that people
> trying to use the more complex case are ipso facto more able to use a
> behavior differing from the norm on the other.
> 
> [0] Note how adroitly I again speak for other people.  Practice,
>     practice!

Just to be clear here, Git is also able to support this model if
you so choose.  It's quite easy for a server to generate Git tags
for every commit it gets.

It's just that this is basically a non issue in the Git world.  People
who use Git aren't crying out for salvation from sha1 numbers.  So I
think this entire discussion is a bit overblown.

But just to be clear, there is nothing in the Git model that prohibits
tagging every commit with something you find less objectionable than
sha1's.  They can appear in the log listings and in gitk etc, and
everyone who pulls from the central server will get them.  In fact,
for some imports of other VCS into Git, exactly that is done; so every
commit can be referenced by its sha1 _or_ the "friendly" number it was
known by in its original VCS.
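
As a rough sketch of what "a tag for every commit" could look like,
assuming the server sees commits in arrival order. The function name
`friendly_tags` and the `r<N>` naming scheme are made up for
illustration; nothing here is an existing Git feature:

```python
def friendly_tags(commits_oldest_first, prefix="r"):
    """Map sequential tag names (r1, r2, ...) to commit ids."""
    return {
        f"{prefix}{n}": sha
        for n, sha in enumerate(commits_oldest_first, start=1)
    }


# Hypothetical (shortened) commit ids in the order the server got them.
tags = friendly_tags(["4b825dc", "9fceb02", "e69de29"])
for name, sha in tags.items():
    print(name, sha)
```

The numbers are only stable because a single server hands them out,
which is the same trade-off discussed for revnos in this thread.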

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 11:08                               ` Petr Baudis
  2006-10-18 11:17                                 ` Jakub Narebski
@ 2006-10-18 13:09                                 ` Erik Bågfors
  1 sibling, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-18 13:09 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Andreas Ericsson, git,
	Jakub Narebski

On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Wed, Oct 18, 2006 at 11:28:32AM CEST, I got a letter
> where Erik B?gfors <zindar@gmail.com> said that...
> > On 10/18/06, Petr Baudis <pasky@suse.cz> wrote:
> > >Dear diary, on Wed, Oct 18, 2006 at 02:30:14AM CEST, I got a letter
> > >where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> > >> Petr Baudis wrote:
> > >> > Another aspect of this is that Git (Linus ;) is very focused on getting
> > >> > the history right, nice and clean (though it does not _mandate_ it and
> > >> > you can just wildly do one commit after another; it just provides tools
> > >> > to easily do it).
> > >>
> > >> Yes, rebasing is very uncommon in the bzr community.  We would rather
> > >> evaluate the complete change than walk through its history.  (Bundles
> > >> only show the changes you made, not the changes you merged from the
> > >> mainline.)
> > >>
> > >> In an earlier form, bundles contained a patch for every revision, and
> > >> people *hated* reading them.  So there's definitely a cultural
> > >> difference there.
> > >
> > >BTW, I think what describes the Git's (kernel's) stance very nicely is
> > >what I call the Al Viro's "homework problem":
> > >
> > >        http://lkml.org/lkml/2005/4/7/176
> > >
> > >If I understand you right, the bzr approach is what's described as "the
> > >dumbest kind" there? (No offense meant!)
> >
> > Yes and no, The bundle includes both the full final thing, and each
> > step along the way. Each step along the way is something you'll get
> > when you merge it.
> >
> > Once merged, it will be "next one" in the description above. It would
> > typically look something like this in "bzr log"(shortened)  In this
> > example, doing C requires doing A and B as well...
> >
> > committer: foobar@foobar.com
> > message: merged in C
> >      -------
> >      committer: bar@bar.com
> >      message: opps, fix bug in A
> >      -------
> >      committer: bar@bar.com
> >      message: implement B
> >      -------
> >      committer: bar@bar.com
> >      message: implement A
> >
> > So, you'll get full history, including errors made :)  You can also
> > see who approved it to this branch (foobar) and who did the actual
> > work (bar)
>
> I see, that's what I've been missing, thanks. So it's the middle path
> (as any other commonly used VCS for that matter, expect maybe darcs?;
> patch queues and rebasing count but it's a hack, not something properly
> supported by the design of Git, since at this point the development
> cannot be fully distributed).
>
> I also assume that given this is the case, the big diff does really not
> serve any purpose besides human review?
>
> But somewhere else in the thread it's been said that bundles can also
> contain merges. Does that means that bundles can look like:
>
>    1
>   / \
>  2   4
>  |   | _
>  3   5  |
>   \ /   | a bundle
>    6    |
>        ~
>
> In that case, against what the big diff from 6 is done? 2? 4? Or even 1?

When you run the "bundle" command, you can tell it what you want the
bundle to be created against.  So, if I just committed 5, I can run
"bzr bundle -r-1" to get the bundle against 4, or I can do "bzr bundle
path/to/other/branch" to get a bundle that relates to it.

To merge a bundle into a branch, the parent of the first revision in
the bundle has to exist in the branch it's being merged into. (Well,
unless you use patch, but that's outside of bzr, and bzr wouldn't know
about each revision in them.)

This command will find a common root and create a bundle that
corresponds to it.  The "big diff" as you call it, would be the
changes between the point where the branch was created, and the last
commit.

In the case of just committing 5, and you want to create a bundle that
can be merged back at point 6, the "big diff" would be against 1 since
that's the branch point.
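
A small sketch of how that branch point could be found, using the graph
from the diagram above (before the merge at 6). This is only an
illustration of the idea, not how bzr implements it; `ancestors` and
`branch_point` are made-up names, and the breadth-first search is a
simple "nearest by number of steps" heuristic:

```python
from collections import deque


def ancestors(parents, rev):
    """Return rev and everything reachable from it via parent links."""
    seen, todo = set(), [rev]
    while todo:
        r = todo.pop()
        if r not in seen:
            seen.add(r)
            todo.extend(parents[r])
    return seen


def branch_point(parents, ours, theirs):
    """Return the nearest revision in ours' history that theirs also has."""
    known = ancestors(parents, theirs)
    seen, queue = {ours}, deque([ours])
    while queue:
        r = queue.popleft()
        if r in known:
            return r
        for p in parents[r]:
            if p not in seen:
                seen.add(p)
                queue.append(p)
    return None


# Revision -> parents, as in the diagram: 1-2-3 on one side, 1-4-5 on ours.
parents = {1: [], 2: [1], 3: [2], 4: [1], 5: [4]}
print(branch_point(parents, ours=5, theirs=3))
```

The "big diff" would then be the difference between that revision and
the last commit in the bundle.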

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
@ 2006-10-18 13:10                                         ` Jakub Narebski
  2006-10-18 16:07                                         ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18 13:10 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Aaron Bentley, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On Wednesday, 18 October 2006 14:43, Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
> > 
> > It's just that we have this one place where gitweb is installed,
> > which management likes whereas devs don't have that on their laptop.
> > It's also convenient to have one place to find all changes rather
> > than pulling from 1-to-N different people just to have a look at
> > what they've done.
> 
> I think this just by itself lends support to:
> 
> > The point I'm trying to make here is that the star config might be
> > the most common case today because
> 
> c) Stars work well as a mental model for humans.
> 
> Heck, in large, Linux is star-ish.  There s "2.6.1", "2.6.2", etc;
> that's a trunk.  Any time you have releases, you're establishing a
> "master" branch.  For most people using Linux, there's a trunk,
> whether it's the kernel.org trunk, or the "What Redhat ships" trunk,
> etc.  The closer you drill to the day-to-day work on the kernel, the
> farther it gets from trunks, but if it were full-mesh at all levels I
> don't think it would be nearly as usable for regular computing tasks
> as it is.

No, it is not. If you consider only Linus's published repository and
the private repositories of other people, it usually is star-ish
(although the situation mentioned earlier, where somebody else's
repository took the place of the center of the star-ish configuration,
wouldn't be possible in a true star-ish model). But please take note of
the stable repository and the -mm repository; changes are exchanged
there and back again. And "What Redhat ships" is AFAIK a mix of
different repositories and their own patches.
 
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  5:36                         ` Jeff King
  2006-10-18  5:57                           ` Junio C Hamano
@ 2006-10-18 14:52                           ` Linus Torvalds
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
  2006-10-18 21:20                             ` VCS comparison table Jeff King
  1 sibling, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 14:52 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Wed, 18 Oct 2006, Jeff King wrote:
> 
> I never used BK, but my understanding is that it was based on
> changesets, so a bundle was a group of changesets.

Yes.

> Because a git commit represents the entire tree state, how can we avoid 
> sending the entire tree in each bundle?

That's not the problem. That's easy to handle - and we already do. That's 
the whole point of the wire-transfer protocol (ie sending deltas, and only 
sending enough to actually matter).

> The interactive protocols can ask "what do you have?" but an email 
> bundle is presumably meant to work without a round trip.

Right, but they can do exactly what bk did: you have to have a reference 
to what the other side has. In git, that's usually even simpler: you'd do

	git send origin..

and that "origin" is what the other end is expected to already have.

Of course, if you send an unconnected bundle (ie you give an origin that 
the other end _doesn't_ have), you're screwed.

In other words, to get such a pack, we'd _literally_ just do something 
like

	git-rev-list --objects-edge origin.. |
		git-pack-objects --stdout |
		uuencode

and that would be it. You'd still need to add a "diffstat" to the thing, 
and tell the other end what the current HEAD is (so that it knows what 
it's supposed to fast-forward to), but it _literally_ is that simple.
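
Current git ships a "git bundle" command that covers the same ground as
the pipeline above. A minimal sketch, assuming a reasonably recent git and
two throwaway repositories; the names receiver, sender, trunk, work.bundle
and incoming are made up for the illustration:

```shell
#!/bin/sh
# Sketch only: ship the commits in "origin.." as one file, no round trip.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

# "receiver" holds what the other end is expected to already have.
git init -q receiver
(
  cd receiver
  git symbolic-ref HEAD refs/heads/trunk
  echo one > file
  git add file
  git commit -q -m "first"
)

# "sender" starts from the same history and adds one commit on top.
git clone -q receiver sender
(
  cd sender
  echo two >> file
  git commit -q -a -m "second"
  # Pack only what origin/trunk does not already reach.
  git bundle create ../work.bundle origin/trunk..trunk >/dev/null 2>&1
)

# On the other end: verify fails if the prerequisite commits are missing
# (the "unconnected bundle" case); fetch then installs the new commits.
(
  cd receiver
  git bundle verify ../work.bundle >/dev/null 2>&1
  git fetch -q ../work.bundle trunk:incoming
)

sent=$(git -C sender rev-parse trunk)
got=$(git -C receiver rev-parse incoming)
echo "sent $sent, received $got"
```

The commit id survives the transfer unchanged, which is the property that
plain patches sent over mail do not have.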

"plug-in architecture" my ass. "I recognize this - it's UNIX!".

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  4:55               ` Robert Collins
  2006-10-18  8:53                 ` Andreas Ericsson
@ 2006-10-18 15:31                 ` Linus Torvalds
  2006-10-18 15:50                   ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 15:31 UTC (permalink / raw)
  To: Robert Collins; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Wed, 18 Oct 2006, Robert Collins wrote:
> 
> More commonly though, like git users have 'origin' and 'master'
> branches, bzr users tend to have a branch that is the 'origin' (for bzr
> itself this is usually called bzr.dev), as well as N other branches for
> their own work, which is probably why we haven't seen the need to have a
> ui command to spit out the revnos for an arbitrary branch.

You mis-understand.

git doesn't have a "ui command to spit out the revnos for an arbitrary 
branch" either.

Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
directly.

What git does (and does very well) is to be _scriptable_. It was designed 
that way. I'm a UNIX guy. I think piping is very powerful. And when you 
script things, your scripts pass SHA1's around internally.

So for example, to repack a git archive, you'd normally do

	git repack -a -d

and you don't have any "UI" with SHA1 numbers. But internally, this used 
to be

	git-rev-list --all --objects |
		git-pack-objects 

where "git-rev-list" is the one that lists all object names (which are the 
SHA1 numbers), and "git-pack-objects" is the one that takes a list of 
objects and packs them. 

(These days, since our internal C libraries have become so much better, 
the object traversal is done internally to packing, so we don't actually 
use the pipe any more for repacking an archive, but that's just an 
implementation detail)
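
For the curious, the older pipeline can still be run by hand. A minimal
sketch, assuming the modern "git rev-list" / "git pack-objects" spellings
rather than the dashed 2006 names, a throwaway repository, and a made-up
output prefix "demo":

```shell
#!/bin/sh
# Sketch only: do by hand what "git repack -a -d" used to do internally.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
echo hello > file
git add file
git commit -q -m "first"

# rev-list names every reachable object (commit, tree, blob) by its SHA1;
# pack-objects reads those names on stdin and writes demo-<hash>.pack/.idx,
# printing <hash> on stdout.
pack_id=$(git rev-list --all --objects | git pack-objects -q "$tmp/demo")

# Count the objects in the new pack: one commit, one tree, one blob.
packed=$(git verify-pack -v "$tmp/demo-$pack_id.idx" |
	grep -c -E ' (commit|tree|blob) ')
echo "packed $packed objects into demo-$pack_id.pack"
```

No SHA1 ever shows up on the command line; the names only travel through
the pipe between the two programs.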

You seem to think that we use SHA1 names as _humans_. We don't. The SHA1 
names are used internally, and humans just use the branch names.

The only case you'd (as a human) use the SHA1 name is when you want to 
pass it on to another person that may have a different archive (ie you 
mail somebody a revision that is problematic). It would obviously be 
totally unworkable to say "it's the grand-parent of my current HEAD 
commit", since that's a local description. So instead, you'd say "it's 
commit 9550e59c4587f637d9aa34689e32eea460e6f50c".
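
The gap between a local description and a stable name can be shown with
"git rev-parse". A small sketch, assuming a throwaway repository and a
clone of it; the names mine and theirs are illustrative only:

```shell
#!/bin/sh
# Sketch only: "HEAD~2" is local, the SHA1 it resolves to is not.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"
git init -q mine
(
  cd mine
  for n in 1 2 3; do
    git commit -q --allow-empty -m "commit $n"
  done
)

# Turn the local description into a name that means the same everywhere.
stable=$(git -C mine rev-parse HEAD~2)

# Somebody else's archive: same history plus one commit of their own.
git clone -q mine theirs
git -C theirs commit -q --allow-empty -m "commit 4"

# "HEAD~2" now points somewhere else over there...
theirs_rel=$(git -C theirs log -1 --format=%s HEAD~2)
# ...but the SHA1 still names the original commit.
theirs_abs=$(git -C theirs log -1 --format=%s "$stable")
echo "HEAD~2 in theirs: $theirs_rel; $stable in theirs: $theirs_abs"
```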

So I think people (totally incorrectly) think that git users use a lot of 
SHA1 names, just because they see the git users on the kernel mailing list 
sending each others SHA1 names. But that's because you see only the case 
where you _want_ to communicate a stable revision name to another side. 
Sending a number like 1.57.8.312 to describe what commit broke would be a 
_bug_, because a person who has a differently shaped tree wouldn't even 
_have_ that revision.

But normally? You'd be hard-pressed to find anything but the branch (and 
tag) names on a command line.

See?

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  3:10                               ` Aaron Bentley
  2006-10-18  8:39                                 ` Andreas Ericsson
@ 2006-10-18 15:38                                 ` Carl Worth
  2006-10-19  9:10                                   ` Matthew D. Fuller
  1 sibling, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-18 15:38 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git


On Tue, 17 Oct 2006 23:10:34 -0400, Aaron Bentley wrote:
> If B merges A before this, nothing happens, because B is already a
> superset of A.
>
> If B merges afterward, we get this:

Wow. Thanks for elucidating---again I was making some incorrect
assumptions about the system, so your answer was surprising and
appreciated.

So, am I correct in my understanding now that it's impossible for two
users to establish identical code history on both sides through merge?
If the two kept merging back and forth the history would pick up a new
commit each time even though there were no code changes. Right?

That's a startling property. I'm surprised to learn that the
generally-used mechanism for getting new changes doesn't have a mode
where it says "you're already up to date---doing nothing".

I do understand that there's a separate "pull" that does allow for
correct synchronization of a local repository with a remote
repository, and it does have the "up to date---doing nothing"
behavior. But as you already said, it's often avoided specifically
because it destroys locally-created revision numbers.

Another way of describing bzr's "pull" is that it establishes a
master-slave relationship between the remote and local repository,
(his numbers are more important than mine, so I'll throw mine away).
I think Linus already provided a good argument in this thread about
why that kind of asymmetry is bad for software projects and why tools
should not provide it.

So there are some aspects of the bzr design that rob from its ability
to function as a distributed version control system. It really does
bias itself toward centralization, (the so called "star topology" as
opposed to something "fully" distributed).

And by the way, some people seem to have the opinion that there's
something unique about the way the linux kernel is developed that
allows it to benefit from a fully distributed system. The assumption
seems to be that projects with a central tree won't benefit the same
way, and don't really need the full set of features of a distributed
system. That's not true in my experience.

With cairo, for example, we had been using cvs. Obviously, it imposes
a centralized model, but most of the active developers had been using
rsync or other repository synchronization so that we could at least do
offline history browsing. So even with cvs we had as much of a star
topology as possible, (but we didn't have offline commits to our
roaming repositories, nor did we have any sharing between them).

Now, after the switch from cvs to git, we still do have a central
repository that all developers share and push into, (this is distinct
from how linux or the git project itself use git). And git supports
this kind of shared central repository perfectly well.

But a lot of the big advantages the cairo project gets from git come
from our ability to now easily share branches among ourselves without
going through the central repository. We only push fully-cooked
branches to the central tree. But now, with everyone owning their own
publicly-visible repository with all their work in it, we can now
easily share the half-baked ideas we have with all their history. One
person can start an idea, and others can easily pick it up, (without
having to drop down to a mega-patch like we would have done with
cvs). And people actually have the ability to collaborate on turning
an answer into a solution, (in Al Viro's terminology).

So even a project that's very oriented around a single, central tree
can get a lot of benefit from being able to share things arbitrarily
between any two given repositories. And I think that any project will
naturally start doing more of this kind of sharing, (and benefitting
considerably from it), as it adopts tools that support it well.

-Carl


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 15:31                 ` Linus Torvalds
@ 2006-10-18 15:50                   ` Jakub Narebski
  2006-10-18 16:22                     ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18 15:50 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Robert Collins, Andreas Ericsson, bazaar-ng, git

Linus Torvalds wrote:
> 
> On Wed, 18 Oct 2006, Robert Collins wrote:
>> 
>> More commonly though, like git users have 'origin' and 'master'
>> branches, bzr users tend to have a branch that is the 'origin' (for bzr
>> itself this is usually called bzr.dev), as well as N other branches for
>> their own work, which is probably why we haven't seen the need to have a
>> ui command to spit out the revnos for an arbitrary branch.
> 
> You mis-understand.
> 
> git doesn't have a "ui command to spit out the revnos for an arbitrary 
> branch" either.
> 
> Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
> directly.

With the exception of having sometimes commit-ids in the commit messages,
for example "Fixes bug introduced by aabbcc00" (although usually you just
write "Fixes bug in some_function in some_file"), and automatically
generated 
  This reverts d119e3de13ea1493107bd57381d0ce9c9dd90976 commit.
(in addition to 'Revert "<Commit title>") for git-revert generated
commit messages.

And it is true that you usually use branchname, or branchname~n syntax.
Git even has git-name-rev to convert from sha1 to temporary, local
ref^m~n... syntax.


By the way, git has a very powerful syntax for naming revisions and
revision lists. For example "git-rev-list foo bar ^baz" means
"list all the commits which are reachable from foo or bar,
but not from baz", or the more useful "git log origin..next".
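
A small sketch of that set arithmetic, assuming a throwaway repository
whose branch names (base, foo, bar, baz) are invented for the example:

```shell
#!/bin/sh
# Sketch only: "foo bar ^baz" = reachable from foo or bar, minus baz.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/base
git commit -q --allow-empty -m "shared base"

# Three branches, each one commit ahead of the shared base.
for b in foo bar baz; do
  git checkout -q -b "$b" base
  git commit -q --allow-empty -m "only on $b"
done

# The shared base is reachable from baz, so it drops out of the list.
picked=$(git log --format=%s foo bar ^baz | sort | tr '\n' ',')

# "baz..foo" is shorthand for "foo ^baz".
dotdot=$(git rev-list baz..foo)
caret=$(git rev-list foo ^baz)

# name-rev goes the other way: from a SHA1 to a local ref-relative name.
git name-rev "$(git rev-parse foo~1)"
echo "$picked"
```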

How's that in bzr?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 12:43                                       ` Matthew D. Fuller
       [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
  2006-10-18 13:10                                         ` Jakub Narebski
@ 2006-10-18 16:07                                         ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 16:07 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: bazaar-ng, Andreas Ericsson, Carl Worth, git, Jakub Narebski



On Wed, 18 Oct 2006, Matthew D. Fuller wrote:

> On Wed, Oct 18, 2006 at 01:19:10PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
> > 
> > It's just that we have this one place where gitweb is installed,
> > which management likes whereas devs don't have that on their laptop.
> > It's also convenient to have one place to find all changes rather
> > than pulling from 1-to-N different people just to have a look at
> > what they've done.
> 
> I think this just by itself lends support to:
> 
> > The point I'm trying to make here is that the star config might be
> > the most common case today because
> 
> c) Stars work well as a mental model for humans.

I really don't think that's even true.

Most projects do tend to have a star-like setup, but I think that's 
largely due to historical tools, not mental models. 

For example, I used CVS professionally for too long a few years ago, and 
the thing I _really_ hated was exactly how it forced people who were 
working on "experimental stuff" to be so tightly organized around the 
central repository (and how they had to do things that were visible and 
annoying to the mainline).

And I think that's where the "star-like" situation breaks down: when you 
have a group of people who go off to do something experimental. Suddenly 
the "mainline" in that case isn't the central and most important 
repository any more, and instead you really have another second (and 
third, fourth etc) "centerpoint" that another group works around.

Now, what does that mean? It means that whenever you look at a big project 
from the outside, you tend to see a star-like thing: there's the "big 
common thing", and you won't even be _seeing_ the off-shoots, because they 
tend to be used by developers to try out new ideas etc. So it looks like a 
star, but it really isn't, and shouldn't be.

An SCM should support the _developers_, not the users. The users don't 
need an SCM, they just need a place to fetch the "standard" thing 
(preferably with a vendor that supports them or at least makes them feel 
comfy). But an SCM really should support the off-shoots, because that's 
where the exciting stuff happens.

Btw, this is also why distribution is so fundamentally important:

Most of the off-shoots tend to be failures, but that is as it should be. 
Again, this is where SVN and CVS and other centralized models fail 
_miserably_. Because branches are in a centralized repository, the cost of 
failure is visible to all, and thus people don't like creating branches 
for things that don't look "obviously viable" to the people around the 
central repository.

In contrast, in a truly distributed environment, a failed branch is 
something that people don't even KNOW about. Anybody can take the kernel 
git tree, start his own development line (with ten other people) and try 
to improve it. And if it fails, I'd never even know: there is literally 
_zero_ cost to everybody else from failed branches. And if they succeed, 
they'll just say "hey, pull this, it works, and it makes Xyz go five times 
faster".

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 15:50                   ` Jakub Narebski
@ 2006-10-18 16:22                     ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 16:22 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Andreas Ericsson, bazaar-ng, git, Robert Collins



On Wed, 18 Oct 2006, Jakub Narebski wrote:
> > 
> > Normally, you'd just use the branch-name. Nobody ever uses the SHA1's 
> > directly.
> 
> With the exception of having sometimes commit-ids in the commit messages,
> for example "Fixes bug introduced by aabbcc00" (although usually you just
> write "Fixes bug in some_function in some_file"), and automatically
> generated 
>   This reverts d119e3de13ea1493107bd57381d0ce9c9dd90976 commit.

Yes. But in both cases, that's usually because you literally ended up 
having the commit name because somebody else (which _can_ be you) searched 
for it (with something like "bisect") and gave it to you.

So even that case is really about communicating a stable name from one 
place (the "find the bug") to another (the "revert the buggy commit").

So yes, _communication_ should always happen by full SHA1's, because those 
are the only thing that always remain stable.
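
The hand-off from "find the bug" to "revert the buggy commit" can be
sketched as follows. This assumes a throwaway repository in which the
"bug" is simply a file (bug.txt, a made-up name) that should not exist:

```shell
#!/bin/sh
# Sketch only: bisect produces a SHA1, revert consumes and records it.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
git init -q "$tmp/repo"
cd "$tmp/repo"

# Five commits; the third one introduces the unwanted file.
for n in 1 2 3 4 5; do
  if [ "$n" -eq 3 ]; then
    echo oops > bug.txt
  else
    echo "$n" >> log
  fi
  git add .
  git commit -q -m "change $n"
done
good=$(git rev-list --max-parents=0 HEAD)   # the very first commit

# "Find the bug": the test command exits 0 for good, non-zero for bad.
git bisect start HEAD "$good" >/dev/null
git bisect run sh -c 'test ! -e bug.txt' >/dev/null
culprit=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

# "Revert the buggy commit": pass the full SHA1, not a local description.
git revert --no-edit "$culprit" >/dev/null
git log -1 --format=%B
```

The generated commit message carries the full SHA1 of the reverted commit,
so the stable name ends up recorded in history as well.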

(The fact that "gitk" and I think "gitweb" can then turn them into 
hyperlinks in the commit message is obviously one reason we then tend to 
give them such prominent visibility - they actually end up being very 
useful later on).

In bzr, either you don't get the hyperlinks, or you need to use the 
non-simple name in the commit messages, since the simple names don't 
actually work. Either way, it's an inferior setup.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18  9:20                                 ` Jakub Narebski
@ 2006-10-18 16:31                                   ` Aaron Bentley
  2006-10-21 15:56                                     ` Jan Hudec
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-18 16:31 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Carl Worth, Petr Baudis, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
>>Carl Worth wrote:
>>>There are even more important reasons to prefer a series of
>>>micro-commits over a mega-patch than just ease of merging.
>>
>>A bundle isn't a mega-patch.  It contains all the source revisions.  So
>>when you merge or pull it, you get all the original revisions in your
>>repository.
> 
> 
> But what patch reviewer see is a mega-patch showing the changeset
> of a whole "bundle", isn't it?
> [...]

Yes.  Carl was saying that, aside from the issue of what a reviewer
sees, a bundle is bad for other reasons.  I am saying those other
reasons don't apply.  I wasn't addressing the issue of what a reviewer sees.

To me, seeing the individual patches is like reading a book where every
page has a different word on it, and so it's hard to put it together
into a full sentence.  I'm not saying my way is The Right Way, just my
personal preference.

For larger pieces of work, we try to split them up into logical units,
and merge those units independently.

The Bundle format can also support a patch-by-patch output, but we don't
have UI to select that.

> I think it is much better to review series of patches commit by commit;
> besides it allows to correct some inner patches before applying the whole
> series or drop one of patches in series (and it happened from time to time
> on git mailing list).

It's important to remember that bundles represent revisions, not
patches.  When you merge a bundle, you

1. install those revisions into your repository.  These revisions are
   latent, as though they were on another branch.
2. merge the head revision of the bundle into your branch.

Virtually any merge selection process that works with branches would
also work with bundles.  So tweaking before merging is really a matter
of replacing the UI for 2.

> So if git introduces bundles, I think they would take form of series
> of "patch" mails + introductory email with series description (currently
> it is not saved anywhere), shortlog, diffstat and perhaps more metainfo
> like bundle parent (which I think should be email form of branch really),
> tags introduced etc.

The parent in a bundle revision is the revision-id of the parent of that
revision in the branch.  I don't think it's possible to change that
parent id into something else, without changing the meaning of a bundle.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFNlb40F+nu1YWqI0RAnxxAJ9ETibey1Qyvz/zVxdGipaHGtnddgCfTtzt
CQUZ2dK64BS5K5WYecFAsfM=
=bJxq
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 13:55                 ` Jakub Narebski
  2006-10-17 14:08                   ` Matthieu Moy
@ 2006-10-18 18:03                   ` Jeff Licquia
  1 sibling, 0 replies; 806+ messages in thread
From: Jeff Licquia @ 2006-10-18 18:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Tue, 2006-10-17 at 15:55 +0200, Jakub Narebski wrote:
> Matthieu Moy wrote:
> > This took time to come in bzr, but that's the bisect plugin:
> > 
> > http://bazaar-vcs.org/PluginRegistry
> 
> Hmmm... I wonder which SCM had it first.

You did.  The plugin is largely based on my experiences with the git
version, and explicitly gives credit in the comments.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 14:52                           ` Linus Torvalds
@ 2006-10-18 18:52                             ` Petr Baudis
  2006-10-18 18:59                               ` Petr Baudis
                                                 ` (2 more replies)
  2006-10-18 21:20                             ` VCS comparison table Jeff King
  1 sibling, 3 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 18:52 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, Jakub Narebski, Aaron Bentley, Andreas Ericsson,
	bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> In other words, to get such a pack, we'd _literally_ just do something 
> like
> 
> 	git-rev-list --objects-edge origin.. |
> 		git-pack-objects --stdout |
> 		uuencode
> 
> and that would be it. You'd still need to add a "diffstat" to the thing, 
> and tell the other end what the current HEAD is (so that it knows what 
> it's supposed to fast-forward to), but it _literally_ is that simple.
> 
> "plug-in architecture" my ass. "I recognize this - it's UNIX!".

Took me exactly an hour from mkdir cogito-bundle to cg-push to
kernel.org. :-)

cogito-bundle is an example of how to create third-party addons or
plugins that add their own commands to Cogito and use Cogito's
infrastructure. It's not _that_ easy currently since you have to
replicate a large part of the build infrastructure locally; that could
be fixed by installing some "library makefiles" and the asciidoc toolkit
to /usr/share or something, if there were real demand for such an addon
API. cg-help and the cg wrapper will pick up the newly installed commands
automagically. The only thing missing is updating cogito(7) to list the
addon commands, which would take a bit more work.

Though it's an example, it's actually supposed to be useful, by doing
exactly what is outlined above - it lets you exchange commits over
mail by so-called "bundles", similar to e.g. Bazaar bundles - basically,
it is like push or fetch, but over email, and the commit ids are
preserved when transferred in bundles (if you just send patches, the
commit ids will end up different).

The provided cg-bundle and cg-unbundle commands are rather crude and
don't support many things - they don't actually include a diff, only a
diffstat, etc. The uuencoded bundle is inlined in the mail, which I
suspect isn't very useful; perhaps it would be more practical to just
attach it as a binary. Feel free to send patches (or bundles ;).

An example bundle is available at

	http://pasky.or.cz/~pasky/cp/example-bundle.txt

as generated by

	cogito.master$ cg-bundle -r v0.18 -m"Subject is this" \
		-m"And some body now..." --stdout

and cogito-bundle is available at

	git://git.kernel.org/pub/scm/cogito/cogito-bundle.git/
	(gitweb http://kernel.org/git/?p=cogito/cogito-bundle.git)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
@ 2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
                                                   ` (2 more replies)
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
  2006-10-19  6:46                               ` Alexander Belchenko
  2 siblings, 3 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 18:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Dear diary, on Wed, Oct 18, 2006 at 08:52:25PM CEST, I got a letter
where Petr Baudis <pasky@suse.cz> said that...
> Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
> > In other words, to get such a pack, we'd _literally_ just do something 
> > like
> > 
> > 	git-rev-list --objects-edge origin.. |
> > 		git-pack-objects --stdout |
> > 		uuencode
> > 
> > and that would be it. You'd still need to add a "diffstat" to the thing, 
> > and tell the other end what the current HEAD is (so that it knows what 
> > it's supposed to fast-forward to), but it _literally_ is that simple.
> > 
> > "plug-in architecture" my ass. "I recognize this - it's UNIX!".
> 
> Took me exactly an hour from mkdir cogito-bundle to cg-push to
> kernel.org. :-)

By the way, originally I just wanted to index and save the pack, but
when trying to feed it to git-index-pack, I kept getting

	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas

while feeding it to git-unpack-objects works fine. Any idea what's wrong?

(BTW, I got the id by sha1summing the pack file; is there an existing
way to name a pack properly if I have it lying around, unnamed? sha1sum
seems to be specific to a fairly new GNU coreutils version.)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
@ 2006-10-18 19:04                                 ` Junio C Hamano
  2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:09                                 ` Nicolas Pitre
  2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 19:04 UTC (permalink / raw)
  To: git

Petr Baudis <pasky@suse.cz> writes:

> Dear diary, on Wed, Oct 18, 2006 at 08:52:25PM CEST, I got a letter
> where Petr Baudis <pasky@suse.cz> said that...
>> Dear diary, on Wed, Oct 18, 2006 at 04:52:25PM CEST, I got a letter
>> where Linus Torvalds <torvalds@osdl.org> said that...
>> > In other words, to get such a pack, we'd _literally_ just do something 
>> > like
>> > 
>> > 	git-rev-list --objects-edge origin.. |
>> > 		git-pack-objects --stdout |
>> > 		uuencode
>> > 
>> > and that would be it. You'd still need to add a "diffstat" to the thing, 
>> > and tell the other end what the current HEAD is (so that it knows what 
>> > it's supposed to fast-forward to), but it _literally_ is that simple.
>> > 
>> > "plug-in architecture" my ass. "I recognize this - it's UNIX!".
>> 
>> Took me exactly an hour from mkdir cogito-bundle to cg-push to
>> kernel.org. :-)
>
> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
>
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
>
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Yes.  You told the pipeline, with --objects-edge, to create a
thin pack.  By definition that is _not_ indexable.
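
Later git versions can complete such a pack on the receiving side with
"git index-pack --fix-thin --stdin", which adds the missing base objects
before indexing. A hedged sketch, assuming a recent git and made-up names
(receiver, sender, trunk, numbers, thin.pack); whether the pack really
comes out thin depends on pack-objects choosing a delta against the
excluded blob, which is likely here but not guaranteed:

```shell
#!/bin/sh
# Sketch only: build a (probably) thin pack and index it on the other end.
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

tmp=$(mktemp -d)
cd "$tmp"

git init -q receiver
(
  cd receiver
  git symbolic-ref HEAD refs/heads/trunk
  # A file large enough that a delta against it is worth storing.
  awk 'BEGIN { for (i = 1; i <= 2000; i++) print i }' > numbers
  git add numbers
  git commit -q -m "first"
)

git clone -q receiver sender
(
  cd sender
  echo 2001 >> numbers
  git commit -q -a -m "second"
  # --thin lets deltas refer to objects on the excluded side (origin/trunk),
  # so their bases are not included in the pack itself.
  printf 'trunk\n^origin/trunk\n' |
	git pack-objects --revs --thin --stdout -q > ../thin.pack
)

# The receiver already has the bases, so --fix-thin can copy them into the
# pack and write a normal index for it.
(
  cd receiver
  git index-pack --fix-thin --stdin < ../thin.pack >/dev/null
)

tip=$(git -C sender rev-parse trunk)
kind=$(git -C receiver cat-file -t "$tip")
echo "receiver now has $tip ($kind)"
```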

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
@ 2006-10-18 19:09                                 ` Nicolas Pitre
  2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Wed, 18 Oct 2006, Petr Baudis wrote:

> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
> 
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> 
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Did you really manage to miss the "heads-up: git-index-pack in "next" is 
broken" thread?

The fix:

diff --git a/index-pack.c b/index-pack.c
index fffddd2..56c590e 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -23,6 +23,12 @@ union delta_base {
 	unsigned long offset;
 };
 
+/*
+ * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
+ * to memcmp() only the first 20 bytes.
+ */
+#define UNION_BASE_SZ	20
+
 struct delta_entry
 {
 	struct object_entry *obj;
@@ -211,7 +217,7 @@ static int find_delta(const union delta_
                 struct delta_entry *delta = &deltas[next];
                 int cmp;
 
-                cmp = memcmp(base, &delta->base, sizeof(*base));
+                cmp = memcmp(base, &delta->base, UNION_BASE_SZ);
                 if (!cmp)
                         return next;
                 if (cmp < 0) {
@@ -232,9 +238,9 @@ static int find_delta_childs(const union
 
 	if (first < 0)
 		return -1;
-	while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base)))
+	while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
 		--first;
-	while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base)))
+	while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
 		++last;
 	*first_index = first;
 	*last_index = last;
@@ -312,7 +318,7 @@ static int compare_delta_entry(const voi
 {
 	const struct delta_entry *delta_a = a;
 	const struct delta_entry *delta_b = b;
-	return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base));
+	return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ);
 }
 
 static void parse_pack_objects(void)


Nicolas

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:04                                 ` Junio C Hamano
@ 2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:18                                     ` Shawn Pearce
  2006-10-18 19:33                                     ` Junio C Hamano
  0 siblings, 2 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Wed, 18 Oct 2006, Junio C Hamano wrote:

> Petr Baudis <pasky@suse.cz> writes:
> 
> > By the way, originally I just wanted to index and save the pack, but
> > when trying to feed it to git-index-pack, I kept getting
> >
> > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> >
> > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> 
> Yes.  You told the pipeline, with --objects-edge, to create a
> thin pack.  By definition that is _not_ indexable.

Ah true.  I missed the "thin" pack.

Any idea why we should still prevent this?  It is not like it was a 
technical limitation.


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:13                                   ` Nicolas Pitre
@ 2006-10-18 19:18                                     ` Shawn Pearce
  2006-10-18 19:33                                       ` Nicolas Pitre
  2006-10-18 19:33                                     ` Junio C Hamano
  1 sibling, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 19:18 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> 
> > Petr Baudis <pasky@suse.cz> writes:
> > 
> > > By the way, originally I just wanted to index and save the pack, but
> > > when trying to feed it to git-index-pack, I kept getting
> > >
> > > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> > >
> > > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> > 
> > Yes.  You told the pipeline, with --objects-edge, to create a
> > thin pack.  By definition that is _not_ indexable.
> 
> Ah true.  I missed the "thin" pack.
> 
> Any idea why we should still prevent this?  It is not like it was a 
> technical limitation.

It still is in sha1-file.c; or at least the last time I looked at
that code.  The base is always resolved from the same pack/index
as the delta.  If you fix sha1-file.c sure, I don't see why you
can't allow indexing thin packs.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:18                                     ` Shawn Pearce
@ 2006-10-18 19:33                                       ` Nicolas Pitre
  2006-10-18 20:46                                         ` Shawn Pearce
  0 siblings, 1 reply; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-18 19:33 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Junio C Hamano, git

On Wed, 18 Oct 2006, Shawn Pearce wrote:

> Nicolas Pitre <nico@cam.org> wrote:
> > On Wed, 18 Oct 2006, Junio C Hamano wrote:
> > 
> > > Petr Baudis <pasky@suse.cz> writes:
> > > 
> > > > By the way, originally I just wanted to index and save the pack, but
> > > > when trying to feed it to git-index-pack, I kept getting
> > > >
> > > > 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> > > >
> > > > while feeding it to git-unpack-objects works fine. Any idea what's wrong?
> > > 
> > > Yes.  You told the pipeline, with --objects-edge, to create a
> > > thin pack.  By definition that is _not_ indexable.
> > 
> > Ah true.  I missed the "thin" pack.
> > 
> > Any idea why we should still prevent this?  It is not like it was a 
> > technical limitation.
> 
> It still is in sha1-file.c; or at least the last time I looked at
> that code.  The base is always resolved from the same pack/index
> as the delta.  

Yep.  I mean this doesn't have to be like that fundamentally.

> If you fix sha1-file.c sure, I don't see why you
> can't allow indexing thin packs.

If there are advantages to do so then maybe. That would be for another 
day though, as I've been burned a bit with packs recently.


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:13                                   ` Nicolas Pitre
  2006-10-18 19:18                                     ` Shawn Pearce
@ 2006-10-18 19:33                                     ` Junio C Hamano
  2006-10-18 20:47                                       ` Shawn Pearce
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 19:33 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> Ah true.  I missed the "thin" pack.
>
> Any idea why we should still prevent this?  It is not like it was a 
> technical limitation.

It is a technical limitation.  We have never assumed that the
virtual address space is big enough to hold more than one whole
pack mmapped at the same time.

Lifting this needs the piecemeal mmap() change somebody was
talking about.

I might bite the bullet and do that myself but I've been hoping
to get an appliable patch from somewhere else ;-).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
@ 2006-10-18 19:57                                 ` Sean
  2006-10-18 20:46                                 ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 19:57 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

On Wed, 18 Oct 2006 20:52:25 +0200
Petr Baudis <pasky@suse.cz> wrote:

> Took me exactly an hour from mkdir cogito-bundle to cg-push to
> kernel.org. :-)

Nicely done :-).

> cogito-bundle is an example on how to create third-party addons or
> plugins adding own commands to Cogito and using Cogito's infrastructure.
> It's not _that_ easy currently since you have to replicate large part of
> the build infrastructure locally; that could be fixed by installing some
> "library makefiles" and asciidoc toolkit to /usr/share or something, if
> there would be a real demand for such an addon API. cg-help and the cg
> wrapper will pick up the newly installed commands automagically. The
> only thing missing is updating cogito(7) to list the addon commands,
> which would take a bit more work.

Couldn't these just as easily have been written as git-bundle and
git-unbundle without needing any plugins or other cogito infrastructure?

> Though it's an example, it's actually supposed to be useful, by doing
> exactly what is outlined above - it lets you exchange commits over
> mail by so-called "bundles", similar to e.g. Bazaar bundles - basically,
> it is like push or fetch, but over email, and the commit ids are
> preserved when transferred in bundles (if you just send patches, the
> commit ids will end up different).

Not sure if it would be useful, but it shouldn't be too hard to have
the same commit ids regenerated at the receiving end with git patches.

> The provided cg-bundle and cg-unbundle commands are rather crude and
> don't support many things - they don't actually include a diff, only a
> diffstat, etc. The uuencoded bundle is inlined in the mail, which I
> suspect isn't very useful; perhaps it would be more practical to just
> attach it binarily. Feel free to send patches (or bundles ;).

Think you're right about making it an attachment instead.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:59                               ` Petr Baudis
  2006-10-18 19:04                                 ` Junio C Hamano
  2006-10-18 19:09                                 ` Nicolas Pitre
@ 2006-10-18 20:08                                 ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 20:08 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git



On Wed, 18 Oct 2006, Petr Baudis wrote:
> 
> By the way, originally I just wanted to index and save the pack, but
> when trying to feed it to git-index-pack, I kept getting
> 
> 	fatal: packfile '.git/objects/pack/pack-b2ab684daebea5b9c5a6492fa732e0d2e1799c8e.pack' has unresolved deltas
> 
> while feeding it to git-unpack-objects works fine. Any idea what's wrong?

Since you created a "thin" pack (that's what the "--objects-edge" means), 
the pack actually contains deltas to objects that are _not_ in the pack. 

In other words, it's not a valid stand-alone pack, it's only a valid thin 
pack, useful to transfer data to the other end (and the other end had 
better have the objects that the deltas are against already).

As a result, git-index-pack refuses to index it: it cannot be used as a 
stand-alone pack, it's _only_ useful as a transfer medium.

So don't even _try_ to use it as a standalone pack-file. It won't work.

(If you want something that actually works as a stand-alone pack-file, 
change the "--objects-edge" flag to just "--objects" - that makes the 
pack-file self-sufficient, and doesn't try to delta against "edge" 
objects).
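
To illustrate the distinction, here is a toy Python model. It is not git's
pack format or code: the `Entry` type and the `missing_bases` helper are made
up for this example, and object ids are shortened. The only point is that a
pack is self-contained when every delta base is also stored in the same pack,
and "thin" when at least one base lives elsewhere.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Entry:
    """One object in a toy pack: either a full object or a delta."""
    name: str                   # object id (shortened for readability)
    base: Optional[str] = None  # id of the delta base, None for a full object


def missing_bases(pack: Dict[str, Entry]) -> Set[str]:
    """Return the delta bases that are not stored in the pack itself."""
    return {e.base for e in pack.values()
            if e.base is not None and e.base not in pack}


# Roughly the "--objects" case: every base is inside the pack.
standalone = {
    "aaa": Entry("aaa"),
    "bbb": Entry("bbb", base="aaa"),
}

# Roughly the "--objects-edge" case: "ccc" is a delta against "eee",
# an edge object the receiver is assumed to have already.
thin = {
    "ccc": Entry("ccc", base="eee"),
    "ddd": Entry("ddd", base="ccc"),
}

print(missing_bases(standalone))  # nothing missing: usable on its own
print(missing_bases(thin))        # {'eee'}: only useful as a transfer medium
```

A non-empty result corresponds to the "has unresolved deltas" situation
quoted above.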

> (BTW, I got the id by sha1summing the pack file; is there an existing
> way to name a pack properly if I have it lying around, unnamed? sha1sum
> seems to be specific to a fairly new GNU coreutils version.)

A properly named _standalone_ pack gets named not by its actual contents, 
but by the SHA1-sum of the sorted list of objects it contains. That's so 
that a pack-file will be named the same thing regardless of how the 
contents are actually packed.

A thin pack cannot be named that way at all, for the same reason you 
cannot index it: it has a set of objects it enumerates (so you could name 
it by them), but it _also_ has a set of objects outside of it that it 
depends on. 

That said, even a thin pack internally has a SHA1 checksum of its 
contents: the last 20 bytes should be the SHA1-sum of all preceding bytes. 
So if you just want _some_ kind of name, you can use the last 20 bytes of 
a pack, which is just its internal integrity-checksum (but that is 
_different_ from the "pack-xxxxxx.idx"/"pack-xxxxxx.pack" naming).
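
A minimal Python sketch of the two kinds of name described above. The trailer
check follows the rule stated here (last 20 bytes are the SHA1 of all
preceding bytes). The `name_from_object_ids` helper is only an illustration of
"SHA1 of the sorted list of objects": the exact byte encoding fed to the hash
is an assumption, and the "pack" below is fake data, not a real pack file.

```python
import hashlib


def trailer_matches(pack_bytes: bytes) -> bool:
    """Check that the last 20 bytes are the SHA-1 of everything before them."""
    body, trailer = pack_bytes[:-20], pack_bytes[-20:]
    return hashlib.sha1(body).digest() == trailer


def trailer_name(pack_bytes: bytes) -> str:
    """Hex form of the trailing checksum, usable as an ad-hoc name."""
    return pack_bytes[-20:].hex()


def name_from_object_ids(object_ids) -> str:
    """SHA-1 over the sorted object ids (byte encoding assumed here)."""
    h = hashlib.sha1()
    for oid in sorted(object_ids):
        h.update(bytes.fromhex(oid))
    return h.hexdigest()


# Fake "pack": arbitrary bytes followed by their SHA-1. Only the trailer
# convention is modelled, nothing else about the format.
body = b"PACK" + b"\x00" * 32
fake_pack = body + hashlib.sha1(body).digest()

print(trailer_matches(fake_pack))             # True
print(trailer_matches(b"X" + fake_pack[1:]))  # False: the body was altered

# The object-list name does not depend on the order objects were packed in.
ids = [hashlib.sha1(x).hexdigest() for x in (b"one", b"two", b"three")]
print(name_from_object_ids(ids) == name_from_object_ids(reversed(ids)))  # True
```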

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
  2006-10-18 19:57                                 ` Sean
@ 2006-10-18 20:46                                 ` Petr Baudis
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 20:46 UTC (permalink / raw)
  To: Sean
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 09:57:04PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> Couldn't these just as easily have been written as git-bundle and
> git-unbundle without needing any plugins or other cogito infrastructure?

They could be written, but certainly not "just as easily". I'm more used
to coding Cogito, I find it much more convenient than hacking git's
shell scripts (those two may be interconnected ;), and there's plenty of
infrastructure in Cogito missing in Git - Cogito has more flexible
arguments parsing, documentation bundled with code, I could just
cut'n'paste the code to handle -m arguments and message editor (and most
of it is libified anyway) so I got that basically for free, and I think
Cogito beats Git hands down in code readability.

> Not sure if it would be useful, but it shouldn't be too hard to have
> same commit ids regenerated at receiving end with git patches.

It would be of course technically possible, yes. But somewhat more work,
this is just a quick hack.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:33                                       ` Nicolas Pitre
@ 2006-10-18 20:46                                         ` Shawn Pearce
  2006-10-18 21:17                                           ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 20:46 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
> If there are advantages to do so then maybe. That would be for another 
> day though, as I've been burned a bit with packs recently.

I guess it's my turn then to work on the mmap window code, huh?  :-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 19:33                                     ` Junio C Hamano
@ 2006-10-18 20:47                                       ` Shawn Pearce
  0 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 20:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git

Junio C Hamano <junkio@cox.net> wrote:
> Nicolas Pitre <nico@cam.org> writes:
> 
> > Ah true.  I missed the "thin" pack.
> >
> > Any idea why we should still prevent this?  It is not like it was a 
> > technical limitation.
> 
> It is a technical limitation.  We have never assumed that the
> virtual address space is big enough to hold more than one whole
> pack mmapped at the same time.

Even though it's not big enough for some larger packs on a 32-bit
system.
 
> Lifting this needs the piecemeal mmap() change somebody was
> talking about.
> 
> I might bite the bullet and do that myself but I've been hoping
> to get an appliable patch from somewhere else ;-).

I might be able to do it this weekend.  I'll try to spend some time
on it.  You'll either see a patch series, or you won't.  ;-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
@ 2006-10-18 20:53                                     ` Sean
  2006-10-18 21:39                                     ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 20:53 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Jeff King, Jakub Narebski, Aaron Bentley,
	Andreas Ericsson, bazaar-ng, git

On Wed, 18 Oct 2006 22:46:18 +0200
Petr Baudis <pasky@suse.cz> wrote:

> They could be written, but certainly not "just as easily". I'm more used
> to coding Cogito, I find it much more convenient than hacking git's
> shell scripts (those two may be interconnected ;), and there's plenty of
> infrastructure in Cogito missing in Git - Cogito has more flexible
> arguments parsing, documentation bundled with code, I could just
> cut'n'paste the code to handle -m arguments and message editor (and most
> of it is libified anyway) so I got that basically for free, and I think
> Cogito beats Git hands down in code readability.

Hmmm, if I get some time over the weekend I'll take a look at porting
them to Git.  But maybe some of the items you mentioned above deserve
to become part of Git proper?  It would definitely be nice to see
something like what you just did put into the hands of more users than
just those using Cogito, and it's unfortunate that the current state
of Git code kept you from going that route.

> It would be of course technically possible, yes. But somewhat more work,
> this is just a quick hack.

No doubt, there would be some slightly thorny issues to deal with.  It
might even end up too fragile to be worthwhile.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:56                         ` Sean
  2006-10-17 23:11                           ` Jakub Narebski
@ 2006-10-18 21:04                           ` Charles Duffy
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 806+ messages in thread
From: Charles Duffy @ 2006-10-18 21:04 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Sean wrote:
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugins _sound_ good to developers
> without much real benefit to users over regular ole source code.

Example time!

There's a plugin for Bzr which adds Cygwin-compatible symlink support 
on Windows. (IIRC, this involves monkey-patching some of the Python 
standard library bits).

Now, this is something which is *proposed* as a feature to be merged 
into upstream bzr, and it may happen at some point. That said, when I 
have a Windows-using coworker who wants to check out a repository that 
has symlinks in it (with his win32-native, no-cygwin-required bzr 
upstream binary), I don't need to tell him to go download and build bzr 
from a third party; instead, I just need to tell him to run a single 
command to check out the plugin in question into the bzr plugins folder.

From an end-user convenience perspective, it's a pretty significant win.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 20:46                                         ` Shawn Pearce
@ 2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
                                                               ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 21:17 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
>
> I guess its my turn then to work in the mmap window code, huh?  :-)

There are bigger reasons to _never_ allow packs to contain deltas to 
outside of themselves:

 - there's no point. 

   If you have many small packs, you're doing something wrong. The whole 
   _point_ of packs is to put things into the same file, so that you can 
   avoid the filesystem overhead. And once packs are big and few, the 
   advantage of having deltas to outside the pack is basically zero.

 - it's a bad design. 

   Self-sufficient packs means that a pack is a "safe" thing. When the 
   index says that it contains an object, then it damn well contains it.

   In contrast, if you had packs that only contained a delta, and the pack 
   needed some _other_ pack (or loose object) to actually generate that 
   object, then it's not safe any more. You could end up with a situation 
   where you get two packs from two different sources, and they contain 
   deltas to _each_other_, and you have no way of actually generating the 
   object itself any more.

   (Or you end up having to have rules to figure out when you have a loop,
   and stop looking just in the packed files, and start looking for loose 
   objects instead)

   In other words, it has potentially _serious_ downsides.
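
The mutual-delta problem can be shown with a toy Python model. This is a
sketch under simplifying assumptions, not git code: `resolve_chain` is a
hypothetical helper, and each "pack" is reduced to a map from an object id to
the id of its delta base (or None for an object stored in full).

```python
from typing import Dict, List, Optional


def resolve_chain(obj: str, bases: Dict[str, Optional[str]]) -> List[str]:
    """Follow delta bases from obj until a full object is found.

    Raises LookupError when the chain leaves the store or loops back
    on itself, i.e. when the object cannot be regenerated.
    """
    chain: List[str] = []
    seen = set()
    while True:
        if obj in seen:
            raise LookupError("delta loop: " + " -> ".join(chain + [obj]))
        if obj not in bases:
            raise LookupError("missing base: " + obj)
        seen.add(obj)
        chain.append(obj)
        if bases[obj] is None:
            return chain
        obj = bases[obj]


# Pack 1 stores A as a delta against B; pack 2 stores B as a delta against A.
pack1 = {"A": "B"}
pack2 = {"B": "A"}
combined = {**pack1, **pack2}

try:
    resolve_chain("A", combined)
except LookupError as err:
    print(err)  # delta loop: A -> B -> A

# Here the chain ends in a full object, so A can be rebuilt.
print(resolve_chain("A", {"A": "B", "B": None}))  # ['A', 'B']
```

Neither pack is broken when looked at alone; the loop only appears once
both are in the same repository, which is what makes it hard to guard against.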

So DAMMIT! Stop looking to make the data structures worse. The fact is, 
the git data structures are FINE. They are well-designed. They work well. 
There's no _point_ in changing them, especially since changing them seems 
to be all about making things less reliable for dubious gain.

One of the advantages of git is that you can explain things with object 
relationships, and that the file format is stable as _hell_. That's a GOOD 
thing. Please realize that if you want to change the file formats, you'd 
better have a hell of a better reason for it than "just because I can".

Please. Really.

So next time somebody suggests a new pack-format, ask yourself:

 - does it save disk-space by 50% or more?

 - does it drop memory usage by 50% or more?

 - does it improve performance by 50% or more?

 - does it make something possible that really fundamentally isn't 
   possible right now?

And if the answer to those questions is "no", then JUST DON'T DO IT.

It really needs to be _damn_ spectacular to be worthy of a new format. 
Really. We've had a few of those, so it clearly does happen:

 - The "compress _after_ SHA1". The original object format was just 
   broken, and the SHA1 name depended on how things compressed. I fixed 
   it. It needed fixing. We couldn't have done a lot of the things we did 
   without switching compression and SHA1-hashing around.

 - the pack-file in the first place: this saved orders of magnitude both 
   in diskspace _and_ performance. Not "10%". More like "factors of 100".

   THAT was worthy of a major format change.

 - the "make loose object contents look the same as packed objects". This 
   was not just a cleanup, it allows us to create pack-files much faster. 

   That said, we're still defaulting to the legacy format, and maybe it 
   wasn't really worth it. 
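
To make the first item concrete, here is a small Python sketch of why the
order matters. It assumes the object name is the SHA1 of an uncompressed
"<type> <size>" header, a NUL byte, and the content; the original broken
format is not reproduced. `object_name` and `name_after_compression` are
names chosen for this example only.

```python
import hashlib
import zlib


def object_name(obj_type: str, content: bytes) -> str:
    """SHA-1 over the uncompressed header plus content."""
    header = f"{obj_type} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def name_after_compression(content: bytes, level: int) -> str:
    """What you get if you hash the compressed bytes instead."""
    return hashlib.sha1(zlib.compress(content, level)).hexdigest()


content = b"hello world\n" * 100

# Hashing before compression: the name is a function of the content only,
# whatever compression level is later used to store it.
print(object_name("blob", content))

# Hashing after compression: the "name" changes with the compressor settings.
print(name_after_compression(content, 1) == name_after_compression(content, 9))
# False
```

With the name fixed before compression, the stored bytes can be recompressed
or repacked without changing what the object is called.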

My personal suspicion is that we'll want to have a 64-bit index file some 
day, and THAT is worthy of a format change. That day is not now, btw. It's 
probably not even very close. Even the mozilla repo that was pushing the 
limit was only doing so until it was optimized better, and now it's 
apparently nowhere _near_ that limit.

But even then, we might well want to update _just_ the index file format.

Because in an SCM, stability and trustworthiness is more important than 
just about _anything_ else. 

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 14:52                           ` Linus Torvalds
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
@ 2006-10-18 21:20                             ` Jeff King
  1 sibling, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-18 21:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

On Wed, Oct 18, 2006 at 07:52:25AM -0700, Linus Torvalds wrote:

> 	git send origin..
> 
> and that "origin" is what the other end is expected to already have.
> 
> Of course, if you send an unconnected bundle (ie you give an origin that 
> the other end _doesn't_ have), you're screwed.

OK, that was how I was envisioning it, as well, but I was concerned
about the "screwed" part. But I'm not sure how often that would be an
issue in practice (after all, patches require some matchup of the base,
though not as strict as SHA1s).

Thanks for the explanation.

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
@ 2006-10-18 21:29                               ` Sean
  2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 21:29                               ` Sean
  2006-10-18 21:37                               ` Shawn Pearce
  2 siblings, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-18 21:29 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

On Wed, 18 Oct 2006 16:04:52 -0500
Charles Duffy <cduffy@spamcop.net> wrote:

> Example time!
> 
> There's a plugin for Bzr which adds support for Cygwin-compatible 
> symlink support on Windows. (IIRC, this involves monkey-patching some of 
> the Python standard library bits).
> 
> Now, this is something which is *proposed* as a feature to be merged 
> into upstream bzr, and it may happen at some point. That said, when I 
> have a Windows-using coworker who wants to check out a repository that 
> has symlinks in it (with his win32-native, no-cygwin-required bzr 
> upstream binary), I don't need to tell him to go download and build bzr 
> from a third party; instead, I just need to tell him to run a single 
> command to check out the plugin in question into the bzr plugins folder.
> 
>  From an end-user convenience perspective, it's a pretty significant win.

You'll need a better example than that.  Git has supported a version
of Cygwin-compatible symlink support on Windows for quite some time.
And no plugins were needed.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 21:29                               ` Sean
  2006-10-18 21:37                               ` Shawn Pearce
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 21:29 UTC (permalink / raw)
  To: Charles Duffy; +Cc: bazaar-ng, git

On Wed, 18 Oct 2006 16:04:52 -0500
Charles Duffy <cduffy@spamcop.net> wrote:

> Example time!
> 
> There's a plugin for Bzr which adds support for Cygwin-compatible 
> symlink support on Windows. (IIRC, this involves monkey-patching some of 
> the Python standard library bits).
> 
> Now, this is something which is *proposed* as a feature to be merged 
> into upstream bzr, and it may happen at some point. That said, when I 
> have a Windows-using coworker who wants to check out a repository that 
> has symlinks in it (with his win32-native, no-cygwin-required bzr 
> upstream binary), I don't need to tell him to go download and build bzr 
> from a third party; instead, I just need to tell him to run a single 
> command to check out the plugin in question into the bzr plugins folder.
> 
>  From an end-user convenience perspective, it's a pretty significant win.

You'll need a better example than that.  Git has supported a version
of Cygwin-compatible symlink support on Windows for quite some time.
And no plugins were needed.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
@ 2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:42                                               ` Junio C Hamano
  2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 21:41                                             ` Nicolas Pitre
                                                               ` (2 subsequent siblings)
  3 siblings, 2 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:32 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> >
> > I guess its my turn then to work in the mmap window code, huh?  :-)
> 
> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 
>  - it's a bad design. 

That and all of the other reasons you cited in your message are
why I haven't finished trying to use some sort of dictionary based
compression for packing objects.

On the other hand we've already seen how packs >1.5 GiB in size
(certainly well within the 4 GiB limitation in the current index
file format) cannot be repacked by git-repack-objects on a 32
bit address space as the entire pack file is mmap'd in one shot.
After the kernel space of ~1 GiB and the pack file at ~1.5 GiB
there's very little address space left for the application code.

My comment that you quoted was about mmap'ing the pack files in
large chunks (around 64-128 MiB at a time, but configurable from
.git/config) rather than as an entire massive mapping.  It had
absolutely nothing to do with changing the pack file format, the
index format, or any other on disk format.  Although it would add
a new pair of configuration options to .git/config.  Is that change
too radical?  :-)

With such a change the Git and Linux kernel repositories would both
still mmap in one chunk but much larger projects like Mozilla or
very large pack files coming out of git-fastimport would actually
be usable on 32 bit architectures without running into address space
limitations so quickly.  Git would also be slightly more usable for
some people who have a lot of very incompressible data stored in Git.
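
A rough Python sketch of the windowed-mmap idea, to show that it only changes
how the file is read and not what is on disk. This is an illustration, not
the proposed git patch: `read_range` is a hypothetical helper, the window
size here is tiny compared to the 64-128 MiB mentioned above, and windows are
mapped and dropped on every call where a real implementation would presumably
keep recently used ones around.

```python
import mmap
import os
import tempfile

GRANULARITY = mmap.ALLOCATIONGRANULARITY


def read_range(path: str, offset: int, length: int, window: int) -> bytes:
    """Read file[offset:offset+length], mapping at most `window` bytes at once.

    `window` must be a positive multiple of mmap.ALLOCATIONGRANULARITY,
    because mmap offsets have to be aligned to it.
    """
    if window <= 0 or window % GRANULARITY:
        raise ValueError("window must be a positive multiple of the granularity")
    size = os.path.getsize(path)
    end = min(offset + length, size)
    out = bytearray()
    with open(path, "rb") as f:
        pos = offset
        while pos < end:
            # Map only the aligned window that contains `pos`.
            win_start = (pos // window) * window
            win_len = min(window, size - win_start)
            with mmap.mmap(f.fileno(), win_len, access=mmap.ACCESS_READ,
                           offset=win_start) as m:
                take = min(end, win_start + win_len) - pos
                out += m[pos - win_start:pos - win_start + take]
                pos += take
    return bytes(out)


# Demo on a temporary file a bit larger than three windows.
data = bytes(i % 251 for i in range(3 * GRANULARITY + 100))
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(data)
try:
    start = GRANULARITY - 10          # this range straddles two windows
    chunk = read_range(tmp.name, start, 50, window=GRANULARITY)
    print(chunk == data[start:start + 50])  # True
finally:
    os.unlink(tmp.name)
```

The address space in use at any moment is bounded by the window size rather
than by the size of the pack.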


Unless of course you are actively working on a fix for the Linux
kernel so that we can actually have all 4 GiB of virtual address
space available for the userspace git-repack-objects process.
Or have some sort of secret plan to upgrade everyone who uses Git
to 64 bit processors which support 64 bit address spaces...

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
  2006-10-18 21:29                               ` Sean
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 21:37                               ` Shawn Pearce
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
  2006-10-18 23:38                                 ` Johannes Schindelin
  2 siblings, 2 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:37 UTC (permalink / raw)
  To: Sean; +Cc: Charles Duffy, git, bazaar-ng

Sean <seanlkml@sympatico.ca> wrote:
> On Wed, 18 Oct 2006 16:04:52 -0500
> Charles Duffy <cduffy@spamcop.net> wrote:
> 
> > Example time!
> > 
> > There's a plugin for Bzr which adds support for Cygwin-compatible 
> > symlink support on Windows. (IIRC, this involves monkey-patching some of 
> > the Python standard library bits).
> > 
> > Now, this is something which is *proposed* as a feature to be merged 
> > into upstream bzr, and it may happen at some point. That said, when I 
> > have a Windows-using coworker who wants to check out a repository that 
> > has symlinks in it (with his win32-native, no-cygwin-required bzr 
> > upstream binary), I don't need to tell him to go download and build bzr 
> > from a third party; instead, I just need to tell him to run a single 
> > command to check out the plugin in question into the bzr plugins folder.
> > 
> >  From an end-user convenience perspective, it's a pretty significant win.
> 
> You'll need a better example than that.  Git has supported a version
> of Cygwin-compatible symlink support on Windows for quite some time.
> And no plugins were needed.

Actually I think the only part of that example that was really
interesting was that Bzr runs natively on Windows and that Bzr's
native method of extending the tool with additional features doesn't
require Cygwin.


Today Git doesn't run natively on Windows.  It runs slowly through
Cygwin, thanks to lots of various overheads in different places.
And due to the crappy disk drive in my Windows box.  :-)

Today Git is typically extended (at least initially in prototyping
mode) through Perl, Python, TCL or Bourne shell scripts.  Although
the first three are available natively on Windows the last requires
Cygwin... and we've had some issues with ActiveState Perl on Windows
in the past too.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
  2006-10-18 20:53                                     ` Sean
@ 2006-10-18 21:39                                     ` Petr Baudis
       [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
  1 sibling, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 21:39 UTC (permalink / raw)
  To: Sean; +Cc: git

(Trimmed Cc' list, this is offtopic for bazaar-ng.)

Dear diary, on Wed, Oct 18, 2006 at 10:53:41PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 22:46:18 +0200
> Petr Baudis <pasky@suse.cz> wrote:
> 
> > They could be written, but certainly not "just as easily". I'm more used
> > to coding Cogito, I find it much more convenient than hacking git's
> > shell scripts (those two may be interconnected ;), and there's plenty of
> > infrastructure in Cogito missing in Git - Cogito has more flexible
> > arguments parsing, documentation bundled with code, I could just
> > cut'n'paste the code to handle -m arguments and message editor (and most
> > of it is libified anyway) so I got that basically for free, and I think
> > Cogito beats Git hands down in code readability.
> 
> Hmmm, if I get some time over the weekend i'll take a look at porting
> them to Git.  But maybe some of the items you mentioned above deserve
> to become part of Git proper?  It would definitely be nice to see
> something like what you just did put into the hands of more users than
> just those using Cogito, and its unfortunate that the current state
> of Git code kept you from going that route.

You can use just this single tool from Cogito. ;-)

The point is, I'll of course prefer doing this stuff in Cogito while I'm
enhancing Cogito, and I'll work on Cogito while I and others will be
using it. I didn't move on to pure Git a long time ago since I simply
consider its UI much inferior to Cogito's. Sure, given enough time and
work, it is fixable - but UI flaws are very hard to fix and I find it
more effective to work on Cogito for the time being, at least until I
bring it to 1.0, then I'll see.

Besides, I'm used to Cogito. :-)

So yes, current Git code definitely is a part of the reason, but it is
certainly not the main part of it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
@ 2006-10-18 21:41                                             ` Nicolas Pitre
  2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
  3 siblings, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-18 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Shawn Pearce, Junio C Hamano, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 

Remember what I said earlier: "If there are advantages to do so then 
maybe."  So far there are none.

>    You could end up with a situation where you get two packs from two 
>    different sources, and they contain deltas to _each_other_, and you 
>    have no way of actually generating the object itself any more.

To me this is the real killer.
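
The mutual-delta situation quoted above can be illustrated with a small sketch. It is a toy model only: the store layout, the "delta" encoding (base content plus a suffix) and the name resolve() are assumptions made for illustration, not Git's pack format or code.

```python
def resolve(object_id, store, in_progress=None):
    """Return the full content of object_id, following delta chains.

    `store` maps an object id to ("full", data) or ("delta", base_id, suffix).
    The toy "delta" simply appends `suffix` to the resolved base content.
    """
    if in_progress is None:
        in_progress = set()
    if object_id in in_progress:
        # We came back to an object we are still trying to rebuild.
        raise ValueError(f"delta cycle through {object_id}")
    entry = store[object_id]
    if entry[0] == "full":
        return entry[1]
    _, base_id, suffix = entry
    in_progress.add(object_id)
    try:
        return resolve(base_id, store, in_progress) + suffix
    finally:
        in_progress.discard(object_id)


# Pack 1 stores A as a delta against B; pack 2 stores B as a delta against A.
pack_one = {"A": ("delta", "B", b"+a")}
pack_two = {"B": ("delta", "A", b"+b")}
combined = {**pack_one, **pack_two}
```

With both packs present, neither A nor B can be regenerated, because no full copy of either object exists anywhere. A self-contained pack can never get into this state, since every delta chain has to end at a full object inside the same pack.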

Shawn was talking about a different issue though.


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
  2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:41                                             ` Nicolas Pitre
@ 2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 22:00                                               ` Linus Torvalds
  2006-10-18 22:13                                               ` Junio C Hamano
  2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
  3 siblings, 2 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> There are bigger reasons to _never_ allow packs to contain deltas to 
> outside of themselves:
> 
>  - there's no point. 

Actually there is a point to storing thin packs.  When I pull from
a remote repo (or push to a remote repo) a huge number of objects
and the target disk that is about to receive that huge number of
loose objects is slooooooooow I would rather just store the thin
pack than store the loose objects.

Ideally that thin pack would be repacked (along with the other
existing packs) as quickly as possible into a self-contained pack.
But that of course is unlikely to happen in practice; especially
on a push.
 
>  - it's a bad design. 
> 
>    In other words, it has potentially _serious_ downsides.

Yes, it does.

But it could also be useful when you fetch 20k+ objects onto a
Windows system or push 1k+ objects onto the slowest NFS system I
have ever seen...  where writing file data (aka packs) is reasonable
but creating or deleting files takes nearly 1 second per file.
I don't want to kill the better part of an hour waiting for a push
to complete!

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:32                                             ` Shawn Pearce
@ 2006-10-18 21:42                                               ` Junio C Hamano
  2006-10-18 21:52                                                 ` Shawn Pearce
  2006-10-18 21:55                                               ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 21:42 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> ...  Although it would add
> a new pair of configuration options to .git/config.  Is that change
> too radical?  :-)

I wonder what you would need the configuration options for.

If mmap() pack works well, it works well, and if it is broken
nobody has reason to enable it.  The code should be able to
adjust the mmap window to appropriate size itself and its
automatic adjustment does not even have to be the absolute
optimum (since the user would not know what the optimum would be
anyway), so maybe your configuration options would not be
"enable" nor "window-size" -- and I am puzzled as to what they
are.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
@ 2006-10-18 21:44                                   ` Sean
  2006-10-18 21:52                                   ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 21:44 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Wed, 18 Oct 2006 17:37:03 -0400
Shawn Pearce <spearce@spearce.org> wrote:

> Today Git is typically extended (at least initially in prototyping
> mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> the first three are available natively on Windows the last requires
> Cygwin... and we've had some issues with ActiveState Perl on Windows
> in the past too.

Just for kicks and giggles it would be nice if someone tried out
one of the native Windows bourne shell ports[1] just to see how much
is missing.  A bunch of command line utilities would have to be ported
as well; maybe too many.  But i've held out booting a Windows box
for a long time so.... not it!

Sean

[1] For example, http://www.steve.org.uk/Software/bash/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18  5:26                     ` Robert Collins
@ 2006-10-18 21:46                       ` Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
                                           ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-18 21:46 UTC (permalink / raw)
  To: Robert Collins; +Cc: Petr Baudis, bazaar-ng, git

On Wed, Oct 18, 2006 at 03:26:40PM +1000, Robert Collins wrote:
> revnos visibly change as your work is merged into the mainline - we've
> been doing this for years without trouble: ones own commits to a branch
> get '3', '4', '5' etc as revnos, and when they are merged to the
> mainline they used to stop having revnos at all, but now they will be
> given this dotted decimal revno. If you pull from the mainline after the
> merge, you see the new numbers, and when you look at mainline you can
> see the difference. So while I agree that the surprise the user gets is
> inversely related to the frequency with which they see the behaviour, I
> think our users see it a lot, so are not surprised much.
> 
> FWIW, we're not optimising for mostly straight histories as I understand
> such things : our own history has 3 commits on branches to every one on
> the mainline.

Reading this thread, I came to think that the revnos should be assigned
to _all_ revisions _available_, in the order in which they entered the
repository (there are some possible variations I will mention below).

 - Such revnos would be purely local, but:
   - Current revnos are not guaranteed to be the same in different
     branches either.
   - They could be done so that mirror has the same revnos as the
     master.
 - They would be easier to use than the dotted ones. What (at least as
   far as I understand) makes revnos easier to use than revids is that
   you can remember a few of them for a short time while composing some
   operation, i.e. look up 2 or 3 revisions in the log and then run some
   command on them. And a 4 to 5-digit number like 10532 is easier to
   remember than something like 3250.2.45.86.
 - Their ordering would be an (arbitrary) superset of the partial
   ordering by descendance, ie. if revision A is ancestor of B, it would
   always have lower revno.
   - The intuition that lower revno means older revision would be always
     valid for related revisions and approximately valid for unrelated
     ones.
 - They would be *locally stable*. That is, once assigned, a revno would
   always mean the same revision in a given branch (as determined by
   location, not tip).
     - This is more than the current scheme can give, since now pull can
       renumber revisions.
 - They wouldn't make any branch special, so the objections Linus raised
   do not apply.
 - They would be the same kind of revnos that Subversion and svk (and,
   IIRC, Mercurial as well) use, so:
   - They would already be familiar to users coming from those systems.
   - They are known to be useful that way. In fact, for svk it's the only
     way to refer to revisions and it seems to work satisfactorily (though
     note that svk is not really suitable for ad-hoc topologies).

As mentioned above, there are two options for assigning them:

 - Repository-wide: A number would be assigned to each revision entering
   the repository, even when it is not in the ancestry of any branch
   (e.g. if one starts a merge, but then reverts it).
   - Advantages:
     - Simpler to implement (just log every written-out revision).
     - All branches in the same repository use the same revision
       numbers, so if you keep branches in a shared repo, it is easier
       to look up one revision in the log of one branch, another in the
       log of another branch, and run diff on them.
   - Disadvantages:
     - Mirror only has the same revnos if both master and the mirror are
       stand-alone branches.
 - Branch-wide: A number would be assigned to each revision that becomes
   an ancestor of the current head revision.
   - Advantages:
     - A mirror (always updated by push from the same source) always has
       the same revision numbers.
     - The revno assignment list could be reused for referring to state
       at a particular point in time (in fact, it would be exactly the
       same thing as git reflog).
     - Bound branches could be forced to have the same revnos.
   - Disadvantages:
     - More complex to implement.
     - More work at runtime and more space needed in a shared
       repository, since each branch has its own mapping.

Either way, it would be implemented the way revision-history currently
is, except that it would list all revisions, not just the path along the
leftmost parent.
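
A minimal sketch of the branch-wide variant, to make the proposal concrete. Everything here is hypothetical: RevnoLog, record() and the `parents` mapping are illustrative names, not existing Bazaar code. The idea is an append-only list in which each newly reachable revision gets the next number, with ancestors numbered before their descendants.

```python
class RevnoLog:
    """Append-only list of revision ids; a revno is the 1-based position."""

    def __init__(self):
        self._order = []   # revids in the order they became ancestors of the head
        self._revno = {}   # revid -> revno

    def record(self, head, parents):
        """Number every not-yet-numbered ancestor of `head`, ancestors first.

        `parents` maps a revid to the list of its parent revids (assumed
        input; a real implementation would read it from the revision store).
        """
        stack = [(head, False)]
        while stack:
            revid, parents_done = stack.pop()
            if revid in self._revno:
                continue               # already numbered, so the revno is stable
            if parents_done:
                self._order.append(revid)
                self._revno[revid] = len(self._order)
            else:
                stack.append((revid, True))
                # Push in reverse so the leftmost parent is numbered first.
                for parent in reversed(parents.get(revid, [])):
                    stack.append((parent, False))

    def revno(self, revid):
        return self._revno[revid]

    def revid(self, revno):
        return self._order[revno - 1]
```

Calling record() after every commit or pull only ever appends to the list, so existing numbers never change in that branch, and an ancestor always has a lower revno than any of its descendants. Two branches that received revisions in a different order will disagree on the numbers, which is the "purely local" property described above.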

Comments?

(Should I put it on the wiki?)

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
  2006-10-17 22:56                         ` Sean
  2006-10-17 22:56                         ` Sean
@ 2006-10-18 21:51                         ` Petr Baudis
  2 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 21:51 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

Dear diary, on Wed, Oct 18, 2006 at 12:56:22AM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Tue, 17 Oct 2006 18:44:11 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> > Plugins also don't have a Bazaar's rigid release cycle, testing
> > requirements and coding conventions, so they are a convenient way to try
> > out an idea, before committing to the effort of getting it merged into
> > the core.
> 
> Hmm.. It's pretty easy to test out Git ideas too.  People do it all
> the time, and without plugins.  Junio maintains several such trees
> for instance.  Dunno.. I just think plugs _sounds_ good to developers
> without much real benefit to users over regular ole source code.

I think this is just another cultural difference. Git comes from the
kernel environment (although it is currently used in far more
environments than just the kernel and kernel-related stuff) and the
_kernel_'s development style is that you want to get as much stuff as
possible inside the kernel, and on the other hand don't care at all
about breaking in-kernel APIs and such.

The Git "plumbing" is very much the "kernel". We aren't as much
interested in having support for external bits of code poking in the Git
innards, we would much rather have them integrated into Git as soon as
possible rather than live around externally. OTOH, the "kernel" gives a
very flexible ("UNIXy") API to the writhing mass of porcelain scripts you
may call the "userland".

I'm not saying it must always be a sharply better approach than the
plugin-encouraging approach. It's just how it is. (Also, another reason
is probably a purely technical one: it is much easier to have pluggable
functions in scripting languages that support "monkey-patching" than
to have them in C, since there you actually need to explicitly add all
the hooks etc. So in Python, to a large extent you get the plugin
support for free.)
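
To illustrate the monkey-patching point, here is a small sketch of a "plugin" that changes a core method at run time without the core providing any hook. Repository and install_signoff_plugin are made-up names for illustration, not Bazaar's or Git's API.

```python
class Repository:
    """Stand-in for a core class; it knows nothing about plugins."""

    def commit(self, message):
        return f"committed: {message}"


def install_signoff_plugin(cls, signer):
    """Replace cls.commit at run time so every message gets a sign-off line."""
    original_commit = cls.commit

    def commit_with_signoff(self, message):
        return original_commit(self, f"{message}\n\nSigned-off-by: {signer}")

    cls.commit = commit_with_signoff
    return original_commit   # returned so the caller can undo the patch


original = install_signoff_plugin(Repository, "A U Thor <author@example.com>")
result = Repository().commit("fix typo")
```

The core class needed no changes for this to work. In C the equivalent requires the core to expose a function pointer or callback table in advance, which is the explicit hook work mentioned above.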

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:42                                               ` Junio C Hamano
@ 2006-10-18 21:52                                                 ` Shawn Pearce
  2006-10-18 22:02                                                   ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 21:52 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Junio C Hamano <junkio@cox.net> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
> 
> > ...  Although it would add
> > a new pair of configuration options to .git/config.  Is that change
> > too radical?  :-)
> 
> I wonder what you would need the configuration options for.
> 
> If mmap() pack works well, it works well, and if it is broken
> nobody has reason to enable it.  The code should be able to
> adjust the mmap window to appropriate size itself and its
> automatic adjustment does not even have to be the absolute
> optimum (since the user would not know what the optimum would be
> anyway), so maybe your configuration options would not be
> "enable" nor "window-size" -- and I am puzzled as to what they

All very true.

However what do we do about the case where we mmap over 1 GiB worth
of pack data (because the mmap succeeds and we have at least that
much in .pack and .idx files) and then the application starts to
demand a lot of memory via malloc?  At some point malloc will return
NULL, xmalloc will die(), and that's the end of the program.

If the user was able to set the maximum threshold of how much data
we mmap then they could initially prevent us from mmap'ing over 1 GiB;
instead using a smaller upper limit like 512 MiB.

Of course as I write this I think the better solution to this
problem is to simply modify xmalloc (and friends) so that if the
underlying malloc returned NULL and we have a large amount of stuff
mmap'd from packs we try releasing some of the unused pack windows
and retry the malloc before die()'ing.
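
A rough sketch of that retry idea, written as a toy Python model rather than the real C code. The class, its fields and the single shared memory limit are all assumptions for illustration; only the control flow (release an unused window, retry, give up when nothing is left to release) is the point.

```python
class ToyProcess:
    """Toy model: heap allocations and mapped pack windows share one limit."""

    def __init__(self, memory_limit):
        self.memory_limit = memory_limit
        self.heap_used = 0
        self.windows = []   # oldest first; each is {"size": int, "in_use": bool}

    def mapped(self):
        return sum(w["size"] for w in self.windows)

    def _malloc(self, size):
        """Stand-in for malloc(): False plays the role of a NULL return."""
        if self.heap_used + self.mapped() + size > self.memory_limit:
            return False
        self.heap_used += size
        return True

    def _release_unused_window(self):
        """Unmap the oldest window nobody is using; report whether one existed."""
        for i, window in enumerate(self.windows):
            if not window["in_use"]:
                del self.windows[i]
                return True
        return False

    def xmalloc(self, size):
        """Retry the allocation after freeing windows; fail only as a last resort."""
        while not self._malloc(size):
            if not self._release_unused_window():
                raise MemoryError("out of memory, malloc failed")
        return size
```

In this model a window that is still in use is never released, so xmalloc() still fails when the remaining mappings are all busy.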


The other configuration option is the size of the mmap window.
This should by default be at least 32 MiB, probably closer to
128 MiB.  But it's nice to be able to force it as low as a single
system page to set up test cases in the t/ directory for the mmap
window code.

Earlier this summer we discussed this exact issue and said this
value probably needs to be configurable if only to facilitate the
unit tests.
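
For illustration, a minimal sketch of reading through a sliding mmap window with a configurable window size. It uses Python's standard mmap module; read_via_window and its parameters are hypothetical names, and the real code would keep windows cached instead of remapping on every call. Note that Python requires map offsets to be multiples of mmap.ALLOCATIONGRANULARITY, which is the page size on most Unix systems, so that is the smallest window this sketch accepts.

```python
import mmap


def read_via_window(path, offset, length, window_size=mmap.ALLOCATIONGRANULARITY):
    """Read `length` bytes at `offset`, mapping only one window at a time."""
    if window_size <= 0 or window_size % mmap.ALLOCATIONGRANULARITY:
        raise ValueError("window_size must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    chunks = []
    with open(path, "rb") as f:
        file_size = f.seek(0, 2)                 # seek to end to learn the size
        end = min(offset + length, file_size)
        while offset < end:
            # Align the window start so the requested offset falls inside it.
            window_start = (offset // window_size) * window_size
            window_len = min(window_size, file_size - window_start)
            with mmap.mmap(f.fileno(), window_len, access=mmap.ACCESS_READ,
                           offset=window_start) as window:
                lo = offset - window_start
                hi = min(end - window_start, window_len)
                chunks.append(window[lo:hi])
            offset = window_start + hi           # continue in the next window
    return b"".join(chunks)
```

Forcing the window down to the minimum size makes even a tiny test file span several windows, which is the kind of test setup the configuration option is meant to allow.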

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
  2006-10-18 21:44                                   ` Sean
@ 2006-10-18 21:52                                   ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 21:52 UTC (permalink / raw)
  To: Sean; +Cc: Shawn Pearce, git

Dear diary, on Wed, Oct 18, 2006 at 11:44:50PM CEST, I got a letter
where Sean <seanlkml@sympatico.ca> said that...
> On Wed, 18 Oct 2006 17:37:03 -0400
> Shawn Pearce <spearce@spearce.org> wrote:
> 
> > Today Git is typically extended (at least initially in prototyping
> > mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> > the first three are available natively on Windows the last requires
> > Cygwin... and we've had some issues with ActiveState Perl on Windows
> > in the past too.
> 
> Just for kicks and giggles it would be nice if someone tried out
> one of the native Windows bourne shell ports[1] just to see how much
> is missing.  A bunch of command line utilities would have to be ported
> as well; maybe too many.  But i've held out booting a Windows box
> for a long time so.... not it!

I think that before starting to think about the porcelain scripts, you
need to port the plumbing. :-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
@ 2006-10-18 21:54                                         ` Sean
  0 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 21:54 UTC (permalink / raw)
  To: Petr Baudis; +Cc: git

On Wed, 18 Oct 2006 23:39:35 +0200
Petr Baudis <pasky@suse.cz> wrote:

> You can use just this single tool from Cogito. ;-)

I'd rather not have to keep two separate tools up to date, i just want
to install Git and have all these features installed.  Especially since
there is so much overlap in what these two packages do.  That would seem
like the best thing to do for most users in fact, asking them to install
and keep both up to date just doesn't make sense, to me at least.

> The point is, I'll of course prefer doing this stuff in Cogito while I'm
> enhancing Cogito, and I'll work on Cogito while I and others will be
> using it. I didn't move on to pure Git a long time ago since I simply
> consider its UI much inferior to Cogito's. Sure, given enough time and
> work, it is fixable - but UI flaws are very hard to fix and I find it
> more effective to work on Cogito for the time being, at least until I
> bring it to 1.0, then I'll see.
> 
> Besides, I'm used to Cogito. :-)
> 
> So yes, current Git code definitely is a part of the reason, but it is
> certainly not the main part of it.

It's just a shame that your talents are split off from helping the main
project more.  Git would be further along today in content and PR if it
had managed to attract you back from your Cogito adventure.  Then all
the nice things you're able to say about Cogito might then be said
about Git proper, and maybe we'd attract even more users.

While you've contributed more to Git than many others (including me
obviously), it would sure be nice to see you back full time on Git.
I want to type "git bundle" without having to install more
software damnit ;o)  But of course you have to decide what's best
for yourself.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:32                                             ` Shawn Pearce
  2006-10-18 21:42                                               ` Junio C Hamano
@ 2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 22:05                                                 ` Shawn Pearce
  2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 21:55 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> My comment that you quoted was about mmap'ing the pack files in
> large chunks (around 64-128 MiB at a time, but configurable from
> .git/config) rather than as an entire massive mapping.

Sure. I agree that we should do that, if only because it's clearly getting 
hard to handle large pack-files on a 32-bit architecture.

You just seemed to say that in the _context_ of wanting to support having 
multiple pack-files open (in order to allow deltas to refer to things 
outside their own pack-file).

I just wanted to head that particular idea off at the pass.

I think thin packs have been a good idea, and they certainly cut the 
amount of data sent over the network down by a large amount (much more 
than 50%), so I think thin packs are a great idea. Just _not_ when 
indexed.

So I don't object to mmap windows at all. I object to them only in the 
context of "they would allow us to use deltas between two different packs"
discussion ;)

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:17                                           ` Linus Torvalds
                                                               ` (2 preceding siblings ...)
  2006-10-18 21:41                                             ` Shawn Pearce
@ 2006-10-18 21:56                                             ` Junio C Hamano
  3 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 21:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> My personal suspicion is that we'll want to have a 64-bit index file some 
> day, and THAT is worthy of a format change. That day is not now, btw. It's 
> probably not even very close. Even the mozilla repo that was pushing the 
> limit was only doing so until it was optimized better, and now it's 
> apparently nowhere _near_ that limit.
>
> But even then, we might well want to update _just_ the index file format.

We've tried this already, and I shelved the patch for 64-index
for now due to exactly the same reasoning as yours (and it would
have conflicted heavily with Shawn's windowed-mmap() patch).  It
involved updating just the index file format, so you are right
on both counts.

But you are always right anyway, so it may not be news at all
;-).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:41                                             ` Shawn Pearce
@ 2006-10-18 22:00                                               ` Linus Torvalds
  2006-10-18 22:11                                                 ` Shawn Pearce
  2006-10-18 22:13                                               ` Junio C Hamano
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 22:00 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicolas Pitre, Junio C Hamano, git



On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> Actually there is a point to storing thin packs.  When I pull from
> a remote repo (or push to a remote repo) a huge number of objects
> and the target disk that is about to receive that huge number of
> loose objects is slooooooooow I would rather just store the thin
> pack than store the loose objects.
> 
> Ideally that thin pack would be repacked (along with the other
> existing packs) as quickly as possible into a self-contained pack.
> But that of course is unlikely to happen in practice; especially
> on a push.

I'm really nervous about keeping thin packs around. 

But a possibly good (and fairly simple) alternative would be to just 
create a non-thin pack on the receiving side. Right now we unpack into a 
lot of loose objects, but it should be possible to instead "unpack" into a 
non-thin pack.

In other words, we could easily still use the thin pack for communication, 
we'd just "fill it out" on the receiving side.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:52                                                 ` Shawn Pearce
@ 2006-10-18 22:02                                                   ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:02 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> However what do we do about the case where we mmap over 1 GiB worth
> of pack data (because the mmap succeeds and we have at least that
> much in .pack and .idx files) and then the application starts to
> demand a lot of memory via malloc?...
>
> The other configuration option is the size of the mmap window.
>...
> Earlier this summer we discussed this exact issue and said this
> value probably needs to be configurable if only to facilitate the
> unit tests.

I see.  So you are allowing users to control individual window
size and total mmap memory.  That makes sense.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:55                                               ` Linus Torvalds
@ 2006-10-18 22:05                                                 ` Shawn Pearce
  2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 22:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> So I don't object to mmap windows at all. I object to them only in the 
> context of "they would allow us to use deltas between two different packs"
> discussion ;)

Having mmap windows or not has no impact on using deltas between
packs.  We already map multiple packs at once.  We just don't do
delta resolution between them, for the reasons you have already
given.

The two are totally unrelated.  I apologize for somehow making
you (and others) think they are.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:55                                               ` Linus Torvalds
  2006-10-18 22:05                                                 ` Shawn Pearce
@ 2006-10-18 22:07                                                 ` Junio C Hamano
  1 sibling, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> I think thin packs have been a good idea, and they certainly cut the 
> amount of data sent over the network down by a large amount (much more 
> than 50%), so I think thin packs are a great idea. Just _not_ when 
> indexed.

Ah, I feel quite behind.  I was about to say "oh have you been
pushing with --thin option?", and then realized that we made it
default since late March this year.

I need to run memtest86 on myself X-<.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:00                                               ` Linus Torvalds
@ 2006-10-18 22:11                                                 ` Shawn Pearce
  0 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 22:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git

Linus Torvalds <torvalds@osdl.org> wrote:
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> > 
> > Actually there is a point to storing thin packs.  When I pull from
> > a remote repo (or push to a remote repo) a huge number of objects
> > and the target disk that is about to receive that huge number of
> > loose objects is slooooooooow I would rather just store the thin
> > pack than store the loose objects.
> > 
> > Ideally that thin pack would be repacked (along with the other
> > existing packs) as quickly as possible into a self-contained pack.
> > But that of course is unlikely to happen in practice; especially
> > on a push.
> 
> I'm really nervous about keeping thin packs around. 
> 
> But a possibly good (and fairly simple) alternative would be to just 
> create a non-thin pack on the receiving side. Right now we unpack into a 
> lot of loose objects, but it should be possible to instead "unpack" into a 
> non-thin pack.
> 
> In other words, we could easily still use the thin pack for communication, 
> we'd just "fill it out" on the receiving side.

Funny, I had the same thought.  :-)

We already know how many objects are coming in on a thin pack;
it's right there in the header.  We could just have some threshold
at which we start writing a full pack rather than unpacking.

Writing such a full pack would be a simple matter of copying the
input stream out to a temporary pack, but sticking any delta bases
into a table in memory.  At the end of the data stream if we have any
delta bases which weren't actually in that pack then find them and
copy them onto the end, update the header and recompute the checksum.
git-fastimport does some of that already, though it's trivial code...

Worst case scenario would be the incoming thin pack is 100% deltas
as we would need to copy in a base object for every object mentioned
in the pack.
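
The steps above can be sketched roughly as follows. This is a toy model, not the real pack format: entries are plain tuples, complete_thin_pack and `local_objects` are made-up names, missing bases are assumed to be available as full objects, and the "checksum" is just a SHA-1 over the toy serialization.

```python
import hashlib


def complete_thin_pack(entries, local_objects):
    """Turn a thin pack into a self-contained one; return (entries, checksum).

    `entries` is a list of ("full", oid, data) or ("delta", oid, base_oid, delta)
    tuples. `local_objects` maps oid -> data for objects already in the
    receiving repository.
    """
    out = []
    present = set()        # objects contained in the pack itself
    needed_bases = []      # delta bases, in first-seen order
    for entry in entries:  # single pass: copy the stream, remember the bases
        out.append(entry)
        present.add(entry[1])
        if entry[0] == "delta" and entry[2] not in needed_bases:
            needed_bases.append(entry[2])
    for base_oid in needed_bases:
        if base_oid not in present:
            # Base was not sent: copy it from the local repository onto the end.
            out.append(("full", base_oid, local_objects[base_oid]))
            present.add(base_oid)
    digest = hashlib.sha1()
    digest.update(len(out).to_bytes(4, "big"))   # "header": updated object count
    for entry in out:
        digest.update(repr(entry).encode())
    return out, digest.hexdigest()
```

In the worst case described above, every incoming entry is a delta against a different local object, so the output holds twice as many objects as the input.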

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 21:41                                             ` Shawn Pearce
  2006-10-18 22:00                                               ` Linus Torvalds
@ 2006-10-18 22:13                                               ` Junio C Hamano
  2006-10-18 22:42                                                 ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:13 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Linus Torvalds, Nicolas Pitre, git

Shawn Pearce <spearce@spearce.org> writes:

> Ideally that thin pack would be repacked (along with the other
> existing packs) as quickly as possible into a self-contained pack.

It should not be hard to write another program that generates a
packfile like pack-objects does but taking a thin pack as its
input.  Then receive-pack can drive it instead of
unpack-objects.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
@ 2006-10-18 22:14                         ` Jakub Narebski
  2006-10-19  5:45                           ` Jan Hudec
  2006-10-19  8:19                         ` Alexander Belchenko
  2006-10-20  2:09                         ` Horst H. von Brand
  2 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18 22:14 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec wrote:

> Comments?

What about fetching from a repository? With revnos you have to assign a revno
to every commit you have downloaded, whereas now you only need to unpack the
received pack (or not, if you used the --keep option). That is more work.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:13                                               ` Junio C Hamano
@ 2006-10-18 22:42                                                 ` Linus Torvalds
  2006-10-18 22:48                                                   ` Junio C Hamano
  2006-10-18 23:18                                                   ` Nicolas Pitre
  0 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-18 22:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn Pearce, Nicolas Pitre, git



On Wed, 18 Oct 2006, Junio C Hamano wrote:
> 
> It should not be hard to write another program that generates a
> packfile like pack-objects does but taking a thin pack as its
> input.  Then receive-pack can drive it instead of
> unpack-objects.

Give me half an hour. It should be trivial to make "unpack-objects" write 
the "unpacked" objects into a pack-file instead.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:42                                                 ` Linus Torvalds
@ 2006-10-18 22:48                                                   ` Junio C Hamano
  2006-10-18 23:22                                                     ` Shawn Pearce
  2006-10-18 23:18                                                   ` Nicolas Pitre
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-18 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Wed, 18 Oct 2006, Junio C Hamano wrote:
>> 
>> It should not be hard to write another program that generates a
>> packfile like pack-objects does but taking a thin pack as its
>> input.  Then receive-pack can drive it instead of
>> unpack-objects.
>
> Give me half an hour. It should be trivial to make "unpack-objects" write 
> the "unpacked" objects into a pack-file instead.

Heh, three people having the same idea that goes in the same
direction at the same time is not necessarily a good sign of
efficient project management...

I am currently fighting with FC5 so please go ahead.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:42                                                 ` Linus Torvalds
  2006-10-18 22:48                                                   ` Junio C Hamano
@ 2006-10-18 23:18                                                   ` Nicolas Pitre
  2006-10-18 23:50                                                     ` Johannes Schindelin
  2006-10-19  0:07                                                     ` Linus Torvalds
  1 sibling, 2 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-18 23:18 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Shawn Pearce, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> > 
> > It should not be hard to write another program that generates a
> > packfile like pack-objects does but taking a thin pack as its
> > input.  Then receive-pack can drive it instead of
> > unpack-objects.
> 
> Give me half an hour. It should be trivial to make "unpack-objects" write 
> the "unpacked" objects into a pack-file instead.

If you use builtin-unpack-objects.c from next, you'll be able to 
generate the pack index pretty easily as well, as all the needed info is 
stored in the obj_list array.  Just need to append objects remaining on 
the delta_list array to the end of the pack, sort the obj_list by sha1 
and write the index.
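
As a rough sketch of the "sort by sha1" step, assuming nothing about the
real obj_list layout: struct idx_entry and prepare_index() are invented
for this example, and the 256-slot fan-out table is the part of the pack
index that lets a lookup narrow its search by the first byte of the
name. Actually writing the file out is left aside.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: where an object starts in the pack, and its name. */
struct idx_entry {
	unsigned long offset;
	unsigned char sha1[20];
};

static int cmp_sha1(const void *a, const void *b)
{
	const struct idx_entry *ea = a, *eb = b;

	return memcmp(ea->sha1, eb->sha1, 20);
}

/*
 * Sort the entries by name and fill in a 256-slot fan-out table:
 * fanout[i] is the number of objects whose first name byte is <= i,
 * so fanout[255] ends up as the total object count.
 */
static void prepare_index(struct idx_entry *e, unsigned int nr,
			  unsigned int fanout[256])
{
	unsigned int i, n = 0;

	qsort(e, nr, sizeof(*e), cmp_sha1);
	for (i = 0; i < 256; i++) {
		while (n < nr && e[n].sha1[0] <= i)
			n++;
		fanout[i] = n;
	}
}
```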

Pretty trivial indeed.


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 22:48                                                   ` Junio C Hamano
@ 2006-10-18 23:22                                                     ` Shawn Pearce
  0 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-18 23:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Junio C Hamano <junkio@cox.net> wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Wed, 18 Oct 2006, Junio C Hamano wrote:
> >> 
> >> It should not be hard to write another program that generates a
> >> packfile like pack-objects does but taking a thin pack as its
> >> input.  Then receive-pack can drive it instead of
> >> unpack-objects.
> >
> > Give me half an hour. It should be trivial to make "unpack-objects" write 
> > the "unpacked" objects into a pack-file instead.
> 
> Heh, three people having the same idea that goes in the same
> direction at the same time is not necessarily a good sign of
> efficient project management...

Or maybe it is just a sign of a good way to resolve the issue I
was raising.  :-)

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 21:29                               ` Sean
@ 2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 23:48                                   ` Johannes Schindelin
                                                     ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Charles Duffy @ 2006-10-18 23:31 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Sean wrote:
> You'll need a better example than that.  Git has supported a version
> of Cygwin-compatible symlink support on Windows for quite some time.
> And no plugins were needed.

The win32-compatible symlink support is not, in and of itself, the point.

The point is that core, pervasive functionality can be modified at 
runtime, with no recompilation or installation of tools not included in 
the bzr package itself, simply by dropping a directory into place. This 
means that folks who don't have the skillset to merge three branches 
together (say, upstream plus two different trees adding extra 
functionality) and run a build can still install a few plugins to 
enhance their copy of bzr (which was installed by their IT staff, or a 
shiny click-through idiot-friendly Windows installer, etc).

And yes, there are people like that who are part of bzr's target 
audience. Think (of the lower end of the set of) DBAs, QA folk and such.


Granted, I'm speaking with my IT hat on here rather than my developer 
hat -- but plugins are a pretty clear usability win.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 21:37                               ` Shawn Pearce
       [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
@ 2006-10-18 23:38                                 ` Johannes Schindelin
  2006-10-18 23:54                                   ` Petr Baudis
  1 sibling, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:38 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Hi,

On Wed, 18 Oct 2006, Shawn Pearce wrote:

> Today Git doesn't run natively on Windows.

As I mentioned some time ago, I started a branch on MinGW. It works quite 
well for the moment, but it lacks fork() emulation, and glob() emulation. 
And I lack the time to continue working on it.

> Today Git is typically extended (at least initially in prototyping
> mode) through Perl, Python, TCL or Bourne shell scripts.  Although
> the first three are available natively on Windows the last requires
> Cygwin... and we've had some issues with ActiveState Perl on Windows
> in the past too.

Those are not the only problems with scripting. Scripting is fine for 
prototyping, but _anything_ remotely serious should be implemented using a 
portable (!) and safe (!) API.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:31                                 ` Charles Duffy
@ 2006-10-18 23:48                                   ` Johannes Schindelin
  2006-10-19  1:58                                     ` Charles Duffy
  2006-10-18 23:48                                   ` Jakub Narebski
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:48 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

Hi,

On Wed, 18 Oct 2006, Charles Duffy wrote:

> The point is that core, pervasive functionality can be modified at 
> runtime, with no recompilation or installation of tools not included in 
> the bzr package itself, simply by dropping a directory into place.

Please note that this is not welcome here. I _need_ to trust my SCM. And 
_that_ means that no strange non-mainline beast can be allowed to change 
core features.

So, the wonderful upside of plugins you described here is actually the 
reason I will never, _never_ use bzr with plugins.

Ciao,
Dscho

--

It's not paranoia. It's called experience.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:31                                 ` Charles Duffy
  2006-10-18 23:48                                   ` Johannes Schindelin
@ 2006-10-18 23:48                                   ` Jakub Narebski
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-18 23:48 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Charles Duffy wrote:

> Sean wrote:
>> You'll need a better example than that.  Git has supported a version
>> of Cygwin-compatible symlink support on Windows for quite some time.
>> And no plugins were needed.
> 
> The win32-compatible symlink support is not, in and of itself, the point.
> 
> The point is that core, pervasive functionality can be modified at 
> runtime, with no recompilation or installation of tools not included in 
> the bzr package itself, simply by dropping a directory into place. This 
> means that folks who don't have the skillset to merge three branches 
> together (say, upstream plus two different trees adding extra 
> functionality) and run a build can still install a few plugins to 
> enhance their copy of bzr (which was installed by their IT staff, or a 
> shiny click-through idiot-friendly Windows installer, etc).

You don't need plugins for that. Take for example git-svn (perhaps not the
best example, as it is a Perl script; but Python, although it has a compiled
form, is a scripting language at heart), which went, AFAIK, from an external
contribution, to being in contrib/, to being in mainline (and in the git-svn
package).

About plugins modifying some core functionality: this is rather a sign
of not attracting developers to do it in-core...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
  2006-10-18 23:49                                     ` Sean
@ 2006-10-18 23:49                                     ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-18 23:49 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git, bazaar-ng

On Wed, 18 Oct 2006 18:31:32 -0500
Charles Duffy <cduffy@spamcop.net> wrote:

> Granted, I'm speaking with my IT hat on here rather than my developer 
> hat -- but plugins are a pretty clear usability win.

Sure they can be.  But their value I think is overstated, especially
in an open source project where anyone can grab a copy of the source
and update it with a trial feature.  This updated copy can be wrapped
in a nice GUI installer just as easily as any plugin.

Now, I suppose plugins let end users mix and match trial features
slightly easier, but hopefully your base package isn't so devoid of
features that this is honestly necessary.

As Petr pointed out, all this comes to Bzr essentially for free
since it's a part of python.  So be it, but I've yet to hear an
example where plugins were anything more than a minor convenience
rather than a fundamental win over the way Git is developing.

For an example, just look how few lines of git were needed to
implement the essential features of the bzr bundle feature.
With no plugins or monkey business needed ;o)

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 23:18                                                   ` Nicolas Pitre
@ 2006-10-18 23:50                                                     ` Johannes Schindelin
  2006-10-19  0:07                                                     ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-18 23:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Hi,

On Wed, 18 Oct 2006, Nicolas Pitre wrote:

> Pretty trivial indeed.

Easy! You take all the fun out of it!

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:38                                 ` Johannes Schindelin
@ 2006-10-18 23:54                                   ` Petr Baudis
  2006-10-19  0:33                                     ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-18 23:54 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Shawn Pearce, git

  Hi,

Dear diary, on Thu, Oct 19, 2006 at 01:38:45AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Wed, 18 Oct 2006, Shawn Pearce wrote:
> 
> > Today Git doesn't run natively on Windows.
> 
> As I mentioned some time ago, I started a branch on MinGW. It works quite 
> well for the moment, but it lacks fork() emulation, and glob() emulation. 
> And I lack the time to continue working on it.

  care to publish it somewhere, e.g. on repo.or.cz?

  (P.S., have fun in Prague! Too bad I won't be around over the weekend.
:-( )

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 23:18                                                   ` Nicolas Pitre
  2006-10-18 23:50                                                     ` Johannes Schindelin
@ 2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
                                                                         ` (3 more replies)
  1 sibling, 4 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, Shawn Pearce, git



On Wed, 18 Oct 2006, Nicolas Pitre wrote:
> 
> If you use builtin-unpack-objects.c from next, you'll be able to 
> generate the pack index pretty easily as well, as all the needed info is 
> stored in the obj_list array.  Just need to append objects remaining on 
> the delta_list array to the end of the pack, sort the obj_list by sha1 
> and write the index.

Actually, I've hit an impasse.

The index isn't the problem. The problem is actually writing the resultant 
pack-file itself in one go.

The silly thing is, the pack-file contains the number of entries in the 
header. That's a silly problem, because the _natural_ way to turn a thin 
pack into a normal pack would be to just add the missing objects from the 
local store into the resulting pack. But we don't _know_ how many such 
missing objects there are, until we've gone through the whole source pack. 

So you can't easily do a streaming "write the result as you go along" 
version using that approach.
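
To make the problem concrete, here is a small sketch of the 12-byte
header that starts every pack: the "PACK" signature, a version word and
the object count, the last two stored as 32-bit big-endian integers.
The helper names are made up for the example. The point is that the
count sits at byte offset 8, before any object data, so a streaming
writer has to commit to it up front.

```c
#include <string.h>

#define PACK_HEADER_LEN 12

static void put_be32(unsigned char *p, unsigned long v)
{
	p[0] = (v >> 24) & 0xff;
	p[1] = (v >> 16) & 0xff;
	p[2] = (v >> 8) & 0xff;
	p[3] = v & 0xff;
}

static unsigned long get_be32(const unsigned char *p)
{
	return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
	       ((unsigned long)p[2] << 8) | p[3];
}

/*
 * Fill in the 12-byte pack header: "PACK", a version word and the
 * object count, the latter two as 32-bit big-endian integers.
 */
static void write_pack_header(unsigned char *hdr, unsigned long version,
			      unsigned long nr_objects)
{
	memcpy(hdr, "PACK", 4);
	put_be32(hdr + 4, version);
	put_be32(hdr + 8, nr_objects);
}

/* Read the object count back out of a header. */
static unsigned long pack_header_entries(const unsigned char *hdr)
{
	return get_be32(hdr + 8);
}
```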

So there's _another_ way of fixing a thin pack: it's to expand the objects 
without a base into non-delta objects, and keeping the number of objects 
in the pack the same. But _again_, we don't actually know which ones to 
expand until it's too late.

The end result? I can expand them all (I have a patch that does that). Or 
I could leave as deltas the ones I have already seen the base for in the 
pack-file (I don't have that yet, but that should be a SMOP). But I'm not 
very happy with even the latter choice, because it really potentially 
expands things that didn't _need_ expansion, they just got expanded 
because we hadn't seen the base object yet.
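
The "leave as deltas the ones I have already seen the base for" rule
can be sketched as below. This is an illustration only, not the patch
itself: struct entry and mark_expansions() are hypothetical, and the
linear scan stands in for whatever lookup a real version would use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one entry as it arrives in the stream. */
struct entry {
	unsigned char id[20];
	unsigned char base[20];	/* only meaningful if is_delta */
	int is_delta;
};

/*
 * Walk the entries in stream order.  A delta may stay a delta only if
 * its base showed up earlier in the same stream; otherwise it has to
 * be expanded.  Sets expand[i] for every entry and returns how many
 * entries were expanded.
 */
static size_t mark_expansions(const struct entry *e, size_t nr, int *expand)
{
	size_t i, j, count = 0;

	for (i = 0; i < nr; i++) {
		expand[i] = 0;
		if (!e[i].is_delta)
			continue;
		for (j = 0; j < i; j++)
			if (!memcmp(e[j].id, e[i].base, 20))
				break;
		if (j == i) {	/* base not seen yet (or never will be) */
			expand[i] = 1;
			count++;
		}
	}
	return count;
}
```

A delta that arrives one entry ahead of its own base gets expanded by
this rule even though the base is in the pack, which is exactly the
unnecessary expansion complained about above.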

So I'll happily send my patches to anybody who wants to try (I don't write 
the index file yet, but it should be easy to add), but I'm getting the 
feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
because it's very much designed for streaming.

It would probably be better to start from "index-pack.c" instead, which is 
already a multi-pass thing, and wouldn't have had any of the problems I 
hit. 

Gaah.

> Pretty trivial indeed.

So it's conceptually totally trivial to rewrite a pack-file as another 
pack-file, but at least so far, it's turned out to be less trivial in 
practice (or at least in a single pass, without holding everything in 
memory, which I definitely do _not_ want to do).

So I'm leaving this for today, and perhaps coming back to it tomorrow with 
a fresh eye.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
@ 2006-10-19  0:15                                                       ` Linus Torvalds
  2006-10-19  0:31                                                       ` Johannes Schindelin
                                                                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:15 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, Shawn Pearce, git



On Wed, 18 Oct 2006, Linus Torvalds wrote:
> 
> So I'll happily send my patches to anybody who wants to try (I don't write 
> the index file yet, but it should be easy to add), but I'm getting the 
> feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
> because it's very much designed for streaming.

A potentially even simpler way would probably be to literally just use 
"git-pack-objects" directly, and just have a very special mode that allows 
mapping the thin pack as if it was a real pack (ie basically 
pre-populating a fake pack entry, where the fake part comes from adding 
the missing objects by hand to the mapping).

So many ways to do it, so little real motivation ;)

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
@ 2006-10-19  0:31                                                       ` Johannes Schindelin
  2006-10-19  0:46                                                         ` Linus Torvalds
  2006-10-19  3:01                                                       ` Nicolas Pitre
  2006-10-19  3:46                                                       ` Junio C Hamano
  3 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-19  0:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, Shawn Pearce, git

Hi,

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> The silly thing is, the pack-file contains the number of entries in the 
> header.

You do not write this to stdout, right? Why not just come back and correct 
the number of objects? Of course, the SHA1 has to be calculated _after_ 
that.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 23:54                                   ` Petr Baudis
@ 2006-10-19  0:33                                     ` Johannes Schindelin
  0 siblings, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-19  0:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Shawn Pearce, git

Hi,

On Thu, 19 Oct 2006, Petr Baudis wrote:

> Dear diary, on Thu, Oct 19, 2006 at 01:38:45AM CEST, I got a letter
> where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> > On Wed, 18 Oct 2006, Shawn Pearce wrote:
> > 
> > > Today Git doesn't run natively on Windows.
> > 
> > As I mentioned some time ago, I started a branch on MinGW. It works quite 
> > well for the moment, but it lacks fork() emulation, and glob() emulation. 
> > And I lack the time to continue working on it.
> 
>   care to publish it somewhere, e.g. on repo.or.cz?

It is way too dirty for that. I would only dare give it to somebody in return 
for the promise to clean everything up.

BTW I completely forgot that in the absence of poll() from MinGW, all the 
networking code is actually just wrapped into "return -1;" functions.

>   (P.S., have fun in Prague! Too bad I won't be around over the weekend.
> :-( )

Pity. You seem to have good connections...

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:31                                                       ` Johannes Schindelin
@ 2006-10-19  0:46                                                         ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19  0:46 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Nicolas Pitre, Junio C Hamano, Shawn Pearce, Git Mailing List



On Thu, 19 Oct 2006, Johannes Schindelin wrote:
> 
> You do not write this to stdout, right? Why not just come back and correct 
> the number of objects? Of course, the SHA1 has to be calculated _after_ 
> that.

That's the issue. I wanted the pack-file thing to look as similar to the 
old code as possible. And that means using the "sha1write()" interfaces, 
which calculate the SHA1 checksum _as_ we write.
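
A toy model of why the two wishes conflict, under loud assumptions:
struct csum_buf, csum_init(), csum_write() and csum_recompute() are
invented for this sketch and are not the csum-file.c interface, and
FNV-1a is swapped in for SHA-1 purely to keep the example
self-contained. The checksum is folded in as each byte is written, so
patching the count afterwards leaves the running checksum stale and a
second pass over the data is needed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FNV_BASIS 2166136261u
#define FNV_PRIME 16777619u

/* Fold 'n' bytes into a running FNV-1a hash (stand-in for SHA-1). */
static uint32_t fnv1a(uint32_t hash, const unsigned char *p, size_t n)
{
	while (n--)
		hash = (hash ^ *p++) * FNV_PRIME;
	return hash;
}

/* Hypothetical checksumming writer that appends to a memory buffer. */
struct csum_buf {
	unsigned char *out;
	size_t len;
	uint32_t hash;
};

static void csum_init(struct csum_buf *f, unsigned char *out)
{
	f->out = out;
	f->len = 0;
	f->hash = FNV_BASIS;
}

/* Copy the bytes out and update the checksum in the same step. */
static void csum_write(struct csum_buf *f, const void *buf, size_t count)
{
	memcpy(f->out + f->len, buf, count);
	f->len += count;
	f->hash = fnv1a(f->hash, buf, count);
}

/* Second pass: checksum an already-written buffer from scratch. */
static uint32_t csum_recompute(const unsigned char *data, size_t len)
{
	return fnv1a(FNV_BASIS, data, len);
}
```

With this model, writing a header with a placeholder count, streaming
the objects and then fixing the count works only if csum_recompute() is
run over the whole buffer afterwards. The value accumulated in the hash
field during the first pass no longer matches the data.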

So yes, I wanted to do it all in one phase.

Anyway, if anybody is interested, here's a series of four patches that do 
something that _almost_ works. I save away the SHA1's and the offsets so 
that I could write an index too, but I didn't actually do that part.

But with this, I can rewrite a pack-file "in flight", and the end result 
can then have "git index-pack" run on it, and used as a pack. It's just 
that there are no deltas left because of some of the silly problems I 
outlined (the code to write out deltas is actually there, just 
commented out - it works, but it leaves the end result with unsatisfied 
deltas again).

		Linus
---
commit 4efd9b0f44635b3075c9aad6d1cc8830e3abded3
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:22:04 2006 -0700

    Fix up csum-file interfaces
    
    Add "const" where appropriate
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/csum-file.c b/csum-file.c
index b7174c6..3237228 100644
--- a/csum-file.c
+++ b/csum-file.c
@@ -47,7 +47,7 @@ int sha1close(struct sha1file *f, unsign
 	return 0;
 }
 
-int sha1write(struct sha1file *f, void *buf, unsigned int count)
+int sha1write(struct sha1file *f, const void *buf, unsigned int count)
 {
 	while (count) {
 		unsigned offset = f->offset;
@@ -115,7 +115,7 @@ struct sha1file *sha1fd(int fd, const ch
 	return f;
 }
 
-int sha1write_compressed(struct sha1file *f, void *in, unsigned int size)
+int sha1write_compressed(struct sha1file *f, const void *in, unsigned int size)
 {
 	z_stream stream;
 	unsigned long maxsize;
@@ -127,7 +127,7 @@ int sha1write_compressed(struct sha1file
 	out = xmalloc(maxsize);
 
 	/* Compress it */
-	stream.next_in = in;
+	stream.next_in = (void *) in;
 	stream.avail_in = size;
 
 	stream.next_out = out;
diff --git a/csum-file.h b/csum-file.h
index 3ad1a99..fee8589 100644
--- a/csum-file.h
+++ b/csum-file.h
@@ -13,7 +13,7 @@ struct sha1file {
 extern struct sha1file *sha1fd(int fd, const char *name);
 extern struct sha1file *sha1create(const char *fmt, ...) __attribute__((format (printf, 1, 2)));
 extern int sha1close(struct sha1file *, unsigned char *, int);
-extern int sha1write(struct sha1file *, void *, unsigned int);
-extern int sha1write_compressed(struct sha1file *, void *, unsigned int);
+extern int sha1write(struct sha1file *, const void *, unsigned int);
+extern int sha1write_compressed(struct sha1file *, const void *, unsigned int);
 
 #endif
\f
commit c2c8480b05a75d93f78a0ddd1cce18c6864738eb
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:20:53 2006 -0700

    Make some of the pack-writing helper functions available
    
    string_to_type() and encode_header() are useful in general.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 96c069a..ea39bf3 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -220,6 +220,20 @@ static void *delta_against(void *buf, un
 	return delta_buf;
 }
 
+enum object_type string_to_type(const char *type, const unsigned char *sha1)
+{
+	if (!strcmp(type, commit_type))
+		return OBJ_COMMIT;
+	if (!strcmp(type, tree_type))
+		return OBJ_TREE;
+	if (!strcmp(type, blob_type))
+		return OBJ_BLOB;
+	if (!strcmp(type, tag_type))
+		return OBJ_TAG;
+	die("strange object %s of unknown type %s",
+		    sha1_to_hex(sha1), type);
+}
+
 /*
  * The per-object header is a pretty dense thing, which is
  *  - first byte: low four bits are "size", then three bits of "type",
@@ -227,7 +241,7 @@ static void *delta_against(void *buf, un
  *  - each byte afterwards: low seven bits are size continuation,
  *    with the high bit being "size continues"
  */
-static int encode_header(enum object_type type, unsigned long size, unsigned char *hdr)
+int encode_header(enum object_type type, unsigned long size, unsigned char *hdr)
 {
 	int n = 1;
 	unsigned char c;
@@ -943,17 +957,7 @@ static void check_object(struct object_e
 		die("unable to get type of object %s",
 		    sha1_to_hex(entry->sha1));
 
-	if (!strcmp(type, commit_type)) {
-		entry->type = OBJ_COMMIT;
-	} else if (!strcmp(type, tree_type)) {
-		entry->type = OBJ_TREE;
-	} else if (!strcmp(type, blob_type)) {
-		entry->type = OBJ_BLOB;
-	} else if (!strcmp(type, tag_type)) {
-		entry->type = OBJ_TAG;
-	} else
-		die("unable to pack object %s of type %s",
-		    sha1_to_hex(entry->sha1), type);
+	entry->type = string_to_type(type, entry->sha1);
 }
 
 static unsigned int check_delta_limit(struct object_entry *me, unsigned int n)
diff --git a/pack.h b/pack.h
index eb07b03..346a430 100644
--- a/pack.h
+++ b/pack.h
@@ -15,6 +15,9 @@ struct pack_header {
 	unsigned int hdr_entries;
 };
 
+enum object_type string_to_type(const char *type, const unsigned char *sha1);
+int encode_header(enum object_type type, unsigned long size, unsigned char *hdr);
+
 extern int verify_pack(struct packed_git *, int);
 extern int check_reuse_pack_delta(struct packed_git *, unsigned long,
 				  unsigned char *, unsigned long *,
\f
commit 94d620067b4a4179656c0ce347cb87be52a9d67f
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 15:44:40 2006 -0700

    git-unpack-objects: pass in the original delta data when writing the object
    
    This does nothing right now, but if we want to instead of loose objects
    write a new "verified packfile" with an index, this lets us do that instead.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index 4f96bca..bbb6e21 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -109,7 +109,8 @@ static void add_delta_to_list(unsigned c
 
 static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
 
-static void write_object(void *buf, unsigned long size, const char *type)
+static void write_object(void *buf, unsigned long size, const char *type,
+	unsigned char *base, void *delta, unsigned long delta_size)
 {
 	unsigned char sha1[20];
 	if (write_sha1_file(buf, size, type, sha1) < 0)
@@ -117,7 +118,7 @@ static void write_object(void *buf, unsi
 	added_object(sha1, type, buf, size);
 }
 
-static void resolve_delta(const char *type,
+static void resolve_delta(const char *type, unsigned char *base_sha1,
 			  void *base, unsigned long base_size,
 			  void *delta, unsigned long delta_size)
 {
@@ -129,8 +130,8 @@ static void resolve_delta(const char *ty
 			     &result_size);
 	if (!result)
 		die("failed to apply delta");
+	write_object(result, result_size, type, base_sha1, delta, delta_size);
 	free(delta);
-	write_object(result, result_size, type);
 	free(result);
 }
 
@@ -143,7 +144,7 @@ static void added_object(unsigned char *
 		if (!hashcmp(info->base_sha1, sha1)) {
 			*p = info->next;
 			p = &delta_list;
-			resolve_delta(type, data, size, info->delta, info->size);
+			resolve_delta(type, sha1, data, size, info->delta, info->size);
 			free(info);
 			continue;
 		}
@@ -164,7 +165,7 @@ static void unpack_non_delta_entry(enum 
 	default: die("bad type %d", kind);
 	}
 	if (!dry_run && buf)
-		write_object(buf, size, type);
+		write_object(buf, size, type, NULL, NULL, 0);
 	free(buf);
 }
 
@@ -197,7 +198,7 @@ static void unpack_delta_entry(unsigned 
 		has_errors = 1;
 		return;
 	}
-	resolve_delta(type, base, base_size, delta_data, delta_size);
+	resolve_delta(type, base_sha1, base, base_size, delta_data, delta_size);
 	free(base);
 }
 
diff --git a/date.c b/date.c
index 1825922..0b06994 100644
--- a/date.c
+++ b/date.c
@@ -657,6 +657,7 @@ static const struct typelen {
 	{ "hours", 60*60 },
 	{ "days", 24*60*60 },
 	{ "weeks", 7*24*60*60 },
+	{ "fortnights", 2*7*24*60*60 },
 	{ NULL }
 };	
 
\f
commit 636210e7fcceb7297ccf0fc54291bb1c8356f0d3
Author: Linus Torvalds <torvalds@osdl.org>
Date:   Wed Oct 18 17:23:06 2006 -0700

    Make "unpack-objects" able to write a single pack-file instead
    
    This is idiotic. It writes everything undeltified, which is
    horrid. I need a brain.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index bbb6e21..f139308 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -7,11 +7,12 @@ #include "blob.h"
 #include "commit.h"
 #include "tag.h"
 #include "tree.h"
+#include "csum-file.h"
 
 #include <sys/time.h>
 
 static int dry_run, quiet, recover, has_errors;
-static const char unpack_usage[] = "git-unpack-objects [-n] [-q] [-r] < pack-file";
+static const char unpack_usage[] = "git-unpack-objects [-n] [-q] [-r] [--repack=pack-name] < pack-file";
 
 /* We always read in 4kB chunks. */
 static unsigned char buffer[4096];
@@ -87,6 +88,56 @@ static void *get_data(unsigned long size
 	return buf;
 }
 
+static struct sha1file *pack_file;
+static unsigned long pack_file_offset;
+
+struct index_entry {
+	unsigned long offset;
+	unsigned char sha1[20];
+};
+
+static unsigned int index_nr, index_alloc;
+static struct index_entry **index_array;
+
+static void add_pack_index(unsigned char *sha1)
+{
+	struct index_entry *entry;
+	int nr = index_nr;
+	if (nr >= index_alloc) {
+		index_alloc = (index_alloc + 64) * 3 / 2;
+		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
+	}
+	entry = xmalloc(sizeof(*entry));
+	entry->offset = pack_file_offset;
+	hashcpy(entry->sha1, sha1);
+	index_array[index_nr++] = entry;
+}
+
+static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
+{
+	unsigned char header[10];
+	unsigned hdrlen, datalen;
+
+	hdrlen = encode_header(OBJ_DELTA, delta_size, header);
+	sha1write(pack_file, header, hdrlen);
+	sha1write(pack_file, base, 20);
+	datalen = sha1write_compressed(pack_file, delta, delta_size);
+
+	pack_file_offset += hdrlen + 20 + datalen;
+}
+
+static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
+{
+	unsigned char header[10];
+	unsigned hdrlen, datalen;
+
+	hdrlen = encode_header(string_to_type(type, sha1), size, header);
+	sha1write(pack_file, header, hdrlen);
+	datalen = sha1write_compressed(pack_file, buf, size);
+
+	pack_file_offset += hdrlen + datalen;
+}
+
 struct delta_info {
 	unsigned char base_sha1[20];
 	unsigned long size;
@@ -113,7 +164,16 @@ static void write_object(void *buf, unsi
 	unsigned char *base, void *delta, unsigned long delta_size)
 {
 	unsigned char sha1[20];
-	if (write_sha1_file(buf, size, type, sha1) < 0)
+
+	if (pack_file) {
+		if (hash_sha1_file(buf, size, type, sha1) < 0)
+			die("failed to compute object hash");
+		add_pack_index(sha1);
+		if (0 && base)
+			write_pack_delta(base, delta, delta_size);
+		else
+			write_pack_object(type, sha1, buf, size);
+	} else if (write_sha1_file(buf, size, type, sha1) < 0)
 		die("failed to write object");
 	added_object(sha1, type, buf, size);
 }
@@ -254,7 +314,7 @@ static void unpack_one(unsigned nr, unsi
 	}
 }
 
-static void unpack_all(void)
+static void unpack_all(const char *repack)
 {
 	int i;
 	struct pack_header *hdr = fill(sizeof(struct pack_header));
@@ -266,17 +326,32 @@ static void unpack_all(void)
 		die("unknown pack file version %d", ntohl(hdr->hdr_version));
 	fprintf(stderr, "Unpacking %d objects\n", nr_objects);
 
+	if (repack) {
+		struct pack_header newhdr;
+		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
+		newhdr.hdr_version = htonl(PACK_VERSION);
+		newhdr.hdr_entries = htonl(nr_objects);
+		
+		pack_file = sha1create("%s.pack", repack);
+		sha1write(pack_file, &newhdr, sizeof(newhdr));
+		pack_file_offset = sizeof(newhdr);
+	}
+		
+
 	use(sizeof(struct pack_header));
 	for (i = 0; i < nr_objects; i++)
 		unpack_one(i+1, nr_objects);
 	if (delta_list)
 		die("unresolved deltas left after unpacking");
+	if (repack)
+		sha1close(pack_file, NULL, 1);
 }
 
 int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
 {
 	int i;
 	unsigned char sha1[20];
+	const char *repack = NULL;
 
 	git_config(git_default_config);
 
@@ -298,6 +373,10 @@ int cmd_unpack_objects(int argc, const c
 				recover = 1;
 				continue;
 			}
+			if (!strncmp(arg, "--repack=", 9)) {
+				repack = arg + 9;
+				continue;
+			}
 			usage(unpack_usage);
 		}
 
@@ -305,7 +384,7 @@ int cmd_unpack_objects(int argc, const c
 		usage(unpack_usage);
 	}
 	SHA1_Init(&ctx);
-	unpack_all();
+	unpack_all(repack);
 	SHA1_Update(&ctx, buffer, offset);
 	SHA1_Final(sha1, &ctx);
 	if (hashcmp(fill(20), sha1))


* Re: VCS comparison table
  2006-10-18 23:48                                   ` Johannes Schindelin
@ 2006-10-19  1:58                                     ` Charles Duffy
  2006-10-19 11:01                                       ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Charles Duffy @ 2006-10-19  1:58 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Johannes Schindelin wrote:
> So, the wonderful upside of plugins you described here are actually the 
> reason I will never, _never_ use bzr with plugins.
> 

I presume that for this reason you will also never, _never_ use a 
non-mainline branch of git -- even if its actual code only touches UI 
enhancements or something similarly non-core -- because third-party 
branches have the ability, in theory, to make changes to the core of the 
revision control system. And that you will never, _never_ use 
third-party wrappers because they might play LD_PRELOAD tricks. Or run 
any software with root privileges you haven't personally written. Or...

Sean's point that plugins are a comparatively minor win made inexpensive 
on account of bzr's use of Python is reasonable (though we may choose to 
differ on what level of value we attach to the utility). The claim that 
an extensibility mechanism should be rejected wholesale on account of 
being excessively powerful, on the other hand, is just silly.



(If you couldn't write a plugin that *didn't* touch the core, this would 
be a different story. This is, however, very much not the case).


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
  2006-10-19  0:15                                                       ` Linus Torvalds
  2006-10-19  0:31                                                       ` Johannes Schindelin
@ 2006-10-19  3:01                                                       ` Nicolas Pitre
  2006-10-19  3:46                                                       ` Junio C Hamano
  3 siblings, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-19  3:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Shawn Pearce, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 18 Oct 2006, Nicolas Pitre wrote:
> > 
> > If you use builtin-unpack-objects.c from next, you'll be able to 
> > generate the pack index pretty easily as well, as all the needed info is 
> > stored in the obj_list array.  Just need to append objects remaining on 
> > the delta_list array to the end of the pack, sort the obj_list by sha1 
> > and write the index.
> 
> Actually, I've hit an impasse.
> 
> The index isn't the problem. The problem is actually writing the resultant 
> pack-file itself in one go.
> 
> The silly thing is, the pack-file contains the number of entries in the 
> header. That's a silly problem, because the _natural_ way to turn a thin 
> pack into a normal pack would be to just add the missing objects from the 
> local store into the resulting pack. But we don't _know_ how many such 
> missing objects there are, until we've gone through the whole source pack. 
> 
> So you can't easily do a streaming "write the result as you go along" 
> version using that approach.

Hmmm.... unpack-objects receives a (possibly thin) pack over its stdin.  
That part has to be streamed.  But its output is currently always 
written to multiple files as separate objects.  So, while the input 
comes from a stream, the output doesn't have to.

In that case, why not just write the input directly to a temporary file, 
append the missing objects, seek back to adjust the object number, and 
finally run a SHA1_Update() on the whole thing?  This forces you to 
write everything and then read everything back, but this should not be 
too bad, especially since the written data is likely to still be cached.  
Once its final SHA1 is written, it just needs to be moved into place 
under the appropriate name.
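
A minimal sketch of the "seek back to adjust the object number" step,
assuming the pack has already been written to a seekable temporary file.
The helper names patch_pack_count() and read_pack_count() are
hypothetical and not part of git; the only thing relied on is the pack
header layout (4-byte signature, 4-byte version, 4-byte object count,
all in network byte order). The trailing SHA1 of the pack would still
have to be recomputed afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* The object count follows the 4-byte signature and 4-byte version. */
#define PACK_COUNT_OFFSET 8L

/*
 * Hypothetical helper: overwrite the object count of a pack that has
 * already been written to a seekable file.  Returns 0 on success and
 * -1 on I/O error.  The stream is left positioned at end of file.
 */
int patch_pack_count(FILE *pack, uint32_t nr_objects)
{
	unsigned char be[4];

	assert(pack != NULL);

	/* Encode the count in network (big-endian) byte order. */
	be[0] = (unsigned char)(nr_objects >> 24);
	be[1] = (unsigned char)(nr_objects >> 16);
	be[2] = (unsigned char)(nr_objects >> 8);
	be[3] = (unsigned char)nr_objects;

	if (fseek(pack, PACK_COUNT_OFFSET, SEEK_SET) != 0)
		return -1;
	if (fwrite(be, 1, sizeof(be), pack) != sizeof(be))
		return -1;
	/* Go back to the end so that later appends are not misplaced. */
	if (fseek(pack, 0L, SEEK_END) != 0)
		return -1;
	return fflush(pack) == 0 ? 0 : -1;
}

/* Hypothetical helper: read the object count back, e.g. for checking. */
int read_pack_count(FILE *pack, uint32_t *nr_objects)
{
	unsigned char be[4];

	assert(pack != NULL && nr_objects != NULL);

	if (fseek(pack, PACK_COUNT_OFFSET, SEEK_SET) != 0)
		return -1;
	if (fread(be, 1, sizeof(be), pack) != sizeof(be))
		return -1;
	*nr_objects = ((uint32_t)be[0] << 24) | ((uint32_t)be[1] << 16) |
		      ((uint32_t)be[2] << 8) | (uint32_t)be[3];
	return 0;
}
```

The count can only be patched once the whole source pack has been seen,
which is why this needs a real file rather than a pipe.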

> So there's _another_ way of fixing a thin pack: it's to expand the objects 
> without a base into non-delta objects, and keeping the number of objects 
> in the pack the same. But _again_, we don't actually know which ones to 
> expand until it's too late.
> 
> The end result? I can expand them all (I have a patch that does that). Or 
> I could leave as deltas the ones I have already seen the base for in the 
> pack-file (I don't have that yet, but that should be a SMOP). But I'm not 
> very happy with even the latter choice, because it really potentially 
> expands things that didn't _need_ expansion, they just got expanded 
> because we hadn't seen the base object yet.

Most base objects, well all of them nowadays, are written before their 
deltas.  So in practice the only objects that will get expanded are the 
deltas with missing base.   Still it is unfortunate.

> So I'll happily send my patches to anybody who wants to try (I don't write 
> the index file yet, but it should be easy to add), but I'm getting the 
> feeling that "builtin-unpack-objects.c" is the wrong tool to use for this, 
> because it's very much designed for streaming.
> 
> It would probably be better to start from "index-pack.c" instead, which is 
> already a multi-pass thing, and wouldn't have had any of the problems I 
> hit. 

But index-pack is totally incompatible with any streaming.  It mmap()s 
the whole pack and happily performs random accesses.  So you'd need to 
write the entire thin pack to disk anyway before it could work on it.  
This is not really better than the unpack-objects option.  At least 
unpack-objects is structured to perform work on the fly as data is 
received.

> Gaah.
> 
> > Pretty trivial indeed.
> 
> So it's conceptually totally trivial to rewrite a pack-file as another 
> pack-file, but at least so far, it's turned out to be less trivial in 
> practice (or at least in a single pass, without holding everything in 
> memory, which I definitely do _not_ want to do).
> 
> So I'm leaving this for today, and perhaps coming back to it tomorrow with 
> a fresh eye.

I'll have a look at your patches tomorrow as well.  I have many ideas 
brewing, including rendering index-pack obsolete since actually 
unpack-objects could do it all already (both tools have many concepts in 
common).


Nicolas


* Re: VCS comparison table
  2006-10-18  3:35                             ` Linus Torvalds
@ 2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
                                                   ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-19  3:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:

> For example, what happens is that:
>  - you like the simple revision numbers
>  - that in turn means that you can never allow a mainline-merge to be done 
>    by anybody else than the main maintainer

That's not true of bzr development.  The "main maintainer" that runs the
bzr.dev is an email bot.  It's not an integrator-- its work is purely
mechanical.  It can't resolve merge conflicts.

Most of the merge work is done in integration branches run by the core
developers.  Although Martin is our project leader, lays out ground
rules, and makes design decisions, he doesn't have to be involved in any
particular merge.

> The "main trunk matters" mentality (which has deep roots in CVS - don't 
> get me wrong, I don't think you're the first one to do this) is 
> fundamentally antithetical to truly distributed system, because it 
> basically assumes that some maintainer is "more important" than others. 

Linus, if you got hit by a bus, it would still be a shock, and it would
still take time for the Linux world to recover.  Your insights and
talent, both technical and social, make you the most important kernel
developer.  And it stays that way because you deserve it.  Projects with
good leadership don't fork, or if they do, the fork withers and dies
pretty quickly.

It is fine to say all branches are equal from a technical perspective.
- From a social perspective, it's just not true.

The scale of Bazaar development is much smaller than the scale of kernel
development, so it doesn't make sense to maintain long-term divergent
branches like the mm tree.  We do occasionally have long-lived feature
branches, though.

> That special maintainer is the maintainer whose merge-trunk is followed, 
> and whose revision numbers don't change when they are merged back.

In bzr development, it's very rare for anyone's revision numbers to change.

> That may even be _true_ in many cases. But please do realize that it's a 
> real issue, and that it has real impact - it does two things:
> 
>  - it impacts the technology and workflow directly itself: "pull" and 
>    "merge" are different: a central maintainer would tend to do a "merge", 
>    and one more in the outskirts would tend to do more of a "pull", 
>    expecting his work to then be merged back to the "trunk" at some later 
>    point)

AFAIK, everyone who maintains long-lived branches in bzr uses "merge".

>  - it will result in _psychological_ damage, in the sense that there's 
>    always one group that is the "trunk" group, and while you can pass the 
>    baton around (like the perl people do), it's always clear who sits 
>    centrally.

As I mentioned earlier, there are four people who each run their own
integration branches and make decisions about what gets merged.  No baton.

> 
> Maybe this is fine. It's certainly how most projects tend to work. 
> 
> I'll just point out that one of my design goals for git was to make every 
> single repository 100% equal. That means that there MUST NOT be a "trunk", 
> or a special line of development. There is no "vendor branch".

I think you're implying that on a technical level, bzr doesn't support
this.  But it does.  Every published repository has unique identifiers
for every revision on its mainline, and it's exceedingly uncommon for
these to change.  There are special procedures to maintain bzr.dev, but
there's nothing technically unique about it.  People develop against
bzr.dev rather than my integration branch, because they have
non-technical reasons for wanting their changes to be merged into
bzr.dev, not my integration branch.

> It's 
> something that a lot of people on the git lists understand now, but it 
> took a while for it to sink in - people used to believe that the "first 
> parent" of a merge was somehow special, and I had to point out several 
> times on the git list that no, that's not how it works - because the merge 
> might have been done by somebody _else_ than the person who you think of 
> as being "on the trunk".

On an actively-developed bzr branch, the first parent *is* special:
- - it's a revision that you committed
- - the diff between a revision and its first parent is the same as the
  diff that would be produced just before it was committed.

> So when I say that your "simple" revision numbers are totally broken and 
> horrible, I say that not because I think a number like "1.45.3.17" is 
> ugly, but because I think that the deeper _implications_ of using a number 
> like that is ugly. It implies one of two things:
> 
>  - the numbers change all the time as things get merged both ways
> 
> OR
> 
>  - people try to maintain a "trunk" mentality

I don't think your analysis holds together completely, because all
actively-maintained branches have very stable revnos that anyone can
refer to.

> In git, the fact that everybody is on an equal footing is something that I 
> think is really good. For example, when I was away for effectively three 
> weeks during August, all the git-level merging for the kernel was done by 
> Greg KH.
> 
> And realize that he didn't use "my tree". No baton was passed. I emailed 
> with him (and some others) before-hand, so that everybody knew that I 
> expected to be just pull from Greg when I came back, but it was _his_ tree 
> that he merged in, and he just worked the same way I did.
>
> And when I did come back, I did a "pull" from his tree.

That sounds to me like a baton was passed.  You asked Greg to behave
like you, and told everyone else to expect that, too.  Passing the baton
was a social, not technical event, but it did happen.  And there would
certainly be no difficulty doing exactly that (right down to running
"pull") in Bazaar land.

In fact, we are currently rotating release managers.  The 0.10 and 0.11
releases were done by Robert, and the upcoming 0.12 is being managed by
John.  Neither of them is the project leader.  They threaten that they
want me to manage a release, too.  We shall see...

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFNuyT0F+nu1YWqI0RAjxSAJ9YulgRMmIuy9RS1xrrYnKl9x2arQCaAr5/
u56sojZb6jhKl3fMQ/ZxLf4=
=EYC+
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  0:07                                                     ` Linus Torvalds
                                                                         ` (2 preceding siblings ...)
  2006-10-19  3:01                                                       ` Nicolas Pitre
@ 2006-10-19  3:46                                                       ` Junio C Hamano
  2006-10-19 14:27                                                         ` Nicolas Pitre
  2006-10-19 14:55                                                         ` Linus Torvalds
  3 siblings, 2 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-19  3:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> Actually, I've hit an impasse.
>
> So there's _another_ way of fixing a thin pack: it's to expand the objects 
> without a base into non-delta objects, and keeping the number of objects 
> in the pack the same. But _again_, we don't actually know which ones to 
> expand until it's too late.

pack-objects.c::write_one() makes sure that we write out base
immediately after delta if we haven't written out its base yet,
so I suspect if you buffer one delta you should be Ok, no?


* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
@ 2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
                                                     ` (2 more replies)
  2006-10-19  5:33                                 ` Jan Hudec
                                                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-19  5:21 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

On Wed, 18 Oct 2006 23:10:11 -0400, Aaron Bentley wrote:
> It is fine to say all branches are equal from a technical perspective.
> - From a social perspective, it's just not true.

That's actually a very important insight, but supporting the wrong
conclusion.

In a healthy situation, the only thing that makes a branch special are
social issues, such as you describe. That's how it should be.

But think about your favorite example of an unhealthy social situation
around a software project and a big, nasty fork. Every example I can
think of involves some technical distinction that makes one branch
more special than another.

Now, those situations also involve social problems, and those are even
more significant. But the technical blessing of one branch does not
help. And I think it contributes to the social problems in many cases.

So, I think the technical thing that is distributed version control is
an extremely important thing for us to use to help maintain healthy
social software projects. Reducing the technical hurdle of a fork, (to
where continual forking is actually a totally expected part of the
process), is a very healthy thing.

Now, both bzr and git are distributed systems, and either one will
help a great deal in the respects I'm talking about compared to
something like cvs.

As far as the revision numbers, my impression is that the numbers
would be confusing or worthless if I were to use bzr the way I'm
currently using git, as they certainly could not remain stable.

> In bzr development, it's very rare for anyone's revision numbers to change.

Which just says to me that the bzr developers really are sticking to a
centralized model. That's fine, but it does have impacts, and the tool
really does seem to have some bias toward this.

> I think you're implying that on a technical level, bzr doesn't support
> this.  But it does.  Every published repository has unique identifiers
> for every revision on its mainline, and it's exceedingly uncommon for
> these to change.

Every argument you make for the number change being uncommon just
strengthens the argument that it will be all that more
confusing/frustrating when the numbers do change.

In cairo, for example, we've made a habit of including a revision
identifier in our bug tracking system for every commit that resolves a
bug. I like having the assurance that those numbers will survive
forever. And it doesn't matter if the repository moves, or the project
is forked, or anything else. Those numbers cannot change.

I understand that bzr also has unique identifiers, but it sounds like
the tools try to hide them, and people aren't in the habit of using
them for things like this. Do bzr developers put revision numbers in
their bug trackers? Is there a guarantee they will always be valid?

-Carl


* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
@ 2006-10-19  5:33                                 ` Jan Hudec
  2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-20 13:22                                 ` Horst H. von Brand
  3 siblings, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-19  5:33 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

On Wed, Oct 18, 2006 at 11:10:11PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Linus Torvalds wrote:
> 
> > For example, what happens is that:
> >  - you like the simple revision numbers
> >  - that in turn means that you can never allow a mainline-merge to be done 
> >    by anybody else than the main maintainer
> 
> That's not true of bzr development.  The "main maintainer" that runs the
> bzr.dev is an email bot.  It's not an integrator-- its work is purely
> mechanical.  It can't resolve merge conflicts.

The point here is that, because the bot is used, the revnos on bzr.dev
are indeed stable (and many of the merges are in fact pointless merges,
i.e. merges of a revision and its ancestor). But if you don't use the
bot, then doing:

bzr merge mainline
bzr push mainline

makes your revision the leftmost parent, not the one
from "mainline". The fact that bzr treats the leftmost parent somewhat
specially makes people replace the above with

bzr branch mainline
cd mainline
bzr merge feature-branch
bzr push

which is, well, more complicated (but you see it's not about main
maintainer -- anybody with write access can push).
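
The difference between the two command sequences can be sketched as
follows. This is only an illustration with a hypothetical struct rev and
make_merge() helper, not bzr's actual data model: whichever revision the
merge is committed on top of ends up as the leftmost parent.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal revision record; parents[0] is the leftmost parent. */
struct rev {
	const char *id;
	const struct rev *parents[2];
};

/*
 * Commit a merge of 'other' into the branch whose current tip is 'tip'.
 * The tip of the branch being committed to becomes the leftmost parent.
 */
struct rev make_merge(const char *id, const struct rev *tip,
		      const struct rev *other)
{
	struct rev merge;

	assert(tip != NULL && other != NULL);
	merge.id = id;
	merge.parents[0] = tip;		/* where the merge was committed */
	merge.parents[1] = other;	/* what was merged in */
	return merge;
}
```

With make_merge(id, feature, mainline), as in the first sequence, the
feature revision is leftmost; with make_merge(id, mainline, feature), as
in the second, the mainline revision is.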

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 22:14                         ` Jakub Narebski
@ 2006-10-19  5:45                           ` Jan Hudec
  0 siblings, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-19  5:45 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Thu, Oct 19, 2006 at 12:14:02AM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> > Comments?
> 
> What about fetching from repository? For revnos you have to assign revno for
> all commit you have downloaded; now you need only to unpack received pack
> (or not, if you used --keep option). More work.

I don't know git internals, so I can't tell for git. For bzr:
1) You have to add the data to the knits, since the knits are one for
   each versioned file plus one for inventory and one for revision
   metadata, so this is just a small addition to that work. In fact the
   revnos in repository-wide case would be just the indices into the
   revisions knit (while in the branch-wide there would have to be a
   special list).
2) Bzr already generates a special list, revision-history, where it
   stores a list of mainline branches (in fact it used to store a list
   of local commits, but now lists the path over leftmost parents).
   So it already does the work.
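
To illustrate point 2, here is a rough sketch of how revnos fall out of
the path over leftmost parents. struct mainline_rev, mainline_revno()
and mainline_lookup() are made-up names for illustration, and numbering
the first revision as 1 is an assumption of the sketch; bzr keeps the
list in revision-history rather than walking the parents every time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical revision record that keeps only the leftmost parent. */
struct mainline_rev {
	const char *id;
	const struct mainline_rev *leftmost;	/* NULL for the first revision */
};

/* The revno of the tip is the length of its leftmost-parent chain. */
unsigned mainline_revno(const struct mainline_rev *tip)
{
	unsigned n = 0;

	for (; tip; tip = tip->leftmost)
		n++;
	return n;
}

/* Find the revision with the given revno, or NULL if it is out of range. */
const struct mainline_rev *mainline_lookup(const struct mainline_rev *tip,
					   unsigned revno)
{
	unsigned len = mainline_revno(tip);

	if (revno == 0 || revno > len)
		return NULL;
	while (len > revno) {
		assert(tip != NULL);
		tip = tip->leftmost;
		len--;
	}
	return tip;
}
```

Revisions that came in through a merge are not on this chain, so in this
sketch they get no number of their own.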

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>


* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
@ 2006-10-19  5:56                                   ` Martin Pool
  2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 15:25                                   ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Martin Pool @ 2006-10-19  5:56 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On 18 Oct 2006, Carl Worth <cworth@cworth.org> wrote:

> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

There is a mix of 

 - Just giving the overall tarball version number, which is most 
   meaningful to users (and not related to bzr versions)

 - Giving a mainline revision number, which will never revert because we
   never pull (fast-forward) that branch.  That has the substantial
   (imo) benefit that you can immediately compare these numbers by eye,
   and they are easy to quote.

 - Giving a unique id, which is obviously most definitive and
   appropriate if you're talking about something which is not 
   on the mainline or a well known branch.  The launchpad.net 
   bug tracker links branches to bugs and does this through 
   revision ids.

-- 
Martin


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
  2006-10-18 18:59                               ` Petr Baudis
       [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
@ 2006-10-19  6:46                               ` Alexander Belchenko
       [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Alexander Belchenko @ 2006-10-19  6:46 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Petr Baudis wrote:
...
> An example bundle is available at
> 
> 	http://pasky.or.cz/~pasky/cp/example-bundle.txt

You probably miss the main idea of bzr bundles. It's not just a way to
send part of a repository via e-mail or another appropriate transport.
It was primarily designed to be human-readable, like a usual diff (i.e.
a patch). It was designed to solve two things simultaneously:

- be informative for a human, like a usual patch
- be consistent for a machine.

--
Alexander


* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:33                                 ` Jan Hudec
@ 2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-19  8:49                                   ` Christian MICHON
  2006-10-19 11:37                                   ` Petr Baudis
  2006-10-20 13:22                                 ` Horst H. von Brand
  3 siblings, 2 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-19  7:02 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

> > In git, the fact that everybody is on an equal footing is something that I
> > think is really good. For example, when I was away for effectively three
> > weeks during August, all the git-level merging for the kernel was done by
> > Greg KH.
> >
> > And realize that he didn't use "my tree". No baton was passed. I emailed
> > with him (and some others) before-hand, so that everybody knew that I
> > expected to be just pull from Greg when I came back, but it was _his_ tree
> > that he merged in, and he just worked the same way I did.
> >
> > And when I did come back, I did a "pull" from his tree.
>
> That sounds to me like a baton was passed.  You asked Greg to behave
> like you, and told everyone else to expect that, too.  Passing the baton
> was a social, not technical event, but it did happen.  And there would
> certainly be no difficulty doing exactly that (right down to running
> "pull") in Bazaar land.


I'd like to point out that the same thing has happened in bzr-land.
Back in the "pre-bot" days, only Martin put things in "his branch",
where most people got bzr from (same as Linus' git branch), but he was
away for a few weeks, and during this time there were 3 (or perhaps 4)
other branches, called integration branches, that were being used.
They were all maintained by different people.

Everyone learned really quickly to use them instead of Martin's
branch. When Martin came back, he just pulled/merged these branches
and everything was back to normal.

I'd say in this case, bzr was even more "without a trunk" than in the
example Linus gives above.

What seems to be one interesting thing in this discussion is that,
because people use bzr and git in slightly different ways, they think
that one or the other cannot be used in another way.

bzr's use of revision numbers doesn't mean it hasn't got unique
revision identifiers, and I can't see any reason why it couldn't be
used in the same way as git.  Both are excellent tools, and since git
is more specialized (built to support the exact workflow used in
kernel development), it's more suited for that exact use.

bzr tries to take a broader view; for example, it does support a
centralized workflow if you want one.  Most people don't, but a few
might. Because of this, it probably fits kernel development less
well than git.  That's fine, I think! It happens to fit my workflow
better than git does :)

Regards,
Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
@ 2006-10-19  8:19                         ` Alexander Belchenko
  2006-10-21 13:48                           ` Jan Hudec
  2006-10-20  2:09                         ` Horst H. von Brand
  2 siblings, 1 reply; 806+ messages in thread
From: Alexander Belchenko @ 2006-10-19  8:19 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec writes:
...
> 
> Reading this thread I came to think, that the revnos should be assigned
> to _all_ revisions _available_, in order of when they entered the
> repository (there are some possible variations I will mention below)
> 
...
>  - They would be the same as subversion and svk, and IIRC mercurial as
>    well, use, so:
>    - They would already be familiar to users comming from those systems.
>    - They are known to be useful that way. In fact for svk it's the only
>      way to refer to revisions and seem to work satisfactorily (though
>      note that svk is not really suitable to ad-hoc topologies).

I think that the SVN model of revision numbers is wrong, and applying
it to bzr would break many UI habits. For example, when one uses svn
and the repo has many branches, you can never say which revisions
belong to the mainline. So things like
bzr diff -rM..N
(where M and N are absolute revision numbers, and N = M+1 (or M+2), etc.)
become more complicated, because you first need to run the log
command and remember the actual numbers of those revisions.
And each time I find it frustrating to see that mainline svn revision
1000 might be followed by mainline revision 1020. It's very, very
confusing. Maybe only for me.

There are 2 reasons why I don't want to switch to svn (if I can make
my own choice): their strange tags implementation (their tags are the
same as branches, so what is the difference?) and their revision numbers.

I also think that dotted revisions are not the answer in this case,
but they look very logical and nice.

I think bzr needs a switch, a flag, probably in .bazaar.conf, to
show either the revno or the revid to the user. Then the user can
easily select which model is more appropriate for him:

* decentralized (with revno)
* or distributed (with revid, i.e. UUID)

> Comments?

-1 to making revnos work as in svn.

--
Alexander

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  7:02                                 ` Erik Bågfors
@ 2006-10-19  8:49                                   ` Christian MICHON
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19 11:37                                   ` Petr Baudis
  1 sibling, 1 reply; 806+ messages in thread
From: Christian MICHON @ 2006-10-19  8:49 UTC (permalink / raw)
  To: bazaar-ng, git

close to 200 posts on the bzr-git war!
is this the right place (git mailing list) to discuss future
features of bzr?

-- 
Christian

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:49                                   ` Christian MICHON
@ 2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
                                                         ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-19  8:58 UTC (permalink / raw)
  To: Christian MICHON; +Cc: bazaar-ng, git

Christian MICHON wrote:
> close to 200 post on bzr-git war!
> is this the right place (git mailing list) to discuss about future
> features of bzr ?
> 

Perhaps not, but the tone is friendly (mostly), the patience of the 
bazaar people seems infinite and lots of people seem to be having fun 
while at the same time learning a thing or two about a different SCM.
Best case scenario, both git and bazaar come out of the discussion as 
better tools. If there would never be any cross-pollination, git 
wouldn't have half the features it has today.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
@ 2006-10-19  9:10                                       ` Matthieu Moy
  2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
  2006-10-20 10:40                                       ` Jakub Narebski
  2 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19  9:10 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Christian MICHON, bazaar-ng, git

Andreas Ericsson <ae@op5.se> writes:

> Perhaps not, but the tone is friendly (mostly), the patience of the
> bazaar people seems infinite and lots of people seem to be having fun
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as
> better tools. If there would never be any cross-pollination, git
> wouldn't have half the features it has today.

I second this.

I'm a bzr user and occasional developer, and I learnt a lot about git
in this discussion. I hope I could also explain some of the
features of bzr well to some git guys; it's always interesting to
understand why other people do things in a different way, or why they
do them in the same way.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 15:38                                 ` Carl Worth
@ 2006-10-19  9:10                                   ` Matthew D. Fuller
  2006-10-19 11:15                                     ` Andreas Ericsson
  2006-10-19 11:27                                     ` Karl Hasselström
  0 siblings, 2 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-19  9:10 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Wed, Oct 18, 2006 at 08:38:24AM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> 
> But as you already said, it's often avoided specifically because it
> destroys locally-created revision numbers.

I think this has the causality backward.  It's avoided because it
changes the ancestry of the branch in question, by rearranging the
left parents; this ties into Linus' assertion that all parents ought
to be treated equally, which I'm beginning to think is the base
lynchpin of this whole dissension.


Without a differentiation of the parents, there's no such creature as
a "mainline" on a branch, so it's hard to find anything to base revnos
on from the get-go; the whole discussion becomes meaningless and
incomprehensible then.

With the differentiation, numbering along the leftmost 'mainline'
makes sense, and fits the way people tend to work.  "I did this, then
I did this, then I merged in Joe's stuff, then I did this", and the
numbering follows along that.  And as long as it's the same branch,
those revnos will always be the same; I can't go back and add
something in between my first and second commits.  THAT'S where revnos
are useful; referring to a point on a given branch.
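
A minimal sketch of that numbering, assuming a toy history in which each
commit lists its parents in order and the first entry is the "left"
(mainline) parent. The commit names and the mainline_revnos helper are
made up for illustration; this is not how bzr itself is implemented.

```python
# Hypothetical toy history: commit id -> ordered list of parents.
# The first parent is treated as the "left" (mainline) parent.
PARENTS = {
    "a1": [],
    "a2": ["a1"],
    "j1": ["a1"],          # Joe's commit, made on his own branch
    "a3": ["a2", "j1"],    # merge of Joe's work; a2 stays the left parent
    "a4": ["a3"],
}


def mainline_revnos(tip, parents):
    """Number the commits found by following only the first parent from tip."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        ps = parents[commit]
        commit = ps[0] if ps else None
    chain.reverse()  # oldest first, so the root gets revno 1
    return {c: n for n, c in enumerate(chain, start=1)}


revnos = mainline_revnos("a4", PARENTS)
print(revnos)  # {'a1': 1, 'a2': 2, 'a3': 3, 'a4': 4}

# Swapping the parents of the merge changes which commits are on the
# mainline, and therefore which commits get numbered at all.
swapped = dict(PARENTS, a3=["j1", "a2"])
print(mainline_revnos("a4", swapped))  # {'a1': 1, 'j1': 2, 'a3': 3, 'a4': 4}
```

Joe's commit j1 gets no mainline number in the first case, and a2 loses
its number once the parent order of the merge is rearranged.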


Certainly, they're of no (or extremely limited) use when referring to
_different_ branches.  And when you change the arrangement of parents
on a branch, you create a different branch.  That's why bzr (the
project, not the program) tends toward trunks that are merged into,
rather than ephemeral trunks that are merged from and then replaced
with the new trunk, and has its UI optimized by default for that case;
because the ordering of the parents IS considered important and to be
preserved.  Ancestry changes aren't avoided because it would screw up
the revnos; the revnos don't get screwed up because the ancestry
changes are avoided for their OWN sake, and it's BECAUSE of that
pre-existing tendency that the revnos could come into being in the
first place.


If you need to refer to a specific revision in a vacuum, a revno is
the *WRONG* tool for the job.  Revnos exist to refer to points along a
branch.  And in cases where there's a meaningful persistent branch, as
happens in most projects which have a trunk in some sense or another,
they can be the right tool for referring to points along that.


> So there are some aspects of the bzr design that rob from its
> ability to function as a distributed version control system. It
> really does bias itself toward centralization, (the so called "star
> topoloogy" as opposed to something "fully" distributed).

That depends on what you mean by 'bias' (and for that matter, what you
mean by 'centralization'; I think that's being used in very different
ways here).  If you don't care about the ancestry changes, you can go
ahead and change it around by merging and pushing like there's no
tomorrow, and it'll keep up just fine.  Some attributes of it like the
revnos which assume you do care about the ancestry simply cease to be
of any applicability.  That doesn't make it a useless feature, any
more than diff being inapplicable in a branch I'm using to store
binary files makes diff useless; it's just not one that's meaningful
in a given case.

bzr (the project) does care about the ordering of the parents, so it
doesn't do that.  bzr (the tool) assumes that the majority of its
users will care, which is why it has revnos; because in the case where
you don't disturb the ancestry of given branches, revnos are very
useful in reference to that branch.


> So even a project that's very oriented around a single, central tree
> can get a lot of benefit from being able to share things arbitrarily
> between any two given repositories.

I agree wholeheartedly.  That's one of the reasons I'm using bzr, even
though 95% or better of what I do is very oriented around single,
central trees, after all    8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
@ 2006-10-19 10:40                                   ` Sean
  2006-10-20 14:03                                     ` Aaron Bentley
  2006-10-19 10:40                                   ` Sean
  1 sibling, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-19 10:40 UTC (permalink / raw)
  To: Alexander Belchenko; +Cc: git, bazaar-ng

On Thu, 19 Oct 2006 09:46:32 +0300
Alexander Belchenko <bialix@ukr.net> wrote:

> You probably miss main idea of bzr bundles. It's not just the way to
> send via e-mail or other appropriate transport the part of repository.
> It primarily was designed to be human readable as usual diff (i.e.
> patch). It was designed to solve 2 thing simultaneously:
> 
> - be informative for human as usual patch
> - be consistent for machine.

Petr already mentioned that the data currently shown in the email
text isn't really useful.  But it's simple to make it an attachment
and show a combined diff instead.

Although that might just make the email bigger for not a lot of
gain.  It's easy to use the git command line and gui tools to inspect
the bundle after importing it into your repository.  And just as
easy to expunge the bundle afterward if it isn't up to grade.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  1:58                                     ` Charles Duffy
@ 2006-10-19 11:01                                       ` Johannes Schindelin
  2006-10-19 11:10                                         ` Charles Duffy
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-19 11:01 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git

Hi,

On Wed, 18 Oct 2006, Charles Duffy wrote:

> Johannes Schindelin wrote:

you neatly clipped the most important part of my email: I quoted you 
saying that plugins can even change core behaviour!

> > So, the wonderful upside of plugins you described here are actually the
> > reason I will never, _never_ use bzr with plugins.
> > 
> 
> I presume that for this reason you will also never, _never_ use a 
> non-mainline branch of git -- even if its actual code only touches UI 
> enhancements or something similarly non-core

NO! The point was that I will not gladly run anything which could change 
the core. If I know it touches only the UI, there is no problem.

If I get a shell script using git-core programs to do its job, I 
_know_ that my repository will not be fscked afterwards.

And _that_ was the whole point of my email.

> And that you will never, _never_ use third-party wrappers because they 
> might play LD_PRELOAD tricks. Or run any software with root privileges 
> you haven't personally written. Or...

Most of it comes down to trust. And yes, you are correct, I will not run 
git with some obscure module LD_PRELOADed that some guy from some planet 
sent me.

You might have missed my argument being about the SCM, and not the 
universe and all the rest.

> The claim that an extensibility mechanism should be rejected wholesale 
> on account of being excessively powerful, on the other hand, is just 
> silly.

Oh, but NO! An extensibility mechanism which allows for a fragile system 
_is_ silly. Not my rejection of it.

Just take an example (illustrating that once again, one should not 
attribute everything to malevolence...): I write a plugin for bzr. It does 
really wonderful things, it even cooks you dinner.

Only that I happened to make a small mistake (if you followed some threads 
on the git list, you'd know that small mistakes are a hobby of mine), and 
by this mistake, your repository is ... gone. Small mistake, big 
consequence. That is what is wrong with such a powerful system, which
caters to developers, who are human after all.

Note that such a small mistake would be much more likely caught in git: if 
it touches the core, plenty of eyes look at it.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:01                                       ` Johannes Schindelin
@ 2006-10-19 11:10                                         ` Charles Duffy
  2006-10-19 11:24                                           ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Charles Duffy @ 2006-10-19 11:10 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
>> I presume that for this reason you will also never, _never_ use a 
>> non-mainline branch of git -- even if its actual code only touches UI 
>> enhancements or something similarly non-core
>>     
>
> NO! The point was that I will not gladly run anything which could change 
> the core. If I know it touches only the UI, there is no problem.
>   

If you're willing to look at the source of a branch to know that it 
touches only the UI, why would you not be willing to look at the source 
of a plugin to do the same thing?

> If I get a shell script using git-core programs to do its job, I 
> _know_ that my repository will not be fscked afterwards.
>
> And _that_ was the whole point of my email.
>   

It's a silly point. If you're willing to look at what your shell script 
does and validate that it doesn't do LD_PRELOAD tricks or swap out git 
core pieces, why wouldn't you be willing to accept a plugin after a 
similar level of review, rather than stating outright that you would 
*never* use them?

>> The claim that an extensibility mechanism should be rejected wholesale 
>> on account of being excessively powerful, on the other hand, is just 
>> silly.
>>     
>
> Oh, but NO! An extensibility mechanism which allows for a fragile system 
> _is_ silly. Not my rejection of it.
>   

Shell scripts allow for a fragile system because they could include C 
code snippets which they then compile and LD_PRELOAD. Sure, they "allow 
for" a fragile system -- but the author has to go out of their way to 
make it so. Similarly, folks writing bzr plugins need to take explicit 
actions to monkeypatch existing code (as opposed to adding a new 
transport/storage format/command/etc but leaving the old ones alone).

If you trust the author of your shell script not to build their own 
LD_PRELOAD at runtime, why don't you trust the author of your bzr plugin 
not to monkeypatch in replacements to core code if they say they aren't?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  9:10                                   ` Matthew D. Fuller
@ 2006-10-19 11:15                                     ` Andreas Ericsson
  2006-10-19 12:04                                       ` Matthieu Moy
  2006-10-19 11:27                                     ` Karl Hasselström
  1 sibling, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-19 11:15 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Wed, Oct 18, 2006 at 08:38:24AM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
>> But as you already said, it's often avoided specifically because it
>> destroys locally-created revision numbers.
> 
> I think this has the causality backward.  It's avoided because it
> changes the ancestry of the branch in question, by rearranging the
> left parents; this ties into Linus' assertion that all parents ought
> to be treated equally, which I'm beginning to think is the base
> lynchpin of this whole dissension.
> 
> 
> Without a differentiation of the parents, there's no such creature as
> a "mainline" on a branch, so it's hard to find anything to base revnos
> on from the get-go; the whole discussion becomes meaningless and
> incomprehensible then.
> 
> With the differentiation, numbering along the leftmost 'mainline'


You, and others, keep saying "leftmost". What on earth does left or 
right have to do with anything? Or rather, how do you determine which 
side anything at all is on?

> makes sense, and fits the way people tend to work.  "I did this, then
> I did this, then I merged in Joe's stuff, then I did this", and the
> numbering follows along that.  And as long as it's the same branch,
> those revnos will always be the same; I can't go back and add
> something in between my first and second commits.  THAT'S where revnos
> are useful; referring to a point on given branch.
> 

So long as the given branch is, in git-speak, "master"? I think I'm 
starting to see how this would work, but I still fail to see how you can 
then come up with revnos such as 2343.1.14.7.19, since the only ones 
that seem to actually make any sense are the ones that track the 
strictly linear development.

In git, this can be accomplished by auto-tagging each update of any 
branch with a tag named numerically and incrementally, although no-one 
really bothers with it.
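
A rough sketch of the naming half of such auto-tagging, assuming tags of
the form "r<N>" (the prefix and the next_numeric_tag helper are invented
for illustration). A hook would still have to list the existing tags and
create the new one, for example with "git tag"; that part is left out.

```python
def next_numeric_tag(existing_tags, prefix="r"):
    """Return the next tag name in the sequence r1, r2, r3, ..."""
    numbers = [
        int(tag[len(prefix):])
        for tag in existing_tags
        if tag.startswith(prefix) and tag[len(prefix):].isdecimal()
    ]
    # Tags that do not match the numeric pattern are simply ignored.
    return f"{prefix}{max(numbers, default=0) + 1}"


print(next_numeric_tag(["r1", "r2", "v1.4.2", "r10"]))  # r11
print(next_numeric_tag([]))  # r1
```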

Let's say you have the following graph, where A is the root commit, B 
introduces the base for a couple of new features that three separate 
coders start to work on in their own repositories. The feature started 
on in D is logically coded as a two-stage change. F fixes a bug 
introduced in D. I is the result of an octopus merge of all three 
branches, where the three features are implemented and all bugs are 
fixed (this is btw by far the most common pattern we have in our repos 
here at work).

     A
     |
     B
    /|\
   C | D
   | | |\
   | | E F
   | | |/
   | | G
   | H |
    \|/
     I

Now a couple of questions arise.
- How do I get to C, D, E, F, G and H?
- When these get merged, which one will be considered the "left" parent, 
and why?

> 
>> So there are some aspects of the bzr design that rob from its
>> ability to function as a distributed version control system. It
>> really does bias itself toward centralization, (the so called "star
>> topoloogy" as opposed to something "fully" distributed).
> 
> That depends on what you mean by 'bias' (and for that matter, what you
> mean by 'centralization'; I think that's being used in very different
> ways here).  If you don't care about the ancestry changes, you can go
> ahead and change it around by merging and pushing like there's no
> tomorrow, and it'll keep up just fine.  Some attributes of it like the
> revnos which assume you do care about the ancestry simply cease to be
> of any applicability.


How deep will I have to dig to get the immutable revids instead?


>  That doesn't make it a useless feature, any
> more than diff being inapplicable in a branch I'm using to store
> binary files makes diff useless; it's just not one that's meaningful
> in a given case.
> 

Binary diffs work just fine, thank you very much ;-)

> bzr (the project) does care about the ordering of the parents, so it
> doesn't do that.  bzr (the tool) assumes that the majority of its
> users will care, which is why it has revnos; because in the case where
> you don't disturb the ancestry of given branches, revnos are very
> useful in reference to that branch.
> 
> 
>> So even a project that's very oriented around a single, central tree
>> can get a lot of benefit from being able to share things arbitrarily
>> between any two given repositories.
> 
> I agree wholeheartedly.  That's one of the reasons I'm using bzr, even
> though 95% or better of what I do is very oriented around single,
> central trees, after all    8-}
> 

I'm sure it's supported. The question is whether or not bazaar makes it 
easy for those developers to exchange valuable information (revids, 
since their revnos will be mixed up) so they can communicate detailed 
info about "commit X introduced a bug in foo_diddle(). I fixed it in 
commit Y, so if you merge it we can release". If revids are always 
printed anyways, I see even less need for revnos. If it's hard to get 
the revids I wouldn't consider the truly distributed workflow supported 
any more than I consider CVS file rename support à la "just hand-edit 
the ,v-files" to actually work.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:10                                         ` Charles Duffy
@ 2006-10-19 11:24                                           ` Johannes Schindelin
  2006-10-19 11:30                                             ` Charles Duffy
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-19 11:24 UTC (permalink / raw)
  To: Charles Duffy; +Cc: git

Hi,

On Thu, 19 Oct 2006, Charles Duffy wrote:

> Johannes Schindelin wrote:
> > > I presume that for this reason you will also never, _never_ use a
> > > non-mainline branch of git -- even if its actual code only touches UI
> > > enhancements or something similarly non-core
> > >     
> > 
> > NO! The point was that I will not gladly run anything which could change the
> > core. If I know it touches only the UI, there is no problem.
> >   
> 
> If you're willing to look at the source of a branch to know that it 
> touches only the UI, why would you not be willing to look at the source 
> of a plugin to do the same thing?

That is why I said I'd be gladly using a shell-script using git-core 
programs. It is typically no more than 20 lines, and I can review that 
quite easily.

> Shell scripts allow for a fragile system because they could include C code
> snippets which they then compile and LD_PRELOAD.

Well, I do not expect people to misbehave. You do not compile a nasty 
C-program from a shell script _by mistake_.

I also expect people not to constantly miss my point. It could be that I 
am not as proficient in the English language as I thought. In that case, 
I'd better shut up.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  9:10                                   ` Matthew D. Fuller
  2006-10-19 11:15                                     ` Andreas Ericsson
@ 2006-10-19 11:27                                     ` Karl Hasselström
  2006-10-19 11:46                                       ` Petr Baudis
  1 sibling, 1 reply; 806+ messages in thread
From: Karl Hasselström @ 2006-10-19 11:27 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git, Jakub Narebski

On 2006-10-19 04:10:45 -0500, Matthew D. Fuller wrote:

> I think this has the causality backward. It's avoided because it
> changes the ancestry of the branch in question, by rearranging the
> left parents; this ties into Linus' assertion that all parents ought
> to be treated equally, which I'm beginning to think is the base
> lynchpin of this whole dissension.

Yes, it seems you have found the needle. :-) In git, history is a DAG;
a commit has a _set_ of parents, so by definition they are not
ordered. This has a number of consequences. For example, you can't
really answer the question "Which branch was this commit on?". All you
can say is that "This commit is reachable from (and therefore part of)
branches X, Y, and Z."

In all other SCMs I have seen, a "branch" is conceptually an ordered
series of commits (some of which may be merges). In git, a "branch" is
a pointer to a commit, period. The commit knows its set of parents, so
all its history is there, but there is fundamentally no way to tell
which branch a commit was "on" when it was created.
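
As a small sketch of the model described above (not git's actual code):
commits are modelled as a mapping from an id to its set of parents, and a
branch is just a name pointing at one commit. All ids, branch names and
helper functions here are hypothetical.

```python
# Hypothetical commit graph: commit id -> set of parent ids.
PARENTS = {
    "c1": set(),
    "c2": {"c1"},
    "c3": {"c1"},
    "c4": {"c2", "c3"},   # a merge commit
    "c5": {"c3"},
}

# A "branch" is only a name pointing at one commit.
BRANCHES = {"X": "c4", "Y": "c5"}


def reachable(tip, parents):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def branches_containing(commit, branches, parents):
    """List the branch names from whose tips the commit can be reached."""
    return sorted(
        name for name, tip in branches.items()
        if commit in reachable(tip, parents)
    )


print(branches_containing("c3", BRANCHES, PARENTS))  # ['X', 'Y']
print(branches_containing("c2", BRANCHES, PARENTS))  # ['X']
```

Nothing in the data records which branch c3 was "on" when it was
created; the only question that can be answered is which branch tips
reach it.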

This is an important point; it means there is no concept of "my" or
"your" branch. Every participant is adding commits to the same DAG,
and may at any point decide to share her additions with someone else,
or keep them private forever. And because "branches" don't really
exist, every commit really is created equal.

Really, every commit. Not even the initial commit of a project is
special -- it's just a commit with an empty parent set. And, it's
perfectly possible to make a (merge) commit whose parents belong to
previously disconnected parts of the DAG. This of course means that
it's not even possible to differentiate commits based on which project
they're part of, since one can create a commit whose parents belong to
different projects. All commits are _really_ born equal! There's just
one great DAG of all git commits that could possibly exist. (This has
been done in git's own history; the graphical viewer gitk was
originally a separate project, with its own initial commit, but that
initial commit is now reachable from all commits currently being made
to git -- that is, it has been merged.)
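
To illustrate, here is a toy graph (commit names invented, loosely
modelled on the gitk example) in which two histories with separate
initial commits are joined by one merge. The roots_reachable_from helper
is an assumption made for this sketch, not a git command.

```python
# Hypothetical graph: commit id -> set of parent ids.
PARENTS = {
    "git-root": set(),
    "git-2": {"git-root"},
    "gitk-root": set(),             # a separate project's initial commit
    "gitk-2": {"gitk-root"},
    "merge": {"git-2", "gitk-2"},   # joins the two histories
    "git-3": {"merge"},
}


def roots_reachable_from(tip, parents):
    """Return the parentless commits reachable from tip, sorted by name."""
    seen, stack, roots = set(), [tip], set()
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if not parents[commit]:
            roots.add(commit)  # an initial commit: empty parent set
        stack.extend(parents[commit])
    return sorted(roots)


print(roots_reachable_from("git-3", PARENTS))  # ['git-root', 'gitk-root']
print(roots_reachable_from("git-2", PARENTS))  # ['git-root']
```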

This structure of things may seem complex, since it's different, but
mathematically it's quite simple, and that's what counts in the end if
you want to do nontrivial things.

-- 
Karl Hasselström, kha@treskal.com
      www.treskal.com/kalle

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:24                                           ` Johannes Schindelin
@ 2006-10-19 11:30                                             ` Charles Duffy
  2006-10-20 11:38                                               ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Charles Duffy @ 2006-10-19 11:30 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
>> Shell scripts allow for a fragile system because they could include C code
>> snippets which they then compile and LD_PRELOAD.
>>     
>
> Well, I do not expect people to misbehave. You do not compile a nasty 
> C-program from a shell script _by mistake_.
>   

You also don't replace bzrlib functionality (in your terms, plumbing) in 
a plugin by mistake.

> I also expect people not to constantly miss my point.

I think your point is predicated on a misunderstanding of how plugins work.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  7:02                                 ` Erik Bågfors
  2006-10-19  8:49                                   ` Christian MICHON
@ 2006-10-19 11:37                                   ` Petr Baudis
  2006-10-19 15:17                                     ` Matthew D. Fuller
  1 sibling, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-19 11:37 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 09:02:16AM CEST, I got a letter
where Erik Bågfors <zindar@gmail.com> said that...
> bzr's use of revision numbers, doesn't mean it hasn't got unique
> revision identifiers, and I can't see any reason why it couldn't be
> used in the same way as git.

There is perhaps no "technical" reason, but it's also what the user
interface is designed around - most probably, using UUIDs instead of
revnos would be a lot less convenient for bzr people because you
probably primarily show revnos everywhere and UUIDs only in few special
places and/or when asked specifically through a command (correct me if
I'm wrong). Also, do you support "UUID autocompletion" so that you can
type just the unique UUID prefix instead of the whole thing?
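
What is meant by prefix completion can be sketched roughly like this;
the identifiers are made up and resolve_prefix is a hypothetical helper,
not a function of either tool:

```python
def resolve_prefix(prefix, ids):
    """Return the single id starting with prefix, or raise KeyError."""
    matches = [i for i in ids if i.startswith(prefix)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise KeyError(f"no revision id starts with {prefix!r}")
    raise KeyError(f"prefix {prefix!r} is ambiguous: {len(matches)} matches")


IDS = ["3f2a9c41d0", "3f2b77e012", "a91c04ffee"]  # made-up identifiers

print(resolve_prefix("a9", IDS))    # a91c04ffee
print(resolve_prefix("3f2b", IDS))  # 3f2b77e012
# resolve_prefix("3f2", IDS) raises KeyError, since two ids share that prefix.
```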

> Both are excellent tools, and since git
> is more specialized (built to support the exact workflow used in
> kernel development), it's more suited for that exact use.
> 
> bzr tries to take a broader view, for example, it does support a
> centralized workflow if you want one.  Most people don't, but a few
> might. Because of this, it probably fits the kernel development less
> good than git.  That's fine I think! I happens to fit my workflow
> better than git does :)

I think they are in fact just as flexible (+-epsilon). Git can support
centralized workflow as well - you have some central repository
somewhere and all the developers clone it, then pull from it and push to
it in basically the same way they would use CVS. And it is perhaps
even more common in practice than the "single-man" workflow
nowadays, as more projects are using Git.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:27                                     ` Karl Hasselström
@ 2006-10-19 11:46                                       ` Petr Baudis
  2006-10-19 16:01                                         ` Matthew D. Fuller
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-19 11:46 UTC (permalink / raw)
  To: Karl Hasselström
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 01:27:59PM CEST, I got a letter
where Karl Hasselström <kha@treskal.com> said that...
> Really, every commit. Not even the initial commit of a project is
> special -- it's just a commit with an empty parent set. And, it's
> perfectly possible to make a (merge) commit whose parents belong to
> previously disconnected parts of the DAG. This of course means that
> it's not even possible to differentiate commits based on which project
> they're part of, since one can create a commit whose parents belong to
> different projects.

FWIW, IIRC the Git project has about 6 initial commits. :-)

BTW, a popular source of horrification in other VCSes is Git's octopus
merges. (A popular source of horrification in Git is kernel developers
doing octopus merges of 40 branches at once.) Does Bazaar support those?
(I can't really say it's a defect if it doesn't...)

(An octopus merge is a merge of more than two branches at once, in a
single commit.)
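
To make the two points above concrete (initial commits are just commits
with an empty parent set, and an octopus merge is a commit with more than
two parents), here is a minimal sketch. The commit names and the
dictionary layout are made up for illustration; this is not how Git
stores its objects.

```python
# Hypothetical in-memory commit graph: commit name -> list of parent names.
commits = {
    "root1": [],                    # an initial commit: no parents
    "root2": [],                    # a second, unrelated initial commit
    "a": ["root1"],
    "b": ["root1"],
    "c": ["root2"],
    "octopus": ["a", "b", "c"],     # one commit merging three lines at once
}


def roots(graph):
    """Return all commits with an empty parent set, sorted by name."""
    return sorted(c for c, parents in graph.items() if not parents)


def octopus_merges(graph):
    """Return all commits with more than two parents, sorted by name."""
    return sorted(c for c, parents in graph.items() if len(parents) > 2)


print(roots(commits))
print(octopus_merges(commits))
```

Note that the "octopus" commit here also joins two previously
disconnected parts of the DAG, since "c" descends from a different root
than "a" and "b".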

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:15                                     ` Andreas Ericsson
@ 2006-10-19 12:04                                       ` Matthieu Moy
  2006-10-19 12:33                                         ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19 12:04 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth, git,
	Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> You, and others, keep saying "leftmost". What on earth does left or
> right have to do with anything? Or rather, how do you determine which
> side anything at all is on?

Not sure it's the same in git, but in bzr, a new revision is always
created by a commit (it can be "fetched" by other commands though). If
you "merge", then you have to commit after.

What people call "leftmost ancestor" is the revision which used to be
the tip at the time you committed. For example, if you do "bzr diff;
bzr commit" the diff shown before is the same as the one you get with
"bzr diff -r last:1" right after the commit.

I believe this doesn't make a difference for merge algorithms, but in
the UI, it's here when you say, e.g.:

bzr diff -r last:12..before:revid:foo@bar-auents987aue

(once in "last:", and once in "before:")

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 12:04                                       ` Matthieu Moy
@ 2006-10-19 12:33                                         ` Petr Baudis
  2006-10-19 13:44                                           ` Matthieu Moy
  2006-10-20 11:50                                           ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-19 12:33 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Andreas Ericsson, Matthew D. Fuller, bazaar-ng, Linus Torvalds,
	Carl Worth, git, Jakub Narebski

Dear diary, on Thu, Oct 19, 2006 at 02:04:14PM CEST, I got a letter
where Matthieu Moy <Matthieu.Moy@imag.fr> said that...
> What people call "leftmost ancestor" is the revision which used to be
> the tip at the time you commited. For example, if you do "bzr diff;
> bzr commit" the diff shown before is the same as the one got with
> "bzr diff -r last:1" right after the commit.

The lack of parents ordering in Git is directly connected with
fast-forwarding.

Consider

 repo1   repo2

   a       a
  /       /
 b       c

Now repo2 merges with repo1:

 repo1   repo2

   a       a
  /       / \
 b       c   b
          \ /
           m

repo1 tip ('b') is not an ancestor of repo2 tip ('c') so a three-way merge
is done and a new 'm' merge commit is created.

And now repo1 merges with repo2:

 repo1   repo2

   a       a
  / \     / \
 c   b   c   b
  \ /     \ /
   m       m

Because the previous repo1 tip ('b') was an ancestor of repo2 tip ('m'), a
fast-forward happened and repo1 tip simply moved to 'm'. But this
"flipped" the development from repo1 POV - you cannot assume anymore
that the first ("leftmost") parent is special.
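
The scenario above can be replayed with a small sketch. It assumes a
simplified model in which both repositories share one commit graph (a
dict from commit name to parent list) and a "merge" either fast-forwards
or records a new commit with "ours" as the first parent. The function
names are hypothetical and this is not Git's actual implementation.

```python
def ancestors(graph, tip):
    """Return the set of commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen


def merge(graph, ours, theirs, new_name):
    """Merge theirs into ours and return the new tip of our branch."""
    if ours in ancestors(graph, theirs):
        return theirs                    # fast-forward: no new commit
    if theirs in ancestors(graph, ours):
        return ours                      # already up to date
    graph[new_name] = [ours, theirs]     # true merge: ours is first parent
    return new_name


graph = {"a": [], "b": ["a"], "c": ["a"]}
repo1_tip, repo2_tip = "b", "c"

# repo2 merges with repo1: neither tip contains the other, so 'm' is created.
repo2_tip = merge(graph, repo2_tip, repo1_tip, "m")

# repo1 merges with repo2: 'b' is an ancestor of 'm', so repo1 fast-forwards.
repo1_tip = merge(graph, repo1_tip, repo2_tip, "m2")

first_parent_of_repo1_tip = graph[repo1_tip][0]
print(repo1_tip, graph[repo1_tip], first_parent_of_repo1_tip)
```

After the second call repo1's tip is 'm', whose first parent is 'c', a
commit that was never a tip in repo1. That is the "flip" described above.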

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 12:33                                         ` Petr Baudis
@ 2006-10-19 13:44                                           ` Matthieu Moy
  2006-10-19 16:03                                             ` Carl Worth
  2006-10-20 11:50                                           ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19 13:44 UTC (permalink / raw)
  To: Petr Baudis; +Cc: bazaar-ng, git

Petr Baudis <pasky@suse.cz> writes:

> The lack of parents ordering in Git is directly connected with
> fast-forwarding.

[...]

>  repo1 repo2
>
>    a       a
>   / \     / \
>  c   b   c   b
>   \ /     \ /
>    m       m

Yes, bzr has a similar thing too. AIUI, the difference is that git does
it automatically, while bzr has two commands in its UI, "merge" and
"pull".

In your case, the "leftmost ancestor" of m is c, because at the time
it was created, it was committed from c.

One problem with that approach is that from revision m and looking
backward in history (say, running "bzr log"), you have two ways to go
backward:

1) Take the history of _your_ commits, and your pull till the point
   where you've branched.

2) Follow the history taking the leftmost ancestor at each step.

In bzr, the notion of "branch" corresponds to a succession of
revisions, which are explicitly stored in a file (ls
.bzr/branch/revision-history), which is what commands like "log"
follow, and what is used for revision numbering. And this succession of
revisions must obey (at most) one of the above. In the past, it was 1),
which means that "pull" (i.e. fast-forward) was only adding revisions
to a branch. In your scenario, repo1 would get a revision history of
"a b m" while repo2 would have had "a c m" with the same tip.

Today, the revision history follows leftmost ancestor. One good
property of this is that revision history is unique for a given
revision. But the terrible drawback is that "pull" and "push" do not
/add/ revisions to your revision history, they rewrite the target one
with the source one. That means I can have

$ bzr log --line
1: some upstream stuff
2: started my work
3: continued my work

# upstream merges.

$ bzr pull
$ bzr log --line
1: some upstream stuff
2: some other upstream stuff ...
3: ... commited while I was working
4: merged from Matthieu this terrible feature
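
The renumbering shown in the two logs above can be reproduced with a
short sketch. It assumes revision numbers are assigned by walking the
first (leftmost) parent from the tip back to the root, and it uses
made-up revision names; it illustrates the rule described here and is
not bzr's code.

```python
def mainline(graph, tip):
    """Follow the leftmost parent from tip to the root; return oldest first."""
    history = []
    while tip is not None:
        history.append(tip)
        parents = graph[tip]
        tip = parents[0] if parents else None
    return history[::-1]


def revnos(graph, tip):
    """Number the mainline revisions 1, 2, 3, ... starting from the root."""
    return {rev: n for n, rev in enumerate(mainline(graph, tip), start=1)}


# Hypothetical history: my two commits on top of 'up1', while upstream
# added 'up2' and 'up3' and then merged my work with itself as left parent.
graph = {
    "up1": [],
    "mine1": ["up1"],
    "mine2": ["mine1"],
    "up2": ["up1"],
    "up3": ["up2"],
    "merge": ["up3", "mine2"],
}

before = revnos(graph, "mine2")   # my branch before the pull
after = revnos(graph, "merge")    # my branch after pulling upstream's tip
print(before)
print(after)
```

Before the pull, revnos 2 and 3 name my own commits; afterwards the same
numbers name upstream's commits, and my commits are no longer on the
mainline at all.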

-- 
Matthieu -- definitely curious to give a real try to git ;-)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  3:46                                                       ` Junio C Hamano
@ 2006-10-19 14:27                                                         ` Nicolas Pitre
  2006-10-19 14:55                                                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-19 14:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Wed, 18 Oct 2006, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > Actually, I've hit an impasse.
> >
> > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > without a base into non-delta objects, and keeping the number of objects 
> > in the pack the same. But _again_, we don't actually know which ones to 
> > expand until it's too late.
> 
> pack-objects.c::write_one() makes sure that we write out base
> immediately after delta if we haven't written out its base yet,
> so I suspect if you buffer one delta you should be Ok, no?

If we create full packs out of thin packs the base objects will end up 
at the end of the pack so this assumption is a bad one to rely upon if 
we want to make things robust (like being able to feed such a pack 
back).


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19  3:46                                                       ` Junio C Hamano
  2006-10-19 14:27                                                         ` Nicolas Pitre
@ 2006-10-19 14:55                                                         ` Linus Torvalds
  2006-10-19 16:07                                                           ` Jan Harkes
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 14:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Wed, 18 Oct 2006, Junio C Hamano wrote:
>
> Linus Torvalds <torvalds@osdl.org> writes:
> >
> > Actually, I've hit an impasse.
> >
> > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > without a base into non-delta objects, and keeping the number of objects 
> > in the pack the same. But _again_, we don't actually know which ones to 
> > expand until it's too late.
> 
> pack-objects.c::write_one() makes sure that we write out base
> immediately after delta if we haven't written out its base yet,
> so I suspect if you buffer one delta you should be Ok, no?

It doesn't matter. I realized that my bogus patch to unpack-objects was 
more seriously broken anyway: even the "un-deltify every single object" 
was broken. And that's despite the fact that I _tested_ it, and verified 
the end result by hand.

Why? Because I tested it within one repo, by just piping the output of 
git-pack-objects --stdout directly to the repacker. That seemed to be a 
good way to test it without setting up anything bigger. But it turns out 
that it misses one of the big problems: if you don't unpack the objects in 
a way that later phases can read, none of the streaming code works at all, 
and you have to buffer up _everything_ in memory just to be able to read 
any previous _non_delta objects too.

So my patch-series works - but it only works in a repo that already has 
all the objects in question, because then it can look up the objects in 
the original database. Which makes it useless. Duh.

So forget about unpack-objects. It's designed to be streaming (and it's a 
_good_ design for what it does), but repacking really cannot be done that 
way. Repacking needs to be done by saving the thin pack to disk, and then 
doing a multi-pass over it (like git-index-pack does, for example).

Just throw my patch away. It's not even useful as a basis for anything 
else, unless you want to use it as a way to keep all the objects in memory 
and use the "unpack-objects" logic to just _parse_ the incoming pack.

I suspect using "index-pack" is saner (since it already has the multi-pass 
logic), or just doing something that maps all the objects in memory, and 
then calls builtin-pack-objects once it has set up the new thin pack so 
that others can see/use the new objects without realizing that they aren't 
in the canonical pack-format.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  9:10                                       ` Matthieu Moy
@ 2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-19 16:14                                           ` Matthieu Moy
  0 siblings, 2 replies; 806+ messages in thread
From: Tim Webster @ 2006-10-19 14:57 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Andreas Ericsson, Christian MICHON, bazaar-ng, git

On 10/19/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Andreas Ericsson <ae@op5.se> writes:
>
> > Perhaps not, but the tone is friendly (mostly), the patience of the
> > bazaar people seems infinite and lots of people seem to be having fun
> > while at the same time learning a thing or two about a different SCM.
> > Best case scenario, both git and bazaar come out of the discussion as
> > better tools. If there would never be any cross-pollination, git
> > wouldn't have half the features it has today.
>
>


Thanks everyone for taking time to explain details.

However, I don't use SCM for code development. I use it for collaborative
documentation, white boarding and tracking configurations.
In fact in my company no one uses SCM for code development.
Everyone here uses it for collaborative documentation and white boarding.
Only I use SCM for tracking configurations.

I think of SCMs in terms of an SCM core and SCM tools.

First I want to say every SCM I know of sucks when it comes to tracking
configurations, simply because they don't record or restore file metadata,
like perms, ownership, and acl. I don't see recording or restoring
file metadata as part of the SCM core. I do however feel an SCM core needs to
have provisions for extended file inventory information. The problem
with extended file inventory information is that it is fs-specific. For this
reason I feel it is essential that the SCM core allow multiple sets of extended
file inventory information. The SCM tools are responsible, based on the local
config, for recording metadata, creating the extended file inventory, and
translating file metadata of one file system.

When tracking configurations, octopus merges are surprisingly common. If a
configuration change is not signed off by a responsible person, it cannot be
accepted. Doing otherwise is simply an invitation to attackers and makes
troubleshooting far too difficult. Also, configuration files in one directory
will most often not be members of the same repo. For example, each file in the
etc directory would be a member of a different repo according to its
associated application/pkg.

Some things I would like the SCM tools to handle. Personally I would like the
SCM tools to be platform independent. This would ensure that the correct
things happen on ext3 mounted on Windows.
I don't think the execute bit belongs in the basic file inventory information.
Instead I would like it replaced by a filter in the extended file inventory
indicating what file metadata, if any, should be recorded or restored.
When the local SCM tools config has metadata use enabled, the filter is used.
A filter lets the user select file metadata to record/restore, such as:
record ownership, record permissions, record acl.

For SCM configuration tracking to function reliably, pulls, pushes and merges
need to be atomic. Personally I like my servers to pull change updates. And
I like to push changes I make on local servers to branches. On the
configuration master, the branches are merged into groups. When the server
pulls changes for a particular application/pkg, the following steps need to
occur:

1) Perform a pre-update step, such as optionally stopping a service.
2) Pull updates and build the changed files in a scratch space, then apply
   file metadata; unchanged files would be links from the scratch space to
   the original files.
3) Verify all files are correct by checking their sha1 or md5.
4) Atomically move configuration files and scripts to install them.
5) Perform a post-update step, such as starting or reloading a service.

The pre-update and post-update steps are very much like pkg pre- and
post-install scripts. The pre-update and post-update scripts are in fact part
of the application/pkg configuration files.
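
The update steps described above could look roughly like the following
for a single file. This is only a sketch under several assumptions: the
function name, its parameters and the hook callables are hypothetical,
the scratch file is created next to the target so that the final rename
stays on one file system, and only the permission bits are applied as
metadata (no ownership or acl).

```python
import hashlib
import os
import tempfile


def sha1_of(path):
    """Return the hex SHA-1 digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def install_atomically(target, new_content, expected_sha1,
                       mode=0o644, pre=None, post=None):
    """Write new_content to a scratch file, verify it, then rename it over target."""
    if pre:
        pre()                                    # e.g. stop a service
    directory = os.path.dirname(os.path.abspath(target))
    fd, scratch = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(new_content)                 # build the file in scratch space
        os.chmod(scratch, mode)                  # apply (some) file metadata
        if sha1_of(scratch) != expected_sha1:
            raise ValueError("checksum mismatch, refusing to install")
        os.replace(scratch, target)              # atomic rename on POSIX
    except BaseException:
        if os.path.exists(scratch):
            os.remove(scratch)                   # never leave scratch files behind
        raise
    if post:
        post()                                   # e.g. start or reload a service
```

A caller would pass the content and checksum obtained from the SCM, plus
the pre and post hooks shipped with the application/pkg configuration. A
failed checksum leaves the previously installed file untouched.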


Collaborative document editing and whiteboarding are other requirements.
odf and svg are xml file formats. I would like to see an efficient
xml diff as part of the SCM core. Using mime types, SCM tools can unzip
files and bundles, and use mime type information to select the SCM core
xml diff or plain diff as required. I think it is essential that the SCM core
include provisions for multiple repo partners. For example, this can be used
to create a fail-over star scm architecture.
In collaborative document editing it is often the case that you want to
compress / summarize some of the change history.

We currently use our scm-based collaborative document editing as an ad hoc
whiteboard, coordinating our commits and updates via IM. :)
It would be nice if the SCM tools included rss feeds for communicating zip
patch bundles.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
@ 2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 16:59                                     ` Carl Worth
  2006-10-19 17:01                                     ` Carl Worth
  2006-10-19 15:25                                   ` Linus Torvalds
  2 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-19 14:58 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Wed, 18 Oct 2006 23:10:11 -0400, Aaron Bentley wrote:
> But think about your favorite example of an unhealthy social situation
> around a software project and a big, nasty fork. Every example I can
> think of involves some technical distinction that makes one branch
> more special than another.
>
> Now, those situations also involve social problems, and those are even
> more significant. But the technical blessing of one branch does not
> help. And I think it contributes to the social problems in many cases.

I'm not as familiar with those details.  The one fork that I know a lot
about, when Baz (the old Bazaar architecture) forked off from Arch,
showed me that for each developer branch, one branch must be special.

This is just because it is hard to maintain a branch that applies
cleanly to two diverging codebases.  So each developer must develop
against the fork that they want to merge their code into.  If they want
their code to be applied to the other fork, someone must port it.

So I really do feel that special branches are inescapable.

With bzr, you have the freedom to choose which branch you consider
special, and change your mind at any time.  There are no technical
limitations in that regard.

> As far as the revision numbers, my impression is that the numbers
> would be confusing or worthless if I were to use bzr the way I'm
> currently using git, as they certainly could not remain stable.

They would remain stable if you only used pull to update your origin
branch, and used merge+commit to update your development branch.

>> In bzr development, it's very rare for anyone's revision numbers to change.
> 
> Which just says to me that the bzr developers really are sticking to a
> centralized model.

I don't see why you're reaching that conclusion.  I'd like to understand
that better, because Linus seems to be concluding the same thing, and it
doesn't make sense to me.

>> I think you're implying that on a technical level, bzr doesn't support
>> this.  But it does.  Every published repository has unique identifiers
>> for every revision on its mainline, and it's exceedingly uncommon for
>> these to change.
> 
> Every argument you make for the number change being uncommon just
> strengthens the argument that it will be all that more
> confusing/frustrating when the numbers do change.

That doesn't follow.  Just because something is arguably true doesn't
make it bad.  And in this case, I'm not arguing that it's true, I'm
saying that it's true, because that is what my experience tells me is true.

> In cairo, for example, we've made a habit of including a revision
> identifier in our bug tracking system for every commit that resolves a
> bug.

We do it the other way around: we put a bug number in the commit
message.  And I personally have been developing a bugtracker that is
distributed in the same way bzr is; it stores bug data in the source
tree of a project, so that bug activities follow branches around.

> I like having the assurance that those numbers will survive
> forever. And it doesn't matter if the repository moves, or the project
> is forked, or anything else. Those numbers cannot change.
> 
> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

Yes, we put revnos in our bug trackers.  No, we can't prove that they
will always be valid.  But there are significant disincentives to
changing them, so I am quite comfortable assuming they will not change.
 And the older a revno gets, the less likely it is to change.

On the other hand, I think your revision identifiers are not as
permanent as you think.

In the first place, it seems fairly common in the Git community to
rebase.  This process throws away old revisions and creates new
revisions that are morally equivalent[1].  I don't know whether Git
fetches unreferenced revisions, but bzr's policy is to fetch only
revisions referenced in the ancestry DAG of the branch.

In the second place, one must consider the "nuclear launch codes"
scenario.  In this scenario, someone has committed the codes necessary
to begin a nuclear attack into their branch.  This is an unlikely event,
of course, but nuclear launch codes are an extreme example of data that
absolutely, positively must be completely expunged from the branch.
Other examples include proprietary code (e.g. if SCO wasn't a bunch of
charlatans), passwords and obscene or libelous statements.

In a nuclear codes scenario, the revision that introduced the nuclear
launch codes and all its descendants must be expunged from the
repository.  You may, perhaps, rebase in order to retain the shape of
the history, but the revision-ids that you have recorded will be gone.

Aaron

[1] This is a process that I find discomforting, because I consider the
original revisions to be real, historical data, and I don't like the
idea of throwing it away.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFN5F70F+nu1YWqI0RAhrsAJ9rcqNGv28134eTvbGoxxteOxif3wCfTbaq
fpD0HNeGgdlMwuJldyzUxRM=
=9k8r
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:37                                   ` Petr Baudis
@ 2006-10-19 15:17                                     ` Matthew D. Fuller
  0 siblings, 0 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 15:17 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Erik Bågfors, bazaar-ng, git

[ trim back CC a bit ]

On Thu, Oct 19, 2006 at 01:37:31PM +0200 I heard the voice of
Petr Baudis, and lo! it spake thus:
> 
> [...] you probably primarily show revnos everywhere and UUIDs only
> in few special places and/or when asked specifically through a
> command (correct me if I'm wrong).

The primary place you'd see either is in 'log'.  To show the UUID,
you'd add a "--show-ids" arg to it (and via per-user config aliasing,
you could just alias 'log' to 'log --show-ids' if you always wanted to
see them, so you wouldn't have to type it).  The output looks something
like:

revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:14:37 -0500
message:
  Foo

(without --show-ids, it's the same, except not showing the
revision-id: line)


> Also, do you support "UUID autocompletion" so that you can type just
> the unique UUID prefix instead of the whole thing?

With the form of bzr UUID's, that's not particularly useful, since
you're probably into the minutes/seconds of the timestamp before it
becomes unique, at which point you're close to 2/3 of the way through
the whole string.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  5:21                                 ` Carl Worth
  2006-10-19  5:56                                   ` Martin Pool
  2006-10-19 14:58                                   ` Aaron Bentley
@ 2006-10-19 15:25                                   ` Linus Torvalds
  2006-10-19 16:13                                     ` Matthew D. Fuller
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 15:25 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Jakub Narebski, Andreas Ericsson, bazaar-ng, git



On Wed, 18 Oct 2006, Carl Worth wrote:
> 
> I understand that bzr also has unique identifiers, but it sounds like
> the tools try to hide them, and people aren't in the habit of using
> them for things like this. Do bzr developers put revision numbers in
> their bug trackers? Is there a guarantee they will always be valid?

bzr seems to use the classic UUID format, and it's funny how much it looks 
like a real BK ChangeSet revision number ("key").

Here's the quoted bzr "true" revision ID:

	Matthieu.Moy@imag.fr-20061017152029-4c5a2861bcf23b7d

and here's a BK "ChangeSet Key":

	adi@zaphod.bitmover.com|ChangeSet|20031031183805|57296

(I don't have BK installed anywhere, so I had to google for changeset 
keys, and this was just some random key in the BK bugzilla ;)

Look very similar, don't they? And yes, the true revision ID is stable 
over time (at least it was in BK, and I assume it is in bzr too).

The biggest difference seems to be that in bzr, the final checksum is 
64-bit, while for BK, it was just a 16-bit checksum/unique number (the 
rest is just user-name/machine-name and date: I assume that the bzr commit 
was done at 10/17/2006 3:20:29PM, and the example BK ChangeSet was created 
10/31/2003 6:38:05PM - it looks like _exactly_ the same date format).

With BK, you can also use a "md5 key", and I don't actually know how they 
work. They may just be the md5 hash of the ChangeSet key, I think that may 
be how those things are indexed. So in bkcvs, you'll see a line like this:

	BKrev: 42516681VmgTWL0bkLcltPGiI6Yk5Q

which is the BK md5 key for my last kernel revision in BK (2.6.12-rc2). 
Again, these numbers are stable, unlike the simple revisions.

Note that from a usability standpoint, the UUID's look more readable to a 
human, but are actually much worse than the md5 keys (or the SHA1's that 
git uses). At least with a hash, the first few digits are likely to be 
unique, so you can do things like auto-completion (or just short names). 
With the email+date+random number kind of UUID, you don't have that.

(Pure hashes obviously also tend to just all have the same length, and are 
easier to parse automatically, so from a programmatic standpoint they are 
a lot easier too - but the surprising thing is how they are actually 
easier on humans too, even if the UUID's look more readable).
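
The point about short names can be illustrated with a small sketch. It
takes the two bzr revision-ids quoted elsewhere in this thread and
compares how much of each you must type before it is unique. The SHA-1
values here are just hashes of those id strings, used purely as
stand-ins for hash-style names (git hashes the commit contents, not an
id string), and the helper names are hypothetical.

```python
import hashlib


def shortest_unique_prefixes(ids):
    """Map each id to the shortest prefix that no other id in ids shares."""
    result = {}
    for ident in ids:
        for n in range(1, len(ident) + 1):
            prefix = ident[:n]
            if sum(1 for other in ids if other.startswith(prefix)) == 1:
                result[ident] = prefix
                break
    return result


def resolve(ids, prefix):
    """Return the single id starting with prefix, or raise LookupError."""
    matches = [ident for ident in ids if ident.startswith(prefix)]
    if len(matches) != 1:
        raise LookupError("%r matches %d ids" % (prefix, len(matches)))
    return matches[0]


uuid_like = [
    "fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd",
    "fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537",
]
# Stand-in hash names: SHA-1 of the id strings, for illustration only.
hashed = [hashlib.sha1(u.encode("utf-8")).hexdigest() for u in uuid_like]

for ids in (uuid_like, hashed):
    for ident, prefix in shortest_unique_prefixes(ids).items():
        print(len(prefix), "of", len(ident), "characters needed")
```

For the email+date style ids, everything up to the minutes of the
timestamp is shared, so the unique part only starts well into the
string. With hashes the differing characters are expected to show up
near the front.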

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 14:57                                         ` Tim Webster
@ 2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-20  3:14                                             ` Tim Webster
  2006-10-20 10:44                                             ` Jakub Narebski
  2006-10-19 16:14                                           ` Matthieu Moy
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-19 15:30 UTC (permalink / raw)
  To: Tim Webster
  Cc: Matthieu Moy, Christian MICHON, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Webster wrote:
> First I want to say every SCM I know of sucks when it comes to tracking
> configurations, simply because they don't record or restore file metadata,
> like perms, ownership, and acl.

Arch supports that kind of metadata.

I believe SVN supports recording arbitrary file properties, so it's just
a matter of applying those properties to the tree.

> Somethings I like the SCM tools to handle. Personally I would like the
> SCM tools to be platform independent. This would ensure that correct
> things happening on ext3 mounted on windows.
> I don't think execute bit belongs in the basic file inventory information.

Our choices have been predicated on producing the best SCM we can for
the purpose of developing software.  We find that the execute bit is
very useful for build scripts and other incidental scripts.

The other attributes didn't seem useful for software development, so
they're not part of the baseline.

> Collaborative document editing and white boarding are other requirements.
> odf and svg are xml file formats. I would like to see an efficient
> xml diff as part of the SCM core. Using mime types SCM tools can unzip
> files, bundles, and use mime type information to the SCM core xml
> diff, plain diff
> as required.

An XML diff/patch or merge will not handle ODF properly.  There's too
much extra semantic information.

> I think it is essential that the SCM core include
> previsions for multiple
> repo partners.

You mean multiple merge sources?

> It would be nice if the SCM tools included rss feeds for communicating zip
> patch bundles.

The bzr "webserve" plugin provides rss feeds.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFN5oB0F+nu1YWqI0RAjSoAJ9xrZtSrZpVVoz6qAf/sZnd/StsUACfenqX
6bemNgMSbhtL0JjIlvulrb4=
=bSpK
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
@ 2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
  2006-10-20 10:40                                       ` Jakub Narebski
  2 siblings, 0 replies; 806+ messages in thread
From: Ramon Diaz-Uriarte @ 2006-10-19 15:45 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Christian MICHON, bazaar-ng, git

On 10/19/06, Andreas Ericsson <ae@op5.se> wrote:
> Christian MICHON wrote:
> > close to 200 post on bzr-git war!
> > is this the right place (git mailing list) to discuss about future
> > features of bzr ?
> >
>
> Perhaps not, but the tone is friendly (mostly), the patience of the
> bazaar people seems infinite and lots of people seem to be having fun
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as
> better tools. If there would never be any cross-pollination, git
> wouldn't have half the features it has today.
>

I fully agree with Andreas: I am just a bzr user (not even a bzr
developer) and when looking for a decentralized VCS I also looked at
git and a few others. I think I am learning quite a bit about bzr,
git, and VCS in general.

R.

> --
> Andreas Ericsson                   andreas.ericsson@op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
>
>


-- 
Ramon Diaz-Uriarte
Statistical Computing Team
Structural Biology and Biocomputing Programme
Spanish National Cancer Centre (CNIO)
http://ligarto.org/rdiaz

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 11:46                                       ` Petr Baudis
@ 2006-10-19 16:01                                         ` Matthew D. Fuller
  2006-10-19 17:06                                           ` Matthew D. Fuller
  0 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 16:01 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Karl Hasselström, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 19, 2006 at 01:46:39PM +0200 I heard the voice of
Petr Baudis, and lo! it spake thus:
> 
> Does Bazaar support those?  (I can't really say it's a defect if it
> doesn't...)

By default, merge will refuse to do its thing if there are uncommitted
changes in the working tree, whether those changes are something
you've done, or the pending results of a previous merge.  A '--force'
arg to merge will make it go forward though, so yes, you can merge
multiple other branches in one merge if you want to.

Actually, I can kill 2 birds here.  Quick little bictopus merge:

% bzr log --show-ids
------------------------------------------------------------
revno: 2
revision-id: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:18:56 -0500
message:
  merge
    ------------------------------------------------------------
    revno: 1.2.1
    merged: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    committer: Matthew Fuller <fullermd@over-yonder.net>
    branch nick: b
    timestamp: Thu 2006-10-19 10:18:00 -0500
    message:
      bar
    ------------------------------------------------------------
    revno: 1.1.1
    merged: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    committer: Matthew Fuller <fullermd@over-yonder.net>
    branch nick: c
    timestamp: Thu 2006-10-19 10:18:07 -0500
    message:
      baz
------------------------------------------------------------
revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
committer: Matthew Fuller <fullermd@over-yonder.net>
branch nick: a
timestamp: Thu 2006-10-19 10:14:37 -0500
message:
  Foo


(I'll refer to revids by the last segment)

Note that this also shows the "left-most" parent distinction.  The
"left-most" parent of revno 2 (c3b406b8bcdfb537) is revno 1
(5b99dff6ed1d76cd), because that's the last thing I did in THIS
branch.  That's my 'mainline'; the commits from branch b
(2fe41e4949f5e237) and c (3d7047e387edcad9) are then additional
parents of the merge at revno 2.
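
To make the left-most rule concrete, here is a small Python sketch. It is
purely illustrative and is not bzr's implementation: the PARENTS table and
the mainline() helper are invented for this example, and only the revids
are taken from the log above. Each revision stores its parents in order,
and the mainline is whatever you reach by following the first parent from
the tip.

```python
# Hypothetical in-memory revision graph: revid -> ordered list of parents.
# The first entry in each list is the "left-most" parent.
PARENTS = {
    "5b99dff6ed1d76cd": [],
    "2fe41e4949f5e237": ["5b99dff6ed1d76cd"],
    "3d7047e387edcad9": ["5b99dff6ed1d76cd"],
    "c3b406b8bcdfb537": [
        "5b99dff6ed1d76cd",  # left-most: previous tip of branch a
        "2fe41e4949f5e237",  # merged from branch b
        "3d7047e387edcad9",  # merged from branch c
    ],
}


def mainline(parents, tip):
    """Return the mainline, oldest first, by following first parents only."""
    line = []
    rev = tip
    while rev is not None:
        line.append(rev)
        rev_parents = parents[rev]
        rev = rev_parents[0] if rev_parents else None
    line.reverse()
    return line


if __name__ == "__main__":
    for revno, revid in enumerate(mainline(PARENTS, "c3b406b8bcdfb537"), 1):
        print(revno, revid)
```

Under these assumptions the mainline of branch a contains only
5b99dff6ed1d76cd and c3b406b8bcdfb537; the revisions from b and c are in
the graph but off the mainline.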

The graph for branch a now looks something like (calling the 3
original commits 'a', 'b', and 'c' and the merge rev 'D'):

  a-.
  |\ \
  | b c
  |/ /
  D-'


The 2fe41e4949f5e237 rev is on branch b's mainline forever, and it has
a single-digit revno (2 in this case) on branch b, but it's not on
mine in a.  Now, let's pretend we're branch b, and we want to pick up
from a.  Because a is a superset of b, we could pull ('fast-forward')
a.  If we do that, the graph in b will be identical to a (and so 'log'
will be too).  That, AIUI, is what you'd do in git.

In the bzr methodology we've been discussing, where you want to
maintain your branch's identity, you'd instead merge from a into b.
You've got two new revisions to pick up in doing so; the
3d7047e387edcad9 from branch c, and the merge rev c3b406b8bcdfb537;
you already have 2fe41e4949f5e237 on your mainline.  So, post-merge,
the log for b will look like (somewhat trimmed for space):


------------------------------------------------------------
revno: 3
revision-id: fullermd@over-yonder.net-20061019153827-78d6209cd0f5f2f7
parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
branch nick: b
    ------------------------------------------------------------
    revno: 1.1.1
    merged: fullermd@over-yonder.net-20061019151856-c3b406b8bcdfb537
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    parent: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
    parent: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    branch nick: a
    ------------------------------------------------------------
    revno: 1.2.1
    merged: fullermd@over-yonder.net-20061019151807-3d7047e387edcad9
    parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
    branch nick: c
------------------------------------------------------------
revno: 2
revision-id: fullermd@over-yonder.net-20061019151800-2fe41e4949f5e237
parent: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
branch nick: b
------------------------------------------------------------
revno: 1
revision-id: fullermd@over-yonder.net-20061019151437-5b99dff6ed1d76cd
branch nick: a


The 2fe41e4949f5e237 which was originally on b's mainline is still on
the mainline at revno 2.  The graph in b now looks like (adding the
new 'E' merge commit)[0]:

  a-.
  |\ \
  b c |
  |\|/
  | D
  |/ 
  E


Now, the question of whether that merge commit E is really necessary,
when you could just attach D to the end of the graph and create
something like:

  a-.
  |\ \
  b c |
  |/ /
  D-'

is perhaps a useful question (and one that there's obviously
disagreement on).  And it may be a fruitful one to discuss, if we're
not way off in the weeds already.  But, it's also not QUITE the same
question as "Is the left-vs-other path distinction meaningful and to
be preserved?"



[0] For reference at this point:
    a: 5b99dff6ed1d76cd
    b: 2fe41e4949f5e237
    c: 3d7047e387edcad9
    D: c3b406b8bcdfb537
    E: 78d6209cd0f5f2f7


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 13:44                                           ` Matthieu Moy
@ 2006-10-19 16:03                                             ` Carl Worth
  2006-10-19 16:38                                               ` Matthieu Moy
  0 siblings, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-19 16:03 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Petr Baudis, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 5872 bytes --]

On Thu, 19 Oct 2006 15:44:34 +0200, Matthieu Moy wrote:
> > The lack of parents ordering in Git is directly connected with
> > fast-forwarding.

Yes. We're identifying the core underlying technical difference behind
the recent discussion. Namely bzr treats one parent as special, (the
parent that was the branch tip previously). And this special treatment
eliminates the ability to fast-forward, adds merge commits that
wouldn't exist with fast forwarding, and is able to make its revision
numbers a bit more stable as a consequence.

> >    a       a
> >   / \     / \
> >  c   b   c   b
> >   \ /     \ /
> >    m       m
>
> Yes, bzr has similar thing too. AIUI, the difference is that git does
> it automatically, while bzr has two commands in its UI, "merge" and
> "pull".

There's a bit more to it than that though. The git command named
"pull" will perform a fast-forward if possible, but will create a
merge commit if necessary. For example:

        a       a                      a
        | pulls | and fast-forwards to |
        b       b                      b
                |                      |
                c                      c

whereas:

        a       a                       a
        | pulls | and creates a merge  / \
        b       c                     b   c
                                       \ /
                                        m

So I'm curious. What does bzr pull do in the case of divergence like
this? (And this is the "numbers will be changed" case, by the way).

> In your case, the "leftmost ancestor" of m is b, because at the time
> it was created, it was commited from b.

It should be mentioned that git can, (annoyingly not by default), save
a file detailing the history of a branch, (time and revision ID for
every time the branch tip moved). This is the "reflog" support and
provides the same information that bzr is encoding in its "leftmost
ancestor" branches.

Importantly, though, git's reflog is entirely local and is not
propagated by push/pull etc.

> One problem with that approach is that from revision m and looking
> backward in history (say, running "bzr log"), you have two ways to go
> backward:
>
> 1) Take the history of _your_ commits, and your pull till the point
>    where you've branched.
>
> 2) Follow the history taking the leftmost ancestor at each step.

Uhm, don't you really have to follow both? And the only ambiguity is
which one you see first?

>              In your scenario, repo1 would get a revision history of
> "a c m" while repo2 would have had "a b m" with the same tip.

OK. With git the two reflogs on the two machines would also have "a c
m" and "a b m". But is this the only kind of log that exists? If I
had code history as above and wanted to ask questions about what led
to commit m, then I would want to know about both b and c which
contribute to it.

And that's what "git log" provides. It lists all the commits that are
reachable from a given commit by following parent links. Surely bzr
has a way to view the complete history that way?
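
As a rough sketch of what "reachable by following parent links" means
(this is not git's code; the PARENTS table and the reachable() helper are
made up for illustration, using the a/b/c/m example from above):

```python
from collections import deque

# Hypothetical graph for the example above: commit -> list of parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "m": ["b", "c"],
}


def reachable(parents, start):
    """Return every commit reachable from start, in breadth-first order."""
    visited = set()
    order = []
    queue = deque([start])
    while queue:
        commit = queue.popleft()
        if commit in visited:
            continue
        visited.add(commit)
        order.append(commit)
        # Every parent is followed; no parent is treated as special.
        queue.extend(parents[commit])
    return order


if __name__ == "__main__":
    print(reachable(PARENTS, "m"))
```

The traversal order here is plain breadth-first and is only an assumption
of the sketch; the point is that both b and c show up when starting from m.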

Meanwhile, I suggest that there really is no significance to which
parent of a commit used to have the branch head pointing at it. Saving
that information as part of the history is saving it in the wrong
place. It forces the user to have to be careful about which direction
merges happen, leading to awkward command sequences as demonstrated
above, (or daemons to hide them). And in the end, it's just not
important information to have saved in the permanent history.

It is useful in a transient sense to be able to say, (as git reflog
allows), what was my "master" branch pointing at yesterday, (because I
know the code was working before I merged in some bad code this
morning, for instance). But that's a local-only question and will
never have historical significance. "What was cworth's master branch
pointing at on 2006-10-18" is a question that nobody will ever need
the answer to in any historical sense.

-Carl

PS. Here are the commands that show the divergent pull example I gave
above with git:

# Start a new empty repository
$ mkdir git-example; cd git-example
$ git init-db
defaulting to local storage area

# Create initial commit 'a'
$ touch a; git add a; git commit -m "Initial commit of a"
Committing initial tree 496d6428b9cf92981dc9495211e6e1120fb6f2ba

# Create the 'b' commit on a new 'b' branch from 'a'
$ git checkout -b b; touch b; git add b; git commit -m "Add b on branch b"

# Create the 'c' commit on a new 'c' branch from 'a'
$ git checkout -b c master; touch c; git add c; git commit -m "Add c on branch c"

# Checkout the 'master' branch, (which is pointing at 'a')
$ git checkout master

# Merge the 'b' branch, (notice that this is a fast forward)
$ git pull . b
Updating from faf5f2f7363ef5de740193afd89bedee095ef966 to 141811d050aa7008f19867280c41405e05b3dbf7
Fast forward
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 b

# Now merge the 'c' branch (notice that this is not a fast
# forward, but instead creates a new merge commit)
$ git pull . c
Trying really trivial in-index merge...
Wonderful.
In-index merge
 0 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 c

# Show the log of commits reachable from 'master', (all 4 commits)
$ git log
commit 59b3cdaf930824d4c0def4ba7ef9b913fcf05d96
Merge: 141811d... dfc35d5...
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:15:23 2006 -0700

    Merge branch 'c'

commit dfc35d5bd88b22f836bd6f46991169d3c3960b69
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:14:30 2006 -0700

    Add c on branch c

commit 141811d050aa7008f19867280c41405e05b3dbf7
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:14:10 2006 -0700

    Add b on branch b

commit faf5f2f7363ef5de740193afd89bedee095ef966
Author: Carl Worth <cworth@raht.cworth.org>
Date:   Thu Oct 19 08:13:53 2006 -0700

    Initial commit of a

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 14:55                                                         ` Linus Torvalds
@ 2006-10-19 16:07                                                           ` Jan Harkes
  2006-10-19 16:48                                                             ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Jan Harkes @ 2006-10-19 16:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Thu, Oct 19, 2006 at 07:55:18AM -0700, Linus Torvalds wrote:
> On Wed, 18 Oct 2006, Junio C Hamano wrote:
> >
> > Linus Torvalds <torvalds@osdl.org> writes:
> > >
> > > Actually, I've hit an impasse.
> > >
> > > So there's _another_ way of fixing a thin pack: it's to expand the objects 
> > > without a base into non-delta objects, and keeping the number of objects 
> > > in the pack the same. But _again_, we don't actually know which ones to 
> > > expand until it's too late.
> > 
> > pack-objects.c::write_one() makes sure that we write out base
> > immediately after delta if we haven't written out its base yet,
> > so I suspect if you buffer one delta you should be Ok, no?
> 
> It doesn't matter. I realized that my bogus patch to unpack-objects was 
> more seriously broken anyway: even the "un-deltify every single object" 
> was broken. And that's despite the fact that I _tested_ it, and verified 
> the end result by hand.
> 
> Why? Because I tested it within one repo, by just piping the output of 
> git-pack-objects --stdout directly to the repacker. That seemed to be a 
> good way to test it without setting up anything bigger. But it turns out 
> that it misses one of the big problems: if you don't unpack the objects in 
> a way that later phases can read, none of the streaming code works at all, 
> and you have to buffer up _everything_ in memory just to be able to read 
> any previous _non_delta objects too.

You are correct that it is not possible to create a pack with all
objects expanded in a single pass. But that doesn't mean that a single
pass conversion to a full pack is impossible.

If we find a delta against a base that is not found in our repository we
can keep it as a delta, the base should show up later on in the
thin-pack. Whenever we find a delta against a base that we haven't seen
in the received part of the thin pack, but is available from the
repository we should expand it because there is a chance we may not see
this base in the remainder of the thin-pack.

> So my patch-series works - but it only works in a repo that already has 
> all the objects in question, because then it can look up the objects in 
> the original database. Which makes it useless. Duh.

About that patch series, is there a simple way to import the series into
a local repository? git-am doesn't like it, even after splitting it into
separate files on the linebreaks. I guess git-mailinfo could be taught
to recognise the git-log headers. Or have I missed some useful git apply
trick.

Jan

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 15:25                                   ` Linus Torvalds
@ 2006-10-19 16:13                                     ` Matthew D. Fuller
  2006-10-19 16:49                                       ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 16:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Carl Worth, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

On Thu, Oct 19, 2006 at 08:25:26AM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> The biggest difference seems to be that in bzr, the final checksum
> is 64-bit,

Actually, as best I know, it's not a checksum, just random bits (a
quick glance at the code seems to agree with me).


> Note that from a usability standpoint, the UUID's look more readable
> to a human, but are actually much worse [...]

This I agree with, at least in part.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 14:57                                         ` Tim Webster
  2006-10-19 15:30                                           ` Aaron Bentley
@ 2006-10-19 16:14                                           ` Matthieu Moy
  2006-10-20  3:40                                             ` Tim Webster
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19 16:14 UTC (permalink / raw)
  To: Tim Webster; +Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git

"Tim Webster" <tdwebste@gmail.com> writes:

> First I want to say every SCM I know of sucks when it comes to tracking
> configurations, simply because they don't record or restore file metadata,
> like perms, ownership, and acl.

That's not a simple matter.

Tracking ownership hardly makes sense as soon as you have two
developers on the same project. What does it mean to checkout a file
belonging to user foo and group bar on a system not having such user
and group?

Just restoring the complete user/group/other rwx permission is already
a mess. In my experience (GNU Arch did this):

1) It sucks ;-). Me working with umask 022 so that my colleagues can
   "cp -r" from me, working on a project with people having umask 077,
   I got some files not readable, some yes, well, a mess. *I* have set
   my umask, and *I* want my tools to obey.

2) It's a security hole. If you work with people having umask=002 (not
   indecent if your default group contains just you), you end up with
   group-writable files in your ${HOME}.

That said, it can be interesting to have it, but disabled by default.

The 'x' bit, OTOH, is definitely useful.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 16:03                                             ` Carl Worth
@ 2006-10-19 16:38                                               ` Matthieu Moy
  2006-10-20 11:24                                                 ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19 16:38 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, Petr Baudis, git

Carl Worth <cworth@cworth.org> writes:

> Yes. We're identifying the core underlying technical difference behind
> the recent discussion. Namely bzr treats one parent as special, (the
> parent that was the branch tip previously). And this special treatment
> eliminates the ability to fast-forward, 

No.

bzr could trivially do fast-forward too. It's an explicit design
decision to have two separate commands.

> adds merge commits that wouldn't exist with fast forwarding,

They don't exist either with "pull".

The difference between bzr and git is smaller than you think on this
point I believe.

> There's a bit more to it than that though. The git command named
> "pull" will perform a fast-forward if possible, but will create a
> merge commit if necessary. For example:

The bzr command "pull" will do a fast-forward if possible, but will
refuse to continue and ask you to create the merge commit with other
commands if necessary.

> 	a       a                      a
> 	| pulls | and fast-forwards to |
> 	b       b                      b
> 	        |                      |
> 	        c                      c

Same as bzr.

> whereas:
>
>         a       a                       a
>         | pulls | and creates a merge  / \
>         b       c                     b   c
>                                        \ /
>                                         m

Here, bzr will refuse to pull. It will say "branches have diverged"
and tell you to use merge.

Then, you'll do

$ bzr merge

# optionally "bzr status"

$ bzr commit -m "merged such or such thing"


So, "git pull" seems roughly equivalent to something like

$ bzr pull || (bzr merge; bzr commit -m merge)
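
The same comparison as a small Python sketch. It is only a model of the
behaviour described in this thread, not code from either tool; the pull()
and ancestors() helpers, the "style" argument and the two example graphs
are all invented for illustration.

```python
def ancestors(parents, rev):
    """Return the set of revisions reachable from rev, including rev."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def pull(parents, local_tip, remote_tip, style):
    """Model what 'pull' does for style 'git' or 'bzr' (as described above)."""
    if remote_tip in ancestors(parents, local_tip):
        return ("up-to-date", local_tip)
    if local_tip in ancestors(parents, remote_tip):
        # Remote history is a superset of ours: just move the tip.
        return ("fast-forward", remote_tip)
    # Histories have diverged.
    if style == "git":
        # A merge commit with both tips as parents gets created.
        return ("merge", (local_tip, remote_tip))
    # bzr: "branches have diverged", the user runs merge + commit by hand.
    return ("refuse", local_tip)


# First example: a-b locally, a-b-c remotely.
LINEAR = {"a": [], "b": ["a"], "c": ["b"]}
# Second example: a-b locally, a-c remotely.
DIVERGED = {"a": [], "b": ["a"], "c": ["a"]}

if __name__ == "__main__":
    print(pull(LINEAR, "b", "c", "git"))
    print(pull(DIVERGED, "b", "c", "git"))
    print(pull(DIVERGED, "b", "c", "bzr"))
```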

> So I'm curious. What does bzr pull do in the case of divergence like
> this? (And this is the "numbers will be changed" case, by the way).

Not yet. The "numbers will be changed" is if b pulls, right after.


Then, one other difference is in the UI. bzr shows you commits in a
kind of hierarchical manner, like this (a made-up example, not the real
exact format):

$ bzr log
committer: upstream@maintainer.com
message:
  merged the work on a feature
  ------
  committer: contributor@site.com
  message:
    prepared for feature X
  ------
  committer: contributor@site.com
  message:
    implemented feature X
  ------
  committer: contributor@site.com
  message:
    added testcase for feature X
------
committer: upstream@maintainer.com
message:
  something else

No big difference in the model either, but it probably reveals a
different vision of what "history" means.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 16:07                                                           ` Jan Harkes
@ 2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 16:48 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Junio C Hamano, git



On Thu, 19 Oct 2006, Jan Harkes wrote:
> 
> If we find a delta against a base that is not found in our repository we
> can keep it as a delta, the base should show up later on in the
> thin-pack. Whenever we find a delta against a base that we haven't seen
> in the received part of the thin pack, but is available from the
> repository we should expand it because there is a chance we may not see
> this base in the remainder of the thin-pack.

Yes, indeed. We can also have another heuristic: if we find a delta, and 
we haven't seen the object it deltas against, we can still keep it as a 
delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know 
that the base object has to be there later in the pack (or we have a 
dangling delta, which we'll just consider an error).
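
A minimal Python sketch of the combined heuristic, under one big
simplifying assumption: every entry's own object id is already known when
it arrives in the stream. The classify() function, its argument layout
and the decision labels are all hypothetical and not taken from any git
source file.

```python
def classify(pack_entries, repo_objects):
    """Decide, in one pass, what to do with each entry of a thin pack.

    pack_entries: list of (object_id, base_id or None) in stream order.
    repo_objects: set of object ids already in the local repository.
    """
    seen = set()     # objects already received from this pack
    pending = set()  # bases we are counting on to show up later
    decisions = []
    for obj, base in pack_entries:
        if base is None:
            decisions.append((obj, "full"))
        elif base in seen:
            decisions.append((obj, "keep-delta"))
        elif base in repo_objects:
            # The base may never appear in the pack, so expand now.
            decisions.append((obj, "expand"))
        else:
            # We don't have the base, so it has to come later in the pack.
            decisions.append((obj, "keep-delta"))
            pending.add(base)
        seen.add(obj)
        pending.discard(obj)
    if pending:
        raise ValueError("dangling delta, missing bases: %s" % sorted(pending))
    return decisions


if __name__ == "__main__":
    entries = [("A", None), ("B", "A"), ("C", "X"), ("D", "Y"), ("Y", None)]
    for decision in classify(entries, repo_objects={"X"}):
        print(decision)
```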

So yeah, maybe my patch-series is something we can still save.

However, the thing that makes me suspect that it is _not_ saveable, is 
this:

 - let's assume we have a nice thin pack, with object A B C D (in that 
   order), which is actually a good pack in itself (ie it _might_ be thin, 
   but it's actually self-sufficient)

 - let A be a full object, and B be packed as a delta off A, C as a delta 
   off B, and D as a delta off C.

 - Try to repack it as a streaming thing (the end result _should_ 
   obviously be exactly the same as the input, since it turns out to be 
   self-sufficient)

Looks trivial, no?

The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you 
have to _remember_ all of the actual data for A, B, C and D all the way to 
the end, because only if you have that data in memory can you actually 
_recreate_ B, C and D even enough to get their SHA1's (which you need, 
just in order to know that the pack is complete, much less to be able to 
create a non-delta version in case it hadn't been).

So we can definitely do the one-pass creation, but it requires that we 
keep track of everything we've expanded so far in memory (because we won't 
have the data available any other way - we don't have them as objects in 
our object database, and we don't have a good new pack yet).
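
To illustrate the memory cost, here is a toy Python sketch. The delta
format is a deliberate simplification (a "delta" just appends bytes to its
base, which is not how real pack deltas work), and stream_ids() and the
entry tuples are invented for this example. The one real fact it relies
on is that a blob's id is the SHA-1 of "blob <size>\0" followed by the
content.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """SHA-1 of a blob the way git computes it: header plus content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def stream_ids(entries):
    """Compute object ids for a stream of full and (toy) delta entries.

    entries: ("full", content) or ("delta", base_id, suffix) tuples.
    Returns the ids in stream order plus the contents held in memory.
    """
    in_memory = {}  # id -> full content; grows with every entry
    ids = []
    for entry in entries:
        if entry[0] == "full":
            content = entry[1]
        else:
            _, base_id, suffix = entry
            # The base content must still be around to rebuild this object.
            content = in_memory[base_id] + suffix
        object_id = blob_id(content)
        in_memory[object_id] = content
        ids.append(object_id)
    return ids, in_memory


if __name__ == "__main__":
    a = b"a\n"
    chain = [
        ("full", a),
        ("delta", blob_id(a), b"b\n"),
        ("delta", blob_id(b"a\nb\n"), b"c\n"),
        ("delta", blob_id(b"a\nb\nc\n"), b"d\n"),
    ]
    ids, held = stream_ids(chain)
    print(ids[-1], len(held))
```

By the time D is processed, the expanded contents of A, B and C are all
still sitting in the dictionary, which is the cost described above.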

But if you do that, then yes, it's salvageable.

> About that patch series, is there a simple way to import the series into
> a local repository? git-am doesn't like it, even after splitting it into
> separate files on the linebreaks. I guess git-mailinfo could be taught
> to recognise the git-log headers. Or have I missed some useful git apply
> trick.

No, you've not missed anything. I didn't really expect anybody to want to 
seriously play with it, so I didn't bother to do things properly. 

Especially since I hadn't even written very good commit messages.

Anyway, I just pushed the "rewrite-pack" branch to my git repo on 
kernel.org, so once it mirrors out, if you really want to try to fix up 
the mess I left behind, there it is:

	git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/git.git rewrite-pack

Maybe it's recoverable. 

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 16:13                                     ` Matthew D. Fuller
@ 2006-10-19 16:49                                       ` Linus Torvalds
  2006-10-19 18:30                                         ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 16:49 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Andreas Ericsson, Carl Worth, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Matthew D. Fuller wrote:

> On Thu, Oct 19, 2006 at 08:25:26AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > The biggest difference seems to be that in bzr, the final checksum
> > is 64-bit,
> 
> Actually, as best I know, it's not a checksum, just random bits (a
> quick glance at the code seems to agree with me).

Ahh. They may be that even in BK. I know BK had various 16-bit CRC 
checksums, but they were probably on the actual _file_ contents, not in 
the key itself.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 14:58                                   ` Aaron Bentley
@ 2006-10-19 16:59                                     ` Carl Worth
  2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 17:01                                     ` Carl Worth
  1 sibling, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-19 16:59 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 7913 bytes --]

On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> >> In bzr development, it's very rare for anyone's revision numbers to change.
> >
> > Which just says to me that the bzr developers really are sticking to a
> > centralized model.
>
> I don't see why you're reaching that conclusion.  I'd like to understand
> that better, because Linus seems to be concluding the same thing, and it
> doesn't make sense to me.

First, I want to point out that I think we're having a delightfully
enlightening conversation here, and I'm glad for that.

Let me provide a couple of hypothetical situations to try to
demonstrate my thinking here. The first is far-fetched but perhaps
easier to understand the implications. But the second is the real,
everyday situation that is much more important.

Far-fetched
-----------
Let's imagine there's a complete fork in the bzr codebase tomorrow. We
need not suppose any acrimony, just an amiable split as two subsets of
the team start taking the code in different directions.

Now, at the time of the fork, all published revision numbers apply
equally well to either team's codebase, (obviously, since they are
identical). But as the projects diverge they each start publishing
revision numbers with respect to their own repositories in their own
bug trackers, etc. Obviously, each project has its own "mainline" so
these new revision numbers are only unique within each project and not
between the two.

Time passes...

Finally the two teams (who had remained good friends after the
breakup) find a unifying theory that will let them work on a single
tool that will meet the needs of both user bases. So they want to
merge their code together.

After the merge, there can be only one mainline, so one team or the
other will have to concede to give up the numbers they had generated
and published during the fork. That is, the numbers will not be usable
within the new, merged repository.

Everyday
--------
Now, the above scenario is just silly. It's not likely to ever happen,
so it's really not worth considering as a motivating case.

But, what does (and should) happen everyday is exactly the same. So
here's a realistic situation that is worth considering:

An individual takes the bzr codebase and starts working on it. It's
experimental stuff, so it's not pushed back into the central
repository yet. But our coder isn't a total recluse, so his friends
help him with the code he's working on. They communicate about their
work, (perhaps on the main bzr mailing list), and make statements such
as "feature F is working perfectly as of version V".

But for these communications, revision numbers will not provide
historically stable values that can be used. It's impossible for our
coder to predict the numbers that will be assigned to his code when
they get merged back into the mainline---since some other unknown
programmer may have branched at exactly the same point and is trying
to make the same determination. Neither programmer can know which code
will land first, so neither can know what numbers will get assigned,
right?

Now, the programmers could get stable numbers by keeping the branch in
the main tree, or by at least pushing out the branching point to
"reserve" a number in the main tree.

So, the only way to get stable numbers is to rely on this central
tree.
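
Here is a tiny Python sketch of the situation. It assumes revision
numbers are assigned by counting along the first-parent chain from the
tip, as described elsewhere in this thread; the mainline_revnos() helper
and the commit names "base", "x", "y" and "merge" are invented for
illustration.

```python
def mainline_revnos(parents, tip):
    """Number revisions 1..n along the first-parent chain ending at tip."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    return {rev: number for number, rev in enumerate(reversed(chain), 1)}


# "base" is the branching point, "x" is our coder's experimental commit,
# and "y" is some other programmer's commit that landed first.
PARENTS = {
    "base": [],
    "x": ["base"],
    "y": ["base"],
    "merge": ["y", "x"],  # the main tree merges x after y has landed
}

coder_view = mainline_revnos(PARENTS, "x")
main_view = mainline_revnos(PARENTS, "merge")

if __name__ == "__main__":
    print("coder:", coder_view)
    print("main: ", main_view)
```

In the coder's branch "x" is revision 2, but in the main tree revision 2
is "y" and "x" has no number on the mainline at all.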

Does that make sense?

> That doesn't follow.  Just because something is arguably true doesn't
> make it bad.  And in this case, I'm not arguing that it's true, I'm
> saying that it's true, because that is what my experience tells me is true.

[I'm sorry, but I didn't grasp this sentence. I think I lost the
antecedent of "it" somewhere.]

> > In cairo, for example, we've made a habit of including a revision
> > identifier in our bug tracking system for every commit that resolves a
> > bug.
>
> We do it the other way around: we put a bug number in the commit
> message.

Oh, we do that too. That number is important, (for "what the heck is
this commit trying to do, and why", since (sadly) much of the why ends
up getting stuck off in external bug tracking tools). But the reverse
direction is also important, ("Hey, this bug got fixed in the
development version, but I want to backport it to my distribution
package. Where can I find it?").

>          And I personally have been developing a bugtracker that is
> distributed in the same way bzr is; it stores bug data in the source
> tree of a project, so that bug activities follow branches around.

That kind of thing sounds very useful. As I've been talking about
"numbers" here in bug trackers and mailing lists, it should be obvious
that I consider the information stored in such systems an important
part of the history of a code project. So it would be nice if all of
that history were stored in an equally reliable system in some way.

> On the other hand, I think your revision identifiers are not as
> permanent as you think.
>
> In the first place, it seems fairly common in the Git community to
> rebase.  This process throws away old revisions and creates new
> revisions that are morally equivalent[1].

Yes, rebasing does "destroy history" in one sense, (in actual fact, it
creates new commits and leaves the old ones around, which may or may
not have references to them anymore). But it's definitely not common
for git users to use rebase in a situation where it would change any
published number.

For example, I regularly use git-rebase, (and similar "git-commit
 --amend"), as I'm putting together a new branch that exists only
in a repository on my laptop with nobody having external visibility to
it.

So, if I see a typo in a commit and I've never pushed it anywhere,
I'll just "git commit --amend" to fix it. But if I see that typo only
after I push out the change, then I just make a new commit to fix it,
(and suck up the fact that my mistake will be a permanent part of the
history).

And git helps with this as well. If I ever forget that I've already
pushed a change and then I rebase, then the next time I try to push,
git will complain that I'm attempting to throw away history on the
remote end, and will refuse to cooperate, (unless I force it).

There's a similar safety mechanism on the pull side. If I did force a
history-rewriting push, then users who tried to pull it would also
have to force git's hand before it would rewrite their history.

[By the way, it is sometimes useful to make chaotic, regularly-rebased
branches visible to others, so they can watch what's going on. (Junio
does this with his "proposed updates (pu)" branch in his repository
for git itself, for example). It's just that such branches should
never be used to start new development if they expect to pull from the
branch again later, nor should the revision numbers of such a branch
ever be considered permanent, nor published anywhere.]

> In the second place, one must consider the "nuclear launch codes"
> scenario.

Sure. And git does provide tools that can do this. Of course, the
"normal" tools strictly add new commits and move branches (which are
no more than references to commits) around. But moving branches can
leave commits unreferenced. And a "prune" command does exist, (which
isn't needed in "normal" use), which will delete unreferenced objects.

-Carl

> [1] This is a process that I find discomforting, because I consider the
> original revisions to be real, historical data, and I don't like the
> idea of throwing it away.

As I mentioned above, they aren't thrown away. I often use rebase when
re-building an ugly series of patches into a nice clean set of
patches. And in that situation, I might rebase from the old to the
new, but still with a reference to the old branch until I'm done with
the entire process. And it's perfectly possible, and legitimate that
such a reference has been published and the old branch will live
"forever" even if I rebased it. So rebase isn't necessarily
destructive.



* Re: VCS comparison table
  2006-10-19 14:58                                   ` Aaron Bentley
  2006-10-19 16:59                                     ` Carl Worth
@ 2006-10-19 17:01                                     ` Carl Worth
  2006-10-19 17:14                                       ` J. Bruce Fields
  1 sibling, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-19 17:01 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git


On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> >> In bzr development, it's very rare for anyone's revision numbers to change.
> >
> > Which just says to me that the bzr developers really are sticking to a
> > centralized model.
>
> I don't see why you're reaching that conclusion.  I'd like to understand
> that better, because Linus seems to be concluding the same thing, and it
> doesn't make sense to me.

First, I want to point out that I think we're having a delightfully
enlightening conversation here, and I'm glad for that.

Let me provide a couple of hypothetical situations to try to
demonstrate my thinking here. The first is far-fetched but perhaps
easier to understand the implications. But the second is the real,
everyday situation that is much more important.

Far-fetched
-----------
Let's imagine there's a complete fork in the bzr codebase tomorrow. We
need not suppose any acrimony, just an amiable split as two subsets of
the team start taking the code in different directions.

Now, at the time of the fork, all published revision numbers apply
equally well to either team's codebase, (obviously, since they are
identical). But as the projects diverge they each start publishing
revision numbers with respect to their own repositories in their own
bug trackers, etc. Obviously, each project has its own "mainline" so
these new revision numbers are only unique within each project and not
between the two.

Time passes...

Finally the two teams (who had remained good friends after the
breakup) find a unifying theory that will let them work on a single
tool that will meet the needs of both user bases. So they want to
merge their code together.

After the merge, there can be only one mainline, so one team or the
other will have to concede to give up the numbers they had generated
and published during the fork. That is, the numbers will not be usable
within the new, merged repository.

Everyday
--------
Now, the above scenario is just silly. It's not likely to ever happen,
so it's really not worth considering as a motivating case.

But, what does (and should) happen everyday is exactly the same. So
here's a realistic situation that is worth considering:

An individual takes the bzr codebase and starts working on it. It's
experimental stuff, so it's not pushed back into the central
repository yet. But our coder isn't a total recluse, so his friends
help him with the code he's working on. They communicate about their
work, (perhaps on the main bzr mailing list), and make statements such
as "feature F is working perfectly as of version V".

But for these communications, revision numbers will not provide
historically stable values that can be used. It's impossible for our
coder to predict the numbers that will be assigned to his code when
they get merged back into the mainline---since some other unknown
programmer may have branched at exactly the same point and is trying
to make the same determination. Neither programmer can know which code
will land first, so neither can know what numbers will get assigned,
right?

Now, the programmers could get stable numbers by keeping the branch in
the main tree, or by at least pushing out the branching point to
"reserve" a number in the main tree.

So, the only way to get stable numbers is to rely on this central
tree.

Does that make sense?
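
To make the scenario concrete, here is a toy model in Python. It is
only a sketch: it assumes that landing on the mainline simply hands
out the next sequential number, which is a simplification and not a
description of how bzr actually assigns revision numbers. The names
land, content_id, alice and bob are made up for the illustration.

```python
import hashlib

def land(mainline, branch_commits):
    """Append a branch's commits to the mainline and return the
    sequential (1-based) numbers they receive there."""
    numbers = {}
    for commit in branch_commits:
        mainline.append(commit)
        numbers[commit] = len(mainline)
    return numbers

def content_id(commit):
    """Stand-in for an identifier derived only from the commit itself."""
    return hashlib.sha1(commit.encode("utf-8")).hexdigest()

base = ["a", "b", "c"]          # shared history up to the branch point
alice = ["alice-1", "alice-2"]  # hypothetical private work
bob = ["bob-1"]                 # someone else who branched at the same point

# Case 1: Alice's work lands first.
m1 = list(base)
alice_first = land(m1, alice)
land(m1, bob)

# Case 2: Bob's work lands first.
m2 = list(base)
land(m2, bob)
alice_second = land(m2, alice)

# The number for the very same commit depends on who landed first.
print(alice_first["alice-1"], alice_second["alice-1"])  # 4 5

# A content-derived identifier is the same in both cases.
print(content_id("alice-1") == content_id("alice-1"))   # True
```

In this model the number Alice would have to quote for "alice-1" is
not knowable until the landing order is settled, while the
content-derived identifier can be quoted from the first day.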

> That doesn't follow.  Just because something is arguably true doesn't
> make it bad.  And in this case, I'm not arguing that it's true, I'm
> saying that it's true, because that is what my experience tells me is true.

[I'm sorry, but I didn't grasp this sentence. I think I lost the
antecedent of "it" somewhere.]

> > In cairo, for example, we've made a habit of including a revision
> > identifier in our bug tracking system for every commit that resolves a
> > bug.
>
> We do it the other way around: we put a bug number in the commit
> message.

Oh, we do that too. That number is important, (for "what the heck is
this commit trying to do, and why", since (sadly) much of the why ends
up getting stuck off in external bug tracking tools). But the reverse
direction is also important, ("Hey, this bug got fixed in the
development version, but I want to backport it to my distribution
package. Where can I find it?").

>          And I personally have been developing a bugtracker that is
> distributed in the same way bzr is; it stores bug data in the source
> tree of a project, so that bug activities follow branches around.

That kind of thing sounds very useful. As I've been talking about
"numbers" here in bug trackers and mailing lists, it should be obvious
that I consider the information stored in such systems an important
part of the history of a code project. So it would be nice if all of
that history were stored in an equally reliable system in some way.

> On the other hand, I think your revision identifiers are not as
> permanent as you think.
>
> In the first place, it seems fairly common in the Git community to
> rebase.  This process throws away old revisions and creates new
> revisions that are morally equivalent[1].

Yes, rebasing does "destroy history" in one sense, (in actual fact, it
creates new commits and leaves the old ones around, which may or may
not have references to them anymore). But it's definitely not common
for git users to use rebase in a situation where it would change any
published number.

For example, I regularly use git-rebase, (and similar "git-commit
 --amend"), as I'm putting together a new branch that exists only
in a repository on my laptop with nobody having external visibility to
it.

So, if I see a typo in a commit and I've never pushed it anywhere,
I'll just "git commit --amend" to fix it. But if I see that typo only
after I push out the change, then I just make a new commit to fix it,
(and suck up the fact that my mistake will be a permanent part of the
history).

And git helps with this as well. If I ever forget that I've already
pushed a change and then I rebase, then the next time I try to push,
git will complain that I'm attempting to throw away history on the
remote end, and will refuse to cooperate, (unless I force it).

There's a similar safety mechanism on the pull side. If I did force a
history-rewriting push, then users who tried to pull it would also
have to force git's hand before it would rewrite their history.
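
The safety check described in the two paragraphs above can be
modelled as an ancestry test: an update goes through without force
only if the tip the other side already has is an ancestor of the new
tip. The Python sketch below assumes a history stored as a dict from
commit name to parent list. The commit names and the function names
are hypothetical, and this is not git's actual code.

```python
def ancestors(parents, tip):
    """Return the set of commits reachable from tip, including tip."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen

def is_fast_forward(parents, old_tip, new_tip):
    """True if moving a branch from old_tip to new_tip loses no history."""
    return old_tip in ancestors(parents, new_tip)

parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],    # tip that has already been pushed
    "D": ["C"],    # ordinary follow-up commit on top of C
    "C2": ["B"],   # amended or rebased replacement for C
}

print(is_fast_forward(parents, "C", "D"))   # True: C is kept
print(is_fast_forward(parents, "C", "C2"))  # False: C would be dropped
```

The second case is the one where the tool would refuse unless forced,
because the published commit C is no longer part of the new history.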

[By the way, it is sometimes useful to make chaotic, regularly-rebased
branches visible to others, so they can watch what's going on. (Junio
does this with his "proposed updates (pu)" branch in his repository
for git itself, for example). It's just that such branches should
never be used to start new development if they expect to pull from the
branch again later, nor should the revision numbers of such a branch
ever be considered permanent, nor published anywhere.]

> In the second place, one must consider the "nuclear launch codes"
> scenario.

Sure. And git does provide tools that can do this. Of course, the
"normal" tools strictly add new commits and move branches (which are
no more than references to commits) around. But moving branches can
leave commits unreferenced. And a "prune" command does exist, (which
isn't needed in "normal" use), which will delete unreferenced objects.
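
As a rough illustration of "unreferenced", here is a Python sketch
that models only commits and branch references, and ignores
everything else a real repository contains. The store layout and the
names reachable and prune are assumptions made for the example, not
git internals.

```python
def reachable(parents, refs):
    """Return every commit reachable from any ref by following parents."""
    seen = set()
    stack = list(refs.values())
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def prune(parents, refs):
    """Return a copy of the store without the unreferenced commits."""
    keep = reachable(parents, refs)
    return {c: p for c, p in parents.items() if c in keep}

# Hypothetical history: B and C were rebased into B2 and C2 on top of A.
store = {"A": [], "B": ["A"], "C": ["B"], "B2": ["A"], "C2": ["B2"]}

# While a reference still points at the old tip, nothing is lost.
refs = {"topic": "C2", "topic-old": "C"}
print(sorted(prune(store, refs)))  # ['A', 'B', 'B2', 'C', 'C2']

# Once the old reference is gone, the old commits become prunable.
refs = {"topic": "C2"}
print(sorted(prune(store, refs)))  # ['A', 'B2', 'C2']
```

Moving or deleting a branch only changes refs. The old commits stay in
the store until something like prune is run explicitly.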

-Carl

> [1] This is a process that I find discomforting, because I consider the
> original revisions to be real, historical data, and I don't like the
> idea of throwing it away.

As I mentioned above, they aren't thrown away. I often use rebase when
re-building an ugly series of patches into a nice clean set of
patches. And in that situation, I might rebase from the old to the
new, but still with a reference to the old branch until I'm done with
the entire process. And it's perfectly possible, and legitimate that
such a reference has been published and the old branch will live
"forever" even if I rebased it. So rebase isn't necessarily
destructive.



* Re: VCS comparison table
  2006-10-19 16:01                                         ` Matthew D. Fuller
@ 2006-10-19 17:06                                           ` Matthew D. Fuller
  0 siblings, 0 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-19 17:06 UTC (permalink / raw)
  To: Petr Baudis
  Cc: bazaar-ng, Karl Hasselström, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski

On Thu, Oct 19, 2006 at 11:01:03AM -0500 I heard the voice of
Matthew D. Fuller, and lo! it spake thus:
>
> Now, the question of "is that merge commit E really necessary, when
> you could just attach D to the end of the graph and create something
> like [...]" is perhaps a useful question (and one that there's
> obviously disagreement on).  And it may be a fruitful one to
> discuss, if we're not way off in the weeds already.
>
> But, it's also not QUITE the same question as "Is the left-vs-other
> path distinction meaningful and to be preserved?"

Let me elaborate a little on this.

bzr COULD create

>   a-.
>   |\ \
>   b c |
>   |/ /
>   D-'

instead of

>   a-.
>   |\ \
>   b c |
>   |\|/
>   | D
>   |/ 
>   E

for the previously discussed merge, basically duplicating
'fast-forward' behavior.  It doesn't currently, but it could just as
well without disturbing the attributes it gains from assigning meaning
to the left-most parent.  The choice to create E is the result of an
independent decision from the choice to treat the left path as
special.


What the leftmost discussion impacts is the case of 

    a-.
    |\ \
    | b c
    |/ /
    D-'

vs

    a-.-.
     \ \ \
      b c |
     / / /
    D-'-'

Now, the branches are distinct to bzr, but they're not different.  If
you try to merge one from the other, merge will quite rightly tell you
there's nothing to do, since you both have all the same revs.  git
doesn't recognize the distinction at all, of course.  The difference
is mostly cosmetic.  But, it's a cosmetic difference that bzr devs
(and users, I venture) find _useful_, which is why it's fought for.
And everything else seems to follow from that.

If you don't think the distinction is meaningful or useful, you can
ignore it, and the tool should work just fine.  The main place the
distinction would show up is in the cosmetics of how "log" looks (and
probably similarly in any tool that graphically describes ancestry),
and a custom log output formatter could probably be very easily
written to obviate even that.
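
A small Python sketch of the distinction, reading the last two
pictures as "D has parents a, b, c" versus "D has parents b, c, a"
(that reading, the dict layout, and the function names are
assumptions for the example, not bzr code):

```python
def mainline(parents, tip):
    """Follow the first (left-most) parent from tip back to the root."""
    path = [tip]
    while parents[path[-1]]:
        path.append(parents[path[-1]][0])
    return path[::-1]

def revisions(parents, tip):
    """Return the set of all revisions reachable from tip."""
    seen, stack = set(), [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen

# Same revisions, different parent order on the merge D.
first = {"a": [], "b": ["a"], "c": ["a"], "D": ["a", "b", "c"]}
second = {"a": [], "b": ["a"], "c": ["a"], "D": ["b", "c", "a"]}

print(mainline(first, "D"))   # ['a', 'D']
print(mainline(second, "D"))  # ['a', 'b', 'D']
print(revisions(first, "D") == revisions(second, "D"))  # True
```

The set of revisions is identical, so a merge between the two has
nothing to do. Only the left-most path, and hence what a log that
follows it would show, differs.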


* Re: VCS comparison table
  2006-10-19 17:01                                     ` Carl Worth
@ 2006-10-19 17:14                                       ` J. Bruce Fields
  2006-10-20 14:31                                         ` Jeff King
  0 siblings, 1 reply; 806+ messages in thread
From: J. Bruce Fields @ 2006-10-19 17:14 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Thu, Oct 19, 2006 at 10:01:33AM -0700, Carl Worth wrote:
> On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:
> > On the other hand, I think your revision identifiers are not as
> > permanent as you think.
> >
> > In the first place, it seems fairly common in the Git community to
> > rebase.  This process throws away old revisions and creates new
> > revisions that are morally equivalent[1].
> 
> Yes, rebasing does "destroy history" in one sense, (in actual fact, it
> creates new commits and leaves the old ones around, which may or may
> not have references to them anymore).

Note that the id's are still permanent in this case; they will never
(modulo some assumptions about the crypto) be reused.  So a given id
points at one and only one object, for all time; it's just that we may
forget what that one object is....

> > In the second place, one must consider the "nuclear launch codes"
> > scenario.
> 
> Sure. And git does provide tools that can do this.

So in this case you can certainly lose the launch codes.  But you have
forever granted everyone a way to determine whether a given guess at the
launch codes is correct.  (Again, assuming some stuff about SHA1).
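
Both points can be sketched in a few lines of Python. The id git
gives to a file's contents (a "blob") is the SHA-1 of a short header,
"blob <size>" plus a NUL byte, followed by the bytes themselves. The
secret string and the helper name guess_is_correct below are
hypothetical, and the example assumes the secret was committed as a
file whose blob id is known.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object id git assigns to a blob with this content."""
    header = f"blob {len(content)}\0".encode("ascii")
    return hashlib.sha1(header + content).hexdigest()

def guess_is_correct(guess: bytes, published_id: str) -> bool:
    """Check a guess against an id that was published before the purge."""
    return git_blob_id(guess) == published_id

# The same bytes always give the same id, on any machine.
print(git_blob_id(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a

# Hypothetical secret whose blob id leaked into public history.
leaked_id = git_blob_id(b"launch code: 1234\n")
print(guess_is_correct(b"launch code: 0000\n", leaked_id))  # False
print(guess_is_correct(b"launch code: 1234\n", leaked_id))  # True
```

Deleting the object removes the content, but anyone who kept the id
can still test candidate contents against it.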

--b.


* Re: VCS comparison table
  2006-10-19 16:49                                       ` Linus Torvalds
@ 2006-10-19 18:30                                         ` Linus Torvalds
  2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 19:16                                           ` Junio C Hamano
  0 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 18:30 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Andreas Ericsson, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Linus Torvalds wrote:
> 
> Ahh. They may be that even in BK. I know BK had various 16-bit CRC 
> checksums, but they were probably on the actual _file_ contents, not in 
> the key itself.

Btw, I do believe that bzr seems to be acting a lot like BK, at least when 
it comes to versioning. I suspect that is not entirely random either, and 
I suspect it's been a conscious effort to some degree.

Which is fine, in the sense that there are certainly much worse things to 
try to copy.

That said, at least BK was up-front about the versions changing, and 
didn't try to do anything to hinder it. It still confused some people, and 
it wasn't a great naming system, but it did work.

In the big picture, the version naming between BK and git hasn't been an 
issue for anybody in practice, I suspect.

So if you want to look at features that actually matter more, try out 
something like

	gitk drivers/scsi include/scsi

on the kernel archive (I assume that somebody has tried importing the 
kernel git tree into bzr - quite frankly, if bzr cannot handle that size 
tree without problems, you have much bigger issues!).

In other words, being able to look at history of more than a single file 
has been a _huge_ bonus. 

The other big difference is being able to do merges in seconds. The 
biggest cost of doing a big merge these days seems to literally be 
generating the diffstat of the changes at the end (which is purely a UI 
issue, but one that I find so important that I'll happily take the extra 
few seconds for that, even if it sometimes effectively doubles the 
overhead).

Looking at the dates of the merges yesterday, they're literally half a 
minute apart, and that's not me _scripting_ them - that's me actually 
looking up the emails, typing in the "git pull " and pasting the source 
repository, and git fetching the data over the network and merging it, and 
checking out the result (and me verifying that the resulting diffstat 
matches what the email says). Doing four of those in a row in less than 
two minutes is actually a really big deal.

At some point, "performance" is just more than a question of how fast 
things are, it becomes a big part of usability.

			Linus


* Re: VCS comparison table
  2006-10-19 18:30                                         ` Linus Torvalds
@ 2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 20:47                                             ` Linus Torvalds
  2006-10-19 23:28                                             ` Ryan Anderson
  2006-10-19 19:16                                           ` Junio C Hamano
  1 sibling, 2 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-19 18:54 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: bazaar-ng, Matthew D. Fuller, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> Btw, I do believe that bzr seems to be acting a lot like BK, at least when 
> it comes to versioning. I suspect that is not entirely random either, and 
> I suspect it's been a conscious effort to some degree.
>
> Which is fine, in the sense that there are certainly much worse things to 
> try to copy.

Out of curiosity, how would you compare git and Bitkeeper, on a purely
technical basis? (not asking for a detailed comparison, but an "X is
globally/much/terribly/not better than Y" kind of statement ;-) )


* Re: VCS comparison table
  2006-10-17  4:31           ` Aaron Bentley
@ 2006-10-19 19:01             ` Nathaniel Smith
  2006-10-20 10:32               ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Nathaniel Smith @ 2006-10-19 19:01 UTC (permalink / raw)
  To: git

Aaron Bentley <aaron.bentley <at> utoronto.ca> writes:
> Bazaar also supports multiple unrelated branches in a repository, as
> does CVS, SVN (depending how you squint), Arch, and probably Monotone.

It's quite common in Monotone.  You could probably do it in Mercurial as well,
though I don't know that anyone does.  SVK definitely does it (since each user
has a single repo that's shared by all the projects they work on).

Trivia-ly yours,
-- Nathaniel


* Re: VCS comparison table
  2006-10-19 18:30                                         ` Linus Torvalds
  2006-10-19 18:54                                           ` Matthieu Moy
@ 2006-10-19 19:16                                           ` Junio C Hamano
  2006-10-20 10:51                                             ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-19 19:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> The other big difference is being able to do merges in seconds. The 
> biggest cost of doing a big merge these days seems to literally be 
> generating the diffstat of the changes at the end (which is purely a UI 
> issue, but one that I find so important that I'll happily take the extra 
> few seconds for that, even if it sometimes effectively doubles the 
> overhead).

An interesting side effect of this is that when people have a column
for merge performance in an SCM comparison table, they would include
the time to run the diffstat as part of the time spent merging
when they fill in the number for git, but not for any other SCM.

I know you won't misunderstand me but for the sake of others, I
should add this: I am not saying diffstat should be optional.


* Re: VCS comparison table
  2006-10-19 18:54                                           ` Matthieu Moy
@ 2006-10-19 20:47                                             ` Linus Torvalds
  2006-10-21  5:49                                               ` Junio C Hamano
  2006-10-19 23:28                                             ` Ryan Anderson
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-19 20:47 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Matthew D. Fuller, Andreas Ericsson, Carl Worth, bazaar-ng, git,
	Jakub Narebski




On Thu, 19 Oct 2006, Matthieu Moy wrote:
> 
> Out of curiosity, how would you compare git and Bitkeeper, on a purely
> technical basis? (not asking for a detailed comparison, but an "X is
> globally/much/terribly/not better than Y" kind of statement ;-) )

I think git is better for kernel work these days, but a large portion of 
that is that a lot of the features have literally been tweaked for us (for 
very obvious reasons).

For example, the whole "rebase" thing (or explicitly making cherry-picking 
easy) is something that a number of kernel people do, and even if I have 
to admit to not liking the practice very much (it kind of hides the "true" 
development history), it does have huge advantages, and it makes history a 
lot easier to read.

Similarly, I often used the single-file graphical history viewing in BK 
("revtool"), but being able to follow the history of multiple files as one 
"entity" really is something that once you get used to, it's really really 
hard going back, and "gitk" does generate a much more readable graph.

And I think the git way of doing branches is just simply superior. Git 
always did branches in the sense that the way merges happened you _always_ 
had several heads, but actually making them available and switching 
between them was something that wasn't my idea, and that I even was a bit 
apprehensive about. I was wrong. Git branches are branches done right. I 
just don't see how you _could_ do them better.

That said, a lot of the features I like and _I_ consider really important 
are possibly not that important to others. For example, maybe nobody else 
really cares about viewing the history of a particular subsystem, the way 
I do. For a lot of people, single-file is probably ok. 

For example, while git now does "annotate" (or "blame"), it's not 
lightning fast, and I simply don't care. Doing a

	git blame kernel/sched.c

takes about three seconds for me, and that's on a pretty good machine (and 
on the kernel tree, which for me is always in the cache ;). Quite frankly, 
if I cared deeply about that kind of annotation, I'd probably be upset 
about it. There are basically _no_ other git operations that take that 
long. I can get the _full_ log of the last 18 months of the kernel much 
faster than that.

And the slowness of annotate comes directly from the design of git, and 
from the fact that it's not how I tend to look at changes. Rather than 
doing "git blame kernel/sched.c", I'm _much_ more likely to just do

	git log -p kernel/sched.c

and see the changes as individual patches instead (and perhaps search for 
some pattern that I'm looking for by just literally using a regex in the 
pager).

Also, the fact that you need to repack the archive every once in a while 
doesn't disturb me. I probably end up repacking the kernel almost daily, 
which is _waay_ excessive, but it's just become habit of mine. I've seen 
people who really don't like it, and I've also seen people who apparently 
never even realized that they should do an occasional "git repack -a -d", 
and then they have hundreds of thousands of loose objects and wonder why 
the performance is so bad ;)

BK never had these issues. BK always kept things "packed", which made a 
lot of operations much slower ("bk undo" was painfully slow). BK could 
annotate quickly, since it was really a file-based history, in a way that 
git fundamentally isn't, and can never be (and I don't _want_ it to be, 
but it means that "annotate" is slow).

And BK had some great tools. The merge tool was superior ("bk resolve"? I 
forget). The patch-application tool was great.

But both of those tools are things that git doesn't have, for _another_ 
reason: the way git works, you don't really need them. For example, the 
patch application tool was great, but the biggest reason it was needed in 
the first place was tracking renames explicitly.

In that kind of environment, you have serious problems with patches, and 
you actually _need_ a tool to let the user explain when something is a 
rename and when it isn't. With git not tracking renames, the patch 
application tool simply isn't needed.

The same goes to some degree for "bk resolve". Because git has the index,
and you can _leave_ things unresolved in the index, you don't need a 
graphical tool to resolve things - git knows very fundamentally about 
incomplete merges _and_ about multiple branches (which you need in order 
to keep track of both the branch you merge from and the branch you merge 
into), and it's fine to resolve any conflicts in the normal working tree.

So for at least _my_ usage, git does everything very well, but that's 
because if it didn't fit me, I fixed it until it did. 

And "git bisect" really does rock. I still cannot believe that apparently 
nobody did it before us. It's such a useful thing, and it works so well in 
unambiguous cases (and not all cases are that unambiguous, but an 
appreciably large subset is).
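
The idea behind bisection is an ordinary binary search over history.
Here is a minimal Python sketch, assuming a strictly linear history
with a known-good oldest commit and a known-bad newest commit. The
commit names and the is_bad test are hypothetical, and real history
is a graph rather than a list, so this is not how "git bisect" itself
is implemented.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit in a list ordered oldest to newest.

    Assumes commits[0] is good, commits[-1] is bad, and that every
    commit after the first bad one is also bad.
    """
    good, bad = 0, len(commits) - 1
    tests = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tests += 1
        if is_bad(commits[mid]):
            bad = mid    # the breakage is at mid or earlier
        else:
            good = mid   # the breakage is after mid
    return commits[bad], tests

# Hypothetical history c0..c9 where the bug was introduced in c6.
history = [f"c{i}" for i in range(10)]
culprit, tests = first_bad(history, lambda c: int(c[1:]) >= 6)
print(culprit, tests)  # c6 3
```

Each test halves the remaining range, so even a very long stretch of
history needs only a handful of build-and-test rounds.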

So that said, git does work very well for us, but I do want to end on a 
note on things that BitKeeper did and nobody else has:

 - Larry was first. The undeniable fact is, that before BK (and for 
   several years _after_ BK), the open-source alternatives were just CRAP.

   You can say anything you like about his personality, but dammit, 
   compared to Larry, most people I know are idiots. People don't give BK 
   the credit it deserves. When Tridge "reverse-engineered" it, people 
   were making jokes about how trivial some of the protocols were. That 
   misses the point ENTIRELY. The point is, compared to BK, everything 
   else absolutely _sucked_, and BK really was a watershed program.

   Never EVER underestimate how important BK was. Quite frankly, I think 
   most open-source SCM's _still_ suck. I'm constantly amazed that anybody 
   would touch SVN with a ten-foot pole. Talk about crap. And SVN is at 
   least usable, unlike a lot of other projects.

 - When I did git, one of the things that actually _helped_ me was that I 
   was consciously trying to not do a BK clone. I wanted to do the same 
   things that BK did, but I very much did _not_ want to do them the _way_ 
   BK did them. I respect Larry too much, and I didn't want there to be 
   any question about git being just a "clone".

   So a lot of the git design ended up very much trying to avoid old 
   designs on purpose, and I think that really helped. The fact that I 
   didn't have a background in SCM's, and that I thought all the weaves 
   etc were confusing, meant that I instead went for a radically different 
   way of doing things.

   And I'm 100% convinced that "radically different" was the right thing 
   to do. That was what allowed git to really soar. A lot of the good 
   things in git come exactly from the fact that git does _not_ do things 
   like most traditional SCM's do. But BK should still get a lot of 
   credit, because it was what taught me (and a lot of other people) what 
   being "distributed" really meant.

 - On a more personal note: people say that BK showed the "failure" of 
   using a commercial closed-source program. I would disagree. Not only 
   did the kernel get a whole lot of useful work out of BK, we learnt how 
   distributed systems _should_ work, and quite frankly, I'd do it all
   over again in a heartbeat.

   If there was a "failure" in the BK saga, it was in how horrendously 
   _bad_ all open-source SCM's were, even with BK showing how it should 
   have been done for several years. THAT is the failure. The fact that 
   there were hundreds of people who whined about BK, and nobody really 
   did anything productive. 

Now, I'm obviously biased, but I really do believe that git is the best 
open-source SCM there is, by a _mile_. I don't know how many people 
realize this, but we literally haven't changed our data formats in over a 
year. I was looking at my old git import of the BKCVS tree today, because 
I wanted to look up the "BKrev" format for the email earlier in this tree, 
and I realized that the pack-file was from July of last year. That's 
within a few _weeks_ of the pack-file being introduced at all, and guess 
what? It all still worked. No "on-the-fly format conversion", no 
_nothing_. It just worked.

That should tell people something. It's pretty much the fastest SCM out 
there (and yeah, that's on almost any operation you can name), it still 
has the smallest disk footprint I've ever heard of, and it hasn't had the 
"format of the week" disease that every other project seems to go through.

And it's used in production settings on some of the biggest projects out 
there. SVN has more users, but let's face it, SVN really isn't even in the 
running. Technology-wise, the thing is just not worth bothering with, but 
it's a good crutch for people who are used to CVS and never want to use 
anything else.

Am I happy with git? I'm happy as a clam. It turned out even better than I 
ever thought it would. And BK was what taught me what to aim for.

			Linus


* Re: VCS comparison table
  2006-10-19 16:59                                     ` Carl Worth
@ 2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 23:42                                         ` Carl Worth
  2006-10-20 10:53                                         ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-19 23:01 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 10:58:48 -0400, Aaron Bentley wrote:

> Let's imagine there's a complete fork in the bzr codebase tomorrow. We
> need not suppose any acrimony, just an amiable split as two subsets of
> the team start taking the code in different directions.

...

> Finally the two teams ... want to
> merge their code together.
> 
> After the merge, there can be only one mainline, so one team or the
> other will have to concede to give up the numbers they had generated
> and published during the fork.

I don't think this is true.  The abandoned mainline does not need to be
destroyed.  It can be kept at the same location that it always was, with
the numbers that it always had.  So the number + URL combo stays
meaningful.  Additionally, the new mainline can keep a mirror of the
abandoned mainline in its repository, because there are virtually no
additional storage requirements to doing so.

> An individual takes the bzr codebase and starts working on it. It's
> experimental stuff, so it's not pushed back into the central
> repository yet. But our coder isn't a total recluse, so his friends
> help him with the code he's working on. They communicate about their
> work, (perhaps on the main bzr mailing list), and make statements such
> as "feature F is working perfectly as of version V".
> 
> But for these communications, revision numbers will not provide
> historically stable values that can be used.

They certainly can.

The coder says "I've put up a branch at http://example.com/bzr/feature.
 In revision 5, I started work on feature A.  I finished work in
revision 6.  But then I had to fix a related bug in revision 7."

As long as that coder is active, they'll keep their repository at the
same location.  And because branches are cheap (even cheaper than
delta-compressed revisions), there's no reason to delete old branches.
It's better to keep them around for reference purposes.

> It's impossible for our
> coder to predict the numbers that will be assigned to his code when
> they get merged back into the mainline---since some other unknown
> programmer may have branched at exactly the same point and is trying
> to make the same determination.

This is true, but his code is likely to all land in the mainline at
once.  Since his own revnos are more fine-grained, he's not likely to
want to use the mainline revnos.

> Now, the programmers could get stable numbers by keeping the branch in
> the main tree, or by at least pushing out the branching point to
> "reserve" a number in the main tree.

I don't know what you mean by pushing out the branching point.

>> That doesn't follow.  Just because something is arguably true doesn't
>> make it bad.  And in this case, I'm not arguing that it's true, I'm
>> saying that it's true, because that is what my experience tells me is true.
> 
> [I'm sorry, but I didn't grasp this sentence. I think I lost the
> antecedent of "it" somewhere.]

I felt that you were mischaracterizing my _statement_ that "it's
exceedingly uncommon for [revnos] to change" as an _argument_ "it's
exceedingly uncommon for [revnos] to change".  The reality is that we
keep saying revnos don't change because git users keep saying "but what
if the revnos change?".


>>          And I personally have been developing a bugtracker that is
>> distributed in the same way bzr is; it stores bug data in the source
>> tree of a project, so that bug activities follow branches around.
> 
> That kind of thing sounds very useful. As I've been talking about
> "numbers" here in bug trackers and mailing lists, it should be obvious
> that I consider the information stored in such systems an important
> part of the history of a code project. So it would be nice if all of
> that history were stored in an equally reliable system in some way.

If you're interested, it's called "Bugs Everywhere" and it's available here:
http://panoramicfeedback.com/opensource/

New VCS backends are welcome :-D

>> In the first place, it seems fairly common in the Git community to
>> rebase.  This process throws away old revisions and creates new
>> revisions that are morally equivalent[1].
> 
> Yes, rebasing does "destroy history" in one sense, (in actual fact, it
> creates new commits and leaves the old ones around, which may or may
> not have references to them anymore). But it's definitely not common
> for git users to use rebase in a situation where it would change any
> published number.

So actually, not all branches are treated equally by Git users.  Public
branches are treated as append-only, but private branches are treated as
mutable.  (It's the same with bzr users, of course.)

> And git helps with this as well. If I ever forget that I've already
> pushed a change and then I rebase, then the next time I try to push,
> git will complain that I'm attempting to throw away history on the
> remote end, and will refuse to cooperate, (unless I force it).

Same here.

> There's a similar safety mechanism on the pull side. If I did force a
> history-rewriting push, then users who tried to pull it would also
> have to force git's hand before it would rewrite their history.

Same here.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOAPm0F+nu1YWqI0RAhkdAJ9InxuEjbToGQU2AOJmfZw124Lb2wCeMmDC
9w08eZbmL19FfVQmtpPcYkQ=
=AmGo
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-19 18:54                                           ` Matthieu Moy
  2006-10-19 20:47                                             ` Linus Torvalds
@ 2006-10-19 23:28                                             ` Ryan Anderson
  1 sibling, 0 replies; 806+ messages in thread
From: Ryan Anderson @ 2006-10-19 23:28 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Linus Torvalds, Matthew D. Fuller, Andreas Ericsson, Carl Worth,
	bazaar-ng, git, Jakub Narebski

On 10/19/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> Linus Torvalds <torvalds@osdl.org> writes:
>
> > Btw, I do believe that bzr seems to be acting a lot like BK, at least when
> > it comes to versioning. I suspect that is not entirely random either, and
> > I suspect it's been a conscious effort to some degree.
> >
> > Which is fine, in the sense that there are certainly much worse things to
> > try to copy.
>
> By curiosity, how would you compare git and Bitkeeper, on a purely
> technical basis? (not asking for a detailed comparison, but an "X is
> globally/much/terribly/not better than Y" kind of statement ;-) )

Having used both in a past job setting (simultaneously even),
BitKeeper was a huge win over CVS, but after a while, some of its
tools were just very frustrating in comparison with comparable Git
interfaces, and I had actually written a terribly slow BK -> Git
converter just so I could incrementally import our BK tree, then use
Git's history-viewing because it was so much more pleasant to work
with.

For small projects (~5 people), they weren't hugely different, but Git
just felt more comfortable after a while.  (It was actually possible
to do a commit from the command line in a single command, without
getting annoyed by the interface, for a trivial example.)


* Re: VCS comparison table
  2006-10-19 23:01                                       ` Aaron Bentley
@ 2006-10-19 23:42                                         ` Carl Worth
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  2:53                                           ` James Henstridge
  2006-10-20 10:53                                         ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-19 23:42 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 5266 bytes --]

On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
> I don't think this is true.  The abandoned mainline does not need to be
> destroyed.  It can be kept at the same location that it always was, with
> the numbers that it always had. So the number + URL combo stays
> meaningful.

Sure that's possible, but it gets rather unwieldy the more
repositories you have involved. I've been arguing that bzr really does
encourage centralized, not distributed development, and you were having
trouble seeing how I came to that conclusion. Do you see how "maintain
an independent URL namespace for every distributed branch" doesn't
encourage much distributed development?

>             Additionally, the new mainline can keep a mirror of the
> abandoned mainline in its repository, because there are virtually no
> additional storage requirements to doing so.

And this part I don't understand. I can understand the mainline
storing the revisions, but I don't understand how it could make them
accessible by the published revision numbers of the "abandoned"
line. And that's the problem.

> > But for these communications, revision numbers will not provide
> > historically stable values that can be used.
>
> They certainly can.
>
> The coder says "I've put up a branch at http://example.com/bzr/feature.
>  In revision 5, I started work on feature A.  I finished work in
> revision 6.  But then I had to fix a related bug in revision 7."

"I've put this branch up" isn't historically stable...

> As long as that coder is active

...which is what you just said there yourself.

On the other hand, git names really do live forever, regardless of
where the code is hosted or how it moves around. When I'm talking
about historical stability, I'm talking about being able to publish
numbers that live forever.

It sounds like bzr has numbers like this inside it, (but not nearly as
simple as the ones that git has), but that users aren't in the
practice of communicating with them. Instead, users communicate with
the unstable numbers. And that's a shame from an historical
perspective.

> This is true, but his code is likely to all land in the mainline at
> once.  Since his own revnos are more fine-grained, he's not likely to
> want to use the mainline revnos.

What I'd like to be able to do, is advertise a temporary repository,
and while using it, publish names for revisions that will still be
valid when the code gets pushed out to the mainline. That is
supporting distributed development, and everything I'm hearing says
that the bzr revision numbers don't support that.

> I felt that you were mischaracterizing my _statement_ that "it's
> exceedingly uncommon for [revnos] to change" as an _argument_ "it's
> exceedingly uncommon for [revnos] to change".  The reality is that we
> keep saying revnos don't change because git users keep saying "but what
> if the revnos change?".

OK.

The original claim that sparked the discussion was that bzr has a
"simple namespace" while git does not. We've been talking for quite a
while here, and I still don't fully understand how these numbers are
generated or what I can expect to happen to the numbers associated
with a given revision as that revision moves from one repository to
another. It's really not a simple scheme.

Meanwhile, I have been arguing that the "simple" revision numbers that
bzr advertises have restrictions on their utility, (they can only be
used with reference to a specific repository, or with reference to
another that treats it as canonical). I _think_ I understand the
numbers well enough to say that still.

Compare that with the git names. The scheme really is easy to
understand, (either the new user already understands cryptographic
hashes, or else it's as easy as "a long string of digits that git
assigns as the name"). The names have universal utility in time and
space, (for definitions of the universe larger than I will ever be
able to observe anyway). And the natural inclination to abbreviate
a name when repeating it, (note the recent post with bzr UUIDs
exhibiting the same inclination), doesn't make the names any less
useful since the abbreviation alone will work most always.
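
To make the abbreviation point concrete, here is a minimal sketch of
resolving a shortened name against a set of full names. It is only an
illustration under stated assumptions: resolve_abbrev() is a made-up
helper, the names are passed in as a plain array, and the lookup is a
linear scan, which is not how git itself searches its object store.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not a git function.
 *
 * Returns the index of the only name that starts with "prefix",
 * -1 if no name matches, or -2 if the prefix is ambiguous.
 */
static int resolve_abbrev(const char *const *names, size_t nr,
			  const char *prefix)
{
	size_t i, len = strlen(prefix);
	int found = -1;

	for (i = 0; i < nr; i++) {
		if (strncmp(names[i], prefix, len))
			continue;	/* does not start with prefix */
		if (found >= 0)
			return -2;	/* second match: ambiguous */
		found = (int)i;
	}
	return found;
}
```

An abbreviation keeps working for as long as it stays unique among the
known names; a longer prefix is only needed once a second name shares
the short one.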

The naming in git really is beautiful and beautifully simple.

It's not monotonically increasing from one revision to the next, but
I've never found that to be an issue. Of course, we do still use our
own "simple" names for versioning the releases and snapshots of
software we manage with git, and that's where being able to easily
determine "newer" or "older" by simple numerical examination is
important. I've honestly never encountered a situation where I was
handed two git sha1 sums and wished that I could do the same thing.

> If you're interested, it's called "Bugs Everywhere" and it's available here:
> http://panoramicfeedback.com/opensource/
>
> New VCS backends are welcome :-D

Thanks, I hope to take a look at that at some point.

> So actually, not all branches are treated equally by Git users.  Public
> branches are treated as append-only, but private branches are treated as
> mutable.  (It's the same with bzr users, of course.)

Well, some users treat all branches as append only and shun rebase.

[snip of remaining agreement of similarity between the tools]

-Carl



* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 16:48                                                             ` Linus Torvalds
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20 14:41                                                                 ` Jeff King
  2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

On Thu, Oct 19, 2006 at 09:48:29AM -0700, Linus Torvalds wrote:
> On Thu, 19 Oct 2006, Jan Harkes wrote:
> > 
> > If we find a delta against a base that is not found in our repository we
> > can keep it as a delta, the base should show up later on in the
> > thin-pack. Whenever we find a delta against a base that we haven't seen
> > in the received part of the thin pack, but is available from the
> > repository we should expand it because there is a chance we may not see
> > this base in the remainder of the thin-pack.
> 
> Yes, indeed. We can also have another heuristic: if we find a delta, and 
> we haven't seen the object it deltas against, we can still keep it as a 
> delta IF WE ALSO DON'T ALREADY HAVE THE BASE OBJECT. Because then we know 
> that the base object has to be there later in the pack (or we have a 
> dangling delta, which we'll just consider an error).
> 
> So yeah, maybe my patch-series is something we can still save.

It looks like you were really close. When we cannot resolve a delta, we
just write it to the packfile and we don't queue it. If it can be
resolved we write it as a full object.
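
The heuristic from the quoted discussion can be sketched roughly as
below. This is an illustration only: classify_delta() and its two flag
arguments are hypothetical names, not code from the patch series, and
the patch that follows uses a simpler test (it only asks whether the
base is already in the repository).

```c
/* What to do with a delta entry while rewriting a thin pack. */
enum delta_action {
	KEEP_DELTA,	/* copy the delta into the new pack unchanged */
	EXPAND_DELTA	/* apply it and write the result as a full object */
};

/*
 * base_seen_in_pack: the base already appeared earlier in this pack.
 * base_in_repo:      the base is available from the local repository.
 */
static enum delta_action classify_delta(int base_seen_in_pack,
					int base_in_repo)
{
	if (base_seen_in_pack)
		return KEEP_DELTA;	/* base is part of the new pack */
	if (!base_in_repo)
		return KEEP_DELTA;	/* base must follow later, or the
					 * pack has a dangling delta */
	/* Base is only known locally and may never show up in the pack. */
	return EXPAND_DELTA;
}
```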

The only thing that cannot be reliably tracked is the pack index
information. The offsets are trivial, but we cannot calculate the SHA1
for a delta without applying it to its base. If the base comes later
the existing code could do it, but if it has already been written to the
pack we can't easily track back.

And why add all the extra complexity? Running git-index-pack after
git-unpack-objects --repack not only generates the correct index without
a problem, it also serves as an extra consistency check and we keep this
code isolated from any possible future changes to the index file format.

I'll try to follow this up with 2 patches, one is an almost trivial
change to your code that makes it write out a pack with all full objects
and resolvable deltas converted to full objects, any unresolved deltas
are expected to be relative to some other object in the same pack.

The rewritten pack is indexed correctly even when I run git-update-index
in a repository that does not contain any of the objects in the thin-pack.
Of course it also works when the objects are available, but the resulting
full pack is considerably bigger since we can find a suitable base for
every delta.

> However, the thing that makes me suspect that it is _not_ saveable, is 
> this:
...
> The answer is: no. It's not trivial. Or rather, it _is_ trivial, but you 
> have to _remember_ all of the actual data for A, B, C and D all the way to 
> the end, because only if you have that data in memory can you actually 
> _recreate_ B, C and D even enough to get their SHA1's (which you need, 
> just in order to know that the pack is complete, much less to be able to 
> create a non-delta version in case it hadn't been).

Only if you want to build the index at the same time, we don't need to
know the SHA1 values for unresolved deltas.

> Anyway, I just pushed the "rewrite-pack" branch to my git repo on 
> kernel.org, so once it mirrors out, if you really want to try to fix up 
> the mess I left behind, there it is:

I think I still left quite a bit of the mess unfixed.

Jan


* [PATCH 1/2] Pass through unresolved deltas when writing a pack
  2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
  2 siblings, 0 replies; 806+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

The resulting pack should be correct if we have the base somewhere else in
the received pack, if we didn't have the base the received pack would be
faulty and can't be unpacked as loose objects either.

The internal pack index information is not updated correctly anymore.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

---
 builtin-unpack-objects.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index f139308..b95c93c 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -246,7 +246,10 @@ static void unpack_delta_entry(unsigned 
 	}
 
 	if (!has_sha1_file(base_sha1)) {
-		add_delta_to_list(base_sha1, delta_data, delta_size);
+		if (pack_file)
+			write_pack_delta(base_sha1, delta_data, delta_size);
+		else
+			add_delta_to_list(base_sha1, delta_data, delta_size);
 		return;
 	}
 	base = read_sha1_file(base_sha1, type, &base_size);
-- 
1.4.2.1


* [PATCH 2/2] Remove unused index tracking code.
  2006-10-19 16:48                                                             ` Linus Torvalds
  2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
@ 2006-10-20  0:20                                                               ` Jan Harkes
  2006-10-20  1:11                                                                 ` Nicolas Pitre
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Harkes @ 2006-10-20  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

Tracking the offsets is not that hard, but calculating the sha1 for the
deltas is tricky, we may have already seen and written out the base we
need. So it is actually easier to avoid the complexity altogether and
rely on git-index-pack to rebuild the index. The indexing step is also a
useful validation whether the final pack contains a base for every delta.

Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

---
 builtin-unpack-objects.c |   57 +++++++++++-----------------------------------
 1 files changed, 14 insertions(+), 43 deletions(-)

diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
index b95c93c..3df7938 100644
--- a/builtin-unpack-objects.c
+++ b/builtin-unpack-objects.c
@@ -89,29 +89,6 @@ static void *get_data(unsigned long size
 }
 
 static struct sha1file *pack_file;
-static unsigned long pack_file_offset;
-
-struct index_entry {
-	unsigned long offset;
-	unsigned char sha1[20];
-};
-
-static unsigned int index_nr, index_alloc;
-static struct index_entry **index_array;
-
-static void add_pack_index(unsigned char *sha1)
-{
-	struct index_entry *entry;
-	int nr = index_nr;
-	if (nr >= index_alloc) {
-		index_alloc = (index_alloc + 64) * 3 / 2;
-		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
-	}
-	entry = xmalloc(sizeof(*entry));
-	entry->offset = pack_file_offset;
-	hashcpy(entry->sha1, sha1);
-	index_array[nr++] = entry;
-}
 
 static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
 {
@@ -122,11 +99,9 @@ static void write_pack_delta(const unsig
 	sha1write(pack_file, header, hdrlen);
 	sha1write(pack_file, base, 20);
 	datalen = sha1write_compressed(pack_file, delta, delta_size);
-
-	pack_file_offset += hdrlen + 20 + datalen;
 }
 
-static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
+static void write_pack_object(const void *buf, unsigned long size, const char *type, const unsigned char *sha1)
 {
 	unsigned char header[10];
 	unsigned hdrlen, datalen;
@@ -134,8 +109,6 @@ static void write_pack_object(const char
 	hdrlen = encode_header(string_to_type(type, sha1), size, header);
 	sha1write(pack_file, header, hdrlen);
 	datalen = sha1write_compressed(pack_file, buf, size);
-
-	pack_file_offset += hdrlen + datalen;
 }
 
 struct delta_info {
@@ -160,22 +133,21 @@ static void add_delta_to_list(unsigned c
 
 static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
 
-static void write_object(void *buf, unsigned long size, const char *type,
-	unsigned char *base, void *delta, unsigned long delta_size)
+static void write_object(void *buf, unsigned long size, const char *type)
 {
 	unsigned char sha1[20];
 
 	if (pack_file) {
 		if (hash_sha1_file(buf, size, type, sha1) < 0)
 			die("failed to compute object hash");
-		add_pack_index(sha1);
-		if (0 && base)
-			write_pack_delta(base, delta, delta_size);
-		else
-			write_pack_object(type, sha1, buf, size);
-	} else if (write_sha1_file(buf, size, type, sha1) < 0)
-		die("failed to write object");
-	added_object(sha1, type, buf, size);
+
+		write_pack_object(buf, size, type, sha1);
+	} else {
+		if (write_sha1_file(buf, size, type, sha1) < 0)
+		    die("failed to write object");
+
+		added_object(sha1, type, buf, size);
+	}
 }
 
 static void resolve_delta(const char *type, unsigned char *base_sha1,
@@ -190,7 +162,7 @@ static void resolve_delta(const char *ty
 			     &result_size);
 	if (!result)
 		die("failed to apply delta");
-	write_object(result, result_size, type, base_sha1, delta, delta_size);
+	write_object(result, result_size, type);
 	free(delta);
 	free(result);
 }
@@ -225,7 +197,7 @@ static void unpack_non_delta_entry(enum 
 	default: die("bad type %d", kind);
 	}
 	if (!dry_run && buf)
-		write_object(buf, size, type, NULL, NULL, 0);
+		write_object(buf, size, type);
 	free(buf);
 }
 
@@ -334,12 +306,11 @@ static void unpack_all(const char *repac
 		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
 		newhdr.hdr_version = htonl(PACK_VERSION);
 		newhdr.hdr_entries = htonl(nr_objects);
-		
+
 		pack_file = sha1create("%s.pack", repack);
 		sha1write(pack_file, &newhdr, sizeof(newhdr));
-		pack_file_offset = sizeof(newhdr);
 	}
-		
+
 
 	use(sizeof(struct pack_header));
 	for (i = 0; i < nr_objects; i++)
-- 
1.4.2.1


* Re: VCS comparison table
  2006-10-19 23:42                                         ` Carl Worth
@ 2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
                                                               ` (4 more replies)
  2006-10-20  2:53                                           ` James Henstridge
  1 sibling, 5 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20  1:06 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

> Do you see how "maintain
> an independent URL namespace for every distributed branch" doesn't
> encourage much distributed development?

I understand your argument now.  It's nothing to do with numbers per se,
and all about per-branch namespaces.  Correct?

>>             Additionally, the new mainline can keep a mirror of the
>> abandoned mainline in its repository, because there are virtually no
>> additional storage requirements to doing so.
> 
> And this part I don't understand. I can understand the mainline
> storing the revisions, but I don't understand how it could make them
> accessible by the published revision numbers of the "abandoned"
> line. And that's the problem.

I meant that the active branch and a mirror of the abandoned branch
could be stored in the same repository, for ease of access.

Bazaar encourages you to stick lots and lots of branches in your
repository.  They don't even have to be related.  For example, my repo
contains branches of bzr, bzrtools, Meld, and BazaarInspect.

> It sounds like bzr has numbers like this inside it, (but not nearly as
> simple as the ones that git has), but that users aren't in the
> practice of communicating with them. Instead, users communicate with
> the unstable numbers. And that's a shame from an historical
> perspective.

I can see where you're coming from, but to me, the trade-off seems
worthwhile.  Because historical data gets less and less valuable the
older it gets.  By the time the URL for a branch goes dark, there's
unlikely to be any reason to refer to one of its revisions at all.

> The original claim that sparked the discussion was that bzr has a
> "simple namespace" while git does not. We've been talking for quite a
> while here, and I still don't fully understand how these numbers are
> generated or what I can expect to happen to the numbers associated
> with a given revision as that revision moves from one repository to
> another. It's really not a simple scheme.

When you create a new branch from scratch, the number starts at zero.
If you copy a branch, you copy its number, too.

Every time you commit, the number is incremented.  If you pull, your
numbers are adjusted to be identical to those of the branch you pulled from.

Is that really complicated?
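
As a rough illustration of those rules, here is a toy model. It is not
bzr's implementation: struct branch, MAX_REVS and the branch_* helpers
are made-up names, and a branch is assumed to be nothing more than an
ordered list of revision ids whose positions are the revnos.

```c
#include <stddef.h>
#include <string.h>

#define MAX_REVS 64

/* Toy branch: revno N is simply the Nth entry of revid[]. */
struct branch {
	const char *revid[MAX_REVS];
	size_t nr;	/* current revno; 0 for a brand-new branch */
};

/* Commit: append a revision. Returns the new revno, or 0 if full. */
static size_t branch_commit(struct branch *b, const char *revid)
{
	if (b->nr >= MAX_REVS)
		return 0;
	b->revid[b->nr++] = revid;
	return b->nr;
}

/* Pull: take over the source branch's history and numbering as-is. */
static void branch_pull(struct branch *dst, const struct branch *src)
{
	*dst = *src;
}

/* Revno of a revision id on this branch, or 0 if it is not there. */
static size_t branch_revno(const struct branch *b, const char *revid)
{
	size_t i;

	for (i = 0; i < b->nr; i++)
		if (!strcmp(b->revid[i], revid))
			return i + 1;
	return 0;
}
```

Copying a branch is plain struct assignment in this model, so the copy
starts with the same number. Two copies that each commit once both end
up at the same revno for different revisions, which is why a revno in
this model only means something together with the branch it came from.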

> Meanwhile, I have been arguing that the "simple" revision numbers that
> bzr advertises have restrictions on their utility, (they can only be
> used with reference to a specific repository, or with reference to
> another that treats it as canonical). I _think_ I understand the
> numbers well enough to say that still.

Sure.  It's the "favors centralization" thing that I don't agree with,
but I now understand your argument.

> Compare that with the git names. The scheme really is easy to
> understand, (either the new user already understands cryptographic
> hashes, or else it's as easy as "a long string of digits that git
> assigns as the name").

In my experience, users who don't understand distributed systems don't
understand why UUIDs must be used as identifiers.

> The naming in git really is beautiful and beautifully simple.

Well, you've got to admit that those names are at least superficially ugly.

> It's not monotonically increasing from one revision to the next, but
> I've never found that to be an issue. Of course, we do still use our
> own "simple" names for versioning the releases and snapshots of
> software we manage with git, and that's where being able to easily
> determine "newer" or "older" by simple numerical examination is
> important. I've honestly never encountered a situation where I was
> handed two git sha1 sums and wished that I could do the same thing.

What's nice is being able to see the revno 753 and knowing that "diff -r
752..753" will show the changes it introduced.  Checking the revno on a
branch mirror and knowing how out-of-date it is.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOCEf0F+nu1YWqI0RAhgtAJwK4jkWFjjF2iHJb1VyXqgszsHElACff2U7
olZJiAED80tIS6kgkqFsJps=
=BkRZ
-----END PGP SIGNATURE-----


* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
@ 2006-10-20  1:11                                                                 ` Nicolas Pitre
  2006-10-20  1:35                                                                   ` Junio C Hamano
  2006-10-20  2:27                                                                   ` Jan Harkes
  0 siblings, 2 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-20  1:11 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, 19 Oct 2006, Jan Harkes wrote:

> Tracking the offsets is not that hard, but calculating the sha1 for the
> deltas is tricky, we may have already seen and written out the base we
> need. So it is actually easier to avoid the complexity altogether and
> rely on git-index-pack to rebuild the index. The indexing step is also a
> useful validation whether the final pack contains a base for every delta.
> 
> Signed-off-by: Jan Harkes <jaharkes@cs.cmu.edu>

I don't think it is a good idea.

After looking at the problem for a while I should side with Linus.  
unpack-objects is not the proper tool for the job.  The way to go is to 
make input to index-pack streamable.

This patch in particular creates additional restrictions on pack 
files that were not present before.  And I don't think this is a good 
thing.

This patch imposes an ordering on REF_DELTA objects that doesn't need to 
exist.  Say for example that an OFS_DELTA depends on an object which is 
a REF_DELTA object.  With this patch any pack with the base for that 
REF_DELTA stored after the OFS_DELTA object will be broken.

And to really do thin pack fixing properly we really want to just append 
missing base objects at the end of the pack which falls in the broken 
case above.

So this is a NAK from me.

> ---
>  builtin-unpack-objects.c |   57 +++++++++++-----------------------------------
>  1 files changed, 14 insertions(+), 43 deletions(-)
> 
> diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c
> index b95c93c..3df7938 100644
> --- a/builtin-unpack-objects.c
> +++ b/builtin-unpack-objects.c
> @@ -89,29 +89,6 @@ static void *get_data(unsigned long size
>  }
>  
>  static struct sha1file *pack_file;
> -static unsigned long pack_file_offset;
> -
> -struct index_entry {
> -	unsigned long offset;
> -	unsigned char sha1[20];
> -};
> -
> -static unsigned int index_nr, index_alloc;
> -static struct index_entry **index_array;
> -
> -static void add_pack_index(unsigned char *sha1)
> -{
> -	struct index_entry *entry;
> -	int nr = index_nr;
> -	if (nr >= index_alloc) {
> -		index_alloc = (index_alloc + 64) * 3 / 2;
> -		index_array = xrealloc(index_array, index_alloc * sizeof(*index_array));
> -	}
> -	entry = xmalloc(sizeof(*entry));
> -	entry->offset = pack_file_offset;
> -	hashcpy(entry->sha1, sha1);
> -	index_array[nr++] = entry;
> -}
>  
>  static void write_pack_delta(const unsigned char *base, const void *delta, unsigned long delta_size)
>  {
> @@ -122,11 +99,9 @@ static void write_pack_delta(const unsig
>  	sha1write(pack_file, header, hdrlen);
>  	sha1write(pack_file, base, 20);
>  	datalen = sha1write_compressed(pack_file, delta, delta_size);
> -
> -	pack_file_offset += hdrlen + 20 + datalen;
>  }
>  
> -static void write_pack_object(const char *type, const unsigned char *sha1, const void *buf, unsigned long size)
> +static void write_pack_object(const void *buf, unsigned long size, const char *type, const unsigned char *sha1)
>  {
>  	unsigned char header[10];
>  	unsigned hdrlen, datalen;
> @@ -134,8 +109,6 @@ static void write_pack_object(const char
>  	hdrlen = encode_header(string_to_type(type, sha1), size, header);
>  	sha1write(pack_file, header, hdrlen);
>  	datalen = sha1write_compressed(pack_file, buf, size);
> -
> -	pack_file_offset += hdrlen + datalen;
>  }
>  
>  struct delta_info {
> @@ -160,22 +133,21 @@ static void add_delta_to_list(unsigned c
>  
>  static void added_object(unsigned char *sha1, const char *type, void *data, unsigned long size);
>  
> -static void write_object(void *buf, unsigned long size, const char *type,
> -	unsigned char *base, void *delta, unsigned long delta_size)
> +static void write_object(void *buf, unsigned long size, const char *type)
>  {
>  	unsigned char sha1[20];
>  
>  	if (pack_file) {
>  		if (hash_sha1_file(buf, size, type, sha1) < 0)
>  			die("failed to compute object hash");
> -		add_pack_index(sha1);
> -		if (0 && base)
> -			write_pack_delta(base, delta, delta_size);
> -		else
> -			write_pack_object(type, sha1, buf, size);
> -	} else if (write_sha1_file(buf, size, type, sha1) < 0)
> -		die("failed to write object");
> -	added_object(sha1, type, buf, size);
> +
> +		write_pack_object(buf, size, type, sha1);
> +	} else {
> +		if (write_sha1_file(buf, size, type, sha1) < 0)
> +		    die("failed to write object");
> +
> +		added_object(sha1, type, buf, size);
> +	}
>  }
>  
>  static void resolve_delta(const char *type, unsigned char *base_sha1,
> @@ -190,7 +162,7 @@ static void resolve_delta(const char *ty
>  			     &result_size);
>  	if (!result)
>  		die("failed to apply delta");
> -	write_object(result, result_size, type, base_sha1, delta, delta_size);
> +	write_object(result, result_size, type);
>  	free(delta);
>  	free(result);
>  }
> @@ -225,7 +197,7 @@ static void unpack_non_delta_entry(enum 
>  	default: die("bad type %d", kind);
>  	}
>  	if (!dry_run && buf)
> -		write_object(buf, size, type, NULL, NULL, 0);
> +		write_object(buf, size, type);
>  	free(buf);
>  }
>  
> @@ -334,12 +306,11 @@ static void unpack_all(const char *repac
>  		newhdr.hdr_signature = htonl(PACK_SIGNATURE);
>  		newhdr.hdr_version = htonl(PACK_VERSION);
>  		newhdr.hdr_entries = htonl(nr_objects);
> -		
> +
>  		pack_file = sha1create("%s.pack", repack);
>  		sha1write(pack_file, &newhdr, sizeof(newhdr));
> -		pack_file_offset = sizeof(newhdr);
>  	}
> -		
> +
>  
>  	use(sizeof(struct pack_header));
>  	for (i = 0; i < nr_objects; i++)
> -- 
> 1.4.2.1


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  1:11                                                                 ` Nicolas Pitre
@ 2006-10-20  1:35                                                                   ` Junio C Hamano
  2006-10-20  2:27                                                                   ` Jan Harkes
  1 sibling, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20  1:35 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> This patch in particular creates additional restrictions on pack 
> files that were not present before.  And I don't think this is a good 
> thing.
>
> This patch imposes an ordering on REF_DELTA objects that doesn't need to 
> exist.  Say for example that an OFS_DELTA depends on an object which is 
> a REF_DELTA object.  With this patch any pack with the base for that 
> REF_DELTA stored after the OFS_DELTA object will be broken.
>
> And to really do thin pack fixing properly we really want to just append 
> missing base objects at the end of the pack which falls in the broken 
> case above.
>
> So this is a NAK from me.

I agree.

By the way, it is rather rare for us to see a NAK on this list.
I'd welcome seeing more of them ;-).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
  2006-10-18 22:14                         ` Jakub Narebski
  2006-10-19  8:19                         ` Alexander Belchenko
@ 2006-10-20  2:09                         ` Horst H. von Brand
  2006-10-20  5:38                           ` Jan Hudec
  2 siblings, 1 reply; 806+ messages in thread
From: Horst H. von Brand @ 2006-10-20  2:09 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Robert Collins, Petr Baudis, bazaar-ng, git

Jan Hudec <bulb@ucw.cz> wrote:

[...]

> Reading this thread I came to think, that the revnos should be assigned
> to _all_ revisions _available_, in order of when they entered the
> repository (there are some possible variations I will mention below)
> 
>  - Such revnos would be purely local, but:
>    - Current revnos are not guaranteed to be the same in different
>      branches either.
>    - They could be done so that mirror has the same revnos as the
>      master.

Then they are almost useless (except for people working alone). You need to
be able to talk about a particular commit with others working independently.

>  - They would be easier to use than the dotted ones. What (at least as
>    far as I understand) makes revnos easier to use than revids is, that
>    you can remember few of them for short time while composing some
>    operation. Ie. look up 2 or 3 revisions in the log and than do some
>    command on them. And a 4 to 5-digit number like 10532 is easier to
>    remember than something like 3250.2.45.86.

Probably. In git you can (mostly) get away with partial SHA-1's, BTW.

>  - Their ordering would be an (arbitrary) superset of the partial
>    ordering by descendance, ie. if revision A is ancestor of B, it would
>    always have lower revno.
>    - The intuition that lower revno means older revision would be always
>      valid for related revisions and approximately valid for unrelated
>      ones.
>  - They would be *localy stable*. That is once assigned the revno would
>    always mean the same revision in given branch (as determined by
>    location, not tip).

Tip-relative is extremely useful: I wouldn't normally remember the current
revision, but I'll probably often be talking about "the change before this
one" and so on.

>      - This is more than the current scheme can give, since now pull can
>        renumber revisions.

Urgh. Get an update, and all your bearings change?

>  - They wouldn't make any branch special, so the objections Linus raised
>    does not apply.

But the original branch /is/ special?

>  - They would be the same as subversion and svk, and IIRC mercurial as
>    well, use, so:
>    - They would already be familiar to users comming from those systems.
>    - They are known to be useful that way. In fact for svk it's the only
>      way to refer to revisions and seem to work satisfactorily (though
>      note that svk is not really suitable to ad-hoc topologies).

SVN is /centralized/, there it does make sense talking about (the one and
only) history. In a distributed system, potentially each has a different
history, and they are intertwined.

Not at all useful.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239
Casilla 110-V, Valparaiso, Chile               Fax:  +56 32 2797513

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  1:11                                                                 ` Nicolas Pitre
  2006-10-20  1:35                                                                   ` Junio C Hamano
@ 2006-10-20  2:27                                                                   ` Jan Harkes
  2006-10-20  2:30                                                                     ` Junio C Hamano
  2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 2 replies; 806+ messages in thread
From: Jan Harkes @ 2006-10-20  2:27 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, Oct 19, 2006 at 09:11:10PM -0400, Nicolas Pitre wrote:
> This patch imposes an ordering on REF_DELTA objects that doesn't need to 
> exist.  Say for example that an OFS_DELTA depends on an object which is 
> a REF_DELTA object.  With this patch any pack with the base for that 
> REF_DELTA stored after the OFS_DELTA object will be broken.

I don't see where it imposes any ordering.

If we see a complete object it will remain complete. If we find a delta,
and we have the base in the current repository it will be expanded to a
complete object. When we get a delta that doesn't have a base in the
current repository it will remain unresolved and is written out as a
delta.

So the output pack will always contain fewer deltas than the input.

btw. I don't really know what OFS_DELTA and REF_DELTA objects are, I
grepped the source and found no references to either. I can only find
an OBJ_DELTA.

But if any of the deltas depend on an object that is not in the thin
pack, the base has to be available in the current repository and as such
it will be expanded to a full object, replacing the possibly external
delta reference with an internal base object. If the base is not found
in the current repository the base has to be another object in the
original thin pack so we can write out the delta as is.

There is no before or after decision here. We don't look back in the
thin pack, and we don't have to look forward either. So I don't
understand why your example would break or not depending on if the base
object happens to be before or after the OFS_DELTA.

> And to really do thin pack fixing properly we really want to just append 
> missing base objects at the end of the pack which falls in the broken 
> case above.

I guess I'll grep through the mailing lists to try to figure out what
these OFS and REF deltas are and why they behave so differently
depending on their order in the pack.

Jan

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:27                                                                   ` Jan Harkes
@ 2006-10-20  2:30                                                                     ` Junio C Hamano
  2006-10-20  2:46                                                                       ` Jan Harkes
  2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20  2:30 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Nicolas Pitre, Linus Torvalds, git

Jan Harkes <jaharkes@cs.cmu.edu> writes:

> I guess I'll grep through the mailing lists to try to figure out what
> these OFS and REF deltas are and why they behave so differently
> depending on their order in the pack.

It's been cooking in "next" branch for quite a while.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:30                                                                     ` Junio C Hamano
@ 2006-10-20  2:46                                                                       ` Jan Harkes
  0 siblings, 0 replies; 806+ messages in thread
From: Jan Harkes @ 2006-10-20  2:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Linus Torvalds, git

On Thu, Oct 19, 2006 at 07:30:27PM -0700, Junio C Hamano wrote:
> Jan Harkes <jaharkes@cs.cmu.edu> writes:
> 
> > I guess I'll grep through the mailing lists to try to figure out what
> > these OFS and REF deltas are and why they behave so differently
> > depending on their order in the pack.
> 
> It's been cooking in "next" branch for quite a while.

Ah yes, just went through the thread about the git-index-pack breaking on
64-bit systems and the back and forth about the possible complexity of
the new code.

> It is really simple:
>
>  - if the found union content matches with a reference union initialized
>    through the sha1 member then deltas[j].obj->type == OBJ_REF_DELTA
>    must be true.
>
>  - if the found union content matches with a reference union initialized
>    through the sha1 member then deltas[j].obj->type == OBJ_OFS_DELTA
>    must be true.
...

I guess one of these must be false.

But clearly this patch breaks those offset based deltas when we expand
random deltas in place.

Jan

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 23:42                                         ` Carl Worth
  2006-10-20  1:06                                           ` Aaron Bentley
@ 2006-10-20  2:53                                           ` James Henstridge
  2006-10-20  9:51                                             ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-20  2:53 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
> > I don't think this is true.  The abandoned mainline does not need to be
> > destroyed.  It can be kept at the same location that it always was, with
> > the numbers that it always had. So the number + URL combo stays
> > meaningful.
>
> Sure that's possible, but it gets rather unwieldy the more
> repositories you have involved. I've been arguing that bzr really does
> encourage centralized, not distributed development, and you were having
> trouble seeing how I came to that conclusion. Do you see how "maintain
> an independent URL namespace for every distributed branch" doesn't
> encourage much distributed development?
>
> >             Additionally, the new mainline can keep a mirror of the
> > abandoned mainline in its repository, because there are virtually no
> > additional storage requirements to doing so.
>
> And this part I don't understand. I can understand the mainline
> storing the revisions, but I don't understand how it could make them
> accessible by the published revision numbers of the "abandoned"
> line. And that's the problem.

With this sort of setup, I would publish my branches in a directory
tree like this:

    /repo
        /branch1
        /branch2

I make "/repo" a Bazaar repository so that it stores the revision data
for all branches contained in the directory (the tree contents,
revision meta data, etc).

The "/repo/branch1" directory essentially just contains a list of
mainline revision IDs that identify the branch.  This could probably
just store the head revision ID, but there are some optimisations that
make use of the linear history here.

If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
be in the case of forked then merged projects), then all its
revision data will already have been stored in the repository when
branch1 was imported.  The only cost of keeping the branch around (and
publishing it) is the list of revision IDs in its mainline history.

For similar reasons, the cost of publishing 20 related Bazaar branches
on my web server is generally not 20 times the cost of publishing a
single branch.

I understand that you get similar benefits from a GIT repository with
multiple head revisions.


> > > But for these communications, revision numbers will not provide
> > > historically stable values that can be used.
> >
> > They certainly can.
> >
> > The coder says "I've put up a branch at http://example.com/bzr/feature.
> >  In revision 5, I started work on feature A.  I finished work in
> > revision 6.  But then I had to fix a related bug in revision 7."
>
> "I've put this branch up" isn't historically stable...

With the repository structure mentioned above, the cost of publishing
multiple branches is quite low.  If I continue to work on the project,
then there are no particular bandwidth or disk space reasons for me to
cut off access to my old branches.

For similar reasons, it doesn't cost me much to mirror other people's
related branches if I really care about them.

> > As long as that coder is active
>
> ...which is what you just said there yourself.
>
> On the other hand, git names really do live forever, regardless of
> where the code is hosted or how it moves around. When I'm talking
> about historical stability, I'm talking about being able to publish
> numbers that live forever.
>
> It sounds like bzr has numbers like this inside it, (but not nearly as
> simple as the ones that git has), but that users aren't in the
> practice of communicating with them. Instead, users communicate with
> the unstable numbers. And that's a shame from an historical
> perspective.

If you need that level of stability then you want the revision
identifier in both the GIT and Bazaar cases.

As for simplicity, note that Bazaar doesn't extract any special
meaning from the "$email-$date-$random" format of the revision
identifiers.  The only property it cares about is that they are
globally unique.  For example, revision identifiers generated by the
Arch -> Bazaar importer have a different format and are handled the
same.


> > This is true, but his code is likely to all land in the mainline at
> > once.  Since his own revnos are more fine-grained, he's not likely want
> > to use the mainline revnos.
>
> What I'd like to be able to do, is advertise a temporary repository,
> and while using it, publish names for revisions that will still be
> valid when the code gets pushed out to the mainline. That is
> supporting distributed development, and everything I'm hearing says
> that the bzr revision numbers don't support that.

That is correct.  The revision numbers assigned to particular
revisions in the context of one branch won't necessarily be the same
as the numbers in another branch.


> > I felt that you were mischaracterizing my _statement_ that "it's
> > exceedingly uncommon for [revnos] to change" as an _argument_ "it's
> > exceedingly uncommon for [revnos] to change".  The reality is that we
> > keep saying revnos don't change because git users keep saying "but what
> > if the revnos change?".
>
> OK.
>
> The original claim that sparked the discussion was that bzr has a
> "simple namespace" while git does not. We've been talking for quite a
> while here, and I still don't fully understand how these numbers are
> generated or what I can expect to happen to the numbers associated
> with a given revision as that revision moves from one repository to
> another. It's really not a simple scheme.

I can't say anything about the dotted revision numbers that have been
recently introduced to Bazaar, but I have definitely found the simple
numeric revision numbers for mainline revisions useful when using
Bazaar.  The revisions with these short revision numbers are generally
the ones I am most interested in when working on that branch.

It hasn't ever seemed a problem that those revisions no longer had
short revision numbers assigned to them when someone else merged my
branch.


> Meanwhile, I have been arguing that the "simple" revision numbers that
> bzr advertises have restrictions on their utility, (they can only be
> used with reference to a specific repository, or with reference to
> another that treats it as canonical). I _think_ I understand the
> numbers well enough to say that still.

Using Bazaar terminology, the revision numbers are specific to a
particular _branch_.  If I copy a branch from one repository to
another, its revision numbers will stay the same.  And conversely, two
branches in the same repository can have different revision numbers.
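
A minimal sketch of that idea, assuming a branch can be modelled as
nothing more than its ordered list of mainline revision IDs.  The
function name and the revision IDs are made up for illustration; this
is not Bazaar's actual API.

```python
def revno(mainline, revision_id):
    """Return the 1-based position of revision_id in a mainline, or None."""
    try:
        return mainline.index(revision_id) + 1
    except ValueError:
        # The revision may exist in the repository, but it is not on
        # this branch's mainline, so it has no simple number here.
        return None


# Hypothetical revision IDs; real ones are long, globally unique strings.
branch1 = ["rev-a", "rev-b", "rev-merge"]  # rev-merge merged rev-x from branch2
branch2 = ["rev-a", "rev-x"]
```

Here `revno(branch2, "rev-x")` is 2, while `revno(branch1, "rev-x")` is
`None` even though both branches share one repository.  Copying
`branch2` elsewhere copies the list, so its numbers are unchanged.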


> Compare that with the git names. The scheme really is easy to
> understand, (either the new user already understands cryptographic
> hashes, or else it's as easy as "a long string of digits that git
> assigns as the name"). The names have universal utility in time and
> space, (for definitions of the the universe larger than I will ever be
> able to observe anyway). And the natural inclination to abbreviate the
> a name when repeating it, (note the recent post with bzr UUIDs
> exhibiting the same inclination), doesn't make the names any less
> useful since the abbreviation alone will work most always.
>
> The naming in git really is beautiful and beautifully simple.

I don't think anyone is saying that universally unique names are bad.
But I also don't see a problem with using shorter names that only have
meaning in a local scope.

I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
that also a case of trading potential global uniqueness for
convenience when working in a local scope?
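
To make the trade-off concrete, here is a small sketch of prefix
lookup, assuming the set of known object names is available as a plain
list.  The function and the shortened hex strings are hypothetical and
only illustrate the principle, not GIT's implementation.

```python
def resolve_prefix(prefix, known_ids):
    """Return the single id that starts with prefix.

    Raises KeyError if nothing matches and ValueError if the prefix is
    ambiguous within known_ids.
    """
    matches = [oid for oid in known_ids if oid.startswith(prefix)]
    if not matches:
        raise KeyError(prefix)
    if len(matches) > 1:
        raise ValueError("ambiguous prefix: " + prefix)
    return matches[0]


# Made-up, shortened ids standing in for 40-digit SHA1 names.
ids = ["a1b2c3d4e5", "a1b2c9ffee", "77de01ab34"]
```

A prefix is only as unique as the set it is checked against:
`"a1b2c"` is ambiguous in `ids`, while `"a1b2c3"` resolves.  The same
prefix could become ambiguous once another repository's objects are
added.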


James.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 15:30                                           ` Aaron Bentley
@ 2006-10-20  3:14                                             ` Tim Webster
  2006-10-20  4:05                                               ` Aaron Bentley
  2006-10-20 10:44                                             ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Tim Webster @ 2006-10-20  3:14 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Matthieu Moy, Christian MICHON, Andreas Ericsson, bazaar-ng, git

On 10/19/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Tim Webster wrote:
> > First I want to say every SCM I know of sucks when it comes to tracking
> > configurations, simply because they don't record or restore file metadata,
> > like perms, ownership, and acl.
>
> Arch supports that kind of metadata.
>
> I believe SVN supports recording arbitrary file properties, so it's just
> a matter of applying those properties to the tree.

Yes, svn has arbitrary properties which can be manipulated.
They are not really intended for permissions, ownership, and acl.
Using the svn properties for this requires adding scm tools.
Also, svn does not allow files in the same directory to live in
multiple repos.

>
> > Somethings I like the SCM tools to handle. Personally I would like the

> > Collaborative document editing and white boarding are other requirements.
> > odf and svg are xml file formats. I would like to see an efficient
> > xml diff as part of the SCM core. Using mime types SCM tools can unzip
> > files, bundles, and use mime type information to the SCM core xml
> > diff, plain diff
> > as required.
>
> An XML diff/patch or merge will not handle ODF properly.  There's too
> much extra semantic information.

I have only experimented with xml diffs on odf files.
In my experience, xml diffs work fine on svg files.
For more information, please refer to
http://www.unibw.de/inf2/OO_VCS/oo_rcs_api.html


> > I think it is essential that the SCM core include
> > provisions for multiple repo partners.
>
> You mean multiple merge sources?

Yes, multiple merge sources are handy for collaborative document editing.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH 2/2] Remove unused index tracking code.
  2006-10-20  2:27                                                                   ` Jan Harkes
  2006-10-20  2:30                                                                     ` Junio C Hamano
@ 2006-10-20  3:36                                                                     ` Nicolas Pitre
  1 sibling, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-20  3:36 UTC (permalink / raw)
  To: Jan Harkes; +Cc: Linus Torvalds, Junio C Hamano, git

On Thu, 19 Oct 2006, Jan Harkes wrote:

> If we see a complete object it will remain complete. If we find a delta,
> and we have the base in the current repository it will be expanded to a
> complete object.
> When we get a delta that doesn't have a base in the
> current repository it will remain unresolved and is written out as a
> delta.

But the point of the whole exercise is actually to avoid unresolved 
deltas.  And you know if you have unresolved deltas only when the whole 
pack has been processed.

If the base object is not in the repository but it is in the pack 
_after_ the delta that needs it, you won't have resolved it.  If this is 
a thin pack with missing base objects for whatever reason you're 
screwed.

If the delta has its base object in both the repository _and_ in the 
pack but after the delta then you will have expanded the delta 
needlessly.

So your solution is suboptimal.

The optimal solution really consists of appending missing base objects 
to a thin pack in order to make it complete, or error out if those 
cannot be found.
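
A rough sketch of that approach, assuming a pack can be modelled as a
list of entries that each name an object and, for deltas, the base they
depend on.  `Entry`, `MissingBase` and `complete_thin_pack` are
hypothetical names, and `repo` is a plain set standing in for the local
object store; none of this is the actual index-pack code.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    """One pack entry; base is None for a full object."""
    oid: str
    base: Optional[str] = None


class MissingBase(Exception):
    """A delta's base is neither in the pack nor in the repository."""


def complete_thin_pack(entries, repo):
    """Return the entries plus any missing bases appended at the end."""
    # Which bases are missing is only known once the whole pack is seen,
    # so a base stored after its delta is still found here.
    in_pack = {e.oid for e in entries}
    appended = []
    for e in entries:
        if e.base is None or e.base in in_pack:
            continue
        if e.base not in repo:
            raise MissingBase(e.base)
        appended.append(Entry(e.base))  # stored as a full object
        in_pack.add(e.base)             # append each base only once
    return list(entries) + appended
```

Existing deltas are left untouched, so nothing is expanded needlessly,
and the position of a base relative to its delta does not matter.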


Nicolas

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 16:14                                           ` Matthieu Moy
@ 2006-10-20  3:40                                             ` Tim Webster
  0 siblings, 0 replies; 806+ messages in thread
From: Tim Webster @ 2006-10-20  3:40 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git

On 10/20/06, Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> "Tim Webster" <tdwebste@gmail.com> writes:
>
> > First I want to say every SCM I know of sucks when it comes to tracking
> > configurations, simply because they don't record or restore file metadata,
> > like perms, ownership, and acl.
>
> That's not a simple matter.
>
> Tracking ownership hardly makes sense as soon as you have two
> developers on the same project. What does it mean to checkout a file
> belonging to user foo and group bar on a system not having such user
> and group?
.
> That said, it can be interesting to have it, but disabled by default.

Yes I agree it should be disabled by default. And enabled based on the
local settings.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  3:14                                             ` Tim Webster
@ 2006-10-20  4:05                                               ` Aaron Bentley
  2006-10-21 12:30                                                 ` Jan Hudec
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20  4:05 UTC (permalink / raw)
  To: Tim Webster
  Cc: Christian MICHON, Andreas Ericsson, bazaar-ng, git, Matthieu Moy

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Webster wrote:
> On 10/19/06, Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>> I believe SVN supports recording arbitrary file properties, so it's just
>> a matter of applying those properties to the tree.
> 
> Yes, svn has arbitrary properties which can be manipulated.
> They are not really intended for permissions, ownership, and acl.
> Using the svn properties for this requires adding scm tools.

Agreed.  I think it's okay to require extra work to set the scm up to
handle configurations.

> Also, svn does not allow files in the same directory to live in
> multiple repos.

It would surprise me if many SCMs that support atomic commit also
support intermixing files from multiple repos in the same directory.

>> You mean multiple merge sources?
> 
> Yes, multiple merge sources are handy for collaborative document editing.

That's something I'd like for software development, too.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOEsO0F+nu1YWqI0RAo+6AJ9lzF0+O1I8rgkyCOdhsir1gjo0NQCfXEVV
EIsDmS+eR/7cHKQfmnPJRA4=
=g5jk
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
@ 2006-10-20  5:05                                             ` Linus Torvalds
  2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  9:57                                             ` Jakub Narebski
                                                               ` (3 subsequent siblings)
  4 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20  5:05 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Carl Worth, bazaar-ng, git, Jakub Narebski



On Thu, 19 Oct 2006, Aaron Bentley wrote:
> 
> I understand your argument now.  It's nothing to do with numbers per se,
> and all about per-branch namespaces.  Correct?

I don't know if that is what Carl's problem is, but yes, to somebody from 
the git world, it's totally insane to have the _same_ commit have ten 
different names just depending on which branch it was in.

In git-land, the name of a commit is the same in every branch.

Do you have something like

	gitk --all

in your graphical viewers? That one shows _all_ the branches of a 
repository, and how they relate to each other in git. How do you name your 
commits in such a viewer, since every branch has a _different_ name for 
the same commit?

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-20  2:09                         ` Horst H. von Brand
@ 2006-10-20  5:38                           ` Jan Hudec
  0 siblings, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-20  5:38 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Robert Collins, Petr Baudis, bazaar-ng, git

On Thu, Oct 19, 2006 at 11:09:49PM -0300, Horst H. von Brand wrote:
> Jan Hudec <bulb@ucw.cz> wrote:
> 
> [...]
> 
> > Reading this thread I came to think, that the revnos should be assigned
> > to _all_ revisions _available_, in order of when they entered the
> > repository (there are some possible variations I will mention below)
> > 
> >  - Such revnos would be purely local, but:
> >    - Current revnos are not guaranteed to be the same in different
> >      branches either.
> >    - They could be done so that mirror has the same revnos as the
> >      master.
> 
> Then they are almost useless (except for people working alone). You need to
> be able to talk about a particular commit with others working independently.

As they currently are, you can't either: currently it is only
guaranteed that the revnos will be the same in two branches with the
same current revision. When the current revisions differ, the
numbers may differ as well.

Moreover currently they can change for the same branch over time, while
with the alternate proposal they would not, so you could reliably say
revision 567 on foo.
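
To make the alternate proposal concrete, here is a minimal Python sketch of
arrival-order numbering. It is only one reading of the idea, not bzr code:
the class name LocalRevnos, its methods, and the revision ids are all
hypothetical, and it assumes a revision's parents always enter the
repository before the revision itself.

```python
class LocalRevnos:
    """Number revisions in the order they enter one repository."""

    def __init__(self):
        self._revno_of = {}   # revision id -> local number
        self._revid_of = []   # position (local number - 1) -> revision id

    def add(self, revid, parents=()):
        """Record an arriving revision and return its local number.

        A revision that is already known keeps the number it was given.
        """
        if revid in self._revno_of:
            return self._revno_of[revid]
        missing = [p for p in parents if p not in self._revno_of]
        if missing:
            raise ValueError(f"parents must arrive first: {missing}")
        self._revid_of.append(revid)
        self._revno_of[revid] = len(self._revid_of)
        return self._revno_of[revid]

    def lookup(self, revno):
        """Return the revision id that a local number refers to."""
        if revno < 1:
            raise IndexError("local revision numbers start at 1")
        return self._revid_of[revno - 1]


repo = LocalRevnos()
repo.add("rev-a")
repo.add("rev-b", parents=["rev-a"])
repo.add("rev-x", parents=["rev-a"])           # arrived via a pull
repo.add("rev-m", parents=["rev-b", "rev-x"])  # merge of the two lines
assert repo.add("rev-b", parents=["rev-a"]) == 2  # seen again: number unchanged
assert repo.lookup(4) == "rev-m"
```

Because a number is handed out once and never reused, an ancestor always
has a lower number than its descendants, and a later pull cannot renumber
anything that is already present.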

> >  - They would be easier to use than the dotted ones. What (at least as
> >    far as I understand) makes revnos easier to use than revids is, that
> >    you can remember few of them for short time while composing some
> >    operation. Ie. look up 2 or 3 revisions in the log and then do some
> >    command on them. And a 4 to 5-digit number like 10532 is easier to
> >    remember than something like 3250.2.45.86.
> 
> Probably. In git you can (mostly) get away with partial SHA-1's, BTW.

1) Partial sha-1 is still longer (starts being useful at 6 digits,
   usually you need 8)
2) Decimal numbers are easier to remember than hexadecimal ones.
3) The hashes are not ordered.

> >  - Their ordering would be an (arbitrary) superset of the partial
> >    ordering by descendance, ie. if revision A is ancestor of B, it would
> >    always have lower revno.
> >    - The intuition that lower revno means older revision would be always
> >      valid for related revisions and approximately valid for unrelated
> >      ones.
> >  - They would be *locally stable*. That is once assigned the revno would
> >    always mean the same revision in given branch (as determined by
> >    location, not tip).
> 
> Tip-relative is extremely useful: I wouldn't normally remember the current
> revision, but I'll probably often be talking about "the change before this
> one" and so on.

That's however separate issue. Negative numbers are tip-relative and
there are various prefixes in bzr (like before:, ancestor: etc.) for
relative revision addressing.

> >      - This is more than the current scheme can give, since now pull can
> >        renumber revisions.
> 
> Urgh. Get an update, and all your bearings change?

Currently yes. Currently pull changes the branch to be a mirror of the
pulled-from branch, including the way they are numbered.

> >  - They wouldn't make any branch special, so the objections Linus raised
> >    do not apply.
> 
> But the original branch /is/ special?

Some branches are usually special, but which they are may not
necessarily coincide with the left-parent lineage.

> >  - They would be the same as subversion and svk, and IIRC mercurial as
> >    well, use, so:
> >    - They would already be familiar to users coming from those systems.
> >    - They are known to be useful that way. In fact for svk it's the only
> >      way to refer to revisions and seems to work satisfactorily (though
> >      note that svk is not really suitable to ad-hoc topologies).
> 
> SVN is /centralized/, there it does make sense talking about (the one and
> only) history. In a distributed system, potentially each has a different

Did you notice that I also said svk and mercurial? They both *ARE*
distributed (well, svk has its limitations, but mercurial is really very
similar to git).

> history, and they are intertwined.
> 
> Not at all useful.

There are no global persistent revision numbers in a distributed system.
There can't be. But numbers with limited scope can still be really
useful. The question is what that scope should be.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  5:05                                             ` Linus Torvalds
@ 2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:16                                                 ` Petr Baudis
  0 siblings, 2 replies; 806+ messages in thread
From: Lachlan Patrick @ 2006-10-20  7:47 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Linus Torvalds wrote:
> 
> On Thu, 19 Oct 2006, Aaron Bentley wrote:
>> I understand your argument now.  It's nothing to do with numbers per se,
>> and all about per-branch namespaces.  Correct?
> 
> I don't know if that is what Carl's problem is, but yes, to somebody from 
> the git world, it's totally insane to have the _same_ commit have ten 
> different names just depending on which branch it was in.
> 
> In git-land, the name of a commit is the same in every branch.

I've been following the git-vs-bzr discussion, and I'd like to ask a
question (being new to both bzr and git). How does git disambiguate SHA1
hash collisions? I think git has an alternative way to name revisions
(can someone please explain it in more detail, I've seen <ref>~<n>
mentioned only in passing in this thread). It seems to me collisions are
a good argument in favour of having two independent naming schemes, so
that you're not solely relying on hashes being unique.

A strong argument is that a global namespace based on hashes of data is
ideal because the names are generated from the data being named, and
therefore are immutable. Same data => same name for that data, always
and forever, which is desirable when merging named data from many
sources. But the converse isn't true: one name does not necessarily map
to only that data. Have I misunderstood? Is this a problem?
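
For reference, the "same data => same name" property can be checked with a
few lines of Python. This is a minimal sketch, assuming git's usual blob
naming (SHA-1 over a "blob <size>" header, a NUL byte, and the content);
the function name git_blob_id is only illustrative.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the SHA-1 name git gives to file content (a blob)."""
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The name depends only on the bytes, not on who computes it or where.
assert git_blob_id(b"hello\n") == git_blob_id(b"hello\n")
assert git_blob_id(b"hello\n") != git_blob_id(b"hello!\n")
print(git_blob_id(b""))  # name of the empty blob
```

The converse direction is the open question here: nothing in the
construction itself rules out two different inputs with the same name, it
only makes one astronomically unlikely to turn up by accident.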

Ta,
Loki

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
  2006-10-17 10:30             ` Johannes Schindelin
  2006-10-17 19:51             ` Aaron Bentley
@ 2006-10-20  8:26             ` James Henstridge
  2006-10-20 10:19               ` Jakub Narebski
  2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-20  8:26 UTC (permalink / raw)
  To: Sean; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On 17/10/06, Sean <seanlkml@sympatico.ca> wrote:
> > - - you can use a checkout to maintain a local mirror of a read-only
> >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?  So it's not a mirror of a branch, but just a checkout of the
> branch head?

There are two forms of checkout: a normal checkout which contains the
complete history of the branch, and a lightweight checkout, which just
has a pointer back to the original location of the history.

In both cases, a "bzr commit" invocation will commit changes to the
remote location.  In general, you only want to use a lightweight
checkout when there is a fast, reliable connection to the branch (e.g.
if it is on the local file system, or local network).

Aaron would be talking about a normal (heavyweight) checkout here.
With a heavyweight checkout, you can do pretty much anything without
access to the branch.  In contrast, almost all operations on a
lightweight checkout need access to the branch.

James.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  7:47                                               ` Lachlan Patrick
@ 2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:13                                                   ` Petr Baudis
  2006-10-20 11:09                                                   ` Jakub Narebski
  2006-10-20 10:16                                                 ` Petr Baudis
  1 sibling, 2 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20  8:38 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Hi,

On Fri, 20 Oct 2006, Lachlan Patrick wrote:

> How does git disambiguate SHA1 hash collisions?

It does not. You can fully expect the universe to go down before that 
happens.

The only reasonable worry is about SHA-1 being broken some time in future, 
i.e. being able to construct a malign version of some source code _which 
has the same hash_. There were plenty of discussions about that; Please 
search the mailing list. (The consensus was that those do not matter,
because an existing object will _never_ be overwritten by a fetch, so you 
would not get that invalid object anyway.)
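
The "never overwritten" rule can be pictured with a toy object store. This
is a sketch of the policy only, not git's actual code; the class
ObjectStore and the id strings are made up for illustration.

```python
class ObjectStore:
    """Toy content-addressed store: the first object seen for an id wins."""

    def __init__(self):
        self._objects = {}

    def fetch(self, object_id, data):
        """Accept incoming data unless object_id is already present.

        Returns True if the object was stored, False if it was ignored.
        """
        if object_id in self._objects:
            return False
        self._objects[object_id] = data
        return True

    def get(self, object_id):
        return self._objects[object_id]


store = ObjectStore()
store.fetch("id-1", b"good source code")
# A later object claiming the same id is dropped, whatever it contains.
assert store.fetch("id-1", b"malign source code") is False
assert store.get("id-1") == b"good source code"
```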

Hth,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 10:23           ` Sean
                               ` (2 preceding siblings ...)
  2006-10-20  8:26             ` James Henstridge
@ 2006-10-20  8:56             ` Erik Bågfors
  3 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-20  8:56 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Jakub Narebski

> > - - you can use a checkout to maintain a local mirror of a read-only
> >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
>
> I'm not sure what you mean here.  A bzr checkout doesn't have any history
> does it?  So it's not a mirror of a branch, but just a checkout of the
> branch head?

In bzr there are two different kinds of checkouts.  One is called a
lightweight checkout and that's really a "normal" checkout in the way
svn for example does it.  In this mode, you have the branch remotely
and only the working tree locally.  So it's just a checkout of the
branch head (or any other revision if using -r when doing the
checkout).

Then there are non-lightweight checkouts, i.e. heavyweight checkouts.
These are the default type.  A heavyweight checkout is in fact a full
branch locally, but it is "bound" to the remote branch.  What this
means is that all commands such as diff/status/log/etc can be done
locally. So it's really quick.

It acts the same as a lightweight checkout in most regards, so when I
run "bzr update" it actually pulls from the remote branch, and when I
run "bzr commit" it commits the same revision in both the remote
branch and the local branch. It does this in one transaction so one
can't succeed while the other fails (they would both fail in that case).

What this also gives you is that when you want to clone the branch,
you don't need to go to the remote branch to get the revisions and
also, when being offline, you can commit locally.

Committing locally is a very cool feature in my mind.  If you work in
a centralized manner with checkouts, you normally commit directly to
the central branch, but when you are offline, that will fail (of
course :) ).  So what you can do then is to run "bzr commit --local"
to commit only to your local checkout branch, then when you get online
again you can run "bzr update".  In this case the update will take any
new commits that have been made while you were away, pull them into
your local branch, and turn your local commits into something that has
been merged into the "checkout".

I find this REALLY useful.

Don't know if that made sense, here it is in commands.

$ bzr checkout t p
$ cd p
$ echo hej >> hosts
$ bzr commit --local -m 'offline'
$ echo hej >> hosts
$ bzr commit --local -m 'offline 2'

Now I get back, someone has committed new stuff... I run bzr update
$ bzr update
All changes applied successfully.
Updated to revision 2.
Your local commits will now show as pending merges with 'bzr status',
and can be committed with 'bzr commit'.
$ bzr status
modified:
  hosts
pending merges:
  Erik Bågfors 2006-10-20 offline 2
    Erik Bågfors 2006-10-20 offline
$ bzr commit -m 'my offline stuff'
modified hosts
Committed revision 3.

$ bzr log -r-1
------------------------------------------------------------
revno: 3
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: p
timestamp: Fri 2006-10-20 10:51:08 +0200
message:
  my offline stuff
    ------------------------------------------------------------
    merged: erik@bagfors.nu-20061020084949-8bc43db8f5cd449b
    committer: Erik Bågfors <erik@bagfors.nu>
    branch nick: p
    timestamp: Fri 2006-10-20 10:49:49 +0200
    message:
      offline 2
    ------------------------------------------------------------
    merged: erik@bagfors.nu-20061020084945-13e5093f98c0c380
    committer: Erik Bågfors <erik@bagfors.nu>
    branch nick: p
    timestamp: Fri 2006-10-20 10:49:45 +0200
    message:
      offline

I think that bzr really allows you to work well in a centralized
environment as well as a distributed one, which is one of the things I
like best about bzr.

Regards,
Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 22:00                   ` Sean
  2006-10-17 22:44                     ` Aaron Bentley
@ 2006-10-20  9:43                     ` Matthieu Moy
  2006-10-24  6:02                       ` Lachlan Patrick
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-20  9:43 UTC (permalink / raw)
  To: Sean; +Cc: Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Sean <seanlkml@sympatico.ca> writes:

> On Tue, 17 Oct 2006 17:27:44 -0400
> Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
>
>> Bzr has plugin autoloading, Protocol plugins, Repository format plugins,
>> and more.  Because Python supports monkey-patching, a plugin can change
>> absolutely anything.
>
> But really why does any of that matter?  This is the open source world.
> We don't need plugins to extend features, we just add the feature to
> the source.  The example I asked about earlier is a case in point. 
> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
> was implemented as a command without any issue at all,

The plugin vs. core feature question is not a technical problem. The code for a
plugin and for a core functionality will roughly be the same, but in a
different file.

There can be many reasons why you want to implement something as a
plugin:

* This is project-specific, upstream is not interested (for example,
  bzr has a plugin to submit a merge request to a robot, it will
  probably never come in the core).

* The feature is not mature enough, so you don't want to merge it in
  upstream, but you want to make it available to people without
  patching (for example, "bzr uncommit" was once in the bzrtools
  plugin, and finally landed in upstream).

* The feature you're adding is only of use to a small subset of
  users. You don't want to pollute, in particular "bzr help commands"
  with it, especially not to disturb beginners. I've been arguing in
  favor of a configuration option to hide commands from "bzr help
  commands" instead, but nobody seemed interested.

* Explicit divergent points of view between the implementor of the
  plugin and upstream. That avoids a fork. I don't remember any such
  case with bzr.

I'd compare bzr's plugins to Firefox extensions. Geeks used to like
the big Mozilla-with-tons-of-config-options, but
Firefox-with-only-the-most-relevant-features is the one which allowed
a wide adoption by non-geeks. Still, geeks can customize their
browser, and add features without having to wait for Mozilla Foundation
to incorporate it in upstream.

Now, I don't know git well enough to know whether the way it is extensible
allows all of the above, but bzr's plugin system is quite good at that.
At the time git was almost exclusively used by the kernel, you didn't
have all those problems since you targeted only one community, but I
guess you already had some needs for flexibility.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  2:53                                           ` James Henstridge
@ 2006-10-20  9:51                                             ` Jakub Narebski
  2006-10-20 10:42                                               ` James Henstridge
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20  9:51 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

James Henstridge wrote:
> On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
>> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

>>>             Additionally, the new mainline can keep a mirror of the
>>> abandoned mainline in its repository, because there are virtually no
>>> additional storage requirements to doing so.
>>
>> And this part I don't understand. I can understand the mainline
>> storing the revisions, but I don't understand how it could make them
>> accessible by the published revision numbers of the "abandoned"
>> line. And that's the problem.
> 
> With this sort of setup, I would publish my branches in a directory
> tree like this:
> 
>     /repo
>         /branch1
>         /branch2
> 
> I make "/repo" a Bazaar repository so that it stores the revision data
> for all branches contained in the directory (the tree contents,
> revision meta data, etc).

And here we have a feature which is as far as I see unique to git,
namely to have persistent branches with _separate namespace_. It means
that we can have hierarchical branch names (including names like
"remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
have to guess where repository name ends and branch name begins.

The idea of "branches (and tags) as directories" was, if I understand
it correctly, introduced by Subversion, and judging from the troubles
with git-svn (stemming from the fact that the division between
project name and branch name is a matter of _convention_) it is at
least slightly brain-damaged.
 
> The "/repo/branch1" essentially just contains a list of mainline
> revision IDs that identify the branch.  This could probably be just
> store the head revision ID, but there are some optimisations that make
> use of the linear history here.
> 
> If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
> be in the case of forked then merged projects), then all its
> revision data will already be in the repository when branch1 was
> imported.  The only cost of keeping the branch around (and publishing
> it) is the list of revision IDs in its mainline history.
> 
> For similar reasons, the cost of publishing 20 related Bazaar branches
> on my web server is generally not 20 times the cost of publishing a
> single branch.
> 
> I understand that you get similar benefits by a GIT repository with
> multiple head revisions.

You can get similar benefits by a GIT repository with shared object
database using the alternates mechanism. And that is usually preferred
over storing unrelated branches, i.e. branches pointing to disconnected
DAGs (separate trees in BK terminology) of revisions, if that is what you
mean by multiple head revisions (because in GIT there is no notion of a
"mainline" branch, only of the current (HEAD) branch).


>>>> But for these communications, revision numbers will not provide
>>>> historically stable values that can be used.
>>>
>>> They certainly can.
>>>
>>> The coder says "I've put up a branch at http://example.com/bzr/feature.
>>>  In revision 5, I started work on feature A.  I finished work in
>>> revision 6.  But then I had to fix a related bug in revision 7."
>>
>> "I've put this branch up" isn't historically stable...
> 
> With the repository structure mentioned above, the cost of publishing
> multiple branches is quite low.  If I continue to work on the project,
> then there is no particular bandwidth or disk space reasons for me to
> cut off access to my old branches.
> 
> For similar reasons, it doesn't cost me much to mirror other people's
> related branches if I really care about them.

But the revision number in this case _changes_. It only goes from 7 to
branch:7, but it still changes.

[...]
>> The naming in git really is beautiful and beautifully simple.
> 
> I don't think anyone is saying that universally unique names are bad.
> But I also don't see a problem with using shorter names that only have
> meaning in a local scope.
> 
> I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
> that also a case of trading potential global uniqueness for
> convenience when working in a local scope?

Emphasis on _potential_. A SHA1 id abbreviated to 6 characters might
not be unique in a larger project, but for example the chance that a
SHA1 id abbreviated to 7 or 8 characters is not unique is really low.
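
A back-of-the-envelope estimate of how low that chance is, as a Python
sketch. It assumes object ids behave like uniformly random hex strings; the
function name and the object counts below are only illustrative.

```python
def ambiguity_probability(num_objects: int, hex_digits: int) -> float:
    """Chance that one given abbreviated id also matches another object.

    Each of the other objects shares a hex_digits-long prefix with the
    given id with probability 16 ** -hex_digits.
    """
    others = max(num_objects - 1, 0)
    return 1.0 - (1.0 - 16.0 ** -hex_digits) ** others


for digits in (6, 7, 8):
    for count in (10_000, 1_000_000):
        p = ambiguity_probability(count, digits)
        print(f"{digits} hex digits, {count} objects: {p:.6f}")
```

Each extra hex digit cuts the chance by roughly a factor of 16, which is
why going from 6 to 8 characters makes such a difference in a large
repository.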


Yet another analogy:

SHA1 identifiers of commits (and not only commits) can be compared
to Message-Ids of Usenet messages, while revision numbers can be compared
to the Xref number of a Usenet message, which if I understand correctly is
unique only for a given news server. But Message-Ids cannot be shortened
meaningfully like SHA1 ids can; nevertheless they are used in communication
without any problems. Even if the namespace is not simple ;-)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
@ 2006-10-20  9:57                                             ` Jakub Narebski
  2006-10-20 10:02                                               ` Matthieu Moy
  2006-10-20 10:45                                               ` James Henstridge
  2006-10-20 11:00                                             ` Jakub Narebski
                                                               ` (2 subsequent siblings)
  4 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20  9:57 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
>> The naming in git really is beautiful and beautifully simple.
> 
> Well, you've got to admit that those names are at least superficially
> ugly. 

If you want a pretty name, you tag it. Tags are exchanged during
fetch/push operations. And you can have pretty names for revisions
like v1.4.3
 
>> It's not monotonically increasing from one revision to the next, but
>> I've never found that to be an issue. Of course, we do still use our
>> own "simple" names for versioning the releases and snapshots of
>> software we manage with git, and that's where being able to easily
>> determine "newer" or "older" by simple numerical examination is
>> important. I've honestly never encountered a situation where I was
>> handed two git sha1 sums and wished that I could do the same thing.
> 
> What's nice is being able see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

Huh? If you want to see what changes have been introduced by commit
c3424aebbf722c1f204931bf1c843e8a103ee143, you just do

# git diff c3424aebbf722c1f204931bf1c843e8a103ee143

(or better "git show" instead of "git diff" or "git diff-tree").
If you give only one commit (only one revision) git automatically
gives diff to its parent(s).


By the way, is referring to a revision by its revno _fast_?
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  9:57                                             ` Jakub Narebski
@ 2006-10-20 10:02                                               ` Matthieu Moy
  2006-10-20 10:45                                                 ` Andy Whitcroft
  2006-10-20 10:45                                               ` James Henstridge
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-20 10:02 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth, git

Jakub Narebski <jnareb@gmail.com> writes:

> Huh? If you want to see what changes have been introduced by commit
> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>
> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143

How does git choose which ancestor to use if this revision has more
than one in this case?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  8:38                                                 ` Johannes Schindelin
@ 2006-10-20 10:13                                                   ` Petr Baudis
  2006-10-20 11:09                                                   ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 10:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: bazaar-ng, git

  Hi,

Dear diary, on Fri, Oct 20, 2006 at 10:38:48AM CEST, I got a letter
where Johannes Schindelin <Johannes.Schindelin@gmx.de> said that...
> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> 
> > How does git disambiguate SHA1 hash collisions?
> 
> It does not. You can fully expect the universe to go down before that 
> happens.
> 
> The only reasonable worry is about SHA-1 being broken some time in future, 
> i.e. being able to construct a malign version of some source code _which 
> has the same hash_. There were plenty of discussions about that; Please 
> search the mailing list. (The consensus was that those do not matter,
> because an existing object will _never_ be overwritten by a fetch, so you 
> would not get that invalid object anyway.)

  well, that's somewhat a bold statement, since when you have a way to
fabricate malicious objects, you probably can socially engineer to have
it distributed to a large portion of repositories if you try hard
enough. Or you hack kernel.org and replace the object. Who knows.

  But the thing is that no one has come any closer to this kind of attack
at all. Currently known attacks are that you can relatively fast (which
doesn't mean "5 minutes"; I think that in case of SHA1 the complexity is
still huge, just smaller than intended, but I may remember wrong; you
can get a MD5 collision of this kind within one minute on a standard
notebook) create a _pair_ of objects sharing the same hash, where both
objects contain a big binary blob. So you would first have to engineer
to have one of those objects accepted officially, then engineer the
malicious one getting in. Generating an object that hashes to a
predetermined value is a much harder problem and AFAIK there's not much
progress in breaking this.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  7:47                                               ` Lachlan Patrick
  2006-10-20  8:38                                                 ` Johannes Schindelin
@ 2006-10-20 10:16                                                 ` Petr Baudis
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 10:16 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 09:47:16AM CEST, I got a letter
where Lachlan Patrick <loki@research.canon.com.au> said that...
> I think git has an alternative way to name revisions
> (can someone please explain it in more detail, I've seen <ref>~<n>
> mentioned only in passing in this thread).

This is just a notion that lets you point to revisions relative to a
given id. <id>~<n> means n-th ancestor of the given commit.
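
As a small illustration, here is a Python sketch of that lookup. It assumes
that <id>~<n> follows the first parent at each step; the parent table and
the one-letter commit names are invented for the example.

```python
def nth_ancestor(parents, commit, n):
    """Follow the first parent n times, starting from commit."""
    for _ in range(n):
        if not parents[commit]:
            raise ValueError(f"{commit} has no parent")
        commit = parents[commit][0]
    return commit


# D is a merge of C (first parent) and X; history runs A <- B <- C <- D.
parents = {"A": [], "B": ["A"], "X": ["A"], "C": ["B"], "D": ["C", "X"]}

assert nth_ancestor(parents, "D", 0) == "D"
assert nth_ancestor(parents, "D", 1) == "C"
assert nth_ancestor(parents, "D", 3) == "A"
```

The resulting name is relative to the starting id, so it moves whenever a
branch head used as the starting point moves.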

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  8:26             ` James Henstridge
@ 2006-10-20 10:19               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:19 UTC (permalink / raw)
  To: James Henstridge; +Cc: Sean, Aaron Bentley, Linus Torvalds, bazaar-ng, git

James Henstridge wrote:
> On 17/10/06, Sean <seanlkml@sympatico.ca> wrote:
> > > - - you can use a checkout to maintain a local mirror of a read-only
> > >   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> >
> > I'm not sure what you mean here.  A bzr checkout doesn't have any history
> > does it?  So it's not a mirror of a branch, but just a checkout of the
> > branch head?
> 
> There are two forms of checkout: a normal checkout which contains the
> complete history of the branch, and a lightweight checkout, which just
> has a pointer back to the original location of the history.
> 
> In both cases, a "bzr commit" invocation will commit changes to the
> remote location.  In general, you only want to use a lightweight
> checkout when there is a fast, reliable connection to the branch (e.g.
> if it is on the local file system, or local network).

So the "lightweight checkout" is the equivalent of the "lazy clone" we have
discussed at length on the git mailing list (without any resulting code,
unfortunately). The sticking point was how to do this fast, without the
need for a fast, reliable connection to the repository it was cloned from:
for example, whether to keep fetched objects in some kind of cache, or even
in the "lightweight checkout"/"lazy clone" repository database.

If the repository we do a "lightweight checkout"/"lazy clone" from is on
the local file system (perhaps a network file system), then we can use
the alternates mechanism (git clone -l -s). That's why "lazy clone" was
sometimes named "remote alternates".
 
> Aaron would be talking about a normal (heavyweight) checkout here.
> With a heavyweight checkout, you can do pretty much anything without
> access to the branch.  In contrast, almost all operations on a
> lightweight checkout need access to the branch.

We have a terminology conflict here. Bazaar-NG "pull" and "merge" vs.
GIT "fetch", "pull" and "merge"; Bazaar-NG "checkout" vs. GIT "clone"
and "checkout".

In GIT "clone" is what is used to copy whole repository, "checkout"
is what is used to extract given/current branch to [given] working area.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19 19:01             ` Nathaniel Smith
@ 2006-10-20 10:32               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:32 UTC (permalink / raw)
  To: git

Nathaniel Smith wrote:

> Aaron Bentley <aaron.bentley <at> utoronto.ca> writes:
>
>> Bazaar also supports multiple unrelated branches in a repository, as
>> does CVS, SVN (depending how you squint), Arch, and probably Monotone.
> 
> It's quite common in Monotone.  You could probably do it in Mercurial as well,
> though I don't know that anyone does.  SVK definitely does it (since each user
> has a single repo that's shared by all the projects they work on).

I think that GIT's separation of the root, repository, and branch
namespaces is why there are so many calls for adding subproject
support to GIT; people want to switch to GIT literally, for example
by putting everything in one large repository.

In GIT there is no concept of root, like in CVS or SVN. You can
put repository anywhere. By default GIT looks for repository 
in current directory or one of its parents; otherwise you have to
provide location of repository either by using GIT_DIR environment
variable, or by using --git-dir option to git wrapper.

And the branch namespace is totally separate. There are some
restrictions on branch names (caused by notation GIT uses, for
example <branch>^ means [first] parent of commit given by <branch>),
but very few. Branch names can be hierarchical, like "jc/diff".

So there is no "store everything in URL/path" of
  /root/repo/branch
notation in GIT.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-19  8:58                                     ` Andreas Ericsson
  2006-10-19  9:10                                       ` Matthieu Moy
  2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
@ 2006-10-20 10:40                                       ` Jakub Narebski
  2006-10-20 13:36                                         ` Shawn Pearce
  2006-10-21 12:30                                         ` Matthew D. Fuller
  2 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:40 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Andreas Ericsson wrote:

> Christian MICHON wrote:
>
>> close to 200 posts on the bzr-git war!
>> is this the right place (git mailing list) to discuss future
>> features of bzr ?
>> 
> 
> Perhaps not, but the tone is friendly (mostly), the patience of the 
> bazaar people seems infinite and lots of people seem to be having fun 
> while at the same time learning a thing or two about a different SCM.
> Best case scenario, both git and bazaar come out of the discussion as 
> better tools. If there would never be any cross-pollination, git 
> wouldn't have half the features it has today.

And it certainly helps to explain user-visible differences between
Bazaar-NG and GIT; I'd like to put a ComparisonWithBazaarNG page on
GitWiki (http://git.or.cz/gitwiki/) some time soon, in addition
to the ComparisonWithMercurial page I have meant to add for some time
(stemming from discussion on the #revctrl channel on FreeNode), and in
addition to the existing GitSvnComparison page on GitWiki.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-20  9:51                                             ` Jakub Narebski
@ 2006-10-20 10:42                                               ` James Henstridge
  2006-10-20 13:17                                                 ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-20 10:42 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> James Henstridge wrote:
> > On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
> >> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:
>
> >>>             Additionally, the new mainline can keep a mirror of the
> >>> abandoned mainline in its repository, because there are virtually no
> >>> additional storage requirements to doing so.
> >>
> >> And this part I don't understand. I can understand the mainline
> >> storing the revisions, but I don't understand how it could make them
> >> accessible by the published revision numbers of the "abandoned"
> >> line. And that's the problem.
> >
> > With this sort of setup, I would publish my branches in a directory
> > tree like this:
> >
> >     /repo
> >         /branch1
> >         /branch2
> >
> > I make "/repo" a Bazaar repository so that it stores the revision data
> > for all branches contained in the directory (the tree contents,
> > revision meta data, etc).
>
> And here we have a feature which is as far as I see unique to git,
> namely to have persistent branches with _separate namespace_. It means
> that we can have hierarchical branch names (including names like
> "remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
> have to guess where repository name ends and branch name begins.

With the above layout, I would just type:
    bzr branch http://server/repo/branch1

This command behaves identically whether the repository data is in
/repo or in /repo/branch1.  Someone pulling from the branch doesn't
have to care what the repository structure is.  Having a separate
namespace for branch names only really makes sense if the user needs
to care about it.

As for hierarchical names, there is nothing stopping you from using
deeper directory structures with Bazaar too.  Bazaar just checks each
successive parent directory until it finds a repository for the branch.


> The idea of "branches (and tags) as directories" was if I understand
> it correctly introduced by Subversion, and from what can be seen from
> troubles with git-svn (stemming from the fact that division between
> project name and branch name is the matter of _convention_) at least
> slightly brain-damaged.

I think you are a bit confused about how Bazaar works here.  A Bazaar
repository is a store of trees and revision metadata.  A Bazaar branch
is just a pointer to a head revision in the repository.  As you can
probably guess, the data for the branch is a lot smaller than the data
for the repository.

You can store the repository and branch in the same directory to get a
standalone branch.  The layout I described above has a repository in a
parent directory, shared by multiple branches.

If you are comparing Subversion and Bazaar, a Bazaar branch shares
more properties with a full Subversion repository rather than a
Subversion branch.


> > The "/repo/branch1" essentially just contains a list of mainline
> > revision IDs that identify the branch.  This could probably just
> > store the head revision ID, but there are some optimisations that make
> > use of the linear history here.
> >
> > If the ancestry of "/repo/branch2" is a subset of branch1 (as it might
> > be in the case of forked-then-merged projects), then all its
> > revision data will already be in the repository when branch1 was
> > imported.  The only cost of keeping the branch around (and publishing
> > it) is the list of revision IDs in its mainline history.
> >
> > For similar reasons, the cost of publishing 20 related Bazaar branches
> > on my web server is generally not 20 times the cost of publishing a
> > single branch.
> >
> > I understand that you get similar benefits by a GIT repository with
> > multiple head revisions.
>
> You can get similar benefits with GIT repositories that share an object
> database through the alternates mechanism. And that is usually preferred
> over storing unrelated branches, i.e. branches pointing to disconnected
> DAGs of revisions (separate trees in BK terminology), if that is what
> you mean by multiple head revisions (because in GIT there is no notion
> of a "mainline" branch, only of the current (HEAD) branch).

I may have got the git terminology wrong. I was trying to draw
parallels between the .git/refs/... files in a git repository and the
way multiple branches can be stored in a Bazaar repository.

I am not claiming that you'll get bandwidth or disk space benefits for
storing unrelated branches in a single Bazaar repository.  But if the
branches are related, then there will be space savings (which is what
the great-grandparent post was asking about).


> >>>> But for these communications, revision numbers will not provide
> >>>> historically stable values that can be used.
> >>>
> >>> They certainly can.
> >>>
> >>> The coder says "I've put up a branch at http://example.com/bzr/feature.
> >>>  In revision 5, I started work on feature A.  I finished work in
> >>> revision 6.  But then I had to fix a related bug in revision 7."
> >>
> >> "I've put this branch up" isn't historically stable...
> >
> > With the repository structure mentioned above, the cost of publishing
> > multiple branches is quite low.  If I continue to work on the project,
> > then there is no particular bandwidth or disk space reasons for me to
> > cut off access to my old branches.
> >
> > For similar reasons, it doesn't cost me much to mirror other people's
> > related branches if I really care about them.
>
> But the revision number in this case _changes_. It is from 7 to
> branch:7 but still it changes somewhat.

A revision number only has meaning in the context of a branch.  If
I mirror a branch, the revision numbers in the context of each will
refer to the same revision IDs.

I am not sure what sort of distinction you are trying to draw.


> >> The naming in git really is beautiful and beautifully simple.
> >
> > I don't think anyone is saying that universally unique names are bad.
> > But I also don't see a problem with using shorter names that only have
> > meaning in a local scope.
> >
> > I've noticed some people using abbreviated SHA1 sums with GIT.  Isn't
> > that also a case of trading potential global uniqueness for
> > convenience when working in a local scope?
>
> Emphasis on _potential_. A SHA1 id abbreviated to 6 characters might
> not be unique in a larger project, but for example the chance that a
> SHA1 id abbreviated to 7 or 8 characters is not unique is really low.

My point was that by shortening the IDs with GIT, you are trading
global uniqueness (i.e. the identifier may clash with one found in a
different context) for the convenience of shorter identifiers.

Provided you know that the tradeoff is being made, it isn't generally
much of a problem.  I agree that the ability to pick how much of a
tradeoff is made by altering the length of the identifier is a nice
property of GIT.
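
The length-versus-uniqueness tradeoff can be put into rough numbers.
The sketch below assumes that object IDs behave like uniformly random
hex strings, and estimates the chance that one given abbreviation also
matches some other object in the same repository. The function name
and the object count of 100000 are made up for illustration.

```python
import math


def ambiguity_probability(num_objects: int, hex_digits: int) -> float:
    """Chance that a given abbreviated ID also matches another object.

    Assumes IDs are uniformly random, so each of the other
    (num_objects - 1) objects shares the prefix with probability
    1 / 16**hex_digits, independently of the rest.
    """
    space = 16 ** hex_digits
    # 1 - (1 - 1/space)**(n - 1), computed stably for very small 1/space.
    return -math.expm1((num_objects - 1) * math.log1p(-1.0 / space))


if __name__ == "__main__":
    for digits in (6, 7, 8, 40):
        print(digits, ambiguity_probability(100_000, digits))
```

Each extra hex digit cuts the estimate by about a factor of 16, which
is why going from 6 to 8 characters makes such a difference.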


> Yet another analogy:
>
> SHA1 identifiers of commits (and not only commits) can be compared
> to Message-Ids of Usenet messages, while revision numbers can be compared
> to the Xref number of a Usenet message, which if I understand correctly is
> unique only for a given news server. But Message-Ids cannot be shortened
> meaningfully like SHA1 ids can; nevertheless they are used in communication
> without any problems. Even if the namespace is not simple ;-)

I can't say I ever used usenet much, so can't comment too much.  But
from your description, a (server, xref) tuple could be used to look up
the unique identifier in a similar way to how you can do so in Bazaar
with a (branch_url, revno) tuple.

James.


* Re: VCS comparison table
  2006-10-19 15:30                                           ` Aaron Bentley
  2006-10-20  3:14                                             ` Tim Webster
@ 2006-10-20 10:44                                             ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:44 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Aaron Bentley wrote:

>> It would be nice if the SCM tools included rss feeds for communicating zip
>> patch bundles.
> 
> The bzr "webserve" plugin provides rss feeds.

Git's "gitweb" web interface (in the git.git repo for some time now)
provides OPML and RSS feeds.


* Re: VCS comparison table
  2006-10-20 10:02                                               ` Matthieu Moy
@ 2006-10-20 10:45                                                 ` Andy Whitcroft
  0 siblings, 0 replies; 806+ messages in thread
From: Andy Whitcroft @ 2006-10-20 10:45 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Jakub Narebski, Aaron Bentley, Andreas Ericsson, Linus Torvalds,
	Carl Worth, bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Huh? If you want to see what changes have been introduced by commit
>> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>>
>> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
> 
> How does git choose which ancestor to use if this revision has more
> than one in this case?

Well, if there is more than one parent, then there is more than one
diff.  For instance, this is a merge commit which I asked to 'see'.

This gets shown in the combined diff format, showing the results of the
conflict resolution.

diff --cc this
index fbbafbf,10c8337..43b7af0
--- a/this
+++ b/this
@@@ -1,3 -1,3 +1,4 @@@
  1
+ 2a
 +2b
  3

If you want to know each individual diff in a more 'standard' form you
can ask about the parents specifically.

apw@pinky$ git diff HEAD^1..
diff --git a/this b/this
index fbbafbf..43b7af0 100644
--- a/this
+++ b/this
@@ -1,3 +1,4 @@
 1
+2a
 2b
 3

apw@pinky$ git diff HEAD^2..
diff --git a/bar b/bar
new file mode 100644
index 0000000..8dc5f23
--- /dev/null
+++ b/bar
@@ -0,0 +1 @@
+this that other
diff --git a/this b/this
index 10c8337..43b7af0 100644
--- a/this
+++ b/this
@@ -1,3 +1,4 @@
 1
 2a
+2b
 3


* Re: VCS comparison table
  2006-10-20  9:57                                             ` Jakub Narebski
  2006-10-20 10:02                                               ` Matthieu Moy
@ 2006-10-20 10:45                                               ` James Henstridge
  2006-10-20 12:01                                                 ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-20 10:45 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> > What's nice is being able to see the revno 753 and knowing that "diff -r
> > 752..753" will show the changes it introduced. Checking the revno on a
> > branch mirror and knowing how out-of-date it is.
>
> Huh? If you want to see what changes have been introduced by commit
> c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
>
> # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
>
> (or better "git show" instead of "git diff" or "git diff-tree").
> If you give only one commit (only one revision) git automatically
> gives diff to its parent(s).

If a revision has multiple parents, what does it diff against in this
case?  Do you get one diff against each parent revision?

James.


* Re: VCS comparison table
  2006-10-19 19:16                                           ` Junio C Hamano
@ 2006-10-20 10:51                                             ` Jakub Narebski
  2006-10-20 15:58                                               ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:51 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
>> The other big difference is being able to do merges in seconds. The 
>> biggest cost of doing a big merge these days seems to literally be 
>> generating the diffstat of the changes at the end (which is purely a UI 
>> issue, but one that I find so important that I'll happily take the extra 
>> few seconds for that, even if it sometimes effectively doubles the 
>> overhead).
> 
> An interesting effect on this is when people have a column for
> merge performance in a SCM comparison table, they would include
> time to run the diffstat as part of the time spent for merging
> when they fill in the number for git, but not for any other SCM.

So if you want to compare merge performance with other SCMs, you should
either add the time to run diffstat for the other SCMs, or subtract the
time to run "git diff-tree --stat".

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-19 23:01                                       ` Aaron Bentley
  2006-10-19 23:42                                         ` Carl Worth
@ 2006-10-20 10:53                                         ` Jakub Narebski
  2006-10-20 12:34                                           ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 10:53 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Aaron Bentley wrote:

>>>          And I personally have been developing a bugtracker that is
>>> distributed in the same way bzr is; it stores bug data in the source
>>> tree of a project, so that bug activities follow branches around.
>>
>> That kind of thing sounds very useful. As I've been talking about
>> "numbers" here in bug trackers and mailing lists, it should be obvious
>> that I consider the information stored in such systems an important
>> part of the history of a code project. So it would be nice if all of
>> that history were stored in an equally reliable system in some way.
> 
> If you're interested, it's called "Bugs Everywhere" and it's available here:
> http://panoramicfeedback.com/opensource/
> 
> New VCS backends are welcome :-D

While an SCM can (and usually should) be distributed, I think that a
bugtracker has to be centralized.


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
  2006-10-20  5:05                                             ` Linus Torvalds
  2006-10-20  9:57                                             ` Jakub Narebski
@ 2006-10-20 11:00                                             ` Jakub Narebski
  2006-10-20 14:12                                             ` Jeff King
  2006-10-20 21:48                                             ` Carl Worth
  4 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:00 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:

> Bazaar encourages you to stick lots and lots of branches in your
> repository.  They don't even have to be related.  For example, my repo
> contains branches of bzr, bzrtools, Meld, and BazaarInspect.

GIT encourages you to use separate repositories for unrelated projects,
and the alternates mechanism for related projects (like the different
Linux kernel repositories: Linus, stable, etc.).

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20  8:38                                                 ` Johannes Schindelin
  2006-10-20 10:13                                                   ` Petr Baudis
@ 2006-10-20 11:09                                                   ` Jakub Narebski
  2006-10-20 11:37                                                     ` Johannes Schindelin
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:09 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> 
>> How does git disambiguate SHA1 hash collisions?
> 
> It does not. You can fully expect the universe to go down before that 
> happens.
 
Or you can compile git with COLLISION_CHECK

From Makefile:
# Define COLLISION_CHECK below if you believe that SHA1's
# 1461501637330902918203684832716283019655932542976 hashes do not give you
# sufficient guarantee that no collisions between objects will ever happen.
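
For reference, the long number in that Makefile comment is simply the
count of possible 160-bit SHA1 values. A small check, with names
invented for this sketch:

```python
SHA1_BITS = 160

# The figure quoted in the Makefile comment above.
QUOTED_IN_MAKEFILE = 1461501637330902918203684832716283019655932542976


def sha1_space(bits: int = SHA1_BITS) -> int:
    """Number of distinct values a hash of `bits` bits can take."""
    return 2 ** bits


if __name__ == "__main__":
    # Compare the computed size of the hash space with the quoted figure.
    print(sha1_space() == QUOTED_IN_MAKEFILE)
    # Number of hex digits needed to write the largest possible hash value.
    print(len(format(sha1_space() - 1, "x")))
```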


* Re: VCS comparison table
  2006-10-19 16:38                                               ` Matthieu Moy
@ 2006-10-20 11:24                                                 ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:24 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Matthieu Moy wrote:

> Then, one other difference is in the UI. bzr shows you commits in a
> kind of hierarchical manner, like this (fictive example, that's not the
> real exact format).
> 
> $ bzr log
> committer: upstream@maintainer.com
> message:
>   merged the work on a feature
>   ------
>   committer: contributor@site.com
>   message:
>     prepared for feature X
>   ------
>   committer: contributor@site.com
>   message:
>     implemented feature X
>   ------
>   committer: contributor@site.com
>   message:
>     added testcase for feature X
> ------
> committer: upstream@maintainer.com
> message:
>   something else
> 
> No big difference in the model either, but it probably reveals a
> different vision of what "history" means.

In GIT we have the git-show-branch command for that (although it
has quite a strange UI, and shows only the title of each commit), we
can do "git log | git name-rev --stdin", or better, use graphical
history viewers like gitk (Tcl/Tk) or qgit (Qt). Graphical history
viewers are a must with more complicated history.

Bazaar-NG has bzr-gtk.


* Re: VCS comparison table
  2006-10-20 11:09                                                   ` Jakub Narebski
@ 2006-10-20 11:37                                                     ` Johannes Schindelin
  2006-10-20 12:03                                                       ` Jakub Narebski
  2006-10-20 17:23                                                       ` David Lang
  0 siblings, 2 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20 11:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> 
> > On Fri, 20 Oct 2006, Lachlan Patrick wrote:
> > 
> >> How does git disambiguate SHA1 hash collisions?
> > 
> > It does not. You can fully expect the universe to go down before that 
> > happens.
>  
> Or you can compile git with COLLISION_CHECK
> 
> From Makefile:
> # Define COLLISION_CHECK below if you believe that SHA1's
> # 1461501637330902918203684832716283019655932542976 hashes do not give you
> # sufficient guarantee that no collisions between objects will ever happen.

You can document your disbelief.

But it does not change a thing. Since v0.99~653, we do not have any 
collision check, even if compiled with COLLISION_CHECK.

Ciao,
Dscho


* Re: VCS comparison table
  2006-10-19 11:30                                             ` Charles Duffy
@ 2006-10-20 11:38                                               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:38 UTC (permalink / raw)
  To: git

Charles Duffy wrote:

> Johannes Schindelin wrote:
>>> Shell scripts allow for a fragile system because they could include C code
>>> snippets which they then compile and LD_PRELOAD.
>>>     
>>
>> Well, I do not expect people to misbehave. You do not compile a nasty 
>> C-program from a shell script _by mistake_.
> 
> You also don't replace bzrlib functionality (in your terms, plumbing) in 
> a plugin by mistake.

Perhaps the reason for not having plugins in GIT (besides the fact that
it follows OSS + Unix guidelines) is that git is not libified yet. It
is "scriptified", i.e. it has many helper programs and has options for
pipelining, so it is really easy to use in scripts (Cogito, pg, StGit),
but the libification effort is [only] ongoing.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


* Re: VCS comparison table
  2006-10-19 12:33                                         ` Petr Baudis
  2006-10-19 13:44                                           ` Matthieu Moy
@ 2006-10-20 11:50                                           ` Jakub Narebski
  2006-10-20 13:26                                             ` Jakub Narebski
  2006-10-20 23:19                                             ` Junio C Hamano
  1 sibling, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 11:50 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Andreas Ericsson, Matthew D. Fuller, bazaar-ng,
	Linus Torvalds, Carl Worth, git

I have lost somewhere among many emails in this thread the email I 
wanted to reply to, the one mentioning for the first time the lack of 
parents ordering in GIT, but this one should do.


Petr Baudis wrote:

> The lack of parents ordering in Git is directly connected with
> fast-forwarding.

There are exactly _two_ places where Git treats first parent specially 
(correct me if I'm wrong).

First, <commit-ish>^ is a shortcut for <commit-ish>^1, i.e. for the first
parent of the commit. <commit-ish>~<n> is a shortcut for <commit-ish>^^...^
(n times '^'), which means that <commit-ish>~<n> is the n-th generation
ancestor in the first-parent lineage of <commit-ish>. But you can always
use names like for example next~12^2^^2~2.
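
How such names chain together can be shown on a toy history. The sketch
below is not git code: the commit names, the PARENTS table and the
resolve function are invented for illustration, and only the '^' and
'~' suffixes described above are handled.

```python
import re
from typing import Dict, List

# Hypothetical toy history: commit name -> ordered list of parents.
PARENTS: Dict[str, List[str]] = {
    "M": ["A2", "B1"],  # merge commit: first parent A2, second parent B1
    "A2": ["A1"],
    "A1": ["R"],
    "B1": ["R"],
    "R": [],            # root commit, no parents
}

SUFFIX = re.compile(r"([\^~])(\d*)")


def resolve(expr: str, parents: Dict[str, List[str]] = PARENTS) -> str:
    """Resolve names such as "M^2~1" against the toy history."""
    head = re.match(r"[^\^~]+", expr)
    if head is None or head.group(0) not in parents:
        raise ValueError(f"unknown starting commit in {expr!r}")
    commit = head.group(0)
    pos = head.end()
    while pos < len(expr):
        token = SUFFIX.match(expr, pos)
        if token is None:
            raise ValueError(f"cannot parse {expr!r} at position {pos}")
        op, digits = token.groups()
        n = int(digits) if digits else 1  # a bare '^' or '~' counts as 1
        if op == "^":
            # ^n picks the n-th parent; ^0 stays on the commit itself.
            if n > 0:
                commit = parents[commit][n - 1]  # IndexError if no such parent
        else:
            # ~n follows the first parent n times.
            for _ in range(n):
                commit = parents[commit][0]
        pos = token.end()
    return commit


if __name__ == "__main__":
    for name in ("M^", "M^2", "M~2", "M^2^"):
        print(name, "->", resolve(name))
```

The first parent is special only in that a bare '^' and every step of
'~' select it; any other parent stays reachable through '^<n>'.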

Second, git-diff with only one <commit-ish> generates a diff to the first
parent. But you can always use the '-c' or '--cc' combined diff format,
or '-m' with the default diff format, to compare to _all_ parents.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 10:45                                               ` James Henstridge
@ 2006-10-20 12:01                                                 ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 12:01 UTC (permalink / raw)
  To: James Henstridge
  Cc: Aaron Bentley, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> > > What's nice is being able to see the revno 753 and knowing that "diff -r
> > > 752..753" will show the changes it introduced. Checking the revno on a
> > > branch mirror and knowing how out-of-date it is.
> >
> > Huh? If you want to see what changes have been introduced by commit
> > c3424aebbf722c1f204931bf1c843e8a103ee143, you just do
> >
> > # git diff c3424aebbf722c1f204931bf1c843e8a103ee143
> >
> > (or better "git show" instead of "git diff" or "git diff-tree").
> > If you give only one commit (only one revision) git automatically
> > gives diff to its parent(s).
> 
> If a revision has multiple parents, what does it diff against in this
> case?  Do you get one diff against each parent revision?

If a revision has multiple parents (is a merge commit), git-diff
(which is used by git-show) does not show differences (unless you
give two revisions in the git-diff case).

You can either use the '-m' option to show differences from all its
parents, or '-c'/'--cc' to show a combined diff ('--cc' shows a more
compact diff).
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 11:37                                                     ` Johannes Schindelin
@ 2006-10-20 12:03                                                       ` Jakub Narebski
  2006-10-20 12:48                                                         ` Johannes Schindelin
  2006-10-20 17:23                                                       ` David Lang
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 12:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Johannes Schindelin wrote:
>> 
>>> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
>>> 
>>>> How does git disambiguate SHA1 hash collisions?
>>> 
>>> It does not. You can fully expect the universe to go down before that 
>>> happens.
>>  
>> Or you can compile git with COLLISION_CHECK
>> 
>> From Makefile:
>> # Define COLLISION_CHECK below if you believe that SHA1's
>> # 1461501637330902918203684832716283019655932542976 hashes do not give you
>> # sufficient guarantee that no collisions between objects will ever happen.
> 
> You can document your disbelief.
> 
> But it does not change a thing. Since v0.99~653, we do not have any 
> collision check, even if compiled with COLLISION_CHECK.

So why is it left in the Makefile? Does defining this change a thing
or not (in which case this section should be removed)?

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 10:53                                         ` Jakub Narebski
@ 2006-10-20 12:34                                           ` Matthieu Moy
  2006-10-20 13:20                                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-20 12:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> If you're interested, it's called "Bugs Everywhere" and it's available here:
>> http://panoramicfeedback.com/opensource/
>> 
>> New VCS backends are welcome :-D
>
> While an SCM can (and usually should) be distributed, I think that a
> bugtracker has to be centralized.

Well, indeed, I think bug _reporting_ should be somehow centralized,
while bug _fixing_ can be decentralized: You fix a bug, you mark it as
fixed, and then the main branch gets the information that the bug is
fixed when the bugfix is merged.

-- 
Matthieu


* Re: VCS comparison table
  2006-10-20 12:03                                                       ` Jakub Narebski
@ 2006-10-20 12:48                                                         ` Johannes Schindelin
  0 siblings, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20 12:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> > But it does not change a thing. Since v0.99~653, we do not have any 
> > collision check, even if compiled with COLLISION_CHECK.
> 
> So why is it left in the Makefile? Does defining this change a thing
> or not (in which case this section should be removed)?

It does not. The relevant parts in the code read like this:

sha1_file.c:1442
                /* FIXME!!! Collision check here ? */

sha1_file.c:1541
                /*
                 * FIXME!!! We might do collision checking here, but we'd
                 * need to uncompress the old file and check it. Later.
                 */

It was hoped that the people who actually care would implement that 
functionality. (Note that in an earlier version, the check was 
implemented, but would have to be different these days: pack files did not 
exist then).

Ciao,
Dscho


* Re: VCS comparison table
  2006-10-20 10:42                                               ` James Henstridge
@ 2006-10-20 13:17                                                 ` Jakub Narebski
  2006-10-20 13:36                                                   ` Petr Baudis
  2006-10-20 14:59                                                   ` James Henstridge
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:17 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> James Henstridge wrote:
>>> On 20/10/06, Carl Worth <cworth@cworth.org> wrote:
>>>> On Thu, 19 Oct 2006 19:01:58 -0400, Aaron Bentley wrote:

>>> With this sort of setup, I would publish my branches in a directory
>>> tree like this:
>>>
>>>     /repo
>>>         /branch1
>>>         /branch2
>>>
>>> I make "/repo" a Bazaar repository so that it stores the revision data
>>> for all branches contained in the directory (the tree contents,
>>> revision meta data, etc).
>>
>> And here we have a feature which is as far as I see unique to git,
>> namely to have persistent branches with _separate namespace_. It means
>> that we can have hierarchical branch names (including names like
>> "remotes/<remotename>/<branch of remote>", or "jc/diff"), and we don't
>> have to guess where repository name ends and branch name begins.
> 
> With the above layout, I would just type:
>     bzr branch http://server/repo/branch1

With Cogito (you can think of it either as alternate Git UI, or as SCM
built on top of Git) you would use

   $ cg clone http://server/repo#branch

for example

   $ cg clone git://git.kernel.org/pub/scm/git/git.git#next

to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).
But you can also clone _whole_ repository, _all_ published branches with

   $ cg clone git://git.kernel.org/pub/scm/git/git.git

With core Git it is the same, but we don't have the above shortcut
for checking only one branch; branches to checkout are in separate
arguments to git-clone.

In bzr it seems that you cannot distinguish (at least not only
from URL) where repository ends and branch begins.

*Sidenote:* In current version of gitweb you can get file
in given repository in given branch using the following
notation:

   http://path/to/gitweb.cgi/repo/sitory/branch/name:file/name

gitweb can detect where repository name ends and branch name
begins; usually (by convention) "bare" git repositories use a
<project>.git name, and "clothed" git repositories use
<project>/.git.
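
A rough Python sketch of that convention-based split, to make the idea
concrete. This is not gitweb's actual code: the function name
split_gitweb_path is made up, and it assumes the repository is the
shortest leading part of the path whose last component ends in ".git".

```python
def split_gitweb_path(path_info):
    """Split 'repo/sitory.git/branch/name:file/name' into its three parts.

    Assumption (hypothetical, not gitweb's real logic): the repository is
    the shortest leading run of components ending in a '*.git' component.
    """
    # Everything after the first ':' is the file name inside the tree.
    ref_part, sep, file_name = path_info.partition(":")
    components = ref_part.split("/")
    for i, component in enumerate(components):
        if component.endswith(".git"):
            repo = "/".join(components[: i + 1])
            branch = "/".join(components[i + 1:])
            return repo, branch, (file_name if sep else None)
    raise ValueError("no component ending in '.git' in %r" % path_info)


print(split_gitweb_path("pub/scm/git/git.git/jc/diff:Documentation/git.txt"))
```

A hierarchical branch name such as "jc/diff" comes out intact because the
split point is decided by the repository name, not by counting slashes.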


See also below.

> This command behaves identically whether the repository data is in
> /repo or in /repo/branch1.  Someone pulling from the branch doesn't
> have to care what the repository structure is.  Having a separate
> namespace for branch names only really makes sense if the user needs
> to care about it.
> 
> As for hierarchical names, there is nothing stopping you from using
> deeper directory structures with Bazaar too.  Bazaar just checks each
> successive parent directory until it finds a repository for the branch.
> 
>> The idea of "branches (and tags) as directories" was if I understand
>> it correctly introduced by Subversion, and from what can be seen from
>> troubles with git-svn (stemming from the fact that division between
>> project name and branch name is the matter of _convention_) at least
>> slightly brain-damaged.
> 
> I think you are a bit confused about how Bazaar works here.  A Bazaar
> repository is a store of trees and revision metadata.  A Bazaar branch
> is just a pointer to a head revision in the repository.  As you can
> probably guess, the data for the branch is a lot smaller than the data
> for the repository.
> 
> You can store the repository and branch in the same directory to get a
> standalone branch.  The layout I described above has a repository in a
> parent directory, shared by multiple branches.
> 
> If you are comparing Subversion and Bazaar, a Bazaar branch shares
> more properties with a full Subversion repository rather than a
> Subversion branch.

Oh, that explains yet another difference between Bazaar-NG (and other
SCMs which use a similar model) and Git.

In Git a branch is just a pointer to the head (top) commit in a given line
of development (hence branches are stored under .git/refs/heads/). Git also
stores information (in .git/HEAD) about which branch we are currently on,
i.e. on which branch git puts new commits. Nothing more (well, there
can be a log of changes to the head in .git/logs/refs/heads/, but that is
optional and purely local information). In Bazaar-NG you have to store (if I
understand it correctly) a mapping from revnos to revisions.
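
To illustrate how little a branch is, here is a small Python sketch that
reads the current branch from HEAD and the commit a branch points at. It
is only a sketch under assumptions: it handles a symbolic HEAD of the form
"ref: refs/heads/<name>" and loose ref files only, the helper names are
made up, and the commit ids in the toy layout are fake.

```python
import os
import tempfile


def current_branch(git_dir):
    """Return the branch name stored in HEAD, or None if HEAD is not symbolic."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    prefix = "ref: refs/heads/"
    return head[len(prefix):] if head.startswith(prefix) else None


def branch_tip(git_dir, branch):
    """Return the commit id a branch points at (loose ref files only)."""
    path = os.path.join(git_dir, "refs", "heads", *branch.split("/"))
    with open(path) as f:
        return f.read().strip()


def make_demo_git_dir(root):
    """Create a toy .git layout with two branches; the commit ids are fake."""
    git_dir = os.path.join(root, ".git")
    os.makedirs(os.path.join(git_dir, "refs", "heads", "jc"))
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/jc/diff\n")
    with open(os.path.join(git_dir, "refs", "heads", "master"), "w") as f:
        f.write("1" * 40 + "\n")
    with open(os.path.join(git_dir, "refs", "heads", "jc", "diff"), "w") as f:
        f.write("2" * 40 + "\n")
    return git_dir


with tempfile.TemporaryDirectory() as root:
    git_dir = make_demo_git_dir(root)
    branch = current_branch(git_dir)
    print(branch, branch_tip(git_dir, branch))
```

Each branch is one small file holding a commit id, and "which branch am I
on" is one more small file. There is no per-branch revision numbering to
keep in sync.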
 
By default (it means for example default behavior of git-clone, if we don't
use --bare option) git repository is _embedded_ in working area. We have

   .git/
   .git/HEAD
   ...
   .git/refs/heads/
   ...
   <working area files, e.g.>

So repo/branch wouldn't work, because 'branch' would conflict with working
area files. GIT doesn't follow the CVS model of separate storage area
(CVSROOT) and having only pointer to said area (files in CVS/ 
subdirectories) in working directory.

In GIT, to work on some repository you don't (as I understand is the case
in Bazaar-NG) "checkout" some branch (which would automatically copy some
data in the "heavy checkout" case, or just save a pointer to the repository
in the "lightweight checkout" case). You clone the whole repository; well, you
can select which branches to clone. "Checkout" in GIT terminology means to
populate working area with a given version (and usually to change which
branch is current in the repository).

What does a checked-out working area look like in Bazaar-NG?

[...]
>>> For similar reasons, the cost of publishing 20 related Bazaar branches
>>> on my web server is generally not 20 times the cost of publishing a
>>> single branch.
>>>
>>> I understand that you get similar benefits by a GIT repository with
>>> multiple head revisions.
>>
>> You can get similar benefits by a GIT repository with shared object
>> database using alternates mechanism. And that is usually preferred
>> over storing unrelated branches, i.e. branches pointing to disconnected
>> DAG (separate trees in BK terminology) of revision, if that you mean by
>> multiple head revisions (because in GIT there is no notion of "mainline"
>> branch, only of current (HEAD) branch).
> 
> I may have got the git terminology wrong. I was trying to draw
> parallels between the .git/refs/... files in a git repository and the
> way multiple branches can be stored in a Bazaar repository.

Yes, but using Git that way has serious disadvantages. For example
there is only one current branch pointer and only one index (dircache)
per git repository.

> I am not claiming that you'll get bandwidth or disk space benefits for
> storing unrelated branches in a single Bazaar repository.  But if the
> branches are related, then there will be space savings (which is what
> the great-grandparent post was asking about).

So it is way better to use one repository per project, and use alternates
mechanism to save space.

But I agree that saving "old fork" info as separate branch doesn't lead
to that much inefficiency as might be thought.

But after saving the "old fork" as a branch, revno-based revision identifiers
change from http://old.host/old/repo:127 to http://host/repo/old.fork:127
That may be a minimal change, but it is a change!


P.S. In two separate git repositories, even if they exchange information
with each other, the branch names can be different.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 12:34                                           ` Matthieu Moy
@ 2006-10-20 13:20                                             ` Jakub Narebski
  2006-10-20 13:47                                               ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:20 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
> >> If you're interested, it's called "Bugs Everywhere" and it's available here:
> >> http://panoramicfeedback.com/opensource/
> >> 
> >> New VCS backends are welcome :-D
> >
> > While SCM can (and should be usually) distributed, I think that bugtracker
> > has to be centralized.
> 
> Well, indeed, I think bug _reporting_ should be somehow centralized,
> while bug _fixing_ can be decentralized: You fix a bug, you mark it as
> fixed, and then the main branch gets the information that the bug is
> fixed when the bugfix is merged.

But you don't need much infrastructure for bug fixing. Fix it in your
repository, and write the bug number (you have to have a centralized bugtracker
for numbers) or bug identifier in the commit message. You write (or a post-commit
hook writes) in the bugtracker that the bug was fixed in commit <commit-id>.
You tell mainline to pull from you. That's all.
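
As a sketch of what such a post-commit hook might do, assuming a purely
hypothetical convention of writing "bug #<number>" in the commit message
(the function names and the note format below are made up, and no real
bugtracker API is involved):

```python
import re

# Hypothetical convention: bugs are mentioned as "bug #123" in the message.
BUG_REF = re.compile(r"\bbug\s*#(\d+)", re.IGNORECASE)


def bugs_fixed(commit_message):
    """Return the sorted, de-duplicated bug numbers mentioned in a message."""
    return sorted({int(n) for n in BUG_REF.findall(commit_message)})


def tracker_notes(commit_id, commit_message):
    """Build the notes a hook could post to the (centralized) bugtracker."""
    return ["bug #%d: fixed in commit %s" % (n, commit_id)
            for n in bugs_fixed(commit_message)]


message = "Fix off-by-one in diff output (bug #12, Bug #7)"
for note in tracker_notes("d8a60", message):
    print(note)
```

How the notes reach the bugtracker (mail, HTTP, something else) is left
open here; the point is only that the commit message already carries all
the needed information.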


* Re: VCS comparison table
  2006-10-19  3:10                               ` Aaron Bentley
                                                   ` (2 preceding siblings ...)
  2006-10-19  7:02                                 ` Erik Bågfors
@ 2006-10-20 13:22                                 ` Horst H. von Brand
  2006-10-20 13:46                                   ` Christian MICHON
  3 siblings, 1 reply; 806+ messages in thread
From: Horst H. von Brand @ 2006-10-20 13:22 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Andreas Ericsson, bazaar-ng, git, Jakub Narebski

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Linus Torvalds wrote:

[...]

> > The "main trunk matters" mentality (which has deep roots in CVS - don't 
> > get me wrong, I don't think you're the first one to do this) is 
> > fundamentally antithetical to truly distributed system, because it 
> > basically assumes that some maintainer is "more important" than others. 

> Linus, if you got hit by a bus, it would still be a shock, and it would
> still take time for the Linux world to recover.  Your insights and
> talent, both technical and social, make you the most important kernel
> developer.  And it stays that way because you deserve it.  Projects with
> good leadership don't fork, or if they do, the fork withers and dies
> pretty quickly.

So? It makes no sense to me to cater only to "successful projects"... most
projects /aren't/ successful ;-)

> It is fine to say all branches are equal from a technical perspective.
> From a social perspective, it's just not true.

Yes, but what matters here is the principle... if branches aren't equal, it
makes some things unnecessarily hard (i.e., forking, passing maintainership
over, ...). Sure, they aren't activities that should be actively
encouraged, but they shouldn't be made harder than necessary either.

> The scale of Bazaar development is much smaller than the scale of kernel
> development, so it doesn't make sense to maintain long-term divergent
> branches like the mm tree.  We do occasionally have long-lived feature
> branches, though.

Are you saying Bazaar is aimed at small(ish) projects (only)?

> > That special maintainer is the maintainer whose merge-trunk is followed, 
> > and whose revision numbers don't change when they are merged back.

> In bzr development, it's very rare for anyone's revision numbers to
> change.

"Very rare" != "never". The "very rare" cases /will/ come back to bite you,
once you grow accustomed to "hasn't ever happened"

[...]

> > I'll just point out that one of my design goals for git was to make every 
> > single repository 100% equal. That means that there MUST NOT be a "trunk", 
> > or a special line of development. There is no "vendor branch".

> I think you're implying that on a technical level, bzr doesn't support
> this.  But it does.  Every published repository

What makes a "published repository" special, as opposed to my local
playground?

>                                                 has unique identifiers
> for every revision on its mainline,

Are they different among repositories, even though they came from another
of the set?

>                                     and it's exceedingly uncommon for
> these to change.

See above.

>                   There are special procedures to maintain bzr.dev, but
> there's nothing technically unique about it.  People develop against
> bzr.dev rather than my integration branch, because they have
> non-technical reasons for wanting their changes to be merged into
> bzr.dev, not my integration branch.

OK.


* Re: VCS comparison table
  2006-10-20 11:50                                           ` Jakub Narebski
@ 2006-10-20 13:26                                             ` Jakub Narebski
  2006-10-20 23:19                                             ` Junio C Hamano
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 13:26 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Matthieu Moy, Andreas Ericsson, Matthew D. Fuller, bazaar-ng,
	Linus Torvalds, Carl Worth, git

Jakub Narebski wrote:
> Second, git-diff with only one <commit-ish> generates diff to first
> parent. But you can always use '-c' or '-cc' combined diff format
> or '-m' with default diff format to compare to _all_ parents.

I stand corrected: git-diff refuses to show anything if provided
with only one commit, and commit has more than one parent. So it
does not treat the first parent specially.
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20 13:17                                                 ` Jakub Narebski
@ 2006-10-20 13:36                                                   ` Petr Baudis
  2006-10-20 14:12                                                     ` Jakub Narebski
  2006-10-20 14:59                                                   ` James Henstridge
  1 sibling, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 13:36 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Dear diary, on Fri, Oct 20, 2006 at 03:17:26PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> But you can also clone _whole_ repository, _all_ published branches with
> 
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git

Nope, cg clone will in this case clone the master branch (or whatever
the remote HEAD points at). cg clone -a is planned but not implemented
yet. Very soon now, hopefully. :-)

> In GIT to work on some repository you don't (like from what I understand
> in Bazaar-NG) "checkout" some branch (which would automatically copy some
> data in case of "heavy checkout" or just save some pointer to repository
> in "lightweight checkout" case). You clone whole repository; well you can
> select which branches to clone. "Checkout" in GIT terminology means to
> populate working area with given version (and change in repository which
> branch is current, usually).

You don't need to, you can switch your working tree between various
branches.  I think Linus said he does that (or was it Junio?), and I do that
as well, as well as many others.

A good question would be "when to create another branch and when to
clone the repository". And I don't think there's any good answer, except
"when you are comfortable with it". :-) Both approaches have pros/cons.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: VCS comparison table
  2006-10-20 10:40                                       ` Jakub Narebski
@ 2006-10-20 13:36                                         ` Shawn Pearce
  2006-10-21 12:30                                         ` Matthew D. Fuller
  1 sibling, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-20 13:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> wrote:
> from discussion on #revctrl list on FreeNode), and in addition
> to existing GitSvnComparison page on GitWiki).

Oh, you mean that document that I orphaned when I got sidetracked
and forgot I hadn't quite finished it?  :-)

-- 
Shawn.


* Re: VCS comparison table
  2006-10-20 13:22                                 ` Horst H. von Brand
@ 2006-10-20 13:46                                   ` Christian MICHON
  2006-10-20 15:05                                     ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Christian MICHON @ 2006-10-20 13:46 UTC (permalink / raw)
  To: bazaar-ng, git

On 10/20/06, Horst H. von Brand <vonbrand@inf.utfsm.cl> wrote:
> Are you saying Bazaar is aimed at small(ish) projects (only)?

funny. I actually read another post from Linus, and when I
"merge" with your post (understand: bisect), the following
comes out:

- git is the fastest scm around
- git has the smallest scm footprint
- git is also aimed at small(ish) projects

my personal proof of concept on the last point is that I'm an
IC design engineer who threw away other scm in favor of git
since git-1.4.2 and regret now the years wasted on _other_
scm. But your mileage may vary.

-- 
Christian


* Re: VCS comparison table
  2006-10-20 13:20                                             ` Jakub Narebski
@ 2006-10-20 13:47                                               ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 13:47 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Matthieu Moy, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 03:20:42PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Matthieu Moy wrote:
> > Jakub Narebski <jnareb@gmail.com> writes:
> > 
> > >> If you're interested, it's called "Bugs Everywhere" and it's available here:
> > >> http://panoramicfeedback.com/opensource/
> > >> 
> > >> New VCS backends are welcome :-D
> > >
> > > While SCM can (and should be usually) distributed, I think that bugtracker
> > > has to be centralized.
> > 
> > Well, indeed, I think bug _reporting_ should be somehow centralized,
> > while bug _fixing_ can be decentralized: You fix a bug, you mark it as
> > fixed, and then the main branch gets the information that the bug is
> > fixed when the bugfix is merged.
> 
> But you don't need much infrastructure for bug fixing. Fix it in
> repository, and write bug number (you have to have centralized bugtracker
> for numbers) or bug identifier in commit message. You write (or post-commit
> hook writes) in bugtracker that bug was fixed in commit <commit-id>.
> You tell mainline to pull from you. That's all.

Yes but no one did the infrastructure yet. :-) Also, we need a way to
make it work smoothly, e.g. so that you don't have to download any
special stuff after cloning a branch - thus the post-commit hook needs
to be cloned too, but you also need to deal with the security
implications reasonably. (We would very much like to have "hooks
cloning" in Git in our in-SUSE usage as well; I didn't get to it yet.)

On a somewhat related note, I was at Microsoft's presentation at my
university about their Team Foundation Server. And Microsoft's clearly
aware that SourceSafe was horrible crap and the version control in TFS
is much more advanced and even shows some signs of distributiveness (but
I don't know how much, the presenter did not know details about how it
works).

But their selling point really is the tight integration with bug
tracking and autobuild system. And it indeed does look pretty nice (when
you watch it, you might get quite a different perspective when actually
*using* it ;).

You can read my brief notes from the presentation at

	http://pasky.or.cz/~pasky/cp/tfs-lecture-notes.txt

It's a bit of bureaucracy for developers but managers will absolutely
*adore* it.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-19 10:40                                   ` Sean
@ 2006-10-20 14:03                                     ` Aaron Bentley
  2006-10-20 14:56                                       ` Jakub Narebski
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 14:03 UTC (permalink / raw)
  To: Sean; +Cc: Alexander Belchenko, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Sean wrote:

> Petr already mentioned that the data currently shown in the email
> text isn't really useful.

In Bazaar bundles, the text of the diff is an integral part of the data.
 It is used to generate the text of all the files in the revision.

Bazaar bundles were designed to be used on mailing lists.  So you can
review the changes from the diff, comment on them, and if it seems
suitable, merge them.

> Although that might just make the email bigger for not a lot of
> gain.

It's my understanding that most changes discussed on lkml are provided
as a series of patches.  Bazaar bundles are intended as a direct
replacement for patches in that use case.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFONck0F+nu1YWqI0RAgrHAJ0flmF1wCGYYUSk8f2iy8LuZnkaKQCdFSIo
JIaKi9S8TzUkhvaWpYYP5AA=
=MgZo
-----END PGP SIGNATURE-----


* Re: VCS comparison table
  2006-10-20 13:36                                                   ` Petr Baudis
@ 2006-10-20 14:12                                                     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:12 UTC (permalink / raw)
  To: Petr Baudis
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Petr Baudis wrote:
> Dear diary, on Fri, Oct 20, 2006 at 03:17:26PM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...

>> But you can also clone _whole_ repository, _all_ published branches with
>> 
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git
> 
> Nope, cg clone will in this case clone the master branch (or whatever
> the remote HEAD points at). cg clone -a is planned but not implemented
> yet. Very soon now, hopefully. :-)

That's probably because Cogito still uses obsolete branches/


$ git clone git://git.kernel.org/pub/scm/git/git.git

clones _whole_ repository, all the branches and tags, and saves information
about the branches it cloned, and URL to repository in remotes/ file.
 
>> In GIT to work on some repository you don't (like from what I understand
>> in Bazaar-NG) "checkout" some branch (which would automatically copy some
>> data in case of "heavy checkout" or just save some pointer to repository
>> in "lightweight checkout" case). You clone whole repository; well you can
>> select which branches to clone. "Checkout" in GIT terminology means to
>> populate working area with given version (and change in repository which
>> branch is current, usually).
> 
> You don't need to, you can switch your working tree between various
> branches.  I think Linus said he does that (or was it Junio?), and I do that
> as well, as well as many others.

I should have said: bring working area to state given by some revision
(instead of "populate working area").

-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
                                                               ` (2 preceding siblings ...)
  2006-10-20 11:00                                             ` Jakub Narebski
@ 2006-10-20 14:12                                             ` Jeff King
  2006-10-20 14:40                                               ` Jakub Narebski
  2006-10-21 17:57                                               ` Aaron Bentley
  2006-10-20 21:48                                             ` Carl Worth
  4 siblings, 2 replies; 806+ messages in thread
From: Jeff King @ 2006-10-20 14:12 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:

> What's nice is being able to see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

I was accustomed to doing such things in CVS, but I find the git way
much more pleasant, since I don't have to do any arithmetic:
  diff d8a60^..d8a60
(Yes, I am capable of performing subtraction in my head, but I find that
a "parent-of" operator matches my cognitive model better, especially
when you get into things like d8a60^2~3).

Does bzr have a similar shorthand for mentioning relative commits?

-Peff


* Re: VCS comparison table
  2006-10-19 17:14                                       ` J. Bruce Fields
@ 2006-10-20 14:31                                         ` Jeff King
  2006-10-20 15:33                                           ` J. Bruce Fields
  0 siblings, 1 reply; 806+ messages in thread
From: Jeff King @ 2006-10-20 14:31 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Thu, Oct 19, 2006 at 01:14:09PM -0400, J. Bruce Fields wrote:

> > > In the second place, one must consider the "nuclear launch codes"
> > > scenario.
> > Sure. And git does provide tools that can do this.
> 
> So in this case you can certainly lose the launch codes.  But you have
> forever granted everyone a way to determine whether a given guess at the
> launch codes is correct.  (Again, assuming some stuff about SHA1).

In what sense? Yes, you can make a guess if you have stored the SHA1
that contained the launch codes. But the point is that that particular
SHA1 is no longer part of the repository. Keeping that SHA1 is no easier
than just keeping the launch codes in the first place.
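
For concreteness, this is roughly what "checking a guess" would amount to,
as a Python sketch. It assumes the codes were stored as a file (a blob)
and that the checker somehow kept that blob's SHA1; the function names
and the sample secret are made up for illustration.

```python
import hashlib


def git_blob_id(content):
    """Compute the object name git gives to a blob with this content."""
    # A blob is hashed as: "blob <size in bytes>" + NUL + content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def guess_matches(guess, known_blob_id):
    """Return True if a guessed content hashes to the remembered blob id."""
    return git_blob_id(guess) == known_blob_id


# Made-up secret, standing in for the launch codes.
remembered_id = git_blob_id(b"launch code: 0000\n")

for guess in (b"launch code: 1234\n", b"launch code: 0000\n"):
    print(guess, guess_matches(guess, remembered_id))
```

Note that the check needs the remembered id as input. Once the object is
gone from the repository, nothing there supplies it.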

-Peff


* Re: VCS comparison table
  2006-10-20 14:12                                             ` Jeff King
@ 2006-10-20 14:40                                               ` Jakub Narebski
  2006-10-20 14:52                                                 ` Johannes Schindelin
  2006-10-21 17:57                                               ` Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:40 UTC (permalink / raw)
  To: Jeff King
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jeff King wrote:
> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
> 
>> What's nice is being able to see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revno on a
>> branch mirror and knowing how out-of-date it is.
> 
> I was accustomed to doing such things in CVS, but I find the git way
> much more pleasant, since I don't have to do any arithmetic:
>   diff d8a60^..d8a60

By the way "diff d8a60" also works (unless d8a60 is merge commit, in
which case you would need "diff -c d8a60" or "diff -m d8a60").

> (Yes, I am capable of performing subtraction in my head, but I find that
> a "parent-of" operator matches my cognitive model better, especially
> when you get into things like d8a60^2~3).
> 
> Does bzr have a similar shorthand for mentioning relative commits?

By the way, git has the following extended SHA1 syntax for <commit-ish>
(documented in git-rev-parse(1)):
 * full SHA1 (40-chars hexadecimal string) or abbreviation unique for
   repository
 * symbolic ref name. E.g. 'master' typically means commit object referenced
   by $GIT_DIR/refs/heads/master; 'v1.4.1' means commit object referenced
   [indirectly] by $GIT_DIR/refs/tags/v1.4.1. You can say 'heads/master'
   and 'tags/master' if you have both head (branch) and tag named 'master',
   but don't do that. HEAD means current branch (and is usually default).
 * <ref>@{<date>} or <ref>@{<n>} to specify value of <ref> (usually branch)
   at given point of time, or n changes to ref back. Available only if you
   have reflog for given ref.
 * <commit-ish>^<n> means n-th parent of given revision. <commit-ish>^0
   means commit itself. <commit-ish>^ is a shortcut for <commit-ish>^1.
   <commit-ish>~<n> is shortcut for <commit-ish>^^..^ with n*'^', for
   example rev~3 is equivalent to rev^^^, which in turn is equivalent
   to rev^1^1^1
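
To make the parent notation concrete, here is a small Python sketch that
resolves the ^<n> and ~<n> suffixes over a toy history. It is not git
code: the history, the one-letter commit names and the function name
resolve are made up, and only the two suffixes above are handled (no
abbreviations, refs or reflog syntax).

```python
import re

# Toy history: commit name -> list of parents, first parent first.
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["A"],
    "Y": ["X"],
    "M": ["C", "Y"],  # merge commit: first parent C, second parent Y
}

SUFFIX = re.compile(r"([\^~])(\d*)")


def resolve(expr, parents=PARENTS):
    """Resolve e.g. 'M^2~1' to a commit name in the toy history."""
    base = re.match(r"[^\^~]+", expr)
    if base is None or base.group(0) not in parents:
        raise ValueError("unknown revision in %r" % expr)
    commit = base.group(0)
    pos = base.end()
    while pos < len(expr):
        suffix = SUFFIX.match(expr, pos)
        if suffix is None:
            raise ValueError("cannot parse %r" % expr[pos:])
        op, digits = suffix.groups()
        n = int(digits) if digits else 1  # bare '^' is '^1', bare '~' is '~1'
        if op == "^":
            if n != 0:                    # '^0' is the commit itself
                commit = parents[commit][n - 1]
        else:
            for _ in range(n):            # '~n' follows the first parent n times
                commit = parents[commit][0]
        pos = suffix.end()
    return commit


for expr in ("M^", "M^2", "M^2~1", "M~3", "M^^^", "M^0"):
    print(expr, "->", resolve(expr))
```

In this toy history M~3 and M^^^ both walk first parents M, C, B, A, while
M^2~1 first steps to the second parent Y and then to its first parent X.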

Additionally it has following undocumented extended SHA1 syntax to refer
to trees (directories) and blobs (file contents)
 * <revision>:<filename> gives SHA1 of tree or blob at given revision
 * :<stage>:<filename> (I think for blobs only) gives SHA1 for different
   versions of file during unresolved merge conflict.

I'm not enumerating here all the ways to specify part of DAG of history,
except that it includes "A ^B" meaning "all from A", "exclude all from B",
"B..A" meaning "^B A", "A...B" meaning "A B --not $(git merge-base A B)",
and of course "A -- path" meaning "all from A", "limit to changes in path".
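
The set semantics of these range notations can be sketched in Python over
a toy history. Again this is not git code: the history and the function
names are made up, path limiting ("A -- path") is not modelled, and
"A...B" is computed as "reachable from exactly one of A and B", which is
what excluding everything reachable from the merge base(s) amounts to.

```python
# Toy history: commit name -> list of parents.
# A - B - C        ("master" tip is C)
#      \
#       X - Y      ("topic" tip is Y)
PARENTS = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["B"],
    "Y": ["X"],
}


def ancestors(commit, parents=PARENTS):
    """All commits reachable from 'commit', including the commit itself."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def rev_list(include, exclude=(), parents=PARENTS):
    """'A ^B': everything reachable from 'include', minus 'exclude'."""
    wanted = set()
    for commit in include:
        wanted |= ancestors(commit, parents)
    for commit in exclude:
        wanted -= ancestors(commit, parents)
    return wanted


def two_dot(b, a, parents=PARENTS):
    """'B..A' is shorthand for '^B A'."""
    return rev_list([a], [b], parents)


def three_dot(a, b, parents=PARENTS):
    """'A...B': commits reachable from exactly one of A and B."""
    return ancestors(a, parents) ^ ancestors(b, parents)


print("C..Y ", sorted(two_dot("C", "Y")))
print("C...Y", sorted(three_dot("C", "Y")))
```

Here C..Y gives the commits on the topic side only (X and Y), while C...Y
gives what is on either side since the fork point B (C, X and Y).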

What about _your_ SCM? ;-)
-- 
Jakub Narebski
Poland


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20  0:20                                                               ` Jan Harkes
@ 2006-10-20 14:41                                                                 ` Jeff King
  0 siblings, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-20 14:41 UTC (permalink / raw)
  To: Linus Torvalds, Junio C Hamano, git

On Thu, Oct 19, 2006 at 08:20:32PM -0400, Jan Harkes wrote:

> It looks like you were really close. When we cannot resolve a delta, we
> just write it to the packfile and we don't queue it. If it can be
> resolved we write it as a full object.

If I understand correctly, if we see an unresolvable delta, we are just
making the assumption that its base has arrived (or will arrive) in the
same pack (without checking).  This means that we could end up with a
corrupted repository if the sender gives us a bad pack. I believe that
git's network interaction has been designed specifically to avoid such
possibilities (e.g., verifying completeness and integrity of downloaded
objects).

-Peff


* Re: VCS comparison table
  2006-10-20 14:40                                               ` Jakub Narebski
@ 2006-10-20 14:52                                                 ` Johannes Schindelin
  2006-10-20 15:34                                                   ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20 14:52 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jeff King, Aaron Bentley, Carl Worth, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Jeff King wrote:
> > 
> > I was accustomed to doing such things in CVS, but I find the git way
> > much more pleasant, since I don't have to do any arithmetic:
> >   diff d8a60^..d8a60
> 
> By the way "diff d8a60" also works (unless d8a60 is merge commit, in
> which case you would need "diff -c d8a60" or "diff -m d8a60").

I could be wrong, but I have the impression (even after actually testing 
it) that "git diff d8a60" is equivalent to "git diff d8a60..HEAD", _not_ 
"git diff d8a60^..d8a60".

IIRC we had a "-p" flag to denote "parent" once upon a time, but that no 
longer works...

"git-show" is definitely what you want.

Ciao,
Dscho


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:03                                     ` Aaron Bentley
@ 2006-10-20 14:56                                       ` Jakub Narebski
  2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-21  7:56                                         ` Matthieu Moy
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
  1 sibling, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 14:56 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Aaron Bentley wrote:

> Sean wrote:
> 
>> Petr already mentioned that the data currently shown in the email
>> text isn't really useful.
> 
> In Bazaar bundles, the text of the diff is an integral part of the data.
>  It is used to generate the text of all the files in the revision.

I thought that the diff was combined diff of changes.
 
> Bazaar bundles were designed to be used on mailing lists.  So you can
> review the changes from the diff, comment on them, and if it seems
> suitable, merge them.

If you have only a mega-diff, you can comment only on this mega-diff.
It is more useful, for changes which have a natural multi-commit history,
to review and comment on each of the commits/patches in the series _separately_.

>> Although that might just make the email bigger for not a lot of
>> gain.
> 
> It's my understanding that most changes discussed on lkml are provided
> as a series of patches.  Bazaar bundles are intended as a direct
> replacement for patches in that use case.

As _series_ of patches. You have git-format-patch + git-send-email
to format and send them, git-am to apply them (as patches, not as branch).

I was under the impression that the user sees only a mega-patch of all the
revisions in the bundle together, and the rest is for machine consumption only.

cg-bundle doesn't have this "mega-diff", but has a shortlog (does a bzr
bundle have a shortlog/log of the changes contained therein?) and a diffstat
was planned.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 13:17                                                 ` Jakub Narebski
  2006-10-20 13:36                                                   ` Petr Baudis
@ 2006-10-20 14:59                                                   ` James Henstridge
  2006-10-20 22:50                                                     ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-20 14:59 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> James Henstridge wrote:
> > With the above layout, I would just type:
> >     bzr branch http://server/repo/branch1
>
> With Cogito (you can think of it either as alternate Git UI, or as SCM
> built on top of Git) you would use
>
>    $ cg clone http://server/repo#branch
>
> for example
>
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git#next
>
> to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).

My understanding of git is that this would be equivalent to the "bzr
branch" command.  A checkout (heavy or lightweight) has the property
that commits are made to the original branch.

> But you can also clone _whole_ repository, _all_ published branches with
>
>    $ cg clone git://git.kernel.org/pub/scm/git/git.git

I suppose that'd be useful if you want a copy of all the branches at
once.  There is no builtin command in Bazaar to do that at present.
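
As an aside for readers comparing the two URL styles: the Cogito form
quoted above can be split mechanically, because the '#' marks where the
repository ends and the branch name begins. A minimal sketch in Python;
the helper name split_clone_url is made up for illustration and is not
part of Cogito:

```python
def split_clone_url(url):
    """Split a Cogito-style clone URL into (repository, branch).

    The part after '#' names a single branch. Without it the whole
    repository is meant, and the branch is returned as None.
    """
    repository, sep, branch = url.partition("#")
    return repository, (branch if sep and branch else None)


print(split_clone_url("git://git.kernel.org/pub/scm/git/git.git#next"))
print(split_clone_url("git://git.kernel.org/pub/scm/git/git.git"))
```

A Bazaar URL such as http://server/repo/branch1 carries no such marker,
which is the point made in the quoted text that follows.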


> With core Git it is the same, but we don't have the above shortcut
> for checking only one branch; branches to checkout are in separate
> arguments to git-clone.
>
> In bzr it seems that you cannot distinguish (at least not only
> from URL) where repository ends and branch begins.

I guess this highlights that the two tools optimise for different workflows.

> > This command behaves identically whether the repository data is in
> > /repo or in /repo/branch1.  Someone pulling from the branch doesn't
> > have to care what the repository structure is.  Having a separate
> > namespace for branch names only really makes sense if the user needs
> > to care about it.
> >
> > As for hierarchical names, there is nothing stopping you from using
> > deeper directory structures with Bazaar too.  Bazaar just checks each
> > successive parent directory until it finds a repository for the branch.
> >
> >> The idea of "branches (and tags) as directories" was if I understand
> >> it correctly introduced by Subversion, and from what can be seen from
> >> troubles with git-svn (stemming from the fact that division between
> >> project name and branch name is the matter of _convention_) at least
> >> slightly brain-damaged.
> >
> > I think you are a bit confused about how Bazaar works here.  A Bazaar
> > repository is a store of trees and revision metadata.  A Bazaar branch
> > is just a pointer to a head revision in the repository.  As you can
> > probably guess, the data for the branch is a lot smaller than the data
> > for the repository.
> >
> > You can store the repository and branch in the same directory to get a
> > standalone branch.  The layout I described above has a repository in a
> > parent directory, shared by multiple branches.
> >
> > If you are comparing Subversion and Bazaar, a Bazaar branch shares
> > more properties with a full Subversion repository rather than a
> > Subversion branch.
>
> Oh, that explained yet another difference between Bazaar-NG (and other
> SCM which uses similar model) and Git.
>
> In Git branch is just a pointer to head (top) commit (hence they are stored
> under .git/refs/heads/) in given line of development. Git also stores
> information (in .git/HEAD) about which branch we are currently on, which
> means on which branch git puts new commits. Nothing more (well, there
> can be log of changes to head in .git/logs/refs/heads/ but that is optional
> and purely local information). In Bazaar-NG you have to store (if I
> understand it correctly) mapping from revnos to revisions.
>
> By default (it means for example default behavior of git-clone, if we don't
> use --bare option) git repository is _embedded_ in working area. We have

Two points:
(1) if we are publishing branches, we wouldn't include working trees
-- they are not needed to pull or merge from such a branch.
(2) if we did have working trees, they'd be rooted at /repo/branch1
and /repo/branch2 -- not at /repo (since /repo is not a branch).

In case (2) there is a potential for conflicts if you nest branches,
but people don't generally trigger this problem with the way they use
Bazaar.

> So repo/branch wouldn't work, because 'branch' would conflict with working
> area files. GIT doesn't follow the CVS model of separate storage area
> (CVSROOT) and having only pointer to said area (files in CVS/
> subdirectories) in working directory.

That is fairly similar to the default mode of operation with Bazaar:
you have a repository, branch and working tree all rooted in the same
directory.  If you have separated working trees and branches, then
that is because you specifically asked for it.


> In GIT to work on some repository you don't (like from what I understand
> in Bazaar-NG) "checkout" some branch (which would automatically copy some
> data in case of "heavy checkout" or just save some pointer to repository
> in "lightweight checkout" case). You clone whole repository; well you can
> select which branches to clone. "Checkout" in GIT terminology means to
> populate working area with given version (and change in repository which
> branch is current, usually).

I think you have a slight misunderstanding of what a Bazaar checkout is.

>
> What does a checked-out working area look like in Bazaar-NG?

The layout of a standalone branch would be:
  .bzr/repository/ -- storage of trees and metadata
  .bzr/branch/ -- branch metadata (e.g. pointer to the head revision)
  .bzr/checkout/ -- working tree book-keeping files
  source code

If we use a shared repository, the contained branches would lack the
.bzr/repository/ directory.  The parent directory would instead have a
.bzr/repository/, but usually wouldn't have .bzr/branch/ (unless there
is a branch rooted at the base of the repository).

If we are publishing a branch to a web server, we'd skip the working
tree, so the source code and .bzr/checkout/ directory would be
missing.

In the case of a checkout, the .bzr/branch/ directory has a special
format and acts as a pointer to the original branch.  If the checkout
is lightweight, the .bzr/repository/ directory would be missing, and
bzr would need to contact the original branch for the data.
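
The layouts above can be told apart just by looking at which .bzr/
subdirectories exist. A rough sketch in Python, assuming only the three
directory names listed above; it is not how bzr itself is implemented,
and the function names are invented for illustration. The second helper
mimics the "check each parent directory" lookup for a shared repository:

```python
import tempfile
from pathlib import Path


def find_repository(branch_dir):
    """Walk from branch_dir up through its parents until a directory
    containing .bzr/repository is found; return it, or None."""
    start = Path(branch_dir).resolve()
    for candidate in (start, *start.parents):
        if (candidate / ".bzr" / "repository").is_dir():
            return candidate
    return None


def describe(branch_dir):
    """Report which of the three .bzr parts are present in branch_dir."""
    bzr = Path(branch_dir) / ".bzr"
    return {
        "own repository": (bzr / "repository").is_dir(),
        "branch": (bzr / "branch").is_dir(),
        "working tree": (bzr / "checkout").is_dir(),
    }


# Build a throwaway shared-repository layout: repo/ holds the storage,
# repo/branch1/ holds only branch metadata (a published branch, no tree).
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "repo"
    (repo / ".bzr" / "repository").mkdir(parents=True)
    branch1 = repo / "branch1"
    (branch1 / ".bzr" / "branch").mkdir(parents=True)

    print(describe(branch1))
    print(find_repository(branch1) == repo.resolve())
```

Note that directory presence alone cannot distinguish a checkout from a
normal branch, since that difference lives in the format of .bzr/branch/.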


> >>> For similar reasons, the cost of publishing 20 related Bazaar branches
> >>> on my web server is generally not 20 times the cost of publishing a
> >>> single branch.
> >>>
> >>> I understand that you get similar benefits by a GIT repository with
> >>> multiple head revisions.
> >>
> >> You can get similar benefits by a GIT repository with shared object
> >> database using alternates mechanism. And that is usually preferred
> >> over storing unrelated branches, i.e. branches pointing to disconnected
> >> DAG (separate trees in BK terminology) of revision, if that you mean by
> >> multiple head revisions (because in GIT there is no notion of "mainline"
> >> branch, only of current (HEAD) branch).
> >
> > I may have got the git terminology wrong. I was trying to draw
> > parallels between the .git/refs/... files in a git repository and the
> > way multiple branches can be stored in a Bazaar repository.
>
> Yes, but using Git that way has serious disadvantages. For example
> there is only one current branch pointer and only one index (dircache)
> per git repository.

Okay.  So using Bazaar terminology, this seems to be an issue of the
working tree being associated with the repository rather than the
branch?
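
For reference, the "one current branch pointer" on the git side can be
followed by hand. A minimal sketch in Python, assuming HEAD is a text
file of the form "ref: refs/heads/<name>" and that branches are stored
as loose files under refs/heads/ (packed refs or a symlinked HEAD are
not handled). The function names and the fake ids are made up:

```python
import tempfile
from pathlib import Path


def current_branch(git_dir):
    """Return the ref named by HEAD, or None if HEAD is not symbolic."""
    head = (Path(git_dir) / "HEAD").read_text().strip()
    prefix = "ref: "
    return head[len(prefix):] if head.startswith(prefix) else None


def branch_tip(git_dir, ref):
    """Return the commit id stored in a loose ref file such as
    refs/heads/master."""
    return (Path(git_dir) / ref).read_text().strip()


# Fake a minimal .git directory with two branches; the id values are
# placeholders, not real commits.
with tempfile.TemporaryDirectory() as tmp:
    git_dir = Path(tmp) / ".git"
    heads = git_dir / "refs" / "heads"
    heads.mkdir(parents=True)
    (heads / "master").write_text("1" * 40 + "\n")
    (heads / "next").write_text("2" * 40 + "\n")
    (git_dir / "HEAD").write_text("ref: refs/heads/next\n")

    ref = current_branch(git_dir)
    print(ref, branch_tip(git_dir, ref))
```

There are many branch files but a single HEAD, so one repository with
one working tree can only be "on" one branch at a time.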


[...]
> But I agree that saving "old fork" info as separate branch doesn't lead
> to that much inefficiency as might be thought.
>
> But after saving "old fork" as a branch revno based revision identifiers
> change from http://old.host/old/repo:127 to http://host/repo/old.fork:127
> That is maybe minimal change, but this is change!

Well, a branch can easily have multiple URLs even if there is only one
copy of it.  I might write to it via local file access or sftp (which
would be a file: or sftp: URL).

Mirrors of branches don't usually confuse users (and remember that the
revision numbers are primarily intended for users -- if I am writing a
Bazaar plugin, I'd work in terms of revision IDs).


James.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 13:46                                   ` Christian MICHON
@ 2006-10-20 15:05                                     ` Jakub Narebski
  2006-10-20 15:16                                       ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:05 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Christian MICHON wrote:

> - git is the fastest scm around

Mercurial also claims that. It probably depends on the benchmark, though.
But Mercurial (hg) lacks, from what I understand, persistent branches, and
has only partial support for renames. YMMV.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:05                                     ` Jakub Narebski
@ 2006-10-20 15:16                                       ` Johannes Schindelin
  2006-10-20 15:28                                         ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20 15:16 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Christian MICHON wrote:
> 
> > - git is the fastest scm around
> 
> Mercurial also claims that.

Funny. When you type in "mercurial" and "benchmark" into Google, the 
_first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
benchmark". Performed by the good Mercurial people.

Leaving git as winner.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:16                                       ` Johannes Schindelin
@ 2006-10-20 15:28                                         ` Jakub Narebski
  2006-10-20 15:39                                           ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:28 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, bazaar-ng

Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Christian MICHON wrote:
>> 
>>> - git is the fastest scm around
>> 
>> Mercurial also claims that.
> 
> Funny. When you type in "mercurial" and "benchmark" into Google, the 
> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> benchmark". Performed by the good Mercurial people.
> 
> Leaving git as winner.
 
Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
comparison of Git and Mercurial" for the latest (OLS2006) benchmark
by Mercurial. Probably not indexed by Google, or doesn't have high 
pagerank because it is in PDF and fairly new (therefore has low 
"citations" number).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 14:31                                         ` Jeff King
@ 2006-10-20 15:33                                           ` J. Bruce Fields
  2006-10-20 15:43                                             ` Jeff King
  0 siblings, 1 reply; 806+ messages in thread
From: J. Bruce Fields @ 2006-10-20 15:33 UTC (permalink / raw)
  To: Jeff King
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Fri, Oct 20, 2006 at 10:31:11AM -0400, Jeff King wrote:
> On Thu, Oct 19, 2006 at 01:14:09PM -0400, J. Bruce Fields wrote:
> > So in this case you can certainly lose the launch codes.  But you have
> > forever granted everyone a way to determine whether a given guess at the
> > launch codes is correct.  (Again, assuming some stuff about SHA1).
> 
> In what sense? Yes, you can make a guess if you have stored the SHA1
> that contained the launch codes. But the point is that that particular
> SHA1 is no longer part of the repository.

Well, I thought the discussion was about what meaning references have
after branches were modified or removed.  In which case the interesting
situation is one where an object is gone but someone somewhere still
holds a reference (because the SHA1 was mentioned in a bug report or an
email or whatever).

> Keeping that SHA1 is no easier than just keeping the launch codes in
> the first place.

Could be.

Anyway, the important difference between the SHA1 references and small
integers is that there's no aliasing in the former case.  Which is
important--I'd rather have a reference to nothing than a reference to
the wrong thing....
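
To make the aliasing point concrete, here is a toy model in Python. It
is only a sketch: the branch contents are invented, and the id is a
plain SHA1 of a string, whereas git hashes a structured commit object.

```python
import hashlib


def revno_lookup(branch, revno):
    """Resolve a 1-based revision number relative to one branch."""
    return branch[revno - 1]


def content_id(text):
    """Identify a revision by a SHA1 hash of its content."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


mainline = ["initial", "fix typo", "add feature"]
fork = ["initial", "fix typo", "experimental rewrite"]

# The same small integer names different revisions on the two branches.
print(revno_lookup(mainline, 3), "|", revno_lookup(fork, 3))

# A content-derived id either names the same thing everywhere or nothing.
known = {content_id(r): r for r in mainline}
print(known.get(content_id("experimental rewrite")))
```

Looking up the fork's revision by id in the mainline finds nothing,
while looking it up by number silently finds a different revision.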

--b.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:56                                       ` Jakub Narebski
@ 2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 22:40                                           ` Petr Baudis
  2006-10-21  7:56                                         ` Matthieu Moy
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 15:34 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2005 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>In Bazaar bundles, the text of the diff is an integral part of the data.
>> It is used to generate the text of all the files in the revision.
> 
> 
> I thought that the diff was combined diff of changes.

It is.  It's a description of how to produce revision X given revision
Y, where Y is the last-merged mainline revision.

>>Bazaar bundles were designed to be used on mailing lists.  So you can
>>review the changes from the diff, comment on them, and if it seems
>>suitable, merge them.
> 
> 
> If you have only mega-diff, you can comment only on this mega-diff.

That is what we prefer to review.

>>>Although that might just make the email bigger for not a lot of
>>>gain.
>>
>>It's my understanding that most changes discussed on lkml are provided
>>as a series of patches.  Bazaar bundles are intended as a direct
>>replacement for patches in that use case.
> 
> 
> As _series_ of patches. You have git-format-patch + git-send-email
> to format and send them, git-am to apply them (as patches, not as branch).

If you want to do it exactly the same way, you send a series of bundles.

The bundle format can also support sending a single bundle that
displays the series of patches, though there's currently no UI to select
this.

> I was under an impression that user sees only mega-patch of all the
> revisions in bundle together, and rest is for machine consumption only.

All of it is for machine consumption.  The MIME-encoded sections are a
series of patches.  They're usually MIME-encoded to avoid confusion with
the overview patch, but this is optional.

I've attached an example of what a combined patch-by-patch bundle looks
like.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOOyB0F+nu1YWqI0RAtU6AKCJndTNlTTPNnzxZX53lkBUUHTYkwCfePlG
7x3cjpYwh8LXEb5ZWXXmu6s=
=6Lgv
-----END PGP SIGNATURE-----

[-- Attachment #2: hello-world.patch --]
[-- Type: text/x-patch, Size: 1808 bytes --]

# Bazaar revision bundle v0.8
#
# message:
#   Added 'world'
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:30:21.903000116 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-Hello
+Hello, world

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 153021-b5fcea14e9cd2b34
# revision id: abentley@panoramicfeedback.com-20061020153021-b5fcea14e9cd2b34
# sha1: 6d553e72158aaa76c258d98c15cd24922d171cd9
# inventory sha1: 64af82c4d81d9d6ad4f33fc734d32c2a1eaa0df5
# parent ids:
#   abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# properties:
#   branch-nick: bar

# message:
#   Capitalized
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:51.953999996 -0400

=== modified file world
--- world
+++ world
@@ -1,1 +1,1 @@
-hello
+Hello

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 152951-10cff5ff5a51e9a2
# revision id: abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# sha1: f7b79934bc3b0a944e35168b5df6b106c5b29ebf
# inventory sha1: 1400d56451752300cc31c9c94ff7ee2188e8ef8c
# parent ids:
#   abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# properties:
#   branch-nick: bar

# message:
#   initial commit
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:35.536999941 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1
--- /dev/null
+++ world
@@ -0,0 +1,1 @@
+hello

# revision id: abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# sha1: 0728f761b891b257f0a71e2e360799eec080cd21
# inventory sha1: e52e030ea40f6bf5da78f4e8eb8efcd072b0930a
# properties:
#   branch-nick: bar
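
The difference between the per-revision patches in the bundle above and
a single overview diff can be reproduced with Python's difflib. This is
only a sketch: the three file texts are taken from the attachment, but
the diffs are produced by difflib rather than by bzr, and treating the
overview's base as "file does not exist yet" is an assumption.

```python
import difflib

# Successive texts of the file "world", oldest first; the leading empty
# list stands for "file does not exist yet".
revisions = [[], ["hello\n"], ["Hello\n"], ["Hello, world\n"]]


def diff(old, new):
    """Unified diff between two versions of 'world', as a list of lines."""
    return list(difflib.unified_diff(old, new, "world", "world"))


# One patch per revision, as in the per-revision sections of the bundle.
series = [diff(a, b) for a, b in zip(revisions, revisions[1:])]

# A single overview patch from the base to the newest revision.
overview = diff(revisions[0], revisions[-1])

for patch in series:
    print("".join(patch))
print("".join(overview))
```

The overview shows only the final text, so the intermediate "hello" to
"Hello" step is visible in the series but not in the combined diff.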


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 14:52                                                 ` Johannes Schindelin
@ 2006-10-20 15:34                                                   ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 15:34 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Aaron Bentley, Carl Worth, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
>> Jeff King wrote:
>>> 
>>> I was accustomed to doing such things in CVS, but I find the git way
>>> much more pleasant, since I don't have to do any arithmetic:
>>>   diff d8a60^..d8a60
>> 
>> By the way "diff d8a60" also works (unless d8a60 is merge commit, in
>> which case you would need "diff -c d8a60" or "diff -m d8a60").
> 
> I could be wrong, but I have the impression (even after actually testing 
> it) that "git diff d8a60" is equivalent to "git diff d8a60..HEAD", _not_ 
> "git diff d8a60^..d8a60".

Ooops, I mixed git-diff-tree (which behaves as mentioned above) with
git-diff, which according to documentation compares with working tree
(and not HEAD) if only one <tree-ish> is given.

git-diff(1):
       *  When one <tree-ish> is given, the working tree and the named tree are
          compared, using git-diff-index. The option --cached can be given to
          compare the index file and the named tree.

git-diff-tree(1):
       If there is only one <tree-ish> given, the commit is compared with its
       parents (see --stdin below).
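
Put as a toy model in Python (a sketch of the two documented behaviours
only, not of git's code; the history, the tree names and the function
name are invented):

```python
# Toy history: each commit names its parents and the tree it records.
commits = {
    "a1": {"parents": [], "tree": "tree-a"},
    "d8a60": {"parents": ["a1"], "tree": "tree-d"},
    "head": {"parents": ["d8a60"], "tree": "tree-h"},
}


def compared_pairs(command, rev):
    """Return the (old, new) pairs a one-argument invocation compares."""
    if command == "git-diff-tree":
        # The commit against each of its parents.
        return [(commits[p]["tree"], commits[rev]["tree"])
                for p in commits[rev]["parents"]]
    if command == "git-diff":
        # The named tree against the working tree (not HEAD).
        return [(commits[rev]["tree"], "working tree")]
    raise ValueError(command)


print(compared_pairs("git-diff-tree", "d8a60"))
print(compared_pairs("git-diff", "d8a60"))
```
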
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
       [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
  2006-10-20 15:37                                         ` Sean
@ 2006-10-20 15:37                                         ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-20 15:37 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Alexander Belchenko, bazaar-ng, git

On Fri, 20 Oct 2006 10:03:16 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> In Bazaar bundles, the text of the diff is an integral part of the data.
> It is used to generate the text of all the files in the revision.
> 
> Bazaar bundles were designed to be used on mailing lists.  So you can
> review the changes from the diff, comment on them, and if it seems
> suitable, merge them.

Perhaps I missed something in the earlier mails about this feature.
As I understood it, the email sent has a combined diff that shows
the net effect of all the commits included in the bundle.  (Whereas
the current Cogito version only shows a diffstat)

If the recipient of such a bundle is unable to extract the diff of
each separate commit included in the bundle then I can't see any
value in the feature at all.  But showing a combined diff in the
email may have marginal value, so long as when the bundle is 
imported into the recipient repository the individual commits
are available.

> It's my understanding that most changes discussed on lkml are provided
> as a series of patches.  Bazaar bundles are intended as a direct
> replacement for patches in that use case.

A combined diff of a bunch of changes would usually be most _unwelcome_
for review on lkml.  The constant refrain is to ask people to split their
changes up into smallish individual patches for review.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:28                                         ` Jakub Narebski
@ 2006-10-20 15:39                                           ` Johannes Schindelin
  2006-10-20 16:05                                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-20 15:39 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Johannes Schindelin wrote:
> 
> > On Fri, 20 Oct 2006, Jakub Narebski wrote:
> > 
> >> Christian MICHON wrote:
> >> 
> >>> - git is the fastest scm around
> >> 
> >> Mercurial also claims that.
> > 
> > Funny. When you type in "mercurial" and "benchmark" into Google, the 
> > _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
> > benchmark". Performed by the good Mercurial people.
> > 
> > Leaving git as winner.
>  
> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
> by Mercurial.

Thanks for the hint!

BTW the tests in Clone/status/pull make sense, especially the "4 times 
slower on pull/merge". In my tests, merge-recur (the default merge 
strategy, which was written in Python, and is now in C) was substantially 
faster.

> Probably not indexed by Google, or doesn't have high pagerank because it 
> is in PDF and fairly new (therefore has low "citations" number).

I hope these posts boost the pagerank.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:33                                           ` J. Bruce Fields
@ 2006-10-20 15:43                                             ` Jeff King
  0 siblings, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-20 15:43 UTC (permalink / raw)
  To: J. Bruce Fields
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Jakub Narebski,
	Andreas Ericsson, bazaar-ng, git

On Fri, Oct 20, 2006 at 11:33:23AM -0400, J. Bruce Fields wrote:

> Well, I thought the discussion was about what meaning references have
> after branches were modified or removed.  In which case the interesting
> situation is one where an object is gone but someone somewhere still
> holds a reference (because the SHA1 was mentioned in a bug report or an
> email or whatever).

Git tries very hard to make sure you don't have a reference to something
that doesn't exist. But yes, you could have a reference to the SHA1 in
another, non-git source, and try to guess the data from it. However,
there's a bit of a two-step procedure, since the SHA1 will likely be of
the commit. You have to guess the commit author, date, message, and
the contents of the rest of the tree to make a correct guess.
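
To show how much has to be guessed exactly right, here is a sketch in
Python of how a commit id is derived. The object header ("<type> <size>"
plus a NUL byte) is git's format; the author, date and message are
hypothetical, the commit has no parent, and the author doubles as the
committer to keep the example short.

```python
import hashlib


def git_object_id(kind, body):
    """SHA1 of a git object: '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree, author, date, message):
    """Serialise a parentless commit; author doubles as committer here."""
    ident = f"{author} {date}"
    return (f"tree {tree}\n"
            f"author {ident}\n"
            f"committer {ident}\n"
            f"\n{message}\n").encode("utf-8")


# Hypothetical values: the tree is the id of an empty file list.
tree = git_object_id("tree", b"")
right = commit_body(tree, "A U Thor <author@example.com>",
                    "1161358980 +0000", "add launch codes")
wrong = commit_body(tree, "A U Thor <author@example.com>",
                    "1161358981 +0000", "add launch codes")

# A guess that is off by one second in the date gives an unrelated id.
print(git_object_id("commit", right))
print(git_object_id("commit", wrong))
```

The tree id in turn depends on every file in the tree, so a guess at
one secret file only helps if everything else is known exactly.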

In practice I think most "launch code" scenarios are less about
guessable confidentiality, and more about ceasing to publish things you
shouldn't be (like copyright or patent encumbered code).

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:51                                             ` Jakub Narebski
@ 2006-10-20 15:58                                               ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 15:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git



On Fri, 20 Oct 2006, Jakub Narebski wrote:
> Junio C Hamano wrote:
> > 
> > An interesting effect on this is when people have a column for
> > merge performance in a SCM comparison table, they would include
> > time to run the diffstat as part of the time spent for merging
> > when they fill in the number for git, but not for any other SCM.
> 
> So if you want to compare merge performance with other SCM, you should
> either add time to run diffstat for other SCM, or subtract time to
> run "git diff-tree --stat".

Naah. Just run "git pull -n". It's even documented:

	OPTIONS
	       -n, --no-summary
	              Do not show diffstat at the end of the merge.

so while the _default_ is to always show the diffstat, you certainly can 
easily do without it.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 15:39                                           ` Johannes Schindelin
@ 2006-10-20 16:05                                             ` Jakub Narebski
  2006-10-20 16:24                                               ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:05 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, bazaar-ng

Johannes Schindelin wrote:
> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>> Johannes Schindelin wrote:
>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> 
>>>> Christian MICHON wrote:
>>>> 
>>>>> - git is the fastest scm around
>>>> 
>>>> Mercurial also claims that.
>>> 
>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>> benchmark". Performed by the good Mercurial people.
>>> 
>>> Leaving git as winner.
>>  
>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>> by Mercurial.
> 
> Thanks for the hint!
> 
> BTW the tests in Clone/status/pull make sense, especially the "4 times 
> slower on pull/merge". In my tests, merge-recur (the default merge 
> strategy, which was written in Python, and is now in C) was substantially 
> faster.

As was mentioned elsewhere in this thread, to compare pull/merge times
in git with other SCMs one should in principle subtract the time for
diffstat (git diff --stat).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 15:34                                         ` Aaron Bentley
@ 2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 22:40                                           ` Petr Baudis
  1 sibling, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:21 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:

> === added directory  // file-id:TREE_ROOT

Gaaah, so rename detection in bzr is done using file-ids?
Linus will tell you the inherent problems with that "solution".
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 16:05                                             ` Jakub Narebski
@ 2006-10-20 16:24                                               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 16:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jakub Narebski wrote:

> Johannes Schindelin wrote:
>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>> Johannes Schindelin wrote:
>>>> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>>>> 
>>>>> Christian MICHON wrote:
>>>>> 
>>>>>> - git is the fastest scm around
>>>>> 
>>>>> Mercurial also claims that.
>>>> 
>>>> Funny. When you type in "mercurial" and "benchmark" into Google, the 
>>>> _first_ hit is into "git Archives: Mercurial 0.4b vs git patchbomb 
>>>> benchmark". Performed by the good Mercurial people.
>>>> 
>>>> Leaving git as winner.
>>>  
>>> Check out http://git.or.cz/gitwiki/GitBenchmarks section "Quilt import 
>>> comparison of Git and Mercurial" for the latest (OLS2006) benchmark
>>> by Mercurial.
>> 
>> Thanks for the hint!
>> 
>> BTW the tests in Clone/status/pull make sense, especially the "4 times 
>> slower on pull/merge". In my tests, merge-recur (the default merge 
>> strategy, which was written in Python, and is now in C) was substantially 
>> faster.
> 
> As it was mentioned somewhere else in this thread, to compare times
> for pull/merge in git with other SCM one should in principle subtract
> time for diffstat/git diff --stat.

Or as reminded, use -n, --no-summary option to git pull.

BTW. I'd rather have -n == --no-commit for git pull...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 16:21                                           ` Jakub Narebski
@ 2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:21                                               ` Shawn Pearce
  2006-10-20 18:12                                             ` Jan Hudec
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 17:03 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
> 
>>=== added directory  // file-id:TREE_ROOT
> 
> 
> Gaaah, so rename detection in bzr is done using file-ids?
> Linus will tell you the inherent problems with that "solution".

All solutions have disadvantages.  We prefer the disadvantages that come
from using file-ids over the disadvantages that come from using
content-based rename detection.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQFo0F+nu1YWqI0RAlCnAJwIqwuPG/IPBBQWaGyEImTm4GMP6QCfTV89
QZaMQsTqXBH8wrt7VKAHpII=
=Qx2i
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:03                                             ` Aaron Bentley
@ 2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:45                                                 ` Jakub Narebski
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
  2006-10-20 17:21                                               ` Shawn Pearce
  1 sibling, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:18 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

That's fine, but please don't call the git rename handling "maybe" or 
"partial", like a lot of people seem to do. 

Git _definitely_ handles renames, both in everyday life and when merging. 
Some people may not like how it's done, but other (I'll say "equally 
informed", even though obviously I know better ;) people really don't like 
the way bzr or others do their rename handling.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:03                                             ` Aaron Bentley
  2006-10-20 17:18                                               ` Linus Torvalds
@ 2006-10-20 17:21                                               ` Shawn Pearce
  2006-10-20 17:48                                                 ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-10-20 17:21 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski

Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> Jakub Narebski wrote:
> > Gaaah, so rename detection in bzr is done using file-ids?
> > Linus will tell you the inherent problems with that "solution".
> 
> All solutions have disadvantages.  We prefer the disadvantages that come
> from using file-ids over the disadvantages that come from using
> content-based rename detection.

As good as the content based rename detection is I got burned
recently by it.

I renamed hundreds of small files in one shot and also did a few
hundred adds and deletes of other small XML files.  Git generated
a lot of those unrelated adds/deletes as rename/modifies, as their
content was very similar.  Some people involved in the project
freaked as the files actually had nothing in common with one
another... except for a lot of XML elements (as they shared the
same DTD).
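
To make the failure mode concrete, here is a minimal sketch of
similarity-based rename pairing. It is not git's actual heuristic: the
scoring (Python's difflib), the greedy pairing, the function name
detect_renames and the 0.5 threshold are all assumptions chosen for
illustration. Files that share mostly boilerplate score high under any
scheme of this kind, which is why unrelated adds and deletes can pair up.

```python
from difflib import SequenceMatcher


def detect_renames(deleted, added, threshold=0.5):
    """Greedily pair each deleted path with its most similar added path.

    deleted, added: dicts mapping path -> file text.
    threshold: hypothetical cut-off, not git's real default.
    Returns a list of (old_path, new_path, score) tuples.
    """
    pairs = []
    unused = dict(added)
    for old_path, old_text in sorted(deleted.items()):
        best_path, best_score = None, 0.0
        for new_path, new_text in unused.items():
            # ratio() is 1.0 for identical text and 0.0 for nothing in common
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score > best_score:
                best_path, best_score = new_path, score
        if best_path is not None and best_score >= threshold:
            pairs.append((old_path, best_path, best_score))
            del unused[best_path]  # each added file is used at most once
    return pairs


# A pure rename (identical content) is paired with a perfect score.
print(detect_renames({"a.txt": "hello world\n"}, {"b.txt": "hello world\n"}))
```

Raising the threshold trades missed renames for fewer false pairings;
because nothing is recorded in history, the choice can be revisited later.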

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:37                                                     ` Johannes Schindelin
  2006-10-20 12:03                                                       ` Jakub Narebski
@ 2006-10-20 17:23                                                       ` David Lang
  1 sibling, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-20 17:23 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git, bazaar-ng

On Fri, 20 Oct 2006, Johannes Schindelin wrote:

> On Fri, 20 Oct 2006, Jakub Narebski wrote:
>
>> Johannes Schindelin wrote:
>>
>>> On Fri, 20 Oct 2006, Lachlan Patrick wrote:
>>>
>>>> How does git disambiguate SHA1 hash collisions?
>>>
>>> It does not. You can fully expect the universe to go down before that
>>> happens.
>>
>> Or you can compile git with COLLISION_CHECK
>>
>>> From Makefile:
>> # Define COLLISION_CHECK below if you believe that SHA1's
>> # 1461501637330902918203684832716283019655932542976 hashes do not give you
>> # sufficient guarantee that no collisions between objects will ever happen.
>
> You can document your disbelief.
>
> But it does not change a thing. Since v0.99~653, we do not have any
> collision check, even if compiled with COLLISION_CHECK.

I had the same disbelief as you about this, however the last time this came up 
Linus pointed out something that satisfied me.

any action in git that could create or recreate an object will not overwrite 
an object that it thinks that it already has.

so

if you create a new local file that would conflict and save it, git will accept 
your save and throw away the new file.

if you pull from a remote repository and there is a file there that conflicts 
with a file you already have it will throw away the new file.

if you pull from a remote repository and someone has hacked it to replace a file 
with a bad one, if you already have the good one git will throw away the bad 
one.

as a result the worst case is that a new file being checked in doesn't really 
get in and when someone checks it out and tries to use it they get the old 
contents. In the case of code, it's extremely unlikely that the wrong code will 
even compile, let alone do anything remotely close to working correctly. At this 
point the fix is to go back to the original developer to get the correct 
version while additional changes are made to git (and remember that, unless this 
is a brand new file, the prior version is readily available so only the latest 
diff needs to be recovered)

so the odds are extremely low and the consequences of a collision are fairly 
minor.

git has (or had) an option to actually check the full contents before throwing 
away the new copy instead of just checking the hash (and throwing an error if 
the contents don't match), but the performance cost of this is pretty high.
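
the behaviour described above can be sketched with a toy content-addressed 
store. This is not git's code: the class name ObjectStore, the pluggable 
hash_fn and the check_collisions flag are made up for illustration, and the 
toy hashes raw bytes without git's object header. A deliberately weak hash 
function stands in for a real SHA1 collision.

```python
import hashlib

# Size of the SHA1 output space, the number quoted from the Makefile.
SHA1_SPACE = 2 ** 160


class ObjectStore:
    """Toy content-addressed store that never overwrites an existing object."""

    def __init__(self, hash_fn=None, check_collisions=False):
        # hash_fn is injectable only so that a collision can be simulated.
        self.hash_fn = hash_fn or (lambda data: hashlib.sha1(data).hexdigest())
        self.check_collisions = check_collisions
        self.objects = {}

    def put(self, data):
        key = self.hash_fn(data)
        existing = self.objects.get(key)
        if existing is None:
            self.objects[key] = data  # first writer wins
        elif self.check_collisions and existing != data:
            # the expensive variant: compare full contents, not just hashes
            raise ValueError("collision on object " + key)
        # otherwise the new copy is silently dropped
        return key


# Simulate a collision with a hash that maps everything to one key.
store = ObjectStore(hash_fn=lambda data: "deadbeef")
store.put(b"good contents")
store.put(b"bad contents")
print(store.objects["deadbeef"])  # the object stored first is kept
```

the full comparison is what costs time: every write of an already-known 
object has to read the stored copy back instead of stopping at the hash.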

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:18                                               ` Linus Torvalds
@ 2006-10-20 17:45                                                 ` Jakub Narebski
  2006-10-20 17:59                                                   ` Linus Torvalds
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 17:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Aaron Bentley, bazaar-ng, git

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
>> 
>> All solutions have disadvantages.  We prefer the disadvantages that come
>> from using file-ids over the disadvantages that come from using
>> content-based rename detection.

If I remember correctly, git decided on contents (plus filename)
similarity based rename detection because 1) it is more generic,
as it covers (or can cover) content movement and not only wholesale
renames of a file, and 2) file-id based rename handling works only
if you explicitly use an SCM command to rename the file, which is not
the case for non-SCM-aware channels such as patches (and accepting
ordinary patches is important for the Linux kernel, the project git
was created for).

Another problem with file-id based rename handling is not handling
file copying (correct me if I'm wrong), and troubles with removing
or renaming a file, then having new file with old name.
 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging. 
> Some people may not like how it's done, but other (I'll say "equally 
> informed", even though obviously I know better ;) people really don't like 
> the way bzr or others do their rename handling.

I think that "partial" refers to the incomplete handling of renames
in file history: a pathspec limit doesn't follow a file across renames.
The information is there in the SCM; it's the tools that need extending
(cf. the proposed --follow option, which would follow renames for a
single-file pathspec limit).

There was also suggestion of rr2-cache, which would record corrections
to automatic rename detection (rename/copy conflict resolving) 
if I remember correctly.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:18                                               ` Linus Torvalds
  2006-10-20 17:45                                                 ` Jakub Narebski
@ 2006-10-20 17:47                                                 ` Aaron Bentley
  2006-10-20 18:06                                                   ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 17:47 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>All solutions have disadvantages.  We prefer the disadvantages that come
>>from using file-ids over the disadvantages that come from using
>>content-based rename detection.
> 
> 
> That's fine, but please don't call the git rename handling "maybe" or 
> "partial", like a lot of people seem to do. 
> 
> Git _definitely_ handles renames, both in everyday life and when merging.

Hmm.  Could you say more here?  The only examples I can think of for
handling renames are situations that can be expressed as a merge.

For example, populating a working tree can be expressed as:
BASE: nothing
THIS: nothing
OTHER: aabbccddee

Or revert can be expressed as

BASE: current
THIS: current
OTHER: aabbccddee

Or fast-forward pull

BASE: last-commit
THIS: current
OTHER: aabbccddee

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOQuv0F+nu1YWqI0RAotBAKCEEzvh1Cc2jJH4NIEBwoYrDJlbUQCgiPBF
DZ4+hSbkjbvgOwbT4+oLzFA=
=wSgK
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:21                                               ` Shawn Pearce
@ 2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
                                                                     ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:48 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Aaron Bentley, Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Shawn Pearce wrote:
> 
> I renamed hundreds of small files in one shot and also did a few
> hundred adds and deletes of other small XML files.  Git generated
> a lot of those unrelated adds/deletes as rename/modifies, as their
> content was very similar.  Some people involved in the project
> freaked as the files actually had nothing in common with one
> another... except for a lot of XML elements (as they shared the
> same DTD).

Heh. We can probably tweak the heuristics (one of the _great_ things about 
content detection is that you can fix it after the fact, unlike the 
alternative).

That said, I've personally actually found the content-based similarity 
analysis to often be quite informative, even when (and perhaps 
_especially_ when) it ended up showing something that the actual author of 
the thing didn't intend.

So yeah, I've seen a few strange cases myself, but they've actually been 
interesting. Like seeing how much of a file was just a copyright license, 
and then a file being considered a "copy" just because it didn't actually 
introduce any real new code.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
@ 2006-10-20 17:58                                                   ` David Lang
  2006-10-20 18:15                                                   ` Jon Smirl
                                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-20 17:58 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

On Fri, 20 Oct 2006, Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Shawn Pearce wrote:
>>
>> I renamed hundreds of small files in one shot and also did a few
>> hundred adds and deletes of other small XML files.  Git generated
>> a lot of those unrelated adds/deletes as rename/modifies, as their
>> content was very similar.  Some people involved in the project
>> freaked as the files actually had nothing in common with one
>> another... except for a lot of XML elements (as they shared the
>> same DTD).
>
> Heh. We can probably tweak the heuristics (one of the _great_ things about
> content detection is that you can fix it after the fact, unlike the
> alternative).
>
> That said, I've personally actually found the content-based similarity
> analysis to often be quite informative, even when (and perhaps
> _especially_ when) it ended up showing something that the actual author of
> the thing didn't intend.
>
> So yeah, I've seen a few strange cases myself, but they've actually been
> interesting. Like seeing how much of a file was just a copyright license,
> and then a file being considered a "copy" just because it didn't actually
> introduce any real new code.
>

isn't the default to consider them a copy if they are 80% the same, with a 
command line option to tweak this (IIRC -m, but I could easily be wrong)?

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:45                                                 ` Jakub Narebski
@ 2006-10-20 17:59                                                   ` Linus Torvalds
  2006-10-20 20:17                                                     ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 17:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git



On Fri, 20 Oct 2006, Jakub Narebski wrote:
> 
> If I remember correctly, git decided on contents (plus filename)
> similarity based rename detection because 1) it is more generic,
> as it covers (or can cover) content movement and not only wholesale
> renames of a file, and 2) file-id based rename handling works only
> if you explicitly use an SCM command to rename the file, which is not
> the case for non-SCM-aware channels such as patches (and accepting
> ordinary patches is important for the Linux kernel, the project git
> was created for).

There are lots of problems with file ID's. One of the more obvious ones is 
indeed that if you arrive at the same state two different ways (eg patches 
vs "native SCM"), you end up with two fundamentally different trees. Even 
though clearly there was no real difference.
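
A small sketch of that first problem, under loud assumptions: the file-ID 
scheme below is a made-up model (not bzr's or BK's actual format), and the 
names content_tree_id, native_rename and apply_plain_patch are hypothetical. 
The same move is made once through the SCM and once through an ordinary 
patch; a name derived from content alone cannot tell the results apart, 
while the file-ID trees differ.

```python
import hashlib


def content_tree_id(files):
    """Name a tree purely by its sorted (path, content) pairs."""
    h = hashlib.sha1()
    for path in sorted(files):
        h.update(path.encode("utf-8") + b"\0")
        h.update(hashlib.sha1(files[path]).digest())
    return h.hexdigest()


def native_rename(tree, old, new):
    """Hypothetical file-ID SCM: an explicit rename keeps the file's ID."""
    result = dict(tree)
    result[new] = result.pop(old)
    return result


def apply_plain_patch(tree, old, new, fresh_id):
    """The same change as an ordinary patch: delete + add, so a new ID."""
    result = dict(tree)
    _old_id, data = result.pop(old)
    result[new] = (fresh_id, data)
    return result


def contents(tree):
    """Drop the file IDs, leaving path -> content."""
    return {path: data for path, (_file_id, data) in tree.items()}


base = {"hello.txt": ("id-1", b"Hello World!\n")}
via_scm = native_rename(base, "hello.txt", "data/hello.txt")
via_patch = apply_plain_patch(base, "hello.txt", "data/hello.txt", "id-2")

# Identical files on disk, identical content-derived tree name...
print(content_tree_id(contents(via_scm)) == content_tree_id(contents(via_patch)))
# ...but the file-ID trees are not the same tree.
print(via_scm == via_patch)
```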

There are other serious problems. For example, file-ID based systems 
invariably have _huge_ problems with handling two branches deleting and 
renaming things differently, and we had several issues with that during 
the BK days (ie two people would move files differently, and ending up 
with different file ID's for the same path, and merging that inevitably 
causes problems not just during the merge, but ever after, since one of 
the file ID's will then have to be "deleted" even though it might be 
active in one of the branches).

Finally, file-ID based systems fundamentally cannot handle some simple and 
interesting cases, like partial content movement. We're starting to see 
git actually being able to track file content moving between files: even 
when the files themselves didn't move (ie Junio's "git pickaxe" work could 
do things like that).

And there really aren't as many advantages to tracking renames as people 
claim. The biggest advantage of tracking renames is to avoid the trap that 
CVS fell into: being file-ID based _and_ not being able to track the file 
ID moving is clearly the worst of all worlds.

So for anybody coming from a CVS background, tracking renames explicitly 
is a _huge_ advantage, which is, I think, why some SCM people have gotten 
so hung up about them. It's just that if you don't have the file-ID 
problem in the first place (and git doesn't), then rename tracking doesn't 
actually make any sense, and only makes things much worse.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
@ 2006-10-20 18:06                                                   ` Linus Torvalds
  2006-10-20 18:30                                                     ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:06 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Git _definitely_ handles renames, both in everyday life and when merging.
> 
> Hmm.  Could you say more here?  The only examples I can think of for
> handling renames are situations that can be expressed as a merge.

So yes, merges are the situation where renames are normally considered a 
"problem", but it's actually not nearly the most every-day situation at 
all.

The most common one is actually just showing things as a diff.

If you are looking at a code-change, there's an absolutely _huge_ 
difference if you look at the result as a "delete this huge file" and 
"create this other huge file" and seeing it as a "move this huge file from 
here to here, and change a few lines in the process".

So the most _important_ part of rename tracking from a user perspective is 
for the person who walks through somebody else's code history, and wants to 
know how a certain state came to be. The merges are usually not as big of 
a deal for the user (although they are clearly the most hairy case for the 
SCM - which is why SCM people concentrate on merges).

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 16:21                                           ` Jakub Narebski
  2006-10-20 17:03                                             ` Aaron Bentley
@ 2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
                                                                 ` (4 more replies)
  1 sibling, 5 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-20 18:12 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Aaron Bentley, bazaar-ng, git

On Fri, Oct 20, 2006 at 06:21:34PM +0200, Jakub Narebski wrote:
> Aaron Bentley wrote:
> 
> > === added directory  // file-id:TREE_ROOT
> 
> Gaaah, so rename detection in bzr is done using file-ids?
> Linus will tell you the inherent problems with that "solution".

Ok, I tried to read
http://permalink.gmane.org/gmane.comp.version-control.git/217

It's all nice and well, but my question is whether the below cases work
in git. Yes, they are particular cases, but they are particularly
important. If they don't, I'd rather have file-id scheme, that is
limited to just them, but handles them, than something with big plans,
but nothing working.

Let's consider following scenario:

(where A$ means working in branch A, B$ means working in branch B and
 VCT stands for version control tool of choice)

A$ echo Hello Warld! > hello.txt
A$ VCT add hello.txt
A$ VCT commit -m "Created greeting"
$ VCT branch A B
A$ VCT mkdir data
A$ VCT mv hello.txt data/
A$ VCT commit -m "Moved hello.txt to data dir"
B$ ed hello.txt
? 1s/Warld/World/
? wq
B$ VCT commit -m "Fixed typo in greeting"
A$ VCT merge B

At this point, I expect the tree to look like this:
A$ ls -R
.:
data/
data:
hello.txt
A$ cat data/hello.txt
Hello World!

The file-id algorithm is not exceptionally clever, is a bit of a
special case and all that, but it handles the above case right. And
while that scenario is just a special case of general content movement,
it is:
1) Very common
2) Possible to handle in an obviously correct way

It is very important for me that a version control tool I use handles
this case. If it handles the more general cases, that's nice, but this
is a must.

Oh, and there is one more complicated case, that I also require to work
and that works in Bzr, but did not work in Arch:

...let's start with the tree at the end of previous example...

A$ VCT mv data greetings
A$ VCT commit -m "Renamed the data directory to greetings"
B$ echo "Goodbye World!" > data/goodbye.txt
B$ VCT add data/goodbye.txt
B$ VCT commit -m "Added goodbye message."
A$ VCT merge B

And now I expect to have tree looking like this:

A$ ls -R
.:
greetings/
greetings:
hello.txt
goodbye.txt

And note that it is /not/ required to use file-ids to handle this.
Darcs handles this just as well with its patch algebra
(http://darcs.net/DarcsWiki/PatchTheory) without the need for any IDs.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
@ 2006-10-20 18:15                                                   ` Jon Smirl
  2006-11-03  3:43                                                     ` Matthew Hannigan
  2006-10-20 20:23                                                   ` Petr Baudis
  2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 1 reply; 806+ messages in thread
From: Jon Smirl @ 2006-10-20 18:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

On 10/20/06, Linus Torvalds <torvalds@osdl.org> wrote:
> So yeah, I've seen a few strange cases myself, but they've actually been
> interesting. Like seeing how much of a file was just a copyright license,
> and then a file being considered a "copy" just because it didn't actually
> introduce any real new code.

It may be worth doing something special for licenses. Lots of small
Mozilla files are also getting tripped up by the large copyright
notices. The notices take up a lot of space too. The Mozilla license
has been changed five times. That is 110,000 files times one to five
licenses at 800-1500 characters each. 500MB+ of junk before
compression.

You could have a file of macro substitutions that is applied/expanded
when files go in/out of git. The macros would replace the copyright
notices, improving the move/rename tracking and reducing repository
size. The macros could be recorded out of band to eliminate the need
for escaping the file contents. Even simpler, the only valid place for
the macro could be the beginning of the file.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:06                                                   ` Linus Torvalds
@ 2006-10-20 18:30                                                     ` Linus Torvalds
  2006-10-20 19:04                                                       ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:30 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Fri, 20 Oct 2006, Linus Torvalds wrote:
> 
> So yes, merges are the situation where renames are normally considered a 
> "problem", but it's actually not nearly the most every-day situation at 
> all.

Btw, this is a pet peeve of mine, and it is not at all restricted to 
the SCM world.

In CompSci in general, you see a _lot_ of papers about things that almost 
don't matter - not because the issues are that important in practice, but 
because the issues are something small enough to be something you can 
discuss and explain without having to delve into tons of ugly detail, and 
because it's something that has a lot of "mental masturbation" associated 
with it - ie you can discuss it endlessly.

In the OS world, it's things like schedulers. You find an _inordinate_ 
number of papers on scheduling, considering that the actual algorithm then 
tends to be something that can be expressed in a hundred lines of code or 
so, but it's got quite high "mental masturbatory value" (hereafter called 
MMV).

Other high-MMV areas are page-out algorithms (never mind that almost all 
_real_ VM problems are elsewhere) and some zero-copy schemes (never mind 
that if you actually need to _work_ with the data, zero-copy DMA may 
actually be much worse because it ends up having bad cache behaviour).

In the SCM world, file renames and merging seem to be the high-MMV things. 
Never mind that the real issues tend to be elsewhere (like _performance_ 
when you have a few thousand commits that you want to merge).

For example, in the kernel, I think about half of all merges are what git 
calls "trivial in-index merges". That's HALF. Being a trivial in-index 
merge means that there was not a single file-level conflict that even 
needed a three-way merge, much less any study of the history AT ALL (other 
than finding the common ancestor, of course).
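
To show how little such a merge needs, here is a sketch of a purely 
tree-level three-way merge. It is an illustration, not git's code: trees 
are modelled as dicts mapping path to a content hash, and the function 
name trivial_merge is made up. A path is resolved without ever looking 
inside a file whenever at most one side changed it relative to the common 
ancestor, or both sides ended up identical.

```python
def trivial_merge(base, ours, theirs):
    """Tree-level three-way merge over dicts of path -> content hash.

    Returns (merged, conflicts). A path lands in conflicts only when
    both sides changed it differently, i.e. when a file-level merge
    would be required.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (including both deleted)
            result = o
        elif b == o:      # only their side changed it
            result = t
        elif b == t:      # only our side changed it
            result = o
        else:             # both changed it, differently
            conflicts.append(path)
            continue
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.c": "1", "b.c": "1"}
ours = {"a.c": "2", "b.c": "1"}      # we changed a.c
theirs = {"a.c": "1", "b.c": "3"}    # they changed b.c
print(trivial_merge(base, ours, theirs))
```

Nothing here reads history beyond the single common ancestor, which is 
the point being made about the cheap half of all merges.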

Of the rest, most by far need some trivial 3-way merging. And the ones 
that have trouble? In practice, that trivial and maligned 3-way does 
_better_ than anything more complicated.

Yet, if you actually bother to follow all the discussion on #revctrl and 
other places, what do you find discussed? Right: various high-MMV issues 
like "staircase merge" etc crap.

Go to revctrl.org for prime example of this. I think half the stuff is 
about merge algorithms, some of it is about glossary, and almost none of 
it is about something as pedestrian and simple as performance and 
scalability.

(Actually, to be honest, I think some of the #revctrl noise has become 
better lately. I'm not seeing quite as much theoretical discussion, it may 
be that as open-source distributed SCM's are getting to be more "real", 
people start to slowly realize that the masturbatory crap isn't actually 
what it's all about. So maybe at least this area is getting more about 
real every-day problems, and less about the theoretical-but-not-very- 
important issues).

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
@ 2006-10-20 18:35                                               ` Jakub Narebski
  2006-10-20 18:46                                                 ` Jakub Narebski
  2006-10-20 18:47                                               ` Jakub Narebski
                                                                 ` (3 subsequent siblings)
  4 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:35 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, bazaar-ng, git

Jan Hudec wrote:
> On Fri, Oct 20, 2006 at 06:21:34PM +0200, Jakub Narebski wrote:
> > Aaron Bentley wrote:
> > 
> > > === added directory  // file-id:TREE_ROOT
> > 
> > Gaaah, so rename detection in bzr is done using file-ids?
> > Linus will tell you the inherent problems with that "solution".
> 
> Ok, I tried to read
> http://permalink.gmane.org/gmane.comp.version-control.git/217
> 
> It's all nice and well, but my question is whether the below cases work
> in git. Yes, they are particular cases, but they are particularly
> important. If they don't, I'd rather have file-id scheme, that is
> limited to just them, but handles them, than something with big plans,
> but nothing working.
> 
> Let's consider following scenario:
> 
> (where A$ means working in branch A, B$ means working in branch B and
>  VCT stands for version control tool of choice)

1077:jnareb@roke:/tmp/jnareb> mkdir tmp
1078:jnareb@roke:/tmp/jnareb> cd tmp/
1079:jnareb@roke:/tmp/jnareb/tmp> git init-db
defaulting to local storage area

> A$ echo Hello Warld! > hello.txt
1081:jnareb@roke:/tmp/jnareb/tmp> echo 'Hello Warld!' > hello.txt

> A$ VCT add hello.txt
1082:jnareb@roke:/tmp/jnareb/tmp> git add hello.txt

> A$ VCT commit -m "Created greeting"
1083:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Created greeting"

(here we are still on the default branch 'master'; let us switch to A)
1084:jnareb@roke:/tmp/jnareb/tmp> git branch A
1088:jnareb@roke:/tmp/jnareb/tmp> git checkout A

> $ VCT branch A B
1085:jnareb@roke:/tmp/jnareb/tmp> git branch B A
(create branch B based on A)

> A$ VCT mkdir data
1089:jnareb@roke:/tmp/jnareb/tmp> mkdir data

> A$ VCT mv hello.txt data/
1090:jnareb@roke:/tmp/jnareb/tmp> git mv hello.txt data/

> A$ VCT commit -m "Moved hello.txt to data dir"
1092:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Moved hello.txt to data dir"

> B$ ed hello.txt
> ? 1s/Warld/World/
> ? wq
1094:jnareb@roke:/tmp/jnareb/tmp> ed hello.txt 
13
1s/Warld/World/
wq
13

> B$ VCT commit -m "Fixed typo in greeting"
1096:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Fixed typo in greeting"

> A$ VCT merge B
1097:jnareb@roke:/tmp/jnareb/tmp> git checkout A
1098:jnareb@roke:/tmp/jnareb/tmp> git pull . B
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 9de7290d385ec2b0c2ade9b888f6c3a6633ac926
Merging: 
5f0eb04467538f0f1414af85ec6481150107c0b2 Moved hello.txt to data dir 
9de7290d385ec2b0c2ade9b888f6c3a6633ac926 Fixed typo in greeting 
found 1 common ancestor(s): 
f49a520e40143cb9d84b00e9728c5742897c0a22 Created greeting 

Merge made by recursive.
 data/hello.txt |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

> At this point, I expect the tree to look like this:
> A$ ls -R
1099:jnareb@roke:/tmp/jnareb/tmp> ls -R
.:
data

./data:
hello.txt

> A$ cat data/hello.txt
1100:jnareb@roke:/tmp/jnareb/tmp> cat data/hello.txt 
Hello World!



> A$ VCT mv data greetings
1102:jnareb@roke:/tmp/jnareb/tmp> git mv data greetings

> A$ VCT commit -m "Renamed the data directory to greetings"
1105:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Renamed the data directory to greetings"

> B$ echo "Goodbye World!" > data/goodbye.txt
1106:jnareb@roke:/tmp/jnareb/tmp> git checkout B
1109:jnareb@roke:/tmp/jnareb/tmp> echo 'Goodbye World!' > data/goodbye.txt
bash: data/goodbye.txt: There is no such file or directory
1110:jnareb@roke:/tmp/jnareb/tmp> ls -R
.:
hello.txt

You need to revise your example.
-- 
Jakub Narebski
Poland


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:35                                               ` Jakub Narebski
@ 2006-10-20 18:46                                                 ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:46 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, bazaar-ng, git

Jakub Narebski wrote:
>> A$ VCT commit -m "Moved hello.txt to data dir"
> 1092:jnareb@roke:/tmp/jnareb/tmp> git commit -a -m "Moved hello.txt to data dir"
> 
>> B$ ed hello.txt
>> ? 1s/Warld/World/
>> ? wq
Sorry, I forgot to include "git checkout B" in the email,
which is needed to actually switch to branch B.

> 1094:jnareb@roke:/tmp/jnareb/tmp> ed hello.txt 
> 13
> 1s/Warld/World/
> wq
> 13

-- 
Jakub Narebski
Poland


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
@ 2006-10-20 18:47                                               ` Jakub Narebski
  2006-10-20 19:00                                                 ` Linus Torvalds
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
                                                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 18:47 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

Jan Hudec wrote:

> And note, that it is /not/ required to use file-ids to handle this.
> Darcs handles this just as well with its patch algebra
> (http://darcs.net/DarcsWiki/PatchTheory) without need of any IDs.

And Darcs is, from opinions I've read, dog-slow.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
  2006-10-20 18:35                                               ` Jakub Narebski
  2006-10-20 18:47                                               ` Jakub Narebski
@ 2006-10-20 18:48                                               ` Linus Torvalds
  2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 19:14                                               ` Jakub Narebski
  2006-10-20 22:59                                               ` Jeff King
  4 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 18:48 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Jan Hudec wrote:
>
> Let's consider following scenario:

Here's a real-life scenario that we hit several times with BK over the
years:

 - take a real repository, and a patch that gets discussed that adds a new 
   file.
 - take two different people applying that patch to their trees (or, do 
   the equivalent thing, which is to just create the same filename
   independently, because the solution is obvious - and the same - to 
   both developers).
 - now, have somebody merge both of those two people's trees (eg me)
 - have the two people continue to use their trees, modifying them, and
   getting merged.

Trust me, this isn't even _unlikely_. It happens. And it's a serious 
problem for a file-ID case. Why? Because you have two different file ID's 
for the same pathname. 

(It happily only happened a handful of times, so it was never a big enough 
problem to cause me to think that BK was crap. But it definitely was a 
real issue).

What BK did (and what is likely the only reasonable thing to do) is to 
move one of the file-ID's to an "Attic" kind of place, and just go with 
the other. The nasty part is that now the developer whose file was 
"dropped" (and anybody who got the work from him) may still be continuing 
to work with _his_ copy of the same file, never even realizing that when 
his work gets merged, all his fixes GET THROWN AWAY!
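
A minimal sketch of the clash described above, in Python. This is a toy
model of a file-ID based SCM and not BK's actual mechanism: the names
add_file, merge_by_file_id and the "Attic/" prefix are hypothetical, and
the rule "lowest ID wins the pathname" is an assumption made only for
illustration.

```python
import itertools

_next_id = itertools.count(1)

def add_file(tree, path, content):
    """Create a file in `tree`; every creation mints a fresh file ID."""
    file_id = next(_next_id)
    tree[file_id] = (path, content)
    return file_id

def merge_by_file_id(ours, theirs):
    """Union two trees keyed by file ID. When two IDs claim the same
    pathname, the lower ID keeps it and the other goes to 'Attic/'."""
    merged = dict(ours)
    merged.update(theirs)
    owner_of_path = {}
    for file_id in sorted(merged):
        path, content = merged[file_id]
        if path in owner_of_path:
            merged[file_id] = ("Attic/" + path, content)
        else:
            owner_of_path[path] = file_id
    return merged

# Two developers apply the same patch independently: same path, two IDs.
alice, bob = {}, {}
alice_id = add_file(alice, "drivers/foo.c", "v1\n")
bob_id = add_file(bob, "drivers/foo.c", "v1\n")

merged = merge_by_file_id(alice, bob)

# Bob keeps fixing "his" file, unaware that his ID lost the clash.
bob[bob_id] = ("drivers/foo.c", "v1\nbob's fix\n")
merged = merge_by_file_id(merged, bob)

# What a checkout of the merged tree would contain, by pathname.
visible = {path: content for path, content in merged.values()}
```

In this model the merged drivers/foo.c never sees Bob's fix; the fix
lands on the copy in the attic, which is the "problem remains after the
merge" effect.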

And trust me, this isn't a theoretical thing. This actually happens. So 
you have problems at many levels: you have the problems that happen during 
the merge (where somebody needs to decide how to resolve the file-ID 
clash), but what a lot of SCM people seem to not have understood is that 
the problem actually _remains_ after the merge, and causes problems even 
down the line.

So yeah, content-based merging has its own problems (especially if you do 
things like re-indent a file as you move it, or if you have files that 
just look the same because they share 99% of their content through a 
copyright message), but at least so far, we've not really ever hit that 
issue in the kernel.

And we are actually approaching the old kernel BK tree in size with the 
current git tree (we're about 2/3rds of the way if you count number of 
commits). That's despite the fact that we actually have been moving things 
around.  So from a purely _practical_ standpoint, I really do have 
anecdotal evidence that I'm right.

I didn't have that evidence when I started, but I knew I was right anyway ;)

		Linus

PS. It's undoubtedly true that the SCM you use impacts _how_ you do 
development, so any project will almost automatically align itself with 
whatever SCM rules there are in place.

So "anecdotal evidence" in that sense isn't really wonderful, since it 
obviously is always a matter of a certain project/SCM combination - but 
the above example is about as neutral as you can get, since it's the 
_same_ project, with the _same_ maintainer, and roughly the _same_ rules,
just two different approaches wrt renames of the SCM's in question.


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:47                                               ` Jakub Narebski
@ 2006-10-20 19:00                                                 ` Linus Torvalds
  2006-10-20 19:10                                                   ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:00 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Jan Hudec, Aaron Bentley, bazaar-ng, git



On Fri, 20 Oct 2006, Jakub Narebski wrote:

> Jan Hudec wrote:
> 
> > And note, that it is /not/ required to use file-ids to handle this.
> > Darcs handles this just as well with its patch algebra
> > (http://darcs.net/DarcsWiki/PatchTheory) without need of any IDs.
> 
> And Darcs is, from opinions I've read, dog-slow.

You really cannot expect to get any kind of performance at all unless you:

 - are able to ignore 99.9% of all files on merging (ie you have to be 
   able to totally ignore the files that are identical in both sides, and 
   you really shouldn't even _care_ about why they ended up being 
   identical)

 - are able to ignore 99% of what the commits _did_ in between the merges 
   (ie if you need to look at them at all, only look at the part that 
   matters for the 0.1% of files that you couldn't ignore)

If you have to parse all the commit details all the way down to the common 
parent, you're basically already screwed. There's no _way_ you can make it 
fast. 

Git goes one step further: it _really_ doesn't matter how you got to
a certain state. Absolutely _none_ of what the commits in between the
final stages and the common ancestor did matters in the least. The only
thing that matters is what the states at the end-points are.

(Of course, you _could_ plug in a merge algorithm that cares, since there 
is more data there. I'm just talking about the standard "recursive" 
algorithm here.)
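
To illustrate "only the end-point states matter", here is a minimal
Python sketch of a tree-level three-way merge. It is not git's
implementation: snapshots are modelled as plain {path: content} dicts,
merge_trees is a hypothetical name, and conflicts are only reported, not
resolved. Note that the function never receives the intermediate commits.

```python
def merge_trees(base, ours, theirs):
    """Three-way merge of {path: content} snapshots.

    Only the common ancestor and the two end states are consulted.
    Returns (merged, conflicts); a missing path means "file absent".
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides ended up identical: nothing to do
            result = o
        elif o == b:      # only their side changed (or deleted) it
            result = t
        elif t == b:      # only our side changed (or deleted) it
            result = o
        else:             # both changed it differently
            conflicts.append(path)
            result = o    # keep ours as a placeholder
        if result is not None:
            merged[path] = result
    return merged, conflicts

base = {"a": "1", "b": "2", "c": "3"}
ours = {"a": "1x", "b": "2", "c": "3"}   # however many commits it took
theirs = {"a": "1", "b": "2y"}           # edited b, deleted c

merged, conflicts = merge_trees(base, ours, theirs)
```

Whether "ours" reached its state in one commit or fifty, the result is
the same, which is the guarantee being described.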

That's why git can be so fast, but it's actually more important than that: 
the fact that it doesn't matter _how_ you got to a certain state is 
actually a huge and important feature. In other words, you should see it 
as a guarantee, not as a "lack of knowledge".

Darcs thinks it matters how you got somewhere. Git consciously says: none 
of the individual patches matter, the only thing that matters is the end 
result, because you could have gotten the same result in a lot of 
different ways, and nobody _cares_.

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:30                                                     ` Linus Torvalds
@ 2006-10-20 19:04                                                       ` Aaron Bentley
  2006-10-20 19:31                                                         ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 19:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Linus Torvalds wrote:
> 
>>So yes, merges are the situation where renames are normally considered a 
>>"problem", but it's actually not nearly the most every-day situation at 
>>all.
> 
> 
> Btw, this is a pet peeve of mine, and it is not at all restricted to 
> the SCM world.

I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
get in the way of real work.  Polishing these kinds of things seems to
fall in the category of 10% of functionality that takes 90% of effort.

> Of the rest, most by far need some trivial 3-way merging. And the ones 
> that have trouble? In practice, that trivial and maligned 3-way does 
> _better_ than anything more complicated.

I think the great motivator for exploring other merge algorithms has
been criss-cross merge.  There are some workflows (e.g. the Launchpad
workflow) in which heavy mesh-merging takes place, leading to frequent
criss-crosses.

Bog-standard three-way doesn't handle that criss-cross very well.  I
understand git uses recursive three-way in that situation.

The other motivator has been cherry-picking.

So I'm happy that people are trying to devise merge algorithms that are
better than three-way.  When someone gets it right, we'll implement it.

And then there are other more incremental tweaks, like
merge-across-indent and merge-across-line-ending-change that I'd like to
see.

> Go to revctrl.org for prime example of this. I think half the stuff is 
> about merge algorithms, some of it is about glossary, and almost none of 
> it is about something as pedestrian and simple as performance and 
> scalability.

Partly this is because of Bram's interests.  AIUI, he started with a
merge algorithm and built a VCS around it.

> (Actually, to be honest, I think some of the #revctrl noise has become 
> better lately.

I used to spend time on #revctrl, but I think that was before you
started visiting.  Too bad I missed ya.

> So maybe at least this area is getting more about
> real every-day problems, and less about the theoretical-but-not-very-important
> issues).

It wouldn't surprise me if the early phases of VCS development tended
toward more theoretical discussion, just because so many questions are open.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOR3D0F+nu1YWqI0RAo5lAJ99+5ShvLXaVIRG1A8XN7HRicoPngCeLO+y
meMZVcjdX7AX9JCfhSN5uK4=
=AI8p
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:00                                                 ` Linus Torvalds
@ 2006-10-20 19:10                                                   ` Aaron Bentley
  2006-10-20 19:46                                                     ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 19:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> Git goes one step further: it _really_ doesn't matter how you got to
> a certain state. Absolutely _none_ of what the commits in between the
> final stages and the common ancestor did matters in the least. The only
> thing that matters is what the states at the end-points are.

That's interesting, because I've always thought one of the strengths of
file-ids was that you only had to worry about end-points, not how you
got there.

How do you handle renames without looking at the history?

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOR8c0F+nu1YWqI0RAkhJAJ9QJ3nyP/437/bNPI3VEVHZP0dEZACfZyEg
SWAp+673iTDEZfH00M4RG4k=
=1XO+
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
                                                                 ` (2 preceding siblings ...)
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
@ 2006-10-20 19:14                                               ` Jakub Narebski
  2006-10-20 22:59                                               ` Jeff King
  4 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 19:14 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

Jan Hudec wrote:

> Let's consider following scenario:
> 
> (where A$ means working in branch A, B$ means working in branch B and
>  VCT stands for version control tool of choice)
[...]
> At this point, I expect the tree to look like this:
> A$ ls -R
> .:
> data/
> data:
> hello.txt
> A$ cat data/hello.txt
> Hello World!
[...]
> Oh, and there is one more complicated case, that I also require to work
> and that works in Bzr, but did not work in Arch:
> 
> ...let's start with the tree at the end of previous example...
> 
> A$ VCT mv data greetings
> A$ VCT commit -m "Renamed the data directory to greetings"
> B$ echo "Goodbye World!" > data/goodbye.txt
> B$ VCT add data/goodbye.txt
> B$ VCT commit -m "Added goodbye message."
> A$ VCT merge B

(slightly corrected example).

A$ git branch B
A$ git mv data greetings
A$ git commit -a -m "Renamed the data directory to greetings"
A$ git checkout B
B$ echo 'Goodbye World!' > data/goodbye.txt
B$ git add data/goodbye.txt
B$ git commit -a -m "Added goodbye message."
B$ git checkout A
A$ git pull . B
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 4a8a1a7941f214c6173786b583830b4f74a67c1f
Merging: 
96738390ba0b4de5b234059081701badc1c86693 Renamed the data directory to greetings 
4a8a1a7941f214c6173786b583830b4f74a67c1f Added goodbye message. 
found 1 common ancestor(s): 
7cfd8edd06b7cb016856737d8fd98d5d096955b5 Merge branch 'B' into A 

Merge made by recursive.
 data/goodbye.txt |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 data/goodbye.txt

> And now I expect to have tree looking like this:
> 
> A$ ls -R
> .:
> greetings/
> greetings:
> hello.txt
> goodbye.txt

So git _fails_ (your expectations) in this case:
A$ ls -R
.:
data  greetings

./data:
goodbye.txt

./greetings:
hello.txt


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:04                                                       ` Aaron Bentley
@ 2006-10-20 19:31                                                         ` Linus Torvalds
  2006-10-20 20:12                                                           ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Btw, this is a pet peeve of mine, and it is not at all restricted to 
> > the SCM world.
> 
> I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
> get in the way of real work.  Polishing these kinds of things seems to
> fall in the category of 10% of functionality that takes 90% of effort.

Well, the thing is, that 10% of the functionality usually takes a whole 
lot _less_ than 10% of the work.

The stuff you can think through (and argue about) tends to be the easy 
stuff. Exactly because you _can_ think about it abstractly.

The stuff that is actually really hard and time-consuming is the stuff 
that you find out in practice, and you have to iterate on.

In kernels, for example, it seems like 99% of the effort ends up being 
hardware-dependent stuff. Getting architecture interfaces right, and 
getting working drivers. Hotplugging and device management turns out to be 
a _much_ bigger issue than schedulers or VM page-out has _ever_ been. 

But show me a single paper about them. I'm sure they exist. I'm just 
saying that they're sure as heck not getting 99% of the attention (or even 
1% of the attention) in discussions, even though they're definitely 99% of 
the real everyday work and effort.

(Maybe it's not 99%. Numbers taken out of my nether regions. The point 
should be clear).

The same is actually true of SCM's too, I'm totally convinced. At least in 
git, we really haven't spent _that_ much time on merges, for example. My 
original stupid three-way merge was really simple, and I think the way I 
introduced "stages" into the git index was really clever, but it was still 
a small detail. And it worked surprisingly well.

After that merge, people improved it. And "recursive" is a _huge_ 
improvement, don't get me wrong: it's still entirely a 3-way merge on the 
file contents, but it now does those 3-way merges in several stages if 
there are multiple independent common parents, and the rename logic is 
clearly important.

But if you actually look at how much effort was spent on merging, and how 
much was spent on just "details in general", I think you'll find merging 
to be pretty low down the list, even though the recursive merge ended up 
_also_ getting re-written in C. Perhaps it was one of the bigger 
_individual_ efforts, but compared to all the work we've continually done 
on performance and usability, for example, it's been pretty small in the 
end.

As an example: I suspect that in git just the CVS importer has gotten 
_way_ more attention than merging ever got. Importing from CVS is simply a 
much harder problem in practice, and we've probably had more people 
working on it (and that's _despite_ the fact that this is one of the areas 
where git has successfully re-used other projects that had similar goals: 
cvsps, cvs2svn etc). It's hard to "think" about, because a lot of the 
problems with importing from CVS are literally all about the details and 
the nasty crud. I really think "merging" is _way_ easier.

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:10                                                   ` Aaron Bentley
@ 2006-10-20 19:46                                                     ` Linus Torvalds
  2006-10-20 20:29                                                       ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 19:46 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> Linus Torvalds wrote:
> > Git goes one step further: it _really_ doesn't matter how you got to
> > a certain state. Absolutely _none_ of what the commits in between the
> > final stages and the common ancestor did matters in the least. The only
> > thing that matters is what the states at the end-points are.
> 
> That's interesting, because I've always thought one of the strengths of
> file-ids was that you only had to worry about end-points, not how you
> got there.
> 
> How do you handle renames without looking at the history?

You first handle all the non-renames that just merge on their own. That 
takes care of 99.99% of the stuff (and I'm not exaggerating: in the 
kernel, you have ~21000 files, and most merges don't have a single rename 
to worry about - and even when you do have them, they tend to be in the 
"you can count them on one hand" kind of situation).

Then you just look at all the pathnames you _couldn't_ resolve, and that's 
usually cut down the thing to something where you can literally use a lot 
of CPU power per file, because now you only have a small number of 
candidates left.

If you were to use one hundredth of a second per file regardless of file, 
a stupid per-file merge would take 210 seconds, which is just 
unacceptable. So you really don't want to do that. You want to merge whole 
subdirectories in one go (and with git, you can: since the SHA1 of a 
directory defines _all_ of the contents under it, if the two branches you 
merge have an identical subdirectory, you don't need to do anything at 
_all_ about that one. See?).
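
A rough Python sketch of why identical subdirectories cost nothing. It
assumes a simplified object format (not git's real one) and models a tree
as a nested dict of {name: bytes | dict}; tree_id and differing_paths are
hypothetical names. Real git stores each tree's SHA1, so the comparison
is a lookup; this sketch recomputes the hashes only to stay short.

```python
import hashlib

def tree_id(tree):
    """Hash a nested tree; the ID covers _all_ content underneath it."""
    h = hashlib.sha1()
    for name in sorted(tree):
        entry = tree[name]
        if isinstance(entry, dict):
            h.update(f"tree {name} {tree_id(entry)}\n".encode())
        else:
            blob = hashlib.sha1(entry).hexdigest()
            h.update(f"blob {name} {blob}\n".encode())
    return h.hexdigest()

def differing_paths(a, b, prefix=""):
    """Paths that differ between two trees, pruning identical subtrees."""
    if tree_id(a) == tree_id(b):
        return []                      # whole subtree identical: skip it
    out = []
    for name in sorted(set(a) | set(b)):
        x, y = a.get(name), b.get(name)
        path = prefix + name
        if isinstance(x, dict) and isinstance(y, dict):
            out += differing_paths(x, y, path + "/")
        elif x != y:
            out.append(path)
    return out

ours = {
    "drivers": {"a.c": b"1", "b.c": b"2"},
    "kernel": {"sched.c": b"x"},
}
theirs = {
    "drivers": {"a.c": b"1", "b.c": b"2"},
    "kernel": {"sched.c": b"y"},
}
changed = differing_paths(ours, theirs)
```

Here "drivers" is dismissed after one ID comparison, and only
kernel/sched.c is ever looked at individually.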

So instead of trying to be really fast on individual files and doing them 
one at a time, git makes individual files basically totally free (you 
literally often don't need to look at them AT ALL). And then for the few 
files you can't resolve, you can afford to spend more time.

So say that you spend one second per file-pair because you do complex 
heuristics etc - you'll still have a merge that is a _lot_ faster than 
your 210-second one.

So recursive basically generates the matrix of similarity for the 
new/deleted files, and tries to match them up, and there you have your 
renames - without ever looking at the history of how you ended up where 
you are.
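
The similarity-matrix idea can be sketched in Python as follows. This is
only an illustration, not git's code: it uses difflib's ratio() as a
stand-in similarity measure, a greedy best-score-first pairing, and a 0.5
threshold chosen here as an assumption; detect_renames is a hypothetical
name.

```python
from difflib import SequenceMatcher

def detect_renames(base, side, threshold=0.5):
    """Pair files deleted since `base` with files added in `side`.

    Both arguments are {path: content} snapshots; no history is used.
    Returns {old_path: new_path}.
    """
    deleted = sorted(p for p in base if p not in side)
    added = sorted(p for p in side if p not in base)

    # The "matrix": one similarity score per (deleted, added) pair.
    scores = [
        (SequenceMatcher(None, base[d], side[a]).ratio(), d, a)
        for d in deleted
        for a in added
    ]

    renames, used_old, used_new = {}, set(), set()
    for score, d, a in sorted(scores, reverse=True):
        if score < threshold:
            break                      # everything left is too dissimilar
        if d in used_old or a in used_new:
            continue                   # already matched to a better pair
        renames[d] = a
        used_old.add(d)
        used_new.add(a)
    return renames

base = {"data/hello.txt": "Hello Warld!\n", "README": "read me\n"}
side = {
    "greetings/hello.txt": "Hello World!\n",
    "README": "read me\n",
    "NEWS": "totally different text about releases\n",
}
renames = detect_renames(base, side)
```

Only the few paths that could not be matched by name enter the matrix,
which is why spending real CPU time per pair is affordable.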

Btw, that "210 second" merge is not at all unlikely. Some of the SCM's 
seem to scale much worse than that to big archives, and I've heard people 
talk about merges that took 20 minutes or more. In contrast, git doing a 
merge in ~2-3 seconds for the kernel is _normal_.

[ In fact, I just re-tested doing my last kernel merge: it took 0.970
  seconds, and that was _including_ the diffstat of the result - but
  obviously not including the time to fetch the other branch over the
  network.

  I don't know if people appreciate how good it is to do a merge of two 
  21000-file branches in less than a second. It didn't have any renames, 
  and it only had a single well-defined common parent, but not only is 
  that the common case, being that fast for the simple case is what 
  _allows_ you to do well on the complex cases too, because it's what gets 
  rid of all the files you should _not_ worry about ]

Performance does matter. 

			Linus


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:31                                                         ` Linus Torvalds
@ 2006-10-20 20:12                                                           ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 20:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>>Btw, this is a pet peeve of mine, and it is not at all restricted to 
>>>the SCM world.
>>
>>I guess I don't mind a bit of high-mmv discussion, so long as it doesn't
>>get in the way of real work.  Polishing these kinds of things seems to
>>fall in the category of 10% of functionality that takes 90% of effort.
> 
> 
> Well, the thing is, that 10% of the functionality usually takes a whole 
> lot _less_ than 10% of the work.

I guess this depends on whether you consider the brainstorming and
discussion to be part of the work of polishing, and I do mean polishing.
 Getting from something that works 90% of the time to something that
works 99% of the time can be a questionable expenditure of time and effort.

> The same is actually true of SCM's too, I'm totally convinced. At least in 
> git, we really haven't spent _that_ much time on merges, for example. My 
> original stupid three-way merge was really simple, and I think the way I 
> introduced "stages" into the git index was really clever, but it was still 
> a small detail. And it worked surprisingly well.

I did rewrite our merge code once, but that was because the API was
quite hard to deal with and made it hard to maintain.  I agree that it's
important to focus effort on the areas that make a difference.

On the other hand, our "exotic" text merge algorithms have been praised
by the people who work on Launchpad.  So that's a win.

> As an example: I suspect that in git just the CVS importer has gotten 
> _way_ more attention than merging ever got. Importing from CVS is simply a 
> much harder problem in practice, and we've probably had more people 
> working on it (and that's _despite_ the fact that this is one of the areas 
> where git has successfully re-used other projects that had similar goals: 
> cvsps, cvs2svn etc). It's hard to "think" about, because a lot of the 
> problems with importing from CVS are literally all about the details and 
> the nasty crud. I really think "merging" is _way_ easier.

Yes, I don't even want to think about CVS when I don't have to.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOS2Y0F+nu1YWqI0RAiOcAJ0TXmBdiCcvnTzmg+nnF+kayJ25cgCggMFx
w6xFlFHwPoNm9dt/T4LnmCU=
=zNuy
-----END PGP SIGNATURE-----


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:59                                                   ` Linus Torvalds
@ 2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
                                                                         ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20 20:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> ...  We're starting to see 
> git actually being able to track file content moving between files: even 
> when the files themselves didn't move (ie Junio's "git pickaxe" work could 
> do things like that).

I've reordered the git-pickaxe series I had parked in "pu" during the
1.4.3-rc cycle and merged it into "next".

The earlier one I was futzing with in "pu" had built-in
heuristics and pure mechanisms mixed together in the same patch,
which was quite bad as development history.  I think the
reordered sequence shows the logical evolution better.

  1. git-pickaxe: blame rewritten.

     This implements the infrastructure (parent traversal,
     identifying "corresponding path" in the parent -- aka
     "handling renames", passing blames to the parents and
     taking responsibility for the remainder) and uses the
     same old "single diff with parent file identifies what we
     inherited from the parent" logic git-blame uses for passing
     blames.

  2. git-pickaxe -M: blame line movements within a file.

     This adds logic to find swapped groups of lines in the same
     file.  When the file in the parent had A and B and the child
     has B and A, "single diff with parent" would find only one
     of A or B is inherited from the parent, not both.  This
     re-diffs the remainder with the parent's file to find both.

     I used to have heuristics to avoid trivial groups of lines
     from being subject to this step, but in this version they
     have been removed, so that we can see the core logic and
     need for heuristics more clearly.

     On the other hand, the version I used to have in "pu" gave
     blame to the first match.  This one tries to find the best
     match and assign the blame to it.

  3. git-pickaxe -C: blame cut-and-pasted lines.

     This adds logic to find groups of lines brought in from
     an existing file in the parent.  We scan the remainder using
     the same logic as -M detection, but it is done against
     other files in the parent.

     There was a heuristic that gave the blame to the parent
     right then and there when we find a copy-and-paste instead
     of allowing the parent to pass blame further on to its
     ancestors; again I removed this heuristic in the reordered
     series.
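
The -M step in item 2 can be illustrated with a small Python sketch. It
is not the git-pickaxe code: it uses difflib as a stand-in for the diff
machinery, treats files as lists of lines, and the function names are
hypothetical. It shows a single diff finding only one of two swapped
groups, and a second diff of the remainder finding the other.

```python
from difflib import SequenceMatcher

def inherited_lines(parent, child):
    """Child line indices that a single diff attributes to the parent."""
    sm = SequenceMatcher(None, parent, child, autojunk=False)
    found = set()
    for _, j, n in sm.get_matching_blocks():
        found.update(range(j, j + n))
    return found

def inherited_with_moves(parent, child):
    """Same, but re-diff the unexplained remainder against the parent."""
    found = inherited_lines(parent, child)
    remainder = [i for i in range(len(child)) if i not in found]
    rest = [child[i] for i in remainder]
    sm = SequenceMatcher(None, parent, rest, autojunk=False)
    for _, j, n in sm.get_matching_blocks():
        # Map positions in `rest` back to line numbers in the child.
        found.update(remainder[k] for k in range(j, j + n))
    return found

parent = ["a1", "a2", "a3", "b1", "b2"]   # group A, then group B
child = ["b1", "b2", "a1", "a2", "a3"]    # group B, then group A

single_pass = inherited_lines(parent, child)
with_moves = inherited_with_moves(parent, child)
```

The single pass explains only the longer group; the second pass picks up
the moved one. Tiny groups (a lone closing brace, say) would match just
as readily, which is where heuristics come in.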

The next logical step is to come up with a good set of
heuristics to avoid excessive nonsense matches the code
currently gives.

Groups of a small number of empty lines, lines with indentation
blanks followed by a closing brace, and '#include' lines that
include common header files occur so commonly, that without any
heuristics (which can be seen in the "next" branch today) the
algorithm would give surprisingly idiotic results.  For example:

	git -p pickaxe -C -f -n v1.4.3 -- commit.c

tells you that the first line of commit.c in v1.4.3 release,
which is '#include "cache.h"' came from the first line of
receive-pack.c which is total nonsense (this particular line
could actually be a bug in the -M or -C logic -- I need to
check).

A less "obviously wrong" but still idiotic case is that we find
ll.409-411 came from ll.94-96 of describe.c in commit 908e5310.
These three lines read as:

	409		}
	410	}
	411

While this blame assignment might be technically correct, it
does not add much value to pass blames on in such a case.

On the brighter side, we find that ll.415-419 (the beginning of
function "static int get_one_line()") originally came from
diff-tree.c (commit cee99d22, ll.275-279).


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
  2006-10-20 17:58                                                   ` David Lang
  2006-10-20 18:15                                                   ` Jon Smirl
@ 2006-10-20 20:23                                                   ` Petr Baudis
  2006-10-20 20:49                                                     ` David Lang
  2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 20:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Shawn Pearce, Aaron Bentley, Jakub Narebski, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> said that...
> So yeah, I've seen a few strange cases myself, but they've actually been 
> interesting. Like seeing how much of a file was just a copyright license, 
> and then a file being considered a "copy" just because it didn't actually 
> introduce any real new code.

Well it's certainly "interesting" and fun to see, but is it equally fun
to handle mismerges caused by a broken detection?

I've talked to some people who really didn't mind (or even liked) Git's
heuristics when it came to _inspecting_ movement of content, but were
really nervous about merge following such heuristics.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 19:46                                                     ` Linus Torvalds
@ 2006-10-20 20:29                                                       ` Aaron Bentley
  2006-10-20 20:57                                                         ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 20:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, Jan Hudec, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
>>Linus Torvalds wrote:
>>
>>>Git goes one step further: it _really_ doesn't matter about how you got to 
>>>a certain state. Absolutely _none_ of what the commits in between the 
>>>final stages and the common ancestor matter in the least. The only thing 
>>>that matters is what the states at the end-point are.
>>
>>That's interesting, because I've always thought one of the strengths of
>>file-ids was that you only had to worry about end-points, not how you
>>got there.
>>
>>How do you handle renames without looking at the history?
> 
> 
> You first handle all the non-renames that just merge on their own.
> If you were to use one hundredth of a second per file regardless of file, 
> a stupid per-file merge would take 210 seconds, which is just 
> unacceptable. So you really don't want to do that.

Agreed.  We start by comparing BASE and OTHER, so all those comparisons
are in-memory operations that don't hit disk.  Only for files where BASE
and OTHER differ do we even examine the THIS version.
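A minimal sketch of the BASE/OTHER-first comparison described above, assuming each tree is a plain mapping from path to a content id. The function name and the data layout are illustrative only; this is not bzr's actual code.

```python
def merge_trees(base, this, other):
    """Three-way merge of path -> content-id mappings.

    Returns (merged, conflicts). A path missing from a mapping means the
    file does not exist in that tree.
    """
    merged = dict(this)
    conflicts = []
    for path in sorted(set(base) | set(other)):
        b, o = base.get(path), other.get(path)
        if b == o:
            # OTHER did not touch this path, so THIS is never examined.
            continue
        t = this.get(path)
        if t == b or t == o:
            # Only OTHER changed it, or both sides made the same change.
            if o is None:
                merged.pop(path, None)
            else:
                merged[path] = o
        else:
            # Both sides changed it differently: leave THIS and report it.
            conflicts.append(path)
    return merged, conflicts
```

Paths that only THIS added or changed fall through the `b == o` test and are kept without any further comparison, which is what keeps a do-nothing merge cheap.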

We can do a do-nothing kernel merge in < 20 seconds, and that's
comparing every single file in the tree.  In Python.  I was aiming for
less than 10 seconds, but didn't quite hit it.

> So recursive basically generates the matrix of similarity for the 
> new/deleted files, and tries to match them up, and there you have your 
> renames - without ever looking at the history of how you ended up where 
> you are.

So in the simple case, you compare unmatched THIS, OTHER and BASE files
to find the renames?

>   I don't know if people appreciate how good it is to do a merge of two 
>   21000-file branches in less than a second. It didn't have any renames, 
>   and it only had a single well-defined common parent, but not only is 
>   that the common case, being that fast for the simple case is what 
>   _allows_ you to do well on the complex cases too, because it's what gets 
>   rid of all the files you should _not_ worry about ]

Well, I certainly appreciate that.  I've never worried about the speed
of text merge algorithms, because you rarely merge very many files.  The
key is making the tree merge fast.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFOTGN0F+nu1YWqI0RAii+AJ0eduC3bYya5Ao8vm1EpBb38tJP4ACeJRYe
9/D+ahDRJa87NTryc7j3C+U=
=plWA
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:17                                                     ` Junio C Hamano
@ 2006-10-20 20:40                                                       ` Jakub Narebski
  2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
  2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 20:40 UTC (permalink / raw)
  To: git

Junio C Hamano wrote:

>   2. git-pickaxe -M: blame line movements within a file.
> 
>      This adds logic to find swapped groups of lines in the same
>      file.  When the file in the parent had A and B and the child
>      has B and A, "single diff with parent" would find only one
>      of A or B is inherited from the parent, not both.  This
>      re-diffs the remainder with the parent's file to find both.
> 
>      I used to have heuristics to avoid trivial groups of lines
>      from being subject to this step, but in this version they
>      have been removed, so that we can see the core logic and
>      need for heuristics more clearly.
> 
>      On the other hand, the version I used to have in "pu" gave
>      blame to the first match.  This one tries to find the best
>      match and assign the blame to it.
> 
>   3. git-pickaxe -C: blame cut-and-pasted lines.
> 
>      This adds logic to find groups of lines brought in from
>      existing file in the parent.  We scan the remainder using
>      the same logic as -M detection, but it is done against
>      other files in the parent.
> 
>      There was a heuristic that gave the blame to the parent
>      right then and there when we find a copy-and-paste instead
>      of allowing the parent to pass blame further on to its
>      ancestors; again I removed this heuristic in the reordered
>      series.

The option names clash somewhat with -M and -C in diffcore, which
detect contents 'M'oving (renaming files) and contents 'C'opying
(copying files), whereas in git-pickaxe -C is still about code
movement, only across files (-M -M or --MM?).

Would git-pickaxe also try to detect copy-and-paste within the file,
and across files?
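The swapped-groups case quoted above (parent has A then B, child has B then A) can be sketched with Python's difflib. This is only an illustration of the "re-diff the remainder" idea, not git-pickaxe's implementation; the function names are made up for the sketch, and each leftover run of lines gets a single best match.

```python
from difflib import SequenceMatcher


def single_diff(parent, child):
    """Map child line index -> parent line index, as one diff would see it."""
    sm = SequenceMatcher(None, parent, child, autojunk=False)
    return {blk.b + k: blk.a + k
            for blk in sm.get_matching_blocks()
            for k in range(blk.size)}


def blame_with_movement(parent, child):
    """Return (first_pass, second_pass) line attributions."""
    first = single_diff(parent, child)
    result = dict(first)
    i = 0
    while i < len(child):
        if i in first:
            i += 1
            continue
        # Gather one run of child lines the first diff left unmatched.
        j = i
        while j < len(child) and j not in first:
            j += 1
        chunk = child[i:j]
        # Re-diff that run against the whole parent and keep the best match.
        sm = SequenceMatcher(None, parent, chunk, autojunk=False)
        m = sm.find_longest_match(0, len(parent), 0, len(chunk))
        for k in range(m.size):
            result[i + m.b + k] = m.a + k
        i = j
    return first, result


parent = ["a1", "a2", "b1", "b2", "b3"]   # group A, then group B
child = ["b1", "b2", "b3", "a1", "a2"]    # group B, then group A
first, second = blame_with_movement(parent, child)
```

Here the first pass attributes only the three B lines to the parent; the second pass also recovers the two A lines.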
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:23                                                   ` Petr Baudis
@ 2006-10-20 20:49                                                     ` David Lang
  2006-10-20 20:53                                                       ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: David Lang @ 2006-10-20 20:49 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Shawn Pearce, Aaron Bentley, Jakub Narebski,
	bazaar-ng, git

On Fri, 20 Oct 2006, Petr Baudis wrote:

> 
> Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
> where Linus Torvalds <torvalds@osdl.org> said that...
>> So yeah, I've seen a few strange cases myself, but they've actually been
>> interesting. Like seeing how much of a file was just a copyright license,
>> and then a file being considered a "copy" just because it didn't actually
>> introduce any real new code.
>
> Well it's certainly "interesting" and fun to see, but is it equally fun
> to handle mismerges caused by a broken detection?
>
> I've talked to some people who really didn't mind (or even liked) Git's
> heuristics when it came to _inspecting_ movement of content, but were
> really nervous about merge following such heuristics.

remember, git only stores the results. so when you are merging it doesn't even 
look for renames.

the only time you get renames is after-the-fact when you ask git for a report 
about what changed. then (if you enable rename detection) it will tell you what 
files have changed, and what files look like they may have been renames 
(possibly with changes). but if you don't ask git to look for renames it won't 
bother and you can just ignore the concept entirely.

or if you only want complete renames (as opposed to rename + change) then use 
the option to tell it that you don't want to consider it a rename unless it's 
100% the same (or 99%, or whatever satisfies you)

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 17:48                                                 ` Linus Torvalds
                                                                     ` (2 preceding siblings ...)
  2006-10-20 20:23                                                   ` Petr Baudis
@ 2006-10-20 20:53                                                   ` Shawn Pearce
  3 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-20 20:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> wrote:
> On Fri, 20 Oct 2006, Shawn Pearce wrote:
> > 
> > I renamed hundreds of small files in one shot and also did a few
> > hundred adds and deletes of other small XML files.  Git generated
> > a lot of those unrelated adds/deletes as rename/modifies, as their
> > content was very similar.  Some people involved in the project
> > freaked as the files actually had nothing in common with one
> > another... except for a lot of XML elements (as they shared the
> > same DTD).
> 
> Heh. We can probably tweak the heuristics (one of the _great_ things about 
> content detection is that you can fix it after the fact, unlike the 
> alternative).
> 
> That said, I've personally actually found the content-based similarity 
> analysis to often be quite informative, even when (and perhaps 
> _especially_ when) it ended up showing something that the actual author of 
> the thing didn't intend.
> 
> So yeah, I've seen a few strange cases myself, but they've actually been 
> interesting. Like seeing how much of a file was just a copyright license, 
> and then a file being considered a "copy" just because it didn't actually 
> introduce any real new code.

Aside from that one strange case I just mentioned, I've always seen
the strategy work very well.  It's never done something I didn't
expect, and I've never seen copies or renames that I didn't expect to
see, knowing what the author of the change did.

So even though I had a little bit of trouble with that rename
situation above I'm _very_ happy with the way Git handles renames.

And the truth is that the case above really was quite correct: XML is
very verbose.  When 70% of the file is just required XML to frame
the other 30% of the file's payload, it's not surprising that files
are considered to be similar when they only differ by a little bit
of payload.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:49                                                     ` David Lang
@ 2006-10-20 20:53                                                       ` Petr Baudis
  2006-10-20 20:55                                                         ` David Lang
  0 siblings, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 20:53 UTC (permalink / raw)
  To: David Lang; +Cc: bazaar-ng, Linus Torvalds, Shawn Pearce, git, Jakub Narebski

Dear diary, on Fri, Oct 20, 2006 at 10:49:53PM CEST, I got a letter
where David Lang <dlang@digitalinsight.com> said that...
> On Fri, 20 Oct 2006, Petr Baudis wrote:
> 
> >
> >Dear diary, on Fri, Oct 20, 2006 at 07:48:58PM CEST, I got a letter
> >where Linus Torvalds <torvalds@osdl.org> said that...
> >>So yeah, I've seen a few strange cases myself, but they've actually been
> >>interesting. Like seeing how much of a file was just a copyright license,
> >>and then a file being considered a "copy" just because it didn't actually
> >>introduce any real new code.
> >
> >Well it's certainly "interesting" and fun to see, but is it equally fun
> >to handle mismerges caused by a broken detection?
> >
> >I've talked to some people who really didn't mind (or even liked) Git's
> >heuristics when it came to _inspecting_ movement of content, but were
> >really nervous about merge following such heuristics.
> 
> remember, git only stores the results. so when you are merging it doesn't 
> even look for renames.

Of course it does look for renames; when you use the recursive strategy,
it will try to merge across renames.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:53                                                       ` Petr Baudis
@ 2006-10-20 20:55                                                         ` David Lang
  0 siblings, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-20 20:55 UTC (permalink / raw)
  To: Petr Baudis
  Cc: Linus Torvalds, Shawn Pearce, Aaron Bentley, Jakub Narebski,
	bazaar-ng, git

On Fri, 20 Oct 2006, Petr Baudis wrote:

>>> I've talked to some people who really didn't mind (or even liked) Git's
>>> heuristics when it came to _inspecting_ movement of content, but were
>>> really nervous about merge following such heuristics.
>>
>> remember, git only stores the results. so when you are merging it doesn't
>> even look for renames.
>
> Of course it does look for renames; when you use the recursive strategy,
> it will try to merge across renames.

sorry, missed that.

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:29                                                       ` Aaron Bentley
@ 2006-10-20 20:57                                                         ` Linus Torvalds
  2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 20:57 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, Jan Hudec, Git Mailing List, Jakub Narebski



On Fri, 20 Oct 2006, Aaron Bentley wrote:
> 
> Agreed.  We start by comparing BASE and OTHER, so all those comparisons
> are in-memory operations that don't hit disk.  Only for files where BASE
> and OTHER differ do we even examine the THIS version.

Git just slurps in all three trees. I actually think that the current 
merge-recursive.c does it the stupid way (ie it expands all trees 
recursively, regardless of whether it's needed or not), but I should 
really check with Dscho, since I had nothing to do with that code.

I wrote a tree-level merger that avoided doing the recursive tree reading 
when the tree-SHA1's matched entirely, and re-doing the latest merge using 
that took all of 0.037s, because it didn't recursively expand any of the 
uninteresting trees.

But the default recursive merge was ported from the python script that 
did it a full tree at a time, so it's comparatively "slow". But it's fast 
enough (witness the under-1s time ;) that I think the motivation to be 
smarter about reading the trees was basically just not there, so my 
"git-merge-tree" thing is languishing as a proof-of-concept.

So right now, git merging itself doesn't even take advantage of the "you 
can compare two whole directories in one go". We do that all over the 
place in other situations, though (it's a big reason for why doing a 
"diff" between different revisions is so fast - you can cut the problem 
space up and ignore the known-identical parts much faster).
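A rough sketch of the "compare two whole directories in one go" idea, assuming a toy in-memory tree where every node carries an id derived from its children. The helper names and the (id, children) node layout are hypothetical; git's real object format differs.

```python
import hashlib


def store(entry):
    """Turn nested dicts (directories) and bytes (files) into (id, children) nodes.

    Ids are computed once, bottom-up, so a directory's id covers everything
    beneath it. Files get children=None.
    """
    if isinstance(entry, bytes):
        return (hashlib.sha1(b"blob\0" + entry).hexdigest(), None)
    children = {name: store(sub) for name, sub in entry.items()}
    h = hashlib.sha1(b"tree\0")
    for name in sorted(children):
        h.update(f"{name}\0{children[name][0]}\n".encode())
    return (h.hexdigest(), children)


def changed_paths(old, new, path="", visited=None):
    """List differing paths, descending only where the ids differ."""
    if visited is not None:
        visited.append(path)  # record which nodes were actually looked at
    if old is not None and new is not None and old[0] == new[0]:
        return []  # identical id: the whole subtree is skipped in one compare
    old_tree = old[1] if old is not None else None
    new_tree = new[1] if new is not None else None
    if old_tree is not None and new_tree is not None:
        changed = []
        for name in sorted(set(old_tree) | set(new_tree)):
            child = f"{path}/{name}" if path else name
            changed += changed_paths(old_tree.get(name), new_tree.get(name),
                                     child, visited)
        return changed
    return [path]  # changed file, or an added/removed entry
```

An added or removed directory is reported as a single path here rather than expanded file by file, which is a simplification of the sketch.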

That tree-based data structure turned out to be wonderful. Originally (as 
in "first weeks of actual git work" in April 2005) git had a flat "file 
manifest" kind of thing, and that really sucked.  So the data structures 
are important, and I think we got those right fairly early on.

> We can do a do-nothing kernel merge in < 20 seconds, and that's
> comparing every single file in the tree.  In Python.  I was aiming for
> less than 10 seconds, but didn't quite hit it.

Well, so I know I can do that particular actual merge in 0.037 seconds 
(that's not counting the history traversal to actually find the common 
parent, which is another 0.01s or more ;), so we should be able to 
comfortably do the simple merges in less than a tenth of a second. But at 
some point, apparently nobody just cares.

Of course, this kind of thing depends a lot on developer behaviour. We had 
some performance bugs that we didn't notice simply because the kernel 
didn't show any of those patterns, but people using it for other things 
had slower merges. Sometimes you don't see the problem, just because you 
end up looking at the wrong pattern for performance.

> > So recursive basically generates the matrix of similarity for the 
> > new/deleted files, and tries to match them up, and there you have your 
> > renames - without ever looking at the history of how you ended up where 
> > you are.
> 
> So in the simple case, you compare unmatched THIS, OTHER and BASE files
> to find the renames?

Right. Some cases are easy: if one of the branches only added files (which 
is relatively common), that obviously cannot be a rename. So you don't 
even have to compare all possible combinations - you know you don't have 
renames from one branch to the other ;)
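As a hedged illustration of matching up new and deleted files by similarity, the sketch below scores every deleted/added pair and pairs them greedily, best score first. The 0.5 default threshold, the line-based difflib ratio and the greedy pairing are assumptions of the sketch, not a description of merge-recursive's internals.

```python
from difflib import SequenceMatcher


def detect_renames(old, new, threshold=0.5):
    """old and new map path -> text. Returns {old_path: new_path}."""
    deleted = sorted(p for p in old if p not in new)
    added = sorted(p for p in new if p not in old)
    scores = []
    for d in deleted:
        for a in added:
            s = SequenceMatcher(None, old[d].splitlines(),
                                new[a].splitlines(), autojunk=False).ratio()
            if s >= threshold:
                scores.append((s, d, a))
    renames, used = {}, set()
    # Highest similarity wins; each path takes part in at most one rename.
    for s, d, a in sorted(scores, key=lambda t: (-t[0], t[1], t[2])):
        if d not in renames and a not in used:
            renames[d] = a
            used.add(a)
    return renames
```

If a side only added files, `deleted` is empty and no pair is ever scored, which is the easy case mentioned above. Raising `threshold` to 1.0 accepts only exact matches.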

But I'm not even the authoritative person to explain all the details of the 
current recursive merge, and I might have missed something. Dscho? 
Fredrik? Anything you want to add?

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  1:06                                           ` Aaron Bentley
                                                               ` (3 preceding siblings ...)
  2006-10-20 14:12                                             ` Jeff King
@ 2006-10-20 21:48                                             ` Carl Worth
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 20:05                                               ` Aaron Bentley
  4 siblings, 2 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-20 21:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
> I understand your argument now.

Well, I'm glad to know we each feel like we are communicating at
times, here.

>                                  It's nothing to do with numbers per se,
> and all about per-branch namespaces.  Correct?

The entire discussion is about how to name things in a distributed
system. The premise that Linus has put forth in a very compelling way,
is that attempting to use sequential numbers for names in a
distributed system will break down. The breakdown could be that the
names are not stable, or that the system is used in a centralized way
to avoid the instability of the names.

Now, that causality might not accurately describe the way bzr has
developed. It may be that the centralization bias was determined by
other reasons, and that given those, using sequential numbers for
names makes perfect sense.

But it really is fundamental and unavoidable that sequential numbers
don't work as names in a distributed version control system.

> I meant that the active branch and a mirror of the abandoned branch
> could be stored in the same repository, for ease of access.

Granted, everything can be stored in one repository. But that still
doesn't change what I was trying to say with my example. One of the
repositories would "win" (the names it published during the fork would
still be valid). And the other repository would "lose" (the names it
published would be not valid anymore). Right?

Now, maybe there's some "simple" mapping from old names to new names
for the losing repository, (something like adding a prefix of
"losers/" to the beginning of the names or something or adding a "15."
prefix or whatever). The point is that the old names are
invalidated. And there's no way to guarantee this kind of change won't
happen in the future, (no matter how old a project is).

I constructed that example to show that the naming has a social impact
in forcing a distinction between winners and losers in the merge, (or
mainline and side branch, or whatever you want to name the
distinction). The two re-joining projects could be really amiable,
create a new virgin mainline and treat both histories as side
branches. In this version, everyone loses as all the old names are
invalidated.

> Bazaar encourages you to stick lots and lots of branches in your
> repository.  They don't even have to be related.  For example, my repo
> contains branches of bzr, bzrtools, Meld, and BazaarInspect.

Git allows this just fine. And lots of branches belonging to a single
project is definitely the common usage. It is not common (nor
encouraged) for unrelated projects to share a repository, since a git
clone will fetch every branch in the repository. It is common, though,
for a single base URL to provide a common basis for a hierarchy of git
repositories, (see, for example http://repo.or.cz/), and that may
provide similar benefits.

I'm noticing another terminology conflict here. The notion of "branch"
in bzr is obviously very different than in git. For example the bzr
man page has a sentence beginning with "if there is already a branch
at the location but it has no working tree". I'm still not sure
exactly what a bzr branch is, but it's clearly something different
from a git branch, (which is absolutely nothing more than a name
referencing a particular commit object). [Note: after playing with it
a bit more down below, a bzr "branch" appears to be something like a
git "repository" that can only hold a single branch.]

> I can see where you're coming from, but to me, the trade-off seems
> worthwhile.  Because historical data gets less and less valuable the
> older it gets.  By the time the URL for a branch goes dark, there's
> unlikely to be any reason to refer to one of its revisions at all.

I strongly disagree on this point. One, I don't think that the "time
for a branch to go dark" is necessarily long, (or if it is, then
that's another barrier that's setup against distributed
development---people have to have a long-term repository before they
can usefully start publishing a branch). Second, I'm not comfortable
with any limit on usefulness of history. Would you willingly throw
away commits, mailing list posts, or closed bug reports older than any
given age for any projects that you care about?

> When you create a new branch from scratch, the number starts at zero.
> If you copy a branch, you copy its number, too.
>
> Every time you commit, the number is incremented.  If you pull, your
> numbers are adjusted to be identical to those of the branch you pulled from.
>
> Is that really complicated?

OK. So now I had to actually try things out. I went ahead and
installed bzr and was able to init and commit from the man page. I had
to go to IRC to figure out how to create and change branches, (the
documentation for "bzr branch" just said FROM_LOCATION and TO_LOCATION
and I couldn't figure out what to pass for those).

Here's the setup I came up with for a tweaked version of the a[bc]m
diamond example I showed with git earlier, (I just added a second
commit to each branch before merging):

	mkdir bzrtest; cd bzrtest
	mkdir master; cd master; bzr init
	touch a; bzr add a; bzr commit -m "Initial commit of a"
	cd ..
	bzr branch master b; cd b
	touch b; bzr add b; bzr commit -m "Commit b on b branch"
	echo "change" > b; bzr commit -m "Change b on b branch"
	cd ..
	bzr branch master c; cd c
	touch c; bzr add c; bzr commit -m "Commit c on c branch"
	echo "change" > c; bzr commit -m "Change c on c branch"
	cd ../master
	bzr merge ../b; bzr commit -m "Merge in b"
	bzr merge ../c; bzr commit -m "Merge in c"

First, I've been told that this is a lot less efficient than possible
since I have what in bzr terms is three unshared "branches" here,
(what git would really call three separate "repositories").

Second, I think that using the filesystem for separating branches is a
really bad idea. One, it intrudes on my branch namespace, (note that
in many commands above I have to use things like "../b" where I'd like
to just name my branch "b"). Two, it prevents bzr from having any
notion of "all branches" in places where git takes advantage of it,
(such as git-clone and "gitk --all"). Three, it certainly encourages
the storage problem I ran into above, (and I'd be interested to see a
"corrected" version of the commands above to fix the storage
inefficiencies).

But anyway, those are all new topics, what we were trying to talk
about is revision numbers. After the above commands I can run bzr log
in my three branches, master, b, and c and I get the following
revision number sequences:

master: 1 2 3
b: 1 2 3
c: 1 2 3

And from this state if I ask questions with bzr missing and look at
just the revision numbers, then the answers are useless. I get answers
like:

	.../b:$ bzr missing ../c
	You have 2 extra revision(s):
	revno: 3
	  Change b on b branch
	revno: 2
	  Commit b on b branch

	You are missing 2 revision(s):
	revno: 3
	  Change c on c branch
	revno: 2
	  Commit c on c branch

	.../b:$ bzr missing ../master
	You are missing 2 revision(s):
	revno: 3
	  Merge in c
	revno: 2
	  Merge in b

So there we have the revision numbers 2 and 3 each being used to name
three different revisions. That's a lot of aliasing already.
Then, if the b and c branches each treat master as their mainline and
each pull, then both branches get their numbers all shuffled.
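The aliasing can be modelled in a few lines. This is a toy model of the example above, assuming a commit's name is a hash of its parents' names plus its message (real git ids cover more than that) and that a revno is simply a position along each branch's own mainline.

```python
import hashlib


def commit(message, *parents):
    """Name a commit by hashing its parents' names and its message."""
    data = "\n".join(parents) + "\n\n" + message
    return hashlib.sha1(data.encode()).hexdigest()


a = commit("Initial commit of a")
b1 = commit("Commit b on b branch", a)
b2 = commit("Change b on b branch", b1)
c1 = commit("Commit c on c branch", a)
c2 = commit("Change c on c branch", c1)
m1 = commit("Merge in b", a, b2)
m2 = commit("Merge in c", m1, c2)

# Per-branch sequential numbers: position along each branch's mainline.
mainline = {"master": [a, m1, m2], "b": [a, b1, b2], "c": [a, c1, c2]}
revno = {branch: {i + 1: name for i, name in enumerate(line)}
         for branch, line in mainline.items()}


def who_is(n):
    """Show which commit each branch means by revno n (abbreviated names)."""
    return {branch: numbers[n][:8] for branch, numbers in revno.items()}
```

`who_is(1)` gives the same name for all three branches, while `who_is(2)` and `who_is(3)` each give three different ones; the hash names themselves never depend on which branch is asking.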

Oh, drat. I just realized that I'm running 0.11 here which doesn't
have the dotted-decimal numbers. (I'm trying to get bzr.dev too, but
it appears to be stuck about 40% of the way through "Fetch phase
1/4".) In this version, the commits brought in as part of a merge
don't get any "simple" number at all and instead "bzr log" shows a
merge ID.

I hadn't realized that the dotted decimal notation was so new that the
community hadn't had a lot of experience with it yet. But, your
description doesn't actually presume that notation. What you asked
was:

	> When you create a new branch from scratch, the number starts at zero.
	> If you copy a branch, you copy its number, too.
	>
	> Every time you commit, the number is incremented.  If you pull, your
	> numbers are adjusted to be identical to those of the branch you pulled from.
	>
	> Is that really complicated?

And to answer. That description doesn't describe at all what happens
to the "simple" numbers of commits that are merged. In the version I
have, they disappear and get replaced with "ugly" numbers. In 0.12
something else happens instead, (that's the part I don't understand
yet).

And my argument isn't just "confusing" it's "confusing or
useless". I understand that pull destroys numbers, and how, but that
makes the numbers I had generated earlier useless. I still don't
understand how people can avoid number changing, (since pull seems the
only way to synch up without infinite new merge commits being added
back and forth).

So, yes, it really is complicated or my brain is just too small.

> > The naming in git really is beautiful and beautifully simple.
>
> Well, you've got to admit that those names are at least superficially ugly.

Sure. But I'll gladly take a simple system with superficial warts over
a complex system with superficial beauty.

> What's nice is being able see the revno 753 and knowing that "diff -r
> 752..753" will show the changes it introduced.  Checking the revno on a
> branch mirror and knowing how out-of-date it is.

With git I get to see a revision number of b62710d4 and know that
"diff b62710d4^ b62710d4" will show its changes, though much more
likely just "show b62710d4". I really cannot fathom a place where
arithmetic on revision numbers does something useful that git revision
specifications don't do just as easily. Anybody have an example for
me?

-Carl

PS. The "bzr branch" of bzr.dev did eventually finish. I can see the
dotted-decimal numbers in my example now, (1.1.1 and 1.1.2 for the
commits that came from branch b; 1.2.1 and 1.2.2 for the commits that
came from branch c). At 5 characters apiece these are well on their
way to getting just as "ugly" as git names, (once it's all
cut-and-paste the difference in ugliness is negligible).

And now, I see it's not just pull that does number rewriting. If I use
the following command (after the chunk of commands above):

	cd ..; bzr branch -r 1.2.2 master 1.2.2

It appears to just create newly linearized revision numbers from whole
cloth for the new branch (1, 2, and 3 corresponding to mainline 1,
1.2.1, and 1.2.2). That's totally surprising, very confusing, and
would invalidate any use I wanted to make of published revision
numbers for the mainline branch while I was working on this branch.

See? This stuff really doesn't work.

Motivating scenario for the above: Imagine 1.2.3 committed garbage so I
want to fix it by branching from 1.2.2 rather than the mainline
"2". Then after I branch, I learn something about "1.2.1" that I want
to investigate more closely. I try to inspect that in my branch, but
ouch! I don't have that revision.

Is there even a way to say "show me the change introduced by what is
named '1.2.1' in the source branch in this scenario" ?

Note: In #bzr I just learned that there is a way for me to do this
_if_ I also happen to have a pull of the original branch somewhere on
my machine. Something like:

	bzr diff -r1.2.0:../master -r1.2.1:../master

I don't know if there's a way to get diff's .. notation to work with
that, (I can't manage to). But these simple numbers are getting less
simple all the time.

With git, if I find a revision number somewhere, I can cut-and-paste
it and get the right thing:

	git show b62710d4f8602203d848daf2d444865b611fff09

But with bzr if I find "1.2.1" somewhere I'm likely to type:

	bzr diff -r1.2.0..1.2.1

If I'm lucky, then that fails with:

	bzr: ERROR: Requested revision: '1.2.0' does not exist in branch:

and I go back to the source, find out what branch it was referring to,
remember where that is on my machine (../master, say), and manually
type that to my command line to get:

	bzr diff -r1.2.0:../master -r1.2.1:../master

If I'm unlucky then the first diff comes up with some unrelated commit
and I get to be confused before I go through that same process.

Now do you see? It really, really does not work. This stuff is about
as un-simple as could be, and these things will happen.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
@ 2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:59                                                   ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Jeff Licquia @ 2006-10-20 22:13 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Hudec, bazaar-ng, git, Jakub Narebski

On Fri, 2006-10-20 at 11:48 -0700, Linus Torvalds wrote:
> Here's a real-life scenario that we hit several times with BK over the 
> years:
> 
>  - take a real repository, and a patch that gets discussed that adds a new 
>    file.
>  - take two different people applying that patch to their trees (or, do 
>    the equivalent thing, which is to just create the same filename
>    independently, because the solution is obvious - and the same - to 
>    both developers).
>  - now, have somebody merge both of those two people's trees (eg me)
>  - have the two people continue to use their trees, modifying it, and 
>    getting merged.
> 
> Trust me, this isn't even _unlikely_. It happens. And it's a serious 
> problem for a file-ID case. Why? Because you have two different file ID's 
> for the same pathname. 

I tried this to see what bzr would do.  Here's the critical point where
the first merges are done ("a" is mainline, "b" and "c" are external
branches being merged into "a").

---
jeff@lsblap:~/tmp/linus-file-id/a$ bzr pull ../b
All changes applied successfully.
1 revision(s) pulled.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr pull ../c
bzr: ERROR: These branches have diverged.  Use the merge command to reconcile them.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr merge ../c
Conflict adding file file2.  Moved existing file to file2.moved.
1 conflicts encountered.
jeff@lsblap:~/tmp/linus-file-id/a$ bzr status
added:
  file2
renamed:
  file2 => file2.moved
conflicts:
  Conflict adding file file2.  Moved existing file to file2.moved.
pending merges:
  Jeff Licquia 2006-10-20 commit c of file2
---

file2 and file2.moved have identical contents at this point.  I fixed it
by deleting file2.moved, "bzr resolve file2", and committing.

After this conflict is resolved, merging from b causes conflicts, while
merging from c appears to work fine.  This continues until b merges from
a (and resolves a conflict in a similar manner to a), at which time
merging/pulling works as you'd expect between the branches.  Whenever b
is marked as conflicting before it merges from a, bzr preserves b's
changes by moving b's modified file.

All in all, not ideal, but it seems bzr handles this better than bk.
Certainly, bzr doesn't silently drop anyone's changes, at least.  I
suspect that bzr could improve its handling of this use case, but not,
I'm sure, to Linus's specifications; some of the fun and games does seem
to come from the use of file IDs.
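
The core of the scenario can be sketched in a few lines of C. This is only
an illustration under made-up assumptions: `struct entry`, the file ID
strings and both comparison functions are hypothetical and do not reflect
how bzr or git store anything. It shows why two independent additions of
the same file look like two different files to an ID-keyed comparison,
while a path-and-content comparison sees one file.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified tree entry; real bzr and git structures differ. */
struct entry {
	const char *path;
	const char *file_id;	/* assigned when the file is first added */
	const char *content;
};

/* Identity as a file-ID based comparison would see it. */
static int same_by_id(const struct entry *a, const struct entry *b)
{
	assert(a != NULL && b != NULL);
	return !strcmp(a->file_id, b->file_id);
}

/* Identity as a content based comparison would see it: same path, same bytes. */
static int same_by_content(const struct entry *a, const struct entry *b)
{
	assert(a != NULL && b != NULL);
	return !strcmp(a->path, b->path) && !strcmp(a->content, b->content);
}
```

With two entries for "file2" that carry identical content but IDs assigned
independently in b and c, same_by_id() reports a mismatch, which corresponds
to the "Conflict adding file file2" seen in the transcript above.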

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 15:34                                         ` Aaron Bentley
  2006-10-20 16:21                                           ` Jakub Narebski
@ 2006-10-20 22:40                                           ` Petr Baudis
  2006-10-20 23:33                                             ` Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 22:40 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git

Dear diary, on Fri, Oct 20, 2006 at 05:34:39PM CEST, I got a letter
where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
> > Aaron Bentley wrote:
> >>In Bazaar bundles, the text of the diff is an integral part of the data.
> >> It is used to generate the text of all the files in the revision.
> > 
> > 
> > I thought that the diff was combined diff of changes.
> 
> It is.  It's a description of how to produce revision X given revision
> Y, where Y is the last-merged mainline revision.

Aha, so by default a bundle can carry just a _single_ revision?

That doesn't sound right either, because then it wouldn't make sense to
talk about "combined" or "simple" diffs. So I guess sending a bundle
really is taking n revisions at your side, bundling them to a single
diff and when the other side takes it, it will result in a single
revision? That is basically what our merge --squash does.

Hmm, but that doesn't sound right either, that's certainly no revolting
functionality and seems to be in contradiction with previous bundles
description. But if it doesn't squash the changes, I don't see how the
combined diff can be integral part of the data. Sorry, I don't get it.

> The bundle format can also support sending a single bundle that
> displays the series of patches, though there's currently no UI to select
> this.
..snip..
> > I was under an impression that user sees only mega-patch of all the
> > revisions in bundle together, and rest is for machine consumption only.
> 
> All of it is for machine consumption.  The MIME-encoded sections are a
> series of patches.  They're usually MIME-encoded to avoid confusion with
> the overview patch, but this is optional.
> 
> I've attached an example of what a combined patch-by-patch bundle looks
> like.

But that's the one there's no UI to select? Or where is the combined
diff?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring
  2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
@ 2006-10-20 22:41                                                       ` Junio C Hamano
  2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
  2 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20 22:41 UTC (permalink / raw)
  To: git

Instead of comparing number of lines matched, look at the
matched characters and count alnums, so that we do not pass
blame on not-so-interesting lines, such as empty lines and lines
that are indentation with closing brace.

Add an option --score-debug to show the score of each
blame_entry while we cook this further on the "next" branch.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This comes on top of "next".  The next one makes output from
   "pickaxe -C commit" actually make sense.

 builtin-pickaxe.c |   71 +++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 48 insertions(+), 23 deletions(-)

diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index 74c7c9a..3c73d82 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -34,8 +34,7 @@ static int longest_file;
 static int longest_author;
 static int max_orig_digits;
 static int max_digits;
-
-#define DEBUG 0
+static int max_score_digits;
 
 #define PICKAXE_BLAME_MOVE		01
 #define PICKAXE_BLAME_COPY		02
@@ -78,6 +77,11 @@ struct blame_entry {
 	 * suspect's file; internally all line numbers are 0 based.
 	 */
 	int s_lno;
+
+	/* how significant this entry is -- cached to avoid
+	 * scanning the lines over and over
+	 */
+	unsigned score;
 };
 
 struct scoreboard {
@@ -215,9 +219,6 @@ static void process_u_diff(void *state_,
 	struct chunk *chunk;
 	int off1, off2, len1, len2, num;
 
-	if (DEBUG)
-		fprintf(stderr, "%.*s", (int) len, line);
-
 	num = state->ret->num;
 	if (len < 4 || line[0] != '@' || line[1] != '@') {
 		if (state->hunk_in_pre_context && line[0] == ' ')
@@ -295,10 +296,6 @@ static struct patch *get_patch(struct or
 	char *blob_p, *blob_o;
 	struct patch *patch;
 
-	if (DEBUG) fprintf(stderr, "get patch %.8s %.8s\n",
-			   sha1_to_hex(parent->commit->object.sha1),
-			   sha1_to_hex(origin->commit->object.sha1));
-
 	blob_p = read_sha1_file(parent->blob_sha1, type,
 				(unsigned long *) &file_p.size);
 	blob_o = read_sha1_file(origin->blob_sha1, type,
@@ -352,6 +349,7 @@ static void dup_entry(struct blame_entry
 	memcpy(dst, src, sizeof(*src));
 	dst->prev = p;
 	dst->next = n;
+	dst->score = 0;
 }
 
 static const char *nth_line(struct scoreboard *sb, int lno)
@@ -448,7 +446,7 @@ static void split_blame(struct scoreboar
 		add_blame_entry(sb, new_entry);
 	}
 
-	if (DEBUG) {
+	if (1) { /* sanity */
 		struct blame_entry *ent;
 		int lno = 0, corrupt = 0;
 
@@ -530,12 +528,6 @@ static int pass_blame_to_parent(struct s
 	for (i = 0; i < patch->num; i++) {
 		struct chunk *chunk = &patch->chunks[i];
 
-		if (DEBUG)
-			fprintf(stderr,
-				"plno = %d, tlno = %d, "
-				"same as parent up to %d, resync %d and %d\n",
-				plno, tlno,
-				chunk->same, chunk->p_next, chunk->t_next);
 		blame_chunk(sb, tlno, plno, chunk->same, target, parent);
 		plno = chunk->p_next;
 		tlno = chunk->t_next;
@@ -547,14 +539,37 @@ static int pass_blame_to_parent(struct s
 	return 0;
 }
 
-static void copy_split_if_better(struct blame_entry best_so_far[3],
+static unsigned ent_score(struct scoreboard *sb, struct blame_entry *e)
+{
+	unsigned score;
+	const char *cp, *ep;
+
+	if (e->score)
+		return e->score;
+
+	score = 0;
+	cp = nth_line(sb, e->lno);
+	ep = nth_line(sb, e->lno + e->num_lines);
+	while (cp < ep) {
+		unsigned ch = *((unsigned char *)cp);
+		if (isalnum(ch))
+			score++;
+		cp++;
+	}
+	e->score = score;
+	return score;
+}
+
+static void copy_split_if_better(struct scoreboard *sb,
+				 struct blame_entry best_so_far[3],
 				 struct blame_entry this[3])
 {
 	if (!this[1].suspect)
 		return;
-	if (best_so_far[1].suspect &&
-	    (this[1].num_lines < best_so_far[1].num_lines))
-		return;
+	if (best_so_far[1].suspect) {
+		if (ent_score(sb, &this[1]) < ent_score(sb, &best_so_far[1]))
+			return;
+	}
 	memcpy(best_so_far, this, sizeof(struct blame_entry [3]));
 }
 
@@ -596,7 +611,7 @@ static void find_copy_in_blob(struct sco
 				      tlno + ent->s_lno, plno,
 				      chunk->same + ent->s_lno,
 				      parent);
-			copy_split_if_better(split, this);
+			copy_split_if_better(sb, split, this);
 		}
 		plno = chunk->p_next;
 		tlno = chunk->t_next;
@@ -699,7 +714,7 @@ static int find_copy_in_parent(struct sc
 				continue;
 			}
 			find_copy_in_blob(sb, ent, norigin, this, &file_p);
-			copy_split_if_better(split, this);
+			copy_split_if_better(sb, split, this);
 		}
 		if (split[1].suspect)
 			split_blame(sb, split, ent);
@@ -944,6 +959,7 @@ #define OUTPUT_RAW_TIMESTAMP	004
 #define OUTPUT_PORCELAIN	010
 #define OUTPUT_SHOW_NAME	020
 #define OUTPUT_SHOW_NUMBER	040
+#define OUTPUT_SHOW_SCORE      0100
 
 static void emit_porcelain(struct scoreboard *sb, struct blame_entry *ent)
 {
@@ -1016,6 +1032,8 @@ static void emit_other(struct scoreboard
 					   show_raw_time),
 			       ent->lno + 1 + cnt);
 		else {
+			if (opt & OUTPUT_SHOW_SCORE)
+				printf(" %*d", max_score_digits, ent->score);
 			if (opt & OUTPUT_SHOW_NAME)
 				printf(" %-*.*s", longest_file, longest_file,
 				       suspect->path);
@@ -1060,8 +1078,9 @@ static void output(struct scoreboard *sb
 	for (ent = sb->ent; ent; ent = ent->next) {
 		if (option & OUTPUT_PORCELAIN)
 			emit_porcelain(sb, ent);
-		else
+		else {
 			emit_other(sb, ent, option);
+		}
 	}
 }
 
@@ -1118,6 +1137,7 @@ static void find_alignment(struct scoreb
 {
 	int longest_src_lines = 0;
 	int longest_dst_lines = 0;
+	unsigned largest_score = 0;
 	struct blame_entry *e;
 
 	for (e = sb->ent; e; e = e->next) {
@@ -1143,9 +1163,12 @@ static void find_alignment(struct scoreb
 		num = e->lno + e->num_lines;
 		if (longest_dst_lines < num)
 			longest_dst_lines = num;
+		if (largest_score < ent_score(sb, e))
+			largest_score = ent_score(sb, e);
 	}
 	max_orig_digits = lineno_width(longest_src_lines);
 	max_digits = lineno_width(longest_dst_lines);
+	max_score_digits = lineno_width(largest_score);
 }
 
 static int has_path_in_work_tree(const char *path)
@@ -1206,6 +1229,8 @@ int cmd_pickaxe(int argc, const char **a
 				tmp = top; top = bottom; bottom = tmp;
 			}
 		}
+		else if (!strcmp("--score-debug", arg))
+			output_option |= OUTPUT_SHOW_SCORE;
 		else if (!strcmp("-f", arg) ||
 			 !strcmp("--show-name", arg))
 			output_option |= OUTPUT_SHOW_NAME;
-- 
1.4.3.ge193
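
For readers who want to try the scoring idea outside of git, here is a
minimal standalone sketch. The name `alnum_score` and its (pointer, length)
interface are assumptions made for illustration; the patch's own function is
ent_score(), which works on a struct blame_entry and caches its result.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of builtin-pickaxe.c: score a text
 * range by the number of alphanumeric bytes it contains.
 */
static unsigned alnum_score(const char *cp, size_t len)
{
	const char *ep;
	unsigned score = 0;

	assert(cp != NULL);
	ep = cp + len;
	while (cp < ep) {
		/* cast so that isalnum() never sees a negative value */
		if (isalnum((unsigned char)*cp))
			score++;
		cp++;
	}
	return score;
}
```

Under this measure a chunk made only of blank lines and a closing brace
scores 0, while a single line such as "\tint s_lno;\n" scores 7, so the
declaration is preferred as the "best match" even though it is shorter in
lines.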

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks
  2006-10-20 20:17                                                     ` Junio C Hamano
  2006-10-20 20:40                                                       ` Jakub Narebski
  2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
@ 2006-10-20 22:41                                                       ` Junio C Hamano
  2 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20 22:41 UTC (permalink / raw)
  To: git

This adds scoring logic to blame_entry to prevent blames on very
trivial chunks (e.g. lots of empty lines, indent followed by a
closing brace) from being passed down to unrelated lines in the
parent.

The current heuristics are quite simple and may need to be
tweaked later, but we need to start from somewhere.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 builtin-pickaxe.c |   36 ++++++++++++++++++++++++++++++++----
 1 files changed, 32 insertions(+), 4 deletions(-)

diff --git a/builtin-pickaxe.c b/builtin-pickaxe.c
index 3c73d82..49673a5 100644
--- a/builtin-pickaxe.c
+++ b/builtin-pickaxe.c
@@ -40,6 +40,15 @@ #define PICKAXE_BLAME_MOVE		01
 #define PICKAXE_BLAME_COPY		02
 #define PICKAXE_BLAME_COPY_HARDER	04
 
+/*
+ * blame for a blame_entry with score lower than these thresholds
+ * is not passed to the parent using move/copy logic.
+ */
+static unsigned blame_move_score;
+static unsigned blame_copy_score;
+#define BLAME_DEFAULT_MOVE_SCORE	20
+#define BLAME_DEFAULT_COPY_SCORE	40
+
 /* bits #0..7 in revision.h, #8..11 used for merge_bases() in commit.c */
 #define METAINFO_SHOWN		(1u<<12)
 #define MORE_THAN_ONE_PATH	(1u<<13)
@@ -645,7 +654,8 @@ static int find_move_in_parent(struct sc
 		if (ent->suspect != target || ent->guilty)
 			continue;
 		find_copy_in_blob(sb, ent, parent, split, &file_p);
-		if (split[1].suspect)
+		if (split[1].suspect &&
+		    blame_move_score < ent_score(sb, &split[1]))
 			split_blame(sb, split, ent);
 	}
 	free(blob_p);
@@ -716,7 +726,8 @@ static int find_copy_in_parent(struct sc
 			find_copy_in_blob(sb, ent, norigin, this, &file_p);
 			copy_split_if_better(sb, split, this);
 		}
-		if (split[1].suspect)
+		if (split[1].suspect &&
+		    blame_copy_score < ent_score(sb, &split[1]))
 			split_blame(sb, split, ent);
 	}
 	diff_flush(&diff_opts);
@@ -1177,6 +1188,15 @@ static int has_path_in_work_tree(const c
 	return !lstat(path, &st);
 }
 
+static unsigned parse_score(const char *arg)
+{
+	char *end;
+	unsigned long score = strtoul(arg, &end, 10);
+	if (*end)
+		return 0;
+	return score;
+}
+
 int cmd_pickaxe(int argc, const char **argv, const char *prefix)
 {
 	struct rev_info revs;
@@ -1206,12 +1226,15 @@ int cmd_pickaxe(int argc, const char **a
 			output_option |= OUTPUT_LONG_OBJECT_NAME;
 		else if (!strcmp("-S", arg) && ++i < argc)
 			revs_file = argv[i];
-		else if (!strcmp("-M", arg))
+		else if (!strncmp("-M", arg, 2)) {
 			opt |= PICKAXE_BLAME_MOVE;
-		else if (!strcmp("-C", arg)) {
+			blame_move_score = parse_score(arg+2);
+		}
+		else if (!strncmp("-C", arg, 2)) {
 			if (opt & PICKAXE_BLAME_COPY)
 				opt |= PICKAXE_BLAME_COPY_HARDER;
 			opt |= PICKAXE_BLAME_COPY | PICKAXE_BLAME_MOVE;
+			blame_copy_score = parse_score(arg+2);
 		}
 		else if (!strcmp("-L", arg) && ++i < argc) {
 			char *term;
@@ -1249,6 +1272,11 @@ int cmd_pickaxe(int argc, const char **a
 			argv[unk++] = arg;
 	}
 
+	if (!blame_move_score)
+		blame_move_score = BLAME_DEFAULT_MOVE_SCORE;
+	if (!blame_copy_score)
+		blame_copy_score = BLAME_DEFAULT_COPY_SCORE;
+
 	/* We have collected options unknown to us in argv[1..unk]
 	 * which are to be passed to revision machinery if we are
 	 * going to do the "bottom" procesing.
-- 
1.4.3.ge193
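
The option handling in this patch can be exercised in isolation with the
sketch below. parse_score() follows the patch; `move_threshold` and
`passes_threshold` are hypothetical helpers added only to make the
behaviour testable, and they fold together steps that cmd_pickaxe()
performs in separate places.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BLAME_DEFAULT_MOVE_SCORE 20

/* Same logic as parse_score() in the patch: trailing junk yields 0. */
static unsigned parse_score(const char *arg)
{
	char *end;
	unsigned long score = strtoul(arg, &end, 10);
	if (*end)
		return 0;
	return score;
}

/*
 * Hypothetical helper: the move threshold that results from an option
 * string such as "-M" or "-M30"; a parsed 0 falls back to the default.
 */
static unsigned move_threshold(const char *opt)
{
	unsigned score;

	assert(opt != NULL && !strncmp(opt, "-M", 2));
	score = parse_score(opt + 2);
	return score ? score : BLAME_DEFAULT_MOVE_SCORE;
}

/* Hypothetical helper: blame moves only if the score exceeds the threshold. */
static int passes_threshold(unsigned threshold, unsigned score)
{
	return threshold < score;
}
```

Two consequences of the code as posted are worth noting: "-M0" cannot be
used to switch the threshold off, because a parsed 0 is indistinguishable
from "not given", and a chunk whose score equals the threshold is still
rejected since the comparison is strict.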

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 14:59                                                   ` James Henstridge
@ 2006-10-20 22:50                                                     ` Jakub Narebski
  2006-10-20 22:58                                                       ` Petr Baudis
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 22:50 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On 20-10-2006, James Henstridge wrote:
> On 20/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> James Henstridge wrote:

>>> With the above layout, I would just type:
>>>     bzr branch http://server/repo/branch1
>>
>> With Cogito (you can think of it either as alternate Git UI, or as SCM
>> built on top of Git) you would use
>>
>>    $ cg clone http://server/repo#branch
>>
>> for example
>>
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git#next
>>
>> to clone _single_ branch (in bzr terminology, "heavy checkout" of branch).
> 
> My understanding of git is that this would be equivalent to the "bzr
> branch" command.  A checkout (heavy or lightweight) has the property
> that commits are made to the original branch.

Not exactly (my mistake in explaining it). "cg clone git://host/repo#branch"
clones only the part of the commit history DAG reachable from the given
branch. It is still a full repository: you can add branches to it later with
cg-branch-add and fetch changes with cg-fetch.

>> But you can also clone _whole_ repository, _all_ published branches with
>>
>>    $ cg clone git://git.kernel.org/pub/scm/git/git.git
> 
> I suppose that'd be useful if you want a copy of all the branches at
> once.  There is no builtin command in Bazaar to do that at present.

That is _very_ useful, and it is the default in Git. For example,
with the git.git repository I'm interested both in the 'master'
branch (main line of development) and in the 'next' branch
(development branch). Say I send some patches based on 'master' and
they get accepted, but into 'next' (to cook for a while); if I then
want to do further work in this direction, I have to base my new
work on the 'next' branch.

It looks like Bazaar-NG "branches" are the equivalent of a
one-branch clone in Git.

And if there is no command to clone a whole repository, how do
you set up a public repository?

See below.

[...] 
> Two points:
> (1) if we are publishing branches, we wouldn't include working trees
> -- they are not needed to pull or merge from such a branch.

Same with Git. Public repositories are usually "bare" clones, i.e.
without a working directory. We can clone/fetch from a "clothed" repo
without problems; we just have to point to its .git directory.

> (2) if we did have working trees, they'd be rooted at /repo/branch1
> and /repo/branch2 -- not at /repo (since /repo is not a branch).

That explains it.

> In case (2) there is a potential for conflicts if you nest branches,
> but people don't generally trigger this problem with the way they use
> Bazaar.

There is no problem in Git with having a git repository nested within
a working area: of course you had better ignore its .git directory; you
can ignore the files in this embedded repository or not.

[...]
>> How checked out working area looks like in Bazaar-NG?
> 
> The layout of a standalone branch would be:
>   .bzr/repository/ -- storage of trees and metadata
>   .bzr/branch/ -- branch metadata (e.g. pointer to the head revision)
>   .bzr/checkout/ -- working tree book-keeping files
>   source code

A git repository (the result of git clone, which is the equivalent of
bzr branch) has the following layout:
  .git/objects/ -- repository objects database
  .git/refs/ -- heads (branches) and tags
  .git/index -- staging area for commit (adding files, merge resolving)
  .git/HEAD -- which branch is current branch
  source code

> If we use a shared repository, the contained branches would lack the
> .bzr/repository/ directory.  The parent directory would instead have a
> .bzr/repository/, but usually wouldn't have .bzr/branch/ (unless there
> is a branch rooted at the base of the repository).

The equivalent of a shared repository would be making .git/objects/
a symlink to some directory which serves as a common area
for storing the object database.

You can also use an alternates file: .git/objects/info/alternates can hold
a list of absolute pathnames (one per line) where objects can be found
instead. If I understand correctly, new objects get committed to the
current repository's object database; therefore, to get the equivalent of
symlinking the .git/objects directory, every repository that is to share
the object database would have to list all the repositories except itself
in its alternates file.
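
To make the "all repositories except self" rule concrete, here is a small
sketch that builds the text of one repository's alternates file. The
function `format_alternates` and the /srv/... paths used with it are
hypothetical; the only thing taken from the description above is the file
format, one absolute object-directory path per line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build the text of .git/objects/info/alternates
 * for repository number `self`, listing the object directory of every
 * other repository, one absolute path per line.  Returns the number of
 * bytes written (excluding the NUL), or -1 if buf is too small.
 */
static int format_alternates(char *buf, size_t size,
			     const char *const *objdirs, size_t n,
			     size_t self)
{
	size_t i, used = 0;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < n; i++) {
		int len;

		if (i == self)
			continue;	/* a repository never lists itself */
		len = snprintf(buf + used, size - used, "%s\n", objdirs[i]);
		if (len < 0 || (size_t)len >= size - used)
			return -1;
		used += (size_t)len;
	}
	return (int)used;
}
```

For three repositories a, b and c, the file generated for b would contain
the object directories of a and c only, and likewise for the other two, so
that an object written by any one of them can be found by all.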

Or you can use the GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable.

Repository using any kind of alternates mechanism is not suitable
to publish using "dumb" (non-git-aware) transports.

> if we are publishing a branch to a web server, we'd skip the working
> tree, so the source code and .bzr/checkout/ directory would be
> missing.

For a "bare" clone only the source files would be missing. Well, perhaps
also '.git/index', but I'm not sure.

> In the case of a checkout, the .bzr/branch/ directory has a special
> format and acts as a pointer to the original branch.  If the checkout
> is lightweight, the .bzr/repository/ directory would be missing, and
> bzr would need to contact the original branch for the data.

There is no equivalent of bzr "checkout" in Git (and could you please use
another name for it, like "lazy branch"?). There was some talk
about how to do a "lazy clone"/"remote alternates" in Git, but no consensus
was reached on how to do this efficiently for both "dumb"
(http, https, ftp, rsync) transports and git-aware (local, git, ssh+git)
transports. From what I've read, Bazaar-NG doesn't attempt the "efficient"
part...

[...]
>> Yes, but using Git that way has serious disadvantages. For example
>> there is only one current branch pointer and only one index (dircache)
>> per git repository.
> 
> Okay.  So using Bazaar terminology, this seems to be an issue of the
> working tree being associated with the repository rather than the
> branch?
 
From the point of view of Git users, there is (in Bazaar-NG) an issue
of working tree being associated with the individual branch rather than
repository.

In git to work on some project you clone its repository; in bzr to
work on some project you get one of its branches.


IMVHO if "Cheap Branching Anywhere" were changed to "Lightweight Branches",
Bazaar-NG would have to put "Partial" there. Unless you set up
your branches to share data, branches are not cheap (in the sense of
disk space). That's probably the reason for the _need_ for "checkouts".
Bazaar-NG doesn't encourage using temporary branches with a
lifespan of no more than a day. Can you even switch between branches
using only one working area, and can you do it fast?

It looks somewhat like bzr started without permanent branches, and
they were added later (sharing repository data). But I might be mistaken.

P.S. What Git lacks, at least for now, is a way to generate a diff between
two different local repositories, but you can always set up an alternates
file and fetch the other repository into some tag.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 22:50                                                     ` Jakub Narebski
@ 2006-10-20 22:58                                                       ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 22:58 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	Carl Worth, git

Dear diary, on Sat, Oct 21, 2006 at 12:50:31AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> P.S. what Git lacks at least now is a way to generate diff between
> two different local repositories, but you can always setup alternates
> file and fetch the other repository into some tag.

It's not exactly convenient, but you can do

	xpasky@machine[0:0]~/git$ GIT_ALTERNATE_OBJECT_DIRECTORIES=../cogito/.git/objects cg-diff -r `GIT_DIR=../cogito/.git cg-object-id -c HEAD`..HEAD

I don't personally think it's worth a special UI, but there're no
boundaries for initiative... :-)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:12                                             ` Jan Hudec
                                                                 ` (3 preceding siblings ...)
  2006-10-20 19:14                                               ` Jakub Narebski
@ 2006-10-20 22:59                                               ` Jeff King
  2006-10-21 17:40                                                 ` Jan Hudec
  4 siblings, 1 reply; 806+ messages in thread
From: Jeff King @ 2006-10-20 22:59 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git, Jakub Narebski

On Fri, Oct 20, 2006 at 08:12:10PM +0200, Jan Hudec wrote:

> At this point, I expect the tree to look like this:
> A$ ls -R
> .:
> data/
> data:
> hello.txt
> A$ cat data/hello.txt
> Hello World!

Git does what you expect here.

> A$ VCT mv data greetings
> A$ VCT commit -m "Renamed the data directory to greetings"
> B$ echo "Goodbye World!" > data/goodbye.txt
> B$ VCT add data/goodbye.txt
> B$ VCT commit -m "Added goodbye message."
> A$ VCT merge B
> 
> And now I expect to have tree looking like this:
> 
> A$ ls -R
> .:
> greetings/
> greetings:
> hello.txt
> goodbye.txt

Git does not do what you expect here. It notes that files moved, but it
does not have a concept of directories moving.  Git could, even without
file-ids or special patch types, figure out what happened by noting that
every file in data/ was renamed to its analogue in greetings/, and infer
that previously nonexistent files in data/ should also be moved to
greetings/.

However, I'm not sure that I personally would prefer that behavior. In
some cases you might actually WANT data/goodbye.txt, and in some other
cases a conflict might be more appropriate. In any case, I would rather
the SCM do the simple and predictable thing (which I consider to be
creating data/goodbye.txt) rather than be clever and wrong (even if it's
only wrong a small percentage of the time).

In short, git doesn't do what you expect, but I'm not convinced that
it's a bug or lack of feature, and not simply a difference in desired
behavior.
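
The inference mentioned above ("every file in data/ was renamed to its
analogue in greetings/") can be sketched as a standalone check. Nothing
here is git code: `struct rename_pair`, `under_dir` and `dir_renamed_to`
are hypothetical names, and the input is assumed to be the list of file
renames already detected on one side of the merge.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one detected file rename. */
struct rename_pair {
	const char *from;
	const char *to;
};

/* Nonzero if path lies under dir, e.g. "data/x" is under "data". */
static int under_dir(const char *path, const char *dir)
{
	size_t n = strlen(dir);

	return !strncmp(path, dir, n) && path[n] == '/';
}

/*
 * Hypothetical check: did every rename whose source is under old_dir
 * go to the same relative name under new_dir?  Returns 1 only if at
 * least one such rename exists and all of them agree.
 */
static int dir_renamed_to(const struct rename_pair *r, size_t nr,
			  const char *old_dir, const char *new_dir)
{
	size_t i, seen = 0;

	assert(old_dir != NULL && new_dir != NULL);
	for (i = 0; i < nr; i++) {
		if (!under_dir(r[i].from, old_dir))
			continue;
		if (!under_dir(r[i].to, new_dir))
			return 0;
		if (strcmp(r[i].from + strlen(old_dir),
			   r[i].to + strlen(new_dir)))
			return 0;
		seen++;
	}
	return seen > 0;
}
```

The sketch looks only at the rename list, so a caller would still have to
confirm that no file was left behind in the old directory. Even when the
check succeeds, whether data/goodbye.txt should follow to greetings/ remains
the policy question discussed above.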

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:13                                                 ` Jeff Licquia
@ 2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:15                                                     ` Robert Collins
  2006-10-20 23:24                                                     ` Jakub Narebski
  2006-10-20 23:59                                                   ` Linus Torvalds
  1 sibling, 2 replies; 806+ messages in thread
From: Robert Collins @ 2006-10-20 23:05 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2583 bytes --]

On Fri, 2006-10-20 at 18:13 -0400, Jeff Licquia wrote:
> 
> All in all, not ideal, but it seems bzr handles this better than bk.
> Certainly, bzr doesn't silently drop anyone's changes, at least.  I
> suspect that bzr could improve its handling of this use case, but not,
> I'm sure, to Linus's specifications; some of the fun and games does
> seem to come from the use of file IDs. 

We have a few features we're focusing on right now, but coming shortly
after them we hope to address parallel imports [which this is a case of]
better than we do now. I have a number of ideas, and I'm sure other devs
do too, about the right way to solve this. Fundamentally, I think using
1-1 mapped path ids [which can be considered a memo of the origin commit
id + path] of a path is not sufficiently rich a representation of what
happens to paths - there is a dual that you can convert to, which is
identity via ancestry traversal - each path has N <= M parent paths in
each of M parent revisions. Our current path ids can only represent the
case where when you traverse to the start of history this graph has a
single tail (that is, that a single file must start at one and only one
place). The graph however is not intrinsically limited in this way -
files can split and join, and we should be able to represent this more
fully.

I'll happily acknowledge that we don't need fileids per se: tracking
renames can be done without a memo of the origin.

However, I'm still convinced that tracking the user intention of renames
leads to a slicker system than renames via inference. My off the cuff
list of corner cases is:

 - change file, rename: rename the changed file/change the renamed file.
 - change file, remove: conflict on removal/text change
 - add path to dir, rename the dir: move the current contents of the
directory/add the new path to the renamed directory.
 - move paths out of a directory, rename the directory: leave the paths
moved out where they were moved to/move the paths from wherever their
new location is.
 - introduce path A + rename old A to B , change path A: change path
B/rename A to B and introduce the new A.

All these cases work roughly along the form of 'have two branches, do
one action in one, one in the other: merge other to one/merge one to
other'. I haven't yet seen an inference system get all these right.

There are other, more complex cases, but I think they all boil down to
one of those primitives to all intents and purposes.

Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:05                                                   ` Robert Collins
@ 2006-10-20 23:15                                                     ` Robert Collins
  2006-10-20 23:39                                                       ` Jeff Licquia
  2006-10-20 23:24                                                     ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Robert Collins @ 2006-10-20 23:15 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 845 bytes --]

On Sat, 2006-10-21 at 09:05 +1000, Robert Collins wrote:
> On Fri, 2006-10-20 at 18:13 -0400, Jeff Licquia wrote:
> > 
> > All in all, not ideal, but it seems bzr handles this better than bk.
> > Certainly, bzr doesn't silently drop anyone's changes, at least.  I
> > suspect that bzr could improve its handling of this use case, but not,
> > I'm sure, to Linus's specifications; some of the fun and games does
> > seem to come from the use of file IDs. 
...
> However, I'm still convinced that tracking the user intention of renames
> leads to a slicker system than renames via inference. My off the cuff
> list of corner cases is:

I meant to add, that I think inference is a great tool to use as an
adjunct to whatever explicit data one can capture.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 191 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 11:50                                           ` Jakub Narebski
  2006-10-20 13:26                                             ` Jakub Narebski
@ 2006-10-20 23:19                                             ` Junio C Hamano
  2006-10-21  0:07                                               ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-20 23:19 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

>> The lack of parents ordering in Git is directly connected with
>> fast-forwarding.
>
> There are exactly _two_ places where Git treats first parent specially 
> (correct me if I'm wrong).

I am not bold enough to say _exactly_ N places, but you missed
at least one more important one.  Merge simplification favors
the earlier parents over later ones.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:05                                                   ` Robert Collins
  2006-10-20 23:15                                                     ` Robert Collins
@ 2006-10-20 23:24                                                     ` Jakub Narebski
  2006-10-20 23:28                                                       ` Petr Baudis
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-20 23:24 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Robert Collins wrote:

> However, I'm still convinced that tracking the user intention of renames
> leads to a slicker system than renames via inference.

Well, there was an (abandoned for now) idea of rr2-cache, a cache of how
renames were resolved during merge conflict resolution.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:24                                                     ` Jakub Narebski
@ 2006-10-20 23:28                                                       ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-20 23:28 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Dear diary, on Sat, Oct 21, 2006 at 01:24:51AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Robert Collins wrote:
> 
> > However, I'm still convinced that tracking the user intention of renames
> > leads to a slicker system than renames via inference.
> 
> Well, there was an (abandoned for now) idea of rr2-cache, a cache of how
> renames were resolved during merge conflict resolution.

Is that really relevant? It seems rather like rerere, which is
handy, but only if you are the one who is actually supposed to have a clue
about how it should be resolved; the caches aren't replicated on clones.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:40                                           ` Petr Baudis
@ 2006-10-20 23:33                                             ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-20 23:33 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2835 bytes --]

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Petr Baudis wrote:
> Dear diary, on Fri, Oct 20, 2006 at 05:34:39PM CEST, I got a letter
> where Aaron Bentley <aaron.bentley@utoronto.ca> said that...
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Jakub Narebski wrote:
>>> Aaron Bentley wrote:
>>>> In Bazaar bundles, the text of the diff is an integral part of the data.
>>>> It is used to generate the text of all the files in the revision.
>>>
>>> I thought that the diff was combined diff of changes.
>> It is.  It's a description of how to produce revision X given revision
>> Y, where Y is the last-merged mainline revision.
> 
> Aha, so by default a bundle can carry just a _single_ revision?

No, bundles contain 1 or more revisions.  They contain all the ancestors
of X that are not ancestors of Y.

Only the diff from X to Y is shown, but the diffs for all other
revisions are present in the MIME-encoded section.

Consider these four revisions in a straight-line ancestry: a, b, c, d.
'a' is a common ancestor.  b, c and d are the revisions that are missing
from the target repository.

A default bundle will contain

metadata for d
diff from a -> d in plaintext
metadata for c
diff from b -> c in MIME encoding
metadata for b
diff from a -> b in MIME encoding

To install b, the diff for a->b is applied to a.  To install c, the diff
for b->c is applied to b.  To install d, the diff for a -> d is applied
to a.

Doing a diff from a -> d instead of from c -> d introduces some
redundancy, of course.  But we do that because we want an overview diff.
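
The install order described above can be sketched in Python. This is
only an illustration under simplifying assumptions, not Bazaar's code: a
revision is modelled as a dict of path to content, a diff as a dict of
changed paths (None meaning "deleted"), and `apply_diff` and
`install_bundle` are hypothetical names.

```python
def apply_diff(tree, diff):
    """Return a new tree with the changes in diff applied to tree."""
    new_tree = dict(tree)
    for path, content in diff.items():
        if content is None:
            new_tree.pop(path, None)   # path was deleted
        else:
            new_tree[path] = content   # path was added or modified
    return new_tree


def install_bundle(repo, bundle):
    """Install every revision of a bundle into repo (id -> tree).

    bundle lists (revision_id, base_id, diff) newest first, as in the
    description above, so it is walked in reverse to install oldest first.
    """
    for revision_id, base_id, diff in reversed(bundle):
        repo[revision_id] = apply_diff(repo[base_id], diff)
    return repo


repo = {"a": {"world": "hello\n"}}
bundle = [
    # Overview diff a -> d: shown in plaintext, and used to build d.
    ("d", "a", {"world": "Hello, world\n", "README": "docs\n"}),
    ("c", "b", {"world": "Hello, world\n"}),   # b -> c
    ("b", "a", {"world": "Hello\n"}),          # a -> b
]
install_bundle(repo, bundle)
```

Note that d is rebuilt from a, not from c, which is exactly the
redundancy that buys the overview diff.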

> That doesn't sound right either, because then it wouldn't make sense to
> talk about "combined" or "simple" diffs. So I guess sending a bundle
> really is taking n revisions at your side, bundling them to a single
> diff and when the other side takes it, it will result in a single
> revision?

No, it copies the revisions verbatim, and we are careful to avoid data loss.

> Hmm, but that doesn't sound right either, that's certainly no revolutionary
> functionality and seems to be in contradiction with previous bundles
> description. But if it doesn't squash the changes, I don't see how the
> combined diff can be integral part of the data. Sorry, I don't get it.

It's because there's no other diff in the bundle that produces 'd'.

>> I've attached an example of what a combined patch-by-patch bundle looks
>> like.
> 
> But that's the one there's no UI to select? Or where is the combined
> diff?

That is the one that doesn't have UI to select it.  I've attached a
normal bundle for comparison.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOVzR0F+nu1YWqI0RAkACAJ4z2SJZgelZLfhoFKhEZbmvRIXMjACfag+h
6j+5vvIeHt7xMZOvp6CUcPk=
=33G4
-----END PGP SIGNATURE-----

[-- Attachment #2: hello-world-default.patch --]
[-- Type: text/x-patch, Size: 1884 bytes --]

# Bazaar revision bundle v0.8
#
# message:
#   Added 'world'
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:30:21.903000116 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1
--- /dev/null
+++ world
@@ -0,0 +1,1 @@
+Hello, world

# revision id: abentley@panoramicfeedback.com-20061020153021-b5fcea14e9cd2b34
# sha1: 6d553e72158aaa76c258d98c15cd24922d171cd9
# inventory sha1: 64af82c4d81d9d6ad4f33fc734d32c2a1eaa0df5
# parent ids:
#   abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# base id: null:
# properties:
#   branch-nick: bar

# message:
#   Capitalized
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:51.953999996 -0400

=== modified file world // encoding:base64
LS0tIHdvcmxkCisrKyB3b3JsZApAQCAtMSwxICsxLDEgQEAKLWhlbGxvCitIZWxsbwoK

=== modified directory  // last-changed:abentley@panoramicfeedback.com-20061020
... 152951-10cff5ff5a51e9a2
# revision id: abentley@panoramicfeedback.com-20061020152951-10cff5ff5a51e9a2
# sha1: f7b79934bc3b0a944e35168b5df6b106c5b29ebf
# inventory sha1: 1400d56451752300cc31c9c94ff7ee2188e8ef8c
# parent ids:
#   abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# properties:
#   branch-nick: bar

# message:
#   initial commit
# committer: Aaron Bentley <abentley@panoramicfeedback.com>
# date: Fri 2006-10-20 11:29:35.536999941 -0400

=== added directory  // file-id:TREE_ROOT
=== added file world // file-id:world-20061020152929-12bknd8mm9mx48as-1 // enco
... ding:base64
LS0tIC9kZXYvbnVsbAorKysgd29ybGQKQEAgLTAsMCArMSwxIEBACitoZWxsbwoK

# revision id: abentley@panoramicfeedback.com-20061020152935-64bde004f622131f
# sha1: 0728f761b891b257f0a71e2e360799eec080cd21
# inventory sha1: e52e030ea40f6bf5da78f4e8eb8efcd072b0930a
# properties:
#   branch-nick: bar


* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:15                                                     ` Robert Collins
@ 2006-10-20 23:39                                                       ` Jeff Licquia
  0 siblings, 0 replies; 806+ messages in thread
From: Jeff Licquia @ 2006-10-20 23:39 UTC (permalink / raw)
  To: Robert Collins; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 09:15 +1000, Robert Collins wrote:
> I meant to add, that I think inference is a great tool to use as an
> adjunct to whatever explicit data one can capture.

If you ask me, that's the most interesting idea in this whole thread.

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:13                                                 ` Jeff Licquia
  2006-10-20 23:05                                                   ` Robert Collins
@ 2006-10-20 23:59                                                   ` Linus Torvalds
  2006-10-21  1:26                                                     ` Junio C Hamano
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-20 23:59 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Jan Hudec, bazaar-ng, git, Jakub Narebski



On Fri, 20 Oct 2006, Jeff Licquia wrote:
> 
> After this conflict is resolved, merging from b causes conflicts, while
> merging from c appears to work fine.  This continues until b merges from
> a (and resolves a conflict in a similar manner to a), at which time
> merging/pulling works as you'd expect between the branches.  Whenever b
> is marked as conflicting before it merges from a, bzr preserves b's
> changes by moving b's modified file.

This sounds somewhat like what I think BK did. I'm not sure if BK actually 
marked it as a conflict or whether BK just warned about "changes to 
deleted file" or something similar, but it didn't entirely _silently_ 
throw them away.

But I hope this shows some of the basic problems.

The much more _serious_ problem of "file identity" tracking is actually 
that you can't track partial file movement or file copies sanely. The 
thing is, tracking things at file boundaries simply is fundamentally a 
broken notion, simply because _code_ doesn't get done at file boundaries.

Both of these are things that git can actually do. Admittedly it does not do 
that in any _released_ version, so you'd have to work with the development 
branch, and it's a fairly early thing, but currently it can actually 
notice that our "revision.c" file largely came from the "rev-list.c" file 
that still exists!

And btw, that's not just some random feature that happened to get 
implemented last week. Yes, it actually _did_ get implemented last week, 
but this was something I outlined when I started git in April of last 
year, and tried to explain to people WHY TRACKING FILE ID'S ARE WRONG!

You can find me explaining these things to people in April-2005, which 
should tell you something: the initial revision of "git" was on Thursday, 
April 7. So the lack of file identity tracking has been controversial from 
the very beginning, but I was right then, and I'm right now.

Because the _fact_ is, that as long as you track stuff on a file basis, 
you're _never_ going to be able to do the things that git already does, 
and that are very natural.

Here's the real-world example of something that git CAN DO TODAY:

 - we used to have a file called "rev-list.c", which did a lot of the 
   commit history revision traversal, and is the source of the git command 
   "git rev-list".

 - I (and others) extended it a lot, and turned it into a more generic 
   library interface, so that other commands could traverse the commit 
   graph on their own, rather than forking and executing "git-rev-list" 
   and piping the output between them.

 - as a result, the old "rev-list.c" still exists (except it was renamed 
   to "builtin-rev-list.c" since it's now a builtin command to the main 
   "git" binary). 

 - HOWEVER, a lot of the actual code got split into the library file, 
   called "revision.c", which contains the real smarts of the program.

See? There was a file rename involved (rev-list.c => builtin-rev-list.c), 
but that actually happened after a lot of the really _interesting_ code 
had been excised from that file, and put into the new internal library 
file (revision.c).

Now, as a result, in many ways the rename is _much_ less interesting than 
the question about the history of the code in "revision.c" (because that's 
really some very core code). And that was never a rename at all. That was 
just a file create, where a lot of the contents happened to come from a 
file that continued to exist.

Wouldn't you want "annotate" to be able to follow this kind of data 
movement? Notice how there is no "file" that moved at all. Only code that 
moved between files.
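
To illustrate what following content instead of files could look like,
here is a toy sketch in Python. It is not how "git pickaxe" works (the
real tool looks at runs of lines across the commit history, not single
lines in one snapshot); `find_line_origins` and its inputs are
hypothetical.

```python
def find_line_origins(new_lines, parent_files):
    """Guess where each line of a newly created file came from.

    new_lines: lines of the new file.
    parent_files: dict path -> list of lines, as they were before the
    new file existed.
    Returns a list of (line, source_path or None).
    """
    origins = []
    for line in new_lines:
        source = None
        if line.strip():  # blank lines match everywhere, so skip them
            for path, lines in parent_files.items():
                if line in lines:
                    source = path
                    break
        origins.append((line, source))
    return origins


before = {"rev-list.c": ["int walk(void)", "{", "    return 0;", "}"]}
after_lines = ["int walk(void)", "int fresh(void)"]
origins = find_line_origins(after_lines, before)
```

No file was renamed or copied here, yet the first line is still traced
back to rev-list.c, which is the point being made about file IDs.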

I tell you: as long as you work with "file ID's", you'll always be 
inferior. You'll never be able to see that some code was copied 
_partially_ from one file into another. You'll never be able to see an 
important function moving between file boundaries.

Unless you work with "git", that is. Because git isn't so _stupid_ as to 
think that file boundaries matter. Git knows better. The only thing that 
matters is the actual _data_, and file boundaries are just one way of 
delimiting that data.

Just try it out. Get the "next" branch of the git repository (that's the 
"stable development" branch in git.git - ie it's going to be in the next 
release and is expected to work, unlike some of the more "experimental 
development" that is in the "pu" branch - pu = proposed updates), compile 
it, and run

	git pickaxe -C revision.c | less -S

and marvel. Marvel at my shining intelligence (and the small matter of 
programming, which was all done by Junio, but I'm taking all the credit 
_anyway_, because *dammit* I talked about this last year when people 
didn't understand! And besides, I always take all the credit regardless, 
so what are you whining about? Get off my back!).

More seriously, Junio really did a kick-ass job. I really had nothing at 
all to do with it, and deserve no real credit. But I _did_ foresee it, and 
yes, it really is about the fact that git tracks _contents_.

As somebody smarter than I has said (*): "I'm always right, but this time 
I'm even more right than usual".

			Linus

(*) Just kidding. It was me. Of course.

* Re: VCS comparison table
  2006-10-20 23:19                                             ` Junio C Hamano
@ 2006-10-21  0:07                                               ` Linus Torvalds
  2006-10-21  1:09                                                 ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21  0:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jakub Narebski, git



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> I am not bold enough to say _exactly_ N places, but you missed
> at least one more important one.  Merge simplification favors
> the earlier parents over later ones.

Which is probably slightly inconsistent (although I seriously doubt 
anybody really cares - when we simplify a merge we obviously do it 
exactly because the parents are identical wrt the files we are following).

Most of the rest of commit traversal tends to have a rule that says 
"traverse youngest parent first", simply by virtue of the fact that 
revlist() normally pops off the queue in date order. But Jakub is 
certainly correct that when we do "^" we just take the first one. 
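
The two behaviours contrasted above can be sketched as follows. This is
a simplified Python model, not git's revision walker: commits are a dict
of id to (timestamp, parent ids), and `walk_by_date` and `first_parent`
are hypothetical names.

```python
import heapq


def walk_by_date(commits, start):
    """Visit commits reachable from start, youngest pending commit first."""
    # heapq is a min-heap, so timestamps are negated to pop the youngest.
    heap = [(-commits[start][0], start)]
    seen = {start}
    order = []
    while heap:
        _, commit_id = heapq.heappop(heap)
        order.append(commit_id)
        for parent in commits[commit_id][1]:
            if parent not in seen:
                seen.add(parent)
                heapq.heappush(heap, (-commits[parent][0], parent))
    return order


def first_parent(commits, commit_id):
    """What "^" does in this model: take the first recorded parent."""
    return commits[commit_id][1][0]


history = {
    "merge": (40, ["p1", "p2"]),
    "p1": (10, ["root"]),
    "p2": (30, ["root"]),
    "root": (5, []),
}
order = walk_by_date(history, "merge")
```

Here the date-ordered walk reaches p2 before p1 because p2 is younger,
while "^" on the merge still yields p1 because it is recorded first.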

And "gitweb" does consider the first one special, since it shows diffs 
against that one (although I've argued that it probably shouldn't, and 
that there should be some way to show branches against arbitrary parents)

So we're a bit confused. Not that it probably really ever matters. We 
might as well say that parent order is random, and that our "random number 
generators" are pretty damn lazy ;)

		Linus

* Re: VCS comparison table
  2006-10-21  0:07                                               ` Linus Torvalds
@ 2006-10-21  1:09                                                 ` Junio C Hamano
  2006-10-21  1:19                                                   ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> And "gitweb" does consider the first one special, since it shows diffs 
> against that one (although I've argued that it probably shouldn't, and 
> that there should be some way to show branches against arbitrary parents)
>
> So we're a bit confused. Not that it probably really ever matters.

There is another one similar to the gitweb one you mentioned:
git-show --stat on a merge.  We deliberately chose to show the
difference from the first parent; it is called "showing the
changes the person who made this merge saw".

* Re: VCS comparison table
  2006-10-21  1:09                                                 ` Junio C Hamano
@ 2006-10-21  1:19                                                   ` Linus Torvalds
  2006-10-21  1:27                                                     ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21  1:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jakub Narebski



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> There is another one similar to the gitweb one you mentioned:
> git-show --stat on a merge.  We deliberately chose to show the
> difference from the first parent; it is called "showing the
> changes the person who made this merge saw".

Well, that one actually makes sense. It's just the stat from the previous 
state, after all, and it actually is done _together_ with the operation 
that causes the diffs.

So that one I don't think you can really even claim.

Also, it's not even the "first parent". Look closer. It's literally 
"previous state", because it does so for a fast-forward too. It's from 
ORIG_HEAD.

		Linus

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 23:59                                                   ` Linus Torvalds
@ 2006-10-21  1:26                                                     ` Junio C Hamano
  2006-10-21  8:40                                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:26 UTC (permalink / raw)
  To: git; +Cc: Jan Hudec, bazaar-ng, Jeff Licquia, Linus Torvalds, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> writes:

> Both of these are things that git can actually do. Admittedly it does not do 
> that in any _released_ version, so you'd have to work with the development 
> branch, and it's a fairly early thing, but currently it can actually 
> notice that our "revision.c" file largely came from the "rev-list.c" file 
> that still exists!
>
> And btw, that's not just some random feature that happened to get 
> implemented last week. Yes, it actually _did_ get implemented last week, 
> but this was something I outlined when I started git in April of last 
> year, and tried to explain to people WHY TRACKING FILE ID'S ARE WRONG!
>
> You can find me explaining these things to people in April-2005, which 
> should tell you something: the initial revision of "git" was on Thursday, 
> April 7. So the lack of file identity tracking has been controversial from 
> the very beginning, but I was right then, and I'm right now.

For people new to the list, the message is:

    http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217

I think I've quoted this link at least three times on this list;
I consider it _the_ most important message in the whole list
archive.  If you haven't read it, read it now, print it out,
read it three more times, place it under the pillow before you
sleep tonight.  Repeat that until you can recite the whole
message.  It should not take more than a week.

To me, personally, achieving that ideal "drill down" dream was
one of the more important goals of my involvement in this
project.  I did diffcore-rename to fill some part of the dream,
and then diffcore-pickaxe to fill some other part.  Neither was
even close.  I think the recent round of pickaxe is getting much
closer.

* Re: VCS comparison table
  2006-10-21  1:19                                                   ` Linus Torvalds
@ 2006-10-21  1:27                                                     ` Junio C Hamano
  2006-10-21  1:55                                                       ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-21  1:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 20 Oct 2006, Junio C Hamano wrote:
>> 
>> There is another one similar to the gitweb one you mentioned:
>> git-show --stat on a merge.  We deliberately chose to show the
>> difference from the first parent; it is called "showing the
>> changes the person who made this merge saw".
>
> Well, that one actually makes sense. It's just the stat from the previous 
> state, after all, and it actually is done _together_ with the operation 
> that causes the diffs.
>
> So that one I don't think you can really even claim.
>
> Also, it's not even the "first parent". Look closer. It's literally 
> "previous state", because it does so for a fast-forward too. It's from 
> ORIG_HEAD.

I was not talking about "git pull".  I was talking about "git
show".

* Re: VCS comparison table
  2006-10-21  1:27                                                     ` Junio C Hamano
@ 2006-10-21  1:55                                                       ` Linus Torvalds
  2006-10-21  8:32                                                         ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21  1:55 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git



On Fri, 20 Oct 2006, Junio C Hamano wrote:
> 
> I was not talking about "git pull".  I was talking about "git
> show".

Duh. I don't know why I misread that.

Yeah, that makes no sense at all. I _think_ "git show" should be the same 
thing as a single-entry "git log -p".

		Linus

* git-merge-recursive, was Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 20:57                                                         ` Linus Torvalds
@ 2006-10-21  2:03                                                           ` Johannes Schindelin
  2006-10-21  2:17                                                             ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-21  2:03 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aaron Bentley, Jakub Narebski, Jan Hudec, bazaar-ng, Git Mailing List



On Fri, 20 Oct 2006, Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Aaron Bentley wrote:
> > 
> > Agreed.  We start by comparing BASE and OTHER, so all those comparisons
> > are in-memory operations that don't hit disk.  Only for files where BASE
> > and OTHER differ do we even examine the THIS version.
> 
> Git just slurps in all three trees. I actually think that the current 
> merge-recursive.c does it the stupid way (ie it expands all trees 
> recursively, regardless of whether it's needed or not), but I should 
> really check with Dscho, since I had nothing to do with that code.

AFAIR yes, it does the dumb thing, namely it does not take advantage of 
trees being identical when their SHA1s are identical.

This will be a _tremendous_ speed-up.
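
A rough sketch of the shortcut in question, in Python and under
simplifying assumptions: the object store is a dict of tree id to
{name: (kind, id)}, and `changed_paths` is a hypothetical helper, not
code from merge-recursive. Because an id is a hash of the content, two
subtrees with the same id never need to be expanded.

```python
def changed_paths(store, id_a, id_b, prefix=""):
    """List paths that differ between two trees, skipping equal subtrees."""
    if id_a == id_b:
        return []  # identical ids mean identical content: nothing to expand
    entries_a = store.get(id_a, {})
    entries_b = store.get(id_b, {})
    changed = []
    for name in sorted(set(entries_a) | set(entries_b)):
        a = entries_a.get(name)
        b = entries_b.get(name)
        if a == b:
            continue  # same kind and same id: skip without reading it
        path = prefix + name
        if a and b and a[0] == "tree" and b[0] == "tree":
            changed.extend(changed_paths(store, a[1], b[1], path + "/"))
        else:
            changed.append(path)
    return changed


# The "doc" subtree (id d1) is deliberately absent from the store:
# it is shared by both roots, so it is never looked up.
store = {
    "root1": {"src": ("tree", "s1"), "doc": ("tree", "d1")},
    "root2": {"src": ("tree", "s2"), "doc": ("tree", "d1")},
    "s1": {"a.c": ("blob", "x1"), "b.c": ("blob", "y1")},
    "s2": {"a.c": ("blob", "x2"), "b.c": ("blob", "y1")},
}
diff = changed_paths(store, "root1", "root2")
```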

> > > So recursive basically generates the matrix of similarity for the 
> > > new/deleted files, and tries to match them up, and there you have your 
> > > renames - without ever looking at the history of how you ended up where 
> > > you are.
> > 
> > So in the simple case, you compare unmatched THIS, OTHER and BASE files
> > to find the renames?
> 
> Right. Some cases are easy: if one of the branches only added files (which 
> is relatively common), that obviously cannot be a rename. So you don't 
> even have to compare all possible combinations - you know you don't have 
> renames from one branch to the other ;)
> 
> But I'm not even the authoritative person to explain all the details of the 
> current recursive merge, and I might have missed something. Dscho? 
> Fredrik? Anything you want to add?

Not me. Only that there is much potential for optimization (meaning 
performance, not the basic algorithm).
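
The similarity-matrix idea in the quoted text can be sketched like this.
It is a Python toy, not the diffcore-rename code: difflib's ratio stands
in for git's similarity score, the 0.5 threshold mirrors git's default
of 50%, and `detect_renames` is a hypothetical name.

```python
from difflib import SequenceMatcher


def detect_renames(deleted, added, threshold=0.5):
    """Pair deleted files with added files by content similarity.

    deleted, added: dict path -> file content. No history is consulted.
    """
    scores = []
    for old_path, old_text in deleted.items():
        for new_path, new_text in added.items():
            score = SequenceMatcher(None, old_text, new_text).ratio()
            if score >= threshold:
                scores.append((score, old_path, new_path))
    renames = {}
    used_new = set()
    # Best scores first; each file takes part in at most one rename.
    for score, old_path, new_path in sorted(scores, reverse=True):
        if old_path not in renames and new_path not in used_new:
            renames[old_path] = new_path
            used_new.add(new_path)
    return renames


deleted = {"rev-list.c": "int main(void)\n{\n    return walk();\n}\n"}
added = {
    "builtin-rev-list.c": "int cmd_rev_list(void)\n{\n    return walk();\n}\n",
    "README": "Documentation goes here.\n",
}
renames = detect_renames(deleted, added)
```

If one side only added files, `deleted` is empty and the matrix has no
rows, which is the easy case mentioned in the quote.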

Ciao,
Dscho

* Re: git-merge-recursive, was Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
@ 2006-10-21  2:17                                                             ` Junio C Hamano
  2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-21  2:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Fri, 20 Oct 2006, Linus Torvalds wrote:
>
>> Git just slurps in all three trees. I actually think that the current 
>> merge-recursive.c does it the stupid way (ie it expands all trees 
>> recursively, regardless of whether it's needed or not), but I should 
>> really check with Dscho, since I had nothing to do with that code.
>
> AFAIR yes, it does the dumb thing, namely it does not take advantage of 
> trees being identical when their SHA1s are identical.
>
> This will be a _tremendous_ speed-up.

While we are talking about merge-recursive, I could use some
help from somebody familiar with merge-recursive to complete the
read-tree changes Linus mentioned early this month.

The issue is that we would want to remove one verify_absent()
call in unpack-tree.c:threeway_merge().  When read-tree decides
to leave higher stages around, we do not want it to check if the
merge could clobber a working tree file, because having an
unrelated file at the same path in the working tree sometimes is
and sometimes is not a conflict, depending on the outcome of the
merge, and that part of the code does not _know_ the outcome
yet.

What this means is that we would need to have the equivalent
check in the merge strategy that uses read-tree for three-way
merge when we remove this overcautious safety check from
read-tree.  I've adjusted merge-one-file to do so, but not many
people use 'resolve' strategy these days, and we would need the
matching change in merge-recursive.

If you are interested, you can see the details in commit 0b35995.

* Re: VCS comparison table
  2006-10-19 20:47                                             ` Linus Torvalds
@ 2006-10-21  5:49                                               ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-21  5:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> For example, while git now does "annotate" (or "blame"), it's not 
> lightning fast, and I simply don't care. Doing a
>
> 	git blame kernel/sched.c
>
> takes about three seconds for me, and that's on a pretty good machine (and 
> on the kernel tree, which for me is always in the cache ;).

Lines 6041-6091 of that file are blamed to arch/ia64/kernel/domain.c
by pickaxe -C (attributed to commit 2.6.12-rc2) while blame says
they are brought in by commit 9c1cfa, which says "Move the ia64
domain setup code to the generic code".  I am slowly realizing
that comparing the output from blame and pickaxe might be a good
way to study the project history.

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 14:56                                       ` Jakub Narebski
  2006-10-20 15:34                                         ` Aaron Bentley
@ 2006-10-21  7:56                                         ` Matthieu Moy
  2006-10-21  8:36                                           ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-21  7:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

>> It's my understanding that most changes discussed on lkml are provided
>> as a series of patches.  Bazaar bundles are intended as a direct
>> replacement for patches in that use case.
>
> As _series_ of patches. You have git-format-patch + git-send-email
> to format and send them, git-am to apply them (as patches, not as branch).
>
> I was under the impression that the user sees only a mega-patch of all the
> revisions in the bundle together, and the rest is for machine consumption only.

Nothing prevents you from using series of bundles.

A bundle for a single revision looks like a patch with a few comments
on top and bottom. _If_ you have several revisions in your patch, you
get the diff as human readable, and the intermediate revisions as
MIME-encoded.

For big changes, people do send several bundles.

So, a bundle is a direct replacement for a patch, not for series of
patches.

-- 
Matthieu

* Re: VCS comparison table
  2006-10-17 12:07                 ` Sean
@ 2006-10-21  8:27                   ` Jakub Narebski
  2006-10-21  8:48                     ` Erik Bågfors
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:27 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Sean wrote:

> On Tue, 17 Oct 2006 13:45:31 +0200
> Jakub Narebski <jnareb@gmail.com> wrote:
> 
>> Git cannot do that remotely (with exception of git-tar-tree/git-archive 
>> which has --remote option), yet. But you can get contents of a file 
>> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list 
>> directory (with "git ls-tree <tree-ish>") and compare files or 
>> directories (git diff family of commands) without need for working 
>> directory.
> 
> Interesting, I didn't know about the --remote option.  So in fact as long
> as the remote has enabled upload-tar then anyone can do a "light
> checkout". 

Not exactly. "Light checkout" (aka "lazy one-branch clone") in bzr
also contains info about the repository it came from, and has some
metadata so that you can commit to it locally. git tar-tree --remote
just gets a snapshot.

> However, it appears that kernel.org for instance doesn't enable this
> feature. 

One can get a snapshot from gitweb... if gitweb is new enough and
has this feature enabled (it is enabled by default). Again, this is
not the case for kernel.org.

* Re: VCS comparison table
  2006-10-21  1:55                                                       ` Linus Torvalds
@ 2006-10-21  8:32                                                         ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:32 UTC (permalink / raw)
  To: git

Linus Torvalds wrote:

> On Fri, 20 Oct 2006, Junio C Hamano wrote:
>> 
>> I was not talking about "git pull".  I was talking about "git
>> show".
> 
> Duh. I don't know why I misread that.
> 
> Yeah, that makes no sense at all. I _think_ "git show" should be the same 
> thing as a single-entry "git log -p".

Huh?

$ git show ff49fae6a547e5c70117970e01c53b64d983cd10
commit ff49fae6a547e5c70117970e01c53b64d983cd10
Merge: 7ad4ee7... 75f9007... 14eab2b... 0b35995... eee4609...
[...]
diff --cc Makefile
index 36b9e06,68ae43b,66c8b4b,66c8b4b,09f60bb..a2f2f7c
[...]

"git show" doesn't prefer first parent: it uses compact combined
(that is the meaning of --cc, isn't it?) format for merges.

git version 1.4.2.1
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  7:56                                         ` Matthieu Moy
@ 2006-10-21  8:36                                           ` Jakub Narebski
  2006-10-21 10:09                                             ` Matthieu Moy
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:36 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
> 
>>> It's my understanding that most changes discussed on lkml are provided
>>> as a series of patches.  Bazaar bundles are intended as a direct
>>> replacement for patches in that use case.
>>
>> As _series_ of patches. You have git-format-patch + git-send-email
>> to format and send them, git-am to apply them (as patches, not as branch).
>>
>> I was under an impression that user sees only mega-patch of all the
>> revisions in bundle together, and rest is for machine consumption only.
> 
> Nothing prevents you from using series of bundles.
> 
> A bundle for a single revision looks like a patch with a few comments
> on top and bottom. _If_ you have several revisions in your patch, you
> get the diff as human readable, and the intermediate revisions as
> MIME-encoded.
> 
> For big changes, people do send several bundles.
> 
> So, a bundle is a direct replacement for a patch, not for series of
> patches.

Ah, that explains it. So why do people use bundles instead of patches
(with some metainfo like the commit message)? And does bzr have a command
to apply, in the correct order, a series of bundles sent either as a chain
of replies (each patch in the series is a reply to the previous patch) or
as replies to the patch series' introductory message?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  1:26                                                     ` Junio C Hamano
@ 2006-10-21  8:40                                                       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21  8:40 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jan Hudec, bazaar-ng, Jeff Licquia, Linus Torvalds

Junio C Hamano wrote:

> For people new to the list, the message is:
> 
>     http://thread.gmane.org/gmane.comp.version-control.git/27/focus=217
> 
> I think I've quoted this link at least three times on this list;
> I consider it is _the_ most important message in the whole list
> archive.  If you haven't read it, read it now, print it out,
> read it three more times, place it under the pillow before you
> sleep tonight.  Repeat that until you can recite the whole
> message.  It should not take more than a week.
> 
> To me, personally, achieving that ideal "drill down" dream was
> one of the more important goals of my involvement in this
> project.  I did diffcore-rename to fill some part of the dream,
> and then diffcore-pickaxe to fill some other part.  Neither was
> even close.  I think the recent round of pickaxe is getting much
> closer.

What I find lacking in this mail, and in git as it is now, is some way
of remembering, and perhaps even propagating, the user's corrections to
the automatic detection of content movement (which includes file renames
and file copying).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21  8:27                   ` Jakub Narebski
@ 2006-10-21  8:48                     ` Erik Bågfors
  0 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-21  8:48 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On 10/21/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Sean wrote:
>
> > On Tue, 17 Oct 2006 13:45:31 +0200
> > Jakub Narebski <jnareb@gmail.com> wrote:
> >
> >> Git cannot do that remotely (with exception of git-tar-tree/git-archive
> >> which has --remote option), yet. But you can get contents of a file
> >> (with "git cat-file -p [<revision>:|:<stage>:]<filename>"), list
> >> directory (with "git ls-tree <tree-ish>") and compare files or
> >> directories (git diff family of commands) without need for working
> >> directory.
> >
> > Interesting, I didn't know about the --remote option.  So in fact as long
> > as the remote has enabled upload-tar then anyone can do a "light
> > checkout".
>
> Not exactly. "Light checkout" (aka "lazy one-branch clone") in bzr
> contains also info about the repository it came from, and has some
> metadata that you can commit to it locally. git tar-tree --remote
> just gets snapshot.

No, a lightweight checkout doesn't have that.  A lightweight checkout
is basically just the latest revision checked out, a snapshot. For
everything else it needs to go to the remote branch to get information.
You cannot commit locally on a "lightweight checkout".

A "normal/heavyweight" checkout has the ability to commit locally.

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21  8:36                                           ` Jakub Narebski
@ 2006-10-21 10:09                                             ` Matthieu Moy
  2006-10-21 10:34                                               ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-21 10:09 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> writes:

> Ah, that explains it. So why do people use bundles instead of patches
> (with some metainfo like the commit message)?

You need more metainfo than the commit message. Since the revision-id is
not based on the content, you need to specify at least the
revision-id.

And bzr's bundles do indeed give _all_ the information that is in the
repository about this revision (i.e. commit message, ancestors, ...).

Another relevant difference between a patch and a bundle is that a
bundle knows its ancestor, so when you apply the bundle, it builds
the new revision with exact patching. If you need a merge, then it
will happen in exactly the same way as a merge between two branches
(i.e. a three-way merge, for example).

> And does bzr have a command to apply, in the correct order, a series
> of bundles sent either as a chain of replies (each patch in the series
> is a reply to the previous patch) or as replies to the patch series'
> introductory message?

Not directly AFAIK, but since the bundle knows which revision it
applies to, it will refuse to apply the second if the first one is not
in your repository already for example.

It would probably be interesting to have more features to help with
sending series of bundles and applying them, but no one has really been
asking for it up to now.

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 10:09                                             ` Matthieu Moy
@ 2006-10-21 10:34                                               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 10:34 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: bazaar-ng, git

Matthieu Moy wrote:

> Another relevant difference between a patch and a bundle is that a
> bundle knows its ancestor, so when you apply the bundle, it builds
> the new revision with exact patching. If you need a merge, then it
> will happen in exactly the same way as a merge between two branches
> (i.e. a three-way merge, for example).

By the way, if a patch sent via email is a git-enhanced patch, carrying
the [shortened] sha1 of the blobs (file contents), and our repository has
the blob the patch is supposed to apply to (even though, for example, the
line of development has moved forward), we can pass the --3way option
to git-am to make it fall back on a 3-way merge if the patch doesn't
apply cleanly.

It is not as powerful as a merge of branches, but it is sufficient
in most cases. And in the other cases you have to resolve the conflict by
hand anyway; git-rerere (which records conflict resolutions and
reuses them) can help there.
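
A minimal sketch of that fallback, assuming a reasonably recent git; the
repository layout, branch name and file contents are invented for the
example. The patch is made against an older state, the mainline then
changes a line inside the patch's context (so a plain apply is expected
to fail), and "git am --3way" uses the blob ids recorded in the patch to
merge instead:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q repo
cd repo
git config user.name "Example"
git config user.email "example@example.invalid"

printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > file
git add file
git commit -q -m 'base'
main=$(git symbolic-ref --short HEAD)

# The sender: change the last line and export the commit as a patch.
git checkout -q -b topic
printf '%s\n' 1 2 3 4 5 6 7 8 9 TEN > file
git commit -q -am 'topic: change last line'
git format-patch -q -1 -o ../patches

# The receiver: mainline has moved on and touched a context line.
git checkout -q "$main"
printf '%s\n' 1 2 3 4 5 6 SEVEN 8 9 10 > file
git commit -q -am 'mainline: change line 7'

# Fall back on a 3-way merge if the patch does not apply cleanly.
git am --3way ../patches/*.patch
cat file
```

The resulting file should carry both the mainline change and the change
from the patch.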
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20  4:05                                               ` Aaron Bentley
@ 2006-10-21 12:30                                                 ` Jan Hudec
  2006-10-21 13:05                                                   ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 12:30 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Tim Webster, Christian MICHON, Andreas Ericsson, bazaar-ng, git,
	Matthieu Moy

On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
> Tim Webster wrote:
> > Also svn does not allow files in the same directory to live in
> > multiple repos
> 
> It would surprise me if many SCMs that support atomic commit also
> support intermixing files from multiple repos in the same directory.

In fact I think svk would. You would have to switch them by setting
an environment variable, but it's probably doable. That is because
unlike other version control systems, it does not store the information
about checkout in the checkout, but in the central directory and that
can be set. I don't know git well enough to tell whether git could do
the same by setting GIT_DIR.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 10:40                                       ` Jakub Narebski
  2006-10-20 13:36                                         ` Shawn Pearce
@ 2006-10-21 12:30                                         ` Matthew D. Fuller
  1 sibling, 0 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 12:30 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Fri, Oct 20, 2006 at 12:40:11PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> I'd like to put ComparisonWithBazaarNG page on GitWiki
> (http://git.or.cz/gitwiki/) some time soon,

This is a good idea; I think we've plowed a lot of ground in this
thread that would be useful to document somewhere easily
referenceable.  I've thought a few times while going through these
mails of putting some of the material up on the Bazaar wiki.  I'm not
really the best person to try and sort it out, but I may try and put
together some notes at least.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 21:48                                             ` Carl Worth
@ 2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
                                                                   ` (2 more replies)
  2006-10-21 20:05                                               ` Aaron Bentley
  1 sibling, 3 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 13:01 UTC (permalink / raw)
  To: Carl Worth
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> 
> The entire discussion is about how to name things in a distributed
> system.

I think we're getting into scratched-record-mode on this.


Git: Revnos aren't globally unique or persistent.

Bzr: Yes, we know.

G: Therefore they're useless.

B: No, they're very useful in [situation] and [situation], and we deal
   with [situation] all the time, and they work great for that.

G: But they fall apart totally in [situation].

B: Yes, so use revids there.

G: So use revids everywhere.

B: Revnos are handier tools for [situation] and [situation] for
   [reason] and [reason].

*brrrrrrrrrrrrrrrrip!!!*    *skip back to start*


I'm not sure there's any unturned stone left along this line, so I'm
not sure how productive it really is to keep walking down it.  So, to
make something productive of it, I'm going to put it onto my todo list
to spend some time with bzr trying to use revids for stuff.  I'm
fairly certain that, due to the bzr cultural tendency to use revnos
where possible, there are some rough edges in the UI when using revids
that should be filed down (though I think it much less likely to turn
up underlying model failures that interfere with using revids).


> It may be that the centralization bias

I think it's more accurately describable as a branch-identity bias.
The git claim seems to be that the two statements are identical, but I
have some trouble swallowing that.


> I'm still not sure exactly what a bzr branch is, but it's clearly
> something different from a git branch,

The term is somewhat overloaded, which is why it's causing you trouble
(and did me).  It refers both to the conceptual entity ("a line of
development" roughly, much like what 'branch' means in git and VCS in
general), and to the physical location (directory, URL) where that
branch is stored, and where it'll often have a working tree.  Branches
are always referred to by location, never by name.


> (and I'd be interested to see a "corrected" version of the commands
> above to fix the storage inefficiencies).

The 'corrected' step would be:

> 	mkdir bzrtest; cd bzrtest
    bzr init-repo .
> 	mkdir master; cd master; bzr init

Then all branches stored under that 'bzrtest' dir will use the
bzrtest/.bzr/ dir for storing the revisions, and shared revisions will
only exist once, saving the space/time for multiple copies.

Probably, you'd actually want 'init-repo --trees' in this case,
because repos default to being [working]tree-less.  In a tree-less
setup, you'd create a [lightweight] checkout of the branch(es) you
wanted to work on elsewhere, giving you a layout much like CVS or SVN
where "my VCS files are THERE, my working tree is HERE".


> (since pull seems the only way to synch up without infinite new
> merge commits being added back and forth).

The infinite-merge-commits case doesn't happen in bzr-land because we
generally don't merge other branches except when the branch owner says
"Hey, I've got something for you to merge".  If you were to set up a
script to merge two branches back and forth until they were 'equal',
yes, it'd churn away until you filled up your disk with the N bytes of
metadata every new revision uses up.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 12:30                                                 ` Jan Hudec
@ 2006-10-21 13:05                                                   ` Jakub Narebski
  2006-10-21 13:15                                                     ` Jan Hudec
  2006-10-21 16:56                                                     ` Aaron Bentley
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 13:05 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jan Hudec wrote:

> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>> Tim Webster wrote:
>> > Also svn does not allow files in the same directory to live in
>> > multiple repos
>> 
>> It would surprise me if many SCMs that support atomic commit also
>> support intermixing files from multiple repos in the same directory.
> 
> In fact I think svk would. You would have to switch them by setting
> an environment variable, but it's probably doable. That is because
> unlike other version control systems, it does not store the information
> about checkout in the checkout, but in the central directory and that
> can be set. I don't know git well enough to tell whether git could do
> the same by setting GIT_DIR.

You can very simply embed one "clothed" repository into another in GIT,
as shown below:

  project/.git
  project/subdir/
  project/subdir/file
  project/subproject/
  project/subproject/.git
  project/subproject/file
  ...

Whether one wants the files belonging to the subdirectory to be ignored
by the top repository depends on the circumstances. You would want to
ignore the .git/ directory, though.
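
A minimal sketch of this layout, assuming a reasonably recent git. The
directory names follow the listing above; ignoring the whole subproject
through the top-level .gitignore is just one possible choice, made here
for illustration:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

# Outer repository and, inside its working area, an independent inner one.
git init -q project
git init -q project/subproject
for repo in project project/subproject; do
    git -C "$repo" config user.name "Example"
    git -C "$repo" config user.email "example@example.invalid"
done

mkdir project/subdir
echo outer > project/subdir/file
echo inner > project/subproject/file

# The inner repository tracks its own file.
git -C project/subproject add file
git -C project/subproject commit -q -m 'subproject: add file'

# The outer repository ignores the embedded one entirely.
printf '/subproject/\n' > project/.gitignore
git -C project add .gitignore subdir/file
git -C project commit -q -m 'project: add subdir/file'

git -C project status --short      # expected to print nothing
```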
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:05                                                   ` Jakub Narebski
@ 2006-10-21 13:15                                                     ` Jan Hudec
  2006-10-21 13:29                                                       ` Jakub Narebski
  2006-10-21 16:56                                                     ` Aaron Bentley
  1 sibling, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 13:15 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sat, Oct 21, 2006 at 03:05:22PM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> 
> > On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
> >> Tim Webster wrote:
> >> > Also svn does not allow files in the same directory to live in
> >> > multiple repos
> >> 
> >> It would surprise me if many SCMs that support atomic commit also
> >> support intermixing files from multiple repos in the same directory.
> > 
> > In fact I think svk would. You would have to switch them by setting
> > an environment variable, but it's probably doable. That is because
> > unlike other version control systems, it does not store the information
> > about checkout in the checkout, but in the central directory and that
> > can be set. I don't know git well enough to tell whether git could do
> > the same by setting GIT_DIR.
> 
> You can very simply embed one "clothed" repository into another in GIT,
> like shown below
> 
>   project/.git
>   project/subdir/
>   project/subdir/file
>   project/subproject/
>   project/subproject/.git
>   project/subproject/file
>   ...
> 
> It depends on circumstances if one wants files belonging to subdirectory
> be ignored by top repository. You would want to ignore .git/ directory,
> though.

Yes, you can do that with bzr and most other tools I know of as well.
But I understand the original question as requesting the working trees
to be rooted at the same place (ie. all in /etc), because each has some
files and some directories that have to be placed next to each other.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:15                                                     ` Jan Hudec
@ 2006-10-21 13:29                                                       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 13:29 UTC (permalink / raw)
  To: Jan Hudec; +Cc: bazaar-ng, git

On Saturday, 21 October 2006 15:15, Jan Hudec wrote:
> On Sat, Oct 21, 2006 at 03:05:22PM +0200, Jakub Narebski wrote:
>> Jan Hudec wrote:
>> 
>>> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>>>> Tim Webster wrote:
>>>>> Also svn does not allow files in the same directory to live in
>>>>> multiple repos
>>>> 
>>>> It would surprise me if many SCMs that support atomic commit also
>>>> support intermixing files from multiple repos in the same directory.
>>> 
>>> In fact I think svk would. You would have to switch them by setting
>>> an environment variable, but it's probably doable. That is because
>>> unlike other version control systems, it does not store the information
>>> about checkout in the checkout, but in the central directory and that
>>> can be set. I don't know git well enough to tell whether git could do
>>> the same by setting GIT_DIR.
>> 
>> You can very simply embed one "clothed" repository into another in GIT,
>> like shown below
[...]
>> It depends on circumstances if one wants files belonging to subdirectory
>> be ignored by top repository. You would want to ignore .git/ directory,
>> though.
> 
> Yes, you can do that with bzr and most other tools I know of as well.
> But I understand the original question as requesting the working trees
> to be rooted at the same place (ie. all in /etc), because each has some
> files and some directories that have to be placed next to each other.

You can separate the working area from the repository (you don't need to
have the repository in the top directory of the working area), but you
must then tell each git command you run where the repository is, either by
setting the GIT_DIR environment variable (GIT_DIR=/path/to/repo.git git commit ...)
or by using the --git-dir option of the git wrapper (git --git-dir=/path/to/repo.git diff),
since automatic detection of the repository wouldn't work, of course.
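
As a rough sketch of the "several repositories rooted at the same place"
case, two repositories below share one working directory and each tracks
a different file in it. The paths and file names are invented. Note also
that the sketch passes --work-tree / GIT_WORK_TREE in addition to
--git-dir / GIT_DIR: that is my assumption to make the working area
explicit with a current git, not something described above.

```shell
set -e
tmp=$(mktemp -d)
mkdir "$tmp/etc"
cd "$tmp/etc"
echo alpha > a.conf
echo beta > b.conf

# Two repositories living outside the shared working area.
for name in one two; do
    git --git-dir="$tmp/$name.git" --work-tree="$tmp/etc" init -q
    git --git-dir="$tmp/$name.git" config user.name "Example"
    git --git-dir="$tmp/$name.git" config user.email "example@example.invalid"
done

# Option form: name the repository on every command.
git --git-dir="$tmp/one.git" --work-tree="$tmp/etc" add a.conf
git --git-dir="$tmp/one.git" --work-tree="$tmp/etc" commit -q -m 'track a.conf'

# Environment form: set it once for a group of commands.
export GIT_DIR="$tmp/two.git" GIT_WORK_TREE="$tmp/etc"
git add b.conf
git commit -q -m 'track b.conf'
unset GIT_DIR GIT_WORK_TREE

git --git-dir="$tmp/one.git" --work-tree="$tmp/etc" ls-files
git --git-dir="$tmp/two.git" --work-tree="$tmp/etc" ls-files
```

Each repository sees the other's file as untracked, so in practice one
would probably also want per-repository ignore rules.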
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Alternate revno proposal (Was: Re: VCS comparison table)
  2006-10-19  8:19                         ` Alexander Belchenko
@ 2006-10-21 13:48                           ` Jan Hudec
  0 siblings, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 13:48 UTC (permalink / raw)
  To: Alexander Belchenko; +Cc: bazaar-ng, git

On Thu, Oct 19, 2006 at 11:19:30AM +0300, Alexander Belchenko wrote:
> Jan Hudec wrote:
> >Reading this thread I came to think, that the revnos should be assigned
> >to _all_ revisions _available_, in order of when they entered the
> >repository (there are some possible variations I will mention below)
> ...
> > - They would be the same as subversion and svk, and IIRC mercurial as
> >   well, use, so:
> >   - They would already be familiar to users comming from those systems.
> >   - They are known to be useful that way. In fact for svk it's the only
> >     way to refer to revisions and seem to work satisfactorily (though
> >     note that svk is not really suitable to ad-hoc topologies).
> 
> I think that the SVN model of revision numbers is wrong, and applying it
> to bzr would break many UI habits. For example, when one uses svn and the
> repo has many branches, you can never say which revisions belong to the
> mainline. So things like
> bzr diff -rM..N
> (where M and N are absolute revision numbers, and N = M+1(+2) etc.)
> will be more complicated, because in this case you first need to run the
> log command and remember the actual numbers of those revisions.

Well, you need to run log anyway, because you usually want to see a diff
between some particular revisions, so you need to find them anyway.

On the other hand in subversion all revisions actually exist on all
branches, so svn diff -r N-1:N always shows changes introduced by
revision N, while here you would have to use before:N..N.

> And each time I am frustrated to see that after mainline svn revision 1000
> the next mainline revision might be 1020. It's very-very-very confusing.
> Maybe only for me.

I got used to this pretty quickly when I used svk. And there it actually
happens much more often than in subversion itself, because you have the
mirrored branches and each commit on them also gets a revision number.
But yes, they feel more weird.

> There are 2 reasons why I don't want to switch to svn (if I can make my
> own choice): their strange tags implementation (their tags are the same as
> branches, so what is the difference?) and their revision numbers.
> 
> I also think that dotted revisions is not answer in this case, but it
> looks very logical and nice.
> 
> I think bzr need to have a switch, a flag, probably in .bazaar.conf to
> show revno to user or revid. And user can easily select what model is
> more appropriate for him:
> 
> * decentralized (with revno)
> * or distributed (with revid i.e. UUID)

Personally I'd like the UI to make the revision ids more visible, since
they are the canonical way of referring to revisions and, as shown in this
thread among others, people who know something about distributed version
control are actually confused by them not being visible and think they
are not there.

> >Comments?
> 
> -1 to make revno as in svn.

Hm, you are probably right. In any case it's more useful to teach the
users not to get attached to the revnos too much.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
@ 2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 16:31                                                   ` Erik Bågfors
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 14:08 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

On Saturday, 21 October 2006 15:01, Matthew D. Fuller wrote:
> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
> > 
> > The entire discussion is about how to name things in a distributed
> > system.
> 
> I think we're getting into scratched-record-mode on this.
> 
> 
> Git: Revnos aren't globally unique or persistent.
> 
> Bzr: Yes, we know.
> 
> G: Therefore they're useless.
> 
> B: No, they're very useful in [situation] and [situation], and we deal
>    with [situation] all the time, and they work great for that.
> 
> G: But they fall apart totally in [situation].

G: But revnos force centralized/star-topology development. And even in
   [situation] have [disadvantages].

> B: Yes, so use revids there.
> 
> G: So use revids everywhere.
> 
> B: Revnos are handier tools for [situation] and [situation] for
>    [reason] and [reason].

G: Shortened sha1 commit-ids are almost as handy.

> *brrrrrrrrrrrrrrrrip!!!*    *skip back to start*

There _are_ terminology conflicts. For example:

 * A bzr "branch" is roughly equivalent to a one-branch git
   "repository".
 * A bzr "repository" is just a collection of branches sharing common
   storage. That is similar to a set of git "repositories" with
   .git/objects/ linked to a common object repository (storage area), or
   with an appropriately set alternates file (although that is not
   common usage in git, and for example you would have to be careful
   with running git-prune).
 * A bzr "lightweight checkout" is equivalent to the nonexistent "lazy
   clone"/"remote alternates" discussed on the git mailing list but not
   implemented because of performance concerns.
 * A bzr "normal checkout" is, I think, similar to a git "shared clone"
   (but a shared clone is limited to repositories on the same
   filesystem).
 * A bzr "heavyweight checkout" is roughly equivalent to a
   one-branch-only "clone" in git or cg (cg = Cogito).

And there are differences in opinion. For example, the "simple namespace
for revisions" which is important for bzr is, from git's point of view,
only superficially simple (as it works only for the centralized approach,
and leaf repositories need access to the central repository to get the
final revnos). On the other hand, the "non-simplicity" of git's sha1
identifiers is not that complicated in everyday work, as one usually uses
branch and tag names and the <ref>~<n> and <ref1>..<ref2> syntax,
sometimes shortened sha1 names, and full sha1 names only rarely. For bzr
it is more important to tell from the revno which commit on a branch was
earlier; for git it is more important that commit ids never ever change,
and we can use git commands to check which commit was earlier. For bzr
plugins are important; for git it is important that new commands are easy
to add, using scripts for fast prototyping.
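
To make the naming forms concrete, here is a small sketch, assuming a
reasonably recent git; the repository, the tag name v3 and the variable
names are made up. It resolves the same history through a full sha1, a
shortened sha1, the <ref>~<n> form and a <ref1>..<ref2> range:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q demo
cd demo
git config user.name "Example"
git config user.email "example@example.invalid"

# Three commits in a row, the last one tagged.
for n in 1 2 3; do
    echo "$n" > file
    git add file
    git commit -q -m "commit $n"
done
git tag v3

full=$(git rev-parse HEAD)               # full object name
short=$(git rev-parse --short=8 HEAD)    # shortened name, at least 8 characters
resolved=$(git rev-parse "$short")       # the short name resolves to the same commit

grandparent=$(git rev-parse HEAD~2)          # <ref>~<n>: n-th generation ancestor
root=$(git rev-list --max-parents=0 HEAD)    # the commit without parents
in_range=$(git rev-list --count HEAD~2..v3)  # <ref1>..<ref2>: in v3 but not in HEAD~2

echo "$short is short for $full"
echo "commits in HEAD~2..v3: $in_range"
```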

> > It may be that the centralization bias
> 
> I think it's more accurately describable as a branch-identity bias.
> The git claim seems to be that the two statements are identical, but I
> have some trouble swallowing that.

When two clones of the same repository (in git terminology), or two
"branches" (in bzr terminology), used by different people, cannot be
totally equivalent, that is centralization bias. By equivalent I mean
that the "old history" is exactly the same (the same diagram, the same
identifiers, or rather the same usually used identifiers).

The fact that you have two different commands, "merge" vs "pull",
for use in the one mother/mainline "branch" vs the other "branches" tells
us that there is a bias towards centralization.

> > I'm still not sure exactly what a bzr branch is, but it's clearly
> > something different from a git branch,
> 
> The term is somewhat overloaded, which is why it's causing you trouble
> (and did me).  It refers both to the conceptual entity ("a line of
> development" roughly, much like what 'branch' means in git and VCS in
> general), and to the physical location (directory, URL) where that
> branch is stored, and where it'll often have a working tree.  Branches
> are always referred to by location, never by name.

I'd rather use another name, then. Perhaps "fork" for the physical
"branch", i.e. branch metadata (like the revno to revid mapping) + the
object repository or a pointer to it + optionally the working area/working
files.

[...]
> > (since pull seems the only way to synch up without infinite new
> > merge commits being added back and forth).
> 
> The infinite-merge-commits case doesn't happen in bzr-land because we
> generally don't merge other branches except when the branch owner says
> "Hey, I've got something for you to merge".  If you were to setup a
> script to merge two branches back and forth until they were 'equal',
> yes, it'd churn away until you filled up your disk with the N bytes of
> metadata every new revision uses up.

And you say that bzr is not biased towards centralization? In git you 
can just pull (fetch) to check if there were any changes, and if there 
were not you don't get useless marker-merges.


Take for example two simple git scenarios:
1. Single-branch repository. We have two clones of the same repository,
both with only one branch, 'master', both working on this branch, and
both considered equal. If only one person worked on the branch, "pull"
would result in a fast-forward. If both worked on the branch, "pull"
would result in a merge. This is the "diamond" example by Pasky, which
explains why git doesn't treat the first parent as special: because of
fast-forward. Bzr treats the first parent/mainline/"the branch" as
special, and therefore generates superficial merge commits if we want to
preserve revnos; BTW doesn't "pull" clobber your changes?

2. But the preferred git workflow is to have two branches in each of the
two clones. The 'origin' branch is where you fetch changes from the other
repository (a so-called "tracking branch"), and you don't commit your own
changes to it (by convention, as git doesn't protect the branch from
being committed to, although it would refuse to fetch in a
non-fast-forward case unless forced). You put your work in the 'master'
branch, and you merge the 'origin' branch into 'master'. This allows, for
example, fetching changes into 'origin' but _not_ merging them immediately
into 'master', if you are in the middle of some larger work but want to
check what the other side did, so as not to create a conflict
unnecessarily.
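
Both scenarios can be sketched roughly as below, assuming a reasonably
recent git. The repository and file names are invented, and in current
git the tracking branch is the remote-tracking ref origin/<branch> rather
than a local 'origin' branch, so the commands differ slightly from the
2006 layout described above:

```shell
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q upstream
git -C upstream config user.name "Upstream"
git -C upstream config user.email "upstream@example.invalid"
echo base > upstream/file
git -C upstream add file
git -C upstream commit -q -m 'base'
branch=$(git -C upstream symbolic-ref --short HEAD)

git clone -q upstream clone
git -C clone config user.name "Clone"
git -C clone config user.email "clone@example.invalid"

# Only the other side worked: fetch, look, then merge.
echo more >> upstream/file
git -C upstream commit -q -am 'upstream: more'
git -C clone fetch -q origin
git -C clone log --oneline "$branch..origin/$branch"   # inspect before merging
git -C clone merge -q --no-edit "origin/$branch"
merges_after_ff=$(git -C clone rev-list --count --merges HEAD)

# Both sides worked: the same commands now create a real merge commit.
echo theirs > upstream/other
git -C upstream add other
git -C upstream commit -q -m 'upstream: other'
echo ours > clone/local
git -C clone add local
git -C clone commit -q -m 'clone: local'
git -C clone fetch -q origin
git -C clone merge -q --no-edit "origin/$branch"
merges_after_diverge=$(git -C clone rev-list --count --merges HEAD)

echo "merge commits: $merges_after_ff, then $merges_after_diverge"
```

In the first case the merge is a plain fast-forward and no marker merge
commit is recorded; only the diverged case adds one.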

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
  2006-10-17 11:38               ` Sean
  2006-10-17 11:38               ` Sean
@ 2006-10-21 14:13               ` Jan Hudec
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 14:13 UTC (permalink / raw)
  To: Sean; +Cc: Matthieu Moy, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Tue, Oct 17, 2006 at 07:38:39AM -0400, Sean wrote:
> On Tue, 17 Oct 2006 13:19:08 +0200
> Matthieu Moy <Matthieu.Moy@imag.fr> wrote:
> 
> > 1) a working tree without any history information, pointing to some
> >    other location for the history itself (a la svn/CVS/...).
> >    (this is "light checkout")
> 
> Git can do this from a local repository, it just can't do it from
> a remote repo (at least over the git native protocol).  However,
> over gitweb you can grab and unpack a tarball from a remote repo.
> In practice this is probably enough support for such a feature.
> 
> > 2) a bound branch. It's not _very_ different from a normal branch, but
> >    mostly "commit" behaves differently:
> >    - it commits both on the local and the remote branch (equivalent to
> >      "commit" + "push", but in a transactional way).
> >    - it refuses to commit if you're out of date with the branch you're
> >      bound to.
> >    (this is "heavy checkout")
> 
> This doesn't sound right, at least in the spirit of git.  Git really
> wants to have a local commit which you may or may not push to a
> remote repo at a later time.  There is no upside to forcing it all to
> happen in one step, and a lot of downsides.  Gits focus is to support
> distributed offline development, not requiring a remote repo to be
> available at commit time.

While there is no upside to forcing it all to _always_ happen in one
step, there are good reasons to allow it in particular cases.

The most common is if you work on something from two different computers
(at home and at work or from desktop or notebook or similar cases) and
want to be sure you don't forget to synchronize your changes.

You can always unbind the branch or do a commit --local, which allows
doing a local commit anyway (eg. when disconnected) and then the next
commit will require a merge if the branches diverged.

> > In both cases, this has the side effect that you can't commit if the
> > "upstream" branch is read-only. That's not fundamental, but handy.
> 
> Again this seems really anti-git.  There is no reason for your local
> branch to be marked read only just because some upstream branch is
> so marked.

Again, it only is if you want it to be and opt for it. E.g. people who
often have many terminals with different current directories may use it
to protect themselves from accidentally running commands in the wrong
one. You don't have to use it if you don't want to.

> > I use it for example to have several "checkouts" of the same branch on
> > different machines. When I commit, bzr tells me "hey, boss, you're out
> > of date, why don't you update first" if I'm out of date. And if commit
> > succeeds, I'm sure it is already commited to the main branch. I'm sure
> > I won't pollute my history with merges which would only be the result
> > of forgetting to update.
> 
> This is exactly the same in Git.  You really only ever push upstream
> when your local changes fast forward the remote, (ie. you're up to date).
> Git will warn you if your changes don't fast forward the remote.

In bzr push and pull only work for the fast-forward case. They operate
on branches and actually apply the changes on the target. But that's a
different thing. Bound branches are mainly about not forgetting to
synchronize it.

> > The more fundamental thing I suppose is that it allows people to work
> > in a centralized way (checkout/commit/update/...), and Bazaar was
> > designed to allow several different workflows, including the
> > centralized one.
> 
> While Git really isn't meant to work in a centralized way there's nothing
> preventing such a work flow.  It just requires the use of some surrounding
> infrastructure.

Bzr is meant to be used in both ways, depending on user's choice.
Therefore it comes with that infrastructure and you can choose whether
you want to use it or not.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
@ 2006-10-21 14:23                   ` Sean
  2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 14:23                   ` Sean
  2006-10-21 18:34                   ` Jan Hudec
  2 siblings, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-21 14:23 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Matthieu Moy, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Sat, 21 Oct 2006 16:13:28 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> Bzr is meant to be used in both ways, depending on user's choice.
> Therefore it comes with that infrastructure and you can choose whether
> you want to use it or not.

From what we've read on this thread, bzr appears to be biased towards
working with a central repo.  That is the model that supports the use of
revnos etc that the bzr folks are so fond of.   However Git is perfectly
capable of being used in any number of models, including centralized.
Git just doesn't make the mistake of training new users into using
features that are only stable in a limited number of those models.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-18 16:31                                   ` Aaron Bentley
@ 2006-10-21 15:56                                     ` Jan Hudec
  2006-10-21 16:13                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 15:56 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, Petr Baudis, Carl Worth, git

On Wed, Oct 18, 2006 at 12:31:52PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Jakub Narebski wrote:
> > Aaron Bentley wrote:
> > 
> >>Carl Worth wrote:
> >>>There are even more important reasons to prefer a series of
> >>>micro-commits over a mega-patch than just ease of merging.
> >>
> >>A bundle isn't a mega-patch.  It contains all the source revisions.  So
> >>when you merge or pull it, you get all the original revisions in your
> >>repository.
> > 
> > 
> > But what patch reviewer see is a mega-patch showing the changeset
> > of a whole "bundle", isn't it?
> > [...]
> 
> Yes.  Carl was saying that, aside from the issue of what a reviewer
> sees, a bundle is bad for other reasons.  I am saying those other
> reasons don't apply.  I wasn't addressing the issue of what a reviewer sees.
> 
> To me, seeing the individual patches is like reading a book where every
> page has a different word on it, and so it's hard to put it together
> into a full sentence.  I'm not saying my way is The Right Way, just my
> personal preference.
> 
> For larger pieces of work, we try to split them up into logical units,
> and merge those units independently.
> 
> The Bundle format can also support a patch-by-patch output, but we don't
> have UI to select that.

As for what the reviewer wants to see, I think it depends on what kind
of code it is. Kernel code is complex and does not have (at least I have
not heard of) unit-tests, so short patches are preferable for review.
And since C is one of the more verbose languages, short patches mean
splitting them up into several pieces.

On the other hand bzr has unit-tests and python is less verbose, so the
single patch for a feature is not so big and is manageable. The patches
to bzr still come in logical steps, but usually one step per feature is
enough.

Also programmers usually don't develop even a single logical step as a
single commit. Instead they also commit to back up their work, when they
try something they think they may return to in the future, when they
need to continue on another computer and so on. And these commits are
generally not logical steps. Also the steps are often not in a logical
order. Therefore showing a diff for each commit in the bundle often does
not make sense.

So there is one bundle per logical step, and it therefore has a summary
diff. Individual bundles for individual steps are preferable anyway,
since the maintainer may decide to accept just some of them.  A tool to
generate a series of bundles (either each with just one commit or each
with several commits) would be possible; it is just that no one has been
interested enough to write it yet.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 15:56                                     ` Jan Hudec
@ 2006-10-21 16:13                                       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:13 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Aaron Bentley, Matthieu Moy, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, Petr Baudis, Carl Worth, git

Jan Hudec wrote:

> Also programmers usually don't develop even the single logical step as a
> single commit. Instead they they also commit to backup their work,

In git you can back up your work on a temporary branch; besides, there
is git commit --amend to correct the last commit.

> when they try something they think they may in future return, when they
> need to continue on another computer and so on. And these commits are
> generally not logical steps. Also the steps are often not in a logical
> order. Therefore showing diff for each commit in the bundle often does
> not make sense.

That is why before sending patch series based on some feature branch,
you should at least rebase the branch on top of current work, to ensure
that the series would apply cleanly.

If feature branch/patch series needs cleanup (going from "answer" to
"solution" http://lkml.org/lkml/2005/4/7/176), i.e. patch (commit)
reordering, joining two patches into one, patch splitting, you can
use git-cherry-pick, git-cherry-pick --no-commit and git commit --amend
combination, or git-format-patch, patch editing and reordering, and git-am.
Or just use StGit or pg.
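
A minimal sketch of the "joining two patches into one" case, using the
git-cherry-pick, git-cherry-pick --no-commit and git commit --amend
combination on a throwaway repository. The branch name "feature-clean",
the file names and the identity settings are made up for illustration;
--no-edit assumes a reasonably recent git.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.invalid"
echo base > base.txt
git add base.txt
git commit -q -m "base"

# A messy feature branch: the real change plus a later "oops" fix.
git checkout -q -b feature
echo "feature line" > feature.txt
git add feature.txt
git commit -q -m "Add feature"
echo "forgotten line" >> feature.txt
git commit -q -a -m "oops, forgot a line"

# Rebuild it as one clean commit on a new branch (hypothetical name).
git checkout -q -b feature-clean master
git cherry-pick feature~1 >/dev/null    # replay "Add feature" as-is
git cherry-pick --no-commit feature     # apply the fix without committing
git commit -q --amend --no-edit         # fold it into the previous commit
```

The original 'feature' branch is left untouched, so the cleaned-up
series can be compared against it with "git diff feature feature-clean"
before anything is sent out.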

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:23                   ` Sean
@ 2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 16:31                       ` Jakub Narebski
                                         ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:19 UTC (permalink / raw)
  To: Sean
  Cc: Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 16:13:28 +0200
> Jan Hudec <bulb@ucw.cz> wrote:
>
> > Bzr is meant to be used in both ways, depending on user's choice.
> > Therefore it comes with that infrastructure and you can choose whether
> > you want to use it or not.
>
> From what we've read on this thread, bzr appears to be biased towards
> working with a central repo.  That is the model that supports the use of
> revnos etc that the bzr folks are so fond of.   However Git is perfectly
> capable of being used in any number of models, including centralized.
> Git just doesn't make the mistake of training new users into using
> features that are only stable in a limited number of those models.

This is just plain wrong.

bzr is a fully decentralized VCS. I've read this thread for quite some
time now and I really cannot understand why people come to this
conclusion.

However, if you do want to work centralized, bzr has commands that
fit that workflow really well.


/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:08                                                 ` Jakub Narebski
@ 2006-10-21 16:31                                                   ` Erik Bågfors
  2006-10-21 16:59                                                     ` Jakub Narebski
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  1 sibling, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:31 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

> There _are_ terminology conflicts. For example bzr "branch" is roughly
> equivalent to one-branch git "repository";

Agreed.

> bzr "repository" is just
> collection of branches sharing common storage,
Agreed

> which is similar to set
> of git "repositories" with .git/objects/ linked to common object
> repository (storage area) or appropriately set alternates file
> (although that is not common usage in git, and for example you would
> have to be carefull with running git-prune); bzr "lightweight checkout"
> is equivalent to nonexistent "lazy clone"/"remote alternates" discussed
> on git mailing list but not implemented because of performance
> concerns; bzr "normal checkout" is I think similar to git "shared
> clone" (but shared clone is limited to repositories on the same
> filesystem); bzr "heavyweight checkout" is roughly equivalent to
> one-branch-only "clone" in git or cg (cg = Cogito).

This is wrong. There are two kinds of checkouts
lightweight.. and "normal/heavyweight".

I think you are getting this a little wrong, and I think the reason is
that you are thinking of repositories, while in bzr you normally think
of branches.

For example, I think (correct me if I'm wrong) that if I have a git
repository of an upstream linux-repo (Linus' for example),  I guess
I'll use "pull" to keep my copy up to date with the upstream repo? If
I then would like to hack something special, I would "clone" the repo
and get a new repo and that's where I do my work.  Is that correct?

In bzr you never (well...)  clone a full repository, but you clone one
line-of-development (a branch).  So "bzr branch"  is always a
"one-branch-only "clone" in git or cg".

"bzr checkout" is a "bzr branch" followed by a setting saying
"whenever you commit here, commit in the master branch also".

"bzr checkout --lightweight" is a way to get only a snapshot of the
working tree out of a branch. Whenever you commit, it's done in the
remote branch.

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:19                     ` Erik Bågfors
@ 2006-10-21 16:31                       ` Jakub Narebski
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
  2006-10-21 21:04                       ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:31 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Sean, Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy

Erik Bågfors wrote:
> On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
>> On Sat, 21 Oct 2006 16:13:28 +0200
>> Jan Hudec <bulb@ucw.cz> wrote:
>>
>>> Bzr is meant to be used in both ways, depending on user's choice.
>>> Therefore it comes with that infrastructure and you can choose whether
>>> you want to use it or not.
>>
>> From what we've read on this thread, bzr appears to be biased towards
>> working with a central repo.  That is the model that supports the use of
>> revnos etc that the bzr folks are so fond of.   However Git is perfectly
>> capable of being used in any number of models, including centralized.
>> Git just doesn't make the mistake of training new users into using
>> features that are only stable in a limited number of those models.
> 
> This is just plain wrong.
> 
> bzr is a fully decentralized VCS. I've read this thread for quite some
> time now and I really cannot understand why people come to this
> conclusion.
> 
> However, if you do want to work centralized, bzr has commands that
> fits that workflow really good.

Read carefully: bzr is _biased_ towards working with a central repository.
The default workflow of bzr (for example using revnos, or using "merge"
for one repository and "pull" for another) is geared towards a star
topology, i.e. some centralized repository.

That said, it is supposed to be able to work in a fully decentralized
way, using revids. But then, for example, you don't have a "simple rev
namespace" (moreover you have a _worse_ namespace than git's sha1 ids).

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
@ 2006-10-21 16:35                         ` Erik Bågfors
       [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
  0 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-21 16:35 UTC (permalink / raw)
  To: Sean
  Cc: Jan Hudec, Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 18:19:54 +0200
> "Erik Bågfors" <zindar@gmail.com> wrote:
>
> > This is just plain wrong.
> >
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
> >
> > However, if you do want to work centralized, bzr has commands that
> > fits that workflow really good.
>
> Have you been reading this thread at all?

Yes.

> Even the bzr people have now
> stated rather firmly that the revno scheme doesn't work very well in
> a number of situations.  Numerous examples have been given where the
> revno will be useless, or worse misleading when bzr is used without
> a central server.  The answer from the bzr folks has been then don't
> use the revno in those situations.  However, it's quite clear from the
> bzr UI that there is a _bias_ towards using revno's.
>
> So yes, clearly you can use bzr without a central server; but it's just
> as clearly biased against such usage.

So... I do agree that revnos might not fit perfectly in at all times.
But that they automatically mean that bzr is not a decentralized VCS,
I strongly disagree with.  They are just one part of the equation.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:05                                                   ` Jakub Narebski
  2006-10-21 13:15                                                     ` Jan Hudec
@ 2006-10-21 16:56                                                     ` Aaron Bentley
  2006-10-21 17:03                                                       ` Jakub Narebski
  2006-10-21 17:31                                                       ` Linus Torvalds
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-21 16:56 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Jan Hudec wrote:
> 
>> On Fri, Oct 20, 2006 at 12:05:35AM -0400, Aaron Bentley wrote:
>>> Tim Webster wrote:
>>>> Also svn does not allow files in the same directory to live in
>>>> multiple repos
>>> It would surprise me if many SCMs that support atomic commit also
>>> support intermixing files from multiple repos in the same directory.
>> In fact I think svk would. You would have to switch them by setting
>> an environment variable, but it's probably doable. That is because
>> unlike other version control systems, it does not store the information
>> about checkout in the checkout, but in the central directory and that
>> can be set. I don't know git well enough to tell whether git could do
>> the same by setting GIT_DIR.
> 
> You can very simply embed one "clothed" repository into another in GIT,
> like shown below
> 
>   project/.git
>   project/subdir/
>   project/subdir/file
>   project/subproject/
>   project/subproject/.git
>   project/subproject/file
>   ...
> 
> It depends on circumstances if one wants files belonging to subdirectory
> be ignored by top repository. You would want to ignore .git/ directory,
> though.

Any SCM worth its salt should support that.  AIUI, that's not what Tim
wants.  He wants to intermix files from different repos in the same
directory.

i.e.

project/file-1
project/file-2
project/.git-1
project/.git-2

So file-1 would be in the .git-1 repository, but file-2 would be in the
.git-2 repository.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOlE70F+nu1YWqI0RAvNcAJ0Rd6ovGoBNtKxcPNOrMH1yc+bzWQCfQlqT
hREsUmCBAW8mIYzfzdnqZqU=
=unGE
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:31                                                   ` Erik Bågfors
@ 2006-10-21 16:59                                                     ` Jakub Narebski
  2006-10-21 17:41                                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 16:59 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Erik Bågfors wrote:
> Jakub Narebski wrote:
>>
>> There _are_ terminology conflicts. For example bzr "branch" is roughly
>> equivalent to one-branch git "repository";
> 
> Agreed.
> 
>> bzr "repository" is just
>> collection of branches sharing common storage,
>
> Agreed

What is worse (in comparing git with bzr) is that there are no exact
equivalents. For example bzr "branch" is something between git
repository (clone of repository) and git branch. Bazaar-NG "repository"
is something like multi-branch git repository, but also like collection
of git repositories sharing object database.
 
>> which is similar to set
>> of git "repositories" with .git/objects/ linked to common object
>> repository (storage area) or appropriately set alternates file
>> (although that is not common usage in git, and for example you would
>> have to be carefull with running git-prune); bzr "lightweight checkout"
>> is equivalent to nonexistent "lazy clone"/"remote alternates" discussed
>> on git mailing list but not implemented because of performance
>> concerns; bzr "normal checkout" is I think similar to git "shared
>> clone" (but shared clone is limited to repositories on the same
>> filesystem); bzr "heavyweight checkout" is roughly equivalent to
>> one-branch-only "clone" in git or cg (cg = Cogito).
> 
> This is wrong. There are two kinds of checkouts
> lightweight.. and "normal/heavyweight".
> 
> I think you are getting this a little wrong, and I think the reason is
> that you are thinking of repositories, while in bzr you normally think
> of branches.

As I said: conflict of concepts. And perhaps philosophies.

> For example, I think (correct me if I'm wrong) that if I have a git
> repository of a upstream linux-repo (Linus' for example).  I guess
> I'll use "pull" to keep my copy up to date with the upstream repo? If
> I then would like to hack something special, I would "clone" the repo
> and get a new repo and that's where I do my work.  Is that correct?

Not exactly.

To work for example on Linus' version of Linux kernel you clone upstream
linux-repo git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
The working area is associated with the repository in Git, not with a
"branch" like in Bazaar-NG. In the default configuration the 'master'
(main) branch of the cloned repository (in the case of Linus' public repo
it is the only branch) corresponds to the 'origin' branch in your repository.

Now you can work on 'master' branch, putting your changes there. git-fetch
will update 'origin' branch to the current version of 'master' branch of
cloned repo; git-pull will additionally merge into 'master', i.e. merge
new changes into your work.

Now if you want to hack something special, that you prefer to use separate
branch for, you don't need to clone repository anew (although you could,
using --local --shared to reduce cost of cloning) but it is enough to
create new branch in your repository. You can very easily switch between
branches using the same working area (in bzr it would probably mean 
"branch checkout" to the same directory).
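
A minimal sketch of creating a separate branch in the same repository
and switching between branches in one working area, on a throwaway
repository. The branch name "special", the file name and the identity
settings are made up for illustration.

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.invalid"
echo stable > file.txt
git add file.txt
git commit -q -m "stable work"

# Start something special on its own branch, in the same working area.
git checkout -q -b special
echo experiment >> file.txt
git commit -q -a -m "experimental hack"
on_special=$(cat file.txt)

# Switch back; the working area shows 'master' again.
git checkout -q master
on_master=$(cat file.txt)
current=$(git rev-parse --abbrev-ref HEAD)
```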

> In bzr you never (well...)  clone a full repository, but you clone one

It's a pity... for example you usually want to have access to both
stable ('master') and development ('next') branches, perhaps
also to fixes ('maint') and beta stage development ('pu') branches.
In bzr it is a bit of work (to correctly set up a "repository"); in git
it is one command.

> line-of-development (a branch).  So "bzr branch"  is always a
> "one-branch-only "clone" in git or cg".

More or less.

> "bzr checkout" is a "bzr branch" followed by a setting saying
> "whenever you commit here, commit in the master branch also".

Git doesn't have exact equivalent here. For "bzr checkout" on
the same system, it is similar to setting common object repository;
for remote "bzr checkout" it might be approximated by hooks which
would push changes to remote repository (although we would have
to implement some transaction/journal framework).
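
A rough sketch of the hook approximation, assuming a post-commit hook
that pushes the current branch after every local commit. It is not
transactional (which is exactly the missing piece mentioned above): if
the push fails, the local commit still exists. The repository names
"central.git" and "checkout" and the identity settings are made up for
illustration.

```shell
set -e
work=$(mktemp -d)

# A bare repository standing in for the remote side (hypothetical name).
git init -q --bare "$work/central.git"

git clone -q "$work/central.git" "$work/checkout" 2>/dev/null
cd "$work/checkout"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.invalid"

# After every local commit, push the current branch to 'origin'.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Not transactional: the local commit is already made when this runs.
git push -q origin HEAD
EOF
chmod +x .git/hooks/post-commit

echo hello > file.txt
git add file.txt
git commit -q -m "first commit, pushed by the hook"

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$work/central.git" rev-parse master)
```

Unlike a bound branch, nothing here refuses the commit when the local
branch is out of date; the push simply fails afterwards.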

> "bzr checkout --lightweight" is a way to get only a snapshot of the
> working tree out of a branch. Whenever you commit, it's done in the
> remote branch.

Yes, but with "bzr checkout --lightweight" you also get a pointer
to the remote branch where changes are committed. Git doesn't have
something like that, at least not for a remote branch; mostly because of
poor performance or the need for a fast and constant network connection
to the source branch.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:56                                                     ` Aaron Bentley
@ 2006-10-21 17:03                                                       ` Jakub Narebski
  2006-10-21 17:31                                                       ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:03 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> AIUI, that's not what Tim  wants.  He wants to intermix files from
> different repos in the same directory.
> 
> i.e.
> 
> project/file-1
> project/file-2
> project/.git-1
> project/.git-2
> 
> So file-1 would be in the .git-1 repository, but file-2 would be
> in the .git-2 repository.

Possible (as I said), although it would screw up automatic repository 
detection. So you would have to say "git --git-dir=.git-1 commit -a"
or "GIT_DIR=.git-2 git log -p; git diff; ...", i.e. specify repo
for each command.

Of course you would have to hide repositories from each other,
and probably it would be better to hide files provided by other
repository.
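
A minimal sketch of hiding the repositories and each other's files,
assuming the per-repository exclude file ($GIT_DIR/info/exclude) is
used for that. The wrapper functions repo1 and repo2 are made-up
helpers, and --work-tree is passed explicitly on the assumption that a
present-day git would otherwise treat a directory named .git-1 as a
bare repository.

```shell
set -e
work=$(mktemp -d)
proj="$work/project"
mkdir "$proj"
cd "$proj"

# One wrapper per repository (hypothetical helper names).
repo1() { git --git-dir="$proj/.git-1" --work-tree="$proj" "$@"; }
repo2() { git --git-dir="$proj/.git-2" --work-tree="$proj" "$@"; }

repo1 init -q
repo1 config user.name "Example User"
repo1 config user.email "user@example.invalid"
repo2 init -q
repo2 config user.name "Example User"
repo2 config user.email "user@example.invalid"

echo "file 1 in repo 1" > file-1
echo "file 2 in repo 2" > file-2
repo1 add file-1
repo1 commit -q -m "repo 1: add file-1"
repo2 add file-2
repo2 commit -q -m "repo 2: add file-2"

# Hide both repository directories and the other side's file.
mkdir -p .git-1/info .git-2/info
printf '%s\n' '/.git-*' '/file-2' >> .git-1/info/exclude
printf '%s\n' '/.git-*' '/file-1' >> .git-2/info/exclude

# Neither repository should now report the other's files as untracked.
repo1 status --short
repo2 status --short
```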
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:56                                                     ` Aaron Bentley
  2006-10-21 17:03                                                       ` Jakub Narebski
@ 2006-10-21 17:31                                                       ` Linus Torvalds
  2006-10-21 17:38                                                         ` Linus Torvalds
  2006-10-22  7:49                                                         ` Tim Webster
  1 sibling, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 17:31 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Jakub Narebski



On Sat, 21 Oct 2006, Aaron Bentley wrote:
> 
> Any SCM worth its salt should support that.  AIUI, that's not what Tim
> wants.  He wants to intermix files from different repos in the same
> directory.
> 
> i.e.
> 
> project/file-1
> project/file-2
> project/.git-1
> project/.git-2

Ok, that's just insane.

It's going to always result in problems (ie some files are going to be 
considered "untracked" depending on which repository you're looking at 
right then and there).

That said, if you _really_ want this, you can do it. Here's how:

	# Create insane repository layout
	mkdir I-am-insane
	cd I-am-insane

	# Tell people we want to work with ".git-1"
	export GIT_DIR=.git-1

	git init-db
	echo "This is file 1 in repo 1" > file-1
	git add file-1
	git commit -m "Silly commit" 

	# Now we switch repos
	export GIT_DIR=.git-2

	git init-db
	echo "This is another file in repo 2" > file-2
	git add file-2
	git commit -m "Silly commit in another repo"

and now you literally have two repositories in the same subdirectory, and 
they don't know about each other, and you can switch your "attention" 
between them by simply doing

	export GIT_DIR=.git-1

(or .git-2). Then you can just do "git diff" etc normally, and work in the 
repo totally ignoring the other one in the same directory structure.

Of course, things like "git status" that show untracked files will always 
then show the "other" repository files as untracked - the two things will 
really be _totally_ independent, they don't at any point know about each 
other's files, although they can actually _share_ checked-out files if you 
want to:

	echo "This is a shared file" > file-shared

	export GIT_DIR=.git-1
	git add file-shared
	git commit -m "Add shared file to repo 1"

	export GIT_DIR=.git-2
	git add file-shared
	git commit -m "Add shared file to repo 2"

and now if you change that file, both repositories will see it as being 
changed.

INSANE. And probably totally useless. But you can do it. If you really 
want to.

The git directories don't even have to be in the same subdirectory 
structure. You could have done

	export GIT_DIR=~/insane-git-setup/dir1

instead, and the git information for that thing would have been put in 
that subdirectory.

Note: the above literally creates two different repositories. You can do 
the same thing with a single object repository (so that any actual shared 
data shows up in a shared database) by still using different GIT_DIR 
variables, but using GIT_OBJECT_DIRECTORY to point to a shared database 
directory (which again could be anywhere - it could be under ".git-1", or 
it could be in a separate place in your home directory).

Or you could do it even _more_ differently by actually having just a 
single repository, and having two different branches in that repository, 
and just tracking them separately: in that case you would keep the same 
GIT_DIR/GIT_OBJECT_DIRECTORY (or keep them unset, which just means that 
they default to ".git" and ".git/objects" as normal), and then just switch 
the "index" file and the HEAD files around. That would mean that to switch 
from one "view" to the other, you'd do something like

	export GIT_INDEX_FILE=.git/index1
	git symbolic-ref HEAD refs/heads/branch1

to set your view to "branch1".

Anyway, I would strongly discourage people from actually doing anything 
like this. It should _work_, but quite frankly, if you actually want to do 
this, you have serious mental problems.

What's probably much better is to have two separate development 
repositories, and then perhaps mixing the end _result_ somewhere else. For 
example, you can use the

	git checkout-index -a -f --prefix=/usr/shared/result/

in both (separate) repositories, and you'll end up with basically a 
snapshot of the "union" in /usr/shared/result.
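
A small sketch of that export step, assuming two scratch repositories 
and a made-up result directory under a temporary base (standing in for 
/usr/shared/result/):

```shell
set -e
base=$(mktemp -d)

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

mkdir -p "$base/result"

for n in 1 2; do
    mkdir "$base/repo-$n"
    cd "$base/repo-$n"
    git init -q
    echo "from repo $n" > "file-$n"
    git add "file-$n"
    git commit -q -m "Add file-$n"
    # Write every file in the index below the prefix. The trailing
    # slash matters: the prefix is prepended to each path as-is.
    git checkout-index -a -f --prefix="$base/result/"
done

ls "$base/result"
```

With -f, a path that exists in both repositories would simply be 
overwritten by whichever export runs last.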

(Not that I see why you'd want to do that _either_, but hey, at least 
you're not going to be _totally_ confused by the end result).

Anyway. Git certainly allows you to do some really insane things. The 
above is just the beginning - it's not even talking about alternate object 
directories where you can share databases _partially_ between two 
otherwise totally independent repositories etc.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
@ 2006-10-21 17:33                             ` Erik Bågfors
  0 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-21 17:33 UTC (permalink / raw)
  To: Sean
  Cc: Matthieu Moy, bazaar-ng, Linus Torvalds, Jan Hudec, git, Jakub Narebski

On 10/21/06, Sean <seanlkml@sympatico.ca> wrote:
> On Sat, 21 Oct 2006 18:35:18 +0200
> "Erik Bågfors" <zindar@gmail.com> wrote:
>
>
> > So... I do agree that revnos might not fit perfectly in at all times.
> > But that they automatically mean that bzr is not a decentralized VCS,
> > I strongly disagree with.  They are just one part of the equation.
>
> Who are you strongly disagreeing with?  Nobody said it wasn't a
> decentralized VCS.  But there is a _clear_ bias towards using it
> with a central server.


Ok, I take that back :)

When I think "centralized" I think "everyone must commit to a central
repository"... which is not what we are talking about here...

/Erik
ps. Sean, your mailer does something weird with my last name in the
to-field, so I can't just hit "reply" without removing my name
first...

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:31                                                       ` Linus Torvalds
@ 2006-10-21 17:38                                                         ` Linus Torvalds
  2006-10-22  7:49                                                         ` Tim Webster
  1 sibling, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 17:38 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Jakub Narebski, bazaar-ng, git



On Sat, 21 Oct 2006, Linus Torvalds wrote:
> 
> 	# Tell people we want to work with ".git-1"
> 	export GIT_DIR=.git-1

Actually, I think Jakub's approach is better: you'd be better off doing 
this as

	alias git-1="git --git-dir=.git-1"
	alias git-2="git --git-dir=.git-2"

and now you should be able to just do

	git-1 diff

(or any other git command) and

	git-2 diff

and can happily share the same directory and mix git commands without 
changing an environment variable all the time.

That would still be insane, but it wouldn't likely be _quite_ as confusing 
(or error-prone in case you forgot to switch the variable).

			Linus

PS. I'd still _not_ suggest doing this. It should _work_, but I mean - 
really..

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 22:59                                               ` Jeff King
@ 2006-10-21 17:40                                                 ` Jan Hudec
  2006-10-21 17:51                                                   ` Jakub Narebski
  2006-10-21 18:42                                                   ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 17:40 UTC (permalink / raw)
  To: Jeff King; +Cc: bazaar-ng, git, Jakub Narebski

On Fri, Oct 20, 2006 at 06:59:17PM -0400, Jeff King wrote:
> On Fri, Oct 20, 2006 at 08:12:10PM +0200, Jan Hudec wrote:
> 
> > At this point, I expect the tree to look like this:
> > A$ ls -R
> > .:
> > data/
> > data:
> > hello.txt
> > A$ cat data/hello.txt
> > Hello World!
> 
> Git does what you expect here.
> 
> > A$ VCT mv data greetings
> > A$ VCT commit -m "Renamed the data directory to greetings"
> > B$ echo "Goodbye World!" > data/goodbye.txt
> > B$ VCT add data/goodbye.txt
> > B$ VCT commit -m "Added goodbye message."
> > A$ VCT merge B
> > 
> > And now I expect to have tree looking like this:
> > 
> > A$ ls -R
> > .:
> > greetings/
> > greetings:
> > hello.txt
> > goodbye.txt
> 
> Git does not do what you expect here. It notes that files moved, but it
> does not have a concept of directories moving.  Git could, even without
> file-ids or special patch types, figure out what happened by noting that
> every file in data/ was renamed to its analogue in greetings/, and infer
> that previously nonexistent files in data/ should also be moved to
> greetings/.
> 
> However, I'm not sure that I personally would prefer that behavior. In
> some cases you might actually WANT data/goodbye.txt, and in some other
> cases a conflict might be more appropriate. In any case, I would rather
> the SCM do the simple and predictable thing (which I consider to be
> creating data/goodbye.txt) rather than be clever and wrong (even if it's
> only wrong a small percentage of the time).
> 
> In short, git doesn't do what you expect, but I'm not convinced that
> it's a bug or lack of feature, and not simply a difference in desired
> behavior.

I still consider it a bug, but different problems of the file-id
solution have already been described in this thread that I consider bugs
as well.
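
For anyone who wants to replay the quoted scenario, here is a sketch 
with plain git commands in place of VCT. It assumes a scratch repository 
with both sides as branches "A" and "B". The merge.directoryRenames 
setting is not something from this thread: it is a knob in later Git 
releases, used here only to pin the merge to the behaviour Jeff 
describes, and older versions should simply ignore it.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
git symbolic-ref HEAD refs/heads/A
mkdir data
echo "Hello World!" > data/hello.txt
git add data/hello.txt
git commit -q -m "Initial commit"
git branch B

# Side A: rename the directory.
git mv data greetings
git commit -q -m "Renamed the data directory to greetings"

# Side B: add a new file in the old directory.
git checkout -q B
echo "Goodbye World!" > data/goodbye.txt
git add data/goodbye.txt
git commit -q -m "Added goodbye message."

# Merge B into A without inferring a directory rename.
git checkout -q A
git -c merge.directoryRenames=false merge -q -m "Merge B" B

# Show where the files ended up.
git ls-files | sort
```

Where goodbye.txt lands depends on the Git version and on that setting, 
so the listing is worth checking rather than assuming.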

Besides I start to think that it should be actually possible to solve
this case with the git-style approach. I have to state beforehand, that
I don't know how the most recent git algorithm works, but I imagine
there is some kind of 'brackets' saying the text is in a given file. Now
if those 'brackets' were not flat, but nested, ie. instead of saying
'this is in foo/bar' it would say 'this is in bar is in foo', the
difference when renaming directory would only affect the 'outer bracket'
and therefore merge correctly with adding content inside it.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:59                                                     ` Jakub Narebski
@ 2006-10-21 17:41                                                       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:41 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthew D. Fuller, bazaar-ng, Carl Worth, Andreas Ericsson, git

Note: instead of symlinking .git/objects/ objects database,
you can simply set and export GIT_OBJECT_DIRECTORY environment
variable.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:40                                                 ` Jan Hudec
@ 2006-10-21 17:51                                                   ` Jakub Narebski
  2006-10-21 19:20                                                     ` Jan Hudec
  2006-10-21 18:42                                                   ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 17:51 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Jeff King, bazaar-ng, git

Jan Hudec wrote:

> Besides I start to think that it should be actually possible to solve
> this case with the git-style approach. I have to state beforehand, that
> I don't know how the most recent git algorithm works, but I imagine
> there is some kind of 'brackets' saying the text is in a given file. Now
> if those 'brackets' were not flat, but nested, ie. instead of saying
> 'this is in foo/bar' it would say 'this is in bar is in foo', the
> difference when renaming directory would only affect the 'outer bracket'
> and therefore merge correctly with adding content inside it.

You mean, to consider the "contents" of a directory to be the union of
the contents of the files and directories it contains, and then use the
same "rename detection" algorithm as for files?

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 14:12                                             ` Jeff King
  2006-10-20 14:40                                               ` Jakub Narebski
@ 2006-10-21 17:57                                               ` Aaron Bentley
  2006-10-21 18:20                                                 ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-21 17:57 UTC (permalink / raw)
  To: Jeff King
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jeff King wrote:
> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
> 
>> What's nice is being able to see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revno on a
>> branch mirror and knowing how out-of-date it is.
> 
> I was accustomed to doing such things in CVS, but I find the git way
> much more pleasant, since I don't have to do any arithmetic:
>   diff d8a60^..d8a60

> Does bzr have a similar shorthand for mentioning relative commits?

Yes, you could e.g. do:

bzr diff -r before:753..753

Aaron

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOl9s0F+nu1YWqI0RAhW7AJ4vi4kgen/8h6j2AgueU+kcsmLrPwCeKry9
pp68K4rAmXjjkPvK32LvmPk=
=qDn2
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 16:31                                                   ` Erik Bågfors
@ 2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:41                                                     ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 18:11 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:
> > 
> > I think we're getting into scratched-record-mode on this.
>
>  [....]

Thank you for demonstrating my point   8-}


> When two clones of the same repository (in git terminology), or two
> "branches" (in bzr terminology), used by different people, cannot be
> totally equivalent that is centralization bias.

This is obviously some new meaning of "centralization" bearing no
resemblance whatsoever to how I understand the word.

In git, apparently, you don't give a crap about a branch's identity
(alternately expressible as "it has none"), and so you throw it away
all the time.  Given that, revnos even if git had them would never be
of ANY use to you, so it's no wonder you have no use for the notion.

I DO give a crap about my branches' identities.  I WANT them to retain
them.  If I have 8 branches, they have 8 identities.  When I merge one
into another, I don't WANT it to lose its identity.  When I merge a
branch that's a strict superset of second into that second, I don't
WANT the second branch to turn into a copy of the first.  If I wanted
that, I'd just use the second branch, or make another copy of it.  I
don't WANT to copy it.  I just want to merge the changes in, and keep
on with my branch's current identity.

Maybe that's what you mean by 'centralization'; each branch is central
to itself.  That seems a pretty useless definition, though.  In my
mind, actually, it's MORE distributed; my branch remains my branch,
and your branch remains your branch, and the difference doesn't keep
us from working together and moving changes back and forth.  Forcing
my branch to become your branch sounds a lot more "centralized" to me.


Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
model per se, and I can think of a lot of cases where it'd be really
handy.  But those aren't most of my cases.  And as long as we don't
agree on branch identity, it's completely pointless to keep yakking
about revnos, because they're a direct CONSEQUENCE of that difference
in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
have revnos, I'd STILL want my branch to keep its identity.  You could
name the mainline revisions after COLORS if you wanted, and I'd still
want my branch to keep its identity.  Aren't we through rehashing the
same discussion about the EFFECTS?


> > It refers both to the conceptual entity ("a line of development"
> > roughly, much like what 'branch' means in git and VCS in general),
> > and to the physical location (directory, URL)
> 
> I'd rather use other name then. Perhaps "forks" for physical
> "branch", i.e. branch metadata (like revno to revid mapping) +
> object repository or pointer to it + optionally working area/working
> files. 

It's the same name in bzr because branches are their location, not
their 'name'.  Every branch always has a location, and every location
refers to a branch (well, as long as it's a location that's meaningful
to bzr; "/etc/passwd" is a location, but it's nothing to do with bzr,
so it's not a branch.  Don't dawdle in irrelevancies).


> And you say that bzr is not biased towards centralization? In git
> you can just pull (fetch) to check if there were any changes, and if
> there were not you don't get useless marker-merges.

If I don't tell you my branch has something in it ready to grab, you
shouldn't merge it.  It probably won't work, and is quite likely to
set your computer on fire, slaughter and fillet your pet goldfish, and
make demons fly out of your nose.  If you wanna get stuck with all my
incomplete WIP, let's just use a CVS module and be done with it.


> 2. But the preferred git workflow is to have two branches in each of
> two clones. The 'origin' branch where you fetch changes from other
> repository (so called "tracking branch") and you don't commit your
> changes to [...]

Funny, since this reads to me EXACTLY like the bzr flow of "upstream
branch I pull" and "my branch I merge from upstream" that's getting
kvetched around...



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:57                                               ` Aaron Bentley
@ 2006-10-21 18:20                                                 ` Jakub Narebski
  2006-10-22 14:27                                                   ` Matthieu Moy
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 18:20 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jeff King, Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Jeff King wrote:
>> On Thu, Oct 19, 2006 at 09:06:40PM -0400, Aaron Bentley wrote:
>>
>>> What's nice is being able to see the revno 753 and knowing that "diff -r
>>> 752..753" will show the changes it introduced.  Checking the revno on a
>>> branch mirror and knowing how out-of-date it is.
>>
>> I was accustomed to doing such things in CVS, but I find the git way
>> much more pleasant, since I don't have to do any arithmetic:
>>   diff d8a60^..d8a60
> 
>> Does bzr have a similar shorthand for mentioning relative commits?
> 
> Yes, you could e.g. do:
> 
> bzr diff -r before:753..753

What about grandparent of commit (d8a60^^ or d8a60~2 in git),
or choosing one of the parents in merge commit (d8a60^2 is second
parent of a commit)? before:before:753 ?
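
To make the git side of the question concrete, here is a small sketch on 
a made-up history with one merge. The branch names and the commit 
messages c1, c2, c3 and s1 are invented for the example:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
git symbolic-ref HEAD refs/heads/trunk
git commit -q --allow-empty -m "c1"
git commit -q --allow-empty -m "c2"
git branch side
git commit -q --allow-empty -m "c3"

git checkout -q side
git commit -q --allow-empty -m "s1"

git checkout -q trunk
git merge -q --no-ff -m "merge side" side

m=$(git rev-parse HEAD)

git log -1 --format=%s "$m^"     # first parent:       c3
git log -1 --format=%s "$m^^"    # grandparent:        c2
git log -1 --format=%s "$m~2"    # same grandparent:   c2
git log -1 --format=%s "$m^2"    # second parent:      s1
```

The ^ and ~ suffixes always follow first parents unless a parent number 
is given, which is why $m^^ and $m~2 name the same commit.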

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
  2006-10-21 14:23                   ` Sean
  2006-10-21 14:23                   ` Sean
@ 2006-10-21 18:34                   ` Jan Hudec
       [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 18:34 UTC (permalink / raw)
  To: Sean; +Cc: Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On Sat, Oct 21, 2006 at 10:23:46AM -0400, Sean wrote:
> On Sat, 21 Oct 2006 16:13:28 +0200
> Jan Hudec <bulb@ucw.cz> wrote:
> 
> > Bzr is meant to be used in both ways, depending on user's choice.
> > Therefore it comes with that infrastructure and you can choose whether
> > you want to use it or not.
> 
> From what we've read on this thread, bzr appears to be biased towards
> working with a central repo.  That is the model that supports the use of
> revnos etc that the bzr folks are so fond of.   However Git is perfectly
> capable of being used in any number of models, including centralized.
> Git just doesn't make the mistake of training new users into using
> features that are only stable in a limited number of those models.

For one thing, I, like others already expressed, think a difference should
be made between 'centralized' and 'star-topology'. Subversion is
centralized -- I don't think bzr is biased towards that kind of
centralization, though it provides tools (bound branches) to make it
easy.

I would agree it IS biased towards viewing branches as organized in a
hierarchy, while git strictly treats them as equal peers, which I'd call
star-topology (and I don't think it is because it _has_ revnos, but
because the user interface strongly favors them over revids).

On the other hand git is biased away from centralized (as in subversion
is centralized) in that it takes extra work to make sure you are always
synchronized (while bzr has bound branches to do the checking for you).
For open-source development, centralized is a wrong way to go, but
people use version control tools for other purposes as well and for some
of them staying synchronized is important.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:40                                                 ` Jan Hudec
  2006-10-21 17:51                                                   ` Jakub Narebski
@ 2006-10-21 18:42                                                   ` Linus Torvalds
  2006-10-21 19:21                                                     ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 18:42 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Jeff King, bazaar-ng, git, Jakub Narebski



On Sat, 21 Oct 2006, Jan Hudec wrote:
>
> [ On not moving files that weren't moved originally, but whose
>   directories were moved ]
> 
> I still consider it a bug, but different problems of the file-id
> solution have already been described in this thread that I consider bugs
> as well.
> 
> Besides I start to think that it should be actually possible to solve
> this case with the git-style approach.

It's certainly _possible_ to figure out, but one reason git does what it 
does is that it's just simpler (ie just ignore the whole "directory move" 
situation entirely, and just consider it to be "many files moved"). 

Another reason is that this really is an ambiguous case. When the 
directory was moved, the file in question really didn't exist. So when it 
was created independently of the move, it really _is_ somewhat ambiguous 
whether the intention was to move it with the other files or whether the 
new creation point is the right one.

I think that for a human, the details would likely be obvious (and I 
suspect that in most cases it would indeed move with the directory). But 
it really isn't totally clear: what does moving a directory imply for the 
future? Does it imply that the directory should never exist in the future, 
or does it just imply that the _current_ contents move?

Git "tends to" have a policy of not caring about directories at all. For 
example, git will not track an empty directory by default. You _can_ make 
it track one in your commits (the data structures support it), but you're 
really just better off thinking of git as tracking individual files, 
and not really directories. So as far as git is concerned, "directories" 
mostly don't really have any existence on their own, they only exist as 
paths to reach files.

In that kind of mindset, renaming a directory really is about renaming the 
files that are in that directory, and that explains the git behaviour. It 
may not necessarily be what you expect, but it _is_ consistent, and it's 
not really "wrong" either. It's just another way of looking at the thing.

Also, I'd like to point out that people worry way too much about merges. 
There are much harder merge conflicts to fix up. If you notice that things 
didn't go the way you expected in a merge, even if it was done 
automatically, you can just do a

	git mv unexpected/directory/file expected/directory/file
	git commit --amend

which basically "fixes up" the automatic merge (that's what the "--amend" 
means: it means "re-do the last commit with _this_ state instead").

(Of course, you could also just make a separate commit to move the file, 
but I think the "manual fixup of the merge" is just cleaner - just add a 
note in the commit message to say you fixed it up by hand. When you do 
your "git commit --amend", it will automatically just give you an editor 
to edit up the commit message too while you're at it).
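
A sketch of that fixup, assuming a scratch repository. To keep it 
short, the commit being amended here is an ordinary commit standing in 
for the merge, and -m is passed so that no editor pops up. Note that 
the destination directory has to exist before "git mv" is run.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
mkdir -p unexpected/directory expected/directory
echo content > unexpected/directory/file
git add unexpected/directory/file
git commit -q -m "Commit that left the file in the wrong place"

# Move the file and fold the move into the last commit.
git mv unexpected/directory/file expected/directory/file
git commit -q --amend -m "Same commit, file moved by hand"

git rev-list --count HEAD
git ls-files
```

The history still has a single commit afterwards. The amended commit 
replaces the old one rather than adding to it.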

So again: merges are certainly fairly "hard" from a SCM standpoint, but 
from a user standpoint, they tend to be not at all as important. I would 
again argue that more important than the merge itself (which you can 
trivially just fix up to match your expectations) is to make it easy to 
later _show_ what happened, ie if you examine the file later, you should 
be able to see where it came from.

(And again, with git, things like "git pickaxe" - think of it as just a 
"better annotate" - will indeed pick up the similarity, regardless of 
whether the rename was done manually or automatically as part of the 
merge - exactly because git only really cares about actual contents).

Btw, just to be honest: git _mostly_ thinks in terms of "constant 
pathname patterns" as opposed to "individual paths that move around". 
That's at least partly because of how I work. I actually fairly seldom 
look at an individual file, and tend to much more often look at a group of 
files, and then it's a _lot_ more convenient to do

	gitk drivers/usb include/linux/usb*

where those argument pathnames are _not_ a set of filenames that we track, 
but really something more generic, namely a "repository pathname subset" 
which is constant. The above will show the _subset_ of the kernel 
repository history that is relevant for all the named pathnames, but the 
pathnames are _fixed_. It won't follow files that move out of the 
subdirectories: it will show the history as seen from the viewpoint of a 
certain subset of pathnames.

This also extends to things like "git log". So when you do

	git log kernel/sched.c

if you have a "file ID" mentality, you expect the above to follow renames. 
It doesn't - even though git -can- follow renames, what the above actually 
_means_ is "show the log for the fixed pathname set that only includes one 
single path". 

So if "kernel/sched.c" had originally been called something else, the 
above wouldn't show the rename at all. It would just show that "oh, this 
pathname suddenly was created as a new file", because from the viewpoint 
of that fixed pathname, that's _exactly_ what happens.

We've discussed adding a "--follow" flag to tell "git log" to consider the 
argument to not be a "pathname filter", but an "individual file" kind of 
thing, and I think there was even a patch for it, but I suspect it hasn't 
been a big issue, probably partly because you get rather used to the 
"pathname filter" approach fairly quickly. If you knew what the old 
pathname was, for example, you could get git to _tell_ you about the 
rename by doing

	git log -M -- <set-of-all-pathnames-we're-interested-in-old-included>

and git would happily see the renames that happen _within_ that pathname 
filter (the "-M" is there because by default "git log" doesn't show any 
patches at all, of course, so if you want to see the rename, you need to 
tell git so).
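
A small sketch of the difference, assuming a scratch repository with a 
single rename in it. The file names old.c and new.c are made up, and 
--name-status is used just to keep the output to one line per file:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
echo "int schedule(void);" > old.c
git add old.c
git commit -q -m "Add old.c"
git mv old.c new.c
git commit -q -m "Rename old.c to new.c"

# Filter on the new name only: the rename looks like a plain creation.
one=$(git log -M --name-status --pretty=format: -- new.c)

# Filter on both names: the rename is inside the filter, so -M sees it.
both=$(git log -M --name-status --pretty=format: -- old.c new.c)

echo "new.c only:"
echo "$one"
echo "old.c and new.c:"
echo "$both"
```

The first listing should report new.c as added, the second should report 
a rename from old.c to new.c.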

As a particular example of this behaviour, if you do

	git log -M kernel/

you'll always see any renames that happen _within_ that subdirectory, but 
any files that are moved into (or out of) the subdirectory will be 
considered to be "create" or "delete" events - because you've literally 
told git to ignore all history that is not relevant to the kernel/ 
subdirectory (so they really _are_ "create/delete" events as far as that 
subdirectory is concerned).
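
The same thing for a directory filter, again only as a sketch on a 
scratch repository with invented file names: one file is renamed inside 
kernel/, another is moved into kernel/ from lib/.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
mkdir kernel lib
echo "contents of a" > kernel/a.c
echo "contents of b" > lib/b.c
git add kernel/a.c lib/b.c
git commit -q -m "Initial files"

git mv kernel/a.c kernel/c.c   # rename within kernel/
git mv lib/b.c kernel/b.c      # move into kernel/ from outside
git commit -q -m "Shuffle files around"

seen=$(git log -M --name-status --pretty=format: -- kernel/)
echo "$seen"
```

The rename inside kernel/ should show up as a rename, while kernel/b.c 
should show up as a new file, because lib/ is outside the filter.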

Is this different from other SCM's? Hell yes. git does a lot of things 
differently. Is it useful? Again, hell yes. Especially for a maintainer, 
the ability to talk about pathname _patterns_ is generally much more 
important than talking about any particular file.

[ The pathname thing also means that it's trivial to ask questions like 
  "ok, so what happened to file xyz that I _know_ we used to have, but 
  clearly don't have any more?".

  You just do "git log -- xyz", and you'll see exactly what you wanted to 
  see. The "--" here (and in a previous example) is because to avoid 
  ambiguity, git requires that if you name files that don't actually 
  exist, you make it clear that they are filenames, not just mistyped 
  revision ID's or something else. ]
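
A quick sketch of the "what happened to xyz" question, on a scratch 
repository where a file called xyz is added and then removed:

```shell
set -e
work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q
echo data > xyz
git add xyz
git commit -q -m "Add xyz"
git rm -q xyz
git commit -q -m "Remove xyz"

# Without "--" git cannot tell whether xyz is a revision or a path,
# because no such file exists in the working tree any more.
git log --oneline xyz 2>/dev/null || echo "ambiguous without --"

# With "--" it is clearly a path, and both commits show up.
git log --oneline -- xyz
```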

In general, git gives you the best of both worlds. It knows how to follow 
individual files if you want to, but by default it uses this much more 
generic concept of "pathname filters". The default is definitely 
influenced both by my usage, and my (obviously very strong) opinions on 
what is more important (and thus the git "mental model").

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
@ 2006-10-21 18:47                       ` Sean
  2006-10-21 18:47                       ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-21 18:47 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Linus Torvalds, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On Sat, 21 Oct 2006 20:34:28 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> For one thing, I, like others already expressed, think a difference should
> be made between 'centralized' and 'star-topology'. Subversion is
> centralized -- I don't think bzr is biased towards that kind of
> centralization, though it provides tools (bound branches) to make it
> easy.

A star-topology assumes there is a central server from which the points
of the star emerge.  It is very much a centralized model and one that
bzr is clearly optimized for.  The difference between bzr and say
cvs is that bzr provides offline abilities where checkins to the
central server can be deferred by checking them in locally first.

The bzr bias towards this model is implicit in its affection for
revnos, which depend on a central repository to synchronize them for
all the points of the star.

[...]
> On the other hand git is biased away from centralized (as in subversion
> is centralized) in that it takes extra work to make sure you are always
> synchronized (while bzr has bound branches to do the checking for you).
> For open-source development, centralized is a wrong way to go, but
> people use version control tools for other purposes as well and for some
> of them staying synchronized is important.

Please reconsider this point: Git can be configured to push every commit
to a central server immediately.  It's just that such a model is so inferior
in almost every way, that it's not typically done.
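
One way this could be set up is a post-commit hook, sketched here 
against a local bare repository standing in for the central server. The 
paths and the remote name are made up, and this is only an illustration, 
not a recommended workflow:

```shell
set -e
base=$(mktemp -d)

# Placeholder identity so the demo commits work on a fresh machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# The "central server": a bare repository on the local disk.
git init -q --bare "$base/central.git"

# A developer repository that treats it as origin.
mkdir "$base/dev"
cd "$base/dev"
git init -q
git remote add origin "$base/central.git"

# After every commit, push the current branch to origin.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
git push -q origin HEAD
EOF
chmod +x .git/hooks/post-commit

echo hello > file
git add file
git commit -q -m "First commit, pushed by the hook"

# The central repository now has the same commit on the same branch.
branch=$(git symbolic-ref HEAD)
git rev-parse HEAD
git --git-dir="$base/central.git" rev-parse "$branch"
```

If the push fails (server down, non-fast-forward), the local commit 
still exists, so this is "commit then sync", not a true lock-step 
central commit.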

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-17 19:51             ` Aaron Bentley
@ 2006-10-21 18:58               ` Jan Hudec
       [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
  0 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 18:58 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: Sean, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Tue, Oct 17, 2006 at 03:51:56PM -0400, Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Sean wrote:
> > On Tue, 17 Oct 2006 00:24:15 -0400
> > Aaron Bentley <aaron.bentley@utoronto.ca> wrote:
> >>- - you can use a checkout to maintain a local mirror of a read-only
> >>  branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> > 
> > 
> > I'm not sure what you mean here.  A bzr checkout doesn't have any history
> > does it?
> 
> By default, they do.  You must use a flag to get a checkout with no history.

If I can add some clarification: There is a lightweight checkout and
heavyweight checkout. The former contains no history and does everything
(except status and I am not sure about diff) by accessing the remote
data. The latter contains a mirror of the history data and does
write-through on commit (and otherwise behaves like a normal branch with
a repository).

What would be really useful would be a checkout, or even a branch (ie.
with ability to commit locally), that would only contain history data
since some point. This would allow downloading very little data when
branching, but then working locally as with a normal repository clone.

In bzr this was already discussed and the storage supports so called
"ghost" revisions, whose existence is known, but not their data. There
are even repositories around that contain them (created by converting
data from arch), but to my best knowledge there is no user interface to
create branches or checkouts with partial data.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
@ 2006-10-21 19:02                   ` Sean
  2006-10-21 19:02                   ` Sean
  1 sibling, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-21 19:02 UTC (permalink / raw)
  To: Jan Hudec; +Cc: Aaron Bentley, Linus Torvalds, bazaar-ng, git, Jakub Narebski

On Sat, 21 Oct 2006 20:58:25 +0200
Jan Hudec <bulb@ucw.cz> wrote:

> In bzr this was already discussed and the storage supports so called
> "ghost" revisions, whose existence is known, but not their data. There
> are even repositories around that contain them (created by converting
> data from arch), but to my best knowledge there is no user interface to
> create branches or checkouts with partial data.

In Git the same functionality can be achieved with so called shallow-
clones.  Unfortunately, they've only been discussed and not yet
implemented.
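
For readers on later Git releases: shallow clones were subsequently
implemented and are exposed through "git clone --depth". A small sketch,
assuming a throwaway three-commit repository under a temporary directory
(all paths and the identity are illustrative):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Build a source repository with three commits.
git init -q "$tmp/src"
cd "$tmp/src"
for n in 1 2 3; do
    echo "$n" > file
    git add file
    git commit -q -m "commit $n"
done

# --depth is ignored for plain local paths, hence the file:// URL.
git clone -q --depth 1 "file://$tmp/src" "$tmp/shallow"

# Count the commits reachable in each repository.
full=$(git rev-list HEAD | wc -l)
shallow=$(git --git-dir="$tmp/shallow/.git" rev-list HEAD | wc -l)
echo "full history: $full commits; shallow clone: $shallow"
```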

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:11                                                   ` Matthew D. Fuller
@ 2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 21:46                                                       ` Matthew D. Fuller
  2006-10-21 19:41                                                     ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Jeff King @ 2006-10-21 19:19 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, Oct 21, 2006 at 01:11:49PM -0500, Matthew D. Fuller wrote:

> Maybe that's what you mean by 'centralization'; each branch is central
> to itself.  That seems a pretty useless definition, though.  In my
> mind, actually, it's MORE distributed; my branch remains my branch,
> and your branch remains your branch, and the difference doesn't keep
> us from working together and moving changes back and forth.  Forcing
> my branch to become your branch sounds a lot more "centralized" to me.
> 
> Now, we can discuss THAT distinction.  I'm not _opposed_ to git's

OK, let's discuss. :)

I think the concept of "my" branch doesn't make any sense in git.
Everyone is working collectively on a DAG of the history, and we all
have pointers into the DAG. Something is "my" branch in the sense that I
have a repository with a pointer into the DAG, but then again, so do N
other people. I control my pointer, but that's it.

So don't think of it as "git throws away branch identity" as much as
"git never cared about branch identity in the first place, and doesn't
think it's relevant."

Now, there are presumably advantages and disadvantages to these
approaches. I like the fact that I can prepare a repository from
scratch, import it from cvs, copy it, push it, or do whatever I like,
and the end result is always exactly the same (revids included). With
your model, on the other hand, it seems the advantages are that in many
cases you can do things like distributed revnos.

> agree on branch identity, it's completely pointless to keep yakking
> about revnos, because they're a direct CONSEQUENCE of that difference
> in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> have revnos, I'd STILL want my branch to keep its identity.  You could
> name the mainline revisions after COLORS if you wanted, and I'd still
> want my branch to keep its identity.  Aren't we through rehashing the
> same discussion about the EFFECTS?

I agree completely.

> > 2. But the preferred git workflow is to have two branches in each of
> > two clones. The 'origin' branch where you fetch changes from other
> > repository (so called "tracking branch") and you don't commit your
> > changes to [...]
> 
> Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> branch I pull" and "my branch I merge from upstream" that's getting
> kvetched around...

The difference, I think, is that it's easier in git to move the upstream
around: you simply start fetching from a different place. I'm not clear
on how that works in bzr (if it invalidates revnos or has other side
effects).

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 17:51                                                   ` Jakub Narebski
@ 2006-10-21 19:20                                                     ` Jan Hudec
  0 siblings, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 19:20 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Jeff King, bazaar-ng, git

On Sat, Oct 21, 2006 at 07:51:43PM +0200, Jakub Narebski wrote:
> Jan Hudec wrote:
> 
> > Besides I start to think that it should be actually possible to solve
> > this case with the git-style approach. I have to state beforehand, that
> > I don't know how the most recent git algorithm works, but I imagine
> > there is some kind of 'brackets' saying the text is in a given file. Now
> > if those 'brackets' were not flat, but nested, ie. instead of saying
> > 'this is in foo/bar' it would say 'this is in bar is in foo', the
> > difference when renaming directory would only affect the 'outer bracket'
> > and therefore merge correctly with adding content inside it.
> 
> You mean, to consider "contents" of a directory union of contents
> of files and directories it contains, and then use the same "rename
> detection" algorithm as for files?

Yes, something like that.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 18:42                                                   ` Linus Torvalds
@ 2006-10-21 19:21                                                     ` Jakub Narebski
  2006-11-03  6:36                                                       ` Martin Langhoff
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jan Hudec, Jeff King, bazaar-ng, git

Linus Torvalds wrote:

> We've discussed adding a "--follow" flag to tell "git log" to consider the 
> argument to not be a "pathname filter", but a "individual file" kind of 
> thing, and I think there was even a patch for it, but I suspect it hasn't 
> been a big issue, probably partly because you get rather used to the 
> "pathname filter" approach fairly quickly.

If I remember correctly, the patch implementing --follow was fairly
intrusive, and was unfortunate in that it was posted during changes
in diffcore.

Lack of --follow is not a big issue because you can do this "by hand";
you can use git-diff-tree -M at the end of file history to check if
[git considers] it was moved from somewhere.
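
A sketch of the "by hand" check, assuming a scratch repository in which
old.txt is renamed to new.txt (file names, paths and identity are made up
for the example):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git init -q "$tmp/repo"
cd "$tmp/repo"
printf 'line one\nline two\n' > old.txt
git add old.txt
git commit -q -m "Add old.txt"
git mv old.txt new.txt
git commit -q -m "Rename old.txt to new.txt"

# With a path limiter, history for new.txt ends at the commit where it appears.
oldest=$(git rev-list HEAD -- new.txt | tail -n 1)

# -M asks diff-tree to detect renames in that commit; an "R" status line
# names the path the content came from.
status=$(git diff-tree -M -r --no-commit-id --name-status "$oldest")
echo "$status"
```

If an "R" line shows up, repeating the path-limited walk with the old name
from that commit's parent continues the history.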

During discussion we have agreed that we would like to have both
--follow rename following limiter and static path limiter (and 
that it would be nice to extend static path limiter to include globs).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:19                                                     ` Jeff King
@ 2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 19:47                                                         ` Jan Hudec
  2006-10-21 19:55                                                         ` Linus Torvalds
  2006-10-21 21:46                                                       ` Matthew D. Fuller
  1 sibling, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:30 UTC (permalink / raw)
  To: Jeff King
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Jeff King wrote:

> The difference, I think, is that it's easier in git to move the upstream
> around: you simply start fetching from a different place. I'm not clear
> on how that works in bzr (if it invalidates revnos or has other side
> effects).

That's a good example of a fully distributed approach. I can fetch directly
(actually, I cannot) from Junio's private repository, I can fetch from the
public git.git repository, using either the git:// or the http:// protocol,
I can fetch from somebody else's clone of the git repository: I can intermix
those fetches, and revids (commit-ids) remain constant and unchanged.
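
A small sketch of that property, assuming three throwaway repositories
that play the roles of the private repository, its public mirror, and a
third party's clone (the directory names are illustrative only):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# "junio" is the original, "public" a mirror of it, "someone" a clone of the mirror.
git init -q "$tmp/junio"
(cd "$tmp/junio" && echo hello > file && git add file && git commit -q -m "Initial commit")
git clone -q "$tmp/junio" "$tmp/public"
git clone -q "$tmp/public" "$tmp/someone"

# Fetch the same history from two different places into a fresh repository.
git init -q "$tmp/mine"
cd "$tmp/mine"
git fetch -q "$tmp/public" HEAD
via_public=$(git rev-parse FETCH_HEAD)
git fetch -q "$tmp/someone" HEAD
via_someone=$(git rev-parse FETCH_HEAD)

origin_id=$(git --git-dir="$tmp/junio/.git" rev-parse HEAD)
echo "original:    $origin_id"
echo "via public:  $via_public"
echo "via someone: $via_someone"
```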
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:11                                                   ` Matthew D. Fuller
  2006-10-21 19:19                                                     ` Jeff King
@ 2006-10-21 19:41                                                     ` Jakub Narebski
  2006-10-22 19:18                                                       ` David Clymer
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 19:41 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

Matthew D. Fuller wrote:
> On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
> Jakub Narebski, and lo! it spake thus:
>> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:

>> When two clones of the same repository (in git terminology), or two
>> "branches" (in bzr terminology), used by different people, cannot be
>> totally equivalent that is centralization bias.
> 
> This is obviously some new meaning of "centralization" bearing no
> resemblance whatsoever to how I understand the word.

Perhaps I'd better use "star topology bias" instead of "centralization
bias".
 
> In git, apparently, you don't give a crap about a branch's identity
> (alternately expressible as "it has none"), and so you throw it away
> all the time.  Given that, revnos even if git had them would never be
> of ANY use to you, so it's no wonder you have no use for the notion.

In git, branches are lightweight. Branch names are local to a repository.
Repositories have identity. A bzr "branch" is a strange mix of a one-branch
git repository and a git branch.

Git's main workflow is a fully decentralized workflow. All clones of the
same repository are created equal. In bzr the suggested workflow
(with revnos) forces one (or more) branches to be the mainline (use "merge",
get empty merges, revnos don't change) and the others to be leaves (use
"pull", revnos change).
 
> I DO give a crap about my branches' identities.  I WANT them to retain
> them.  If I have 8 branches, they have 8 identities.  When I merge one
> into another, I don't WANT it to lose its identity.  When I merge a
> branch that's a strict superset of second into that second, I don't
> WANT the second branch to turn into a copy of the first.  If I wanted
> that, I'd just use the second branch, or make another copy of it.  I
> don't WANT to copy it.  I just want to merge the changes in, and keep
> on with my branch's current identity.

I don't understand. If I merge 'next' branch into 'master' in git, I 
still have two branches: 'master' and 'next'.
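
A quick sketch of that, assuming a scratch repository with a 'master' and
a 'next' branch that have each gained one commit (paths, file names and
identity are made up):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git init -q "$tmp/repo"
cd "$tmp/repo"
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
echo base > base; git add base; git commit -q -m "Base"
git branch next

# One commit on each branch so that the merge is a real merge.
echo m > on-master; git add on-master; git commit -q -m "Work on master"
git checkout -q next
echo n > on-next; git add on-next; git commit -q -m "Work on next"

next_before=$(git rev-parse refs/heads/next)
git checkout -q master
git merge -q -m "Merge branch 'next'" next
next_after=$(git rev-parse refs/heads/next)

# Both branches are still listed, and 'next' still points where it did.
git branch
```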

And I don't understand why you are so hung up on branch identities. Yes, if
somebody clones your 'repo' repository, he can have your 'master' branch
(refs/heads/master) named 'repo' (refs/heads/repo) or 'repo/master'
(refs/remotes/repo/master), but why does that matter to you? It is _his_
(or her ;-) clone. 

> Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
> model per se, and I can think of a lot of cases where it'd be really
> handy.  But those aren't most of my cases.  And as long as we don't
> agree on branch identity, it's completely pointless to keep yakking
> about revnos, because they're a direct CONSEQUENCE of that difference
> in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> have revnos, I'd STILL want my branch to keep its identity.  You could
> name the mainline revisions after COLORS if you wanted, and I'd still
> want my branch to keep its identity.  Aren't we through rehashing the
> same discussion about the EFFECTS?

For revnos to work you MUST have one "branch" to be considered
special, the hub in star topology. This very much precludes fully
distributed development. 

BTW. I get that you can use revids instead of revnos in bzr for fully
distributed and not star-topology geared development. But
Bazaar-NG revids are uglier than Git commit-ids.

[...]
>> And you say that bzr is not biased towards centralization? In git
>> you can just pull (fetch) to check if there were any changes, and if
>> there were not you don't get useless marker-merges.
> 
> If I don't tell you my branch has something in it ready to grab, you
> shouldn't merge it.  It probably won't work, and is quite likely to
> set your computer on fire, slaughter and fillet your pet goldfish, and
> make demons fly out of your nose.  If you wanna get stuck with all my
> incomplete WIP, let's just use a CVS module and be done with it.

In git I can fetch your changes but I don't need to merge them. Take
for example Junio's 'pu' (proposed updates) branch: this is the branch
you shouldn't merge as its history is constantly being rewritten.

If you don't want for your WIP to be publicly available, you don't
publish it. For example as far as I understand Junio works on Git
in his private repository, with many, many feature branches, but
he does push to public [bare] repository only some subset of branches,
and we can fetch/pull only those.

But still, if I am impatient I can pull from Junio every hour, and
I don't get 24 totally useless empty merge messages if he took day
off and didn't publish any changes till day later.
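
A sketch of what that looks like, assuming a scratch upstream repository
and a clone of it (names, paths and identity are illustrative):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

git init -q "$tmp/upstream"
(cd "$tmp/upstream" && echo one > file && git add file && git commit -q -m "First")
git clone -q "$tmp/upstream" "$tmp/me"
cd "$tmp/me"

# Pulling repeatedly while upstream is idle creates nothing.
before=$(git rev-parse HEAD)
for hour in 1 2 3; do
    git pull -q --ff-only
done
after=$(git rev-parse HEAD)

# Once upstream publishes, the pull fast-forwards instead of recording a merge.
(cd "$tmp/upstream" && echo two >> file && git commit -q -a -m "Second")
git pull -q --ff-only
merges=$(git rev-list --merges HEAD | wc -l)
upstream_head=$(git --git-dir="$tmp/upstream/.git" rev-parse HEAD)
echo "merge commits in my history: $merges"
```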

>> 2. But the preferred git workflow is to have two branches in each of
>> two clones. The 'origin' branch where you fetch changes from other
>> repository (so called "tracking branch") and you don't commit your
>> changes to [...]
> 
> Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> branch I pull" and "my branch I merge from upstream" that's getting
> kvetched around...

But please, have you realized that in this workflow the two clones
of the same repository are totally symmetrical? One's 'master' is
another 'origin' and vice versa. After pull on one side, and pull
on the other side (without any changes in between) we have the same
contents, and the same revision names (commit-ids in git), even if
the changes (revisions) got to those clones in different order.
In bzr those two "branches" would get different revnos. No symmetry.
Fully distributed vs star topology (one branch "central", hence
"centralized" - I don't mean the need to access one central repository,
although...)

-- 
Jakub Narebski
ShadeHawk on #git
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:30                                                       ` Jakub Narebski
@ 2006-10-21 19:47                                                         ` Jan Hudec
  2006-10-21 19:55                                                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-21 19:47 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jeff King, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

On Sat, Oct 21, 2006 at 09:30:30PM +0200, Jakub Narebski wrote:
> Jeff King wrote:
> 
> > The difference, I think, is that it's easier in git to move the upstream
> > around: you simply start fetching from a different place. I'm not clear
> > on how that works in bzr (if it invalidates revnos or has other side
> > effects).

Moving upstream around does not invalidate revnos. Switching to a different
upstream (i.e. one whose head revisions are different) does. And this may
happen by doing a merge with the previous mainline as non-first parent
-- revnos are simply short aliases for revids, not persistent unique
identifiers.

> That's good example of fully distributed approach. I can fetch directly
> (actually, I cannot) from Junio private repository, I can fetch from
> public git.git repository, either using git:// or http:// protocol,
> I can fetch from somebody else clone of git repository: intermixing
> those fetches, and revids (commit-ids) remain constant and unchanged.

So they (revids) do in bzr.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:30                                                       ` Jakub Narebski
  2006-10-21 19:47                                                         ` Jan Hudec
@ 2006-10-21 19:55                                                         ` Linus Torvalds
  2006-10-21 20:19                                                           ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 19:55 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Matthew D. Fuller, Jeff King, Andreas Ericsson,
	Carl Worth, git



On Sat, 21 Oct 2006, Jakub Narebski wrote:
> 
> That's good example of fully distributed approach. I can fetch directly
> (actually, I cannot) from Junio private repository, I can fetch from
> public git.git repository, either using git:// or http:// protocol,
> I can fetch from somebody else clone of git repository: intermixing
> those fetches, and revids (commit-ids) remain constant and unchanged.

This is nice for a couple of situations:

 - if some particular machine is down, nobody really cares. It doesn't 
   really change the workflow at all if "master.kernel.org" were to be 
   off-line due to some trouble - it just happens to be a machine with 
   good bandwidth that a number of kernel (and git) developers have access 
   to, but if you want to sync with something else, go wild. We could just 
   sync directly between developers, although most people tend to have 
   firewalls (I certainly have a very anal one - not even ssh gets in) 
   making it usually easier to go through some - any - public place.

   But in git, the "public place" really is just an intermediary. It has 
   nothing to do with anything history-wise, and its revision IDs are a 
   non-issue. It's just a temporary staging area (although re-using the 
   same repo over and over for pushing things out obviously means you can 
   do just incremental updates, so most everybody does that)

 - sometimes you have multiple branches in the same tree that have very 
   _different_ sources. For example, you might start out cloning my tree, 
   but if you _also_ want to track the stable tree, you just do so: you 
   can just do

	git fetch <repo> <remote-branch-name>:<local-branch-name>

   at any time, and you now have a new branch that tracks a different 
   repository entirely (to make it easier to keep track of them, you'd 
   probably want to make note of this in your .git/config file or your remote 
   tracking data, but that's a small "usability detail", not a real 
   conceptual issue).

 - the same "multi-source" thing is true for pushing things out too, not 
   just fetching: I still have my personal git.git repository on 
   kernel.org for historical reasons, even though Junio maintains the 
   normal one. So when I did some experimental (and broken) stuff for "git 
   unpack-objects" in a local branch, and others were interested in fixing 
   it, I just pushed it out to my git repo as a new branch - one that 
   Junio doesn't have.

   So now my kernel.org git repo not only tracks all of Junio's branches 
   (basically just a mirror of his tree), I also have a few stale branches 
   of my own that I did some work on separately. So it's kind of a 
   "Frankenstein's monster" of different branches from different sources. 

   And I think that's fairly common, actually (ie many kernel developers 
   that publicise their own git trees often have a "linus" branch that 
   tracks mine, along with their own "real" branches)
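
A sketch of the multi-source fetch from the second point, assuming two
scratch repositories named "linus" and "stable" and a local branch name
"stable-master" chosen for the example. "git remote add" is one way to
record the second source in the configuration:

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# Two sources: a main tree and a stable tree that carries one extra fix.
git init -q "$tmp/linus"
(cd "$tmp/linus" && git symbolic-ref HEAD refs/heads/master &&
    echo base > file && git add file && git commit -q -m "Base")
git clone -q "$tmp/linus" "$tmp/stable"
(cd "$tmp/stable" && echo fix > fix && git add fix && git commit -q -m "Stable-only fix")

# Start out by cloning the main tree...
git clone -q "$tmp/linus" "$tmp/mine"
cd "$tmp/mine"

# ...then track the stable tree in the same repository:
#   git fetch <repo> <remote-branch-name>:<local-branch-name>
git fetch -q "$tmp/stable" master:stable-master

# Recording the source under a name makes later fetches shorter.
git remote add stable "$tmp/stable"
git fetch -q stable

stable_tip=$(git --git-dir="$tmp/stable/.git" rev-parse refs/heads/master)
git branch -a
```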

And note how in none of these situations does it matter what the 
"original" branch was. It might even be a way to just pre-populate the 
tree. For a real-life example, a week or two ago, Jesper Juhl wanted to 
download my kernel tree (which is about 140MB in size), but he's somewhere 
in Europe, and apparently the connection to kernel.org was just _really_ 
slow. 

So what I told him to do was:

   Hmm. I suspect most mirrors avoid the /pub/scm directory, but there are a 
   few places that mirror git trees in general, eg

        http://www.jur-linux.org/git/

   might be closer to you.

   Once you have _one_ kernel repo, you can clone another easily using

        git clone --reference <mylocalrepo> <remotereponame> [localdir]

   but you do need to have the thing in git format, not just a snapshot, to 
   do that.

and that's exactly what he did (and he could have just fetched into the 
original archive entirely):

   I could only get 2-3kb/sec from kernel.org and at that speed 140MB is 
   *HUGE*.

   That was a lot better. got more than 200kb/sec from there.

so the point here is, "distributed" really is more than star-topology. If 
you think outside the star, you can take useful shortcuts.
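
A sketch of the --reference shortcut, assuming a scratch "remote"
repository and an existing local copy of it (all paths are made up; the
file:// URL just forces a real transport):

```shell
set -eu
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid

# The slow, far-away repository.
git init -q "$tmp/remote"
(cd "$tmp/remote" && echo data > file && git add file && git commit -q -m "Initial")

# The repository you already have locally, obtained from wherever was fastest.
git clone -q "$tmp/remote" "$tmp/local"

# A second clone borrows objects from the local one instead of downloading them.
git clone -q --reference "$tmp/local" "file://$tmp/remote" "$tmp/second"

# The borrowing is recorded in the alternates file of the new clone.
alternates="$tmp/second/.git/objects/info/alternates"
cat "$alternates"
remote_head=$(git --git-dir="$tmp/remote/.git" rev-parse HEAD)
second_head=$(git --git-dir="$tmp/second/.git" rev-parse HEAD)
```

The new clone depends on the referenced repository staying around, since
the borrowed objects are not copied.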

Now, I'm sure that bzr can probably do all the same things. This is likely 
less an issue of "technology" than of "mindset". The "git way" tends to 
make all of these things very trivial - the notion of tracking multiple 
branches from multiple _different_ repositories in one local repo just 
fits very naturally in the whole git mentality.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-20 21:48                                             ` Carl Worth
  2006-10-21 13:01                                               ` Matthew D. Fuller
@ 2006-10-21 20:05                                               ` Aaron Bentley
  2006-10-21 20:48                                                 ` Jakub Narebski
                                                                   ` (2 more replies)
  1 sibling, 3 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-21 20:05 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, Jakub Narebski, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Carl Worth wrote:
> On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
>> I understand your argument now.
>>                                  It's nothing to do with numbers per se,
>> and all about per-branch namespaces.  Correct?
> 
> The entire discussion is about how to name things in a distributed
> system. The premise that Linus has put forth in a very compelling way,
> is that attempting to use sequential numbers for names in a
> distributed system will break down. The breakdown could be that the
> names are not stable, or that the system is used in a centralized way
> to avoid the instability of the names.

So I'd say that revnos without the context of a location can only refer
to the current branch that the user is working on.  They don't refer to
the mainline, which typically has its own numbers that don't match the
user's.

If you're saying that bzr is "centralized" in that the user's current
branch is special, then I'll say "guilty as charged".

> But it really is fundamental and unavoidable that sequential numbers
> don't work as names in a distributed version control system.

Right.  You need something guaranteed to be unique.  It's the revno +
url combo that is unique.  That may not be permanent, but anyone can
create one of those names, so it is decentralized.

>> I meant that the active branch and a mirror of the abandoned branch
>> could be stored in the same repository, for ease of access.
> 
> Granted, everything can be stored in one repository. But that still
> doesn't change what I was trying to say with my example. One of the
> repositories would "win" (the names it published during the fork would
> still be valid). And the other repository would "lose" (the names it
> published would be not valid anymore). Right?

No.  It would be silly for the losing side to publish a mirror of the
winning branch at the same location where they had previously published
their own branch.  So the old number + URL combination would remain valid.

If the losing faction decided to maintain their own branch after the
merge, they'd have two options

1. continue to develop against the losing "branch", without updating its
numbers from the "winning" branch.  It would be hard to tell who had won
or lost in this case.

2. create a new mirror of the "winning" branch and develop against that.
 I'm not sure what the point of this would be.

I think the most realistic thing in this scenario is that they leave the
"losing" branch exactly where it was, and develop against the "winning"
branch.

>> Bazaar encourages you to stick lots and lots of branches in your
>> repository.  They don't even have to be related.  For example, my repo
>> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
> 
> Git allows this just fine. And lots of branches belonging to a single
> project is definitely the common usage. It is not common (nor
> encouraged) for unrelated projects to share a repository, since a git
> clone will fetch every branch in the repository.

Right.  This is a difference between Bazaar and Git that I'd
characterize as being "branch-oriented" vs "repository-oriented".  We'll
see more of this below.

> I'm noticing another terminology conflict here. The notion of "branch"
> in bzr is obviously very different than in git. For example the bzr
> man page has a sentence beginning with "if there is already a branch
> at the location but it has no working tree". I'm still not sure
> exactly what a bzr branch is, but it's clearly something different
> from a git branch, (which is absolutely nothing more than a name
> referencing a particular commit object).

I got the impression there was also a local ordering of revisions.  Is
that wrong?

A Bazaar branch is a directory inside a repository that contains:
 - a name referencing a particular revision
 - (optional) the location of the default branch to pull/merge from
 - (optional) the location of the default branch to push to
 - (optional) the policy for GPG signing
 - (optional) an alternate committer-id to use for this branch
 - (optional) a nickname for the branch
 - other configuration options

A Bazaar branch doesn't contain any commit objects ("revisions" in
Bazaar parlance).  Those are retrieved from the containing repository.

It doesn't contain any working files, but a branch and a working tree
may coexist in the same directory.  Similarly, a branch and a repository
may coexist in the same directory.

So this is one common layout:

Repository:
~/repo/

Branch:
~/repo/branch

Working Tree:
~/workingtree

This is another common layout:

Repository:
~/

Branch:
~/mybranch

Working Tree:
~/mybranch

This layout is our default, a "standalone tree":

Repository:
~/mybranch

Branch:
~/mybranch

Working Tree:
~/mybranch

This layout is an imitation of Git, as I understand it:

Repository:
~/repo

Branches:
~/repo/origin
~/repo/master

Working Tree:
~/repo

> Second, I'm not comfortable
> with any limit on usefulness of history. Would you willingly throw
> away commits, mailing list posts, or closed bug reports older than any
> given age for any projects that you care about?

I think the mailing list posts age the best, because they provide a
record of rationales for design decisions.  But I'd throw away old
commits if there were a good reason, like lack of disk space.  Not so
sure about bug reports.

> Second, I think that using the filesystem for separating branches is a
> really bad idea. 

The canonical way to name branches in Bazaar is with URLs, though we
support file paths where possible.  Part of the "simple namespace" thing
is that branches are simply URLs, so in order to retrieve a branch, all
you need is one URL.

> One, it intrudes on my branch namespace, (note that
> in many commands above I have to use things like "../b" where I'd like
> to just name my branch "b". 

While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
http://bazaar-vcs.org/bzr/bzr.dev" is a big win.

> Two, it prevents bzr from having any
> notion of "all branches" in places where git takes advantage of it,
> (such as git-clone and "gitk --all").

No, it doesn't.  Bazaar can easily list all the branches in a
repository, just by starting with the repository root, and recursing
through all the subdirectories, looking for branches.

That said, we do have the mentality that branches, not repositories, are
what's important to users in day-to-day use.

> Three, it certainly encourages
> the storage problem I ran into above, (and I'd be interested to see a
> "corrected" version of the commands above to fix the storage
> inefficiencies).

$ bzr init-repo bzrtest --trees
$ bzr init bzrtest/master; cd bzrtest/master
$ touch a; bzr add a; bzr commit -m "Initial commit of a"
$ bzr branch . ../b; cd ../b
$ touch b; bzr add b; bzr commit -m "Commit b on b branch"
$ echo "change" > b; bzr commit -m "Change b on b branch"
$ bzr branch ../master ../c; cd ../c
$ touch c; bzr add c; bzr commit -m "Commit c on c branch"
$ echo "change" > c; bzr commit -m "Change c on c branch"
$ cd ../master
$ bzr merge ../b; bzr commit -m "Merge in b"
$ bzr merge ../c; bzr commit -m "Merge in c"

> I hadn't realized that the dotted decimal notation was so new that the
> community hadn't had a lot of experience with it yet. But, your
> description doesn't actually presume that notation. What you asked
> was:
> 
> 	> When you create a new branch from scratch, the number starts at zero.
> 	> If you copy a branch, you copy its number, too.
> 	>
> 	> Every time you commit, the number is incremented.  If you pull, your
> 	> numbers are adjusted to be identical to those of the branch you pulled from.
> 	>
> 	> Is that really complicated?
> 
> And to answer. That description doesn't describe at all what happens
> to the "simple" numbers of commits that are merged.

Nothing happens to them, because they were never part of this branch, so
they didn't ever exist in this context.

> I still don't
> understand how people can avoid number changing, (since pull seems the
> only way to synch up without infinite new merge commits being added
> back and forth).

Why would anyone commit if the merge introduced no changes?

>> What's nice is being able see the revno 753 and knowing that "diff -r
>> 752..753" will show the changes it introduced.  Checking the revno on a
>> branch mirror and knowing how out-of-date it is.
> 
> With git I get to see a revision number of b62710d4 and know that
> "diff b62710d4^ b62710d4" will show its changes, though much more
> likely just "show b62710d4". I really cannot fathom a place where
> arithmetic on revision numbers does something useful that git revision
> specifications don't do just as easily. Anybody have an example for
> me?

My understanding is that ^ is treated as a special metacharacter by some
shells, which is why bzr revision specs are more long-winded.

> PS. The "bzr branch" of bzr.dev did eventually finish. I can see the
> dotted-decimal numbers in my example now, (1.1.1 and 1.1.2 for the
> commits that came from branch b; 1.2.1 and 1.2.2 for the commits that
> came from branch c). At 5 characters apiece these are well on their
> way to getting just as "ugly" as git names, (once it's all
> cut-and-paste the difference in ugliness is negligible).

Yeah, I'm not sure I like those dotted numbers, either.

> And now, I see it's not just pull that does number rewriting. If I use
> the following command (after the chunk of commands above):
> 
> 	cd ..; bzr branch -r 1.2.2 master 1.2.2

It's not number rewriting, it's number writing.  It doesn't change the
numbers in master, or any other existing branch.  (Push also does number
rewriting, because it's mostly the inverse of pull).

> It appears to just create newly linearized revision numbers from whole
> cloth for the new branch (1, 2, and 3 corresponding to mainline 1,
> 1.2.1, and 1.2.2). That's totally surprising, very confusing, and
> would invalidate any use I wanted to make of published revision
> numbers for the mainline branch while I was working on this branch.

I think the intent of those numbers was for operations like "diff".  I
never branch from a revision, always from a branch, which will preserve
numbers.

> See? This stuff really doesn't work.

Our experience really is that it does work.

> Is there even a way to say "show me the change introduced by what is
> named '1.2.1' in the source branch in this scenario" ?

The revno:branch notation ought to work, but I guess there's a bug.  Not
surprising, since dotted revnos are new in this release.

> Note: In #bzr I just learned that there is a way for me to do this
> _if_ I also happen to have a pull of the original branch somewhere on
> my machine. 

This should work with any URL, not just locations on your machine.

> But with bzr if I find "1.2.1" somewhere I'm likely to type:

The problem here is the "somewhere".  Since each branch has its own
revno namespace, you need to know where to use the revno effectively.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFOn190F+nu1YWqI0RAn1nAKCDqT8gbzm/xIMjbc3kTFCkpMbJvwCeJiWr
3fLtDo4uLwtAWi+pQOrgPLU=
=0GeT
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:55                                                         ` Linus Torvalds
@ 2006-10-21 20:19                                                           ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:19 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jeff King, Matthew D. Fuller, bazaar-ng, Carl Worth,
	Andreas Ericsson, git

Linus Torvalds wrote:
>  - sometimes you have multiple branches in the same tree that have very 
>    _different_ sources. For example, you might start out cloning my tree, 
>    but if you _also_ want to track the stable tree, you just do so: you 
>    can just do
> 
>         git fetch <repo> <remote-branch-name>:<local-branch-name>
> 
>    at any time, and you now have a new branch that tracks a different 
>    repository entirely (to make it easier to keep track of them, you'd 
>    probably want to make note of this in your .config file or your remote 
>    tracking data, but that's a small "usability detail", not a real 
>    conceptual issue).

That, for example, allows joining two initially separate projects
into one. This was the case for gitk and gitweb, which are now in the
git.git repository. Most probably gitweb/gitk was fetched into a
separate gitweb/gitk branch, then merged into the 'master' branch of
git (in the case of gitweb we "resolved the conflict" by moving all
gitweb files into the gitweb/ subdirectory), then propagated to the
other branches by merging with master.

For example, git has 7 "initial" (parentless) commits. Two of them
belong to the 'html' and 'man' branches, which exist only for the
automatic generation of the HTML and man versions of the git
documentation, keeping it current. There is the 'todo' branch, [also]
totally separate, for notes. And there are the initial commits of git,
git-tools, gitk and gitweb:
 * Initial revision of "git", the information manager from hell
 * Start of early patch applicator tools for git.
 * Add initial version of gitk to the CVS repository
 * first working version 
   [of gitweb: this commit message should be more descriptive]

$ git rev-list --parents --all | grep -v " " | xargs git -p show
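
A rough Python sketch of what the grep step in that pipeline does,
assuming input lines shaped like `git rev-list --parents` output (a
commit id followed by its parent ids). The ids below are made up for
illustration, and root_commits is a hypothetical helper, not part of
git.

```python
def root_commits(rev_list_lines):
    """Return the ids of commits that have no parents.

    Each input line is expected to hold a commit id followed by the ids
    of its parents, separated by spaces.
    """
    roots = []
    for line in rev_list_lines:
        fields = line.split()
        if len(fields) == 1:  # only the commit id itself, so no parents
            roots.append(fields[0])
    return roots


# Hypothetical, shortened ids for illustration only.
sample = [
    "c3 c2 g1",  # merge commit: two parents
    "c2 c1",     # ordinary commit: one parent
    "c1",        # parentless: initial commit of the main project
    "g1",        # parentless: initial commit of a merged-in project
]
print(root_commits(sample))
```

Each id that survives the filter marks the start of a history that was
once separate.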
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
@ 2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
                                                                     ` (3 more replies)
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2 siblings, 4 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-21 20:47 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Aaron Bentley, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 8080 bytes --]

On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> I think we're getting into scratched-record-mode on this.

I apologize if I've come across as beating a dead horse on this. I've
really tried to only respond where I am still confused, or there are
explicit indications that the reader hasn't understood what I was
saying, ("I don't understand how you've come to that conclusion",
etc.). I'll be even more careful about that below, labeling paragraphs
as "I'm missing something" or "Maybe I wasn't clear".

> G: So use revids everywhere.
>
> B: Revnos are handier tools for [situation] and [situation] for
>    [reason] and [reason].

I'm missing something:

I still haven't seen strong examples for this last claim. When are
they handier? I asked a couple of messages back and two people replied
that given one revno it's trivial to compute the revno of its
parent. But that's no win over git's revision specifications,
(particularly since they provide "parent of" operators).

> > It may be that the centralization bias
>
> I think it's more accurately describable as a branch-identity bias.
> The git claim seems to be that the two statements are identical, but I
> have some trouble swallowing that.

Maybe I wasn't clear:

There's no doubt that there has been semantic confusion over the term
branch that has been confounding communication on both sides. Here's
my attempt to describe the situation, (which only became this clear
recently as I started playing with bzr more). This is not an attempt
at a complete description, but is hopefully accurate, neutral, and
sufficient for the current discussion:

  Abstract: In a distributed VCS we are using a distributed process to
  create a DAG, (nodes are associated with revisions and point to parent
  nodes). The distributed nature means that the collective DAG will have
  multiple source nodes, (often termed heads or tips).

  Git: A subset of the DAG is stored in a "repository". The DAG in the
  repository may have many source nodes. A "branch" is a named reference
  to a node (whether or not a source). Multiple local repositories may
  share storage for common objects. There are inter-repository commands
  for copying revisions and adjusting branch references, but basically
  all other operations act within a single repository.

  Bzr: A subset of the DAG is stored in a "branch". The DAG in the
  branch has a single source node. Multiple local branches may share
  storage for common objects through a "repository". Basically all
  operations (where applicable) can act between branches.

Let me know if I botched any of that.
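
To make the two descriptions concrete, here is a minimal Python sketch
of the model as described above; it is not how either tool is
implemented. The names Revision, ancestors and heads, and the revision
ids, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    rev_id: str
    parents: tuple = ()  # ids of the parent revisions


def ancestors(store, tip):
    """Ids of every revision reachable from tip, tip included."""
    seen = set()
    stack = [tip]
    while stack:
        rev_id = stack.pop()
        if rev_id not in seen:
            seen.add(rev_id)
            stack.extend(store[rev_id].parents)
    return seen


def heads(store, rev_ids):
    """Ids in rev_ids that are not a parent of any other id in rev_ids."""
    used_as_parent = {p for r in rev_ids for p in store[r].parents}
    return set(rev_ids) - used_as_parent


# One root "a" and two diverging lines of work.
store = {
    r.rev_id: r
    for r in [
        Revision("a"),
        Revision("b1", ("a",)),
        Revision("b2", ("b1",)),
        Revision("c1", ("a",)),
    ]
}

# Git-like repository: the whole subset of the DAG, plus named references.
git_refs = {"master": "a", "b": "b2", "c": "c1"}
print(heads(store, store.keys()))  # several source nodes

# Bzr-like branch: just the part of the DAG reachable from one tip.
bzr_branch_b = ancestors(store, "b2")
print(heads(store, bzr_branch_b))  # a single source node
```

In this toy model the git-like repository needs the git_refs table to
say which nodes are interesting, while the bzr-like branch is defined by
its one tip.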

One concept that is really not introduced in the above is the
colloquial concept of a "branch" as a "line of development". In my
experience, this notion is a fundamentally short-lived thing. For
example, work happens on a feature branch for a while, and then it
gets merged into the mainline. After the merge, there's not that much
significance to the branch anymore. In a sense, it no longer exists
but for a few edges in the graph.

I imagine that git and bzr users both use this short-lived aspect
in practice. After merging, git users drop their branch references and
bzr users drop their directories containing their branches. Anything
else would be unwieldy as the number of merged-in, "uninteresting"
branches would grow without bound and there wouldn't be any advantage
to keeping them around.

But dropping a merged branch in bzr means throwing away the ability to
reference any of its commits by its custom, branch-specific revision
numbers. And the revision numbers _do_ change: pull, branch, and merge
all introduce revision number differences between branches, (or
changes within a branch in the case of pull). And there is no simple
way to correlate the numbers between branches.

Maybe you can argue that there isn't any centralization bias in
bzr. But anyone that claims that the revnos. are stable really is
talking from a standpoint that favors centralization.

But, here's a unifying point about git and bzr. Git also allows
branch-specific, unstable names for revisions. And they're even more
unstable than the ones bzr generates. But there are some important
differences between how they are used, (both by the tool and by
people).

To illustrate, yesterday I gave an example where performing a bzr
branch from a dotted-decimal revision would rewrite the numbers from
the originating branch (1.2.2, 1.2.1, and 1) to unrelated numbers in
the new branch (3, 2, 1). I was surprised at first, and couldn't
imagine any sane reason for the tool to go off and invent new names.
It prevents a user of the new branch from referencing any commits by
their original names. It also prevents the user from communicating
with anyone with these new names, (unless the user publishes the
branch, and any parties to the communication retain the new branch for
as long as said communication might be referenced).

But then I realized why bzr is doing this. It's because, bzr users
don't just use the revision numbers for external communication, but
they also use them for lots of direct interaction with the tool. The
rewriting makes it easy to write something like "bzr diff -r1..3".

And it turns out that git also allows branch specific naming for the
exact same reason. In place of 3, 2, 1 in the same situation git would
allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same three
revisions. So the easy diff command would be "git diff HEAD~2 HEAD".
(And where I have HEAD here I could also use any branch name, or any
other reference to a commit as well.)

So there are two fundamentally different uses for names, (and Linus
recently talked about this in some length): 1. day-to-day working with
the tool and 2. externally communicating about specific revisions.

Both bzr and git allow for unstable, branch-specific names to be used
as a convenience in the case of the day-to-day working. Maybe the
reason some people dislike git's "ugly" names so much is that they
imagine that to compare two revisions a user of git must inspect the
logs, fish out the sha1sum for each, and then cut-and-paste to create
the command needed. I agree that if that were required, it would be
exceedingly painful. But that's not required, what the git user uses
is branch names and simple variations.

Now, there are some important differences in the unstable names that
git and bzr have. Most importantly, git's are even less stable, (with
respect to the association between a name and any specific
revision). With every commit, all of the git names effectively shift
as the branch moves, (HEAD points to the new commit, HEAD~1 points to
what HEAD previously pointed to). This is remarkably useful since it
provides stability in terms of what the user cares about, (the latest
commit and its closest ancestors). This means that "diff from
grandparent to current commit" is always "git diff HEAD~2 HEAD"
whereas in bzr it is "bzr diff -r<X-2>..<X>" and the user actually does
need to look up X first, (unless there's more to the bzr revision
specification than I've seen).
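
A small Python sketch of the two naming schemes on a strictly linear
history, under the assumption that revnos count up from 1 at the oldest
commit (as in the 3, 2, 1 example above) and that HEAD~n counts back
from the newest. The helper names are hypothetical and merges are
ignored.

```python
def resolve_relative(history, n):
    """Git-style HEAD~n: n steps back from the newest commit."""
    return history[len(history) - 1 - n]


def resolve_revno(history, revno):
    """Bzr-style revno: position counted from the oldest commit, from 1."""
    return history[revno - 1]


history = ["a", "b", "c"]  # oldest first
before = (resolve_relative(history, 1), resolve_revno(history, 2))

history.append("d")  # one more commit on the branch
after = (resolve_relative(history, 1), resolve_revno(history, 2))

print(before)  # HEAD~1 and revno 2 both name "b"
print(after)   # HEAD~1 now names "c"; revno 2 still names "b"
```

To name the newest commit by revno, the length of the history has to be
known first (len(history) here), whereas resolve_relative(history, 0)
needs no lookup.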

Finally, since these branch-specific names are changing all the time,
there's never any temptation for people to attempt to use them for
external communication. In contrast, by being numbered in the opposite
direction, bzr revision numbers give a false appearance of stability
and people _do_ use them for communication. This is the mistake we've
been warning bzr users about in this thread.

Also, since the git names are so predictable, git almost never emits
them. It accepts them as names just fine, but it doesn't generate
them, (log, and commit never show the branch-specific names). I think
the only git command that even can emit such a name was a recently
added git-name-rev which exists solely for the purpose of mapping a
commit identifier to a local, branch-specific name which might have
more intuitive meaning for the user.

So the fact that things like git-log don't print these names also
helps avoid any trap of users trying to communicate with something
unstable.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:05                                               ` Aaron Bentley
@ 2006-10-21 20:48                                                 ` Jakub Narebski
  2006-10-21 22:52                                                   ` Edgar Toernig
  2006-10-21 23:39                                                   ` Aaron Bentley
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
  2006-10-22  7:45                                                 ` Jan Hudec
  2 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:48 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

Aaron Bentley wrote:
> Carl Worth wrote:

>> I'm noticing another terminology conflict here. The notion of "branch"
>> in bzr is obviously very different than in git. For example the bzr
>> man page has a sentence beginning with "if there is already a branch
>> at the location but it has no working tree". I'm still not sure
>> exactly what a bzr branch is, but it's clearly something different
>> from a git branch, (which is absolutely nothing more than a name
>> referencing a particular commit object).
> 
> I got the impression there was also a local ordering of revisions.  Is
> that wrong?

No, there is no such thing as a local ordering of revisions.

Each revision (commit) has a link to its parent(s). A branch technically
is just a reference to a particular commit object. The commit itself
gives us a sub-DAG of the DAG of the whole history: the DAG of all
ancestors of said commit. Such a lineage of the commit pointed to by a
branch is conceptually a branch; i.e. a branch is a DAG of development
(not a line of development, as there is no special meaning of the first
parent).

You can also have (in a git repository) a reflog, which records the
values of the branch-as-reference, i.e. the branch tip of the
branch-as-named-lineage. But, for example, a fetch that fast-forwards
5 commits of history is recorded as a single event, a single change
in the reflog.
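
A toy Python model of that last point, assuming only that the reflog
gets one entry per change of the branch reference. BranchRef and the
commit ids are hypothetical, and this is not git's on-disk format.

```python
class BranchRef:
    """A named reference to a commit, with a log of its past values."""

    def __init__(self, name, tip):
        self.name = name
        self.tip = tip
        self.reflog = []  # entries of (old tip, new tip, reason)

    def update(self, new_tip, reason):
        # One entry per update, however many commits lie between the tips.
        self.reflog.append((self.tip, new_tip, reason))
        self.tip = new_tip


master = BranchRef("master", "c0")
master.update("c1", "commit")
master.update("c6", "fetch: fast-forward")  # jumps over c2..c5 in one step

print(len(master.reflog))
print(master.reflog[-1])
```

The log records where the reference pointed over time, not the
individual commits that arrived.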
 
> A Bazaar branch is a directory inside a repository that contains:
>  - a name referencing a particular revision
>  - (optional) the location of the default branch to pull/merge from
>  - (optional) the location of the default branch to push to
>  - (optional) the policy for GPG signing
>  - (optional) an alternate committer-id to use for this branch
>  - (optional) a nickname for the branch
>  - other configuration options
Erm, wasn't the revno to revid mapping also part of a bzr "branch"?

We store configuration per repository, not per branch, although
there is some branch-specific configuration.

[...]
> This layout is an imitation of Git, as I understand it:
> Repository:
> ~/repo
> 
> Branches:
> ~/repo/origin
> ~/repo/master
> 
> Workingtree:
> ~/repo

Workingtree:
~/

if I understand notation correctly.

>> One, it intrudes on my branch namespace, (note that
>> in many commands above I have to use things like "../b" where I'd like
>> to just name my branch "b".
> 
> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.

Gaah, it's even more inconvenient. Certainly more than using the name
of the branch itself, as in git.
 
>> Two, it prevents bzr from having any
>> notion of "all branches" in places where git takes advantage of it,
>> (such as git-clone and "gitk --all").
> 
> No, it doesn't.  Bazaar can easily list all the branches in a
> repository, just by starting with the repository root, and recursing
> through all the subdirectories, looking for branches.

Is there a command to list all branches in bzr? Is there a command
to copy (clone in SCM jargon) whole repository with all branches?
 
> That said, we do have mentality that branches, not repositories, are
> what's important to users in day-to-day use.

That's the opposite of the git view. In git, the working area is
associated with the repository (a clone of the repository), not a
branch. We copy whole repositories (sometimes only part of a
repository), not branches.

>>> What's nice is being able see the revno 753 and knowing that "diff -r
>>> 752..753" will show the changes it introduced.  Checking the revno on a
>>> branch mirror and knowing how out-of-date it is.
>>
>> With git I get to see a revision number of b62710d4 and know that
>> "diff b62710d4^ b62710d4" will show its changes, though much more
>> likely just "show b62710d4". I really cannot fathom a place where
>> arithmetic on revision numbers does something useful that git revision
>> specifications don't do just as easily. Anybody have an example for
>> me?
> 
> My understanding is that ^ is treated as a special metacharacter by some
> shells, which is why bzr revision specs are more long-winded.

Which shells? As I understand it, '^' was chosen (for example as the
NOT operator for specifying a sub-DAG, instead of '!') because it causes
no problems with shell expansion. And considering that many git commands
are/were written in shell, one would certainly have noticed that.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
  2006-10-21 20:53                                                   ` Sean
@ 2006-10-21 20:53                                                   ` Sean
  2006-10-21 21:10                                                     ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Sean @ 2006-10-21 20:53 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git

On Sat, 21 Oct 2006 16:05:18 -0400
Aaron Bentley <aaron.bentley@utoronto.ca> wrote:

> Our experience really is that it does work.

Of course it works as long as you accept the implicit requirements of
supporting them and ignore the cases where they change out from
underneath the user.  But as soon as users want to embrace distributed
models where there isn't a central shared repo, at best revno's are
unhelpful and at worst they are counterproductive.  The proof of this
is that if revno's were sufficient bzr wouldn't need revid's.

Since the utility provided by revno's seems so minimal even in the
case where they do work, Git simply doesn't bother with them.  And
"our" experience is that Git really does work well without them.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
@ 2006-10-21 20:55                                                   ` Jakub Narebski
  2006-10-21 23:07                                                   ` Jeff Licquia
                                                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 20:55 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, Aaron Bentley, Linus Torvalds,
	Andreas Ericsson, bazaar-ng, git

Carl Worth wrote:

> Also, since the git names are so predictable, git almost never emits
> them. It accepts them as names just fine, but it doesn't generate
> them, (log, and commit never show the branch-specific names). I think
> the only git command that even can emit such a name was a recently
> added git-name-rev which exists solely for the purpose of mapping a
> commit identifier to a local, branch-specific name which might have
> more intuitive meaning for the user.

git-show-branch also shows git-name-rev like names.

BTW, git-show-branch has a somewhat strange UI, different from other git
commands. You can think of it as a text version of the gitk/qgit history
viewer (although you can use tig for a CLI (ncurses) graph).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 16:19                     ` Erik Bågfors
  2006-10-21 16:31                       ` Jakub Narebski
       [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
@ 2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
                                           ` (2 more replies)
  2 siblings, 3 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 21:04 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1821 bytes --]



On Sat, 21 Oct 2006, Erik Bågfors wrote:
> 
> bzr is a fully decentralized VCS. I've read this thread for quite some
> time now and I really cannot understand why people come to this
> conclusion.

Even the bzr people agree, so what's not to understand?

The revision numbers are totally unstable in a distributed environment 
_unless_ you use a certain work-flow. And that work-flow is definitely not 
"distributed"; it's much closer to "disconnected centralized".

Now, you could be truly distributed: BK used the same revision numbering 
thing, but was distributed. But BK didn't even try to claim that their 
revision numbers were "simple" and that fast-forwarding is sometimes the 
wrong thing to do.

So BK always fast-forwarded, and the revision numbers were just randomly 
changing numbers. They weren't stable, they weren't simple, and nobody 
claimed they were.

So bzr can bite the bullet and say: "revision numbers are changing and 
meaningless, and we should just fast-forward on merges", or you should 
just admit that bzr is really more about "disconnected operation" than 
truly distributed.

You can't have your cake and eat it too. Truly distributed _cannot_ be 
done with a stable dotted numbering scheme (unless the "dotted numbering 
scheme" is just a way to show a hash like git does - so the numbering has 
no _sequential_ meaning).

Btw, this isn't just an "opinion". This is a _fact_. It's something they 
teach in any good introductory course to distributed algorithms. Usually 
it's talked about in the context of "global clock". 

Anybody who thinks that there exists a globally ticking clock in the 
system (and stably increasing dotted numbers are just one such thing) is 
talking about some fantasy-world that doesn't exist, or a world that has 
nothing to do with "distributed".
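
As a loose illustration in Python (an assumption-laden toy, not BK's,
bzr's or git's actual scheme): two clones that number their commits
sequentially without talking to each other hand out the same number to
different commits, while an id derived from content is the same wherever
it is computed. The Clone class and content_id function are
hypothetical.

```python
import hashlib


def content_id(parent_id, change):
    """Id derived only from the parent id and the change text."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode()).hexdigest()


class Clone:
    def __init__(self, ids=()):
        self.ids = list(ids)  # commit ids, oldest first

    def commit(self, change):
        parent = self.ids[-1] if self.ids else ""
        new_id = content_id(parent, change)
        self.ids.append(new_id)
        return len(self.ids), new_id  # (local sequence number, id)


origin = Clone()
origin.commit("initial import")

alice = Clone(origin.ids)  # two independent copies of the same history
bob = Clone(origin.ids)

alice_no, alice_id = alice.commit("fix parser")
bob_no, bob_id = bob.commit("add docs")

print(alice_no == bob_no)  # True: both clones call their commit number 2
print(alice_id == bob_id)  # False: the ids tell the two commits apart
```

Neither clone can know that "2" is already taken elsewhere without
asking some shared authority, which is the global clock in disguise.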

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:53                                                   ` Sean
@ 2006-10-21 21:10                                                     ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 21:10 UTC (permalink / raw)
  To: Sean
  Cc: Aaron Bentley, Carl Worth, Jakub Narebski, Andreas Ericsson,
	bazaar-ng, git



On Sat, 21 Oct 2006, Sean wrote:
> 
> Since the utility provided by revno's seems so minimal even in the
> case where they do work, Git simply doesn't bother with them.  And
> "our" experience is that Git really does work well without them.

Yes. This really is what it boils down to.

The _only_ time you actually use revision numbers (as opposed to 
branch-names or tag-names) is when you want a _stable_ number.

It's that simple. You never really need a revision number otherwise. In 
other situations, you do things like 

	git log --since=2.days.ago
	gitk v2.6.18..
	git diff --stat --summary ORIG_HEAD.. 

or whatever. It's clearly not "stable", but it's also clearly not a 
revision number from a UI perspective.

When you want a revision number is _exactly_ when you're moving things 
between branches, or reporting a bug to somebody else, or similar. And 
that's also _exactly_ when you want the number to be stable and meaningful 
(ie the other end should be able to rely on the number).

And if you need to refer to a central repository to do that, it's clearly not 
distributed. Not needing such a central reference point is what the word 
"distributed" _means_ in computer science for chrissake!

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:19                                                     ` Jeff King
  2006-10-21 19:30                                                       ` Jakub Narebski
@ 2006-10-21 21:46                                                       ` Matthew D. Fuller
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
  2006-10-21 22:25                                                         ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-21 21:46 UTC (permalink / raw)
  To: Jeff King
  Cc: Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, Oct 21, 2006 at 03:19:49PM -0400 I heard the voice of
Jeff King, and lo! it spake thus:
> 
> I think the concept of "my" branch doesn't make any sense in git.
> [...]
> So don't think of it as "git throws away branch identity" as much as
> "git never cared about branch identity in the first place, and
> doesn't think it's relevant."

This is as I understand it.


But in my mind, it does make sense.  I fundamentally DO think of "my
commits" differently from "revisions I've merged", and I want the tool
to preserve that for me.  "My commits" tend to be steps along a path,
"merges" tend to be completed paths.  I usually use bzr's "log
--short" for looking at logs, which doesn't show merged revs at all.
That works, because most of the time I don't care about them; I know
if I merged something, it's a completed piece, which I described in
the log message; it's not a PART of a task like my commits usually
are.  So, just the message for my merge rev tells me what I need to
know, and if I need to drill down into it, I can use the regular
(--long) log output to look at the revision in it.  This lets me know,
for instance, that if I want to re-check something I did 3 commits
ago, and I just merged another branch, the commit I'm interested in is
the 4th commit back on the mainline; I don't need to grub through a
bunch of revisions that aren't mine to try and find it.

So, if me and Bob are working on different bits of the same project in
parallel, finish up, and merge back and forth to sync up (ignoring for
the moment the "empty merge commit" bit), even though we now both have
the 'same' stuff, we have the same head rev with all the same parents,
the parents are in a different order, and my 'mainline' (the path of
left-most parents, or 'first' as I understand git calls them) is
different than his; my mainline is my commits, his mainline is his.
If one of us were to 'pull' the other, our branch would become a
duplicate of his and so adopt his 'mainline', which we want to avoid
because then it doesn't fit the mental model of "what I did", which is
what I think of my branch as.
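
A short Python sketch of that "mainline" idea, assuming it means
following the left-most parent from the tip back to the root. The graph
and the mainline helper are hypothetical, and the two merges are
modelled as separate nodes for simplicity.

```python
def mainline(parents, tip):
    """Follow the first (left-most) parent from tip back to the root."""
    path = []
    node = tip
    while node is not None:
        path.append(node)
        node = parents[node][0] if parents[node] else None
    return path


# Each commit maps to its list of parents, left-most first.
parents = {
    "root": [],
    "mine1": ["root"],
    "mine2": ["mine1"],
    "bob1": ["root"],
    "bob2": ["bob1"],
    "my_merge": ["mine2", "bob2"],   # I merged Bob's work into mine
    "bob_merge": ["bob2", "mine2"],  # Bob merged my work into his
}

print(mainline(parents, "my_merge"))
print(mainline(parents, "bob_merge"))
```

Both merges reach the same five earlier commits, but the first-parent
walk passes through mine1 and mine2 in one case and bob1 and bob2 in the
other.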


Obviously, this is a totally foreign mentality to git, and that's
great because it seems to work for you.  I can see advantages to it,
and I can conceive of situations where I might want that behavior.
But, in my day-to-day VCS use, I don't hit them, which is why I keep
typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.


> The difference, I think, is that it's easier in git to move the
> upstream around: you simply start fetching from a different place.
> I'm not clear on how that works in bzr (if it invalidates revnos or
> has other side effects).

Depends on what you're fetching.  You can always tell 'bzr pull' a new
URL to look from.  If it's a later version of the 'same' branch, it'll
just update.  If it's a 'different' branch (a branch that's a superset
of your current branch/set-of-revisions, but with a different
'mainline' path through the revisions counts as 'different' here),
pull will complain and require a --overwrite to do the deed.
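
[One possible reading of the rule above, as a sketch only: a plain
pull is fine when my current head sits on the other branch's
left-most-parent path, and anything else needs --overwrite.  The
function names and the toy history are assumptions for illustration;
this is not how bzr itself implements the check.]

```python
def mainline(parents, head):
    """Follow left-most parents from head back to the root."""
    path = []
    rev = head
    while rev is not None:
        path.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    return path

def can_plain_pull(parents, my_head, their_head):
    """True if their branch is a later version of the 'same' branch."""
    return my_head in mainline(parents, their_head)

# Hypothetical history: revision id -> ordered parents.
dag = {
    "root": (),
    "a1": ("root",), "a2": ("a1",),   # my branch, head at a2
    "a3": ("a2",),                    # a later version of my branch
    "b1": ("root",),
    "m": ("b1", "a2"),                # superset of mine, mainline via b1
}

same_branch = can_plain_pull(dag, "a2", "a3")
different_branch = can_plain_pull(dag, "a2", "m")
print(same_branch, different_branch)  # True False
```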


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
@ 2006-10-21 22:06                                                           ` Sean
  0 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-21 22:06 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jeff King, Jakub Narebski, bazaar-ng, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On Sat, 21 Oct 2006 16:46:29 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Obviously, this is a totally foreign mentality to git, and that's
> great because it seems to work for you.  I can see advantages to it,
> and I can conceive of situations where I might want that behavior.
> But, in my day-to-day VCS use, I don't hit them, which is why I keep
> typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.

It's not completely foreign; it's one of the things you can use the
git reflog feature to record.  It's just that it's utterly clear in
Git that this is a local feature and is never replicated as part
of the distributed data.

> Depends on what you're fetching.  You can always tell 'bzr pull' a new
> URL to look from.  If it's a later version of the 'same' branch, it'll
> just update.  If it's a 'different' branch (a branch that's a superset
> of your current branch/set-of-revisions, but with a different
> 'mainline' path through the revisions counts as 'different' here),
> pull will complain and require a --overwrite to do the deed.

This is where the git model is clearly superior and allows a true
distributed model.  Because there is no concept of a "mainline"
(except locally via reflog) you can always merge with anyone
participating in the DAG without having to overwrite or lose ordering.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:46                                                       ` Matthew D. Fuller
       [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
@ 2006-10-21 22:25                                                         ` Jakub Narebski
  2006-10-21 23:42                                                           ` Jeff Licquia
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-21 22:25 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Jeff King, bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

Matthew D. Fuller wrote:

[cut]
> Obviously, this is a totally foreign mentality to git, and that's
> great because it seems to work for you.  I can see advantages to it,
> and I can conceive of situations where I might want that behavior.
> But, in my day-to-day VCS use, I don't hit them, which is why I keep
> typing 'bzr' instead of 'git' when I annoyingly need to type 'cvs'.

Well, not exactly. If you are interested in your changes, i.e. commits 
generated by you, you can (with new git) filter commits by author name,
e.g. 'git log --author="$(git repo-config --get user.email)"'. If you
are interested in commits which you entered into repository, you can
(with new git) filter commits by committer.

If you are interested in history of your branch, you can enable reflog
for this branch. This is of course totally local information, and 
doesn't get propagated. It records things like commits, merges, 
rebasing, starting branch anew, amending commits etc. Because it
is separate from branch and DAG of revisions, we can do fast-forward
and have identical DAG while having information about local history.
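
[A minimal sketch of why a reflog can differ between two repositories
that hold the identical DAG.  The Branch class is a made-up model,
not git's on-disk format: a branch is a movable pointer plus a purely
local list of its moves.]

```python
class Branch:
    """A branch tip plus a local-only log of every time the tip moved."""

    def __init__(self, name, tip):
        self.name = name
        self.tip = tip
        self.reflog = [(None, tip, "branch: created")]

    def move(self, new_tip, reason):
        # Record (old tip, new tip, why); this list is never shared.
        self.reflog.append((self.tip, new_tip, reason))
        self.tip = new_tip

# Shared history is the linear chain c0 <- c1 <- ... <- c5 in both repos.
mine = Branch("master", "c0")
mine.move("c5", "fetch: fast-forward")   # five commits, one reflog event

theirs = Branch("master", "c0")
for i in range(1, 6):
    theirs.move(f"c{i}", "commit")       # five commits, five events

print(mine.tip == theirs.tip)                # True
print(len(mine.reflog), len(theirs.reflog))  # 2 6
```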

Besides, git users are used to turning to graphical history viewers
for more complicated cases, including gitk (Tcl/Tk, in git repository),
qgit (Qt), gitview (GTK+, in contrib/, less popular), git-show-branch
(core git, strange UI, command line), and tig (ncurses).


I wonder if searching for one's own commits isn't a sign that
the project is of one-main-developer size (i.e. a small project,
without a large number of distributed contributors). I think in a large
project you would rather ask about the history of a specified file or
a specified part of the project (a specified directory), ask why a
certain change was introduced, etc.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:48                                                 ` Jakub Narebski
@ 2006-10-21 22:52                                                   ` Edgar Toernig
  2006-10-21 23:39                                                   ` Aaron Bentley
  1 sibling, 0 replies; 806+ messages in thread
From: Edgar Toernig @ 2006-10-21 22:52 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jakub Narebski wrote:
>
> > My understanding is that ^ is treated as a special metacharacter by some
> > shells, which is why bzr revision specs are more long-winded.
> 
> Which shells?

In the traditional Bourne shell ^ is an alias for the pipe symbol |.

Ciao, ET.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
@ 2006-10-21 23:07                                                   ` Jeff Licquia
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-22 12:46                                                   ` Matthew D. Fuller
  2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 1 reply; 806+ messages in thread
From: Jeff Licquia @ 2006-10-21 23:07 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 13:47 -0700, Carl Worth wrote:
> I still haven't seen strong examples for this last claim. When are
> they handier? I asked a couple of messages back and two people replied
> that given one revno it's trivial to compute the revno of its
> parent. But that's no win over git's revision specifications,
> (particularly since they provide "parent of" operators).

Having used both (though my familiarity with git is less), in my opinion
the biggest win is the obvious one: sequential numbers work in the head
better than SHA1 checksums.

"But it's not a problem in practice!" is a good retort, except that I
wonder whether the set of "practices" you're using includes anyone who
decided to pass on git in favor of something else--perhaps because they
saw a few SHAs float by and ran in terror.  Beware of self-selection
bias.

Put another way, "strength" of example is often in the eye of the
beholder.  That we continue to give you the same "weak" examples may be
evidence that we have a different impression of their strengths, and
that your analysis of their strengths isn't convincing to us.

I suppose this line of conversation still has value if you don't see any
benefit at all, but OTOH if you really don't see how sequential numbers
are easier to work with in the head than SHA sums with modifiers, I'm
not sure that's a gap we can bridge.

> Let me know if I botched any of that.

I don't see any problems with it.

> But dropping a merged branch in bzr means throwing away the ability to
> reference any of its commits by its custom, branch-specific revision
> numbers. And the revision numbers _do_ change, pull, branch, and merge
> all introduce revision number differences between branches, (or
> changes within a branch in the case of pull). And there is no simple
> way to correlate the numbers between branches.
> 
> Maybe you can argue that there isn't any centralization bias in
> bzr. But anyone that claims that the revnos. are stable really is
> talking from a standpoint that favors centralization.

I wonder if part of the problem is that the revno scheme we've been
talking about (the x.y.z... format) doesn't technically exist in any
released version of bzr that I know of.

Previous to 0.12, bzr revnos were absolutely a local thing; revisions
from merges didn't even have revnos (except for the merge commit
itself).  If you merged a branch and you later wanted to recreate that
branch, or see a diff from that branch, etc., you had to use revids.

So when you talk of a "centralization bias" in bzr, a lot of us get
confused, defensive, etc., because from our perspective, bzr and git
weren't all that much different until just recently.

Now it may be that you're right that "global" revnos like bzr has now
introduce a bias in favor of centralization.  If that's true, I'm not
sure that totally vindicates the git model.  We have to ask if the bias
is a good thing, but so do you; after all, we may have done so because
of user demand, and if our users want it, maybe yours will want it too
someday.

(I say "may" because I haven't been paying close attention to the new
revno conversation, so I don't want to sound more sure than I am.)

But I think bzr people are more willing to take a wait-and-see approach.
Local revnos weren't a big deal, so we're willing to bet that the new
0.12 revnos won't be, either.

> And it turns out that git also allows branch specific naming for the
> exact same reason. In place of 3, 2, 1 in the same situation git would
> allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same three
> revisions. So the easy diff command would be "git diff HEAD~2 HEAD".
> (And where I have HEAD here I could also use any branch name, or any
> other reference to a commit as well.)

FYI: The strict analogy to HEAD~1 in bzr would be -2.  And yes, -2 is
every bit as unstable as HEAD~1.

> Finally, since these branch-specific names are changing all the time,
> there's never any temptation for people to attempt to use them for
> external communication. In contrast, by being numbered in the opposite
> direction, bzr revision numbers give a false appearance of stability
> and people _do_ use them for communication. This is the mistake we've
> been warning bzr users about in this thread.

URLs are also used for communication, despite having many of the same
drawbacks as revnos in DVC systems.  This could have been a fatal flaw,
but in reality, this has resulted in some best practices ("permalinks",
for example), and a sense of where a URL is appropriate and where it
isn't.  It's not perfect, and yet it's been wildly successful.

Copying the flaws of a highly successful system does not guarantee
success, of course.  On the other hand, it does influence our evaluation
of the severity of the flaws.

There may be a danger, though, that the bzr community may want to pay
closer attention to.

Several of us have pointed to the (branch, revno) combination as a
sufficiently reliable communication method, and we may be right about
that.  But, so far, those revnos have been entirely local to a single
branch, and have also been as absolutely reliable (locally speaking) as
a revid; the branch "foo" may go away, but while it's around, "revision
14 of branch foo" will always mean the same thing.  But we're now adding
the 0.12 revno scheme, with "global" revnos.  Will those be as reliable?
Will "revision 2418.1.4 on bzr.dev" work as well as "revision 2418 on
bzr.dev" does now?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-21 23:25                                                       ` Sean
@ 2006-10-21 23:25                                                       ` Sean
  2006-10-22  0:46                                                       ` Jeff Licquia
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-21 23:25 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Carl Worth, bazaar-ng, git

On Sat, 21 Oct 2006 19:07:10 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> Several of us have pointed to the (branch, revno) combination as a
> sufficiently reliable communication method, and we may be right about
> that.  But, so far, those revnos have been entirely local to a single
> branch, and have also been as absolutely reliable (locally speaking) as
> a revid; the branch "foo" may go away, but while it's around, "revision
> 14 of branch foo" will always mean the same thing.  But we're now adding
> the 0.12 revno scheme, with "global" revnos.  Will those be as reliable?
> Will "revision 2418.1.4 on bzr.dev" work as well as "revision 2418 on
> bzr.dev" does now?

There is no need to speculate; the numbers will only be reliable on a local
basis.  So yes, you can force a single repository like bzr.dev to always "win"
any conflict and force the other guy to change, i.e. a central repo model.
But they cannot be maintained consistently in a truly distributed
system.  As Linus pointed out, that is fact, not opinion.

Now the opinion of the bzr people is that it doesn't matter and that for
all important cases it works well enough.  If all the people who don't like
the look of sha1's self select bzr, so be it, but that doesn't change the
fundamental argument.

But just to reiterate, the design of Git is flexible enough that you
can automatically generate "revno" tags for every commit in your repo
_today_.  You'd end up with the exact same problems that bzr will
eventually hit, but Git already has everything you need today to refer
to every commit in your repo as r1 r2 r3 r4 etc...  
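
[A small sketch of the point about locally generated numbers.  It
assumes each repository simply hands out r1, r2, ... in the order it
first sees a commit; number_on_arrival() and the commit names are
invented for the example and are not a git feature.]

```python
def number_on_arrival(arrival_order):
    """Assign r1, r2, ... to commit ids in the order this repo saw them."""
    return {f"r{n}": commit for n, commit in enumerate(arrival_order, start=1)}

# The same four commits, seen in a different order by two people.
alice = number_on_arrival(["base", "a1", "b1", "merge"])
bob = number_on_arrival(["base", "b1", "a1", "merge"])

print(alice["r2"], bob["r2"])                    # a1 b1
print(set(alice.values()) == set(bob.values()))  # True
```

[Both repositories hold the same commits, yet "r2" names a different
one in each, so the names only mean something next to the repository
that generated them.]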

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:48                                                 ` Jakub Narebski
  2006-10-21 22:52                                                   ` Edgar Toernig
@ 2006-10-21 23:39                                                   ` Aaron Bentley
  2006-10-22  0:04                                                     ` Carl Worth
  2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-21 23:39 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>> Carl Worth wrote:
> 
> No, there is no such thing like local ordering of revisions.

> You can have (in git repository) also reflog, which records values
> of branch-as-reference, or branch tip of branch-as-named-lineage.
> But for example fetch and fast-forward 5 commits in history is
> recorded as single event, single change in reflog.

That must be what I was thinking of.

>> A Bazaar branch is a directory inside a repository that contains:
>>  - a name referencing a particular revision
>>  - (optional) the location of the default branch to pull/merge from
>>  - (optional) the location of the default branch to push to
>>  - (optional) the policy for GPG signing
>>  - (optional) an alternate committer-id to use for this branch
>>  - (optional) a nickname for the branch
>>  - other configuration options
> Erm, wasn't revno to revid mapping also part of bzr "branch"?

It's not part of the conceptual model.  The revno-to-revid mapping is
done using the DAG.  The branch just tracks the head.

The .bzr/branch/revision-history file is from an earlier model in which
branches had a local ordering.  Nowadays, it can be treated as:
 - a reference to the head revision
 - a cache of the revno-to-revid mapping
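
[A rough sketch of deriving a revno-to-revid mapping from the DAG and
a head alone.  It assumes the mainline-only numbering described
elsewhere in this thread, where merged-in revisions get no revno; the
function revno_map() and the revids are hypothetical, not bzrlib
code.]

```python
def revno_map(parents, head):
    """Number the left-most-parent path from the root: revno 1 is oldest."""
    line = []
    rev = head
    while rev is not None:
        line.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    line.reverse()
    return {n: rev for n, rev in enumerate(line, start=1)}

# Hypothetical history: revid -> ordered parents.
history = {
    "rev-a": (),
    "rev-b": ("rev-a",),
    "rev-x": ("rev-a",),           # merged in, so it gets no revno here
    "rev-c": ("rev-b", "rev-x"),   # merge commit on the mainline
}

revnos = revno_map(history, "rev-c")
print(revnos)  # {1: 'rev-a', 2: 'rev-b', 3: 'rev-c'}
```

[Nothing is stored besides the head; the mapping can be rebuilt at any
time, which is why a file holding it can be treated as a cache.]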

>> This layout is an imitation of Git, as I understand it:
>> Repository:
>> ~/repo
>>
>> Branches:
>> ~/repo/origin
>> ~/repo/master
>>
>> Workingtree:
>> ~/repo
> 
> Workingtree:
> ~/
> 
> if I understand notation correctly.

The notation was that ~/repo would contain the .git directory for the
repository.

>> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
>> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.
> 
> Gaah, it's even more inconvenient. Certainly more than using name
> of branch itself, like in git.

Of course if you have a copy of bzr.dev on your computer, you don't need
to type the full URL.  It's just like the 'merge ../b' above.

But how can you use the branch name of a branch that isn't on your
computer?  I suspect git requires a separate 'clone' step to get it onto
your computer first.

> Is there a command to list all branches in bzr?

There's one in the 'bzrtools' plugin.

> Is there a command
> to copy (clone in SCM jargon) whole repository with all branches?

No.

>> My understanding is that ^ is treated as a special metacharacter by some
>> shells, which is why bzr revision specs are more long-winded.
> 
> Which shells? If I understand it '^' was chosen (for example as
> NOT operator for specify sub-DAG instead of '!') because of no problems
> for shell expansion. And considering that many git commands are/were
> written in shell, one certainly would notice that.

Sorry, it's been quite a long time since people complained at me for
using ^, so I don't remember.  Perhaps Edgar is right about it being the
pipe character in old shells.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD4DBQFFOq+80F+nu1YWqI0RAp/KAJ9Bw1q9/nd3gUAjcX3c+24aoEifeQCYlbD0
tUZ01ra11vkQ7V3RzarXeg==
=oFIC
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 22:25                                                         ` Jakub Narebski
@ 2006-10-21 23:42                                                           ` Jeff Licquia
  2006-10-21 23:49                                                             ` Carl Worth
  0 siblings, 1 reply; 806+ messages in thread
From: Jeff Licquia @ 2006-10-21 23:42 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sun, 2006-10-22 at 00:25 +0200, Jakub Narebski wrote:
> I wonder if searching for one's own commits isn't the sign that
> the project is of one-main-developer size (i.e. small project,
> without large number of distributed contributors). I think in large 
> project you rather ask of history of specified file, of specified part 
> of project (specified directory), ask about why certain change was 
> introduced etc.

I don't think so.  Recently, I've been trying to track a particular
patch in the kernel.  It was done as a series of commits, and probably
would have been its own branch in bzr, but when I was trying to group
the commits together to analyze them as a group, the easiest way to do
that was by the original committer's name.

Now, there's probably a better way to hunt that stuff down, but in this
case hunting the user down worked for me.  (It may have made a
difference that I was using gitweb instead of a local clone.)

And the case of hunting down your own commits is just a degenerate case
of hunting down someone else's.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:42                                                           ` Jeff Licquia
@ 2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
                                                                                 ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-21 23:49 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Jakub Narebski, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 999 bytes --]

On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
> I don't think so.  Recently, I've been trying to track a particular
> patch in the kernel.  It was done as a series of commits, and probably
> would have been its own branch in bzr, but when I was trying to group
> the commits together to analyze them as a group, the easiest way to do
> that was by the original committer's name.

As for "its own branch in bzr", would such a branch remain available
indefinitely even after being merged in to the main tree?

> Now, there's probably a better way to hunt that stuff down, but in this
> case hunting the user down worked for me.  (It may have made a
> difference that I was using gitweb instead of a local clone.)

Vast, huge, gaping, cosmic difference.

Almost none of the power of git is exposed by gitweb. It's really not
worth comparing. (Now a gitweb-alike that provided all the kinds of
very easy browsing and filtering of the history like gitk and git
might be nice to have.)

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
@ 2006-10-21 23:58                         ` Linus Torvalds
  2006-10-22  0:13                           ` Erik Bågfors
  2006-10-22  0:09                         ` Erik Bågfors
  2006-10-27  4:51                         ` Jan Hudec
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-21 23:58 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Matthieu Moy, bazaar-ng, Sean, Jan Hudec, git, Jakub Narebski



On Sat, 21 Oct 2006, Linus Torvalds wrote:
> 
> And that work-flow is definitely not "distributed" it's much closer to 
> "disconnected centralized".

Side note: the only reason I think that distinction is worth making at all 
is when comparing git to bzr, and even then this is a fairly subtle 
distinction, and probably not a huge deal in practice.

I obviously think git is a nicer distributed design, but in the end, if 
you compare to something like CVS or SVN that isn't even disconnected, the 
difference between git and bzr in this sense is basically zero. 

So I sound like I care, but at the same time, I realize very well that 
when coming from a totally centralized world, the details we're arguing 
are _so_ not important.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:39                                                   ` Aaron Bentley
@ 2006-10-22  0:04                                                     ` Carl Worth
  2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-22  0:04 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Jakub Narebski, Linus Torvalds, Andreas Ericsson, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2390 bytes --]

On Sat, 21 Oct 2006 19:39:41 -0400, Aaron Bentley wrote:
> Of course if you have a copy of bzr.dev on your computer, you don't need
> to type the full URL.  it's just like the 'merge ../b' above.
>
> But how can you use the branch name of a branch that isn't on your
> computer?  I suspect git requires a separate 'clone' step to get it onto
> your computer first.

No. You can merge a branch from a remote repository in a single step:

	git pull http://example.com/git/repo branch-of-interest

But if you want to do something besides (or before) a merge, (for
example, just explore its history, do some diffs etc.) then you would
fetch it instead, assigning it a local branch name in the process:

	git fetch http://example.com/git/repo branch-of-interest:local-name

After which "local-name" is all one would need to use. So after a
fetch like the above, the equivalent of "bzr missing --theirs-only"
would be:

	git log ..local-name

[This shows some of the expressive power of git revision
specifications. There's no need for a separate "missing" command. It's
just one case of viewing a particular subset of the DAG. And the
specification language makes almost all interesting subsets easy. The
--mine-only specification would be "local-name.."]
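
[To make the "subset of the DAG" reading concrete, here is a toy model
of a two-dot range as a set difference of ancestor sets.  The helper
names and commit ids are invented for the sketch; it only mirrors the
idea, not git's actual revision walker.]

```python
def ancestors(parents, tip):
    """All commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev not in seen:
            seen.add(rev)
            stack.extend(parents[rev])
    return seen

def rev_range(parents, exclude, include):
    """Commits reachable from include but not from exclude."""
    return ancestors(parents, include) - ancestors(parents, exclude)

# Hypothetical history: commit id -> parents.
dag = {
    "base": (),
    "h1": ("base",), "h2": ("h1",),   # HEAD's own commits, HEAD at h2
    "t1": ("base",), "t2": ("t1",),   # fetched as local-name, tip at t2
}

theirs_only = rev_range(dag, "h2", "t2")  # like "..local-name"
mine_only = rev_range(dag, "t2", "h2")    # like "local-name.."
print(sorted(theirs_only), sorted(mine_only))  # ['t1', 't2'] ['h1', 'h2']
```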

And beyond what bzr missing does (I believe) it's easy to also see the
patch content of each commit with:

	git log -p ..local-name

And then if everything is happy, one could merge that branch in:

	git pull . local-name

(And, yes, it is the case that "pull" with a repository URL of "." is
how merging is done. It's bizarre to me that this is not "git merge
local-name" instead. There actually _is_ a "git merge" command that
could be used here, but it is somewhat awkward to use, (requiring both
a commit message (without the -m of git-commit(!)) and an explicit
mention of the current branch). So using it would be something like:

	git merge "merge of local-name" HEAD local-name

I've never claimed that git is completely free of its UI
warts---though there are fewer now than when I started using it.)

But, yes, the notion in git is to bring things in to the current
repository and then work with them locally. This has an advantage that
network traffic is spent only once if doing multiple operations, (say
the three steps shown above: 1) investigate commit messages, 2)
investigate patch content, 3) perform the merge).

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
@ 2006-10-22  0:07                                                               ` Jeff Licquia
  2006-10-22  0:47                                                                 ` Linus Torvalds
  2006-10-22 16:02                                                               ` Petr Baudis
  2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Jeff Licquia @ 2006-10-22  0:07 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 16:49 -0700, Carl Worth wrote:
> On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
> > I don't think so.  Recently, I've been trying to track a particular
> > patch in the kernel.  It was done as a series of commits, and probably
> > would have been its own branch in bzr, but when I was trying to group
> > the commits together to analyze them as a group, the easiest way to do
> > that was by the original committer's name.
> 
> As far as "its own branch in bzr" would such a branch remain available
> indefinitely even after being merged in to the main tree?

Yes, in the sense that you can recreate the branch by using that
branch's last commit.  But not in the git sense that there's a branch ID
pointing at the commit in question.

You know what?  It occurs to me that much of the problem with git
branches vs. bzr branches might be solved when bzr gets proper tagging
support.  Because, after all, aren't branches more like special tags in
git?

> > Now, there's probably a better way to hunt that stuff down, but in this
> > case hunting the user down worked for me.  (It may have made a
> > difference that I was using gitweb instead of a local clone.)
> 
> Vast, huge, gaping, cosmic difference.
> 
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)

So, very probably, I would have had a far easier time of it if I had
been able to really use git to do the work, instead of gitweb.

I still don't think, though, that it's a sign of a small project to be
concerned about one's own branches more than others.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
@ 2006-10-22  0:09                         ` Erik Bågfors
  2006-10-27  4:51                         ` Jan Hudec
  2 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-22  0:09 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/21/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Erik Bågfors wrote:
> >
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
>
> Even the bzr people agree, so what's not to understand?

The use of the word "decentralized".

When I think centralized, I think "all users must commit to a central
repo/branch".  In this sense bzr is 100% fully decentralized.  You are
free to commit to a non-central branch.

What I mean is that it's fully decentralized, but it may have a bias
to the usage of a central branch/repo.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:58                         ` Linus Torvalds
@ 2006-10-22  0:13                           ` Erik Bågfors
  2006-10-22  0:22                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-22  0:13 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy, Jakub Narebski

On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Linus Torvalds wrote:
> >
> > And that work-flow is definitely not "distributed" it's much closer to
> > "disconnected centralized".
>
> Side note: the only reason I think that distinction is worth making at all
> is when comparing git to bzr, and even then this is a fairly subtle
> distinction, and probably not a huge deal in practice.
>
> I obviously think git is a nicer distributed design, but in the end, if
> you compare to something like CVS or SVN that isn't even disconnected, the
> difference between git and bzr in this sense is basically zero.
>
> So I sound like I care, but at the same time, I realize very well that
> when coming from a totally centralized world, the details we're arguing
> are _so_ not important.

I have to agree. Personally I think both git, bzr and mercurial are
all VERY nice systems.  If they weren't all started about the same
time, I doubt we would have all three.

I am happy to use any of them, but I have a small preference for bzr
because it suits me. I'm saying this just as a user, nothing else.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:39                                                   ` Aaron Bentley
  2006-10-22  0:04                                                     ` Carl Worth
@ 2006-10-22  0:14                                                     ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22  0:14 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Andreas Ericsson, Linus Torvalds, Carl Worth, bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Aaron Bentley wrote:

>>> A Bazaar branch is a directory inside a repository that contains:
>>>  - a name referencing a particular revision
>>>  - (optional) the location of the default branch to pull/merge from
>>>  - (optional) the location of the default branch to push to
>>>  - (optional) the policy for GPG signing
>>>  - (optional) an alternate committer-id to use for this branch
>>>  - (optional) a nickname for the branch
>>>  - other configuration options
>> Erm, wasn't revno to revid mapping also part of bzr "branch"?
> 
> It's not part of the conceptual model.  The revno-to-revid mapping is
> done using the DAG.  The branch just tracks the head.
> 
> The .bzr/branch/revision-history file is from an earlier model in which
> branches had a local ordering.  Nowadays, it can be treated as:
>  - a reference to the head revision
>  - a cache of the revno-to-revid mapping

In git, the DAG is a DAG of parents. There are no "child" links. So it is
natural to refer to the n-th ancestor of a given commit (in git <ref>~<n>,
in bzr -<m>).

To have incrementing (from 1 for the first revision on a given branch)
revision numbers you either have to have links to "children", which
automatically means that revisions cannot be immutable if branching at an
arbitrary revision is to be allowed, or you have to traverse the DAG there
and back again (perhaps with a cache of the revno-to-revid mapping to help
performance).

Additionally, to have incrementing revision numbers you have to remember
which part of the DAG is your branch, i.e. which parent of a merge to follow.
Bazaar-NG decides here to distinguish the first parent; to keep the
first-parent chain immutable it doesn't use fast-forward and always uses
merge, sometimes giving an empty merge. If you use "pull", the numbers change.
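
A minimal sketch of the point above, in Python rather than in either tool: with
only parent links, incrementing revision numbers can be derived by walking the
first-parent chain from the head back to the root and then counting forward.
The toy DAG and the function names (`mainline`, `revno_map`, `nth_ancestor`)
are made up for illustration and are not taken from git or bzr.

```python
# Toy commit DAG: each revision id maps to its list of parents,
# first parent first. There are no child links.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],        # side branch forked from "a"
    "m": ["c", "x"],   # merge: first parent "c", merged-in parent "x"
}


def mainline(head, parents=PARENTS):
    """Return the first-parent chain from the root up to `head`."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    chain.reverse()  # walked head -> root, so flip to root -> head
    return chain


def revno_map(head, parents=PARENTS):
    """Map incrementing revision numbers (starting at 1) to revision ids."""
    return {n: rev for n, rev in enumerate(mainline(head, parents), start=1)}


def nth_ancestor(head, n, parents=PARENTS):
    """Follow n first-parent links back from `head` (the <ref>~<n> idea)."""
    rev = head
    for _ in range(n):
        rev = parents[rev][0]
    return rev


if __name__ == "__main__":
    print(revno_map("m"))
    print(nth_ancestor("m", 2))
```

Note that the numbering depends on which head you start from and on which
parent is listed first, while counting ancestors backwards needs no such
bookkeeping.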
 
>>> This layout is an imitation of Git, as I understand it:
>>> Repository:
>>> ~/repo
>>>
>>> Branches:
>>> ~/repo/origin
>>> ~/repo/master
>>>
>>> Workingtree:
>>> ~/repo
>>
>> Workingtree:
>> ~/
>>
>> if I understand notation correctly.
> 
> The notation was that ~/repo would contain the .git directory for the
> repository.

The default layout of a "clothed" (non-bare) repository is

 Repository:
 ~/repo/.git/

 Branches:
 ~/repo/.git/refs/heads/

 Workingtree:
 ~/repo/

>>> While "bzr merge ../b" is a minor inconvenience, I think that "bzr merge
>>> http://bazaar-vcs.org/bzr/bzr.dev" is a big win.
>>
>> Gaah, it's even more inconvenient. Certainly more than using name
>> of branch itself, like in git.
> 
> Of course if you have a copy of bzr.dev on your computer, you don't need
> to type the full URL.  it's just like the 'merge ../b' above.
> 
> But how can you use the branch name of a branch that isn't on your
> computer?  I suspect git requires a separate 'clone' step to get it onto
> your computer first.

No, as was said in other messages in this thread, you can fetch
a branch (or branches), even from a repository other than the one you cloned
from, into a given local branch (or branches). For git it would be
  $ git fetch <URL> <remotebranch>:<localbranch>
You would probably want to save the above info in the remotes file or in the
config.
For cg (Cogito) it would be
  $ cg branch-add <localbranch> <URL>#<remotebranch>
  $ cg fetch <localbranch>

In git you always use names like 'master', 'next', 'HEAD' (meaning current
branch) and also HEAD^, next~5 when comparing branches, viewing history,
merging branches, switching to branch etc. Not '../master'...
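
As a rough illustration of how names such as 'master', 'HEAD', HEAD^ and
next~5 can resolve to commits, here is a small Python sketch. It covers only a
subset of the notation (`^`, `^<n>` and `~<n>` suffixes); the commit ids, the
`REFS` table and the `resolve` function are hypothetical and do not reflect
git's internal implementation.

```python
import re

# Toy history: commit id -> parents (first parent first).
PARENTS = {
    "c1": [],
    "c2": ["c1"],
    "c3": ["c2"],
    "s1": ["c1"],
    "c4": ["c3", "s1"],  # merge commit
}
REFS = {"master": "c4", "next": "s1"}  # branch name -> head commit
HEAD = "master"                        # HEAD names the current branch


def resolve(expr):
    """Resolve '<name>' followed by any number of ^, ^<n> or ~<n> suffixes."""
    m = re.match(r"[^~^]+", expr)
    if m is None:
        raise ValueError("missing ref name in %r" % expr)
    name = HEAD if m.group(0) == "HEAD" else m.group(0)
    rev = REFS[name]
    for op, num in re.findall(r"([~^])(\d*)", expr[m.end():]):
        n = int(num) if num else 1
        if op == "~":
            # ~n: follow the first parent n times
            for _ in range(n):
                rev = PARENTS[rev][0]
        elif n > 0:
            # ^n: pick the n-th parent (^0 leaves the commit unchanged)
            rev = PARENTS[rev][n - 1]
    return rev


if __name__ == "__main__":
    for expr in ["HEAD", "HEAD^", "master^2", "master~3", "next~1"]:
        print(expr, "->", resolve(expr))
```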

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:13                           ` Erik Bågfors
@ 2006-10-22  0:22                             ` Jakub Narebski
  2006-10-22  1:00                               ` Theodore Tso
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22  0:22 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Linus Torvalds, Sean, Jan Hudec, bazaar-ng, git, Matthieu Moy

Erik Bågfors wrote:

>> So I sound like I care, but at the same time, I realize very well that
>> when coming from a totally centralized world, the details we're arguing
>> are _so_ not important.
> 
> I have to agree. Personally I think both git, bzr and mercurial are
> all VERY nice systems.  If they weren't all started about the same
> time, I doubt we would have all three.

If I understand correctly, bzr came to life much earlier than Monotone,
Mercurial and Git, but it stayed in beta for a very long time. Bazaar-NG
"repositories" to group a bunch of "branches" seem inspired by hg or git.
Git (and probably Mercurial) was inspired by both BitKeeper and Monotone.
Monotone started to be reasonably fast around the time when Git and Mercurial
came to be.

P.S. I'd like very much to see "history of SCM", with links denoting
borrowing of ideas, similar to the "history of UNIX" graphs...
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
  2006-10-21 23:25                                                       ` Sean
  2006-10-21 23:25                                                       ` Sean
@ 2006-10-22  0:46                                                       ` Jeff Licquia
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Jeff Licquia @ 2006-10-22  0:46 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 19:25 -0400, Sean wrote:
> Now the opinion of the bzr people is that it doesn't matter and that for
> all important cases it works well enough.  If all the people who don't like
> the look of sha1's self select bzr, so be it, but that doesn't change the
> fundamental argument.

Which opinion is this?  The opinion that old-style local revnos aren't a
big deal, or that new-style dotted revnos aren't a big deal?

I suspect you're conflating the two, and interpreting certainty for the
former as certainty for the latter.  Though I don't mind being
corrected.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:07                                                               ` Jeff Licquia
@ 2006-10-22  0:47                                                                 ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-22  0:47 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: Carl Worth, bazaar-ng, git



On Sat, 21 Oct 2006, Jeff Licquia wrote:
> 
> You know what?  It occurs to me that much of the problem with git
> branches vs. bzr branches might be solved when bzr gets proper tagging
> support.  Because, after all, aren't branches more like special tags in
> git?

Both branches _and_ tags in git are 100% the same thing: they're just 
shorthand for the commit name. That's _literally_ all they are. They are a 
symbolic name for a 160-bit SHA1 hash.
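
To make the "160-bit SHA1 hash" concrete: git names an object by hashing a
short type-and-length header followed by the content. The Python sketch below
shows this for a blob (pure file content); the function name `git_blob_id` is
made up here, and commits, trees and tag objects use the same scheme with a
different type word in the header.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the 40-hex-digit (160-bit) SHA1 name git gives to a blob."""
    # Header is "blob <size in bytes>" followed by a NUL byte.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    print(git_blob_id(b""))         # the empty blob
    print(git_blob_id(b"hello\n"))  # a one-line file
```

Because the name depends only on the content, the same file gets the same name
in every repository, wherever that repository happens to live.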

So yes, you can say that branches are like special tags, or that 
(unsigned) tags are like special branches. There's no real "technical" 
difference: in both cases, it's just an arbitrary name for the top commit.

However, there are some purely UI differences between tags and branches, 
which really don't affect any of the "name->SHA1" translation at all, but 
which affect how you can _use_ a tag-name vs a branch-name.

 - A branch is always a pointer to a _commit_ object.

   In contrast, a tag can point to anything. It can point to a tree (and 
   that means that you can do _diff_ between a tag and a branch, but such 
   a tree doesn't have any "history" associated with it - it's purely 
   about a certain "state", so you cannot say that it has a parent or 
   anything like that).

   A tag can also point to a single file object ("blob": pure file 
   content), which is something that the git.git repository uses to point
   to the GPG public key that Junio uses to sign things, for example.

   But perhaps more commonly, a tag can also point to a special "tag" 
   object, which is just a form of indirection that can optionally contain 
   an explanation and a digitally signed verification. When I cut a kernel 
   release, for example, my tags don't point to the commit that is the
   release commit, they point to a GPG-signed tag-object that in turn 
   points to the commit. 

   With those signed tags, people can verify (if they get my public key) 
   that a particular release was something I did. And due to the 
   cryptographic nature of the hash, trusting the tag object also means 
   that you can trust the commit it points to, and the whole history that
   commit points to.

   So while from a _revision_lookup_ standpoint a "branch" and a "tag" do 
   100% the same thing, we put some limitations on branches: they always 
   have to point to a commit.

 - Thanks to the limitation on branches being commits, branches can be 
   "checked out" which is saying that you can make it the active working 
   tree state. You cannot "check out" a tag: you need to have a branch 
   that you check out and can do development on.  So a "tag" is considered 
   purely a stationary pointer: it cannot be committed to, and it cannot 
   participate directly in development.

   This literally has nothing to do with looking up the SHA1 name 
   associated with a tag or a branch, this is _purely_ an agreed-upon 
   convention (that is enforced by higher-level commands like "git 
   checkout"). So if you want to check out the state as of some tag, you 
   must always do it within the confines of some branch.

   So for example, you could do

	git checkout -b newbranch v2.6.18

   which uses a tag ("v2.6.18") to define where to start the branch, and 
   then creates a branch called "newbranch" and checks that out. That's 
   purely shorthand for

	git branch newbranch v2.6.18	# create 'newbranch', initialize 
					# it at v2.6.18

	git checkout newbranch		# make 'newbranch' our currently 
					# active branch

   but you are _not_ allowed to do

	git checkout v2.6.18

   because that would leave you with a situation where your "top-of-tree" 
   is a tag, and you couldn't do any development on it because you don't 
   have a branch to develop _on_.
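
The rules in the list above can be sketched as a toy model in Python. This is
only an illustration of the conventions as described in this message, not
git's code: the object ids, the `OBJECTS` and `REFS` tables and the functions
`peel`, `create_branch` and `checkout` are all hypothetical.

```python
# Toy object store: id -> object. A "tag" object points at another object.
OBJECTS = {
    "commit-1": {"type": "commit", "tree": "tree-1", "parents": []},
    "tree-1": {"type": "tree"},
    "blob-1": {"type": "blob"},
    "tagobj-1": {"type": "tag", "object": "commit-1", "message": "signed release"},
}

# Both namespaces are plain name -> id tables; only the usage rules differ.
REFS = {
    "refs/heads/master": "commit-1",
    "refs/tags/v2.6.18": "tagobj-1",    # tag via a tag object (indirection)
    "refs/tags/signing-key": "blob-1",  # tag pointing at a blob
}


def peel(obj_id):
    """Follow tag objects until a non-tag object is reached."""
    while OBJECTS[obj_id]["type"] == "tag":
        obj_id = OBJECTS[obj_id]["object"]
    return obj_id


def create_branch(name, start_ref):
    """Create a branch at whatever commit `start_ref` peels to."""
    target = peel(REFS[start_ref])
    if OBJECTS[target]["type"] != "commit":
        raise ValueError("a branch must point to a commit")
    REFS["refs/heads/" + name] = target
    return target


def checkout(name):
    """Only names in the branch namespace can become the active branch."""
    ref = "refs/heads/" + name
    if ref not in REFS:
        raise ValueError("%r is not a branch" % name)
    return ref


if __name__ == "__main__":
    create_branch("newbranch", "refs/tags/v2.6.18")
    print(checkout("newbranch"))
```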

But all of these kinds of differences between tags and branches are really 
not "core technology" and are purely about having adopted a convention. It 
is literally about just having certain "usage rules" for specific 
"symbolic namespaces".

"branch" and "tag" are just the normal namespaces git gives you and always 
has. You can have others too (and you can define your own) and those names 
will automatically be used for lookup by all the basic git tools. Git 
won't _touch_ those names in any other way, but it means that you can 
create your own tools around git that have their own rules about how the 
names are managed, and you can still use them for lookup.

For example, you could have a "svn" namespace for a project imported from 
svn, and that namespace would contain the SVN revision names for the 
project, so that you could do

	git diff svn/56..

to see the difference between "svn revision 56" and your current HEAD, 
without necessarily polluting the "real" git tag namespace.

(Which can matter, since some commands take arguments like "--tags", which 
just collects all the regular tags - so you might not want to use normal 
tags to remember your SVN revision mapping, even if it might technically 
be fine).

(The above was a totally made-up example. I don't think any of the svn 
importers actually do anything like that: but we do use a few other 
"namespaces" internally: "git bisect" puts the bisection results in the 
"bisect" namespace, and the "remotes" namespace can be used to track 
remote heads as something _different_ than a local branch - so that you 
won't check such a "remote branch" out directly by mistake)
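
A small Python sketch of the namespace idea: every namespace is just a prefix
in one name -> SHA1 table, so a lookup can find "svn/56" without it being a
regular tag. The ref names, the shortened ids and in particular the search
order in `SEARCH` are assumptions made for this illustration.

```python
REFS = {
    "refs/heads/master": "1111",
    "refs/tags/v1.0": "2222",
    "refs/svn/56": "3333",                 # made-up "svn" namespace
    "refs/remotes/origin/master": "4444",  # remote-tracking head
}

# Assumed lookup order for a short name; purely illustrative.
SEARCH = ["refs/{}", "refs/tags/{}", "refs/heads/{}", "refs/remotes/{}"]


def lookup(name):
    """Translate a short symbolic name into the id it stands for."""
    for pattern in SEARCH:
        full = pattern.format(name)
        if full in REFS:
            return REFS[full]
    raise KeyError(name)


def list_tags():
    """Collect only the regular tags, ignoring every other namespace."""
    return sorted(ref for ref in REFS if ref.startswith("refs/tags/"))


if __name__ == "__main__":
    print(lookup("svn/56"))
    print(list_tags())
```

Here `list_tags()` plays the role of an option that collects all the regular
tags: the `svn/56` entry stays usable for lookup but does not show up among
them.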

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  0:22                             ` Jakub Narebski
@ 2006-10-22  1:00                               ` Theodore Tso
  0 siblings, 0 replies; 806+ messages in thread
From: Theodore Tso @ 2006-10-22  1:00 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Erik Bågfors, Linus Torvalds, Sean, Jan Hudec, bazaar-ng,
	git, Matthieu Moy

On Sun, Oct 22, 2006 at 02:22:28AM +0200, Jakub Narebski wrote:
> If I understand correctly bzr came to life much earlier than Monotone,
> Mercurial and Git, but it stayed in beta for a very long time. Bazaar-NG
> "repositories" to group a bunch of "branches" seem inspired by hg or git.
> Git (and probably Mercurial) was inspired by both BitKeeper and Monotone.
> Monotone started to be reasonably fast around the time when Git and Mercurial
> came to be.

Yes, bzr predates Mercurial and Git; I remember talking to Martin Pool
about Bazaar-NG at the 2005 Linux.conf.au, which was before the BK
turnoff.  At the time, I had considered using bzr-ng (which has since
been renamed bzr), but it didn't have branch functionality at that
point if I remember correctly.  Both git and Mercurial started
development at almost the same time, right after Larry McVoy
announced the pending withdrawal of the BitKeeper no-cost license.

About one month after the announced BK turnoff date, I looked at the
various options for transitioning e2fsprogs, and at that point
Mercurial was **substantially** faster than bzr, and I believe
slightly ahead in features.  I also looked at git, but at that point
Hg was easier to learn how to use, and I figured for a project the
size of e2fsprogs, I didn't need the power of git, so I decided in
favor of Mercurial because it looked like it would be easier for
people to learn how to use it.

I think it's fair to say that the exchange of ideas has profited all
three projects, and that the different projects have different
strengths.

						- Ted

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
@ 2006-10-22  1:26                                                           ` Sean
  2006-10-22  1:26                                                           ` Sean
  2006-10-22  3:23                                                           ` Jeff Licquia
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22  1:26 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

On Sat, 21 Oct 2006 20:46:45 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> Which opinion is this?  The opinion that old-style local revnos aren't a
> big deal, or that new-style dotted revnos aren't a big deal?
> 
> I suspect you're conflating the two, and interpreting certainty for the
> former as certainty for the latter.  Though I don't mind being
> corrected.

The archives have all the posts of people claiming that there were no
issues with revno's and fully distributed models.  But it's okay, the
issue really isn't all that important in the big scheme of things.  Bzr
and Git have much more in common than they have differences.  I reject
the idea that revno's are an example of where bzr is superior to Git, but
there are no doubt examples where I would concede that bzr has the edge.

Cheers,
Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
  2006-10-22  1:26                                                           ` Sean
  2006-10-22  1:26                                                           ` Sean
@ 2006-10-22  3:23                                                           ` Jeff Licquia
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Jeff Licquia @ 2006-10-22  3:23 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sat, 2006-10-21 at 21:26 -0400, Sean wrote:
> On Sat, 21 Oct 2006 20:46:45 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > I suspect you're conflating the two, and interpreting certainty for the
> > former as certainty for the latter.  Though I don't mind being
> > corrected.
> 
> The archives have all the posts of people claiming that there were no
> issues with revno's and fully distributed models.  

"revno's"?  Which "revno's"? ...

OK.  So you are conflating the two.  Could someone who isn't comment?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2006-10-22  3:30                                                               ` Sean
@ 2006-10-22  3:30                                                               ` Sean
  2006-10-22 10:00                                                               ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22  3:30 UTC (permalink / raw)
  To: Jeff Licquia; +Cc: bazaar-ng, git

On Sat, 21 Oct 2006 23:23:37 -0400
Jeff Licquia <jeff@licquia.org> wrote:

> > The archives have all the posts of people claiming that there were no
> > issues with revno's and fully distributed models.  
> 
> "revno's"?  Which "revno's"? ...
> 
> OK.  So you are conflating the two.  Could someone who isn't comment?

No, actually I'm not.  Single revno's or your dotted revno's _both_ have the
same property.  They can only be local data and cannot guarantee stability
in a fully distributed environment.
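
One way to picture the claim for simple (non-dotted) revno's is the Python toy
below. It assumes, for illustration only, that a revno is the position of a
revision on the first-parent chain of a branch head; dotted revno's are not
modelled, and the names `revnos`, `alice` and `bob` are invented for the
example.

```python
def revnos(head, parents):
    """Number the first-parent chain of `head`, starting at 1 from the root."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    return {rev: n for n, rev in enumerate(reversed(chain), start=1)}


# History shared by both sides: two lines of work forked from "root".
SHARED = {"root": [], "a1": ["root"], "a2": ["a1"], "b1": ["root"]}

# Alice merges Bob's work into hers; Bob merges Alice's work into his.
alice = dict(SHARED, m_alice=["a2", "b1"])
bob = dict(SHARED, m_bob=["b1", "a2"])

alice_numbers = revnos("m_alice", alice)
bob_numbers = revnos("m_bob", bob)

if __name__ == "__main__":
    print(alice_numbers)  # "a1" is number 2 in Alice's branch
    print(bob_numbers)    # number 2 is "b1" in Bob's branch
```

Both sides hold exactly the same revisions, yet the number 2 names a different
revision in each branch, while the revision ids themselves are the same
everywhere.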

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:05                                               ` Aaron Bentley
  2006-10-21 20:48                                                 ` Jakub Narebski
       [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
@ 2006-10-22  7:45                                                 ` Jan Hudec
  2006-10-22  9:05                                                   ` Jakub Narebski
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-22  7:45 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: Carl Worth, Linus Torvalds, Andreas Ericsson, bazaar-ng, git,
	Jakub Narebski

On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
> Carl Worth wrote:
> > On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:
> [...]
> > But it really is fundamental and unavoidable that sequential numbers
> > don't work as names in a distributed version control system.
> 
> Right.  You need something guaranteed to be unique.  It's the revno +
> url combo that is unique.  That may not be permanent, but anyone can
> create one of those names, so it is decentralized.

But it is *not* *distributed*. The definition of a distributed system
requires, among other things, that resource identifiers be independent of
the location of the resources. So only using the revision-ids is really
distributed.

> >> I meant that the active branch and a mirror of the abandoned branch
> >> could be stored in the same repository, for ease of access.
> > 
> > Granted, everything can be stored in one repository. But that still
> > doesn't change what I was trying to say with my example. One of the
> > repositories would "win" (the names it published during the fork would
> > still be valid). And the other repository would "lose" (the names it
> > published would be not valid anymore). Right?
> 
> No.  It would be silly for the losing side to publish a mirror of the
> winning branch at the same location where they had previously published
> their own branch.  So the old number + URL combination would remain valid.

I regularly use bzr and I never used git. But I'd not hesitate a second
to pull --overwrite over the old location. Because the url has a meaning
"the base I develop against" for me and I'd want to preserve that
meaning.

> If the losing faction decided to maintain their own branch after the
> merge, they'd have two options
> 
> 1. continue to develop against the losing "branch", without updating its
> numbers from the "winning" branch.  It would be hard to tell who had won
> or lost in this case.
> 
> 2. create a new mirror of the "winning" branch and develop against that.
>  I'm not sure what the point of this would be.
> 
> I think the most realistic thing in this scenario is that they leave the
> "losing" branch exactly where it was, and develop against the "winning"
> branch.
> 
> >> Bazaar encourages you to stick lots and lots of branches in your
> >> repository.  They don't even have to be related.  For example, my repo
> >> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
> > 
> > Git allows this just fine. And lots of branches belonging to a single
> > project is definitely the common usage. It is not common (nor
> > encouraged) for unrelated projects to share a repository, since a git
> > clone will fetch every branch in the repository.
> 
> Right.  This is a difference between Bazaar and Git that I'd
> characterize as being "branch-oriented" vs "repository-oriented".  We'll
> see more of this below.

This is one of the things that I, on the other hand, like better in bzr
than in git, because it is really branches and not repositories that I
usually care about.

--------------------------------------------------------------------------------
                  				- Jan Hudec `Bulb' <bulb@ucw.cz>

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 17:31                                                       ` Linus Torvalds
  2006-10-21 17:38                                                         ` Linus Torvalds
@ 2006-10-22  7:49                                                         ` Tim Webster
  2006-10-22 17:12                                                           ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Tim Webster @ 2006-10-22  7:49 UTC (permalink / raw)
  To: git; +Cc: Aaron Bentley, bazaar-ng, Jakub Narebski

On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Sat, 21 Oct 2006, Aaron Bentley wrote:
> >
> > Any SCM worth its salt should support that.  AIUI, that's not what Tim
> > wants.  He wants to intermix files from different repos in the same
> > directory.
> >
> > i.e.
> >
> > project/file-1
> > project/file-2
> > project/.git-1
> > project/.git-2
>
> Ok, that's just insane.
[snip]
> Anyway. Git certainly allows you to do some really insane things. The
> above is just the beginning - it's not even talking about alternate object
> directories where you can share databases _partially_ between two
> otherwise totally independent repositories etc.


Perhaps this is insane, but it does not make sense to track all config
files in /etc as though they belong in a single repo. Each
application/pkg has a set of associated config files. In some
cases it is actually easy to track which files belong in each
application/pkg repo. For example, dpkg lists conffiles per pkg. Additional
config files not in the application/pkg maintainer repo branch are easily
added to the application/pkg local repo branch.

My question is where should file metadata be stored in git? With hook
scripts, the file metadata can be captured and applied appropriately.

If a similar thing can be done with bzr as Linus described for git, I
am all ears.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  7:45                                                 ` Jan Hudec
@ 2006-10-22  9:05                                                   ` Jakub Narebski
  2006-10-22  9:56                                                     ` Erik Bågfors
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22  9:05 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Aaron Bentley, Carl Worth, Linus Torvalds, Andreas Ericsson,
	bazaar-ng, git

Jan Hudec wrote:
> On Sat, Oct 21, 2006 at 04:05:18PM -0400, Aaron Bentley wrote:
>> Carl Worth wrote:
>>> On Thu, 19 Oct 2006 21:06:40 -0400, Aaron Bentley wrote:

>>>> Bazaar encourages you to stick lots and lots of branches in your
>>>> repository.  They don't even have to be related.  For example, my repo
>>>> contains branches of bzr, bzrtools, Meld, and BazaarInspect.
>>> 
>>> Git allows this just fine. And lots of branches belonging to a single
>>> project is definitely the common usage. It is not common (nor
>>> encouraged) for unrelated projects to share a repository, since a git
>>> clone will fetch every branch in the repository.
>> 
>> Right.  This is a difference between Bazaar and Git that I'd
>> characterize as being "branch-oriented" vs "repository-oriented".  We'll
>> see more of this below.
> 
> This is one of the things that I, on the other hand, like better in bzr
> than in git, because it is really branches and not repositories that I
> usually care about.

That's probably because you are used to Bazaar-NG, and it's your habits
speaking. Think of a git clone of a repository as a bzr "branch".

For example git encourages using many short and longer-lived feature
branches; I don't see bzr encouraging this workflow.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:05                                                   ` Jakub Narebski
@ 2006-10-22  9:56                                                     ` Erik Bågfors
  2006-10-22 13:23                                                       ` Jakub Narebski
  2006-10-22 14:25                                                       ` Carl Worth
  0 siblings, 2 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-22  9:56 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth, Jan Hudec, git

> For example git encourages using many short and longer-lived feature
> branches; I don't see bzr encouraging this workflow.

Why not? I think it really does.  And due to the fact that merges are
merges and will show up as such, I think it's very suitable for
feature branches.

In fact, in the development of bzr itself, all commits are done
in feature branches and then merged into bzr.dev (the main "trunk" of
bzr) when they are considered stable.

Consider the following:
bzr branch mainline featureA
cd featureA
hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc
Now I want to merge in mainline again
bzr merge ../mainline; bzr commit -m merge
hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

Right now, I would have something like this in the branch log
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f4
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f3
----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   merge
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something done in mainline
      -----------------------------------------------------------------
      committer: Foo Bar <foo@bar.com>
      branch nick: mainline
      message:
         something else done in mainline
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f2
-----------------------------------------------------------------
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: featureA
message:
   f1

In this view, I can easily see what was part of this feature branch,
because the commits that belong to the feature branch are not
indented, and they have a "branch nick" of "featureA".  I can also
easily see what comes from other branches.

I can also run bzr log with --line or --short, which shows only the
commits made in this branch and not the ones that are merged in.  So
with --line I would get something like:
Erik Bågfors 2006-10-19 f4
Erik Bågfors 2006-10-19 f3
Erik Bågfors 2006-10-19 merge
Erik Bågfors 2006-10-19 f2
Erik Bågfors 2006-10-19 f1

Which will give me a good view of what has been done in this feature
branch only.
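
As a rough illustration of the two views described above (a toy model,
not bzr's actual code): the sketch below follows first parents from the
head to list this branch's own commits, and indents whatever each merge
brought in.  The PARENTS table, the revision names and the log_lines
helper are invented for the example; "base" stands in for the mainline
tip that featureA was branched from.

```python
# Toy history: each revision maps to its parents; the first parent is the
# revision this branch was at when the commit was made.
PARENTS = {
    "base": [],
    "f1": ["base"],
    "f2": ["f1"],
    "m1": ["base"],           # done in mainline
    "m2": ["m1"],             # done in mainline
    "merge": ["f2", "m2"],    # featureA merges mainline
    "f3": ["merge"],
    "f4": ["f3"],
}


def ancestors(rev):
    """Return rev plus everything reachable from it through any parent."""
    seen = set()
    stack = [rev]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def log_lines(head, short=False):
    """List revisions newest first; merged-in revisions are indented."""
    lines = []
    rev = head
    while rev is not None:
        lines.append(rev)
        parents = PARENTS[rev]
        if not short and len(parents) > 1:
            already_here = ancestors(parents[0])
            merged = set()
            for other in parents[1:]:
                merged |= ancestors(other) - already_here
            # A descendant always has more ancestors than its ancestor,
            # so sorting by ancestor count puts newer revisions first.
            for r in sorted(merged, key=lambda r: (-len(ancestors(r)), r)):
                lines.append("    " + r)
        rev = parents[0] if parents else None
    return lines


print("\n".join(log_lines("f4")))
print("\n".join(log_lines("f4", short=True)))
```

With short=True only the first-parent chain is listed, which corresponds
to the --line view of "what has been done in this feature branch only".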

If I understand it correctly, in git, you don't really know what has
been committed as part of this branch/repo, and what has been
committed in another branch/repo (this is my understanding from
reading this thread, I might be wrong, feel free to correct me again
:) )

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
  2006-10-22  3:30                                                               ` Sean
  2006-10-22  3:30                                                               ` Sean
@ 2006-10-22 10:00                                                               ` Matthew D. Fuller
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 10:00 UTC (permalink / raw)
  To: Sean; +Cc: Jeff Licquia, bazaar-ng, git

On Sat, Oct 21, 2006 at 11:30:14PM -0400 I heard the voice of
Sean, and lo! it spake thus:
> On Sat, 21 Oct 2006 23:23:37 -0400
> Jeff Licquia <jeff@licquia.org> wrote:
> > 
> > OK.  So you are conflating the two.  Could someone who isn't
> > comment?
> 
> No, actually i'm not.  Single revno's or your dotted revno's _both_
> have the same property.

I think Jeff's actually meaning the other way around.  We're confident
through experience of the utility of the single revnos.  We're NOT (at
least, I'm not) so convinced of the utility and usability of the
dotted ones; they haven't gone through the crucible of experience yet.

During the dotted-decimal discussion, I favored numbering from the
merge point (rather than the ancestral point) for a lot of the same
reasons brought up here.  e.g., the log-ish output would look
something like:

200
199
 199.3
 199.2
 199.1
198
[...]

See <https://lists.ubuntu.com/archives/bazaar-ng/2006q3/017773.html>
for instance.

Of course, now we have them, and they number from ancestors.  So
after that's in a couple releases, we'll get to see how it works in
practice.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
@ 2006-10-22 11:44                                                                   ` Sean
  2006-10-22 11:44                                                                   ` Sean
  2006-10-22 13:03                                                                   ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22 11:44 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Jeff Licquia, bazaar-ng, git

On Sun, 22 Oct 2006 05:00:28 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> I think Jeff's actually meaning the other way around.  We're confident
> through experience of the utility of the single revnos.  We're NOT (at
> least, I'm not) so convinced of the utility and usability of the
> dotted ones; they haven't gone through the crucible of experience yet.
> 

Yes, that's the way I took what he said as well.

Bzr revnos (dotted or otherwise) cannot be guaranteed to be stable
in a truly distributed system.  Now it's clear that you folks
just don't really care about that and you're happy enough that they
work out fine for your uses.  That's a fair enough decision to make;
there's no law that says you have to care about the situations where
there will be clashes and/or the numbers will change.  Git makes
a different choice, and for my money it's a better choice.
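
To make the clash concrete, here is a minimal sketch under a simplifying
assumption: a revno is just a revision's position on the first-parent
chain of whichever branch you ask.  This is a toy model, not bzr's
implementation; the PARENTS table, the revision names and both helper
functions are made up for illustration.

```python
# Two developers start from r1, each commits once, then each merges the other.
PARENTS = {
    "r1": [],
    "a": ["r1"],              # Alice's commit
    "b": ["r1"],              # Bob's commit
    "m_alice": ["a", "b"],    # Alice merges Bob; her own work is first parent
    "m_bob": ["b", "a"],      # Bob merges Alice; his own work is first parent
}


def mainline(head):
    """Follow first-parent links from head back to the origin, oldest first."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        parents = PARENTS[rev]
        rev = parents[0] if parents else None
    return list(reversed(chain))


def revnos(head):
    """Number the mainline revisions 1, 2, 3, ... as seen from this head."""
    return {rev: n for n, rev in enumerate(mainline(head), start=1)}


alice = revnos("m_alice")
bob = revnos("m_bob")

# Revno 2 names a different revision in each branch.
print(mainline("m_alice")[1], mainline("m_bob")[1])
# "a" has a plain revno for Alice but none on Bob's mainline.
print("a" in alice, "a" in bob)
```

Both branches contain exactly the same revisions, yet "revision 2" means
"a" to Alice and "b" to Bob, which is the kind of clash meant above.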

Cheers,
Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
  2006-10-21 20:55                                                   ` Jakub Narebski
  2006-10-21 23:07                                                   ` Jeff Licquia
@ 2006-10-22 12:46                                                   ` Matthew D. Fuller
  2006-10-22 13:51                                                     ` Jakub Narebski
  2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 12:46 UTC (permalink / raw)
  To: Carl Worth; +Cc: bazaar-ng, git

[ Time to trim up CC's a bit ]

On Sat, Oct 21, 2006 at 01:47:08PM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
> On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> > I think we're getting into scratched-record-mode on this.
> 
> I apologize if I've come across as beating a dead horse on this.

Oh, I don't mean the whole topic in general.  It's just that there are
only so many ways one can say "revnos are only valid in certain
situations", and I really think we must have hit them all by now.  We
all agree on that; we just disagree (probably highly based on
differing workflows) on the commonness and extent of those situations.


> > B: Revnos are handier tools for [situation] and [situation] for
> >    [reason] and [reason].
> 
> I'm missing something:
> 
> I still haven't seen strong examples for this last claim. When are
> they handier?

This ties in a bit with what you say below, so I'll address it there.


> There's no doubt that there has been semantic confusion over the
> term branch that has been confounding communication on both sides.
  [...]
> Let me know if I botched any of that.

This seems correct; at least, it's correct enough to work from until
we find a detail wrong.


> But dropping a merged branch in bzr means throwing away the ability to
> reference any of its commits by its custom, branch-specific revision
> numbers.

True (though see below).


> And there is no simple way to correlate the numbers between
> branches.

Rather, unless you can one way or another access the branch the number
was for, there's NO way.


> Maybe you can argue that there isn't any centralization bias in bzr.
> But anyone that claims that the revnos. are stable really is talking
> from a standpoint that favors centralization.

I think it's using that 'c' word there that's causing contention here;
we're ascribing different meanings to it.

Revnos only apply to a specific "branch" (in this usage, I'm talking
about branch abstractly and somewhat specifically; more in a moment),
and so except by wild coincidence are only useful in talking about
that branch.  One of the two cases (the second discussed later) where
that's useful is when you have long-lived branches.  In git,
apparently, you don't have long-lived "branches" in this particular
meaning of the word, but the way people use bzr they do.  Perhaps this
is what you mean by 'centralization'.

That long-lived branch doesn't have to be any sort of "trunk", though
it usually is; it could as easily be something totally peripheral.


Now, details of that use of "branch".  In mathematical terms, a branch
may be defined purely by its head rev (and the graph built up by
recursing through all the parents), but in [bzr] UI and mental model
terms, a "branch" is that plus its mainline[0]; the left-most or first
line of descent, which colloquially is the difference between 'things
I commit' and 'things I merge'.

Let me try flexing my git-expression muscles here.  Given a branch at
a specific point in time, you point at the head rev, and there's a
subset we call 'mainline' of the whole set of parents, which is
expressed by following the 'first' parent pointers back to a single
origin (there can be 50 origins in the whole graph, of course, but
only one of them is on the 'mainline').  At some later time, more
revisions have been added to the graph, and the head rev is now
something "later".  If, at that later time, all the nodes which were
previously on that 'mainline' are still on it tracing back from the
new head, then in the sense I'm using "branch", it's still the same
"branch".  All the revnos referring to its earlier incarnation are
still valid for this one (though there are new ones tacked onto the
end; that doesn't affect the pre-existing ones).

[I THINK we all understand that, but just making sure]
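
The "same branch" test described above can be sketched in a few lines.
This is only a toy model, assuming history is a mapping from each
revision to an ordered list of parents; the names mainline, same_branch
and the revision ids are hypothetical and are not bzr or git APIs.

```python
def mainline(parents, head):
    """Return the first-parent chain from the origin up to head, oldest first."""
    chain = []
    rev = head
    while rev is not None:
        chain.append(rev)
        rev = parents[rev][0] if parents[rev] else None
    chain.reverse()
    return chain


def same_branch(parents, old_head, new_head):
    """True if every revision on the old mainline keeps its position (revno)."""
    old = mainline(parents, old_head)
    new = mainline(parents, new_head)
    return new[:len(old)] == old


history = {
    "r1": [],
    "r2": ["r1"],
    "r3": ["r2"],        # old head
    "s1": ["r2"],        # work done elsewhere
    "r4": ["r3", "s1"],  # merge with r3 as first parent
    "x1": ["s1", "r3"],  # merge with s1 as first parent
}

print(same_branch(history, "r3", "r4"))  # old revnos 1..3 are still valid
print(same_branch(history, "r3", "x1"))  # r3 is no longer on the mainline
```

Both r4 and x1 contain the same set of revisions; only the order of the
parents differs, and that alone decides whether the old numbers survive.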


[0] This probably causes some confusion too, since I know I'm guilty
    of using the word 'mainline' both in the sense of a 'trunk'
    branch, and this particular path through one branch.  _I_ think
    it's usually clear from context, but I guess it probably isn't for
    those with a different mental modeling of "branch".


> To illustrate, yesterday I gave an example where performing a bzr
> branch from a dotted-decimal revision would rewrite the numbers from
> the originating branch (1.2.2, 1.2.1, and 1) to unrelated numbers in
> the new branch (3, 2, 1).

One thing to note here is that that 1.2.1 and 1.2.2 came into your
first branch here by merging from another branch (call that branch
'b').  When you created your new branch here that now has (3,2,1),
those numbers are the same as the numbers that existed locally in 'b'
at the time 1.2.2 was its 'head'.  In a sense, then, you've just
recreated [a copy of] "branch" 'b' at that time.  So, in a way, by
taking a copy of the current bzr.dev branch, you can recreate the
entire state of any branches that were merged into it as of the time
they were merged (excluding cases of cherrypicking, or when merging
prior to the head of those branches of course).


> But then I realized why bzr is doing this. It's because, bzr users
> don't just use the revision numbers for external communication, but
> they also use them for lots of direct interaction with the tool. The
> rewriting makes it easy to write something like "bzr diff -r1..3".

This is an instance of the second case (first above) where the revnos,
applying just to one branch, become useful.  And, it's probably the
case I'm most attached to.

The great majority (I'd say easily 80%) of my references to revisions
are transient.  Most of 'em have probably exhausted their usefulness
in an hour; many of them (as in interaction with the tool you
mentioned) in just a couple seconds.  Virtually all my branches live
longer than that, so the limited lifespan of the numbers in the grand
scheme doesn't matter a whit.

So, from above, some of the places they're handier:

- Typing.  I know, copy and paste copies and pastes one string just as
  well as another, and long strings just as well as short.  But I
  don't want to copy&paste; I want to ^Z out of log and run a quick
  diff, between two revisions only one of which is on my screen at the
  time.  I can just remember the offscreen revno I'm comparing
  against, and it's very easy to quickly type the numbers,
  particularly since 95% of the time I'm comparing mainline revs so I
  don't even have to think about dotted forms.


- Some forms of communicating.  I can yell numbers across the room
  without concern about whether they'll be interpreted right.  Even 6
  digits of an SHA-1 hash are a lot harder to do that with.  I can
  hold revnos in my head while I walk down the hall to talk to
  somebody about them, or pick up a phone, or go to a meeting.  I can
  scrawl them on notepads or whiteboards.  In all these cases, the
  only reason for which I'm communicating that revno will be exhausted
  very shortly, so it's completely irrelevant whether it's meaningful
  in 5 years, or next week.


- Visual comparing (this is one that's useful on the long-lived
  branches, as well as transient stuff) and information gathering.  I
  can hold in my head "Yeah, I looked at 1350 of Joe's branch", and if
  I see an email from him "Oh, I fixed a bug in 1358" or "in 1293", I
  can know just from that whether I saw the fix or not.

  If somebody says "I introduced a bug in revision 3841, and fixed it
  in 3843", I know the window where that bug is in play is probably
  pretty small, whereas "introduced in 3841, fixed in 5337" tells me
  it was alive a looong time.

  bzr.dev is currently on revno 2091.  I didn't know that, I had to
  look it up.  But I knew it was a little past 2000, just from loosely
  watching it.  If somebody talks about something that happened in
  revno 1800, I know automatically "That was fairly recent", compared
  to talking about revno 75, where I know "Wow, that was a long time
  ago".

  This property is true of bzr revids as well.  If I see talk about
  revision "mbp@sourcefrog.net-20050520021228-bc46a17f07eff7f9", I
  know right away Martin committed it, and it was a year and a half
  ago.  If I see talk about an oops in revision "af38cc3", that just
  tells me that somebody screwed up, and it gets mentally filed away
  or goes in one eye and out the other.  But if I see talk about an
  oops in revision "fullermd@over-yonder.net-[...]", that rings bright
  blue bells that tell me that *I* screwed up and I need to jump on
  that right now.  In sufficiently small projects with sufficiently
  discrete task division, I may even be able to guess offhand based on
  the person and date what bit of functionality the commit references,
  though that's a much lower probability.

  It can also be useful in looking at cases where you don't
  necessarily have the tool.  Compare putting CVS's rcsid tags in
  strings in the source: static const char *rcsid = "$Id$"; and the
  like.  Then you can use 'ident' on the compiled binaries to see the
  revs of files in them.  If somebody says "foo.c has a bug in 1.34,
  fixed in 1.37", I can without any VCS interaction just look at the
  compiled binary and tell whether I'm prior to the bug, have the bug,
  or after the fix.  If the binary is known to be compiled from a
  particular branch, a tree-wide revno tells me that too.  A revid
  (even one containing a date) won't tell me that; I'll have to find
  the tool and a copy of the tree and find out if my rev contains that
  other rev.

  Now, on any given revision reference, I probably don't care about
  most of those bits of info.  I may not care about any of them, but I
  often care about at least one or two.  And we all probably have
  wildly varying appraisals of the commonness of various of the
  situations described.  And yes, a lot of them are just mental
  heuristics.  Sure, with a completely opaque id, I could pull up the
  tool to look up any of those (and a lot more information besides);
  the gain is I don't HAVE to.  Just knowing some bit of that info can
  often tell me if I don't care to investigate whatever the revision
  is being referenced for at all, or that I need to put doing so at
  the top of my priority list.



> And it turns out that git also allows branch specific naming for the
> exact same reason. In place of 3, 2, 1 in the same situation git
> would allow the names HEAD, HEAD~1, and HEAD~2 to refer to the same
> three revisions. So the easy diff command would be "git diff HEAD~2
> HEAD".

In bzr, that would be "bzr diff -r-2..-1" (or just "-r-2.." since
open-ended revspecs pretty much work like you'd expect them to).  IME,
that only works well maybe 4 or 5 revs back; past that, you spend too
much time counting, and it's easier to just whack in the number from
log.

bzr _doesn't_, OTOH, have anything like HEAD^2, for selecting
alternate parent paths.  That's probably use-pattern bias; we hardly
ever do something like that, so it's never occurred to us to add the
ability to.


> Maybe some of the people that dislike git's "ugly" names so much is
> that they imagine that to compare two revisions a user of git must
> inspect the logs, fish out the sha1sum for each, and then
> cut-and-paste to create the command needed.

I do imagine that.  And I think I'd hit it, since I often look around
revs that aren't right near the tip; trying to figure out
"HEAD~293..HEAD~38" is even worse than excavating the sha1sum's.



-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
  2006-10-22 11:44                                                                   ` Sean
  2006-10-22 11:44                                                                   ` Sean
@ 2006-10-22 13:03                                                                   ` Matthew D. Fuller
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:03 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 07:44:22AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Bzr revnos (dotted or otherwise) can not be guaranteed to be stable
> in a truly distributed system.

Perhaps the difference is that we're making a [fine] distinction
between "useful in a truly distributed system" and "useful when
WORKING in a truly distributed system".  cworth's point back up a few
posts is good; nearly all of my use of revnos is in direct interaction
with the tool, where the revnos just came from looking at the history.
And of those uses that aren't in that class, nearly all of THOSE are
very transient.  Non-local (in time or space) stability in either of
those cases is a total non-concern.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:56                                                     ` Erik Bågfors
@ 2006-10-22 13:23                                                       ` Jakub Narebski
  2006-10-22 14:11                                                         ` Erik Bågfors
  2006-10-22 14:25                                                       ` Carl Worth
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 13:23 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jan Hudec, bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

Erik Bågfors wrote:
> Jakub Narębski wrote:

>> For example git encourages using many short and longer-lived feature
>> branches; I don't see bzr encouraging this workflow.
> 
> Why not? I think it really does.  And due to the fact that merges are
> merges and will show up as such, I think it's very suitable for
> feature branches.

I think I haven't properly explained what "feature branch" means.
A "feature branch" is a short (or medium) lived branch, created for
the development of one isolated feature. When the feature reaches a
stable stage, we merge the feature branch and forget about it. We are
not interested in the fact that a given feature was developed on a
given branch. BTW, in the published git.git repository, for example,
feature branches are only available in the form of the "digest" 'pu'
(proposed updates) branch.

I guess what you are talking about are long-lived "development
branches" (like the git.git 'maint', 'master', 'next' and 'pu' branches),
or perhaps another user's long-lived clone of a given git repository.

Git considers clones of a given repository totally equivalent, and
considers the fast-forward property more important than remembering
"which branch (which clone) did this commit come from" or at least
"this commit is from this (current) branch-clone".

You have graphical history viewers (bzr has its own: bzr-gtk),
committer and author info, and the reflog (if enabled) if you really,
really need this kind of information.

> In fact, in the development of bzr itself, all commits are done
> in feature branches and then merged into bzr.dev (the main "trunk" of
> bzr) when they are considered stable.
> 
> Consider the following
> bzr branch mainline featureA
Which, if I remember correctly, needs and generates a new working tree
(at least by default).

> cd featureA
> hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc
> Now I want to merge in mainline again
> bzr merge ../mainline; bzr commit -m merge
> hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

As was clarified during this long discussion, bzr "branches" are
something between git branches and one-branch [local] clones.
Can you, for example, create a branch starting from an arbitrary
revision, not only from the tip of a branch?

The above sequence of operations can be done in (at least) two different
ways in git.

Less used:
 $ cd /somewhere/else
 $ git clone -l -s <mainrepo>/.git featureA
 $ cd featureA
 $ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc   
 $ cd <mainrepo>
 $ git pull /somewhere/else/featureA/.git
 (this does commit and merge)

But more commonly used is:
 $ git branch featureA mainline
 $ git checkout featureA
 $ hack; hack; git commit -a -m "f1"; hack; hack; git commit -a -m "f2"; etc
 $ git checkout mainline
 $ git pull . featureA
 (although this would fast-forward in this example)

> Right now, I would have something like this in the branch log:
> -----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    f4
> -----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    f3
> ----------------------------------------------------------------
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: featureA
> message:
>    merge
>       -----------------------------------------------------------------
>       committer: Foo Bar <foo@bar.com>
>       branch nick: mainline
>       message:
>          something done in mainline
>       -----------------------------------------------------------------
>       committer: Foo Bar <foo@bar.com>
>       branch nick: mainline
>       message:
>          something else done in mainline
The automatic merge message takes care of this, if we enable the
merge.summary config option. For example:

commit 2c8a02263c13c6e1891e9e338eb40a4286b613e5
Merge: 2492932... 87b787a...
Author: Jakub Narebski <jnareb@gmail.com>
Date:   Sat Oct 21 13:23:19 2006 +0200

    Merge branch 'master' of git://git.kernel.org/pub/scm/git/git
    
    * 'master' of git://git.kernel.org/pub/scm/git/git:
      git-clone: define die() and use it.
      Fix typo in show-index.c
      pager: default to LESS=FRS


Another example, this time of "octopus" merge.

commit ff49fae6a547e5c70117970e01c53b64d983cd10
Merge: 7ad4ee7... 75f9007... 14eab2b... 0b35995... eee4609...
Author: Junio C Hamano <junkio@cox.net>
Date:   Fri Oct 20 18:56:14 2006 -0700

    Merge branches 'jc/diff', 'jc/diff-apply-patch', 'jc/read-tree' and 'pb/web' into pu
    
    * jc/diff:
      para walk wip
      para-walk: walk n trees, index and working tree in parallel
    
    * jc/diff-apply-patch:
      git-diff/git-apply: make diff output a bit friendlier to GNU patch (part 2)
    
    * jc/read-tree:
      merge: loosen overcautious "working file will be lost" check.
    
    * pb/web:
      gitweb: Show project README if available

That said, we couldn't do that in the above-mentioned example,
as it is a simple case of fast-forward. We get the above messages
for "true merges" of two _diverging_ lines of development,
and we could use a similar format for "git log". In practice
we rather use history viewers: gitk, qgit, tig, git-show-branch.

For example:
$ git show-branch origin next
! [origin] git-clone: define die() and use it.
 ! [next] Merge branch 'master' into next
--
 - [next] Merge branch 'master' into next
++ [origin] git-clone: define die() and use it.

> If I understand it correctly, in git, you don't really know what has
> been committed as part of this branch/repo, and what has been
> committed in another branch/repo (this is my understanding from
> reading this thread, I might be wrong, feel free to correct me again
> :) )

You can browse the reflog to find out which changes were committed
as part of this repo, and which came from another repo (another clone
of this repo).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
  2006-10-22 13:28                                                                       ` Sean
@ 2006-10-22 13:28                                                                       ` Sean
  2006-10-22 13:33                                                                       ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22 13:28 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:03:22 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Perhaps the difference is that we're making a [fine] distinction
> between "useful in a truly distributed system" and "useful when
> WORKING in a truly distributed system".  cworth's point back up a few
> posts is good; nearly all of my use of revnos is in direct interaction
> with the tool, where the revnos just came from looking at the history.
> And of those uses that aren't in that class, nearly all of THOSE are
> very transient.  Non-local (in time or space) stability in either of
> those cases is a total non-concern.

Sure, but if they're just a local feature then why propagate them with
the distributed data?  If they're meant only to be used locally,
they can be guaranteed to be stable by never replicating
them, with obvious benefits for the local user.  However, bzr makes the
(IMO) mistake of including them in the data that is distributed
between repos.  This suggests the bzr team just doesn't care about the
distributed models where this will not help and will quite possibly
lead to frustration and confusion.  And yes, I know that you
haven't seen those situations yourself yet.  Obviously, it's the
bzr team's trade-off to make, but if an avid user like yourself thinks
of revnos as local, perhaps they've made the wrong choice.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
  2006-10-22 13:28                                                                       ` Sean
  2006-10-22 13:28                                                                       ` Sean
@ 2006-10-22 13:33                                                                       ` Matthew D. Fuller
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:33 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 09:28:45AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Sure, but if they're just a local feature then why propagate them
> with the distributed data?

Because they're 'local' to a given "branch"; see my message to cworth
a little while ago for expansion of the rather particular meaning of
the word used here.  If somebody takes a clone of my _branch_, it's
the same "branch", so the numbers will be the same (and that's
desired).


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
@ 2006-10-22 13:40                                                                           ` Sean
  2006-10-22 13:40                                                                           ` Sean
  2006-10-22 13:57                                                                           ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22 13:40 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:33:36 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> Because they're 'local' to a given "branch"; see my message to cworth
> a little while ago for expansion of the rather particular meaning of
> the word used here.  If somebody takes a clone of my _branch_, it's
> the same "branch", so the numbers will be the same (and that's
> desired).

The fact is that once you start distributing them to other repositories
you CAN NOT GUARANTEE their stability.  Those numbers may already be
used by _HIS_ branch and when he tries to get _YOUR_ branch.. there
is a conflict.  AND THERE IS NOTHING YOU CAN DO TO FIX THAT.  It's
a fundamental flaw with distributing revnos.  The reason you likely
haven't seen a problem so far is that the bzr world seems to favor
the use of a central server that has the effect of more or less
synchronizing branch numbers to most of the nodes in the system.
However, that's only one model.  So while you may not have seen a
problem yourself, there are _inherent_ limitations of the system
you've embraced.

But it seems like nobody on the bzr team cares or wants to hear about
it, so let's just move on.

Cheers,
Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 12:46                                                   ` Matthew D. Fuller
@ 2006-10-22 13:51                                                     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 13:51 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

>   It can also be useful in looking at cases where you don't
>   necessarily have the tool.  Compare putting CVS's rcsid tags in
>   strings in the source.  static const char *rcsid = "$Id$"; and the
>   like.  Then you can use 'ident' on the compiled binaries to see the
>   revs of files in them.  If somebody says "foo.c has a bug in 1.34,
>   fixed in 1.37", I can without any VCS interaction just look at the
>   compiled binary and tell whether I'm prior to the bug, have the bug,
>   or after the fix.  If the binary is known to be compiled from a
>   particular branch, a tree-wide revno tells me that too.  A revid
>   (even one containing a date) won't tell me that; I'll have to find
>   the tool and a copy of the tree and find out if my rev contains that
>   other rev.

We use signed tags for tagging official releases (e.g. the v1.4.0 tag),
and we embed "git describe" output in the resulting binary at build
time. For example, the current output of git-describe on my clone of
the git repository is:

 $ git describe 
 v1.4.3.1-g2c8a022

The Git project does this, gitweb does this, the Linux kernel does this.
This is quite coarse-grained, i.e. you know which released version
it is after, but you need git tools (or access to git tools via
gitweb) to check if it is after or before the fix.
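
As a rough illustration of how coarse that information is, here is a
minimal Python sketch that splits such a string into its parts. The
function name parse_describe is made up for this note; the pattern
accepts the "<tag>-g<sha>" form shown above as well as the
"<tag>-<count>-g<sha>" form printed by later versions of git.

```python
import re

# "<tag>-g<abbrev-sha>" (the form shown above) or the later
# "<tag>-<count>-g<abbrev-sha>"; a commit exactly on a tag prints only the tag.
DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+?)(?:-(?P<count>\d+))?-g(?P<sha>[0-9a-f]{4,40})$"
)

def parse_describe(text):
    """Split `git describe` output into (tag, commits_since_tag, abbrev_sha).

    commits_since_tag is None when the output does not carry a count.
    """
    text = text.strip()
    m = DESCRIBE_RE.match(text)
    if m is None:
        # No "-g<sha>" suffix: the commit is the tagged release itself.
        return text, 0, None
    count = int(m.group("count")) if m.group("count") else None
    return m.group("tag"), count, m.group("sha")

tag, count, sha = parse_describe("v1.4.3.1-g2c8a022")
print(tag, sha)
```

The binary therefore tells you the nearest release and an abbreviated
commit name, but nothing about whether a particular fix is an ancestor
of that commit; that question still needs the repository.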

Of course that is when you run GIT version of tool...
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
  2006-10-22 13:40                                                                           ` Sean
  2006-10-22 13:40                                                                           ` Sean
@ 2006-10-22 13:57                                                                           ` Matthew D. Fuller
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 13:57 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 09:40:41AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> The fact is that once you start distributing them to other
> repositories you CAN NOT GUARANTEE their stability.

Terminology.  When those revisions get distributed to other BRANCHES,
their stability is forfeit.  We know.  We don't care.  We only care
about the numbers on ONE BRANCH.


> Those numbers may already be used by _HIS_ branch and when he tries
> to get _YOUR_ branch.. there is a conflict.

Terminology again.  When he has his branch and gets my branches, he
has two branches, mine and his, side by side, and the numbers in his
'my' branch still correspond to the numbers in my 'my' branch.  When
he merges the REVISIONS from my branch into his, my numbers have no
meaning on his side (there's not a 'conflict' because numbers don't
get copied, they get derived).
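
A toy Python model of "derived, not copied" may help here. It is only
a sketch under an assumption: that a branch's revno is the position of
a revision along the first-parent chain from the branch tip. The
revision ids and the function name revnos are invented, and this is
not bzr's actual code.

```python
# Toy revision graph: revision id -> list of parent ids.  The first parent
# is taken to be the branch's own previous tip.  All ids are hypothetical.
PARENTS = {
    "base": [],
    "mine-1": ["base"],
    "mine-2": ["mine-1"],
    "his-1": ["base"],
    "his-merge": ["his-1", "mine-2"],  # he merges my revisions into his branch
}

def revnos(tip):
    """Derive {revno: revision id} for the branch whose tip is `tip`."""
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        parents = PARENTS[rev]
        rev = parents[0] if parents else None
    chain.reverse()  # oldest first, so revno 1 is the root
    return {n: r for n, r in enumerate(chain, start=1)}

mine = revnos("mine-2")          # my branch, or any clone of it
his = revnos("his-merge")        # his branch after merging my revisions
print(mine)
print(his)
```

In this model a clone of my branch has the same tip and so derives the
same numbers, while on his branch revno 3 names his merge and my
revisions get no mainline number at all.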


> So while you may not have seen a problem yourself,

You keep insisting that there's a PROBLEM here.  You're right, I don't
see one.  I KNOW the numbers only refer to a branch, I KNOW that when
you're talking about a different branch the numbers are meaningless,
and I'm perfectly fine with that because referring to revisions on *A*
branch is exactly what I USE the numbers for.

There doesn't have to be a 'central' branch, nor is there any wish for
such to be.  Any given revno only refers to *A* branch, it doesn't
have to be central to a darn thing.  HEAD in git only has meaning in
the context of *A* branch (and even 'worse', only refers to that
branch at a specific time[0]), but you'll keep on using it every day
anyway I wager.



[0] See again particular term of art "branch".


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 13:23                                                       ` Jakub Narebski
@ 2006-10-22 14:11                                                         ` Erik Bågfors
  2006-10-22 14:39                                                           ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-22 14:11 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Linus Torvalds, Andreas Ericsson, Carl Worth, Jan Hudec, git

On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Erik Bågfors wrote:
> > Jakub Narębski wrote:
>
> >> For example git encourages using many short and longer-lived feature
> >> branches; I don't see bzr encouraging this workflow.
> >
> > Why not? I think it really does.  And due to the fact that merges are
> > merges and will show up as such, I think it's very suitable for
> > feature branches.
>
> I think I haven't properly explained what "feature branch" means.
> A "feature branch" is a short (or medium) lived branch, created for
> the development of one isolated feature. When the feature reaches a
> stable stage, we merge the feature branch and forget about it. We are
> not interested in the fact that a given feature was developed on a
> given branch. BTW, in the published git.git repository, for example,
> feature branches are only available in the form of the "digest" 'pu'
> (proposed updates) branch.


That's what I'm talking about too.
For example, in my bzr bzr-repo I have
bzr.init-repo-tree/
bzr.aliases/
bzr.dev/

and others...
In bzr.aliases for example, I built the support for defining aliases
in the bzr config file. That was a unique feature that didn't exist in
any other branch.  The branch survived about 17 days before it was
merged into bzr.dev.  During that time, I merged in another branch
twice.  The branch I merged at this time was NOT bzr.dev, but rather
another branch, from one of the main developers.  The reason I merged
his branch was that I needed a bugfix (or two? :) ) that he had done,
but that wasn't approved in bzr.dev yet.

After a time, his branch was merged into bzr.dev, shortly thereafter,
so was my branch.

After my branch was merged, I forgot about it.  I still have it lying
around on my computer because it really doesn't take up any extra
space (since it's in a shared repository), but I really have forgotten
about it.

This is typically how all features in bzr are created.
Short/medium/long-lived feature branches.

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2006-10-22 14:24                                                                               ` Sean
@ 2006-10-22 14:24                                                                               ` Sean
  2006-10-22 14:56                                                                               ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-10-22 14:24 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

On Sun, 22 Oct 2006 08:57:02 -0500
"Matthew D. Fuller" <fullermd@over-yonder.net> wrote:

> You keep insisting that there's a PROBLEM here.  You're right, I don't
> see one.  I KNOW the numbers only refer to a branch, I KNOW that when
> you're talking about a different branch the numbers are meaningless,
> and I'm perfectly fine with that because referring to revisions on *A*
> branch is exactly what I USE the numbers for.

Light goes on.  Okay.  So a bzr "branch" is only ever editable on a 
single machine.  So there is no distributed development on top of a 
bzr "branch".  Everyone else just has read-only copies of it.  In this
way you ensure that there is never a conflict of the revno's.  I'm not
sure of the ramifications of this but at least I get where you're coming
from now.

Sean

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  9:56                                                     ` Erik Bågfors
  2006-10-22 13:23                                                       ` Jakub Narebski
@ 2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
                                                                           ` (2 more replies)
  1 sibling, 3 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-22 14:25 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jakub Narebski, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

[-- Attachment #1: Type: text/plain, Size: 7105 bytes --]

At Sun, 22 Oct 2006 11:56:32 +0200, Erik Bågfors wrote:
> Consider the following
> bzr branch mainline featureA
> cd featureA
> hack hack; bzr commit -m 'f1'; hack hack; bzr commit -m f2; etc
> Now I want to merge in mainline again
> bzr merge ../mainline; bzr commit -m merge
> hack hack; bzr commit -m f3; hack hack; bzr commit -m f4; etc

Thanks for sharing this example. I think when we look at concrete
things that the tools actually let you do, we have a better
conversation. Plus, this example highlights some very interesting
differences between the tools.

So here is a complete sequence of git commands to construct the
scenario (even the extra hacking in mainline):

	mkdir gittest; cd gittest
	git init-db
	touch mainline; git add mainline; git commit -m "Initial commit of mainline"
	git checkout -b featureA
	touch f1; git add f1; git commit -m f1
	touch f2; git add f2; git commit -m f2
	git checkout -b mainline master
	touch sd; git add sd; git commit -m "something done in mainline";
	touch se; git add se; git commit -m "something else done in mainline";
	git checkout featureA
	git pull . mainline
	touch f3; git add f3; git commit -m f3
	touch f4; git add f4; git commit -m f4

For reference, here's the same with bzr:

	mkdir bzrtest; cd bzrtest
	bzr init-repo . --trees
	bzr init mainline; cd mainline
	touch mainline; bzr add mainline; bzr commit -m "Initial commit of mainline"
	cd ..; bzr branch mainline featureA; cd featureA
	touch f1; bzr add f1; bzr commit -m f1
	touch f2; bzr add f2; bzr commit -m f2
	cd ../mainline/
	touch sd; bzr add sd; bzr commit -m "something done in mainline"
	touch se; bzr add se; bzr commit -m "something else done in mainline"
	cd ../featureA
	bzr merge ../mainline/; bzr commit -m "merge"
	touch f3; bzr add f3; bzr commit -m f3
	touch f4; bzr add f4; bzr commit -m f4

[As has recently been pointed out, the tools really are more the same
than different, and I think the above illustrates that.]

> right now, I would have something like this in the branch log

OK. So here is a difference in the tools. With git, you don't get the
indentation for the "non-mainline" commits. This is because git
doesn't recognize any branch in the DAG to be more significant than
any other. Instead, git provides a flat, and (heuristically)
time-sorted view of the commits. (It's heuristic in that git just uses
the time stamps in the commit objects---but it doesn't actually care
if these are totally "wrong"---git knows that there is no global
clock.)

That said, git does store an order for the parent edges of each
commit, and this order is assigned deterministically by the commands
that create merge commits. So someone could use git carefully, (which
it seems people are doing with bzr), to preserve "mainline as first
parent" and someone could write a modified git-log that would do
indentation.

But even without any of that manual care for creating a "mainline",
git already provides a very easy way to see the "mainline" view
anyway. See below.

> In this view, I can easily see what was part of this feature branch,
> because the commits that belong to the feature branch are not
> indented, and they have a "branch nick" of "featureA".  I can also
> easily see what comes from other branches.

Ah, I hadn't realized that bzr commits stored an "originating branch"
inside them. Git commits definitely do not have anything like
that. And as I said above, there's no indentation in git-log, so the
commits from separate branches are "mixed up". But see below.

> I can also run bzr log with --line or --short which shows you only the
> commits made in this branch and not the ones that are merged in.  So
> with --line I would get something like
> Erik Bågfors 2006-10-19 f4
> Erik Bågfors 2006-10-19 f3
> Erik Bågfors 2006-10-19 merge
> Erik Bågfors 2006-10-19 f2
> Erik Bågfors 2006-10-19 f1
>
> Which will give me a good view of what has been done in this feature
> branch only.

Thank you. You've provided a concrete example of something to do,
("see commits that belong to a feature branch"), that is really very
practical and useful. And bzr achieves this ability by adopting a
"mainline is special" treatment. This special treatment influences
or directly causes many of the things in bzr that we've been
discussing:

 * mainline commits get special treatment from revision numbers
   (in the old days, they were the only commits to have revision
   numbers---more recently they're the only commits to get non-dotted
   revision numbers)

 * bzr adds empty merge commits instead of fast-forwarding since it
   needs a new "mainline" commit

 * users have to be careful about merge direction to avoid
   accidentally going the "wrong" way

 * users are discouraged from using the "give me their DAG" pull
   command since it would scramble their local view of what "mainline"
   is.
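
The fast-forward point in the list above can be sketched in a few
lines of Python. This is a toy model only: the commit names, the
merge() helper and its fast_forward switch are hypothetical, and the
"always record a merge" branch is merely how this thread describes
bzr's behaviour, not bzr's code.

```python
def ancestors(parents, tip):
    """Return every commit reachable from `tip`, including `tip` itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def merge(parents, ours, theirs, fast_forward=True):
    """Return the new tip of our branch after merging `theirs` into `ours`."""
    if theirs in ancestors(parents, ours):
        return ours                       # nothing new to merge
    if fast_forward and ours in ancestors(parents, theirs):
        return theirs                     # just move the tip, no new commit
    new = "merge(%s,%s)" % (ours, theirs)
    parents[new] = [ours, theirs]         # our old tip stays the first parent
    return new

graph = {"a": [], "b": ["a"], "c": ["b"]}

ff_graph = dict(graph)
ff_tip = merge(ff_graph, "a", "c")                        # tip simply moves to c

no_ff_graph = dict(graph)
no_ff_tip = merge(no_ff_graph, "a", "c", fast_forward=False)  # new merge commit
print(ff_tip, no_ff_tip)
```

With fast_forward=False the result is a commit whose tree is identical
to "c" but which gives our side a new first-parent entry, which is the
"empty merge commit" mentioned above.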

I've been arguing that all of these impacts are dubious. But I can
understand that a bzr user hearing arguments against them might fear
that they would lose the ability to be able to see a view of commits
that "belong" to a particular branch.

But git provides that view perfectly well, and it's what git users
work with all the time. It doesn't require any special treatment of
one commit parent vs. another, nor storage of "originating branch" in
the commit, nor the user taking any care whatsoever about which
direction merges are performed, (nor "who" does the merge).

And as a bonus, the command-line for this view is really simple:

	git log mainline..featureA

This gives a log view just like "bzr log --line" in that it only
includes f1, f2, the merge commit, f3, and f4. You can even drop the
merge if it's uninteresting:

	git log --no-merges mainline..featureA

The mainline..featureA syntax literally just means:

	the set of commits that are reachable by featureA
	and excluding the set of commits reachable by mainline

It's an extraordinarily powerful thing to say, and it's exactly what
you want here. And it's more than a "show mainline" thing, since
these sets of commits can consist of arbitrarily complex DAG
subsets. This syntax is just a really useful way to slice up the DAG.
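
To make the set definition concrete, here is a minimal Python sketch
using the DAG from the command sequence above. The short commit names
(init, sd, se, ...) and the helper names are made up for illustration;
the sketch models only reachability, not how git walks its object
store.

```python
# The example history: commit -> list of parents (first parent first).
PARENTS = {
    "init": [],
    "f1": ["init"],
    "f2": ["f1"],
    "sd": ["init"],            # "something done in mainline"
    "se": ["sd"],              # "something else done in mainline"
    "merge": ["f2", "se"],     # featureA pulls mainline
    "f3": ["merge"],
    "f4": ["f3"],
}
TIPS = {"mainline": "se", "featureA": "f4"}

def reachable(tip):
    """All commits reachable from `tip` by following parent edges."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen

def rev_range(exclude, include):
    """Model of `exclude..include`: reachable from include, not from exclude."""
    return reachable(TIPS[include]) - reachable(TIPS[exclude])

only_feature = rev_range("mainline", "featureA")
no_merges = {c for c in only_feature if len(PARENTS[c]) < 2}
print(sorted(only_feature))
print(sorted(no_merges))
```

Nothing in the sketch looks at parent order or at where a commit was
created; the view falls out of the two tips alone.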

And this syntax is almost universally accepted by git commands, so you
can visualize a chunk of the DAG with:

	gitk mainline..featureA

Or export it as patches with:

	git format-patch mainline..featureA

I haven't been able to find something similar in bzr yet. Does it
exist?

> If I understand it correctly, in git, you don't really know what has
> been committed as part of this branch/repo, and what has been
> committed in another branch/repo (this is my understanding from
> reading this thread, I might be wrong, feel free to correct me again
> :) )

You're correct that git doesn't _store_ any sort of "branch ownership"
in the commit object. But this is a huge feature. It avoids a lot of
the things in bzr that look so bizarre to people coming from git.

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 18:20                                                 ` Jakub Narebski
@ 2006-10-22 14:27                                                   ` Matthieu Moy
  0 siblings, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-22 14:27 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, bazaar-ng, Jeff King, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

Jakub Narebski <jnareb@gmail.com> writes:

> What about grandparent of commit (d8a60^^ or d8a60~2 in git),
> or choosing one of the parents in merge commit (d8a60^2 is second
> parent of a commit)? before:before:753 ?
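
For readers unfamiliar with the git suffixes in the question above,
this is a small Python sketch of how "^", "^N" and "~N" walk the parent
edges. The toy graph and the resolve() helper are hypothetical, and
only these three forms are handled; git's real revision syntax covers
much more.

```python
import re

# Hypothetical history: d8a60 is a merge with two parents.
PARENTS = {
    "d8a60": ["c1", "x1"],
    "c1": ["c0"],
    "x1": ["c0"],
    "c0": [],
}

STEP_RE = re.compile(r"([\^~])(\d*)")

def resolve(spec):
    """Resolve a spec such as 'd8a60^^', 'd8a60~2' or 'd8a60^2'."""
    name = re.match(r"[^\^~]+", spec)
    commit, rest = name.group(0), spec[name.end():]
    pos = 0
    while pos < len(rest):
        step = STEP_RE.match(rest, pos)
        if step is None:
            raise ValueError("unsupported revision suffix: %r" % rest[pos:])
        op, digits = step.groups()
        n = int(digits) if digits else 1
        if op == "^":
            if n:                         # '^0' means the commit itself
                commit = PARENTS[commit][n - 1]
        else:                             # '~n': follow the first parent n times
            for _ in range(n):
                commit = PARENTS[commit][0]
        pos = step.end()
    return commit

print(resolve("d8a60^^"), resolve("d8a60~2"), resolve("d8a60^2"))
```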

Yes, "before:" can take any revision specifier, including
"before:something-else".

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:11                                                         ` Erik Bågfors
@ 2006-10-22 14:39                                                           ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 14:39 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Jan Hudec, bazaar-ng, Linus Torvalds, Carl Worth, Andreas Ericsson, git

Erik Bågfors wrote:
> On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
>> Erik Bågfors wrote:
>>> Jakub Narębski wrote:
>>
>>>> For example git encourages using many short and longer-lived feature
>>>> branches; I don't see bzr encouraging this workflow.
>>>
>>> Why not? I think it really does.  And due to the fact that merges are
>>> merges and will show up as such, I think it's very suitable for
>>> feature branches.
>>
>> I think I haven't properly explained what "feature branch" means.
>> A "feature branch" is a short (or medium) lived branch, created for
>> the development of one isolated feature. When the feature reaches a
>> stable stage, we merge the feature branch and forget about it. We are
>> not interested in the fact that a given feature was developed on a
>> given branch. BTW, in the published git.git repository, for example,
>> feature branches are only available in the form of the "digest" 'pu'
>> (proposed updates) branch.
> 
> That's what I'm talking about too.
> For example, in my bzr bzr-repo I have
> bzr.init-repo-tree/
> bzr.aliases/
> bzr.dev/

Because git uses a separate namespace for branch names, rather than
a position on the filesystem, one would probably use 'dev'
(or 'master', or perhaps 'next'), 'aliases' and 'init-repo-tree'
as branch names. There is no need for a 'bzr.' prefix to distinguish
branches from other directories.

Git does use a convention like the above for bare repositories
(clones of repositories without a working tree; the working tree
is associated with the repository, not with a branch), e.g. git.git
or linux-2.6.18.y.git, though.

> and others...
> In bzr.aliases for example, I built the support for defining aliases
> in the bzr config file. That was a unique feature that didn't exist in
> any other branch.  The branch survived about 17 days before it was
> merged into bzr.dev.  During that time, I merged in another branch
> twice.  The branch I merged at this time was NOT bzr.dev, but rather
> another branch, from one of the main developers.  The reason I merged
> his branch was that I needed a bugfix (or two? :) ) that he had done,
> but that wasn't approved in bzr.dev yet.

That is also quite common. Merging 'master' into feature branch,
or 'next' into feature branch. One could of course cherry-pick
only the bugfix... can you do this in bzr?

> After a time, his branch was merged into bzr.dev, shortly thereafter,
> so was my branch.
> 
> After my branch was merged, I forgot about it.  I still have it lying
> around on my computer because it really doesn't take up any extra
> space (since it's in a shared repository), but I really have forgotten
> about it.

Usually after a feature branch is merged (or fast-forwarded) we delete
it. All the parentage information is in the DAG anyway. We can later
attach a new branch with the same name to the point where the branch was.

> This is typically how all features in bzr are created.
> Short/medium/long-lived feature branches.

Like I said, in git.git development we use development branches
(e.g. 'master', 'maint', 'next'), tracking branches (e.g. 'origin',
'linus'), feature branches (e.g. 'jc/pickaxe', 'np/pack'), "helper"
branches storing content that is somewhat unrelated ('html' and 'man'
branches for autogenerated documentation) or unrelated ('todo' for
TODO notes) to the code of the main project, "digest" branches (e.g.
the 'pu' branch in git.git, which is a merge of WIP feature branches to
be published, and does not fast-forward), and temporary branches (for
example for shelving current work).

From long, to medium, to short, to extremely short lived.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
@ 2006-10-22 14:48                                                         ` Erik Bågfors
  2006-10-22 15:04                                                           ` Jakub Narebski
  2006-10-22 14:55                                                         ` Jakub Narebski
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-22 14:48 UTC (permalink / raw)
  To: Carl Worth
  Cc: Jakub Narebski, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

Thanks for this mail, this makes me happy to see. The tools are pretty
much the same but have somewhat different views on how to do things.

On 10/22/06, Carl Worth <cworth@cworth.org> wrote:
>
>         git log --no-merges mainline..featureA
>
> The mainline..featureA syntax literally just means:
>
>         the set of commits that are reachable by featureA
>         and excluding the set of commits reachable by mainline
>
> It's an extraordinarily powerful thing to say, and it's exactly what
> you want here. And it's more than a "show mainline" thing, since
> these sets of commits can consist of arbitrarily complex DAG
> subsets. This syntax is just a really useful way to slice up the DAG.
>
> And this syntax is almost universally accepted by git commands, so you
> can visualize a chunk of the DAG with:
>
>         gitk mainline..featureA
>
> Or export it as patches with:
>
>         git format-patch mainline..featureA
>
> I haven't been able to find something similar in bzr yet. Does it
> exist?

If I understand you correctly, you'll get the same thing with "bzr missing".

$ bzr missing ../mainline/
You have 1 extra revision(s):
------------------------------------------------------------
revno: 2
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: newbranch
timestamp: Sun 2006-10-22 16:43:10 +0200
message:
  hepp


You are missing 1 revision(s):
------------------------------------------------------------
revno: 2
committer: Erik Bågfors <erik@bagfors.nu>
branch nick: mainline
timestamp: Sun 2006-10-22 16:42:53 +0200
message:
  hej

You can also run "bzr missing" with "--theirs-only" or "--mine-only"
to get only one way.

To get the patches you can run "bzr bundle ../mainline", but then
we're back to the discussion that it currently gives a "big patch" for
viewing, but when you merge it, you get each revision separately.

/Erik

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
@ 2006-10-22 14:55                                                         ` Jakub Narebski
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 14:55 UTC (permalink / raw)
  To: Carl Worth
  Cc: Erik Bågfors, Jan Hudec, bazaar-ng, Linus Torvalds,
	Andreas Ericsson, git

Carl Worth wrote:
> Erik Bågfors wrote:
>> If I understand it correctly, in git, you don't really know what has
>> been committed as part of this branch/repo, and what has been
>> committed in another branch/repo (this is my understanding from
>> reading this thread, I might be wrong, feel free to correct me again
>> :) )
> 
> You're correct that git doesn't _store_ any sort of "branch ownership"
> in the commit object. But this is a huge feature. It avoids a lot of
> the things in bzr that look so bizarre to people coming from git.

Because "branch ownership" is obviously local, we have the reflog, which
is local and not propagated. The reflog uses the following format:

 oldsha1 SP newsha1 SP committer TAB reason LF

where reason might be "commit: <commit description/title/subject>"
or "commit (amend): <commit description>", "am: <commit 
description>" (applied mail patch), "reset --hard HEAD^" (dropped
top commit), "branch: Created from origin^0", or "pull origin: In-index 
merge".
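
A line in that format can be taken apart with a few lines of Python.
This is only a sketch: the function name and the sample entry (hashes,
identity, message) are invented for illustration, and it assumes the
committer field ends in "<seconds-since-epoch> <+hhmm>" as git
identities do.

```python
from datetime import datetime, timedelta, timezone

def parse_reflog_line(line):
    """Split 'old SP new SP committer TAB reason LF' into its fields."""
    head, _, reason = line.rstrip("\n").partition("\t")
    old, new, committer = head.split(" ", 2)
    # The committer ident ends with a Unix timestamp and a +hhmm/-hhmm zone.
    who, stamp, tz = committer.rsplit(" ", 2)
    sign = -1 if tz.startswith("-") else 1
    offset = sign * timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5]))
    when = datetime.fromtimestamp(int(stamp), timezone(offset))
    return {"old": old, "new": new, "who": who, "when": when, "reason": reason}

# Hypothetical entry; the hashes and the identity are placeholders.
sample = (
    "0" * 40 + " " + "a" * 40
    + " A U Thor <author@example.com> 1161528294 +0200"
    + "\tcommit: f1\n"
)
entry = parse_reflog_line(sample)
print(entry["who"], entry["when"].isoformat(), entry["reason"])
```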

We do not yet have tools to examine the reflog (e.g. to convert the
committer info with its timestamp to a human-readable format).
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
       [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
  2006-10-22 14:24                                                                               ` Sean
  2006-10-22 14:24                                                                               ` Sean
@ 2006-10-22 14:56                                                                               ` Matthew D. Fuller
  2006-10-22 15:05                                                                                 ` Matthieu Moy
  2 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 14:56 UTC (permalink / raw)
  To: Sean; +Cc: bazaar-ng, git

On Sun, Oct 22, 2006 at 10:24:54AM -0400 I heard the voice of
Sean, and lo! it spake thus:
> 
> Light goes on.  Okay.  So a bzr "branch" is only ever editable on a
> single machine.  So there is no distributed development on top of a
> bzr "branch".  Everyone else just has read-only copies of it.

Ah!  Yes, that's exactly[0] right.  Mark up another of those "so
obvious we never think to state it" thought-patterns   :|


Distributed development proper only happens on 'projects', not
branches.  In practice, we say "we're all working on branch X", in the
sense that we use it as a base to work from and intend to merge our
stuff into it, but strictly speaking we're all working on our own
branches that just merge from/into X from time to time.

That's also why we use the phrases "merge from" and "merge to", rather
than "merge WITH".  Of course, where possible, we could 'fast-forward'
to X rather than merge from it, at which point we'd then momentarily
have exactly X, but culturally we don't seem to like doing that.



[0] There are a few very special-case exceptions, notably around the
'checkout' concept or where people are very carefully manually
maintaining sync, but they're irrelevant in this case; and they ARE
star-pattern developments that could be said to be 'centralized'.  Now
I grok where that's coming from.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:48                                                         ` Erik Bågfors
@ 2006-10-22 15:04                                                           ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 15:04 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Carl Worth, Jan Hudec, bazaar-ng, Linus Torvalds, Andreas Ericsson, git

Erik Bågfors wrote:

> On 10/22/06, Carl Worth <cworth@cworth.org> wrote:
>>
>>         git log --no-merges mainline..featureA
>>
>> The mainline..featureA syntax literally just means:
>>
>>         the set of commits that are reachable by featureA
>>         and excluding the set of commits reachable by mainline
>>
[...]
>> And this syntax is almost universally accepted by git commands. so you
>> can visualize a chunk of the DAG with:
>>
>>         gitk mainline..featureA
>>
>> Or export it as patches with:
>>
>>         git format-patch mainline..featureA
>>
>> I haven't been able to find something similar in bzr yet. Does it
>> exist?
> 
> If I understand you correctly, you'll get the same thing with "bzr missing".
> 
> $ bzr missing ../mainline/
> You have 1 extra revision(s):
> ------------------------------------------------------------
> revno: 2
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: newbranch
> timestamp: Sun 2006-10-22 16:43:10 +0200
> message:
>   hepp
> 
> 
> You are missing 1 revision(s):
> ------------------------------------------------------------
> revno: 2
> committer: Erik Bågfors <erik@bagfors.nu>
> branch nick: mainline
> timestamp: Sun 2006-10-22 16:42:53 +0200
> message:
>   hej

That is (roughly) equivalent to
  $ git log mainline...featureA
(which would give all commits which are in _either_ mainline or
featureA but not in both, although not separated; --topo-order might
help), or
  $ git show-branch mainline featureA

> You can also run "bzr missing" with "--theirs-only" or "--mine-only"
> to get only one way.

That would be equivalent to
  $ git log mainline..featureA
(--mine-only, when run from the featureA branch as above), or
  $ git log featureA..mainline
(--theirs-only).

> To get the patches you can run "bzr bundle ../mainline", but then
> we're back to the discussion that it currently gives a "big patch" for
> viewing, but when you merge it, you get each revision separately.

What about
  $ gitk mainline..featureA
i.e. showing selected part of DAG in graphical history viewer?

And of course the syntax is even more powerful, e.g.
  $ git log maint master --not next
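
To spell out the set arithmetic behind the two-dot and three-dot forms,
here is a minimal sketch that does not need a repository at all. The
two files are hand-written stand-ins for the output of
"git rev-list mainline" and "git rev-list featureA"; the commit names
(c1, m3, f3, ...) and the helper names two_dot and three_dot are
invented for the example.

```shell
#!/bin/sh
# Stand-ins for the commits reachable from each branch, one made-up
# name per line, sorted (comm requires sorted input).
workdir=$(mktemp -d)
printf '%s\n' c1 c2 m3    > "$workdir/mainline"
printf '%s\n' c1 c2 f3 f4 > "$workdir/featureA"

# A..B : commits reachable from B but not from A.
two_dot()   { comm -13 "$1" "$2"; }
# A...B : commits reachable from exactly one of A and B.
three_dot() { comm -3 "$1" "$2" | tr -d '\t'; }

echo "mainline..featureA:"
two_dot "$workdir/mainline" "$workdir/featureA"     # f3 and f4
echo "featureA..mainline:"
two_dot "$workdir/featureA" "$workdir/mainline"     # m3
echo "mainline...featureA:"
three_dot "$workdir/mainline" "$workdir/featureA"   # f3, f4 and m3
```

The shared commits c1 and c2 never show up, which is why the range
forms only list what one side has and the other lacks.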
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:56                                                                               ` Matthew D. Fuller
@ 2006-10-22 15:05                                                                                 ` Matthieu Moy
  0 siblings, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-22 15:05 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Sean, bazaar-ng, git

"Matthew D. Fuller" <fullermd@over-yonder.net> writes:

> On Sun, Oct 22, 2006 at 10:24:54AM -0400 I heard the voice of
> Sean, and lo! it spake thus:
>> 
>> Light goes on.  Okay.  So a bzr "branch" is only ever editable on a
>> single machine.  So there is no distributed development on top of a
>> bzr "branch".  Everyone else just has read-only copies of it.
>
> Ah!  Yes, that's exactly[0] right.  Mark up another of those "so
> obvious we never think to state it" thought-patterns   :|

Well, I'm not sure you are still talking about the same thing. Adding my
2 cents:

If ~/branch1 is a branch, I can get a read-write "copy" of it with

$ bzr branch ~/branch1 ~/branch2

which will roughly be equivalent to

$ cp -r ~/branch1 ~/branch2

Whether they are at this point "the same branch" or "two distinct
branches with same content" is just a matter of vocabulary since there
is no real "branch identity" AFAIK in bzr.

Now, if you commit in ~/branch1, then ~/branch2 is out of date with
it. If you commit also to ~/branch2, then you get two divergent
branches.

(and obviously, I could have done the same with branches on different
machines)

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-16  3:53   ` Martin Pool
@ 2006-10-22 15:50     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 15:50 UTC (permalink / raw)
  Cc: bazaar-ng, git

On 14 Oct 2006, Jakub Narebski <jnareb@gmail.com> wrote:
> Jon Smirl wrote:
> 
>> It refers to this comparison chart between source control systems.
>> http://bazaar-vcs.org/RcsComparisons
> 
> It is quite obvious that comparison of programs of given type (SMC)
> on some program site (Bazaar-NG) is usually biased towards said program,
> perhaps unconsciously: by emphasizing the features which were important
> for developers of said program.

There are also clashes in SCM terminology, which different projects use
differently. Sometimes these are coupled with differences in philosophy,
and sometimes they come from a different understanding of a given name.

For example, "lightweight checkouts" and "normal/heavyweight checkouts",
from what I gather, support the "CVS/centralized model" and the
"disconnected CVS model" respectively (i.e. we can commit changes
locally with no network access, and we save local changes) out of the
box, at least when we do the "checkout" remotely and not within one
local filesystem.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
@ 2006-10-22 16:02                                                               ` Petr Baudis
  2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-22 16:02 UTC (permalink / raw)
  To: Carl Worth; +Cc: Jeff Licquia, Jakub Narebski, bazaar-ng, git

Dear diary, on Sun, Oct 22, 2006 at 01:49:04AM CEST, I got a letter
where Carl Worth <cworth@cworth.org> said that...
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)

http://repo.or.cz/git-browser/by-commit.html?r=linux-2.6.git

It could use plenty of improvement, though.

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1
lK[d2%Sa2/d0$^Ixp"|dc`;s/\W//g;$_=pack('H*',/((..)*)$/)

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22  7:49                                                         ` Tim Webster
@ 2006-10-22 17:12                                                           ` Linus Torvalds
  2006-10-23  5:19                                                             ` Matthew Hannigan
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-22 17:12 UTC (permalink / raw)
  To: Tim Webster; +Cc: git, Aaron Bentley, bazaar-ng, Jakub Narebski



On Sun, 22 Oct 2006, Tim Webster wrote:

> On 10/22/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > 
> > > project/file-1
> > > project/file-2
> > > project/.git-1
> > > project/.git-2
> > 
> > Ok, that's just insane.
> [snip]
> > Anyway. Git certainly allows you to do some really insane things. The
> > above is just the beginning - it's not even talking about alternate object
> > directories where you can share databases _partially_ between two
> > otherwise totally independent repositories etc.
> 
> 
> Perhaps this is insane, but it does not make sense to track all config
> files in etc as though they belong in a single repo.

Oh, ok, now I see what you're going after.

Right - if you track system directories in a repo, you'd quite possibly 
end up with multiple repositories. Although even then, I'd actually 
suggest that as a git user, you would only have one actual repository, and 
just multiple branches that have a disjoint set of files (again, it's 
certainly possible to have file overlap too, of course).

But the usage I would seriously suggest is to _not_ do development "inside 
/etc" itself. You'd have those git repositories somewhere else, say in 
"/usr/src/etc-repo" or similar, and then you'd have a few extra wrappers 
to help your particular usage. I have a few reasons for this:

 - I think being in /etc and doing development is just fundamentally scary 
   in itself, because if you do something wrong in the current directory, 
   you're just pretty badly off. It's better to have a "buffer zone" that 
   you do development in, and when you're happy, you do an "install" 
   command or something.

 - I think developing as "root" is totally broken, and some of the files 
   you are tracking may not even be _readable_ to normal users in their 
   real form, so you can't even do trivial things like "diff" as a normal 
   user otherwise. So again, the solution to this would be to do 
   development somewhere else, and have specific wrappers (with "sudo" as 
   appropriate, and your developer ID obviously specially in the sudo 
   files) to do those special "realdiff" and "install" commands.

 - finally: when you work with almost any SCM designed for source control, 
   you're almost inevitably going to have to have some "special" way to 
   track the things that source control usually does _not_ track because 
   it makes no sense for source code. So you'd have to have some special 
   file that tracks ownership/group/full permissions information, and 
   perhaps special devices (if you're tracking things like /dev).

   Again, the way to solve this would tend to be to have a few helper 
   scripts that use regular file-contents that _describe_ these things to 
   do "realdiff" and "install".

In other words, for at least three _totally_ different reasons, you really 
don't want to do tracking/development directly in /etc, but you want to 
have a buffer zone to do it. And once you have that, you might as well do 
_that_ as the repository, and just add a few specialty commands (let's 
call them "plugins" to make everybody happy) to do the special things.

And once you have that kind of setup, you're really better off with 
more of a "several branches for different kinds of files" or even totally 
different repositories. That's a detail, and I don't think anybody really 
cares.

Anyway, to make this slightly more grounded in examples, let me give a 
quick overview of what I'd do if I did this with git. Not a "real" setup 
at all, but kind of a "maybe something like this" - so don't get _too_ 
hung up about the details, ok? It's just a rough draft kind of thing.

First off, let's just say that I want to track /etc/group, /etc/passwd and 
/etc/shadow as one "thing". Whether that thing is a repository of its own 
or a branch in a bigger repository doesn't matter (right now I'm only 
doing those three), and quite frankly, I'm not going to even go into 
whether it _really_ makes sense to track "groups" and the passwd files 
together, but it's just an example, ok?

What I'd do is roughly:

	# set up the new repo (or branch, or..)
	mkdir identity-repo
	cd identity-repo
	git init-db

	# copy the data, set up a PERMISSIONS file to track extra info
	sudo cp /etc/group /etc/passwd /etc/shadow .
	sudo chown user:user *
	cat <<EOF > PERMISSIONS
	group root:root 0644
	passwd root:root 0644
	shadow root:root 0400
	EOF
	git add .
	git commit -m "Initial setup"

and now I have the initial setup, together with permissions and user/group 
information on the things, all ready to track. I can do development in 
this as if it was a normal source-code repository.

So now I can do "work work work commit commit commit" as if these files 
were nothing special. What else do I need? I need the "plugins" to 
actually expose (install) my work, and perhaps to check that /etc matches 
what I expect (and nobody else did anything behind my back that I'd need 
to merge).

Let's call them "install" and "realdiff" as I did above, ok?

And again, I'm not going to even claim that the above two "plugins" are 
the right ones (maybe you want other operations too to interact with the 
"real" installed files), and I'm not going to really get all the details 
right, but here's kind of how you _might_ do it.

So create the script (let's make it shell, because that's what I'm used 
to, but it could be anything) "git-install" in your git binary directory, 
and make it do something like this:

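	#!/bin/sh
	# Install each file listed in PERMISSIONS into /etc with the
	# recorded owner:group and mode.
	while read -r name chown chmod
	do
		# Stage a copy first, so /etc only ever gets a file that
		# already has the right ownership and permissions.
		cp "$name" "$name.tmp" &&
		sudo chown "$chown" "$name.tmp" &&
		sudo chmod "$chmod" "$name.tmp" &&
		sudo mv "$name.tmp" "/etc/$name"
	done < PERMISSIONS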

and make it executable.

Now, you can work in your git directory, and when you're happy, you can do

	git install

to actually copy it into the _real_ directory in /etc.

See? You can do something similar for "realdiff", that would compare the 
contents in /etc with what you have now in your development tree (where 
you want to script the thing to compare the PERMISSIONS file too).
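
A "realdiff" along those lines might look roughly like the sketch
below. It is only a draft under a few assumptions: the function name
realdiff and the ETC_DIR and SUDO variables are invented here (the
latter two only so the thing can be tried against a scratch directory),
and the metadata check relies on GNU stat's -c option, which other
systems may not have.

```shell
#!/bin/sh
# Sketch of a "git-realdiff" helper: compare every file listed in
# PERMISSIONS with its installed copy, contents and metadata both.
# ETC_DIR and SUDO are hypothetical knobs; set SUDO to the empty
# string to try this against a scratch directory without sudo.
ETC_DIR=${ETC_DIR:-/etc}
SUDO=${SUDO-sudo}

realdiff() {
	rc=0
	while read -r name owner mode
	do
		# Contents: the installed file may need sudo to be readable.
		$SUDO diff -u "$ETC_DIR/$name" "$name" || rc=1
		# Metadata: GNU stat prints "owner:group" and the octal mode
		# without the leading zero used in PERMISSIONS.
		actual=$($SUDO stat -c '%U:%G %a' "$ETC_DIR/$name")
		expected="$owner ${mode#0}"
		if [ "$actual" != "$expected" ]
		then
			echo "$name: installed as '$actual', expected '$expected'"
			rc=1
		fi
	done < PERMISSIONS
	return $rc
}
```

A "git-realdiff" script would end with a plain call to realdiff from
the top of the working tree, so that "git realdiff" exits non-zero
whenever /etc has drifted from what the repository says.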

And note: if you do the "plugin scripts" properly, they can work for _all_ 
your repositories that track different files in /etc. So you can work in 
many different repos, and track different files in each, and "git install" 
will do the right thing for each, regardless of the actual files you're 
tracking.

Doesn't this sound like a workable situation? You get all the normal SCM 
tools (looking at history etc), and there's only a few special things you 
need to do when you actually want to install a specific version.

Btw: none of this is really "git-specific". The above tells you how to do 
local "git plugins", and it's obviously fairly trivial, but I suspect any 
SCM can be used in this manner.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 14:25                                                       ` Carl Worth
  2006-10-22 14:48                                                         ` Erik Bågfors
  2006-10-22 14:55                                                         ` Jakub Narebski
@ 2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
                                                                             ` (2 more replies)
  2 siblings, 3 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-22 18:53 UTC (permalink / raw)
  To: Carl Worth; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

On Sun, Oct 22, 2006 at 07:25:41AM -0700 I heard the voice of
Carl Worth, and lo! it spake thus:
>
> 	git pull . mainline

This throws me a little.  I'd expect it to Just Do It when it's
fast-forwarding, but if it's doing a merge, I'd prefer it to stop and
wait before creating the commit, even if there are no textual
conflicts.  I realize you can just look at it afterward and back out
the commit if necessary, but still...


> Ah, I hadn't realized that bzr commits stored an "originating
> branch" inside them.

Every branch has a nickname, settable with 'bzr nick' (defaulting to
whatever the directory it's in is), and that's stored as a text field
in each commit.  It's mostly cosmetic, but it's handy to see at a
glance.


> This special treatment influences or directly causes many of the
> things in bzr that we've been discussing:
  [...]
> I've been arguing that all of these impacts are dubious. But I can
> understand that a bzr user hearing arguments against them might fear
> that they would lose the ability to be able to see a view of commits
> that "belong" to a particular branch.

Dead center.


> The mainline..featureA syntax literally just means:
> 
> 	the set of commits that are reachable by featureA
> 	and excluding the set of commits reachable by mainline

From what I can gather from this, though, that means that when I merge
stuff from featureA into mainline (and keep on with other stuff in
featureA), I'll no longer be able to see those older commits from this
command.  And I'll see merged revisions from branches other than
mainline (until they themselves get merged into mainline), correct?
It sounds more like a 'bzr missing --mine-only' than looking down a
mainline in log...


> I haven't been able to find something similar in bzr yet. Does it
> exist?

The branch: (head) and ancestor: (latest common rev) revspecs let you
refer to the respective bits of other branches, which I think would
fill this role.


> It avoids a lot of the things in bzr that look so bizarre to people
> coming from git.

Well, what would be the fun in that?   8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 19:41                                                     ` Jakub Narebski
@ 2006-10-22 19:18                                                       ` David Clymer
  2006-10-22 19:57                                                         ` Jakub Narebski
  2006-10-22 20:06                                                         ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: David Clymer @ 2006-10-22 19:18 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On Sat, 2006-10-21 at 21:41 +0200, Jakub Narebski wrote:
> Matthew D. Fuller wrote:
> > On Sat, Oct 21, 2006 at 04:08:18PM +0200 I heard the voice of
> > Jakub Narebski, and lo! it spake thus:
> >> Dnia sobota 21. października 2006 15:01, Matthew D. Fuller napisał:
> 
> >> When two clones of the same repository (in git terminology), or two
> >> "branches" (in bzr terminology), used by different people, cannot be
> >> totally equivalent that is centralization bias.
> > 
> > This is obviously some new meaning of "centralization" bearing no
> > resemblance whatsoever to how I understand the word.
> 
> Perhaps I'd better use "star topology bias" instead of "centralization
> bias".
>  
> > In git, apparently, you don't give a crap about a branch's identity
> > (alternately expressible as "it has none"), and so you throw it away
> > all the time.  Given that, revnos even if git had them would never be
> > of ANY use to you, so it's no wonder you have no use for the notion.
> 
> In git branches are lightweight. Branch names are local to repository.
> Repositories have identity. Bzr "branch" is strange mix of one-branch
> git repository and git branch.
> 
> Git main workflow is fully decentralized workflow. All clones of the
> same repository are created equal. In bzr the suggested workflow
> (with revnos) forces one (or more) branches to be mainline (use "merge",
> get empty-merges, revnos don't change) and leaf (use "pull", revnos
> change).
>  
> > I DO give a crap about my branchs' identities.  I WANT them to retain
> > them.  If I have 8 branches, they have 8 identities.  When I merge one
> > into another, I don't WANT it to lose its identity.  When I merge a
> > branch that's a strict superset of second into that second, I don't
> > WANT the second branch to turn into a copy of the first.  If I wanted
> > that, I'd just use the second branch, or make another copy of it.  I
> > don't WANT to copy it.  I just want to merge the changes in, and keep
> > on with my branch's current identity.
> 
> I don't understand. If I merge 'next' branch into 'master' in git, I 
> still have two branches: 'master' and 'next'.
> 
> And I don't understand why you are so hung on branch identities. Yes, if
> somebody clones your 'repo' repository, he can have your 'master' branch
> (refs/heads/master) named 'repo' (refs/heads/repo) or 'repo/master'
> (refs/remotes/repo/master), but why that matters to you. It is _his_
> (or her ;-) clone. 
> 

I think you missed the point. Speaking for myself, I want to maintain
the identity of _my_ branches. If you clone one of them, I _don't_ care.
That's your branch. Branch identity as presented here is not intended to
be globally significant. It's locally significant.

> > Now, we can discuss THAT distinction.  I'm not _opposed_ to git's
> > model per se, and I can think of a lot of cases where it's be really
> > handy.  But those aren't most of my cases.  And as long as we don't
> > agree on branch identity, it's completely pointless to keep yakking
> > about revnos, because they're a direct CONSEQUENCE of that difference
> > in mental model.  See?  They're an EFFECT, not a CAUSE.  If bzr didn't
> > have revnos, I'd STILL want my branch to keep its identity.  You could
> > name the mainline revisions after COLORS if you wanted, and I'd still
> > want my branch to keep its identity.  Aren't we through rehashing the
> > same discussion about the EFFECTS?
> 
> For revnos to work you MUST have one "branch" to be considered
> special, the hub in star topology. This very much precludes fully
> distributed development. 
> 
> BTW. I get that you can use revids in revnos in bzr for fully
> distributed and not star-topology geared development. But
> Bazaar-NG revids are uglier that Git commit-ids.

OK, just to clarify what you are saying here: 

1. revnos don't work because they don't serve the same purpose as revids
or git's SHA1 commit ids.

2. bzr does not support fully distributed development because revnos
"don't work" as stated in #1.

3. Ok, bzr does support distributed development, I just say it doesn't
because I think revids are ugly.

Thus, revids are ugly.

Is this really the argument you want to be making? I'm not disagreeing
with you; it's just that I'm not sure it's relevant.

Can we just put the whole "revnos don't work" thing to rest?

Revnos are only intended to be significant relative to a given branch.
They are not intended to serve as an absolute, global identifier.

Revnos + a url _are_ globally significant, but are not static except in
certain topologies.

Revids are globally significant and static in any topology.

If a user does not like or cannot use revnos, they may use revids.
Revnos are not a tool to be used for every job. In no way does that mean
that they are broken.

If a given developer or group of developers primarily use revnos or
revids, it _may_ indicate that _they_ have a bias towards central (or
star) or distributed development, but does not necessarily have any
bearing on the capability of the VCS being used.

> 
> [...]
> >> And you say that bzr is not biased towards centralization? In git
> >> you can just pull (fetch) to check if there were any changes, and if
> >> there were not you don't get useless marker-merges.
> > 
> > If I don't tell you my branch has something in it ready to grab, you
> > shouldn't merge it.  It probably won't work, and is quite likely to
> > set your computer on fire, slaughter and fillet your pet goldfish, and
> > make demons fly out of your nose.  If you wanna get stuck with all my
> > incomplete WIP, let's just use a CVS module and be done with it.
> 
> In git I can fetch your changes but I don't need to merge them. Take
> for example Junio 'pu' (proposed updates) branch: this is the branch
> you shouldn't merge as it's history is constantly being rewritten.
> 
> If you don't want for your WIP to be publicly available, you don't
> publish it. For example as far as I understand Junio works on Git
> in his private repository, with many, many feature branches, but
> he does push to public [bare] repository only some subset of branches,
> and we can fetch/pull only those.
> 
> But still, if I am impatient I can pull from Junio every hour, and
> I don't get 24 totally useless empty merge messages if he took day
> off and didn't publish any changes till day later.
> 
> >> 2. But the preferred git workflow is to have two branches in each of
> >> two clones. The 'origin' branch where you fetch changes from other
> >> repository (so called "tracking branch") and you don't commit your
> >> changes to [...]
> > 
> > Funny, since this reads to me EXACTLY like the bzr flow of "upstream
> > branch I pull" and "my branch I merge from upstream" that's getting
> > kvetched around...
> 
> But please, have you realized that in this workflow the two clones
> of the same repository are totally symmetrical? One's 'master' is
> another 'origin' and vice versa. After pull on one side, and pull
> on the other side (without any changes in between) we have the same
> contents, and the same revision names (commit-ids in git), even if
> the changes (revisions) got to those clones in different order.
> In bzr those two "branches" would get different revnos. No symmetry.
> Full distributed vs star topology (one branch "central", hence
> "centralized" - I don't mean need to access to one central repository,
> although...)

I think that when I attempt to pull from one branch to another, if they
are identical, neither branch changes. Merging + pulling results in
identical history, causing revnos on the pulling branch to change. Just
merging maintains divergent views of the same history. 

Perhaps bzr has a central bias in the view that each developer has the
option of seeing their own branch as the central focus of his/her
development. This view would be the same from each branch; each
developer views his/her own branch as special. If the developer does not
want to view their own branch specially, they would merge + pull rather
than just merging. If I remember correctly, abentley covered this
earlier in this whole "VCS comparison table" thread.

Anyway, much of this seems to be a disagreement over the definition of
"distributed VCS." Perhaps this is too simplistic, but to my inexpert
eyes, these appear to be the positions of each side:

Bzr: Branches and all shared history may be stored locally in disparate
locations, and all VCS functions are available locally.

Git: Same thing, except that all shared history must also be identically
ordered.

Did I get that right?

In general, as a mere _user_ of distributed VCS, all I care about is if
I can accurately point you to a particular commit or set of commits, and
that you can access them either in shared history or in a given branch.
The fact that the VCS does not require a central branch and facilitates
code interchange, means to me that it is distributed. As long as all
major uses are fully supported, being slightly biased toward one use
case or another is not a distinction I consider to be worth making.

-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
@ 2006-10-22 19:27                                                           ` Jakub Narebski
  2006-10-23 16:57                                                           ` David Lang
  2006-10-23 17:29                                                           ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 19:27 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Carl Worth, bazaar-ng, git, Erik Bågfors

On Sun, Oct 22, 2006 Matthew D. Fuller wrote:
> On Sun, Oct 22, 2006 at 07:25:41AM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
>>
>> 	git pull . mainline
> 
> This throws me a little.  I'd expect it to Just Do It when it's
> fast-forwarding, but if it's doing a merge, I'd prefer it to stop and
> wait before creating the commit, even if there are no textual
> conflicts.  I realize you can just look at it afterward and back out
> the commit if necessary, but still...

Or you can use the --no-commit option to git pull, and commit later.
But it is true that you can always amend the commit with
git commit --amend, even if the commit is a merge.
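
For illustration, here is a small self-contained sketch of that
workflow in a throwaway repository. The branch, file and author names
are made up, and it calls git merge directly, which accepts the same
--no-commit option and is the merge step that a local "git pull ."
performs.

```shell
#!/bin/sh
# Throwaway repository with two diverged branches (all names made up).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
git config user.name  "A U Thor"
git config user.email "author@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Base"

git checkout -q -b featureA
echo feature > feature.txt
git add feature.txt
git commit -q -m "Feature work"

git checkout -q mainline
echo main > main.txt
git add main.txt
git commit -q -m "Mainline work"
git checkout -q featureA

# Merge, but stop before the merge commit is created.
git merge -q --no-commit mainline
git status --short            # review what the merge would record

# Happy with the result: create the merge commit by hand.
git commit -q -m "Merge mainline into featureA"

# Second thoughts about the message: amending keeps both parents.
git commit -q --amend -m "Merge mainline into featureA (reviewed)"
```

Between the merge and the commit, the working tree and index hold the
merged result, so it can be built and tested before anything is
recorded.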
 
>> Ah, I hadn't realized that bzr commits stored an "originating
>> branch" inside them.
> 
> Every branch has a nickname, settable with 'bzr nick' (defaulting to
> whatever the directory it's in is), and that's stored as a text field
> in each commit.  It's mostly cosmetic, but it's handy to see at a
> glance.

If I remember correctly, Linus argued against it, because a branch
name is something local to a repository (the most common example is
"my 'master' is your 'origin'").

There was a proposal for a "note" header for notes like the merge
algorithm used, or the branch name, visible only in 'raw' mode, but it
wasn't implemented.

>> The mainline..featureA syntax literally just means:
>> 
>> 	the set of commits that are reachable by featureA
>> 	and excluding the set of commits reachable by mainline
> 
> From what I can gather from this, though, that means that when I merge
> stuff from featureA into mainline (and keep on with other stuff in
> featureA), I'll no longer be able to see those older commits from this
> command.  And I'll see merged revisions from branches other than
> mainline (until they themselves get merged into mainline), correct?
> It sounds more like a 'bzr missing --mine-only' than looking down a
> mainline in log...

That's true. That is what history viewers (gitk, qgit, tig,
gitview, git-show-branch, git-browser) are for.

And there is always reflog (if you enable it, of course).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 20:47                                                 ` Carl Worth
                                                                     ` (2 preceding siblings ...)
  2006-10-22 12:46                                                   ` Matthew D. Fuller
@ 2006-10-22 19:36                                                   ` David Clymer
  3 siblings, 0 replies; 806+ messages in thread
From: David Clymer @ 2006-10-22 19:36 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, bazaar-ng, Linus Torvalds, Andreas Ericsson,
	git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 1480 bytes --]

On Sat, 2006-10-21 at 13:47 -0700, Carl Worth wrote:
> On Sat, 21 Oct 2006 08:01:11 -0500, "Matthew D. Fuller" wrote:
> > I think we're getting into scratched-record-mode on this.
> 
> I apologize if I've come across as beating a dead horse on this. I've
> really tried to only respond where I still confused, or there are
> explicit indications that the reader hasn't understood what I was
> saying, ("I don't understand how you've come to that conclusion",
> etc.). I'll be even more careful about that below, labeling paragraphs
> as "I'm missing something" or "Maybe I wasn't clear".
> 
> > G: So use revids everywhere.
> >
> > B: Revnos are handier tools for [situation] and [situation] for
> >    [reason] and [reason].
> 
> I'm missing something:
> 
> I still haven't seen strong examples for this last claim. When are
> they handier? I asked a couple of messages back and two people replied
> that given one revno it's trivial to compute the revno of its
> parent. But that's no win over git's revision specifications,
> (particularly since they provide "parent of" operators).

I would say that: revnos are handier tools than revids...etc

I think that since G: was making a statement about revids, B: was making
an implicit comparison with them.

bzr log -r before:1   

being handier than

bzr log -r before:revid:david@zettazebra.com-20061022175244-4b85cb5f0cbc79ad


-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 19:18                                                       ` David Clymer
@ 2006-10-22 19:57                                                         ` Jakub Narebski
  2006-10-22 20:06                                                         ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 19:57 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

David Clymer wrote:
> Bzr: Branches and all shared history may be stored locally in disparate
> locations, and all VCS functions are available locally.

Branches in bzr are both a single-head DAG (of parents) and the
"mainline", i.e. the track of commits committed in this branch-as-place.
Bazaar-NG tries to keep both pieces of information in the DAG by using
the first parent to mark commits on the current branch-as-place.

Additionally, bzr by default uses revnos, numbering commits on a branch,
which requires maintaining mainline identity so that revnos do not change
even for one branch-as-place.

This leads to the need to use "merge" if you want to keep revnos
unchanged, and "pull" if you are not interested in that.


Git correctly realizes that mainline identity is local information,
and instead of trying to save local information in the DAG, which is
shared, it uses the reflog.

[That's of course totally biased view.]
 
> Git: Same thing, except that all shared history must also be identically
> ordered.
That is the EFFECT of preferring fast-forward over preserving the
"first parent is my branch" property. So the RESULT is that
shared history is identically ordered.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 19:18                                                       ` David Clymer
  2006-10-22 19:57                                                         ` Jakub Narebski
@ 2006-10-22 20:06                                                         ` Jakub Narebski
  2006-10-23 11:56                                                           ` David Clymer
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-22 20:06 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

David Clymer wrote:
> 1. revnos don't work because they don't serve the same purpose as revids
> or git's SHA1 commit ids.
Revnos work only locally, or in a star-topology configuration. They have
some consequences: treating the first parent specially, the need for merges
instead of fast-forwards even where a fast-forward would be applicable, and
two different "fetch" operators: "pull" (which uses revids on the
pulled side) and "merge" (which preserves revids on pullee side).

> 2. bzr does not support fully distributed development because revnos
> "don't work" as stated in #1.
Bazaar is biased towards centralized/star-topology development if we
want to use revids. In a fully distributed configuration there is no
"simple namespace".

> 3. Ok, bzr does support distributed development, I just say it doesn't
> because I think revids are ugly.
I think that bzr revids are uglier than git commit-ids.

If "simple namespace" is on the pros side of bzr, you must remember that
it is a simple namespace only for development that is not fully
distributed. The pro of a "simple namespace" comes with the cons of
"merge" vs "pull" and the centralization required for uniqueness of revids.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-21  2:17                                                             ` Junio C Hamano
@ 2006-10-22 21:04                                                               ` Johannes Schindelin
  2006-10-22 23:11                                                                 ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-22 21:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git


If the merge base _and_ the to-be-merged branch have a certain file, but
HEAD has not, do not complain if that file exists anyway. It will not be
overwritten.

Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>

---

	On Fri, 20 Oct 2006, Junio C Hamano wrote:

	> While we are talking about merge-recursive, I could use some
	> help from somebody familiar with merge-recursive to complete the
	> read-tree changes Linus mentioned early this month.
	>
	> The issue is that we would want to remove one verify_absent()
	> call in unpack-tree.c:threeway_merge().  When read-tree decides
	> to leave higher stages around, we do not want it to check if the
	> merge could clobber a working tree file, because having an
	> unrelated file at the same path in the working tree sometimes is
	> and sometimes is not a conflict, depending on the outcome of the
	> merge, and that part of the code does not _know_ the outcome
	> yet.

	How about this? It passes the testsuite, and I tested it with the 
	test case you did, and with the same test case with recursive 
	merge.

 unpack-trees.c |    5 ++---
 1 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 3ac0289..b4994c4 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -658,10 +658,9 @@ int threeway_merge(struct cache_entry **
 	 * up-to-date to avoid the files getting overwritten with
 	 * conflict resolution files.
 	 */
-	if (index) {
+	if (index)
 		verify_uptodate(index, o);
-	}
-	else if (path)
+	else if (no_anc_exists)
 		verify_absent(path, "overwritten", o);
 
 	o->nontrivial_merge = 1;
-- 
1.4.3.1.ga3de1-dirty

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
@ 2006-10-22 23:11                                                                 ` Junio C Hamano
  2006-10-23  0:48                                                                   ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-22 23:11 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> 	How about this? It passes the testsuite, and I tested it with the 
> 	test case you did, and with the same test case with recursive 
> 	merge.
>
>  unpack-trees.c |    5 ++---
>  1 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 3ac0289..b4994c4 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -658,10 +658,9 @@ int threeway_merge(struct cache_entry **
>  	 * up-to-date to avoid the files getting overwritten with
>  	 * conflict resolution files.
>  	 */
> -	if (index) {
> +	if (index)
>  		verify_uptodate(index, o);
> -	}
> -	else if (path)
> +	else if (no_anc_exists)
>  		verify_absent(path, "overwritten", o);
>  
>  	o->nontrivial_merge = 1;

This feels wrong at the philosophical level.  unpack-trees and
read-tree do not know, and more importantly, do not want to
decide, the outcome of the merge, so it should not be doing
verify_absent because it does not know if the path will be
overwritten by the merge.

Complaining when no_anc_exists means that threeway_merge() is
deciding that the merge result should have the path in this
case.  It might be true for the current merge-recursive and
merge-resolve, but I do not think we should force that decision
on future merge strategies, since that is the whole point of
declaring the merge to be nontrivial and _not_ deciding the
outcome ourselves here.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-22 23:11                                                                 ` Junio C Hamano
@ 2006-10-23  0:48                                                                   ` Johannes Schindelin
  2006-10-23  4:17                                                                     ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-10-23  0:48 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Hi,

On Sun, 22 Oct 2006, Junio C Hamano wrote:

> Complaining when no_anc_exists means that threeway_merge() is deciding 
> that the merge result should have the path in this case.

Two points:

- you are correct for at least the case of choosing the merge strategy 
"theirs". (Which does not exist yet.)

- in merge-recursive.c:process_entry() (which is called on _all_ unmerged 
entries after threeway merge), "Case A" reads "deleted in one branch". 
Reading the code again, I believe there is a bug, which should be fixed by

diff --git a/merge-recursive.c b/merge-recursive.c
index 2ba43ae..9f6538a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1005,9 +1005,10 @@ static int process_entry(const char *pat
 		    (!a_sha && sha_eq(b_sha, o_sha))) {
 			/* Deleted in both or deleted in one and
 			 * unchanged in the other */
-			if (a_sha)
+			if (!a_sha) {
 				output("Removing %s", path);
-			remove_file(1, path);
+				remove_file(1, path);
+			}
 		} else {
 			/* Deleted in one and changed in the other */
 			clean_merge = 0;

Note that it not only groups the calls to output() and remove_file(), which
matches the expectation, but also changes the condition to "!a_sha",
meaning that the file is deleted in branch "a", but existed in the merge
base, where it is identical to what is in branch "b".

Of course, this assumes that even in the recursive case, branch "a" is to 
be preferred over branch "b". (If I still remember correctly, then branch 
"a" is either the current head, or the temporary recursive merge, so this 
would make sense to me.)

So, after applying this patchlet, merge-recursive (more precisely: the 
function process_entry()) should behave correctly with the change to 
unpack-trees.c you have in pu, i.e. the change that drops that 
verify_absent() call to the floor.

However, I could use some additional optical lobes here.

Ciao,
Dscho

P.S.: Maybe I was wrong on my earlier assessment, that merge-recursive 
does not optimize the "subtrees have identical SHA1s" case. This should be 
handled pretty well by the call to unpack_trees() with threeway merge.

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: [PATCH] threeway_merge: if file will not be touched, leave it alone
  2006-10-23  0:48                                                                   ` Johannes Schindelin
@ 2006-10-23  4:17                                                                     ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-23  4:17 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> diff --git a/merge-recursive.c b/merge-recursive.c
> index 2ba43ae..9f6538a 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1005,9 +1005,10 @@ static int process_entry(const char *pat
>  		    (!a_sha && sha_eq(b_sha, o_sha))) {
>  			/* Deleted in both or deleted in one and
>  			 * unchanged in the other */
> -			if (a_sha)
> +			if (!a_sha) {
>  				output("Removing %s", path);
> -			remove_file(1, path);
> +				remove_file(1, path);
> +			}
>  		} else {
>  			/* Deleted in one and changed in the other */
>  			clean_merge = 0;
>
> Note that not only it groups the call to output() and remove_file(), which 
> matches the expectation, but also changes the condition to "!a_sha", 
> meaning that the file is deleted in branch "a", but existed in the merge 
> base, where it is identical to what is in branch "b".

I think the conditional "output" is to mimic the first case in
git-merge-one-file; there we conditionally give that message
only when ours had that path.  If we lost the path while they
have it the same way as the common ancestor, then we do not have
the path to begin with when we start the merge.  It is not
correct to say "Removing" in such a case.

So the output() call being tied to if (a_sha) _is_ correct in
your code.

What we would want to prevent is to remove the path from the
working tree when we did not have the path at the beginning of
the merge and the merge result says we do not want that path.
In such a case, the file in the working tree is an untracked
file that is not touched by the merge.

E.g. gitweb/gitweb.cgi is not tracked in the current "master",
but used to be around v1.4.0 time.  If you try to merge a
branch forked from v1.4.0 because you are interested in work
on another part of the system (i.e. the branch did not touch
gitweb/ at all), we want to successfully merge that branch into
our "master" even after "make" created gitweb/gitweb.cgi.

Such a merge would start with your HEAD and index missing
gitweb/gitweb.cgi but the path still in your working tree.  The
common ancestor and their tree has the path tracked, so you
would end up with identical stage #1 and #3 with missing stage
#2.

The merge machinery should say the merge result does not have
the path, so it should remove it from the index.  However, it
should _not_ touch the untracked (from the beginning of the time
the merge started) working tree file.  So remove_file() call you
touch in your patch needs to be told not to update working
directory in such a case.
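
To make the rule above concrete, here is a small Python model of the
"deleted in one side, unchanged in the other" decision. It is only a
sketch of the behaviour described in this message, not the code in
merge-recursive.c: the Action tuple and process_deleted_entry() are
hypothetical names, and blob ids are plain strings with None meaning
"path absent in that tree".

```python
from typing import NamedTuple, Optional


class Action(NamedTuple):
    keep_in_result: bool        # does the merge result contain the path?
    update_index: bool          # drop the path's entries from the index
    remove_worktree_file: bool  # delete the file in the working tree
    message: Optional[str]      # what to report to the user, if anything


def process_deleted_entry(path, base, ours, theirs):
    """Decide one path for the deletion cases only (hypothetical helper).

    Returns None when the path does not fall under these cases and
    needs some other handling.
    """
    if base is None:
        return None
    deleted_both = ours is None and theirs is None
    ours_deleted = ours is None and theirs == base
    theirs_deleted = theirs is None and ours == base
    if not (deleted_both or ours_deleted or theirs_deleted):
        return None
    # Only touch the working tree (and only say "Removing") if we
    # actually had the path when the merge started; otherwise any file
    # at that path is untracked and must be left alone.
    had_path = ours is not None
    message = f"Removing {path}" if had_path else None
    return Action(False, True, had_path, message)


# The gitweb/gitweb.cgi scenario: stage #1 == stage #3, stage #2 missing.
print(process_deleted_entry("gitweb/gitweb.cgi", "abc", None, "abc"))
```

In the gitweb.cgi scenario the model drops the path from the index but
leaves the untracked build product in the working tree, which is the
outcome argued for above.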

Under "aggressive" rule, threeway_merge() is requested to make
the merge policy decision, so it should also loosen this check
itself.  The change by commit 0b35995 needs to be updated with
this patch:

diff --git a/unpack-trees.c b/unpack-trees.c
index b1d78b8..7cfd628 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -642,7 +642,7 @@ int threeway_merge(struct cache_entry **
 		    (remote_deleted && head && head_match)) {
 			if (index)
 				return deleted_entry(index, index, o);
-			else if (path)
+			else if (path && !head_deleted)
 				verify_absent(path, "removed", o);
 			return 0;
 		}

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 17:12                                                           ` Linus Torvalds
@ 2006-10-23  5:19                                                             ` Matthew Hannigan
  0 siblings, 0 replies; 806+ messages in thread
From: Matthew Hannigan @ 2006-10-23  5:19 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tim Webster, bazaar-ng, git, Jakub Narebski

On Sun, Oct 22, 2006 at 10:12:00AM -0700, Linus Torvalds wrote:
> [ ... ]
> 
>    Again, the way to solve this would tend to be to have a few helper 
>    scripts that use regular file-contents that _describe_ these things to 
>    do "realdiff" and "install".
> 
> In other words, for at least three _totally_ different reasons, you really 
> don't want to do tracking/development directly in /etc, but you want to 
> have a buffer zone to do it. And once you have that, you might as well do 
> _that_ as the repository, and just add a few specialty commands (let's 
> call them "plugins" to make everybody happy) to do the special things.

Damn you stole my idea!  I had this scheme brewing in my head too,
with some slight variations:

> 	# copy the data, set up a PERMISSIONS file to track extra info
> 	sudo cp /etc/group /etc/passwd /etc/shadow .
> 	sudo chown user:user *
> 	cat <<EOF > PERMISSIONS
> 	group root:root 0644
> 	passwd root:root 0644
> 	shadow root:root 0400
> 	EOF

You may want one perms/metadata file per real file (file.meta?) with contents
like:
	owner root
	group root
	perms u=r,go=

for possibly easier to digest diff output. You could omit "don't care" variables.
You could still have one overarching file (DEFAULT.meta) for defaults.  Also, you
may want to track the implied umask instead of the real perms.

You could also track the pathname, (e.g. path /etc/group, path /etc/inet/hosts) so you
didn't have to match the structure of the working tree to the actual destination.
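
As a rough illustration of the per-file metadata idea, the following
Python sketch parses "key value" lines and overlays a file's own
metadata on the defaults. The function names are made up for this
example, the "perms u=rw,go=r" default is an assumed value, and nothing
here applies ownership or permissions to real files.

```python
def parse_meta(text):
    """Parse 'key value' lines into a dict, skipping blank lines."""
    meta = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first space only, so values may contain '=' or ','.
        key, _, value = line.partition(" ")
        meta[key] = value.strip()
    return meta


def effective_meta(defaults_text, file_text):
    """Overlay one file's metadata on the DEFAULT.meta contents.

    Keys the per-file metadata omits ("don't care") fall back to the
    defaults.
    """
    merged = parse_meta(defaults_text)
    merged.update(parse_meta(file_text))
    return merged


# Hypothetical contents of DEFAULT.meta and shadow.meta.
default_meta = "owner root\ngroup root\nperms u=rw,go=r\n"
shadow_meta = "perms u=r,go=\npath /etc/shadow\n"

print(effective_meta(default_meta, shadow_meta))
```

Because each key sits on its own line, a change to a single attribute
shows up as a one-line diff in whatever SCM tracks the .meta files.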

> And again, I'm not going to even claim that the above two "plugins" are 
> the right ones (maybe you want other operations too to interact with the 
> "real" installed files),  [ ... ]

Yes, there are other very useful transformations possible.  One example is to
split the /etc/group file into a series of files, each named after the group,
with contents the sorted list of members.  Again, this is useful for 'diff' and
any SCM. It's important that it's a lossless transformation in both
directions; you may want to scan the destination and make sure
your base revision matches it before 'git install'.

> Btw: none of this is really "git-specific". The above tells you how to do 
> local "git plugins", and it's obviously fairly trivial, but I suspect any 
> SCM can be used in this manner.

Indeed, the essential thing about this is you're representing any
system modification as a text diff, so it makes sense for any
SCM.  In fact the 'plugin' for any SCM would be 95% the same code.

This might also be useful for SCMs that don't handle symlinks
natively.

--
Matt Hannigan

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 20:06                                                         ` Jakub Narebski
@ 2006-10-23 11:56                                                           ` David Clymer
  2006-10-23 12:54                                                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: David Clymer @ 2006-10-23 11:56 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 2212 bytes --]

On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
> David Clymer wrote:
> > 1. revnos don't work because they don't serve the same purpose as revids
> > or git's SHA1 commit ids.
> Revnos works only locally, or in star-topology configuration. They have
> some consequences: treating first parent specially, need for merges
> instead of fast-forward even if fast-forward would be applicable,
> two different "fetch" operators: "pull" (which uses revids on the
> pulled side) and "merge" (which preserves revids on pullee side).

s/revids/revnos/g  but yes, I think I said this later in my previous
email.

> 
> > 2. bzr does not support fully distributed development because revnos
> > "don't work" as stated in #1.
> Bazaar is biased towards centralized/star-topology development if we
> want to use revids. In fully distributed configuration there is no
> "simple namespace".

So revnos aren't globally meaningful in fully distributed settings. So
what? I don't see how this translates into bias. There is a lot of
functionality provided by bazaar that doesn't really apply to my use
case, but it doesn't mean that it is indicative of some bias in bazaar.

> 
> > 3. Ok, bzr does support distributed development, I just say it doesn't
> > because I think revids are ugly.
> I think that bzr revids are uglier that git commit-ids.
> 
> If on the pros side of bzr is "simple namespace", you must remember that
> it is simple namespace only for not fully distributed development. The
> pros of "simple namespace" with cons of "merge" vs "pull" and centralization
> required for uniqueness of revids.

I think you've switched revids and revnos, but I get what you are
saying. In fact, I think I said pretty much the same thing in the email
you are replying to. I don't think that anyone is disagreeing about
anything other than the assertion that bzr is biased because revnos are
used to simplify cases where it is possible to do so.

In any case, Matthew Fuller & Carl Worth cover this in greater detail in
emails further down in this thread (or one of its siblings), so I think
I'll stop here.

-davidc

-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 11:56                                                           ` David Clymer
@ 2006-10-23 12:54                                                             ` Jakub Narebski
  2006-10-23 15:01                                                               ` James Henstridge
  2006-10-24  3:24                                                               ` David Clymer
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 12:54 UTC (permalink / raw)
  To: David Clymer
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

On Mon, Oct 23, 2006 David Clymer wrote:
> On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
>> David Clymer wrote:

>>> 2. bzr does not support fully distributed development because revnos
>>> "don't work" as stated in #1.
>>
>> Bazaar is biased towards centralized/star-topology development if we
>> want to use revnos. In fully distributed configuration there is no
>> "simple namespace".
> 
> So revnos aren't globally meaningful in fully distributed settings. So
> what? I don't see how this translates into bias. There is a lot of
> functionality provided by bazaar that doesn't really apply to my use
> case, but it doesn't mean that it is indicative of some bias in bazaar.

First, bzr is biased towards using revnos: bzr commands use revnos
by default to specify a revision (you have to use the revid: prefix/operator
to use revision identifiers), bzr commands output revids only when
requested, and usage examples use revision numbers.

In order to use revnos as _global_ identifiers in distributed development,
you need a central "branch", a mainline, to provide those revnos. You have
either to have access to this "revno server" and refer to revisions by
"revno server" URL and revision number, or to designate one branch as holding
revision numbers (the "revno server") and preserve revnos on the "revno server"
by using bzr "merge", while copying revnos when fetching by using bzr "pull"
for leaf branches. In short: for revnos to be global identifiers you need
a star topology.

Even if you use revnos only locally, you need to know which revisions are
"yours", i.e. besides the branch as the DAG of history of a given revision you
need an "ordered series of revisions" (to quote the Bazaar-NG wiki Glossary), or
a path through this graph from the given revision to one of the roots (initial,
parentless revisions). Because bzr does that by preserving the mentioned path
as the first-parent path (treating the first parent specially), i.e. by storing
local information in a DAG (which is shared), to preserve revnos you need to
use "merge" instead of "pull", which means that you get an empty merge in
a clear fast-forward case. This means a "local changes bias", which some
might take as not being fully distributed.
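
The effect described above can be shown with a toy model in Python. It
is a sketch under simplifying assumptions, not bzr's actual data model:
revision names are invented, a revno is taken to be the position on the
first-parent path counted from the root, "pull" is modelled as a plain
fast-forward, and "merge" as a new revision whose first parent is the
old local tip.

```python
# Each revision maps to its list of parents, first parent first.
parents = {
    "r0": [],
    "a1": ["r0"],        # my local commit
    "b1": ["r0"],        # someone else's commit
    "b2": ["b1", "a1"],  # they merged my work; their first parent is b1
}


def revnos(tip, graph):
    """Number the revisions on the first-parent path, root = 1."""
    path = []
    rev = tip
    while rev is not None:
        path.append(rev)
        revision_parents = graph[rev]
        rev = revision_parents[0] if revision_parents else None
    path.reverse()
    return {rev: number for number, rev in enumerate(path, start=1)}


before = revnos("a1", parents)

# "pull": my tip fast-forwards to their tip b2, so their first-parent
# path becomes my mainline.
after_pull = revnos("b2", parents)

# "merge": a new revision m1 with my old tip as first parent.
merged = dict(parents, m1=["a1", "b2"])
after_merge = revnos("m1", merged)

print(before)
print(after_pull)
print(after_merge)
```

In this model a1 is revno 2 before and after the "merge", while after
the "pull" revno 2 refers to b1 and a1 is no longer on the numbered
path at all.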

Sidenote 1: Why does Bazaar-NG try to store the "branch as ordered series
of revisions"/"branch as path through the revision DAG" in the DAG instead
of storing it separately (like the reflog stores the history of the tip of
a branch, which is roughly equivalent to "branch as path" in bzr)? It needs
some kind of cache of the mapping from revno to the revision itself anyway
(unless performance doesn't matter for bzr developers ;-)! All that is
left is to propagate this mapping on "pull"...

Sidenote 2: A "fringe" developer using the default git configuration of
an 'origin' branch tracking the 'master' branch in the cloned (mainline) repo,
and a 'master' branch on which he/she does his/her own work, who has committed
at least a single revision on his/her 'master' branch, and whose changes
are never pulled (if they get into the mainline repo, it is via a "side"
channel like git-enhanced patches sent to the project mailing list),
will see a picture similar to the bzr branch which uses "merge".


The whole discussion about validity of revision numbers started
with "simple namespace" feature in SCM comparison matrix on Bazaar-NG
wiki...
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 12:54                                                             ` Jakub Narebski
@ 2006-10-23 15:01                                                               ` James Henstridge
  2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-24  3:24                                                               ` David Clymer
  1 sibling, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-23 15:01 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: David Clymer, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Carl Worth, Andreas Ericsson, git

On 23/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> First, bzr is biased towards using revnos: bzr commands uses revnos
> by default to provide revision (you have to use revid: prefix/operator
> to use revision identifiers), bzr commands outputs revids only when
> requested, examples of usage uses revision numbers.

As has been said before, you can set an alias to always show revision
IDs in "bzr log" output.


> In order to use revnos as _global_ identifiers in distributed development,
> you need central "branch", mainline, to provide those revnos. You have
> either to have access to this "revno server" and refer to revisions by
> "revno server" URL and revision number, or designate one branch as holding
> revision numbers ("revno server") and preserve revnos on "revno server"
> by using bzr "merge", while copying revnos when fetching by using bzr "pull"
> for leaf branches. In short: for revnos to be global identifiers you need
> star-topology.

Why do you continue to repeat this argument?  No one is claiming that
a revision number by itself, as Bazaar uses them, is a global
identifier.  In fact, we keep on saying that they only have meaning in
the context of a branch.  If you want to use a revision number as part
of a globally unique identifier, it needs to be in combination with
its branch.


> Even if you use revnos only locally, you need to know which revisions are
> "yours", i.e. beside branch as DAG of history of given revision you need
> "ordered series of revisions" (to quote Bazaar-NG wiki Glossary), or path
> through this diagram from given revision to one of the roots (initial,
> parentless revisions). Because bzr does that by preserving mentioned path
> as first-parent path (treating first parent specially), i.e. storing local
> information in a DAG (which is shared), to preserve revnos you need to
> use "merge" instead of "pull", which means that you get empty-merge in
> clearly fast-forward case. This means "local changes bias", which some
> might take as not being fully distributed.

I won't dispute that Bazaar has features that make it easier to work
with the revisions in the line of development of the branch you're
working on in comparison to the revisions from merges.  But given that
every Bazaar branch has this same bias towards their own main line of
development, how can that affect whether or not it is distributed?

James.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
@ 2006-10-23 16:57                                                           ` David Lang
  2006-10-23 17:29                                                           ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-23 16:57 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Erik Bågfors, bazaar-ng, git, Jakub Narebski

>> This special treatment influences or directly causes many of the
>> things in bzr that we've been discussing:
>  [...]
>> I've been arguing that all of these impacts are dubious. But I can
>> understand that a bzr user hearing arguments against them might fear
>> that they would lose the ability to be able to see a view of commits
>> that "belong" to a particular branch.
>
> Dead center.
>
>
>> The mainline..featureA syntax literally just means:
>>
>> 	the set of commits that are reachable by featureA
>> 	and excluding the set of commits reachable by mainline
>
> From what I can gather from this, though, that means that when I merge
> stuff from featureA into mainline (and keep on with other stuff in
> featureA), I'll no longer be able to see those older commits from this
> command.  And I'll see merged revisions from branches other than
> mainline (until they themselves get merged into mainline), correct?
> It sounds more like a 'bzr missing --mine-only' than looking down a
> mainline in log...

one thing you are missing: 'mainline' in this git command is not saying
"everything that's in the 'main' published branch". it's saying "everything
reachable by the tag 'mainline'".

so when you branched off for your feature development you could set a tag that 
says 'branchpoint' and no matter what gets merged in mainline after that you can 
always do branchpoint..featureA and find what you've done.

that being said, mainline..featureA is also extremely useful: it tells you what
development work you have done that has not yet been merged into mainline
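
The two ranges can be modelled as set differences over reachable
commits. The Python sketch below uses an invented commit graph and
invented ref names purely for illustration; it models the "A..B"
meaning quoted above and does not call git.

```python
# Toy commit graph: each commit maps to the list of its parents.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],        # mainline tip at branch time, tagged 'branchpoint'
    "F1": ["C"],       # first commit on featureA
    "F2": ["F1"],
    "M": ["C", "F2"],  # mainline merges featureA
    "F3": ["F2"],      # featureA continues after the merge
}

refs = {"branchpoint": "C", "mainline": "M", "featureA": "F3"}


def reachable(tip):
    """Return the set of commits reachable from tip, tip included."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def rev_range(exclude, include):
    """Model of 'exclude..include': reachable from include, not from exclude."""
    return reachable(refs[include]) - reachable(refs[exclude])


# Work on featureA not yet merged into mainline.
print(sorted(rev_range("mainline", "featureA")))
# Everything done on featureA since the branch point.
print(sorted(rev_range("branchpoint", "featureA")))
```

After the merge M, the first range shrinks to the commit made since the
merge, while the second still lists all three feature commits because
the 'branchpoint' tag never moves.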

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 15:01                                                               ` James Henstridge
@ 2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-23 17:53                                                                   ` Jakub Narebski
  2006-10-23 20:06                                                                   ` Jeff King
  0 siblings, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-23 17:18 UTC (permalink / raw)
  To: James Henstridge
  Cc: Jakub Narebski, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

James Henstridge wrote:
> Why do you continue to repeat this argument?  No one is claiming that
> a revision number by itself, as Bazaar uses them, is a global
> identifier.  In fact, we keep on saying that they only have meaning in
> the context of a branch.

And, unlike git, Bazaar branches are all independent entities[1], and
they each have a URL.

So:

http://code.aaronbentley.com/bzrrepo/bzr.ab 1695

is a name for

abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

And it does not depend on any other branch, especially not bzr.dev

Since:
1. anyone with write access to the urls can create them
2. anyone with read access to the urls can read them
3. the maintainers of the mainline have no control over them
   (except as provided by 1)

these identifiers are not centralized.

Aaron

[1] The fact that they may share storage is not important to the model.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFPPlm0F+nu1YWqI0RAlmLAJ9cpw5X7UXQ82EmoIeUrKzEaFbhdACfZPsS
CRJ69XWi7XAWJRi7Fgt9ICU=
=WrV9
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-22 18:53                                                         ` Matthew D. Fuller
  2006-10-22 19:27                                                           ` Jakub Narebski
  2006-10-23 16:57                                                           ` David Lang
@ 2006-10-23 17:29                                                           ` Linus Torvalds
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 17:29 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Erik Bågfors, bazaar-ng, git, Jakub Narebski



On Sun, 22 Oct 2006, Matthew D. Fuller wrote:
> 
> > This special treatment influences or directly causes many of the
> > things in bzr that we've been discussing:
>   [...]
> > I've been arguing that all of these impacts are dubious. But I can
> > understand that a bzr user hearing arguments against them might fear
> > that they would lose the ability to be able to see a view of commits
> > that "belong" to a particular branch.
> 
> Dead center.

The thing that the bzr people don't seem to realize is that their choice 
of revision naming has serious side effects, some of them really 
technical, and limiting.

I already brought this up once, and I suspect that the bzr people simply 
DID NOT UNDERSTAND the question:

 - how do you do the git equivalent of "gitk --all"

which is just another reason why "branch-local" revision naming is simply 
stupid and has real _technical_ problems.

I really suspect that a lot of people can't see further than their own 
feet, and don't understand the subtle indirect problems that branch-local 
naming causes. 

For example, how long does it take to do an arbitrary "undo" (ie forcing a 
branch to an earlier state) in a project with tens of thousands of 
commits? That's actually a really important operation, and yes, 
performance does matter. It's something that you do a lot when you do 
things like "bisect" (which I used to approximate with BK by hand, and 
yes, re-weaving the branch history was apparently a big part of why it 
took _minutes_ to do sometimes).

Again, this is something that people don't expect to have _anything_ to do 
with revision numbering, but the fact is, it's a big part of the picture. 
If you have branch-local revision numbering, you need to renumber all 
revisions on events like this, and even if it is "just" re-creating the 
revno->"real ID" cache, it's actually an expensive operation exactly 
because it's going to be at least linear in history.

One of the git design requirements was that no operation should _ever_ 
need to be linear in history size, because it becomes a serious limiter of 
scalability at some point. We were seeing some of those issues with BK, 
which is why I cared.

So in git, doing things like jumping back and forth in history is O(1). 
Always (with a really low constant cost too). Of course, checking out the 
end result is then roughly O(n), but even there "n" is the size of the 
_changes_, not number of revisions or number of files.

(And there are obviously operations that _are_ O(revision history), the 
most trivial one being anything that visualizes all of history - but they 
depend on the size of history not because the operation itself gets more 
expensive, but because the dataset increases).
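
The cost difference described above can be sketched as follows. This is a rough model only: it assumes revision numbers are derived by walking first parents from the branch tip, which is a simplification and not bzr's actual implementation, and the function names are hypothetical.

```python
def build_revno_cache(parents, tip):
    """Walk first parents from tip to the root and number them 1..n.

    Every mainline revision is visited, so the cost grows with the
    length of history.
    """
    mainline = []
    commit = tip
    while commit is not None:
        mainline.append(commit)
        first = parents[commit]
        commit = first[0] if first else None
    mainline.reverse()
    return {revno: commit for revno, commit in enumerate(mainline, start=1)}

def reset_branch(refs, branch, commit_id):
    """With stable global names, moving a branch is one pointer update."""
    refs[branch] = commit_id

# A linear toy history c0 <- c1 <- ... <- c9999 (names are made up).
parents = {"c0": []}
for i in range(1, 10_000):
    parents[f"c{i}"] = [f"c{i - 1}"]

refs = {"master": "c9999"}
reset_branch(refs, "master", "c5000")                 # a single assignment
cache = build_revno_cache(parents, refs["master"])    # walks c5000 back to c0
print(len(cache))
```

The reset itself touches one entry, while rebuilding the number-to-ID table has to visit every revision that remains on the branch.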

The whole confusion between "bzr pull" and "bzr merge" is another 
_technical_ sign of why branch-local revision numbers are a mistake. 

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:18                                                                 ` Aaron Bentley
@ 2006-10-23 17:53                                                                   ` Jakub Narebski
  2006-10-23 18:04                                                                     ` Linus Torvalds
  2006-10-23 20:06                                                                   ` Jeff King
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 17:53 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git

Aaron Bentley wrote:
> James Henstridge wrote:

>> Why do you continue to repeat this argument?  No one is claiming that
>> a revision number by itself, as Bazaar uses them, is a global
>> identifier.  In fact, we keep on saying that they only have meaning in
>> the context of a branch.
> 
> And, unlike git, Bazaar branches are all independent entities[1], and
> they each have a URL.
> 
> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> And it does not depend on any other branch, especially not bzr.dev
> 
> Since:
> 1. anyone with write access to the urls can create them
> 2. anyone with read access to the urls can read them
> 3. the maintainers of the mainline have no control over them
>    (except as provided by 1)
> 
> these identifiers are not centralized.

If you don't use centralized numbers (i.e. always referring to bzr.dev,
either by always using (bzr.dev URL, revno), or by using "merge" for
bzr.dev and "pull" for the rest), the numbers are volatile. If the URL
vanishes, then the (URL, revno) to revid mapping is no longer valid. Yeah,
I know, cool URIs don't change...

Besides, you need [constant] network access for this mapping.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:53                                                                   ` Jakub Narebski
@ 2006-10-23 18:04                                                                     ` Linus Torvalds
  2006-10-23 18:21                                                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:04 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Andreas Ericsson,
	Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> Besides, you need [constant] network access for this mapping.

I _think_ that Aaron was trying to say that

	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

is always constant, so you can use that.

Of course, nobody will ever do that, because in practice they're not 
shown, the same way the "true" BK revision names were never shown and thus 
never really used.

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:04                                                                     ` Linus Torvalds
@ 2006-10-23 18:21                                                                       ` Jakub Narebski
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
  2006-10-23 18:34                                                                         ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 18:21 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

Linus Torvalds wrote:
> 
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
>> 
>> Besides, you need [constant] network access for this mapping.
> 
> I _think_ that Aaron was trying to say that
> 
> 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> 
> is always constant, so you can use that.
> 
> Of course, nobody will ever do that, because in practice they're not 
> shown, the same way the "true" BK revision names were never shown and thus 
> never really used.

By the way, I wonder if accidentally identical revisions
(see example for accidental clean merge on revctrl.org)
would get the same revision id in bzr. In git they would.
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:21                                                                       ` Jakub Narebski
@ 2006-10-23 18:26                                                                         ` Jelmer Vernooij
  2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:34                                                                         ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:26 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

[-- Attachment #1: Type: text/plain, Size: 1086 bytes --]

On Mon, 2006-10-23 at 20:21 +0200, Jakub Narebski wrote:
> Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> >> 
> >> Besides, you need [constant] network access for this mapping.
> > 
> > I _think_ that Aaron was trying to say that
> > 
> > 	abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31
> > 
> > is always constant, so you can use that.
> > 
> > Of course, nobody will ever do that, because in practice they're not 
> > shown, the same way the "true" BK revision names were never shown and thus 
> > never really used.
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.
They won't. The revision id is made up of the committer's email address,
a timestamp and a bunch of random data. It wouldn't be hard to switch
to using checksums as revids instead, but I don't think there are any plans
in that direction.

Cheers,

Jelmer
-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
@ 2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:44                                                                             ` Jelmer Vernooij
  2006-10-23 18:45                                                                             ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 18:31 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

Jelmer Vernooij wrote:
>> By the way, I wonder if accidentally identical revisions
>> (see example for accidental clean merge on revctrl.org)
>> would get the same revision id in bzr. In git they would.

> They won't. The revision id is made up of the committer's email address,
> a timestamp and a bunch of random data. It wouldn't be hard to switch
> to using checksums as revids instead, but I don't think there are any plans
> in that direction.

The place for timestamp and committer info is in the revision metadata
(in commit object in git). Not in revision id. Unless you think that
"accidentally the same" doesn't happen...
-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:21                                                                       ` Jakub Narebski
  2006-10-23 18:26                                                                         ` Jelmer Vernooij
@ 2006-10-23 18:34                                                                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:34 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> By the way, I wonder if accidentally identical revisions
> (see example for accidental clean merge on revctrl.org)
> would get the same revision id in bzr. In git they would.

git can have no "accidentally identical revisions". They'd have to be 
purposefully done, but yes, they'd obviously (on purpose) get the same 
revision name if that's the case.

You may think of tree (not commit) identity, where git on purpose names 
trees the same regardless of how you got to them. So on a _tree_ level, 
you are always supposed to get the same result regardless of how you 
import things (ie two people importing the same tar-ball should always get 
exactly the same tree ID).

But the actual commit names are identical only if the same people are 
claimed to have authored (and committed) them at the same time - so it's 
definitely not "accidental" if the commits are called the same: they 
really _are_ the same.
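
The tree-versus-commit distinction above can be reproduced with a few lines of hashing. This sketch follows git's object format (SHA-1 over a "type size" header, a NUL byte, then the body), but it only handles flat trees of regular files; the helper names and the example identities are made up for illustration.

```python
import hashlib

def git_object_id(kind, body):
    """SHA-1 over the header '<kind> <size>', a NUL byte, then the body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_id(entries):
    """entries: list of (name, blob_hex_id) pairs, regular files only."""
    body = b"".join(
        b"100644 " + name.encode() + b"\0" + bytes.fromhex(blob_hex)
        for name, blob_hex in sorted(entries)
    )
    return git_object_id("tree", body)

def commit_id(tree_hex, parents, author, committer, message):
    """Hash a commit: tree, parents, author, committer, blank line, message."""
    lines = [f"tree {tree_hex}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_id("commit", "\n".join(lines).encode())

# Two people import the same content (hypothetical identities and times).
blob = git_object_id("blob", b"hello\n")
tree_a = tree_id([("README", blob)])
tree_b = tree_id([("README", blob)])
alice = "Alice <alice@example.org> 1161600000 +0000"
bob = "Bob <bob@example.org> 1161603600 +0000"
commit_a = commit_id(tree_a, [], alice, alice, "Import tarball\n")
commit_b = commit_id(tree_b, [], bob, bob, "Import tarball\n")

print(tree_a == tree_b)       # same content, same tree name
print(commit_a == commit_b)   # different author/time, different commit name
```

The tree names agree because only content goes into them; the commit names differ because authorship and timestamps are part of what gets hashed.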

Btw, I think you misunderstand the term "accidental clean merge". It means 
that two identical changes on two branches will merge without conflicts 
being reported.

A merge algorithm that doesn't do "accidental clean merge" is totally 
broken. The accidental clean merge is a usability requirement for pretty 
much anything - you often have two branches doing the same thing (possibly 
for different reasons - two people independently found the same bug that 
showed itself in two different ways - so they may even think that they 
are fixing different issues, and may have written totally different 
changelogs to explain the bug, but the solution is identical and should 
obviously merge cleanly).
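
A minimal sketch of that behaviour, assuming a simplified three-way merge that compares lines position by position (real merge tools first align the files with a diff, which is skipped here); `merge3` is a hypothetical helper, not any tool's API.

```python
def merge3(base, ours, theirs):
    """Line-by-line three-way merge for inputs of equal length.

    Returns (merged_lines, conflicts), where conflicts lists the indices
    at which both sides changed the base line in different ways.
    """
    merged, conflicts = [], []
    for i, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:            # both sides agree: unchanged, or the same change
            merged.append(o)
        elif o == b:          # only theirs changed this line
            merged.append(t)
        elif t == b:          # only ours changed this line
            merged.append(o)
        else:                 # both changed it, in different ways
            merged.append(o)
            conflicts.append(i)
    return merged, conflicts

base   = ["x = 1", "y = x / 0", "print(y)"]
ours   = ["x = 1", "y = x / 1", "print(y)"]   # one person fixes the bug
theirs = ["x = 1", "y = x / 1", "print(y)"]   # another finds the same fix

merged, conflicts = merge3(base, ours, theirs)
print(merged, conflicts)
```

Because both sides arrive at the same line, the first branch of the comparison takes it and no conflict is recorded.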

So "accidental clean merge" may _sound_ like something bad, but it's 
actually a seriously good property (it's really just a special case of 
"convergence" - again, that's a good thing).

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:31                                                                           ` Jakub Narebski
@ 2006-10-23 18:44                                                                             ` Jelmer Vernooij
  2006-10-23 18:45                                                                             ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:44 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Linus Torvalds, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

[-- Attachment #1: Type: text/plain, Size: 1202 bytes --]

On Mon, 2006-10-23 at 20:31 +0200, Jakub Narebski wrote:
> Jelmer Vernooij wrote:
> >> By the way, I wonder if accidentally identical revisions
> >> (see example for accidental clean merge on revctrl.org)
> >> would get the same revision id in bzr. In git they would.
> 
> > They won't. The revision id is made up of the committer's email address,
> > a timestamp and a bunch of random data. It wouldn't be hard to switch
> > to using checksums as revids instead, but I don't think there are any plans
> > in that direction.
> The place for timestamp and committer info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...
The revision id isn't parsed by bzr. It's just a unique identifier that
is generated at commit-time and is currently created by concatenating
those three fields. It can be anything you like. The bzr-svn plugin for
example creates revision ids in the form
svn:REVNUM@REPOS_UUID-BRANCHPATH and bzr-git uses git:GITREVID. Nothing
will break if bzr would start using a different format.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:31                                                                           ` Jakub Narebski
  2006-10-23 18:44                                                                             ` Jelmer Vernooij
@ 2006-10-23 18:45                                                                             ` Linus Torvalds
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 18:45 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Jelmer Vernooij, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git



On Mon, 23 Oct 2006, Jakub Narebski wrote:
> 
> The place for timestamp and committer info is in the revision metadata
> (in commit object in git). Not in revision id. Unless you think that
> "accidentally the same" doesn't happen...

Well, git and bzr really do share the same "stable" revision naming, 
although in git it's more indirect, and thus "covers" more.

In git, the revision name indirectly includes the commit comments too (and 
git obviously also distinguishes between "committer" and "author", and 
those end up being indirectly credited in the name of the commit too). But 
in a very real sense, the bzr stable ("real") revision name does 
effectively contain the same things as a git ID: it's just that it's a 
small subset (only committer+date+random number) of what git includes in 
its names.

So you could more easily _fake_ a commit name in bzr, and depending on how 
things are done it might be more open to malicious attacks for that reason 
(or unintentionally - if two people apply the exact same patch from an 
email, and take the author/date info from the email like git does, you 
might have clashes. But with a 64-bit random number, that's probably 
unlikely, unless you also hit some other bad luck like having the 
pseudo-random sequence seeded by "time()", and people just _happen_ to 
apply the email at the exact same second).
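
The unlucky case described above can be sketched like this. The revid layout (committer, timestamp, 64 random bits) follows the description in this thread, but `make_revid` and the seeding scheme are assumptions for illustration, not bzr's actual code.

```python
import random

def make_revid(committer, timestamp, rng):
    """Hypothetical revid: committer, timestamp, and 64 random bits in hex."""
    return f"{committer}-{timestamp}-{rng.getrandbits(64):016x}"

# Two people apply the same emailed patch, taking author and date from the
# email, and both happen to seed their generator with the same time() value.
seed = 1161629100
first = make_revid("author@example.org", "20061023184500", random.Random(seed))
second = make_revid("author@example.org", "20061023184500", random.Random(seed))

print(first == second)   # True: identical inputs and identical seed
```

With an unpredictable source of randomness the last field would differ, which is why a clash needs this extra piece of bad luck.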

The git use of hashes and parenthood information make any accidental 
clashes like that a non-issue: if you have exactly the same information, 
it really _is_ the same commit, since the hash includes the parenthood 
too. So you're left with just malicious attacks, and those currently look 
practically impossible too, of course.

So I don't think bzr and git differ in this respect. I think you can 
_trust_ stable git names a lot more, but that's a separate issue.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:45                                                                             ` Linus Torvalds
@ 2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
                                                                                                   ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Jelmer Vernooij @ 2006-10-23 18:56 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Jakub Narebski, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Andreas Ericsson, Carl Worth, git

[-- Attachment #1: Type: text/plain, Size: 2030 bytes --]

On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > The place for timestamp and committer info is in the revision metadata
> > (in commit object in git). Not in revision id. Unless you think that
> > "accidentally the same" doesn't happen...
> Well, git and bzr really do share the same "stable" revision naming, 
> although in git it's more indirect, and thus "covers" more.
> 
> In git, the revision name indirectly includes the commit comments too (and 
> git obviously also distinguishes between "committer" and "author", and 
> those end up being indirectly credited in the name of the commit too). But 
> in a very real sense, the bzr stable ("real") revision name does 
> effectively contain the same things as a git ID: it's just that it's a 
> small subset (only committer+date+random number) of what git includes in 
> its names.
There are no requirements on what a revid is in bzr. It's a unique
identifier, nothing more. It can be whatever you like, as long as it's
unique for that specific commit. The committer+date+random\ number is
just what bzr uses at the moment to create those unique identifiers.

> So you could more easily _fake_ a commit name in bzr, and depending on how 
> things are done it might be more open to malicious attacks for that reason 
> (or unintentionally - if two people apply the exact same patch from an 
> email, and take the author/date info from the email like git does, you 
> might have clashes. But with a 64-bit random number, that's probably 
> unlikely, unless you also hit some other bad luck like having the 
> pseudo-random sequence seeded by "time()", and people just _happen_ to 
> apply the email at the exact same second).
Bzr stores a checksum of the commit separately from the revision id in
the metadata of a revision. The revision is not used by itself to check
the integrity of a revision.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://samba.org/~jelmer/

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
@ 2006-10-23 19:02                                                                                 ` Shawn Pearce
  2006-10-23 19:12                                                                                 ` Jakub Narebski
  2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-23 19:02 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, Jakub Narebski, James Henstridge, bazaar-ng,
	Matthew D. Fuller, Andreas Ericsson, Carl Worth, git

Jelmer Vernooij <jelmer@samba.org> wrote:
> On Mon, 2006-10-23 at 11:45 -0700, Linus Torvalds wrote:
> > On Mon, 23 Oct 2006, Jakub Narebski wrote:
> > > The place for timestamp and committer info is in the revision metadata
> > > (in commit object in git). Not in revision id. Unless you think that
> > > "accidentally the same" doesn't happen...
> > Well, git and bzr really do share the same "stable" revision naming, 
> > although in git it's more indirect, and thus "covers" more.
> > 
[snip]
> > So you could more easily _fake_ a commit name in bzr, and depending on how 
> > things are done it might be more open to malicious attacks for that reason 
> > (or unintentionally - if two people apply the exact same patch from an 
> > email, and take the author/date info from the email like git does, you 
> > might have clashes. But with a 64-bit random number, that's probably 
> > unlikely, unless you also hit some other bad luck like having the 
> > pseudo-random sequence seeded by "time()", and people just _happen_ to 
> > apply the email at the exact same second).
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

I think Linus' original point here was that if you communicate the
revision id to another person and they fetch that revision there
is no assurance that the commit they have received is the exact
same commit you had.

In Git that assurance is implicitly present as the unique
identification you communicated to the other person is also that
integrity verification.  Therefore it's nearly impossible to spoof.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
@ 2006-10-23 19:12                                                                                 ` Jakub Narebski
  2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 19:12 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Carl Worth, Andreas Ericsson, git

Jelmer Vernooij wrote:

> There are no requirements on what a revid is in bzr. It's a unique
> identifier, nothing more. It can be whatever you like, as long as it's
> unique for that specific commit. The committer+date+random_number is
> just what bzr uses at the moment to create those unique identifiers.

In an unpacked git repository the commit-id is also the commit's address.
Pack files add another level of indirection via the pack index file. The
commit-id also functions as a checksum.

P.S. I'm interested in what the bzr equivalents of git's different object
types are: commits (revision info), and what is stored there besides the
commit message and "snapshot"; trees/manifests, i.e. how files are
gathered together to form a given revision; blobs, i.e. what the storage
format is and how it is divided: changeset-like as in Arch, file "buckets"
as in Mercurial and CVS, or something different altogether. Is there an
equivalent of git tags and tag objects?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 18:56                                                                               ` Jelmer Vernooij
  2006-10-23 19:02                                                                                 ` Shawn Pearce
  2006-10-23 19:12                                                                                 ` Jakub Narebski
@ 2006-10-23 19:18                                                                                 ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 19:18 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: James Henstridge, bazaar-ng, Matthew D. Fuller, Carl Worth,
	Andreas Ericsson, git, Jakub Narebski



On Mon, 23 Oct 2006, Jelmer Vernooij wrote:
>
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

That wasn't what I was trying to aim at - the problem is that the bzr 
revision ID isn't "safe" in itself. Anybody can create a revision with the 
same names - and they may both have checksums that match their own 
revision, but you have no idea which one is "correct".

So you just have to trust the person that generates the name, to use a 
proper name generation algorithm. You have to _trust_ that your 64-bit 
random number really is random, for example. And that nobody is trying to 
mess with your repo.

This isn't a problem in normal behaviour, but it's a problem in an attack 
scenario: imagine somebody hacking the central server, and replacing the 
repository with something that had all the same commit names, but one of 
the revisions was changed to introduce a nasty backdoor problem. Change 
all the checksums to match too..

It would _look_ fine to somebody who fetches an update, and the maintainer 
might not ever even notice (because he wouldn't send the _old_ revision 
again, and _his_ tree would be fine, so he'd happily continue to send 
out new revisions on top of the bad one on the public site, never even 
realizing that people are fetching something that doesn't match what he is 
pushing).

In contrast, in git, if you replace something in a git repository, the 
name changes, and if I were to try to push an update on top of a broken 
repo like that, it simply wouldn't work - I couldn't fast-forward my own 
branch, because it's no longer a proper subset of what I'm trying to send.

So in git, you can _trust_ the names. They actually self-verify. You can't 
have maliciously made-up names that point to something else than what they 
are. 

[ Also, as a result, and related to this same issue: the git protocol 
  actually never sends object names when sending the object itself. It 
  just sends the object data, and the _recipient_ generates the name from 
  that.

  So you can't do the _other_ kind of spoofing, and make a repository that 
  _claims_ to have one name and the data would differ - because if you do 
  that, anybody who pulls from the spoofed repository will re-create 
  different names than you claimed, and won't even be able to pull such a 
  malicious repository. ]
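
The receiving side of that argument can be sketched as below. It is a simplified model, not the git wire protocol: objects are named by a plain SHA-1 of their bytes (git also hashes a type/size header), and `receive` is a hypothetical helper.

```python
import hashlib

def object_name(data):
    """Name an object by the SHA-1 of its bytes (simplified: no git header)."""
    return hashlib.sha1(data).hexdigest()

def receive(store, wanted, payloads):
    """Store received payloads under names computed locally.

    The sender never gets to say what an object is called. Raises
    ValueError if the name we asked for is not among what was delivered.
    """
    for data in payloads:
        store[object_name(data)] = data
    if wanted not in store:
        raise ValueError(f"sender did not deliver object {wanted}")
    return store[wanted]

good = b"int main(void) { return 0; }\n"
evil = b"int main(void) { backdoor(); return 0; }\n"
name = object_name(good)          # the name the maintainer published

print(receive({}, name, [good]) == good)

try:
    receive({}, name, [evil])     # a hacked server swapped the contents
except ValueError as err:
    print("rejected:", err)
```

Since the recipient derives every name itself, altered content simply shows up under a different name and the requested object is found to be missing.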

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:18                                                                 ` Aaron Bentley
  2006-10-23 17:53                                                                   ` Jakub Narebski
@ 2006-10-23 20:06                                                                   ` Jeff King
  2006-10-23 20:29                                                                     ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Jeff King @ 2006-10-23 20:06 UTC (permalink / raw)
  To: Aaron Bentley
  Cc: James Henstridge, Jakub Narebski, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Andreas Ericsson, Carl Worth, git

On Mon, Oct 23, 2006 at 01:18:30PM -0400, Aaron Bentley wrote:

> And, unlike git, Bazaar branches are all independent entities[1], and
[...]
> [1] The fact that they may share storage is not important to the model.

Sorry, I don't understand this statement. How are git branches not
independent? Sure, they tend to exist in repositories with other
branches, but there's no need to (it simply allows the sharing of object
storage). There's no reason I can't move any branch from any repo into
its own repo, or vice versa move any unrelated branch into a repo with
other branches.

It all Just Works because there _isn't_ any branch information. It's
simply a pointer into the DAG, so if I have the right parts of the DAG
(which git is careful to make sure of), I can just make a pointer, and I
have absolutely zero connection to wherever the DAG came from.
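
A toy model of that statement, assuming a repository is nothing more
than an object table (commit id -> parent ids) plus a ref table (branch
name -> commit id). The names `reachable` and `copy_branch` are
illustrative only and do not correspond to git internals.

```python
def reachable(objects, tip):
    """Return every commit id reachable from `tip` by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(objects[commit])
    return seen


def copy_branch(src, dst, name):
    """Copy one branch: transfer the reachable commits, then create the pointer."""
    tip = src["refs"][name]
    for commit in reachable(src["objects"], tip):
        dst["objects"][commit] = src["objects"][commit]
    dst["refs"][name] = tip


# Toy repositories: commit id -> list of parent ids, branch name -> tip id.
upstream = {
    "objects": {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": []},
    "refs": {"master": "c3", "unrelated": "x1"},
}
mine = {"objects": {}, "refs": {}}
copy_branch(upstream, mine, "master")
```

After the copy, `mine` holds c1, c2 and c3 and a `master` pointer, and
nothing in it records that the commits came from `upstream`.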

> they each have a URL.

In cogito, branches can each have a URL, but git-clone doesn't have a
way (that I know of) to clone only a subset of branches. It would be
fairly trivial to implement, I think.

> So:
> 
> http://code.aaronbentley.com/bzrrepo/bzr.ab 1695
> 
> is a name for
> 
> abentley@panoramicfeedback.com-20060927202832-9795d0528e311e31

The git analog is of course:

http://kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git v2.6.18

as a name for

e478bec0ba0a83a48a0f6982934b6de079e7e6b3

The difference being that Linus assigned the "local" name of v2.6.18
rather than having git auto-assign it.

> And it does not depend on any other branch, especially not bzr.dev

Of course. For me, the above commit is actually

  ssh://peff.net/home/peff/git/linux-2.6 v2.6.18

but once it is in my local repository, it's indistinguishable from one I
pulled directly from kernel.org.

And I wonder if THAT is at the root of this discussion. bzr isn't
"centralized" in the sense that you have to talk to a central server, or
rely on it for doing any operations.  But you actually CARE about where
your commits come from, and git fundamentally doesn't.

-Peff

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 20:06                                                                   ` Jeff King
@ 2006-10-23 20:29                                                                     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 20:29 UTC (permalink / raw)
  To: Jeff King
  Cc: Aaron Bentley, James Henstridge, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Andreas Ericsson, Carl Worth, git

On Mon, 23 Oct 2006, Jeff King wrote:
> On Mon, Oct 23, 2006 at 01:18:30PM -0400, Aaron Bentley wrote:
> 
>> And, unlike git, Bazaar branches are all independent entities[1], and
> [...]
>> [1] The fact that they may share storage is not important to the model.

By the way, git repositories (remember that a working area in bzr is
associated with a branch, and in git with a repository) can share storage
in two ways: either by sharing only the immutable "old history" (part of
the DAG) via the $GIT_DIR/objects/info/alternates file or the
GIT_ALTERNATE_OBJECT_DIRECTORIES environment variable, or by sharing the
whole object database, via symlinking the $GIT_DIR/objects directory or
setting the GIT_OBJECT_DIRECTORY variable.

Git doesn't fully support the latter out of the box (you must be careful
with prune), but on the other hand bzr doesn't support cloning a whole
repository.
  
> It all Just Works because there _isn't_ any branch information. It's
> simply a pointer into the DAG, so if I have the right parts of the DAG
> (which git is careful to make sure of), I can just make a pointer, and I
> have absolutely zero connection to wherever the DAG came from.

Well, with the exception of the reflog, which is local to the repository
(and doesn't get propagated).
 
>> they each have a URL.
> 
> In cogito, branches can each have a URL, but git-clone doesn't have a
> way (that I know of) to clone only a subset of branches. It would be
> fairly trivial to implement, I think.

On the other hand, Cogito doesn't have a way to clone all the branches.

-- 
Jakub Narebski
Poland

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 17:29                                                           ` Linus Torvalds
@ 2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
                                                                                 ` (4 more replies)
  0 siblings, 5 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-23 22:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> I already brought this up once, and I suspect that the bzr people
> simply DID NOT UNDERSTAND the question:
> 
>  - how do you do the git equivalent of "gitk --all"

I for one simply DO NOT UNDERSTAND the question, because I don't know
what that is or what I'd be trying to accomplish by doing it.  The
documentation helpfully tells me that it's something undocumented.


> For example, how long does it take to do an arbitrary "undo" (ie
> forcing a branch to an earlier state) [...]

I don't understand the thrust of this, either.  As I understand the
operation you're talking about, it doesn't have anything to do with a
branch; you'd just be whipping the working tree around to different
versions.  That should be O(diff) on any modern VCS.


> and yes, performance does matter.

I agree, and I currently find a number of places bzr doesn't hit the
level of performance I think it should.  I'm not convinced, however,
that any notable proportion of that has to do with the abstract model
behind it.  And insofar as it has to do with the physical storage
model, that can easily be (and I'm confident will be, considering it's
a focus) ameliorated with later repository formats.


> The whole confusion between "bzr pull" and "bzr merge" is another
> _technical_ sign of why branch-local revision numbers are a mistake. 

I consider it a _technical_ sign of a way of thinking about branches I
prefer   8-}


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
@ 2006-10-23 22:28                                                               ` David Lang
  2006-10-23 22:44                                                               ` Linus Torvalds
                                                                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-23 22:28 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

>
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

on many modern VCS systems it's O(n) on the number of changes (start from where 
you are and apply the patch to change it to rev -1, then apply the patch to 
change it to rev -2, etc)

on git it's O(1) (write the new files into place)
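
a rough sketch of the two storage styles being contrasted, assuming a
toy tree of path -> content. `checkout_by_deltas` and
`checkout_by_snapshot` are hypothetical helpers, not code from any real
VCS; each returns the tree plus the number of storage steps it took.
(O(1) here means independent of how many revisions lie in between;
writing the files still costs time proportional to the tree.)

```python
def checkout_by_deltas(tree, reverse_deltas, steps):
    """Walk back `steps` revisions by applying one reverse delta per revision."""
    tree = dict(tree)
    for delta in reverse_deltas[:steps]:
        for path, content in delta.items():
            if content is None:
                tree.pop(path, None)  # file did not exist in the older revision
            else:
                tree[path] = content
    return tree, min(steps, len(reverse_deltas))


def checkout_by_snapshot(snapshots, rev):
    """Look the revision up directly; the work does not depend on the distance."""
    return dict(snapshots[rev]), 1


# Three revisions of a tiny tree, stored both ways.
snapshots = {
    1: {"a.txt": "v1"},
    2: {"a.txt": "v2"},
    3: {"a.txt": "v3", "b.txt": "new"},
}
# reverse_deltas[0] turns rev 3 into rev 2, reverse_deltas[1] turns rev 2 into rev 1.
reverse_deltas = [{"a.txt": "v2", "b.txt": None}, {"a.txt": "v1"}]

old_tree, delta_steps = checkout_by_deltas(snapshots[3], reverse_deltas, 2)
same_tree, snapshot_steps = checkout_by_snapshot(snapshots, 1)
```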

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
@ 2006-10-23 22:44                                                               ` Linus Torvalds
  2006-10-24  0:26                                                                 ` Matthew D. Fuller
  2006-10-23 22:45                                                               ` Jakub Narebski
                                                                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 22:44 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git



On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > I already brought this up once, and I suspect that the bzr people
> > simply DID NOT UNDERSTAND the question:
> > 
> >  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.

gitk (and all other logging functions) can take as its argument a set of 
arbitrary revision expressions.

That means, for example, that you can give it a list of branches and tags, 
and it will generate the combined log for all of them. "--all" is just 
shorthand for that, but it's really just a special case of the generic 
facility.

This is _invaluable_ when you want to actually look at how the branches 
are related. The whole _point_ of having branches is that they tend to 
have common state.

For example, let's say that you have a branch called "development", and a 
branch called "experimental", and a branch called "mainline". Now, 
_obviously_ all of these are related, but if you want to see how, what 
would you do?

In git, one natural thing would be, for example, to do

	gitk development experimental ^mainline

(where instead of "gitk" you can use any of the history listing 
things - gitk is just the visually more clear one) which will show you 
what exists in the branches "development" and "experimental", but it will 
_subtract_ out anything in "mainline" (which is sensible - you may want to 
see _just_ the stuff that is getting worked on - and the stuff in mainline 
is thus uninteresting).

See? When you visualize multiple branches together, HAVING PER-BRANCH 
REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid and interesting 
operation to do.

An equally interesting thing to ask is: I've got two branches, show me the 
differences between them, but not the stuff in common. Again, very simple. 
In git, you'd literally just write

	gitk a...b

(where "..." is "symmetric difference"). Or, if you want to see what is in 
"a" but _not_ in "b", you'd do

	gitk b..a

(now ".." is regular set difference, and the above is really identical to 
the "a ^b" syntax).
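
As a sketch of what those three expressions select, assuming each
revision argument stands for "this commit and everything reachable from
it": the graph and commit ids below are made up, and `ancestors`,
`rev_set` and `symmetric_difference` are illustrative helpers, not
git's implementation.

```python
def ancestors(parents, tip):
    """Return `tip` plus every commit reachable from it through parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def rev_set(parents, include, exclude=()):
    """Commits reachable from any of `include` but from none of `exclude`."""
    shown = set()
    for tip in include:
        shown |= ancestors(parents, tip)
    for tip in exclude:
        shown -= ancestors(parents, tip)
    return shown


def symmetric_difference(parents, a, b):
    """Commits reachable from exactly one of `a` and `b` (the "a...b" form)."""
    return ancestors(parents, a) ^ ancestors(parents, b)


# Made-up history: mainline is m1-m2; development and experimental both
# build on d1, which itself sits on top of m2.
parents = {"m1": [], "m2": ["m1"], "d1": ["m2"], "d2": ["d1"], "e1": ["d1"]}
heads = {"mainline": "m2", "development": "d2", "experimental": "e1"}

# development experimental ^mainline
in_progress = rev_set(
    parents, [heads["development"], heads["experimental"]], [heads["mainline"]]
)
# experimental..development
only_development = rev_set(parents, [heads["development"]], [heads["experimental"]])
# development...experimental
either_side = symmetric_difference(
    parents, heads["development"], heads["experimental"]
)
```

All three results are plain sets of commit ids, which only works because
d1 has the same name no matter which branch it was reached from.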

And trust me, these are all very valid things to do, even though you're 
talking about different branches.

Try it out. 

> > For example, how long does it take to do an arbitrary "undo" (ie
> > forcing a branch to an earlier state) [...]
> 
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

No. If you "undo", you'd undo the whole history too. And if you undo to a 
point that was on a branch, you'd have to re-write _all_ the revision 
ID's.

> I consider it a _technical_ sign of a way of thinking about branches I
> prefer   8-}

Quite frankly, I just don't think you understand what it means.

			Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
  2006-10-23 22:28                                                               ` David Lang
  2006-10-23 22:44                                                               ` Linus Torvalds
@ 2006-10-23 22:45                                                               ` Jakub Narebski
  2006-10-23 23:14                                                                 ` Erik Bågfors
  2006-10-24  9:51                                                               ` Matthieu Moy
  2006-10-25 10:52                                                               ` Andreas Ericsson
  4 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-23 22:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
>> 
>> I already brought this up once, and I suspect that the bzr people
>> simply DID NOT UNDERSTAND the question:
>> 
>>  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.

gitk(1)
=======

NAME
----
gitk - git repository browser

DESCRIPTION
-----------
Displays changes in a repository or a selected set of commits. This includes
visualizing the commit graph, showing information related to each commit, and
the files in the trees of each revision.

Historically, gitk was the first repository browser. It's written in tcl/tk
and started off in a separate repository but was later merged into the main
git repository.

OPTIONS
-------
To control which revisions to show, the command takes options applicable to
the git-rev-list(1) command. This manual page describes only the most
frequently used options.

[...]
--all::

        Show all branches.


So "gitk --all" shows the whole DAG in a graphical history viewer.

Since bzr has no command (nor plugin) to clone a whole repository,
I guess that the answer is that you can't do this. But perhaps 
I'm mistaken, and you can do this in bzr-gtk/bzrk...

>> For example, how long does it take to do an arbitrary "undo" (ie
>> forcing a branch to an earlier state) [...]
> 
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

For example, if you decide to discard some changes completely by reverting
a branch to some previous revision (in git this action is called 'rewind').

And in git this operation is O(1), not O(diff).

BTW, the following question IIRC remained unanswered: can you easily
create a branch off an arbitrary revision in bzr (for example, deciding that
the stable branch should start two revisions back in history from the
development branch)?

>> and yes, performance does matter.
> 
> I agree, and I currently find a number of places bzr doesn't hit the
> level of performance I think it should.  I'm not convinced, however,
> that any notable proportion of that has to do with the abstract model
> behind it.  And insofar as it has to do with the physical storage
> model, that can easily be (and I'm confident will be, considering it's
> a focus) ameliorated with later repository formats.

Some physical storage models need a specific abstract model. I think
that git's storage model is in this class.

>> The whole confusion between "bzr pull" and "bzr merge" is another
>> _technical_ sign of why branch-local revision numbers are a mistake. 
> 
> I consider it a _technical_ sign of a way of thinking about branches I
> prefer   8-}

Or _perhaps_ just the way of thinking about branches in the way you are
used to.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:45                                                               ` Jakub Narebski
@ 2006-10-23 23:14                                                                 ` Erik Bågfors
  2006-10-23 23:24                                                                   ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-23 23:14 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

This is starting to turn into a "my VCS is better than yours"
discussion rather than anything else.  That's unfortunate....


>
> Which means that "gitk --all" means show whole DAG in graphical history viewer.
>
> As in bzr there is no command (nor plugin) to clone whole repository,

But it wouldn't be hard to create one...

> I guess that the answer is that you can't do this. But perhaps
> I'm mistaken, and you can do this in bzr-gtk/bzrk...

As of now there is no way to do it, simply because nobody has
done it yet. You can of course clone branches into a common repo and do
operations on that. For example, there is a plugin that allows you to
list heads in a repo (and not in branches). So basically, if you lose
a branch, you can still find the head in the repository and recreate
the branch.

I don't see any problem doing a "gitk --all" equivalent in bzr.
Personally, I don't really have a need for it.

> BTW. The following question IIRC remained unanswered: can you easily
> in bzr create branch off arbitrary revision (for example deciding that
> stable branch should start two revisions back in history from development
> branch)?

bzr branch -r-2 development stable
(or "bzr branch -rrevid:foobar" to start at revision id "foobar")

very easy.

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com
sip:17476714687@proxy01.sipphone.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:14                                                                 ` Erik Bågfors
@ 2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
                                                                                       ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-23 23:24 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Jakub Narebski, bazaar-ng, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 310 bytes --]



On Tue, 24 Oct 2006, Erik Bågfors wrote:
> 
> I don't see any problem doing a "gitk --all" equivalent in bzr.

The problem? How do you show a commit that is _common_ to two branches, 
but has different revision names in them?

Do you _finally_ see what is so wrong with this whole per-branch naming?

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:44                                                               ` Linus Torvalds
@ 2006-10-24  0:26                                                                 ` Matthew D. Fuller
  2006-10-24 15:58                                                                   ` David Lang
  0 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Mon, Oct 23, 2006 at 03:44:13PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> gitk (and all other logging functions) can take as its argument a
> set of arbitrary revision expressions.
  [...]
> And trust me, these are all very valid things to do, even though
> you're talking about different branches.

I have zero problem believing that.  It seems from all accounts a
wonderful swiss-army chainsaw, and while none of that power is useful
to me personally in anything I'm VCS'ing at the moment, I'd feel awful
shiny knowing it was sitting there waiting for me.  All else being
equal, I'd think more highly of a VCS with those capabilities than one
without.

bzr-the-program doesn't have a lot of that capability, and what it
does have is rather more verbose to access.  Perhaps some attribute of
bzr-the-current-storage-model would make some bit of that
significantly more expensive than it has to be (I don't know of any,
and can't think offhand of anywhere it might hide, but that's way off
my turf).

But I don't understand how bzr-the-abstract-data-model makes such
things impossible, or even significantly different than doing so in
git.  In git, you're just chopping off one DAG where another one
intersects it (or similar operations).  To do it in bzr, you'd do...
exactly the same thing.  The revnos, or the mainline, are completely
useless in such an operation of course, but they don't hurt it; the
tool would just ignore them like it does the SHA-1 of files in
the revision.


> See? When you visualize multiple branches together, HAVING
> PER-BRANCH REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid
> and interesting operation to do.

I wouldn't be so absolutist about it, but certainly they're of
extremely limited utility if of any at all in such cases.  And yes, it
can be an interesting operation.  But what does that have to do with
using revnos in other cases?  You keep saying "having" where I would
say "using".


> No. If you "undo", you'd undo the whole history too. And if you undo
> to a point that was on a branch, you'd have to re-write _all_ the
> revision ID's.

Well, I guess in this particular case I still don't see why you'd
generally undo big hunks of a branch versus just flipping your working
tree to different versions.  But contrived examples are still
examples, and even if so, truncate()'ing a list of numbers is a
constant time operation.  And even if you had to renumber totally...
my $DEITY, I'd expect my old 200MHz PPro to renumber a hundred
thousand rev long mainline in half a second.


> > I consider it a _technical_ sign of a way of thinking about
> > branches I prefer   8-}
> 
> Quite frankly, I just don't think you understand what it means.

Quite frankly, I just don't think you understand that I WANT to care
about first parents.  No, really.  Seriously.  I really really really
want to.  If my VCS didn't give me numbers along the mainline, I'd
still care about it.  If the revisions were all named SHA-1 hashes, I'd
still care about it.  If I had a metric quidnillion ways to
cross-section and compare branches, I'd still care about it.

This comes with costs.  Chief among them is a restriction of my
actions; I can't fast-forward branches where I care about the
mainline.  That's a cost.  That means I have to take some care about
what operations I perform.  I *GLEEFULLY* pony up that cost.

Because I care about the mainline, revnos can be useful.  I like
revnos.  It has to cost SOMETHING to come up with them (though there
seems to be disagreement about the size of that cost), since doing
'x+y' will always cost more than doing 'x'.  I've never seen a case
where that cost even appeared MEASURABLE, much less significant
(things have to be pretty expensive to compare to the cost of starting
up python and loading a bunch of files into it ;).  So far, I've not
seen the slightest hint of a cost that would make it even worth asking
the question of whether the cost is worth it to me.


I care about that first parent line.  Therefore, I require my tool to
at least _pretend_ to care.  I'm not aware of any way in which the
fundamental bzr structures care, but the UI is chock full of
pretending.  A necessary part of that pretending is not changing my
mainline unless I specifically ask for it, and that means a
merge-vs-pull distinction needs to be there.  That's a _technical_
sign that the tool is ready to work with me the way I want to work.  A
lack of it is a _technical_ sign that it's not suitable.

You, by your own words, don't care about the first parent line.  Your
tool naturally reflects this.  From that perspective, *ANY* cost for
maintaining such a thing is Bad And Wrong, and so you condemn it.
Those condemnations will keep failing to carry any weight with me,
though, as long as I care about that mainline and value the benefits I
find in it.


Maybe I won't always.  2 years ago, I could maybe see some benefits in
DVCS, but I couldn't imagine what possible use they could ever be to
me in anything I do.  Today, I'm using one (if lightly by the
standards of a lot of people in this discussion), and chafing at every
centralized system I have to deal with.  In 5 years, I may be standing
beside you slugging it out at those lunatics and hacks who keep
begging to pay these whopper costs, just to be able to do extra work
to maintain an ordering of parents that doesn't matter for crap.
Could be.  I've changed my mind about far more momentous things in my
life.

Maybe someday I'll still care, but the OTHER advantages of a system
(like git) that doesn't, over all the ones that do, will outweigh the
advantages I gain from that distinction.  Someday I might need such
ultra-expressive ways of comparing branches, and bzr won't have grown
them yet.  Someday I might reach a point where bzr's performance due
to the choice of storage structures or implementation language or
developer habits or whatever else just doesn't cut the mustard, and
git's does.  Someday, some set of other advantages may make it
worthwhile for me to give up my preciouss mainline no matter how much
I might still crave it.

But I can only work from today.  Today, I do care.  Today, it's well
worth whatever I give up to get it.  And I like that my tool makes
that caring easy for me.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
@ 2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
  2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  0:39                                                                     ` Martin Langhoff
                                                                                       ` (2 subsequent siblings)
  3 siblings, 2 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:26 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> The problem? How do you show a commit that is _common_ to two
> branches, but has different revision names in them?

Why would you?


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
@ 2006-10-24  0:38                                                                       ` Matthew D. Fuller
  2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  0:47                                                                       ` Carl Worth
  1 sibling, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-24  0:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors

On Mon, Oct 23, 2006 at 07:26:57PM -0500 I heard the voice of
Matthew D. Fuller, and lo! it spake thus:
> On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> > 
> > The problem? How do you show a commit that is _common_ to two
> > branches, but has different revision names in them?
> 
> Why would you?

I beg your pardon; that was awful ambiguous of me.  I meant "In such a
case, where the whole purpose of what you're doing is to look
at multiple branches to see relationships between them, why WOULD you
be using branch-local identifiers for revisions at all?"


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
@ 2006-10-24  0:39                                                                     ` Martin Langhoff
  2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
  2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 1 reply; 806+ messages in thread
From: Martin Langhoff @ 2006-10-24  0:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, Jakub Narebski, bazaar-ng, git

On 10/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> >
> > I don't see any problem doing a "gitk --all" equivalent in bzr.
>
> The problem? How do you show a commit that is _common_ to two branches,
> but has different revision names in them?

Erik,

coming from an Arch background, I understand the whole per-branch
commitids approach. After using GIT for a while, you start realising
that it tries to pin down things in the wrong place.

This is especially visible if you run `gitk --all` before and after a
merge. Or on a project with many merges (if you can, get a checkout of
git itself, and browse its history with gitk).

Before the merge, you see

 --o--o--o--o
    \
     \--o--o

and after

 --o--o--o--o
    \        \
     \--o--o--o

Now, after it's merged somewhere, both commits are part of its
history, regardless of where they come from. And it is very clear if
two branches have been merging and remerging.

Where a commit originated does not matter. And fancy
repo-and-branch-centric names get in the way. A lot. And they're
mostly meaningless as soon as you put what matters in the commit
message. Which means that that bit of metadata that you are hoping
that the revno keeps "indirectly" isn't lost on cherry picking.

I guess that's where I used to find revnos useful as they contained
some basic metadata. With bzr it seems to be author-repo-branch where
branch is hopefully "line of work" but all of that can be (and should
be) in the commit message.

You can see similar info in the first part of the commit message for
most git-hosted projects. It'll say something like

   cvsserver: fix the frobnicator to be sequential

which means that at that point, you could be working in a branch
called "fix-this-fscking-thing-attempt524" and no-one would know ;-)

And in a few years (even months) time, that bit of metadata you were
hoping to keep is totally irrelevant. What you have in the commit
message remains relevant and useful.

cheers,


martin

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
@ 2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  7:31                                                                         ` Erik Bågfors
  2006-10-24 21:51                                                                         ` Erik Bågfors
  1 sibling, 2 replies; 806+ messages in thread
From: Carl Worth @ 2006-10-24  0:47 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Erik Bågfors

[-- Attachment #1: Type: text/plain, Size: 930 bytes --]

On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
>
> On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
> >
> > The problem? How do you show a commit that is _common_ to two
> > branches, but has different revision names in them?
>
> Why would you?

Assume you've got two long-lived branches and one periodically gets
merged into the other one. The combined history might look as follows
(more recent commits first):

 f   g
 |   |
 d   e
 |\ /
 b c
 |/
 a

The point is that it is extremely nice to be able to visualize things
that way. Say I've got a "dev" branch that points at f and a "stable"
branch that points at g. With this, a command like:

	gitk dev stable

would result in a picture just like the above. Can a similar figure be
made with bzr? Or only the following two separate pictures:

 f    g
 |    |
 d    e
 |\   |
 b c  c
 |/   |
 a    a
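
The single combined picture depends on both branches naming their shared commits identically. A minimal sketch of that idea, assuming the example graph is stored as a commit-to-parents mapping (the `PARENTS` table and the `ancestors` helper are illustrative names, not part of git or bzr):

```python
# Hypothetical encoding of the example graph: commit -> list of parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # merge of c into the dev line
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}


def ancestors(tip, parents=PARENTS):
    """Return the set of commits reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


dev = ancestors("f")      # branch "dev" points at f
stable = ancestors("g")   # branch "stable" points at g

combined = dev | stable   # every commit a combined view has to draw
shared = dev & stable     # commits that should be drawn only once

print(sorted(combined))
print(sorted(shared))
```

Because `a` and `c` carry the same name in both sets, the union collapses them into one node each; with branch-local names a viewer would first have to map them back to a common identifier.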

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]


* Re: VCS comparison table
  2006-10-23 12:54                                                             ` Jakub Narebski
  2006-10-23 15:01                                                               ` James Henstridge
@ 2006-10-24  3:24                                                               ` David Clymer
  1 sibling, 0 replies; 806+ messages in thread
From: David Clymer @ 2006-10-24  3:24 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Matthew D. Fuller, Andreas Ericsson, Linus Torvalds, Carl Worth,
	bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 3190 bytes --]

On Mon, 2006-10-23 at 14:54 +0200, Jakub Narebski wrote:
> On Mon, Oct 23, 2006 David Clymer wrote:
> > On Sun, 2006-10-22 at 22:06 +0200, Jakub Narebski wrote:
> >> David Clymer wrote:
> 
> >>> 2. bzr does not support fully distributed development because revnos
> >>> "don't work" as stated in #1.
> >>
> >> Bazaar is biased towards centralized/star-topology development if we
> >> want to use revnos. In fully distributed configuration there is no
> >> "simple namespace".
> > 
> > So revnos aren't globally meaningful in fully distributed settings. So
> > what? I don't see how this translates into bias. There is a lot of
> > functionality provided by bazaar that doesn't really apply to my use
> > case, but it doesn't mean that it is indicative of some bias in bazaar.
> 
> First, bzr is biased towards using revnos: bzr commands use revnos
> by default to specify a revision (you have to use the revid: prefix/operator
> to use revision identifiers), bzr commands output revids only when
> requested, and the usage examples use revision numbers.

Agreed. Of course, I want the simplest case to be the simplest. When
working on my own branch, regardless if it is a standalone project or
part of a distributed one, I don't want to have to type SHA hashes or
revids. Numbers serve my purposes best in this case. When I communicate
with other distributed developers, I can and should use revids.

> 
> In order to use revnos as _global_ identifiers in distributed development,
> you need central "branch", mainline, to provide those revnos. You have
> either to have access to this "revno server" and refer to revisions by
> "revno server" URL and revision number, or designate one branch as holding
> revision numbers ("revno server") and preserve revnos on "revno server"
> by using bzr "merge", while copying revnos when fetching by using bzr "pull"
> for leaf branches. In short: for revnos to be global identifiers you need
> star-topology.

Ok. Let's not repeat this again. I think I said this once, and you've
said it in two following emails. It's a given. Assume that we all know
it.

> 
> Even if you use revnos only locally, you need to know which revisions are
> "yours", i.e. beside branch as DAG of history of given revision you need
> "ordered series of revisions" (to quote Bazaar-NG wiki Glossary), or path
> through this diagram from given revision to one of the roots (initial,
> parentless revisions). Because bzr does that by preserving mentioned path
> as first-parent path (treating first parent specially), i.e. storing local
> information in a DAG (which is shared), to preserve revnos you need to
> use "merge" instead of "pull", which means that you get empty-merge in
> clearly fast-forward case. This means "local changes bias", which some
> might take as not being fully distributed.

"local changes bias" I can buy that. I even like it. I don't even care
if that makes bazaar "not fully distributed." I don't think the
distinction between "fully" and "almost, except for some technicality"
distributed is one that has much practical value.

-davidc
-- 
gpg-key: http://www.zettazebra.com/files/key.gpg

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]


* Re: VCS comparison table
  2006-10-24  0:38                                                                       ` Matthew D. Fuller
@ 2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  5:47                                                                           ` Shawn Pearce
  2006-10-24 16:46                                                                           ` Matthew D. Fuller
  0 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-24  5:42 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors



On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> On Mon, Oct 23, 2006 at 07:26:57PM -0500 I heard the voice of
> Matthew D. Fuller, and lo! it spake thus:
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > > 
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> > 
> > Why would you?
> 
> I beg your pardon; that was awful ambiguous of me.  I meant "In such a
> case, where the whole purpose of what you're doing is to look
> at multiple branches to see relationships between them, why WOULD you
> be using branch-local identifiers for revisions at all?"

Well, I would use the globally unique ones, certainly. It's the only thing 
that makes sense.

However, I'd also argue that once you start doing that, _mixing_ the 
globally unique and stable ones and the "simple" ones is a mistake: you'd 
be better off having told your users to use the global ones from the very 
beginning, and trying to make _those_ as simple to use as possible.

Because once you start using both, you're just going to confuse your users 
horribly, and they'll consider the globally unique one really irritating, 
because they're used to using something totally different in most other 
contexts.

Using the _same_ names everywhere is just better. 

		Linus


* Re: VCS comparison table
  2006-10-24  5:42                                                                         ` Linus Torvalds
@ 2006-10-24  5:47                                                                           ` Shawn Pearce
  2006-10-24 16:46                                                                           ` Matthew D. Fuller
  1 sibling, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-24  5:47 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Erik Bågfors, bazaar-ng, git, Matthew D. Fuller, Jakub Narebski

Linus Torvalds <torvalds@osdl.org> wrote:
> Using the _same_ names everywhere is just better. 

I find that it is simpler too.  :-)


* Re: VCS comparison table
  2006-10-20  9:43                     ` Matthieu Moy
@ 2006-10-24  6:02                       ` Lachlan Patrick
  2006-10-24  6:23                         ` Shawn Pearce
  2006-10-24  6:31                         ` Linus Torvalds
  0 siblings, 2 replies; 806+ messages in thread
From: Lachlan Patrick @ 2006-10-24  6:02 UTC (permalink / raw)
  To: bazaar-ng, git

Matthieu Moy wrote:
> Sean <seanlkml@sympatico.ca> writes:
>> We don't need plugins to extend features, we just add the feature to
>> the source.  The example I asked about earlier is a case in point. 
>> Apparently in bzr "bisect" was implemented as a plugin, yet in Git it
>> was implemented as a command without any issue at all,
> 
> I'd compare bzr's plugins to Firefox extensions.

So, bzr's plug-in architecture provides a 'protocol' for communicating
with bzr? Or is it functionally the same as a Python module which is
loaded after being named on the bzr command-line (or placed in a special
folder) then executed along with all the other plug-ins? I'm trying to
understand if writing a plug-in is any simpler than understanding the
bzr source code.

Can I ask the git folks what Sean meant in the above about a 'command'.
Are you talking about shell scripts? Is 'git' the only program you need?

AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
command line options to bzr. Is that true of git? To what extent is git
tied to a [programmable] shell? I've heard someone say there's no
Windows version of git for some reason, can someone elaborate?

Ta,
Loki


* Re: VCS comparison table
  2006-10-24  6:02                       ` Lachlan Patrick
@ 2006-10-24  6:23                         ` Shawn Pearce
  2006-10-24  6:31                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-24  6:23 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git

Lachlan Patrick <loki@research.canon.com.au> wrote:
> Can I ask the git folks what Sean meant in the above about a 'command'.
> Are you talking about shell scripts? Is 'git' the only program you need?

'git' is actually two things:

  1) It's a wrapper command which executes 'git-foo' if you call it
     with 'foo' as its first parameter.  It searches for 'git-foo'
     in the GIT_EXEC_PATH environment variable, which has a default
     set at compile time, usually to the directory you are going to
     install Git into.

  2) It's most of the core Git plumbing.  There are currently around 48
     'builtin' commands.  These are things which 'git' knows how to do
     without executing another program.  If you look at the installation
     these 48 builtin commands are just hardlinks back to 'git'.  For
     example 'git-update-index' is really just a hardlink back to 'git'
     and 'git' knows to perform the update index logic when it's called
     as either 'git-update-index' or as 'git update-index'.

We're moving more towards #2, but there are still a large number
of commands which fall into #1.
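
A rough sketch of the two roles described above, written in Python purely for illustration (git itself does this in C). The `BUILTINS` set and the `dispatch` function are hypothetical names, and the sketch only reports what would run instead of running it:

```python
import os

# Hypothetical subset of builtin commands; the real list is much longer.
BUILTINS = {"update-index", "log", "rev-list"}


def dispatch(argv):
    """Classify an invocation as a builtin call or an external exec.

    Returns a tuple (kind, name, args) describing what would run.
    """
    program = os.path.basename(argv[0])
    if program.startswith("git-"):
        # Invoked through a hardlink such as 'git-update-index'.
        command, args = program[len("git-"):], argv[1:]
    else:
        # Invoked as 'git <command> ...'.
        command, args = argv[1], argv[2:]
    if command in BUILTINS:
        return ("builtin", command, args)
    # Not builtin: the wrapper would exec 'git-<command>' instead.
    return ("exec", "git-" + command, args)


print(dispatch(["/usr/bin/git-update-index", "--refresh"]))
print(dispatch(["git", "update-index", "--refresh"]))
print(dispatch(["git", "fetch", "origin"]))
```

Both spellings of a builtin end up in the same code path, while anything not in the table falls through to an external program.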
 
> AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
> command line options to bzr. Is that true of git?

No.  In Git at least half of the things Git can do are not builtin to
'git' and thus require exec()'ing an external program (e.g. git-fetch).
However these often appear as though they are command line options to
'git' as 'git fetch' just means exec 'git-fetch' (by #1 above).

On the other hand there are a wide range of tools which are more or
less the same thing, just with different options applied to them.
All of the diff programs, log, whatchanged, show - these are all
just variations on a theme.  Their individual implementations are
very tiny as they all use the same library code.

> To what extent is git
> tied to a [programmable] shell?

Git is still very much tied to a shell.  For example 'git commit'
is really the shell script 'git-commit'.  This is a rather long
shell script and it does a lot of things for the user; not having
it would make Git useless for most people.  It also has not been
rewritten in C.  There is a roadmap however to convert it to C to
help remove the programmable shell requirement and people have been
slowly performing the (rather tedious) conversion work.

> I've heard someone say there's no
> Windows version of git for some reason, can someone elaborate?

Git runs on Cygwin.  But there's no native Win32 (without Cygwin)
version of Git because:

 - Git uses POSIX APIs and expects POSIX behavior from the OS it's
   running on.  Without a compatibility layer to make Windows act
   like UNIX Git won't run.  Cygwin happens to be a really good
   compatibility layer.

 - Git requires a Bourne shell for many of its important tools,
   such as 'git commit'.  Windows lacks such a program, at least
   out of the box, but it's in Cygwin.

 - Git relies on a helper program called 'merge' to perform three
   way file merges.  This tool may or may not be ported to native
   Win32 (I don't know) but it is in Cygwin.

 - Git requires some libraries for certain features, such as libexpat
   or libcurl.  I don't know if these are available for native Win32
   but they are available on Cygwin.

 - Windows isn't the primary target platform for many of the Git
   contributors.  Some consider the fact that it even runs there
   at all a minor miracle, and that's only possible due to the hard
   work the Cygwin folks have done.

 - ... I'm sure there's other reasons ...

-- 
Shawn.


* Re: VCS comparison table
  2006-10-24  6:02                       ` Lachlan Patrick
  2006-10-24  6:23                         ` Shawn Pearce
@ 2006-10-24  6:31                         ` Linus Torvalds
  2006-10-24  6:45                           ` David Rientjes
  1 sibling, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-24  6:31 UTC (permalink / raw)
  To: Lachlan Patrick; +Cc: bazaar-ng, git



On Tue, 24 Oct 2006, Lachlan Patrick wrote:
> 
> Can I ask the git folks what Sean meant in the above about a 'command'.
> Are you talking about shell scripts? Is 'git' the only program you need?

Historically, "git" was _only_ a wrapper program. When you did

	git log

it just executed the real program called "git-log", which was often a 
shell-script. That was just so that things could easily be extended, and 
you could use shell-script for simple one-liner things, and native C for 
more "core" stuff.

For example, "git log" used to be a one-line shell-script that just did

	git-rev-list --pretty HEAD | LESS=-S ${PAGER:-less}

but it ended up being a lot more capable, and was eventually just rewritten 
as an internal command.

These days, most of the simple things like "git log" are all built into 
the "git" program, although for anything not built in, it still acts as 
just a wrapper, which allows not only random functionality to still be 
written in shell (or sometimes perl), but also ends up being the simplest 
possible plug-in mechanism: you can define your own commands by just 
writing a shell-script thing, calling it "git-mycommand", installing it in 
the proper place, and it ends up being accessible as "git mycommand".

That allows for easy prototyping in your language of choice.
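
As a hedged illustration of that lookup, the sketch below installs a script named "git-mycommand" into a temporary directory and then resolves it by name, roughly the way a wrapper could. The helper names `install_script` and `find_external` are made up for this example, and the temporary directory merely stands in for "the proper place"; nothing here invokes git itself.

```python
import os
import shutil
import stat
import tempfile


def install_script(directory, command, body):
    """Write an executable script named 'git-<command>' into directory."""
    path = os.path.join(directory, "git-" + command)
    with open(path, "w") as handle:
        handle.write(body)
    # Mark the script executable for its owner.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def find_external(command, exec_path):
    """Return the path of 'git-<command>' found in exec_path, or None."""
    return shutil.which("git-" + command, path=exec_path)


with tempfile.TemporaryDirectory() as exec_dir:
    install_script(exec_dir, "mycommand", "#!/bin/sh\necho hello\n")
    found = find_external("mycommand", exec_dir)
    missing = find_external("nosuchcommand", exec_dir)
    found_name = os.path.basename(found) if found else None

print(found_name)
print(missing)
```

The point of the design is that adding a command needs no registration step: a file with the right name in the right directory is enough.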

> AFAIK, 'bzr' is the sole program in Bazaar, and everything is done with
> command line options to bzr. Is that true of git? To what extent is git
> tied to a [programmable] shell? I've heard someone say there's no
> Windows version of git for some reason, can someone elaborate?

Almost all of "core" git is pure C, which unlike something like python or 
perl obviously tends to have a fair amount of system issues. That said, 
much of it really is fairly portable, so doing the built-in git stuff 
should _largely_ work even natively under Windows with some effort.

The problem ends up being that few enough people seem to develop under 
Windows, and the cygwin port works better (because it handles a number of 
the portability issues and also handles the scripts that are still shell). 
Those two issues seem to mean that not a lot of effort has been put into 
aiming for a native windows binary (or into moving away from shell 
scripts).

Most of the shell scripts really are fairly simple. So if somebody 
_really_ wanted to, it would probably not be hard to spend some effort to 
either just write them as C and turn them into built-ins, or port them 
to some other scripting language.

Of course, most Windows users don't seem to really want a command line 
interface at all. IDE integration would appear to be more interesting to 
some people.

		Linus


* Re: VCS comparison table
  2006-10-24  6:31                         ` Linus Torvalds
@ 2006-10-24  6:45                           ` David Rientjes
       [not found]                             ` <Pine.LNX.4.64.0610240812410.3962@g5.osdl.org>
                                               ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: David Rientjes @ 2006-10-24  6:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Lachlan Patrick, bazaar-ng, git

On Mon, 23 Oct 2006, Linus Torvalds wrote:

> Historically, "git" was _only_ a wrapper program. When you did
> 
> 	git log
> 
> it just executed the real program called "git-log", which was often a 
> shell-script. That was just so that things could easily be extended, and 
> you could use shell-script for simple one-liner things, and native C for 
> more "core" stuff.
> 
> For example, "git log" used to be a one-line shell-script that just did
> 
> 	git-rev-list --pretty HEAD | LESS=-S ${PAGER:-less}
> 
> but it ended up being a lot more capable, and eventually just rewritten 
> as an internal command..
> 

Some of the internal commands that have been coded in C are actually much 
better handled by the shell in the first place.  It's much simpler to 
write and extend as well as being much more traceable for runtime 
problems.  The shell commands that would be used for most of these git
routines have options for requesting it to be more verbose so the user 
actually has a lot more power over reporting and/or logging.  In addition 
it tends to be more portable and the amount of code is drastically reduced 
in a script style of programming.  The criticisms against such use of 
shell scripting tend to be a matter of personal taste.  People believe, 
for some reason or another, that it is a lower-class type of programming 
that is less robust and is harder to understand.  Seldom have there been 
cogent arguments for coding such features in C as opposed to shell 
scripting, especially in the case of git where the shell becomes a very 
powerful ally.

		David


* Re: VCS comparison table
  2006-10-24  0:47                                                                       ` Carl Worth
@ 2006-10-24  7:31                                                                         ` Erik Bågfors
  2006-10-24 21:51                                                                         ` Erik Bågfors
  1 sibling, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-24  7:31 UTC (permalink / raw)
  To: Carl Worth
  Cc: Linus Torvalds, bazaar-ng, git, Matthew D. Fuller, Jakub Narebski

On 10/24/06, Carl Worth <cworth@cworth.org> wrote:
> On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
> >
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > >
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> >
> > Why would you?
>
> Assume you've got two long-lived branches and one periodically gets
> merged into the other one. The combined history might look as follows
> (more recent commits first):
>
>  f   g
>  |   |
>  d   e
>  |\ /
>  b c
>  |/
>  a
>
> The point is that it is extremely nice to be able to visualize things
> that way. Say I've got a "dev" branch that points at f and a "stable"
> branch that points at g. With this, a command like:
>
>         gitk dev stable
>
> would result in a picture just like the above. Can a similar figure be
> made with bzr? Or only the following two separate pictures:

The above picture can easily be created with bzr if you have a
utility/plugin that does it. None exists yet, but there would be
no problem writing one.

Of course, in such a context revision numbers have no use.  But see,
revision numbers are not mandatory in bzr, so that's not a problem.

I haven't really had a need for such a tool, but I do see where it can
be very useful to have.

>  f    g
>  |    |
>  d    e
>  |\   |
>  b c  c
>  |/   |
>  a    a
>

This is what you would get if you visualize the two separate branches,
and not the common repository.

/Erik


* Re: VCS comparison table
  2006-10-24  0:39                                                                     ` Martin Langhoff
@ 2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  8:37                                                                         ` Jakub Narebski
  2006-10-24 10:11                                                                         ` Martin Langhoff
  0 siblings, 2 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-24  7:52 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On 10/24/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> On 10/24/06, Linus Torvalds <torvalds@osdl.org> wrote:
> > On Tue, 24 Oct 2006, Erik Bågfors wrote:
> > >
> > > I don't see any problem doing a "gitk --all" equivalent in bzr.
> >
> > The problem? How do you show a commit that is _common_ to two branches,
> > but has different revision names in them?
>
> Eric,

It's Erik :)

> coming from an Arch background, I understand the whole per-branch
> commitids approach. After using GIT for a while, you start realising
> that it tries to pin down things in the wrong place.
>
> This is specially visible if you run `gitk --all` before and after a
> merge. Or on a project with many merges (if you can, get a checkout of
> git itself, and browse its history with gitk).
>
> Before the merge, you see
>
>  --o--o--o--o
>     \
>      \--o--o
>
> and after
>
>  --o--o--o--o
>     \        \
>      \--o--o--o
>
> Now, after it's merged somewhere, both commits are part of its
> history, regardless of where they come from. And it is very clear if
> two branches have been merging and remerging.
>
> Where a commit originated does not matter. And fancy
> repo-and-branch-centric names get in the way. A lot. And they're
> mostly meaningless as soon as you put what matters in the commit
> message. Which means that that bit of metadata that you are hoping
> that the revno keeps "indirectly" isn't lost on cherry picking.

Let's make one thing clear.  Revnos are NOT stored with the revision,
they are not "names" of the revision.  They are basically just
shortcuts to specific revisions, that only make sense in the context
of a branch.

As human beings this is something we are very used to in everyday
life. I don't always call my friends with firstname and surname, I
just use first name or even "mate".  As long as it's clear who I'm
talking about in that context.  If there are multiple people with the
same first name, then we might have to use the surname as well.

Same with bzr. In the context of a branch, revnos work as shortcuts
to the revision id.  In the context of multiple branches, they don't.

I think they do serve a good purpose but I don't really think that we
absolutely need them either.

> I guess that's where I used to find revnos useful as they contained
> some basic metadata. With bzr it seems to be author-repo-branch where
> branch is hopefully "line of work" but all of that can be (and should
> be) in the commit message.
>
> You can see similar info in the first part of the commit message for
> most git-hosted projects. It'll say something like
>
>    cvsserver: fix the frobnicator to be sequential
>
> which means that at that point, you could be working in a branch
> called "fix-this-fscking-thing-attempt524" and no-one would know ;-)
>
> And in a few years (even months) time, that bit of metadata you were
> hoping to keep is totally irrelevant. What you have in the commit
> message remains relevant and useful.

I'm not even going to try to understand the arguments here, as they are
about a totally different thing and don't make any sense to me.

I think this discussion is getting out of hand.

There are a few things that are being discussed
1. Revnos are bad/good
2. treating "leftmost" parent special is bad/good
3. plugins are useless/useful
4. And now, storing branch information should be done manually (if
wanted) and not automatically.

1. I don't really care, I haven't seen any confusion based on it, but
I don't have a very strong opinion about it either.
2. This is something I do care about.  For me, this is the only
logical way of doing it. It might be because I am used to it now, but
when I started to look at bzr/hg/git/darcs/etc, I just got a so much
more clear view of the history when running a standard log command,
that it was one of the first things that attracted me to bzr. This is
just a user talking.
There might be technical reasons why it's better to not do it, but for
me it works the way I expect, therefore I'm happy
3. This is just silly
4. No comment.

/Erik


* Re: VCS comparison table
  2006-10-24  7:52                                                                       ` Erik Bågfors
@ 2006-10-24  8:37                                                                         ` Jakub Narebski
  2006-10-24 10:11                                                                         ` Martin Langhoff
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-24  8:37 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Martin Langhoff, Linus Torvalds, bazaar-ng, git

Erik Bågfors wrote:
> I think this discussion is getting out of hand.
> 
> There are a few things that are being discussed
> 1. Revnos are bad/good
> 2. treating "leftmost" parent special is bad/good
> 3. plugins are useless/useful
> 4. And now, storing branch information should be done manually (if
> wanted) and not automatically.
> 
> 1. I don't really care, I haven't seen any confusion based on it, but
> I don't have a very strong opinion about it either.

To use revnos[*1*] you have to have branch as path through DAG. Bzr does
that by treating first parent special, which leads to empty merges
in fast-forward case.

Using revnos as implemented in bzr leads to some (perhaps unforeseen)
consequences.

[*1*] Meaning that revnos won't change on you.
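
To make the "branch as path through the DAG" idea concrete, here is a minimal sketch that numbers revisions along the first-parent path only. The graph, the `mainline` helper and the `revnos` helper are illustrative assumptions, not bzr code:

```python
# Hypothetical history: commit -> parents, first parent listed first.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],  # first parent b, merged-in parent c
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}


def mainline(tip):
    """Follow first parents from tip to the root; oldest commit first."""
    path = []
    commit = tip
    while commit is not None:
        path.append(commit)
        parents = PARENTS[commit]
        commit = parents[0] if parents else None
    path.reverse()
    return path


def revnos(tip):
    """Number the commits on the first-parent path, starting at 1."""
    return {commit: n for n, commit in enumerate(mainline(tip), start=1)}


dev = revnos("f")     # branch whose tip is f
stable = revnos("g")  # branch whose tip is g

print(dev)
print(stable)
```

In this sketch commit `c` is number 2 in the branch ending at `g` but has no number at all in the branch ending at `f`, which is why such numbers only make sense relative to one branch.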

> 2. This is something I do care about.  For me, this is the only
> logical way of doing it. It might be because I am used to it now, but
> when I started to look at bzr/hg/git/darcs/etc, I just got a so much
> more clear view of the history when running a standard log command,
> that it was one of the first things that attracted me to bzr. This is
> just a user talking.

Git has reflog for when you are interested in branch tip history
(which also stores "reason" for branch tip change: pull, amending
a commit, rebase,...). Unfortunately, Git doesn't have a git-ref-log
command (or a --ref option to git-log) to display the reflog in a
user-friendly format.

Git users tend to rely on graphical history viewers (mainly gitk and 
qgit, but there is also gitview, tig and git-browser) to get a 
clear view of history, a view that log cannot provide.

That said I _think_ that caring about "branch identity" is just 
something you are used to, perhaps because bzr doesn't have wonderful 
git log limiting specifiers aka. builtin git log searching (a..b, 
a...b, --max-count, -- <path>, --committer, --grep etc.).
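
For readers unfamiliar with the two range forms, their meaning can be sketched with plain set operations: a..b selects commits reachable from b but not from a, and a...b selects commits reachable from exactly one of the two. The graph and helper names below are assumptions made up for the example, not git code:

```python
# Hypothetical history: commit -> list of parents.
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}


def ancestors(tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def two_dot(a, b):
    """Commits selected by 'a..b': reachable from b, not from a."""
    return ancestors(b) - ancestors(a)


def three_dot(a, b):
    """Commits selected by 'a...b': reachable from exactly one side."""
    return ancestors(a) ^ ancestors(b)


print(sorted(two_dot("g", "f")))
print(sorted(three_dot("g", "f")))
```

With `f` and `g` as the two branch tips, `two_dot("g", "f")` answers "what is on the f side that the g side lacks", which covers much of what a per-branch log is usually asked for.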

> There might be technical reasons why it's better to not do it, but for
> me it works the way I expect, therefore I'm happy

I think it would be better to maintain "branch identity" separately and 
not in DAG, but that might have other problems I have not seen.

> 3. This is just silly

I think the discussion/arguments were twofold. 

First, Bazaar-NG has plugin infrastructure "for free" because it is 
written in Python, which allows modules loading and monkey-patching. 
Git core is written in C, and git is not yet fully libified.

Second, everything that can be done with plugins, except for core 
changes, can be done in Git by writing scripts (which also allows for 
fast prototyping). Everything except core changes can also be done by 
writing a few lines of C, but you have to compile against some version 
of Git, and you don't get the advantages of a builtin command; git is 
an OSS project.

> 4. No comment.

Storing branch information could be done automatically on demand ;-)
-- 
Jakub Narebski
Poland


* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
  2006-10-24  0:26                                                                     ` Matthew D. Fuller
  2006-10-24  0:39                                                                     ` Martin Langhoff
@ 2006-10-24  9:30                                                                     ` Jelmer Vernooij
  2006-10-26 15:22                                                                       ` Aaron Bentley
  2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 1 reply; 806+ messages in thread
From: Jelmer Vernooij @ 2006-10-24  9:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

[-- Attachment #1: Type: text/plain, Size: 978 bytes --]

On Mon, Oct 23, 2006 at 04:24:30PM -0700, Linus Torvalds wrote:
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> > I don't see any problem doing a "gitk --all" equivalent in bzr.
> The problem? How do you show a commit that is _common_ to two branches, 
> but has different revision names in them?
It'll have the same revision name. The revision no's will be
different, sure, but that's not a problem.

> Do you _finally_ see what is so wrong with this whole per-branch naming?
revnos are the only naming bit that is branch-specific.

I guess one way of looking at revnos is to regard them completely as a 
command-line ui thing.  They're not explicitly stored anywhere on
disk but just an easy way for users to refer to revisions on a
per-branch basis. 

The graphical frontends to bzr, for example, don't know about revno's but 
only about revids.

Cheers,

Jelmer

-- 
Jelmer Vernooij <jelmer@samba.org> - http://jelmer.vernstok.nl/
Currently playing: 

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 191 bytes --]


* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
                                                                                 ` (2 preceding siblings ...)
  2006-10-23 22:45                                                               ` Jakub Narebski
@ 2006-10-24  9:51                                                               ` Matthieu Moy
  2006-10-24 10:27                                                                 ` Jakub Narebski
  2006-10-25 10:52                                                               ` Andreas Ericsson
  4 siblings, 1 reply; 806+ messages in thread
From: Matthieu Moy @ 2006-10-24  9:51 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

"Matthew D. Fuller" <fullermd@over-yonder.net> writes:

>> For example, how long does it take to do an arbitrary "undo" (ie
>> forcing a branch to an earlier state) [...]
>
> I don't understand the thrust of this, either.  As I understand the
> operation you're talking about, it doesn't have anything to do with a
> branch; you'd just be whipping the working tree around to different
> versions.  That should be O(diff) on any modern VCS.

There are two things to do:

* Mark the tree as corresponding to a different revision in the past.
  This is roughly "echo 'revision@id-123' > .bzr/checkout/last-revision"
  in bzr. Obviously, writing the file is O(1), but to compute the
  revision identifier when you say "bzr switch -r 42" (I'm not sure
  switch accepts this BTW), you have to load the revision history.
  Indeed, bzr would load it anyway to make sure that the revision you
  switch to is in the revision history.

  In bzr, you have .bzr/branch/revision-history for each branch, which
  is a newline-separated list of revision-identifiers. In the case of
  bzr.dev, for example, this file is 112KB as of now. This is
  O(history), with "history" being the length of the path from HEAD to
  the initial commit, following the leftmost ancestor (i.e. number of
  revisions in a centralized workflow, and less than this otherwise).
  That said, the constant factor is very small. For example, on
  bzr.dev, I did "grep -n some-rev-id" (which does revid-to-revno), it
  takes 0.004 seconds (Vs 0.003 seconds to grep in /dev/null
  instead ;-) ), so you'd need many orders of magnitude before this
  becomes a limitation.

  Linus's point AIUI is that this will _never_ be a limitation of git.

* Then, do the "merge" to make your tree up to date. You can hardly do
  faster than git and its unpacked format, but this is at the cost of
  disk space. But as you say, in almost any modern VCS, that's
  O(diff). In a space-efficient format, that's just the tradeoff you
  make between full copies of a file and delta-compression.
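
The revno lookup described in the first point can be sketched roughly as
follows. This is only an illustration, assuming the revision history is
available as the newline-separated text described above; the function names
and the sample revision identifiers are made up for the example and are not
bzr's actual API.

```python
def load_history(text):
    """Split a newline-separated revision history into a list of revids."""
    return [line for line in text.splitlines() if line]


def revno_to_revid(history, revno):
    """Revnos are 1-based positions in this branch's history list."""
    if not 1 <= revno <= len(history):
        raise ValueError("revno %d is not in this branch" % revno)
    return history[revno - 1]


def revid_to_revno(history, revid):
    """Linear scan, hence O(history); raises ValueError if revid is absent."""
    return history.index(revid) + 1


# Hypothetical revision identifiers, oldest first.
sample = "rev-a\nrev-b\nrev-c\n"
history = load_history(sample)
```

Both directions need the whole list in memory, which is where the
O(history) cost comes from, even though the constant factor is tiny.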

-- 
Matthieu

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  7:52                                                                       ` Erik Bågfors
  2006-10-24  8:37                                                                         ` Jakub Narebski
@ 2006-10-24 10:11                                                                         ` Martin Langhoff
  1 sibling, 0 replies; 806+ messages in thread
From: Martin Langhoff @ 2006-10-24 10:11 UTC (permalink / raw)
  To: Erik Bågfors; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

On 10/24/06, Erik Bågfors <zindar@gmail.com> wrote:
> It's Erik :)

Sorry Erik!

> Let's make one thing clear.  Revnos are NOT stored with the revision,
> they are not "names" of the revision.  They are basically just
> shortcuts to specific revisions, that only make sense in the context
> of a branch.

My bad. The revnos examples discussed looked quite Arch-like. As Arch
took them seriously, I thought bzr did too.

Probably quite a few people here thought as much, and got hot under
the t-shirt about it ;-)

Now, the thing about the shorthand is that we have quite a few means
of using shorthand in GIT that don't rely on revnos. We have the whole
^branchname stuff. And when you are looking at gitk it's pretty
obvious which are your recent "local" commits.


...


> 2. treating "leftmost" parent special is bad/good

> 2. This is something I do care about.  For me, this is the only
> logical way of doing it. It might be because I am used to it now, but
> when I started to look at bzr/hg/git/darcs/etc, I just got a so much
> more clear view of the history when running a standard log command,
> that it was one of the first things that attracted me to bzr. This is
> just a user talking.
> There might be technical reasons why it's better to not do it, but for
> me it works the way I expect, therefore I'm happy

Can you give us a quick example of why you got such a clearer picture?

> 3. plugins are useless/useful

Hmmmm. It's more of a unix/C/pipes tradition vs dynamically typed &
compiled scripting language tradition.

> 4. And now, storing branch information should be done manually (if
> wanted) and not automatically.

> 4. No comment.

Probably not. But if someone is using branchnames to identify "lines
of work" and hoping that metadata will remain attached there, it's
probably a bad long-term approach.

But following what you said earlier about that info being transient
and "local", then I was 200% wrong, and thinking of Arch/Bazaar usage
patterns.

cheers,


martin

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  9:51                                                               ` Matthieu Moy
@ 2006-10-24 10:27                                                                 ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-24 10:27 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthieu Moy wrote:
> "Matthew D. Fuller" <fullermd@over-yonder.net> writes:
> 
>>> For example, how long does it take to do an arbitrary "undo" (ie
>>> forcing a branch to an earlier state) [...]
>>
>> I don't understand the thrust of this, either.  As I understand the
>> operation you're talking about, it doesn't have anything to do with a
>> branch; you'd just be whipping the working tree around to different
>> versions.  That should be O(diff) on any modern VCS.

> There are two things to do:
>
> * Mark the tree as corresponding to a different revision in the past.
[...]
> * Then, do the "merge" to make your tree up to date. You can hardly do
>   faster than git and its unpacked format, but this is at the cost of
>   disk space. But as you say, in almost any modern VCS, that's
>   O(diff). In a space-efficient format, that's just the tradeoff you
>   make between full copies of a file and delta-compression.

Actually, this would be "checkout" (in git terminology), i.e. overwriting
the files which differ between the current revision and the revision we
rewind (undo) to. (That's of course a simplification, omitting for example
removing and creating files.) That is O(changed files), which is the lower
bound, so it cannot be faster. Finding which files changed is also
O(changed files), with a little bit of O(directory depth) in git, with a
very small constant.

And even in the case of the packed format, it wouldn't be O(diff)/O(history),
but O(delta length), where delta length is the maximum length of a delta
chain in the pack, by default set to 10. Well, the constant is a bit larger
because git additionally zlib-compresses (even in the loose, i.e. unpacked,
format).
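
A rough sketch of why finding the changed files is O(changed files) plus a
bit of O(directory depth): if every directory carries a hash covering its
entries, identical subtrees can be skipped without being read. The model
below is a simplification assumed for illustration (tuples of kind, hash and
children); it is not git's actual object format, and the helper names are
hypothetical.

```python
import hashlib


def blob(data):
    """A file: (kind, content hash, no children)."""
    return ("blob", hashlib.sha1(data).hexdigest(), {})


def tree(entries):
    """A directory whose hash covers the names and hashes of its entries."""
    h = hashlib.sha1()
    for name in sorted(entries):
        kind, digest, _ = entries[name]
        h.update(("%s %s %s\n" % (kind, name, digest)).encode())
    return ("tree", h.hexdigest(), entries)


def changed_paths(old, new, prefix=""):
    """List file paths that differ, skipping subtrees with equal hashes."""
    if old is not None and new is not None and old[:2] == new[:2]:
        return []  # same kind and hash: nothing below here can differ
    old_kids = old[2] if old is not None and old[0] == "tree" else {}
    new_kids = new[2] if new is not None and new[0] == "tree" else {}
    result = []
    # A blob on either side that differs or has no counterpart is a change.
    if (old is not None and old[0] == "blob") or \
       (new is not None and new[0] == "blob"):
        result.append(prefix)
    for name in sorted(set(old_kids) | set(new_kids)):
        child = prefix + "/" + name if prefix else name
        result.extend(
            changed_paths(old_kids.get(name), new_kids.get(name), child))
    return result


old = tree({
    "README": blob(b"hello\n"),
    "src": tree({"a.c": blob(b"int a;\n"), "b.c": blob(b"int b;\n")}),
    "doc": tree({"guide.txt": blob(b"v1\n")}),
})
new = tree({
    "README": blob(b"hello\n"),
    "src": tree({"a.c": blob(b"int a = 1;\n"), "b.c": blob(b"int b;\n")}),
    "doc": tree({"guide.txt": blob(b"v1\n")}),
})
```

Here changed_paths(old, new) only descends into "src"; the unchanged "doc"
directory is dismissed by comparing a single hash.
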
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  6:45                           ` David Rientjes
       [not found]                             ` <Pin e.LNX.4.64.0610240812410.3962@g5.osdl.org>
       [not found]                             ` <"Pin e.LNX.4.64.0610240812410.3962"@g5.osdl.org>
@ 2006-10-24 15:15                             ` Linus Torvalds
  2006-10-24 20:12                               ` David Rientjes
  2006-10-26  2:29                             ` Linus Torvalds
  3 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-10-24 15:15 UTC (permalink / raw)
  To: David Rientjes; +Cc: Lachlan Patrick, bazaar-ng, git



On Mon, 23 Oct 2006, David Rientjes wrote:
> 
> Some of the internal commands that have been coded in C are actually much 
> better handled by the shell in the first place.  It's much simpler to 
> write and extend as well as being much more traceable for runtime 
> problems.

Yes. However, from a portability (to Windows) standpoint, shell is just 
about the worst choice.

Not that perl/python/etc really help - unless the _whole_ program is one 
perl/python thing. Windows just doesn't like pipelines etc very much.

So I'd like all the _common_ programs to be built-ins..

		Linus

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:26                                                                 ` Matthew D. Fuller
@ 2006-10-24 15:58                                                                   ` David Lang
  2006-10-24 16:34                                                                     ` Matthew D. Fuller
  0 siblings, 1 reply; 806+ messages in thread
From: David Lang @ 2006-10-24 15:58 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Mon, 23 Oct 2006, Matthew D. Fuller wrote:

> But I don't understand how bzr-the-abstract-data-model makes such
> things impossible, or even significantly different than doing so in
> git.  In git, you're just chopping off one DAG where another one
> intersects it (or similar operations).  To do it in bzr, you'd do...
> exactly the same thing.  The revnos, or the mainline, are completely
> useless in such an operation of course, but they don't hurt it; the
> tool would just ignore them like it does the SHA-1 of files in
> the revision.

one key difference is that with bzr you have to do this chopping by creating the 
branches at the time changes are done, with git you do this chopping after the 
fact when you are displaying the results.

As such you can chop and compare things in ways that were never contemplated by 
anyone at the time changes are made.

>
>> See? When you visualize multiple branches together, HAVING
>> PER-BRANCH REVISION NUMBERS IS INSANE! Yet, clearly, it's a valid
>> and interesting operation to do.
>
> I wouldn't be so absolutist about it, but certainly they're of
> extremely limited utility if of any at all in such cases.  And yes, it
> can be an interesting operation.  But what does that have to do with
> using revnos in other cases?  You keep saying "having" where I would
> say "using".

and the bzr tools strongly encourage the use of these numbers

> I care about that first parent line.  Therefore, I require my tool to
> at least _pretend_ to care.  I'm not aware of any way in which the
> fundamental bzr structures care, but the UI is chock full of
> pretending.  A necessary part of that pretending is not changing my
> mainline unless I specifically ask for it, and that means a
> merge-vs-pull distinction needs to be there.  That's a _technical_
> sign that the tool is ready to work with me the way I want to work.  A
> lack of it is a _technical_ sign that it's not suitable.

nobody is saying that the bzr approach is invalid for your workflow.

what people are saying is that it doesn't easily support a truly distributed 
workflow. this is a very different statement.

your workflow isn't truly distributed so bzr's model works well for you. no 
problem, just don't claim that, because you haven't run into any problems with 
your workflow, there are no problems with bzr in other workflows.

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 15:58                                                                   ` David Lang
@ 2006-10-24 16:34                                                                     ` Matthew D. Fuller
  2006-10-24 18:03                                                                       ` David Lang
  0 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-24 16:34 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, Oct 24, 2006 at 08:58:56AM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> one key difference is that with bzr you have to do this chopping by
> creating the branches at the time changes are done,

HUH?  Why on earth do you think that?

To do this in a git data model, you point at 2 (or 3, or 4, or...)
revisions, anywhere in the revision-space universe.  You derive back a
DAG of the history from each of them by recursing over parent links.
You figure out where (if anywhere) those DAG's intersect.  And based
on that, you alter what and how you display; including or excluding
certain revs, changing the angles of lines or columnation of dots in a
graph, etc.

To do it in a bzr data model, you would follow *EXACTLY* the same
steps.  As in, you do EXACTLY (a), then EXACTLY (b), then...
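
For concreteness, those steps can be sketched over nothing but parent
links. This is an illustration assuming the history is given as a mapping
from each revision to its list of parents; the names and the tiny example
graph are hypothetical and belong to neither git nor bzr.

```python
def ancestors(parents, tip):
    """All revisions reachable from tip via parent links (tip included)."""
    seen = set()
    stack = [tip]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents.get(rev, ()))
    return seen


def shared_and_unique(parents, tip_a, tip_b):
    """Return (common history, only reachable from a, only reachable from b)."""
    a = ancestors(parents, tip_a)
    b = ancestors(parents, tip_b)
    return a & b, a - b, b - a


# Hypothetical history: two lines of work forking after revision "x".
parents = {
    "root": [],
    "x": ["root"],
    "l1": ["x"],
    "l2": ["l1"],
    "r1": ["x"],
}
common, only_left, only_right = shared_and_unique(parents, "l2", "r1")
```

Nothing here consults a revno or a mainline; a display tool would decide
what to draw from the three sets alone.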


> what people are saying is that it doesn't easily support a truly
> distributed workflow. this is a very different statement.

And it's one that carries around a lot of unstated assumptions about
what "truly distributed" means, which *I*'m certainly not
understanding, because any meaning I can apply to the term doesn't
lead me to the conclusions it does you.  Certainly, depending on your
workflow, certain parts of the UI are of lesser utility than they are
in mine, down to and including zero.  And it's probably certain that
some parts of the UI aren't up to handling various workflows, too,
including OUR workflow.  That's kinda what "in development" means...

But that's a very different statement from the claim that they CAN'T
be without changes to the conceptual model underneath.  Just because a
UI is built around maintaining the fiction of a mainline doesn't mean
the system requires it.  All you'd have to do to abandon it is write a
different log formatter that didn't show revnos and didn't nest merge
commits, and change (or add an option to) 'merge' to fast-forward if
possible.  The difference between the views on how the pieces should
fit together really IS just that fine.
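
The "fast-forward if possible" rule mentioned above is small enough to
sketch. This is a minimal illustration over a revision-to-parents mapping;
the function names and the new_id parameter are invented for the example,
and it is not how either tool's merge command is actually implemented.

```python
def is_ancestor(parents, candidate, tip):
    """True if candidate is reachable from tip via parent links."""
    stack, seen = [tip], set()
    while stack:
        rev = stack.pop()
        if rev == candidate:
            return True
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(parents.get(rev, ()))
    return False


def merge(parents, ours, theirs, new_id="merge"):
    """Return the branch tip after merging theirs into ours."""
    if is_ancestor(parents, theirs, ours):
        return ours    # already contained in our history, nothing to do
    if is_ancestor(parents, ours, theirs):
        return theirs  # fast-forward: no merge commit is created
    parents[new_id] = [ours, theirs]  # true merge, ours is the first parent
    return new_id


# Hypothetical history: a -> b -> c on one line, d forked from a.
parents = {"a": [], "b": ["a"], "c": ["b"], "d": ["a"]}
```

With this rule merge(parents, "b", "c") simply moves the tip to "c", so
the first-parent line is no longer guaranteed to be "my" line of work.
That is exactly the trade-off being argued about.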


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  5:42                                                                         ` Linus Torvalds
  2006-10-24  5:47                                                                           ` Shawn Pearce
@ 2006-10-24 16:46                                                                           ` Matthew D. Fuller
  1 sibling, 0 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-24 16:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git, Erik Bågfors

On Mon, Oct 23, 2006 at 10:42:23PM -0700 I heard the voice of
Linus Torvalds, and lo! it spake thus:
> 
> Well, I would use the globally unique ones, certainly. It's the only
> thing that makes sense.

So would I, and it is.


> Using the _same_ names everywhere is just better. 

This is just where we split on it.  All else being equal, sure, but
all else is never equal.  Most of my time is spent working forward
along one branch (different branches at different times, of course,
but at any given moment I'm almost certainly only concerned about one
branch), and having a different and advantageous localized naming
scheme there is a benefit I celebrate.  If most of my time were
instead spent comparing and contrasting and intersecting and
cross-breeding branches, it would probably be as worthless to me as it
apparently is to you.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
           On the Internet, nobody can hear you scream.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 16:34                                                                     ` Matthew D. Fuller
@ 2006-10-24 18:03                                                                       ` David Lang
  2006-10-24 18:25                                                                         ` Jakub Narebski
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
  0 siblings, 2 replies; 806+ messages in thread
From: David Lang @ 2006-10-24 18:03 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, 24 Oct 2006, Matthew D. Fuller wrote:

> On Tue, Oct 24, 2006 at 08:58:56AM -0700 I heard the voice of
> David Lang, and lo! it spake thus:
>>
>> one key difference is that with bzr you have to do this chopping by
>> creating the branches at the time changes are done,
>
> HUH?  Why on earth do you think that?
>
> To do this in a git data model, you point at 2 (or 3, or 4, or...)
> revisions, anywhere in the revision-space universe.  You derive back a
> DAG of the history from each of them by recursing over parent links.
> You figure out where (if anywhere) those DAG's intersect.  And based
> on that, you alter what and how you display; including or excluding
> certain revs, changing the angles of lines or columnation of dots in a
> graph, etc.
>
> To do it in a bzr data model, you would follow *EXACTLY* the same
> steps.  As in, you do EXACTLY (a), then EXACTLY (b), then...

it sounded like you were saying that the way to get the slices of the DAG was to 
use branches in bzr. to do this you need to create the branches with the correct 
info on each branch. this is only practical if the branches are created as the 
changes are made, if you try to do this after the fact you need to create the 
changes in the branch before you do the slicing.

with git you can look at the DAG and pick any arbitrary points in it as points 
to use for the slicing at display time.

>> what people are saying is that it doesn't easily support a truly
>> distributed workflow. this is a very different statement.
>
> And it's one that carries around a lot of unstated assumptions about
> what "truly distributed" means, which *I*'m certainly not
> understanding, because any meaning I can apply to the term doesn't
> lead me to the conclusions it does you.  Certainly, depending on your
> workflow, certain parts of the UI are of lesser utility than they are
> in mine, down to and including zero.  And it's probably certain that
> some parts of the UI aren't up to handling various workflows, too,
> including OUR workflow.  That's kinda what "in development" means...
>
> But that's a very different statement from the claim that they CAN'T
> be without changes to the conceptual model underneath.  Just because a
> UI is built around maintaining the fiction of a mainline doesn't mean
> the system requires it.  All you'd have to do to abandon it is write a
> different log formatter that didn't show revnos and didn't nest merge
> commits, and change (or add an option to) 'merge' to fast-forward if
> possible.  The difference between the views on how the pieces should
> fit together really IS just that fine.

the claim isn't that bzr can't be modified to support these other workflows (it 
sounds as if just changing the tools to use the internal revids rather than the 
current revnos would come very close to solving this problem), it's that the 
current revnos (use of which is strongly encouraged by the current UI) cannot 
support some workflows, and therefore the claim that it supports fully 
distributed workflows as well as git does is false

remember that this entire thing started with a feature comparison checklist, 
the definitions of some of the items on the checklist are being questioned.

after that there's the issue of whether the VCS in question has the feature.

this discussion started with two topologies

1. Centralized: all commits must go to one repository, connectivity required to check-in 
2. Distributed: everything else

since then one additional topology has been defined, and one has been redefined

1. Centralized: all commits must go to one repository, connectivity required to check-in

2. Star: one repository is 'special' or 'primary' and all other repositories 
sync to this, but development can take place against local repositories, 
connectivity is only required when syncing the repositories. as updates take 
place the history is defined by the primary repository, and can overwrite or 
change the history as defined by local repositories.

3. Distributed: all repositories are equal (any definition of 'primary' is a 
matter of convention, not a requirement of the tool) development can take place 
against local repositories, connectivity is only required when syncing the 
repositories. repositories with no development taking place can sync back and 
forth with no side effects. History displays the same thing no matter what 
repository is looked at (allowing for the fact that some repositories may not 
have the full history)

everyone agrees that bzr supports the Star topology. Most people (including bzr 
people) seem to agree that currently bzr does not support the Distributed 
topology.

it's just fine for bzr to not support all possible topologies, the only reason 
for discussing these issues (besides everyone understanding each other) is the 
feature checklist that started this entire thread, and what is appropriate there 
for each VCS (see the early part of this discussion to see how that worked with 
git's rename support)

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 18:03                                                                       ` David Lang
@ 2006-10-24 18:25                                                                         ` Jakub Narebski
  2006-10-24 19:27                                                                           ` Petr Baudis
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-24 18:25 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Lang wrote:

> 1. Centralized: all commits must go to one repository, connectivity
> required to check-in 

Bazaar-NG "light checkouts" implements this. Git doesn't support this
topology, and probably wouldn't.

1.5. Disconnected centralized. Like centralized, but you can work (perhaps
limited to what you can do) even without connection to central server.
Minimally you have to be able to commit changes locally, if central server
is not available. Bzr "normal/heavyweight checkouts" are [roughly] about
this. Git "lazy clone" proposal is about a similar thing; you can get git to
support this model (although without space savings) with full 
clone + hooks.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 18:25                                                                         ` Jakub Narebski
@ 2006-10-24 19:27                                                                           ` Petr Baudis
  0 siblings, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-24 19:27 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Dear diary, on Tue, Oct 24, 2006 at 08:25:53PM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> David Lang wrote:
> 
> > 1. Centralized: all commits must go to one repository, connectivity
> > required to check-in 
> 
> Bazaar-NG "light checkouts" implements this. Git doesn't support this
> topology, and probably wouldn't.
> 
> 1.5. Disconnected centralized. Like centralized, but you can work (perhaps
> limited to what you can do) even without connection to central server.
> Minimally you have to be able to commit changes locally, if central server
> is not available. Bzr "normal/heavyweight checkouts" are [roughly] about
> this. Git "lazy clone" proposal is about a similar thing; you can get git to
> support this model (although without space savings) with full 
> clone + hooks.

Cogito can do it now out of the box, having support for cg-commit --push
and cg-update preserving uncommitted local changes.

Not that you probably should use it. ;-)

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 15:15                             ` Linus Torvalds
@ 2006-10-24 20:12                               ` David Rientjes
  2006-10-24 20:28                                 ` Jakub Narebski
  2006-10-25  8:48                                 ` Jeff King
  0 siblings, 2 replies; 806+ messages in thread
From: David Rientjes @ 2006-10-24 20:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

On Tue, 24 Oct 2006, Linus Torvalds wrote:

> Yes. However, from a portability (to Windows) standpoint, shell is just 
> about the worst choice.
> 
> Not that perl/python/etc really help - unless the _whole_ program is one 
> perl/python thing. Windows just doesn't like pipelines etc very much.
> 
> So I'd like all the _common_ programs to be built-ins..
> 

And I would prefer the opposite because we're talking about git.  As an 
information manager, it should be seen and not heard.  Nobody is going to 
spend their time to become a git or CVS or perforce expert.  As an 
individual primarily interested in development, I should not be required 
to learn command lines for dozens of different git-specific commands to do 
my job quickly and effectively.  I would opt for a much simpler 
approach and deal with shell scripting for many of these commands because 
I'm familiar with them and I can pipe any command with the options I 
already know and have used before to any other command.

As a developer on Linux based systems, I should not need to deal with 
code in a revision control system that is longer and less traceable 
because the authors of that system decided they wanted to support Windows 
too.  Moving away from the functionality that the shell provides is a 
mistake for a system such as git where it could be so advantageous because 
of the inherent nature of git as an information manager.

This is the reason why I was a fan of git long ago and used it for my own 
needs before tons of unnecessary features and unneeded complexity were 
added on.

		David




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 20:12                               ` David Rientjes
@ 2006-10-24 20:28                                 ` Jakub Narebski
  2006-10-25  8:48                                 ` Jeff King
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-24 20:28 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Rientjes wrote:

> This is the reason why I was a fan of git long ago and used it for my own 
> needs before tons of unnecessary features and unneeded complexity were 
> added on.

But you can still use git as you used it a long time ago. The plumbing
commands didn't vanish. Git got rich in porcelain-ish commands, true, but the
old core remains. And GIT_TRACE (quite a new addition) certainly helps.

I think git profits very much from being created bottom-up, from the main
idea of an SCM, through repository format and structure, through plumbing
commands, through porcelain done with scripts, to having many new plumbing
commands, to having many commands builtin, in the future to libification
perhaps.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  0:47                                                                       ` Carl Worth
  2006-10-24  7:31                                                                         ` Erik Bågfors
@ 2006-10-24 21:51                                                                         ` Erik Bågfors
  2006-10-25 12:41                                                                           ` Andreas Ericsson
  1 sibling, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-10-24 21:51 UTC (permalink / raw)
  To: Carl Worth
  Cc: Matthew D. Fuller, Linus Torvalds, bazaar-ng, git, Jakub Narebski

Sorry for going back to an old mail... but....

On 10/24/06, Carl Worth <cworth@cworth.org> wrote:
> On Mon, 23 Oct 2006 19:26:57 -0500, "Matthew D. Fuller" wrote:
> >
> > On Mon, Oct 23, 2006 at 04:24:30PM -0700 I heard the voice of
> > Linus Torvalds, and lo! it spake thus:
> > >
> > > The problem? How do you show a commit that is _common_ to two
> > > branches, but has different revision names in them?
> >
> > Why would you?
>
> Assume you've got two long-lived branches and one periodically gets
> merged into the other one. The combined history might look as follows
> (more recent commits first):
>
>  f   g
>  |   |
>  d   e
>  |\ /
>  b c
>  |/
>  a
>
> The point is that it is extremely nice to be able to visualize things
> that way. Say I've got a "dev" branch that points at f and a "stable"
> branch that points at g. With this, a command like:
>
>         gitk dev stable
>
> would result in a picture just like the above. Can a similar figure be
> made with bzr? Or only the following two separate pictures:

I wanted to test how hard it is. So I created a small plugin that will
show the relationships between revisions... The following commands

bzr init-repo repo --trees
bzr init repo/branchA
cd repo/branchA
bzr whoami --branch "Test Devel 1 <test1@devel.com>"
bzr ci --unchanged -m a1
bzr ci --unchanged -m a2
bzr branch . ../branchB
bzr ci --unchanged -m a3
bzr ci --unchanged -m a4
cd ../branchB
bzr whoami --branch "Test Devel 2 <test2@devel.com>"
bzr ci --unchanged -m b1
bzr ci --unchanged -m b2
bzr merge ../branchA
bzr ci -m merge
bzr ci --unchanged -m b3
bzr ci --unchanged -m b4
cd ../branchA
bzr merge ../branchB
bzr ci -m merge
bzr ci --unchanged -m a5
cd ../branchB
bzr ci --unchanged -m b5
cd ..
bzr dotrepo > test.dot
dot -Tpng test.dot > dotrepo.png

Creates the picture you can see at
http://erik.bagfors.nu/bzr-plugins/dotrepo.png

Please remember that this is a 15 min implementation and as such might
suck (the output is not perfect for example, it's slow, etc).  This
just brings in every revision in the entire repo, but expanding it to
take just the branches named on the command line is perfectly possible.

But still.. there is no problem to create this.
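
To give an idea of how little is involved, here is a sketch of the general
approach: walk a revision-to-parents mapping and print Graphviz DOT. This is
not the dotrepo plugin's code; the function name is made up, and the graph
is the a..g example history quoted above, typed in by hand rather than read
from a repository.

```python
def to_dot(parents):
    """Render a child -> parents mapping as Graphviz DOT text."""
    lines = ["digraph history {"]
    for child in sorted(parents):
        lines.append('    "%s";' % child)
        for parent in parents[child]:
            lines.append('    "%s" -> "%s";' % (child, parent))
    lines.append("}")
    return "\n".join(lines)


# The quoted example history: f and g are the two branch tips,
# and d is the merge of b and c.
carl_graph = {
    "a": [],
    "b": ["a"],
    "c": ["a"],
    "d": ["b", "c"],
    "e": ["c"],
    "f": ["d"],
    "g": ["e"],
}

print(to_dot(carl_graph))
```

The printed text can be saved to a file and fed to "dot -Tpng" just like
test.dot above. Commit c appears once, with edges from both d and e, and no
revno is needed to draw it.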

/Erik
ps. the plugin can be bzr branched from
http://erik.bagfors.nu/bzr-plugins/dotrepo/
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 18:03                                                                       ` David Lang
  2006-10-24 18:25                                                                         ` Jakub Narebski
@ 2006-10-25  0:27                                                                         ` Matthew D. Fuller
  2006-10-25 22:40                                                                           ` David Lang
  1 sibling, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-25  0:27 UTC (permalink / raw)
  To: David Lang; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> it sounded like you were saying that the way to get the slices of
> the DAG was to use branches in bzr. [...]

I'm not entirely sure I understand what you mean here, but I think
you're saying "Nobody's written the code in bzr to show arbitrary
slices of the DAG", which is true TTBOMK.


> everyone agrees that bzr supports the Star topology. Most people
> (including bzr people) seem to agree that currently bzr does not
> support the Distributed topology.

I think this statement arouses so much grumbling because (a) bzr does
support such a lot better than often seems implied, (b) where it
doesn't, the changes needed to do so are relatively minor (often
merely cosmetic), and (c) disagreement over whether some of the
qualifications included for 'distributed' are really fundamental.


> it's just fine for bzr to not support all possible topologies,

I think there's a real intent for bzr TO support at least all common
topologies.  I'll buy that current development has focused more on
[relatively] simple topologies than the more wildly complex ones.  I
look forward to more addressing of the less common cases as the tool
matures, and I think a lot of this thread will be good material to
work with as that happens.  It's just the suggestion that providing
fruit for simple topologies _necessarily_ prejudices against complex
ones that I find so onerous.


> (besides everyone understanding each other)

That's a good enough reason for me.  Before this thread, I wasn't
interested in using git.  I'm still not, but now I understand much
better /why/ I'm not.  And when (I'm sure it'll happen sooner or
later) some project I follow picks up using git, I'll have enough
grounding in the tool's mental model to work with it when I have to.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 20:12                               ` David Rientjes
  2006-10-24 20:28                                 ` Jakub Narebski
@ 2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
                                                     ` (2 more replies)
  1 sibling, 3 replies; 806+ messages in thread
From: Jeff King @ 2006-10-25  8:48 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Tue, Oct 24, 2006 at 01:12:52PM -0700, David Rientjes wrote:

> And I would prefer the opposite because we're talking about git.  As an 
> information manager, it should be seen and not heard.  Nobody is going to 
> spend their time to become a git or CVS or perforce expert.  As an 
> individual primarily interested in development, I should not be required 
> to learn command lines for dozens of different git-specific commands to do 
> my job quickly and effectively.  I would opt for a much simpler 
> approach and deal with shell scripting for many of these commands because 
> I'm familiar with them and I can pipe any command with the options I 
> already know and have used before to any other command.

I don't understand how converting shell scripts to C has any impact
whatsoever on the usage of git. The plumbing shell scripts didn't go
away; you can still call them and they behave identically.

Is there some specific change in functionality that you're lamenting?

> As a developer on Linux based systems, I should not need to deal with 
> code in a revision control system that is longer and less traceable 
> because the authors of that system decided they wanted to support Windows 
> too.  Moving away from the functionality that the shell provides is a 
> mistake for a system such as git where it could be so advantageous because 
> of the inherent nature of git as an information manager.

Some shell->C conversions may have made the code "longer and less
traceable." However, many of those conversions caused the code to be
shorter (because communication between C functions is simpler than going
over pipes, and because anything involving a data structure more complex
than a string is difficult in the shell) and more robust (fewer
opportunities for quoting/parsing errors, and none of the shell gotchas
like missing the error code in "foo | bar").
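
A minimal sketch of that last gotcha, assuming a plain POSIX sh with no
pipefail-style option enabled ("false" is only a stand-in for a producer
that fails, and "cat" for a consumer that succeeds):

```shell
#!/bin/sh
# "false" stands in for a failing producer, "cat" for a working consumer.

# A pipeline reports the exit status of its last command, so the
# producer's failure is lost.
if false | cat; then
    pipe_status=0
else
    pipe_status=$?
fi
echo "status of 'false | cat': $pipe_status"

# Run on its own, the same command reports the failure.
if false; then
    plain_status=0
else
    plain_status=$?
fi
echo "status of 'false' alone: $plain_status"
```

A script that only looks at $? after the pipe never notices that the
first command failed, whereas a C caller checks the return value of the
function it called directly.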

Do you have any specific reason to believe that the git code is of worse
quality now than it was before?

> This is the reason why I was a fan of git long ago and used it for my own 
> needs before tons of unnecessary features and unneeded complexity were 
> added on.

Is there something you used to do with git that you no longer can? Is
there a reason you can't ignore the newer commands?


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
@ 2006-10-25  9:19                                   ` David Rientjes
  2006-10-25  9:32                                     ` Jakub Narebski
  2006-10-25  9:49                                     ` Jeff King
  2006-10-25 21:08                                   ` Junio C Hamano
  2 siblings, 2 replies; 806+ messages in thread
From: David Rientjes @ 2006-10-25  9:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, 25 Oct 2006, Jeff King wrote:

> I don't understand how converting shell scripts to C has any impact
> whatsoever on the usage of git. The plumbing shell scripts didn't go
> away; you can still call them and they behave identically.
> 
> Is there some specific change in functionality that you're lamenting?
> 

No, my criticism is against the added complexity which makes the 
modification of git increasingly difficult with every new release.  It's a 
pretty limited use case of the entire package, I'm sure, but one of the 
major advantages that I saw in git early on was the ability to tailor it 
to your own personal needs very easily with some simple shell knowledge 
and enough C that was required at the time.

> Some shell->C conversions may have made the code "longer and less
> traceable." However, many of those conversions caused the code to be
> shorter (because communication between C functions is simpler than going
> over pipes, and because anything involving a data structure more complex
> than a string is difficult in the shell) and more robust (fewer
> opportunities for quoting/parsing errors, and none of the shell gotchas
> like missing the error code in "foo | bar").
> 

You're ignoring the advantageous nature of the shell with regard to git.  
The shell is so much better prepared to deal with information managers by 
nature than the C programming language.  It's not a matter of shorter 
code, per se, it's about the developer's ability to make small changes to 
the operation of the information manager on demand to tailor to his or her 
_current_ needs.  For any experienced shell programmer it is so much 
easier to go in and change an option or pipe to a different command or 
comment out a simple shell command in a .sh file than editing the C code.  
And sometimes it's necessary to have several different variations of that 
command which is very easy with slightly renamed .sh files instead of 
adding on more and more flags to commands that have become so complex at 
this point that it's difficult to know the basics of how to manage a 
project.

This all became very obvious when the tutorials came out on "how to use 
git in 20 commands or less" effectively.  These tutorials shouldn't need 
to exist with an information manager that started as a quick, efficient, 
and _simple_ project.  You're treating git development in the same light 
as you treat Linux development; let's be honest and say that 99% of the 
necessary git functionality was there almost a year ago and ever since 
nothing of absolute necessity has been added that serious developers care 
about in a revision control system.  Look at LKML, nobody is waiting on 
these new releases and upgrading to them when they're announced.  And this 
is the community that git has _targeted_.  Most other projects don't care 
about the syntax of sign-off lines and acked-by lines and format-patch 
like the git community does.

> Do you have any specific reason to believe that the git code is of worse
> quality now than it was before?
> 

Absolutely.  I think I've actually documented that fairly well.  Back in 
the day git was a very concise, well-written package.  Today, a tour 
through the source code for the latest release leaves a lot to be desired 
for any serious C programmer.

> Is there something you used to do with git that you no longer can? Is
> there a reason you can't ignore the newer commands?
> 

Functionality wise, no.  But in terms of being able to _customize_ my 
version of git depending on how I want to use it, I've lost hope on the 
whole idea.  It's a shame too because it appears as though the original 
vision was one of efficiency and simplicity.  I would say that git-1.2.4 
is my package of preference with some slight tweaking in the branching 
department.

I really do miss the old git.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:19                                   ` David Rientjes
@ 2006-10-25  9:32                                     ` Jakub Narebski
  2006-10-25  9:49                                     ` Jeff King
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-25  9:32 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

David Rientjes wrote:

> On Wed, 25 Oct 2006, Jeff King wrote:
> 
>> I don't understand how converting shell scripts to C has any impact
>> whatsoever on the usage of git. The plumbing shell scripts didn't go
>> away; you can still call them and they behave identically.
>> 
>> Is there some specific change in functionality that you're lamenting?
>> 
> 
> No, my criticism is against the added complexity which makes the 
> modification of git increasingly difficult with every new release.  It's a 
> pretty limited use case of the entire package, I'm sure, but one of the 
> major advantages that I saw in git early on was the ability to tailor it 
> to your own personal needs very easily with some simple shell knowledge 
> and enough C that was required at the time.
> 
[...]
>> Is there something you used to do with git that you no longer can? Is
>> there a reason you can't ignore the newer commands?
> 
> Functionality wise, no.  But in terms of being able to _customize_ my 
> version of git depending on how I want to use it, I've lost hope on the 
> whole idea.  It's a shame too because it appears as though the original 
> vision was one of efficiency and simplicity.  I would say that git-1.2.4 
> is my package of preference with some slight tweaking in the branching 
> department.

Ahah! So you miss the old script versions of git commands, which you could
easily modify and tailor to your needs, is that it? Well, if you don't mind
keeping a clone of the git repository lying around somewhere, you can always
resurrect the old shell version of some git command, e.g.
  $ git cat-file -p v1.2.4:git-prune.sh > "$(git --exec-path)/git-prune.sh"
then change its name and modify it as you used to do.

Are there any old commands which stopped working?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 13:01                                               ` Matthew D. Fuller
  2006-10-21 14:08                                                 ` Jakub Narebski
  2006-10-21 20:47                                                 ` Carl Worth
@ 2006-10-25  9:35                                                 ` Andreas Ericsson
  2006-10-25  9:46                                                   ` Jakub Narebski
  2006-10-25  9:57                                                   ` Matthieu Moy
  2 siblings, 2 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-25  9:35 UTC (permalink / raw)
  To: Matthew D. Fuller
  Cc: Carl Worth, Aaron Bentley, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Matthew D. Fuller wrote:
> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
> Carl Worth, and lo! it spake thus:
> 
>> (since pull seems the only way to synch up without infinite new
>> merge commits being added back and forth).
> 
> The infinite-merge-commits case doesn't happen in bzr-land because we
> generally don't merge other branches except when the branch owner says
> "Hey, I've got something for you to merge".  If you were to set up a
> script to merge two branches back and forth until they were 'equal',
> yes, it'd churn away until you filled up your disk with the N bytes of
> metadata every new revision uses up.
> 

This is new to me. At work, we merge our toy repositories back and forth 
between devs only. There is no central repo at all. Does this mean that 
each merge would add one extra commit per time the one I'm merging with 
has merged with me?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:35                                                 ` Andreas Ericsson
@ 2006-10-25  9:46                                                   ` Jakub Narebski
  2006-10-25 10:08                                                     ` James Henstridge
  2006-10-25  9:57                                                   ` Matthieu Moy
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-25  9:46 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	bazaar-ng, git

Andreas Ericsson wrote:
> Matthew D. Fuller wrote:
>> On Fri, Oct 20, 2006 at 02:48:52PM -0700 I heard the voice of
>> Carl Worth, and lo! it spake thus:
>> 
>>> (since pull seems the only way to synch up without infinite new
>>> merge commits being added back and forth).
>> 
>> The infinite-merge-commits case doesn't happen in bzr-land because we
>> generally don't merge other branches except when the branch owner says
>> "Hey, I've got something for you to merge".  If you were to set up a
>> script to merge two branches back and forth until they were 'equal',
>> yes, it'd churn away until you filled up your disk with the N bytes of
>> metadata every new revision uses up.
> 
> This is new to me. At work, we merge our toy repositories back and forth 
> between devs only. There is no central repo at all. Does this mean that 
> each merge would add one extra commit per time the one I'm merging with 
> has merged with me?

From what I understand, "bzr merge" will create one extra commit to
preserve the "first parent is my branch" feature. "bzr pull" will do a
fast-forward if your DAG is a proper subset of the pulled
branch/repository DAG, but at the cost of changing your revno to
revision mapping to that of the pulled repository.

That's a consequence of recording a branch as "my work", i.e. as a path
through the DAG that treats the first parent as special, instead of
saving it outside the DAG.

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:19                                   ` David Rientjes
  2006-10-25  9:32                                     ` Jakub Narebski
@ 2006-10-25  9:49                                     ` Jeff King
  2006-10-25 13:49                                       ` Andreas Ericsson
  2006-10-25 17:21                                       ` David Rientjes
  1 sibling, 2 replies; 806+ messages in thread
From: Jeff King @ 2006-10-25  9:49 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, bazaar-ng, git

On Wed, Oct 25, 2006 at 02:19:15AM -0700, David Rientjes wrote:

> No, my criticism is against the added complexity which makes the 
> modification of git increasingly difficult with every new release.  It's a 

OK, you seemed to imply problems for end users in your first paragraph,
which is what I was responding to.

> _current_ needs.  For any experienced shell programmer it is so much 
> easier to go in and change an option or pipe to a different command or 
> comment out a simple shell command in a .sh file than editing the C code.  

Yes, it's true that some operations might be easier to play with in the
shell. However, does it actually come up that you want to modify
existing git programs? The more common usage seems to be gluing the
plumbing together in interesting ways, and that is still very much
supported.

> And sometimes it's necessary to have several different variations of that 
> command which is very easy with slightly renamed .sh files instead of 
> adding on more and more flags to commands that have become so complex at 
> this point that it's difficult to know the basics of how to manage a 
> project.

You can do the same thing in C. In fact, look at how similar
git-whatchanged, git-log, and git-diff are.

I don't understand how a shell->C conversion has anything to do with
options being added. If you look at all of the conversions, they
replicate the interface _exactly_.

> This all became very obvious when the tutorials came out on "how to use 
> git in 20 commands or less" effectively.  These tutorials shouldn't need 
> to exist with an information manager that started as a quick, efficient, 
> and _simple_ project.  You're treating git development in the same light 

Sorry, I don't see how this is related to the programming language _at
all_. Are you arguing that the interface of git should be simplified so
that such tutorials aren't necessary? If so, then please elaborate, as
I'm sure many here would like to hear proposals for improvements. If
you're arguing that git now has too many features, then which features
do you consider extraneous?

> as you treat Linux development; let's be honest and say that 99% of the 
> necessary git functionality was there almost a year ago and ever since 
> nothing of absolute necessity has been added that serious developers care 
> about in a revision control system.  Look at LKML, nobody is waiting on 

I don't agree with this. There are tons of enhancements that I find
useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
think other developers ARE using. There are scalability and performance
improvements. And there are new things on the way (Junio's pickaxe work)
that will hopefully make git even more useful than it already is.

If you don't think recent git versions are worthwhile, then why don't
you run an old version? You can even use git to cherry-pick patches onto
your personal branch.

> Absolutely.  I think I've actually documented that fairly well.  Back in 

Where?

> the day git was a very concise, well-written package.  Today, a tour 
> through the source code for the latest release leaves a lot to be desired 
> for any serious C programmer.

I don't agree, but since you haven't provided anything specific enough
to discuss, there's not much to say.

> Functionality wise, no.  But in terms of being able to _customize_ my 
> version of git depending on how I want to use it, I've lost hope on the 
> whole idea.  It's a shame too because it appears as though the original 

Can you name one customization that you would like to perform now that
you feel can't be easily done (and presumably that would have been
easier in the past)?

-Peff




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 23:49                                                             ` Carl Worth
  2006-10-22  0:07                                                               ` Jeff Licquia
  2006-10-22 16:02                                                               ` Petr Baudis
@ 2006-10-25  9:52                                                               ` Andreas Ericsson
  2 siblings, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-25  9:52 UTC (permalink / raw)
  To: Carl Worth; +Cc: Jeff Licquia, Jakub Narebski, bazaar-ng, git

Carl Worth wrote:
> On Sat, 21 Oct 2006 19:42:47 -0400, Jeff Licquia wrote:
>> I don't think so.  Recently, I've been trying to track a particular
>> patch in the kernel.  It was done as a series of commits, and probably
>> would have been its own branch in bzr, but when I was trying to group
>> the commits together to analyze them as a group, the easiest way to do
>> that was by the original committer's name.
> 
> As far as "its own branch in bzr" would such a branch remain available
> indefinitely even after being merged in to the main tree?
> 
>> Now, there's probably a better way to hunt that stuff down, but in this
>> case hunting the user down worked for me.  (It may have made a
>> difference that I was using gitweb instead of a local clone.)
> 
> Vast, huge, gaping, cosmic difference.
> 
> Almost none of the power of git is exposed by gitweb. It's really not
> worth comparing. (Now a gitweb-alike that provided all the kinds of
> very easy browsing and filtering of the history like gitk and git
> might be nice to have.)
> 

There was one, but it got discontinued due to performance issues. Shame 
that, because it would have been nice to have to show "foreign" visitors 
how gitk/qgit works. It would especially show the way git thinks about 
branches and stuff like that.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:35                                                 ` Andreas Ericsson
  2006-10-25  9:46                                                   ` Jakub Narebski
@ 2006-10-25  9:57                                                   ` Matthieu Moy
  1 sibling, 0 replies; 806+ messages in thread
From: Matthieu Moy @ 2006-10-25  9:57 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Matthew D. Fuller, Carl Worth, Aaron Bentley, Linus Torvalds,
	bazaar-ng, git, Jakub Narebski

Andreas Ericsson <ae@op5.se> writes:

> This is new to me. At work, we merge our toy repositories back and
> forth between devs only. There is no central repo at all. Does this
> mean that each merge would add one extra commit per time the one I'm
> merging with has merged with me?

Two things differ in bzr and git, here:

* bzr doesn't do "autocommit" after a merge. So, new revisions are
  created only if you use "commit".

* bzr has two commands, "pull" and "merge". "pull" just does what the
  git people call "fast-forward", and only this (it refuses to do
  anything if the branches diverged). In particular, you never have to
  commit after a pull (well, except if you had some local, uncommitted
  changes). "merge" changes your working directory, and you have to
  commit after. "merge" will never do fast-forward, it will never
  change the revision to which your working tree refers, and it's
  your option to commit or not after (if you see that it introduces no
  changes, you might not want to commit).

The final rule in bzr would be "you create an extra commit each time
you commit" ;-).

As a side-note, it could be interesting to have a git-like merge
command (choosing automatically between merge and pull), probably not
in the core, but as a plugin.
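
The decision such a command has to make can be sketched roughly as
follows. This is a toy model in plain sh, not bzr or git code: history
is flattened to a space-separated list of revision names (oldest first),
whereas real history is a DAG, and the function names "tip", "contains"
and "sync" are made up for the illustration.

```shell
#!/bin/sh
# Toy model only: a branch is a space-separated list of revision names,
# oldest first; the last word is the tip.

# tip HISTORY: print the last revision name in HISTORY.
tip() { echo "${1##* }"; }

# contains HISTORY REV: succeed if REV appears in HISTORY.
contains() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# sync MINE THEIRS: print what a git-like merge command would decide.
sync() {
    mine=$1
    theirs=$2
    if contains "$mine" "$(tip "$theirs")"; then
        echo "up-to-date"       # nothing new on their side
    elif contains "$theirs" "$(tip "$mine")"; then
        echo "fast-forward"     # the case "pull" handles
    else
        echo "merge"            # diverged: "merge", then "commit"
    fi
}

sync "a b c" "a b"
sync "a b" "a b c"
sync "a b x" "a b c"
```

The three calls cover the cases described above: the other side has
nothing new, my tip is already part of their history, and the two
histories have diverged.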

-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:46                                                   ` Jakub Narebski
@ 2006-10-25 10:08                                                     ` James Henstridge
  2006-10-25 15:54                                                       ` Carl Worth
  0 siblings, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-25 10:08 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Carl Worth,
	Andreas Ericsson, git

On 25/10/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Andreas Ericsson wrote:
> > This is new to me. At work, we merge our toy repositories back and forth
> > between devs only. There is no central repo at all. Does this mean that
> > each merge would add one extra commit per time the one I'm merging with
> > has merged with me?
>
> From what I understand, "bzr merge" will create one extra commit to
> preserve the "first parent is my branch" feature. "bzr pull" will do
> fast-forward if your DAG is proper subset of pulled branch/repository
> DAG, but at the cost that it would change your revno to revision mapping
> to those of the pulled repository.

Actually, "bzr merge" does not create any commits on the branch -- you
need to run "bzr commit" afterwards (possibly after resolving
conflicts).  The control files for the working tree record a pending
merge, which gets recorded when you get round to the commit.

So you can easily check if there were any tree changes resulting from the merge.

If there aren't, or you made the merge by mistake, you can make a call
to "bzr revert" to clean things up without ever having created a new
revision.

James.




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 22:21                                                             ` Matthew D. Fuller
                                                                                 ` (3 preceding siblings ...)
  2006-10-24  9:51                                                               ` Matthieu Moy
@ 2006-10-25 10:52                                                               ` Andreas Ericsson
  2006-10-25 19:53                                                                 ` Junio C Hamano
  4 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-25 10:52 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

[-- Attachment #1: Type: text/plain, Size: 1178 bytes --]

Matthew D. Fuller wrote:
> On Mon, Oct 23, 2006 at 10:29:53AM -0700 I heard the voice of
> Linus Torvalds, and lo! it spake thus:
>> I already brought this up once, and I suspect that the bzr people
>> simply DID NOT UNDERSTAND the question:
>>
>>  - how do you do the git equivalent of "gitk --all"
> 
> I for one simply DO NOT UNDERSTAND the question, because I don't know
> what that is or what I'd be trying to accomplish by doing it.  The
> documentation helpfully tells me that it's something undocumented.
> 

See the attached screenshot. This is from qgit --all on the git 
repository, but the DAG output is identical to that of gitk. Note in 
particular the 'pu' and 'next' branches. By scrolling down, I can easily 
see the branch-point of any of them.

To those that do not appreciate or allow email-attachments, I apologize. 
I think however that it was necessary to provide a view for the bazaar 
people of what Linus is talking about without having to download and 
install git and a git repository.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

[-- Attachment #2: Screenshot.png --]
[-- Type: image/png, Size: 148451 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24 21:51                                                                         ` Erik Bågfors
@ 2006-10-25 12:41                                                                           ` Andreas Ericsson
  2006-10-25 13:15                                                                             ` Erik Bågfors
  0 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-25 12:41 UTC (permalink / raw)
  To: Erik Bågfors
  Cc: Carl Worth, Matthew D. Fuller, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

Erik Bågfors wrote:
> 
> Creates the picture you can see at
> http://erik.bagfors.nu/bzr-plugins/dotrepo.png
> 

Looking at this picture, I found a very annoying thing with bzr's 
revids: For commits from the same author on the same day, they don't 
differ in the beginning, making all of them, at a glance, look the same. 
I got a headache just trying to figure out how to read them. It might be 
worth looking into in the future, especially if you decide to show them 
to the users.

Perhaps it's just my git eyes being used to seeing the first 4 chars 
(which is all I normally look at) being different for each different 
commit, but having to look up the near-end of the string to find the 
actual difference in bzr's revids was actually a quite painful experience.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 12:41                                                                           ` Andreas Ericsson
@ 2006-10-25 13:15                                                                             ` Erik Bågfors
  0 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-25 13:15 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Carl Worth, Matthew D. Fuller, Linus Torvalds, bazaar-ng, git,
	Jakub Narebski

On 10/25/06, Andreas Ericsson <ae@op5.se> wrote:
> Erik Bågfors wrote:
> >
> > Creates the picture you can see at
> > http://erik.bagfors.nu/bzr-plugins/dotrepo.png
> >
>
> Looking at this picture, I found a very annoying thing with bzr's
> revids: For commits from the same author on the same day, they don't
> differ in the beginning, making all of them, at a glance, look the same.
> I got a headache just trying to figure out how to read them. It might be
> worth looking into in the future, especially if you decide to show them
> to the users.
>
> Perhaps it's just my git eyes being used to seeing the first 4 chars
> (which is all I normally look at) being different for each different
> commit, but having to look up the near-end of the string to find the
> actual difference in bzr's revids was actually a quite painful experience.

I agree, and new formats for how the revisions should look are being
discussed on the mailing list right now.  It's not set in stone.

/Erik

-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:49                                     ` Jeff King
@ 2006-10-25 13:49                                       ` Andreas Ericsson
  2006-10-25 21:51                                         ` David Lang
  2006-10-25 17:21                                       ` David Rientjes
  1 sibling, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-25 13:49 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Linus Torvalds, bazaar-ng, David Rientjes

Jeff King wrote:
> On Wed, Oct 25, 2006 at 02:19:15AM -0700, David Rientjes wrote:
> 
>> No, my criticism is against the added complexity which makes the 
>> modification of git increasingly difficult with every new release.  It's a 
> 
> OK, you seemed to imply problems for end users in your first paragraph,
> which is what I was responding to.
> 
>> _current_ needs.  For any experienced shell programmer it is so much 
>> easier to go in and change an option or pipe to a different command or 
>> comment out a simple shell command in a .sh file than editing the C code.  
> 
> Yes, it's true that some operations might be easier to play with in the
> shell. However, does it actually come up that you want to modify
> existing git programs? The more common usage seems to be gluing the
> plumbing together in interesting ways, and that is still very much
> supported.
> 

Indeed. I still use my old git-send-patch script whenever I want to send 
patches, simply because I don't like git-send-email and its defaults 
much. The interface hasn't changed one bit since I wrote it. That's 
pretty stable, since send-patch was created a couple of hours before git.c 
was submitted to the list, as I wrote the "send-patch" script to send 
the patch that did the rewriting.

I'm personally all for a rewrite of the necessary commands in C 
("commit" comes to mind), but like many others, I have no personal 
interest in doing the actual work. I'm fairly certain that once we get 
it working natively on windows with some decent performance, windows 
hackers will pick up the ball and write "wingit", which will be a log 
viewer and GUI thing for 
fetching/merging/committing/reverting/rebasing/sending patches and 
whatnot. Possibly it will have hooks to Visual C++ or some other IDE. I 
don't know how that sort of thing works, but I'm sure someone clever and 
bored enough will want to investigate the possibilities.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 10:08                                                     ` James Henstridge
@ 2006-10-25 15:54                                                       ` Carl Worth
  2006-10-26  8:52                                                         ` James Henstridge
  0 siblings, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-10-25 15:54 UTC (permalink / raw)
  To: James Henstridge
  Cc: Jakub Narebski, Andreas Ericsson, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, git

[-- Attachment #1: Type: text/plain, Size: 1197 bytes --]

On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
> If there aren't, or you made the merge by mistake, you can make a call
> to "bzr revert" to clean things up without ever having created a new
> revision.

One result of this approach is that developers of different trees
don't necessarily have common revision IDs to compare. Imagine a
question like:

	When you ran that test did you have the same code I've got?

In git, the answer would be determined by comparing revision IDs.
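
A rough sketch of that comparison, assuming each side prints the ID of
the checked-out commit with `git rev-parse HEAD` and the two strings
are exchanged by hand; the helper name `same_revision` is made up for
illustration:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two non-empty revision IDs match.
same_revision() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Typical use: each developer runs `git rev-parse HEAD` in their own
# tree and sends the printed ID to the other, e.g. by mail or IRC:
#   mine=$(git rev-parse HEAD)
#   same_revision "$mine" "$theirs" && echo "same code"
```

Matching IDs cover the committed tree and its history only; they say
nothing about uncommitted changes in either working tree.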

In bzr, the only answer I'm hearing is attempting a merge to see if it
introduces any changes. (I'm deliberately avoiding "pull" since we're
talking about distributed cases here).

And to comment on something mentioned earlier in the thread, there's
no need for "wildly complex" distributed scenarios. All of these
issues are present with developers working together as peers, (and
each considering their own repository as canonical).

A harder question (for bzr) is:

	Do you have all of the history I've got?

(The problem being that when one developer is missing some history and
merges it in, she necessarily creates new history, so there's never a
stable point for both sides to agree on.)

-Carl

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  9:49                                     ` Jeff King
  2006-10-25 13:49                                       ` Andreas Ericsson
@ 2006-10-25 17:21                                       ` David Rientjes
  2006-10-25 21:03                                         ` Jeff King
  2006-10-26 11:15                                         ` Andreas Ericsson
  1 sibling, 2 replies; 806+ messages in thread
From: David Rientjes @ 2006-10-25 17:21 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, 25 Oct 2006, Jeff King wrote:

> Yes, it's true that some operations might be easier to play with in the
> shell. However, does it actually come up that you want to modify
> existing git programs? The more common usage seems to be gluing the
> plumbing together in interesting ways, and that is still very much
> supported.
> 

Yes, it does.  I'll give you an example from six months ago: there was a 
need for the group that I work with to support a faster type of hashing 
function for whatever reason.  This would have been simple with previous 
versions of git, but if you've ever looked at the SHA1 code in git, you'll 
realize that you're probably better off never trying to touch it.  There 
is absolutely _no_ abstraction of it at all and the code is so deeply 
coupled in the source that abstracting it away is a pain.

Likewise, there is always room for personal or organizational tweaks on 
the part of the developer.  Things like distributed pulling and 
merging should actually be pretty simple to implement if the complexity 
wasn't so high in the merge-* family.  This is something I implemented 
after an enormous headache because we were dealing with very large 
projects: yes, larger than the Linux kernel.  And this is _exactly_ where 
piping would help; we have implementations of distributed grep over very 
large datasets (on the order of terabytes).

> You can do the same thing in C. In fact, look at how similar
> git-whatchanged, git-log, and git-diff are.
> 

No you can't.  Making a one line addition, commenting out a line, or 
changing a simple flag in a shell script is much easier.  And like I 
already said, if you work on a specific project much of the time, you can 
save multiple versions for your common use and change how each operates 
depending on the needs of that one project, so you never need to do it 
again.  Or you can _distribute_ that shell file to your colleagues so that 
everybody is doing their work via the same method.  This makes it so you 
can just say "type X, then type Y, then type Z" and everybody is operating 
together without being trained on how to use git.

> > This all became very obvious when the tutorials came out on "how to use 
> > git in 20 commands or less" effectively.  These tutorials shouldn't need 
> > to exist with an information manager that started as a quick, efficient, 
> > and _simple_ project.  You're treating git development in the same light 
> 
> Sorry, I don't see how this is related to the programming language _at
> all_. Are you arguing that the interface of git should be simplified so
> that such tutorials aren't necessary? If so, then please elaborate, as
> I'm sure many here would like to hear proposals for improvements. If
> you're arguing that git now has too many features, then which features
> do you consider extraneous?
> 

It's not, it's related to the original vision of git which was meant for 
efficiency and simplicity.  A year ago it was very easy to pick up the 
package and start using it effectively within a couple hours.  Keep in 
mind that this was without tutorials, it was just reading man pages.  
Today it would be very difficult to know what the essential commands are 
and how to use them simply to get the job done, unless you use the 
tutorials.  This _inherently_ goes against the approach of trying to 
provide something that is simple to the developer.

Revision control is something that should exist in the background that 
does its simple job very efficiently.  Unfortunately git has tried to 
move its presence into the foreground, requiring developers to spend 
more time on learning the system.

Have you never tried to show other people git without giving them a 
tutorial on the most common uses?  Try it and you'll see the confusion.  
That _specifically_ illustrates the ever-increasing lack of simplicity 
that git has acquired.

> I don't agree with this. There are tons of enhancements that I find
> useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
> think other developers ARE using. There are scalability and performance
> improvements. And there are new things on the way (Junio's pickaxe work)
> that will hopefully make git even more useful than it already is.
> 

There are _not_ scalability improvements.  There may be some slight 
performance improvements, but definitely not scalability.  If you have 
ever tried to use git to manage terabytes of data, you will see this 
becomes very clear.  And "rebasing with 3-way merge" is not something 
often used in industry anyway if you've followed the more common models 
for revision control within large companies with thousands of engineers.  
Typically they all work off mainline.

> If you don't think recent git versions are worthwhile, then why don't
> you run an old version? You can even use git to cherry-pick patches onto
> your personal branch.
> 

I do.  And that's why I would recommend that any serious developer use 
1.2.4, the same version that I used for kernel development at Google.

> Where?
> 

A few months back here on the mailing list.  When I tried cleaning up even 
one program, I got the response back from the original author "why fix a 
non-problem?" because his argument was that since it worked the code 
doesn't matter.

	http://marc.theaimsgroup.com/?l=git&m=115589472706036

And that is simply one thread of larger conversations that have taken 
place off-list and aren't archived.

> I don't agree, but since you haven't provided anything specific enough
> to discuss, there's not much to say.
> 

If there's a question about some of the sloppiness in the git source code 
as it stands today, that's a much bigger issue than the sloppiness.  My 
advice would be to pick up a copy of K&R's 2nd edition C programming 
language book, read it, and then take a tour of the source code.

> Can you name one customization that you would like to perform now that
> you feel can't be easily done (and presumably that would have been
> easier in the past)?
> 

Yes, those mentioned above.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-23 23:24                                                                   ` Linus Torvalds
                                                                                       ` (2 preceding siblings ...)
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
@ 2006-10-25 18:41                                                                     ` Aaron Bentley
  3 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-25 18:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Erik Bågfors, bazaar-ng, git, Jakub Narebski

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> 
> On Tue, 24 Oct 2006, Erik Bågfors wrote:
> 
>>I don't see any problem doing a "gitk --all" equivalent in bzr.
> 
> 
> The problem? How do you show a commit that is _common_ to two branches, 
> but has different revision names in them?

If you're talking about the old-style single-integer revnos, each
revision only has one of those, because that revision dictates the path
you must take to the origin when determining its revno.  Many others may
share that revno, but each revision has only one.

The new-style dotted-series-of-ints revnos, I agree, will change.
They're not something I use.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFP6/B0F+nu1YWqI0RAs76AJ9nE4BnL2tLDPQwqjQvCi6okDTdpQCdFQ9V
GoL1BWO+L2FxjLjRrCjKtuY=
=yQ6t

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 10:52                                                               ` Andreas Ericsson
@ 2006-10-25 19:53                                                                 ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-25 19:53 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: git

Andreas Ericsson <ae@op5.se> writes:

> See the attached screenshot. This is from qgit --all on the git
> repository, but the DAG output is identical to that of gitk. Note in
> particular the 'pu' and 'next' branches. By scrolling down, I can
> easily see the branch-point of any of them.

Looking at this picture I noticed the lack of circles or
rectangles on six commits near the tip of "pu" branch.  Nobody
should be doing an Octopus so it might be a non-issue, but
somehow it looks fishy.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 17:21                                       ` David Rientjes
@ 2006-10-25 21:03                                         ` Jeff King
  2006-10-26 11:15                                         ` Andreas Ericsson
  1 sibling, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-25 21:03 UTC (permalink / raw)
  To: David Rientjes; +Cc: Linus Torvalds, Lachlan Patrick, bazaar-ng, git

On Wed, Oct 25, 2006 at 10:21:42AM -0700, David Rientjes wrote:

> Yes, it does.  I'll give you an example from six months ago: there was a 

First off, thanks for giving examples. I was having trouble seeing where
you were coming from.

> need for the group that I work with to support a faster type of hashing 
> function for whatever reason.  This would have been simple with previous 
> versions of git, but if you've ever looked at the SHA1 code in git, you'll 
> realize that you're probably better off never trying to touch it.  There 
> is absolutely _no_ abstraction of it at all and the code is so deeply 
> coupled in the source that abstracting it away is a pain.

Is this really an artifact of the C code versus the shell code? A lot of
parts of the system need to touch SHA1 hashes, and I think it has been
sprinkled throughout the code from the beginning. In fact, I think the
libification of git-rev-list has made the code a lot _cleaner_ (and
shorter), in that the C programs can all use the same nice interface.
The external interface is still there, but now there is consistency
among programs when using rev syntax (ISTR issues in the distant past
where program X didn't understand syntax because the parsing was all
done ad-hoc).

> Likewise, there is always room for personal or organizational tweaks on 
> the part of the developer.  Things like distributed pulling and 
> merging should actually be pretty simple to implement if the complexity 
> wasn't so high in the merge-* family.  This is something I implemented 
> after an enormous headache because we were dealing with very large 
> projects: yes, larger than the Linux kernel.  And this is _exactly_ where 
> piping would help; we have implementations of distributed grep over very 
> large datasets (on the order of terabytes).

I guess I don't see how this was ever any easier. Do you mean that when
we called an external grep, it was easier to plug in your distributed
grep?

> > You can do the same thing in C. In fact, look at how similar
> > git-whatchanged, git-log, and git-diff are.
> No you can't.


The "same thing" I referred to was changing behavior trivially based on
the program name. So yes, you can.
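
A minimal sketch of that idea, written in shell for brevity rather than
C. The command names `demo-log` and `demo-whatchanged` and the option
strings are hypothetical and are not how git itself implements this;
the point is only that one body of code can look at the name it was
invoked under and pick different defaults:

```shell
#!/bin/sh
# Illustrative only: one implementation shared by several command names.
# default_opts prints the default options for the name it is given.
default_opts() {
    case "$(basename "$1")" in
        demo-whatchanged) echo "--raw" ;;      # hypothetical default
        demo-log)         echo "--pretty" ;;   # hypothetical default
        *)                echo "" ;;           # unknown name: no defaults
    esac
}

# In a real script the name would come from "$0", for example:
#   opts=$(default_opts "$0")
```

Installing the same file under two names (or symlinking it) would then
give two commands that differ only in their defaults.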

> Making a one line addition, commenting out a line, or changing a
> simple flag in a shell script is much easier.  And like I already

Sure, shell can be easier to modify (though in well-written C, you're
likely just commenting out a few lines or a function call -- maybe you
can argue whether or not git is well-written). However, I remain
unconvinced that this is a common use case, or that it is something that
should weigh heavily when compared with portability, efficiency, or
robustness concerns.

> It's not, it's related to the original vision of git which was meant for 
> efficiency and simplicity.


Simplicity is fine if all you want is plumbing. But normal people want
to _use_ git without hacking their own shell scripts, so it makes sense
to provide the scripts that other people have hacked together (as shell,
perl, C, or whatever). Do I want to use git-send-email? Hell no, the
interface is terrible to me. But do the plumbing commands still exist so
that I can use the scripts I hacked together? Absolutely. I can take
what I want and leave the rest.

> A year ago it was very easy to pick up the package and start using it
> effectively within a couple hours.  Keep in mind that this was without

Was it? The most common complaint I've heard about git, starting a year
ago, was the lack of documentation and tutorials and the complexity of
use.

> tutorials, it was just reading man pages.  Today it would be very
> difficult to know what the essential commands are and how to use them
> simply to get the job done, unless you use the tutorials.  This

I think this has been the case for a long time. It's just that there
_weren't_ tutorials back then.

> Have you never tried to show other people git without giving them a 
> tutorial on the most common uses?  Try it and you'll see the confusion.  
> That _specifically_ illustrates the ever-increasing lack of simplicity 
> that git has acquired.

No, it illustrates a lack of simplicity that currently exists; it says
_nothing_ about the change in simplicity over time.

> There are _not_ scalability improvements.  There may be some slight 
> performance improvements, but definitely not scalability.  If you have 
> ever tried to use git to manage terabytes of data, you will see this 

There has been work on scaling to larger repositories (e.g., mozilla and
xorg prompting work/discussion on cvs importing, subproject/superproject
support, shallow clones, etc), but not on terabyte scales. I realize
that might not help you, but it is helping a lot of people. Quite
honestly, git is focused on SOURCE CODE MANAGEMENT, not terabytes of
data. Perhaps that is your true complaint: git is developing tools for
working with source code, potentially at the loss of some generality
(though I tend to think it hasn't lost generality, but rather it hasn't
gained).

> becomes very clear.  And "rebasing with 3-way merge" is not something 
> often used in industry anyway if you've followed the more common models 
> for revision control within large companies with thousands of engineers.  
> Typically they all work off mainline.

My point isn't that every feature is useful to every developer. My point
is that just because features aren't useful to _you_ doesn't mean
they're not useful at all.

And if you want to talk about industry standard, didn't the discussion
start off with your complaint about porting to Windows? An
industry-standard SCM needs to be cross-platform across the major
operating systems.

> A few months back here on the mailing list.  When I tried cleaning up even 
> one program, I got the response back from the original author "why fix a 
> non-problem?" because his argument was that since it worked the code 
> doesn't matter.

I remember a big discussion about the order of arguments in relational
expressions. Git may have problems, but I just don't see coding style
nitpicks as a priority.

Abstracting the hashing might be worthwhile, but the list consensus was
that it's not worth the work unless we're actually going to _do_
something with the abstraction.  Your argument seems to be that you
_are_ doing something with the abstraction on your own. If you want to
convince the git developers that this is a worthwhile direction, then
show some code which uses it.

> 	http://marc.theaimsgroup.com/?l=git&m=115589472706036

OK, I remember this particular discussion. And I just read through to
the end of the thread; it looks like Junio ended up with "this code is
ugly; fix it" and Johannes did.

It sounds like your real beef was that you want to use some alternate
"mv" command that handles your data set better, and having git-mv as a
shell-script would make that simpler for you.  Well, it isn't a shell
script and it never was. If you want to write it as one, I imagine it
would be considered for inclusion (though I expect the C version may
have some advantages, such as atomicity of file movement and index
updating).


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  8:48                                 ` Jeff King
       [not found]                                   ` < Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
  2006-10-25  9:19                                   ` David Rientjes
@ 2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
                                                       ` (2 more replies)
  2 siblings, 3 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:08 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, David Rientjes, bazaar-ng, git

Jeff King <peff@peff.net> writes:

> On Tue, Oct 24, 2006 at 01:12:52PM -0700, David Rientjes wrote:
>
>> And I would prefer the opposite because we're talking about git.  As an 
>> information manager, it should be seen and not heard.  Nobody is going to 
>> spend their time to become a git or CVS or perforce expert.  As an 
>> individual primarily interested in development, I should not be required 
>> to learn command lines for dozens of different git-specific commands to do 
>> my job quickly and effectively.  I would opt for a much simpler 
>> approach and deal with shell scripting for many of these commands because 
>> I'm familiar with them and I can pipe any command with the options I 
>> already know and have used before to any other command.
>
> I don't understand how converting shell scripts to C has any impact
> whatsoever on the usage of git. The plumbing shell scripts didn't go
> away; you can still call them and they behave identically.
>
> Is there some specific change in functionality that you're lamenting?

That's also what I wondered, but I also can understand where David is
coming from, and I agree with him to a certain degree.

When I learned git, I learned a lot from trying to piece my own
plumbing together, since there wasn't much Porcelain to speak
of back then.  Then we had many usability enhancements before
the 1.0 release that added Porcelainish commands done as shell scripts.

This had two positive effects, aside from adding usability.
Interested people had more shell scripts to learn from.  The
scripts were easy to adjust to feature requests from the list,
and as we learned from user experience based on these scripts it
was definitely quicker to codify the best current practice
workflow in them than if they were written in C.  It would have
taken us a lot more effort to add "git commit -o paths" vs "git
commit -i paths" if it were already converted to C, for example.
This continued and our Porcelainish scripts matured quickly.

Then 1.3 series started to move some of the mature ones into C.
As many people already have pointed out, being written in C and
not doing pipe() has two advantages (better portability to
platforms with awkward pipe support, and one less process, which
usually means better performance).  The git-log family with path
limiting had a real boost in performance because the path limiting
can be done on the revision traversal side, not in diff-tree, which
used to be on the downstream side of the pipe.  So overall this was
the right thing to do.

One thing we lost during the process, however, is ready access
to the pool of "sample scripts" when people want to
scratch their own itches.  Linus's original tutorial talked
about "this pattern of pipe is so useful that we have a three
liner shell script wrapper that is called git-foo", and
interested people could easily look at how the plumbing commands
fit together.

The plumbing is still there, and I and people who already know
git would still script around git-rev-list when we need to (by
the way, scripting around git-log is a wrong thing to do -- it
is for human consumption and scripting should be done with
plumbing).  But when we rewrote mature ones in C (and I keep
stressing "mature" because another thing I agree with David is
that shell is definitely easier to futz with), we did not leave
the older shell implementation around as reference.  People
coming to git after the 1.3 series certainly do have a harder time
learning how plumbing would fit together than when git old-timers
learned it, if that is the area they are interested in, as
opposed to just using git as a revision tracking system.
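
The rev-list scripting mentioned above can be sketched as follows. A
script that needs the commits touching a path reads bare commit IDs
from `git rev-list` instead of parsing `git log` output; the function
name `commits_touching` is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the IDs of commits reachable from HEAD
# that touched the given path, oldest first, one ID per line.
commits_touching() {
    git rev-list --reverse HEAD -- "$1"
}

# Example use inside a repository:
#   commits_touching Makefile | while read -r id; do
#       echo "Makefile changed in $id"
#   done
```

Each output line is exactly one commit ID, so downstream commands do
not depend on how the human-oriented log output is laid out.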

We could probably do two things to address this issue:

 - Create examples/ hierarchy in the source tree to house these
   historical implementations as a reference material, or an
   entirely different branch or repository to house them.

 - Learn the itches David and other people have, that the
   current git Porcelain-ish does not scratch well, and enrich
   Documentation/technical with real-world working scripts built
   around plumbing.






^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
@ 2006-10-25 21:16                                     ` Jeff King
  2006-10-25 21:32                                       ` Junio C Hamano
  2006-10-25 21:50                                     ` Junio C Hamano
  2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Jeff King @ 2006-10-25 21:16 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, bazaar-ng, Linus Torvalds, Lachlan Patrick, David Rientjes

On Wed, Oct 25, 2006 at 02:08:20PM -0700, Junio C Hamano wrote:

> the older shell implementation around as reference.  People
> coming to git after the 1.3 series certainly do have a harder time
> learning how plumbing would fit together than when git old-timers
> learned it, if that is the area they are interested in, as
> opposed to just using git as a revision tracking system.

I think this is part of the complication in the discussion I'm having with
David. There are really two sets of users for git: people who want to
hack scripts based on plumbing, and people who want everything to "just
work." I think it's a good point that as the system matures (movement
to C and growth of complexity), it might become less easy to hack.

>  - Create examples/ hierarchy in the source tree to house these
>    historical implementations as a reference material, or an
>    entirely different branch or repository to house them.

Housing historical implementations seems like it would just lead to
out-of-date and non-functional examples.

>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.

I think this is a better approach. I think it also makes sense to
let people know that it's an acceptable approach to start new features
as shell and then have them mature to C (looking at the current
codebase, and some of Dscho's rantings, one might get the impression
that git isn't accepting new shell scripts).


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:16                                     ` Jeff King
@ 2006-10-25 21:32                                       ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:32 UTC (permalink / raw)
  To: Jeff King; +Cc: git

Jeff King <peff@peff.net> writes:

> Housing historical implementations seems like it would just lead to
> out-of-date and non-functional examples.

I agree.  Although that ought to be rare in principle, given
that one advertised feature of git is that the plumbing is
supposed to be stable, we occasionally have had to subtly
break things to improve plumbing and at the same time run around
to make sure that all the script users (both in-tree and
out-of-tree like Cogito, gitweb and StGIT) are updated.

>>  - Learn the itches David and other people have, that the
>>    current git Porcelain-ish does not scratch well, and enrich
>>    Documentation/technical with real-world working scripts built
>>    around plumbing.
>
> I think this is a better approach. I think it also makes sense to
> let people know that it's an acceptable approach to start new features
> as shell and then have them mature to C (looking at the current
> codebase, and some of Dscho's rantings, one might get the impression
> that git isn't accepting new shell scripts).

New commands like pickaxe and for-each-ref were easier to code
in C, and the rewrite of cherry in C was really about how crufty the
shell script version was from the beginning (there were no
in-tree users of it left, so it was not maintained at all, but
thanks to plumbing being stable it just kept working, perhaps
correctly but still horribly).

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
@ 2006-10-25 21:50                                     ` Junio C Hamano
  2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-10-25 21:50 UTC (permalink / raw)
  To: git

Junio C Hamano <junkio@cox.net> writes:

>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.

I meant "Documentation/howto"; sorry for the noise.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 13:49                                       ` Andreas Ericsson
@ 2006-10-25 21:51                                         ` David Lang
  2006-10-25 22:15                                           ` Shawn Pearce
  0 siblings, 1 reply; 806+ messages in thread
From: David Lang @ 2006-10-25 21:51 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: Jeff King, David Rientjes, Linus Torvalds, Lachlan Patrick,
	bazaar-ng, git

a quick lesson on program naming

On Wed, 25 Oct 2006, Andreas Ericsson wrote:

> I'm personally all for a rewrite of the necessary commands in C ("commit" 
> comes to mind), but as many others, I have no personal interest in doing the 
> actual work. I'm fairly certain that once we get it working natively on 
> windows with some decent performance, windows hackers will pick up the ball 
> and write "wingit", which will be a log viewer and GUI thing for
              ^^^^^^

how many other people read this as 'wing it' rather than 'win git'? ;-)

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:51                                         ` David Lang
@ 2006-10-25 22:15                                           ` Shawn Pearce
  2006-10-25 22:29                                             ` Jakub Narebski
  2006-10-25 22:41                                             ` David Lang
  0 siblings, 2 replies; 806+ messages in thread
From: Shawn Pearce @ 2006-10-25 22:15 UTC (permalink / raw)
  To: David Lang; +Cc: git

David Lang <dlang@digitalinsight.com> wrote:
> a quick lesson on program naming
> 
> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
> 
> >I'm personally all for a rewrite of the necessary commands in C ("commit" 
> >comes to mind), but as many others, I have no personal interest in doing 
> >the actual work. I'm fairly certain that once we get it working natively 
> >on windows with some decent performance, windows hackers will pick up the 
> >ball and write "wingit", which will be a log viewer and GUI thing for
>              ^^^^^^
> 
> how many other people read this as 'wing it' rather than 'win git'? ;-)

Yes, that's certainly a less than optimal name...

What about gitk?  Is it "gi tk" or "git k" ?  This has actually
been the source of much local debate.  :-)

-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:15                                           ` Shawn Pearce
@ 2006-10-25 22:29                                             ` Jakub Narebski
  2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 22:41                                             ` David Lang
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-25 22:29 UTC (permalink / raw)
  To: git

Shawn Pearce wrote:

> David Lang <dlang@digitalinsight.com> wrote:
>> a quick lesson on program naming
>> 
>> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
>> 
>> >I'm personally all for a rewrite of the necessary commands in C ("commit" 
>> >comes to mind), but as many others, I have no personal interest in doing 
>> >the actual work. I'm fairly certain that once we get it working natively 
>> >on windows with some decent performance, windows hackers will pick up the 
>> >ball and write "wingit", which will be a log viewer and GUI thing for
>>              ^^^^^^
>> 
>> how many other people read this as 'wing it' rather than 'win git'? ;-)
> 
> Yes, that's certainly a less than optimal name...
> 
> What about gitk?  Is it "gi tk" or "git k" ?  This has actually
> been the source of much local debate.  :-)

You can always use CamelCase, i.e. WinGit or WinGIT (or wgit,
but this is also silly).

Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
not ggit, curiously ;-) and tig.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25  0:27                                                                         ` Matthew D. Fuller
@ 2006-10-25 22:40                                                                           ` David Lang
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
  2006-10-30 21:46                                                                             ` Jan Hudec
  0 siblings, 2 replies; 806+ messages in thread
From: David Lang @ 2006-10-25 22:40 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: Linus Torvalds, bazaar-ng, git

On Tue, 24 Oct 2006, Matthew D. Fuller wrote:

> On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
> David Lang, and lo! it spake thus:
>>
>> it sounded like you were saying that the way to get the slices of
>> the DAG was to use branches in bzr. [...]
>
> I'm not entirely sure I understand what you mean here, but I think
> you're saying "Nobody's written the code in bzr to show arbitrary
> slices of the DAG", which is true TTBOMK.

I think we are talking past each other here.

what I think was said was

G 'one feature of git is that you can view arbitrary slices trivially'

B 'bzr can do this too, you just use branches to define the slices'

G 'but this limits you because branches are defined as code is developed, git 
lets you define slices at viewing time'

by the way, I think it's more than just saying 'well, the code could be written 
to do this in $VCS'; some decisions and standard ways of doing things can impact 
how hard it is to implement a feature, and some decisions can make it 
impossible (without doing unexpected things).

>
>> everyone agrees that bzr supports the Star topology. Most people
>> (including bzr people) seem to agree that currently bzr does not
>> support the Distributed topology.
>
> I think this statement arouses so much grumbling because (a) bzr does
> support such a lot better than often seems implied, (b) where it
> doesn't, the changes needed to do so are relatively minor (often
> merely cosmetic), and (c) disagreement over whether some of the
> qualifications included for 'distributed' are really fundamental.
>
>
>> it's just fine for bzr to not support all possible topologies,
>
> I think there's a real intent for bzr TO support at least all common
> topologies.  I'll buy that current development has focused more on
> [relatively] simple topologies than the more wildly complex ones.  I
> look forward to more addressing of the less common cases as the tool
> matures, and I think a lot of this thread will be good material to
> work with as that happens.  It's just the suggestion that providing
> fruit for simple topologies _necessarily_ prejudices against complex
> ones that I find so onerous.

one concern that the git people are voicing is that the things that work for 
simple topologies (revno's) can't be used with the more complex ones (where you 
need the refid's), and especially that users need to do things 
significantly differently when there are fairly subtle changes to the topology.

the scenario that came up elsewhere today where you have

    Master
    /    \
dev1   dev2

and then dev1 and dev2 both start working on the same thing (without knowing 
it), then discover they are working on the same thing. they now have three 
options

1. merge their stuff up to the master so that they can both pull it down.
   but this puts broken, experimental stuff up in the master

2. declare one of the dev trees to be the master

this changes the topology to

Master--dev1--dev2

3. pull from each other frequently to keep in sync.

this changes the topology to

    Master
    /   \
dev1--dev2

if they do this with bzr then the revno's break, they each get extra commits 
showing up (so they can never show the same history).

in git this is a non-issue, they can pull back and forth and the only new 
history to show up will be changes.
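
To make the revno point concrete, here is a minimal sketch. It assumes a
simplified model in which a revno is just a revision's position on the
chain of first (leftmost) parents leading to a branch tip; it is not bzr's
actual code, and the names `mainline` and `revnos` are made up for the
illustration.

```python
def mainline(parents, tip):
    """Follow first (leftmost) parents from tip back to the root."""
    line = []
    while tip is not None:
        line.append(tip)
        ps = parents[tip]
        tip = ps[0] if ps else None
    return line[::-1]


def revnos(parents, tip):
    """Number the mainline revisions 1, 2, 3, ... (simplified model)."""
    return {n: rev for n, rev in enumerate(mainline(parents, tip), start=1)}


# dev1 and dev2 both branch from M, commit once, then merge each other.
PARENTS = {
    "M": (),
    "a1": ("M",),          # dev1's commit
    "b1": ("M",),          # dev2's commit
    "m1": ("a1", "b1"),    # dev1 merges dev2: a1 stays leftmost
    "m2": ("b1", "a1"),    # dev2 merges dev1: b1 stays leftmost
}

if __name__ == "__main__":
    print("dev1:", revnos(PARENTS, "m1"))
    print("dev2:", revnos(PARENTS, "m2"))
```

In this model the two developers hold the same set of changes, yet number 2
names a1 for one of them and b1 for the other, so only the revision
identifiers are safe to exchange between them.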

this is the situation that the kernel developers are in frequently. it sounds as 
if you haven't needed to do this yet, so you haven't encountered the problems.

David Lang

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:15                                           ` Shawn Pearce
  2006-10-25 22:29                                             ` Jakub Narebski
@ 2006-10-25 22:41                                             ` David Lang
  1 sibling, 0 replies; 806+ messages in thread
From: David Lang @ 2006-10-25 22:41 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

On Wed, 25 Oct 2006, Shawn Pearce wrote:

> David Lang <dlang@digitalinsight.com> wrote:
>> a quick lesson on program naming
>>
>> On Wed, 25 Oct 2006, Andreas Ericsson wrote:
>>
>>> I'm personally all for a rewrite of the necessary commands in C ("commit"
>>> comes to mind), but as many others, I have no personal interest in doing
>>> the actual work. I'm fairly certain that once we get it working natively
>>> on windows with some decent performance, windows hackers will pick up the
>>> ball and write "wingit", which will be a log viewer and GUI thing for
>>              ^^^^^^
>>
>> how many other people read this as 'wing it' rather than 'win git'? ;-)
>
> Yes, that's certainly a less than optimal name...
>
> What about gitk?  Is it "gi tk" or "git k" ?  This has actually
> been the source of much local debate.  :-)

in this case I think it's both (or technically git tk, with the double t's 
combined to save typing)


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:29                                             ` Jakub Narebski
@ 2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 23:15                                                 ` Jakub Narebski
  2006-10-26  1:06                                                 ` Horst H. von Brand
  0 siblings, 2 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-25 22:44 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
where Jakub Narebski <jnareb@gmail.com> said that...
> Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
> not ggit, curiously ;-) and tig.

wit?

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
#!/bin/perl -sp0777i<X+d*lMLa^*lN%0]dsXx++lMlN/dsM0<j]dsj
$/=unpack('H*',$_);$_=`echo 16dio\U$k"SK$/SM$n\EsN0p[lN*1

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:44                                               ` Petr Baudis
@ 2006-10-25 23:15                                                 ` Jakub Narebski
  2006-10-26  1:06                                                 ` Horst H. von Brand
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-25 23:15 UTC (permalink / raw)
  To: git

Petr Baudis wrote:

> Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
>> Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
>> not ggit, curiously ;-) and tig.
> 
> wit?

Taken.

wit: a Python web interface to git maintained by Christian Meder.
Example site on http://www.grmso.net:8090/ . It uses PATH_INFO
much more than gitweb (which uses CGI parameters mostly, but also
supports multiple projects).

Well, not maintained if http://www.absolutegiganten.org/wit/
is any indicator:

  wit-0.0.4.tar.gz        08-Sep-2005

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:40                                                                           ` David Lang
@ 2006-10-25 23:53                                                                             ` Matthew D. Fuller
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-30 21:46                                                                             ` Jan Hudec
  1 sibling, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-25 23:53 UTC (permalink / raw)
  To: David Lang; +Cc: bazaar-ng, git

On Wed, Oct 25, 2006 at 03:40:00PM -0700 I heard the voice of
David Lang, and lo! it spake thus:
> 
> I think we are talking past each other here.
> 
> what I think was said was
> 
> G 'one feature of git is that you can view arbitrary slices
> trivially'
> 
> B 'bzr can do this too, you just use branches to define the slices'

Ah.  This is more like "bzr [mostly] only does this now in terms of a
single branch (or some point back along it)".  The slices that go
between branches are very limited ('missing' gives you one view;
'branch:' and 'ancestor:' revision specifications give you another).
bzrk/'visualize' gives an interface similar to gitk, but also only in
the context of a single branch/head looking backward through its
previous tree AFAIK.  Any random DAG-slicing of what you have in the
revision store can be done, somebody would just have to write the code
for it.  Nothing about 'the workflow preserves parents' would make
that any harder than writing the code for git was.

Much of this is probably a result of the 'branch'-centric (rather than
'repository'-centric) view of the world; similarly to the fact that
branches are referred to by location (local ../otherbranch, or remote
http/sftp/etc) rather than by a name.  This is one of the bits of bzr
I'm personally somewhat ambivalent about.


> they now have three options

Those certainly aren't the only choices, but to stay OT:

> 3. pull from each other frequently to keep in sync.
> 
> this changes the topology to
> 
>    Master
>    /   \
>  dev1--dev2
> 
> if they do this with bzr then the revno's break, they each get extra
> commits showing up (so they can never show the same history).

These two are either/or, not and; either they pull (in which case
their old mainline is no longer meaningful), or they merge (in which
case they get the 'extra' merge commits).


> in git this is a non-issue, they can pull back and forth and the
> only new history to show up will be changes.

In git, this is a non-issue because you don't get to CHOOSE which way
to work.  You always (if you can) pull and obliterate your local
mainline.  In bzr, it's only an 'issue' because you CAN choose, and
CAN maintain your local mainline.  You CAN choose, right now, to do a
git and pull back and forth and only new history show up as changed by
creating a 'bzr-pull' shell script that does a 'bzr pull || bzr merge'
(though you'd be a lot better off adding a '--fast-forward-if-you-can'
option to merge and aliasing that over).
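
The "pull if you can, otherwise merge" behaviour can be sketched on a toy
DAG. This is only an illustration under stated assumptions: the history is
a plain dict mapping each revision to its parents, and `is_ancestor` and
`pull_or_merge` are hypothetical helper names, not commands or APIs of bzr
or git.

```python
def is_ancestor(parents, maybe_ancestor, commit):
    """True if maybe_ancestor is reachable from commit via parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == maybe_ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False


def pull_or_merge(parents, local_tip, remote_tip):
    """Return (new_tip, action) for bringing remote_tip into local_tip."""
    if is_ancestor(parents, remote_tip, local_tip):
        return local_tip, "up-to-date"        # nothing new on the other side
    if is_ancestor(parents, local_tip, remote_tip):
        return remote_tip, "fast-forward"     # no local work: just move the tip
    merge = f"merge({local_tip},{remote_tip})"
    parents[merge] = (local_tip, remote_tip)  # local tip stays the first parent
    return merge, "merge"


if __name__ == "__main__":
    dag = {"M": (), "a1": ("M",), "b1": ("M",)}
    print(pull_or_merge(dag, "M", "a1"))
    print(pull_or_merge(dag, "a1", "b1"))
    print(pull_or_merge(dag, "b1", "merge(a1,b1)"))
```

The third call shows the case the thread keeps returning to: once one side
has merged, the other side can fast-forward onto that merge and both end up
at the same tip without a second merge revision.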

More basically, though, I don't think that "histories become exactly
equivalent" is a necessary pass-word to enter the Hallowed City of
Truly Distributed Development.  And I certainly see no reason to
believe we'll agree on it this time any more than We (in broad) have
the last 6 times it came up in the thread.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:44                                               ` Petr Baudis
  2006-10-25 23:15                                                 ` Jakub Narebski
@ 2006-10-26  1:06                                                 ` Horst H. von Brand
  1 sibling, 0 replies; 806+ messages in thread
From: Horst H. von Brand @ 2006-10-26  1:06 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Jakub Narebski, git

Petr Baudis <pasky@suse.cz> wrote:
> Dear diary, on Thu, Oct 26, 2006 at 12:29:17AM CEST, I got a letter
> where Jakub Narebski <jnareb@gmail.com> said that...
> > Cute names are taken: CoGITo, gitk, qgit (GTK+ history viewer is gitview,
> > not ggit, curiously ;-) and tig.
> 
> wit?

Wig. 
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                    Fono: +56 32 2654431
Universidad Tecnica Federico Santa Maria             +56 32 2654239

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  6:45                           ` David Rientjes
                                               ` (2 preceding siblings ...)
  2006-10-24 15:15                             ` Linus Torvalds
@ 2006-10-26  2:29                             ` Linus Torvalds
  3 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-26  2:29 UTC (permalink / raw)
  To: David Rientjes; +Cc: Lachlan Patrick, bazaar-ng, git



On Mon, 23 Oct 2006, David Rientjes wrote:
> 
> Some of the internal commands that have been coded in C are actually much 
> better handled by the shell in the first place.

Others have answered this, but the thing is, it was a _wonderful_ way to 
prototype things, and to add obvious (and nice) early UI issues that made 
git much more usable.

But no, things are not better handled in shell.

Shell tends to make some things really _hard_ to do. A fair chunk of the 
rewrite was because core functionality made things easier. For example, 
the whole internal revision parsing library is really actually a lot more 
capable than we could easily expose as a simple pipeline: the original 
"git log" pipeline worked very well, and you can actually still use those 
kinds of pipelines for a lot of work, but at the same time, some things 
really just work better when you have "deeper" interfaces.

For example, the revision parsing library not only makes "git log" trivial 
as C, it's also needed for an efficient "git annotate/blame/pickaxe" kind 
of thing. There are also things that are just ludicrously hard to do in 
shell-script, like exclusive and atomic file operations.
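
As an illustration of what "exclusive and atomic" means here, the following
is a minimal sketch of the usual lock-file pattern: create `<file>.lock`
exclusively, write the new content there, then rename it over the target.
It is written in Python rather than C, the function name `atomic_write` is
made up for the example, and it should not be read as git's actual
implementation.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path with data so readers see the old or the new content."""
    lock_path = path + ".lock"
    # O_CREAT | O_EXCL fails with FileExistsError if the lock file already
    # exists, so only one writer at a time gets past this line.
    fd = os.open(lock_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Renaming within one filesystem swaps the content in a single step.
        os.replace(lock_path, path)
    except BaseException:
        # We created the lock, so remove it if anything went wrong.
        if os.path.exists(lock_path):
            os.unlink(lock_path)
        raise


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "HEAD")
        atomic_write(target, b"ref: refs/heads/master\n")
        with open(target, "rb") as f:
            print(f.read())
```

A second writer that finds the lock file already present fails immediately
instead of interleaving its output with the first one, which is the part
that is awkward to get right from a shell script.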

We used perl and python for some things, but finding people who know them 
tends to be problematic, and python in particular was a dependency 
problem too, so the fact that the default recursive merge was python 
wasn't wonderful.

So I think the shell-scripts are great (and some of them quite likely will 
remain around for the foreseeable future) for prototyping, but for core 
functionality they were not wonderful. 

They are sometimes good examples of how powerful a scripting language git 
can be, though. Scripting is still very important, even though a lot of 
the core stuff doesn't necessarily depend on being scripts itself. 

But error handling in scripting is very hard or inconvenient, especially 
in pipelines. So some things were actively problematic (ie "git-rev-list 
--all --objects | git-pack-objects") and moving it to use the internal 
library interface was simply technically the right thing to do.
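
The pipeline problem can be shown with a small sketch. In a plain shell
pipeline the exit status is normally that of the last command only, so a
producer that dies halfway goes unnoticed. The Python below is an
illustration, not anything git ships: `run_pipeline` is a made-up helper,
and the two child processes stand in for a producer and a consumer such as
the pair named above.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Run 'producer | consumer'; return (producer_rc, consumer_rc, output)."""
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Drop our copy of the pipe so the producer sees a closed pipe if the
    # consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer_rc = producer.wait()
    return producer_rc, consumer.returncode, output


# Stand-ins: the producer emits partial output and then fails; the consumer
# copies whatever it receives and exits successfully.
FAILING_PRODUCER = [sys.executable, "-c",
                    "import sys; print('partial'); sys.exit(3)"]
COPYING_CONSUMER = [sys.executable, "-c",
                    "import sys; sys.stdout.write(sys.stdin.read())"]

if __name__ == "__main__":
    p_rc, c_rc, out = run_pipeline(FAILING_PRODUCER, COPYING_CONSUMER)
    print("producer exit:", p_rc, "consumer exit:", c_rc, "output:", out)
```

The consumer reports success even though its input was truncated; the
caller has to check both statuses explicitly. A single process calling a
library function gets that error back directly.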

Others had real performance issues, eg the new merge in C is a lot faster. 
It was fast before, it's much faster still.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 15:54                                                       ` Carl Worth
@ 2006-10-26  8:52                                                         ` James Henstridge
  2006-10-26  9:33                                                           ` Junio C Hamano
  2006-10-26  9:50                                                           ` VCS comparison table Andreas Ericsson
  0 siblings, 2 replies; 806+ messages in thread
From: James Henstridge @ 2006-10-26  8:52 UTC (permalink / raw)
  To: Carl Worth
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Andreas Ericsson,
	git, Jakub Narebski

On 25/10/06, Carl Worth <cworth@cworth.org> wrote:
> On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
> > If there aren't, or you made the merge by mistake, you can make a call
> > to "bzr revert" to clean things up without ever having created a new
> > revision.
>
> One result of this approach is that developers of different trees
> don't necessarily have common revision IDs to compare. Imagine a
> question like:
>
>         When you ran that test did you have the same code I've got?
>
> In git, the answer would be determined by comparing revision IDs.

Can you really just rely on equal revision IDs meaning you have the
same code though?

Let's say that I clone your git repository, and then we both merge the
same diverged branch.  Will our head revision IDs match?  From a quick
look at the logs of cairo, it seems that the commits generated for
such a merge include the date and author, so the two commits would
have different SHA1 sums (and hence different revision IDs).

So I'd have a revision you don't have and vice versa, even though the
trees are identical.


> In bzr, the only answer I'm hearing is attempting a merge to see if it
> introduces any changes. (I'm deliberately avoiding "pull" since we're
> talking about distributed cases here).

Or run "bzr missing".  If the sole missing revision is a merge (and
not the revisions introduced by the merge), you could assume that you
have the same tree state.


> And to comment on something mentioned earlier in the thread, there's
> no need for "wildly complex" distributed scenarios. All of these
> issues are present with developers working together as peers, (and
> each considering their own repository as canonical).
>
> A harder question (for bzr) is:
>
>         Do you have all of the history I've got?
>
> (The problem being that when one developer is missing some history and
> merges it in, she necessarily creates new history, so there's never a
> stable point for both sides to agree on.)

Why does it matter if they create a new revision?  They can still tell
if they've got all the history you had.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26  8:52                                                         ` James Henstridge
@ 2006-10-26  9:33                                                           ` Junio C Hamano
  2006-10-26  9:57                                                             ` James Henstridge
  2006-10-26  9:50                                                           ` VCS comparison table Andreas Ericsson
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-10-26  9:33 UTC (permalink / raw)
  To: James Henstridge
  Cc: bazaar-ng, Carl Worth, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, git, Jakub Narebski

"James Henstridge" <james@jamesh.id.au> writes:

> Can you really just rely on equal revision IDs meaning you have the
> same code though?

If you two have the same commit that is a guarantee that you two
have identical trees.  The reverse is not true as logic 101
would teach ;-).

Doing a fast-forward instead of a "useless" merge helps
somewhat but not in cases like two people merging the same
branches the same way or two people applying the same patch on
top of the same commit.  You need to compare tree object IDs for
that.
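
A small sketch of why the commit IDs can differ while the tree IDs agree.
It hashes objects as SHA-1 over a "<type> <size>" header, a NUL byte and
the body; the tree encoding below is limited to a flat directory of regular
files, and the names, e-mail addresses, timestamps and parent IDs are
placeholders invented for the example.

```python
import hashlib


def object_id(kind, body):
    """SHA-1 over '<kind> <size>' + NUL + body."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_id(files):
    """Tree ID for a flat directory given as {name: content_bytes}."""
    body = b""
    for name in sorted(files):
        blob = bytes.fromhex(object_id("blob", files[name]))
        body += b"100644 " + name.encode() + b"\0" + blob
    return object_id("tree", body)


def commit_id(tree, parents, who, when, message):
    """Commit ID covering tree, parents, identity, time and message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who} {when}", f"committer {who} {when}"]
    body = "\n".join(lines) + "\n\n" + message + "\n"
    return object_id("commit", body.encode())


if __name__ == "__main__":
    merged = {"README": b"hello\n"}      # both people end up with this content
    parents = ["1" * 40, "2" * 40]       # placeholder parent IDs
    tree = tree_id(merged)
    alice = commit_id(tree, parents, "Alice <alice@example.org>",
                      "1161856380 +0000", "Merge")
    bob = commit_id(tree, parents, "Bob <bob@example.org>",
                    "1161856999 +0000", "Merge")
    print("tree:", tree)
    print("same commit ID?", alice == bob)
```

Both merges record the same tree ID, so comparing tree IDs answers "do we
have the same code?", while the commit IDs differ because who merged and
when is part of what gets hashed.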

>> In bzr, the only answer I'm hearing is attempting a merge to see if it
>> introduces any changes. (I'm deliberately avoiding "pull" since we're
>> talking about distributed cases here).
>
> Or run "bzr missing".  If the sole missing revision is a merge (and
> not the revisions introduced by the merge), you could assume that you
> have the same tree state.

Is it "you could assume" or "it is guaranteed"?  If former, what
kind of corner cases could invalidate that assumption?


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26  8:52                                                         ` James Henstridge
  2006-10-26  9:33                                                           ` Junio C Hamano
@ 2006-10-26  9:50                                                           ` Andreas Ericsson
  1 sibling, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-26  9:50 UTC (permalink / raw)
  To: James Henstridge
  Cc: Carl Worth, bazaar-ng, Matthew D. Fuller, Linus Torvalds, git,
	Jakub Narebski

James Henstridge wrote:
> On 25/10/06, Carl Worth <cworth@cworth.org> wrote:
>> On Wed, 25 Oct 2006 18:08:22 +0800, "James Henstridge" wrote:
>> > If there aren't, or you made the merge by mistake, you can make a call
>> > to "bzr revert" to clean things up without ever having created a new
>> > revision.
>>
>> One result of this approach is that developers of different trees
>> don't necessarily have common revision IDs to compare. Imagine a
>> question like:
>>
>>         When you ran that test did you have the same code I've got?
>>
>> In git, the answer would be determined by comparing revision IDs.
> 
> Can you really just rely on equal revision IDs meaning you have the
> same code though?
> 

Yes. Because each commit contains parent revision id's, which in turn 
contain *their* parent revision id's, which in turn..., you know you 
have exactly the same revision, code, and history leading up to that 
revision. You may have other revisions on top or on other branches, but 
all commits, including merge-points and whatnot, leading to that 
particular revision id are EXACTLY identical.
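
The chaining argument can be sketched in a few lines. This uses a
deliberately simplified, hypothetical commit format (a hash over a tree
label, the parent IDs and a note) rather than git's real object encoding;
`commit` and `build_chain` are names invented for the illustration.

```python
import hashlib


def commit(tree, parents, note):
    """Simplified commit ID: a hash over tree label, parent IDs and note."""
    body = (f"tree {tree}\n"
            + "".join(f"parent {p}\n" for p in parents)
            + f"\n{note}\n")
    return hashlib.sha1(body.encode()).hexdigest()


def build_chain(notes):
    """Create a linear history, one commit per note; return all IDs in order."""
    ids, parents = [], []
    for n, note in enumerate(notes):
        cid = commit(f"tree-{n}", parents, note)
        ids.append(cid)
        parents = [cid]
    return ids


if __name__ == "__main__":
    alice = build_chain(["init", "fix", "feature"])
    bob = build_chain(["init", "fix", "feature"])
    carol = build_chain(["init (edited)", "fix", "feature"])
    print("alice and bob agree on the tip:  ", alice[-1] == bob[-1])
    print("alice and carol agree on the tip:", alice[-1] == carol[-1])
```

Because every ID is computed over its parents' IDs, a difference anywhere
in the history changes every later ID, which is why matching tip IDs are
enough to conclude that the whole history behind them matches.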

> Let's say that I clone your git repository, and then we both merge the
> same diverged branch.  Will our head revision IDs match?  From a quick
> look at the logs of cairo, it seems that the commits generated for
> such a merge include the date and author, so the two commits would
> have different SHA1 sums (and hence different revision IDs).
> 
> So I'd have a revision you don't have and vice versa, even though the
> trees are identical.
> 

Merges preserve author and commit info. You may need to create a new 
branch (a git branch, the cheap kind which is a 41-byte file) and fetch 
"his" into "yours". This will be very cheap if you both have the same 
code but not the same history, as everything but a few commit-objects 
will be shared. A more likely scenario though is this;

Bob writes a feature that doesn't work as per spec. He doesn't know why.
He asks Alice to have a look, so he communicates the commits to her by 
"please pull this branch from here", or by sending patches and telling 
Alice the branch-point revision to apply them to.
Alice creates the "bobs-bugs/nr1232" branch at the branch-point and fetches 
Bob's branch into that or applies the patches on top of that (in the 
fetch scenario she wouldn't need to know the branch point, since git 
would figure this out for her).
She knows this should create a revision named 00123989aaddeddad39, so if 
it doesn't, she doesn't have the same code.


I imagine this works roughly the same in bazaar, although the original 
case where tests have already been done and the testers wanted to know 
if they had the exact same revision Just Works in git.

> 
>> In bzr, the only answer I'm hearing is attempting a merge to see if it
>> introduces any changes. (I'm deliberately avoiding "pull" since we're
>> talking about distributed cases here).
> 
> Or run "bzr missing".  If the sole missing revision is a merge (and
> not the revisions introduced by the merge), you could assume that you
> have the same tree state.
> 

"assume" != "know", or was that just sloppy phrasing?

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26  9:33                                                           ` Junio C Hamano
@ 2006-10-26  9:57                                                             ` James Henstridge
  2006-10-26 10:10                                                               ` Jeff King
  0 siblings, 1 reply; 806+ messages in thread
From: James Henstridge @ 2006-10-26  9:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: bazaar-ng, Matthew D. Fuller, Linus Torvalds, Andreas Ericsson,
	Carl Worth, git, Jakub Narebski

On 26/10/06, Junio C Hamano <junkio@cox.net> wrote:
> "James Henstridge" <james@jamesh.id.au> writes:
>
> > Can you really just rely on equal revision IDs meaning you have the
> > same code though?
>
> If you two have the same commit that is a guarantee that you two
> have identical trees.  The reverse is not true as logic 101
> would teach ;-).

That was the point I was trying to make.  Carl asserted that in git
you could tell if you had the same tree as someone else based on
revision IDs, which doesn't seem to be the case all the time.

The reverse assertion (that if you have the same revision ID, you have
the same tree) seems to hold equally in git and Bazaar.


> Doing a fast-forward instead of a "useless" merge helps
> somewhat but not in cases like two people merging the same
> branches the same way or two people applying the same patch on
> top of the same commit.  You need to compare tree object IDs for
> that.

Sure, you can do the same in Bazaar by comparing the inventories for
the two revisions.

>
> >> In bzr, the only answer I'm hearing is attempting a merge to see if it
> >> introduces any changes. (I'm deliberately avoiding "pull" since we're
> >> talking about distributed cases here).
> >
> > Or run "bzr missing".  If the sole missing revision is a merge (and
> > not the revisions introduced by the merge), you could assume that you
> > have the same tree state.
>
> Is it "you could assume" or "it is guaranteed"?  If former, what
> kind of corner cases could invalidate that assumption?

The merge revision will also include any manual conflict resolution.
If the other person resolved the conflicts differently, the two trees
will differ even though the merge is the only missing revision.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26  9:57                                                             ` James Henstridge
@ 2006-10-26 10:10                                                               ` Jeff King
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  0 siblings, 1 reply; 806+ messages in thread
From: Jeff King @ 2006-10-26 10:10 UTC (permalink / raw)
  To: James Henstridge
  Cc: Junio C Hamano, bazaar-ng, Matthew D. Fuller, Linus Torvalds,
	Andreas Ericsson, Carl Worth, git, Jakub Narebski

On Thu, Oct 26, 2006 at 05:57:20PM +0800, James Henstridge wrote:

> >If you two have the same commit that is a guarantee that you two
> >have identical trees.  The reverse is not true as logic 101
> >would teach ;-).
> 
> That was the point I was trying to make.  Carl asserted that in git
> you could tell if you had the same tree as someone else based on
> revision IDs, which doesn't seem to be the case all the time.

If you have the same revision (commit IDs), you have the same tree (at
the same time, by the same committer, etc).

If you have a different revision (commit), you may or may not have the
same tree. You can then check the tree id, which will either be the same
(you have the same tree) or differ (you don't).

Thus, in the converse, if you have the same tree, you _will_ have the
same tree id. You may or may not have the same commit id.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
@ 2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
                                                                                                   ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-26 10:13 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, David Lang, git

Matthew D. Fuller wrote:
> 
>> 3. pull from each other frequently to keep in sync.
>>
>> this changes the topology to
>>
>>    Master
>>    /   \
>>  dev1--dev2
>>
>> if they do this with bzr then the revno's break, they each get extra
>> commits showing up (so they can never show the same history).
> 
> These two are either/or, not and; either they pull (in which case
> their old mainline is no longer meaningful), or they merge (in which
> case they get the 'extra' merge commits).
> 
> 
>> in git this is a non-issue, they can pull back and forth and the
>> only new history to show up will be changes.
> 
> In git, this is a non-issue because you don't get to CHOOSE which way
> to work.

Yes they do. They can (and in this case probably will) create a 
topic-branch named "the-other-dev/featureX" and keep it solely for 
tracking the other peers changes, keeping their own topic-branch for 
their own changes, and another branch where they merge both changes in, 
or cherry-pick from each branch to get to the desired result fast. This 
works easily because in git
a) branches are as cheap as I can ever imagine an SCM making them.
b) the "slice the DAG and view anything you like from any branch you 
like any time you like and mix them however you want" approach of the 
visualizers makes it trivial for a 10-year old fledgling programmer to 
see what changes what, and where, and by whom, and why.

The "b" above was a feature I didn't know I needed until it became 
available to me. Thanks to Paul Mackerras (spelling?) for creating the 
wonderful gitk tool, and to Marco Costalba for making a faster and, imo, 
more capable version of it.

>  You always (if you can) pull and obliterate your local
> mainline.  In bzr, it's only an 'issue' because you CAN choose, and
> CAN maintain your local mainline.

Git puts emphasis on code. Bazaar puts emphasis on developers and 
branch-structure. Depending on your preference, I imagine one or the other 
suits you better. I really, really, really don't care if my branch-tip 
gets moved because I hadn't made any changes to it while the other dev 
hacked away or if it causes a merge because we had decided to work on 
different parts of the feature. Perhaps this is a result of the insanely 
good visualizers (kudos again to Paul and Marco) that easily let me see 
who did what when and where anyways. What I *do* care about is being 
able to easily make sure all the devs have the same code to work and 
test with.

>  You CAN choose, right now, to do a
> git and pull back and forth and only new history show up as changed by
> creating a 'bzr-pull' shell script that does a 'bzr pull || bzr merge'
> (though you'd be a lot better off adding a '--fast-forward-if-you-can'
> option to merge and aliasing that over).
> 
> More basically, though, I don't think that "histories become exactly
> equivalent" is a necessary pass-word to enter the Hallowed City of
> Truly Distributed Development.

The only issue I have with bzr's revno's and truly distributed setup is 
that, by looking at the table, it seems to claim that you have found 
some miraculous way to make revnos work without a central server. Since 
everyone agrees that they don't, this should IMO be listed as mutually 
exclusive features.

On a side-note, git has made my life easier, so I childishly want to 
defend it and see it on top of every list in the world. Something I'm 
sure I share with more people on this list and with some of the bazaar 
users/devs. ;-)



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
@ 2006-10-26 10:45                                                                                 ` Erik Bågfors
  2006-10-26 11:48                                                                                 ` Jakub Narebski
                                                                                                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Erik Bågfors @ 2006-10-26 10:45 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Matthew D. Fuller, bazaar-ng, David Lang, git

> On a side-note, git has made my life easier, so I childishly want to
> defend it and see it on top of every list in the world. Something I'm
> sure I share with more people on this list and with some of the bazaar
> users/devs. ;-)

Haha, I feel the same way about bzr. Some of the features that bazaar
has, such as how it preserves the leftmost parent and treats that
specially in some cases, are things that I REALLY love and don't want
to live without.

All in all, I feel that git and bazaar are both excellent products;
what happens in the future will be interesting to see.

/Erik
-- 
google talk/jabber. zindar@gmail.com
SIP-phones: sip:erik_bagfors@gizmoproject.com

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:10                                                               ` Jeff King
@ 2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 10:52 UTC (permalink / raw)
  To: Jeff King
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski

>>>>> "Jeff" == Jeff King <peff@peff.net> writes:

    Jeff> On Thu, Oct 26, 2006 at 05:57:20PM +0800, James Henstridge wrote:
    >> >If you two have the same commit that is a guarantee that you two
    >> >have identical trees.  The reverse is not true as logic 101
    >> >would teach ;-).
    >> 
    >> That was the point I was trying to make.  Carl asserted that in git
    >> you could tell if you had the same tree as someone else based on
    >> revision IDs, which doesn't seem to be the case all the time.

    Jeff> If you have the same revision (commit IDs), you have
    Jeff> the same tree (at the same time, by the same committer,
    Jeff> etc).

    Jeff> If you have a different revision (commit), you may or
    Jeff> may not have the same tree. You can then check the tree
    Jeff> id, which will either be the same (you have the same
    Jeff> tree) or differ (you don't).

    Jeff> Thus, in the converse, if you have the same tree, you
    Jeff> _will_ have the same tree id. You may or may not have
    Jeff> the same commit id.

Ok, so git makes a distinction between the commit (code created by
someone) and the tree (code only).

Commits are defined by their parents.

Trees are defined by their content only ?

If that's the case, how do you proceed ? 

Calculate a sha1 representing the content (or the content of the
diff from parent) of all the files and dirs in the tree ?  Or
from the sha1s of the files and dirs themselves recursively based
on sha1s of the files and dirs they contain ?

I ask because the latter seems to provide some nice effects
similar to what makes BDD
(http://en.wikipedia.org/wiki/Binary_decision_diagram) so
efficient: you can compare graphs of any complexity or size in
O(1) by just comparing their signatures.

    Vincent



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
@ 2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:15                                                                     ` Jeff King
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
  2006-10-26 11:18                                                                   ` Jakub Narebski
  2006-10-26 15:05                                                                   ` Linus Torvalds
  2 siblings, 2 replies; 806+ messages in thread
From: Jeff King @ 2006-10-26 11:13 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski

On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:

> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).

Yes (a commit is a tree, zero or more parents, commit message, and
author/committer info).

> Commits are defined by their parents.

Partially, yes.

> Trees are defined by their content only ?

Yes.

> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?

Recursively. Each tree is an ordered list of 4-tuples: pathname, type,
sha1, mode. If the type is "blob" then the sha1 is the hash of the file
contents. If the type is "tree" then the sha1 is the id of a sub-tree.
The id of a tree is the sha1 hash of the data structure.
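
To make the recursion concrete, here is a minimal sketch in Python. It
assumes git's object encoding ("<type> <size>" plus a NUL byte, then
the payload) and the tree entry layout "<mode> <name>", NUL, raw
20-byte sha1; in that layout the entry type is implied by the mode.
The nested-dict input and the function names are made up for
illustration, and only plain files and directories are handled.

```python
import hashlib

def object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, then the payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()

def tree_id(directory: dict) -> str:
    """Hash a nested dict: bytes values are files, dict values are sub-trees."""
    entries = []
    for name, value in directory.items():
        if isinstance(value, dict):
            # Directories sort as if their name ended in '/'.
            entries.append(((name + "/").encode(), b"40000", name, tree_id(value)))
        else:
            entries.append((name.encode(), b"100644", name, object_id("blob", value)))
    payload = b""
    for _key, mode, name, hex_id in sorted(entries, key=lambda e: e[0]):
        payload += mode + b" " + name.encode() + b"\0" + bytes.fromhex(hex_id)
    return object_id("tree", payload)

before = {"README": b"hi\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}
after = {"README": b"hello\n", "src": {"main.c": b"int main(void) { return 0; }\n"}}

# Changing README changes the root id, but the untouched sub-tree keeps its id.
assert tree_id(before) != tree_id(after)
assert tree_id(before["src"]) == tree_id(after["src"])
```

Because a sub-tree's id is part of its parent's payload, any change
deep in the hierarchy changes every id on the path up to the root and
nothing else.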

> I ask because the later seems to provide some nice effects
> similar to what makes BDD
> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
> efficient: you can compare graphs of any complexity or size in
> O(1) by just comparing their signatures.

Yes, if two trees' hashes compare equal, they contain the same data. I
believe we are not currently using this optimization to find merge
differences, but there was some discussion earlier this week about doing
so.
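
As an illustration of what such an optimization could look like, the
sketch below compares two trees and skips any sub-tree whose id
already matches. It is a simplified model in Python, not git's object
format and not git's diff or merge code; build() and changed_paths()
are hypothetical names, and a file replaced by a directory of the
same name is not reported as a removal.

```python
import hashlib

def build(node):
    """Turn nested dicts/bytes into (id, children); children is None for files."""
    if isinstance(node, bytes):
        return (hashlib.sha1(b"blob\0" + node).hexdigest(), None)
    children = {name: build(child) for name, child in sorted(node.items())}
    listing = "".join(f"{name}\0{child[0]}\n" for name, child in children.items())
    return (hashlib.sha1(b"tree\0" + listing.encode()).hexdigest(), children)

def changed_paths(old, new, prefix=""):
    """List paths that differ, never descending into sub-trees with equal ids."""
    if old is not None and new is not None and old[0] == new[0]:
        return []  # equal ids: the whole sub-tree is identical
    old_kids = (old[1] if old else None) or {}
    new_kids = (new[1] if new else None) or {}
    if not old_kids and not new_kids:
        return [prefix]  # a file was added, removed, or modified
    paths = []
    for name in sorted(set(old_kids) | set(new_kids)):
        child_path = prefix + "/" + name if prefix else name
        paths += changed_paths(old_kids.get(name), new_kids.get(name), child_path)
    return paths

old = build({"README": b"v1\n",
             "src": {"a.c": b"a\n", "b.c": b"b\n"},
             "doc": {"x.txt": b"x\n"}})
new = build({"README": b"v1\n",
             "src": {"a.c": b"a2\n", "b.c": b"b\n"},
             "doc": {"x.txt": b"x\n"}})

print(changed_paths(old, new))  # ['src/a.c']
```

The "doc" sub-tree is never opened here: one id comparison is enough
to rule it out, however many files it holds.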


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:13                                                                   ` Jeff King
@ 2006-10-26 11:15                                                                     ` Jeff King
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
  1 sibling, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-26 11:15 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: James Henstridge, Junio C Hamano, bazaar-ng, Matthew D. Fuller,
	Linus Torvalds, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski

On Thu, Oct 26, 2006 at 07:13:39AM -0400, Jeff King wrote:

> Yes (a commit is a tree, zero or more parents, commit message, and
> author/committer info).

Sorry, I should clarify: a commit is a _tree id_, zero or more _parent
ids_, commit message, etc.
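
A minimal sketch of that in Python, assuming the usual text layout of
a commit object (a "tree" line, "parent" lines, "author" and
"committer" lines, a blank line, then the message). The names, email
addresses and timestamps are invented for illustration.

```python
import hashlib

def object_id(obj_type, payload):
    """SHA-1 over '<type> <size>', a NUL byte, then the payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()

def commit_id(tree, parents, author, committer, message):
    """Build the commit text from ids and metadata, then hash it."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return object_id("commit", "\n".join(lines).encode())

TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"  # id of the empty tree

alice = "Alice Example <alice@example.com> 1161861000 +0200"  # hypothetical
bob = "Bob Example <bob@example.com> 1161861600 +0200"        # hypothetical

c1 = commit_id(TREE, [], alice, alice, "Initial import\n")
c2 = commit_id(TREE, [], bob, bob, "Initial import\n")

# Same tree id, different commit ids: the commit also covers who and when.
assert c1 != c2
# Identical tree, parents, people, dates and message give the identical id.
assert c1 == commit_id(TREE, [], alice, alice, "Initial import\n")
```

So equal commit ids imply equal tree ids, while equal tree ids say
nothing about the commit ids.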


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 17:21                                       ` David Rientjes
  2006-10-25 21:03                                         ` Jeff King
@ 2006-10-26 11:15                                         ` Andreas Ericsson
  2006-10-26 16:30                                           ` David Lang
  1 sibling, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-26 11:15 UTC (permalink / raw)
  To: David Rientjes; +Cc: Jeff King, Linus Torvalds, Lachlan Patrick, bazaar-ng, git

David Rientjes wrote:
> On Wed, 25 Oct 2006, Jeff King wrote:
> 
>>> This all became very obvious when the tutorials came out on "how to use 
>>> git in 20 commands or less" effectively.  These tutorials shouldn't need 
>>> to exist with an information manager that started as a quick, efficient, 
>>> and _simple_ project.  You're treating git development in the same light 
>> Sorry, I don't see how this is related to the programming language _at
>> all_. Are you arguing that the interface of git should be simplified so
>> that such tutorials aren't necessary? If so, then please elaborate, as
>> I'm sure many here would like to hear proposals for improvements. If
>> you're arguing that git now has too many features, then which features
>> do you consider extraneous?
>>
> 
> It's not, it's related to the original vision of git which was meant for 
> efficiency and simplicity.

Compared to today's version, original git was neither efficient nor 
simple. Unless you mean "some random version along the way where git had 
everything *I* need and not the useless cruft that other people use", in 
which case it's simply a very egotistical view of things.

>  A year ago it was very easy to pick up the 
> package and start using it effectively within a couple hours.   Keep in
> mind that this was without tutorials, it was just reading man pages.  
> Today it would be very difficult to know what the essential commands are 
> and how to use them simply to get the job done, unless you use the 
> tutorials.

Have you tried "git --help"? It shows the most common commands and a 
short description of what they do. It's a very good pointer to which 
man-pages you need to read, and I imagine this would actually be one of 
the very first commands that new git users try. If they don't but just 
expect things to work according to some premade mental model they have 
of scm's, I'd say they'd be screwed no matter which software they tried.


>  This _inherently_ goes against the approach of trying to 
> provide something that is simple to the developer.
> 
> Revision control is something that should exist in the background that 
> does it's simple job very efficiently.  Unfortunately git has tried to 
> move its presence into the foreground and requiring developers to spend 
> more time on learning the system.
> 

No it hasn't. The ten or so commands that Linus first introduced when 
announcing git still work pretty much the same. Nobody in their right 
mind would ever claim that those ten commands made up anything that even 
remotely resembled a complete scm, but they were something to build on 
by anyone who wanted to extend it. So far, ~220 people have wanted to 
extend it in ways that others thought useful, because their patches are 
apparently in the git tree.

> Have you never tried to show other people git without giving them a 
> tutorial on the most common uses?  Try it and you'll see the confusion.  
> That _specifically_ illustrates the ever-increasing lack of simplicity 
> that git has acquired.
> 

Well, my head hurt when I tried to learn CVS without a tutorial, and 
mercurial and darcs and svn as well. I didn't pick up the functionality 
of the 'ls' command completely without reading the man-page for it. If 
you want something that works for everyone without having to read any 
documentation whatsoever, buy Lego, cause computers ain't for you, my 
friend.

>> I don't agree with this. There are tons of enhancements that I find
>> useful (e.g., '...' rev syntax, rebasing with 3-way merge, etc) that I
>> think other developers ARE using. There are scalability and performance
>> improvements. And there are new things on the way (Junio's pickaxe work)
>> that will hopefully make git even more useful than it already is.
>>
> 
> There are _not_ scalability improvements.  There may be some slight 
> performance improvements, but definitely not scalability.  If you have 
> ever tried to use git to manage terabytes of data, you will see this 
> becomes very clear.  And "rebasing with 3-way merge" is not something 
> often used in industry anyway if you've followed the more common models 
> for revision control within large companies with thousands of engineers.  
> Typically they all work off mainline.
> 

Actually, I don't see why git shouldn't be perfectly capable of handling 
a repo containing several terabytes of data, provided you don't expect 
it to turn up the full history for the project in a couple of seconds 
and you don't actually *change* that amount of data in each revision. If 
you want a vcs that handles that amount with any kind of speed, I think 
you'll find rsync and raw rvs a suitable solution.

On the other hand, you fellas at google don't really use git to store 
the data from the search database, do you? I mean, it's written for 
source control management. People that tried to keep their mboxes in git 
failed miserably, because large files that constantly change just 
don't work well with git.

>> If you don't think recent git versions are worthwhile, then why don't
>> you run an old version? You can even use git to cherry-pick patches onto
>> your personal branch.
>>
> 
> I do.  And that's why I would recommend to any serious developer to use 
> 1.2.4; this same version that I used for kernel development at Google.
> 
>> Where?
>>
> 
> Few months back here on the mailing list.  When I tried cleaning up even 
> one program, I got the response back from the original author "why fix a 
> non-problem?" because his argument was that since it worked the code 
> doesn't matter.
> 
> 	http://marc.theaimsgroup.com/?l=git&m=115589472706036
> 
> And that is simply one thread of larger conversations that have taken 
> place off-list and aren't archived.
> 

First off, the code got changed as per Junio's desires. He's the 
maintainer and gets to choose about coding style and readability vs 
microoptimizations.

Second, why keep discussions about git development off-list?

Third, if you still have issues with it, why not provide a patch and see 
if Junio accepts it? Cleaner and faster code will, in my experience, 
always get accepted. Code that is cleaner from one dev's point of view 
but doesn't actually provide any other benefits will be dropped to the 
floor, and rightly so.


>> I don't agree, but since you haven't provided anything specific enough
>> to discuss, there's not much to say.
>>
> 
> If there's a question about some of the sloppiness in the git source code 
> as it stands today, that's a much bigger issue than the sloppiness.  My 
> advice would be to pick up a copy of K&R's 2nd edition C programming 
> language book, read it, and then take a tour of the source code.
> 

The first sentence doesn't make sense. The second one is just rude, and 
formed by your own opinion on how code should be written. But again, 
submit patches and see if Junio accepts them. If he doesn't, and you 
really, really *really* can't stand the changes he and the rest of the 
git community want in, fork your own version and hack away to your 
heart's content. Git makes it easy for you, whichever version you use.

>> Can you name one customization that you would like to perform now that
>> you feel can't be easily done (and presumably that would have been
>> easier in the past)?
>>
> 
> Yes, those mentioned above.
> 

Which ones? The git-mv changes you submitted were applied (although in a 
different shape), so there must be other ones. Rewriting C builtins as 
shell-scripts is not really an option, because portability and 
performance *do* matter.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
@ 2006-10-26 11:18                                                                   ` Jakub Narebski
  2006-10-26 15:05                                                                   ` Linus Torvalds
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 11:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Vincent Ladeuil wrote:

> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).
> 
> Commits are defined by their parents.
> 
> Trees are defined by their content only ?

Trees are collections of tuples: (mode, type, sha1, name), where mode
is simplified mode of a file or directory (only if it is symlink, directory,
file or executable file is tracked), type is blob (file) or tree
(directory), sha1 is sha1 of contents of given entry, and name is filename
of given entry.
 
> If that's the case, how do you proceed ? 
> 
> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?
 
sha1 of object is sha1 of type+contents if I remember correctly. So the sha1
of tree is based on sha1 of the files and dirs it contains.
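
A small Python check of that recollection, assuming the object header
is "<type> <size>" followed by a NUL byte (so the length is hashed as
well as the type). The function name is made up for illustration.

```python
import hashlib

def object_id(obj_type: str, payload: bytes) -> str:
    """SHA-1 over '<type> <size>', a NUL byte, then the payload."""
    header = f"{obj_type} {len(payload)}".encode() + b"\0"
    return hashlib.sha1(header + payload).hexdigest()

# The same (empty) contents hash differently under different types.
print(object_id("blob", b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(object_id("tree", b""))  # 4b825dc642cb6eb9a060e54bf8d69288fbee4904

# Should match: echo "hello world" | git hash-object --stdin
print(object_id("blob", b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```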
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 21:08                                   ` Junio C Hamano
  2006-10-25 21:16                                     ` Jeff King
  2006-10-25 21:50                                     ` Junio C Hamano
@ 2006-10-26 11:25                                     ` Andreas Ericsson
  2 siblings, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-26 11:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jeff King, git, bazaar-ng, Linus Torvalds, Lachlan Patrick,
	David Rientjes

Junio C Hamano wrote:
> 
>  - Learn the itches David and other people have, that the
>    current git Porcelain-ish does not scratch well, and enrich
>    Documentation/technical with real-world working scripts built
>    around plumbing.
> 

Isn't this how git has been developed since day one, more or less? If a 
command is missing, it gets added as a shell-script. I agree with you that 
the "pipes from this sent here does this, and look how useful it is" 
lectures are gone since many commands were rewritten. Otoh, they're gone 
because they now instead provide examples on how to interface with the 
libified parts of git, so it's not a loss per se, just a switch in what 
it teaches.

I also agree with David that shell is much more fun to muck around with 
and prototype in, because you see results much faster. However, since 
our plumbing is so rock-solid (and getting extended with --stdin options 
to more and more commands), I see no reason why we shouldn't have a "how 
to extend git" with the old shell-based porcelain scripts up somewhere 
at the web. Perhaps it would kill two birds with one stone and increase 
the addition of new utilities to git, while at the same time keeping the 
already rewritten commands in C.

Btw, the old shell-versions still work with the new plumbing (well, 
mostly anyways). They just have problems with filenames and revisions 
with spaces and special chars and things like that, same as they've 
always had.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
@ 2006-10-26 11:48                                                                                 ` Jakub Narebski
  2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
  2006-10-26 13:47                                                                                 ` Aaron Bentley
  3 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 11:48 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Andreas Ericsson wrote:

> On a side-note, git has made my life easier, so I childishly want to 
> defend it and see it on top of every list in the world. Something I'm 
> sure I share with more people on this list and with some of the bazaar 
> users/devs. ;-)

Let us then review what started this thread, namely the comparison chart
between source control systems
  http://bazaar-vcs.org/RcsComparisons

1. Decentralized. O.K.

2. Disconnected Ops. O.K.

3. Simple Namespace. Should be named "Simple Rev Names" instead. Bazaar
should have a note that revnos work only for specific workflows
(star-topology); for Git it should perhaps be "Somewhat" here, as <ref>~<n>
(or <ref>@{<n>} if reflog is enabled) _are_ simple (if volatile for branch
<refs>). $(git-merge-base <ref1> <ref2>), usually "hidden" in the
<ref1>..<ref2> or <ref1>...<ref2> shortcut, is also simple, I think. There
was a huge discussion here about revnos, revids, workflows (development
topology), fast-forwards, empty merges etc. Bazaar-NG and Git put
emphasis on different things. Additionally, tag support removes some of
the perceived advantages of revnos; tags are simple.

4. Supports Renames. I could agree with "Somewhat" because of not yet
implemented --follow option to git-rev-list (and therefore all porcelain).
Perhaps it would be closer to truth to leave the marker (background color)
as for "Somewhat" and write "N/A" with note that Git has contents and
pathname based heuristic detection of renames, or just put "Detect" or
"Detection" here.

I would certainly change description of what means that SCM doesn't "Support
Renames" or has it implemented partially. Current explanation relies
heavily on _implementation_. The correct wording of current definition
would be that SCM doesn't support renames if history of a file "as visible
to SCM" is broken into before rename and after rename part, and that SCM
support it partially if you can track history of renamed file from
post-rename name but the history of the pre-rename file is left in the void.
But with this definition Git _does_ "Supports Renames".

I'd rather split "Supports Renames" into engine part (does SCM
remember/detect that rename took place _as_ rename, not remember/detect it
as copying+deletion; something other than rename) and user interface part:
can user easily deal with renames (this includes merging and viewing file
history).

5 and 6. Needs Repository/Supports Repository. The name is very, very
unclear and stems from the branch-centricness of Bazaar. Git should probably
have "Yes" here, as for Git branch is just reference to its tip in
revisions DAG (plus optionally branch tip history in reflog). On the other
hand Git _can_ share object database like branches can be gathered together
to share data into repository. You can have one-branch repositories, you
can clone whole repositories (perhaps Bazaar should have "Somewhat" for
Supports Repository as it doesn't support cloning of whole repository...
bzt, wrong, there is example plugin for that), and you can clone (using
Cogito) only one branch of repository and you can fetch only selected
branches of repository.

Thinking more about it those items should probably read "Support Individual
Branches" (as: can you get only the branch you are interested in, can SCM
support one-branch workflow) and "Support Branch Grouping" or "Support Data
Sharing" (as: can you share DAG between branches, can you share DAG between
repositories).

7. Checkouts (as a noun). This should probably read "Support Centralized and
Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
have "No" for "Centralized" and "Somewhat" for "Disconnected Centralized"
meaning that you can set up Git repository to be equivalent of heavyweight
checkout, and push changes to some given repository on commit.

8. Partial Checkouts (as a verb). Here Git should have perhaps "Minimal", as
you can have partial checkouts but only with care (and you still need whole
repository). "No?" is also correct (?).

9. Atomic Commits. O.K. You have to remember that there are consequences
of having Atomic Commits on the details of Partial Checkouts.

10. Cheap Branching Anywhere. Git should probably have "Yes! Yes! Yes!"
here ;-)

11. Smart Merge. O.K. Should probably be explained what constitutes smart
merging. Perhaps instead of "Yes" there should be name of default/smartest
merge strategy used?

12. Cherrypicks. What constitutes "Yes" here? Why "Somewhat" for Git?
It does have git-cherry-pick command for cherry picking...

13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
or "?" background color for Git. And add note that it is easy to script up
porcelain-ish command, and to add another merge strategy. There also was
example plugin infrastructure for Cogito, so I'd opt for "Somewhat"
marking.

14. Has Special Server. O.K.

15. Req. Dedicated Server. O.K.

16. Good Windows support. I'd put "Cygwin" instead of "No" for Git, although
with the same marking. And perhaps add note that Git relies heavily on
POSIX.

17 and 18. Fast Local Performance and Fast Network Performance. O.K.

19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
easy to use, but I do not have much experience with other SCMs. I wonder why
Bazaar has "No" there...
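
To illustrate what "contents based heuristic detection of renames" in
item 4 means, here is a rough Python sketch. It pairs each path that
disappeared with the most similar path that appeared, if the
similarity is at least 50%. It uses difflib from the Python standard
library as a stand-in similarity measure, which is an assumption made
for this example and not git's own algorithm; detect_renames() is a
hypothetical name.

```python
import difflib

def detect_renames(old_files, new_files, threshold=0.5):
    """Pair deleted paths with added paths whose contents are similar enough."""
    deleted = {p: c for p, c in old_files.items() if p not in new_files}
    added = {p: c for p, c in new_files.items() if p not in old_files}
    renames = []
    for old_path, old_content in sorted(deleted.items()):
        best_path, best_score = None, threshold
        for new_path, new_content in sorted(added.items()):
            score = difflib.SequenceMatcher(None, old_content, new_content).ratio()
            if score >= best_score:
                best_path, best_score = new_path, score
        if best_path is not None:
            renames.append((old_path, best_path))
            del added[best_path]  # each added path is matched at most once
    return renames

old = {"util.c": "int add(int a, int b)\n{\n    return a + b;\n}\n",
       "README": "docs\n"}
new = {"math.c": "int add(int a, int b)\n{\n    return a + b;\n}\n",
       "README": "docs\n"}

print(detect_renames(old, new))  # [('util.c', 'math.c')]
```

Nothing about the rename is recorded at commit time in this model;
it is inferred afterwards from the two snapshots, which is why the
chart's "Supports Renames" wording fits it poorly.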


Too much rewriting to correct the page...


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:48                                                                                 ` Jakub Narebski
@ 2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-26 12:13                                                                                     ` Jakub Narebski
  2006-10-26 21:25                                                                                     ` Jeff King
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  1 sibling, 2 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-10-26 11:54 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git


> 
> 4. Supports Renames. I could agree with "Somewhat" because of not yet
> implemented --follow option to git-rev-list (and therefore all porcelain).
> Perhaps it would be closer to truth to leave the marker (background color)
> as for "Somewhat" and write "N/A" with note that Git has contents and
> pathname based heuristic detection of renames, or just put "Detect" or
> "Detection" here.
> 
> I would certainly change description of what means that SCM doesn't "Support
> Renames" or has it implemented partially. Current explanation relies
> heavily on _implementation_. The correct wording of current definition
> would be that SCM doesn't support renames if history of a file "as visible
> to SCM" is broken into before rename and after rename part, and that SCM
> support it partially if you can track history of renamed file from
> post-rename name but there is left in void history of pre-rename file.
> But with this definition Git _does_ "Supports Renames".

I would have thought that supports renames would also involve flagging a 
conflict when merging a file that has been renamed on 2 separate 
branches. ie 2 branches rename the file to different names and then one 
branch is merged into the other. In this situation, the user should be 
told of a rename conflict. Bzr supports this as far as I know. Not sure 
about git though as I have never used it.




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
  2006-10-26 10:45                                                                                 ` Erik Bågfors
  2006-10-26 11:48                                                                                 ` Jakub Narebski
@ 2006-10-26 12:12                                                                                 ` Matthew D. Fuller
  2006-10-26 12:18                                                                                   ` Jakub Narebski
  2006-10-26 13:47                                                                                 ` Aaron Bentley
  3 siblings, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-26 12:12 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: David Lang, bazaar-ng, git

On Thu, Oct 26, 2006 at 12:13:39PM +0200 I heard the voice of
Andreas Ericsson, and lo! it spake thus:
> Matthew D. Fuller wrote:
> >
> >In git, this is a non-issue because you don't get to CHOOSE which
> >way to work.
> 
> Yes they do.

Not where I was going with that section of the mail; I was looking at
just the merge vs fast-forward distinction.  In git, you don't get to
choose; in bzr you do.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:54                                                                                   ` Nicholas Allen
@ 2006-10-26 12:13                                                                                     ` Jakub Narebski
  2006-10-26 21:25                                                                                     ` Jeff King
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 12:13 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: bazaar-ng, git

Nicholas Allen wrote:
> Jakub Narebski wrote:
>> 
>> 4. Supports Renames. I could agree with "Somewhat" because of not yet
>> implemented --follow option to git-rev-list (and therefore all porcelain).
>> Perhaps it would be closer to truth to leave the marker (background color)
>> as for "Somewhat" and write "N/A" with note that Git has contents and
>> pathname based heuristic detection of renames, or just put "Detect" or
>> "Detection" here.
>> 
>> I would certainly change description of what means that SCM doesn't "Support
>> Renames" or has it implemented partially. Current explanation relies
>> heavily on _implementation_. The correct wording of current definition
>> would be that SCM doesn't support renames if history of a file "as visible
>> to SCM" is broken into before rename and after rename part, and that SCM
>> support it partially if you can track history of renamed file from
>> post-rename name but there is left in void history of pre-rename file.
>> But with this definition Git _does_ "Supports Renames".
> 
> I would have thought that supports renames would also involve flagging a 
> conflict when merging a file that has been renamed on 2 separate 
> branches. ie 2 branches rename the file to different names and then one 
> branch is merged into the other. In this situation, the user should be 
> told of a rename conflict. Bzr supports this as far as I know. Not sure 
> about git though as I have never used it.

If I remember correctly Git usually resolves such conflict. If it cannot
resolve it, it tells user of rename conflict (add/add conflict or rename/add
conflict).

Unfortunately Git tutorial part 3 on merges is not yet written.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
@ 2006-10-26 12:18                                                                                   ` Jakub Narebski
  2006-10-26 15:06                                                                                     ` Matthew D. Fuller
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 12:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Matthew D. Fuller wrote:

> On Thu, Oct 26, 2006 at 12:13:39PM +0200 I heard the voice of
> Andreas Ericsson, and lo! it spake thus:
>> Matthew D. Fuller wrote:
>>>
>>>In git, this is a non-issue because you don't get to CHOOSE which
>>>way to work.
>> 
>> Yes they do.
> 
> Not where I was going with that section of the mail; I was looking at
> just the merge vs fast-forward distinction.  In git, you don't get to
> choose; in bzr you do.

You can get a similar workflow in git using an 'origin'/'master' pair, I think.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:15                                                                     ` Jeff King
@ 2006-10-26 12:33                                                                     ` Vincent Ladeuil
  2006-10-26 13:14                                                                       ` Rogan Dawes
  1 sibling, 1 reply; 806+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 12:33 UTC (permalink / raw)
  To: Jeff King; +Cc: bazaar-ng, git

>>>>> "Jeff" == Jeff King <peff@peff.net> writes:

    Jeff> On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:
    >> Ok, so git make a distinction between the commit (code created by
    >> someone) and the tree (code only).

    Jeff> Yes (a commit is a tree, zero or more parents, commit message, and
    Jeff> author/committer info).

The parents of a tree are also trees or can/must they be commits ?

    >> Commits are defined by their parents.

    Jeff> Partially, yes.

I buy that this "partially" means "the other parts are irrelevant
to this discussion".

    >> Trees are defined by their content only ?

    Jeff> Yes.

So it is possible that : starting from a tree T,

- I make a patch A,
- you make the patch B,
- A and B are equal (stop watching above my shoulder please, or what is me ?),
- we both commit,
- we pull changes from each other repository.

We will end up with a tree T2 whose hash corresponds to both
T+A and T+B, but each of us will have a different commit id, CA
and CB, both pointing to T2. Did I get that right ?

    Vincent


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:33                                                                     ` Vincent Ladeuil
@ 2006-10-26 13:14                                                                       ` Rogan Dawes
  0 siblings, 0 replies; 806+ messages in thread
From: Rogan Dawes @ 2006-10-26 13:14 UTC (permalink / raw)
  To: Vincent Ladeuil; +Cc: bazaar-ng, git

Vincent Ladeuil wrote:
>>>>>> "Jeff" == Jeff King <peff@peff.net> writes:
> 
>     Jeff> On Thu, Oct 26, 2006 at 12:52:05PM +0200, Vincent Ladeuil wrote:
>     >> Ok, so git make a distinction between the commit (code created by
>     >> someone) and the tree (code only).
> 
>     Jeff> Yes (a commit is a tree, zero or more parents, commit message, and
>     Jeff> author/committer info).
> 
> The parents of a tree are also trees or can/must they be commits ?

This refers to the parents of a _commit_, not of a tree, and the parents 
must be _commits_. The parents allow us to determine what changed 
between the previous commit(s) and the current one. If there is more 
than one parent, then we have a merge commit.

So, a commit refers to a tree representing the state of the code at the 
time of the commit, as well as to any parent commit(s). If there are no 
parent commits, then the commit is an "initial commit" (i.e. the first 
checkin). A project can have multiple "initial commits", typically where 
two previously independent projects are merged together, cf. gitk and git.

> 
>     >> Commits are defined by their parents.
> 
>     Jeff> Partially, yes.
> 
> I buy that this "partially" means "the other parts are irrelevant
> to this discussion".

Yes.

>     >> Trees are defined by their content only ?
> 
>     Jeff> Yes.
> 
> So it is possible that : starting from a tree T,
> 
> - I make a patch A,
> - you make the patch B,
> - A and B are equal (stop watching above my shoulder please, or what is me ?),
> - we both commit,
> - we pull changes from each other repository.
> 
> We will end up with a tree T2 with a hash corresponding to both
> T+A and T+B, but each of us will have a different commit id CA
> and CB both pointing to T2, did I get it ?
> 
>     Vincent

Yes. That is exactly right.

From there, we can either trivially merge CA and CB with a new merge 
commit that refers to T2 and cites both CA and CB as parents, or simply 
discard one of the lines of development, depending on how much 
subsequent development cites CA or CB as a parent.
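
The scenario above can be sketched in a few lines of Python. This is only an
illustration, not git's own code: it assumes the usual git object naming
(SHA-1 over a "<type> <size>" header, a NUL byte, then the object body). The
author names, timestamps, and the BASE parent id are invented, and the empty
tree stands in for T2.

```python
import hashlib

def git_object_id(obj_type: str, body: bytes) -> str:
    """Name an object the way git does: SHA-1 over '<type> <size>\\0' + body."""
    header = f"{obj_type} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def commit_body(tree, parents, who, message):
    """Build the text of a commit object (author and committer kept equal here)."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {who}", f"committer {who}"]
    return ("\n".join(lines) + "\n\n" + message + "\n").encode()

# Hypothetical values: the empty tree stands in for T2; the parent id,
# names and timestamps are invented for the example.
T2 = git_object_id("tree", b"")
BASE = "1" * 40

CA = git_object_id("commit", commit_body(
    T2, [BASE], "Vincent <vincent@example.org> 1161866000 +0200", "Apply fix"))
CB = git_object_id("commit", commit_body(
    T2, [BASE], "Jeff <jeff@example.org> 1161866060 -0400", "Apply fix"))

# Same tree, different commits: author and date are part of the hashed text.
print(CA != CB)  # True

# The trivial merge: tree T2 again, with both CA and CB as parents.
MERGE = git_object_id("commit", commit_body(
    T2, [CA, CB], "Vincent <vincent@example.org> 1161866120 +0200", "Merge"))
```

Both commits name the same tree id, so comparing trees says "identical
content" while comparing commit ids says "different history".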


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:13                                                                               ` Andreas Ericsson
                                                                                                   ` (2 preceding siblings ...)
  2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
@ 2006-10-26 13:47                                                                                 ` Aaron Bentley
  2006-10-26 13:53                                                                                   ` Jakub Narebski
  3 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-10-26 13:47 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Matthew D. Fuller, bazaar-ng, David Lang, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andreas Ericsson wrote:
> The only issue I have with bzr's revno's and truly distributed setup is
> that, by looking at the table, it seems to claim that you have found
> some miraculous way to make revnos work without a central server. Since
> everyone agrees that they don't, this should IMO be listed as mutually
> exclusive features.

The "simple namespace" is both a URL and a revno.

And therefore, it's just as distributed and decentralized as the web.

There is very little difference between this:

http://example.com/mywebpage#5

And this:

http://example.com/mybranch 5

In fact, we've been planning to unify them into one identifier.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQLxr0F+nu1YWqI0RAiVrAJ9rb+uylIuxqMo2VMelI3Qm6oNQOwCfeTAb
kOkp9kOkRl1YEVEP+G3y2SU=
=Zgsg

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 13:47                                                                                 ` Aaron Bentley
@ 2006-10-26 13:53                                                                                   ` Jakub Narebski
  2006-10-26 15:13                                                                                     ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 13:53 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Aaron Bentley wrote:

> Andreas Ericsson wrote:

>> The only issue I have with bzr's revno's and truly distributed setup is
>> that, by looking at the table, it seems to claim that you have found
>> some miraculous way to make revnos work without a central server. Since
>> everyone agrees that they don't, this should IMO be listed as mutually
>> exclusive features.
> 
> The "simple namespace" is both a URL and a revno.
> 
> And therefore, it's just as distributed and decentralized as the web.
> 
> There is very little difference between this:
> 
> http://example.com/mywebpage#5
> 
> And this:
> 
> http://example.com/mybranch 5
> 
> In fact, we've been planning to unify them into one identifier.

Well, then it is not much simpler than an 8-char sha1. And sha1 is more
decentralized, because you can use it when you don't have access to the net,
and when the _central_ revno server is down.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 10:52                                                                 ` Vincent Ladeuil
  2006-10-26 11:13                                                                   ` Jeff King
  2006-10-26 11:18                                                                   ` Jakub Narebski
@ 2006-10-26 15:05                                                                   ` Linus Torvalds
  2006-10-26 16:04                                                                     ` Vincent Ladeuil
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
  2 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-26 15:05 UTC (permalink / raw)
  To: Vincent Ladeuil
  Cc: Jeff King, James Henstridge, Junio C Hamano, bazaar-ng,
	Matthew D. Fuller, Carl Worth, Andreas Ericsson, git,
	Jakub Narebski



On Thu, 26 Oct 2006, Vincent Ladeuil wrote:
> 
> Ok, so git make a distinction between the commit (code created by
> someone) and the tree (code only).
> 
> Commits are defined by their parents.

Commits are defined by a _combination_ of:
 - the tree they commit (which is recursive, so the commit name indirectly 
   includes information about EVERY SINGLE BIT in the whole tree, in every 
   single file)
 - the parent(s) if any (which is also recursive, so the commit name 
   indirectly includes information about EVERY SINGLE BIT in not just the 
   current tree, but every tree in the history, and every commit that is 
   reachable from it)
 - the author, committer, and dates of each (and committer is actually 
   very often different from author)
 - the actual commit message

So a commit really names - uniquely and authoritatively - not just the 
commit itself, but everything ever associated with it.

> Trees are defined by their content only ?

Where "contents" does include names and permissions/types (eg execute bit 
and symlink etc).

> If that's the case, how do you proceed ? 

If you compare the commit name, and they are equal, you automatically know
 - the trees are 100% identical
 - the histories are 100% identical

If you only care about the actual tree, you compare the tree name for 
equality, ie you can do

	git-rev-parse commit1^{tree} commit2^{tree}

and compare the two: if and only if they are equal are the actual contents 
100% equal.

> Calculate a sha1 representing the content (or the content of the
> diff from parent) of all the files and dirs in the tree ?  Or
> from the sha1s of the files and dirs themselves recursively based
> on sha1s of the files and dirs they contain ?

The latter. 

> I ask because the later seems to provide some nice effects
> similar to what makes BDD
> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
> efficient: you can compare graphs of any complexity or size in
> O(1) by just comparing their signatures.

This is exactly what git does. You can compare entire trees (and 
subdirectories are just other trees) by just comparing 20 bytes of 
information.
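
A rough Python sketch of that recursive naming follows. It is an illustration
under assumptions, not git's implementation: it follows the tree object layout
as I understand it ("<mode> <name>", a NUL byte, then the 20-byte child id,
with entries sorted by name and directories compared as if their name ended
in "/"), and it handles only plain files (mode 100644) and directories.
The nested-dict input format is invented for the example.

```python
import hashlib

def object_id(kind: bytes, body: bytes) -> bytes:
    """20-byte SHA-1 over '<kind> <size>\\0' + body."""
    return hashlib.sha1(kind + b" " + str(len(body)).encode() + b"\0" + body).digest()

def blob_id(data: bytes) -> bytes:
    return object_id(b"blob", data)

def tree_id(entries: dict) -> bytes:
    """entries maps a name to bytes (file content) or to a dict (subdirectory)."""
    records = []
    for name, value in entries.items():
        if isinstance(value, dict):
            # A subdirectory contributes only its own 20-byte tree id.
            records.append((name + "/", b"40000", name, tree_id(value)))
        else:
            records.append((name, b"100644", name, blob_id(value)))
    records.sort(key=lambda r: r[0].encode())
    body = b"".join(mode + b" " + name.encode() + b"\0" + oid
                    for _, mode, name, oid in records)
    return object_id(b"tree", body)

a = tree_id({"README": b"x", "src": {"main.c": b"int main;"}})
b = tree_id({"src": {"main.c": b"int main;"}, "README": b"x"})
c = tree_id({"README": b"x", "src": {"main.c": b"int main ;"}})
print(a == b, a == c)  # True False
```

A one-byte change three directories down changes every tree id on the path up
to the root, and nothing else.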

How do you think we can do a diff between two arbitrary kernel revisions 
so fast? Why do you think we can afford to do a 

	git log drivers/usb include/linux/usb*

that literally picks out the history (by comparing state) for every commit 
in the tree?

I can do the above log-generation in less than ten _seconds_ for the last 
year and a half of the kernel. That's 20k+ lines of logs of commits that 
only touch those files and directories. And I _need_ it to be fast, 
because that's literally one of the most common operations I do.

And the reason it's fast is that we can compare 20,000 files (names, 
contents, permissions) by just comparing a _single_ 20-byte SHA1.
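
To make the pruning concrete, here is a small Python sketch. It uses a
simplified hashing scheme of my own (not git's object format) and an invented
nested-dict layout; the point is only that a walk over two trees can stop at
any subtree whose ids match.

```python
import hashlib

def build(entries):
    """Return a node tuple: (kind, id, children). Files are given as bytes."""
    if isinstance(entries, bytes):
        return ("blob", hashlib.sha1(b"blob\0" + entries).hexdigest(), None)
    children = {name: build(value) for name, value in entries.items()}
    h = hashlib.sha1(b"tree\0")
    for name in sorted(children):
        h.update(name.encode() + b"\0" + children[name][1].encode() + b"\n")
    return ("tree", h.hexdigest(), children)

def changed_paths(old, new, prefix=""):
    """Yield paths that differ, skipping any subtree whose ids match."""
    if old is not None and new is not None and old[1] == new[1]:
        return  # equal ids: nothing underneath needs to be looked at
    old_kids = old[2] if old is not None and old[0] == "tree" else {}
    new_kids = new[2] if new is not None and new[0] == "tree" else {}
    if (old is not None and old[0] == "blob") or (new is not None and new[0] == "blob"):
        yield prefix  # a file was added, removed or modified here
    for name in sorted(set(old_kids) | set(new_kids)):
        child = f"{prefix}/{name}" if prefix else name
        yield from changed_paths(old_kids.get(name), new_kids.get(name), child)

OLD = build({"README": b"hi",
             "drivers": {"net": {"e1000.c": b"v1"},
                         "usb": {"core.c": b"v1", "hub.c": b"v1"}}})
NEW = build({"README": b"hi",
             "drivers": {"net": {"e1000.c": b"v1"},
                         "usb": {"core.c": b"v1", "hub.c": b"v2"}}})

print(list(changed_paths(OLD, NEW)))  # ['drivers/usb/hub.c']
```

Here "README" and "drivers/net" are dismissed with one comparison each, however
many files they contain.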

In git, revision names (and _everything_ has a revision name: commits, 
trees, blobs, tags) really have meaning. They're not just random noise.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 12:18                                                                                   ` Jakub Narebski
@ 2006-10-26 15:06                                                                                     ` Matthew D. Fuller
  0 siblings, 0 replies; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-26 15:06 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Thu, Oct 26, 2006 at 02:18:53PM +0200 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> You can get similar workflow in git using 'origin'/'master' pair, I
> think.

Not the same, because as soon as your 'git pull' _can_ fast-forward, it
will.  You can't merge in a set of changes from another branch that's a
strict superset of yours without it fast-forwarding them.

I suppose you could take great care to ensure that the other branch is
never in a position to be fast-forwarded onto yours, most simply just
by making forced commits before you do a pull, but that's revolting.
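
The rule being described can be modelled in Python as below. This is a toy
model of the decision only, as it is described in this thread; it does not
call git, and the graph layout (a dict from commit name to its list of
parents) and the function names are invented for the sketch.

```python
def ancestors(graph, tip):
    """All commits reachable from tip, including tip itself."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(graph[commit])
    return seen

def pull(graph, mine, theirs):
    """Decide what pulling 'theirs' into 'mine' does."""
    if theirs in ancestors(graph, mine):
        return ("up-to-date", mine)        # nothing new on their side
    if mine in ancestors(graph, theirs):
        return ("fast-forward", theirs)    # their history is a superset of mine
    return ("merge", None)                 # diverged: a new merge commit is needed

# A <- B <- C   and   B <- D
GRAPH = {"A": [], "B": ["A"], "C": ["B"], "D": ["B"]}

print(pull(GRAPH, "B", "C"))  # ('fast-forward', 'C')
print(pull(GRAPH, "C", "D"))  # ('merge', None)
```

In the first call no merge commit is recorded at all, which is the behaviour
being objected to above.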


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 13:53                                                                                   ` Jakub Narebski
@ 2006-10-26 15:13                                                                                     ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-26 15:13 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
> Aaron Bentley wrote:
>>There is very little difference between this:
>>
>>http://example.com/mywebpage#5
>>
>>And this:
>>
>>http://example.com/mybranch 5
>>
>>In fact, we've been planning to unify them into one identifier.
> 
> 
> Well, then it is not much simpler than 8-chars sha1. And sha1 is more
> decentralized, because you can use it when you don't have access to net,
> and when the _central_ revno server is down.

What do you mean by _central_ revno server?  example.com?  Does that
also apply to google.com?

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQNCc0F+nu1YWqI0RAlslAJ0XJ8Fezxyn5Ty1oAcgAo4LdQEAvQCfbWk+
vVTmHwIuhyd7lhAxMm2uMZ8=
=c4pE

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-24  9:30                                                                     ` Jelmer Vernooij
@ 2006-10-26 15:22                                                                       ` Aaron Bentley
  0 siblings, 0 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-10-26 15:22 UTC (permalink / raw)
  To: Jelmer Vernooij
  Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Erik B?gfors

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jelmer Vernooij wrote:

> The graphical frontends to bzr, for example, don't know about revno's but 
> only about revids.

Gannotate shows revnos where appropriate.  Not sure about others.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFQNK60F+nu1YWqI0RAiGiAJ45IG/nHsl3/5rP23nxYLduopVj/QCfUX+9
E01mr0edaZld9aKMASRbo+o=
=YavT

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 15:05                                                                   ` Linus Torvalds
@ 2006-10-26 16:04                                                                     ` Vincent Ladeuil
  2006-10-26 16:21                                                                       ` Linus Torvalds
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
  1 sibling, 1 reply; 806+ messages in thread
From: Vincent Ladeuil @ 2006-10-26 16:04 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git

>>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:

    Linus> On Thu, 26 Oct 2006, Vincent Ladeuil wrote:
    >> 
    >> Ok, so git make a distinction between the commit (code created by
    >> someone) and the tree (code only).
    >> 
    >> Commits are defined by their parents.

    Linus> Commits are defined by a _combination_ of:

    Linus>  - the tree they commit (which is recursive, so the
    Linus>  commit name indirectly includes information EVERY
    Linus>  SINGLE BIT in the whole tree, in every single file)

And here you keep that separate from any SCM related info,
right ?

    Linus>  - the parent(s) if any (which is also recursive, so
    Linus>  the commit name indirectly includes information about
    Linus>  EVERY SINGLE BIT in not just the current tree, but
    Linus>  every tree in the history, and every commit that is
    Linus>  reachable from it)

    Linus>  - the author, committer, and dates of each (and
    Linus>  committer is actually very often different from
    Linus>  author)

    Linus>  - the actual commit message

    Linus> So a commit really names - uniquely and authoratively
    Linus> - not just the commit itself, but everything ever
    Linus> associated with it.

Thanks for the clarification. But no need to shout about EVERY
SINGLE BIT, the pointer to BDDs was already talking a bit about
bits :) 

But I agree, this is the important point that may be missed.

    >> Trees are defined by their content only ?

    Linus> Where "contents" does include names and
    Linus> permissions/types (eg execute bit and symlink etc).

Which can also be expressed as: "Everything the user can
manipulate outside the SCM context", right ?

    >> If that's the case, how do you proceed ? 

    Linus> If you compare the commit name, and they are equal,
    Linus> you automatically know

    Linus>  - the trees are 100% identical
    Linus>  - the histories are 100% identical

And that's the only info you can get, no ordering here. (Just
pointing out the obvious: as soon as you try to put more info into
the signature, the equality will vanish).

But for various optimizations this equality property is the only
needed one.

Do we agree ?

    Linus> If you only care about the actual tree, you compare
    Linus> the tree name for equality, ie you can do

    Linus> 	git-rev-parse commit1^{tree} commit2^{tree}

    Linus> and compare the two: if and only if they are equal are
    Linus> the actual contents 100% equal.

Actually, that's backwards:

"their actual contents are equal" implies "their signatures are
equal".

But, two totally different trees can have the same signature.

My god ! What a horror ! Not. I even wonder whether I will live long
enough to see it occur... So we *can* pretend that:

"their signatures are equal" is equivalent to "their contents
are equal"

And that's all we care :)

But I digressed, the question was about a detail on your tree
definition, once the signature is defined to be unique (as in
canonical), the property of comparing the signatures as if they
were the objects themselves follows. Thanks for the confirmation.

    >> Calculate a sha1 representing the content (or the content
    >> of the diff from parent) of all the files and dirs in the
    >> tree ?  Or from the sha1s of the files and dirs themselves
    >> recursively based on sha1s of the files and dirs they
    >> contain ?

    Linus> The latter. 

Thanks for providing the clarification. So of course, finding the
differences between the trees is quick: you can prune wherever
the signatures are equal.

    >> I ask because the later seems to provide some nice effects
    >> similar to what makes BDD
    >> (http://en.wikipedia.org/wiki/Binary_decision_diagram) so
    >> efficient: you can compare graphs of any complexity or size in
    >> O(1) by just comparing their signatures.

    Linus> This is exactly what git does. You can compare entire
    Linus> trees (and subdirectories are just other trees) by
    Linus> just comparing 20 bytes of information.

I understand that, years ago even. I have a bit of practice with
BDDs and I am accustomed to that so lovely property. But without
that practice, I think most people will just wonder...

<snip/>

    Linus> And the reason it's fast is that we can compare 20,000
    Linus> files (names, contents, permissions) by just comparing
    Linus> a _single_ 20-byte SHA1.

Yeah, let's go further ! We can compare gazillions of files and
their history since epoch by comparing _two_ signatures ! :-)

    Linus> In git, revision names (and _everything_ has a
    Linus> revision name: commits, trees, blobs, tags) really
    Linus> have meaning. They're not just random noise.

I know that effect, but I understand people complaining that they
*look* like noise. 

I'm still searching for a parallel in nature, but the best I could
find is DNA. Ever looked at DNA ? 

Looks like noise, no ? No ordering either between parents and
children... But there is a way to identify a parent from the DNA
of a child...


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 16:04                                                                     ` Vincent Ladeuil
@ 2006-10-26 16:21                                                                       ` Linus Torvalds
  0 siblings, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-26 16:21 UTC (permalink / raw)
  To: Vincent Ladeuil; +Cc: bazaar-ng, git



On Thu, 26 Oct 2006, Vincent Ladeuil wrote:

> >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes:
> 
>     Linus> Commits are defined by a _combination_ of:
> 
>     Linus>  - the tree they commit (which is recursive, so the
>     Linus>  commit name indirectly includes information EVERY
>     Linus>  SINGLE BIT in the whole tree, in every single file)
> 
> And here you keep that separate from any SCM related info,
> right ?

I don't understand that question.

The commits contain the tree information. A raw commit in git (this is the 
true contents of the current top commit in my kernel tree, just added 
indentation and an empty line between the command I used to generate it 
and the output, to make it stand out better in the email) looks something 
like this:

   [torvalds@g5 linux]$ git-cat-file commit HEAD

   tree ba1ed8c744654ca91ee2b71b7cdee149c8edbef1
   parent 2a4f739dfc59edd52eaa37d63af1bd830ea42318
   parent 012d64ff68f304df1c35ce5902f5023dc14b643f
   author Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700
   committer Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700
   
   Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6
   
   * master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
     [SPARC64]: Fix memory corruption in pci_4u_free_consistent().
     [SPARC64]: Fix central/FHC bus handling on Ex000 systems.

where the _name_ of the commit is 

   [torvalds@g5 linux]$ git-rev-parse HEAD

   e80391500078b524083ba51c3df01bbaaecc94bb

ie the commit itself contains the exact tree name (and the name of the 
parents), and the name of the commit is literally the SHA1 of the contents 
of the commit (plus a git-specific header).
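
As a sketch of both points, the Python below parses the commit text shown
above and applies the naming rule (SHA-1 over a "commit <size>" header, a NUL
byte, then the text). The function names are invented here. Only the first
line of the message is copied, and the email has been re-indented, so the
bytes below are not the stored object and the computed name should not be
expected to equal the one printed by git-rev-parse.

```python
import hashlib

RAW = (
    b"tree ba1ed8c744654ca91ee2b71b7cdee149c8edbef1\n"
    b"parent 2a4f739dfc59edd52eaa37d63af1bd830ea42318\n"
    b"parent 012d64ff68f304df1c35ce5902f5023dc14b643f\n"
    b"author Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700\n"
    b"committer Linus Torvalds <torvalds@g5.osdl.org> 1161873881 -0700\n"
    b"\n"
    b"Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6\n"
)

def parse_commit(raw: bytes) -> dict:
    """Split a raw commit into its header fields and its message."""
    head, _, message = raw.partition(b"\n\n")
    fields = {"parent": []}
    for line in head.decode().splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            fields["parent"].append(value)  # a merge has several parents
        else:
            fields[key] = value
    fields["message"] = message.decode()
    return fields

def commit_name(raw: bytes) -> str:
    """SHA-1 of the git-specific header plus the commit text."""
    return hashlib.sha1(b"commit %d\0" % len(raw) + raw).hexdigest()

commit = parse_commit(RAW)
print(commit["tree"])          # ba1ed8c744654ca91ee2b71b7cdee149c8edbef1
print(len(commit["parent"]))   # 2
print(commit_name(RAW))        # a 40-hex-digit name for these exact bytes
```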

>     >> Trees are defined by their content only ?
> 
>     Linus> Where "contents" does include names and
>     Linus> permissions/types (eg execute bit and symlink etc).
> 
> Which can also be expressed as: "Everything the user can
> manipulate outside the SCM context", right ?

Again, I'm not sure what you mean by that. The SCM does not track 
_everything_. It does not track user names and inode numbers, so in a 
sense a developer can change things that the SCM simply doesn't _care_ 
about and never tracks. But yes, the tree contents uniquely identify the 
exact contents that the user cares about.

>     Linus> If you compare the commit name, and they are equal,
>     Linus> you automatically know
> 
>     Linus>  - the trees are 100% identical
>     Linus>  - the histories are 100% identical
> 
> And that's the only info you can get, no ordering here.

No, there is ordering there too. But yes, the ordering is not in the name 
itself, you have to go look at the actual commit history to see it.

The name is just an identifier.

>     Linus> If you only care about the actual tree, you compare
>     Linus> the tree name for equality, ie you can do
> 
>     Linus> 	git-rev-parse commit1^{tree} commit2^{tree}
> 
>     Linus> and compare the two: if and only if they are equal are
>     Linus> the actual contents 100% equal.
> 
> Actually, that's backwards:
> 
> "their actual contents are equal" implies "their signatures are
> equal".

No. 

If the signatures are equal, the contents are equal, and vice versa. It 
really is a two-way thing.

> But, two totally different trees can have the same signature.

No. Don't even think that way. That just confuses you. The hash is 
cryptographic, and large enough, that you really can equate the contents 
with the hash. Anything else is just not even interesting.
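
For a sense of scale, the following Python sketch evaluates the standard
birthday bound for accidental collisions. It assumes the hash behaves like a
uniformly random 160-bit function, which is an idealization, and it says
nothing about collisions constructed on purpose. The function name and the
object count are chosen only for illustration.

```python
from fractions import Fraction

def collision_bound(n_objects: int, hash_bits: int = 160) -> Fraction:
    """Upper bound on P(any two of n objects share a hash): n(n-1)/2 / 2**bits."""
    pairs = n_objects * (n_objects - 1) // 2
    return Fraction(pairs, 2 ** hash_bits)

# Even with 2**32 objects (about four billion), the bound stays
# below 2**-96.
p = collision_bound(2 ** 32)
print(p < Fraction(1, 2 ** 96))  # True
print(float(p))
```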


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:15                                         ` Andreas Ericsson
@ 2006-10-26 16:30                                           ` David Lang
  2006-10-26 17:03                                             ` Nicolas Pitre
  0 siblings, 1 reply; 806+ messages in thread
From: David Lang @ 2006-10-26 16:30 UTC (permalink / raw)
  To: Andreas Ericsson
  Cc: David Rientjes, Jeff King, Linus Torvalds, Lachlan Patrick,
	bazaar-ng, git

On Thu, 26 Oct 2006, Andreas Ericsson wrote:

>> 
>> There are _not_ scalability improvements.  There may be some slight 
>> performance improvements, but definitely not scalability.  If you have ever 
>> tried to use git to manage terabytes of data, you will see this becomes 
>> very clear.  And "rebasing with 3-way merge" is not something often used in 
>> industry anyway if you've followed the more common models for revision 
>> control within large companies with thousands of engineers.  Typically they 
>> all work off mainline.
>> 
>
> Actually, I don't see why git shouldn't be perfectly capable of handling a 
> repo containing several terabytes of data, provided you don't expect it to 
> turn up the full history for the project in a couple of seconds and you don't 
> actually *change* that amount of data in each revision. If you want a vcs 
> that handles that amount with any kind of speed, I think you'll find rsync 
> and raw rvs a suitable solution.

actually, there are some real problems in this area. the git pack format can't 
be larger than 4G, and I wouldn't be surprised if there were other issues with 
files larger than 4G (these all boil down to 32 bit limits). once these limits 
are dealt with then you will be right.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 16:30                                           ` David Lang
@ 2006-10-26 17:03                                             ` Nicolas Pitre
  2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:45                                               ` Jakub Narebski
  0 siblings, 2 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-26 17:03 UTC (permalink / raw)
  To: David Lang
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, David Lang wrote:

> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
> 
> > > 
> > > There are _not_ scalability improvements.  There may be some slight
> > > performance improvements, but definitely not scalability.  If you have
> > > ever tried to use git to manage terabytes of data, you will see this
> > > becomes very clear.  And "rebasing with 3-way merge" is not something
> > > often used in industry anyway if you've followed the more common models
> > > for revision control within large companies with thousands of engineers.
> > > Typically they all work off mainline.
> > > 
> >
> > Actually, I don't see why git shouldn't be perfectly capable of handling a
> > repo containing several terabytes of data, provided you don't expect it to
> > turn up the full history for the project in a couple of seconds and you
> > don't actually *change* that amount of data in each revision. If you want a
> > vcs that handles that amount with any kind of speed, I think you'll find
> > rsync and raw rvs a suitable solution.
> 
> actually, there are some real problems in this area. the git pack format can't
> be larger then 4G, and I wouldn't be surprised if there were other issues with
> files larger then 4G (these all boil down to 32 bit limits). once these limits
> are dealt with then you will be right.

There is no such limit on the pack format.  A pack itself can be as 
large as you want.  The 4G limit is in the tool not the format.

The actual pack limits are as follows:

	- a pack can have infinite size

	- a pack cannot have more than 4294967295 objects (the count is 32 bits)

	- each non-delta object can be of infinite size

	- delta objects can be of infinite size themselves but...

	- current delta encoding can use base objects no larger than 4G

The _code_ is currently limited to 4G though, especially on 32-bit 
architectures.  The delta issue could be resolved in a backward 
compatible way but it hasn't been formalized yet.

The pack index is actually limited to 32-bit offsets, meaning it can cope 
with packs no larger than 4G.  But the pack index is a local matter and not 
part of the protocol, so it is not a big issue to define a new index 
format and automatically convert existing indexes at that point.
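
The index limit can be illustrated with a few lines of Python. This is not
the real index layout; it only assumes, per the description above, that each
object's starting offset is stored in a 32-bit unsigned field (big-endian
byte order is my assumption), and the function names are invented.

```python
import struct

MAX_OFFSET_32 = 2 ** 32 - 1  # largest value a 32-bit unsigned field can hold

def encode_offset32(offset: int) -> bytes:
    """Store an object's starting offset in a 4-byte big-endian field."""
    return struct.pack(">I", offset)

def can_encode(offset: int) -> bool:
    """True if the offset fits in the 32-bit field."""
    try:
        encode_offset32(offset)
        return True
    except struct.error:
        return False

print(can_encode(MAX_OFFSET_32))      # True: the last addressable byte below 4G
print(can_encode(MAX_OFFSET_32 + 1))  # False: an object starting at 4G is out of reach
```

A wider offset field in a new index format would lift the limit without
touching the pack data itself.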



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 17:03                                             ` Nicolas Pitre
@ 2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:16                                                 ` Linus Torvalds
  2006-10-26 17:24                                                 ` Nicolas Pitre
  2006-10-26 17:45                                               ` Jakub Narebski
  1 sibling, 2 replies; 806+ messages in thread
From: David Lang @ 2006-10-26 17:04 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, Nicolas Pitre wrote:

> On Thu, 26 Oct 2006, David Lang wrote:
>
>> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
>>
>>>>
>>>> There are _not_ scalability improvements.  There may be some slight
>>>> performance improvements, but definitely not scalability.  If you have
>>>> ever tried to use git to manage terabytes of data, you will see this
>>>> becomes very clear.  And "rebasing with 3-way merge" is not something
>>>> often used in industry anyway if you've followed the more common models
>>>> for revision control within large companies with thousands of engineers.
>>>> Typically they all work off mainline.
>>>>
>>>
>>> Actually, I don't see why git shouldn't be perfectly capable of handling a
>>> repo containing several terabytes of data, provided you don't expect it to
>>> turn up the full history for the project in a couple of seconds and you
>>> don't actually *change* that amount of data in each revision. If you want a
>>> vcs that handles that amount with any kind of speed, I think you'll find
>>> rsync and raw rvs a suitable solution.
>>
>> actually, there are some real problems in this area. the git pack format can't
>> be larger then 4G, and I wouldn't be surprised if there were other issues with
>> files larger then 4G (these all boil down to 32 bit limits). once these limits
>> are dealt with then you will be right.
>
> There is no such limit on the pack format.  A pack itself can be as
> large as you want.  The 4G limit is in the tool not the format.
>
> The actual pack limits are as follows:
>
> 	- a pack can have infinite size
>
> 	- a pack cannot have more than 4294967296 objects
>
> 	- each non-delta objects can be of infinite size
>
> 	- delta objects can be of infinite size themselves but...
>
> 	- current delta encoding can use base objects no larger than 4G
>
> The _code_ is currently limited to 4G though, especially on 32-bit
> architectures.  The delta issue could be resolved in a backward
> compatible way but it hasn't been formalized yet.
>
> The pack index is actually limited to 32-bits meaning it can cope with
> packs no larger than 4G.  But the pack index is a local matter and not
> part of the protocol so this is not a big issue to define a new index
> format and automatically convert existing indexes at that point.

the offset within a pack for the starting location of an object cannot be larger
than 4G.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 17:04                                               ` David Lang
@ 2006-10-26 17:16                                                 ` Linus Torvalds
  2006-10-26 17:24                                                 ` Nicolas Pitre
  1 sibling, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-10-26 17:16 UTC (permalink / raw)
  To: David Lang
  Cc: Nicolas Pitre, Andreas Ericsson, David Rientjes, Jeff King,
	Lachlan Patrick, bazaar-ng, git



On Thu, 26 Oct 2006, David Lang wrote:
> 
> the offset within a pack for the starting location of an object cannot be
> larger then 4G.

Well, strictly speaking, even that isn't actually a limit on the _pack_ 
format itself.  It's really just the (totally separate) index that 
currently uses 32-bit offsets.

For example, you can actually use the pack-file to transfer more than 4GB 
of data over the network. You'd not need to change the format at all. Only 
the local _index_ of the result needs to change - but we never transfer 
that at all (it's always generated locally), so that's really a separate 
issue.
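
To make the pack-versus-index distinction concrete, here is a minimal Python
sketch. It assumes, purely for illustration, that an index entry stores each
object's starting offset in a 4-byte big-endian unsigned field; the helper
names are hypothetical and this is not git's actual index code.

```python
import struct

# Largest value an unsigned 32-bit field can hold (one byte short of 4G).
MAX_OFFSET_32 = 2**32 - 1


def encode_offset_32(offset):
    """Pack an object's starting offset into a hypothetical 4-byte index field."""
    return struct.pack(">I", offset)


def fits_in_32_bit_index(offset):
    """Return True if the offset can be stored in a 32-bit index entry."""
    try:
        encode_offset_32(offset)
        return True
    except struct.error:
        # struct refuses values outside 0..2**32-1 for the "I" format.
        return False
```

Nothing here limits the pack itself: only the fixed-width field in the locally
generated index runs out of room, which is why a wider index field would be
enough.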

It's not even hard to fix. It's just that right now, the biggest 
repository that we know about (mozilla) is not even close to the limit. 
And it took them ten years to get there. So if the mozilla people switch 
to git, and keep going at the same rate, we have about 70 years left 
before we need to fix the indexing ;)

(Of course, other projects, like the kernel, seem to grow faster, so it 
might be "only" a decade or two - but since the index format is a local 
thing, even that won't be too painful, since we don't really need a global 
flag-day once we decide to start supporting larger offsets in the index)
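
The back-of-the-envelope numbers above can be spelled out as a small Python
sketch. It assumes strictly linear growth, which is only a guess, and the
function names are made up for illustration.

```python
LIMIT_BYTES = 2**32  # 4G ceiling imposed by 32-bit index offsets


def implied_current_size(years_so_far, years_left, limit=LIMIT_BYTES):
    """Size today, if linear growth reaches `limit` after `years_left` more years."""
    return limit * years_so_far / (years_so_far + years_left)


def years_until_limit(current_size, years_so_far, limit=LIMIT_BYTES):
    """Years remaining before `limit` is hit at the average growth rate so far."""
    rate = current_size / years_so_far  # bytes per year
    return (limit - current_size) / rate
```

With ten years of history and about seventy years of headroom,
`implied_current_size(10, 70)` comes out at one eighth of the limit, i.e. on
the order of half a gigabyte under this linear assumption.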


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 17:04                                               ` David Lang
  2006-10-26 17:16                                                 ` Linus Torvalds
@ 2006-10-26 17:24                                                 ` Nicolas Pitre
  1 sibling, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-26 17:24 UTC (permalink / raw)
  To: David Lang
  Cc: Andreas Ericsson, David Rientjes, Jeff King, Linus Torvalds,
	Lachlan Patrick, bazaar-ng, git

On Thu, 26 Oct 2006, David Lang wrote:

> On Thu, 26 Oct 2006, Nicolas Pitre wrote:
> 
> > The pack index is actually limited to 32-bits meaning it can cope with
> > packs no larger than 4G.
> 
> the offset within a pack for the starting location of an object cannot be
> larger then 4G.

To be more exact, yes.  But I don't think we'll ever consider use 
scenarios with packs > 4G with the current index format.  There is 
simply no point.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 17:03                                             ` Nicolas Pitre
  2006-10-26 17:04                                               ` David Lang
@ 2006-10-26 17:45                                               ` Jakub Narebski
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-26 17:45 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Nicolas Pitre wrote:

> On Thu, 26 Oct 2006, David Lang wrote:
> 
>> actually, there are some real problems in this area. the git pack format can't
>> be larger then 4G, and I wouldn't be surprised if there were other issues with
>> files larger then 4G (these all boil down to 32 bit limits). once these limits
>> are dealt with then you will be right.
> 
> There is no such limit on the pack format.  A pack itself can be as 
> large as you want.  The 4G limit is in the tool not the format.
[...]
> The _code_ is currently limited to 4G though, especially on 32-bit 
> architectures.  The delta issue could be resolved in a backward 
> compatible way but it hasn't been formalized yet.
> 
> The pack index is actually limited to 32-bits meaning it can cope with 
> packs no larger than 4G.  But the pack index is a local matter and not 
> part of the protocol so this is not a big issue to define a new index 
> format and automatically convert existing indexes at that point.

If I remember correctly, those issues are being worked on:
1. There is work on a 64-bit index.
2. There is work that would allow having multiple packs, repacking only one
   of them and treating the rest as 'archive packs' (which can be more
   aggressively packed). This solution splits a pack into multiple packs.
3. There is work on mmapping only part of a pack, which would avoid the 4G limit
   even on 32-bit machines, if I understand it correctly.
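
The idea behind item 3 can be sketched in Python as follows. This is only an
illustration of windowed mapping in general, not the code being worked on for
git; the function name and the default window size are assumptions.

```python
import mmap


def read_at(path, offset, length, window=1 << 20):
    """Read `length` bytes at `offset` while mapping only a window of the file."""
    if length == 0:
        return b""
    gran = mmap.ALLOCATIONGRANULARITY
    start = (offset // gran) * gran  # mmap offsets must be aligned
    delta = offset - start           # position of the data inside the window
    with open(path, "rb") as f:
        f.seek(0, 2)
        size = f.tell()
        if offset < 0 or offset + length > size:
            raise ValueError("requested range lies outside the file")
        # Map at least the requested bytes, at most up to the end of the file.
        map_len = min(max(window, delta + length), size - start)
        with mmap.mmap(f.fileno(), map_len,
                       access=mmap.ACCESS_READ, offset=start) as m:
            return m[delta:delta + length]
```

Because only `map_len` bytes are mapped at any time, the address space needed
stays small no matter how large the file is, which is the property that
matters on a 32-bit machine.
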
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:54                                                                                   ` Nicholas Allen
  2006-10-26 12:13                                                                                     ` Jakub Narebski
@ 2006-10-26 21:25                                                                                     ` Jeff King
  1 sibling, 0 replies; 806+ messages in thread
From: Jeff King @ 2006-10-26 21:25 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git

On Thu, Oct 26, 2006 at 01:54:38PM +0200, Nicholas Allen wrote:

> I would have thought that supports renames would also involve flagging a 
> conflict when merging a file that has been renamed on 2 separate 
> branches. ie 2 branches rename the file to different names and then one 
> branch is merged into the other. In this situation, the user should be 
> told of a rename conflict. Bzr supports this as far as I know. Not sure 
> about git though as I have never used it.

It works as you expect:

$ git-init-db
$ touch foo
$ git-add foo
$ git-commit -m foo
Committing initial tree 4d5fcadc293a348e88f777dc0920f11e7d71441c
$ git-checkout -b other
$ git-mv foo bar
$ git-commit -m bar
$ git-checkout master
$ git-mv foo baz
$ git-commit -m baz
$ git-pull . other
Trying really trivial in-index merge...
fatal: Merge requires file-level merging
Nope.
Merging HEAD with 5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e
Merging:
76ac76ee3ce890d43648ebc009d278dc81a327e0 baz
5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e bar
found 1 common ancestor(s):
c9e7e95de6fdbb2af06ea44cc60d1ac1a63eaad6 foo
CONFLICT (rename/rename): Rename foo->baz in branch HEAD rename foo->bar
in 5a1dfd32c56a24d0ef06f0e71d731fcd49d5dc6e
Automatic merge failed; fix conflicts and then commit the result.
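
As a toy model of what that CONFLICT line reports, here is a short Python
sketch. It assumes the renames on each side are already known as simple
path-to-path mappings; git itself works out renames from content similarity,
so this is not its merge code and the function name is hypothetical.

```python
def rename_conflicts(ours, theirs):
    """Find paths that both branches renamed, but to different names.

    `ours` and `theirs` map an ancestor path to its new path on each branch.
    Returns a sorted list of (ancestor_path, our_name, their_name) tuples.
    """
    conflicts = []
    for src in sorted(ours.keys() & theirs.keys()):
        if ours[src] != theirs[src]:
            conflicts.append((src, ours[src], theirs[src]))
    return conflicts
```

For the session above, `rename_conflicts({"foo": "baz"}, {"foo": "bar"})`
reports a single conflict for `foo`, while two branches that rename a file to
the same name produce none.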


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-26 11:48                                                                                 ` Jakub Narebski
  2006-10-26 11:54                                                                                   ` Nicholas Allen
@ 2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-27  2:08                                                                                     ` Petr Baudis
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  1 sibling, 2 replies; 806+ messages in thread
From: Horst H. von Brand @ 2006-10-27  2:02 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski <jnareb@gmail.com> wrote:

[...]

> I'd rather split "Supports Renames" into engine part (does SCM
> remember/detect that rename took place _as_ rename, not remember/detect it
> as copiying+deletion; something other than rename) and user interface part:
> can user easily deal with renames (this includes merging and viewing file
> history).

I think that what the tool does in its guts is completely irrelevant, what
is important is what the user sees. Sadly, it seems hard to describe
exactly what is meant/wanted here.

[...]

> 7. Checkouts (as a noun). This probably read "Support Centralized and
> Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
> have "No" for "Centralized"

Why? We could all agree that some repository is "central" and all push/pull
there. Or send patches by mail (or apply them via ssh). Sure, it's not CVS,
but...

[...]

> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
> or "?" background color for Git. And add note that it is easy to script up
> porcelanish command, and to add another merge strategy. There also was
> example plugin infrastructure for Cogito, so I'd opt for "Someahwt"
> marking.

Mostly an implementation detail for "extensible"...

[...]

> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> easy to use, but I have not much experiences with other SCM. I wonder why
> Bazaar has "No" there...

Extremely subjective. Easy to learn doesn't cut it either.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
@ 2006-10-27  2:08                                                                                     ` Petr Baudis
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  1 sibling, 0 replies; 806+ messages in thread
From: Petr Baudis @ 2006-10-27  2:08 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: bazaar-ng, git, Jakub Narebski

Dear diary, on Fri, Oct 27, 2006 at 04:02:32AM CEST, I got a letter
where "Horst H. von Brand" <vonbrand@inf.utfsm.cl> said that...
> Jakub Narebski <jnareb@gmail.com> wrote:
> > 7. Checkouts (as a noun). This probably read "Support Centralized and
> > Disconnected Centralized Workflow" but that is perhaps too wordy. Git would
> > have "No" for "Centralized"
> 
> Why? We could all agree that some repository is "central" and all push/pull
> there. Or send patches by mail (or apply them via ssh). Sure, it's not CVS,
> but...

An ability to configure the tool so that the centralized workflow is
_enforced_ may be important for managers. It's stupid, but it's what is
meant there, I think.

> > 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> > easy to use, but I have not much experiences with other SCM. I wonder why
> > Bazaar has "No" there...
> 
> Extremely subjective. Easy to learn doesn't cut it either.

I don't think this column makes sense at all. I swear I've seen
*several* people that claimed GNU Arch was easy to learn/use for them!



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-21 21:04                       ` Linus Torvalds
  2006-10-21 23:58                         ` Linus Torvalds
  2006-10-22  0:09                         ` Erik Bågfors
@ 2006-10-27  4:51                         ` Jan Hudec
  2006-10-28 11:38                           ` Jakub Narebski
  2 siblings, 1 reply; 806+ messages in thread
From: Jan Hudec @ 2006-10-27  4:51 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Erik Bågfors, Matthieu Moy, bazaar-ng, Sean, git, Jakub Narebski

On Sat, Oct 21, 2006 at 02:04:56PM -0700, Linus Torvalds wrote:
> On Sat, 21 Oct 2006, Erik Bågfors wrote:
> > bzr is a fully decentralized VCS. I've read this thread for quite some
> > time now and I really cannot understand why people come to this
> > conclusion.
> 
> Even the bzr people agree, so what's not to understand?
> 
> The revision numbers are totally unstable in a distributed environment 
> _unless_ you use a certain work-flow. And that work-flow is definitely not 
> "distributed" it's much closer to "disconnected centralized".
> 
> Now, you could be truly distributed: BK used the same revision numbering 
> thing, but was distributed. But BK didn't even try to claim that their 
> revision numbers were "simple" and that fast-forwarding is sometimes the 
> wrong thing to do.
> 
> So BK always fast-forwarded, and the revision numbers were just randomly 
> changing numbers. They weren't stable, they weren't simple, and nobody 
> claimed they were.
> 
> So bzr can bite the bullet and say: "revision numbers are changing and 
> meaningless, and we should just fast-forward on merges", or you should 
> just admit that bzr is really more about "disconnected operation" than 
> truly distributed.
> 
> You can't have your cake and eat it too. Truly distributed _cannot_ be 
> done with a stable dotted numbering scheme (unless the "dotted numbering 
> scheme" is just a way to show a hash like git does - so the numbering has 
> no _sequential_ meaning).
> 
> Btw, this isn't just an "opinion". This is a _fact_. It's something they 
> teach in any good introductory course to distributed algorithms. Usually 
> it's talked about in the context of "global clock". 
> 
> Anybody who thinks that there exists a globally ticking clock in the 
> system (and stably increasing dotted numbers are just one such thing) is 
> talking about some fantasy-world that doesn't exist, or a world that has 
> nothing to do with "distributed".
> 
> 			Linus

Actually bzr used to have a slightly different numbering scheme not long
ago. There was a revision-history in each branch listing the revisions
in the order in which they were committed or merged in. Some time ago it was
changed to numbering along the leftmost parent, which was, IIRC, deemed
simpler and a little more logical. But in the light of these arguments,
maybe the former system was better -- it was more dependent on the
actual location, but on the other hand it allowed (or could allow --
IIRC there was some problem with it) fast-forward merges while
_locally_ keeping the meaning of old revision numbers. In fact, the
revision-history used to be almost exactly the same as git reflog,
except it only stored the revids, not the times.
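
A small Python sketch of numbering along the leftmost parent shows why such
numbers are only meaningful per branch. The commit names and the example
graph are invented, and this is a simplified model rather than bzr's actual
implementation.

```python
def first_parent_revnos(parents, tip):
    """Number commits 1..n along the first-parent chain that ends at `tip`."""
    chain = []
    commit = tip
    while commit is not None:
        chain.append(commit)
        ps = parents[commit]
        commit = ps[0] if ps else None  # follow only the leftmost parent
    chain.reverse()
    return {c: i for i, c in enumerate(chain, start=1)}


# Hypothetical history: two people branch from "root", then merge each other.
EXAMPLE_PARENTS = {
    "root": [],
    "a": ["root"],            # Alice's commit
    "b": ["root"],            # Bob's commit
    "merge_alice": ["a", "b"],  # Alice merges Bob: her side is leftmost
    "merge_bob": ["b", "a"],    # Bob merges Alice: his side is leftmost
}
```

In this model revision 2 is `a` when counted from `merge_alice` but `b` when
counted from `merge_bob`, and each branch leaves the other side's commit
without a plain number at all.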

--------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27  2:02                                                                                   ` Horst H. von Brand
  2006-10-27  2:08                                                                                     ` Petr Baudis
@ 2006-10-27  9:34                                                                                     ` Andreas Ericsson
  2006-10-27 10:49                                                                                       ` Jakub Narebski
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
  1 sibling, 2 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-27  9:34 UTC (permalink / raw)
  To: Horst H. von Brand; +Cc: Jakub Narebski, git, bazaar-ng

Horst H. von Brand wrote:
> Jakub Narebski <jnareb@gmail.com> wrote:
> 
> [...]
> 
>> I'd rather split "Supports Renames" into engine part (does SCM
>> remember/detect that rename took place _as_ rename, not remember/detect it
>> as copiying+deletion; something other than rename) and user interface part:
>> can user easily deal with renames (this includes merging and viewing file
>> history).
> 
> I think that what to tool does in its guts is completely irrelevant, what
> is important is what the user sees. Sadly, it seems hard to describe
> exactly what is meant/wanted here.
> 

Agreed. I'd rather make the definition "Can users, after a rename has 
taken place, follow the history of the file-contents across renames?". 
Mainly because this is clearly unambiguous, doesn't involve 
implementation details and only weighs what really counts: User-visible 
capabilities.

IMNSHO, I'd rather have all the features in the list be along the lines 
of "Can users/admins/random-boon do X?" and instead of "yes/no" list the 
number of commands/the amount of time required to achieve the desired 
effect. This would set a clear limit and put most terminology issues out 
of the way.

> 
>> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
>> or "?" background color for Git. And add note that it is easy to script up
>> porcelanish command, and to add another merge strategy. There also was
>> example plugin infrastructure for Cogito, so I'd opt for "Someahwt"
>> marking.
> 
> Mostly an implementation detail for "extensible"...
> 

Yup. Any fast-growing SCM can clearly be said to be "extensible", 
otherwise it wouldn't be extended ;-)

> [...]
> 
>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>> easy to use, but I have not much experiences with other SCM. I wonder why
>> Bazaar has "No" there...
> 
> Extremely subjective. Easy to learn doesn't cut it either.

This one just needs to go. Could possibly be replaced with "Has 
tutorial/documentation online" or some such. No SCM is really intuitive 
to users that haven't experienced any of them before, so the only thing 
that really matters is how much documentation one can find online and 
how up-to-date it is.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
@ 2006-10-27 10:49                                                                                       ` Jakub Narebski
  2006-10-27 11:41                                                                                         ` Andreas Ericsson
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-27 10:49 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Horst H. von Brand, git, bazaar-ng

On 10/27/06, Andreas Ericsson <ae@op5.se> wrote:
> Horst H. von Brand wrote:
>> Jakub Narebski <jnareb@gmail.com> wrote:
>>
>> [...]
>>
>>> I'd rather split "Supports Renames" into engine part (does SCM
>>> remember/detect that rename took place _as_ rename, not remember/detect
>>> it as copiying+deletion; something other than rename) and user interface
>>> part: can user easily deal with renames (this includes merging and
>>> viewing file history).
>>
>> I think that what to tool does in its guts is completely irrelevant, what
>> is important is what the user sees. Sadly, it seems hard to describe
>> exactly what is meant/wanted here.
>
> Agreed. I'd rather make the definition "Can users, after a rename has
> taken place, follow the history of the file-contents across renames?".
> Mainly because this is clearly unambiguous, doesn't involve
> implementation details and only weighs what really counts: User-visible
> capabilities.

With this definition (for this part) it would be "Somewhat" for Git, because
the user can track the history of file-contents across renames, but some
additional steps are required... until --follow=<pathname> gets implemented,
that is. Yet "tracking file-contents across renames" depends on the specific
workflow used; for example, with Git you usually track [some part of] the
history of some subpart of a project, not the history of a single file.
(I'd name it "History Rename Support" or "Log Rename Support").

But equally important for the user is another question related to
"Supporting Renames". Namely, detection of renames during merge and detection
of conflicts during merge is what I would consider minimal "Merge Renames
Support". Causing information to be lost is having no "Merge Renames Support".
To have "Yes" in this column, an SCM has to resolve the conflict at least in
obvious cases, and "Yes!" if it can remember the resolution of a merge
conflict involving renames ;-).

> IMNSHO, I'd rather have all the features in the list be along the lines
> of "Can users/admins/random-boon do X?" and instead of "yes/no" list the
> number of commands/the amount of time required to achieve the desired
> effect. This would set a clear limit and put most terminology issues out
> of the way.

This would make the comparison table less clear, unfortunately.

>>> 13. Plugins. I would put "Somewhat" here, or "Scriptable" in the "Somewhat"
>>> or "?" background color for Git. And add note that it is easy to script up
>>> porcelanish command, and to add another merge strategy. There also was
>>> example plugin infrastructure for Cogito, so I'd opt for "Someahwt"
>>> marking.
>>
>> Mostly an implementation detail for "extensible"...
>>
>
> Yup. Any fast-growing SCM can clearly be said to be "extensible",
> otherwise it wouldn't be extended ;-)

I'd put "Easily Extensible" here, and put "Plugins (core+UI)" for Bazaar-NG,
and "Scriptable (UI+merge)" for Git, or something like that.

>> [...]
>>
>>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>>> easy to use, but I have not much experiences with other SCM. I wonder why
>>> Bazaar has "No" there...
>>
>> Extremely subjective. Easy to learn doesn't cut it either.
>
> This one just needs to go. Could possibly be replaced with "Has
> tutorial/documentation online" or some such. No SCM is really intuitive
> to users that haven't experienced any of them before, so the only thing
> that really matters is how much documentation one can find online and
> how up-to-date it is.

For example, an SCM can be easy to use, but at the cost of simplifications
and limited usefulness.

On the other hand, the basic concepts behind some SCMs might be more
or less understandable...
-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27 10:49                                                                                       ` Jakub Narebski
@ 2006-10-27 11:41                                                                                         ` Andreas Ericsson
  0 siblings, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-10-27 11:41 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Horst H. von Brand, git, bazaar-ng

Jakub Narebski wrote:
> On 10/27/06, Andreas Ericsson <ae@op5.se> wrote:
>> Horst H. von Brand wrote:
>>> Jakub Narebski <jnareb@gmail.com> wrote:
>>>
>>> [...]
>>>
>>>> I'd rather split "Supports Renames" into engine part (does SCM
>>>> remember/detect that rename took place _as_ rename, not remember/detect
>>>> it as copiying+deletion; something other than rename) and user interface
>>>> part: can user easily deal with renames (this includes merging and
>>>> viewing file history).
>>>
>>> I think that what to tool does in its guts is completely irrelevant, what
>>> is important is what the user sees. Sadly, it seems hard to describe
>>> exactly what is meant/wanted here.
>>
>> Agreed. I'd rather make the definition "Can users, after a rename has
>> taken place, follow the history of the file-contents across renames?".
>> Mainly because this is clearly unambiguous, doesn't involve
>> implementation details and only weighs what really counts: User-visible
>> capabilities.
> 

[...]

> But equally important for user is another question related to
> "Supporting Renames".
> Namely detection of renames during merge and detection of conflict during
> merge is what I would consider minimal "Merge Renames Support". Causing
> information to be lost is having no "Merge Renames Support". To have "Yes"
> in this column SCM have to resolve conflict at least in obvious cases, and
> "Yes!" if it can remember resolution of merge conflict involving renames ;-).
> 

True.

>> IMNSHO, I'd rather have all the features in the list be along the lines
>> of "Can users/admins/random-boon do X?" and instead of "yes/no" list the
>> number of commands/the amount of time required to achieve the desired
>> effect. This would set a clear limit and put most terminology issues out
>> of the way.
> 
> This would make the comparison table less clear, unfortunately.
> 

True that. Perhaps just stick with Yes/No and have a timing table to 
compare merge times, multi-parent merge times and stuff like that.

> 
>>> [...]
>>>
>>>> 19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
>>>> easy to use, but I have not much experiences with other SCM. I wonder why
>>>> Bazaar has "No" there...
>>>
>>> Extremely subjective. Easy to learn doesn't cut it either.
>>
>> This one just needs to go. Could possibly be replaced with "Has
>> tutorial/documentation online" or some such. No SCM is really intuitive
>> to users that haven't experienced any of them before, so the only thing
>> that really matters is how much documentation one can find online and
>> how up-to-date it is.
> 
> For example SCM can be easy to use but at the cost of simplifications
> and limited useness.
> 
> On the other side basic concept behind some SCM might be more
> or less understandable...

Yes, but it will always be based on personal opinion and that's why it 
can never be measured in an unbiased way. It would be like playing 
Trivial Pursuit and getting the question "Which 20th-century author
wrote the best books?". There are actually two problems with that
question, but the important one is that it can't be answered correctly 
in this wonderful world we live in where everyone has their own opinion.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27  9:34                                                                                     ` Andreas Ericsson
  2006-10-27 10:49                                                                                       ` Jakub Narebski
@ 2006-10-27 14:46                                                                                       ` J. Bruce Fields
  2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
  1 sibling, 1 reply; 806+ messages in thread
From: J. Bruce Fields @ 2006-10-27 14:46 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Horst H. von Brand, Jakub Narebski, git, bazaar-ng

On Fri, Oct 27, 2006 at 11:34:09AM +0200, Andreas Ericsson wrote:
> Horst H. von Brand wrote:
> >Jakub Narebski <jnareb@gmail.com> wrote:
> >>19. Ease of Use. Hmmm... I don't know for Git. I personally find it very
> >>easy to use, but I have not much experiences with other SCM. I wonder why
> >>Bazaar has "No" there...
> >
> >Extremely subjective. Easy to learn doesn't cut it either.
> 
> This one just needs to go.

It's certainly a hard question to answer, and will never be answered
completely, but unfortunately it's also a really *important* question.
The best SCM in the world isn't much use if I can't convince my
coworkers to learn the thing.

So I think it's helpful to attempt to find out whether we have a problem
here or not, even if the problem is more one of perception than reality.
Though obviously it would be more helpful to have something more
detailed than just a yes or no answer to "is git easy to use?"

> Could possibly be replaced with "Has tutorial/documentation online" or
> some such. No SCM is really intuitive to users that haven't
> experienced any of them before, so the only thing that really matters
> is how much documentation one can find online and how up-to-date it
> is.

Documentation helps, though sometimes extensive documentation is a sign
of a problem--it takes a lot more documentation to explain how to manage
a branch in CVS than it does in any sensible system....


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27 14:46                                                                                       ` J. Bruce Fields
@ 2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Ilpo Nyyssönen @ 2006-10-28 11:18 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

"J. Bruce Fields" <bfields@fieldses.org> writes:

> Documentation helps, though sometimes extensive documentation is a sign
> of a problem--it takes a lot more documentation to explain how to manage
> a branch in CVS than it does in any sensible system....

Usability:

I have used bzr and bk for development, and git a little for following
kernel development. I have followed this discussion quite closely.

1. It is easier to start using something you are already familiar
with. (Just try to use Mac OS X with a Windows or Linux background.)

G: Something totally new and so no points from here. The way of using
git is just so different from any other similar software.

B: Quite clearly gets points from this. Normal branches work quite
like many other software, the checkout stuff works like CVS and SVN.

2. Finding commands.

G: Quite a large number of commands, some clear, but some not so. With all
the installed commands, it is even more confusing. What's the
difference between fetch and pull, and which one should I use? Same for
clone and branch.

B: A bit clearer I think, but pull and merge do cause confusion.
Also the checkout stuff could be better shown in the command line
help. With plugins like bzrtools the number of commands rises and
confusion increases. Maybe better separation for plugin commands in
the command line help?

3. Understanding output

G: Speaks a language of its own, hard to understand. No progress
reported for long-running operations.

B: Could maybe speak a bit more. Progress reporting is quite good.

4. Misc stuff

G: You have only one workspace and this forces you to use git more or
to make several repositories. You can't just diff branchA/foo
branchB/foo. You can't just open file from old branch to check
something while you are developing in some new branch. Do I have to
commit my changes before changing a branch in the workspace?

G: What is this git repack thing and do I have to use it? If yes, why?
Nobody told me that I should run it, but I did notice Linus mentioning
it somewhere. Definitely harmful to usability.

B: People might misuse the revnos and so be confused when things don't
work like they expected.

Conclusion: I would say that Bazaar is more usable than git.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-27  4:51                         ` Jan Hudec
@ 2006-10-28 11:38                           ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-28 11:38 UTC (permalink / raw)
  To: Jan Hudec
  Cc: Linus Torvalds, Erik B?gfors, Matthieu Moy, bazaar-ng, Sean, git

Jan Hudec wrote:
> Actually bzr used to have slightly different numbering scheme not long
> ago. There was a revision-history in each branch listing the revisions
> in order in which they were commited or merged in. Some time ago it was
> changed to numbering along the leftmost parent, which was, IIRC, deemed
> simpler and a little more logical. But in the light of these arguments,
> maybe the former system was better -- it was more dependent on the
> actual location, but on the other hand it allowed (or could allow --
> IIRC there was some problem with it) to fast-forward merge while
> _locally_ keeping the meaning of old revision numbers. In fact, the
> revision-history used to be almost exactly the same as git reflog,
> except it only stored the revids, not the times.

That works very well as long as you don't modify the history (amending
commits, rewinding history to an earlier point, rebasing the branch, or
merging a branch in and starting it anew, aka. the dovetail approach if I
remember correctly), and as long as you are not concerned with performance
when fetching a larger number of commits into a branch (as you have to
assign a number to each of them).

That was perhaps why bzr changed from a revno log to the leftmost/first
parent as the way to keep branch-as-path and assign revision numbers to
revisions. That has its own disadvantages, as enumerated multiple times
here on the list.
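
A minimal sketch of numbering along the leftmost parent, as a toy model in
Python and not bzr's actual implementation. The revision ids and the
`revnos` helper are made up for illustration. It shows how the number
attached to a local revision can change once the branch tip is replaced by
somebody else's merge of your work.

```python
def revnos(parents, tip):
    """Number revisions by walking first (leftmost) parents from tip.

    parents maps a revision id to its list of parent ids,
    leftmost parent first. Returns {revision id: revno}.
    """
    chain = []
    rev = tip
    while rev is not None:
        chain.append(rev)
        ps = parents[rev]
        rev = ps[0] if ps else None
    chain.reverse()  # oldest revision first
    return {r: i + 1 for i, r in enumerate(chain)}


# Hypothetical history: root "a"; I commit "b" and "c" locally,
# upstream commits "x" on top of "a" and then merges my "c" as "m".
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],
    "m": ["x", "c"],  # leftmost parent is upstream's own line
}

mine = revnos(parents, "c")
after_pull = revnos(parents, "m")

print(mine)        # {'a': 1, 'b': 2, 'c': 3}
print(after_pull)  # {'a': 1, 'x': 2, 'm': 3}
```

In this toy model, once my branch fast-forwards to "m", revno 3 no longer
names "c", and "b" and "c" are not on the first-parent line at all.
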
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
@ 2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
                                                                                                               ` (3 more replies)
  0 siblings, 4 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-28 13:53 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Ilpo Nyyssönen wrote:

> "J. Bruce Fields" <bfields@fieldses.org> writes:
> 
>> Documentation helps, though sometimes extensive documentation is a sign
>> of a problem--it takes a lot more documentation to explain how to manage
>> a branch in CVS than it does in any sensible system....
> 
> Usability:
> 
> I have used bzr, bk for development and git very little for following
> kernel development. I have followed this discussion quite well.
> 
> 1. It is easier to start using something you are already familiar
> with. (Just try to use Mac OS X with a Windows or Linux background.)
> 
> G: Something totally new and so no points from here. The way of using
> git is just so different from any other similar software.
> 
> B: Quite clearly gets points from this. Normal branches work quite
> like many other software, the checkout stuff works like CVS and SVN.

I find, for example, the concept of branches in Git extremely easy to
understand. A Bazaar-NG "branch" is a mixture of a Git branch and a Git
repository (or clone of a repository). In bzr, "branch" refers to three
things at once: the abstract SCM concept of the part of the DAG of
revisions reachable from a given revision/head/tip (a git branch is very
close to this); the distinct abstract SCM concept of a branch as "your"
line of development, i.e. a path in the DAG of revisions starting at a
given revision/head/tip and ending at an initial/parentless revision; and
the physical representation: working area, metainformation, and storage or
a pointer to storage (when branches share storage, forming a so-called bzr
"repository").

About checkout: Bazaar mixes here the "CVS checkout" model of the "bzr
checkout" command with the SCM concept of checking out, i.e. getting files
from the repository (or branch in bzr) into the working area.

On the other hand, breaking with the traditional concepts of _centralized_
SCM in a _distributed_ SCM (one geared towards distributed usage) is IMVHO
a good idea. And breaking with the cruft of bad ideas from CVS is a very
good idea.

But I agree that in Git some terminology (and the names of some commands)
could be better. Some of it stems from the BitKeeper background, some from
the way Git was created: bottom-up, from repository layout to fully (or
not ;-) fledged SCM. For example "pull" as "fetch + merge" is IIRC a
BitKeeper legacy, while the bottom-up development explains why the "merge"
command is a low-level (or mid-level) command, fairly poorly usable for
the user (who should use "pull ." for merging from a local branch).

> 2. Finding commands.
> 
> G: Quite big amount of commands, some clear, but some not so. With all
> the installed commands, it is even more confusing. What's the
> difference between fetch and pull and which one I should use? Same for
> clone and branch.
>
> B: A bit clearer I think, but the pull and merge does cause confusion. 
> Also the checkout stuff could be better shown in the command line
> help. With plugins like bzrtools the amount of command raises and
> confusion increases. Maybe better separation for plugin commands in
> the command line help?

In the Git Users Survey (http://git.or.cz/gitwiki/GitSurvey), "too many
commands" was the most common answer to question 6, "What did you find
hardest?" (the survey was based on the Mercurial survey:
http://www.selenic.com/mercurial/wiki/index.cgi/UserSurvey). It would
perhaps be better for Git to clearly divide commands into porcelain (for
the end user), admin (whole-repository level) and plumbing (for use in
scripts).

But, for example, the git(7) man page lists git commands clearly divided
between low-level commands (plumbing): manipulation commands, interrogation
commands, synching commands; and high-level commands (porcelain): main
commands, ancillary commands. Both "git help" and "git --help" show the
most commonly used git commands with a short description of each command
("git help -a" shows all commands).
 
I can understand the confusion between "git pull" and "git fetch"; it is
addressed in the documentation. Although I think the confusion between
"bzr merge" and "bzr pull" is as great, if not greater.

I don't understand the confusion between the "git branch" and "git clone"
commands... unless you are confused by the Bazaar-NG branch-centric
approach, which mixes branch with repository.

> 3. Understanding output
> 
> G: Speaks a language of its own, hard to understand. No progress
> reported for long lasting operations.
> 
> B: Could maybe speak a bit more. Progress reporting is quite good.

Which long lasting operations lack a progress bar/progress reporting?
"git clone" and "git fetch"/"git pull" both have progress reports
for the "smart" git://, git+ssh:// and local protocols as well as for the
"dumb" http://, https://, ftp://, rsync:// protocols. "git rebase" has a
progress report. "git am" has a progress report.

I agree that Git tends to speak in its own jargon, but this jargon is very
clear once you are familiar with Git. BTW, some of the worst offenders,
like <ent> (== <tree-ish>), have already been removed from the
documentation.

> 4. Misc stuff
> 
> G: You have only one workspace and this forces you to use git more or
> to make several repositories. 

This is your confusion stemming from Bazaar-NG branch-centricness. In Git
the working area is associated with a repository, not with a branch as in
bzr. Usually you have the repository embedded in the working area, in the
.git directory at the top level of the working area. The fact that you have
only one index (but you can specify an alternate index, or switch between
index files), and only one current branch marker, namely HEAD (you can
switch HEAD to another branch; if I remember correctly there is no other
way to specify the current head), makes working with multiple working areas
tied to one repository more difficult. But that is usually not necessary
in Git.

In Bazaar-NG a "repository" is just shared storage for "branches"; in Git
you can share the storage between repositories (although it is not the
default mode), or share common old history between repositories (more
common).

> You can't just diff branchA/foo branchB/foo.

You can: either using "git diff branchA branchB -- foo" which means
difference between branches branchA and branchB limited to the differences
on branch foo (where foo can be directory name or filename), or via
"extended SHA1 reference" using "git diff branchA:foo branchB:foo" which
means compare file/directory "foo" at revision "branchA" and file/directory
"foo" at revision "branchB".
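
To illustrate what a comparison limited to a path means, here is a toy
model in Python. Snapshots are modeled as plain dicts from path to content,
which is not how Git stores trees, and `changed_paths` is a hypothetical
helper, not a Git API. The first call corresponds loosely to a diff of two
revisions, the second to the same diff limited to "foo".

```python
def changed_paths(tree_a, tree_b, pathspec=""):
    """Paths whose content differs between two snapshots.

    If pathspec is given, only that file, or paths below that
    directory, are considered.
    """
    def selected(path):
        return (not pathspec
                or path == pathspec
                or path.startswith(pathspec.rstrip("/") + "/"))

    paths = sorted(set(tree_a) | set(tree_b))
    return [p for p in paths
            if selected(p) and tree_a.get(p) != tree_b.get(p)]


# Hypothetical snapshots of two branches.
branch_a = {"foo/main.c": "v1", "foo/util.c": "same", "README": "old"}
branch_b = {"foo/main.c": "v2", "foo/util.c": "same", "README": "new"}

print(changed_paths(branch_a, branch_b))         # ['README', 'foo/main.c']
print(changed_paths(branch_a, branch_b, "foo"))  # ['foo/main.c']
```

The "revision:path" form differs in that it names the two objects to
compare directly, rather than filtering a whole-tree comparison.
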

You can even diff two different _repositories_, if they are on the same
local filesystem, using the pasky trick described in
http://git.or.cz/gitwiki/GitTips.

> You can't just open file from old branch to check 
> something while you are developing in some new branch.

You can view file from old branch via "git cat-file -p old-branch:file".

> Do I have to commit my changes before changing a branch
> in the workspace? 

You have to. But we have "git commit --amend", so if I need to do this
I usually do "git commit -m 'TEMPORARY COMMIT'" before switching to the
other branch. Or you can save the differences between the working area and
the current branch to a patch file. The "git-checkpoint" proposal addresses
that... in a rather heavy-handed fashion. There is also
"git-stash/git-unstash" floating somewhere in the git mailing list archives.
 
> G: What is this git repack thing and do I have to use it? If yes, why? 
> Nobody told me that I should run it, but I did notice Linus mentioning
> it somewhere. Definitely causing harm to usability.

Hmm... perhaps "repack -a -d" should be shown in "git help" list of commonly
used commands output.

Having two separate formats in the repository, loose (but compressed) and
packed (in one file, deltified, compressed), has the following advantages:

0. Historical. It allowed git to be released (deployed) early,
originally as a fast content tracker and not a full SCM, and to add features
based on how people used it and scripted it. It also gave the Git design the
advantage of not being tailored to or based on some storage mechanism, which
resulted in an IMHO very clean design and concepts.

1. Security (together with the format). It secures the repository against
corruption stemming from errors while saving a file, race conditions,
interrupted operations etc., although it doesn't protect against all
possible errors. That is what sold Keith on choosing Git as the SCM for
X.Org: http://keithp.com/blog/Repository_Formats_Matter.html

2. Efficiency. The packed Git format is AFAIK the densest repository
format among OSS SCMs, and it is also very fast at accessing any given
revision.

3. Net format. It allows _exactly_ the same format to be used for
transmission during clone and fetch; well, with the exception that for
"smart" protocols git can send a "thin" pack, with some deltas without
bases. There is work in progress by Nicolas Pitre and others to convert a
thin pack to a full pack without exploding it into loose objects in between.
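
As an illustration of the loose format and of the integrity argument in
point 1, here is a minimal sketch in Python. It assumes the usual loose
object layout as far as I know it: a "<type> <size>" header, a NUL byte and
the content, zlib-compressed, and named by the SHA-1 of the uncompressed
bytes. The helper names `loose_object` and `read_loose` are made up; the
pack format (deltas, index) is not modeled at all.

```python
import hashlib
import zlib


def loose_object(content: bytes, kind: str = "blob"):
    """Return (object id, compressed bytes) for a loose object."""
    header = f"{kind} {len(content)}\0".encode("ascii")
    store = header + content
    oid = hashlib.sha1(store).hexdigest()
    return oid, zlib.compress(store)


def read_loose(oid: str, data: bytes) -> bytes:
    """Decompress a loose object and verify it against its name."""
    store = zlib.decompress(data)
    if hashlib.sha1(store).hexdigest() != oid:
        raise ValueError("object does not match its id: corrupt")
    _header, _, content = store.partition(b"\0")
    return content


oid, data = loose_object(b"hello\n")
assert read_loose(oid, data) == b"hello\n"
```

Because the name is derived from the content, a damaged or truncated file
is detected when read back: either decompression fails or the hash no
longer matches the name.
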


The suggestion that SCMs based on Git, or Git porcelains (like Cogito),
should repack automatically appears quite frequently. The latest work on
repack options, which allows not only packing just the loose objects or
repacking everything, but also repacking a given pack or repacking with the
exception of some archive packs, should help with that solution.

> B: People might misuse the revnos and so be confused when things don't
> work as they expected.

Revnos work only with very specific workflows.

> Conclusion: I would say that Bazaar is more usable than git.

Conclusion: I would say that Git is more usable than Bazaar.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
@ 2006-10-28 14:58                                                                                             ` Jakub Narebski
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
                                                                                                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-28 14:58 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Jakub Narebski wrote:

>> You can't just diff branchA/foo branchB/foo.
> 
> You can: either using "git diff branchA branchB -- foo" which means
> difference between branches branchA and branchB limited to the differences
> on branch foo (where foo can be directory name or filename)

Sorry, it should be:

"limited to the differences on pathname foo (where foo can be directory name
or filename)"


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
@ 2006-10-28 22:18                                                                                             ` Robin Rosenberg
  2006-10-28 22:46                                                                                               ` Jakub Narebski
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
  3 siblings, 1 reply; 806+ messages in thread
From: Robin Rosenberg @ 2006-10-28 22:18 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

lördag 28 oktober 2006 15:53 skrev Jakub Narebski:
> But for example git(7) man page lists git commands clearly divided between
> low-level commands (plumbing): manipulation commands, interrogation
> commands, synching commands and high level commands (porcelain): main
> commands, ancillary commands. The "git help" and "git --help" shows the
> most commonly used git commands with short description of each command
> ("git help -a" show all commands).

I believe people tend to skim through documentation looking for pieces of 
information rather than read it from start to end. So they find themselves 
reading the plumbing documentation first. Simply reordering documentation to 
list the porcelain commands before the plumbing would make the git man page 
less scary to newcomers.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
@ 2006-10-28 22:46                                                                                               ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-28 22:46 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git, bazaar-ng

Dnia niedziela 29. października 2006 00:18, Robin Rosenberg napisał:
> lördag 28 oktober 2006 15:53 skrev Jakub Narebski:
>>
>> But for example git(7) man page lists git commands clearly divided between
>> low-level commands (plumbing): manipulation commands, interrogation
>> commands, synching commands and high level commands (porcelain): main
>> commands, ancillary commands. The "git help" and "git --help" shows the
>> most commonly used git commands with short description of each command
>> ("git help -a" show all commands).
> 
> I believe people tend to skim through documentation looking for pieces of 
> information rather than read it from start to end. So they find themselves 
> reading the plumbing documentation first. Simply reordering documentation to 
> list the porcelain commands before the plumbing would make the git man page 
> less scary to newcomers.

Good idea. Thanks.

The current ordering in the git(7) man page is probably the result of
bottom-up git development. First there were the plumbing commands (well,
first was the repository format AFAICT, but I digress...).

-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-28 13:53                                                                                           ` Jakub Narebski
  2006-10-28 14:58                                                                                             ` Jakub Narebski
  2006-10-28 22:18                                                                                             ` Robin Rosenberg
@ 2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
  2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
  3 siblings, 1 reply; 806+ messages in thread
From: Ilpo Nyyssönen @ 2006-10-29  6:54 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Jakub Narebski <jnareb@gmail.com> writes:

> Ilpo Nyyssönen wrote:
>
>> Usability:
>> 
>> I have used bzr, bk for development and git very little for following
>> kernel development. I have followed this discussion quite well.
>> 
>> 1. It is easier to start using something you are already familiar
>> with. (Just try to use Mac OS X with a Windows or Linux background.)
>> 
>> G: Something totally new and so no points from here. The way of using
>> git is just so different from any other similar software.
>> 
>> B: Quite clearly gets points from this. Normal branches work quite
>> like many other software, the checkout stuff works like CVS and SVN.
>
> I find for example concept of branches in Git extremly easy to understand.

Might be, but the point was: Git is harder because it is not like the
others. On the other hand, one can see Bazaar as being like other
distributed SCMs, and even like the non-distributed ones, as it has the
checkout stuff.

You can give Bazaar to me, a bk user, and I can understand what to do
with the branches, which are like bk clones. (The repository stuff is a
later development and still optional.) Switching a CVS environment to a
Bazaar one can be done so that most of the users can just be told to
use bzr checkout, and they don't have to care about pushing.

But with git, I clone some repository. Now it is totally new to
understand that I didn't clone only a single branch. It's like nothing
else, and that's what I saw when I first looked at it. I might not even
have noticed the branch stuff and just cloned it further.

> On the other side breaking with traditional concepts of _centralized_ SCM
> in _distributed_ SCM (and geared towards distributed usage) is IMVHO a good
> idea. And breaking with the cruft of bad ideas of CVS is very good idea.

Breaking concepts can be a good idea and I somewhat think that git
needed to do what it did. But do remember that it came with a cost:
git is harder to understand and use. You first have to understand that
it is different and how it is different.

> I don't understand the confusion between "git branch" and "git clone"
> commands... unless you are confused by Bazaar-NG branch-centric approach
> which mixes branch with repository.

Those commands do such different things in different SCMs. Just look at
the differences between bk clone, git clone, git branch and bzr branch. You
have both. At the point where I didn't yet understand that I had cloned
more than one branch, git branch was a very odd-looking command.

> Which long lasting operations lack progress bar/progress reporting?
> "git clone" and "git fetch"/"git pull" both have progress report

First, note that I didn't notice git repack until recently, so things
kept getting slower until then.

At least at some points they just tell you that they are doing something,
but not how much of it has been done and how much is still to do. Look at
Bazaar and you'll see the difference: it has progress bars.

>> G: You have only one workspace and this forces you to use git more or
>> to make several repositories. 
>
> This is your confusion stemming from Bazaar-NG branch-centricness. In Git
> working area is associated with repository, not with branch as in bzr.

Exactly my point.

>> You can't just diff branchA/foo branchB/foo.
>
> You can: either using "git diff branchA branchB -- foo" which means

Exactly my point: it forces you to use git more. In Bazaar I can do
this without Bazaar commands. I could even do it with some Windows GUI
stuff, take two files or directories and compare.

As you need to use git commands more than bzr commands, git has bigger
requirements for usability.

>> You can't just open file from old branch to check 
>> something while you are developing in some new branch.
>
> You can view file from old branch via "git cat-file -p old-branch:file".

Same thing here, in Bazaar, I can just open the file from the other
branch. I can also compile and run the other branch while I have the
other open.

Essentially I would need a separate git repository for each branch
anyway. In Bazaar I can use the same.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
@ 2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
  2006-10-30  0:10                                                                                                 ` Theodore Tso
  0 siblings, 2 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-29 12:01 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Ilpo Nyyssönen wrote:

> Jakub Narebski <jnareb@gmail.com> writes:
> 
>> Ilpo Nyyssönen wrote:
>>
>>> Usability:
>>> 
>>> I have used bzr, bk for development and git very little for following
>>> kernel development. I have followed this discussion quite well.
>>> 
>>> 1. It is easier to start using something you are already familiar
>>> with. (Just try to use Mac OS X with a Windows or Linux background.)
>>> 
>>> G: Something totally new and so no points from here. The way of using
>>> git is just so different from any other similar software.
>>> 
>>> B: Quite clearly gets points from this. Normal branches work quite
>>> like many other software, the checkout stuff works like CVS and SVN.
>>
>> I find for example concept of branches in Git extremly easy to
>> understand.
> 
> Might be, but the point was: Git is harder as it is not like others. 
> In other hand one can see Bazaar like other distributed SCMs and even
> like the not distributed ones as it has the checkout stuff.
> 
> You can give Bazaar for me, a bk user, and I can understand what to do
> with the branches that are like bk clones. (The repository stuff is
> later development and still optional.) Switching a CVS environment to
> Bazaar one can be done so that most of the users can be just told to
> use bzr checkout and they don't have to care about pushing.

That is of course because you were familiar with a branch-centric
distributed SCM, namely BitKeeper, when trying Bazaar-NG. IMHO the
branch-centric view is somewhat limiting; you can always use a
repository-centric SCM with the one-live-branch-per-repository paradigm and
so emulate a branch-centric SCM, while the reverse is not (or not always)
possible. Branch-centric and repo-centric SCMs promote different workflows:
parallel uncommitted work on a few development branches for a
branch-centric SCM; one change per commit and multiple temporary and
feature branches for a repo-centric SCM.

Breaking from the stupid CVS update-then-commit model is IMHO a very, very
good idea, on a par with breaking from the CVS "model" of branches. In my
opinion CVS had one very good idea (perhaps it wasn't originally a CVS
idea), namely using merges instead of locking files for editing; well, that
and the fact that it tried (emphasis on tried) to treat a module as a
whole, allowing for multi-file change commits.

Take for example the case of word processors: if they all only emulated
the UI of the leading (most commonly used) one, no progress would be made.

> But with git, I clone some repository. Now it is totally new to
> understand that I didn't clone only single branch. It's like nothing
> else and that's what I saw when I first looked at it. I might have
> even not noticed the branch stuff and just cloned it further.

That's the paradigm shift. Instead of one-branch-per-repository, and the
one-branch-per-developer workflow which I think usually stems from that, we
have one-repository-per-developer (usually), and heavily nonlinear
development.

>> On the other side breaking with traditional concepts of _centralized_ SCM
>> in _distributed_ SCM (and geared towards distributed usage) is IMVHO a
>> good idea. And breaking with the cruft of bad ideas of CVS is very good
>> idea. 
> 
> Breaking concepts can be a good idea and I somewhat think that git
> needed to do what it did. But do remember that it came with a cost:
> git is harder to understand and use. You first have to understand that
> it is different and how it is different.

The same could be said about moving from MS-DOS or later MS Windows to the
world of UNIX.

But yes, I understand and agree that being different from the others can
be a disadvantage... and can be an advantage.

>> I don't understand the confusion between "git branch" and "git clone"
>> commands... unless you are confused by Bazaar-NG branch-centric approach
>> which mixes branch with repository.
> 
> Those commands do so different things in different SCMs. Just look at
> the differences bk clone, git clone, git branch and bzr branch. You
> have both. At the point where I didn't yet understand that I cloned
> more than a one branch, git branch is very odd looking command.

I, for example, didn't understand the "bzr branch" concept, being more
familiar with "git branch".

>> Which long lasting operations lack progress bar/progress reporting?
>> "git clone" and "git fetch"/"git pull" both have progress report
> 
> First note that I didn't notice git repack until recently so things
> got slower until that.
> 
> At least some points they just tell that they are doing something, but
> not how much of it has been done and how much is still to do. Look at
> Bazaar and you'll see the difference, it has progress bars.

Well, having progress bars for operations which are usually fast and
single-step is in my opinion a stupid idea, even if there are combinations
of options which make them slow (for example using the so-called pickaxe,
e.g. "git log -S'fragment' -- file", to find revisions which introduced
'fragment' to 'file').

I'll ask again: _which_ git commands do you find lacking progress reporting?

>>> You can't just diff branchA/foo branchB/foo.
>>
>> You can: either using "git diff branchA branchB -- foo" which means
> 
> Exactly my point: it forces you to use git more. In Bazaar I can do
> this without Bazaar commands. I could even do it with some Windows GUI
> stuff, take two files or directories and compare.
> 
> As you need to use git commands more than bzr commands, git has bigger
> requirements for usability.

But git commands are more powerful than the equivalent GNU commands.
git-diff is more powerful than GNU diff (for example it can detect renames
and copying, it shows mode changes, it can show the diff for a merge using
the "combined diff" format), and git-grep is more powerful than GNU grep
(for example Linus finds himself putting files into a git repository in
order to use git-grep instead of a combination of GNU find and GNU grep).
 
And don't forget the _cost_ of doing it the abovementioned way, namely
having to keep two copies of the working area (differing in revision, of
course).

>>> You can't just open file from old branch to check 
>>> something while you are developing in some new branch.
>>
>> You can view file from old branch via "git cat-file -p old-branch:file".

Or you can "git commit -a -m 'TEMP'" to save changes, "git checkout
<branch>" to switch to the other branch, perhaps git-clean, hack; hack;
hack; commit changes, switch back to the original branch, and either amend
the commit or reset the index and HEAD (but not the working area).

> Same thing here, in Bazaar, I can just open the file from the other
> branch. I can also compile and run the other branch while I have the
> other open.

Do you really often compile and run one branch while developing on another?

> Essentially I would need a separate git repository for each branch
> anyway. In Bazaar I can use the same.

Well, it is a fact that git somewhat (but not completely) lacks support
for multiple independent workplaces for the same repository (link +
separate index + separate HEAD), and somewhat (but not completely) lacks
support for sharing the object database between repositories, aka. the bzr
model (you have to be very careful with pruning).

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-29 12:01                                                                                               ` Jakub Narebski
@ 2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
  2006-10-29 18:39                                                                                                   ` Jakub Narebski
  2006-10-30  0:10                                                                                                 ` Theodore Tso
  1 sibling, 1 reply; 806+ messages in thread
From: Matthew D. Fuller @ 2006-10-29 18:24 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

On Sun, Oct 29, 2006 at 01:01:07PM +0100 I heard the voice of
Jakub Narebski, and lo! it spake thus:
> 
> Branch-centric and repo-centric SCM promote different workflows,
> namely parallel uncommited work on few development branches for
> branch-centric SCM, one-change per-commit multiple temporary and
> feature branches for repo-centric SCM.

I don't think that follows at all.


> Do you really often compile and run other branch while developing on
> other?

Yes.  And I do the same with older revisions along a given branch too,
which is where [lightweight] checkouts come in handy.


-- 
Matthew Fuller     (MF4839)   |  fullermd@over-yonder.net
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
@ 2006-10-29 18:39                                                                                                   ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-10-29 18:39 UTC (permalink / raw)
  To: Matthew D. Fuller; +Cc: bazaar-ng, git

Matthew D. Fuller wrote:
> On Sun, Oct 29, 2006 at 01:01:07PM +0100 I heard the voice of
> Jakub Narebski, and lo! it spake thus:
>>
>> Do you really often compile and run other branch while developing on
>> other?
> 
> Yes.  And I do the same with older revisions along a given branch too,
> where is where [lightweight] checkouts come in handy.

Well, if you don't _work_ on the other branch, you can always check out
the other branch or any given revision from a separate directory
using
  git --git-dir=<path to repo> tar-tree <revision> | tar xf -
for example.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-29 12:01                                                                                               ` Jakub Narebski
  2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
@ 2006-10-30  0:10                                                                                                 ` Theodore Tso
  1 sibling, 0 replies; 806+ messages in thread
From: Theodore Tso @ 2006-10-30  0:10 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

On Sun, Oct 29, 2006 at 01:01:07PM +0100, Jakub Narebski wrote:
> > You can give Bazaar for me, a bk user, and I can understand what to do
> > with the branches that are like bk clones. (The repository stuff is
> > later development and still optional.) Switching a CVS environment to
> > Bazaar one can be done so that most of the users can be just told to
> > use bzr checkout and they don't have to care about pushing.
> 
> That is of course because you are familiar with branch-centric distributed
> SCM, namely BitKeeper, when trying Bazaar-NG. IMHO branch-centric view
> is somewhat limiting; you can always use repository-centric SCM with
> one-live-branch-per-repository paradigm and emulate branch-centric SCM,
> which is not (or not always) the case for branch-centric SCM. Branch-centric
> and repo-centric SCM promote different workflows, namely parallel uncommited
> work on few development branches for branch-centric SCM, one-change
> per-commit multiple temporary and feature branches for repo-centric SCM.

I've got to disagree here.  Being a former bitkeeper user myself, I
find BZR-NG to be nothing like bk.  In particular, Bitkeeper is *not*
branch-centric the way that BZR is; in fact, bk is much closer to git
and hg both in terms of how it works and its terminology.  You can
have a non-linear set of history without using any "branches" in both
bk and mercurial, simply by creating two commits changing different
files in two different repositories (using the bk, git, and hg sense
of the word --- only bzr attaches a completely different definition to
the term "repository"), and then pulling them together.

With bzr, the only way you can do the same is by explicitly
creating a separate branch and then merging the two branches together.
In bzr --- unlike bk, git, and hg --- when you are on a "branch" the
history must be completely linear.  The difference between bk, and git
and hg, is that bk enforces a restriction that there must be one
"head", or "tip" on a particular repository (in the bk, hg, and git
sense).  So if you start by cloning the repository A -> B, and then
make one or more commits in repository A, and then one or more commits
in repository B, when you pull from repository B to A, bk will enforce
the creation of a merge changeset on the resulting repository --- or
fail the merge.  (Actually, with BK there was the option to create
multiple tips using "lines of development", but it was never fully
developed or supported.)

With hg and git, you have the *option* of pulling the two lines of
commits together using a merge changeset *or* leaving the two "tips"
or "heads" unmerged.  But that's only a very minor difference between
bk and hg/git --- and if you are willing to always merge two heads
after pulling so that your git or hg repository only has one head/tip,
then conceptually the changeset history is just like bk.
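
For the git case, the two options can be sketched roughly as follows.
This assumes a reasonably modern git; the repository names A and B and
the file names are made up for the example:

```shell
set -e
work=$(mktemp -d)

# Repository A with one base commit.
git init -q "$work/A"
cd "$work/A"
git config user.name "Example User"
git config user.email "user@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"

# Clone A -> B, then commit independently on each side.
git clone -q "$work/A" "$work/B"
echo a > a.txt
git add a.txt
git commit -q -m "change in A"

cd "$work/B"
git config user.name "Example User"
git config user.email "user@example.com"
echo b > b.txt
git add b.txt
git commit -q -m "change in B"

# Option 1: fetch B's commits into A and leave the second head unmerged.
cd "$work/A"
git fetch -q "$work/B" HEAD
unmerged=$(git rev-list --count HEAD..FETCH_HEAD)
echo "commits only on B's line: $unmerged"

# Option 2: tie the two heads together with a merge commit.
git merge -q --no-edit FETCH_HEAD
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of the new head: $parents"
```

After the fetch alone, A holds both lines of history with nothing
joining them; the merge is a separate, optional step.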

In contrast, it's impossible to do this with bzr without leaving the
named branches around, so in this sense it's quite different from BK.

						- Ted

P.S.  I'm going to be teaching a class entitled "Bzr, Hg, and Git, Oh
my!" at the LISA conference in Washington, D.C.  It's only a half-day
tutorial intended to cover the basics of Distributed SCM systems, so
most folks on this list will probably know everything I'm planning on
discussing, but if you have some colleagues who need a gentle
introduction, please feel free to tell them to head on over to the LISA
conference website at www.usenix.org.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Progress reporting (was: VCS comparison table)
  2006-10-28 13:53                                                                                           ` Jakub Narebski
                                                                                                               ` (2 preceding siblings ...)
  2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
@ 2006-10-30 10:18                                                                                             ` Jakub Narebski
  2006-10-30 15:21                                                                                               ` Nicolas Pitre
  3 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-10-30 10:18 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Jakub Narebski wrote:
> Ilpo Nyyssönen wrote:

>> 3. Understanding output
>> 
>> G: Speaks a language of its own, hard to understand. No progress
>> reported for long lasting operations.
>> 
>> B: Could maybe speak a bit more. Progress reporting is quite good.
> 
> Which long lasting operations lack progress bar/progress reporting?
> "git clone" and "git fetch"/"git pull" both have progress report
> for both "smart" git://, git+ssh:// and local protocols, and "dumb"
> http://, https://, ftp://, rsync:// protocols. "git rebase" has
> progress report. "git am" has progress report.

I was bitten lately by git's lack of progress reporting for git-push.
While it nicely reports local progress (generating data), it unfortunately
lacks wget-like, "curl -o"-like or scp-like pack upload progress
reporting. And while push is usually fast, the initial push of a whole
project to an empty repository can be quite slow on a low-bandwidth link
(or a busy network).

git version 1.4.3.3 on local side, git+ssh:// protocol, git version
1.4.3.3.g9ab2 on the remote side (repo.or.cz).
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: Progress reporting (was: VCS comparison table)
  2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
@ 2006-10-30 15:21                                                                                               ` Nicolas Pitre
  0 siblings, 0 replies; 806+ messages in thread
From: Nicolas Pitre @ 2006-10-30 15:21 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git

On Mon, 30 Oct 2006, Jakub Narebski wrote:

> I was bitten lately by git lack of progress reporting for git-push.
> While it nicely reports local progress (generating data) it unfortunately
> lacks wget like, "curl -o" like or scp like pack upload progress
> reporting. And while usually push is fast, initial push of whole
> project to empty repository can be quite slow on low-bandwidth link
> (or busy network).

What about this patch?

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 41e1e74..7f87ae8 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1524,6 +1524,10 @@ int cmd_pack_objects(int argc, const cha
 			progress = 1;
 			continue;
 		}
+		if (!strcmp("--all-progress", arg)) {
+			progress = 2;
+			continue;
+		}
 		if (!strcmp("--incremental", arg)) {
 			incremental = 1;
 			continue;
@@ -1641,7 +1645,7 @@ int cmd_pack_objects(int argc, const cha
 	else {
 		if (nr_result)
 			prepare_pack(window, depth);
-		if (progress && pack_to_stdout) {
+		if (progress == 1 && pack_to_stdout) {
 			/* the other end usually displays progress itself */
 			struct itimerval v = {{0,},};
 			setitimer(ITIMER_REAL, &v, NULL);
diff --git a/send-pack.c b/send-pack.c
index 0e90548..9280481 100644
--- a/send-pack.c
+++ b/send-pack.c
@@ -30,6 +30,7 @@ static void exec_pack_objects(void)
 {
 	static const char *args[] = {
 		"pack-objects",
+		"--all-progress",
 		"--stdout",
 		NULL

^ permalink raw reply related	[flat|nested] 806+ messages in thread

* Re: VCS comparison table
  2006-10-25 22:40                                                                           ` David Lang
  2006-10-25 23:53                                                                             ` Matthew D. Fuller
@ 2006-10-30 21:46                                                                             ` Jan Hudec
  1 sibling, 0 replies; 806+ messages in thread
From: Jan Hudec @ 2006-10-30 21:46 UTC (permalink / raw)
  To: David Lang; +Cc: Matthew D. Fuller, Linus Torvalds, bazaar-ng, git

On Wed, Oct 25, 2006 at 03:40:00PM -0700, David Lang wrote:
> On Tue, 24 Oct 2006, Matthew D. Fuller wrote:
> >On Tue, Oct 24, 2006 at 11:03:20AM -0700 I heard the voice of
> >David Lang, and lo! it spake thus:
> >>
> >>it sounded like you were saying that the way to get the slices of
> >>the DAG was to use branches in bzr. [...]
> >
> >I'm not entirely sure I understand what you mean here, but I think
> >you're saying "Nobody's written the code in bzr to show arbitrary
> >slices of the DAG", which is true TTBOMK.
> 
> I think we are talking past each other here.
> 
> what I think was said was
> 
> G 'one feature of git is that you can view arbatrary slices trivially'
> 
> B 'bzr can do this too, you just use branches to define the slices'
> 
> G 'but this limits you becouse branches are defined as code is developed, 
> git lets you define slices at viewing time'
> 
> by the way, I think it's more then just saying 'well, the code could be 
> written to do this in $VCS' some decisions and standard ways of doing 
> things can impact how hard it is to implement a feature, and some decisions 
> can make it impossible (without doing unexpected things).

Since bzr branch is, and is ONLY, a pointer to a revision, I don't see
any design decision that would make this harder in bzr. The UI was only
implemented to take the revisions as branches.

> >>everyone agrees that bzr supports the Star topology. Most people
> >>(including bzr people) seem to agree that currently bzr does not
> >>support the Distributed topology.
> >
> >I think this statement arouses so much grumbling because (a) bzr does
> >support such a lot better than often seems implied, (b) where it
> >doesn't, the changes needed to do so are relatively minor (often
> >merely cosmetic), and (c) disagreement over whether some of the
> >qualifications included for 'distributed' are really fundamental.

The more I read this thread I actually think bzr does support
distributed topology as well as git.

The whole difference is that bzr makes a distinction between the first
and other parents of a revision, while git does not. This distinction is
done in two places:

1. The log shows the first parent and then, as an indented subsection, the
   ancestry of other parents until the point where the ancestries meet
   again. This actually captures a pattern people usually use. When you
   merge, you usually put in the log something along the lines:

   "merged X, which bars and fixes foo."

   when you actually merge M, which you consider the "mainline" and
   therefore not worth mentioning, and X. Linus does it this way too --
   he actually posted a log message as an example, that showed exactly
   this.

2. Assigns revision aliases in this same order (except the "major"
   number for the subsection is based on the common ancestor, not on the
   merge point). They are not a special thing that is generated at commit
   time; they are inferred from the shape of the DAG (and cached for
   performance reasons).

And the only issue I think is, that the bzr UI and documentation pushes
forward these aliases (revnos) more than appropriate for fully
distributed case and hides the real revision names (revids) too much for
that case.
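
For git readers, the nearest analogue of "numbering along the first
parent" is a first-parent walk. The sketch below is only a git-side
illustration of that idea, not how bzr computes revnos; the branch and
file names are invented:

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "user@example.com"

# The default branch name differs between git versions, so look it up.
main=$(git symbolic-ref --short HEAD)

echo 1 > f
git add f
git commit -q -m "mainline 1"

# Two commits on a side branch.
git checkout -q -b feature
echo x > g
git add g
git commit -q -m "feature 1"
echo y >> g
git commit -q -a -m "feature 2"

# One more mainline commit, then merge the side branch in.
git checkout -q "$main"
echo 2 >> f
git commit -q -a -m "mainline 2"
git merge -q --no-edit feature

# Walk only leftmost parents: mainline 1, mainline 2, the merge.
mainline_len=$(git rev-list --first-parent --count HEAD)
# Walk the whole DAG: the three above plus the two feature commits.
total=$(git rev-list --count HEAD)
echo "first-parent length: $mainline_len, all commits: $total"
```

The first-parent count is stable only as long as everyone agrees which
parent is "first", which is exactly the point under discussion.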

> >>it's just fine for bzr to not support all possible topologies,
> >
> >I think there's a real intent for bzr TO support at least all common
> >topologies.  I'll buy that current development has focused more on
> >[relatively] simple topologies than the more wildly complex ones.  I
> >look forward to more addressing of the less common cases as the tool
> >matures, and I think a lot of this thread will be good material to
> >work with as that happens.  It's just the suggestion that providing
> >fruit for simple topologies _necessarily_ prejudices against complex
> >ones that I find so onerous.
> 
> one concern that the git people are voicing is that the things that work 
> for simple topologies (revno's) can't be used with the more complex ones 
> (where you need the refid's). especially the fact that users need to do 
> things significantly different when there are fairly subtle changes to the 
> topology.
> 
> the scenerio that came up elsewhere today where you have
> 
>    Master
>    /    \
> dev1   dev2
> 
> and then dev1 and dev2 both start working on the same thing (without 
> knowing it), then discover they are working on the same thing. they now 
> have three options
> 
> 1. merge their stuff up to the master so that they can both pull it down.
>   but this puts broken, experimental stuff up in the master
> 
> 2. declare one of the dev trees to be the master
> 
> this changes the topology to
> 
> Master--dev1--dev2
> 
> 3. pull from each other frequently to keep in sync.
> 
> this changes the topology to
> 
>    Master
>    /   \
> dev1--dev2
> 
> if they do this with bzr then the revno's break, they each get extra 
> commits showing up (so they can never show the same history).

That's a deficiency of merge not telling that a merge is pointless.
Actually I think that bzr merge *should* reduce to pull in all cases:

- If the common ancestor is on the leftmost path of the other branch,
  then the existing revnos as seen on this branch will not change in any
  case, only more than one is added. I think it's safe for merge to
  reduce to pull in this case and consider it a bug in bzr that it does
  not.
- If the common ancestor is not on the leftmost path on the other
  branch, then it is because the branch was merged with some other
  deemed "more important" (ie. the "Master" above). In this case
  reducing to pull will change the old revnos, but IMO it's the correct
  thing to do, because it's now up-to-date with the latest revision of
  "Master" and its revnos should take precedence. Personally I'd just
  like merge to reduce to pull in this case as well, but maybe it'd be
  better to have it error out and request the user to either "pull" or
  "merge --pointless".
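
For comparison, this is roughly how the "merge reduces to pull" case
looks on the git side, where it is called a fast-forward. It is a
sketch under the assumption of a reasonably modern git ("merge-base
--is-ancestor" and "merge --ff-only"); the branch name "other" is
made up:

```shell
set -e
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Example User"
git config user.email "user@example.com"

echo 1 > f
git add f
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)

# The other branch moves ahead; this branch does not move at all.
git checkout -q -b other
echo 2 >> f
git commit -q -a -m "other 1"
echo 3 >> f
git commit -q -a -m "other 2"
git checkout -q "$main"

# If our head is an ancestor of the other head, there is nothing to merge:
# the branch pointer can simply be moved forward.
if git merge-base --is-ancestor HEAD other; then
    kind="fast-forward"
else
    kind="true merge"
fi
echo "merge kind: $kind"

# --ff-only refuses to create a merge commit, so no extra commit shows up.
git merge -q --ff-only other
head_after=$(git rev-parse HEAD)
other_tip=$(git rev-parse other)
```

After the fast-forward both sides name the same commit, so two
developers pulling back and forth end up with identical history.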

> in git this is a non-issue, they can pull back and forth and the only new 
> history to show up will be changes.
> 
> this is the situation that the kernel developers are in frequently. it 
> sounds as if you haven't needed to do this yet, so you haven't encountered 
> the problems.

Given that bzr is a considerably smaller project than the Linux kernel,
that's quite likely. And it's likely a reason why it was not thoroughly
discussed in bzr yet (or at least I am not aware that it was).

--------------------------------------------------------------------------------

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-20 18:15                                                   ` Jon Smirl
@ 2006-11-03  3:43                                                     ` Matthew Hannigan
  0 siblings, 0 replies; 806+ messages in thread
From: Matthew Hannigan @ 2006-11-03  3:43 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git, Shawn Pearce

On Fri, Oct 20, 2006 at 02:15:15PM -0400, Jon Smirl wrote:
> [ ... ] 
> You could have a file of macro substitutions that is applied/expanded
> when files go in/out of git. The macros would replace the copyright
> notices improving the move/rename tracking and the reducing repository
> size. The macros could be recorded out of band to eliminate the need
> for escaping the file contents. Even simpler, the only valid place for
> the macro could be the beginning of the file.

That probably belongs in the class of transformations
best done outside the VCS such as the permissions 
and system config file idea Linus outlined earlier.


Matt

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: [ANNOUNCE] Example Cogito Addon - cogito-bundle
  2006-10-21 19:21                                                     ` Jakub Narebski
@ 2006-11-03  6:36                                                       ` Martin Langhoff
  0 siblings, 0 replies; 806+ messages in thread
From: Martin Langhoff @ 2006-11-03  6:36 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: Linus Torvalds, Jan Hudec, Jeff King, bazaar-ng, git

On 10/22/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Lack of --follow is not a big issue because you can do this "by hand";
> you can use git-diff-tree -M at the end of file history to check if
> [git considers] it was moved from somewhere.

This 'by hand' can be done in shell. cg-log has a half-complete
implementation of it. Seems to be disabled now :-(

cheers,



^ permalink raw reply	[flat|nested] 806+ messages in thread

* git and bzr
  2006-10-26 15:05                                                                   ` Linus Torvalds
  2006-10-26 16:04                                                                     ` Vincent Ladeuil
@ 2006-11-28  0:01                                                                     ` Joseph Wakeling
  2006-11-28  0:39                                                                       ` Jakub Narebski
                                                                                         ` (5 more replies)
  1 sibling, 6 replies; 806+ messages in thread
From: Joseph Wakeling @ 2006-11-28  0:01 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Hello all,

Following the very interesting debate about the differences between bzr
and git, I thought it was about time I tried to learn properly about git
and how to use it.  I've been using bzr for a good while now, although
since I'm not a serious developer I only use it for simple purposes,
keeping track of code I write on my own for academic projects.

So, a few questions about differences I don't understand...

First off a really dumb one: how do I identify myself to git, i.e. give
it a name and email address?  Currently it uses my system identity,
My Name <username@computer.(none)>.  I haven't found any equivalent of
the bzr whoami command.

Now to more serious business.  One of the main operational differences I
see as a new user is that bzr defaults to setting up branches in
different locations, whereas git by default creates a repository where
branches are different versions of the directory contents and switching
branches *changes* the directory contents.  bzr branch seems to be
closer to git-clone than git-branch (N.B. I have never used bzr repos so
might not be making a fair comparison).

With this in mind, is there any significance to the "master" branch (is
it intended e.g. to indicate a git repository's "stable" version
according to the owner?), or is this just a convenient default name?
Could I delete or rename it?  Using bzr I would normally give the
central branch(*) the name of the project.

(* Central or main on my own system.  Not intended to be central in the
sense of a CVS-style version control setup:-)

Any other useful comments that can be made to a bzr user about working
with this difference, positive or negative aspects of it?

Next question ... one of the reasons I started seriously thinking about
git was that in the VCS comparison discussion, it was noted that git is
a lot more flexible than bzr in terms of how it can track data (e.g. the
git pickaxe command, although I understand that's not in the released
version [1.4.4.1] yet?).  A frustration with bzr is that pulling or
merging patches from another branch or repo requires them to share the
same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
particular function in project XXX, I'm going to pull that individual
bit of code and its development history into project YYY"?

Last off (for now, I'm sure I'll think of more): is there any easy (or
difficult) way to effectively import version history from a bzr
repository, and vice versa?

Thanks in advance for any comments,

    -- Joe

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
@ 2006-11-28  0:39                                                                       ` Jakub Narebski
  2006-11-28  0:40                                                                       ` Sean
                                                                                         ` (4 subsequent siblings)
  5 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28  0:39 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Joseph Wakeling wrote:

> Hello all,
> 
> Following the very interesting debate about the differences between bzr
> and git, I thought it was about time I tried to learn properly about git
> and how to use it.  I've been using bzr for a good while now, although
> since I'm not a serious developer I only use it for simple purposes,
> keeping track of code I write on my own for academic projects.
> 
> So, a few questions about differences I don't understand...
> 
> First off a really dumb one: how do I identify myself to git, i.e. give
> it a name and email address?  Currently it uses my system identity,
> My Name <username@computer.(none)>.  I haven't found any equivalent of
> the bzr whoami command.

git repo-config user.name "Joseph Wakeling"
git repo-config user.email joseph.wakeling@webdrake.net

You might add --global option if you want your identity to be saved
in ~/.gitconfig file, and not per repository (one might want to use
different identities for different repositories).

"git repo-config --list" or "git var -l" to list all config. There is no
direct equivalent of "bzr whoami"; the closest would be:

  echo "$(git repo-config --get user.name) <$(git repo-config --get user.email)>"
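
The same thing can be tried in a scratch repository. This sketch
assumes a later git, where "repo-config" goes by the name
"git config"; the identity used is a placeholder:

```shell
set -e
# Throwaway repository so the settings stay per-repository.
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"

# Without --global these land in .git/config of this repository only.
git config user.name "Example User"
git config user.email "user@example.com"

# Rough stand-in for "bzr whoami".
ident="$(git config --get user.name) <$(git config --get user.email)>"
echo "$ident"
```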
 
> Now to more serious business.  One of the main operational differences I
> see as a new user is that bzr defaults to setting up branches in
> different locations, whereas git by default creates a repository where
> branches are different versions of the directory contents and switching
> branches *changes* the directory contents.  bzr branch seems to be
> closer to git-clone than git-branch (N.B. I have never used bzr repos so
> might not be making a fair comparison).

The rough equivalent of bzr repos would be a set of git repos which share
object database, either via symlink, or via GIT_OBJECT_DIRECTORY, or via
alternates mechanism.

But it is a fact that in bzr working area is associated with branch, while
in git it is associated with repository.

> With this in mind, is there any significance to the "master" branch (is
> it intended e.g. to indicate a git repository's "stable" version
> according to the owner?), or is this just a convenient default name?
> Could I delete or rename it?  Using bzr I would normally give the
> central branch(*) the name of the project.

Of course you can rename the 'master' branch. But please remember that names
of branches in git are a local matter. Well, except for the fact that you
usually preserve them in a fashion.

But the equivalent of giving the central branch the name of the project would
be naming the directory that holds the working area and the .git directory
after the project, or, in the case of a bare repository, naming $GIT_DIR
project.git.

> Any other useful comments that can be made to a bzr user about working
> with this difference, positive or negative aspects of it?

By the way, 'master' is by no means special. It is the default in a few cases
(init-db, clone), but that's all.
 
> Next question ... one of the reasons I started seriously thinking about
> git was that in the VCS comparison discussion, it was noted that git is
> a lot more flexible than bzr in terms of how it can track data (e.g. the
> git pickaxe command, although I understand that's not in the released
> version [1.4.4.1] yet?).  A frustration with bzr is that pulling or
> merging patches from another branch or repo requires them to share the
> same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
> particular function in project XXX, I'm going to pull that individual
> bit of code and its development history into project YYY"?

In git, a repository can have unrelated branches. So you can fetch an
unrelated repository into your repository, and merge/cherry-pick from there
if needed.
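
A rough sketch of fetching and merging an unrelated project, using the
XXX/YYY names from the question as made-up repository names. One
assumption to flag: git 2.9 and later refuse such a merge unless
"--allow-unrelated-histories" is given, so the flag appears here:

```shell
set -e
work=$(mktemp -d)

# Two independent projects, each with its own root commit.
for name in yyy xxx; do
    git init -q "$work/$name"
    git -C "$work/$name" config user.name "Example User"
    git -C "$work/$name" config user.email "user@example.com"
    echo "$name" > "$work/$name/$name.txt"
    git -C "$work/$name" add "$name.txt"
    git -C "$work/$name" commit -q -m "initial commit of $name"
done

# Bring project xxx, history and all, into project yyy.
cd "$work/yyy"
git fetch -q "$work/xxx" HEAD
git merge -q --no-edit --allow-unrelated-histories FETCH_HEAD

# The merged history now has two parentless ("initial") commits.
roots=$(git rev-list --max-parents=0 HEAD | wc -l | tr -d ' ')
echo "root commits: $roots"
```

This works at whole-project granularity; picking out a single function
together with its history is a different problem.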

In defence of Bazaar-NG, you can probably get the same or very similar with
bzr repos. 

> Last off (for now, I'm sure I'll think of more): is there any easy (or
> difficult) way to effectively import version history from a bzr
> repository, and vice versa?

Try git-archimport, or the Tailor tool.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
  2006-11-28  0:39                                                                       ` Jakub Narebski
@ 2006-11-28  0:40                                                                       ` Sean
  2006-11-28  0:40                                                                       ` Sean
                                                                                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 806+ messages in thread
From: Sean @ 2006-11-28  0:40 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: bazaar-ng, git

On Tue, 28 Nov 2006 01:01:46 +0100
Joseph Wakeling <joseph.wakeling@webdrake.net> wrote:

> First off a really dumb one: how do I identify myself to git, i.e. give
> it a name and email address?  Currently it uses my system identity,
> My Name <username@computer.(none)>.  I haven't found any equivalent of
> the bzr whoami command.

Assuming you have a recent version of git, then:

$ git repo-config --global user.email "you@email.com"
$ git repo-config --global user.name "Your Name"

Will setup a ~/.gitconfig in your home directory; these settings
will apply in any repo you use.  Drop the "--global" to set them
per repo.

> With this in mind, is there any significance to the "master" branch (is
> it intended e.g. to indicate a git repository's "stable" version
> according to the owner?), or is this just a convenient default name?
> Could I delete or rename it?  Using bzr I would normally give the
> central branch(*) the name of the project.

It's just a common convention and carries no special significance;
rename away!

> Any other useful comments that can be made to a bzr user about working
> with this difference, positive or negative aspects of it?

Don't be afraid to git-clone your local repo, especially with the -l
and -s options.  That will get you a separate repo/working directory
while not taking up much extra disk space (objects from your first
repo will be shared with the second).
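
A small sketch of what that looks like in practice, with made-up
directory names "first" and "second". The sharing is recorded in an
"alternates" file inside the new repository:

```shell
set -e
work=$(mktemp -d)

# The original repository with a single commit.
git init -q "$work/first"
cd "$work/first"
git config user.name "Example User"
git config user.email "user@example.com"
echo "hello" > README
git add README
git commit -q -m "initial"

# -l: clone from a local repository; -s: borrow its objects instead of
# copying them.
git clone -q -l -s "$work/first" "$work/second"

# The second repository points at the first one's object store here.
alternates="$work/second/.git/objects/info/alternates"
cat "$alternates"

# History is fully visible in the clone even though it stores no objects
# of its own yet.
shared_log=$(git -C "$work/second" log --format=%s)
echo "$shared_log"
```

Because the clone borrows objects, the first repository must stay in
place for the second one to remain usable.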

Once you get comfortable with multiple branches in a single repo/
working directory, it often is much better than the alternatives.
But the above gives you the option to work either way.

> Next question ... one of the reasons I started seriously thinking about
> git was that in the VCS comparison discussion, it was noted that git is
> a lot more flexible than bzr in terms of how it can track data (e.g. the
> git pickaxe command, although I understand that's not in the released
> version [1.4.4.1] yet?).  A frustration with bzr is that pulling or
> merging patches from another branch or repo requires them to share the
> same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
> particular function in project XXX, I'm going to pull that individual
> bit of code and its development history into project YYY"?

The Git cherry-pick command lets you grab specific commits from
other branches in your repo.  But cherry-pick works at the commit
level, there is no easy way to grab a single function for instance
and merge just its history into another branch.
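As a rough sketch of what cherry-pick does at the commit level (the branch name "topic" and the file names are invented for the example, and a reasonably current git is assumed):

```shell
# Throwaway repository; all names below are illustrative.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

echo base > base.txt
git add base.txt
git commit -q -m "base"
start_branch=$(git symbolic-ref --short HEAD)

# Two commits on a side branch.
git checkout -q -b topic
echo one > one.txt; git add one.txt; git commit -q -m "add one"
echo two > two.txt; git add two.txt; git commit -q -m "add two"

# Back on the starting branch, grab only the "add one" commit.
git checkout -q "$start_branch"
git cherry-pick topic~1 >/dev/null

# one.txt is here now, two.txt is not.
ls
```

The picked commit is re-applied as a new commit on the current branch; the rest of the history of "topic" stays where it was.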

However, you can merge an entire separate project into yours even
though they don't share a base commit.  This has been done several
times in the history of Git itself. For instance you can see two
separate "initial" commits in the Git repo with a command like
"gitk README gitk" which gives a graphical history of the "gitk"
and "README" files and shows each started life in a separate
initial commit.  Use "git show 5569b" to see Linus bragging about
this first separate-project merge and giving some more details.
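A hedged sketch of such a merge between two unrelated projects. It assumes a current git, which as far as I know refuses this kind of merge unless "--allow-unrelated-histories" is given (that flag did not exist when this was written); the project names "xxx" and "yyy" are placeholders:

```shell
# Two independent throwaway projects; "xxx" and "yyy" are placeholder names.
work=$(mktemp -d)
for p in xxx yyy; do
    git init -q "$work/$p"
    echo "$p" > "$work/$p/$p.txt"
    git -C "$work/$p" add "$p.txt"
    git -C "$work/$p" -c user.name="Your Name" -c user.email="you@email.com" \
        commit -q -m "initial commit of $p"
done

# Pull the whole history of xxx into yyy, although they share no base commit.
cd "$work/yyy"
git fetch -q "$work/xxx" HEAD
git -c user.name="Your Name" -c user.email="you@email.com" \
    merge -q --allow-unrelated-histories -m "Merge project xxx" FETCH_HEAD

# List the parentless ("initial") commits now reachable from HEAD.
git rev-list --max-parents=0 HEAD
```

The last command should list one root commit per original project, which is the same situation the gitk view above shows for the Git repo itself.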
 
> Last off (for now, I'm sure I'll think of more): is there any easy (or
> difficult) way to effectively import version history from a bzr
> repository, and vice versa?

Don't think a direct bridge between the two has been written yet.

Cheers,
Sean




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
                                                                                         ` (2 preceding siblings ...)
  2006-11-28  0:40                                                                       ` Sean
@ 2006-11-28  2:57                                                                       ` Linus Torvalds
  2006-11-29  2:23                                                                         ` Joseph Wakeling
  2006-11-28 12:10                                                                       ` git and bzr Erik Bågfors
  2006-11-30 12:36                                                                       ` Nicholas Allen
  5 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28  2:57 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: bazaar-ng, git



On Tue, 28 Nov 2006, Joseph Wakeling wrote:
>
> First off a really dumb one: how do I identify myself to git, i.e. give
> it a name and email address?  Currently it uses my system identity,
> My Name <username@computer.(none)>.  I haven't found any equivalent of
> the bzr whoami command.

Depending on whether you like editing config files by hand or not, you 
would either just edit your ~/.gitconfig file and add a section like:

	[user]
		name = My Name Goes Here
		email = myemail@work.com

or you would use "git repo-config" to do it for you. Personally, I find it 
easier to just edit the .gitconfig file directly, since the config file 
syntax is actually rather pleasant, but if you want to do it with a git 
command, you'd do

	git repo-config --global user.name "Joseph Wakeling"
	git repo-config --global user.email joseph.wakeling@webdrake.net

(where the "--global" just tells repo-config to use the user-global 
~/.gitconfig file - you can also do this on a per-repository basis in the 
repository .git/config file if you want to have different identities for 
different repositories).
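A small sketch of the per-repository variant, assuming a newer git where the same command is spelled "git config" rather than "git repo-config". It deliberately leaves out "--global" so that nothing in ~/.gitconfig is touched; the scratch path comes from mktemp:

```shell
# Throwaway repository so the real ~/.gitconfig is left alone.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Without --global these land in this repository's .git/config.
git config user.name "My Name Goes Here"
git config user.email myemail@work.com

# Show what this repository will use for commits.
git config user.name
git config user.email

# The same values, as stored in the [user] section of the config file.
cat .git/config
```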

> Now to more serious business.  One of the main operational differences I
> see as a new user is that bzr defaults to setting up branches in
> different locations, whereas git by default creates a repository where
> branches are different versions of the directory contents and switching
> branches *changes* the directory contents.  bzr branch seems to be
> closer to git-clone than git-branch (N.B. I have never used bzr repos so
> might not be making a fair comparison).

You can do either, it's almost purely a matter of taste.

Using a local branch and switching between them in place has some 
advantages once you get used to it: most notably you can trivially use git 
commands that work on data from different branches at the same time. So 
with that kind of setup it's very natural to do things like "show me 
everything that is in branch 'x', but _not_ in branch 'y'", and once you 
get used to that, you really appreciate it.
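For instance, a minimal sketch of the "in 'x' but not in 'y'" query, using the usual "y..x" range syntax in a throwaway repository (the file name and commit messages are made up):

```shell
# Throwaway repository with two branches named x and y.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"

git branch y                 # y stays at the base commit
git checkout -q -b x         # x gets one more commit
echo more >> file.txt
git commit -q -a -m "only in x"

# Commits reachable from x but not from y.
git log --pretty=oneline y..x
```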

But at the same time, if you want to actually keep several branches 
checked out at the same time, and prefer to work on them that way, just 
use "git clone" to create the other branch instead. It really is just a 
matter of taste.

I suspect that most people tend to end up using the "multiple branches in 
the same directory and switching between them" approach after a time, but 
that's really just an unsubstantiated feeling, and it certainly isn't 
something that git forces on you. 

> With this in mind, is there any significance to the "master" branch (is
> it intended e.g. to indicate a git repository's "stable" version
> according to the owner?), or is this just a convenient default name?
> Could I delete or rename it?  Using bzr I would normally give the
> central branch(*) the name of the project.

It's just a convenient default name, and it has no real meaning otherwise. 
Feel free to rename it any way you want (just make sure to edit HEAD to 
point to the new name if you rename it by hand).
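A sketch of the rename, assuming a git that has "git branch -m" (newer than the release discussed here), which also updates HEAD when the renamed branch is the current one. The new name "myproject" is only an example:

```shell
# Throwaway repository; "myproject" is an illustrative branch name.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

# Make sure the first branch really is called master, whatever the
# local default for new repositories happens to be.
git symbolic-ref HEAD refs/heads/master
echo hi > file.txt
git add file.txt
git commit -q -m "initial"

# Rename the branch; HEAD follows along.
git branch -m master myproject

# HEAD now points at the new name.
git symbolic-ref HEAD
```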

> Any other useful comments that can be made to a bzr user about working
> with this difference, positive or negative aspects of it?

There should be no difference, although since everybody seems to use 
"master" by default, the documentation is probably geared towards it, and 
who knows, maybe you'll hit a bug that nobody else noticed just because 
everybody else had a "master" branch, and some silly script had it 
hardcoded.

> Next question ... one of the reasons I started seriously thinking about
> git was that in the VCS comparison discussion, it was noted that git is
> a lot more flexible than bzr in terms of how it can track data (e.g. the
> git pickaxe command, although I understand that's not in the released
> version [1.4.4.1] yet?).

pickaxe wasn't in the released version back when the discussions were 
raging, but it's there now. Except it's really called "git blame" these 
days (and "git annotate") since it's taken over both of those duties.

However...

> A frustration with bzr is that pulling or
> merging patches from another branch or repo requires them to share the
> same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
> particular function in project XXX, I'm going to pull that individual
> bit of code and its development history into project YYY"?

... it's not _quite_ that smart. It will only look for sources to new 
functions from existing sources in the tree that preceded the commit that 
added the function, so it will _not_ see it coming from another branch or 
another project entirely.

So when you ask for code annotations (use the "-C" flag to see code moved 
across from other files), it will still limit itself to just a particular 
input set, and not go gallivanting over all possible branches and projects 
you might have in your repository.

It wouldn't be theoretically impossible to do, but it would be 
prohibitively expensive (where do you draw the line for what to look at). 

So git won't do quite what you ask for.

> Last off (for now, I'm sure I'll think of more): is there any easy (or
> difficult) way to effectively import version history from a bzr
> repository, and vice versa?

There's an "archimport", but I assume bzr has long since broken 
compatibility with arch (and/or just extended things so much as to not be 
importable with that any more), regardless of any origin. But it might be 
a good starting point, at least.

			Linus




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
                                                                                         ` (3 preceding siblings ...)
  2006-11-28  2:57                                                                       ` Linus Torvalds
@ 2006-11-28 12:10                                                                       ` Erik Bågfors
  2006-11-28 12:37                                                                         ` Jakub Narebski
  2006-11-30 12:36                                                                       ` Nicholas Allen
  5 siblings, 1 reply; 806+ messages in thread
From: Erik Bågfors @ 2006-11-28 12:10 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: git, bazaar-ng

> Next question ... one of the reasons I started seriously thinking about
> git was that in the VCS comparison discussion, it was noted that git is
> a lot more flexible than bzr in terms of how it can track data (e.g. the
> git pickaxe command, although I understand that's not in the released
> version [1.4.4.1] yet?).


If this is blame/annotate,  this exists in bzr as well...

: [bagfors@zyrgelkwyt]$ ; bzr help blame
usage: bzr annotate FILENAME
aliases: ann, blame, praise

Show the origin of each line in a file.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 12:10                                                                       ` git and bzr Erik Bågfors
@ 2006-11-28 12:37                                                                         ` Jakub Narebski
  2006-11-28 13:35                                                                           ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 12:37 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Erik Bågfors wrote:

>> Next question ... one of the reasons I started seriously thinking about
>> git was that in the VCS comparison discussion, it was noted that git is
>> a lot more flexible than bzr in terms of how it can track data (e.g. the
>> git pickaxe command, although I understand that's not in the released
>> version [1.4.4.1] yet?).
> 
> If this is blame/annotate,  this exists in bzr as well...
> 
> : [bagfors@zyrgelkwyt]$ ; bzr help blame
> usage: bzr annotate FILENAME
> aliases: ann, blame, praise
> 
> Show the origin of each line in a file.

That doesn't change the fact that the "git pickaxe" abilities in "git blame"
are more than just an equivalent of "cvs annotate".

----
bzr annotate FILENAME
    Show the origin of each line in a file.

----
git-blame [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>]
          [-M] [-C] [-C] [--since=<date>] [<rev>] [--] <file>

Annotates each line in the given file with information from the revision
which last modified the line. Optionally, start annotating from the given
revision.

Also it can limit the range of lines annotated.
[...]
You can also use a regular expression to specify the line range.
  git blame -L '/^sub hello {/,/^}$/' foo
would limit the annotation to the body of hello subroutine.

When you are not interested in changes older than the version v2.6.18, or
changes older than 3 weeks, you can use revision range specifiers similar
to git-rev-list:
  git blame v2.6.18.. -- foo
  git blame --since=3.weeks -- foo
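A small sketch of the regular-expression form of -L on a toy file. The file "foo" and its two subroutines are invented for the example; only the -L expression is taken from the text above:

```shell
# Throwaway repository with a toy Perl file called foo.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

cat > foo <<'EOF'
sub hello {
    print "hello\n";
}

sub goodbye {
    print "goodbye\n";
}
EOF
git add foo
git commit -q -m "add foo"

# Annotate only the lines from "sub hello {" to the closing "}".
git blame -s -L '/^sub hello {/,/^}$/' foo
```

Only the three lines of the hello subroutine are annotated; goodbye is left out.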

http://kernel.org/pub/software/scm/git/docs/git-blame.html
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 12:37                                                                         ` Jakub Narebski
@ 2006-11-28 13:35                                                                           ` Johannes Schindelin
  2006-11-28 16:08                                                                             ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-28 13:35 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Hi,

On Tue, 28 Nov 2006, Jakub Narebski wrote:

> [... some reasons why git-annotate is not just your regular annotate ...]

You should also mention that git-annotate can follow code movements 
through file renames.

I know, because I was already rightfully blamed for code which was moved 
by somebody else.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 13:35                                                                           ` Johannes Schindelin
@ 2006-11-28 16:08                                                                             ` Linus Torvalds
  2006-11-28 17:07                                                                               ` Aaron Bentley
  2006-11-28 17:44                                                                               ` Nicholas Allen
  0 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 16:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jakub Narebski, git, bazaar-ng



On Tue, 28 Nov 2006, Johannes Schindelin wrote:
> 
> On Tue, 28 Nov 2006, Jakub Narebski wrote:
> 
> > [... some reasons why git-annotate is not just your regular annotate ...]
> 
> You should also mention that git-annotate can follow code movements 
> through file renames.

.. and within the same file, and _copied_ from other files.

A good example of this is still just doing a

	git blame -C revision.c

because that "revision.c" file was created by splitting the old 
"rev-list.c" into two files (revision.c and rev-list.c). And the fact that 
"git blame" catches it and shows it in a very natural format is really 
quite nice.

(rev-list.c has since been renamed to "builtin-rev-list.c", so if you want 
to see the "other" side of the split, just do

	git blame -C builtin-rev-list.c

in order to realize how well git blame follows both renames _and_ pure 
data movement).
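The same effect can be sketched on a toy repository. Everything here (old.c, new.c, the two functions) is made up; the only assumption about git is that -C looks for moved lines in other files changed by the same commit:

```shell
# Throwaway repository; file and function names are illustrative only.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

# First commit: two functions in one file.
cat > old.c <<'EOF'
int add_numbers(int first, int second)
{
    int result_of_addition = first + second;
    return result_of_addition;
}

int multiply_numbers(int first, int second)
{
    int result_of_multiplication = first * second;
    return result_of_multiplication;
}
EOF
git add old.c
git commit -q -m "add both functions"

# Second commit: split the file, moving multiply_numbers() to new.c.
sed -n '7,11p' old.c > new.c
sed -n '1,5p' old.c > old.tmp
mv old.tmp old.c
git add old.c new.c
git commit -q -m "split old.c into two files"

# Plain blame pins every line of new.c on the split commit...
git blame -s new.c
# ...while -C follows the lines back to old.c in the first commit.
git blame -C -s new.c
```

In the -C output the annotated lines carry the file name old.c, since that is where the content first appeared.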

The reason this is a good example is simply the fact that it should 
totally silence anybody who still thinks that tracking file identities is 
a good thing. It explains well why tracking file identities is just 
_stupid_.

You simply couldn't have done that kind of split sanely with file identity 
tracking (well, that one only had a single copy, so you could argue that a 
file identity tracker with copies could have done it, but the fact is that 
(a) they never do and (b) "git blame" can equally well track stuff that 
comes from _multiple_ different "file identities").

Such a "multiple sources" case can actually be found by doing

	git blame -C tree-walk.c

which (correctly) figures out that the code comes from both merge-tree.c 
(the "entry compare/extract" functions) _and_ from sha1_name.c (the 
"find_tree_entry()" function). 

So yes, "git blame" is a _hell_ of a lot more powerful than anybody else's 
"annotate", as far as I know. I literally suspect that nobody else comes 
even close.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 16:08                                                                             ` Linus Torvalds
@ 2006-11-28 17:07                                                                               ` Aaron Bentley
  2006-11-28 17:29                                                                                 ` Jakub Narebski
  2006-11-28 18:00                                                                                 ` Linus Torvalds
  2006-11-28 17:44                                                                               ` Nicholas Allen
  1 sibling, 2 replies; 806+ messages in thread
From: Aaron Bentley @ 2006-11-28 17:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Johannes Schindelin, Jakub Narebski

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Linus Torvalds wrote:
> in order to realize how well git blame follows both renames _and_ pure 
> data movement).
> 
> The reason this is a good example is simply the fact that it should 
> totally silence anybody who still thinks that tracking file identities is 
> a good thing. It explains well why tracking file identities is just 
> _stupid_.

No need to be aggressive about this.  Yes, it's true that file identity
doesn't directly solve this problem, but it doesn't prove that an
identity-based approach is wrong.

In the end, everything comes down to identity of some kind.  Because if
you're going to apply someone else's changes, you must apply them to the
same thing that they changed.

Git determines identity based on content, while bzr has the user
indicate it.  Both approaches work.

Bzr supports merging based on line identity (our weave merge, not our
knit merge).  At the moment, our concept of line identity is based on
file identity, but there's no reason it has to stay that way.

> You simply couldn't have done that kind of split sanely with file identity 
> tracking (well, that one only had a single copy, so you could argue that a 
> file identity tracker with copies could have done it, but the fact is that 
> (a) they never do and (b) "git blame" can equally well track stuff that 
> comes from _multiple_ different "file identities").

I think you're wrong about that.  There's nothing stopping bzr from
inferring a file split, or even explicitly recording it.  bzr doesn't
record copies, because we think there are no sane merge semantics across
copies.

> So yes, "git blame" is a _hell_ of a lot more powerful than anybody else's 
> "annotate", as far as I know. I literally suspect that nobody else comes 
> even close.

I notice that blame has an option to limit the annotation to recent
history.  I can only assume that is for performance reasons.  bzr
annotate doesn't need a feature like that, because annotations are
explicit in bzr's storage format.  I expect that even if we were to
extend annotate to track content across files, it would still be so fast
that we wouldn't need it.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFbGy70F+nu1YWqI0RAt75AKCAy0ALi0IKzqZpgnavJrx97+lhDgCfaMSe
fs4Lt77k1/OXC82aFbh5pKg=
=/OiA
-----END PGP SIGNATURE-----




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 17:07                                                                               ` Aaron Bentley
@ 2006-11-28 17:29                                                                                 ` Jakub Narebski
  2006-11-28 18:31                                                                                   ` Aaron Bentley
  2006-11-28 18:00                                                                                 ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 17:29 UTC (permalink / raw)
  To: bazaar-ng; +Cc: git

Aaron Bentley wrote:

> Linus Torvalds wrote:

>> So yes, "git blame" is a _hell_ of a lot more powerful than anybody else's 
>> "annotate", as far as I know. I literally suspect that nobody else comes 
>> even close.

Well, without the content-based detection of content copying and moving,
git-blame wouldn't work as well as it works now.
 
> I notice that blame has an option to limit the annotation to recent
> history.  I can only assume that is for performance reasons.  bzr
> annotate doesn't need a feature like that, because annotations are
> explicit in bzr's storage format. 

But you don't have content movement tracking.

> 
>                                   I expect that even if we were to 
> extend annotate to track content across files, it would still be so fast
> that we wouldn't need it.

I think not.


The first example:

$ time git blame -C revision.c >/dev/null

real    0m7.577s
user    0m7.248s
sys     0m0.020s

while without content copying and moving detection we have

$ time git blame revision.c >/dev/null

real    0m2.108s
user    0m2.044s
sys     0m0.024s

(on 2000 BogoMIPS CPU).


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 16:08                                                                             ` Linus Torvalds
  2006-11-28 17:07                                                                               ` Aaron Bentley
@ 2006-11-28 17:44                                                                               ` Nicholas Allen
  2006-11-28 18:06                                                                                 ` Jakub Narebski
  1 sibling, 1 reply; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 17:44 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Johannes Schindelin, bazaar-ng, git, Jakub Narebski


>
> The reason this is a good example is simply the fact that it should 
> totally silence anybody who still thinks that tracking file identities is 
> a good thing. It explains well why tracking file identities is just 
> _stupid_.
I'm unfamiliar with git so I could be totally wrong here!

I know that bzr supports file renames/moves very effectively and I 
understood that git doesn't support this to the same extent (correct me 
if I am wrong as I have not used git at all!).

If that is the case, could that be because bzr gives each file its own 
id and can detect this easily but git's content based approach can't? If 
so then claiming file identifiers is *stupid* seems a bit extreme. So I 
would have thought *both* file identifiers and line/content identifiers 
are needed for tracking changes made to the files and to their contents 
respectively. When a file is copied then the contents are copied and it 
is given a new file identifier. When a file is moved it keeps the same 
identifier. So don't you need file identifiers as well as line/content 
identifiers?


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 17:07                                                                               ` Aaron Bentley
  2006-11-28 17:29                                                                                 ` Jakub Narebski
@ 2006-11-28 18:00                                                                                 ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 18:00 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git, Johannes Schindelin, Jakub Narebski



On Tue, 28 Nov 2006, Aaron Bentley wrote:
> 
> I notice that blame has an option to limit the annotation to recent
> history.  I can only assume that is for performance reasons.

You'd assume wrong.

Trust me, if you talk about performance, bzr will lose. I can pretty much 
guarantee you that you perform worse. The mozilla discussion pointed to a 
performance test between hg and bzr, and hg in that test tended to perform 
better by a factor of 2-10. And git tends to be another factor faster than 
_that_.

Performance is important to git, but it's important not in the sense of 
"let's not do it because it performs badly", but in the sense of "things 
should be so fast that people don't even realize that they are done". You 
guys may count commit times in seconds. I still want to commit multiple 
patches _per_second_ to the kernel tree. THAT is performance.

So no, performance wasn't the reason.

The reason is simple: be logical. The original blame/annotate semantics 
were

	git blame filename

which is what people traditionally use, but then to specify which version 
to _start_ with (in case you wanted to go backwards in time), you had an 
optional revision argument at the end.

Which is totally against how all the other git programs work, and I 
complained, because I had actually wanted to see the blame at a particular 
release version, and what my fingers typed didn't work. I want to be able 
to do

	git blame [revno] [--] filename

the same way I can ask for a git log, git whatchanged, gitk, and any 
other such history tool.

And once you do the same command line parsing as the other log-related 
commands, you pretty much automatically get the revision limiting. So now 
you can do

	git blame v2.6.17..v2.6.18 filename

on the kernel archive to see who is to blame for certain lines in a 
certain _range_ of commits. It just fell out of using the same syntax 
everywhere.
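A toy version of the range form, with invented tag names "v1" and "v2" standing in for the kernel releases:

```shell
# Throwaway repository; v1 plays the known-good release, v2 the later one.
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git config user.name "Your Name"
git config user.email "you@email.com"

printf 'line one\nline two\n' > notes.txt
git add notes.txt
git commit -q -m "known-good release"
git tag v1

printf 'line one\nline two, changed\n' > notes.txt
git commit -q -a -m "later change"
git tag v2

# Blame only within v1..v2; anything older is pinned on the boundary
# commit, which git marks with a leading "^".
git blame -s v1..v2 -- notes.txt
```

The unchanged first line shows up with the "^" boundary marker, while the changed second line is attributed to the commit tagged v2.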

It also happens to be useful. Quite often, you know something broke 
after a particular known-good release, so you're interested in the blame, 
but anything older than that known-good release is simply noise, and 
actually takes AWAY from the information, by just making things more 
cluttered.

			Linus




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 17:44                                                                               ` Nicholas Allen
@ 2006-11-28 18:06                                                                                 ` Jakub Narebski
  2006-11-28 18:58                                                                                   ` Nicholas Allen
                                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 18:06 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Nicholas Allen wrote:

>> The reason this is a good example is simply the fact that it should 
>> totally silence anybody who still thinks that tracking file identities is 
>> a good thing. It explains well why tracking file identities is just 
>> _stupid_.
>
> I'm unfamiliar with git so I could be totally wrong here!
> 
> I know that bzr supports file renames/moves very effectively and I 

This means: _usually_ works, doesn't it? Emphasis on "usually"?

> understood that git doesn't support this to the same extent (correct me 
> if I am wrong as I have not used git at all!).

Git supports renames/moves in a different way. Instead of recording renames
in the repository (which has troubles of its own, for example a rename done
by applying a patch), it _detects_ renames when needed.
 
> If that is the case, could that be because bzr gives each file its own 
> id and can detect this easily but git's content based approach can't? If 
> so then claiming file identifiers is *stupid* seems a bit extreme. So I 
> would have thought *both* file identifiers and line/content identifiers 
> are needed for tracking changes made to the files and to their contents 
> respectively. When a file is copied then the contents are copied and it 
> is given a new file identifier. When a file is moved it keeps the same 
> identifier. So don't you need file identifiers as well as line/content 
> identifiers?

There is trouble with file-ids. The most common example is a file
which was created in two branches (two repositories) independently, after
which the branches got merged. Most (all?) file-id based rename detection
has trouble with repeated merging of those branches, even if there are no
true conflicts.

Read Linus' post about file-id based rename detection:
  Message-ID: <Pine.LNX.4.64.0610201049250.3962@g5.osdl.org>
  http://permalink.gmane.org/gmane.comp.version-control.bazaar-ng.general/18458

Not that content-based rename detection doesn't have its own pitfalls:
  Message-ID: <7virha4cnm.fsf@assigned-by-dhcp.cox.net>
  http://permalink.gmane.org/gmane.comp.version-control.git/31899
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 17:29                                                                                 ` Jakub Narebski
@ 2006-11-28 18:31                                                                                   ` Aaron Bentley
  2006-11-28 18:43                                                                                     ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-11-28 18:31 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jakub Narebski wrote:
>>I notice that blame has an option to limit the annotation to recent
>>history.  I can only assume that is for performance reasons.  bzr
>>annotate doesn't need a feature like that, because annotations are
>>explicit in bzr's storage format. 
> 
> 
> But you don't have content movement tracking.
> 
> 
>>                                  I expect that even if we were to 
>>extend annotate to track content across files, it would still be so fast
>>that we wouldn't need it.
> 
> 
> I think not.

There's no question that determining content movement could involve
opening a lot of revisions, but you wouldn't need to examine:

1. revisions that didn't alter any lines being examined
2. revisions that altered only the file in question
3. revisions with multiple parents, because any lines attributed to that
merge will be the outcome of conflict resolution.  (Other lines will be
attributed to one of the parents)

I'll admit though, that when I was thinking of this, I was thinking of
annotation-based merging, a scenario in which the number of lines being
examined is typically extremely low.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFFbICL0F+nu1YWqI0RAhaXAJ9tqw/J17oKDV0nnuPlputs1PHBIgCghs6K
q++u4Z9OFGwziUBsnW08y0U=
=tmqe
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 18:31                                                                                   ` Aaron Bentley
@ 2006-11-28 18:43                                                                                     ` Jakub Narebski
  2006-11-28 21:59                                                                                       ` Aaron Bentley
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 18:43 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

On Tuesday, 28 November 2006 19:31, Aaron Bentley wrote:
> Jakub Narebski wrote:
>>>I notice that blame has an option to limit the annotation to recent
>>>history.  I can only assume that is for performance reasons.  bzr
>>>annotate doesn't need a feature like that, because annotations are
>>>explicit in bzr's storage format.
>>
>> But you don't have content movement tracking.
>>
>>>                                  I expect that even if we were to
>>>extend annotate to track content across files, it would still be so fast
>>>that we wouldn't need it.
>>
>>
>> I think not.
> 
> There's no question that determining content movement could involve
> opening a lot of revisions, but you wouldn't need to examine:
> 
> 1. revisions that didn't alter any lines being examined
> 2. revisions that altered only the file in question
> 3. revisions with multiple parents, because any lines attributed to that
> merge will be the outcome of conflict resolution.  (Other lines will be
> attributed to one of the parents)
> 
> I'll admit though, that when I was thinking of this, I was thinking of
> annotation-based merging, a scenario in which the number of lines being
> examined is typically extremely low.

Well, I guess that with "annotate friendly" (weave or knit) storage
annotate/blame would be faster. But fast annotate was not one of the
design goals of git.

How fast is "bzr annotate"?
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 18:06                                                                                 ` Jakub Narebski
@ 2006-11-28 18:58                                                                                   ` Nicholas Allen
  2006-11-28 19:11                                                                                   ` Nicholas Allen
  2006-11-28 20:37                                                                                   ` Nicholas Allen
  2 siblings, 0 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 18:58 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git


> There is trouble with file-ids. The most common example is a file
> which was created in two branches (two repositories) independently, after
> which the branches got merged. Most (all?) file-id based rename detection
> has trouble with repeated merging of those branches, even if there are no
> true conflicts.

Do you mean if the 2 files should be merged into 1 file? If they should 
be 2 files with different names there is no problem using file 
identifiers but if they should be merged into one file then I can see 
that this would cause problems. You would have to delete one of the 
files and copy its changes into the other which would create conflicts 
when that file is modified in the other branch. This is a problem if you 
*only* have file identifiers.

But if you tracked both file identifiers *and* content identifiers (as I 
was trying to say in my first post) this wouldn't be a problem would it? 
When content is changed you use the content identifiers but when files 
are changed by renaming or deleting you use file identifiers. To me at 
least it doesn't seem like it's a choice of one or the other or that one 
is stupid and the other isn't but that you need them both. bzr uses file 
ids and git uses content ids. It would be nice if there were an RCS 
that  used both - then you get the best of both worlds don't you?

So I don't think you want to use file identifiers to track changes to 
content (as bzr would do in this case) and you don't want to use content 
identifiers to track changes to files (as git does, to my understanding, 
when a file is renamed).

Nick

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 18:06                                                                                 ` Jakub Narebski
  2006-11-28 18:58                                                                                   ` Nicholas Allen
@ 2006-11-28 19:11                                                                                   ` Nicholas Allen
  2006-11-28 19:40                                                                                     ` Andy Parkins
  2006-11-28 20:37                                                                                   ` Nicholas Allen
  2 siblings, 1 reply; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 19:11 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
> Nicholas Allen wrote:
>
>   
>>> The reason this is a good example is simply the fact that it should 
>>> totally silence anybody who still thinks that tracking file identities is 
>>> a good thing. It explains well why tracking file identities is just 
>>> _stupid_.
>>>       
>> I'm unfamiliar with git so I could be totally wrong here!
>>
>> I know that bzr supports file renames/moves very effectively and I 
>>     
>
> This means: _usually_ works, doesn't it? Emphasis on "usually"?
>
>   
>> understood that git doesn't support this to the same extent (correct me 
>> if I am wrong as I have not used git at all!).
>>     
>
> Git supports renames/moves in a different way. Instead of recording renames
> in the repository (which has troubles of its own, for example a rename made
> by applying a patch), it _detects_ renames when needed.
>   
This can't be fail-safe though. I would prefer to also have the option 
to *explicitly* tell the RCS that a file was renamed, rather than 
have it try to detect the rename from the content, which is bound to 
have corner cases that fail. When I know I renamed a file, why can't I 
explicitly tell the RCS, so that it records the change with the *file 
identifier*? If I change the content then the change is not recorded 
with the file identifier but with the line/content identifier.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 19:11                                                                                   ` Nicholas Allen
@ 2006-11-28 19:40                                                                                     ` Andy Parkins
  2006-11-28 19:59                                                                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Andy Parkins @ 2006-11-28 19:40 UTC (permalink / raw)
  To: git

On Tuesday 2006, November 28 19:11, Nicholas Allen wrote:

> This can't be fail safe though. I would prefer to also have the option
> to be able to *explicitly* tell the RCS that a file was renamed and not
> have it try to detect from the content  which is bound to have corner
> cases that fail. When I know I renamed a file why can't I explicitly

You want to tell git about a rename that will never fail to be detected?  No 
problem.

$ git mv oldname newname
$ git commit

The corner cases you speak about are when you rename and edit.

Personally, I prefer that case to be detected, because at least the 
detection algorithm can be tuned - there is no fixing it if the VCS was 
forced to record it as a rename.
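
A minimal sketch of both cases, assuming a reasonably recent git and a
scratch repository; the file names and contents are made up for
illustration. "-M" asks "git diff" to detect renames, and "-M90%" raises
the similarity threshold, which is one example of the tuning mentioned
above.

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q

for n in 1 2 3 4 5 6 7 8 9 10; do
    echo "line $n of the original file"
done > oldname
git add oldname && git commit -q -m "Add oldname"

# Case 1: pure rename, content untouched.
git mv oldname newname
git commit -q -m "Rename oldname to newname"
pure=$(git diff -M --name-status HEAD~1 HEAD)
echo "$pure"        # an exact rename is reported with status R100

# Case 2: rename and edit in the same commit (7 of the 10 lines are kept).
git mv newname thirdname
head -n 7 thirdname > rewritten
echo "a replacement for the last three lines" >> rewritten
mv rewritten thirdname
git commit -q -a -m "Rename newname to thirdname and edit it"

edited=$(git diff -M --name-status HEAD~1 HEAD)      # default threshold
strict=$(git diff -M90% --name-status HEAD~1 HEAD)   # demand 90% similarity
echo "$edited"      # still reported as a rename, with a similarity score
echo "$strict"      # reported as a delete plus an add
```

Nothing about the rename is stored in either commit; the same pair of
snapshots is simply interpreted with a different threshold each time.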

When I started using git I was worried about the lack of a rename, but now I 
realise that it's not needed - it's pointless.  The VCS is snapshotting 
moments in time, that's it.  Then by making cleverer and cleverer 
interpreters of those snapshots you have the potential to do stuff that is 
far more useful than "just" rename recording.


Andy
-- 
Dr Andrew Parkins, M Eng (Hons), AMIEE

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 19:40                                                                                     ` Andy Parkins
@ 2006-11-28 19:59                                                                                       ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 19:59 UTC (permalink / raw)
  To: git

Andy Parkins wrote:

> On Tuesday 2006, November 28 19:11, Nicholas Allen wrote:
> 
>> This can't be fail safe though. I would prefer to also have the option
>> to be able to *explicitly* tell the RCS that a file was renamed and not
>> have it try to detect from the content  which is bound to have corner
>> cases that fail. When I know I renamed a file why can't I explicitly
> 
> You want to tell git about a rename that will never fail to be detected?  No 
> problem.
> 
> $ git mv oldname newname
> $ git commit
> 
> The corner cases you speak about are when you rename and edit.
> 
> For me, I prefer that to be detected as at least the detection algorithm can 
> be tuned - there is no fixing it if the VCS was forced to consider it a 
> rename.
> 
> When I started using git I was worried about the lack of a rename, but now I 
> realise that it's not needed - it's pointless.  The VCS is snapshotting 
> moments in time, that's it.  Then by making cleverer and cleverer 
> interpreters of those snapshots you have the potential to do stuff that is 
> far more useful than "just" rename recording.

Well, there are two cases where this might be not enough.

One is following file renames for history tracking. git-blame does that,
but git-log and friends do not; the <path> is just a revision limiter.
There is an idea of a --follow option for git-log (and friends), yet to be
implemented.
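
For reference, an option of this name did later ship as "git log --follow".
A minimal sketch, assuming a git new enough to have it and using made-up
file names, shows the difference between the path acting as a plain
revision limiter and the history being followed across a rename:

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q

echo "hello" > hello.txt
git add hello.txt && git commit -q -m "Add hello.txt"
git mv hello.txt greeting.txt
git commit -q -m "Rename hello.txt to greeting.txt"

# Path as revision limiter: only commits that touch the current name.
plain=$(git log --format=%s -- greeting.txt)

# Follow the single file across the rename.
followed=$(git log --format=%s --follow -- greeting.txt)

echo "without --follow:"; echo "$plain"
echo "with --follow:";    echo "$followed"
```

Without --follow only the rename commit is listed; with it, the commit
that added the file under its old name shows up as well.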

The second is rename detection for 3-way merges: only the ancestor and the
final states are considered, so the above would not help. And rename
detection might fail if the ancestor is not similar enough to the end
states; then again, the merge has a low chance of being conflict-free then.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 18:06                                                                                 ` Jakub Narebski
  2006-11-28 18:58                                                                                   ` Nicholas Allen
  2006-11-28 19:11                                                                                   ` Nicholas Allen
@ 2006-11-28 20:37                                                                                   ` Nicholas Allen
  2006-11-28 21:26                                                                                     ` Nicholas Allen
  2006-11-28 21:40                                                                                     ` Martin Langhoff
  2 siblings, 2 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 20:37 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
> Nicholas Allen wrote:
> 
>>> The reason this is a good example is simply the fact that it should 
>>> totally silence anybody who still thinks that tracking file identities is 
>>> a good thing. It explains well why tracking file identities is just 
>>> _stupid_.
>> I'm unfamiliar with git so I could be totally wrong here!
>>
>> I know that bzr supports file renames/moves very effectively and I 
> 
> This means: _usually_ works, doesn't it? Emphasis on "usually"?

Having not used git I can't really say whether git is better than bzr or
not in this regard. I know in the kind of development I do the case
where a file with the same name has been added independently in 2
different branches is a pretty rare one. Usually, when it has happened
the files should have been 2 separate files with different names anyway
- so bzr would have no problem with this.

However, renaming a file is pretty common and I would rather be explicit
about it and have file name changes easily visible/searchable in my log.

Just out of curiosity: How does git handle the case where one file is
renamed differently in 2 branches and then the branches are repeatedly
merged? I know that bzr handles this very well and in various tests I
did there were absolutely no repeated conflicts. Would git behave as
well in this scenario?


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 20:37                                                                                   ` Nicholas Allen
@ 2006-11-28 21:26                                                                                     ` Nicholas Allen
  2006-11-28 21:43                                                                                       ` Jakub Narebski
                                                                                                         ` (2 more replies)
  2006-11-28 21:40                                                                                     ` Martin Langhoff
  1 sibling, 3 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 21:26 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git

> 
> Just out of curiosity: How does git handle the case where one file is
> renamed differently in 2 branches and then the branches are repeatedly
> merged? I know that bzr handles this very well and in various tests I
> did there were absolutely no repeated conflicts. Would git behave as
> well in this scenario?
> 

Ok - I got curious and decided to install git and try this myself.

In this test I had a file hello.txt that got renamed to hello1.txt in
one branch and hello2.txt in another. Then I merged the changes between
the 2 branches.

Here is how it looked after the merge in bzr:

 bzr status
renamed:
  hello2.txt => hello1.txt
conflicts:
  Path conflict: hello2.txt / hello1.txt
pending merges:
  Nicholas Allen 2006-11-28 Renamed hello to hello1


and here's how it looked in git:
git status
#
# Changed but not updated:
#   (use git-update-index to mark for commit)
#
#       unmerged: hello.txt
#       unmerged: hello1.txt
#       unmerged: hello2.txt
#       modified: hello2.txt
#
nothing to commit

So git is not telling me that I have a conflict due to the same file
being renamed differently in 2 branches - well at least not in a way I
can comprehend anyway! Whereas bzr made this very clear. Also, in git I
ended up with 2 files:

 ls
hello1.txt  hello2.txt

whereas in bzr there was only one file and I just had to decide which
name it was to be given to resolve the conflict.

I'm not sure how I should resolve the conflict in git but that's
probably just because I am not familiar with it yet and the message it
gave was not comprehensible or helpful to me in the slightest. In bzr it
was very easy and repeatedly merging caused no trouble at all - the name
conflict had to be resolved only once.

While it was good that git detected my file rename (although this was
not hard as the contents did not change at all) the process in bzr was
*much* smoother and more user friendly than it was in git. When you have
conflicts I think it's especially important that the RCS inform you of
what is really happening so you do not make mistakes. Bzr was much more
informative than git was and told me exactly why there was a conflict
and made it easy to resolve it.

This situation is a pretty common one and it seems to me that git's
content based approach is not as useful in this case as the file
identity approach that bzr uses.


Nick

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 20:37                                                                                   ` Nicholas Allen
  2006-11-28 21:26                                                                                     ` Nicholas Allen
@ 2006-11-28 21:40                                                                                     ` Martin Langhoff
       [not found]                                                                                       ` <456CADE9.7060503@onlinehome.de>
  1 sibling, 1 reply; 806+ messages in thread
From: Martin Langhoff @ 2006-11-28 21:40 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git

On 11/29/06, Nicholas Allen <nick.allen@onlinehome.de> wrote:
> Having not used git I can't really say whether git is better than bzr or
> not in this regard. I know in the kind of development I do the case
> where a file with the same name has been added independently in 2
> different branches is a pretty rare one. Usually, when it has happened
> the files should have been 2 separate files with different names anyway
> - so bzr would have no problem with this.

Not so rare in a true DSCM scenario where people submit patches via
email or a bug tracker. Say two developers apply the same patch to
their trees, and one of them tweaks it a bit. While I don't personally
do kernel development, I understand that's reasonably common in the
linux dev team.

It also happens quite a bit if you cherry pick across branches patches
that create files.

In such cases, I find GIT does the right thing 99% of the time,
including spotting situations where the file got added at different
patchlevels in different branches.

cheers,




^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:26                                                                                     ` Nicholas Allen
@ 2006-11-28 21:43                                                                                       ` Jakub Narebski
  2006-11-28 21:49                                                                                       ` Linus Torvalds
       [not found]                                                                                       ` <20061128214531.GA24299@jameswestby.net>
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 21:43 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Nicholas Allen wrote:

>> Just out of curiosity: How does git handle the case where one file is
>> renamed differently in 2 branches and then the branches are repeatedly
>> merged? I know that bzr handles this very well and in various tests I
>> did there were absolutely no repeated conflicts. Would git behave as
>> well in this scenario?
>> 
> 
> Ok - I got curious and decided to install git and try this myself.
> 
> In this test I had a file hello.txt that got renamed to hello1.txt in
> one branch and hello2.txt in another. Then I merged the changes between
> the 2 branches.
> 
> Here is how it looked after the merge in bzr:
> 
>  bzr status
> renamed:
>   hello2.txt => hello1.txt
> conflicts:
>   Path conflict: hello2.txt / hello1.txt
> pending merges:
>   Nicholas Allen 2006-11-28 Renamed hello to hello1
> 
> 
> and here's how it looked in git:
> git status
> #
> # Changed but not updated:
> #   (use git-update-index to mark for commit)
> #
> #       unmerged: hello.txt
> #       unmerged: hello1.txt
> #       unmerged: hello2.txt
> #       modified: hello2.txt
> #
> nothing to commit

Er? What about what the merge printed?

  $ git pull . branch
  Trying really trivial in-index merge...
  fatal: Merge requires file-level merging
  Nope.
  Merging HEAD with c59706ee42aa7b6b2b203d4219210a684f5581f2
  Merging:
  8f43c37 Moved hello.txt to hello_master.txt
  c59706e Moved hello.txt to hello_branch.txt
  found 1 common ancestor(s):
  b7d5f1a Initial commit
  CONFLICT (rename/rename): Rename hello.txt->hello_master.txt in branch 
    HEAD rename hello.txt->hello_branch.txt in c59706e
  Automatic merge failed; fix conflicts and then commit the result.

I agree that git-status output could be more helpful in the case of
merges. Well, you can always check "git ls-files --stage"

  $ git ls-files --stage --abbrev
  100644 18249f3 1        hello.txt
  100644 18249f3 3        hello_branch.txt
  100644 18249f3 2        hello_master.txt
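
The third column is the index stage: 1 is the version from the common
ancestor, 2 the version from the current branch, and 3 the version from
the branch being merged. A sketch that should reproduce a comparable state
in a scratch repository (the branch and file names mirror the transcript
above; hashes and the exact wording of the CONFLICT line will differ
between git versions):

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
log=$(mktemp)                       # merge messages are kept outside the work tree

echo "hello" > hello.txt
git add hello.txt && git commit -q -m "Initial commit"
trunk=$(git symbolic-ref --short HEAD)   # "master" or "main", depending on setup

git checkout -q -b branch
git mv hello.txt hello_branch.txt
git commit -q -m "Moved hello.txt to hello_branch.txt"

git checkout -q "$trunk"
git mv hello.txt hello_master.txt
git commit -q -m "Moved hello.txt to hello_master.txt"

# The merge is expected to stop with a rename/rename conflict.
if git merge branch > "$log" 2>&1; then
    echo "unexpected: the merge was clean"
fi
grep CONFLICT "$log"

# One index entry per (stage, path) pair that is still unmerged.
staged=$(git ls-files --stage --abbrev)
echo "$staged"
```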

> So git is not telling me that I have a conflict due to the same file
> being renamed differently in 2 branches - well at least not in a way I
> can comprehend anyway! Whereas bzr made this very clear. Also, in git I
> ended up with 2 files:
> 
>  ls
> hello1.txt  hello2.txt
> 
> whereas in bzr there was only one file and I just had to decide which
> name it was to be given to resolve the conflict.

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:26                                                                                     ` Nicholas Allen
  2006-11-28 21:43                                                                                       ` Jakub Narebski
@ 2006-11-28 21:49                                                                                       ` Linus Torvalds
  2006-11-28 21:53                                                                                         ` Shawn Pearce
  2006-11-28 22:00                                                                                         ` Nicholas Allen
       [not found]                                                                                       ` <20061128214531.GA24299@jameswestby.net>
  2 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 21:49 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git



On Tue, 28 Nov 2006, Nicholas Allen wrote:
> 
> and here's how it looked in git:
> git status

Ehh. It told you exactly what happened when you actually did the merge, 
didn't it?

Yeah, "git status" won't tell you _why_ it results in unmerged paths, but 
the merge will have told you.  You must have seen that, but decided to 
just ignore it and not post it, because it didn't support the conclusion 
you wanted to get, did it?

There are lots of reasons why "git status" may tell you that something 
isn't merged. The most common one by far being an actual data conflict, 
not a name conflict. The reason for why something conflicts is always told 
at merge-time.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:49                                                                                       ` Linus Torvalds
@ 2006-11-28 21:53                                                                                         ` Shawn Pearce
  2006-11-28 22:13                                                                                           ` Linus Torvalds
  2006-11-28 22:00                                                                                         ` Nicholas Allen
  1 sibling, 1 reply; 806+ messages in thread
From: Shawn Pearce @ 2006-11-28 21:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicholas Allen, Jakub Narebski, bazaar-ng, git

Linus Torvalds <torvalds@osdl.org> wrote:
> There are lots of reasons why "git status" may tell you that something 
> isn't merged. The most common one by far being an actual data conflict, 
> not a name conflict. The reason for why something conflicts is always told 
> at merge-time.

Except when you are doing a large merge, your terminal scrollback
is really short, and there's a lot of conflicts.  Then you can't
see what merge said about any given file.  :-(

Fortunately it's easy to back out of the merge and redo it with
a large enough scrollback, or with the output redirected to a file for
later review, but it's annoying that we don't save that information off
for later review.
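
A sketch of the redirect-to-a-file approach, with made-up branch and file
names; "tee" keeps the messages on the terminal while also writing them to
a log outside the work tree:

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q
log=$(mktemp)

echo "original line" > notes.txt
git add notes.txt && git commit -q -m "Add notes.txt"
trunk=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo "line as changed on topic" > notes.txt
git commit -q -a -m "Change notes.txt on topic"

git checkout -q "$trunk"
echo "line as changed on trunk" > notes.txt
git commit -q -a -m "Change notes.txt on trunk"

# Keep everything the merge says, including the CONFLICT lines.
git merge topic 2>&1 | tee "$log"

# Later, even after the terminal has scrolled or been closed:
grep CONFLICT "$log"
```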

-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 18:43                                                                                     ` Jakub Narebski
@ 2006-11-28 21:59                                                                                       ` Aaron Bentley
  2006-11-28 22:16                                                                                         ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Aaron Bentley @ 2006-11-28 21:59 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: bazaar-ng, git

Jakub Narebski wrote:
> Well, I guess that with "annotate friendly" (weave or knit) storage
> annotate/blame would be faster. But fast annotate was not one of the
> design goals of git.
> 
> How fast is "bzr annotate"?

$ time bzr annotate builtins.py > /dev/null

real    0m1.479s
user    0m1.430s
sys     0m0.030s

builtins.py has 953 ancestor revisions (i.e. revisions that modified it)
and 3016 lines.

That's on a machine with 4141.87 Bogomips.  I did optimize annotate
slightly, but I'm submitting the optimization for our 0.14.0 release.

Aaron

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:49                                                                                       ` Linus Torvalds
  2006-11-28 21:53                                                                                         ` Shawn Pearce
@ 2006-11-28 22:00                                                                                         ` Nicholas Allen
  2006-11-28 22:25                                                                                           ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 22:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git

Linus Torvalds wrote:
> 
> On Tue, 28 Nov 2006, Nicholas Allen wrote:
>> and here's how it looked in git:
>> git status
> 
> Ehh. It told you exactly what happened when you actually did the merge, 
> didn't it?
> 
> Yeah, "git status" won't tell you _why_ it results in unmerged paths, but 
> the merge will have told you.  You must have seen that, but decided to 
> just ignore it and not post it, because it didn't support the conclusion 
> you wanted to get, did it?

I didn't do this deliberately - it's just because merge spewed out a
whole load of stuff at me that I didn't understand and therefore
overlooked the conflict message in it. I wasn't expecting to see it here
anyway and was hoping for a short and informative summary that I would
understand when I did a status.

Also what happens if I lose the messages because they scrolled off
screen, or the power goes down, or I need to reboot for some reason, or I
don't have time and want to shut down my computer, restart another day and
resolve the conflicts then? All useful conflict status is lost isn't it?
That's why I expected git status to tell me this in some understandable
manner and was not even expecting it to only be in the merge output....


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:53                                                                                         ` Shawn Pearce
@ 2006-11-28 22:13                                                                                           ` Linus Torvalds
  2006-11-28 22:22                                                                                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 22:13 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: Nicholas Allen, Jakub Narebski, bazaar-ng, git



On Tue, 28 Nov 2006, Shawn Pearce wrote:
> 
> Except when you are doing a large merge, your terminal scrollback
> is really short, and there's a lot of conflicts.  Then you can't
> see what merge said about any given file.  :-(

Heh. Which is partly why I just do "git diff", which usually tells me what 
is up, or "git log --stat --merge", which is usually even better. I've 
never actually had to scroll up.

[ But I'll also admit that I used to have a "xterm*savedlines=5000" in my 
  .Xdefaults, and it might be worth it for some people. I haven't actually 
  needed it with git, because the _real_ reason for it used to be applying 
  patch-sets, and I've made sure that the git patch-application is so 
  robust that I never need to go back and look for reasons for conflicts - 
  if something conflicts, it just _stops_ and undoes the whole patch 
  instead of continuing to apply the rest or leave the already-applied 
  part applied. ]

Although I agree that we could probably also improve "git status" output, 
especially as I doubt it has been tested much.

People don't tend to use "git status" very much, I suspect - the most 
common usage is not in "git status" itself, but simply as the commit 
message template, and that one obviously cannot have any unmerged stuff at 
all (since then we'd refuse to even go as far as asking for a commit 
message in the first place).

Figuring out that the reason for a conflict is a name clash is not 
necessarily possible after the merge, though: it's really up to the merge 
policy to decide to merge a file cleanly or not, and the "Why" part of why 
some particular merge policy decided not to resolve a file is really 
internal to the policy, and not externally visible in the tree itself.

(But we can certainly see whether it was a pure content conflict or 
whether it had some component of a name clash by just looking at what 
stages we have for a name: so we could at least separate out the causes 
for merge failures at least _partially_ in "git status")
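
A rough sketch of that partial separation, not anything "git status"
does here: the helper name classify_unmerged and the wording of its
report are made up. It only looks at which stages exist for each unmerged
path. All three stages means both sides changed the same path; anything
less means some side is missing the path, which covers renames but also
adds and deletes, so the split really is only partial.

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q

echo "alpha" > a.txt
echo "hello" > hello.txt
git add a.txt hello.txt && git commit -q -m "Initial commit"
trunk=$(git symbolic-ref --short HEAD)

git checkout -q -b branch
echo "alpha, as edited on branch" > a.txt
git mv hello.txt hello_branch.txt
git commit -q -a -m "Edit a.txt, move hello.txt to hello_branch.txt"

git checkout -q "$trunk"
echo "alpha, as edited on trunk" > a.txt
git mv hello.txt hello_master.txt
git commit -q -a -m "Edit a.txt, move hello.txt to hello_master.txt"

git merge branch > /dev/null 2>&1   # expected to stop with conflicts

# Hypothetical helper: group "git ls-files --unmerged" entries by path and
# report which stages (1 = ancestor, 2 = ours, 3 = theirs) are present.
classify_unmerged() {
    git ls-files --unmerged |
    awk -F'\t' '
        { split($1, meta, " "); stages[$2] = stages[$2] meta[3] }
        END {
            for (path in stages)
                if (stages[path] == "123")
                    print path ": content conflict (stages 1, 2 and 3)"
                else
                    print path ": incomplete stages (" stages[path] "), rename, add or delete involved"
        }' | sort
}

report=$(classify_unmerged)
echo "$report"
```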

> Fortunately its easy to back out of the merge and redo it with
> large enough scrollback, or redirecting it to a file for later
> review, but its annoying that we don't save that information off
> for later review.

I personally find "git log --merge" to be a huge timesaver. But I have to 
say, I don't think I've seen more than one or two name conflicts ever, and 
almost all of the true issues tend to be just regular data conflicts. So 
that's what I personally care about most.

[ For the non-git users, "git log --merge" is just shorthand for a much 
  more complicated git revision parsing expression which boils down to: 
  "show all commits as they pertain to any remaining unmerged pathnames, 
  and only within the symmetrical set difference between the two branches 
  you merged". You could write it out as

	git log ORIG_HEAD...MERGE_HEAD -- $(git ls-files --unmerged)

  but "git log --merge" is a much simpler shorthand for that thing. 

  It's not that merge conflicts are necessarily common, but when they do 
  happen, that's where you _really_ want the SCM to support you in 
  figuring out what happened ]
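
A sketch comparing the shorthand with a written-out form in a scratch
repository. Two substitutions are assumptions of this sketch rather than
part of the expression above: HEAD is used for the pre-merge commit, and
"git diff --name-only --diff-filter=U" is used to get bare path names,
since plain "git ls-files --unmerged" also prints mode, hash and stage
columns. Branch and file names are made up.

```shell
# Scratch repository; the identity is set only so that commits work anywhere.
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.invalid
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.invalid
repo=$(mktemp -d) && cd "$repo" && git init -q

echo "original line" > notes.txt
git add notes.txt && git commit -q -m "Add notes.txt"
trunk=$(git symbolic-ref --short HEAD)

git checkout -q -b topic
echo "topic version" > notes.txt
git commit -q -a -m "topic: change notes.txt"
echo "unrelated" > other.txt
git add other.txt && git commit -q -m "topic: add other.txt"

git checkout -q "$trunk"
echo "trunk version" > notes.txt
git commit -q -a -m "trunk: change notes.txt"

git merge topic > /dev/null 2>&1    # expected to stop with a conflict

# Shorthand: commits from either side that touch the unmerged paths.
short=$(git log --merge --format=%s | sort)

# Written out. The command substitution is left unquoted on purpose so that
# each path becomes its own argument; paths with spaces would need more care.
long=$(git log --format=%s HEAD...MERGE_HEAD -- \
       $(git diff --name-only --diff-filter=U) | sort)

echo "$short"
echo "$long"
```

Both lists should name the two commits that changed notes.txt and leave
out the commit that only added other.txt.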

So with "git diff" showing a three-way diff for anything unmerged, and 
"git log --merge" showing the commits that caused the problems, I don't 
think I've ever really needed to go back and say "ok, so why did that 
fail". 

It's just that "git status" was never what I'd have used in the first 
place. I guess it's been long enough since I used CVS that "git status" 
doesn't even enter my mind all that much on a merge failure.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
       [not found]                                                                                       ` <456CADE9.7060503@onlinehome.de>
@ 2006-11-28 22:14                                                                                         ` Martin Langhoff
  2006-11-28 22:19                                                                                           ` Martin Langhoff
  2006-11-28 22:36                                                                                           ` Nicholas Allen
  0 siblings, 2 replies; 806+ messages in thread
From: Martin Langhoff @ 2006-11-28 22:14 UTC (permalink / raw)
  To: Nicholas Allen, Git Mailing List

On 11/29/06, Nicholas Allen <nick.allen@onlinehome.de> wrote:
> yes I can see if you just use plain patches. In bzr though there are
> bundles that store extra data along with the patch and if you use this
> instead of a simple patch this will never be a problem as bzr can then
> notice the same bundle being merged into 2 branches.

Well, there you start depending on everyone using bzr and providing
metadata-added patches. Git is really good at dealing with scenarios
where not everyone is using Git.. so the
content-is-kind-and-metadata-be-damned pays off handsomely.

And the "scenarios where not everyone is using Git" come up every time
we are tracking a project that uses a different SCM. For me, the
"killer-app" of git is that, as it does not rely on magic metadata, it
is perfectly useful on projects that I track that use CVS or SVN.

I submit or commit patches upstream and git spots the commits being
echoed back in just right because it does not rely on the metadata.
Only on the content.

cheers,


martin

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 21:59                                                                                       ` Aaron Bentley
@ 2006-11-28 22:16                                                                                         ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 22:16 UTC (permalink / raw)
  To: Aaron Bentley; +Cc: bazaar-ng, git

Aaron Bentley wrote:
> Jakub Narebski wrote:
>> Well, I guess that with "annotate friendly" (weave or knit) storage
>> annotate/blame would be faster. But fast annotate was not one of the
>> design goals of git.
>>
>> How fast is "bzr annotate"?
> 
> $ time bzr annotate builtins.py > /dev/null
> 
> real    0m1.479s
> user    0m1.430s
> sys     0m0.030s
> 
> builtins.py has 953 ancestor revisions (i.e. revisions that modified
> it) and 3016 lines.
> 
> That's on a machine with 4141.87 Bogomips.  I did optimize annotate
> slightly, but I'm submitting the optimization for our 0.14.0 release.

Hmmm... git-blame (without contents moving or rename detection) takes 
around 2s user+sys on a 2002 BogoMIPS machine, as compared to 1.5s 
user+sys for bzr annotate on a 4141.87 BogoMIPS machine.

revision.c has 1208 lines, with 62 unique commits in the git-blame output, 
90 commits in its history, and 102 commits in its full history.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28 22:14                                                                                         ` Martin Langhoff
@ 2006-11-28 22:19                                                                                           ` Martin Langhoff
  2006-11-28 22:36                                                                                           ` Nicholas Allen
  1 sibling, 0 replies; 806+ messages in thread
From: Martin Langhoff @ 2006-11-28 22:19 UTC (permalink / raw)
  To: Nicholas Allen, Git Mailing List

On 11/29/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> Well, there you start depending on everyone using bzr and providing
> metadata-added patches. Git is really good at dealing with scenarios
> where not everyone is using Git.. so the
> content-is-kind-and-metadata-be-damned pays off handsomely.

content-is-KING-and-metadata-be-damned :-)

cheers,




* Re: git and bzr
  2006-11-28 22:13                                                                                           ` Linus Torvalds
@ 2006-11-28 22:22                                                                                             ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-28 22:22 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Linus Torvalds wrote:

> [ For the non-git users, "git log --merge" is just shorthand for a much 
>   more complicated git revision parsing expression which boils down to: 
>   "show all commits as they pertain to any remaining unmerged pathnames, 
>   and only within the symmetrical set difference between the two branches 
>   you merged". You could write it out as
> 
>         git log ORIG_HEAD...MERGE_HEAD -- $(git ls-files --unmerged)
> 
>   but that "git log --merge" is a much simpler shorthand for that thing. 
> 
>   It's not that merge conflicts are necessarily common, but when they do 
>   happen, that's where you _really_ want the SCM to support you in 
>   figuring out what happened ]

It would be nice if this was documented in git-log(1), and not only
_partially_ in git-rev-list(1). And it would be nice to have this in the
proposed "Branches and merges" tutorial (part 3?) as well.
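As a rough illustration of the shorthand quoted above, here is a sketch in a
throwaway repository. The file names, the branch name "side" and the commit
messages are invented for the example, and it assumes a reasonably recent
git on the PATH. After a failed merge, "git log --merge" should list only
the commits from either side that touch the still-unmerged path:

```shell
#!/bin/sh
# Sketch only: every name below is made up for the demonstration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nline two\nline three\n' > file.txt
echo 'untouched' > other.txt
git add file.txt other.txt
git commit -q -m "base"
base=$(git symbolic-ref --short HEAD)   # name of the default branch

# Side branch: one commit that will conflict, one that will not.
git checkout -q -b side
printf 'line one\nside change\nline three\n' > file.txt
git commit -q -a -m "side: edit file.txt"
echo 'changed on side' > other.txt
git commit -q -a -m "side: edit other.txt only"

# Original branch: edit the same line of file.txt.
git checkout -q "$base"
printf 'line one\nmain change\nline three\n' > file.txt
git commit -q -a -m "main: edit file.txt"

# The merge stops with a conflict in file.txt.
git merge side > /dev/null 2>&1 || true

# Commits on either side that touch the unmerged path.
merge_log=$(git log --merge --format=%s)
printf '%s\n' "$merge_log"
```

The commit that only touched other.txt is left out, because that file merged
cleanly and is no longer part of the problem.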

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git



* Re: git and bzr
  2006-11-28 22:00                                                                                         ` Nicholas Allen
@ 2006-11-28 22:25                                                                                           ` Linus Torvalds
  2006-11-28 22:41                                                                                             ` Linus Torvalds
                                                                                                               ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 22:25 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git



On Tue, 28 Nov 2006, Nicholas Allen wrote:
> 
> Also what happens if I lose the messages because they scrolled off
> screen or the power goes down, I need to reboot for some reason, or I
> don't have time and want to shut down my computer, restart another day and
> resolve the conflicts then?

I'd suggest just re-doing the merge. Something like

	git reset --hard
	git merge -m "dummy message" MERGE_HEAD

will do it for you (that's the new "nicer syntax" for doing a merge, in 
real life I'd personally just have done a re-pull or something)

> All useful conflict status is lost isn't it?

No, it's actually there, but "git status" doesn't really explain it to 
you.

The go-to command tends to be "git diff", which after a merge will not 
show anything that already merged correctly (because it will have been 
updated in the git index _and_ updated in the working tree, so there will 
be no diff from stuff that auto-merged). So any output at all after a 
failed merge from "git diff" generally tells you exactly what failed.

But since 99%+ of all merge conflicts are data-conflicts, I suspect the 
output is mostly geared towards that.

The other useful tools to be used are "git log --merge" (explained in a 
separate mail) and for people like me who like the git index and grok it 
fully, doing a

	git ls-files --unmerged --stage

is probably what I'd do (but I have to admit, that is _not_ a very 
user-friendly interface - you need to not only have understood the index 
file, you actually need to understand it on a very deep level).

"git status" really used to be just a stupid wrapper around "git ls-files" 
(it's now largely a built-in), but it was really _so_ stupid that it 
doesn't really try to explain what it does - it's more like a simplified 
version of ls-files with some of the information pruned away, and other 
parts in a slightly more palatable format ;)

So improving "git status" might mean that some people could avoid having 
to learn about the index file details ;)
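To make the two inspection commands above concrete, here is a minimal sketch
that provokes a content conflict in a scratch repository and then looks at
it. The file names, the branch name "side" and the user identity are
assumptions made up for the example; only the git commands themselves come
from the discussion.

```shell
#!/bin/sh
# Sketch only: provoke a conflict in a throwaway repository, then inspect it.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line one\nline two\nline three\n' > file.txt
echo 'untouched' > other.txt
git add file.txt other.txt
git commit -q -m "base"
base=$(git symbolic-ref --short HEAD)   # name of the default branch

# Side branch changes the line that the original branch will also change,
# plus a second file that nobody else touches.
git checkout -q -b side
printf 'line one\nside change\nline three\n' > file.txt
echo 'changed on side' > other.txt
git commit -q -a -m "side: edit file.txt and other.txt"

git checkout -q "$base"
printf 'line one\nmain change\nline three\n' > file.txt
git commit -q -a -m "main: edit file.txt"

# The merge stops on file.txt; other.txt merges cleanly.
git merge side > /dev/null 2>&1 || true

# Paths that still differ between index and working tree.
conflicted=$(git diff --name-only)
# Index view: one entry per stage (1 = common ancestor, 2 = ours, 3 = theirs).
unmerged=$(git ls-files --unmerged --stage)

printf 'git diff reports:\n%s\n' "$conflicted"
printf 'unmerged index entries:\n%s\n' "$unmerged"
```

The cleanly merged other.txt does not show up in either listing, which is
the point made above: anything "git diff" still prints after a failed merge
is something that needs attention.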



* Re: git and bzr
       [not found]                                                                                       ` <20061128214531.GA24299@jameswestby.net>
@ 2006-11-28 22:34                                                                                         ` Nicholas Allen
  0 siblings, 0 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 22:34 UTC (permalink / raw)
  To: bazaar-ng, Git Mailing List

Thanks for the informative response. It helped but I'm still slightly
confused by git - I think I need to play around with it a bit more to
understand and get more familiar with the concepts...

Purely from an initial usage point of view though, for me at least, the
bzr output needed no explanation which I think is indicative of a good
user interface whereas the git was not so clear or obvious - there must
be room for improvement in git's user friendliness here surely. But that
might just be because I am clueless when it comes to the way git works
and the concepts it uses ;-)



* Re: git and bzr
  2006-11-28 22:14                                                                                         ` Martin Langhoff
  2006-11-28 22:19                                                                                           ` Martin Langhoff
@ 2006-11-28 22:36                                                                                           ` Nicholas Allen
  2006-11-28 22:47                                                                                             ` Martin Langhoff
  1 sibling, 1 reply; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 22:36 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Git Mailing List, bazaar-ng

Martin Langhoff wrote:
> On 11/29/06, Nicholas Allen <nick.allen@onlinehome.de> wrote:
>> yes I can see if you just use plain patches. In bzr though there are
>> bundles that store extra data along with the patch and if you use this
>> instead of a simple patch this will never be a problem as bzr can then
>> notice the same bundle being merged into 2 branches.
> 
> Well, there you start depending on everyone using bzr and providing
> metadata-added patches. Git is really good at dealing with scenarios
> where not everyone is using Git.. so the
> content-is-kind-and-metadata-be-damned pays off handsomely.
> 
> And the "scenarios where not everyone is using Git" are every time that
> we are tracking a project that uses a different SCM. For me, the
> "killer-app" of git is that, as it does not rely on magic metadata, it
> is perfectly useful on projects that I track that use CVS or SVN.
> 
> I submit or commit patches upstream and git spots the commits being
> echoed back in just right because it does not rely on the metadata.
> Only on the content.
> 
> cheers,
> 
> 
> martin
> ps: hope you don't mind I re-added the CC to git@vger in my reply

Of course not - I also added bzr mailing list back on this discussion too...

I have to agree that's pretty cool!

For the kind of development we do this is not really a big deal though
as all developers can agree on using one RCS. But if you mix git and svn
in this way then the changes can only go one way (from svn to git) can't
they as svn is not so intelligent so this somewhat limits its usefulness
doesn't it?

I know bzr has some beta-level plugin support for SVN foreign
branches (git, mercurial and svk ones too I think) and I believe this
works in both directions. So you can commit to bzr, push that to an svn
repository and also pull changes from svn. Merge branches in bzr and
commit back to svn with log messages and history intact. So bzr still
allows the use of multiple RCS systems...



* Re: git and bzr
  2006-11-28 22:25                                                                                           ` Linus Torvalds
@ 2006-11-28 22:41                                                                                             ` Linus Torvalds
  2006-11-28 22:48                                                                                               ` Nicholas Allen
  2006-11-28 22:46                                                                                             ` Nicholas Allen
  2006-11-29 10:52                                                                                             ` Johannes Schindelin
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-28 22:41 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Jakub Narebski, bazaar-ng, git



On Tue, 28 Nov 2006, Linus Torvalds wrote:
> On Tue, 28 Nov 2006, Nicholas Allen wrote:
> > 
> > All useful conflict status is lost isn't it?
> 
> No, it's actually there, but "git status" doesn't really explain it to 
> you.

Side note, to clarify: in the _simple_ cases it's all actually there.

I can well imagine that in more complex cases, involving multiple 
different files, you may well want to re-do the merge and let the merge 
tell you why it refused to merge something.

So the index, for example, contains just a "final end result" of what the 
merge gave up on, and while for a simple rename conflict like your example 
you could certainly see that directly from the index state (and thus we 
could, for example, have a "git status" that talks about it being a 
filename conflict), if you have a criss-cross rename, the index itself 
doesn't really tell you _why_, and it could look superficially like a data 
conflict. 

In such a case, you'd really have to either go back to the merge itself to 
see what happened, or you'd use the "git log" thing and just work it out 
from there (ie you can ask "git log" to tell you about any renames as they 
happened etc).

I don't think I've actually hit a complex enough merge to need this yet, 
but the graphical tools should help too, ie "gitk --merge" should give you 
everything that "git log --merge" gives you (ie just the commits that 
aren't common, and simplified to just the ones that matter for the 
unmerged filenames in the end result). I can well imagine that being 
useful too.

So the tools are certainly there. "git status" just isn't necessarily the 
best one (or the best that it could be, for that matter)..



* Re: git and bzr
  2006-11-28 22:25                                                                                           ` Linus Torvalds
  2006-11-28 22:41                                                                                             ` Linus Torvalds
@ 2006-11-28 22:46                                                                                             ` Nicholas Allen
  2006-11-29 10:52                                                                                             ` Johannes Schindelin
  2 siblings, 0 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 22:46 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, bazaar-ng, git


> 
> The other useful tools to be used are "git log --merge" (explained in a 
> separate mail) and for people like me who like the git index and grok it 
> fully, doing a
> 
> 	git ls-files --unmerged --stage
> 
> is probably what I'd do (but I have to admit, that is _not_ a very 
> user-friendly interface - you need to not only have understood the index 
> file, you actually need to understand it on a very deep level).
> 
> "git status" really used to be just a stupid wrapper around "git ls-files" 
> (it's now largely a built-in), but it was really _so_ stupid that it 
> doesn't really try to explain what it does - it's more like a simplified 
> version of ls-files with some of the information pruned away, and other 
> parts in a slightly more palatable format ;)
> 
> So improving "git status" might mean that some people could avoid having 
> to learn about the index file details ;)

That sounds good. Better output on status would be nice ;-)



* Re: git and bzr
  2006-11-28 22:36                                                                                           ` Nicholas Allen
@ 2006-11-28 22:47                                                                                             ` Martin Langhoff
  0 siblings, 0 replies; 806+ messages in thread
From: Martin Langhoff @ 2006-11-28 22:47 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Git Mailing List, bazaar-ng

On 11/29/06, Nicholas Allen <nick.allen@onlinehome.de> wrote:
> > ps: hope you don't mind I re-added the CC to git@vger in my reply
>
> Of course not - I also added bzr mailing list back on this discussion too...

Cool

> For the kind of development we do this is not really a big deal though
> as all developers can agree on using one RCS. But if you mix git and svn
> in this way then the changes can only go one way (from svn to git) can't
> they as svn is not so intelligent so this somewhat limits its usefulness
> doesn't it?

Well, if you look in the git toolset, you'll find things like git-svn
which is geared to make it almost transparent to use git to work on a
project where the upstream is using svn and push patches into SVN (if
you have write access, naturally). And git-cvsexportcommit which is a
lot less useful but helps me push series of patches from git into cvs
easily and with the certainty that I am not messing up the content.

> I know bzr has some beta-level plugin support for SVN foreign
> branches (git, mercurial and svk ones too I think) and I believe this
> works in both directions. So you can commit to bzr, push that to an svn
> repository and also pull changes from svn. Merge branches in bzr and
> commit back to svn with log messages and history intact. So bzr still
> allows the use of multiple RCS systems...

Sounds roughly like git-svn ;-)

converge, ye DSCMs




* Re: git and bzr
  2006-11-28 22:41                                                                                             ` Linus Torvalds
@ 2006-11-28 22:48                                                                                               ` Nicholas Allen
  2006-11-29 10:49                                                                                                 ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Nicholas Allen @ 2006-11-28 22:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski


> 
> So the tools are certainly there. "git status" just isn't necessarily the 
> best one (or the best that it could be, for that matter)..

I guess I hit a limitation in the output of status as opposed to a
limitation in what git can do ;-)

Nick





* Re: git and bzr
  2006-11-28  2:57                                                                       ` Linus Torvalds
@ 2006-11-29  2:23                                                                         ` Joseph Wakeling
  2006-11-29  3:51                                                                           ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Joseph Wakeling @ 2006-11-29  2:23 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, bazaar-ng

Thanks to everyone for your very detailed responses. :-)

On the subject of blame and pulling patches from unrelated branches,

Jakub Narebski wrote:
> In git, a repository can have unrelated branches. So you can fetch an
> unrelated repository into your repository, and merge/cherry-pick from there
> if needed.

Sean wrote:
> The Git cherry-pick command lets you grab specific commits from
> other branches in your repo.  But cherry-pick works at the commit
> level, there is no easy way to grab a single function for instance
> and merge just its history into another branch.

Linus Torvalds wrote:
> pickaxe wasn't in the released version back when the discussions were 
> raging, but it's there now. Except it's really called "git blame" these 
> days (and "git annotate") since it's taken over both of those duties.
> 
> However...
> 
>> A frustration with bzr is that pulling or
>> merging patches from another branch or repo requires them to share the
>> same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
>> particular function in project XXX, I'm going to pull that individual
>> bit of code and its development history into project YYY"?
> 
> ... it's not _quite_ that smart. It will only look for sources to new 
> functions from existing sources in the tree that preceded the commit that 
> added the function, so it will _not_ see it coming from another branch or 
> another project entirely.
> 
> So when you ask for code annotations (use the "-C" flag to see code moved 
> across from other files), it will still limit itself to just a particular 
> input set, and not go gallivanting over all possible branches and projects 
> you might have in your repository.

So ... if I understand correctly, I can get patches from somewhere else,
but in the branch history, I will not be able to tell the difference
from having simply newly created them?

With regards to git blame/pickaxe/annotate, the idea of tracking *code*
rather than files was one thing that really excited me when I read about
it in the earlier discussion, and is probably the main reason I'm trying
out git.  I'd like to understand this properly so is there a simple
exercise I can do to demonstrate its capabilities?  I tried an
experiment where I created one file with two lines, then cut one of the
lines, pasted it into a new file, and committed both changes at the same
time.  But git blame -C on the second file just gives me the
time/date/sha1 of its creation, and no indication that the line was
taken from elsewhere.

Back to the more basic queries ... one more difference I've observed
from bzr, after playing around for a while, involves the commands to
undo changes and commits.  It looks like git reset combines the
capabilities of both bzr uncommit and bzr revert: I can undo changes
since the last commit by resetting to HEAD, and I can undo commits by
resetting to HEAD^ or earlier.

Some things here I'm not quite sure about:
(1) the difference between git reset --soft and git reset --mixed,
probably because I don't understand the way the index works, the
difference between changed, updated and committed.
(2) How to remove changes made to an individual file since the last commit.

Last, could someone explain the git merge command?  git pull seems to do
many things which I would need to use bzr merge for---I can "pull"
between branches which have diverged, for example.  I don't understand
quite what git merge does that's different, and when to use one or the
other.

Many thanks again to everyone,



* Re: git and bzr
  2006-11-29  2:23                                                                         ` Joseph Wakeling
@ 2006-11-29  3:51                                                                           ` Linus Torvalds
  2006-11-29  8:07                                                                             ` Junio C Hamano
  2006-11-29 12:17                                                                             ` git blame [was: git and bzr] Joseph Wakeling
  0 siblings, 2 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-29  3:51 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: git, bazaar-ng



On Wed, 29 Nov 2006, Joseph Wakeling wrote:
> 
> So ... if I understand correctly, I can get patches from somewhere else,
> but in the branch history, I will not be able to tell the difference
> from having simply newly created them?

Think of it this way: if the _patch_ looks like it's a code movement, then 
"git blame" will show it as a code movement. Ie, if the patch (to a human) 
looks like it's moving a function from one file into another (which in a 
patch will obviously be a question of removing it from one file, and 
adding it to another), then git will also see it that way, and then "git 
blame" will also follow its history as it moved.

But if somebody sends you a patch that just adds a new function that 
didn't exist in that context at all, then "git blame" won't ever realize 
that that new function was taken from another branch entirely.

> With regards to git blame/pickaxe/annotate, the idea of tracking *code*
> rather than files was one thing that really excited me when I read about
> it in the earlier discussion, and is probably the main reason I'm trying
> out git.  I'd like to understand this properly so is there a simple
> exercise I can do to demonstrate its capabilities?  I tried an
> experiment where I created one file with two lines, then cut one of the
> lines, pasted it into a new file, and committed both changes at the same
> time.  But git blame -C on the second file just gives me the
> time/date/sha1 of its creation, and no indication that the line was
> taken from elsewhere.

Actually, I think you found a bug.

Now, with small changes, "git blame -C" will just ignore copies entirely, 
so your particular test might not have even been supposed to work, but 
trying with a new git repo with two bigger files checked in at the initial 
commit, I'm actually not seeing "git blame -C" do the right thing even for 
real code movement.

And the problem seems to go to the "root commit": if the file existed in 
the root, the logic in "git blame" to diff against the (nonexistent) 
parent of the root commit won't do the right thing, and that just confuses 
git blame entirely.

I think Junio screwed up at some point. I'll send him a bug-report once 
I've triaged this a bit more, but I can recreate your breakage if I start 
a new git database and create two files in the root, and move data between 
them in the second commit (but if I instead create the second file in the 
second commit, and do the movement in the third commit, git blame -C works 
again ;).
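For anyone who wants a simple exercise along these lines, the sketch below
follows the variant described as working: the second file is created in the
second commit and the code is moved in the third. The file names and the
sample function are made up for the example, and the moved block is kept
deliberately long, on the assumption that "git blame -C" ignores very small
moves.

```shell
#!/bin/sh
# Sketch only: file names and contents are invented for the demonstration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Commit 1: a.txt holds two functions.
cat > a.txt <<'EOF'
int compute_checksum_of_buffer(const char *buffer, unsigned long length)
{
        unsigned long running_total_of_all_bytes = 0;
        while (length--)
                running_total_of_all_bytes += (unsigned char) *buffer++;
        return (int) (running_total_of_all_bytes & 0xffff);
}
int unrelated_function_that_stays_in_place(void)
{
        return 42;
}
EOF
git add a.txt
git commit -q -m "add a.txt"
first=$(git rev-parse HEAD)

# Commit 2: b.txt appears with unrelated content.
echo '/* b.txt starts out with only this comment line in it */' > b.txt
git add b.txt
git commit -q -m "add b.txt"

# Commit 3: move the first function (lines 1-7) from a.txt to b.txt.
sed -n '1,7p' a.txt >> b.txt
sed '1,7d' a.txt > a.tmp && mv a.tmp a.txt
git commit -q -a -m "move compute_checksum_of_buffer to b.txt"

# Without -C the moved lines are blamed on commit 3 in b.txt;
# with -C they should be traced back to a.txt in commit 1.
plain=$(git blame --porcelain b.txt)
with_copy=$(git blame -C --porcelain b.txt)
git blame -C b.txt
```

In the "-C" output the moved lines should carry the first commit's id and
the name a.txt, while the comment line stays attributed to b.txt.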

> Back to the more basic queries ... one more difference I've observed
> from bzr, after playing around for a while, involves the commands to
> undo changes and commits.  It looks like git reset combines the
> capabilities of both bzr uncommit and bzr revert: I can undo changes
> since the last commit by resetting to HEAD, and I can undo commits by
> resetting to HEAD^ or earlier.

I'm not quite sure what "bzr revert" does. Git does have a "revert" too, 
but it will append a _new_ commit that actually undoes the commit you're 
asking to revert. If you want to just "undo history" (whether it's one 
commit or many - I don't see why it would be different) then yes, "git 
reset" is the thing to use.

I _suspect_ that bzr people use "uncommit" to undo a commit in order to 
fix it up. In git, you could do that with "git reset" and a new commit, 
but the normal thing to do is just to fix it up, and then do 

	git commit --amend

instead (which amends the last commit to include whatever fixups you did).
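A minimal sketch of that fix-up flow, in a scratch repository with a made-up
file name and commit message:

```shell
#!/bin/sh
# Sketch only: the file name and commit message are invented.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# A commit that turns out to contain a mistake.
echo 'helo wrld' > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"
before=$(git rev-parse HEAD)

# Fix the mistake in the working tree, then fold it into the last commit.
echo 'hello world' > greeting.txt
git commit -q -a --amend -m "add greeting"
after=$(git rev-parse HEAD)

count=$(git rev-list --count HEAD)      # still a single commit
content=$(git show HEAD:greeting.txt)   # the corrected text
echo "commits: $count, content: $content"
```

The history still holds one commit, but it is a new commit object with a
different id, which is why amending is best kept to commits that have not
been published yet.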

> Some things here I'm not quite sure about:
> (1) the difference between git reset --soft and git reset --mixed,
> probably because I don't understand the way the index works, the
> difference between changed, updated and committed.

You'd generally not want to use "--soft" unless you know what the index 
really is. Once you do know about all the index issues, you'll know why 
it's different from "--mixed", but in general, no normal person would ever 
use _either_ --soft (because not changing the index is too confusing if 
you don't know about it) or --mixed (because it's the default).

So in reality, you should use

	git reset

to reset everything but the actual working tree (and it will talk about 
the files that no longer match the state you are resetting _to_, if any 
such files exist), or

	git reset --hard

to reset everything.

Any other usage is strictly for hardcore people only, and if you don't 
know you want to use it, you shouldn't even consider it.

In fact, I'm pretty hardcore, and I don't think I've ever really used 
"--soft". It's largely been replaced by "git commit --amend", because 
amending a commit used to be the only reason to use "--soft", really.

So it might even be worthwhile just dropping "--soft" and "--mixed" 
altogether, but in the meantime, you might as well just ignore them.
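For the curious, the difference between the three modes can be seen with
"git status --porcelain" in a scratch repository. This is only a sketch; the
file name and contents are made up.

```shell
#!/bin/sh
# Sketch only: f.txt and its contents are invented for the demonstration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'version one' > f.txt
git add f.txt
git commit -q -m "first"
echo 'version two, with more text' > f.txt
git commit -q -a -m "second"

# --soft: move the branch back, leave index and working tree alone.
# The change from "second" now shows up as staged.
git reset -q --soft HEAD^
soft_status=$(git status --porcelain)

# Plain "git reset" (same as --mixed): also reset the index.
# The change is still in the working tree, but no longer staged.
git reset -q
mixed_status=$(git status --porcelain)
mixed_content=$(cat f.txt)

# --hard: reset the working tree as well; the change is gone.
git reset -q --hard
hard_status=$(git status --porcelain)
hard_content=$(cat f.txt)

printf 'soft:  [%s]\nmixed: [%s]\nhard:  [%s]\n' \
        "$soft_status" "$mixed_status" "$hard_status"
```

The first status column is the index and the second is the working tree, so
the modification moves from the first column to the second and then
disappears.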

> (2) How to remove changes made to an individual file since the last commit.

"git checkout file"

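A small sketch of that, with made-up file names. The "--" is only there to
separate the path from any branch of the same name; the command itself is
the one given above.

```shell
#!/bin/sh
# Sketch only: a.txt and b.txt are invented for the demonstration.
repo=$(mktemp -d) && cd "$repo" || exit 1
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'original a' > a.txt
echo 'original b' > b.txt
git add a.txt b.txt
git commit -q -m "base"

# Edit both files, then throw away the edit to a.txt only.
echo 'edited a' > a.txt
echo 'edited b' > b.txt
git checkout -- a.txt

restored=$(cat a.txt)   # back to the committed content
kept=$(cat b.txt)       # the edit to the other file is untouched
echo "a.txt: $restored / b.txt: $kept"
```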

> Last, could someone explain the git merge command?

I argued that we should never teach people to use it at all (because "git 
pull" really does everything it can do), but people on the git list said 
people are used to merging, so it exists, and these days the syntax is 
more usable than it used to be.

> git pull seems to do many things which I would need to use bzr merge 
> for---I can "pull" between branches which have diverged, for example.  
> I don't understand quite what git merge does that's different, and when 
> to use one or the other.

Heh. I'm with you. I'm in the "don't use 'git merge' at all" camp, but it 
was argued that people coming from non-git backgrounds would find it 
too confusing to just use "git pull" for merging ;)



* Re: git and bzr
  2006-11-29  3:51                                                                           ` Linus Torvalds
@ 2006-11-29  8:07                                                                             ` Junio C Hamano
  2006-11-29 12:17                                                                             ` git blame [was: git and bzr] Joseph Wakeling
  1 sibling, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-11-29  8:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Joseph Wakeling

Linus Torvalds <torvalds@osdl.org> writes:

> And the problem seems to go to the "root commit": if the file existed in 
> the root, the logic in "git blame" to diff against the (nonexistent) 
> parent of the root commit won't do the right thing, and that just confuses 
> git blame entirely.
>
> I think Junio screwed up at some point. I'll send him a bug-report once 
> I've triaged this a bit more, but I can recreate your breakage if I start 
> a new git database and create two files in the root, and move data between 
> them in the second commit (but if I instead create the second file in the 
> second commit, and do the movement in the third commit, git blame -C works 
> again ;).

Is it safe to assume that the "automatically turning --show-name
on" fixes this issue and it does not have anything to do with
the root commit?  Given the way the "passing the blame"
algorithm works, there should not be anything special about the
root commit --- if some blame remains in a commit:path pair, and
if the commit does not have parents, it takes the blame right
away without needing to run any diff.

> In fact, I'm pretty hardcore, and I don't think I've ever really used 
> "--soft". It's largely been replaced by "git commit --amend", because 
> amending a commit used to be the only reason to use "--soft", really.
> So it might even be worthwhile just dropping "--soft" and "--mixed" 
> altogether, but in the meantime, you might as well just ignore them.

Everything in the above paragraph is correct.

>> git pull seems to do many things which I would need to use bzr merge 
>> for---I can "pull" between branches which have diverged, for example.  
>> I don't understand quite what git merge does that's different, and when 
>> to use one or the other.
>
> Heh. I'm with you. I'm in the "don't use 'git merge' at all" camp, but it 
> was argued that people coming from non-git backgrounds would find it 
> too confusing to just use "git pull" for merging ;)

Interesting.  I had exactly the same response as yours when I
read Joseph's message ;-).


* Re: git and bzr
  2006-11-28 22:48                                                                                               ` Nicholas Allen
@ 2006-11-29 10:49                                                                                                 ` Johannes Schindelin
  2006-11-29 11:01                                                                                                   ` Jakub Narebski
  2006-11-29 20:37                                                                                                   ` Jon Loeliger
  0 siblings, 2 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-29 10:49 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Linus Torvalds, Jakub Narebski, bazaar-ng, git

Hi,

On Tue, 28 Nov 2006, Nicholas Allen wrote:

> [Linus wrote...]
> > 
> > So the tools are certainly there. "git status" just isn't necessarily the 
> > best one (or the best that it could be, for that matter)..
> 
> I guess I hit a limitation in the output of status as opposed to a
> limitation in what git can do ;-)

I think it is something different altogether: you learnt how to use CVS, 
and you learnt how to use bzr, and you are now biased towards using the 
same names for the same operations in git.

I actually use git-status quite often, just before committing, to know 
what I changed. But I will probably retrain my mind to use "git diff" or 
even "git diff --stat", because it is more informative.

As for your scenario: There really should be a "what to do when my merge 
screwed up?" document.

Ciao,
Dscho


* Re: git and bzr
  2006-11-28 22:25                                                                                           ` Linus Torvalds
  2006-11-28 22:41                                                                                             ` Linus Torvalds
  2006-11-28 22:46                                                                                             ` Nicholas Allen
@ 2006-11-29 10:52                                                                                             ` Johannes Schindelin
  2006-11-29 17:29                                                                                               ` Linus Torvalds
  2 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-29 10:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, git, Jakub Narebski

Hi,

On Tue, 28 Nov 2006, Linus Torvalds wrote:

> On Tue, 28 Nov 2006, Nicholas Allen wrote:
> > 
> > All useful conflict status is lost isn't it?
> 
> No, it's actually there, but "git status" doesn't really explain it to 
> you.
> 
> The go-to command tends to be "git diff", which after a merge will not 
> show anything that already merged correctly (because it will have been 
> updated in the git index _and_ updated in the working tree, so there will 
> be no diff from stuff that auto-merged).

This is actually the most meaningful argument for not hiding the index. 
Usually I explain it to people as a "staging area" standing between your 
working directory, and the next committed state.

But I will start explaining the index with "what if your merge failed?".
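
To make that concrete, here is a minimal sketch, assuming a present-day git
and a throwaway repository. The file names (clean.txt, clash.txt) and the
branch name "side" are invented for illustration. After the merge stops,
"git diff" names only the file that still needs attention, because the
auto-merged file is already up to date in both the index and the working
tree:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

# Base commit with two files.
printf 'one\n' > clean.txt
printf 'one\n' > clash.txt
git add clean.txt clash.txt
git commit -q -m "base"
trunk=$(git symbolic-ref --short HEAD)

# The side branch changes both files.
git checkout -q -b side
printf 'one\ntwo\n' > clean.txt
printf 'side\n' > clash.txt
git commit -q -a -m "side change"

# The original branch changes only clash.txt, on the same line.
git checkout -q "$trunk"
printf 'trunk\n' > clash.txt
git commit -q -a -m "trunk change"

# The merge stops on clash.txt; clean.txt merges automatically.
git merge side >/dev/null 2>&1 || true

# Index vs. working tree: only the conflicted path is left.
changed=$(git diff --name-only | sort -u)
echo "still to resolve: $changed"
```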

Ciao,
Dscho





^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 10:49                                                                                                 ` Johannes Schindelin
@ 2006-11-29 11:01                                                                                                   ` Jakub Narebski
  2006-11-29 20:37                                                                                                   ` Jon Loeliger
  1 sibling, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-29 11:01 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Linus Torvalds, bazaar-ng, git

Johannes Schindelin wrote:

> On Tue, 28 Nov 2006, Nicholas Allen wrote:
> 
>> [Linus wrote...]
>>> 
>>> So the tools are certainly there. "git status" just isn't necessarily the 
>>> best one (or the best that it could be, for that matter)..
>> 
>> I guess I hit a limitation in the output of status as opposed to a
>> limitation in what git can do ;-)
> 
> I think it is something different altogether: you learnt how to use CVS, 
> and you learnt how to use bzr, and you are now biased towards using the 
> same names for the same operations in git.
> 
> I actually use git-status quite often, just before committing, to know 
> what I changed. But I will probably retrain my mind to use "git diff" or
> even "git diff --stat", because it is more informative.
> 
> As for your scenario: There really should be a "what to do when my merge 
> screwed up?" document.

It would be nice to have a git-resolved (or git-resolve) wrapper around
git-update-index, similar to git-add, git-mv and git-rm, which would mark
a file as resolved without the need for git-update-index, git-add or git-rm,
even in the case of CONFLICT(rename/rename). Although I'm not sure
if it could work in all cases in the simple form of "git resolved <file>",
e.g. in the case of CONFLICT(add/add).

By the way, I wonder if git can detect the case when the same (or nearly
the same) file was added in two different branches under different
filename...


^ permalink raw reply	[flat|nested] 806+ messages in thread

* git blame [was: git and bzr]
  2006-11-29  3:51                                                                           ` Linus Torvalds
  2006-11-29  8:07                                                                             ` Junio C Hamano
@ 2006-11-29 12:17                                                                             ` Joseph Wakeling
  2006-11-29 16:39                                                                               ` Linus Torvalds
  1 sibling, 1 reply; 806+ messages in thread
From: Joseph Wakeling @ 2006-11-29 12:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds wrote:
> Now, with small changes, "git blame -C" will just ignore copies entirely, 

Obvious when I think about it, otherwise every 'int i;' in the kernel
would have a huge blame list ... :-O

> I think Junio screwed up at some point. I'll send him a bug-report once 
> I've triaged this a bit more, but I can recreate your breakage if I start 
> a new git database and create two files in the root, and move data between 
> them in the second commit (but if I instead create the second file in the 
> second commit, and do the movement in the third commit, git blame -C works 
> again ;).

Actually my setup was like the latter situation you describe, so blame
was probably working fine and just ignoring the small change.  But
serendipity is a wonderful thing. :-)

    -- Joe

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-29 12:17                                                                             ` git blame [was: git and bzr] Joseph Wakeling
@ 2006-11-29 16:39                                                                               ` Linus Torvalds
  2006-11-30 18:24                                                                                 ` Joseph Wakeling
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-29 16:39 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: git



On Wed, 29 Nov 2006, Joseph Wakeling wrote:
>
> Linus Torvalds wrote:
> > Now, with small changes, "git blame -C" will just ignore copies entirely, 
> 
> Obvious when I think about it, otherwise every 'int i;' in the kernel
> would have a huge blame list ... :-O

Indeed. We didn't do that heuristic originally, and the most common 
sequence that was "blamed" on being copied from somewhere else was 
something like the string

	"<tab><tab><tab>}<nl><tab><tab>}<nl><tab>}<nl>"

which is obviously very common in C, especially when you have coding 
conventions and people follow them ;)

> > them in the second commit (but if I instead create the second file in the 
> > second commit, and do the movement in the third commit, git blame -C works 
> > again ;).
> 
> Actually my setup was like the latter situation you describe, so blame
> was probably working fine and just ignoring the small change.  But
> serendipity is a wonderful thing. :-)

Yeah. As it turns out, the bug was really that "git blame" ended up just 
not showing the filenames (that it had followed correctly), because it had 
decided (incorrectly) that they weren't interesting because it all came 
from the same commit, and it had already shown that commit (just not that 
_file_ in that commit).

So it's fixed now, and probably would never trigger except for the stupid 
special case that was "let's just show an example of this" ;)


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 10:52                                                                                             ` Johannes Schindelin
@ 2006-11-29 17:29                                                                                               ` Linus Torvalds
  2006-11-29 18:54                                                                                                 ` Marko Macek
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-29 17:29 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Nicholas Allen, Jakub Narebski, bazaar-ng, git



On Wed, 29 Nov 2006, Johannes Schindelin wrote:
> 
> On Tue, 28 Nov 2006, Linus Torvalds wrote:
> > 
> > The go-to command tends to be "git diff", which after a merge will not 
> > show anything that already merged correctly (because it will have been 
> > updated in the git index _and_ updated in the working tree, so there will 
> > be no diff from stuff that auto-merged).
> 
> This is actually the most meaningful argument for not hiding the index. 
> Usually I explain it to people as a "staging area" standing between your 
> working directory, and the next committed state.
> 
> But I will start explaining the index with "what if your merge failed?".

The thing is, the staging area is needed for a lot more than just merges. 
Every single SCM has one, because even something as _trivial_ as "commit 
all files" actually needs it. People don't just always think about it, and 
the git staging area is "bigger" than most others.

Most other SCM's have a staging area that is just a list of filenames 
(nobody thinks about it, but "commit everything" doesn't actually commit 
everything at all - it just commits everything /in the list of files that 
the SCM knows about/).

Git's staging area is just more complete than most other SCM's. It 
contains not just the list of filenames, but their permissions too (where 
a lot of other SCM's *cough*CVS*cough* don't do permissions at all), but
also their content, and in the case of a merge conflict, the content of 
the base version and the two branches to be merged.

So the index really _is_ required for pretty much all operations 
(including very much "git commit -a", if only because of the filename 
list), but yeah, if you start by talking about merge conflicts, maybe 
people understand WHY it's also important to actually stage the _contents_ 
of a file too (multiple times, in fact, for a merge conflict), not just 
its name.
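
A minimal sketch of that last point, assuming a present-day git and a
scratch repository (clash.txt and the branch name "side" are illustrative
only). During a conflict the index holds three staged copies of the file:
stage 1 is the base version, stage 2 is the current branch and stage 3 is
the branch being merged:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

printf 'base\n' > clash.txt
git add clash.txt
git commit -q -m "base"
trunk=$(git symbolic-ref --short HEAD)

git checkout -q -b side
printf 'theirs\n' > clash.txt
git commit -q -a -m "side change"

git checkout -q "$trunk"
printf 'ours\n' > clash.txt
git commit -q -a -m "trunk change"

# Both sides touched the same line, so the merge stops here.
git merge side >/dev/null 2>&1 || true

# One index entry per stage for the unmerged path.
stages=$(git ls-files -u | awk '{print $3}' | xargs)
base_text=$(git show :1:clash.txt)
ours_text=$(git show :2:clash.txt)
theirs_text=$(git show :3:clash.txt)
echo "stages: $stages"
echo "base=$base_text ours=$ours_text theirs=$theirs_text"
```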

So most of the time, when you use git, you can ignore the index. It's 
really important, and it's used _all_ the time, but you can still mostly 
ignore it. But when handling a merge conflict, the index is really what 
sets git apart, and what really helps a LOT.

I've used other systems, but the git handling of merge conflicts really is 
superior. Other SCM's think that the merge algorithm is interesting and 
important, and that's bullshit. Merge algorithms are largely trivial and 
uninteresting. The interesting and important thing is to just handle 
failures well, and git does that _really_ well.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 17:29                                                                                               ` Linus Torvalds
@ 2006-11-29 18:54                                                                                                 ` Marko Macek
  2006-11-29 20:07                                                                                                   ` Johannes Schindelin
                                                                                                                     ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Marko Macek @ 2006-11-29 18:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicholas Allen, Jakub Narebski, bazaar-ng, git

Linus Torvalds wrote:
> So most of the time, when you use git, you can ignore the index. It's 
> really important, and it's used _all_ the time, but you can still mostly 
> ignore it. But when handling a merge conflict, the index is really what 
> sets git apart, and what really helps a LOT.
 
Actually, people (at least me) dislike the index because in the most common
operations (status, diff, commit), they have to know that the command doesn't actually
display all their work but just the 'indexed' part of it. 

For people used to cvs, svn and other systems it would be nicer if diff -a
and commit -a (and possibly other commands) were the default.

index is of course necessary during merging, ... and as a speed optimization
for applying patches when you know the working copy is clean.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 18:54                                                                                                 ` Marko Macek
@ 2006-11-29 20:07                                                                                                   ` Johannes Schindelin
  2006-11-29 20:49                                                                                                     ` Jakub Narebski
  2006-11-29 20:45                                                                                                   ` Linus Torvalds
  2006-11-30 12:25                                                                                                   ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-29 20:07 UTC (permalink / raw)
  To: Marko Macek
  Cc: Linus Torvalds, Nicholas Allen, Jakub Narebski, bazaar-ng, git

Hi,

On Wed, 29 Nov 2006, Marko Macek wrote:

> Linus Torvalds wrote:
> > So most of the time, when you use git, you can ignore the index. It's 
> > really important, and it's used _all_ the time, but you can still 
> > mostly ignore it. But when handling a merge conflict, the index is 
> > really what sets git apart, and what really helps a LOT.
> 
> Actually, people (at least me) dislike the index because in the most 
> common operations (status, diff, commit), they have to know that the 
> command doesn't actually display all their work but just the 'indexed' 
> part of it.

No. It does display all your work.

However, as Linus pointed out, if there are automatically merged entries 
without conflicts, it will not display them. Which is sane!

And yes, you can hide some modifications by putting the modified file into 
the index. But then you did that very much on purpose.

> For people used to cvs, svn and other systems it would be nicer if diff 
> -a and commit -a (and possibly other commands) were the default.

And what exactly do you think is happening when "cvs add" and "svn add" 
did _not_ really add the file to the repository, but only a subsequent 
"commit" does?

> index is of course necessary during merging, ... and as a speed 
> optimization for applying patches when you know the working copy is 
> clean.

I think that it is one major achievement of git to make clear and sane 
definitions of branches (which are really just pointers 
into the revision graph), and the index (which is the staging area).

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 10:49                                                                                                 ` Johannes Schindelin
  2006-11-29 11:01                                                                                                   ` Jakub Narebski
@ 2006-11-29 20:37                                                                                                   ` Jon Loeliger
  1 sibling, 0 replies; 806+ messages in thread
From: Jon Loeliger @ 2006-11-29 20:37 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Nicholas Allen, Linus Torvalds, Jakub Narebski, bazaar-ng, Git List

On Wed, 2006-11-29 at 04:49, Johannes Schindelin wrote:

> As for your scenario: There really should be a "what to do when my merge 
> screwed up?" document.

I have a few example scenarios and some notes on
cleaning up after failed merges in my slides from
the presentation I did at OLS last summer.

Feel free to look at it off of www.jdl.com!

jdl


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 18:54                                                                                                 ` Marko Macek
  2006-11-29 20:07                                                                                                   ` Johannes Schindelin
@ 2006-11-29 20:45                                                                                                   ` Linus Torvalds
  2006-11-30  0:05                                                                                                     ` Carl Worth
  2006-11-30 12:25                                                                                                   ` Andreas Ericsson
  2 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-29 20:45 UTC (permalink / raw)
  To: Marko Macek; +Cc: git, bazaar-ng



On Wed, 29 Nov 2006, Marko Macek wrote:
> 
> Actually, people (at least me) dislike the index because in the most common
> operations (status, diff, commit), they have to know that the command doesn't
> actually display all their work but just the 'indexed' part of it. 

I don't see your point, really.

Nothing forces you to change the index. None of the normal operations do 
that, for example, and you really have to _explicitly_ ask git to update 
the index for you.

So you can really think of it as a better list of names than what CVS and 
others maintain for you. It's exactly the same as the CVS "Entries" file, 
except it's got capabilities that CVS will never have - tracking not just 
the filename, but the merge status, the permissions, and the actual 
contents of an entry.

And by default, and in the absence of any failed merges, you will _never_ 
see any of those extra capabilities.

> For people used to cvs, svn and other systems it would be nicer if diff -a
> and commit -a (and possibly other commands) were the default.

Why? I mean really.. Why do people mind the index? If you've not done 
anything to explicitly update it, and you just write "git commit", it will 
tell you exactly which files are dirty, which files are untracked, and 
then say "nothing to commit".

Maybe we shouldn't even say "use git-update-index to mark for commit", we 
should just say "use 'git commit -a' to mark for commit", but the point 
is, there really is no downside. So you forget to mention which files to 
commit, what's the downside really? It tells you what is up, and you can 
just mention the files explicitly, or use "-a" to say "ok, commit 
everything that is dirty", and it doesn't really get any simpler than 
that.
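
The sequence described above can be sketched as follows, assuming a
present-day git (whose message wording differs from the 2006 text quoted in
this thread) and a scratch repository with one illustrative file, file.txt.
A plain "git commit" refuses to commit when nothing has been staged, and
"git commit -a" then commits everything that is dirty:

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

printf 'v1\n' > file.txt
git add file.txt
git commit -q -m "initial"

# Edit a tracked file without touching the index.
printf 'v2\n' > file.txt

# Plain commit: the index still matches HEAD, so git refuses.
if git commit -q -m "plain commit" >/dev/null 2>&1; then
    plain=committed
else
    plain=refused
fi

# -a stages every modified tracked file first, then commits.
git commit -q -a -m "commit with -a"
count=$(git rev-list --count HEAD)
echo "plain commit: $plain, commits so far: $count"
```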

And the ADVANTAGES of the index are legion. You may not appreciate them 
initially, but the disadvantages people talk about really don't exist in 
real life, and once you actually start doing merges with conflicts, and 
fix things up one file at a time (and perhaps take a break and do 
something else before you come back to the rest of the conflicts), the 
index saves your sorry ass, and is a _huge_ advantage.

Similarly, it _allows_ you to do things that just a list of files never 
allows you to. You don't _have_ to use it to mark individual files as 
being ready to be committed, but you _can_. It's nothing that you need to 
know or worry about if you're not aware of the index, but it's a 
capability that is there for when you're willing to go there.

So there really isn't any true disadvantage. Most of the people who are 
afraid of the index have probably never actually used it, and have never 
even had a _reason_ to use it. They're nervous just because they know it 
exists, and don't know what it does.  But you can just ignore it.

So get over your fears, and just ignore it, and things will be fine.

		Linus


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 20:07                                                                                                   ` Johannes Schindelin
@ 2006-11-29 20:49                                                                                                     ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-29 20:49 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Marko Macek, Linus Torvalds, Nicholas Allen, bazaar-ng, git

Johannes Schindelin wrote:
> 
> On Wed, 29 Nov 2006, Marko Macek wrote:
>>
>> index is of course necessary during merging, ... and as a speed 
>> optimization for applying patches when you know the working copy is 
>> clean.
> 
> I think that it is one major achievement of git to make clear and sane 
> definitions of branches (which are really just pointers 
> into the revision graph), and the index (which is the staging area).

Something resembling index is needed anyway: 1) for "commit all changed
files" to prepare list of files to commit, excluding ignored files,
2) to mark files as "to be added" or "to be removed" (well, git index
could be a little bit smarter here in marking "intent to add"), 3) as
a place for doing the merging. Git just doesn't hide it.

I agree that git's definition of branches, and git not hiding the index, are
its advantages... and a disadvantage to those who learned version
control on another SCM.
-- 
Jakub Narebski

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 20:45                                                                                                   ` Linus Torvalds
@ 2006-11-30  0:05                                                                                                     ` Carl Worth
  2006-11-30  0:08                                                                                                       ` Carl Worth
                                                                                                                         ` (2 more replies)
  0 siblings, 3 replies; 806+ messages in thread
From: Carl Worth @ 2006-11-30  0:05 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Marko Macek, git, bazaar-ng

On Wed, 29 Nov 2006 12:45:18 -0800 (PST), Linus Torvalds wrote:
>
> Nothing forces you to change the index. None of the normal operations do
> that, for example, and you really have to _explicitly_ ask git to update
> the index for you.

Yes, this is goog.

> Why? I mean really.. Why do people mind the index? If you've not done
> anything to explicitly update it, and you just write "git commit", it will
> tell you exactly which files are dirty, which files are untracked, and
> then say "nothing to commit".

To start with, that message confuses a lot of new users. "What do you
mean there's nothing to commit? I just made changes. And I know you
noticed them because you just mentioned the names of the files with
the changes to me!".

So at the very least, there's some missing guidance as to how to get
from the "nothing to commit" stage to actually commit the files the
user was trying to commit when they typed "git commit" in the first
place.

> Maybe we shouldn't even say "use git-update-index to mark for commit", we
> should just say "use 'git commit -a' to mark for commit",

Yes, I submitted a patch for this. I don't think Junio picked it up
because it got him thinking about all the other situations where "git
status" doesn't give as much guidance as it should

Even with that, the user has to go through the process of:

	git commit
	"hmm... why didn't that work"
	read message
	git commit -a

That's not a _huge_ problem, but it is a little road-bump that a lot of
people meet on their first attempt at git. In the thread on the fedora
mailing list that prompted my first "user-interface warts" and the
patch I mentioned above, the process was worse:

	git commit
	"hmm... why didn't that work"
	read message
	git update-index
	git commit
	"crap... it still didn't work even when I did what it told me to do"

Here's the original version of that report:

https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00141.html

> And the ADVANTAGES of the index are legion. You may not appreciate them
> initially, but the disadvantages people talk about really don't exist in
> real life, and once you actually start doing merges with conflicts, and
> fix things up one file at a time (and perhaps take a break and do
> something else before you come back to the rest of the conflicts), the
> index saves your sorry ass, and is a _huge_ advantage.

In none of these recent threads have I been arguing disadvantages of
the index. I'm really just trying to remove one small hurdle that
does trip up new users, (see above). I'm not trying to introduce any
large conceptual change into how git works, nor even what experienced
users do.

> So get over your fears, and just ignore it, and things will be fine.

Let's help people do exactly that by making the behavior of "git
commit -a" be the default for "git commit".

-Carl

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  0:05                                                                                                     ` Carl Worth
@ 2006-11-30  0:08                                                                                                       ` Carl Worth
  2006-11-30  0:30                                                                                                       ` Jakub Narebski
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
  2 siblings, 0 replies; 806+ messages in thread
From: Carl Worth @ 2006-11-30  0:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: bazaar-ng, Marko Macek, git

On Wed, 29 Nov 2006 16:05:16 -0800, Carl Worth wrote:
> On Wed, 29 Nov 2006 12:45:18 -0800 (PST), Linus Torvalds wrote:
> >
> > Nothing forces you to change the index. None of the normal operations do
> > that, for example, and you really have to _explicitly_ ask git to update
> > the index for you.
>
> Yes, this is goog.

I meant "good" there for anyone confused, (I'm not sure how that
slipped past my spell-checker).

-Carl

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  0:05                                                                                                     ` Carl Worth
  2006-11-30  0:08                                                                                                       ` Carl Worth
@ 2006-11-30  0:30                                                                                                       ` Jakub Narebski
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
  2 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-30  0:30 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Carl Worth wrote:

> In the thread on the fedora
> mailing list that prompted my first "user-interface warts" and the
> patch I mentioned above, the process was worse:
> 
>         git commit
>         "hmm... why didn't that work"
>         read message
>         git update-index
>         git commit
>         "crap... it still didn't work even when I did what it told me to do"
> 
> Here's the original version of that report:
> 
> https://www.redhat.com/archives/fedora-maintainers/2006-November/msg00141.html

From the SYNOPSIS of git-update-index(1) one can see that git-update-index
needs files to act on.

But I agree that git is not very user friendly, and has some usability
warts.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  0:05                                                                                                     ` Carl Worth
  2006-11-30  0:08                                                                                                       ` Carl Worth
  2006-11-30  0:30                                                                                                       ` Jakub Narebski
@ 2006-11-30  6:59                                                                                                       ` Raimund Bauer
  2006-11-30  7:17                                                                                                         ` Carl Worth
                                                                                                                           ` (4 more replies)
  2 siblings, 5 replies; 806+ messages in thread
From: Raimund Bauer @ 2006-11-30  6:59 UTC (permalink / raw)
  To: Carl Worth; +Cc: git, bazaar-ng

* Carl Worth wrote, On 30.11.2006 01:05:
> Let's help people do exactly that by making the behavior of "git
> commit -a" be the default for "git commit".
>   
Maybe we could do that _only_ if the index matches HEAD, and otherwise 
keep current behavior?
So people who don't care about the index won't get tripped up, and when 
you do have a dirty index, you get told about it?
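
A rough sketch of that proposal, assuming a present-day git. The function
name smart_commit is hypothetical, not an existing git command, and the
sketch uses "git add" where this thread says "git update-index". It checks
whether the index matches HEAD with "git diff --cached --quiet" and only
then behaves like "git commit -a":

```shell
set -e

# Hypothetical wrapper: commit everything when nothing is staged,
# otherwise commit only what was staged explicitly.
# (Assumes HEAD already exists; the very first commit is not handled.)
smart_commit() {
    if git diff --cached --quiet; then
        git commit -q -a "$@"   # index matches HEAD: act like "commit -a"
    else
        git commit -q "$@"      # dirty index: respect the staged content
    fi
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config commit.gpgsign false

printf 'a1\n' > a.txt
printf 'b1\n' > b.txt
git add a.txt b.txt
git commit -q -m "initial"

# Case 1: nothing staged, so both edited files are committed.
printf 'a2\n' > a.txt
printf 'b2\n' > b.txt
smart_commit -m "everything"
first=$(git diff --name-only HEAD~1 HEAD | xargs)

# Case 2: a.txt staged explicitly, so b.txt is left alone.
printf 'a3\n' > a.txt
printf 'b3\n' > b.txt
git add a.txt
smart_commit -m "staged only"
second=$(git diff --name-only HEAD~1 HEAD | xargs)
left=$(git diff --name-only)
echo "first: $first; second: $second; still dirty: $left"
```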
> -Carl
-- 

best regards

  Ray



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
@ 2006-11-30  7:17                                                                                                         ` Carl Worth
  2006-11-30  8:31                                                                                                         ` Alan Chandler
                                                                                                                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 806+ messages in thread
From: Carl Worth @ 2006-11-30  7:17 UTC (permalink / raw)
  To: Raimund Bauer; +Cc: git

On Thu, 30 Nov 2006 07:59:19 +0100, Raimund Bauer wrote:
> * Carl Worth wrote, On 30.11.2006 01:05:
> > Let's help people do exactly that by making the behavior of "git
> > commit -a" be the default for "git commit".
> >
> Maybe we could do that _only_ if the index matches HEAD, and otherwise
> keep current behavior?
> So people who don't care about the index won't get tripped up, and when
> you do have a dirty index, you get told about it?

I thought of that tonight and almost suggested it myself. It would be
an attempt to satisfy both "sides" of the debate without either side
having to fight with a default they didn't like or configure it away.

I did wonder if the powers that be would find it a bit too magic, (the
problem with magic things is that they can sometimes be quite
confusing when they don't do exactly what you want).

But this might just work. It wouldn't be too bad to document, (we
already have several commands that change slightly if the index
doesn't match, (often by just refusing to do anything in a dirty
tree)).

And, significantly this would allow for documenting the simple
sequence of:

	# edit file
	git commit

in the tutorial while also allowing what Junio wanted:

	git update-index file
	git commit

with the behavior of, ("I already said I wanted to do a staged commit
when I explicitly updated the index, so don't make me say anything
special again when I go to commit").

Can we really get the best of both worlds here?

-Carl

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
  2006-11-30  7:17                                                                                                         ` Carl Worth
@ 2006-11-30  8:31                                                                                                         ` Alan Chandler
  2006-11-30  9:01                                                                                                         ` Nguyen Thai Ngoc Duy
                                                                                                                           ` (2 subsequent siblings)
  4 siblings, 0 replies; 806+ messages in thread
From: Alan Chandler @ 2006-11-30  8:31 UTC (permalink / raw)
  To: git

On Thursday 30 November 2006 06:59, Raimund Bauer wrote:
> Maybe we could do that _only_ if the index matches HEAD, and otherwise
> keep current behavior?
> So people who don't care about the index won't get tripped up, and when
> you do have a dirty index, you get told about it?
>

I have been (silently) following the git commit discussion and started out
fully on the side of git commit -a being the default, but was slowly moving
over towards the git commit -i being the default camp.

This post seems like a Eureka moment - chew over the problem long enough and 
someone comes in from left field with an off the wall remark that suddenly 
clarifies everything.



-- 
Alan Chandler

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
  2006-11-30  7:17                                                                                                         ` Carl Worth
  2006-11-30  8:31                                                                                                         ` Alan Chandler
@ 2006-11-30  9:01                                                                                                         ` Nguyen Thai Ngoc Duy
  2006-11-30  9:30                                                                                                           ` Alan Chandler
  2006-11-30 10:19                                                                                                         ` Johannes Schindelin
  2006-11-30 12:45                                                                                                         ` Andreas Ericsson
  4 siblings, 1 reply; 806+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-11-30  9:01 UTC (permalink / raw)
  To: Raimund Bauer; +Cc: Carl Worth, git

On 11/30/06, Raimund Bauer <ray007@gmx.net> wrote:
> * Carl Worth wrote, On 30.11.2006 01:05:
> > Let's help people do exactly that by making the behavior of "git
> > commit -a" be the default for "git commit".
> >
> Maybe we could do that _only_ if the index matches HEAD, and otherwise
> keep current behavior?

I hate the if clause. Suppose I prefer the update-index way: I would have
to check whether HEAD matches the index every time I do a commit to make
sure it won't go the other way.
Either -a or -i should be the default; no "if", please.

By the way, I do use the update-index way, but I vote for -a by default. I
don't mind adding "-i" to every commit command.
-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  9:01                                                                                                         ` Nguyen Thai Ngoc Duy
@ 2006-11-30  9:30                                                                                                           ` Alan Chandler
  2006-11-30  9:35                                                                                                             ` Jakub Narebski
  2006-11-30  9:39                                                                                                             ` Steven Grimm
  0 siblings, 2 replies; 806+ messages in thread
From: Alan Chandler @ 2006-11-30  9:30 UTC (permalink / raw)
  To: git

On Thursday 30 November 2006 09:01, Nguyen Thai Ngoc Duy wrote:
> On 11/30/06, Raimund Bauer <ray007@gmx.net> wrote:
> > * Carl Worth wrote, On 30.11.2006 01:05:
> > > Let's help people do exactly that by making the behavior of "git
> > > commit -a" be the default for "git commit".
> >
> > Maybe we could do that _only_ if the index matches HEAD, and otherwise
> > keep current behavior?
>
> I hate the if clause. Suppose I prefer update-index way, I would have
> to check whether HEAD matches index everytime I do a commit to make
> sure it won't do the other way.

No you won't.   

If you don't use update-index, then the index will match HEAD and you will commit 
changes in the working tree.  That is the way for newbies.

As soon as you do the first update-index the index will no longer match HEAD, 
so commit will do the same as it does now.

And if you are not sure which you have done then presumably you do what you do 
now, or git commit -a or git commit -i as you need.
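
To make the proposed rule concrete, here is a minimal sketch in Python. It is a toy model under stated assumptions, not git's code: HEAD, the index and the working tree are plain dicts mapping a path to its content, and the function name proposed_commit is made up for illustration.

```python
def proposed_commit(head, index, worktree):
    """Toy model of the proposed default for a bare "git commit".

    head, index, worktree: dicts mapping path -> content (an assumption
    made for this sketch; git stores trees and blobs, not dicts).
    Returns (mode, tree_to_commit).
    """
    if index == head:
        # Nothing has been staged, so act like "git commit -a":
        # pick up modifications and deletions of tracked files only.
        tree = {path: worktree[path] for path in index if path in worktree}
        return "all", tree
    # Something has been staged, so keep the current behaviour:
    # commit exactly what is in the index.
    return "index", dict(index)


head = {"a.c": "v1", "b.c": "v1"}

# A user who never touches update-index: the index still equals HEAD.
print(proposed_commit(head, dict(head), {"a.c": "v2", "b.c": "v1", "new.c": "x"}))

# A user who staged b.c by hand: only the staged change is committed.
print(proposed_commit(head, {"a.c": "v1", "b.c": "v2"}, {"a.c": "v2", "b.c": "v2"}))
```

The untracked file new.c is left out in the first case, as it would be with "git commit -a".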

-- 
Alan Chandler

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  9:30                                                                                                           ` Alan Chandler
@ 2006-11-30  9:35                                                                                                             ` Jakub Narebski
  2006-11-30 10:01                                                                                                               ` Junio C Hamano
  2006-11-30  9:39                                                                                                             ` Steven Grimm
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-30  9:35 UTC (permalink / raw)
  To: git

Alan Chandler wrote:

> And if you are not sure which you have done then presumably you do what you do 
> now, or git commit -a or git commit -i as you need.

By the way, short option -i is not --index but --include (i.e. commit
both changes in index and files mentioned on command line). Perhaps -I?
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  9:30                                                                                                           ` Alan Chandler
  2006-11-30  9:35                                                                                                             ` Jakub Narebski
@ 2006-11-30  9:39                                                                                                             ` Steven Grimm
  1 sibling, 0 replies; 806+ messages in thread
From: Steven Grimm @ 2006-11-30  9:39 UTC (permalink / raw)
  To: Alan Chandler; +Cc: git

Alan Chandler wrote:
> No you won't.   
>
> If you don't use update-index, then index will match HEAD and you will commit 
> changes in the working tree.  That is the way for newbies
>
> As soon as you do the first update-index the index will no longer match HEAD, 
> so commit will do the same as it does now.
>
> And if you are not sure which you have done then presumably you do what you do 
> now, or git commit -a or git commit -i as you need.

Plus, one assumes, the git-generated comments in the commit message will 
tell you what kind of commit it has decided to do.

I like this suggestion a lot. Thinking back over my git usage recently, 
which has included both styles of commits (though mostly -a ones), I 
think this would have done the right thing by default in every case.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  9:35                                                                                                             ` Jakub Narebski
@ 2006-11-30 10:01                                                                                                               ` Junio C Hamano
  2006-11-30 22:45                                                                                                                 ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-11-30 10:01 UTC (permalink / raw)
  To: jnareb; +Cc: Alan Chandler, git

Jakub Narebski <jnareb@gmail.com> writes:

> By the way, short option -i is not --index but --include (i.e. commit
> both changes in index and files mentioned on command line). Perhaps -I?

Just in case nobody noticed after looking at the first part of
my patch a few nights ago, --include happens to mean exactly
what --index would mean anyway.

By default, "git commit" without parameter does "make a commit
out of the index".  With paths, it used to mean "oh, by the way,
I forgot to run update-index on these paths, so could you please
do that for me now before you make the commit", and has been
that way for a long time.

Much later, people from CVS background wanted to say "edit foo
bar; git update-index bar; git commit foo" to mean "I might have
done something to the index, but I do not want to care about it
now -- please make a commit that includes only the changes to
foo and I do not want the changes to bar included in the
commit".  Somehow we ended up introducing that twisted semantics
and that was where --only came from, which unfortunately later
became the default (and I already said that I realize this was a
big mistake).

While we transitioned to switch the default, we first came up
with a name to ask for the traditional semantics (--include),
warned people who gave paths without either -i or -o that the
--include semantics was still the default but would change soon
(which meant that -i was a no-op back then), then switched the
default, and we now warn that the default is -o (so now -o is
a no-op) when people give only paths with neither -i nor -o.

Currently (that is, without the first part of my two patches),
"git commit -i" and "git commit -o" without paths refuse to
work, saying that these modes of operation do not make sense
without any path.

However, you can think of the simplest "commit the current
index" semantics as a degenerated case of saying "oh, by the
way, please run update-index on these paths I forgot to do
earlier before you make the commit" and giving no paths.

So "git commit -i" without paths _could_ mean "commit the index
as is" very naturally without introducing an independent switch
with different name.  That is what the first part of my two
patches to Carl does.
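
The difference between the two modes can be sketched in a few lines of Python. This is only a toy model of which content ends up in the commit, assuming HEAD, the index and the working tree are dicts from path to content; the function names are hypothetical and nothing here is git's actual implementation.

```python
def _take_from_worktree(tree, worktree, paths):
    """Copy the named paths from the working tree into tree (in place)."""
    for path in paths:
        if path in worktree:
            tree[path] = worktree[path]
        else:
            tree.pop(path, None)  # the file was removed from the working tree
    return tree


def commit_include(index, worktree, paths):
    """-i / --include: run update-index on paths, then commit the index."""
    return _take_from_worktree(dict(index), worktree, paths)


def commit_only(head, worktree, paths):
    """-o / --only: start from HEAD and take just the named paths."""
    return _take_from_worktree(dict(head), worktree, paths)


head = {"foo": "foo v1", "bar": "bar v1"}
worktree = {"foo": "foo v2", "bar": "bar v2"}  # edit foo bar
index = {"foo": "foo v1", "bar": "bar v2"}     # git update-index bar

print(commit_include(index, worktree, ["foo"]))  # both foo v2 and bar v2
print(commit_only(head, worktree, ["foo"]))      # foo v2, but bar stays at v1
print(commit_include(index, worktree, []) == index)  # no paths: the index as is
```

The last line is the degenerate case described above: with an empty path list, the --include model simply commits the current index.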



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
                                                                                                                           ` (2 preceding siblings ...)
  2006-11-30  9:01                                                                                                         ` Nguyen Thai Ngoc Duy
@ 2006-11-30 10:19                                                                                                         ` Johannes Schindelin
  2006-11-30 11:25                                                                                                           ` Nguyen Thai Ngoc Duy
  2006-11-30 12:45                                                                                                         ` Andreas Ericsson
  4 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 10:19 UTC (permalink / raw)
  To: Raimund Bauer; +Cc: Carl Worth, git, bazaar-ng

Hi,

On Thu, 30 Nov 2006, Raimund Bauer wrote:

> * Carl Worth wrote, On 30.11.2006 01:05:
> > Let's help people do exactly that by making the behavior of "git
> > commit -a" be the default for "git commit".
> >   
> Maybe we could do that _only_ if the index matches HEAD, and otherwise keep
> current behavior?
> So people who don't care about the index won't get tripped up, and when you do
> have a dirty index, you get told about it?

So many people spoke for it, it's time I crash the wedding.

From a usability viewpoint, it is a horrible convention. The user has to 
remember too much of the side effects to handle the commit operation. 
The function of the program would no longer be dependent on the command 
line arguments and your config, but _also_ on something as volatile as 
the index.

You would literally end up asking "did I change the index?" _every time_ 
before you commit.

And remember, even a simple "git add" changes the index! (Why it does is 
brutally clear once you grasp the concept of the staging area.)

Worse, doing a "git commit --amend" should _not_ automatically add "-a" 
_even_ if the index matches the HEAD, since it is quite possible that you 
had a typo in the message you want to fix up. And quite possibly other 
options would not want that either.

But here's an idea: tell the user that she has to tell git-commit which 
files she wants committed. Yes! That's it. Just tell it the friggin' 
files. And if you are a lazy bum, and want to commit _all_ modified 
files, git has a nice shortcut for ya: "-a".

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 10:19                                                                                                         ` Johannes Schindelin
@ 2006-11-30 11:25                                                                                                           ` Nguyen Thai Ngoc Duy
  2006-11-30 11:58                                                                                                             ` Jakub Narebski
  2006-11-30 12:23                                                                                                             ` Johannes Schindelin
  0 siblings, 2 replies; 806+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-11-30 11:25 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git

On 11/30/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> But here's an idea: tell the user that she has to tell git-commit which
> files she wants committed. Yes! That's it. Just tell it the friggin'
> files. And if you are a lazy bum, and want to commit _all_ modified
> files, git has a nice shortcut for ya: "-a".

It reminds me of the Microsoft Office Assistant :-) Let's make a "git
assistant mode" that tries hard to guess the user's desires and gives them
guidance. Once they get used to git, they can disable that mode and go back
to "plain git".
-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 11:25                                                                                                           ` Nguyen Thai Ngoc Duy
@ 2006-11-30 11:58                                                                                                             ` Jakub Narebski
  2006-11-30 12:14                                                                                                               ` Nguyen Thai Ngoc Duy
  2006-11-30 12:23                                                                                                             ` Johannes Schindelin
  1 sibling, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-30 11:58 UTC (permalink / raw)
  To: git

Nguyen Thai Ngoc Duy wrote:

> On 11/30/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
>> But here's an idea: tell the user that she has to tell git-commit which
>> files she wants committed. Yes! That's it. Just tell it the friggin'
>> files. And if you are a lazy bum, and want to commit _all_ modified
>> files, git has a nice shortcut for ya: "-a".
> 
> It reminds me Microsoft Office Assistant :-) Let's make "git assistant
> mode" that tries hard to guess user's desires and give them guidance.
> Once they get used to git, they can disable that mode and back to
> "plain git".

The 'givor' (pun on Vi 'vigor') or 'gitor', or 'gator'.

$ git commit
[...]
nothing to commit
$ givor
$ git commit
Givor: You haven't marked any file for commit using "git-update-index <file>"
Givor: and you didn't provide files to commit with "git commit <file>"
Givor: so I assume that you wanted to commit all changed files
Givor: You can use "git commit -a" for that (-a is for --all)

;-)
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 11:58                                                                                                             ` Jakub Narebski
@ 2006-11-30 12:14                                                                                                               ` Nguyen Thai Ngoc Duy
  0 siblings, 0 replies; 806+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2006-11-30 12:14 UTC (permalink / raw)
  To: git

On 11/30/06, Jakub Narebski <jnareb@gmail.com> wrote:
> Nguyen Thai Ngoc Duy wrote:
>
> > On 11/30/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> >> But here's an idea: tell the user that she has to tell git-commit which
> >> files she wants committed. Yes! That's it. Just tell it the friggin'
> >> files. And if you are a lazy bum, and want to commit _all_ modified
> >> files, git has a nice shortcut for ya: "-a".
> >
> > It reminds me Microsoft Office Assistant :-) Let's make "git assistant
> > mode" that tries hard to guess user's desires and give them guidance.
> > Once they get used to git, they can disable that mode and back to
> > "plain git".
>
> The 'givor' (pun on Vi 'vigor') or 'gitor', or 'gator'.
>
> $ git commit
> [...]
> nothing to commit
> $ givor
> $ git commit
> Givor: You haven't marked any file for commit using "git-update-index <file>"
> Givor: and you didn't provide files to commit with "git commit <file>"
> Givor: so I assume that you wanted to commit all changed files
> Givor: You can use "git commit -a" for that (-a is for --all)

I am serious about that. I haven't thought of it as an independent
command/program though. Can you implement givor exactly like the above
example?

> ;-)
Okay, now the joke part. This command name is better :-D

$ git commit
[...]
nothing to commit
$ dammit
$ git commit
Givor: You haven't marked any file for commit using "git-update-index <file>"
Givor: and you didn't provide files to commit with "git commit <file>"
Givor: so I assume that you wanted to commit all changed files
Givor: You can use "git commit -a" for that (-a is for --all)

-- 

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 11:25                                                                                                           ` Nguyen Thai Ngoc Duy
  2006-11-30 11:58                                                                                                             ` Jakub Narebski
@ 2006-11-30 12:23                                                                                                             ` Johannes Schindelin
  1 sibling, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 12:23 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy; +Cc: git

Hi,

On Thu, 30 Nov 2006, Nguyen Thai Ngoc Duy wrote:

> On 11/30/06, Johannes Schindelin <Johannes.Schindelin@gmx.de> wrote:
> > But here's an idea: tell the user that she has to tell git-commit which
> > files she wants committed. Yes! That's it. Just tell it the friggin'
> > files. And if you are a lazy bum, and want to commit _all_ modified
> > files, git has a nice shortcut for ya: "-a".
> 
> It reminds me Microsoft Office Assistant :-) Let's make "git assistant
> mode" that tries hard to guess user's desires and give them guidance.
> Once they get used to git, they can disable that mode and back to
> "plain git".

See git-gui from Shawn. It should really help new users with a graphical 
user interface.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-29 18:54                                                                                                 ` Marko Macek
  2006-11-29 20:07                                                                                                   ` Johannes Schindelin
  2006-11-29 20:45                                                                                                   ` Linus Torvalds
@ 2006-11-30 12:25                                                                                                   ` Andreas Ericsson
  2006-11-30 20:01                                                                                                     ` Theodore Tso
  2 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-11-30 12:25 UTC (permalink / raw)
  To: Marko Macek; +Cc: bazaar-ng, git

Marko Macek wrote:
> Linus Torvalds wrote:
>> So most of the time, when you use git, you can ignore the index. It's 
>> really important, and it's used _all_ the time, but you can still 
>> mostly ignore it. But when handling a merge conflict, the index is 
>> really what sets git apart, and what really helps a LOT.
> 
> Actually, people (at least me) dislike the index because in the most common
> operations (status, diff, commit), they have to know that the command 
> doesn't actually
> display all their work but just the 'indexed' part of it.
> For people used to cvs, svn and other systems it would be nicer if diff -a
> and commit -a (and possibly other commands) were the default.
> 

Unless you do "git update-index" (and thus are already using the index) 
on any files, "git diff" shows you exactly the changes between your last 
commit and the working tree. There's nothing magic, odd or confusing 
about it, no matter which scm you come from.



^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
                                                                                         ` (4 preceding siblings ...)
  2006-11-28 12:10                                                                       ` git and bzr Erik Bågfors
@ 2006-11-30 12:36                                                                       ` Nicholas Allen
  2006-11-30 12:47                                                                         ` Johannes Schindelin
  2006-11-30 16:45                                                                         ` Linus Torvalds
  5 siblings, 2 replies; 806+ messages in thread
From: Nicholas Allen @ 2006-11-30 12:36 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: bazaar-ng, git

I also have a basic question about git regarding its content tracking 
and merging.

Does this mean if I have, for example, a large C++ file with a bunch of 
methods in it and I move one of the methods from the bottom of the file 
to the top and in another branch someone makes a change to that method 
that when I merge their changes git will merge their changes into the 
method at the top of the file where I have moved it?

If so that would be really quite impressive!

Cheers,

Nick

Joseph Wakeling wrote:
> Hello all,
>
> Following the very interesting debate about the differences between bzr
> and git, I thought it was about time I tried to learn properly about git
> and how to use it.  I've been using bzr for a good while now, although
> since I'm not a serious developer I only use it for simple purposes,
> keeping track of code I write on my own for academic projects.
>
> So, a few questions about differences I don't understand...
>
> First off a really dumb one: how do I identify myself to git, i.e. give
> it a name and email address?  Currently it uses my system identity,
> My Name <username@computer.(none)>.  I haven't found any equivalent of
> the bzr whoami command.
>
> Now to more serious business.  One of the main operational differences I
> see as a new user is that bzr defaults to setting up branches in
> different locations, whereas git by default creates a repository where
> branches are different versions of the directory contents and switching
> branches *changes* the directory contents.  bzr branch seems to be
> closer to git-clone than git-branch (N.B. I have never used bzr repos so
> might not be making a fair comparison).
>
> With this in mind, is there any significance to the "master" branch (is
> it intended e.g. to indicate a git repository's "stable" version
> according to the owner?), or is this just a convenient default name?
> Could I delete or rename it?  Using bzr I would normally give the
> central branch(*) the name of the project.
>
> (* Central or main on my own system.  Not intended to be central in the
> sense of a CVS-style version control setup:-)
>
> Any other useful comments that can be made to a bzr user about working
> with this difference, positive or negative aspects of it?
>
> Next question ... one of the reasons I started seriously thinking about
> git was that in the VCS comparison discussion, it was noted that git is
> a lot more flexible than bzr in terms of how it can track data (e.g. the
> git pickaxe command, although I understand that's not in the released
> version [1.4.4.1] yet?).  A frustration with bzr is that pulling or
> merging patches from another branch or repo requires them to share the
> same HEAD.  Is this a requirement in git or can I say, "Hey, I like that
> particular function in project XXX, I'm going to pull that individual
> bit of code and its development history into project YYY"?
>
> Last off (for now, I'm sure I'll think of more): is there any easy (or
> difficult) way to effectively import version history from a bzr
> repository, and vice versa?
>
> Thanks in advance for any comments,
>
>     -- Joe
>
>
>   





^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30  6:59                                                                                                       ` Raimund Bauer
                                                                                                                           ` (3 preceding siblings ...)
  2006-11-30 10:19                                                                                                         ` Johannes Schindelin
@ 2006-11-30 12:45                                                                                                         ` Andreas Ericsson
  4 siblings, 0 replies; 806+ messages in thread
From: Andreas Ericsson @ 2006-11-30 12:45 UTC (permalink / raw)
  To: Raimund Bauer; +Cc: Carl Worth, git, bazaar-ng

Raimund Bauer wrote:
> * Carl Worth wrote, On 30.11.2006 01:05:
>> Let's help people do exactly that by making the behavior of "git
>> commit -a" be the default for "git commit".
>>   
> Maybe we could do that _only_ if the index matches HEAD, and otherwise 
> keep current behavior?
> So people who don't care about the index won't get tripped up, and when 
> you do have a dirty index, you get told about it?

Sounds sane. Especially if we couple it with a hint for the user to use 
"commit -a" when he/she wants to do blanket commits.

So in essence that would mean:
If no pathspecs are given and the index matches the current HEAD, print out
"Nothing to commit but changes in working tree. Assuming 'git commit -a'."

and then act accordingly. Carl, do you think that would satisfy the 
desires of your RedHat peers? Always doing '-a' by default is terribly 
wrong for those of us who actually use partial commits a lot, and it 
would also rob git of a lot of its power.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 12:36                                                                       ` Nicholas Allen
@ 2006-11-30 12:47                                                                         ` Johannes Schindelin
  2006-11-30 16:45                                                                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 12:47 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Joseph Wakeling, git, bazaar-ng

Hi,

On Thu, 30 Nov 2006, Nicholas Allen wrote:

> Does this mean if I have, for example, a large C++ file with a bunch of 
> methods in it and I move one of the methods from the bottom of the file 
> to the top and in another branch someone makes a change to that method 
> that when I merge their changes git will merge their changes into the 
> method at the top of the file where I have moved it?

As of now, no, it does not. This is a shortcoming of RCS merge, which does 
the heavy lifting.

Having said that, stay tuned for new developments: the functionality of 
merge is being integrated in git. This opens the door to make use of the 
code tracking support in git, to do exactly what you just proposed.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 12:36                                                                       ` Nicholas Allen
  2006-11-30 12:47                                                                         ` Johannes Schindelin
@ 2006-11-30 16:45                                                                         ` Linus Torvalds
  1 sibling, 0 replies; 806+ messages in thread
From: Linus Torvalds @ 2006-11-30 16:45 UTC (permalink / raw)
  To: Nicholas Allen; +Cc: Joseph Wakeling, git, bazaar-ng



On Thu, 30 Nov 2006, Nicholas Allen wrote:
>
> Does this mean if I have, for example, a large C++ file with a bunch of
> methods in it and I move one of the methods from the bottom of the file to the
> top and in another branch someone makes a change to that method that when I
> merge their changes git will merge their changes into the method at the top of
> the file where I have moved it?

Right now (and in the near future), nope. "git blame" will track the 
changes (so the pure movement wasn't just an addition of new code, but 
you'll see it track it all the way down to the original), but "git merge" 
is still file-based.

In other words, "git merge" uses a data similarity analysis that 
could be used for smaller chunks than a whole file, but at least for now 
it does it on a file granularity only (and then passes it off to the 
standard RCS three-way merge on a file-by-file basis).
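
As a rough illustration of what "file granularity" means here, the following Python sketch does a three-way merge where each file is an opaque unit. It is an assumption-laden toy: trees are dicts from path to content, merge_trees is a made-up name, and a path that both sides changed differently is only reported, where a real tool would hand it to a per-file content merge.

```python
def merge_trees(base, ours, theirs):
    """Three-way merge of whole files.

    base, ours, theirs: dicts mapping path -> content (None means absent).
    Returns (merged_tree, paths_needing_a_content_merge).
    """
    merged, needs_content_merge = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:        # both sides agree (including "both deleted")
            result = o
        elif o == b:      # only their side changed the file
            result = t
        elif t == b:      # only our side changed the file
            result = o
        else:             # both changed it differently
            needs_content_merge.append(path)
            continue
        if result is not None:
            merged[path] = result
    return merged, needs_content_merge


base = {"a.c": "1", "b.c": "2", "c.c": "3", "d.c": "4"}
ours = {"a.c": "1 ours", "b.c": "2", "c.c": "3 ours"}          # d.c deleted
theirs = {"a.c": "1", "b.c": "2 theirs", "c.c": "3 theirs", "d.c": "4"}

print(merge_trees(base, ours, theirs))
```

Nothing in this model looks inside a file, which is why a function moved within one file and edited on the other side ends up in the "needs a content merge" list instead of being followed.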

That said, if the movement happens _within_ a file, then just about any 
SCM could do what you ask for, by just using something smarter than the 
standard 3-way merge. So that part isn't even about tracking data across 
files - it's just about a per-file merge strategy.

The "track data, not files" thing becomes more interesting when you factor 
out a file into two or more files, and can continue to merge across such a 
code re-filing event. Git can do it for "annotate", but doesn't do it for 
anything else.

> If so that would be really quite impressive!

Indeed, and it's one of the potential future goals that was discussed very 
early in the git design phase. The point of _not_ doing file ID tracking 
is exactly that you can actually do better than that by just tracking the 
data.

So some day, we may do it. And not just within one file, but even between 
files. Because file renames really are just a very specific special case of 
data movement, and I don't think it's even the most common case.

That said, there are several reasons why you might not actually _ever_ 
want it in practice, and why I say "potential future goal" and "we may do 
it". I think this is going to be a matter not just of writing the 
code (which we haven't done), but also of deciding if it's really worth it.

Because merges are things where you may not want too much smarts:

 - Quite often, a failed merge that needs manual fixup may even be 
   _preferable_ to a successful merge that did the merge "technically 
   correctly", but in an unexpected way.

 - There's a _big_ difference between "merging code" and "examining code". 
   It makes much more sense to try to track where code came from and what 
   the "deep history" was when you examine code, because the reason you're 
   doing so is generally exactly because you're looking for what went 
   wrong, and who to blame.

   When doing "merging", the history of the code is arguably a lot less 
   important. What is the most important part is that the two branches you 
   merge have been (hopefully) verified in their _current_ state. The 
   history may be full of bugs, and they may have been fixed differently, 
   and even trying to be really clever may not actually be a good idea at 
   all.

   Code may have moved or may have been copied, but what is much more 
   important than the original code and where it came from is the state it 
   was in _after_ the move, because that's the tested working state, and 
   in many ways the history of how it came to be really shouldn't matter 
   as much at all.

In other words, "annotate" and "merge" have almost entirely opposite 
interests. An annotation is supposed to find the history in order to maybe 
help find bugs, while a merge is supposed to use the _current_ state, and 
very arguably, if the two current states don't match _so_ obviously that 
there is no question about what you should do, then the merge should make 
that very very very clear to the user.

So my personal opinion has always been that a merge should be extremely 
simpleminded. I think all the VCS people who concentrate on smart merging 
absolutely have their heads up their arses, and do exactly the wrong 
thing. A merge should not do anything "clever" at all. It should be just 
_barely_ smart enough to do the obvious thing, and even then we all know 
that it will still occasionally do the wrong thing.

So I actually think that a bog-standard and totally stupid three-way merge 
is simply not far from the right thing to do. And the git "recursive" 
thing basically repeats that stupid merge (a) in time (ie the criss-cross 
merge thing causes a recursive three-way merge to take place) and (b) in 
the metadata space (ie you can see the rename following basically as just 
a "3-way merge in filenames").
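
As a rough illustration of how little the plain three-way merge needs to
know, here is a minimal sketch using `git merge-file`, assuming a git new
enough to ship that command. The file names (base, ours, theirs, theirs2)
and their contents are made up for the example. Changes that touch
different parts of the file combine cleanly, while two different edits to
the same line are simply reported as a conflict.

```shell
# Scratch directory; git merge-file works on plain files, no repository needed.
work=$(mktemp -d) && cd "$work"

# Common ancestor and two descendants that change different lines.
printf 'alpha\nbeta\ngamma\ndelta\nepsilon\n' > base
printf 'ALPHA\nbeta\ngamma\ndelta\nepsilon\n' > ours
printf 'alpha\nbeta\ngamma\ndelta\nEPSILON\n' > theirs

# -p prints the result instead of overwriting "ours".
git merge-file -p ours base theirs > merged
cat merged

# A second "theirs" that edits the same line as "ours" in a different way.
printf 'Alpha!\nbeta\ngamma\ndelta\nepsilon\n' > theirs2

# The exit status is non-zero when conflicts remain; the output then
# carries the usual conflict markers.
git merge-file -p ours base theirs2 > conflicted || echo "merge left conflicts"
cat conflicted
```

Nothing here looks at history: only the three file states matter, which is
the point being made about merges using the _current_ state.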

And yes, this is probably some mental deficiency and hang-up, but I think 
that's sufficient, and that where the real "clever" stuff should be is to 
then help people resolve conflicts (and maybe also help you find 
mis-merges even with the totally stupid and simple merge). Because 
conflicts _will_ happen, regardless of your merge strategy, and you do 
need people to look at them, but you can make it _easier_ for people to 
say "ok, that's obviously the right merge".

So me personally, I'd rather have the "real merge" be what git already 
does, and then have something like a graphical "resolution helper" 
application that tries to resolve the remaining things with user help. And 
that "resolution helper" is where I'd put all the magic code movement 
logic, not in the merge itself.

So you could look at a failed hunk, and press a "show me a best guess" 
button, and at that point the thing would say "that code might fit here, 
does that look sane to you? <Ok>, <Next guess>, <Cancel>".

THAT is what a good VCS should do, in my opinion. Not do "smart merges".

Btw, git doesn't do the above kind of smart graphical thing, but git 
_does_ do something very much in that direction. Unlike a lot of things, 
git doesn't just leave the "conflict marker" turds in the working tree. 
No, the index will contain the three-way merge base and both of the actual 
files you were trying to merge, and a "git diff" will actually show you a 
three-way diff of the working tree (and you can say "git diff --ours" to 
see the diff just against our old head, and "--theirs" to see a regular 
two-way diff against the _other_ side that you tried to merge).
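
The index behaviour described above can be tried out with a small
throwaway repository. This is only a sketch: the file name "greeting", the
branch name "other" and the commit messages are invented for the example,
and the exact output differs between git versions.

```shell
# Throwaway repository with one file that both branches will change.
demo=$(mktemp -d) && cd "$demo"
git init -q
git symbolic-ref HEAD refs/heads/master   # fix the branch name for the demo
git config user.name "Demo User"
git config user.email demo@example.com

echo "hello" > greeting
git add greeting
git commit -q -m "add greeting"

git checkout -q -b other
echo "hello from other" > greeting
git commit -q -a -m "other: change greeting"

git checkout -q master
echo "hello from master" > greeting
git commit -q -a -m "master: change greeting"

# The merge stops with a conflict, so do not treat its exit status as fatal.
git merge other || true

# The index now holds three stages for the path:
# 1 = merge base, 2 = our side, 3 = their side.
git ls-files -u

git diff            # combined three-way diff of the working tree
git diff --ours     # working tree against our old head
git diff --theirs   # working tree against the side being merged
```

The same three diffs keep working after the conflict markers in the
working tree have been edited, up to the point where the file is added
back to the index.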

So git already very much embodies this concept of "don't be overly smart 
when merging, but try to help the user out when resolving the merge". It 
may not be pretty GUI etc, and it mostly helps with regular bog-standard 
data conflicts, but boy is it pleasant to use for those once you get used 
to it.

So we get NONE of those horrible "you just get conflict turds, you figure 
it out" things. It gives you the turds (because people, including me, are 
used to them, and you want _something_ in the working tree that shows both 
versions at the same time, of course), but then you can edit them to your 
heart's content, and even _after_ you've edited them, you can do the above 
three-way (or two-way against either branch) diffs, and it will show what 
you edited and its relationship to the two branches you merged.

THAT is what merging is all about. Not smart merges. Stupid merges with 
good tools to help you do the right thing when the right thing isn't _so_ 
obvious that you can just leave it to the machine.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-29 16:39                                                                               ` Linus Torvalds
@ 2006-11-30 18:24                                                                                 ` Joseph Wakeling
  2006-11-30 18:44                                                                                   ` Linus Torvalds
  0 siblings, 1 reply; 806+ messages in thread
From: Joseph Wakeling @ 2006-11-30 18:24 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

Linus Torvalds wrote:
> So it's fixed now, and probably would never trigger except for the stupid 
> special case that was "let's just show an example of this" ;)

I'm very happy my stupidity could help. ;-)

On a related note ...

Nicholas Allen wrote:
> Thanks for the informative response. It helped but I'm still slightly
> confused by git - I think I need to play around with it a bit more to
> understand and get more familiar with the concepts...
>  
> Purely from an initial usage point of view though, for me at least, the
> bzr output needed no explanation which I think is indicative of a good
> user interface whereas the git was not so clear or obvious - there must
> be room for improvement in git's user friendliness here surely. But that
> might just be because I am clueless when it comes to the way git works
> and the concepts it uses ;-)

I do think that bzr has quite an intuitive set of commands, and it is
easy to learn, though at this point I don't feel git is really *that*
much more difficult in itself.  Although the terminal output for some
problems could be improved, most of my difficulties are stemming from
overlap of command names when the commands themselves do different
things, and the fact that git's documentation is somewhat more technical
than bzr's.

What would be nice would be to have in the documentation a whole bunch
of stupid examples for the main commands, something where someone can
create a repo from scratch, create and modify some simple files
according to instructions, and see the particular command in action.
The tutorials do this, of course, but only for a few cases, when to be
honest it's the more complex commands that most need such explanation.
For beginners, especially less technically skilled ones, it would be
good to have a lot more of, "Do this, here's what git will respond, this
is what it means, here's how to fix it...."

As a relatively non-technical user, perhaps I should keep track of my
difficulties (and others') and try to write something up.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-30 18:24                                                                                 ` Joseph Wakeling
@ 2006-11-30 18:44                                                                                   ` Linus Torvalds
  2006-11-30 19:55                                                                                     ` Carl Worth
  0 siblings, 1 reply; 806+ messages in thread
From: Linus Torvalds @ 2006-11-30 18:44 UTC (permalink / raw)
  To: Joseph Wakeling; +Cc: git



On Thu, 30 Nov 2006, Joseph Wakeling wrote:
> 
> What would be nice would be to have in the documentation a whole bunch
> of stupid examples for the main commands, something where someone can
> create a repo from scratch, create and modify some simple files
> according to instructions, and see the particular command in action.

100% agreed. A lot of the man-pages etc have been written to be about the 
technology, not about the _use_ of it.

I encouraged people at some point to add an "Examples" section to some of 
the functions to show what it all _means_, so for "man git-log", I think 
some of the most useful stuff is that examples section that shows the 
combination of revision naming and path-name limiting, for example. I 
personally think that that is a much better way of teaching people what 
the commands actually do than by mentioning the arguments one by one.

But that only exists for a couple of man-pages, and mostly for the simple 
ones at that. And a lot of the real examples would need "real data" to 
work on, so it can't easily be done as a trivial example in a man-page, it 
really needs a tutorial to "build up" to the situation where you can then 
explain with an example what to do.

> The tutorials do this, of course, but only for a few cases, when to be
> honest it's the more complex commands that most need such explanation.

Yeah. The git "tutorial.txt" should be extended, and preferably be a whole 
nice set of "follow along with the bouncing ball" kind of web-page 
sequence.

So I absolutely agree. It's just that at least me personally, I just can't 
write documentation. I wrote some of the original tutorial, I've written 
some of the original tech docs, but I just can't get into the whole 
"document it" mindset, especially not from a user perspective. It doesn't 
float my boat, and judging by a lot of the discussions, I obviously also 
don't even see why something could _possibly_ cause confusion.

To make things worse, a lot of the docs (and by that I also mean some of 
the error messages and helpful hints) tend to be old.

The whole fact that "git commit" mentions "git update-index" is exactly 
that kind of thing: it's largely a legacy message. You'd almost never 
actually _use_ git-update-index itself these days, and it's much more 
convenient to just list the files you want to commit to "git commit" 
directly (or just use the -a flag, if that is what you want to do).
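
For illustration, the two forms mentioned above might look like this in a
scratch repository. The file name "notes.txt" and the commit messages are
made up for the example; this is a sketch, not a transcript.

```shell
# Scratch repository with a single tracked file.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email demo@example.com

echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# (1) List the file you want to commit directly on the command line.
echo "second thoughts" >> notes.txt
git commit -q -m "extend notes" notes.txt

# (2) Or use -a to commit every modified file git already tracks.
echo "third pass" >> notes.txt
git commit -q -a -m "polish notes"

git log --oneline
git status --short   # prints nothing: the working tree is clean
```

Neither form needs an explicit git-update-index step.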

But that message exists, because it was written in an earlier age.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-30 18:44                                                                                   ` Linus Torvalds
@ 2006-11-30 19:55                                                                                     ` Carl Worth
  2006-11-30 22:17                                                                                       ` Johannes Schindelin
  0 siblings, 1 reply; 806+ messages in thread
From: Carl Worth @ 2006-11-30 19:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Joseph Wakeling, git

On Thu, 30 Nov 2006 10:44:48 -0800 (PST), Linus Torvalds wrote:
>
> But that only exists for a couple of man-pages, and mostly for the simple
> ones at that. And a lot of the real examples would need "real data" to
> work on, so it can't easily be done as a trivial example in a man-page, it
> really needs a tutorial to "build up" to the situation where you can then
> explain with an example what to do.

Here's a crazy idea. How about a "git tutorial" builtin or "git
example" or something that would create a repository into some useful
state for demonstrating something.

I know that I'm regularly putting stuff into emails like:

	mkdir gittest
	cd gittest
	git init-db
	echo hello > hello
	git add hello
	git commit -m "add hello"
	git checkout -b other
	echo other > other
	git add other
	git commit -m "add other"
	git checkout master

	# OK, that was just setup, here's what I want to demonstrate
	git pull . other
	...

So maybe if there was a command to setup a standard example
repository, ("git boilerplate", "git sandbox", "git playground" ?),
then the documentation could use that to have full-fledged examples
without having to duplicate similar setup each time.

And then there could be a way for this command to also spit out the
commands it is using to reach some state so it could even serve as a
sort of self-documenting tutorial of some sort.
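
A minimal sketch of what such a helper could look like as a plain shell
script follows. The names "run" and "sandbox" are hypothetical, no such
git command exists, and the setup simply mirrors the sequence above
(using "git init" and "git merge" in place of "git init-db" and
"git pull .").

```shell
# Hypothetical helper: print a command the way a tutorial would, then run it.
# Arguments are joined with spaces for display, so quoting is not shown.
run() {
	printf '$ %s\n' "$*"
	"$@"
}

# Hypothetical "sandbox": build the standard example repository and echo
# every step, so the output doubles as documentation of the setup.
sandbox() {
	dir=$(mktemp -d) || return 1
	cd "$dir" || return 1
	run git init -q
	run git symbolic-ref HEAD refs/heads/master
	run git config user.name "Sandbox User"
	run git config user.email sandbox@example.com
	printf '$ echo hello > hello\n'
	echo hello > hello
	run git add hello
	run git commit -q -m "add hello"
	run git checkout -q -b other
	printf '$ echo other > other\n'
	echo other > other
	run git add other
	run git commit -q -m "add other"
	run git checkout -q master
}

sandbox

# OK, that was just setup, here is the thing being demonstrated.
run git merge -q other
```

Because master has no commits of its own after the branch point, the final
merge is a fast-forward and master ends up at the same commit as other.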

Anyone interested in exploring something like that?

-Carl

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 12:25                                                                                                   ` Andreas Ericsson
@ 2006-11-30 20:01                                                                                                     ` Theodore Tso
  2006-11-30 20:09                                                                                                       ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Theodore Tso @ 2006-11-30 20:01 UTC (permalink / raw)
  To: Andreas Ericsson; +Cc: Marko Macek, git, bazaar-ng

On Thu, Nov 30, 2006 at 01:25:19PM +0100, Andreas Ericsson wrote:
> Unless you do "git update-index" (and thus are already using the index) 
> on any files, "git diff" shows you exactly the changes between your last 
> commit and the working tree. There's nothing magic, odd or confusing 
> about it, no matter which scm you come from.

Until you make the mistake of reading the git-diff man page, at which
point the novice git user runs screaming into the night...

       Show changes between two ents, an ent and the working tree, an
       ent and the index file, or the index file and the working
       tree. The combination of what is compared with what is
       determined by the number of ents given to the command.

       * When no <ent> is given, the working tree and the index file
          is compared, using git-diff-files.

       * When one <ent> is given, the working tree and the named tree
          is compared, using git-diff-index. The option --cached can
          be given to compare the index file and the named tree.

       * When two <ent>s are given, these two trees are compared using
          git-diff-tree.
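
For readers who would rather see the three cases than parse that
paragraph, here is a small sketch. The file name "file" and its contents
are invented for the example, and "HEAD" stands in for the <ent>.

```shell
# Scratch repository: one commit, then one staged and one unstaged change.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email demo@example.com

echo one > file
git add file
git commit -q -m "add file"

echo two >> file
git add file          # "two" is now in the index
echo three >> file    # "three" exists only in the working tree

# No tree given: working tree against the index (shows only +three).
git diff

# One tree given: working tree against that tree (shows +two and +three).
git diff HEAD

# One tree plus --cached: index against that tree (shows only +two).
git diff --cached HEAD

# Two trees given: the trees are compared with each other (empty here).
git diff HEAD HEAD
```

With nothing staged, the first two commands print the same thing, which is
the "changes since the last commit" case described in the quoted message.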

Looking at the man page, it does raise one interesting question ---
So exactly what is the difference between Treebeard and Quickbeam?

And how many working trees do we need before we call it an Entmoot?  :-)


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 20:01                                                                                                     ` Theodore Tso
@ 2006-11-30 20:09                                                                                                       ` Jakub Narebski
  2006-12-01  9:55                                                                                                         ` Andreas Ericsson
  0 siblings, 1 reply; 806+ messages in thread
From: Jakub Narebski @ 2006-11-30 20:09 UTC (permalink / raw)
  To: git; +Cc: bazaar-ng

Theodore Tso wrote:

>        * When no <ent> is given, the working tree and the index file
>           is compared, using git-diff-files.

 *  When no <tree-ish> is given, the working tree and  the  index  file  are
    compared, using git-diff-files.

Use more modern git.
-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-30 19:55                                                                                     ` Carl Worth
@ 2006-11-30 22:17                                                                                       ` Johannes Schindelin
  2006-11-30 22:24                                                                                         ` J. Bruce Fields
  2006-11-30 22:38                                                                                         ` git blame Junio C Hamano
  0 siblings, 2 replies; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 22:17 UTC (permalink / raw)
  To: Carl Worth; +Cc: Linus Torvalds, Joseph Wakeling, git

Hi,

On Thu, 30 Nov 2006, Carl Worth wrote:

> Here's a crazy idea. How about a "git tutorial" builtin or "git example" 
> or something that would create a repository into some useful state for 
> demonstrating something.

That sounds fine! Actually, it should be very simple to turn the tutorial 
into such a script, displaying the command with an explanation, and 
executing the command. It could even call gitk from time to time, so the 
user can form a mental model of the ancestor graph.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame [was: git and bzr]
  2006-11-30 22:17                                                                                       ` Johannes Schindelin
@ 2006-11-30 22:24                                                                                         ` J. Bruce Fields
  2006-11-30 22:38                                                                                         ` git blame Junio C Hamano
  1 sibling, 0 replies; 806+ messages in thread
From: J. Bruce Fields @ 2006-11-30 22:24 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Carl Worth, Linus Torvalds, Joseph Wakeling, git

On Thu, Nov 30, 2006 at 11:17:12PM +0100, Johannes Schindelin wrote:
> Hi,
> 
> On Thu, 30 Nov 2006, Carl Worth wrote:
> 
> > Here's a crazy idea. How about a "git tutorial" builtin or "git example" 
> > or something that would create a repository into some useful state for 
> > demonstrating something.
> 
> That sounds fine! Actually, it should be very simple to turn the tutorial 
> into such a script, displaying the command with an explanation, and 
> executing the command. It could even call gitk from time to time, so the 
> user can form a mental model of the ancestor graph.

Currently tutorial.txt doesn't work like that--there are places where it
just tells the user to edit a file, or make a few commits, without
listing commands to do so.  It also isn't linear.  That could all be
"fixed", but I think the result would just make it more tedious.

But I agree that a "git tutorial" command to set up a canonical example
repository might be fun.


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame
  2006-11-30 22:17                                                                                       ` Johannes Schindelin
  2006-11-30 22:24                                                                                         ` J. Bruce Fields
@ 2006-11-30 22:38                                                                                         ` Junio C Hamano
  2006-11-30 22:53                                                                                           ` Johannes Schindelin
  1 sibling, 1 reply; 806+ messages in thread
From: Junio C Hamano @ 2006-11-30 22:38 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: git, Joseph Wakeling, Carl Worth

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Thu, 30 Nov 2006, Carl Worth wrote:
>
>> Here's a crazy idea. How about a "git tutorial" builtin or "git example" 
>> or something that would create a repository into some useful state for 
>> demonstrating something.
>
> That sounds fine! Actually, it should be very simple to turn the tutorial 
> into such a script, displaying the command with an explanation, and 
> executing the command. It could even call gitk from time to time, so the 
> user can form a mental model of the ancestor graph.

Doesn't one of our existing t/ scripts do that?

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 10:01                                                                                                               ` Junio C Hamano
@ 2006-11-30 22:45                                                                                                                 ` Johannes Schindelin
  2006-11-30 23:36                                                                                                                   ` Junio C Hamano
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 22:45 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: jnareb, Alan Chandler, git

Hi,

On Thu, 30 Nov 2006, Junio C Hamano wrote:

> Somehow we ended up introducing that twisted semantics and that was 
> where --only came from, which unfortunately later became the default 
> (and I already said that I realize this was a big mistake).

If you are talking about "git commit file1 file2" ignoring the current 
index, and building a new index just updating file1 and file2 from the 
working directory, I disagree that it was a big mistake.

Actually, I was very happy to get that change (IIRC it was me requesting 
it, so blame me), because I now can say: just specify exactly what you 
want to commit *1*.

If you want to commit just file2 (even if you added file1, but did not 
commit it yet) do "git commit file2". If you want to commit all changes, 
either pass the names of all modified files, or "-a". IMHO this satisfies 
the principle of least surprise.

Ciao,
Dscho

Footnote 1: Of course, you can use commit in more ways. But this is 
sufficient to get people started.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame
  2006-11-30 22:38                                                                                         ` git blame Junio C Hamano
@ 2006-11-30 22:53                                                                                           ` Johannes Schindelin
  2006-11-30 23:08                                                                                             ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Johannes Schindelin @ 2006-11-30 22:53 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Joseph Wakeling, Carl Worth

Hi,

On Thu, 30 Nov 2006, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> > On Thu, 30 Nov 2006, Carl Worth wrote:
> >
> >> Here's a crazy idea. How about a "git tutorial" builtin or "git example" 
> >> or something that would create a repository into some useful state for 
> >> demonstrating something.
> >
> > That sounds fine! Actually, it should be very simple to turn the tutorial 
> > into such a script, displaying the command with an explanation, and 
> > executing the command. It could even call gitk from time to time, so the 
> > user can form a mental model of the ancestor graph.
> 
> Doesn't one of our existing t/ scripts do that?

;-) I did not forget... t1200-tutorial.sh

But it serves a different purpose: it makes sure that we did not break the 
commands in the tutorial. (I fear that the script and the tutorial have 
diverged a little bit, though).

git-tutorial should not test that, rather it should show the user what is 
possible, and encourage playing with git.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git blame
  2006-11-30 22:53                                                                                           ` Johannes Schindelin
@ 2006-11-30 23:08                                                                                             ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-11-30 23:08 UTC (permalink / raw)
  To: git

Johannes Schindelin wrote:

> On Thu, 30 Nov 2006, Junio C Hamano wrote:
> 
>> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> 
>>> On Thu, 30 Nov 2006, Carl Worth wrote:
>>>
>>>> Here's a crazy idea. How about a "git tutorial" builtin or "git example" 
>>>> or something that would create a repository into some useful state for 
>>>> demonstrating something.
>>>
>>> That sounds fine! Actually, it should be very simple to turn the tutorial 
>>> into such a script, displaying the command with an explanation, and 
>>> executing the command. It could even call gitk from time to time, so the 
>>> user can form a mental model of the ancestor graph.
>> 
>> Doesn't one of our existing t/ scripts do that?
> 
> ;-) I did not forget... t1200-tutorial.sh
> 
> But it serves a different purpose: it makes sure that we did not break the 
> commands in the tutorial. (I fear that the script and the tutorial have 
> diverged a little bit, though).
> 
> git-tutorial should not test that, rather it should show the user what is 
> possible, and encourage playing with git.

Something like Cogito tutorial-script?

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 22:45                                                                                                                 ` Johannes Schindelin
@ 2006-11-30 23:36                                                                                                                   ` Junio C Hamano
  0 siblings, 0 replies; 806+ messages in thread
From: Junio C Hamano @ 2006-11-30 23:36 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: jnareb, Alan Chandler, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Thu, 30 Nov 2006, Junio C Hamano wrote:
>
>> Somehow we ended up introducing that twisted semantics and that was 
>> where --only came from, which unfortunately later became the default 
>> (and I already said that I realize this was a big mistake).
>
> If you are talking about "git commit file1 file2" ignoring the current 
> index, and building a new index just updating file1 and file2 from the 
> working directory, I disagree that it was a big mistake.

When I wrote that paragraph, I said:

        Much later, people from CVS background wanted to say "edit foo
        bar; git update-index bar; git commit foo" to mean "I might have
        done something to the index, but I do not want to care about it
        now -- please make a commit that includes only the changes to
        bar and I do not want the changes to foo included in the
        commit".  Somehow we ended up introducing that twisted semantics
        and that was where --only came from, which unfortunately later
        became the default (and I already said that I realize this was a
        big mistake).

But ignoring the index was not because of that command sequence,
as you reminded me in your message I am replying to.  It was to
allow this sequence, which is natural with CVS:

	$ git-checkout  ;# existing project that did not have Makefile
	$ edit hello.c  ;# to fix wording of the message
	$ edit Makefile ;# anybody who is self respecting should have one
	$ git-add Makefile ;# do not forget to add it
	$ git-commit hello.c ;# the fix is important independent of Makefile
	... then maybe the next commit is to add Makefile ...

If you view this sequence with CVS mindset, there is nothing
surprising about the commit _not_ committing Makefile in this
example.
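
The effect can be checked in a scratch repository. This is a sketch with
made-up file contents and commit messages, and "echo" stands in for the
editing steps.

```shell
# Scratch project that has hello.c but no Makefile yet.
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email demo@example.com

echo 'int main(void) { return 0; }' > hello.c
git add hello.c
git commit -q -m "initial hello.c"

echo '/* fix wording of the message */' >> hello.c   # edit hello.c
printf 'hello: hello.c\n' > Makefile                  # edit Makefile
git add Makefile                                      # do not forget to add it

# Commit only hello.c; the Makefile already in the index is left out.
git commit -q -m "fix wording in hello.c" hello.c

# Paths touched by the commit just made: only hello.c.
git diff-tree --no-commit-id --name-only -r HEAD

# Paths still staged for the next commit: Makefile.
git diff --cached --name-only
```

The Makefile is not lost, it simply stays in the index until the next
commit picks it up.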

But if you come from the school that "git-add" is about adding
"the contents (and the path, but only because content cannot be
added without the path)", and if you already understood that
"git-commit" without parameters nor options is a way to make a
commit out of the index, it certainly is counterintuitive.  

Granted, parameters and options are ways to affect what the
command does, but usually it does so by modifying and enhancing
what the command does without breaking the basic premise.  What
the --only does is quite different -- it bypasses the index
completely.

In fact, what it does is _so_ counterintuitive that I did not
even remember what the real motivation behind it was, and sent
my message with a much more implausible sequence which had an
explicit update-index (no sane person would do that).  That
should tell you something.

Remember, new people will not stay "newbies" forever.  The
original "inclusive" semantics is a lot easier to explain once
you get what index does.  The way to introduce "index" to people
Nico proposed would not have to talk about "Ah, but there are
these two twists" if we did not make the --only the default
semantics.  What I find a big mistake is not the --only option;
the mistake is that it is the default.

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-11-30 20:09                                                                                                       ` Jakub Narebski
@ 2006-12-01  9:55                                                                                                         ` Andreas Ericsson
  2006-12-02  8:57                                                                                                           ` Jakub Narebski
  0 siblings, 1 reply; 806+ messages in thread
From: Andreas Ericsson @ 2006-12-01  9:55 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git, bazaar-ng

Jakub Narebski wrote:
> Theodore Tso wrote:
> 
>>        * When no <ent> is given, the working tree and the index file
>>           is compared, using git-diff-files.
> 
>  *  When no <tree-ish> is given, the working tree and  the  index  file  are
>     compared, using git-diff-files.
> 
> Use more modern git.

More modern git (pull'ed 10 minutes ago) has this, at least when cut 
from Documentation/git-diff.txt:
---%<---%<---%<---
SYNOPSIS
--------
'git-diff' [ --diff-options ] <tree-ish>{0,2} [<path>...]

DESCRIPTION
-----------
Show changes between two trees, a tree and the working tree, a
tree and the index file, or the index file and the working tree.
The combination of what is compared with what is determined by
the number of trees given to the command.

* When no <tree-ish> is given, the working tree and the index
   file are compared, using `git-diff-files`.

* When one <tree-ish> is given, the working tree and the named
   tree are compared, using `git-diff-index`.  The option
   `--index` can be given to compare the index file and
   the named tree.
   `--cached` is a deprecated alias for `--index`. It's use is
   discouraged.

* When two <tree-ish>s are given, these two trees are compared
   using `git-diff-tree`.
---%<---%<---%<---

This needs an update, I think. I'll look into it on sunday if no-one's 
beaten me to it.

-- 
Andreas Ericsson                   andreas.ericsson@op5.se
OP5 AB                             www.op5.se

^ permalink raw reply	[flat|nested] 806+ messages in thread

* Re: git and bzr
  2006-12-01  9:55                                                                                                         ` Andreas Ericsson
@ 2006-12-02  8:57                                                                                                           ` Jakub Narebski
  0 siblings, 0 replies; 806+ messages in thread
From: Jakub Narebski @ 2006-12-02  8:57 UTC (permalink / raw)
  To: git

Andreas Ericsson wrote:

> ---%<---%<---%<---
> SYNOPSIS
> --------
> 'git-diff' [ --diff-options ] <tree-ish>{0,2} [<path>...]
> 
> DESCRIPTION
> -----------
> Show changes between two trees, a tree and the working tree, a
> tree and the index file, or the index file and the working tree.
> The combination of what is compared with what is determined by
> the number of trees given to the command.
> 
> * When no <tree-ish> is given, the working tree and the index
>    file are compared, using `git-diff-files`.
> 
> * When one <tree-ish> is given, the working tree and the named
>    tree are compared, using `git-diff-index`.  The option
>    `--index` can be given to compare the index file and
>    the named tree.
>    `--cached` is a deprecated alias for `--index`. It's use is
>    discouraged.
> 
> * When two <tree-ish>s are given, these two trees are compared
>    using `git-diff-tree`.
> ---%<---%<---%<---
> 
> This needs an update, I think. I'll look into it on sunday if no-one's 
> beaten me to it.

You might want to use Junio proposal in
  Message-ID: <7vhcwgcf39.fsf@assigned-by-dhcp.cox.net>
  http://permalink.gmane.org/gmane.comp.version-control.git/32853
(and perhaps also my reply to it)

-- 
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git


^ permalink raw reply	[flat|nested] 806+ messages in thread

end of thread, other threads:[~2006-12-02  9:00 UTC | newest]

Thread overview: 806+ messages
2006-10-14 15:07 VCS comparison table Jon Smirl
2006-10-14 16:40 ` Jakub Narebski
2006-10-14 17:18   ` Jon Smirl
2006-10-14 17:42     ` Jakub Narebski
2006-10-16  3:53   ` Martin Pool
2006-10-22 15:50     ` Jakub Narebski
2006-10-16 22:26   ` Aaron Bentley
2006-10-16 22:35     ` Andy Whitcroft
2006-10-16 22:53       ` Jakub Narebski
2006-10-16 23:19     ` Jakub Narebski
2006-10-16 23:39       ` Nguyen Thai Ngoc Duy
2006-10-17  4:56       ` Aaron Bentley
2006-10-17  5:20         ` Shawn Pearce
2006-10-17  8:21           ` Martin Pool
2006-10-17  8:15         ` Jakub Narebski
2006-10-17  8:16         ` Andreas Ericsson
2006-10-17 20:01           ` Aaron Bentley
2006-10-17 21:01             ` Jakub Narebski
2006-10-17 21:27               ` Aaron Bentley
2006-10-17 21:51                 ` Jakub Narebski
2006-10-17 22:28                   ` Aaron Bentley
2006-10-17 22:57                     ` Jakub Narebski
2006-10-17 22:59                       ` Jakub Narebski
2006-10-17 23:16                       ` Linus Torvalds
2006-10-18  5:36                         ` Jeff King
2006-10-18  5:57                           ` Junio C Hamano
2006-10-18 14:52                           ` Linus Torvalds
2006-10-18 18:52                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Petr Baudis
2006-10-18 18:59                               ` Petr Baudis
2006-10-18 19:04                                 ` Junio C Hamano
2006-10-18 19:13                                   ` Nicolas Pitre
2006-10-18 19:18                                     ` Shawn Pearce
2006-10-18 19:33                                       ` Nicolas Pitre
2006-10-18 20:46                                         ` Shawn Pearce
2006-10-18 21:17                                           ` Linus Torvalds
2006-10-18 21:32                                             ` Shawn Pearce
2006-10-18 21:42                                               ` Junio C Hamano
2006-10-18 21:52                                                 ` Shawn Pearce
2006-10-18 22:02                                                   ` Junio C Hamano
2006-10-18 21:55                                               ` Linus Torvalds
2006-10-18 22:05                                                 ` Shawn Pearce
2006-10-18 22:07                                                 ` Junio C Hamano
2006-10-18 21:41                                             ` Nicolas Pitre
2006-10-18 21:41                                             ` Shawn Pearce
2006-10-18 22:00                                               ` Linus Torvalds
2006-10-18 22:11                                                 ` Shawn Pearce
2006-10-18 22:13                                               ` Junio C Hamano
2006-10-18 22:42                                                 ` Linus Torvalds
2006-10-18 22:48                                                   ` Junio C Hamano
2006-10-18 23:22                                                     ` Shawn Pearce
2006-10-18 23:18                                                   ` Nicolas Pitre
2006-10-18 23:50                                                     ` Johannes Schindelin
2006-10-19  0:07                                                     ` Linus Torvalds
2006-10-19  0:15                                                       ` Linus Torvalds
2006-10-19  0:31                                                       ` Johannes Schindelin
2006-10-19  0:46                                                         ` Linus Torvalds
2006-10-19  3:01                                                       ` Nicolas Pitre
2006-10-19  3:46                                                       ` Junio C Hamano
2006-10-19 14:27                                                         ` Nicolas Pitre
2006-10-19 14:55                                                         ` Linus Torvalds
2006-10-19 16:07                                                           ` Jan Harkes
2006-10-19 16:48                                                             ` Linus Torvalds
2006-10-20  0:20                                                               ` Jan Harkes
2006-10-20 14:41                                                                 ` Jeff King
2006-10-20  0:20                                                               ` [PATCH 1/2] Pass through unresolved deltas when writing a pack Jan Harkes
2006-10-20  0:20                                                               ` [PATCH 2/2] Remove unused index tracking code Jan Harkes
2006-10-20  1:11                                                                 ` Nicolas Pitre
2006-10-20  1:35                                                                   ` Junio C Hamano
2006-10-20  2:27                                                                   ` Jan Harkes
2006-10-20  2:30                                                                     ` Junio C Hamano
2006-10-20  2:46                                                                       ` Jan Harkes
2006-10-20  3:36                                                                     ` Nicolas Pitre
2006-10-18 21:56                                             ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Junio C Hamano
2006-10-18 19:33                                     ` Junio C Hamano
2006-10-18 20:47                                       ` Shawn Pearce
2006-10-18 19:09                                 ` Nicolas Pitre
2006-10-18 20:08                                 ` Linus Torvalds
     [not found]                               ` <20061018155704.b94b441d.seanlkml@sympatico.ca>
2006-10-18 19:57                                 ` Sean
2006-10-18 20:46                                 ` Petr Baudis
     [not found]                                   ` <20061018165341.bcece11f.seanlkml@sympatico.ca>
2006-10-18 20:53                                     ` Sean
2006-10-18 21:39                                     ` Petr Baudis
     [not found]                                       ` <20061018175443.50b728f6.seanlkml@sympatico.ca>
2006-10-18 21:54                                         ` Sean
2006-10-19  6:46                               ` Alexander Belchenko
     [not found]                                 ` <20061019064049.bec89582.seanlkml@sympatico.ca>
2006-10-19 10:40                                   ` Sean
2006-10-20 14:03                                     ` Aaron Bentley
2006-10-20 14:56                                       ` Jakub Narebski
2006-10-20 15:34                                         ` Aaron Bentley
2006-10-20 16:21                                           ` Jakub Narebski
2006-10-20 17:03                                             ` Aaron Bentley
2006-10-20 17:18                                               ` Linus Torvalds
2006-10-20 17:45                                                 ` Jakub Narebski
2006-10-20 17:59                                                   ` Linus Torvalds
2006-10-20 20:17                                                     ` Junio C Hamano
2006-10-20 20:40                                                       ` Jakub Narebski
2006-10-20 22:41                                                       ` [PATCH 1/2] git-pickaxe: introduce heuristics to "best match" scoring Junio C Hamano
2006-10-20 22:41                                                       ` [PATCH 2/2] git-pickaxe: introduce heuristics to avoid "trivial" chunks Junio C Hamano
2006-10-20 17:47                                                 ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Aaron Bentley
2006-10-20 18:06                                                   ` Linus Torvalds
2006-10-20 18:30                                                     ` Linus Torvalds
2006-10-20 19:04                                                       ` Aaron Bentley
2006-10-20 19:31                                                         ` Linus Torvalds
2006-10-20 20:12                                                           ` Aaron Bentley
2006-10-20 17:21                                               ` Shawn Pearce
2006-10-20 17:48                                                 ` Linus Torvalds
2006-10-20 17:58                                                   ` David Lang
2006-10-20 18:15                                                   ` Jon Smirl
2006-11-03  3:43                                                     ` Matthew Hannigan
2006-10-20 20:23                                                   ` Petr Baudis
2006-10-20 20:49                                                     ` David Lang
2006-10-20 20:53                                                       ` Petr Baudis
2006-10-20 20:55                                                         ` David Lang
2006-10-20 20:53                                                   ` Shawn Pearce
2006-10-20 18:12                                             ` Jan Hudec
2006-10-20 18:35                                               ` Jakub Narebski
2006-10-20 18:46                                                 ` Jakub Narebski
2006-10-20 18:47                                               ` Jakub Narebski
2006-10-20 19:00                                                 ` Linus Torvalds
2006-10-20 19:10                                                   ` Aaron Bentley
2006-10-20 19:46                                                     ` Linus Torvalds
2006-10-20 20:29                                                       ` Aaron Bentley
2006-10-20 20:57                                                         ` Linus Torvalds
2006-10-21  2:03                                                           ` git-merge-recursive, was " Johannes Schindelin
2006-10-21  2:17                                                             ` Junio C Hamano
2006-10-22 21:04                                                               ` [PATCH] threeway_merge: if file will not be touched, leave it alone Johannes Schindelin
2006-10-22 23:11                                                                 ` Junio C Hamano
2006-10-23  0:48                                                                   ` Johannes Schindelin
2006-10-23  4:17                                                                     ` Junio C Hamano
2006-10-20 18:48                                               ` [ANNOUNCE] Example Cogito Addon - cogito-bundle Linus Torvalds
2006-10-20 22:13                                                 ` Jeff Licquia
2006-10-20 23:05                                                   ` Robert Collins
2006-10-20 23:15                                                     ` Robert Collins
2006-10-20 23:39                                                       ` Jeff Licquia
2006-10-20 23:24                                                     ` Jakub Narebski
2006-10-20 23:28                                                       ` Petr Baudis
2006-10-20 23:59                                                   ` Linus Torvalds
2006-10-21  1:26                                                     ` Junio C Hamano
2006-10-21  8:40                                                       ` Jakub Narebski
2006-10-20 19:14                                               ` Jakub Narebski
2006-10-20 22:59                                               ` Jeff King
2006-10-21 17:40                                                 ` Jan Hudec
2006-10-21 17:51                                                   ` Jakub Narebski
2006-10-21 19:20                                                     ` Jan Hudec
2006-10-21 18:42                                                   ` Linus Torvalds
2006-10-21 19:21                                                     ` Jakub Narebski
2006-11-03  6:36                                                       ` Martin Langhoff
2006-10-20 22:40                                           ` Petr Baudis
2006-10-20 23:33                                             ` Aaron Bentley
2006-10-21  7:56                                         ` Matthieu Moy
2006-10-21  8:36                                           ` Jakub Narebski
2006-10-21 10:09                                             ` Matthieu Moy
2006-10-21 10:34                                               ` Jakub Narebski
     [not found]                                       ` <20061020113712.d192580a.seanlkml@sympatico.ca>
2006-10-20 15:37                                         ` Sean
2006-10-20 15:37                                         ` Sean
2006-10-19 10:40                                   ` Sean
2006-10-18 21:20                             ` VCS comparison table Jeff King
2006-10-17 23:33                       ` Aaron Bentley
2006-10-18  8:13                         ` Andreas Ericsson
2006-10-18  6:22                   ` Matthieu Moy
     [not found]                 ` <20061017180051.5453ba90.seanlkml@sympatico.ca>
2006-10-17 22:00                   ` Sean
2006-10-17 22:00                   ` Sean
2006-10-17 22:44                     ` Aaron Bentley
     [not found]                       ` <20061017185622.30fbc6c0.seanlkml@sympatico.ca>
2006-10-17 22:56                         ` Sean
2006-10-17 23:11                           ` Jakub Narebski
2006-10-18 21:04                           ` Charles Duffy
     [not found]                             ` <20061018172945.c0c58c38.seanlkml@sympatico.ca>
2006-10-18 21:29                               ` Sean
2006-10-18 23:31                                 ` Charles Duffy
2006-10-18 23:48                                   ` Johannes Schindelin
2006-10-19  1:58                                     ` Charles Duffy
2006-10-19 11:01                                       ` Johannes Schindelin
2006-10-19 11:10                                         ` Charles Duffy
2006-10-19 11:24                                           ` Johannes Schindelin
2006-10-19 11:30                                             ` Charles Duffy
2006-10-20 11:38                                               ` Jakub Narebski
2006-10-18 23:48                                   ` Jakub Narebski
     [not found]                                   ` <20061018194945.3e5105e7.seanlkml@sympatico.ca>
2006-10-18 23:49                                     ` Sean
2006-10-18 23:49                                     ` Sean
2006-10-18 21:29                               ` Sean
2006-10-18 21:37                               ` Shawn Pearce
     [not found]                                 ` <20061018174450.f2108a21.seanlkml@sympatico.ca>
2006-10-18 21:44                                   ` Sean
2006-10-18 21:52                                   ` Petr Baudis
2006-10-18 23:38                                 ` Johannes Schindelin
2006-10-18 23:54                                   ` Petr Baudis
2006-10-19  0:33                                     ` Johannes Schindelin
2006-10-17 22:56                         ` Sean
2006-10-18 21:51                         ` Petr Baudis
2006-10-20  9:43                     ` Matthieu Moy
2006-10-24  6:02                       ` Lachlan Patrick
2006-10-24  6:23                         ` Shawn Pearce
2006-10-24  6:31                         ` Linus Torvalds
2006-10-24  6:45                           ` David Rientjes
     [not found]                             ` <Pine.LNX.4.64.0610240812410.3962@g5.osdl.org>
2006-10-24 15:15                             ` Linus Torvalds
2006-10-24 20:12                               ` David Rientjes
2006-10-24 20:28                                 ` Jakub Narebski
2006-10-25  8:48                                 ` Jeff King
     [not found]                                   ` <Pine.LNX.4.64N.0610250157470.3467@attu1.cs.washington.edu>
     [not found]                                     ` <20061025094900.GA26989@coredump.intra.peff.net>
2006-10-25  9:19                                   ` David Rientjes
2006-10-25  9:32                                     ` Jakub Narebski
2006-10-25  9:49                                     ` Jeff King
2006-10-25 13:49                                       ` Andreas Ericsson
2006-10-25 21:51                                         ` David Lang
2006-10-25 22:15                                           ` Shawn Pearce
2006-10-25 22:29                                             ` Jakub Narebski
2006-10-25 22:44                                               ` Petr Baudis
2006-10-25 23:15                                                 ` Jakub Narebski
2006-10-26  1:06                                                 ` Horst H. von Brand
2006-10-25 22:41                                             ` David Lang
2006-10-25 17:21                                       ` David Rientjes
2006-10-25 21:03                                         ` Jeff King
2006-10-26 11:15                                         ` Andreas Ericsson
2006-10-26 16:30                                           ` David Lang
2006-10-26 17:03                                             ` Nicolas Pitre
2006-10-26 17:04                                               ` David Lang
2006-10-26 17:16                                                 ` Linus Torvalds
2006-10-26 17:24                                                 ` Nicolas Pitre
2006-10-26 17:45                                               ` Jakub Narebski
2006-10-25 21:08                                   ` Junio C Hamano
2006-10-25 21:16                                     ` Jeff King
2006-10-25 21:32                                       ` Junio C Hamano
2006-10-25 21:50                                     ` Junio C Hamano
2006-10-26 11:25                                     ` Andreas Ericsson
2006-10-26  2:29                             ` Linus Torvalds
2006-10-17 22:03                 ` Linus Torvalds
2006-10-17 22:53                   ` Aaron Bentley
2006-10-17 23:09                     ` Linus Torvalds
2006-10-18  0:23                       ` Aaron Bentley
2006-10-18  0:46                         ` Jakub Narebski
     [not found]                         ` <200610180246.18758.jnareb@gmail.com>
2006-10-18  1:00                           ` Aaron Bentley
2006-10-18  1:25                             ` Carl Worth
2006-10-18  3:10                               ` Aaron Bentley
2006-10-18  8:39                                 ` Andreas Ericsson
2006-10-18  9:04                                   ` Peter Baumann
2006-10-18  9:07                                   ` Jakub Narebski
2006-10-18 10:32                                   ` Matthew D. Fuller
2006-10-18 11:19                                     ` Andreas Ericsson
2006-10-18 12:43                                       ` Matthew D. Fuller
     [not found]                                         ` <20061018090218.35f0326b.seanlkml@sympatico.ca>
2006-10-18 13:02                                           ` Sean
2006-10-18 13:02                                           ` Sean
2006-10-18 13:10                                         ` Jakub Narebski
2006-10-18 16:07                                         ` Linus Torvalds
2006-10-18 15:38                                 ` Carl Worth
2006-10-19  9:10                                   ` Matthew D. Fuller
2006-10-19 11:15                                     ` Andreas Ericsson
2006-10-19 12:04                                       ` Matthieu Moy
2006-10-19 12:33                                         ` Petr Baudis
2006-10-19 13:44                                           ` Matthieu Moy
2006-10-19 16:03                                             ` Carl Worth
2006-10-19 16:38                                               ` Matthieu Moy
2006-10-20 11:24                                                 ` Jakub Narebski
2006-10-20 11:50                                           ` Jakub Narebski
2006-10-20 13:26                                             ` Jakub Narebski
2006-10-20 23:19                                             ` Junio C Hamano
2006-10-21  0:07                                               ` Linus Torvalds
2006-10-21  1:09                                                 ` Junio C Hamano
2006-10-21  1:19                                                   ` Linus Torvalds
2006-10-21  1:27                                                     ` Junio C Hamano
2006-10-21  1:55                                                       ` Linus Torvalds
2006-10-21  8:32                                                         ` Jakub Narebski
2006-10-19 11:27                                     ` Karl Hasselström
2006-10-19 11:46                                       ` Petr Baudis
2006-10-19 16:01                                         ` Matthew D. Fuller
2006-10-19 17:06                                           ` Matthew D. Fuller
2006-10-18  3:35                             ` Linus Torvalds
2006-10-19  3:10                               ` Aaron Bentley
2006-10-19  5:21                                 ` Carl Worth
2006-10-19  5:56                                   ` Martin Pool
2006-10-19 14:58                                   ` Aaron Bentley
2006-10-19 16:59                                     ` Carl Worth
2006-10-19 23:01                                       ` Aaron Bentley
2006-10-19 23:42                                         ` Carl Worth
2006-10-20  1:06                                           ` Aaron Bentley
2006-10-20  5:05                                             ` Linus Torvalds
2006-10-20  7:47                                               ` Lachlan Patrick
2006-10-20  8:38                                                 ` Johannes Schindelin
2006-10-20 10:13                                                   ` Petr Baudis
2006-10-20 11:09                                                   ` Jakub Narebski
2006-10-20 11:37                                                     ` Johannes Schindelin
2006-10-20 12:03                                                       ` Jakub Narebski
2006-10-20 12:48                                                         ` Johannes Schindelin
2006-10-20 17:23                                                       ` David Lang
2006-10-20 10:16                                                 ` Petr Baudis
2006-10-20  9:57                                             ` Jakub Narebski
2006-10-20 10:02                                               ` Matthieu Moy
2006-10-20 10:45                                                 ` Andy Whitcroft
2006-10-20 10:45                                               ` James Henstridge
2006-10-20 12:01                                                 ` Jakub Narebski
2006-10-20 11:00                                             ` Jakub Narebski
2006-10-20 14:12                                             ` Jeff King
2006-10-20 14:40                                               ` Jakub Narebski
2006-10-20 14:52                                                 ` Johannes Schindelin
2006-10-20 15:34                                                   ` Jakub Narebski
2006-10-21 17:57                                               ` Aaron Bentley
2006-10-21 18:20                                                 ` Jakub Narebski
2006-10-22 14:27                                                   ` Matthieu Moy
2006-10-20 21:48                                             ` Carl Worth
2006-10-21 13:01                                               ` Matthew D. Fuller
2006-10-21 14:08                                                 ` Jakub Narebski
2006-10-21 16:31                                                   ` Erik Bågfors
2006-10-21 16:59                                                     ` Jakub Narebski
2006-10-21 17:41                                                       ` Jakub Narebski
2006-10-21 18:11                                                   ` Matthew D. Fuller
2006-10-21 19:19                                                     ` Jeff King
2006-10-21 19:30                                                       ` Jakub Narebski
2006-10-21 19:47                                                         ` Jan Hudec
2006-10-21 19:55                                                         ` Linus Torvalds
2006-10-21 20:19                                                           ` Jakub Narebski
2006-10-21 21:46                                                       ` Matthew D. Fuller
     [not found]                                                         ` <20061021180653.d3152616.seanlkml@sympatico.ca>
2006-10-21 22:06                                                           ` Sean
2006-10-21 22:25                                                         ` Jakub Narebski
2006-10-21 23:42                                                           ` Jeff Licquia
2006-10-21 23:49                                                             ` Carl Worth
2006-10-22  0:07                                                               ` Jeff Licquia
2006-10-22  0:47                                                                 ` Linus Torvalds
2006-10-22 16:02                                                               ` Petr Baudis
2006-10-25  9:52                                                               ` Andreas Ericsson
2006-10-21 19:41                                                     ` Jakub Narebski
2006-10-22 19:18                                                       ` David Clymer
2006-10-22 19:57                                                         ` Jakub Narebski
2006-10-22 20:06                                                         ` Jakub Narebski
2006-10-23 11:56                                                           ` David Clymer
2006-10-23 12:54                                                             ` Jakub Narebski
2006-10-23 15:01                                                               ` James Henstridge
2006-10-23 17:18                                                                 ` Aaron Bentley
2006-10-23 17:53                                                                   ` Jakub Narebski
2006-10-23 18:04                                                                     ` Linus Torvalds
2006-10-23 18:21                                                                       ` Jakub Narebski
2006-10-23 18:26                                                                         ` Jelmer Vernooij
2006-10-23 18:31                                                                           ` Jakub Narebski
2006-10-23 18:44                                                                             ` Jelmer Vernooij
2006-10-23 18:45                                                                             ` Linus Torvalds
2006-10-23 18:56                                                                               ` Jelmer Vernooij
2006-10-23 19:02                                                                                 ` Shawn Pearce
2006-10-23 19:12                                                                                 ` Jakub Narebski
2006-10-23 19:18                                                                                 ` Linus Torvalds
2006-10-23 18:34                                                                         ` Linus Torvalds
2006-10-23 20:06                                                                   ` Jeff King
2006-10-23 20:29                                                                     ` Jakub Narebski
2006-10-24  3:24                                                               ` David Clymer
2006-10-21 20:47                                                 ` Carl Worth
2006-10-21 20:55                                                   ` Jakub Narebski
2006-10-21 23:07                                                   ` Jeff Licquia
     [not found]                                                     ` <20061021192539.4a00cc3e.seanlkml@sympatico.ca>
2006-10-21 23:25                                                       ` Sean
2006-10-21 23:25                                                       ` Sean
2006-10-22  0:46                                                       ` Jeff Licquia
     [not found]                                                         ` <20061021212645.2f9ba751.seanlkml@sympatico.ca>
2006-10-22  1:26                                                           ` Sean
2006-10-22  1:26                                                           ` Sean
2006-10-22  3:23                                                           ` Jeff Licquia
     [not found]                                                             ` <20061021233014.d4525a1d.seanlkml@sympatico.ca>
2006-10-22  3:30                                                               ` Sean
2006-10-22  3:30                                                               ` Sean
2006-10-22 10:00                                                               ` Matthew D. Fuller
     [not found]                                                                 ` <20061022074422.50dcbee6.seanlkml@sympatico.ca>
2006-10-22 11:44                                                                   ` Sean
2006-10-22 11:44                                                                   ` Sean
2006-10-22 13:03                                                                   ` Matthew D. Fuller
     [not found]                                                                     ` <20061022092845.233deb43.seanlkml@sympatico.ca>
2006-10-22 13:28                                                                       ` Sean
2006-10-22 13:28                                                                       ` Sean
2006-10-22 13:33                                                                       ` Matthew D. Fuller
     [not found]                                                                         ` <20061022094041.77c06cc7.seanlkml@sympatico.ca>
2006-10-22 13:40                                                                           ` Sean
2006-10-22 13:40                                                                           ` Sean
2006-10-22 13:57                                                                           ` Matthew D. Fuller
     [not found]                                                                             ` <20061022102454.b9dea693.seanlkml@sympatico.ca>
2006-10-22 14:24                                                                               ` Sean
2006-10-22 14:24                                                                               ` Sean
2006-10-22 14:56                                                                               ` Matthew D. Fuller
2006-10-22 15:05                                                                                 ` Matthieu Moy
2006-10-22 12:46                                                   ` Matthew D. Fuller
2006-10-22 13:51                                                     ` Jakub Narebski
2006-10-22 19:36                                                   ` David Clymer
2006-10-25  9:35                                                 ` Andreas Ericsson
2006-10-25  9:46                                                   ` Jakub Narebski
2006-10-25 10:08                                                     ` James Henstridge
2006-10-25 15:54                                                       ` Carl Worth
2006-10-26  8:52                                                         ` James Henstridge
2006-10-26  9:33                                                           ` Junio C Hamano
2006-10-26  9:57                                                             ` James Henstridge
2006-10-26 10:10                                                               ` Jeff King
2006-10-26 10:52                                                                 ` Vincent Ladeuil
2006-10-26 11:13                                                                   ` Jeff King
2006-10-26 11:15                                                                     ` Jeff King
2006-10-26 12:33                                                                     ` Vincent Ladeuil
2006-10-26 13:14                                                                       ` Rogan Dawes
2006-10-26 11:18                                                                   ` Jakub Narebski
2006-10-26 15:05                                                                   ` Linus Torvalds
2006-10-26 16:04                                                                     ` Vincent Ladeuil
2006-10-26 16:21                                                                       ` Linus Torvalds
2006-11-28  0:01                                                                     ` git and bzr Joseph Wakeling
2006-11-28  0:39                                                                       ` Jakub Narebski
2006-11-28  0:40                                                                       ` Sean
2006-11-28  0:40                                                                       ` Sean
2006-11-28  2:57                                                                       ` Linus Torvalds
2006-11-29  2:23                                                                         ` Joseph Wakeling
2006-11-29  3:51                                                                           ` Linus Torvalds
2006-11-29  8:07                                                                             ` Junio C Hamano
2006-11-29 12:17                                                                             ` git blame [was: git and bzr] Joseph Wakeling
2006-11-29 16:39                                                                               ` Linus Torvalds
2006-11-30 18:24                                                                                 ` Joseph Wakeling
2006-11-30 18:44                                                                                   ` Linus Torvalds
2006-11-30 19:55                                                                                     ` Carl Worth
2006-11-30 22:17                                                                                       ` Johannes Schindelin
2006-11-30 22:24                                                                                         ` J. Bruce Fields
2006-11-30 22:38                                                                                         ` git blame Junio C Hamano
2006-11-30 22:53                                                                                           ` Johannes Schindelin
2006-11-30 23:08                                                                                             ` Jakub Narebski
2006-11-28 12:10                                                                       ` git and bzr Erik Bågfors
2006-11-28 12:37                                                                         ` Jakub Narebski
2006-11-28 13:35                                                                           ` Johannes Schindelin
2006-11-28 16:08                                                                             ` Linus Torvalds
2006-11-28 17:07                                                                               ` Aaron Bentley
2006-11-28 17:29                                                                                 ` Jakub Narebski
2006-11-28 18:31                                                                                   ` Aaron Bentley
2006-11-28 18:43                                                                                     ` Jakub Narebski
2006-11-28 21:59                                                                                       ` Aaron Bentley
2006-11-28 22:16                                                                                         ` Jakub Narebski
2006-11-28 18:00                                                                                 ` Linus Torvalds
2006-11-28 17:44                                                                               ` Nicholas Allen
2006-11-28 18:06                                                                                 ` Jakub Narebski
2006-11-28 18:58                                                                                   ` Nicholas Allen
2006-11-28 19:11                                                                                   ` Nicholas Allen
2006-11-28 19:40                                                                                     ` Andy Parkins
2006-11-28 19:59                                                                                       ` Jakub Narebski
2006-11-28 20:37                                                                                   ` Nicholas Allen
2006-11-28 21:26                                                                                     ` Nicholas Allen
2006-11-28 21:43                                                                                       ` Jakub Narebski
2006-11-28 21:49                                                                                       ` Linus Torvalds
2006-11-28 21:53                                                                                         ` Shawn Pearce
2006-11-28 22:13                                                                                           ` Linus Torvalds
2006-11-28 22:22                                                                                             ` Jakub Narebski
2006-11-28 22:00                                                                                         ` Nicholas Allen
2006-11-28 22:25                                                                                           ` Linus Torvalds
2006-11-28 22:41                                                                                             ` Linus Torvalds
2006-11-28 22:48                                                                                               ` Nicholas Allen
2006-11-29 10:49                                                                                                 ` Johannes Schindelin
2006-11-29 11:01                                                                                                   ` Jakub Narebski
2006-11-29 20:37                                                                                                   ` Jon Loeliger
2006-11-28 22:46                                                                                             ` Nicholas Allen
2006-11-29 10:52                                                                                             ` Johannes Schindelin
2006-11-29 17:29                                                                                               ` Linus Torvalds
2006-11-29 18:54                                                                                                 ` Marko Macek
2006-11-29 20:07                                                                                                   ` Johannes Schindelin
2006-11-29 20:49                                                                                                     ` Jakub Narebski
2006-11-29 20:45                                                                                                   ` Linus Torvalds
2006-11-30  0:05                                                                                                     ` Carl Worth
2006-11-30  0:08                                                                                                       ` Carl Worth
2006-11-30  0:30                                                                                                       ` Jakub Narebski
2006-11-30  6:59                                                                                                       ` Raimund Bauer
2006-11-30  7:17                                                                                                         ` Carl Worth
2006-11-30  8:31                                                                                                         ` Alan Chandler
2006-11-30  9:01                                                                                                         ` Nguyen Thai Ngoc Duy
2006-11-30  9:30                                                                                                           ` Alan Chandler
2006-11-30  9:35                                                                                                             ` Jakub Narebski
2006-11-30 10:01                                                                                                               ` Junio C Hamano
2006-11-30 22:45                                                                                                                 ` Johannes Schindelin
2006-11-30 23:36                                                                                                                   ` Junio C Hamano
2006-11-30  9:39                                                                                                             ` Steven Grimm
2006-11-30 10:19                                                                                                         ` Johannes Schindelin
2006-11-30 11:25                                                                                                           ` Nguyen Thai Ngoc Duy
2006-11-30 11:58                                                                                                             ` Jakub Narebski
2006-11-30 12:14                                                                                                               ` Nguyen Thai Ngoc Duy
2006-11-30 12:23                                                                                                             ` Johannes Schindelin
2006-11-30 12:45                                                                                                         ` Andreas Ericsson
2006-11-30 12:25                                                                                                   ` Andreas Ericsson
2006-11-30 20:01                                                                                                     ` Theodore Tso
2006-11-30 20:09                                                                                                       ` Jakub Narebski
2006-12-01  9:55                                                                                                         ` Andreas Ericsson
2006-12-02  8:57                                                                                                           ` Jakub Narebski
     [not found]                                                                                       ` <20061128214531.GA24299@jameswestby.net>
2006-11-28 22:34                                                                                         ` Nicholas Allen
2006-11-28 21:40                                                                                     ` Martin Langhoff
     [not found]                                                                                       ` <456CADE9.7060503@onlinehome.de>
2006-11-28 22:14                                                                                         ` Martin Langhoff
2006-11-28 22:19                                                                                           ` Martin Langhoff
2006-11-28 22:36                                                                                           ` Nicholas Allen
2006-11-28 22:47                                                                                             ` Martin Langhoff
2006-11-30 12:36                                                                       ` Nicholas Allen
2006-11-30 12:47                                                                         ` Johannes Schindelin
2006-11-30 16:45                                                                         ` Linus Torvalds
2006-10-26  9:50                                                           ` VCS comparison table Andreas Ericsson
2006-10-25  9:57                                                   ` Matthieu Moy
2006-10-21 20:05                                               ` Aaron Bentley
2006-10-21 20:48                                                 ` Jakub Narebski
2006-10-21 22:52                                                   ` Edgar Toernig
2006-10-21 23:39                                                   ` Aaron Bentley
2006-10-22  0:04                                                     ` Carl Worth
2006-10-22  0:14                                                     ` Jakub Narebski
     [not found]                                                 ` <20061021165313.dba67497.seanlkml@sympatico.ca>
2006-10-21 20:53                                                   ` Sean
2006-10-21 20:53                                                   ` Sean
2006-10-21 21:10                                                     ` Linus Torvalds
2006-10-22  7:45                                                 ` Jan Hudec
2006-10-22  9:05                                                   ` Jakub Narebski
2006-10-22  9:56                                                     ` Erik Bågfors
2006-10-22 13:23                                                       ` Jakub Narebski
2006-10-22 14:11                                                         ` Erik Bågfors
2006-10-22 14:39                                                           ` Jakub Narebski
2006-10-22 14:25                                                       ` Carl Worth
2006-10-22 14:48                                                         ` Erik Bågfors
2006-10-22 15:04                                                           ` Jakub Narebski
2006-10-22 14:55                                                         ` Jakub Narebski
2006-10-22 18:53                                                         ` Matthew D. Fuller
2006-10-22 19:27                                                           ` Jakub Narebski
2006-10-23 16:57                                                           ` David Lang
2006-10-23 17:29                                                           ` Linus Torvalds
2006-10-23 22:21                                                             ` Matthew D. Fuller
2006-10-23 22:28                                                               ` David Lang
2006-10-23 22:44                                                               ` Linus Torvalds
2006-10-24  0:26                                                                 ` Matthew D. Fuller
2006-10-24 15:58                                                                   ` David Lang
2006-10-24 16:34                                                                     ` Matthew D. Fuller
2006-10-24 18:03                                                                       ` David Lang
2006-10-24 18:25                                                                         ` Jakub Narebski
2006-10-24 19:27                                                                           ` Petr Baudis
2006-10-25  0:27                                                                         ` Matthew D. Fuller
2006-10-25 22:40                                                                           ` David Lang
2006-10-25 23:53                                                                             ` Matthew D. Fuller
2006-10-26 10:13                                                                               ` Andreas Ericsson
2006-10-26 10:45                                                                                 ` Erik Bågfors
2006-10-26 11:48                                                                                 ` Jakub Narebski
2006-10-26 11:54                                                                                   ` Nicholas Allen
2006-10-26 12:13                                                                                     ` Jakub Narebski
2006-10-26 21:25                                                                                     ` Jeff King
2006-10-27  2:02                                                                                   ` Horst H. von Brand
2006-10-27  2:08                                                                                     ` Petr Baudis
2006-10-27  9:34                                                                                     ` Andreas Ericsson
2006-10-27 10:49                                                                                       ` Jakub Narebski
2006-10-27 11:41                                                                                         ` Andreas Ericsson
2006-10-27 14:46                                                                                       ` J. Bruce Fields
2006-10-28 11:18                                                                                         ` Ilpo Nyyssönen
2006-10-28 13:53                                                                                           ` Jakub Narebski
2006-10-28 14:58                                                                                             ` Jakub Narebski
2006-10-28 22:18                                                                                             ` Robin Rosenberg
2006-10-28 22:46                                                                                               ` Jakub Narebski
2006-10-29  6:54                                                                                             ` Ilpo Nyyssönen
2006-10-29 12:01                                                                                               ` Jakub Narebski
2006-10-29 18:24                                                                                                 ` Matthew D. Fuller
2006-10-29 18:39                                                                                                   ` Jakub Narebski
2006-10-30  0:10                                                                                                 ` Theodore Tso
2006-10-30 10:18                                                                                             ` Progress reporting (was: VCS comparison table) Jakub Narebski
2006-10-30 15:21                                                                                               ` Nicolas Pitre
2006-10-26 12:12                                                                                 ` VCS comparison table Matthew D. Fuller
2006-10-26 12:18                                                                                   ` Jakub Narebski
2006-10-26 15:06                                                                                     ` Matthew D. Fuller
2006-10-26 13:47                                                                                 ` Aaron Bentley
2006-10-26 13:53                                                                                   ` Jakub Narebski
2006-10-26 15:13                                                                                     ` Aaron Bentley
2006-10-30 21:46                                                                             ` Jan Hudec
2006-10-23 22:45                                                               ` Jakub Narebski
2006-10-23 23:14                                                                 ` Erik Bågfors
2006-10-23 23:24                                                                   ` Linus Torvalds
2006-10-24  0:26                                                                     ` Matthew D. Fuller
2006-10-24  0:38                                                                       ` Matthew D. Fuller
2006-10-24  5:42                                                                         ` Linus Torvalds
2006-10-24  5:47                                                                           ` Shawn Pearce
2006-10-24 16:46                                                                           ` Matthew D. Fuller
2006-10-24  0:47                                                                       ` Carl Worth
2006-10-24  7:31                                                                         ` Erik Bågfors
2006-10-24 21:51                                                                         ` Erik Bågfors
2006-10-25 12:41                                                                           ` Andreas Ericsson
2006-10-25 13:15                                                                             ` Erik Bågfors
2006-10-24  0:39                                                                     ` Martin Langhoff
2006-10-24  7:52                                                                       ` Erik Bågfors
2006-10-24  8:37                                                                         ` Jakub Narebski
2006-10-24 10:11                                                                         ` Martin Langhoff
2006-10-24  9:30                                                                     ` Jelmer Vernooij
2006-10-26 15:22                                                                       ` Aaron Bentley
2006-10-25 18:41                                                                     ` Aaron Bentley
2006-10-24  9:51                                                               ` Matthieu Moy
2006-10-24 10:27                                                                 ` Jakub Narebski
2006-10-25 10:52                                                               ` Andreas Ericsson
2006-10-25 19:53                                                                 ` Junio C Hamano
2006-10-20  2:53                                           ` James Henstridge
2006-10-20  9:51                                             ` Jakub Narebski
2006-10-20 10:42                                               ` James Henstridge
2006-10-20 13:17                                                 ` Jakub Narebski
2006-10-20 13:36                                                   ` Petr Baudis
2006-10-20 14:12                                                     ` Jakub Narebski
2006-10-20 14:59                                                   ` James Henstridge
2006-10-20 22:50                                                     ` Jakub Narebski
2006-10-20 22:58                                                       ` Petr Baudis
2006-10-20 10:53                                         ` Jakub Narebski
2006-10-20 12:34                                           ` Matthieu Moy
2006-10-20 13:20                                             ` Jakub Narebski
2006-10-20 13:47                                               ` Petr Baudis
2006-10-19 17:01                                     ` Carl Worth
2006-10-19 17:14                                       ` J. Bruce Fields
2006-10-20 14:31                                         ` Jeff King
2006-10-20 15:33                                           ` J. Bruce Fields
2006-10-20 15:43                                             ` Jeff King
2006-10-19 15:25                                   ` Linus Torvalds
2006-10-19 16:13                                     ` Matthew D. Fuller
2006-10-19 16:49                                       ` Linus Torvalds
2006-10-19 18:30                                         ` Linus Torvalds
2006-10-19 18:54                                           ` Matthieu Moy
2006-10-19 20:47                                             ` Linus Torvalds
2006-10-21  5:49                                               ` Junio C Hamano
2006-10-19 23:28                                             ` Ryan Anderson
2006-10-19 19:16                                           ` Junio C Hamano
2006-10-20 10:51                                             ` Jakub Narebski
2006-10-20 15:58                                               ` Linus Torvalds
2006-10-19  5:33                                 ` Jan Hudec
2006-10-19  7:02                                 ` Erik Bågfors
2006-10-19  8:49                                   ` Christian MICHON
2006-10-19  8:58                                     ` Andreas Ericsson
2006-10-19  9:10                                       ` Matthieu Moy
2006-10-19 14:57                                         ` Tim Webster
2006-10-19 15:30                                           ` Aaron Bentley
2006-10-20  3:14                                             ` Tim Webster
2006-10-20  4:05                                               ` Aaron Bentley
2006-10-21 12:30                                                 ` Jan Hudec
2006-10-21 13:05                                                   ` Jakub Narebski
2006-10-21 13:15                                                     ` Jan Hudec
2006-10-21 13:29                                                       ` Jakub Narebski
2006-10-21 16:56                                                     ` Aaron Bentley
2006-10-21 17:03                                                       ` Jakub Narebski
2006-10-21 17:31                                                       ` Linus Torvalds
2006-10-21 17:38                                                         ` Linus Torvalds
2006-10-22  7:49                                                         ` Tim Webster
2006-10-22 17:12                                                           ` Linus Torvalds
2006-10-23  5:19                                                             ` Matthew Hannigan
2006-10-20 10:44                                             ` Jakub Narebski
2006-10-19 16:14                                           ` Matthieu Moy
2006-10-20  3:40                                             ` Tim Webster
2006-10-19 15:45                                       ` Ramon Diaz-Uriarte
2006-10-20 10:40                                       ` Jakub Narebski
2006-10-20 13:36                                         ` Shawn Pearce
2006-10-21 12:30                                         ` Matthew D. Fuller
2006-10-19 11:37                                   ` Petr Baudis
2006-10-19 15:17                                     ` Matthew D. Fuller
2006-10-20 13:22                                 ` Horst H. von Brand
2006-10-20 13:46                                   ` Christian MICHON
2006-10-20 15:05                                     ` Jakub Narebski
2006-10-20 15:16                                       ` Johannes Schindelin
2006-10-20 15:28                                         ` Jakub Narebski
2006-10-20 15:39                                           ` Johannes Schindelin
2006-10-20 16:05                                             ` Jakub Narebski
2006-10-20 16:24                                               ` Jakub Narebski
2006-10-18  3:25                         ` Ryan Anderson
2006-10-17 23:24                     ` Jakub Narebski
2006-10-17 23:50                       ` Linus Torvalds
2006-10-17 23:35               ` Jakub Narebski
2006-10-17  9:20         ` Jakub Narebski
2006-10-17  9:40           ` Robert Collins
2006-10-17 10:08             ` Andreas Ericsson
2006-10-17 10:47               ` Matthieu Moy
2006-10-18  4:55               ` Robert Collins
2006-10-18  8:53                 ` Andreas Ericsson
2006-10-18 11:15                   ` Petr Baudis
2006-10-18 15:31                 ` Linus Torvalds
2006-10-18 15:50                   ` Jakub Narebski
2006-10-18 16:22                     ` Linus Torvalds
2006-10-17 16:41             ` Linus Torvalds
2006-10-17 22:27               ` Robert Collins
     [not found]                 ` <20061017191838.1c36499b.seanlkml@sympatico.ca>
2006-10-17 23:18                   ` Sean
2006-10-17 23:18                   ` Sean
2006-10-17 23:33                   ` Petr Baudis
2006-10-18  5:26                     ` Robert Collins
2006-10-18 21:46                       ` Alternate revno proposal (Was: Re: VCS comparison table) Jan Hudec
2006-10-18 22:14                         ` Jakub Narebski
2006-10-19  5:45                           ` Jan Hudec
2006-10-19  8:19                         ` Alexander Belchenko
2006-10-21 13:48                           ` Jan Hudec
2006-10-20  2:09                         ` Horst H. von Brand
2006-10-20  5:38                           ` Jan Hudec
2006-10-17  9:59           ` VCS comparison table Andreas Ericsson
2006-10-17  9:37       ` Robert Collins
     [not found]         ` <20061017060112.2d036f96.seanlkml@sympatico.ca>
2006-10-17 10:01           ` Sean
2006-10-17 10:01           ` Sean
2006-10-17 10:06         ` Jakub Narebski
2006-10-16 23:35     ` Linus Torvalds
2006-10-16 23:55       ` Jakub Narebski
2006-10-17  0:04         ` Johannes Schindelin
2006-10-17  0:23           ` Linus Torvalds
2006-10-17  0:36             ` Johannes Schindelin
2006-10-17  1:17             ` Nguyen Thai Ngoc Duy
2006-10-17  7:26             ` Christian MICHON
2006-10-17  0:08         ` Linus Torvalds
2006-10-17  0:24           ` Jakub Narebski
2006-10-17  4:31           ` Aaron Bentley
2006-10-19 19:01             ` Nathaniel Smith
2006-10-20 10:32               ` Jakub Narebski
2006-10-17  0:29       ` Luben Tuikov
2006-10-17  4:24       ` Aaron Bentley
2006-10-17  7:50         ` Andreas Ericsson
2006-10-17 14:05           ` Aaron Bentley
     [not found]             ` <20061017103423.a9589295.seanlkml@sympatico.ca>
2006-10-17 14:34               ` Sean
2006-10-17 15:05             ` Andreas Ericsson
2006-10-17 15:32               ` Matthieu Moy
2006-10-17 19:44               ` Aaron Bentley
2006-10-17 23:28                 ` Petr Baudis
2006-10-17 23:39                 ` Jakub Narebski
2006-10-18  0:24                   ` Aaron Bentley
2006-10-17  8:30         ` Jakub Narebski
2006-10-17 11:19           ` Matthieu Moy
2006-10-17 11:45             ` Jakub Narebski
2006-10-17 12:02               ` Jakub Narebski
     [not found]               ` <20061017080702.615a3b2f.seanlkml@sympatico.ca>
2006-10-17 12:07                 ` Sean
2006-10-17 12:07                 ` Sean
2006-10-21  8:27                   ` Jakub Narebski
2006-10-21  8:48                     ` Erik Bågfors
2006-10-17 13:33               ` Matthieu Moy
2006-10-17 12:00             ` Andreas Ericsson
2006-10-17 13:27               ` Matthieu Moy
2006-10-17 13:55                 ` Jakub Narebski
2006-10-17 14:08                   ` Matthieu Moy
2006-10-17 14:41                     ` Jakub Narebski
2006-10-18  0:00                       ` Petr Baudis
2006-10-18  0:30                         ` Aaron Bentley
2006-10-18  0:39                           ` Petr Baudis
2006-10-18  1:28                           ` Jakub Narebski
2006-10-18  1:44                             ` Carl Worth
2006-10-18  3:27                               ` Aaron Bentley
2006-10-18  9:20                                 ` Jakub Narebski
2006-10-18 16:31                                   ` Aaron Bentley
2006-10-21 15:56                                     ` Jan Hudec
2006-10-21 16:13                                       ` Jakub Narebski
     [not found]                           ` <20061018003920.GK20017@pasky.or.cz>
2006-10-18  9:28                             ` Erik Bågfors
2006-10-18 11:08                               ` Petr Baudis
2006-10-18 11:17                                 ` Jakub Narebski
2006-10-18 13:09                                 ` Erik Bågfors
2006-10-18 18:03                   ` Jeff Licquia
2006-10-17 14:01                 ` Andreas Ericsson
2006-10-17 14:24                   ` Matthieu Moy
2006-10-17 14:19             ` Olivier Galibert
2006-10-17 15:37               ` Matthieu Moy
2006-10-18  1:46             ` Petr Baudis
     [not found]             ` <20061017073839.3728d1e7.seanlkml@sympatico.ca>
2006-10-17 11:38               ` Sean
2006-10-17 11:38               ` Sean
2006-10-17 12:03                 ` Matthieu Moy
2006-10-17 12:56                   ` Jakub Narebski
     [not found]                   ` <20061017085723.7542ee6c.seanlkml@sympatico.ca>
2006-10-17 12:57                     ` Sean
2006-10-17 13:44                       ` Matthieu Moy
     [not found]                         ` <20061017100150.b4919aac.seanlkml@sympatico.ca>
2006-10-17 14:01                           ` Sean
2006-10-17 14:01                           ` Sean
2006-10-17 14:19                             ` Matthieu Moy
     [not found]                               ` <20061017110655.f7bcf3f1.seanlkml@sympatico.ca>
2006-10-17 15:06                                 ` Sean
2006-10-17 15:06                                 ` Sean
2006-10-18  0:14                                 ` Petr Baudis
2006-10-18  1:36                                   ` Integrating gitweb and git-browser (was: Re: VCS comparison table) Jakub Narebski
2006-10-18  1:52                                     ` Petr Baudis
2006-10-18  1:58                                       ` Jakub Narebski
2006-10-18  2:02                                         ` Petr Baudis
2006-10-17 12:57                     ` VCS comparison table Sean
2006-10-18  0:25                   ` Petr Baudis
2006-10-18  0:38                     ` Aaron Bentley
     [not found]                     ` <4535778D.40006@utoronto.ca>
2006-10-18  0:42                       ` Petr Baudis
2006-10-18  0:48                       ` Jakub Narebski
     [not found]                       ` <20061018004209.GL20017@pasky.or.cz>
2006-10-18  0:50                         ` Aaron Bentley
     [not found]                         ` <45357A6E.3050603@utoronto.ca>
2006-10-18  0:57                           ` Petr Baudis
2006-10-18  1:05                             ` Aaron Bentley
2006-10-18  1:11                   ` Petr Baudis
2006-10-18  6:44                     ` Matthieu Moy
2006-10-18  7:16                       ` Shawn Pearce
2006-10-21 14:13               ` Jan Hudec
     [not found]                 ` <20061021102346.9cd3abce.seanlkml@sympatico.ca>
2006-10-21 14:23                   ` Sean
2006-10-21 16:19                     ` Erik Bågfors
2006-10-21 16:31                       ` Jakub Narebski
     [not found]                       ` <BAYC1-PASMTP01706CD2FCBE923333A0CBAE020@CEZ.ICE>
2006-10-21 16:35                         ` Erik Bågfors
     [not found]                           ` <BAYC1-PASMTP04FAD1FBB91BA4C07A5E79AE020@CEZ.ICE>
2006-10-21 17:33                             ` Erik Bågfors
2006-10-21 21:04                       ` Linus Torvalds
2006-10-21 23:58                         ` Linus Torvalds
2006-10-22  0:13                           ` Erik Bågfors
2006-10-22  0:22                             ` Jakub Narebski
2006-10-22  1:00                               ` Theodore Tso
2006-10-22  0:09                         ` Erik Bågfors
2006-10-27  4:51                         ` Jan Hudec
2006-10-28 11:38                           ` Jakub Narebski
2006-10-21 14:23                   ` Sean
2006-10-21 18:34                   ` Jan Hudec
     [not found]                     ` <20061021144704.71d75e83.seanlkml@sympatico.ca>
2006-10-21 18:47                       ` Sean
2006-10-21 18:47                       ` Sean
     [not found]         ` <20061017062313.cd41e031.seanlkml@sympatico.ca>
2006-10-17 10:23           ` Sean
2006-10-17 10:30             ` Johannes Schindelin
     [not found]               ` <20061017063549.da130b5f.seanlkml@sympatico.ca>
2006-10-17 10:35                 ` Sean
2006-10-17 10:35                 ` Sean
2006-10-17 10:45               ` Matthias Kestenholz
2006-10-17 13:48               ` Aaron Bentley
2006-10-17 19:51             ` Aaron Bentley
2006-10-21 18:58               ` Jan Hudec
     [not found]                 ` <20061021150233.c29e11c5.seanlkml@sympatico.ca>
2006-10-21 19:02                   ` Sean
2006-10-21 19:02                   ` Sean
2006-10-20  8:26             ` James Henstridge
2006-10-20 10:19               ` Jakub Narebski
2006-10-20  8:56             ` Erik Bågfors
2006-10-17 10:23           ` Sean
2006-10-17 15:03         ` Linus Torvalds
2006-10-16 23:45     ` Johannes Schindelin
2006-10-17  2:40       ` Petr Baudis
2006-10-17  5:08       ` Aaron Bentley
2006-10-17  5:25         ` Carl Worth
2006-10-17  5:31         ` Shawn Pearce
2006-10-17  6:23         ` Junio C Hamano
2006-10-17 18:52           ` J. Bruce Fields
2006-10-17 19:12             ` Jakub Narebski
     [not found]         ` <20061017062341.8a5c8530.seanlkml@sympatico.ca>
2006-10-17 10:23           ` Sean
2006-10-17 10:23           ` Sean
2006-10-18  6:33           ` Jeff King
2006-10-17  9:33       ` Robert Collins
2006-10-17  9:45         ` Jakub Narebski
2006-10-14 20:20 ` Jakub Narebski
2006-10-14 23:06   ` Jon Smirl
2006-10-14 23:34     ` Jakub Narebski
     [not found]     ` <20061014200356.e7b56402.seanlkml@sympatico.ca>
2006-10-15  0:03       ` Sean
2006-10-15  0:34         ` Jon Smirl
     [not found]           ` <20061014214452.8c2d2a5c.seanlkml@sympatico.ca>
2006-10-15  1:44             ` Sean
2006-10-15  0:53     ` Jakub Narebski
2006-10-15 15:37     ` Jakub Narebski
2006-10-15 18:23     ` Petr Baudis
     [not found]       ` <20061015143956.86db3a8b.seanlkml@sympatico.ca>
2006-10-15 18:39         ` Sean
2006-10-15 19:24         ` Petr Baudis
2006-10-15 19:49       ` Jon Smirl
2006-10-16  3:23         ` Petr Baudis
2006-10-16  3:30           ` Jon Smirl
2006-10-17  3:52             ` Sam Vilain
2006-10-17 12:59               ` Jon Smirl
