linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Mercurial v0.1 - a minimal scalable distributed SCM
@ 2005-04-20 10:10 Matt Mackall
  2005-04-22 16:21 ` Bill Davidsen
  0 siblings, 1 reply; 2+ messages in thread
From: Matt Mackall @ 2005-04-20 10:10 UTC (permalink / raw)
  To: linux-kernel

http://selenic.com/mercurial/

April 19, 2005

I've spent the past couple weeks working on a completely new
proof-of-concept SCM. The goals:

 - to initially be as simple (and thereby hackable) as possible
 - to be as scalable as possible
 - to be memory, disk, and bandwidth efficient
 - to be able to do "clone/branch and pull/sync" style development

It's still very early on, but I think I'm doing surprisingly well. Now
that I've got something that actually does some interesting things if
you poke at it right, I figure it's time to throw it out there.
Here's what I've got so far:

 - O(1) file revision storage and retrieval with efficient delta compression
 - efficient append-only layout for rsync and http range protocols
 - bare bones commit, checkout, stat, history
 - working "clone/branch" and "pull/sync" functionality
 - functional enough to be self-hosting[1]
 - all in less than 600 lines of Python

When I say "pull/sync" works, that means I can do:

 hg merge other-repo

and it will pull all "changesets/deltas" that are in other-dir that I
don't have, merge them into the changeset history graph, and do the
same for all files changed for those deltas. It will call out to
a user-specified merge tool like tkdiff for a proper 3-way merge with
the nearest common ancestor in the case of conflicts.

Right now, "cloning/branching" is simply a matter of "cp -al" or
"rsync" (mercurial knows how to break hardlinks if needed).

Some benchmarks from my laptop:

 - prepare for commit of Linux 2.6.10: ~1s
 - commit Linux 2.6.10: 27s
 - checkout Linux 2.6.10: 45s
 - full tree stat for added/changed/deleted files: <1s
 - local hardlink clone: 1.5s
 - empty merge between full trees: <.1s
 - trivial 3-way merge with changes to Makefile: ~1s

Much thought has gone into what the best asymptotic performance can be
for the various things an SCM has to do and the core algorithms and
data structures here should scale relatively painlessly to arbitrary
numbers of changesets, files, and file revisions.

What remains to be done:

 - a halfway-usable command line tool
 - remote (network) repository support
 - diff generation
 - changelog entry editing
 - various manual interventions for merge
 - handle rename
 - handle rollback
 - all sorts of other error handling
 - all sorts of cleanup, packaging, documentation, testing...

Sample usage:

 export HGMERGE=tkmerge   # set a 3-way merge tool
 mkdir foo
 hg create                # create a repository in .hg/
 echo foo > Makefile
 hg add Makefile          # add a file to the current changeset
 hg commit                # commit files currently marked for add or delete
 hg history               # show all changesets
 hg index Makefile        # show change
 touch Makefile
 hg stat                  # find changed files
 cd ..; cp -al foo branch  # make a branch
 hg merge ../branch-foo    # sync changesets from a branch

[1] though the repository format is still in flux

-- 
Mathematics is the supreme nostalgia of our time.

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: Mercurial v0.1 - a minimal scalable distributed SCM
  2005-04-20 10:10 Mercurial v0.1 - a minimal scalable distributed SCM Matt Mackall
@ 2005-04-22 16:21 ` Bill Davidsen
  0 siblings, 0 replies; 2+ messages in thread
From: Bill Davidsen @ 2005-04-22 16:21 UTC (permalink / raw)
  To: Matt Mackall; +Cc: linux-kernel

Matt Mackall wrote:
> http://selenic.com/mercurial/
> 
> April 19, 2005
> 
> I've spent the past couple weeks working on a completely new
> proof-of-concept SCM. The goals:
> 
>  - to initially be as simple (and thereby hackable) as possible
>  - to be as scalable as possible
>  - to be memory, disk, and bandwidth efficient
>  - to be able to do "clone/branch and pull/sync" style development

This sounds like an interesting tool, I want to look at it next week.
> 
> It's still very early on, but I think I'm doing surprisingly well. Now
> that I've got something that actually does some interesting things if
> you poke at it right, I figure it's time to throw it out there.

	[...]

> What remains to be done:
> 
>  - a halfway-usable command line tool
>  - remote (network) repository support
>  - diff generation
>  - changelog entry editing
>  - various manual interventions for merge
>  - handle rename
>  - handle rollback
>  - all sorts of other error handling
>  - all sorts of cleanup, packaging, documentation, testing...

The first one and the last one are really important... if a tool is hard 
to use, or poorly documented, people won't use it no matter how great it 
is. Or they will use it just long enough to create or find something better.

I admit that some of my own toys have never seen the light of day 
because I am unable to invest the time to handle those points.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me

^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2005-04-22 16:13 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2005-04-20 10:10 Mercurial v0.1 - a minimal scalable distributed SCM Matt Mackall
2005-04-22 16:21 ` Bill Davidsen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).