From: Matt Mackall <mpm@selenic.com>
To: Andrea Arcangeli <andrea@suse.de>
Cc: Linus Torvalds <torvalds@osdl.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
git@vger.kernel.org
Subject: Re: Mercurial 0.4b vs git patchbomb benchmark
Date: Sat, 30 Apr 2005 08:20:15 -0700 [thread overview]
Message-ID: <20050430152014.GI21897@waste.org> (raw)
In-Reply-To: <20050430025211.GP17379@opteron.random>
On Sat, Apr 30, 2005 at 04:52:11AM +0200, Andrea Arcangeli wrote:
> On Fri, Apr 29, 2005 at 01:39:59PM -0700, Matt Mackall wrote:
> > Mercurial is ammenable to rsync provided you devote a read-only
> > repository to it on the client side. In other words, you rsync from
> > kernel.org/mercurial/linus to local/linus and then you merge from
> > local/linus to your own branch. Mercurial's hashing hierarchy is
> > similar to git's (and Monotone's), so you can sign a single hash of
> > the tree as well.
>
> Ok fine. It's also interesting how you already enabled partial transfers
> through http.
>
> Please apply this patch so it doesn't fail on my setup ;)
>
> --- mercurial-0.4b/hg.~1~ 2005-04-29 02:52:52.000000000 +0200
> +++ mercurial-0.4b/hg 2005-04-30 00:53:02.000000000 +0200
> @@ -1,4 +1,4 @@
> -#!/usr/bin/python
> +#!/usr/bin/env python
Done.
> On a bit more technical side, one thing I'm wondering about is the
> compression. If I change mercurial like this:
>
> --- revlog.py.~1~ 2005-04-29 01:33:14.000000000 +0200
> +++ revlog.py 2005-04-30 03:54:12.000000000 +0200
> @@ -11,9 +11,11 @@
> import zlib, struct, mdiff, sha, binascii, os, tempfile
>
> def compress(text):
> + return text
> return zlib.compress(text)
>
> def decompress(bin):
> + return text
> return zlib.decompress(bin)
>
> def hash(text):
>
>
> the .hg directory sizes changes from 167M to 302M _BUT_ the _compressed_
> size of the .hg directory (i.e. like in a full network transfer with
> rsync -z or a tar.gz backup) changes from 55M to 38M:
>
> andrea@opteron:~/devel/kernel> du -sm hg-orig hg-aa hg-orig.tar.bz2 hg-aa.tar.bz2
> 167 hg-orig
> 302 hg-aa
> 55 hg-orig.tar.bz2
> 38 hg-aa.tar.bz2
> ^^^^^^^^^^^^^^^^^^^^^ 38M backup and network transfer is what I want
>
> So I don't really see an huge benefit in compression, other than to
> slowdown the checkins measurably [i.e. what Linus doesn't want] (the
> time of compression is a lot higher than the time of python runtime during
> checkin, so it's hard to believe your 100% boost with psyco in the hg file,
> sometime psyco doesn't make any difference infact, I'd rather prefer people to
> work on the real thing of generating native bytecode at compile time, rather
> than at runtime, like some haskell compiler can do).
Most of that psyco speed up is accelerating subsequent diffs in
difflib, which you probably didn't hit yet.
> mercurial is already good at decreasing the entropy by using an efficient
> storage format, it doesn't need to cheat by putting compression on each blob
> that can only leads to bad ratios when doing backups and while transferring
> more than one blob through the network.
>
> So I suggest to try disabling compression optionally, perhaps it'll be even
> faster than git in the initial checkin that way! No need of compressing or
> decompressing anything with mercurial (unlike with git that would explode
> without control w/o compression).
I can make it some sort of environment variable, sure. I think the
speed is already in a domain where it's not a big deal though. There
are other things to do first, like unifying the merge/commit/update
code.
> Http is not intended for maximal efficiency, it's there just to make
> life easy. special protocol with zlib is required for maximum
> efficiency.
Yeah, I've got a plan here.
> You also should move the .py into a hg directory, so that they won't
> pollute the site-packages.
Yep, I'm rather new to actually packaging my Python hacks.
--
Mathematics is the supreme nostalgia of our time.
next prev parent reply other threads:[~2005-04-30 15:20 UTC|newest]
Thread overview: 106+ messages / expand[flat|nested] mbox.gz Atom feed top
2005-04-26 0:41 Mercurial 0.3 vs git benchmarks Matt Mackall
2005-04-26 1:49 ` Daniel Phillips
2005-04-26 2:08 ` Linus Torvalds
2005-04-26 2:30 ` Mike Taht
2005-04-26 3:04 ` Linus Torvalds
2005-04-26 4:00 ` Linus Torvalds
2005-04-26 11:13 ` Chris Mason
2005-04-26 15:09 ` Magnus Damm
2005-04-26 15:38 ` Chris Mason
2005-04-26 16:23 ` Magnus Damm
2005-04-26 18:18 ` Chris Mason
2005-04-26 20:56 ` Andrew Morton
2005-04-26 21:07 ` Linus Torvalds
2005-04-26 22:50 ` H. Peter Anvin
2005-04-26 22:56 ` Andrew Morton
2005-04-26 23:43 ` H. Peter Anvin
2005-04-27 15:01 ` Florian Weimer
2005-04-27 15:13 ` Thomas Glanzmann
2005-04-27 18:54 ` H. Peter Anvin
2005-04-27 19:01 ` Thomas Glanzmann
2005-04-27 19:57 ` Theodore Ts'o
2005-04-27 20:06 ` Thomas Glanzmann
2005-04-27 20:35 ` H. Peter Anvin
2005-04-27 20:39 ` Thomas Glanzmann
2005-04-27 20:47 ` Florian Weimer
2005-04-27 20:55 ` Florian Weimer
2005-04-27 21:04 ` H. Peter Anvin
2005-04-27 21:06 ` Florian Weimer
2005-04-27 21:32 ` Theodore Ts'o
2005-04-27 19:55 ` Theodore Ts'o
2005-04-27 6:34 ` Ingo Molnar
2005-04-27 21:10 ` Bill Davidsen
2005-04-27 21:39 ` Linus Torvalds
2005-04-26 16:42 ` Linus Torvalds
2005-04-26 17:39 ` Chris Mason
2005-04-26 19:52 ` Chris Mason
2005-04-26 18:15 ` H. Peter Anvin
2005-04-26 20:30 ` Bill Davidsen
2005-04-26 16:11 ` Bill Davidsen
2005-04-26 4:01 ` Matt Mackall
2005-04-26 4:20 ` Linus Torvalds
2005-04-26 4:09 ` Chris Wedgwood
2005-04-26 4:22 ` Andreas Gal
2005-04-26 4:22 ` Linus Torvalds
2005-04-29 6:01 ` Mercurial 0.4b vs git patchbomb benchmark Matt Mackall
2005-04-29 6:40 ` Sean
2005-04-29 7:40 ` Matt Mackall
2005-04-29 8:40 ` Sean
2005-04-29 14:34 ` Linus Torvalds
2005-04-29 15:18 ` Morten Welinder
2005-04-29 16:52 ` Matt Mackall
2005-05-02 16:10 ` Bill Davidsen
2005-05-02 19:02 ` Sean
2005-05-02 22:02 ` Linus Torvalds
2005-05-02 22:30 ` Matt Mackall
2005-05-02 22:49 ` Linus Torvalds
2005-05-03 0:00 ` Matt Mackall
2005-05-03 2:48 ` Linus Torvalds
2005-05-03 3:29 ` Matt Mackall
2005-05-03 4:18 ` Linus Torvalds
2005-05-03 4:24 ` Linus Torvalds
2005-05-03 4:27 ` Matt Mackall
2005-05-03 8:45 ` Chris Wedgwood
2005-04-29 15:44 ` Tom Lord
2005-04-29 15:58 ` Linus Torvalds
2005-04-29 17:34 ` Tom Lord
2005-04-29 17:56 ` Linus Torvalds
2005-04-29 18:08 ` Tom Lord
2005-04-29 18:33 ` Sean
2005-04-29 18:54 ` Tom Lord
2005-04-29 19:13 ` Sean
2005-05-02 16:15 ` Bill Davidsen
2005-04-29 16:37 ` Matt Mackall
2005-04-29 17:09 ` Linus Torvalds
2005-04-29 19:12 ` Matt Mackall
2005-04-29 19:50 ` Linus Torvalds
2005-04-29 20:23 ` Matt Mackall
2005-04-29 20:49 ` Linus Torvalds
2005-04-29 21:20 ` Matt Mackall
2005-04-29 16:46 ` Bill Davidsen
2005-04-29 20:19 ` Andrea Arcangeli
2005-04-29 22:30 ` Olivier Galibert
2005-04-29 22:47 ` Andrea Arcangeli
2005-04-29 20:30 ` Andrea Arcangeli
2005-04-29 20:39 ` Matt Mackall
2005-04-30 2:52 ` Andrea Arcangeli
2005-04-30 15:20 ` Matt Mackall [this message]
2005-04-30 16:37 ` Andrea Arcangeli
2005-05-02 15:49 ` Bill Davidsen
2005-05-02 16:14 ` Valdis.Kletnieks
2005-05-03 17:40 ` Bill Davidsen
2005-05-04 2:10 ` Mercurial 0.4b vs git patchbomb benchmark (/usr/bin/env again) David A. Wheeler
2005-05-02 16:17 ` Mercurial 0.4b vs git patchbomb benchmark Andrea Arcangeli
2005-05-02 16:31 ` Linus Torvalds
2005-05-02 17:18 ` Daniel Jacobowitz
2005-05-02 17:32 ` Linus Torvalds
2005-05-02 20:54 ` Sam Ravnborg
2005-05-02 17:20 ` Ryan Anderson
2005-05-02 17:31 ` Linus Torvalds
2005-05-02 21:17 ` Kyle Moffett
2005-05-03 17:43 ` Bill Davidsen
[not found] <3YQn9-8qX-5@gated-at.bofh.it>
[not found] ` <3ZLEF-56n-1@gated-at.bofh.it>
[not found] ` <3ZM7L-5ot-13@gated-at.bofh.it>
[not found] ` <3ZN3P-69A-9@gated-at.bofh.it>
[not found] ` <3ZNdz-6gK-9@gated-at.bofh.it>
2005-05-03 1:16 ` Bodo Eggert <harvested.in.lkml@posting.7eggert.dyndns.org>
2005-05-03 1:29 ` Matt Mackall
2005-05-03 16:22 ` Bill Davidsen
2005-05-03 17:14 ` Rene Scharfe
2005-05-04 17:51 ` Bill Davidsen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20050430152014.GI21897@waste.org \
--to=mpm@selenic.com \
--cc=andrea@suse.de \
--cc=git@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=torvalds@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).