From: "Robin H. Johnson" <robbat2@gentoo.org>
To: Git Mailing List <git@vger.kernel.org>
Subject: Re: Performance issue: initial git clone causes massive repack
Date: Sun, 5 Apr 2009 13:43:25 -0700
Message-ID: <20090405204325.GA31344@curie-int>
In-Reply-To: <20090405190213.GA12929@vidovic>

On Sun, Apr 05, 2009 at 09:02:13PM +0200, Nicolas Sebrecht wrote:
> > Before I answer the rest of your post, I'd like to note that the matter
> > of which choice between single-repo, repo-per-package, repo-per-category
> > has been flogged to death within Gentoo.
> > 
> > I did not come to the Git mailing list to rehash those choices. I came
> > here to find a solution to the performance problem.
> I understand. I know two ways to resolve this:
> - by resolving the performance problem itself,
> - by changing the workflow to something more accurate and more suitable
>   against the facts.
> 
> My point is that going from a centralized to a decentralized SCM
> involves breaking strongly how developers and maintainers work. What
> you're currently suggesting is a way to work with Git in a centralized
> way. This sucks. To get the things right with Git I would avoid shared
> and global repositories. Gnome is doing it this way:
> http://gitorious.org/projects/gnome-svn-hooks/repos/mainline/trees/master
The entire matter of splitting the repository comes down to what should
be considered an atomic unit. For GNOME, KDE and all of the other large
Git consumers that I'm aware of, their atomic units are individual
packages - specifically because they make sense to be consumed without
having all the rest of the packages. For the gentoo tree, it is an
atomic unit in itself. Changes to the profiles/ directory (for package
masks, USE keys) are frequently related and always need to be committed
and received atomically with changes to one or more packages.
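
To make the atomicity point concrete, here is a minimal sketch of a
single commit that carries a profiles/ change together with the package
it refers to. The paths, package name and identity settings are
hypothetical placeholders, not taken from the real tree; it only
assumes a stock git in PATH.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for the single-tree layout.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.org"   # placeholder identity

# Hypothetical starting point: a tree that already has a mask file.
mkdir -p profiles app-misc/foo
echo "# placeholder mask file" > profiles/package.mask
git add profiles/package.mask
git commit -q -m "Initial tree"

# A new package version plus the mask entry that belongs with it.
echo "=app-misc/foo-1.0" >> profiles/package.mask
echo "# placeholder ebuild" > app-misc/foo/foo-1.0.ebuild

# One commit covers both paths, so anyone fetching it
# receives both changes or neither.
git add profiles/package.mask app-misc/foo/foo-1.0.ebuild
git commit -q -m "app-misc/foo: add 1.0, masked for testing"

# List the paths touched by that single commit.
changed=$(git diff-tree --no-commit-id --name-only -r HEAD)
echo "$changed"
```

With the tree split per category or per package, the same change would
have to be two commits in two repositories, with nothing tying them
together on the receiving side.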

> >          The GSoC 2009 ideas contain a potential project for caching the
> > generated packs, which, while having value in itself, could be partially
> > avoided by sending suitable pre-built packs (if they exist) without any
> > repacking.
> Right. It could be an option to wait and see if the GSoC gives
> something.
How hard is it to just look at the git-upload-pack code and make it
realize that it doesn't need to repack at all for this case?

> > A quick bit of stats run show that while some developers only touch a
> > few packages, there are at least 200 developers that have done a major
> > change to 100 or more packages.
> That's a point that has to be reconsidered. Not the fact that at least
> 200 developers work on over 100 packages (this is really not an issue)¹
> but the fact that they do that directly on the main repo/server. The
> good way to achieve this is to send his work to the maintainer². The main
> issue is a better code reviewing.
This has been shot down by our developer base. One of the grounds is
that there is no developer with sufficient time to take a merge-master
role on a regular basis like that.

> 1. Some or all repo-per-category can be tracked with a simple script.
> 2. Maintainers could be - or not be - the same developers as today.
> Adding a layer of maintainers in charge of EAPI review (for example) up
> to the packages-maintainers could help in fixing a lot of portage issues
> and would avoid "simple developers" to do crap on the main repo(s) that
> users download.
You imply that there is a problem in that field already, which I
disagree with.

> > > One repo per category could be a good compromise assuming one separate
> > > branch per package, then.
> > Other downsides to repo-per-category and repo-per-package:
> Let's forget a repo-per-package.
One downside unique to repo-per-category is that when a package moves
cross-category, you end up with it consuming space in packs on both
sides.

> > - Raises difficulty in adding a new package/category. 
> >   You cannot just do 'mkdir && vi ... && git add && git commit' anymore.
> Right, but categories are not evolving that much.
There's demand to evolve them, but bulk package moves are painful with
CVS, so it's been waiting for Git.

> A repo-per-category local workflow would be:
> [...]
> $ git checkout package_one
> $ tree -a
> |-- .git
> |   |-- [...]
> |   [...]
> `-- package_one
>     |-- ChangeLog
>     |-- Manifest
>     |-- metadata.xml
>     |-- package_one-0.4.ebuild
>     `-- package_one-0.5.ebuild
Umm, why does package_two not exist in the other branch?
If package_one depends on package_two, you're in for a world of fail
the moment you change branches here.
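
A rough sketch of what I mean, assuming the branch-per-package layout
quoted above. The ebuild contents, version numbers and identity
settings are made up for illustration; only the branch names come from
the example.

```shell
#!/bin/sh
set -e

# Throwaway repository standing in for one category.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.org"   # placeholder identity

# First branch: holds package_one only.
git symbolic-ref HEAD refs/heads/package_one
mkdir package_one
echo 'DEPEND="package_two"' > package_one/package_one-0.5.ebuild
git add package_one
git commit -q -m "package_one: add 0.5"

# Second branch: unrelated history, holds package_two only.
git checkout -q --orphan package_two
git rm -rf -q .
mkdir package_two
echo "# placeholder ebuild" > package_two/package_two-1.0.ebuild
git add package_two
git commit -q -m "package_two: add 1.0"

# Switch back to work on package_one and look for its dependency.
git checkout -q package_one
if [ -d package_two ]; then present=yes; else present=no; fi
echo "package_two present in working tree: $present"
```

After the last checkout the working tree contains package_one alone,
so anything that resolves the dependency from the checked-out tree
cannot find package_two.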

> > - Does NOT present a good base for anybody wanting to branch the entire
> >   tree themselves.
> Scriptable.
You dropped my cvsserver list item.

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail     : robbat2@gentoo.org
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85
