From: Nicolas Pitre <nico@cam.org>
To: Jon Smirl <jonsmirl@gmail.com>
Cc: Junio C Hamano <gitster@pobox.com>,
	david@lang.hm, Nicolas Sebrecht <nicolas.s-dev@laposte.net>,
	"Robin H. Johnson" <robbat2@gentoo.org>,
	Git Mailing List <git@vger.kernel.org>,
	"Shawn O. Pearce" <spearce@spearce.org>
Subject: Re: Performance issue: initial git clone causes massive repack
Date: Mon, 06 Apr 2009 10:19:20 -0400 (EDT)
Message-ID: <alpine.LFD.2.00.0904060959250.6741@xanadu.home>
In-Reply-To: <9e4733910904060652t6c0f37d9t246b7394e3aad350@mail.gmail.com>


On Mon, 6 Apr 2009, Jon Smirl wrote:

> On Mon, Apr 6, 2009 at 1:15 AM, Junio C Hamano <gitster@pobox.com> wrote:
> > Nicolas Pitre <nico@cam.org> writes:
> >
> >> What git-pack-objects does in this case is not a full repack.  It
> >> instead _reuses_ as much of the existing packs as possible, and only does
> >> the heavy packing processing for loose objects and/or inter-pack
> >> boundaries when gluing everything together for streaming over the net.
> >> If for example you have a single pack because your repo is already fully
> >> packed, then the "packing operation" involved during a clone should
> >> merely copy the existing pack over with no further attempt at delta
> >> compression.
> >
> > One possible scenario in which you still need to spend memory and cycles
> > is if the cloned repository was packed to an excessive depth, leaving many
> > of its objects in deltified form on insanely deep chains, while send-pack
> > uses a more reasonable depth when cloning.  Then pack-objects invoked
> > by send-pack is not allowed to reuse most of the objects and would end up
> > redoing the delta on them.
> 
> That seems broken. You went through all of the trouble to make the
> pack file smaller to reduce transmission time, and then clone undoes
> the work.

And as I already explained, this is indeed not what happens.

> What about making a very simple special case for an initial clone?

There should not be any need for initial clone hacks.

> First thing an initial clone does is copy all of the pack files from
> the server to the client without even looking at them.

This is a no-go for reasons already stated many times.  There are 
security implications (those packs might contain stuff that you didn't 
intend to be publicly accessible) and there might be efficiency 
reasons as well (you might have a shared object store with lots of stuff 
unrelated to the particular clone).
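
To make the point above concrete, here is a rough toy model in Python. It
is only a sketch: the object names, the graph layout and the single shared
pack are all invented for illustration and have nothing to do with git's
real on-disk format.  It shows that a pack may hold objects that are not
reachable from the refs being cloned, so copying the pack verbatim would
send them anyway.

```python
# Toy model only: object names and the graph layout are made up.
from collections import deque

# Each object maps to the objects it references
# (commit -> parent commit and tree, tree -> blobs).
OBJECTS = {
    "commit-pub-2": ["commit-pub-1", "tree-pub-2"],
    "commit-pub-1": ["tree-pub-1"],
    "tree-pub-2": ["blob-readme", "blob-main"],
    "tree-pub-1": ["blob-readme"],
    "commit-secret": ["tree-secret"],
    "tree-secret": ["blob-credentials"],
    "blob-readme": [],
    "blob-main": [],
    "blob-credentials": [],
}

# Assume a single pack in a shared object store that holds everything.
PACK = set(OBJECTS)


def reachable(tips):
    """Return every object reachable from the given ref tips."""
    seen = set()
    queue = deque(tips)
    while queue:
        name = queue.popleft()
        if name in seen:
            continue
        seen.add(name)
        queue.extend(OBJECTS[name])
    return seen


wanted = reachable(["commit-pub-2"])  # what the client actually asked for
leaked = PACK - wanted                # extras a verbatim pack copy would send

print(sorted(leaked))
```

In this made-up store the three "secret" objects end up in `leaked`, which
is the kind of exposure (and wasted transfer) that a blind pack copy risks.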

The biggest cost right now when cloning a big packed repo is object 
enumeration.  Any other issue involving memory costs in the GB range 
simply has no reason to exist, and is mostly due to misconfiguration or 
bugs that have to be fixed.  Trying to work around the issue with all 
sorts of hacks is simply counterproductive.
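
As a rough illustration of why enumeration dominates, the following Python
sketch walks a synthetic linear history and counts the objects visited.
Everything here is an assumption made for the example (one tree and one new
blob per commit, the helper names `build_history` and `enumerate_objects`);
it is not how git stores or walks objects internally.  The only point is
that every reachable object has to be visited once, even when the pack data
itself could be reused as is.

```python
# Toy model only: a synthetic linear history, not git's real data structures.

def build_history(n_commits):
    """Build n_commits commits, each with one tree and one new blob."""
    objects = {}
    parent = None
    for i in range(n_commits):
        blob, tree, commit = f"blob{i}", f"tree{i}", f"commit{i}"
        objects[blob] = []
        objects[tree] = [blob]
        objects[commit] = [tree] + ([parent] if parent else [])
        parent = commit
    return objects, parent  # parent now names the tip commit


def enumerate_objects(objects, tip):
    """Walk from the tip and count every object, visiting each one once."""
    seen = set()
    stack = [tip]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(objects[name])
    return len(seen)


for n in (10, 1000):
    objects, tip = build_history(n)
    print(n, enumerate_objects(objects, tip))
```

In this model the visit count grows linearly with the length of the
history, so the walk is the part that cannot be skipped however well the
repo is packed.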

In the case that started this very thread, I suspect that a small 
misfeature of some delta caching might be the culprit.  I asked Robin H. 
Johnson to perform a really simple config addition to his repo and 
retest, for which we haven't seen any results yet.


Nicolas
