All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Fick <mfick@codeaurora.org>
To: Julian Phillips <julian@quantumfyre.co.uk>
Cc: git@vger.kernel.org
Subject: Re: Git is not scalable with too many refs/*
Date: Mon, 26 Sep 2011 14:01:38 -0600	[thread overview]
Message-ID: <201109261401.38624.mfick@codeaurora.org> (raw)
In-Reply-To: <88a00eadcbb4a7946dbe8d70dd0e933d@quantumfyre.co.uk>

On Monday, September 26, 2011 12:37:10 pm Julian Phillips 
wrote:
> On Mon, 26 Sep 2011 12:07:52 -0600, Martin Fick wrote:
> -- snip --
> 
> > Ahh, I think I have some more clues.  So while this
> > change does not speed things up for me normally, I
> > found a case where it does!  I  set my .git/config to
> > have
> > 
> >   [core]
> >   
> >         compression = 0
> > 
> > and ran git-gc on my repo.  Now, with a modern git with
> > this optimization in it (1.7.6, 1.7.7.rc0...), 'git
> > branch' is almost instantaneous (.05s)!  But, if I
> > revert c774aa it takes > ~15s.
> 
> I don't understand this.  I don't see why
> core.compression should have anything to do with refs
> ...
> 
> > So, it appears that this optimization is foiled by
> > compression?  In the case when this optimization helps,
> > it save about 15s, when it hurts (with compression),
> > it seems to cost > 3mins.  I am not sure this
> > optimization is worth it?  Would there be someway for
> > it to adjust to the repo conditions?
> 
> Well, in the case I tried it was 1.2s vs 40s.  It would
> seem that you have managed to find some corner case.  It
> doesn't seem right to punish everyone who has large
> numbers of refs by making their commands take orders of
> magnitude longer to save one person 3m.  Much better to
> find, understand and fix the actual cause.

I am not sure mine is the corner case, it is a real repo 
(albeit a Gerrit repo with strange refs/changes), while it 
sounds like yours is a test repo.  It seems likely that 
whatever you did to create the test repo makes it perform 
well?  I am also guessing that it is not the refs which are 
the problem but the objects since the refs don't get 
compressed do they?  Does your repo have real data in it 
(not just 100K refs)?  

My repo compressed is about ~2G and uncompressed is ~1.1G
Yes, the compressed one is larger than the uncompressed one.
Since the compressed repo above was larger, I thought that I 
should at lest gc it.  After git gc, it is ~1.1G, so it 
looks like the size difference was really because of not 
having gced it at first after fetching the 100K refs.

After a gc, the repo does perform the similar to the 
uncompressed one (which was achieved via gc).  After gc, it 
takes ~.05s do to a 'git branch' with 1.7.6 and 
git.1.7.7.rc0.72.g4b5ea.  It also takes a bit more than 15s 
with the patch reverted.  So it appears that compression is 
not likely the culprit, but rather the need to be gced.

So, maybe you are correct, maybe my repo is the corner case?  
Is a repo which needs to be gced considered a corner case?  
Should git be able to detect that the repo is so in 
desperate need of gcing?  Is it normal for git to need to gc 
right after a clone and then fetching ~100K refs?

I am not sure what is right here, if this patch makes a repo 
which needs gcing degrade 5 to 10 times worse than the 
benefit of this patch, it still seems questionable to me.


 
> I really can't see what effect core.compression can have
> on loading the ref_list.  Certainly the sort doesn't
> load anything from the object database.  It would be
> really good to profile and find out what is taking all
> the time - I am assuming that the CPU is at 100% for the
> 3+ minutes?

Yes, 100% CPU (I mostly run the tests at least twice and 
have 8G of RAM, so I think the entire repo gets cached).


> Random thought.  What happens to the with compression
> case if you leave the commit in, but add a sleep(15) to
> the end of sort_refs_list?

Why, what are you thinking?  Hmm, I am trying this on the 
non gced repo and it doesn't seem to be completing (no cpu 
usage)!  It appears that perhaps it is being called many 
times (the sleeping would explain no cpu usage)?!?  This
could be a real problem, this should only get called once 
right?

-Martin

-- 
Employee of Qualcomm Innovation Center, Inc. which is a 
member of Code Aurora Forum

  reply	other threads:[~2011-09-26 20:01 UTC|newest]

Thread overview: 126+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-06-09  3:44 Git is not scalable with too many refs/* NAKAMURA Takumi
2011-06-09  6:50 ` Sverre Rabbelier
2011-06-09 15:23   ` Shawn Pearce
2011-06-09 15:52     ` A Large Angry SCM
2011-06-09 15:56       ` Shawn Pearce
2011-06-09 16:26         ` Jeff King
2011-06-10  3:59           ` NAKAMURA Takumi
2011-06-13 22:27             ` Jeff King
2011-06-14  0:17             ` Andreas Ericsson
2011-06-14  0:30               ` Jeff King
2011-06-14  4:41                 ` Junio C Hamano
2011-06-14  7:26                   ` Sverre Rabbelier
2011-06-14 10:02                     ` Johan Herland
2011-06-14 10:34                       ` Sverre Rabbelier
2011-06-14 17:02                       ` Jeff King
2011-06-14 19:20                         ` Shawn Pearce
2011-06-14 19:47                           ` Jeff King
2011-06-14 20:12                             ` Shawn Pearce
2011-09-08 19:53                               ` Martin Fick
2011-09-09  0:52                                 ` Martin Fick
2011-09-09  1:05                                   ` Thomas Rast
2011-09-09  1:13                                     ` Thomas Rast
2011-09-09 15:59                                   ` Jens Lehmann
2011-09-25 20:43                                   ` Martin Fick
2011-09-26 12:41                                     ` Christian Couder
2011-09-26 17:47                                       ` Martin Fick
2011-09-26 18:56                                         ` Christian Couder
2011-09-30 16:41                                           ` Martin Fick
2011-09-30 19:26                                             ` Martin Fick
2011-09-30 21:02                                             ` Martin Fick
2011-09-30 22:06                                               ` Martin Fick
2011-10-01 20:41                                                 ` Junio C Hamano
2011-10-02  5:19                                                   ` Michael Haggerty
2011-10-03  0:46                                                     ` Martin Fick
2011-10-04  8:08                                                       ` Michael Haggerty
2011-10-03 18:12                                                 ` Martin Fick
2011-10-03 19:42                                                   ` Junio C Hamano
2011-10-04  8:16                                                   ` Michael Haggerty
2011-10-08 20:59                                                 ` Martin Fick
2011-10-09  5:43                                                   ` Michael Haggerty
2011-09-28 19:38                                       ` Martin Fick
2011-09-28 22:10                                         ` Martin Fick
2011-09-29  0:54                                           ` Julian Phillips
2011-09-29  1:37                                             ` Martin Fick
2011-09-29  2:19                                               ` Julian Phillips
2011-09-29 16:38                                                 ` Martin Fick
2011-09-29 18:26                                                   ` Julian Phillips
2011-09-29 18:27                                                 ` René Scharfe
2011-09-29 19:10                                                   ` Junio C Hamano
2011-09-29  4:18                                                     ` [PATCH] refs: Use binary search to lookup refs faster Julian Phillips
2011-09-29 21:57                                                       ` Junio C Hamano
2011-09-29 22:04                                                       ` [PATCH v2] " Julian Phillips
2011-09-29 22:06                                                       ` [PATCH] " Junio C Hamano
2011-09-29 22:11                                                         ` [PATCH v3] " Julian Phillips
2011-09-29 23:48                                                           ` Junio C Hamano
2011-09-30 15:30                                                             ` Michael Haggerty
2011-09-30 16:38                                                               ` Junio C Hamano
2011-09-30 17:56                                                                 ` [PATCH] refs: Remove duplicates after sorting with qsort Julian Phillips
2011-10-02  5:15                                                                 ` [PATCH v3] refs: Use binary search to lookup refs faster Michael Haggerty
2011-10-02  5:45                                                                   ` Junio C Hamano
2011-10-04 20:58                                                                     ` Junio C Hamano
2011-09-30  1:13                                                           ` Martin Fick
2011-09-30  3:44                                                             ` Junio C Hamano
2011-09-30  8:04                                                               ` Julian Phillips
2011-09-30 15:45                                                               ` Martin Fick
2011-09-29 20:44                                                     ` Git is not scalable with too many refs/* Martin Fick
2011-09-29 19:10                                                   ` Julian Phillips
2011-09-29 20:11                                                   ` Martin Fick
2011-09-30  9:12                                                     ` René Scharfe
2011-09-30 16:09                                                       ` Martin Fick
2011-09-30 16:52                                                       ` Junio C Hamano
2011-09-30 18:17                                                         ` René Scharfe
2011-10-01 15:28                                                           ` René Scharfe
2011-10-01 15:38                                                             ` [PATCH 1/8] checkout: check for "Previous HEAD" notice in t2020 René Scharfe
2011-10-01 19:02                                                               ` Sverre Rabbelier
2011-10-01 15:43                                                             ` [PATCH 2/8] revision: factor out add_pending_sha1 René Scharfe
2011-10-01 15:51                                                             ` [PATCH 3/8] checkout: use add_pending_{object,sha1} in orphan check René Scharfe
2011-10-01 15:56                                                             ` [PATCH 4/8] revision: add leak_pending flag René Scharfe
2011-10-01 16:01                                                             ` [PATCH 5/8] bisect: use " René Scharfe
2011-10-01 16:02                                                             ` [PATCH 6/8] bundle: " René Scharfe
2011-10-01 16:09                                                             ` [PATCH 7/8] checkout: " René Scharfe
2011-10-01 16:16                                                             ` [PATCH 8/8] commit: factor out clear_commit_marks_for_object_array René Scharfe
2011-09-26 15:15                                     ` Git is not scalable with too many refs/* Martin Fick
2011-09-26 15:21                                       ` Sverre Rabbelier
2011-09-26 15:48                                         ` Martin Fick
2011-09-26 15:56                                           ` Sverre Rabbelier
2011-09-26 16:38                                             ` Martin Fick
2011-09-26 16:49                                               ` Julian Phillips
2011-09-26 18:07                                       ` Martin Fick
2011-09-26 18:37                                         ` Julian Phillips
2011-09-26 20:01                                           ` Martin Fick [this message]
2011-09-26 20:07                                             ` Junio C Hamano
2011-09-26 20:28                                             ` Julian Phillips
2011-09-26 21:39                                               ` Martin Fick
2011-09-26 21:52                                                 ` Martin Fick
2011-09-26 23:26                                                   ` Julian Phillips
2011-09-26 23:37                                                     ` David Michael Barr
2011-09-27  1:01                                                       ` [PATCH] refs.c: Fix slowness with numerous loose refs David Barr
2011-09-27  2:04                                                         ` David Michael Barr
2011-09-26 23:38                                                     ` Git is not scalable with too many refs/* Junio C Hamano
2011-09-27  0:00                                                       ` [PATCH] Don't sort ref_list too early Julian Phillips
2011-10-02  4:58                                                         ` Michael Haggerty
2011-09-27  0:12                                                     ` Git is not scalable with too many refs/* Martin Fick
2011-09-27  0:22                                                       ` Julian Phillips
2011-09-27  2:34                                                         ` Martin Fick
2011-09-27  7:59                                                           ` Julian Phillips
2011-09-27  8:20                                                     ` Sverre Rabbelier
2011-09-27  9:01                                                       ` Julian Phillips
2011-09-27 10:01                                                         ` Sverre Rabbelier
2011-09-27 10:25                                                           ` Nguyen Thai Ngoc Duy
2011-09-27 11:07                                                         ` Michael Haggerty
2011-09-27 12:10                                                           ` Julian Phillips
2011-09-26 22:30                                                 ` Julian Phillips
2011-09-26 15:32                                     ` Michael Haggerty
2011-09-26 15:42                                       ` Martin Fick
2011-09-26 16:25                                         ` Thomas Rast
2011-09-09 13:50                                 ` Michael Haggerty
2011-09-09 15:51                                   ` Michael Haggerty
2011-09-09 16:03                                   ` Jens Lehmann
2011-06-10  7:41         ` Andreas Ericsson
2011-06-10 19:41           ` Shawn Pearce
2011-06-10 20:12             ` Jakub Narebski
2011-06-10 20:35             ` Jeff King
2011-06-13  7:08             ` Andreas Ericsson
2011-06-09 11:18 ` Jakub Narebski
2011-06-09 15:42   ` Stephen Bash

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201109261401.38624.mfick@codeaurora.org \
    --to=mfick@codeaurora.org \
    --cc=git@vger.kernel.org \
    --cc=julian@quantumfyre.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.