* git object-count differs between clones
       [not found] <57434188.709288.1454428054374.JavaMail.zimbra@xes-inc.com>
@ 2016-02-02 15:52 ` Andrew Martin
  2016-02-02 16:09   ` Matthieu Moy
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Martin @ 2016-02-02 15:52 UTC
  To: git

Hello,

I am using git 2.7.0 on Ubuntu 14.04. I recently tried pushing a large (90,000+
commits) repository to a gogs server (https://gogs.io/) and then cloning the
repository back to my machine. After running "git count-objects -v", I see a
discrepancy in the number of objects:

Original Repository:
count: 0
size: 0
in-pack: 1258300
packs: 1
size-pack: 593889
prune-packable: 0
garbage: 0
size-garbage: 0


Clone from gogs:
count: 0
size: 0
in-pack: 1258270
packs: 1
size-pack: 593884
prune-packable: 0
garbage: 0
size-garbage: 0


I ran "git fsck" on both, which reported no problems. I also ran "git gc"
and made sure there were no objects pending garbage collection, but I still
cannot account for this difference. Can someone explain why these numbers
differ, and whether this is a problem?
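
For anyone reproducing the comparison, here is a minimal sketch. It assumes a
local stand-in for the server (a file:// clone rather than gogs); the helper
name total_objects and the repository paths are made up for illustration. It
sums the loose ("count") and packed ("in-pack") figures so that each
repository is summarized by one number.

```shell
# Sum the loose and packed object counts reported by "git count-objects -v".
# An object that exists both loose and packed is counted twice here.
total_objects() {
    git -C "$1" count-objects -v |
        awk -F': ' '$1 == "count" || $1 == "in-pack" { n += $2 } END { print n + 0 }'
}

tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

# Hypothetical "original" repository with a single commit.
git init -q "$tmp/original"
echo hello > "$tmp/original/file"
git -C "$tmp/original" add file
git -C "$tmp/original" commit -q -m "initial"

# A file:// URL forces the regular transport, so only reachable objects
# are transferred (a plain local path would copy the object directory).
git clone -q "file://$tmp/original" "$tmp/clone"

original_total=$(total_objects "$tmp/original")
clone_total=$(total_objects "$tmp/clone")
echo "original: $original_total, clone: $clone_total"
```

In this toy case the two totals match because every object is reachable from
a ref that gets cloned.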

Thanks,

Andrew


* Re: git object-count differs between clones
  2016-02-02 15:52 ` git object-count differs between clones Andrew Martin
@ 2016-02-02 16:09   ` Matthieu Moy
  2016-02-02 16:21     ` Andrew Martin
  0 siblings, 1 reply; 7+ messages in thread
From: Matthieu Moy @ 2016-02-02 16:09 UTC
  To: Andrew Martin; +Cc: git

Andrew Martin <amartin@xes-inc.com> writes:

> I ran "git fsck" on both, which reported no problems. Moreover, I ran "git gc"
> and made sure there were no objects pending garbage collection, 

That is not sufficient: you may have objects that are reachable from your
reflog and are therefore not candidates for garbage collection. Since the
reflog is not propagated, pushing and cloning will not transfer these
objects if the reflog is the only way to reach them.

You may try expiring your reflog and "git gc" again.
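
A small sketch of that situation, assuming a throwaway repository (the paths,
file names and commit messages are invented for the example). It drops a
commit with "git reset", then shows that the commit survives "git gc" and
that "git fsck --unreachable" only reports it once reflogs are ignored with
--no-reflogs.

```shell
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/repo"
cd "$tmp/repo" || exit 1

echo one > file && git add file && git commit -q -m "first"
echo two > file && git commit -q -am "second"
dropped=$(git rev-parse HEAD)

# After the reset, "second" is reachable only through the reflog.
git reset -q --hard HEAD~1
git gc -q

# The commit is still in the object database...
if git cat-file -e "$dropped"; then kept_by_reflog=yes; else kept_by_reflog=no; fi

# ...and fsck treats it as reachable unless reflogs are ignored.
with_reflogs=$(git fsck --unreachable 2>/dev/null | grep -c "$dropped")
without_reflogs=$(git fsck --unreachable --no-reflogs 2>/dev/null | grep -c "$dropped")

echo "kept: $kept_by_reflog, reported with reflogs: $with_reflogs, without: $without_reflogs"
```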

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: git object-count differs between clones
  2016-02-02 16:09   ` Matthieu Moy
@ 2016-02-02 16:21     ` Andrew Martin
  2016-02-02 16:52       ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Martin @ 2016-02-02 16:21 UTC
  To: Matthieu Moy; +Cc: git

----- Original Message -----
> From: "Matthieu Moy" <Matthieu.Moy@grenoble-inp.fr>
> To: "Andrew Martin" <amartin@xes-inc.com>
> Cc: git@vger.kernel.org
> Sent: Tuesday, February 2, 2016 10:09:31 AM
> Subject: Re: git object-count differs between clones
> 
> Andrew Martin <amartin@xes-inc.com> writes:
> 
> > I ran "git fsck" on both, which reported no problems. Moreover, I ran "git
> > gc"
> > and made sure there were no objects pending garbage collection,
> 
> That is not sufficient: you may have objects that are reachable from your
> reflog and are therefore not candidates for garbage collection. Since the
> reflog is not propagated, pushing and cloning will not transfer these
> objects if the reflog is the only way to reach them.
> 
> You may try expiring your reflog and "git gc" again.
> 
Matthieu,

Thanks, I found some commits that are not referenced in any branch. How can I
remove these from the reflog? I tried running
"git reflog expire --expire=now --expire-unreachable=now --all" followed by
"git gc" but still the same number of objects remain.

Thanks,

Andrew


* Re: git object-count differs between clones
  2016-02-02 16:21     ` Andrew Martin
@ 2016-02-02 16:52       ` Jeff King
  2016-02-02 17:22         ` Andrew Martin
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2016-02-02 16:52 UTC
  To: Andrew Martin; +Cc: Matthieu Moy, git

On Tue, Feb 02, 2016 at 10:21:17AM -0600, Andrew Martin wrote:

> > You may try expiring your reflog and "git gc" again.
> 
> Thanks, I found some commits that are not referenced in any branch. How can I
> remove these from the reflog? I tried running
> "git reflog expire --expire=now --expire-unreachable=now --all" followed by
> "git gc" but still the same number of objects remain.

Are the objects now loose, or still in packs? Git has a grace period for
pruning objects, so that we do not delete objects for an in-progress
operation. The life cycle of an unreferenced object should be something
like:

  - reachable by reflogs, which are pruned after 30 days (or
    gc.reflogExpireUnreachable config). Objects will be repacked as
    normal during this time. Override with "reflog expire" as you did
    above.

  - after the reflog expires, the objects are now unreachable. During
    the next repack, they'll be ejected from the pack into loose
    objects, and their mtimes set to match the pack they came from
    (which is probably quite recent if you just repacked!).

  - after 2 weeks (or gc.pruneExpire), unreachable loose objects are
    dropped by "git prune", which is called as part of "git gc". This is
    based on the object mtime.

    You can accelerate this with "git gc --prune=now" (or
    "--prune=5.minutes.ago").

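The three stages above can be walked through in a throwaway repository. This
is only a sketch: the paths, file names and the helper "exists" are invented
for the example, and where the object sits during the grace period (loose, or
in a cruft pack on newer Git versions) is not checked, only whether it still
exists.

```shell
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/repo"
cd "$tmp/repo" || exit 1

# Report whether an object is still in the object database.
exists() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo gone; fi
}

echo one > file && git add file && git commit -q -m "first"
echo two > file && git commit -q -am "second"
dropped=$(git rev-parse HEAD)
git reset -q --hard HEAD~1

# Stage 1: the reflog still references the dropped commit.
git gc -q
stage1=$(exists "$dropped")

# Stage 2: reflog expired, but the default prune grace period applies.
git reflog expire --expire=now --expire-unreachable=now --all
git gc -q
stage2=$(exists "$dropped")

# Stage 3: skip the grace period.
git gc -q --prune=now
stage3=$(exists "$dropped")

echo "after gc: $stage1, after reflog expire + gc: $stage2, after --prune=now: $stage3"
```
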
-Peff


* Re: git object-count differs between clones
  2016-02-02 16:52       ` Jeff King
@ 2016-02-02 17:22         ` Andrew Martin
  2016-02-03  4:34           ` Jeff King
  0 siblings, 1 reply; 7+ messages in thread
From: Andrew Martin @ 2016-02-02 17:22 UTC
  To: Jeff King; +Cc: Matthieu Moy, git

----- Original Message -----
> From: "Jeff King" <peff@peff.net>
> To: "Andrew Martin" <amartin@xes-inc.com>
> Cc: "Matthieu Moy" <Matthieu.Moy@grenoble-inp.fr>, git@vger.kernel.org
> Sent: Tuesday, February 2, 2016 10:52:46 AM
> Subject: Re: git object-count differs between clones
> 
> On Tue, Feb 02, 2016 at 10:21:17AM -0600, Andrew Martin wrote:
> 
> > > You may try expiring your reflog and "git gc" again.
> > 
> > Thanks, I found some commits that are not referenced in any branch. How can
> > I
> > remove these from the reflog? I tried running
> > "git reflog expire --expire=now --expire-unreachable=now --all" followed by
> > "git gc" but still the same number of objects remain.
> 
> Are the objects now loose, or still in packs? Git has a grace period for
> pruning objects, so that we do not delete objects for an in-progress
> operation. The life cycle of an unreferenced object should be something
> like:
> 
>   - reachable by reflogs, which are pruned after 30 days (or
>     gc.reflogExpireUnreachable config). Objects will be repacked as
>     normal during this time. Override with "reflog expire" as you did
>     above.
> 
>   - after the reflog expires, the objects are now unreachable. During
>     the next repack, they'll be ejected from the pack into loose
>     objects, and their mtimes set to match the pack they came from
>     (which is probably quite recent if you just repacked!).
> 
>   - after 2 weeks (or gc.pruneExpire), unreachable loose objects are
>     dropped by "git prune", which is called as part of "git gc". This is
>     based on the object mtime.
> 
>     You can accelerate this with "git gc --prune=now" (or
>     "--prune=5.minutes.ago").
> 
> -Peff

Jeff,

Thanks for the clarification. I have now run "git repack -A" followed by
"git gc --prune=now", but I am still seeing the same number of objects. What
else can I try to mark these as unreachable and garbage collect them?

Thanks,

Andrew


* Re: git object-count differs between clones
  2016-02-02 17:22         ` Andrew Martin
@ 2016-02-03  4:34           ` Jeff King
  2016-02-03 15:21             ` Andrew Martin
  0 siblings, 1 reply; 7+ messages in thread
From: Jeff King @ 2016-02-03  4:34 UTC
  To: Andrew Martin; +Cc: Matthieu Moy, git

On Tue, Feb 02, 2016 at 11:22:08AM -0600, Andrew Martin wrote:

> Thanks for the clarification. I have now run "git repack -A" followed by
> "git gc --prune=now", but I am still seeing the same number of objects. What
> else can I try to mark these as unreachable and garbage collect them?

That should clear out any unreachable objects. Are we sure that the
objects in question are, in fact, unreachable?

Try:

  git rev-list --objects --all --reflog | wc -l

which should give a count of reachable objects. I'd expect that to line
up with what "git count-objects -v" reports after having run your gc
above.
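
As a sketch of that comparison (the throwaway repository, its paths and the
variable names are assumptions made for the example):

```shell
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/repo"
cd "$tmp/repo" || exit 1
echo one > file && git add file && git commit -q -m "first"
echo two > file && git commit -q -am "second"

git gc -q --prune=now

# Objects reachable from any ref or reflog entry, one per line.
reachable=$(( $(git rev-list --objects --all --reflog | wc -l) ))

# Objects actually stored: loose ("count") plus packed ("in-pack").
stored=$(git count-objects -v |
    awk -F': ' '$1 == "count" || $1 == "in-pack" { n += $2 } END { print n + 0 }')

echo "reachable: $reachable, stored: $stored"
```

If "stored" is larger than "reachable", something other than refs and
reflogs (or a pending grace period) is keeping objects around.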

In your original email, the discrepancy was between your "original"
repository and the one that had round-tripped to a clone. Is it possible
there are refs in the original that did not get pushed? Try comparing
"git for-each-ref" in each repository.

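One way to do that comparison, sketched with two throwaway repositories (the
helper ref_names, the paths and the ref refs/remotes/origin/brancha are
chosen for illustration; the file:// clone stands in for the round trip
through a server):

```shell
# Print the sorted ref names of the repository at path $1.
ref_names() {
    git -C "$1" for-each-ref --format='%(refname)' | LC_ALL=C sort
}

tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/original"
echo hello > "$tmp/original/file"
git -C "$tmp/original" add file
git -C "$tmp/original" commit -q -m "initial"

# A leftover remote-tracking ref that exists only in the original.
git -C "$tmp/original" update-ref refs/remotes/origin/brancha HEAD

git clone -q "file://$tmp/original" "$tmp/clone"

ref_names "$tmp/original" > "$tmp/original.refs"
ref_names "$tmp/clone" > "$tmp/clone.refs"

# Lines present only in the first file: refs the clone did not receive.
only_in_original=$(LC_ALL=C comm -23 "$tmp/original.refs" "$tmp/clone.refs")
echo "$only_in_original"
```
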
We also consider objects in the index to be reachable for packing. Could
your original perhaps have some uncommitted objects mentioned in the
index?
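
A sketch of the index case, again in a throwaway repository (paths, file
names and the helper "exists" are invented for the example). A blob that is
staged but never committed survives "git gc --prune=now" until it is removed
from the index.

```shell
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/repo"
cd "$tmp/repo" || exit 1

# Report whether an object is still in the object database.
exists() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo gone; fi
}

echo one > file && git add file && git commit -q -m "first"

# Stage a new file without committing it.
echo draft > staged.txt
git add staged.txt
blob=$(git rev-parse :staged.txt)

git gc -q --prune=now
while_staged=$(exists "$blob")

# Drop the index entry; the blob is now unreferenced.
git rm -q --cached staged.txt
git gc -q --prune=now
after_unstage=$(exists "$blob")

echo "while staged: $while_staged, after unstaging: $after_unstage"
```

"git rev-list --objects --all --reflog --indexed-objects" would include such
blobs in the reachable count.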

-Peff


* Re: git object-count differs between clones
  2016-02-03  4:34           ` Jeff King
@ 2016-02-03 15:21             ` Andrew Martin
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Martin @ 2016-02-03 15:21 UTC
  To: Jeff King; +Cc: Matthieu Moy, git

----- Original Message -----
> From: "Jeff King" <peff@peff.net>
> To: "Andrew Martin" <amartin@xes-inc.com>
> Cc: "Matthieu Moy" <Matthieu.Moy@grenoble-inp.fr>, git@vger.kernel.org
> Sent: Tuesday, February 2, 2016 10:34:12 PM
> Subject: Re: git object-count differs between clones
> 
> On Tue, Feb 02, 2016 at 11:22:08AM -0600, Andrew Martin wrote:
> 
> > Thanks for the clarification. I have now run "git repack -A" followed by
> > "git gc --prune=now", but I am still seeing the same number of objects.
> > What else can I try to mark these as unreachable and garbage collect
> > them?
> 
> That should clear out any unreachable objects. Are we sure that the
> objects in question are, in fact, unreachable?
> 
> Try:
> 
>   git rev-list --objects --all --reflog | wc -l
> 
> which should give a count of reachable objects. I'd expect that to line
> up with what "git count-objects -v" reports after having run your gc
> above.
> 
> In your original email, the discrepancy was between your "original"
> repository and the one that had round-tripped to a clone. Is it possible
> there are refs in the original that did not get pushed? Try comparing
> "git for-each-ref" in each repository.
> 
> We also consider objects in the index to be reachable for packing. Could
> your original perhaps have some uncommitted objects mentioned in the
> index?
> 
> -Peff
> 

Jeff,

This did it - those commits were still referenced by some remote-tracking refs:
$ git for-each-ref
945c3a60dfb4d9ab774708d19f7aa74dd545db90 commit refs/heads/master
945c3a60dfb4d9ab774708d19f7aa74dd545db90 commit refs/remotes/origin/brancha
8b331e4bb42f6291c33eb0847c4481407e3d753c commit refs/remotes/origin/branchb

I removed them:
$ git update-ref -d refs/remotes/origin/brancha
$ git update-ref -d refs/remotes/origin/branchb

And then I was able to garbage collect and get the expected object count:
$ git reflog expire --expire=now --expire-unreachable=now --all
$ git gc --prune=now
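
For the archives, the same sequence can be reproduced in a throwaway
repository. This is a sketch under assumptions: the paths, file names and
the helper "exists" are invented, and the ref name refs/remotes/origin/branchb
simply mirrors the listing above.

```shell
tmp=$(mktemp -d)
export GIT_AUTHOR_NAME=Example GIT_AUTHOR_EMAIL=example@example.com
export GIT_COMMITTER_NAME=Example GIT_COMMITTER_EMAIL=example@example.com

git init -q "$tmp/repo"
cd "$tmp/repo" || exit 1

# Report whether an object is still in the object database.
exists() {
    if git cat-file -e "$1" 2>/dev/null; then echo present; else echo gone; fi
}

echo one > file && git add file && git commit -q -m "first"
echo two > file && git commit -q -am "second"
extra=$(git rev-parse HEAD)

# Leave "second" referenced only by a remote-tracking ref.
git update-ref refs/remotes/origin/branchb "$extra"
git reset -q --hard HEAD~1

# Expiring reflogs and pruning is not enough while the ref exists.
git reflog expire --expire=now --expire-unreachable=now --all
git gc -q --prune=now
before=$(exists "$extra")

# Delete the ref, then expire and prune again.
git update-ref -d refs/remotes/origin/branchb
git reflog expire --expire=now --expire-unreachable=now --all
git gc -q --prune=now
after=$(exists "$extra")

echo "with ref: $before, after deleting ref: $after"
```

"git branch -r -d origin/branchb" should work in place of "git update-ref -d"
for remote-tracking branches.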

Thanks for the help!

Andrew
