* "git gc" doesn't seem to remove loose objects any more
@ 2008-12-15 12:52 Bruce Stephens
  2008-12-15 13:38 ` Mikael Magnusson
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Stephens @ 2008-12-15 12:52 UTC (permalink / raw)
  To: git

I couldn't see a test for this, but perhaps I'm just missing it?

    brs% git count-objects
    161 objects, 1552 kilobytes
    brs% git gc
    Counting objects: 80621, done.
    Compressing objects: 100% (22372/22372), done.
    Writing objects: 100% (80621/80621), done.
    Total 80621 (delta 57160), reused 80305 (delta 56884)
    brs% git count-objects
    207 objects, 2048 kilobytes


And I see lots of directories under .git/objects which confirms
things.

I don't think I've changed any relevant configuration.

This is with 8befc50c49e8a271fd3cd7fb34258fe88d1dfcad (also whatever
version I used before, erm, probably
de0db422782ddaf7754ac5b03fdc6dc5de1a9ae4), and possibly earlier
versions---I've just started noticing now that the number of loose
objects has started causing git gui to complain.

(Hmm, I note that git gui reports a larger number of loose objects
than git count-objects.  Ah, OK, it really is just an approximation,
so no surprise.)
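
An exact count and a sampled estimate can disagree by a wide margin.
The sketch below is only an illustration, not the code of git gui or
git count-objects: the function names, the sampled directory "17",
the scale factor 256 and the fake file names are all assumptions.

```python
import tempfile
from pathlib import Path

HEX_DIGITS = set("0123456789abcdef")


def is_fanout_dir(path):
    """True for two-hex-digit directories such as objects/17."""
    return path.is_dir() and len(path.name) == 2 and set(path.name) <= HEX_DIGITS


def count_loose(objects_dir):
    """Exact count: every file in every fan-out directory."""
    return sum(
        1
        for fanout in objects_dir.iterdir()
        if is_fanout_dir(fanout)
        for entry in fanout.iterdir()
        if entry.is_file()
    )


def estimate_loose(objects_dir, sample="17", scale=256):
    """Estimate: count one fan-out directory and scale up (assumed scheme)."""
    sample_dir = objects_dir / sample
    if not sample_dir.is_dir():
        return 0
    return scale * sum(1 for entry in sample_dir.iterdir() if entry.is_file())


def make_fake_objects(root, layout):
    """Create dummy files; layout maps directory name -> number of files."""
    for dirname, count in layout.items():
        directory = root / dirname
        directory.mkdir(parents=True)
        for i in range(count):
            (directory / f"object{i:04d}").write_bytes(b"x")


with tempfile.TemporaryDirectory() as tmp:
    objects = Path(tmp) / "objects"
    # Three files land in the sampled directory, one elsewhere;
    # "pack" is not a fan-out directory and is ignored by both counts.
    make_fake_objects(objects, {"17": 3, "a0": 1, "pack": 2})
    exact = count_loose(objects)
    estimate = estimate_loose(objects)

print("exact:", exact, "estimate:", estimate)
```

With an uneven distribution like this one, the estimate lands far
above the exact count, which is the kind of gap described above.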


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 12:52 "git gc" doesn't seem to remove loose objects any more Bruce Stephens
@ 2008-12-15 13:38 ` Mikael Magnusson
  2008-12-15 14:08   ` Bruce Stephens
  2008-12-15 14:08   ` Björn Steinbrink
  0 siblings, 2 replies; 12+ messages in thread
From: Mikael Magnusson @ 2008-12-15 13:38 UTC (permalink / raw)
  To: Bruce Stephens; +Cc: git

2008/12/15 Bruce Stephens <bruce.stephens@isode.com>:
> I couldn't see a test for this, but perhaps I'm just missing it?
>
>    brs% git count-objects
>    161 objects, 1552 kilobytes
>    brs% git gc
>    Counting objects: 80621, done.
>    Compressing objects: 100% (22372/22372), done.
>    Writing objects: 100% (80621/80621), done.
>    Total 80621 (delta 57160), reused 80305 (delta 56884)
>    brs% git count-objects
>    207 objects, 2048 kilobytes
>
>
> And I see lots of directories under .git/objects which confirms
> things.
>
> I don't think I've changed any relevant configuration.
>
> This is with 8befc50c49e8a271fd3cd7fb34258fe88d1dfcad (also whatever
> version I used before, erm, probably
> de0db422782ddaf7754ac5b03fdc6dc5de1a9ae4), and possibly earlier
> versions---I've just started noticing now that the number of loose
> objects has started causing git gui to complain.
>
> (Hmm, I note that git gui reports a larger number of loose objects
> than git count-objects.  Ah, OK, it really is just an approximation,
> so no surprise.)

IIRC git gc only removes loose objects older than two weeks. If you
really want to remove them now, run git prune. But make sure no other
git process is active when you run it, or it could step on something
that process is still using.
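
The age cutoff can be sketched as below. This is a simplified model
under stated assumptions, not git's implementation: it is fed only
objects already known to be unreachable, the names and timestamps are
made up, and how it treats an object exactly at the cutoff is an
arbitrary choice of the sketch.

```python
from datetime import datetime, timedelta

GRACE = timedelta(weeks=2)  # the "two weeks" mentioned above


def split_by_age(mtimes, now, grace=GRACE):
    """Split unreachable loose objects into (kept, dropped) by mtime.

    mtimes maps a hypothetical object name to its modification time.
    """
    cutoff = now - grace
    kept = sorted(name for name, mtime in mtimes.items() if mtime >= cutoff)
    dropped = sorted(name for name, mtime in mtimes.items() if mtime < cutoff)
    return kept, dropped


now = datetime(2008, 12, 15, 13, 38)
unreachable = {
    "aa11": datetime(2008, 12, 14, 9, 0),  # one day old: still in grace
    "bb22": datetime(2008, 11, 20, 9, 0),  # 25 days old: past the cutoff
}
kept, dropped = split_by_age(unreachable, now)
print("kept:", kept)
print("dropped:", dropped)
```

Passing grace=timedelta(0) models pruning with no grace period at
all: every unreachable object older than "now" is dropped.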

-- 
Mikael Magnusson


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 13:38 ` Mikael Magnusson
@ 2008-12-15 14:08   ` Bruce Stephens
  2008-12-15 14:08   ` Björn Steinbrink
  1 sibling, 0 replies; 12+ messages in thread
From: Bruce Stephens @ 2008-12-15 14:08 UTC (permalink / raw)
  To: git

"Mikael Magnusson" <mikachu@gmail.com> writes:

[...]

> IIRC git gc only removes loose objects older than two weeks, if you
> really want to remove them now, run git prune. But make sure no other
> git process can be active when you run it, or it could possibly step
> on something.

OK, that makes sense.  Obviously I misunderstood this.  That doesn't
explain why the number of objects might increase after "git gc", but
perhaps that's for a good reason too.

Surely "git gui"'s warning is unhelpful, then: it warns I have more
than 2000 loose objects (in another checkout), offers to compress my
database, and I end up with an unchanged repository (which it still
complains about)?  Is this warning just redundant now that we've got
"git gc --auto"?


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 13:38 ` Mikael Magnusson
  2008-12-15 14:08   ` Bruce Stephens
@ 2008-12-15 14:08   ` Björn Steinbrink
  2008-12-15 15:56     ` Theodore Tso
  1 sibling, 1 reply; 12+ messages in thread
From: Björn Steinbrink @ 2008-12-15 14:08 UTC (permalink / raw)
  To: Mikael Magnusson; +Cc: Bruce Stephens, git

On 2008.12.15 14:38:56 +0100, Mikael Magnusson wrote:
> 2008/12/15 Bruce Stephens <bruce.stephens@isode.com>:
> > I couldn't see a test for this, but perhaps I'm just missing it?
> >
> >    brs% git count-objects
> >    161 objects, 1552 kilobytes
> >    brs% git gc
> >    Counting objects: 80621, done.
> >    Compressing objects: 100% (22372/22372), done.
> >    Writing objects: 100% (80621/80621), done.
> >    Total 80621 (delta 57160), reused 80305 (delta 56884)
> >    brs% git count-objects
> >    207 objects, 2048 kilobytes
> >
> >
> > And I see lots of directories under .git/objects which confirms
> > things.
> >
> > I don't think I've changed any relevant configuration.
> >
> > This is with 8befc50c49e8a271fd3cd7fb34258fe88d1dfcad (also whatever
> > version I used before, erm, probably
> > de0db422782ddaf7754ac5b03fdc6dc5de1a9ae4), and possibly earlier
> > versions---I've just started noticing now that the number of loose
> > objects has started causing git gui to complain.
> >
> > (Hmm, I note that git gui reports a larger number of loose objects
> > than git count-objects.  Ah, OK, it really is just an approximation,
> > so no surprise.)
> 
> IIRC git gc only removes loose objects older than two weeks, if you
> really want to remove them now, run git prune. But make sure no other
> git process can be active when you run it, or it could possibly step
> on something.

To clarify that a bit more: git gc keeps unreachable objects unpacked,
so that git prune can drop them. And git gc invokes git prune so that
only unreachable objects older than 2 weeks are dropped.

Björn


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 14:08   ` Björn Steinbrink
@ 2008-12-15 15:56     ` Theodore Tso
  2008-12-15 16:12       ` Mark Brown
                         ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Theodore Tso @ 2008-12-15 15:56 UTC (permalink / raw)
  To: Björn Steinbrink; +Cc: Mikael Magnusson, Bruce Stephens, git

On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:
> To clarify that a bit more: git gc keeps unreachable objects unpacked,
> so that git prune can drop them. And git gc invokes git prune so that
> only unreachable objects older than 2 weeks are dropped.

To be even more explicit, "git gc" will **unpack** objects that have
become unreachable and are currently in packs.  As a result, the
amount of disk space used by a git repository can actually go **up**
dramatically after a "git gc" operation.  Someone who is running close
to full on their filesystem, deletes a number of branches from a
tracking repository, and then does a "git gc" may get a very
unpleasant surprise.

A really good repository which shows this is linux-next, since it is
constantly getting rewound, and old branches are preserved via a tag
such as next-20081204.  If you update your local copy of the
linux-next repository every day, you will accumulate a large number of
these old branch tags.  If you then delete a whole series of them, and
run git-gc, the operation will take quite a while, and the number of
blocks and inodes used will grow significantly.  They will disappear
after a "git prune", but when I do this housekeeping operation, I've
often wished for a --yes-I-know-what-I-am-doing-and-it's-unsafe-but-
just-drop-the-unreachable-objects-cause-this-is-just-a-tracking-repository
option to "git gc".
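
The growth in blocks and inodes can be made concrete with some rough
arithmetic. The sketch below is an illustration under simplifying
assumptions: the 4096-byte block size and the object sizes are made
up, and it ignores delta compression, the pack index and the fan-out
directories themselves.

```python
import math


def blocks_for(size_bytes, block_size=4096):
    """Filesystem blocks for one file, counting at least one per file."""
    return max(1, math.ceil(size_bytes / block_size))


def loose_cost(sizes, block_size=4096):
    """(inodes, blocks) when every object is stored as its own file."""
    return len(sizes), sum(blocks_for(size, block_size) for size in sizes)


def packed_cost(sizes, block_size=4096):
    """(inodes, blocks) when all objects share a single pack file."""
    return 1, blocks_for(sum(sizes), block_size)


# 1000 hypothetical unreachable objects of 300 bytes each.
sizes = [300] * 1000
print("loose :", loose_cost(sizes))
print("packed:", packed_cost(sizes))
```

Small objects are the worst case here: each one takes a whole block
and an inode once it is loose, however few bytes it holds.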

						- Ted


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 15:56     ` Theodore Tso
@ 2008-12-15 16:12       ` Mark Brown
  2008-12-15 16:59         ` Johan Herland
  2008-12-15 16:59         ` Mikael Magnusson
  2008-12-15 17:07       ` Jakub Narebski
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 12+ messages in thread
From: Mark Brown @ 2008-12-15 16:12 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Björn Steinbrink, Mikael Magnusson, Bruce Stephens, git

On Mon, Dec 15, 2008 at 10:56:10AM -0500, Theodore Tso wrote:
> On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:

> > To clarify that a bit more: git gc keeps unreachable objects unpacked,
> > so that git prune can drop them. And git gc invokes git prune so that
> > only unreachable objects older than 2 weeks are dropped.

> To be even more explicit, "git gc" will **unpack** objects that have
> become unreachable and were currently in packs.  As a result, the
> amount of disk space used by a git repository can actually go **up**
> dramatically after a "git gc" operation, which could be surprising for
> someone who is running close to full on their filesystem, deletes a
> number of branches from a tracking repository, and then does a "git
> gc" may get a very unpleasant surprise.

It can also cause things like the "please repack" warning in git gui to
go off.  This is especially unhelpful since they tend to tell you to go
and do a gc to resolve the problem.


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 16:12       ` Mark Brown
@ 2008-12-15 16:59         ` Johan Herland
  2008-12-15 16:59         ` Mikael Magnusson
  1 sibling, 0 replies; 12+ messages in thread
From: Johan Herland @ 2008-12-15 16:59 UTC (permalink / raw)
  To: git
  Cc: Mark Brown, Theodore Tso, Björn Steinbrink, Mikael Magnusson,
	Bruce Stephens

On Monday 15 December 2008, Mark Brown wrote:
> On Mon, Dec 15, 2008 at 10:56:10AM -0500, Theodore Tso wrote:
> > On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:
> > > To clarify that a bit more: git gc keeps unreachable objects
> > > unpacked, so that git prune can drop them. And git gc invokes git
> > > prune so that only unreachable objects older than 2 weeks are
> > > dropped.
> >
> > To be even more explicit, "git gc" will **unpack** objects that
> > have become unreachable and were currently in packs.  As a result,
> > the amount of disk space used by a git repository can actually go
> > **up** dramatically after a "git gc" operation, which could be
> > surprising for someone who is running close to full on their
> > filesystem, deletes a number of branches from a tracking
> > repository, and then does a "git gc" may get a very unpleasant
> > surprise.
>
> It can also cause things like the "please repack" warning in git gui
> to go off.  This is especially unhelpful since they tend to tell you
> to go and do a gc to resolve the problem.

Instead of exploding all unreachable objects into loose objects, does it 
make sense to repack them into a separate pack? AFAICS, that would 
solve both the disk usage problem and the git-gui-"please repack" 
problem. Also, it might make git-prune's job much easier, since 
unreachable objects are now located in a single pack only?


Have fun!

...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 16:12       ` Mark Brown
  2008-12-15 16:59         ` Johan Herland
@ 2008-12-15 16:59         ` Mikael Magnusson
  1 sibling, 0 replies; 12+ messages in thread
From: Mikael Magnusson @ 2008-12-15 16:59 UTC (permalink / raw)
  To: Mark Brown; +Cc: Theodore Tso, Björn Steinbrink, Bruce Stephens, git

2008/12/15 Mark Brown <broonie@sirena.org.uk>:
> On Mon, Dec 15, 2008 at 10:56:10AM -0500, Theodore Tso wrote:
>> On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:
>
>> > To clarify that a bit more: git gc keeps unreachable objects unpacked,
>> > so that git prune can drop them. And git gc invokes git prune so that
>> > only unreachable objects older than 2 weeks are dropped.
>
>> To be even more explicit, "git gc" will **unpack** objects that have
>> become unreachable and were currently in packs.  As a result, the
>> amount of disk space used by a git repository can actually go **up**
>> dramatically after a "git gc" operation, which could be surprising for
>> someone who is running close to full on their filesystem, deletes a
>> number of branches from a tracking repository, and then does a "git
>> gc" may get a very unpleasant surprise.
>
> It can also cause things like the "please repack" warning in git gui to
> go off.  This is especially unhelpful since they tend to tell you to go
> and do a gc to resolve the problem.

A thought that occurs to me is to add some sort of flag to git count-objects
that prints the number of objects older than some interval in a separate field.
That way git gui would give fewer (maybe no) false alarms.

-- 
Mikael Magnusson


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 15:56     ` Theodore Tso
  2008-12-15 16:12       ` Mark Brown
@ 2008-12-15 17:07       ` Jakub Narebski
  2008-12-15 19:38         ` Theodore Tso
  2008-12-15 17:11       ` Brandon Casey
  2008-12-15 17:38       ` [PATCH] objects to be pruned immediately don't have to be loosened Nicolas Pitre
  3 siblings, 1 reply; 12+ messages in thread
From: Jakub Narebski @ 2008-12-15 17:07 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Björn Steinbrink, Mikael Magnusson, Bruce Stephens, git

Theodore Tso <tytso@mit.edu> writes:
> On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:

> > To clarify that a bit more: git gc keeps unreachable objects unpacked,
> > so that git prune can drop them. And git gc invokes git prune so that
> > only unreachable objects older than 2 weeks are dropped.
> 
> To be even more explicit, "git gc" will **unpack** objects that have
> become unreachable and were currently in packs.  As a result, the
> amount of disk space used by a git repository can actually go **up**
> dramatically after a "git gc" operation, which could be surprising for
> someone who is running close to full on their filesystem, deletes a
> number of branches from a tracking repository, and then does a "git
> gc" may get a very unpleasant surprise.
> 
> A really good repository which shows this is linux-next, since it is
> constantly getting rewound, and old branches are reserved via a tag
> such as next-20081204.  If you update the your local copy of the
> linux-next repository every day, you will accumulate a large number of
> these old branch tags.  If you then delete a whole series of them, and
> run git-gc, the operation will take quite a while, and the number of
> blocks and inodes used will grow significantly.  They will disappear
> after a "git prune", but when I do this housekeeping operation, I've
> often wished for a --yes-I-know-what-I-am-doing-and-it's-unsafe-but-
> just-drop-the-unreachable-objects-cause-this-is-just-a-tracking-repository
> option to "git gc".

There was an idea to have "git gc --prune" run "git prune"
unconditionally, i.e. without grace period for dangling loose objects.

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 15:56     ` Theodore Tso
  2008-12-15 16:12       ` Mark Brown
  2008-12-15 17:07       ` Jakub Narebski
@ 2008-12-15 17:11       ` Brandon Casey
  2008-12-15 17:38       ` [PATCH] objects to be pruned immediately don't have to be loosened Nicolas Pitre
  3 siblings, 0 replies; 12+ messages in thread
From: Brandon Casey @ 2008-12-15 17:11 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Björn Steinbrink, Mikael Magnusson, Bruce Stephens, git

Theodore Tso wrote:
> I've
> often wished for a --yes-I-know-what-I-am-doing-and-it's-unsafe-but-
> just-drop-the-unreachable-objects-cause-this-is-just-a-tracking-repository
> option to "git gc".

repack -a -d -l

Notice the lowercase 'a'.

git-gc calls repack with uppercase 'A', which is what causes the unreachable
objects to be unpacked. Lowercase 'a' is for people who know what they are
doing and want git to just drop unreachable objects.
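
The difference between the two options can be modelled on sets of
object names. This is a toy model of the behaviour described above,
assuming -d is given so the old packs are removed; it is not how
repack is implemented, and all object names are made up.

```python
def repack(packed, loose, reachable, loosen_unreachable):
    """Return (new_pack, new_loose) after a modelled 'repack -d'.

    loosen_unreachable=True models -A, False models -a.
    """
    everything = packed | loose
    # Reachable objects end up in the single new pack either way.
    new_pack = everything & reachable
    if loosen_unreachable:
        # -A: unreachable objects from the old packs are written out loose.
        new_loose = everything - reachable
    else:
        # -a: unreachable packed objects disappear with the old packs;
        # unreachable objects that were already loose are left alone.
        new_loose = loose - reachable
    return new_pack, new_loose


packed = {"c1", "c2", "old1", "old2"}
loose = {"new1", "junk"}
reachable = {"c1", "c2", "new1"}

print("-A:", repack(packed, loose, reachable, loosen_unreachable=True))
print("-a:", repack(packed, loose, reachable, loosen_unreachable=False))
```

In the -a case "old1" and "old2" are simply gone, with no grace
period, which is why it is the option for people who know what they
are doing.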

-brandon


* [PATCH] objects to be pruned immediately don't have to be loosened
  2008-12-15 15:56     ` Theodore Tso
                         ` (2 preceding siblings ...)
  2008-12-15 17:11       ` Brandon Casey
@ 2008-12-15 17:38       ` Nicolas Pitre
  3 siblings, 0 replies; 12+ messages in thread
From: Nicolas Pitre @ 2008-12-15 17:38 UTC (permalink / raw)
  To: Theodore Tso; +Cc: Björn Steinbrink, Mikael Magnusson, Bruce Stephens, git



When there is no grace period before pruning unreferenced objects, it is 
pointless to push those objects in their loose form just to delete them 
right away.

Also be more explicit about the possibility of using "now" in the 
gc.pruneexpire config variable (needed for the above behavior to 
happen).

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

On Mon, 15 Dec 2008, Theodore Tso wrote:

> On Mon, Dec 15, 2008 at 03:08:34PM +0100, Björn Steinbrink wrote:
> > To clarify that a bit more: git gc keeps unreachable objects unpacked,
> > so that git prune can drop them. And git gc invokes git prune so that
> > only unreachable objects older than 2 weeks are dropped.
> 
> To be even more explicit, "git gc" will **unpack** objects that have
> become unreachable and were currently in packs.  As a result, the
> amount of disk space used by a git repository can actually go **up**
> dramatically after a "git gc" operation, which could be surprising for
> someone who is running close to full on their filesystem, deletes a
> number of branches from a tracking repository, and then does a "git
> gc" may get a very unpleasant surprise.
> 
> A really good repository which shows this is linux-next, since it is
> constantly getting rewound, and old branches are reserved via a tag
> such as next-20081204.  If you update the your local copy of the
> linux-next repository every day, you will accumulate a large number of
> these old branch tags.  If you then delete a whole series of them, and
> run git-gc, the operation will take quite a while, and the number of
> blocks and inodes used will grow significantly.  They will disappear
> after a "git prune", but when I do this housekeeping operation, I've
> often wished for a --yes-I-know-what-I-am-doing-and-it's-unsafe-but-
> just-drop-the-unreachable-objects-cause-this-is-just-a-tracking-repository
> option to "git gc".

What about this?

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 21ea165..ca45e71 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -702,7 +702,9 @@ gc.packrefs::
 
 gc.pruneexpire::
 	When 'git-gc' is run, it will call 'prune --expire 2.weeks.ago'.
-	Override the grace period with this config variable.
+	Override the grace period with this config variable.  The value
+	"now" may be used to disable this  grace period and always prune
+	unreachable objects immediately.
 
 gc.reflogexpire::
 	'git-reflog expire' removes reflog entries older than
diff --git a/builtin-gc.c b/builtin-gc.c
index 781df60..f8eae4a 100644
--- a/builtin-gc.c
+++ b/builtin-gc.c
@@ -188,7 +188,9 @@ static int need_to_gc(void)
 	 * there is no need.
 	 */
 	if (too_many_packs())
-		append_option(argv_repack, "-A", MAX_ADD);
+		append_option(argv_repack,
+			      !strcmp(prune_expire, "now") ? "-a" : "-A",
+			      MAX_ADD);
 	else if (!too_many_loose_objects())
 		return 0;
 
@@ -243,7 +245,9 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 			"run \"git gc\" manually. See "
 			"\"git help gc\" for more information.\n");
 	} else
-		append_option(argv_repack, "-A", MAX_ADD);
+		append_option(argv_repack,
+			      !strcmp(prune_expire, "now") ? "-a" : "-A",
+			      MAX_ADD);
 
 	if (pack_refs && run_command_v_opt(argv_pack_refs, RUN_GIT_CMD))
 		return error(FAILED_RUN, argv_pack_refs[0]);
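
The option choice made in both hunks can be restated outside C as a
small sketch. The function names are hypothetical, and the base
arguments "repack -d -l" are an assumption taken from the repack
command quoted elsewhere in this thread, not read from builtin-gc.c.

```python
DEFAULT_PRUNE_EXPIRE = "2.weeks.ago"  # default named in the config.txt hunk


def repack_all_option(prune_expire):
    """Mirror of: !strcmp(prune_expire, "now") ? "-a" : "-A"."""
    return "-a" if prune_expire == "now" else "-A"


def repack_argv(prune_expire=DEFAULT_PRUNE_EXPIRE):
    """Build the modelled repack argument list (base arguments assumed)."""
    return ["repack", "-d", "-l", repack_all_option(prune_expire)]


print(repack_argv())
print(repack_argv("now"))
```

Like strcmp, the comparison is an exact, case-sensitive match, so
only the literal value "now" switches the option to -a.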


* Re: "git gc" doesn't seem to remove loose objects any more
  2008-12-15 17:07       ` Jakub Narebski
@ 2008-12-15 19:38         ` Theodore Tso
  0 siblings, 0 replies; 12+ messages in thread
From: Theodore Tso @ 2008-12-15 19:38 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Björn Steinbrink, Mikael Magnusson, Bruce Stephens, git

On Mon, Dec 15, 2008 at 09:07:39AM -0800, Jakub Narebski wrote:
> 
> There was an idea to have "git gc --prune" run "git prune"
> unconditionally, i.e. without grace period for dangling loose objects.
> 

That doesn't help that much, since (temporarily) you still need all of
the disk space for the exploded, unpacked objects.  As Brandon Casey
pointed out, the key is "git repack -a -d -l" vs "git repack -A -d
-l".  If there is going to be a git-gc option, it would need to change
the options sent to git-repack.  Or, I suppose the answer is to tell
people who run into this problem to run git-repack by hand.
The question is how common is the use case of needing to gc a
repository like linux-next, I suppose.

							- Ted

