From: Martin Scherer <m.scherer@fu-berlin.de>
To: git@vger.kernel.org
Subject: Blobs not referenced by file (anymore) are not removed by GC
Date: Mon, 08 Dec 2014 17:22:23 +0100
Message-ID: <5485D03F.3060008@fu-berlin.de>
Hi,
after using BFG on a repo with certain directory globs, all of those
file names are gone from history, but the underlying blobs cannot be
collected by garbage collection. So the blobs are not deleted; only the
file names are no longer associated with them. I wonder if I have
discovered a bug (at least in bfg). I would expect git to discover that
these blobs are not used in any way (so they must still be associated
with something, right?)
# invoke bfg --delete-folders something multiple times with different
# patterns.
# try to cleanup
git gc --aggressive --prune=now # big blobs still in history
git fsck # no results
git fsck --full --unreachable --dangling # no results
To verify that the blobs are still there, see the output of
git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
head bigobjects.txt
# outputs: 9451427d7335395779b91864418630d2f0af780a blob 7895212 1869047 7657491
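The egrep/sort pipeline above can also be sketched with awk, which avoids relying on `\w` support in egrep. This is only a minimal sketch: the function name `largest_blobs` is hypothetical, and it assumes the `git verify-pack -v` layout of `<sha> <type> <size> <size-in-pack> <offset>` for non-deltified objects. Like the original regex, it skips deltified entries, which carry two extra columns.

```shell
# Hypothetical helper (not part of git or bfg): read `git verify-pack -v`
# output on stdin and print "<size> <sha>" for the N largest blobs,
# biggest first. N defaults to 10.
# Only 5-column lines are considered, i.e. non-deltified objects,
# mirroring the egrep pattern in the original pipeline.
largest_blobs() {
    awk 'NF == 5 && $2 == "blob" { print $3, $1 }' \
        | sort -k 1 -n -r \
        | head -n "${1:-10}"
}

# Usage inside a repository (assumes packs exist, e.g. after `git gc`):
#   git verify-pack -v .git/objects/pack/pack-*.idx | largest_blobs 5
```

Printing the size first keeps the sort key in column 1, so the result does not depend on how many spaces separate the columns.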
Also, if bfg is told to remove the biggest blob (bfg -B 1) with
--no-blob-protection, it fails to remove it.
--- output of bfg -B 1
Found 1 blob ids for large blobs - biggest=7895212 smallest=7895212
....
BFG aborting: No refs to update - no dirty commits found??
---
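Since fsck reports nothing unreachable, one plausible check is whether the blob is still reachable from some ref, possibly one outside refs/heads and refs/tags. A minimal sketch, assuming the `git rev-list --objects` output format of one `<sha>` or `<sha> <path>` per line; the function name `paths_for_object` is hypothetical.

```shell
# Hypothetical helper (not part of git or bfg): read
# `git rev-list --objects --all` output on stdin and print every path
# under which the given object id is listed. Objects listed without a
# path (commits, root trees) are reported as "(no path)".
paths_for_object() {
    awk -v id="$1" '
        $1 == id {
            sub(/^[^ ]+ ?/, "")              # drop the sha and one separator
            print ($0 == "" ? "(no path)" : $0)
        }'
}

# Usage inside a repository:
#   git rev-list --objects --all \
#       | paths_for_object 9451427d7335395779b91864418630d2f0af780a
#   git for-each-ref    # lists every ref, not only branches and tags
```

If the helper prints a path, some ref still reaches a commit containing the blob, which would explain why gc keeps it. Empty output only means that no ref under refs/ reaches it; reflogs are not covered by `--all`.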
The repo can be found here:
https://github.com/marscher/stallone_stale_objects
I will start over to clean up the history, but I guess this might
be interesting for git developers.
Best,
Martin