From: Konstantin Ryabitsev <konstantin@linuxfoundation.org>
To: git@vger.kernel.org
Subject: Repository corruption if objects pushed in the middle of repack
Date: Mon, 13 Jun 2022 16:31:45 -0400
Message-ID: <20220613203145.wbpi2m3ys3hchw6c@meerkat.local>
Hi, all:
I'm trying to figure out the cause of repository corruption in a very specific
case. Here's the setup:
1. the repository is several GB in size, full of automatically generated pushes
(https://git.yoctoproject.org/poky-buildhistory/)
2. this repository has no alternates or other clever things -- just your old
boring repository
3. the builders check out this repository with --depth 1 during the build
stage, then add new logs to the repository, commit and push
Admittedly, this is a bad use of git, but let's leave that outside the scope.
Every weekend we run a set of maintenance tasks and if we find that there are
lots of new loose objects (which there usually are), we fire off a routine repack:
1. first, repack runs with the following flags (-f if deemed necessary):

    git repack -n --window-memory=1g -a -b --unpack-unreachable=yesterday -f --pack-kept-objects -d

   Since the repository is large, this usually takes a long time (3+ hours)

2. next, we generate a fresh commit-graph:

    git commit-graph write

3. next, we run pack-refs:

    git pack-refs --all

4. after that, we run prune:

    git prune --expire=yesterday
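
As a rough sketch only (the actual maintenance tooling is not shown in this message), the four steps above could be chained so that a failing step stops the run. The names `GIT` and `run_maintenance` are hypothetical and introduced here for illustration; the git subcommands and flags are the ones listed in the steps.

```shell
#!/bin/sh
# Sketch of the weekly maintenance sequence described above.
# GIT and run_maintenance are hypothetical names for illustration;
# the git subcommands and flags are copied from the listed steps.
GIT=${GIT:-git}

run_maintenance() {
    repo=$1
    # 1. Full repack (reported to take 3+ hours on this repository).
    "$GIT" -C "$repo" repack -n --window-memory=1g -a -b \
        --unpack-unreachable=yesterday -f --pack-kept-objects -d || return 1
    # 2. Fresh commit-graph.
    "$GIT" -C "$repo" commit-graph write || return 1
    # 3. Pack all refs.
    "$GIT" -C "$repo" pack-refs --all || return 1
    # 4. Prune unreachable loose objects older than a day.
    "$GIT" -C "$repo" prune --expire=yesterday || return 1
}

# Usage: run_maintenance /path/to/repo.git
```

Setting `GIT=echo` before calling the function prints the commands instead of running them, which is a cheap way to review the sequence on a production host.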
In the case of this particular repository, we regularly run into repository
corruption, reported during the prune stage:
fsck[10362] 2022-05-09 01:00:06,378 - INFO - /var/lib/gitolite3/repositories/poky-buildhistory.git:
fsck[10362] 2022-05-09 01:00:06,700 - INFO - repack: performing a full repack for optimal deltas
fsck[10362] 2022-05-09 01:00:06,701 - INFO - repack: repacking with "-n --window-memory=1g -a -b --unpack-unreachable=yesterday -f --pack-kept-objects -d"
fsck[10362] 2022-05-09 03:19:15,825 - INFO - graph: generating commit-graph
fsck[10362] 2022-05-09 03:19:20,830 - INFO - packrefs: repacking all refs
fsck[10362] 2022-05-09 03:19:20,842 - INFO - prune: pruning
fsck[10362] 2022-05-09 03:19:21,622 - CRITICAL - /var/lib/gitolite3/repositories/poky-buildhistory.git reports errors:
fsck[10362] 2022-05-09 03:19:21,625 - CRITICAL - fatal: bad tree object ace77888c63e5c4e545f1bd7a3ee5934e35f56e9
fsck[10362] 2022-05-09 03:19:21,626 - WARNING - Repacking /var/lib/gitolite3/repositories/poky-buildhistory.git was unsuccessful
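
One way to confirm whether the object named in the error is really absent from the object store is `git cat-file -e`, which exits with zero status only if the object exists and is valid. The sketch below wraps that check; `object_present` and the overridable `GIT` variable are hypothetical helpers, not part of the tooling that produced the log above.

```shell
#!/bin/sh
# object_present is a hypothetical helper; GIT can be overridden for dry runs.
GIT=${GIT:-git}

object_present() {
    repo=$1
    oid=$2
    # cat-file -e exits 0 only if the object exists and is valid.
    if "$GIT" -C "$repo" cat-file -e "$oid" 2>/dev/null; then
        echo "present: $oid"
    else
        echo "missing: $oid"
        return 1
    fi
}

# Usage, with the path and object ID from the log above:
# object_present /var/lib/gitolite3/repositories/poky-buildhistory.git \
#     ace77888c63e5c4e545f1bd7a3ee5934e35f56e9
```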
The tree object in question came in during the repack stage:
2022-05-09.02:36:33 11129 update poky-buildhistory buildhistory W refs/heads/poky/master/qemuppc 5aad6c8130370bf22f5639162bbbfeaefd0fcd5e ea4e65d72a6161fece5734f7b111e31af77c7578 refs/.*
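
The overlap can be checked mechanically by comparing the push timestamp against the repack window from the fsck log. This is a minimal sketch under two assumptions: that the first whitespace-separated field of the gitolite log line is the timestamp in `YYYY-MM-DD.HH:MM:SS` form, and that both logs use the same clock and time zone. `push_time` and `in_window` are hypothetical helper names.

```shell
#!/bin/sh
# push_time and in_window are hypothetical helpers for illustration.
# Assumption: field 1 of the gitolite log line is YYYY-MM-DD.HH:MM:SS.

# Print the push timestamp of a gitolite log line as "YYYY-MM-DD HH:MM:SS".
push_time() {
    printf '%s\n' "$1" | awk '{ sub(/\./, " ", $1); print $1 }'
}

# Succeed if $1 lies between $2 and $3 (inclusive). Timestamps in this
# fixed-width format sort chronologically when compared as plain strings.
in_window() {
    awk -v t="$1" -v s="$2" -v e="$3" 'BEGIN { exit !(t >= s && t <= e) }'
}

line='2022-05-09.02:36:33 11129 update poky-buildhistory buildhistory W refs/heads/poky/master/qemuppc 5aad6c8130370bf22f5639162bbbfeaefd0fcd5e ea4e65d72a6161fece5734f7b111e31af77c7578 refs/.*'

# Window: repack start to commit-graph start, from the fsck log
# (milliseconds dropped).
if in_window "$(push_time "$line")" '2022-05-09 01:00:06' '2022-05-09 03:19:15'; then
    echo "push arrived during repack"
fi
```

With the timestamps above, the push at 02:36:33 falls inside the 01:00:06 to 03:19:15 window, so the script prints the message.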
As far as I know, the maintenance steps we are running shouldn't result in any
missing objects, so I'm curious if it's something we're doing wrong (using
unsafe flags) or if git isn't properly accounting for some objects that come
in during the repack stage. We're seeing this happen fairly routinely, so it's
not just a random fluke.
Git version 2.36.1 (and earlier).
Thanks,
Konstantin