* Git gc removes all packs
From: Dmitry Neverov @ 2015-02-05 15:13 UTC
To: git

Hi,

I'm experiencing a strange behavior of automatic git gc which corrupts a
local repository. Git version 2.2.2 on Mac OS X 10.10.1.

I'm using git p4 for synchronization with perforce. Sometimes after 'git
p4 rebase' git starts a garbage collection. When gc finishes a local
repository contains no pack files only loose objects, so I have to
re-import repository from perforce. It also doesn't contain a temporary
pack git gc was creating.

Command line history looks like this:

> git p4 rebase
Performing incremental import into refs/remotes/p4/master git branch
Depot paths: //XXX/YYY/
Import destination: refs/remotes/p4/master
Importing revision 352157 (100%)
Rebasing the current branch onto remotes/p4/master
First, rewinding head to replay your work on top of it...
Fast-forwarded master to remotes/p4/master.
Auto packing the repository in background for optimum performance.
See "git help gc" for manual housekeeping.

> ps aux | grep git
nd 14335 95.0 1.4 4643292 114788 ?? R 8:52PM 0:05.79 git pack-objects --keep-true-parents --honor-pack-keep --non-empty --all --reflog --indexed-objects --unpack-unreachable=2.weeks.ago --local --delta-base-offset /path/to/repo/.git/objects/pack/.tmp-14333-pack
nd 14333 0.0 0.0 2452420 920 ?? S 8:52PM 0:00.00 git repack -d -l -A --unpack-unreachable=2.weeks.ago
nd 14331 0.0 0.0 2436036 744 ?? Ss 8:52PM 0:00.00 git gc --auto

After the 14331 process termination all packs are gone.

One more thing about my setup: since git p4 promotes a use of a linear
history I use a separate repository for another branch in perforce. In
order to be able to cherry-pick between repositories I added this
another repo objects dir as an alternate and also added a ref which is a
symbolic link to a branch in another repo (so I don't have to do any
fetches).

How do I troubleshoot the problem? Is there any way to enable some kind
of logging for automatic git gc? Can use of alternates or symbolic links
in refs cause such a behavior?

--
Dmitry
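One possible way to get more visibility into an automatic gc run is to run it in the foreground with tracing enabled. The following is only a sketch: `trace_auto_gc` is a made-up helper name, it assumes a Git version that supports the `gc.autoDetach` setting and the `-C` option, and the trace shows which sub-commands gc runs rather than why individual packs are removed.

```shell
#!/bin/sh
# Hypothetical helper: run "git gc --auto" in the foreground and append
# Git's trace messages to a log file.
#   $1  path to the repository
#   $2  ABSOLUTE path of the log file (GIT_TRACE treats an absolute path
#       as a file to append to)
trace_auto_gc() {
    repo=$1
    log_file=$2
    # gc.autoDetach=false keeps the auto gc from going into the
    # background, so its output and exit status stay visible.
    GIT_TRACE="$log_file" git -C "$repo" -c gc.autoDetach=false gc --auto
}
```

For example, `trace_auto_gc /path/to/repo /tmp/gc-trace.log` followed by reading the log. Note that `gc --auto` only repacks when its thresholds are exceeded, so the log may contain little more than the gc invocation itself.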
* Re: Git gc removes all packs
From: Jeff King @ 2015-02-05 20:03 UTC
To: Dmitry Neverov; +Cc: git

On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:

> I'm using git p4 for synchronization with perforce. Sometimes after 'git
> p4 rebase' git starts a garbage collection. When gc finishes a local
> repository contains no pack files only loose objects, so I have to
> re-import repository from perforce. It also doesn't contain a temporary
> pack git gc was creating.

It sounds like git didn't find any refs; it will pack only objects which
are reachable. Unreachable objects are either:

  1. Exploded into loose objects if the mtime on the pack they contain
     is less than 2 weeks old (and will eventually expire when they
     become 2 weeks old).

  2. Dropped completely if older than 2 weeks.

> One more thing about my setup: since git p4 promotes a use of a linear
> history I use a separate repository for another branch in perforce. In
> order to be able to cherry-pick between repositories I added this
> another repo objects dir as an alternate and also added a ref which is a
> symbolic link to a branch in another repo (so I don't have to do any
> fetches).

You can't symlink refs like this. The loose refs in the filesystem may
be migrated into the "packed-refs" file, at which point your symlink
will be broken. That is a likely reason why git would not find any refs.

So your setup will not ever work reliably. But IMHO, it is a bug that
git does not notice the broken symlink and abort an operation which is
computing reachability in order to drop objects. As you noticed, it
means a misconfiguration or filesystem error results in data loss.

-Peff
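To check whether a repository contains symlinked refs of the kind described above, and whether any of them already dangle, something like the following could be used. It is a minimal sketch that only inspects the filesystem; `audit_symlink_refs` is a made-up name, and it assumes `readlink` is available and that ref paths contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: list every symlink below <git-dir>/refs and mark
# the ones whose target no longer exists.
#   $1  path to the .git directory (not the worktree)
audit_symlink_refs() {
    git_dir=$1
    find "$git_dir/refs" -type l | while IFS= read -r link; do
        # "[ -e ]" follows the link, so it is false for a dangling one.
        if [ -e "$link" ]; then
            printf 'symlink  %s -> %s\n' "$link" "$(readlink "$link")"
        else
            printf 'DANGLING %s -> %s\n' "$link" "$(readlink "$link")"
        fi
    done
}
```

Running `audit_symlink_refs /path/to/repo/.git` prints one line per symlink. Any `DANGLING` entry is a ref that can no longer be read through the link.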
* Re: Git gc removes all packs
From: Michael Haggerty @ 2015-02-17 16:39 UTC
To: Jeff King, Dmitry Neverov; +Cc: git

On 02/05/2015 09:03 PM, Jeff King wrote:
> On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:
>> [...]
>> One more thing about my setup: since git p4 promotes a use of a linear
>> history I use a separate repository for another branch in perforce. In
>> order to be able to cherry-pick between repositories I added this
>> another repo objects dir as an alternate and also added a ref which is a
>> symbolic link to a branch in another repo (so I don't have to do any
>> fetches).
>
> You can't symlink refs like this. The loose refs in the filesystem may
> be migrated into the "packed-refs" file, at which point your symlink
> will be broken. That is a likely reason why git would not find any refs.
>
> So your setup will not ever work reliably. But IMHO, it is a bug that
> git does not notice the broken symlink and abort an operation which is
> computing reachability in order to drop objects. As you noticed, it
> means a misconfiguration or filesystem error results in data loss.

There's a bunch of code in refs.c that is there explicitly for reading
loose references that are symlinks. If the link contents literally start
with "refs/", then they are read and treated as a symbolic ref.
Otherwise, the symlink is just followed.

It is still possible to write symbolic refs that are represented as
symlinks (see core.preferSymlinkRefs), but that backwards-compatibility
code was added in 2006(!) Maybe it's time to deprecate it. And maybe we
should start working towards a future where any symlinks under "refs"
cause git to complain.

Michael

--
Michael Haggerty
mhagger@alum.mit.edu
* Re: Git gc removes all packs
From: Jeff King @ 2015-02-17 16:55 UTC
To: Michael Haggerty; +Cc: Dmitry Neverov, git

On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:

> > You can't symlink refs like this. The loose refs in the filesystem may
> > be migrated into the "packed-refs" file, at which point your symlink
> > will be broken. That is a likely reason why git would not find any refs.
> >
> > So your setup will not ever work reliably. But IMHO, it is a bug that
> > git does not notice the broken symlink and abort an operation which is
> > computing reachability in order to drop objects. As you noticed, it
> > means a misconfiguration or filesystem error results in data loss.
>
> There's a bunch of code in refs.c that is there explicitly for reading
> loose references that are symlinks. If the link contents literally start
> with "refs/", then they are read and treated as a symbolic ref.
> Otherwise, the symlink is just followed.

Right, but we should be able to notice that:

  1. We found a symlink.

  2. We couldn't read it its ref value (because it's a broken link).

I think we _do_ notice that at the lowest level, and set REF_ISBROKEN.
But the problem is that the reachability code in prune and in
pack-objects (triggered by "repack -ad") uses for_each_ref, and not
for_each_rawref. So they ignore "broken" refs rather than complaining,
even though failing to read a ref may mean we could drop objects which
were only mentioned by that ref.

> It is still possible to write symbolic refs that are represented as
> symlinks (see core.preferSymlinkRefs), but that backwards-compatibility
> code was added in 2006(!) Maybe it's time to deprecate it. And maybe we
> should start working towards a future where any symlinks under "refs"
> cause git to complain.

I wouldn't mind seeing all of the symlink code go away, but I think it
is orthogonal to the problem I mentioned.

-Peff
* Re: Git gc removes all packs
From: Michael Haggerty @ 2015-02-17 20:37 UTC
To: Jeff King; +Cc: Dmitry Neverov, git

On 02/17/2015 05:55 PM, Jeff King wrote:
> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>
>>> You can't symlink refs like this. The loose refs in the filesystem may
>>> be migrated into the "packed-refs" file, at which point your symlink
>>> will be broken. That is a likely reason why git would not find any refs.
>>>
>>> So your setup will not ever work reliably. But IMHO, it is a bug that
>>> git does not notice the broken symlink and abort an operation which is
>>> computing reachability in order to drop objects. As you noticed, it
>>> means a misconfiguration or filesystem error results in data loss.
>>
>> There's a bunch of code in refs.c that is there explicitly for reading
>> loose references that are symlinks. If the link contents literally start
>> with "refs/", then they are read and treated as a symbolic ref.
>> Otherwise, the symlink is just followed.
>
> Right, but we should be able to notice that:
>
>   1. We found a symlink.
>
>   2. We couldn't read it its ref value (because it's a broken link).
>
> I think we _do_ notice that at the lowest level, and set REF_ISBROKEN.
> But the problem is that the reachability code in prune and in
> pack-objects (triggered by "repack -ad") uses for_each_ref, and not
> for_each_rawref. So they ignore "broken" refs rather than complaining,
> even though failing to read a ref may mean we could drop objects which
> were only mentioned by that ref.

Yes, this makes sense too. But my point was that sticking symlinks to
random files in your refs hierarchy is pretty questionable even *before*
the symlink gets broken. If we would warn the user as soon as we saw
such a thing, then the user's problem would never have advanced as far
as it did. Do you think that emitting warnings on *intact* symlinks is
too draconian?

> [...]

Michael

--
Michael Haggerty
mhagger@alum.mit.edu
* Re: Git gc removes all packs
From: Junio C Hamano @ 2015-02-17 21:57 UTC
To: Michael Haggerty; +Cc: Jeff King, Dmitry Neverov, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 02/17/2015 05:55 PM, Jeff King wrote:
>> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>>
>>> There's a bunch of code in refs.c that is there explicitly for reading
>>> loose references that are symlinks. If the link contents literally start
>>> with "refs/", then they are read and treated as a symbolic ref.
>>> Otherwise, the symlink is just followed.
>> ...
> Yes, this makes sense too. But my point was that sticking symlinks to
> random files in your refs hierarchy is pretty questionable even *before*
> the symlink gets broken. If we would warn the user as soon as we saw
> such a thing, then the user's problem would never have advanced as far
> as it did. Do you think that emitting warnings on *intact* symlinks is
> too draconian?

Do you mean that we would end up reading refs/heads/hold if the user
did this:

    git rev-parse --verify HEAD -- >precious
    ln -s ../../../precious .git/refs/heads/hold

because that symbolic link does not begin with "refs/", and is an
accident waiting to happen so we should forbid it in the longer
term and warning when we see it would be the first step?
* Re: Git gc removes all packs
From: Michael Haggerty @ 2015-02-17 22:19 UTC
To: Junio C Hamano; +Cc: Jeff King, Dmitry Neverov, git

On 02/17/2015 10:57 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> On 02/17/2015 05:55 PM, Jeff King wrote:
>>> On Tue, Feb 17, 2015 at 05:39:27PM +0100, Michael Haggerty wrote:
>>>
>>>> There's a bunch of code in refs.c that is there explicitly for reading
>>>> loose references that are symlinks. If the link contents literally start
>>>> with "refs/", then they are read and treated as a symbolic ref.
>>>> Otherwise, the symlink is just followed.
>>> ...
>> Yes, this makes sense too. But my point was that sticking symlinks to
>> random files in your refs hierarchy is pretty questionable even *before*
>> the symlink gets broken. If we would warn the user as soon as we saw
>> such a thing, then the user's problem would never have advanced as far
>> as it did. Do you think that emitting warnings on *intact* symlinks is
>> too draconian?
>
> Do you mean that we would end up reading refs/heads/hold if the user
> did this:
>
>     git rev-parse --verify HEAD -- >precious
>     ln -s ../../../precious .git/refs/heads/hold
>
> because that symbolic link does not begin with "refs/",

Correct, you can do exactly that. The "hold" reference is resolvable and
listable using "for-each-ref". But if I try to update it, the contents
of the "precious" file are overwritten. On the other hand, if I run
"pack-refs", then the current value of the "hold" reference is moved to
"packed-refs" and the symlink is removed. This behavior is not sane.

> and is an
> accident waiting to happen so we should forbid it in the longer
> term and warning when we see it would be the first step?

Yes, I am proposing that approach, though if somebody can suggest a use
case I'm willing to be convinced otherwise. The only thing I can imagine
symlinks being useful for might be to temporarily create a fake repo,
run one or two specific known-safe commands, then delete the repo again.

Michael

--
Michael Haggerty
mhagger@alum.mit.edu
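The "hold"/"precious" scenario discussed in this message can be set up in a throwaway repository to observe the behavior first-hand. The following is only a sketch: `make_symlink_ref_demo` and the demo commit identity are made up, it assumes a Git that stores loose refs as files under `.git/refs`, and it omits the trailing `--` from the quoted `rev-parse` command. What updating the ref or running pack-refs then does to the symlink is left to observe, since it may differ between Git versions.

```shell
#!/bin/sh
# Hypothetical helper: create a scratch repository containing one commit
# and a loose ref "refs/heads/hold" that is a symlink to a file named
# "precious" in the worktree.
#   $1  directory to create the scratch repository in
make_symlink_ref_demo() {
    demo_dir=$1
    git init -q "$demo_dir" &&
    git -C "$demo_dir" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "initial" &&
    (
        cd "$demo_dir" &&
        # Store the current commit id in a plain file ...
        git rev-parse --verify HEAD >precious &&
        # ... and make the loose ref a symlink to that file. The link
        # target is relative to .git/refs/heads, hence three "..".
        ln -s ../../../precious .git/refs/heads/hold
    )
}
```

After `make_symlink_ref_demo /tmp/symlink-demo`, commands such as `git for-each-ref` or `git pack-refs --all` can be run inside the scratch directory, checking afterwards whether `.git/refs/heads/hold` still exists and what `precious` contains. Only do this in a repository that can be deleted.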
* Re: Git gc removes all packs
From: Junio C Hamano @ 2015-02-18 7:13 UTC
To: Michael Haggerty; +Cc: Jeff King, Dmitry Neverov, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> On 02/17/2015 10:57 PM, Junio C Hamano wrote:
> ...
>> Do you mean that we would end up reading refs/heads/hold if the user
>> did this:
>>
>>     git rev-parse --verify HEAD -- >precious
>>     ln -s ../../../precious .git/refs/heads/hold
>>
>> because that symbolic link does not begin with "refs/",
>
> Correct, you can do exactly that. The "hold" reference is resolvable and
> listable using "for-each-ref". But if I try to update it, the contents
> of the "precious" file are overwritten. On the other hand, if I run
> "pack-refs", then the current value of the "hold" reference is moved to
> "packed-refs" and the symlink is removed. This behavior is not sane.
>
>> and is an
>> accident waiting to happen so we should forbid it in the longer
>> term and warning when we see it would be the first step?
>
> Yes, I am proposing that approach, though if somebody can suggest a use
> case I'm willing to be convinced otherwise.

Thanks. I agree the proposed tightening is probably harmless, but I
too would want to see if somebody comes up with a valid use case. I
do not think of anything offhand.
* Re: Git gc removes all packs
From: Dmitry Neverov @ 2015-02-27 10:16 UTC
To: Jeff King; +Cc: git

I followed your advice and removed a symlink ref from my repository.
But didn't help.. automatic GC has just removed all packs again. May
alternates cause such a behavior? Are any ways to make gc log
somewhere why it removes packs?

On Thu, Feb 5, 2015 at 9:03 PM, Jeff King <peff@peff.net> wrote:
> On Thu, Feb 05, 2015 at 04:13:03PM +0100, Dmitry Neverov wrote:
>
>> I'm using git p4 for synchronization with perforce. Sometimes after 'git
>> p4 rebase' git starts a garbage collection. When gc finishes a local
>> repository contains no pack files only loose objects, so I have to
>> re-import repository from perforce. It also doesn't contain a temporary
>> pack git gc was creating.
>
> It sounds like git didn't find any refs; it will pack only objects which
> are reachable. Unreachable objects are either:
>
>   1. Exploded into loose objects if the mtime on the pack they contain
>      is less than 2 weeks old (and will eventually expire when they
>      become 2 weeks old).
>
>   2. Dropped completely if older than 2 weeks.
>
>> One more thing about my setup: since git p4 promotes a use of a linear
>> history I use a separate repository for another branch in perforce. In
>> order to be able to cherry-pick between repositories I added this
>> another repo objects dir as an alternate and also added a ref which is a
>> symbolic link to a branch in another repo (so I don't have to do any
>> fetches).
>
> You can't symlink refs like this. The loose refs in the filesystem may
> be migrated into the "packed-refs" file, at which point your symlink
> will be broken. That is a likely reason why git would not find any refs.
>
> So your setup will not ever work reliably. But IMHO, it is a bug that
> git does not notice the broken symlink and abort an operation which is
> computing reachability in order to drop objects. As you noticed, it
> means a misconfiguration or filesystem error results in data loss.
>
> -Peff
* Re: Git gc removes all packs
From: Jeff King @ 2015-02-27 13:14 UTC
To: Dmitry Neverov; +Cc: git

On Fri, Feb 27, 2015 at 11:16:09AM +0100, Dmitry Neverov wrote:

> I followed your advice and removed a symlink ref from my repository.
> But didn't help.. automatic GC has just removed all packs again. May
> alternates cause such a behavior? Are any ways to make gc log
> somewhere why it removes packs?

If you have two repositories, A and B, and A points to B via alternates,
then you cannot safely run "git gc" in B unless it knows about all of
the refs in A.

As we discussed before, symlinking the refs is not enough, because those
symlinks get stale. But nor is removing the symlinks and just not
knowing about the refs. :)

The only safe thing to do is to fetch all of the refs from A into B just
before running the gc (and consequently, you probably want to disable
gc.auto in B).

-Peff
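The advice in this message (fetch all of A's refs into B right before gc, with automatic gc disabled in B) could be scripted roughly as follows. This is a sketch under assumptions, not a recommended procedure from the thread: `gc_with_borrower_refs` and the `refs/borrowed/a/` namespace are made up, and both arguments are assumed to be absolute paths because `git -C` changes directory before the fetch source is resolved.

```shell
#!/bin/sh
# Hypothetical helper: make B aware of every ref in A, then gc B.
#   $1  absolute path to B (the repository whose objects are shared)
#   $2  absolute path to A (the repository that lists B as an alternate)
gc_with_borrower_refs() {
    repo_b=$1
    repo_a=$2

    # Disable automatic gc in B so that it only runs from this script,
    # right after the refs have been refreshed.
    git -C "$repo_b" config gc.auto 0 &&
    # Copy all of A's refs into a private namespace in B. The leading
    # "+" allows non-fast-forward updates; --prune removes copies of
    # refs that A has since deleted.
    git -C "$repo_b" fetch --quiet --prune "$repo_a" \
        '+refs/*:refs/borrowed/a/*' &&
    # With A's refs visible, objects they reach count as reachable in B.
    git -C "$repo_b" gc --quiet
}
```

A call would look like `gc_with_borrower_refs /path/to/B /path/to/A`. Using a separate namespace keeps A's branches from colliding with B's own; with several borrowing repositories, each would need its own namespace and its own fetch before the gc.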
end of thread

Thread overview: 10+ messages

2015-02-05 15:13 Git gc removes all packs Dmitry Neverov
2015-02-05 20:03 ` Jeff King
2015-02-17 16:39   ` Michael Haggerty
2015-02-17 16:55     ` Jeff King
2015-02-17 20:37       ` Michael Haggerty
2015-02-17 21:57         ` Junio C Hamano
2015-02-17 22:19           ` Michael Haggerty
2015-02-18  7:13             ` Junio C Hamano
2015-02-27 10:16   ` Dmitry Neverov
2015-02-27 13:14     ` Jeff King