* Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
@ 2015-04-16 10:03 Andreas Mohr
  2015-04-16 11:10 ` Thomas Braun
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 10:03 UTC (permalink / raw)
  To: git

Hi all,

over the years I've had the same phenomenon with various versions of msysgit
(now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently
confident of it being a long-standing, longer-term issue and thus I'm
reporting it now.

Since I'm doing development in a sufficiently rebase-heavy manner,
I seem to aggregate a lot of objects.
Thus, when fetching content I'm sufficiently frequently greeted with
a git gc run.
This, however, does not work fully reliably:

    Auto packing the repository for optimum performance. You may also
    run "git gc" manually. See "git help gc" for more information.
    Counting objects: 206527, done.
    Delta compression using up to 4 threads.
    Compressing objects: 100% (27430/27430), done.
    Writing objects: 100% (206527/206527), done.
    Total 206527 (delta 178632), reused 206527 (delta 178632)
    Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n
    Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n
    Checking connectivity: 206527, done.

A workable workaround for this recurring issue
(such a fetch will fail repeatedly,
thereby hampering my ability to update properly)
is to manually do a "git gc --auto"
prior to the fetch (which will then succeed).
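
The workaround above could be wrapped in a small shell function. This is
only a sketch: the function name fetch_after_gc and the GIT override
variable are hypothetical and not part of git or msysgit; only
"git gc --auto" and "git fetch" come from the report itself.

```sh
# Hypothetical helper: run the auto-gc first, then fetch.
# GIT defaults to "git"; overriding it is only meant for dry runs.
fetch_after_gc() {
    "${GIT:-git}" gc --auto &&
    "${GIT:-git}" fetch "$@"
}
```

It would be called as "fetch_after_gc origin"; the fetch only runs if the
preceding gc returned successfully.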




-- 
¿umop apisdn upside down?
(by daniweb.com user Bench)


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 10:03 Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues Andreas Mohr
@ 2015-04-16 11:10 ` Thomas Braun
  2015-04-16 11:31   ` Johannes Schindelin
  2015-04-16 11:35   ` Andreas Mohr
  0 siblings, 2 replies; 13+ messages in thread
From: Thomas Braun @ 2015-04-16 11:10 UTC (permalink / raw)
  To: Andreas Mohr, git; +Cc: msysGit

Am 16.04.2015 um 12:03 schrieb Andreas Mohr:
> Hi all,
> 
> over the years I've had the same phenomenon with various versions of msysgit
> (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently
> confident of it being a long-standing, longer-term issue and thus I'm
> reporting it now.

(CC'ing msysgit)

Hi Andreas,

> Since I'm doing development in a sufficiently rebase-heavy manner,
> I seem to aggregate a lot of objects.
> Thus, when fetching content I'm sufficiently frequently greeted with
> a git gc run.
> This, however, does not work fully reliably:
> 
>     Auto packing the repository for optimum performance. You may also
>     run "git gc" manually. See "git help gc" for more information.
>     Counting objects: 206527, done.
>     Delta compression using up to 4 threads.
>     Compressing objects: 100% (27430/27430), done.
>     Writing objects: 100% (206527/206527), done.
>     Total 206527 (delta 178632), reused 206527 (delta 178632)
>     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n
>     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n
>     Checking connectivity: 206527, done.
> 
> A workable workaround for this recurring issue
> (such a fetch will fail repeatedly,
> thereby hampering my ability to update properly)
> is to manually do a "git gc --auto"
> prior to the fetch (which will then succeed).

I've never had this issue. The error message from unlinking the file
means that someone is still accessing the file and thus it can not be
deleted (due to the implicit file locking on windows).

Can you reproduce the error reliably?

Thomas


-- 
-- 
*** Please reply-to-all at all times ***
*** (do not pretend to know who is subscribed and who is not) ***
*** Please avoid top-posting. ***
The msysGit Wiki is here: https://github.com/msysgit/msysgit/wiki - Github accounts are free.

You received this message because you are subscribed to the Google
Groups "msysGit" group.
To post to this group, send email to msysgit@googlegroups.com
To unsubscribe from this group, send email to
msysgit+unsubscribe@googlegroups.com
For more options, and view previous threads, visit this group at
http://groups.google.com/group/msysgit?hl=en_US?hl=en

--- 
You received this message because you are subscribed to the Google Groups "Git for Windows" group.
To unsubscribe from this group and stop receiving emails from it, send an email to msysgit+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:10 ` Thomas Braun
@ 2015-04-16 11:31   ` Johannes Schindelin
  2015-04-16 11:42     ` Andreas Mohr
  2015-04-16 11:35   ` Andreas Mohr
  1 sibling, 1 reply; 13+ messages in thread
From: Johannes Schindelin @ 2015-04-16 11:31 UTC (permalink / raw)
  To: Thomas Braun; +Cc: Andreas Mohr, git, msysGit, git-owner

Hi,

On 2015-04-16 13:10, Thomas Braun wrote:
> Am 16.04.2015 um 12:03 schrieb Andreas Mohr:
>>
>> over the years I've had the same phenomenon with various versions of msysgit
>> (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently
>> confident of it being a long-standing, longer-term issue and thus I'm
>> reporting it now.
> 
> (CC'ing msysgit)

Good idea.

>> Since I'm doing development in a sufficiently rebase-heavy manner,
>> I seem to aggregate a lot of objects.
>> Thus, when fetching content I'm sufficiently frequently greeted with
>> a git gc run.
>> This, however, does not work fully reliably:
>>
>>     Auto packing the repository for optimum performance. You may also
>>     run "git gc" manually. See "git help gc" for more information.
>>     Counting objects: 206527, done.
>>     Delta compression using up to 4 threads.
>>     Compressing objects: 100% (27430/27430), done.
>>     Writing objects: 100% (206527/206527), done.
>>     Total 206527 (delta 178632), reused 206527 (delta 178632)
>>     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n
>>     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n
>>     Checking connectivity: 206527, done.
>>
>> A workable workaround for this recurring issue
>> (such a fetch will fail repeatedly,
>> thereby hampering my ability to update properly)
>> is to manually do a "git gc --auto"
>> prior to the fetch (which will then succeed).
> 
> I've never had this issue. The error message from unlinking the file
> means that someone is still accessing the file and thus it can not be
> deleted (due to the implicit file locking on windows).

Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem).

Ciao,
Johannes


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:10 ` Thomas Braun
  2015-04-16 11:31   ` Johannes Schindelin
@ 2015-04-16 11:35   ` Andreas Mohr
  2015-04-16 15:28     ` Jeff King
  1 sibling, 1 reply; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 11:35 UTC (permalink / raw)
  To: Thomas Braun; +Cc: Andreas Mohr, git, msysGit

Hi,

sorry, I had sent the prior mail prematurely (hit wrong key)
and have been busy working on the resubmission...

On Thu, Apr 16, 2015 at 01:10:36PM +0200, Thomas Braun wrote:
> Am 16.04.2015 um 12:03 schrieb Andreas Mohr:
> > Hi all,
> > 
> > over the years I've had the same phenomenon with various versions of msysgit
> > (now at 1.9.5.msysgit.0, on Windows 7 64bit), so I'm now sufficiently
> > confident of it being a long-standing, longer-term issue and thus I'm
> > reporting it now.

(I've had experience with this issue as early as 1.7.x, I believe).


> (CC'ing msysgit)
> 
> Hi Andreas,
> 
> > Since I'm doing development in a sufficiently rebase-heavy manner,
> > I seem to aggregate a lot of objects.
> > Thus, when fetching content I'm sufficiently frequently greeted with
> > a git gc run.
> > This, however, does not work fully reliably:
> > 
> >     Auto packing the repository for optimum performance. You may also
> >     run "git gc" manually. See "git help gc" for more information.
> >     Counting objects: 206527, done.
> >     Delta compression using up to 4 threads.
> >     Compressing objects: 100% (27430/27430), done.
> >     Writing objects: 100% (206527/206527), done.
> >     Total 206527 (delta 178632), reused 206527 (delta 178632)
> >     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.pack' failed. Should I try again? (y/n) n
> >     Unlink of file '.git/objects/pack/pack-ab1712db0a94b5c55538d3b4cb3660cedc264c3c.idx' failed. Should I try again? (y/n) n
> >     Checking connectivity: 206527, done.
> > 
> > A workable workaround for this recurring issue
> > (such a fetch will fail repeatedly,
> > thereby hampering my ability to update properly)
> > is to manually do a "git gc --auto"
> > prior to the fetch (which will then succeed).
> 
> I've never had this issue. The error message from unlinking the file
> means that someone is still accessing the file and thus it can not be
> deleted (due to the implicit file locking on windows).
> 
> Can you reproduce the error reliably?

It seems to be reproducible pretty reliably,
at least once git thinks it needs to repack (initiated by a fetch operation, I think),
*and* then the unlink issue successfully turned up
(which may happen perhaps every 20 fetches of a *very* rebase-heavy workflow).


Interim mail content:

I strongly suspect that git's repacking implementation
(probably unrelated to msysgit-specific deviations,
IOW, git *core* handling)
simply is buggy
in that it may keep certain file descriptors open
at least a certain time (depending on scope of implementation/objects!?)
beyond having finished its operation (rename?).
As a related note, in an unrelated application of mine
I also encountered issues on Windows
where renaming of in-use files and further use of these files/names
then failed (error code was EACCES I believe).
IOW, this seems to be an issue specific to
Windows' "special" (and sometimes quirky) filesystem handling
which probably does not turn up on many "other" platforms,
thus a historic existing implementation weakness in git's repack handling
could not be nailed down in a sufficiently easy manner.




I think I may have the order wrong, however:
Handling seems to be:
- repack needed
- counting objects
- compressing
- writing
- unlink (delete) of all prior non-repacked objects (which fails)


I have to admit that at this point in time I'm actually unsure
which higher-level operation it actually is
that gets carried out where eventually a repack *implicitly* gets triggered
(I've got a shell script here which implements clean branch updating,
where I eventually hit the problem during its daily use).


Since a standalone git gc --auto *immediately* appears to work
(after many repeated attempts of failing full-update),
this is a strong hint that (in the failure case)
it's the *PRIOR* (non-repack) operation
which has kept these objects open beyond its actual operation scope.


Suspected implementation sample code:

if (operation_needed)
{
  operation_workingset set;

  set.DoStuff();

  if (repack_needed)
  {
    repack_handler repack;

    repack.DoStuff();
  }
}

[NOTE the very prominent scope issues in this example,
which might be the exact reason for hitting such unlink failures -
simply due to having kept file descriptors open within the working set]

I have not had a look at git source though
to actually determine whether there do exist
such severe operation scope issues
that I'm strongly contemplating.

Andreas Mohr


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:31   ` Johannes Schindelin
@ 2015-04-16 11:42     ` Andreas Mohr
  2015-04-16 11:48       ` Andreas Mohr
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 11:42 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Thomas Braun, Andreas Mohr, git, msysGit, git-owner

Hi,

On Thu, Apr 16, 2015 at 01:31:02PM +0200, Johannes Schindelin wrote:
> Hi,
> 
> On 2015-04-16 13:10, Thomas Braun wrote:
> > I've never had this issue. The error message from unlinking the file
> > means that someone is still accessing the file and thus it can not be
> > deleted (due to the implicit file locking on windows).
> 
> Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem).

Oh my. Botched mail conversation...
I tried to f'up on this messy start ASAP, so I even managed to omit this final *pre-existing* part:
"
Please note that this system is hampered by a crappy virus scanner
dependency (F-Secure),
which could be the culprit for this issue (e.g. by keeping files busy
for longer than expected),
however I really don't think that it takes part in this issue.
"

The reason that I suspect that it's not virus scanner related is:
- standalone git gc --auto works immediately
  (hmm but this might also point at the opposite - namely virus scanner
  still accessing files of a prior operation only in case there *was*
  a prior operation)
- file descriptor scope handling issue in git source code is very easily imaginable
- only a very rebase-heavy workflow of a sufficiently large repo
  is likely to have this issue turn up in a frequently enough manner,
  thus it's quite likely that it's not observed (or reported) all too often

Thanks,

Andreas Mohr

-- 
GNU/Linux. It's not the software that's free, it's you.


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:42     ` Andreas Mohr
@ 2015-04-16 11:48       ` Andreas Mohr
  2015-04-16 12:35         ` Andreas Mohr
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 11:48 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: Johannes Schindelin, Thomas Braun, git, msysGit, git-owner

On Thu, Apr 16, 2015 at 01:42:35PM +0200, Andreas Mohr wrote:
> Hi,
> 
> On Thu, Apr 16, 2015 at 01:31:02PM +0200, Johannes Schindelin wrote:
> > Hi,
> > 
> > On 2015-04-16 13:10, Thomas Braun wrote:
> > > I've never had this issue. The error message from unlinking the file
> > > means that someone is still accessing the file and thus it can not be
> > > deleted (due to the implicit file locking on windows).
> > 
> > Best guess is that an antivirus is still accessing it. There is a tool called `WhoUses.exe` in msysGit (I do not remember if I included it into Git for Windows 1.x for end users) which could be used to figure out which process accesses a given file still: https://github.com/msysgit/msysgit/blob/master/mingw/bin/WhoUses.exe (maybe that would help you identify the cause of the problem).
> 
> Oh my. Botched mail conversation...
> I tried to f'up on this messy start ASAP, so I even managed to omit this final *pre-existing* part:
> "
> Please note that this system is hampered by a crappy virus scanner
> dependency (F-Secure),
> which could be the culprit for this issue (e.g. by keeping files busy
> for longer than expected),
> however I really don't think that it takes part in this issue.
> "
> 
> The reason that I suspect that it's not virus scanner related is:
> - standalone git gc --auto works immediately
>   (hmm but this might also point at the opposite - namely virus scanner
>   still accessing files of a prior operation only in case there *was*
>   a prior operation)
> - file descriptor scope handling issue in git source code is very easily imaginable
> - only a very rebase-heavy workflow of a sufficiently large repo
>   is likely to have this issue turn up in a frequently enough manner,
>   thus it's quite likely that it's not observed (or reported) all too often

OK, at this point in time it's my turn to actually verify
that indeed it's NOT the virus scanner:
- generate rebase-heavy activity
- update
- hit issue
- unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-)
- update
- profit!?

(and possibly have a try at WhoUses.exe there, too - thanks for the hint!)

Andreas Mohr


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:48       ` Andreas Mohr
@ 2015-04-16 12:35         ` Andreas Mohr
  2015-04-16 13:07           ` Johannes Schindelin
  0 siblings, 1 reply; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 12:35 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: Johannes Schindelin, Thomas Braun, git, msysGit, git-owner

On Thu, Apr 16, 2015 at 01:48:46PM +0200, Andreas Mohr wrote:
> OK, at this point in time it's my turn to actually verify
> that indeed it's NOT the virus scanner:
> - generate rebase-heavy activity
> - update
> - hit issue
> - unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-)
> - update
> - profit!?

Despite trying hard (generating a lot of activity, with different repo projects even)
I cannot reproduce it in a timely manner,
thus I'll have to wait until repo state has degraded in a sufficient manner
for such a larger repack with that issue to occur again
(probably a matter of weeks).
Once it happens, I will:
- ensure keeping a copy of the entire (problematic-state) repo, and verify reproducibility of its (copied/preserved) breakage
- unload virus and do other tests
- report back

Andreas Mohr


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 12:35         ` Andreas Mohr
@ 2015-04-16 13:07           ` Johannes Schindelin
  0 siblings, 0 replies; 13+ messages in thread
From: Johannes Schindelin @ 2015-04-16 13:07 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: Thomas Braun, git, msysGit, git-owner

Hi Andreas,

On 2015-04-16 14:35, Andreas Mohr wrote:
> On Thu, Apr 16, 2015 at 01:48:46PM +0200, Andreas Mohr wrote:
>> OK, at this point in time it's my turn to actually verify
>> that indeed it's NOT the virus scanner:
>> - generate rebase-heavy activity
>> - update
>> - hit issue
>> - unload virus (~ scanner?? I'm unsure on exact terminology to be used ;-)
>> - update
>> - profit!?
> 
> Despite trying hard (generating a lot of activity, with different repo
> projects even)
> I cannot reproduce it in a timely manner,
> thus I'll have to wait until repo state has degraded in a sufficient manner
> for such a larger repack with that issue to occur again
> (probably a matter of weeks).
> Once it happens, I will:
> - ensure keeping a copy of the entire (problematic-state) repo, and
> verify reproducibility of its (copied/preserved) breakage
> - unload virus and do other tests
> - report back

I guess the best way to trigger it is by ensuring that a lot of loose objects are accumulated, e.g. by running

```sh
i=$(date +%s)
j=0
while test $j -lt 9999
do
    echo "test $(($i+$j))" | git hash-object -w --stdin
    j=$(($j+1))
done
```

Ciao,
Johannes


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 11:35   ` Andreas Mohr
@ 2015-04-16 15:28     ` Jeff King
  2015-04-16 15:48       ` Johannes Schindelin
  2015-04-23  6:52       ` rupert thurner
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff King @ 2015-04-16 15:28 UTC (permalink / raw)
  To: Andreas Mohr; +Cc: Thomas Braun, git, msysGit

On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote:

> I strongly suspect that git's repacking implementation
> (probably unrelated to msysgit-specific deviations,
> IOW, git *core* handling)
> simply is buggy
> in that it may keep certain file descriptors open
> at least a certain time (depending on scope of implementation/objects!?)
> beyond having finished its operation (rename?).

Hrm. I do not see anything in builtin/fetch.c that closes the packfile
descriptors before running "gc --auto". So basically the sequence:

  1. Fetch performs actual fetch. It needs to open packfiles to do
     commit negotiation with other side (the hard work is done
     by an index-pack subprocess, but it is likely we have to access
     _some_ objects).

  2. The packfiles remain open and mmap'd (at least on Linux) in the
     sha1_file.c:packed_git list.

  3. We spawn "gc --auto" and wait for it to finish. While we are
     waiting, the descriptors are still open, so on Windows "gc --auto"
     will not be able to delete those packs.

But this seems too simple to be the problem, as it would mean that just
about any "gc --auto" that triggers a full repack would be a problem (so
anytime you have about 50 packs). But maybe the gc "autodetach" behavior
means it works racily.
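
As a rough illustration of the "about 50 packs" condition: gc.autoPackLimit
(default 50) is the number of packs above which "gc --auto" consolidates
them. The sketch below compares the pack count in a pack directory against
such a limit. The function names count_packs and needs_full_repack are made
up for this example, and it ignores details such as packs marked with .keep
files, so it should not be read as git's actual logic.

```sh
# Count pack files in a pack directory (e.g. .git/objects/pack).
count_packs() {
    n=0
    for f in "$1"/pack-*.pack; do
        # An unmatched glob stays literal, so check that the file exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Succeed when the pack count exceeds the limit (second argument,
# defaulting to 50 as an assumption about gc.autoPackLimit).
needs_full_repack() {
    [ "$(count_packs "$1")" -gt "${2:-50}" ]
}
```

For instance, "needs_full_repack .git/objects/pack 3" would mirror the
gc.autoPackLimit value of 3 used in the script below.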

I was able to set up the situation deterministically by running the
script below:

-- >8 --
#!/bin/sh

# XXX tweak this setting as appropriate
PATH_TO_GIT_BUILD=$HOME/compile/git
PATH=$PATH_TO_GIT_BUILD/bin-wrappers:$PATH
rm -rf parent child

# make a parent/child where the child will have to access
# a packfile to fulfill another fetch
git init parent &&
git -C parent commit --allow-empty -m base &&
git clone parent child &&
git -C parent commit --allow-empty -m extra &&

# we want to make our base pack really big, because otherwise
# git will open/mmap/close it. So we must exceed core.packedgitlimit
cd child &&
$PATH_TO_GIT_BUILD/test-genrandom foo 5000000 >file &&
git add file &&
git commit -m large file &&
git repack -ad &&
git config core.packedGitLimit 1M &&

# now make some spare packs to bust the gc.autopacklimit
for i in 1 2 3 4 5; do
	git commit --allow-empty -m $i &&
	git repack -d
done &&
git config gc.autoPackLimit 3 &&
git config gc.autoDetach false &&
GIT_TRACE=1 git fetch
-- 8< --

I also instrumented my (v1.9.5) git build like this:

diff --git a/builtin/fetch.c b/builtin/fetch.c
index 025bc3e..fc99e5e 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1174,6 +1174,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	list.strdup_strings = 1;
 	string_list_clear(&list, 0);
 
+	{
+		struct packed_git *p;
+		for (p = packed_git; p; p = p->next)
+			trace_printf("pack %s has descriptor %d\n",
+				     p->pack_name, p->pack_fd);
+	}
 	run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
 
 	return result;
diff --git a/builtin/repack.c b/builtin/repack.c
index bb2314c..e8b29cf 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -105,6 +105,7 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name)
 	for (i = 0; i < ARRAY_SIZE(exts); i++) {
 		strbuf_setlen(&buf, plen);
 		strbuf_addstr(&buf, exts[i]);
+		trace_printf("unlinking %s\n", buf.buf);
 		unlink(buf.buf);
 	}
 	strbuf_release(&buf);

to confirm what was happening (because of course on Linux it is
perfectly fine to delete the open file). If this does trigger the bug
for you, though, it should be obvious even without the trace calls. :)
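
The parenthetical about Linux can be checked with a few lines of shell.
This is only a sketch under POSIX assumptions: the function name
demo_unlink_open and descriptor number 3 are arbitrary choices, and the
Windows behaviour is the one described in this thread, not something the
snippet verifies.

```sh
# Keep a descriptor open on a file, unlink the file, then read through
# the still-open descriptor.
demo_unlink_open() {
    tmp=$(mktemp) || return 1
    echo "pack data" >"$tmp"
    exec 3<"$tmp"            # stays open, like the parent's pack descriptors
    if rm -f "$tmp" && [ ! -e "$tmp" ]; then
        echo "unlinked while open"
    else
        echo "unlink failed"
    fi
    IFS= read -r line <&3    # on POSIX the data is still reachable
    exec 3<&-                # close the descriptor again
    echo "$line"
}
```

On a POSIX system this is expected to report a successful unlink and still
print the file's content; going by the reports above, the equivalent unlink
on Windows fails for as long as the descriptor is open.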

-Peff


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 15:28     ` Jeff King
@ 2015-04-16 15:48       ` Johannes Schindelin
  2015-04-16 15:56         ` David Miller
  2015-04-16 20:56         ` Andreas Mohr
  2015-04-23  6:52       ` rupert thurner
  1 sibling, 2 replies; 13+ messages in thread
From: Johannes Schindelin @ 2015-04-16 15:48 UTC (permalink / raw)
  To: Jeff King; +Cc: Andreas Mohr, Thomas Braun, git, msysGit, git-owner

Hi Peff,

On 2015-04-16 17:28, Jeff King wrote:
> On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote:
> 
>> I strongly suspect that git's repacking implementation
>> (probably unrelated to msysgit-specific deviations,
>> IOW, git *core* handling)
>> simply is buggy
>> in that it may keep certain file descriptors open
>> at least a certain time (depending on scope of implementation/objects!?)
>> beyond having finished its operation (rename?).
> 
> Hrm. [... detailed analysis, including a Minimal, Complete & Verifiable Example ...]

Thank you so much! I will definitely test this (at the moment, I have to recreate my build environment in a different VM than I used so far, that takes quite some time...)

Thanks!
Dscho


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 15:48       ` Johannes Schindelin
@ 2015-04-16 15:56         ` David Miller
  2015-04-16 20:56         ` Andreas Mohr
  1 sibling, 0 replies; 13+ messages in thread
From: David Miller @ 2015-04-16 15:56 UTC (permalink / raw)
  To: johannes.schindelin; +Cc: peff, andi, thomas.braun, git, msysgit, git-owner


Please remove git-owner from the CC: list in future replies, thank
you. :-)


* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 15:48       ` Johannes Schindelin
  2015-04-16 15:56         ` David Miller
@ 2015-04-16 20:56         ` Andreas Mohr
  1 sibling, 0 replies; 13+ messages in thread
From: Andreas Mohr @ 2015-04-16 20:56 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Andreas Mohr, Thomas Braun, git, msysGit

[git-owner CC dutifully removed]

On Thu, Apr 16, 2015 at 05:48:42PM +0200, Johannes Schindelin wrote:
> Hi Peff,
> 
> On 2015-04-16 17:28, Jeff King wrote:
> > On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote:
> > 
> >> I strongly suspect that git's repacking implementation
> >> (probably unrelated to msysgit-specific deviations,
> >> IOW, git *core* handling)
> >> simply is buggy
> >> in that it may keep certain file descriptors open
> >> at least a certain time (depending on scope of implementation/objects!?)
> >> beyond having finished its operation (rename?).
> > 
> > Hrm. [... detailed analysis, including a Minimal, Complete & Verifiable Example ...]
> 
> Thank you so much! I will definitely test this (at the moment, I have to recreate my build environment in a different VM than I used so far, that takes quite some time...)

Your hash-object script easily managed to provoke the issue again,
thanks a lot!
(One syntax issue, though: it was missing a '|' pipe.)

I then did some unload tests of the virus scanner
(force-unloaded, via End Process Tree),
and the unlink issue persisted
(but to be truly certain, I would have to rename away
the entire virus scanner installation tree).
Besides, it already looks like
we are on the way to nailing a genuine git handling bug...

Also, I cannot remember the "retry unlink?" prompt EVER
ending up successful (even though virus scanner file activity is surely
a very temporary thing!).

So much for the "related" observations that I can contribute at the moment.
I had no time left to actually work on it today,
but I'll try to do some testing based on the very detailed
(and gratifyingly matching :) analysis by Jeff King (thanks a lot, too!).

Andreas Mohr

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues
  2015-04-16 15:28     ` Jeff King
  2015-04-16 15:48       ` Johannes Schindelin
@ 2015-04-23  6:52       ` rupert thurner
  1 sibling, 0 replies; 13+ messages in thread
From: rupert thurner @ 2015-04-23  6:52 UTC (permalink / raw)
  To: msysgit; +Cc: peff, git, andi, thomas.braun


[-- Attachment #1.1: Type: text/plain, Size: 5029 bytes --]

Hi,

I made a screenshot a couple of weeks ago (attached), and it seems to match
your description.

rupert.


On Thursday, April 16, 2015 at 5:28:54 PM UTC+2, Jeff King wrote:
>
> On Thu, Apr 16, 2015 at 01:35:05PM +0200, Andreas Mohr wrote: 
>
> > I strongly suspect that git's repacking implementation 
> > (probably unrelated to msysgit-specific deviations, 
> > IOW, git *core* handling) 
> > simply is buggy 
> > in that it may keep certain file descriptors open 
> > at least a certain time (depending on scope of implementation/objects!?) 
> > beyond having finished its operation (rename?). 
>
> Hrm. I do not see anything in builtin/fetch.c that closes the packfile 
> descriptors before running "gc --auto". So basically the sequence: 
>
>   1. Fetch performs actual fetch. It needs to open packfiles to do 
>      commit negotiation with other side (the hard work is done 
>      by an index-pack subprocess, but it is likely we have to access 
>      _some_ objects). 
>
>   2. The packfiles remain open and mmap'd (at least on Linux) in the 
>      sha1_file.c:packed_git list. 
>
>   3. We spawn "gc --auto" and wait for it to finish. While we are 
>      waiting, the descriptors are still open, so on Windows "gc --auto" 
>      will not be able to delete any packs. 
>
> But this seems too simple to be the problem, as it would mean that just 
> about any "gc --auto" that triggers a full repack would be a problem (so 
> anytime you have about 50 packs). But maybe the gc "autodetach" behavior 
> means it works racily. 
>
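
The three steps above can be sketched without git at all. This is only an
illustration under stated assumptions: "demo.pack" is a made-up stand-in
for a packfile, descriptor 3 plays the role of the descriptor that fetch
keeps open, and the child "sh -c" plays the role of "gc --auto". It is not
what git itself runs.

```shell
#!/bin/sh
# Sketch only: demo.pack is a hypothetical stand-in, not a real packfile.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
echo "pack data" >demo.pack

# Steps 1 and 2: the "fetch" side opens the file and keeps the descriptor.
exec 3<demo.pack

# Step 3: spawn a child (the "gc --auto" side) that tries to delete the
# file while we wait. "3<&-" closes descriptor 3 for the child only, so the
# parent is the only holder.
if sh -c 'rm demo.pack' 3<&- 2>/dev/null
then
	unlink_result=deleted
else
	unlink_result=failed
fi
echo "unlink while parent holds descriptor 3: $unlink_result"

# Release the descriptor again.
exec 3<&-
```

On Linux I would expect this to report "deleted". Going by the analysis
quoted here, the interesting case is a Windows shell, where the delete
would be expected to fail for as long as the parent keeps the file open.
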
> I was able to set up the situation deterministically by running the 
> script below: 
>
> -- >8 -- 
> #!/bin/sh 
>
> # XXX tweak this setting as appropriate 
> PATH_TO_GIT_BUILD=$HOME/compile/git 
> PATH=$PATH_TO_GIT_BUILD/bin-wrappers:$PATH 
> rm -rf parent child 
>
> # make a parent/child where the child will have to access 
> # a packfile to fulfill another fetch 
> git init parent && 
> git -C parent commit --allow-empty -m base && 
> git clone parent child && 
> git -C parent commit --allow-empty -m extra && 
>
> # we want to make our base pack really big, because otherwise 
> # git will open/mmap/close it. So we must exceed core.packedgitlimit 
> cd child && 
> $PATH_TO_GIT_BUILD/test-genrandom foo 5000000 >file && 
> git add file && 
> git commit -m "large file" && 
> git repack -ad && 
> git config core.packedGitLimit 1M && 
>
> # now make some spare packs to bust the gc.autopacklimit 
> for i in 1 2 3 4 5; do 
>         git commit --allow-empty -m $i && 
>         git repack -d 
> done && 
> git config gc.autoPackLimit 3 && 
> git config gc.autoDetach false && 
> GIT_TRACE=1 git fetch 
> -- >8 -- 
>
> I also instrumented my (v1.9.5) git build like this: 
>
> diff --git a/builtin/fetch.c b/builtin/fetch.c 
> index 025bc3e..fc99e5e 100644 
> --- a/builtin/fetch.c 
> +++ b/builtin/fetch.c 
> @@ -1174,6 +1174,12 @@ int cmd_fetch(int argc, const char **argv, const char *prefix) 
>          list.strdup_strings = 1; 
>          string_list_clear(&list, 0); 
>   
> +        { 
> +                struct packed_git *p; 
> +                for (p = packed_git; p; p = p->next) 
> +                        trace_printf("pack %s has descriptor %d\n", 
> +                                     p->pack_name, p->pack_fd); 
> +        } 
>          run_command_v_opt(argv_gc_auto, RUN_GIT_CMD); 
>   
>          return result; 
> diff --git a/builtin/repack.c b/builtin/repack.c 
> index bb2314c..e8b29cf 100644 
> --- a/builtin/repack.c 
> +++ b/builtin/repack.c 
> @@ -105,6 +105,7 @@ static void remove_redundant_pack(const char *dir_name, const char *base_name) 
>          for (i = 0; i < ARRAY_SIZE(exts); i++) { 
>                  strbuf_setlen(&buf, plen); 
>                  strbuf_addstr(&buf, exts[i]); 
> +                trace_printf("unlinking %s\n", buf.buf); 
>                  unlink(buf.buf); 
>          } 
>          strbuf_release(&buf); 
>
> to confirm what was happening (because of course on Linux it is 
> perfectly fine to delete the open file). If this does trigger the bug 
> for you, though, it should be obvious even without the trace calls. :) 
>
> -Peff 
>
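
The remark about Linux (deleting an open file is perfectly fine there) can
be checked with a few lines of shell. Again a sketch under assumptions:
"demo.pack" and descriptor 3 are made up for illustration and have nothing
to do with git's real pack handling.

```shell
#!/bin/sh
# Sketch only: shows POSIX unlink semantics with a hypothetical file.
tmp=$(mktemp -d) && cd "$tmp" || exit 1
echo "still readable" >demo.pack

# Open the file, then delete its directory entry while it is still open.
exec 3<demo.pack
rm demo.pack

# The name is gone, but the data stays reachable through the descriptor
# until the last holder closes it.
contents=$(cat <&3)
exec 3<&-

echo "read after unlink: $contents"
```

This is why the problem does not show up on Linux without the trace
calls: the unlink in repack succeeds silently there even though fetch
still has the pack open.
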

[-- Attachment #2: git-windows-locking-itself.png --]
[-- Type: image/png, Size: 16444 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2015-04-23  6:52 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-16 10:03 Issue: repack semi-frequently fails on Windows (msysgit) - suspecting file descriptor issues Andreas Mohr
2015-04-16 11:10 ` Thomas Braun
2015-04-16 11:31   ` Johannes Schindelin
2015-04-16 11:42     ` Andreas Mohr
2015-04-16 11:48       ` Andreas Mohr
2015-04-16 12:35         ` Andreas Mohr
2015-04-16 13:07           ` Johannes Schindelin
2015-04-16 11:35   ` Andreas Mohr
2015-04-16 15:28     ` Jeff King
2015-04-16 15:48       ` Johannes Schindelin
2015-04-16 15:56         ` David Miller
2015-04-16 20:56         ` Andreas Mohr
2015-04-23  6:52       ` rupert thurner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.