* Git Garbage Collect Error.
@ 2012-06-13 10:27 Thomas Lucas
2012-07-12 9:32 ` Jeff King
2012-07-12 12:34 ` Philippe Vaucher
0 siblings, 2 replies; 4+ messages in thread
From: Thomas Lucas @ 2012-06-13 10:27 UTC (permalink / raw)
To: git
Hi,
Hopefully this is the right place to send bug reports... The community page
"http://git-scm.com/community" suggests that it is.
Introduction
I am creating a large Git repository by fetching from a large SVN repository, initially as an experiment. I usually use Git repositories that interface with parts of the SVN repository.
Defect
During garbage collection (git gc), Git encountered the following error:
git gc / git gc --prune:
Counting objects: 856758, done.
Delta compression using up to 2 threads.
fatal: Out of memory, malloc failed (tried to allocate 303237121 bytes)
error: failed to run repack
git gc --aggressive:
Counting objects: 856758, done.
Delta compression using up to 2 threads.
fatal: Out of memory, malloc failed (tried to allocate 291942401 bytes)
error: failed to run repack
At the moment the bare repository is about 4 GB in size and about two-thirds of the way
through fetching.
The compression gets over 90% of the way through before this error occurs, but I
don't think any compression results are kept, because when I repeat the command it has
the same amount of work to do.
Initially this happened during an automatic gc during the fetch process, which
aborted the fetch.
My system is a 2-core Windows XP 64-bit machine with 4 GB of memory and plenty of virtual memory.
Comments
If this is a genuine limitation due to the size of an object and memory handling
limitations, then perhaps the error could be caught and the successful results
kept, i.e. do a partial compression. That way the process could continue.
Background
My requirement is to have Git repositories of a source directory with all SVN
branches included, so that I can more easily merge and compare branches using
Git. However, even for small source directories it takes weeks to fetch from the
SVN repository (including all tags and branches), whereas fetching just the
trunk takes a few hours. The SVN repository has over 90,000 revisions. I am aware
that I can fetch a subset of revisions (I don't want to at the moment), but
I've found no way to fetch a subset of branches.
My config is as follows:
[svn-remote "svn"]
url = svn://svn
fetch = trunk:refs/remotes/svn/trunk
branches = branches/*:refs/remotes/svn/*
tags = tags/*:refs/remotes/svn/tags/*
I set this up using:
git svn init --prefix=svn/ --stdlayout --no-minimize-url svn://svn
To do this for individual directories I have to do the following:
git svn init --prefix=svn/ --stdlayout --no-minimize-url
svn://svn/trunk/source/<dir>
and then edit the config manually so that:
[svn-remote "svn"]
url = svn://svn
fetch = trunk/source/<dir>:refs/remotes/svn/trunk
branches = branches/*/source/<dir>:refs/remotes/svn/*
tags = tags/*/source/<dir>:refs/remotes/svn/tags/*
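The manual edit above could also be scripted with `git config` rather than editing the file by hand. A minimal sketch, where "mydir" stands in for the real `<dir>` and the scratch repository exists only for demonstration:

```shell
# Sketch: reproduce the hand-edited [svn-remote "svn"] section with git config.
# "mydir" is a placeholder for the real <dir>; the scratch repo is just a demo.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config svn-remote.svn.url 'svn://svn'
git config svn-remote.svn.fetch 'trunk/source/mydir:refs/remotes/svn/trunk'
git config svn-remote.svn.branches 'branches/*/source/mydir:refs/remotes/svn/*'
git config svn-remote.svn.tags 'tags/*/source/mydir:refs/remotes/svn/tags/*'
git config --get svn-remote.svn.url   # prints: svn://svn
```

Each `git config` call writes one key under the `svn-remote.svn` section, producing the same config file shown above.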
This works OK, but I couldn't get this result by using "git svn init" directly.
Maybe I've missed something.
Regards,
Tom.
* Re: Git Garbage Collect Error.
2012-06-13 10:27 Git Garbage Collect Error Thomas Lucas
@ 2012-07-12 9:32 ` Jeff King
2012-07-14 3:36 ` sascha-ml
2012-07-12 12:34 ` Philippe Vaucher
1 sibling, 1 reply; 4+ messages in thread
From: Jeff King @ 2012-07-12 9:32 UTC (permalink / raw)
To: Thomas Lucas; +Cc: git
On Wed, Jun 13, 2012 at 11:27:04AM +0100, Thomas Lucas wrote:
> Hopefully this is the right place to send bug reports... The
> community page "http://git-scm.com/community" suggests that it is.
It is the right place. Sorry that you did not get any response before
now.
> During garbage collection (git gc) it encountered the following error:
>
> git gc | git gc --prune :
>
> Counting objects: 856758, done.
> Delta compression using up to 2 threads.
> fatal: Out of memory, malloc failed (tried to allocate 303237121 bytes)
> error: failed to run repack
Packing can be memory hungry if you have a lot of large objects (we may
hold several large objects in memory while comparing them for deltas).
It is also worse with 2 threads, as they will be working simultaneously,
but in the same memory space.
> The compression gets over 90% of the way through before this error
> occurs, but I don't think any compression results are kept, because
> when you repeat it has the same amount of work to do.
Right. Nothing is written during compression; we are just coming up with
a list of deltas to perform during the writing phase.
> My system is XP64 2 core with 4Gb of memory and plenty of virtual memory.
Unfortunately, I believe that the msysgit build is 32-bit, which means
you are probably not even getting to use all 4Gb of your address space
(my impression is that without special flags, 32-bit Windows processes
are limited to 2Gb of address space).
I'd first try doing the pack single-threaded by setting the pack.threads
config option to 1. If that doesn't work, you might try setting
pack.windowMemory to limit the delta search based on available memory
(usually it is limited by number of objects). If the large blobs are
ones that do not delta well anyway (e.g., compressed media files), you
might also consider setting the "-delta" attribute for them to skip
delta compression entirely.
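The suggestions above can be sketched as follows. The 256m window cap and the `*.zip` pattern are illustrative values, not tuned recommendations, and the scratch repository exists only so the commands are self-contained:

```shell
# Sketch of the settings above, applied in a scratch repository.
# Values are illustrative; tune them to the machine's available memory.
repo="$(mktemp -d)"
git init --quiet "$repo"
cd "$repo"
git config pack.threads 1           # single-threaded delta search uses less memory
git config pack.windowMemory 256m   # cap memory used for the delta window
# Skip delta compression entirely for files that won't delta well:
printf '*.zip -delta\n' > .gitattributes
git config --get pack.threads       # prints: 1
```

With these set, a subsequent `git gc` (or `git repack`) picks the settings up automatically from the repository config.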
-Peff
* Re: Git Garbage Collect Error.
2012-06-13 10:27 Git Garbage Collect Error Thomas Lucas
2012-07-12 9:32 ` Jeff King
@ 2012-07-12 12:34 ` Philippe Vaucher
1 sibling, 0 replies; 4+ messages in thread
From: Philippe Vaucher @ 2012-07-12 12:34 UTC (permalink / raw)
To: Thomas Lucas; +Cc: git
> At the moment the bare repository is about 4Gb in size and about 2/3rds the way through fetching.
That's a big repo. Lots of binary files in it?
Does git fsck run normally? Does it report a lot of dangling blobs/commits/etc.?
Philippe
* Re: Git Garbage Collect Error.
2012-07-12 9:32 ` Jeff King
@ 2012-07-14 3:36 ` sascha-ml
0 siblings, 0 replies; 4+ messages in thread
From: sascha-ml @ 2012-07-14 3:36 UTC (permalink / raw)
To: Jeff King; +Cc: git
On Thursday 12 July 2012 05:32:21 Jeff King wrote:
> [...] which means you are probably not even getting to use all 4Gb of your
> address space (my impression is that without special flags, 32-bit Windows
> processes are limited to 2Gb of address space).
Indeed, that's how Windows partitions memory on 32-bit systems. See:
http://msdn.microsoft.com/en-us/library/windows/desktop/aa366912.aspx
As usual with Microsoft, the documentation doesn't say a word about how this
works in the emulated 32-bit environment. However, a quick test reveals that a
32-bit process running on 64-bit Windows 7 with 6 GiB of memory is not able to
malloc() more than 1 GiB at once (which is not a big surprise, as malloc'ed
memory has to be contiguous inside the address space).
So one might guess that there is no difference in memory partitioning for
32-bit processes running on a 64-bit OS.
SaCu