From: Nicolas Pitre <nico@fluxnic.net>
To: Shawn Pearce <spearce@spearce.org>
Cc: Fredrik Kuivinen <frekui@gmail.com>,
git@vger.kernel.org, Johannes Sixt <j6t@kdbg.org>,
Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/2] Make xmalloc and xrealloc thread-safe
Date: Wed, 24 Mar 2010 14:54:48 -0400 (EDT) [thread overview]
Message-ID: <alpine.LFD.2.00.1003241435300.694@xanadu.home> (raw)
In-Reply-To: <ec874dac1003241122s3d592f26n1b23d23144939218@mail.gmail.com>
[-- Attachment #1: Type: TEXT/PLAIN, Size: 3752 bytes --]
On Wed, 24 Mar 2010, Shawn Pearce wrote:
> On Wed, Mar 24, 2010 at 10:53 AM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Wed, 24 Mar 2010, Fredrik Kuivinen wrote:
> >
> >> On Wed, Mar 24, 2010 at 00:50, Nicolas Pitre <nico@fluxnic.net> wrote:
> >> > On Tue, 23 Mar 2010, Fredrik Kuivinen wrote:
> >> >
> >> >> On Tue, Mar 23, 2010 at 19:43, Shawn O. Pearce <spearce@spearce.org> wrote:
> >> >> > If that is what we are doing, disabling the release of pack windows
> >> >> > when malloc fails, why can't we do that all of the time?
> >> >>
> >> >> The idea was that most git programs are single threaded, so they can
> >> >> still benefit from releasing the pack windows when they are low on
> >> >> memory.
> >> >
> >> > This is bobus. The Git program using the most memory is probably
> >> > pack-objects and it is threaded. Most single-threaded programs don't
> >> > use close to as much memory.
> >>
> >> Ok, you are right. But xmalloc/xrealloc cannot be used in multiple
> >> threads simultaneously without some serialization.
> >>
> >> For example, I think there are some potential race conditions in the
> >> pack-objects code. In the threaded code we have the following call
> >> chains leading to xcalloc, xmalloc, and xrealloc:
> >>
> >> find_deltas -> xcalloc
> >> find_deltas -> do_compress -> xmalloc
> >> find_deltas -> try_delta -> xrealloc
> >> find_deltas -> try_delta -> read_sha1_file -> ... -> xmalloc (called
> >> with read_lock held, but it can still race with the other calls)
> >>
> >> As far as I can see there is no serialization between these calls.
> >
> > True. We already have a problem. This is nasty.
>
> The easy solution is probably to remove the use of xmalloc from
> find_deltas code path. But then we run into hard failures when we
> can't get the memory we need, there isn't a way to recover from a
> malloc() failure deep within read_sha1_file for example.
The read_sha1_file path is not the problem -- it is always protected
against concurrency with a mutex.
It is more about do_compress() called on line 1476 of pack-objects.c for
example.
> The current solution is the best we can do, try to ditch pack windows
> and hope that releases sufficient virtual memory space that a second
> malloc() attempt can succeed by increasing heap.
>
> We could use a mutex during the malloc failure code-path of xmalloc,
> to ensure only one thread goes through that pack window cleanup at a
> time. But that will still mess with the main thread which doesn't
> really want to acquire mutexes during object access as it uses the
> existing pack windows.
Right.
> I thought pack-objects did all object access from the main thread and
> only delta searches on the worker threads?
No. Each thread is responsible for grabbing its own data set.
> If that is true, maybe we
> can have the worker threads signal the main thread on malloc failure
> to release pack windows, and then wait for that signal to be
> acknowledged before they attempt to retry the malloc. This means the
> main thread would need to periodically test that condition as its
> dispatching batches of objects to the workers.
>
> Ugly.
Indeed.
The real solution, of course, would be to have pack window manipulations
protected by a mutex of its own. This, plus another mutex for the delta
base cache, and then read_sha1_file() could almost be reentrant.
Another solution could be for xmalloc() to use a function pointer for
the method to use on malloc error path, which would default to a
function calling release_pack_memory(size, -1). Then pack-objects.c
would override the default with its own to acquire the read_mutex around
the call to release_pack_memory(). That is probably the easiest
solution for now.
Nicolas
next prev parent reply other threads:[~2010-03-24 18:54 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20100323161713.3183.57927.stgit@fredrik-laptop>
2010-03-23 17:31 ` [PATCH 1/2] Make xmalloc and xrealloc thread-safe Fredrik Kuivinen
2010-03-23 18:43 ` Shawn O. Pearce
2010-03-23 21:21 ` Fredrik Kuivinen
2010-03-23 23:50 ` Nicolas Pitre
2010-03-24 15:23 ` Fredrik Kuivinen
2010-03-24 17:53 ` Nicolas Pitre
2010-03-24 18:22 ` Shawn Pearce
2010-03-24 18:44 ` Junio C Hamano
2010-03-24 18:54 ` Nicolas Pitre [this message]
2010-03-24 19:57 ` Shawn Pearce
2010-03-24 20:22 ` [PATCH] " Nicolas Pitre
2010-03-24 20:28 ` Shawn O. Pearce
2010-03-24 21:02 ` Nicolas Pitre
2010-03-24 21:11 ` Junio C Hamano
2010-03-24 21:28 ` Junio C Hamano
2010-03-27 13:26 ` Fredrik Kuivinen
2010-03-27 18:59 ` Nicolas Pitre
2010-03-31 6:57 ` Fredrik Kuivinen
2010-04-07 2:57 ` [PATCH v2] " Nicolas Pitre
2010-04-07 3:16 ` Shawn O. Pearce
2010-04-07 4:51 ` Nicolas Pitre
2010-04-07 12:29 ` Shawn Pearce
2010-04-07 13:17 ` Nicolas Pitre
2010-04-07 14:30 ` Shawn Pearce
2010-04-07 14:47 ` Nicolas Pitre
2010-04-07 14:45 ` Fredrik Kuivinen
2010-04-07 15:08 ` Nicolas Pitre
2010-04-07 16:13 ` Fredrik Kuivinen
2010-04-07 16:44 ` Erik Faye-Lund
2010-04-07 18:37 ` Nicolas Pitre
2010-04-07 15:27 ` Sverre Rabbelier
2010-04-07 16:15 ` Fredrik Kuivinen
2010-04-07 16:17 ` Junio C Hamano
2010-04-07 18:49 ` Johannes Sixt
2010-04-08 7:15 ` [PATCH] Thread-safe xmalloc and xrealloc needs a recursive mutex Johannes Sixt
2010-04-08 8:42 ` Fredrik Kuivinen
2010-04-07 5:21 ` [PATCH v2] Make xmalloc and xrealloc thread-safe Junio C Hamano
2010-03-23 17:31 ` [PATCH 2/2] Make sha1_to_hex thread-safe Fredrik Kuivinen
2010-03-23 20:23 ` Johannes Sixt
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.00.1003241435300.694@xanadu.home \
--to=nico@fluxnic.net \
--cc=frekui@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=j6t@kdbg.org \
--cc=spearce@spearce.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).