git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@fluxnic.net>
To: Shawn Pearce <spearce@spearce.org>
Cc: Fredrik Kuivinen <frekui@gmail.com>,
	git@vger.kernel.org, Johannes Sixt <j6t@kdbg.org>,
	Junio C Hamano <gitster@pobox.com>
Subject: Re: [PATCH 1/2] Make xmalloc and xrealloc thread-safe
Date: Wed, 24 Mar 2010 14:54:48 -0400 (EDT)	[thread overview]
Message-ID: <alpine.LFD.2.00.1003241435300.694@xanadu.home> (raw)
In-Reply-To: <ec874dac1003241122s3d592f26n1b23d23144939218@mail.gmail.com>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3752 bytes --]

On Wed, 24 Mar 2010, Shawn Pearce wrote:

> On Wed, Mar 24, 2010 at 10:53 AM, Nicolas Pitre <nico@fluxnic.net> wrote:
> > On Wed, 24 Mar 2010, Fredrik Kuivinen wrote:
> >
> >> On Wed, Mar 24, 2010 at 00:50, Nicolas Pitre <nico@fluxnic.net> wrote:
> >> > On Tue, 23 Mar 2010, Fredrik Kuivinen wrote:
> >> >
> >> >> On Tue, Mar 23, 2010 at 19:43, Shawn O. Pearce <spearce@spearce.org> wrote:
> >> >> > If that is what we are doing, disabling the release of pack windows
> >> >> > when malloc fails, why can't we do that all of the time?
> >> >>
> >> >> The idea was that most git programs are single threaded, so they can
> >> >> still benefit from releasing the pack windows when they are low on
> >> >> memory.
> >> >
> >> > This is bobus. The Git program using the most memory is probably
> >> > pack-objects and it is threaded.  Most single-threaded programs don't
> >> > use close to as much memory.
> >>
> >> Ok, you are right. But xmalloc/xrealloc cannot be used in multiple
> >> threads simultaneously without some serialization.
> >>
> >> For example, I think there are some potential race conditions in the
> >> pack-objects code. In the threaded code we have the following call
> >> chains leading to xcalloc, xmalloc, and xrealloc:
> >>
> >> find_deltas -> xcalloc
> >> find_deltas -> do_compress -> xmalloc
> >> find_deltas -> try_delta -> xrealloc
> >> find_deltas -> try_delta -> read_sha1_file -> ... -> xmalloc  (called
> >> with read_lock held, but it can still race with the other calls)
> >>
> >> As far as I can see there is no serialization between these calls.
> >
> > True.  We already have a problem.  This is nasty.
> 
> The easy solution is probably to remove the use of xmalloc from
> find_deltas code path.  But then we run into hard failures when we
> can't get the memory we need, there isn't a way to recover from a
> malloc() failure deep within read_sha1_file for example.

The read_sha1_file path is not the problem -- it is always protected 
against concurrency with a mutex.

It is more about do_compress() called on line 1476 of pack-objects.c for 
example.


> The current solution is the best we can do, try to ditch pack windows 
> and hope that releases sufficient virtual memory space that a second 
> malloc() attempt can succeed by increasing heap.
> 
> We could use a mutex during the malloc failure code-path of xmalloc,
> to ensure only one thread goes through that pack window cleanup at a
> time.  But that will still mess with the main thread which doesn't
> really want to acquire mutexes during object access as it uses the
> existing pack windows.

Right.

> I thought pack-objects did all object access from the main thread and
> only delta searches on the worker threads?

No.  Each thread is responsible for grabbing its own data set.

> If that is true, maybe we
> can have the worker threads signal the main thread on malloc failure
> to release pack windows, and then wait for that signal to be
> acknowledged before they attempt to retry the malloc.  This means the
> main thread would need to periodically test that condition as its
> dispatching batches of objects to the workers.
> 
> Ugly.

Indeed.

The real solution, of course, would be to have pack window manipulations 
protected by a mutex of its own.  This, plus another mutex for the delta 
base cache, and then read_sha1_file() could almost be reentrant.

Another solution could be for xmalloc() to use a function pointer for 
the method to use on malloc error path, which would default to a 
function calling release_pack_memory(size, -1).  Then pack-objects.c 
would override the default with its own to acquire the read_mutex around 
the call to release_pack_memory().  That is probably the easiest 
solution for now.


Nicolas

  parent reply	other threads:[~2010-03-24 18:54 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20100323161713.3183.57927.stgit@fredrik-laptop>
2010-03-23 17:31 ` [PATCH 1/2] Make xmalloc and xrealloc thread-safe Fredrik Kuivinen
2010-03-23 18:43   ` Shawn O. Pearce
2010-03-23 21:21     ` Fredrik Kuivinen
2010-03-23 23:50       ` Nicolas Pitre
2010-03-24 15:23         ` Fredrik Kuivinen
2010-03-24 17:53           ` Nicolas Pitre
2010-03-24 18:22             ` Shawn Pearce
2010-03-24 18:44               ` Junio C Hamano
2010-03-24 18:54               ` Nicolas Pitre [this message]
2010-03-24 19:57                 ` Shawn Pearce
2010-03-24 20:22                   ` [PATCH] " Nicolas Pitre
2010-03-24 20:28                     ` Shawn O. Pearce
2010-03-24 21:02                       ` Nicolas Pitre
2010-03-24 21:11                         ` Junio C Hamano
2010-03-24 21:28                       ` Junio C Hamano
2010-03-27 13:26                     ` Fredrik Kuivinen
2010-03-27 18:59                       ` Nicolas Pitre
2010-03-31  6:57                         ` Fredrik Kuivinen
2010-04-07  2:57                       ` [PATCH v2] " Nicolas Pitre
2010-04-07  3:16                         ` Shawn O. Pearce
2010-04-07  4:51                           ` Nicolas Pitre
2010-04-07 12:29                             ` Shawn Pearce
2010-04-07 13:17                               ` Nicolas Pitre
2010-04-07 14:30                                 ` Shawn Pearce
2010-04-07 14:47                                   ` Nicolas Pitre
2010-04-07 14:45                                 ` Fredrik Kuivinen
2010-04-07 15:08                                   ` Nicolas Pitre
2010-04-07 16:13                                     ` Fredrik Kuivinen
2010-04-07 16:44                                       ` Erik Faye-Lund
2010-04-07 18:37                                       ` Nicolas Pitre
2010-04-07 15:27                                   ` Sverre Rabbelier
2010-04-07 16:15                                     ` Fredrik Kuivinen
2010-04-07 16:17                                   ` Junio C Hamano
2010-04-07 18:49                                   ` Johannes Sixt
2010-04-08  7:15                                   ` [PATCH] Thread-safe xmalloc and xrealloc needs a recursive mutex Johannes Sixt
2010-04-08  8:42                                     ` Fredrik Kuivinen
2010-04-07  5:21                           ` [PATCH v2] Make xmalloc and xrealloc thread-safe Junio C Hamano
2010-03-23 17:31 ` [PATCH 2/2] Make sha1_to_hex thread-safe Fredrik Kuivinen
2010-03-23 20:23   ` Johannes Sixt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.00.1003241435300.694@xanadu.home \
    --to=nico@fluxnic.net \
    --cc=frekui@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    --cc=spearce@spearce.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).