git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: John Cai <johncai86@gmail.com>
To: Jeff King <peff@peff.net>
Cc: John Cai via GitGitGadget <gitgitgadget@gmail.com>, git@vger.kernel.org
Subject: Re: [PATCH] tmp-objdir: do not opendir() when handling a signal
Date: Tue, 27 Sep 2022 09:50:43 -0400	[thread overview]
Message-ID: <1B4D6262-532A-401E-B1A2-709BEECA4B67@gmail.com> (raw)
In-Reply-To: <YzLiI1HZeBszsIJq@coredump.intra.peff.net>

Hey Peff,

On 27 Sep 2022, at 7:44, Jeff King wrote:

> On Mon, Sep 26, 2022 at 11:53:03PM +0000, John Cai via GitGitGadget wrote:
>
>> One place we call tmp_objdir_create() is in git-receive-pack, where
>> we create a temporary quarantine directory "incoming". Incoming
>> objects will be written to this directory before they get moved to
>> the object directory.
>>
>> We have observed this code leading to a deadlock:
>>
>> 	Thread 1 (Thread 0x7f621ba0b200 (LWP 326305)):
>> 	#0  __lll_lock_wait_private (futex=futex@entry=0x7f621bbf8b80
>> 		<main_arena>) at ./lowlevellock.c:35
>> 	#1  0x00007f621baa635b in __GI___libc_malloc
>> 		(bytes=bytes@entry=32816) at malloc.c:3064
>> 	#2  0x00007f621bae9f49 in __alloc_dir (statp=0x7fff2ea7ed60,
>> 		flags=0, close_fd=true, fd=5)
>> 		at ../sysdeps/posix/opendir.c:118
>> 	#3  opendir_tail (fd=5) at ../sysdeps/posix/opendir.c:69
>> 	#4  __opendir (name=<optimized out>)
>> 		at ../sysdeps/posix/opendir.c:92
>> 	#5  0x0000557c19c77de1 in remove_dir_recurse ()
>> 	#6  0x0000557c19d81a4f in remove_tmp_objdir_on_signal ()
>> 	#7  <signal handler called>
>
> Yuck. Your analysis looks right, and certainly opendir() can't really
> work without allocating memory for the pointer-to-DIR.
>
>> To prevent this, add a flag REMOVE_DIR_SIGNAL that allows
>> remove_dir_recurse() to know that a signal is being handled and avoid
>> calling opendir(3). One implication of this change is that when
>> signal handling, the temporary directory may not get cleaned up
>> properly.
>
> It's not really "may not" here, is it? It will never get cleaned up on a
> signal now. I don't think remove_dir_recursively() will try to rmdir()
> in this case. But even if it did, we'll always have a "pack"
> subdirectory (minus the small race before we've created it).
>
> That's unfortunate, but I don't think we really have a portable
> alternative. We can't keep an exact list of files to be deleted, because
> some of them will be created by sub-processes. We could perhaps exec a
> helper that does the deletion, but that seems like a race and
> portability nightmare. On Linux, we could probably use open() and
> getdents64() to traverse, but obviously that won't work everywhere. It
> _might_ be worth having some kind of compat/ wrapper here, to let
> supported systems do as good a job as they can. But it's not like there
> aren't already cases where we might leave the tmp-objdir directory
> around (say, SIGKILL), so this is really just extending the existing
> problem to more signals.
>
> I was going to suggest we should do a better job of cleaning up these
> directories via git-gc. But it looks like b3cecf49ea (tmp-objdir: new
> API for creating temporary writable databases, 2021-12-06) changed the
> default name such that a regular git-gc should do so. So I think we're
> covered there.

I was wondering about this as well. Thanks for checking on this--that's
reassuring.
>
> The main case we care about is normal exit when index-pack or a hook
> sees an error, in which case we should still be cleaning up via the
> atexit() handler.
>
> So I think your patch is going in the right direction, but...
>
>>  static int remove_dir_recurse(struct strbuf *path, int flag, int *kept_up)
>>  {
>> -	DIR *dir;
>> +	DIR *dir = NULL;
>>  	struct dirent *e;
>>  	int ret = 0, original_len = path->len, len, kept_down = 0;
>>  	int only_empty = (flag & REMOVE_DIR_EMPTY_ONLY);
>> @@ -3261,7 +3261,10 @@ static int remove_dir_recurse(struct strbuf *path, int flag, int *kept_up)
>>  	}
>>
>>  	flag &= ~REMOVE_DIR_KEEP_TOPLEVEL;
>> -	dir = opendir(path->buf);
>> +
>> +	if ((flag & REMOVE_DIR_SIGNAL) == 0)
>> +		dir = opendir(path->buf);
>> +
>>  	if (!dir) {
>>  		if (errno == ENOENT)
>>  			return keep_toplevel ? -1 : 0;
>
> We skip calling opendir() entirely, so "dir" will still be NULL. But we
> immediately start looking at errno, which will have some undefined value
> (based on some previous syscall).
>
> If we set "errno" to "EACCES" in this case, then we'd at least hit the
> rmdir() below:
>
>          if (!dir) {
>                   if (errno == ENOENT)
>                           return keep_toplevel ? -1 : 0;
>                   else if (errno == EACCES && !keep_toplevel)
>                           /*
>                            * An empty dir could be removable even if it
>                            * is unreadable:
>                            */
>                           return rmdir(path->buf);
>                   else
>                           return -1;
>           }
>
> but we know it won't really do anything for our proposed caller, since
> it will have files inside the directory that need to be removed before
> rmdir() can work.

yeah, I suppose the only case it would help is if the directory is already
empty.

>
> Moreover, if you were to combine REMOVE_DIR_SIGNAL with
> REMOVE_DIR_KEEP_NESTED_GIT, I suspect that the call to
> resolve_gitlink_ref() would end up with similar deadlocks. Obviously
> nobody is proposing to do that, but it is a pitfall in the API.
>
> So all of that makes me think we should not add a new flag here, but
> instead just avoid calling the function entirely from
> tmp_objdir_destroy_1().
>
> But then we can observe that tmp_objdir_destroy_1() is basically doing
> nothing if on_signal is set. So there is really no point in setting up
> the signal handler at all. We should just set up the atexit() handler.
> I.e., something like:
>
> diff --git a/tmp-objdir.c b/tmp-objdir.c
> index a8be92bca1..10549e95db 100644
> --- a/tmp-objdir.c
> +++ b/tmp-objdir.c
> @@ -169,7 +169,6 @@ struct tmp_objdir *tmp_objdir_create(const char *prefix)
>  	the_tmp_objdir = t;
>  	if (!installed_handlers) {
>  		atexit(remove_tmp_objdir);
> -		sigchain_push_common(remove_tmp_objdir_on_signal);
>  		installed_handlers++;
>  	}

This makes sense and is a clean solution. I guess the only case where we would
benefit in calling into remove_tmp_objdir_on_signal() is if the temp dir exists
but is empty. I'm not sure how often this would happen to make it worth it?
Probably not...

>
>
> with the commit message explaining that we can't do the cleanup in a
> portable and signal-safe way, so we just punt on the whole concept.
>
> There's also some minor cleanup we could do elsewhere to drop the
> "on_signal" argument (which can come as part of the same patch, or on
> top).
>
> -Peff

  reply	other threads:[~2022-09-27 13:50 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-09-26 23:53 [PATCH] tmp-objdir: do not opendir() when handling a signal John Cai via GitGitGadget
2022-09-27  0:18 ` Taylor Blau
2022-09-27 11:48   ` Jeff King
2022-09-27  1:39 ` Junio C Hamano
2022-09-27  9:18 ` Phillip Wood
2022-09-27 11:44 ` Jeff King
2022-09-27 13:50   ` John Cai [this message]
2022-09-27 19:03     ` Jeff King
2022-09-27 16:50   ` Junio C Hamano
2022-09-27 19:19 ` [PATCH v2] tmp-objdir: skip clean up " John Cai via GitGitGadget
2022-09-27 19:38   ` Jeff King
2022-09-27 20:00     ` Jeff King
2022-09-28 14:55   ` [PATCH v3] " John Cai via GitGitGadget
2022-09-28 15:38     ` Ævar Arnfjörð Bjarmason
2022-09-30 20:47     ` [PATCH v4] " John Cai via GitGitGadget
2022-10-03  8:52       ` Jeff King
2022-10-20 11:58 ` Another possible instance of async-signal-safe opendir path callstack? (Was: [PATCH] tmp-objdir: do not opendir() when handling a signal) Jan Pokorný
2022-10-20 18:21   ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1B4D6262-532A-401E-B1A2-709BEECA4B67@gmail.com \
    --to=johncai86@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).