* [PATCH 0/1] Async-signal safety in signal handlers
@ 2022-01-07 10:53 Patrick Steinhardt
  2022-01-07 10:55 ` [PATCH 1/1] fetch: fix deadlock when cleaning up lockfiles in async signals Patrick Steinhardt
From: Patrick Steinhardt @ 2022-01-07 10:53 UTC
  To: git; +Cc: iwiedler

Hi,

we have recently observed a Git process which has been hanging around
for more than a month on one of our servers in production. A backtrace
showed that the git-fetch(1) process was deadlocked in its signal
handler while trying to free memory. Functions like malloc, free and
most I/O functions aren't async-signal-safe though, which means they
must not be called from signal handlers, as specified in
signal-safety(7).
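
To illustrate the restriction, here is a minimal sketch of a handler
that limits itself to calls listed in signal-safety(7). The names
on_term() and install_handler() and the message text are made up for
this example and are not taken from Git's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static const char msg[] = "interrupted, exiting\n";

/*
 * Hypothetical handler: it only uses write(2) and _exit(2), both of
 * which are async-signal-safe. Calling printf(3), malloc(3) or free(3)
 * here could deadlock if the signal interrupted one of them.
 */
static void on_term(int signo)
{
	(void)signo;
	if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
		/* nothing safe to do about a failed write */
	}
	_exit(1);
}

/* Install the handler for SIGTERM; returns 0 on success, -1 on error. */
static int install_handler(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_term;
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGTERM, &sa, NULL);
}
```

Note that _exit(2) is used instead of exit(3), because exit(3) runs
atexit handlers and flushes stdio buffers, neither of which is safe in
this context.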

The fix for git-fetch(1) is rather simple: we can just unlink(2) the
lockfiles, which is indeed allowed, but skip freeing memory. But in
fact, this is a wider issue: we mostly haven't paid attention to these
restrictions, and thus we freely call non-async-signal-safe functions.
It's less clear what to do about this in most of the other cases
though:

- git-clone(1) tries to clean up the ".git" directory and its worktree
  on being killed, but needs to allocate memory to compute corresponding
  paths. We can try to preallocate the buffer, but it's not clear
  whether there is a proper upper bound on the path length.

- git-gc(1) will try to commit "gc.log" and write to stderr, both of
  which aren't allowed. I think we'll have to just bail and leave it
  behind in a partially-written state.

- git-repack(1) tries to remove "pack/.tmp-*" files, which involves
  calling opendir(3P), readdir(3P) and closedir(3P) and allocating
  memory. We probably have to
  keep track of all temporary files we create in a global list, which we
  can then access in our signal handler.

- git-worktree(1) is doing the same as git-clone(1), trying to prune the
  new worktree if it's killed. Again, we'd probably have to preallocate
  a buffer to compute paths.

- HTTP pushes do all sorts of HTTP requests in their signal handler to
  unlock the remote server. I don't really see what to do about this
  except drop the code -- setting a global "please clean up and exit
  now" flag is probably not going to fly well.
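
The "global list of temporary files" idea could look roughly like the
sketch below. Everything in it is an assumption made for illustration:
the names (track_tempfile(), unlink_tracked(), cleanup_on_signal()) and
the fixed limits are hypothetical and do not reflect how Git's own
tempfile code is structured. The point is only that all storage is
set up ahead of time, so the handler never allocates or frees.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

#define MAX_TRACKED 64     /* hypothetical limit on tracked files */
#define MAX_PATH_LEN 4096  /* hypothetical limit on path length */

/* Static storage: nothing here is allocated or freed at runtime. */
static char tracked_paths[MAX_TRACKED][MAX_PATH_LEN];
static volatile sig_atomic_t tracked_count;

/*
 * Called from normal code, never from a signal handler.
 * Returns 0 on success, -1 if the table is full or the path too long.
 */
static int track_tempfile(const char *path)
{
	size_t len;

	assert(path != NULL);
	len = strlen(path);
	if (tracked_count >= MAX_TRACKED || len >= MAX_PATH_LEN)
		return -1;
	memcpy(tracked_paths[tracked_count], path, len + 1);
	/*
	 * Bump the count last. Real code should also block signals
	 * (sigprocmask) around this update.
	 */
	tracked_count++;
	return 0;
}

/* Only calls unlink(2), which is async-signal-safe. */
static void unlink_tracked(void)
{
	int i;

	for (i = 0; i < tracked_count; i++)
		unlink(tracked_paths[i]);
}

/* Handler: remove files, then die from the original signal. */
static void cleanup_on_signal(int signo)
{
	unlink_tracked();
	signal(signo, SIG_DFL);
	raise(signo);
}
```

The obvious downside is the fixed upper limit on both the number of
files and the path length, which is the same open question as for the
preallocated buffers mentioned above.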

The tempfiles and tmp-objdir code already handles signals correctly.

Patrick

Patrick Steinhardt (1):
  fetch: fix deadlock when cleaning up lockfiles in async signals

 builtin/clone.c |  2 +-
 builtin/fetch.c | 17 +++++++++++------
 transport.c     | 11 ++++++++---
 transport.h     | 14 +++++++++++++-
 4 files changed, 33 insertions(+), 11 deletions(-)

-- 
2.34.1