git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* curiosities with tempfile.active
@ 2022-08-24  9:35 Jeff King
  2022-08-26 22:46 ` René Scharfe
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2022-08-24  9:35 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: René Scharfe, git

[starting a new thread, as this ended up rather long;
 +cc René as there's an interesting twist at the end]

On Tue, Aug 23, 2022 at 11:12:21AM +0200, Johannes Schindelin wrote:

> Side note: I do notice that `delete_tempfile(&tmpedit)` seems to _not_
> release memory when `tmpedit` is non-NULL when `tmpedit->active == 0`.
> I consider this a bug in the `delete_tempfile()` code (in its `if
> (!is_tempfile_active(tempfile))` clause, it should call
> `deactivate_tempfile()` for non-NULL `tempfile`s and set `*tempfile_p =
> NULL;`), but it is outside the scope of your patch to address that.

I agree it is confusing, but I think this turns out not to be a bug in
practice. In ye olden days, tempfile structs lived on the global
linked-list of entries to cleanup forever, so they were statically
allocated or effectively leaked from the heap. And that's why we needed
an "active" flag in the first place.

That changed around 422a21c6a0 (tempfile: remove deactivated list
entries, 2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles
on heap, 2017-09-05). Now calling deactivate_tempfile() unsets the
active flag _and_ frees the struct. And in normal calling, that's the
only way to unset the active flag (it also starts life unset, but the
creation functions always activate on success, or deactivate+free on
error).

So can we just get rid of the active flag? Possibly. I said "normal
calling" above, because the exception is in our remove_tempfiles()
cleanup routine, where we unset the "active" flag as we remove each
file. We can't use the regular delete_tempfile(), or even
deactivate_tempfile() here, because we may be in a signal handler, so
things like free() are forbidden. So we just "leak" the memory, but it's
OK because we're exiting (and even leak detection won't complain,
because it's reachable from the global list).

But why do we care about the flag, then? We just iterate over the list
once, so we should handle each entry once. But it can be called both as
an atexit() handler, as well as a signal handler. So it's possible that
we can be mid-iteration and get another signal (because the original was
via exit, or because we hook multiple signals for the cleanup). So the
active flag in theory is protecting us from that.

But it's not foolproof. We remove the tempfile and then unset the flag,
so there's a moment (albeit small) where that other signal could come
in. It shouldn't be a big deal because unlinking the tempfiles is mostly
idempotent-ish; barring the unlikely chance that somebody else creates
the same random filename as us, the second unlink will do nothing. So
while removing the active flag would increase the size of that race
(from handling a single entry to completing the whole atexit), that
shouldn't matter much in practice.

But here's the interesting twist. Commit 2c2db194bd (tempfile: add
mks_tempfile_dt(), 2022-04-20) recently taught remove_tempfiles() to
drop a surrounding directory. And it does so by munging the filename
buffer. It has to, because we can't allocate a new buffer in a signal
handler.

But is it also idempotent(-ish)? The directory removal itself is,
because it checks:

  tempfile->filename.buf[tempfile->directorylen] == '/'

before overwriting that byte with a NUL, so it should only be true once
(though I note this violates the usual "volatile" rules for signal
handlers, it's probably OK in practice since we know the NUL will be
written before we call rmdir()).

But after that happens, the filename buffer now contains a C string that
points to the directory. And if we end up in remove_tempfiles() again
due to another signal, we'll try to unlink(p->filename.buf), which will
now unlink the directory! I'm not sure how that behaves on all systems.
On Linux it's a noop. And if it just deleted the directory, that would
be OK. I seem to recall on old versions of SunOS it did bad things
(maybe because it really did unlink it, but without checking for
orphaned entries inside).

The good news is that it won't walk further up the tree. I was worried
that a second round might then try to rmdir the parent (usually "/tmp"
in this case). But the directory removal always uses the same length.

So...maybe it's all OK in practice? If not, this is an issue even
without removing the active flag. But removing it would make the race
window larger. I suspect it probably is OK in practice, but saying that
about a race always feels like famous last words.

If we did want to keep it, I suspect we could get the same effect by
munging the volatile "pid_t owner" field, but I think anything we munge
is supposed to be a sig_atomic_t according to the letter of the law.

Patch below shows what it would look like to just drop "active"
entirely. Test suite behaves as expected, but again, the real question
is what it might be doing in a weird racy situation. The one above, but
also a signal racing with adding an entry to the list. The volatile
sig_atomic_t is what tells the removal process in the signal handler
that it's OK to look at. But again, maybe that's OK in practice.

-Peff

---
diff --git a/tempfile.c b/tempfile.c
index 2024c82691..6134b73972 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -89,8 +89,6 @@ static void remove_tempfiles(int in_signal_handler)
 		else
 			unlink_or_warn(p->filename.buf);
 		remove_template_directory(p, in_signal_handler);
-
-		p->active = 0;
 	}
 }
 
@@ -111,7 +109,6 @@ static struct tempfile *new_tempfile(void)
 	struct tempfile *tempfile = xmalloc(sizeof(*tempfile));
 	tempfile->fd = -1;
 	tempfile->fp = NULL;
-	tempfile->active = 0;
 	tempfile->owner = 0;
 	INIT_LIST_HEAD(&tempfile->list);
 	strbuf_init(&tempfile->filename, 0);
@@ -123,9 +120,6 @@ static void activate_tempfile(struct tempfile *tempfile)
 {
 	static int initialized;
 
-	if (is_tempfile_active(tempfile))
-		BUG("activate_tempfile called for active object");
-
 	if (!initialized) {
 		sigchain_push_common(remove_tempfiles_on_signal);
 		atexit(remove_tempfiles_on_exit);
@@ -134,12 +128,10 @@ static void activate_tempfile(struct tempfile *tempfile)
 
 	volatile_list_add(&tempfile->list, &tempfile_list);
 	tempfile->owner = getpid();
-	tempfile->active = 1;
 }
 
 static void deactivate_tempfile(struct tempfile *tempfile)
 {
-	tempfile->active = 0;
 	strbuf_release(&tempfile->filename);
 	volatile_list_del(&tempfile->list);
 	free(tempfile);
diff --git a/tempfile.h b/tempfile.h
index d7804a214a..f0bf59dbac 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -77,7 +77,6 @@
 
 struct tempfile {
 	volatile struct volatile_list_head list;
-	volatile sig_atomic_t active;
 	volatile int fd;
 	FILE *volatile fp;
 	volatile pid_t owner;
@@ -221,7 +220,7 @@ FILE *fdopen_tempfile(struct tempfile *tempfile, const char *mode);
 
 static inline int is_tempfile_active(struct tempfile *tempfile)
 {
-	return tempfile && tempfile->active;
+	return !!tempfile;
 }
 
 /*

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: curiosities with tempfile.active
  2022-08-24  9:35 curiosities with tempfile.active Jeff King
@ 2022-08-26 22:46 ` René Scharfe
  2022-08-27 13:02   ` Jeff King
  2022-08-30 19:40   ` [PATCH 0/2] cleaning up tempfile active flag Jeff King
  0 siblings, 2 replies; 8+ messages in thread
From: René Scharfe @ 2022-08-26 22:46 UTC (permalink / raw)
  To: Jeff King, Johannes Schindelin; +Cc: git

Am 24.08.22 um 11:35 schrieb Jeff King:
> But here's the interesting twist. Commit 2c2db194bd (tempfile: add
> mks_tempfile_dt(), 2022-04-20) recently taught remove_tempfiles() to
> drop a surrounding directory. And it does so by munging the filename
> buffer. It has to, because we can't allocate a new buffer in a signal
> handler.
>
> But is it also idempotent(-ish)? The directory removal itself is,
> because it checks:
>
>   tempfile->filename.buf[tempfile->directorylen] == '/'
>
> before overwriting that byte with a NUL, so it should only be true once
> (though I note this violates the usual "volatile" rules for signal
> handlers, it's probably OK in practice since we know the NUL will be
> written before we call rmdir()).
>
> But after that happens, the filename buffer now contains a C string that
> points to the directory. And if we end up in remove_tempfiles() again
> due to another signal, we'll try to unlink(p->filename.buf), which will
> now unlink the directory! I'm not sure how that behaves on all systems.
> On Linux it's a noop. And if it just deleted the directory, that would
> be OK. I seem to recall on old versions of SunOS it did bad things
> (maybe because it really did unlink it, but without checking for
> orphaned entries inside).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/unlink.html
says of the unlink(2) parameter: "The path argument shall not name a
directory unless the process has appropriate privileges and the
implementation supports using unlink() on directories."  So we better
not do that..

--- >8 ---
Subject: [PATCH] tempfile: avoid directory cleanup race

The temporary directory created by mks_tempfile_dt() is deleted by first
deleting the file within, then truncating the filename strbuf and
passing the resulting string to rmdir(2).  When the cleanup routine is
invoked concurrently by a signal handler we can end up passing the now
truncated string to unlink(2), however, which could cause problems on
some systems.

Avoid that issue by remembering the directory name separately.  This way
the paths stay unchanged.  A signal handler can still race with normal
cleanup, but deleting the same files and directories twice is harmless.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 tempfile.c | 14 ++++++--------
 tempfile.h |  2 +-
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/tempfile.c b/tempfile.c
index 2024c82691..7414c81e31 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -59,14 +59,11 @@ static VOLATILE_LIST_HEAD(tempfile_list);
 static void remove_template_directory(struct tempfile *tempfile,
 				      int in_signal_handler)
 {
-	if (tempfile->directorylen > 0 &&
-	    tempfile->directorylen < tempfile->filename.len &&
-	    tempfile->filename.buf[tempfile->directorylen] == '/') {
-		strbuf_setlen(&tempfile->filename, tempfile->directorylen);
+	if (tempfile->directory) {
 		if (in_signal_handler)
-			rmdir(tempfile->filename.buf);
+			rmdir(tempfile->directory);
 		else
-			rmdir_or_warn(tempfile->filename.buf);
+			rmdir_or_warn(tempfile->directory);
 	}
 }

@@ -115,7 +112,7 @@ static struct tempfile *new_tempfile(void)
 	tempfile->owner = 0;
 	INIT_LIST_HEAD(&tempfile->list);
 	strbuf_init(&tempfile->filename, 0);
-	tempfile->directorylen = 0;
+	tempfile->directory = NULL;
 	return tempfile;
 }

@@ -141,6 +138,7 @@ static void deactivate_tempfile(struct tempfile *tempfile)
 {
 	tempfile->active = 0;
 	strbuf_release(&tempfile->filename);
+	free(tempfile->directory);
 	volatile_list_del(&tempfile->list);
 	free(tempfile);
 }
@@ -254,7 +252,7 @@ struct tempfile *mks_tempfile_dt(const char *directory_template,

 	tempfile = new_tempfile();
 	strbuf_swap(&tempfile->filename, &sb);
-	tempfile->directorylen = directorylen;
+	tempfile->directory = xmemdupz(tempfile->filename.buf, directorylen);
 	tempfile->fd = fd;
 	activate_tempfile(tempfile);
 	return tempfile;
diff --git a/tempfile.h b/tempfile.h
index d7804a214a..5b9e8743dd 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -82,7 +82,7 @@ struct tempfile {
 	FILE *volatile fp;
 	volatile pid_t owner;
 	struct strbuf filename;
-	size_t directorylen;
+	char *directory;
 };

 /*
--
2.37.2


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: curiosities with tempfile.active
  2022-08-26 22:46 ` René Scharfe
@ 2022-08-27 13:02   ` Jeff King
  2022-08-27 21:47     ` Chris Torek
  2022-08-30 19:40   ` [PATCH 0/2] cleaning up tempfile active flag Jeff King
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff King @ 2022-08-27 13:02 UTC (permalink / raw)
  To: René Scharfe; +Cc: Johannes Schindelin, git

On Sat, Aug 27, 2022 at 12:46:29AM +0200, René Scharfe wrote:

> https://pubs.opengroup.org/onlinepubs/9699919799/functions/unlink.html
> says of the unlink(2) parameter: "The path argument shall not name a
> directory unless the process has appropriate privileges and the
> implementation supports using unlink() on directories."  So we better
> not do that..

Yeah, I saw that. It's a bit vague, and if the call returns ENOSYS or
EISDIR, that would be perfectly fine. It's the "what happens on the
implementations that do support it..." part that I'm more worried about. :)

That said...

> --- >8 ---
> Subject: [PATCH] tempfile: avoid directory cleanup race
> 
> The temporary directory created by mks_tempfile_dt() is deleted by first
> deleting the file within, then truncating the filename strbuf and
> passing the resulting string to rmdir(2).  When the cleanup routine is
> invoked concurrently by a signal handler we can end up passing the now
> truncated string to unlink(2), however, which could cause problems on
> some systems.
> 
> Avoid that issue by remembering the directory name separately.  This way
> the paths stay unchanged.  A signal handler can still race with normal
> cleanup, but deleting the same files and directories twice is harmless.

Right, I'm a little embarrassed I didn't immediately come up with this
same fix. This is definitely the right thing to do. The more we can
treat data from signal-handlers are write-once, the better.

There's a slight extra memory cost to store the directory name twice,
but it's a drop in the bucket overall.

>  tempfile.c | 14 ++++++--------
>  tempfile.h |  2 +-

The patch itself looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: curiosities with tempfile.active
  2022-08-27 13:02   ` Jeff King
@ 2022-08-27 21:47     ` Chris Torek
  2022-08-30 18:56       ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Chris Torek @ 2022-08-27 21:47 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, Johannes Schindelin, Git List

On Sat, Aug 27, 2022 at 6:05 AM Jeff King <peff@peff.net> wrote:
> Yeah, I saw that. It's a bit vague, and if the call returns ENOSYS or
> EISDIR, that would be perfectly fine. It's the "what happens on the
> implementations that do support it..." part that I'm more worried about. :)

The history here is that pre-4.2BSD, Unix systems had no mkdir
system call. You used mknod() to make a truly empty directory and
the link() to create the "." and ".." entries within it, and all three of
these operations were restricted to the super-user.  There was no
rmdir either, so again, unlink() as the super-user was permitted to
do the job (with three calls to unlink the "." and ".." entries first and
then remove the directory).

Unlinking a directory when it still contains "." leaves the link count
at 1 and there's no GC, so it sits around occupying an inode.

If you have a mkdir() system call and don't need backwards
compatibility, you get to have these return EISDIR errors...

Chris

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: curiosities with tempfile.active
  2022-08-27 21:47     ` Chris Torek
@ 2022-08-30 18:56       ` Jeff King
  0 siblings, 0 replies; 8+ messages in thread
From: Jeff King @ 2022-08-30 18:56 UTC (permalink / raw)
  To: Chris Torek; +Cc: René Scharfe, Johannes Schindelin, Git List

On Sat, Aug 27, 2022 at 02:47:45PM -0700, Chris Torek wrote:

> On Sat, Aug 27, 2022 at 6:05 AM Jeff King <peff@peff.net> wrote:
> > Yeah, I saw that. It's a bit vague, and if the call returns ENOSYS or
> > EISDIR, that would be perfectly fine. It's the "what happens on the
> > implementations that do support it..." part that I'm more worried about. :)
> 
> The history here is that pre-4.2BSD, Unix systems had no mkdir
> system call. You used mknod() to make a truly empty directory and
> the link() to create the "." and ".." entries within it, and all three of
> these operations were restricted to the super-user.  There was no
> rmdir either, so again, unlink() as the super-user was permitted to
> do the job (with three calls to unlink the "." and ".." entries first and
> then remove the directory).
> 
> Unlinking a directory when it still contains "." leaves the link count
> at 1 and there's no GC, so it sits around occupying an inode.

Thanks, that matches the sense of unease I had in the back of my mind. I
seem to recall that maybe older versions of SunOS exhibited this, but
that feels like a lifetime ago. At any rate, we should avoid that
unlink() call, and René's patch neatly does so.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 0/2] cleaning up tempfile active flag
  2022-08-26 22:46 ` René Scharfe
  2022-08-27 13:02   ` Jeff King
@ 2022-08-30 19:40   ` Jeff King
  2022-08-30 19:45     ` [PATCH 1/2] tempfile: drop " Jeff King
  2022-08-30 19:46     ` [PATCH 2/2] tempfile: update comment describing state transitions Jeff King
  1 sibling, 2 replies; 8+ messages in thread
From: Jeff King @ 2022-08-30 19:40 UTC (permalink / raw)
  To: René Scharfe; +Cc: Johannes Schindelin, git

On Sat, Aug 27, 2022 at 12:46:29AM +0200, René Scharfe wrote:

> Subject: [PATCH] tempfile: avoid directory cleanup race

Thanks again for this patch. With it, I think it's reasonable to do the
following cleanup on top.

  [1/2]: tempfile: drop active flag
  [2/2]: tempfile: update comment describing state transitions

 tempfile.c | 28 ++++++----------------------
 tempfile.h |  3 +--
 2 files changed, 7 insertions(+), 24 deletions(-)

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH 1/2] tempfile: drop active flag
  2022-08-30 19:40   ` [PATCH 0/2] cleaning up tempfile active flag Jeff King
@ 2022-08-30 19:45     ` Jeff King
  2022-08-30 19:46     ` [PATCH 2/2] tempfile: update comment describing state transitions Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2022-08-30 19:45 UTC (permalink / raw)
  To: René Scharfe; +Cc: Johannes Schindelin, git

Our tempfile struct contains an "active" flag. Long ago, this flag was
important: tempfile structs were always allocated for the lifetime of
the program and added to a global linked list, and the active flag was
what told us whether a struct's tempfile needed to be cleaned up on
exit.

But since 422a21c6a0 (tempfile: remove deactivated list entries,
2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we actually remove items from the list, and the active flag
is generally always set to true for any allocated struct. We set it to
true in all of the creation functions, and in the normal code flow it
becomes false only in deactivate_tempfile(), which then immediately
frees the struct.

So the flag isn't performing that role anymore, and in fact makes things
more confusing. Dscho noted that delete_tempfile() is a noop for an
inactive struct. Since 076aa2cbda taught it to free the struct when
deactivating, we'd leak any struct whose active flag is unset. But in
practice it's not a leak, because again, we'll free when we unset the
flag, and never see the allocated-but-inactive state.

Can we just get rid of the flag? The answer is yes, but it requires
looking at a few other spots:

  1. I said above that the flag only becomes false before we deallocate,
     but there's one exception: when we call remove_tempfiles() from a
     signal or atexit handler, we unset the active flag as we remove
     each file. This isn't important for delete_tempfile(), as nobody
     would call it anymore, since we're exiting.

     It does in theory provide us some protection against racily
     double-removing a tempfile. If we receive a second signal while we
     are already in the cleanup routines, we'll start the cleanup loop
     again, and may visit the same tempfile. But this race already
     exists, because calling unlink() and unsetting the active flag
     aren't atomic! And it's OK in practice, because unlink() is
     idempotent (barring the unlikely event that some other process
     chooses our exact temp filename in that instant).

     So dropping the active flag widens the race a bit, but it was
     already there, and is fairly harmless in practice. If we really
     care about addressing it, the right thing is probably to block
     further signals while we're doing our cleanup (which we could
     actually do atomically).

  2. The active flag is declared as "volatile sig_atomic_t". The idea is
     that it's the final bit that gets set to tell the cleanup routines
     that the tempfile is ready to be used (or not used), and it's safe
     to receive a signal racing with regular code which adds or removes
     a tempfile from the list.

     In practice, I don't think this is buying us anything. The presence
     on the linked list is really what tells the cleanup routines to
     look at the struct. That is already marked as "volatile". It's not
     a sig_atomic_t, so it's possible that we could see a sheared write
     there as an entry is added or removed. But that is true of the
     current code, too! Before we can even look at the "active" flag,
     we'd have to follow a link to the struct itself. If we see a
     sheared write in the pointer to the struct, then we'll look at
     garbage memory anyway, and there's not much we can do.

This patch removes the active flag entirely, using presence on the
global linked list as an indicator that a tempfile ought to be cleaned
up. We are already careful to add to the list as the final step in
activating. On deactivation, we'll make sure to remove from the list as
the first step, before freeing any fields. The use of the volatile
keyword should mean that those things happen in the expected order.

Signed-off-by: Jeff King <peff@peff.net>
---
I recognize this is a high-risk cleanup, as there's a lot of hand-waving
about volatile and sig_atomic_t. But I'm fairly convinced it's not
making anything materially worse. And while the number of dropped lines
is small, I hope it makes this tricky code easier to reason about.

 tempfile.c | 10 +---------
 tempfile.h |  3 +--
 2 files changed, 2 insertions(+), 11 deletions(-)

diff --git a/tempfile.c b/tempfile.c
index 7414c81e31..29e61077a8 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -86,8 +86,6 @@ static void remove_tempfiles(int in_signal_handler)
 		else
 			unlink_or_warn(p->filename.buf);
 		remove_template_directory(p, in_signal_handler);
-
-		p->active = 0;
 	}
 }
 
@@ -108,7 +106,6 @@ static struct tempfile *new_tempfile(void)
 	struct tempfile *tempfile = xmalloc(sizeof(*tempfile));
 	tempfile->fd = -1;
 	tempfile->fp = NULL;
-	tempfile->active = 0;
 	tempfile->owner = 0;
 	INIT_LIST_HEAD(&tempfile->list);
 	strbuf_init(&tempfile->filename, 0);
@@ -120,9 +117,6 @@ static void activate_tempfile(struct tempfile *tempfile)
 {
 	static int initialized;
 
-	if (is_tempfile_active(tempfile))
-		BUG("activate_tempfile called for active object");
-
 	if (!initialized) {
 		sigchain_push_common(remove_tempfiles_on_signal);
 		atexit(remove_tempfiles_on_exit);
@@ -131,15 +125,13 @@ static void activate_tempfile(struct tempfile *tempfile)
 
 	volatile_list_add(&tempfile->list, &tempfile_list);
 	tempfile->owner = getpid();
-	tempfile->active = 1;
 }
 
 static void deactivate_tempfile(struct tempfile *tempfile)
 {
-	tempfile->active = 0;
+	volatile_list_del(&tempfile->list);
 	strbuf_release(&tempfile->filename);
 	free(tempfile->directory);
-	volatile_list_del(&tempfile->list);
 	free(tempfile);
 }
 
diff --git a/tempfile.h b/tempfile.h
index 5b9e8743dd..d0413af733 100644
--- a/tempfile.h
+++ b/tempfile.h
@@ -77,7 +77,6 @@
 
 struct tempfile {
 	volatile struct volatile_list_head list;
-	volatile sig_atomic_t active;
 	volatile int fd;
 	FILE *volatile fp;
 	volatile pid_t owner;
@@ -221,7 +220,7 @@ FILE *fdopen_tempfile(struct tempfile *tempfile, const char *mode);
 
 static inline int is_tempfile_active(struct tempfile *tempfile)
 {
-	return tempfile && tempfile->active;
+	return !!tempfile;
 }
 
 /*
-- 
2.37.3.1051.g85dc4064ac


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH 2/2] tempfile: update comment describing state transitions
  2022-08-30 19:40   ` [PATCH 0/2] cleaning up tempfile active flag Jeff King
  2022-08-30 19:45     ` [PATCH 1/2] tempfile: drop " Jeff King
@ 2022-08-30 19:46     ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Jeff King @ 2022-08-30 19:46 UTC (permalink / raw)
  To: René Scharfe; +Cc: Johannes Schindelin, git

Back when 1a9d15db25 (tempfile: a new module for handling temporary
files, 2015-08-10) added this comment, tempfile structs were held in
memory for the life of a process, and there were various guarantees
about which fields were valid in which states.

Since 422a21c6a0 (tempfile: remove deactivated list entries, 2017-09-05)
and 076aa2cbda (tempfile: auto-allocate tempfiles on heap, 2017-09-05),
the flow is quite different: objects come and go from the list, and
inactive ones are deallocated. And the previous commit removed the
"active" flag from the struct entirely.

Let's bring the comment up to date with the current code.

Signed-off-by: Jeff King <peff@peff.net>
---
This could arguably come first in the series, minus the bit about the
"active" flag, since it is really putting the finishing touches on those
old commits from 2017. But it felt easier to reason about the end state
after dropping the flag entirely.

 tempfile.c | 18 +++++-------------
 1 file changed, 5 insertions(+), 13 deletions(-)

diff --git a/tempfile.c b/tempfile.c
index 29e61077a8..e27048f970 100644
--- a/tempfile.c
+++ b/tempfile.c
@@ -14,16 +14,14 @@
  *
  * The possible states of a `tempfile` object are as follows:
  *
- * - Uninitialized. In this state the object's `on_list` field must be
- *   zero but the rest of its contents need not be initialized. As
- *   soon as the object is used in any way, it is irrevocably
- *   registered in `tempfile_list`, and `on_list` is set.
+ * - Inactive/unallocated. The only way to get a tempfile is via a creation
+ *   function like create_tempfile(). Once allocated, the tempfile is on the
+ *   global tempfile_list and considered active.
  *
  * - Active, file open (after `create_tempfile()` or
  *   `reopen_tempfile()`). In this state:
  *
  *   - the temporary file exists
- *   - `active` is set
  *   - `filename` holds the filename of the temporary file
  *   - `fd` holds a file descriptor open for writing to it
  *   - `fp` holds a pointer to an open `FILE` object if and only if
@@ -35,14 +33,8 @@
  *   `fd` is -1, and `fp` is `NULL`.
  *
  * - Inactive (after `delete_tempfile()`, `rename_tempfile()`, or a
- *   failed attempt to create a temporary file). In this state:
- *
- *   - `active` is unset
- *   - `filename` is empty (usually, though there are transitory
- *     states in which this condition doesn't hold). Client code should
- *     *not* rely on the filename being empty in this state.
- *   - `fd` is -1 and `fp` is `NULL`
- *   - the object is removed from `tempfile_list` (but could be used again)
+ *   failed attempt to create a temporary file). The struct is removed from
+ *   the global tempfile_list and deallocated.
  *
  * A temporary file is owned by the process that created it. The
  * `tempfile` has an `owner` field that records the owner's PID. This
-- 
2.37.3.1051.g85dc4064ac

^ permalink raw reply related	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2022-08-30 19:46 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-24  9:35 curiosities with tempfile.active Jeff King
2022-08-26 22:46 ` René Scharfe
2022-08-27 13:02   ` Jeff King
2022-08-27 21:47     ` Chris Torek
2022-08-30 18:56       ` Jeff King
2022-08-30 19:40   ` [PATCH 0/2] cleaning up tempfile active flag Jeff King
2022-08-30 19:45     ` [PATCH 1/2] tempfile: drop " Jeff King
2022-08-30 19:46     ` [PATCH 2/2] tempfile: update comment describing state transitions Jeff King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).