All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] setup: don't die if realpath(3) fails on getcwd(3)
@ 2022-05-19 23:39 Kevin Locke
  2022-05-20 18:38 ` Junio C Hamano
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Kevin Locke @ 2022-05-19 23:39 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Prior to Git 2.35.0, git could be run from an inaccessible working
directory so long as the git repository specified by options and/or
environment variables was accessible.  For example:

    git init repo
    mkdir -p a/b
    cd a/b
    chmod u-x ..
    git -C "${PWD%/a/b}/repo" status

If this example seems a bit contrived, consider running with the
repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
without ensuring the working directory is accessible by that user.

The code added by e6f8861bd4 to preserve the working directory attempts
to normalize the path using strbuf_realpath().  If that fails, as in the
case above, it is treated as a fatal error.  To avoid this, we can
continue after the error.  At worst, git will fail to detect that the
working directory is inside the worktree, resulting in the pre-2.35.0
behavior of not preserving the working directory.

Fixes: e6f8861bd4 ("setup: introduce startup_info->original_cwd")
Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
---
 setup.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/setup.c b/setup.c
index a7b36f3ffb..fb68caaae0 100644
--- a/setup.c
+++ b/setup.c
@@ -458,11 +458,13 @@ static void setup_original_cwd(void)
 	 *     not startup_info->original_cwd.
 	 */
 
-	/* Normalize the directory */
-	strbuf_realpath(&tmp, tmp_original_cwd, 1);
-	free((char*)tmp_original_cwd);
+	/* Try to normalize the directory.  Fails if ancestor not readable. */
+	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
+		free((char*)tmp_original_cwd);
+		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
+	} else
+		startup_info->original_cwd = tmp_original_cwd;
 	tmp_original_cwd = NULL;
-	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
 
 	/*
 	 * Get our worktree; we only protect the current working directory
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-19 23:39 [PATCH] setup: don't die if realpath(3) fails on getcwd(3) Kevin Locke
@ 2022-05-20 18:38 ` Junio C Hamano
  2022-05-21  0:14 ` Elijah Newren
  2022-05-21 13:53 ` [PATCH v2] " Kevin Locke
  2 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2022-05-20 18:38 UTC (permalink / raw)
  To: Kevin Locke; +Cc: git, Elijah Newren

Kevin Locke <kevin@kevinlocke.name> writes:

> Prior to Git 2.35.0, git could be run from an inaccessible working
> directory so long as the git repository specified by options and/or
> environment variables was accessible.  For example:
>
>     git init repo
>     mkdir -p a/b
>     cd a/b
>     chmod u-x ..
>     git -C "${PWD%/a/b}/repo" status
>
> If this example seems a bit contrived, consider running with the
> repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
> without ensuring the working directory is accessible by that user.
>
> The code added by e6f8861bd4 to preserve the working directory attempts
> to normalize the path using strbuf_realpath().  If that fails, as in the
> case above, it is treated as a fatal error.

Thanks.  As this thing is primarily for safety, I am inclined to say
that I'd prefer to see it error out when we cannot figure out the
necessary info to keep that safety promise to the users, than using
an unnormalized value as a stand-in and letting the logic that is
designed to be fed a normalized value do random (and possibly wrong)
things.  But see below.

> Fixes: e6f8861bd4 ("setup: introduce startup_info->original_cwd")

AFAIK, we do not use this kind of trailer in this project.  Casting
in stone the claim that this "fixes" would be embarrassing when it
turns out that it does not fix it (or even worse, breaks it).

> Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
> ---
>  setup.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/setup.c b/setup.c
> index a7b36f3ffb..fb68caaae0 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -458,11 +458,13 @@ static void setup_original_cwd(void)
>  	 *     not startup_info->original_cwd.
>  	 */
>  
> -	/* Normalize the directory */
> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -	free((char*)tmp_original_cwd);
> +	/* Try to normalize the directory.  Fails if ancestor not readable. */
> +	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +		free((char*)tmp_original_cwd);
> +		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +	} else
> +		startup_info->original_cwd = tmp_original_cwd;

I am OK to loosen the "we try not to remove the original cwd" logic
so that it does not kick in when we cannot figure out the original
cwd in the first place.  But if that is the case, then I'd rather
see "startrup_info->original_cwd set to NULL" as the signal that we
are in such a situation.

Elijah, what's your take on this change?

>  	tmp_original_cwd = NULL;
> -	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>  
>  	/*
>  	 * Get our worktree; we only protect the current working directory

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-19 23:39 [PATCH] setup: don't die if realpath(3) fails on getcwd(3) Kevin Locke
  2022-05-20 18:38 ` Junio C Hamano
@ 2022-05-21  0:14 ` Elijah Newren
  2022-05-21 13:02   ` Kevin Locke
  2022-05-21 13:53 ` [PATCH v2] " Kevin Locke
  2 siblings, 1 reply; 22+ messages in thread
From: Elijah Newren @ 2022-05-21  0:14 UTC (permalink / raw)
  To: Kevin Locke; +Cc: Git Mailing List

Hi,

Thanks for sending along a fix!

On Thu, May 19, 2022 at 4:39 PM Kevin Locke <kevin@kevinlocke.name> wrote:
>
> Prior to Git 2.35.0, git could be run from an inaccessible working
> directory so long as the git repository specified by options and/or
> environment variables was accessible.  For example:

Ah, good point.

>     git init repo
>     mkdir -p a/b
>     cd a/b
>     chmod u-x ..
>     git -C "${PWD%/a/b}/repo" status
>
> If this example seems a bit contrived, consider running with the
> repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
> without ensuring the working directory is accessible by that user.
>
> The code added by e6f8861bd4 to preserve the working directory attempts

When referencing commits in commit messages, this project prefers that the first
reference in the commit message use the output from `git log --no-walk
--pretty=reference $HASH` rather than just $HASH.
So, here, it'd be

    The code added by e6f8861bd4 (setup: introduce
startup_info->original_cwd, 2021-12-09) to preserve the ...

> to normalize the path using strbuf_realpath().  If that fails, as in the
> case above, it is treated as a fatal error.  To avoid this, we can
> continue after the error.  At worst, git will fail to detect that the
> working directory is inside the worktree, resulting in the pre-2.35.0
> behavior of not preserving the working directory.
>
> Fixes: e6f8861bd4 ("setup: introduce startup_info->original_cwd")

I was slightly surprised to see this tag, but it appears others in
git.git have used it, so it must just be me that's not familiar with
it.

> Signed-off-by: Kevin Locke <kevin@kevinlocke.name>

Nicely explained commit message.

> ---
>  setup.c | 10 ++++++----
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/setup.c b/setup.c
> index a7b36f3ffb..fb68caaae0 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -458,11 +458,13 @@ static void setup_original_cwd(void)
>          *     not startup_info->original_cwd.
>          */
>
> -       /* Normalize the directory */
> -       strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -       free((char*)tmp_original_cwd);
> +       /* Try to normalize the directory.  Fails if ancestor not readable. */

Is that the only reason it fails?  I'm unsure if the second half of
the comment helps there.

> +       if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +               free((char*)tmp_original_cwd);
> +               startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +       } else

git.git coding style: if either of the if/else blocks use braces, use
braces for both

> +               startup_info->original_cwd = tmp_original_cwd;

tmp_original_cwd is not required to be normalized, and there are very
strong normalization assumptions on startup_info->original_cwd.  While
a non-normalized value would work to get pre-2.35.0 behavior, it's by
accident rather than design, and might be confusing for others to
later reason about.  Also, I think it might be possible for
tmp_original_cwd to still be NULL, and some of the immediately
following code I believe will assume it's operating with a non-NULL
value, so you'd need to skip that stuff.  I think the else block here
should use "goto no_prevention_needed", as the no_prevention_needed
block will handle setting startup_info->original_cwd to NULL for you,
and get you the pre-2.35.0 behavior.

>         tmp_original_cwd = NULL;

After changing the above else block to a goto, you may also want to
copy this to the no_prevention_needed block or else copy it to the
else portion of the block above (just before the goto you add).

> -       startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>
>         /*
>          * Get our worktree; we only protect the current working directory
> --
> 2.35.1

Again, thanks for sending in a bug fix!  Looking forward to seeing
your revised patch.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-21  0:14 ` Elijah Newren
@ 2022-05-21 13:02   ` Kevin Locke
  2022-05-23 18:44     ` Derrick Stolee
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Locke @ 2022-05-21 13:02 UTC (permalink / raw)
  To: Elijah Newren, Junio C Hamano; +Cc: Git Mailing List

Thanks Elijah and Junio,

I appreciate your reviews.  I agree with all of your suggestions. I'll
send a revised patch that incorporates the suggested changes shortly.

Cheers,
Kevin


On Fri, 2022-05-20 at 17:14 -0700, Elijah Newren wrote:
> On Thu, May 19, 2022 at 4:39 PM Kevin Locke <kevin@kevinlocke.name> wrote:
>>     git init repo
>>     mkdir -p a/b
>>     cd a/b
>>     chmod u-x ..
>>     git -C "${PWD%/a/b}/repo" status
>>
>> If this example seems a bit contrived, consider running with the
>> repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
>> without ensuring the working directory is accessible by that user.
>>
>> The code added by e6f8861bd4 to preserve the working directory attempts
> 
> When referencing commits in commit messages, this project prefers that the first
> reference in the commit message use the output from `git log --no-walk
> --pretty=reference $HASH` rather than just $HASH.
> So, here, it'd be
> 
>     The code added by e6f8861bd4 (setup: introduce
> startup_info->original_cwd, 2021-12-09) to preserve the ...
> 
>> to normalize the path using strbuf_realpath().  If that fails, as in the
>> case above, it is treated as a fatal error.  To avoid this, we can
>> continue after the error.  At worst, git will fail to detect that the
>> working directory is inside the worktree, resulting in the pre-2.35.0
>> behavior of not preserving the working directory.
>>
>> Fixes: e6f8861bd4 ("setup: introduce startup_info->original_cwd")
> 
> I was slightly surprised to see this tag, but it appears others in
> git.git have used it, so it must just be me that's not familiar with
> it.
> 
>> Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
> 
> Nicely explained commit message.
> 
>> ---
>>  setup.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>>
>> diff --git a/setup.c b/setup.c
>> index a7b36f3ffb..fb68caaae0 100644
>> --- a/setup.c
>> +++ b/setup.c
>> @@ -458,11 +458,13 @@ static void setup_original_cwd(void)
>>          *     not startup_info->original_cwd.
>>          */
>>
>> -       /* Normalize the directory */
>> -       strbuf_realpath(&tmp, tmp_original_cwd, 1);
>> -       free((char*)tmp_original_cwd);
>> +       /* Try to normalize the directory.  Fails if ancestor not readable. */
> 
> Is that the only reason it fails?  I'm unsure if the second half of
> the comment helps there.
> 
>> +       if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
>> +               free((char*)tmp_original_cwd);
>> +               startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>> +       } else
> 
> git.git coding style: if either of the if/else blocks use braces, use
> braces for both
> 
>> +               startup_info->original_cwd = tmp_original_cwd;
> 
> tmp_original_cwd is not required to be normalized, and there are very
> strong normalization assumptions on startup_info->original_cwd.  While
> a non-normalized value would work to get pre-2.35.0 behavior, it's by
> accident rather than design, and might be confusing for others to
> later reason about.  Also, I think it might be possible for
> tmp_original_cwd to still be NULL, and some of the immediately
> following code I believe will assume it's operating with a non-NULL
> value, so you'd need to skip that stuff.  I think the else block here
> should use "goto no_prevention_needed", as the no_prevention_needed
> block will handle setting startup_info->original_cwd to NULL for you,
> and get you the pre-2.35.0 behavior.
> 
>>         tmp_original_cwd = NULL;
> 
> After changing the above else block to a goto, you may also want to
> copy this to the no_prevention_needed block or else copy it to the
> else portion of the block above (just before the goto you add).

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-19 23:39 [PATCH] setup: don't die if realpath(3) fails on getcwd(3) Kevin Locke
  2022-05-20 18:38 ` Junio C Hamano
  2022-05-21  0:14 ` Elijah Newren
@ 2022-05-21 13:53 ` Kevin Locke
  2022-05-23 18:57   ` Derrick Stolee
  2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
  2 siblings, 2 replies; 22+ messages in thread
From: Kevin Locke @ 2022-05-21 13:53 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Junio C Hamano

Prior to Git 2.35.0, git could be run from an inaccessible working
directory so long as the git repository specified by options and/or
environment variables was accessible.  For example:

    git init repo
    mkdir -p a/b
    cd a/b
    chmod u-x ..
    git -C "${PWD%/a/b}/repo" status

If this example seems a bit contrived, consider running with the
repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
without ensuring the working directory is accessible by that user.

The code added by e6f8861bd4 ("setup: introduce
startup_info->original_cwd") to preserve the working directory attempts
to normalize the path using strbuf_realpath().  If that fails, as in the
case above, it is treated as a fatal error.

This commit treats strbuf_realpath() errors as non-fatal.  If an error
occurs, setup_original_cwd() will continue without applying removal
prevention for cwd, resulting in the pre-2.35.0 behavior.  The risk
should be minimal, since git will not operate on a repository with
inaccessible ancestors, this behavior is only known to occur when cwd is
a descendant of the repository, an ancestor of cwd is inaccessible, and
no ancestors of the repository are inaccessible.

Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
---

Changes since v1:
 * Set startup_info->original_cwd = NULL when strbuf_realpath() fails,
   rather than setting it to the un-normalized path.
 * Add a trace message when realpath fails to aid debugging.
 * Remove potential realpath failure cause from comment before it.
 * Improve format for reference to e6f8861bd4 in commit message.
 * Clarify when the pre-2.35.0 behavior may occur as a result of this
   commit in the commit message.
 * Remove 'Fixes:' tag from commit message.

 setup.c | 15 ++++++++++-----
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/setup.c b/setup.c
index a7b36f3ffb..22430a663c 100644
--- a/setup.c
+++ b/setup.c
@@ -458,11 +458,16 @@ static void setup_original_cwd(void)
 	 *     not startup_info->original_cwd.
 	 */
 
-	/* Normalize the directory */
-	strbuf_realpath(&tmp, tmp_original_cwd, 1);
-	free((char*)tmp_original_cwd);
-	tmp_original_cwd = NULL;
-	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
+	/* Try to normalize the directory. */
+	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
+		free((char*)tmp_original_cwd);
+		tmp_original_cwd = NULL;
+		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
+	} else {
+		trace_printf("realpath(cwd) failed: %s\n", strerror(errno));
+		tmp_original_cwd = NULL;
+		goto no_prevention_needed;
+	}
 
 	/*
 	 * Get our worktree; we only protect the current working directory
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-21 13:02   ` Kevin Locke
@ 2022-05-23 18:44     ` Derrick Stolee
  0 siblings, 0 replies; 22+ messages in thread
From: Derrick Stolee @ 2022-05-23 18:44 UTC (permalink / raw)
  To: Kevin Locke, Elijah Newren, Junio C Hamano, Git Mailing List

On 5/21/22 9:02 AM, Kevin Locke wrote:
> Thanks Elijah and Junio,
> 
> I appreciate your reviews.  I agree with all of your suggestions. I'll
> send a revised patch that incorporates the suggested changes shortly.
> 
> Cheers,
> Kevin
> 
> 
> On Fri, 2022-05-20 at 17:14 -0700, Elijah Newren wrote:
>> On Thu, May 19, 2022 at 4:39 PM Kevin Locke <kevin@kevinlocke.name> wrote:

Hi Kevin,

Welcome to the Git mailing list!

One minor thing to keep in mind is that we try to avoid
top-posting.

A few other things that were mentioned in this v1 would
be noticed by reading a few documents in the tree:

 * Documentation/CodingGuidelines
 * Documentation/MyFirstContribution.txt
 * Documentation/SubmittingPatches

There's A LOT in there, so we all forget some of the
things some of the time.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-21 13:53 ` [PATCH v2] " Kevin Locke
@ 2022-05-23 18:57   ` Derrick Stolee
  2022-05-24 14:02     ` Kevin Locke
  2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
  1 sibling, 1 reply; 22+ messages in thread
From: Derrick Stolee @ 2022-05-23 18:57 UTC (permalink / raw)
  To: Kevin Locke, git; +Cc: Elijah Newren, Junio C Hamano

On 5/21/22 9:53 AM, Kevin Locke wrote:
> -	/* Normalize the directory */
> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -	free((char*)tmp_original_cwd);
> -	tmp_original_cwd = NULL;
> -	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +	/* Try to normalize the directory. */
> +	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {

I had to look it up, but this last argument going from 1 to 0
is "die_on_error", so it makes sense why it wasn't checked by
an 'if' earlier _and_ why it needs to change now.

> +		free((char*)tmp_original_cwd);

Hm. I'm never a fan of this casting, but it existed before. It's
because tmp_original_cwd is exposed globally in cache.h, which
is _really widely_. However, there are only two uses: setup.c,
which defines it, and common-main.c, which initializes it during
process startup.

The following diff could apply before your commit, removing this
use of "const char *", but maybe it doesn't fit normal Git
coding guidelines (putting the extern directly in a *.c file):

--- >8 ---

diff --git a/cache.h b/cache.h
index aaf334e2aa4..ce9cd6fa3f0 100644
--- a/cache.h
+++ b/cache.h
@@ -1797,7 +1797,6 @@ struct startup_info {
 	const char *original_cwd;
 };
 extern struct startup_info *startup_info;
-extern const char *tmp_original_cwd;
 
 /* merge.c */
 struct commit_list;
diff --git a/common-main.c b/common-main.c
index 29fb7452f8a..e472258b83b 100644
--- a/common-main.c
+++ b/common-main.c
@@ -23,6 +23,8 @@ static void restore_sigpipe_to_default(void)
 	signal(SIGPIPE, SIG_DFL);
 }
 
+extern char *tmp_original_cwd;
+
 int main(int argc, const char **argv)
 {
 	int result;
diff --git a/setup.c b/setup.c
index 04ce33cdcd4..86986317490 100644
--- a/setup.c
+++ b/setup.c
@@ -12,7 +12,7 @@ static int work_tree_config_is_bogus;
 
 static struct startup_info the_startup_info;
 struct startup_info *startup_info = &the_startup_info;
-const char *tmp_original_cwd;
+char *tmp_original_cwd;
 
 /*
  * The input parameter must contain an absolute path, and it must already be
@@ -459,7 +459,7 @@ static void setup_original_cwd(void)
 
 	/* Normalize the directory */
 	strbuf_realpath(&tmp, tmp_original_cwd, 1);
-	free((char*)tmp_original_cwd);
+	free(tmp_original_cwd);
 	tmp_original_cwd = NULL;
 	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
 
--- >8 ---

> +		tmp_original_cwd = NULL;
> +		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +	} else {
> +		trace_printf("realpath(cwd) failed: %s\n", strerror(errno));

It could also be helpful to include a trace2 output, since
most modern tracing uses that mechanism:

		trace2_data_string("setup", the_repository,
				   "realpath-path", tmp_original_cwd);
		trace2_data_string("setup", the_repository,
				   "realpath-failure", strerror(errno));

But this is also unlikely to be necessary. We can instruct
a user to rerun their command with GIT_TRACE=1 if we see
this error in the wild.

> +		tmp_original_cwd = NULL;
> +		goto no_prevention_needed;
> +	}

So, I don't see a need for this patch to change before it
is merged. I'm curious to hear thoughts on the diff I put
inline, though.

Thanks,
-Stolee

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-23 18:57   ` Derrick Stolee
@ 2022-05-24 14:02     ` Kevin Locke
  2022-05-24 15:20       ` Elijah Newren
  0 siblings, 1 reply; 22+ messages in thread
From: Kevin Locke @ 2022-05-24 14:02 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Elijah Newren, Junio C Hamano

On Mon, 2022-05-23 at 14:57 -0400, Derrick Stolee wrote:
> On 5/21/22 9:53 AM, Kevin Locke wrote:
> > +		free((char*)tmp_original_cwd);
> 
> Hm. I'm never a fan of this casting, but it existed before. It's
> because tmp_original_cwd is exposed globally in cache.h, which
> is _really widely_. However, there are only two uses: setup.c,
> which defines it, and common-main.c, which initializes it during
> process startup.
> 
> The following diff could apply before your commit, removing this
> use of "const char *", but maybe it doesn't fit normal Git
> coding guidelines (putting the extern directly in a *.c file):
> 
> --- >8 ---
> 
> diff --git a/cache.h b/cache.h
> index aaf334e2aa4..ce9cd6fa3f0 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1797,7 +1797,6 @@ struct startup_info {
>  	const char *original_cwd;
>  };
>  extern struct startup_info *startup_info;
> -extern const char *tmp_original_cwd;
>  
>  /* merge.c */
>  struct commit_list;
> diff --git a/common-main.c b/common-main.c
> index 29fb7452f8a..e472258b83b 100644
> --- a/common-main.c
> +++ b/common-main.c
> @@ -23,6 +23,8 @@ static void restore_sigpipe_to_default(void)
>  	signal(SIGPIPE, SIG_DFL);
>  }
>  
> +extern char *tmp_original_cwd;
> +
>  int main(int argc, const char **argv)
>  {
>  	int result;
> diff --git a/setup.c b/setup.c
> index 04ce33cdcd4..86986317490 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -12,7 +12,7 @@ static int work_tree_config_is_bogus;
>  
>  static struct startup_info the_startup_info;
>  struct startup_info *startup_info = &the_startup_info;
> -const char *tmp_original_cwd;
> +char *tmp_original_cwd;
>  
>  /*
>   * The input parameter must contain an absolute path, and it must already be
> @@ -459,7 +459,7 @@ static void setup_original_cwd(void)
>  
>  	/* Normalize the directory */
>  	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -	free((char*)tmp_original_cwd);
> +	free(tmp_original_cwd);
>  	tmp_original_cwd = NULL;
>  	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>  
> --- >8 ---

This approach seems reasonable to me, as does casting to free().  It's
not clear to me which is preferable in this case.  How to balance the
trade-offs between exposing const interfaces, limiting (internal)
interfaces to headers, and avoiding casts might be worth discussing
and documenting a matter of project coding style.  `grep -rF 'free(('`
lists about 100 casts to free, suggesting the discussion may be
worthwhile.  Introducing a free_const() macro could be another option
to consider.

>> +		tmp_original_cwd = NULL;
>> +		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>> +	} else {
>> +		trace_printf("realpath(cwd) failed: %s\n", strerror(errno));
> 
> It could also be helpful to include a trace2 output, since
> most modern tracing uses that mechanism:
> 
> 		trace2_data_string("setup", the_repository,
> 				   "realpath-path", tmp_original_cwd);
> 		trace2_data_string("setup", the_repository,
> 				   "realpath-failure", strerror(errno));
> 
> But this is also unlikely to be necessary. We can instruct
> a user to rerun their command with GIT_TRACE=1 if we see
> this error in the wild.

That's a great idea.  I hadn't realized the difference between the
trace_* and trace2_* APIs until investigating as a result of your
suggestion.  Trace2 seems preferable for many reasons.  I'll send an
updated patch shortly.

Thanks for the review and feedback Stolee!

Cheers,
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v3] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-21 13:53 ` [PATCH v2] " Kevin Locke
  2022-05-23 18:57   ` Derrick Stolee
@ 2022-05-24 14:51   ` Kevin Locke
  2022-05-24 15:21     ` Elijah Newren
                       ` (2 more replies)
  1 sibling, 3 replies; 22+ messages in thread
From: Kevin Locke @ 2022-05-24 14:51 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Junio C Hamano, Derrick Stolee

Prior to Git 2.35.0, git could be run from an inaccessible working
directory so long as the git repository specified by options and/or
environment variables was accessible.  For example:

    git init repo
    mkdir -p a/b
    cd a/b
    chmod u-x ..
    git -C "${PWD%/a/b}/repo" status

If this example seems a bit contrived, consider running with the
repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
without ensuring the working directory is accessible by that user.

The code added by e6f8861bd4 ("setup: introduce
startup_info->original_cwd") to preserve the working directory attempts
to normalize the path using strbuf_realpath().  If that fails, as in the
case above, it is treated as a fatal error.

This commit treats strbuf_realpath() errors as non-fatal.  If an error
occurs, setup_original_cwd() will continue without applying removal
prevention for cwd, resulting in the pre-2.35.0 behavior.  The risk
should be minimal, since git will not operate on a repository with
inaccessible ancestors, this behavior is only known to occur when cwd is
a descendant of the repository, an ancestor of cwd is inaccessible, and
no ancestors of the repository are inaccessible.

Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
---

Notes:
Changes since v2:
 * Use trace2_data_string(), rather than trace_printf(), to report
   realpath failure.

Changes since v1:
 * Set startup_info->original_cwd = NULL when strbuf_realpath() fails,
   rather than setting it to the un-normalized path.
 * Add a trace message when realpath fails to aid debugging.
 * Remove potential realpath failure cause from comment before it.
 * Improve format for reference to e6f8861bd4 in commit message.
 * Clarify when the pre-2.35.0 behavior may occur as a result of this
   commit in the commit message.
 * Remove 'Fixes:' tag from commit message.

 setup.c | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/setup.c b/setup.c
index a7b36f3ffbf..38bd55cbac1 100644
--- a/setup.c
+++ b/setup.c
@@ -458,11 +458,19 @@ static void setup_original_cwd(void)
 	 *     not startup_info->original_cwd.
 	 */
 
-	/* Normalize the directory */
-	strbuf_realpath(&tmp, tmp_original_cwd, 1);
-	free((char*)tmp_original_cwd);
-	tmp_original_cwd = NULL;
-	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
+	/* Try to normalize the directory. */
+	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
+		free((char*)tmp_original_cwd);
+		tmp_original_cwd = NULL;
+		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
+	} else {
+		trace2_data_string("setup", the_repository,
+				   "realpath-path", tmp_original_cwd);
+		trace2_data_string("setup", the_repository,
+				   "realpath-failure", strerror(errno));
+		tmp_original_cwd = NULL;
+		goto no_prevention_needed;
+	}
 
 	/*
 	 * Get our worktree; we only protect the current working directory
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 14:02     ` Kevin Locke
@ 2022-05-24 15:20       ` Elijah Newren
  2022-05-24 17:38         ` Derrick Stolee
  2022-05-27  7:48         ` Ævar Arnfjörð Bjarmason
  0 siblings, 2 replies; 22+ messages in thread
From: Elijah Newren @ 2022-05-24 15:20 UTC (permalink / raw)
  To: Kevin Locke, Derrick Stolee, Git Mailing List, Elijah Newren,
	Junio C Hamano

On Tue, May 24, 2022 at 7:02 AM Kevin Locke <kevin@kevinlocke.name> wrote:
>
> On Mon, 2022-05-23 at 14:57 -0400, Derrick Stolee wrote:
> > On 5/21/22 9:53 AM, Kevin Locke wrote:
> > > +           free((char*)tmp_original_cwd);
> >
> > Hm. I'm never a fan of this casting, but it existed before. It's
> > because tmp_original_cwd is exposed globally in cache.h, which
> > is _really widely_. However, there are only two uses: setup.c,
> > which defines it, and common-main.c, which initializes it during
> > process startup.
> >
> > The following diff could apply before your commit, removing this
> > use of "const char *", but maybe it doesn't fit normal Git
> > coding guidelines (putting the extern directly in a *.c file):
> >
> > --- >8 ---
> >
> > diff --git a/cache.h b/cache.h
> > index aaf334e2aa4..ce9cd6fa3f0 100644
> > --- a/cache.h
> > +++ b/cache.h
> > @@ -1797,7 +1797,6 @@ struct startup_info {
> >       const char *original_cwd;
> >  };
> >  extern struct startup_info *startup_info;
> > -extern const char *tmp_original_cwd;
> >
> >  /* merge.c */
> >  struct commit_list;
> > diff --git a/common-main.c b/common-main.c
> > index 29fb7452f8a..e472258b83b 100644
> > --- a/common-main.c
> > +++ b/common-main.c
> > @@ -23,6 +23,8 @@ static void restore_sigpipe_to_default(void)
> >       signal(SIGPIPE, SIG_DFL);
> >  }
> >
> > +extern char *tmp_original_cwd;
> > +
> >  int main(int argc, const char **argv)
> >  {
> >       int result;
> > diff --git a/setup.c b/setup.c
> > index 04ce33cdcd4..86986317490 100644
> > --- a/setup.c
> > +++ b/setup.c
> > @@ -12,7 +12,7 @@ static int work_tree_config_is_bogus;
> >
> >  static struct startup_info the_startup_info;
> >  struct startup_info *startup_info = &the_startup_info;
> > -const char *tmp_original_cwd;
> > +char *tmp_original_cwd;
> >
> >  /*
> >   * The input parameter must contain an absolute path, and it must already be
> > @@ -459,7 +459,7 @@ static void setup_original_cwd(void)
> >
> >       /* Normalize the directory */
> >       strbuf_realpath(&tmp, tmp_original_cwd, 1);
> > -     free((char*)tmp_original_cwd);
> > +     free(tmp_original_cwd);
> >       tmp_original_cwd = NULL;
> >       startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> >
> > --- >8 ---
>
> This approach seems reasonable to me, as does casting to free().  It's
> not clear to me which is preferable in this case.  How to balance the
> trade-offs between exposing const interfaces, limiting (internal)
> interfaces to headers, and avoiding casts might be worth discussing
> and documenting a matter of project coding style.  `grep -rF 'free(('`
> lists about 100 casts to free, suggesting the discussion may be
> worthwhile.  Introducing a free_const() macro could be another option
> to consider.

I'd prefer either a free_const() as you suggest (though as a separate
patch from what you are submitting here), or leaving the code as-is.
free() could have been written to take a const void* instead of just
void*, since it's not going to modify what the pointer points at.  The
reason we call free() is because the variable isn't needed anymore,
and using a non-const value after freeing is just as wrong as using a
const one after freeing, so casting away the constness cannot really
cause any new problems.  So, I think the signature of free() is just
wrong: it should have taken a const void* all along.  Unfortunately,
the wrong type signature sadly makes people feel like they have to
choose between (a) dropping the added safety of const that the
compiler can enforce for you during the lifetime of the variable, or
(b) leaking memory you no longer need.  I think it's a bad choice and
you should just typecast when free'ing, but clearly others just don't
want to see any typecasts and are willing to dispense with const on
constant variables.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
@ 2022-05-24 15:21     ` Elijah Newren
  2022-05-24 17:41     ` Derrick Stolee
  2022-05-24 19:20     ` [PATCH v4] " Kevin Locke
  2 siblings, 0 replies; 22+ messages in thread
From: Elijah Newren @ 2022-05-24 15:21 UTC (permalink / raw)
  To: Kevin Locke; +Cc: Git Mailing List, Junio C Hamano, Derrick Stolee

On Tue, May 24, 2022 at 7:51 AM Kevin Locke <kevin@kevinlocke.name> wrote:
>
> Prior to Git 2.35.0, git could be run from an inaccessible working
> directory so long as the git repository specified by options and/or
> environment variables was accessible.  For example:
>
>     git init repo
>     mkdir -p a/b
>     cd a/b
>     chmod u-x ..
>     git -C "${PWD%/a/b}/repo" status
>
> If this example seems a bit contrived, consider running with the
> repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
> without ensuring the working directory is accessible by that user.
>
> The code added by e6f8861bd4 ("setup: introduce
> startup_info->original_cwd") to preserve the working directory attempts
> to normalize the path using strbuf_realpath().  If that fails, as in the
> case above, it is treated as a fatal error.
>
> This commit treats strbuf_realpath() errors as non-fatal.  If an error
> occurs, setup_original_cwd() will continue without applying removal
> prevention for cwd, resulting in the pre-2.35.0 behavior.  The risk
> should be minimal, since git will not operate on a repository with
> inaccessible ancestors, this behavior is only known to occur when cwd is
> a descendant of the repository, an ancestor of cwd is inaccessible, and
> no ancestors of the repository are inaccessible.
>
> Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
> ---
>
> Notes:
> Changes since v2:
>  * Use trace2_data_string(), rather than trace_printf(), to report
>    realpath failure.
>
> Changes since v1:
>  * Set startup_info->original_cwd = NULL when strbuf_realpath() fails,
>    rather than setting it to the un-normalized path.
>  * Add a trace message when realpath fails to aid debugging.
>  * Remove potential realpath failure cause from comment before it.
>  * Improve format for reference to e6f8861bd4 in commit message.
>  * Clarify when the pre-2.35.0 behavior may occur as a result of this
>    commit in the commit message.
>  * Remove 'Fixes:' tag from commit message.
>
>  setup.c | 18 +++++++++++++-----
>  1 file changed, 13 insertions(+), 5 deletions(-)
>
> diff --git a/setup.c b/setup.c
> index a7b36f3ffbf..38bd55cbac1 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -458,11 +458,19 @@ static void setup_original_cwd(void)
>          *     not startup_info->original_cwd.
>          */
>
> -       /* Normalize the directory */
> -       strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -       free((char*)tmp_original_cwd);
> -       tmp_original_cwd = NULL;
> -       startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +       /* Try to normalize the directory. */
> +       if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +               free((char*)tmp_original_cwd);
> +               tmp_original_cwd = NULL;
> +               startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +       } else {
> +               trace2_data_string("setup", the_repository,
> +                                  "realpath-path", tmp_original_cwd);
> +               trace2_data_string("setup", the_repository,
> +                                  "realpath-failure", strerror(errno));
> +               tmp_original_cwd = NULL;
> +               goto no_prevention_needed;
> +       }
>
>         /*
>          * Get our worktree; we only protect the current working directory
> --
> 2.35.1

Looks good to me:

Reviewed-by: Elijah Newren <newren@gmail.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 15:20       ` Elijah Newren
@ 2022-05-24 17:38         ` Derrick Stolee
  2022-05-25  3:47           ` Elijah Newren
  2022-05-27  7:48         ` Ævar Arnfjörð Bjarmason
  1 sibling, 1 reply; 22+ messages in thread
From: Derrick Stolee @ 2022-05-24 17:38 UTC (permalink / raw)
  To: Elijah Newren, Kevin Locke, Git Mailing List, Junio C Hamano

On 5/24/2022 11:20 AM, Elijah Newren wrote:
> On Tue, May 24, 2022 at 7:02 AM Kevin Locke <kevin@kevinlocke.name> wrote:
>>
>> On Mon, 2022-05-23 at 14:57 -0400, Derrick Stolee wrote:
>>> On 5/21/22 9:53 AM, Kevin Locke wrote:
>>>> +           free((char*)tmp_original_cwd);
>>>
>>> Hm. I'm never a fan of this casting, but it existed before. It's
>>> because tmp_original_cwd is exposed globally in cache.h, which
>>> is _really widely_. However, there are only two uses: setup.c,
>>> which defines it, and common-main.c, which initializes it during
>>> process startup.
...>> This approach seems reasonable to me, as does casting to free().  It's
>> not clear to me which is preferable in this case.  How to balance the
>> trade-offs between exposing const interfaces, limiting (internal)
>> interfaces to headers, and avoiding casts might be worth discussing
>> and documenting a matter of project coding style.  `grep -rF 'free(('`
>> lists about 100 casts to free, suggesting the discussion may be
>> worthwhile.  Introducing a free_const() macro could be another option
>> to consider.
> 
> I'd prefer either a free_const() as you suggest (though as a separate
> patch from what you are submitting here), or leaving the code as-is.
> free() could have been written to take a const void* instead of just
> void*, since it's not going to modify what the pointer points at.  The
> reason we call free() is because the variable isn't needed anymore,
> and using a non-const value after freeing is just as wrong as using a
> const one after freeing, so casting away the constness cannot really
> cause any new problems.  So, I think the signature of free() is just
> wrong: it should have taken a const void* all along.  Unfortunately,
> the wrong type signature sadly makes people feel like they have to
> choose between (a) dropping the added safety of const that the
> compiler can enforce for you during the lifetime of the variable, or
> (b) leaking memory you no longer need.  I think it's a bad choice and
> you should just typecast when free'ing, but clearly others just don't
> want to see any typecasts and are willing to dispense with const on
> constant variables.

I mostly agree with you: if free() didn't have the const, then the
answer would be simple. We probably wouldn't also have the convention
of "const pointers are for memory we don't own".

Specifically with 'const char *' this can sometimes point to a
compiled string literal, so I tend to be more careful than usual
around these kinds of casts.

I'm willing to concede this point as it is much messier than just
the goals of this patch.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
  2022-05-24 15:21     ` Elijah Newren
@ 2022-05-24 17:41     ` Derrick Stolee
  2022-05-24 18:00       ` Kevin Locke
  2022-05-24 19:20     ` [PATCH v4] " Kevin Locke
  2 siblings, 1 reply; 22+ messages in thread
From: Derrick Stolee @ 2022-05-24 17:41 UTC (permalink / raw)
  To: Kevin Locke, git; +Cc: Elijah Newren, Junio C Hamano

On 5/24/2022 10:51 AM, Kevin Locke wrote:> -	/* Normalize the directory */
> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> -	free((char*)tmp_original_cwd);
> -	tmp_original_cwd = NULL;
> -	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +	/* Try to normalize the directory. */
> +	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +		free((char*)tmp_original_cwd);
> +		tmp_original_cwd = NULL;
> +		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
> +	} else {
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-path", tmp_original_cwd);
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-failure", strerror(errno));
> +		tmp_original_cwd = NULL;

I didn't catch this in v2, but should this line instead be

		startup_info->original_cwd = NULL;

? Especially based on this comment:

> Changes since v1:
>  * Set startup_info->original_cwd = NULL when strbuf_realpath() fails,
>    rather than setting it to the un-normalized path.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v3] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 17:41     ` Derrick Stolee
@ 2022-05-24 18:00       ` Kevin Locke
  0 siblings, 0 replies; 22+ messages in thread
From: Kevin Locke @ 2022-05-24 18:00 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: git, Elijah Newren, Junio C Hamano

On Tue, 2022-05-24 at 13:41 -0400, Derrick Stolee wrote:
> On 5/24/2022 10:51 AM, Kevin Locke wrote:> -	/* Normalize the directory */
>> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
>> -	free((char*)tmp_original_cwd);
>> -	tmp_original_cwd = NULL;
>> -	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>> +	/* Try to normalize the directory. */
>> +	if (strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
>> +		free((char*)tmp_original_cwd);
>> +		tmp_original_cwd = NULL;
>> +		startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>> +	} else {
>> +		trace2_data_string("setup", the_repository,
>> +				   "realpath-path", tmp_original_cwd);
>> +		trace2_data_string("setup", the_repository,
>> +				   "realpath-failure", strerror(errno));
>> +		tmp_original_cwd = NULL;
> 
> I didn't catch this in v2, but should this line instead be
> 
> 		startup_info->original_cwd = NULL;
> 
> ? Especially based on this comment:

No worries.  It's a good question.  Setting startup_info->original_cwd
to NULL is handled by the no_prevention_needed label.  But I just
realized it's not actually required in this case, since original_cwd
is NULL when setup_original_cwd() is called.  I should probably return
rather than jumping to no_prevention_needed from the else block to
avoid the unnecessary free(NULL) and assignment.

Your comment made me realize that v2 and later neglect to free
tmp_original_cwd when strbuf_realpath() fails.  How embarrassing.

I'll send an update to fix both those issues shortly.

Thanks again,
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v4] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
  2022-05-24 15:21     ` Elijah Newren
  2022-05-24 17:41     ` Derrick Stolee
@ 2022-05-24 19:20     ` Kevin Locke
  2022-05-24 20:40       ` Derrick Stolee
  2022-05-24 21:25       ` Junio C Hamano
  2 siblings, 2 replies; 22+ messages in thread
From: Kevin Locke @ 2022-05-24 19:20 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Junio C Hamano, Derrick Stolee

Prior to Git 2.35.0, git could be run from an inaccessible working
directory so long as the git repository specified by options and/or
environment variables was accessible.  For example:

    git init repo
    mkdir -p a/b
    cd a/b
    chmod u-x ..
    git -C "${PWD%/a/b}/repo" status

If this example seems a bit contrived, consider running with the
repository owner as a substitute UID (e.g. with runuser(1) or sudo(8))
without ensuring the working directory is accessible by that user.

The code added by e6f8861bd4 ("setup: introduce
startup_info->original_cwd") to preserve the working directory attempts
to normalize the path using strbuf_realpath().  If that fails, as in the
case above, it is treated as a fatal error.

This commit treats strbuf_realpath() errors as non-fatal.  If an error
occurs, setup_original_cwd() will continue without applying removal
prevention for cwd, resulting in the pre-2.35.0 behavior.  The risk
should be minimal, since git will not operate on a repository with
inaccessible ancestors, this behavior is only known to occur when cwd is
a descendant of the repository, an ancestor of cwd is inaccessible, and
no ancestors of the repository are inaccessible.

Signed-off-by: Kevin Locke <kevin@kevinlocke.name>
---

Changes since v3:
 * Free tmp_original_cwd in both codepaths.
 * Return after strbuf_realpath() fails, rather than jumping to
   no_prevention_needed, to avoid unnecessary free(NULL) and NULL
   reassignment.
 * Invert the condition and remove the else block to match the
   return-on-error code style for better readability.
 * Stop adding "Try" to comment, since strbuf_realpath() hasn't
   been optional since v1.

Changes since v2:
 * Use trace2_data_string(), rather than trace_printf(), to report
   realpath failure.

Changes since v1:
 * Set startup_info->original_cwd = NULL when strbuf_realpath() fails,
   rather than setting it to the un-normalized path.
 * Add a trace message when realpath fails to aid debugging.
 * Remove potential realpath failure cause from comment before it.
 * Improve format for reference to e6f8861bd4 in commit message.
 * Clarify when the pre-2.35.0 behavior may occur as a result of this
   commit in the commit message.
 * Remove 'Fixes:' tag from commit message.

 setup.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index a7b36f3ffbf..e0a99df512f 100644
--- a/setup.c
+++ b/setup.c
@@ -459,7 +459,16 @@ static void setup_original_cwd(void)
 	 */
 
 	/* Normalize the directory */
-	strbuf_realpath(&tmp, tmp_original_cwd, 1);
+	if (!strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
+		trace2_data_string("setup", the_repository,
+				   "realpath-path", tmp_original_cwd);
+		trace2_data_string("setup", the_repository,
+				   "realpath-failure", strerror(errno));
+		free((char*)tmp_original_cwd);
+		tmp_original_cwd = NULL;
+		return;
+	}
+
 	free((char*)tmp_original_cwd);
 	tmp_original_cwd = NULL;
 	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 19:20     ` [PATCH v4] " Kevin Locke
@ 2022-05-24 20:40       ` Derrick Stolee
  2022-05-24 21:25       ` Junio C Hamano
  1 sibling, 0 replies; 22+ messages in thread
From: Derrick Stolee @ 2022-05-24 20:40 UTC (permalink / raw)
  To: Kevin Locke, git; +Cc: Elijah Newren, Junio C Hamano

On 5/24/2022 3:20 PM, Kevin Locke wrote:
> Changes since v3:
>  * Free tmp_original_cwd in both codepaths.
>  * Return after strbuf_realpath() fails, rather than jumping to
>    no_prevention_needed, to avoid unnecessary free(NULL) and NULL
>    reassignment.
>  * Invert the condition and remove the else block to match the
>    return-on-error code style for better readability.
>  * Stop adding "Try" to comment, since strbuf_realpath() hasn't
>    been optional since v1.

...

>  	/* Normalize the directory */
> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> +	if (!strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-path", tmp_original_cwd);
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-failure", strerror(errno));
> +		free((char*)tmp_original_cwd);
> +		tmp_original_cwd = NULL;
> +		return;
> +	}

This is much easier to reason about.

>  	free((char*)tmp_original_cwd);
>  	tmp_original_cwd = NULL;
>  	startup_info->original_cwd = strbuf_detach(&tmp, NULL);
I had considered trying to remove this duplicate code freeing
temp_original_cwd. It requires adding a variable storing the
return from strbuf_realpath() _or_ knowing that tmp will have
zero length if strbuf_realpath() fails. It would look gross,
though:

	strbuf_realpath(&tmp, tmp_original_cwd, 0);

	if (!tmp->len) {
		trace2_data_string("setup", the_repository,
				   "realpath-path", tmp_original_cwd);
		trace2_data_string("setup", the_repository,
				   "realpath-failure", strerror(errno));
	}
	free((char*)tmp_original_cwd);
	tmp_original_cwd = NULL;
	if (!tmp->len)
		return;

	startup_info->original_cwd = strbuf_detach(&tmp, NULL);

...and that doesn't look very good at all. Thus, I think your v4
is ready to merge. Thanks for working on it!

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 19:20     ` [PATCH v4] " Kevin Locke
  2022-05-24 20:40       ` Derrick Stolee
@ 2022-05-24 21:25       ` Junio C Hamano
  2022-05-25  3:51         ` Elijah Newren
  1 sibling, 1 reply; 22+ messages in thread
From: Junio C Hamano @ 2022-05-24 21:25 UTC (permalink / raw)
  To: Kevin Locke; +Cc: git, Elijah Newren, Derrick Stolee

Kevin Locke <kevin@kevinlocke.name> writes:

>  	/* Normalize the directory */
> -	strbuf_realpath(&tmp, tmp_original_cwd, 1);
> +	if (!strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-path", tmp_original_cwd);
> +		trace2_data_string("setup", the_repository,
> +				   "realpath-failure", strerror(errno));
> +		free((char*)tmp_original_cwd);
> +		tmp_original_cwd = NULL;
> +		return;
> +	}
> +
>  	free((char*)tmp_original_cwd);
>  	tmp_original_cwd = NULL;
>  	startup_info->original_cwd = strbuf_detach(&tmp, NULL);

It is somewhat sad that we cannot readily use FREE_AND_NULL() here.
If it casted away the constness (see the attached at the end), we
could have saved two lines from the above snippet.

The startup_info->original_cwd member is initialized to NULL, and
I think it is a safe assumption that it still is so when the control
reaches here.  Otherwise, the assignment of strbuf_detach() to it
without first freeing the current value we see in the post context
is leaking already.

So, overall this looks good to me.  Anybody else who have already
spent cycles to review this want to add Reviewed-by: to it?

Thanks.

diff --git i/git-compat-util.h w/git-compat-util.h
index 58fd813bd0..56c6c48461 100644
--- i/git-compat-util.h
+++ w/git-compat-util.h
@@ -976,7 +976,7 @@ int xstrncmpz(const char *s, const char *t, size_t len);
  * FREE_AND_NULL(ptr) is like free(ptr) followed by ptr = NULL. Note
  * that ptr is used twice, so don't pass e.g. ptr++.
  */
-#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)
+#define FREE_AND_NULL(p) do { free((void*)p); (p) = NULL; } while (0)
 
 #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
 #define CALLOC_ARRAY(x, alloc) (x) = xcalloc((alloc), sizeof(*(x)))

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 17:38         ` Derrick Stolee
@ 2022-05-25  3:47           ` Elijah Newren
  0 siblings, 0 replies; 22+ messages in thread
From: Elijah Newren @ 2022-05-25  3:47 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Kevin Locke, Git Mailing List, Junio C Hamano

On Tue, May 24, 2022 at 10:38 AM Derrick Stolee
<derrickstolee@github.com> wrote:
>
> On 5/24/2022 11:20 AM, Elijah Newren wrote:
> > On Tue, May 24, 2022 at 7:02 AM Kevin Locke <kevin@kevinlocke.name> wrote:
> >>
> >> On Mon, 2022-05-23 at 14:57 -0400, Derrick Stolee wrote:
> >>> On 5/21/22 9:53 AM, Kevin Locke wrote:
> >>>> +           free((char*)tmp_original_cwd);
> >>>
> >>> Hm. I'm never a fan of this casting, but it existed before. It's
> >>> because tmp_original_cwd is exposed globally in cache.h, which
> >>> is _really widely_. However, there are only two uses: setup.c,
> >>> which defines it, and common-main.c, which initializes it during
> >>> process startup.
> ...>> This approach seems reasonable to me, as does casting to free().  It's
> >> not clear to me which is preferable in this case.  How to balance the
> >> trade-offs between exposing const interfaces, limiting (internal)
> >> interfaces to headers, and avoiding casts might be worth discussing
> >> and documenting a matter of project coding style.  `grep -rF 'free(('`
> >> lists about 100 casts to free, suggesting the discussion may be
> >> worthwhile.  Introducing a free_const() macro could be another option
> >> to consider.
> >
> > I'd prefer either a free_const() as you suggest (though as a separate
> > patch from what you are submitting here), or leaving the code as-is.
> > free() could have been written to take a const void* instead of just
> > void*, since it's not going to modify what the pointer points at.  The
> > reason we call free() is because the variable isn't needed anymore,
> > and using a non-const value after freeing is just as wrong as using a
> > const one after freeing, so casting away the constness cannot really
> > cause any new problems.  So, I think the signature of free() is just
> > wrong: it should have taken a const void* all along.  Unfortunately,
> > the wrong type signature sadly makes people feel like they have to
> > choose between (a) dropping the added safety of const that the
> > compiler can enforce for you during the lifetime of the variable, or
> > (b) leaking memory you no longer need.  I think it's a bad choice and
> > you should just typecast when free'ing, but clearly others just don't
> > want to see any typecasts and are willing to dispense with const on
> > constant variables.
>
> I mostly agree with you: if free() didn't have the const, then the
> answer would be simple. We probably wouldn't also have the convention
> of "const pointers are for memory we don't own".
>
> Specifically with 'const char *' this can sometimes point to a
> compiled string literal, so I tend to be more careful than usual
> around these kinds of casts.

Ah, fair enough.


> I'm willing to concede this point as it is much messier than just
> the goals of this patch.

:-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 21:25       ` Junio C Hamano
@ 2022-05-25  3:51         ` Elijah Newren
  2022-05-25  5:11           ` Junio C Hamano
  0 siblings, 1 reply; 22+ messages in thread
From: Elijah Newren @ 2022-05-25  3:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Kevin Locke, Git Mailing List, Derrick Stolee

On Tue, May 24, 2022 at 2:25 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Kevin Locke <kevin@kevinlocke.name> writes:
>
> >       /* Normalize the directory */
> > -     strbuf_realpath(&tmp, tmp_original_cwd, 1);
> > +     if (!strbuf_realpath(&tmp, tmp_original_cwd, 0)) {
> > +             trace2_data_string("setup", the_repository,
> > +                                "realpath-path", tmp_original_cwd);
> > +             trace2_data_string("setup", the_repository,
> > +                                "realpath-failure", strerror(errno));
> > +             free((char*)tmp_original_cwd);
> > +             tmp_original_cwd = NULL;
> > +             return;
> > +     }
> > +
> >       free((char*)tmp_original_cwd);
> >       tmp_original_cwd = NULL;
> >       startup_info->original_cwd = strbuf_detach(&tmp, NULL);
>
> It is somewhat sad that we cannot readily use FREE_AND_NULL() here.
> If it casted away the constness (see the attached at the end), we
> could have saved two lines from the above snippet.
>
> The startup_info->original_cwd member is initialized to NULL, and
> I think it is a safe assumption that it still is so when the control
> reaches here.  Otherwise, the assignment of strbuf_detach() to it
> without first freeing the current value we see in the post context
> is leaking already.
>
> So, overall this looks good to me.  Anybody else who have already
> spent cycles to review this want to add Reviewed-by: to it?

Well, I added my Reviewed-by to v3 and apparently missed a few things
which were fixed up in v4.  So, my review apparently wasn't careful
enough.  But I am happy with this version, so here it is again:

Reviewed-by: Elijah Newren <newren@gmail.com>

(Now someone is free to spot more problems and embarrass me even more...)

>
> Thanks.
>
> diff --git i/git-compat-util.h w/git-compat-util.h
> index 58fd813bd0..56c6c48461 100644
> --- i/git-compat-util.h
> +++ w/git-compat-util.h
> @@ -976,7 +976,7 @@ int xstrncmpz(const char *s, const char *t, size_t len);
>   * FREE_AND_NULL(ptr) is like free(ptr) followed by ptr = NULL. Note
>   * that ptr is used twice, so don't pass e.g. ptr++.
>   */
> -#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)
> +#define FREE_AND_NULL(p) do { free((void*)p); (p) = NULL; } while (0)
>
>  #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
>  #define CALLOC_ARRAY(x, alloc) (x) = xcalloc((alloc), sizeof(*(x)))

I also like this change, even if it feels like it should be part of a
separate patch.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-25  3:51         ` Elijah Newren
@ 2022-05-25  5:11           ` Junio C Hamano
  0 siblings, 0 replies; 22+ messages in thread
From: Junio C Hamano @ 2022-05-25  5:11 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Kevin Locke, Git Mailing List, Derrick Stolee

Elijah Newren <newren@gmail.com> writes:

>> diff --git i/git-compat-util.h w/git-compat-util.h
>> index 58fd813bd0..56c6c48461 100644
>> --- i/git-compat-util.h
>> +++ w/git-compat-util.h
>> @@ -976,7 +976,7 @@ int xstrncmpz(const char *s, const char *t, size_t len);
>>   * FREE_AND_NULL(ptr) is like free(ptr) followed by ptr = NULL. Note
>>   * that ptr is used twice, so don't pass e.g. ptr++.
>>   */
>> -#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)
>> +#define FREE_AND_NULL(p) do { free((void*)p); (p) = NULL; } while (0)
>>
>>  #define ALLOC_ARRAY(x, alloc) (x) = xmalloc(st_mult(sizeof(*(x)), (alloc)))
>>  #define CALLOC_ARRAY(x, alloc) (x) = xcalloc((alloc), sizeof(*(x)))
>
> I also like this change, even if it feels like it should be part of a
> separate patch.

Sorry, but I did fail to make it clear that I didn't mean the above
to be anything related to Kevin's patch.  It was "if FREE_AND_NULL()
were defined like this, then we could have used it".  I didn't mean
to say "let's update FREE_AND_NULL() this way so that we can use
it".

Thanks.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-24 15:20       ` Elijah Newren
  2022-05-24 17:38         ` Derrick Stolee
@ 2022-05-27  7:48         ` Ævar Arnfjörð Bjarmason
  2022-05-28  1:27           ` Elijah Newren
  1 sibling, 1 reply; 22+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-27  7:48 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Kevin Locke, Derrick Stolee, Git Mailing List, Junio C Hamano


On Tue, May 24 2022, Elijah Newren wrote:

> [...] So, I think the signature of free() is just
> wrong: it should have taken a const void* all along.  Unfortunately,
> the wrong type signature sadly makes people feel like they have to
> choose between (a) dropping the added safety of const that the
> compiler can enforce for you during the lifetime of the variable, or
> (b) leaking memory you no longer need.

Hrm, don't you mean that it would be better as:

	void free(void *const ptr);

Not:

	void free(const void *ptr);

But standard C doesn't make much (any?) use of the former form for its
library functions by convention.

c.f.:

	cdecl> explain const void *var
	declare var as pointer to const void
	cdecl> explain void *const var
	declare var as const pointer to void

I.e. the whole point of malloc() is to give us a pointer to memory that
isn't "const". If we stored that in a variable that was "void *const"
we'd save ourselves from some amount of foot-guns, but you're rarely
doing pointer arithmetic accidentally, so probably not really.

But yeah, we really should have this documented somewhere, i.e. the
cases where we "lie" and expose a "const char *" which is really (as far
as the machine is concerned) mutable.

The confusion being that we're seeking to overlay our own "no, this
isn't mutable" on the basis of our desired API boundaries, not just to
use it to inform us & the compiler about the "real" nature of the
underlying data.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v2] setup: don't die if realpath(3) fails on getcwd(3)
  2022-05-27  7:48         ` Ævar Arnfjörð Bjarmason
@ 2022-05-28  1:27           ` Elijah Newren
  0 siblings, 0 replies; 22+ messages in thread
From: Elijah Newren @ 2022-05-28  1:27 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Kevin Locke, Derrick Stolee, Git Mailing List, Junio C Hamano

On Fri, May 27, 2022 at 1:02 AM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
>
> On Tue, May 24 2022, Elijah Newren wrote:
>
> > [...] So, I think the signature of free() is just
> > wrong: it should have taken a const void* all along.  Unfortunately,
> > the wrong type signature sadly makes people feel like they have to
> > choose between (a) dropping the added safety of const that the
> > compiler can enforce for you during the lifetime of the variable, or
> > (b) leaking memory you no longer need.
>
> Hrm, don't you mean that it would be better as:
>
>         void free(void *const ptr);
>
> Not:
>
>         void free(const void *ptr);

Nope, I definitely meant the latter; the stuff pointed to is const,
not the pointer itself.

In fact, I don't see any point at all in the former; with the free()
that exists today:

    void free(void *ptr)

I can pass it a "void * const myptr" already without problems, because
free's ptr parameter will be a copy of myptr, and thus modifying ptr
cannot affect myptr.  So such a call signature change could not
possibly provide any benefit to the outside caller.  But that call
signature change could hinder the actual implementation of free() for
some folks (particularly if a given implementation of free() keeps
extra data near the allocated block with information about the size of
the block and the next allocated block in the list).

In contrast, I cannot pass a "const void *myptr" or "const char
*myptr" to free(), but only because of the current type signature;
free() doesn't actually modify any of the contents the pointer points
to.  (And if you want to claim that free effectively does modify what
myptr points to because someone else could allocate that same memory,
remember that use-after-free is undefined regardless of whether the
data pointed to is const or not, and thus you cannot access that data
after free with or without the const.)  So, free()'s real type that it
acts on is a const void *.  Sadly, the declared type signature is
rather void *, which unnecessarily forces users to cast their types
when calling.

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2022-05-28  1:28 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-19 23:39 [PATCH] setup: don't die if realpath(3) fails on getcwd(3) Kevin Locke
2022-05-20 18:38 ` Junio C Hamano
2022-05-21  0:14 ` Elijah Newren
2022-05-21 13:02   ` Kevin Locke
2022-05-23 18:44     ` Derrick Stolee
2022-05-21 13:53 ` [PATCH v2] " Kevin Locke
2022-05-23 18:57   ` Derrick Stolee
2022-05-24 14:02     ` Kevin Locke
2022-05-24 15:20       ` Elijah Newren
2022-05-24 17:38         ` Derrick Stolee
2022-05-25  3:47           ` Elijah Newren
2022-05-27  7:48         ` Ævar Arnfjörð Bjarmason
2022-05-28  1:27           ` Elijah Newren
2022-05-24 14:51   ` [PATCH v3] " Kevin Locke
2022-05-24 15:21     ` Elijah Newren
2022-05-24 17:41     ` Derrick Stolee
2022-05-24 18:00       ` Kevin Locke
2022-05-24 19:20     ` [PATCH v4] " Kevin Locke
2022-05-24 20:40       ` Derrick Stolee
2022-05-24 21:25       ` Junio C Hamano
2022-05-25  3:51         ` Elijah Newren
2022-05-25  5:11           ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.