All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH] merge-recursive: create new files with O_EXCL
@ 2021-03-10  7:57 Ephrim Khong
  2021-03-10 23:01 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Ephrim Khong @ 2021-03-10  7:57 UTC (permalink / raw)
  To: git

When re-writing the content of a file during a merge, the file
is first unlinked and then re-created. Since it is a new file
at the time of the open call, O_EXCL should be passed to clarify
that the file is expected to be new.

Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
---
This is actually a fix for an issue we ran into on an nfs4
mount. Files created with O_TRUNC instead of O_EXCL sometimes
had their permissions wrong. However, it appears to be a safe
thing to change this, especially since other parts of the
git codebase also prefer the O_EXCL flag.

 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index f736a0f632..f72a376c5b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -974,7 +974,7 @@ static int update_file_flags(struct merge_options *opt,
 			int fd;
 			int mode = (contents->mode & 0100 ? 0777 : 0666);

-			fd = open(path, O_WRONLY | O_TRUNC | O_CREAT, mode);
+			fd = open(path, O_WRONLY | O_EXCL | O_CREAT, mode);
 			if (fd < 0) {
 				ret = err(opt, _("failed to open '%s': %s"),
 					  path, strerror(errno));
-- 
2.30.1.1.g07d5ea6f42.dirty


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] merge-recursive: create new files with O_EXCL
  2021-03-10  7:57 [RFC PATCH] merge-recursive: create new files with O_EXCL Ephrim Khong
@ 2021-03-10 23:01 ` Junio C Hamano
  2021-03-11  9:54   ` Ephrim Khong
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2021-03-10 23:01 UTC (permalink / raw)
  To: Ephrim Khong; +Cc: git

Ephrim Khong <dr.khong@gmail.com> writes:

> When re-writing the content of a file during a merge, the file
> is first unlinked and then re-created. Since it is a new file
> at the time of the open call, O_EXCL should be passed to clarify
> that the file is expected to be new.
>
> Signed-off-by: Ephrim Khong <dr.khong@gmail.com>
> ---
> This is actually a fix for an issue we ran into on an nfs4
> mount. Files created with O_TRUNC instead of O_EXCL sometimes
> had their permissions wrong. However, it appears to be a safe
> thing to change this, especially since other parts of the
> git codebase also prefer the O_EXCL flag.

Interesting.  

It seems that this part of the code has always used O_TRUNC since
its inception at 6d297f81 (Status update on merge-recursive in C,
2006-07-08) when two thieves smuggled it into our codebase.  Back
then, this was the only use of O_TRUNC|O_WRONLY|O_CREAT combination
in the production codebase (as you observed, O_CREAT|O_EXCL|O_WRONLY
was what all the other codepaths use).  But the use of O_TRUNC has
spread over time to other parts of the codebase, perhaps cargo
culted.  If you look at hits from

    $ git grep -e O_TRUNC -e O_EXCL

you see the combination of CREAT/WRONLY/TRUNC used all over the
place [*], especially in newer parts of the code.

So it becomes very curious why this one location needs to be so
special and you are not patching other uses of O_TRUNC.

I do not think we mind fixing the use of O_TRUNC with "remove and
then create with O_EXCL", but we'd probably want to

 * understand why only this place matters, or perhaps other uses of
   O_TRUNC needs the same fix to work "correctly" with your NFS
   mounts, in which case we'd need all of them addressed in the same
   series of patches, and

 * understand why your NFS mount is broken and give a better
   explanation as to why we need to have a workaround in our code.

before doing so.

Thanks.


[Footnote]

* It is understandable that CREAT/WRONLY/TRUNC are used in
  combination, as that is documented as a synonym for creat().




>  merge-recursive.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index f736a0f632..f72a376c5b 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -974,7 +974,7 @@ static int update_file_flags(struct merge_options *opt,
>  			int fd;
>  			int mode = (contents->mode & 0100 ? 0777 : 0666);
>
> -			fd = open(path, O_WRONLY | O_TRUNC | O_CREAT, mode);
> +			fd = open(path, O_WRONLY | O_EXCL | O_CREAT, mode);
>  			if (fd < 0) {
>  				ret = err(opt, _("failed to open '%s': %s"),
>  					  path, strerror(errno));

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] merge-recursive: create new files with O_EXCL
  2021-03-10 23:01 ` Junio C Hamano
@ 2021-03-11  9:54   ` Ephrim Khong
  2021-03-11 17:52     ` Junio C Hamano
                       ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Ephrim Khong @ 2021-03-11  9:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On 11.03.2021 00:01, Junio C Hamano wrote:
> Ephrim Khong <dr.khong@gmail.com> writes:
> 
>> When re-writing the content of a file during a merge, the file
>> is first unlinked and then re-created. Since it is a new file
>> at the time of the open call, O_EXCL should be passed to clarify
>> that the file is expected to be new.
>
>                                          But the use of O_TRUNC has
> spread over time to other parts of the codebase, perhaps cargo
> culted.  If you look at hits from
> 
>     $ git grep -e O_TRUNC -e O_EXCL
> 
> you see the combination of CREAT/WRONLY/TRUNC used all over the
> place [*], especially in newer parts of the code.
> 
> So it becomes very curious why this one location needs to be so
> special and you are not patching other uses of O_TRUNC.

The main difference is that this is the only place where the file mode
matters, since we want the executable bit set for some files in the
working directory. All other locations create the files with mode 0600
or 0666 and are happy as long as the user has rw rights, which appears
to be the case even when running into my nfs-issue.

> I do not think we mind fixing the use of O_TRUNC with "remove and
> then create with O_EXCL", but we'd probably want to
> 
>  * understand why only this place matters, or perhaps other uses of
>    O_TRUNC needs the same fix to work "correctly" with your NFS
>    mounts, in which case we'd need all of them addressed in the same
>    series of patches, and

I'd say that is up to you. Personally I'd rather err on the side of
caution and leave the other code as is, especially since it is not
really required for the file mode issue as described above. But I'd
happily patch those places as well if you find that to be a better idea.
Maybe refactoring this (fstat - unlink - open) into a common function
would make sense then.

>  * understand why your NFS mount is broken and give a better
>    explanation as to why we need to have a workaround in our code.

I'll work on this, but unfortunately have no idea of how to properly
debug it. Since it is a company server without administrative rights and
the backend is some IBM storage system, the options are limited and
processes are slow. What I did find out so far is that it is not a race
condition with unlink. A simple openat() without O_EXCL already produces
the wrong file mode.

(I fully understand that this is not a bug on git's side, and I found no
documentation indicating that O_EXCL would be recommended or have any
effect in this way. Hopefully, others that run into similar issues would
benefit from this as well, there are a few reports online of people
running into "failed to refresh" errors.)

As a side-note, the strace on the affected file also shows that git
writes that file twice during the merge, with the same content. There
might be some potential to further optimize merges to avoid such
double-writes. A small example to reproduce, note how "b" is opened
twice during the merge:

  git init
  echo "foo" > a
  git add a
  git commit -m "Initial commit"

  git mv a b
  git commit -m "File moved"

  git checkout -b other_branch HEAD~
  touch c && git add c && git commit -m "Some other commit"
  strace -otrace git merge master -m "merge message"
  grep '"b"' trace

Thanks
- Eph


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] merge-recursive: create new files with O_EXCL
  2021-03-11  9:54   ` Ephrim Khong
@ 2021-03-11 17:52     ` Junio C Hamano
  2021-03-11 18:01     ` Elijah Newren
  2021-03-13  1:08     ` brian m. carlson
  2 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2021-03-11 17:52 UTC (permalink / raw)
  To: Ephrim Khong; +Cc: git

Ephrim Khong <dr.khong@gmail.com> writes:

>> I do not think we mind fixing the use of O_TRUNC with "remove and
>> then create with O_EXCL", but we'd probably want to
>> 
>>  * understand why only this place matters, or perhaps other uses of
>>    O_TRUNC needs the same fix to work "correctly" with your NFS
>>    mounts, in which case we'd need all of them addressed in the same
>>    series of patches, and
>
> I'd say that is up to you. Personally I'd rather err on the side of
> caution and leave the other code as is, especially since it is not
> really required for the file mode issue as described above.

OK.

Especially the other places are prepared to deal with a leftover
stale file, there is no reason to add extra cost to unlink before
recreating.  If we know this place we will never have an existing
file (because we unlinked), then TRUNC -> EXCL is a no-cost change
for us.

If some of these other places turn out to be problem for you and
other users who are affected by the same NFS mount problem, we may
need to replace these places' open(CREAT|TRUNC) with a wrapper that
conditionally does an unlink+open(CREAT|EXCL) with a Makefile knob.

But before doing any of that, we'd need to see ...

>>  * understand why your NFS mount is broken and give a better
>>    explanation as to why we need to have a workaround in our code.
>
> I'll work on this, but unfortunately have no idea of how to properly
> debug it. Since it is a company server without administrative rights and
> the backend is some IBM storage system, the options are limited and
> processes are slow. What I did find out so far is that it is not a race
> condition with unlink. A simple openat() without O_EXCL already produces
> the wrong file mode.

... this resolved, I would think.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] merge-recursive: create new files with O_EXCL
  2021-03-11  9:54   ` Ephrim Khong
  2021-03-11 17:52     ` Junio C Hamano
@ 2021-03-11 18:01     ` Elijah Newren
  2021-03-13  1:08     ` brian m. carlson
  2 siblings, 0 replies; 6+ messages in thread
From: Elijah Newren @ 2021-03-11 18:01 UTC (permalink / raw)
  To: Ephrim Khong; +Cc: Junio C Hamano, Git Mailing List

On Thu, Mar 11, 2021 at 1:58 AM Ephrim Khong <dr.khong@gmail.com> wrote:
>
> On 11.03.2021 00:01, Junio C Hamano wrote:
> > Ephrim Khong <dr.khong@gmail.com> writes:

> As a side-note, the strace on the affected file also shows that git
> writes that file twice during the merge, with the same content. There
> might be some potential to further optimize merges to avoid such
> double-writes. A small example to reproduce, note how "b" is opened
> twice during the merge:
>
>   git init
>   echo "foo" > a
>   git add a
>   git commit -m "Initial commit"
>
>   git mv a b
>   git commit -m "File moved"
>
>   git checkout -b other_branch HEAD~
>   touch c && git add c && git commit -m "Some other commit"
>   strace -otrace git merge master -m "merge message"
>   grep '"b"' trace

Yeah, this is somewhat fundamental to merge-recursive's implementation
design; fixing it essentially requires a rewrite.  That rewrite is
nearing completion; so this double-write issue will be fixed when
merge-ort is complete (or for anyone who applies the patches and tries
it out now).

In fact, if you don't see the permissions problems when switching
branches (because branch switching uses O_EXCL?), then merge-ort
almost certainly incidentally fixes that problem for you as well.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [RFC PATCH] merge-recursive: create new files with O_EXCL
  2021-03-11  9:54   ` Ephrim Khong
  2021-03-11 17:52     ` Junio C Hamano
  2021-03-11 18:01     ` Elijah Newren
@ 2021-03-13  1:08     ` brian m. carlson
  2 siblings, 0 replies; 6+ messages in thread
From: brian m. carlson @ 2021-03-13  1:08 UTC (permalink / raw)
  To: Ephrim Khong; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 1860 bytes --]

On 2021-03-11 at 09:54:41, Ephrim Khong wrote:
> On 11.03.2021 00:01, Junio C Hamano wrote:
> >  * understand why your NFS mount is broken and give a better
> >    explanation as to why we need to have a workaround in our code.
> 
> I'll work on this, but unfortunately have no idea of how to properly
> debug it. Since it is a company server without administrative rights and
> the backend is some IBM storage system, the options are limited and
> processes are slow. What I did find out so far is that it is not a race
> condition with unlink. A simple openat() without O_EXCL already produces
> the wrong file mode.

This reminds me of a NFS bug that we saw in the past[0].  I don't know
if you're using that same type of system in this case, but if so, it
could be part of the problem.

Since buggy NFS implementations seem to be a problem specifically with
open(2) and I need to reroll my series to add some entries to the FAQ,
and I'll document that we require NFS servers to support POSIX open
semantics, including permissions and O_EXCL.

> (I fully understand that this is not a bug on git's side, and I found no
> documentation indicating that O_EXCL would be recommended or have any
> effect in this way. Hopefully, others that run into similar issues would
> benefit from this as well, there are a few reports online of people
> running into "failed to refresh" errors.)

This does tend to frequently affect Git, but it can also affect other
programs as well, and you're probably going to be better off filing a
bug report with IBM about their NFS server than trying to work around it
in every situation.

[0] https://lore.kernel.org/git/CAPx1GvfKxu8gwbp0Gn2dBf9th874skKjD-echeAFr7_77o8FYw@mail.gmail.com/T/#mead6be6c92f0ab29cf9fd600781dea7315e87411
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-13  1:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-10  7:57 [RFC PATCH] merge-recursive: create new files with O_EXCL Ephrim Khong
2021-03-10 23:01 ` Junio C Hamano
2021-03-11  9:54   ` Ephrim Khong
2021-03-11 17:52     ` Junio C Hamano
2021-03-11 18:01     ` Elijah Newren
2021-03-13  1:08     ` brian m. carlson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.