* fatal error when diffing changed symlinks
@ 2017-02-24 11:47 Christophe Macabiau
2017-02-24 17:53 ` Junio C Hamano
0 siblings, 1 reply; 17+ messages in thread
From: Christophe Macabiau @ 2017-02-24 11:47 UTC (permalink / raw)
To: git
Hi,
with the commands below, you will get :
> fatal: bad object 0000000000000000000000000000000000000000
> show 0000000000000000000000000000000000000000: command returned error: 128
>
I am using version 2.5.5 fedora 23
cd /tmp
mkdir a
cd a
git init
touch b
ln -s b c
git add .
git commit -m 'first'
touch d
rm c
ln -s d c
git difftool --dir-diff
Thanks for your feedback,
Christophe
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-02-24 11:47 fatal error when diffing changed symlinks Christophe Macabiau @ 2017-02-24 17:53 ` Junio C Hamano 2017-02-24 19:51 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2017-02-24 17:53 UTC (permalink / raw) To: Christophe Macabiau; +Cc: git, Johannes Schindelin Christophe Macabiau <christophemacabiau@gmail.com> writes: > with the commands below, you will get : > >> fatal: bad object 0000000000000000000000000000000000000000 >> show 0000000000000000000000000000000000000000: command returned error: 128 >> > > I am using version 2.5.5 fedora 23 > > cd /tmp > mkdir a > cd a > git init > touch b > ln -s b c > git add . > git commit -m 'first' > touch d > rm c > ln -s d c > git difftool --dir-diff A slightly worse is that the upcoming Git will ship with a rewritten "difftool" that makes the above sequence segfault. As either way is bad, I do not particularly think the rewritten one needs to be reverted (it would give "fatal: bad object" then, and would not fix your problem), but it needs to be looked at. Thanks for a report. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-02-24 17:53 ` Junio C Hamano @ 2017-02-24 19:51 ` Junio C Hamano 2017-02-24 20:35 ` Jeff King 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2017-02-24 19:51 UTC (permalink / raw) To: Johannes Schindelin; +Cc: git, Christophe Macabiau Junio C Hamano <gitster@pobox.com> writes: >> cd /tmp >> mkdir a >> cd a >> git init >> touch b >> ln -s b c >> git add . >> git commit -m 'first' >> touch d >> rm c >> ln -s d c >> git difftool --dir-diff > > A slightly worse is that the upcoming Git will ship with a rewritten > "difftool" that makes the above sequence segfault. The culprit seems to be these lines in run_dir_diff(): if (S_ISLNK(lmode)) { char *content = read_sha1_file(loid.hash, &type, &size); add_left_or_right(&symlinks2, src_path, content, 0); free(content); } if (S_ISLNK(rmode)) { char *content = read_sha1_file(roid.hash, &type, &size); add_left_or_right(&symlinks2, dst_path, content, 1); free(content); } When viewing a working tree file, oid.hash could be 0{40} and read_sha1_file() is not the right function to use to obtain the contents. Both of these two need to pay attention to 0{40}, I think, as the user may be running "difftool -R --dir-diff" in which case the working tree would appear in the left hand side instead. I didn't follow the codepath for regular files closely, but the code that follows the above excerpt does quite different things to lstate and rstate, which makes me suspect that the code is not prepared to see "-R"(everse) diff (and I further suspect that these issues were inherited from the original scripted Porcelain). ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-02-24 19:51 ` Junio C Hamano @ 2017-02-24 20:35 ` Jeff King 2017-02-25 12:36 ` Johannes Schindelin 0 siblings, 1 reply; 17+ messages in thread From: Jeff King @ 2017-02-24 20:35 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, git, Christophe Macabiau On Fri, Feb 24, 2017 at 11:51:22AM -0800, Junio C Hamano wrote: > > A slightly worse is that the upcoming Git will ship with a rewritten > > "difftool" that makes the above sequence segfault. > > The culprit seems to be these lines in run_dir_diff(): > > if (S_ISLNK(lmode)) { > char *content = read_sha1_file(loid.hash, &type, &size); > add_left_or_right(&symlinks2, src_path, content, 0); > free(content); > } > > if (S_ISLNK(rmode)) { > char *content = read_sha1_file(roid.hash, &type, &size); > add_left_or_right(&symlinks2, dst_path, content, 1); > free(content); > } > > When viewing a working tree file, oid.hash could be 0{40} and > read_sha1_file() is not the right function to use to obtain the > contents. > > Both of these two need to pay attention to 0{40}, I think, as the > user may be running "difftool -R --dir-diff" in which case the > working tree would appear in the left hand side instead. As a side note, I think even outside of 0{40}, this should be checking the return value of read_sha1_file(). A corrupted repo should die(), not segfault. -Peff ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-02-24 20:35 ` Jeff King @ 2017-02-25 12:36 ` Johannes Schindelin 2017-03-07 18:16 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: Johannes Schindelin @ 2017-02-25 12:36 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git, Christophe Macabiau Hi Peff & Junio, On Fri, 24 Feb 2017, Jeff King wrote: > On Fri, Feb 24, 2017 at 11:51:22AM -0800, Junio C Hamano wrote: > > > > A slightly worse is that the upcoming Git will ship with a rewritten > > > "difftool" that makes the above sequence segfault. > > > > The culprit seems to be these lines in run_dir_diff(): > > > > if (S_ISLNK(lmode)) { > > char *content = read_sha1_file(loid.hash, &type, &size); > > add_left_or_right(&symlinks2, src_path, content, 0); > > free(content); > > } > > > > if (S_ISLNK(rmode)) { > > char *content = read_sha1_file(roid.hash, &type, &size); > > add_left_or_right(&symlinks2, dst_path, content, 1); > > free(content); > > } > > > > When viewing a working tree file, oid.hash could be 0{40} and > > read_sha1_file() is not the right function to use to obtain the > > contents. > > > > Both of these two need to pay attention to 0{40}, I think, as the > > user may be running "difftool -R --dir-diff" in which case the > > working tree would appear in the left hand side instead. > > As a side note, I think even outside of 0{40}, this should be checking > the return value of read_sha1_file(). A corrupted repo should die(), not > segfault. I agree. I am on it. Ciao, Dscho ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-02-25 12:36 ` Johannes Schindelin @ 2017-03-07 18:16 ` Junio C Hamano 2017-03-07 22:34 ` Johannes Schindelin 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2017-03-07 18:16 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Jeff King, git, Christophe Macabiau Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> > When viewing a working tree file, oid.hash could be 0{40} and >> > read_sha1_file() is not the right function to use to obtain the >> > contents. >> > >> > Both of these two need to pay attention to 0{40}, I think, as the >> > user may be running "difftool -R --dir-diff" in which case the >> > working tree would appear in the left hand side instead. >> >> As a side note, I think even outside of 0{40}, this should be checking >> the return value of read_sha1_file(). A corrupted repo should die(), not >> segfault. > > I agree. I am on it. Friendly ping, if only to make sure that we can keep a piece of this thread in the more "recent" pile. If you have other topics you need to perfect, I think it is OK to postpone the fix on this topic a bit longer, but I'd hate to ship two releases with a known breakage without an attempt to fix it, so if you are otherwise occupied, I may encourage others (including me) to take a look at this. The new "difftool" also has a reported regression somebody else expressed willingness to work on, which is sort of blocked by everybody else not knowing the timeline on this one. cf. <20170303212836.GB13790@arch-attack.localdomain> A patch series would be very welcome, but "Please go ahead if somebody else has time, and I'll help reviewing" would be also good. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: fatal error when diffing changed symlinks 2017-03-07 18:16 ` Junio C Hamano @ 2017-03-07 22:34 ` Johannes Schindelin 2017-03-13 17:56 ` [PATCH] difftool: handle changing symlinks in dir-diff mode David Aguilar 0 siblings, 1 reply; 17+ messages in thread From: Johannes Schindelin @ 2017-03-07 22:34 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff King, git, Christophe Macabiau Hi Junio, On Tue, 7 Mar 2017, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> > When viewing a working tree file, oid.hash could be 0{40} and > >> > read_sha1_file() is not the right function to use to obtain the > >> > contents. > >> > > >> > Both of these two need to pay attention to 0{40}, I think, as the > >> > user may be running "difftool -R --dir-diff" in which case the > >> > working tree would appear in the left hand side instead. > >> > >> As a side note, I think even outside of 0{40}, this should be checking > >> the return value of read_sha1_file(). A corrupted repo should die(), not > >> segfault. > > > > I agree. I am on it. > > Friendly ping, if only to make sure that we can keep a piece of this > thread in the more "recent" pile. > > If you have other topics you need to perfect, I think it is OK to > postpone the fix on this topic a bit longer, but I'd hate to ship > two releases with a known breakage without an attempt to fix it, so > if you are otherwise occupied, I may encourage others (including me) > to take a look at this. The new "difftool" also has a reported > regression somebody else expressed willingness to work on, which is > sort of blocked by everybody else not knowing the timeline on this > one. cf. <20170303212836.GB13790@arch-attack.localdomain> > > A patch series would be very welcome, but "Please go ahead if > somebody else has time, and I'll help reviewing" would be also > good. Unfortunately, the obvious fix was not enough. My current work in progress is in the `difftool-null-sha1` branch on https://gihub.com/dscho/git. I still have a few other things to tend to, but hope to come back to the difftool before the end of the week. Ciao, Johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-07 22:34 ` Johannes Schindelin @ 2017-03-13 17:56 ` David Aguilar 2017-03-13 18:32 ` Junio C Hamano ` (2 more replies) 0 siblings, 3 replies; 17+ messages in thread From: David Aguilar @ 2017-03-13 17:56 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Junio C Hamano, Christophe Macabiau, Git ML Detect the null object ID for symlinks in dir-diff so that difftool can prepare temporary files that matches how git handles symlinks. Previously, a null object ID would crash difftool. We now detect null object IDs and write the symlink's content into the temporary symlink stand-in file. Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: David Aguilar <davvid@gmail.com> --- builtin/difftool.c | 36 +++++++++++++++++++++++++++++++++--- t/t7800-difftool.sh | 40 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 73 insertions(+), 3 deletions(-) diff --git a/builtin/difftool.c b/builtin/difftool.c index d13350ce83..6c20e20b45 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -254,6 +254,31 @@ static int ensure_leading_directories(char *path) } } +static int create_symlink_file(struct cache_entry* ce, struct checkout* state) +{ + /* + * Dereference a worktree symlink and writes its contents + * into the checkout state's path. + */ + struct strbuf path = STRBUF_INIT; + struct strbuf link = STRBUF_INIT; + + int ok = 0; + + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { + strbuf_add(&path, state->base_dir, state->base_dir_len); + strbuf_add(&path, ce->name, ce_namelen(ce)); + + write_file_buf(path.buf, link.buf, link.len); + ok = 1; + } + + strbuf_release(&path); + strbuf_release(&link); + + return ok; +} + static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, int argc, const char **argv) { @@ -376,13 +401,13 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, continue; } - if (S_ISLNK(lmode)) { + if (S_ISLNK(lmode) && !is_null_oid(&loid)) { char *content = read_sha1_file(loid.hash, &type, &size); add_left_or_right(&symlinks2, src_path, content, 0); free(content); } - if (S_ISLNK(rmode)) { + if (S_ISLNK(rmode) && !is_null_oid(&roid)) { char *content = read_sha1_file(roid.hash, &type, &size); add_left_or_right(&symlinks2, dst_path, content, 1); free(content); @@ -414,7 +439,12 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, oidcpy(&ce->oid, &roid); strcpy(ce->name, dst_path); ce->ce_namelen = dst_path_len; - if (checkout_entry(ce, &rstate, NULL)) + + if (S_ISLNK(rmode) && is_null_oid(&roid)) { + if (!create_symlink_file(ce, &rstate)) + return error("unable to create symlink file %s", + dst_path); + } else if (checkout_entry(ce, &rstate, NULL)) return error("could not write '%s'", dst_path); } else if (!is_null_oid(&roid)) { diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh index e1ec292718..64f8e451b5 100755 --- a/t/t7800-difftool.sh +++ b/t/t7800-difftool.sh @@ -623,4 +623,44 @@ test_expect_success SYMLINKS 'difftool --dir-diff symlinked directories' ' ) ' +test_expect_success SYMLINKS 'difftool --dir-diff' ' + touch b && + ln -s b c && + git add . && + test_tick && + git commit -m initial && + touch d && + rm c && + ln -s d c && + + git difftool --dir-diff --extcmd ls >output && + grep -v ^/ output >actual && + cat >expect <<-EOF && + b + c + dirlinks + output + submod + + c + dirlinks + output + submod + EOF + test_cmp expect actual && + + # The left side contains symlink "c" that points to "b" + test_config difftool.cat.cmd "cat \$LOCAL/c" && + git difftool --dir-diff --tool cat >actual && + echo b >expect && + test_cmp expect actual && + + # The right side contains symlink "c" that points to "d", + # which mimics the state of the worktree. + test_config difftool.cat.cmd "cat \$REMOTE/c" && + git difftool --dir-diff --tool cat >actual && + echo -n d >expect && + test_cmp expect actual +' + test_done -- 2.12.0.266.g44c9eec009 ^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 17:56 ` [PATCH] difftool: handle changing symlinks in dir-diff mode David Aguilar @ 2017-03-13 18:32 ` Junio C Hamano 2017-03-13 21:04 ` Johannes Schindelin ` (2 more replies) 2017-03-13 19:02 ` Junio C Hamano 2017-03-13 21:36 ` Johannes Schindelin 2 siblings, 3 replies; 17+ messages in thread From: Junio C Hamano @ 2017-03-13 18:32 UTC (permalink / raw) To: David Aguilar; +Cc: Johannes Schindelin, Christophe Macabiau, Git ML David Aguilar <davvid@gmail.com> writes: > Detect the null object ID for symlinks in dir-diff so that difftool can > prepare temporary files that matches how git handles symlinks. > > Previously, a null object ID would crash difftool. We now detect null > object IDs and write the symlink's content into the temporary symlink > stand-in file. > > Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de> > Signed-off-by: David Aguilar <davvid@gmail.com> > --- I would have appreciated (and I suspect other reviewers would, too) a bit of back-story wrt how "Original-patch-by" resulted in this patch after the three-dashes line. It is perfectly fine if you two coordinated privately; I mostly wanted to hear something like "Dscho has been working on this and I asked him if it is OK to take over his WIP to produce a quick-fix we can ship on the maint branch, here is the result of that collaboration." IOW, the person who is named as the original author is fine to be named like so (I care only because I do not think we saw the "original" here on the list). > +static int create_symlink_file(struct cache_entry* ce, struct checkout* state) Asterisk sticks to variable, not type. > +{ > + /* > + * Dereference a worktree symlink and writes its contents s/writes/write/ > + * into the checkout state's path. > + */ > + struct strbuf path = STRBUF_INIT; > + struct strbuf link = STRBUF_INIT; > + > + int ok = 0; > + > + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { > + strbuf_add(&path, state->base_dir, state->base_dir_len); > + strbuf_add(&path, ce->name, ce_namelen(ce)); > + > + write_file_buf(path.buf, link.buf, link.len); This does "write content into symlink stand-in file", but why? A symbolic link that is not changed on the right-hand side of the comparison (i.e. S_ISLNK(rmode) && !is_null_oid(&roid)) would go to checkout_entry() which - creates a symbolic link, on a filesystem that supports symlink; or - writes the stand-in file on a filesystem that does not. which is the right thing. It seems that the other side of "if (!use_wt_file())" if/elseif/... cascade also does the right thing manually. Shouldn't this helper also do the same (i.e. create symlink and fall back to stand-in)? Also, I am not sure if strbuf_readlink() can unconditionally used here. On a filesystem without symbolic link, the working tree entity that corresponds to the ce that represents a symlink is a stand-in regular file, so shouldn't we be opening it as a regular file and reading its contents in that case? ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 18:32 ` Junio C Hamano @ 2017-03-13 21:04 ` Johannes Schindelin 2017-03-13 21:27 ` Johannes Schindelin 2017-03-14 4:13 ` Junio C Hamano 2 siblings, 0 replies; 17+ messages in thread From: Johannes Schindelin @ 2017-03-13 21:04 UTC (permalink / raw) To: Junio C Hamano; +Cc: David Aguilar, Christophe Macabiau, Git ML Hi Junio, On Mon, 13 Mar 2017, Junio C Hamano wrote: > David Aguilar <davvid@gmail.com> writes: > > > Detect the null object ID for symlinks in dir-diff so that difftool > > can prepare temporary files that matches how git handles symlinks. > > > > Previously, a null object ID would crash difftool. We now detect null > > object IDs and write the symlink's content into the temporary symlink > > stand-in file. > > > > Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > Signed-off-by: David Aguilar <davvid@gmail.com> > > --- > > I would have appreciated (and I suspect other reviewers would, too) > a bit of back-story wrt how "Original-patch-by" resulted in this > patch after the three-dashes line. It is perfectly fine if you two > coordinated privately; I mostly wanted to hear something like "Dscho > has been working on this and I asked him if it is OK to take over > his WIP to produce a quick-fix we can ship on the maint branch, here > is the result of that collaboration." IOW, the person who is named > as the original author is fine to be named like so (I care only > because I do not think we saw the "original" here on the list). The story is more like: Johannes started working on this but got pulled away by other tasks and sent out a link to the initial patch, along with a note that he hopes to be able to get back to working on that patch before long (but of course he did not get the chance): http://public-inbox.org/git/alpine.DEB.2.20.1703072332370.3767@virtualbox/ There was no private exchange. I am happy that David picked up the project. Ciao, Johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 18:32 ` Junio C Hamano 2017-03-13 21:04 ` Johannes Schindelin @ 2017-03-13 21:27 ` Johannes Schindelin 2017-03-13 21:33 ` Junio C Hamano 2017-03-14 4:13 ` Junio C Hamano 2 siblings, 1 reply; 17+ messages in thread From: Johannes Schindelin @ 2017-03-13 21:27 UTC (permalink / raw) To: Junio C Hamano; +Cc: David Aguilar, Christophe Macabiau, Git ML Hi Junio, On Mon, 13 Mar 2017, Junio C Hamano wrote: > David Aguilar <davvid@gmail.com> writes: > > > +static int create_symlink_file(struct cache_entry* ce, struct checkout* state) > > Asterisk sticks to variable, not type. If only we had tools to format the code so that authors as well as reviewers could concentrate on essential parts of the patches :-) > > + * into the checkout state's path. > > + */ > > + struct strbuf path = STRBUF_INIT; > > + struct strbuf link = STRBUF_INIT; > > + > > + int ok = 0; > > + > > + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { > > + strbuf_add(&path, state->base_dir, state->base_dir_len); > > + strbuf_add(&path, ce->name, ce_namelen(ce)); > > + > > + write_file_buf(path.buf, link.buf, link.len); > > This does "write content into symlink stand-in file", but why? From the commit message: > Detect the null object ID for symlinks in dir-diff so that > difftool can prepare temporary files that matches how git > handles symlinks. The obvious connection: when core.symlinks = false, Git already falls back to writing plain files with the link target as contents. This function does the same, for the same motivation: it is the best we can do in this case. > Also, I am not sure if strbuf_readlink() can unconditionally used > here. On a filesystem without symbolic link, the working tree > entity that corresponds to the ce that represents a symlink is a > stand-in regular file, so shouldn't we be opening it as a regular > file and reading its contents in that case? I think you are right, we cannot simply call strbuf_readlink(), we would have to check the core_symlinks variable to maybe read the file contents instead. But then, it may not be appropriate to read the worktree to begin with... see my reply to the patch that I will send out in a couple of minutes. Ciao, Johannes ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 21:27 ` Johannes Schindelin @ 2017-03-13 21:33 ` Junio C Hamano 2017-03-14 2:20 ` David Aguilar 0 siblings, 1 reply; 17+ messages in thread From: Junio C Hamano @ 2017-03-13 21:33 UTC (permalink / raw) To: Johannes Schindelin; +Cc: David Aguilar, Christophe Macabiau, Git ML Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> > + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { >> > + strbuf_add(&path, state->base_dir, state->base_dir_len); >> > + strbuf_add(&path, ce->name, ce_namelen(ce)); >> > + >> > + write_file_buf(path.buf, link.buf, link.len); >> >> This does "write content into symlink stand-in file", but why? > > From the commit message: > > > Detect the null object ID for symlinks in dir-diff so that > > difftool can prepare temporary files that matches how git > > handles symlinks. Yes I read what the proposed log message said, and noticed that symbolic link is _always_ written as a regular file. I was questioning why. I know Git falls back to such behaviour when the filesystem does not support symbolic links. "Why do so unconditionally, even when the filesystem does?" was the question. > The obvious connection: when core.symlinks = false, Git already falls back > to writing plain files with the link target as contents. This function > does the same, for the same motivation: it is the best we can do in this > case. And that "obvious connection" does not answer the question. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 21:33 ` Junio C Hamano @ 2017-03-14 2:20 ` David Aguilar 2017-03-14 5:52 ` Junio C Hamano 0 siblings, 1 reply; 17+ messages in thread From: David Aguilar @ 2017-03-14 2:20 UTC (permalink / raw) To: Junio C Hamano; +Cc: Johannes Schindelin, Christophe Macabiau, Git ML On Mon, Mar 13, 2017 at 02:33:09PM -0700, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> > + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { > >> > + strbuf_add(&path, state->base_dir, state->base_dir_len); > >> > + strbuf_add(&path, ce->name, ce_namelen(ce)); > >> > + > >> > + write_file_buf(path.buf, link.buf, link.len); > >> > >> This does "write content into symlink stand-in file", but why? > > > > From the commit message: > > > > > Detect the null object ID for symlinks in dir-diff so that > > > difftool can prepare temporary files that matches how git > > > handles symlinks. > > Yes I read what the proposed log message said, and noticed that > symbolic link is _always_ written as a regular file. > > I was questioning why. I know Git falls back to such behaviour when > the filesystem does not support symbolic links. "Why do so > unconditionally, even when the filesystem does?" was the question. > > > The obvious connection: when core.symlinks = false, Git already falls back > > to writing plain files with the link target as contents. This function > > does the same, for the same motivation: it is the best we can do in this > > case. > > And that "obvious connection" does not answer the question. Thanks for the thorough review. I'll try to answer questions from the various emails in just this one spot in case it helps. Dscho wrote: > Given that we explicitly ask use_wt_file() whether we can use the > worktree's file, and we get the answer "no" before we enter the modified > code block, I would really expect us *not* to want to read the link from > disk at all. That probably deserves a comment on its own. The use_wt_file() function really answers the question -- "does the worktree contain content that does is unknown to Git, and thus we should symlink to the worktree?" Since these are symlinks, and symlinks are already used to point back to the worktree (so that difftools will write back to the worktree in case the user edited in-tool) then the meaning of use_wt_file() in this context can be misleading. What we're trying to do is handle the case where Git knows it's dealing with an entry that it wants to checkout into a temporary area but it has no way to do so. These entries always have the 0{40} null SHA1 because that is what git diff-files reports for content that exists in the worktree only. Dscho wrote: > I think you are right, we cannot simply call strbuf_readlink(), we would > have to check the core_symlinks variable to maybe read the file contents > instead. > > But then, it may not be appropriate to read the worktree to begin > with... > see my reply to the patch that I will send out in a couple of minutes. In this case, where the null SHA1 indicates that it is unknown content, then I believe we must read from the worktree to simulate what Git would have checked out. Checking for core.symlinks should probably be done before calling strbuf_readlink, though. Junio wrote: > > > Detect the null object ID for symlinks in dir-diff so that > > > difftool can prepare temporary files that matches how git > > > handles symlinks. > > Yes I read what the proposed log message said, and noticed that > symbolic link is _always_ written as a regular file. > > I was questioning why. I know Git falls back to such behaviour when > the filesystem does not support symbolic links. "Why do so > unconditionally, even when the filesystem does?" was the question. I have to re-read the code to see where this is special-cased, but it seems that symlinks are always written as raw files by the dir-diff code for the purposes of full-tree diffing. I think the "why" is tied up in the implementation details of the symlink-back-to-the-worktree-to-allow-editing feature. By special-casing in-tree symlinks and writing them as raw files we can hijack on-disk symlinks. We use on-disk symlinks to link back to the worktree so that external diff tools can write to the worktree through the symlink. Junio wrote: > On this part I didn't comment in my previous message, but what is > the implication of omitting add-left-or-right and not registering > this symbolic link modified in the working tree to the symlinks2 > table? > > I am wondering if these should be more like > > if (S_ISLNK(lmode) { > char *content = get_symlink(src_path, &loid); > add_left_or_right(&symlinks2, src_path, content, 0); > free(content); > } > > with get_symlink() helper that does > > - if the object name is not 0{40}, read from the object store > > - if the object name is 0{40}, that means we need to read the real > contents from the working tree file, so do the "readlink(2) if > symbolic link is supported, otherwise open/read/close the stub > file sitting there" thing. > > Similary to the right hand side tree. I'll take a look at trying this out. Reading the code again, the point of add_left_or_right() is to populate the worktree (done later in the loop) with the stuff we read from Git. Thus, if we changed just this section to call get_symlink() then we should not even try to checkout any symlink entries in !use_wt_file(...) block where checkout_entry() / create_symlink_file() are called. Since the symlinks2 hashmap already populates the worktree then that code should instead simply skip symlinks. Later, once we get to the part where we copy stuff back into the worktree, it should be noted that we always skip over symlinks. We simply do not handle them, never have, and I don't think we really should. The use case that we're trying to handle is a common use case where the user is using dir-diff and uses the difftool to edit a file with content that exists in the worktree only. For that use case, a symlink to the worktree is created in the temp area, and Git does not need to do anything special. I do not think the use case of a user editing a symlink stand-in file, and then having Git update the worktree with the updated content, is common or something we want to support. I'd prefer to keep the use case simple since the code is already complicated enough. I'll take a stab at adding a get_symlink() helper, adjust the code so that add_left_or_right() is populated, and special-case the checkout_entry() code path to simply skip over null SHA1s. I'll also address the review notes and try to add more comments to describe what exactly this code does and why it does it. Do the tests make sense? One minor thing I noticed is that I had to use "echo -n" for the stuff coming out of strbuf_readlink(), and plain "echo" for entries that come in via read_sha1_file() content passed to add_left_or_right(). That suggests that maybe I should append a newline to the output from strbuf_readlink() so that it matches Git. Does that sound right? Does Git store symlink entries with a terminating newline? Please let me know if I'm missing something. cheers, -- David ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-14 2:20 ` David Aguilar @ 2017-03-14 5:52 ` Junio C Hamano 0 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2017-03-14 5:52 UTC (permalink / raw) To: David Aguilar; +Cc: Johannes Schindelin, Christophe Macabiau, Git ML David Aguilar <davvid@gmail.com> writes: > Reading the code again, the point of add_left_or_right() > is to populate the worktree (done later in the loop) with > the stuff we read from Git. Thus, if we changed just this > section to call get_symlink() then we should not even try > to checkout any symlink entries in !use_wt_file(...) > block where checkout_entry() / create_symlink_file() are called. > > Since the symlinks2 hashmap already populates the worktree > then that code should instead simply skip symlinks. OK, that would simplify things, certainly. > I'll take a stab at adding a get_symlink() helper, adjust > the code so that add_left_or_right() is populated, and > special-case the checkout_entry() code path to simply skip > over null SHA1s. Did you mean s/null SHA1s/S_ISLNK()s/? > One minor thing I noticed is that I had to use "echo -n" > for the stuff coming out of strbuf_readlink(), and > plain "echo" for entries that come in via read_sha1_file() > content passed to add_left_or_right(). > > That suggests that maybe I should append a newline to the > output from strbuf_readlink() so that it matches Git. > Does that sound right? Does Git store symlink entries > with a terminating newline? Do not append a newline. Unless the pathname of the target you are symlinking to ends with LF, readlink() won't end with LF, and the stand-in file shouldn't, either. By the way, avoid "echo -n"; use "printf '%s'" instead. Thanks. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 18:32 ` Junio C Hamano 2017-03-13 21:04 ` Johannes Schindelin 2017-03-13 21:27 ` Johannes Schindelin @ 2017-03-14 4:13 ` Junio C Hamano 2 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2017-03-14 4:13 UTC (permalink / raw) To: David Aguilar; +Cc: Johannes Schindelin, Christophe Macabiau, Git ML Junio C Hamano <gitster@pobox.com> writes: >> + struct strbuf path = STRBUF_INIT; >> + struct strbuf link = STRBUF_INIT; >> + >> + int ok = 0; >> + >> + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { >> + strbuf_add(&path, state->base_dir, state->base_dir_len); >> + strbuf_add(&path, ce->name, ce_namelen(ce)); >> + >> + write_file_buf(path.buf, link.buf, link.len); > > This does "write content into symlink stand-in file", but why? A > symbolic link that is not changed on the right-hand side of the > comparison (i.e. S_ISLNK(rmode) && !is_null_oid(&roid)) would go to > checkout_entry() which > > - creates a symbolic link, on a filesystem that supports symlink; or > - writes the stand-in file on a filesystem that does not. > > which is the right thing. It seems that the other side of "if (!use_wt_file())" > if/elseif/... cascade also does the right thing manually. > > Shouldn't this helper also do the same (i.e. create symlink and fall > back to stand-in)? > > Also, I am not sure if strbuf_readlink() can unconditionally used > here. On a filesystem without symbolic link, the working tree > entity that corresponds to the ce that represents a symlink is a > stand-in regular file, so shouldn't we be opening it as a regular > file and reading its contents in that case? I _think_ I was mistaken (please correct me again if my reasoning below is wrong) and unconditional writing of a plain regular file is what this codepath wants to do, because we are preparing two temporary directories, to be fed to a Git-unaware tool that knows how to show a diff of two directories (e.g. "diff -r A B"). Because the tool is Git-unaware, if a symbolic link appears in either of these temporary directories, it will try to dereference and show the difference of the target of the symbolic link, which is not what we want, as the goal of the dir-diff mode is to produce an output that is logically equivalent to what "git diff" produces---most importantly, we want to get textual comparison of the result of the readlink(2). And write_file_buf() used here is exactly that---we write the contents of symlink to a regular file to force the external tool to compare the readlink(2) result as if it is a text. Even on a filesystem that is capable of doing a symbolic link. The strbuf_readlink() that read the contents of symlink, however, is wrong on a filesystem that is not capable of a symbolic link; if core.symlinks is false, we do need to read them as a regular file. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 17:56 ` [PATCH] difftool: handle changing symlinks in dir-diff mode David Aguilar 2017-03-13 18:32 ` Junio C Hamano @ 2017-03-13 19:02 ` Junio C Hamano 2017-03-13 21:36 ` Johannes Schindelin 2 siblings, 0 replies; 17+ messages in thread From: Junio C Hamano @ 2017-03-13 19:02 UTC (permalink / raw) To: David Aguilar; +Cc: Johannes Schindelin, Christophe Macabiau, Git ML David Aguilar <davvid@gmail.com> writes: > - if (S_ISLNK(lmode)) { > + if (S_ISLNK(lmode) && !is_null_oid(&loid)) { > char *content = read_sha1_file(loid.hash, &type, &size); > add_left_or_right(&symlinks2, src_path, content, 0); > free(content); > } > > - if (S_ISLNK(rmode)) { > + if (S_ISLNK(rmode) && !is_null_oid(&roid)) { > char *content = read_sha1_file(roid.hash, &type, &size); > add_left_or_right(&symlinks2, dst_path, content, 1); > free(content); On this part I didn't comment in my previous message, but what is the implication of omitting add-left-or-right and not registering this symbolic link modified in the working tree to the symlinks2 table? I am wondering if these should be more like if (S_ISLNK(lmode) { char *content = get_symlink(src_path, &loid); add_left_or_right(&symlinks2, src_path, content, 0); free(content); } with get_symlink() helper that does - if the object name is not 0{40}, read from the object store - if the object name is 0{40}, that means we need to read the real contents from the working tree file, so do the "readlink(2) if symbolic link is supported, otherwise open/read/close the stub file sitting there" thing. Similary to the right hand side tree. Discarding "content" after reading feels wasteful, as that is the information we would be using when populating the rstate and lstaten working trees later in the loop, but that would probably need a larger surgery to the code to optimize, I would imagine. ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH] difftool: handle changing symlinks in dir-diff mode 2017-03-13 17:56 ` [PATCH] difftool: handle changing symlinks in dir-diff mode David Aguilar 2017-03-13 18:32 ` Junio C Hamano 2017-03-13 19:02 ` Junio C Hamano @ 2017-03-13 21:36 ` Johannes Schindelin 2 siblings, 0 replies; 17+ messages in thread From: Johannes Schindelin @ 2017-03-13 21:36 UTC (permalink / raw) To: David Aguilar; +Cc: Junio C Hamano, Christophe Macabiau, Git ML Hi David, Thank you very much for picking this up! On Mon, 13 Mar 2017, David Aguilar wrote: > Detect the null object ID for symlinks in dir-diff so that difftool can > prepare temporary files that matches how git handles symlinks. Maybe a description is needed how the OID can be null in that case. I have to admit that I failed to wrap my head around this so far. > Previously, a null object ID would crash difftool. We now detect null > object IDs and write the symlink's content into the temporary symlink > stand-in file. > > Original-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de> > Signed-off-by: David Aguilar <davvid@gmail.com> > --- > diff --git a/builtin/difftool.c b/builtin/difftool.c > index d13350ce83..6c20e20b45 100644 > --- a/builtin/difftool.c > +++ b/builtin/difftool.c > @@ -254,6 +254,31 @@ static int ensure_leading_directories(char *path) > } > } > > +static int create_symlink_file(struct cache_entry* ce, struct checkout* state) > +{ > + /* > + * Dereference a worktree symlink and writes its contents > + * into the checkout state's path. > + */ > + struct strbuf path = STRBUF_INIT; > + struct strbuf link = STRBUF_INIT; > + > + int ok = 0; It would appear that the convention in Git's C code is for functions to return 0 upon success and -1 upon error, and to use `int ret` for that purpose. > + if (strbuf_readlink(&link, ce->name, ce_namelen(ce)) == 0) { Looking at the calling site, I would have expected the code to read the contents as per ce->hash... After all, we are calling this in the !use_wt_file() case. But that is exactly that null hash, isn't it? > @@ -414,7 +439,12 @@ static int run_dir_diff(const char *extcmd, int symlinks, const char *prefix, The lines before this hunk read: > if (!use_wt_file(workdir, dst_path, &roid)) { > ce->ce_mode = rmode; ... and then follow these lines: > oidcpy(&ce->oid, &roid); > strcpy(ce->name, dst_path); > ce->ce_namelen = dst_path_len; > - if (checkout_entry(ce, &rstate, NULL)) > + > + if (S_ISLNK(rmode) && is_null_oid(&roid)) { > + if (!create_symlink_file(ce, &rstate)) > + return error("unable to create symlink file %s", > + dst_path); > + } else if (checkout_entry(ce, &rstate, NULL)) > return error("could not write '%s'", > dst_path); > } else if (!is_null_oid(&roid)) { Given that we explicitly ask use_wt_file() whether we can use the worktree's file, and we get the answer "no" before we enter the modified code block, I would really expect us *not* to want to read the link from disk at all. Further, reading the code of use_wt_file(), there seems to be another case where roid is left alone: when the file could not be lstat()ed. So I wonder whether the create_symlink_file() is the correct solution, or whether we could somehow fill roid correctly instead, and keep the checkout_entry() call? Ciao, Dscho ^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2017-03-14 5:52 UTC | newest] Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-02-24 11:47 fatal error when diffing changed symlinks Christophe Macabiau 2017-02-24 17:53 ` Junio C Hamano 2017-02-24 19:51 ` Junio C Hamano 2017-02-24 20:35 ` Jeff King 2017-02-25 12:36 ` Johannes Schindelin 2017-03-07 18:16 ` Junio C Hamano 2017-03-07 22:34 ` Johannes Schindelin 2017-03-13 17:56 ` [PATCH] difftool: handle changing symlinks in dir-diff mode David Aguilar 2017-03-13 18:32 ` Junio C Hamano 2017-03-13 21:04 ` Johannes Schindelin 2017-03-13 21:27 ` Johannes Schindelin 2017-03-13 21:33 ` Junio C Hamano 2017-03-14 2:20 ` David Aguilar 2017-03-14 5:52 ` Junio C Hamano 2017-03-14 4:13 ` Junio C Hamano 2017-03-13 19:02 ` Junio C Hamano 2017-03-13 21:36 ` Johannes Schindelin
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.