* Bug: problem with file named with dash character [not found] <52ae7682-3e9a-4b52-bec1-08ba3aadffc0@office.digitalus.nl> @ 2012-06-27 7:32 ` Daniel Lyubomirov -|- Digitalus Bulgaria 2012-06-27 9:57 ` faux 2012-06-27 18:28 ` Junio C Hamano 0 siblings, 2 replies; 13+ messages in thread From: Daniel Lyubomirov -|- Digitalus Bulgaria @ 2012-06-27 7:32 UTC (permalink / raw) To: git Hi, Аccidentally my colleague created a file in the root dir of the git repo called - (just dash). As result for every commit having this file, diff , merge, cherry-pick maybe others just hang. I tried to make rebase interactive to edit the commit and remove the file but that fails too. system: Fedora Linux 3.4.2-1.fc16.x86_64 git version: 1.7.7.6 How to reproduce the bug: 1. create a file named - 2. add the file in the index and commit 3. try diff with another commit, branch So far i found following workarounds: 1. You can delete the file. Then the problem stays in the history only for the commits that have the file. 2. Use git filter-branch and deleted the file from the history. git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch -- -" <problem commit>..HEAD But this changes the history. Thanks for your work. Regards, Daniel Lyubomirov ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 7:32 ` Bug: problem with file named with dash character Daniel Lyubomirov -|- Digitalus Bulgaria @ 2012-06-27 9:57 ` faux 2012-06-27 18:28 ` Junio C Hamano 1 sibling, 0 replies; 13+ messages in thread From: faux @ 2012-06-27 9:57 UTC (permalink / raw) To: git On Wed, Jun 27, 2012 at 09:32:21AM +0200, Daniel Lyubomirov -|- Digitalus Bulgaria wrote: > Hi, > > Аccidentally my colleague created a file in the root dir of the git repo called - (just dash). > As result for every commit having this file, diff , merge, cherry-pick maybe others just hang. > I tried to make rebase interactive to edit the commit and remove the file but that fails too. Haha, it tries to read from standard in: $ git init && printf '' >- && git add -- - && git commit -m "lol" Pressing ^D allows it to continue. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 7:32 ` Bug: problem with file named with dash character Daniel Lyubomirov -|- Digitalus Bulgaria 2012-06-27 9:57 ` faux @ 2012-06-27 18:28 ` Junio C Hamano 2012-06-27 19:52 ` Jeff King 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 18:28 UTC (permalink / raw) To: Daniel Lyubomirov; +Cc: git Daniel Lyubomirov -|- Digitalus Bulgaria <daniel@digitalus.bg> writes: > Аccidentally my colleague created a file in the root dir of the git repo called - (just dash). > As result for every commit having this file, diff , merge, cherry-pick maybe others just hang. Thanks for a report. I think the "diff" callchain should be refactored so that the caller can mark the special "stdin" token in a saner way, but until it happens, the following one-liner should do. diff.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/diff.c b/diff.c index 1a594df..caa2309 100644 --- a/diff.c +++ b/diff.c @@ -2589,6 +2589,14 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(sha1)) return 0; + /* + * And asking to read "-" from the working tree triggers stdin + * input (which needs to be fixed separately by refactoring the + * callchain), forbid "reuse" for now. + */ + if (!strcmp(name, "-")) + return 0; + len = strlen(name); pos = cache_name_pos(name, len); if (pos < 0) ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 18:28 ` Junio C Hamano @ 2012-06-27 19:52 ` Jeff King 2012-06-27 20:25 ` Jeff King 2012-06-27 20:27 ` Junio C Hamano 0 siblings, 2 replies; 13+ messages in thread From: Jeff King @ 2012-06-27 19:52 UTC (permalink / raw) To: Junio C Hamano; +Cc: Daniel Lyubomirov, git On Wed, Jun 27, 2012 at 11:28:21AM -0700, Junio C Hamano wrote: > Thanks for a report. I think the "diff" callchain should be > refactored so that the caller can mark the special "stdin" token in > a saner way, but until it happens, the following one-liner should > do. Yeah, I assume this is there at all to support "diff --no-index -", so the special marking should happen at that layer. > diff --git a/diff.c b/diff.c > index 1a594df..caa2309 100644 > --- a/diff.c > +++ b/diff.c > @@ -2589,6 +2589,14 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int > if (!FAST_WORKING_DIRECTORY && !want_file && has_sha1_pack(sha1)) > return 0; > > + /* > + * And asking to read "-" from the working tree triggers stdin > + * input (which needs to be fixed separately by refactoring the > + * callchain), forbid "reuse" for now. > + */ > + if (!strcmp(name, "-")) > + return 0; > + Unfortunately this is not enough. The problematic code path is the call to populate_from_stdin in diff_populate_filespec. And we follow that conditional if reuse_worktree_file is true, _or_ if the sha1_valid flag on the filespec is not set. We hit the latter due to the --no-index case, but we can also hit it if we are comparing a working tree file in the repo that is stat-dirty. So without your patch, this reads from stdin: git init repo && cd repo && echo foo >- && git add - && git commit -m foo when the commit command tries to generate the diff summary. Your patch fixes it, but remains broken if you then do: echo changes >>- && git diff I think you'd want to do just do something like: diff --git a/diff.c b/diff.c index 1a594df..aac72b7 100644 --- a/diff.c +++ b/diff.c @@ -2684,9 +2684,6 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only) struct stat st; int fd; - if (!strcmp(s->path, "-")) - return populate_from_stdin(s); - if (lstat(s->path, &st) < 0) { if (errno == ENOENT) { err_empty: to temporarily fix it. That breaks echo content | git diff --no-index - some-file but that code path should be fixed properly (with a use_stdin flag in the filespec). -Peff ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 19:52 ` Jeff King @ 2012-06-27 20:25 ` Jeff King 2012-06-27 20:27 ` Junio C Hamano 1 sibling, 0 replies; 13+ messages in thread From: Jeff King @ 2012-06-27 20:25 UTC (permalink / raw) To: Junio C Hamano; +Cc: Daniel Lyubomirov, git On Wed, Jun 27, 2012 at 03:52:05PM -0400, Jeff King wrote: > I think you'd want to do just do something like: > > diff --git a/diff.c b/diff.c > index 1a594df..aac72b7 100644 > --- a/diff.c > +++ b/diff.c > @@ -2684,9 +2684,6 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only) > struct stat st; > int fd; > > - if (!strcmp(s->path, "-")) > - return populate_from_stdin(s); > - > if (lstat(s->path, &st) < 0) { > if (errno == ENOENT) { > err_empty: > > to temporarily fix it. That breaks > > echo content | git diff --no-index - some-file > > but that code path should be fixed properly (with a use_stdin flag in > the filespec). Something like the patch below, which keeps the stdin test in t4002 working for me. I suspect there are other problems lurking with the stdin case. For example, we try to drop filespec contents whenever we can to reduce memory pressure, under the assumption that we can always re-read the blob later. But with stdin, we would need to be careful to mark the contents as precious somehow. diff --git a/diff-no-index.c b/diff-no-index.c index f0b0010..c64bd5c 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -51,6 +51,21 @@ static int get_mode(const char *path, int *mode) return 0; } +static struct diff_filespec *noindex_filespec(const char *name, int mode) +{ + struct diff_filespec *r; + + if (!name) + name = "/dev/null"; + r = alloc_filespec(name); + + fill_filespec(r, null_sha1, mode); + if (!strcmp(name, "-")) + r->is_stdin = 1; + + return r; +} + static int queue_diff(struct diff_options *o, const char *name1, const char *name2) { @@ -137,15 +152,8 @@ static int queue_diff(struct diff_options *o, tmp_c = name1; name1 = name2; name2 = tmp_c; } - if (!name1) - name1 = "/dev/null"; - if (!name2) - name2 = "/dev/null"; - d1 = alloc_filespec(name1); - d2 = alloc_filespec(name2); - fill_filespec(d1, null_sha1, mode1); - fill_filespec(d2, null_sha1, mode2); - + d1 = noindex_filespec(name1, mode1); + d2 = noindex_filespec(name2, mode2); diff_queue(&diff_queued_diff, d1, d2); return 0; } diff --git a/diff.c b/diff.c index 1a594df..449bfba 100644 --- a/diff.c +++ b/diff.c @@ -2675,6 +2675,9 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only) if (size_only && 0 < s->size) return 0; + if (s->is_stdin) + return populate_from_stdin(s); + if (S_ISGITLINK(s->mode)) return diff_populate_gitlink(s, size_only); @@ -2684,9 +2687,6 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only) struct stat st; int fd; - if (!strcmp(s->path, "-")) - return populate_from_stdin(s); - if (lstat(s->path, &st) < 0) { if (errno == ENOENT) { err_empty: diff --git a/diffcore.h b/diffcore.h index 8f32b82..be0739c 100644 --- a/diffcore.h +++ b/diffcore.h @@ -43,6 +43,7 @@ struct diff_filespec { unsigned should_free : 1; /* data should be free()'ed */ unsigned should_munmap : 1; /* data should be munmap()'ed */ unsigned dirty_submodule : 2; /* For submodules: its work tree is dirty */ + unsigned is_stdin : 1; #define DIRTY_SUBMODULE_UNTRACKED 1 #define DIRTY_SUBMODULE_MODIFIED 2 unsigned has_more_entries : 1; /* only appear in combined diff */ ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 19:52 ` Jeff King 2012-06-27 20:25 ` Jeff King @ 2012-06-27 20:27 ` Junio C Hamano 2012-06-27 20:33 ` Junio C Hamano 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 20:27 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Jeff King <peff@peff.net> writes: > but that code path should be fixed properly (with a use_stdin flag in > the filespec). Yes, just as I said; I am finding more and more issues with the no-index hack that I have been fixing a bit by bit since I send the message you responded to. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 20:27 ` Junio C Hamano @ 2012-06-27 20:33 ` Junio C Hamano 2012-06-27 20:35 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 20:33 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Junio C Hamano <gitster@pobox.com> writes: > Jeff King <peff@peff.net> writes: > >> but that code path should be fixed properly (with a use_stdin flag in >> the filespec). > > Yes, just as I said; I am finding more and more issues with the > no-index hack that I have been fixing a bit by bit since I send the > message you responded to. It is not ready yet, but here are a few patches WIP. -- >8 -- From: Junio C Hamano <gitster@pobox.com> Date: Wed, 27 Jun 2012 11:51:15 -0700 Subject: [PATCH 1/?] diff-index.c: do not pretend paths are pathspecs "git diff --no-index" takes exactly two paths, not pathspecs, and has its own way queue_diff() to populate the diff_queue. Do not call diff_tree_setup_paths(), pretending as it takes pathspecs. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- diff-no-index.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/diff-no-index.c b/diff-no-index.c index f0b0010..ca875da 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -175,6 +175,7 @@ void diff_no_index(struct rev_info *revs, int i; int no_index = 0; unsigned options = 0; + const char *paths[2]; /* Were we asked to do --no-index explicitly? */ for (i = 1; i < argc; i++) { @@ -233,8 +234,6 @@ void diff_no_index(struct rev_info *revs, if (prefix) { int len = strlen(prefix); - const char *paths[3]; - memset(paths, 0, sizeof(paths)); for (i = 0; i < 2; i++) { const char *p = argv[argc - 2 + i]; @@ -247,10 +246,10 @@ void diff_no_index(struct rev_info *revs, : p); paths[i] = p; } - diff_tree_setup_paths(paths, &revs->diffopt); + } else { + for (i = 0; i < 2; i++) + paths[i] = argv[argc - 2 + i]; } - else - diff_tree_setup_paths(argv + argc - 2, &revs->diffopt); revs->diffopt.skip_stat_unmatch = 1; if (!revs->diffopt.output_format) revs->diffopt.output_format = DIFF_FORMAT_PATCH; @@ -262,8 +261,7 @@ void diff_no_index(struct rev_info *revs, if (diff_setup_done(&revs->diffopt) < 0) die("diff_setup_done failed"); - if (queue_diff(&revs->diffopt, revs->diffopt.pathspec.raw[0], - revs->diffopt.pathspec.raw[1])) + if (queue_diff(&revs->diffopt, paths[0], paths[1])) exit(1); diff_set_mnemonic_prefix(&revs->diffopt, "1/", "2/"); diffcore_std(&revs->diffopt); -- 1.7.11.1.184.g3ee8f69 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 20:33 ` Junio C Hamano @ 2012-06-27 20:35 ` Junio C Hamano 2012-06-27 20:39 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 20:35 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Jeff King <peff@peff.net> writes: >> >>> but that code path should be fixed properly (with a use_stdin flag in >>> the filespec). >> >> Yes, just as I said; I am finding more and more issues with the >> no-index hack that I have been fixing a bit by bit since I send the >> message you responded to. > > It is not ready yet, but here are a few patches WIP. And this is the second clean-up -- >8 -- Subject: [PATCH 2/?] diff-index.c: unify handling of command line paths Regardless of where in the directory hierarchy you are, "-" on the command line means the standard input. The old code knew too much about how the low level machinery uses paths to read from the working tree and did not bother to have the same check for "-" when the command is run from the top-level. Unify the codepaths for subdirectory case and toplevel case into one and make it clearer. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- diff-no-index.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/diff-no-index.c b/diff-no-index.c index ca875da..a5c1e3e 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -172,7 +172,7 @@ void diff_no_index(struct rev_info *revs, int argc, const char **argv, int nongit, const char *prefix) { - int i; + int i, prefixlen; int no_index = 0; unsigned options = 0; const char *paths[2]; @@ -232,23 +232,18 @@ void diff_no_index(struct rev_info *revs, if (!DIFF_OPT_TST(&revs->diffopt, EXIT_WITH_STATUS)) setup_pager(); - if (prefix) { - int len = strlen(prefix); - - for (i = 0; i < 2; i++) { - const char *p = argv[argc - 2 + i]; + prefixlen = prefix ? strlen(prefix) : 0; + for (i = 0; i < 2; i++) { + const char *p = argv[argc - 2 + i]; + if (!strcmp(p, "-")) /* - * stdin should be spelled as '-'; if you have - * path that is '-', spell it as ./-. + * stdin should be spelled as "-"; if you have + * path that is "-", spell it as "./-". */ - p = (strcmp(p, "-") - ? xstrdup(prefix_filename(prefix, len, p)) - : p); - paths[i] = p; - } - } else { - for (i = 0; i < 2; i++) - paths[i] = argv[argc - 2 + i]; + p = p; + else if (prefixlen) + p = xstrdup(prefix_filename(prefix, prefixlen, p)); + paths[i] = p; } revs->diffopt.skip_stat_unmatch = 1; if (!revs->diffopt.output_format) -- 1.7.11.1.184.g3ee8f69 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 20:35 ` Junio C Hamano @ 2012-06-27 20:39 ` Junio C Hamano 2012-06-27 20:48 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 20:39 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> Jeff King <peff@peff.net> writes: >>> >>>> but that code path should be fixed properly (with a use_stdin flag in >>>> the filespec). >>> >>> Yes, just as I said; I am finding more and more issues with the >>> no-index hack that I have been fixing a bit by bit since I send the >>> message you responded to. >> >> It is not ready yet, but here are a few patches WIP. > > And this is the second clean-up And this is the third one. -- >8 -- Subject: [PATCH 3/?] diff-index.c: "git diff" has no need to read blob from the standard input Only "diff --no-index -" does. Bolting the logic into the low-level function diff_populate_filespec() was a layering violation from day one. Move populate_from_stdin() function out of the generic diff.c to its only user, diff-index.c. Also make sure "-" from the command line stays a special token "read from the standard input", even if we later decide to sanitize the result from prefix_filename() function in a few obvious ways, e.g. removing unnecessary "./" prefix, duplicated slashes "//" in the middle, etc. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- diff-no-index.c | 33 +++++++++++++++++++++++++++++++-- diff.c | 21 +-------------------- diffcore.h | 1 + 3 files changed, 33 insertions(+), 22 deletions(-) diff --git a/diff-no-index.c b/diff-no-index.c index a5c1e3e..9385d53 100644 --- a/diff-no-index.c +++ b/diff-no-index.c @@ -32,6 +32,12 @@ static int read_directory(const char *path, struct string_list *list) return 0; } +/* + * This should be "(standard input)", but it will probably expose many + * more breakages. + */ +static const char file_from_standard_input[] = "-"; + static int get_mode(const char *path, int *mode) { struct stat st; @@ -42,7 +48,7 @@ static int get_mode(const char *path, int *mode) else if (!strcasecmp(path, "nul")) *mode = 0; #endif - else if (!strcmp(path, "-")) + else if (path == file_from_standard_input) *mode = create_ce_mode(0666); else if (lstat(path, &st)) return error("Could not access '%s'", path); @@ -51,6 +57,23 @@ static int get_mode(const char *path, int *mode) return 0; } +static int populate_from_stdin(struct diff_filespec *s) +{ + struct strbuf buf = STRBUF_INIT; + size_t size = 0; + + if (strbuf_read(&buf, 0, 0) < 0) + return error("error while reading from stdin %s", + strerror(errno)); + + s->should_munmap = 0; + s->data = strbuf_detach(&buf, &size); + s->size = size; + s->should_free = 1; + s->nongit_stdin = 1; + return 0; +} + static int queue_diff(struct diff_options *o, const char *name1, const char *name2) { @@ -143,8 +166,14 @@ static int queue_diff(struct diff_options *o, name2 = "/dev/null"; d1 = alloc_filespec(name1); d2 = alloc_filespec(name2); + fill_filespec(d1, null_sha1, mode1); + if (name1 == file_from_standard_input) + populate_from_stdin(d1); + fill_filespec(d2, null_sha1, mode2); + if (name2 == file_from_standard_input) + populate_from_stdin(d2); diff_queue(&diff_queued_diff, d1, d2); return 0; @@ -240,7 +269,7 @@ void diff_no_index(struct rev_info *revs, * stdin should be spelled as "-"; if you have * path that is "-", spell it as "./-". */ - p = p; + p = file_from_standard_input; else if (prefixlen) p = xstrdup(prefix_filename(prefix, prefixlen, p)); paths[i] = p; diff --git a/diff.c b/diff.c index 1a594df..0c73c84 100644 --- a/diff.c +++ b/diff.c @@ -2619,22 +2619,6 @@ static int reuse_worktree_file(const char *name, const unsigned char *sha1, int return 0; } -static int populate_from_stdin(struct diff_filespec *s) -{ - struct strbuf buf = STRBUF_INIT; - size_t size = 0; - - if (strbuf_read(&buf, 0, 0) < 0) - return error("error while reading from stdin %s", - strerror(errno)); - - s->should_munmap = 0; - s->data = strbuf_detach(&buf, &size); - s->size = size; - s->should_free = 1; - return 0; -} - static int diff_populate_gitlink(struct diff_filespec *s, int size_only) { int len; @@ -2684,9 +2668,6 @@ int diff_populate_filespec(struct diff_filespec *s, int size_only) struct stat st; int fd; - if (!strcmp(s->path, "-")) - return populate_from_stdin(s); - if (lstat(s->path, &st) < 0) { if (errno == ENOENT) { err_empty: @@ -3048,7 +3029,7 @@ static void diff_fill_sha1_info(struct diff_filespec *one) if (DIFF_FILE_VALID(one)) { if (!one->sha1_valid) { struct stat st; - if (!strcmp(one->path, "-")) { + if (one->nongit_stdin) { hashcpy(one->sha1, null_sha1); return; } diff --git a/diffcore.h b/diffcore.h index 8f32b82..ef4fe17 100644 --- a/diffcore.h +++ b/diffcore.h @@ -42,6 +42,7 @@ struct diff_filespec { #define DIFF_FILE_VALID(spec) (((spec)->mode) != 0) unsigned should_free : 1; /* data should be free()'ed */ unsigned should_munmap : 1; /* data should be munmap()'ed */ + unsigned nongit_stdin : 1; /* data is from a nongit source */ unsigned dirty_submodule : 2; /* For submodules: its work tree is dirty */ #define DIRTY_SUBMODULE_UNTRACKED 1 #define DIRTY_SUBMODULE_MODIFIED 2 -- 1.7.11.1.184.g3ee8f69 ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 20:39 ` Junio C Hamano @ 2012-06-27 20:48 ` Junio C Hamano 2012-06-27 21:00 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 20:48 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Junio C Hamano <gitster@pobox.com> writes: > Junio C Hamano <gitster@pobox.com> writes: > >> Junio C Hamano <gitster@pobox.com> writes: >> >>> Junio C Hamano <gitster@pobox.com> writes: >>> >>>> Jeff King <peff@peff.net> writes: >>>> >>>>> but that code path should be fixed properly (with a use_stdin flag in >>>>> the filespec). >>>> >>>> Yes, just as I said; I am finding more and more issues with the >>>> no-index hack that I have been fixing a bit by bit since I send the >>>> message you responded to. >>> >>> It is not ready yet, but here are a few patches WIP. >> >> And this is the second clean-up > > And this is the third one. Some of the other breakages that comes from the "no-index" codepath that we may want to consider addressing I have found so far are: - We say on the "diff --git" header uglyness like "a/-", "b/-"; likewise in the metainfo; - We show on the "index" header "0*" value for these entries, even though we should be able to compute it (after all we do so for files on disk in a non-git directory); - We still apply attributes and textconv as if we are dealing with a regular file "-" at the root level. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 20:48 ` Junio C Hamano @ 2012-06-27 21:00 ` Jeff King 2012-06-27 22:17 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Jeff King @ 2012-06-27 21:00 UTC (permalink / raw) To: Junio C Hamano; +Cc: Daniel Lyubomirov, git On Wed, Jun 27, 2012 at 01:48:42PM -0700, Junio C Hamano wrote: > >>> It is not ready yet, but here are a few patches WIP. > >> > >> And this is the second clean-up > > > > And this is the third one. > > Some of the other breakages that comes from the "no-index" codepath > that we may want to consider addressing I have found so far are: >From a cursory look, these definitely go in the right direction. I like how the third one was able to rip populate_from_stdin entirely out of diff.c. I suspect we could get by without even the nongit_stdin flag you added; the only place which uses it is diff_fill_sha1_info, but theoretically we don't even need it there; we could just index_mem the file contents we get via diff_populate_filespec, and the stdin contents will already be there. Right now we call index_path for worktree files, but I don't really see much point; we have to read the whole data either way, and populate_filespec should be mmap-ing them for us. > - We say on the "diff --git" header uglyness like "a/-", "b/-"; > likewise in the metainfo; I'd consider changing the path to "/dev/stdin" for this case. It doesn't exist on some platforms, of course, but neither does /dev/null, which we use similarly. > - We show on the "index" header "0*" value for these entries, even > though we should be able to compute it (after all we do so for > files on disk in a non-git directory); The index_mem I mentioned above would fix that. > - We still apply attributes and textconv as if we are dealing with > a regular file "-" at the root level. I think that's bad. I wonder if it should have "*" attributes applied to it or not. While I can see it being convenient in some cases, I think it makes the rules confusingly complex. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 21:00 ` Jeff King @ 2012-06-27 22:17 ` Junio C Hamano 2012-06-27 22:41 ` Jeff King 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2012-06-27 22:17 UTC (permalink / raw) To: Jeff King; +Cc: Daniel Lyubomirov, git Jeff King <peff@peff.net> writes: > From a cursory look, these definitely go in the right direction. I like > how the third one was able to rip populate_from_stdin entirely out of > diff.c. > > I suspect we could get by without even the nongit_stdin flag you added; > the only place which uses it is diff_fill_sha1_info, but theoretically > we don't even need it there; we could just index_mem the file contents > we get via diff_populate_filespec, and the stdin contents will > already be there. Probably that is a sane thing to do. And as you hinted, we definitely need to give the "the copy in the filespec->data is the only one; do not discard until we are really done with it" semantics to the bit, or introduce a separate bit that is not necessarily related to the "this came from the standard input" bit. I offhand do not know what data source we would later add that needs such a treatment. "git diff http://example.com/{foo,bar}", perhaps? The "do not discard" bit at that point may need to learn to swap it out to a temporary file or something when it happens. > Right now we call index_path for worktree files, but I don't really see > much point; we have to read the whole data either way, and > populate_filespec should be mmap-ing them for us. > >> - We say on the "diff --git" header uglyness like "a/-", "b/-"; >> likewise in the metainfo; > > I'd consider changing the path to "/dev/stdin" for this case. It doesn't > exist on some platforms, of course, but neither does /dev/null, which we > use similarly. Sensible. >> - We show on the "index" header "0*" value for these entries, even >> though we should be able to compute it (after all we do so for >> files on disk in a non-git directory); > > The index_mem I mentioned above would fix that. Yes. >> - We still apply attributes and textconv as if we are dealing with >> a regular file "-" at the root level. > > I think that's bad. I wonder if it should have "*" attributes applied to > it or not. While I can see it being convenient in some cases, I think it > makes the rules confusingly complex. It is likely that the crlf conversion on Dos systems wants to be applied regardless. This is unrelated to the "standard input confusion" issue, but there are two more things coming either from the way the no-index code is done or from the way it is bolted onto git. With the current code, this: mkdir foo bar echo hello >foo/1 echo bye >bar/2 git diff --no-index foo bar shows differences between a/foo/1 and b/bar/1, as the no-index code records foo/1 and bar/1 as the paths in the filespec for them. But if you are comparing two directory hierarchies, it is a lot more likely that you would want to see corresponding files in these two directories. In other words, the patch is better shown as differences between a/1 and b/1 (the code makes the core compare foo/1 and bar/2 after all). This of course requires no-index to differentiate the logical pathname (i.e. "this is the path inside collection a/ (or b/)") and the physical location from which the contents are read. Such a differentiation would allow us to also do renames and rename classifications much more sanely. We had to add DIFF_PAIR_RENAME() and filepair->renamed_pair only to support this codepath because of this misdesign. We could have just run strcmp() between the logical pathname of one/two members of the filepair to see if the pair was renamed if it was done that way. And the way diff-no-index.c::queue_diff() walks two directories to grab paths from them in parallel and incrementally means that the filesystem walking code cannot be reused for something like this: git diff master:Documentation /var/tmp/docs to compare a hierarchy represented with a tree object with another hierarchy stored in the filesystem outside git's control. A natural way to do so would be to grab a set of paths from /var/tmp/docs and have that set compared against the other set that comes from the tree, and the "grab a set of paths from /var/tmp/docs" machinery can be used twice to implement the current git diff --no-index /var/tmp/foo /var/tmp/bar which would have been a lot cleaner implementation. Oh well. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Bug: problem with file named with dash character 2012-06-27 22:17 ` Junio C Hamano @ 2012-06-27 22:41 ` Jeff King 0 siblings, 0 replies; 13+ messages in thread From: Jeff King @ 2012-06-27 22:41 UTC (permalink / raw) To: Junio C Hamano; +Cc: Daniel Lyubomirov, git On Wed, Jun 27, 2012 at 03:17:54PM -0700, Junio C Hamano wrote: > > I think that's bad. I wonder if it should have "*" attributes applied to > > it or not. While I can see it being convenient in some cases, I think it > > makes the rules confusingly complex. > > It is likely that the crlf conversion on Dos systems wants to be > applied regardless. Yeah, that's specifically the case I was thinking of. I would say "well, we don't need to care about path at all, they can just use core.autocrlf", but I think autocrlf is discouraged these days in favor of using attributes. > This is unrelated to the "standard input confusion" issue, but there > are two more things coming either from the way the no-index code is > done or from the way it is bolted onto git. > > With the current code, this: > > mkdir foo bar > echo hello >foo/1 > echo bye >bar/2 > git diff --no-index foo bar > > shows differences between a/foo/1 and b/bar/1, as the no-index code > records foo/1 and bar/1 as the paths in the filespec for them. > > But if you are comparing two directory hierarchies, it is a lot more > likely that you would want to see corresponding files in these two > directories. In other words, the patch is better shown as > differences between a/1 and b/1 (the code makes the core compare > foo/1 and bar/2 after all). This of course requires no-index to > differentiate the logical pathname (i.e. "this is the path inside > collection a/ (or b/)") and the physical location from which the > contents are read. Such a differentiation would allow us to also do > renames and rename classifications much more sanely. We had to add > DIFF_PAIR_RENAME() and filepair->renamed_pair only to support this > codepath because of this misdesign. We could have just run strcmp() > between the logical pathname of one/two members of the filepair to > see if the pair was renamed if it was done that way. Yeah, that makes sense. Really you want to split the idea of diff_filespec into a logical unit of "the thing I am diffing" and a source struct of "here is where I get the data from". And the latter could be a union of blob information, filesystem path, and stdin flag, all contained inside the filespec. > And the way diff-no-index.c::queue_diff() walks two directories to > grab paths from them in parallel and incrementally means that the > filesystem walking code cannot be reused for something like this: > > git diff master:Documentation /var/tmp/docs > > to compare a hierarchy represented with a tree object with another > hierarchy stored in the filesystem outside git's control. A natural > way to do so would be to grab a set of paths from /var/tmp/docs and > have that set compared against the other set that comes from the tree, > and the "grab a set of paths from /var/tmp/docs" machinery can be > used twice to implement the current > > git diff --no-index /var/tmp/foo /var/tmp/bar > > which would have been a lot cleaner implementation. Agreed. I have occasionally wanted to do something like the tree comparison you mentioned above, and I think I resorted to actually making a git tree out of it. All of that is nice, and if you feel like working on it, great. But I admit I don't care too much about the --no-index code path. The key thing to me is fixing the "-" path in the regular code path without regressing the no-index stdin code-path too badly. And I think your patches already do that, so it might be a good stopping point. -Peff ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2012-06-27 22:41 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- [not found] <52ae7682-3e9a-4b52-bec1-08ba3aadffc0@office.digitalus.nl> 2012-06-27 7:32 ` Bug: problem with file named with dash character Daniel Lyubomirov -|- Digitalus Bulgaria 2012-06-27 9:57 ` faux 2012-06-27 18:28 ` Junio C Hamano 2012-06-27 19:52 ` Jeff King 2012-06-27 20:25 ` Jeff King 2012-06-27 20:27 ` Junio C Hamano 2012-06-27 20:33 ` Junio C Hamano 2012-06-27 20:35 ` Junio C Hamano 2012-06-27 20:39 ` Junio C Hamano 2012-06-27 20:48 ` Junio C Hamano 2012-06-27 21:00 ` Jeff King 2012-06-27 22:17 ` Junio C Hamano 2012-06-27 22:41 ` Jeff King
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.