* [PATCH v3 1/7] archive: optionally add "virtual" files
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
` (7 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-04 15:25 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
With the `--add-file-with-content=<path>:<content>` option, `git
archive` now supports use cases where relatively trivial files need to
be added that do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
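For illustration only (not part of the patch): a minimal stand-alone sketch of how a `<path>:<content>` argument can be split at the first colon, which is what the new branch of `add_file_cb()` does with `strchr()`. The helper name `split_path_content()` is hypothetical, and plain `malloc()` stands in for Git's `xstrndup()`/`xstrdup()`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of archive.c: split "<path>:<content>"
 * at the first colon. Returns 0 on success and -1 if there is no colon
 * or an allocation fails. On success, *path and *content are
 * heap-allocated copies that the caller must free().
 */
static int split_path_content(const char *arg, char **path, char **content)
{
	const char *colon;
	size_t path_len, content_len;

	assert(arg && path && content);

	colon = strchr(arg, ':');
	if (!colon)
		return -1; /* the patch calls die("missing colon") here */

	path_len = (size_t)(colon - arg);
	content_len = strlen(colon + 1);

	*path = malloc(path_len + 1);
	*content = malloc(content_len + 1);
	if (!*path || !*content) {
		free(*path);
		free(*content);
		return -1;
	}

	memcpy(*path, arg, path_len);
	(*path)[path_len] = '\0';
	/* copy the content including its terminating NUL */
	memcpy(*content, colon + 1, content_len + 1);
	return 0;
}
```

Because the split happens at the first colon, `a:b:c` yields the path `a` and the content `b:c`; a path that itself contains a colon cannot be expressed with this form.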
Documentation/git-archive.txt | 11 ++++++++
archive.c | 51 +++++++++++++++++++++++++++++------
t/t5003-archive-zip.sh | 12 +++++++++
3 files changed, 66 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a7834..a0edc9167b2 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -61,6 +61,17 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+--add-file-with-content=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value for `--prefix` (if any) and the
+ basename of <file>.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
+command-line limits. For non-trivial cases, write an untracked file
+and use `--add-file` instead.
+
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
as well (see <<ATTRIBUTES>>).
diff --git a/archive.c b/archive.c
index a3bbb091256..d798624cd5f 100644
--- a/archive.c
+++ b/archive.c
@@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
struct extra_file_info {
char *base;
struct stat stat;
+ void *content;
};
int write_archive_entries(struct archiver_args *args,
@@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
strbuf_addstr(&path_in_archive, basename(path));
strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
+ if (info->content)
+ err = write_entry(args, &fake_oid, path_in_archive.buf,
+ path_in_archive.len,
+ info->stat.st_mode,
+ info->content, info->stat.st_size);
+ else if (strbuf_read_file(&content, path,
+ info->stat.st_size) < 0)
err = error_errno(_("could not read '%s'"), path);
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
@@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
free(info->base);
+ free(info->content);
free(info);
}
@@ -514,14 +522,38 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!arg)
return -1;
- path = prefix_filename(args->prefix, arg);
- item = string_list_append_nodup(&args->extra_files, path);
- item->util = info = xmalloc(sizeof(*info));
+ info = xmalloc(sizeof(*info));
info->base = xstrdup_or_null(base);
- if (stat(path, &info->stat))
- die(_("File not found: %s"), path);
- if (!S_ISREG(info->stat.st_mode))
- die(_("Not a regular file: %s"), path);
+
+ if (!strcmp(opt->long_name, "add-file")) {
+ path = prefix_filename(args->prefix, arg);
+ if (stat(path, &info->stat))
+ die(_("File not found: %s"), path);
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
+ } else {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+
+ p = xstrndup(arg, colon - arg);
+ if (!args->prefix)
+ path = p;
+ else {
+ path = prefix_filename(args->prefix, p);
+ free(p);
+ }
+ memset(&info->stat, 0, sizeof(info->stat));
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
+
return 0;
}
@@ -554,6 +586,9 @@ static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
+ { OPTION_CALLBACK, 0, "add-file-with-content", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
N_("write the archive to this file")),
OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 1e6d18b140e..8ff1257f1a0 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
check_zip with_untracked
check_added with_untracked untracked untracked
+test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
+ git archive --format=zip >with_file_with_content.zip \
+ --add-file-with-content=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
+ "$GIT_UNZIP" ../with_file_with_content.zip &&
+ test_path_is_file hello &&
+ test world = $(cat hello)
+ )
+'
+
test_expect_success 'git archive --format=zip --add-file twice' '
echo untracked >untracked &&
git archive --format=zip --prefix=one/ --add-file=untracked \
--
gitgitgadget
* [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Johannes Schindelin via GitGitGadget
2022-05-07 2:06 ` Elijah Newren
2022-05-04 15:25 ` [PATCH v3 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
` (6 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-04 15:25 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
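For illustration only (not part of the patch): a stand-alone sketch of the quoted-path branch added to `add_file_cb()` below. It mirrors the loop in the diff, where a backslash makes the following character literal and a colon must follow the closing quote. The helper name `parse_quoted_path()` is hypothetical, and returning NULL stands in for the `die()` calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of archive.c. `arg` must start with a
 * double-quote. On success, stores the unescaped path in *path (caller
 * frees) and returns a pointer to the content that follows the colon.
 * Returns NULL for an unclosed quote, a trailing backslash, a missing
 * colon, or an allocation failure.
 */
static const char *parse_quoted_path(const char *arg, char **path)
{
	char *buf;
	size_t n = 0;

	assert(arg && path && *arg == '"');

	/* the unescaped path can never be longer than the input */
	buf = malloc(strlen(arg) + 1);
	if (!buf)
		return NULL;

	for (;;) {
		if (!*(++arg))
			goto fail; /* unclosed quote */
		if (*arg == '"')
			break;
		if (*arg == '\\' && *(++arg) == '\0')
			goto fail; /* trailing backslash */
		buf[n++] = *arg; /* after a backslash: the escaped char */
	}

	if (*(++arg) != ':')
		goto fail; /* missing colon after the closing quote */

	buf[n] = '\0';
	*path = buf;
	return arg + 1;

fail:
	free(buf);
	return NULL;
}
```

With this sketch, the argument `"quoted:colon":` (as used in the test below) yields the path `quoted:colon` and empty content.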
Documentation/git-archive.txt | 13 +++++++++----
archive.c | 34 +++++++++++++++++++++++++++++-----
t/t5003-archive-zip.sh | 8 ++++++++
3 files changed, 46 insertions(+), 9 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index a0edc9167b2..1789ce4c232 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -67,10 +67,15 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
+character. In this case, the backslash is interpreted as escape
+character. The path must be quoted if it contains a colon, to avoid
+the colon from being misinterpreted as the separator between the
+path and the contents.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
+cases, write an untracked file and use `--add-file` instead.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
diff --git a/archive.c b/archive.c
index d798624cd5f..3b751027143 100644
--- a/archive.c
+++ b/archive.c
@@ -533,13 +533,37 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else {
- const char *colon = strchr(arg, ':');
char *p;
- if (!colon)
- die(_("missing colon: '%s'"), arg);
+ if (*arg != '"') {
+ const char *colon = strchr(arg, ':');
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+ p = xstrndup(arg, colon - arg);
+ arg = colon + 1;
+ } else {
+ struct strbuf buf = STRBUF_INIT;
+ const char *orig = arg;
+
+ for (;;) {
+ if (!*(++arg))
+ die(_("unclosed quote: '%s'"), orig);
+ if (*arg == '"')
+ break;
+ if (*arg == '\\' && *(++arg) == '\0')
+ die(_("trailing backslash: '%s'"), orig);
+ else
+ strbuf_addch(&buf, *arg);
+ }
+
+ if (*(++arg) != ':')
+ die(_("missing colon: '%s'"), orig);
+
+ p = strbuf_detach(&buf, NULL);
+ arg++;
+ }
- p = xstrndup(arg, colon - arg);
if (!args->prefix)
path = p;
else {
@@ -548,7 +572,7 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(arg);
info->stat.st_size = strlen(info->content);
}
item = string_list_append_nodup(&args->extra_files, path);
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 8ff1257f1a0..5b8bbfc2692 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -207,13 +207,21 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
+ if test_have_prereq FUNNYNAMES
+ then
+ QUOTED=quoted:colon
+ else
+ QUOTED=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
+ --add-file-with-content=\"$QUOTED\": \
--add-file-with-content=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
+ test_path_is_file $QUOTED &&
test world = $(cat hello)
)
'
--
gitgitgadget
* Re: [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-04 15:25 ` [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-07 2:06 ` Elijah Newren
2022-05-09 21:04 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: Elijah Newren @ 2022-05-07 2:06 UTC
To: Johannes Schindelin via GitGitGadget
Cc: Git Mailing List, René Scharfe, Taylor Blau, Derrick Stolee,
Johannes Schindelin
On Wed, May 4, 2022 at 8:25 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> By allowing the path to be enclosed in double-quotes, we can avoid
> the limitation that paths cannot contain colons.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> Documentation/git-archive.txt | 13 +++++++++----
> archive.c | 34 +++++++++++++++++++++++++++++-----
> t/t5003-archive-zip.sh | 8 ++++++++
> 3 files changed, 46 insertions(+), 9 deletions(-)
>
> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> index a0edc9167b2..1789ce4c232 100644
> --- a/Documentation/git-archive.txt
> +++ b/Documentation/git-archive.txt
> @@ -67,10 +67,15 @@ OPTIONS
> by concatenating the value for `--prefix` (if any) and the
> basename of <file>.
> +
> -The `<path>` cannot contain any colon, the file mode is limited to
> -a regular file, and the option may be subject to platform-dependent
> -command-line limits. For non-trivial cases, write an untracked file
> -and use `--add-file` instead.
> +The `<path>` argument can start and end with a literal double-quote
> +character. In this case, the backslash is interpreted as escape
> +character. The path must be quoted if it contains a colon, to avoid
> +the colon from being misinterpreted as the separator between the
> +path and the contents.
The path must also be quoted if it begins or ends with a double-quote, right?
Also, would people want to be able to pass a pathname from the output
of e.g. `git ls-files -o`, which may quote additional characters?
> ++
> +The file mode is limited to a regular file, and the option may be
> +subject to platform-dependent command-line limits. For non-trivial
> +cases, write an untracked file and use `--add-file` instead.
>
> --worktree-attributes::
> Look for attributes in .gitattributes files in the working tree
> diff --git a/archive.c b/archive.c
> index d798624cd5f..3b751027143 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -533,13 +533,37 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> die(_("Not a regular file: %s"), path);
> info->content = NULL; /* read the file later */
> } else {
> - const char *colon = strchr(arg, ':');
> char *p;
>
> - if (!colon)
> - die(_("missing colon: '%s'"), arg);
> + if (*arg != '"') {
> + const char *colon = strchr(arg, ':');
> +
> + if (!colon)
> + die(_("missing colon: '%s'"), arg);
> + p = xstrndup(arg, colon - arg);
> + arg = colon + 1;
> + } else {
> + struct strbuf buf = STRBUF_INIT;
> + const char *orig = arg;
> +
> + for (;;) {
> + if (!*(++arg))
> + die(_("unclosed quote: '%s'"), orig);
> + if (*arg == '"')
> + break;
> + if (*arg == '\\' && *(++arg) == '\0')
> + die(_("trailing backslash: '%s'"), orig);
> + else
> + strbuf_addch(&buf, *arg);
> + }
> +
> + if (*(++arg) != ':')
> + die(_("missing colon: '%s'"), orig);
> +
> + p = strbuf_detach(&buf, NULL);
> + arg++;
> + }
Should we use unquote_c_style() here instead of rolling another parser
to do unquoting? That would have the added benefit of allowing people
to use filenames from the output of various git commands that do
special quoting -- such as octal sequences for non-ascii characters.
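(Illustration only, not part of the review: a minimal sketch of what decoding one such octal sequence involves. The helper name `decode_octal_escape()` is hypothetical, and it assumes exactly three octal digits after the backslash; Git's own `unquote_c_style()` handles more escapes and may differ in details.)

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: decode one "\NNN" escape, where NNN is exactly
 * three octal digits. Stores the byte in *out and returns the number of
 * input characters consumed (4), or -1 if `s` does not start with a
 * valid escape that fits in one byte.
 */
static int decode_octal_escape(const char *s, unsigned char *out)
{
	unsigned int value = 0;
	int i;

	assert(s && out);

	if (s[0] != '\\')
		return -1;
	for (i = 1; i <= 3; i++) {
		/* also stops safely at the terminating NUL */
		if (s[i] < '0' || s[i] > '7')
			return -1;
		value = value * 8 + (unsigned int)(s[i] - '0');
	}
	if (value > 255)
		return -1;

	*out = (unsigned char)value;
	return 4;
}
```

For example, the two escapes `\303\251` decode to the bytes 0xC3 0xA9, the UTF-8 encoding of "é".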
>
> - p = xstrndup(arg, colon - arg);
> if (!args->prefix)
> path = p;
> else {
> @@ -548,7 +572,7 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> }
> memset(&info->stat, 0, sizeof(info->stat));
> info->stat.st_mode = S_IFREG | 0644;
> - info->content = xstrdup(colon + 1);
> + info->content = xstrdup(arg);
> info->stat.st_size = strlen(info->content);
> }
> item = string_list_append_nodup(&args->extra_files, path);
> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> index 8ff1257f1a0..5b8bbfc2692 100755
> --- a/t/t5003-archive-zip.sh
> +++ b/t/t5003-archive-zip.sh
> @@ -207,13 +207,21 @@ check_zip with_untracked
> check_added with_untracked untracked untracked
>
> test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
> + if test_have_prereq FUNNYNAMES
> + then
> + QUOTED=quoted:colon
> + else
> + QUOTED=quoted
> + fi &&
> git archive --format=zip >with_file_with_content.zip \
> + --add-file-with-content=\"$QUOTED\": \
> --add-file-with-content=hello:world $EMPTY_TREE &&
> test_when_finished "rm -rf tmp-unpack" &&
> mkdir tmp-unpack && (
> cd tmp-unpack &&
> "$GIT_UNZIP" ../with_file_with_content.zip &&
> test_path_is_file hello &&
> + test_path_is_file $QUOTED &&
> test world = $(cat hello)
> )
> '
> --
> gitgitgadget
* Re: [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-07 2:06 ` Elijah Newren
@ 2022-05-09 21:04 ` Johannes Schindelin
0 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-09 21:04 UTC
To: Elijah Newren
Cc: Johannes Schindelin via GitGitGadget, Git Mailing List,
René Scharfe, Taylor Blau, Derrick Stolee
Hi Elijah,
On Fri, 6 May 2022, Elijah Newren wrote:
> On Wed, May 4, 2022 at 8:25 AM Johannes Schindelin via GitGitGadget
> <gitgitgadget@gmail.com> wrote:
> >
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > By allowing the path to be enclosed in double-quotes, we can avoid
> > the limitation that paths cannot contain colons.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> > ---
> > Documentation/git-archive.txt | 13 +++++++++----
> > archive.c | 34 +++++++++++++++++++++++++++++-----
> > t/t5003-archive-zip.sh | 8 ++++++++
> > 3 files changed, 46 insertions(+), 9 deletions(-)
> >
> > diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> > index a0edc9167b2..1789ce4c232 100644
> > --- a/Documentation/git-archive.txt
> > +++ b/Documentation/git-archive.txt
> > @@ -67,10 +67,15 @@ OPTIONS
> > by concatenating the value for `--prefix` (if any) and the
> > basename of <file>.
> > +
> > -The `<path>` cannot contain any colon, the file mode is limited to
> > -a regular file, and the option may be subject to platform-dependent
> > -command-line limits. For non-trivial cases, write an untracked file
> > -and use `--add-file` instead.
> > +The `<path>` argument can start and end with a literal double-quote
> > +character. In this case, the backslash is interpreted as escape
> > +character. The path must be quoted if it contains a colon, to avoid
> > +the colon from being misinterpreted as the separator between the
> > +path and the contents.
>
> The path must also be quoted if it begins or ends with a double-quote, right?
True.
> Also, would people want to be able to pass a pathname from the output
> of e.g. `git ls-files -o`, which may quote additional characters?
Also true.
> > ++
> > +The file mode is limited to a regular file, and the option may be
> > +subject to platform-dependent command-line limits. For non-trivial
> > +cases, write an untracked file and use `--add-file` instead.
> >
> > --worktree-attributes::
> > Look for attributes in .gitattributes files in the working tree
> > diff --git a/archive.c b/archive.c
> > index d798624cd5f..3b751027143 100644
> > --- a/archive.c
> > +++ b/archive.c
> > @@ -533,13 +533,37 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> > die(_("Not a regular file: %s"), path);
> > info->content = NULL; /* read the file later */
> > } else {
> > - const char *colon = strchr(arg, ':');
> > char *p;
> >
> > - if (!colon)
> > - die(_("missing colon: '%s'"), arg);
> > + if (*arg != '"') {
> > + const char *colon = strchr(arg, ':');
> > +
> > + if (!colon)
> > + die(_("missing colon: '%s'"), arg);
> > + p = xstrndup(arg, colon - arg);
> > + arg = colon + 1;
> > + } else {
> > + struct strbuf buf = STRBUF_INIT;
> > + const char *orig = arg;
> > +
> > + for (;;) {
> > + if (!*(++arg))
> > + die(_("unclosed quote: '%s'"), orig);
> > + if (*arg == '"')
> > + break;
> > + if (*arg == '\\' && *(++arg) == '\0')
> > + die(_("trailing backslash: '%s'"), orig);
> > + else
> > + strbuf_addch(&buf, *arg);
> > + }
> > +
> > + if (*(++arg) != ':')
> > + die(_("missing colon: '%s'"), orig);
> > +
> > + p = strbuf_detach(&buf, NULL);
> > + arg++;
> > + }
>
> Should we use unquote_c_style() here instead of rolling another parser
> to do unquoting? That would have the added benefit of allowing people
> to use filenames from the output of various git commands that do
> special quoting -- such as octal sequences for non-ascii characters.
Yep, let's do that. I somehow missed that function while glimpsing at
`quote.h`.
Thank you for your review!
Dscho
> >
> > - p = xstrndup(arg, colon - arg);
> > if (!args->prefix)
> > path = p;
> > else {
> > @@ -548,7 +572,7 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> > }
> > memset(&info->stat, 0, sizeof(info->stat));
> > info->stat.st_mode = S_IFREG | 0644;
> > - info->content = xstrdup(colon + 1);
> > + info->content = xstrdup(arg);
> > info->stat.st_size = strlen(info->content);
> > }
> > item = string_list_append_nodup(&args->extra_files, path);
> > diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> > index 8ff1257f1a0..5b8bbfc2692 100755
> > --- a/t/t5003-archive-zip.sh
> > +++ b/t/t5003-archive-zip.sh
> > @@ -207,13 +207,21 @@ check_zip with_untracked
> > check_added with_untracked untracked untracked
> >
> > test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
> > + if test_have_prereq FUNNYNAMES
> > + then
> > + QUOTED=quoted:colon
> > + else
> > + QUOTED=quoted
> > + fi &&
> > git archive --format=zip >with_file_with_content.zip \
> > + --add-file-with-content=\"$QUOTED\": \
> > --add-file-with-content=hello:world $EMPTY_TREE &&
> > test_when_finished "rm -rf tmp-unpack" &&
> > mkdir tmp-unpack && (
> > cd tmp-unpack &&
> > "$GIT_UNZIP" ../with_file_with_content.zip &&
> > test_path_is_file hello &&
> > + test_path_is_file $QUOTED &&
> > test world = $(cat hello)
> > )
> > '
> > --
> > gitgitgadget
>
>
* [PATCH v3 3/7] scalar: validate the optional enlistment argument
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
` (5 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-04 15:25 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
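For illustration only (not part of the patch): a minimal sketch of the kind of check that the new `is_directory()` call performs, written against plain POSIX `stat()`. The name `path_is_directory()` is a hypothetical stand-in; Git's real `is_directory()` lives elsewhere in the tree and is not reproduced here.

```c
#include <assert.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for Git's is_directory(): returns 1 if `path`
 * exists and is a directory, 0 otherwise (including when it does not
 * exist at all).
 */
static int path_is_directory(const char *path)
{
	struct stat st;

	assert(path);
	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

Both failure modes named in the commit message (the path does not exist, or it exists but is not a directory) make this return 0, which is why a single check in front of the parent-directory traversal suffices.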
contrib/scalar/scalar.c | 6 ++++--
contrib/scalar/t/t9099-scalar.sh | 5 +++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 1ce9c2b00e8..00dcd4b50ef 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
usage_with_options(usagestr, options);
/* find the worktree, determine its corresponding root */
- if (argc == 1)
+ if (argc == 1) {
strbuf_add_absolute_path(&path, argv[0]);
- else if (strbuf_getcwd(&path) < 0)
+ if (!is_directory(path.buf))
+ die(_("'%s' does not exist"), path.buf);
+ } else if (strbuf_getcwd(&path) < 0)
die(_("need a working directory"));
strbuf_trim_trailing_dir_sep(&path);
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 2e1502ad45e..9d83fdf25e8 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' '
test_path_is_missing cloned
'
+test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
+ ! scalar run config cloned 2>err &&
+ grep "cloned. does not exist" err
+'
+
test_done
--
gitgitgadget
* [PATCH v3 4/7] Implement `scalar diagnose`
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (2 preceding siblings ...)
2022-05-04 15:25 ` [PATCH v3 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
` (4 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-04 15:25 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple of statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty tree, augmented by a couple of files that are added via
the `--add-file*` options. We take care not to modify the
current repository in any way, lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
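For illustration only (not part of the patch): a stand-alone sketch of how the archive path is assembled in `cmd_diagnose()` below, using plain `strftime()` and `snprintf()` in place of Git's `strbuf_addftime()` and `strbuf_addstr()`. The helper name `format_zip_path()` and the fixed-size output buffer are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: write
 * "<root>/.scalarDiagnostics/scalar_<YYYYmmdd_HHMMSS>.zip" into `out`.
 * Trailing slashes on `root` are dropped first, similar in spirit to
 * strbuf_trim_trailing_dir_sep(). Returns 0 on success and -1 if the
 * result does not fit into `size` bytes.
 */
static int format_zip_path(char *out, size_t size, const char *root,
			   const struct tm *tm)
{
	char stamp[32];
	size_t len;
	int n;

	assert(out && root && tm);

	len = strlen(root);
	while (len > 1 && root[len - 1] == '/')
		len--;

	if (!strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", tm))
		return -1;

	n = snprintf(out, size, "%.*s/.scalarDiagnostics/scalar_%s.zip",
		     (int)len, root, stamp);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The timestamp in the file name keeps repeated `scalar diagnose` runs from overwriting each other, as long as they do not start within the same second.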
contrib/scalar/scalar.c | 141 +++++++++++++++++++++++++++++++
contrib/scalar/scalar.txt | 12 +++
contrib/scalar/t/t9099-scalar.sh | 14 +++
3 files changed, 167 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 00dcd4b50ef..a290e52e1d2 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -11,6 +11,7 @@
#include "dir.h"
#include "packfile.h"
#include "help.h"
+#include "archive.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -261,6 +262,44 @@ static int unregister_dir(void)
return res;
}
+static int add_directory_to_archiver(struct strvec *archiver_args,
+ const char *path, int recurse)
+{
+ int at_root = !*path;
+ DIR *dir = opendir(at_root ? "." : path);
+ struct dirent *e;
+ struct strbuf buf = STRBUF_INIT;
+ size_t len;
+ int res = 0;
+
+ if (!dir)
+ return error(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
+ len = buf.len;
+ strvec_pushf(archiver_args, "--prefix=%s", buf.buf);
+
+ while (!res && (e = readdir(dir))) {
+ if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name))
+ continue;
+
+ strbuf_setlen(&buf, len);
+ strbuf_addstr(&buf, e->d_name);
+
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
+ res = -1;
+ else if (recurse)
+ res = add_directory_to_archiver(archiver_args, buf.buf, recurse);
+ }
+
+ closedir(dir);
+ strbuf_release(&buf);
+ return res;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -501,6 +540,107 @@ cleanup:
return res;
}
+static int cmd_diagnose(int argc, const char **argv)
+{
+ struct option options[] = {
+ OPT_END(),
+ };
+ const char * const usage[] = {
+ N_("scalar diagnose [<enlistment>]"),
+ NULL
+ };
+ struct strbuf zip_path = STRBUF_INIT;
+ struct strvec archiver_args = STRVEC_INIT;
+ char **argv_copy = NULL;
+ int stdout_fd = -1, archiver_fd = -1;
+ time_t now = time(NULL);
+ struct tm tm;
+ struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
+ int res = 0;
+
+ argc = parse_options(argc, argv, NULL, options,
+ usage, 0);
+
+ setup_enlistment_directory(argc, argv, usage, options, &zip_path);
+
+ strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
+ strbuf_addftime(&zip_path,
+ "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
+ strbuf_addstr(&zip_path, ".zip");
+ switch (safe_create_leading_directories(zip_path.buf)) {
+ case SCLD_EXISTS:
+ case SCLD_OK:
+ break;
+ default:
+ res = error_errno(_("could not create directory for '%s'"),
+ zip_path.buf);
+ goto diagnose_cleanup;
+ }
+ stdout_fd = dup(1);
+ if (stdout_fd < 0) {
+ res = error_errno(_("could not duplicate stdout"));
+ goto diagnose_cleanup;
+ }
+
+ archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) {
+ res = error_errno(_("could not redirect output"));
+ goto diagnose_cleanup;
+ }
+
+ init_zip_archiver();
+ strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
+ get_version_info(&buf, 1);
+
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
+ "--add-file-with-content=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
+ goto diagnose_cleanup;
+
+ strvec_pushl(&archiver_args, "--prefix=",
+ oid_to_hex(the_hash_algo->empty_tree), "--", NULL);
+
+ /* `write_archive()` modifies the `argv` passed to it. Let it. */
+ argv_copy = xmemdupz(archiver_args.v,
+ sizeof(char *) * archiver_args.nr);
+ res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL,
+ the_repository, NULL, 0);
+ if (res) {
+ error(_("failed to write archive"));
+ goto diagnose_cleanup;
+ }
+
+ if (!res)
+ fprintf(stderr, "\n"
+ "Diagnostics complete.\n"
+ "All of the gathered info is captured in '%s'\n",
+ zip_path.buf);
+
+diagnose_cleanup:
+ if (archiver_fd >= 0) {
+ close(1);
+ dup2(stdout_fd, 1);
+ }
+ free(argv_copy);
+ strvec_clear(&archiver_args);
+ strbuf_release(&zip_path);
+ strbuf_release(&path);
+ strbuf_release(&buf);
+
+ return res;
+}
+
static int cmd_list(int argc, const char **argv)
{
if (argc != 1)
@@ -802,6 +942,7 @@ static struct {
{ "reconfigure", cmd_reconfigure },
{ "delete", cmd_delete },
{ "version", cmd_version },
+ { "diagnose", cmd_diagnose },
{ NULL, NULL},
};
diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt
index f416d637289..22583fe046e 100644
--- a/contrib/scalar/scalar.txt
+++ b/contrib/scalar/scalar.txt
@@ -14,6 +14,7 @@ scalar register [<enlistment>]
scalar unregister [<enlistment>]
scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [<enlistment>]
scalar reconfigure [ --all | <enlistment> ]
+scalar diagnose [<enlistment>]
scalar delete <enlistment>
DESCRIPTION
@@ -129,6 +130,17 @@ reconfigure the enlistment.
With the `--all` option, all enlistments currently registered with Scalar
will be reconfigured. Use this option after each Scalar upgrade.
+Diagnose
+~~~~~~~~
+
+diagnose [<enlistment>]::
+ When reporting issues with Scalar, it is often helpful to provide the
+ information gathered by this command, including logs and certain
+ statistics describing the data shape of the current enlistment.
++
+The output of this command is a `.zip` file that is written into
+a directory adjacent to the worktree in the `src` directory.
+
Delete
~~~~~~
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 9d83fdf25e8..bbd07a44426 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -90,4 +90,18 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
grep "cloned. does not exist" err
'
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
+ scalar diagnose cloned >out &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
+ folder=${zip_path%.zip} &&
+ test_path_is_missing "$folder" &&
+ unzip -p "$zip_path" diagnostics.log >out &&
+ test_file_not_empty out
+'
+
test_done
--
gitgitgadget
* [PATCH v3 5/7] scalar diagnose: include disk space information
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (3 preceding siblings ...)
2022-05-04 15:25 ` [PATCH v3 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Johannes Schindelin via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
` (3 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-04 15:25 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
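
As a minimal sketch of the POSIX side of this idea (not the patch's code), the
hypothetical helper `available_bytes()` below asks `statvfs()` how much space
is available to unprivileged callers. It multiplies by `f_frsize`, the unit
POSIX specifies for `f_bavail`; the patch multiplies by `f_bsize`, which is
often the same value. Unlike Git's `st_mult()`, this sketch does no overflow
checking.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper, not part of Git or Scalar: store the number of
 * bytes available to unprivileged callers on the file system that
 * contains `path`. Returns 0 on success, -1 if statvfs() fails.
 */
static int available_bytes(const char *path, uintmax_t *out)
{
	struct statvfs st;

	assert(path && out);

	if (statvfs(path, &st) < 0)
		return -1;

	/* f_bavail is a block count expressed in units of f_frsize */
	*out = (uintmax_t)st.f_frsize * (uintmax_t)st.f_bavail;
	return 0;
}
```

A caller would pass the enlistment's worktree path and format the result in a
human-readable way, much like `strbuf_humanise_bytes()` does in the patch.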
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 53 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 1 +
2 files changed, 54 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index a290e52e1d2..df44902c909 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -300,6 +300,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
return res;
}
+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+ struct strbuf buf = STRBUF_INIT;
+ char volume_name[MAX_PATH], fs_name[MAX_PATH];
+ DWORD serial_number, component_length, flags;
+ ULARGE_INTEGER avail2caller, total, avail;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+ error(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_setlen(&buf, offset_1st_component(buf.buf));
+ if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+ &serial_number, &component_length, &flags,
+ fs_name, sizeof(fs_name))) {
+ error(_("could not get info for '%s'"), buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, avail2caller.QuadPart);
+ strbuf_addch(out, '\n');
+ strbuf_release(&buf);
+#else
+ struct strbuf buf = STRBUF_INIT;
+ struct statvfs stat;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (statvfs(buf.buf, &stat) < 0) {
+ error_errno(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+ strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+ strbuf_release(&buf);
+#endif
+ return 0;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -596,6 +648,7 @@ static int cmd_diagnose(int argc, const char **argv)
get_version_info(&buf, 1);
strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
"--add-file-with-content=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index bbd07a44426..f3d037823c8 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -94,6 +94,7 @@ SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
scalar diagnose cloned >out &&
+ grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
--
gitgitgadget
* [PATCH v3 6/7] scalar: teach `diagnose` to gather packfile info
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (4 preceding siblings ...)
2022-05-04 15:25 ` [PATCH v3 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
@ 2022-05-04 15:25 ` Matthew John Cheetham via GitGitGadget
2022-05-04 15:25 ` [PATCH v3 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
` (2 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-04 15:25 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
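
To illustrate the shape of one line of the resulting listing, here is a small
sketch that uses the same `printf` format as the patch: the file name
left-justified in 70 columns, followed by the size right-justified in 16
columns. The helper name `format_file_stat()` is an assumption made for this
example only; the patch appends to a `strbuf` instead.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: write one "<name> <size>" line into `out`,
 * using the column widths from the patch. Returns what snprintf()
 * returns, i.e. the length the full line has (or would have had).
 */
static int format_file_stat(char *out, size_t out_size,
			    const char *name, uintmax_t size)
{
	assert(out && name);

	return snprintf(out, out_size, "%-70s %16" PRIuMAX "\n", name, size);
}
```

For names of up to 70 characters and sizes of up to 16 digits, every line is
exactly 88 bytes long including the newline, which keeps the listing aligned.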
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 30 ++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 6 +++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index df44902c909..9adde8cf4b9 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "help.h"
#include "archive.h"
+#include "object-store.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -592,6 +593,29 @@ cleanup:
return res;
}
+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+ const char *file_name, void *data)
+{
+ struct strbuf *buf = data;
+ struct stat st;
+
+ if (!stat(full_path, &st))
+ strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+ (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+ struct strbuf *buf = data;
+
+ strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+ for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+ data);
+
+ return 0;
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -654,6 +678,12 @@ static int cmd_diagnose(int argc, const char **argv)
"--add-file-with-content=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-file-with-content=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index f3d037823c8..e049221609d 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,6 +93,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
scalar diagnose cloned >out &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
@@ -102,7 +104,9 @@ test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
unzip -p "$zip_path" diagnostics.log >out &&
- test_file_not_empty out
+ test_file_not_empty out &&
+ unzip -p "$zip_path" packs-local.txt >out &&
+ grep "$(pwd)/.git/objects" out
'
test_done
--
gitgitgadget
* [PATCH v3 7/7] scalar: teach `diagnose` to gather loose objects information
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (5 preceding siblings ...)
2022-05-04 15:25 ` [PATCH v3 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
@ 2022-05-04 15:25 ` Matthew John Cheetham via GitGitGadget
2022-05-07 2:23 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Elijah Newren
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-04 15:25 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
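
Loose objects live in fan-out directories named after the first two hex digits
of the object ID, so the statistics only need to look at directories whose
names match that pattern. A minimal sketch of such a check follows; the name
`is_fanout_dir_name()` is hypothetical, and the patch uses Git's
`hex_to_bytes()` rather than `isxdigit()`.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if `name` looks like a loose-object
 * fan-out directory ("00" through "ff"), 0 otherwise. This rejects
 * other entries of the object directory such as "pack" or "info".
 */
static int is_fanout_dir_name(const char *name)
{
	assert(name);

	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}
```

Counting the regular files in each matching directory and summing the counts
then gives the total that the patch reports.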
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 59 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 5 ++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 9adde8cf4b9..f2fe3858eca 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -616,6 +616,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
}
+static int count_files(char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count = 0;
+
+ if (!dir)
+ return 0;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+ count++;
+
+ closedir(dir);
+ return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count;
+ int total = 0;
+ unsigned char c;
+ struct strbuf count_path = STRBUF_INIT;
+ size_t base_path_len;
+
+ if (!dir)
+ return;
+
+ strbuf_addstr(buf, "Object directory stats for ");
+ strbuf_add_absolute_path(buf, path);
+ strbuf_addstr(buf, ":\n");
+
+ strbuf_add_absolute_path(&count_path, path);
+ strbuf_addch(&count_path, '/');
+ base_path_len = count_path.len;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) &&
+ e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+ !hex_to_bytes(&c, e->d_name, 1)) {
+ strbuf_setlen(&count_path, base_path_len);
+ strbuf_addstr(&count_path, e->d_name);
+ total += (count = count_files(count_path.buf));
+ strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+ }
+
+ strbuf_addf(buf, "Total: %d loose objects", total);
+
+ strbuf_release(&count_path);
+ closedir(dir);
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -684,6 +738,11 @@ static int cmd_diagnose(int argc, const char **argv)
foreach_alt_odb(dir_file_stats, &buf);
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-file-with-content=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index e049221609d..9b4eedbb0aa 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -95,6 +95,7 @@ test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
scalar diagnose cloned >out &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
@@ -106,7 +107,9 @@ test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
unzip -p "$zip_path" packs-local.txt >out &&
- grep "$(pwd)/.git/objects" out
+ grep "$(pwd)/.git/objects" out &&
+ unzip -p "$zip_path" objects-local.txt >out &&
+ grep "^Total: [1-9]" out
'
test_done
--
gitgitgadget
* Re: [PATCH v3 0/7] scalar: implement the subcommand "diagnose"
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (6 preceding siblings ...)
2022-05-04 15:25 ` [PATCH v3 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
@ 2022-05-07 2:23 ` Elijah Newren
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
8 siblings, 0 replies; 140+ messages in thread
From: Elijah Newren @ 2022-05-07 2:23 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: Git Mailing List, René Scharfe, Taylor Blau, Derrick Stolee,
Johannes Schindelin
On Wed, May 4, 2022 at 8:25 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> Over the course of the years, we developed a sub-command that gathers
> diagnostic data into a .zip file that can then be attached to bug reports.
> This sub-command turned out to be very useful in helping Scalar developers
> identify and fix issues.
>
> Changes since v2:
>
> * Clarified in the commit message what the biggest benefit of
> --add-file-with-content is.
> * The <path> part of the --add-file-with-content argument can now contain
> colons. To do this, the path needs to start and end in double-quote
> characters (which are stripped), and the backslash serves as escape
> character in that case (to allow the path to contain both colons and
> double-quotes).
You addressed all my previous feedback from an earlier round. The
only thing I noticed in this round is I wonder if we should use
unquote_c_style() for this, as commented on the patch in question.
> * Fixed incorrect grammar.
> * Instead of strcmp(<what-we-don't-want>), we now say
> !strcmp(<what-we-want>).
> * The help text for --add-file-with-content was improved a tiny bit.
> * Adjusted the commit message that still talked about spawning plenty of
> processes and about a throw-away repository for the sake of generating a
> .zip file.
> * Simplified the code that shows the diagnostics and adds them to the .zip
> file.
> * The final message that reports that the archive is complete is now
> printed to stderr instead of stdout.
>
> Changes since v1:
>
> * Instead of creating a throw-away repository, staging the contents of the
> .zip file and then using git write-tree and git archive to write the .zip
> file, the patch series now introduces a new option to git archive and
> uses write_archive() directly (avoiding any separate process).
> * Since the command avoids separate processes, it is now blazing fast on
> Windows, and I dropped the spinner() function because it's no longer
> needed.
> * While reworking the test case, I noticed that scalar [...] <enlistment>
> failed to verify that the specified directory exists, and would happily
> "traverse to its parent directory" on its quest to find a Scalar
> enlistment. That is of course incorrect, and has been fixed as a "while
> at it" sort of preparatory commit.
> * I had forgotten to sign off on all the commits, which has been fixed.
> * Instead of some "home-grown" readdir()-based function, the code now uses
> for_each_file_in_pack_dir() to look through the pack directories.
> * If any alternates are configured, their pack directories are now included
> in the output.
> * The commit message that might be interpreted to promise information about
> large loose files has been corrected to no longer promise that.
> * The test cases have been adjusted to test a little bit more (e.g.
> verifying that specific paths are mentioned in the output, instead of
> merely verifying that the output is non-empty).
>
> Johannes Schindelin (5):
> archive: optionally add "virtual" files
> archive --add-file-with-contents: allow paths containing colons
> scalar: validate the optional enlistment argument
> Implement `scalar diagnose`
> scalar diagnose: include disk space information
>
> Matthew John Cheetham (2):
> scalar: teach `diagnose` to gather packfile info
> scalar: teach `diagnose` to gather loose objects information
>
> Documentation/git-archive.txt | 16 ++
> archive.c | 75 +++++++-
> contrib/scalar/scalar.c | 289 ++++++++++++++++++++++++++++++-
> contrib/scalar/scalar.txt | 12 ++
> contrib/scalar/t/t9099-scalar.sh | 27 +++
> t/t5003-archive-zip.sh | 20 +++
> 6 files changed, 429 insertions(+), 10 deletions(-)
>
>
> base-commit: ddc35d833dd6f9e8946b09cecd3311b8aa18d295
> Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1128%2Fdscho%2Fscalar-diagnose-v3
> Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1128/dscho/scalar-diagnose-v3
> Pull-Request: https://github.com/gitgitgadget/git/pull/1128
>
> Range-diff vs v2:
>
> 1: 49ff3c1f2b3 ! 1: 45662cf582a archive: optionally add "virtual" files
> @@ Commit message
> archive` now supports use cases where relatively trivial files need to
> be added that do not exist on disk.
>
> + This will allow us to generate `.zip` files with generated content,
> + without having to add said content to the object database and without
> + having to write it out to disk.
> +
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> ## Documentation/git-archive.txt ##
> @@ Documentation/git-archive.txt: OPTIONS
> + basename of <file>.
> ++
> +The `<path>` cannot contain any colon, the file mode is limited to
> -+a regular file, and the option may be subject platform-dependent
> ++a regular file, and the option may be subject to platform-dependent
> +command-line limits. For non-trivial cases, write an untracked file
> +and use `--add-file` instead.
> +
> @@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int
> - if (!S_ISREG(info->stat.st_mode))
> - die(_("Not a regular file: %s"), path);
> +
> -+ if (strcmp(opt->long_name, "add-file-with-content")) {
> ++ if (!strcmp(opt->long_name, "add-file")) {
> + path = prefix_filename(args->prefix, arg);
> + if (stat(path, &info->stat))
> + die(_("File not found: %s"), path);
> @@ archive.c: static int parse_archive_args(int argc, const char **argv,
> N_("add untracked file to archive"), 0, add_file_cb,
> (intptr_t)&base },
> + { OPTION_CALLBACK, 0, "add-file-with-content", args,
> -+ N_("file"), N_("add untracked file to archive"), 0,
> ++ N_("path:content"), N_("add untracked file to archive"), 0,
> + add_file_cb, (intptr_t)&base },
> OPT_STRING('o', "output", &output, N_("file"),
> N_("write the archive to this file")),
> -: ----------- > 2: ce4b1b680c9 archive --add-file-with-contents: allow paths containing colons
> 2: 600da8d465e = 3: 5a3eeb55409 scalar: validate the optional enlistment argument
> 3: 0d570137bb6 ! 4: dfe821d10fe Implement `scalar diagnose`
> @@ Commit message
> we had the luxury of a comprehensive standard library that includes
> basic functionality such as writing a `.zip` file. In the C version, we
> lack such a commodity. Rather than introducing a dependency on, say,
> - libzip, we slightly abuse Git's `archive` command: Instead of writing
> - the `.zip` file directly, we stage the file contents in a Git index of a
> - temporary, bare repository, only to let `git archive` have at it, and
> - finally removing the temporary repository.
> -
> - Also note: Due to the frequently-spawned `git hash-object` processes,
> - this command is quite a bit slow on Windows. Should it turn out to be a
> - big problem, the lack of a batch mode of the `hash-object` command could
> - potentially be worked around via using `git fast-import` with a crafted
> - `stdin`.
> + libzip, we slightly abuse Git's `archive` machinery: we write out a
> + `.zip` of the empty tree, augmented by a couple files that are added via
> + the `--add-file*` options. We are careful trying not to modify the
> + current repository in any way lest the very circumstances that required
> + `scalar diagnose` to be run are changed by the `diagnose` run itself.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> @@ contrib/scalar/scalar.c: cleanup:
> + time_t now = time(NULL);
> + struct tm tm;
> + struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
> -+ size_t off;
> + int res = 0;
> +
> + argc = parse_options(argc, argv, NULL, options,
> @@ contrib/scalar/scalar.c: cleanup:
> + strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
> +
> + strbuf_reset(&buf);
> -+ strbuf_addstr(&buf,
> -+ "--add-file-with-content=diagnostics.log:"
> -+ "Collecting diagnostic info\n\n");
> ++ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
> + get_version_info(&buf, 1);
> +
> + strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
> -+ off = strchr(buf.buf, ':') + 1 - buf.buf;
> -+ write_or_die(stdout_fd, buf.buf + off, buf.len - off);
> -+ strvec_push(&archiver_args, buf.buf);
> ++ write_or_die(stdout_fd, buf.buf, buf.len);
> ++ strvec_pushf(&archiver_args,
> ++ "--add-file-with-content=diagnostics.log:%.*s",
> ++ (int)buf.len, buf.buf);
> +
> + if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
> + (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
> @@ contrib/scalar/scalar.c: cleanup:
> + }
> +
> + if (!res)
> -+ printf("\n"
> ++ fprintf(stderr, "\n"
> + "Diagnostics complete.\n"
> + "All of the gathered info is captured in '%s'\n",
> + zip_path.buf);
> 4: 938e38b5a09 ! 5: bb162abd383 scalar diagnose: include disk space information
> @@ contrib/scalar/scalar.c: static int cmd_diagnose(int argc, const char **argv)
>
> strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
> + get_disk_info(&buf);
> - off = strchr(buf.buf, ':') + 1 - buf.buf;
> - write_or_die(stdout_fd, buf.buf + off, buf.len - off);
> - strvec_push(&archiver_args, buf.buf);
> + write_or_die(stdout_fd, buf.buf, buf.len);
> + strvec_pushf(&archiver_args,
> + "--add-file-with-content=diagnostics.log:%.*s",
>
> ## contrib/scalar/t/t9099-scalar.sh ##
> @@ contrib/scalar/t/t9099-scalar.sh: SQ="'"
> 5: bd9428919fa ! 6: 32aaad7cce1 scalar: teach `diagnose` to gather packfile info
> @@ contrib/scalar/scalar.c: cleanup:
> {
> struct option options[] = {
> @@ contrib/scalar/scalar.c: static int cmd_diagnose(int argc, const char **argv)
> - write_or_die(stdout_fd, buf.buf + off, buf.len - off);
> - strvec_push(&archiver_args, buf.buf);
> + "--add-file-with-content=diagnostics.log:%.*s",
> + (int)buf.len, buf.buf);
>
> + strbuf_reset(&buf);
> + strbuf_addstr(&buf, "--add-file-with-content=packs-local.txt:");
> 6: 7a8875be425 = 7: 322932f0bb8 scalar: teach `diagnose` to gather loose objects information
>
> --
> gitgitgadget
* [PATCH v4 0/7] scalar: implement the subcommand "diagnose"
2022-05-04 15:25 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Johannes Schindelin via GitGitGadget
` (7 preceding siblings ...)
2022-05-07 2:23 ` [PATCH v3 0/7] scalar: implement the subcommand "diagnose" Elijah Newren
@ 2022-05-10 19:26 ` Johannes Schindelin via GitGitGadget
2022-05-10 19:26 ` [PATCH v4 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
` (8 more replies)
8 siblings, 9 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:26 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin
Over the course of the years, we developed a sub-command that gathers
diagnostic data into a .zip file that can then be attached to bug reports.
This sub-command turned out to be very useful in helping Scalar developers
identify and fix issues.
Changes since v3:
* We're now using unquote_c_style() instead of rolling our own unquoter.
* Fixed the added regression test.
* As pointed out by Scalar's Functional Tests, the
add_directory_to_archiver() function should not fail when scalar diagnose
encounters FSMonitor's Unix socket, but only warn instead.
* Related: add_directory_to_archiver() needs to propagate errors from
processing subdirectories so that the top-level call returns an error,
too.
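
The last two points can be sketched as follows. This is not
add_directory_to_archiver() itself but a simplified, hypothetical walk_dir()
that only counts regular files: entries that are neither files nor
directories (a Unix socket, for example) produce a warning without failing,
while an error in a subdirectory is propagated to the top-level caller. It
uses lstat() instead of `d_type`, which is an assumption made for portability
of the example.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: count the regular files below `path`.
 * Returns 0 on success, -1 if this directory or any subdirectory
 * could not be processed completely.
 */
static int walk_dir(const char *path, int recurse, int *nr_files)
{
	DIR *dir;
	struct dirent *e;
	int res = 0;

	assert(path && nr_files);

	dir = opendir(path);
	if (!dir) {
		fprintf(stderr, "error: could not open directory '%s'\n", path);
		return -1;
	}

	while ((e = readdir(dir)) != NULL) {
		char child[4096];
		struct stat st;
		int n;

		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;

		n = snprintf(child, sizeof(child), "%s/%s", path, e->d_name);
		if (n < 0 || (size_t)n >= sizeof(child) ||
		    lstat(child, &st) < 0) {
			res = -1; /* remember the failure, keep going */
			continue;
		}

		if (S_ISREG(st.st_mode))
			(*nr_files)++;
		else if (!S_ISDIR(st.st_mode))
			/* e.g. a socket: warn, but do not fail */
			fprintf(stderr, "warning: skipping '%s', which is "
				"neither file nor directory\n", child);
		else if (recurse && walk_dir(child, recurse, nr_files) < 0)
			res = -1; /* propagate errors from subdirectories */
	}

	closedir(dir);
	return res;
}
```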
Changes since v2:
* Clarified in the commit message what the biggest benefit of
--add-file-with-content is.
* The <path> part of the --add-file-with-content argument can now contain
colons. To do this, the path needs to start and end in double-quote
characters (which are stripped), and the backslash serves as escape
character in that case (to allow the path to contain both colons and
double-quotes).
* Fixed incorrect grammar.
* Instead of strcmp(<what-we-don't-want>), we now say
!strcmp(<what-we-want>).
* The help text for --add-file-with-content was improved a tiny bit.
* Adjusted the commit message that still talked about spawning plenty of
processes and about a throw-away repository for the sake of generating a
.zip file.
* Simplified the code that shows the diagnostics and adds them to the .zip
file.
* The final message that reports that the archive is complete is now
printed to stderr instead of stdout.
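
The quoting rule for `<path>` described in the list above can be sketched like
this. It illustrates the simple rule (surrounding double-quotes are stripped,
a backslash makes the next character literal), not the unquote_c_style()
based parsing, and the helper name split_path_content() and its fixed-size
output buffer are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split "<path>:<content>" into its two parts.
 * The unquoted path is copied into `path`; `*content` points into
 * `arg`. Returns 0 on success, -1 on malformed input, an empty path,
 * or a path that does not fit into `path_size` bytes.
 */
static int split_path_content(const char *arg, char *path, size_t path_size,
			      const char **content)
{
	size_t len = 0;

	assert(arg && path && content && path_size > 0);

	if (*arg != '"') {
		/* unquoted: the first colon separates path and content */
		const char *colon = strchr(arg, ':');

		if (!colon || colon == arg)
			return -1;
		len = (size_t)(colon - arg);
		if (len >= path_size)
			return -1;
		memcpy(path, arg, len);
		path[len] = '\0';
		*content = colon + 1;
		return 0;
	}

	for (arg++; *arg != '"'; arg++) {
		if (*arg == '\\')
			arg++; /* take the next character literally */
		if (!*arg)
			return -1; /* unclosed quote or trailing backslash */
		if (len + 1 >= path_size)
			return -1;
		path[len++] = *arg;
	}
	path[len] = '\0';

	if (!len || arg[1] != ':')
		return -1; /* empty path or missing colon */
	*content = arg + 2;
	return 0;
}
```

With this rule, the argument `"a:b\"c":x:y` yields the path `a:b"c` and the
content `x:y`, whereas the unquoted `a.txt:hello` is split at its first colon.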
Changes since v1:
* Instead of creating a throw-away repository, staging the contents of the
.zip file and then using git write-tree and git archive to write the .zip
file, the patch series now introduces a new option to git archive and
uses write_archive() directly (avoiding any separate process).
* Since the command avoids separate processes, it is now blazing fast on
Windows, and I dropped the spinner() function because it's no longer
needed.
* While reworking the test case, I noticed that scalar [...] <enlistment>
failed to verify that the specified directory exists, and would happily
"traverse to its parent directory" on its quest to find a Scalar
enlistment. That is of course incorrect, and has been fixed as a "while
at it" sort of preparatory commit.
* I had forgotten to sign off on all the commits, which has been fixed.
* Instead of some "home-grown" readdir()-based function, the code now uses
for_each_file_in_pack_dir() to look through the pack directories.
* If any alternates are configured, their pack directories are now included
in the output.
* The commit message that might be interpreted to promise information about
large loose files has been corrected to no longer promise that.
* The test cases have been adjusted to test a little bit more (e.g.
verifying that specific paths are mentioned in the output, instead of
merely verifying that the output is non-empty).
Johannes Schindelin (5):
archive: optionally add "virtual" files
archive --add-file-with-contents: allow paths containing colons
scalar: validate the optional enlistment argument
Implement `scalar diagnose`
scalar diagnose: include disk space information
Matthew John Cheetham (2):
scalar: teach `diagnose` to gather packfile info
scalar: teach `diagnose` to gather loose objects information
Documentation/git-archive.txt | 17 ++
archive.c | 61 ++++++-
contrib/scalar/scalar.c | 292 ++++++++++++++++++++++++++++++-
contrib/scalar/scalar.txt | 12 ++
contrib/scalar/t/t9099-scalar.sh | 27 +++
t/t5003-archive-zip.sh | 20 +++
6 files changed, 419 insertions(+), 10 deletions(-)
base-commit: ddc35d833dd6f9e8946b09cecd3311b8aa18d295
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1128%2Fdscho%2Fscalar-diagnose-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1128/dscho/scalar-diagnose-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1128
Range-diff vs v3:
1: 45662cf582a = 1: 45662cf582a archive: optionally add "virtual" files
2: ce4b1b680c9 ! 2: fdba4ed6f4d archive --add-file-with-contents: allow paths containing colons
@@ Documentation/git-archive.txt: OPTIONS
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
-+character. In this case, the backslash is interpreted as escape
-+character. The path must be quoted if it contains a colon, to avoid
-+the colon from being misinterpreted as the separator between the
-+path and the contents.
++character; the contained file name is interpreted as a C-style string,
++i.e. the backslash is interpreted as escape character. The path must
++be quoted if it contains a colon, to avoid the colon from being
++misinterpreted as the separator between the path and the contents, or
++if the path begins or ends with a double-quote character.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
@@ Documentation/git-archive.txt: OPTIONS
Look for attributes in .gitattributes files in the working tree
## archive.c ##
+@@
+ #include "parse-options.h"
+ #include "unpack-trees.h"
+ #include "dir.h"
++#include "quote.h"
+
+ static char const * const archive_usage[] = {
+ N_("git archive [<options>] <tree-ish> [<path>...]"),
@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else {
- const char *colon = strchr(arg, ':');
- char *p;
+- char *p;
++ struct strbuf buf = STRBUF_INIT;
++ const char *p = arg;
++
++ if (*p != '"')
++ p = strchr(p, ':');
++ else if (unquote_c_style(&buf, p, &p) < 0)
++ die(_("unclosed quote: '%s'"), arg);
- if (!colon)
-- die(_("missing colon: '%s'"), arg);
-+ if (*arg != '"') {
-+ const char *colon = strchr(arg, ':');
-+
-+ if (!colon)
-+ die(_("missing colon: '%s'"), arg);
-+ p = xstrndup(arg, colon - arg);
-+ arg = colon + 1;
-+ } else {
-+ struct strbuf buf = STRBUF_INIT;
-+ const char *orig = arg;
-+
-+ for (;;) {
-+ if (!*(++arg))
-+ die(_("unclosed quote: '%s'"), orig);
-+ if (*arg == '"')
-+ break;
-+ if (*arg == '\\' && *(++arg) == '\0')
-+ die(_("trailing backslash: '%s"), orig);
-+ else
-+ strbuf_addch(&buf, *arg);
-+ }
-+
-+ if (*(++arg) != ':')
-+ die(_("missing colon: '%s'"), orig);
-+
-+ p = strbuf_detach(&buf, NULL);
-+ arg++;
-+ }
++ if (!p || *p != ':')
+ die(_("missing colon: '%s'"), arg);
- p = xstrndup(arg, colon - arg);
- if (!args->prefix)
- path = p;
- else {
-@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int unset)
+- if (!args->prefix)
+- path = p;
+- else {
+- path = prefix_filename(args->prefix, p);
+- free(p);
++ if (p == arg)
++ die(_("empty file name: '%s'"), arg);
++
++ path = buf.len ?
++ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg);
++
++ if (args->prefix) {
++ char *save = path;
++ path = prefix_filename(args->prefix, path);
++ free(save);
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
-+ info->content = xstrdup(arg);
++ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
}
item = string_list_append_nodup(&args->extra_files, path);
3: 5a3eeb55409 = 3: da9f52a8240 scalar: validate the optional enlistment argument
4: dfe821d10fe ! 4: 87bdc22322b Implement `scalar diagnose`
@@ contrib/scalar/scalar.c: static int unregister_dir(void)
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
++ warning(_("skipping '%s', which is neither file nor "
++ "directory"), buf.buf);
++ else if (recurse &&
++ add_directory_to_archiver(archiver_args,
++ buf.buf, recurse) < 0)
+ res = -1;
-+ else if (recurse)
-+ add_directory_to_archiver(archiver_args, buf.buf, recurse);
+ }
+
+ closedir(dir);
@@ contrib/scalar/t/t9099-scalar.sh: test_expect_success '`scalar [...] <dir>` erro
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
-+ scalar diagnose cloned >out &&
-+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
++ scalar diagnose cloned >out 2>err &&
++ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
5: bb162abd383 ! 5: 3f63b197d42 scalar diagnose: include disk space information
@@ contrib/scalar/t/t9099-scalar.sh
@@ contrib/scalar/t/t9099-scalar.sh: SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
- scalar diagnose cloned >out &&
+ scalar diagnose cloned >out 2>err &&
+ grep "Available space" out &&
- sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
6: 32aaad7cce1 ! 6: fc1319338fc scalar: teach `diagnose` to gather packfile info
@@ contrib/scalar/t/t9099-scalar.sh: test_expect_success '`scalar [...] <dir>` erro
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
- scalar diagnose cloned >out &&
+ scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
- sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ contrib/scalar/t/t9099-scalar.sh: test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
7: 322932f0bb8 ! 7: e8f5b42f7b7 scalar: teach `diagnose` to gather loose objects information
@@ contrib/scalar/t/t9099-scalar.sh: test_expect_success UNZIP 'scalar diagnose' '
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
- scalar diagnose cloned >out &&
+ scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
- sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <out >zip_path &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ contrib/scalar/t/t9099-scalar.sh: test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
--
gitgitgadget
* [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:26 ` Johannes Schindelin via GitGitGadget
2022-05-10 21:48 ` Junio C Hamano
2022-05-10 19:26 ` [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
` (7 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:26 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
With the `--add-file-with-content=<path>:<content>` option, `git
archive` now supports use cases where relatively trivial files need to
be added that do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 11 ++++++++
archive.c | 51 +++++++++++++++++++++++++++++------
t/t5003-archive-zip.sh | 12 +++++++++
3 files changed, 66 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a7834..a0edc9167b2 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -61,6 +61,17 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+--add-file-with-content=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value for `--prefix` (if any) and the
+ basename of <file>.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
+command-line limits. For non-trivial cases, write an untracked file
+and use `--add-file` instead.
+
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
as well (see <<ATTRIBUTES>>).
diff --git a/archive.c b/archive.c
index a3bbb091256..d798624cd5f 100644
--- a/archive.c
+++ b/archive.c
@@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
struct extra_file_info {
char *base;
struct stat stat;
+ void *content;
};
int write_archive_entries(struct archiver_args *args,
@@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
strbuf_addstr(&path_in_archive, basename(path));
strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
+ if (info->content)
+ err = write_entry(args, &fake_oid, path_in_archive.buf,
+ path_in_archive.len,
+ info->stat.st_mode,
+ info->content, info->stat.st_size);
+ else if (strbuf_read_file(&content, path,
+ info->stat.st_size) < 0)
err = error_errno(_("could not read '%s'"), path);
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
@@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
free(info->base);
+ free(info->content);
free(info);
}
@@ -514,14 +522,38 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!arg)
return -1;
- path = prefix_filename(args->prefix, arg);
- item = string_list_append_nodup(&args->extra_files, path);
- item->util = info = xmalloc(sizeof(*info));
+ info = xmalloc(sizeof(*info));
info->base = xstrdup_or_null(base);
- if (stat(path, &info->stat))
- die(_("File not found: %s"), path);
- if (!S_ISREG(info->stat.st_mode))
- die(_("Not a regular file: %s"), path);
+
+ if (!strcmp(opt->long_name, "add-file")) {
+ path = prefix_filename(args->prefix, arg);
+ if (stat(path, &info->stat))
+ die(_("File not found: %s"), path);
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
+ } else {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+
+ p = xstrndup(arg, colon - arg);
+ if (!args->prefix)
+ path = p;
+ else {
+ path = prefix_filename(args->prefix, p);
+ free(p);
+ }
+ memset(&info->stat, 0, sizeof(info->stat));
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
+
return 0;
}
@@ -554,6 +586,9 @@ static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
+ { OPTION_CALLBACK, 0, "add-file-with-content", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
N_("write the archive to this file")),
OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 1e6d18b140e..8ff1257f1a0 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
check_zip with_untracked
check_added with_untracked untracked untracked
+test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
+ git archive --format=zip >with_file_with_content.zip \
+ --add-file-with-content=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
+ "$GIT_UNZIP" ../with_file_with_content.zip &&
+ test_path_is_file hello &&
+ test world = $(cat hello)
+ )
+'
+
test_expect_success 'git archive --format=zip --add-file twice' '
echo untracked >untracked &&
git archive --format=zip --prefix=one/ --add-file=untracked \
--
gitgitgadget
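
As a rough illustration of the `<path>:<content>` splitting done in the
patch above, here is a minimal standalone C sketch. It uses plain libc
instead of git's xstrndup()/xstrdup(), and the names `struct virtual_file`
and `split_path_content()` are hypothetical, not part of git. Like the
patch, it splits at the first colon, so the content may contain colons
but the path may not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the two halves of "<path>:<content>". */
struct virtual_file {
	char *path;	/* malloc'ed copy of the text before the first colon */
	char *content;	/* malloc'ed copy of everything after that colon */
	size_t size;	/* strlen(content), as the patch stores in st_size */
};

/*
 * Split "<path>:<content>" at the first colon.
 * Returns 0 on success, -1 if there is no colon or an allocation fails.
 */
int split_path_content(const char *arg, struct virtual_file *out)
{
	const char *colon;
	size_t path_len;

	assert(arg && out);

	colon = strchr(arg, ':');
	if (!colon)
		return -1;	/* the patch die()s with "missing colon" here */

	path_len = (size_t)(colon - arg);
	out->path = malloc(path_len + 1);
	if (!out->path)
		return -1;
	memcpy(out->path, arg, path_len);
	out->path[path_len] = '\0';

	out->size = strlen(colon + 1);
	out->content = malloc(out->size + 1);
	if (!out->content) {
		free(out->path);
		return -1;
	}
	memcpy(out->content, colon + 1, out->size + 1);	/* includes the NUL */
	return 0;
}
```

Calling split_path_content("hello:world", &vf) would fill in the path
"hello" and the content "world", matching the test case in the patch.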
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-10 19:26 ` [PATCH v4 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-10 21:48 ` Junio C Hamano
2022-05-10 22:06 ` rsbecker
2022-05-12 22:31 ` [PATCH] fixup! " Junio C Hamano
0 siblings, 2 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-10 21:48 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> @@ -514,14 +522,38 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> if (!arg)
> return -1;
>
> - path = prefix_filename(args->prefix, arg);
> - item = string_list_append_nodup(&args->extra_files, path);
> - item->util = info = xmalloc(sizeof(*info));
> + info = xmalloc(sizeof(*info));
> info->base = xstrdup_or_null(base);
> - if (stat(path, &info->stat))
> - die(_("File not found: %s"), path);
> - if (!S_ISREG(info->stat.st_mode))
> - die(_("Not a regular file: %s"), path);
> +
> + if (!strcmp(opt->long_name, "add-file")) {
> + path = prefix_filename(args->prefix, arg);
> + if (stat(path, &info->stat))
> + die(_("File not found: %s"), path);
> + if (!S_ISREG(info->stat.st_mode))
> + die(_("Not a regular file: %s"), path);
> + info->content = NULL; /* read the file later */
> + } else {
This pretends that this new one will remain the only other
option that uses the same callback in the future. To be more
defensive, it should do
} else if (!strcmp(opt->long_name, "...")) {
and end the if/else if/else cascade with
} else {
BUG("add_file_cb called for unknown option");
}
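
A minimal standalone sketch of that defensive cascade, assuming nothing
beyond libc: the enum and the function classify_option() are hypothetical
names used only for illustration. Where git would call BUG() and abort,
this sketch returns a sentinel value so the fallthrough case can be
observed by a caller.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of the options sharing one callback. */
enum add_file_kind {
	ADD_FILE_FROM_DISK,
	ADD_FILE_WITH_CONTENT,
	ADD_FILE_UNKNOWN	/* stands in for the BUG() branch */
};

/* Name every known option explicitly; anything else is a caller error. */
enum add_file_kind classify_option(const char *long_name)
{
	assert(long_name);

	if (!strcmp(long_name, "add-file"))
		return ADD_FILE_FROM_DISK;
	else if (!strcmp(long_name, "add-file-with-content"))
		return ADD_FILE_WITH_CONTENT;
	else
		return ADD_FILE_UNKNOWN;
}
```

With a bare "else", a third option wired to the same callback would
silently be parsed as `<path>:<content>`; with the explicit comparison it
lands in the last branch instead.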
> + const char *colon = strchr(arg, ':');
> + char *p;
> +
> + if (!colon)
> + die(_("missing colon: '%s'"), arg);
> +
> + p = xstrndup(arg, colon - arg);
> + if (!args->prefix)
> + path = p;
> + else {
> + path = prefix_filename(args->prefix, p);
> + free(p);
> + }
> + memset(&info->stat, 0, sizeof(info->stat));
> + info->stat.st_mode = S_IFREG | 0644;
I can sympathize with the desire to omit the mode bits because it
may not be useful for the immediate purpose of "scalar diagnose"
where the extracting end won't care what the file's permission bits
are, but letting this "mode is hardcoded" thing squat here would
later make it more work when other people want to add an option that
truly lets the caller add a "virtual" file, in response to end-user
complaints that they cannot use the existing one to add an
executable file, for example. I do not care too much about the
pathname limitation that does not allow a colon in it, simply
because it is unusual enough, but I am not sure about hardcoded
permission bits.
If we did "--add-virtual-file=<path>:0644:<contents>" instead from
day one, it certainly adds a few more lines of logic to this patch,
and the calling "scalar diagnose" may have to pass a few more bytes,
but I suspect that such a change would help the project in the
longer run.
Thanks.
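
To show how little extra logic the `<path>:0644:<contents>` form floated
above would need, here is a hedged standalone sketch. The syntax is only
a proposal in this message, and parse_virtual_file() is a hypothetical
helper, not an existing git function; it rejects an empty path and a
malformed or out-of-range octal mode.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse "<path>:<octal mode>:<contents>" (hypothetical syntax).
 * On success, report the length of the path, the mode, and a pointer to
 * the contents inside arg, and return 0. Return -1 on malformed input.
 */
int parse_virtual_file(const char *arg, size_t *path_len,
		       unsigned int *mode, const char **contents)
{
	const char *first, *second;
	char *end;
	unsigned long value;

	assert(arg && path_len && mode && contents);

	first = strchr(arg, ':');
	if (!first || first == arg)
		return -1;	/* no colon at all, or an empty path */

	second = strchr(first + 1, ':');
	if (!second)
		return -1;	/* the mode field is not terminated */

	/* Insist on an octal digit so strtoul() cannot skip blanks or a sign. */
	if (first[1] < '0' || first[1] > '7')
		return -1;

	value = strtoul(first + 1, &end, 8);
	if (end != second || value > 07777)
		return -1;	/* trailing junk or out-of-range mode */

	*path_len = (size_t)(first - arg);
	*mode = (unsigned int)value;
	*contents = second + 1;	/* contents may contain further colons */
	return 0;
}
```

For "run.sh:0755:#!/bin/sh" this reports a six-byte path, mode 0755 and
the contents "#!/bin/sh".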
* RE: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-10 21:48 ` Junio C Hamano
@ 2022-05-10 22:06 ` rsbecker
2022-05-10 23:21 ` Junio C Hamano
2022-05-12 22:31 ` [PATCH] fixup! " Junio C Hamano
1 sibling, 1 reply; 140+ messages in thread
From: rsbecker @ 2022-05-10 22:06 UTC (permalink / raw)
To: 'Junio C Hamano', 'Johannes Schindelin via GitGitGadget'
Cc: git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren',
'Johannes Schindelin'
On May 10, 2022 5:48 PM, Junio C Hamano wrote:
>"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
>writes:
>
>> @@ -514,14 +522,38 @@ static int add_file_cb(const struct option *opt, const
>char *arg, int unset)
>> if (!arg)
>> return -1;
>>
>> - path = prefix_filename(args->prefix, arg);
>> - item = string_list_append_nodup(&args->extra_files, path);
>> - item->util = info = xmalloc(sizeof(*info));
>> + info = xmalloc(sizeof(*info));
>> info->base = xstrdup_or_null(base);
>> - if (stat(path, &info->stat))
>> - die(_("File not found: %s"), path);
>> - if (!S_ISREG(info->stat.st_mode))
>> - die(_("Not a regular file: %s"), path);
>> +
>> + if (!strcmp(opt->long_name, "add-file")) {
>> + path = prefix_filename(args->prefix, arg);
>> + if (stat(path, &info->stat))
>> + die(_("File not found: %s"), path);
>> + if (!S_ISREG(info->stat.st_mode))
>> + die(_("Not a regular file: %s"), path);
>> + info->content = NULL; /* read the file later */
>> + } else {
>
>This pretends that this new one will remain the only other option that uses the
>same callback in the future. To be more defensive, it should do
>
> } else if (!strcmp(opt->long_name, "...")) {
>
>and end the if/else if/else cascade with
>
> } else {
> BUG("add_file_cb called for unknown option");
> }
>
>> + const char *colon = strchr(arg, ':');
>> + char *p;
>> +
>> + if (!colon)
>> + die(_("missing colon: '%s'"), arg);
>> +
>> + p = xstrndup(arg, colon - arg);
>> + if (!args->prefix)
>> + path = p;
>> + else {
>> + path = prefix_filename(args->prefix, p);
>> + free(p);
>> + }
>> + memset(&info->stat, 0, sizeof(info->stat));
>> + info->stat.st_mode = S_IFREG | 0644;
>
>I can sympathize with the desire to omit the mode bits because it may not be
>useful for the immediate purpose of "scalar diagnose"
>where the extracting end won't care what the file's permission bits are, but
>letting this "mode is hardcoded" thing squat here would later make it more work
>when other people want to add an option that truly lets the caller add a "virtual"
>file, in response to end-user complaints that they cannot use the existing one to
>add an executable file, for example. I do not care too much about the pathname
>limitation that does not allow a colon in it, simply because it is unusual enough, but
>I am not sure about hardcoded permission bits.
>
>If we did "--add-virtual-file=<path>:0644:<contents>" instead from day one, it
>certainly adds a few more lines of logic to this patch, and the calling "scalar
>diagnose" may have to pass a few more bytes, but I suspect that such a change
>would help the project in the longer run.
Would not core.filemode=false somewhat simulate this? The consumer-client would not care/do anything with the mode anyway. Or am I missing something?
--Randall
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-10 22:06 ` rsbecker
@ 2022-05-10 23:21 ` Junio C Hamano
2022-05-11 16:14 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-10 23:21 UTC (permalink / raw)
To: rsbecker
Cc: 'Johannes Schindelin via GitGitGadget',
git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren',
'Johannes Schindelin'
<rsbecker@nexbridge.com> writes:
>>If we did "--add-virtual-file=<path>:0644:<contents>" instead from day one, it
>>certainly adds a few more lines of logic to this patch, and the calling "scalar
>>diagnose" may have to pass a few more bytes, but I suspect that such a change
>>would help the project in the longer run.
>
> Would not core.filemode=false somewhat simulate this? The
> consumer-client would not care/do anything with the mode
> anyway. Or am I missing something?
Or I must be missing something. This is part of "git archive" where
its output is a tarball (or a zipfile) in which each entry knows its
permission bits (or at least, if it is executable). Running "tar xf"
or "unzip" on the receiving end of the output of this command should
set the executable bit (and other permission bits) correctly I would
certainly hope, so it does matter, no?
I did say "scalar diagnose" may not care. But a patch to "git
archive" will affect other people, and among them there would be
people who say "gee, now I can add a handful of files from the
command line with their contents, without actually having them in
throw-away untracked files, when running 'git archive'. That's
handy!", try it out and get disappointed by their inability to
create executable files that way. And obviously I care more about
"git archive" than "scalar diagnose". I very much welcome enhancing
the former to support the needs of the latter. I do not see a good
reason to stop at a half-feature added to the former, even if that
added half is enough to satisfy the latter, when the other half is
not all that hard to add, and it is reasonably expected that users
other than "scalar diagnose" would naturally want the other half,
too.
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-10 23:21 ` Junio C Hamano
@ 2022-05-11 16:14 ` René Scharfe
2022-05-11 19:27 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-11 16:14 UTC (permalink / raw)
To: Junio C Hamano, rsbecker
Cc: 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
Am 11.05.22 um 01:21 schrieb Junio C Hamano:
> <rsbecker@nexbridge.com> writes:
>
>>> If we did "--add-virtual-file=<path>:0644:<contents>" instead from day one, it
>>> certainly adds a few more lines of logic to this patch, and the calling "scalar
>>> diagnose" may have to pass a few more bytes, but I suspect that such a change
>>> would help the project in the longer run.
> I did say "scalar diagnose" may not care. But a patch to "git
> archive" will affect other people, and among them there would be
> people who say "gee, now I can add a handful of files from the
> command line with their contents, without actually having them in
> throw-away untracked files, when running 'git archive'. That's
> handy!", try it out and get disappointed by their inability to
> create executable files that way.
Which might motivate them to contribute a patch to add that feature.
Give them a chance! :)
> And obviously I care more about
> "git archive" than "scalar diagnose". I very much welcome enhancing
> the former to support the needs of the latter. I do not see a good
> reason to stop at a half-feature added to the former, even if that
> added half is enough to satisfy the latter, when the other half is
> not all that hard to add, and it is reasonably expected that users
> other than "scalar diagnose" would naturally want the other half,
> too.
FWIW, I'd already be satisfied by a convincing outline of a way towards
a complete solution to accept the partial feature, just to be sure we
don't paint ourselves into a corner. But I'm bad at both strategy and
saying no, so that's that.
Regarding file modes: We only effectively support the executable bit,
so an additional option --add-virtual-executable-file=<path>:<contents>
would suffice. It would also prevent the false impression that
arbitrary file modes can be used ("I said 0123 and got 0644, bug!").
And it would not even be the longest Git option.
René
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-11 16:14 ` René Scharfe
@ 2022-05-11 19:27 ` Junio C Hamano
2022-05-12 16:16 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-11 19:27 UTC (permalink / raw)
To: René Scharfe
Cc: rsbecker, 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
René Scharfe <l.s.r@web.de> writes:
> Am 11.05.22 um 01:21 schrieb Junio C Hamano:
>> <rsbecker@nexbridge.com> writes:
>>
>>>> If we did "--add-virtual-file=<path>:0644:<contents>" instead from day one, it
>>>> certainly adds a few more lines of logic to this patch, and the calling "scalar
>>>> diagnose" may have to pass a few more bytes, but I suspect that such a change
>>>> would help the project in the longer run.
>
>> I did say "scalar diagnose" may not care. But a patch to "git
>> archive" will affect other people, and among them there would be
>> people who say "gee, now I can add a handful of files from the
>> command line with their contents, without actually having them in
>> throw-away untracked files, when running 'git archive'. That's
>> handy!", try it out and get disappointed by their inability to
>> create executable files that way.
>
> Which might motivate them to contribute a patch to add that feature.
> Give them a chance! :)
Yes, but there is no way to reuse the same option in a backward
compatible way to later add the mode information, and that is why we
want to be careful before a half-feature squats on an option.
> FWIW, I'd already be satisfied by a convincing outline of a way towards
> a complete solution to accept the partial feature, just to be sure we
> don't paint ourselves into a corner.
Exactly. As you say, an extra and separate option can be used. I
do not know if that is a workaround because we didn't design the
first option to take an additional option, or a welcome feature.
> Regarding file modes: We only effectively support the executable bit,
> so an additional option --add-virtual-executable-file=<path>:<contents>
> would suffice.
While I do not think we want to support more than one "is it
executable or not?" bit, I am not so sure about what the current
code does, though, for these "not from a tree, but added as extra
files" entries.
If you add an extra file from an on-disk untracked file, the
add_file_cb() callback picks up the full st.st_mode for the file,
and write_archive_entries() in its loop over args->extra_files pass
the full info->stat.st_mode down to write_entry(), which is used by
archive-tar.c::write_tar_entry() to obtain mode bits pretty much
as-is. For tracked paths, we probably are normalizing the blobs
between 0644 and 0755 way before the values are passed as "mode"
parameter to the write_entry() functions, but for these extra files,
there is no such massaging.
So, I am OK with --add-virtual-executable=<path>:<contents> (but the
point still stands that the way the code in the patch squats in the
codepath makes it necessary to first refactor it before it can
happen) as a separate option. We may want to massage the mode bit
we grab from these extra files, if we were to go that route, though.
Thanks.
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-11 19:27 ` Junio C Hamano
@ 2022-05-12 16:16 ` René Scharfe
2022-05-12 18:15 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-12 16:16 UTC (permalink / raw)
To: Junio C Hamano
Cc: rsbecker, 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
Am 11.05.22 um 21:27 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> Regarding file modes: We only effectively support the executable bit,
>> so an additional option --add-virtual-executable-file=<path>:<contents>
>> would suffice.
>
> While I do not think we want to support more than one "is it
> executable or not?" bit, I am not so sure about what the current
> code does, though, for these "not from a tree, but added as extra
> files" entries.
>
> If you add an extra file from an on-disk untracked file, the
> add_file_cb() callback picks up the full st.st_mode for the file,
> and write_archive_entries() in its loop over args->extra_files pass
> the full info->stat.st_mode down to write_entry(), which is used by
> archive-tar.c::write_tar_entry() to obtain mode bits pretty much
> as-is.
Good point. write_tar_entry() actually normalizes the permission bits
and applies tar.umask (0002 by default):
if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
*header.typeflag = TYPEFLAG_DIR;
mode = (mode | 0777) & ~tar_umask;
} else if (S_ISLNK(mode)) {
*header.typeflag = TYPEFLAG_LNK;
mode |= 0777;
} else if (S_ISREG(mode)) {
*header.typeflag = TYPEFLAG_REG;
mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
But write_zip_entry() only normalizes (drops) the permission bits of
non-executable files:
attr2 = S_ISLNK(mode) ? ((mode | 0777) << 16) :
(mode & 0111) ? ((mode) << 16) : 0;
if (S_ISLNK(mode) || (mode & 0111))
creator_version = 0x0317;
attr2 corresponds to the field "external file attributes" mentioned in
the ZIP format specification, APPNOTE.TXT. It's interpreted based on
the "version made by" (creator_version here); that 0x03 part above
means "UNIX". The default is MS-DOS (FAT filesystem), with effectively
no support for file permissions.
So we currently leak permission bits of executable files into ZIP
archives, but not tar files. :-| Normalizing those to 0755 would be
more consistent.
> For tracked paths, we probably are normalizing the blobs
> between 0644 and 0755 way before the values are passed as "mode"
> parameter to the write_entry() functions, but for these extra files,
> there is no such massaging.
Right, mode values from read_tree() pass through canon_mode(), so only
untracked files (those appended with --add-file) are affected by the
leakage mentioned above.
René
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-12 16:16 ` René Scharfe
@ 2022-05-12 18:15 ` Junio C Hamano
2022-05-12 21:31 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-12 18:15 UTC (permalink / raw)
To: René Scharfe
Cc: rsbecker, 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
René Scharfe <l.s.r@web.de> writes:
> Good point. write_tar_entry() actually normalizes the permission bits
> and applies tar.umask (0002 by default):
>
> if (S_ISDIR(mode) || S_ISGITLINK(mode)) {
> *header.typeflag = TYPEFLAG_DIR;
> mode = (mode | 0777) & ~tar_umask;
> } else if (S_ISLNK(mode)) {
> *header.typeflag = TYPEFLAG_LNK;
> mode |= 0777;
> } else if (S_ISREG(mode)) {
> *header.typeflag = TYPEFLAG_REG;
> mode = (mode | ((mode & 0100) ? 0777 : 0666)) & ~tar_umask;
Yeah, this side seems to care only about u+x bit, so
"add-executable" as a separate option would fly well.
> But write_zip_entry() only normalizes (drops) the permission bits of
> non-executable files:
>
> attr2 = S_ISLNK(mode) ? ((mode | 0777) << 16) :
> (mode & 0111) ? ((mode) << 16) : 0;
> if (S_ISLNK(mode) || (mode & 0111))
> creator_version = 0x0317;
>
> attr2 corresponds to the field "external file attributes" mentioned in
> the ZIP format specification, APPNOTE.TXT. It's interpreted based on
> the "version made by" (creator_version here); that 0x03 part above
> means "UNIX". The default is MS-DOS (FAT filesystem), with effectively
> no support for file permissions.
>
> So we currently leak permission bits of executable files into ZIP
> archives, but not tar files. :-| Normalizing those to 0755 would be
> more consistent.
Yup.
>> For tracked paths, we probably are normalizing the blobs
>> between 0644 and 0755 way before the values are passed as "mode"
>> parameter to the write_entry() functions, but for these extra files,
>> there is no such massaging.
>
> Right, mode values from read_tree() pass through canon_mode(), so only
> untracked files (those appended with --add-file) are affected by the
> leakage mentioned above.
Thanks for sanity-checking.
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-12 18:15 ` Junio C Hamano
@ 2022-05-12 21:31 ` Junio C Hamano
2022-05-14 7:06 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-12 21:31 UTC (permalink / raw)
To: René Scharfe
Cc: rsbecker, 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
Junio C Hamano <gitster@pobox.com> writes:
>> So we currently leak permission bits of executable files into ZIP
>> archives, but not tar files. :-| Normalizing those to 0755 would be
>> more consistent.
Today, I was scanning the "What's cooking" draft and saw too many
topics that are marked with "Expecting a reroll". It turns out that
this "mode bits" thing will not be a blocker to make us wait for a
reroll of the topic, so let's handle it separately, before we
forget, as an independent fix outside the series under discussion.
Thanks.
--- >8 ---
Subject: [PATCH] archive: do not let on-disk mode leak to zip archives
When the "--add-file" option is used to add the contents from an
untracked file to the archive, the permission mode bits for these
files are sent to the archive-backend specific "write_entry()"
method as-is. We normalize the mode bits for tracked files way
before we pass them to the write_entry() method; we should do the
same here.
This is not strictly needed for "tar" archive-backend, as it has its
own code to further clean them up, but "zip" archive-backend is not
so well prepared.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
archive.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/archive.c b/archive.c
index e29d0e00f6..12a08af531 100644
--- a/archive.c
+++ b/archive.c
@@ -342,7 +342,7 @@ int write_archive_entries(struct archiver_args *args,
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
path_in_archive.len,
- info->stat.st_mode,
+ canon_mode(info->stat.st_mode),
content.buf, content.len);
if (err)
break;
--
2.36.1-338-g1c7f76a54c
* Re: [PATCH v4 1/7] archive: optionally add "virtual" files
2022-05-12 21:31 ` Junio C Hamano
@ 2022-05-14 7:06 ` René Scharfe
0 siblings, 0 replies; 140+ messages in thread
From: René Scharfe @ 2022-05-14 7:06 UTC (permalink / raw)
To: Junio C Hamano
Cc: rsbecker, 'Johannes Schindelin via GitGitGadget',
git, 'Taylor Blau', 'Derrick Stolee',
'Elijah Newren', 'Johannes Schindelin'
Am 12.05.22 um 23:31 schrieb Junio C Hamano:
> Junio C Hamano <gitster@pobox.com> writes:
>
>>> So we currently leak permission bits of executable files into ZIP
>>> archives, but not tar files. :-| Normalizing those to 0755 would be
>>> more consistent.
>
> Today, I was scanning the "What's cooking" draft and saw too many
> topics that are marked with "Expecting a reroll". It turns out that
> this "mode bits" thing will not be a blocker to make us wait for a
> reroll of the topic, so let's handle it separately, before we
> forget, as an independent fix outside the series under discussion.
>
> Thanks.
>
> --- >8 ---
> Subject: [PATCH] archive: do not let on-disk mode leak to zip archives
>
> When the "--add-file" option is used to add the contents from an
> untracked file to the archive, the permission mode bits for these
> files are sent to the archive-backend specific "write_entry()"
> method as-is. We normalize the mode bits for tracked files way
> before we pass them to the write_entry() method; we should do the
> same here.
>
> This is not strictly needed for "tar" archive-backend, as it has its
> own code to further clean them up, but "zip" archive-backend is not
> so well prepared.
>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> archive.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/archive.c b/archive.c
> index e29d0e00f6..12a08af531 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -342,7 +342,7 @@ int write_archive_entries(struct archiver_args *args,
> else
> err = write_entry(args, &fake_oid, path_in_archive.buf,
> path_in_archive.len,
> - info->stat.st_mode,
> + canon_mode(info->stat.st_mode),
> content.buf, content.len);
> if (err)
> break;
Looks good to me, thank you!
René
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH] fixup! archive: optionally add "virtual" files
2022-05-10 21:48 ` Junio C Hamano
2022-05-10 22:06 ` rsbecker
@ 2022-05-12 22:31 ` Junio C Hamano
1 sibling, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-12 22:31 UTC (permalink / raw)
To: git
Cc: Johannes Schindelin via GitGitGadget, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren, Johannes Schindelin
Do not let add_file_cb() assume that two existing callers are the
only ones, and checking that the caller is not one of them is
sufficient to determine it is the other one.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* To be squashed to the commit with the title in the series.
The "What's cooking" report is getting crowded with too many
topics marked as "Expecting a reroll", and I'm trying to do
easier ones myself to see how much reduction we can make.
archive.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/archive.c b/archive.c
index 477eba60ac..98c7449ea1 100644
--- a/archive.c
+++ b/archive.c
@@ -533,7 +533,7 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!S_ISREG(info->stat.st_mode))
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
- } else {
+ } else if (!strcmp(opt->long_name, "add-file-with-content")) {
struct strbuf buf = STRBUF_INIT;
const char *p = arg;
@@ -560,6 +560,8 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
info->stat.st_mode = S_IFREG | 0644;
info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
+ } else {
+ BUG("add_file_cb() called for %s", opt->long_name);
}
item = string_list_append_nodup(&args->extra_files, path);
item->util = info;
--
2.36.1-338-g1c7f76a54c
^ permalink raw reply related [flat|nested] 140+ messages in thread
* [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
2022-05-10 19:26 ` [PATCH v4 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:26 ` Johannes Schindelin via GitGitGadget
2022-05-10 21:56 ` Junio C Hamano
2022-05-10 19:27 ` [PATCH v4 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
` (6 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:26 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 14 ++++++++++----
archive.c | 30 ++++++++++++++++++++----------
t/t5003-archive-zip.sh | 8 ++++++++
3 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index a0edc9167b2..21eab5690ad 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -67,10 +67,16 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
+character; The contained file name is interpreted as a C-style string,
+i.e. the backslash is interpreted as escape character. The path must
+be quoted if it contains a colon, to avoid the colon from being
+misinterpreted as the separator between the path and the contents, or
+if the path begins or ends with a double-quote character.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
+cases, write an untracked file and use `--add-file` instead.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
diff --git a/archive.c b/archive.c
index d798624cd5f..477eba60ac3 100644
--- a/archive.c
+++ b/archive.c
@@ -9,6 +9,7 @@
#include "parse-options.h"
#include "unpack-trees.h"
#include "dir.h"
+#include "quote.h"
static char const * const archive_usage[] = {
N_("git archive [<options>] <tree-ish> [<path>...]"),
@@ -533,22 +534,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else {
- const char *colon = strchr(arg, ':');
- char *p;
+ struct strbuf buf = STRBUF_INIT;
+ const char *p = arg;
+
+ if (*p != '"')
+ p = strchr(p, ':');
+ else if (unquote_c_style(&buf, p, &p) < 0)
+ die(_("unclosed quote: '%s'"), arg);
- if (!colon)
+ if (!p || *p != ':')
die(_("missing colon: '%s'"), arg);
- p = xstrndup(arg, colon - arg);
- if (!args->prefix)
- path = p;
- else {
- path = prefix_filename(args->prefix, p);
- free(p);
+ if (p == arg)
+ die(_("empty file name: '%s'"), arg);
+
+ path = buf.len ?
+ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg);
+
+ if (args->prefix) {
+ char *save = path;
+ path = prefix_filename(args->prefix, path);
+ free(save);
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
}
item = string_list_append_nodup(&args->extra_files, path);
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 8ff1257f1a0..5b8bbfc2692 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -207,13 +207,21 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
+ if test_have_prereq FUNNYNAMES
+ then
+ QUOTED=quoted:colon
+ else
+ QUOTED=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
+ --add-file-with-content=\"$QUOTED\": \
--add-file-with-content=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
+ test_path_is_file $QUOTED &&
test world = $(cat hello)
)
'
--
gitgitgadget
^ permalink raw reply related [flat|nested] 140+ messages in thread
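The parsing rule introduced by this patch (an unquoted path ends at the first colon; a path starting with a double quote is unquoted first and must then be followed by a colon) can be sketched in standalone C. Both `unquote_simple` and `split_path_content` are hypothetical names: the former is a deliberately reduced stand-in that understands only `\"` and `\\`, not the full C-style escapes the real unquoting routine handles, and the sketch tracks "was quoted" with a flag instead of checking the buffer length as the patch does.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical, simplified unquoter: `in` must start with '"'. Copies the
 * unquoted text into `out`, handling only \" and \\ escapes. On success,
 * stores a pointer just past the closing quote in *end and returns 0;
 * returns -1 if the quote is never closed.
 */
static int unquote_simple(char *out, const char *in, const char **end)
{
	const char *p = in + 1; /* skip the opening quote */

	while (*p && *p != '"') {
		if (*p == '\\' && (p[1] == '"' || p[1] == '\\'))
			p++; /* drop the backslash, keep the escaped character */
		*out++ = *p++;
	}
	if (*p != '"')
		return -1;
	*out = '\0';
	*end = p + 1;
	return 0;
}

/*
 * Split "<path>:<content>" into two freshly allocated strings.
 * Returns 0 on success, -1 on an unclosed quote, a missing colon,
 * an empty file name, or allocation failure.
 */
static int split_path_content(const char *arg, char **path, char **content)
{
	const char *p = arg;
	int quoted = (*arg == '"');
	char *buf = malloc(strlen(arg) + 1); /* the path is never longer than arg */

	if (!buf)
		return -1;

	if (!quoted)
		p = strchr(p, ':'); /* first colon is the separator */
	else if (unquote_simple(buf, p, &p) < 0)
		p = NULL; /* unclosed quote */

	if (!p || *p != ':' || p == arg) {
		free(buf);
		return -1;
	}
	if (!quoted) {
		memcpy(buf, arg, p - arg);
		buf[p - arg] = '\0';
	}

	*content = malloc(strlen(p + 1) + 1);
	if (!*content) {
		free(buf);
		return -1;
	}
	strcpy(*content, p + 1);
	*path = buf;
	return 0;
}
```

Under these assumptions, `hello:world` splits into `hello` and `world`, while `"quoted:colon":` yields the path `quoted:colon` with empty content, mirroring the two arguments used in the test above.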
* Re: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-10 19:26 ` [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-10 21:56 ` Junio C Hamano
2022-05-10 22:23 ` rsbecker
2022-05-19 18:09 ` Johannes Schindelin
0 siblings, 2 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-10 21:56 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> By allowing the path to be enclosed in double-quotes, we can avoid
> the limitation that paths cannot contain colons.
> ...
> + struct strbuf buf = STRBUF_INIT;
> + const char *p = arg;
> +
> + if (*p != '"')
> + p = strchr(p, ':');
> + else if (unquote_c_style(&buf, p, &p) < 0)
> + die(_("unclosed quote: '%s'"), arg);
Even though I do not think people necessarily would want to use
colons in their pathnames (it has problems interoperating with other
systems), lifting the limitation is a good thing to do. I totally
forgot that we designed unquote_c_style() to self terminate and
return the end pointer to the caller so the caller does not have to
worry, which is very nice.
Even if this step weren't here in the series, I would have thought
the mode bits issue was more serious than "no colons in path"
limitation, but given that we address this unusual corner case
limitation, I would think we should address the hardcoded mode bits
at the same time.
> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> index 8ff1257f1a0..5b8bbfc2692 100755
> --- a/t/t5003-archive-zip.sh
> +++ b/t/t5003-archive-zip.sh
> @@ -207,13 +207,21 @@ check_zip with_untracked
> check_added with_untracked untracked untracked
>
> test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
> + if test_have_prereq FUNNYNAMES
> + then
> + QUOTED=quoted:colon
> + else
> + QUOTED=quoted
> + fi &&
;-)
> git archive --format=zip >with_file_with_content.zip \
> + --add-file-with-content=\"$QUOTED\": \
> --add-file-with-content=hello:world $EMPTY_TREE &&
> test_when_finished "rm -rf tmp-unpack" &&
> mkdir tmp-unpack && (
> cd tmp-unpack &&
> "$GIT_UNZIP" ../with_file_with_content.zip &&
> test_path_is_file hello &&
> + test_path_is_file $QUOTED &&
Looks OK, even though it probably is a good idea to have dq around
$QUOTED, so that future developers can easily insert SP into its
value to use a bit more common but still a bit more problematic
pathnames in the test.
Thanks.
^ permalink raw reply [flat|nested] 140+ messages in thread
* RE: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-10 21:56 ` Junio C Hamano
@ 2022-05-10 22:23 ` rsbecker
2022-05-19 18:12 ` Johannes Schindelin
2022-05-19 18:09 ` Johannes Schindelin
1 sibling, 1 reply; 140+ messages in thread
From: rsbecker @ 2022-05-10 22:23 UTC (permalink / raw)
To: 'Junio C Hamano', 'Johannes Schindelin via GitGitGadget'
Cc: git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren',
'Johannes Schindelin'
On May 10, 2022 5:57 PM, Junio C Hamano wrote:
>"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
>writes:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>
>> By allowing the path to be enclosed in double-quotes, we can avoid the
>> limitation that paths cannot contain colons.
>> ...
>> + struct strbuf buf = STRBUF_INIT;
>> + const char *p = arg;
>> +
>> + if (*p != '"')
>> + p = strchr(p, ':');
>> + else if (unquote_c_style(&buf, p, &p) < 0)
>> + die(_("unclosed quote: '%s'"), arg);
>
>Even though I do not think people necessarily would want to use colons in their
>pathnames (it has problems interoperating with other systems), lifting the
>limitation is a good thing to do. I totally forgot that we designed
>unquote_c_style() to self terminate and return the end pointer to the caller so the
>caller does not have to worry, which is very nice.
>
>Even if this step weren't here in the series, I would have thought the mode bits
>issue was more serious than "no colons in path"
>limitation, but given that we address this unusual corner case limitation, I would
>think we should address the hardcoded mode bits at the same time.
>
>> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh index
>> 8ff1257f1a0..5b8bbfc2692 100755
>> --- a/t/t5003-archive-zip.sh
>> +++ b/t/t5003-archive-zip.sh
>> @@ -207,13 +207,21 @@ check_zip with_untracked check_added
>> with_untracked untracked untracked
>>
>> test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
>> + if test_have_prereq FUNNYNAMES
>> + then
>> + QUOTED=quoted:colon
>> + else
>> + QUOTED=quoted
>> + fi &&
>
>;-)
>
>> git archive --format=zip >with_file_with_content.zip \
>> + --add-file-with-content=\"$QUOTED\": \
>> --add-file-with-content=hello:world $EMPTY_TREE &&
>> test_when_finished "rm -rf tmp-unpack" &&
>> mkdir tmp-unpack && (
>> cd tmp-unpack &&
>> "$GIT_UNZIP" ../with_file_with_content.zip &&
>> test_path_is_file hello &&
>> + test_path_is_file $QUOTED &&
>
>Looks OK, even though it probably is a good idea to have dq around $QUOTED, so
>that future developers can easily insert SP into its value to use a bit more common
>but still a bit more problematic pathnames in the test.
A test case for .gitignore in this would be good too. People on our exotic platform do this stuff as a matter of course. As an example, a name of $Z3P4:12399334 being used as a named pipe (associated with the unique name of a process) actually has been seen in the wild recently. My solution was to wild card this and/or contain it in an ignored directory.
Regards,
Randall
^ permalink raw reply [flat|nested] 140+ messages in thread
* RE: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-10 22:23 ` rsbecker
@ 2022-05-19 18:12 ` Johannes Schindelin
0 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-19 18:12 UTC (permalink / raw)
To: rsbecker
Cc: 'Junio C Hamano',
'Johannes Schindelin via GitGitGadget',
git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren'
Hi Randall,
On Tue, 10 May 2022, rsbecker@nexbridge.com wrote:
> On May 10, 2022 5:57 PM, Junio C Hamano wrote:
> >"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> >writes:
> >
> >> git archive --format=zip >with_file_with_content.zip \
> >> + --add-file-with-content=\"$QUOTED\": \
> >> --add-file-with-content=hello:world $EMPTY_TREE &&
> >> test_when_finished "rm -rf tmp-unpack" &&
> >> mkdir tmp-unpack && (
> >> cd tmp-unpack &&
> >> "$GIT_UNZIP" ../with_file_with_content.zip &&
> >> test_path_is_file hello &&
> >> + test_path_is_file $QUOTED &&
> >
> >Looks OK, even though it probably is a good idea to have dq around $QUOTED, so
> >that future developers can easily insert SP into its value to use a bit more common
> >but still a bit more problematic pathnames in the test.
>
> A test case for .gitignore in this would be good too. People on our
> exotic platform do this stuff as a matter of course. As an example, a
> name of $Z3P4:12399334 being used as a named pipe (associated with the
> unique name of a process) actually has been seen in the wild recently.
> My solution was to wild card this and/or contain it in an ignored
> directory.
The `--add-file-with-content` option, which this test case is all about,
specifically does not heed `.gitignore`. Is this what you want to test? If
so, I don't think that's necessary. Unless you expect some future version
to introduce a patch by mistake that makes `--add-file-with-content`
subject to the `.gitignore` rules.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-10 21:56 ` Junio C Hamano
2022-05-10 22:23 ` rsbecker
@ 2022-05-19 18:09 ` Johannes Schindelin
2022-05-19 18:44 ` Junio C Hamano
1 sibling, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-19 18:09 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren
Hi Junio,
On Tue, 10 May 2022, Junio C Hamano wrote:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
> > From: Johannes Schindelin <johannes.schindelin@gmx.de>
> >
> > By allowing the path to be enclosed in double-quotes, we can avoid
> > the limitation that paths cannot contain colons.
> > ...
> > + struct strbuf buf = STRBUF_INIT;
> > + const char *p = arg;
> > +
> > + if (*p != '"')
> > + p = strchr(p, ':');
> > + else if (unquote_c_style(&buf, p, &p) < 0)
> > + die(_("unclosed quote: '%s'"), arg);
>
> Even though I do not think people necessarily would want to use
> colons in their pathnames (it has problems interoperating with other
> systems), lifting the limitation is a good thing to do. I totally
> forgot that we designed unquote_c_style() to self terminate and
> return the end pointer to the caller so the caller does not have to
> worry, which is very nice.
>
> Even if this step weren't here in the series, I would have thought
> the mode bits issue was more serious than "no colons in path"
> limitation, but given that we address this unusual corner case
> limitation, I would think we should address the hardcoded mode bits
> at the same time.
>
> > diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> > index 8ff1257f1a0..5b8bbfc2692 100755
> > --- a/t/t5003-archive-zip.sh
> > +++ b/t/t5003-archive-zip.sh
> > @@ -207,13 +207,21 @@ check_zip with_untracked
> > check_added with_untracked untracked untracked
> >
> > test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
> > + if test_have_prereq FUNNYNAMES
> > + then
> > + QUOTED=quoted:colon
> > + else
> > + QUOTED=quoted
> > + fi &&
>
> ;-)
>
> > git archive --format=zip >with_file_with_content.zip \
> > + --add-file-with-content=\"$QUOTED\": \
> > --add-file-with-content=hello:world $EMPTY_TREE &&
> > test_when_finished "rm -rf tmp-unpack" &&
> > mkdir tmp-unpack && (
> > cd tmp-unpack &&
> > "$GIT_UNZIP" ../with_file_with_content.zip &&
> > test_path_is_file hello &&
> > + test_path_is_file $QUOTED &&
>
> Looks OK, even though it probably is a good idea to have dq around
> $QUOTED, so that future developers can easily insert SP into its
> value to use a bit more common but still a bit more problematic
> pathnames in the test.
I actually decided against this because reading
"$QUOTED"
would mislead future me to think that the double quotes that enclose
$QUOTED are the quotes that the variable's name talks about. But the
quotes are actually the escaped ones that are passed to `git archive`
above.
So, to help future Dscho should they read this code six months from now or
even later, I wanted to specifically only add quotes to the `git archive`
call to make the intention abundantly clear.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-19 18:09 ` Johannes Schindelin
@ 2022-05-19 18:44 ` Junio C Hamano
0 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-19 18:44 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> > git archive --format=zip >with_file_with_content.zip \
>> > + --add-file-with-content=\"$QUOTED\": \
>> > --add-file-with-content=hello:world $EMPTY_TREE &&
>> > test_when_finished "rm -rf tmp-unpack" &&
>> > mkdir tmp-unpack && (
>> > cd tmp-unpack &&
>> > "$GIT_UNZIP" ../with_file_with_content.zip &&
>> > test_path_is_file hello &&
>> > + test_path_is_file $QUOTED &&
>>
>> Looks OK, even though it probably is a good idea to have dq around
>> $QUOTED, so that future developers can easily insert SP into its
>> value to use a bit more common but still a bit more problematic
>> pathnames in the test.
>
> I actually decided against this because reading
>
> "$QUOTED"
>
> would mislead future me to think that the double quotes that enclose
> $QUOTED are the quotes that the variable's name talks about. But the
> quotes are actually the escaped ones that are passed to `git archive`
> above.
>
> So, to help future Dscho should they read this code six months from now or
> even later, I wanted to specifically only add quotes to the `git archive`
> call to make the intention abundantly clear.
If you find "$QUOTED" misleads any reader to think QUOTED may have
some quote characters in there, you could rename it, of course, to
signal what the value is (e.g. $PATHNAME) better.
But I think you misunderstood my comment completely.
What I meant was to write these lines like:
--add-file-with-content=\""$QUOTED"\":
test_path_is_file "$QUOTED"
Because the value in QUOTED can have $IFS whitespaces in it (after
all, allowing random letters like colon, quotes and whitespaces is
why we are adding this unquote_c_style() call), and without the
extra double quotes to protect the parameter expansion of $QUOTED,
the command line is broken.
So, don't decide against it; the reasoning behind that decision is
simply wrong.
Thanks.
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
2022-05-10 19:26 ` [PATCH v4 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
2022-05-10 19:26 ` [PATCH v4 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:27 ` Johannes Schindelin via GitGitGadget
2022-05-17 14:51 ` Ævar Arnfjörð Bjarmason
2022-05-10 19:27 ` [PATCH v4 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
` (5 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:27 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 6 ++++--
contrib/scalar/t/t9099-scalar.sh | 5 +++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 1ce9c2b00e8..00dcd4b50ef 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
usage_with_options(usagestr, options);
/* find the worktree, determine its corresponding root */
- if (argc == 1)
+ if (argc == 1) {
strbuf_add_absolute_path(&path, argv[0]);
- else if (strbuf_getcwd(&path) < 0)
+ if (!is_directory(path.buf))
+ die(_("'%s' does not exist"), path.buf);
+ } else if (strbuf_getcwd(&path) < 0)
die(_("need a working directory"));
strbuf_trim_trailing_dir_sep(&path);
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 2e1502ad45e..9d83fdf25e8 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' '
test_path_is_missing cloned
'
+test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
+ ! scalar run config cloned 2>err &&
+ grep "cloned. does not exist" err
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 140+ messages in thread
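The check added by this patch boils down to "does the path exist, and is it a directory?". A minimal sketch of such a predicate, using only `stat()`, is shown here; `is_directory_sketch` is a hypothetical name, and the body is an assumption about how such a helper is typically written rather than a quotation of the helper the patch calls.

```c
#include <sys/stat.h>

/*
 * Returns 1 if `path` exists and is a directory, 0 otherwise
 * (including when stat() fails because the path is missing).
 */
static int is_directory_sketch(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

Note that a predicate of this shape also returns 0 for an existing regular file, so the "does not exist" message in the patch covers both the missing-path and the not-a-directory case described in the commit message.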
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-10 19:27 ` [PATCH v4 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
@ 2022-05-17 14:51 ` Ævar Arnfjörð Bjarmason
2022-05-18 17:35 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-17 14:51 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, Johannes Schindelin
On Tue, May 10 2022, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> The `scalar` command needs a Scalar enlistment for many subcommands, and
> looks in the current directory for such an enlistment (traversing the
> parent directories until it finds one).
>
> These subcommands can also be called with an optional argument
> specifying the enlistment. Here, too, we traverse parent directories as
> needed, until we find an enlistment.
>
> However, if the specified directory does not even exist, or is not a
> directory, we should stop right there, with an error message.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> contrib/scalar/scalar.c | 6 ++++--
> contrib/scalar/t/t9099-scalar.sh | 5 +++++
> 2 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
> index 1ce9c2b00e8..00dcd4b50ef 100644
> --- a/contrib/scalar/scalar.c
> +++ b/contrib/scalar/scalar.c
> @@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
> usage_with_options(usagestr, options);
>
> /* find the worktree, determine its corresponding root */
> - if (argc == 1)
> + if (argc == 1) {
> strbuf_add_absolute_path(&path, argv[0]);
> - else if (strbuf_getcwd(&path) < 0)
> + if (!is_directory(path.buf))
> + die(_("'%s' does not exist"), path.buf);
> + } else if (strbuf_getcwd(&path) < 0)
> die(_("need a working directory"));
>
> strbuf_trim_trailing_dir_sep(&path);
> diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
> index 2e1502ad45e..9d83fdf25e8 100755
> --- a/contrib/scalar/t/t9099-scalar.sh
> +++ b/contrib/scalar/t/t9099-scalar.sh
> @@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' '
> test_path_is_missing cloned
> '
>
> +test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
> + ! scalar run config cloned 2>err &&
Needs to use test_must_fail, not !
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-17 14:51 ` Ævar Arnfjörð Bjarmason
@ 2022-05-18 17:35 ` Junio C Hamano
2022-05-20 7:30 ` Ævar Arnfjörð Bjarmason
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-18 17:35 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren, Johannes Schindelin
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> +test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
>> + ! scalar run config cloned 2>err &&
>
> Needs to use test_must_fail, not !
Good eyes and careful reading are very much appreciated, but in this
case, doesn't such an improvement depend on an update to teach
test_must_fail_acceptable about scalar being whitelisted?
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-18 17:35 ` Junio C Hamano
@ 2022-05-20 7:30 ` Ævar Arnfjörð Bjarmason
2022-05-20 15:55 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-20 7:30 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren, Johannes Schindelin
On Wed, May 18 2022, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>>> +test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
>>> + ! scalar run config cloned 2>err &&
>>
>> Needs to use test_must_fail, not !
>
> Good eyes and careful reading are very much appreciated, but in this
> case, doesn't such an improvement depend on an update to teach
> test_must_fail_acceptable about scalar being whitelisted?
Yes, I think so (but haven't tested it just now), but it's a relatively
small change to t/test-lib-functions.sh.
I was just noting the potential hidden segfault etc.; the issue remains
in v5.
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-20 7:30 ` Ævar Arnfjörð Bjarmason
@ 2022-05-20 15:55 ` Johannes Schindelin
2022-05-21 9:54 ` Ævar Arnfjörð Bjarmason
0 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-20 15:55 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren
Hi Ævar,
On Fri, 20 May 2022, Ævar Arnfjörð Bjarmason wrote:
>
> On Wed, May 18 2022, Junio C Hamano wrote:
>
> > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> >
> >>> +test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
> >>> + ! scalar run config cloned 2>err &&
> >>
> >> Needs to use test_must_fail, not !
> >
> > Good eyes and careful reading are very much appreciated, but in this
> > case, doesn't such an improvement depend on an update to teach
> > test_must_fail_acceptable about scalar being whitelisted?
>
> Yes, I think so (but haven't tested it just now), but it's a relatively
> small change to t/test-lib-functions.sh.
Let it be noted that I fully agree with Junio that good eyes and careful
reading are very much appreciated. And that in this case, that would have
implied noticing that `test_must_fail` is reserved for Git commands.
Scalar is not (yet?) a Git command.
Ciao,
Johannes
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-20 15:55 ` Johannes Schindelin
@ 2022-05-21 9:54 ` Ævar Arnfjörð Bjarmason
2022-05-22 5:50 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-21 9:54 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren
On Fri, May 20 2022, Johannes Schindelin wrote:
> Hi Ævar,
>
> On Fri, 20 May 2022, Ævar Arnfjörð Bjarmason wrote:
>
>>
>> On Wed, May 18 2022, Junio C Hamano wrote:
>>
>> > Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> >
>> >>> +test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
>> >>> + ! scalar run config cloned 2>err &&
>> >>
>> >> Needs to use test_must_fail, not !
>> >
>> > Good eyes and careful reading are very much appreciated, but in this
>> > case, doesn't such an improvement depend on an update to teach
>> > test_must_fail_acceptable about scalar being whitelisted?
>>
>> Yes, I think so (but haven't tested it just now), but it's a relatively
>> small change to t/test-lib-functions.sh.
>
> Let it be noted that I fully agree with Junio that good eyes and careful
> reading are very much appreciated. And that in this case, that would have
> implied noticing that `test_must_fail` is reserved for Git commands.
>
> Scalar is not (yet?) a Git command.
"test-tool" isn't "git" either, so I think this argument is a
non-starter.
As the documentation for "test_must_fail" notes the distinction is
whether something is "system-supplied". I.e. we're not going to test
whether "grep" segfaults, but we should test our own code to see if it
segfaults.
The scalar code is code we ship and test, so we should use the helper
that doesn't hide a segfault.
I don't understand why you wouldn't think that's the obvious fix here,
adding "scalar" to that whitelist is a one-line fix, and clearly yields
a more useful end result than a test silently hiding segfaults.
^ permalink raw reply [flat|nested] 140+ messages in thread
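The distinction argued over in this subthread (a bare `!` accepts any non-success, including death by signal, whereas a stricter helper accepts only a clean non-zero exit) can be sketched in C with `fork()` and `waitpid()`. All names below are hypothetical, and the classification is deliberately simplified: the real test helper checks more cases than this sketch does.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum outcome { RAN_OK, FAILED_CLEANLY, CRASHED };

/* Run `fn` in a child process and classify how the child ended. */
static enum outcome run_and_classify(void (*fn)(void))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return CRASHED; /* sketch: treat fork failure as unexpected */
	if (pid == 0) {
		fn();
		_exit(0);
	}
	if (waitpid(pid, &status, 0) < 0)
		return CRASHED;
	if (WIFSIGNALED(status))
		return CRASHED; /* killed by a signal, e.g. SIGSEGV */
	return WEXITSTATUS(status) ? FAILED_CLEANLY : RAN_OK;
}

/* A command that reports an error in the expected way. */
static void exits_with_error(void)
{
	_exit(128);
}

/* A command that crashes instead of reporting an error. */
static void dies_by_signal(void)
{
	signal(SIGSEGV, SIG_DFL);
	raise(SIGSEGV);
}

/* What a plain "!" effectively checks: anything but success passes. */
static int bang_accepts(enum outcome o)
{
	return o != RAN_OK;
}

/* What a stricter helper checks: only a clean, non-zero exit passes. */
static int must_fail_accepts(enum outcome o)
{
	return o == FAILED_CLEANLY;
}
```

In this sketch, `bang_accepts(run_and_classify(dies_by_signal))` is true while `must_fail_accepts()` of the same outcome is false, which is the "hidden segfault" concern raised above.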
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-21 9:54 ` Ævar Arnfjörð Bjarmason
@ 2022-05-22 5:50 ` Junio C Hamano
2022-05-24 12:25 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-22 5:50 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>> Scalar is not (yet?) a Git command.
>
> "test-tool" isn't "git" either, so I think this argument is a
> non-starter.
>
> As the documentation for "test_must_fail" notes the distinction is
> whether something is "system-supplied". I.e. we're not going to test
> whether "grep" segfaults, but we should test our own code to see if it
> segfaults.
>
> The scalar code is code we ship and test, so we should use the helper
> that doesn't hide a segfault.
>
> I don't understand why you wouldn't think that's the obvious fix here,
> adding "scalar" to that whitelist is a one-line fix, and clearly yields
> a more useful end result than a test silently hiding segfaults.
FWIW, I don't, either.
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-22 5:50 ` Junio C Hamano
@ 2022-05-24 12:25 ` Johannes Schindelin
2022-05-24 18:11 ` Ævar Arnfjörð Bjarmason
0 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-24 12:25 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ævar Arnfjörð Bjarmason,
Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren
[-- Attachment #1: Type: text/plain, Size: 1252 bytes --]
Hi Junio,
On Sat, 21 May 2022, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
> >> Scalar is not (yet?) a Git command.
> >
> > "test-tool" isn't "git" either, so I think this argument is a
> > non-starter.
> >
> > As the documentation for "test_must_fail" notes the distinction is
> > whether something is "system-supplied". I.e. we're not going to test
> > whether "grep" segfaults, but we should test our own code to see if it
> > segfaults.
> >
> > The scalar code is code we ship and test, so we should use the helper
> > that doesn't hide a segfault.
> >
> > I don't understand why you wouldn't think that's the obvious fix here,
> > adding "scalar" to that whitelist is a one-line fix, and clearly yields
> > a more useful end result than a test silently hiding segfaults.
>
> FWIW, I don't, either.
Because we are still talking about code that lives as much encapsulated
inside `contrib/scalar/` as possible.
The `! scalar` call is in `contrib/scalar/t/t9099-scalar.sh`.
To make it work with Git's test suite, you would have to bleed an
implementation detail of something inside `contrib/` into
`t/test-lib-functions.sh`.
Not what we want, at this stage.
Ciao,
Dscho
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-24 12:25 ` Johannes Schindelin
@ 2022-05-24 18:11 ` Ævar Arnfjörð Bjarmason
2022-05-24 19:29 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-24 18:11 UTC (permalink / raw)
To: Johannes Schindelin
Cc: Junio C Hamano, Johannes Schindelin via GitGitGadget, git,
René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren
On Tue, May 24 2022, Johannes Schindelin wrote:
> Hi Junio,
>
> On Sat, 21 May 2022, Junio C Hamano wrote:
>
>> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>>
>> >> Scalar is not (yet?) a Git command.
>> >
>> > "test-tool" isn't "git" either, so I think this argument is a
>> > non-starter.
>> >
>> > As the documentation for "test_must_fail" notes the distinction is
>> > whether something is "system-supplied". I.e. we're not going to test
>> > whether "grep" segfaults, but we should test our own code to see if it
>> > segfaults.
>> >
>> > The scalar code is code we ship and test, so we should use the helper
>> > that doesn't hide a segfault.
>> >
>> > I don't understand why you wouldn't think that's the obvious fix here,
>> > adding "scalar" to that whitelist is a one-line fix, and clearly yields
>> > a more useful end result than a test silently hiding segfaults.
>>
>> FWIW, I don't, either.
>
> Because we are still talking about code that lives as much encapsulated
> inside `contrib/scalar/` as possible.
>
> The `! scalar` call is in `contrib/scalar/t/t9099-scalar.sh`.
>
> To make it work with Git's test suite, you would have to bleed an
> implementation detail of something inside `contrib/` into
> `t/test-lib-functions.sh`.
The "scalar" command is already built by the top-level Makefile, so I
don't think the distinction you're trying to maintain here even exists
in practice.
I.e. if we ran with this strict reasoning then surely "scalar" belongs
on there just as much as "test-tool" does.
Both are built by our main build process, and thus should have
corresponding adjustments in our main test code, just as is already the
case for both "git" and "test-tool".
But even if that wasn't the case I'd still be of the view that we should
add "scalar" to that list.
It's just a matter of potential time sinks in the future. If we
introduce a hidden segfault in the scalar code and don't notice for some
time because we're using that test pattern that's going to suck, and
likely to waste a lot of time. We might even ship a broken command to
users.
Whereas having "scalar" on that list is going to be a relatively easy
matter of grepping and doing some boilerplate changes if and when we
ever "git rm" it entirely, or "promote it" from contrib or whatever.
I also think that just getting rid of that whitelist entirely is an
acceptable solution. Perhaps it's just being overzealous in forbidding
everything except "git", we should still not use it for the likes of
"grep", but we could just leave that to the documentation.
But I suspect Junio would disagree with that, so in lieu of that ...
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-24 18:11 ` Ævar Arnfjörð Bjarmason
@ 2022-05-24 19:29 ` Junio C Hamano
2022-05-25 10:31 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-24 19:29 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason
Cc: Johannes Schindelin, Johannes Schindelin via GitGitGadget, git,
René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> Both are built by our main build process, and thus should have
> corresponding adjustments in our main test code, just as is already the
> case for both "git" and "test-tool".
>
> But even if that wasn't the case I'd still be of the view that we should
> add "scalar" to that list.
>
> It's just a matter of potential time sinks in the future. If we
> introduce a hidden segfault in the scalar code and don't notice for some
> time because we're using that test pattern that's going to suck, and
> likely to waste a lot of time. We might even ship a broken command to
> users.
>
> Whereas having "scalar" on that list is going to be a relatively easy
> matter of grepping and doing some boilerplate changes if and when we
> ever "git rm" it entirely, or "promote it" from contrib or whatever.
In addition, it already is an actual time sink that causes us to send a
lot more bytes back and forth than the number of bytes necessary to
send a reroll that adds a one-liner to the same step.
> I also think that just getting rid of that whitelist entirely is an
> acceptable solution. Perhaps it's just being overzealous in forbidding
> everything except "git", we should still not use it for the likes of
> "grep", but we could just leave that to the documentation.
It indeed is a tempting entry onto a slippery slope, and I'd see it as
a change bigger than we could comfortably make as a "while at it"
change.
We can stop arguing and instead send in a reroll that squashes in
something like this, which shouldn't be controversial, I would say.
t/test-lib-functions.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh
index 93c03380d4..8899eaabed 100644
--- i/t/test-lib-functions.sh
+++ w/t/test-lib-functions.sh
@@ -1106,7 +1106,7 @@ test_must_fail_acceptable () {
fi
case "$1" in
- git|__git*|test-tool|test_terminal)
+ git|__git*|scalar|test-tool|test_terminal)
return 0
;;
*)
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v4 3/7] scalar: validate the optional enlistment argument
2022-05-24 19:29 ` Junio C Hamano
@ 2022-05-25 10:31 ` Johannes Schindelin
0 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-25 10:31 UTC (permalink / raw)
To: Junio C Hamano
Cc: Ævar Arnfjörð Bjarmason,
Johannes Schindelin via GitGitGadget, git, René Scharfe,
Taylor Blau, Derrick Stolee, Elijah Newren
Hi Junio,
On Tue, 24 May 2022, Junio C Hamano wrote:
> We can stop arguing and instead send in a reroll that squashes in
> something like this, which shouldn't be controversial, I would say.
>
> t/test-lib-functions.sh | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git i/t/test-lib-functions.sh w/t/test-lib-functions.sh
> index 93c03380d4..8899eaabed 100644
> --- i/t/test-lib-functions.sh
> +++ w/t/test-lib-functions.sh
> @@ -1106,7 +1106,7 @@ test_must_fail_acceptable () {
> fi
>
> case "$1" in
> - git|__git*|test-tool|test_terminal)
> + git|__git*|scalar|test-tool|test_terminal)
> return 0
> ;;
> *)
>
>
>
>
It is still wrong to adjust Git's test suite for a user that is not part
of Git proper. But if your pragmatism says that this is the only way we
can venture on to more productive venues, I won't argue against that :-)
Ciao,
Dscho
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH v4 4/7] Implement `scalar diagnose`
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (2 preceding siblings ...)
2022-05-10 19:27 ` [PATCH v4 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:27 ` Johannes Schindelin via GitGitGadget
2022-05-17 14:53 ` Ævar Arnfjörð Bjarmason
2022-05-10 19:27 ` [PATCH v4 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
` (4 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:27 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty tree, augmented by a couple files that are added via
the `--add-file*` options. We take care not to modify the
current repository in any way lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++
contrib/scalar/scalar.txt | 12 +++
contrib/scalar/t/t9099-scalar.sh | 14 +++
3 files changed, 170 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 00dcd4b50ef..367a2c50e25 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -11,6 +11,7 @@
#include "dir.h"
#include "packfile.h"
#include "help.h"
+#include "archive.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -261,6 +262,47 @@ static int unregister_dir(void)
return res;
}
+static int add_directory_to_archiver(struct strvec *archiver_args,
+ const char *path, int recurse)
+{
+ int at_root = !*path;
+ DIR *dir = opendir(at_root ? "." : path);
+ struct dirent *e;
+ struct strbuf buf = STRBUF_INIT;
+ size_t len;
+ int res = 0;
+
+ if (!dir)
+ return error(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
+ len = buf.len;
+ strvec_pushf(archiver_args, "--prefix=%s", buf.buf);
+
+ while (!res && (e = readdir(dir))) {
+ if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name))
+ continue;
+
+ strbuf_setlen(&buf, len);
+ strbuf_addstr(&buf, e->d_name);
+
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
+ warning(_("skipping '%s', which is neither file nor "
+ "directory"), buf.buf);
+ else if (recurse &&
+ add_directory_to_archiver(archiver_args,
+ buf.buf, recurse) < 0)
+ res = -1;
+ }
+
+ closedir(dir);
+ strbuf_release(&buf);
+ return res;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -501,6 +543,107 @@ cleanup:
return res;
}
+static int cmd_diagnose(int argc, const char **argv)
+{
+ struct option options[] = {
+ OPT_END(),
+ };
+ const char * const usage[] = {
+ N_("scalar diagnose [<enlistment>]"),
+ NULL
+ };
+ struct strbuf zip_path = STRBUF_INIT;
+ struct strvec archiver_args = STRVEC_INIT;
+ char **argv_copy = NULL;
+ int stdout_fd = -1, archiver_fd = -1;
+ time_t now = time(NULL);
+ struct tm tm;
+ struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
+ int res = 0;
+
+ argc = parse_options(argc, argv, NULL, options,
+ usage, 0);
+
+ setup_enlistment_directory(argc, argv, usage, options, &zip_path);
+
+ strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
+ strbuf_addftime(&zip_path,
+ "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
+ strbuf_addstr(&zip_path, ".zip");
+ switch (safe_create_leading_directories(zip_path.buf)) {
+ case SCLD_EXISTS:
+ case SCLD_OK:
+ break;
+ default:
+ res = error_errno(_("could not create directory for '%s'"),
+ zip_path.buf);
+ goto diagnose_cleanup;
+ }
+ stdout_fd = dup(1);
+ if (stdout_fd < 0) {
+ res = error_errno(_("could not duplicate stdout"));
+ goto diagnose_cleanup;
+ }
+
+ archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) {
+ res = error_errno(_("could not redirect output"));
+ goto diagnose_cleanup;
+ }
+
+ init_zip_archiver();
+ strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
+ get_version_info(&buf, 1);
+
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
+ "--add-file-with-content=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
+ goto diagnose_cleanup;
+
+ strvec_pushl(&archiver_args, "--prefix=",
+ oid_to_hex(the_hash_algo->empty_tree), "--", NULL);
+
+ /* `write_archive()` modifies the `argv` passed to it. Let it. */
+ argv_copy = xmemdupz(archiver_args.v,
+ sizeof(char *) * archiver_args.nr);
+ res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL,
+ the_repository, NULL, 0);
+ if (res) {
+ error(_("failed to write archive"));
+ goto diagnose_cleanup;
+ }
+
+ if (!res)
+ fprintf(stderr, "\n"
+ "Diagnostics complete.\n"
+ "All of the gathered info is captured in '%s'\n",
+ zip_path.buf);
+
+diagnose_cleanup:
+ if (archiver_fd >= 0) {
+ close(1);
+ dup2(stdout_fd, 1);
+ }
+ free(argv_copy);
+ strvec_clear(&archiver_args);
+ strbuf_release(&zip_path);
+ strbuf_release(&path);
+ strbuf_release(&buf);
+
+ return res;
+}
+
static int cmd_list(int argc, const char **argv)
{
if (argc != 1)
@@ -802,6 +945,7 @@ static struct {
{ "reconfigure", cmd_reconfigure },
{ "delete", cmd_delete },
{ "version", cmd_version },
+ { "diagnose", cmd_diagnose },
{ NULL, NULL},
};
diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt
index f416d637289..22583fe046e 100644
--- a/contrib/scalar/scalar.txt
+++ b/contrib/scalar/scalar.txt
@@ -14,6 +14,7 @@ scalar register [<enlistment>]
scalar unregister [<enlistment>]
scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [<enlistment>]
scalar reconfigure [ --all | <enlistment> ]
+scalar diagnose [<enlistment>]
scalar delete <enlistment>
DESCRIPTION
@@ -129,6 +130,17 @@ reconfigure the enlistment.
With the `--all` option, all enlistments currently registered with Scalar
will be reconfigured. Use this option after each Scalar upgrade.
+Diagnose
+~~~~~~~~
+
+diagnose [<enlistment>]::
+ When reporting issues with Scalar, it is often helpful to provide the
+ information gathered by this command, including logs and certain
+ statistics describing the data shape of the current enlistment.
++
+The output of this command is a `.zip` file that is written into a
+directory adjacent to the worktree, i.e. next to the `src` directory.
+
Delete
~~~~~~
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 9d83fdf25e8..6802d317258 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -90,4 +90,18 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
grep "cloned. does not exist" err
'
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
+ scalar diagnose cloned >out 2>err &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
+ folder=${zip_path%.zip} &&
+ test_path_is_missing "$folder" &&
+ unzip -p "$zip_path" diagnostics.log >out &&
+ test_file_not_empty out
+'
+
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v4 4/7] Implement `scalar diagnose`
2022-05-10 19:27 ` [PATCH v4 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
@ 2022-05-17 14:53 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-17 14:53 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, Johannes Schindelin
On Tue, May 10 2022, Johannes Schindelin via GitGitGadget wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> Over the course of Scalar's development, it became obvious that there is
> a need for a command that can gather all kinds of useful information
> that can help identify the most typical problems with large
> worktrees/repositories.
>
> The `diagnose` command is the culmination of this hard-won knowledge: it
> gathers the installed hooks, the config, a couple statistics describing
> the data shape, among other pieces of information, and then wraps
> everything up in a tidy, neat `.zip` archive.
>
> Note: originally, Scalar was implemented in C# using the .NET API, where
> we had the luxury of a comprehensive standard library that includes
> basic functionality such as writing a `.zip` file. In the C version, we
> lack such a commodity. Rather than introducing a dependency on, say,
> libzip, we slightly abuse Git's `archive` machinery: we write out a
> `.zip` of the empty tree, augmented by a couple files that are added via
> the `--add-file*` options. We are careful trying not to modify the
> current repository in any way lest the very circumstances that required
> `scalar diagnose` to be run are changed by the `diagnose` run itself.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++
> contrib/scalar/scalar.txt | 12 +++
> contrib/scalar/t/t9099-scalar.sh | 14 +++
> 3 files changed, 170 insertions(+)
>
> diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
> index 00dcd4b50ef..367a2c50e25 100644
> --- a/contrib/scalar/scalar.c
> +++ b/contrib/scalar/scalar.c
> @@ -11,6 +11,7 @@
> #include "dir.h"
> #include "packfile.h"
> #include "help.h"
> +#include "archive.h"
>
> /*
> * Remove the deepest subdirectory in the provided path string. Path must not
> @@ -261,6 +262,47 @@ static int unregister_dir(void)
> return res;
> }
>
> +static int add_directory_to_archiver(struct strvec *archiver_args,
> + const char *path, int recurse)
> +{
> + int at_root = !*path;
> + DIR *dir = opendir(at_root ? "." : path);
> + struct dirent *e;
> + struct strbuf buf = STRBUF_INIT;
> + size_t len;
> + int res = 0;
> +
> + if (!dir)
> + return error(_("could not open directory '%s'"), path);
s/error/error_errno/, surely?
> + strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
> + strbuf_addftime(&zip_path,
> + "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
Would we be worse off if we stole this timestamp from some known file
(or HEAD), and thus made a second run of this reproducible?
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH v4 5/7] scalar diagnose: include disk space information
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (3 preceding siblings ...)
2022-05-10 19:27 ` [PATCH v4 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:27 ` Johannes Schindelin via GitGitGadget
2022-05-10 19:27 ` [PATCH v4 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
` (3 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-10 19:27 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 53 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 1 +
2 files changed, 54 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 367a2c50e25..34cbec59b45 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -303,6 +303,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
return res;
}
+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+ struct strbuf buf = STRBUF_INIT;
+ char volume_name[MAX_PATH], fs_name[MAX_PATH];
+ DWORD serial_number, component_length, flags;
+ ULARGE_INTEGER avail2caller, total, avail;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+ error(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_setlen(&buf, offset_1st_component(buf.buf));
+ if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+ &serial_number, &component_length, &flags,
+ fs_name, sizeof(fs_name))) {
+ error(_("could not get info for '%s'"), buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, avail2caller.QuadPart);
+ strbuf_addch(out, '\n');
+ strbuf_release(&buf);
+#else
+ struct strbuf buf = STRBUF_INIT;
+ struct statvfs stat;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (statvfs(buf.buf, &stat) < 0) {
+ error_errno(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+ strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+ strbuf_release(&buf);
+#endif
+ return 0;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -599,6 +651,7 @@ static int cmd_diagnose(int argc, const char **argv)
get_version_info(&buf, 1);
strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
"--add-file-with-content=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 6802d317258..934b2485d91 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -94,6 +94,7 @@ SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
scalar diagnose cloned >out 2>err &&
+ grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
--
gitgitgadget
^ permalink raw reply related [flat|nested] 140+ messages in thread
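For comparison, a figure similar to the "Available space" line that patch 5/7 adds can be obtained from the shell. This is only a rough sketch: `available_kib` is a hypothetical helper name, it reports KiB rather than a humanised byte count, and it assumes the device name printed by `df` contains no whitespace.

```shell
# Print the space available to unprivileged users, in KiB, on the
# filesystem holding the given path (default: current directory).
# Assumes the mount's device name contains no whitespace, so that
# "Available" is the fourth field of the POSIX-format data line.
available_kib () {
	df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

printf 'Available space on %s: %s KiB\n' "$(pwd)" "$(available_kib .)"
```

Like the `statvfs()` branch of the patch, `df` reports the blocks available to non-privileged processes, not the raw free block count.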
* [PATCH v4 6/7] scalar: teach `diagnose` to gather packfile info
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (4 preceding siblings ...)
2022-05-10 19:27 ` [PATCH v4 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
@ 2022-05-10 19:27 ` Matthew John Cheetham via GitGitGadget
2022-05-10 19:27 ` [PATCH v4 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
` (2 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-10 19:27 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 30 ++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 6 +++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 34cbec59b45..e8e0a5ec473 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "help.h"
#include "archive.h"
+#include "object-store.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -595,6 +596,29 @@ cleanup:
return res;
}
+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+ const char *file_name, void *data)
+{
+ struct strbuf *buf = data;
+ struct stat st;
+
+ if (!stat(full_path, &st))
+ strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+ (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+ struct strbuf *buf = data;
+
+ strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+ for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+ data);
+
+ return 0;
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -657,6 +681,12 @@ static int cmd_diagnose(int argc, const char **argv)
"--add-file-with-content=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-file-with-content=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 934b2485d91..3dd5650cceb 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,6 +93,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -102,7 +104,9 @@ test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
unzip -p "$zip_path" diagnostics.log >out &&
- test_file_not_empty out
+ test_file_not_empty out &&
+ unzip -p "$zip_path" packs-local.txt >out &&
+ grep "$(pwd)/.git/objects" out
'
test_done
--
gitgitgadget
^ permalink raw reply related [flat|nested] 140+ messages in thread
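The listing that patch 6/7 writes to `packs-local.txt` can be approximated from the shell as follows. This is a sketch under assumptions, not a port of the C code: `list_pack_files` is a hypothetical helper, it handles a single object directory (no alternates), and it skips anything that is not a regular file.

```shell
# Hypothetical helper: list regular files in <objdir>/pack with their
# sizes in bytes, roughly in the layout used for packs-local.txt.
list_pack_files () (
	objdir=$1
	printf 'Contents of %s:\n' "$objdir"
	for f in "$objdir"/pack/*
	do
		test -f "$f" || continue
		# Arithmetic expansion strips the padding some wc versions emit.
		size=$(($(wc -c <"$f")))
		printf '%-70s %16d\n' "${f##*/}" "$size"
	done
)
```

Running `list_pack_files .git/objects` in a worktree prints one line per file in the pack directory, which makes stray non-pack files easy to spot.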
* [PATCH v4 7/7] scalar: teach `diagnose` to gather loose objects information
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (5 preceding siblings ...)
2022-05-10 19:27 ` [PATCH v4 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
@ 2022-05-10 19:27 ` Matthew John Cheetham via GitGitGadget
2022-05-17 15:03 ` [PATCH v4 0/7] scalar: implement the subcommand "diagnose" Ævar Arnfjörð Bjarmason
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-10 19:27 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 59 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 5 ++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index e8e0a5ec473..03da7452d83 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -619,6 +619,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
}
+static int count_files(char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count = 0;
+
+ if (!dir)
+ return 0;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+ count++;
+
+ closedir(dir);
+ return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count;
+ int total = 0;
+ unsigned char c;
+ struct strbuf count_path = STRBUF_INIT;
+ size_t base_path_len;
+
+ if (!dir)
+ return;
+
+ strbuf_addstr(buf, "Object directory stats for ");
+ strbuf_add_absolute_path(buf, path);
+ strbuf_addstr(buf, ":\n");
+
+ strbuf_add_absolute_path(&count_path, path);
+ strbuf_addch(&count_path, '/');
+ base_path_len = count_path.len;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) &&
+ e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+ !hex_to_bytes(&c, e->d_name, 1)) {
+ strbuf_setlen(&count_path, base_path_len);
+ strbuf_addstr(&count_path, e->d_name);
+ total += (count = count_files(count_path.buf));
+ strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+ }
+
+ strbuf_addf(buf, "Total: %d loose objects", total);
+
+ strbuf_release(&count_path);
+ closedir(dir);
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -687,6 +741,11 @@ static int cmd_diagnose(int argc, const char **argv)
foreach_alt_odb(dir_file_stats, &buf);
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-file-with-content=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 3dd5650cceb..72023a1ca1d 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -95,6 +95,7 @@ test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -106,7 +107,9 @@ test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
unzip -p "$zip_path" packs-local.txt >out &&
- grep "$(pwd)/.git/objects" out
+ grep "$(pwd)/.git/objects" out &&
+ unzip -p "$zip_path" objects-local.txt >out &&
+ grep "^Total: [1-9]" out
'
test_done
--
gitgitgadget
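
The loose-object statistics gathered by the patch above can be tried outside
of Git with a minimal standalone sketch. It is not the code from the patch:
the names is_fanout_dir_name(), stat_entry(), count_regular_files() and
count_loose_objects() are hypothetical, and the sketch calls stat() on each
entry instead of reading d_type, on the assumption that d_type may be
DT_UNKNOWN on some filesystems.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if `name` is exactly two hex digits (a fan-out directory name). */
static int is_fanout_dir_name(const char *name)
{
	assert(name);
	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}

/* stat() the entry `name` inside `dir`; returns 0 on success, like stat(). */
static int stat_entry(const char *dir, const char *name, struct stat *st)
{
	char buf[4096];
	int n = snprintf(buf, sizeof(buf), "%s/%s", dir, name);

	if (n < 0 || (size_t)n >= sizeof(buf))
		return -1; /* path too long for this sketch */
	return stat(buf, st);
}

/* Count the regular files directly inside `path`; 0 if it cannot be opened. */
static int count_regular_files(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *e;
	struct stat st;
	int count = 0;

	if (!dir)
		return 0;
	while ((e = readdir(dir)) != NULL)
		if (!stat_entry(path, e->d_name, &st) && S_ISREG(st.st_mode))
			count++;
	closedir(dir);
	return count;
}

/*
 * Sum the regular files in every two-hex-digit subdirectory of
 * `objects_dir`. Returns -1 if `objects_dir` itself cannot be opened.
 */
static int count_loose_objects(const char *objects_dir)
{
	DIR *dir = opendir(objects_dir);
	struct dirent *e;
	int total = 0;

	if (!dir)
		return -1;
	while ((e = readdir(dir)) != NULL) {
		char sub[4096];
		struct stat st;
		int n;

		/* Skip "pack", "info", "." and ".." as well as non-directories. */
		if (!is_fanout_dir_name(e->d_name) ||
		    stat_entry(objects_dir, e->d_name, &st) ||
		    !S_ISDIR(st.st_mode))
			continue;
		n = snprintf(sub, sizeof(sub), "%s/%s", objects_dir, e->d_name);
		if (n > 0 && (size_t)n < sizeof(sub))
			total += count_regular_files(sub);
	}
	closedir(dir);
	return total;
}
```

Calling count_loose_objects(".git/objects") from the top-level directory of a
repository should give a number comparable to the "Total:" line that the
patch writes to objects-local.txt, with the caveat that stat() follows
symbolic links whereas a d_type check does not.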
* Re: [PATCH v4 0/7] scalar: implement the subcommand "diagnose"
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (6 preceding siblings ...)
2022-05-10 19:27 ` [PATCH v4 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
@ 2022-05-17 15:03 ` Ævar Arnfjörð Bjarmason
2022-05-17 15:28 ` rsbecker
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
8 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-05-17 15:03 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, Johannes Schindelin
On Tue, May 10 2022, Johannes Schindelin via GitGitGadget wrote:
> Over the course of the years, we developed a sub-command that gathers
> diagnostic data into a .zip file that can then be attached to bug reports.
> This sub-command turned out to be very useful in helping Scalar developers
> identify and fix issues.
I don't mind this as some intermediate step, but re the context of the
plan for scalar "eventually going away" (discussed in previous threads)
I wonder why (especially re the earlier thread upthread at [1]) this
isn't being added to "git bugreport".
Is the plan to integrate this into "git bugreport" eventually?
1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202062213030.347@tvgsbejvaqbjf.bet/
* RE: [PATCH v4 0/7] scalar: implement the subcommand "diagnose"
2022-05-17 15:03 ` [PATCH v4 0/7] scalar: implement the subcommand "diagnose" Ævar Arnfjörð Bjarmason
@ 2022-05-17 15:28 ` rsbecker
2022-05-19 18:17 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: rsbecker @ 2022-05-17 15:28 UTC (permalink / raw)
To: 'Ævar Arnfjörð Bjarmason',
'Johannes Schindelin via GitGitGadget'
Cc: git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren',
'Johannes Schindelin'
On May 17, 2022 11:03 AM, Ævar Arnfjörð Bjarmason wrote:
>On Tue, May 10 2022, Johannes Schindelin via GitGitGadget wrote:
>
>> Over the course of the years, we developed a sub-command that gathers
>> diagnostic data into a .zip file that can then be attached to bug reports.
>> This sub-command turned out to be very useful in helping Scalar
>> developers identify and fix issues.
>
>I don't mind this as some intermediate step, but re the context of the plan for
>scalar "eventually going away" (discussed in previous threads) I wonder why
>(especially re the earlier thread upthread at [1]) this isn't being added to "git
>bugreport".
>
>Is the plan to integrate this into "git bugreport" eventually?
>
>1. https://lore.kernel.org/git/nycvar.QRO.7.76.6.2202062213030.347@tvgsbejvaqbjf.bet/
Could this also not be useful in fsck, as --diagnose? That's the go-to command when there are issues for many users.
--Randall
* RE: [PATCH v4 0/7] scalar: implement the subcommand "diagnose"
2022-05-17 15:28 ` rsbecker
@ 2022-05-19 18:17 ` Johannes Schindelin
0 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-19 18:17 UTC (permalink / raw)
To: rsbecker
Cc: 'Ævar Arnfjörð Bjarmason',
'Johannes Schindelin via GitGitGadget',
git, 'René Scharfe', 'Taylor Blau',
'Derrick Stolee', 'Elijah Newren'
Hi Randall and Ævar,
On Tue, 17 May 2022, rsbecker@nexbridge.com wrote:
> On May 17, 2022 11:03 AM, Ævar Arnfjörð Bjarmason wrote:
> >On Tue, May 10 2022, Johannes Schindelin via GitGitGadget wrote:
> >
> >> Over the course of the years, we developed a sub-command that gathers
> >> diagnostic data into a .zip file that can then be attached to bug
> >> reports. This sub-command turned out to be very useful in helping
> >> Scalar developers identify and fix issues.
> >
> >I don't mind this as some intermediate step, but re the context of the
> >plan for scalar "eventually going away" (discussed in previous threads)
> >I wonder why (especially re the earlier thread upthread at [1]) this
> >isn't being added to "git bugreport".
> >
> >Is the plan to integrate this into "git bugreport" eventually?
Potentially a variation of the `scalar diagnose` code could be useful in
`git bugreport`, opt-in via a new option.
But that's not the purpose of this patch series.
> Could this also not be useful in fsck, as --diagnose? That's the go-to
> command when there are issues for many users.
I can see where you're coming from, but `fsck`'s mission is to verify the
integrity of the local Git database. That is very different from the
mission of `scalar diagnose`, which is to help diagnose issues (whether
they are truly bugs or usage patterns causing unfortunate performance).
Ciao,
Dscho
* [PATCH v5 0/7] scalar: implement the subcommand "diagnose"
2022-05-10 19:26 ` [PATCH v4 " Johannes Schindelin via GitGitGadget
` (7 preceding siblings ...)
2022-05-17 15:03 ` [PATCH v4 0/7] scalar: implement the subcommand "diagnose" Ævar Arnfjörð Bjarmason
@ 2022-05-19 18:17 ` Johannes Schindelin via GitGitGadget
2022-05-19 18:17 ` [PATCH v5 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
` (8 more replies)
8 siblings, 9 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:17 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
Over the course of the years, we developed a sub-command that gathers
diagnostic data into a .zip file that can then be attached to bug reports.
This sub-command turned out to be very useful in helping Scalar developers
identify and fix issues.
Changes since v4:
* Squashed in Junio's suggested fixups
* Renamed the option from --add-file-with-content=<name>:<content> to
--add-virtual-file=<name>:<content>
* Fixed one instance where I had used error() instead of error_errno().
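
The difference behind the error() versus error_errno() fix listed above can
be illustrated with a small sketch. The two functions below are hypothetical
stand-ins, not Git's implementations: they format into a caller-supplied
buffer so that the result can be inspected, and they assume the usual
"error: <message>" and "error: <message>: <strerror text>" shapes.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format a plain message; the reason for the failure is lost. */
static int format_error(char *buf, size_t size, const char *msg)
{
	assert(buf && size > 0 && msg);
	snprintf(buf, size, "error: %s", msg);
	return -1;
}

/* Format the message and append the text for the current errno. */
static int format_error_errno(char *buf, size_t size, const char *msg)
{
	int saved_errno = errno; /* save it before any call can clobber it */

	assert(buf && size > 0 && msg);
	snprintf(buf, size, "error: %s: %s", msg, strerror(saved_errno));
	return -1;
}
```

With errno set to ENOENT after a failed opendir(), the second variant tells
the reader why the directory could not be opened, which the first one
cannot.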
Changes since v3:
* We're now using unquote_c_style() instead of rolling our own unquoter.
* Fixed the added regression test.
* As pointed out by Scalar's Functional Tests, the
add_directory_to_archiver() function should not fail when scalar diagnose
encounters FSMonitor's Unix socket, but only warn instead.
* Related: add_directory_to_archiver() needs to propagate errors from
processing subdirectories so that the top-level call returns an error,
too.
Changes since v2:
* Clarified in the commit message what the biggest benefit of
--add-file-with-content is.
* The <path> part of the --add-file-with-content argument can now contain
colons. To do this, the path needs to start and end in double-quote
characters (which are stripped), and the backslash serves as escape
character in that case (to allow the path to contain both colons and
double-quotes).
* Fixed incorrect grammar.
* Instead of strcmp(<what-we-don't-want>), we now say
!strcmp(<what-we-want>).
* The help text for --add-file-with-content was improved a tiny bit.
* Adjusted the commit message that still talked about spawning plenty of
processes and about a throw-away repository for the sake of generating a
.zip file.
* Simplified the code that shows the diagnostics and adds them to the .zip
file.
* The final message that reports that the archive is complete is now
printed to stderr instead of stdout.
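
The quoting rule for <path> described in the list above can be sketched as
follows. This is a simplified illustration, not the parser used in
archive.c: split_virtual_file_arg() is a hypothetical name, and inside a
quoted path a backslash merely makes the next character literal, whereas a
full C-style unquoter would also handle escapes such as \n or octal
sequences.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split "<path>:<content>" into two newly allocated strings.
 * An unquoted path ends at the first colon. A path that starts with a
 * double-quote ends at the matching unescaped double-quote, which must
 * be followed by the separating colon.
 * Returns 0 on success, -1 on malformed input or allocation failure.
 */
static int split_virtual_file_arg(const char *arg, char **path, char **content)
{
	const char *p = arg;
	size_t len = 0;
	char *out;

	assert(arg && path && content);
	out = malloc(strlen(arg) + 1); /* the path is never longer than arg */
	if (!out)
		return -1;

	if (*p == '"') {
		p++; /* skip the opening quote */
		while (*p && *p != '"') {
			if (*p == '\\' && p[1])
				p++; /* simplified: take the next char literally */
			out[len++] = *p++;
		}
		if (*p != '"' || p[1] != ':') {
			free(out);
			return -1; /* unterminated quote or missing colon */
		}
		p++; /* skip the closing quote; p now points at ':' */
	} else {
		const char *colon = strchr(arg, ':');

		if (!colon) {
			free(out);
			return -1; /* missing colon */
		}
		len = (size_t)(colon - arg);
		memcpy(out, arg, len);
		p = colon;
	}
	out[len] = '\0';

	*content = malloc(strlen(p + 1) + 1);
	if (!*content) {
		free(out);
		return -1;
	}
	strcpy(*content, p + 1); /* everything after the separator */
	*path = out;
	return 0;
}
```

In this sketch only the path needs quoting: once the separating colon has
been found, the content is taken verbatim and may itself contain colons.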
Changes since v1:
* Instead of creating a throw-away repository, staging the contents of the
.zip file and then using git write-tree and git archive to write the .zip
file, the patch series now introduces a new option to git archive and
uses write_archive() directly (avoiding any separate process).
* Since the command avoids separate processes, it is now blazing fast on
Windows, and I dropped the spinner() function because it's no longer
needed.
* While reworking the test case, I noticed that scalar [...] <enlistment>
failed to verify that the specified directory exists, and would happily
"traverse to its parent directory" on its quest to find a Scalar
enlistment. That is of course incorrect, and has been fixed as a "while
at it" sort of preparatory commit.
* I had forgotten to sign off on all the commits, which has been fixed.
* Instead of some "home-grown" readdir()-based function, the code now uses
for_each_file_in_pack_dir() to look through the pack directories.
* If any alternates are configured, their pack directories are now included
in the output.
* The commit message that might be interpreted to promise information about
large loose files has been corrected to no longer promise that.
* The test cases have been adjusted to test a little bit more (e.g.
verifying that specific paths are mentioned in the output, instead of
merely verifying that the output is non-empty).
Johannes Schindelin (5):
archive: optionally add "virtual" files
archive --add-file-with-contents: allow paths containing colons
scalar: validate the optional enlistment argument
Implement `scalar diagnose`
scalar diagnose: include disk space information
Matthew John Cheetham (2):
scalar: teach `diagnose` to gather packfile info
scalar: teach `diagnose` to gather loose objects information
Documentation/git-archive.txt | 17 ++
archive.c | 63 ++++++-
contrib/scalar/scalar.c | 292 ++++++++++++++++++++++++++++++-
contrib/scalar/scalar.txt | 12 ++
contrib/scalar/t/t9099-scalar.sh | 27 +++
t/t5003-archive-zip.sh | 20 +++
6 files changed, 421 insertions(+), 10 deletions(-)
base-commit: ddc35d833dd6f9e8946b09cecd3311b8aa18d295
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1128%2Fdscho%2Fscalar-diagnose-v5
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1128/dscho/scalar-diagnose-v5
Pull-Request: https://github.com/gitgitgadget/git/pull/1128
Range-diff vs v4:
1: 45662cf582a ! 1: 42e73fb0aac archive: optionally add "virtual" files
@@ Documentation/git-archive.txt: OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
-+--add-file-with-content=<path>:<content>::
++--add-virtual-file=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value for `--prefix` (if any) and the
@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
-+ } else {
++ } else if (!strcmp(opt->long_name, "add-virtual-file")) {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
++ } else {
++ BUG("add_file_cb() called for %s", opt->long_name);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
@@ archive.c: static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
-+ { OPTION_CALLBACK, 0, "add-file-with-content", args,
++ { OPTION_CALLBACK, 0, "add-virtual-file", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
@@ t/t5003-archive-zip.sh: test_expect_success 'git archive --format=zip --add-file
check_zip with_untracked
check_added with_untracked untracked untracked
-+test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
++test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ git archive --format=zip >with_file_with_content.zip \
-+ --add-file-with-content=hello:world $EMPTY_TREE &&
++ --add-virtual-file=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
2: fdba4ed6f4d ! 2: b5ebd61066a archive --add-file-with-contents: allow paths containing colons
@@ archive.c
@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
- } else {
+ } else if (!strcmp(opt->long_name, "add-virtual-file")) {
- const char *colon = strchr(arg, ':');
- char *p;
+ struct strbuf buf = STRBUF_INIT;
@@ archive.c: static int add_file_cb(const struct option *opt, const char *arg, int
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
- }
- item = string_list_append_nodup(&args->extra_files, path);
+ } else {
+ BUG("add_file_cb() called for %s", opt->long_name);
## t/t5003-archive-zip.sh ##
@@ t/t5003-archive-zip.sh: check_zip with_untracked
check_added with_untracked untracked untracked
- test_expect_success UNZIP 'git archive --format=zip --add-file-with-content' '
+ test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
+ QUOTED=quoted:colon
@@ t/t5003-archive-zip.sh: check_zip with_untracked
+ QUOTED=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
-+ --add-file-with-content=\"$QUOTED\": \
- --add-file-with-content=hello:world $EMPTY_TREE &&
++ --add-virtual-file=\"$QUOTED\": \
+ --add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
3: da9f52a8240 = 3: f1ba69c02d7 scalar: validate the optional enlistment argument
4: 87bdc22322b ! 4: 3fb90194744 Implement `scalar diagnose`
@@ contrib/scalar/scalar.c: static int unregister_dir(void)
+ int res = 0;
+
+ if (!dir)
-+ return error(_("could not open directory '%s'"), path);
++ return error_errno(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
@@ contrib/scalar/scalar.c: cleanup:
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
-+ "--add-file-with-content=diagnostics.log:%.*s",
++ "--add-virtual-file=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
5: 3f63b197d42 ! 5: 2e645b08a9e scalar diagnose: include disk space information
@@ contrib/scalar/scalar.c: static int cmd_diagnose(int argc, const char **argv)
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
- "--add-file-with-content=diagnostics.log:%.*s",
+ "--add-virtual-file=diagnostics.log:%.*s",
## contrib/scalar/t/t9099-scalar.sh ##
@@ contrib/scalar/t/t9099-scalar.sh: SQ="'"
6: fc1319338fc ! 6: 0fa20d73750 scalar: teach `diagnose` to gather packfile info
@@ contrib/scalar/scalar.c: cleanup:
{
struct option options[] = {
@@ contrib/scalar/scalar.c: static int cmd_diagnose(int argc, const char **argv)
- "--add-file-with-content=diagnostics.log:%.*s",
+ "--add-virtual-file=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
-+ strbuf_addstr(&buf, "--add-file-with-content=packs-local.txt:");
++ strbuf_addstr(&buf, "--add-virtual-file=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
7: e8f5b42f7b7 ! 7: 62e173b47cf scalar: teach `diagnose` to gather loose objects information
@@ contrib/scalar/scalar.c: static int cmd_diagnose(int argc, const char **argv)
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
-+ strbuf_addstr(&buf, "--add-file-with-content=objects-local.txt:");
++ strbuf_addstr(&buf, "--add-virtual-file=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
--
gitgitgadget
* [PATCH v5 1/7] archive: optionally add "virtual" files
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:17 ` Johannes Schindelin via GitGitGadget
2022-05-20 14:41 ` René Scharfe
2022-05-19 18:17 ` [PATCH v5 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
` (7 subsequent siblings)
8 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:17 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
With the `--add-file-with-content=<path>:<content>` option, `git
archive` now supports use cases where relatively trivial files need to
be added that do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 11 ++++++++
archive.c | 53 +++++++++++++++++++++++++++++------
t/t5003-archive-zip.sh | 12 ++++++++
3 files changed, 68 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a7834..893cb1075bf 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -61,6 +61,17 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+--add-virtual-file=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value for `--prefix` (if any) and the
+ basename of <file>.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
+command-line limits. For non-trivial cases, write an untracked file
+and use `--add-file` instead.
+
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
as well (see <<ATTRIBUTES>>).
diff --git a/archive.c b/archive.c
index a3bbb091256..d20e16fa819 100644
--- a/archive.c
+++ b/archive.c
@@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
struct extra_file_info {
char *base;
struct stat stat;
+ void *content;
};
int write_archive_entries(struct archiver_args *args,
@@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
strbuf_addstr(&path_in_archive, basename(path));
strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
+ if (info->content)
+ err = write_entry(args, &fake_oid, path_in_archive.buf,
+ path_in_archive.len,
+ info->stat.st_mode,
+ info->content, info->stat.st_size);
+ else if (strbuf_read_file(&content, path,
+ info->stat.st_size) < 0)
err = error_errno(_("could not read '%s'"), path);
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
@@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
free(info->base);
+ free(info->content);
free(info);
}
@@ -514,14 +522,40 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!arg)
return -1;
- path = prefix_filename(args->prefix, arg);
- item = string_list_append_nodup(&args->extra_files, path);
- item->util = info = xmalloc(sizeof(*info));
+ info = xmalloc(sizeof(*info));
info->base = xstrdup_or_null(base);
- if (stat(path, &info->stat))
- die(_("File not found: %s"), path);
- if (!S_ISREG(info->stat.st_mode))
- die(_("Not a regular file: %s"), path);
+
+ if (!strcmp(opt->long_name, "add-file")) {
+ path = prefix_filename(args->prefix, arg);
+ if (stat(path, &info->stat))
+ die(_("File not found: %s"), path);
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
+ } else if (!strcmp(opt->long_name, "add-virtual-file")) {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+
+ p = xstrndup(arg, colon - arg);
+ if (!args->prefix)
+ path = p;
+ else {
+ path = prefix_filename(args->prefix, p);
+ free(p);
+ }
+ memset(&info->stat, 0, sizeof(info->stat));
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
+ } else {
+ BUG("add_file_cb() called for %s", opt->long_name);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
+
return 0;
}
@@ -554,6 +588,9 @@ static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
+ { OPTION_CALLBACK, 0, "add-virtual-file", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
N_("write the archive to this file")),
OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 1e6d18b140e..ebc26e89a9b 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
check_zip with_untracked
check_added with_untracked untracked untracked
+test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
+ "$GIT_UNZIP" ../with_file_with_content.zip &&
+ test_path_is_file hello &&
+ test world = $(cat hello)
+ )
+'
+
test_expect_success 'git archive --format=zip --add-file twice' '
echo untracked >untracked &&
git archive --format=zip --prefix=one/ --add-file=untracked \
--
gitgitgadget
* Re: [PATCH v5 1/7] archive: optionally add "virtual" files
2022-05-19 18:17 ` [PATCH v5 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-20 14:41 ` René Scharfe
2022-05-20 16:21 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-20 14:41 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget, git
Cc: Taylor Blau, Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
Am 19.05.22 um 20:17 schrieb Johannes Schindelin via GitGitGadget:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> With the `--add-file-with-content=<path>:<content>` option, `git
^^^^^^^^^^^^^^^^^^^^^^^
That's still the old option name. Same in the subject of patch 2.
> archive` now supports use cases where relatively trivial files need to
> be added that do not exist on disk.
>
> This will allow us to generate `.zip` files with generated content,
> without having to add said content to the object database and without
> having to write it out to disk.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> ---
> Documentation/git-archive.txt | 11 ++++++++
> archive.c | 53 +++++++++++++++++++++++++++++------
> t/t5003-archive-zip.sh | 12 ++++++++
> 3 files changed, 68 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> index bc4e76a7834..893cb1075bf 100644
> --- a/Documentation/git-archive.txt
> +++ b/Documentation/git-archive.txt
> @@ -61,6 +61,17 @@ OPTIONS
> by concatenating the value for `--prefix` (if any) and the
> basename of <file>.
>
> +--add-virtual-file=<path>:<content>::
> + Add the specified contents to the archive. Can be repeated to add
> + multiple files. The path of the file in the archive is built
> + by concatenating the value for `--prefix` (if any) and the
> + basename of <file>.
> ++
> +The `<path>` cannot contain any colon, the file mode is limited to
> +a regular file, and the option may be subject to platform-dependent
> +command-line limits. For non-trivial cases, write an untracked file
> +and use `--add-file` instead.
> +
> --worktree-attributes::
> Look for attributes in .gitattributes files in the working tree
> as well (see <<ATTRIBUTES>>).
> diff --git a/archive.c b/archive.c
> index a3bbb091256..d20e16fa819 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
> struct extra_file_info {
> char *base;
> struct stat stat;
> + void *content;
> };
>
> int write_archive_entries(struct archiver_args *args,
> @@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
> strbuf_addstr(&path_in_archive, basename(path));
>
> strbuf_reset(&content);
> - if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
> + if (info->content)
> + err = write_entry(args, &fake_oid, path_in_archive.buf,
> + path_in_archive.len,
> + info->stat.st_mode,
> + info->content, info->stat.st_size);
> + else if (strbuf_read_file(&content, path,
> + info->stat.st_size) < 0)
> err = error_errno(_("could not read '%s'"), path);
> else
> err = write_entry(args, &fake_oid, path_in_archive.buf,
> @@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str)
> {
> struct extra_file_info *info = util;
> free(info->base);
> + free(info->content);
> free(info);
> }
>
> @@ -514,14 +522,40 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
> if (!arg)
> return -1;
>
> - path = prefix_filename(args->prefix, arg);
> - item = string_list_append_nodup(&args->extra_files, path);
> - item->util = info = xmalloc(sizeof(*info));
> + info = xmalloc(sizeof(*info));
> info->base = xstrdup_or_null(base);
> - if (stat(path, &info->stat))
> - die(_("File not found: %s"), path);
> - if (!S_ISREG(info->stat.st_mode))
> - die(_("Not a regular file: %s"), path);
> +
> + if (!strcmp(opt->long_name, "add-file")) {
> + path = prefix_filename(args->prefix, arg);
> + if (stat(path, &info->stat))
> + die(_("File not found: %s"), path);
> + if (!S_ISREG(info->stat.st_mode))
> + die(_("Not a regular file: %s"), path);
> + info->content = NULL; /* read the file later */
> + } else if (!strcmp(opt->long_name, "add-virtual-file")) {
> + const char *colon = strchr(arg, ':');
> + char *p;
> +
> + if (!colon)
> + die(_("missing colon: '%s'"), arg);
> +
> + p = xstrndup(arg, colon - arg);
> + if (!args->prefix)
> + path = p;
> + else {
> + path = prefix_filename(args->prefix, p);
> + free(p);
> + }
> + memset(&info->stat, 0, sizeof(info->stat));
> + info->stat.st_mode = S_IFREG | 0644;
> + info->content = xstrdup(colon + 1);
> + info->stat.st_size = strlen(info->content);
> + } else {
> + BUG("add_file_cb() called for %s", opt->long_name);
> + }
> + item = string_list_append_nodup(&args->extra_files, path);
> + item->util = info;
> +
> return 0;
> }
>
> @@ -554,6 +588,9 @@ static int parse_archive_args(int argc, const char **argv,
> { OPTION_CALLBACK, 0, "add-file", args, N_("file"),
> N_("add untracked file to archive"), 0, add_file_cb,
> (intptr_t)&base },
> + { OPTION_CALLBACK, 0, "add-virtual-file", args,
> + N_("path:content"), N_("add untracked file to archive"), 0,
> + add_file_cb, (intptr_t)&base },
> OPT_STRING('o', "output", &output, N_("file"),
> N_("write the archive to this file")),
> OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> index 1e6d18b140e..ebc26e89a9b 100755
> --- a/t/t5003-archive-zip.sh
> +++ b/t/t5003-archive-zip.sh
> @@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
> check_zip with_untracked
> check_added with_untracked untracked untracked
>
> +test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> + git archive --format=zip >with_file_with_content.zip \
> + --add-virtual-file=hello:world $EMPTY_TREE &&
> + test_when_finished "rm -rf tmp-unpack" &&
> + mkdir tmp-unpack && (
> + cd tmp-unpack &&
> + "$GIT_UNZIP" ../with_file_with_content.zip &&
> + test_path_is_file hello &&
> + test world = $(cat hello)
> + )
> +'
> +
> test_expect_success 'git archive --format=zip --add-file twice' '
> echo untracked >untracked &&
> git archive --format=zip --prefix=one/ --add-file=untracked \
* Re: [PATCH v5 1/7] archive: optionally add "virtual" files
2022-05-20 14:41 ` René Scharfe
@ 2022-05-20 16:21 ` Junio C Hamano
0 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-20 16:21 UTC (permalink / raw)
To: René Scharfe
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
René Scharfe <l.s.r@web.de> writes:
> Am 19.05.22 um 20:17 schrieb Johannes Schindelin via GitGitGadget:
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>
>> With the `--add-file-with-content=<path>:<content>` option, `git
> ^^^^^^^^^^^^^^^^^^^^^^^
> That's still the old option name. Same in the subject of patch 2.
Good eyes, and thanks for catching what I missed---the risk of
relying too much on the range-diff X-<.
* [PATCH v5 2/7] archive --add-file-with-contents: allow paths containing colons
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
2022-05-19 18:17 ` [PATCH v5 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:17 ` Johannes Schindelin via GitGitGadget
2022-05-19 18:17 ` [PATCH v5 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
` (6 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:17 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 14 ++++++++++----
archive.c | 30 ++++++++++++++++++++----------
t/t5003-archive-zip.sh | 8 ++++++++
3 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 893cb1075bf..54de945a84e 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -67,10 +67,16 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
+character; the contained file name is interpreted as a C-style string,
+i.e. the backslash is interpreted as an escape character. The path must
+be quoted if it contains a colon, to prevent the colon from being
+misinterpreted as the separator between the path and the contents, or
+if the path begins or ends with a double-quote character.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
+cases, write an untracked file and use `--add-file` instead.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
diff --git a/archive.c b/archive.c
index d20e16fa819..b7756b91200 100644
--- a/archive.c
+++ b/archive.c
@@ -9,6 +9,7 @@
#include "parse-options.h"
#include "unpack-trees.h"
#include "dir.h"
+#include "quote.h"
static char const * const archive_usage[] = {
N_("git archive [<options>] <tree-ish> [<path>...]"),
@@ -533,22 +534,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else if (!strcmp(opt->long_name, "add-virtual-file")) {
- const char *colon = strchr(arg, ':');
- char *p;
+ struct strbuf buf = STRBUF_INIT;
+ const char *p = arg;
+
+ if (*p != '"')
+ p = strchr(p, ':');
+ else if (unquote_c_style(&buf, p, &p) < 0)
+ die(_("unclosed quote: '%s'"), arg);
- if (!colon)
+ if (!p || *p != ':')
die(_("missing colon: '%s'"), arg);
- p = xstrndup(arg, colon - arg);
- if (!args->prefix)
- path = p;
- else {
- path = prefix_filename(args->prefix, p);
- free(p);
+ if (p == arg)
+ die(_("empty file name: '%s'"), arg);
+
+ path = buf.len ?
+ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg);
+
+ if (args->prefix) {
+ char *save = path;
+ path = prefix_filename(args->prefix, path);
+ free(save);
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
} else {
BUG("add_file_cb() called for %s", opt->long_name);
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index ebc26e89a9b..50932a866c9 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -207,13 +207,21 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
+ QUOTED=quoted:colon
+ else
+ QUOTED=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=\"$QUOTED\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
+ test_path_is_file $QUOTED &&
test world = $(cat hello)
)
'
--
gitgitgadget
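
The quoting rule that patch 2/7 describes can be illustrated with a small standalone sketch. This is not Git's code: `split_virtual_file_arg()` is a hypothetical helper, and it only unescapes `\"` and `\\` inside a quoted path, whereas the patch delegates to `unquote_c_style()`, which understands the full C-style escape set. It assumes the argument has the `<path>:<content>` shape and rejects the same three cases the patch dies on (unclosed quote, missing colon, empty file name).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not Git's implementation: split "<path>:<content>".
 * A path that starts with '"' runs up to the closing '"'; inside it only
 * \" and \\ are unescaped here. On success, returns 0, stores a malloc'ed
 * path in *path and a pointer into `arg` in *content. Returns -1 on a
 * malformed argument or allocation failure.
 */
int split_virtual_file_arg(const char *arg, char **path, const char **content)
{
	const char *p = arg;
	char *out;
	size_t n = 0;

	assert(arg && path && content);
	out = malloc(strlen(arg) + 1);
	if (!out)
		return -1;

	if (*p == '"') {
		for (p++; *p && *p != '"'; p++) {
			if (*p == '\\' && (p[1] == '"' || p[1] == '\\'))
				p++;	/* drop the backslash, keep the escaped character */
			out[n++] = *p;
		}
		if (*p != '"')
			goto fail;	/* unclosed quote */
		p++;			/* skip the closing quote */
	} else {
		while (*p && *p != ':')
			out[n++] = *p++;
	}

	if (*p != ':' || !n)
		goto fail;		/* missing colon or empty file name */

	out[n] = '\0';
	*path = out;
	*content = p + 1;
	return 0;
fail:
	free(out);
	return -1;
}
```

With this sketch, `hello:world` yields the path `hello` and the content `world`, while `"quoted:colon":` yields the path `quoted:colon` and empty content, which mirrors the two cases exercised in the t5003 test above.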
* [PATCH v5 3/7] scalar: validate the optional enlistment argument
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
2022-05-19 18:17 ` [PATCH v5 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
2022-05-19 18:17 ` [PATCH v5 2/7] archive --add-file-with-contents: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:17 ` Johannes Schindelin via GitGitGadget
2022-05-19 18:18 ` [PATCH v5 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
` (5 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:17 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 6 ++++--
contrib/scalar/t/t9099-scalar.sh | 5 +++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 1ce9c2b00e8..00dcd4b50ef 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
usage_with_options(usagestr, options);
/* find the worktree, determine its corresponding root */
- if (argc == 1)
+ if (argc == 1) {
strbuf_add_absolute_path(&path, argv[0]);
- else if (strbuf_getcwd(&path) < 0)
+ if (!is_directory(path.buf))
+ die(_("'%s' does not exist"), path.buf);
+ } else if (strbuf_getcwd(&path) < 0)
die(_("need a working directory"));
strbuf_trim_trailing_dir_sep(&path);
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 2e1502ad45e..9d83fdf25e8 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' '
test_path_is_missing cloned
'
+test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
+ ! scalar run config cloned 2>err &&
+ grep "cloned. does not exist" err
+'
+
test_done
--
gitgitgadget
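
The check added in patch 3/7 relies on Git's `is_directory()` helper. As a rough standalone illustration of what such a check amounts to, here is a minimal sketch built on `stat()`; the name `path_is_directory()` is made up for this example and is not part of Git.

```c
#include <assert.h>
#include <sys/stat.h>

/*
 * Hypothetical stand-in for a directory check: returns 1 if `path`
 * names an existing directory (following symlinks), 0 otherwise.
 */
int path_is_directory(const char *path)
{
	struct stat st;

	assert(path);
	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

A caller would test the user-supplied enlistment argument with this before walking up the parent directories, so that a missing or non-directory path is reported right away instead of silently resolving to some ancestor.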
* [PATCH v5 4/7] Implement `scalar diagnose`
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (2 preceding siblings ...)
2022-05-19 18:17 ` [PATCH v5 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:18 ` Johannes Schindelin via GitGitGadget
2022-05-19 18:18 ` [PATCH v5 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
` (4 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:18 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty tree, augmented by a couple files that are added via
the `--add-file*` options. We are careful not to modify the
current repository in any way lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++
contrib/scalar/scalar.txt | 12 +++
contrib/scalar/t/t9099-scalar.sh | 14 +++
3 files changed, 170 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 00dcd4b50ef..53213f9a3b9 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -11,6 +11,7 @@
#include "dir.h"
#include "packfile.h"
#include "help.h"
+#include "archive.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -261,6 +262,47 @@ static int unregister_dir(void)
return res;
}
+static int add_directory_to_archiver(struct strvec *archiver_args,
+ const char *path, int recurse)
+{
+ int at_root = !*path;
+ DIR *dir = opendir(at_root ? "." : path);
+ struct dirent *e;
+ struct strbuf buf = STRBUF_INIT;
+ size_t len;
+ int res = 0;
+
+ if (!dir)
+ return error_errno(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
+ len = buf.len;
+ strvec_pushf(archiver_args, "--prefix=%s", buf.buf);
+
+ while (!res && (e = readdir(dir))) {
+ if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name))
+ continue;
+
+ strbuf_setlen(&buf, len);
+ strbuf_addstr(&buf, e->d_name);
+
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
+ warning(_("skipping '%s', which is neither file nor "
+ "directory"), buf.buf);
+ else if (recurse &&
+ add_directory_to_archiver(archiver_args,
+ buf.buf, recurse) < 0)
+ res = -1;
+ }
+
+ closedir(dir);
+ strbuf_release(&buf);
+ return res;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -501,6 +543,107 @@ cleanup:
return res;
}
+static int cmd_diagnose(int argc, const char **argv)
+{
+ struct option options[] = {
+ OPT_END(),
+ };
+ const char * const usage[] = {
+ N_("scalar diagnose [<enlistment>]"),
+ NULL
+ };
+ struct strbuf zip_path = STRBUF_INIT;
+ struct strvec archiver_args = STRVEC_INIT;
+ char **argv_copy = NULL;
+ int stdout_fd = -1, archiver_fd = -1;
+ time_t now = time(NULL);
+ struct tm tm;
+ struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
+ int res = 0;
+
+ argc = parse_options(argc, argv, NULL, options,
+ usage, 0);
+
+ setup_enlistment_directory(argc, argv, usage, options, &zip_path);
+
+ strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
+ strbuf_addftime(&zip_path,
+ "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
+ strbuf_addstr(&zip_path, ".zip");
+ switch (safe_create_leading_directories(zip_path.buf)) {
+ case SCLD_EXISTS:
+ case SCLD_OK:
+ break;
+ default:
+ res = error_errno(_("could not create directory for '%s'"),
+ zip_path.buf);
+ goto diagnose_cleanup;
+ }
+ stdout_fd = dup(1);
+ if (stdout_fd < 0) {
+ res = error_errno(_("could not duplicate stdout"));
+ goto diagnose_cleanup;
+ }
+
+ archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) {
+ res = error_errno(_("could not redirect output"));
+ goto diagnose_cleanup;
+ }
+
+ init_zip_archiver();
+ strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
+ get_version_info(&buf, 1);
+
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
+ "--add-virtual-file=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
+ goto diagnose_cleanup;
+
+ strvec_pushl(&archiver_args, "--prefix=",
+ oid_to_hex(the_hash_algo->empty_tree), "--", NULL);
+
+ /* `write_archive()` modifies the `argv` passed to it. Let it. */
+ argv_copy = xmemdupz(archiver_args.v,
+ sizeof(char *) * archiver_args.nr);
+ res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL,
+ the_repository, NULL, 0);
+ if (res) {
+ error(_("failed to write archive"));
+ goto diagnose_cleanup;
+ }
+
+ if (!res)
+ fprintf(stderr, "\n"
+ "Diagnostics complete.\n"
+ "All of the gathered info is captured in '%s'\n",
+ zip_path.buf);
+
+diagnose_cleanup:
+ if (archiver_fd >= 0) {
+ close(1);
+ dup2(stdout_fd, 1);
+ }
+ free(argv_copy);
+ strvec_clear(&archiver_args);
+ strbuf_release(&zip_path);
+ strbuf_release(&path);
+ strbuf_release(&buf);
+
+ return res;
+}
+
static int cmd_list(int argc, const char **argv)
{
if (argc != 1)
@@ -802,6 +945,7 @@ static struct {
{ "reconfigure", cmd_reconfigure },
{ "delete", cmd_delete },
{ "version", cmd_version },
+ { "diagnose", cmd_diagnose },
{ NULL, NULL},
};
diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt
index f416d637289..22583fe046e 100644
--- a/contrib/scalar/scalar.txt
+++ b/contrib/scalar/scalar.txt
@@ -14,6 +14,7 @@ scalar register [<enlistment>]
scalar unregister [<enlistment>]
scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [<enlistment>]
scalar reconfigure [ --all | <enlistment> ]
+scalar diagnose [<enlistment>]
scalar delete <enlistment>
DESCRIPTION
@@ -129,6 +130,17 @@ reconfigure the enlistment.
With the `--all` option, all enlistments currently registered with Scalar
will be reconfigured. Use this option after each Scalar upgrade.
+Diagnose
+~~~~~~~~
+
+diagnose [<enlistment>]::
+ When reporting issues with Scalar, it is often helpful to provide the
+ information gathered by this command, including logs and certain
+ statistics describing the data shape of the current enlistment.
++
+The output of this command is a `.zip` file that is written into
+a directory adjacent to the worktree in the `src` directory.
+
Delete
~~~~~~
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 9d83fdf25e8..6802d317258 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -90,4 +90,18 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
grep "cloned. does not exist" err
'
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
+ scalar diagnose cloned >out 2>err &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
+ folder=${zip_path%.zip} &&
+ test_path_is_missing "$folder" &&
+ unzip -p "$zip_path" diagnostics.log >out &&
+ test_file_not_empty out
+'
+
test_done
--
gitgitgadget
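
Patch 4/7 writes the archive by temporarily pointing file descriptor 1 at the `.zip` file, calling `write_archive()`, and then restoring the original stdout. The sketch below shows that save/redirect/restore pattern in isolation. The names `with_stdout_redirected()` and `emit_sample()` are hypothetical, and `emit_sample()` merely stands in for whatever writes to stdout (in the patch, the archiver).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-in for the code that writes to fd 1 (hypothetical). */
void emit_sample(void)
{
	static const char msg[] = "archive bytes\n";

	if (write(1, msg, sizeof(msg) - 1) < 0)
		perror("write");
}

/*
 * Hypothetical helper: run `emit()` with fd 1 pointing at `file`,
 * then restore the original stdout. Returns 0 on success, -1 on error.
 */
int with_stdout_redirected(const char *file, void (*emit)(void))
{
	int saved, fd, res = -1;

	assert(file && emit);
	fflush(stdout);		/* do not let buffered output leak into `file` */
	saved = dup(1);
	if (saved < 0)
		return -1;

	fd = open(file, O_CREAT | O_WRONLY | O_TRUNC, 0666);
	if (fd >= 0 && dup2(fd, 1) >= 0) {
		emit();
		fflush(stdout);
		res = 0;
	}
	if (fd >= 0)
		close(fd);
	if (dup2(saved, 1) < 0)	/* put the original stdout back */
		res = -1;
	close(saved);
	return res;
}
```

Keeping the duplicate of the original descriptor around is what lets the patch keep reporting progress to the user's terminal (it writes the diagnostics text to `stdout_fd`) while fd 1 belongs to the archive.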
* [PATCH v5 5/7] scalar diagnose: include disk space information
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (3 preceding siblings ...)
2022-05-19 18:18 ` [PATCH v5 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:18 ` Johannes Schindelin via GitGitGadget
2022-05-19 18:18 ` [PATCH v5 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
` (3 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-19 18:18 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 53 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 1 +
2 files changed, 54 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 53213f9a3b9..0a9e25a57f8 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -303,6 +303,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
return res;
}
+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+ struct strbuf buf = STRBUF_INIT;
+ char volume_name[MAX_PATH], fs_name[MAX_PATH];
+ DWORD serial_number, component_length, flags;
+ ULARGE_INTEGER avail2caller, total, avail;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+ error(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_setlen(&buf, offset_1st_component(buf.buf));
+ if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+ &serial_number, &component_length, &flags,
+ fs_name, sizeof(fs_name))) {
+ error(_("could not get info for '%s'"), buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, avail2caller.QuadPart);
+ strbuf_addch(out, '\n');
+ strbuf_release(&buf);
+#else
+ struct strbuf buf = STRBUF_INIT;
+ struct statvfs stat;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (statvfs(buf.buf, &stat) < 0) {
+ error_errno(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+ strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+ strbuf_release(&buf);
+#endif
+ return 0;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -599,6 +651,7 @@ static int cmd_diagnose(int argc, const char **argv)
get_version_info(&buf, 1);
strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
"--add-virtual-file=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 6802d317258..934b2485d91 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -94,6 +94,7 @@ SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
scalar diagnose cloned >out 2>err &&
+ grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
--
gitgitgadget
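
For the non-Windows branch of patch 5/7, the free-space figure comes from `statvfs()`. The following is a minimal standalone sketch of that computation, with a hypothetical function name `available_bytes()`. Two deliberate differences from the patch, both assumptions of this sketch: it multiplies by `f_frsize`, which POSIX documents as the unit of `f_bavail` (the patch uses `f_bsize`, and the two are often equal), and it does no overflow checking, whereas the patch uses Git's `st_mult()`.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: store the number of bytes available to
 * unprivileged users on the file system containing `path`.
 * Returns 0 on success, -1 if statvfs() fails (errno is set).
 */
int available_bytes(const char *path, uintmax_t *out)
{
	struct statvfs st;

	assert(path && out);
	if (statvfs(path, &st) < 0)
		return -1;
	/* f_bavail counts blocks of f_frsize bytes; no overflow check here */
	*out = (uintmax_t)st.f_frsize * (uintmax_t)st.f_bavail;
	return 0;
}
```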
* [PATCH v5 6/7] scalar: teach `diagnose` to gather packfile info
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (4 preceding siblings ...)
2022-05-19 18:18 ` [PATCH v5 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
@ 2022-05-19 18:18 ` Matthew John Cheetham via GitGitGadget
2022-05-19 18:18 ` [PATCH v5 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
` (2 subsequent siblings)
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-19 18:18 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 30 ++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 6 +++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 0a9e25a57f8..d302c27e114 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "help.h"
#include "archive.h"
+#include "object-store.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -595,6 +596,29 @@ cleanup:
return res;
}
+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+ const char *file_name, void *data)
+{
+ struct strbuf *buf = data;
+ struct stat st;
+
+ if (!stat(full_path, &st))
+ strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+ (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+ struct strbuf *buf = data;
+
+ strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+ for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+ data);
+
+ return 0;
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -657,6 +681,12 @@ static int cmd_diagnose(int argc, const char **argv)
"--add-virtual-file=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 934b2485d91..3dd5650cceb 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,6 +93,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -102,7 +104,9 @@ test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
unzip -p "$zip_path" diagnostics.log >out &&
- test_file_not_empty out
+ test_file_not_empty out &&
+ unzip -p "$zip_path" packs-local.txt >out &&
+ grep "$(pwd)/.git/objects" out
'
test_done
--
gitgitgadget
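
Each line of `packs-local.txt` in patch 6/7 is produced with the format string `"%-70s %16" PRIuMAX "\n"`: the file name left-aligned in 70 columns, a space, then the size right-aligned in 16 columns. A small sketch of that formatting, using a hypothetical `format_size_line()` helper that takes the size directly instead of calling `stat()`:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/*
 * Hypothetical helper: format one "<name> <size>" line the way the
 * patch does. Returns what snprintf() returns, i.e. the length the
 * full line needs (88 for names of up to 70 characters and sizes of
 * up to 16 digits).
 */
int format_size_line(char *buf, size_t buflen, const char *file_name,
		     uintmax_t bytes)
{
	assert(buf && file_name);
	return snprintf(buf, buflen, "%-70s %16" PRIuMAX "\n",
			file_name, bytes);
}
```

Names longer than 70 characters are not truncated by `%-70s`; they simply push the size column to the right, so the listing stays complete at the cost of alignment.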
* [PATCH v5 7/7] scalar: teach `diagnose` to gather loose objects information
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (5 preceding siblings ...)
2022-05-19 18:18 ` [PATCH v5 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
@ 2022-05-19 18:18 ` Matthew John Cheetham via GitGitGadget
2022-05-19 19:23 ` [PATCH v5 0/7] scalar: implement the subcommand "diagnose" Junio C Hamano
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
8 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-19 18:18 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 59 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 5 ++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index d302c27e114..0c278681758 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -619,6 +619,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
}
+static int count_files(char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count = 0;
+
+ if (!dir)
+ return 0;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+ count++;
+
+ closedir(dir);
+ return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count;
+ int total = 0;
+ unsigned char c;
+ struct strbuf count_path = STRBUF_INIT;
+ size_t base_path_len;
+
+ if (!dir)
+ return;
+
+ strbuf_addstr(buf, "Object directory stats for ");
+ strbuf_add_absolute_path(buf, path);
+ strbuf_addstr(buf, ":\n");
+
+ strbuf_add_absolute_path(&count_path, path);
+ strbuf_addch(&count_path, '/');
+ base_path_len = count_path.len;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) &&
+ e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+ !hex_to_bytes(&c, e->d_name, 1)) {
+ strbuf_setlen(&count_path, base_path_len);
+ strbuf_addstr(&count_path, e->d_name);
+ total += (count = count_files(count_path.buf));
+ strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+ }
+
+ strbuf_addf(buf, "Total: %d loose objects", total);
+
+ strbuf_release(&count_path);
+ closedir(dir);
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -687,6 +741,11 @@ static int cmd_diagnose(int argc, const char **argv)
foreach_alt_odb(dir_file_stats, &buf);
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 3dd5650cceb..72023a1ca1d 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -95,6 +95,7 @@ test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -106,7 +107,9 @@ test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
unzip -p "$zip_path" packs-local.txt >out &&
- grep "$(pwd)/.git/objects" out
+ grep "$(pwd)/.git/objects" out &&
+ unzip -p "$zip_path" objects-local.txt >out &&
+ grep "^Total: [1-9]" out
'
test_done
--
gitgitgadget
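
The loop in `loose_objs_stats()` from patch 7/7 only descends into directories whose names are exactly two hexadecimal digits, which is how loose objects are fanned out under `.git/objects`. The patch tests this with `strlen()` plus Git's `hex_to_bytes()`; the sketch below expresses the same filter with standard C only. The function name `is_loose_object_fanout_dir()` is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `name` looks like a loose-object
 * fan-out directory ("00" through "ff"), 0 otherwise.
 */
int is_loose_object_fanout_dir(const char *name)
{
	assert(name);
	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}
```

This is what keeps sibling directories such as `info` and `pack` out of the per-directory counts and out of the reported total.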
* Re: [PATCH v5 0/7] scalar: implement the subcommand "diagnose"
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (6 preceding siblings ...)
2022-05-19 18:18 ` [PATCH v5 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
@ 2022-05-19 19:23 ` Junio C Hamano
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
8 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-19 19:23 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> Changes since v4:
>
> * Squashed in Junio's suggested fixups
> * Renamed the option from --add-file-with-content=<name>:<content> to
> --add-virtual-file=<name>:<content>
;-) 5 letters shorter and is a good name.
> * Fixed one instance where I had used error() instead of error_errno().
Looks good.
Thanks. Will replace and queue.
* [PATCH v6 0/7] scalar: implement the subcommand "diagnose"
2022-05-19 18:17 ` [PATCH v5 " Johannes Schindelin via GitGitGadget
` (7 preceding siblings ...)
2022-05-19 19:23 ` [PATCH v5 0/7] scalar: implement the subcommand "diagnose" Junio C Hamano
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
` (7 more replies)
8 siblings, 8 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
Over the course of the years, we developed a sub-command that gathers
diagnostic data into a .zip file that can then be attached to bug reports.
This sub-command turned out to be very useful in helping Scalar developers
identify and fix issues.
Changes since v5:
* Reworded the missed mentions of the old name of the --add-virtual-file
option (thanks René!).
* Renamed misleading variable name from $QUOTED to $PATHNAME (thanks
Junio!).
Changes since v4:
* Squashed in Junio's suggested fixups
* Renamed the option from --add-file-with-content=<name>:<content> to
--add-virtual-file=<name>:<content>
* Fixed one instance where I had used error() instead of error_errno().
Changes since v3:
* We're now using unquote_c_style() instead of rolling our own unquoter.
* Fixed the added regression test.
* As pointed out by Scalar's Functional Tests, the
add_directory_to_archiver() function should not fail when scalar diagnose
encounters FSMonitor's Unix socket, but only warn instead.
* Related: add_directory_to_archiver() needs to propagate errors from
processing subdirectories so that the top-level call returns an error,
too.
Changes since v2:
* Clarified in the commit message what the biggest benefit of
--add-file-with-content is.
* The <path> part of the --add-file-with-content argument can now contain
colons. To do this, the path needs to start and end in double-quote
characters (which are stripped), and the backslash serves as escape
character in that case (to allow the path to contain both colons and
double-quotes).
* Fixed incorrect grammar.
* Instead of strcmp(<what-we-don't-want>), we now say
!strcmp(<what-we-want>).
* The help text for --add-file-with-content was improved a tiny bit.
* Adjusted the commit message that still talked about spawning plenty of
processes and about a throw-away repository for the sake of generating a
.zip file.
* Simplified the code that shows the diagnostics and adds them to the .zip
file.
* The final message that reports that the archive is complete is now
printed to stderr instead of stdout.
Changes since v1:
* Instead of creating a throw-away repository, staging the contents of the
.zip file and then using git write-tree and git archive to write the .zip
file, the patch series now introduces a new option to git archive and
uses write_archive() directly (avoiding any separate process).
* Since the command avoids separate processes, it is now blazing fast on
Windows, and I dropped the spinner() function because it's no longer
needed.
* While reworking the test case, I noticed that scalar [...] <enlistment>
failed to verify that the specified directory exists, and would happily
"traverse to its parent directory" on its quest to find a Scalar
enlistment. That is of course incorrect, and has been fixed as a "while
at it" sort of preparatory commit.
* I had forgotten to sign off on all the commits, which has been fixed.
* Instead of some "home-grown" readdir()-based function, the code now uses
for_each_file_in_pack_dir() to look through the pack directories.
* If any alternates are configured, their pack directories are now included
in the output.
* The commit message that might be interpreted to promise information about
large loose files has been corrected to no longer promise that.
* The test cases have been adjusted to test a little bit more (e.g.
verifying that specific paths are mentioned in the output, instead of
merely verifying that the output is non-empty).
Johannes Schindelin (5):
archive: optionally add "virtual" files
archive --add-virtual-file: allow paths containing colons
scalar: validate the optional enlistment argument
Implement `scalar diagnose`
scalar diagnose: include disk space information
Matthew John Cheetham (2):
scalar: teach `diagnose` to gather packfile info
scalar: teach `diagnose` to gather loose objects information
Documentation/git-archive.txt | 17 ++
archive.c | 63 ++++++-
contrib/scalar/scalar.c | 292 ++++++++++++++++++++++++++++++-
contrib/scalar/scalar.txt | 12 ++
contrib/scalar/t/t9099-scalar.sh | 27 +++
t/t5003-archive-zip.sh | 20 +++
6 files changed, 421 insertions(+), 10 deletions(-)
base-commit: ddc35d833dd6f9e8946b09cecd3311b8aa18d295
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1128%2Fdscho%2Fscalar-diagnose-v6
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1128/dscho/scalar-diagnose-v6
Pull-Request: https://github.com/gitgitgadget/git/pull/1128
Range-diff vs v5:
1: 42e73fb0aac ! 1: 0005cfae31d archive: optionally add "virtual" files
@@ Metadata
## Commit message ##
archive: optionally add "virtual" files
- With the `--add-file-with-content=<path>:<content>` option, `git
- archive` now supports use cases where relatively trivial files need to
- be added that do not exist on disk.
+ With the `--add-virtual-file=<path>:<content>` option, `git archive` now
+ supports use cases where relatively trivial files need to be added that
+ do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
2: b5ebd61066a ! 2: 7eebcf27b45 archive --add-file-with-contents: allow paths containing colons
@@ Metadata
Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
## Commit message ##
- archive --add-file-with-contents: allow paths containing colons
+ archive --add-virtual-file: allow paths containing colons
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
@@ t/t5003-archive-zip.sh: check_zip with_untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
-+ QUOTED=quoted:colon
++ PATHNAME=quoted:colon
+ else
-+ QUOTED=quoted
++ PATHNAME=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
-+ --add-virtual-file=\"$QUOTED\": \
++ --add-virtual-file=\"$PATHNAME\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
-+ test_path_is_file $QUOTED &&
++ test_path_is_file $PATHNAME &&
test world = $(cat hello)
)
'
3: f1ba69c02d7 = 3: ca83ddd5eed scalar: validate the optional enlistment argument
4: 3fb90194744 = 4: 89c13a45e00 Implement `scalar diagnose`
5: 2e645b08a9e = 5: 8ffbaad3086 scalar diagnose: include disk space information
6: 0fa20d73750 = 6: 15cd7f17896 scalar: teach `diagnose` to gather packfile info
7: 62e173b47cf = 7: a4a74d5ef58 scalar: teach `diagnose` to gather loose objects information
--
gitgitgadget
* [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-25 21:11 ` Junio C Hamano
2022-05-21 15:08 ` [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons Johannes Schindelin via GitGitGadget
` (6 subsequent siblings)
7 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
With the `--add-virtual-file=<path>:<content>` option, `git archive` now
supports use cases where relatively trivial files need to be added that
do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 11 ++++++++
archive.c | 53 +++++++++++++++++++++++++++++------
t/t5003-archive-zip.sh | 12 ++++++++
3 files changed, 68 insertions(+), 8 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a7834..893cb1075bf 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -61,6 +61,17 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+--add-virtual-file=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value for `--prefix` (if any) and the
+ basename of <file>.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
+command-line limits. For non-trivial cases, write an untracked file
+and use `--add-file` instead.
+
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
as well (see <<ATTRIBUTES>>).
diff --git a/archive.c b/archive.c
index a3bbb091256..d20e16fa819 100644
--- a/archive.c
+++ b/archive.c
@@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
struct extra_file_info {
char *base;
struct stat stat;
+ void *content;
};
int write_archive_entries(struct archiver_args *args,
@@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
strbuf_addstr(&path_in_archive, basename(path));
strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
+ if (info->content)
+ err = write_entry(args, &fake_oid, path_in_archive.buf,
+ path_in_archive.len,
+ info->stat.st_mode,
+ info->content, info->stat.st_size);
+ else if (strbuf_read_file(&content, path,
+ info->stat.st_size) < 0)
err = error_errno(_("could not read '%s'"), path);
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
@@ -493,6 +500,7 @@ static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
free(info->base);
+ free(info->content);
free(info);
}
@@ -514,14 +522,40 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!arg)
return -1;
- path = prefix_filename(args->prefix, arg);
- item = string_list_append_nodup(&args->extra_files, path);
- item->util = info = xmalloc(sizeof(*info));
+ info = xmalloc(sizeof(*info));
info->base = xstrdup_or_null(base);
- if (stat(path, &info->stat))
- die(_("File not found: %s"), path);
- if (!S_ISREG(info->stat.st_mode))
- die(_("Not a regular file: %s"), path);
+
+ if (!strcmp(opt->long_name, "add-file")) {
+ path = prefix_filename(args->prefix, arg);
+ if (stat(path, &info->stat))
+ die(_("File not found: %s"), path);
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
+ } else if (!strcmp(opt->long_name, "add-virtual-file")) {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+
+ p = xstrndup(arg, colon - arg);
+ if (!args->prefix)
+ path = p;
+ else {
+ path = prefix_filename(args->prefix, p);
+ free(p);
+ }
+ memset(&info->stat, 0, sizeof(info->stat));
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
+ } else {
+ BUG("add_file_cb() called for %s", opt->long_name);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
+
return 0;
}
@@ -554,6 +588,9 @@ static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
+ { OPTION_CALLBACK, 0, "add-virtual-file", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
N_("write the archive to this file")),
OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index 1e6d18b140e..ebc26e89a9b 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
check_zip with_untracked
check_added with_untracked untracked untracked
+test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
+ "$GIT_UNZIP" ../with_file_with_content.zip &&
+ test_path_is_file hello &&
+ test world = $(cat hello)
+ )
+'
+
test_expect_success 'git archive --format=zip --add-file twice' '
echo untracked >untracked &&
git archive --format=zip --prefix=one/ --add-file=untracked \
--
gitgitgadget
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-21 15:08 ` [PATCH v6 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-25 21:11 ` Junio C Hamano
2022-05-26 9:09 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-25 21:11 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> @@ -61,6 +61,17 @@ OPTIONS
> by concatenating the value for `--prefix` (if any) and the
> basename of <file>.
>
> +--add-virtual-file=<path>:<content>::
> + Add the specified contents to the archive. Can be repeated to add
> + multiple files. The path of the file in the archive is built
> + by concatenating the value for `--prefix` (if any) and the
> + basename of <file>.
This sentence was copy-pasted from --add-file without adjusting.
There is no <file>; this new feature gives <path>.
Also, I suspect that the feature is losing end-user supplied
information without a good reason. --add-file=<file> may have
prepared an input in a randomly named temporary directory and it
would make quite a lot of sense to strip the leading directory
components from <file> and use only the basename part. But the
<path> given to "--add-virtual-file" does not refer to anything on
the filesystem. Its ONLY use is to be used as the path in the
archive to store the content. There is no justification why we
would discard the leading path components from it. I am not
decided, but I am inclined to say that we should not honor
"--prefix".
$ git archive --prefix=2.36.0/ v2.36.0
would be a way to create a single directory and put everything in
the tree-ish in there, but there probably are cases where the user
of an "extra file" feature wants to add untracked cruft _in_ that
directory, and there are other cases where an extra file wants to go
to the top-level next to the 2.36.0 directory. A user can use the
same string as --prefix=<base> in front of <path> if the extra file
should go next to the top-level of the tree-ish, or without such
prefixing to place the extra file at the top-level.
Hence
Add the specified contents to the archive. Can be repeated
to add multiple files. `<path>` is used as the path of the
file in the archive.
would be what I would expect in a version of this feature that is
reasonably designed.
> ++
> +The `<path>` cannot contain any colon, the file mode is limited to
> +a regular file, and the option may be subject to platform-dependent
> +command-line limits. For non-trivial cases, write an untracked file
> +and use `--add-file` instead.
OK.
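For readers following along: the "no colon in <path>" restriction falls out
of how the patch splits the argument at the first ':' (the strchr() call in
add_file_cb() further down). A minimal standalone sketch of that split,
assuming a hypothetical helper name split_virtual_file_arg() that is not
part of archive.c and plain malloc() instead of Git's x*() wrappers:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of archive.c: split "<path>:<content>"
 * at the first colon. Returns 0 on success, -1 if there is no colon or
 * if allocation fails. On success the caller owns *path and *content
 * and must free() both.
 */
int split_virtual_file_arg(const char *arg, char **path, char **content)
{
	const char *colon = strchr(arg, ':');
	size_t path_len, content_len;

	if (!colon)
		return -1;

	/* Everything before the first colon is the path. */
	path_len = (size_t)(colon - arg);
	*path = malloc(path_len + 1);
	if (!*path)
		return -1;
	memcpy(*path, arg, path_len);
	(*path)[path_len] = '\0';

	/* Everything after it is the content, later colons included. */
	content_len = strlen(colon + 1);
	*content = malloc(content_len + 1);
	if (!*content) {
		free(*path);
		return -1;
	}
	memcpy(*content, colon + 1, content_len + 1);
	return 0;
}
```

With this split, "a:b:c" yields the path "a" and the content "b:c", which is
why a colon can appear in the content but not in an unquoted path.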
> diff --git a/archive.c b/archive.c
> index a3bbb091256..d20e16fa819 100644
> --- a/archive.c
> +++ b/archive.c
> @@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
> struct extra_file_info {
> char *base;
> struct stat stat;
> + void *content;
> };
>
> int write_archive_entries(struct archiver_args *args,
> @@ -337,7 +338,13 @@ int write_archive_entries(struct archiver_args *args,
> strbuf_addstr(&path_in_archive, basename(path));
>
> strbuf_reset(&content);
> - if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
> + if (info->content)
We ended up with the problematic "leading <path> components are
discarded" design only because the implementation reuses the
path_in_archive computation logic (the last line is seen in precontext),
which is a bit unfortunate. I think we could rewrite the inside of
that "for each extra file" loop like so, instead:
for (i = 0; i < args->extra_files.nr; i++) {
struct string_list_item *item = args->extra_files.items + i;
char *path = item->string;
struct extra_file_info *info = item->util;
put_be64(fake_oid.hash, i + 1);
if (!info->content) {
strbuf_reset(&path_in_archive);
if (info->base)
strbuf_addstr(&path_in_archive, info->base);
strbuf_addstr(&path_in_archive, basename(path));
strbuf_reset(&content);
if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
err = error_errno(_("could not read '%s'"), path);
else
err = write_entry(args, &fake_oid, path_in_archive.buf,
path_in_archive.len,
info->stat.st_mode,
content.buf, content.len);
} else {
err = write_entry(args, &fake_oid,
path, strlen(path),
info->stat.st_mode,
info->content, info->stat.st_size);
}
if (err)
break;
}
The first half is the original code for "--add-file", which clears
info->content to NULL. We mangle the filename to come up with the
name in the archive (i.e. take basename and prefix with info->base).
The "else" side is the new code. "--add-virtual-file" has the
"<path>" thing in item->string, and info has the contents, so we
just write it out.
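The naming rule that the two branches of the loop above implement can be
reduced to a small standalone sketch. The function name name_in_archive()
and its parameters are made up for illustration: "base" stands in for
info->base and "has_content" for "info->content != NULL". It also uses a
plain strrchr() where archive.c calls basename(), so paths with trailing
slashes are not handled the same way.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical model of the naming rule, not code from archive.c.
 * Writes the in-archive name into out and returns what snprintf()
 * returns.
 */
int name_in_archive(char *out, size_t outlen, const char *base,
		    const char *path, int has_content)
{
	const char *slash;

	if (has_content)
		/* --add-virtual-file: <path> is used as given */
		return snprintf(out, outlen, "%s", path);

	/* --add-file: captured prefix plus the basename of the on-disk path */
	slash = strrchr(path, '/');
	return snprintf(out, outlen, "%s%s", base ? base : "",
			slash ? slash + 1 : path);
}
```

Under these assumptions, an --add-file of "/tmp/xyz/extra" with base "dir/"
is stored as "dir/extra", while a virtual file given as "docs/hello.txt"
keeps its leading directory and is stored as "docs/hello.txt".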
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-25 21:11 ` Junio C Hamano
@ 2022-05-26 9:09 ` René Scharfe
2022-05-26 17:10 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-26 9:09 UTC (permalink / raw)
To: Junio C Hamano, Johannes Schindelin via GitGitGadget
Cc: git, Taylor Blau, Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
Am 25.05.22 um 23:11 schrieb Junio C Hamano:
> "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
> writes:
>
>> @@ -61,6 +61,17 @@ OPTIONS
>> by concatenating the value for `--prefix` (if any) and the
>> basename of <file>.
>>
>> +--add-virtual-file=<path>:<content>::
>> + Add the specified contents to the archive. Can be repeated to add
>> + multiple files. The path of the file in the archive is built
>> + by concatenating the value for `--prefix` (if any) and the
>> + basename of <file>.
>
> This sentence was copy-pasted from --add-file without adjusting.
> There is no <file>; this new feature gives <path>.
>
> Also, I suspect that the feature is losing end-user supplied
> information without a good reason. --add-file=<file> may have
> prepared an input in a randomly named temporary directory and it
> would make quite a lot of sense to strip the leading directory
> components from <file> and use only the basename part. But the
> <path> given to "--add-virtual-file" does not refer to anything on
> the filesystem. Its ONLY use is to be used as the path in the
> archive to store the content. There is no justification why we
> would discard the leading path components from it.
Good point.
> I am not
> decided, but I am inclined to say that we should not honor
> "--prefix".
>
> $ git archive --prefix=2.36.0/ v2.36.0
>
> would be a way to create a single directory and put everything in
> the tree-ish in there, but there probably are cases where the user
> of an "extra file" feature wants to add untracked cruft _in_ that
> directory, and there are other cases where an extra file wants to go
> to the top-level next to the 2.36.0 directory. A user can use the
> same string as --prefix=<base> in front of <path> if the extra file
> should go next to the top-level of the tree-ish, or without such
> prefixing to place the extra file at the top-level.
If the prefix is applied then a prefix-less extra file can be had by
using --prefix= or --no-prefix for it and --prefix=... for the tree,
e.g.:
$ git archive --add-file=extra --prefix=dir/ v2.36.0
puts "extra" at the root and the rest under "dir". The order of
arguments matters here, and the default prefix is the empty string.
So extra files can be put anywhere even if --prefix is honored.
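To make the ordering rule concrete, here is a simplified model of it, not
the actual option parser. The names plan_archive() and append_line() are
invented for this sketch, only --prefix= and --add-file= are recognized,
and the basename step is left out, so file names are assumed to have no
directory part.

```c
#include <stdio.h>
#include <string.h>

/* Append one "<kind> <prefix><name>\n" line to out, truncating if needed. */
static void append_line(char *out, size_t outlen, const char *kind,
			const char *prefix, const char *name)
{
	size_t used = strlen(out);

	if (used < outlen)
		snprintf(out + used, outlen - used, "%s %s%s\n",
			 kind, prefix, name);
}

/*
 * Hypothetical model: walk the arguments left to right. Each --add-file
 * captures the prefix in effect at that point; the tree-ish gets the
 * prefix that is in effect once all options have been read.
 */
void plan_archive(int argc, const char **argv, char *out, size_t outlen)
{
	const char *prefix = "";	/* the default prefix is empty */
	int i;

	out[0] = '\0';
	for (i = 0; i < argc; i++) {
		if (!strncmp(argv[i], "--prefix=", 9))
			prefix = argv[i] + 9;
		else if (!strncmp(argv[i], "--add-file=", 11))
			append_line(out, outlen, "file", prefix, argv[i] + 11);
	}
	append_line(out, outlen, "tree", prefix, "");
}
```

Fed the arguments of the command above, this model records "file extra"
and "tree dir/": the extra file lands at the root because no prefix had
been seen yet when --add-file was read.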
Keeping the whole path from --add-virtual-file makes sense to me; I
slightly prefer applying --prefix on top of that for consistency.
René
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-26 9:09 ` René Scharfe
@ 2022-05-26 17:10 ` Junio C Hamano
2022-05-26 18:57 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-26 17:10 UTC (permalink / raw)
To: René Scharfe
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
René Scharfe <l.s.r@web.de> writes:
> If the prefix is applied then a prefix-less extra file can be had by
> using --prefix= or --no-prefix for it and --prefix=... for the tree,
> e.g.:
>
> $ git archive --add-file=extra --prefix=dir/ v2.36.0
>
> puts "extra" at the root and the rest under "dir". The order of
> arguments matters here, and the default prefix is the empty string.
This was the part of the design for the original "--add-file" that I
was moderately unhappy with. If "--add-file" were the only feature
that used "--prefix", I wouldn't have been unhappy, but this rule:
The value of "--prefix" most recently seen at the point of
"--add-file" is prepended. (By the way, it is not clearly
documented what happens when you give multiple prefix and
when you give prefix before or after add-file)
makes the original use of "--prefix":
The value given to "--prefix" is prepended to each filename
in the archive. (IOW "git archive --prefix=git-2.36.0/
v2.36.0" is a way to prefix each and every path in the
tree-ish with the given prefix)
confusing. Does
git archive --prefix=bonus-files/ --add-file=extra v2.36.0
place the main part of the archive also in bonus-files/ or at the
top level? One reasonable interpretation is "yes", if we imagine
that each invocation of --add-file will consume and reset the prefix.
Another reasonable interpretation is "no", if we imagine that the
prefix last specified will stay around and equally affect both extra
ones and main part of the archive.
Unfortunately what the implementation does is the latter, and those
who want to put the main part of the archive at the top-level must
add "--prefix=''" at the end (before the tree-ish).
Because of this potential for confusion ...
> So extra files can be put anywhere even if --prefix is honored.
>
> Keeping the whole path from --add-virtual-file makes sense to me; I
> slightly prefer applying --prefix on top of that for consistency.
... I was hoping that we can relieve users from having to worry
about the interaction between "prefix" and contents coming from
outside the tree-ish by ignoring the "prefix".
But either is fine by me.
Thanks.
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-26 17:10 ` Junio C Hamano
@ 2022-05-26 18:57 ` René Scharfe
2022-05-26 20:16 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-26 18:57 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
Am 26.05.22 um 19:10 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> If the prefix is applied then a prefix-less extra file can be had by
>> using --prefix= or --no-prefix for it and --prefix=... for the tree,
>> e.g.:
>>
>> $ git archive --add-file=extra --prefix=dir/ v2.36.0
>>
>> puts "extra" at the root and the rest under "dir". The order of
>> arguments matters here, and the default prefix is the empty string.
>
> This was the part of the design for the original "--add-file" that I
> was moderately unhappy with. If "--add-file" were the only feature
> that used "--prefix", I wouldn't have been unhappy, but this rule:
>
> The value of "--prefix" most recently seen at the point of
> "--add-file" is prepended. (By the way, it is not clearly
> documented what happens when you give multiple prefix and
> when you give prefix before or after add-file)
Regarding documentation: I wonder what's missing; a guess is below.
>
> makes the original use of "--prefix":
>
> The value given to "--prefix" is prepended to each filename
> in the archive. (IOW "git archive --prefix=git-2.36.0/
> v2.36.0" is a way to prefix each and every path in the
> tree-ish with the given prefix)
>
> confusing. Does
>
> git archive --prefix=bonus-files/ --add-file=extra v2.36.0
>
> place the main part of the archive also in bonus-files/ or at the
> top level? One reasonable interpretation is "yes", if we imagine
> that each invocation of --add-file will consume and reset the prefix.
> Another reasonable interpretation is "no", if we imagine that the
> prefix last specified will stay around and equally affect both extra
> ones and main part of the archive.
>
> Unfortunately what the implementation does is the latter, and those
> who want to put the main part of the archive at the top-level must
> add "--prefix=''" at the end (before the tree-ish).
A one-shot --prefix would be surprising -- usually options keep their
value until they are specified again with a different value or negated
(--no-...). That surprise could be documented away by using a
different name like --next-prefix or --single-use-prefix. But a
sub-option to a single option like that would probably be better baked
into that option, e.g. allow --add-file=<path_in_archive>:<path_in_fs>.
>
> Because of this potential for confusion ...
>
>> So extra files can be put anywhere even if --prefix is honored.
>>
>> Keeping the whole path from --add-virtual-file makes sense to me; I
>> slightly prefer applying --prefix on top of that for consistency.
>
> ... I was hoping that we can relieve users from having to worry
> about the interaction between "prefix" and contents coming from
> outside the tree-ish by ignoring the "prefix".
>
> But either is fine by me.
The unusual thing about the current --prefix implementation is that its
current value is captured along the way instead of just using its
right-most value. Not sure ignoring it for one of the three archive
content sources helps. (Really, it's hard for me to put myself in the shoes
of someone who doesn't know how these options are supposed to be used.)
--- >8 ---
Subject: [PATCH] archive: improve documentation of --prefix
Document the interaction between --add-file and --prefix by giving an
example.
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Documentation/git-archive.txt | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a783..10a48ab5f8 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -49,7 +49,9 @@ OPTIONS
Report progress to stderr.
--prefix=<prefix>/::
- Prepend <prefix>/ to each filename in the archive.
+ Prepend <prefix>/ to each filename in the archive. Can be
+ specified multiple times; the last one seen when reading from
+ left to right is applied.
-o <file>::
--output=<file>::
@@ -58,8 +60,8 @@ OPTIONS
--add-file=<file>::
Add a non-tracked file to the archive. Can be repeated to add
multiple files. The path of the file in the archive is built
- by concatenating the value for `--prefix` (if any) and the
- basename of <file>.
+ by concatenating the current value for `--prefix` (if any) and
+ the basename of <file>.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
@@ -194,6 +196,12 @@ EXAMPLES
commit on the current branch. Note that the output format is
inferred by the extension of the output file.
+`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
+
+ Creates a tar archive that contains the contents of the latest
+ commit on the current branch with no prefix and the untracked
+ file 'configure' with the prefix 'build/'.
+
`git config tar.tar.xz.command "xz -c"`::
Configure a "tar.xz" format for making LZMA-compressed tarfiles.
--
2.35.3
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-26 18:57 ` René Scharfe
@ 2022-05-26 20:16 ` Junio C Hamano
2022-05-27 17:02 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-26 20:16 UTC (permalink / raw)
To: René Scharfe
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
René Scharfe <l.s.r@web.de> writes:
> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
> index bc4e76a783..10a48ab5f8 100644
> --- a/Documentation/git-archive.txt
> +++ b/Documentation/git-archive.txt
> @@ -49,7 +49,9 @@ OPTIONS
> Report progress to stderr.
>
> --prefix=<prefix>/::
> - Prepend <prefix>/ to each filename in the archive.
> + Prepend <prefix>/ to each filename in the archive. Can be
> + specified multiple times; the last one seen when reading from
> + left to right is applied.
That can be read to mean that we will use C consistently,
$ cmd --prefix=A other-args --prefix=B other-args --prefix=C other-args
which is what I worry would be a source of confusion.
> -o <file>::
> --output=<file>::
> @@ -58,8 +60,8 @@ OPTIONS
> --add-file=<file>::
> Add a non-tracked file to the archive. Can be repeated to add
> multiple files. The path of the file in the archive is built
> - by concatenating the value for `--prefix` (if any) and the
> - basename of <file>.
> + by concatenating the current value for `--prefix` (if any) and
> + the basename of <file>.
"the current value for `--prefix` (if any)" would work well once we
somehow make the reader form a mental model that there is "the
current" for the "prefix", which starts with an empty string, and
gets updated every time the "--prefix=<prefix>/" option is given.
So, perhaps with
--prefix=<prefix>/::
The paths of the files in the tree being archived,
and untracked contents added via the `--add-file`
and `--add-virtual-file` options, can be modified by
prepending the "prefix" value that is in effect when
these options or the tree object is seen on the
command line. The "prefix" value initially starts
as an empty string, and it gets updated every time
this option is given on the command line.
or something like that, with something like
> + by concatenating the current value for "prefix" (see `--prefix`
> + above) and the basename of <file>.
here, it might make it less misunderstanding-prone, hopefully?
> +`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
> +
> + Creates a tar archive that contains the contents of the latest
> + commit on the current branch with no prefix and the untracked
> + file 'configure' with the prefix 'build/'.
Great to have this example.
Thanks.
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-26 20:16 ` Junio C Hamano
@ 2022-05-27 17:02 ` René Scharfe
2022-05-27 19:01 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: René Scharfe @ 2022-05-27 17:02 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
Am 26.05.22 um 22:16 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
>> index bc4e76a783..10a48ab5f8 100644
>> --- a/Documentation/git-archive.txt
>> +++ b/Documentation/git-archive.txt
>> @@ -49,7 +49,9 @@ OPTIONS
>> Report progress to stderr.
>>
>> --prefix=<prefix>/::
>> - Prepend <prefix>/ to each filename in the archive.
>> + Prepend <prefix>/ to each filename in the archive. Can be
>> + specified multiple times; the last one seen when reading from
>> + left to right is applied.
>
> That can be read to mean that we will use C consistently,
>
> $ cmd --prefix=A other-args --prefix=B other-args --prefix=C other-args
>
> which is what I worry would be a source of confusion.
>
>> -o <file>::
>> --output=<file>::
>> @@ -58,8 +60,8 @@ OPTIONS
>> --add-file=<file>::
>> Add a non-tracked file to the archive. Can be repeated to add
>> multiple files. The path of the file in the archive is built
>> - by concatenating the value for `--prefix` (if any) and the
>> - basename of <file>.
>> + by concatenating the current value for `--prefix` (if any) and
>> + the basename of <file>.
>
> "the current value for `--prefix` (if any)" would work well once we
> somehow make the reader form a mental model that there is "the
> current" for the "prefix", which starts with an empty string, and
> gets updated every time the "--prefix=<prefix>/" option is given.
Right, "current" has a well-known meaning, but it's not enough to convey
that the non-standard concept of capturing option values in the middle of
the argument list is used here.
>
> So, perhaps with
>
> --prefix=<prefix>/::
> The paths of the files in the tree being archived,
> and untracked contents added via the `--add-file`
> and `--add-virtual-file` options, can be modified by
> prepending the "prefix" value that is in effect when
> these options or the tree object is seen on the
> command line. The "prefix" value initially starts
> as an empty string, and it gets updated every time
> this option is given on the command line.
>
> or something like that, with something like
>
>> + by concatenating the current value for "prefix" (see `--prefix`
>> + above) and the basename of <file>.
>
> here, it might make it less misunderstanding-prone, hopefully?
So how about this, which avoids mentioning the idea of a "current"
option, or of updating its value (which implies an order that might not
be obvious)?
--- >8 ---
Subject: [PATCH v2] archive: improve documentation of --prefix
Document the interaction between --add-file and --prefix by giving an
example.
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Documentation/git-archive.txt | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a783..9c0e306c03 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -49,7 +49,9 @@ OPTIONS
Report progress to stderr.
--prefix=<prefix>/::
- Prepend <prefix>/ to each filename in the archive.
+ Prepend <prefix>/ to paths in the archive. Can be repeated; its
+ leftmost value is used for all tracked files. See below which
+ value gets used by `--add-file`.
-o <file>::
--output=<file>::
@@ -58,8 +60,9 @@ OPTIONS
--add-file=<file>::
Add a non-tracked file to the archive. Can be repeated to add
multiple files. The path of the file in the archive is built
- by concatenating the value for `--prefix` (if any) and the
- basename of <file>.
+ by concatenating the value of the leftmost `--prefix` option to
+ the right of this `--add-file` (if any) and the basename of
+ <file>.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
@@ -194,6 +197,12 @@ EXAMPLES
commit on the current branch. Note that the output format is
inferred by the extension of the output file.
+`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
+
+ Creates a tar archive that contains the contents of the latest
+ commit on the current branch with no prefix and the untracked
+ file 'configure' with the prefix 'build/'.
+
`git config tar.tar.xz.command "xz -c"`::
Configure a "tar.xz" format for making LZMA-compressed tarfiles.
--
2.35.3
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-27 17:02 ` René Scharfe
@ 2022-05-27 19:01 ` Junio C Hamano
2022-05-28 6:57 ` René Scharfe
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-27 19:01 UTC (permalink / raw)
To: René Scharfe
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
René Scharfe <l.s.r@web.de> writes:
> --prefix=<prefix>/::
> - Prepend <prefix>/ to each filename in the archive.
> + Prepend <prefix>/ to paths in the archive. Can be repeated; its
> + leftmost value is used for all tracked files. See below which
> + value gets used by `--add-file`.
Doesn't "the last one wins" take the rightmost one?
> @@ -58,8 +60,9 @@ OPTIONS
> --add-file=<file>::
> Add a non-tracked file to the archive. Can be repeated to add
> multiple files. The path of the file in the archive is built
> - by concatenating the value for `--prefix` (if any) and the
> - basename of <file>.
> + by concatenating the value of the leftmost `--prefix` option to
> + the right of this `--add-file` (if any) and the basename of
> + <file>.
It is not what archive.c::add_file_cb() seems to be doing, though.
It is passed the pointer to "base" that is on-stack of
parse_archive_args(), which is the same variable that is used to
remember the latest value that was given to "--prefix". Then it
concatenates the argument it received after that base value, so
by concatenating the value of the last "--prefix" seen on the
command line (if any) before this `--add-file` and the basename
of <file>.
probably. I always get my left and right mixed up X-<.
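To make the ordering rule concrete, here is a minimal shell sketch that walks an argument list the way the paragraph above describes. It is not git's parser; `resolve_paths` is a hypothetical helper, and `<path>` merely stands for any tracked file.

```shell
# Hypothetical simulation of how `git archive` pairs --prefix with
# --add-file. It only mimics the ordering rule; it does not call git.
resolve_paths () {
	prefix=
	for arg in "$@"
	do
		case "$arg" in
		--prefix=*)
			# every --prefix replaces the value currently in effect
			prefix=${arg#--prefix=}
			;;
		--add-file=*)
			# the untracked file captures the prefix seen so far
			file=${arg#--add-file=}
			printf 'untracked: %s%s\n' "$prefix" "${file##*/}"
			;;
		esac
	done
	# tracked files use whatever prefix is in effect at the end
	printf 'tracked: %s<path>\n' "$prefix"
}

resolve_paths --prefix=build/ --add-file=configure --prefix= HEAD
```

With the arguments from the proposed EXAMPLES entry, the sketch reports `build/configure` for the untracked file and an empty prefix for tracked paths, which matches "the last `--prefix` seen before this `--add-file`".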
> @@ -194,6 +197,12 @@ EXAMPLES
> commit on the current branch. Note that the output format is
> inferred by the extension of the output file.
>
> +`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
> +
> + Creates a tar archive that contains the contents of the latest
> + commit on the current branch with no prefix and the untracked
> + file 'configure' with the prefix 'build/'.
> +
> `git config tar.tar.xz.command "xz -c"`::
>
> Configure a "tar.xz" format for making LZMA-compressed tarfiles.
Thanks.
This patch probably needs to come before the "scalar diagnose"
series, which we haven't heard much about recently (no, I am not
complaining---we all heard that Dscho is busy).
* Re: [PATCH v6 1/7] archive: optionally add "virtual" files
2022-05-27 19:01 ` Junio C Hamano
@ 2022-05-28 6:57 ` René Scharfe
0 siblings, 0 replies; 140+ messages in thread
From: René Scharfe @ 2022-05-28 6:57 UTC (permalink / raw)
To: Junio C Hamano
Cc: Johannes Schindelin via GitGitGadget, git, Taylor Blau,
Derrick Stolee, Elijah Newren, rsbecker,
Ævar Arnfjörð Bjarmason, Johannes Schindelin
Am 27.05.22 um 21:01 schrieb Junio C Hamano:
> René Scharfe <l.s.r@web.de> writes:
>
>> --prefix=<prefix>/::
>> - Prepend <prefix>/ to each filename in the archive.
>> + Prepend <prefix>/ to paths in the archive. Can be repeated; its
>> + leftmost value is used for all tracked files. See below which
>> + value gets used by `--add-file`.
>
> Doesn't "the last one wins" take the rightmost one?
Ha ha! Classic mistake, I do that all the time, especially when in a
hurry. >_<
>
>> @@ -58,8 +60,9 @@ OPTIONS
>> --add-file=<file>::
>> Add a non-tracked file to the archive. Can be repeated to add
>> multiple files. The path of the file in the archive is built
>> - by concatenating the value for `--prefix` (if any) and the
>> - basename of <file>.
>> + by concatenating the value of the leftmost `--prefix` option to
>> + the right of this `--add-file` (if any) and the basename of
>> + <file>.
>
> It is not what archive.c::add_file_cb() seems to be doing, though
>
> It is passed the pointer to "base" that is on-stack of
> parse_archive_args(), which is the same variable that is used to
> remember the latest value that was given to "--prefix". Then it
> concatenates the argument it received after that base value, so
>
> by concatenating the value of the last "--prefix" seen on the
> command line (if any) before this `--add-file` and the basename
> of <file>.
>
> probably. I always get my left and right mixed up X-<.
You too? So yeah, avoiding the terms is appealing.
>
>> @@ -194,6 +197,12 @@ EXAMPLES
>> commit on the current branch. Note that the output format is
>> inferred by the extension of the output file.
>>
>> +`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
>> +
>> + Creates a tar archive that contains the contents of the latest
>> + commit on the current branch with no prefix and the untracked
>> + file 'configure' with the prefix 'build/'.
>> +
>> `git config tar.tar.xz.command "xz -c"`::
>>
>> Configure a "tar.xz" format for making LZMA-compressed tarfiles.
>
> Thanks.
>
> This patch probably needs to come before the "scalar diagnose"
> series, which we haven't heard much about recently (no, I am not
> complaining---we all heard that Dscho is busy).
>
>
--- >8 ---
Subject: [PATCH v3] archive: improve documentation of --prefix
Document the interaction between --add-file and --prefix by giving an
example.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Documentation/git-archive.txt | 16 ++++++++++++----
1 file changed, 12 insertions(+), 4 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index bc4e76a783..94519aae23 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -49,7 +49,9 @@ OPTIONS
Report progress to stderr.
--prefix=<prefix>/::
- Prepend <prefix>/ to each filename in the archive.
+ Prepend <prefix>/ to paths in the archive. Can be repeated; its
+ rightmost value is used for all tracked files. See below which
+ value gets used by `--add-file`.
-o <file>::
--output=<file>::
@@ -57,9 +59,9 @@ OPTIONS
--add-file=<file>::
Add a non-tracked file to the archive. Can be repeated to add
- multiple files. The path of the file in the archive is built
- by concatenating the value for `--prefix` (if any) and the
- basename of <file>.
+ multiple files. The path of the file in the archive is built by
+ concatenating the value of the last `--prefix` option (if any)
+ before this `--add-file` and the basename of <file>.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
@@ -194,6 +196,12 @@ EXAMPLES
commit on the current branch. Note that the output format is
inferred by the extension of the output file.
+`git archive -o latest.tar --prefix=build/ --add-file=configure --prefix= HEAD`::
+
+ Creates a tar archive that contains the contents of the latest
+ commit on the current branch with no prefix and the untracked
+ file 'configure' with the prefix 'build/'.
+
`git config tar.tar.xz.command "xz -c"`::
Configure a "tar.xz" format for making LZMA-compressed tarfiles.
--
2.35.3
* [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-25 20:22 ` Junio C Hamano
2022-05-21 15:08 ` [PATCH v6 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
` (5 subsequent siblings)
7 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
Documentation/git-archive.txt | 14 ++++++++++----
archive.c | 30 ++++++++++++++++++++----------
t/t5003-archive-zip.sh | 8 ++++++++
3 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 893cb1075bf..54de945a84e 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -67,10 +67,16 @@ OPTIONS
by concatenating the value for `--prefix` (if any) and the
basename of <file>.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
+character; The contained file name is interpreted as a C-style string,
+i.e. the backslash is interpreted as escape character. The path must
+be quoted if it contains a colon, to avoid the colon from being
+misinterpreted as the separator between the path and the contents, or
+if the path begins or ends with a double-quote character.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
+cases, write an untracked file and use `--add-file` instead.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
diff --git a/archive.c b/archive.c
index d20e16fa819..b7756b91200 100644
--- a/archive.c
+++ b/archive.c
@@ -9,6 +9,7 @@
#include "parse-options.h"
#include "unpack-trees.h"
#include "dir.h"
+#include "quote.h"
static char const * const archive_usage[] = {
N_("git archive [<options>] <tree-ish> [<path>...]"),
@@ -533,22 +534,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else if (!strcmp(opt->long_name, "add-virtual-file")) {
- const char *colon = strchr(arg, ':');
- char *p;
+ struct strbuf buf = STRBUF_INIT;
+ const char *p = arg;
+
+ if (*p != '"')
+ p = strchr(p, ':');
+ else if (unquote_c_style(&buf, p, &p) < 0)
+ die(_("unclosed quote: '%s'"), arg);
- if (!colon)
+ if (!p || *p != ':')
die(_("missing colon: '%s'"), arg);
- p = xstrndup(arg, colon - arg);
- if (!args->prefix)
- path = p;
- else {
- path = prefix_filename(args->prefix, p);
- free(p);
+ if (p == arg)
+ die(_("empty file name: '%s'"), arg);
+
+ path = buf.len ?
+ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg);
+
+ if (args->prefix) {
+ char *save = path;
+ path = prefix_filename(args->prefix, path);
+ free(save);
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
} else {
BUG("add_file_cb() called for %s", opt->long_name);
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index ebc26e89a9b..3a5a052e8ce 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -207,13 +207,21 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
+ PATHNAME=quoted:colon
+ else
+ PATHNAME=quoted
+ fi &&
git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=\"$PATHNAME\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
+ test_path_is_file $PATHNAME &&
test world = $(cat hello)
)
'
--
gitgitgadget
* Re: [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-21 15:08 ` [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-25 20:22 ` Junio C Hamano
2022-05-25 21:42 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-25 20:22 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
"Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com>
writes:
> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> + if test_have_prereq FUNNYNAMES
> + then
> + PATHNAME=quoted:colon
> + else
> + PATHNAME=quoted
> + fi &&
> git archive --format=zip >with_file_with_content.zip \
> + --add-virtual-file=\"$PATHNAME\": \
The name is better, but this still limits what can be in PATHNAME.
Write either one of these:
--add-virtual-file="\"$PATHNAME\":" \
--add-virtual-file=\""$PATHNAME"\": \
to signal the intention better to future readers. We are showing an
explicit dq-pair we want to pass to the c-unquote machinery, and we
are showing that we are not being unnecessarily loose by protecting
the string from getting word split.
Either is fine, but leaving it unquoted is not.
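The difference is easy to see outside the test suite. The following sketch assumes a POSIX shell with the default `$IFS`; `count_args` is a hypothetical helper that only reports how many arguments it received.

```shell
# count_args is a hypothetical helper: it prints its argument count.
count_args () {
	echo $#
}

PATHNAME="pathname with : colon"

# Unquoted expansion: the value is split on $IFS whitespace, so the
# option arrives as several separate arguments.
count_args --add-virtual-file=\"$PATHNAME\":

# Quoted expansion: the whole option stays a single argument.
count_args --add-virtual-file=\""$PATHNAME"\":
```

The first call prints 4 and the second prints 1, so only the quoted form hands `git archive` the one argument it expects.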
> + test_path_is_file $PATHNAME &&
Ditto. There is no reason to forbid future developers from futzing
the test to include space in the PATHNAME variable.
IOW, I want us to be better than saying
I know there is no $IFS whitespace now because I just wrote it.
Because I do not think there is any need to test with a string
with whitespace in it, I will leave the variable unquoted.
Anybody who changes the variable and breaks this assumption has
only themselves to blame for breaking the tests. It is not my
fault and it is not my problem.
which is the signal our readers would get from this patch (I would,
if I were reading this commit as a third-party), especially once
they become aware of the fact that this exact issue was already
pointed out during the review discussion.
Using double-quote appropriately sends a strong signal to reviewers
and future developers that we care about details.
A valid alternative is to write the assumption out where we
currently assign to PATHNAME.
# The PATHNAME variable is used without quote in the code
# below for such and such reasons, so you cannot use a $IFS
# whitespace in it.
if test_have_prereq FUNNYNAMES
then
...
If the "defensive" measure that is necessary to avoid a limitation
is too onerous, such an approach may be much preferable to
preparing for future changes. "for such and such reasons" is
a good place to justify why we avoid unnecessarily complex defensive
measure and restrict future changes in the documented way.
But in _this_ particular case, the "defensive" measure necessary is
merely just to quote the shell variables properly, which nobody
sensible would say too onerous. I couldn't come up with anything
remotely plausible to fill "for such and such reasons" myself when I
tried to justify leaving the variables unquoted.
Regardless of the quoting issue, we probably want to comment on what
value exactly is in PATHNAME before the assignment, by the way.
E.g.
# The PATHNAME variable holds a filename encoded like a
# string constant in C language (e.g. "\060" is digit "0")
if test_have_prereq FUNNYNAMES
then
PATHNAME=quoted:colon:\\060zero
else
PATHNAME=quoted\\060zero
fi
That would test not just one aspect of this change (i.e. that we can
pass a colon into the resulting filename) but also that the path goes
through the c-unquoting rules.
Thanks.
* Re: [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-25 20:22 ` Junio C Hamano
@ 2022-05-25 21:42 ` Junio C Hamano
2022-05-25 22:34 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-25 21:42 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
Junio C Hamano <gitster@pobox.com> writes:
> But in _this_ particular case, the "defensive" measure necessary is
> merely just to quote the shell variables properly, which nobody
> sensible would say too onerous. I couldn't come up with anything
> remotely plausible to fill "for such and such reasons" myself when I
> tried to justify leaving the variables unquoted.
>
> Regardless of the quoting issue, we probably want to comment on what
> value exactly is in PATHNAME before the assignment, by the way.
>
> E.g.
>
> # The PATHNAME variable holds a filename encoded like a
> # string constant in C language (e.g. "\060" is digit "0")
> if test_have_prereq FUNNYNAMES
> then
> PATHNAME=quoted:colon:\\060zero
> else
> PATHNAME=quoted\\060zero
> fi
>
> That would test not just one aspect of this change (i.e. that we can
> pass a colon into the resulting filename) but also that the path goes
> through the c-unquoting rules.
Actually, I _think_ that pushes us beyond the "reasonably defensive
for the current need". We'd need to prepare how the pathname is
expected to be unquoted for the later test
test_path_is_file "$PATHNAME"
to work. So here is what I queued as a fixup for this step on top
of the series.
t/t5003-archive-zip.sh | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git c/t/t5003-archive-zip.sh w/t/t5003-archive-zip.sh
index 3a5a052e8c..6addb6c684 100755
--- c/t/t5003-archive-zip.sh
+++ w/t/t5003-archive-zip.sh
@@ -209,19 +209,19 @@ check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
if test_have_prereq FUNNYNAMES
then
- PATHNAME=quoted:colon
+ PATHNAME="pathname with : colon"
else
- PATHNAME=quoted
+ PATHNAME="pathname without colon"
fi &&
git archive --format=zip >with_file_with_content.zip \
- --add-virtual-file=\"$PATHNAME\": \
+ --add-virtual-file=\""$PATHNAME"\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
- test_path_is_file $PATHNAME &&
+ test_path_is_file "$PATHNAME" &&
test world = $(cat hello)
)
'
* Re: [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-25 21:42 ` Junio C Hamano
@ 2022-05-25 22:34 ` Junio C Hamano
0 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-25 22:34 UTC (permalink / raw)
To: Johannes Schindelin via GitGitGadget
Cc: git, René Scharfe, Taylor Blau, Derrick Stolee,
Elijah Newren, rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin
Junio C Hamano <gitster@pobox.com> writes:
>> # The PATHNAME variable holds a filename encoded like a
>> # string constant in C language (e.g. "\060" is digit "0")
>> if test_have_prereq FUNNYNAMES
>> then
>> PATHNAME=quoted:colon:\\060zero
>> ...
> Actually, I _think_ that pushes us beyond the "reasonably defensive
> for the current need". We'd need to prepare how the pathname is
> expected to be unquoted for the later test
>
> test_path_is_file "$PATHNAME"
>
> to work.
IOW, I would need to add a new test-tool (attached) and then start
this test like so:
if ...
then
PATHNAME=quoted:colon:\\060zero
else
PATHNAME=quoted\\060zero
fi
UQPATHNAME=$(test-tool unquote-c-style \""$PATHNAME"\")
and change the last test to
test_path_is_file "$UQPATHNAME"
if we really wanted to test that the PATHNAME is treated as a
c-style quoted string.
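For a feel of what the unquoting step does to such a value, here is a rough sketch that needs no new helper binary. `unquote_sketch` is a hypothetical stand-in, not the proposed test-tool: it takes the text between the double quotes, leans on printf(1) to expand backslash escapes, and assumes the input contains no "%" characters.

```shell
# Hypothetical stand-in for `test-tool unquote-c-style`. It expands
# only the backslash escapes that printf(1) understands and is safe
# only for inputs without "%", because the input is used as the format.
unquote_sketch () {
	printf "$1\n"
}

PATHNAME='quoted\060zero'
UQPATHNAME=$(unquote_sketch "$PATHNAME")
echo "$UQPATHNAME"
```

This prints `quoted0zero`, the name a later `test_path_is_file "$UQPATHNAME"` would have to look for in the unpacked archive.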
I am on the fence. We do not have an immediate need, in the sense
that nobody needs to encode "0" as "\060" and trigger the unquote
codepath in real life. But it does feel prudent to make sure we can
grok C-quoted pathname as we claim in the documentation.
And the resulting change to the test does not look _too_ bad (and
the new test-tool certainly does not hurt, either).
So...
Makefile | 1 +
t/helper/test-quoted.c | 34 ++++++++++++++++++++++++++++++++++
t/helper/test-tool.c | 2 ++
t/helper/test-tool.h | 2 ++
4 files changed, 39 insertions(+)
diff --git c/Makefile w/Makefile
index 298becd5a5..1d544ad46a 100644
--- c/Makefile
+++ w/Makefile
@@ -749,6 +749,7 @@ TEST_BUILTINS_OBJS += test-pkt-line.o
TEST_BUILTINS_OBJS += test-prio-queue.o
TEST_BUILTINS_OBJS += test-proc-receive.o
TEST_BUILTINS_OBJS += test-progress.o
+TEST_BUILTINS_OBJS += test-quoted.o
TEST_BUILTINS_OBJS += test-reach.o
TEST_BUILTINS_OBJS += test-read-cache.o
TEST_BUILTINS_OBJS += test-read-graph.o
diff --git c/t/helper/test-quoted.c w/t/helper/test-quoted.c
new file mode 100644
index 0000000000..15baa55e43
--- /dev/null
+++ w/t/helper/test-quoted.c
@@ -0,0 +1,34 @@
+#include "test-tool.h"
+#include "cache.h"
+#include "quote.h"
+
+int cmd__unquote_c_style(int argc, const char **argv)
+{
+ struct strbuf buf = STRBUF_INIT;
+
+ while (*++argv) {
+ const char *p = *argv;
+
+ if (unquote_c_style(&buf, p, &p) < 0)
+ error("cannot unquote '%s'", *argv);
+ else
+ printf("%s\n", buf.buf);
+ strbuf_reset(&buf);
+ }
+ return 0;
+}
+
+int cmd__quote_c_style(int argc, const char **argv)
+{
+ struct strbuf buf = STRBUF_INIT;
+
+ while (*++argv) {
+ const char *p = *argv;
+
+ quote_c_style(p, &buf, NULL, 0);
+ printf("%s\n", buf.buf);
+ strbuf_reset(&buf);
+ }
+ return 0;
+}
+
diff --git c/t/helper/test-tool.c w/t/helper/test-tool.c
index d2eacd302d..5633c98569 100644
--- c/t/helper/test-tool.c
+++ w/t/helper/test-tool.c
@@ -58,6 +58,7 @@ static struct test_cmd cmds[] = {
{ "prio-queue", cmd__prio_queue },
{ "proc-receive", cmd__proc_receive },
{ "progress", cmd__progress },
+ { "quote-c-style", cmd__quote_c_style },
{ "reach", cmd__reach },
{ "read-cache", cmd__read_cache },
{ "read-graph", cmd__read_graph },
@@ -81,6 +82,7 @@ static struct test_cmd cmds[] = {
{ "submodule-nested-repo-config", cmd__submodule_nested_repo_config },
{ "subprocess", cmd__subprocess },
{ "trace2", cmd__trace2 },
+ { "unquote-c-style", cmd__unquote_c_style },
{ "userdiff", cmd__userdiff },
{ "urlmatch-normalization", cmd__urlmatch_normalization },
{ "xml-encode", cmd__xml_encode },
diff --git c/t/helper/test-tool.h w/t/helper/test-tool.h
index 960cc27ef7..f5e8929009 100644
--- c/t/helper/test-tool.h
+++ w/t/helper/test-tool.h
@@ -48,6 +48,7 @@ int cmd__pkt_line(int argc, const char **argv);
int cmd__prio_queue(int argc, const char **argv);
int cmd__proc_receive(int argc, const char **argv);
int cmd__progress(int argc, const char **argv);
+int cmd__quote_c_style(int argc, const char **argv);
int cmd__reach(int argc, const char **argv);
int cmd__read_cache(int argc, const char **argv);
int cmd__read_graph(int argc, const char **argv);
@@ -71,6 +72,7 @@ int cmd__submodule_config(int argc, const char **argv);
int cmd__submodule_nested_repo_config(int argc, const char **argv);
int cmd__subprocess(int argc, const char **argv);
int cmd__trace2(int argc, const char **argv);
+int cmd__unquote_c_style(int argc, const char **argv);
int cmd__userdiff(int argc, const char **argv);
int cmd__urlmatch_normalization(int argc, const char **argv);
int cmd__xml_encode(int argc, const char **argv);
* [PATCH v6 3/7] scalar: validate the optional enlistment argument
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 1/7] archive: optionally add "virtual" files Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 2/7] archive --add-virtual-file: allow paths containing colons Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
` (4 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 6 ++++--
contrib/scalar/t/t9099-scalar.sh | 5 +++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 1ce9c2b00e8..00dcd4b50ef 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
usage_with_options(usagestr, options);
/* find the worktree, determine its corresponding root */
- if (argc == 1)
+ if (argc == 1) {
strbuf_add_absolute_path(&path, argv[0]);
- else if (strbuf_getcwd(&path) < 0)
+ if (!is_directory(path.buf))
+ die(_("'%s' does not exist"), path.buf);
+ } else if (strbuf_getcwd(&path) < 0)
die(_("need a working directory"));
strbuf_trim_trailing_dir_sep(&path);
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 2e1502ad45e..9d83fdf25e8 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -85,4 +85,9 @@ test_expect_success 'scalar delete with enlistment' '
test_path_is_missing cloned
'
+test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
+ ! scalar run config cloned 2>err &&
+ grep "cloned. does not exist" err
+'
+
test_done
--
gitgitgadget
* [PATCH v6 4/7] Implement `scalar diagnose`
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
` (2 preceding siblings ...)
2022-05-21 15:08 ` [PATCH v6 3/7] scalar: validate the optional enlistment argument Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
` (3 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC (permalink / raw)
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty tree, augmented by a couple files that are added via
the `--add-file*` options. We are careful trying not to modify the
current repository in any way lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++
contrib/scalar/scalar.txt | 12 +++
contrib/scalar/t/t9099-scalar.sh | 14 +++
3 files changed, 170 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 00dcd4b50ef..53213f9a3b9 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -11,6 +11,7 @@
#include "dir.h"
#include "packfile.h"
#include "help.h"
+#include "archive.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -261,6 +262,47 @@ static int unregister_dir(void)
return res;
}
+static int add_directory_to_archiver(struct strvec *archiver_args,
+ const char *path, int recurse)
+{
+ int at_root = !*path;
+ DIR *dir = opendir(at_root ? "." : path);
+ struct dirent *e;
+ struct strbuf buf = STRBUF_INIT;
+ size_t len;
+ int res = 0;
+
+ if (!dir)
+ return error_errno(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
+ len = buf.len;
+ strvec_pushf(archiver_args, "--prefix=%s", buf.buf);
+
+ while (!res && (e = readdir(dir))) {
+ if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name))
+ continue;
+
+ strbuf_setlen(&buf, len);
+ strbuf_addstr(&buf, e->d_name);
+
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
+ warning(_("skipping '%s', which is neither file nor "
+ "directory"), buf.buf);
+ else if (recurse &&
+ add_directory_to_archiver(archiver_args,
+ buf.buf, recurse) < 0)
+ res = -1;
+ }
+
+ closedir(dir);
+ strbuf_release(&buf);
+ return res;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -501,6 +543,107 @@ cleanup:
return res;
}
+static int cmd_diagnose(int argc, const char **argv)
+{
+ struct option options[] = {
+ OPT_END(),
+ };
+ const char * const usage[] = {
+ N_("scalar diagnose [<enlistment>]"),
+ NULL
+ };
+ struct strbuf zip_path = STRBUF_INIT;
+ struct strvec archiver_args = STRVEC_INIT;
+ char **argv_copy = NULL;
+ int stdout_fd = -1, archiver_fd = -1;
+ time_t now = time(NULL);
+ struct tm tm;
+ struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
+ int res = 0;
+
+ argc = parse_options(argc, argv, NULL, options,
+ usage, 0);
+
+ setup_enlistment_directory(argc, argv, usage, options, &zip_path);
+
+ strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
+ strbuf_addftime(&zip_path,
+ "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
+ strbuf_addstr(&zip_path, ".zip");
+ switch (safe_create_leading_directories(zip_path.buf)) {
+ case SCLD_EXISTS:
+ case SCLD_OK:
+ break;
+ default:
+ res = error_errno(_("could not create directory for '%s'"),
+ zip_path.buf);
+ goto diagnose_cleanup;
+ }
+ stdout_fd = dup(1);
+ if (stdout_fd < 0) {
+ res = error_errno(_("could not duplicate stdout"));
+ goto diagnose_cleanup;
+ }
+
+ archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) {
+ res = error_errno(_("could not redirect output"));
+ goto diagnose_cleanup;
+ }
+
+ init_zip_archiver();
+ strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
+ get_version_info(&buf, 1);
+
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
+ "--add-virtual-file=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
+ goto diagnose_cleanup;
+
+ strvec_pushl(&archiver_args, "--prefix=",
+ oid_to_hex(the_hash_algo->empty_tree), "--", NULL);
+
+ /* `write_archive()` modifies the `argv` passed to it. Let it. */
+ argv_copy = xmemdupz(archiver_args.v,
+ sizeof(char *) * archiver_args.nr);
+ res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL,
+ the_repository, NULL, 0);
+ if (res) {
+ error(_("failed to write archive"));
+ goto diagnose_cleanup;
+ }
+
+ if (!res)
+ fprintf(stderr, "\n"
+ "Diagnostics complete.\n"
+ "All of the gathered info is captured in '%s'\n",
+ zip_path.buf);
+
+diagnose_cleanup:
+ if (archiver_fd >= 0) {
+ close(1);
+ dup2(stdout_fd, 1);
+ }
+ free(argv_copy);
+ strvec_clear(&archiver_args);
+ strbuf_release(&zip_path);
+ strbuf_release(&path);
+ strbuf_release(&buf);
+
+ return res;
+}
+
static int cmd_list(int argc, const char **argv)
{
if (argc != 1)
@@ -802,6 +945,7 @@ static struct {
{ "reconfigure", cmd_reconfigure },
{ "delete", cmd_delete },
{ "version", cmd_version },
+ { "diagnose", cmd_diagnose },
{ NULL, NULL},
};
diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt
index f416d637289..22583fe046e 100644
--- a/contrib/scalar/scalar.txt
+++ b/contrib/scalar/scalar.txt
@@ -14,6 +14,7 @@ scalar register [<enlistment>]
scalar unregister [<enlistment>]
scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [<enlistment>]
scalar reconfigure [ --all | <enlistment> ]
+scalar diagnose [<enlistment>]
scalar delete <enlistment>
DESCRIPTION
@@ -129,6 +130,17 @@ reconfigure the enlistment.
With the `--all` option, all enlistments currently registered with Scalar
will be reconfigured. Use this option after each Scalar upgrade.
+Diagnose
+~~~~~~~~
+
+diagnose [<enlistment>]::
+ When reporting issues with Scalar, it is often helpful to provide the
+ information gathered by this command, including logs and certain
+ statistics describing the data shape of the current enlistment.
++
+The output of this command is a `.zip` file that is written into
+a directory adjacent to the worktree in the `src` directory.
+
Delete
~~~~~~
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 9d83fdf25e8..6802d317258 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -90,4 +90,18 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
grep "cloned. does not exist" err
'
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
+ scalar diagnose cloned >out 2>err &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
+ folder=${zip_path%.zip} &&
+ test_path_is_missing "$folder" &&
+ unzip -p "$zip_path" diagnostics.log >out &&
+ test_file_not_empty out
+'
+
test_done
--
gitgitgadget
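
A note on the stdout handling in `cmd_diagnose()` above: the patch saves file descriptor 1 with `dup()`, points it at the `.zip` file with `dup2()` so that `write_archive()` writes into that file, and restores the saved descriptor during cleanup. Below is a minimal standalone sketch of that pattern, assuming plain POSIX calls and none of Git's wrappers. The function name `write_via_redirected_stdout()` is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: temporarily redirect stdout to `path`, write
 * `msg` through file descriptor 1, then restore the original stdout.
 * Returns 0 on success, -1 on error.
 */
int write_via_redirected_stdout(const char *path, const char *msg)
{
	size_t len;
	int saved_fd, file_fd, res = 0;

	assert(path && msg);
	len = strlen(msg);

	/* Keep a copy of the current stdout so it can be restored later. */
	saved_fd = dup(STDOUT_FILENO);
	if (saved_fd < 0)
		return -1;

	file_fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
	if (file_fd < 0) {
		close(saved_fd);
		return -1;
	}

	/* Flush buffered output before fd 1 starts pointing elsewhere. */
	fflush(stdout);
	if (dup2(file_fd, STDOUT_FILENO) < 0)
		res = -1;
	else if (write(STDOUT_FILENO, msg, len) != (ssize_t)len)
		res = -1;

	/* Restore the original stdout whether or not the write worked. */
	if (dup2(saved_fd, STDOUT_FILENO) < 0)
		res = -1;
	close(file_fd);
	close(saved_fd);
	return res;
}
```

Restoring from the saved descriptor on every exit path is what allows the caller to keep printing to the real stdout afterwards.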
* [PATCH v6 5/7] scalar diagnose: include disk space information
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
` (3 preceding siblings ...)
2022-05-21 15:08 ` [PATCH v6 4/7] Implement `scalar diagnose` Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Johannes Schindelin via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
` (2 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Johannes Schindelin via GitGitGadget @ 2022-05-21 15:08 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 53 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 1 +
2 files changed, 54 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 53213f9a3b9..0a9e25a57f8 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -303,6 +303,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
return res;
}
+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+ struct strbuf buf = STRBUF_INIT;
+ char volume_name[MAX_PATH], fs_name[MAX_PATH];
+ DWORD serial_number, component_length, flags;
+ ULARGE_INTEGER avail2caller, total, avail;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+ error(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_setlen(&buf, offset_1st_component(buf.buf));
+ if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+ &serial_number, &component_length, &flags,
+ fs_name, sizeof(fs_name))) {
+ error(_("could not get info for '%s'"), buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, avail2caller.QuadPart);
+ strbuf_addch(out, '\n');
+ strbuf_release(&buf);
+#else
+ struct strbuf buf = STRBUF_INIT;
+ struct statvfs stat;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (statvfs(buf.buf, &stat) < 0) {
+ error_errno(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+ strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+ strbuf_release(&buf);
+#endif
+ return 0;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -599,6 +651,7 @@ static int cmd_diagnose(int argc, const char **argv)
get_version_info(&buf, 1);
strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
"--add-virtual-file=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 6802d317258..934b2485d91 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -94,6 +94,7 @@ SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
scalar diagnose cloned >out 2>err &&
+ grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
--
gitgitgadget
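
For readers who want to try the non-Windows branch of `get_disk_info()` outside of Git, here is a minimal sketch built directly on `statvfs()`. It is an illustration under stated assumptions, not the patch's code: the name `available_bytes()` is hypothetical, it multiplies by `f_frsize` (falling back to `f_bsize` when that field is zero) where the patch uses `f_bsize`, and it leaves out the overflow check that `st_mult()` provides.

```c
#include <assert.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper: store in `*out` the number of bytes available
 * to unprivileged users on the file system that contains `path`.
 * Returns 0 on success, -1 if statvfs() fails.
 */
int available_bytes(const char *path, unsigned long long *out)
{
	struct statvfs st;
	unsigned long long unit;

	assert(path && out);
	if (statvfs(path, &st) < 0)
		return -1;

	/* Assumption: block counts are in f_frsize units when it is set. */
	unit = st.f_frsize ? st.f_frsize : st.f_bsize;
	*out = unit * (unsigned long long)st.f_bavail;
	return 0;
}
```

On many file systems `f_frsize` and `f_bsize` are equal, in which case both computations give the same number.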
* [PATCH v6 6/7] scalar: teach `diagnose` to gather packfile info
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
` (4 preceding siblings ...)
2022-05-21 15:08 ` [PATCH v6 5/7] scalar diagnose: include disk space information Johannes Schindelin via GitGitGadget
@ 2022-05-21 15:08 ` Matthew John Cheetham via GitGitGadget
2022-05-21 15:08 ` [PATCH v6 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
7 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-21 15:08 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 30 ++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 6 +++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 0a9e25a57f8..d302c27e114 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "help.h"
#include "archive.h"
+#include "object-store.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -595,6 +596,29 @@ cleanup:
return res;
}
+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+ const char *file_name, void *data)
+{
+ struct strbuf *buf = data;
+ struct stat st;
+
+ if (!stat(full_path, &st))
+ strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+ (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+ struct strbuf *buf = data;
+
+ strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+ for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+ data);
+
+ return 0;
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -657,6 +681,12 @@ static int cmd_diagnose(int argc, const char **argv)
"--add-virtual-file=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 934b2485d91..3dd5650cceb 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,6 +93,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -102,7 +104,9 @@ test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
unzip -p "$zip_path" diagnostics.log >out &&
- test_file_not_empty out
+ test_file_not_empty out &&
+ unzip -p "$zip_path" packs-local.txt >out &&
+ grep "$(pwd)/.git/objects" out
'
test_done
--
gitgitgadget
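
The per-file line that `dir_file_stats_objects()` appends to `packs-local.txt` is a left-aligned 70-column name followed by a right-aligned 16-column size. A minimal standalone sketch of that formatting follows, assuming a plain `char` buffer in place of Git's `strbuf`. The function name `format_size_line()` is hypothetical.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: stat `full_path` and format one report line
 * for it into `out`, labelled with `file_name`. Returns the number of
 * characters snprintf() produced (or would have produced), or -1 if
 * the file cannot be stat()ed.
 */
int format_size_line(char *out, size_t outlen,
		     const char *full_path, const char *file_name)
{
	struct stat st;

	assert(out && full_path && file_name);
	if (stat(full_path, &st) < 0)
		return -1;

	/* Same format string as in the patch: name, then size in bytes. */
	return snprintf(out, outlen, "%-70s %16" PRIuMAX "\n",
			file_name, (uintmax_t)st.st_size);
}
```

Casting `st_size` to `uintmax_t` and printing it with `PRIuMAX` avoids depending on the platform's width of `off_t`.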
* [PATCH v6 7/7] scalar: teach `diagnose` to gather loose objects information
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
` (5 preceding siblings ...)
2022-05-21 15:08 ` [PATCH v6 6/7] scalar: teach `diagnose` to gather packfile info Matthew John Cheetham via GitGitGadget
@ 2022-05-21 15:08 ` Matthew John Cheetham via GitGitGadget
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
7 siblings, 0 replies; 140+ messages in thread
From: Matthew John Cheetham via GitGitGadget @ 2022-05-21 15:08 UTC
To: git
Cc: René Scharfe, Taylor Blau, Derrick Stolee, Elijah Newren,
rsbecker, Ævar Arnfjörð Bjarmason,
Johannes Schindelin, Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
---
contrib/scalar/scalar.c | 59 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 5 ++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index d302c27e114..0c278681758 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -619,6 +619,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
}
+static int count_files(char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count = 0;
+
+ if (!dir)
+ return 0;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+ count++;
+
+ closedir(dir);
+ return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count;
+ int total = 0;
+ unsigned char c;
+ struct strbuf count_path = STRBUF_INIT;
+ size_t base_path_len;
+
+ if (!dir)
+ return;
+
+ strbuf_addstr(buf, "Object directory stats for ");
+ strbuf_add_absolute_path(buf, path);
+ strbuf_addstr(buf, ":\n");
+
+ strbuf_add_absolute_path(&count_path, path);
+ strbuf_addch(&count_path, '/');
+ base_path_len = count_path.len;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) &&
+ e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+ !hex_to_bytes(&c, e->d_name, 1)) {
+ strbuf_setlen(&count_path, base_path_len);
+ strbuf_addstr(&count_path, e->d_name);
+ total += (count = count_files(count_path.buf));
+ strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+ }
+
+ strbuf_addf(buf, "Total: %d loose objects", total);
+
+ strbuf_release(&count_path);
+ closedir(dir);
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -687,6 +741,11 @@ static int cmd_diagnose(int argc, const char **argv)
foreach_alt_odb(dir_file_stats, &buf);
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 3dd5650cceb..72023a1ca1d 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -95,6 +95,7 @@ test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -106,7 +107,9 @@ test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
unzip -p "$zip_path" packs-local.txt >out &&
- grep "$(pwd)/.git/objects" out
+ grep "$(pwd)/.git/objects" out &&
+ unzip -p "$zip_path" objects-local.txt >out &&
+ grep "^Total: [1-9]" out
'
test_done
--
gitgitgadget
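
The loose-object statistics above rely on the object directory's fan-out layout: each loose object lives in a subdirectory named after the first two hex digits of its object ID. Here is a minimal standalone sketch of the same walk, assuming plain POSIX. It deliberately differs from the patch in two ways: it does not look at `d_type` (a field POSIX does not require), and it therefore counts every entry in a fan-out directory rather than only regular files. The names `is_fanout_name()` and `count_loose_entries()` are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `name` consists of exactly two hex digits, else 0. */
int is_fanout_name(const char *name)
{
	assert(name);
	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}

/*
 * Count the entries (other than "." and "..") in every two-hex-digit
 * subdirectory of `objdir`. Returns -1 if `objdir` cannot be opened.
 */
long count_loose_entries(const char *objdir)
{
	DIR *dir, *sub;
	struct dirent *e, *f;
	char path[4096];
	long total = 0;

	assert(objdir);
	dir = opendir(objdir);
	if (!dir)
		return -1;

	while ((e = readdir(dir))) {
		if (!is_fanout_name(e->d_name))
			continue;
		snprintf(path, sizeof(path), "%s/%s", objdir, e->d_name);
		sub = opendir(path);
		if (!sub)
			continue; /* not a directory, or not readable */
		while ((f = readdir(sub)))
			if (strcmp(f->d_name, ".") && strcmp(f->d_name, ".."))
				total++;
		closedir(sub);
	}
	closedir(dir);
	return total;
}
```

Calling `count_loose_entries(".git/objects")` from the top of a worktree should give a number comparable to the `Total:` line in `objects-local.txt`, as long as the fan-out directories contain nothing but loose objects.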
* [PATCH v6+ 0/7] js/scalar-diagnose rebased
2022-05-21 15:08 ` [PATCH v6 " Johannes Schindelin via GitGitGadget
` (6 preceding siblings ...)
2022-05-21 15:08 ` [PATCH v6 7/7] scalar: teach `diagnose` to gather loose objects information Matthew John Cheetham via GitGitGadget
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 1/7] archive: optionally add "virtual" files Junio C Hamano
` (7 more replies)
7 siblings, 8 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC
To: git
René's recent documentation clarification for the "--prefix" option
of the "git archive" command serves as a good basis for documenting
the "--add-virtual-file" option added by this series, so here is my
attempt to rebase the js/scalar-diagnose topic onto it, to hopefully
help reduce Dscho's workload ;-)
Aside from obvious adjustments needed while rebasing onto the
updated documentation, there are only a couple of changes:
- The way the <path> in --add-virtual-file=<path>:<contents> is
used has been corrected. Earlier, leading directory components
of the <path> were all discarded and used nowhere, which made no
sense. The <path> is used as a whole, but for consistency with
--add-file=<path>, <prefix> is still applied.
- Overly loose quoting of variables in test scripts has been
corrected.
Both changes have been in 'seen' from before the rebase.
1: 510f6b226b ! 1: 61522a0866 archive: optionally add "virtual" files
@@ Commit message
## Documentation/git-archive.txt ##
@@ Documentation/git-archive.txt: OPTIONS
- by concatenating the value for `--prefix` (if any) and the
- basename of <file>.
+ --prefix=<prefix>/::
+ Prepend <prefix>/ to paths in the archive. Can be repeated; its
+ rightmost value is used for all tracked files. See below which
+- value gets used by `--add-file`.
++ value gets used by `--add-file` and `--add-virtual-file`.
+
+ -o <file>::
+ --output=<file>::
+@@ Documentation/git-archive.txt: OPTIONS
+ concatenating the value of the last `--prefix` option (if any)
+ before this `--add-file` and the basename of <file>.
+--add-virtual-file=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
-+ by concatenating the value for `--prefix` (if any) and the
-+ basename of <file>.
++ by concatenating the value of the last `--prefix` option (if any)
++ before this `--add-virtual-file` and `<path>`.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
@@ archive.c: static int queue_or_write_archive_entry(const struct object_id *oid,
int write_archive_entries(struct archiver_args *args,
@@ archive.c: int write_archive_entries(struct archiver_args *args,
- strbuf_addstr(&path_in_archive, basename(path));
- strbuf_reset(&content);
+ put_be64(fake_oid.hash, i + 1);
+
+- strbuf_reset(&path_in_archive);
+- if (info->base)
+- strbuf_addstr(&path_in_archive, info->base);
+- strbuf_addstr(&path_in_archive, basename(path));
+-
+- strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
-+ if (info->content)
-+ err = write_entry(args, &fake_oid, path_in_archive.buf,
-+ path_in_archive.len,
-+ canon_mode(info->stat.st_mode),
+- err = error_errno(_("cannot read '%s'"), path);
+- else
+- err = write_entry(args, &fake_oid, path_in_archive.buf,
+- path_in_archive.len,
++ if (!info->content) {
++ strbuf_reset(&path_in_archive);
++ if (info->base)
++ strbuf_addstr(&path_in_archive, info->base);
++ strbuf_addstr(&path_in_archive, basename(path));
++
++ strbuf_reset(&content);
++ if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
++ err = error_errno(_("could not read '%s'"), path);
++ else
++ err = write_entry(args, &fake_oid, path_in_archive.buf,
++ path_in_archive.len,
++ canon_mode(info->stat.st_mode),
++ content.buf, content.len);
++ } else {
++ err = write_entry(args, &fake_oid,
++ path, strlen(path),
+ canon_mode(info->stat.st_mode),
+- content.buf, content.len);
+ info->content, info->stat.st_size);
-+ else if (strbuf_read_file(&content, path,
-+ info->stat.st_size) < 0)
- err = error_errno(_("cannot read '%s'"), path);
- else
- err = write_entry(args, &fake_oid, path_in_archive.buf,
++ }
++
+ if (err)
+ break;
+ }
@@ archive.c: static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
2: 208f4aad5f ! 2: 5e9d19a70f archive --add-virtual-file: allow paths containing colons
@@ Commit message
## Documentation/git-archive.txt ##
@@ Documentation/git-archive.txt: OPTIONS
- by concatenating the value for `--prefix` (if any) and the
- basename of <file>.
+ by concatenating the value of the last `--prefix` option (if any)
+ before this `--add-virtual-file` and `<path>`.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
-+character; The contained file name is interpreted as a C-style string,
++character; the contained file name is interpreted as a C-style string,
+i.e. the backslash is interpreted as escape character. The path must
+be quoted if it contains a colon, to avoid the colon from being
+misinterpreted as the separator between the path and the contents, or
@@ t/t5003-archive-zip.sh: check_zip with_untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
-+ PATHNAME=quoted:colon
++ PATHNAME="pathname with : colon"
+ else
-+ PATHNAME=quoted
++ PATHNAME="pathname without colon"
+ fi &&
git archive --format=zip >with_file_with_content.zip \
-+ --add-virtual-file=\"$PATHNAME\": \
++ --add-virtual-file=\""$PATHNAME"\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
-+ test_path_is_file $PATHNAME &&
++ test_path_is_file "$PATHNAME" &&
test world = $(cat hello)
)
'
3: bc1164404f = 3: 4f5b3aa775 scalar: validate the optional enlistment argument
4: 69daeb7d9d ! 4: f4f070df8e Implement `scalar diagnose`
@@ Metadata
Author: Johannes Schindelin <Johannes.Schindelin@gmx.de>
## Commit message ##
- Implement `scalar diagnose`
+ scalar: implement `scalar diagnose`
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
5: 5c1ef19524 = 5: 0417d8abe4 scalar diagnose: include disk space information
6: 0325b9c3ab = 6: 5531b65ddb scalar: teach `diagnose` to gather packfile info
7: 8fee365b07 = 7: ce9eba5e32 scalar: teach `diagnose` to gather loose objects information
* [PATCH v6+ 1/7] archive: optionally add "virtual" files
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons Junio C Hamano
` (6 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC
To: git; +Cc: Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
With the `--add-virtual-file=<path>:<content>` option, `git archive` now
supports use cases where relatively trivial files need to be added that
do not exist on disk.
This will allow us to generate `.zip` files with generated content,
without having to add said content to the object database and without
having to write it out to disk.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[jc: tweaked <path> handling]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Because the leading components of the <path> are now kept rather
than discarded, the "extra entries" handling is split into two
separate code paths, each of which independently comes up with
the path and the contents stored in the archive.
The explanation of how --prefix and --add-file interact also
applies to the new option.
Documentation/git-archive.txt | 13 +++++-
archive.c | 77 ++++++++++++++++++++++++++---------
t/t5003-archive-zip.sh | 12 ++++++
3 files changed, 82 insertions(+), 20 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index 94519aae23..b41cc5bc2e 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -51,7 +51,7 @@ OPTIONS
--prefix=<prefix>/::
Prepend <prefix>/ to paths in the archive. Can be repeated; its
rightmost value is used for all tracked files. See below which
- value gets used by `--add-file`.
+ value gets used by `--add-file` and `--add-virtual-file`.
-o <file>::
--output=<file>::
@@ -63,6 +63,17 @@ OPTIONS
concatenating the value of the last `--prefix` option (if any)
before this `--add-file` and the basename of <file>.
+--add-virtual-file=<path>:<content>::
+ Add the specified contents to the archive. Can be repeated to add
+ multiple files. The path of the file in the archive is built
+ by concatenating the value of the last `--prefix` option (if any)
+ before this `--add-virtual-file` and `<path>`.
++
+The `<path>` cannot contain any colon, the file mode is limited to
+a regular file, and the option may be subject to platform-dependent
+command-line limits. For non-trivial cases, write an untracked file
+and use `--add-file` instead.
+
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
as well (see <<ATTRIBUTES>>).
diff --git a/archive.c b/archive.c
index e2121ebefb..d26f4ef945 100644
--- a/archive.c
+++ b/archive.c
@@ -263,6 +263,7 @@ static int queue_or_write_archive_entry(const struct object_id *oid,
struct extra_file_info {
char *base;
struct stat stat;
+ void *content;
};
int write_archive_entries(struct archiver_args *args,
@@ -331,19 +332,27 @@ int write_archive_entries(struct archiver_args *args,
put_be64(fake_oid.hash, i + 1);
- strbuf_reset(&path_in_archive);
- if (info->base)
- strbuf_addstr(&path_in_archive, info->base);
- strbuf_addstr(&path_in_archive, basename(path));
-
- strbuf_reset(&content);
- if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
- err = error_errno(_("cannot read '%s'"), path);
- else
- err = write_entry(args, &fake_oid, path_in_archive.buf,
- path_in_archive.len,
+ if (!info->content) {
+ strbuf_reset(&path_in_archive);
+ if (info->base)
+ strbuf_addstr(&path_in_archive, info->base);
+ strbuf_addstr(&path_in_archive, basename(path));
+
+ strbuf_reset(&content);
+ if (strbuf_read_file(&content, path, info->stat.st_size) < 0)
+ err = error_errno(_("could not read '%s'"), path);
+ else
+ err = write_entry(args, &fake_oid, path_in_archive.buf,
+ path_in_archive.len,
+ canon_mode(info->stat.st_mode),
+ content.buf, content.len);
+ } else {
+ err = write_entry(args, &fake_oid,
+ path, strlen(path),
canon_mode(info->stat.st_mode),
- content.buf, content.len);
+ info->content, info->stat.st_size);
+ }
+
if (err)
break;
}
@@ -493,6 +502,7 @@ static void extra_file_info_clear(void *util, const char *str)
{
struct extra_file_info *info = util;
free(info->base);
+ free(info->content);
free(info);
}
@@ -514,14 +524,40 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
if (!arg)
return -1;
- path = prefix_filename(args->prefix, arg);
- item = string_list_append_nodup(&args->extra_files, path);
- item->util = info = xmalloc(sizeof(*info));
+ info = xmalloc(sizeof(*info));
info->base = xstrdup_or_null(base);
- if (stat(path, &info->stat))
- die(_("File not found: %s"), path);
- if (!S_ISREG(info->stat.st_mode))
- die(_("Not a regular file: %s"), path);
+
+ if (!strcmp(opt->long_name, "add-file")) {
+ path = prefix_filename(args->prefix, arg);
+ if (stat(path, &info->stat))
+ die(_("File not found: %s"), path);
+ if (!S_ISREG(info->stat.st_mode))
+ die(_("Not a regular file: %s"), path);
+ info->content = NULL; /* read the file later */
+ } else if (!strcmp(opt->long_name, "add-virtual-file")) {
+ const char *colon = strchr(arg, ':');
+ char *p;
+
+ if (!colon)
+ die(_("missing colon: '%s'"), arg);
+
+ p = xstrndup(arg, colon - arg);
+ if (!args->prefix)
+ path = p;
+ else {
+ path = prefix_filename(args->prefix, p);
+ free(p);
+ }
+ memset(&info->stat, 0, sizeof(info->stat));
+ info->stat.st_mode = S_IFREG | 0644;
+ info->content = xstrdup(colon + 1);
+ info->stat.st_size = strlen(info->content);
+ } else {
+ BUG("add_file_cb() called for %s", opt->long_name);
+ }
+ item = string_list_append_nodup(&args->extra_files, path);
+ item->util = info;
+
return 0;
}
@@ -554,6 +590,9 @@ static int parse_archive_args(int argc, const char **argv,
{ OPTION_CALLBACK, 0, "add-file", args, N_("file"),
N_("add untracked file to archive"), 0, add_file_cb,
(intptr_t)&base },
+ { OPTION_CALLBACK, 0, "add-virtual-file", args,
+ N_("path:content"), N_("add untracked file to archive"), 0,
+ add_file_cb, (intptr_t)&base },
OPT_STRING('o', "output", &output, N_("file"),
N_("write the archive to this file")),
OPT_BOOL(0, "worktree-attributes", &worktree_attributes,
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index d726964307..d6027189e2 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -206,6 +206,18 @@ test_expect_success 'git archive --format=zip --add-file' '
check_zip with_untracked
check_added with_untracked untracked untracked
+test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=hello:world $EMPTY_TREE &&
+ test_when_finished "rm -rf tmp-unpack" &&
+ mkdir tmp-unpack && (
+ cd tmp-unpack &&
+ "$GIT_UNZIP" ../with_file_with_content.zip &&
+ test_path_is_file hello &&
+ test world = $(cat hello)
+ )
+'
+
test_expect_success 'git archive --format=zip --add-file twice' '
echo untracked >untracked &&
git archive --format=zip --prefix=one/ --add-file=untracked \
--
2.36.1-385-g60203f3fdb
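
In this version of the patch, the argument to `--add-virtual-file` is split at the first colon: everything before it becomes the path, everything after it the content. A minimal standalone sketch of that split follows, assuming plain `malloc()` instead of Git's `xstrndup()`/`xstrdup()` and leaving out the `prefix_filename()` step. The function name `split_path_and_content()` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: split "<path>:<content>" at the first colon.
 * On success, returns 0 and stores newly allocated strings in `*path`
 * and `*content`, which the caller must free(). Returns -1 if there
 * is no colon or if allocation fails.
 */
int split_path_and_content(const char *arg, char **path, char **content)
{
	const char *colon;
	size_t path_len;

	assert(arg && path && content);
	colon = strchr(arg, ':');
	if (!colon)
		return -1; /* the patch die()s with "missing colon" here */

	path_len = (size_t)(colon - arg);
	*path = malloc(path_len + 1);
	*content = malloc(strlen(colon + 1) + 1);
	if (!*path || !*content) {
		free(*path);
		free(*content);
		return -1;
	}

	memcpy(*path, arg, path_len);
	(*path)[path_len] = '\0';
	strcpy(*content, colon + 1);
	return 0;
}
```

Splitting at the first colon means that later colons end up in the content, which is why a path that itself contains a colon needs the quoting introduced in the next patch.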
* [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 1/7] archive: optionally add "virtual" files Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-06-15 18:16 ` Adam Dinwoodie
2022-05-28 23:11 ` [PATCH v6+ 3/7] scalar: validate the optional enlistment argument Junio C Hamano
` (5 subsequent siblings)
7 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC
To: git; +Cc: Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
By allowing the path to be enclosed in double-quotes, we can avoid
the limitation that paths cannot contain colons.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
* Tightened shell variable quoting
Documentation/git-archive.txt | 14 ++++++++++----
archive.c | 30 ++++++++++++++++++++----------
t/t5003-archive-zip.sh | 8 ++++++++
3 files changed, 38 insertions(+), 14 deletions(-)
diff --git a/Documentation/git-archive.txt b/Documentation/git-archive.txt
index b41cc5bc2e..56989a2f34 100644
--- a/Documentation/git-archive.txt
+++ b/Documentation/git-archive.txt
@@ -69,10 +69,16 @@ OPTIONS
by concatenating the value of the last `--prefix` option (if any)
before this `--add-virtual-file` and `<path>`.
+
-The `<path>` cannot contain any colon, the file mode is limited to
-a regular file, and the option may be subject to platform-dependent
-command-line limits. For non-trivial cases, write an untracked file
-and use `--add-file` instead.
+The `<path>` argument can start and end with a literal double-quote
+character; the contained file name is interpreted as a C-style string,
+i.e. the backslash is interpreted as an escape character. The path must
+be quoted if it contains a colon, to prevent the colon from being
+misinterpreted as the separator between the path and the contents, or
+if the path begins or ends with a double-quote character.
++
+The file mode is limited to a regular file, and the option may be
+subject to platform-dependent command-line limits. For non-trivial
+cases, write an untracked file and use `--add-file` instead.
--worktree-attributes::
Look for attributes in .gitattributes files in the working tree
diff --git a/archive.c b/archive.c
index d26f4ef945..48aba4ac46 100644
--- a/archive.c
+++ b/archive.c
@@ -9,6 +9,7 @@
#include "parse-options.h"
#include "unpack-trees.h"
#include "dir.h"
+#include "quote.h"
static char const * const archive_usage[] = {
N_("git archive [<options>] <tree-ish> [<path>...]"),
@@ -535,22 +536,31 @@ static int add_file_cb(const struct option *opt, const char *arg, int unset)
die(_("Not a regular file: %s"), path);
info->content = NULL; /* read the file later */
} else if (!strcmp(opt->long_name, "add-virtual-file")) {
- const char *colon = strchr(arg, ':');
- char *p;
+ struct strbuf buf = STRBUF_INIT;
+ const char *p = arg;
+
+ if (*p != '"')
+ p = strchr(p, ':');
+ else if (unquote_c_style(&buf, p, &p) < 0)
+ die(_("unclosed quote: '%s'"), arg);
- if (!colon)
+ if (!p || *p != ':')
die(_("missing colon: '%s'"), arg);
- p = xstrndup(arg, colon - arg);
- if (!args->prefix)
- path = p;
- else {
- path = prefix_filename(args->prefix, p);
- free(p);
+ if (p == arg)
+ die(_("empty file name: '%s'"), arg);
+
+ path = buf.len ?
+ strbuf_detach(&buf, NULL) : xstrndup(arg, p - arg);
+
+ if (args->prefix) {
+ char *save = path;
+ path = prefix_filename(args->prefix, path);
+ free(save);
}
memset(&info->stat, 0, sizeof(info->stat));
info->stat.st_mode = S_IFREG | 0644;
- info->content = xstrdup(colon + 1);
+ info->content = xstrdup(p + 1);
info->stat.st_size = strlen(info->content);
} else {
BUG("add_file_cb() called for %s", opt->long_name);
diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
index d6027189e2..3992d08158 100755
--- a/t/t5003-archive-zip.sh
+++ b/t/t5003-archive-zip.sh
@@ -207,13 +207,21 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
+ if test_have_prereq FUNNYNAMES
+ then
+ PATHNAME="pathname with : colon"
+ else
+ PATHNAME="pathname without colon"
+ fi &&
git archive --format=zip >with_file_with_content.zip \
+ --add-virtual-file=\""$PATHNAME"\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
test_when_finished "rm -rf tmp-unpack" &&
mkdir tmp-unpack && (
cd tmp-unpack &&
"$GIT_UNZIP" ../with_file_with_content.zip &&
test_path_is_file hello &&
+ test_path_is_file "$PATHNAME" &&
test world = $(cat hello)
)
'
--
2.36.1-385-g60203f3fdb
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-05-28 23:11 ` [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons Junio C Hamano
@ 2022-06-15 18:16 ` Adam Dinwoodie
2022-06-15 20:00 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Adam Dinwoodie @ 2022-06-15 18:16 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Schindelin
On Sat, May 28, 2022 at 04:11:13PM -0700, Junio C Hamano wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>
> By allowing the path to be enclosed in double-quotes, we can avoid
> the limitation that paths cannot contain colons.
>
> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
> Signed-off-by: Junio C Hamano <gitster@pobox.com>
> ---
> * Tightened shell variable quoting
>
> <snip>
>
> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> index d6027189e2..3992d08158 100755
> --- a/t/t5003-archive-zip.sh
> +++ b/t/t5003-archive-zip.sh
> @@ -207,13 +207,21 @@ check_zip with_untracked
> check_added with_untracked untracked untracked
>
> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> + if test_have_prereq FUNNYNAMES
> + then
> + PATHNAME="pathname with : colon"
> + else
> + PATHNAME="pathname without colon"
> + fi &&
> git archive --format=zip >with_file_with_content.zip \
> + --add-virtual-file=\""$PATHNAME"\": \
> --add-virtual-file=hello:world $EMPTY_TREE &&
> test_when_finished "rm -rf tmp-unpack" &&
> mkdir tmp-unpack && (
> cd tmp-unpack &&
> "$GIT_UNZIP" ../with_file_with_content.zip &&
> test_path_is_file hello &&
> + test_path_is_file "$PATHNAME" &&
> test world = $(cat hello)
> )
> '
This test is currently failing on Cygwin: it looks like it's exposing a
bug in Cygwin that means files with colons in their name aren't
correctly extracted from zip archives. I'm going to report that to the
Cygwin mailing list, but I wanted to note it for the record here, too.
Adam
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-06-15 18:16 ` Adam Dinwoodie
@ 2022-06-15 20:00 ` Junio C Hamano
2022-06-15 21:36 ` Adam Dinwoodie
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-06-15 20:00 UTC (permalink / raw)
To: Adam Dinwoodie; +Cc: git, Johannes Schindelin
Adam Dinwoodie <adam@dinwoodie.org> writes:
>> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
>> index d6027189e2..3992d08158 100755
>> --- a/t/t5003-archive-zip.sh
>> +++ b/t/t5003-archive-zip.sh
>> @@ -207,13 +207,21 @@ check_zip with_untracked
>> check_added with_untracked untracked untracked
>>
>> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
>> + if test_have_prereq FUNNYNAMES
>> + then
>> + PATHNAME="pathname with : colon"
>> + else
>> + PATHNAME="pathname without colon"
>> + fi &&
>> git archive --format=zip >with_file_with_content.zip \
>> + --add-virtual-file=\""$PATHNAME"\": \
>> --add-virtual-file=hello:world $EMPTY_TREE &&
>> test_when_finished "rm -rf tmp-unpack" &&
>> mkdir tmp-unpack && (
>> cd tmp-unpack &&
>> "$GIT_UNZIP" ../with_file_with_content.zip &&
>> test_path_is_file hello &&
>> + test_path_is_file "$PATHNAME" &&
>> test world = $(cat hello)
>> )
>> '
>
> This test is currently failing on Cygwin: it looks like it's exposing a
> bug in Cygwin that means files with colons in their name aren't
> correctly extracted from zip archives. I'm going to report that to the
> Cygwin mailing list, but I wanted to note it for the record here, too.
Does this mean that our code to set FUNNYNAMES prerequisite is
slightly broken? IOW, should we check with a path with a colon in
it, as well as whatever we use currently for FUNNYNAMES?
Something like the attached patch?
Or does Cygwin otherwise work perfectly well with a path with a
colon in it, but only $GIT_UNZIP command has problem with it? If
that is the case, then please disregard the attached.
Thanks.
t/test-lib.sh | 1 +
1 file changed, 1 insertion(+)
diff --git i/t/test-lib.sh w/t/test-lib.sh
index 55857af601..5dce7d95c9 100644
--- i/t/test-lib.sh
+++ w/t/test-lib.sh
@@ -1620,6 +1620,7 @@ test_lazy_prereq FUNNYNAMES '
touch -- \
"FUNNYNAMES tab embedded" \
"FUNNYNAMES \"quote embedded\"" \
+ "FUNNYNAMES colon : embedded" \
"FUNNYNAMES newline
embedded" 2>/dev/null &&
rm -- \
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-06-15 20:00 ` Junio C Hamano
@ 2022-06-15 21:36 ` Adam Dinwoodie
2022-06-18 20:19 ` Johannes Schindelin
0 siblings, 1 reply; 140+ messages in thread
From: Adam Dinwoodie @ 2022-06-15 21:36 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Schindelin
On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote:
> Adam Dinwoodie <adam@dinwoodie.org> writes:
>
> >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> >> index d6027189e2..3992d08158 100755
> >> --- a/t/t5003-archive-zip.sh
> >> +++ b/t/t5003-archive-zip.sh
> >> @@ -207,13 +207,21 @@ check_zip with_untracked
> >> check_added with_untracked untracked untracked
> >>
> >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> >> + if test_have_prereq FUNNYNAMES
> >> + then
> >> + PATHNAME="pathname with : colon"
> >> + else
> >> + PATHNAME="pathname without colon"
> >> + fi &&
> >> git archive --format=zip >with_file_with_content.zip \
> >> + --add-virtual-file=\""$PATHNAME"\": \
> >> --add-virtual-file=hello:world $EMPTY_TREE &&
> >> test_when_finished "rm -rf tmp-unpack" &&
> >> mkdir tmp-unpack && (
> >> cd tmp-unpack &&
> >> "$GIT_UNZIP" ../with_file_with_content.zip &&
> >> test_path_is_file hello &&
> >> + test_path_is_file "$PATHNAME" &&
> >> test world = $(cat hello)
> >> )
> >> '
> >
> > This test is currently failing on Cygwin: it looks like it's exposing a
> > bug in Cygwin that means files with colons in their name aren't
> > correctly extracted from zip archives. I'm going to report that to the
> > Cygwin mailing list, but I wanted to note it for the record here, too.
>
> Does this mean that our code to set FUNNYNAMES prerequisite is
> slightly broken? IOW, should we check with a path with a colon in
> it, as well as whatever we use currently for FUNNYNAMES?
>
> Something like the attached patch?
>
> Or does Cygwin otherwise work perfectly well with a path with a
> colon in it, but only $GIT_UNZIP command has problem with it? If
> that is the case, then please disregard the attached.
The latter: Cygwin works perfectly with paths containing colons, except
that Cygwin's `unzip` is seemingly buggy and doesn't work. The file
systems Cygwin runs on don't support colons in paths, but Cygwin hides
that problem by rewriting ASCII colons to some high Unicode code point
on the filesystem, meaning Cygwin-native applications see a regular
colon, while Windows-native applications see an unusual but perfectly
valid Unicode character.
I tested the same patch to FUNNYNAMES myself before reporting, and the
test fails exactly the same way. If we wanted to catch this, I think
we'd need a test that explicitly attempted to unzip an archive
containing a path with a colon.
(The code to set FUNNYNAMES *is* slightly broken, per the discussions
around 6d340dfaef ("t9902: split test to run on appropriate systems",
2022-04-08), and my to-do list still features tidying up and
resubmitting the patch Ævar wrote in that discussion thread. But it
wouldn't help here because this issue is specific to Cygwin's `unzip`,
rather than a general limitation of running on Cygwin.)
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-06-15 21:36 ` Adam Dinwoodie
@ 2022-06-18 20:19 ` Johannes Schindelin
2022-06-18 22:05 ` Junio C Hamano
2022-06-20 9:41 ` Adam Dinwoodie
0 siblings, 2 replies; 140+ messages in thread
From: Johannes Schindelin @ 2022-06-18 20:19 UTC (permalink / raw)
To: Adam Dinwoodie; +Cc: Junio C Hamano, git
Hi Adam,
On Wed, 15 Jun 2022, Adam Dinwoodie wrote:
> On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote:
> > Adam Dinwoodie <adam@dinwoodie.org> writes:
> >
> > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> > >> index d6027189e2..3992d08158 100755
> > >> --- a/t/t5003-archive-zip.sh
> > >> +++ b/t/t5003-archive-zip.sh
> > >> @@ -207,13 +207,21 @@ check_zip with_untracked
> > >> check_added with_untracked untracked untracked
> > >>
> > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> > >> + if test_have_prereq FUNNYNAMES
> > >> + then
> > >> + PATHNAME="pathname with : colon"
> > >> + else
> > >> + PATHNAME="pathname without colon"
> > >> + fi &&
> > >> git archive --format=zip >with_file_with_content.zip \
> > >> + --add-virtual-file=\""$PATHNAME"\": \
> > >> --add-virtual-file=hello:world $EMPTY_TREE &&
> > >> test_when_finished "rm -rf tmp-unpack" &&
> > >> mkdir tmp-unpack && (
> > >> cd tmp-unpack &&
> > >> "$GIT_UNZIP" ../with_file_with_content.zip &&
> > >> test_path_is_file hello &&
> > >> + test_path_is_file "$PATHNAME" &&
> > >> test world = $(cat hello)
> > >> )
> > >> '
> > >
> > > This test is currently failing on Cygwin: it looks like it's exposing a
> > > bug in Cygwin that means files with colons in their name aren't
> > > correctly extracted from zip archives. I'm going to report that to the
> > > Cygwin mailing list, but I wanted to note it for the record here, too.
> >
> > Does this mean that our code to set FUNNYNAMES prerequisite is
> > slightly broken? IOW, should we check with a path with a colon in
> > it, as well as whatever we use currently for FUNNYNAMES?
> >
> > Something like the attached patch?
> >
> > Or does Cygwin otherwise work perfectly well with a path with a
> > colon in it, but only $GIT_UNZIP command has problem with it? If
> > that is the case, then please disregard the attached.
>
> The latter: Cygwin works perfectly with paths containing colons, except
> that Cygwin's `unzip` is seemingly buggy and doesn't work. The file
> systems Cygwin runs on don't support colons in paths, but Cygwin hides
> that problem by rewriting ASCII colons to some high Unicode code point
> on the filesystem,
Let me throw in a bit more detail: The forbidden characters are mapped
into the Unicode page U+f0XX, which is supposed to be used "for private
purposes". Even more detail can be found here:
https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23
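As a rough illustration of that mapping (a sketch only, not Cygwin's code): the helpers below move a forbidden ASCII character into U+F0XX and back. The function names are made up, and the set of forbidden characters is an assumption for this example; the authoritative list lives in the strfuncs.cc file linked above.

```c
#include <string.h>

/* Assumed set of characters Windows rejects in file names. */
static const char forbidden_chars[] = "\"*:<>?|";

/*
 * Hypothetical helper: map a forbidden character to the private-use
 * code point U+F0XX; every other character maps to itself.
 */
static unsigned int to_private_use(unsigned char c)
{
	if (c && strchr(forbidden_chars, c))
		return 0xf000u | c;
	return c;
}

/*
 * Hypothetical helper: undo the mapping. Code points outside U+F0XX,
 * or whose low byte is not a forbidden character, are returned as-is.
 */
static unsigned int from_private_use(unsigned int cp)
{
	unsigned int low = cp & 0xffu;

	if ((cp & 0xff00u) == 0xf000u && low && strchr(forbidden_chars, (int)low))
		return low;
	return cp;
}
```

Under these assumptions a colon (0x3A) is stored as U+F03A on disk and read back as a plain colon by anything that goes through the mapping.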
> meaning Cygwin-native applications see a regular colon, while
> Windows-native applications see an unusual but perfectly valid Unicode
> character.
Now, I have two questions:
- Why does `unzip` not use Cygwin's regular functions (which should all be
aware of that U+f0XX <-> U+00XX mapping)?
- Even more importantly: would the test case pass if we simply used
another forbidden character, such as `?` or `*`?
> I tested the same patch to FUNNYNAMES myself before reporting, and the
> test fails exactly the same way. If we wanted to catch this, I think
> we'd need a test that explicitly attempted to unzip an archive
> containing a path with a colon.
>
> (The code to set FUNNYNAMES *is* slightly broken, per the discussions
> around 6d340dfaef ("t9902: split test to run on appropriate systems",
> 2022-04-08), and my to-do list still features tidying up and
> resubmitting the patch Ævar wrote in that discussion thread. But it
> wouldn't help here because this issue is specific to Cygwin's `unzip`,
> rather than a general limitation of running on Cygwin.)
I'd rather avoid changing FUNNYNAMES at this stage, if we can help it.
Thanks,
Dscho
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-06-18 20:19 ` Johannes Schindelin
@ 2022-06-18 22:05 ` Junio C Hamano
2022-06-20 9:41 ` Adam Dinwoodie
1 sibling, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-06-18 22:05 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Adam Dinwoodie, git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> I'd rather avoid changing FUNNYNAMES at this stage, if we can help it.
I wonder if it is sufficient to ask "unzip -l" the names of the
files in the archive, without having to materialize these files on
the filesystem. Would that bypass the whole FUNNYNAMES business, or
is "unzip" paranoid enough to reject an archive, even when it is not
extracting into the local filesystem, with a path that it would not
be able to extract if it were asked to?
I do not know how standardized the different implementations of
"unzip" are, or how similar the output of their "unzip -l" is,
but the following seems to pass for me locally.
t/t5003-archive-zip.sh | 18 ++++--------------
1 file changed, 4 insertions(+), 14 deletions(-)
diff --git c/t/t5003-archive-zip.sh w/t/t5003-archive-zip.sh
index 3992d08158..f2fdf2c235 100755
--- c/t/t5003-archive-zip.sh
+++ w/t/t5003-archive-zip.sh
@@ -207,23 +207,13 @@ check_zip with_untracked
check_added with_untracked untracked untracked
test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
- if test_have_prereq FUNNYNAMES
- then
- PATHNAME="pathname with : colon"
- else
- PATHNAME="pathname without colon"
- fi &&
+ PATHNAME="pathname with : colon" &&
git archive --format=zip >with_file_with_content.zip \
--add-virtual-file=\""$PATHNAME"\": \
--add-virtual-file=hello:world $EMPTY_TREE &&
- test_when_finished "rm -rf tmp-unpack" &&
- mkdir tmp-unpack && (
- cd tmp-unpack &&
- "$GIT_UNZIP" ../with_file_with_content.zip &&
- test_path_is_file hello &&
- test_path_is_file "$PATHNAME" &&
- test world = $(cat hello)
- )
+ "$GIT_UNZIP" -l with_file_with_content.zip >toc &&
+ grep -e " $PATHNAME\$" toc &&
+ grep -e " hello\$" toc
'
test_expect_success 'git archive --format=zip --add-file twice' '
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons
2022-06-18 20:19 ` Johannes Schindelin
2022-06-18 22:05 ` Junio C Hamano
@ 2022-06-20 9:41 ` Adam Dinwoodie
1 sibling, 0 replies; 140+ messages in thread
From: Adam Dinwoodie @ 2022-06-20 9:41 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Junio C Hamano, git
On Sat, Jun 18, 2022 at 10:19:28PM +0200, Johannes Schindelin wrote:
> Hi Adam,
>
> On Wed, 15 Jun 2022, Adam Dinwoodie wrote:
>
> > On Wed, Jun 15, 2022 at 01:00:07PM -0700, Junio C Hamano wrote:
> > > Adam Dinwoodie <adam@dinwoodie.org> writes:
> > >
> > > >> diff --git a/t/t5003-archive-zip.sh b/t/t5003-archive-zip.sh
> > > >> index d6027189e2..3992d08158 100755
> > > >> --- a/t/t5003-archive-zip.sh
> > > >> +++ b/t/t5003-archive-zip.sh
> > > >> @@ -207,13 +207,21 @@ check_zip with_untracked
> > > >> check_added with_untracked untracked untracked
> > > >>
> > > >> test_expect_success UNZIP 'git archive --format=zip --add-virtual-file' '
> > > >> + if test_have_prereq FUNNYNAMES
> > > >> + then
> > > >> + PATHNAME="pathname with : colon"
> > > >> + else
> > > >> + PATHNAME="pathname without colon"
> > > >> + fi &&
> > > >> git archive --format=zip >with_file_with_content.zip \
> > > >> + --add-virtual-file=\""$PATHNAME"\": \
> > > >> --add-virtual-file=hello:world $EMPTY_TREE &&
> > > >> test_when_finished "rm -rf tmp-unpack" &&
> > > >> mkdir tmp-unpack && (
> > > >> cd tmp-unpack &&
> > > >> "$GIT_UNZIP" ../with_file_with_content.zip &&
> > > >> test_path_is_file hello &&
> > > >> + test_path_is_file "$PATHNAME" &&
> > > >> test world = $(cat hello)
> > > >> )
> > > >> '
> > > >
> > > > This test is currently failing on Cygwin: it looks like it's exposing a
> > > > bug in Cygwin that means files with colons in their name aren't
> > > > correctly extracted from zip archives. I'm going to report that to the
> > > > Cygwin mailing list, but I wanted to note it for the record here, too.
> > >
> > > Does this mean that our code to set FUNNYNAMES prerequisite is
> > > slightly broken? IOW, should we check with a path with a colon in
> > > it, as well as whatever we use currently for FUNNYNAMES?
> > >
> > > Something like the attached patch?
> > >
> > > Or does Cygwin otherwise work perfectly well with a path with a
> > > colon in it, but only $GIT_UNZIP command has problem with it? If
> > > that is the case, then please disregard the attached.
> >
> > The latter: Cygwin works perfectly with paths containing colons, except
> > that Cygwin's `unzip` is seemingly buggy and doesn't work. The file
> > systems Cygwin runs on don't support colons in paths, but Cygwin hides
> > that problem by rewriting ASCII colons to some high Unicode code point
> > on the filesystem,
>
> Let me throw in a bit more detail: The forbidden characters are mapped
> into the Unicode page U+f0XX, which is supposed to be used "for private
> purposes". Even more detail can be found here:
> https://github.com/cygwin/cygwin/blob/cygwin-3_3_5-release/winsup/cygwin/strfuncs.cc#L19-L23
>
> > meaning Cygwin-native applications see a regular colon, while
> > Windows-native applications see an unusual but perfectly valid Unicode
> > character.
>
> Now, I have two questions:
>
> - Why does `unzip` not use Cygwin's regular functions (which should all be
> aware of that U+f0XX <-> U+00XX mapping)?
That is an excellent question! This behaviour came from an `#ifdef
__CYGWIN__` in the upstream unzip package; with that #ifdef removed,
everything works as expected. The folk on the Cygwin mailing list had
no idea *why* that #ifdef was there, given it's evidently unnecessary;
my best guess is that it was added a long time ago before Cygwin could
handle those characters in the general case.
Since my report, the Cygwin package has picked up a new maintainer who
has released a version of the unzip package with that #ifdef removed, so
this test is now passing.
> - Even more importantly: would the test case pass if we simply used
> another forbidden character, such as `?` or `*`?
The set of characters that had special handling in unzip was `"*:?|<>`,
all of which are handled appropriately by Cygwin applications in general,
and all of which had this unnecessary handling in `unzip`.
> > I tested the same patch to FUNNYNAMES myself before reporting, and the
> > test fails exactly the same way. If we wanted to catch this, I think
> > we'd need a test that explicitly attempted to unzip an archive
> > containing a path with a colon.
> >
> > (The code to set FUNNYNAMES *is* slightly broken, per the discussions
> > around 6d340dfaef ("t9902: split test to run on appropriate systems",
> > 2022-04-08), and my to-do list still features tidying up and
> > resubmitting the patch Ævar wrote in that discussion thread. But it
> > wouldn't help here because this issue is specific to Cygwin's `unzip`,
> > rather than a general limitation of running on Cygwin.)
>
> I'd rather avoid changing FUNNYNAMES at this stage, if we can help it.
Oh yes, I definitely wasn't proposing changing things for 2.37.0! I
just wanted to acknowledge that there is a known issue here that has
been discussed on this list previously, that we (I) would hopefully get
around to fixing at some point.
Adam
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH v6+ 3/7] scalar: validate the optional enlistment argument
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 1/7] archive: optionally add "virtual" files Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 2/7] archive --add-virtual-file: allow paths containing colons Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 4/7] scalar: implement `scalar diagnose` Junio C Hamano
` (4 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
The `scalar` command needs a Scalar enlistment for many subcommands, and
looks in the current directory for such an enlistment (traversing the
parent directories until it finds one).
These subcommands can also be called with an optional argument
specifying the enlistment. Here, too, we traverse parent directories as
needed, until we find an enlistment.
However, if the specified directory does not even exist, or is not a
directory, we should stop right there, with an error message.
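A minimal sketch of such a check, assuming nothing more than POSIX stat(); path_is_directory() is a hypothetical name used here for illustration, whereas the patch itself calls Git's is_directory() helper.

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if `path` exists and is a directory
 * (following symlinks), 0 if it is missing or is something else.
 */
static int path_is_directory(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISDIR(st.st_mode);
}
```

A caller would bail out with an error as soon as this returns 0, before any attempt to walk up the parent directories.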
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
contrib/scalar/scalar.c | 6 ++++--
contrib/scalar/t/t9099-scalar.sh | 5 +++++
2 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 58ca0e56f1..6d58c7a698 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -43,9 +43,11 @@ static void setup_enlistment_directory(int argc, const char **argv,
usage_with_options(usagestr, options);
/* find the worktree, determine its corresponding root */
- if (argc == 1)
+ if (argc == 1) {
strbuf_add_absolute_path(&path, argv[0]);
- else if (strbuf_getcwd(&path) < 0)
+ if (!is_directory(path.buf))
+ die(_("'%s' does not exist"), path.buf);
+ } else if (strbuf_getcwd(&path) < 0)
die(_("need a working directory"));
strbuf_trim_trailing_dir_sep(&path);
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 89781568f4..bb42354a8b 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -93,4 +93,9 @@ test_expect_success 'scalar supports -c/-C' '
test true = "$(git -C sub config core.preloadIndex)"
'
+test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
+ ! scalar run config cloned 2>err &&
+ grep "cloned. does not exist" err
+'
+
test_done
--
2.36.1-385-g60203f3fdb
^ permalink raw reply related [flat|nested] 140+ messages in thread
* [PATCH v6+ 4/7] scalar: implement `scalar diagnose`
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
` (2 preceding siblings ...)
2022-05-28 23:11 ` [PATCH v6+ 3/7] scalar: validate the optional enlistment argument Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-06-10 2:08 ` Ævar Arnfjörð Bjarmason
2022-05-28 23:11 ` [PATCH v6+ 5/7] scalar diagnose: include disk space information Junio C Hamano
` (3 subsequent siblings)
7 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
Over the course of Scalar's development, it became obvious that there is
a need for a command that can gather all kinds of useful information
that can help identify the most typical problems with large
worktrees/repositories.
The `diagnose` command is the culmination of this hard-won knowledge: it
gathers the installed hooks, the config, a couple statistics describing
the data shape, among other pieces of information, and then wraps
everything up in a tidy, neat `.zip` archive.
Note: originally, Scalar was implemented in C# using the .NET API, where
we had the luxury of a comprehensive standard library that includes
basic functionality such as writing a `.zip` file. In the C version, we
lack such a commodity. Rather than introducing a dependency on, say,
libzip, we slightly abuse Git's `archive` machinery: we write out a
`.zip` of the empty tree, augmented by a couple files that are added via
the `--add-file*` options. We take care not to modify the
current repository in any way lest the very circumstances that required
`scalar diagnose` to be run are changed by the `diagnose` run itself.
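As an illustration of how the resulting archive is named, the sketch below builds a path of the form `<enlistment>/.scalarDiagnostics/scalar_<timestamp>.zip` using only strftime() and snprintf(). The function name diagnostics_zip_path() is made up for this example; the patch itself uses Git's strbuf_addftime() with the same `%Y%m%d_%H%M%S` format.

```c
#include <stdio.h>
#include <time.h>

/*
 * Hypothetical helper: write "<root>/.scalarDiagnostics/scalar_<ts>.zip"
 * into `out`, where <ts> is `tm` formatted as YYYYmmdd_HHMMSS.
 * Returns 0 on success, -1 if formatting fails or `out` is too small.
 */
static int diagnostics_zip_path(char *out, size_t size, const char *root,
				const struct tm *tm)
{
	char stamp[32];
	int n;

	if (!strftime(stamp, sizeof(stamp), "%Y%m%d_%H%M%S", tm))
		return -1;

	n = snprintf(out, size, "%s/.scalarDiagnostics/scalar_%s.zip",
		     root, stamp);
	if (n < 0 || (size_t)n >= size)
		return -1;
	return 0;
}
```

Because the timestamp has one-second resolution, two runs within the same second would target the same file name; the patch opens the file with O_TRUNC, so the later run would overwrite the earlier archive.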
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
contrib/scalar/scalar.c | 144 +++++++++++++++++++++++++++++++
contrib/scalar/scalar.txt | 12 +++
contrib/scalar/t/t9099-scalar.sh | 14 +++
3 files changed, 170 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index 6d58c7a698..a1e05a2146 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -11,6 +11,7 @@
#include "dir.h"
#include "packfile.h"
#include "help.h"
+#include "archive.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -260,6 +261,47 @@ static int unregister_dir(void)
return res;
}
+static int add_directory_to_archiver(struct strvec *archiver_args,
+ const char *path, int recurse)
+{
+ int at_root = !*path;
+ DIR *dir = opendir(at_root ? "." : path);
+ struct dirent *e;
+ struct strbuf buf = STRBUF_INIT;
+ size_t len;
+ int res = 0;
+
+ if (!dir)
+ return error_errno(_("could not open directory '%s'"), path);
+
+ if (!at_root)
+ strbuf_addf(&buf, "%s/", path);
+ len = buf.len;
+ strvec_pushf(archiver_args, "--prefix=%s", buf.buf);
+
+ while (!res && (e = readdir(dir))) {
+ if (!strcmp(".", e->d_name) || !strcmp("..", e->d_name))
+ continue;
+
+ strbuf_setlen(&buf, len);
+ strbuf_addstr(&buf, e->d_name);
+
+ if (e->d_type == DT_REG)
+ strvec_pushf(archiver_args, "--add-file=%s", buf.buf);
+ else if (e->d_type != DT_DIR)
+ warning(_("skipping '%s', which is neither file nor "
+ "directory"), buf.buf);
+ else if (recurse &&
+ add_directory_to_archiver(archiver_args,
+ buf.buf, recurse) < 0)
+ res = -1;
+ }
+
+ closedir(dir);
+ strbuf_release(&buf);
+ return res;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -500,6 +542,107 @@ static int cmd_clone(int argc, const char **argv)
return res;
}
+static int cmd_diagnose(int argc, const char **argv)
+{
+ struct option options[] = {
+ OPT_END(),
+ };
+ const char * const usage[] = {
+ N_("scalar diagnose [<enlistment>]"),
+ NULL
+ };
+ struct strbuf zip_path = STRBUF_INIT;
+ struct strvec archiver_args = STRVEC_INIT;
+ char **argv_copy = NULL;
+ int stdout_fd = -1, archiver_fd = -1;
+ time_t now = time(NULL);
+ struct tm tm;
+ struct strbuf path = STRBUF_INIT, buf = STRBUF_INIT;
+ int res = 0;
+
+ argc = parse_options(argc, argv, NULL, options,
+ usage, 0);
+
+ setup_enlistment_directory(argc, argv, usage, options, &zip_path);
+
+ strbuf_addstr(&zip_path, "/.scalarDiagnostics/scalar_");
+ strbuf_addftime(&zip_path,
+ "%Y%m%d_%H%M%S", localtime_r(&now, &tm), 0, 0);
+ strbuf_addstr(&zip_path, ".zip");
+ switch (safe_create_leading_directories(zip_path.buf)) {
+ case SCLD_EXISTS:
+ case SCLD_OK:
+ break;
+ default:
+ res = error_errno(_("could not create directory for '%s'"),
+ zip_path.buf);
+ goto diagnose_cleanup;
+ }
+ stdout_fd = dup(1);
+ if (stdout_fd < 0) {
+ res = error_errno(_("could not duplicate stdout"));
+ goto diagnose_cleanup;
+ }
+
+ archiver_fd = xopen(zip_path.buf, O_CREAT | O_WRONLY | O_TRUNC, 0666);
+ if (archiver_fd < 0 || dup2(archiver_fd, 1) < 0) {
+ res = error_errno(_("could not redirect output"));
+ goto diagnose_cleanup;
+ }
+
+ init_zip_archiver();
+ strvec_pushl(&archiver_args, "scalar-diagnose", "--format=zip", NULL);
+
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "Collecting diagnostic info\n\n");
+ get_version_info(&buf, 1);
+
+ strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ write_or_die(stdout_fd, buf.buf, buf.len);
+ strvec_pushf(&archiver_args,
+ "--add-virtual-file=diagnostics.log:%.*s",
+ (int)buf.len, buf.buf);
+
+ if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
+ (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
+ goto diagnose_cleanup;
+
+ strvec_pushl(&archiver_args, "--prefix=",
+ oid_to_hex(the_hash_algo->empty_tree), "--", NULL);
+
+ /* `write_archive()` modifies the `argv` passed to it. Let it. */
+ argv_copy = xmemdupz(archiver_args.v,
+ sizeof(char *) * archiver_args.nr);
+ res = write_archive(archiver_args.nr, (const char **)argv_copy, NULL,
+ the_repository, NULL, 0);
+ if (res) {
+ error(_("failed to write archive"));
+ goto diagnose_cleanup;
+ }
+
+ if (!res)
+ fprintf(stderr, "\n"
+ "Diagnostics complete.\n"
+ "All of the gathered info is captured in '%s'\n",
+ zip_path.buf);
+
+diagnose_cleanup:
+ if (archiver_fd >= 0) {
+ close(1);
+ dup2(stdout_fd, 1);
+ }
+ free(argv_copy);
+ strvec_clear(&archiver_args);
+ strbuf_release(&zip_path);
+ strbuf_release(&path);
+ strbuf_release(&buf);
+
+ return res;
+}
+
static int cmd_list(int argc, const char **argv)
{
if (argc != 1)
@@ -801,6 +944,7 @@ static struct {
{ "reconfigure", cmd_reconfigure },
{ "delete", cmd_delete },
{ "version", cmd_version },
+ { "diagnose", cmd_diagnose },
{ NULL, NULL},
};
diff --git a/contrib/scalar/scalar.txt b/contrib/scalar/scalar.txt
index cf4e5b889c..c0425e0653 100644
--- a/contrib/scalar/scalar.txt
+++ b/contrib/scalar/scalar.txt
@@ -14,6 +14,7 @@ scalar register [<enlistment>]
scalar unregister [<enlistment>]
scalar run ( all | config | commit-graph | fetch | loose-objects | pack-files ) [<enlistment>]
scalar reconfigure [ --all | <enlistment> ]
+scalar diagnose [<enlistment>]
scalar delete <enlistment>
DESCRIPTION
@@ -139,6 +140,17 @@ reconfigure the enlistment.
With the `--all` option, all enlistments currently registered with Scalar
will be reconfigured. Use this option after each Scalar upgrade.
+Diagnose
+~~~~~~~~
+
+diagnose [<enlistment>]::
+ When reporting issues with Scalar, it is often helpful to provide the
+ information gathered by this command, including logs and certain
+ statistics describing the data shape of the current enlistment.
++
+The output of this command is a `.zip` file that is written into
+a directory adjacent to the worktree in the `src` directory.
+
Delete
~~~~~~
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index bb42354a8b..fbb1df2049 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -98,4 +98,18 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
grep "cloned. does not exist" err
'
+SQ="'"
+test_expect_success UNZIP 'scalar diagnose' '
+ scalar clone "file://$(pwd)" cloned --single-branch &&
+ scalar diagnose cloned >out 2>err &&
+ sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
+ zip_path=$(cat zip_path) &&
+ test -n "$zip_path" &&
+ unzip -v "$zip_path" &&
+ folder=${zip_path%.zip} &&
+ test_path_is_missing "$folder" &&
+ unzip -p "$zip_path" diagnostics.log >out &&
+ test_file_not_empty out
+'
+
test_done
--
2.36.1-385-g60203f3fdb
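
The stdout handling in `cmd_diagnose()` above (save fd 1 with dup(),
point it at the `.zip` file with dup2(), restore it during cleanup) can be
sketched in isolation. This is only a minimal illustration under stated
assumptions, not the code from the patch: `with_stdout_redirected()` and
`write_hello()` are hypothetical names, plain open() stands in for Git's
xopen(), and unlike the patch the sketch also closes the saved descriptor
after restoring it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `fn(data)` while file descriptor 1 points at
 * `path`, then restore the original stdout. Returns the result of `fn`,
 * or -1 if the redirection could not be set up.
 */
int with_stdout_redirected(const char *path, int (*fn)(void *), void *data)
{
	int saved_fd, file_fd, res = -1;

	fflush(stdout);		/* do not let buffered output cross over */
	saved_fd = dup(1);
	if (saved_fd < 0)
		return -1;

	file_fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0666);
	if (file_fd >= 0 && dup2(file_fd, 1) >= 0) {
		res = fn(data);
		fflush(stdout);
	}
	if (file_fd >= 0)
		close(file_fd);

	dup2(saved_fd, 1);	/* put the original stdout back */
	close(saved_fd);	/* avoid leaking the saved descriptor */
	return res;
}

/* Hypothetical example callback: writes one fixed line to fd 1. */
int write_hello(void *data)
{
	(void)data;
	return write(1, "hello\n", 6) == 6 ? 0 : -1;
}
```

Calling `with_stdout_redirected("out.txt", write_hello, NULL)` would leave
the line in `out.txt` while later writes to stdout go to the original
destination again.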
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 4/7] scalar: implement `scalar diagnose`
2022-05-28 23:11 ` [PATCH v6+ 4/7] scalar: implement `scalar diagnose` Junio C Hamano
@ 2022-06-10 2:08 ` Ævar Arnfjörð Bjarmason
2022-06-10 16:44 ` Junio C Hamano
0 siblings, 1 reply; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-10 2:08 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Schindelin
On Sat, May 28 2022, Junio C Hamano wrote:
> From: Johannes Schindelin <johannes.schindelin@gmx.de>
> [...]
> The `diagnose` command is the culmination of this hard-won knowledge: it
> gathers the installed hooks, the config, a couple statistics describing
> the data shape, among other pieces of information, and then wraps
> everything up in a tidy, neat `.zip` archive.
> [...]
> + if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
> + (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
> + (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
> + (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
> + (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
> + goto diagnose_cleanup;
I noticed this on top of some local changes I have that avoid adding a
.git/hooks (the --no-template topic): this fails to diagnose any repo
that doesn't have these paths, which are optional, either because a user
manually removed them or because the repo was created with --template=.
I don't think there's a way to create that sort of repo with the scalar
tooling, though, as it doesn't seem to forward that option, but I didn't
look deeply.
So, no big deal, but it would be nice to have that fixed. Is there a
reason why this mere addition of various stuff for diagnosis goes
straight to an opendir() and errors out on failure, as opposed to doing
an lstat() etc. first?
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 4/7] scalar: implement `scalar diagnose`
2022-06-10 2:08 ` Ævar Arnfjörð Bjarmason
@ 2022-06-10 16:44 ` Junio C Hamano
2022-06-10 17:35 ` Ævar Arnfjörð Bjarmason
0 siblings, 1 reply; 140+ messages in thread
From: Junio C Hamano @ 2022-06-10 16:44 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason; +Cc: git, Johannes Schindelin
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
> On Sat, May 28 2022, Junio C Hamano wrote:
>
>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>> [...]
>> The `diagnose` command is the culmination of this hard-won knowledge: it
>> gathers the installed hooks, the config, a couple statistics describing
>> the data shape, among other pieces of information, and then wraps
>> everything up in a tidy, neat `.zip` archive.
>> [...]
>> + if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
>> + (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
>> + (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
>> + (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
>> + (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
>> + goto diagnose_cleanup;
>
> Noticed on top of some local changes I have to not add a
> .git/hooks (the --no-template topic), but this fails to diagnose
> any repo that doesn't have these paths, which are optional, either
> because a user could have manually removed them, or used
> --template=.
Quite honestly, if it lacks any directory that we traditionally
created upon "git init", with our standard templates, we can and
should call such a repository "broken" and move on.
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 4/7] scalar: implement `scalar diagnose`
2022-06-10 16:44 ` Junio C Hamano
@ 2022-06-10 17:35 ` Ævar Arnfjörð Bjarmason
0 siblings, 0 replies; 140+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-06-10 17:35 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Johannes Schindelin
On Fri, Jun 10 2022, Junio C Hamano wrote:
> Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:
>
>> On Sat, May 28 2022, Junio C Hamano wrote:
>>
>>> From: Johannes Schindelin <johannes.schindelin@gmx.de>
>>> [...]
>>> The `diagnose` command is the culmination of this hard-won knowledge: it
>>> gathers the installed hooks, the config, a couple statistics describing
>>> the data shape, among other pieces of information, and then wraps
>>> everything up in a tidy, neat `.zip` archive.
>>> [...]
>>> + if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
>>> + (res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
>>> + (res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
>>> + (res = add_directory_to_archiver(&archiver_args, ".git/logs", 1)) ||
>>> + (res = add_directory_to_archiver(&archiver_args, ".git/objects/info", 0)))
>>> + goto diagnose_cleanup;
>>
>> Noticed on top of some local changes I have to not add a
>> .git/hooks (the --no-template topic), but this fails to diagnose
>> any repo that doesn't have these paths, which are optional, either
>> because a user could have manually removed them, or used
>> --template=.
>
> Quite honestly, if it lacks any directory that we traditionally
> created upon "git init", with our standard templates, we can and
> should call such a repository "broken" and move on.
In our own test suite we do e.g. (and did more of that until some recent
changes of mine):
git mv .git/hooks .git/hooks.disabled
We've never documented in "git init" or the like that these very
optional directories in .git/ were some sort of hard requirement, and
e.g. core.hooksPath and gitrepository-layout(5) explicitly seem to
suggest otherwise.
In any case, there's the golden rule about being strict in what you emit
and loose in what you accept, which is the approach we've taken with
repository compatibility. Having a tool that's designed to aid bug
reporting be picky about what sort of repository it supports seems to go
against the point of such a tool.
Particularly in this case, where it seems easy to just guard it with a
stat() check, or not error out if we fail to add this to the *.zip file,
no?
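
To illustrate the kind of guard meant here, a minimal sketch, assuming
that a missing optional directory should simply be skipped. Both
`is_existing_directory()` and `add_directory_if_present()` are
hypothetical names, and the callback merely stands in for whatever
actually adds the directory to the archive; none of this is code from
the patch:

```c
#include <sys/stat.h>

/*
 * Hypothetical helper: return 1 if `path` exists and is a directory,
 * 0 otherwise (including when stat() fails).
 */
int is_existing_directory(const char *path)
{
	struct stat st;

	return !stat(path, &st) && S_ISDIR(st.st_mode);
}

/*
 * Hypothetical wrapper: call `add_dir(path)` only when `path` is a
 * directory, and treat an absent optional directory as success.
 */
int add_directory_if_present(const char *path,
			     int (*add_dir)(const char *path))
{
	if (!is_existing_directory(path))
		return 0;	/* nothing to add, not an error */
	return add_dir(path);
}
```

With something like this, a repository without .git/hooks would still
produce a `.zip` file, just without that directory in it.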
^ permalink raw reply [flat|nested] 140+ messages in thread
* [PATCH v6+ 5/7] scalar diagnose: include disk space information
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
` (3 preceding siblings ...)
2022-05-28 23:11 ` [PATCH v6+ 4/7] scalar: implement `scalar diagnose` Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 6/7] scalar: teach `diagnose` to gather packfile info Junio C Hamano
` (2 subsequent siblings)
7 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC (permalink / raw)
To: git; +Cc: Johannes Schindelin
From: Johannes Schindelin <johannes.schindelin@gmx.de>
When analyzing problems with large worktrees/repositories, it is useful
to know how close to a "full disk" situation Scalar/Git operates. Let's
include this information.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
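
As a standalone illustration of the POSIX branch of `get_disk_info()`
below, here is a minimal sketch of querying the available space with
statvfs(). `available_bytes()` is a hypothetical name. Note one
deliberate difference, which is an assumption on my part rather than
something taken from the patch: the sketch multiplies by `f_frsize`,
since to my understanding the block counts in `struct statvfs` are
expressed in that unit, whereas the patch multiplies by `f_bsize`. On
many file systems the two values are equal.

```c
#include <stdint.h>
#include <sys/statvfs.h>

/*
 * Hypothetical helper (POSIX only): store in `*out` the number of bytes
 * available to unprivileged callers on the file system containing
 * `path`. Returns 0 on success, -1 if statvfs() fails.
 */
int available_bytes(const char *path, uint64_t *out)
{
	struct statvfs st;

	if (statvfs(path, &st) < 0)
		return -1;

	/* assumption: f_bavail is counted in units of f_frsize */
	*out = (uint64_t)st.f_frsize * (uint64_t)st.f_bavail;
	return 0;
}
```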
contrib/scalar/scalar.c | 53 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 1 +
2 files changed, 54 insertions(+)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index a1e05a2146..f06a2f3576 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -302,6 +302,58 @@ static int add_directory_to_archiver(struct strvec *archiver_args,
return res;
}
+#ifndef WIN32
+#include <sys/statvfs.h>
+#endif
+
+static int get_disk_info(struct strbuf *out)
+{
+#ifdef WIN32
+ struct strbuf buf = STRBUF_INIT;
+ char volume_name[MAX_PATH], fs_name[MAX_PATH];
+ DWORD serial_number, component_length, flags;
+ ULARGE_INTEGER avail2caller, total, avail;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (!GetDiskFreeSpaceExA(buf.buf, &avail2caller, &total, &avail)) {
+ error(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_setlen(&buf, offset_1st_component(buf.buf));
+ if (!GetVolumeInformationA(buf.buf, volume_name, sizeof(volume_name),
+ &serial_number, &component_length, &flags,
+ fs_name, sizeof(fs_name))) {
+ error(_("could not get info for '%s'"), buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, avail2caller.QuadPart);
+ strbuf_addch(out, '\n');
+ strbuf_release(&buf);
+#else
+ struct strbuf buf = STRBUF_INIT;
+ struct statvfs stat;
+
+ strbuf_realpath(&buf, ".", 1);
+ if (statvfs(buf.buf, &stat) < 0) {
+ error_errno(_("could not determine free disk size for '%s'"),
+ buf.buf);
+ strbuf_release(&buf);
+ return -1;
+ }
+
+ strbuf_addf(out, "Available space on '%s': ", buf.buf);
+ strbuf_humanise_bytes(out, st_mult(stat.f_bsize, stat.f_bavail));
+ strbuf_addf(out, " (mount flags 0x%lx)\n", stat.f_flag);
+ strbuf_release(&buf);
+#endif
+ return 0;
+}
+
/* printf-style interface, expects `<key>=<value>` argument */
static int set_config(const char *fmt, ...)
{
@@ -598,6 +650,7 @@ static int cmd_diagnose(int argc, const char **argv)
get_version_info(&buf, 1);
strbuf_addf(&buf, "Enlistment root: %s\n", the_repository->worktree);
+ get_disk_info(&buf);
write_or_die(stdout_fd, buf.buf, buf.len);
strvec_pushf(&archiver_args,
"--add-virtual-file=diagnostics.log:%.*s",
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index fbb1df2049..6e52088919 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -102,6 +102,7 @@ SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
scalar diagnose cloned >out 2>err &&
+ grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
zip_path=$(cat zip_path) &&
test -n "$zip_path" &&
--
2.36.1-385-g60203f3fdb
^ permalink raw reply related [flat|nested] 140+ messages in thread
* [PATCH v6+ 6/7] scalar: teach `diagnose` to gather packfile info
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
` (4 preceding siblings ...)
2022-05-28 23:11 ` [PATCH v6+ 5/7] scalar diagnose: include disk space information Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-28 23:11 ` [PATCH v6+ 7/7] scalar: teach `diagnose` to gather loose objects information Junio C Hamano
2022-05-30 10:12 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Johannes Schindelin
7 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC (permalink / raw)
To: git; +Cc: Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
It's helpful to see if there are other crud files in the pack
directory. Let's teach the `scalar diagnose` command to gather
file size information about pack files.
While at it, also enumerate the pack files in the alternate
object directories, if any are registered.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
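
The per-file size listing that `dir_file_stats_objects()` produces below
can be approximated outside of Git with plain opendir()/readdir()/stat().
This is a rough sketch only: `print_file_sizes()` is a hypothetical name,
it lists every entry of the given directory rather than relying on Git's
`for_each_file_in_pack_dir()`, and the fixed-size path buffer is a
simplification.

```c
#include <dirent.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: print "<name> <size>" for each entry of
 * `dir_path` that stat() can read, skipping "." and "..". Returns the
 * number of lines printed, or -1 if the directory cannot be opened.
 */
int print_file_sizes(FILE *out, const char *dir_path)
{
	DIR *dir = opendir(dir_path);
	struct dirent *e;
	int count = 0;

	if (!dir)
		return -1;

	while ((e = readdir(dir)) != NULL) {
		char full_path[4096];
		struct stat st;
		int len;

		if (!strcmp(e->d_name, ".") || !strcmp(e->d_name, ".."))
			continue;
		len = snprintf(full_path, sizeof(full_path), "%s/%s",
			       dir_path, e->d_name);
		if (len < 0 || len >= (int)sizeof(full_path))
			continue;	/* path too long for this sketch */
		if (!stat(full_path, &st)) {
			/* same column layout as the format string in the patch */
			fprintf(out, "%-70s %16ju\n", e->d_name,
				(uintmax_t)st.st_size);
			count++;
		}
	}

	closedir(dir);
	return count;
}
```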
contrib/scalar/scalar.c | 30 ++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 6 +++++-
2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index f06a2f3576..f745519038 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -12,6 +12,7 @@
#include "packfile.h"
#include "help.h"
#include "archive.h"
+#include "object-store.h"
/*
* Remove the deepest subdirectory in the provided path string. Path must not
@@ -594,6 +595,29 @@ static int cmd_clone(int argc, const char **argv)
return res;
}
+static void dir_file_stats_objects(const char *full_path, size_t full_path_len,
+ const char *file_name, void *data)
+{
+ struct strbuf *buf = data;
+ struct stat st;
+
+ if (!stat(full_path, &st))
+ strbuf_addf(buf, "%-70s %16" PRIuMAX "\n", file_name,
+ (uintmax_t)st.st_size);
+}
+
+static int dir_file_stats(struct object_directory *object_dir, void *data)
+{
+ struct strbuf *buf = data;
+
+ strbuf_addf(buf, "Contents of %s:\n", object_dir->path);
+
+ for_each_file_in_pack_dir(object_dir->path, dir_file_stats_objects,
+ data);
+
+ return 0;
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -656,6 +680,12 @@ static int cmd_diagnose(int argc, const char **argv)
"--add-virtual-file=diagnostics.log:%.*s",
(int)buf.len, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=packs-local.txt:");
+ dir_file_stats(the_repository->objects->odb, &buf);
+ foreach_alt_odb(dir_file_stats, &buf);
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 6e52088919..2603e2278f 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -101,6 +101,8 @@ test_expect_success '`scalar [...] <dir>` errors out when dir is missing' '
SQ="'"
test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
+ git repack &&
+ echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -110,7 +112,9 @@ test_expect_success UNZIP 'scalar diagnose' '
folder=${zip_path%.zip} &&
test_path_is_missing "$folder" &&
unzip -p "$zip_path" diagnostics.log >out &&
- test_file_not_empty out
+ test_file_not_empty out &&
+ unzip -p "$zip_path" packs-local.txt >out &&
+ grep "$(pwd)/.git/objects" out
'
test_done
--
2.36.1-385-g60203f3fdb
^ permalink raw reply related [flat|nested] 140+ messages in thread
* [PATCH v6+ 7/7] scalar: teach `diagnose` to gather loose objects information
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
` (5 preceding siblings ...)
2022-05-28 23:11 ` [PATCH v6+ 6/7] scalar: teach `diagnose` to gather packfile info Junio C Hamano
@ 2022-05-28 23:11 ` Junio C Hamano
2022-05-30 10:12 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Johannes Schindelin
7 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-28 23:11 UTC (permalink / raw)
To: git; +Cc: Matthew John Cheetham
From: Matthew John Cheetham <mjcheetham@outlook.com>
When operating at the scale that Scalar wants to support, certain data
shapes are more likely to cause undesirable performance issues, such as
large numbers of loose objects.
By including statistics about this, `scalar diagnose` now makes it
easier to identify such scenarios.
Signed-off-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
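
For readers unfamiliar with the loose object layout, the counting done by
`loose_objs_stats()` below can be sketched without Git's helpers: loose
objects live in subdirectories of the object directory whose names are
two hexadecimal digits. The names `is_fanout_name()`,
`count_regular_files()` and `count_loose_objects()` are hypothetical, and
the sketch uses stat() instead of `d_type`, on the assumption that some
file systems report `DT_UNKNOWN` there; it is not the code from the
patch.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Return 1 if `name` is exactly two hexadecimal digits, e.g. "0a". */
int is_fanout_name(const char *name)
{
	return strlen(name) == 2 &&
	       isxdigit((unsigned char)name[0]) &&
	       isxdigit((unsigned char)name[1]);
}

/* Count regular files directly inside `path`; 0 if it cannot be opened. */
int count_regular_files(const char *path)
{
	DIR *dir = opendir(path);
	struct dirent *e;
	int count = 0;

	if (!dir)
		return 0;

	while ((e = readdir(dir)) != NULL) {
		char full[4096];
		struct stat st;
		int len = snprintf(full, sizeof(full), "%s/%s",
				   path, e->d_name);

		if (len < 0 || len >= (int)sizeof(full))
			continue;	/* path too long for this sketch */
		/* "." and ".." are directories, so S_ISREG() skips them */
		if (!stat(full, &st) && S_ISREG(st.st_mode))
			count++;
	}

	closedir(dir);
	return count;
}

/*
 * Sum the regular files in all two-hex-digit subdirectories of
 * `objects_dir`. Returns -1 if `objects_dir` cannot be opened.
 */
int count_loose_objects(const char *objects_dir)
{
	DIR *dir = opendir(objects_dir);
	struct dirent *e;
	int total = 0;

	if (!dir)
		return -1;

	while ((e = readdir(dir)) != NULL) {
		char sub[4096];
		int len;

		if (!is_fanout_name(e->d_name))
			continue;
		len = snprintf(sub, sizeof(sub), "%s/%s",
			       objects_dir, e->d_name);
		if (len < 0 || len >= (int)sizeof(sub))
			continue;
		total += count_regular_files(sub);
	}

	closedir(dir);
	return total;
}
```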
contrib/scalar/scalar.c | 59 ++++++++++++++++++++++++++++++++
contrib/scalar/t/t9099-scalar.sh | 5 ++-
2 files changed, 63 insertions(+), 1 deletion(-)
diff --git a/contrib/scalar/scalar.c b/contrib/scalar/scalar.c
index f745519038..28176914e5 100644
--- a/contrib/scalar/scalar.c
+++ b/contrib/scalar/scalar.c
@@ -618,6 +618,60 @@ static int dir_file_stats(struct object_directory *object_dir, void *data)
return 0;
}
+static int count_files(char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count = 0;
+
+ if (!dir)
+ return 0;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) && e->d_type == DT_REG)
+ count++;
+
+ closedir(dir);
+ return count;
+}
+
+static void loose_objs_stats(struct strbuf *buf, const char *path)
+{
+ DIR *dir = opendir(path);
+ struct dirent *e;
+ int count;
+ int total = 0;
+ unsigned char c;
+ struct strbuf count_path = STRBUF_INIT;
+ size_t base_path_len;
+
+ if (!dir)
+ return;
+
+ strbuf_addstr(buf, "Object directory stats for ");
+ strbuf_add_absolute_path(buf, path);
+ strbuf_addstr(buf, ":\n");
+
+ strbuf_add_absolute_path(&count_path, path);
+ strbuf_addch(&count_path, '/');
+ base_path_len = count_path.len;
+
+ while ((e = readdir(dir)) != NULL)
+ if (!is_dot_or_dotdot(e->d_name) &&
+ e->d_type == DT_DIR && strlen(e->d_name) == 2 &&
+ !hex_to_bytes(&c, e->d_name, 1)) {
+ strbuf_setlen(&count_path, base_path_len);
+ strbuf_addstr(&count_path, e->d_name);
+ total += (count = count_files(count_path.buf));
+ strbuf_addf(buf, "%s : %7d files\n", e->d_name, count);
+ }
+
+ strbuf_addf(buf, "Total: %d loose objects", total);
+
+ strbuf_release(&count_path);
+ closedir(dir);
+}
+
static int cmd_diagnose(int argc, const char **argv)
{
struct option options[] = {
@@ -686,6 +740,11 @@ static int cmd_diagnose(int argc, const char **argv)
foreach_alt_odb(dir_file_stats, &buf);
strvec_push(&archiver_args, buf.buf);
+ strbuf_reset(&buf);
+ strbuf_addstr(&buf, "--add-virtual-file=objects-local.txt:");
+ loose_objs_stats(&buf, ".git/objects");
+ strvec_push(&archiver_args, buf.buf);
+
if ((res = add_directory_to_archiver(&archiver_args, ".git", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/hooks", 0)) ||
(res = add_directory_to_archiver(&archiver_args, ".git/info", 0)) ||
diff --git a/contrib/scalar/t/t9099-scalar.sh b/contrib/scalar/t/t9099-scalar.sh
index 2603e2278f..10b1172a8a 100755
--- a/contrib/scalar/t/t9099-scalar.sh
+++ b/contrib/scalar/t/t9099-scalar.sh
@@ -103,6 +103,7 @@ test_expect_success UNZIP 'scalar diagnose' '
scalar clone "file://$(pwd)" cloned --single-branch &&
git repack &&
echo "$(pwd)/.git/objects/" >>cloned/src/.git/objects/info/alternates &&
+ test_commit -C cloned/src loose &&
scalar diagnose cloned >out 2>err &&
grep "Available space" out &&
sed -n "s/.*$SQ\\(.*\\.zip\\)$SQ.*/\\1/p" <err >zip_path &&
@@ -114,7 +115,9 @@ test_expect_success UNZIP 'scalar diagnose' '
unzip -p "$zip_path" diagnostics.log >out &&
test_file_not_empty out &&
unzip -p "$zip_path" packs-local.txt >out &&
- grep "$(pwd)/.git/objects" out
+ grep "$(pwd)/.git/objects" out &&
+ unzip -p "$zip_path" objects-local.txt >out &&
+ grep "^Total: [1-9]" out
'
test_done
--
2.36.1-385-g60203f3fdb
^ permalink raw reply related [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 0/7] js/scalar-diagnose rebased
2022-05-28 23:11 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Junio C Hamano
` (6 preceding siblings ...)
2022-05-28 23:11 ` [PATCH v6+ 7/7] scalar: teach `diagnose` to gather loose objects information Junio C Hamano
@ 2022-05-30 10:12 ` Johannes Schindelin
2022-05-30 17:37 ` Junio C Hamano
7 siblings, 1 reply; 140+ messages in thread
From: Johannes Schindelin @ 2022-05-30 10:12 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git
[-- Attachment #1: Type: text/plain, Size: 885 bytes --]
Hi Junio,
On Sat, 28 May 2022, Junio C Hamano wrote:
> Recent document clarification on the "--prefix" option of the "git
> archive" command from René serves as a good basis for the
> documentation of the "--add-virtual-file" option added by this
> series, so here is my attempt to rebase js/scalar-diagnose topic
> on it to hopefully help reduce Dscho's workload ;-)
I usually frown upon sending patches on other people's behalf without
obtaining their consent first [*1*], but in this case I have to admit that
I appreciate your help very much.
The range-diff looks good.
Thank you,
Dscho
Footnote *1*: In case it was unclear, I consider submitting PRs at
https://github.com/git-for-windows/git as an implicit request to shepherd
the patches onto the Git mailing list, i.e. as consent to have me send
those patches on the original contributors' behalf.
^ permalink raw reply [flat|nested] 140+ messages in thread
* Re: [PATCH v6+ 0/7] js/scalar-diagnose rebased
2022-05-30 10:12 ` [PATCH v6+ 0/7] js/scalar-diagnose rebased Johannes Schindelin
@ 2022-05-30 17:37 ` Junio C Hamano
0 siblings, 0 replies; 140+ messages in thread
From: Junio C Hamano @ 2022-05-30 17:37 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: git
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>> Recent document clarification on the "--prefix" option of the "git
>> archive" command from René serves as a good basis for the
>> documentation of the "--add-virtual-file" option added by this
>> series, so here is my attempt to rebase js/scalar-diagnose topic
>> on it to hopefully help reduce Dscho's workload ;-)
>
> I usually frown upon sending patches on other people's behalf without
> obtaining their consent first [*1*], but in this case I have to admit that
> I appreciate your help very much.
I understand what you mean.
Consider this as an extended form of the usual notes I send to a
thread to say "ok, based on the discussion I saw on the list, I'll
tweak OP's patch <this way> while queuing; thank you all for
contributing." The way I try to convey <this way> can range from
words (e.g. when a reviewer points out a typo) to a fixup patch
(e.g. when the necessary update is a bit more involved), and this
time it took a full series with interdiff form. Of course I do not
have to do any of the above and just leave it up to the OP to pick
up ideas from the discussion while sending updates, but sometimes
it is quicker to skip round-trips.
I do not say "Please holler if I misunderstood the discussion and
correct me, and the OP can always update/override with a rerolled
series." when I send out such a "here is how the version queued
would be different from the original" notice, but I always mean
that, this time included ;-).
Your "frowning upon" is understandable in that it can become a
hostile behaviour towards others, including the maintainer who is
forced to ignore or pick. It is never fun to be in the position to
always exclude half of the patches posted to the list by
contributors who are competing instead of cooperating, and resending
a tweaked patch to show "here is how I would imagine is a better
version of your series" needs to be done with care.
Thanks.
^ permalink raw reply [flat|nested] 140+ messages in thread