* [PATCH] import-tars: ignore the global PAX header @ 2020-03-23 13:08 Johannes Schindelin via GitGitGadget 2020-03-23 17:09 ` René Scharfe ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Johannes Schindelin via GitGitGadget @ 2020-03-23 13:08 UTC (permalink / raw) To: git; +Cc: Johannes Schindelin, Johannes Schindelin From: Johannes Schindelin <johannes.schindelin@gmx.de> Git's own `git archive` inserts that header, but it often gets into the way of `import-tars.perl` e.g. when a prefix was specified (for example via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from GitHub releases): this prefix _should_ be stripped. Let's just skip it. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Ignore the global PAX header in import-tars.perl This problem came up in Pacman-related work, where PKGBUILD definitions would reference the tarballs downloaded from GitHub, and patches would be applied on top. To work on those patches efficiently (e.g. when an upgrade to a new version of the project no longer lets those patches apply), I need to be able to import those tarballs into playground worktrees and work on them. I like to use contrib/fast-import/import-tars.perl for that purpose, but it really needs to strip the prefix, otherwise it is too tedious to work with it. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-577%2Fdscho%2Fimport-tars-skip-pax-header-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-577/dscho/import-tars-skip-pax-header-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/577 contrib/fast-import/import-tars.perl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl index e800d9f5c9c..d50ce26d5d9 100755 --- a/contrib/fast-import/import-tars.perl +++ b/contrib/fast-import/import-tars.perl @@ -139,6 +139,8 @@ print FI "\n"; } + next if ($typeflag eq 'g'); # ignore global header + my $path; if ($prefix) { $path = "$prefix/$name"; base-commit: b4374e96c84ed9394fed363973eb540da308ed4f -- gitgitgadget ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 13:08 [PATCH] import-tars: ignore the global PAX header Johannes Schindelin via GitGitGadget @ 2020-03-23 17:09 ` René Scharfe 2020-03-23 17:41 ` Junio C Hamano 2020-03-23 21:05 ` Johannes Schindelin 2020-03-23 23:25 ` brian m. carlson 2020-03-24 19:35 ` [PATCH v2] " Johannes Schindelin via GitGitGadget 2 siblings, 2 replies; 13+ messages in thread From: René Scharfe @ 2020-03-23 17:09 UTC (permalink / raw) To: Johannes Schindelin via GitGitGadget, git; +Cc: Johannes Schindelin Am 23.03.20 um 14:08 schrieb Johannes Schindelin via GitGitGadget: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > Git's own `git archive` inserts that header, but it often gets into the > way of `import-tars.perl` e.g. when a prefix was specified (for example > via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from > GitHub releases): this prefix _should_ be stripped. > > Let's just skip it. git archive uses a global pax header to pass the ID of the archived commit as a comment, and for mtime values after 2242-03-16. Ignoring it in a simple importer seems reasonable for now, but I don't understand how this relates to prefixes. Is it because the header is treated as a regular file with the full path "pax_global_header" (independently from any prefix for actual files) and can thus be placed outside the expected destination directory? Thanks, René ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 17:09 ` René Scharfe @ 2020-03-23 17:41 ` Junio C Hamano 2020-03-23 21:08 ` Johannes Schindelin 2020-03-23 21:05 ` Johannes Schindelin 1 sibling, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2020-03-23 17:41 UTC (permalink / raw) To: René Scharfe Cc: Johannes Schindelin via GitGitGadget, git, Johannes Schindelin René Scharfe <l.s.r@web.de> writes: > Am 23.03.20 um 14:08 schrieb Johannes Schindelin via GitGitGadget: >> From: Johannes Schindelin <johannes.schindelin@gmx.de> >> >> Git's own `git archive` inserts that header, but it often gets into the >> way of `import-tars.perl` e.g. when a prefix was specified (for example >> via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from >> GitHub releases): this prefix _should_ be stripped. >> >> Let's just skip it. > > git archive uses a global pax header to pass the ID of the archived > commit as a comment, and for mtime values after 2242-03-16. Ignoring it > in a simple importer seems reasonable for now, but I don't understand > how this relates to prefixes. Is it because the header is treated as a > regular file with the full path "pax_global_header" (independently from > any prefix for actual files) and can thus be placed outside the expected > destination directory? Thanks for asking the question, as I was also curious if we are throwing away too much (perhaps "prefix is given as a global pax header, and ignoring all global pax headers is the most expedite way" was the reason the patch was written that way?). I agree with you that for the purpose of simple-minded importer, it probably is acceptable to take such a short-cut, but it would help future developers if we clearly documented that it is a short-cut that throws too much. That would welcome their effort to enhance the importer, if they find it more useful to keep some other information found in global headers, without breaking the intent of this change. Having said all that, even before "git archive" existed, release tarballs by many projects had leading prefix so that a tarball extract would be made inside a versioned directory. To truly help users of the importer, doesn't the logic to allow the user to say "please strip one leading level of directory from all the tarballs I feed you, as I know they are versioned directories" belong to the command line option of the importer? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 17:41 ` Junio C Hamano @ 2020-03-23 21:08 ` Johannes Schindelin 2020-03-23 21:50 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2020-03-23 21:08 UTC (permalink / raw) To: Junio C Hamano Cc: René Scharfe, Johannes Schindelin via GitGitGadget, git [-- Attachment #1: Type: text/plain, Size: 2529 bytes --] Hi Junio, On Mon, 23 Mar 2020, Junio C Hamano wrote: > René Scharfe <l.s.r@web.de> writes: > > > Am 23.03.20 um 14:08 schrieb Johannes Schindelin via GitGitGadget: > >> From: Johannes Schindelin <johannes.schindelin@gmx.de> > >> > >> Git's own `git archive` inserts that header, but it often gets into the > >> way of `import-tars.perl` e.g. when a prefix was specified (for example > >> via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from > >> GitHub releases): this prefix _should_ be stripped. > >> > >> Let's just skip it. > > > > git archive uses a global pax header to pass the ID of the archived > > commit as a comment, and for mtime values after 2242-03-16. Ignoring it > > in a simple importer seems reasonable for now, but I don't understand > > how this relates to prefixes. Is it because the header is treated as a > > regular file with the full path "pax_global_header" (independently from > > any prefix for actual files) and can thus be placed outside the expected > > destination directory? > > Thanks for asking the question, as I was also curious if we are > throwing away too much (perhaps "prefix is given as a global pax > header, and ignoring all global pax headers is the most expedite > way" was the reason the patch was written that way?). I agree with > you that for the purpose of simple-minded importer, it probably is > acceptable to take such a short-cut, but it would help future > developers if we clearly documented that it is a short-cut that > throws too much. That would welcome their effort to enhance the > importer, if they find it more useful to keep some other information > found in global headers, without breaking the intent of this change. I don't think that we're throwing away anything because the PAX header is intended to be a _header_, not a _file_, yet `contrib/fast-import/import-tars.perl` currently treats PAX headers that way. > Having said all that, even before "git archive" existed, release > tarballs by many projects had leading prefix so that a tarball > extract would be made inside a versioned directory. To truly help > users of the importer, doesn't the logic to allow the user to say > "please strip one leading level of directory from all the tarballs I > feed you, as I know they are versioned directories" belong to the > command line option of the importer? I guess nobody needed an explicit way to strip path prefixes yet, since the implicit way works so well. Ciao, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 21:08 ` Johannes Schindelin @ 2020-03-23 21:50 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2020-03-23 21:50 UTC (permalink / raw) To: Johannes Schindelin Cc: René Scharfe, Johannes Schindelin via GitGitGadget, git Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> throws too much. That would welcome their effort to enhance the >> importer, if they find it more useful to keep some other information >> found in global headers, without breaking the intent of this change. > > I don't think that we're throwing away anything because the PAX header is > intended to be a _header_, not a _file_, yet > `contrib/fast-import/import-tars.perl` currently treats PAX headers that > way. What I meant (and wrote) was information contained within the header. You could store such metadata (e.g. this tree came from this commit from the upstream project) in the commit object while importing, for example. As I wrote, I do not think we need to implement such feature right now. I am just saying that we should make sure we are not unintendely discouraging future developers from doing so by giving an impression that we are claiming "a pax header is intended to be a header and has no interesting information---never look at it". If we said "We discard the headers because it is the most expedient way and we currently have no use for them", that would make it clear that it is OK for them (the future ourselves developers) to extend the code not to lose the information, as long as they keep ignoring the prefix thing alone, if they want to follow the course set by this change. >> Having said all that, even before "git archive" existed, release >> tarballs by many projects had leading prefix so that a tarball >> extract would be made inside a versioned directory. To truly help >> users of the importer, doesn't the logic to allow the user to say >> "please strip one leading level of directory from all the tarballs I >> feed you, as I know they are versioned directories" belong to the >> command line option of the importer? > > I guess nobody needed an explicit way to strip path prefixes yet, since > the implicit way works so well. Now you confused me. If implicit way already works well, why are we adding this patch to make it implicitly ignore? ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 17:09 ` René Scharfe 2020-03-23 17:41 ` Junio C Hamano @ 2020-03-23 21:05 ` Johannes Schindelin 2020-03-23 21:53 ` Junio C Hamano 1 sibling, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2020-03-23 21:05 UTC (permalink / raw) To: René Scharfe; +Cc: Johannes Schindelin via GitGitGadget, git [-- Attachment #1: Type: text/plain, Size: 1796 bytes --] Hi René, On Mon, 23 Mar 2020, René Scharfe wrote: > Am 23.03.20 um 14:08 schrieb Johannes Schindelin via GitGitGadget: > > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > Git's own `git archive` inserts that header, but it often gets into the > > way of `import-tars.perl` e.g. when a prefix was specified (for example > > via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from > > GitHub releases): this prefix _should_ be stripped. > > > > Let's just skip it. > > git archive uses a global pax header to pass the ID of the archived > commit as a comment, and for mtime values after 2242-03-16. Ignoring it > in a simple importer seems reasonable for now, but I don't understand > how this relates to prefixes. The tar importer in `contrib/fast-import/import-tars.perl` has a very convenient feature that I took for public knowledge: if _all_ paths stored in the imported `.tar` start with a common prefix, e.g. `my-project-1.0.0/` as indicated in the commit message (or `git-2.26.0/` in the tar at https://github.com/git/git/archive/v2.26.0.tar.gz), then this prefix is stripped. This feature makes a ton of sense because it is relatively common to import two or more revisions of the same project into Git, and obviously we don't want all files to live in a tree whose name changes from revision to revision. Now, the problem with that feature is that it breaks down if there is a `pax_global_header` "file" located outside of said prefix. > Is it because the header is treated as a regular file with the full path > "pax_global_header" (independently from any prefix for actual files) and > can thus be placed outside the expected destination directory? Exactly. Thanks, Dscho > > Thanks, > René > > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 21:05 ` Johannes Schindelin @ 2020-03-23 21:53 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2020-03-23 21:53 UTC (permalink / raw) To: Johannes Schindelin Cc: René Scharfe, Johannes Schindelin via GitGitGadget, git Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > The tar importer in `contrib/fast-import/import-tars.perl` has a very > convenient feature that I took for public knowledge: if _all_ paths stored > in the imported `.tar` start with a common prefix, e.g. > `my-project-1.0.0/` as indicated in the commit message (or `git-2.26.0/` > in the tar at https://github.com/git/git/archive/v2.26.0.tar.gz), then > this prefix is stripped. Ahh, scratch what I said in my recent response. No, the above is not public knowledge, and that was missing from the log message that confused me. With that spelled out, yes, it makes sense not to allow the phony "file" at the top level created out of the pax headers. So, I think with a bit more explanation in the log, the patch would be OK. We do not want _any_ header. Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 13:08 [PATCH] import-tars: ignore the global PAX header Johannes Schindelin via GitGitGadget 2020-03-23 17:09 ` René Scharfe @ 2020-03-23 23:25 ` brian m. carlson 2020-03-24 13:35 ` Johannes Schindelin 2020-03-24 19:35 ` [PATCH v2] " Johannes Schindelin via GitGitGadget 2 siblings, 1 reply; 13+ messages in thread From: brian m. carlson @ 2020-03-23 23:25 UTC (permalink / raw) To: Johannes Schindelin via GitGitGadget; +Cc: git, Johannes Schindelin [-- Attachment #1: Type: text/plain, Size: 2374 bytes --] On 2020-03-23 at 13:08:44, Johannes Schindelin via GitGitGadget wrote: > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > Git's own `git archive` inserts that header, but it often gets into the > way of `import-tars.perl` e.g. when a prefix was specified (for example > via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from > GitHub releases): this prefix _should_ be stripped. > > Let's just skip it. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > --- > Ignore the global PAX header in import-tars.perl > > This problem came up in Pacman-related work, where PKGBUILD definitions > would reference the tarballs downloaded from GitHub, and patches would > be applied on top. To work on those patches efficiently (e.g. when an > upgrade to a new version of the project no longer lets those patches > apply), I need to be able to import those tarballs into playground > worktrees and work on them. I like to use > contrib/fast-import/import-tars.perl for that purpose, but it really > needs to strip the prefix, otherwise it is too tedious to work with it. > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-577%2Fdscho%2Fimport-tars-skip-pax-header-v1 > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-577/dscho/import-tars-skip-pax-header-v1 > Pull-Request: https://github.com/gitgitgadget/git/pull/577 > > contrib/fast-import/import-tars.perl | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl > index e800d9f5c9c..d50ce26d5d9 100755 > --- a/contrib/fast-import/import-tars.perl > +++ b/contrib/fast-import/import-tars.perl > @@ -139,6 +139,8 @@ > print FI "\n"; > } > > + next if ($typeflag eq 'g'); # ignore global header > + In general, it isn't safe to do this. A pax global header contains attributes that may live in a normal extended header at a lower priority. So it is valid, for example, to write an mtime field in the global header that applies to the entire archive and overrides the ustar header block (and is overridden by a normal extended header). I think we need a different solution for this case. -- brian m. carlson: Houston, Texas, US OpenPGP: https://keybase.io/bk2204 [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 263 bytes --] ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] import-tars: ignore the global PAX header 2020-03-23 23:25 ` brian m. carlson @ 2020-03-24 13:35 ` Johannes Schindelin 0 siblings, 0 replies; 13+ messages in thread From: Johannes Schindelin @ 2020-03-24 13:35 UTC (permalink / raw) To: brian m. carlson; +Cc: Johannes Schindelin via GitGitGadget, git Hi brian, On Mon, 23 Mar 2020, brian m. carlson wrote: > On 2020-03-23 at 13:08:44, Johannes Schindelin via GitGitGadget wrote: > > From: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > Git's own `git archive` inserts that header, but it often gets into the > > way of `import-tars.perl` e.g. when a prefix was specified (for example > > via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from > > GitHub releases): this prefix _should_ be stripped. > > > > Let's just skip it. > > > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > --- > > Ignore the global PAX header in import-tars.perl > > > > This problem came up in Pacman-related work, where PKGBUILD definitions > > would reference the tarballs downloaded from GitHub, and patches would > > be applied on top. To work on those patches efficiently (e.g. when an > > upgrade to a new version of the project no longer lets those patches > > apply), I need to be able to import those tarballs into playground > > worktrees and work on them. I like to use > > contrib/fast-import/import-tars.perl for that purpose, but it really > > needs to strip the prefix, otherwise it is too tedious to work with it. > > > > Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-577%2Fdscho%2Fimport-tars-skip-pax-header-v1 > > Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-577/dscho/import-tars-skip-pax-header-v1 > > Pull-Request: https://github.com/gitgitgadget/git/pull/577 > > > > contrib/fast-import/import-tars.perl | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl > > index e800d9f5c9c..d50ce26d5d9 100755 > > --- a/contrib/fast-import/import-tars.perl > > +++ b/contrib/fast-import/import-tars.perl > > @@ -139,6 +139,8 @@ > > print FI "\n"; > > } > > > > + next if ($typeflag eq 'g'); # ignore global header > > + > > In general, it isn't safe to do this. A pax global header contains > attributes that may live in a normal extended header at a lower > priority. So it is valid, for example, to write an mtime field in the > global header that applies to the entire archive and overrides the > ustar header block (and is overridden by a normal extended header). > > I think we need a different solution for this case. I agree that we need a different solution for this case. At the same time, I would like to point out that I am trying to address a _different_ problem than "we're not using the information contained within the PAX global header": I want to prevent that header from polluting the top-level tree. For what it's worth, my patch does not prevent future patches from using the information contained within the global header :-) Thanks, Dscho ^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2] import-tars: ignore the global PAX header 2020-03-23 13:08 [PATCH] import-tars: ignore the global PAX header Johannes Schindelin via GitGitGadget 2020-03-23 17:09 ` René Scharfe 2020-03-23 23:25 ` brian m. carlson @ 2020-03-24 19:35 ` Johannes Schindelin via GitGitGadget 2020-03-24 21:04 ` Junio C Hamano 2 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin via GitGitGadget @ 2020-03-24 19:35 UTC (permalink / raw) To: git Cc: René Scharfe, brian m. carlson, Junio C Hamano, Johannes Schindelin, Johannes Schindelin From: Johannes Schindelin <johannes.schindelin@gmx.de> The tar importer in `contrib/fast-import/import-tars.perl` has a very convenient feature: if _all_ paths stored in the imported `.tar` start with a common prefix, e.g. `git-2.26.0/` in the tar at https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is stripped. This feature makes a ton of sense because it is relatively common to import two or more revisions of the same project into Git, and obviously we don't want all files to live in a tree whose name changes from revision to revision. Now, the problem with that feature is that it breaks down if there is a `pax_global_header` "file" located outside of said prefix, at the top of the tree. This is the case for `.tar` files generated by Git's very own `git archive` command: it inserts that header, and `git archive` allows specifying a common prefix (that the header does _not_ share with the other files contained in the archive) via `--prefix=my-project-1.0.0/`. Let's just skip any global header when importing `.tar` files into Git. Note: this global header might contain useful information. For example, in the output of `git archive`, it lists the original commit, which _is_ useful information. A future improvement to the `import-tars.perl` script might be to include that information in the commit message, or do other things with the information (e.g. use `mtime` information contained in the global header as date of the commit). This patch does not prevent any future patch from making that happen, it only prevents the header from being treated as if it was a regular file. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> --- Ignore the global PAX header in import-tars.perl This problem came up in Pacman-related work, where PKGBUILD definitions would reference the tarballs downloaded from GitHub, and patches would be applied on top. To work on those patches efficiently (e.g. when an upgrade to a new version of the project no longer lets those patches apply), I need to be able to import those tarballs into playground worktrees and work on them. I like to use contrib/fast-import/import-tars.perl for that purpose, but it really needs to strip the prefix, otherwise it is too tedious to work with it. Changes since v1: * Mentioned the implicit prefix-stripping feature of import-tars.perl in the commit message; Without that context, it is really hard to understand the motivation for this patch. * Clarified in the commit message that this patch does not prevent any future patches that would use the information contained in the global header. Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-577%2Fdscho%2Fimport-tars-skip-pax-header-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-577/dscho/import-tars-skip-pax-header-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/577 Range-diff vs v1: 1: 718bde8f4a7 ! 1: 842dabe6128 import-tars: ignore the global PAX header @@ -2,12 +2,34 @@ import-tars: ignore the global PAX header - Git's own `git archive` inserts that header, but it often gets into the - way of `import-tars.perl` e.g. when a prefix was specified (for example - via `--prefix=my-project-1.0.0/`, or when downloading a `.tar.gz` from - GitHub releases): this prefix _should_ be stripped. + The tar importer in `contrib/fast-import/import-tars.perl` has a very + convenient feature: if _all_ paths stored in the imported `.tar` start + with a common prefix, e.g. `git-2.26.0/` in the tar at + https://github.com/git/git/archive/v2.26.0.tar.gz, then this prefix is + stripped. - Let's just skip it. + This feature makes a ton of sense because it is relatively common to + import two or more revisions of the same project into Git, and obviously + we don't want all files to live in a tree whose name changes from + revision to revision. + + Now, the problem with that feature is that it breaks down if there is a + `pax_global_header` "file" located outside of said prefix, at the top of + the tree. This is the case for `.tar` files generated by Git's very own + `git archive` command: it inserts that header, and `git archive` allows + specifying a common prefix (that the header does _not_ share with the + other files contained in the archive) via `--prefix=my-project-1.0.0/`. + + Let's just skip any global header when importing `.tar` files into Git. + + Note: this global header might contain useful information. For example, + in the output of `git archive`, it lists the original commit, which _is_ + useful information. A future improvement to the `import-tars.perl` + script might be to include that information in the commit message, or do + other things with the information (e.g. use `mtime` information + contained in the global header as date of the commit). This patch does + not prevent any future patch from making that happen, it only prevents + the header from being treated as if it was a regular file. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> contrib/fast-import/import-tars.perl | 2 ++ 1 file changed, 2 insertions(+) diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl index e800d9f5c9c..d50ce26d5d9 100755 --- a/contrib/fast-import/import-tars.perl +++ b/contrib/fast-import/import-tars.perl @@ -139,6 +139,8 @@ print FI "\n"; } + next if ($typeflag eq 'g'); # ignore global header + my $path; if ($prefix) { $path = "$prefix/$name"; base-commit: b4374e96c84ed9394fed363973eb540da308ed4f -- gitgitgadget ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2] import-tars: ignore the global PAX header 2020-03-24 19:35 ` [PATCH v2] " Johannes Schindelin via GitGitGadget @ 2020-03-24 21:04 ` Junio C Hamano 2020-03-25 17:59 ` Johannes Schindelin 0 siblings, 1 reply; 13+ messages in thread From: Junio C Hamano @ 2020-03-24 21:04 UTC (permalink / raw) To: Johannes Schindelin via GitGitGadget Cc: git, René Scharfe, brian m. carlson, Johannes Schindelin "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> writes: > + Note: this global header might contain useful information. For example, > + in the output of `git archive`, it lists the original commit, which _is_ > + useful information. A future improvement to the `import-tars.perl` > + script might be to include that information in the commit message, or do > + other things with the information (e.g. use `mtime` information > + contained in the global header as date of the commit). This patch does > + not prevent any future patch from making that happen, it only prevents > + the header from being treated as if it was a regular file. > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > contrib/fast-import/import-tars.perl | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl > index e800d9f5c9c..d50ce26d5d9 100755 > --- a/contrib/fast-import/import-tars.perl > +++ b/contrib/fast-import/import-tars.perl > @@ -139,6 +139,8 @@ > print FI "\n"; > } > > + next if ($typeflag eq 'g'); # ignore global header > + Yeah, it is more like "don't create a phony file out of global header" which is exactly the point of this fix, rather than "ignore global header", and if the contents of the header is used for any other purpose (e.g. metadata that will be added to the log message or mode bits that is forced on all files), that can be handled before this line. And the current code structure is already prepared for it: all that remains in the block after this point is to create a file at $path and store its contents. So, this makes sense. I suspect that with an update to the comment in the direction, there probably is no need for the huge "Note" in the log message. Thanks. > my $path; > if ($prefix) { > $path = "$prefix/$name"; > > base-commit: b4374e96c84ed9394fed363973eb540da308ed4f ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] import-tars: ignore the global PAX header 2020-03-24 21:04 ` Junio C Hamano @ 2020-03-25 17:59 ` Johannes Schindelin 2020-03-25 18:43 ` Junio C Hamano 0 siblings, 1 reply; 13+ messages in thread From: Johannes Schindelin @ 2020-03-25 17:59 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe, brian m. carlson Hi Junio, On Tue, 24 Mar 2020, Junio C Hamano wrote: > "Johannes Schindelin via GitGitGadget" <gitgitgadget@gmail.com> > writes: > > > + Note: this global header might contain useful information. For example, > > + in the output of `git archive`, it lists the original commit, which _is_ > > + useful information. A future improvement to the `import-tars.perl` > > + script might be to include that information in the commit message, or do > > + other things with the information (e.g. use `mtime` information > > + contained in the global header as date of the commit). This patch does > > + not prevent any future patch from making that happen, it only prevents > > + the header from being treated as if it was a regular file. > > > > Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> > > > > > > > > contrib/fast-import/import-tars.perl | 2 ++ > > 1 file changed, 2 insertions(+) > > > > diff --git a/contrib/fast-import/import-tars.perl b/contrib/fast-import/import-tars.perl > > index e800d9f5c9c..d50ce26d5d9 100755 > > --- a/contrib/fast-import/import-tars.perl > > +++ b/contrib/fast-import/import-tars.perl > > @@ -139,6 +139,8 @@ > > print FI "\n"; > > } > > > > + next if ($typeflag eq 'g'); # ignore global header > > + > > Yeah, it is more like "don't create a phony file out of global > header" which is exactly the point of this fix, rather than "ignore > global header", and if the contents of the header is used for any > other purpose (e.g. metadata that will be added to the log message > or mode bits that is forced on all files), that can be handled > before this line. And the current code structure is already > prepared for it: all that remains in the block after this point is > to create a file at $path and store its contents. > > So, this makes sense. I suspect that with an update to the comment > in the direction, there probably is no need for the huge "Note" in > the log message. Too many of such in-code comments added by me became stale. I'd rather keep this in the commit message where it explains the reasoning for the current change. Thanks, Dscho > > Thanks. > > > my $path; > > if ($prefix) { > > $path = "$prefix/$name"; > > > > base-commit: b4374e96c84ed9394fed363973eb540da308ed4f > ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2] import-tars: ignore the global PAX header 2020-03-25 17:59 ` Johannes Schindelin @ 2020-03-25 18:43 ` Junio C Hamano 0 siblings, 0 replies; 13+ messages in thread From: Junio C Hamano @ 2020-03-25 18:43 UTC (permalink / raw) To: Johannes Schindelin Cc: Johannes Schindelin via GitGitGadget, git, René Scharfe, brian m. carlson Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Too many of such in-code comments added by me became stale. I'd rather > keep this in the commit message where it explains the reasoning for the > current change. I was updating this week's "What's cooking" report, and described this one like so: * js/import-tars-do-not-make-phony-files-from-pax-headers (2020-03-24) 1 commit - import-tars: ignore the global PAX header The import-tars importer (in contrib/fast-import/) used to create phony files at the top-level of the repository when the archive contains global PAX headers, which made its own logic to detect and omit the common leading directory ineffective, which has been corrected. I guess we are ready to mark it as "Will merge". Thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2020-03-25 18:43 UTC | newest] Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-23 13:08 [PATCH] import-tars: ignore the global PAX header Johannes Schindelin via GitGitGadget 2020-03-23 17:09 ` René Scharfe 2020-03-23 17:41 ` Junio C Hamano 2020-03-23 21:08 ` Johannes Schindelin 2020-03-23 21:50 ` Junio C Hamano 2020-03-23 21:05 ` Johannes Schindelin 2020-03-23 21:53 ` Junio C Hamano 2020-03-23 23:25 ` brian m. carlson 2020-03-24 13:35 ` Johannes Schindelin 2020-03-24 19:35 ` [PATCH v2] " Johannes Schindelin via GitGitGadget 2020-03-24 21:04 ` Junio C Hamano 2020-03-25 17:59 ` Johannes Schindelin 2020-03-25 18:43 ` Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).