From: "Baumann, Moritz" <moritz.baumann@sap.com>
To: Jeff King <peff@peff.net>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: RE: Feature Request: Option to make "git rev-list --objects" output duplicate objects
Date: Tue, 28 Mar 2023 08:08:02 +0000
Message-ID: <AS1PR02MB8185DF947EBC583318481E1994889@AS1PR02MB8185.eurprd02.prod.outlook.com>
In-Reply-To: <20230324192848.GC536967@coredump.intra.peff.net>

> Another problem you might not have run into yet: the names given by
> rev-list are not quoted in any way, and will just omit newlines. So if
> your hook is trying to avoid malicious garbage like "foo\nbar", it won't
> work.

Thanks for that warning. I was not aware that rev-list didn't quote file names.

> Those names are really just intended as hints for pack-objects. I
> suspect the documentation could be more clear about these limitations.

That would indeed be great and would have likely prevented the obvious
misconceptions on my side.

> I'm not sure what you mean by "one by one", since that is inherently
> what rev-list is doing under the hood. If you mean "running a separate
> process for each commit", then yes, that will be slow.

Yes, that's what I meant to say.

> But if you want
> to know all of the names touched in a set of commits, I have used
> something like this before:
>
>   git rev-list $new --not --all |
>   git diff-tree --stdin --format= -r -c --name-only

Thanks, that looks promising and solves at least one of my use cases. The only
minor problem is that there seems to be no way to pipe the diff-tree output to
cat-file without massaging it with awk first.

I have three use cases in my pre-receive hooks:

1. Filters solely based on the file name
   → your suggestion works perfectly here
2. Filters based only on file contents
   → git rev-list --objects + git cat-file provide everything I need
3. One filter based on file size and name (forbid large files, with exceptions)
   → I'm guessing "git rev-list | git diff-tree --stdin | awk |
     git cat-file --batch-check" is the best solution to extract the necessary
     information from git in this case?
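
For illustration, a minimal sketch of how the pipeline in item 3 could fit
together. The function names (list_blob_sizes, find_oversized) and the
size/pattern arguments are hypothetical and not taken from this thread. It
assumes the raw diff-tree output without -z, so unusual path names arrive
C-quoted rather than verbatim, and it diffs merges against each parent (-m)
to keep the line format simple.

```shell
#!/bin/sh
# Sketch only: function names, limits and patterns are hypothetical.

# Print "<size-in-bytes> <path>" for every blob added or changed by the
# commits selected by the given rev-list arguments.
#   --root            also diff root commits against the empty tree
#   -m                diff merge commits against each parent separately
#   --diff-filter=d   drop deletions, which have no new blob
# Raw lines look like ":<mode> <mode> <old> <new> <status>\t<path>"; awk
# turns them into "<new> <path>" and skips submodule entries (mode 160000).
# cat-file's %(rest) echoes everything after the object name, i.e. the path.
list_blob_sizes() {
    git rev-list "$@" |
    git diff-tree --stdin --root -r -m --no-commit-id --raw --no-abbrev \
        --diff-filter=d |
    awk -F'\t' '{ split($1, m, " "); if (m[2] != "160000") print m[4], $2 }' |
    git cat-file --batch-check='%(objectsize) %(rest)' |
    sort -u
}

# Print paths of blobs larger than $1 bytes, except paths matching the
# extended regular expression $2 (empty means "no exceptions").
# Remaining arguments are passed on to rev-list.
find_oversized() {
    max=$1
    allow=$2
    shift 2
    list_blob_sizes "$@" |
    MAX="$max" ALLOW="$allow" awk '
        {
            path = substr($0, length($1) + 2)
            if ($1 + 0 <= ENVIRON["MAX"] + 0) next
            if (ENVIRON["ALLOW"] != "" && path ~ ENVIRON["ALLOW"]) next
            print path
        }'
}
```

In a pre-receive hook this might be called as
find_oversized 1048576 '\.bin$' "$new" --not --all
for each pushed ref, rejecting the push if anything is printed.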

-- Moritz
