git.vger.kernel.org archive mirror
* [PATCH] ref-filter: add support for %(contents:size)
@ 2020-07-01 13:23 Christian Couder
  2020-07-01 15:20 ` Jeff King
  0 siblings, 1 reply; 2+ messages in thread
From: Christian Couder @ 2020-07-01 13:23 UTC
  To: git; +Cc: Junio C Hamano, Christian Couder

It's useful and efficient to be able to get the size of the
contents directly without having to pipe through `wc -c`.

Also the result of the following:

`git for-each-ref --format='%(contents)' | wc -c`

is off by one as `git for-each-ref` appends a newline character
after the contents, which can be seen by comparing its output
with the output from `git cat-file`.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
---
 Documentation/git-for-each-ref.txt | 27 +++++++++++++++------------
 ref-filter.c                       |  7 ++++++-
 t/t6300-for-each-ref.sh            |  2 ++
 3 files changed, 23 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-for-each-ref.txt b/Documentation/git-for-each-ref.txt
index 6dcd39f6f6..673ace94d1 100644
--- a/Documentation/git-for-each-ref.txt
+++ b/Documentation/git-for-each-ref.txt
@@ -232,18 +232,21 @@ Fields that have name-email-date tuple as its value (`author`,
 `committer`, and `tagger`) can be suffixed with `name`, `email`,
 and `date` to extract the named component.
 
-The complete message in a commit and tag object is `contents`.
-Its first line is `contents:subject`, where subject is the concatenation
-of all lines of the commit message up to the first blank line.  The next
-line is `contents:body`, where body is all of the lines after the first
-blank line.  The optional GPG signature is `contents:signature`.  The
-first `N` lines of the message is obtained using `contents:lines=N`.
-Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
-are obtained as `trailers` (or by using the historical alias
-`contents:trailers`).  Non-trailer lines from the trailer block can be omitted
-with `trailers:only`. Whitespace-continuations can be removed from trailers so
-that each trailer appears on a line by itself with its full content with
-`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
+The complete message in a commit and tag object is `contents`.  Its
+size in bytes is `contents:size`.  Its first line is
+`contents:subject`, where subject is the concatenation of all lines of
+the commit message up to the first blank line.  The next line is
+`contents:body`, where body is all of the lines after the first blank
+line.  The optional GPG signature is `contents:signature`.  The first
+`N` lines of the message is obtained using `contents:lines=N`.
+Additionally, the trailers as interpreted by
+linkgit:git-interpret-trailers[1] are obtained as `trailers` (or by
+using the historical alias `contents:trailers`).  Non-trailer lines
+from the trailer block can be omitted with
+`trailers:only`. Whitespace-continuations can be removed from trailers
+so that each trailer appears on a line by itself with its full content
+with `trailers:unfold`. Both can be used together as
+`trailers:unfold,only`.
 
 For sorting purposes, fields with numeric values sort in numeric order
 (`objectsize`, `authordate`, `committerdate`, `creatordate`, `taggerdate`).
diff --git a/ref-filter.c b/ref-filter.c
index bf7b70299b..036a95d0d2 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -127,7 +127,8 @@ static struct used_atom {
 			unsigned int nobracket : 1, push : 1, push_remote : 1;
 		} remote_ref;
 		struct {
-			enum { C_BARE, C_BODY, C_BODY_DEP, C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
+			enum { C_BARE, C_BODY, C_BODY_DEP, C_LENGTH,
+			       C_LINES, C_SIG, C_SUB, C_TRAILERS } option;
 			struct process_trailer_options trailer_opts;
 			unsigned int nlines;
 		} contents;
@@ -338,6 +339,8 @@ static int contents_atom_parser(const struct ref_format *format, struct used_ato
 		atom->u.contents.option = C_BARE;
 	else if (!strcmp(arg, "body"))
 		atom->u.contents.option = C_BODY;
+	else if (!strcmp(arg, "size"))
+		atom->u.contents.option = C_LENGTH;
 	else if (!strcmp(arg, "signature"))
 		atom->u.contents.option = C_SIG;
 	else if (!strcmp(arg, "subject"))
@@ -1253,6 +1256,8 @@ static void grab_sub_body_contents(struct atom_value *val, int deref, void *buf)
 			v->s = copy_subject(subpos, sublen);
 		else if (atom->u.contents.option == C_BODY_DEP)
 			v->s = xmemdupz(bodypos, bodylen);
+		else if (atom->u.contents.option == C_LENGTH)
+			v->s = xstrfmt("%"PRIuMAX, (uintmax_t)strlen(subpos));
 		else if (atom->u.contents.option == C_BODY)
 			v->s = xmemdupz(bodypos, nonsiglen);
 		else if (atom->u.contents.option == C_SIG)
diff --git a/t/t6300-for-each-ref.sh b/t/t6300-for-each-ref.sh
index da59fadc5d..4f730acd48 100755
--- a/t/t6300-for-each-ref.sh
+++ b/t/t6300-for-each-ref.sh
@@ -125,6 +125,7 @@ test_atom head contents:body ''
 test_atom head contents:signature ''
 test_atom head contents 'Initial
 '
+test_atom head contents:size '8'
 test_atom head HEAD '*'
 
 test_atom tag refname refs/tags/testtag
@@ -170,6 +171,7 @@ test_atom tag contents:body ''
 test_atom tag contents:signature ''
 test_atom tag contents 'Tagging at 1151968727
 '
+test_atom tag contents:size '22'
 test_atom tag HEAD ' '
 
 test_expect_success 'Check invalid atoms names are errors' '
-- 
2.27.0.221.ga08a83db2b.dirty

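The off-by-one from the commit message and the two new test expectations can be checked by hand with a small standalone sketch. It is not code from git: `format_contents_size()` and `piped_byte_count()` are hypothetical helpers written only for this illustration, and the two messages are the ones the test script uses ("Initial" and "Tagging at 1151968727", each ending in a newline). The sketch prints the `size_t` through `uintmax_t` and `PRIuMAX`, which is one portable way to format a `strlen()` result.

```c
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of git): write the byte length of a
 * NUL-terminated message into `out` as a decimal string.
 * Returns what snprintf() returns, i.e. the number of characters
 * needed, excluding the trailing NUL.
 */
static int format_contents_size(char *out, size_t outlen, const char *msg)
{
	/* size_t has no guaranteed match with "%ld"; widen to uintmax_t. */
	return snprintf(out, outlen, "%" PRIuMAX, (uintmax_t)strlen(msg));
}

/*
 * Hypothetical helper: the byte count `wc -c` would report for a single
 * ref, assuming for-each-ref appends exactly one newline after the
 * formatted contents, as the commit message describes.
 */
static size_t piped_byte_count(const char *msg)
{
	return strlen(msg) + 1;
}
```

Calling `format_contents_size(buf, sizeof(buf), "Initial\n")` fills `buf` with "8", the value the new `test_atom head contents:size` line expects, while `piped_byte_count("Initial\n")` gives 9, the count that piping through `wc -c` would report for that single ref.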

* Re: [PATCH] ref-filter: add support for %(contents:size)
  2020-07-01 13:23 [PATCH] ref-filter: add support for %(contents:size) Christian Couder
@ 2020-07-01 15:20 ` Jeff King
  0 siblings, 0 replies; 2+ messages in thread
From: Jeff King @ 2020-07-01 15:20 UTC
  To: Christian Couder; +Cc: git, Junio C Hamano, Christian Couder

On Wed, Jul 01, 2020 at 03:23:08PM +0200, Christian Couder wrote:

> It's useful and efficient to be able to get the size of the
> contents directly without having to pipe through `wc -c`.
> 
> Also the result of the following:
> 
> `git for-each-ref --format='%(contents)' | wc -c`
> 
> is off by one as `git for-each-ref` appends a newline character
> after the contents, which can be seen by comparing its output
> with the output from `git cat-file`.

It could also be accessed much more quickly, since we don't actually
need to load the object contents into memory to know the size.  cat-file
does this kind of optimization (by building on oid_object_info()), and
its %(objectsize) will do the minimum amount of work needed.

I was going to suggest that instead of adding %(contents:size), you just
add %(objectsize). That would match cat-file's existing option, and we
hope to unify the formatters eventually. But it already exists (and I
think is even optimized courtesy of Olga's work).

> -The complete message in a commit and tag object is `contents`.
> -Its first line is `contents:subject`, where subject is the concatenation
> -of all lines of the commit message up to the first blank line.  The next
> -line is `contents:body`, where body is all of the lines after the first
> -blank line.  The optional GPG signature is `contents:signature`.  The
> -first `N` lines of the message is obtained using `contents:lines=N`.
> -Additionally, the trailers as interpreted by linkgit:git-interpret-trailers[1]
> -are obtained as `trailers` (or by using the historical alias
> -`contents:trailers`).  Non-trailer lines from the trailer block can be omitted
> -with `trailers:only`. Whitespace-continuations can be removed from trailers so
> -that each trailer appears on a line by itself with its full content with
> -`trailers:unfold`. Both can be used together as `trailers:unfold,only`.
> +The complete message in a commit and tag object is `contents`.  Its
> +size in bytes is `contents:size`.  Its first line is
> +`contents:subject`, where subject is the concatenation of all lines of
> +the commit message up to the first blank line.  The next line is
> +`contents:body`, where body is all of the lines after the first blank
> +line.  The optional GPG signature is `contents:signature`.  The first
> +`N` lines of the message is obtained using `contents:lines=N`.
> +Additionally, the trailers as interpreted by
> +linkgit:git-interpret-trailers[1] are obtained as `trailers` (or by
> +using the historical alias `contents:trailers`).  Non-trailer lines
> +from the trailer block can be omitted with
> +`trailers:only`. Whitespace-continuations can be removed from trailers
> +so that each trailer appears on a line by itself with its full content
> +with `trailers:unfold`. Both can be used together as
> +`trailers:unfold,only`.

Definitely not a new problem, but boy is that a dense paragraph. I
suspect an unordered list might be a nicer way of presenting the list of
format specifiers.

-Peff
